# Model Support With the rapid iteration of AI Lab, we have now supported various model inference services. Here, you can see information about the supported models. - AI Lab v0.3.0 launched model inference services, facilitating users to directly use the inference services of AI Lab without worrying about model deployment and maintenance for traditional deep learning models. - AI Lab v0.6.0 supports the complete version of vLLM inference capabilities, supporting many large language models such as `LLama`, `Qwen`, `ChatGLM`, and more. !!! note The support for inference capabilities is related to the version of AI Lab. Refer to the [Release Notes](../../intro/release-notes.md) to understand the latest version and update timely. You can use GPU types that have been verified by DCE 5.0 in AI Lab. For more details, refer to the [GPU Support Matrix](../../../kpanda/user-guide/gpu/gpu-metrics.md). ![Click to Create](../../images/inference-interface.png) ## Triton Inference Server Through the Triton Inference Server, traditional deep learning models can be well supported. Currently, AI Lab supports mainstream inference backend services: | Backend | Supported Model Formats | Description | | ------- | ----------------------- | ----------- | | pytorch | TorchScript, PyTorch 2.0 formats | [triton-inference-server/pytorch_backend](https://github.com/triton-inference-server/pytorch_backend) | | tensorflow | TensorFlow 2.x | [triton-inference-server/tensorflow_backend](https://github.com/triton-inference-server/tensorflow_backend) | | vLLM (Deprecated) | TensorFlow 2.x | [triton-inference-server/tensorflow_backend](https://github.com/triton-inference-server/tensorflow_backend) | !!! danger The use of Triton's Backend vLLM method has been deprecated. It is recommended to use the latest support for vLLM to deploy your large language models. ## vLLM With vLLM, we can quickly use large language models. Here, you can see the list of models we support, which generally aligns with the `vLLM Support Models`. - HuggingFace Models: We support most of HuggingFace's models. You can see more models at the [HuggingFace Model Hub](https://huggingface.co/models). - The [vLLM Supported Models](https://docs.vllm.ai/en/stable/models/supported_models.html) list includes supported large language models and vision-language models. - Models fine-tuned using the vLLM support framework. ### New Features of vLLM Currently, AI Lab also supports some new features when using vLLM as an inference tool: - Enable `Lora Adapter` to optimize model inference services during inference. - Provide a compatible `OpenAPI` interface with `OpenAI`, making it easy for users to switch to local inference services at a low cost and quickly transition.