Self-hosted models supported platforms

DETAILS: Tier: Ultimate with GitLab Duo Enterprise - Start a trial Offering: Self-managed Status: Beta

Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.

Enabled on self-managed in GitLab 17.6.

Changed to require GitLab Duo add-on in GitLab 17.6 and later.

There are multiple platforms available to host your self-hosted Large Language Models (LLMs). Each platform has unique features and benefits that can cater to different needs. The following documentation summarises the currently supported options:

For non-cloud on-premise model deployments

vLLM. A high-performance inference server optimized for serving LLMs with memory efficiency. It supports model parallelism and integrates easily with existing workflows.
- vLLM Installation Guide
- vLLM Supported Models

For cloud-hosted model deployments

AWS Bedrock. A fully managed service that allows developers to build and scale generative AI applications using pre-trained models from leading AI companies. It seamlessly integrates with other AWS services and offers a pay-as-you-go pricing model.
- AWS Bedrock Model Deployment Guide
- Supported foundation models in Amazon Bedrock
Azure OpenAI. Provides access to OpenAI's powerful models, enabling developers to integrate advanced AI capabilities into their applications with robust security and scalable infrastructure.
- Working with Azure OpenAI models
- Azure OpenAI Service models