GitLab Duo Self-Hosted Models

DETAILS:
Tier: Ultimate with GitLab Duo Enterprise
Offering: Self-managed
Status: Beta

Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.

Enabled on self-managed in GitLab 17.6.

Changed to require GitLab Duo add-on in GitLab 17.6 and later.

To maintain full control over your data privacy, security, and the deployment of large language models (LLMs) in your own infrastructure, use GitLab Duo Self-Hosted Models.

By deploying self-hosted models, you can manage the entire lifecycle of requests made to LLM backends for GitLab Duo features, ensuring that all requests stay within your enterprise network and avoiding external dependencies.

Why use self-hosted models

With self-hosted models, you can:

Choose any GitLab-approved LLM.
Retain full control over data by keeping all request and response logs within your domain, with no external API calls.
Isolate the GitLab instance, AI Gateway, and models within your own environment.
Select specific GitLab Duo features tailored to your users.
Eliminate reliance on the shared GitLab AI Gateway.

This setup ensures enterprise-level privacy and flexibility, allowing seamless integration of your LLMs with GitLab Duo features.

Prerequisites

Before setting up a self-hosted model infrastructure, you must have:

A supported model (either cloud-based or on-premises).
A supported serving platform (either cloud-based or on-premises).
A locally hosted or GitLab.com AI Gateway.
A GitLab Ultimate license with the GitLab Duo Enterprise add-on.

Choose a configuration type

There are two configuration options for self-managed customers:

GitLab.com AI gateway with default GitLab external vendor LLMs
Self-hosted AI gateway and LLMs

Before setting up a self-hosted model infrastructure, you must decide which configuration type to implement.

GitLab.com AI gateway with default GitLab external vendor LLMs

This is the default Enterprise offering and is not fully self-hosted. In this configuration, you connect your self-managed GitLab instance to the GitLab-hosted AI gateway, which integrates with external vendor LLM providers (such as Google Vertex or Anthropic).

These LLMs communicate through the GitLab Cloud Connector, offering a ready-to-use AI solution without the need for on-premises infrastructure.

For licensing, you must have a GitLab Premium or Ultimate subscription and GitLab Duo Enterprise.

For more information, see the GitLab.com AI gateway configuration diagram.

Self-hosted AI gateway and LLMs

In this configuration, you deploy your own AI gateway and LLMs within your infrastructure, without relying on external public services. This gives you full control over your data and security.

For licensing, you must have a valid GitLab license. You can request a license through the Customers Portal.

For more information, see the self-hosted AI gateway configuration diagram.
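The difference between the two configuration types can be summarized as data. The following is a minimal sketch for illustration only: the class, constant, and function names are hypothetical and are not part of GitLab, and the field values only restate the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DuoConfiguration:
    """Hypothetical summary of one configuration type described above."""
    name: str
    gateway_location: str
    model_location: str
    fully_self_hosted: bool


# Default offering: GitLab-hosted AI gateway, external vendor LLMs.
GITLAB_HOSTED = DuoConfiguration(
    name="GitLab.com AI gateway with default GitLab external vendor LLMs",
    gateway_location="GitLab-hosted",
    model_location="external vendor LLM providers",
    fully_self_hosted=False,
)

# Fully self-hosted: AI gateway and LLMs run in your own infrastructure.
SELF_HOSTED = DuoConfiguration(
    name="Self-hosted AI gateway and LLMs",
    gateway_location="your own infrastructure",
    model_location="your own infrastructure",
    fully_self_hosted=True,
)


def choose_configuration(must_stay_in_network: bool) -> DuoConfiguration:
    """Pick the self-hosted option when requests must not leave your network."""
    return SELF_HOSTED if must_stay_in_network else GITLAB_HOSTED
```

For example, `choose_configuration(True)` returns the self-hosted configuration, which is the only one of the two that keeps both the gateway and the models inside your own environment.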

Set up a self-hosted infrastructure

To set up a fully isolated self-hosted model infrastructure:

1. Install a large language model (LLM) serving infrastructure.
   - We support various platforms for serving and hosting your LLMs, such as vLLM, AWS Bedrock, and Azure OpenAI. To choose the platform that best suits your deployment, see the supported LLM platforms documentation for each platform's features.
   - We provide a matrix of supported models with their specific features and hardware requirements. To select models that match your infrastructure, see the supported models and hardware requirements documentation.
2. Install the GitLab AI Gateway. Install the AI Gateway so that you can configure your AI infrastructure.
3. Configure GitLab Duo features. For instructions on how to customize your environment, see the Configure GitLab Duo features documentation.
4. Enable logging. For configuration details and for help in using logs to track and manage your system's performance, see the logging documentation.
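After the serving infrastructure is installed, it can help to confirm that the endpoint responds before you connect the AI Gateway to it. The following is a minimal sketch, assuming a serving platform that exposes an OpenAI-compatible API (as vLLM does) with a model listing at `/v1/models`. The helper names and the example address are hypothetical, and other platforms such as AWS Bedrock use different APIs.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible server."""
    return base_url.rstrip("/") + "/v1/models"


def parse_model_ids(payload: str) -> list[str]:
    """Extract model IDs from an OpenAI-style model-list response body."""
    data = json.loads(payload)
    return [entry["id"] for entry in data.get("data", [])]


def list_served_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Query the server and return the IDs of the models it serves."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        return parse_model_ids(resp.read().decode("utf-8"))
```

For example, calling `list_served_models("http://localhost:8000")` against a locally running server (the address is an assumption; use your own endpoint) returns the IDs of the models being served, or raises an error if the server is unreachable.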