How to deploy a model on a dedicated Generative APIs deployment

Reviewed on April 16, 2026

Generative APIs – Dedicated Deployment is a fully managed service for deploying and running AI models in a private, dedicated environment. Unlike the pre-configured endpoints of the Serverless offering, Dedicated Deployment requires you to create a deployment, letting you customize settings such as quantization and instance size. This page walks you through the steps to create that deployment.

To get started using Serverless models, see How to query language models.

Before you start

To complete the actions presented below, you must have:

A Scaleway account logged into the console
Owner status or IAM permissions allowing you to perform actions in the intended Organization

Click Generative APIs in the AI section of the side menu in the Scaleway console to access the dashboard. The list of models is displayed.
Select the Deployments tab.
Click Deploy a model to launch the model deployment wizard.
From the drop-down menu, select the geographical region where you want to create your deployment.
Provide the necessary information:
- Select the desired model and quantization to use for your deployment from the available options.
  Important
  Scaleway Generative APIs – Dedicated Deployment allows you to deploy various AI models, either from the Scaleway catalog or by importing a custom model. For detailed information about supported models, visit our Supported models documentation.
  
  Note
  Some models may require acceptance of an end-user license agreement (EULA). If prompted, review the terms and conditions and accept the license accordingly.
- For custom models: Choose the model quantization.
  Tip
  Each model comes with a default quantization. Selecting a lower-bit quantization can improve performance and lets the model run on smaller GPU nodes, but may reduce precision.
- Select a node type, that is, the GPU Instance that will be used with your deployment.
- Choose the number of nodes for your deployment. Note that this feature is currently in Public Beta.
  Tip
  High availability is only guaranteed with two or more nodes.
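The link between quantization and node size in the tip above comes down to simple arithmetic: the memory needed to hold a model's weights is roughly the parameter count multiplied by the bits per weight, divided by 8 to get bytes. The sketch below illustrates this for a hypothetical 8-billion-parameter model. The function name and the model size are assumptions made for illustration. The estimate covers weights only and ignores the KV cache, activations, and runtime overhead, so treat it as a rough lower bound rather than a sizing rule for any particular node type.

```python
def weight_memory_gib(num_parameters: int, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the model weights alone, in GiB."""
    total_bytes = num_parameters * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# Hypothetical 8-billion-parameter model (an assumption, not a catalog entry).
HYPOTHETICAL_PARAMETERS = 8_000_000_000

for bits in (16, 8, 4):
    gib = weight_memory_gib(HYPOTHETICAL_PARAMETERS, bits)
    print(f"{bits}-bit weights: about {gib:.1f} GiB")
```

Halving the bits per weight halves the weight footprint, which is why a lower-bit quantization can fit on a smaller GPU node.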
Enter a name for the deployment and, optionally, tags.
Configure the network connectivity settings for the deployment:
- Attach to a Private Network for secure communication and restricted availability. Choose an existing Private Network from the drop-down list, or create a new one.
- Set up Public connectivity to access resources via the public internet. Authentication by API key is enabled by default.
Important
Enabling both private and public connectivity will result in two distinct endpoints (public and private) for your deployment.
Deployments must have at least one endpoint, either public or private.
Click Deploy model to launch the deployment process. Once the model is ready, it will be listed among your deployments.
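The node and connectivity rules described in the steps above can be summarized as a small pre-flight check. This is a minimal sketch, not the Scaleway API or SDK: the `DeploymentSettings` class, its field names, and the `check_settings` function are hypothetical and exist only to restate the rules from this page (at least one endpoint is required, enabling both connectivity options yields two endpoints, and high availability is only guaranteed with two or more nodes).

```python
from dataclasses import dataclass


@dataclass
class DeploymentSettings:
    # Hypothetical field names for illustration; not the Scaleway API schema.
    name: str
    node_count: int = 1
    public_endpoint: bool = True
    private_network: bool = False


def check_settings(settings: DeploymentSettings) -> tuple[list[str], list[str]]:
    """Return (endpoints, warnings) or raise ValueError for an invalid setup."""
    if settings.node_count < 1:
        raise ValueError("A deployment needs at least one node.")

    # Each enabled connectivity option results in its own endpoint.
    endpoints = []
    if settings.public_endpoint:
        endpoints.append("public")
    if settings.private_network:
        endpoints.append("private")
    if not endpoints:
        raise ValueError("A deployment needs at least one endpoint, public or private.")

    warnings = []
    if settings.node_count < 2:
        warnings.append("High availability is only guaranteed with two or more nodes.")
    return endpoints, warnings


endpoints, warnings = check_settings(
    DeploymentSettings(name="example", node_count=1, private_network=True)
)
print(endpoints)
print(warnings)
```

In this example both connectivity options are enabled, so the check reports two endpoints and a single high-availability warning because only one node is requested.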