How to deploy a model on Scaleway Managed Inference
Reviewed on 09 April 2025 • Published on 06 March 2024
Scaleway Managed Inference allows you to deploy various AI models, either from the Scaleway catalog or by importing a custom model. For detailed information about supported models, refer to our Supported models in Managed Inference documentation.
Before you start
To complete the actions presented below, you must have:
- A Scaleway account logged into the console
- Owner status or IAM permissions allowing you to perform actions in the intended Organization
- Click the AI section of the Scaleway console, and select Managed Inference from the side menu to access the Managed Inference dashboard.
- From the drop-down menu, select the geographical region where you want to create your deployment.
- Click Deploy a model to launch the model deployment wizard.
- Provide the necessary information:
- Select the desired model and quantization to use for your deployment from the available options.
Important
Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.
- Choose the geographical region for the deployment.
- For custom models: Choose the model quantization.
Tip
Each model comes with a default quantization. Selecting a lower-bit quantization can improve performance and allow the model to run on smaller GPU nodes, at the potential cost of precision.
- Specify the GPU Instance type to be used with your deployment.
- Choose the number of nodes for your deployment. Note that this feature is currently in Public Beta.
Note
High availability is only guaranteed with two or more nodes.
- Enter a name for the deployment, and optional tags.
- Configure the network connectivity settings for the deployment:
- Attach to a Private Network for secure communication and restricted availability. Choose an existing Private Network from the drop-down list, or create a new one.
- Set up Public connectivity to access resources via the public internet. Authentication by API key is enabled by default.
Important
- Enabling both private and public connectivity will result in two distinct endpoints (public and private) for your deployment.
- Deployments must have at least one endpoint, either public or private.
- Click Deploy model to launch the deployment process. Once the model is ready, it will be listed among your deployments.
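Once the deployment is ready, you can query it over its endpoint. As a minimal sketch, the snippet below builds an authenticated, OpenAI-compatible chat completion request for a deployment with public connectivity. The endpoint URL, model name, and API key shown are placeholders, not real values: substitute the endpoint and model name shown for your deployment in the console, and a valid Scaleway IAM API key.

```python
import json
import urllib.request

# Placeholder values -- replace with your deployment's actual endpoint URL,
# model name, and a valid Scaleway IAM API key.
ENDPOINT = "https://<your-deployment-endpoint>/v1/chat/completions"
API_KEY = "<your-api-key>"

def build_chat_request(endpoint: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for the deployment."""
    body = json.dumps({
        "model": "<your-model-name>",  # placeholder: the model you deployed
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            # API key authentication is enabled by default on public endpoints
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(ENDPOINT, API_KEY, "Hello!")
# To send the request once real values are in place:
# with urllib.request.urlopen(req) as response:
#     print(json.loads(response.read()))
```

The request is built but not sent here, since the placeholder values must be filled in first; any OpenAI-compatible client library can be pointed at the same endpoint instead of raw `urllib`.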