Was this page helpful?

Managed Inference - Quickstart

Reviewed on 24 February 2025

Scaleway Managed Inference is the first European Managed Inference platform on the market. It is a scalable and secure inference engine for Large Language Models (LLMs).

Scaleway Managed Inference is a fully managed service that allows you to serve generative AI models in a production environment. With Scaleway Managed Inference, you can easily deploy, manage, and scale LLMs without worrying about the underlying infrastructure.

Here are some of the key features of Scaleway Managed Inference:

Easy deployment: Deploy state-of-the-art open weights LLMs with just a few clicks. Scaleway Managed Inference provides a simple and intuitive interface for generating dedicated endpoints.
Security: Scaleway provides a secure environment to run your models. Our platform is built on top of a secure architecture, and we use state-of-the-art cloud security.
Complete data privacy: No storage or third-party access to your data (prompt or responses), to ensure it remains exclusively yours.
Interoperability: Scaleway Managed Inference was designed as a drop-in replacement for the OpenAI APIs, for a seamless transition on your applications already using its libraries.

Before you startLink to this anchor

To complete the actions presented below, you must have:

A Scaleway account logged into the console
Owner status or IAM permissions allowing you to perform actions in the intended Organization

How to create a Managed Inference deploymentLink to this anchor

Navigate to the AI section of the Scaleway console, and select Managed Inference from the side menu to access the Managed Inference dashboard.
From the drop-down menu, select the geographical region where you want to create your deployment.
Click Create deployment to launch the deployment creation wizard.
Provide the necessary information:
- Select the desired model and the quantization to use for your deployment from the available options.
  Important
  Scaleway Managed Inference allows you to deploy various AI models, either from the Scaleway catalog or by importing a custom model. For detailed information about supported models, visit our Supported models in Managed Inference documentation.
  
  Note
  Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.
- Choose the geographical region for the deployment.
- Specify the GPU Instance type to be used with your deployment.
- Choose the number of nodes for your deployment. Note that this feature is currently in Public Beta.
Enter a name for the deployment, along with optional tags to aid in organization.
Configure the network settings for the deployment:
- Enable Private Network for secure communication and restricted availability within Private Networks. Choose an existing Private Network from the drop-down list, or create a new one.
- Enable Public Network to access resources via the public Internet. API key protection is enabled by default.
Important
- Enabling both private and public networks will result in two distinct endpoints (public and private) for your deployment.
- Deployments must have at least one endpoint, either public or private.
Click Create deployment to launch the deployment process. Once the deployment is ready, it will be listed among your deployments.

How to access a Managed Inference deploymentLink to this anchor

Managed Inference deployments have authentication enabled by default. As such, your endpoints expect a secret key generated with Scaleway’s Identity and Access Management service (IAM) for authentication.

Click Managed Inference in the AI section of the side menu. The Managed Inference dashboard displays.
From the drop-down menu, select the geographical region where you want to manage.
Click «See more Icon» next to the deployment you want to edit. The deployment dashboard displays.
Click Generate key in the Deployment connection section of the dashboard. The token creation wizard displays.
Fill in the required information for API key creation and click Generate API key.

Tip

You have full control over authentication from the Security tab of your deployment. Authentication is enabled by default.

How to interact with Managed InferenceLink to this anchor

Click Managed Inference in the AI section of the side menu. The Managed Inference dashboard displays.
From the drop-down menu, select the geographical region where you want to manage.
Click «See more Icon» next to the deployment you want to edit. The deployment dashboard displays.
Click the Inference tab. Code examples in various environments display. Copy and paste them into your code editor or terminal.

Note

Prompt structure may vary from one model to another. Refer to the specific instructions for use in our dedicated documentation.

How to delete a deploymentLink to this anchor

Click Managed Inference in the AI section of the Scaleway console side menu. A list of your deployments displays.
From the drop-down menu, select the geographical region where you want to create your deployment.
Choose a deployment either by clicking its name or selecting More info from the drop-down menu represented by the icon «See more Icon» to access the deployment dashboard.
Click the Settings tab of your deployment to display additional settings.
Click Delete deployment.
Type DELETE to confirm and click Delete deployment to delete your deployment.

Important

Deleting a deployment is a permanent action, and will erase all its associated configuration and resources.

Was this page helpful?