
Managed Inference - Quickstart

Reviewed on 19 August 2024

Scaleway Managed Inference is the first European Managed Inference platform on the market. It is a scalable and secure inference engine for Large Language Models (LLMs).

Scaleway Managed Inference is a fully managed service that allows you to serve generative AI models in a production environment. With Scaleway Managed Inference, you can easily deploy, manage, and scale LLMs without worrying about the underlying infrastructure.

Here are some of the key features of Scaleway Managed Inference:

  • Easy deployment: Deploy state-of-the-art open-weight LLMs in just a few clicks. Scaleway Managed Inference provides a simple and intuitive interface for generating dedicated endpoints.
  • Security: Scaleway provides a secure environment to run your models. The platform is built on a secure architecture and follows state-of-the-art cloud security practices.
  • Complete data privacy: Your data (prompts and responses) is never stored or shared with third parties; it remains exclusively yours.
  • Interoperability: Scaleway Managed Inference is designed as a drop-in replacement for the OpenAI APIs, allowing a seamless transition for applications that already use the OpenAI client libraries (see the sketch below).
Important

This service is in beta. Specific terms and conditions apply.
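
Because the service is OpenAI-compatible, switching an existing application over is mostly a matter of changing the client's base URL and API key. Below is a minimal sketch using the official openai Python library; both values are placeholders to replace with the endpoint and IAM API key from your own deployment (covered in the sections that follow).

```python
# Minimal sketch: pointing the official OpenAI Python client at a Managed
# Inference deployment. Both values are placeholders; copy the real endpoint
# URL and IAM API key from your deployment dashboard.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-deployment-endpoint>/v1",  # placeholder endpoint
    api_key="<your-IAM-secret-key>",                   # placeholder key
)
# From here on, the client is used exactly like the regular OpenAI client.
```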

Before you start

To complete the actions presented below, you must have:

  • A Scaleway account logged into the console
  • Owner status or IAM permissions allowing you to perform actions in the intended Organization

How to create a Managed Inference deployment

  1. Navigate to the AI & Data section of the Scaleway console, and select Managed Inference from the side menu to access the Managed Inference dashboard.
  2. Click Create deployment to launch the deployment creation wizard.
  3. Provide the necessary information:
    • Select the desired model and quantization for your deployment from the available options.
      Note

      Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.

    • Choose the geographical region for the deployment.
    • Specify the GPU Instance type to be used with your deployment.
  4. Enter a name for the deployment, along with optional tags to aid in organization.
  5. Configure the network settings for the deployment:
    • Enable Private Network for secure communication and restricted availability within Private Networks. Choose an existing Private Network from the drop-down list, or create a new one.
    • Enable Public Network to access resources via the public Internet. API key protection is enabled by default.
    Important
    • Enabling both private and public networks will result in two distinct endpoints (public and private) for your deployment.
    • Deployments must have at least one endpoint, either public or private.
  6. Click Create deployment to launch the deployment process. Once the deployment is ready, it will be listed among your deployments.
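
The console wizard is the quickest path, but a deployment can also be created over Scaleway's HTTP API. The sketch below is an illustration under stated assumptions: the /inference/v1beta1/... path and the body field names (model_name, node_type, endpoints) are inferred from Scaleway's usual API conventions and may differ; check the Managed Inference API reference for the exact schema.

```python
# Hedged sketch: creating a deployment through the Managed Inference HTTP API
# instead of the console wizard. The /inference/v1beta1/... path and the body
# field names are assumptions; check the API reference for the exact schema.
import os

import requests

REGION = "fr-par"  # the geographical region chosen in step 3

resp = requests.post(
    f"https://api.scaleway.com/inference/v1beta1/regions/{REGION}/deployments",
    headers={"X-Auth-Token": os.environ["SCW_SECRET_KEY"]},  # IAM secret key
    json={
        "project_id": os.environ["SCW_DEFAULT_PROJECT_ID"],
        "name": "my-first-deployment",       # deployment name (step 4)
        "model_name": "<model-identifier>",  # model and quantization (step 3)
        "node_type": "<gpu-instance-type>",  # GPU Instance type (step 3)
        "endpoints": [{"public": {}}],       # public endpoint, API-key protected
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # the created deployment, including its ID and status
```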

How to access a Managed Inference deployment

Managed Inference deployments have authentication enabled by default. As such, your endpoints expect a secret key generated with Scaleway's Identity and Access Management (IAM) service for authentication.

  1. Click Managed Inference in the AI & Data section of the side menu. The Managed Inference dashboard displays.
  2. Click the «See more» icon next to the deployment you want to edit. The deployment dashboard displays.
  3. Click Generate key in the Deployment connection section of the dashboard. The API key creation wizard displays.
  4. Fill in the required information and click Generate API key.
Tip

You have full control over authentication from the Security tab of your deployment. Authentication is enabled by default.
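
Once generated, the key is passed to your deployment as a standard Bearer token. The sketch below assumes the deployment exposes the OpenAI-compatible /v1/models route; the endpoint URL is a placeholder to copy from the Deployment connection section of the console.

```python
# Sketch: authenticating against the deployment with the IAM API key
# generated above, assuming the OpenAI-compatible /v1/models route is
# exposed. The endpoint URL is a placeholder.
import os

import requests

ENDPOINT = "https://<your-deployment-endpoint>/v1"  # placeholder
API_KEY = os.environ["SCW_INFERENCE_API_KEY"]       # the key from step 4

resp = requests.get(
    f"{ENDPOINT}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},  # standard Bearer auth
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # the model served by this deployment
```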

How to interact with Managed Inference

  1. Click Managed Inference in the AI & Data section of the side menu. The Managed Inference dashboard displays.
  2. Click the «See more» icon next to the deployment you want to edit. The deployment dashboard displays.
  3. Click the Inference tab. Code examples for various environments display. Copy and paste them into your code editor or terminal.
Note

Prompt structure may vary from one model to another. Refer to the specific instructions for use in our dedicated documentation.
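
For reference, here is a sketch of a chat completion request in the spirit of the examples shown on the Inference tab. The endpoint URL and model name are placeholders; copy the real values from your deployment dashboard.

```python
# Sketch: a chat completion against a Managed Inference deployment using
# the OpenAI-compatible API. URL and model name are placeholders.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://<your-deployment-endpoint>/v1",  # placeholder
    api_key=os.environ["SCW_INFERENCE_API_KEY"],
)

response = client.chat.completions.create(
    model="<model-name>",  # the model selected at deployment time
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello in one sentence."},
    ],
    max_tokens=100,
)
print(response.choices[0].message.content)
```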

How to delete a deployment

  1. Click Managed Inference in the AI & Data section of the Scaleway console side menu. A list of your deployments displays.
  2. Click a deployment's name, or select More info from the «See more» icon's drop-down menu, to access the deployment dashboard.
  3. Click the Settings tab of your deployment to display additional settings.
  4. Click Delete deployment.
  5. Type DELETE to confirm and click Delete deployment to delete your deployment.
Important

Deleting a deployment is permanent and erases all of its associated configuration and resources.
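
Deletion can also be scripted over the HTTP API. As with the creation sketch above, the /inference/v1beta1/... path is an assumption to verify against the API reference.

```python
# Hedged sketch: deleting a deployment over the HTTP API. The v1beta1 path
# is an assumption; check the API reference. Deletion is permanent.
import os

import requests

REGION = "fr-par"
DEPLOYMENT_ID = "<your-deployment-id>"

resp = requests.delete(
    f"https://api.scaleway.com/inference/v1beta1/regions/{REGION}/deployments/{DEPLOYMENT_ID}",
    headers={"X-Auth-Token": os.environ["SCW_SECRET_KEY"]},
    timeout=30,
)
resp.raise_for_status()
```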
