NavigationContentFooter
Jump toSuggest an edit

Managed Inference - Quickstart

Reviewed on 06 March 2024

Scaleway Managed Inference is the first European Managed Inference platform on the market. It is a scalable and secure inference engine for Large Language Models (LLMs).

Scaleway Managed Inference is a fully managed service that allows you to serve generative AI models in a production environment. With Scaleway Managed Inference, you can easily deploy, manage, and scale LLMs without worrying about the underlying infrastructure.

Here are some of the key features of Scaleway Managed Inference:

  • Easy deployment: Deploy state-of-the-art open weights LLMs with just a few clicks. Scaleway Managed Inference provides a simple and intuitive interface for generating dedicated endpoints.
  • Security: Scaleway provides a secure environment for running your models. Our platform is built on top of a secure architecture, and we use state-of-the-art cloud security.
  • Complete data privacy: No storage or third-party access to your data (prompt or responses), to ensure it remains exclusively yours.
  • Auto-scaling (coming soon): Scaleway Managed Inference automatically scales your instances based on demand, ensuring that your models are always available and responsive.
Important

This service is in beta. Specific terms and conditions apply.

How to create a Managed Inference deployment

  1. Navigate to the AI & Data section of the Scaleway console, and select Managed Inference from the side menu to access the Managed Inference dashboard.
  2. Click Create deployment to launch the deployment creation wizard.
  3. Provide the necessary information:
    • Select the desired model and the quantization to use for your deployment from the available options
      Note

      Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.

    • Choose the geographical region for the deployment.
    • Specify the GPU Instance type to be used with your deployment.
  4. Enter a name for the deployment, along with optional tags to aid in organization.
  5. Configure the network settings for the deployment:
    • Enable Private Network for secure communication and restricted availability within Private Networks. Choose an existing Private Network from the drop-down list, or create a new one.
    • Enable Public Network to access resources via the public Internet. API key protection is enabled by default.
    Important
    • Enabling both private and public networks will result in two distinct endpoints (public and private) for your deployment.
    • Deployments must have at least one endpoint, either public or private.
  6. Click Create deployment to launch the deployment process. Once the deployment is ready, it will be listed among your deployments.

How to access a Managed Inference deployment

Managed Inference deployments use dynamic tokens generated with Scaleway’s Identity and Access Management service (IAM) for authentication.

  1. Click Managed Inference in the AI & Data section of the side menu. The Managed Inference dashboard displays.
  2. Click «See more Icon» next to the deployment you want to edit. The deployment dashboard displays.
  3. Click Create token in the Deployment connection section of the dashboard. The token creation wizard displays.
  4. Fill in the required information for token creation and click Generate API key.
Tip

You have full control over authentication from the Security tab of your deployment. Authentication is enabled by default.

How to interact with Managed Inference

  1. Click Managed Inference in the AI & Data section of the side menu. The Managed Inference dashboard displays.
  2. Click «See more Icon» next to the deployment you want to edit. The deployment dashboard displays.
  3. Click the Inference tab. Code examples in various environments display. Copy and paste them in your code editor or terminal.
Note

Prompt structure may vary from one model to another. Refer to the specific instructions for use in our dedicated documentation

How to delete a deployment

  1. Click Managed Inference in the AI & Data section of the Scaleway console side menu. A list of your deployments displays.
  2. Choose a deployment either by clicking its name or selecting More info from the drop-down menu represented by the icon «See more Icon» to access the deployment dashboard.
  3. Click the Settings tab of your deployment to display additional settings.
  4. Click Delete deployment.
  5. Type DELETE to confirm and click Delete deployment to delete your deployment.
Important

Deleting a deployment is a permanent action, and will erase all its associated configuration and resources.

Docs APIScaleway consoleDedibox consoleScaleway LearningScaleway.comPricingBlogCarreer
© 2023-2024 – Scaleway