
How to deploy a model on Scaleway Managed Inference

Published on 06 March 2024 · Reviewed on 23 September 2024

Before you start

To complete the actions presented below, you must have:

  • A Scaleway account logged into the console
  • Owner status or IAM permissions allowing you to perform actions in the intended Organization
  1. Click the AI & Data section of the Scaleway console, and select Managed Inference from the side menu to access the Managed Inference dashboard.
  2. Click Deploy a model to launch the model deployment wizard.
  3. Provide the necessary information:
    • Select the desired model and quantization to use for your deployment from the available options.
      Note

      Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.

    • Choose the geographical region for the deployment.
    • Specify the GPU Instance type to be used with your deployment.
  4. Enter a name for the deployment, and optional tags.
  5. Configure the network connectivity settings for the deployment:
    • Attach to a Private Network for secure communication and restricted availability. Choose an existing Private Network from the drop-down list, or create a new one.
    • Set up Public connectivity to access resources via the public internet. Authentication by API key is enabled by default (see the example request after these steps).
    Important
    • Enabling both private and public connectivity will result in two distinct endpoints (public and private) for your deployment.
    • Deployments must have at least one endpoint, either public or private.
  6. Click Deploy model to launch the deployment process. Once the model is ready, it will be listed among your deployments.
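
Once the deployment is ready, you can send a test request to its public endpoint using an IAM API key. The sketch below assumes the deployment exposes an OpenAI-compatible chat completions route and that API-key authentication is enabled on the public endpoint (the default from step 5); the endpoint URL and model name are placeholders to replace with the values shown on your deployment's overview page.

```python
# Minimal sketch: querying a Managed Inference deployment's public endpoint.
# The endpoint URL and model name are placeholders; copy the real values from
# your deployment's overview page in the console.
import os
import requests

ENDPOINT_URL = "https://<deployment-id>.ifr.fr-par.scaleway.com/v1/chat/completions"  # placeholder
API_KEY = os.environ["SCW_SECRET_KEY"]  # IAM API key allowed to access the deployment

payload = {
    "model": "<model-name>",  # placeholder: the model selected during deployment
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```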
See also
How to monitor a deployment