
How to scale dedicated Generative APIs deployments

You can scale your dedicated Generative APIs deployment up or down to match its incoming load.

Important

This feature is currently in Public Beta.

Before you start

To complete the actions presented below, you must have:

How to scale a dedicated Generative APIs deployment in size

  1. Click Generative APIs in the AI section of the side menu in the Scaleway console to access the dashboard. The list of models displays.
  2. Select the Deployments tab.
  3. From the drop-down menu, select the geographical region you want to manage.
  4. Click a deployment name to access the deployment's dashboard.
  5. Click the Settings tab and navigate to the Scaling section.
  6. Click Update node count and adjust the number of nodes in your deployment.
    Note

    High availability is only guaranteed with two or more nodes.

  7. Click Update node count again to confirm and apply the new number of nodes.
    Note

    Your deployment will be unavailable for 15-30 minutes while the node update is in progress.
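
Before changing the node count, it can help to check what the change implies. The following is a minimal Python sketch of such a pre-check, based only on the two notes above (high availability needs two or more nodes, and the deployment is unavailable for 15-30 minutes during an update). The names `plan_node_update` and `NodeUpdatePlan` are hypothetical helpers and are not part of any Scaleway SDK or API. The minimum of one node is also an assumption.

```python
from dataclasses import dataclass
from typing import Tuple

# Values taken from the notes in the procedure above.
MIN_NODES_FOR_HA = 2            # high availability needs two or more nodes
UPDATE_DOWNTIME_MIN = (15, 30)  # minutes of unavailability during an update


@dataclass(frozen=True)
class NodeUpdatePlan:
    """Hypothetical summary of a node count change (not a Scaleway type)."""
    current_nodes: int
    target_nodes: int
    direction: str                      # "up", "down", or "none"
    high_availability: bool             # True if target has two or more nodes
    expected_downtime_minutes: Tuple[int, int]


def plan_node_update(current_nodes: int, target_nodes: int) -> NodeUpdatePlan:
    """Describe the effect of changing the node count of a deployment."""
    # Assumption: a deployment always has at least one node.
    if current_nodes < 1 or target_nodes < 1:
        raise ValueError("node counts must be at least 1")

    if target_nodes > current_nodes:
        direction = "up"
    elif target_nodes < current_nodes:
        direction = "down"
    else:
        direction = "none"

    # No update is triggered when the count does not change.
    downtime = (0, 0) if direction == "none" else UPDATE_DOWNTIME_MIN

    return NodeUpdatePlan(
        current_nodes=current_nodes,
        target_nodes=target_nodes,
        direction=direction,
        high_availability=target_nodes >= MIN_NODES_FOR_HA,
        expected_downtime_minutes=downtime,
    )


if __name__ == "__main__":
    plan = plan_node_update(current_nodes=3, target_nodes=1)
    print(plan.direction, plan.high_availability, plan.expected_downtime_minutes)
```

In this sketch, scaling from three nodes to one reports a downward change without high availability, which is a cue to schedule the update outside peak hours or to keep at least two nodes.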
