Generative APIs - Dedicated Deployment API

Scaleway Generative APIs - Dedicated Deployment lets you deploy and run machine learning models on Scaleway's infrastructure, and provides scalable endpoints for model inference. The Generative APIs - Dedicated Deployment API lets you manage these endpoints and perform inference operations with any OpenAI API-compatible software.

Tip

To retrieve information about the models available for deployment on Scaleway Generative APIs - Dedicated Deployment, check out our model documentation.

Concepts

Refer to our dedicated concepts page to find the definitions of all concepts and terminology related to Generative APIs - Dedicated Deployment.

Quickstart

Configure your environment variables

Note
This step is optional, but it simplifies the Generative APIs - Dedicated Deployment API commands on this page. You can find your Project ID in the Scaleway console.
```
export SCW_SECRET_KEY="<API secret key>"
export SCW_DEFAULT_REGION="fr-par"
export SCW_PROJECT_ID="<Scaleway Project ID>"
```
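
Every later command on this page reads these three variables, so it can help to check them before going further. The following is a minimal sketch; `check_scw_env` is a hypothetical helper name chosen for illustration, not part of any Scaleway tooling.

```shell
# Sketch: fail early if a variable used by the commands on this page is unset.
# check_scw_env is a hypothetical helper, not a Scaleway command.
check_scw_env() {
  missing=""
  for name in SCW_SECRET_KEY SCW_DEFAULT_REGION SCW_PROJECT_ID; do
    # Look up the variable whose name is stored in $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing environment variables:$missing" >&2
    return 1
  fi
  echo "Environment looks complete for region $SCW_DEFAULT_REGION"
}
```

Running `check_scw_env` after the `export` lines prints a confirmation, or lists the variables that still need a value and returns a non-zero status.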

List available models: Run the following command to get a list of all the models available for deployment, with their details:

```
curl -X GET \
  -H "Content-Type: application/json" \
  -H "X-Auth-Token: $SCW_SECRET_KEY" \
  "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/models"
```

Create a model deployment: Run the following command to create a deployment. Customize the details in the payload (name, model, description, tags, etc.) to your needs:

```
curl -X POST "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/deployments" \
  -H "Content-Type: application/json" \
  -H "X-Auth-Token: $SCW_SECRET_KEY" \
  -d '{
  "project_id": "'"$SCW_PROJECT_ID"'",
  "name": "my-inference-deployment",
  "model_id": "chosen-model-id",
  "node_type": "L4",
  "min_size": 1,
  "max_size": 1,
  "accept_eula": true,
  "endpoints": [
    {
      "public": {}
    }
  ]
}'
```

Parameter	Description	Valid values
`project_id`	The Project in which the deployment should be created (string)	Any valid Scaleway Project ID, e.g., `"b4bd99e0-b389-11ed-afa1-0242ac120002"`
`name`	A name of your choice for the deployment (string)	Any string containing only alphanumeric characters, dots, spaces, and dashes, e.g., `"my-inference-deployment"`
`model_id`	The model to deploy (string)	Any valid model ID found in your model library (see models listing)
`node_type`	The type of node to use for the deployment (string)	Example: `"L4"`
`min_size`	Minimum number of replicas for the deployment (integer)	Any integer, e.g., `1`
`max_size`	Maximum number of replicas for the deployment (integer)	Any integer, e.g., `3`
`accept_eula`	Indicates acceptance of the End User License Agreement (boolean)	`true`
`endpoints`	Defines the endpoints for the deployment (array)	At least one endpoint, e.g., `[ { "public": {} } ]`
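
The parameters in the table can be assembled into a payload by a small function, which keeps the quoting in one place. This is only a sketch: `build_deployment_payload` is a hypothetical helper, and the check that `min_size` does not exceed `max_size` is an assumption about sensible values, not a documented API rule. No JSON escaping is performed, which relies on the name containing only the characters listed in the table.

```shell
# Sketch: build the "create deployment" payload from positional arguments.
# Usage: build_deployment_payload NAME MODEL_ID NODE_TYPE MIN_SIZE MAX_SIZE
# build_deployment_payload is a hypothetical helper, not a Scaleway command.
build_deployment_payload() {
  # Assumption: a minimum above the maximum is never what you want.
  if [ "$4" -gt "$5" ]; then
    echo "min_size ($4) must not exceed max_size ($5)" >&2
    return 1
  fi
  # Values are inserted as-is (no JSON escaping); one public endpoint is requested.
  printf '{"project_id":"%s","name":"%s","model_id":"%s","node_type":"%s","min_size":%d,"max_size":%d,"accept_eula":true,"endpoints":[{"public":{}}]}\n' \
    "$SCW_PROJECT_ID" "$1" "$2" "$3" "$4" "$5"
}

# Example call (commented out so that loading this file sends no request):
# payload=$(build_deployment_payload "my-inference-deployment" "chosen-model-id" "L4" 1 1) &&
#   curl -X POST "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/deployments" \
#     -H "Content-Type: application/json" \
#     -H "X-Auth-Token: $SCW_SECRET_KEY" \
#     -d "$payload"
```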

Create a model endpoint: Run the following command to create an inference endpoint for the deployment. Customize the details in the payload to your needs:

Example for creating a public endpoint

```
curl -X POST "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/endpoints" \
  -H "Content-Type: application/json" \
  -H "X-Auth-Token: $SCW_SECRET_KEY" \
  -d '{
  "project_id": "'"$SCW_PROJECT_ID"'",
  "deployment_id": "your-deployment-id",
  "endpoint": {
    "disable_auth": false,
    "public": {}
  }
}'
```

Example for creating a private endpoint

```
curl -X POST "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/endpoints" \
  -H "Content-Type: application/json" \
  -H "X-Auth-Token: $SCW_SECRET_KEY" \
  -d '{
  "project_id": "'"$SCW_PROJECT_ID"'",
  "deployment_id": "your-deployment-id",
  "endpoint": {
    "disable_auth": false,
    "private_network": {
      "private_network_id": "your-private-network-id"
    }
  }
}'
```

Parameter	Description	Valid values
`project_id`	The Project in which the endpoint should be created (string)	Any valid Scaleway Project ID, e.g., `"b4bd99e0-b389-11ed-afa1-0242ac120002"`
`deployment_id`	The deployment ID to which the endpoint will be associated (string)	Any valid deployment ID, e.g. `"bcb0976d-98d6-49c1-b6b5-17804941c0b7"`
`disable_auth`	Specifies whether to disable authentication (boolean)	`true` or `false`
`public`	Public endpoint configuration (object)	`{}` for public endpoint
`private_network`	Private endpoint configuration including the private network ID (object)	`{ "private_network_id": "private-network-id" }`
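
The public and private payloads differ only in the object placed inside `endpoint`. One way to express that choice is sketched here; `build_endpoint_payload` is a hypothetical helper name, and `disable_auth` is fixed to `false` as in both examples.

```shell
# Sketch: build the "create endpoint" payload, public by default.
# Usage: build_endpoint_payload DEPLOYMENT_ID [PRIVATE_NETWORK_ID]
# build_endpoint_payload is a hypothetical helper, not a Scaleway command.
build_endpoint_payload() {
  if [ -n "${2:-}" ]; then
    # A Private Network ID was given: request a private endpoint.
    target=$(printf '"private_network":{"private_network_id":"%s"}' "$2")
  else
    # No Private Network ID: request a public endpoint.
    target='"public":{}'
  fi
  printf '{"project_id":"%s","deployment_id":"%s","endpoint":{"disable_auth":false,%s}}\n' \
    "$SCW_PROJECT_ID" "$1" "$target"
}
```

The output can be passed to `curl` with `-d "$(build_endpoint_payload your-deployment-id)"`, or with a second argument to target a Private Network.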

List your deployments: Run the following command to get a list of all the deployments in your account, with their details:

```
curl -X GET \
  -H "Content-Type: application/json" \
  -H "X-Auth-Token: $SCW_SECRET_KEY" \
  "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/deployments"
```

List your endpoints: Run the following command to get a list of all the inference endpoints in your account, with their details:

```
curl -X GET \
  -H "Content-Type: application/json" \
  -H "X-Auth-Token: $SCW_SECRET_KEY" \
  "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/endpoints"
```

Delete an endpoint: Run the following command to delete an inference endpoint, specified by its endpoint ID:
```
curl -X DELETE \
  -H "X-Auth-Token: $SCW_SECRET_KEY" \
  -H "Content-Type: application/json" \
  "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/endpoints/<endpoint-ID>"
```
The expected successful response is empty.
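
Because a successful deletion returns no body, the HTTP status code is the only signal a script can inspect. The sketch below wraps the same request in a hypothetical `delete_endpoint` helper. The exact success code is not stated on this page, so treating any 2xx status as success is an assumption.

```shell
# Sketch: delete an endpoint and report the outcome from the HTTP status.
# Usage: delete_endpoint ENDPOINT_ID
# delete_endpoint is a hypothetical helper, not a Scaleway command.
delete_endpoint() {
  # -s silences progress output, -o /dev/null discards the (empty) body,
  # and -w '%{http_code}' prints only the status code.
  http_code=$(curl -s -o /dev/null -w '%{http_code}' -X DELETE \
    -H "X-Auth-Token: $SCW_SECRET_KEY" \
    "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/endpoints/$1")
  case "$http_code" in
    # Assumption: any 2xx status means the endpoint was deleted.
    2??) echo "Endpoint $1 deleted (HTTP $http_code)" ;;
    *)   echo "Deletion of endpoint $1 failed (HTTP $http_code)" >&2
         return 1 ;;
  esac
}
```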

Important
Dedicated Generative APIs deployments must have at least one endpoint, either public or private.

Requirements

You have a Scaleway account
You have created an API key, and the API key has sufficient IAM permissions to perform the actions described on this page
You have installed curl

Technical information

Region

Generative APIs - Dedicated Deployment endpoints are available in the following region:

Name	API ID
Paris	`fr-par`

Pagination

Most listing requests receive a paginated response. Requests against paginated endpoints accept two query arguments:

page, a positive integer that selects which page to return
per_page, a positive integer less than or equal to 100 that sets the number of items to return per page. The default value is 50.

Paginated endpoints usually also accept filters to search and sort results. These filters are documented in each endpoint's documentation.

The X-Total-Count header contains the total number of items returned.
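
The two query arguments can be appended to any listing URL used on this page. The sketch below validates them against the limits stated above before building the URL; `paginated_url` is a hypothetical helper name.

```shell
# Sketch: build a paginated listing URL, enforcing the documented limits.
# Usage: paginated_url RESOURCE [PAGE] [PER_PAGE]   (RESOURCE: e.g. deployments)
# paginated_url is a hypothetical helper, not a Scaleway command.
paginated_url() {
  page="${2:-1}"
  per_page="${3:-50}"   # 50 is the documented default
  if [ "$page" -lt 1 ] || [ "$per_page" -lt 1 ] || [ "$per_page" -gt 100 ]; then
    echo "page must be >= 1 and per_page must be between 1 and 100" >&2
    return 1
  fi
  printf 'https://api.scaleway.com/inference/v1/regions/%s/%s?page=%d&per_page=%d\n' \
    "$SCW_DEFAULT_REGION" "$1" "$page" "$per_page"
}

# Example call (commented out so that loading this file sends no request).
# -D - prints the response headers, which include X-Total-Count:
# curl -s -D - -o /dev/null -H "X-Auth-Token: $SCW_SECRET_KEY" \
#   "$(paginated_url deployments 1 100)" | grep -i '^x-total-count'
```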

Creating a deployment: the model object

When creating a deployment, the model_id parameter is required. This specifies the model to deploy. Use the List Models endpoint to retrieve available model IDs.

Note

This information is designed to help you correctly configure the model_id parameter when using the Create a deployment method.

Going further

For more help using Scaleway Generative APIs - Dedicated Deployment, check out the following resources:

Our main documentation
The #ai channel on our Slack Community