Generative APIs API


Introduction

Scaleway Generative APIs provides access to the latest AI models hosted on Scaleway infrastructure.

The Generative APIs specification targets compatibility with the OpenAI API.

Concepts

Refer to our dedicated concepts page to find definitions of the different terms referring to Generative APIs.

Quickstart

Configure your environment variables.

Note

This is an optional step that seeks to simplify your usage of the APIs.

export SCW_ACCESS_KEY="<API access key>"
export SCW_SECRET_KEY="<API secret key>"
export SCW_REGION="<Scaleway region>"

Generate content from a model by running the following command.

curl https://api.scaleway.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCW_SECRET_KEY" \
  -d '{
    "model": "llama-3.3-70b-instruct",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

See How to use Generative APIs for a quickstart and snippets using REST requests or libraries such as the OpenAI Python client.
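
For readers who prefer Python over curl, the request above can be sketched with the standard library only. This is a minimal sketch, not an official client: the URL, headers, model name and messages are copied from the curl example, while the helper names build_chat_request and send are illustrative assumptions.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.scaleway.ai/v1/chat/completions"


def build_chat_request(secret_key, user_message, model="llama-3.3-70b-instruct"):
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {secret_key}",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Reads the key exported in the optional environment step; falls back to a placeholder.
request = build_chat_request(os.environ.get("SCW_SECRET_KEY", "<API secret key>"), "Hello!")
```

Calling send(request) performs the network call. Assuming the response follows the OpenAI chat completions format, the generated text should be found under choices[0].message.content.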

Requirements

To perform the following steps, you must first ensure that:

you have an account and are logged into the Scaleway console
you have created an API key with sufficient IAM permissions to perform the actions described on this page
you have installed curl

Technical Information

Regions

Scaleway's infrastructure is spread across different regions and Availability Zones.

Generative APIs is available in the Paris region, which is represented by the following path parameter (optional while there is only one region):

fr-par

Supported endpoints and features

Supported endpoints are:

/v1/responses (beta)
/v1/chat/completions
/v1/audio/transcriptions
/v1/embeddings
/v1/models

The /v1/chat/completions endpoint:

Supports many features such as:
- Structured outputs (JSON response format)
- Tool calling (i.e. compatibility with workflows using MCP servers)
- Sending and analyzing images
- Sending and analyzing audio files
Does not yet support the following parameters: audio, metadata, modalities, prediction, prompt_cache_key, user, safety_identifier, service_tier, stream_options.include_obfuscation, store, system_fingerprint, web_search_options.
Does not support custom tools (only function tools are supported). Custom tools require you to provide a custom grammar to verify their input format.
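
To make these constraints concrete, here is a minimal sketch of a payload that declares one function tool and is filtered against the unsupported parameters listed above. The tool definition uses the OpenAI function-tool format, which is assumed to apply here given the stated compatibility target. The names strip_unsupported and get_weather_tool are hypothetical, and the model name is reused from the quickstart (check the model catalog for tool calling support).

```python
# Parameter names copied from the list above. "stream_options.include_obfuscation"
# is nested, so it is handled separately below.
UNSUPPORTED_TOP_LEVEL = {
    "audio", "metadata", "modalities", "prediction", "prompt_cache_key",
    "user", "safety_identifier", "service_tier", "store",
    "system_fingerprint", "web_search_options",
}


def strip_unsupported(payload):
    """Return a copy of a chat completions payload without unsupported parameters."""
    cleaned = {k: v for k, v in payload.items() if k not in UNSUPPORTED_TOP_LEVEL}
    stream_options = cleaned.get("stream_options")
    if isinstance(stream_options, dict):
        cleaned["stream_options"] = {
            k: v for k, v in stream_options.items() if k != "include_obfuscation"
        }
    return cleaned


# Hypothetical function tool (the only supported tool type), in OpenAI format.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = strip_unsupported({
    "model": "llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [get_weather_tool],
    "user": "user-1234",  # unsupported: removed by strip_unsupported
    "stream_options": {"include_usage": True, "include_obfuscation": False},
})
```

Filtering on the client side is one design option among others; it lets code written for another OpenAI-compatible provider be reused without sending parameters this endpoint does not yet support.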

The /v1/audio/transcriptions endpoint:

Does not yet support the following parameters: chunking_strategy, include[] and timestamp_granularities[].

The /v1/responses endpoint is currently in Beta status:

Does not yet support storing conversation state server-side.
Does not support execution by Scaleway of "built-in" tools (such as web or file search) while the model generates a response. Currently, only function tools are supported. These tools must be fully defined in your query, and your application must execute them when the model requests them. The execution result is then sent back to the model so that it can use this additional context to produce a suitable answer.
Does not support sending a response output object directly as an input for the next message. Standard messages using the role:assistant and content fields should be used instead. Specifically, the custom_tool_call and custom_tool_call_output types are not yet supported. As a workaround, tool call results can be sent using role:user, although we recommend using /chat/completions for better results in this case.
Does not yet support the following parameters: background, conversation, include, instructions, max_tool_calls, metadata, previous_response_id, prompt_cache_key, safety_identifier, service_tier, stream_options, top_logprobs, user, prompt, verbosity.
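
The function tool flow described above (define the tool, execute it yourself when the model requests it, send the result back) can be sketched as follows. Because /chat/completions is recommended for sending tool results, this sketch uses the OpenAI-style chat completions message format (tool_calls on the assistant message, role "tool" for results), which is an assumption based on the stated compatibility target. get_weather, LOCAL_TOOLS, run_tool_calls and the sample assistant message are all hypothetical.

```python
import json


def get_weather(city):
    """Hypothetical local tool; a real one would query a weather service."""
    return {"city": city, "forecast": "sunny"}


# Maps tool names declared in the query to the local functions that implement them.
LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message):
    """Execute each requested function tool and return the matching "tool" messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        arguments = json.loads(function["arguments"])  # arguments arrive as a JSON string
        output = LOCAL_TOOLS[function["name"]](**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results


# Sample assistant message shaped like an OpenAI-compatible reply requesting a tool.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
    }],
}

messages = [
    {"role": "user", "content": "What is the weather in Paris?"},
    assistant_message,
]
# The extended conversation would be sent in a follow-up request to the model.
messages.extend(run_tool_calls(assistant_message))
```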

Third party tool integration

For full details of direct integration into third party tooling, see Integrating Scaleway Generative APIs with popular AI tools. If your tool is not listed, you can still specify the Scaleway URL and API key in most OpenAI-like plugins, as compatibility largely depends on the above APIs.

Technical Limitations

When choosing a model, select one that is compatible with the API endpoints you want to use, as listed in our model catalog. For example, /v1/embeddings is only available for embedding models.

Going Further

For more information about Generative APIs, you can check out the following pages:

Generative APIs Documentation
Generative APIs FAQ
Scaleway Slack Community: join the #ai channel
Contact our support team

Troubleshooting

See Troubleshooting Generative APIs for descriptions of advanced API behavior and solutions to common issues.

Responses (Beta)

A response is a model output for a given input. It represents the functionality of generating a response in various contexts.

POST /v1/{project_id}/responses

Chat Completions

A chat completion is a model response for a given conversation. It represents the functionality of generating a response in a chat context.

POST /v1/{project_id}/chat/completions

Embeddings

An embedding is a vector representation of an input. Similar vectors correspond to semantically similar inputs.

See How to query embedding models for code snippets using the OpenAI Python client.

POST /v1/{project_id}/embeddings
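
To illustrate how vector similarity relates to semantic similarity, the sketch below compares embeddings with cosine similarity. It assumes the response follows the OpenAI embeddings format (a data list whose items carry index and embedding fields). The sample response uses made-up three-dimensional vectors for readability, and the helper names are illustrative.

```python
import math


def extract_embeddings(response_body):
    """Return embedding vectors in input order from an OpenAI-style response."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up response for three inputs; real embeddings have many more dimensions.
sample_response = {
    "object": "list",
    "data": [
        {"index": 1, "embedding": [0.9, 0.1, 0.0]},
        {"index": 0, "embedding": [1.0, 0.0, 0.0]},
        {"index": 2, "embedding": [0.0, 0.0, 1.0]},
    ],
}

first, second, third = extract_embeddings(sample_response)
close_score = cosine_similarity(first, second)  # nearly parallel vectors
far_score = cosine_similarity(first, third)     # orthogonal vectors
```

In this made-up data, the first two vectors point in almost the same direction and score close to 1, while the third is orthogonal to the first and scores 0.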

Audio

A transcription is a text transcribed from an audio input.

To support file upload, this API must be queried with multipart/form-data content type instead of application/json.

See How to query audio models for code snippets using the OpenAI Python client.

POST /v1/{project_id}/audio/transcriptions
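
The multipart/form-data requirement can be sketched with the Python standard library as follows. This is a minimal sketch under assumptions: the form field names file and model come from the OpenAI transcription API, the URL combines the quickstart host with the /v1/audio/transcriptions endpoint listed earlier, and the model name is a placeholder to replace with an audio model from the model catalog. The helper names are illustrative.

```python
import urllib.request
import uuid

# Assumed URL: quickstart host plus the endpoint path listed on this page.
TRANSCRIPTIONS_URL = "https://api.scaleway.ai/v1/audio/transcriptions"


def encode_multipart(fields, filename, file_bytes, content_type="application/octet-stream"):
    """Encode text fields plus one file part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: {content_type}\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcription_request(secret_key, filename, file_bytes, model):
    """Build (but do not send) a transcription request with a multipart body."""
    body, content_type = encode_multipart({"model": model}, filename, file_bytes)
    return urllib.request.Request(
        TRANSCRIPTIONS_URL,
        data=body,
        headers={"Content-Type": content_type, "Authorization": f"Bearer {secret_key}"},
        method="POST",
    )


# Placeholder audio bytes and model name, for illustration only.
request = build_transcription_request(
    "<API secret key>", "sample.wav", b"\x00\x01", "<audio model name>"
)
```

The request can then be sent with urllib.request.urlopen(request). Libraries such as the OpenAI Python client handle the multipart encoding for you.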

Models

A model refers to a system that has been trained to generate content such as text, images, or other data types based on input prompts or instructions.

GET /v1/models
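
As a minimal sketch, the models endpoint can be queried and parsed as shown below. It assumes the quickstart host and Bearer authentication also apply to this endpoint, and that the response follows the OpenAI list format (a data list of objects with an id field). The sample body and the identifier example-embedding-model are made up for illustration.

```python
import urllib.request

# Assumed URL: quickstart host plus the /v1/models endpoint listed on this page.
MODELS_URL = "https://api.scaleway.ai/v1/models"


def build_models_request(secret_key):
    """Build (but do not send) a GET request listing the available models."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {secret_key}"},
        method="GET",
    )


def model_ids(response_body):
    """Extract sorted model identifiers from an OpenAI-style list response."""
    return sorted(item["id"] for item in response_body.get("data", []))


# Made-up response body; a real one is obtained by sending the request above.
sample_body = {
    "object": "list",
    "data": [
        {"id": "llama-3.3-70b-instruct", "object": "model"},
        {"id": "example-embedding-model", "object": "model"},
    ],
}

available = model_ids(sample_body)
```

The resulting identifiers are the values to pass in the model field of the other endpoints, subject to the compatibility noted in the model catalog.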