Quickstart
Learn how to access, configure and use a Generative APIs endpoint in a few steps.
Generative APIs provide access to pre-configured serverless endpoints of the most popular AI models, hosted in European data centers and priced per 1M tokens used.
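Because usage is billed per 1M tokens, per-request cost can be estimated from the token counts a response reports. A minimal sketch; the prices used below are made-up placeholders, not actual Scaleway rates, and the function name is illustrative:

```python
def request_cost(prompt_tokens: int, completion_tokens: int,
                 price_per_m_input: float, price_per_m_output: float) -> float:
    """Estimate the cost of one request when pricing is per 1M tokens used.

    The per-1M-token prices are hypothetical; check the current price
    list for the model you actually call.
    """
    return (prompt_tokens * price_per_m_input
            + completion_tokens * price_per_m_output) / 1_000_000

# Example with made-up per-1M-token prices:
cost = request_cost(prompt_tokens=1_200, completion_tokens=300,
                    price_per_m_input=0.20, price_per_m_output=0.60)
```

Input and output tokens are priced separately here because many providers bill them at different rates; if a model uses a single rate, pass the same price twice.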
Concepts
Core concepts that give you a better understanding of Scaleway Generative APIs.

How-tos
Check our guides about using Generative APIs endpoints.

Additional content
Guides to help you choose a Generative APIs endpoint, understand pricing, and set up advanced configuration.
Changelog

Llama 3.3 70B maximum context is now reduced to 100k tokens (from 130k tokens previously). This update improves average throughput and time to first token. Managed Inference can still be used to support context lengths of 130k tokens.

The Mistral-small-3.1-24b-instruct-2503 and Deepseek-R1-Distill-Llama-70b models are now in General Availability. Their lifecycle now follows that of Active status models in our model lifecycle policy.
Gemma 3 27B and Mistral Small 3.1 24B 2503 are now available in Preview in Generative APIs.
Both models are multimodal and support text generation and image analysis use cases.
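Client code that keeps long conversations within a model's maximum context (for example, the 100k-token limit noted above) typically trims the oldest messages first. A minimal sketch under a stated assumption: it estimates tokens at roughly 4 characters each rather than using a real tokenizer, so treat the counts as approximate:

```python
def fit_context(messages: list[str], max_tokens: int = 100_000) -> list[str]:
    """Drop the oldest messages until the estimated token total fits.

    Token counts are estimated at ~4 characters per token; this is an
    assumption, and a real tokenizer would give exact counts.
    Always keeps at least the most recent message.
    """
    estimate = lambda text: max(1, len(text) // 4)
    kept = list(messages)
    while len(kept) > 1 and sum(estimate(m) for m in kept) > max_tokens:
        kept.pop(0)  # the oldest message goes first
    return kept
```

In practice the system prompt is usually pinned rather than trimmed; this sketch treats all messages uniformly for brevity.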
Visit our Help Center to find answers to your most frequently asked questions.