Quickstart
Learn how to access, configure and use a Generative APIs endpoint in a few steps.
Generative APIs provide access to pre-configured serverless endpoints of the most popular AI models, hosted in European data centers and priced per 1M tokens used.
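Because usage is billed per 1M tokens, per-request cost can be estimated from the token counts a response reports. A minimal sketch; the prices used below are made-up placeholders, not actual Scaleway rates, and the function name is illustrative:

```python
def request_cost(prompt_tokens: int, completion_tokens: int,
                 price_per_m_input: float, price_per_m_output: float) -> float:
    """Estimate the cost of one request when pricing is per 1M tokens used.

    The per-1M-token prices are hypothetical; check the current price
    list for the model you actually call.
    """
    return (prompt_tokens * price_per_m_input
            + completion_tokens * price_per_m_output) / 1_000_000

# Example with made-up per-1M-token prices:
cost = request_cost(prompt_tokens=1_200, completion_tokens=300,
                    price_per_m_input=0.20, price_per_m_output=0.60)
```

Input and output tokens are priced separately here because many providers bill them at different rates; if a model uses a single rate, pass the same price twice.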
Concepts
Core concepts that give you a better understanding of Scaleway Generative APIs.

How-tos
Check our guides about using Generative APIs endpoints.

Additional content
Guides to help you choose a Generative APIs endpoint, understand pricing, and set up advanced configuration.
Changelog

Llama 3.3 70B maximum context is now reduced to 100k tokens (from 130k tokens previously). This update improves average throughput and time to first token. Managed Inference can still be used to support context lengths of 130k tokens.

The Mistral-small-3.1-24b-instruct-2503 and Deepseek-R1-Distill-Llama-70b models are now in General Availability. Their lifecycle now follows that of Active status models in our model lifecycle policy.
Gemma 3 27B and Mistral Small 3.1 24B 2503 are now available in Preview in Generative APIs.
Both models are multimodal and support text generation and image analysis use cases.
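Client code that keeps long conversations within a model's maximum context (for example, the 100k-token limit noted above) typically trims the oldest messages first. A minimal sketch under a stated assumption: it estimates tokens at roughly 4 characters each rather than using a real tokenizer, so treat the counts as approximate:

```python
def fit_context(messages: list[str], max_tokens: int = 100_000) -> list[str]:
    """Drop the oldest messages until the estimated token total fits.

    Token counts are estimated at ~4 characters per token; this is an
    assumption, and a real tokenizer would give exact counts.
    Always keeps at least the most recent message.
    """
    estimate = lambda text: max(1, len(text) // 4)
    kept = list(messages)
    while len(kept) > 1 and sum(estimate(m) for m in kept) > max_tokens:
        kept.pop(0)  # the oldest message goes first
    return kept
```

In practice the system prompt is usually pinned rather than trimmed; this sketch treats all messages uniformly for brevity.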
Visit our Help Center to find answers to your most frequently asked questions.