Quickstart
Learn how to create, access, configure, and use a Generative APIs endpoint in a few steps.
View Quickstart
Generative APIs enable you to deploy, manage, and scale AI models through serverless endpoints or dedicated infrastructure, hosted in European data centers.
Core concepts that give you a better understanding of Scaleway Generative APIs.
View Concepts
Check our guides about creating, managing, and using Generative APIs endpoints.
View How-tos
Guides to help you choose a Generative APIs endpoint, understand pricing, and set up advanced configuration.
View additional content
Learn how to manage your Serverless endpoints through the API.
Go to Generative APIs - Serverless API
Learn how to manage your Dedicated Deployment endpoints through the API.
Go to Generative APIs - Dedicated Deployment API
Starting 1 May 2026, Managed Inference becomes Generative APIs - Dedicated Deployment. All APIs, existing resources, pricing, and SLAs remain unchanged. All Managed Inference-related content (such as documentation or Cockpit dashboards) will gradually be renamed Generative APIs - Dedicated.
The Responses API is now generally available. It is recommended for use only with the gpt-oss-120b model. For more information, see the Chat Completions and Responses API comparison.
Starting April 20th, Generative APIs billing is performed in slices of 1,000 tokens instead of the previous slices of 1,000,000 tokens. Prices do not change with this update. For similar token usage, a bill will remain the same or may be slightly lower.
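To illustrate why a smaller billing slice can only keep a bill the same or lower it, here is a minimal sketch. It assumes that usage is rounded up to the next whole slice, which the announcement does not state explicitly. The price of 0.50 per million tokens and the function names `billed_tokens` and `cost` are hypothetical and chosen for illustration only.

```python
from decimal import Decimal

OLD_SLICE = 1_000_000  # previous billing slice, in tokens
NEW_SLICE = 1_000      # new billing slice, in tokens


def billed_tokens(tokens_used: int, slice_size: int) -> int:
    """Round usage up to a whole number of slices (assumed rounding rule)."""
    slices = -(-tokens_used // slice_size)  # integer ceiling division
    return slices * slice_size


def cost(tokens_used: int, slice_size: int, price_per_million: Decimal) -> Decimal:
    """Cost of the billed tokens at an unchanged per-million-token price."""
    return price_per_million * Decimal(billed_tokens(tokens_used, slice_size)) / Decimal(1_000_000)


# Hypothetical price, not taken from any price list.
price = Decimal("0.50")
usage = 12_345_678

old_bill = cost(usage, OLD_SLICE, price)  # billed on 13,000,000 tokens
new_bill = cost(usage, NEW_SLICE, price)  # billed on 12,346,000 tokens
print(f"old: {old_bill}  new: {new_bill}")
```

Under this assumption, usage that is an exact multiple of 1,000,000 tokens produces the same bill with both slice sizes. Any other usage is billed on fewer rounded-up tokens with the 1,000-token slice.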
Visit our Help Center and find the answers to your most frequent questions.
Visit Help Center