DeepSeek-R1 is currently under maintenance and is therefore no longer available on Generative APIs.
Supported models
Our API supports the most popular models for chat, vision, and embeddings.
Multimodal models (chat and vision)
| Provider | Model string | Context window (tokens) | Maximum output (tokens) | License | Model card |
|---|---|---|---|---|---|
| Google (Preview) | gemma-3-27b-it | 40k | 8192 | Gemma | HF |
| Mistral | mistral-small-3.1-24b-instruct-2503 | 128k | 8192 | Apache-2.0 | HF |
Chat models
| Provider | Model string | Context window (tokens) | Maximum output (tokens) | License | Model card |
|---|---|---|---|---|---|
| Meta | llama-3.3-70b-instruct | 131k | 4096 | Llama 3.3 Community | HF |
| Meta | llama-3.1-8b-instruct | 128k | 16384 | Llama 3.1 Community | HF |
| Mistral | mistral-nemo-instruct-2407 | 128k | 8192 | Apache-2.0 | HF |
| Qwen | qwen2.5-coder-32b-instruct | 32k | 8192 | Apache-2.0 | HF |
| DeepSeek (Preview) | deepseek-r1 | 20k | 4096 | MIT | HF |
| DeepSeek | deepseek-r1-distill-llama-70b | 32k | 4096 | MIT | HF |
If you are unsure which chat model to use, we currently recommend starting with Llama 3.1 8B Instruct (llama-3.1-8b-instruct).
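As a sketch of what a request to one of these chat models could look like: Generative APIs follow OpenAI-compatible conventions, so a chat completion body can be built as below. The endpoint URL and the `SCW_SECRET_KEY` environment variable are assumptions for illustration; check your console for the exact values.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible endpoint; verify the exact URL in your console.
API_URL = "https://api.scaleway.ai/v1/chat/completions"


def build_chat_request(prompt: str, model: str = "llama-3.1-8b-instruct") -> dict:
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,  # must stay within the model's maximum output
    }


payload = build_chat_request("Give me one fun fact about Paris.")

# Sending the request requires a valid API key (hypothetical env var name):
# req = urllib.request.Request(
#     API_URL,
#     data=json.dumps(payload).encode(),
#     headers={
#         "Authorization": f"Bearer {os.environ['SCW_SECRET_KEY']}",
#         "Content-Type": "application/json",
#     },
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same payload shape works for any model string in the tables above; only the `model` field changes.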
Vision models
| Provider | Model string | Context window (tokens) | Maximum output (tokens) | License | Model card |
|---|---|---|---|---|---|
| Mistral | pixtral-12b-2409 | 128k | 4096 | Apache-2.0 | HF |
Image sizes are limited to 32 million pixels (e.g., a resolution of about 8096 x 4048). Images larger than 1024 x 1024 are supported but are automatically downscaled to fit these limits; the image's aspect ratio and proportions are preserved.
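The downscaling rule above can be reproduced arithmetically. This is a minimal sketch, not Scaleway's actual implementation: it finds the largest size under the 32-million-pixel limit that preserves the aspect ratio.

```python
import math

MAX_PIXELS = 32_000_000  # 32-million-pixel limit described above


def downscaled_size(width: int, height: int, max_pixels: int = MAX_PIXELS) -> tuple[int, int]:
    """Return (width, height) scaled down to fit max_pixels, preserving ratio."""
    pixels = width * height
    if pixels <= max_pixels:
        return width, height  # already within the limit; no change
    # Scaling both sides by sqrt(max_pixels / pixels) keeps the aspect ratio
    # while bringing the pixel count down to (at most) max_pixels.
    scale = math.sqrt(max_pixels / pixels)
    return max(1, int(width * scale)), max(1, int(height * scale))
```

For example, a 10000 x 5000 image (50 million pixels) would come out near 8000 x 4000, staying just under the limit with the 2:1 ratio intact.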
Embedding models
Our Embeddings API provides built-in support for the following models, hosted in Scaleway data centers and available via serverless endpoints.
| Provider | Model string | Model size | Embedding dimension | Context window | License | Model card |
|---|---|---|---|---|---|---|
| BAAI | bge-multilingual-gemma2 | 9B | 3584 | 4096 | Gemma | HF |
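Assuming the same OpenAI-compatible conventions as the chat endpoint, an embeddings request body for this model could be sketched as follows (the `/v1/embeddings` path and field names are assumptions based on that convention):

```python
def build_embeddings_request(texts: list[str], model: str = "bge-multilingual-gemma2") -> dict:
    """Build an OpenAI-compatible embeddings payload.

    A successful response would carry one 3584-dimensional vector per
    input text, matching the embedding dimension in the table above.
    """
    return {"model": model, "input": texts}


body = build_embeddings_request(["serverless inference", "managed inference"])
```

Each input text must fit within the model's 4096-token context window.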
Request a model
Do not see a model you want to use? Tell us or vote for what you would like to add here.
Deprecated models
These models can still be accessed in Generative APIs, but their End of Life (EOL) is planned according to our model lifecycle policy. Deprecated models should no longer be queried. We recommend using newer models available in Generative APIs, or deploying these models in dedicated Managed Inference deployments.
| Provider | Model string | End of Life (EOL) date |
|---|---|---|
| Meta | llama-3.1-70b-instruct | May 25, 2025 |
Llama 3.1 70B is now deprecated. The new Llama 3.3 70B is available, with similar or better performance in most use cases. After May 25, 2025, requests to Llama 3.1 70B will be redirected automatically to Llama 3.3 70B. Llama 3.1 8B is not affected by this change and remains supported.
End of Life (EOL) models
These models are no longer accessible from Generative APIs. However, they can still be deployed on dedicated Managed Inference deployments.
| Provider | Model string | EOL date |
|---|---|---|
| SBERT | sentence-t5-xxl | February 26, 2025 |