
Managed Inference

Dive into seamless language processing with our easy-to-use LLM endpoints. Perfect for everything from data analysis to creative tasks, all with clear pricing.

Managed Inference Quickstart

Getting Started

Quickstart

Learn how to create, connect to, and delete a Managed Inference endpoint in a few steps.

View Quickstart

Concepts

Core concepts that give you a better understanding of Scaleway Managed Inference.

View Concepts

How-tos

Check our guides about creating and managing Managed Inference endpoints.

View How-tos

Additional content

Guides to help you choose a Managed Inference endpoint and understand pricing and advanced configuration.

View additional content
Managed Inference API

Learn how to create and manage your Scaleway Managed Inference endpoints through the API.

Go to Managed Inference API

Changelog

  • Managed Inference

    Added

    Model library expanded

    Meta Llama 3.1 8B, Meta Llama 3.1 70B, and Mistral Nemo are now available for deployment on Managed Inference.

    Released in July 2024, these models all support a large context window of up to 128k tokens, which is particularly useful for RAG applications.

  • Managed Inference

    Added

    Chat and Embedding APIs now return usage stats

    Managed Inference endpoints now return usage stats (number of prompt tokens, completion tokens, and total tokens) for both streaming and non-streaming responses.

    For streaming responses, the usage field is updated in each chunk and reaches its final value in the very last chunk of the response.

    More details and examples are available in our OpenAI API compatibility documentation.

  • Managed Inference

    Changed

    Models now support longer and better conversations

    • All models in the catalog now support conversations up to their full context window (e.g., Mixtral-8x7b up to 32K tokens, Llama3 up to 8K tokens).
    • Llama3 70B is now available in FP8 quantization; INT8 is deprecated.
    • Llama3 8B is now available in FP8 quantization; BF16 remains the default.
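
As an illustration of the usage stats entry in the changelog above, here is a minimal sketch of how a client might read the final token counts from a streamed response. It assumes each chunk has already been parsed into a dictionary and that the usage field follows the OpenAI-style shape (`prompt_tokens`, `completion_tokens`, `total_tokens`); the function name `final_usage` and the sample values are hypothetical, and no network call is made.

```python
from typing import Iterable, Optional


def final_usage(chunks: Iterable[dict]) -> Optional[dict]:
    """Return the usage stats carried by the last chunk that has them."""
    usage = None
    for chunk in chunks:
        # Assumption: a chunk may carry a "usage" dict that is updated as the
        # stream progresses, so the latest one seen is the most complete.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# Simulated stream with made-up values, standing in for parsed response chunks.
simulated_stream = [
    {"choices": [{"delta": {"content": "Hel"}}],
     "usage": {"prompt_tokens": 5, "completion_tokens": 1, "total_tokens": 6}},
    {"choices": [{"delta": {"content": "lo"}}],
     "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}},
    {"choices": [],
     "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}},
]

usage = final_usage(simulated_stream)
if usage is not None:
    print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])
```

Keeping only the latest usage value means the caller does not need to know in advance which chunk is the last one. The OpenAI API compatibility documentation remains the reference for the exact response format.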
View the full changelog
Questions?

Visit our Help Center and find the answers to your most frequent questions.

Visit Help Center
© 2023-2024 – Scaleway