
How to query audio models

Scaleway's Generative APIs service allows users to interact with powerful audio models hosted on the platform.

There are two ways to interact with audio models: through the console playground, or programmatically via the API.

Before you start

To complete the actions presented below, you must have:

  • A Scaleway account logged into the console
  • Owner status or IAM permissions allowing you to perform actions in the intended Organization
  • A valid API key for API authentication
  • Python 3.7+ installed on your system

Accessing the playground

Scaleway provides a web playground for instruct-based models hosted on Generative APIs.

  1. Navigate to Generative APIs under the AI section of the Scaleway console side menu. The list of models you can query displays.
  2. Click the name of the audio model you want to try. Alternatively, click Try next to the model's name.

The web playground displays.

Using the playground

  1. Upload an audio file to send to the selected audio model for transcription.

  2. Edit the hyperparameters listed on the right column, for example the default temperature for more or less randomness on the outputs.

  3. Switch models at the top of the page, to observe the capabilities of the audio models offered via Generative APIs.

  4. Click Deploy, then select the Serverless option to get code snippets configured according to your settings in the playground.

    You can also choose to deploy a model on your own dedicated Instance by selecting the Dedicated option. In this case, you can access the playground after completing the steps in the deployment wizard. Once in the playground of your deployment, click View code to get code snippets that match your settings in the playground.

Querying audio models via API

You can query the models programmatically using your favorite tools or languages. In the example that follows, we will use the OpenAI Python client.

Audio Transcriptions API or Chat Completions API?

Both the Audio Transcriptions API and the Chat Completions API are OpenAI-compatible REST APIs that accept audio input.

The Audio Transcriptions API is designed for pure speech-to-text (audio transcription) tasks, such as transcribing a voice note or meeting recording file. It can be used with compatible audio models, such as whisper-large-v3.

The Chat Completions API is more suitable for understanding audio input as part of a broader task, rather than a pure transcription task. For example, building a voice chat assistant which listens and responds in natural language, or sending multiple inputs (audio and text) to be interpreted or classified (answering questions like "Is this audio a ringtone?"). This API can be used for audio tasks with compatible multimodal models, such as voxtral-small-24b.

Note

Scaleway's support for the Audio Transcriptions API is currently in beta. Support for the full feature set will be added incrementally.

For full details on these APIs, see the reference documentation.

Installing the OpenAI SDK

Install the OpenAI SDK using pip:

pip install openai

Initializing the client

Initialize the OpenAI client with your base URL and API key:

Tip

In the case of a dedicated Generative APIs deployment, the base_url value is the Public Endpoint URL displayed on the Overview tab of the deployment's dashboard.

from openai import OpenAI

# Initialize the client with your base URL and API key
client = OpenAI(
    base_url="https://api.scaleway.ai/v1",  # Scaleway's Generative APIs service URL
    api_key="<SCW_SECRET_KEY>"  # Your unique API secret key from Scaleway
)

Transcribing audio

You can now generate a text transcription of a given audio file using a suitable combination of API and model.
