openapi: 3.1.0
info:
  title: Generative APIs API
  description: |-
    Scaleway Generative APIs provides access to the latest AI models hosted on Scaleway infrastructure.

    The Generative APIs specification targets OpenAI API compatibility.

    (switchcolumn)
    (switchcolumn)

    ## Concepts

    Refer to our [dedicated concepts page](https://www.scaleway.com/en/docs/generative-apis/concepts/) to find definitions of the different terms referring to Generative APIs.

    (switchcolumn)
    (switchcolumn)

    ## Quickstart

    1. Configure your environment variables.
        <Message type="note">
        This step is optional, but it simplifies your usage of the APIs.
        </Message>

        ```bash
        export SCW_ACCESS_KEY="<API access key>"
        export SCW_SECRET_KEY="<API secret key>"
        export SCW_REGION="<Scaleway region>"
        ```
    2. Generate content from a model by running the following command.
        ```bash
        curl https://api.scaleway.ai/v1/chat/completions \
          -H "Content-Type: application/json" \
          -H "Authorization: Bearer $SCW_SECRET_KEY" \
          -d '{
            "model": "llama-3.3-70b-instruct",
            "messages": [
              {
                "role": "system",
                "content": "You are a helpful assistant."
              },
              {
                "role": "user",
                "content": "Hello!"
              }
            ]
          }'
        ```

    See [How to use Generative APIs](https://www.scaleway.com/en/docs/generative-apis/how-to/)
    for a quickstart and code snippets using `REST` requests or libraries such as the `openai` Python client.
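
    The quickstart request can also be built from Python. The sketch below constructs the same request body as the `curl` example; the commented lines show how to send it with the `openai` client (assuming the package is installed and `SCW_SECRET_KEY` is exported as above).

    ```python
    import json

    # Request body for POST https://api.scaleway.ai/v1/chat/completions,
    # mirroring the curl example above.
    payload = {
        "model": "llama-3.3-70b-instruct",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
    }
    print(json.dumps(payload, indent=2))

    # With the `openai` Python client (pip install openai), the same call is:
    #   import os
    #   from openai import OpenAI
    #   client = OpenAI(base_url="https://api.scaleway.ai/v1",
    #                   api_key=os.environ["SCW_SECRET_KEY"])
    #   response = client.chat.completions.create(**payload)
    #   print(response.choices[0].message.content)
    ```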

    (switchcolumn)
    <Message type="requirement">
    To perform the following steps, you must first ensure that:
      - you have an account and are logged into the [Scaleway console](https://console.scaleway.com/organization)
      - you have created an [API key](https://www.scaleway.com/en/docs/iam/how-to/create-api-keys/) with sufficient [IAM permissions](https://www.scaleway.com/en/docs/iam/reference-content/permission-sets/) to perform the actions described on this page
      - you have [installed `curl`](https://curl.se/download.html)
    </Message>
    (switchcolumn)

    ## Technical Information

    ### Regions

    Scaleway's infrastructure is spread across different [regions and Availability Zones](https://www.scaleway.com/en/docs/account/reference-content/products-availability/).

    Generative APIs is available in the Paris region, which is represented by the following path parameters (optional while there is only one region):
    - `fr-par`

    ### Supported endpoints and features

    Supported endpoints are:
    - `/v1/responses`
    - `/v1/chat/completions`
    - `/v1/audio/transcriptions`
    - `/v1/embeddings`
    - `/v1/rerank`
    - `/v1/batches`
    - `/v1/models`
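
    For instance, available model identifiers can be retrieved from `/v1/models` (an OpenAI-style list endpoint). The sketch below uses only the Python standard library; the commented lines perform the actual call:

    ```python
    import os
    import urllib.request

    # GET https://api.scaleway.ai/v1/models returns an OpenAI-style list:
    # {"object": "list", "data": [{"id": "...", ...}, ...]}
    req = urllib.request.Request(
        "https://api.scaleway.ai/v1/models",
        headers={"Authorization": "Bearer " + os.environ.get("SCW_SECRET_KEY", "")},
    )

    # import json
    # with urllib.request.urlopen(req) as resp:
    #     print([m["id"] for m in json.load(resp)["data"]])
    ```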

    The `/v1/chat/completions` endpoint:
    - Supports many features such as:
      - Structured outputs (JSON response format)
      - Tool calling (i.e. compatibility with workflows using MCP servers)
      - Sending and analyzing images
      - Sending and analyzing audio files
    - Does not yet support the following parameters: `audio`, `metadata`, `modalities`,
    `prediction`, `prompt_cache_key`, `user`, `safety_identifier`,
    `service_tier`, `stream_options.include_obfuscation`, `store`, `system_fingerprint`,
    `web_search_options`.
    - Does not support `custom` tools (only `function` tools are supported). Custom tools require you to provide a custom `grammar` to verify their input format.
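
    As an illustration of tool calling, the sketch below builds a `/v1/chat/completions` request with a single `function` tool; the tool name and parameters are hypothetical:

    ```python
    import json

    # A single `function` tool definition, as accepted by /v1/chat/completions.
    # The function name and parameters below are illustrative only.
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ]

    payload = {
        "model": "llama-3.3-70b-instruct",
        "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
        "tools": tools,
        "tool_choice": "auto",
    }
    print(json.dumps(payload, indent=2))
    ```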

    The `/v1/audio/transcriptions` endpoint:
    - Does not yet support the following parameters: `chunking_strategy`, `include[]` and `timestamp_granularities[]`.
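
    For illustration, a transcription call sketched with the `openai` client (the model identifier below is an assumption; check the model catalog for available audio models). The endpoint expects `multipart/form-data`, which the client encodes for you:

    ```python
    # Fields of the multipart/form-data request to /v1/audio/transcriptions.
    # "whisper-large-v3" is an assumed model identifier; check the model catalog.
    form_fields = {"model": "whisper-large-v3"}

    # The `openai` client handles the multipart encoding:
    #   import os
    #   from openai import OpenAI
    #   client = OpenAI(base_url="https://api.scaleway.ai/v1",
    #                   api_key=os.environ["SCW_SECRET_KEY"])
    #   with open("speech.mp3", "rb") as f:
    #       transcript = client.audio.transcriptions.create(
    #           model=form_fields["model"], file=f)
    #   print(transcript.text)
    ```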

    The `/v1/responses` endpoint:
    - Does not yet support storing conversation state server-side.
    - Does not support Scaleway-side execution of "built-in" tools (such as web or file search) while the model generates a response.
    Currently, only `function` tools are supported. These tools must be fully defined in your query, and you should execute them if the model requests them.
    The execution result is then sent back to the model so that it can use this additional context to produce a suitable answer.
    - Does not support sending the response output object directly as input for the next message. Standard messages using the `role:assistant` and `content` fields should be used instead.
    Specifically, the `custom_tool_call` and `custom_tool_call_output` types are not yet supported. As a workaround, tool call results can be sent using `role:user`, although we recommend using `/chat/completions` for better results in this case.
    - Does not yet support the following parameters: `background`, `conversation`, `include`,
    `instructions`, `max_tool_calls`, `metadata`, `previous_response_id`, `prompt_cache_key`,
    `safety_identifier`, `service_tier`, `stream_options`, `top_logprobs`, `user`, `prompt`, `verbosity`.
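
    A minimal `/v1/responses` request body, as a sketch (the model identifier reuses the quickstart example; `input` uses standard messages since conversation state is not stored server-side):

    ```python
    import json

    # Request body for POST https://api.scaleway.ai/v1/responses.
    payload = {
        "model": "llama-3.3-70b-instruct",
        "input": [
            {
                "role": "user",
                "content": [{"type": "input_text", "text": "Hello!"}],
            }
        ],
    }
    print(json.dumps(payload, indent=2))
    ```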

    The `/v1/batches` endpoint:
    - Supports processing files stored in Object Storage (using the Amazon S3 protocol).
    - Does not yet support the following parameter: `metadata`.
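
    For illustration, each line of a batch input file is a self-contained JSON request. This sketch follows the OpenAI-compatible batch format the endpoint targets; the field values are examples:

    ```python
    import json

    # One line of a batch input file (JSONL): each line is a self-contained
    # request, following the OpenAI-compatible batch format.
    line = {
        "custom_id": "request-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "llama-3.3-70b-instruct",
            "messages": [{"role": "user", "content": "Hello!"}],
        },
    }
    jsonl = json.dumps(line)
    print(jsonl)
    ```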

    The `/v1/rerank` endpoint:
    - Aims for compatibility with the Jina AI API and Cohere API formats (no equivalent OpenAI API exists for this endpoint).
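
    As a sketch, a rerank request body in this format looks like the following (the model identifier is illustrative; see the model catalog for supported reranking models):

    ```python
    import json

    # Request body for POST https://api.scaleway.ai/v1/rerank
    # (Jina AI / Cohere style: a query plus a list of candidate documents).
    payload = {
        "model": "rerank-model-id",  # illustrative identifier
        "query": "What is the capital of France?",
        "documents": [
            "Paris is the capital of France.",
            "Berlin is the capital of Germany.",
        ],
        "top_n": 2,
    }
    print(json.dumps(payload, indent=2))
    ```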

    ### Third party tool integration

    For full details of direct integration into third party tooling, see [Integrating Scaleway Generative APIs with popular AI tools](https://www.scaleway.com/en/docs/generative-apis/reference-content/integrating-generative-apis-with-popular-tools/).
    If your tool is not listed, you can still specify the Scaleway URL and
    API key in most OpenAI-like plugins, as compatibility largely depends on the above APIs.

    ## Technical Limitations

    When choosing a model, select one compatible with the API endpoints you want to use
    in our [model catalog](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
    For example, `/v1/embeddings` is only available for embedding models.

    ## Going Further

    For more information about Generative APIs, you can check out the following pages:

    * [Generative APIs Documentation](https://www.scaleway.com/en/docs/generative-apis/)
    * [Generative APIs FAQ](https://www.scaleway.com/en/docs/generative-apis/faq/)
    * [Scaleway Slack Community - AI Channel](https://scaleway-community.slack.com/archives/C01SGLGRLEA): join the #ai channel
    * [Contact our support team](https://console.scaleway.com/support/tickets)

    ### Troubleshooting

    See [Troubleshooting Generative APIs](https://www.scaleway.com/en/docs/generative-apis/troubleshooting/)
    for descriptions of advanced API behavior and solutions to common issues.
  version: v1
servers:
  - url: https://api.scaleway.ai
tags:
  - name: Responses
    description: |
      A response is a model output for a given input. It represents the functionality of generating a response in various contexts.
  - name: Chat Completions
    description: |
      A chat completion is a model response for a given conversation. It represents the functionality of generating a response in a chat context.
  - name: Embeddings
    description: |
      A vector representation of an input.
      Similar vectors correspond to semantically similar inputs.

      See [How to query embedding models](https://www.scaleway.com/en/docs/generative-apis/how-to/query-embedding-models/)
      for code snippets using `openai` Python client.
  - name: Rerank
    description: |
      A ranking is an ordering of documents by relevance, based on a query.

      See [How to query reranking models](https://www.scaleway.com/en/docs/generative-apis/how-to/query-reranking-models/)
      for code snippets using `openai` Python client.
  - name: Audio
    description: |
      A transcription is a text transcribed from an audio input.

      To support file upload, this API must be queried with `multipart/form-data` content type instead of `application/json`.

      See [How to query audio models](https://www.scaleway.com/en/docs/generative-apis/how-to/query-audio-models/)
      for code snippets using `openai` Python client.
  - name: Batch
    description: |
      A batch is an asynchronous process that executes multiple requests. It takes a file containing the requests to perform
      as input, and writes the results to an output file.

      Files are stored within Scaleway Object Storage.

      See [How to use batch processing](https://www.scaleway.com/en/docs/generative-apis/how-to/use-batch-processing/)
      for code snippets using `openai` Python client.
  - name: Models
    description: |
      A model refers to a system that has been trained to generate content such as text, images, or other data types based on input prompts or instructions.
components:
  schemas:
    ResponseAPIInputContentPart:
      type: object
      properties:
        type:
          type: string
          description: Type of content (required).
          enum:
            - input_text
            - input_image
            - input_file
        text:
          type: string
          description: Text content.
        detail:
          type: string
          description: |-
            Detail level of the image provided to the model.
          enum:
            - high
            - low
            - auto
          default: "auto"
        image_url:
          type: string
          description: |-
            URL of a remote image or encoding in base64 of a local image.

            See [How to query vision models](https://www.scaleway.com/en/docs/generative-apis/how-to/query-vision-models/)
            for code snippets using the `openai` Python client, and guidance for encoding local images.
        file_data:
          type: string
          description: Content of a file.
        file_url:
          type: string
          description: URL of a remote file.
      required:
        - type
    ResponseAPIInputList:
      type: object
      properties:
        role:
          type: string
          description: Role providing the content input.
          enum:
            - system
            - developer
            - user
            - assistant
        content:
          type: array
          description: |-
            List of input contents of different `type`, each compatible with different fields:

            `input_text`: Requires `text` field.

            `input_image`: Requires `detail` and `image_url` fields.

            `input_file`: Requires `file_data` or `file_url` field. Optionally, `filename` can be provided.
          items:
            $ref: '#/components/schemas/ResponseAPIInputContentPart'
        status:
          type: string
          description: Status of the response.
          enum:
            - in_progress
            - completed
            - incomplete
        type:
          type: string
          description: Type of the content input. Always set to `message`.
      required:
        - role
    ResponseAPIAnnotation:
      type: object
    ResponseAPITool:
      $ref: '#/components/schemas/ResponseAPIFunctionObject'
    ResponseAPIOutputContentPart:
      type: object
      properties:
        type:
          type: string
          description: Type of content. Always set to `output_text`.
          enum:
            - output_text
        text:
          type: string
          description: Text content.
        annotations:
          type: array
          description: Annotations of the text output, such as citations or path to a file.
          items:
            $ref: '#/components/schemas/ResponseAPIAnnotation'
      x-properties-order:
        - type
        - text
        - annotations
    ResponseAPIOutputList:
      type: object
      properties:
        role:
          type: string
          description: Role generating the content output. Always set to `assistant`.
          enum:
            - assistant
        type:
          type: string
          description: |-
            Type of the output. Each type outputs different fields:

            `message`: outputs the `content` field

            `function_call`: outputs the `call_id`, `name` and `arguments` fields
          enum:
            - message
            - function_call
            - reasoning
        id:
          type: string
          description: UUID of the message within a response.
        status:
          type: string
          description: Status of the response.
          enum:
            - in_progress
            - completed
            - incomplete
        content:
          type: array
          description: |-
            List of text output contents.
          items:
            $ref: '#/components/schemas/ResponseAPIOutputContentPart'
        call_id:
          type: string
          description: UUID of the function tool call.
        name:
          type: string
          description: Name of the function to execute.
        arguments:
          type: string
          description: |-
            Arguments to pass to the function, formatted as a JSON string.

            Example: `{"city": "Paris","timezone": "UTC+2"}`
      x-properties-order:
        - role
        - type
        - id
        - status
        - content
        - call_id
        - name
        - arguments
    ResponseAPIUsage:
      type: object
      properties:
        input_tokens:
          type: integer
          description: Number of input tokens.
        input_tokens_details:
          type: object
          description: Breakdown of input tokens by type.
          nullable: true
          properties:
            cached_tokens:
              type: integer
              description: Number of cached input tokens.
        output_tokens:
          type: integer
          description: Number of output tokens.
        output_tokens_details:
          type: object
          description: Breakdown of output tokens by type.
          nullable: true
          properties:
            reasoning_tokens:
              type: integer
              description: Number of output tokens used for reasoning.
        total_tokens:
          type: integer
          description: Total number of tokens (input and output).
    ChatCompletionMessageToolCall:
      type: object
      properties:
        id:
          type: string
          description: UUID of the tool call.
        type:
          type: string
          description: Type of tool call, always set to `function`.
          enum:
            - function
        function:
          type: object
          description: Function to call, identified by the model.
          properties:
            name:
              type: string
              description: Name of the function to call.
            arguments:
              type: string
              description:
                Arguments to call the function with, as generated by the model
                in JSON format. Note that the model may not always generate
                valid JSON or parameters. Validate the arguments in your code before
                calling a function.
          x-properties-order:
            - name
            - arguments
      x-properties-order:
        - id
        - type
        - function
    ChatCompletionMessageToolCalls:
      type: array
      description: List of tool calls required by the model, such as function calls.
      items:
        $ref: '#/components/schemas/ChatCompletionMessageToolCall'
    ChatCompletionRequestMessageContentPart:
      type: object
      properties:
        type:
          type: string
          description: Type of content. `image_url` and `input_audio` are only supported with `user` role.
          enum:
            - text
            - image_url
            - input_audio
        text:
          type: string
          description: Text content. Required if `type` is set to `text`.
        image_url:
          type: object
          properties:
            url:
              type: string
              description: |-
                URL of a remote image or encoding in base64 of a local image. Required if `type` is set to `image_url`.

                See [How to query vision models](https://www.scaleway.com/en/docs/generative-apis/how-to/query-vision-models/)
                for code snippets using `openai` Python client or how to encode local images.
        input_audio:
          type: object
          properties:
            data:
              type: string
              description: |-
                Encoding in base64 of a local audio file. Required if `type` is set to `input_audio`.

                See [How to query audio models](https://www.scaleway.com/en/docs/generative-apis/how-to/query-audio-models/)
                for code snippets using `openai` Python client or how to encode local audio files.
            format:
              type: string
              description: |-
                Format of the encoded audio file. Currently, the only supported values are `wav` and `mp3`.
      required:
        - type
    ChatCompletionRequestMessage:
      type: object
      properties:
        role:
          type: string
          description: Role of the message's author.
          enum:
            - system
            - user
            - assistant
            - tool
        content:
          type: string
          description: |-
            Content of a message as string. Required for all roles, except `assistant` if `tool_calls` is specified instead.
          x-one-of: content
        content​:
          type: array
          description: |-
            Content of a message as array of content parts. Required for all roles, except `assistant` if `tool_calls` is specified instead.
          items:
            $ref: '#/components/schemas/ChatCompletionRequestMessageContentPart'
          x-one-of: content
        tool_calls:
          type: array
          description: |-
            List of tool calls required by the model. Can only be used with `assistant` if `content` is not specified.
          items:
            $ref: '#/components/schemas/ChatCompletionMessageToolCall'
        tool_call_id:
          type: string
          description: UUID of the tool call. Must only be used with `tool` role.
      required:
        - role
    ChatCompletionResponseMessage:
      type: object
      description: Message generated by the model.
      properties:
        role:
          type: string
          description: Role of the message's author, always set to `assistant` in the response.
          enum:
            - assistant
        content:
          type: string
          description: Content of the message.
        reasoning_content:
          type: string
          description: Reasoning content generated for this message.
        tool_calls:
          $ref: '#/components/schemas/ChatCompletionMessageToolCalls'
      x-properties-order:
        - role
        - content
        - reasoning_content
        - tool_calls
    ChatCompletionStreamOptions:
      type: object
      description: >
        An object containing parameters that modify the behavior of stream responses.
        Can only be used if `stream` is set to `true`.
      nullable: true
      default: null
      properties:
        include_usage:
          type: boolean
          description: >
            Defines whether a usage field is included in a stream.
            If set, an additional chunk will be streamed before the `data:
            [DONE]` message. The `usage` field on this chunk shows the token usage
            statistics for the complete stream.
    ChatCompletionTokenLogprob:
      type: object
      properties:
        token:
          description: Token generated.
          type: string
        logprob:
          description:
            Log probability of generating this token, if it is among the top 20 most
            likely tokens. Otherwise, the value `-9999.0` is used to mean
            that the token is very unlikely.
          type: number
        bytes:
          description:
            List of integers representing the UTF-8 byte representation (in decimal
            format) of a token. Since some characters may be represented by multiple
            tokens, these byte representations can be combined to recover the
            corresponding UTF-8 characters.
          type: array
          items:
            type: integer
          nullable: true
        top_logprobs:
          description:
            List of most probable next tokens and their log probability.
          type: array
          items:
            type: object
            properties:
              token:
                description: A token among the next most likely ones.
                type: string
              logprob:
                description:
                  Log probability of generating this token, if it is among the top 20 most
                  likely tokens. Otherwise, the value `-9999.0` will be used to mean
                  that the token is very unlikely.
                type: number
              bytes:
                description:
                  List of integers representing the UTF-8 byte representation (in decimal
                  format) of a token. Since some characters may be represented by multiple
                  tokens, these byte representations can be combined to recover the
                  corresponding UTF-8 characters.
                type: array
                items:
                  type: integer
                nullable: true
            x-properties-order:
              - token
              - logprob
              - bytes
      x-properties-order:
        - token
        - logprob
        - bytes
        - top_logprobs
    ChatCompletionTool:
      type: object
      properties:
        type:
          type: string
          enum:
            - function
          description: Type of tool object, always set to `function`.
        function:
          $ref: '#/components/schemas/FunctionObject'
      required:
        - type
        - function
    ChatCompletionToolChoiceOption:
      type: string
      description: |-
        Defines whether a model can call tools, and if so, which ones.

        `none`: model will not call any tools, and only generate a message.

        `auto`: model can choose either to generate a message, or to call one or
        multiple tools.

        `required`: model must call one or multiple tools.

        Default: `none` when no tools are present, otherwise `auto`.

        An object can also be provided to specify a tool that the model
        must call. Object format must be:

        `{"type": "function", "function": {"name": "function_name_as_provided_in_tools"}}`
      enum:
        - none
        - auto
        - required
    ChatCompletionResponseChoice:
      type: object
      properties:
        index:
          type: integer
          description: Index of the choice in the list of choices.
        message:
          $ref: '#/components/schemas/ChatCompletionResponseMessage'
        logprobs:
          description: Object containing log probability information for each token in a generated response.
          type: object
          nullable: true
          properties:
            content:
              description: List of content tokens and their log probability information.
              type: array
              items:
                $ref: '#/components/schemas/ChatCompletionTokenLogprob'
              nullable: true
            refusal:
              description: List of refusal tokens and their log probability information.
              type: array
              items:
                $ref: '#/components/schemas/ChatCompletionTokenLogprob'
              nullable: true
        finish_reason:
          type: string
          description: |
            Reason the model stopped generating tokens.

            `stop`: model successfully reached the end of its answer, or a provided
            stop sequence

            `length`: maximum number of output tokens was reached, blocking further generation

            `tool_calls`: model needed to call a tool
          enum:
            - stop
            - length
            - tool_calls
    ChatCompletionUsage:
      type: object
      properties:
        prompt_tokens:
          type: integer
          description: Number of input tokens.
        total_tokens:
          type: integer
          description: Total number of tokens (input and output).
        completion_tokens:
          type: integer
          description: Number of output tokens.
        completion_tokens_details:
          type: object
          description: Breakdown of output tokens by type.
          nullable: true
          properties:
            reasoning_tokens:
              type: integer
              description: Number of output tokens used for reasoning.
        prompt_tokens_details:
          type: object
          description: Breakdown of input tokens by type.
          nullable: true
          properties:
            audio_tokens:
              type: integer
              description: Number of audio input tokens.
      x-properties-order:
        - prompt_tokens
        - total_tokens
        - completion_tokens
        - completion_tokens_details
        - prompt_tokens_details
    CreateResponse:
      type: object
      properties:
        id:
          type: string
          description: UUID of the response.
        object:
          type: string
          description: Type of response object, always set to `response`.
          enum:
            - response
        created_at:
          type: integer
          description: Timestamp when the response was generated (Unix format, in seconds).
        status:
          type: string
          description: Status of the response.
          enum:
            - in_progress
            - completed
            - incomplete
        model:
          type: string
          description: Unique identifier of the model.
        output:
          type: array
          description: List of outputs generated by the model as a response.
          minItems: 1
          items:
            $ref: '#/components/schemas/ResponseAPIOutputList'
        text:
          type: object
          description: Configuration of the response format, either plain text or JSON structured data.
          properties:
            format:
              type: object
              description: Output format type.
              properties:
                type:
                  type: string
                  description: Type of the output format.
                  enum:
                  - text
                  - json_schema
                  - json_object
        usage:
          $ref: '#/components/schemas/ResponseAPIUsage'
      x-properties-order:
        - id
        - object
        - created_at
        - status
        - model
        - output
        - text
        - usage
    CreateChatCompletionResponse:
      type: object
      properties:
        id:
          type: string
          description: UUID of the response.
        object:
          type: string
          description: Type of response object, always set to `chat.completion`.
          enum:
            - chat.completion
        created:
          type: integer
          description: Timestamp when the response was generated (Unix format, in seconds).
        model:
          type: string
          description: Unique identifier of the model.
        choices:
          type: array
          description:
            List of chat completion variations. Defaults to only `1` choice,
            but can be increased by setting a value for `n` in the request.
          items:
            $ref: '#/components/schemas/ChatCompletionResponseChoice'
        usage:
          $ref: '#/components/schemas/ChatCompletionUsage'
      x-properties-order:
        - id
        - object
        - created
        - model
        - choices
        - usage
    CreateEmbeddingResponse:
      type: object
      properties:
        id:
          type: string
          description: UUID of the response.
        object:
          type: string
          description: Type of response object, always set to `list`.
          enum:
            - list
        created:
          type: integer
          description: Timestamp when the response was generated (Unix format, in seconds).
        model:
          type: string
          description: Unique identifier of the model.
        data:
          type: array
          description: List of embeddings.
          items:
            $ref: '#/components/schemas/Embedding'
        usage:
          type: object
          description: Usage information generated by this request.
          properties:
            prompt_tokens:
              type: integer
              description: Number of input tokens.
            total_tokens:
              type: integer
              description: Total number of tokens (input and output).
            completion_tokens:
              type: integer
              description: Number of output tokens. Always set to `0` for embedding models, since no tokens are generated (only vector coordinates).
      x-properties-order:
        - id
        - object
        - created
        - model
        - data
        - usage
    Embedding:
      type: object
      properties:
        index:
          type: integer
          description: Index of the embedding in the list of embeddings.
        object:
          type: string
          description: Type of the response object, always set to `embedding`.
          enum:
            - embedding
        embedding:
          type: array
          description: >
            Embedding vector, represented as a list of floating point values. The length of
            a vector is equal to the number of dimensions of the model.
          items:
            type: number
    CreateRerankResponse:
      type: object
      properties:
        id:
          type: string
          description: UUID of the response.
        model:
          type: string
          description: Unique identifier of the model.
        results:
          type: array
          description: List of documents sorted by relevance.
          items:
            $ref: '#/components/schemas/Ranking'
        usage:
          type: object
          description: Usage information generated by this request.
          properties:
            total_tokens:
              type: integer
              description: Total number of tokens (reranking models only use input tokens).
      x-properties-order:
        - id
        - model
        - results
        - usage
    Ranking:
      type: object
      properties:
        index:
          type: integer
          description: Index of the document in the initial request.
        relevance_score:
          type: number
          description: >
            Document's relevance to answering the query.
        document:
          type: object
          description: Document sent in the request.
          properties:
            text:
              type: string
              description: Content of the document.
    CreateAudioTranscriptionResponse:
      type: object
      properties:
        text:
          type: string
          description: Transcribed text.
        usage:
          type: object
          description: |-
            Usage information generated by this request, either in tokens or duration depending
            on how the model is billed.
          properties:
            type:
              type: string
              description: Usage type for this model. Either `duration` or `tokens`.
            seconds:
              type: number
              description: Audio input duration, in seconds.
      x-properties-order:
        - text
        - usage
    Batch:
      type: object
      properties:
        id:
          type: string
          description: UUID of the batch.
        object:
          type: string
          description: Type of batch object, always set to `batch`.
          enum:
            - batch
        endpoint:
          type: string
          description: Path used to process requests in the batch.
        model:
          type: string
          description: Model used to process the batch.
        errors:
          type: object
          description: Error object.
          properties:
            object:
              type: string
              description: Type of batch object, always set to `list`.
            data:
              type: array
              description: Error details.
              items:
                type: object
                properties:
                  code:
                    type: string
                    description: Code identifying the error.
                  line:
                    type: integer
                    description: Line number in the file where the error occurred.
                  message:
                    type: string
                    description: Error message.
                  param:
                    type: string
                    description: Name of the parameter that triggered the error, if applicable.
        input_file_id:
          type: string
          description: URL of the input file.
        completion_window:
          type: string
          description: Time range during which the batch should be processed.
        status:
          type: string
          description: Status of the batch.
        output_file_id:
          type: string
          description: URL of the output file.
        error_file_id:
          type: string
          description: URL of the error file.
        created_at:
          type: integer
          description: Timestamp when the batch was created (Unix format, in seconds).
        in_progress_at:
          type: integer
          description: Timestamp when the batch processing started (Unix format, in seconds).
        expires_at:
          type: integer
          description: Timestamp when the batch will expire (Unix format, in seconds).
        finalizing_at:
          type: integer
          description: Timestamp when the batch started finalizing (Unix format, in seconds).
        completed_at:
          type: integer
          description: Timestamp when the batch was completed (Unix format, in seconds).
        failed_at:
          type: integer
          description: Timestamp when the batch failed (Unix format, in seconds).
        expired_at:
          type: integer
          description: Timestamp when the batch expired (Unix format, in seconds).
        cancelling_at:
          type: integer
          description: Timestamp when the batch started cancelling (Unix format, in seconds).
        cancelled_at:
          type: integer
          description: Timestamp when the batch was cancelled (Unix format, in seconds).
        request_counts:
          type: object
          description: Number of requests by status.
          properties:
            completed:
              type: integer
              description: Number of requests completed successfully.
            failed:
              type: integer
              description: Number of failed requests.
            total:
              type: integer
              description: Total number of requests.
        usage:
          type: object
          description: |-
            Usage information generated by this request, either in tokens or duration depending
            on how the model is billed.
          properties:
            type:
              type: string
              description: Usage type for this model. Either `duration` or `tokens`.
            duration:
              type: number
              description: Audio input duration, in seconds.
            prompt_tokens:
              type: integer
              description: Number of input tokens.
            total_tokens:
              type: integer
              description: Total number of tokens (input and output).
            completion_tokens:
              type: integer
              description: Number of output tokens. Always set to `0` for embedding models, since no tokens are generated (only vector coordinates).
      x-properties-order:
        - id
        - object
        - endpoint
        - model
        - input_file_id
        - output_file_id
        - error_file_id
        - errors
        - completion_window
        - status
        - created_at
        - in_progress_at
        - expires_at
        - finalizing_at
        - completed_at
        - failed_at
        - expired_at
        - cancelling_at
        - cancelled_at
        - request_counts
        - usage
    ListBatchResponse:
      type: object
      properties:
        object:
          type: string
          description: Type of response object, always set to `list`.
          enum:
            - list
        data:
          type: array
          description: List of batches.
          items:
            $ref: '#/components/schemas/Batch'
        first_id:
          type: string
          description: UUID of first batch in the response.
        last_id:
          type: string
          description: UUID of last batch in the response.
        has_more:
          type: boolean
          description: Defines whether there are more results to retrieve beyond those returned by this query.
      x-properties-order:
        - object
        - data
        - first_id
        - last_id
        - has_more
    ResponseAPIFunctionObject:
      type: object
      properties:
        type:
          type: string
          enum:
            - function
          description: Type of tool object, always set to `function`.
        name:
          type: string
          description:
            Name of the function to be called. Must contain only `a-z`, `A-Z`, `0-9`,
            underscores and dashes, with a maximum length of `64` characters.
        description:
          type: string
          description:
            Description of the function. This helps the model
            choose the right function when needed.
        parameters:
          $ref: '#/components/schemas/FunctionParameters'
        strict:
          type: boolean
          nullable: true
          description: |-
            Defines whether to enforce strict schema adherence when generating a
            function call. If set to `true`, the model will follow the exact
            schema defined in the `parameters` field. Currently, this parameter
            is ignored even if set to `true`, and acts as if set to `false`.
            We recommend you check the output schema before calling any functions or tools.

            **Default:** `false`
      required:
        - type
        - name
    FunctionObject:
      type: object
      properties:
        description:
          type: string
          description:
            Description of the function. This helps the model
            choose the right function when needed.
        name:
          type: string
          description:
            Name of the function to be called. Must contain only `a-z`, `A-Z`, `0-9`,
            underscores and dashes, with a maximum length of `64` characters.
        parameters:
          $ref: '#/components/schemas/FunctionParameters'
        strict:
          type: boolean
          nullable: true
          description: |-
            Defines whether to enforce strict schema adherence when generating a
            function call. If set to `true`, the model will follow the exact
            schema defined in the `parameters` field. Currently, this parameter
            is ignored even if set to `true`, and acts as if set to `false`.
            We recommend you check the output schema before calling any functions or tools.

            **Default:** `false`
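            Putting the fields above together, a function definition can be
            sketched as follows (a minimal illustration; the regular-expression
            check mirrors the `name` constraint described above and is not part
            of the API):

            ```python
            import re

            # Illustrative function definition for tool calling.
            get_weather = {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }

            # `name` must only contain a-z, A-Z, 0-9, underscores and dashes,
            # with a maximum length of 64 characters.
            assert re.fullmatch(r"[A-Za-z0-9_-]{1,64}", get_weather["name"])
            ```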
      required:
        - name
    FunctionParameters:
      type: object
      description: >-
        Parameters of the function, described as a JSON schema object.
        See [How to use function calling](https://www.scaleway.com/en/docs/generative-apis/how-to/use-function-calling/) for examples,
        and the [JSON schema
        reference](https://json-schema.org/understanding-json-schema/) for
        documentation about the format.

        Omitting `parameters` defines a function with an empty parameter list.
      additionalProperties: true
    ListModelsResponse:
      type: object
      properties:
        object:
          type: string
          description: Type of response object, always set to `list`.
          enum:
            - list
        data:
          type: array
          description: List of models.
          items:
            $ref: '#/components/schemas/Model'
      x-properties-order:
        - object
        - data
    Model:
      type: object
      properties:
        id:
          type: string
          description: Unique identifier of the model.
        object:
          type: string
          description: Object type. Always set to `model`.
          enum:
            - model
        created:
          type: integer
          description: Timestamp when the model was created (Unix format, in seconds).
        owned_by:
          type: string
          description: Name of the organization that created the model (i.e. the model provider).
      x-properties-order:
        - id
        - object
        - created
        - owned_by
    ParallelToolCalls:
      description: |-
        Defines whether the model can call multiple tools. Currently, this
        parameter is ignored even if set to `false`, and acts as if set to `true`.

        Only [specific models](https://www.scaleway.com/en/docs/managed-inference/reference-content/model-catalog/#model-details)
        can call multiple tools in a single response.

        **Default value:** `true`
      type: boolean
    MaxOutputTokens:
      description: |-
        Maximum number of output tokens that can be generated
        for a `completion`.
        Different [default maximum values](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/)
        are enforced for each model, to avoid edge cases where tokens are
        generated indefinitely. These values are not enforced
        in [Managed Inference](https://www.scaleway.com/en/inference/).
      type: integer
      nullable: true
    ResponseFormatChatCompletion:
      type: object
      description: |-
        Output format specification.

        Using `{ "type": "json_schema", "json_schema": {...} }`
        enables the model to output only a valid JSON following the provided schema specification.

        Using `{ "type": "json_object" }` enables `JSON mode` (deprecated),
        which should no longer be used.

        See [How to use structured outputs](https://www.scaleway.com/en/docs/generative-apis/how-to/use-structured-outputs/)
        for code snippets using `openai` Python client and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/)
        for documentation about the format.
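        As a minimal sketch, a `json_schema` response format can be built as
        follows (the `name`/`schema` wrapper follows OpenAI-compatible
        conventions, and the schema contents are illustrative):

        ```python
        import json

        # Illustrative response_format requesting a {"name": ..., "age": ...} object.
        response_format = {
            "type": "json_schema",
            "json_schema": {
                "name": "person",
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "age": {"type": "integer"},
                    },
                    "required": ["name", "age"],
                },
            },
        }

        # Serializes to the JSON fragment expected in the request body.
        body = json.dumps({"response_format": response_format})
        ```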
      properties:
        type:
          type: string
          description: Type of response object.
          enum:
            - text
            - json_schema
            - json_object
        json_schema:
          type: object
          description: |-
            Schema the response object should follow in `JSON` format. This field
            can only be used if `type` is set to `json_schema`.
      required:
        - type
    ResponseFormatResponseAPI:
      type: object
      description: |-
        Output format specification.

        Using `{ "type": "text" }` ensures the model outputs plain text (default behavior).

        Using `{ "type": "json_schema", "name": ..., "schema": {...}, "description": ...,"strict": true}`
        enables the model to output only valid JSON following the provided schema specification.

        Using `{ "type": "json_object" }` enables `JSON mode` (deprecated),
        which should no longer be used.

        See [How to use structured outputs](https://www.scaleway.com/en/docs/generative-apis/how-to/use-structured-outputs/)
        for code snippets using the `openai` Python client, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/)
        for documentation about the format.
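        For example, the following `format` object requests structured output
        (the schema itself is illustrative, not part of this specification):

        ```python
        # Illustrative format payload for the Responses API `text.format` field.
        fmt = {
            "type": "json_schema",
            "name": "haiku_review",
            "description": "A short review of a haiku.",
            "schema": {
                "type": "object",
                "properties": {"score": {"type": "integer"}},
                "required": ["score"],
            },
            "strict": True,
        }
        ```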
      properties:
        type:
          type: string
          description: |-
            Type of response object. The properties `name`, `schema`, `description` and `strict`
            can only be used if `type` is set to `json_schema`.
          enum:
            - text
            - json_schema
            - json_object
        name:
          type: string
          description: |-
            Name of the response format. Must only contain alphanumeric characters, underscores and dashes.
        description:
          type: string
          description: |-
            Description of the response format. This helps the model
            generate a response that follows the desired structure.
        schema:
          type: object
          description: |-
            Schema the response object should follow in `JSON` format. This field
            can only be used if `type` is set to `json_schema`. [Learn more](https://www.scaleway.com/en/docs/generative-apis/how-to/use-structured-outputs/)
        strict:
          type: boolean
          nullable: true
          description: |-
            Defines whether to enforce strict schema adherence when generating
            structured output. Currently, only `true` is supported.

            **Default:** `true`
      required:
        - type
        - name
        - schema
    StopConfiguration:
      description: |-
        String, or array of strings, that when encountered in the generated text
        will stop the model from generating further output tokens.
        The generated text will not include any of the specified stop sequences.
        A maximum of 4 sequences can be provided.
      default: null
      nullable: true
      oneOf:
        - type: string
        - type: array
          items:
            type: string
          maxItems: 4
    Temperature:
      description: |-
        Value between `0` and `2` which increases randomness in token generation (e.g. encourages content "creativity" instead of "predictability").

        `temperature:0` means the distribution learned by the model will be used directly, favoring a subset of the most probable tokens at each generation step.

        `temperature>0` means randomness is added to the learnt distribution, so that tokens with a lower probability can also be generated.

        `temperature>=1` means the added randomness is so high that almost all tokens become equally probable, potentially leading the model to mix languages.

        The ideal `temperature` value depends on the use case and model. We recommend setting `temperature` to the recommended value for each model,
        as shown in Console Playground (these values are used by default).

        Note that `temperature` does not affect request reproducibility, which is controlled only by the `seed` parameter.
        With the same `seed` and `temperature`, two identical requests to a model will generate the same response.
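        The effect can be illustrated with a temperature-scaled softmax over
        example logits (an illustration of the sampling principle, not the
        model's actual implementation):

        ```python
        import math

        def softmax_with_temperature(logits, temperature):
            # Scale logits by 1/temperature before normalizing; lower values
            # concentrate probability on the most likely tokens.
            scaled = [l / temperature for l in logits]
            m = max(scaled)
            exps = [math.exp(s - m) for s in scaled]
            total = sum(exps)
            return [e / total for e in exps]

        logits = [2.0, 1.0, 0.1]
        low = softmax_with_temperature(logits, 0.2)   # near-deterministic
        high = softmax_with_temperature(logits, 2.0)  # closer to uniform
        ```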
      type: number
      minimum: 0
      maximum: 2
      nullable: true
    TopP:
      description: |-
        Value between `0` and `1` which increases the proportion of token vocabulary considered during generation (`0` cannot be used).

        `top_p: 0.9` means the next token will be chosen from the 90% most probable tokens at each generation step.

        We recommend setting `top_p` to the recommended value for each model, as shown in Console Playground (these values are used by default).
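        The selection can be illustrated as follows (an illustration of
        nucleus sampling, not the model's actual implementation):

        ```python
        def nucleus(probs, top_p):
            # Keep the smallest set of highest-probability tokens whose
            # cumulative probability reaches top_p.
            ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
            kept, total = [], 0.0
            for idx, p in ranked:
                kept.append(idx)
                total += p
                if total >= top_p:
                    break
            return kept

        probs = [0.5, 0.3, 0.15, 0.05]
        considered = nucleus(probs, 0.9)  # token indices kept for sampling
        ```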
      type: number
      minimum: 0
      maximum: 1
      nullable: true
    ProjectId:
      description: |-
        The ID of the Project you want to target. If this value is not provided,
        your default Project will be used.

        Specifying this value allows you to limit access through IAM policies,
        or to allocate consumption and billing to a specific project.
      type: string
      example: example-f295-43f0-9433-ab3e04445856
paths:
  /{project_id}/v1/responses:
    post:
      operationId: createResponse
      tags:
        - Responses
      summary: Create a response
      description: |-
        Create a model response for a given input.
        This method accepts a sequence of messages (a chat conversation) and returns a response generated by the model.

        Currently, this API **does not store** inputs and `store` is set to `false` even if not provided.
      parameters:
        - in: path
          name: project_id
          required: true
          schema:
            $ref: '#/components/schemas/ProjectId'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                input:
                  type: array
                  description: |-
                    String or list of inputs to provide to the model to generate a response. Use an array of inputs to provide multiple strings and/or other content types.
                  minItems: 1
                  items:
                    $ref: '#/components/schemas/ResponseAPIInputList'
                model:
                  description: |-
                    Unique identifier of the model. For now, the Responses API only supports `gpt-oss-120b`.

                    Refer to our [supported models](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/) list
                    or [/models](#path-models-list-models) endpoint for available models.
                  type: string
                  example: gpt-oss-120b
                max_output_tokens:
                  $ref: '#/components/schemas/MaxOutputTokens'
                parallel_tool_calls:
                  $ref: '#/components/schemas/ParallelToolCalls'
                instructions:
                  description: |-
                    System message added to the model's context.
                  type: string
                  nullable: true
                reasoning:
                  description: |-
                    Configuration parameters for reasoning models.
                  type: object
                  properties:
                    effort:
                      type: string
                      description: |-
                        Reasoning effort level to generate the response.
                        `minimal` is currently not supported.
                      enum:
                        - low
                        - medium
                        - high
                store:
                  description: |-
                    Defines whether to store the input content for future requests.
                    `store` is currently not supported and always set to `false`.
                  type: boolean
                  default: false
                stream:
                  description: |-
                    Defines whether the model's response can be streamed to the client
                    using server-sent events.

                    The response will be streamed in chunks over HTTP, where each chunk except
                    the last contains the following content:

                    `data: {"id": ..., "model": ..., "output":...}`

                    The last chunk will contain `data: [DONE]`.

                    Note that the object `{"id": ..., "model": ..., "output":...}` follows the same format as
                    a non-stream HTTP request.

                    See [How to query language models using streaming](https://www.scaleway.com/en/docs/generative-apis/how-to/query-language-models/#streaming)
                    for examples, and [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events)
                    for reference documentation about SSE format.

                    **Default value:** `false`
                  type: boolean
                  nullable: true
                temperature:
                  $ref: '#/components/schemas/Temperature'
                text:
                  type: object
                  description: Configuration of the response format, either plain text or JSON structured data.
                  properties:
                    format:
                      $ref: '#/components/schemas/ResponseFormatResponseAPI'
                tools:
                  type: array
                  description: >
                    List of tools the model can call, such as functions.
                    A maximum of `128` tools can be provided.
                    See [How to use function calling](https://www.scaleway.com/en/docs/generative-apis/how-to/use-function-calling/)
                    for code snippets using the `openai` Python client.
                  items:
                    $ref: '#/components/schemas/ResponseAPITool'
                tool_choice:
                  $ref: '#/components/schemas/ChatCompletionToolChoiceOption'
                top_logprobs:
                  description: |-
                    Number of most likely tokens to return for each token generated, along with their
                    generation log probability.
                    Value must be between `0` and `20`.
                    `logprobs` must be set to `true` to use this parameter.
                  type: integer
                  minimum: 0
                  maximum: 20
                  nullable: true
                top_p:
                  $ref: '#/components/schemas/TopP'
                truncation:
                  description: |-
                    Truncation configuration for the model response.
                    Only `disabled` is currently supported.
                  type: string
                  default: "disabled"
                  nullable: true
              required:
              - model
              x-properties-order:
              - input
              - model
              - max_output_tokens
              - parallel_tool_calls
              - instructions
              - reasoning
              - store
              - stream
              - temperature
              - text
              - tools
              - tool_choice
              - top_logprobs
              - top_p
              - truncation
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CreateResponse'
      x-codeSamples:
        - lang: cURL
          source: |
            curl https://api.scaleway.ai/v1/responses \
              -H "Content-Type: application/json" \
              -H "Authorization: Bearer $SCW_SECRET_KEY" \
              -d '{
                "model": "gpt-oss-120b",
                "input": [
                  {
                    "role": "user",
                    "content": [
                      {
                        "type": "input_text",
                        "text": "Write a haiku about Cloud."
                      }
                    ]
                  }
                ]
              }'
  /{project_id}/v1/chat/completions:
    post:
      operationId: createChatCompletion
      tags:
        - Chat Completions
      summary: Create a chat completion
      description: |-
        Create a model response for a given chat conversation.
        This method accepts a sequence of messages (a chat conversation) and returns a response generated by the model.

        Conversation `messages` **are not stored** and need to be sent in each
        `/chat/completions` API call.
      parameters:
        - in: path
          name: project_id
          required: true
          schema:
            $ref: '#/components/schemas/ProjectId'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                model:
                  description: |-
                    Unique identifier of the model, such as `llama-3.3-70b-instruct` or `mistral-small-3.2-24b-instruct-2506`.

                    Refer to our [supported models](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/) list
                    or [/models](#path-models-list-models) endpoint for available models.
                  type: string
                  example: llama-3.3-70b-instruct
                messages:
                  description: |-
                    Array of messages representing the conversation history.
                  type: array
                  minItems: 1
                  items:
                    $ref: '#/components/schemas/ChatCompletionRequestMessage'
                max_completion_tokens:
                  $ref: '#/components/schemas/MaxOutputTokens'
                max_tokens:
                  description: |-
                    Use `max_completion_tokens` instead. Maximum number of total tokens
                    allowed for a completion, including both input and output tokens.
                  type: integer
                  nullable: true
                  deprecated: true
                frequency_penalty:
                  description: |-
                    Value which influences the likelihood of generating tokens based on their frequency in the existing text.
                    When set to a positive value, it reduces the probability of repeating tokens that have already appeared.
                  type: number
                  minimum: -2
                  maximum: 2
                  nullable: true
                logit_bias:
                  type: object
                  default: null
                  nullable: true
                  additionalProperties: true
                  description: >
                    List of token IDs with associated bias integer values ranging from `-100` to `100`.
                    This parameter adjusts the probability of these tokens being generated during the model's output.

                    A JSON object must be provided in the following format:
                    `{"354": 80,"143": -50}` where `354` and `143` are token IDs from the tokenizer used
                    with this model. Positive values increase the likelihood of a token being generated, while negative values reduce it.

                    Model `qwen3.5-397b-a17b` does not support this field.
                logprobs:
                  description: |-
                    Defines whether to return log probabilities of each output token.
                    This allows you to see the likelihood of each token being generated.
                  type: boolean
                  default: false
                  nullable: true
                n:
                  description: |-
                    Number of chat completion choices to generate for a given input.
                    The value of `n` multiplies the number of generated tokens,
                    resulting in `n` separate responses for each input.
                  type: integer
                  minimum: 1
                  maximum: 128
                  default: 1
                  nullable: true
                parallel_tool_calls:
                  $ref: '#/components/schemas/ParallelToolCalls'
                presence_penalty:
                  description: |-
                    Value which influences the probability of generating tokens that have
                    already appeared in the text. Positive values reduce the likelihood of
                    repeating a token, regardless of how many times it has already appeared.
                  type: number
                  minimum: -2
                  maximum: 2
                  nullable: true
                reasoning_effort:
                  description: |-
                    Reasoning effort level to generate the response. `minimal` is currently not supported.

                    For `qwen3.5-397b-a17b` model:
                      - `none` value is supported
                      - `low` and `high` values are similar to `medium`

                    For `gpt-oss-120b` model:
                      - `none` value is not supported
                  type: string
                  enum:
                    - none
                    - low
                    - medium
                    - high
                  default: medium
                response_format:
                  $ref: '#/components/schemas/ResponseFormatChatCompletion'
                seed:
                  description: |-
                    Value which controls the randomness of the output to ensure determinism.
                    When using the same seed value along with identical input and parameters,
                    you should receive the same model response each time. This holds true even when
                    temperature is set above 0.

                    Note that fully deterministic output is not guaranteed over long periods of time (such
                    as several months), as the inference model may be updated and optimized.
                  type: integer
                  minimum: -9223372036854775808
                  maximum: 9223372036854775807
                  nullable: true
                stop:
                  $ref: '#/components/schemas/StopConfiguration'
                stream:
                  description: |-
                    Defines whether the model's response can be streamed to the client
                    using server-sent events.

                    The response will be streamed in chunks over HTTP, where each chunk except
                    the last contains the following content:

                    `data: {"id": ..., "model": ..., "choices":...}`

                    The last chunk will contain `data: [DONE]`.

                    Note that the object `{"id": ..., "model": ..., "choices":...}` follows the same format as
                    a non-stream HTTP request.

                    See [How to query language models using streaming](https://www.scaleway.com/en/docs/generative-apis/how-to/query-language-models/#streaming)
                    for examples, and [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events)
                    for reference documentation about SSE format.

                    **Default value:** `false`
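
                    For illustration, the chunk format above can be parsed as
                    follows (the sample lines are illustrative, not captured
                    from a live response):

                    ```python
                    import json

                    def parse_sse(lines):
                        # Decode `data:` lines into JSON objects, stopping at [DONE].
                        for line in lines:
                            if not line.startswith("data: "):
                                continue
                            payload = line[len("data: "):]
                            if payload == "[DONE]":
                                return
                            yield json.loads(payload)

                    sample = [
                        'data: {"id": "c1", "model": "llama-3.3-70b-instruct", "choices": []}',
                        "data: [DONE]",
                    ]
                    chunks = list(parse_sse(sample))
                    ```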
                  type: boolean
                  nullable: true
                stream_options:
                  $ref: '#/components/schemas/ChatCompletionStreamOptions'
                temperature:
                  $ref: '#/components/schemas/Temperature'
                tools:
                  type: array
                  description: >
                    List of tools the model can call, such as functions.
                    A maximum of `128` tools can be provided.
                    See [How to use function calling](https://www.scaleway.com/en/docs/generative-apis/how-to/use-function-calling/)
                    for code snippets using the `openai` Python client.
                  items:
                    $ref: '#/components/schemas/ChatCompletionTool'
                tool_choice:
                  $ref: '#/components/schemas/ChatCompletionToolChoiceOption'
                top_logprobs:
                  description: |-
                    Number of most likely tokens to return for each token generated, along with their
                    generation log probability.
                    Value must be between `0` and `20`.
                    `logprobs` must be set to `true` to use this parameter.
                  type: integer
                  minimum: 0
                  maximum: 20
                  nullable: true
                top_p:
                  $ref: '#/components/schemas/TopP'
              required:
              - model
              - messages
              x-properties-order:
              - model
              - messages
              - max_completion_tokens
              - max_tokens
              - frequency_penalty
              - logit_bias
              - logprobs
              - n
              - parallel_tool_calls
              - presence_penalty
              - reasoning_effort
              - response_format
              - seed
              - stop
              - stream
              - stream_options
              - temperature
              - tools
              - tool_choice
              - top_logprobs
              - top_p
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CreateChatCompletionResponse'
      x-codeSamples:
        - lang: cURL
          source: |
            curl https://api.scaleway.ai/v1/chat/completions \
              -H "Content-Type: application/json" \
              -H "Authorization: Bearer $SCW_SECRET_KEY" \
              -d '{
                "model": "llama-3.3-70b-instruct",
                "messages": [
                  {
                    "role": "system",
                    "content": "You are a helpful assistant."
                  },
                  {
                    "role": "user",
                    "content": "Hello!"
                  }
                ]
              }'
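        # Illustrative sketch using the `openai` Python client referenced in the
        # docs; client configuration (base_url, env var name) mirrors the cURL
        # sample above and is an assumption.
        - lang: Python
          source: |
            import os

            from openai import OpenAI

            # Point the OpenAI-compatible client at the Scaleway endpoint.
            client = OpenAI(
                base_url="https://api.scaleway.ai/v1",
                api_key=os.environ["SCW_SECRET_KEY"],
            )
            response = client.chat.completions.create(
                model="llama-3.3-70b-instruct",
                messages=[
                    {"role": "system", "content": "You are a helpful assistant."},
                    {"role": "user", "content": "Hello!"},
                ],
            )
            print(response.choices[0].message.content)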
  /{project_id}/v1/embeddings:
    post:
      operationId: createEmbedding
      tags:
        - Embeddings
      summary: Create an embedding
      description: Generate an embedding.
      parameters:
        - in: path
          name: project_id
          required: true
          schema:
            $ref: '#/components/schemas/ProjectId'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                input:
                  description: |-
                    String or array of strings to represent as embedding vectors.
                    Maximum array items: `2048`
                  oneOf:
                    - type: string
                    - type: array
                      items:
                        type: string
                      maxItems: 2048
                  example: This is a test.
                model:
                  description: >
                    Unique identifier of the model, such as `bge-multilingual-gemma2`.

                    Refer to our [supported models](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/) list
                    or [/models](#path-models-list-models) endpoint for available models.
                  type: string
                  example: bge-multilingual-gemma2
                encoding_format:
                  description:
                    Format of the embedding representation.
                  type: string
                  enum:
                    - float
                    - base64
                  example: float
                  default: float
                dimensions:
                  description: >
                    Number of `dimensions` to use for the embedding vector representation.
                    Currently, the only supported value is that of the [maximum dimensions of a model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/#embedding-models).
                    Lower values are not supported and vectors should not be trimmed,
                    since available models do not support [matryoshka embeddings](https://huggingface.co/blog/matryoshka).
                  type: integer
                  example: 3584
              required:
                - input
                - model
              x-properties-order:
                - input
                - model
                - encoding_format
                - dimensions
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CreateEmbeddingResponse'
      x-codeSamples:
        - lang: cURL
          source: |-
            curl https://api.scaleway.ai/v1/embeddings \
              -H "Authorization: Bearer $SCW_SECRET_KEY" \
              -H "Content-Type: application/json" \
              -d '{
                "input": "Here is a text to embed as a vector",
                "model": "bge-multilingual-gemma2"
              }'
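        # Illustrative sketch using the `openai` Python client; configuration
        # mirrors the cURL sample above and is an assumption.
        - lang: Python
          source: |
            import os

            from openai import OpenAI

            client = OpenAI(
                base_url="https://api.scaleway.ai/v1",
                api_key=os.environ["SCW_SECRET_KEY"],
            )
            response = client.embeddings.create(
                model="bge-multilingual-gemma2",
                input="Here is a text to embed as a vector",
            )
            # The embedding is returned as a list of floats.
            print(response.data[0].embedding)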
  /{project_id}/v1/rerank:
    post:
      operationId: createRerank
      tags:
        - Rerank
      summary: Create a reranking
      description: Identify most relevant documents to answer a query.
      parameters:
        - in: path
          name: project_id
          required: true
          schema:
            $ref: '#/components/schemas/ProjectId'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                query:
                  description: |-
                    Query for which the most relevant documents should be retrieved.
                  type: string
                  example: This is an example query.
                documents:
                  description: |-
                    Array of document contents in string format.
                    Maximum array items: `1000`
                  type: array
                  items:
                    type: string
                model:
                  description: >
                    Unique identifier of the model, such as `qwen3-embedding-8b`.

                    Refer to our [supported models](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/) list
                    or [/models](#path-models-list-models) endpoint for available models.
                  type: string
                  example: qwen3-embedding-8b
                top_n:
                  description:
                    Number of most relevant documents to retrieve.
                  type: integer
                  example: 3
              required:
                - query
                - model
                - documents
              x-properties-order:
                - model
                - query
                - documents
                - top_n
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CreateRerankResponse'
      x-codeSamples:
        - lang: cURL
          source: |-
            curl https://api.scaleway.ai/v1/rerank \
              -H "Authorization: Bearer $SCW_SECRET_KEY" \
              -H "Content-Type: application/json" \
              -d '{
                "model": "qwen3-embedding-8b",
                "query": "What is the biggest area of water on earth?",
                "documents": [
                  "The Pacific is approximately 165 million km²",
                  "Oceans can be sorted by size: Pacific, Atlantic, Indian",
                  "The Atlantic is a very large ocean.",
                  "The deepest pool on earth is 96 000 m²"
                ],
                "top_n": 3
              }'
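        # Illustrative sketch using the `requests` library, since the `openai`
        # client does not expose a rerank endpoint; payload mirrors the cURL
        # sample above.
        - lang: Python
          source: |
            import os

            import requests

            response = requests.post(
                "https://api.scaleway.ai/v1/rerank",
                headers={"Authorization": f"Bearer {os.environ['SCW_SECRET_KEY']}"},
                json={
                    "model": "qwen3-embedding-8b",
                    "query": "What is the biggest area of water on earth?",
                    "documents": [
                        "The Pacific is approximately 165 million km²",
                        "Oceans can be sorted by size: Pacific, Atlantic, Indian",
                        "The Atlantic is a very large ocean.",
                        "The deepest pool on earth is 96 000 m²",
                    ],
                    "top_n": 3,
                },
            )
            # Each result references a document by index, with a relevance score.
            print(response.json())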
  /{project_id}/v1/audio/transcriptions:
    post:
      operationId: createAudioTranscription
      tags:
        - Audio
      summary: Create an audio transcription
      description: Generate an audio transcription.
      parameters:
        - in: path
          name: project_id
          required: true
          schema:
            $ref: '#/components/schemas/ProjectId'
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              properties:
                file:
                  description: |-
                    Audio file to transcribe, sent as form data. Currently, the supported formats are `wav`, `mp3`, `flac`, `mpga`, `oga` and `ogg`.

                    See [How to query audio models](https://www.scaleway.com/en/docs/generative-apis/how-to/query-audio-models/)
                    for code snippets using the `openai` Python client.
                  type: string
                  format: binary
                model:
                  description: >
                    Unique identifier of the model, such as `whisper-large-v3`.

                    Refer to our [supported models](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/) list
                    or [/models](#path-models-list-models) endpoint for available models.
                  type: string
                  example: whisper-large-v3
                language:
                  description: |-
                    Language of the audio input, following [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes) format such as `en` for English.
                    Refer to our [model catalog](https://www.scaleway.com/en/docs/managed-inference/reference-content/model-catalog/)
                    for supported languages.

                    **Default**: Language will be automatically detected if no value is provided.
                  type: string
                  example: en
                prompt:
                  description: |-
                    Additional context used to guide the model during transcription.
                    This field works very differently from prompts in `/chat/completions`.
                    Refer to [How to query audio models](https://www.scaleway.com/en/docs/generative-apis/how-to/query-audio-models/)
                    for more information.
                  type: string
                  example: The recording discusses Scaleway Generative APIs.
                response_format:
                  description: |-
                    Output format structure. Currently, the only supported value is `json`.
                  type: string
                  default: "json"
                  nullable: true
                stream:
                  description: |-
                    Defines whether the model's response can be streamed to the client
                    using server-sent events.

                    The response will be streamed in chunks over HTTP, where each chunk except
                    the last contains the following content:

                    `data: {"type": ..., "delta": ..., "logprobs":...}`

                    The last chunk will contain `data: [DONE]`.

                    Note that the object `{"type": ..., "delta": ..., "logprobs":...}` does not follow the same format as
                    the response to a non-streaming HTTP request.

                    See [How to query audio models using streaming](https://www.scaleway.com/en/docs/generative-apis/how-to/query-audio-models/#streaming)
                    for examples, and [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events)
                    for reference documentation about SSE format.

                    **Default value:** `false`
                  type: boolean
                  nullable: true
                temperature:
                  $ref: '#/components/schemas/Temperature'
              required:
              - file
              - model
              x-properties-order:
              - file
              - model
              - language
              - prompt
              - response_format
              - stream
              - temperature
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CreateAudioTranscriptionResponse'
      x-codeSamples:
        - lang: cURL
          source: |-
            curl https://api.scaleway.ai/v1/audio/transcriptions \
              -H "Authorization: Bearer $SCW_SECRET_KEY" \
              -H "Content-Type: multipart/form-data" \
              -F file="@path/to/audio.mp3" \
              -F model="whisper-large-v3"
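        # Illustrative sketch using the `openai` Python client; the file path is
        # a placeholder, and configuration mirrors the cURL sample above.
        - lang: Python
          source: |
            import os

            from openai import OpenAI

            client = OpenAI(
                base_url="https://api.scaleway.ai/v1",
                api_key=os.environ["SCW_SECRET_KEY"],
            )
            # The file is uploaded as multipart form data, as in the cURL sample.
            with open("path/to/audio.mp3", "rb") as audio_file:
                transcription = client.audio.transcriptions.create(
                    model="whisper-large-v3",
                    file=audio_file,
                )
            print(transcription.text)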
  /{project_id}/v1/batches:
    get:
      operationId: listBatch
      tags:
        - Batch
      summary: List batches
      description: List batches including their properties and status.
      parameters:
        - in: path
          name: project_id
          required: true
          schema:
            $ref: '#/components/schemas/ProjectId'
        - in: query
          name: after
          description: |-
            Pagination cursor whose value should be a batch UUID. When the response consists of
            multiple pages, provide the last batch UUID from the previous request to fetch the next page.
          schema:
            type: string
        - in: query
          name: limit
          description: |-
            Maximum number of batches to retrieve.
          schema:
            type: integer
            minimum: 0
            maximum: 100
            default: 20
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ListBatchResponse'
      x-codeSamples:
        - lang: cURL
          source: |-
            curl "https://api.scaleway.ai/v1/batches?limit=10" \
              -H "Authorization: Bearer $SCW_SECRET_KEY" \
              -H "Content-Type: application/json"
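        # Illustrative sketch using the `openai` Python client; pagination
        # behavior follows the `after`/`limit` query parameters described above.
        - lang: Python
          source: |
            import os

            from openai import OpenAI

            client = OpenAI(
                base_url="https://api.scaleway.ai/v1",
                api_key=os.environ["SCW_SECRET_KEY"],
            )
            # Paginate with `after` by passing the last batch UUID of the previous page.
            batches = client.batches.list(limit=10)
            for batch in batches.data:
                print(batch.id, batch.status)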
    post:
      operationId: createBatch
      tags:
        - Batch
      summary: Create a batch
      description: Process multiple requests asynchronously in batch.
      parameters:
        - in: path
          name: project_id
          required: true
          schema:
            $ref: '#/components/schemas/ProjectId'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                completion_window:
                  description: >
                    Time range during which the batch should be processed. Currently, only `24h` is supported.
                  type: string
                  example: 24h
                endpoint:
                  description: |-
                    Path to use to process requests in the batch. Currently `/v1/chat/completions`, `/v1/responses`,
                    `/v1/embeddings` and `/v1/audio/transcriptions` are supported.
                  type: string
                  example: /v1/chat/completions
                input_file_id:
                  description: |-
                    URL of the file in Scaleway Object Storage. The file should contain all requests to process in [JSONL format](https://jsonlines.org/).
                    Results will be stored within the same bucket and folder, and named `(unknown)-output.jsonl` and `(unknown)-error.jsonl`.

                    See [How to use batch processing](https://www.scaleway.com/en/docs/generative-apis/how-to/use-batch-processing/)
                    for code snippets using the `openai` Python client.
                  type: string
                  example: https://bucket-123.s3.fr-par.scw.cloud/folder-123/batch-123.jsonl
                output_expires_after:
                  description: |-
                    Expiration rules for the output and error files generated by the batch.
                  type: object
                  properties:
                    anchor:
                      type: string
                      description: |-
                        Reference timestamp after which the expiration duration applies. Supported value: `created_at`.
                      example: created_at
                    seconds:
                      type: integer
                      description: |-
                        Number of seconds after the anchor timestamp at which the files will expire. Value must be between `3600` (1 hour) and `31536000` (1 year).
              required:
              - completion_window
              - endpoint
              - input_file_id
              x-properties-order:
              - completion_window
              - endpoint
              - input_file_id
              - output_expires_after
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Batch'
      x-codeSamples:
        - lang: cURL
          source: |-
            curl https://api.scaleway.ai/v1/batches \
              -H "Authorization: Bearer $SCW_SECRET_KEY" \
              -H "Content-Type: application/json" \
              -d '{
                "input_file_id": "https://bucket-123.s3.fr-par.scw.cloud/folder-123/batch-123.jsonl",
                "endpoint": "/v1/chat/completions",
                "completion_window": "24h"
              }'
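        # Illustrative sketch using the `openai` Python client; note that here
        # `input_file_id` is an Object Storage URL, per the request schema above.
        - lang: Python
          source: |
            import os

            from openai import OpenAI

            client = OpenAI(
                base_url="https://api.scaleway.ai/v1",
                api_key=os.environ["SCW_SECRET_KEY"],
            )
            batch = client.batches.create(
                input_file_id="https://bucket-123.s3.fr-par.scw.cloud/folder-123/batch-123.jsonl",
                endpoint="/v1/chat/completions",
                completion_window="24h",
            )
            print(batch.id, batch.status)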
  /{project_id}/v1/batches/{batch_id}:
    get:
      operationId: getBatch
      tags:
        - Batch
      summary: Get a batch
      description: Retrieve a batch's properties and status.
      parameters:
        - in: path
          name: project_id
          required: true
          schema:
            $ref: '#/components/schemas/ProjectId'
        - in: path
          name: batch_id
          required: true
          description: >
            UUID of the batch.
          schema:
            type: string
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Batch'
      x-codeSamples:
        - lang: cURL
          source: |-
            curl https://api.scaleway.ai/v1/batches/{batch_id} \
              -H "Authorization: Bearer $SCW_SECRET_KEY" \
              -H "Content-Type: application/json"
  /{project_id}/v1/batches/{batch_id}/cancel:
    post:
      operationId: cancelBatch
      tags:
        - Batch
      summary: Cancel a batch
      description: |-
        When a batch is cancelled, results that have already been processed are stored
        in the corresponding output and error `.jsonl` files, while the remaining
        requests are not processed.
      parameters:
        - in: path
          name: project_id
          required: true
          schema:
            $ref: '#/components/schemas/ProjectId'
        - in: path
          name: batch_id
          required: true
          description: >
            UUID of the batch.
          schema:
            type: string
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Batch'
      x-codeSamples:
        - lang: cURL
          source: |-
            curl https://api.scaleway.ai/v1/batches/{batch_id}/cancel \
              -H "Authorization: Bearer $SCW_SECRET_KEY" \
              -H "Content-Type: application/json"
  /v1/models:
    get:
      operationId: listModels
      tags:
        - Models
      summary: List models
      description: List [models](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/) that are available via this API.
      responses:
        '200':
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ListModelsResponse'
      x-codeSamples:
        - lang: cURL
          source: |
            curl https://api.scaleway.ai/v1/models \
              -H "Authorization: Bearer $SCW_SECRET_KEY"
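        # Illustrative sketch using the `openai` Python client; configuration
        # mirrors the cURL sample above and is an assumption.
        - lang: Python
          source: |
            import os

            from openai import OpenAI

            client = OpenAI(
                base_url="https://api.scaleway.ai/v1",
                api_key=os.environ["SCW_SECRET_KEY"],
            )
            # Print the identifier of each available model.
            for model in client.models.list().data:
                print(model.id)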
