openapi: 3.1.0
info:
  title: Generative APIs - Dedicated Deployment API
  description: |-
    Scaleway Generative APIs - Dedicated Deployment allows you to deploy and run machine learning models on Scaleway's infrastructure. This service provides scalable and efficient endpoints for your model inference needs. The Scaleway Generative APIs - Dedicated Deployment API enables you to manage these endpoints and perform inference operations with any software compatible with the OpenAI API.


    <Message type="tip">
    To retrieve information about the different [models](#path-models-list-models) available for deployment on Scaleway Generative APIs - Dedicated Deployment, check out our [model documentation](https://www.scaleway.com/en/docs/generative-apis/reference-content/).
    </Message>


    ## Concepts

    Refer to our [dedicated concepts page](https://www.scaleway.com/en/docs/generative-apis/concepts/) to find the definitions of all concepts and terminology related to Generative APIs - Dedicated Deployment.




    ## Quickstart

    1. Configure your environment variables

        <Message type="note">
        This is an optional step that seeks to simplify your usage of the Generative APIs - Dedicated Deployment API. You can find your Project ID in the [Scaleway console](https://console.scaleway.com/project/settings).
        </Message>

        ```bash
        export SCW_SECRET_KEY="<API secret key>"
        export SCW_DEFAULT_REGION="fr-par"
        export SCW_PROJECT_ID="<Scaleway Project ID>"
        ```

    2. **List available models**: Run the following command to get a list of all the models available for deployment, with their details:

        ```bash
        curl -X GET \
          -H "Content-Type: application/json" \
          -H "X-Auth-Token: $SCW_SECRET_KEY" \
          "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/models"
        ```

    3. **Create a model deployment**: Run the following command to create a deployment. Customize the details in the payload (name, model, description, tags, etc.) to your needs:

        ```bash
        curl -X POST "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/deployments" \
        -H "Content-Type: application/json" \
        -H "X-Auth-Token: $SCW_SECRET_KEY" \
        -d '{
          "project_id": "'"$SCW_PROJECT_ID"'",
          "name": "my-inference-deployment",
          "model_id": "chosen-model-id",
          "node_type_name": "L4",
          "min_size": 1,
          "max_size": 1,
          "accept_eula": true,
          "endpoints": [
            {
              "public_network": {}
            }
          ]
        }'
        ```

        | Parameter        | Description                                                                                             | Valid values                                                                                                   |
        |------------------|---------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|
        | `project_id`     | The Project in which the deployment should be created (string)                                          | Any valid Scaleway Project ID, e.g., `"b4bd99e0-b389-11ed-afa1-0242ac120002"`                                  |
        | `name`           | A name of your choice for the deployment (string)                                                       | Any string containing only alphanumeric characters, dots, spaces, and dashes, e.g., `"my-inference-deployment"`|
        | `model_id`       | The model to deploy (string)                                                                            | Any valid model ID found in your model library (see models listing)                                            |
        | `node_type_name` | The type of node to use for the deployment (string)                                                     | Example: `"L4"`                                                                                                |
        | `min_size`       | Minimum number of replicas for the deployment (integer)                                                 | Any integer, e.g., `1`                                                                                         |
        | `max_size`       | Maximum number of replicas for the deployment (integer)                                                 | Must currently equal `min_size` (autoscaling is not yet supported), e.g., `1`                                  |
        | `accept_eula`    | Indicates acceptance of the End User License Agreement (boolean)                                        | `true`                                                                                                         |
        | `endpoints`      | Defines the endpoints for the deployment (array)                                                        | At least one endpoint, e.g., `[ { "public_network": {} } ]`                                                    |

    4. **Create a model endpoint**: Run the following command to create an inference endpoint for the deployment. Customize the details in the payload to your needs:

        Example for creating a public endpoint:

        ```bash
        curl -X POST "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/endpoints" \
        -H "Content-Type: application/json" \
        -H "X-Auth-Token: $SCW_SECRET_KEY" \
        -d '{
          "project_id": "'"$SCW_PROJECT_ID"'",
          "deployment_id": "your-deployment-id",
          "endpoint": {
            "disable_auth": false,
            "public_network": {}
          }
        }'
        ```

        Example for creating a private endpoint:

        ```bash
        curl -X POST "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/endpoints" \
        -H "Content-Type: application/json" \
        -H "X-Auth-Token: $SCW_SECRET_KEY" \
        -d '{
          "project_id": "'"$SCW_PROJECT_ID"'",
          "deployment_id": "your-deployment-id",
          "endpoint": {
            "disable_auth": false,
            "private_network": {
              "private_network_id": "your-private-network-id"
            }
          }
        }'
        ```

        | Parameter         | Description                                                                                               | Valid values                                                                                                   |
        |-------------------|-----------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|
        | `project_id`      | The Project in which the endpoint should be created (string)                                              | Any valid Scaleway Project ID, e.g., `"b4bd99e0-b389-11ed-afa1-0242ac120002"`                                  |
        | `deployment_id`   | The ID of the deployment to associate the endpoint with (string)                                          | Any valid deployment ID, e.g., `"bcb0976d-98d6-49c1-b6b5-17804941c0b7"`                                        |
        | `disable_auth`    | Specifies whether to disable authentication (boolean)                                                     | `true` or `false`                                                                                              |
        | `public_network`  | Public endpoint configuration (object)                                                                    | `{}` for a public endpoint                                                                                     |
        | `private_network` | Private endpoint configuration including the Private Network ID (object)                                  | `{ "private_network_id": "private-network-id" }`                                                               |

    5. **List your deployments**: Run the following command to get a list of all the deployments in your account, with their details:

        ```bash
        curl -X GET \
          -H "Content-Type: application/json" \
          -H "X-Auth-Token: $SCW_SECRET_KEY" \
          "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/deployments"
        ```

    6. **List your endpoints**: Run the following command to get a list of all the inference endpoints in your account, with their details:

        ```bash
        curl -X GET \
          -H "Content-Type: application/json" \
          -H "X-Auth-Token: $SCW_SECRET_KEY" \
          "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/endpoints"
        ```

    7. **Delete an endpoint**: Run the following command to delete an inference endpoint, specified by its endpoint ID:

        ```bash
        curl -X DELETE \
          -H "X-Auth-Token: $SCW_SECRET_KEY" \
          -H "Content-Type: application/json" \
          "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/endpoints/<endpoint-ID>"
        ```

        The expected successful response is empty.

        <Message type="important">
          Each Generative APIs - Dedicated Deployment deployment must have at least one endpoint, either public or private.
        </Message>
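
    The deployment payload from step 3 can also be assembled by a small shell function, which keeps `min_size` and `max_size` identical as the API currently requires. This is a minimal sketch, not an official tool: `build_deployment_payload` is a hypothetical helper, the field names follow the `CreateDeployment` request schema in this reference, and the values are placeholders. It does not escape special characters, so it assumes simple alphanumeric names.

    ```bash
    # Hypothetical helper: prints a CreateDeployment JSON payload.
    # Arguments: project ID, deployment name, model ID, node type name, pool size.
    # Assumes the arguments contain no characters that need JSON escaping.
    build_deployment_payload() {
      # The pool size is used for both min_size and max_size,
      # because autoscaling is not yet supported.
      printf '{"project_id":"%s","name":"%s","model_id":"%s","node_type_name":"%s","min_size":%d,"max_size":%d,"accept_eula":true,"endpoints":[{"public_network":{}}]}\n' \
        "$1" "$2" "$3" "$4" "$5" "$5"
    }

    # Placeholder values: replace the model ID with one returned by the models listing.
    payload=$(build_deployment_payload \
      "${SCW_PROJECT_ID:-your-project-id}" \
      "my-inference-deployment" \
      "6170692e-7363-616c-6577-61792e636f6d" \
      "L4" \
      1)
    echo "$payload"

    # To send the payload, uncomment the request below
    # (requires SCW_SECRET_KEY and SCW_DEFAULT_REGION to be set):
    # curl -X POST "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/deployments" \
    #   -H "Content-Type: application/json" \
    #   -H "X-Auth-Token: $SCW_SECRET_KEY" \
    #   -d "$payload"
    ```

    Building the payload separately from the request makes it easy to inspect the JSON before anything is created.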


    <Message type="requirement">
    - You have a [Scaleway account](https://console.scaleway.com/)
    - You have created an [API key](https://www.scaleway.com/en/docs/iam/how-to/create-api-keys/) and the API key has sufficient [IAM permissions](https://www.scaleway.com/en/docs/iam/reference-content/permission-sets/) to perform the actions described on this page
    - You have [installed `curl`](https://curl.se/download.html)
    </Message>


    ## Technical information

    ### Region

    Generative APIs - Dedicated Deployment endpoints are available in the following region:

    | Name  | API ID |
    |-------|--------|
    | Paris | `fr-par` |




    ### Pagination

    Most listing requests return a paginated response. Requests against paginated endpoints accept two `query` arguments:

    - `page`, a positive integer that selects which page to return
    - `page_size`, a positive integer less than or equal to 100 that sets the number of items to return per page. The default value is `50`.

    Paginated endpoints usually also accept filters to search and sort results. These filters are documented in each endpoint's documentation.

    The `X-Total-Count` header contains the total number of items available across all pages. List responses also report this value in the `total_count` field of the response body.
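
    As an illustration of how the total count and the page size relate, the sketch below computes how many requests are needed to read a full listing. Both `page_count` and `list_deployments_page` are hypothetical helpers introduced for this example. The query parameters `page` and `page_size` are the ones declared on the `ListDeployments` operation in this reference.

    ```bash
    # Hypothetical helper: number of pages needed to read `total` items.
    # Arguments: total item count, page size.
    page_count() {
      total=$1
      page_size=$2
      # Integer ceiling division: round up so a partial last page is counted.
      echo $(( (total + page_size - 1) / page_size ))
    }

    # Hypothetical helper: fetches one page of deployments.
    # Arguments: page number, page size.
    # Requires SCW_SECRET_KEY and SCW_DEFAULT_REGION to be set.
    list_deployments_page() {
      curl -s -H "X-Auth-Token: $SCW_SECRET_KEY" \
        "https://api.scaleway.com/inference/v1/regions/$SCW_DEFAULT_REGION/deployments?page=$1&page_size=$2"
    }

    # With 120 deployments and 50 items per page, three requests are needed.
    pages=$(page_count 120 50)
    echo "$pages"
    ```

    A script would typically read the total from the first response, then call `list_deployments_page` once per remaining page.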




    ### Creating a deployment: the model object

    When [creating a deployment](#path-deployments-create-a-deployment), the `model_id` parameter is required. This specifies the model to deploy. Use the [List Models](#path-models-list-models) endpoint to retrieve available model IDs.


    <Message type="note">
    This information is designed to help you correctly configure the `model_id` parameter when using the [Create a deployment](#path-deployments-create-a-deployment) method.
    </Message>
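
    Because the `model_id` field is documented as a UUID, a quick format check before sending the request can catch a placeholder or a model name pasted by mistake. The sketch below is only a client-side sanity check: `is_uuid` is a hypothetical helper, and it cannot confirm that the model actually exists.

    ```bash
    # Hypothetical helper: succeeds when its argument has the textual shape of a UUID.
    is_uuid() {
      printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
    }

    # Example value taken from the schema examples in this reference.
    model_id="6170692e-7363-616c-6577-61792e636f6d"
    if is_uuid "$model_id"; then
      echo "model_id looks like a UUID"
    else
      echo "model_id is not a UUID" >&2
    fi
    ```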


    ## Going further

    For more help using Scaleway Generative APIs - Dedicated Deployment, check out the following resources:
    - Our [main documentation](https://www.scaleway.com/en/docs/generative-apis/)
    - The #ai channel on our [Slack Community](https://www.scaleway.com/en/docs/tutorials/scaleway-slack-community/)
  version: v1
servers:
- url: https://api.scaleway.com
tags:
- name: Models
  description: |
    A model represents a pre-trained machine learning model that can be deployed on the Generative APIs - Dedicated Deployment service.

    Models define the inference model, its source, and its compatibility with the available node types.
    Some models may be available in multiple parameter sizes, which affects the performance and accuracy of the model.
- name: Deployments
  description: |
    A deployment is a scalable pool of resources used to run inference models.
- name: Node types
  description: |
    Nodes are the compute units that make up your inference deployments.
- name: Endpoints
  description: |
    An endpoint is the URL where the inference model can be accessed.

    Endpoints can be public or private, and can be protected by an IAM authentication token.
components:
  schemas:
    scaleway.inference.v1.Deployment:
      type: object
      properties:
        id:
          type: string
          description: Unique identifier. (UUID format)
          example: 6170692e-7363-616c-6577-61792e636f6d
        name:
          type: string
          description: Name of the deployment.
        project_id:
          type: string
          description: Project ID. (UUID format)
          example: 6170692e-7363-616c-6577-61792e636f6d
        status:
          type: string
          description: Status of the deployment.
          enum:
          - unknown_status
          - creating
          - deploying
          - ready
          - error
          - deleting
          - locked
          - scaling
          default: unknown_status
        tags:
          type: array
          description: List of tags applied to the deployment.
          items:
            type: string
        node_type_name:
          type: string
          description: Node type of the deployment.
        endpoints:
          type: array
          description: List of endpoints.
          items:
            $ref: '#/components/schemas/scaleway.inference.v1.Endpoint'
        size:
          type: integer
          description: Current size of the pool.
          format: uint32
        min_size:
          type: integer
          description: Defines the minimum size of the pool.
          format: uint32
        max_size:
          type: integer
          description: Defines the maximum size of the pool. Currently, autoscaling
            is not yet supported, and this value must be equal to `min_size`.
          format: uint32
        error_message:
          type: string
          description: Displays information if your deployment is in error state.
          nullable: true
        model_id:
          type: string
          description: ID of the model used for the deployment. (UUID format)
          example: 6170692e-7363-616c-6577-61792e636f6d
        quantization:
          type: object
          description: Quantization parameters for this deployment.
          properties:
            bits:
              type: integer
              description: The number of bits each model parameter should be quantized
                to. The quantization method is chosen based on this value.
              format: uint32
          required:
          - bits
          x-properties-order:
          - bits
        model_name:
          type: string
          description: Name of the deployed model.
        created_at:
          type: string
          description: Creation date of the deployment. (RFC 3339 format)
          format: date-time
          example: "2022-03-22T12:34:56.123456Z"
          nullable: true
        updated_at:
          type: string
          description: Last modification date of the deployment. (RFC 3339 format)
          format: date-time
          example: "2022-03-22T12:34:56.123456Z"
          nullable: true
        region:
          type: string
          description: Region of the deployment.
      x-properties-order:
      - id
      - name
      - project_id
      - status
      - tags
      - node_type_name
      - endpoints
      - size
      - min_size
      - max_size
      - error_message
      - model_id
      - quantization
      - model_name
      - created_at
      - updated_at
      - region
    scaleway.inference.v1.Endpoint:
      type: object
      properties:
        id:
          type: string
          description: Unique identifier. (UUID format)
          example: 6170692e-7363-616c-6577-61792e636f6d
        url:
          type: string
          description: |-
            URL of the endpoint.
            For private endpoints, the URL will be accessible only from the Private Network.
            In addition, private endpoints will expose a CA certificate that can be used to verify the server's identity.
            This CA certificate can be retrieved using the `GetDeploymentCertificate` API call.
        public_network:
          type: object
          description: Defines whether the endpoint is public.
          nullable: true
          x-one-of: details
        private_network:
          type: object
          description: Details of the Private Network.
          properties:
            private_network_id:
              type: string
              description: ID of the Private Network. (UUID format)
              example: 6170692e-7363-616c-6577-61792e636f6d
          nullable: true
          x-properties-order:
          - private_network_id
          x-one-of: details
        disable_auth:
          type: boolean
          description: Defines whether the authentication is disabled.
      x-properties-order:
      - id
      - url
      - public_network
      - private_network
      - disable_auth
    scaleway.inference.v1.EndpointSpec:
      type: object
      properties:
        public_network:
          type: object
          description: Set the endpoint as public.
          nullable: true
          x-one-of: details
        private_network:
          type: object
          description: |-
            Set the endpoint as private.
            Private endpoints are only accessible from the Private Network.
          properties:
            private_network_id:
              type: string
              description: ID of the Private Network. (UUID format)
              example: 6170692e-7363-616c-6577-61792e636f6d
          nullable: true
          x-properties-order:
          - private_network_id
          x-one-of: details
        disable_auth:
          type: boolean
          description: |-
            Disable the authentication on the endpoint.
            By default, deployments are protected by IAM authentication.
            When setting this field to true, the authentication will be disabled.
      x-properties-order:
      - public_network
      - private_network
      - disable_auth
    scaleway.inference.v1.Eula:
      type: object
      properties:
        content:
          type: string
          description: Content of the end user license agreement.
      x-properties-order:
      - content
    scaleway.inference.v1.ListDeploymentsResponse:
      type: object
      properties:
        deployments:
          type: array
          description: List of deployments on the current page.
          items:
            $ref: '#/components/schemas/scaleway.inference.v1.Deployment'
        total_count:
          type: integer
          description: Total number of deployments.
          format: uint64
      x-properties-order:
      - deployments
      - total_count
    scaleway.inference.v1.ListModelsResponse:
      type: object
      properties:
        models:
          type: array
          description: List of models on the current page.
          items:
            $ref: '#/components/schemas/scaleway.inference.v1.Model'
        total_count:
          type: integer
          description: Total number of models.
          format: uint64
      x-properties-order:
      - models
      - total_count
    scaleway.inference.v1.ListNodeTypesResponse:
      type: object
      properties:
        node_types:
          type: array
          description: List of node types.
          items:
            $ref: '#/components/schemas/scaleway.inference.v1.NodeType'
        total_count:
          type: integer
          description: Total number of node types.
          format: uint64
      x-properties-order:
      - node_types
      - total_count
    scaleway.inference.v1.Model:
      type: object
      properties:
        id:
          type: string
          description: Unique identifier. (UUID format)
          example: 6170692e-7363-616c-6577-61792e636f6d
        name:
          type: string
          description: Unique name identifier.
        project_id:
          type: string
          description: Project ID. (UUID format)
          example: 6170692e-7363-616c-6577-61792e636f6d
        tags:
          type: array
          description: List of tags applied to the model.
          items:
            type: string
        status:
          type: string
          description: Status of the model.
          enum:
          - unknown_status
          - preparing
          - downloading
          - ready
          - error
          default: unknown_status
        description:
          type: string
          description: Purpose of the model.
        error_message:
          type: string
          description: Displays information if your model is in error state.
          nullable: true
        has_eula:
          type: boolean
          description: Defines whether the model has an end user license agreement.
        created_at:
          type: string
          description: Creation date of the model. (RFC 3339 format)
          format: date-time
          example: "2022-03-22T12:34:56.123456Z"
          nullable: true
        updated_at:
          type: string
          description: Last modification date of the model. (RFC 3339 format)
          format: date-time
          example: "2022-03-22T12:34:56.123456Z"
          nullable: true
        region:
          type: string
          description: Region of the model.
        nodes_support:
          type: array
          description: Supported nodes types with quantization options and context
            lengths.
          items:
            $ref: '#/components/schemas/scaleway.inference.v1.ModelSupportInfo'
        parameter_size_bits:
          type: integer
          description: Size, in bits, of the model parameters.
          format: uint32
        size_bytes:
          type: integer
          description: Total size, in bytes, of the model files.
          format: uint64
      x-properties-order:
      - id
      - name
      - project_id
      - tags
      - status
      - description
      - error_message
      - has_eula
      - created_at
      - updated_at
      - region
      - nodes_support
      - parameter_size_bits
      - size_bytes
    scaleway.inference.v1.ModelSupportInfo:
      type: object
      properties:
        nodes:
          type: array
          description: List of supported node types.
          items:
            $ref: '#/components/schemas/scaleway.inference.v1.ModelSupportedNode'
      x-properties-order:
      - nodes
    scaleway.inference.v1.ModelSupportedNode:
      type: object
      properties:
        node_type_name:
          type: string
          description: Supported node type.
        quantizations:
          type: array
          description: Supported quantizations.
          items:
            $ref: '#/components/schemas/scaleway.inference.v1.ModelSupportedQuantization'
      x-properties-order:
      - node_type_name
      - quantizations
    scaleway.inference.v1.ModelSupportedQuantization:
      type: object
      properties:
        quantization_bits:
          type: integer
          description: Number of bits for this supported quantization.
          format: uint32
        allowed:
          type: boolean
          description: Tells whether this quantization is allowed for this node type.
        max_context_size:
          type: integer
          description: Maximum inference context size available for this node type
            and quantization.
          format: uint32
      x-properties-order:
      - quantization_bits
      - allowed
      - max_context_size
    scaleway.inference.v1.NodeType:
      type: object
      properties:
        name:
          type: string
          description: Name of the node type.
        stock_status:
          type: string
          description: Current stock status for the node type.
          enum:
          - unknown_stock
          - low_stock
          - out_of_stock
          - available
          default: unknown_stock
        description:
          type: string
          description: Current specs of the offer.
        vcpus:
          type: integer
          description: Number of virtual CPUs.
          format: uint32
        memory:
          type: integer
          description: Quantity of RAM. (in bytes)
          format: uint64
        vram:
          type: integer
          description: Quantity of GPU RAM. (in bytes)
          format: uint64
        disabled:
          type: boolean
          description: The node type is currently disabled.
        beta:
          type: boolean
          description: The node type is currently in beta.
        created_at:
          type: string
          description: Creation date of the node type. (RFC 3339 format)
          format: date-time
          example: "2022-03-22T12:34:56.123456Z"
          nullable: true
        updated_at:
          type: string
          description: Last modification date of the node type. (RFC 3339 format)
          format: date-time
          example: "2022-03-22T12:34:56.123456Z"
          nullable: true
        gpus:
          type: integer
          description: Number of GPUs.
          format: uint32
        region:
          type: string
          description: Region of the node type.
      x-properties-order:
      - name
      - stock_status
      - description
      - vcpus
      - memory
      - vram
      - disabled
      - beta
      - created_at
      - updated_at
      - gpus
      - region
    scaleway.inference.v1.VerifyModelResponse:
      type: object
      properties:
        nodes:
          type: array
          items:
            $ref: '#/components/schemas/scaleway.inference.v1.ModelSupportedNode'
        size_bytes:
          type: integer
          format: uint64
      x-properties-order:
      - nodes
      - size_bytes
    scaleway.std.File:
      type: object
      properties:
        name:
          type: string
        content_type:
          type: string
        content:
          type: string
      x-properties-order:
      - name
      - content_type
      - content
  securitySchemes:
    scaleway:
      in: header
      name: X-Auth-Token
      type: apiKey
paths:
  /inference/v1/regions/{region}/deployments:
    get:
      tags:
      - Deployments
      operationId: ListDeployments
      summary: List inference deployments
      description: List all your inference deployments.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: query
        name: page
        description: Page number to return.
        schema:
          type: integer
          format: int32
      - in: query
        name: page_size
        description: Maximum number of deployments to return per page.
        schema:
          type: integer
          format: uint32
      - in: query
        name: order_by
        description: Order in which to return results.
        schema:
          type: string
          enum:
          - created_at_desc
          - created_at_asc
          - name_asc
          - name_desc
          default: created_at_desc
      - in: query
        name: project_id
        description: Filter by Project ID. (UUID format)
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      - in: query
        name: organization_id
        description: Filter by Organization ID. (UUID format)
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      - in: query
        name: name
        description: Filter by deployment name.
        schema:
          type: string
      - in: query
        name: tags
        description: Filter by tags.
        schema:
          type: array
          items:
            type: string
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.ListDeploymentsResponse'
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X GET \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/deployments"
      - lang: HTTPie
        source: |-
          http GET "https://api.scaleway.com/inference/v1/regions/{region}/deployments" \
            X-Auth-Token:$SCW_SECRET_KEY
    post:
      tags:
      - Deployments
      operationId: CreateDeployment
      summary: Create a deployment
      description: Create a new inference deployment related to a specific model.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.Deployment'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                name:
                  type: string
                  description: Name of the deployment.
                project_id:
                  type: string
                  description: ID of the Project to create the deployment in. (UUID
                    format)
                  example: 6170692e-7363-616c-6577-61792e636f6d
                model_id:
                  type: string
                  description: ID of the model to use. (UUID format)
                  example: 6170692e-7363-616c-6577-61792e636f6d
                accept_eula:
                  type: boolean
                  description: |-
                    Accept the model's End User License Agreement (EULA).
                    If the model has an EULA, you must accept it before proceeding.
                    The terms of the EULA can be retrieved using the `GetModelEula` API call.
                  nullable: true
                node_type_name:
                  type: string
                  description: Name of the node type to use.
                tags:
                  type: array
                  description: List of tags to apply to the deployment.
                  items:
                    type: string
                min_size:
                  type: integer
                  description: Defines the minimum size of the pool.
                  format: uint32
                  nullable: true
                max_size:
                  type: integer
                  description: Defines the maximum size of the pool. Currently, autoscaling
                    is not yet supported, and this value must be equal to `min_size`.
                  format: uint32
                  nullable: true
                endpoints:
                  type: array
                  description: List of endpoints to create.
                  items:
                    $ref: '#/components/schemas/scaleway.inference.v1.EndpointSpec'
                quantization:
                  type: object
                  description: Quantization settings to apply to this deployment.
                  properties:
                    bits:
                      type: integer
                      description: The number of bits each model parameter should
                        be quantized to. The quantization method is chosen based on
                        this value.
                      format: uint32
                  required:
                  - bits
                  x-properties-order:
                  - bits
              required:
              - name
              - project_id
              - model_id
              - node_type_name
              - endpoints
              x-properties-order:
              - name
              - project_id
              - model_id
              - accept_eula
              - node_type_name
              - tags
              - min_size
              - max_size
              - endpoints
              - quantization
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X POST \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            -H "Content-Type: application/json" \
            -d '{
              "endpoints": [
                  {"public_network": {}, "disable_auth": false}
              ],
              "model_id": "6170692e-7363-616c-6577-61792e636f6d",
              "name": "string",
              "node_type_name": "string",
              "project_id": "6170692e-7363-616c-6577-61792e636f6d"
            }' \
            "https://api.scaleway.com/inference/v1/regions/{region}/deployments"
      - lang: HTTPie
        source: |-
          http POST "https://api.scaleway.com/inference/v1/regions/{region}/deployments" \
            X-Auth-Token:$SCW_SECRET_KEY \
            endpoints:='[
              {"public_network": {}, "disable_auth": false}
            ]' \
            model_id="6170692e-7363-616c-6577-61792e636f6d" \
            name="string" \
            node_type_name="string" \
            project_id="6170692e-7363-616c-6577-61792e636f6d"
  /inference/v1/regions/{region}/deployments/{deployment_id}:
    get:
      tags:
      - Deployments
      operationId: GetDeployment
      summary: Get a deployment
      description: Get the deployment for the given ID.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: path
        name: deployment_id
        description: ID of the deployment to get. (UUID format)
        required: true
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.Deployment'
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X GET \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/deployments/{deployment_id}"
      - lang: HTTPie
        source: |-
          http GET "https://api.scaleway.com/inference/v1/regions/{region}/deployments/{deployment_id}" \
            X-Auth-Token:$SCW_SECRET_KEY
    patch:
      tags:
      - Deployments
      operationId: UpdateDeployment
      summary: Update a deployment
      description: Update an existing inference deployment.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: path
        name: deployment_id
        description: ID of the deployment to update. (UUID format)
        required: true
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.Deployment'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                name:
                  type: string
                  description: Name of the deployment.
                  nullable: true
                tags:
                  type: array
                  description: List of tags to apply to the deployment.
                  nullable: true
                  items:
                    type: string
                min_size:
                  type: integer
                  description: Defines the new minimum size of the pool.
                  format: uint32
                  nullable: true
                max_size:
                  type: integer
                  description: Defines the maximum size of the pool. Autoscaling is not
                    yet supported, so this value must be equal to `min_size`.
                  format: uint32
                  nullable: true
                model_id:
                  type: string
                  description: ID of the model to assign to the deployment.
                  nullable: true
                quantization:
                  type: object
                  description: Quantization settings to apply to the deployment.
                  properties:
                    bits:
                      type: integer
                      description: The number of bits each model parameter should
                        be quantized to. The quantization method is chosen based on
                        this value.
                      format: uint32
                  required:
                  - bits
                  x-properties-order:
                  - bits
              x-properties-order:
              - name
              - tags
              - min_size
              - max_size
              - model_id
              - quantization
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X PATCH \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            -H "Content-Type: application/json" \
            -d '{}' \
            "https://api.scaleway.com/inference/v1/regions/{region}/deployments/{deployment_id}"
      - lang: HTTPie
        source: |-
          http PATCH "https://api.scaleway.com/inference/v1/regions/{region}/deployments/{deployment_id}" \
            X-Auth-Token:$SCW_SECRET_KEY
    delete:
      tags:
      - Deployments
      operationId: DeleteDeployment
      summary: Delete a deployment
      description: Delete an existing inference deployment.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: path
        name: deployment_id
        description: ID of the deployment to delete. (UUID format)
        required: true
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.Deployment'
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X DELETE \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/deployments/{deployment_id}"
      - lang: HTTPie
        source: |-
          http DELETE "https://api.scaleway.com/inference/v1/regions/{region}/deployments/{deployment_id}" \
            X-Auth-Token:$SCW_SECRET_KEY
  /inference/v1/regions/{region}/deployments/{deployment_id}/certificate:
    get:
      tags:
      - Deployments
      operationId: GetDeploymentCertificate
      summary: Get the CA certificate
      description: |-
        Get the CA certificate used for the deployment of private endpoints.
        The CA certificate will be returned as a PEM file.
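
        A minimal sketch for saving the certificate locally. It assumes that the response carries the PEM data base64-encoded in a `content` field (an assumption about the `scaleway.std.File` schema, to be checked against the schema definition) and that the local `base64` tool accepts `-d`. The `decode_content` helper and the `DEPLOYMENT_ID` variable are hypothetical:

        ```bash
        # Hypothetical helper: decodes a base64 string (assumed to be the `content`
        # field of the response) and prints the decoded bytes to stdout.
        decode_content() {
          printf '%s' "$1" | base64 -d
        }

        # Usage, assuming jq is installed and DEPLOYMENT_ID is set:
        # content=$(curl -s -H "X-Auth-Token: $SCW_SECRET_KEY" \
        #   "https://api.scaleway.com/inference/v1/regions/fr-par/deployments/$DEPLOYMENT_ID/certificate" \
        #   | jq -r '.content')
        # decode_content "$content" > ca.pem
        ```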
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: path
        name: deployment_id
        description: ID of the deployment to get the CA certificate for. (UUID format)
        required: true
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.std.File'
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X GET \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/deployments/{deployment_id}/certificate"
      - lang: HTTPie
        source: |-
          http GET "https://api.scaleway.com/inference/v1/regions/{region}/deployments/{deployment_id}/certificate" \
            X-Auth-Token:$SCW_SECRET_KEY
  /inference/v1/regions/{region}/endpoints:
    post:
      tags:
      - Endpoints
      operationId: CreateEndpoint
      summary: Create an endpoint
      description: Create a new endpoint for a specific deployment.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.Endpoint'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                deployment_id:
                  type: string
                  description: ID of the deployment to create the endpoint for. (UUID
                    format)
                  example: 6170692e-7363-616c-6577-61792e636f6d
                endpoint:
                  type: object
                  description: Specification of the endpoint.
                  properties:
                    public_network:
                      type: object
                      description: Set the endpoint as public.
                      nullable: true
                      x-one-of: details
                    private_network:
                      type: object
                      description: |-
                        Set the endpoint as private.
                        Private endpoints are only accessible from the Private Network.
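
                        `public_network` and `private_network` share the same `x-one-of` group, which suggests that a request should set only one of them. A minimal sketch of building such a request body, where `endpoint_payload` is a hypothetical helper and not part of the API:

                        ```bash
                        # Hypothetical helper: prints a CreateEndpoint request body.
                        # $1 is the deployment ID; $2 is an optional Private Network ID.
                        # With $2 set, the endpoint is private; otherwise it is public.
                        endpoint_payload() {
                          if [ -n "${2:-}" ]; then
                            details="\"private_network\":{\"private_network_id\":\"$2\"}"
                          else
                            details='"public_network":{}'
                          fi
                          printf '{"deployment_id":"%s","endpoint":{%s,"disable_auth":false}}\n' "$1" "$details"
                        }
                        ```

                        The printed body can then be passed to `curl` with `-d "$(endpoint_payload "$DEPLOYMENT_ID")"`, where `DEPLOYMENT_ID` is an assumed variable.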
                      properties:
                        private_network_id:
                          type: string
                          description: ID of the Private Network to attach the endpoint to. (UUID format)
                          example: 6170692e-7363-616c-6577-61792e636f6d
                      nullable: true
                      x-properties-order:
                      - private_network_id
                      x-one-of: details
                    disable_auth:
                      type: boolean
                      description: |-
                        Disable the authentication on the endpoint.
                        By default, deployments are protected by IAM authentication.
                        When setting this field to true, the authentication will be disabled.
                  x-properties-order:
                  - public_network
                  - private_network
                  - disable_auth
              required:
              - deployment_id
              - endpoint
              x-properties-order:
              - deployment_id
              - endpoint
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X POST \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            -H "Content-Type: application/json" \
            -d '{"deployment_id":"6170692e-7363-616c-6577-61792e636f6d","endpoint":{"disable_auth":false,"private_network":{"private_network_id":"6170692e-7363-616c-6577-61792e636f6d"}}}' \
            "https://api.scaleway.com/inference/v1/regions/{region}/endpoints"
      - lang: HTTPie
        source: |-
          http POST "https://api.scaleway.com/inference/v1/regions/{region}/endpoints" \
            X-Auth-Token:$SCW_SECRET_KEY \
            deployment_id="6170692e-7363-616c-6577-61792e636f6d" \
            endpoint:='{"disable_auth":false,"private_network":{"private_network_id":"6170692e-7363-616c-6577-61792e636f6d"}}'
  /inference/v1/regions/{region}/endpoints/{endpoint_id}:
    patch:
      tags:
      - Endpoints
      operationId: UpdateEndpoint
      summary: Update an endpoint
      description: Update an existing endpoint.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: path
        name: endpoint_id
        description: ID of the endpoint to update. (UUID format)
        required: true
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.Endpoint'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                disable_auth:
                  type: boolean
                  description: |-
                    Disable the authentication on the endpoint.
                    By default, deployments are protected by IAM authentication.
                    When setting this field to true, the authentication will be disabled.
                  nullable: true
              x-properties-order:
              - disable_auth
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X PATCH \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            -H "Content-Type: application/json" \
            -d '{}' \
            "https://api.scaleway.com/inference/v1/regions/{region}/endpoints/{endpoint_id}"
      - lang: HTTPie
        source: |-
          http PATCH "https://api.scaleway.com/inference/v1/regions/{region}/endpoints/{endpoint_id}" \
            X-Auth-Token:$SCW_SECRET_KEY
    delete:
      tags:
      - Endpoints
      operationId: DeleteEndpoint
      summary: Delete an endpoint
      description: Delete an existing endpoint.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: path
        name: endpoint_id
        description: ID of the endpoint to delete. (UUID format)
        required: true
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      responses:
        "204":
          description: ""
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X DELETE \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/endpoints/{endpoint_id}"
      - lang: HTTPie
        source: |-
          http DELETE "https://api.scaleway.com/inference/v1/regions/{region}/endpoints/{endpoint_id}" \
            X-Auth-Token:$SCW_SECRET_KEY
  /inference/v1/regions/{region}/models:
    get:
      tags:
      - Models
      operationId: ListModels
      summary: List models
      description: List all available models.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: query
        name: order_by
        description: Order in which to return results.
        schema:
          type: string
          enum:
          - display_rank_asc
          - created_at_asc
          - created_at_desc
          - name_asc
          - name_desc
          default: display_rank_asc
      - in: query
        name: page
        description: Page number to return.
        schema:
          type: integer
          format: int32
      - in: query
        name: page_size
        description: Maximum number of models to return per page.
        schema:
          type: integer
          format: uint32
      - in: query
        name: project_id
        description: Filter by Project ID. (UUID format)
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      - in: query
        name: organization_id
        description: Filter by Organization ID. (UUID format)
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      - in: query
        name: name
        description: Filter by model name.
        schema:
          type: string
      - in: query
        name: tags
        description: Filter by tags.
        schema:
          type: array
          items:
            type: string
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.ListModelsResponse'
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X GET \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/models"
      - lang: HTTPie
        source: |-
          http GET "https://api.scaleway.com/inference/v1/regions/{region}/models" \
            X-Auth-Token:$SCW_SECRET_KEY
    post:
      tags:
      - Models
      operationId: CreateModel
      summary: Import a model
      description: Import a new model to your model library.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.Model'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                name:
                  type: string
                  description: Name of the model.
                project_id:
                  type: string
                  description: ID of the Project to import the model into. (UUID format)
                  example: 6170692e-7363-616c-6577-61792e636f6d
                source:
                  type: object
                  description: Where to import the model from.
                  properties:
                    url:
                      type: string
                    secret:
                      type: string
                      nullable: true
                      x-one-of: credentials
                  x-properties-order:
                  - url
                  - secret
              required:
              - name
              - project_id
              - source
              x-properties-order:
              - name
              - project_id
              - source
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X POST \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            -H "Content-Type: application/json" \
            -d '{
              "name": "string",
              "project_id": "6170692e-7363-616c-6577-61792e636f6d",
              "source": {
                  "secret": "string",
                  "url": "string"
              }
            }' \
            "https://api.scaleway.com/inference/v1/regions/{region}/models"
      - lang: HTTPie
        source: |-
          http POST "https://api.scaleway.com/inference/v1/regions/{region}/models" \
            X-Auth-Token:$SCW_SECRET_KEY \
            name="string" \
            project_id="6170692e-7363-616c-6577-61792e636f6d" \
            source:='{
              "secret": "string",
              "url": "string"
            }'
  /inference/v1/regions/{region}/models/{model_id}:
    get:
      tags:
      - Models
      operationId: GetModel
      summary: Get a model
      description: Get the model for the given ID.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: path
        name: model_id
        description: ID of the model to get. (UUID format)
        required: true
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.Model'
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X GET \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/models/{model_id}"
      - lang: HTTPie
        source: |-
          http GET "https://api.scaleway.com/inference/v1/regions/{region}/models/{model_id}" \
            X-Auth-Token:$SCW_SECRET_KEY
    delete:
      tags:
      - Models
      operationId: DeleteModel
      summary: Delete a model
      description: Delete an existing model from your model library.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: path
        name: model_id
        description: ID of the model to delete. (UUID format)
        required: true
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      responses:
        "204":
          description: ""
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X DELETE \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/models/{model_id}"
      - lang: HTTPie
        source: |-
          http DELETE "https://api.scaleway.com/inference/v1/regions/{region}/models/{model_id}" \
            X-Auth-Token:$SCW_SECRET_KEY
  /inference/v1/regions/{region}/models/{model_id}/eula:
    get:
      tags:
      - Models
      operationId: GetModelEula
      summary: Get a model EULA
      description: Get the EULA for the given model ID.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: path
        name: model_id
        description: ID of the model to get the EULA for. (UUID format)
        required: true
        schema:
          type: string
          example: 6170692e-7363-616c-6577-61792e636f6d
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.Eula'
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X GET \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/models/{model_id}/eula"
      - lang: HTTPie
        source: |-
          http GET "https://api.scaleway.com/inference/v1/regions/{region}/models/{model_id}/eula" \
            X-Auth-Token:$SCW_SECRET_KEY
  /inference/v1/regions/{region}/node-types:
    get:
      tags:
      - Node types
      operationId: ListNodeTypes
      summary: List available node types
      description: List all available node types. By default, the node types returned
        in the list are ordered by creation date in ascending order.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      - in: query
        name: page
        description: Page number to return.
        schema:
          type: integer
          format: int32
      - in: query
        name: page_size
        description: Maximum number of node types to return per page.
        schema:
          type: integer
          format: uint32
      - in: query
        name: include_disabled_types
        description: Include disabled node types in the response.
        required: true
        schema:
          type: boolean
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.ListNodeTypesResponse'
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X GET \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            "https://api.scaleway.com/inference/v1/regions/{region}/node-types?include_disabled_types=false"
      - lang: HTTPie
        source: |-
          http GET "https://api.scaleway.com/inference/v1/regions/{region}/node-types" \
            X-Auth-Token:$SCW_SECRET_KEY \
            include_disabled_types==false
  /inference/v1/regions/{region}/verify-model:
    post:
      tags:
      - Models
      operationId: VerifyModel
      summary: Verify a model
      description: Verify that a model can be deployed on Generative APIs - Dedicated Deployment.
      parameters:
      - in: path
        name: region
        description: The region you want to target
        required: true
        schema:
          type: string
          enum:
          - fr-par
      responses:
        "200":
          description: ""
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/scaleway.inference.v1.VerifyModelResponse'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                source:
                  type: object
                  description: Where to verify the model from.
                  properties:
                    url:
                      type: string
                    secret:
                      type: string
                      nullable: true
                      x-one-of: credentials
                  x-properties-order:
                  - url
                  - secret
              required:
              - source
              x-properties-order:
              - source
      security:
      - scaleway: []
      x-codeSamples:
      - lang: cURL
        source: |-
          curl -X POST \
            -H "X-Auth-Token: $SCW_SECRET_KEY" \
            -H "Content-Type: application/json" \
            -d '{"source":{"secret":"string","url":"string"}}' \
            "https://api.scaleway.com/inference/v1/regions/{region}/verify-model"
      - lang: HTTPie
        source: |-
          http POST "https://api.scaleway.com/inference/v1/regions/{region}/verify-model" \
            X-Auth-Token:$SCW_SECRET_KEY \
            source:='{"secret":"string","url":"string"}'
