ResponseAPIInputContentPart
type: Type of content. (Required).
text: Text content.
detail: Detail level of the image provided to the model.
image_url: URL of a remote image, or base64 encoding of a local image.
See How to query vision models for code snippets using the openai Python client, and guidance for encoding local images.
file_data: Content of a file.
file_url: URL of a remote file.
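A local image can be passed by base64-encoding its bytes into the image_url field. The sketch below builds such a content part; the data-URL prefix is a common convention and the helper name is hypothetical — see How to query vision models for the authoritative encoding guidance.

```python
import base64

# Hypothetical helper: build an input_image content part from raw image
# bytes, using the type / detail / image_url fields described above.
# The "data:image/png;base64," prefix is an assumption about the
# expected data-URL format.
def image_part(image_bytes: bytes, detail: str = "auto") -> dict:
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "input_image",
        "detail": detail,
        "image_url": f"data:image/png;base64,{b64}",
    }

part = image_part(b"\x89PNG\r\n\x1a\n", detail="low")
```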
ResponseAPIInputList
role: Role providing the content input.
content: List of input contents of different types, each compatible with different fields:
input_text: requires the text field.
input_image: requires the detail and image_url fields.
input_file: requires the file_data or file_url field. Optionally, filename can be provided.
status: Status of the response.
type: Type of the content input. Always set to message.
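An input message combining the part types above can be sketched as a plain structure (the file URL and text are illustrative; the surrounding request call is omitted):

```python
# Sketch of a ResponseAPIInputList message mixing input_text and
# input_file content parts, following the fields in this reference.
message = {
    "role": "user",
    "type": "message",  # always set to message
    "content": [
        {"type": "input_text", "text": "Summarize this file."},
        {
            "type": "input_file",
            "file_url": "https://example.com/report.pdf",  # illustrative
            "filename": "report.pdf",  # optional
        },
    ],
}
```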
ResponseAPITool
type: Type of tool object, always set to function.
name: Name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores and dashes, with a maximum length of 64 characters.
description: Description of the function. This helps the model choose the right function when needed.
parameters: Parameters of the function, described as a JSON schema object. See How to use function calling for examples, and the JSON schema reference for documentation about the format.
Omitting parameters defines a function with an empty parameter list.
strict: Defines whether to enforce strict schema adherence when generating a function call. If set to true, the model will follow the exact schema defined in the parameters field. Currently, this parameter is ignored even when set to true, and behaves as if set to false. We recommend checking the output schema before calling any functions or tools.
Default: false
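Since strict is currently ignored, the recommendation above amounts to verifying arguments against the schema yourself before invoking the function. The sketch below shows a hypothetical weather tool in this shape, with a deliberately minimal manual check (not a full JSON Schema validator):

```python
# Hypothetical tool definition in the ResponseAPITool shape.
get_weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "timezone": {"type": "string"},
        },
        "required": ["city"],
    },
    "strict": False,  # currently ignored by the API; shown for clarity
}

def arguments_match_schema(args: dict, schema: dict) -> bool:
    """Minimal check: all required keys present, no unknown keys."""
    props = schema.get("properties", {})
    required = schema.get("required", [])
    return all(k in args for k in required) and all(k in props for k in args)

ok = arguments_match_schema({"city": "Paris"}, get_weather_tool["parameters"])
```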
ResponseAPIOutputContentPart
type: Type of content. Always set to output_text.
text: Text content.
annotations: Annotations of the text output, such as citations or the path to a file.
ResponseAPIOutputList
role: Role generating the content output. Always set to assistant.
type: Type of the output. Each output type returns different fields:
message: outputs the content field
function_call: outputs the call_id, name and arguments fields
id: UUID of the message within a response.
status: Status of the response.
content: List of text output contents.
call_id: UUID of the function tool call.
name: Name of the function to execute.
arguments: Arguments to pass to the function, formatted as a JSON string.
Example: {"city": "Paris", "timezone": "UTC+2"}
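Because arguments arrive as a JSON string rather than an object, they must be decoded before the function is invoked. A minimal sketch, using an illustrative function_call output item:

```python
import json

# Illustrative function_call output item; the call_id value is made up.
output_item = {
    "type": "function_call",
    "call_id": "call_123",
    "name": "get_weather",
    "arguments": '{"city": "Paris","timezone": "UTC+2"}',
}

# Decode the JSON string into a dict before dispatching to the function.
args = json.loads(output_item["arguments"])
```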
ResponseAPIUsage
input_tokens: Number of input tokens.
Breakdown of input tokens by type.
output_tokens: Number of output tokens.
Breakdown of output tokens by type.
total_tokens: Total number of tokens (input and output).
ChatCompletionMessageToolCall
id: UUID of the tool call.
type: Type of tool call, always set to function.
function: Function to call, identified by the model.
ChatCompletionMessageToolCalls
List of tool calls required by the model, such as function calls.
id: UUID of the tool call.
type: Type of tool call, always set to function.
function: Function to call, identified by the model.
ChatCompletionRequestMessageContentPart
type: Type of content. image_url and input_audio are only supported with the user role.
text: Text content. Required if type is set to text.
ChatCompletionRequestMessage
role: Role of the message's author.
content: Content of a message as a string. Required for all roles, except assistant if tool_calls is specified instead.
Content of a message as an array of content parts. Required for all roles, except assistant if tool_calls is specified instead.
tool_calls: List of tool calls required by the model. Can only be used with assistant if content is not specified.
tool_call_id: UUID of the tool call. Must only be used with the tool role.
ChatCompletionResponseMessage
role: Role of the message's author, always set to assistant in the response.
content: Content of the message.
reasoning_content: Reasoning content generated for this message.
tool_calls: List of tool calls required by the model, such as function calls.
ChatCompletionStreamOptions
include_usage: Defines whether a usage field is included in the stream. If set, an additional chunk is streamed before the data: [DONE] message. The usage field on this chunk shows the token usage statistics for the complete stream.
ChatCompletionTokenLogprob
token: Token generated.
logprob: Log probability of generating this token, if it is among the top 20 most likely tokens. Otherwise, the value -9999.0 is used to indicate that the token is very unlikely.
bytes: List of integers representing the UTF-8 bytes (in decimal format) of the token. Since some characters may be split across multiple tokens, these byte lists can be combined to reconstruct the corresponding UTF-8 character.
top_logprobs: List of the most probable next tokens and their log probabilities.
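Recombining byte lists across tokens can be sketched directly in Python. The example below uses the real UTF-8 encoding of "é" (0xC3 0xA9 = 195, 169) split across two tokens:

```python
# bytes fields from two consecutive tokens that together encode "é".
token_bytes = [[195], [169]]

# Flatten the per-token byte lists and decode the result as UTF-8.
combined = bytes(b for chunk in token_bytes for b in chunk)
text = combined.decode("utf-8")
```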
ChatCompletionTool
type: Type of tool object, always set to function.
ChatCompletionToolChoiceOption
Defines whether a model can call tools, and if so, which ones.
none: the model will not call any tools, and only generates a message.
auto: the model can choose either to generate a message, or to call one or multiple tools.
required: the model must call one or multiple tools.
Default: none when no tools are present, otherwise auto.
An object can also be provided to specify a tool that the model must call. Object format must be:
{"type": "function", "function": {"name": "function_name_as_provided_in_tools"}}
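The two request shapes, a mode string or a forced-function object, can be sketched side by side (the function name is hypothetical and must match a name provided in tools):

```python
# Mode string: let the model decide whether to call a tool.
tool_choice_auto = "auto"

# Object form: force the model to call one specific tool, in the
# format shown above. "get_weather" is an illustrative function name.
tool_choice_forced = {
    "type": "function",
    "function": {"name": "get_weather"},
}
```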
ChatCompletionResponseChoice
index: Index of the choice in the list of choices.
message: Message generated by the model.
logprobs: Object containing log probability information for each token in a generated response.
finish_reason: Reason the model stopped generating tokens.
stop: the model successfully reached the end of its answer, or a provided stop sequence
length: the maximum number of output tokens was reached, blocking further generation
tool_calls: the model needed to call a tool
ChatCompletionUsage
prompt_tokens: Number of input tokens.
total_tokens: Total number of tokens (input and output).
completion_tokens: Number of output tokens.
Breakdown of output tokens by type.
Breakdown of input tokens by type.
CreateResponse
id: UUID of the response.
object: Type of response object, always set to response.
created_at: Timestamp when the response was generated (Unix format, in seconds).
status: Status of the response.
model: Unique identifier of the model.
List of outputs generated by the model as a response.
Configuration of the response format, either plain text or JSON structured data.
CreateChatCompletionResponse
id: UUID of the response.
object: Type of response object, always set to chat.completion.
created: Timestamp when the response was generated (Unix format, in seconds).
model: Unique identifier of the model.
choices: List of chat completion variations. Defaults to only 1 choice, but can be increased by setting a value for n in the request.
CreateEmbeddingResponse
id: UUID of the response.
object: Type of response object, always set to list.
created: Timestamp when the response was generated (Unix format, in seconds).
model: Unique identifier of the model.
List of embeddings.
Usage information generated by this request.
Embedding
index: Index of the embedding in the list of embeddings.
object: Type of the response object, always set to embedding.
embedding: Embedding vector, represented as a list of floating-point values. The length of the vector is equal to the number of dimensions of the model.
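Because the embedding field is a plain list of floats, similarity between two vectors can be computed directly; cosine similarity is the usual choice. The vectors below are illustrative, not model output:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical directions score 1.0; orthogonal directions score 0.0.
score = cosine_similarity([1.0, 0.0], [1.0, 0.0])
```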
CreateRerankResponse
id: UUID of the response.
model: Unique identifier of the model.
List of documents sorted by relevance.
Usage information generated by this request.
Ranking
index: Index of the document in the initial request.
relevance_score: The document's relevance to answering the query.
Document sent in the request.
CreateAudioTranscriptionResponse
text: Transcribed text.
Usage information generated by this request, either in tokens or duration depending on how the model is billed.
Batch
id: UUID of the batch.
object: Type of batch object, always set to batch.
endpoint: Path used to process requests in the batch.
model: Model used to process the batch.
Error object
input_file_id: URL of the input file.
completion_window: Time range during which the batch should be processed.
status: Status of the batch.
output_file_id: URL of the output file.
error_file_id: URL of the error file.
created_at: Timestamp when the batch was created (Unix format, in seconds).
in_progress_at: Timestamp when the batch processing started (Unix format, in seconds).
expires_at: Timestamp when the batch will expire (Unix format, in seconds).
finalizing_at: Timestamp when the batch started finalizing (Unix format, in seconds).
completed_at: Timestamp when the batch was completed (Unix format, in seconds).
failed_at: Timestamp when the batch failed (Unix format, in seconds).
expired_at: Timestamp when the batch expired (Unix format, in seconds).
cancelling_at: Timestamp when the batch started cancelling (Unix format, in seconds).
cancelled_at: Timestamp when the batch was cancelled (Unix format, in seconds).
Number of requests by status.
Usage information generated by this request, either in tokens or duration depending on how the model is billed.
ListBatchResponse
object: Type of response object, always set to list.
List of batches.
first_id: UUID of the first batch in the response.
last_id: UUID of the last batch in the response.
has_more: Defines whether there are more results to retrieve that were not returned by this query.
ResponseAPIFunctionObject
type: Type of tool object, always set to function.
name: Name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores and dashes, with a maximum length of 64 characters.
description: Description of the function. This helps the model choose the right function when needed.
parameters: Parameters of the function, described as a JSON schema object. See How to use function calling for examples, and the JSON schema reference for documentation about the format.
Omitting parameters defines a function with an empty parameter list.
strict: Defines whether to enforce strict schema adherence when generating a function call. If set to true, the model will follow the exact schema defined in the parameters field. Currently, this parameter is ignored even when set to true, and behaves as if set to false. We recommend checking the output schema before calling any functions or tools.
Default: false
FunctionObject
name: Name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores and dashes, with a maximum length of 64 characters.
description: Description of the function. This helps the model choose the right function when needed.
parameters: Parameters of the function, described as a JSON schema object. See How to use function calling for examples, and the JSON schema reference for documentation about the format.
Omitting parameters defines a function with an empty parameter list.
strict: Defines whether to enforce strict schema adherence when generating a function call. If set to true, the model will follow the exact schema defined in the parameters field. Currently, this parameter is ignored even when set to true, and behaves as if set to false. We recommend checking the output schema before calling any functions or tools.
Default: false
FunctionParameters
ListModelsResponse
object: Type of response object, always set to list.
List of models.
Model
id: Unique identifier of the model.
object: Object type. Always set to model.
created: Timestamp when the model was created (Unix format, in seconds).
owned_by: Name of the organization that created the model (i.e. the model provider).
ParallelToolCalls
Defines whether the model can call multiple tools. Currently, this parameter is ignored even when set to false, and behaves as if set to true.
Only specific models can call multiple tools in a single response.
Default value: true
MaxOutputTokens
Maximum number of output tokens that can be generated for a completion.
Different default maximum values are enforced for each model, to avoid edge cases where tokens are generated indefinitely. These values are not enforced in Managed Inference.
ResponseFormatChatCompletion
type: Type of response object.
json_schema: Schema the response object should follow, in JSON format. This field can only be used if type is set to json_schema.
ResponseFormatResponseAPI
type: Type of response object. The properties name, schema, description and strict can only be used if type is set to json_schema.
name: Name of the response format. Must only contain alphanumeric characters, underscores and dashes.
schema: Schema the response object should follow, in JSON format. This field can only be used if type is set to json_schema. Learn more
description: Description of the response format. This helps the model generate a response that follows the desired structure.
strict: Defines whether to enforce strict schema adherence when generating structured output. Currently, only true is supported.
Default: true
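A response format payload in this shape can be sketched as follows; the schema content is illustrative, while the field names follow this reference:

```python
# Sketch of a ResponseFormatResponseAPI payload requesting structured
# JSON output. The city_info schema is made up for illustration.
response_format = {
    "type": "json_schema",
    "name": "city_info",
    "description": "Structured facts about a city.",
    "strict": True,  # only true is currently supported
    "schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "population": {"type": "integer"},
        },
        "required": ["city", "population"],
    },
}
```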
StopConfiguration
String, or array of strings, that when encountered in the generated text will stop the model from generating further output tokens. The generated text will not return any of the specified stop sequences. A maximum of 4 sequences can be provided.
Temperature
Value between 0 and 2 that increases randomness in token generation (e.g. it encourages content "creativity" instead of "predictability").
temperature = 0 means the distribution learned by the model is used directly, favoring a subset of the most probable tokens at each generation step.
temperature > 0 means randomness is added to the learned distribution, so that tokens with a lower probability can also be generated.
temperature >= 1 means the added randomness is so high that almost all tokens become equally probable, leading the model to potentially mix languages.
The ideal temperature value depends on the use case and model. We recommend setting temperature to the recommended value for each model, as shown in Console Playground (these values are used by default).
Note that temperature does not affect request reproducibility (which is only affected by the seed parameter). With the same seed and temperature, two identical requests to a model will generate the same response.
TopP
Value between 0 and 1 which increases the proportion of token vocabulary considered during generation (0 cannot be used).
top_p:0.9 means the next token will be chosen from the 90% most probable tokens at each generation step.
We recommend setting top_p to the recommended value for each model, as shown in Console Playground (these values are used by default).
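The effect of temperature and top_p on a next-token distribution can be illustrated with a toy sampler (this is a conceptual sketch, not the provider's actual sampling implementation):

```python
# Toy illustration: temperature rescales the learned distribution,
# then top_p keeps the smallest set of tokens whose cumulative
# probability reaches the threshold. The distribution is made up.
def rescale(probs: dict[str, float], temperature: float) -> dict[str, float]:
    if temperature == 0:
        # Degenerate case: always pick the single most probable token.
        top = max(probs, key=probs.get)
        return {top: 1.0}
    weights = {t: p ** (1.0 / temperature) for t, p in probs.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    kept, cumulative = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

dist = {"the": 0.5, "a": 0.3, "cat": 0.15, "xyzzy": 0.05}
greedy = rescale(dist, 0)          # temperature 0: deterministic
nucleus = top_p_filter(dist, 0.9)  # drops the least likely token
```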