Each cluster is composed of one or more dedicated compute nodes running Apache Spark™. It can optionally include a dedicated JupyterLab Notebook. These resources are fully manageable through this API.
List the Apache Spark™ versions the product is compatible with.
path Parameters
region: The region you want to target.
query Parameters
page: The page number.
page_size: The page size.
order_by: The field to order by.
Responses
The list of cluster versions.
total_count: The total count of cluster versions.
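As a sketch of how the pagination parameters fit together, the helper below builds the request for this endpoint. The base URL and endpoint path are illustrative placeholders, not taken from this reference:

```python
# Illustrative only: BASE and the path are placeholder values, not the
# product's real URL scheme. The returned pieces would be passed to an
# HTTP client, e.g. requests.get(url, params=params, headers=...).
BASE = "https://api.example.com/data-lab/v1"

def list_cluster_versions(region, page=1, page_size=20, order_by=None):
    """Build the URL and query parameters for listing Spark versions."""
    url = f"{BASE}/regions/{region}/cluster-versions"
    params = {"page": page, "page_size": page_size}
    if order_by is not None:  # only send the filter when it is set
        params["order_by"] = order_by
    return url, params
```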
List information about the clusters within a project or an organization.
path Parameters
region: The region you want to target.
query Parameters
organization_id: The unique identifier of the organization whose clusters you want to list.
project_id: The unique identifier of the project whose clusters you want to list.
name: The name of the cluster you want to list.
tags: The tags associated with the clusters you want to list.
page: The page number for pagination.
page_size: The page size for pagination.
order_by: The field to order by. Available options are name_asc, name_desc, created_at_asc, created_at_desc, updated_at_asc and updated_at_desc.
Responses
The list of clusters. This is a list composed of messages of type DataLab.
total_count: The total count of clusters.
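The optional filters above are only meaningful when set, which a thin client wrapper can express by omitting unset ones from the query string. A sketch, with a placeholder base URL and path:

```python
# Placeholder base URL and path; only the parameter names come from the
# reference above.
BASE = "https://api.example.com/data-lab/v1"

def list_clusters(region, project_id=None, organization_id=None,
                  name=None, tags=None, page=1, page_size=20,
                  order_by="created_at_desc"):
    """Build the URL and query parameters for listing clusters.

    Filters left as None are not sent, so an unfiltered call lists
    every cluster visible to the caller.
    """
    url = f"{BASE}/regions/{region}/clusters"
    params = {"page": page, "page_size": page_size, "order_by": order_by}
    optional = {"project_id": project_id, "organization_id": organization_id,
                "name": name, "tags": tags}
    params.update({k: v for k, v in optional.items() if v is not None})
    return url, params
```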
Create a new cluster. In this call, you can customize the node counts, include a notebook, choose a private network, and define the persistent volume storage capacity.
path Parameters
region: The region you want to target.
Request Body
project_id: The unique identifier of the project where the cluster will be created.
name: The name of the cluster.
description: The description of the cluster.
tags: The tags of the cluster.
The cluster main node specification. It holds the parameter node_type, which specifies the node type of the main node. See ListNodeTypes for available options.
The cluster worker node specification. It holds the parameters node_type, which specifies the node type of the worker nodes, and node_count, which specifies the number of worker nodes.
has_notebook: Select this option to include a notebook as part of the cluster.
spark_version: The version of Apache Spark™ running inside the cluster. Available options can be viewed at ListClusterVersions.
The maximum persistent volume storage that will be available during workloads.
private_network_id: The unique identifier of the private network the cluster will be attached to.
Responses
id: The unique identifier of the cluster (UUID format).
project_id: The unique identifier of the project where the cluster has been created (UUID format).
name: The name of the cluster.
description: The description of the cluster.
tags: The tags of the cluster.
The Apache Spark™ main node specification of the cluster. It holds the parameters node_type, spark_ui_url (used to reach the Apache Spark™ UI), spark_master_url (used to reach the cluster within a VPC) and root_volume (the size of the volume assigned to the cluster).
The cluster worker nodes specification. It holds the parameters node_type, node_count and root_volume (the size of the volume assigned to the cluster).
status: The status of the cluster. For a working cluster, the status is marked as ready.
created_at: The creation timestamp of the cluster (RFC 3339 format).
updated_at: The last update date of the cluster (RFC 3339 format).
region: The region of the cluster.
has_notebook: Whether a JupyterLab notebook is associated with the cluster.
notebook_url: The URL of the notebook, if available.
spark_version: The version of Apache Spark™ running inside the cluster.
The total persistent volume storage selected to run Apache Spark™.
private_network_id: The unique identifier of the private network to which the cluster is attached (UUID format).
notebook_master_url: The URL used to reach the cluster from the notebook, when available. This URL cannot be used to reach the cluster from a server.
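A minimal sketch of a request body for the create call. Only the top-level field names come from the reference above; the nested keys main_node_spec and worker_node_spec are assumptions standing in for the unnamed specification objects:

```python
# Sketch of a CreateCluster request body. The nested key names
# "main_node_spec" and "worker_node_spec" are ASSUMPTIONS, not taken
# from this reference; check the real schema before sending.
def create_cluster_payload(project_id, name, main_node_type,
                           worker_node_type, worker_count, spark_version,
                           has_notebook=False, private_network_id=None,
                           tags=None):
    payload = {
        "project_id": project_id,
        "name": name,
        "tags": tags or [],
        "main_node_spec": {"node_type": main_node_type},       # assumed key
        "worker_node_spec": {"node_type": worker_node_type,    # assumed key
                             "node_count": worker_count},
        "has_notebook": has_notebook,
        "spark_version": spark_version,
    }
    if private_network_id is not None:  # attach only when a VPC is chosen
        payload["private_network_id"] = private_network_id
    return payload  # POST this dict as the JSON body of the create call
```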
Retrieve information about a given cluster, specified by the region and datalab_id parameters. Its full details, including name, status and node counts, are returned in the response object.
path Parameters
region: The region you want to target.
datalab_id: The unique identifier of the cluster.
Responses
id: The unique identifier of the cluster (UUID format).
project_id: The unique identifier of the project where the cluster has been created (UUID format).
name: The name of the cluster.
description: The description of the cluster.
tags: The tags of the cluster.
The Apache Spark™ main node specification of the cluster. It holds the parameters node_type, spark_ui_url (used to reach the Apache Spark™ UI), spark_master_url (used to reach the cluster within a VPC) and root_volume (the size of the volume assigned to the cluster).
The cluster worker nodes specification. It holds the parameters node_type, node_count and root_volume (the size of the volume assigned to the cluster).
status: The status of the cluster. For a working cluster, the status is marked as ready.
created_at: The creation timestamp of the cluster (RFC 3339 format).
updated_at: The last update date of the cluster (RFC 3339 format).
region: The region of the cluster.
has_notebook: Whether a JupyterLab notebook is associated with the cluster.
notebook_url: The URL of the notebook, if available.
spark_version: The version of Apache Spark™ running inside the cluster.
The total persistent volume storage selected to run Apache Spark™.
private_network_id: The unique identifier of the private network to which the cluster is attached (UUID format).
notebook_master_url: The URL used to reach the cluster from the notebook, when available. This URL cannot be used to reach the cluster from a server.
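Because a working cluster reports status ready, a common pattern is to poll this endpoint after creation until that status is reached. A sketch, with the fetch step injected as a callable so no real endpoint or client is assumed:

```python
import time

def wait_until_ready(fetch_cluster, timeout=600, interval=10):
    """Poll a GetCluster-style callable until the status is 'ready'.

    fetch_cluster: a zero-argument callable returning the cluster as a
    dict (e.g. a wrapper around the real GET request). Raises
    TimeoutError if the cluster does not become ready in time.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        cluster = fetch_cluster()
        if cluster.get("status") == "ready":
            return cluster
        time.sleep(interval)  # back off between polls
    raise TimeoutError("cluster did not reach 'ready' before the timeout")
```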
Delete a cluster based on its region and ID.
path Parameters
region: The region you want to target.
datalab_id: The unique identifier of the cluster.
Responses
id: The unique identifier of the cluster (UUID format).
project_id: The unique identifier of the project where the cluster has been created (UUID format).
name: The name of the cluster.
description: The description of the cluster.
tags: The tags of the cluster.
The Apache Spark™ main node specification of the cluster. It holds the parameters node_type, spark_ui_url (used to reach the Apache Spark™ UI), spark_master_url (used to reach the cluster within a VPC) and root_volume (the size of the volume assigned to the cluster).
The cluster worker nodes specification. It holds the parameters node_type, node_count and root_volume (the size of the volume assigned to the cluster).
status: The status of the cluster. For a working cluster, the status is marked as ready.
created_at: The creation timestamp of the cluster (RFC 3339 format).
updated_at: The last update date of the cluster (RFC 3339 format).
region: The region of the cluster.
has_notebook: Whether a JupyterLab notebook is associated with the cluster.
notebook_url: The URL of the notebook, if available.
spark_version: The version of Apache Spark™ running inside the cluster.
The total persistent volume storage selected to run Apache Spark™.
private_network_id: The unique identifier of the private network to which the cluster is attached (UUID format).
notebook_master_url: The URL used to reach the cluster from the notebook, when available. This URL cannot be used to reach the cluster from a server.
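The delete call needs only the two path parameters. A sketch, again with a placeholder base URL and path:

```python
# Placeholder base URL and path, shown only to illustrate how the two
# path parameters compose into the request target.
BASE = "https://api.example.com/data-lab/v1"

def delete_cluster_request(region, datalab_id):
    """Return the HTTP method and URL for deleting a cluster."""
    return "DELETE", f"{BASE}/regions/{region}/clusters/{datalab_id}"
```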
Update a cluster's node count. Allows for upscaling and downscaling on demand, depending on the expected workload.
path Parameters
region: The region you want to target.
datalab_id: The unique identifier of the cluster.
Request Body
name: The updated name of the cluster.
description: The updated description of the cluster.
tags: The updated tags of the cluster.
node_count: The updated node count of the cluster. Scales the number of worker nodes up or down.
Responses
id: The unique identifier of the cluster (UUID format).
project_id: The unique identifier of the project where the cluster has been created (UUID format).
name: The name of the cluster.
description: The description of the cluster.
tags: The tags of the cluster.
The Apache Spark™ main node specification of the cluster. It holds the parameters node_type, spark_ui_url (used to reach the Apache Spark™ UI), spark_master_url (used to reach the cluster within a VPC) and root_volume (the size of the volume assigned to the cluster).
The cluster worker nodes specification. It holds the parameters node_type, node_count and root_volume (the size of the volume assigned to the cluster).
status: The status of the cluster. For a working cluster, the status is marked as ready.
created_at: The creation timestamp of the cluster (RFC 3339 format).
updated_at: The last update date of the cluster (RFC 3339 format).
region: The region of the cluster.
has_notebook: Whether a JupyterLab notebook is associated with the cluster.
notebook_url: The URL of the notebook, if available.
spark_version: The version of Apache Spark™ running inside the cluster.
The total persistent volume storage selected to run Apache Spark™.
private_network_id: The unique identifier of the private network to which the cluster is attached (UUID format).
notebook_master_url: The URL used to reach the cluster from the notebook, when available. This URL cannot be used to reach the cluster from a server.
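Since every request-body field of the update call is optional, a scaling-only update can send node_count alone and leave the other attributes untouched. A sketch of such a body:

```python
# Sketch of a scale-only update body; the field name node_count comes
# from the reference above, the single-field pattern is an assumption
# about partial-update semantics.
def scale_workers_payload(node_count):
    """Build an update body that only changes the worker node count."""
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    return {"node_count": node_count}
```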
List the available compute node types for creating a new cluster.
path Parameters
region: The region you want to target.
query Parameters
page: The page number.
page_size: The page size.
order_by: The field to order by. Available fields are name_asc, name_desc, vcpus_asc, vcpus_desc, memory_gigabytes_asc, memory_gigabytes_desc, vram_bytes_asc, vram_bytes_desc, gpus_asc and gpus_desc.
Filter based on the target of the nodes. Allows filtering the nodes by their purpose, which can be main or worker.
resource_type: Filter based on node type (cpu/gpu/all).
Responses
The list of node types.
total_count: The total count of node types.
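For example, a client that only wants GPU-backed worker candidates can combine resource_type with an ordering field. A sketch, with placeholder base URL and path:

```python
# Placeholder base URL and path; the query parameter names and the
# cpu/gpu/all values come from the reference above.
BASE = "https://api.example.com/data-lab/v1"

def list_node_types(region, resource_type="all", order_by="name_asc",
                    page=1, page_size=20):
    """Build the request for listing node types, filtered by resource."""
    if resource_type not in ("cpu", "gpu", "all"):
        raise ValueError("resource_type must be cpu, gpu or all")
    url = f"{BASE}/regions/{region}/node-types"
    params = {"page": page, "page_size": page_size,
              "order_by": order_by, "resource_type": resource_type}
    return url, params
```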
List the available notebook versions.
path Parameters
region: The region you want to target.
query Parameters
page: The page number.
page_size: The page size.
order_by: The field to order by. Available options are name_asc and name_desc.
Responses
The list of notebook versions.
total_count: The total count of notebook versions.
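This listing follows the same pagination pattern as the other list endpoints, with only two ordering options. A sketch, with placeholder base URL and path:

```python
# Placeholder base URL and path; name_asc/name_desc are the only
# ordering options named by the reference above.
BASE = "https://api.example.com/data-lab/v1"

def list_notebook_versions(region, order_by="name_asc", page=1, page_size=20):
    """Build the request for listing available notebook versions."""
    if order_by not in ("name_asc", "name_desc"):
        raise ValueError("order_by must be name_asc or name_desc")
    url = f"{BASE}/regions/{region}/notebook-versions"
    return url, {"page": page, "page_size": page_size, "order_by": order_by}
```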