Data Lab API for Apache Spark™
Introduction
Data Lab is Scaleway's fully-managed service for running Apache Spark™ workloads. It provides a scalable, secure environment to process large datasets with ease. With Data Lab, you can launch Spark clusters in minutes and focus on data processing instead of infrastructure management.
The service is currently in Public Beta.
Concepts
Refer to our dedicated concepts page for definitions of the terms used with Data Lab for Apache Spark™.
Quickstart
- Configure your environment variables.
  Note: This step is optional and simplifies the commands that follow.
    export SCW_SECRET_KEY="<API secret key>"
    export SCW_DEFAULT_ZONE="<Scaleway default Availability Zone>"
    export SCW_DEFAULT_REGION="<Scaleway default region>"
    export SCW_PROJECT_ID="<Scaleway Project ID>"
- Create a Data Lab: Run the following command to create a Data Lab with one main node, two worker nodes, and 20 GB of total persistent volume storage. You can customize the details in the payload to your needs, using the table below to help. You must already have a VPC and a Private Network before running this command.
    curl --request POST \
      --url https://api.scaleway.com/datalab/v1beta1/regions/fr-par/datalabs \
      -H "X-Auth-Token: $SCW_SECRET_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "my-first-datalab",
        "project_id": "'"$SCW_PROJECT_ID"'",
        "worker": {"node_type": "DDL-POP2-2C-8G", "node_count": 2},
        "main": {"node_type": "DDL-PLAY2-MICRO"},
        "has_notebook": true,
        "total_storage": {"size": 20000000000, "type": "sbs_5k"},
        "private_network_id": "{Your PN ID}",
        "spark_version": "4.0.0"
      }'
- Get a list of your Data Labs: Run the following command to list all the Data Labs in your account, with their details:
    curl --request GET \
      --url https://api.scaleway.com/datalab/v1beta1/regions/fr-par/datalabs \
      -H "X-Auth-Token: $SCW_SECRET_KEY"
- Delete your Data Lab: Run the following command to delete a Data Lab. Ensure that you replace {datalab-id} in the URL with the ID of the Data Lab you want to delete.
    curl --request DELETE \
      --url https://api.scaleway.com/datalab/v1beta1/regions/fr-par/datalabs/{datalab-id} \
      -H "X-Auth-Token: $SCW_SECRET_KEY" | jq
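The creation payload mixes single and double quotes to splice in `$SCW_PROJECT_ID`, which is easy to get wrong when editing by hand. The following is a minimal sketch of one way to assemble the same payload from shell arguments. The function names `build_datalab_payload` and `create_datalab` are hypothetical and not part of any Scaleway tooling, and the sketch assumes the values contain no double quotes or backslashes, since `printf` performs no JSON escaping.

```shell
# Hypothetical helper: print the creation payload used in the Quickstart.
# $1 = Data Lab name, $2 = Project ID, $3 = Private Network ID
# Assumption: the values contain no double quotes or backslashes,
# because printf does no JSON escaping.
build_datalab_payload() {
  printf '{"name": "%s", "project_id": "%s", ' "$1" "$2"
  printf '"worker": {"node_type": "DDL-POP2-2C-8G", "node_count": 2}, '
  printf '"main": {"node_type": "DDL-PLAY2-MICRO"}, "has_notebook": true, '
  printf '"total_storage": {"size": 20000000000, "type": "sbs_5k"}, '
  printf '"private_network_id": "%s", "spark_version": "4.0.0"}\n' "$3"
}

# Hypothetical wrapper around the create call shown in the Quickstart.
# It is only defined here; nothing is sent until you call it.
create_datalab() {
  curl --request POST \
    --url https://api.scaleway.com/datalab/v1beta1/regions/fr-par/datalabs \
    -H "X-Auth-Token: $SCW_SECRET_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_datalab_payload "$1" "$2" "$3")"
}
```

A call would then look like `create_datalab my-first-datalab "$SCW_PROJECT_ID" "<Private Network ID>"`. Running `build_datalab_payload` on its own prints the JSON, which makes it possible to inspect the body before sending anything.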
Technical information
Regions
Scaleway's infrastructure spans different regions and Availability Zones.
Data Lab for Apache Spark™ is currently available in the Paris region, which is represented by the following path parameter:
- fr-par
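Because the region appears as a path parameter, a script that reads it from `SCW_DEFAULT_REGION` may want to check it before building a URL. The sketch below is one possible guard. The `SUPPORTED_REGIONS` variable and the `is_supported_region` function are hypothetical names, and the list contains only the region named in this section.

```shell
# Regions named in this section; extend the list if more become available.
SUPPORTED_REGIONS="fr-par"

# Hypothetical helper: succeed only if $1 is in SUPPORTED_REGIONS.
is_supported_region() {
  for r in $SUPPORTED_REGIONS; do
    [ "$r" = "$1" ] && return 0
  done
  return 1
}

# Fall back to fr-par when SCW_DEFAULT_REGION is unset or empty.
region="${SCW_DEFAULT_REGION:-fr-par}"
if is_supported_region "$region"; then
  echo "https://api.scaleway.com/datalab/v1beta1/regions/$region/datalabs"
else
  echo "Data Lab is not available in region: $region" >&2
fi
```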
Going further
For more information about Data Lab for Apache Spark™, you can check out the following pages:
Data Labs
A Data Lab is an encapsulated Apache Spark™ cluster, composed of one or more dedicated compute nodes running Spark. It can optionally include a dedicated JupyterLab Notebook. These resources are fully manageable through this API.
GET /datalab/v1beta1/regions/{region}/cluster-versions
GET /datalab/v1beta1/regions/{region}/datalabs
POST /datalab/v1beta1/regions/{region}/datalabs
GET /datalab/v1beta1/regions/{region}/datalabs/{datalab_id}
PATCH /datalab/v1beta1/regions/{region}/datalabs/{datalab_id}
DELETE /datalab/v1beta1/regions/{region}/datalabs/{datalab_id}
GET /datalab/v1beta1/regions/{region}/node-types
GET /datalab/v1beta1/regions/{region}/notebook-versions
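All the read endpoints above share the same base path and authentication header, so they can be called through one small wrapper. The following is a rough sketch under stated assumptions: `datalab_url` and `datalab_get` are hypothetical helper names, `SCW_SECRET_KEY` is assumed to be exported as in the Quickstart, and the response format is not shown because it is not described here.

```shell
# Base path shared by every endpoint in the list above.
DATALAB_API="https://api.scaleway.com/datalab/v1beta1"

# Hypothetical helper: build the URL for a region-scoped resource path.
# $1 = region (for example fr-par), $2 = resource path (for example node-types)
datalab_url() {
  printf '%s/regions/%s/%s\n' "$DATALAB_API" "$1" "$2"
}

# Hypothetical helper: send an authenticated GET request and print the raw response.
# Assumes SCW_SECRET_KEY is exported as in the Quickstart.
datalab_get() {
  curl --silent --request GET \
    --url "$(datalab_url "$1" "$2")" \
    -H "X-Auth-Token: $SCW_SECRET_KEY"
}

# Example calls (not run automatically):
#   datalab_get fr-par node-types
#   datalab_get fr-par cluster-versions
#   datalab_get fr-par notebook-versions
#   datalab_get fr-par "datalabs/<datalab_id>"
```

Querying `node-types` and `cluster-versions` before creating a Data Lab is presumably a convenient way to check which `node_type` and `spark_version` values are accepted, although that use is an assumption rather than something stated above.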