Data Orchestrator - Concepts
Orchestration
Orchestration is the automated coordination of tasks and workflows that keeps data operations reliable, scalable, and maintainable. In the context of Scaleway Data Orchestrator, it lets users define, schedule, and manage complex data pipelines while handling dependencies, error recovery, and execution order. Instead of manually triggering scripts and monitoring jobs, orchestration brings structure to fragmented processes, turning them into unified, business-aligned workflows.
Tasks
Action task
An action task represents the executable unit within a workflow that performs concrete work. An action task can be:
- Serverless Jobs: Long-running batch processes that scale automatically without infrastructure management.
- Serverless Functions: Lightweight, event-driven code execution for quick transformations or API calls.
- Spark Jobs: Distributed data processing tasks for large-scale ETL or analytics using Apache Spark.
- Other compute-intensive or service-specific jobs (e.g., data validation, model inference).
These tasks are orchestrated in sequence or in parallel, forming the backbone of data processing pipelines.
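For instance, a pipeline chaining a Serverless Job and a Spark Job could be sketched as follows. This is an illustrative YAML fragment only: the task type names, field names (`dependsOn`, `image`, `script`), and URIs are hypothetical and do not reflect the exact Data Orchestrator schema.

```yaml
# Illustrative only: two action tasks run in sequence.
id: nightly-etl
tasks:
  - id: extract
    type: serverless-job            # hypothetical type name
    image: registry.example/extract:latest
  - id: transform
    type: spark-job                 # hypothetical type name
    dependsOn: [extract]            # runs only after "extract" succeeds
    script: s3://my-bucket/transform.py
```

Here `transform` declares a dependency on `extract`, so the orchestrator runs the two tasks in order; tasks with no dependency between them could run in parallel.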
Logic task
A logic task controls the flow and decision-making within a workflow, enabling dynamic behavior beyond simple linear execution. A logic task can be:
- Switch: Direct flow based on runtime conditions (e.g., file size, data quality).
- Fork: Split execution into parallel branches to process data concurrently.
- Try/catch: Implement error-handling blocks to manage failures and enable retries or fallback logic.
These tasks allow users to embed business logic directly into pipelines, making them resilient and adaptable.
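The three logic task types above could be combined as in the following sketch. Again, this is a hypothetical YAML shape for illustration: the type names (`switch`, `fork`, `try-catch`), field names, and templating syntax are assumptions, not the actual Data Orchestrator schema.

```yaml
# Illustrative only: routing, parallelism, and error handling.
tasks:
  - id: route-by-size
    type: switch                     # branch on a runtime condition
    on: "{{ inputs.file_size }}"
    cases:
      large: [spark-transform]       # big files go to Spark
      default: [function-transform]  # small files use a function
  - id: parallel-load
    type: fork                       # run both branches concurrently
    branches:
      - [load-warehouse]
      - [update-search-index]
  - id: safe-ingest
    type: try-catch
    try: [ingest]
    catch: [notify-on-failure]       # fallback if "ingest" fails
```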
Trigger
A trigger is the event that initiates a workflow execution. A trigger can be:
- Manual: The user starts the run via the Scaleway Console or CLI (ideal for testing).
- Schedule: Automatic execution based on time (e.g., daily at 8:00 AM), set with a built-in scheduler.
- Event: Triggered by external signals (e.g., new file in object storage, message in a queue), enabling reactive, real-time data processing.
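A schedule trigger and an event trigger might be declared as follows. The cron expression `0 8 * * *` is standard crontab syntax for "daily at 8:00 AM"; the surrounding YAML field names and type names are hypothetical, not the exact product schema.

```yaml
# Illustrative only: two ways to start the same workflow.
triggers:
  - id: every-morning
    type: schedule
    cron: "0 8 * * *"        # daily at 8:00 AM
  - id: on-upload
    type: event
    source: object-storage   # hypothetical event source name
    bucket: raw-data         # fire when a new file lands here
```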
Views
Code view
Every workflow can be viewed as code, showing the tasks and their dependencies in declarative form.
Graph view
Every workflow can be visualized as a Directed Acyclic Graph (DAG), showing the tasks and their dependencies.
Workflow
A workflow is a structured sequence of action tasks and logic tasks that defines an end-to-end data process.
Workflow definition
The declarative blueprint of a workflow, typically described in code (e.g., YAML or Python) or designed visually. It specifies tasks, dependencies, conditions, and execution parameters. This definition is version-controlled, reusable, and portable across environments.
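Putting the pieces together, a minimal definition combining a trigger, tasks with dependencies, and execution parameters could look like this. All type and field names here are illustrative assumptions rather than the exact Data Orchestrator syntax.

```yaml
# Illustrative only: a complete minimal workflow definition.
id: daily-report
triggers:
  - id: morning-run
    type: schedule
    cron: "0 8 * * *"          # daily at 8:00 AM
tasks:
  - id: extract
    type: serverless-job        # hypothetical type name
  - id: report
    type: serverless-function   # hypothetical type name
    dependsOn: [extract]        # dependency defines execution order
parameters:
  retention_days: 30            # example execution parameter
```

Because the blueprint is plain text, it can be committed to version control, reviewed like any other code, and deployed unchanged across environments.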
Workflow execution / run
The runtime instance of a workflow definition. Each execution (or run) tracks the state, logs, and results of every task, providing full observability and auditability. Runs can succeed, fail, or be paused, with detailed insights for debugging.