Cockpit - Concepts

Reviewed on 19 June 2024

Active series

Active series refer to time series for which the latest samples received by your Cockpit are less than 10 minutes old.
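
As a rough illustration of this definition, the sketch below counts the series whose latest sample is less than 10 minutes old. The series names, timestamps, and the `count_active_series` helper are hypothetical and only serve the example; this is not how Cockpit computes the figure internally.

```python
from datetime import datetime, timedelta, timezone

# Window taken from the definition above: latest sample less than 10 minutes old.
ACTIVE_WINDOW = timedelta(minutes=10)


def count_active_series(last_sample_times, now):
    """Count the series whose latest sample is less than 10 minutes old."""
    return sum(1 for ts in last_sample_times.values() if now - ts < ACTIVE_WINDOW)


# Hypothetical series names mapped to the timestamp of their latest sample.
now = datetime(2024, 6, 19, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "cpu_usage{host=a}": now - timedelta(minutes=2),
    "disk_usage{host=a}": now - timedelta(minutes=9, seconds=59),
    "cpu_usage{host=b}": now - timedelta(minutes=10),  # exactly 10 minutes: not active
    "disk_usage{host=b}": now - timedelta(hours=3),
}
active = count_active_series(last_seen, now)
```

In this example, only the first two series count as active.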

Agent

An agent is a software component that runs on your systems to gather data types from the host system or applications running on it. The agent then forwards this data to Cockpit for analysis and visualization.

Tip

Find out how to configure the Grafana Alloy agent to collect and send your data to Cockpit.

Alerting

Alerting detects complex conditions defined by a rule and keeps you aware of issues in your environments. When a condition defined by a rule is met, the rule tracks it as an alert and responds by triggering one or more actions.

Important

The Grafana alert manager on the Grafana interface is inactive. We strongly recommend that you select the Scaleway Alerting alert manager if you want to manage your alerts using Grafana.

Alert manager

Scaleway’s regionalized alert manager allows you to manage and respond to alerts in the regions where you have enabled it. It handles alerts sent when the alerting rules we run are firing. The alert manager sends notifications (e.g. emails or texts) when a criterion you have configured on your applications’ metrics and logs is met.

Alerting rules

Alerting rules allow you to define criteria that determine whether an alert is triggered. A rule consists of queries and expressions, a condition, the frequency of evaluation, and the duration over which the condition must be met. Rules act as alarm sensors: when an alert is triggered, a notification is sent to the alert manager, which forwards it to receivers.

Note

If you plan on setting up alerting rules with Grafana, use the Mimir or Loki alerts rather than the Grafana managed alert.
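
The parts of an alerting rule (a condition, an evaluation frequency, and a duration over which the condition must be met) can be pictured with a minimal sketch. The `AlertingRule` class, the rule name, the sample values, and the state names `inactive`, `pending`, and `firing` are assumptions made for this example; they mirror common Prometheus-style terminology rather than Cockpit’s actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AlertingRule:
    name: str
    condition: Callable[[float], bool]   # is the condition met for this value?
    for_seconds: int                     # how long the condition must hold
    pending_since: Optional[int] = None  # when the condition first became true

    def evaluate(self, timestamp: int, value: float) -> str:
        """Return 'inactive', 'pending' or 'firing' for one evaluation."""
        if not self.condition(value):
            self.pending_since = None
            return "inactive"
        if self.pending_since is None:
            self.pending_since = timestamp
        if timestamp - self.pending_since >= self.for_seconds:
            return "firing"
        return "pending"


# Hypothetical rule: CPU usage above 90% for at least 2 minutes.
rule = AlertingRule("HighCpu", lambda v: v > 90.0, for_seconds=120)
# (timestamp in seconds, CPU usage in percent), one evaluation per minute.
samples = [(0, 50.0), (60, 95.0), (120, 97.0), (180, 99.0), (240, 40.0)]
states = [rule.evaluate(ts, value) for ts, value in samples]
```

Here the rule only reaches the firing state at the fourth evaluation, once the condition has held for the full two minutes, and resets as soon as the value drops.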

Cockpit

A Cockpit is an instance of the Observability product that stores metrics, logs, and traces and provides a dedicated dashboarding system on Grafana to visualize them. A Scaleway Project can have only one Cockpit, which is automatically activated when you are using Scaleway resources that are integrated into Cockpit.

Contact points

Contact points define who is notified when an alert fires, according to the region in which you have added them. Contact points include emails, Slack, on-call systems, and texts. When an alert fires, all contact points are notified.

Data sources

Data sources are regionalized backends that allow you to store and fetch your metrics, logs, and traces. By default, Scaleway data sources are enabled if you are using Scaleway resources integrated with Cockpit on your Project.

Scaleway data sources are read-only.

You can create additional custom data sources in the Paris, Amsterdam, and Warsaw regions, from your own external resources.

Data types

Data types refer to the categories of data you can collect for monitoring and observability. There are three data types:

  • Metrics, which are numeric measurements. They are used for performance monitoring.
  • Logs, which are textual records of events generated by your applications. They are used for event and error monitoring.
  • Traces, which are data structures that represent the path of a request. They are used for monitoring request behavior.

Endpoints

An endpoint is the point of entry in a communication channel when two systems are interacting. It is the means by which an API accesses the resources it needs from a server to perform its task. An endpoint can include the URL of a server or service. The Observability Cockpit provides four endpoints:

  • A Prometheus-compatible endpoint responsible for dealing with metrics
  • A Loki-compatible endpoint responsible for dealing with logs
  • A Prometheus-compatible endpoint responsible for configuring your alert manager
  • A Tempo-compatible endpoint responsible for dealing with traces

Important

  • Keeping the default configuration on your agents might cause more of your resources’ metrics to be sent, resulting in high consumption and a high bill at the end of the month.
  • Sending metrics, logs, and traces for Scaleway resources or personal data using an external path is a billable feature. In addition, any data that you push yourself is billed, even if you send data from Scaleway products. Refer to the product pricing for more information.

Grafana users

A Grafana user is any individual who can log in to Grafana. Each user is associated with a role. There are two types of roles a user can have:

  • a viewer: can only view dashboards
  • an editor: can build and view dashboards

Note

  • Managed dashboards in the “Scaleway” folder are always read-only, regardless of your role.
  • The admin user is not yet available for creation.

Loki

Loki is the log aggregation system used by Grafana to store and query your logs.

Loki Remote Write

Loki Remote Write is the protocol used to push your logs to your Cockpit’s logs endpoint.

LogQL

LogQL is Grafana Loki’s language for querying logs. LogQL uses labels and operators for filtering.

Logs

Logs are a data type that provides a record of all events and errors taking place during the lifecycle of your resources. They represent an excellent source of visibility if you want to know when a problem occurred, or which events correlate with it.

You can push logs with any Loki-compatible agent such as Promtail, Fluentd, Fluent Bit or Logstash.
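
For illustration, here is a minimal sketch of the JSON body that Loki’s HTTP push API (`/loki/api/v1/push`) accepts: a list of streams, each with a set of labels and a list of `[timestamp in nanoseconds, log line]` pairs. The `build_loki_payload` helper, the labels, and the log lines are hypothetical. The sketch deliberately stops before sending anything, because the endpoint URL and the authentication details depend on your own Cockpit.

```python
import json
import time


def build_loki_payload(labels, lines, now_ns=None):
    """Build a JSON-serializable body in the shape used by Loki's push API."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [timestamp in nanoseconds as a string, log line].
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


# Hypothetical labels and log lines.
payload = build_loki_payload(
    {"job": "demo-app", "level": "error"},
    ["connection refused", "retrying in 5s"],
    now_ns=1_718_800_000_000_000_000,
)
body = json.dumps(payload)  # what an agent would send in the HTTP request body
```

In practice, an agent such as the ones listed above builds and sends this body for you.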

Preconfigured alerts

Preconfigured alerts are regionalized alerting rules that Scaleway defines for you. They are active in all the regions where you have enabled the alert manager. You can think of them as alarm sensors. They allow you to receive alerts for behaviors that we deem abnormal on your resources. Preconfigured alerts only apply to the metrics and logs of your Scaleway resources.

Note

You must enable the alert manager to enable preconfigured alerts. Preconfigured alerts will only be active in the regions where you have enabled the alert manager.

Managed dashboards

A managed dashboard is a set of one or more panels that Scaleway sets up and updates for you to visualize the metrics and logs associated with your Scaleway products.

Metric

A metric is a numerical representation of data (e.g. disk usage or CPU usage) measured over intervals of time during the lifecycle of your resources. Metrics give you a bird’s eye view of your infrastructure.

You can push metrics with any Prometheus-compatible agent such as Prometheus, Grafana Alloy, or the OpenTelemetry Collector.

Mimir

Grafana Mimir is an open source software project that allows you to store your metrics by providing long-term storage for Prometheus.

Prometheus Remote Write

Prometheus Remote Write is the protocol used to push your metrics to your Cockpit’s metrics endpoint.

PromQL

PromQL, short for Prometheus Query Language, is the main way to query metrics within Prometheus. It is designed for building queries for graphs and alerts.

Receivers

Receivers are hubs consisting of contact points. You can associate one or several alerts with one or more receivers. This allows you to diversify your alerts.

Recording rules

Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their results as a new set of time series.

Region

A region is the geographical area in which your Cockpit data is stored. Your data is duplicated across all Availability Zones of the selected region (Paris, Amsterdam, or Warsaw).

You can decide in which region to enable the alert manager and your preconfigured alerts. You can also choose the regions in which to create your data types, data sources, and tokens.

Retention

Retention or data retention refers to the duration for which data, such as metrics, logs, or traces, is stored before being automatically deleted. Retention allows you to manage the lifecycle of your Scaleway and custom data by selecting storage periods that align with your needs.

The minimum and maximum retention periods for each data source type are as follows:

                           Custom metrics         Custom logs/traces   Scaleway metrics       Scaleway logs/traces
Minimum retention period   1 day                  1 day                31 days                1 day
Maximum retention period   365 days (12 months)   31 days (1 month)    365 days (12 months)   31 days (1 month)
Default retention period   31 days                7 days               31 days                7 days
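
As a quick illustration of how these bounds apply, the sketch below checks a requested retention period against the minimum and maximum listed above and falls back to the default when none is given. The dictionary keys and the `resolve_retention` helper are hypothetical names chosen for this example and are not part of any Scaleway API.

```python
# (minimum, maximum, default) retention in days, taken from the periods listed above.
RETENTION_DAYS = {
    "custom_metrics": (1, 365, 31),
    "custom_logs_traces": (1, 31, 7),
    "scaleway_metrics": (31, 365, 31),
    "scaleway_logs_traces": (1, 31, 7),
}


def resolve_retention(source_type, requested_days=None):
    """Return the retention to apply, or raise if the request is out of range."""
    minimum, maximum, default = RETENTION_DAYS[source_type]
    if requested_days is None:
        return default
    if not minimum <= requested_days <= maximum:
        raise ValueError(
            f"{source_type}: retention must be between {minimum} and {maximum} days"
        )
    return requested_days


logs_retention = resolve_retention("custom_logs_traces")        # falls back to the default
metrics_retention = resolve_retention("custom_metrics", 365)    # within bounds
```

Requesting, for example, 45 days for Scaleway logs would raise a `ValueError` in this sketch, since the maximum for that data source type is 31 days.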

Samples

A sample is a unique measuring point on a time series.

Tempo

Tempo is Grafana’s open source tracing tool that allows you to search for traces, generate metrics from them, and link your tracing data with logs and metrics. Tempo can be used with open-source tracing protocols such as Jaeger, Zipkin, and OpenTelemetry.

Important

During the beta of the traces feature, only the OpenTelemetry HTTP push path is supported. Scaleway is working on implementing the OpenTelemetry gRPC, Zipkin, and Jaeger protocols.

Time series

A time series is a sequence of numerical data points in successive order that tracks the value of a parameter over time. For example, a time series can track the disk usage of a machine hosting a database, expressed as a percentage.

Tokens

Tokens are regionalized secret keys that allow you to authenticate against your Cockpit’s endpoints (metrics, logs, traces, alerts). You can generate new tokens and select token permissions such as:

  • Push: allows you to send your metrics, logs, and traces to your Cockpit.
  • Query: allows you to fetch your metrics, logs, and traces from your Cockpit.
  • Rules: allows you to configure alerting and recording rules.
  • Alerts: allows you to set up the alert manager.
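
One way to picture these permissions is as a set of independent flags attached to each token. The following sketch is purely illustrative: the `Permission` enum, the `is_allowed` helper, and the two example tokens are hypothetical and do not reflect how Cockpit stores or checks tokens.

```python
from enum import Flag, auto


class Permission(Flag):
    """Token permissions named after the list above."""
    PUSH = auto()    # send metrics, logs, and traces
    QUERY = auto()   # fetch metrics, logs, and traces
    RULES = auto()   # configure alerting and recording rules
    ALERTS = auto()  # set up the alert manager


def is_allowed(granted: Permission, required: Permission) -> bool:
    """Return True when every required permission has been granted."""
    return (granted & required) == required


# Hypothetical tokens: one for an agent, one for dashboards and rules.
agent_token = Permission.PUSH
dashboard_token = Permission.QUERY | Permission.RULES

agent_can_push = is_allowed(agent_token, Permission.PUSH)
agent_can_query = is_allowed(agent_token, Permission.QUERY)
```

Granting each token only the permissions it needs (push for agents, query for dashboards) limits the impact of a leaked token.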

TraceQL

TraceQL is Grafana Tempo’s query language designed for searching and extracting traces.

Traces

Traces are detailed records of your requests’ behavior, as they move through distributed systems such as microservices and containers.

Traces are an effective way to identify performance issues and bottlenecks in your environments, as they break down what happens within a request and provide information such as the moment a request starts and finishes, the events that occur, and their duration, as well as additional context.

You can push traces with OpenTelemetry, which is the only Tempo-compatible agent that Scaleway supports for pushing traces.
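
To make the idea of breaking down a request more concrete, here is a minimal sketch that models a trace as a list of spans, each with a start time and an end time, and then finds the slowest step. The `Span` class, the span names, and the timings are invented for this example; real traces produced by OpenTelemetry carry more context, such as identifiers and attributes.

```python
from dataclasses import dataclass


@dataclass
class Span:
    name: str
    start_ms: int  # moment the operation starts
    end_ms: int    # moment the operation finishes

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


# Hypothetical spans recorded for a single request; the first one is the root.
trace = [
    Span("GET /checkout", 0, 480),
    Span("auth-service", 10, 60),
    Span("db query", 70, 420),
    Span("render response", 425, 475),
]
root, *children = trace
slowest = max(children, key=lambda span: span.duration_ms)
```

In this made-up trace, the database query accounts for most of the request’s duration, which is the kind of bottleneck traces help you identify.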
