Cockpit - Concepts

Alerting

Alerting detects complex conditions defined by a rule and keeps you aware of issues in your environments. When a condition defined by a rule is met, the rule tracks it as an alert and responds by triggering one or more actions.

Alert manager

Scaleway’s alert manager allows you to manage and respond to alerts. It handles the alerts sent when the alerting rules we run are firing. The alert manager sends notifications (e.g. emails or texts) when a criterion you have configured on your applications’ metrics and logs is met.

Alerting rules

Alerting rules allow you to define criteria that determine whether an alert is triggered. A rule consists of queries and expressions, a condition, the frequency of evaluation, and the duration for which the condition must be met. Rules act as alarm sensors: when an alert is triggered, a notification is sent to the alert manager, which forwards the notification to receivers.
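As an illustration of how a condition, an evaluation frequency, and a duration can interact, here is a minimal Python sketch. It is not Cockpit's implementation: the `AlertRule` class, the state names ("inactive", "pending", "firing"), and the disk usage values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AlertRule:
    """Hypothetical in-memory model of an alerting rule (not Cockpit's API)."""
    name: str
    condition: Callable[[float], bool]  # e.g. lambda v: v > 90.0
    for_seconds: float                  # how long the condition must hold
    _pending_since: Optional[float] = None

    def evaluate(self, value: float, now: float) -> str:
        """Return 'inactive', 'pending', or 'firing' for one evaluation."""
        if not self.condition(value):
            # Condition no longer met: reset the timer.
            self._pending_since = None
            return "inactive"
        if self._pending_since is None:
            # First evaluation at which the condition is met.
            self._pending_since = now
        if now - self._pending_since >= self.for_seconds:
            return "firing"
        return "pending"


rule = AlertRule("HighDiskUsage", lambda v: v > 90.0, for_seconds=120)

# Evaluate every 60 seconds against made-up disk usage percentages.
observations = [(0, 85.0), (60, 93.0), (120, 95.0), (180, 97.0), (240, 70.0)]
states = [rule.evaluate(value, now=t) for t, value in observations]
print(states)
```

In this sketch the rule only reports "firing" once the condition has held for the full duration, which is one common way to avoid notifying on short spikes.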

Cockpit

A Cockpit is an instance of the Observability product that stores logs and metrics, and provides a dedicated dashboarding system on Grafana to visualize them. A Scaleway Project can have only one Cockpit.

Contact points

Contact points define who is notified when an alert fires. Contact points include emails, Slack, on-call systems and texts. When an alert fires, all contact points are notified.

Endpoints

An endpoint is the point of entry in a communication channel when two systems are interacting. It is the means by which an API accesses the resources it needs from a server to perform its task. An endpoint can include the URL of a server or service. The Observability Cockpit provides three endpoints:

  • A Prometheus-compatible endpoint responsible for dealing with metrics
  • A Loki-compatible endpoint responsible for dealing with logs
  • A Prometheus-compatible endpoint responsible for configuring your alert manager

Grafana users

A Grafana user is any individual who can log in to Grafana. Each user is associated with a role. There are two types of roles a user can have:

  • a viewer: can only view dashboards
  • an editor: can build and view dashboards

Managed dashboards in the “Scaleway” folder are always read-only, regardless of your role.

Loki Remote Write

Loki Remote Write is the protocol used to push your logs to your Cockpit’s logs endpoint.

LogQL

LogQL is Grafana Loki’s language for querying logs. LogQL uses labels and operators for filtering.
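To make the idea of labels and filter operators concrete, the following Python sketch builds a simple LogQL query string from a label selector and an optional line filter (`|=`, "line contains"). The helper name `logql_query` and the label names and values are hypothetical, and the helper does not escape special characters in its inputs.

```python
from typing import Dict, Optional


def logql_query(labels: Dict[str, str], contains: Optional[str] = None) -> str:
    """Build a basic LogQL query: a label selector plus an optional line filter.

    Illustrative helper only; label values and filter text are not escaped.
    """
    # Label matchers are written as name="value", separated by commas.
    selector = ", ".join(f'{name}="{value}"' for name, value in sorted(labels.items()))
    query = "{" + selector + "}"
    if contains is not None:
        # |= keeps only the log lines that contain the given text.
        query += f' |= "{contains}"'
    return query


# Hypothetical labels, chosen for the example.
query = logql_query({"job": "api", "env": "prod"}, contains="error")
print(query)
```

The resulting string can be pasted into a log query editor such as Grafana's Explore view, assuming your logs actually carry those labels.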

Logs

Logs provide a record of all events and errors that take place during the lifecycle of your resources. They represent an excellent source of visibility if you want to know when a problem occurred, or which events correlate with it.

Managed alerts

Managed alerts are alerting rules that Scaleway defines for you. You can think of them as alarm sensors. They allow you to receive alerts for behaviors that we deem abnormal on your products. Managed alerts only apply to Scaleway products’ metrics and logs.

Managed dashboards

A managed dashboard is a set of one or more panels that Scaleway sets up and updates for you to visualize the metrics and logs associated with your Scaleway products.

Metric

A metric is a lifecycle-related numerical representation of data (e.g. disk usage and CPU usage) measured over intervals of time. Metrics give you a bird’s eye view of your infrastructure.

Prometheus Remote Write

Prometheus Remote Write is the protocol used to push your metrics to your Cockpit’s metrics endpoint.

PromQL

PromQL, short for Prometheus Query Language, is the main way to query metrics within Prometheus. It is designed for building queries for graphs and alerts.

Receivers

Receivers are hubs consisting of contact points. You can associate one or several alerts with one or more receivers. This allows you to diversify your alerts.

Recording rules

Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their results as a new set of time series.
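The principle can be sketched in a few lines of Python: an expression (here, an average over a trailing window) is computed at fixed evaluation times, and the results are stored as a new series that later queries can read directly. The `record_avg` function, the in-memory `store`, and the series names are assumptions for illustration, not how Cockpit or Prometheus store data.

```python
from statistics import mean
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def record_avg(series: List[Sample], window: float, eval_times: List[float]) -> List[Sample]:
    """Precompute the average over the trailing window at each evaluation time."""
    recorded = []
    for t in eval_times:
        # Keep the samples that fall in the half-open window (t - window, t].
        values = [value for ts, value in series if t - window < ts <= t]
        if values:
            recorded.append((t, mean(values)))
    return recorded


# Hypothetical raw series with one sample every 30 seconds.
store: Dict[str, List[Sample]] = {
    "cpu_usage_percent": [(0, 10.0), (30, 20.0), (60, 30.0), (90, 40.0), (120, 50.0)],
}

# Save the precomputed results as a new time series under a new name.
store["cpu_usage_percent:avg_1m"] = record_avg(
    store["cpu_usage_percent"], window=60, eval_times=[60, 120]
)
print(store["cpu_usage_percent:avg_1m"])
```

Dashboards and alerts can then read the smaller precomputed series instead of re-evaluating the expression on every query.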

Samples

A sample is a unique measuring point on a time series.

Time series

A time series is a sequence of numerical data points in successive order that tracks the value of a parameter over time. For example, the parameter could be the disk usage of a machine hosting a database, expressed as a percentage.
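The relationship between samples and a time series can be pictured with a small data structure. This is a simplified sketch: the `TimeSeries` class, its fields, and the example name and labels are hypothetical, and real storage engines are considerably more elaborate.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TimeSeries:
    """Illustrative model: a named, labelled sequence of (timestamp, value) samples."""
    name: str
    labels: Dict[str, str]
    samples: List[Tuple[int, float]] = field(default_factory=list)

    def append(self, timestamp: int, value: float) -> None:
        """Add one sample, keeping samples in increasing time order."""
        if self.samples and timestamp <= self.samples[-1][0]:
            raise ValueError("samples must be appended in increasing time order")
        self.samples.append((timestamp, value))

    def latest(self) -> Tuple[int, float]:
        """Return the most recent sample."""
        return self.samples[-1]


# Disk usage (in percent) of a hypothetical database host, one sample per minute.
disk = TimeSeries("disk_usage_percent", {"host": "db-1"})
for ts, value in [(0, 61.5), (60, 61.7), (120, 62.0)]:
    disk.append(ts, value)

print(len(disk.samples), disk.latest())
```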

Tokens

Tokens are secret keys that allow you to authenticate against your Cockpit’s endpoints (metrics, logs, alerts). You can generate new tokens and select token permissions such as:

  • Push: allows you to send your metrics and logs to your Cockpit.
  • Query: allows you to fetch your metrics and logs from your Cockpit.
  • Rules: allows you to configure alerting and recording rules.
  • Alerts: allows you to set up the alert manager.