Cockpit - Concepts
Alerting detects complex conditions defined by a rule and keeps you aware of issues in your environments. When a condition defined by a rule is met, the rule tracks it as an alert and responds by triggering one or more actions.
Scaleway’s alert manager allows you to manage and respond to alerts. It handles the alerts sent when the alerting rules we run are firing. The alert manager sends notifications (e.g. emails or texts) when a criterion you have configured on your applications’ metrics and logs is met.
Select Scaleway Alerting when choosing the Alert manager if you want to manage your alerts using Grafana.
Alerting rules allow you to define criteria that determine whether an alert is triggered. A rule consists of queries and expressions, a condition, the frequency of evaluation, and the duration over which the condition must be met. Rules act as alarm sensors: when an alert is triggered, a notification is sent to the alert manager, which forwards the notification to receivers.
If you plan on setting up alerting rules with Grafana, use the Mimir or Loki alerts rather than the Grafana managed alert.
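To make the parts of an alerting rule more concrete, here is a minimal Python sketch of how a condition, an evaluation frequency, and a duration could interact. The `AlertRule` class, its field names, the state names, and the sample values are hypothetical and purely illustrative. It is not how Mimir or Loki evaluate rules internally.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AlertRule:
    """Hypothetical model of an alerting rule (names are illustrative)."""
    condition: Callable[[float], bool]  # condition applied to the query result
    interval_s: int                     # how often the rule is evaluated
    for_s: int                          # how long the condition must hold


def evaluate(rule: AlertRule, values: List[float]) -> List[str]:
    """Return the rule state after each evaluation.

    `values` holds one query result per evaluation, spaced `interval_s` apart.
    """
    states = []
    held_s = None  # seconds the condition has held so far; None if not met
    for value in values:
        if rule.condition(value):
            held_s = 0 if held_s is None else held_s + rule.interval_s
            states.append("firing" if held_s >= rule.for_s else "pending")
        else:
            held_s = None
            states.append("inactive")
    return states


# CPU usage above 90 %, evaluated every 60 s, must hold for 120 s.
rule = AlertRule(condition=lambda v: v > 90, interval_s=60, for_s=120)
states = evaluate(rule, [50, 95, 96, 97, 40, 99])
print(states)
# ['inactive', 'pending', 'pending', 'firing', 'inactive', 'pending']
```

In this sketch the alert only fires once the condition has held for the full duration, which is why a single spike (the last value) stays pending.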
A Cockpit is an instance of the Observability product that stores logs and metrics, and provides a dedicated dashboarding system on Grafana to visualize them. A Scaleway Project can have only one Cockpit.
Contact points define who is notified when an alert fires. Contact points include emails, Slack, on-call systems and texts. When an alert fires, all contact points are notified.
An endpoint is the point of entry in a communication channel when two systems interact. It is the means by which an API accesses the resources it needs from a server to perform its task. An endpoint can include the URL of a server or service. The Observability Cockpit provides four endpoints:
- A Prometheus-compatible endpoint responsible for dealing with metrics
- A Loki-compatible endpoint responsible for dealing with logs
- A Prometheus-compatible endpoint responsible for configuring your alert manager
- A Tempo-compatible endpoint responsible for dealing with traces
Keeping the default configuration on your agents might cause more of your resources’ metrics to be sent than you need, resulting in high consumption and a high bill at the end of the month.
Sending metrics, logs and traces for Scaleway resources or personal data using an external path is a billable feature. In addition, any data that you push yourself is billed, even if you send data from Scaleway products. Refer to the product pricing for more information.
A Grafana user is any individual who can log in to Grafana. Each user is associated with a role. There are two types of roles a user can have:
- a viewer: can only view dashboards
- an editor: can build and view dashboards
Managed dashboards in the “Scaleway” folder are always read-only, regardless of your role.
Loki is the log aggregation system used by Grafana to store and query your logs.
Loki Remote Write
Loki Remote Write is the protocol used to push your logs to your Cockpit’s logs endpoint.
LogQL is Grafana Loki’s language for querying logs. LogQL uses labels and operators for filtering.
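As an illustration of how labels and operators combine in a LogQL query, the following Python sketch builds a simple query string from a set of labels and an optional line filter. The helper name `logql_selector` and the label values are hypothetical. The sketch assumes the values contain no double quotes, since it does no escaping.

```python
def logql_selector(labels: dict, contains: str = "") -> str:
    """Build a LogQL stream selector with an optional `|=` line filter.

    Illustrative helper only: label values are not escaped.
    """
    matchers = ", ".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    query = "{" + matchers + "}"
    if contains:
        # `|=` keeps only the log lines that contain the given text
        query += f' |= "{contains}"'
    return query


query = logql_selector({"job": "api", "level": "error"}, contains="timeout")
print(query)
# {job="api", level="error"} |= "timeout"
```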
Logs provide a record of all events and errors that take place during the lifecycle of your resources. They represent an excellent source of visibility if you want to know when a problem occurred, or which events correlate with it.
Managed alerts are alerting rules that Scaleway defines for you. You can think of them as alarm sensors. They allow you to receive alerts for behaviors that we deem abnormal on your products. Managed alerts only apply to Scaleway products’ metrics and logs.
A managed dashboard is a set of one or more panels that Scaleway sets up and updates for you to visualize the metrics and logs associated with your Scaleway products.
A metric is a lifecycle-related numerical representation of data (e.g. disk usage and CPU usage) measured over intervals of time. Metrics give you a bird’s eye view of your infrastructure.
Prometheus Remote Write
Prometheus Remote Write is the protocol used to push your metrics to your Cockpit’s metrics endpoint.
PromQL, short for Prometheus Query Language, is the main way to query metrics within Prometheus. It is designed for building queries for graphs and alerts.
Receivers are hubs consisting of contact points. You can associate one or several alerts with one or more receivers. This allows you to diversify your alerts.
Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their results as a new set of time series.
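The idea behind a recording rule can be sketched in Python as follows: an expression is computed once from an existing series and its result is stored under a new name, so later queries read the stored result instead of recomputing it. The in-memory `store`, the series names, and the values are hypothetical. The sketch also ignores counter resets, which a real rate computation has to handle.

```python
from typing import Dict, List, Tuple

Series = List[Tuple[int, float]]  # (timestamp in seconds, value)


def per_second_rate(counter: Series) -> Series:
    """Per-second increase between consecutive samples of a counter."""
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(counter, counter[1:])
    ]


# Hypothetical in-memory store holding one raw counter series.
store: Dict[str, Series] = {
    "http_requests_total": [(0, 0.0), (60, 120.0), (120, 300.0)],
}

# "Recording" step: save the precomputed result as a new time series.
store["job:http_requests:rate"] = per_second_rate(store["http_requests_total"])
print(store["job:http_requests:rate"])
# [(60, 2.0), (120, 3.0)]
```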
A sample is a unique measuring point on a time series.
Tempo is Grafana’s open source tracing tool that allows you to search for traces, generate metrics from them, and link your tracing data with logs and metrics. Tempo can be used with open-source tracing protocols such as Jaeger, Zipkin, and OpenTelemetry.
A time series is a sequence of numerical data points in successive order that tracks the value of a parameter over time. Example of a parameter: the disk usage of a machine hosting a database, expressed as a percentage.
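The relationship between samples and time series can be modeled with a small Python sketch. The class names, the `host` label, and the values below are hypothetical and only mirror the disk usage example above. They do not reflect how Cockpit stores data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Sample:
    """One measuring point: a value observed at a given time."""
    timestamp: int  # Unix time in seconds
    value: float


@dataclass
class TimeSeries:
    """Samples of one parameter, kept in increasing time order."""
    name: str
    labels: Dict[str, str]
    samples: List[Sample] = field(default_factory=list)

    def add(self, timestamp: int, value: float) -> None:
        if self.samples and timestamp <= self.samples[-1].timestamp:
            raise ValueError("samples must be added in increasing time order")
        self.samples.append(Sample(timestamp, value))


# Disk usage (in percent) of a machine hosting a database.
disk_usage = TimeSeries("disk_usage_percent", {"host": "db-1"})
disk_usage.add(1700000000, 41.5)
disk_usage.add(1700000060, 41.7)
print(len(disk_usage.samples), disk_usage.samples[-1].value)
# 2 41.7
```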
Tokens are secret keys that allow you to authenticate against your Cockpit’s endpoints (metrics, logs, traces, alerts). You can generate new tokens and select token permissions such as:
- Push: allows you to send your metrics, logs and traces to your Cockpit.
- Query: allows you to fetch your metrics, logs and traces from your Cockpit.
- Rules: allows you to configure alerting and recording rules.
- Alerts: allows you to set up the alert manager.
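One way to reason about token permissions is to give each token only what its user needs. The Python sketch below models the four permissions listed above as flags and checks whether a token covers a requested operation. The `Permission` enum, the `allowed` helper, and the two example tokens are hypothetical and do not reflect the Scaleway API.

```python
from enum import Flag, auto


class Permission(Flag):
    """Hypothetical flags mirroring the permissions listed above."""
    PUSH = auto()
    QUERY = auto()
    RULES = auto()
    ALERTS = auto()


def allowed(token_permissions: Permission, needed: Permission) -> bool:
    """True if the token holds every permission in `needed`."""
    return (token_permissions & needed) == needed


# An agent only pushes data; a dashboard user queries and manages rules.
agent_token = Permission.PUSH
dashboard_token = Permission.QUERY | Permission.RULES

print(allowed(agent_token, Permission.PUSH))         # True
print(allowed(agent_token, Permission.QUERY))        # False
print(allowed(dashboard_token, Permission.ALERTS))   # False
```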
TraceQL is Grafana Tempo’s query language designed for searching and extracting traces.
Traces are detailed records of your requests’ behavior as they move through distributed systems such as microservices and containers. Traces are an effective way to identify performance issues and bottlenecks in your environments, because they break down what happens within a request. They provide information such as the moment a request starts and finishes, the events that occur and their duration, as well as additional context.
Scaleway only supports the OpenTelemetry agent for pushing traces.