NavigationContentFooter
Jump toSuggest an edit

Visualizing time-Series data with TimescaleDB and Grafana

Reviewed on 03 October 2024Published on 10 December 2019
  • time-Series
  • TimescaleDB
  • Grafana
  • Database

TimescaleDB is a time series database built on PostgreSQL.

A time-series database typically stores “time-series data”, as the name indicates. It can be seen as a sequence of data points, that measure the same thing over time and store the measurement results in time order.

Each of these measurements is paired with a timestamp, defined by a name and a set of labeled dimensions (or “tags”).

Time-series data is everywhere around us and an important part of our everyday life. Whether it is a factory that measures the production output of a specific machine, a farmer observing the humidity of the soil, or a city measuring the regularity between trains, all of this is time-series data.

In this tutorial, we use a Database for PostgreSQL with the TimescaleDB extension and a Dataset of the air quality at the Franklin D. Roosevelt Metro station in Paris, provided by RATP, the Parisian transport authority. This kind of data is a perfect example of a time-series database, as it monitors certain values over a period of time. Other usages could include data from diverse backgrounds, such as weather data, economic information, or productivity information of an assembly line.

Before you start

To complete the actions presented below, you must have:

  • A Scaleway account logged into the console
  • Owner status or IAM permissions allowing you to perform actions in the intended Organization
  • An Instance running on Ubuntu Bionic Beaver (18.04) or later to host the Grafana interface
  • A Database for PostgreSQL
  • sudo privileges or access to the root user

Creating a new database

  1. Connect to the Scaleway console and click the Managed Databases section.
  2. Click on your Database Instance.
  3. Click the Managed Databases tab, then on + Create Database.
  4. Enter the name of the new database. In this tutorial, we use airquality.
  5. Click the Users tab, followed by «See more Icon» and Update Permissions.
  6. Grant your user permissions to the newly created airquality database.

Enabling the TimescaleDB extension

  1. Connect to your Database using the psql client:

    psql -h <rbd_host> --port <rdb_port> -d airquality -U <rdb_user>

    Enter the password for your database user when prompted.

    Tip

    Replace rdb_host with your Database Instance’s IP address and rdb_port with the port, which can be found under Endpoint on the Database information page. Replace <rdb_user> with the name of the user you granted access to airquality.

  2. Enable the timescaledb extension for your database.

    CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;

    The following output displays in your terminal:

    WARNING:
    WELCOME TO
    _____ _ _ ____________
    |_ _(_) | | | _ \ ___ \
    | | _ _ __ ___ ___ ___ ___ __ _| | ___| | | | |_/ /
    | | | | _ ` _ \ / _ \/ __|/ __/ _` | |/ _ \ | | | ___ \
    | | | | | | | | | __/\__ \ (_| (_| | | __/ |/ /| |_/ /
    |_| |_|_| |_| |_|\___||___/\___\__,_|_|\___|___/ \____/
    Running version 1.5.1
    For more information on TimescaleDB, please visit the following links:
    1. Getting started: https://docs.timescale.com/getting-started
    2. API reference documentation: https://docs.timescale.com/api/latest
    3. How TimescaleDB is designed: https://docs.timescale.com/introduction/architecture
    Note: TimescaleDB collects anonymous reports to better understand and assist our users.
    For more information and how to disable, please see our docs https://docs.timescaledb.com/using-timescaledb/telemetry.
    CREATE EXTENSION

Prepare the table for TimescaleDB

  1. Create a table called airquality, which will be used to store the information.

    CREATE TABLE airquality(
    DATETIME TIMESTAMP WITH TIME ZONE NOT NULL,
    NO INTEGER,
    NO2 INTEGER,
    PM10 INTEGER,
    CO2 INTEGER,
    TEMP NUMERIC(4,1),
    HUMI NUMERIC(4,1)
    );

    The command above creates a new table called airquality with seven columns:

    • DATETIME contains a timestamp which includes the time zone.
    • NO contains the average NO concentration in µg/m3.
    • NO2 contains the average NO2 concentration in µg/m3.
    • PM10 contains the average PM10 particles concentration in in µg/m3
    • CO2 contains the average CO2 concentration in ppm.
    • TEMP contains the average ambient temperature in °C.
    • HUMI contains the relative humidity in %.
  2. Convert the table you created into a TimescaleDB hypertable.

    SELECT create_hypertable('airquality', 'datetime');

    The following output displays in the terminal window:

    airquality=> SELECT create_hypertable('airquality', 'datetime');
    create_hypertable
    -------------------------
    (1,public,airquality,t)
    (1 row)

Adding and removing data

  1. Add a sample row to the database. The RETURNING statement visualizes the data added to the table:

    INSERT INTO airquality(DATETIME,NO,NO2,PM10,CO2,TEMP,HUMI) VALUES (NOW(),4,38,14,398,27.0,47.4) RETURNING *;

    An output like the following displays:

    datetime | no | no2 | pm10 | co2 | temp | humi
    -------------------------------+----+-----+------+-----+------+------
    2019-12-10 15:31:28.116471+00 | 4 | 38 | 14 | 398 | 27.0 | 47.4
    (1 row)
    INSERT 0 1
  2. Remove the data from the table. The following command deletes all data where the value for temp is larger than 26:

    DELETE FROM airquality WHERE temp > 26;
    DELETE 1

    Run a garbage collection using the VACCUM command to clean up the table and to free unused space:

    VACUUM airquality;

    Manually adding data works for a small-scale database, but when it comes to dealing with timescale data you have large sets of data.

Importing the dataset

In this tutorial, we use an air quality dataset from a Parisian metro station. The data is provided by Paris’ transport authority RATP.

  1. Download the sample data in CSV-format on your local computer.
  2. Convert the downloaded CSV file into an SQL database file:
    cat qualite-de-lair-mesuree-dans-la-station-franklin-d-roosevelt.csv | awk -F';' '{ printf "INSERT INTO airquality(DATETIME,NO,NO2,PM10,CO2,TEMP,HUMI) VALUES (\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27,\x27%s\x27);",$1,$2,$3,$4,$5,$6,$7;print ""}' > airquality.sql
  3. Import the file to your database:
    psql -h <rbd_host> --port <rdb_port> -d airquality -U <rdb_user> < airquality.sql

Querying Data

Now that you have a large dataset available, you can query the data using the psql client.

  1. Connect to your database.

    psql -h <rbd_host> --port <rdb_port> -d airquality -U <rdb_user>
  2. To receive a list of the 10 most recent entries in the database run the following command.

    SELECT * FROM airquality ORDER BY datetime DESC LIMIT 10;

    A list like the following displays:

    datetime | no | no2 | pm10 | co2 | temp | humi
    ------------------------+----+-----+------+-----+------+------
    2019-12-02 00:00:00+00 | 12 | 42 | 25 | | 14.8 | 36.3
    2019-12-01 23:00:00+00 | 11 | 40 | 23 | | 15.0 | 36.6
    2019-12-01 22:00:00+00 | 14 | 42 | 23 | | 15.0 | 38.1
    2019-12-01 21:00:00+00 | 29 | 48 | 26 | | 14.4 | 41.7
    2019-12-01 20:00:00+00 | 21 | 43 | 30 | | 13.9 | 42.3
    2019-12-01 19:00:00+00 | 18 | 40 | 31 | | 14.1 | 42.3
    2019-12-01 18:00:00+00 | 17 | 40 | 29 | | 14.2 | 41.8
    2019-12-01 17:00:00+00 | 16 | 39 | 27 | | 14.6 | 40.9
    2019-12-01 16:00:00+00 | 15 | 40 | 26 | | 15.0 | 41.4
    2019-12-01 15:00:00+00 | 13 | 40 | 20 | | 14.8 | 41.8
    (10 rows)
  3. Run the following command to retrieve the average temperature on all data:

    SELECT percentile_cont(0.5)
    WITHIN GROUP (ORDER BY temp)
    FROM airquality;

    An output like the following displays:

    percentile_cont
    -----------------
    18.7
    (1 row)

For more information on how to read data from your database, refer to TimescaleDB’s read data documentation page.

Visualizing data with Grafana

You can visualize the time-series data by creating a dashboard using Grafana, an open-source analytics & monitoring solution.

  1. Create a new Instance running on Ubuntu Bionic (18.04) from your Scaleway console

  2. Log into the Instance using SSH.

  3. Add the Grafana repository to the apt package cache.

    add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
  4. Install the GPG key of Grafana to validate the packages.

    wget -q -O - https://packages.grafana.com/gpg.key | apt-key add -
  5. Update the apt package cache.

    apt update
  6. Install Grafana using apt.

    apt install grafana
  7. Reload the systemd daemon, enable the Grafana service to start the application automatically during system boot, and start the application manually.

    systemctl daemon-reload && systemctl enable grafana-server.service && systemctl start grafana-server.service
  8. Open a web browser and point it to http://<instance_ip>:3000/.

    Note

    Replace <instance_ip> with the IP address of your Instance.

  9. The Grafana login screen displays. Log yourself into Grafana using the username and password admin:

    Upon the first connection, you will be asked to set a new password.

  10. Enter it twice and validate the form. The default dashboard displays.

  11. Click Add Datasource to configure a new datasource.

  12. Enter the credentials of your Database Instance:

    • Host: Enter the IP address and the port of your Database Instance separated by a double point.
    • Managed Databases: The name of the database (for example, airquality)
    • User: The database user
    • Password: The database password
    • SSL Mode: The desired SSL validation mode. Set the value to required.
    • Version: This value must correspond to the PostgreSQL version of your Database Instance
    • TimescaleDB: Enable this value
  13. Click Save & Test to test the database connection and save the values.

  14. Click the Grafana logo to return to the base dashboard.

  15. Create a JSON configuration to query the database in Grafana.

  16. Click + Create > Import and click Import .json File to upload the previously downloaded configuration file to Grafana.

    Grafana starts to generate graphs from the time-series data stored in TimescaleDB. You can now browse the data in a visual interface and edit the queries of the database according to your needs:

Conclusion

You now have configured a Timescale database, imported time-series data, and learned how to query, modify, and visualize the data using Grafana. For more information about TimescaleDB, refer to the official Documentation.

Docs APIScaleway consoleDedibox consoleScaleway LearningScaleway.comPricingBlogCarreer
© 2023-2024 – Scaleway