Custom Postgres metrics in Datadog

Instrumenting applications is something we are used to at Bithaus. Having the code reporting itself delivers great information, and every person involved in the system from developers, operations and managers appreciate it.

But you cannot always change the code to get metrics, or not as fast as you would like to.

An excellent way to gather business metric of your application, is to look in your tables in the database. Continue on reading for step-by-step on how to create a custom metric based on a SQL query, so we can display it on a dashboard and create monitors to check it’s value.

Your table, your metrics

In our case, we needed to measure the performance and progress of an ETL process in a data migration project. We had a statistics table that looks something like this:

example_database=> \d source_stats 
                   View "public.source_stats"
        Column         |              Type              | Modifiers 
-----------------------+--------------------------------+-----------
 source_id             | integer                        | 
 data_type_id          | smallint                       | 
 creation_ts           | timestamp without time zone    | 
 update_ts             | timestamp without time zone    | 
 last_inserted_data_ts | timestamp(0) without time zone | 
 entry_count           | bigint                         | 

Rows here are inserted/updated when the ETL works on a “source” identified by “source_id”. So we have when the ETL started working on a source, when was the last time the stats were updated and the total entry count processed.

Based on this information, we created a view called etl_stats that summarises the overall process. on a regular basis, our “source_stats” table is updated allowing the view to show real-time status.

 count | avgtime | avg_entry_count 
-------+---------+-----------------
   156 |      50 |         1092847

Configuring the Datadog Postgresql collector

If you haven’t already enabled the PostgreSQL extension of Datadog agent, you can do so HERE.

Go to the Datadog extensions configuration directory, which is /etc/dd-agent/conf.d for Ubuntu installations, and edit the postgres.yaml file.

init_config:

instances:
  - host: localhost
    port: 5432
    username: datadog
    password: YOUR_PASSWORD
    ssl: True
    tags: 
      - psql:example_instance

    dbname: example_database
    custom_metrics:
    - # ETL Stats
      descriptors: []
      metrics:
        count: [my.etl.sources_finished, GAUGE]
        avgtime: [my.etl.avg_process_time, GAUGE]
        avg_entry_count: [my.etl.avg_entry_count, GAUGE]
      query: SELECT %s from etl_stats;
      relation: false


This is the most simple setup. The logic is that every metric defined in the “metric” property correspond to a table column in the “query” statement. In this case, our table has 3 columns count, avgtime, avg_entry_count that are mapped to 3 metrics. Internally, the agent builds the query based on the metrics on runtime, as you can see in the collector.log when debbuging:

2016-02-22 13:13:51 CLT | DEBUG | dd.collector | checks.postgres(postgres.py:446) | Running query: SELECT count, avgtime, avg_entry_count from etl_stats;

The column – metric mapping. This is the magic right here!

       COLUMN_NAME : [METRIC_NAME, METRIC_TYPE]

In our example we created a metric “my.etl.sources_finished” type “gauge” that reports the value of the column “count” of our summary table.

       count : [my.etl.sources_finished, GAUGE]

Define as many metrics as you need depending on your query. Datadog collector will run these when collecting standard postgres metrics. If anything goes wrong you should check the collector log – Ubuntu: /var/log/datadog/collector.log.

Restart the agent and that’s it – Ubuntu: service datadog-agent restart.

You may check your configuration using service datadog-agent checkconfig and check the collection status with service datadog-agent info.

===================
Collector (v 5.6.1)
===================

  Status date: 2016-02-22 18:21:10 (7s ago)
  Pid: 17902
  Platform: Linux-3.2.0-23-generic-x86_64-with-debian-wheezy-sid
  Python Version: 2.7.10
  Logs: , /var/log/datadog/collector.log, syslog:/dev/log

  Clocks
  ======
  
    NTP offset: 0.0053 s
    System UTC time: 2016-02-22 21:21:18.485238
  
  Paths
  =====
  
    conf.d: /etc/dd-agent/conf.d
    checks.d: /opt/datadog-agent/agent/checks.d
  
  Hostnames
  =========
  
    socket-hostname: ubuntu-vm
    ec2-hostname: ubuntu-vm.bithaus.cl
    hostname: ubuntu-vm.bithaus.cl
    socket-fqdn: ubuntu-vm.bithaus.cl
  
  Checks
  ======
  
    ntp
    ---
      - Collected 0 metrics, 0 events & 1 service check
  
    disk
    ----
      - instance #0 [OK]
      - Collected 48 metrics, 0 events & 1 service check
  
    postgres
    --------
      - instance #0 [OK]
      - Collected 45 metrics, 0 events & 2 service checks

 

Using your custom metric on Datadog

Go into your Datadog account to the Metric > Explorer. In the Graph input type the name of the metric that you defined in postgres.yaml.

dd-postgres-01

If everything went OK, magic!

dd-postgres-02

And then you can go all crazy with dashboards B-)

dd-postgres-03

Enjoy!

 


Datadog helps teams across different areas in an organisation to have insight and control over the IT products and services that they are involved. Metrics & monitoring over SO + databases + services + apps + business; all in one TV screen.

Bithaus Software – Datadog partner from Chile.

 

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s