Realtime Google Analytics metrics in Datadog

This is a very popular question among our customers:

How can I get my Google Analytics metrics into Datadog?

We couldn’t find any good integration, so we developed our own for everyone to use. Read on for an explanation of how it works.

TLDR: https://github.com/bithauschile/datadog-ga

Datadog custom check for collecting Google Analytics Real Time data.

This custom check allows you to retrieve Google Analytics information from the Real Time API and send it as a regular metric to Datadog.

  • Support for multiple profiles (views)
  • Handles Active users (rt:activeUsers) and Pageviews (rt:pageviews) metrics
  • Allows for custom tagging for each profile
  • Allows for dimension-tag mapping for the pageviews metric

A custom check is a Python script that the Datadog agent runs to collect metrics of your own. You can configure how often it runs (every minute, every 30 seconds, and so on); the default is every 15 seconds. The “check” is a Python class that comes with everything you need to report metrics and events to Datadog. You can read more about how to write a custom check here.
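To make the idea concrete, here is a minimal sketch of what such a class can look like. It is not the code from our repository: `ExampleCheck`, the `example.active_users` metric name and the `fake_value` setting are hypothetical, and the small stand-in base class exists only so the sketch can run outside the agent.

```python
try:
    # Agent 6/7 location of the base class; older agents used
    # `from checks import AgentCheck`.
    from datadog_checks.base import AgentCheck
except ImportError:
    class AgentCheck(object):
        """Stand-in so the sketch runs outside the agent; it only records gauges."""

        def __init__(self, *args, **kwargs):
            self.submitted = []

        def gauge(self, name, value, tags=None):
            self.submitted.append((name, value, tags or []))


class ExampleCheck(AgentCheck):
    """Hypothetical check that reports one gauge per run."""

    def check(self, instance):
        # `instance` is one entry from the check's configuration file.
        tags = instance.get("tags", [])
        value = self.collect(instance)
        self.gauge("example.active_users", value, tags=tags)

    def collect(self, instance):
        # Placeholder for the real data source (for us, a Google Analytics query).
        return instance.get("fake_value", 0)
```

The agent calls `check()` once per configured instance on every run, so everything the check needs (profile, tags, credentials path) would normally come from that `instance` dictionary.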

On the other side, Google has a public API for querying Google Analytics data and a Python client library to consume it. It is pretty straightforward to use, and generating Datadog metrics from that data was very simple.

Google Analytics Datadog Integration - active users

Dimensions and tagging

Google Analytics dimensions are the equivalent of Datadog tags. In GA, each metric comes with a group of dimensions that you can split it by. For example, the “rt:pageviews” metric, which represents how many page views have occurred in the last minutes, can be broken down by country and city. To achieve this, we query the API requesting the “rt:pageviews” metric with the “rt:country,rt:city” dimensions.
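A query like that can be sketched as follows. This is an illustration, not the code from our repository: `fetch_realtime` is a hypothetical helper, and `service` is assumed to be an Analytics v3 client built with Google's Python client library (for example with `googleapiclient.discovery.build("analytics", "v3", credentials=...)`).

```python
def fetch_realtime(service, profile_id, metric, dimensions=None):
    """Query the Real Time Reporting API (v3) and return the parsed response.

    `service` is assumed to be an Analytics v3 client object;
    `profile_id` is the numeric view ID, without the "ga:" prefix.
    """
    params = {"ids": "ga:%s" % profile_id, "metrics": metric}
    if dimensions:
        # The API expects dimensions as one comma-separated string.
        params["dimensions"] = ",".join(dimensions)
    return service.data().realtime().get(**params).execute()
```

Calling `fetch_realtime(service, "111111", "rt:pageviews", ["rt:country", "rt:city"])` would request the pageviews split by country and city for that view.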

Having the pageviews for each country and city allows us to pass this information to Datadog as tags. So when reporting the metric, we make as many Datadog calls as there are results returned by GA. Example: [1, ‘Chile’, ‘Santiago’], [4, ‘Chile’, ‘Antofagasta’] indicates that 1 pageview came from Santiago and 4 from Antofagasta, Chile. Thanks to the Datadog Agent class, the metrics are reported only once, at the end of the execution.

Google Analytics Datadog Integration - pageviews
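The dimension-to-tag mapping described above can be sketched like this. The function name `rows_to_points` and the mapping format are assumptions made for the example; the sketch relies on the `columnHeaders` section of the API response to tell dimensions from the metric, and it assumes a single metric per query.

```python
def rows_to_points(response, dimension_tags, base_tags=()):
    """Turn a Real Time API response into (value, tags) pairs.

    `dimension_tags` maps a GA dimension to a Datadog tag key,
    e.g. {"rt:country": "country"} (hypothetical format).
    """
    headers = response.get("columnHeaders", [])
    points = []
    # "rows" is read defensively in case the response has none.
    for row in response.get("rows", []):
        tags = list(base_tags)
        value = None
        for header, cell in zip(headers, row):
            if header["columnType"] == "METRIC":
                value = int(cell)  # the API returns values as strings
            elif header["name"] in dimension_tags:
                tags.append("%s:%s" % (dimension_tags[header["name"]], cell))
        if value is not None:
            points.append((value, tags))
    return points
```

Inside a check, each pair would then be reported with one call such as `self.gauge(metric_name, value, tags=tags)`.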

No exact timestamp

The Analytics API is great, except for the fact that the result structure has no timestamp. You can make two consecutive calls to the API and have no idea when each one was made.

To “solve” this, we run the check every 60 seconds. A parameter in the check configuration file tells the Datadog agent how often to run this custom check, but the scheduling is not always exact. So instead of 60 seconds, we use 55 to allow for the normal delays.

Why every minute when you can have second-precision metrics? 

Each call returns the data for the last 30 minutes. You can also ask for a per-minute result by adding the “rt:minutesAgo” dimension:

{
  "kind": "analytics#realtimeData",
  "id": "https://www.googleapis.com/analytics/v3/data/realtime?ids=ga:111111&dimensions=rt:minutesAgo&metrics=rt:pageviews&max-results=5",
  "query": {
    "ids": "ga:93129781",
    "dimensions": "rt:minutesAgo",
    "metrics": [
      "rt:pageviews"
    ],
    "max-results": 5
  },
  "totalResults": 30,
  "selfLink": "https://www.googleapis.com/analytics/v3/data/realtime?ids=ga:1111111&dimensions=rt:minutesAgo&metrics=rt:pageviews&max-results=5",
  "profileInfo": {...},
  "columnHeaders": [
    {
      "name": "rt:minutesAgo",
      "columnType": "DIMENSION",
      "dataType": "STRING"
    },
    {
      "name": "rt:pageviews",
      "columnType": "METRIC",
      "dataType": "INTEGER"
    }
  ],
  "totalsForAllResults": {
    "rt:pageviews": "62"
  },
  "rows": [
    [ "00", "1" ],
    [ "01", "1" ],
    [ "02", "1" ],
    [ "03", "2" ],
    [ "04", "4" ]
  ]
}

In this example, “the last minute” shows one pageview. If we query the API again within this minute, the result could be 2. Then, before 60 seconds have passed, you may query a third time and get 0 (zero). This is because there is no way to tell when “the last minute” ends on Google’s side.

One way to solve this is to look at the “1 minute ago” entry every 60 seconds. This way we know that the value won’t change. The downside is that we could read the same value twice or miss an entry, depending on when the agent runs the check. If you find a better way to solve this, please contact us on Github!
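Picking the “1 minute ago” entry out of a response like the one shown earlier can be sketched as below. The helper name `pageviews_minutes_ago` is hypothetical, and the sketch assumes the query used only the “rt:minutesAgo” dimension with a single metric, so each row is a pair of minute and count.

```python
def pageviews_minutes_ago(response, minutes_ago=1):
    """Return the count for the given rt:minutesAgo bucket, or 0 if it is absent."""
    for row in response.get("rows", []):
        # Rows look like ["01", "1"]: zero-padded minutes ago, then the count.
        if int(row[0]) == minutes_ago:
            return int(row[1])
    # Fall back to 0 when the requested minute is not in the response.
    return 0
```

With the sample rows above, `pageviews_minutes_ago(response)` reads the `"01"` row and returns 1, while the still-changing `"00"` row is ignored.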

Let’s go!

You can get the check source code and installation instructions on Github:

https://github.com/bithauschile/datadog-ga

Enjoy!

 

Datadog helps teams across different areas of an organisation gain insight and control over the IT products and services they are involved in. Metrics & monitoring over OS + databases + services + apps + business; all on one TV screen.

Bithaus Software – Datadog partner from Chile.
