GitHub Tableau Web Data Connector
August 2, 2015
, , , , , , , ,

This GitHub Web Data Connector is a very basic connector designed to showcase some core aspects of the Tableau Web Data Connector (WDC) architecture. Building on the WDC techniques covered in the previous MongoDB post, we can learn two more things: authentication and pagination. You can find the full sources in the GitHub folder of our Starschema Web Tableau Connector repository.

Purpose of the connector

The connector lets us fetch the list of commits from a repository and then build reports about the contributions, the contributors and the dates on which the contributions were made.

We’ll use the GitHub REST API to access these commits through its list-commits endpoint, which returns a JSON array of commit objects.
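
For reference, GitHub’s list-commits endpoint has the form /repos/{owner}/{repo}/commits under https://api.github.com. A small helper can build that URL; the function name and the owner/repo values below are illustrative assumptions, not part of the original connector:

```javascript
// Hypothetical helper: builds the list-commits URL for one repository.
function commitsUrl(owner, repo) {
  return 'https://api.github.com/repos/' +
    encodeURIComponent(owner) + '/' +
    encodeURIComponent(repo) + '/commits';
}

// Example with placeholder owner and repository names.
console.log(commitsUrl('octocat', 'Hello-World'));
```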

There are two aspects of this API endpoint we need to take into consideration:

  1. Non-authenticated requests are limited to 60 requests/hour, so we need to add basic authentication
  2. A single request only returns a limited number of commits, so we need pagination

So let’s tackle those challenges one by one.

Getting started with the connector

We’ll be using the Starschema WDC Connector Base to write our connector since it simplifies some basic tasks (like creating a simple UI).

The construction of the UI and the basic states of the connector mirror the structure of the MongoDB connector.

The only real difference here is that the username / password pair is stored as a JSON object inside the encrypted tableau.password field, so it can be used to authenticate the requests in the callbacks. The pair is also removed from the connection data object so that it won’t be stored unencrypted.
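
A minimal sketch of that split, assuming the WDC 1.x tableau object with its password, connectionData and connectionName properties. The function names and the shape of the form object are hypothetical; the tableau object is passed in as a parameter only so the sketch can run outside Tableau:

```javascript
// Hypothetical submit helper: separates secrets from plain connection data.
function storeConnection(tableau, form) {
  // Credentials go into the password field as a JSON string.
  tableau.password = JSON.stringify({
    username: form.username,
    password: form.password
  });
  // Everything else goes into connectionData, which is stored as plain text,
  // so the credentials are deliberately left out here.
  tableau.connectionData = JSON.stringify({
    owner: form.owner,
    repo: form.repo
  });
  tableau.connectionName = 'GitHub commits: ' + form.owner + '/' + form.repo;
}

// Hypothetical counterpart used later in the callbacks: returns the stored
// credentials, or null when the connector was set up without authentication.
function readCredentials(tableau) {
  if (!tableau.password) return null;
  var creds = JSON.parse(tableau.password);
  return creds.username ? creds : null;
}
```

Keeping both helpers next to each other makes it easier to verify that whatever is written at submit time can be read back in the data callbacks.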

Getting the metadata

Since GitHub always returns the same fields for every commit, we can keep the metadata static and serve the header callback from that static object.
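
As a sketch, the static metadata could be a plain array of column definitions that is split into names and types for the WDC 1.x tableau.headersCallback(fieldNames, fieldTypes) call. The column list and the sendHeaders name are assumptions for illustration, and the wrapper provided by the Starschema connector base may look different:

```javascript
// Illustrative column set; these names are assumptions, not the original list.
var COLUMNS = [
  { name: 'sha', type: 'string' },
  { name: 'author', type: 'string' },
  { name: 'email', type: 'string' },
  { name: 'date', type: 'datetime' },
  { name: 'message', type: 'string' }
];

// Hands the static metadata to Tableau. The tableau object is passed in as a
// parameter so the function can be exercised with a stub outside Tableau.
function sendHeaders(tableau) {
  var names = COLUMNS.map(function (column) { return column.name; });
  var types = COLUMNS.map(function (column) { return column.type; });
  tableau.headersCallback(names, types);
}
```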

Getting the commit data

Let’s go over the rows() function that returns the actual commit data:

First we deserialize the username and password from the encrypted storage so we can do authentication if necessary.

Then we figure out the GitHub URL for the commits from the connection data.

If we are not on the first page (the lastRecordToken isn’t empty), the URL of the next page is already stored in lastRecordToken, so we set the connection URL to that.
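
The URL selection can be sketched as a small pure function. The owner and repo keys on the connection data object are assumed names; lastRecordToken is the value Tableau passes back from the previous page:

```javascript
// Hypothetical helper: decides which URL the current request should hit.
function pageUrl(connectionData, lastRecordToken) {
  // On follow-up pages the token already holds the full URL of the next page.
  if (lastRecordToken) {
    return lastRecordToken;
  }
  // First page: build the list-commits URL from the connection data.
  return 'https://api.github.com/repos/' +
    encodeURIComponent(connectionData.owner) + '/' +
    encodeURIComponent(connectionData.repo) + '/commits';
}
```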

Next we need to build up the parameters for our AJAX call:

These parameters are always necessary, and they represent a very basic AJAX call to GitHub’s API. Thankfully the GitHub API sends CORS headers, so we won’t have any trouble with cross-origin requests.
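
A minimal sketch of such a parameter object, assuming the request is sent with jQuery’s $.ajax(). The buildRequest name is hypothetical; the keys are standard jQuery.ajax settings:

```javascript
// Hypothetical helper: the settings every request to the commits API needs.
function buildRequest(url, onSuccess, onError) {
  return {
    url: url,           // first page URL or the stored next-page URL
    type: 'GET',        // listing commits is a read-only call
    dataType: 'json',   // let jQuery parse the JSON array for us
    success: onSuccess, // called as (data, textStatus, jqXHR)
    error: onError      // called as (jqXHR, textStatus, errorThrown)
  };
}
```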

After creating the parameter object, we may need to add the authentication header to it if we set up the connector for authentication:

The apply_auth helper simply adds an authorization header to our request:
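
One way such a helper could work, assuming HTTP Basic authentication and the jQuery headers setting. This is a hypothetical reimplementation of apply_auth, not the original source; the Buffer fallback only exists so the sketch also runs under Node.js:

```javascript
// Base64-encodes a string in the browser (btoa) or under Node.js (Buffer).
function toBase64(text) {
  return typeof btoa === 'function'
    ? btoa(text)
    : Buffer.from(text, 'binary').toString('base64');
}

// Adds a Basic Authorization header to a jQuery.ajax settings object.
// Returns the same object so calls can be chained.
function apply_auth(params, creds) {
  if (!creds || !creds.username) {
    return params; // no credentials configured: send an anonymous request
  }
  params.headers = params.headers || {};
  params.headers.Authorization =
    'Basic ' + toBase64(creds.username + ':' + creds.password);
  return params;
}
```

Note that btoa only accepts Latin-1 characters, so credentials containing other characters would need to be encoded to bytes first.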

After the AJAX parameter setup is complete, we just need to call it:

Now we have created and dispatched the request, so it’s time to look at what happens when we receive a reply (that is, the success handler):

For pagination, the response returned by GitHub carries a header named Link that holds the pagination information, which we need to parse with our little helper:
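
A sketch of what such a helper might look like. GitHub’s Link header lists URLs in angle brackets, each followed by a rel value such as next or last; the parseLinkHeader name and the sample header are illustrative. Inside a jQuery success handler the raw value would come from jqXHR.getResponseHeader('Link'):

```javascript
// Hypothetical helper: turns a Link header into an object keyed by rel value,
// for example { next: '...', last: '...' }.
function parseLinkHeader(header) {
  var links = {};
  if (!header) {
    return links; // single-page responses carry no Link header
  }
  // Match every <url>; rel="name" pair instead of splitting on commas,
  // so commas inside a URL do not break the parsing.
  var pattern = /<([^>]+)>\s*;\s*rel="([^"]+)"/g;
  var match;
  while ((match = pattern.exec(header)) !== null) {
    links[match[2]] = match[1];
  }
  return links;
}

// Sample header in the format GitHub uses (the URLs are placeholders).
var sample =
  '<https://api.github.com/repositories/1/commits?page=2>; rel="next", ' +
  '<https://api.github.com/repositories/1/commits?page=5>; rel="last"';
console.log(parseLinkHeader(sample).next);
```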

Next we’ll check if we actually received the correct response. We do this by checking if the response is an actual array:

Now we know that the data we received can be iterated, so let’s do just that and collect the data we need from each commit:

This simply turns our array of complicated, nested commit objects into an array of simple flat objects that we’ll return to Tableau.
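
The check and the flattening step can be sketched together. The nested fields used here (sha, commit.author.name, commit.author.email, commit.author.date, commit.message) are part of GitHub’s commit representation; the flat column names are assumptions chosen for this example:

```javascript
// Hypothetical helper: converts the API response into flat rows for Tableau.
function flattenCommits(response) {
  // Error responses are objects (for example { message: '...' }), not arrays.
  if (!Array.isArray(response)) {
    return [];
  }
  return response.map(function (item) {
    var commit = item.commit || {};
    var author = commit.author || {};
    return {
      sha: item.sha,
      author: author.name || null,
      email: author.email || null,
      date: author.date || null,
      message: commit.message || null
    };
  });
}
```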

All that’s left for us is to check if we need to load more pages and call the tableau.dataCallback() function:
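
A sketch of that final step, assuming the WDC 1.x signature tableau.dataCallback(rows, lastRecordToken, hasMoreData) and a links object keyed by rel value, as produced by parsing the Link header. The finishPage name is hypothetical:

```javascript
// Hypothetical helper: returns one page of rows to Tableau and tells it
// whether another page should be requested.
function finishPage(tableau, rows, links) {
  // The URL of the next page doubles as the lastRecordToken, so the next
  // invocation knows exactly where to continue.
  var nextUrl = (links && links.next) || '';
  var hasMoreData = nextUrl !== '';
  tableau.dataCallback(rows, nextUrl, hasMoreData);
}
```

Using the next-page URL as the token keeps the connector stateless between pages: everything needed to continue travels through Tableau itself.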

The Result

Here are some screenshots of the results, showing the connector page and the fetched data:

GitHub Connector WDC page

GitHub data in Tableau Desktop

Questions and comments are welcome as always.

Tamás Földi
