FOR BI PROS LEARN Tableau
MongoDB Web Data Connector for Tableau
July 29, 2015
0
, , , , , , ,

MongoDB Tableau Web Data Connector We are officially in the era of Tableau Web Data Connectors. Every day someone releases a new connector for a web site or application using this new API introduced in 9.1. Databoss team is not an exception, we have our own set of connectors with a brand new and fancy WDC framework which makes the connector development super easy. In this post I will show you how can you get our MongoDB connector up and running including its coding tricks behind the scenes. The connector uses the MongoDB Simple REST API to fetch data from MongoDB and tries to format it from MongoDB’s tree-structure to a linear list of rows.

Getting MongoDB to cooperate

MongoDB by default does not provide a REST API, and its documentation recommends that you use some third-party REST provider for complete functionality.

To start up MongoDb with the Simple REST API you need to start it with the --rest  switch, and since we plan on accessing it from a different host/port combination, we need to also add support for JSONP with the --jsonp  switch.

The mongod web server by default binds to the 127.0.0.1 interface which disallows other machines from connecting to it, so we also need to change the bind_address  property in the configuraion file to 0.0.0.0 (this will cause mongod to display a warning that it already listens to all ips, which is true for the mongod server but not true for the web server).

After this setup our mongodb should be accessible via

I do not publish the built connector as you need to audit anyway. The sources are published in this github repo with the compilation instructions. You know, never trust in any tableau web data connector without deep code audit. Just sayin’.

And now let’s dive into the deep.

Getting started with the connector

We’ll be using the Starschema WDC Connector Base (ConnectorBase) to write our connector since it simplifies some basic tasks (like creating a simple UI).

Also to make our lives easier, we’ll use browserify to compile the separate coffeescript source files and JADE templates into a single javascript file, and it also enables us to use the CommonJS module system to be used for dependency resolution and keeping the global namespace clean.

Creating the connector

ConnectorBase has a few concepts that may feel different to the usual WDC development process:

  • A connector is built from a number of states/steps (like setup, authentication, running the connector), where each state/step has a separate representation in the UI (like a setup page, an authentication page, etc.)
  • The form inputs used by the connector are declared in the template for each state (instead of declaring them in the connector code itself), so you can keep your connector code DRY
  • The connector source code defines the JavaScript to be ran during transitions from one state to another. (like get the data from all the inputs and call tableau.submit() when transitioning from the start state to the run state.)

So with this in mind, lets go through the mongoDB connector:

We start by importing some necessary libraries:

  • JQuery for AJAX
  • Underscore for convinience
  • Starschema WDC Connector Base

Steps and templates

Now we can declare our connector with its two states (start to gather the mongodb connection data and run to start the extraction):

The steps key describes the steps of the connector. The template key inside must point to a function that returns an HTML soup for the UI of that step. We are using JADE to both keep our templates simple and to allow browserify to compile them directly into our JavaScript file.

The only necessary state for a connector is start which always represents the startup state of the connector.

Now lets look at the start.jade  template to see how the inputs work (for brewity only parts of the template are shown here):

The important parts are the two data attributes used here:

  • data-tableau-key  tells ConnectorBase that the value of this input should be mapped to the key specified by the value of the attribute when collecting the inputs for the step (I will explain this in more detail a bit further down in the article)
  • data-state-to  tells ConnectorBase that when clicking that HTML element, a state transition should be triggered to the state specified.

Transitions between steps

How do these inputs and transitions come into play? Lets look a bit further down the connectors source and look at the transitions of the mongo connector to find out (still the same object passed to make_tableau_connector  ):

 

The state machine powering ConnectorBase passes three parameters to each transition function:

  • data which is the mutable state of the state machine.
  • from which is the name of state we are coming from
  • to which is the name of the state we are transitioning to

Since we are dealing with javascript, we can safely ignore the from and to parameters if we arent interested in using them.

Fetching the values of the inputs

This first transition (called when transitioning from start to run_mongo) gets executed when the user clicks the “Connect to MongoDB” button on the start page.

The fetch_inputs(<wrapper_selector>)  function collects the values of all inputs marked with data-tableau-key  from the given wrapper selector into a single object. This object in this case might look something like this:

 

By doing a _.extend data, wdc_base.fetch_inputs("#state-start") we are appending these key/value pairs to the state machine state so we can access them in later transitions.

Creating the MongoDB URL from the inputs

After collecting the inputs, we build our MongoDB url in a few steps (JSONP_CALLBACK_NAME is just a constant storing the name of the callback we’ll use for JSONP in JQuery):

Starting the extract process

With this completed, we have the URL to connect to, so we’ll start the extraction process using a nice feature of the state machine:

This transition gets triggered whenever we enter the run_mongo state, AFTER the transitions declared as between two states are ran (like our previous start > run_mongo  transition), so we can say that “whenever we enter the run_mongo state, after any in-between transitions are ran, run this transition handler”.

This transition handler uses the convinience function set_connection_data(<object>) to save the state machine data into tableau.connectionData  as serialized JSON. We can use the get_connection_data()  function to get the deserialized object back. This keeps the serialization part well abstracted.

Getting the table Metadata

ConnectorBase calls the methond columns(<connection_data>) to load our headers. Here we simply do the work needed and then call

There are a few helper methods at work here, so lets go through them one-by-one:

This really is only a convinience function so we dont have to repeat ourselves during the data callback and the header callback. It simply takes a URL and a callback and either calls the callback with the loaded data or aborts the extract process in case of a connection problem.

The json_flattener.remap(<object>)  is a more complicated function that flattens the tree hierarchy of the data from MongoDB to a flat list of rows. It will be explained in detail in a later article, for now the important thing is that it takes a (possibly nester) javascript object and returns an array of flat javascript objects.

The guessDataType( <value> )  function takes a Javascript string value as input and returns a possible Tableau Datatype for it

So we use json_flattener.remap  to flatten the first object returned by MongoDB and get the field names and types by using Underscore.js’s mapObject function.

Loading the data itself

The data loader callback takes almost the exact same shape as the header loader, except for the pagination logic:

  • The function uses the start offset of the next page as the lastRecordToken (or 0 for the first page)
  • MongoDB returns the start offset and returned record count as offset and total_rows in the response, and we use it to check if the total_records returned is 0 (in which case we have reached the last page).
  • If we arent at the last page, then increment the start offset by the returned record count and use this as our new lastRecordToken

Summary

Geez, if you made until the end of this article you are definitely a hard core guy.

Tamás Földi

Tamás Földi

Director of IT Development at Starschema
Decades of experience with data processing and state of the art programming. From nuclear bomb explosion simulation to distributed file systems. ethical hacking, real time stream processing practically I always had a great fun with those geeky ones and zeros.
Tamás Földi

Related items

/ You may check this items as well

Scaling Tableau Image

Scaling out Tableau Extracts – Building a distributed, multi-node MPP Hyper Cluster

We are officially in the era of Tableau Web Data C...

Read more
Pasted image at 2018_01_09 04_59 PM

Python Experiments in Tableau 1. – Add live currency conversion to Tableau Dashboards using TabPy

We are officially in the era of Tableau Web Data C...

Read more
Tableau Docker

HOWTO: Tableau Server Linux in Docker Container

We are officially in the era of Tableau Web Data C...

Read more