Version Control and Point-in-time Recovery of Tableau Server Objects
July 22, 2015

Automatic, shadow-copy-like version control of Tableau Server objects (workbooks, data sources) is a crucial part of server administration. Most admins rely solely on the Tableau backup tool, but it lacks features such as restoring an individual workbook and quick, incremental backups. That is why you are here: with this method you can easily create and automate incremental snapshots of your Tableau Server objects directly in the Git version control system. It is the perfect solution when one of your users overwrites or deletes a workbook and then begs for a restore: you can go back in time and retrieve any previous version easily.

Ingredients

My solution relies on the following technologies:

  • TableauFS: Existing open source version control systems work with files, so all Tableau repository objects have to be available as opaque file system objects. This is exactly what TableauFS does.
  • Git: Git is the de facto standard for distributed version control. It was written by a well-known guy (google it if you don’t know who).
  • Git LFS: An open source Git extension for versioning large files.
  • crontab, Windows Task Scheduler or any other scheduler.

The solution works on Linux, OSX and Windows.

Installation

First of all, install TableauFS as described here or here, then install Git with your OS’s package manager (for example, sudo yum install git). The next step is to install Git LFS: just download the binaries and execute the installer shell script.

Installation is done, let’s set it up.

Configuration

TableauFS

We should create a directory for our version control system, like ~/tableau-repos, then mount each of our servers in its own subdirectory, like ~/tableau-repos/dev. TableauFS is a (mostly) read-only file system, so any SCM-related information should be written to ~/tableau-repos only.

Now we have the Tableau Server objects on our file system (not physically, just virtually), so we can move to the version control part.

Git LFS

We can use the same old git init command to create an empty Git repo as usual. LFS needs to know which files are binary; this is set with the git lfs track command.
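A minimal sketch of this step. The function name, the default path and the list of extensions (*.twbx, *.tdsx, *.tde) are my assumptions for illustration; adjust the patterns to whatever binary object types your server actually exposes.

```shell
# Hypothetical helper: create the repo and tell Git LFS which files are binary.
init_tableau_repo() {
  repo_dir="${1:-$HOME/tableau-repos}"
  mkdir -p "$repo_dir" || return 1
  git -C "$repo_dir" init --quiet || return 1
  # Enable the LFS filters for this repository only.
  git -C "$repo_dir" lfs install --local || return 1
  # Assumed set of binary Tableau file types; each pattern lands in .gitattributes.
  for pattern in '*.twbx' '*.tdsx' '*.tde'; do
    git -C "$repo_dir" lfs track "$pattern" || return 1
  done
}
```

Calling init_tableau_repo without arguments would set up ~/tableau-repos; the tracked patterns end up in ~/tableau-repos/.gitattributes, which should be committed together with the objects.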

Now we can proceed and import the existing objects and commit the initial release.
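The initial import could look roughly like this. The function name and the commit message are illustrative, and the sketch assumes the repo root is ~/tableau-repos with the TableauFS mounts underneath it.

```shell
# Hypothetical helper: stage everything under the repo root and create the first commit.
initial_import() {
  repo_dir="${1:-$HOME/tableau-repos}"
  # Stage every object currently visible through the mounted file systems.
  git -C "$repo_dir" add --all . || return 1
  git -C "$repo_dir" commit --quiet -m "Initial import of Tableau Server objects"
}
```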

Very nice, we have just synchronized our Git repo with the Tableau Server repository. If you have a few thousand workbooks, this could take a few minutes.

Scheduling the updates

The next time you execute git add and git commit, only the changed files will be stored (and only the differences). You can safely schedule a one-liner that looks for added, deleted and updated files and commits them to the local repo.
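As a sketch of what such a scheduled job might run: the function below stages all additions, modifications and deletions, and commits only when something actually changed. The function name, the default path and the commit message format are assumptions of mine.

```shell
# Hypothetical snapshot job: commit whatever changed since the last run.
snapshot_tableau_repo() {
  repo_dir="${1:-$HOME/tableau-repos}"
  # Stage additions, modifications and deletions in one pass.
  git -C "$repo_dir" add --all . || return 1
  # Commit only if the index differs from the last commit, so idle runs add nothing.
  if ! git -C "$repo_dir" diff --cached --quiet; then
    git -C "$repo_dir" commit --quiet -m "Snapshot $(date -u '+%Y-%m-%d %H:%M:%S') UTC"
  fi
}
```

Saved into a script that ends with a call to snapshot_tableau_repo, this can be run hourly with a crontab entry such as 0 * * * * /path/to/tableau-snapshot.sh (the script path is hypothetical).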

Our customers usually request hourly incremental backups of workbooks and data sources, which is perfectly feasible even for big deployments (5000+ workbooks).

Managing versions

Let’s take an example workbook called Dashboard Parameter Example Many Dashboards2 from the Default site, Tableau Samples project.

[Figure: Example workbook: Dashboard Parameter Example Many Dashboards2]

Let’s modify something inside the workbook, like changing the calculation of the DB Text field:

[Figure: Change DB Text in workbook]

Save the changes. Now, when we switch back to the console, the changes are immediately visible to Git as well:

[Figure: Git status]

Voilà, the changes were successfully committed to the repository. git log shows that the changes are stored in the version control system.
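To illustrate how a previous version can be looked up and pulled out again, here is a small sketch with two hypothetical helpers. It writes the old version to a separate file instead of checking it out in place, because the working tree sits on a (mostly) read-only TableauFS mount. One caveat: for files tracked by Git LFS, git show prints the small LFS pointer rather than the real content, so in that case the output has to be piped through git lfs smudge.

```shell
# Hypothetical helper: list the commits that touched one object, newest first.
object_history() {
  repo_dir="$1"; object_path="$2"
  git -C "$repo_dir" log --format='%h %ad %s' --date=iso -- "$object_path"
}

# Hypothetical helper: write the object as stored in a given commit to a separate file.
# For LFS-tracked files, add "| git lfs smudge" after git show to get the real content.
extract_version() {
  repo_dir="$1"; rev="$2"; object_path="$3"; out_file="$4"
  git -C "$repo_dir" show "$rev:$object_path" > "$out_file"
}
```

The object path is relative to the repository root, and rev can be any commit id printed by object_history. The extracted file can then be republished to the server.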

Change git’s default diff behavior

Diffing two versions of a twbx file with vanilla Git is not really useful: Git treats twbx as a binary file and only reports that the binaries differ. However, we can configure a custom diff utility to trick Git. For twbx, let’s create a shell script that invokes unzip -c -a and extracts only the twb file from the archive. Add this script to our .gitconfig, then reference it from our repository’s .gitattributes.
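One way to wire this up is Git's textconv mechanism, sketched below. The function name, the driver name twbx and the script location are my own choices. To keep the example self-contained it writes the setting to the repository's local config rather than to ~/.gitconfig; use git config --global if you want it for every repo.

```shell
# Hypothetical helper: register a textconv diff driver for packaged workbooks.
setup_twbx_diff() {
  repo_dir="$1"
  script_path="$2"   # where the helper script is written (assumed location)
  # Helper script: print the workbook XML (*.twb) inside a twbx archive to stdout.
  cat > "$script_path" <<'EOF'
#!/bin/sh
exec unzip -c -a "$1" '*.twb'
EOF
  chmod +x "$script_path" || return 1
  # Define the "twbx" diff driver in the repository's local config.
  git -C "$repo_dir" config diff.twbx.textconv "$script_path" || return 1
  # Route *.twbx files to that driver; later lines override earlier ones.
  printf '%s\n' '*.twbx diff=twbx' >> "$repo_dir/.gitattributes"
}
```

After, for example, setup_twbx_diff ~/tableau-repos ~/bin/twbx-textconv, commands such as git diff and git log -p show the changes in the extracted workbook XML instead of a binary notice.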

Now we can see the changes directly in the Git logs.

[Figure: Git logs]

Was this hard? I don’t think so. If you follow these steps, your users will be protected from accidental deletes, and they can ask to go back to any version, even one modified a year ago.

Tamás Földi

Director of IT Development at Starschema
Decades of experience with data processing and state-of-the-art programming. From nuclear bomb explosion simulation to distributed file systems, ethical hacking and real-time stream processing, I have always had great fun with those geeky ones and zeros.