FOR ANALYSTS FOR BI PROS LEARN Tableau
How to generate the initial extract with Tableau Server
September 11, 2014
5
, , , , , ,

gold-flappy-bird

A lot of folks are struggling with this issue: imagine that you have to build an extract with millions of rows from your slow databases. You are on your corporate VPN from a hotel room or from a plane. It’s just painful and risky and the extract sorting process will consume a lot of CPU resources on your computer – giving away valuable resources from Flappy Bird and friends. So, what is the solutionBen Sullins and Peter Juel Jensen suggest to create database views while Russell Christopher using a calculation trick (where you should wait a day for the miracle) to move the initial extract generation to the Tableau Server side. Well, I don’t like these solutions: sometimes you just do not have the rights to create a database view or you need the results ASAP. So, what’s my approach?

These are the steps you should take, I will explain them in details:

  1. Create an initial extract with 1 sample row
  2. Edit the twbx file and change the “1” row to “-1”
  3. Publish to the server and generate it

If you are new to extracts I suggest to read Tom Brown’s excellent article on extracts.

Create an initial extract with 1 sample row:

Connect to Data Tableau

Open your new datasource and click on “Import some data” or if you have an existing data source just click on “Extract data”.

select top or sample 1 to refresh

Now, select top 1 or sample 1 row to refresh the extract.

It should be fast and painless.

Edit the twb(x) file and change the “1” row to “-1”

This is the tricky part. If your workbook is a “Packaged” one, you should rename it to .zip, open it, locate the twb file and open it. If you have already a twb, then just simply open it with an editor.

Locate the extract part and change count=1 to count=-1

This is the original state

This is the original state where the count=1 needs to be changed

snap00422

Now it will take all the records next time from the server

Make sure that you save these settings into the twb file. In case you have packaged workbook change back the extension from zip to the original twbx file.

Publish to the server and generate it

Now you should open this edited file with Tableau Desktop, and publish to the Server. After that, go to the data sources and trigger the full extract to refresh. Tableau Server will took all the records (-1 means infinite).

snap00439

Was that easy? I think so, but if you have a better/different solution, just drop me a line!


Liked this article? Don’t forget to subscribe so you won’t miss any of these!

Tamás Földi

Tamás Földi

Director of IT Development at Starschema
Decades of experience with data processing and state of the art programming. From nuclear bomb explosion simulation to distributed file systems. ethical hacking, real time stream processing practically I always had a great fun with those geeky ones and zeros.
Tamás Földi
  • An alternative (additional) solution, you can use sql-query too (not only table) with LIMIT clause via “connect data” process. In this case you should edit (only) the sql statement in the xml-file and no any internal part of twb xml file.

  • Thank you for sharing this. Not only does this solution help users that don’t have permissions to create a View with a TOP(N) filter, but this saves time.

    With the view solution your jumping in and out of the database, where as this solution allows you to work on you local machine, which can be accomplished quickly in a hotel.

  • Good idea Tamas. I tend to use Russell’s solution with the calculated field, but I modified the calculation to get a quicker result. If you wrap NOW() in a DATETRUNC(‘minute’,NOW()) then you can filter the extract to the current minute, then if you’re quick, you can publish to the server and wait 1 minute then kick off the extract.

    • Ankur Anwekar

      Jonathan, I am trying to replicate your method. I want to know how do I filter NOW()? Is it by “# Minutes” or by “# M/D/Y” or by “Individual Day and Time”?

    • Mike Strand

      I too prefer the NOW() solution and also wanted a quicker result than waiting a day – but achieved it with slightly different approach that doesn’t require furiously trying to beat the clock. I set the NOW() based filter to evaluate for a “Range of Dates” and then select “Starting Date” for a value about 5 minutes in the future. Create the extract locally before the started starting datetime, then publish to server, and initiate refresh after the stated datetime.

Related items

/ You may check this items as well

Pasted image at 2018_01_09 04_59 PM

Python Experiments in Tableau 1. – Add live currency conversion to Tableau Dashboards using TabPy

A lot of folks are struggling with this issue: ima...

Read more
Tableau Docker

HOWTO: Tableau Server Linux in Docker Container

A lot of folks are struggling with this issue: ima...

Read more
Tableau Consistency Checker

Tableau Filestore Consistency Checker – How Repository Maps to Filestore

A lot of folks are struggling with this issue: ima...

Read more