The journey continues: in this session we are going to move the famous TDFS (Tableau Distributed File System) together with the Zookeeper service to one of our favorite operating systems: Linux. The goal is the same: run each and every Tableau Server process on Linux, without the need for the Windows OS. And a short disclaimer: this is 100% unsupported by Tableau, and you need valid licenses for your Linux box, otherwise you are going to violate their EULA.
Previously on “Tableau Server on Linux – Part 1 – Data Engine”
And today we are not just going to install these two services on Linux. No, we'll do a lot more! We will start transforming our single-node Tableau Server into a cluster without even touching the GUI.
The Basics
TDFS – or the File Store service, as Tableau calls it – is installed along with the Data Engine and controls the storage of extracts. In highly available environments, the File Store ensures that extracts are synchronized to the other File Store nodes, so they remain available if one File Store node stops running. How does it work in practice? When you refresh a data source:
- the Backgrounder receives an extract refresh task
- it fetches the data and passes it to the tdeserver process together with a new, unique file name
- tdeserver writes the new TDE file locally
- the Backgrounder connects to the File Store service and reports the new file
- the File Store puts the file into TDFS
- the TDFS implementation ensures that the file is replicated to all nodes; node configurations are stored in Zookeeper under the /tdfs directory
To use our tdeserver without manually copying files between Tableau Server and our Linux host, we need Zookeeper and TDFS – that's it. So, let's configure them.
Zookeeper
First of all, what is Zookeeper? According to Zookeeper's website, it is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. Tableau uses Zookeeper to store cluster information and to keep track of who is doing what and who is available; most of this functionality is implemented in the Coordination Service. But enough theory, let's jump into practice!
Installing Zookeeper on Linux
Tableau Server 9.0 ships Zookeeper 3.4.6, which is the latest stable release. Zookeeper is written purely in Java; thus, the binaries should work on every platform where Java is supported. You can download this version from any Apache mirror.
$ wget http://www.us.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
[..]
$ tar xvzf zookeeper-3.4.6.tar.gz
Installation is done; the Zookeeper distribution is ready to serve from your zookeeper-3.4.6 folder. To have everything up and running we need a Tableau-compatible configuration, which should look like this:
$ cat zookeeper-3.4.6/conf/zoo.cfg
tickTime=2000
initLimit=30
syncLimit=2
snapCount=100000
dataDir=/home/ec2-user/zookeeper-data
clientPort=12000
maxClientCnxns=0
quorumListenOnAllIPs=true
server.1=54.203.245.18:13000:14000
server.2=54.212.254.40:13000:14000
A couple of things: server.1 should be our original Windows Tableau Server, while server.2 is the Linux one (in a server.X=host:port1:port2 entry, the first port carries quorum traffic and the second is used for leader election). dataDir should point to Zookeeper's local data directory, which you need to create with mkdir ~/zookeeper-data. Also, you should create a file called myid inside dataDir to tell Zookeeper the local node's id:
echo 2 > ~/zookeeper-data/myid
Good. Now switch to the Windows box and add the server.2 line to Tableau Server's zoo.cfg, located in %PROGRAMDATA%\Tableau\Tableau Server\data\tabsvc\config\zookeeper\zoo.cfg. That's it. Restart Tableau Server, then start our own Linux Zookeeper instance with:
$ ./bin/zkServer.sh start
JMX enabled by default
Using config: /home/ec2-user/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
You can quickly check zookeeper.out to see that everything is okay.
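Another quick sanity check (my own addition, not something the original setup requires): Zookeeper 3.4.x answers the classic four-letter-word commands on its client port, so you can ask the node whether it is alive and whether it has joined the quorum – assuming nc is installed:

$ echo ruok | nc 127.0.0.1 12000
imok
$ echo srvr | nc 127.0.0.1 12000 | grep Mode
Mode: follower

The mode should be leader or follower once both nodes see each other; standalone would mean the quorum was not formed.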
Validating Zookeeper
We have built a Zookeeper cluster and joined it to our Tableau Server – isn't that fantastic? Well, it is. But what's inside? Let's have a look:
$ bin/zkCli.sh -server 127.0.0.1:12000
[zk: 127.0.0.1:12000(CONNECTED) 0] ls /
[configs, tdfs, zookeeper, clusterstate.json, aliases.json, clustercontroller, live_nodes, postgres, overseer, collections, overseer_elect]
Nice, it seems we can access everything locally from Linux. Or maybe not:
[zk: 127.0.0.1:12000(CONNECTED) 1] ls /tdfs
Authentication is not valid : /tdfs
The /tdfs folder is password protected, so it is time to authenticate ourselves. You can get the credentials from %PROGRAMDATA%\Tableau\Tableau Server\data\tabsvc\config\filestore.properties as usual:
filestore.zookeeper.username=fszkuser
filestore.zookeeper.password=95d2cb4f8464d1560db0f8276b59e4bfe2e6ad5d
Now, let's authenticate and retry reading the /tdfs directory:
[zk: 127.0.0.1:12000(CONNECTED) 2] addauth digest fszkuser:95d2cb4f8464d1560db0f8276b59e4bfe2e6ad5d
[zk: 127.0.0.1:12000(CONNECTED) 4] ls /tdfs
[hostslock, totransferperhost, status, clock, totransferperfolder, hosts, transferring]
Everything as expected. Zookeeper: job done.
TDFS / File Store
Getting the binaries
And now for something different. Until now we have dealt only with ready-to-use services; this time we move something really Tableau-specific. We start by moving Tableau's Java packages (jars) to our Linux box. Here is what and how (a shell sketch follows the list):
- create a new folder called tableau-apps. This is where the code will go
- create a folder tableau-apps/bin and copy all jar files from Tableau Server's bin/ folder recursively. If you are doing it right, you should also have repo-jars and repo-migrate-jars subfolders with jar files in them. You do not need everything right now, but this is only part two – we will move the remaining services in the next few weeks, not just TDFS!
- create a new folder tableau-apps/lib and, just like with bin, copy all jar files from Tableau Server's lib/ folder. Here you don't need recursion, the first level is enough.
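Here is a minimal shell sketch of the copy step. It assumes the Windows Tableau Server's bin/ and lib/ folders are reachable from the Linux box under /mnt/tabsrv (a hypothetical path – use an SMB mount, scp, or whatever you prefer), and that cp is GNU coreutils:

# hypothetical mount point of the Windows Tableau Server installation folder
SRC=/mnt/tabsrv

mkdir -p ~/tableau-apps/bin ~/tableau-apps/lib

# bin: recursive copy, keeping the repo-jars and repo-migrate-jars subfolders
(cd "$SRC/bin" && find . -name '*.jar' -exec cp --parents {} ~/tableau-apps/bin/ \;)

# lib: first level only is enough
cp "$SRC"/lib/*.jar ~/tableau-apps/lib/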
That’s it, binaries are done. How about configuration?
TDFS Configuration – On Linux
Create a new folder called filestore and create the following three files in it:

log4j.xml – to see what is going on:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration PUBLIC "-//LOGGER" "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">

    <!-- Appenders -->
    <appender name="file" class="org.apache.log4j.DailyRollingFileAppender">
        <param name="File" value="/home/ec2-user/filestore/filestore.log" />
        <param name="DatePattern" value="'.'yyyy-MM-dd" />
        <param name="encoding" value="UTF-8" />
        <layout class="org.apache.log4j.EnhancedPatternLayout">
            <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss.SSS Z}{UTC} %t %X{siteName} %X{userName} %-5p %X{requestId}: %c - %m%n" />
        </layout>
    </appender>

    <!-- 3rdparty Loggers -->
    <logger name="org.apache">
        <level value="warn" />
    </logger>

    <!-- Root Logger -->
    <root>
        <priority value="info" />
        <appender-ref ref="file" />
    </root>

</log4j:configuration>
connections.properties – this is required to know where to connect:
# 54.203.245.18 - this is our windows box
#Thu May 28 07:12:36 UTC 2015
pgsql.host=54.203.245.18
jdbc.url=jdbc\:postgresql\://54.203.245.18\:8060/workgroup
primary.host=54.203.245.18
pgsql.port=8060
primary.port=8060
And finally, filestore.properties:
coordinationservice.hosts=localhost:12000
coordinationservice.operationretrylimit=5
coordinationservice.operationretrydelay=5000
coordinationservice.operationtimeout=30000
coordinationservice.sessiontimeout=60000
filestore.zookeeper.username=fszkuser
filestore.zookeeper.password=95d2cb4f8464d1560db0f8276b59e4bfe2e6ad5d
filestore.maxmutexretries=5
filestore.hostname=54.212.254.40
filestore.maxentriesinfilestofetch=4
filestore.root=/home/ec2-user/dataengine
filestore.port=9345
filestore.status.port=9346
filestore.transferreportintervalms=30000
filestore.reapholdoffms=7500000
filestore.inusereapholdoffms=86400000
filestore.filetypes=extract
filestore.allfileprocessingholdoffms=300000
filestore.somefileprocessingholdoffms=300000
filestore.reapfailedtransfersholdoffms=3600000
filestore_stale_folder_reap.delay_s=3600
filestore_zookeeper_cleaner.delay_s=60
filestore_missing_folder_fetch.delay_s=60
filestore_scheduled_folder_fetch.delay_s=60
filestore_scheduled_internal_folder_fetch.delay_s=60
filestore_failed_transfers_reap.frequency_s=86400
filestore.maxservertimeoffsetms=900000
worker.hosts=54.203.245.18,54.212.254.40
The Windows server is still 54.203.245.18, while 54.212.254.40 is the Linux node. The filestore.root directory should point to our Data Engine directory (which we created in part 1). And don't forget to change the fszkuser password to the value from your own server.
The Linux part is done; switch to Windows.
TDFS Configuration – On Windows
In addition to Zookeeper authentication, TDFS blocks all connections that are not coming from worker nodes. Thus, we have to add this node as a worker in the following files:
- filestore.properties
- connections.properties
- connections.yaml
- backgrounder.properties
- clustercontroller.properties
- dataengine/tdeserver_standalone0.yml
Practically you must:
- search and replace the localhost string with the external IP of the server in all of the files listed above
- change worker.hosts to worker.hosts=windows_ip,linux_ip in filestore.properties and tdeserver_standalone0.yml due to whitelisting
You can find these files in %PROGRAMDATA%\Tableau\Tableau Server\data\tabsvc\config.
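For reference, with the IP addresses used throughout this post (Windows node first, then the Linux node), the edited worker.hosts line in the Windows filestore.properties would look like this – the matching change in tdeserver_standalone0.yml uses that file's own YAML syntax:

# %PROGRAMDATA%\Tableau\Tableau Server\data\tabsvc\config\filestore.properties
worker.hosts=54.203.245.18,54.212.254.40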
Start TDFS
Config done, let’s start TDFS:
$ java -Dconnections.properties=file:///connections.properties -Dconfig.properties=file:///$PWD/filestore.properties -cp ".:../tableau-apps/bin/app-tdfs-filestore-latest-jar.jar:../tableau-apps/bin/repo-jars/*:../tableau-apps/lib/*" com.tableausoftware.tdfs.filestore.app.Main
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/ec2-user/tableau-apps/bin/repo-jars/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/ec2-user/tableau-apps/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/ec2-user/tableau-apps/lib/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Now you should see some nice log messages in filestore/filestore.log:
2015-08-04 12:49:24.709 +0000 Thread-2 INFO : com.tableausoftware.tdfs.filestore.status.StatusService - Starting Status Service on port 9346
2015-08-04 12:49:24.729 +0000 main INFO : com.tableausoftware.tdfs.filestore.app.Main - FileStore Server started
2015-08-04 12:49:24.731 +0000 main INFO : com.tableausoftware.tdfs.filestore.controller.ControllerService - Registering filestore node with zookeeper...
2015-08-04 12:49:24.841 +0000 main INFO : com.tableausoftware.tdfs.filestore.controller.ControllerService - Registered filestore node with zookeeper.
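As an extra check beyond the log file (my addition, not part of the original walkthrough), you can verify that the two ports we configured in filestore.properties are actually listening; netstat -ltn works just as well if ss is not available:

# filestore.port=9345 and filestore.status.port=9346 from our config
$ ss -ltn | grep -E ':(9345|9346)'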
If you are still with me, you have just accomplished part 2: TDFS and Zookeeper are running on your Linux node in cluster mode.
Try things out
A typical test case would be an extract refresh. After the refresh completes, we should see the generated TDE file on both Windows and Linux.
In backgrounder.log we can see that the Backgrounder was able to communicate with TDFS:
2015-08-04 13:00:38.547 +0000 (Default,,,) pool-2-thread-1 : INFO com.tableausoftware.tdfs.common.ExtractsListHelper - Wrote extracts to file C:\ProgramData\Tableau\Tableau Server\data\tabsvc\temp\allValidFolderIds651283455269617809\allValidFolderIds1576681360460061073.tmp
2015-08-04 13:00:38.562 +0000 (Default,,,) pool-2-thread-1 : INFO com.tableausoftware.model.workgroup.service.FileStoreService - Uploaded allValidFolderIds file to File Store on host 54.203.245.18
2015-08-04 13:00:38.578 +0000 (,,,) backgroundJobRunnerScheduler-1 : INFO com.tableausoftware.backgrounder.runner.BackgroundJobRunner - Job finished: SUCCESS; name: List Extracts for TDFS Reaping; type :list_extracts_for_tdfs_reaping; notes: null; total time: 1 sec; run time: 0 sec
2015-08-04 13:00:38.578 +0000 (,,,) backgroundJobRunnerScheduler-1 : INFO com.tableausoftware.backgrounder.runner.BackgroundJobRunner - Running job of type :list_extracts_for_tdfs_propagation; no timeout; priority: 10; id: 19339; args: []
2015-08-04 13:00:38.594 +0000 (Default,,,) pool-2-thread-1 : INFO com.tableausoftware.model.workgroup.workers.ListExtractsForTDFSPropagationWorker - Deleted 0 extract_sessions created prior to last DB start time
2015-08-04 13:00:38.609 +0000 (Default,,,) pool-2-thread-1 : INFO com.tableausoftware.model.workgroup.workers.ListExtractsForTDFSPropagationWorker - done fetching orphans
2015-08-04 13:00:38.609 +0000 (Default,,,) pool-2-thread-1 : INFO com.tableausoftware.model.workgroup.workers.ListExtractsForTDFSPropagationWorker - Found 4 recent valid extract records
On Windows:
c:\ProgramData\Tableau\Tableau Server\data\tabsvc\dataengine\extract>dir "a7\c0\{BE3565D0-4390-48E8-89D8-5A254A8FC675}\comments.tde"
08/04/2015  01:00 PM            49,034 comments.tde
On Linux:
$ find dataengine/ -exec ls -l {} \; | grep -i aug | grep com
-rw-rw-r-- 1 ec2-user ec2-user 49034 Aug  4 13:01 dataengine/extract/a7/c0/{BE3565D0-4390-48E8-89D8-5A254A8FC675}/comments.tde
Hurray, our file was replicated successfully in our newly built cluster. This is the end, the happy end.
If you have questions or comments, just let me know – and stay tuned to learn about more services running on Linux.