Open Data Kit, or ODK, is (probably) one of the most underrated survey data management platforms. Yet ODK powers a lot of successful commercial platforms (CommCare, for example) and helps tons of commercial and non-commercial activities across the globe. As the name suggests, Open Data Kit is open source and can be downloaded, customized, deployed and used in a few easy steps. Let’s get started with setting up Open Data Kit on a CentOS VPS.
First, let us discuss what Open Data Kit really is. ODK is a collection of tools to collect and manage survey data using Android mobile devices. It creates the survey form, uses the form to collect data on handheld devices, and stores and publishes the data as the user needs. ODK Build is a tool to create a form. ODK Collect uses the form to collect data and send it to ODK Aggregate. ODK Aggregate receives the data and handles various data management tasks, including export to CSV and KML. Users can even use MS Excel or Open Office to create the form without writing a single line of code. This is all quite simple; read on for the steps.
Setting up the environment
ODK needs a place where you can deploy and customize your own Tomcat server and database. For this you can use a dedicated PC, a virtual machine, something like Docker, a PaaS like App Engine, or even a VPS. I tried the last one.
We will install Java on it.
sudo yum update
sudo yum clean all
sudo yum search java | grep -i --color JDK
sudo yum install java-1.8.0-openjdk.x86_64 java-1.8.0-openjdk-devel.x86_64
The clean command clears any cached files and folders. The search command, piped through grep with --color, lists the available versions of Java and highlights the matches. Finally, install installs them. The Java SDK package will be installed in the /usr/lib/jvm directory by default. Use this to set JAVA_HOME temporarily. You can set JAVA_HOME permanently by adding this line to “
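As a rough sketch of the JAVA_HOME step, the snippet below derives the value from whichever java binary is on the PATH instead of hard-coding a directory under /usr/lib/jvm. The helper name java_home_from_bin is hypothetical, and on the CentOS OpenJDK 8 package the result typically ends in the jre subdirectory, which should be enough for running Tomcat.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the full path of a
# java binary by stripping the trailing /bin/java component.
java_home_from_bin() {
    printf '%s\n' "${1%/bin/java}"
}

# Only touch JAVA_HOME when a java binary is actually on the PATH.
if command -v java >/dev/null 2>&1; then
    # readlink -f follows the symlink chain from /usr/bin/java to the real
    # binary under /usr/lib/jvm.
    JAVA_HOME="$(java_home_from_bin "$(readlink -f "$(command -v java)")")"
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
fi
```

This only lasts for the current shell session. To keep it across logins, the same export line can go into a shell startup file; ~/.bash_profile is a common choice for bash, but that file name is an assumption here.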
Setting up Tomcat
The latest stable Tomcat is version 8.5.6 and you can download it from the official website. The quickest way to install it is to un-tar it and move it to /opt/, then access it from there.
sudo wget http://www-eu.apache.org/dist/tomcat/tomcat-8/v8.5.6/bin/apache-tomcat-8.5.6.tar.gz
sudo tar -xvzf apache-tomcat-8.5.6.tar.gz
sudo mv apache-tomcat-8.5.6 /opt/tomcat
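The same three steps can also be written with the version kept in one variable, so the download, the extraction and the move cannot drift apart. This is only a sketch: the variable and function names are made up for illustration, the mirror URL is the one used above (mirrors drop old releases over time, so it may need changing), and nothing is executed unless DRY_RUN=0 is set.

```shell
# Hypothetical variable names; version and mirror are taken from the commands above.
TOMCAT_VERSION="8.5.6"
TOMCAT_MIRROR="http://www-eu.apache.org/dist/tomcat/tomcat-8"

# Build the tarball name and the download URL from a version string.
tomcat_tarball() { printf 'apache-tomcat-%s.tar.gz\n' "$1"; }
tomcat_url() { printf '%s/v%s/bin/%s\n' "$TOMCAT_MIRROR" "$1" "$(tomcat_tarball "$1")"; }

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

run wget "$(tomcat_url "$TOMCAT_VERSION")"
run tar -xvzf "$(tomcat_tarball "$TOMCAT_VERSION")"
run sudo mv "apache-tomcat-$TOMCAT_VERSION" /opt/tomcat
```

Run it once as-is to read the commands it would issue, then again with DRY_RUN=0 to actually perform them.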
Some of you may wonder what these commands are doing. wget is a tool that downloads a file into the current folder, here /home/sr/odk, which I had already created (I am going to download everything into this folder). The tar command extracts the tar.gz archive. mv stands for move (or cut and paste); I am moving the extracted folder to /opt/tomcat. Notice that you are not required to create those folders beforehand, they’ll create themselves (except the odk folder). Now see if you can start the Tomcat service.
sudo /opt/tomcat/bin/catalina.sh start
Make a quick check that your Tomcat installation works. In your browser, visit your server’s IP address on port 8080 (xxx.xxx.xx.xx:8080) to see the generic Apache Tomcat page.
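If you would rather check from the terminal, something along these lines should work, assuming curl is installed. The helper names and the SERVER variable are hypothetical; set SERVER to your VPS address, or leave it as localhost when running on the box itself.

```shell
# Hypothetical helper: print the HTTP status code returned by a URL.
# curl reports 000 when it could not connect at all.
http_status() {
    curl -s -o /dev/null -m 5 -w '%{http_code}' "$1" || true
}

# Hypothetical helper: turn a status code into a readable message.
describe_status() {
    if [ "$1" = "200" ]; then
        echo "Tomcat is answering"
    else
        echo "No Tomcat page (HTTP status: ${1:-none})"
    fi
}

SERVER="${SERVER:-localhost}"
describe_status "$(http_status "http://$SERVER:8080/")"
```

A status of 200 means the default page was served. Anything else usually means Tomcat has not started yet or port 8080 is blocked by a firewall.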
Setting up MySQL
My VPS is quite cheap and only has 1GB of RAM, so I decided not to start any non-essential services at boot. If you want to add Tomcat as a startup service, use this script. We have Tomcat ready and running. Now we need MySQL running, although you can also try PostgreSQL. To install MySQL we will need this:
sudo yum install mysql-server mysql
On a fresh installation of MySQL Server, the root password is blank. Setting a password here is good security practice. To do so, use this command.
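The prompts listed below match the ones asked by MySQL’s mysql_secure_installation script, so the step most likely looks like the sketch here. It assumes the CentOS 6 service name mysqld and that the server has to be started first; the run wrapper is a hypothetical helper that only prints the commands unless DRY_RUN=0 is set.

```shell
# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Start the MySQL server (assumed service name on CentOS 6: mysqld).
run sudo service mysqld start

# Interactive script that sets the root password and asks the hardening questions.
run sudo mysql_secure_installation
```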
Set a password, then answer y to the remaining prompts.
Remove anonymous users? [Y/n] y
Disallow root login remotely? [Y/n] y
Remove test database and access to it? [Y/n] y
Reload privilege tables now? [Y/n] y
Done! Now we will check that our MySQL server works. Use your password to log in.
mysql -u root -p
ODK Aggregate needs Java to run. ODK also needs a webserver (Tomcat) and a database (MySQL). Conventional web hosting services give you the latter two, webserver and database, but do not give you the privilege to install your own Java. ODK Aggregate can be installed on cloud platforms like Google and Amazon, on private servers and in VMs. We are using our own webserver built inside a VPS.
The reason is that there are a lot of people (meaning people like me) who cannot afford a server with secured and managed internet access 24/7. If you have short-term survey work for a time-sensitive project, you may not have enough resources to buy and maintain your own server and the space to keep it. Even if you already have a server, you may not have time to configure it for a secured private IP, some IT folks to look after it while you are sleeping, or even an internet connection with decent speed. So I decided to go for a cheap VPS which has all of the above, minus the hassle of preconfiguration and the money issue.
VPS prices can go as low as $7/month. My VPS costs almost the same, has only 1GB RAM (pretty cheap, huh), 25GB disk space and a minimum internet speed of 100Mbps. I can always pay extra to add some RAM to my box, or unsubscribe whenever I want. It’s an unmanaged CentOS 6 box with no cPanel. If you are not familiar with the Linux environment, you can pay a few bucks extra to have cPanel installed. cPanel and other control panels can be used to manage files, install software and do all kinds of things inside the VPS with a few clicks; you won’t need to learn any commands. Some websites claim that it’s “extremely difficult” to manage a box without cPanel, but believe me, it’s not that hard.
Without further ado, let’s install ODK Aggregate.