A hands-on Big Data Workshop

IS and Saratoga present a Hadoop, R & Tableau workshop.


Technology & Analytics practitioners (or anybody looking to get an in-depth understanding of Hadoop)

Cost: R900 per delegate

Duration: Full-day

No. of Attendees: 50


The IS & Saratoga big data workshop is a full-day event that offers business intelligence developers, analytics specialists and anyone else interested in big data technologies a hands-on masterclass in Hadoop, R and Tableau. Through a mix of instructor-led and hands-on classes, the workshop guides attendees from building a Hadoop cluster and loading data to identifying insights and showcasing them in a Tableau dashboard. After the workshop, attendees will have access to the Hadoop cluster for a week to further explore and experiment with what they have learned.

Prizes will be awarded to the best team(s).

What you need to bring:

A laptop that is Wi-Fi enabled
A willingness to learn and experiment


Cloudera or Pivotal
R & RStudio

Places are limited! Only 50 tickets available!

* Please be aware that the 5 separate workshops on Day 2 are run separately from the conference on Day 1. A ticket to the conference on Day 1 will unfortunately not allow access to any of the workshops on Day 2, so please book separately. The cost to attend the conference is R700 for the full day; the cost per workshop is R900. Please note that some of the workshops run concurrently, so you may not be able to attend all of them.


Section 1: Building a Hadoop cluster in the cloud

08:30 – 10:00 (Jeff Fletcher & Julian Roux)
We will show attendees how to build a complete Hadoop cluster from scratch in a simple, intuitive, step-by-step masterclass. By the end of this session, attendees will have a good idea of what is involved in building a Hadoop cluster and of the complexities that come with it.

Tea Break

Section 2: Loading a big data set onto the cluster

10:30 – 12:00 (Jeff Fletcher & Julian Roux)
In this section, attendees will be shown how to use open source tools to load a 100GB+ dataset into the cluster.


Section 3: Analyzing the data with R & RStudio

13:00 – 14:30 (Aloise Griesel & Tom Martin)
Attendees will break into teams.

Using the dataset from the previous session, attendees will be given a short masterclass in R and shown some insights that can be derived from the data. They will spend the last part of the session exploring the dataset themselves.

Tea Break

Section 4: Visualising the data using Tableau

15:00 – 16:30 (Tom Martin)
In the final session of the day, attendees will take the insights from the previous session and build a compelling visualization using the cloud version of Tableau. The winning visualization and team will be showcased on the Mammoth website.