Your app is collecting lots of data, and you've been storing it in a datastore, or maybe even several (analytics engines, databases, NoSQL stores, Kafka, etc.). How can you begin to explore this data and gain insights about your users?
In this session, we will dive into a couple of open source projects, Apache Spark and Apache Zeppelin notebooks, that you can use for big data processing, data visualization, and machine learning. We’ll start with some Apache Spark basics for working with (large) data sets. We’ll explore an example that loads a data set and does some parsing, filtering, unions, etc. We’ll write some Java/Scala code to massage and filter data, then show the results in our interactive Zeppelin notebook. We’ll also apply machine learning to our data set and examine the results in graphs.
The agenda will be:
* Intro to Apache Spark, with code examples for loading and massaging data
* Intro to Apache Zeppelin: working with data, coding interactively, and building visualizations
* Understanding reactivity in Zeppelin
* Using a machine learning algorithm to extract insights from our data set
* Discussion of topics not covered (forms, cloud integration, etc.)