Business: This course familiarizes participants with different aspects of large data sets and how they are managed both on site and in the cloud. Emphasis is placed on hands-on experience, from data ingestion to analysis of large data sets, for both data-at-rest and data-in-motion (streaming data). The course defines Big Data and its 5 V's: Volume, Velocity, Variety, Veracity, and Value. Architectures of distributed databases and storage, and ecosystems such as Hadoop and Spark, are covered, followed by an introduction to Scala, Spark-Shell, and PySpark.
Terms: This course is not scheduled for the 2018-2019 academic year.
Instructors: There are no professors associated with this course for the 2018-2019 academic year.
Prerequisite: CBUS 255
1. 35 hours in class plus at least 25 hours of assignments/readings.