1. List and explain four challenges commonly encountered by businesses when trying to use big data and analytics to add value to their operations.

2. Consider a business that generates big data from one or more sources.
a. Pick two sources, and describe the data using terms that you have learned in this class (e.g., source, volume, variety, velocity).
b. List appropriate dimensions of data quality that are relevant for this business / data.
c. Formulate a question that you could answer, or an insight that you could gain, by analyzing this data.
d. Present two cases for a proof of concept to validate the previous analysis: one from the point of view of a business evangelist, and the other from the point of view of a technical evangelist.
e. Describe the types of transformation or augmentation you would need to apply to the data in order to obtain results for part c.

3. Explain the difference between the mean, the median, and the mode. For each, come up with an example where you would use it over the others, and explain why.

4. Explain two key differences between traditional RDBMSs and NoSQL / other novel data stores. For each of the two types of data store, describe a business problem / application that it would fit better, and explain why.

5. Explain the difference between serial and parallelizable algorithms. Pick an algorithm, and show how it could be implemented each way.



