Data science aims to solve grand global challenges such as detecting and preventing disease in human populations, revealing community structure in large social networks, protecting our elections from cyber-threats, and improving the resilience of the electric power grid. Unlike traditional applications in computational science and engineering, solving these social problems at scale often raises new challenges because of the sparsity and lack of locality in the data. These challenges call for research on scalable algorithms and architectures, the development of frameworks for solving these real-world problems on high performance computers, and improved models that capture the noise and bias inherent in the torrential data streams. In the 1st data revolution of the 1980s, relational databases made it possible to transact records related to an object. The 2nd data revolution in the early 2000s used graph and vertical databases to query for patterns within the data. In this talk, Bader will discuss the 3rd data revolution for discovering important answers to unknown questions in massive data.