Emerging real-world graph problems include detecting community structure in large social networks, improving the resilience of the electric power grid, and detecting and preventing disease in human populations. Unlike traditional applications in computational science and engineering, these problems raise new challenges at scale: the data are sparse and lack locality, scalable algorithms and frameworks for solving these problems on high performance computers need further research and development, and improved models are needed that also capture the noise and bias inherent in the torrential data streams. In this talk, the speaker will discuss the opportunities and challenges in massive data-intensive computing for applications in computational biology, genomics, and security. The explosion of real-world graph data poses a substantial challenge: How can we analyze constantly changing streaming graphs with billions of vertices? Our approach leverages fine-grained parallelism, lightweight synchronization, and shared memory to scale to massive graphs.
Upcoming Conference: “Machine-Learning with Real-time & Streaming Applications” FIRST CONFERENCE ANNOUNCEMENT:
From Data to Knowledge: Machine-Learning with Real-time & Streaming Applications
May 7-11, 2012
On the Campus of the University of California, Berkeley
https://lyra.berkeley.edu/CDIConf/
Olfa Nasraoui (Louisville), Petros Drineas (RPI), Muthu Muthukrishnan (Rutgers), Alex Szalay (Johns Hopkins), David Bader (Georgia Tech), Eamonn Keogh (UC Riverside), Joao Gama (Univ. of Porto, Portugal), Michael Franklin (UC Berkeley), Ziv Bar-Joseph (Carnegie Mellon University)
We are experiencing a revolution in the capacity to quickly collect and transport large amounts of data. Not only has this revolution changed the means by which we store and access this data, but it has also caused a fundamental transformation in the methods and algorithms that we use to extract knowledge from data. In scientific fields as diverse as climatology, medical science, astrophysics, particle physics, computer vision, and computational finance, massive streaming data sets have sparked innovation in methodologies for knowledge discovery in data streams. Cutting-edge methodology for streaming data has come from a number of diverse directions, from on-line learning, randomized linear algebra and approximate methods, to distributed optimization methodology for cloud computing, to multi-class classification problems in the presence of noisy and spurious data.
This conference will bring together researchers from applied mathematics and several diverse scientific fields to discuss the current state of the art and open research questions in streaming data and real-time machine learning. The conference will be domain driven, with talks focusing on well-defined areas of application and describing the techniques and algorithms necessary to address the current and future challenges in the field.
Sessions will be accessible to a broad audience and will have a single-track format with additional rooms for breakout sessions and posters. There will be no formal conference proceedings, but conference applicants are encouraged to submit an abstract and present a talk and/or poster.