Better Decisions through Big Data

by David Bader, IEEE Computer Society member, and Chair, School of Computational Science & Engineering at Georgia Institute of Technology

Companies and governments increasingly rely on ‘big data’ to operate efficiently and competitively. Analytics and security must keep pace. What research underpins the latest big data-enabled advances?

Good decisions are well-informed decisions, of late powered by a diversity of data. Big data is creating profound business and social opportunities in nearly every field because of its ability to speed discovery and customize solutions. Yet without the means to synthesize and protect data, we risk not being able to do what we intended: make the right decisions.

This junction of big data and security shapes an increasingly important area of research that uses high-performance cyber analytics – led by research universities and industry in partnership.

Every day, enterprise systems create a deluge of data: power grid use across a metropolitan area, millions of credit card transactions per hour, social network relationships, or the spread of contagious disease. One might think that so much data requires more time to analyze and draw conclusions from, but in practice, big data now allows us to respond in near real time. How is this possible?

Research universities play a key role here, using graph analytics and creating new visualization methods to give government and industry actionable knowledge from growing mounds of data. Streaming graph analytics, for example, detect structural changes and flows, spot clustering or key actors, and highlight subtle anomalies. Graph analytics require large-scale, high-performance computers that can trace trillions of interconnected vertices and edges that change over time. Projects such as Georgia Tech’s STINGER offer an open-source way to understand data with large, streaming graphs. Much university work is by nature open source and open to all – providing a standard that others can improve upon without having to reinvent it.
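To make the idea of a streaming graph concrete, here is a minimal sketch in Python. It is a toy illustration only, not STINGER's API or data structure: the class `StreamingGraph`, the function `apply_stream`, and the `"+"`/`"-"` event format are all hypothetical names chosen for this example. It keeps an adjacency set per vertex, applies edge insertions and deletions as they arrive, and reports the best-connected vertices as a crude stand-in for "key actors".

```python
from collections import defaultdict


class StreamingGraph:
    """Toy undirected graph updated one edge event at a time (illustrative only)."""

    def __init__(self):
        # vertex -> set of neighbouring vertices
        self.adj = defaultdict(set)

    def insert_edge(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete_edge(self, u, v):
        self.adj[u].discard(v)
        self.adj[v].discard(u)

    def degree(self, v):
        # .get avoids creating an entry for a vertex we have never seen
        return len(self.adj.get(v, ()))

    def key_actors(self, k=1):
        """Return the k vertices with the most neighbours (ties broken by name)."""
        ranked = sorted(self.adj, key=lambda v: (-len(self.adj[v]), v))
        return ranked[:k]


def apply_stream(graph, events):
    """Apply a sequence of (op, u, v) events, where op is '+' or '-' (assumed format)."""
    for op, u, v in events:
        if op == "+":
            graph.insert_edge(u, v)
        elif op == "-":
            graph.delete_edge(u, v)
        else:
            raise ValueError(f"unknown operation: {op!r}")


if __name__ == "__main__":
    g = StreamingGraph()
    apply_stream(g, [("+", "a", "b"), ("+", "a", "c"), ("+", "b", "c"),
                     ("+", "a", "d"), ("+", "a", "e"), ("-", "a", "d")])
    print(g.key_actors(1), g.degree("a"))
```

A production system handles trillions of edges and far richer analytics; the point of the sketch is only that the graph is updated incrementally as events arrive, rather than rebuilt from scratch for each query.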

Further, streaming graphs can be combined with techniques from machine learning to be more effective. A machine might be fed information and trained to know how a well-behaved employee normally operates, using information extracted from the masses of data. Then, when an action deviates from the norm, an alert results. This is especially useful in business environments where employees connect to proprietary data via numerous mobile devices. Even the best employees now must be monitored because they may be unwittingly used in cybercrime.
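The "learn the norm, then alert on deviation" idea can be sketched in a few lines of Python. This is a deliberately simple statistical rule (a z-score threshold) standing in for a trained machine-learning model, not the method any particular system uses. The function names, the choice of "files accessed per day" as the measured behavior, and the threshold of 3 standard deviations are all assumptions made for illustration.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize one user's normal behavior from past daily counts.

    `history` is a list of at least two numbers, e.g. files accessed per day
    (a hypothetical feature chosen for this example).
    """
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Return True if `value` lies more than `threshold` standard deviations
    from the baseline mean (assumed cutoff, not a recommended setting)."""
    mu, sigma = baseline
    if sigma == 0:
        # No variation in the history: any different value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


if __name__ == "__main__":
    baseline = build_baseline([10, 12, 11, 9, 10, 11, 12, 10])
    for todays_count in (11, 60):
        status = "ALERT" if is_anomalous(todays_count, baseline) else "ok"
        print(todays_count, status)
```

In practice the baseline would draw on many signals at once, including graph features such as which systems a device connects to, and the cutoff would be tuned to balance missed incidents against false alarms.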

Contrary to common fears, the more data we share, the more secure we may actually become. The big data behind employee behavior analysis, for example, enables new cyber security approaches that discover subtle, previously hidden suspicious changes and guide quick responses before expensive damage is done. In the end, that not only keeps organizations safe; it can also inform the right decisions.

“Under the hood” fixes that don’t involve human actions will be another area of advancement in the realm of big data and cyber security. Devices need smart tools for self-correction and even the ability to clean up after a successful attack. Also, static security defenses, such as firewalls and anti-malware software, have taxed computing power. Universities are studying how to increase computational power, which will help future cyber security solutions. For example, leading universities and industry are working on a national need to increase computational efficiency by 75-fold over the best current systems.

National laboratories and universities are environments of open innovation, where partnerships with industry permit computer scientists to work with real problems and real data. This is what leads to true societal solutions that industry can deploy. Much of today’s and tomorrow’s cyber security work will require heavy-hitting computation, which industry usually cannot do alone. Therefore, it begins with and revolves around sharing.

Computing is, and always has been, about making better decisions faster. With every advance, technology solves some issues and introduces others. There will be challenges. But big data-enabled security keeps a nation devoted to knowledge-based innovation on the offensive. High-performance cyber analytics – in partnership between industry and academia – is the next underpinning for national progress and everyone’s security.

David Bader is an IEEE Computer Society member, and professor and chair of the School of Computational Science & Engineering and executive director of high performance computing at the Georgia Institute of Technology.

David A. Bader
Distinguished Professor and Director of the Institute for Data Science

David A. Bader is a Distinguished Professor in the Department of Computer Science at New Jersey Institute of Technology.