Data Futurology Podcast #213 Solving the challenges of our times with massive graph analytics with Dr. David A Bader, Distinguished Professor at the New Jersey Institute of Technology

Felipe Flores. Season 5, AI, Data for Good, Data Governance, Data Science and Machine Learning, MLops, Ethics, privacy

This week on the Data Futurology podcast, we have the special privilege to host Dr. David A. Bader, a Distinguished Professor at the New Jersey Institute of Technology, and the inaugural director of the Institute for Data Science there.

Bader joins us on the podcast to discuss massive graph analytics, a topic that he is a recognised expert in and has recently published a book on. He and his team are currently working on a project that will allow anyone, via the Jupyter Notebook and Python, to leverage their data science framework, running on “tens of terabytes” of data. “It is quite exciting to democratise data science – and especially graph analytics – so that anyone with a problem that knows Python can work with some of the largest data sets,” he said.

According to Bader, graphs are now a mainstream part of data science and a way to solve the most challenging and complex problems in the enterprise. “A graph abstracts relationships between objects, and any problem that we can abstract where we have relationships between objects, we could use graph analytics to solve,” he said.

Much of Bader’s work – including through his book – is focused on helping organisations grapple with the exponential growth in data, and the impact that this has on their ability to dedicate adequate resources to work at scale. As he said, being able to do that is going to be fundamental to humanity’s ability to respond to the many real challenges that it faces ahead.

“I want equitable access for everyone to be able to work on these problems, and to find new discoveries that are important, and help solve global grand challenges,” he said. “I think that we have many issues in the world today. And if we give more capabilities to those with data, and let them empower the data will make the world a much better place.”

For more deep insights on the importance and value of massive graph analytics, tune in to our conversation with Dr. David A. Bader.

Enjoy the show!

Thank you to our sponsor, Talent Insights Group!

Join us in Sydney for Ops World: https://www.datafuturology.com/opsworld

Join our Slack Community: https://join.slack.com/t/datafuturologycircle/shared_invite/zt-z19cq4eq-ET6O49o2uySgvQWjM6a5ng

“We’re creating a productivity front end so that anyone from their Jupyter Notebook and Python can call our framework. But in the back end, you can have a supercomputer running on tens of terabytes of data. For me, this is quite exciting to democratise data science, and especially graph analytics, so that anyone with a problem who knows Python will be able to work on some of the largest and most challenging data sets.” – Dr. David A. Bader, Distinguished Professor at the New Jersey Institute of Technology

WHAT WE DISCUSSED

00:00 Introduction

03:51 How did you get started in the graphs and graph space?

10:26 What would you say to people looking to get started in applying this capability to the problems that they might be facing in their organisations?

15:13 How do you increase the fidelity of the algorithms?

21:57 What were some of the aims of what you wanted to bring to the world with the book?

27:40 Could you tell us a little bit more about Arkouda? How has the trajectory been so far, and where is it at now? What’s coming up next?

32:39 What is your vision for the future of Arkouda?

EPISODE HIGHLIGHTS

Very early on, I was very interested in looking at large graphs, and graphs that abstract from the real world, whether it’s a transportation network, or whether it is trying to understand network security, or whether it’s a friendship network on a social network platform. For me, graphs have gone from being a niche to now being mainstream, where most organisations have graph problems and look for graph analytics that can solve their problems in the enterprise.

Once we can map a problem into the graph space, we think about what are we trying to find? Are we looking for a path, for instance, this path between friends from me to you in the graph? Are we looking for communities? So is there some emerging community of interest? Are we looking for influencers? And so these are the types of questions that we often have once we’re in the graph domain. And what I like to do is design new scalable algorithms that are able to work on these ever-increasing large size graphs to solve the analytic that we’re looking for.
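The three questions above (paths, communities, influencers) can be illustrated on a toy friendship network. This is a minimal sketch in plain Python, not the framework discussed in the episode: the `friends` data and all function names are hypothetical, connected components stand in as a very crude proxy for communities, and the number of friends stands in for influence.

```python
from collections import deque

# Hypothetical toy friendship network (undirected), stored as an adjacency list.
friends = {
    "ana": ["ben", "cho"],
    "ben": ["ana", "cho", "dev"],
    "cho": ["ana", "ben"],
    "dev": ["ben"],
    "eli": ["fay"],
    "fay": ["eli"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: return one shortest path, or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


def connected_components(graph):
    """Group people who can reach each other (a crude stand-in for communities)."""
    seen = set()
    components = []
    for source in graph:
        if source in seen:
            continue
        seen.add(source)
        component = []
        queue = deque([source])
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


def most_connected(graph):
    """Return the person with the most friends (a simple proxy for an influencer)."""
    return max(graph, key=lambda person: len(graph[person]))


print(shortest_path(friends, "ana", "dev"))
print(connected_components(friends))
print(most_connected(friends))
```

A dictionary of lists is enough at this size. The scalable algorithms described in the conversation exist for the case where the graph no longer fits in a single machine's memory, which this sketch does not attempt to address.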

We can also use graphs to understand, for instance, patients and electronic health records, to understand better treatments, and even personalised medicine through graphs. And we can also use graphs to understand our electric power grid, to shore up and make more resilient the power that runs our telecommunications, our food production, our transportation, and more.

So anyone who thinks they have a small graph problem today, probably tomorrow will have a massive graph analytic problem. So we’d like to think about what do you do once the problem no longer fits on your laptop? Once you have to figure out how do I solve this problem? When I need more than my laptop, and my Python routine is taking too long? This has to run instantaneously, and it’s taking me hours, what do I do now? So this is a solution book for anyone who is thinking about graphs in the enterprise and moving towards large scale.

I think that we have many issues in the world today. And if we give more capabilities to those with data, and let them empower the data will make the world a much better place. So that’s really my goal and my vision, and I really hope others will join in and help with this work.

Thank you so much. It’s great to talk with you. And I encourage everyone out there to look at the world as a graph. And this is just a fantastic time to do so.

At Data Futurology, we are always working to bring you use cases, new approaches and everything related to the most relevant topics in data science to help you get the most value out of these technologies! Check out our upcoming events for more amazing content.

https://www.datafuturology.com/podcast/213-solving-the-challenges-of-our-times-with-massive-graph-analytics-with-dr-david-a-bader-distinguished-professor-at-the-new-jersey-institute-of-technology-e1q3ope

David A. Bader
Distinguished Professor and Director of the Institute for Data Science

David A. Bader is a Distinguished Professor in the Department of Computer Science at New Jersey Institute of Technology.