A physicist tackles the machine learning black box
August 13, 2025
From self-driving cars to facial recognition, modern life is growing more dependent on machine learning, a type of artificial intelligence (AI) that learns from datasets without explicit programming.

Zhengkang (Kevin) Zhang, assistant professor, Department of Physics & Astronomy
Despite its omnipresence in society, we’re just beginning to understand the mechanisms driving the technology. In a recent study, Zhengkang (Kevin) Zhang, assistant professor in the University of Utah’s Department of Physics & Astronomy, demonstrated how physicists can play an important role in unraveling its mysteries.
“People used to say machine learning is a black box—you input a lot of data and at some point, it reasons and speaks and makes decisions like humans do. It feels like magic because we don’t really know how it works,” said Zhang. “Now that we’re using AI across many critical sectors of society, we have to understand what our machine learning models are really doing—why something works or why something doesn’t work.”
As a theoretical particle physicist, Zhang explains the world around him by understanding how the smallest, most fundamental components of matter behave in an infinitesimal world. Over the past few years, he’s applied the tools of his field to better understand machine learning’s massively complex models.
Scaling up while scaling down costs
The traditional way to program a computer is with detailed instructions for completing a task. Say you wanted software that can spot irregularities on a CT scan. A programmer would have to write step-by-step protocols for countless potential scenarios.
Instead, a machine learning model trains itself. A human programmer supplies relevant data—text, numbers, photos, transactions, medical images—and lets the model find patterns or make predictions on its own.
Throughout the process, a human can tweak the parameters to get more accurate results without knowing how the model transforms the input data into the output.
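The contrast above can be sketched in a few lines. In this hypothetical toy example (the function names and the "threshold" model are illustrative, not from the study), instead of hand-coding a rule like "flag scans brighter than X," we let a trivial model learn the cutoff from labeled examples:

```python
def train_threshold(examples):
    """Learn a decision threshold from (value, label) pairs.

    Picks the midpoint between the highest 'normal' value and the
    lowest 'irregular' value -- a stand-in for real training.
    """
    normal = [v for v, label in examples if label == "normal"]
    irregular = [v for v, label in examples if label == "irregular"]
    return (max(normal) + min(irregular)) / 2

def predict(threshold, value):
    return "irregular" if value > threshold else "normal"

# Labeled training data (e.g. intensity readings from past scans).
data = [(0.2, "normal"), (0.3, "normal"), (0.8, "irregular"), (0.9, "irregular")]
t = train_threshold(data)
print(predict(t, 0.85))  # classifies a value the model never saw
```

No step-by-step protocol is written for each scenario; the decision rule emerges from the data, which is the essence of the machine learning approach described above.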
Machine learning is energy intensive and wildly expensive. To maximize profits, industry trains models on smaller datasets before scaling them up to real-world scenarios with much larger volumes of data.
“We want to be able to predict how much better the model will do at scale. If you double the size of the model or double the size of the dataset, does the model become two times better? Four times better?” said Zhang.
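The kind of prediction Zhang describes can be illustrated with a simple power-law scaling ansatz, a common empirical form in the scaling-laws literature. In this hypothetical sketch (the numbers and functions are invented for illustration, not taken from the paper), two cheap training runs pin down the exponent, which then extrapolates to much larger datasets:

```python
import math

# Assume test loss falls as a power law of dataset size:
#     L(N) = c * N**(-alpha)
# Fitting alpha from two small runs predicts performance at scale.

def fit_power_law(n1, loss1, n2, loss2):
    """Solve L = c * N**(-alpha) from two (size, loss) measurements."""
    alpha = math.log(loss1 / loss2) / math.log(n2 / n1)
    c = loss1 * n1 ** alpha
    return alpha, c

def predicted_loss(alpha, c, n):
    return c * n ** (-alpha)

# Two cheap runs (illustrative numbers):
alpha, c = fit_power_law(1_000, 0.40, 4_000, 0.20)
# Extrapolate to a million training examples:
big_run = predicted_loss(alpha, c, 1_000_000)
```

With this fitted exponent (alpha = 0.5), doubling the dataset shrinks the loss by a factor of about 1.41, not 2 or 4, which is exactly the sort of question Zhang poses: the answer depends on the scaling exponent, not on intuition.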
A physicist’s toolbox
A machine learning model looks simple: input data → black box of computing → output that’s a function of the input.
The black box contains a neural network, which is a suite of simple operations connected in a web to approximate complicated functions. To optimize the network’s performance, programmers have conventionally relied on trial and error, fine-tuning and re-training the network and racking up costs.
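The "suite of simple operations" is concrete enough to write down. This hypothetical minimal sketch (weights chosen by hand for illustration; real training would find them automatically) composes weighted sums and a simple nonlinearity into a two-layer network that approximates the function |x|:

```python
def relu(x):
    """A simple nonlinearity: pass positive values, zero out the rest."""
    return max(0.0, x)

def forward(x, w1, b1, w2, b2):
    """Two-layer network: each hidden unit is a weighted sum passed
    through relu; the output is a weighted sum of the hidden units."""
    hidden = [relu(w * x + b) for w, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

# Hand-picked weights that make the network compute |x|:
# relu(x) + relu(-x) = |x| for any x.
w1, b1 = [1.0, -1.0], [0.0, 0.0]
w2, b2 = [1.0, 1.0], 0.0
print(forward(-3.0, w1, b1, w2, b2))  # prints 3.0
```

Individually, each operation is trivial; wired together in a web and scaled up, such units can approximate very complicated functions, and tuning the weights is what "training" adjusts.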
“Being trained as a physicist, I would like to understand better what is really going on to avoid relying on trial and error,” Zhang said. “What are the properties of a machine learning model that give it the capability to learn to do things we wanted it to do?”
In a new paper published in the journal Machine Learning: Science and Technology, Zhang solved a proposed model’s scaling laws, which describe how the system will perform at larger and larger scales. It’s not easy—the calculations require summing an infinite number of terms.
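Summing infinitely many terms sounds impossible, but physicists routinely tame such sums when they have enough structure. As a hypothetical illustration unrelated to the specific sums in Zhang's paper, the geometric series has an exact closed form, and partial sums converge to it rapidly:

```python
# The geometric series 1 + x + x^2 + ... sums to 1/(1-x) for |x| < 1.
# Truncating after enough terms matches the closed form to high precision.

def partial_sum(x, n_terms):
    """Add up the first n_terms of the geometric series in x."""
    return sum(x ** n for n in range(n_terms))

closed_form = 1 / (1 - 0.5)  # exact answer for x = 0.5: 2.0
print(abs(partial_sum(0.5, 50) - closed_form) < 1e-12)  # prints True
```

Finding the analogue of such a closed form for a model's scaling behavior is what turns an intractable infinite sum into a usable prediction.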
Read the full story by Lisa Potter in @TheU.