From left: PhD student Geemi Wellawatte, Andrew White, an associate professor of chemical engineering, and Aditi Seshadri ’22 in Wegmans Hall. White’s lab has developed a way to verify the predictions of machine learning models used in drug discovery by using counterfactuals. (University of Rochester photo / J. Adam Fenster)

Rochester researchers use ‘counterfactuals’ to verify predictions of drug safety.

Scientists rely increasingly on models trained with machine learning to provide solutions to complex problems. But how do we know the solutions are trustworthy when the complex algorithms the models use are not easily interrogated or able to explain their decisions to humans?

That trust is especially crucial in drug discovery, for example, where machine learning is used to sort through millions of potentially toxic compounds to determine which might be safe candidates for pharmaceutical drugs.

“There have been some high-profile accidents in computer science where a model could predict things quite well, but the predictions weren’t based on anything meaningful,” says Andrew White, associate professor of chemical engineering at the University of Rochester, in an interview with Chemistry World.

White and his lab have developed a new “counterfactual” method, described in Chemical Science, that can be used with any molecular structure-based machine learning model to better understand how the model arrived at a conclusion.

Counterfactuals can tell researchers “the smallest change to the features that would alter the prediction,” says lead author Geemi Wellawatte, a PhD student in White’s lab. “In other words, a counterfactual is an example as close to the original, but with a different outcome.”

Counterfactuals can help researchers quickly pinpoint why a model made a prediction, and whether it is valid.
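The definition above can be illustrated with a minimal sketch. It is not the lab's implementation: the feature-set representation of a molecule, the `toy_model` classifier, and the hand-written candidate list are all assumptions made for illustration, and Tanimoto similarity is used here as one common way to measure how close two molecules are.

```python
from typing import Callable, FrozenSet, Iterable, Optional, Tuple

# Hypothetical representation: a molecule as a set of substructure labels.
Features = FrozenSet[str]


def tanimoto(a: Features, b: Features) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def closest_counterfactual(
    original: Features,
    candidates: Iterable[Features],
    predict: Callable[[Features], int],
) -> Optional[Tuple[Features, float]]:
    """Return the candidate most similar to `original` whose predicted
    label differs, with its similarity; None if no candidate flips it."""
    base_label = predict(original)
    best: Optional[Tuple[Features, float]] = None
    for cand in candidates:
        if predict(cand) == base_label:
            continue  # same outcome, so not a counterfactual
        sim = tanimoto(original, cand)
        if best is None or sim > best[1]:
            best = (cand, sim)
    return best


# Toy black-box model (an assumption): "soluble" (1) if a hydroxyl is present.
def toy_model(mol: Features) -> int:
    return 1 if "hydroxyl" in mol else 0


original = frozenset({"benzene_ring", "hydroxyl", "methyl"})
candidates = [
    frozenset({"benzene_ring", "hydroxyl", "ethyl"}),  # prediction unchanged
    frozenset({"benzene_ring", "methyl"}),             # prediction flips
    frozenset({"chloro"}),                             # flips, but very dissimilar
]
result = closest_counterfactual(original, candidates, toy_model)
```

In this toy case the closest counterfactual is the original molecule with the hydroxyl removed, which suggests the model's prediction hinges on that one group. The search only calls `predict`, so it never needs to look inside the model.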

The paper identifies three examples of how the new method, called MMACE (Molecular Model Agnostic Counterfactual Explanations), can be used to explain why:

  • a molecule is predicted to permeate the blood-brain barrier
  • a small molecule is predicted to be soluble
  • a molecule is predicted to inhibit HIV

The lab had to overcome two major challenges in developing MMACE. First, they needed a method that could be adapted to the wide array of machine learning methods used in chemistry. Second, searching for the most similar molecule in any given scenario was difficult because of the sheer number of possible candidate molecules.

Coauthor Aditi Seshadri, an undergraduate researcher in White’s lab, helped solve that problem by suggesting the group adapt the STONED (Superfast Traversal, Optimization, Novelty, Exploration, and Discovery) algorithm developed at the University of Toronto. STONED efficiently generates similar molecules, the fuel for counterfactual generation. Seshadri was able to join the project through a Rochester summer research program called “Discover.”
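To give a rough sense of what "generating similar molecules" can look like, here is a toy sketch that produces neighbors of a string by applying a few random edits. It is only a stand-in for the idea: STONED itself works on chemically valid string encodings of molecules, whereas the four-letter `ALPHABET`, the `mutate` and `sample_neighbors` helpers, and the edit limits below are hypothetical and make no claim to chemical validity.

```python
import random
from typing import List, Set

# Hypothetical token set for illustration; not a real molecular grammar.
ALPHABET = ["C", "N", "O", "F"]


def mutate(tokens: List[str], rng: random.Random) -> List[str]:
    """Apply one random edit: replace, insert, or delete a token.
    A delete on a single-token string falls back to a replace."""
    tokens = list(tokens)
    op = rng.choice(["replace", "insert", "delete"])
    if op == "delete" and len(tokens) > 1:
        del tokens[rng.randrange(len(tokens))]
    elif op == "insert":
        tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(ALPHABET))
    else:
        tokens[rng.randrange(len(tokens))] = rng.choice(ALPHABET)
    return tokens


def sample_neighbors(origin: str, n: int, max_edits: int = 2, seed: int = 0) -> Set[str]:
    """Collect up to `n` distinct strings within `max_edits` random edits
    of a non-empty `origin`, excluding the origin itself."""
    rng = random.Random(seed)
    found: Set[str] = set()
    attempts = 0
    while len(found) < n and attempts < 100 * n:
        attempts += 1
        tokens = list(origin)
        for _ in range(rng.randint(1, max_edits)):
            tokens = mutate(tokens, rng)
        candidate = "".join(tokens)
        if candidate != origin:
            found.add(candidate)
    return found


neighbors = sample_neighbors("CCO", n=10)
```

Each neighbor would then be passed to the model under study, and those with a different prediction become counterfactual candidates. Keeping the number of edits small is what keeps the candidates close to the starting molecule.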

White says his team is continuing to improve MMACE, for example by searching other databases for the most similar molecules and by refining the definition of molecular similarity.

The project was supported by grants from the National Science Foundation and the National Institute of General Medical Sciences of the National Institutes of Health. The University of Rochester Center for Integrated Research Computing (CIRC) provided computational resources and technical support.

