Unraveling the Secrets of Human Health with Data Science

Sept. 9, 2020

Data Science Fellow Alise Ponsero, PhD, is using data from the ocean to train microbe-hunting machines to seek pathogens underlying diabetic amputations. Ponsero is co-hosted by CyVerse and the University of Arizona Data Science Institute.

The University of Arizona Health Sciences Data Science Fellows program places data scientists in research labs to bring projects to life. Image courtesy of the University of Arizona Health Sciences.

As a graduate student in France, Alise Ponsero, PhD, didn’t suspect that her investigations into ocean microbes might someday help people with diabetes avoid amputations.

“As a microbiologist, I’ve always been interested in understanding how microbes interact with each other,” Dr. Ponsero said. “Then I became interested in taking a step back and looking at how microbes interact with each other in an ecosystem.”

 

As she started zooming out, she found herself studying how ecological changes affect relationships between ocean-dwelling bacteria and the viruses that infect them.

Alise Ponsero, PhD, is one of the first Data Science Fellows at the University of Arizona Health Sciences. Photo by Cyril Carvalho.

Dr. Ponsero is one of the first postdoctoral researchers to participate in the Data Science Fellows Program, which brings together postdocs from computational, health and life sciences to foster communication, collaboration, and the exchange of technologies and best practices.

The new program is funded by the University of Arizona Health Sciences as part of a strategic initiative to help establish New Frontiers for Better Health by integrating data science expertise and cross-disciplinary efforts to further Health Sciences research. The fellows are co-hosted by CyVerse and the University of Arizona's Data Science Institute.

“Data science is the new international language. It is the language that connects all languages, and connects an entire global data set. Every field is going to need it,” said Bonnie Hurwitz, PhD, associate professor of biosystems engineering and clinical instructor with the College of Pharmacy.

Data Science Fellows like Dr. Ponsero have access to vast computational power to mine robust datasets that overflow with information about genetics, health and disease. These new capabilities offer fertile ground to explore innovative ideas that ultimately help unravel the mystery of human health and inform precision care.

“It’s not about the study of a single individual system anymore but being able to be a true hacker of systems and think of how they’re all interacting,” explained Dr. Hurwitz.

Uncovering the meta-organism with data science

To examine meta-organisms, scientists take a metagenomics approach.

“In metagenomics, instead of culturing some organism, we take the DNA from a sample of the microbial population and sequence the whole thing as one meta-organism,” Dr. Ponsero said. “That is a really great tool to look at complex microbial communities.”

 

The University of Arizona's High Performance Computing Center empowers researchers to plow through massive datasets to uncover exciting connections.

Every lifeform has a genome, a book of genes that tells its story. Each gene is a sentence, spelled with the nucleotide letters A, T, C and G. Powerful computers can plow through these books, cross-referencing the letters and uncovering connections inconceivable a generation ago. But someone needs to train those computers, and that’s where data science comes in. It is also where Dr. Ponsero’s career path took a surprising detour.

“I never touched a computer when I did my PhD in molecular biology — I was deep into wet lab,” Dr. Ponsero recalled. “I got interested in what data science is bringing to microbiology, but to make that journey to more computational work, I had to take a break to learn how to code.”

That “break” concluded with a master’s degree in computer science.

Dr. Ponsero is pursuing her postdoc in the Hurwitz Lab, where she is training computers to sift through terabyte-sized metagenomes to identify viral genes and discover previously unknown viruses. The Hurwitz Lab is one of a few in the world that can produce viromes, the portion of a metagenome that comes from viruses.

Hunting viruses with machine learning

A machine can’t produce virus-detecting algorithms out of thin air — it needs training materials. Those datasets are scooped up from the ocean.

“Ocean sciences has been ahead of the curve, creating viromes from different aquatic systems,” said Dr. Hurwitz. “They’ve sequenced across all of these different cellular levels: bacteria, viruses, the entire spectrum. That’s the kind of data needed for the predictive work Alise is doing.”

 

Dr. Ponsero’s machine-learning tool will sort through raw data from ocean metagenomes to detect patterns in viruses that differentiate them from other organisms, continually refining its algorithms until it can pinpoint a virus’s genetic signatures. Nailing down the origin of each gene is challenging, but viruses leave telltale clues.

Bonnie Hurwitz, PhD, is pictured at Woods Hole Oceanographic Institution next to the main chamber of the Alvin submersible, which collects samples from the deepest parts of the ocean not accessible to people.

“The metagenome is a text, composed of sentences. Those sentences can be broken down in words. Viruses won’t use the same words to build DNA,” Dr. Ponsero said.

“Viruses like to replicate, and they do it really fast,” Dr. Hurwitz added. “They’ll use amino acids that are easier to produce — more A’s and T’s instead of G’s and C’s.”
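The "words" in these quotes can be read as short runs of nucleotide letters, often called k-mers, and the balance of A's and T's against G's and C's is commonly summarized as GC content. As a rough illustration only, and not the Hurwitz Lab's actual tool, the Python sketch below counts k-mers and computes GC content. The function names `gc_content` and `kmer_counts` and both example sequences are assumptions made up for this example.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of letters that are G or C (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-letter 'word' in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Two invented fragments, for illustration only (not real genomic data).
at_rich = "ATTATAAATTTAGATAATTA"
gc_rich = "GCCGGCGCATGCCGGGCCGC"

for name, seq in [("AT-rich", at_rich), ("GC-rich", gc_rich)]:
    top_words = kmer_counts(seq, k=3).most_common(3)
    print(name, round(gc_content(seq), 2), top_words)
```

Word counts and GC content of this kind are one common way to turn a sequence into numbers a classifier can learn from. The article does not say which features Dr. Ponsero's tool actually uses.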

From the ocean to ulcers

After mastering the ocean datasets, the machine-learning tool can be given metagenomes from completely different environments — and perhaps find viral culprits for difficult-to-treat diabetic foot ulcers.

“We’re road testing it in the ocean,” Dr. Hurwitz explained. “Then we’ll reapply it in the health space.”

Just like the ocean, the skin is home to a diverse collection of microbes. For most people, these microbes are harmless, but for people with diabetes, a small cut or crack on the foot can open the way for an infection that triggers a cascade of events culminating in amputation. Researchers still don’t know which pathogens are causing most of these infections.

“The viruses that infect foot ulcers are going to be very different than what you see in the ocean, but the same methodology and the same approach would apply,” Dr. Hurwitz explained.

Bonnie Hurwitz, PhD, pictured here with associate professor of pharmacology George Watts, PhD, and research specialist Candice Clark-Mason, uses data science to solve health problems in the lab.

Bacterial infections are treated with antibiotics, but sometimes the drugs mow down one species of harmful bacteria only to free up territory for an even worse pathogen. Preliminary studies in the Hurwitz Lab have shown that happening in diabetic foot ulcers, raising the possibility that alternatives to antibiotics should be considered.

“We’re trying to prevent amputation by coming up with methods for understanding what is in that ulcer to better target and treat the wound,” Dr. Hurwitz said. “Maybe we need to be targeting these viruses instead of the bacteria.”

It is too early in the journey to predict exactly what Dr. Ponsero’s machine-learning methods will reveal about the interactions between viruses, bacteria and diabetic foot ulcers, but it’s an area she and Dr. Hurwitz hope to explore once their tool is ready for prime time.

“I’m most excited to be part of a growing community that will foster new collaboration and new ideas,” Dr. Ponsero said. “The switch from pure microbiology to data science is exciting because I am learning things every day!”

“There are lots of fun discoveries being made right now, especially at that interface between bacteria and viruses and their role in human health,” Dr. Hurwitz added. “Each year, there’s huge leaps and bounds. Buckle up!”
