16-23 A1 Big Data


When doctors can take advantage of massive amounts of data on patient outcomes, lives will be saved. We look at one of the first efforts, an attempt to identify dangerous drug interactions, and the difficulty in convincing other doctors that “crunching numbers” can provide adequate proof. A researcher and a reporter involved in the case explain.


Dr. Nicholas Tatonetti, Assistant Professor of Biomedical Informatics, Columbia University

Sam Roe, reporter, Chicago Tribune

16-23 Drug Interactions and Big Data

Reed Pence: Just about everyone gets prescribed medications from time to time. Many of us are prescribed multiple drugs at the same time and it’s natural to assume that our doctors and pharmacists know what that means and take drug interactions into consideration, but is that necessarily true?

Dr. Nicholas Tatonetti: Nearly everyone in the country, as they age, will be taking two or more drugs at some time in their life and the truth of the matter is we really don’t know about what will happen to the human body when it takes these two drugs together.

Pence: That’s Dr. Nicholas Tatonetti, Assistant Professor of Biomedical Informatics at Columbia University.

Tatonetti: It’s not studied very well and we can only estimate its effects so some studies have shown that up to 30% of all adverse drug effects are actually caused by drug interactions and not the drugs themselves. So that means they’re most likely avoidable drug effects. So by just studying drug interactions and understanding when they’re going to interact, we could reduce the number of effects by upwards of 30%.

Pence: Tatonetti says he first had his eyes opened to this problem a few years ago when he was researching something else.

Tatonetti: I was really interested in ways that we could combine drugs together to treat new diseases or come up with new treatments for existing disease and I had downloaded a database to try to investigate that, which is this FDA database that they make publicly available. Anyone can download it right now and it’s a database of adverse event reports, so somebody takes a drug and they don’t feel so good or their physician observes something that’s happening and either the patient or the physician or the drug company will fill out a report that describes the adverse event that occurred and then the FDA uses these to try to figure out if a drug is causing a side effect in their post-marketing surveillance systems where they try to keep track of the safety of all of the drugs that are currently being used. It turned out, though, that there were a lot of issues with using this resource, there’s biases in the data, there are various confounding variables from the ways in which drugs are prescribed that make it very difficult to tell which drug is causing which side effect.

Pence: To figure out which drug caused what, Tatonetti did something unusual. He wrote his own algorithm. He could then pare down the information and use this new algorithm to sort through the data from the internet in a way that had never been done before.

Tatonetti: The algorithms are good at doing things that humans can’t. So humans are very good at intuition, at having very good, solid models of pharmacology and disease biology and that’s everything your physician is doing when you walk in to the clinic. They look at you and they already can tell a lot of things about your state, about your diseases, about drugs, and they know intuitively because of all of this training that they’ve had. But where algorithms excel is in holding enormous amounts of data, much of which is not very useful. So a physician will take a lot of information in and immediately dismiss most of it because it’s not going to be very informative to them for their current decision-making. But in an algorithm, we can actually load all of that data up into the algorithm’s memory. By analyzing the interactions of all of these different data elements at the same time, much more than any single human could hold in their head, it could identify relationships that we could not have otherwise identified.

Pence: The term “big data” gets thrown around a lot, but any database as extensive as the FDA’s collection of adverse event reports for drugs on the market certainly qualifies. So how does the algorithm sort through this data? Tatonetti says it’s actually pretty simple in the end.

Tatonetti: It’s a pattern-recognition algorithm, so we teach it to recognize a certain type of drug, so we’ll train this algorithm to recognize when a drug is going to cause increased blood glucose or diabetes-related adverse events, and the more examples we show our algorithm, the more confident it gets that it could recognize a pattern of these adverse events being caused. And so then what we do is we say, “Okay, we’ve trained you, we’ve shown you a bunch of flashcards,” we’ve shown our algorithm a bunch of flashcards on what makes a drug cause an adverse event. And now what we do is we take a novel example, we take two drugs, we pair them together, and we show this to the algorithm, something the algorithm has never seen before and then we ask the algorithm to say, “Does this look like one of these adverse event-causing drugs or not?” And it gives it a score. For every pair of drugs, we are able to score them using this pattern-matching algorithm and then we can rank them by that score and the ones at the top, the ones with the highest probability of matching that pattern, are the ones that we suspect are causing drug interactions.
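[Editor's illustration: The train, score, and rank steps Tatonetti describes can be sketched roughly as follows. This is not his actual algorithm, which the program does not specify. It swaps in a simple nearest-centroid scorer, and every drug name, symptom column, and number below is made up for illustration.]

```python
from math import sqrt

# Hypothetical feature order: reporting rates for
# [thirst, frequent urination, fatigue, headache, rash].
# All values are invented; they are NOT taken from the FDA database.

def centroid(rows):
    """Average a list of equal-length profiles column by column."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def cosine(a, b):
    """Cosine similarity between two profiles (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def train(examples):
    """'Flashcards': (profile, label) pairs; label is True when the drug
    is known to raise blood glucose. Returns the two class centroids."""
    pos = centroid([p for p, label in examples if label])
    neg = centroid([p for p, label in examples if not label])
    return pos, neg

def score(model, profile):
    """Higher score = looks more like the glucose-raising pattern."""
    pos, neg = model
    return cosine(profile, pos) - cosine(profile, neg)

def rank_pairs(model, pair_profiles):
    """Sort drug-pair names from most to least suspicious."""
    return sorted(pair_profiles,
                  key=lambda name: score(model, pair_profiles[name]),
                  reverse=True)

# Training examples (single drugs with a known effect; hypothetical).
flashcards = [
    ([0.8, 0.7, 0.5, 0.1, 0.0], True),
    ([0.6, 0.9, 0.4, 0.2, 0.1], True),
    ([0.1, 0.0, 0.2, 0.7, 0.6], False),
    ([0.0, 0.1, 0.1, 0.5, 0.9], False),
]
model = train(flashcards)

# Novel examples: adverse-event profiles of drug pairs the model has not seen.
pairs = {
    "pair_x": [0.7, 0.8, 0.6, 0.1, 0.1],
    "pair_y": [0.1, 0.1, 0.1, 0.6, 0.7],
    "pair_z": [0.4, 0.4, 0.3, 0.4, 0.4],
}
ranked = rank_pairs(model, pairs)
print(ranked)  # most suspicious pair first
```

[The pairs at the top of the ranked list are only candidates. As the rest of the segment makes clear, they still need to be checked against patient records and experiments.]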

Pence: Tatonetti says that once the algorithm was in place, it wasn’t long until he started finding some answers.

Tatonetti: One of our first classifiers, we found a drug interaction that we did not expect. It was between two really popular drugs and the association that our algorithm was telling us was that these two drugs were causing increased blood glucose, but it was something that neither drug was associated with at all. It really wouldn’t be something a physician or anyone would expect for these drugs to do when prescribed together, so it seemed pretty unlikely. But they are very popular, so if it really was happening, it could be potentially high-impact.

Pence: The two drugs Tatonetti identified were paroxetine and pravastatin. Paroxetine is used to treat depression, anxiety, and OCD while pravastatin is for high cholesterol. Because they treat very different things, doctors sometimes prescribe them to the same patient.

Tatonetti: I went out and I got electronic health records at three different hospitals: Stanford, Vanderbilt, and Partners HealthCare System, and I asked a very simple question which was, “Are patients who are being seen for clinical practice, and who are being given these two drugs, paroxetine and pravastatin, do they have their blood glucose go up after they take these two drugs?” And at each of those institutions, the answer was yes. So we found quite a number of cases where patients were prescribed both of these medications within a short window of time and their blood glucose went up.
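[Editor's illustration: The "very simple question" asked of the health records amounts to comparing glucose readings before and after exposure, for patients on both drugs versus everyone else. The sketch below assumes a hypothetical record layout and uses invented readings. It is not the study's data or method, and a real analysis would also need to control for confounders.]

```python
from statistics import mean

# Hypothetical records:
# (patient_id, on_both_drugs, glucose_before, glucose_after), in mg/dL.
records = [
    ("p1", True, 100, 112),
    ("p2", True, 95, 103),
    ("p3", True, 110, 120),
    ("p4", False, 98, 99),
    ("p5", False, 105, 103),
    ("p6", False, 101, 105),
]

def mean_change(rows, exposed):
    """Average (after - before) glucose change for one group of patients."""
    return mean(after - before
                for _, on_both, before, after in rows
                if on_both == exposed)

exposed_change = mean_change(records, True)    # patients on both drugs
control_change = mean_change(records, False)   # comparison patients
difference = exposed_change - control_change   # extra rise seen with the pair

print(exposed_change, control_change, difference)
```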

Pence: Encouraged that he was on the right track, Tatonetti began telling other experts about the possible interaction. Unfortunately, he says most people he told couldn’t believe that these two drugs would interact in a meaningful way. After all, he hadn’t done any studies on it, only crunched numbers on drug complaints. So Tatonetti tested the drugs on mice.

Tatonetti: And over the course of about six months, I had conducted an experiment on 50 different mice, where we fed them this high-fat diet and then we took their glucoses and then we exposed them to drugs, and we took their glucoses and we found almost exactly what we found in the humans, which was that the mice that were exposed to both paroxetine and pravastatin had significantly higher glucoses than the other mice in the study. And in fact, further than that, even more striking was the mice in the study had a 16 milligram per deciliter increase in their blood glucose compared to control, and the humans in our electronic health record evaluations also had a 16 milligram per deciliter increase in their blood glucose compared to the controls.

Pence: Buoyed by this success, Tatonetti used his system to make a few more hypotheses. That’s about the time that his findings caught the eye of Chicago Tribune reporter Sam Roe. Roe was researching drug interactions for an investigative series and went to Tatonetti proposing a way to double-check his hypotheses.

Sam Roe: We made predictions and then went to the database at Columbia of patients going back to the late 1980s. There’s millions and millions of patients, billions of clinical measurements, everything from lab orders to demographics to indications that they had and to see if the predictions that we made in the FDA database was happening in a hospital setting. So it was really marrying those two databases together.

Pence: Now that may sound inexact, but Roe says the process is actually similar to the roadmaps used in other scientific disciplines.

Roe: It’s very similar to how astronomers discover black holes. You can’t see a black hole directly, you only know that they’re there based on the side-effects around them. And big data is very similar. There’s so much data out there that you may look at it directly and you don’t see anything, you don’t see people reporting that, “Hey, if I take drug A and drug B, I’m going to have this side effect.” People may not be reporting that, but that doesn’t mean that it’s not there. So what scientists are able to do with these new algorithms, they’re able to go in there and, like astronomers searching for black holes, they look at side-effects of all the different drugs out there, and then they make predictions based on that: these drug combinations are causing all of these secondary side-effects, it’s possible it’s causing a major side effect and no one knows it, because they’re just not reporting it.

Pence: And if that’s true, Roe says it’s likely that the problem won’t be going anywhere anytime soon.

Roe: The risks are increasing because more and more folks are taking multiple drugs. One in five Americans takes three or more drugs, one in 10 takes five or more, that’s twice the percentage as back in 1994. We have an aging population, so more and more folks are going to be on drugs. There seems to be more reliance in this country on medications for certain problems, and there’s greater trust in marketing in some of these medications. You know, you can’t turn on your TV without seeing some of these drugs being advertised. So I do think that this is an issue that’s going to become increasingly important to folks. It’s hard to find especially older people who aren’t on multiple medications. The reaction we’ve gotten from a lot of folks is like, “Oh my God, I better check my mother’s medicine cabinet or check grandma’s purse to find out what she’s taking. I know she’s on multiple meds, I’ve been meaning to do that and I haven’t.”

Pence: But will medicine accept those conclusions based solely on big data? Well, not without a close look. Roe says there needs to be testing: months, maybe even years’ worth of testing. But in the meantime, he hopes someone can use the predictions to protect consumers.

Roe: There are so many different drug combinations, is it practical for drug companies and the FDA to do all this testing to figure it out? My answer to that would be yes. I think if you’re going to sell a drug and you’re going to make money on it, then you better make sure it’s not interacting with other commonly prescribed medications in a way that could potentially hurt somebody or kill somebody. I don’t think the answer is just throwing up your hands and saying, “Well, too many drugs, nothing we can do about it! Buyer beware.” I think there’s got to be a better solution than that.

Pence: Whether testing becomes mandated or not, at least researchers now have some way of testing drug interactions. Sorting through massive amounts of information is nothing new anymore. In recent years, big data has been used to bolster companies’ website clicks, introduce “Moneyball” to baseball, and power countless other applications. But Roe says using big data to empower the medical community is especially gratifying.

Roe: I think what’s really exciting about this is that we’re using big data in a way that can really help the public. There’s a lot of talk about big data and everyone recognizes that it’s cool and it’s interesting and it’s hip and all that. But you know, it also can really help save lives and I think doctors and scientists and journalists now I guess are tackling it that way, of asking some really big questions. Can we take all this data that’s being collected and stored daily, enormous amounts of data, I mean millions and billions of bits of information daily, I mean when you think about it. And not just in hospitals but tweets and Google searches and you name it, can we start taking some of this stuff and mining it in meaningful ways? Not mining it to find out who’s searching for Donald Trump today versus yesterday, but try to mine that in a way that can actually prevent harm and save lives.

Pence: To find out more about big data in medicine and to read Sam Roe’s Chicago Tribune series on the topic, head to radiohealthjournal.net. You’ll find archives of our programs there, too, as well as on iTunes and Stitcher.

Our writer-producer this week is Evan Rook. I’m Reed Pence.

