Computing for Cures
Transcript of BioCentury This Week TV Episode 121
Nancy Kelley, Founding Executive Director and COO, New York Genome Center, New York, N.Y.
Eric Perakslis, Chief Information Officer and Chief Scientist, U.S. Food and Drug Administration, Silver Spring, Md.
Frederick Streitz, Director, Institute for Scientific Computing Research, Computation Directorate and Director of the High Performance Computing Innovation Center, Lawrence Livermore National Laboratory, Livermore, Calif.
PRODUCTS, COMPANIES, INSTITUTIONS AND PEOPLE MENTIONED
American Society of Clinical Oncology (ASCO), Alexandria, Va.
BGI, Shenzhen, China
IBM Corp. (NYSE:IBM), Armonk, N.Y.
Informatics for Integrating Biology and the Bedside (i2b2), Boston, Mass.
Facebook Inc. (NASDAQ:FB), Menlo Park, Calif.
Lawrence Livermore National Laboratory, Livermore, Calif.
National Center for Biomedical Computing (NCBC)
New York Genome Center, New York, N.Y.
One Mind for Research, Rutherford, Calif.
Twitter, San Francisco, Calif.
U.S. Food and Drug Administration (FDA), Silver Spring, Md.
Steve Usdin, Senior Editor
STEVE USDIN: From Cold War computing on the west coast to genome sequencing in Manhattan and federal data mining in Washington, computing for cures. I'm Steve Usdin. Welcome to BioCentury This Week.
NARRATOR: Your trusted source for biotechnology information and analysis, BioCentury This Week.
NARRATOR 2: For the last 100 years, medical progress has been based on research conducted in vivo on living animals and humans. But animal models often fail to predict human responses. And there are strict limits on human testing.
Now, advances in computer hardware and software are making it possible to create incredibly precise models of human biological systems. The new approaches also promise to more quickly make good use of the vast quantities of data being generated from the human genome and to help physicians make better decisions using data that are collected throughout the world as part of routine medical care.
STEVE USDIN: We'll hear today from FDA's chief information officer, Eric Perakslis, about how the agency is trying to harness big data to help companies create better medicines and from Nancy Kelley, director of the New York Genome Center, about the Big Apple's ambitious efforts to merge medicine and information technology. But first, I'm pleased to be joined by Fred Streitz who directs the Institute for Scientific Computing Research at Lawrence Livermore National Laboratory, which runs some of the world's most powerful computers.
Fred, I wanted to start-- the project we're talking about today is a computer simulation of the heart. People have been simulating human hearts for 50 years. What's different about what you've done?
FRED STREITZ: Well, what we've done is we've harnessed the power at Lawrence Livermore of really extreme levels of computing. We have, of necessity, some of the world's very largest computers to handle our security mission, which we do for the federal government. Along with those computers, we have the expertise to be able to use them to solve very real problems. The work we've done with modeling the human heart, we've taken that expertise, exploited the fabulous computing to develop the highest resolution model that's currently available of the physiology of a human heart.
STEVE USDIN: So you've modeled it at the level of about 370 million cells. And you can tell electrophysiologically what happens in those cells and how they interact with other cells. What does that allow you to do, having that level of resolution?
FRED STREITZ: Well, for one thing, if you have that level of resolution and you have all the information-- and that's one of the beauties of doing a computer simulation is that you don't just get the final answer. You don't just get the aggregated piece of information. We can follow the electric cells as they move-- I mean, the electric signals as they move through the heart.
So what we've been able to model is the creation of an arrhythmia in the presence of a particular drug. Now, this for us was a test study because we were modeling the presence of a drug that we knew caused an arrhythmia. So this was anticipated. But no one had ever been able to watch an arrhythmia develop transmurally where the reentrant pattern in the signal occurred perpendicular, essentially, to the surface of the heart.
STEVE USDIN: And an arrhythmia is a heart going out of sequence, going out of its rhythm. And that's basically--
FRED STREITZ: That's correct.
STEVE USDIN: --An electrical function. So you're saying then you can take this model of the heart. And you can see why that drug caused an arrhythmia. Can you also use this model then to predict in the future whether a drug would cause an arrhythmia or whether it would cause it for particular kinds of patients?
FRED STREITZ: Well in principle, yes, if we understood the chemical behavior of the drug so we could model what effect the presence of the drug has on the various ion channels and the various ion gates that are known to coordinate the creation of the electric signal in the heart. If we know that connection, then absolutely we could take a drug and introduce it-- I'm going to use that word loosely. But we introduce its effect into the computer simulation and watch it play out.
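Streitz's workflow -- represent a drug's effect as a change to ion-channel parameters, then watch the simulation play out -- can be illustrated with a toy model. The sketch below is not the Livermore code; it uses the classic FitzHugh-Nagumo single-cell equations, and the "drug" is a hypothetical agent that halves the recovery rate `eps`, prolonging the action potential the way a repolarization-blocking drug would.

```python
def simulate(eps, t_max=300.0, dt=0.05):
    """Integrate a FitzHugh-Nagumo cell with forward Euler.

    eps sets the speed of the recovery variable w; a smaller eps
    mimics slowed repolarization (a stand-in for a drug that blocks
    repolarizing ion channels). Returns the time the membrane spends
    depolarized (v > 0), a crude action-potential duration."""
    a, b = 0.7, 0.8
    v, w = -1.1994, -0.6243   # resting state for zero applied current
    depolarized_time = 0.0
    t = 0.0
    while t < t_max:
        stim = 1.0 if t < 5.0 else 0.0   # brief stimulus triggers one action potential
        dv = v - v**3 / 3.0 - w + stim
        dw = eps * (v + a - b * w)
        v += dt * dv
        w += dt * dw
        if v > 0.0:
            depolarized_time += dt
        t += dt
    return depolarized_time

apd_baseline = simulate(eps=0.08)
apd_drugged = simulate(eps=0.04)   # hypothetical "drug" halves the recovery rate
```

Halving `eps` lengthens the time the simulated membrane spends depolarized, a single-cell analogue of the drug-induced action-potential prolongation that can set up the reentrant arrhythmias described above.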
STEVE USDIN: So you did this work using the world's fastest supercomputer. That computer is going to be taken off civilian work and devoted to your main work, Lawrence Livermore's main work, the Stockpile Stewardship Program. Is your program going to continue in the absence of that computer?
FRED STREITZ: Oh yes, absolutely. We have a smaller machine. It's not the size of the Sequoia computer, which is this third generation IBM Blue Gene computer.
The machine that we have available is called Vulcan. And it's a five petaflop machine. So that means it's capable of doing five million billion floating point operations each and every second. It's the little baby brother to our giant Sequoia computer. But it'll be available for this work and for other work that we do with our industrial partners.
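As a back-of-the-envelope check on what "five million billion operations each and every second" buys, one can estimate how quickly a machine of Vulcan's class could step a 370-million-cell model. The per-cell operation count below is a made-up illustration, not a figure from the Livermore code.

```python
flops = 5e15                  # Vulcan: 5 petaflops = 5 million billion operations/second
cells = 370e6                 # cells in the heart model discussed above
ops_per_cell_step = 10_000    # hypothetical cost of updating one cell for one time step
steps_per_second = flops / (cells * ops_per_cell_step)  # roughly 1,350 steps/second at peak
```

Even at peak speed, a whole-heart model at this resolution would advance only on the order of a thousand time steps per second under these assumptions, which is why work like this needs machines of this class.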
STEVE USDIN: So what's the next step? You've created this model. You've demonstrated that you can show how the heart works electrophysiologically in real time. What needs to happen for it to go from you doing it in Lawrence Livermore to it actually helping patients in a hospital or companies developing drugs?
FRED STREITZ: Well, so there are a number of steps that have to happen. First and foremost, what we've developed is a research capability. We're still developing the code. We have to validate the code. We have to verify that the results are correct.
Those steps have to happen long before we would pretend to use this in something that would involve human health. It can get there. But that's a validation and verification study. That's first and foremost.
Second thing is that we want to introduce the actual contraction of a human heart. Right now we model only the electrophysiology of the heart, which is to say the electric signals that are moving around. Once we have that information though, we can actually model the contraction of the muscle tissue as well. Putting that in, we'll have an entire picture of the behavior of a human heart, both a functioning human heart or one that has problems, like an arrhythmia.
STEVE USDIN: So you're going to have the electrical action and the physical compression of the heart. That allows you to get to something, for example, maybe like congestive heart failure.
FRED STREITZ: That's correct. That's correct. And continuing that even further, if we put in a fluid model, for instance, now when the heart actually beats, it'll pump a fluid. And we can model that behavior as well. And with that, we can actually model, for instance, the blood flow and heart failure modes that are associated with poor flow.
STEVE USDIN: So how long-- very briefly, we've just got a few seconds left. How long between your work, the work where we are now, and actually being able to have a clinical application?
FRED STREITZ: Well, it's hard to answer that. It depends a lot on what level investment there is in pushing the work forward for a particular clinical application but also what level of computing is available generally.
So this work was developed on one of the world's largest computers. Clearly, hospitals don't have that level of computing right now. But certainly in the 5, 7, 10 year time frame, that kind of computing will be available. And I think that we could be there.
STEVE USDIN: So Fred, what do you think is the first application that's going to be used-- that will come out of the work that you're doing?
FRED STREITZ: Well, I think the most direct application of this kind of modeling capability will be for drug companies to be able to investigate the effects of their particular drug, novel drugs that they're developing for whatever reason, not just heart medication, but antibiotics or--
STEVE USDIN: To determine whether they cause arrhythmias or other kinds of heart problems.
FRED STREITZ: Right, because the most expensive thing you can do in developing a drug is to get to a late stage in the clinical trial and find out that it fails. And that happens a lot. If we can catch a failure early rather than late, that's an enormous amount of savings to the company.
STEVE USDIN: Would you also be able to model particular individuals' hearts maybe and be able to determine if a drug is likely to cause a problem for a patient who has particular characteristics or a particular kind of heart problem?
FRED STREITZ: In principle yes. Now the trick there is that we would actually need personalized heart data. So we would actually need a detailed image of the person's heart. That information is hard to get, not impossible.
What I could imagine doing though is, as you accumulate these, you now have an entire pool of data that you can compare against. So when you're trying to develop a drug, you can compare that drug, not just against a single heart, but against an entire population of hearts on a computer.
STEVE USDIN: So the original model that you created now, it's one heart. And different individuals, obviously, are going to have different kinds of physiology in their heart.
FRED STREITZ: That's exactly correct. And so as we accumulate individual heart sets or artificial heart sets-- so we can investigate, for instance, some people have slightly thicker heart walls. Well, we can just do that on the computer. We can take a model and make a slightly thicker heart, slightly different concentration of M cells. All these things can be done on the computer.
It's, in some sense, creating a representative pool of hearts that now a drug company can investigate the drug behavior against this entire population. And they can do that on the computer.
STEVE USDIN: So it's a lot actually like your nuclear weapons stewardship program. You can't go around blowing up nuclear weapons anymore. You've got to be able to simulate what they're doing. You're applying the same kind of thinking, the same kind of computing power to the human heart.
FRED STREITZ: That is exactly correct. And you used the right word. It's the same sort of computing power. But it's the same type of thinking. It's like how do you get answers to questions when you can't just go out and make the measurement? That's what we use our computers for.
STEVE USDIN: And I introduced you as the director of a scientific computing institute at Lawrence Livermore. But you've got another hat that you're wearing that's oriented toward reaching out to industry.
FRED STREITZ: That's right. I'm also the director of a new center at the laboratory which is the High Performance Computing Innovation Center. And industries that are out there interested in partnering and accessing some of this capability can look us up on the web. We're at hpcinnovationcenter.com.
STEVE USDIN: Great, well thanks.
When we return, we'll move from the heart to another organ and talk about how the FDA is using big data to predict if drugs might damage the liver.
NARRATOR: BioCentury, named the 2012 Commentator of the Year by the European Mediscience Awards for excellence in communications and clear, concise commentary.
You're watching BioCentury This Week.
STEVE USDIN: We've heard how a laboratory born in the Cold War is using the world's most powerful computers to simulate the heart. FDA is starting from the opposite side of the spectrum. Its hard drives and filing cabinets constitute the world's largest collection of data on how drugs affect human biology. But until recently, the agency's had virtually no ability to turn that data into useful knowledge.
Making that happen is Eric Perakslis' job. He's FDA's chief information officer and chief scientist. So Eric, I want to just start. What can FDA do to make those mountains of data usable in endeavors to help people make new drugs?
ERIC PERAKSLIS: I think there's a lot of opportunity there. Actually, I think it's almost boundless, as you said. I think they're the richest data sets on the planet. First and foremost, we have to build systems and technologies that make that data efficient and accessible to folks within the agency so that we can speed up the review of drugs, look across different drugs for other effects, find new uses for old drugs, etc.
And also take the parts of that data that are appropriate and make it available externally to innovators that can actually do the thinking we probably won't be able to do and get those ideas moving. I think big data needs big ideas and that we don't have all the big ideas in the FDA, although we've got some.
STEVE USDIN: So, one of the early examples of what you're doing with big data is the Liver Toxicity Knowledge Base. And you're tackling a really important problem. Liver toxicity is the biggest reason for drug withdrawals, and it's also the biggest reason drugs fail in clinical trials. What's the Liver Toxicity Knowledge Base about, and how are you making it work?
ERIC PERAKSLIS: So, the FDA has the National Center for Toxicological Research that's based in Little Rock, Arkansas. And for more than 40 years, they've been running animal tests and pre-clinical models, human tests, aggregating data from all of those in a way that has produced some of the richest toxicology data sets that are actually targeted towards drugs. Hundreds of drugs.
They also have the data sorted in such a way where you could look, for example, at neuroscience drugs, at cancer drugs, at things like that, and really make some sense of some discovery. This group has also been fantastic. They're really a bioinformatics core at FDA, and they've actually been publishing this data for a long time through the Liver Toxicity Knowledge Base, putting data sets in GEO, etc.
What they hadn't quite done yet is take that data and put it into what I would think of as a translational form where you could take it and say, well, in the context of these clinical trials, in the context of these disease areas, in the context of these mechanisms or diagnostics, can you take clinical data, pre-clinical data, biomarker data, or diagnostics data and start to either validate new targets or validate new safety profiles for drugs.
STEVE USDIN: So really just to create some tools, or maybe in the computer world what we call apps. Some way of applying this data in the real world rather than just throwing it out there.
ERIC PERAKSLIS: Absolutely.
STEVE USDIN: What are some other examples of big data projects that FDA's working on?
ERIC PERAKSLIS: I think there are some. I mean, we're obviously just getting to think about next generation medical safety systems at the agency. You mentioned apps. That's a big thing right now, right? Not only having apps within FDA that make people's jobs better, whether it's a reviewer looking at lots of documents, or whether it's an import inspector inspecting drugs as they come off a ship from somewhere.
Right, what is the app for that that you need? There's a lot of them. But also in health care, the idea of an app where your mom can tap the pictures of her drugs and actually make sure that they're all safe to be taken together. It's almost an endless need to do that. So on one end, big data's a logical and normal extension of our day to day jobs.
I also think, though, that big data does take big ideas, as I said. So the idea here is, there's a lot of work at meetings like the one we're at today on learning health systems. We talk about electronic medical records. How do you take electronic medical records out of the hospital and put them into the drug discovery and development setting? What actually happens after a drug's approved?
STEVE USDIN: And when you're talking about a learning health care system, some of the elements you need, obviously you need researchers and physicians who are gathering data, obviously you need patients, but you also need buckets to put it into-- databases, repositories, something like that. Do those exist, or is that something that FDA can help to create?
ERIC PERAKSLIS: I think some of them exist. Obviously through a lot of efforts, health care has made great strides in moving to electronic medical records. Even many community-based practices use them now. So I think we're getting pretty good at getting the data. It's taking that data and using it for something else that's really the next step.
And I think there are lots of open-source, government-funded tools out there, like the NCBC tools and i2b2. There are a lot of great ones I've used. So I actually believe we do need tools. I don't think we need to spend a lot of money and time on that part of it, because there's a lot of stuff that's already out there. I'd rather put the money and the time into ideas and use cases for the tools.
One of the great things about big data is asking what questions the data should ask of itself. We're great. We can come up with hundreds of hypotheses for a data set. Well, the data sets can come up with thousands of hypotheses for the data sets, if you ask them in the right ways. And I think we have to move to that, and I think the technology is there.
STEVE USDIN: And also, again sticking with this idea of apps, one of the things that made the iPad, that made Apple what it is is opening that up. Creating a platform and a language where anybody could come in and create an app. Is there any way that you can open up some of the data at FDA for that?
ERIC PERAKSLIS: Yes, we are. So the healthdata.gov initiative is really the place where all of us in HHS try to put data, and FDA already puts a lot of data there. Two legs on that approach. One leg is making that data more useful, as we already said. So it's great to put out these flat data sets, but give us an app that makes it useful. So that's one piece. The other piece is that there's a lot of data that's not out yet, or not really ready in a consumable form.
STEVE USDIN: Well, let's talk about that. We'll continue our discussion about FDA's big data initiatives in just a moment. Later, we'll hear from Nancy Kelley, founding executive director of the New York Genome Center. First, here's some data about the mountains of information the FDA's sitting on.
NARRATOR: Now back to BioCentury This Week.
STEVE USDIN: We're talking with FDA's CIO Eric Perakslis about using big data to speed the search for cures. Eric, you were talking about two legs before the break. What's the second leg?
ERIC PERAKSLIS: The second leg is making new data sources available. We've got lots of data that has never been made available. Some of that is simple because it's FDA-generated data. Some of that data is what we call sponsor data, or data from the pharmaceutical companies, the device companies, the manufacturers that send data. And of course, that data requires the right type of permission to make public, but there's a lot of interest from them and from us in doing that.
STEVE USDIN: But isn't there an ability-- You don't have to necessarily make that data public to be able to make that data useful? For example, to mine across a lot of it and even looking at placebo arms of studies?
ERIC PERAKSLIS: Absolutely. And I think that's actually our responsibility to do more of that. I think one of the things the FDA's done a great job of in the last 10 years is getting rid of the paper and going more electronic. If all that data is sitting on pallets of paper in a basement, it's difficult to mine, although more and more, we're getting everything in electronic and keeping it electronic, which actually makes it all now searchable using text searching, Google, these types of things.
If we get a cancer drug in, we should be looking at other cancer drugs and doing some comparison in the context of getting that drug out.
STEVE USDIN: And I wanted to ask you about a specific initiative. I understand that FDA's working with-- speaking of cancer-- the American Society of Clinical Oncology, ASCO, on a health care learning project.
ERIC PERAKSLIS: Yes. We haven't done anything formal yet, but we're very involved. I think ASCO has done a great job in taking that community, the community of practicing oncologists, and working with them and convincing them, quite frankly, to share a lot of the electronic medical record data of their patients. I've worked with lots of consortiums to try to bring data together.
And the slower, harder step isn't the technology, it's not IP, it's not the legal side. It's really getting the understanding of the data down-- and ASCO has that figured out already. So I think hopefully we will be working with them formally very soon. In the context of a learning health system-- and what I mean by that is, we understand drug discovery and development, that's a very online process.
We understand drug review, at least with the FDA, that's a very online process. Once a drug is approved--
STEVE USDIN: It's out there.
ERIC PERAKSLIS: --it kind of gets quiet. And then you wait to hear things back. And sometimes those things are good and sometimes those things are bad.
STEVE USDIN: And you hear them back anecdotally, at first.
ERIC PERAKSLIS: Absolutely. But with the ASCO system, they'll be able to bring that part of cancer treatment online. What's working? What's the best way to minimize side effects? What's the next best thing to try in a tumor type where there wasn't a drug before? Et cetera. So I think it's an excellent experiment and exactly the type of thing that FDA should and will get involved with.
STEVE USDIN: And that also brings a question up. You mentioned about side effects with adverse effects. My understanding of the way that adverse effects are collected now, it's almost random. If a doctor or if a patient decides to tell FDA about a side effect, then they do. You have no idea of what the n is, how many people have that side effect, how many people are getting that drug or something. Is there anything that you're doing, perhaps using social media that could improve that?
ERIC PERAKSLIS: Social media's really great, and social media's a good example. The issue with side effects is that they're often patient-reported. It's non-validated data. And I'm a very experienced scientist, but I'm a neophyte regulator. And the difference between data and evidence is actually very large. We have to listen to it all, and we do. It's just it's a very noisy, noisy process.
STEVE USDIN: It's not a matter of getting a Facebook page and saying like or not like.
ERIC PERAKSLIS: It's not that simple. What you need is data over time. You need to see those themes, like, oh, headaches do appear to be happening to a larger group of people than we thought, and mine that. And I think social media is absolutely where a lot of this will go. Right now it's on paper, email, people pick up the phone. Yet go to Facebook, and people are all talking about it in a very organized way.
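The idea of watching themes emerge from data over time can be sketched in a few lines. This is a hypothetical illustration, not an FDA system: `flag_emerging_terms`, the threshold, and the sample data are all made up for the example.

```python
from collections import Counter

def flag_emerging_terms(weekly_mentions, baseline_weeks=4, threshold=3.0):
    """weekly_mentions: list of Counters, one per week, mapping a
    side-effect term to the number of posts mentioning it.

    Flags terms whose latest-week count is at least `threshold` times
    their average over the preceding baseline window."""
    history = weekly_mentions[:-1][-baseline_weeks:]
    latest = weekly_mentions[-1]
    flagged = []
    for term, count in latest.items():
        baseline = sum(week.get(term, 0) for week in history) / max(len(history), 1)
        if baseline == 0:
            if count >= threshold:        # brand-new term with a burst of mentions
                flagged.append(term)
        elif count / baseline >= threshold:
            flagged.append(term)
    return flagged

# Hypothetical data: four quiet weeks, then a burst of headache reports.
weeks = [Counter({"headache": 2, "nausea": 1}) for _ in range(4)]
weeks.append(Counter({"headache": 9, "nausea": 1}))
flagged = flag_emerging_terms(weeks)
```

A real signal-detection system would normalize for overall posting volume and use disproportionality statistics rather than a raw ratio, but the theme-over-baseline idea is the same.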
STEVE USDIN: And is there a way that FDA can reach out and start mining the information from Facebook, from Twitter, from the kind of social media conversations that are going on around them?
ERIC PERAKSLIS: Absolutely. We do so now, and need to do more of it. I think it's a matter of setting up some very specific and intentional experiments. Again, working with a partner like ASCO, or working with a group like One Mind that's focused on brain disease. It makes a lot more sense when you can do it in the context of a theme or a set of experiments as opposed to just at random.
STEVE USDIN: We've got a little bit less than a minute left. I want to ask you very quickly about the initiatives that you're doing on global quality and trying to use IT to track that.
ERIC PERAKSLIS: One of the most fun things we're working on right now, and one of the most important, is really modernizing the FDA inspector. How do we inspect things? How do we inspect at the several hundred ports of entry in this country? How do we inspect pharmaceutical manufacturers, et cetera? And right now, it's kind of an audit-type process.
And it's time-based or procedure-based. Like, you file for a drug, we're going to come inspect you before your drug gets approved. We need to do it risk-based, which means using analytics, social media, global surveillance, law enforcement data, et cetera, to actually ask the questions and drive what we're inspecting and how.
STEVE USDIN: Great. Well, thanks very much. We've been discussing FDA's efforts to use computing power to mine its mountains of data. When we return, we'll hear how New York research institutions are collaborating to overcome one of humanity's biggest data challenges: making sense of the human genome.
NARRATOR: Now in its 20th year, visit biocentury.com for the most in-depth biotech news and analysis. And visit biocenturytv.com for exclusive free content.
STEVE USDIN: We're joined today by Nancy Kelley, the founding executive director of the New York Genome Center. Nancy, I wanted to start by asking you, the New York Genome Center aspires to be the largest genome center in North America. What will it mean for patients and for researchers to have an institution of that scale in New York?
NANCY KELLEY: So, we're bringing together large amounts of genomic information to determine variations, in order to figure out clues about who might get certain diseases, how to treat those diseases, and what medications a person might respond to. So that all matters when you're a patient. And for researchers who are looking for clues as to what actually causes human disease, these variations offer a lot of information.
STEVE USDIN: And you created the center by getting eleven New York institutions to collaborate. There hasn't been a lot of history in New York of institutions collaborating. New York hasn't become the kind of biotech hub that Boston and San Francisco have become. What kind of role do you think the New York Genome Center will play in catalyzing more activity in the biotech sector?
NANCY KELLEY: So, the New York institutions have collaborated in the past, but not as successfully as institutions in other geographic locations. In this instance, I think that there was a realization that sequencing, and interpretation of the sequencing data, would play such a large role in clinical delivery in the future that the institutions had no choice but to invest in this.
And it became critically clear that no one could do it alone, that they actually had to do it together to get to the scale that would be necessary.
STEVE USDIN: And it's interesting, what you're doing is different from some of the other really big genome institutes, for example, BGI in China, which is the biggest genome institute in the world, because you're going to be focusing a lot of what you're doing on clinical care, integrating genomes into clinical care. What's your vision for that? How is that actually going to play out in practice?
NANCY KELLEY: So, obviously a patient will go to their doctor, or to the hospital, and they'll have their genome sequenced, either a portion of it or the whole thing, either for preventive care or, if they're sick, to determine what is actually making them sick.
Having your cancer sequenced, for example, tells you a lot about the treatment that you need. And so having this done in the clinical environment is what is actually going to affect patients' lives.
STEVE USDIN: Great, well, thanks very much, Nancy Kelley from the New York Genome Center. And thank you for watching. I'll see you next week.