Jun 19 2017

It’s Benign!


My surgeon Dr. Shawna Willey walked into the patient exam room where I waited nervously. I first saw her thumbs up before her beaming face. I could breathe again!

My friends and I, who recently turned 40 and started our baseline mammograms, can’t help but wonder about the lack of consensus on optimal cancer screening strategies: which populations to target, and what the benefits and harms are. My colleague Dr. Jeanne Mandleblatt and her team have studied breast screening strategies for decades and have shown that biennial screening from ages 50-74 achieves a median 25.8% reduction in breast cancer mortality, whereas annual screening from ages 40-74 reduces mortality by an additional 12% but introduces very high false-positive rates. Many women and their families suffer extreme anxiety from the sheer number of repeat mammograms, false positives, benign biopsies and, in 7% of cases, overdiagnosis.

My experience with breast cancer screening, and that of my friends, raises many questions. How can we better predict the target risk population that should undergo screening early and often? Would this decision-making process consider risk factors, lifestyle, and patient preferences? How often is a diagnosis of a benign breast condition on a stereotactic core needle biopsy upgraded to a non-benign diagnosis on an excisional biopsy, which requires full sedation and surgery? What was the care journey like for other patients like me: Asian female, healthy, no family history? How many women in the US and globally have access to the excellent care and follow-up that I was privileged to receive from Dr. Willey and her expert team?

Touted as the fourth industrial revolution, Artificial Intelligence is poised to empower clinicians, patients, and researchers in answering these questions. What is AI? The term was coined by Dartmouth professor Dr. John McCarthy in 1956 and defined as “the science and engineering of making intelligent machines, especially intelligent computer programs.” Applications of AI in medicine have been limited by the complexity of highly cognitive processes such as making a medical diagnosis or selecting a treatment, which require integrating thousands of datasets with millions of variables and multiple interactions among those variables. It takes years to collect, organize, and publish practice-changing results such as Jeanne’s screening study. What if we could use data that we routinely collect during the care process, and effectively use AI to assist clinicians in real time to make informed treatment decisions?

Interested in learning more about AI in Biomedicine? Want to engage with expert scientists and product developers in AI? Register for Georgetown’s Big Data in Biomedicine symposium on October 27th!

Companies like Google and Amazon are betting big on this. Jeff Bezos wrote, “…it is hard to overstate how big of an impact AI will have on society over the next 20 years.” Google’s Sundar Pichai, when asked recently about the next big thing at Google, responded, “I can’t quite tell exactly, but advances in AI and machine learning – we are making a big bet on that, and this will bring a difference in many, many fields.”

We cannot have a conversation about AI in medicine without discussing IBM Watson, the supercomputer that sifted through 20 million cancer research papers and produced a differential diagnosis for a difficult-to-treat leukemia patient in 10 minutes by combining genomic data with the power of cognitive computing. One concern that informaticians, including my informatics mentor Dr. Bill Hersh, have raised is that the publicity around Watson has come mostly from news articles and press releases, primarily from researchers at IBM; they call for a more scientific analysis of its abilities in clinical decision making, not n-of-one case reports. Systems like Watson will benefit from systematic expert knowledge input to guide the cognitive computing processes in navigating complex medical pathways.

While still early, AI is already starting to make important contributions to medicine, says Dr. Regina Barzilay, AI professor at MIT and a recent breast cancer survivor. She and her team are asking all the right questions of data: “Can we apply the sophisticated algorithms we use to predict customers’ shoe-buying habits to adjust treatments for cancer patients?” “Can computers detect signs of breast cancer, or even pre-malignancy, earlier than humans currently can?” And the Holy Grail: “Can we use the huge quantities of data from smart toothbrushes, wearables, genomic sequencing, and medical records to get to the first and right treatment?”

What next?

In the last decade, big data in biomedicine has focused on collecting information (e.g., through mobile and other IoT devices) and organizing it (e.g., cloud computing), but all signs point in one direction for the next decade: real-world applications of AI. We will witness the development of expert systems, question-answering systems, and deep learning methods that begin to address complex real-world problems in medicine. These will augment, not replace, human expertise. Winners will find ways to rapidly and accurately integrate human input with computational output. Usability of these tools by end users, and human factors, will be key.

While I am a true tech automation enthusiast at heart and in practice, I will never forget Dr. Willey’s kind and soft words as she clearly explained my pathology report. She also carefully noted in my medical record the rare chlorhexidine pre-op antiseptic agent hypersensitivity that I had developed post anesthetic induction.

One more data point!

Let’s continue the conversation – find me on e-mail at subha.madhavan@georgetown.edu or on twitter at @subhamadhavan


Jul 07 2016

Bioinformatics is a vocation. Not a job.


Bioinformatics is at the heart of modern-day clinical and translational research. And while experts define it as an interdisciplinary field that develops and improves methods and tools for storing, retrieving, organizing, and analyzing biological (biomedical) data, it is much, much more!

Bioinformatics helps researchers connect the dots between disparate datasets, improve extraction of signal from noise, predict or explain outcomes, and improve the acquisition and interpretation of clinical evidence. Ultimately, it allows us to tell the real data stories.

To effectively tell these stories, and to see this all-encompassing domain of biomedical research and its true superpowers, we must pursue bioinformatics as a vocation – a calling – and not just a job.

Spring ’16 has been a busy season for us bioinformaticians at the Georgetown ICBI. I have carefully curated six of our recent impact stories that you may find useful.

  1. AMIA’16 – The perfect triangulation between clinical practitioners, researchers and industry can be seen at AMIA annual conferences. I was honored to chair the Scientific Planning Committee for this year’s AMIA Translational Bioinformatics (TBI) Summits, featuring sessions on the NIH Precision Medicine initiative, BD2K program, and ClinGen. I sat down with GenomeWeb’s Uduak Grace Thomas for a Q&A on this year’s Summit, which attracted over 500 informaticians. Come join us at the AMIA Joint Summits 2017 to discuss the latest developments in Bioinformatics.
  2. Cyberattack Response! – We were in the middle of responding to NIH’s request for de-identified health record data for our Precision Medicine collaborative when the computer systems of MedStar Health, our health care partner, were crippled by a cyberattack virus. Thousands of patient records were inaccessible, and the system reverted to paper records, seldom used in modern hospital systems. Thanks to the hard work and dedication of the IT staff, MedStar Health systems were restored within days with no evidence of any compromised data, according to the MedStar Health spokesperson. Our research team, however, had to act fast and improvise a way to fulfill the NIH’s data request. We ended up providing a complete, synthetic, linked dataset covering over 200 fields (a minimal sketch of this idea appears after this list). As our collaborator Josh Denny, a leader in the NIH Precision Medicine Initiative, put it: “this experience you had to go through will help us be better prepared for research access to EHRs for nationwide clinical networks”. We sure hope so!
  3. Amazon Web Services (AWS) – The AWS Public Sector Summit was buzzing with energy from an active ecosystem of users and developers in federal agencies, small and large businesses, and nonprofit organizations—a community created over just the past few years. It was enlightening for me to participate on a panel discussing Open Data for Genomics: Accelerating Scientific Discovery in the Cloud, with NIH’s Senior Data Science Advisor, Vivien Bonazzi, FDA’s former Chief Health Informatics Officer, Taha Kass-Hout, and AWS’s Scientific Computing Lead, Angel Pizarro. Three takeaways from the Summit: (1) a growing need for demand-driven open data; (2) concern over the future administration’s commitment (or lack thereof) to #opendata; and (3) moving beyond data storage to the future of on-demand analytics.
  4. Massive Open Online Course (MOOC) on Big Data – Want to begin demystifying biomedical big data? Start with this MOOC, to be available through Open edX in late fall. Georgetown University was recently awarded a BD2K training grant to develop an online course titled “Demystifying Biomedical Big Data: A User’s Guide”. The course aims to facilitate the understanding, analysis, and interpretation of biomedical big data for basic and clinical scientists, researchers, and librarians who have limited or no experience in bioinformatics. My colleagues Yuriy Gusev and Bassem Haddad, who are leading the course, are recording interviews and lectures with experts on practical aspects of using various genotype and phenotype datasets to help advance Precision Medicine.
  5. Know Your TumorSM – Patients with pancreatic cancer can obtain molecular tumor profiling through the Pancreatic Cancer Action Network’s Know Your TumorSM precision medicine initiative. It is an innovative partnership with Perthera, a personalized medicine service company that facilitates the multi-omic profiling and generates reports for patients and physicians. Check out the results from over 500 KYT patients presented at AACR ’16 by our multi-disciplinary team of patient coordinators, oncologists, molecular diagnostic experts, and data scientists.
  6. Moonshot – The latest announcement from VP Biden’s Cancer Moonshot program unveiled a major database initiative at ASCO ’16. I had the opportunity to comment in Scientific American on the billions of bits of information that such a database would capture to help drive an individual’s precise cancer treatment. Continue to watch the Moonshot program if you are involved in cancer research or the care continuum.
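For readers curious about the synthetic dataset mentioned in item 2, below is a minimal sketch of how one might generate a synthetic, linked dataset of that kind. The field names, value ranges, and ID-based linking are my own illustrative assumptions, not the actual fields requested by NIH.

```python
# Minimal sketch: generate a synthetic, linked dataset (see item 2 above).
# Field names, value ranges, and the ID-based linking are illustrative assumptions,
# not the actual ~200 fields requested by NIH. No real patient data is involved.
import csv
import random
import uuid

random.seed(42)  # reproducible synthetic output

def synthetic_record():
    pid = str(uuid.uuid4())  # synthetic ID links the demographics and lab tables
    demographics = {
        "patient_id": pid,
        "age": random.randint(18, 90),
        "sex": random.choice(["F", "M"]),
        "zip3": str(random.randint(100, 999)),  # truncated ZIP, as in de-identified data
    }
    labs = {
        "patient_id": pid,
        "hba1c_pct": round(random.uniform(4.5, 12.0), 1),
        "ldl_mg_dl": random.randint(50, 220),
    }
    return demographics, labs

demo_rows, lab_rows = zip(*(synthetic_record() for _ in range(1000)))

for path, rows in [("synthetic_demographics.csv", demo_rows),
                   ("synthetic_labs.csv", lab_rows)]:
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
```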

It is personally gratifying to see bioinformaticians, BioIT professionals, and data scientists continue to solidify their role as an integral part of advancing biomedicine. I have yet to meet a bioinformatician who thinks of her or his work as just a job. Engage your bioinformatics colleagues in your work; we will all be better for it!


Jun 14 2015

Health Datapalooza ’15


It was a treat for data enthusiasts of all stripes! What started five years ago as an enlightened group of 25 gathered in an obscure forum has morphed into Health Datapalooza, which brought 2,000 technology experts, entrepreneurs, policy makers, and health care system experts to Washington, DC last week. “It is an opportunity to transform our health care system in unprecedented ways,” said HHS Secretary Burwell during one of the keynote sessions, marking the influence that the Datapalooza has had on innovation and policy in our health care system. Below are my notes from the 3-day event.

Fireside chats with national and international leaders in healthcare and data science were a major attraction. U.S. Chief Data Scientist DJ Patil discussed the dramatic democratization of health data access. He emphasized that his team’s mission is to responsibly unleash the power of data for the benefit of the American public and maximize the nation’s return on its investment in data. Along with Jeff Hammerbacher, DJ is credited with coining the term “data science.” Most recently, DJ has held key positions at LinkedIn, Skype, PayPal, and eBay. In Silicon Valley style, he said that he and his team are building a data product spec for Precision Medicine to drive user-centered design; as an example, he cited an app that will provide allergy-specific, personalized, weather-based recommendations to users. Health meets climate!

Responsible and secure sharing of health data is not just a “nice to have” but is becoming a necessity to drive innovation in healthcare. Dr. Karen DeSalvo, the Acting Assistant Secretary for Health in the U.S. Department of Health and Human Services, is a physician who has focused her career on improving access to affordable, high-quality care for all people, especially vulnerable populations, and promoting overall health. She highlighted the report on health information blocking produced by the ONC in response to Congress’s request. As more fully defined in this report, information blocking occurs when persons or entities knowingly and unreasonably interfere with the exchange or use of electronic health information. The report, produced in April, lays out a comprehensive strategy to address this issue. She also described early successes in mining social media data for healthcare, citing the use of Twitter to predict the Ebola outbreak. Lastly, she shared a new partnership between HHS and CVS on a tool that will provide personalized, preventive care recommendations based on the expert recommendations that drive MyHealthFinder, a tool for personalized health recommendations.

There was no shortage of exciting announcements, including Todd Park’s call for talent for the U.S. Digital Service to work on the government’s most pressing data and technology problems. Todd is a technology advisor to the White House based in Silicon Valley. He discussed how the USDS teams are working on the problems that matter most: better healthcare for veterans, proper use of electronic health records, and data coordination for the Ebola response. Farzad Mostashari, former National Coordinator for Health IT, announced a new petition, Get My Health Data, to garner support for easy electronic access to health data for patients. Aaron Levie, CEO of Box, described the new “platform” model at Box to store and share secure, HIPAA-compliant content through any device. Current platform partners include Eli Lilly, Georgetown University, and Toyota, among others.

An innovative company and site, ClearHealthCosts, run by Jeanne Pinder, a former New York Times reporter of 23 years, caught my attention among the software product demos. Her team’s mission is to expose pricing disparities as people shop for healthcare. She described numerous patient stories, including one patient who paid $3,200 for an MRI. They catalog health care costs through a crowdsourcing approach, with patients entering data from their explanation-of-benefits statements, as well as from providers and other databases. Their motto: “Patients who know more about the costs of medical care will be better consumers.”

Will #hdpalooza and other open data movements help improve health and healthcare? Only time will tell, but I am an eternal optimist, more so after the exciting events last week. If you are interested in data science, informatics, and Precision Medicine, don’t forget to register for the 4th annual ICBI Symposium on October 16. More information can be found in this newsletter. Let’s continue the conversation – find me on e-mail at subha.madhavan@georgetown.edu or on twitter at @subhamadhavan


Feb 13 2015

Informaticians on the “Precision Medicine” Team


My first recollection of the term “Precision Medicine” (PM) is from a 2008 talk by Harvard Business School’s Clayton Christensen on disruptive technologies in healthcare and personalized medicine. He contrasted precision medicine with intuitive medicine, saying, “the advent of molecular diagnostics enables precision medicine by allowing physicians to delineate conditions that are likely constellations of diseases presenting with a handful of symptoms.” The term became a mainstay after the NRC’s report, “Toward precision medicine: Building a knowledge network for biomedical research and a new taxonomy of disease.” Now we converge on the NIH’s definition: PM is an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle.

“Cures for major diseases including cancer are within our reach if only we have the will to work together and find them.  Precision medicine will be the way forward,” says Dr. John Marshall, head of GI Oncology at MedStar Georgetown University Hospital.

The main question in my mind is: How can we apply PM to improve health and lower cost? Many sectors/organizations are buzzing with activity around PM to help answer this question.

NIH is developing focused efforts in cancer to explain drug resistance, genomic heterogeneity of tumors, and monitoring of outcomes and recurrence, and to apply that knowledge in the development of more effective approaches to cancer treatment. In a recent NEJM article, Drs. Collins and Varmus describe NIH’s near-term plan for PM in cancer and a longer-term goal to generate knowledge that is broadly applicable to other diseases (e.g., inherited genetic disorders and infectious diseases). These plans include extensive characterization and integration of health records and behavioral, protein, metabolite, DNA, and RNA data from a longitudinal cohort of 1 million participants. The cost for the longitudinal cohort is roughly $200M to expand trials of genetically tailored treatments, explore cancer biology, and set up a “cancer knowledge network” for sharing this information with researchers and oncologists.

FDA is working with the scientific community to ensure that the public can be confident that genomic testing technology is safe and effective, while preserving innovation among developers. The FDA recently issued draft guidance for a framework to regulate laboratory-developed tests (LDTs). Until now, most genomic testing has been done through internal custom-developed assays or commercially available LDTs. The comment period just ended on Feb 2.

Pharma/Biotech companies are working to discover and develop medicines and vaccines to deliver superior outcomes for their customers (patients) by integrating “Big Data” (clinical, molecular, multi-omics including epigenetics, environmental, and behavioral information).

Providers, health systems, and Academic Medical Centers are incorporating appropriate molecular testing in the care continuum and actively participating in clinical guideline development for PM testing and use.

Public and private payors are working to appropriately determine the clinical utility, value, and efficacy of testing in order to set reimbursement levels for molecular diagnostic tests – a big impediment for PM testing right now. Payors recognize that collecting outcomes data is key to determining clinical utility and developing appropriate coding and payment schedules.

Diagnostic companies are developing and validating new diagnostics to enable PM, especially capitalizing on the new value-based reimbursement policies for drugs. They are also addressing joint DX/RX approval processes with the FDA.

Professional organizations are setting standards and guidelines for proper use of “omics” tests in a clinical setting – examples include AMA’s CPT codes, ASCO’s QOPI guidelines, or NCCN’s compendium.

Many technology startups are disrupting current models in targeted drug development and individualized patient care to deliver on the promise of PM. The mHealth domain is rapidly expanding with innovative mobile sensors and wearable technologies for personal medical data collection and intervention.

As informaticians and data scientists, we have a tremendous opportunity to collaborate with these stakeholders and contribute in unique ways to PM:

  1. Develop improved decision support to assist physicians in taking action based on genomic tests.
  2. Develop common data standards for molecular testing and interpretation.
  3. Develop methods and systems to protect patient privacy and prevent genetic discrimination.
  4. Develop new technologies for measurement, analysis, and visualization.
  5. Gather evidence on the clinical utility of PM tests to guide decisions on their use.
  6. Develop reference databases on molecular status in health and disease.
  7. Develop new paradigms for clinical trials (N-of-one trials, basket trials, adaptive designs, and others).
  8. Develop methods to bin patients by mutations and pathway activation rather than by tissue site alone.
  9. Create value from Big Data.

What are your ideas? What else belongs on this list?

Jessie Tenenbaum, Chair, AMIA Genomics and Translational Bioinformatics, shares: “It’s an exciting time for informatics, and translational bioinformatics in particular. New methods and approaches are needed to support precision medicine across the translational spectrum, from the discovery of actionable molecular biomarkers, to the efficient and effective storage and exchange of that information, to user-friendly decision support at the point of care.”

A PricewaterhouseCoopers analysis predicts the total market size of PM to hit between $344B and $452B in 2015. This includes products and services in molecular diagnostics, nutrition and wellness, decision support systems, targeted therapeutics, and many others. For our part, at ICBI, we continue to develop tools and systems to accurately capture, process, analyze, and visualize data at the patient, study, and population levels within the Georgetown Database of Cancer (G-DOC). “Precision medicine has been a focus at Lombardi for years, as evidenced by our development of the G-DOC, which has now evolved into G-DOC Plus. By creating integrated clinical and molecular databases we aim to incorporate all relevant data that will inform the care of patients,” commented Dr. Lou Weiner, Director of the Lombardi Comprehensive Cancer Center, who was invited to the White House precision medicine rollout event on January 30.

Other ICBI efforts go beyond our work with Lombardi. With health policy experts at the McCourt School of Public Policy, we are working to identify barriers to implementation of precision medicine for various stakeholders including providers, LDT developers, and carriers. Through our collaboration with PRSM, the regulatory science program at Georgetown, and the FDA, we are cataloging SNP population frequencies in world populations for various drug targets to determine broad usefulness of new drugs. And through the ClinGen effort, we are adding standardized, clinically actionable information to variant databases.

The President’s recent announcements on precision medicine have raised awareness and prompted smart minds to think deeply about how PM will improve health and lower cost. We are one step closer to realizing the vision laid out by Christensen’s talk in 2008. ICBI is ready for what’s next.

Let’s continue the conversation – find me on e-mail at subha.madhavan@georgetown.edu or on twitter at @subhamadhavan


Jan 12 2014

Genomes on Cloud 9


Genome sequencing is no longer a luxury available only to large genome centers. Recent advancements in next-generation sequencing (NGS) technologies and the reduction in cost per genome have democratized access to these technologies for highly diverse research groups. However, limited access to computational infrastructure, high-quality bioinformatics software, and personnel skilled in operating the tools remains a challenge. A reasonable solution to this challenge is user-friendly software-as-a-service running on a cloud infrastructure. There are numerous articles and blogs on the advantages and disadvantages of scientific cloud computing. Without repeating the messages from those articles, here I want to capture the lessons learned from our own experience as a small bioinformatics team supporting the genome analysis needs of a medical center using cloud-based resources.

 Why should a scientist care about the cloud?

Reason 1: On-demand computing (such as that offered by cloud resources) can accelerate scientific discovery at low cost. According to Ian Foster, Director of the Computation Institute at the University of Chicago, 42 percent of a federally funded PI’s time is spent on the administrative burden of research, including data management: collecting, storing, annotating, indexing, analyzing, sharing, and archiving data relevant to their project. At ICBI, we strive to relieve investigators of this data management burden so they can focus on “doing science.” The elastic nature of the cloud allows us to invest as much or as little up front for data storage as we need. We work with sequencing vendors to move data directly to the cloud, avoiding damaged hard drives and manual backups. We have taken advantage of Amazon’s Glacier data storage, which enables storage of less frequently used data at roughly 10 percent of the cost of regular storage. We have also optimized our analysis pipelines to convert raw sequence reads from FASTQ files to BAM files to VCF in 30 minutes per exome sample using a single large compute instance on AWS, with benchmarks at 12 hours and 5 hours per sample for whole genome sequencing and RNA sequencing, respectively.
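For readers who want a concrete picture of the FASTQ-to-BAM-to-VCF workflow described above, here is a minimal sketch of one common way to wire it together. The tool choices (bwa, samtools, GATK), file names, and thread counts are illustrative assumptions, not a description of our production pipeline.

```python
# Minimal FASTQ -> BAM -> VCF sketch for one exome sample (illustrative only).
# Assumes bwa, samtools, and GATK4 are on PATH and that the reference has been
# indexed beforehand (bwa index, samtools faidx, and a sequence dictionary).
import subprocess

REF = "ref/hg19.fa"      # assumed reference path
SAMPLE = "sample1"       # assumed prefix for paired-end FASTQ files

def run(cmd):
    """Run a shell command and fail loudly if it errors."""
    subprocess.run(cmd, shell=True, check=True)

# 1. Align paired-end reads and sort the alignments into a BAM file.
run(f"bwa mem -t 16 {REF} {SAMPLE}_R1.fastq.gz {SAMPLE}_R2.fastq.gz "
    f"| samtools sort -@ 8 -o {SAMPLE}.sorted.bam -")

# 2. Index the sorted BAM so downstream tools can seek into it.
run(f"samtools index {SAMPLE}.sorted.bam")

# 3. Call variants to produce a compressed VCF.
run(f"gatk HaplotypeCaller -R {REF} -I {SAMPLE}.sorted.bam -O {SAMPLE}.vcf.gz")
```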

Reason 2: Most of us are not the Broad, BGI, or Sanger, says Chris Dagdigian of BioTeam, who is also a co-founder of the BioPerl project. These large genome centers operate multiple megawatt data centers and have dozens of petabytes of scientific data under their management. The other 99 percent of us thankfully deal in much smaller scales of a few thousand terabytes, and thus manage to muddle through using cloud-based or local enterprise IT resources. This model puts datasets such as 1000 Genomes, TCGA, and UK10K at the fingertips (literally a click away) of a lone scientist sitting in front of his or her computer with a web browser. At ICBI we see the cloud as a powerful shared computing environment, especially when groups are geographically dispersed. The cloud environment offers readily available reference genomes, datasets, and tools. To our research collaborators, we make available public datasets such as TCGA, dbGaP studies, and NCBI annotations, among others. Scientists no longer need to download, transfer, and organize other useful reference datasets to help generate hypotheses specific to their research.
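As a concrete illustration of “a click away,” the short sketch below anonymously lists a few objects from the public 1000 Genomes mirror hosted on Amazon S3. The bucket name comes from the AWS open data program; the prefix is an assumption and may not match the current bucket layout.

```python
# Browse the public 1000 Genomes mirror on Amazon S3 without any credentials.
# The "1000genomes" bucket is part of the AWS open data program; the prefix is
# illustrative and may not reflect the current layout.
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))  # anonymous access
resp = s3.list_objects_v2(Bucket="1000genomes", Prefix="phase3/data/", MaxKeys=10)
for obj in resp.get("Contents", []):
    print(f'{obj["Size"]:>12}  {obj["Key"]}')
```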

Reason 3: Nothing inspires innovation in the scientific community more than large federal funding opportunities. NIH’s Big Data to Knowledge (BD2K), NCI’s Cancer Cloud Pilot and NSF’s BIG Data Science and Engineering programs are just a few of many programs that support the research community’s innovative and economical uses for the cloud to accelerate scientific discovery. These opportunities will enhance access to data from federally funded projects, innovate to increase compute efficiency and scalability, accelerate bioinformatics tool development, and above all, serve researchers with limited or no high performance computing access.

So, what’s the flip side? We have found that scientists must be cautious when selecting the right cloud (or other IT) solution for their needs, and several key factors must be considered. Access to large datasets from the cloud requires adequate network bandwidth to transfer data. Tools that run well on local computing resources may have to be re-engineered for the cloud. For example, in our own work involving exome and RNA-seq data, we configured Galaxy NGS tools to take advantage of Amazon cloud resources. While economy of scale is touted as an advantage of cloud-based data management solutions, it can turn out to be very expensive to pull data out of the cloud. Appropriate security policies need to be put in place, especially when handling patient data in the cloud. Above all, if the larger scientific community is to fully embrace cloud-based tools, cloud projects must be engineered for end users, hiding all the complexities of data storage and compute operations.

My prediction for 2014 is that we will definitely see an increase in biomedical applications of the cloud. This will include usage expansions on both public (e.g. Amazon cloud) and private (e.g. U. Chicago’s Bionimbus) clouds. On that note, I wish you all a very happy new year and happy computing!

Let’s continue the conversation – find me on e-mail at sm696@georgetown.edu or on twitter at @subhamadhavan


Oct 24 2013

Keynote Talks at ICBI symposium: Stephen Friend and Eric Hoffman


Big Data in Precision Medicine was the focus of the 2nd Annual Biomedical Informatics Symposium at Georgetown, which drew nearly 250 people to hear about topics ranging from direct-to-consumer (DTC) testing to mining data from Twitter.

The morning plenary on Genomics and Translational Medicine was kicked off by Stephen Friend, MD, PhD, President, Co-founder, and Director of Sage Bionetworks, who discussed the “discontinuity between the state of our institutions and the state of our technology.” This disconnect stems from the way results are presented in the literature, compared with one another in different scenarios, and sometimes interpreted into the clinic. “We are going to get different answers at the DNA, RNA, and functional levels,” said Friend, and different groups working on the same data can get different answers because science is “context dependent” – dependent on the samples, technologies, and statistical parameters. Our minds are wired for a “2D narrative,” but the fact is we are all just “alchemists.”

Friend is a champion of open data sharing and of turning the current system on its head. We need “millions of eyes looking at biomedical data…not just one group, it’s immoral to do so,” Friend said. We need to get rid of the paradigm, “I can’t tell you because I haven’t published yet.” He said that GitHub has over 4M people sharing code with version tracking, and in fact hiring managers for software engineering jobs are more likely to look at a potential candidate’s work on GitHub than to consider the credentials on a CV.

Sage created Synapse, a collaborative and open platform for data sharing, which he hopes could become the GitHub for biomedical scientists. He would like to see large communities of scientists worldwide working together on a particular problem and sharing data in real time. As an example of this sort of effort, check out Sage’s crowdsourced genetic prediction of clinical utility in the Rheumatoid Arthritis Responder Challenge. His excitement for this future model of large-scale collaboration was palpable in his closing remarks: a prediction of a future Nobel Prize for “theoretical medicine.”

The afternoon plenary on Big Data in Biomedicine was led by a keynote talk from Eric Hoffman, PhD, Director of the Research Center for Genetic Medicine at Children’s National Medical Center, who discussed “data integration in systems biology,” a topic very close to ICBI’s heart. He presented a new tool, miRNAVis, to integrate and visualize microRNA and mRNA expression data, which he referred to as “vertical” data integration, or the integration of heterogeneous data types. This tool will soon be released for public use.

Hoffman is considered one of the world’s top experts in muscular dystrophy research, having cloned the dystrophin gene in Louis Kunkel’s lab in 1987. He has made an enormous contribution to research in this field, along with dedicating countless hours to volunteering with children affected by this horrible disease. He discussed a very exciting project in his lab on a promising new drug, VBP15, which has anti-inflammatory properties, shows strong inhibition of NF-κB, and promotes repair of skeletal muscle. Most importantly, VBP15 does not have the side effects of glucocorticoids, which are currently the standard treatment for Duchenne muscular dystrophy. Hoffman said this new drug may potentially be effective against other chronic inflammatory diseases. Let’s hope this drug will make it into clinical trial testing very soon!

More information about the keynote and other talks can be found on ICBI’s Twitter feed and at #GUinformatics, which provided snapshots of the day.


Apr 19 2013

ICBI Director’s blog post, Spring 2013


It is an exciting time to be a data scientist! From large-scale clinical genomic studies to drug discovery and development, now more than ever there is a critical need for computational analysis and interpretation. Commercial, academic, and government sectors alike are developing systems biology and computational approaches to mine BIG DATA for identifying biomarkers, drug targets and predicting outcomes for complex diseases. But the reality is that it is not about BIG DATA anymore; we already know how to store, organize, and access these data. The challenge that still remains is extracting small, actionable bites to inform biomedical research and care.

I wanted to share three recent experiences that underscore the need to recalibrate our thinking in times of diminishing resources and how best to apply data science to solve real-world challenges in biomedicine effectively. At the Bio-IT World conference in Boston, approximately 2,500 life sciences, pharmaceutical, clinical, healthcare, and IT professionals from more than 30 countries gathered to discuss best practices and informatics/IT technologies in genomics, cloud computing, BIG DATA in disease research, and big pharma data management. About midway through the conference, I realized that all 12 tracks appeared to have converged on one theme: we are grappling with how to move forward with $1,000 genomes that require $1,000,000 analyses!

IT professionals are working to break this cost barrier. Cycle Computing orchestrated 50,000-core supercomputers on the Amazon cloud for Schrodinger to accelerate the screening of potential new cancer drugs. The experiment was completed in three hours, compared to an estimated nine months required to evaluate, design, and build a 50,000-core environment and to make it fully operational. The cost of the entire project including compute-time was less than $5000 per week at its peak.

We will see many more such efficiencies gained as HTP technologies and methods evolve over the next few years, with data scientists leading the way. We must also put the advances in technology in the context of policy, which brings me to the second experience I want to highlight. I recently had the wonderful opportunity to take part in a think tank organized by the NIH to discuss the identifiability of genomic data. The think tank brought together 46 leaders from several fields, including cancer genomics, bioinformatics, human subject protection, patient advocacy, and commercial genetics, to discern the preferences and concerns of research participants about data sharing and individual identifiability. Some investigators suggest that human beings can be uniquely identified from just 30 to 80 statistically independent single-nucleotide polymorphisms (SNPs). What does this mean for cloud service providers who currently host several petabytes of genomic data for academic medical centers, hospitals, and pharma? We are already experiencing the need to reexamine HIPAA through the lens of genomic medicine. While policy will eventually catch up with technology, data scientists who manage and analyze human genome data must exercise extra caution and pay close attention to concerns and policies that protect participant privacy. The need for well-trained and skilled data scientists is greater than ever to address these challenges. McKinsey predicts that by 2018, the United States will have a shortage of 150,000 to 180,000 people with deep data analytical skills.
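A back-of-the-envelope calculation (my own illustration, not a figure from the think tank) shows why so few markers can suffice: each statistically independent biallelic SNP has three possible genotypes, so even 30 of them distinguish far more combinations than there are people alive.

```latex
% 30 independent biallelic SNPs, 3 genotypes each (AA, Aa, aa):
\[
  3^{30} \approx 2.1 \times 10^{14} \;\gg\; 7 \times 10^{9}\ \text{(approximate world population)}
\]
```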

Lastly, I attended an event hosted by Georgetown’s McDonough School of Business on “Big Data: Educating the Next Generation,” which emphasized, among other things, the need for a data literacy course for every college junior. I would argue that we must start earlier than that – why not in elementary school? Last week, as one of the science fair judges at my son’s elementary school, I watched children aged 6 through 11 present extensive data tables and charts to explain the outcomes of their physics, chemistry, and biology experiments. I wish I had a penny for every time I heard the words “pairwise comparison” at that science fair! Let’s continue the conversation – find me on e-mail at sm696@georgetown.edu or on twitter at @subhamadhavan.
