NLM ARRA Extramural Research & Development Contract Awards, 2010

Computational Thinking
Insight Toolkit Projects
Algorithms, Adapters, and Data Distribution (A2D2)
Other ARRA Extramural R&D Contracts

Computational Thinking: ARRA Extramural R&D Contracts Awarded $4.7 Million

Combining Multiple Types of Reasoning. PI: Dr. Douglas B. Lenat, Cycorp, Inc.

Project: One aim of this one-year project is to demonstrate sufficient performance of a prototype that integrates statistical reasoning with biology model-based causal reasoning. This could lead to a new genre of multi-paradigm systems that use increasingly sophisticated logical reasoning about systems biology and draw on increasingly large and detailed datasets to discover increasingly sophisticated, experimentally testable, new hypotheses. The long-term aim is to amplify human experts’ cognitive abilities to form plausible hypotheses at an individual patient level and at a population level. The system could be of use to clinicians, patients, and clinical researchers. Domain: Osteoporosis; data from Framingham study

Causal Inference on Narrative and Structured Temporal Data. PI: Noemie Elhadad, PhD, Columbia University Health Sciences

Project: This contract will evaluate the use of statistical methods to extract data from existing electronic health records (both structured and free text data) and demonstrate its use in augmenting discovery of diseases by identifying possible causes of the illness (population level) and in augmenting clinical care by detecting disease (patient level). Domain: Congestive heart failure; data from Geisinger Clinic and Columbia

Automated Reasoning. PI: Mark Musen, Stanford University

Project: The goal of this contract is to develop a computational approach for applying evidence-based guidelines to improve individual patient care by individualizing recommendations for therapy. To propose treatment plans that adhere to the guideline, the system will critique the current therapy in light of the detailed patient situation (referencing pharmacy data, laboratory values, allergies and adverse drug events, diagnosis lists, and vital signs), and then suggest modifications to the current therapy. Domain: hypertension; data from VistA or another EMR

An Evidence-Based, Open-Database Approach to Diagnostic Decisions. Michael Segal, MD-PhD, SimulConsult, Inc.

Project: This contract is designed to evaluate the development and use of a comprehensive disease database as a decision support system to aid clinicians in performing medical diagnosis. If successful, the resulting system will improve the accuracy and cost-effectiveness of diagnosing existing diseases in the individual as well as prevention at the population level. The approach used in SimulConsult’s software is one of leveraging and supporting the clinician by providing many suggestions of diseases to consider and findings to check, rather than insisting that the clinician follow a more constrained approach. Domain: Neurogenetic disorders

Computational Thinking for Pharmacogenomics Curation and Discovery. PI: Dr. Russ B. Altman, Stanford University

Project: The goal of this contract is to build a prototype system that intelligently predicts drug-drug interactions through evaluation of genes, drugs, and drug-response phenotypes. There have recently been breakthroughs in natural language processing (NLP) and machine inference that allow us to imagine the automatic and high-fidelity extraction of relationships from text, the integration of these relationships into semantically rich networks, and the use of these networks to automatically hypothesize new relationships. We therefore propose a pilot project to demonstrate the feasibility of large scale analysis of biomedical text in the domain of pharmacogenomics, with the goal of achieving performance at or above the level of our curators—in the basic tasks of extracting relationships and proposing new ones.

Visual Clinical Problem Threading for Case Summarization. PI: Casimir Kulikowski, Rutgers University

Project: Through the use of computational thinking methods (representation, visualization, and machine learning), this contract is designed to produce high-level visual summaries of clinical histories of patients with multiple, complex medical problems. The resulting visual summaries should aid in disease management of our ageing populations, which increasingly require significantly greater management of multiple chronic diseases in an effective and efficient manner. The work involves interpretation and intelligent summarization and visualization of data, text, or images, using pattern recognition across modalities of data presented in the patient record. Visually summarizing clinical history threads is a natural, perceptually grounded, and flexible way for presenting the thinking or cognition behind the major clinical problems of a patient.

Novel Machine Learning Approaches for Processing Unstructured Clinical Data. PI: Jennifer G. Dy, Northeastern University

Project: The goal of this contract is to create a data mining system that automates the analysis of unstructured medical data found in electronic health records. The automated analysis of this unstructured data is expected to increase resource efficiency and decrease overall healthcare costs. The ability to automatically analyze large amounts of medical text has clear impact in a multitude of areas, including automated reporting, process optimization, quality improvement, clinical/medical research, epidemic and disease outbreak detection, and patient identification/selection (e.g., for clinical trials). We believe that automated analysis of unstructured data has one of the greatest potentials to increase resource efficiency and decrease healthcare costs.

Text Mining of Clinical Narratives. Graciela Gonzalez, PhD, Arizona State University-Tempe Campus

Project: The purpose of this contract is to build on existing methods for extracting biomedical concepts (e.g., problems, treatments, and tests) from clinical narratives, and to identify the relationships between these concepts. This should result in advanced reasoning applications such as diagnosis explanation, disease progression modeling, and intelligent analysis of the effectiveness of treatment. Working from a small corpus of annotated clinical records (1000 from the i2b2-VA challenge), we propose a novel approach to information extraction using distributional semantics (an emerging area in NLP research) that transcends the corpus-size limitation. The small annotated corpus will be complemented with a large set of un-annotated records (850K outpatient notes from the School of Health Information Sciences at The University of Texas Health Science Center-Houston, and around 80 million inpatient records from the Mayo Clinic data warehouse).

Developing an Intelligent and Socially Oriented Query Service for EHRs. Dr. Kai Zheng, University of Michigan, Ann Arbor

Project: The goal of this contract is to enhance the ability of search engines to extract useful data from unstructured documents commonly found in electronic health records. The proposed solution will combine computational intelligence, which augments an individual user’s cognition in search query construction, with social intelligence, which augments the collective cognition of all users as a community, thus creating a new paradigm for electronic health record searching. A Google-like, full-text search engine can be a viable solution to increasing the value of unstructured clinical narratives stored in EHRs. However, average users are often unable to construct effective and inclusive search queries due to their lack of search expertise and/or domain knowledge. To mitigate the issue, we propose to develop an artificial intelligence (AI) based and social intelligence extended query recommendation service that can be used by any EHR search engine to: 1) augment human cognition so that average users can quickly construct high quality queries in their EHR search; and 2) engender a collaborative culture among users so that search queries can be socially formulated and refined, and search expertise can be preserved and discussed across people and domains.

Discovery and Explanation for Molecular Biology. PI: Lawrence Hunter, PhD, University of Colorado-Denver

Project: This contract will be used to create data, models, and open-source prototype software for use by biomedical researchers in exploring biology databases toward the creation of significant new hypotheses. We plan a multipronged approach that integrates studies of human explanatory cognition, analysis of explanatory discourse in publications, and experiments with software systems that support generation of explanatory hypotheses. The increasingly sophisticated technical infrastructure of both traditional (e.g., journal publications) and formally represented (e.g., ontologies) knowledge in molecular systems biology research, combined with the pressing needs of scientists analyzing genome-scale data, provide an ideal context for advances in computational abduction.

Computational Thinking to Support Clinical Decisions Relating to Medications. PI: Mingu Lee, Samsung, Inc.

Project: The goal of this contract is to develop an experimental intelligent clinical decision support system that will help clinicians collect critical information from raw clinical data and medical documents in order to make clinical decisions accurately.

Evidence-Based Expert Systems to Assist in Treatment of Depression. PI: Bruce Knoth, SRI International

Project: The goal of this contract is the development of a predictive, evidence-based clinical tool to assist clinicians in treating depression by suggesting treatment plans that have proven to be effective, and thus have a high probability of positive impact on patient outcomes. The results of this project should lead to evidence-based tools for other health applications, such as preventative care and childhood disorders. SRI International (SRI) proposes to create EXPERT ADVICE, a predictive, evidence-based clinical guide that provides psychologists with treatment possibilities based on expert-reviewed records of successful depression care. We will leverage the collective wisdom embodied in the IMPACT1 study, which contains records for depression cases from one of the largest mental health treatment trials conducted to date. The guidance parameters in EXPERT ADVICE will be generated by applying industry-validated predictive analytics pioneered by the Advanced Analytics Group at SRI International and demonstrated in financial services and other fields. These techniques are used to develop predictions for how an individual may perform based on the historical records of a reference body of similar individuals.

Optimal Influenza Vaccine Strain Selection. PI: Dr. Shyam Viswaran, University of Pittsburgh, Department of Biomedical Informatics

Project: This contract will be used to develop a strain selection algorithm, based on decision theory, to assist experts in selecting the influenza A strains to be included in the influenza vaccine. The contract focuses on influenza A because it is the more significant public health problem. The same techniques developed under this contract can be used for the selection of influenza B strains. The approach will be evaluated by comparing its selections to the strains selected by WHO for the years 1978-2010. We will also develop an algorithm called the Herd Immunity Calculator (HIC) that estimates the herd immunity for all influenza strains of concern to public health, and we will evaluate HIC by comparing its estimates of population immunity with published results of cross-sectional serological surveys.

Insight Toolkit Projects: ARRA Extramural R&D Contracts Awarded $5.2 Million

Insight Toolkit Version 4 and Version 4 add-on

Project: In 1999, the NLM Office of High Performance Computing and Communications awarded multiple contracts for the formation of a software development consortium to create and develop an application programmer interface (API) and first implementation of a segmentation and registration toolkit, subsequently named the Insight Toolkit (ITK). The period of performance for the original ITK awards began in 1999 and ended in 2003. The resulting system was originally intended for computer-assisted exploration of the National Library of Medicine (NLM) Visible Human Project (VHP) data sets. The final deliverable for this group was an open-source software library concentrating on segmentation and registration algorithms, placed directly in the public domain, that now helps support research worldwide in image analysis and in other domains. Open-source initiatives such as ITK help to lower the barriers to entry in complex research fields by providing the foundations of the software infrastructure necessary to conduct advanced investigations. Revising ITK is considered a contribution to the nation's infrastructure through software. Careful thought has been given to the cultivation and maintenance of this software, assuring developers that the software will remain supported with new releases, bug fixes, and continuing growth. Despite continued support through systems technology contracts, NLM wishes to revisit the underlying design and software architecture for ITK at this time. The initial design meeting for ITK was held in early November 1999, and while the design decisions for this software were targeted for the anticipated technology of 2004, those decisions are now showing their age.


  • Sean Megason, Harvard University Medical School
  • Luis Ibanez, Kitware, Inc.
  • Vincent Magnotta, University of Iowa
  • James Gee, University of Pennsylvania
  • General Electric Research Center
  • Cosmo Software

Insight Toolkit Version 4 add-on


  • Ricardo Avila, Kitware, Inc.
  • Sean Megason, Harvard University Medical School

Algorithms, Adapters, and Data Distribution Outreach (A2D2)

Project: To continue providing leadership within the national research community in software development for image analysis, the National Library of Medicine (NLM) is currently revising and updating its open-source image processing application programmer interface (API), known as the Insight Toolkit (ITK). NLM has determined that it is time to revisit the foundations of the toolkit to assure its viability for the next ten years. In the decade since the original design of ITK, computing technologies have made radical advances, including the advent of multi-core microprocessors, 64-bit CPU architectures, and a proliferation of graphics processing units (GPUs) capable of general purpose computing. NLM is in the process of funding a major new software release, ITK version 4.0 (ITK-v4), which will take advantage of these emerging computing technologies.


  • Ziv Yaniv, Georgetown University 1
  • Amitha Perera, Kitware 2
  • John Galeotti, Carnegie Mellon University
  • Marc Niethammer, University of North Carolina, Chapel Hill
  • Marcel Prastawa, University of Utah 1
  • Ross Whitaker, University of Utah 2
  • Tom Fletcher, University of Utah 3
  • Kevin Cleary, Georgetown University 2
  • Raghu Machiraju, Ohio State University Research Foundation
  • Nikos Chrisochiides, College of William & Mary

Other ARRA Extramural R&D Contracts Funded in 2010 Awarded $2 Million

C3PI – Computational Photography Project for Pill Identification. Pablo Perillan, Medicos Consultants

Project: In a national effort to promote patient safety, the National Library of Medicine (NLM) proposes to create a comprehensive, public digital image inventory of the nation's commercial prescription solid dose medications. The primary intention of this effort is to create a test data collection for the advancement of automatic pharmaceutical identification through computer analysis of photographic data. NLM expects to promote computer-based image research applied to the domain of content-based information retrieval (CBIR) of solid-dose pharmaceuticals and anticipates the need for generating a test environment, including variations of photographs of the same drug or sample under different environments.

Remote Virtual Dialog Software (RVDS). Dr. William Harless, Interactive Drama

Project: The purpose is to create an Internet version of the Conversim®/Virtual Conversations® system, either directly or through a conversion program or process. In doing so, the Internet system will: enhance the programmatic capabilities of the original system, optimize the developmental process to make the virtual dialogue paradigm sustainable and to allow for the expansion of applications of the system, and evaluate the new software by testing its perceived user value and its ability to accelerate the development and expand the NLM “Dialogues in Science” series.
