Vocabularies for Computer-Based Patient Records: Identifying Candidates for Large Scale Testing

December 5-6, 1994
Lister Hill Auditorium,
National Library of Medicine, Bethesda, MD


Minutes of a Meeting Sponsored by
National Library of Medicine (NLM)
Agency for Health Care Policy and Research (AHCPR)


The agenda and the list of attendees are included as attachments 1 and 2.

This meeting was held shortly after the award by NLM and AHCPR of a number of cooperative agreements for research and testing sites which will address issues related to implementation of electronic medical records. Some of the areas of interest to NLM and AHCPR include: using emerging standards related to computer-based patient records, linking patient records to other types of information relevant to health care decisions using the UMLS Knowledge Sources, and abstracting data from patient records for use in health services research. The purpose of the meeting was to identify a set of existing vocabularies suitable for testing in patient record systems by the cooperative agreement recipients and, if possible, by other organizations such as the VA. The meeting was also designed to advance the broader agenda of establishing a reasonable starting point for the development and maintenance of a "standard" vocabulary for use in computer-based patient records in the United States.


December 5, 1994

Opening Remarks

Betsy Humphreys, Assistant Director for Health Services Research Information, NLM, opened the meeting and introduced Dr. Lindberg and Dr. Gaus.

Dr. Donald Lindberg, Director, NLM, welcomed the attendees. He expressed enthusiasm for the Library's cooperation with AHCPR in supporting research and testing sites related to the patient record. NLM clearly does not have the expertise or resources to lead all aspects of computer-based patient record development. AHCPR has the broader experience and mandate in the field of patient data and practice guidelines. NLM's focus has been on vocabulary issues. A primary goal of NLM's Unified Medical Language System (UMLS) project is to map disparate biomedical terminologies so that patient records can be linked effectively to decision support tools, like practice guidelines, MEDLINE, etc. After a number of years of groundwork, the UMLS Metathesaurus is ready to become a vehicle for distribution of terminology needed for health care and health services research.

Dr. Clifton Gaus, Administrator, AHCPR, also welcomed participants. He said that the NIH and AHCPR were focused on different points on the continuum that goes from clinical research to practice/health care delivery to health services research. When data from computer-based patient records can be integrated and aggregated, much more useful health services research will result. As we move away from encounter-based, fee-based health care to managed care with capitation payments, claims systems should be replaced by computer-based patient record systems. AHCPR is interested in computer-based patient records as a source of detailed and aggregatable data that can be used as a database for research on the quality and cost-effectiveness of health services. Health plans will need the same kind of data to monitor quality of care and to manage the production of cost-effective care. Current administrative systems do not describe what health care practitioners actually do and therefore exclude key data that can be used to measure the performance of different health plans. AHCPR's studies have shown that administrative data can only show gross variations in care. They don't explain what goes on in clinical care. Uniform patient data will allow much better research into what works and what doesn't and may also facilitate large simple trials. Vocabulary is at the heart of uniform patient data. Dr. Gaus commented that his personal equivalent of "landing a man on the moon" will be a national demonstration of compatible patient records. He expressed the hope that the meeting would be a step toward that goal.

Overview of Standards for Computer-based Patient Record Systems

Dr. Clement McDonald, Chair, ANSI Health Informatics Standards Planning Panel and also principal investigator on one of the Cooperative Agreements, opened by saying that he hoped that even a fraction of the funding devoted to the moon landing would be available for the patient record problem. He then briefly outlined three areas in which standards are needed for patient record systems: (1) messages or "containers" for data, i.e., record structures; (2) identifiers; and (3) vocabularies.

Standards for messages or data structures are the most highly developed. There are 6 groups working in this area: ASTM, HL7, X12-N, NCPDP, ACR-NEMA, and IEEE. Their efforts are being coordinated by ANSI/HISPP.

Standard identifiers are needed for people, facilities, and providers. Identifiers for people are essential for pooling data, but are the most politically sensitive due to concerns about privacy. Dr. McDonald favors use of the social security number for health data, but he observed that there is significant opposition to this approach. The most progress has occurred on identifiers for providers. HCFA has developed what appears to be a workable system. Some work has been initiated regarding facilities.

Standard vocabularies are needed to supply the allowable values for the slots in the messages or data structures. Dr. McDonald favors an approach that first defines what will be used as the standard vocabularies for the various elements of the messages we wish to exchange, starting with messages related to laboratory tests and values. He stated that a combination of NDC codes and the WHO drug names would handle most of the drugs; ECRI's vocabulary handles devices; LOINC/EUCLIDES (to be addressed on Dec. 6 by Dr. Huff) might provide the best approach for lab test names; and perhaps a combination of ICD-9, SNOMED International, and Read Codes would serve for other elements.

Dr. McDonald concluded by saying that we need agreement on the content and structure of the major objects that have to be shared and exchanged; on definition and use of major data types (we are relatively close here); on a common representation or syntax for messages, such as ASN.1; and on the choice of preferred vocabularies. We also need the cooperation of major Federal agencies that collect health-related data, such as HCFA and FDA.

Purpose of the Meeting

Betsy Humphreys defined her vision for standard vocabulary for computer-based patient records as follows:

To the extent practical and useful: (1) explicitly identify concepts in patient records that are registered in the Standard Vocabulary, using both their unique identifiers AND the locally preferred names of the concepts; (2) explicitly identify concepts in patient records that are NOT known to be in the Standard Vocabulary and indicate their relationship to one or more concept(s) in the Standard Vocabulary. This vision does not involve the abstraction or encoding of content so that meaning is lost. The meaning is always represented in the record, whether it can be represented in the standard vocabulary or not.
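The two cases in this vision can be sketched as a simple record structure. The following is an illustrative sketch only; the field names, relationship label, and concept identifiers are assumptions for illustration, not part of any actual patient record standard.

```python
# Sketch of the two cases in the vision above: a record entry whose
# concept IS registered in the Standard Vocabulary (carrying both the
# unique identifier and the locally preferred name), and one whose
# concept is NOT (carrying the local text plus its relationship to a
# registered concept). Meaning stays in the record either way.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RecordConcept:
    local_name: str                             # meaning as recorded locally
    standard_id: Optional[str] = None           # unique ID, if registered
    related_standard_id: Optional[str] = None   # nearest registered concept
    relationship: Optional[str] = None          # e.g. "narrower_than"

# Case 1: concept is registered in the Standard Vocabulary
in_vocab = RecordConcept(local_name="heart attack", standard_id="C0027051")

# Case 2: concept not known to be in the Standard Vocabulary; the local
# text is kept and linked to a hypothetical broader concept ("chest pain")
out_of_vocab = RecordConcept(
    local_name="atypical chest tightness after exertion",
    related_standard_id="C0008031",
    relationship="narrower_than",
)
```

Note that in neither case is the original local wording discarded; the identifiers annotate the record rather than replace its content.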

The purpose of the meeting is to begin plans for a large scale test of clinical vocabularies that will be a useful step toward a standard health care vocabulary. Specifically, the meeting should identify a base set of vocabularies which provide substantial coverage of the concepts and terminology likely to be needed in computer-based patient records; should outline at least some of the steps required to set up testing of these vocabularies in patient record systems -- at a minimum at some NLM/AHCPR funded cooperative agreement sites, but also elsewhere if possible; and should lead to the formation of a small working group to develop procedures for collecting and evaluating feedback from testers.

Ms. Humphreys then summarized some working assumptions about clinical vocabularies and computer-based patient record systems that helped shape the agenda for the meeting:

  1. A "controlled" clinical vocabulary is an important PART of the solution to the problem of unambiguous representation of meaning in patient records.
  2. An initial US standard health care vocabulary should combine concepts and terminology from existing thesauri, classifications, and coding schemes. (This is the position previously espoused by the Board of Directors of the American Medical Informatics Association.)
  3. Existing coding systems used for statistical reporting and billing purposes are not specific enough to represent many aspects of health care reality. To support efficient statistical reporting and billing, the more detailed health care vocabulary must be mapped to these coding systems.
  4. Once the appropriate initial set of base vocabularies is chosen, the standard vocabulary will have to evolve over time:
     -- to represent the terminology and different hierarchical arrangements needed by different specialty groups and in different care settings;
     -- to keep pace with changes in medicine and health care;
     -- to connect effectively to any underlying concept representation that is developed;
     -- to correct any structural problems that inhibit efficient automated processing.
  5. The UMLS Metathesaurus is an appropriate vehicle for: (a) distribution of the base set of vocabularies in a common format; (b) linking these vocabularies to each other and to vocabularies and classifications used for statistical reporting, billing, literature databases, expert systems, etc.; (c) representing the multiple hierarchical arrangements and subsetting approaches that will be needed to support the range of users of health care vocabulary; (d) ensuring a reasonable forward migration path from the current vocabularies to the eventual standard. This last will reduce risk for developers and purchasers of patient record systems.
  6. Large scale testing of a set of likely candidate vocabularies can and should occur simultaneously with the development of a strategy for long-term maintenance at reasonable cost of a comprehensive health care vocabulary (within the UMLS framework).

National Center for Health Statistics Plans regarding ICD-10

Any standard vocabulary used in patient record systems will have to be mapped to the International Classification of Diseases. NCHS' plans regarding ICD-10 are therefore germane to the discussion. Sue Meads, Chief, Morbidity Classification Branch, NCHS reported that NCHS had just awarded a contract to Health Care Policy, Inc. to perform a detailed examination of ICD-10 in regard to morbidity classification. The contract is co-directed by NCHS and the Health Care Financing Administration (HCFA). As part of the project, the contractor will compare ICD-10 to ICD-9-CM to see how compatible they are. The ICD-10 notes will be carefully examined; some reflect ICD's focus on representing cause of death and are not useful for those coding morbidity data. ICD-10 will also be assessed in terms of its adequacy for coverage of risk factors, severity of illness, primary care, outcome measurements, signs and symptoms, and external causes of injury. The contractor will also assess the usefulness of the ICD-9-CM extensions, i.e., have they actually been used, are some used much more than others.

Based on the results of the contract, NCHS will proceed to develop a "best statistical draft" of the classification. The draft will then be turned over to HCFA for two years of testing. "User-friendly" guidance on how to use and interpret the classification will have to be developed. There will be a two-year notice to the public before implementation. This should allow time for development of training materials and give industry sufficient time to modify encoding software, etc. It is unlikely that ICD-10 will be implemented for morbidity reporting before the year 2000.

There were comments and questions. Dr. Chute said that he had heard that the World Health Organization's copyright of ICD-10 would limit NCHS's ability to extend it. Ms. Meads said this was not the case. NCHS and WHO had an agreement that whatever extensions were made would be compatible with the basic ICD-10. Dr. J. Cimino asked whether ICD-9-CM could just be mapped to ICD-10 and allow users to continue to use ICD-9-CM. Ms. Meads said that this was not possible because some sections of ICD-10 were in fact very different, e.g., leukemias and lymphomas, and much better than ICD-9-CM. These sections just don't match up very well. Dr. McDonald commented that the process for development of ICD-10 seemed somewhat closed. Ms. Meads said that comments and suggested changes had been solicited from all major U.S. medical societies for a 10 year period. Many changes proposed by these U.S. groups had in fact been incorporated into ICD-10. Dr. E. Hammond said it would be useful for NCHS to circulate its "best statistical draft" widely and as early as possible to get feedback from the community. Ms. Meads said it was difficult to continue to collect feedback while also moving the project forward. Users were currently suffering from not having an up-to-date classification and index.

Health Care Financing Administration Plans regarding Procedure codes

ICD-10 does not include procedure codes. Ann Fagan, Senior Medical Coding Analyst, HCFA reported on HCFA's project to develop a revised procedure coding system to replace volume 3 of ICD-9-CM. ICD-10-PCS, as it is called, will address inpatient hospital procedures ONLY. The goal is to develop a reliable and precise system that will lead to correct reimbursement and ensure data integrity and accurate statistics for MEDICARE beneficiaries. Implementation of the new system should be concurrent with the implementation of ICD-10 to minimize impact on users.

In the current volume 3, codes are limited to 4 digits. Some categories, e.g., cardiovascular procedures, are too full, and there is no room to add new procedures. The restrictive outdated structure leads to a lack of specificity in coding that is probably having a negative effect on reimbursement and is definitely hampering statistical aggregation and health services research.

The objectives of the revision effort are to produce a coding scheme that:

The work began in June 1990 when a competitive contract was awarded to 3M to review the cardiovascular procedures, to develop an improved structure for them that would be easier to use, and to update and expand them. It was originally hoped that the revised cardiovascular section could be integrated with the rest of ICD-9-CM volume 3, but the cardiovascular section developed was sufficiently different that this proved unworkable. In 1991, 3M was awarded another contract to recast the respiratory system procedures according to the model developed for the cardiovascular procedures and to develop a new index approach for both revised sections. HCFA is now in the process of issuing a new contract to revise the entire procedure volume that will build on the previous work. The new contract will have an extensive advisory structure that will allow specialties to have significant input in the codes developed for their procedures. The new system will also include alternative procedures such as acupuncture, massage therapy, etc.

Ms. Fagan received a number of questions. One attendee asked if HCFA's expectation was that after the system had been developed for inpatient procedures its use would eventually carry over to outpatient procedures. Ms. Fagan said that this was a politically charged issue, and there were no plans for this to happen.

Dr. McDonald asked how the new codes would be structured. Ms. Fagan explained that it would be a 7-character alphanumeric code that would make use of 24 letters (not i or o) and all 10 digits. The 1st character would designate the body system; the 2nd-3rd characters would be allocated to the specific procedure; the 4th character would be for the body site; the 5th character would cover the approach; the 6th and 7th characters would cover any medical device used or any qualifier, if applicable. There would be an explicit value in the 6th and 7th characters to indicate if they were not applicable.
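The positional layout Ms. Fagan described can be sketched as a small parser. This is an illustrative sketch of the draft proposal as summarized in these minutes; the field names, the example code, and the use of "ZZ" as the explicit not-applicable value are assumptions for illustration, not taken from HCFA materials.

```python
# Draft 7-character procedure code layout as described by Ms. Fagan.
# Valid characters: 24 letters (I and O excluded, to avoid confusion
# with 1 and 0) plus the 10 digits.
VALID_CHARS = set("ABCDEFGHJKLMNPQRSTUVWXYZ0123456789")

# Positional fields (names are assumed for illustration)
FIELDS = [
    ("body_system",         slice(0, 1)),  # 1st character
    ("procedure",           slice(1, 3)),  # 2nd-3rd characters
    ("body_site",           slice(3, 4)),  # 4th character
    ("approach",            slice(4, 5)),  # 5th character
    ("device_or_qualifier", slice(5, 7)),  # 6th-7th characters
]

def parse_procedure_code(code: str) -> dict:
    """Split a 7-character code into its positional fields."""
    code = code.upper()
    if len(code) != 7 or any(ch not in VALID_CHARS for ch in code):
        raise ValueError(f"not a valid 7-character code: {code!r}")
    return {name: code[sl] for name, sl in FIELDS}

# A hypothetical code; "ZZ" here stands in for the explicit
# "not applicable" value the minutes mention for positions 6-7.
print(parse_procedure_code("A21B3ZZ"))
# → {'body_system': 'A', 'procedure': '21', 'body_site': 'B',
#    'approach': '3', 'device_or_qualifier': 'ZZ'}
```

Because each position carries a fixed meaning, such codes can be parsed without a lookup table, which is exactly the embedded-meaning property Dr. McDonald questions below.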

Ms. Fagan emphasized that this was the draft proposal that might not be used if something better came along. She said that those with alternative products that they considered better were encouraged to contact HCFA's Office of Research. Dr. McDonald commented that some people thought that it would be far better not to embed meaning in the code itself as HCFA was proposing to do.

CPRI Study of Clinical Classifications and Vocabularies: Summary of Results

The Codes and Structures Working Group of the Computer-Based Patient Record Institute (CPRI) has conducted part 1 of what is envisioned as a multi-part comparative study of major clinical classifications and vocabularies. Dr. Christopher Chute, co-chair of the Committee and also a co-principal investigator on one of the Cooperative Agreements, discussed the rationale for the study, its methodology, and its results.

Health care is an information intensive industry which needs accurate and complete data for the whole continuum that Dr. Gaus discussed: clinical research, practice, health services research, and clinical epidemiology. Research results can be seriously biased if the data used are insufficiently detailed. Dr. Chute illustrated this point by showing how the addition of a single variable (extent of disease) to data for lung and colon cancer patients improved the accuracy of mortality predictions substantially.

Vocabulary is critical to the collection of accurate and aggregatable health care data and to linking patient records to decision support tools. As William Farr stated a century and a half ago, nomenclature has as much importance to medical care as weights and measures have to the physical sciences. A 1992 GAO study indicated that the development of a standard vocabulary was lagging significantly behind the development of other parts of the essential infrastructure for computer-based patient records.

The CPRI undertook a quantifiable evaluation of existing major codes and vocabularies, initially to evaluate their coverage of clinical terms encountered in patient records. The systems studied were ICD-9-CM, ICD-10, SNOMED International, the Read Code (version 2), the Gabrieli Nomenclature, the UMLS Metathesaurus (Version 1.3), and two more narrowly focused systems, CPT-4 and NANDA. The method selected was to obtain a body of machine-readable clinical text from a number of different institutions. The text came from a range of sources including inpatient and outpatient records, discharge summaries, nursing notes, progress notes, etc. A sample of 1,000 clinical text strings was extracted first, followed by a second sample of 2,000. The text strings were parsed by hand and each segment was assigned to one of thirty categories, such as primary diagnosis, severity modifier, etc. While no claims of perfection are made, the categorization was reviewed by multiple members of the study team.

The categorized strings were then distributed to be coded in the various classifications and vocabularies under study. A three part scoring system was used to indicate the degree to which the system covered the concept. (0 - not at all, 1 - partially covered, 2 - covered.) When the scoring was completed, the categories were collapsed into 5 aggregate categories for data analysis and reporting. Dr. Chute showed overheads that represented the coverage of each system studied (with the exception of the Gabrieli system) of the text in these 5 broad categories: diagnosis, findings, modifiers, treatment and procedures, and other.
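The three-part scoring and category aggregation described above can be sketched in a few lines. The sample scores and the coverage measure shown (share of segments fully covered in each aggregate category) are illustrative assumptions, not the study's actual data or its exact analysis.

```python
# Sketch of the CPRI three-part scoring scheme:
#   0 = not covered, 1 = partially covered, 2 = covered.
from collections import defaultdict

# Hypothetical (aggregate_category, score) pairs for one vocabulary
sample_scores = [
    ("diagnosis", 2), ("diagnosis", 2), ("diagnosis", 1),
    ("findings", 2), ("findings", 0),
    ("modifiers", 2), ("modifiers", 2),
]

def coverage_by_category(scores):
    """Percent of text segments fully covered (score 2) per category."""
    totals, covered = defaultdict(int), defaultdict(int)
    for category, score in scores:
        totals[category] += 1
        if score == 2:
            covered[category] += 1
    return {c: 100.0 * covered[c] / totals[c] for c in totals}

print(coverage_by_category(sample_scores))
```

A stricter or looser measure (e.g., counting partial matches as half credit) would change the reported coverage, which is one reason the study's 0/1/2 scale matters when comparing results across systems.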

The results showed that SNOMED International had coverage above 90% for text in all categories. ICD-9-CM's coverage was substantially less, as was the combination of ICD-9-CM and CPT-4. The data therefore did not support the view, which has been espoused by some, that ICD-9-CM and CPT-4 together will cover most of what is needed for computer-based patient records. ICD-10's coverage was less than ICD-9-CM's, a not surprising result since ICD-9-CM includes extensive clinical additions made by the U.S. Similar additions have not been made to the basic ICD-10. The performance of the Read Code (version 2) and the UMLS Metathesaurus (Version 1.3) fell between that of SNOMED International and ICD-9-CM. Since NANDA has a relatively narrow focus, it did not cover most of the text examined.

The study did NOT examine or compare the structure of the systems, including such features as relationships among concepts, compositional rules, etc. The results may be affected by the restricted sample size, although the results for the first sample of 1000 and the second sample of 2000 were identical. It is important to note that several of the systems studied have undergone substantial revision and expansion since the study data were compiled. The data for the Gabrieli system are being double-checked now and will be included in the published report of the study.

In the discussion that followed Dr. Chute's presentation, Dr. J. Cimino asked whether plain ICD-9 had been studied. It had not, due to its known lesser clinical coverage (as compared to the CM version) and because it is infrequently used in the U.S. Dr. Kolodner asked whether the text used included a broad cross-section of health problems, including mental health. Dr. Chute indicated that no special efforts were made to ensure this and, in fact, mental health was deliberately excluded because of the heavy use of the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders (DSM) to represent such data. Dr. Hersh asked whether the data set, including the text strings and categorizations, would be made available to other investigators. It will be made available, but probably with some restrictions since the committee intends to make further research use of it. Dr. Cohn, chair of the Committee, indicated that it probably needs to be expanded to be more representative of different health problems.

Dr. McDonald asked whether the frequency of occurrence of certain concepts was weighted when the results were compiled. Dr. Chute indicated that only one occurrence of the concept was taken from each note, but the same concept might occur in more than one note and, if so, was counted as more than one of the sample concepts. Dr. McDonald asked whether the study data distinguished between concepts that appeared as wholes in a particular vocabulary and those that could be constructed by combining more atomic concepts from that vocabulary. He thought that in some cases a "constructed" match was quite different in character from a "whole" match and used the example of "blue sclera" to illustrate. Dr. Chute said that many matches were "constructed" and that the study data did not distinguish these from other matches.

SNOMED International's Coverage of Clinical Concepts: Summary of Findings

In addition to the CPRI study described by Dr. Chute, there have been a number of published studies of the clinical coverage of SNOMED International. (A bibliography of papers on SNOMED prepared for the meeting by Dr. Suzanne Henry and Dr. Keith Campbell is included as Attachment 3.) Dr. Suzanne Henry summarized the methods and results of four different studies, which looked at coverage of surgical diagnoses, nursing terms for patient problems, problem lists in ambulatory care, and problems self-reported by patients respectively. (A summary of the information presented by Dr. Henry appears in Attachment 4.) These four were chosen because they had quantitative results, used real clinical data, focused on concept coverage, used a range of methodologies and approaches, and were representative of a number of other studies. Two were comparative studies; two examined SNOMED International only. The results of all four studies support the conclusion that SNOMED International has excellent, although not exhaustive, coverage of a variety of clinical concepts, including modifiers and social context concepts, that are likely to appear in computer-based patient records. Dr. Henry indicated that she had not looked in detail at studies of various computational approaches to automatic extraction and encoding in SNOMED of concepts found in machine-readable text. She referred the audience to papers on this subject in the bibliography provided.

Dr. Keith Campbell then gave an overview of what was and was not represented in the research literature on SNOMED and other clinical vocabularies. The literature includes studies of domain coverage and of concept redundancy within specific classifications. It also includes proposed solutions for deficiencies identified, e.g., use medical records as source material for creation of terminology, use linguistic tools in thesaurus construction and evaluation, improve the structure and reduce unintended redundancy by expressing compositional rules in an explicit syntax. By and large, the literature does NOT include studies of the quality and completeness of hierarchies, assessments of the relevance of terms present, research on the impact of use of a particular classification on subsequent data retrieval, and discussions of the economic, social, and political factors affecting use of particular classifications. Dr. Campbell stated that this last set of issues needs to be discussed openly so that competing priorities can be evaluated and workable strategies emerge.

Dr. Campbell said that the literature highlighted a problem that had already been discussed by Dr. Chute. All systems change and evolve over time and by the time any study is published it is likely to report results that are not relevant to the current version of what has been studied. This is not only a problem for research results, it is a major maintenance challenge for local systems which make use of these evolving systems. Dr. Campbell advocates careful review by domain experts to evaluate hierarchies and the relevance of terms, despite its subjective and labor-intensive nature. We also need a set of established metrics that can be applied to successive versions of vocabularies so we can see if things are getting better or worse.

In the ensuing discussion, Dr. K. Hammond commented that the mapping projects were laudable, but that we also needed studies of how usable the systems were to people who were trying to browse a problem list and select an appropriate concept. Dr. K. Campbell indicated that such studies were complicated because of the confounding factors, like the interface used. Dr. Hammond said that sometimes it was precise to be imprecise and that a vocabulary system should not force someone to be more precise than the information known at the time. Dr. K. Campbell agreed, but said that this was also a coverage issue. Vocabularies should have terminology in their hierarchies for intermediate hypotheses.

Ms. Humphreys stated that ONE hierarchy cannot suit all purposes. Any useful vocabulary system will have to allow the representation of multiple perspectives. It should also allow concepts to be identified as belonging to a variety of subsets that have been used successfully for different purposes, e.g., to the set of problems that the VA includes in its problem list, the set of interest to pediatricians.

Dr. J. Campbell said that he thought that the mapping/coverage studies done so far were in fact partially done to prompt this kind of meeting and discussion. Dr. Hersh commented that the discussion was analogous to the debate over whether user satisfaction studies or more formal information retrieval studies were better; in fact both are needed. Dr. J. Cimino commented that the Barrows study (listed in the bibliography in Attachment 3) did look at user performance in selecting terms and found many different problems accounted for failure to find appropriate concepts, including coverage of SNOMED II, a poor user interface, lack of synonyms, etc. Dr. Cimino thought there must be a number of other studies of this type.

Mr. Tuttle asked what should be the priorities for vocabulary evaluation studies for the immediate future. Dr. K. Campbell said he thought that retrieval studies were critical. Dr. Henry agreed, but also thought that emphasis should be placed on the development of standardized measures or metrics and on development of larger databases of test data that could be shared by investigators. Dr. Barnett said the focus should be on what physicians find acceptable. Although the use of a controlled vocabulary may in fact modify behavior, it is going to be difficult, if not impossible, to evaluate the actual impact of a vocabulary on care provided, let alone on outcomes. Dr. R. Miller commented that one metric for evaluating vocabularies that had been alluded to, but not mentioned explicitly, is the set of evolutionary forces affecting the vocabulary. In the case of MeSH, a strong evolutionary force has been feedback from large numbers of real users. Various billing issues can provide evolutionary forces that may be less desirable.

Building a US health care vocabulary: Do Read Codes and SNOMED International Offer Complementary Contributions?

Dr. William Hole, who directs the development of the UMLS Metathesaurus for NLM, opened by acknowledging the real experts on SNOMED International and the Read Clinical Classification (Dr. Roger Cote, Dr. Christopher Payne, and Dr. Michael O'Neill) who were in attendance at the meeting. He thanked Drs. Payne and O'Neill for providing to NLM an advance and not yet complete copy of version 3.1 of the Read system for use in a preliminary study. Dr. Hole also indicated that Alexa McCray and Allan Browne had assisted by generating American English spelling variants and other variants for the Read data. Nels Olson of Lexical Technology, Inc. did the automated lexical matching.

Dr. Hole began by presenting summary statistics for both SNOMED International and the preliminary version 3.1 of the Read System. Both have roughly 132,600 terms of which about 100,000 are preferred or hierarchical terms. Despite this overall numerical similarity, the distribution of the types of concepts within the two systems is quite different. For example, Read has roughly 69,000 disorders and findings as opposed to about 32,000 terms in the somewhat comparable categories in SNOMED. SNOMED has substantially more anatomical terms and organisms than Read. The version of Read examined did not contain drugs and chemicals or nursing and allied health terminology. These will be present in the version 3.1 to be released in January 1995.

The methodology used in NLM's comparison of the two vocabularies involved initial lexical matching of identical normalized strings in Read, SNOMED, and the 1994 version of the UMLS Metathesaurus. "Concept" matches, rather than string matches, were counted, e.g., one match was counted if both a SNOMED preferred term and one or more of its synonyms matched to either a Read concept or a Metathesaurus concept. This procedure identified 60,695 SNOMED concepts (56%) that did not match lexically to either the Read Code or the Metathesaurus (more than 20,000 SNOMED procedures are part of the 1994 Metathesaurus) and 85,566 Read concepts (81%) that did not match either SNOMED or the Metathesaurus. Random samples of 300 of the non-matching SNOMED concepts and the non-matching Read concepts were then manually reviewed to determine whether they were actually present as WHOLE concepts in the other system. 13% of lexically unique SNOMED concepts were found in Read. 16% of lexically unique Read concepts were found in SNOMED. Although both Read and SNOMED allow the combination of concepts for coding (SNOMED is multi-axial, and Read allows qualifiers to be combined with concepts according to specific rules), no attempt was made to determine if combinations of concepts in one system could adequately represent a lexically unique concept from the other. This was obviously a small and preliminary study, but its results indicate that SNOMED International and Read 3.1 may well provide useful complementary coverage. Both might contribute to a standard US health care vocabulary. (The overheads used by Dr. Hole are included in Attachment 5).
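The concept-level matching procedure described above can be sketched as follows. The normalization shown (lowercasing, punctuation stripping, token sorting) is a simplified stand-in for the actual UMLS lexical tools, and the toy vocabularies are invented for illustration.

```python
# Sketch of concept-level lexical matching: a match is counted once
# per concept if ANY of its terms (preferred term or synonym) matches
# a normalized string from the other system.
import re

def normalize(term: str) -> str:
    """Crude normalization: lowercase, strip punctuation, sort tokens."""
    tokens = re.findall(r"[a-z0-9]+", term.lower())
    return " ".join(sorted(tokens))

def concept_matches(concepts: dict, other_terms: set) -> set:
    """Return IDs of concepts with at least one normalized-term match."""
    other_norm = {normalize(t) for t in other_terms}
    return {
        cid for cid, terms in concepts.items()
        if any(normalize(t) in other_norm for t in terms)
    }

# Hypothetical toy data standing in for SNOMED and Read term lists
snomed = {
    "S1": ["Myocardial infarction", "Heart attack"],
    "S2": ["Blue sclera"],
}
read_terms = {"heart attack", "sclera, blue"}

print(concept_matches(snomed, read_terms))  # both concepts match
```

Counting at the concept level rather than the string level matters: a concept with five synonyms that all match still contributes only one match, so the percentages reported above are not inflated by synonym-rich concepts.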

In the discussion that followed, Dr. Huff and Dr. Korpman both expressed surprise at the outcome of the study, given their perception that SNOMED International could cover such a high percentage of important clinical concepts. Dr. Hole said that the broad comparison of the numbers of terms in different categories in the two systems probably offered the best explanation. Larger studies are needed to elucidate the differences between the two systems. Ms. Humphreys, who had assisted Dr. Hole in the human review of the lexical matches, indicated that Read has more pre-coordinated terms which probably accounts for some of its larger total number of disorders.

Dr. J. Campbell asked about the extent to which cultural differences were responsible for a large part of the unique Read concepts. Dr. Hole said that the Read administrative terms, which definitely reflected cultural differences, were a very small part of the unique terms. Comparison of a much larger sample of terms in different specialties is needed to answer the question definitively, but NLM's impression is that cultural differences probably don't account for the majority of the unique Read concepts.

Dr. Lincoln asked whether there had been studies done in England of the extent to which the Read system accurately captured the concepts in clinical narrative. Dr. O'Neill said that a large study was currently being planned that would compare Read 3 with Read 2, ICD-9, ICD-10, and the UK procedure code. This was not because the National Coding Center thought that ICD or the procedure codes were adequate for clinical concepts, but because it was necessary to provide data to refute claims that this was the case.

Dr. E. Hammond asked if the real issue was that SNOMED and Read version 3.1 could not be mapped adequately or merged. If so, perhaps that is a lesson for the whole effort to develop a useful clinical vocabulary. Maybe it isn't possible to map two different views of the world. Ms. Humphreys said that in the case of SNOMED and Read it looked like the two probably could be mapped reasonably well. Large scale testing ought to help us to determine whether one approach or the other is more useful or whether we need both.

Dr. Chute said that agreement on a basic structure would be needed to move forward. Ms. Humphreys said that we certainly needed an envelope that will accommodate both the multiaxial approach and the pre-coordinated approach because we can't predict which will be most useful in which circumstances. To a certain extent the UMLS Metathesaurus already provides this kind of envelope.

Dr. Hammond said that it was likely that the structure of procedures, billing, etc. were very different in the UK and the US and these areas would have to be examined very carefully.

In response to a question from Dr. Hole, Dr. O'Neill clarified that the purpose of Read version 2 was to summarize or abstract patient data. The purpose of Read version 3.1 is to represent the complete information present in a computer-based patient record. Although version 3 of the Read code does have many pre-coordinated terms, it also has an information model that explicitly defines when qualifiers can be combined with these terms.

DICOM/SNOMED International Anatomical Terminology Project

Dr. Dean Bidgood, chair of the ANSI HISPP Joint Working Group for Diagnostic Image Communication and of the ACR-NEMA Working Group 9 (Standards Harmonization Working Group) and member of the ANSI Image Technology Standards Board, opened by saying that his support by an NLM medical informatics training fellowship over the past two years had made it possible for the work he would describe to go forward. His presentation had three parts: a brief history of the development of standards for image data interchange; a description of the current project; and a view of some of the details associated with image data transfer.

The American College of Radiology began work on data interchange standards for images (then defined strictly as radiological images) about 10 years ago. At that time standards were desperately needed to allow use of equipment from different manufacturers in PACS (Picture Archiving and Communications Systems). The initial DICOM standard was developed with excellent leadership from industry engineers. The DICOM standard includes a header with text designed to disambiguate the image that follows it from all other transmitted images. Dr. Bidgood emphasized that it is impossible to interpret images correctly without multiple types of context, including the orientation, the method used to capture the image, etc. Without detailed context information image data are useless and can be dangerous. There is no one view or one hierarchy that can represent the appropriate context for all images. The vocabulary used to describe the elements of the context must support multiple views.

The DICOM standard includes layers for hardware, software, and the information model. Industry standards are adopted for the hardware and software levels. The focus of current work is on improving the information model. Tools are needed to apply the standard efficiently, including tools that are specific to medical applications. The ANSI HISPP MSDS Common Data Types document is used to define certain ubiquitous data types used in the standard. (Reference: ANSI HISPP Common Data Types for Harmonization of Communication Standards in Medical Informatics. Final Draft. November, 1993. Bidgood, W.D. Jr. (Editor). American National Standards Institute. Healthcare Informatics Planning Panel. Message Standards Developers Subcommittee.) The ASTM convention of triplet encoding, i.e., (1) the coding scheme, (2) the code, and (3) optional text, is used in the DICOM standard in the values of different header elements.
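The triplet convention just described can be sketched as a small data structure. The field order follows the text above; the delimiter and the example scheme/code values are illustrative assumptions, not the actual wire format defined by the standard.

```python
from typing import NamedTuple, Optional

class Triplet(NamedTuple):
    """Triplet encoding as described in the minutes:
    (1) coding scheme, (2) code, (3) optional text.
    Field order and the '^' delimiter are illustrative only."""
    scheme: str
    code: str
    text: Optional[str] = None

    def encode(self, sep: str = "^") -> str:
        # Serialize the three components with an assumed separator.
        return sep.join([self.scheme, self.code, self.text or ""])

# Hypothetical header element value identifying an anatomical site
body_part = Triplet("SNM", "T-32000", "Heart")
encoded = body_part.encode()  # "SNM^T-32000^Heart"
```

The point of carrying the scheme alongside the code is that any vocabulary can be referenced without baking a term list into the standard itself.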

The goal in the transfer of any data is that the information sent is identical to the information received. To achieve this the concepts present in the message must be represented as fully as possible. The initial DICOM standard had only 22 concepts for different anatomical locations. Not surprisingly this was quickly found to be inadequate. Each successive effort to define larger universes of allowed anatomical concepts inevitably ran into the same problem. Since the list of allowed terms started out as an integral part of the official standard, the standard had to be re-balloted every time new terms were added. The current, more appropriate approach is to refer to vocabularies that can be used with the standard and not to include actual lists of terms in the standard itself.

Work on the evolution of the DICOM standard is now carried out under the aegis of the ANSI HISPP Message Standards Developers Committee. This provides an umbrella that is seen as more neutral than the ACR and has therefore been helpful in bringing representatives of other diagnostic imaging specialties together to work on a version of DICOM that can handle a range of imaging data. In the version under development, an anatomic region, an anatomic region qualifier, a specific site, and a site qualifier can all be specified. Any vocabulary or coding scheme can be used with the standard, identified by triplet encoding.

The College of American Pathologists (CAP) plans to release the Topography and General Linkage/Qualifiers modules of SNOMED International for the DICOM group to use (without charge) in the development of a SNOMED Microglossary of anatomical terms needed for image data. Dr. Bidgood commented that if the CAP goes forward with this plan they are to be commended, given the level of effort and resources they have expended in the development of SNOMED. The DICOM group will start with SNOMED topography and modifiers and then conduct a large multi-specialty review. Subsets of the terms will be prepared for different specialties to match to their existing glossaries. Matching across specialties and mapping of multiaxial to combined forms will follow. The CAP will receive the input from the DICOM project and will exercise ultimate version control.

Although multiple encodings will be possible, DICOM will include default rules. Dr. Bidgood described a range from no constraint on input to default constraints to dynamic negotiation of encoding levels. The third level is not really possible today, although DICOM will have to include a robust specification for conformance claims in the vocabulary area. Dr. Bidgood concluded by saying that preconceived notions are the enemy of progress, and of the cooperation that is needed to move vocabulary standards forward.

Dr. Lowe asked if the plan was to include the anatomical and qualifier subsets of SNOMED International in the DICOM header. Dr. Bidgood said that there would be an indirect reference to them in the standard, but they would not be included in their entirety. Dr. Lowe expressed concern about how this level of textual information related to images would be captured. Dr. Bidgood indicated that the more robust encoding would at first be optional, but would probably become mandatory in conformance claims. A broad standard that covers all imaging is highly desirable. The problem of data capture, through structured data entry or other means, is obviously difficult. Capturing the data in a controlled vocabulary will enable automated links to the literature and other knowledge sources that can provide information immediately, while the practitioner is cogitating about a problem.

Dr. Lindberg commented that the radiologists deserved credit for advancing data standards in a very practical way. At meetings of the Radiological Society of North America, vendors who exhibit are required to demonstrate that their equipment supports current standards on the exhibit floor. Attendees at the meeting can bring their own data and see how the machines on exhibit handle them. While scholarly studies are also needed, this real-world approach has merit.

Laboratory Terminology in the CPMC Medical Entities Dictionary: Relationships to SNOMED International and Read Clinical Classification

Dr. James Cimino, principal investigator of one of the Cooperative Agreements and one of the Intermed contracts, described the results of a comparison of laboratory test concepts found in the Columbia Presbyterian Medical Center (CPMC) Medical Entities Dictionary (MED) with those in SNOMED International and in the Read Clinical Classification. (A copy of his overheads is included in Attachment 6.)

The CPMC MED is a semantic network of medical terminology that includes classes of terms, subclasses, and individual concepts. The Intermed dictionary is a stripped down version of the MED, excluding names that are solely of local interest to CPMC. The Intermed dictionary effort allows a fresh start and an opportunity to eliminate some of the "ugliness" in the MED, while also addressing the issue of meeting the needs of multiple institutions with a single dictionary. In addition to Columbia, the Intermed institutions are Harvard, Stanford, and Utah. There are a number of other collaborating institutions testing use of the MED. The CPMC MED and the Intermed dictionary currently have links to the UMLS and to SNOMED.

The initial Intermed dictionary focuses on the narrow field of urine chemistry as a vehicle for working out dictionary structure, procedures for updating, etc. Dr. Cimino briefly outlined the sections of SNOMED International of particular interest for urine chemistry concepts: P3: Laboratory Procedures and Services and particularly P3-02: Specimen Collection, topographic terms, and analytes. A comparison of SNOMED terms for urine chemistry with those in Intermed (chiefly derived from the CPMC MED) showed that 36 concepts appeared in both, 62 in SNOMED only, and 31 in Intermed only. The 31 "Intermed only" concepts are in fact representable in SNOMED by coordinating terms from different axes. Thus SNOMED has some precoordinated lab test terms and some that are not. Multiple encodings are possible, because the precoordinated terms could also be represented by combining items from different axes.

Urine chemistry terms are split among multiple hierarchies or classes in SNOMED. This occurs because SNOMED is a strict hierarchy in which each concept appears only once. Each urine chemistry concept appears in one reasonable place, but may not appear in some of the places you would expect to find it. There are some ambiguous connections in SNOMED, i.e., two lab test terms may be SNOMED related terms to the same preferred term and therefore share the same code.

In Read, laboratory terms are found in the sections on Samples, Analytes, and Laboratory Test Observations, which include both test names and actual findings. A comparison of Read terms for urine chemistry with those in Intermed showed that 48 concepts appeared in both, 29 in Read only, and 19 in Intermed only. There was some redundancy in terms found in the preliminary version of Read 3.1 supplied by NLM for use in the comparison. Read classifies the terms in multiple locations, although the classification was incomplete in the version used.

Dr. Cimino commented that the MED gets its chemical names from the UMLS. He has found almost all he needed at the time he looked for them. The few not found have shown up in the next edition, sometimes due to his input and sometimes not.

Dr. Cimino outlined the strategy for expansion of the Intermed dictionary. Intermed will prefer precoordinated terms, but, where possible, will have an underlying semantic description. Concepts will be mapped to the UMLS and SNOMED and potentially to Read. Users may map local concepts to Intermed concepts. If users submit additions, they must be accompanied by formal semantic descriptions in Intermed format. The urine chemistry part of Intermed is now available on the Internet.

Dr. K. Hammond asked if the Intermed project was going to look at EUCLIDES/LOINC as a potential source of information. Dr. Cimino said that the Intermed structure had already been influenced by EUCLIDES/LOINC and that he would be working with Dr. Huff to incorporate more information from LOINC.

Dr. Hersh asked, regarding the issue of precoordinated vs. atomic concepts, why Intermed could not have the atomic view mapped to the precoordinated view and then hide it from users who had no need of it. Dr. Cimino said the mapping of the atomic concepts to the precoordinated concepts was inherent in the Intermed semantic model.

Dr. Lindberg asked whether the normal ranges were included in the semantic definition of the test, which provoked a lively discussion. Dr. Cimino said they were a part of the definition in the original MED, and if the range was changed by a lab for any reason a new concept was created. The ranges are not part of the definition for new concepts added to the CPMC MED, nor are they part of Intermed since they may be institution specific. Dr. Lindberg commented that normal ranges are a function of test methodology, rather than institution. Dr. McDonald said that in HL7 messages related to lab tests, the ranges are sent separately, not as part of the name. In many cases the methodology changes at will, with the same lab using different methodologies for the same test from one day to the next. Dr. Kohane asked what happens when two methodologies used for the same test give different units in the results. Dr. Cimino said in that case there would have to be two test concepts in the dictionary.

Mr. Tuttle asked whether differences in Intermed, SNOMED, and Read were principally due to word use. Dr. Cimino said they were not; in general the three used similar names for the tests they had in common. Dr. Hole mentioned that Read appeared to have a range of specimens, including such things as conjunctival swab, that were not present in SNOMED. Dr. Cimino could not comment, because he had focused his investigation strictly on the urine chemistry area. Dr. McDonald said that the discussion would be more interesting if more people could get copies of Read to evaluate. Dr. O'Neill said evaluation copies were available to anyone who wanted them free of charge. Anyone interested should write to Dr. Payne or to him. (Addresses in the list of attendees in Attachment 2).

The meeting was adjourned at about 5:15, to be reconvened the following morning at 8:30 a.m.


December 6, 1994

LOINC (Laboratory Observation Identifiers Names and Codes) and Euclides

Ms. Humphreys reconvened the meeting and introduced Dr. Stan Huff, who has been actively engaged in LOINC development and participates in the Intermed work as a consultant to Columbia. Dr. Huff opened by saying that although the HL7 message standard is very good and useful there is no standardization of the use of test names within HL7 messages. None of the existing codes meets all the requirements. (Copies of Dr. Huff's overheads are included in Attachment 7.) The LOINC effort is focused on developing universal identifiers (test codes) for laboratory observations to be used in the context of ASTM and HL7 message standards. It is first addressing requirements for the OBX (observation result) portion of the HL7 message. Dr. Huff showed examples of the inconsistent information now transmitted in the OBX for different instances of the same test. The LOINC development, initiated by Clem McDonald in February 1994, is directed by a small working group. Its strategy is to start with test codes currently in use and to take advantage of other sources including EUCLIDES, the FDA device list, ASTM E1238 Specimen Names, ASTM E3113 Test and Analyte Names, and probably others, such as SNOMED. The goal is a fully specified name that supports automated matching. There should be one common identifier for tests that are "clinically" the same.

Following an empirical approach, the LOINC group collected existing lab test names from a number of sources, such as MetPath, LDS Hospital, Regenstrief, the Department of Veterans Affairs, and the ASTM E3113 terms. After analyzing these sources, the LOINC effort came up with the general form of a laboratory test result name: <analyte>:<property>:<specimen>:<timing>:<precision>:<method> The method is only noted when important. The value NRM (normalized reference method) can be used to represent many different methods if they can be used interchangeably from the clinical perspective.
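The general name form above can be sketched as a small record type. This is a minimal illustration following the field order given in the minutes; the field names and the example axis values (e.g., MCNC, SER, PT, QN) are hypothetical placeholders, not actual LOINC content.

```python
from dataclasses import dataclass

@dataclass
class LabTestName:
    """Fully specified name in the general form from the minutes:
    <analyte>:<property>:<specimen>:<timing>:<precision>:<method>.
    Field names follow the text; axis values below are made up."""
    analyte: str
    prop: str        # kind of property measured (e.g., a concentration)
    specimen: str
    timing: str
    precision: str
    method: str = "" # method is noted only when important

    def fully_specified(self) -> str:
        # Colon-delimited name; an empty method leaves its slot blank.
        return ":".join([self.analyte, self.prop, self.specimen,
                         self.timing, self.precision, self.method])

# Hypothetical serum glucose test, quantitative, point-in-time
name = LabTestName("GLUCOSE", "MCNC", "SER", "PT", "QN")
```

Two result names built this way would share one common identifier when every populated field matches, which is the "clinically the same test" criterion described above.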

NOT in the name but transmitted elsewhere in the HL7 message are: the instrument used, details of specimen collection, test priority, and the volume of the sample. If any of these elements are included in the result name, it changes the underlying information model. Dr. Huff showed a number of examples of test names in the current draft LOINC format. The microbiology examples illustrated that LOINC accommodates both of the two common approaches to specifying microbiology test results in existing systems.

Dr. Huff reviewed the trade-offs associated with aggregate vs. atomic names. Both are needed. People use aggregate names. It is important to focus on names that are actually used, however, not to generate the universe of all possible aggregate names. Atomic names are more parsimonious, more expressive, and support more flexible information retrieval and aggregation.

Dr. Huff then described the Euclides OpenLab Coding Scheme, which has influenced LOINC development. Euclides is a European system developed under the direction of Georges DeMoor of Belgium. It covers the complete domain of the clinical laboratory using a multi-axial approach with 39 canonical axes. HL7 represents some of these axes in different segments of messages. Euclides contains about 8,200 analyte names, including drugs, cells, micro-organisms, etc. and about 420 "function tests" which are more complex procedures. Dr. Huff showed examples of Euclides analytes, function tests, and procedures. An informal evaluation that mapped lab terms from several American sites to Euclides found that Euclides could represent nearly 100% of the terms.

Dr. Huff concluded by saying that the future development of LOINC involved making a draft widely available for use, finding a permanent home for the system, and adding content to it.

In the discussion that followed, Dr. J. Cimino asked if the LOINC group had thought about using other source vocabularies for particular sections of the name. Dr. Huff said yes, they hoped to point to other systems for parts of the name. Ms. Moholt asked about the underlying structure of EUCLIDES. Dr. Huff said that it had a principled hierarchical structure used by its maintainers, but that it was distributed as a linear list. Mr. Tuttle asked Dr. Huff if he had a feel for the rate of change in vocabulary in the laboratory test area. Dr. Huff said 10-20 terms per month were added to support the member facilities in Intermountain Healthcare. Dr. McDonald pointed out that Arden Forrey who directed ASTM vocabulary efforts in this area exchanged terms regularly with Dr. DeMoor so their coverage was naturally similar. Dr. R. Miller asked whether there would be "emeritus terms" for retired tests. Dr. Huff said that in the HELP system the terms and codes for tests were kept forever, but those that were not currently used were flagged as inactive. He thought that a similar approach would be needed in any standard vocabulary.

VA Experience in Developing a Clinical Lexicon

At this point the focus of the meeting shifted slightly to the clinical vocabulary needs of two Federal agencies and their current efforts to meet these needs. Dr. Kenric Hammond of the Department of Veterans Affairs Medical Center, American Lake (Tacoma), Washington described the VA's project to develop a lexicon for use in their highly developed, distributed and decentralized clinical information system. Their need is to develop a consistent data representation that can handle patients who move from one VA facility to another and also support the aggregation of data across VA care sites. The VA came to the UMLS several years ago, first looking for the basis for a single problem list per site that would cover all types of problems. The VA system includes 170 hospitals and many more clinics, and 3-4 million veterans are seen each year. The VA therefore has a need for a clinical lexicon that can meet the needs of many types of sites. Synonym control is extremely important.

The effort to select a basis for the VA's lexicon involved many people including Dr. Michael Lincoln of the Salt Lake City VA hospital and a Problem List Expert Panel representing a wide spectrum of VA practitioners. A number of candidates were looked at, including ICD-9-CM, the Read Code (version 2), and the UMLS Metathesaurus. (SNOMED International was not yet released.) The UMLS Metathesaurus was selected for a number of reasons: its ability to engulf and encompass other systems; the potential for linking across systems; specific elements of its coverage including COSTAR, nursing vocabularies, and the promise of more CPT (although this last has not yet materialized); the potential value of semantic types and relationships; the Metathesaurus structure which seemed promising for management of vocabularies in a distributed, decentralized system; the ability to leverage NLM's investment; and the likelihood of continuing support.

The decision was made for the VA to develop its own local lexicon that imports terms from the UMLS or other systems as needed. This ensures that the day-to-day needs of operational systems can be met, including frequent updates for new drugs. Most of the UMLS Metathesaurus (version 1.3) was imported as the basis of the lexicon. Another 2,000 terms that the VA facilities needed to function were added. These included billing codes, the Omaha Visiting Nurses Association terms (which will be added to the 1995 Metathesaurus), social work terminology, and the ICD-9 E and V codes. What started out as a resource for the problem list is now considered the vocabulary support for the complete patient record. It therefore must be a stable, maintainable system.

The VA clinical lexicon has been in use only since June of 1994. It includes most of the features of the UMLS Metathesaurus. It also allows associating a billing code with a more specific term and adding local usage synonyms. It allows users to add concepts but flags them for subsequent review by Dr. Lincoln's group in Salt Lake. It occupies about 150 megabytes in the VA's file structure in MUMPS. It is now used solely for the problem list, but future applications will include order entry, order checking, reminders, the VA's National Drug file, procedure recording, and supporting point of care information and knowledge retrieval services.

Dr. Hammond outlined the VA's needs for expanded UMLS coverage: links to CPT procedures; laboratory procedures such as those in LOINC/Euclides; more dental terminology; more signs, symptoms, and findings perhaps from Read and QMR; more terms related to health maintenance, health status, home care, etc.; Title 38 disability codes; reasons for cancelling orders; abbreviations and acronyms; qualifiers. The Clinical Lexicon interface seems to work well. Users like the multi-term fragment look-up capability and the ability to select a particular view of the information. The mapping to ICD-9 is helpful for billing and doesn't compromise the underlying representation of clinical reality.

The negatives associated with the VA's use of the UMLS Metathesaurus include the need for expanded coverage outlined above, the labor-intensive procedures required to update the lexicon when new editions of the Metathesaurus are issued (complicated in the VA's case because they were using the discontinued unit record format), the potential impact of newly announced changes in SUI semantics, and lack of regular communication with NLM regarding plans for the UMLS, although steps are being taken to improve this.

The VA's wish list includes better contact with other UMLS implementors; easier updates, perhaps through the use of something like the draft ETIF (Electronic Terminology Interchange Format) standard that relies on SGML (Standard Generalized Markup Language) encoding; more frequent and smaller updates; more physical exam and finding terminology which would help to address the skepticism among some VA clinicians regarding the UMLS; and particularly better liaison or strategic consultation with the UMLS developers to ensure better support for real world needs. Closer liaison should also help the VA to exploit the UMLS to assist with linkages to knowledge sources and with aggregation of patient data. Dr. Hammond asked Dr. Lincoln if he had anything to add. Dr. Lincoln underscored the importance of more history and physical concepts and more rapid turnaround of smaller updates. Dr. Hammond said that it was likely that strategic consultation with NLM could reveal practical ways to "ease the pain" of keeping a local system in sync with an evolving national product.

During the follow-up questions and comments, Dr. Hersh said that it was difficult for client-server university systems to interact with the VA system architecture and asked if there were efforts to move to a client-server approach. Dr. Kolodner responded that the VA was moving in this direction and a number of sites would be testing a client-server interface to the VA system in 1995. There is also a program to define interface standards for interaction with the VA system.

Dr. Payne asked how much of the UMLS Metathesaurus was incorporated in the VA lexicon. Dr. Hammond responded that most of it was taken. VA system users can limit displays to the particular source vocabularies they are interested in, however. Dr. Payne asked if the VA had looked at Read (which had been referenced on some of Dr. Hammond's slides). Dr. Hammond said they had looked at version 2 and had decided not to base their system on it. Dr. Lindberg asked what progress the VA had made on identifying and naming the problems needed on the VA problem list. Dr. Hammond and Dr. Lincoln both said they were very happy with the UMLS coverage of problems. In their initial review of the 1,000 most commonly seen problems in the VA they found 89% in the UMLS Metathesaurus. This is probably due to the inclusion in the Metathesaurus of frequently seen problems from COSTAR sites. They are providing data to NLM on the problems not found.

Dr. McDonald asked if the VA lexicon would be generally available. Dr. Hammond said that it would be. Mr. Tuttle said that the VA appeared to have accomplished an enormous amount in a relatively short time. Dr. Hammond said that the work to select a basis for the problem list began in 1991, the UMLS Metathesaurus was selected in June of 1992, and work on the lexicon began in December 1992. The lexicon is being shipped to sites with version 1.3 of the Metathesaurus. They would like to upgrade to 1.4 since it contains more content and more connections between ICD-9-CM and MeSH. Dr. Corn asked if the VA problem list was semi-permanently attached to the patient and moved with him as he went from facility to facility. Dr. Hammond responded that this was the goal, but not the current reality. It is difficult to achieve because all the VA sites don't operate from a common clinical database.

Food and Drug Administration's Requirements, Background Investigations, and Plans for a Comprehensive Vocabulary for International Regulatory Activities

In opening her presentation, Mary Jo Veverka, Deputy Commissioner for Management and Systems, FDA, likened the FDA's current situation to that of the VA a few years ago. The FDA is early in the process of identifying and evaluating terminologies that might serve their needs.

The FDA is interested in a broad range of clinical data, but its need is a regulatory need, not a health care, billing, or outcomes research need. The FDA's responsibility is to ensure the safety and effectiveness of a range of products. To perform its regulatory function, the FDA receives and analyzes both pre-market and post-market drug, biologics, and device data.

Pre-market data are submitted by industry when applying for approval of a drug or device. Each submission includes masses of data, 7-10 years' worth of information collected at multiple sites including patient data for 200 to 2,000 individuals. Some of the pre-market data is submitted in machine-readable form, but not in a standard form. Drug companies use their own vocabularies, usually based on a combination of ICD, COSTART, etc., in the pre-market data. Pre-market data are collected in a very structured, controlled data environment.

The post-market data are reports of adverse effects of drugs, biologics, or devices either submitted by manufacturers or directly by those providing care. The FDA received about 125,000 adverse drug reports and about 75,000 adverse device reports last year. The post-market adverse reports are coded by the FDA using COSTART, an adverse reaction terminology originally developed by the FDA in the 1960s. The post-market data are more random. The FDA needs to be able to search these data in ways that are likely to reveal any underlying patterns in the independent reports.

As part of a broad-based effort to streamline the new drug and device applications process and to improve its regulatory effectiveness, FDA has a strategic initiative to establish standards for automated submission of both pre and post-market data. Implementation of such standards will support more effective automated assistance to the reviewers, who must validate the large amounts of clinical information the FDA receives and identify patterns that may be indicative of safety and effectiveness problems, and more consistent labelling of FDA-approved products. In this environment the FDA's requirements for a vocabulary include: comprehensive coverage of the signs, symptoms, procedures, diagnoses, etc. that are relevant to both pre- and post-market surveillance data in order to minimize loss of clinical detail when data are encoded; ease of coding and minimization of subjective choices to improve data consistency; ability to query, retrieve, and aggregate data for multiple purposes; availability in the public domain; international accessibility and usability; and support for seamless access to large retrospective databases that are primarily encoded using COSTART or WHOART.

FDA is approaching the standards effort from an Agency-wide perspective, trying to standardize across programs dealing with biologics, drugs, blood products, devices, etc. Until this initiative was launched, the FDA had no mechanism for making decisions about adopting Agency-wide standards. The goal is consistent labelling regarding safety and effectiveness for all FDA-approved products. Four terminology areas have been selected as the initial focus: safety data, toxicology data, laboratory data, and demographics. Each area is going through a three-phase process. The first phase includes scoping the problem from the FDA's perspective and from the perspective of the regulated industry and looking at existing terminologies applicable to the area. The FDA does not want to reinvent the wheel. The second phase involves choosing some existing terminologies as a basis and developing a plan of attack to achieve a useful standard. The third phase will be adopting the standard and getting it used. All four areas are moving into the second phase at varying rates.

Ms. Veverka provided additional detail about the safety data area, on which the most progress has been made to date. The safety data area also has the most international involvement. It is being pursued under the auspices of the International Conference on Harmonization, an industry-initiated effort to develop data standards for regulatory submissions by the pharmaceutical and biotechnology industries. The members include the EEC countries, Japan, and the FDA. WHO and Canada participate with observer status.

Phase one involved an examination of three existing safety terminologies: COSTART, WHOART, and MEDDRA. COSTART is known to be deficient. WHOART, a similar terminology developed by the World Health Organization in the 1960s, is also generally considered to be inadequate. It is currently used for most international adverse reports data. Neither FDA (in the case of COSTART) nor WHO (in the case of WHOART) has invested sufficient resources in updating its vocabulary. MEDDRA was begun in the 1980s by the Medicines Control Agency of Great Britain, in part in response to the deficiencies of COSTART and WHOART. It includes most of COSTART and WHOART, plus many other concepts, in a flexible, multi-axial, hierarchical data structure. Two more general systems, ICD-9 and SNOMED International, were looked at more superficially.

Multiple agendas were being addressed in the initial examination of these systems, including the need to develop a consensus on how to proceed and to move the discussion to a technical level and away from politics and personalities. The results were not surprising, given the known deficiencies of COSTART and WHOART. MEDDRA looked the best, based on a preliminary evaluation which looked at its ability to encode verbatim reports in a restricted domain and to support flexible retrieval of data for a set of regulatory inquiries. MEDDRA's architecture is good, but it is known to be deficient in content. Alpha-testing in the regulatory environment will help to identify content deficiencies. Then decisions must be made about which existing term sets, such as SNOMED International, will be used to populate the MEDDRA system.

The FDA's next steps include examination of samples of data submissions to identify areas, beyond safety, toxicology, laboratory, and demographics, that also need vocabulary standards. The Agency will also be looking at the requirements for the system that will hold the standard terminologies designated for use in regulatory activity. This system should help regulated industry in the transition of their own systems to the new standards, should permit linking back to the health care systems that in the future will generate data received by regulatory agencies, and should support efficient updating. One of the big issues will be ensuring that there are appropriate updating mechanisms. FDA has a broad range of subject expertise to support its vocabulary standards efforts.

Ms. Veverka's talk provoked numerous questions and comments. Ms. Humphreys asked what was happening in the device area. Ms. Veverka said they had a funding stream from industry for the drug area so they were focusing there first. FDA's device experts were being consulted about safety terminology specifically related to devices, however.

Dr. McDonald stated that both the NCPDP and HL7 standards are broadly used and can accommodate adverse report messages of the type FDA needs to collect. FDA's regulatory needs really involve the same patients, the same diseases, drugs, and symptoms that are of interest in the health care environment. There is relevant work on vocabularies and message standards going on under the auspices of ANSI HISPP. There is not enough FDA presence at these meetings. As a result, FDA is ignoring some robust existing standards that could be applied or modified slightly to meet their needs and is looking at others that are only in the early stages of development.

Dr. Chute thanked Ms. Veverka for presenting the FDA's plans and applauded the Agency's increased interest in data standards. He also questioned the distinction between regulatory needs and needs of clinical epidemiology, for example, since all are based on patient data. He asked how interested the FDA was in ensuring that standards for computer-based patient records evolved so that data from these records could meet the Agency's needs. If this was of interest, it was important to ensure that the FDA's vocabulary efforts were coordinated with the efforts to develop a standard health care vocabulary that would be used in patient record systems. Ms. Veverka said that the FDA had no intention of developing another vocabulary. It was trying to get regulatory consensus on a preferred term and wanted to obtain these preferred terms from other existing systems, such as those used in patient records. FDA would not receive the bulk of its data from patient records for quite a long time and must move ahead on streamlining data submissions in the near term. FDA has formally asked professional societies to indicate what they consider to be the gold standard terminologies for their fields. [NOTE: To date (January 11, 1995), the FDA has received relatively few responses from professional societies. Most indicate that there is no current "gold standard" terminology for their field.]

Dr. Korpman said that he was speaking from the perspective of a vendor. Whether data were going to the FDA or HCFA or whomever, they were still the same patient data. His company is already translating data N ways for various requirements. He had hoped that this meeting would lead to progress toward a single vocabulary that could be used for multiple purposes; instead, he was hearing about additional independent vocabulary development efforts like LOINC, MEDDRA, etc. ICD-10 is going off in another direction. If we have to code everything in our system another 10 ways, then we will do it, but this is really idiotic. Ms. Veverka said that the FDA does not want to reinvent the wheel and would love to have the medical community come to some consensus. It does not appear that the Tower of Babel will disappear soon, however, and the FDA has got to proceed to meet its immediate needs. The FDA does not want responsibility for a terminology it can't maintain. Dr. K. Campbell asked who should take responsibility for maintenance and how it should be funded. Ms. Veverka said that part of their project would be an assessment of the cost and resources needed.

Several people pointed out that the existing HL7 standard might meet FDA's need for a messaging standard for adverse event reports. Mr. Shafarman invited the FDA to attend the next HL7 meeting and bring its requirements for messages to the Ancillary Data Reporting Technical Committee. The Committee could then determine the best way to use the HL7 standards to meet the FDA's message requirements. [NOTE: HL7 is currently looking at a 2.3 proposal to create special messages for reporting clinical trials data.] Ms. Veverka agreed that the FDA should look at the HL7 standard.

Specialized Vocabularies Likely to Be Useful in Patient-Record Systems

Peri Schuyler, Head, Medical Subject Headings Section, NLM opened by saying that although much of the discussion at the meeting had focused on the more general and comprehensive clinical vocabularies, more narrowly focused vocabularies could also contribute to the development of standard health care vocabulary. Such vocabularies offer enriched content, added depth, and perspectives of interest to particular groups. A number of the specialized vocabularies ARE regularly reviewed and maintained. Ms. Schuyler provided several illustrations, including the nursing vocabularies endorsed by the American Nurses Association, the PDQ Cancer Thesaurus, ECRI's Universal Medical Device Nomenclature System, and the Medical Subject Headings. The UMLS Metathesaurus already serves as a vehicle for linking these specialized vocabularies to each other and to more general clinical vocabularies and for distributing them in a uniform format.

ANSI HISPP Working Group on Codes and Vocabularies Draft Framework for Evaluating Clinical Vocabularies

Dr. Simon Cohn is co-chair of the CPRI Codes and Classifications Committee, chair of the ANSI HISPP Codes and Vocabularies Group, a member of the CPRI Board, and also, as the Clinical Information Systems Coordinator for Kaiser Permanente, a co-principal investigator on one of the Cooperative Agreements. (A copy of Dr. Cohn's overheads is included in Attachment 8.) Kaiser Permanente has 6.5 million members and operates 30 hospitals and many more clinics. Dr. Cohn said that he thought Dr. R. Miller had come up with a useful expression when he referred to "emeritus terms". Today many of us have "emeritus" or legacy systems and are using "emeritus codes" that are not suitable for current purposes. Kaiser has reached an historic point. It has recognized that it can support different systems, but it can't support different data structures or different content standards. If we just automate what we have, we will have "paved a cow path" that doesn't lead us anywhere. We have to start on a new course and use appropriate methods to evaluate our progress and to improve on our direction.

The ANSI HISPP Working Group is preparing a framework for evaluating clinical vocabularies. Dr. Cohn thanked Drs. Chute and J. Campbell for their contributions to the current draft and indicated that it had been informed by discussions of the full ANSI HISPP Working Group, by the CPRI, and by interactions with CEN TC 251 WG2. Dr. Cohn said that the Working Group is not looking for perfection. It is looking for a reasonable starting point and a strategy for moving ahead. The Working Group hopes that the framework document can help the process by focusing on what we need in a clinical vocabulary. It should also serve as an important communications tool.

Comparable data are essential for communication between health care providers, for outcomes research, for continuous quality improvement, for reimbursement and resource allocation, and, as Ken Hammond said earlier, for decision support. Decision support can't be implemented without data standards. Some people fear that the effort to develop clinical data standards is a thinly disguised effort to increase regulation of health care that will increase the administrative burden on health care providers. We need a communications plan to allay these fears and to involve people in the standards development process. We need standard vocabulary in many domains. James Campbell has prepared a draft document that outlines these domains, which is available for review and comment. Dr. Cohn said that he had hoped that there were some domains that didn't really need a controlled vocabulary. He has come to the realization that they all need standard vocabulary, but he thinks we may be able to identify priorities among the different domains.

The Working Group has stated the goal as "the evolution of a unified set of non-conflicting, non-redundant terminologies suitable for the complete patient record". The expectation is that these vocabularies will support efficient structured data entry. The Working Group draft identifies four dimensions or attributes for evaluation of vocabularies: scope, structural characteristics, maintenance characteristics, and useability characteristics.

The desiderata for scope include representation of the full range of concepts needed for the patient record; inclusion of synonyms, variants, and related terms; modifiers; representation of time intervals; natural or customary terminology; and context-free concepts. Desired structural characteristics include: atomic terms, with mapping to precoordinated terms; explicit rules for structure; definitions; terms that are not vague, ambiguous, or redundant; rules for combining terms; multiple classification and inheritance; logical relationship linkages, such as is-a and caused-by; language independent structure; and unique identifiers with no intrinsic meaning.
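The structural desiderata above can be illustrated with a minimal sketch. The identifiers, concept names, and field layout below are invented for illustration and are not drawn from any actual vocabulary:

```python
# Sketch of the structural desiderata: concepts carry unique identifiers
# with no intrinsic meaning, precoordinated terms map back to their atomic
# constituents, and logical relationships (e.g., is-a) are explicit links
# between identifiers. All identifiers and terms here are hypothetical.

concepts = {
    "C0001": {"name": "fracture", "atomic": True},
    "C0002": {"name": "femur", "atomic": True},
    "C0003": {"name": "fracture of femur", "atomic": False,
              # a precoordinated term mapped to its atomic constituents
              "atoms": ["C0001", "C0002"]},
}

relationships = [
    ("C0003", "is-a", "C0001"),  # fracture of femur is-a fracture
]

def atoms_of(concept_id):
    """Return the atomic constituents of a concept (itself if atomic)."""
    entry = concepts[concept_id]
    return [concept_id] if entry["atomic"] else list(entry["atoms"])

print(atoms_of("C0003"))  # ['C0001', 'C0002']
```

The point of the meaning-free identifiers is that nothing about "C0003" encodes its place in a hierarchy; classification can change without invalidating stored codes.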

Dr. Cohn said that maintenance is the critical issue. Where we start is less important than having supported evolutionary development to where we want to go. In his view, the clinical vocabulary should be developed as a national standard -- which means that no one agency, such as NLM, can do it alone. Since most people don't care about underlying data standards, attention must be paid to a lexicon that meets the needs of end users and developers and translation software that converts locally preferred terminology to the standard.

Dr. Cohn concluded by saying that the current version of the framework was a draft, and the Working Group was soliciting suggestions for improvements. The intent is to come away with a useful framework for evaluating clinical vocabularies and a 2- year work plan for both the CPRI codes committee and for ANSI HISPP.

Dr. McDonald asked whether the current draft was a consensus document. Dr. Cohn said it was the third pass, and he expected the final to be the seventh pass. Much more input is needed. Dr. McDonald said that the initial statement of the goal sounded like the group was looking for a single monolithic system. Since this is probably not possible and will be unpopular in some circles, it would be better to say that multiple systems will be merged to form the standard. Dr. McDonald also commented that the 2-year horizon seemed at odds with the magnitude of the task. Dr. Cohn said that the Working Group members did not think a solution would be reached in two years. They just wanted to define the steps that should be taken in the next two years to make progress toward the ultimate goal. Dr. Korpman (also a member of the Working Group) said that the Working Group wanted to look at the whole problem rather than a little piece of it, develop a framework for evaluating what's out there now, and see how close you can get to achieving the goal. Dr. McDonald commented that there was probably something that could be tackled that fell between "little" and "enormous".

Dr. Oliver inquired about the relationships between terms that should be in the standard vocabulary, i.e., what belongs in a knowledge base and what belongs in a vocabulary. Dr. Cohn said that this was an important point. Dr. Chute said that at the end of the day, say in the year 2094, we should be using a common knowledge base. Ms. Humphreys said that Dr. Oliver was raising the issue of whether rapidly changing information, such as the best treatment for any condition, should be maintained in the standard vocabulary. Dr. Barnett said there is a class of information that would be very useful to have in the vocabulary, e.g., that a particular drug belongs to the class of penicillins or that a condition has an effect on the liver, which was probably stable enough to be maintained. Ms. Humphreys commented that the Metathesaurus co-occurrence information, which represents an automated statistical analysis of certain information sources, was one approach to representing this kind of information without a huge maintenance burden.

Discussion: Vocabularies Suitable for Immediate Large Scale Testing and Evaluation

As a prelude to the discussion, Betsy Humphreys reiterated NLM's view of the role of the UMLS Metathesaurus in the development of a standard health care vocabulary. The UMLS Metathesaurus can serve as: (1) a distribution vehicle, (2) a means for mapping: between concepts within the health care vocabulary and from the health care vocabulary to other relevant classifications and vocabularies, including those used in billing and statistical systems and in knowledge sources, (3) a means for representing many different useful perspectives on the same concepts, and (4) a reasonable migration path for developers and users as we move from the current situation of multiple vocabularies to an eventual coherent standard U.S. health care vocabulary.
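The mapping role described in point (2) can be sketched in miniature. The concept identifier, source codes, and table layout below are invented for illustration and do not reproduce the actual Metathesaurus format:

```python
# Sketch of the Metathesaurus as a mapping vehicle: one concept, keyed by
# a meaning-free unique identifier, is linked to the codes and strings
# used for it in several source vocabularies. Identifiers and codes are
# hypothetical examples, not actual Metathesaurus content.

metathesaurus = {
    "C0027051": {
        "preferred": "Myocardial Infarction",
        "sources": {
            "ICD-9-CM": "410",
            "SNOMED": "D3-15000",
            "MeSH": "D009203",
        },
    },
}

def translate(concept_id, target_vocabulary):
    """Map a concept to its code in a target vocabulary, or None if absent."""
    return metathesaurus[concept_id]["sources"].get(target_vocabulary)

print(translate("C0027051", "ICD-9-CM"))  # 410
print(translate("C0027051", "Read"))      # None -- a coverage gap
```

A patient record system storing the concept identifier, rather than any one source code, can then derive billing or statistical codes on demand, which is the migration path Ms. Humphreys describes.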

Dr. K. Campbell said that at present only a portion of SNOMED International was in the Metathesaurus and asked whether NLM intended to change this. Ms. Humphreys responded that it was NLM's intention to add all of SNOMED to the Metathesaurus. Since it was time-consuming to review additions to the Metathesaurus to ensure correct synonymy, etc., this would not be completed for the 1995 edition. For the proposed large-scale testing of vocabularies for the patient record, an interim approach would have to be taken. One option was to convert the sections of SNOMED not yet integrated into the Metathesaurus into a format similar to the Metathesaurus, but without the mapping, so that testers would not have to deal with multiple data formats.

Mr. Martin asked whether NLM's strategy included allowing responsible groups to maintain their vocabularies within the Metathesaurus. Ms. Humphreys said that was definitely the goal. ECRI was already doing this partially for the Universal Medical Device Nomenclature. NLM hoped that the National Cancer Institute could be the test case for the full capability.

Ms. Moholt asked what happens if no one can afford to update clinical vocabulary, as had been implied in earlier discussions. Ms. Humphreys agreed that if we carry that argument to its logical conclusion, we are doomed. If we can make appropriate software tools available to vocabulary developers within the UMLS environment, we can reduce the costs. Vocabulary development will remain a labor-intensive process that needs stable funding, however. Dr. Barnett commented that it was difficult to establish the boundaries of a health care vocabulary. For example, should various kinds of pregnancy counselling be included? Ms. Humphreys said that, while the boundary problem was real, version 3.1 of Read appeared to have quite detailed coverage of pregnancy counselling and similar concepts. Although SNOMED also has some coverage in this area, this may be one of the places where the two are complementary.

Dr. R. Miller asked for clarification of what was meant by providing a migration path for developers. Ms. Humphreys responded that eventually the standard health care vocabulary will be a subset of the Metathesaurus, which will continue to cover concepts and terminology related to other parts of the broad biomedical and health enterprise. If you incorporate a UMLS unique identifier for a concept into your system today, the Metathesaurus will ensure that 10 years from now it still connects you to that concept in the standard health care vocabulary.

Ms. Humphreys then briefly repeated the purposes of the meeting: (1) to identify a set of vocabularies that should undergo large-scale testing to determine their suitability as a base for an eventual standard health care vocabulary; (2) to outline some of the major issues that have to be addressed in setting up the test; (3) to designate a small working group to develop procedures for the test, in particular for collecting and analyzing feedback from the testers. The working group will include some people from Cooperative Agreement sites since most of them will have to deal with whatever procedures are established.

In response to a comment from Mr. Tuttle, Ms. Humphreys said that if the vocabulary set selected is big enough then the need for sites to create their own concepts will be diminished. The assumption is that people will have to add concepts to meet local needs, however. The test should help us to see how often this happens, where the gaps are, and what resources will be required both to fill gaps and to deal with new concepts.

Dr. J. Campbell said that he gathered that the test would focus on the concept coverage issue and would not look at the set of relationships among concepts that have high clinical utility. Ms. Humphreys confirmed that the test as proposed would address the extent to which a selected set of clinical vocabularies met the needs of computer-based patient record systems. Work on relationships was not part of the specific agenda for the test but could certainly be pursued on a parallel track. Dr. J. Cimino said that he assumed that one purpose was still to link the clinical vocabulary to the vocabulary used in knowledge sources, such as MEDLINE, that are helpful in clinical decision making. Ms. Humphreys confirmed that this was still a high priority.

Ms. Humphreys introduced Dr. Milton Corn, Acting Associate Director for Extramural Programs, NLM, who chaired the rest of the discussion. Dr. Corn opened by saying that he was grateful for the practice because he was leaving right after the meeting to mediate in Bosnia for the U.N. The time has come to test drive a set of vocabularies. Previous speakers had revealed a parade of flaws in existing vocabularies, but perfection will take many years to achieve. In the meantime, people are not waiting. We have heard from several Federal agencies who have immediate needs and are going ahead. People in the room today are not the only interested parties; Fortune 500 companies are also going ahead. It is not wise for us to assume that the commercial sector will come up on its own with something that will meet all needs. It is not wise to wait forever. Some of the vocabularies we already have are really pretty good. We need to test them and see what happens. Dr. Corn assured the group that the test is not a disguised attack from the Federal government. It is a legitimate attempt to learn something that will get us closer to producing patient data that can be exchanged and aggregated.

Dr. Corn said that if attendees would accept the debating resolution that we will go forward with a large-scale test of existing vocabularies, he would like recommendations for what should be included in the set. If some thought the test was a bad idea, he wanted to hear that, too.

Dr. McDonald said he thought that the diagnoses, morphology, and organisms axes from SNOMED International should be included. He did not think it was necessary to take all of each vocabulary included in the test set. Dr. Cohn disagreed and said that all of each selected vocabulary should be included.

Dr. Milholland said that she thought the test was a great idea and should be pursued. She said that the four nursing vocabularies endorsed by the American Nurses Association should be included in the test set. Dr. Corn commented that he thought their inclusion was a given.

Dr. Lincoln strongly supported the inclusion of CPT in the test set and in the UMLS Metathesaurus as soon as possible. Dr. R. Miller said that he thought all vocabularies currently in the Metathesaurus should definitely be included, plus all parts of SNOMED, and probably the Read system, although he knew much less about it.

Dr. E. Hammond said we needed rules for determining how the disparate vocabularies are to be used. He thought we needed to define what is to be used on multiple levels. We need to know what is needed for existing message standards, but we also need to know what is needed for the complete patient record. The more quickly we can identify areas of terminology that are missing from existing systems and get working on filling the gaps, the better. Dr. Corn said that identification of gaps should be one of the fall-outs from the proposed test.

Dr. J. Campbell said that the data on SNOMED International were certainly compelling. What was needed was some assurance about its future, including ownership, etc. One reason the UMLS has been so successful and influential is that it has been freely available and has ongoing support. He commented that it is important to know the CAP's intentions regarding SNOMED and what strategic alliances might be built. SNOMED has things that the UMLS currently lacks. Ms. Humphreys said that the issue of what arrangements could and should be made with vocabulary owners is certainly important. The goal of this meeting was to focus on what are the best vocabularies to include. One of the follow-on strategy issues will be how to make the requisite arrangements with vocabulary developers.

Dr. McDonald reiterated that he thought it was better to take some, not all, of SNOMED. He thought the drug portion should not be included because NDC and the WHO drug terminology were better in this area. The standard vocabulary will have to include CPT and HCFA's terminology. Dr. K. Campbell said CPT couldn't represent the level of granularity needed. The standard health care vocabulary should map to the billing codes.

Dr. Cote, co-editor of SNOMED, said we should get an inventory of what the large vendors are using now. Maybe we can get vendors to supply useful data on what concepts are being used now in their systems and where local sites have to build their own terminology. He reported that CAP is forming a number of strategic alliances with other professional societies to improve SNOMED in specific areas. These include the American Nurses Association, the American Dental Association, and the American College of Radiologists. CAP is very open to different approaches. In fact, the CAP has been trying to transfer responsibility for all of this to NLM for years. CAP is committed to keeping the distribution price for SNOMED very low. The copyright of SNOMED is just so the CAP can keep control over what is done to it. Dr. Sennhauser, CAP's official representative to the meeting, said that with the release of SNOMED International CAP had attained a new plateau and a new level of recognition. He is chairing a select CAP committee to decide the future direction of SNOMED. The committee is open to suggestions. The views presented at the meeting have been helpful to him.

Dr. Chute said that on the issue of whether to have one monolithic system or to cut and paste, he wants a non-overlapping, non-redundant system. The UMLS can provide the links to any billing or other systems that will not go away. All the suggestions made previously were reasonable. He wonders if the best approach is to start with a blank sheet, agree on a structure, and then add in the pieces we need from the different existing systems.

Dr. Korpman strongly supported the inclusion of SNOMED. His company has found it useful for a wide range of purposes and it is clearly worthy of inclusion in the test. He thought it was better to include a whole that is logical than a piece of this and a piece of that. SNOMED is already mapping to other systems.

Ms. Humphreys asked what was meant by the phrase "non-redundant" coverage. She said that to meet all needs we would have to have precoordinated terms mapped to atomic terms. This was a form of redundancy. If one system had the atomic terms and others had useful precoordinated terms, it was surely better to use the existing precoordinated terms rather than forcing people to create new ones. Dr. Chute said that he agreed that mapping between atomic and precoordinated terms was needed and did not constitute the type of redundancy he wished to avoid. This gets at the need to decide on the structure in which these relationships can be represented. Dr. Hersh agreed and said we should agree on the structure at the outset of the test. He doesn't think it will take long to do this.

Dr. K. Hammond said that the worst kind of redundancy occurred within a single patient record when the use of different names separated information about the same problem. The value of the UMLS is its synonymy and ability to identify these risk areas. One type of evaluation would be to see if use of the UMLS reduces undetected synonymy in patient record systems.
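The evaluation Dr. K. Hammond proposes could be prototyped along the following lines. The synonym table, terms, and problem list are invented for illustration:

```python
# Sketch of using a synonym-to-concept table (the kind of synonymy the
# UMLS provides) to flag entries in one patient's problem list that name
# the same concept under different terms. All terms and concept
# identifiers below are hypothetical.

synonym_to_concept = {
    "heart attack": "C1",
    "myocardial infarction": "C1",
    "MI": "C1",
    "hypertension": "C2",
}

record_problem_list = ["heart attack", "hypertension", "myocardial infarction"]

def undetected_synonyms(problems):
    """Group problem-list entries by concept; groups of size > 1 indicate
    the same problem recorded under different names."""
    by_concept = {}
    for term in problems:
        by_concept.setdefault(synonym_to_concept[term], []).append(term)
    return {cid: terms for cid, terms in by_concept.items() if len(terms) > 1}

print(undetected_synonyms(record_problem_list))
# {'C1': ['heart attack', 'myocardial infarction']}
```

An evaluation of the kind suggested would compare the rate of such duplicate groups in records kept with and without UMLS-based term lookup.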

Dr. K. Campbell said that people were talking about two kinds of redundancy: (1) within a coding system, where it was necessary to represent the connections between the different ways of saying the same thing and (2) between coding systems, where different terms might be used for the same concepts. He thought that people in the room were more concerned about the latter. Ms. Humphreys thought that the second kind was not a problem at all as long as the different names from the different systems were explicitly linked and labelled as synonyms. Dr. J. Cimino said that synonymy was recognized redundancy and that wasn't a problem. What is needed is a representational scheme that will help us recognize, when a precoordinated term is added, that the corresponding atoms are there so we can link them to the precoordinated term.

Dr. Barnett said we need to focus on the vocabulary that will be needed to move from patient records to important information sources like practice guidelines, results of PORT studies, etc. What data elements are important for these connections and what vocabulary is needed to fill them?

Dr. Corn asked to review what had been said regarding selection of vocabularies: SNOMED was strongly recommended, and Read sounds positive, too. Dr. McDonald said LOINC should be added to the list. He stated that we will not achieve a single vocabulary, and we might as well recognize that fact now. SNOMED doesn't have much in the way of supplies, and its drug section is not as good as the WHO nomenclature. NDC, CPT, ICD-9-CM, and ECRI are all heavily used now and are important. Dr. Korpman said he didn't disagree that more than one would be needed, but it might be a better strategy to get as much as you can from one and then add things from other systems.

Dr. Lowe said that existing databases use non-standard terminology and we want to be able to aggregate them. Having a source that links concepts from many different systems will facilitate this. For creating new data, the need is a large population of concepts at the level of granularity required for the data at hand. We don't want to have to invent concept names de novo. There is no problem with having multiple labels or names for the same concept. We do have to decide on a structure. Both SNOMED and Read have rich structures. The shortcut route is to start with SNOMED and then proceed to work out the structural issues.

Dr. Lincoln agreed that structure was important. He is concerned about closely related terms. The semantics of relationships need to be addressed. His experience with Iliad taught him there are many nuances of meaning.

Dr. Cote said he could support Dr. McDonald's contention that it was possible to take pieces from different systems. In his discussions with other countries, the drug codes and the occupations were always a problem. Other countries would usually elect to use their own systems for these sections. You can in fact unplug parts of SNOMED and use the rest. It will work. CAP has no objection to this for the drug area.

Dr. J. Campbell said that the next logical step might be to expand the CPRI study to include version 3.1 of Read. This would provide the additional data needed to determine whether SNOMED International and the Read system offer complementary advantages. Ms. Humphreys said it would be good for such a study to be done while we also get the type of input from operational systems that the proposed test can provide. The test would offer a reservoir of concepts and concept names that people can use. While we collect information on what operational systems really need, these operational systems will be creating more data that are likely to be aggregatable in the future.

Dr. Huff said he also wished to emphasize the importance of the structural issues. If we don't start with an agreed-upon information model, we will not be able to exchange data. Things will be too amorphous without a defined context and data structure.

Dr. Kohane said that we can't wait for semantic purity. We need a pragmatic approach to moving ahead. Dr. Corn supported this view saying that a good system that comes too late is still useless.

Dr. R. Miller commented that there are more structures that can be imposed on good lexicons than there are good lexicons. He thought the best approach is to include all the credible candidates in their entirety, see where the gaps are, and proceed with work on the structure simultaneously.

Dr. E. Hammond said he thought we were missing a fundamental first step. It was important to define what we are really trying to evaluate in the test. We must agree on the use or the uses to which we will put coding systems. Are we talking about structured input or are we talking about providers using natural language that will then be converted into codes? Are we talking about identifying the best codes for reimbursement or for statistics or for exchanging data between systems? Ms. Humphreys said that she was not talking about codes at all. She was addressing a clinical vocabulary that lets us say what is really wrong with the patient and what was actually done about it. This clinical vocabulary should link to various codes when we need them for various purposes such as billing and statistics. Dr. E. Hammond said he meant to refer to vocabulary rather than codes, but his point about the need to decide what is being proposed and evaluated and why remains an important one. He is afraid that if we do not reach agreement about the purpose of the test up front we will have reached no conclusion two years from now.

Dr. Erlbaum commented that in some sense the Metathesaurus IS a coding scheme. It assigns a unique identifier (with no intrinsic meaning) to each concept. If you want a single system of identifiers for all concepts you will be able to use the UMLS identifiers when the Metathesaurus incorporates all of the vocabularies of interest, including SNOMED and Read. The current Metathesaurus structure already accommodates designation of allowable qualifiers and mapping of atomic concepts to precoordinated ones.
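The identifier scheme Dr. Erlbaum describes can be illustrated with a minimal sketch: each concept receives an opaque unique identifier, and any number of names from different source vocabularies may map to the same identifier. All class names, identifiers, and terms below are hypothetical illustrations, not the actual Metathesaurus format.

```python
# Minimal, hypothetical sketch of a concept table in the style described
# above: each concept gets an opaque unique identifier with no intrinsic
# meaning, and multiple names (synonyms from different source
# vocabularies) may map to the same concept.
import itertools


class ConceptTable:
    def __init__(self):
        self._ids = itertools.count(1)
        self._name_to_cui = {}   # name -> concept identifier
        self._cui_to_names = {}  # concept identifier -> set of names

    def add_concept(self, *names):
        """Register a new concept; the identifier itself carries no meaning."""
        cui = f"C{next(self._ids):07d}"
        self._cui_to_names[cui] = set()
        for name in names:
            self.add_synonym(cui, name)
        return cui

    def add_synonym(self, cui, name):
        """Attach another label (e.g. from another vocabulary) to a concept."""
        self._name_to_cui[name] = cui
        self._cui_to_names[cui].add(name)

    def lookup(self, name):
        """Resolve any known name to its concept identifier."""
        return self._name_to_cui.get(name)


# Two names for the same concept resolve to one identifier.
table = ConceptTable()
cui = table.add_concept("myocardial infarction", "heart attack")
assert table.lookup("heart attack") == table.lookup("myocardial infarction")
```

The identifiers are deliberately meaningless, as the minutes note: all semantics live in the names and relationships attached to a concept, so identifiers never need to be reassigned when a classification changes.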

Dr. Fuller raised the question of when the Internet-based UMLS Knowledge Source Server would be available to all UMLS users. This will be useful for the testing planned. The UMLS also has utility as it stands as a reference tool. Dr. McCray responded that the UMLS Knowledge Source Server was in beta-testing now. It is a client-server system. It will be available to all UMLS developers by mid-1995.

Dr. K. Hammond supported Dr. R. Miller's point that it was important to include a broad range of vocabularies in the evaluation. We should not be too narrow at the outset.

Mr. Tuttle said that if we leave this room without agreeing to some form of the test proposal we will send a very bad message to the many people in the country who want forward motion toward a health care vocabulary. There is no way the test can make the situation worse. There is a good chance it may make it better. Let's make all the credible candidates available to testers in a UMLS-like format and see what happens.

Dr. Corn thanked the group for a very useful discussion and offered Ms. Humphreys the opportunity to have the last word. Ms. Humphreys said that NLM would go forward with the test. Several issues/action items necessary to set up the testing had come out in the discussion. These include: defining the nature of the test more clearly; making suitable arrangements with those who have intellectual property rights for vocabularies that will be included in the test; putting all the vocabularies to be tested into a format similar to the Metathesaurus, even though some of them have not yet been incorporated into the Metathesaurus; and looking at the current Metathesaurus structure and how it can be augmented to represent all essential features of a standard health care vocabulary. The Working Group that will draft the procedures for the testing will include: Jim Cimino, Simon Cohn, Chris Chute, Mark Tuttle, Bill Hole, someone from the VA to be designated by Rob Kolodner, someone from AHCPR, and herself. Although sympathetic to Dr. E. Hammond's point about defining the purpose and parameters of the evaluation, Ms. Humphreys said that there will be value in letting a range of institutions test the ability of the set of vocabularies to meet their individual purposes. This will provide a broad view of the extent to which they encompass the concepts needed in a health care vocabulary.

Ms. Humphreys thanked the participants and said that minutes and copies of slides used at the meeting would be sent to all of them.


Last updated: 25 May 1998
First published: 25 May 1998
Permanence level: Permanent: Stable Content
