The Unified Medical Language System® (UMLS®) Metathesaurus® is a large, multi-purpose, and multi-lingual thesaurus that contains millions of biomedical and health-related concepts, their synonymous names, and their relationships. The Metathesaurus is one of the three UMLS components: the Metathesaurus, the Semantic Network, and the SPECIALIST Lexicon. The National Library of Medicine (NLM) updates the UMLS twice a year, in May and November.
The Metathesaurus includes over 150 electronic versions of classifications, code sets, thesauri, and lists of controlled terms in the biomedical domain. These are the source vocabularies of the Metathesaurus. Their uses include:
- patient care
- health services billing
- public health statistics
- indexing and cataloging of biomedical literature
- basic, clinical, and health services research
The term Metathesaurus draws on Webster's Dictionary third definition for the prefix "Meta," i.e., "more comprehensive, transcending." In a sense, the Metathesaurus transcends the specific thesauri, codes, and classifications it encompasses.
The Metathesaurus is a set of relational files organized by concept, or meaning. It links alternative names and views of the same concept from different source vocabularies and identifies useful relationships between different concepts. All vocabularies are available in one of two UMLS common fully-specified database formats, Rich Release Format (RRF) and Original Release Format (ORF).
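The concept-centered organization described above can be illustrated with a short Python sketch. It assumes the concept names file is MRCONSO.RRF, that each row is pipe-delimited with a trailing pipe, and that the columns appear in the order listed in the code; confirm the layout against the release documentation for your UMLS version. The sample rows, identifiers, and source abbreviations (SRC_A, SRC_B) are invented for illustration and are not real Metathesaurus content.

```python
from collections import defaultdict

# Assumed MRCONSO.RRF column order; verify against the release documentation.
MRCONSO_FIELDS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def parse_row(line):
    """Split one pipe-delimited row into a dict keyed by column name."""
    values = line.rstrip("\n").split("|")
    # The trailing pipe yields one extra empty value; zip() ignores it.
    return dict(zip(MRCONSO_FIELDS, values))


def names_by_concept(lines):
    """Group (source, name) pairs under the concept identifier they share."""
    concepts = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        row = parse_row(line)
        concepts[row["CUI"]].add((row["SAB"], row["STR"]))
    return dict(concepts)


# Hypothetical rows: two sources name the same concept differently.
SAMPLE_ROWS = [
    "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||||SRC_A|PT|X1|Example disorder|0|N||",
    "C0000001|ENG|S|L0000002|PF|S0000002|Y|A0000002||||SRC_B|PT|77|Example disease|0|N||",
    "C0000002|ENG|P|L0000003|PF|S0000003|Y|A0000003||||SRC_A|PT|X2|Another finding|0|N||",
]

if __name__ == "__main__":
    for cui, names in sorted(names_by_concept(SAMPLE_ROWS).items()):
        print(cui, sorted(names))
```

With a real release, the same function could be fed an open file handle instead of the sample list, for example `names_by_concept(open("MRCONSO.RRF", encoding="utf-8"))`, where the path is whatever your installation produced.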
The Semantic Network provides a broad categorization of Metathesaurus concepts. At least one semantic type is assigned to each concept in the Metathesaurus. Semantic types include anatomical structure, biological function, chemical, disease or syndrome, laboratory or test result, medical device, and organism. The Semantic Network also defines the set of relationships between the semantic types.
Properties and Scope
The Metathesaurus attempts to maintain the original meanings and relationships provided by its source vocabularies. This principle is called "source transparency." Examples of this include:
- Maintaining terms, identifiers, and other content provided by sources, and assigning them into Metathesaurus concepts or attributes.
- Providing a way to retrieve the original meaning of source-provided concepts, even if that meaning differs from the meaning of the Metathesaurus concept, identified by its concept unique identifier (CUI), to which a term from a source is assigned.
- Preserving relationships between concepts or terms provided by a source, including all source-asserted hierarchies.
Although specific concept names or relationships from some source vocabularies may be idiosyncratic and lack face validity, the Metathesaurus still includes them. In such cases, Metathesaurus editors may flag the terms as "suppressible" so that users can easily exclude them.
The result is that the Metathesaurus does not represent an NLM-authored ontology or a single static view of biomedicine. Rather, the Metathesaurus preserves the many views of the world represented by its many source vocabularies from around the world.
The scope of its many source vocabularies determines the scope of the Metathesaurus. Note that NLM does add many relationships (primarily synonymous), concept attributes, and some concept names during Metathesaurus production and maintenance. However, all the concepts essentially come from one or more of the source vocabularies. With a few exceptions, if none of the source vocabularies contains a concept, that concept will not appear in the Metathesaurus.
Systems developers are the intended users of the Metathesaurus. The Metathesaurus supplies information that computer programs can use to create standard data, interpret user inquiries, interact with users to refine their questions, and convert the users' terms into the vocabulary used in relevant information sources.
Examples of Metathesaurus use cases are:
- electronic health records
- structured data entry
- synonymous biomedical terminology mapping
- patient health record linkage to related information in bibliographic, full-text, and factual databases
- natural language processing and automated indexing research
- linking between different clinical or biomedical vocabularies
- information retrieval
Customizing the Metathesaurus
You must customize the Metathesaurus in order to use it effectively in a local application. The Metathesaurus consists of over two million concepts and many relational files, some of which are extremely large. Users rarely require the entire set of source vocabularies for their applications.
Customization requires an understanding of the functional requirements for your specific application, your specific license arrangements, and the characteristics of relevant source vocabularies. Some users may need to customize the Metathesaurus by limiting vocabularies, languages, relationships, attributes, or license restrictions.
Your customization decisions significantly affect the utility of the Metathesaurus for your application. Vocabulary sources that are essential for some purposes, e.g., LOINC for standard exchange of laboratory data, may be detrimental for other purposes such as natural language processing. It may be important to exclude a subset of the concept names found in a source vocabulary that is otherwise useful, e.g., non-standard abbreviations or shortened forms that lack face validity or produce inaccurate results in natural language processing.
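To make these customization criteria concrete, the following Python sketch filters concept-name rows by source vocabulary, language, and suppressibility. It only illustrates the idea and does not replace MetamorphoSys. It assumes pipe-delimited MRCONSO.RRF rows in which the language (LAT), source abbreviation (SAB), name string (STR), and suppressibility flag (SUPPRESS) sit at the zero-based positions shown, and that a SUPPRESS value of "N" marks a name that is not suppressible; check both assumptions against the release documentation. The rows and source abbreviations are invented.

```python
# Assumed zero-based column positions in a pipe-delimited MRCONSO.RRF row.
LAT, SAB, STR, SUPPRESS = 1, 11, 14, 16


def keep_row(fields, sources=None, languages=None, drop_suppressible=True):
    """Return True if a row passes every requested restriction."""
    if sources is not None and fields[SAB] not in sources:
        return False
    if languages is not None and fields[LAT] not in languages:
        return False
    # Assumption: "N" means the name is not flagged as suppressible.
    if drop_suppressible and fields[SUPPRESS] != "N":
        return False
    return True


def subset_names(lines, **criteria):
    """Return the name strings of all rows that satisfy the criteria."""
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if keep_row(fields, **criteria):
            kept.append(fields[STR])
    return kept


# Hypothetical rows: an English name, a Spanish name from another source,
# and a shortened form flagged as suppressible.
SAMPLE_ROWS = [
    "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||||SRC_A|PT|X1|Example disorder|0|N||",
    "C0000001|SPA|P|L0000002|PF|S0000002|Y|A0000002||||SRC_B|PT|77|Trastorno de ejemplo|0|N||",
    "C0000001|ENG|S|L0000003|PF|S0000003|N|A0000003||||SRC_A|AB|X1|Ex dis|0|O||",
]

if __name__ == "__main__":
    print(subset_names(SAMPLE_ROWS, sources={"SRC_A"}, languages={"ENG"}))
```

Passing `drop_suppressible=False` keeps the flagged abbreviation, which shows how one restriction can be relaxed while the others stay in force.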
For data creation uses such as patient data entry, you will need to select the Metathesaurus source vocabularies that provide the most appropriate concepts and terms to include in your application.
You must use MetamorphoSys to customize the Metathesaurus. MetamorphoSys is the UMLS installation wizard and Metathesaurus customization tool, updated and included in each UMLS release. MetamorphoSys installs the Metathesaurus and enables users to create customized Metathesaurus vocabulary subsets.
Downloading and Browsing the Metathesaurus
The Metathesaurus is freely available to both U.S. and international users. Most vocabularies included in the Metathesaurus have minimal restriction on their usage. Certain uses of some Metathesaurus source vocabularies require separate agreements that may involve fees with the individual vocabulary producers.
NLM distributes the Metathesaurus with each UMLS release. You must accept the terms of the UMLS Metathesaurus License and create a UMLS Terminology Services (UTS) account to access the UMLS. You may download the UMLS using your UTS account. For instructions on requesting a license and accessing the UTS, see How to License and Access the Unified Medical Language System® (UMLS®) Data.
The UTS includes a Metathesaurus Browser (Applications menu) and provides an application programming interface (API) for developers. For more information, see the UTS Fact Sheet.
See the Metathesaurus homepage on the UMLS Web site for help documentation and source vocabulary information.
The Metathesaurus release documentation has details about the latest version of the Metathesaurus including a list of updated source vocabularies, a complete list of all source vocabularies, and statistics.
For general information on NLM services, contact:
National Library of Medicine
8600 Rockville Pike
Bethesda, MD 20894
Telephone: 1-888-FINDNLM (1-888-346-3656)
NLM Customer Service Form at Contact NLM
A complete list of NLM Factsheets is available at:
(alphabetical list): http://www.nlm.nih.gov/pubs/factsheets/factsheets.html
(subject list): http://www.nlm.nih.gov/pubs/factsheets/factsubj.html
Or write to:
Office of Communications and Public Liaison
National Library of Medicine
8600 Rockville Pike
Bethesda, MD 20894
Phone: (301) 496-6308
Fax: (301) 496-4450