
Unified Medical Language System® (UMLS®)

2012AA Consumer Health Vocabulary Source Information





Notes

Summary of Changes:

No changes were made to the release format of CHV or the processing for the 2012AA Metathesaurus.  Updates to the CHV content primarily consisted of deletion and/or correction of misspelled terms. 

In the previous version, CHV term values with potential spelling errors were identified by setting MRCONSO.SUPPRESS="E". There are 20 atoms remaining with SUPPRESS="E"; these will be reviewed in future versions to determine whether they should be changed to "N".
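
The atoms that still carry the flag can be listed from a Metathesaurus subset. The following is a minimal sketch, assuming the standard pipe-delimited MRCONSO.RRF column order (SAB in the 12th field, STR in the 15th, SUPPRESS in the 17th). The function name and the sample rows are illustrative only and are not part of the CHV or UMLS distribution.

```python
import io

# Zero-based field positions in MRCONSO.RRF (assumed standard column order).
SAB, STR, SUPPRESS = 11, 14, 16

def chv_atoms_with_suppress(lines, flag="E"):
    """Yield STR for CHV atoms whose SUPPRESS value equals `flag`."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= SUPPRESS:
            continue  # skip blank or malformed lines
        if fields[SAB] == "CHV" and fields[SUPPRESS] == flag:
            yield fields[STR]

# Made-up rows for illustration; real input would be the MRCONSO.RRF file.
sample = io.StringIO(
    "C0000001|ENG|P|L1|PF|S1|Y|A1|s1|c1||CHV|PT|c1|hart attack|0|E||\n"
    "C0000001|ENG|S|L2|PF|S2|Y|A2|s2|c1||CHV|SY|c1|heart attack|0|N||\n"
    "C0000001|ENG|S|L3|PF|S3|Y|A3||||MSH|MH|D1|Myocardial Infarction|0|N||\n"
)
flagged = list(chv_atoms_with_suppress(sample))
```

With real data, passing an open file handle for MRCONSO.RRF in place of `sample` should work the same way, since the function only iterates over lines.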

Source-Provided Files: Summary

The CHV distribution includes the following files. These files, along with additional information, can be accessed at http://www.consumerhealthvocab.org

Documentation:

File                                  Description
http://www.consumerhealthvocab.org/   Source website
ReadMe.pdf                            README file

Data:

File                                       Description
CHV_concepts_terms_flatfile_20110204.tsv   Tab-separated data file


Not included
  • The CUI and UMLS Preferred Name are not explicitly represented in the Metathesaurus; however, they are processed to help discover synonymy between CHV terminology and other UMLS sources.
  • The UMLS preferred flag column is not processed.
  • CHV terms with disparaged = "yes" are not included in the Metathesaurus at this time.
  • Attributes with value "\N" are not included in the Metathesaurus release.
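
The last two exclusion rules can be sketched as simple filters. This is an illustrative sketch only, not the actual Metathesaurus processing code: the function names are hypothetical, rows are assumed to be dictionaries keyed by the field names of the data file, and the null marker is assumed to be the literal two-character string shown above.

```python
# Literal backslash followed by N, used as the null marker in the data file.
NULL_MARKER = r"\N"

def keep_row(row):
    """Return False for rows whose Disparaged flag is "yes"."""
    return row.get("Disparaged", "").strip().lower() != "yes"

def drop_null_attributes(attributes):
    """Return a copy of `attributes` without entries holding the null marker."""
    return {name: value for name, value in attributes.items()
            if value != NULL_MARKER}

# Made-up rows and attribute values for illustration.
rows = [
    {"CHV_term": "heart attack", "Disparaged": "no"},
    {"CHV_term": "hart attack", "Disparaged": "yes"},
]
kept = [row["CHV_term"] for row in rows if keep_row(row)]
attrs = drop_null_attributes({"FREQUENCY": "0.8", "CONTEXT_SCORE": NULL_MARKER})
```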

Source-Provided Files: Details

The following is a list of elements available for CHV in the tab-separated data file.

Notes: 
  • During Metathesaurus source processing, CHV term values with potential spelling errors were identified by comparing words in the CHV term to words in the Specialist Lexicon and to a subset of words in the Metathesaurus (MRXW.ENG).  Term values which had potential spelling errors have MRCONSO.SUPPRESS="E".  It is anticipated that any errors will be corrected in a future update of CHV.
  • All score attributes have a range 0 to 1 (a higher score implies the term is easier). A value of -1 indicates the score could not be estimated.
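
The spelling check described in the first note can be approximated as a word-list lookup. The sketch below is an assumption about the general approach, not the actual processing code: the tokenization rule, the function names, and the tiny word set are hypothetical, and a real word list would be built from the Specialist Lexicon and MRXW.ENG.

```python
import re

def has_potential_spelling_error(term, known_words):
    """Return True if any word in `term` is missing from `known_words`."""
    # Assumed tokenization: lowercase runs of letters and digits.
    words = re.findall(r"[a-z0-9]+", term.lower())
    return any(word not in known_words for word in words)

def suppress_flag(term, known_words):
    """Return "E" for terms with a potential spelling error, else "N"."""
    return "E" if has_potential_spelling_error(term, known_words) else "N"

# Tiny made-up word list for illustration.
known = {"heart", "attack"}
flag_ok = suppress_flag("heart attack", known)
flag_bad = suppress_flag("hart attack", known)
```

A lookup of this kind flags any word missing from the list, including valid words the list does not cover, which is consistent with the note describing the flagged terms as having potential rather than confirmed errors.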

# Field
Description
Representation
1 CUI
UMLS CUI for this term.  (string)
Used to discover synonymy between CHV terms and terms from other UMLS sources
2 CHV_term
Term as found in text.  (string)
MRCONSO.STR
3 UMLS_preferred_name
Preferred name for UMLS CUI.  (string)
Used to discover synonymy between CHV terms and terms from other UMLS sources
4 CHV_preferred_name
Preferred name as defined in Consumer Health Vocabulary.  (string)
Not directly processed
5 Explanation
Explanation or definition for the term, if available. (string)
MRDEF.DEF
6 CHV_preferred
A boolean variable (yes/no) indicating whether this is the preferred CHV name for the concept.  (string)
Used to determine TTY.  CHV Terms with preferred flag = "yes" are assigned TTY="PT".  CHV Terms with preferred flag = "no" are assigned TTY = "SY".
7 UMLS_preferred
A boolean variable (yes/no) indicating whether this is the preferred UMLS name for the concept.  (string)
Not processed
8 Disparaged
A value of "yes" in the CHV data indicates a misspelling or other abnormality.  For this version, disparaged terms were not processed, so all cases of ATN="DISPARAGED" have ATV="no".  (string)
CHV Terms with Disparaged = "yes" are not included in the Metathesaurus at this time
9 Frequency_score
Estimate of the difficulty of a term, i.e. how likely it is that an average reader will be familiar with or understand a given term.  Based on the frequency in several large text corpora.  A higher score indicates that a term is more familiar (less difficult).  (real number)
MRSAT.ATN = "FREQUENCY"
10 Context_score
Context-based estimate of the difficulty of the term.  (real number)
MRSAT.ATN = "CONTEXT_SCORE"
11 CUI_score
Estimate of the difficulty of the concept (CUI) derived from determining how closely related the concept is to known examples of easy and difficult concepts.  (real number)
MRSAT.ATN = "CUI_SCORE"
12 Combo_score
Combination of frequency, context and CUI scores.  Also uses whether or not the term is a top word.  (real number)
MRSAT.ATN = "COMBO_SCORE"
13 Combo_score_no_top_words
A slight modification to Combo_score that ignores the top word criterion.  The top word list is a list of easy words from the Dale-Chall list.  (real number)
MRSAT.ATN = "COMBO_SCORE_NO_TOP_WORDS"
14 CHV_string_id
Unique identifier for each entry in the CHV.  (string)
MRCONSO.SAUI
15 CHV_concept_id
Unique identifier for every concept in the CHV.  (string)
MRCONSO.CODE, MRCONSO.SCUI
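
The mapping in the table can be illustrated with a short sketch that reads one tab-separated row and derives the corresponding MRCONSO and MRSAT values. This is a sketch under stated assumptions, not the actual processing code: the function names and sample row are hypothetical, fields are read by position in the order listed above, and whether the data file carries a header row is not stated here.

```python
import csv
import io

COLUMNS = [
    "CUI", "CHV_term", "UMLS_preferred_name", "CHV_preferred_name",
    "Explanation", "CHV_preferred", "UMLS_preferred", "Disparaged",
    "Frequency_score", "Context_score", "CUI_score", "Combo_score",
    "Combo_score_no_top_words", "CHV_string_id", "CHV_concept_id",
]
# Score column -> MRSAT.ATN, as listed in the table.
SCORE_ATTRIBUTES = {
    "Frequency_score": "FREQUENCY",
    "Context_score": "CONTEXT_SCORE",
    "CUI_score": "CUI_SCORE",
    "Combo_score": "COMBO_SCORE",
    "Combo_score_no_top_words": "COMBO_SCORE_NO_TOP_WORDS",
}
NULL_MARKER = r"\N"  # literal backslash followed by N

def to_atom(fields):
    """Map one 15-field row to (MRCONSO-style dict, MRSAT-style dict)."""
    row = dict(zip(COLUMNS, fields))
    atom = {
        "STR": row["CHV_term"],
        "TTY": "PT" if row["CHV_preferred"] == "yes" else "SY",
        "SAUI": row["CHV_string_id"],
        "CODE": row["CHV_concept_id"],
        "SCUI": row["CHV_concept_id"],
    }
    # Attributes holding the null marker are left out.
    attributes = {atn: row[col] for col, atn in SCORE_ATTRIBUTES.items()
                  if row[col] != NULL_MARKER}
    return atom, attributes

def parse_score(text):
    """Return the score as a float, or None when it is -1 (not estimated)."""
    value = float(text)
    return None if value == -1 else value

# Made-up row for illustration; real rows come from the CHV flat file.
sample = io.StringIO(
    "C0000001\theart attack\tMyocardial Infarction\theart attack\t\\N\tyes\t"
    "no\tno\t0.9\t0.8\t-1\t0.85\t\\N\t0000001\t0000002\n"
)
reader = csv.reader(sample, delimiter="\t", quoting=csv.QUOTE_NONE)
atom, attributes = to_atom(next(reader))
```

In this sketch the score values are kept as strings, and `parse_score` is applied only where a numeric value is needed, so the -1 sentinel remains visible in the raw attributes.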