Unified Medical Language System® (UMLS®)
2012AA Consumer Health Vocabulary Source Information
Notes
Summary of Changes:
No changes were made to the release format of CHV or the
processing for the 2012AA Metathesaurus. Updates to the
CHV content primarily consisted of deletion and/or correction of
misspelled terms.
In the previous version, CHV term values with potential spelling errors were identified by setting MRCONSO.SUPPRESS="E". There are 20 atoms remaining with SUPPRESS="E"; these will be reviewed in future versions to determine whether they should be changed to "N".
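As an illustration of how the remaining flagged atoms could be located, the following minimal sketch scans MRCONSO.RRF-style lines for CHV atoms with SUPPRESS="E". It assumes the standard pipe-delimited MRCONSO column order and the source abbreviation "CHV"; the function name `chv_atoms_flagged_e` and the sample rows are made up for this example and are not real Metathesaurus data.

```python
import io

# Standard MRCONSO.RRF column order (pipe-delimited, trailing pipe per line).
MRCONSO_FIELDS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def chv_atoms_flagged_e(lines):
    """Yield (AUI, STR) for CHV atoms whose SUPPRESS value is "E"."""
    for line in lines:
        values = line.rstrip("\n").split("|")
        row = dict(zip(MRCONSO_FIELDS, values))
        if row.get("SAB") == "CHV" and row.get("SUPPRESS") == "E":
            yield row["AUI"], row["STR"]


# Made-up rows for illustration only; identifiers and strings are not real.
sample = io.StringIO(
    "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001|s1|c1||CHV|PT|c1|exampel term|0|E||\n"
    "C0000002|ENG|P|L0000002|PF|S0000002|Y|A0000002|s2|c2||CHV|PT|c2|example term|0|N||\n"
)
flagged = list(chv_atoms_flagged_e(sample))
print(flagged)
```

To run this against a real release, pass an open file handle for MRCONSO.RRF in place of the in-memory sample.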
Source-Provided Files: Summary
The CHV distribution includes the following. These files, along
with additional information, can be accessed at
http://www.consumerhealthvocab.org
| File | Description |
|---|---|
| http://www.consumerhealthvocab.org/ | Source website |
| ReadMe.pdf | README file |
Data:
| File | Description |
|---|---|
| CHV_concepts_terms_flatfile_20110204.tsv | Tab-separated data file |
Not included:
- The CUI and UMLS Preferred Name are not explicitly represented in the Metathesaurus; however, they are processed to help discover synonymy between CHV terminology and other UMLS sources.
- The UMLS preferred flag column is not processed.
- CHV terms with disparaged = "yes" are not included in the Metathesaurus at this time.
- Attributes with value "\N" are not included in the Metathesaurus release
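The exclusion rules above can be sketched as a small row filter. This is only an illustration under stated assumptions, not the actual Metathesaurus source-processing code: the function name `keep_for_release` is hypothetical, and the column names are taken from the field list in this document.

```python
NULL_MARKER = r"\N"  # literal backslash-N used for empty values in the data file

# Columns that are not carried into the Metathesaurus as explicit data
# (CUI and UMLS_preferred_name are only used to help discover synonymy).
NOT_REPRESENTED = {"CUI", "UMLS_preferred_name", "UMLS_preferred"}


def keep_for_release(row):
    """Return a cleaned copy of the row, or None if the term is disparaged."""
    if row.get("Disparaged", "").strip().lower() == "yes":
        return None
    return {
        name: value
        for name, value in row.items()
        if name not in NOT_REPRESENTED and value != NULL_MARKER
    }


# Made-up row for illustration only.
example_row = {
    "CUI": "C0000000",
    "CHV_term": "example term",
    "UMLS_preferred_name": "Example Concept",
    "Explanation": r"\N",
    "UMLS_preferred": "no",
    "Disparaged": "no",
}
cleaned = keep_for_release(example_row)
print(cleaned)
```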
Source-Provided Files: Details
The following is a list of elements available for CHV in the
tab-separated data file.
- During Metathesaurus source processing, CHV term values with potential spelling errors were identified by comparing words in the CHV term to words in the Specialist Lexicon and to a subset of words in the Metathesaurus (MRXW.ENG). Term values which had potential spelling errors have MRCONSO.SUPPRESS="E". It is anticipated that any errors will be corrected in a future update of CHV.
- All score attributes have a range 0 to 1 (a higher score implies the term is easier). A value of -1 indicates the score could not be estimated.
| # | Field | Description | Representation |
|---|---|---|---|
| 1 | CUI | UMLS CUI for this term. (string) | Used to discover synonymy between CHV terms and terms from other UMLS sources |
| 2 | CHV_term | Term as found in text. (string) | MRCONSO.STR |
| 3 | UMLS_preferred_name | Preferred name for UMLS CUI. (string) | Used to discover synonymy between CHV terms and terms from other UMLS sources |
| 4 | CHV_preferred_name | Preferred name as defined in Consumer Health Vocabulary. (string) | Not directly processed |
| 5 | Explanation | Explanation or definition for the term, if available. (string) | MRDEF.DEF |
| 6 | CHV_preferred | A boolean variable (yes/no) indicating whether this is the preferred CHV name for the concept. (string) | Used to determine TTY. CHV terms with preferred flag = "yes" are assigned TTY = "PT". CHV terms with preferred flag = "no" are assigned TTY = "SY". |
| 7 | UMLS_preferred | A boolean variable (yes/no) indicating whether this is the preferred CHV name for the concept. (string) | Not processed |
| 8 | Disparaged | A value of "yes" in the CHV data indicates a misspelling or other abnormality. For this version, disparaged terms were not processed, so all cases of ATN="DISPARAGED" have ATV="no". (string) | CHV terms with Disparaged = "yes" are not included in the Metathesaurus at this time |
| 9 | Frequency_score | Estimate of the difficulty of a term, i.e. how likely it is that an average reader will be familiar with or understand a given term. Based on the frequency in several large text corpora. A higher score indicates that a term is more familiar (less difficult). (real number) | MRSAT.ATN = "FREQUENCY" |
| 10 | Context_score | Context-based estimate of the difficulty of the term. (real number) | MRSAT.ATN = "CONTEXT_SCORE" |
| 11 | CUI_score | Estimate of the difficulty of the concept (CUI) derived from determining how closely related the concept is to known examples of easy and difficult concepts. (real number) | MRSAT.ATN = "CUI_SCORE" |
| 12 | Combo_score | Combination of frequency, context and CUI scores. Also uses whether or not the term is a top word. (real number) | MRSAT.ATN = "COMBO_SCORE" |
| 13 | Combo_score_no_top_words | A slight modification to Combo_score that ignores the top word criterion. The top word list is a list of easy words from the Dale-Chall list. (real number) | MRSAT.ATN = "COMBO_SCORE_NO_TOP_WORDS" |
| 14 | CHV_string_id | Unique identifier for each entry in the CHV. (string) | MRCONSO.SAUI |
| 15 | CHV_concept_id | Unique identifier for every concept in the CHV. (string) | MRCONSO.CODE, MRCONSO.SCUI |
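
The field-to-representation mapping in the table above can be sketched in code as follows. This is a minimal illustration, not the actual source-processing logic. It assumes the data file has no header row and that its 15 columns appear in the order listed; the helper names (`parse_score`, `to_atom`, `read_chv`) and the sample row are hypothetical. Treating a score of -1 as missing is a choice made for this sketch; the source only states that -1 means the score could not be estimated.

```python
import csv
import io

NULL_MARKER = r"\N"  # literal backslash-N marks an empty value

# Column order as listed in the table; assumed to match the file layout.
COLUMNS = [
    "CUI", "CHV_term", "UMLS_preferred_name", "CHV_preferred_name",
    "Explanation", "CHV_preferred", "UMLS_preferred", "Disparaged",
    "Frequency_score", "Context_score", "CUI_score", "Combo_score",
    "Combo_score_no_top_words", "CHV_string_id", "CHV_concept_id",
]

# Score columns and the MRSAT.ATN values they map to.
SCORE_ATN = {
    "Frequency_score": "FREQUENCY",
    "Context_score": "CONTEXT_SCORE",
    "CUI_score": "CUI_SCORE",
    "Combo_score": "COMBO_SCORE",
    "Combo_score_no_top_words": "COMBO_SCORE_NO_TOP_WORDS",
}


def parse_score(text):
    """Return the score as a float, or None when it is missing or -1."""
    if text == NULL_MARKER:
        return None
    value = float(text)
    return None if value == -1 else value


def to_atom(row):
    """Map one CHV row to the Metathesaurus elements named in the table."""
    attributes = {}
    for column, atn in SCORE_ATN.items():
        score = parse_score(row[column])
        if score is not None:
            attributes[atn] = score
    explanation = row["Explanation"]
    return {
        "STR": row["CHV_term"],
        "TTY": "PT" if row["CHV_preferred"] == "yes" else "SY",
        "SAUI": row["CHV_string_id"],
        "CODE": row["CHV_concept_id"],
        "SCUI": row["CHV_concept_id"],
        "DEF": None if explanation == NULL_MARKER else explanation,
        "attributes": attributes,
    }


def read_chv(stream):
    """Yield mapped atoms, skipping rows with Disparaged = "yes"."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for values in reader:
        row = dict(zip(COLUMNS, values))
        if row["Disparaged"] == "yes":
            continue
        yield to_atom(row)


# Made-up row for illustration only; not real CHV data.
sample = io.StringIO(
    "C0000000\texample term\tExample Concept\texample term\t\\N\tyes\tno\tno"
    "\t0.9\t0.8\t-1\t0.85\t0.8\tS1\tC1\n"
)
atoms = list(read_chv(sample))
print(atoms[0]["TTY"], atoms[0]["attributes"])
```

For a real file, open it with `newline=""` and pass the handle to `read_chv`; if the file turns out to have a header row, skip it before mapping.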
