Unified Medical Language System® (UMLS®)
2012AA Consumer Health Vocabulary Source Information
VSAB: CHV2011_02
Summary of Changes
No changes were made to the release format of CHV or the
processing for the 2012AA Metathesaurus. Updates to the
CHV content primarily consisted of deletion and/or correction
of misspelled terms.
In the previous version, CHV term values with potential
spelling errors were identified by setting
MRCONSO.SUPPRESS="E". Twenty atoms remain with
SUPPRESS="E"; these will be reviewed in future versions
to determine whether they should be changed to "N".
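The remaining SUPPRESS="E" atoms can be located by scanning MRCONSO.RRF. The following is a minimal sketch, not part of the source documentation: it assumes the standard pipe-delimited MRCONSO.RRF column order and assumes the CHV atoms carry the root source abbreviation `CHV` in the SAB field. The function names are illustrative.

```python
# Column order assumed to follow the standard MRCONSO.RRF layout.
MRCONSO_FIELDS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def parse_mrconso_line(line):
    """Split one MRCONSO.RRF line into a dict keyed by column name."""
    values = line.rstrip("\n").split("|")
    # Each RRF row ends with a trailing "|", which produces one extra
    # empty item; zip() stops at the shorter list and drops it.
    return dict(zip(MRCONSO_FIELDS, values))


def count_suppress_e(lines, sab="CHV"):
    """Count atoms from the given source whose SUPPRESS flag is "E"."""
    count = 0
    for line in lines:
        row = parse_mrconso_line(line)
        if row.get("SAB") == sab and row.get("SUPPRESS") == "E":
            count += 1
    return count
```

Passing an open file handle for MRCONSO.RRF as `lines` works because the function only iterates over it line by line.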
Source Provided Files:
Documentation:
| File | Description |
|---|---|
| http://www.consumerhealthvocab.org/ | source website |
| ReadMe.doc | README file |
Data:
| File | Description |
|---|---|
| CHV_concepts_terms_flatfile_20110204.tsv | Tab-separated data file |
Identifiers:
Identifiers are assigned as follows:
- CODE: CHV_concept_id
- SAUI: CHV_string_id
- SCUI: CHV_concept_id
- SDUI: Not applicable
Atoms (MRCONSO):
| Term Type | Origin |
|---|---|
| PT | CODE: CHV_concept_id; TTY=PT is assigned where CHV_preferred_name="yes" |
| SY | CODE: CHV_concept_id; STR: Term; SAUI: CHV_string_id; SCUI: CHV_concept_id; TTY=SY is assigned where CHV_preferred_name="no" |
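The identifier and term type assignments above can be sketched as a small mapping function. This is an illustration under stated assumptions, not the actual Metathesaurus processing: it assumes each row of the tab-separated file has already been read into a dict whose keys match the field names used above (`CHV_concept_id`, `CHV_string_id`, `Term`, `CHV_preferred_name`); the real column headers in the flat file may differ. The function name `tsv_row_to_atom` is hypothetical.

```python
def tsv_row_to_atom(row):
    """Map one CHV flat-file row (a dict) to MRCONSO-style atom fields."""
    flag = row["CHV_preferred_name"].strip().lower()
    if flag == "yes":
        tty = "PT"  # preferred term
    elif flag == "no":
        tty = "SY"  # synonym
    else:
        raise ValueError(f"unexpected CHV_preferred_name: {flag!r}")

    concept_id = row["CHV_concept_id"]
    return {
        "CODE": concept_id,            # CODE and SCUI share the concept id
        "SCUI": concept_id,
        "SAUI": row["CHV_string_id"],
        "SDUI": None,                  # not applicable for CHV
        "STR": row["Term"],
        "TTY": tty,
    }
```

Rows could be supplied by `csv.DictReader(f, delimiter="\t")` if the file carries a header line with these names, which is itself an assumption.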
Attributes (MRSAT):
Note: All score attributes have a range 0 to 1 (a higher score implies the term is easier). A value of -1 indicates the score could not be estimated.
| Attribute Name | Description | Origin |
|---|---|---|
| COMBO_SCORE | Combination of frequency, context and CUI scores. Also uses whether or not the term is a top word. (real number) | Combo_score |
| COMBO_SCORE_NO_TOP_WORDS | A slight modification to Combo_score that ignores the top word criterion. The top word list is a list of easy words from the Dale-Chall list. (real number) | Combo_score_no_top_words |
| CONTEXT_SCORE | Context-based estimate of the difficulty of the term. (real number) | Context_score |
| CUI_SCORE | Estimate of the difficulty of the concept (CUI) derived from determining how closely related the concept is to known examples of easy and difficult concepts. (real number) | CUI_score |
| DISPARAGED | A value of "yes" in the CHV data indicates a misspelling or other abnormality. For this version, disparaged terms were not processed, so all cases of ATN="DISPARAGED" have ATV="no". | Disparaged field (yes/no flag) |
| FREQUENCY | Estimate of the difficulty of a term, i.e. how likely it is that an average reader will be familiar with or understand a given term. Based on the frequency in several large text corpora. A higher score indicates that a term is more familiar (less difficult). (real number) | Frequency_score |
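
When reading the score attributes, the -1 sentinel needs to be separated from real scores before any averaging or comparison. A minimal sketch, assuming the attribute value (ATV) arrives as a string as it would from MRSAT; the helper name `parse_score` is hypothetical:

```python
def parse_score(atv):
    """Convert a score ATV string to a float in [0, 1].

    Returns None when the value is -1, meaning the score could not
    be estimated. Raises ValueError for anything outside the range.
    """
    value = float(atv)
    if value == -1:
        return None
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"score out of range: {atv!r}")
    return value
```

Returning `None` instead of -1 keeps an unestimated score from being mistaken for a very difficult term when scores are sorted or averaged.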
