
CHV (Consumer Health Vocabulary) - Metathesaurus Representation


VSAB:   CHV2011_02


Summary of Changes

No changes were made to the release format of CHV or the processing for the 2012AA Metathesaurus.  Updates to the CHV content primarily consisted of deletion and/or correction of misspelled terms. 

In the previous version, CHV term values with potential spelling errors were identified by setting MRCONSO.SUPPRESS="E". There are 20 atoms remaining with SUPPRESS="E"; these will be reviewed in future versions to determine whether they should be changed to "N".
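
The remaining flagged atoms could be listed by scanning MRCONSO.RRF for CHV rows with SUPPRESS="E". The following is a minimal sketch, not the processing used by NLM. It assumes the standard pipe-delimited, 18-column MRCONSO.RRF layout and that CHV atoms carry SAB="CHV"; the function name and the sample rows (including every identifier in them) are hypothetical.

```python
# Zero-based field positions in a pipe-delimited MRCONSO.RRF row
# (assumption: the standard 18-column layout).
SAB, STR, SUPPRESS = 11, 14, 16


def chv_atoms_flagged_e(lines):
    """Yield the STR of each CHV atom whose SUPPRESS value is "E"."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= SUPPRESS:
            continue  # skip malformed or truncated rows
        if fields[SAB] == "CHV" and fields[SUPPRESS] == "E":
            yield fields[STR]


# Hypothetical rows for illustration only; all identifiers are made up.
sample = [
    "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001|101|11||CHV|PT|11|headache|0|N||",
    "C0000001|ENG|S|L0000002|PF|S0000002|N|A0000002|102|11||CHV|SY|11|headake|0|E||",
    "C0000002|ENG|P|L0000003|PF|S0000003|Y|A0000003||X1||MSH|MH|X1|Example|0|E||",
]
flagged = list(chv_atoms_flagged_e(sample))
```

With a real release, the same function could be fed an open file handle for MRCONSO.RRF in place of the sample list.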


Source Provided Files:

Documentation:
File                                        Description
http://www.consumerhealthvocab.org/         Source website
ReadMe.doc                                  README file

Data:
File                                        Description
CHV_concepts_terms_flatfile_20110204.tsv    Tab-separated data file

Identifiers:

Identifiers are assigned as follows:
  • CODE: CHV_concept_id
  • SAUI:  CHV_string_id
  • SCUI:  CHV_concept_id
  • SDUI:  Not applicable

Atoms (MRCONSO):

Term Type: PT
Origin:
  • CODE: CHV_concept_id
  • STR: Term
  • SAUI: CHV_string_id
  • SCUI: CHV_concept_id
TTY=PT is assigned where CHV_preferred_name="yes"

Term Type: SY
Origin:
  • CODE: CHV_concept_id
  • STR: Term
  • SAUI: CHV_string_id
  • SCUI: CHV_concept_id
TTY=SY is assigned where CHV_preferred_name="no"
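
The atom mapping above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual Metathesaurus inversion code: the field names come from this document, but the header row, the column order, and the sample values are hypothetical, and the real flat file may be laid out differently.

```python
import csv
import io


def row_to_atom(row):
    """Map one CHV flat-file row (a dict) to the MRCONSO fields listed above."""
    preferred = row["CHV_preferred_name"].strip().lower()
    if preferred == "yes":
        tty = "PT"
    elif preferred == "no":
        tty = "SY"
    else:
        raise ValueError(f"unexpected CHV_preferred_name: {preferred!r}")
    return {
        "CODE": row["CHV_concept_id"],
        "STR": row["Term"],
        "SAUI": row["CHV_string_id"],
        "SCUI": row["CHV_concept_id"],  # same source field as CODE
        "TTY": tty,
    }


# Hypothetical sample with a header line; the real file may have no header
# and a different column order.
sample_tsv = (
    "CHV_concept_id\tCHV_string_id\tTerm\tCHV_preferred_name\n"
    "11\t101\theadache\tyes\n"
    "11\t102\thead pain\tno\n"
)
reader = csv.DictReader(io.StringIO(sample_tsv), delimiter="\t")
atoms = [row_to_atom(r) for r in reader]
```

Raising on an unexpected CHV_preferred_name value is a design choice of this sketch, so that rows that are neither "yes" nor "no" are not silently given a term type.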

Attributes (MRSAT):

Note: All score attributes have a range 0 to 1 (a higher score implies the term is easier). A value of -1 indicates the score could not be estimated.

Attribute Name: COMBO_SCORE
Description: Combination of frequency, context and CUI scores. Also uses whether or not the term is a top word. (real number)
Origin: Combo_score

Attribute Name: COMBO_SCORE_NO_TOP_WORDS
Description: A slight modification to Combo_score that ignores the top word criterion. The top word list is a list of easy words from the Dale-Chall list. (real number)
Origin: Combo_score_no_top_words

Attribute Name: CONTEXT_SCORE
Description: Context-based estimate of the difficulty of the term. (real number)
Origin: Context_score

Attribute Name: CUI_SCORE
Description: Estimate of the difficulty of the concept (CUI), derived from determining how closely related the concept is to known examples of easy and difficult concepts. (real number)
Origin: CUI_score

Attribute Name: DISPARAGED
Description: A value of "yes" in the CHV data indicates a misspelling or other abnormality. For this version, disparaged terms were not processed, so all cases of ATN="DISPARAGED" have ATV="no".
Origin: Disparaged field (yes/no flag)

Attribute Name: FREQUENCY
Description: Estimate of the difficulty of a term, i.e., how likely it is that an average reader will be familiar with or understand a given term. Based on the frequency in several large text corpora. A higher score indicates that a term is more familiar (less difficult). (real number)
Origin: Frequency_score
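
A consumer of these attributes has to treat -1 as "not estimated" rather than as a very low score. The sketch below shows one way to do that, following the note on score ranges above. The function names and the sample values are hypothetical and are not taken from the CHV data.

```python
def parse_score(atv):
    """Return a float in [0, 1], or None when the score was not estimated (-1)."""
    value = float(atv)
    if value == -1:
        return None
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"score outside the documented range: {atv!r}")
    return value


def easiest_term(scores):
    """Return the term with the highest known score, or None if none is known."""
    known = {term: s for term, s in scores.items() if s is not None}
    return max(known, key=known.get) if known else None


# Made-up attribute values for illustration only.
scores = {
    "heart attack": parse_score("0.82"),
    "myocardial infarction": parse_score("0.31"),
    "mi": parse_score("-1"),  # score could not be estimated
}
best = easiest_term(scores)
```

Mapping -1 to None keeps unestimated terms out of any comparison, so they cannot be mistaken for the most difficult term.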

Definitions (MRDEF):

Definitions are created from data in the "Explanation" field.


Relationships (MRREL):


No relationships are processed for this version of CHV.

Mappings (MRMAP):

No relationships to other sources are processed for this version of CHV; however, UMLS CUI data was used to assess possible synonymy.