
ASCII MeSH, 2013. Documentation and Availability

1. ASCII MeSH

ASCII MeSH contains all data that were present in ELHILL MeSH,* including cross-references and scope notes. Descriptions of the data elements in each file are available for each of the three record types: Descriptors, Qualifiers, and Supplementary Concept Records.

See MeSH ELHILL Format Document (PDF format) for an additional description of the data elements, including most of those found in the ASCII MeSH file. Note changes effective with the fourth-quarter 1999 update.

* Note that ELHILL MeSH is no longer in production.

2. Restrictions on use

There is no charge. Use of the ASCII MeSH file data is subject to conditions which are detailed in the Memorandum of Understanding.

3. Availability

The data for Descriptors and Qualifiers are updated annually and users of the data are encouraged to obtain the new year's data. Any changes made mid-year are included in the weekly update of ASCII MeSH files.

Supplementary Concept Records (formerly Supplementary Chemical Records) are updated at NLM on a daily basis. The current file in ASCII MeSH is updated weekly and is coordinated with 2013 MeSH descriptors so that the data elements that require a descriptor MH value, such as the HM element, have been updated to match a descriptor in 2013 MeSH. New records will be added to ASCII MeSH periodically.

4. File format

Each MeSH record is indicated by a separate line, preceding the record, consisting of the string

*NEWRECORD

Each element or occurrence is contained on a single line. Each line contains an element name and value, for example,

MH = Appendicitis

The same is true for longer, free-text fields such as the Annotation and Scope Note. (The longest occurrence/line in the files is in the descriptor file and is 1124 characters.)

For data elements that are multiply-occurring, each element occurs on a separate line, for example, the MeSH Tree Number:

MN = E05.200.500.607.790
MN = E05.200.500.620.670.620
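The layout described above can be read with a few lines of code. The following is a minimal Python sketch, not an official NLM parser. It assumes that the element name and value are always separated by " = " exactly as in the examples, and the function name parse_mesh_records is hypothetical. The sample input is synthetic: it reuses the lines quoted above and splits them into two records purely for illustration.

```python
def parse_mesh_records(lines):
    """Yield one dict per record, mapping element name -> list of values.

    Assumption: every data line has the form "NAME = value", as in the
    examples shown in this document.
    """
    record = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.strip() == "*NEWRECORD":
            # The marker line precedes each record, so it closes the previous one.
            if record:
                yield record
            record = {}
            continue
        if record is None or " = " not in line:
            # Skip anything before the first marker and any line without a separator.
            continue
        name, value = line.split(" = ", 1)
        # Multiply-occurring elements (such as MN) accumulate in a list.
        record.setdefault(name.strip(), []).append(value)
    if record:
        yield record


# Synthetic sample built from the lines quoted above; the grouping of
# lines into records is illustrative only.
sample = [
    "*NEWRECORD",
    "MH = Appendicitis",
    "*NEWRECORD",
    "MN = E05.200.500.607.790",
    "MN = E05.200.500.620.670.620",
]
records = list(parse_mesh_records(sample))
```

Storing every element as a list keeps single and multiply-occurring elements uniform, so a caller never has to check whether a value is a string or a list.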

Most data in ASCII MeSH files are in 7-bit ASCII format. However, a small number of non-English entry terms now contain characters with a diacritic mark, for example, "Carbocaïne", a French trade name for the anesthetic Mepivacaine. (Note the small "i" with dieresis, known in French as the tréma.) These characters are encoded in Unicode UTF-8 format and will be correctly displayed by UTF-8 applications. (The 7-bit ASCII data will also display correctly in such applications, since 7-bit ASCII encoding is a subset of UTF-8.) In applications that do not support UTF-8, these characters may appear differently in different displays.

The file extensions are .bin rather than .txt so that Web browsers will prompt to save rather than automatically trying to display the relatively large text files (up to 66MB). The lines will usually be transmitted with only a line feed character (decimal 10) and not also a carriage return (decimal 13). Please contact the MeSH Section at the address below if you have questions.
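The encoding and line-ending notes above translate into two choices when opening a file. The following Python sketch is one possible approach, not a prescribed method. The function name read_mesh_lines is hypothetical, and the two-line demonstration file is synthetic: the element name ENTRY is used only as a placeholder for an entry-term line.

```python
import os
import tempfile


def read_mesh_lines(path):
    """Return the file's lines decoded as UTF-8, without line terminators."""
    # encoding="utf-8" decodes both the 7-bit ASCII data and the accented terms.
    # newline="\n" makes only the line feed end a line; any carriage return
    # that is present is left in the text and stripped explicitly below.
    with open(path, encoding="utf-8", newline="\n") as handle:
        return [line.rstrip("\r\n") for line in handle]


# Synthetic demonstration file (placeholder element name, not real MeSH data).
payload = "MH = Mepivacaine\nENTRY = Carboca\u00efne\n".encode("utf-8")
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name
try:
    demo_lines = read_mesh_lines(tmp_path)
finally:
    os.remove(tmp_path)
```

The "i" with dieresis occupies two bytes in UTF-8, so byte counts and character counts differ for lines that contain such terms.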

5. Contents of files - 2013 MeSH

Files updated weekly. Counts as of September 4, 2012.

Record Type                        Total Records   Total Terms   File Size   Bytes
Descriptors                        26,853          213,815       28MB        29,470,831
Qualifiers                         83              ***           97KB        98,760
Supplementary Concept Records (1)  209,420         519,687       80MB        84,232,205

(1) Formerly Supplementary Chemical Records.

6. Contact

For questions concerning distribution, format, etc., contact:

Jacque-Lynne Schulman
Medical Subject Headings
voice: 301-496-1495; FAX: 301-402-2002
email: schulman@nlm.nih.gov