United States National Library of Medicine National Institutes of Health

Overview of Annual Baseline Distribution of MEDLINE/PubMed Data

A baseline MEDLINE/PubMed database is available each December, after NLM has maintained the MEDLINE records to reflect the new year's Medical Subject Headings (MeSH®) vocabulary and made other necessary global changes. The baseline MEDLINE database is distributed to licensees in XML format, in several hundred files, via FTP.

The baseline files include MEDLINE records as well as completed, quality-reviewed non-MEDLINE records found in PubMed (MedlineCitation Status = PubMed-not-MEDLINE and MedlineCitation Status = OLDMEDLINE). After the baseline files are generated, update files containing new, maintained, and deleted records are distributed via FTP each Tuesday through Saturday. The update files also include records with MedlineCitation Status = In-Data-Review and MedlineCitation Status = In-process. Distribution pauses during November and December while NLM makes the transition to the new year's MeSH® vocabulary used to index the articles.

The baseline and update files do not include PubMed records with the XML MedlineCitation Status = Publisher, which carry the display notation [PubMed - as supplied by publisher] or [PubMed - author manuscript in PMC]. These "publisher-supplied" records, which make up less than 2% of PubMed content, do not reside in NLM's Data Creation and Maintenance System (DCMS), from which the exports are made, and are therefore not distributed. See section '1f' of the MEDLINE/PubMed XML Element Descriptions for more information on Publisher status records.
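
As an illustration of the status rules described above, the following is a minimal Python sketch that groups records by the Status attribute of their MedlineCitation element. It is not an NLM tool. The wrapper element name `Records`, the sample data, the category labels, and the function names are hypothetical, and the attribute value `MEDLINE` for indexed MEDLINE records is an assumption not spelled out on this page. Status values are compared case-insensitively because the exact capitalization in the distributed files is not confirmed here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Status groupings as read from the text above (labels are hypothetical).
BASELINE_STATUSES = {"medline", "pubmed-not-medline", "oldmedline"}
UPDATE_ONLY_STATUSES = {"in-data-review", "in-process"}
NOT_DISTRIBUTED_STATUSES = {"publisher"}


def classify_status(status):
    """Map a MedlineCitation Status value to a distribution category."""
    key = status.strip().lower()
    if key in BASELINE_STATUSES:
        return "baseline"
    if key in UPDATE_ONLY_STATUSES:
        return "update-only"
    if key in NOT_DISTRIBUTED_STATUSES:
        return "not-distributed"
    return "unknown"


def count_by_category(xml_text):
    """Count MedlineCitation elements in xml_text per distribution category."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for citation in root.iter("MedlineCitation"):
        counts[classify_status(citation.get("Status", ""))] += 1
    return counts


# Hypothetical sample; real files contain full citation records.
SAMPLE = """<Records>
  <MedlineCitation Status="MEDLINE"/>
  <MedlineCitation Status="PubMed-not-MEDLINE"/>
  <MedlineCitation Status="OLDMEDLINE"/>
  <MedlineCitation Status="In-Data-Review"/>
  <MedlineCitation Status="In-process"/>
  <MedlineCitation Status="Publisher"/>
</Records>"""

if __name__ == "__main__":
    for category, n in sorted(count_by_category(SAMPLE).items()):
        print(category, n)
```

A licensee could run a check of this kind over a downloaded file to confirm that no Publisher status records appear in it. For files of realistic size, a streaming parser such as `xml.etree.ElementTree.iterparse` would likely be a better fit than loading the whole document at once.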


Last updated: 03 December 2007
First published: 17 October 2006
Permanence level: Permanence Not Guaranteed