NOTE: The 2010 MEDLINE/PubMed baseline files, which completely replace the data exported in the 2009 baseline and update files, are expected to be available on December 14, 2009, or shortly thereafter.
The 2009 MEDLINE/PubMed baseline files completely replace all MEDLINE/PubMed records previously distributed to licensees during the NLM 2008 production year. See Overview of Annual Baseline Distribution of MEDLINE/PubMed Data. Subsequent update files are applied to the baseline files.
Access instructions are for NLM MEDLINE/PubMed licensees only; do not share directory or file names with others. Be sure to use the IP address you registered with NLM. All other IP addresses will be blocked from retrieving the files.
MEDLINE/PubMed Baseline Files
MEDLINE/PubMed Update Files
MEDLINE/PUBMED BASELINE FILES
The MEDLINE/PubMed baseline files reside on NLM's anonymous FTP server at ftp://ftp.nlm.nih.gov/nlmdata/.medleasebaseline/. Log in as a non-fee/anonymous user, use your e-mail address as the password, and transfer in binary mode. From the /.medleasebaseline subdirectory, select either the gz or zip folder to get to the baseline files, then get the .gz or .zip files. There are 593 files named medline09n0001 through medline09n0593. A corresponding md5 checksum file for the compressed xml files accompanies each data file.
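The checksum files can be used to confirm that each download arrived intact. The following is a minimal sketch, not an NLM-provided tool. The helper names (`baseline_names`, `md5_hex`, `checksum_matches`) are hypothetical, and the layout of the md5 file is an assumption: the sketch only requires that the 32-character hex digest appear somewhere in the checksum file's text.

```python
import hashlib
from pathlib import Path


def baseline_names(first=1, last=593, prefix="medline09n"):
    """Return the base names medline09n0001 through medline09n0593."""
    return [f"{prefix}{number:04d}" for number in range(first, last + 1)]


def md5_hex(path, chunk_size=1 << 20):
    """Compute a file's MD5 hex digest, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(data_path, md5_path):
    """Return True if the digest of data_path appears in the md5 file.

    Assumption: the checksum file holds the hex digest as plain text;
    its exact layout is not described on this page.
    """
    recorded = Path(md5_path).read_text(errors="replace").lower()
    return md5_hex(data_path) in recorded
```

A file that fails this check is best downloaded again before it is loaded.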
The baseline files are available all hours seven days per week.
Documentation of the baseline files’ data is at http://www.nlm.nih.gov/bsd/licensee/2009_stats/baseline_doc.html. A table summarizing the file name, years covered, file size, and record count of each baseline file is at http://www.nlm.nih.gov/bsd/licensee/2009_stats/baseline_med_filecount.html.
NLM’s information page for licensees containing links to the MEDLINE/PubMed DTDs, data element descriptions, and other documentation and announcements is http://www.nlm.nih.gov/bsd/licensee/medpmmenu.html.
MEDLINE/PUBMED BASELINE REPOSITORY (MBR) AND QUERY TOOL
In addition to, or instead of, getting the baseline files from NLM’s FTP server and mounting the data locally, MEDLINE/PubMed licensees may also search the MEDLINE/PubMed baseline files from NLM’s Web-based MBR Query Tool available at http://mbr.nlm.nih.gov/. Static baseline databases 2002 and forward are available for searching by MeSH Headings, Subheadings, MeSH Heading/Subheading combinations, Names of Substances (MeSH Supplementary Concept Records), and PMID. Licensees can limit or filter by Date Created, Date Completed, Date Last Revised, Publication Year, and Status. Citations retrieved are available in both XML Format and PubMed’s MEDLINE ASCII Display Format. Please explore the other resources available from the main MBR site at http://mbr.nlm.nih.gov/. You must use the same IP address registered to get the data from NLM’s FTP server to access the Query Tool.
MEDLINE/PUBMED UPDATE FILES
The MEDLINE/PubMed update files reside on NLM's anonymous FTP server at ftp://ftp.nlm.nih.gov/nlmdata/.medlease/. Log in as a non-fee/anonymous user, use your e-mail address as the password, and transfer in binary mode. From the /.medlease subdirectory, select either the gz or zip folder to get to the update files, then get the .gz or .zip files. A corresponding md5 checksum file for the compressed xml files accompanies each data file.
The update files are available all hours seven days per week throughout the year. A new file is generally scheduled to appear on the server Tuesday - Saturday at noon EST. More than one update file may become available on the same day.
Important: These files must be processed in ascending numeric sequence by file name and applied after the baseline files medline09n0001 through medline09n0593. Exception: See information below about a special file containing PMIDs of records in PubMed and not distributed to licensees.
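One way to enforce the processing order is to sort a directory listing by the four-digit number embedded in each file name. This is a minimal sketch; the function names are hypothetical, and it assumes data files end in `.xml.gz` or `.xml.zip` as in the example further down this page.

```python
import re

# Matches names such as medline09n0594.xml.gz and captures the number.
_NAME = re.compile(r"medline09n(\d{4})")


def file_number(name):
    """Extract the numeric sequence from a medline09nNNNN file name."""
    match = _NAME.search(name)
    if match is None:
        raise ValueError(f"not a medline09n file name: {name!r}")
    return int(match.group(1))


def updates_in_order(names, last_baseline=593):
    """Return update data files sorted by ascending file number.

    Companion files (_stats.html, _notes.txt, checksums) and anything
    numbered at or below the last baseline file are skipped.
    """
    data_files = [
        name for name in names
        if name.endswith((".xml.gz", ".xml.zip"))
        and file_number(name) > last_baseline
    ]
    return sorted(data_files, key=file_number)
```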
Update files may contain new and revised records and identify records to be deleted. Records are in either MEDLINE, OLDMEDLINE, In-Process, In-Data-Review, or PubMed-not-MEDLINE MedlineCitation Status. See http://www.nlm.nih.gov/bsd/licensee/elements_descriptions.html for Status descriptions.
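Applying an update file to a local store keyed by PMID could look roughly like the sketch below. It is illustrative only. The element names `MedlineCitation` and `PMID` and the `Status` attribute follow the terminology used on this page; the name of the element that wraps deleted PMIDs (`DeleteCitation` here) is an assumption that should be checked against the current DTD, as is the choice to apply deletions after the new and revised records in the same file. The function name `apply_update` is hypothetical.

```python
import xml.etree.ElementTree as ET


def apply_update(xml_text, store, delete_tag="DeleteCitation"):
    """Apply one update file's records to a dict keyed by PMID.

    New and revised records overwrite any earlier version; PMIDs
    listed under delete_tag (an assumed element name) are removed.
    """
    root = ET.fromstring(xml_text)

    # New and revised records: keep the latest version of each PMID.
    for citation in root.iter("MedlineCitation"):
        pmid = citation.findtext("PMID")
        if pmid:
            store[pmid.strip()] = citation

    # Records identified for deletion.
    for block in root.iter(delete_tag):
        for pmid in block.iter("PMID"):
            if pmid.text:
                store.pop(pmid.text.strip(), None)

    return store
```

Because each file overwrites earlier versions by PMID, loading files out of sequence can leave a stale record in place, which is why the ascending order matters.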
Large numbers of records involving various types of maintenance may be distributed in subsequent update files throughout the year. See http://www.nlm.nih.gov/bsd/licensee/medline_maintenance.html for information and points to consider when processing maintained records.
Licensees should read the _stats.html file that accompanies each update data file on the server and also look for occasional _notes.txt files that may appear later for additional information about the data distributed in that file, e.g., retracted publications, reasons for large numbers of maintained records, etc.
Example: get medline09n0594.xml.gz (or medline09n0594.xml.zip)
get medline09n0594_stats.html
get medline09n0594_notes.txt
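The same retrieval can be scripted. The sketch below uses Python's standard `ftplib` and is offered under several assumptions: that the folder is named `gz` directly under `/nlmdata/.medlease/`, that the `_stats.html` and `_notes.txt` files sit alongside the data files, and that a missing `_notes.txt` is reported by the server as a permanent error. The function names are hypothetical. The request must still come from the IP address registered with NLM.

```python
from ftplib import FTP, error_perm
from pathlib import Path

HOST = "ftp.nlm.nih.gov"
UPDATE_DIR = "/nlmdata/.medlease/gz"  # assumption: gz folder location


def fetch_update(ftp, base, dest_dir="."):
    """Fetch base.xml.gz and base_stats.html; base_notes.txt if present."""
    wanted = [
        (f"{base}.xml.gz", True),
        (f"{base}_stats.html", True),
        (f"{base}_notes.txt", False),  # appears only occasionally
    ]
    fetched = []
    for name, required in wanted:
        target = Path(dest_dir) / name
        try:
            with open(target, "wb") as out:
                # retrbinary transfers in binary mode, as required.
                ftp.retrbinary(f"RETR {name}", out.write)
        except error_perm:
            target.unlink(missing_ok=True)  # drop the empty local file
            if required:
                raise
            continue
        fetched.append(name)
    return fetched


def download(base, email, dest_dir="."):
    """Log in anonymously with an e-mail password and fetch one update."""
    with FTP(HOST) as ftp:
        ftp.login(user="anonymous", passwd=email)
        ftp.cwd(UPDATE_DIR)
        return fetch_update(ftp, base, dest_dir)
```

For example, `download("medline09n0594", "you@example.org")` would attempt the three transfers shown above; the e-mail address is a placeholder.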
NLM’s information page for licensees at http://www.nlm.nih.gov/bsd/licensee/ contains links to MEDLINE/PubMed documentation including the DTDs, data element descriptions, MEDLINE/PubMed update chart, maintenance overview, and other items and announcements. The MEDLINE/PubMed update chart at http://www.nlm.nih.gov/bsd/licensee/table_rev.html, generally updated weekly, tracks data found in the _stats files.
SPECIAL FILE CONTAINING PMIDS OF RECORDS IN PUBMED NOT DISTRIBUTED TO LICENSEES
**NOTE: This file may not be available until Wednesday Dec. 17, 2008**
A text file is available containing the PMIDs of records in MedlineCitation Status = In-Process and MedlineCitation Status = In-Data-Review that were retained in the 2009 version of PubMed when the 2009 baseline files were loaded and that are not exported to licensees in the first batch of update files. These records should eventually be exported in update files, either as completed records in MedlineCitation Status = MEDLINE or MedlineCitation Status = PubMed-not-MEDLINE, or as deleted PMIDs in DeleteCitationSet. Licensees who wish to create a database as close as possible to the current record content in PubMed may wish to include these records now.
The file, named SpecialPubMedPMIDList_2009.txt, resides in the update file directory. Licensees may use the Entrez Utilities (http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html) to download the records using the list of PMIDs.
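As a rough illustration of the Entrez Utilities route, the sketch below turns a PMID list into batched EFetch request URLs. It assumes the list file holds one PMID per line, which this page does not state; the batch size of 200 is an arbitrary choice; and the function names are hypothetical. The sketch only builds URLs and leaves the actual requests, and compliance with NCBI's usage guidelines, to the caller.

```python
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_pmids(lines):
    """Keep lines that are purely numeric (assumed one PMID per line)."""
    return [line.strip() for line in lines if line.strip().isdigit()]


def batches(items, size=200):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def efetch_url(pmids):
    """Build an EFetch URL requesting the given PMIDs as PubMed XML."""
    query = urlencode({"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"})
    return f"{EFETCH}?{query}"
```

Usage might look like this, with the local file path being an assumption:

```python
with open("SpecialPubMedPMIDList_2009.txt") as handle:
    pmids = read_pmids(handle)
urls = [efetch_url(batch) for batch in batches(pmids)]
```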
*Important*: If you elect to add these records to your version of MEDLINE/PubMed, they must be added either 1) immediately after the baseline files and before any update files or, 2) after update files medline09n0594 through medline09n0627 to ensure retaining the most current version of those records as subsequent update files are loaded. DO NOT add the records identified in SpecialPubMedPMIDList_2009.txt after you have processed medline09n0627 as this may result in retention of an earlier and inaccurate version of the records.
CONTACT: Jane L. Rosov, NLMdatadistrib@nlm.nih.gov, 301-496-7706
Last reviewed: 28 October 2009
Last updated: 28 October 2009
First published: 24 October 2008
Permanence level: Permanence Not Guaranteed