Announcements to NLM Data Licensees: Year 2014
(02/19/14) Now Using md5sum Checksum File
February 19, 2014
Each baseline and update data file has a corresponding checksum file that licensees may use as confirmation that the complete data file has been downloaded. Until now, we used md5 checksum files. Starting today, February 19, 2014, we are using md5sum checksum files. Licensees using md5 to generate checksum files to compare with the checksum file downloaded from the server should ignore the comments that accompany the md5 validation string.
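A minimal sketch of how a licensee might check a download against its checksum file. It assumes the checksum file holds a single line in one of the two common layouts: the md5sum style (`<digest>  <filename>`) or the md5 style (`MD5 (<filename>) = <digest>`), where the text around the digest is the "comment" to ignore. The function names and file layout are illustrative assumptions, not part of NLM's instructions.

```python
import hashlib
import re

# Assumed layouts: md5 prints "MD5 (name) = digest"; md5sum prints "digest  name".
MD5_STYLE = re.compile(r"MD5 \(.+\) = ([0-9a-fA-F]{32})$")
MD5SUM_STYLE = re.compile(r"([0-9a-fA-F]{32})(\s|$)")


def parse_checksum_line(line):
    """Return the lowercase hex digest from either checksum layout."""
    line = line.strip()
    match = MD5_STYLE.match(line) or MD5SUM_STYLE.match(line)
    if match is None:
        raise ValueError(f"unrecognized checksum line: {line!r}")
    return match.group(1).lower()


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_is_complete(data_path, checksum_path):
    """Compare the local file's digest with the first line of the checksum file."""
    with open(checksum_path) as handle:
        expected = parse_checksum_line(handle.readline())
    return file_md5(data_path) == expected
```

Comparing only the extracted digest means the same check works whether the checksum line was produced by md5 or by md5sum.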
January 6, 2014
The updated 2014 MARC base files for Catfile, CatfilePlus, and Serfile are available on the NLM FTP server. These base files are each complete in a single file. Loading the base files on an annual basis is optional for MARC subscribers. If you have correctly loaded each of the update files, there is no need to reload the base files.
The MARC file containing all the bibliographic records deleted by NLM between January 1, 2013 and December 31, 2013 will be available no later than February 1, 2014. Licensees who are new recipients of NLM MARC bibliographic records in 2014, as well as ongoing licensees who are discarding their pre-2014 records and reloading with the 2014 base files, do NOT need the delete file. The records in this file were removed from the NLM database prior to the pull of the 2014 base files.
If you load the new base files, you should then load the 2014 update files dated after the date of the base files.
December 17, 2013
- AVAILABILITY OF 2014 MEDLINE/PUBMED BASELINE DATA
The 2014 MEDLINE/PubMed baseline files, which replace all previously distributed MEDLINE/PubMed data, are now available for FTP. Licensees have been e-mailed the location of the FTP access instructions with additional information.
- 2014 UPDATE FILES
The first group of 2014 update files and the special PMID list text file (see ADDITIONAL PMID LIST FILE below) are also available. Please be sure to read the _notes.txt file that is on the server accompanying the first update file, medline14n0747. Update files should be processed after the baseline files in ascending numeric sequence by file name (see ADDITIONAL PMID LIST FILE below for the exception) to ensure that all new records are added and the most current and accurate version of each record is retained.
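The ascending numeric sequence described above can be sketched as a sort on the number embedded in each file name. This is only an illustration: the `.xml.gz` extension in the usage note and the helper names are assumptions, and only the `medline14n0747` naming pattern comes from this announcement.

```python
import re

# Matches names such as "medline14n0747"; group 2 is the sequence number.
UPDATE_NAME = re.compile(r"^medline(\d{2})n(\d+)")


def update_sequence(filename):
    """Return the numeric sequence embedded in an update file name."""
    match = UPDATE_NAME.match(filename)
    if match is None:
        raise ValueError(f"not an update file name: {filename!r}")
    return int(match.group(2))


def processing_order(filenames):
    """Sort update files into ascending numeric sequence."""
    return sorted(filenames, key=update_sequence)
```

For example, `processing_order(["medline14n0749.xml.gz", "medline14n0747.xml.gz", "medline14n0748.xml.gz"])` places medline14n0747 first and medline14n0749 last. Sorting on the parsed integer rather than the raw string keeps the order correct even if the number of digits ever changes.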
- ADDITIONAL PMID LIST FILE
A text file is available that contains the PMIDs of records in MedlineCitation Status = In-Process and MedlineCitation Status = In-Data-Review that were retained in the 2014 version of PubMed at the time the 2014 baseline files were loaded and that are not exported to licensees in the first batch of update files. These records will eventually be exported in update files as completed records in MedlineCitation Status = MEDLINE or MedlineCitation Status = PubMed-not-MEDLINE, or as deleted PMIDs in DeleteCitationSet. Licensees who wish to create a database as close as possible to the record content in PubMed on December 16, 2013 will want to include these records now.
The file, named SpecialPubMedPMIDList_2014.txt, resides in the update file directory. Licensees may use the Entrez Utilities to download the records using the list of PMIDs.
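One way to prepare the Entrez Utilities requests is sketched below. It assumes the list file holds one PMID per line, and it only builds EFetch request URLs rather than sending them. The batch size of 200 and the helper names are illustrative assumptions; consult the E-utilities documentation for current usage limits.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_pmids(lines):
    """Keep only lines that look like PMIDs (all digits), stripped of whitespace."""
    return [line.strip() for line in lines if line.strip().isdigit()]


def efetch_urls(pmids, batch_size=200):
    """Build one EFetch URL per batch of PMIDs, requesting PubMed XML."""
    urls = []
    for start in range(0, len(pmids), batch_size):
        batch = pmids[start:start + batch_size]
        query = urlencode({"db": "pubmed", "id": ",".join(batch), "retmode": "xml"})
        urls.append(f"{EFETCH_URL}?{query}")
    return urls
```

A caller could pass an open handle on SpecialPubMedPMIDList_2014.txt to `read_pmids` and then fetch each URL in turn, pausing between requests.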
*IMPORTANT*: If you elect to add these records to your version of MEDLINE/PubMed, they must be added to your 2014 MEDLINE/PubMed database either 1) immediately after the baseline files and before any update files, or 2) immediately after update files medline14n0747 through medline14n0766. This ensures that the most current version of those records is retained as subsequent update files are loaded. Do not add the records identified in SpecialPubMedPMIDList_2014.txt after you have processed medline14n0767, as this may result in retention of an earlier and inaccurate version of the records.
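The timing rule above can be expressed as a small guard in a loading script. This is a sketch under the assumption that the script tracks the sequence numbers of the update files it has already processed; the function and constant names are hypothetical.

```python
FIRST_UPDATE = 747      # medline14n0747, the first 2014 update file
LAST_SAFE_UPDATE = 766  # medline14n0766, the last file before the cutoff


def may_add_special_records(processed_updates):
    """Return True only at the two points where adding the records is allowed.

    Allowed: right after the baseline (no update files processed yet), or
    right after exactly medline14n0747 through medline14n0766.
    """
    done = set(processed_updates)
    if not done:
        return True
    return done == set(range(FIRST_UPDATE, LAST_SAFE_UPDATE + 1))
```

The check is deliberately strict: any other state, including a partially processed first batch or one that already includes medline14n0767, returns False.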
- 2013 MEDLINE/PUBMED FILES TO MOVE TO NEW DIRECTORY
The 2013 update files will be moved to a new directory, where they will remain for several weeks. Contact NLM at firstname.lastname@example.org if you need access to those files.
Documentation for the MEDLINE/PubMed baseline database is available from links in the Data Availability and Maintenance section of NLM’s information page for MEDLINE/PubMed licensees. Also see the MEDLINE/PubMed Maintenance Overview for information about processing update files and points to consider when doing so. Announcements during the year will be added as they become available.
- MEDLINE/PUBMED BASELINE REPOSITORY (MBR)
The 2014 baseline data will be included at a later date in the MEDLINE/PubMed Baseline Repository (MBR) resources. If you wish to search the baseline data via the MBR Query Tool, be sure to use the same IP address registered with NLM for access to MEDLINE/PubMed from NLM’s FTP server.
CatfilePlus and Serfile XML Baseline Files
The updated XML base files for CatfilePlus and Serfile are available on the NLM FTP server. CatfilePlus is in 4 parts, named "catplusbase1of4.2014.xml", "catplusbase2of4.2014.xml", etc. The Serfile base file is complete in a single file, named "serfilebase.2014.xml". The baseline files contain all records through November 25, 2013 and should be used to completely replace all records previously distributed. The first XML update files for CatfilePlus and Serfile were made available November 25, 2013.