United States National Library of Medicine National Institutes of Health

Announcements to NLM Data Licensees: Year 2007

(12/03/07) Availability of 2008 MEDLINE/PubMed Baseline Data
(11/29/07) MEDLINE/PubMed Update File Schedule Change; Continuing to Lease for 2008
(11/09/07) Continuing to Lease for 2008 and MEDLINE/PubMed Update Files Schedule [revised 11/13/07]
(10/24/07) Continuing Your License, End-of-Year Schedule and News
(08/23/07) New DTDs, Forthcoming Baseline, Revised ChemIDPlus Data, Request for Usage Statistics, and Revised Licensee Web Page [revised 10/22/07, 2/29/08]
(08/14/07) Elimination of USP Data from Leased ChemIDPlus
(07/30/07) Funding Agencies in MEDLINE/PubMed Records
(01/24/07) Change to HSDB XML
(01/19/07) Deleted NLM MARC 21 Format Records

2006 Announcements


Availability of 2008 MEDLINE/PubMed Baseline Data

December 3, 2007 [revised 12/5/07]

1. AVAILABILITY OF 2008 MEDLINE/PUBMED BASELINE DATA
I am pleased to inform you that the 2008 MEDLINE/PubMed baseline files, which replace all previously distributed MEDLINE/PubMed data, are now available for FTP. FTP access instructions with additional information are attached.

2. 2008 UPDATE FILES
The first group of 2008 update files and the special PMID list text file (see item 3 below) will become available later this week, expected (although subject to change) on Thursday December 6.  You may check the directory that contains update files on that date.  Please be sure to read the _notes.txt file that will be on the server accompanying the first update file which will be named medline08n0564. Update files should be processed after the baseline files (and after the records in the optional PMID list file discussed in item 3 below) in ascending file name numeric sequence to ensure that all new records are added and the most current version of each record is retained. FTP access instructions with additional information are attached.
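As an illustration of the ordering rule above, the following minimal Python sketch sorts update file names by their embedded sequence number. The name pattern is inferred from the file names medline07n0831 and medline08n0564 mentioned in these announcements; the .xml.gz extension and the function names are assumptions for illustration and are not part of NLM's access instructions.

```python
import re

# Assumed pattern: "medline" + 2-digit year + "n" + 4-digit sequence,
# optionally followed by an extension such as ".xml.gz".
UPDATE_NAME = re.compile(r"^medline(\d{2})n(\d{4})")


def sequence_number(filename):
    """Return the numeric sequence embedded in an update file name."""
    match = UPDATE_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename}")
    return int(match.group(2))


def in_processing_order(filenames):
    """Sort update files in ascending numeric sequence."""
    return sorted(filenames, key=sequence_number)


files = ["medline08n0566.xml.gz", "medline08n0564.xml.gz", "medline08n0565.xml.gz"]
for name in in_processing_order(files):
    print(name)
```

Sorting on the parsed number rather than on the raw string keeps the order correct even if the files carry different extensions.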

3. ADDITIONAL PMID LIST FILE
A text file of PMIDs of records in MedlineCitation Status = In-Process and MedlineCitation Status = In-Data-Review that have been retained in the 2008 version of PubMed at the time the 2008 baseline files were loaded and that are not exported to licensees in the first batch of update files is available. The filename is SpecialPubMedPMIDList_2008.txt and it will reside in both the .gz and .zip directories. There are a relatively small number of PMIDs in the file, but licensees who wish to create a database as close as possible to the record content in PubMed will want to include them. These records should eventually be exported as completed records in MedlineCitation Status = MEDLINE or MedlineCitation Status = PubMed-not-MEDLINE after NLM completes the work on them.

Licensees may use the Entrez Utilities to download the records using the list of PMIDs. IMPORTANT:  If you elect to add these records to your version of MEDLINE/PubMed, follow the steps provided in the update file access instructions. [edited 12/5/07]
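One possible way to prepare Entrez Utilities requests from the PMID list is sketched below in Python. It only builds EFetch URLs and does not replace the steps in the update file access instructions. The one-PMID-per-line layout of the text file, the batch size, the function names, and the endpoint URL shown are assumptions; check the current E-utilities documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed EFetch endpoint; verify against current E-utilities documentation.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_pmids(lines):
    """Collect PMIDs from text lines, assuming one PMID per line."""
    return [line.strip() for line in lines if line.strip().isdigit()]


def efetch_urls(pmids, batch_size=200):
    """Yield one EFetch URL per batch of PMIDs (batch size is arbitrary)."""
    for start in range(0, len(pmids), batch_size):
        batch = pmids[start:start + batch_size]
        query = urlencode({"db": "pubmed", "id": ",".join(batch), "retmode": "xml"})
        yield f"{EFETCH}?{query}"


# Hypothetical PMIDs standing in for the contents of the list file.
sample_lines = ["11111111\n", "\n", "22222222\n", "33333333\n"]
for url in efetch_urls(read_pmids(sample_lines), batch_size=2):
    print(url)
```

Batching keeps each request short; the retrieved XML would then be loaded according to the access instructions.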

4. 2007 MEDLINE/PUBMED FILES TO MOVE TO NEW DIRECTORY
The last 2007 update file, medline07n0831, was placed on the server for licensees December 1, 2007.  When the 2008 update files are placed on the server, the 2007 update files will move to another directory where they will remain for several weeks for licensees who need access to them while working with the 2008 baseline files.

5. DOCUMENTATION
Documentation for the MEDLINE/PubMed baseline database is available from links in the Data Availability and Maintenance section of NLM's information page for MEDLINE/PubMed licensees at http://www.nlm.nih.gov/bsd/licensee/medpmmenu.html.    The direct URLs to those pages are http://www.nlm.nih.gov/bsd/licensee/2008_stats/baseline_doc.html and http://www.nlm.nih.gov/bsd/licensee/2008_stats/baseline_med_filecount.html.  Also see the MEDLINE/PubMed Maintenance Overview at http://www.nlm.nih.gov/bsd/licensee/medline_maintenance.html for information about and points to consider for processing update files.

6. MEDLINE/PUBMED BASELINE REPOSITORY (MBR)
The 2008 baseline data will be included at a later date in the MEDLINE/PubMed Baseline Repository (MBR) resources at http://mbr.nlm.nih.gov/.   If you wish to search the baseline data via the MBR Query Tool, be sure to use the same IP address registered with NLM for access to MEDLINE/PubMed from NLM's FTP server.

Please do not hesitate to contact me with questions as they arise.   I look forward to working with you during 2008 and send best wishes for a peaceful and healthy New Year.


MEDLINE/PubMed Update File Schedule Change; Continuing to Lease for 2008

November 29, 2007

FOR MEDLINE/PUBMED LICENSEES  -  SCHEDULE UPDATE

The last MEDLINE/PubMed update file for the 2007 production year was expected to be today, Thursday November 29.  NLM’s annual end-of-year processing has been modified so that update files will also be available tomorrow and Saturday, November 30 and December 1. Unless there are unforeseen circumstances, the 2008 baseline files are still expected to be available to renewed licensees early next week and the first group of update files for the new production year should be available within several days thereafter (not the same day as the baseline files).  I will notify renewed licensees by e-mail on the day the files become available.

The timeline continues to be subject to change depending on the timing required for the remainder of NLM’s annual end-of-year processing.   

FOR ALL LICENSEES  -  RENEWALS ARE DUE

An e-mail message was sent on November 9, 2007 (followed by a reminder e-mail on November 19) instructing NLM’s database licensees who entered into their license with NLM prior to September 1, 2007 to either confirm interest in continuing to lease MEDLINE/PubMed and/or other NLM databases during our 2008 production year, or to advise that the data will no longer be used and the license should cease.  This renewal reminder is for licensees who have not yet sent in a reply.

If you have not yet returned your renewal information, refer to item 4 of the e-mail posted at http://www.nlm.nih.gov/bsd/licensee/announce/2007.html#d11_09. You will have access to the 2008 versions of the database(s) you lease only after your reply, per the specified instructions, is received and processed.  If you need another copy of the file that was attached to your specific e-mail message, please let me know and I will send it to you again. Licenses of those who do not respond will cease and access to the 2008 data will be denied. 

When a license ceases, the NLM data previously supplied should no longer be used.  In this case, if you have redistributed any data received under this license, or data derived from the NLM-supplied data, you should notify the users of the information services or products based on that database to cease their use because the data are no longer current. You should also take reasonable steps to ensure that users will not continue to access products containing NLM data that have become superseded by updated and/or maintained versions.

Please send your reply per instructions in the e-mail as soon as possible, and let me know if you have questions.


Continuing to Lease for 2008 and MEDLINE/PubMed Update Files Schedule

November 9, 2007

1. For MEDLINE/PubMed and CatfilePlus/Serfile in XML Licensees
2. For ChemIDPlus Subset Licensees
3. For All Licensees: Usage Statistics due in December
4. For All Licensees: Continuing to Lease NLM Databases for 2008
5. Reminders

1. FOR MEDLINE/PUBMED AND CATFILEPLUS/ SERFILE IN XML LICENSEES:

Refer to the announcements dated Aug. 23 and Oct. 24, 2007 available from http://www.nlm.nih.gov/bsd/licensee/announce/2007.html which provide information about the data distribution schedule for the remainder of this production year and details about DTD and data changes for 2008. It is not known when in 2008 ELocationID (DOI) or ISSNLinking data will first be included in exported records.

MEDLINE/PubMed licensees:

a. Last Distribution of Completed Records Nov. 14

Please be reminded that, as is the case each year, NLM suspends distribution of new and revised records in MedlineCitation Status = MEDLINE, MedlineCitation Status = PubMed-not-Medline, and MedlineCitation Status = OLDMEDLINE as preparations are made for the forthcoming production year. The last records in these 3 completed statuses for the current 2007 production year, as well as PMIDs of deleted records, will be distributed to licensees on November 14, 2007.

b. Update File Schedule Change

At this time, NLM expects that update files will not be available Thursday through Saturday November 15 – 17 (this amends the previous announcement that did not include Saturday). Then, after the routine suspension of files for Sunday and Monday, files limited to records in MedlineCitation Status = In-process and MedlineCitation Status = In-Data-Review will resume on Tuesday November 20. (This schedule is subject to change depending on the timing required for NLM’s annual end-of-year processing. If it goes more quickly than expected, it is possible a file will be released for Friday, but this is unlikely.)

The last update file for the 2007 production year will be available on Thursday November 29. Update files will not be available Friday November 30 and Saturday December 1, 2007. Unless there are unforeseen circumstances, the 2008 baseline files and first group of update files should be available to renewed licensees on Monday December 3 or possibly on Tuesday December 4. [added 11/13/07]

c. File of Sample Records Using 2008 DTD

A small file containing 134 representative records processed using the 2008 DTDs is available at http://www.nlm.nih.gov/databases/dtd/medsamp2008.xml. Elements new for 2008 are not yet represented in the data.

2. FOR CHEMIDPLUS SUBSET LICENSEES

As first announced last August (see http://www.nlm.nih.gov/bsd/licensee/announce/2007.html#d08_14), the file put on the server Oct. 31, 2007 and dated Oct 28, 2007 is the first ChemIDplus Subset file that does not contain data provided by U.S. Pharmacopeia (USP).

3. FOR ALL LICENSEES: USAGE STATISTICS DUE IN DECEMBER

Licensees are reminded that reports of leased NLM database usage are due December 6, 2007. This pertains to MEDLINE/PubMed, CCRIS, ChemIDplus Subset, DIRLINE, GENE-TOX, HSDB, and TOXLINE Subset; reports are not needed for Catfile, CatfilePlus, or Serfile. The Database Usage Report Form for Fiscal Year 2007 (Oct. 1, 2006 - Sept. 30, 2007) is available at http://www.nlm.nih.gov/databases/license/userep_2007.html. Licensees using leased NLM data internally for research purposes only do not need to submit a usage report.

4. FOR ALL LICENSEES: CONTINUING TO LEASE NLM DATABASES FOR 2008

It is time to inform NLM of your continued interest in leasing MEDLINE/PubMed and/or other NLM databases for 2008. To have access to the 2008 production year data, a representative from each license must reply to this e-mail and transmit any changes to the information on file at NLM via the e-mail reply. A prompt response per the Instructions below will ensure access to the 2008 data as soon as it is available.

INSTRUCTIONS

*****************

Many licensees have designated a second contact person in addition to the primary responsible party for each license. All contacts are being sent this message; however, NLM requests that only one submit a reply. To help ensure that only one response for each license is sent, a file attachment to be used for the reply is sent only to the primary contact. Primary contacts should forward the file attachment to the secondary contact if that person, instead, is to provide the requested information to NLM.

The e-mail address on file for the primary contact that was sent the attachment for your license is:

The e-mail address on file for the secondary contact for your license is:

Please coordinate so only one reply is sent for each license.

*****************

a. Open the file attached to this message (or obtain it from the primary contact). The file includes your profile which summarizes the information previously provided to NLM. Review all information for currency/accuracy. Your previous input appears after the colon (:) for each item. Responses to items with no input after the colon are an implied No or Not Applicable.

b. View the entire 2008 Intended Use Worksheet for new licensees at http://www.nlm.nih.gov/databases/license/intend.html for the complete text of each item addressed in the profile and to see the possible choices for response.

c. Options:

1). If you will continue using NLM leased data in 2008 and there are no changes in your information for 2008 after review of your profile and the 2008 Intended Use Worksheet:
*Reply to this e-mail (NLMdatadistrib@nlm.nih.gov) and then place an X in this box in the reply [ ].
*Complete items 8 and 9 in the profile.
*Do not place an X in any box in front of any profile item.
*Save and rename the profile information file and attach it to your e-mail reply.

2). If you will continue using NLM leased data in 2008 and there are changes in your information for 2008 after review of your profile and the 2008 Intended Use Worksheet:
*Reply to this e-mail (NLMdatadistrib@nlm.nih.gov) and then check this box in your reply [ ].
*Check the box in front of each profile item that you change.
*Edit the response to each checked item to reflect the current information for 2008. Please be sure to include the most current description of your use of the databases you lease.

*Only check the box in front of an item that you edit. (For example, if you have previously leased only MEDLINE/PubMed and now wish to also lease another database for 2008, check the box in front of item 1, retain MEDLINE/PubMed in the response, and add the new database name to item 1. If your IP address(es) remain the same, do not check the box in front of item 2 and do not edit item 2. If you need to update your use of NLM data, check the box in front of item 4, refer to the 2008 Intended Use Worksheet to see the complete text of the item and its possible responses, and edit item 4 with your new response. Accordingly, be sure to update and check off item 5 as needed.)

*Complete items 8 and 9 in the profile.
*Save and rename the profile file and attach it to your e-mail reply.

3). If you will not use any previously leased NLM database in 2008:
Reply to this e-mail (NLMdatadistrib@nlm.nih.gov) and then check this box in your reply [ ].
Your license will be ceased; you will not have access to 2008 data and should discontinue use of data previously received under the license. Per the standard License Section B.2, "Upon discontinuance of a database, the Licensee must clearly notify the users of the information services or products based on that database to cease their use because the data are no longer current." Per Section D.2.l, you should "Take reasonable steps to ensure that users will not continue to access products containing NLM data that have become superseded by updated and/or maintained versions."

5. REMINDERS

a. Information to assist in your preparation for NLM’s 2008 data year was e-mailed to licensees in October 2007 and is posted in the Announcements section of http://www.nlm.nih.gov/bsd/licensee. An article with links to details about MEDLINE/PubMed year-end processing activities is published at http://www.nlm.nih.gov/pubs/techbull/so07/so07_yep2.html in the Sept-Oct 2007 NLM Technical Bulletin (TB). Additional pertinent articles will be published in the forthcoming Nov-Dec 2007 TB at http://www.nlm.nih.gov/pubs/techbull/tb.html.

b. Regarding profile item 1, all leased databases are distributed in XML format, except Catfile which is distributed only in MARC21 format. CatfilePlus and Serfile are available in both XML and MARC21 formats. If you lease CatfilePlus and/or Serfile, the format you previously selected is shown and may also be changed for 2008.

c. Regarding items 8 and 9, two e-mail correspondents are permitted for each license; one is the primary responsible license contact and the other is optional. Both will receive all administrative/data-related e-mail communications from NLM for further distribution as deemed appropriate by the recipients. E-mails to licensees are subsequently posted as announcements on the licensee information page at http://www.nlm.nih.gov/bsd/licensee.html.

d. Do not submit a new Intended Use Worksheet. NLM will continue your license and distribute its 2008 production year data to you based on your reply to this e-mail which should include your updated profile information.

e. Licensees are reminded that reports of database usage for Fiscal Year 2007 (Oct. 1, 2006 - Sept. 30, 2007) are due December 6, 2007 (not needed for Catfile, CatfilePlus, or Serfile; or from licensees using NLM data internally for research purposes only). The Database Usage Report Form is available at http://www.nlm.nih.gov/databases/license/userep_2007.html.

f. MEDLINE/PubMed Query Tool: In addition to getting XML files containing MEDLINE/PubMed data from NLM’s ftp server, registered MEDLINE/PubMed licensees can also search the baseline data from NLM’s Web-based MEDLINE/PubMed Query Tool found at http://mbr.nlm.nih.gov/. A reply to this e-mail is also needed to continue searching the baseline files using the Query Tool once the 2008 files are available. See item 5 at http://www.nlm.nih.gov/bsd/licensee/announce/2005.html#d11_08 for more information about the MEDLINE/PubMed Baseline Repository site and the Query Tool.


Continuing Your License, End-of-Year Schedule and News

October 24, 2007

Items covered in this message are:
1. License Code Reminder
2. Continuing Your License in 2008
3. For MEDLINE/PubMed Licensees
4. For Catfileplus in XML and Serfile in XML Licensees
5. For MARC 21 Catfile, CatfilePlus, and Serfile Licensees
6. For CCRIS, ChemIDplus Subset, DIRLINE, GENE-TOX, HSDB, and TOXLINE Subset Licensees
7. Usage Statistics
8. Web pages, E-Mail Alerts and NLM Technical Bulletin

1. LICENSE CODE REMINDER
NLM assigns a unique 3-letter code to each license. Your code appears at the top of e-mails sent to you by NLM. Please keep a record of your code and reference it when corresponding to NLM about matters relating to your license or the databases you lease.

2. CONTINUING YOUR LICENSE FOR 2008 DATA
Instructions for continuing to license one or more NLM databases will be sent via e-mail in early November. Only licensees who respond will have access to the 2008 data; licenses of non-responders will cease (and accordingly, access to the 2008 data will be denied). Those whose licenses are ceased should cease use of previously leased MEDLINE/PubMed data. (Note: if you first became a licensee after Aug. 31, 2007 you do not need to renew for 2008; your license will remain in effect).

3. FOR MEDLINE/PUBMED LICENSEES
a. Data and DTD Changes for 2008
DTD changes for 2008 are summarized in the August 23, 2007 announcement.

In August 2007 NLM began to include the country name for UK funding agencies in the <Agency> element in <GrantList>, for example: <Agency>United Kingdom Arthritis Research Campaign</Agency>. Starting with the 2008 baseline export, NLM will include the home country source of US funding agencies as well. For example, the current export of <Agency>NIOSH</Agency> will change to <Agency>United States NIOSH</Agency> in 2008. Due to other maintenance activities occurring this time of year, it is possible that some records reflecting this change may be re-distributed as revised records prior to the 2008 baseline distribution.
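For licensees who index funding agencies, the effect of the country prefix can be illustrated with a short Python sketch. The sample XML is hypothetical (including the Grant wrapper around each Agency), and the prefix list covers only the two countries named in this announcement; neither is an NLM-supplied specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; real records carry additional elements.
SAMPLE = """
<GrantList>
  <Grant><Agency>United Kingdom Arthritis Research Campaign</Agency></Grant>
  <Grant><Agency>United States NIOSH</Agency></Grant>
  <Grant><Agency>NIOSH</Agency></Grant>
</GrantList>
"""

# Only the two prefixes mentioned in this announcement (an assumption).
COUNTRY_PREFIXES = ("United Kingdom", "United States")


def split_agency(text):
    """Split an Agency value into (country or None, agency name)."""
    for prefix in COUNTRY_PREFIXES:
        if text.startswith(prefix + " "):
            return prefix, text[len(prefix) + 1:]
    return None, text


for agency in ET.fromstring(SAMPLE).iter("Agency"):
    print(split_agency(agency.text))
```

Handling both the prefixed and the unprefixed form may be useful while older records without the country name are still present in a local database.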

b. Hiatus in Distribution of Records in MEDLINE, PubMed-Not-Medline, and OLDMEDLINE Statuses (Completed Records); No File Nov. 15 - Nov 17
As is the case each year, NLM will suspend distribution of new and revised records in MedlineCitation Status = MEDLINE, MedlineCitation Status = PubMed-not-Medline, and MedlineCitation Status = OLDMEDLINE as preparations are made for the new production year. Per our current schedule, the last records in these 3 completed statuses, as well as PMIDs of deleted records, for the current 2007 production year will reside in the file on the server on November 14, 2007.

No files will be available Thursday November 15 through Saturday November 17. Distribution of records in only two statuses, MedlineCitation Status = In-process and MedlineCitation Status = In-Data-Review, will resume on Tuesday, November 20 for the remainder of the 2007 production year. If our end-of-year processing goes more quickly than expected, it is possible a file will be released for Friday, but this is unlikely.

Definitions of the various MedlineCitation Status attribute values and other XML element and attribute definitions are at http://www.nlm.nih.gov/bsd/licensee/elements_descriptions.html. Please note that this document is not updated yet to reflect DTD changes for 2008. Remember that approximately 98% of the PubMed content is distributed to NLM's MEDLINE/PubMed licensees; the records in MedlineCitation Status=Publisher found in PubMed are not distributed.

c. Complete New Baseline Reload in December
Over 550 data files containing the 2008 version of all MEDLINE status records and also containing the completed PubMed-not-Medline and OLDMEDLINE status records that reside in PubMed are expected to be available to MEDLINE/PubMed licensees on or about December 3, 2007. Note that this is about a week earlier than in recent previous years. NLM strives to distribute the complete reload as soon as possible after the data have undergone end-of-year maintenance and have been thoroughly tested at our end. The 2008 baseline files contain the most current and accurate versions of the records and replace all records previously distributed to MEDLINE/PubMed licensees during the 2007 production year. Licensees should be prepared to completely re-load their databases.

The 2008 MEDLINE baseline records will have been fully maintained with 2008 MeSH Vocabulary. The records will reflect any other maintenance previously performed during the 2007 production year or resulting from end-of-year processing, and will be generated using the 2008 DTDs. The 2008 baseline database files will be organized by PubDate groups thus facilitating licensees' use of the data based on a range of publication years.

NLM has not offered a 'changed-records-only' set of records in recent years and is not expecting to offer this alternative in the future. Doing this requires that our system identify every record changed as a result of end-of-year processing or maintenance during the production year and annotate the DateRevised element accordingly. There have been, and there will likely continue to be, special programming actions that prohibit this from happening while retaining the required degree of efficiency and timeliness.

d. Update files
Update files for the 2008 production year containing records in all MedlineCitation Statuses will resume around the time the baseline reload files are available. Update files include: new records with PMIDs not used before; records with existing PMIDs that have changed their processing status (indicated in XML element = MedlineCitation Status); records with data changes; and PMIDs of deleted records. Update files sometimes include extremely large numbers of revised records. Occasionally more than one update file is distributed on a given day. The update files are summarized in the MEDLINE/PubMed Update Chart.

Additional File this Year
This year NLM plans to make an additional file available to licensees. It is expected to be a text file of PMIDs of records in MedlineCitation Status = In-Process and MedlineCitation Status = In-Data-Review that are retained in PubMed when the new version is up but are not exported to licensees in the first group of update files. This represents a very small number of records, but licensees who wish to create a database as close as possible to the record content in PubMed will want to include them. These records should eventually be exported as completed records in MedlineCitation Status = MEDLINE or MedlineCitation Status = PubMed-not-MEDLINE after NLM completes the work on them. Licensees may use the Entrez Utilities to download the actual records. Access information for this special file will be provided when the baseline files become available.

Filesize
There has been a delay in eliminating the Filesize data from the ‘stats’ file that accompanies each update file. See http://www.nlm.nih.gov/bsd/licensee/notes/2007_medline.html#medline07n0796 for more information. A ‘notes’ file will be put on the ftp server on the day the Filesize is no longer present.

Points to consider regarding update files:
1. Update files should be applied after the baseline files and processed in ascending numeric order by filename.
2. Records in MEDLINE, PubMed-not-MEDLINE, and OLDMEDLINE statuses are considered completed records and thus contain the DateCompleted element. Completed records that are subsequently revised most often, but not always, contain the DateRevised element.
3. In-Data-Review and In-process status records are not in a completed status, thus do not contain the DateCompleted or DateRevised elements.
4. Licensees should compare the MedlineCitation PMIDs in update files with those in records previously loaded. If there is no match, the record is new. If there is a match, the record is either a completed record that has been revised, or the record has changed its MedlineCitation Status; e.g., been elevated from In-Data-Review status to In-process status or from In-Process status to MEDLINE or PubMed-not-MEDLINE status. Replace records with <DateRevised> only if that date is later than that on your existing record (this will be a concern only if files are processed out of ascending numeric order).
5. DateRevised element is not used to indicate a change in MedlineCitation Status.
6. DeleteCitationSet is created only if there are MedlineCitation PMIDs to delete.
7. A record may contain more than one PMID. The MedlineCitation PMID is the unique number identifying the record. Do not confuse it with the PMID element that resides in the CommentsCorrections group of elements which reference, for example, a citation that is associated with (e.g., corrects or retracts) the record in hand.
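The points above can be sketched as a small merge routine in Python. This is one reading of points 4 through 6, not NLM's procedure: the in-memory dictionary, the record layout (pmid, status, optional date_revised), and the choice to always accept a record whose status has changed are assumptions made for illustration.

```python
from datetime import date


def apply_record(store, record):
    """Add or replace one record in `store`, keyed by MedlineCitation PMID."""
    pmid = record["pmid"]
    existing = store.get(pmid)
    if existing is None:
        store[pmid] = record  # no match: the record is new
        return "added"
    new_rev = record.get("date_revised")
    old_rev = existing.get("date_revised")
    same_status = record["status"] == existing["status"]
    # Keep the stored copy when the incoming DateRevised is not later.
    # A status change is accepted regardless, since DateRevised does not
    # signal status changes (assumed interpretation of points 4 and 5).
    if same_status and new_rev is not None and old_rev is not None and new_rev <= old_rev:
        return "kept"
    store[pmid] = record
    return "replaced"


def apply_deletions(store, pmids):
    """Remove PMIDs listed in a DeleteCitationSet, if one is present."""
    for pmid in pmids:
        store.pop(pmid, None)


store = {}
print(apply_record(store, {"pmid": "1", "status": "In-Process"}))  # added
print(apply_record(store, {"pmid": "1", "status": "MEDLINE",
                           "date_revised": date(2007, 12, 1)}))   # replaced
print(apply_record(store, {"pmid": "1", "status": "MEDLINE",
                           "date_revised": date(2007, 11, 1)}))   # kept
```

When files are processed in ascending numeric order the "kept" branch should rarely trigger; it acts only as a guard against out-of-order processing.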

e. Back Issue Citations and Abstracts
Please refer to the NLM Technical Bulletin articles at http://www.nlm.nih.gov/pubs/techbull/ma07/ma07_technote.html#10 and http://www.pubmedcentral.nih.gov/about/scanning.html for information pertaining to the PubMed Central® Back Issue Digitization Project. Citations to articles in PubMed Central that result from this project are added to PubMed if not there already and are exported to licensees in MedlineCitation Status = PubMed-not-Medline. In addition, abstracts for older articles obtained from this project are added to existing MEDLINE/PubMed records and the resulting revised records are exported to licensees.

f. 2008 MeSH Files
The 2008 MeSH Vocabulary file is available to download.

4. FOR CATFILEPLUS IN XML and SERFILE IN XML LICENSEES
a. New DTD for 2008 Production Year
The changes to the NlmCatalogRecord and related DTDs used for distribution of the XML files for 2008 are summarized in the August 23, 2007 announcement.

b. Complete New Baseline Reloads in December
A complete pull of all CatfilePlus in XML and Serfile in XML records will be available in early-December to all licensees who have returned their 2008 renewal material. Records in these baseline files completely replace all previously distributed records.

c. 2008 MeSH Files
The 2008 MeSH Vocabulary file is available to download.

5. FOR MARC 21 CATFILE, CATFILEPLUS, and SERFILE LICENSEES
a. Schedule
1. The last monthly distribution of CatfilePlus and Serfile that will reflect the 2007 MeSH vocabulary consists of the files dated November 1, 2007. Appropriate LocatorPlus records which are updated as part of Year-End-Processing will be distributed to licensees in the files dated December 3, 2007. All MeSH headings in records distributed in the December 3, 2007 files will conform to 2008 MeSH.
2. The last weekly distribution of Catfile that will reflect the 2007 MeSH vocabulary is the file dated November 15, 2007. Appropriate LocatorPlus records which are updated as part of Year-End-Processing will be distributed to Catfile licensees in files beginning November 22, 2007. All MeSH headings in records distributed as of November 22, 2007 will conform to 2008 MeSH.

b. 2008 MeSH Files
The 2008 MeSH Vocabulary file is available to download.

6. FOR CCRIS, CHEMIDPLUS SUBSET, DIRLINE, GENE-TOX, HSDB, and TOXLINE SUBSET LICENSEES
There are no DTD changes expected for 2008. As customary, when complete replacement files are available they will be on the server to renewing licensees on or shortly after the 28th day of the month (except for GENE-TOX for which no update is scheduled). ChemIDplus Subset licensees have already been advised in the announcement at http://www.nlm.nih.gov/bsd/licensee/announce/2007.html#d08_14 that the data provided by the U.S. Pharmacopeia (USP) will no longer be included in the version of the database distributed to licensees effective with the file available at the end of this month.

7. USAGE STATISTICS
Licensees who make the leased data available to people outside their organization or who 'vend' the data are reminded that reports of database usage are due December 6, 2007. The Database Usage Report Form for Fiscal Year 2007 (Oct. 1, 2006 - Sept. 30, 2007) is available at http://www.nlm.nih.gov/databases/license/userep_2007.html. Usage reports are not needed for Catfile, CatfilePlus, or Serfile data or from those using NLM data internally for research purposes only.

8. WEB PAGES FOR LICENSEES, E-MAIL ALERTS FROM NLM, NLM TECHNICAL BULLETIN
The main web page for licensees is http://www.nlm.nih.gov/bsd/licensee/. There are links on this page for access to comprehensive resources and documentation about the databases available for lease. Licensees are urged to explore these pages for details on database content, structure/values (DTDs, element descriptions), availability (re-loads, update files, charts), current announcements, and miscellaneous information.

Licensees who have not already done so should consider subscribing to NLM's e-mail alert, NLM-Announces (see http://www.nlm.nih.gov/listserv/emaillists.html). Once a week subscribers receive an e-mail message listing news from NLM including links to a wide variety of recently updated web pages (including when e-mails to licensees are posted on http://www.nlm.nih.gov/bsd/licensee/) and the latest articles on various topics in the NLM Technical Bulletin (TB). This alert service covers all NLM products, services, and programs and is different from the occasional e-mails sent to all licensees containing technical and administrative information related to leasing NLM databases.

Licensees of CCRIS, ChemIDPlus, DIRLINE, GENE-TOX, HSDB, or TOXLINE Subset may also wish to subscribe to NLM-Tox-Enviro-Health-L (see http://sis.nlm.nih.gov/enviro/envirolistserv.html), an email announcement list available from NLM's Division of Specialized Information Services (SIS). The purpose of the announcement list is to broadcast updates on SIS's resources, services, and outreach in toxicology and environmental health.

The NLM Technical Bulletin at http://www.nlm.nih.gov/pubs/techbull/ includes information about searching the databases you lease on NLM's systems including PubMed, Gateway, TOXNET, NLM Catalog, and LocatorPlus. Items are published as they are completed and are then compiled into bi-monthly issues. TB material supplements the data content and format documentation made available for licensees from http://www.nlm.nih.gov/bsd/licensee.


New DTDs, Forthcoming Baseline, ChemIDPlus Data, Usage Statistics, and Revised Licensee Web Page

August 23, 2007

Items covered in this message are:
1. DTDs For NLM's 2008 Production Year
2. Forthcoming 2008 Baseline and Update Files
3. For ChemIDplus Licensees
4. Usage Statistics Due in December
5. Information Page for MEDLINE/PubMed and Other NLM Data Licensees
6. License Code Reminder

1. DTDS FOR NLM'S 2008 PRODUCTION YEAR
The January 1, 2007 versions of the NLMMedline, NLMMedlineCitation, NLMCatalogRecord, NLMSharedCatCit, and NLMCommon DTDs currently in effect will be replaced by DTDs dated January 1, 2008. The new DTDs will be used for creating the 2008 versions of the baseline MEDLINE/PubMed, CatfilePlus in XML, and Serfile in XML databases and for subsequent update files during the 2008 production year.

There are no known DTD changes expected for 2008 for CCRIS, ChemID, DIRLINE, Gene-Tox, HSDB, and TOXLINE Special.

The forthcoming 2008 DTDs are available from links on http://www.nlm.nih.gov/databases/dtd/. The MEDLINE XML Element Descriptions document at http://www.nlm.nih.gov/bsd/licensee/data_elements_doc.html and the CatfilePlus and Serfile in XML Format Element Descriptions document at http://www.nlm.nih.gov/bsd/licensee/catrecordxml_overview2.html will be edited to reflect these changes in the future.

The following is a summary of changes affecting leased MEDLINE/PubMed, CatfilePlus in XML, and Serfile data in XML format:

a. ELocationID [revised 2/29/08]

In MEDLINE/PubMed: DOIs (Digital Object Identifiers) or PIIs (Publisher Item Identifiers) may reside in records as an indicator of an article’s electronic location when publishers supply them in electronic submissions of data to NLM, in lieu of or in addition to pagination. Either electronic location or pagination is required, and both may be present on a record. The new element, ELocationID with its entity EIdType, will be used to house DOIs and PIIs in prospective MEDLINE/PubMed records; the data will not be added to existing records.

In CatfilePlus and Serfile: URLs will continue to be used as an indicator of electronic location. To establish common ground between the databases, the ElectronicAccessList envelope will be changed to ELocationList. Its child elements will be ELocation (formerly ElectronicAccess), ELocationID (formerly ElectronicAddress), and DescriptiveInformation. ELocationID with its entity EIdType will be used to house URLs in CatfilePlus and Serfile. DOIs are not expected to be used in CatfilePlus and Serfile. DescriptiveInformation is retained as an element for CatfilePlus and Serfile but is not expected to be used with MEDLINE/PubMed records.
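
As a rough illustration of how a licensee might read the new element in MEDLINE/PubMed data, the Python sketch below extracts ELocationID values from a record fragment. The sample XML, the DOI and PII values, and the helper name elocation_ids are invented for illustration; only the element name ELocationID and its attributes EIdType and ValidYN come from the DTD changes described in this message.

```python
import xml.etree.ElementTree as ET

# Invented record fragment; real records carry many more elements.
SAMPLE = """
<Article>
  <Pagination><MedlinePgn>12-9</MedlinePgn></Pagination>
  <ELocationID EIdType="doi" ValidYN="Y">10.1000/example.123</ELocationID>
  <ELocationID EIdType="pii" ValidYN="Y">S0000-0000(08)00001-2</ELocationID>
</Article>
"""

def elocation_ids(article):
    """Return (EIdType, value) pairs for each ELocationID child of Article."""
    return [(e.get("EIdType"), (e.text or "").strip())
            for e in article.findall("ELocationID")]

article = ET.fromstring(SAMPLE)
ids = elocation_ids(article)

# Electronic location and pagination may both be present on one record.
has_pagination = article.find("Pagination/MedlinePgn") is not None
```

Because existing records will not receive the new data, code of this kind should treat an empty result as normal rather than as an error.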

b. ISSNLinking
A new element, ISSNLinking, may appear in MEDLINE/PubMed, CatfilePlus and Serfile records. This is the ISSN designated by the ISSN Network to enable collocation or linking among the different media versions of a continuing resource. Separate ISSNs are assigned for each media type in which a resource is issued. The first ISSN assigned to any medium version of a continuing resource shall also be designated to function as the linking ISSN (aka ISSN-L) and shall apply to all other media versions of that resource.
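
The collocation that ISSNLinking enables can be pictured with a short Python sketch. The tuples below are invented placeholders (not real ISSNs) and the function name is hypothetical; the sketch only shows the idea of grouping the media versions of a resource under their shared linking ISSN.

```python
from collections import defaultdict

# Invented (ISSN, medium, linking ISSN) rows; the values are placeholders.
rows = [
    ("1111-1111", "Print", "1111-1111"),
    ("2222-2222", "Electronic", "1111-1111"),
    ("3333-3333", "Print", "3333-3333"),
]

def group_by_linking_issn(rows):
    """Collect the media versions of each resource under its ISSN-L."""
    groups = defaultdict(list)
    for issn, medium, issn_l in rows:
        groups[issn_l].append((issn, medium))
    return dict(groups)

groups = group_by_linking_issn(rows)
```

In this sample the print and electronic versions of the first resource end up under the same key, which is the ISSN first assigned to that resource.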

c. InvestigatorList
The existing InvestigatorList elements will begin to be used for MEDLINE/PubMed in the 2008 production year to contain personal names of individuals (e.g., collaborators and investigators) who are not authors of a paper but rather are listed in the paper as members of a collective/corporate group that is an author of the paper. The personal names in InvestigatorList will not be associated with the specific collective/corporate group author in which they are listed in the paper. The names will be entered in the order that they are published; the same name listed multiple times will be repeated because NLM cannot make assumptions as to whether those names refer to the same person.

d. GrantList Agency Element
In August 2007, NLM began to include the country name for UK funding agencies in the Agency element in GrantList; for example: <Agency>United Kingdom Arthritis Research Campaign</Agency>. Starting with the 2008 baseline export, NLM will include the home country source of US funding agencies as well. For example, for the current export: <Agency>NIOSH</Agency> the export in 2008 will be: <Agency>United States NIOSH</Agency>. [revised 10/22/07]

The DTD changes for 2008 are:

NLMMedline DTD (used for MEDLINE/PubMed data)
Changed entity reference from "nlmmedlinecitation_070101.dtd" to: "nlmmedlinecitation_080101.dtd"

NLMMedlineCitation DTD (used for MEDLINE/PubMed data)
a. Changed entity reference from "nlmsharedcatcit_070101.dtd" to: "nlmsharedcatcit_080101.dtd"
b. Added entity EIdType with doi and pii values

NLMSharedCatCit DTD (used for MEDLINE/PubMed, CatfilePlus, and Serfile data)
a. Changed entity reference from "nlmcommon_070101.dtd" to "nlmcommon_080101.dtd"
b. Moved Investigator & InvestigatorList from NLMSharedCatCit DTD to NLMCommon DTD

NLMCommon DTD (used for MEDLINE/PubMed, CatfilePlus, and Serfile data)
a. Added ELocationID in Article
b. Added EIdType and ValidYN as attributes to ELocationID
c. Added ISSNLinking to MedlineJournalInfo (designates the single unique ISSN for a continuing resource, regardless of its medium)
d. Investigator & InvestigatorList were moved from NLMSharedCatCit DTD to NLMCommon DTD

NLMCatalogRecord DTD (used for CatfilePlus and Serfile data in XML format):
a. Changed entity reference from "nlmsharedcatcit_070101.dtd" to: "nlmsharedcatcit_080101.dtd"
b. Changed ElectronicAccessList to ELocationList
c. Changed ElectronicAddress to ELocationID
d. Changed ElectronicAccess to ELocation
e. Added entity EIdType with doi and url values
f. Added element ISSNLinking with attribute ValidYN to NLMCatalogRecord element
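
Licensees whose software is keyed to the 2007 element names in CatfilePlus or Serfile XML may find it helpful to see the renames as a simple mapping. The Python sketch below is only an illustration: the sample record, its URL, and the function name upgrade_element_names are invented, and the sketch does not set the EIdType attribute or validate against the DTD.

```python
import xml.etree.ElementTree as ET

# 2007 element name -> 2008 element name, per the NLMCatalogRecord DTD changes.
RENAMED = {
    "ElectronicAccessList": "ELocationList",
    "ElectronicAccess": "ELocation",
    "ElectronicAddress": "ELocationID",
}

def upgrade_element_names(root):
    """Rename 2007-style elements in place; other elements are untouched."""
    for elem in root.iter():
        elem.tag = RENAMED.get(elem.tag, elem.tag)
    return root

# Invented 2007-style fragment for demonstration.
OLD = """<NLMCatalogRecord>
<ElectronicAccessList>
<ElectronicAccess>
<ElectronicAddress>http://www.example.org/journal</ElectronicAddress>
<DescriptiveInformation>Full text</DescriptiveInformation>
</ElectronicAccess>
</ElectronicAccessList>
</NLMCatalogRecord>"""

root = upgrade_element_names(ET.fromstring(OLD))
new_xml = ET.tostring(root, encoding="unicode")
```

The same mapping, read in reverse, could let older code keep working against 2008 data while it is being updated.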

2. FORTHCOMING 2008 BASELINE AND UPDATE FILES
After NLM completes its annual database maintenance activities, we expect to release the 2008 MEDLINE/PubMed baseline files and the 2008 Catfile (MARC format), CatfilePlus (MARC and XML formats) and Serfile (MARC and XML formats) baseline files maintained with 2008 MeSH vocabulary and other global changes to renewing licensees during the first week of December (earlier than in past years). The new baseline and subsequent update files replace all previously received 2007 production year data. A complete baseline reload for each database is required; this ensures you have the most current and accurate version of all records. Licensees will be advised when the specific target date for release of the 2008 baseline files becomes more firm.

As is the case each year, in mid-November NLM will suspend distribution of new and revised MEDLINE/PubMed records in MedlineCitation Status = MEDLINE, MedlineCitation Status = PubMed-not-MEDLINE, and MedlineCitation Status = OLDMEDLINE as preparations are made for the new production year. The last records in these three statuses, as well as deleted records, for the current 2007 production year will be available around November 13. Records in In-Process and In-Data-Review statuses will continue to be exported to MEDLINE/PubMed licensees prior to release of the 2008 baseline files.

Licensees must confirm continued interest in leasing NLM databases during the 2008 production year by responding to an e-mail that will be sent in November. Only those who respond will have access to the 2008 data; the licenses of non-responders will cease and, accordingly, their access to the 2008 data will be denied. Those whose licenses cease should stop using previously leased MEDLINE/PubMed data.

3. FOR CHEMIDPLUS LICENSEES
An important announcement about the elimination of data provided by the U.S. Pharmacopeia (USP) in ChemIDplus was e-mailed to licensees on August 14, 2007.

4. USAGE STATISTICS DUE IN DECEMBER
Licensees are reminded that reports of leased NLM database usage are due December 6, 2007. This pertains to MEDLINE/PubMed, CCRIS, ChemID, DIRLINE, Gene-Tox, HSDB, and TOXLINE Special; reports are not needed for Catfile, CatfilePlus, or Serfile. The Database Usage Report Form for Fiscal Year 2007 (Oct. 1, 2006 - Sept. 30, 2007) is available at http://www.nlm.nih.gov/databases/license/userep_2007.html and should be submitted as soon as possible after September 30. Licensees using leased NLM data internally for research purposes only do not need to submit a usage report.

5. INFORMATION PAGE FOR MEDLINE/PUBMED AND OTHER NLM DATA LICENSEES
NLM's information page for licensees at http://www.nlm.nih.gov/bsd/licensee/ has been redesigned to improve navigation and minimize redundancy within the related pages (please note the new URL). The redesign also enables more efficient internal management of the pages as they are updated and archived at NLM (the content of the related pages has not changed). Please visit the new pages to familiarize yourself with the location of the data element descriptions, update charts, postings of this and other e-mail announcements to licensees, and other documentation/information relating to MEDLINE/PubMed and/or other data you may lease from NLM.

6. LICENSE CODE REMINDER
A unique 3-letter code is assigned to each license. Your code appears at the top of this message. Please keep a record of your license code and reference it when sending e-mails to NLM about matters relating to your license or leased databases.


Elimination of USP Data from Leased ChemIDPlus

August 14, 2007

The U.S. Pharmacopeia (USP) is one of the providers of data in the NLM ChemIDplus database. The data provided by USP (identified in ChemIDplus with the source tag of USPDDN) will no longer be distributed by NLM in its leased version of ChemID beginning with the replacement file expected to be available from NLM's server to licensees on October 28, 2007. Although these data will remain searchable in the NLM web-based databases ChemIDplus Lite (http://chem.sis.nlm.nih.gov/chemidplus/chemidlite.jsp) and Advanced (http://chem.sis.nlm.nih.gov/chemidplus/), USP will no longer allow NLM to include their data in the XML version of ChemID that NLM leases and distributes from the NLM server.

If you are interested in licensing the data with the source tag of USPDDN, or have questions about the removal of USP data from the leased version of ChemID, please contact Donna Shriver (dms@usp.org) of the U.S. Pharmacopeial Convention, Inc. (USP), the provider of USPDDN data for ChemIDplus.

Please contact tehip@teh.nlm.nih.gov with questions about the content of the NLM ChemIDplus database and nlmdatadistrib@nlm.nih.gov with NLM licensing questions.


Funding Agencies in MEDLINE/PubMed Records

July 30, 2007

One day during the week of August 13, or shortly thereafter, a large number of existing MEDLINE/PubMed records will be re-exported as revised records due to the addition of the country name 'United Kingdom' before the value 'Wellcome Trust' in the <Agency> element in <GrantList>. Approximately 9,000 records are expected to be affected by this enhancement. From then on, new records with Wellcome Trust as a funding agency will also include the country name.

For example, a record currently with this content:
<GrantList CompleteYN="Y">
<Grant>
<GrantID>043965</GrantID>
<Agency>Wellcome Trust</Agency>
</Grant>
</GrantList>

will become as follows:
<GrantList CompleteYN="Y">
<Grant>
<GrantID>043965</GrantID>
<Agency>United Kingdom Wellcome Trust</Agency>
</Grant>
</GrantList>

The only country name to be used during the 2007 production year is United Kingdom. NLM expects to introduce the country name United States for US funding agencies at the time the 2008 baseline files are exported in December 2007. It is possible that other country names will be introduced in the future if funding agencies in other countries begin to be carried in MEDLINE/PubMed records.
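
Licensees who want to separate the country name from the agency name might handle the prefix along the lines of the Python sketch below. The function name split_agency and the prefix list are assumptions made for illustration; the list holds only the two country names mentioned in this message and would need extending if other country names are introduced.

```python
# Country names announced so far; extend if NLM introduces others.
COUNTRY_PREFIXES = ("United Kingdom", "United States")

def split_agency(agency):
    """Return (country, agency_name); country is None when no prefix matches."""
    for country in COUNTRY_PREFIXES:
        prefix = country + " "
        if agency.startswith(prefix):
            return country, agency[len(prefix):]
    return None, agency

uk = split_agency("United Kingdom Wellcome Trust")
us = split_agency("United States NIOSH")
legacy = split_agency("Wellcome Trust")
```

Values without a recognized prefix, such as records not yet re-exported, are returned unchanged with no country.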

Additional UK Funding Agencies
At the time the country name is added to records showing Wellcome Trust as the funding agency, NLM expects to begin accepting data into PubMed from the United Kingdom Manuscript Submission System (see NLM Technical Bulletin article, "PubMed® Links to Author Manuscripts in PubMed Central®"). As a result, grant information from the following seven additional UK granting agencies may be included in MEDLINE/PubMed records:

Arthritis Research Campaign
Biotechnology and Biological Sciences Research Council
British Heart Foundation
Cancer Research UK
Chief Scientist Office
Department of Health
Medical Research Council

When present, these will each also be preceded by the country name 'United Kingdom' in the <Agency> element, for example:

<GrantList CompleteYN="Y">
<Grant>
<GrantID>12345</GrantID>
<Agency>United Kingdom British Heart Foundation</Agency>
</Grant>
</GrantList>

A forthcoming NLM Technical Bulletin article will cover these enhancements to MEDLINE/PubMed records.


Change to HSDB XML

January 24, 2007

A new HSDB file which replaces the previous file is usually (but not always) available at the end of each month.  Effective this month, for the new file available on January 29, the XML will be changed as follows:  CDATA will be added to the <ocpp> element.  This enables the parser to ignore the <table> tags that now may reside in the XML data.  The DTD available on January 29 will not include the various <table> tags.
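
The effect of the CDATA change can be seen in a small Python sketch. The sample content of <ocpp> is invented; only the element name and the presence of <table> tags come from this announcement. Note that Python's ElementTree does not validate against a DTD, so the second case below parses without complaint, whereas a validating parser would reject a <table> element that the DTD does not declare.

```python
import xml.etree.ElementTree as ET

# Invented sample content for illustration.
WITH_CDATA = "<ocpp><![CDATA[Values: <table><tr><td>1</td></tr></table>]]></ocpp>"
WITHOUT_CDATA = "<ocpp>Values: <table><tr><td>1</td></tr></table></ocpp>"

with_cdata = ET.fromstring(WITH_CDATA)
without_cdata = ET.fromstring(WITHOUT_CDATA)

# Inside CDATA the markup is plain character data: no child elements exist.
cdata_child_count = len(with_cdata)
cdata_text = with_cdata.text

# Without CDATA the parser treats <table> as a child element of <ocpp>.
plain_children = [child.tag for child in without_cdata]
plain_text = without_cdata.text
```

With CDATA in place, the table markup arrives as literal text inside <ocpp>, and it is up to the licensee's application to display or strip it.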


Deleted NLM MARC 21 Format Records

January 19, 2007

The MARC file containing all the bibliographic records deleted by NLM between January 1, 2006 and December 31, 2006, is now available for licensees to ftp from the same location on NLM's ftp server where the appropriate base files are posted.  This MARC file is called "deleted.20070101" and contains 2,938 MARC-formatted records. Each deleted record contains Leader byte 05="d", as specified by the MARC 21 format. There is also a label file, "deleted.20070101.label".

Important: The file "deleted.20070101" contains all the records NLM has deleted from its bibliographic file during the designated timeframe. This means some licensees will receive deletes for records they may never have actually received as current or updated cataloging. Be sure to use the delete file only to remove those records which you do have from your system. Do not add any of them as new or modified records.
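
The rule above (remove only records you hold, never add) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it reads the MARC 21 leader and directory directly, it matches records on the control number in field 001 (an assumption; your system may key on another field), and the record identifiers and the make_record helper exist only to build test data.

```python
RECORD_TERMINATOR = b"\x1d"
FIELD_TERMINATOR = b"\x1e"

def iter_marc_records(data):
    """Yield each raw record (without its terminator) from a MARC file image."""
    for chunk in data.split(RECORD_TERMINATOR):
        if chunk.strip():
            yield chunk

def record_status(record):
    """Leader byte 05 is the record status; 'd' marks a deleted record."""
    return chr(record[5])

def control_number(record):
    """Return the content of field 001, or None if the record has none."""
    base = int(record[12:17])          # leader 12-16: base address of data
    directory = record[24:base - 1]    # 12-byte entries before the terminator
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        length, start = int(entry[3:7]), int(entry[7:12])
        if entry[0:3] == b"001":
            field = record[base + start:base + start + length]
            return field.rstrip(FIELD_TERMINATOR).decode("ascii")
    return None

def apply_deletes(local_ids, delete_data):
    """Remove IDs already held locally; records not held are ignored."""
    removed = set()
    for rec in iter_marc_records(delete_data):
        if record_status(rec) != "d":
            continue
        number = control_number(rec)
        if number in local_ids:
            local_ids.discard(number)
            removed.add(number)
    return removed

def make_record(number, status="d"):
    """Build a tiny MARC record holding only field 001 (test data only)."""
    field = number.encode("ascii") + FIELD_TERMINATOR
    directory = b"001" + b"%04d" % len(field) + b"00000" + FIELD_TERMINATOR
    base = 24 + len(directory)
    total = base + len(field) + 1
    leader = (b"%05d" % total + status.encode("ascii") + b"am  22"
              + b"%05d" % base + b"   4500")
    return leader + directory + field + RECORD_TERMINATOR

# Invented identifiers: one delete matches a held record, one does not.
delete_file = make_record("101234567") + make_record("109999999")
local_ids = {"101234567", "105555555"}
removed = apply_deletes(local_ids, delete_file)
```

The delete for the record that was never held is simply skipped, so nothing is added to the local set.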

Licensees who are new recipients of NLM's MARC bibliographic records in 2007, as well as ongoing licensees who are discarding their pre-2007 records and reloading with the 2007 base files, do NOT need "deleted.20070101". The records in this delete file were removed from NLM's database prior to the pull of the 2007 base files.


Last updated: 29 February 2008
First published: 22 January 2007
Permanence level: Permanence Not Guaranteed