Exercise #4: Try the clustered nr database


Background

We created an experimental clustered nr database that offers faster searches with less redundancy. Because near-identical proteins are grouped into clusters, the results span a wider range of organisms and evolutionary distances.

We generate clustered nr from the default protein nr database with MMseqs2. Each cluster contains proteins that are more than 90% identical to each other and within 90% of the length of the longest member.
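The two thresholds can be sketched as a simple membership check. This is only an illustration of the criteria as stated above, not how MMseqs2 is implemented: the function name `meets_cluster_criteria` and its parameters are hypothetical, and the percent identity is assumed to have been computed already by an alignment step.

```python
def meets_cluster_criteria(percent_identity: float,
                           member_length: int,
                           longest_length: int) -> bool:
    """Illustrative check of the two clustered nr thresholds.

    Hypothetical helper, not part of MMseqs2 or BLAST.
    percent_identity: identity (0-100) from a prior alignment step.
    member_length / longest_length: sequence lengths in residues.
    """
    # "More than 90% identical" is read here as a strict inequality.
    identical_enough = percent_identity > 90.0
    # "Within 90% of the length of the longest member": integer
    # arithmetic avoids floating-point edge cases at the boundary.
    long_enough = member_length * 100 >= 90 * longest_length
    return identical_enough and long_enough


# Made-up numbers, chosen only to exercise each branch.
print(meets_cluster_criteria(95.0, 450, 480))  # passes both thresholds
print(meets_cluster_criteria(95.0, 300, 480))  # too short
print(meets_cluster_criteria(85.0, 470, 480))  # identity too low
```

The numbers in the usage lines are invented for illustration and do not describe any real cluster.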

Read more in this NCBI Insights blog.

Setup

  • Use the Protein BLAST (blastp) page. Get there from the BLAST home page.
  • Query: NP_938033.1
  • Database: Check the box to "Compare". This runs searches against both standard nr and clustered nr.
  • Organism: Limit the standard nr search to Eukarya (taxid:2759).


RID for standard nr: M8NWX25K016
RID for clustered nr: M8NWXVSD016
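If you prefer to open saved results by RID rather than through the web form, the following sketch builds a retrieval link using the BLAST URL API parameters `CMD=Get`, `RID`, and `FORMAT_TYPE`. The helper name `result_url` is hypothetical. RIDs expire after a limited time, so the two listed above may no longer resolve; substitute the RIDs from your own searches.

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def result_url(rid: str, format_type: str = "HTML") -> str:
    """Build a link that asks BLAST for the results of a finished search.

    Hypothetical helper: it only assembles the URL and makes no
    network request.
    """
    params = {"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type}
    return f"{BLAST_URL}?{urlencode(params)}"


# RIDs taken from this exercise; they may have expired.
print(result_url("M8NWX25K016"))  # standard nr
print(result_url("M8NWXVSD016"))  # clustered nr
```

Paste either printed link into a browser to view that search, if the RID is still valid.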


Results

Look at the Descriptions or "Distance tree of results" to gauge the taxonomic range of the hits in the two searches.
Here is the Distance tree of results for standard nr. The most distant organisms are mammals.

Image of Distance tree of results for standard nr search


Here is the Distance tree of results for clustered nr. We see hits from organisms much more distant than mammals, as far as the lamprey.

Image of Distance tree of results for clustered nr search