NCBI BLAST Exercise:
Identifying and Comparing Sequences
NCBI BLAST allows you to input a DNA, RNA, or protein (amino acid) sequence and find sequences that are identical or similar to it.
This exercise gives you practice using NCBI BLAST to identify and compare sequences.
Identifying Sequences
Let's identify a nucleotide sequence.
You can get to BLAST directly by going to http://blast.ncbi.nlm.nih.gov/.
For the first part of this exercise, we will give you a nucleotide sequence to identify. Click Nucleotide BLAST on the left of the page.
Copy and paste the entire string of nucleotide symbols below into the box under Enter Query Sequence.
>Seq1
ATGGGTAAGGAGGACAAGACTCACCTTAACGTCGTCGTCATCGGCCACGTCGACTCTGGCAAGTCGACCACTGTAAGTACAACCAACAGCGGGTTGCTTATCTGCACTCGGAATCCGCCAAACCTGGCAGGGTATCACCAAAACATCTTGCTAACTTTTGACAGACCGGTCACTTGATCTACCAGTGCGGTGGTATCGACAAGCGAACCATCGAGAAGTTCGAGAAGGTTAGTCAATATCCCTTCGATTACGCGCGCTCCCATCGATTCCCACGATTCGCTCCCTCACTCGAAACACATCCATTACCCCGCTCGAGTCCGAAAATTTTGCGGTGCGACCGTGATTTTTTCTGGTGGGGTATCTTACCCCGCCACTCGAGTCACGGATGCGCTTGCCCTGTTCCCACAAAACCTTACCACCCTGTCGCGCACTACATGTCTTGCAGTCACTAACCACTGGACAATAGGAAGCCGCCGAGCTCGGAAAGGGTTCCTTCAAGTACGCCTGGGTTCTTGACAAGCTCAAAGCCGAGCGTGAGCGTGGTATCACCATTGATATCGCTCTCTGGAAGTTCGAGACTCCTCGCTACTATGTCACCGTCATTGGTATGTTGTCACCGTCTCACACTATCATGTATTCATCATGCTAACATCTCTCTCAGATGCCCCCGGTCATCGTGATTTCATCAAGAACATGATC
Uncheck the box labeled "Align two or more sequences" if it is checked.
Then scroll down and click the BLAST button.
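Before pasting, it can help to confirm that the text is well-formed FASTA and contains only nucleotide symbols. The following is a minimal sketch, not part of BLAST itself: the function names `parse_fasta` and `invalid_symbols` are hypothetical, and the accepted alphabet (`ACGTUN`) is an assumption that is stricter than what the BLAST form actually accepts.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_symbols(sequence, alphabet="ACGTUN"):
    """Return the sorted symbols in `sequence` that are not in `alphabet`."""
    return sorted(set(sequence) - set(alphabet))


# The first 30 bases of Seq1 from this exercise, used as a short example.
example = ">Seq1\nATGGGTAAGGAGGACAAGACTCACCTTAAC\n"
for name, seq in parse_fasta(example):
    print(name, len(seq), invalid_symbols(seq))
```

An empty list from `invalid_symbols` suggests the sequence is safe to paste as a nucleotide query.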
Once your results are displayed, you will see a header followed by the results of your search. The results can be viewed in several ways: as a list of sequence "Descriptions," as a "Graphic Summary," or in a more detailed "Alignments" view.
Click on the Descriptions tab (if you are not already there) to learn more about each of the sequences that aligned with yours.
Note that your list may differ from the one shown in the original exercise, but the top results should still be from the same organism and gene/protein.
Click on the description of the sequence to see the alignment.
Click on the Sequence ID to view the record in the Nucleotide database and learn more about the sequence.
Looking at the Nucleotide record, answer the following questions:
Look at the line in the Nucleotide record that says ORGANISM. What organism is this sequence from?
Hint: Follow the link to the Nucleotide record and look on the line that says "ORGANISM."
Answer: Fusarium tricinctum, a fungus that is considered a weak plant pathogen.
Look at the Nucleotide record title or scroll down to the Features table to see what gene this sequence is associated with. What gene is this sequence from?
Answer: From either the record title or the Features table of the Nucleotide record, you can see that this sequence is from the translation elongation factor 1-alpha (TEF1) gene.
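If you save a Nucleotide record as text in GenBank flat-file format, the same two lookups can be scripted. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the record excerpt is hand-written in the GenBank layout for illustration only (it is not copied from a real record).

```python
import re


def organism_from_genbank(record_text):
    """Return the text after ORGANISM on its line, or None if the line is absent."""
    for line in record_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("ORGANISM"):
            return stripped[len("ORGANISM"):].strip()
    return None


def gene_qualifiers(record_text):
    """Return the distinct /gene="..." qualifier values in order of appearance."""
    found = re.findall(r'/gene="([^"]+)"', record_text)
    return list(dict.fromkeys(found))  # de-duplicate, keep order


# Hand-written excerpt in GenBank flat-file layout (illustrative, not a real record).
EXAMPLE_RECORD = """\
SOURCE      Fusarium tricinctum
  ORGANISM  Fusarium tricinctum
FEATURES             Location/Qualifiers
                     /gene="TEF1"
"""

print(organism_from_genbank(EXAMPLE_RECORD))
print(gene_qualifiers(EXAMPLE_RECORD))
```

For real records, a dedicated parser is more robust than line matching; the sketch only mirrors what you read off the page by eye.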
Now go back to the Nucleotide BLAST (BLASTn) homepage and BLAST this sequence:
>Seq2
ATGGGAAAGGAGAAGACCCACATCAACATCGTTGTCATTGGGCACGTAGATTCAGGGAAGTCTACCACGACTGGCCATCTGATCTATAAATGTGGCGGGATCGACAAGAGAACAATTGAAAAGTTCGAGAAGGAGGCTGCCGAGATGGGAAAGGGCTCCTTCAAATATGCCTGGGTCTTGGACAAACTTAAAGCTGAACGTGAGCGTGGTATCACCATTGATATCTCCCTGTGGAAATTTGAGACCAGCAAGTACTATGTTACCATCATTGATGCCCCAGGACACAGAGACTTCATCAAAAACATGATTACAGGCACATCCCAGGCTGACTGTGCTGTCCTGATCGTTGCTGCTGGTGTTGGTGAATTTGAAGCCGGTATCTCCAAGAACGGGCAGACCCGTGAGCATGCCCTTTTGGCTTACACCCTGGGTGTGAAACAACTAATTGTTGGCGTTAACAAAATGGATTCCACTGAGCCACCCTATAGCCAGAAGAGATACGAAGAAATTGTTAAGGAAGTCAGCACCTATATTAAGAAAATTGGCTACAACCCCGACACAGTAGCATTTGTGCCAATTTCTGGCTGGAATGGTGACAACATGCTAGAACCAAGTGCTAATATGCCATGGTTCAAGGGATGGAAAGTCACCCGTAAGGACGGCAATGCCAGTGGAACCACCCTGCTTGAAGCTCTGGATTGCATTCTGCCACCAACTCGCCCAACTGACAAACCCTTGCGTTTGCCTCTCCAGGATGTCTATAAAATTGGTGGTATTGGTACTGTCCCTGTGGGTCGTGTGGAGACTGGTGTTCTCAAACCTGGCATGGTGGTCACCTTTGCTCCAGTCAATGTAACAACTGAAGTGAAGTCTGTAGAAATGCACCATGAAGCATTGAGTGAAGCCCTTCCTGGGGACAATGTGGGCTTTAATGTCAAAAACGTGTCTGTCAAAGATGTCCGTCGTGGCAATGTGGCTGGTGACAGCAAAAATGATCCACCCATGGAAGCTGCTGGCTTCACAGCTCAGGTGATTATTTTGAACCATCCAGGCCAAATCAGTGCTGGATATGCACCTGTGCTGGATTGTCACACAGCTCACATTGCTTGCAAGTTTGCTGAGCTGAAGGAGAAGATTGATCGTCGTTCTGGGAAAAAGCTGGAAGATGGCCCTAAATTCTTGAAATCTGGTGACGCTGCCATCGTTGATATGGTTCCTGGCAAGCCCATGTGTGTCGAGAGCTTCTCTGATTATCCTCCCCTGGGCCGTTTTGCTGTGCGTGACATGAGACAGACAGTCGCTGTGGGTGTCATCAAAGCAGTGGACAAGAAGGCAGCTGGAGCTGGCAAGGTCACCAAGTCTGCCCAGAAAGCTCAGAAGGCTAAATGA
What organism is this sequence from?
Hint: Look at the sequences that align 100% with this one. From the Nucleotide record title or ORGANISM field, find the species name.
Answer: The sequence matches 100% with sequences in Nucleotide from Bos taurus (cattle).
What gene is this sequence from?
Hint: Look at the record title or scroll down to the Features table to see what gene this sequence is associated with.
Answer: This sequence is from the translation elongation factor 1 alpha 1 (EEF1A1) gene.
You just found two translation elongation factor 1-alpha sequences from two different species.
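Having run the same search twice by hand, you may want to script it. NCBI offers a URL-based interface to BLAST at `Blast.cgi`; the sketch below only builds the form body for a submission and does not send anything. The function name `build_put_request` is hypothetical, and the choice of the `nt` database and the `https` endpoint are assumptions rather than settings taken from this exercise.

```python
from urllib.parse import urlencode

# Assumed endpoint for NCBI's URL-based BLAST interface.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_put_request(fasta_text, database="nt", megablast=True):
    """Build the URL-encoded form body for submitting a blastn search."""
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastn",   # nucleotide query against nucleotide database
        "DATABASE": database,  # assumption: the general nucleotide collection
        "QUERY": fasta_text,
    }
    if megablast:
        params["MEGABLAST"] = "on"
    return urlencode(params)


body = build_put_request(">Seq2\nATGGGAAAGGAGAAGACC")
print(BLAST_URL)
print(body)
```

A real client would POST this body, read the request ID (RID) from the response, and then poll with `CMD=Get` until the results are ready, while respecting NCBI's usage guidelines.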
Comparing Sequences
Now let's compare the two translation elongation factor 1-alpha sequences from the fungus and the cattle to find the similarities.
Go back to the Nucleotide BLAST (BLASTn) homepage and enter the first (fungus) sequence in the Enter Query Sequence box.
>Seq1
ATGGGTAAGGAGGACAAGACTCACCTTAACGTCGTCGTCATCGGCCACGTCGACTCTGGCAAGTCGACCACTGTAAGTACAACCAACAGCGGGTTGCTTATCTGCACTCGGAATCCGCCAAACCTGGCAGGGTATCACCAAAACATCTTGCTAACTTTTGACAGACCGGTCACTTGATCTACCAGTGCGGTGGTATCGACAAGCGAACCATCGAGAAGTTCGAGAAGGTTAGTCAATATCCCTTCGATTACGCGCGCTCCCATCGATTCCCACGATTCGCTCCCTCACTCGAAACACATCCATTACCCCGCTCGAGTCCGAAAATTTTGCGGTGCGACCGTGATTTTTTCTGGTGGGGTATCTTACCCCGCCACTCGAGTCACGGATGCGCTTGCCCTGTTCCCACAAAACCTTACCACCCTGTCGCGCACTACATGTCTTGCAGTCACTAACCACTGGACAATAGGAAGCCGCCGAGCTCGGAAAGGGTTCCTTCAAGTACGCCTGGGTTCTTGACAAGCTCAAAGCCGAGCGTGAGCGTGGTATCACCATTGATATCGCTCTCTGGAAGTTCGAGACTCCTCGCTACTATGTCACCGTCATTGGTATGTTGTCACCGTCTCACACTATCATGTATTCATCATGCTAACATCTCTCTCAGATGCCCCCGGTCATCGTGATTTCATCAAGAACATGATC
Select Align two or more sequences.
Enter the second (cattle) sequence in the Enter Subject Sequence box.
>Seq2
ATGGGAAAGGAGAAGACCCACATCAACATCGTTGTCATTGGGCACGTAGATTCAGGGAAGTCTACCACGACTGGCCATCTGATCTATAAATGTGGCGGGATCGACAAGAGAACAATTGAAAAGTTCGAGAAGGAGGCTGCCGAGATGGGAAAGGGCTCCTTCAAATATGCCTGGGTCTTGGACAAACTTAAAGCTGAACGTGAGCGTGGTATCACCATTGATATCTCCCTGTGGAAATTTGAGACCAGCAAGTACTATGTTACCATCATTGATGCCCCAGGACACAGAGACTTCATCAAAAACATGATTACAGGCACATCCCAGGCTGACTGTGCTGTCCTGATCGTTGCTGCTGGTGTTGGTGAATTTGAAGCCGGTATCTCCAAGAACGGGCAGACCCGTGAGCATGCCCTTTTGGCTTACACCCTGGGTGTGAAACAACTAATTGTTGGCGTTAACAAAATGGATTCCACTGAGCCACCCTATAGCCAGAAGAGATACGAAGAAATTGTTAAGGAAGTCAGCACCTATATTAAGAAAATTGGCTACAACCCCGACACAGTAGCATTTGTGCCAATTTCTGGCTGGAATGGTGACAACATGCTAGAACCAAGTGCTAATATGCCATGGTTCAAGGGATGGAAAGTCACCCGTAAGGACGGCAATGCCAGTGGAACCACCCTGCTTGAAGCTCTGGATTGCATTCTGCCACCAACTCGCCCAACTGACAAACCCTTGCGTTTGCCTCTCCAGGATGTCTATAAAATTGGTGGTATTGGTACTGTCCCTGTGGGTCGTGTGGAGACTGGTGTTCTCAAACCTGGCATGGTGGTCACCTTTGCTCCAGTCAATGTAACAACTGAAGTGAAGTCTGTAGAAATGCACCATGAAGCATTGAGTGAAGCCCTTCCTGGGGACAATGTGGGCTTTAATGTCAAAAACGTGTCTGTCAAAGATGTCCGTCGTGGCAATGTGGCTGGTGACAGCAAAAATGATCCACCCATGGAAGCTGCTGGCTTCACAGCTCAGGTGATTATTTTGAACCATCCAGGCCAAATCAGTGCTGGATATGCACCTGTGCTGGATTGTCACACAGCTCACATTGCTTGCAAGTTTGCTGAGCTGAAGGAGAAGATTGATCGTCGTTCTGGGAAAAAGCTGGAAGATGGCCCTAAATTCTTGAAATCTGGTGACGCTGCCATCGTTGATATGGTTCCTGGCAAGCCCATGTGTGTCGAGAGCTTCTCTGATTATCCTCCCCTGGGCCGTTTTGCTGTGCGTGACATGAGACAGACAGTCGCTGTGGGTGTCATCAAAGCAGTGGACAAGAAGGCAGCTGGAGCTGGCAAGGTCACCAAGTCTGCCCAGAAAGCTCAGAAGGCTAAATGA
Note the default setting for Program Selection is Highly similar sequences (megablast).
Leave this setting at megablast and click the BLAST button at the bottom.
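If you have the standalone BLAST+ tools installed, a two-sequence comparison like this one can be expressed with the `blastn` program, using `-subject` in place of a database and `-task` in place of the Program Selection choice. The sketch below only assembles the command; the function name and the file names `seq1.fasta` and `seq2.fasta` are hypothetical, and tabular output (`-outfmt 6`) is an arbitrary choice.

```python
def blastn_two_sequences_command(query_path, subject_path, task="megablast"):
    """Return the argument list for comparing two FASTA files with blastn."""
    allowed_tasks = {"megablast", "dc-megablast", "blastn"}
    if task not in allowed_tasks:
        raise ValueError(f"unsupported task: {task}")
    return [
        "blastn",
        "-query", query_path,      # the fungus sequence in this exercise
        "-subject", subject_path,  # the cattle sequence in this exercise
        "-task", task,             # megablast is the default setting on the web form
        "-outfmt", "6",            # tabular output, one line per aligned range
    ]


cmd = blastn_two_sequences_command("seq1.fasta", "seq2.fasta")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run` once the two FASTA files exist on disk.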
You might be disappointed to see that no significant similarity between these two sequences was found.
Megablast is tuned for highly similar sequences, so when this happens, you can try a more sensitive algorithm.
Use the Edit Search button at the top to return to the BLASTn screen.
This time, under Program Selection, choose the option for More dissimilar sequences (discontiguous megablast).
Then try BLAST again.
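One way to see why the second setting succeeds is to look at seed length. Megablast starts alignments from long exact matches (its default word size is 28), while discontiguous megablast uses shorter words (11 or 12) and tolerates mismatches within the seed. The sketch below is a simplification: it counts only contiguous exact matches and does not model the discontiguous templates. The two excerpts are 48-base stretches copied from Seq1 and Seq2; the function name is hypothetical.

```python
def shared_kmers(seq_a, seq_b, k):
    """Return the set of length-k substrings that occur in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b


# Corresponding 48-base excerpts from Seq1 (fungus) and Seq2 (cattle).
FUNGUS_EXCERPT = "AAAGCCGAGCGTGAGCGTGGTATCACCATTGATATCGCTCTCTGGAAG"
CATTLE_EXCERPT = "AAAGCTGAACGTGAGCGTGGTATCACCATTGATATCTCCCTGTGGAAA"

for k in (28, 11):
    hits = shared_kmers(FUNGUS_EXCERPT, CATTLE_EXCERPT, k)
    print(f"word size {k}: {len(hits)} shared exact words")
```

In these excerpts the longest exact match is 27 bases, one short of a 28-base word, so no 28-base seed exists even though many 11-base words are shared.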
Once you are on the results screen, click the Alignments tab to see how these sequences match up.
To see the alignments more easily and view the coding differences, set Alignment view to "Pairwise with dots for identities" and check the CDS feature box.
The results show that the two sequences align in three different places, or ranges. The three ranges are sorted by E value, so the most statistically significant alignment (the one with the lowest E value) is listed first.
You can study these sequences to see the similarities and differences.
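The "dots for identities" display is easy to reproduce for a pair of already-aligned strings, which can make the idea clearer. This is a minimal sketch with hypothetical function names; it assumes the two strings are the same length and already aligned (with `-` for gaps), and it does not perform the alignment itself. The example uses a 14-base stretch from Seq1 and the corresponding stretch from Seq2.

```python
def dots_for_identities(query, subject):
    """Return the subject row with '.' wherever it matches the query."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        "." if q == s and q != "-" else s
        for q, s in zip(query, subject)
    )


def percent_identity(query, subject):
    """Return the percentage of aligned columns holding identical residues."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(q == s and q != "-" for q, s in zip(query, subject))
    return 100.0 * matches / len(query)


query = "GAGCGTGAGCGTGG"    # from Seq1 (fungus)
subject = "GAACGTGAGCGTGG"  # from Seq2 (cattle)

print("Query  ", query)
print("Sbjct  ", dots_for_identities(query, subject))
print(f"Identity: {percent_identity(query, subject):.1f}%")
```

Only the differing bases remain visible in the subject row, which is what makes substitutions easy to spot in the BLAST display.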
Conclusion
Congratulations! You've completed this exercise on identifying and comparing sequences using NCBI BLAST.