Some Relevant NCBI Resources for PAG 33 Attendees
The NIH Comparative Genomics Resource (CGR)
The National Institutes of Health (NIH) Comparative Genomics Resource (CGR) is a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research.

Comparative Genome Viewer (CGV)
The Comparative Genome Viewer (CGV) tool allows you to compare two genomes based on assembly-assembly alignments provided by NCBI.

Eukaryotic Genome Annotation Pipeline Application (EGAPx)
This downloadable application uses alignment evidence from RNA-seq and protein data to generate structural gene predictions using Gnomon, and functional annotations are subsequently added to gene predictions using orthology information. Currently, EGAPx can annotate genomes from vertebrates, arthropods, and some plants.

Foreign Contamination Screen (FCS)
The NCBI Foreign Contamination Screen (FCS) is a tool suite for identifying and removing contaminant sequences in genome assemblies. Contaminants are defined as sequences in a dataset that do not originate from the biological source organism and can arise from a variety of environmental and laboratory sources. FCS will help you remove contaminants from genomes before submission to GenBank.

Multiple Comparative Genome Viewer (MCGV)
The Multiple Comparative Genome Viewer (MCGV) tool is a graphical display that allows you to visualize and explore alignments of multiple whole eukaryotic genomes.

NCBI Datasets
NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases on the website, with command-line tools, or through APIs. Find and download gene, transcript, protein, and genome sequences, along with annotation and metadata.
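
As a rough illustration of the command-line route, the Python sketch below assembles a call to the `datasets` command-line tool for downloading a genome by assembly accession. The helper names `build_datasets_command` and `download_genome`, the example accession, and the output filename are assumptions made for this sketch, not part of the resource description above; check the NCBI Datasets documentation for the options your installed version supports.

```python
import shutil
import subprocess


def build_datasets_command(accession, include=("genome", "gff3"),
                           filename="ncbi_dataset.zip"):
    """Build the argument list for `datasets download genome accession`.

    `include` lists the file types to request (here genome FASTA and GFF3
    annotation); `filename` is the name of the zip archive to write.
    """
    return [
        "datasets", "download", "genome", "accession", accession,
        "--include", ",".join(include),
        "--filename", filename,
    ]


def download_genome(accession, **kwargs):
    """Run the download if the `datasets` CLI is installed (hypothetical helper)."""
    if shutil.which("datasets") is None:
        raise RuntimeError("The NCBI `datasets` command-line tool was not found on PATH.")
    subprocess.run(build_datasets_command(accession, **kwargs), check=True)


# Example accession (assumed for illustration): a human reference assembly.
command = build_datasets_command("GCF_000001405.40")
print(" ".join(command))
```

Building the argument list separately from running it makes it easy to inspect the command before any data is transferred; calling `download_genome("GCF_000001405.40")` would then perform the actual download.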

GenBank
GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA Data Bank of Japan (DDBJ), the European Nucleotide Archive (ENA), and GenBank at NCBI. These three organizations exchange data on a daily basis.

PubChem
PubChem is an open chemistry database at the National Institutes of Health (NIH), where you can deposit your scientific data for others to use. Since its launch in 2004, PubChem has become a key chemical information resource for scientists, students, and the general public. Its website and programmatic services provide data to millions of users worldwide.

Sequence Read Archive (SRA)
The Sequence Read Archive (SRA), available through multiple cloud providers and NCBI servers, is the largest publicly available repository of high-throughput sequencing data. The archive accepts data from all branches of life as well as metagenomic and environmental surveys. SRA stores raw sequencing data and alignment information (if submitted) to enhance reproducibility and facilitate new discoveries through data analysis.

Last Reviewed: December 3, 2025