Research Tips for Genetic Disorder & Variant Mapping


Available Resources at NCBI  

If you are looking for information, here are some good places to start learning about key topics.

Here are some helpful database hubs for human variation research:
  • MedGen - an aggregate database of information, with links to further resources, about gene-associated human disorders and phenotypes.
  • Genetic Testing Registry (GTR) - a source for clinical and research genetic tests with information provided by the laboratories.
  • ClinVar - a genetic variation resource collating clinically relevant information submitted by research and clinical labs, expert clinical panels, and some key genetic disease literature resources.
  • dbSNP - a registry of short human genetic variants with biological information and population data.
  • NCBI Gene - an aggregate database of information, with links to further data, about genes.
  • NCBI Structure - a source for curated 3D biomolecular structure information based on submissions to the Protein DataBank (PDB).

Here are some good tools for mapping variants onto biomolecular structures:
  • BLAST - a sequence similarity search tool that can be run against genome or reference sequence databases
  • Genome Data Viewer (GDV) - a full-service interactive genome sequence and annotation browser
  • On NCBI Nucleotide or NCBI Protein database record:
    • Graphical Sequence Viewer - an interactive sequence browser display available by clicking on "Graphics"
    • Pre-calculated Conserved Domain (CD) View - an interactive graphical display of CD-Search results available by clicking on "Identify Conserved Domains"
  • iCn3D - a web-based 3D structure viewer accessible on NCBI Structure record pages or as a stand-alone tool.

Variant nomenclature

Over the years, researchers have adopted many different ways to name the genetic variants they study. Here are some examples of names used in the published literature for exactly the same genetic variant:
      • Factor V Leiden variant
      • F5 Arg534Gln
      • FV R506Q
      • NC_000001.10:g.169519049=
      • NC_000001.11:g.169549811C>T
      • rs6025
      • OMIM: 612309.0001
A standard notation for these variants, the Human Genome Variation Society (HGVS) nomenclature, is now widely used by research and clinical labs. This notation anchors the location of a variant to a specific, discrete sequence record.

The general form is:
Accession.version(gene symbol):molecular type-abbreviation.
followed by a structured statement giving the variant location, the wild-type residue, and the variant impact (such as the variant residue).

For example:

NG_011806.1(F5):g.41721G>A 
or
NP_000121.2(F5):p.Arg534Gln
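
As an illustration of the format above, here is a minimal Python sketch that splits an HGVS-style expression into its parts. The function name `split_hgvs`, the `HgvsParts` type, and the regular expression are assumptions made for this example. The pattern only covers the simple shapes shown on this page (a two-letter RefSeq-style prefix, an optional gene symbol, and the g., c., or p. types), not the full HGVS nomenclature.

```python
import re
from typing import NamedTuple, Optional


class HgvsParts(NamedTuple):
    """Pieces of a simple HGVS-style expression (hypothetical helper type)."""
    accession: str
    version: int
    gene_symbol: Optional[str]
    molecule_type: str
    description: str


# Assumed, simplified pattern: Accession.version(gene symbol):type.description
_HGVS_PATTERN = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+)\.(?P<version>\d+)"  # e.g. NG_011806.1
    r"(?:\((?P<gene>[A-Za-z0-9-]+)\))?"                # optional (F5)
    r":(?P<mol>[gcp])\.(?P<desc>.+)$"                  # e.g. :g.41721G>A
)


def split_hgvs(expression: str) -> HgvsParts:
    """Split an expression into accession, version, gene, type, and description."""
    match = _HGVS_PATTERN.match(expression.strip())
    if match is None:
        raise ValueError(f"not a recognised HGVS-style expression: {expression!r}")
    return HgvsParts(
        accession=match["accession"],
        version=int(match["version"]),
        gene_symbol=match["gene"],  # None when no gene symbol is given
        molecule_type=match["mol"],
        description=match["desc"],
    )


if __name__ == "__main__":
    print(split_hgvs("NG_011806.1(F5):g.41721G>A"))
    print(split_hgvs("NP_000121.2(F5):p.Arg534Gln"))
```

The sketch keeps the description (for example `41721G>A` or `Arg534Gln`) as an uninterpreted string, because its internal structure differs between genomic, coding, and protein expressions.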

 
For Accession.version, a particular set of sequences and their accessions is often used to anchor these variants reliably and sustainably.  The NCBI RefSeq Project collects known nucleotide and protein sequences and uses them, together with literature information, to create a non-redundant set of reference sequences that can be recognized by their accession prefixes.  For humans, these are the most commonly found record types and their accession prefixes:
    • Chromosome:  NC_
    • Gene/Gene region:  NG_
    • Transcript
      • Protein coding:  NM_ (with strong evidence)  or XM_ (predicted)
      • Non-protein coding:  NR_ (with strong evidence)  or XR_ (predicted)
    • Protein with sequence translated from the transcript:  NP_ (with strong evidence)  or XP_ (predicted)
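
The prefix list above can be expressed as a small lookup table. The following Python sketch is only an illustration: the names `REFSEQ_PREFIXES` and `describe_accession` are hypothetical, and the table contains only the prefixes listed on this page, not every prefix RefSeq uses.

```python
# Record type and evidence level for each prefix listed above.
# Chromosome and gene-region records carry no evidence label in that list.
REFSEQ_PREFIXES = {
    "NC_": ("chromosome", None),
    "NG_": ("gene or gene region", None),
    "NM_": ("protein-coding transcript", "strong evidence"),
    "XM_": ("protein-coding transcript", "predicted"),
    "NR_": ("non-protein-coding transcript", "strong evidence"),
    "XR_": ("non-protein-coding transcript", "predicted"),
    "NP_": ("protein", "strong evidence"),
    "XP_": ("protein", "predicted"),
}


def describe_accession(accession: str) -> str:
    """Return a short description of a RefSeq accession based on its prefix."""
    prefix = accession[:3].upper()
    if prefix not in REFSEQ_PREFIXES:
        raise ValueError(f"prefix {prefix!r} is not in the list above")
    record_type, support = REFSEQ_PREFIXES[prefix]
    return record_type if support is None else f"{record_type} ({support})"


if __name__ == "__main__":
    for acc in ("NC_000001.11", "NG_011806.1", "NP_000121.2"):
        print(acc, "->", describe_accession(acc))
```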

Based on usage in the community, the inclusion of a gene symbol, though recommended, appears to be optional. However, if included, it should be the official gene symbol as designated by the HUGO Gene Nomenclature Committee (HGNC).

For molecular type-abbreviation, at NCBI you will often see:
    • “g.” for a linear genomic reference sequence
    • “c.” for a coding DNA reference sequence
    • “p.” for a protein reference sequence
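
One practical use of these abbreviations is a quick sanity check that the molecular type matches the kind of record named by the accession. The Python sketch below assumes a simplified pairing (g. with NC_ or NG_, c. with NM_ or XM_, p. with NP_ or XP_). This pairing, and the names `MOLECULE_TYPES`, `EXPECTED_PREFIXES`, and `check_molecule_type`, are assumptions for illustration and are not a statement of the full HGVS rules.

```python
# Abbreviations as listed above.
MOLECULE_TYPES = {
    "g": "linear genomic reference sequence",
    "c": "coding DNA reference sequence",
    "p": "protein reference sequence",
}

# Assumed, simplified pairing of type abbreviation to RefSeq prefixes.
EXPECTED_PREFIXES = {
    "g": {"NC_", "NG_"},
    "c": {"NM_", "XM_"},
    "p": {"NP_", "XP_"},
}


def check_molecule_type(accession: str, abbreviation: str) -> bool:
    """Return True if the abbreviation fits the accession prefix under the assumed pairing."""
    if abbreviation not in MOLECULE_TYPES:
        raise ValueError(f"unknown molecular type abbreviation: {abbreviation!r}")
    return accession[:3].upper() in EXPECTED_PREFIXES[abbreviation]


if __name__ == "__main__":
    print(check_molecule_type("NG_011806.1", "g"))  # genomic record with g.
    print(check_molecule_type("NP_000121.2", "g"))  # protein record with g.
```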

For examples of HGVS usage, take a look below in the overview of how these fit in with biomolecules throughout the central dogma of molecular biology.


Image: Clinical Toolkit - Genetics Add-on

If you are interested in learning more about implementing genetics in clinical practice, or in information suited to a Genetics Add-on for a Clinician's Toolkit, click the graphic to go to the materials created for a clinically focused workshop.

 

A quick primer on the central dogma & how genetic variants can impact molecular biology


Image summarizing the steps of possible variant impact, depending upon location.


Let's put this all together in a general workflow that you can use


Image of helpful resources & flow for your research

Last Reviewed: May 20, 2023