On July 21, 2022, the NCBI Education Team presented a workshop, "An Introduction to Molecular Evolutionary Analysis with NCBI Datasets and Python." Training materials from this event are available on this page.
As species diverge from a common ancestor, they accumulate differences in their DNA sequences. Differences within a protein-coding region fall into two classes. Non-synonymous substitutions change the amino acid sequence of the protein, while synonymous substitutions do not. Synonymous substitutions are largely invisible to natural selection and tend to accumulate at a roughly constant rate. Non-synonymous substitutions, on the other hand, accumulate faster when their effects are beneficial and are suppressed when their effects are deleterious. By comparing the rates of non-synonymous and synonymous substitutions, we can infer whether natural selection has primarily acted to conserve the protein sequence or to adapt it to a new environment or function: a non-synonymous rate well below the synonymous rate points to conservation, while a non-synonymous rate above it points to adaptation.
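To make the two classes concrete, here is a minimal sketch that compares two aligned, in-frame coding sequences codon by codon and counts how many differing codons are synonymous and how many are non-synonymous. The function name `classify_codon_differences` and the example sequences are hypothetical, and the sketch assumes the standard genetic code and an ungapped alignment. It is not the workshop notebook's code, and it only counts differing codons rather than estimating rates.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(seq_a, seq_b):
    """Return (synonymous, non_synonymous) counts of differing codons.

    Assumes both sequences are aligned, in frame, and of equal length.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")

    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no substitution
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # skip gaps and ambiguous bases
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# Hypothetical example: GCT -> GCC keeps alanine, AAA -> AGA changes Lys to Arg.
print(classify_codon_differences("ATGGCTAAAGAA", "ATGGCCAGAGAA"))
```

Raw counts like these are not yet rates: a coding sequence has far more positions where a change would be non-synonymous than positions where it would be synonymous, so each count has to be normalized before the two are compared.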
In this workshop you will learn to compare the protein-coding sequences of two species to estimate which proteins show signs of adaptation. Working in a Jupyter notebook with bash and Python, you will use the NCBI Datasets command line interface (CLI) to download sequence data, then perform analysis with a few popular Python packages. The workshop assumes basic familiarity with a scripting language such as Python or R at a level equivalent to a semester course or programming bootcamp.
In this workshop you will learn how to:
- Search for and download protein ortholog sequences with the NCBI Datasets CLI.
- Parse the downloaded files with Biopython.
- Identify synonymous and non-synonymous substitutions and calculate substitution rates.
- Plot the results with Matplotlib.
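Calculating substitution rates requires knowing how many positions in a sequence could produce each kind of change. The sketch below counts synonymous and non-synonymous sites in the style of the Nei-Gojobori method: for each codon position, the fraction of the three possible single-base changes that leave the amino acid unchanged is counted as synonymous. The function names `synonymous_sites` and `count_sites` are hypothetical, changes to a stop codon are treated as non-synonymous for simplicity, and the workshop notebook may use a different method.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_sites(codon):
    """Expected number of synonymous sites (0 to 3) in a single codon."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            # Each of the 3 possible changes at a position contributes 1/3 of a site.
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                total += 1 / 3
    return total


def count_sites(cds):
    """Return (synonymous_sites, non_synonymous_sites) for a coding sequence."""
    cds = cds.upper()
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable_length, 3)]
    # Skip stop codons and codons containing gaps or ambiguous bases.
    codons = [c for c in codons if c in CODON_TABLE and CODON_TABLE[c] != "*"]
    syn = sum(synonymous_sites(c) for c in codons)
    return syn, 3 * len(codons) - syn


syn, non_syn = count_sites("ATGTTTGGT")
print(f"synonymous sites: {syn:.2f}, non-synonymous sites: {non_syn:.2f}")
```

Dividing the number of observed synonymous differences by the synonymous site count, and likewise for non-synonymous differences, gives the proportions that are then compared between the two classes.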
NOTE: This workshop is designed for people with a good understanding of molecular biology and some scripting experience.
Webinar Materials
Jupyter Notebook
(Note: It may take a couple of minutes to load this page the first time.)
Last Reviewed: August 23, 2022