

Sequence Similarity Introduction

This lesson and exercise define similarity in both a non-biological and a biological sense, quantify the similarity between two sequences, and explain how a substitution matrix is used to quantify similarity. Students calculate amino acid similarity scores using the BLOSUM62 substitution matrix, then learn how BLAST detects similarity between two sequences, how to use BLAST, and how to interpret the resulting alignments.
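As a rough illustration of the kind of calculation the lesson covers, the sketch below scores two short, ungapped peptides of equal length by summing substitution-matrix values position by position. It is not taken from the lesson itself: the function names and example peptides are hypothetical, and the dictionary holds only a small hand-copied excerpt of BLOSUM62, so check values against the full matrix before reusing them.

```python
# Small excerpt of the BLOSUM62 substitution matrix (assumption: only the
# pairs needed for this example are included; the full matrix is 20 x 20).
BLOSUM62_EXCERPT = {
    ("L", "L"): 4, ("I", "I"): 4, ("K", "K"): 5, ("E", "E"): 5,
    ("L", "I"): 2, ("I", "V"): 3, ("K", "R"): 2, ("E", "D"): 2,
}


def pair_score(a, b):
    """Look up the score for one residue pair; the matrix is symmetric."""
    key = (a, b) if (a, b) in BLOSUM62_EXCERPT else (b, a)
    return BLOSUM62_EXCERPT[key]  # raises KeyError if the pair is not in the excerpt


def similarity_score(seq1, seq2):
    """Sum the pairwise scores of two ungapped sequences of equal length."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


def percent_identity(seq1, seq2):
    """Percentage of positions where both sequences have the same residue."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return 100 * matches / len(seq1)


# Hypothetical peptides: no identical positions, yet a positive similarity score.
print(similarity_score("LIKE", "IVRD"))  # 2 + 3 + 2 + 2 = 9
print(percent_identity("LIKE", "IVRD"))  # 0.0
```

The example shows why similarity and identity differ: the two peptides share no identical residues, but every substitution is a conservative one, so the summed score is still positive.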

Pathways Project Primer

This PowerPoint presentation provides a primer on the recommended annotation strategy for the Pathways Project. It gives an overview of the goals of the Pathways Project annotations, an introduction to RNA-Seq and web databases, and a discussion of the phases of splice donor and acceptor sites.

Synteny Introduction Slides

This resource is a slide deck offering an expanded introduction to synteny. Instructors are encouraged to use and modify it to fit the needs of specific courses and students.

Introduction to R and RStudio

This series of modules introduces students to the statistical platform R using the RStudio integrated development environment. Both programs can be downloaded for free. Once they have downloaded and installed both according to Module 0, students should watch the accompanying video for an introduction to the new environment. Module 1 presents an exercise in which students work with genomic sequence alignment data to learn how to construct R commands while computing basic summary statistics and making basic plots.

Genomic Neighborhood Check For Understanding

This resource was created after a member reported that their students struggled with the genomic neighborhood, and that the member did not realize this until the students were too far into the annotation to correct their misconceptions. It is meant to be a quick in-class and/or homework assignment.

Pathways Project: Annotation Form

The “Annotation Form” merges the “Annotation Report” and “Annotation Notebook” into a single document; the latter two items have been archived.

The “Annotation Form” keeps many of the Check for Understanding questions that were previously in the “Annotation Notebook.” However, because not all faculty want their students to answer those questions, we marked every question that is NOT required for submission to the GEP as “OPTIONAL.” The optional questions are currently labeled with letters rather than numbers, which we hope will make it easy to quickly select and delete the OPTIONAL questions that faculty don’t wish to include. All numbered questions are required for submission to the GEP for reconciliation.

Items that were previously asked for in the “Project Details” table of the “Annotation Report” are now available on the Genome Browser Gateway page for each species. Therefore, the “Project Details Table” instructions document is no longer needed.

4. Characteristics of the F Element

This lecture uses the themes from Slide Sets 1-3 of the “F Element Project: Annotated Lecture Slides” sequence to describe what we have learned about the F element, combining wet-bench work in the Elgin lab, results of chromatin mapping by the modENCODE consortium, and the bioinformatics efforts of the faculty and students of the GEP. This includes the following:

  • characterization of the F element as a heterochromatic domain, high in repeated DNA but nonetheless having genes expressed at normal levels (7 slides);
  • mapping the chromatin state in relation to the genes and to reporter transgenes inserted into the F element (9 slides);
  • exploring the TSS and searching for unique factors or motifs associated with the TSSs of F element genes (6 slides);
  • introducing the “expanded F” project, describing some of the findings from examining the F element of D. ananassae, and summarizing the ongoing challenges and questions (9 slides).