Resources & Tools
One of the unusual features of eukaryotic genomes is the discordance between genome size and the complexity of the organism (i.e., the C-value paradox; Eddy, 2012). The smallest chromosome in Drosophila melanogaster is chromosome 4 (also known as the Muller F element), with an estimated size of ~5.2 Mb (Locke and McDermid, 1993). The D. melanogaster F element is generally packaged as heterochromatin: it has a high repeat content, is packaged throughout with HP1a and H3K9me2/3, and exhibits late replication and little or no recombination. However, the banded portion (~1.4 Mb) of this chromosome also contains ~80 protein-coding genes. These F element genes exhibit a range of expression levels similar to that of genes residing in euchromatic domains, indicating that F element genes have acquired distinct features that enable them to function in a heterochromatic environment (reviewed in Riddle and Elgin, 2018).
While the F element has maintained a similar size in many other Drosophila species, it is substantially larger in at least four Drosophila species (i.e., D. ananassae, D. bipectinata, D. kikkawai, and D. takahashii). For example, the D. ananassae Muller F element is more than 18.7 Mb in size. This study will examine the factors (e.g., transposon density) that have contributed to the expansion of the F element in these four Drosophila species, and assess the impact of this expansion on gene characteristics (e.g., codon bias, intron size).
GEP students will produce coding region and transcription start site annotations for F element genes in D. ananassae, D. bipectinata, D. kikkawai, and D. takahashii, as well as for genes in a euchromatic reference region derived from the Muller D element. (Euchromatic regions have not expanded in these species.) Comparative analyses using these datasets will provide insights into the evolutionary impacts of changes in chromosome and gene size, and will facilitate the identification of factors that enable genes to function in a heterochromatic environment. We anticipate that this work will move us toward a better understanding of how and why eukaryotic genomes became so large (mammalian genomes, for example, are ~1000X larger than the E. coli genome).
Using comparative genomics to assess the evolutionary impact of Drosophila F element expansion on chromosome and gene characteristics. (Top) Past studies using a transgene reporter with the white gene driven by an hsp70 promoter show that the Drosophila melanogaster Muller F element is mostly heterochromatic, even though the region contains ~80 protein-coding genes. (Bottom left) The D. ananassae F element has substantially higher transposon density compared to the D. melanogaster F element. The high density of LTR and LINE retrotransposons is one of the major contributors to the expansion of the D. ananassae F element (>18.7 Mb) compared to the D. melanogaster F element (>1.4 Mb). (Bottom right) In addition to D. ananassae, the D. bipectinata, D. kikkawai, and D. takahashii F elements are also larger than the D. melanogaster F element. GEP students will annotate genes on the F element and on a euchromatic reference region from the D element for these four Drosophila species.
Image credits: (Top) Karmella Haynes; (Bottom Left) Leung et al., 2017; (Bottom Right) Phylogenetic tree produced by Thom Kaufman as part of the modENCODE project.
This is a PowerPoint presentation describing the recommended strategies for annotating a D. virilis fosmid. The homology-based annotation strategy should also be applicable to annotation of D. erecta and D. mojavensis projects.
This document is a more in-depth description of the evidence-based annotation technique used by the GEP. It is designed to complement and extend the basic technique described in the Annotation for D. virilis PowerPoint.
This PowerPoint presentation describes the recommended annotation strategy for Drosophila projects. The presentation provides an overview of the goals of the GEP annotation project, an introduction to NCBI BLAST, web databases, and the issue of reading frames and phase.
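The phase bookkeeping mentioned above can be illustrated with a short sketch. It is not taken from the presentation. It assumes the convention that the donor phase is the number of bases left over after the last complete codon of an exon, and that the acceptor phase is the number of bases the next exon must contribute to finish that codon, so the two sum to 0 or 3. The function name `splice_phases` and the exon lengths are hypothetical.

```python
def splice_phases(cds_lengths):
    """Return (acceptor_phase, donor_phase) for each coding exon.

    cds_lengths: coding-sequence lengths of consecutive exons, 5' to 3',
    where the first exon starts at the first base of the start codon.
    """
    phases = []
    acceptor = 0  # the first coding exon begins on a codon boundary
    for length in cds_lengths:
        # Bases left in an incomplete codon at the 3' end of this exon.
        donor = (length - acceptor) % 3
        phases.append((acceptor, donor))
        # The next exon must supply the bases that complete the split codon.
        acceptor = (3 - donor) % 3
    return phases


# Hypothetical three-exon gene whose coding lengths total 180 bases.
example = splice_phases([100, 53, 27])
print(example)
```

A donor phase of 0 on the final exon indicates that the coding sequence ends on a complete codon, which is one quick consistency check on a gene model.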
This PowerPoint presentation provides a brief primer on the recommended annotation strategy for Drosophila projects. The presentation provides an overview of the goals of the GEP annotation project, an introduction to RNA-Seq, web databases, and a discussion on the phases of the splice donor and
This walkthrough uses the annotation of a gene on the D. biarmipes Muller F element to illustrate the GEP comparative annotation strategy. This document shows how you can investigate a feature in an annotation project using FlyBase, the Gene Record Finder, and the gene prediction
This walkthrough uses FlyBase, FlyFactorSurvey, and Patser to identify transcription factor binding sites in the region surrounding the transcription start site of onecut in D. biarmipes.
This walkthrough illustrates the GEP protocol for the comparative annotation of transcription start sites (TSS) in D. biarmipes. The walkthrough also includes a sample GEP TSS Report for the TSS annotation of onecut.
This document illustrates how the strategies outlined in the Annotation Instruction Sheet can be applied to more challenging annotation cases.
Developed by Dr. Jeremy Buhler, this exercise uses MEME to discover putative regulatory motifs in a collection of D. melanogaster promoter sequences. It also illustrates some of the challenges associated with motif finding and the limitations of motif finding programs.
This exercise continues your introduction to practical issues in comparative annotation. You will be annotating genomic sequence from the dot chromosome of Drosophila mojavensis using your knowledge of BLAST and some improved visualization tools. You will also consider how best to integrate information from high-throughput
This PowerPoint presentation describes the common errors observed in student annotations.
An introductory exercise using BLAST to annotate a region in the Drosophila melanogaster genome. Students can use this exercise to gain familiarity with performing BLAST searches and interpreting BLAST output. An answer key is provided for instructors.
This document describes the primary annotation goals to be included in the final oral presentation and written report for students enrolled in the Bio 4342 course at WU.
This document is the revised annotation report that GEP students will use to report their annotation results to the GEP. Available versions: Complete Annotation Report; TSS Report.
This workflow provides an overview of the key analysis steps and bioinformatics tools for the annotation of a predicted gene in the Drosophila F element GEP project.
Developed by Dr. Nick Reeves at Mt. San Jacinto College, Menifee Valley Campus, this PowerPoint presentation provides a brief overview of the Digital Lab Notebook, which provides detailed guidance to students on the GEP annotation strategy.
Developed by Dr. Nick Reeves, this PowerPoint presentation describes the implementation of the GEP curriculum materials at Mt. San Jacinto Community College.
This decision tree illustrates the list of criteria that can be used to determine the putative D. melanogaster ortholog of a predicted gene.
This document contains the notes from a lecture on motif finding given by Dr. Jeremy Buhler in the Bio 4342 course at WU. The lecture covers the different approaches used to represent sequence motifs and to search for sequence motifs in a genome.
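One common way to represent a motif and search for it is a position weight matrix scanned along a sequence. The sketch below is a generic illustration under stated assumptions, not a reproduction of the lecture notes: it assumes a uniform background (0.25 per base), a pseudocount of 1, sequences containing only A, C, G, and T, and a search of the given strand only. The function names and the example sites are hypothetical.

```python
import math


def log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a per-position log2-odds matrix from aligned, equal-length sites."""
    width = len(sites[0])
    matrix = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        total = len(column) + 4 * pseudocount
        matrix.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return matrix


def best_hit(matrix, sequence):
    """Return (score, start) of the highest-scoring window on the given strand."""
    width = len(matrix)
    scored = [
        (sum(matrix[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Hypothetical aligned binding sites and a short sequence to scan.
sites = ["TATAAA", "TATAAA", "TATATA", "TATAAA"]
pwm = log_odds_matrix(sites)
score, start = best_hit(pwm, "GGCGTATAAAGGC")
print(start, round(score, 2))
```

Real motif-scanning tools also consider the reverse strand and use a score threshold or significance estimate instead of reporting only the single best window.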
This walkthrough uses FlyBase RNA-Seq Search and the MEME suite to discover motifs that are enriched in a collection of D. melanogaster Muller F element genes that show similar expression patterns.
This PowerPoint presentation describes the recommended annotation strategy for identifying transcription start sites in Drosophila. The presentation provides an overview of promoter architecture in D. melanogaster and describes the types of evidence that can be used to support transcription start site annotations.
This worksheet will guide you through a series of basic steps that have been found to work well for annotation of species closely related to Drosophila melanogaster. It provides a technique that can also be the foundation of annotation in other, more divergent species.
This workflow provides an overview of the key steps and recommended search parameters for the annotation of transcription start sites.