
F Element Project

In this project, GEP students produce coding region and transcription start site annotations for F element genes in D. ananassae, D. bipectinata, D. kikkawai, and D. takahashii, as well as for genes in a euchromatic reference region derived from the Muller D element.

One of the unusual features of eukaryotic genomes is the discordance between genome size and the complexity of the organism (i.e., the C-value paradox; Eddy, 2012). The smallest chromosome in Drosophila melanogaster is chromosome 4 (also known as the Muller F element), with an estimated size of ~5.2 Mb (Locke and McDermid, 1993). The D. melanogaster F element is generally packaged as heterochromatin: it has a high repeat content, is associated throughout with HP1a and H3K9me2/3, and exhibits late replication and little or no recombination. However, the banded portion (~1.4 Mb) of this chromosome also contains ~80 protein-coding genes. These F element genes exhibit a range of expression levels similar to that of genes that reside in euchromatic domains, indicating that F element genes have acquired distinct features that enable them to function in a heterochromatic environment (reviewed in Riddle and Elgin, 2018).

While the F element has maintained a similar size in many other Drosophila species, it is substantially larger in at least four Drosophila species (i.e., D. ananassae, D. bipectinata, D. kikkawai, and D. takahashii). For example, the D. ananassae Muller F element is more than 18.7 Mb in size. This study will examine the factors (e.g., transposon density) that have contributed to the expansion of the F element in these four Drosophila species, and assess the impact of this expansion on gene characteristics (e.g., codon bias, intron size).

GEP students will produce coding region and transcription start site annotations for F element genes in D. ananassae, D. bipectinata, D. kikkawai, and D. takahashii, as well as for genes in a euchromatic reference region derived from the Muller D element. (Euchromatic regions have not expanded in these species.) Comparative analyses using these datasets will provide insights into the evolutionary impacts of changes in chromosome and gene size, and will facilitate the identification of factors that enable genes to function in a heterochromatic environment. We anticipate that this work will move us toward a better understanding of how and why eukaryotic genomes became so large (mammalian genomes are ~1000X larger than that of E. coli).

About

Introduction to the F Element Project provided by Dr. Sarah C.R. Elgin
(26 minutes)
Slideset
The same material is covered in more detail in the “Annotated Lecture Slides.”

In this lecture for the July 2022 GEP New Member Training, Professor Sarah C. R. Elgin from Washington University in St. Louis describes the research aims for the Drosophila F Element Expansion project. The lecture begins with a brief overview of the C-value paradox, the impact of transposable elements on eukaryotic genomes, and the packaging of DNA into chromatin. The lecture then presents the unusual characteristics of the Muller F Elements in D. melanogaster and in other Drosophila species based on past studies by the Elgin Lab, GEP faculty and students, and other researchers (e.g., the modENCODE project). The presentation concludes with the goals for the comparative analysis of four Drosophila species where the F Elements have undergone different levels of expansion, and a summary of the unusual features (e.g., expansion of coding spans, pseudogene clusters) that have been identified by GEP students as part of their annotations of the D. ananassae F Element.

Project Curriculum

Annotated Lecture Slides

These highly annotated PowerPoint slide sets (and one video) provide background information on 1) the C-value paradox and basic chromatin structure; 2) heterochromatin vs. euchromatin; 3) living with transposable elements; and 4) the unique features of the F Element. A single video presentation covering these topics (ca. 45 minutes) and accompanying slide set are also available for a quicker introduction. All of the annotated slide sets and video are available through the F Element Project: Annotated Lecture Slides page.

Gene Annotation: Constructing a Defendable Exon/Intron Gene Model

An Introduction to NCBI BLAST

This walkthrough serves as an introduction to key functionalities of NCBI BLAST.

Detecting and Interpreting Genetic Homology

An introductory exercise using BLAST to annotate a region in the Drosophila melanogaster genome. Students can use this exercise to gain familiarity with performing BLAST searches and interpreting BLAST output. An answer key is provided for instructors.

Introduction to BLAST using Human Leptin

Dr. Justin R. DiAngelo (Penn State Berks) and Dr. Alexis Nagengast (Widener University) have developed an exercise that introduces students to the basic functionality of the NCBI web site and NCBI BLAST. Students will use NCBI BLAST to identify the putative orthologs of the human Leptin gene in other species.

RNA-Seq: a Closer Look at Read Mapping

Developed by Jeremy Buhler, this PowerPoint presentation provides an introduction to the core algorithms that form the basis for efficient mapping of RNA-Seq reads against a genome or transcriptome. The video that accompanies this presentation was developed by Leocadia Paliulis (Bucknell University).

Browser-Based Annotation and RNA-Seq Data

This exercise continues your introduction to practical issues in comparative annotation. You will be annotating genomic sequence from the dot chromosome of Drosophila mojavensis using your knowledge of BLAST and some improved visualization tools. You will also consider how best to integrate information from high-throughput sequencing of expressed RNA.

Annotation of Drosophila Primer

This PowerPoint presentation provides a brief primer on the recommended annotation strategy for Drosophila projects. The presentation provides an overview of the goals of the GEP annotation project, an introduction to RNA-Seq and web databases, and a discussion of the phases of the splice donor and acceptor sites.

Annotation of a Drosophila Gene

This walkthrough uses the annotation of a gene on the D. biarmipes Muller F element to illustrate the GEP comparative annotation strategy. This document shows how you can investigate a feature in an annotation project using FlyBase, the Gene Record Finder, and the gene prediction and RNA-Seq evidence tracks on the GEP UCSC Genome Browser. The walkthrough then shows how you can identify the coordinates of each coding exon using NCBI BLAST, and also includes a discussion on the phases of the donor and acceptor splice sites. The walkthrough concludes by verifying the proposed gene model using the Gene Model Checker; it also includes a sample GEP Annotation Report.
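
The phase bookkeeping mentioned above can be illustrated with a small sketch. It assumes the usual convention that the donor phase is the number of coding bases left over after the last complete codon of an exon, and that the acceptor phase of the next exon must complete that codon (so the two phases sum to 0 or 3). The function name and the exon lengths are hypothetical and are not taken from the walkthrough.

```python
def splice_phases(cds_exon_lengths):
    """Return (donor_phase, required_acceptor_phase) for each intron.

    cds_exon_lengths: coding-exon lengths in bp, in transcript order,
    with the first exon assumed to begin at the start codon.
    """
    phases = []
    coding_bases = 0
    # The last exon ends at the stop codon, so it has no donor site.
    for length in cds_exon_lengths[:-1]:
        coding_bases += length
        donor = coding_bases % 3      # bases past the last complete codon
        acceptor = (3 - donor) % 3    # bases needed to finish that codon
        phases.append((donor, acceptor))
    return phases


# Hypothetical three-exon gene model (lengths in bp).
exon_lengths = [100, 50, 30]
for intron, (donor, acceptor) in enumerate(splice_phases(exon_lengths), start=1):
    print(f"intron {intron}: donor phase {donor}, acceptor phase {acceptor}")
```

A gene model whose adjacent donor and acceptor phases do not sum to 0 or 3 would shift the reading frame, which is one of the problems that a tool such as the Gene Model Checker is meant to catch.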

Simple Annotation Problem

This worksheet will guide you through a series of basic steps that have been found to work well for annotation of species closely related to Drosophila melanogaster. It provides a technique that can also be the foundation of annotation in other, more divergent species.

GEP Annotation Workflow

This workflow provides an overview of the key analysis steps and bioinformatics tools for the annotation of a predicted gene in the Drosophila F element GEP project.

Identify D. melanogaster Ortholog

This decision tree illustrates the list of criteria that can be used to determine the putative D. melanogaster ortholog of a predicted gene.

Using BLAST for Genomic Sequence Annotation

Similar to the Lecture Notes on Alignment, this is a PowerPoint presentation given by Dr. Jeremy Buhler for the GEP faculty and TA workshops. This presentation covers the basics of alignment, essential for students to correctly interpret BLAST results.

Annotation of Drosophila

This PowerPoint presentation describes the recommended annotation strategy for Drosophila projects. The presentation provides an overview of the goals of the GEP annotation project, an introduction to NCBI BLAST, web databases, and the issue of reading frames and phase.

Annotation for D. virilis

This is a PowerPoint presentation describing the recommended strategies for annotating a D. virilis fosmid. The homology-based annotation strategy should also be applicable to annotation of D. erecta and D. mojavensis projects.

Annotation Instruction Sheet

This document is a more in-depth description of the evidence-based annotation technique used by the GEP. This document is designed to complement and extend the basic technique described in the Annotation for D. virilis PowerPoint.

Annotation Strategy Guide

This document illustrates how the strategies outlined in the Annotation Instruction Sheet can be applied to more challenging annotation cases.

GEP Digital Laboratory Notebook

Developed by Dr. Nick Reeves at Mt. San Jacinto College, Menifee Valley Campus, this PowerPoint presentation provides a brief overview of the Digital Lab Notebook, which provides detailed guidance to students on the GEP annotation strategy.

Annotation of Other Genomic Features within the F Element Project

This presentation illustrates the unusual genomic features that GEP students have encountered as part of their annotation of Muller F Elements from Drosophila ananassae and D. bipectinata. The Muller F Elements in these two species have undergone substantial expansion compared to D. melanogaster. The presentation describes the basic strategy for identifying pseudogenes, retrogenes, partial gene duplications, pseudogene clusters, and nuclear mitochondrial DNA segments (NUMT) within these F Element annotation projects.

RNA Quantitation from RNA-Seq Data

Developed by Dr. Jeremy Buhler, this PowerPoint presentation provides an overview of the approaches for quantifying transcript abundance based on RNA-Seq data. The presentation includes a discussion on the benefits and limitations of the two approaches commonly used for RNA quantitation – RPKM and TPM.
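
As a rough illustration of how the two measures differ, the sketch below computes RPKM and TPM from the standard textbook definitions. It is not taken from the presentation; the function names, read counts, and transcript lengths are made up for the example.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    return [c * 1e9 / (total_reads * l) for c, l in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Hypothetical read counts and transcript lengths (bp) for three genes.
counts = [10, 20, 30]
lengths = [1000, 2000, 1000]

print("RPKM:", [round(v, 1) for v in rpkm(counts, lengths)])
print("TPM: ", [round(v, 1) for v in tpm(counts, lengths)])
```

TPM values always sum to one million within a sample, whereas the sum of RPKM values depends on the length distribution of the expressed transcripts. This is one reason TPM is often considered easier to compare between samples.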

TSS Annotation (Under Development)

Searching for Transcription Start Sites in Drosophila

This PowerPoint presentation describes the recommended annotation strategy for identifying transcription start sites in Drosophila. The presentation provides an overview of the promoter architecture in D. melanogaster and describes the types of evidence that can be used to support transcription start site annotations.

TSS Annotation Workflow

This workflow provides an overview of the key steps and recommended search parameters for the annotation of transcription start sites.

Investigation of Motifs

Introduction to Motifs and Motif Finding

This document contains the notes from a lecture on motif finding given by Dr. Jeremy Buhler in the Bio 4342 course at WU. The lecture covers the different approaches used to represent sequence motifs and to search for sequence motifs in a genome.
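
One common way to represent a motif and search for it is a position weight matrix (PWM) of log-odds scores. The sketch below is a generic illustration and is not taken from the lecture notes. It assumes a uniform background (0.25 per base) and a pseudocount of 1, and the aligned sites and scanned sequence are invented for the example.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per column) from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for col in range(len(sites[0])):
        column = [site[col] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return (start, score) pairs."""
    width = len(pwm)
    return [
        (start, sum(pwm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


# Hypothetical aligned binding sites and a sequence to search.
sites = ["TATAAA", "TATAAA", "TATATA", "TATAAA"]
pwm = build_pwm(sites)
best_start, best_score = max(scan("GGCTATAAAGGC", pwm), key=lambda hit: hit[1])
print(f"best window starts at {best_start} with score {best_score:.2f}")
```

Windows that match the aligned sites receive positive scores, while windows that resemble the background receive negative scores. Choosing a score threshold is a trade-off between sensitivity and false positives.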

Behavior and Limitations of Motif Finding

Developed by Dr. Jeremy Buhler, this exercise uses MEME to discover putative regulatory motifs in a collection of D. melanogaster promoter sequences. It also illustrates some of the challenges associated with motif finding and the limitations of motif finding programs.

Annotation of Conserved Motifs in Drosophila

This walkthrough uses FlyBase, FlyFactorSurvey, and Patser to identify transcription factor binding sites in the region surrounding the transcription start site of onecut in D. biarmipes.

Motif Discovery in Drosophila

This walkthrough uses FlyBase RNA-Seq Search and the MEME suite to discover motifs that are enriched in a collection of D. melanogaster Muller F element genes that show similar expression patterns.

Investigation of Repetitious Elements (Under Development)

Design and Use of RepeatMasker

Similar to the lecture notes on Repetitious DNA, this is a PowerPoint presentation given by Dr. Jeremy Buhler for the GEP faculty and TA workshops. This presentation covers the basics of RepeatMasker, as well as limitations of the program that students should be aware of.