Drosophila F Element

4. Characteristics of the F Element

This lecture draws on the themes from Slide Sets 1-3 of the “F Element Project: Annotated Lecture Slides” sequence to describe what we have learned about the F element, combining wet-bench work in the Elgin lab, chromatin mapping results from the modENCODE consortium, and the bioinformatics efforts of the faculty and students of the GEP. It covers the following:

  • characterization of the F element as a heterochromatic domain, high in repeated DNA but nonetheless having genes expressed at normal levels (7 slides);
  • mapping the chromatin state in relation to the genes and to reporter transgenes inserted into the F element (9 slides);
  • exploring the transcription start site (TSS) and searching for unique factors or motifs associated with the TSSs of F element genes (6 slides);
  • introducing the “expanded F” project, describing some of the findings from the F element of D. ananassae, and summarizing the ongoing challenges and questions (9 slides).

3. The Dilemma of Transposable Elements: Can’t Live with Them, Can’t Evolve without Them!

This lecture introduces students to the analysis of repetitious elements in the genome. It can be used as a stand-alone lecture. Because the relationship of transposable elements to eukaryotic genomes is a key issue in our study of the expanded F element, it can also be included in the “F Element Project: Annotated Lecture Slides” sequence.

The slides provide each of the following:

  • an introduction to transposable elements (TEs), particularly in the human genome (5 slides);
  • examples showing how transposition, resulting in new insertion sites or new rearrangements, creates harmful mutations and/or stimulates inflammation (7 slides);
  • an explanation of how mechanisms for silencing generate more options for gene regulation (4 slides);
  • a survey of how transposable elements have rewired the genome, contributed to novel regulatory proteins, helped build centromeres and telomeres, marked sex chromosomes, and might drive evolution during times of stress (7 slides).

Without our TEs and histones, we would be bacteria!

2. Heterochromatin Formation — It’s all about silencing!

This lecture develops the relationship between chromatin packaging and control of gene expression, a significant epigenetic system that allows the genome to respond to changes in environment, both the external environment and physiological cues (e.g., hormone responses). The slides cover the following:

  • the importance of epigenetic regulation, particularly the silencing of repeats by heterochromatin packaging (7 slides);
  • histone post-translational modification (6 slides);
  • the discovery of Heterochromatin Protein 1a and its validation using Position Effect Variegation (in Drosophila), including the model for spreading of heterochromatin (9 slides);
  • how heterochromatin packaging can lead to silencing (8 slides);
  • the inheritance and manipulation of the heterochromatic state (3 slides);
  • mapping chromatin states across the fly genome (4 slides).

1. Eukaryotic Genomes and Chromatin Structure

This lecture introduces the following topics:

  • the C-value paradox (2 slides);
  • how Cot curves first showed that eukaryotic genomes are full of repetitious sequences (11 slides; a chance to remind students of the second-order rate equations they learned in freshman chemistry!);
  • repeat characteristics of eukaryotic genomes (5 slides);
  • the need to package all that DNA to get it into a nucleus (3 slides);
  • the development of the nucleosome model (11 slides);
  • the relationship between nucleosome arrays and gene expression (4 slides).

Development of the nucleosome model represents a paradigm shift in our thinking about chromosomes, and slides are included pointing out how this model was initially rejected, but subsequently achieved widespread support in the scientific community.
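
For instructors who want the second-order rate equation behind the Cot curves mentioned above, the following is a brief sketch of the standard idealized form. It assumes a single class of sequence reassociating with ideal second-order kinetics; the symbols (C for the concentration of single-stranded DNA, C_0 for its starting value, k for the rate constant) are our own notation and are not taken from the slides.

```latex
% Ideal second-order reassociation of single-stranded DNA
\frac{dC}{dt} = -k\,C^{2}
\quad\Longrightarrow\quad
\frac{C}{C_{0}} = \frac{1}{1 + k\,C_{0}\,t}

% Half-reassociation (C/C_0 = 1/2) occurs at
C_{0}\,t_{1/2} = \frac{1}{k}
```

Under this idealization, sequences present in many copies reassociate at low values of C_0 t and single-copy sequences at high values, which is why a Cot curve with several components points to a genome containing repetitious sequences.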

Why Study the F Element?

This video presents a 50-minute talk on the motivation for, and progress of, the F Element “expansion” Project. The talk briefly covers the following:

  • the C-value paradox (2 slides);
  • the need to silence repeats (3 slides);
  • basic chromatin structure (3 slides);
  • heterochromatin and the F Element (6 slides);
  • mapping of the F Element in D. melanogaster for repeats and heterochromatin structure (10 slides);
  • examination of the Transcription Start Site, looking for regulatory motifs (6 slides);
  • the “F Element expansion” project and our initial findings (8 slides).

The slide set used in the video is available as a PowerPoint file (automatic download) and as a PDF handout.

Image: F Element Project Reconciliation Statistics, Fall 2020-Spring 2021

Gene model submission statistics: 848 gene models reconciled; approximately 75% of the submitted gene models were in congruence with the final gene models.

Breakdown of annotation errors (n=232; note: some models have multiple annotation errors). The most common annotation error in the submitted gene models is the selection of incorrect splice site boundaries, followed by the other categories below:

  • incorrect splice site boundaries (24%);
  • start/stop codon error (18%);
  • extra/missing exon (12%);
  • missing pseudogene or retrogene (12%);
  • gene model missing (9%);
  • not the orthologous model (8%);
  • mislabeled/missing isoform (6%);
  • novel isoform (6%);
  • submission error (5%).

2021 F Element GEP Summer Research Fellows

During Summer 2021, five Summer Fellows from New Jersey City University (NJCU), Rutgers University, and Washington University in St. Louis (WUSTL) reconciled coding region annotation projects from the Drosophila ananassae F element, the D. ananassae D element, and the D. bipectinata F element. The five Summer Fellows (Ishtar Olaveja, Jackie Hester, Annabelle Laughlin, Martin Dalling, and Alice Herrmann) were mentored by Dr. Cindy J. Arrigo at NJCU with support from Wilson Leung at WUSTL.

Collectively, the Summer Fellows reconciled 848 gene models and completed 21 unique TSS annotations. Approximately 75% of the submitted gene models were in congruence with the final gene models. The most common annotation error in the submitted gene models was the selection of incorrect splice site boundaries (24%). The image above illustrates the preliminary reconciliation statistics for the F element projects in Summer 2021. We anticipate that the final reconciliation statistics will be available in October 2021.
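
The percentages reported with the image above can be turned into rough counts with a few lines of Python. This is only an illustrative sketch: the totals (848 models, n=232 errors) and percentages come from the text, but the per-category counts it prints are estimates back-calculated from rounded percentages, not reported values, and the variable and function names are our own.

```python
# Figures quoted in the reconciliation statistics above.
TOTAL_MODELS = 848
CONGRUENT_FRACTION = 0.75   # reported as "approximately 75%"
TOTAL_ERRORS = 232          # some models carry more than one error

# Reported share of each annotation error category, in percent.
ERROR_PERCENT = {
    "incorrect splice site boundaries": 24,
    "start/stop codon error": 18,
    "extra/missing exon": 12,
    "missing pseudogene or retrogene": 12,
    "gene model missing": 9,
    "not the orthologous model": 8,
    "mislabeled/missing isoform": 6,
    "novel isoform": 6,
    "submission error": 5,
}


def estimated_counts(percentages, total):
    """Back-calculate approximate counts from rounded percentages."""
    return {name: round(total * pct / 100) for name, pct in percentages.items()}


percent_sum = sum(ERROR_PERCENT.values())
congruent_estimate = round(TOTAL_MODELS * CONGRUENT_FRACTION)
counts = estimated_counts(ERROR_PERCENT, TOTAL_ERRORS)

print(f"Percentages sum to {percent_sum}%")
print(f"Estimated congruent models: about {congruent_estimate} of {TOTAL_MODELS}")
for name, count in counts.items():
    print(f"  {name}: about {count}")
```

Because the published percentages are rounded, the estimated counts need not add up to exactly 232.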

While reconciling the coding region annotation projects, the Summer Fellows also identified several interesting features on the F and D elements. These include a novel isoform for the Zyx gene and a partial duplication of Arl4 on the D. ananassae F element, three novel paralogs of a male-specific gene in D. ananassae (derived from CG3795), and a retrotransposed pseudogene derived from yin on the D. bipectinata F element. The presentations that summarize their work this summer are publicly available through Box.

Three of the Summer Fellows plan to continue their reconciliation work and the data analysis of the reconciled gene models this Fall. Some of the Summer Fellows also plan to present their work at local conferences and at undergraduate research symposia at their home institutions.

The Summer Fellows were supported by a grant from the National Science Foundation (Award #: 2114661).

F Element Expansion Project Awarded NSF Grant

The National Science Foundation awarded a Standard Grant of $434,154 (Award Number 2114661) to support the GEP’s “Drosophila F Element Expansion: A Window on the C-value Paradox” Project led by Principal Investigator Cindy Arrigo (New Jersey City University).


This research award funds an investigation of the evolutionary causes and consequences of genome size variation. The DNA of all organisms contains the genes that code for proteins, the building blocks of cells. Humans have approximately five times as many protein-coding genes as do bacteria, but about 1,000 times the amount of DNA. This phenomenon, the C-value Paradox, will be studied using a chromosome (the F element) that has undergone a rapid change in size during the evolution of the fruit fly, Drosophila. Initial analysis of the F element genes in four species with an expanded F is being done by undergraduates in the Genomics Education Partnership (GEP). The GEP involves >150 faculty from across the United States who are using this project to introduce students to research in genomics, focusing on gene annotation. One of the most diverse universities in the nation, New Jersey City University, is the hub for this national research project.

An F element region containing ~80 genes is 1.3 megabases in Drosophila melanogaster, but 19.1 megabases in Drosophila ananassae, a roughly 15-fold increase in size. Expansion of the F element is largely due to a higher repeat load, dominated by transposable elements (TEs). Using a comparative species approach to analyze the expansions within and between genes, the project will document the rate and timing of TE acquisition, and characterize the impacts of TEs on gene structure and on chromosome organization. Examining SNPs from 15 strains of D. ananassae will illuminate whether change in genome size is associated with change in effective population size. The mechanisms that limit recombination will be examined using both codon bias and substitution rates. These studies and others enabled by the GEP student annotations will contribute to a better understanding of the nature of the genome that can be broadly applied to eukaryotic biology.
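
The size comparison above is simple arithmetic, sketched here in Python for readers who want to check it. The two region sizes and the gene count are taken from the text. The mean span per gene is our own back-of-the-envelope figure, and it assumes a count of exactly 80 genes where the text gives only "~80".

```python
# Region sizes quoted in the text, in megabases (Mb).
D_MELANOGASTER_F_MB = 1.3
D_ANANASSAE_F_MB = 19.1
APPROX_GENE_COUNT = 80  # assumption: the text says "~80 genes"


def fold_increase(expanded_mb, reference_mb):
    """Ratio of the expanded region size to the reference region size."""
    return expanded_mb / reference_mb


def mean_span_per_gene_kb(region_mb, gene_count):
    """Average stretch of chromosome per gene, in kilobases."""
    return region_mb * 1000 / gene_count


fold = fold_increase(D_ANANASSAE_F_MB, D_MELANOGASTER_F_MB)
mel_span = mean_span_per_gene_kb(D_MELANOGASTER_F_MB, APPROX_GENE_COUNT)
ana_span = mean_span_per_gene_kb(D_ANANASSAE_F_MB, APPROX_GENE_COUNT)

print(f"Fold increase: {fold:.1f}")
print(f"D. melanogaster: about {mel_span:.0f} kb per gene")
print(f"D. ananassae: about {ana_span:.0f} kb per gene")
```

The ratio comes out just under 15, which is why the expansion is described as approximately 15-fold.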


This proposal is a collaborative effort of the Genomics Education Partnership (GEP), written by CJ Arrigo, C Ellison, W Leung, and SCR Elgin with GEP input. The GEP has published two major papers on the Drosophila F element, an unusual chromosome that appears to be entirely heterochromatic by many criteria (condensed appearance, high HP1a/H3K9me2/3, lack of recombination) but carries 80 genes. Surprisingly, four Drosophila species have been identified with a significantly larger than average F element (2-fold to ~15-fold). Investigation of this expansion should provide insights into genome expansion in general, including documentation of the process, impacts on the genes, and impacts on the chromosome. 

Aim 1: Characterizing F element expansion: Careful annotation of high-quality genome assemblies will allow us to document the structure of both the genes and the repetitious sequences, primarily Transposable Elements (TEs), addressing the following questions using bioinformatics tools:

  • What is the distribution of repeats in relation to the protein-coding genes? How are those genes altered (e.g., in size) when the load of repetitious sequences increases?
  • Is there an impact on Transcription Start Sites?
  • How are these results impacted by the magnitude of expansion?
  • What are these repetitious sequences, and what is their evolutionary history?
  • Does high repeat density promote or allow other genome changes? 

Aim 2: Determining the impact of expansion on gene / genome evolution: What is the impact of higher repeat loads on the evolution of the chromosome as a whole, and on the evolution of the genes, which are embedded in a sea of repetitive sequences that must be silenced?

The multiple independent F element expansions in Drosophila provide a unique opportunity to determine whether change in genome size is associated with a change in effective population size. We will test for this association using genome-wide single nucleotide polymorphisms (SNPs) from 15 strains of D. ananassae. Focusing on the four species with an expanded F for which high-quality sequence is available, we will document the rate and timing of TE acquisition. We will look at the impact on the genes, using both codon bias and substitution rates to examine the extent of Hill-Robertson interference, determining whether interference is a centromere-proximal effect or simply reflects the lack of recombination due to heterochromatin formation driven by the high repeat density.

Motif Discovery in Drosophila

This walkthrough uses FlyBase RNA-Seq Search and the MEME suite to discover motifs that are enriched in a collection of D. melanogaster Muller F element genes that show similar expression patterns.

Annotation of Conserved Motifs in Drosophila

This walkthrough uses FlyBase, FlyFactorSurvey, and Patser to identify transcription factor binding sites in the region surrounding the transcription start site of onecut in D. biarmipes.
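
To give a feel for what scanning a region for transcription factor binding sites involves, here is a minimal Python sketch of scoring sequence windows against a position weight matrix. It is a simplified illustration of the general idea, not Patser's actual algorithm or output format. The count matrix, sequence, background frequencies, and function names are all hypothetical toy values invented for this example.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
TOY_COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
# Assumed uniform background; a real analysis would use genome frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
PSEUDOCOUNT = 1.0  # avoids log(0) for bases never seen at a position


def log_odds_matrix(counts, background, pseudocount):
    """Convert base counts into log2 odds scores against the background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(column)
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in column
        })
    return matrix


def scan(sequence, matrix):
    """Score every window of the forward strand; returns (start, window, score)."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


pwm = log_odds_matrix(TOY_COUNTS, BACKGROUND, PSEUDOCOUNT)
hits = scan("TTACGTGGACGA", pwm)  # toy sequence containing only A, C, G, T
best_hit = max(hits, key=lambda hit: hit[2])
print(f"Best window: {best_hit[1]} at position {best_hit[0]}, score {best_hit[2]:.2f}")
```

This sketch scores only the forward strand and expects uppercase A, C, G, and T; a real scan would also handle the reverse complement and ambiguous bases, and would use a score threshold chosen for the matrix in question.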

Behavior and Limitations of Motif Finding

Developed by Dr. Jeremy Buhler, this exercise uses MEME to discover putative regulatory motifs in a collection of D. melanogaster promoter sequences. It also illustrates some of the challenges associated with motif finding and the limitations of motif finding programs.