Intermediate Student

Introduction to R and RStudio

This series of modules introduces students to the statistical platform R through the RStudio integrated development environment. Both programs can be downloaded for free. After downloading and installing them as described in Module 0, students should watch the accompanying video for an introduction to the new environment. Module 1 presents an exercise in which students work with genomic sequence alignment data to learn how to construct R commands while computing basic summary statistics and making basic plots.

Design and Use of RepeatMasker

Similar to the lecture notes on Repetitious DNA, this is a PowerPoint presentation given by Dr. Jeremy Buhler for the GEP faculty and TA workshops. This presentation covers the basics of RepeatMasker, as well as limitations of the program that students should be aware of.

4. Characteristics of the F Element

This lecture draws on the themes from Slide Sets 1-3 of the “F Element Project: Annotated Lecture Slides” sequence to describe what we have learned about the F element, combining wet-bench work in the Elgin lab, chromatin mapping results from the modENCODE consortium, and the bioinformatics efforts of the faculty and students of the GEP. This includes the following:

  • characterization of the F element as a heterochromatic domain, high in repeated DNA but nonetheless having genes expressed at normal levels (7 slides);
  • mapping the chromatin state in relation to the genes and to reporter transgenes inserted into the F element (9 slides);
  • exploring the transcription start site (TSS) and searching for unique factors or motifs associated with the TSSs of F element genes (6 slides);
  • introducing the “expanded F” project, describing some of the findings from the F element of D. ananassae, and summarizing the ongoing challenges and questions (9 slides).

3. The Dilemma of Transposable Elements: Can’t Live with Them, Can’t Evolve without Them!

This lecture introduces students to the analysis of repetitious elements in the genome. It can be used as a stand-alone lecture. Alternatively, because the relationship of transposable elements to eukaryotic genomes is a key issue in our study of the expanded F element, it can be included in the “F Element Project: Annotated Lecture Slides” sequence.

The slides provide each of the following:

  • an introduction to transposable elements (TEs), particularly in the human genome (5 slides);
  • examples showing how transposition, resulting in new insertion sites or new rearrangements, creates harmful mutations and/or stimulates inflammation (7 slides);
  • an explanation of how mechanisms for silencing generate more options for gene regulation (4 slides);
  • a survey of how transposable elements have rewired the genome, contributed to novel regulatory proteins, helped build centromeres and telomeres, marked sex chromosomes, and might drive evolution during times of stress (7 slides).


Without our TEs and histones, we would be bacteria!

2. Heterochromatin Formation — It’s all about silencing!

This lecture develops the relationship between chromatin packaging and control of gene expression, a significant epigenetic system that allows the genome to respond to changes in its environment, both external conditions and physiological cues (e.g., hormone responses). The slides cover the following:

  • the importance of epigenetic regulation, particularly the silencing of repeats by heterochromatin packaging (7 slides);
  • histone post-translational modification (6 slides);
  • the discovery of Heterochromatin Protein 1a and its validation using Position Effect Variegation (in Drosophila), including the model for spreading of heterochromatin (9 slides);
  • how heterochromatin packaging can lead to silencing (8 slides);
  • the inheritance and manipulation of the heterochromatic state (3 slides);
  • mapping chromatin states across the fly genome (4 slides).

1. Eukaryotic Genomes and Chromatin Structure

This lecture introduces the following topics:

  • the C-value paradox (2 slides);
  • the discovery, using Cot curves, that eukaryotic genomes are full of repetitious sequences (11 slides; a chance to remind students of the second-order rate equations they learned in freshman chemistry!);
  • repeat characteristics of eukaryotic genomes (5 slides);
  • the need to package all that DNA to get it into a nucleus (3 slides);
  • the development of the nucleosome model (11 slides);
  • the relationship between nucleosome arrays and gene expression (4 slides).

Development of the nucleosome model represents a paradigm shift in our thinking about chromosomes, and slides are included pointing out how this model was initially rejected, but subsequently achieved widespread support in the scientific community.
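
For instructors who want to tie the Cot curve slides to the second-order rate equations mentioned in the topic list, the standard textbook form of the reassociation kinetics is sketched here. The symbols C, C_0, k, and t are the usual textbook notation and are an assumption on our part; they are not taken from the slide set itself.

```latex
\[
\frac{dC}{dt} = -kC^{2}
\quad\Longrightarrow\quad
\frac{C}{C_{0}} = \frac{1}{1 + k\,C_{0}t},
\qquad
C_{0}t_{1/2} = \frac{1}{k}
\]
```

Here C is the concentration of single-stranded DNA at time t, C_0 is its starting concentration, and k is the reassociation rate constant. Highly repeated sequences reassociate at low C_0 t values, while single-copy sequences reassociate only at high ones.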

Why Study the F Element?

This video provides a 50-minute talk on the motivation for, and progress of, the “F Element expansion” project. The talk briefly covers the following:

  • the C-value paradox (2 slides);
  • the need to silence repeats (3 slides);
  • basic chromatin structure (3 slides);
  • heterochromatin and the F Element (6 slides);
  • mapping of the F Element in D. melanogaster for repeats and heterochromatin structure (10 slides);
  • examination of the Transcription Start Site, looking for regulatory motifs (6 slides);
  • a description of the “F Element expansion” project and our initial findings (8 slides).

The slide set used in the video is available as a PowerPoint file (automatic download) and as a PDF handout via the buttons below.

Using BLAST for Genomic Sequence Annotation

Similar to the Lecture Notes on Alignment, this is a PowerPoint presentation given by Dr. Jeremy Buhler for the GEP faculty and TA workshops. This presentation covers the basics of sequence alignment, which students need in order to interpret BLAST results correctly.
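
To give students a concrete feel for what an alignment score means, a minimal sketch of Smith-Waterman local alignment scoring is shown below. It is not taken from the presentation: the function name `local_alignment_score` and the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions, and BLAST itself uses a faster seed-and-extend heuristic rather than this exhaustive dynamic-programming search.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b.

    The scoring values are illustrative assumptions, not BLAST defaults.
    """
    # prev holds the previous row of the scoring matrix; row 0 is all zeros.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Or open a gap in one of the two sequences.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # A local alignment may restart anywhere, so never go below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two short sequences that share the substring "ACG".
    print(local_alignment_score("TTACGTTT", "GGACGGG"))
```

Because the score is clamped at zero, unrelated flanking sequence does not drag down a strong local match, which is why a BLAST hit can cover only part of a query.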

Browser-Based Annotation and RNA-Seq Data

This exercise continues your introduction to practical issues in comparative annotation. You will be annotating genomic sequence from the dot chromosome of Drosophila mojavensis using your knowledge of BLAST and some improved visualization tools. You will also consider how best to integrate information from high-throughput sequencing of expressed RNA.