Katie Sandlin
Facilitating Growth through Frustration: Using Genomics Research in a Course-Based Undergraduate Research Experience
Lopatto D, Rosenwald AG, DiAngelo JR, et al. Facilitating Growth through Frustration: Using Genomics Research in a Course-Based Undergraduate Research Experience. J Microbiol Biol Educ. 2020;21(1):21.1.6. Published 2020 Feb 28. doi:10.1128/jmbe.v21i1.2005
Abstract
A hallmark of the research experience is encountering difficulty and working through those challenges to achieve success. This ability is essential to being a successful scientist, but replicating such challenges in a teaching setting can be difficult. The Genomics Education Partnership (GEP) is a consortium of faculty who engage their students in a genomics Course-Based Undergraduate Research Experience (CURE). Students participate in genome annotation, generating gene models using multiple lines of experimental evidence. Our observations suggested that the students’ learning experience is continuous and recursive, frequently beginning with frustration but eventually leading to success as they come up with defendable gene models. In order to explore our “formative frustration” hypothesis, we gathered data from faculty via a survey, and from students via both a general survey and a set of student focus groups. Upon analyzing these data, we found that all three datasets mentioned frustration and struggle, as well as learning and better understanding of the scientific process. Bioinformatics projects are particularly well suited to the process of iteration and refinement because iterations can be performed quickly and are inexpensive in both time and money. Based on these findings, we suggest that a dynamic of “formative frustration” is an important aspect for a successful CURE.
Retrotransposons Are the Major Contributors to the Expansion of the Drosophila ananassae Muller F Element
Leung W, Shaffer CD, Chen EJ, et al. Retrotransposons Are the Major Contributors to the Expansion of the Drosophila ananassae Muller F Element. G3 (Bethesda). 2017;7(8):2439‐2460. Published 2017 Aug 7. doi:10.1534/g3.117.040907
Leung W, Elgin SCR; (On behalf of the participating students and faculty of the Genomics Education Partnership). Response to the Letter to the Editor by Dunning Hotopp and Klasson. G3 (Bethesda). 2018;8(1):375. Published 2018 Jan 4. doi:10.1534/g3.117.300379
Abstract
The discordance between genome size and the complexity of eukaryotes can partly be attributed to differences in repeat density. The Muller F element (∼5.2 Mb) is the smallest chromosome in Drosophila melanogaster, but it is substantially larger (>18.7 Mb) in D. ananassae. To identify the major contributors to the expansion of the F element and to assess their impact, we improved the genome sequence and annotated the genes in a 1.4-Mb region of the D. ananassae F element, and a 1.7-Mb region from the D element for comparison. We find that transposons (particularly LTR and LINE retrotransposons) are major contributors to this expansion (78.6%), while Wolbachia sequences integrated into the D. ananassae genome are minor contributors (0.02%). Both D. melanogaster and D. ananassae F-element genes exhibit distinct characteristics compared to D-element genes (e.g., larger coding spans, larger introns, more coding exons, and lower codon bias), but these differences are exaggerated in D. ananassae. Compared to D. melanogaster, the codon bias observed in D. ananassae F-element genes can primarily be attributed to mutational biases instead of selection. The 5′ ends of F-element genes in both species are enriched in dimethylation of lysine 4 on histone 3 (H3K4me2), while the coding spans are enriched in H3K9me2. Despite differences in repeat density and gene characteristics, D. ananassae F-element genes show a similar range of expression levels compared to genes in euchromatic domains. This study improves our understanding of how transposons can affect genome size and how genes can function within highly repetitive domains.
An undergraduate bioinformatics curriculum that teaches eukaryotic gene structure
Laakso, M.M., Paliulis, L.V., Croonquist, P., Derr, B., Gracheva, E., Hauser, C., Howell, C., Jones, C.J., Kagey, J.D., Kennell, J., Silver Key, S.C., Mistry, H., Robic, S., Sanford, J., Santisteban, M., Small, C., Spokony, R., Stamm, J., Van Stry, M., Leung, W., Elgin, S.C.R. 2017. An undergraduate bioinformatics curriculum that teaches eukaryotic gene structure. CourseSource. https://doi.org/10.24918/cs.2017.13
Abstract
Gene structure, transcription, translation, and alternative splicing are challenging concepts for many undergraduates studying biology. These topics are typically covered in a traditional lecture environment, but students often fail to master and retain these concepts. To address this problem we have designed a series of six Modules that employ an active learning approach using a bioinformatics tool, the genome browser, to help students understand eukaryotic gene structure and functionality. Students learn how to use a mirror site of the UCSC Genome Browser created by the Genomics Education Partnership while completing the Modules, which focus on gene structure, transcription, splicing, translation, and alternative splicing. The Modules are supplemented with short videos that illustrate key functionalities of the genome browser and fundamental concepts in processing transcripts. These materials have been used successfully to teach gene structure in many different settings, from community colleges to 4-year colleges and universities, encompassing advanced high school students to college seniors. Instructors can easily customize the Modules and/or select a subset for their curriculum. The Modules have helped our students learn about eukaryotic gene structure and expression, simultaneously acquiring skills in the use of a genome browser, and have prepared them to pursue genome annotation projects as independent research.
The GEP: Crowd-Sourcing Big Data Analysis With Undergraduates
Abstract: The era of ‘big data’ is also the era of abundant data, creating new opportunities for student-scientist research partnerships. By coordinating undergraduate efforts, the Genomics Education Partnership produces high-quality annotated data sets and analyses that could not be generated otherwise, leading to scientific publications while providing many students with research experience.

Elgin SCR, Hauser C, Holzen TM, et al. The GEP: Crowd-Sourcing Big Data Analysis with Undergraduates. Trends Genet. 2017;33(2):81‐85. doi:10.1016/j.tig.2016.11.004
A Hands-on Introduction to Hidden Markov Models
Abstract: In this Lesson, we describe a classroom activity that demonstrates how a Hidden Markov Model (HMM) is applied to predict a eukaryotic gene, focusing on predicting one exon-intron boundary. This HMM lesson is part of the BIOL/CS 370 ‘Introduction to Bioinformatics’ course (Truman State University, MO) and of Bio4342 ‘Research Explorations in Genomics’ (Washington University in St. Louis, MO). The original target student audiences include both Biology and Computer Sciences majors in their junior and senior years, although we believe the model activity would be successful with younger students. The class session starts with a brief introductory lecture describing HMMs and the terminology used in defining the parameters for a given model. This lecture is followed by students’ exploration of the HMM using Excel spreadsheets to manage calculations while they alter the key variables; collaborative problem solving and discussion of their strategies and results; and homework to check their understandings.
Students have reacted very positively to the HMM curriculum. Students with more computer science experience tended to ask more questions concerning the model itself. Overall, students performed well on the homework assignment, leading us to believe that we are a step closer to our main goal of filling the intellectual gap between computer scientists and biologists.

Weisstein, A.E., Gracheva, E., Goodwin, Z., Qi, Z., Leung, W., Shaffer, C.D. and Elgin, S.C.R. 2016. A Hands-on Introduction to Hidden Markov Models. CourseSource. https://doi.org/10.24918/cs.2016.8
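The kind of calculation the lesson above has students work through in spreadsheets can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's own material: the three states (`exon`, `site`, `intron`), all transition and emission probabilities, the `viterbi` helper, and the requirement that the path end in the intron state are assumptions chosen for the example. It uses the Viterbi algorithm to find the most probable state path for a short DNA sequence, which places the single exon-intron boundary.

```python
import math

# Hypothetical three-state model: exon -> splice site -> intron.
# All probabilities below are illustrative assumptions, not values from the lesson.
STATES = ("exon", "site", "intron")
START = {"exon": 1.0, "site": 0.0, "intron": 0.0}
TRANS = {
    "exon": {"exon": 0.9, "site": 0.1, "intron": 0.0},
    "site": {"exon": 0.0, "site": 0.0, "intron": 1.0},
    "intron": {"exon": 0.0, "site": 0.0, "intron": 1.0},
}
EMIT = {
    "exon": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "site": {"A": 0.05, "C": 0.0, "G": 0.95, "T": 0.0},
    "intron": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def _log(p):
    """Natural log that maps probability 0 to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq, end_state="intron"):
    """Return (most probable state path ending in end_state, its log probability)."""
    # Scores for the first base: start probability times emission probability.
    scores = [{s: _log(START[s]) + _log(EMIT[s][seq[0]]) for s in STATES}]
    backpointers = []
    for base in seq[1:]:
        row, ptr = {}, {}
        for s in STATES:
            # Best previous state from which to enter state s.
            prev = max(STATES, key=lambda p: scores[-1][p] + _log(TRANS[p][s]))
            row[s] = scores[-1][prev] + _log(TRANS[prev][s]) + _log(EMIT[s][base])
            ptr[s] = prev
        scores.append(row)
        backpointers.append(ptr)
    # Trace back from the required final state to recover the path.
    state = end_state
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, scores[-1][end_state]


if __name__ == "__main__":
    path, log_prob = viterbi("CCTGTATT")
    print(path)
    print("boundary at index", path.index("site"))
```

Changing an emission or transition probability and rerunning plays the same role as altering the key variables in the spreadsheet: the predicted boundary moves only when another path becomes more probable.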
Drosophila Muller F Elements Maintain a Distinct Set of Genomic Properties Over 40 Million Years of Evolution
Abstract: The Muller F element (4.2 Mb, ~80 protein-coding genes) is an unusual autosome of Drosophila melanogaster; it is mostly heterochromatic with a low recombination rate. To investigate how these properties impact the evolution of repeats and genes, we manually improved the sequence and annotated the genes on the D. erecta, D. mojavensis, and D. grimshawi F elements and euchromatic domains from the Muller D element. We find that F elements have greater transposon density (25–50%) than euchromatic reference regions (3–11%). Among the F elements, D. grimshawi has the lowest transposon density (particularly DINE-1: 2% vs. 11–27%). F element genes have larger coding spans, more coding exons, larger introns, and lower codon bias. Comparison of the Effective Number of Codons with the Codon Adaptation Index shows that, in contrast to the other species, codon bias in D. grimshawi F element genes can be attributed primarily to selection instead of mutational biases, suggesting that density and types of transposons affect the degree of local heterochromatin formation. F element genes have lower estimated DNA melting temperatures than D element genes, potentially facilitating transcription through heterochromatin. Most F element genes (~90%) have remained on that element, but the F element has smaller syntenic blocks than genome averages (3.4–3.6 vs. 8.4–8.8 genes per block), indicating greater rates of inversion despite lower rates of recombination. Overall, the F element has maintained characteristics that are distinct from other autosomes in the Drosophila lineage, illuminating the constraints imposed by a heterochromatic milieu.

Leung W, Shaffer CD, Reed LK, et al. Drosophila Muller F Elements Maintain a Distinct Set of Genomic Properties Over 40 Million Years of Evolution. G3 (Bethesda). 2015;5(5):719‐740. Published 2015 Mar 4. doi:10.1534/g3.114.015966
• 1,014 co-authors; 940 of them participated as students
• The improved sequences and gene annotations are available as part of the supplemental materials for the manuscript.
• The genome browsers for D. erecta, D. mojavensis, and D. grimshawi are available on the GEP UCSC Genome Browser.
• The underlying database and additional data files are available for download through the WUSTL Digital Research Materials Repository.
A Central Support System Can Facilitate Implementation and Sustainability of a Classroom-Based Undergraduate Research Experience (CURE) in Genomics
Abstract: In their 2012 report, the President’s Council of Advisors on Science and Technology advocated “replacing standard science laboratory courses with discovery-based research courses”, a challenging proposition that presents practical and pedagogical difficulties. In this paper, we describe our collective experiences working with the Genomics Education Partnership, a nationwide faculty consortium that aims to provide undergraduates with a research experience in genomics through a scheduled course (a classroom-based undergraduate research experience, or CURE). We examine the common barriers encountered in implementing a CURE, program elements of most value to faculty, ways in which a shared core support system can help, and the incentives for and rewards of establishing a CURE on our diverse campuses. While some of the barriers and rewards are specific to a research project utilizing a genomics approach, other lessons learned should be broadly applicable. We find that a central system that supports a shared investigation can mitigate some shortfalls in campus infrastructure (such as time for new curriculum development, availability of IT services) and provides collegial support for change. Our findings should be useful for designing similar supportive programs to facilitate change in the way we teach science for undergraduates.

Lopatto D, Hauser C, Jones CJ, et al. A Central Support System Can Facilitate Implementation and Sustainability of a Classroom-based Undergraduate Research Experience (CURE) in Genomics. CBE Life Sci Educ. 2014;13(4):711‐723. doi:10.1187/cbe.13-10-0200
A Course-Based Research Experience: How Benefits Change with Increased Investment in Instructional Time
Abstract: There is widespread agreement that science, technology, engineering, and mathematics programs should provide undergraduates with research experience. Practical issues and limited resources, however, make this a challenge. We have developed a bioinformatics project that provides a course-based research experience for students at a diverse group of schools and offers the opportunity to tailor this experience to local curriculum and institution-specific student needs. We assessed both attitude and knowledge gains, looking for insights into how students respond given this wide range of curricular and institutional variables. While different approaches all appear to result in learning gains, we find that a significant investment of course time is required to enable students to show gains commensurate to a summer research experience. An alumni survey revealed that time spent on a research project is also a significant factor in the value former students assign to the experience one or more years later. We conclude: 1) implementation of a bioinformatics project within the biology curriculum provides a mechanism for successfully engaging large numbers of students in undergraduate research; 2) benefits to students are achievable at a wide variety of academic institutions; and 3) successful implementation of course-based research experiences requires significant investment of instructional time for students to gain full benefit.

Shaffer CD, Alvarez CJ, Bednarski AE, et al. A Course-Based Research Experience: How Benefits Change with Increased Investment in Instructional Time. CBE Life Sci Educ. 2014;13(1):111‐130. doi:10.1187/cbe-13-08-0152
Evolution of a Distinct Genomic Domain in Drosophila: Comparative Analysis of the Dot Chromosome in D. melanogaster and D. virilis
Abstract: The distal arm of the fourth (“dot”) chromosome of Drosophila melanogaster is unusual in that it exhibits an amalgamation of heterochromatic properties (e.g., dense packaging, late replication) and euchromatic properties (e.g., gene density similar to euchromatic domains, replication during polytenization). To examine the evolution of this unusual domain, we undertook a comparative study by generating high-quality sequence data and manually curating gene models for the dot chromosome of D. virilis (Tucson strain 15010-1051.88). Our analysis shows that the dot chromosomes of D. melanogaster and D. virilis have higher repeat density, larger gene size, lower codon bias, and a higher rate of gene rearrangement compared to a reference euchromatic domain. Analysis of eight “wanderer” genes (present in a euchromatic chromosome arm in one species and on the dot chromosome in the other) shows that their characteristics are similar to other genes in the same domain, which suggests that these characteristics are features of the domain and are not required for these genes to function. Comparison of this strain of D. virilis with the strain sequenced by the Drosophila 12 Genomes Consortium (Tucson strain 15010-1051.87) indicates that most genes on the dot are under weak purifying selection. Collectively, despite the heterochromatin-like properties of this domain, genes on the dot evolve to maintain function while being responsive to changes in their local environment.

Leung W, Shaffer CD, Cordonnier T, et al. Evolution of a distinct genomic domain in Drosophila: comparative analysis of the dot chromosome in Drosophila melanogaster and Drosophila virilis. Genetics. 2010;185(4):1519‐1534. doi:10.1534/genetics.110.116129