Pathways

Sequence Similarity Introduction

June 23, 2024 (updated August 20, 2024)

This lesson and exercise define similarity in both a non-biological and a biological sense, quantify the similarity between two sequences, and explain how a substitution matrix is used to score that similarity. Students calculate amino acid similarity scores using the BLOSUM62 substitution matrix, learn how BLAST detects similarity between two sequences, and practice running BLAST and interpreting its alignments.
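
As a rough illustration of how a substitution matrix turns an alignment into a score, the sketch below sums per-position scores for two aligned peptides of equal length. It is not taken from the lesson itself. The peptides and the function names `pair_score` and `alignment_score` are hypothetical, and the dictionary holds only the handful of BLOSUM62 entries the example needs rather than the full 20 x 20 matrix. Gaps are not handled.

```python
# A few entries from the BLOSUM62 matrix (illustrative subset only).
# Keys are alphabetically sorted residue pairs because the matrix is symmetric.
BLOSUM62_SUBSET = {
    ("A", "A"): 4,
    ("I", "L"): 2,    # two similar hydrophobic residues
    ("K", "R"): 2,    # two basic residues
    ("D", "E"): 2,    # two acidic residues
    ("W", "W"): 11,   # identical tryptophans score highest
    ("G", "W"): -2,   # dissimilar residues score below zero
}


def pair_score(a, b):
    """Look up the substitution score for one aligned residue pair."""
    return BLOSUM62_SUBSET[tuple(sorted((a, b)))]


def alignment_score(seq1, seq2):
    """Sum the substitution scores over an ungapped alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


# Hypothetical aligned peptides: 4 + 2 + 2 + 2 + 11
print(alignment_score("ALKDW", "AIREW"))  # 21
```

Conservative substitutions (such as L to I) still add to the score, which is why a similarity score can be higher than a count of identical positions would suggest.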

Melinda Yang Publishes Specifications Grading in GEP-CURE

March 1, 2024

Pairing a bioinformatics-focused course-based undergraduate research experience with specifications grading in an introductory biology classroom

Abstract

Introducing bioinformatics-focused concepts and skills in a biology classroom is difficult, especially in introductory biology classrooms. Course-based Undergraduate Research Experiences (CUREs) facilitate this process, introducing genomics and bioinformatics through authentic research experiences, but the many learning objectives needed in scientific research and communication, foundational biology concepts, and bioinformatics-focused concepts and skills can make the process challenging. Here, the pairing of specifications grading with a bioinformatics-focused CURE based in the Genomics Education Partnership is described. Elements include how the course structure with specifications grading facilitated scaffolding of writing assignments, group work, and metacognitive activities; and the synergies between CUREs and specifications grading. CUREs require mastery of related concepts and skills for working through the research process, utilize common research practices of revision and iteration, and encourage a growth mindset to learning—all of which are heavily incentivized in assessment practices focused on specifications grading.

Puerto Rico Regional Node Meeting – February 6 & 8, 2024

February 12, 2024

On February 6 and 8, 2024, the Puerto Rico Regional Node held a two-day Pathways Project training with Dr. Laura Reed. Node members met over Zoom each day from 4:30 to 6:00 pm AST (2:30 to 4:00 pm CST). Ten GEP faculty members and one undergraduate student participated in the three-hour online training.

As part of their GEP onboarding, all Puerto Rico Node members have been trained in the F Element annotation methodology. However, few have worked with other GEP Projects. During the 2023 Faculty Workshop, attending Node members expressed an interest in professional development activities that would help them acquire the knowledge and skills necessary to participate in other GEP research activities. A Pathways Project training was identified as a logical next step towards achieving this goal.

The first session started with an overview of the overarching research questions, implementation ideas, and an introduction to the project walkthrough. The second session covered common annotation mistakes, project claiming and submission protocols, managing report forms, and an overview of the microPublication pipeline. It also offered an opportunity to ask general questions and learn about different Pathways support initiatives. Members of the Puerto Rico Node have a special interest in the Pathways and Puerto Rican Parrot Projects, and this training provided valuable knowledge and methodological insights that will serve them well as they venture (with their students) into other GEP annotation projects.

The Puerto Rico Node thanks the Pathways Project Leader, Laura Reed, for setting time aside to train the Node members. Special thanks go to the Regional Node Director and Co-Director, Melinda Yang and Jenni Kennell, for answering many questions, sending helpful follow-up messages, and making sure the Node Leaders had the resources necessary to organize the training. Thanks also to Sarah Potts for supporting the event registration, facilitating access to training materials, and managing Zoom channel logistics. The planning of the training was a collaborative effort of the Node Leader and Co-Leader, Ángel O Custodio and Sheylda Díaz.

What worked well for your event that might help others plan similar events?

We held the event online on weekdays, so there was no need to organize a venue or have people travel to a meeting location. This format served our purpose well. Scheduling the training well after the start of the semester also promoted faculty participation.

What lessons were learned from challenges in the planning or execution of the event?

Communicate often with headquarters or the Regional Node Directors before sending information to Node members. Also, remember that the Central Office will give the Node support with registrations and surveys.

What would your Node do differently based on your experiences?

Node leaders will email headquarters more often to make sure they are not duplicating tasks that the Central Office can handle with ease. They will also set up a local checklist to help plan and follow up on certain tasks.

Genomic Neighborhood Check For Understanding

January 8, 2024 (updated January 19, 2025)

This check for understanding was created after a member reported that their students struggled with the genomic neighborhood, and that the member did not notice until the students were too far into the annotation to correct their misconceptions. It is meant to be a quick in-class or homework assignment.

New GEP Publication on the Pathways Project Annotation Protocol

January 8, 2024

Manual annotation of Drosophila genes: a Genomics Education Partnership protocol

Abstract

Annotating the genomes of multiple species allows us to analyze the evolution of their genes. While many eukaryotic genome assemblies already include computational gene predictions, these predictions can benefit from review and refinement through manual gene annotation. The Genomics Education Partnership (GEP; https://thegep.org/) developed a structural annotation protocol for protein-coding genes that enables undergraduate student and faculty researchers to create high-quality gene annotations that can be utilized in subsequent scientific investigations. For example, this protocol has been utilized by the GEP faculty to engage undergraduate students in the comparative annotation of genes involved in the insulin signaling pathway in 27 Drosophila species, using D. melanogaster as the reference genome. Students construct gene models using multiple lines of computational and empirical evidence including expression data (e.g., RNA-Seq), sequence similarity (e.g., BLAST and multiple sequence alignment), and computational gene predictions. Quality control measures require each gene be annotated by at least two students working independently, followed by reconciliation of the submitted gene models by a more experienced student. This article provides an overview of the annotation protocol and describes how discrepancies in student submitted gene models are resolved to produce a final, high-quality gene set suitable for subsequent analyses. The protocol can be adapted to other scientific questions (e.g., expansion of the Drosophila Muller F element) and species (e.g., parasitoid wasps) to provide additional opportunities for undergraduate students to participate in genomics research. These student annotation efforts can substantially improve the quality of gene annotations in publicly available genomic databases.

MN/IA/Dakotas Regional Node Meeting – September 30, 2022

August 2, 2023

The Minnesota, Iowa, and Dakotas Regional Node held a virtual meeting on September 30, 2022. The meeting began with introductions, laughs, and updates. Dr. Andy Arsham of Bemidji State University was the featured speaker. He gave an engaging research seminar on his current work with the Zas-Znf gene family in the Drosophila lineage. The second part of the Node workshop was open work time, with space for questions about advanced gene annotations in the Pathways Project. Two GEP Virtual TAs joined to answer questions while participants worked through the problematic gene Shaggy in Drosophila willistoni. The gene has many isoforms, and the group wanted to know the best way to approach all of them. The session was a great success, and everyone learned something new from the knowledgeable TAs. Finally, Node members discussed and planned the student research event for the spring. Save the Date – April 21st, 2023 at Saint Catherine University! The plan is for an all-day, in-person event with student presentations, guest speakers, and lots of food.

Update: The in-person RNM was held on April 21, 2023 and went incredibly well. There were a total of 80 participants (including admin, staff, and STEM faculty from St. Catherine University).

What worked well for your event that might help others plan similar events?

The Node decided to focus on the Pathways Project. One of the main objectives of the event was to provide hands-on GEP curriculum training for current GEP members. In particular, members wanted to work on and ask questions about a particularly difficult gene in the project. The Node Leader sent out information on the gene to be discussed a day before the workshop. Participants talked through the annotation process first and then went through their questions.

Pathways Project: Annotation Videos

August 5, 2022 (updated March 11, 2025)

The Pathways Project is focused on annotating genes found in well-characterized signaling and metabolic pathways across the Drosophila genus. The current focus is on the insulin signaling pathway, which is well conserved across animals and critical to growth and metabolic homeostasis. The long-term goal of the Pathways Project is to analyze how the regulatory regions of genes evolve in the context of their positions within a network. For a general project overview, see the video by Dr. Laura K. Reed.

Pathways Project Walkthrough Series

This series of videos based on the Pathways Project Annotation Walkthrough is intended to help GEP students annotate a Pathways Project gene from start to finish. The full series is available on the GEP YouTube Channel.

Pathways Gene Annotation Walkthrough Videos

Introduction
Part 1: Examine genomic neighborhood surrounding target gene in D. melanogaster
Part 2.1: Retrieve protein sequence of target gene in D. melanogaster
Part 2.2: Perform a BLAST search of D. melanogaster protein against the target species’ genome
Part 2.3: Summarize tblastn results for protein on target species’ scaffold
Part 3.1: Examine evidence for a protein-coding gene in region surrounding the tblastn alignment in the target species
Part 3.2: Use synteny to gather additional evidence for the ortholog assignment
Part 4: Determine target gene’s structure in D. melanogaster
Part 5: Determine approximate location of coding exons (CDS’s) in target species
Part 6.1: Verify start codon coordinates
Part 6.2: Verify stop codon coordinates
Part 6.3: Determine phases of donor and acceptor splice sites
Part 6.4: Use spliced RNA-Seq reads to verify coordinates for Intron-1
Part 6.5: Use splice junction predictions to verify coordinates for second intron
Part 7.1: Verify gene model of protein
Part 7.2: Download files required for project submission
Part 7.3: Merge project files
Appendix A: Combining (or Batching) BLAST Searches
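
The checks covered in Parts 6.1 to 6.3 and Part 7.1 can be sketched in a few lines of code. This is only an illustration of the underlying reasoning, not the GEP's own tooling. The function names and example sequences are hypothetical, each CDS is assumed to be given in order on the coding strand, and the last CDS is assumed to include the stop codon. Phase is treated here as the number of nucleotides of a split codon that fall on each side of an intron, so the donor and acceptor phases across one intron must sum to 0 or 3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def splice_phases(cds_lengths):
    """Return (donor_phase, acceptor_phase) for each intron between CDSs.

    The first CDS is assumed to begin in frame at the start codon.
    """
    phases = []
    acceptor = 0  # nucleotides that complete a split codon at the CDS start
    for length in cds_lengths[:-1]:
        donor = (length - acceptor) % 3   # leftover bases before the donor site
        acceptor = (3 - donor) % 3        # bases needed after the acceptor site
        phases.append((donor, acceptor))
    return phases


def check_coding_sequence(cds_seqs):
    """Run basic sanity checks on the concatenated coding sequence."""
    seq = "".join(cds_seqs).upper()
    in_frame = len(seq) % 3 == 0
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return {
        "starts_with_ATG": seq.startswith("ATG"),
        "length_multiple_of_3": in_frame,
        "ends_with_stop": in_frame and seq[-3:] in STOP_CODONS,
        "no_internal_stop": all(c not in STOP_CODONS for c in codons[:-1]),
    }


# Hypothetical three-CDS model encoding ATG GCT AAG GTT TGA
model = ["ATGG", "CTAAGGT", "TTGA"]
print(splice_phases([len(cds) for cds in model]))  # [(1, 2), (2, 1)]
print(check_coding_sequence(model))
```

A model that fails any of these checks usually points to a misplaced splice site or an incorrect start or stop coordinate, which is the kind of error the walkthrough videos show how to track down in the genome browser.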

Pathways Project: Annotation Form D. pseudoobscura Key

August 5, 2022 (updated January 28, 2025)

Students can apply what they learned in the Pathways Project: Annotation Walkthrough to construct a gene model for Rheb in D. pseudoobscura by completing the Pathways Project: Annotation Form. An answer key (generic username and password required) is provided to assist instructors in checking the accuracy of the annotation and includes potential areas of confusion throughout.

Pathways Project: Annotation Form Exemplar

August 5, 2022 (updated January 28, 2025)

The Annotation Form Exemplar is provided as an example of a completed Annotation Form ready for submission to the GEP’s Pathways Project. The optional questions were omitted from the exemplar.

Pathways Project: Annotation Form

August 5, 2022 (updated January 28, 2025)

The “Annotation Form” merges the “Annotation Report” and “Annotation Notebook” into a single document; the latter two items have been archived.

The “Annotation Form” keeps many of the Check for Understanding questions that were previously in the “Annotation Notebook.” However, because not all faculty want their students to answer those questions, any question that is NOT required for submission to the GEP is marked “OPTIONAL.” Optional questions are currently labeled with letters rather than numbers, which we hope makes it easy to select and delete the OPTIONAL questions that faculty don’t wish to include. All numbered questions are required for submission to the GEP for reconciliation.

Items that were previously asked for in the “Project Details” table of the “Annotation Report” are now available on the Genome Browser Gateway page for each species. Therefore, the “Project Details Table” instructions document is no longer needed.
