
Potential Science Partners

The Genomics Education Partnership (GEP) has had great success in using genome annotation as a framework for introducing biology students to genomics and bioinformatics. We are looking for Science Partners, researchers who have a scientific project that would benefit from careful annotation of a set of genes or some other student contributions. Science Partners can be current GEP members. Past projects include annotation of all genes in a distinct domain (the 1.3 Mb Drosophila F element) across multiple species and annotation of the genes defining a particular pathway in Drosophila or in parasitoid wasps. Our primary goal is to identify projects that lead to publications in the scientific literature, but service projects (where careful annotation of a particular organism provides an improved resource for the scientific community) might be undertaken. Projects that can also lead to student micropublications are encouraged. Together we can accomplish large-scale genomics projects that cannot be easily completed by a single research lab. 

Interested? Contact the Chair of the Science and IT Committee or use the GEP contact form.

  • The GEP provides an established infrastructure and teaching format for engaging large numbers of undergraduates in a research project, providing an immediate broader impact for the scientific work.
  • Participation of GEP students enables the annotation of megabases of DNA, including checking all potential isoforms from a reference species.
  • The annotation process includes a quality control step, called reconciliation, in which multiple independent student annotations of each genome feature are cross-checked and reconciled by expert student annotators, advised by research group members, ensuring high quality curation of all results.
  • A standard training curriculum is in place, and the GEP Curriculum Committee will assist in generating any additional curriculum needed for the project.
  • Standard assessment tools are in place, and the GEP Assessment Committee will assist in generating any additional assessment needed for the project.
  • We anticipate working with Science Partners (SP) who have generated genome assemblies for organisms that have not previously been sequenced, or new genome assemblies that substantially improve on previous versions. Using the G-OnRamp workflow, we have successfully constructed UCSC Assembly Hubs and JBrowse genome browsers for genomes ranging from 120 Mb to 3.9 Gb in size and from 54 to 402,501 scaffolds. In terms of assembly quality, the main criterion is that the scaffolds be long enough to contain mostly full-length genes. The quality required therefore depends on the distribution of gene span lengths in the target species. A rough quality estimate is: contig N50 ≥ 50 kb, scaffold N50 ≥ 2 Mb.
  • A high-quality reference genome in a reasonably closely related organism is strongly recommended. The percent identity between the protein sequences in the reference genome and those in the target genome should be at least 60% at the amino acid level. RNAseq data from the organism (or a very close relative) is also strongly recommended—particularly if the project requires the annotation of untranslated regions. Depending on the research goals, one of these two data types might be sufficient.
  • Scale: a good scientific question for the GEP to tackle should provide sub-projects for ~75 students/year for a minimum of 2 years. A “sub-project” will generally be made up of 1-5 genes requiring annotation or some other student contribution.
  • Other ideas that are not centered on gene annotation are also possible but might require a longer piloting phase and the creation of new curriculum. The GEP is open to exploring new research avenues that extend beyond annotation.
  • The SP defines the scientific goals and specifies the types and amount of student-generated data needed. The SP agrees to carry out subsequent analyses using the GEP student-generated data, to draft the manuscript, and to coordinate submission of the results of the project for publication. It is assumed that the SP has (or is applying for) the required financial resources to accomplish these aims.
  • The SP will define the sub-project acceptance criteria (e.g., output formats, required metadata) and specify the quality control parameters needed for the project. If needed, the SP will participate in training personnel (and/or experienced students) to carry out GEP reconciliation procedures or additional project-specific steps. 
  • The SP is expected to attend monthly GEP Science and IT Committee meetings and to attend the annual GEP National Workshop. 
  • The SP and the GEP Curriculum Committee will work together to produce curriculum to introduce students to the scientific question involved and to generate instructions and exercises for any special skills needed.
  • The GEP Curriculum Committee, and SP if needed, will advise the GEP Assessment Committee on generating appropriate assessment materials (e.g., quiz questions).
  • Intellectual property issues should be discussed prior to the start of the project.
  • The GEP is open to different philosophies regarding publication and authorship eligibility. Plans to include or exclude participating GEP students or GEP faculty from authorship in scientific publications should be discussed and agreed upon before the SP initiates student-led data collection.
  • The SP and GEP must agree on a funding plan to cover any publication costs.
  • The SP and colleagues (including GEP members) will generate the draft manuscript. All co-authors must read and critique the draft, and each must directly approve the final manuscript. The GEP will manage collection of GEP student/faculty critiques and approvals.
  • Both the GEP faculty and the SP can be eligible to be co-authors on any science education publications resulting from GEP assessment of the pedagogical results obtained in association with the project. Consult the Science Education Publication Workflow document for additional details.
  • As appropriate, GEP student annotators and their faculty instructors will also be able to share their annotations or genomic analyses as micropublications.
  • Other potential products beyond traditional publications or micropublications will also be considered (e.g., uploading gene models to a data repository that includes a place for student authorship or a description of student contributions).
  • The SP and GEP will be expected to share the costs of the project. The GEP Science/IT Committee will assist the SP in estimating the costs. The GEP can provide letters to funding agencies outlining the agreed-upon support. The projected costs will vary depending on the complexity of the project but might be covered by funding already secured by the SP and the GEP. In other cases, it will be appropriate for the SP and GEP to submit a joint proposal to an appropriate funding agency.
  • The costs associated with a GEP project include a proportional share of core IT support and stipends to support experienced GEP students to carry out quality control work (reconciling submitted annotations), work often done in the summer. When possible, these costs will be covered by central GEP funding.
  • The costs associated with the SP’s work include any remaining needs for sequencing and assembling the desired genome or for generating ancillary data (e.g., RNAseq data). Budgets should also include support for lab personnel involved in data analyses and manuscript preparation. When possible, these costs will be covered by funding already secured by the SP.
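
For prospective Science Partners who want to check an assembly against the rough quality estimate given in the list above (contig N50 ≥ 50 kb, scaffold N50 ≥ 2 Mb), the following is a minimal Python sketch. It is an illustration only, not a GEP tool: the function names `n50` and `meets_rough_estimate` are hypothetical, and the code assumes you already have lists of contig and scaffold lengths in base pairs (for example, from your assembler's summary output). It uses the common definition of N50: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def meets_rough_estimate(contig_lengths, scaffold_lengths,
                         contig_min=50_000, scaffold_min=2_000_000):
    """Check the rough thresholds: contig N50 >= 50 kb, scaffold N50 >= 2 Mb.

    The default thresholds come from the rough estimate in the text; they are
    a guideline, not a strict acceptance rule.
    """
    return (n50(contig_lengths) >= contig_min
            and n50(scaffold_lengths) >= scaffold_min)


if __name__ == "__main__":
    # Made-up lengths, for illustration only.
    contigs = [60_000, 55_000, 10_000]
    scaffolds = [3_000_000, 2_500_000, 100_000]
    print("contig N50:", n50(contigs))
    print("scaffold N50:", n50(scaffolds))
    print("meets rough estimate:", meets_rough_estimate(contigs, scaffolds))
```

Because the real requirement is that scaffolds contain mostly full-length genes, an assembly that falls short of these numbers may still be suitable for a species with compact genes, and one that passes may not be for a species with very long gene spans.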