Week 9

From LMU BioDB 2015

This journal entry is due on Tuesday, November 3, at midnight PST (Monday night/Tuesday morning).

Overview

The purpose of this assignment is:

  • to prepare for the team Gene Database Final Project by doing a "dry run" of a Gene Database export on a species for which the results are known (Vibrio cholerae).
  • to learn how to perform quality assurance on the exported database, i.e., using metadata and screening procedures to recognize artifacts, incompleteness, or corruption in data sets.

Individual Journal Assignment

  • Store this journal entry as "username Week 9" (i.e., this is the text to place between the square brackets when you link to this page).
  • Link from your user page to this Assignment page.
  • Link to your journal entry from your user page.
  • Link back from your journal entry to your user page.
  • Don't forget to add the "Journal Entry" category to the end of your wiki page.
    • Note: you can easily fulfill all of these links by adding them to your template and then using your template on your journal entry.
  • For your assignment this week, you will keep an electronic laboratory notebook on your individual journal entry page for this week. An electronic laboratory notebook records all the manipulations you perform on the data and the answers to the questions throughout the protocol. Like a paper lab notebook found in a wet lab, it should contain enough information so that you or someone else could reproduce what you did using only the information from the notebook. As a reminder, be sure to answer any questions embedded in the protocol in your journal page and make sure to upload and link to your completed files on your individual journal page.

Homework Partners

For this week, the homework partners will be:

  • Mary Alverson, Nicole Anguiano
  • Ronald Legaspi, Kristen Zebrowski
  • Brandon Klein, Brandon Litvak
  • Josh Kuroda, Kevin Wyllie
  • Mahrad Saeedi, Veronica Pacheco
  • Lena Olufson, Emily Simso
  • Trixie Roque, Anu Varshneya
  • Jake Woodlee, Erich Yanoschik

Tuesday, October 27: Exporting a Vibrio cholerae Gene Database

Thursday, October 29: Performing Quality Assurance on the Vibrio cholerae Gene Database

  • Using the Vibrio cholerae GenMAPP Gene Database that you exported, perform the steps described on this page for your respective XML, PostgreSQL, and Microsoft Access data, and record the data in your Gene Database Testing Report.
    • Note that, for these counts, you will need to gain some familiarity with the ID systems.
    • You will also need to be able to form a correct pattern for expressing these IDs.
  • Document the following information:
    1. List the assorted counts produced by each method for their corresponding data sources:
      • Tally Engine
      • xmlpipedb-match
      • PostgreSQL
      • Microsoft Access
    2. Highlight any discrepancies, and offer hypotheses for why these discrepancies may have occurred.
    3. State the ID pattern(s) that you used for performing any counts that rely on such patterns.
  • You do not need to restrict yourself solely to the commands and queries listed in How Do I Count Thee? Let Me Count The Ways. “Poking around” further to get more information or to test any conjectures is strongly encouraged. If you do any other explorations, document them in your electronic notebook.
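
As a rough illustration of the documentation steps above, the following Python sketch counts the IDs in a piece of text that match a pattern, then flags any counting method whose result disagrees with the others. Everything in it is an assumption made for illustration: the regular expression is a hypothetical stand-in for whatever ID pattern you decide on, the function names are invented here, and the counts in the usage note are made-up placeholders. It does not reproduce the behavior of the Tally Engine or xmlpipedb-match.

```python
import re
from collections import Counter

# Hypothetical ID pattern (an assumption, not taken from the assignment).
# Replace it with the pattern you actually use and record in your report.
ID_PATTERN = re.compile(r"\bVC_?A?\d{4}\b")


def count_ids(text, pattern=ID_PATTERN):
    """Return (total matches, distinct matches) for the pattern in text."""
    matches = pattern.findall(text)
    return len(matches), len(set(matches))


def find_discrepancies(counts_by_method):
    """Return the methods whose count differs from the most common count."""
    if not counts_by_method:
        return {}
    # The most frequently reported count is treated as the consensus value.
    consensus, _ = Counter(counts_by_method.values()).most_common(1)[0]
    return {
        method: count
        for method, count in counts_by_method.items()
        if count != consensus
    }
```

For example, `count_ids("VC0001 VC_0001 VCA0002 VC0001")` reports four matches, three of them distinct, which is the kind of total-versus-unique difference worth noting in your report. Calling `find_discrepancies({"XML": 10, "PostgreSQL": 10, "Access": 9})` with these placeholder numbers singles out the Access count as the one to explain.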

Shared Journal Assignment

  • Store your journal entry in the shared Class Journal Week 9 page. If this page does not exist yet, go ahead and create it (congratulations on getting in first :) )
  • Link to your journal entry from your user page.
  • Link back from the journal entry to your user page.
    • NOTE: you can easily fulfill the links part of these instructions by adding them to your template and using the template on your user page.
  • Sign your portion of the journal with the standard wiki signature shortcut (~~~~).
  • Add the "Journal Entry" and "Shared" categories to the end of the wiki page (if someone has not already done so).

Reflection

Reflect on working in teams either in this class or in previous classes.

  • What kinds of characteristics do you want in your teammates?
  • What kinds of things make teams go smoothly?
  • What kinds of things make teams not go so smoothly?

Project Team Requests

  • In class on November 3, the teams will be assigned for the Final Class Project.
  • There will be four teams of four students.
  • Ultimately the teams will be assigned by Drs. Dahlquist and Dionisio based on balancing areas of expertise among the team members. However, we would like to gather your input into the formation of the teams. At this point in the semester, you have worked with 10-11 other people in the class as partners on journal assignments. Please send an e-mail to both Dr. Dahlquist and Dr. Dionisio by midnight, Saturday, October 31 (Friday night/Saturday morning; Halloween!) giving us the names of three other people in the class whom you want to be on your team (you can also choose people who haven't been your journal partner yet) and one person whom you prefer not to be on your team. Please give us a short explanation as to why you are choosing those particular people. We will consider all team requests, but we cannot guarantee that you will be matched with the people you picked. These e-mails will remain confidential to the instructors and will not be shared with your classmates.
  • Also on November 3, the species will be assigned for the Final Class Project. The assignment of species will be decided in class between the teams and the instructors after the teams have been announced. So that you can start thinking about it before next week, here are a couple of the species under consideration by the instructors (check back for additions to this list):
    • Shewanella oneidensis
    • Shigella flexneri
  • You may propose a different species than one of the above, as long as it fulfills the following criteria:
    1. The genome is completely sequenced and has been published in a journal article
    2. The complete proteome data set (UniProt XML and GOA) is available through the UniProt Complete Proteomes page and UniProt-GOA Downloads page, respectively.
    3. At least one DNA microarray experiment has been performed on this species, the data have been published in a journal article, and the complete dataset is publicly available and uses an ID system present in the UniProt XML. Sources of DNA microarray data include: