Stephen Louie Project Notebook

From LMU BioDB 2013

Week 12

11/12/2013

  • Gave presentation for Genome paper to class

11/14/2013

  • Met with the guilds. No meeting was held for Quality Assurance.
  • Sat in on GenMAPP builder guild meeting for absent teammate
  • Downloaded and extracted data source files with Mitchell
    • UniProt XML
      • Followed directions provided Here
    • GOA
      • Note: The current directions were not working. Follow these instructions for your respective species.
      • From the Running GenMAPP Builder page, clicked the UniProt-GOA Downloads link.
      • Was given an error message. Changed the beginning of the URL from "ftp" to "http".
      • Once the modified URL was entered, was taken to Index of /pub/databases/GO/goa
      • Clicked on the "proteomes" folder
      • Directed to Index of /pub/databases/GO/goa/proteomes. Downloaded 58.R_meliloti.goa
      • Note: R. meliloti is an alternative name for S. meliloti.
    • GO OBO-XML
      • Followed directions provided Here
  • Created new database in PostgreSQL
    • Followed directions provided Here
  • Imported data into PostgreSQL
    • Followed directions provided Here
    • UniProt XML took 19.17 minutes
    • GO OBO-XML took 17.81 minutes to import and 15.54 minutes to process
    • GOA file took less than a minute
  • Exported Gene Database
    • Followed directions provided Here
    • Export took ~8 hours
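
The GOA download step above worked only after the link's scheme was changed from "ftp" to "http". A minimal sketch of that substitution in Python follows. The host name is a made-up placeholder (the notes do not record the full address), and `ftp_to_http` is a hypothetical helper, not part of GenMAPP Builder.

```python
from urllib.parse import urlsplit, urlunsplit


def ftp_to_http(url: str) -> str:
    """Return url with an ftp scheme swapped for http; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme.lower() != "ftp":
        return url
    # SplitResult is a namedtuple, so _replace builds a copy with a new scheme.
    return urlunsplit(parts._replace(scheme="http"))


# Placeholder host (assumption); only the path comes from the notes above.
example = "ftp://ftp.example.org/pub/databases/GO/goa/proteomes/"
print(ftp_to_http(example))
```

Whether the server answers over http still has to be checked in the browser; the function only rewrites the address.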

Week 13

11/19/2013

  • Conducted a side-by-side comparison of the GeneIDs in the .gdb file and the microarray data
    • For the .gdb file, used MS Access
      • Under the "Orderedlocusnames" tab, GeneIDs appeared as RB####.
    • Downloaded microarray data from the wiki page. (Used Mitchell's draft version of the 0.3 M NaCl results)
      • Opened the file in Excel in tab-delimited format. GeneIDs appeared as SM.#####
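
The side-by-side comparison above was done by eye in Access and Excel. As a rough sketch, the same check could be scripted once both ID columns are exported to plain lists. The IDs below are made-up placeholders that only imitate the two formats noted above, and `compare_ids` and `prefix_counts` are hypothetical helper names.

```python
import re
from collections import Counter


def compare_ids(gdb_ids, array_ids):
    """Report which IDs are shared and which appear in only one source."""
    gdb_set, array_set = set(gdb_ids), set(array_ids)
    return {
        "matched": sorted(gdb_set & array_set),
        "only_in_gdb": sorted(gdb_set - array_set),
        "only_in_array": sorted(array_set - gdb_set),
    }


def prefix_counts(ids):
    """Count the leading run of letters in each ID to expose format differences."""
    counts = Counter()
    for gene_id in ids:
        match = re.match(r"[A-Za-z]+", gene_id)
        counts[match.group(0) if match else ""] += 1
    return counts


# Placeholder IDs (assumption), not real gene identifiers.
gdb_ids = ["RB0001", "RB0002"]
array_ids = ["SM.00001", "SM.00002"]

report = compare_ids(gdb_ids, array_ids)
print(len(report["matched"]), "matched")
print(dict(prefix_counts(gdb_ids)), dict(prefix_counts(array_ids)))
```

A prefix tally like this makes a systematic naming mismatch visible before any data is loaded into GenMAPP.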

11/21/2013

  • Ran a preliminary sanity check
    • Used GenMAPP to analyze microarray data to see if there was a discrepancy in the gene IDs between the microarray data and the GenMAPP database.
      • Followed instructions provided [here]
      • After the first run, the conversion yielded over 20,000 errors with no matches

[Image: ExpressionDatasetManagerscreencap.PNG]

  • In the microarray data, the third character of each Gene ID is a lowercase letter. To test whether this caused the discrepancy, the ID of one sample was changed so that its third character was capitalized.
    • After the second run, the modification to the Gene ID produced no change in the number of errors or matches
  • Viewing the .gdb data in Microsoft Access showed that the Orderedlocusnames IDs use the format R.#### (Rhizobium meliloti was the former name of the species).
    • R will be used in place of SM to see whether that makes any substantial difference.
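
The two ID modifications tried above (capitalizing the third character, and replacing the SM prefix with R) can be sketched as small string helpers. This is only an illustration under stated assumptions: the sample IDs are made-up placeholders, and `capitalize_third` and `swap_prefix` are hypothetical names, not GenMAPP functions.

```python
def capitalize_third(gene_id: str) -> str:
    """Uppercase the third character of an ID, leaving the rest untouched."""
    if len(gene_id) < 3:
        return gene_id
    return gene_id[:2] + gene_id[2].upper() + gene_id[3:]


def swap_prefix(gene_id: str, old: str = "SM", new: str = "R") -> str:
    """Replace a leading `old` prefix with `new`; IDs without it are unchanged."""
    if gene_id.startswith(old):
        return new + gene_id[len(old):]
    return gene_id


# Placeholder IDs (assumption), not real gene identifiers.
print(capitalize_third("SMx00000"))
print(swap_prefix("SM.00000"))
```

Applying a helper to the whole ID column, rather than editing one row by hand, would show at once whether a given change affects the match count.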

Week 14

11/26/2013

  • Opened XML file in XML editor.
    • Located GeneID under the "ORF" tag
    • Format of GeneID is SM.######
  • Note: The Gene ID tag in the XML file is irrelevant. The tag is for an external link
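
The lookup above was done in an XML editor. A minimal sketch of the same lookup with Python's standard library follows, assuming that the "ORF" tag corresponds to a `<name type="ORF">` element inside `<gene>` in the UniProt XML namespace. The sample document and its values are placeholders, and `gene_names_by_type` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace used by UniProt XML entries.
NS = {"up": "http://uniprot.org/uniprot"}

# Tiny stand-in for the downloaded file; all values are placeholders.
SAMPLE = """<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <accession>X00000</accession>
    <gene>
      <name type="ordered locus">PLACEHOLDER_LOCUS</name>
      <name type="ORF">PLACEHOLDER_ORF</name>
    </gene>
  </entry>
</uniprot>"""


def gene_names_by_type(xml_text: str, name_type: str) -> list:
    """Return the text of every gene <name> element with the given type attribute."""
    root = ET.fromstring(xml_text)
    path = f".//up:gene/up:name[@type='{name_type}']"
    return [element.text for element in root.findall(path, NS)]


print(gene_names_by_type(SAMPLE, "ORF"))
print(gene_names_by_type(SAMPLE, "ordered locus"))
```

For the full download, which is large, `ET.iterparse` would likely be a better fit than loading the whole file into memory at once.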

External Links

User Page
Team Page