Difference between revisions of "Quality Assurance"

From LMU BioDB 2015
(Provide overview and milestone names.)
m (Add dropped italics.)

Revision as of 00:41, 2 November 2015

Gene Database Project Links
Overview Deliverables Reference Format Guilds Project Manager GenMAPP User Quality Assurance Coder
Teams Heavy Metal HaterZ The Class Whoopers GÉNialOMICS Oregon Trail Survivors

The Quality Assurance team member is the resident expert on species ID systems and formats. He or she should be proficient with XMLPipeDB Match, SQL queries in PostgreSQL, Microsoft Excel, and Microsoft Access in order to navigate through the data, find missing IDs and discrepancies, and perform sanity checks.

Guild Members

  • Species 1:
  • Species 2:
  • Species 3:
  • Species 4:

Milestones

Milestone 1: Initial Database Export

  1. (with Coders) Complete a full import-export cycle.
  2. (with Coders) Decide on a file/version management scheme/system.
  3. Learn the ID systems:
    • Systems that are the same for each species (hint: guild members help each other out by posting the relevant information on this page)
      • UniProt
      • RefSeq
      • GeneID (EntrezGene from NCBI)
      • GO
    • The OrderedLocusNames for your species
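
As a starting point for learning the shared ID systems, the sketch below classifies an identifier by its format. The patterns are assumptions based on the publicly documented accession formats (UniProt accessions, RefSeq accessions with a two-letter prefix and underscore, numeric GeneIDs, and seven-digit GO terms); they are not taken from this page and should be checked against the actual data for your species. The names `ID_PATTERNS` and `classify_id` are hypothetical.

```python
import re

# Assumed patterns based on the publicly documented ID formats.
# Verify each one against the real data before relying on it.
ID_PATTERNS = {
    "UniProt": re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
    ),
    # Two-letter prefix, underscore, digits, optional ".version" suffix.
    "RefSeq": re.compile(r"[A-Z]{2}_[0-9]+(?:\.[0-9]+)?"),
    # NCBI GeneIDs are plain integers.
    "GeneID": re.compile(r"[0-9]+"),
    # GO terms are "GO:" followed by seven digits.
    "GO": re.compile(r"GO:[0-9]{7}"),
}


def classify_id(identifier):
    """Return the names of all ID systems whose pattern matches the whole string."""
    return [
        name
        for name, pattern in ID_PATTERNS.items()
        if pattern.fullmatch(identifier)
    ]
```

An identifier that returns an empty list matches none of the shared systems, which makes it a candidate for the species-specific OrderedLocusNames or for closer inspection.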

Milestone 2: ID Pattern Definition and Verification

  1. Define and test regular expression patterns that detect the IDs (for filtering, then counting).
    • XMLPipeDB Match utility
    • Direct SQL queries in PostgreSQL
    • For example, the Vibrio IDs were of the form VC#### or VC_####; how would you express that in Match or as an SQL query?
    • Table inspection/filtering/sorting in Microsoft Access
    • If needed, side-by-side sorted comparisons in Microsoft Excel (as described here)
  2. Document/log all work done, problems encountered, and how they were resolved.
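
The Vibrio example above (VC#### or VC_####) can be sketched in Python before translating it to Match or SQL. This is a minimal illustration, assuming each # stands for exactly one digit; the function name `tally_ids` and the sample values in the usage note are hypothetical, not real export data.

```python
import re

# Assumption: each '#' in VC#### / VC_#### is exactly one digit,
# and the underscore is optional.
VIBRIO_ID = re.compile(r"VC_?[0-9]{4}")


def tally_ids(values):
    """Filter values against the Vibrio pattern, then count them."""
    matched = [v for v in values if VIBRIO_ID.fullmatch(v)]
    unmatched = [v for v in values if not VIBRIO_ID.fullmatch(v)]
    return {
        "matched": len(matched),               # every matching row
        "unique": len(set(matched)),           # distinct matching IDs
        "with_underscore": sum("_" in v for v in matched),
        "unmatched": unmatched,                # candidates for follow-up
    }
```

For instance, `tally_ids(["VC0001", "VC_0001", "VC0001", "VCA0002"])` counts three matches, two of them distinct, and leaves `VCA0002` in the unmatched list for inspection. In PostgreSQL, a comparable filter could likely be written with the `~` regular-expression operator and an anchored pattern such as `^VC_?[0-9]{4}$`, applied to whichever table and column hold the IDs.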

Milestone 3: Tally Engine Configuration

Along with your Coder, customize the Tally Engine setup for your species as specified in these coder steps. You will want to add, at the very least, the ordered locus IDs for your species.

Milestone 4: Final Documentation

  1. Document the relational database schema for the gene database.
  2. Create the ReadMe with comparisons to the model organism database (MOD) for your species.