Emilysimso Week 14

From LMU BioDB 2015

Goal

  • GenMAPP Users should complete the statistical analysis of the microarray data, import the data into GenMAPP, and run MAPPFinder.

Initial Procedure

  • Created a new CompiledRawData spreadsheet
    • Column 1 = ID
    • Column 2 = MasterIndex (has numbers 1-10819)
  • Sheet 1 was labeled Compiled Raw Data
    • Took the log data from the 1, 5, 20, and 60 time points - had 16 total data columns (four for each time point)
  • Created MasterSheet in sheet 2
    • Sorted the data A-Z by ID, then deleted 704 rows because they were controls (IDs started with Blank, blank, gDNA, NC-, or ORF)
    • 1333 cells had the error #NUM!
      • Replaced these with nothing so that the cells are blank
  • Created ScalingCentering in sheet 3
    • Copied over data from MasterSheet
    • Added two extra rows on top of data for Average and Standard Deviation
  • The average and standard deviation for some of the columns showed divide-by-zero errors
    • Had to delete the #DIV/0! values that came up for some of the genes in these columns
    • Blanked the cells containing this error by searching for "DIV" with the replace function
      • C1_rep2 - 2 replacements
      • C5_rep4 - 1 replacement
      • C20_rep2 - 2 replacements
      • C20_rep3 - 1 replacement
      • C20_rep4 - 1 replacement
      • C60_rep1 - 2 replacements
      • C60_rep2 - 1 replacement
  • For the scaled and centered columns in the ScalingCentering tab - used the equation =(C4-C$2)/C$3 for the first column (the value minus the column average in row 2, divided by the column standard deviation in row 3)
  • Had to run the data through a script to get rid of duplicates
  • Calculated the averages for the technical replicates for each of the time points
  • Created a new tab labeled Statistics
    • Copied the technical averages into this tab
  • Calculated the biological averages for each time point
  • Calculated the Average Log Ratios by subtracting the value for C0 from each of the C points and subtracting C60 from each of the F points
  • To calculate the p-values - had to compare the C values to C1 and the F values to C60
    • Used TTEST(array 1, array 2, 2, 3), which is a two-tailed test assuming unequal variance
  • Created new Sheet - Bonferroni_Pvalue
    • Calculated the Bonferroni p-values for each of the comparisons
  • Created new Sheet - B-H_Pvalue
    • Calculated the Benjamini and Hochberg p-value for each of the comparisons
  • Created new Sheet - forGenMAPP
    • Copied columns A-AP from the Statistics worksheet
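
The scaling and centering formula and the two p-value corrections listed above can be sketched in Python. This is only a minimal sketch, not the spreadsheet that was actually used: the function names are made up for illustration, the standard deviation is assumed to be the sample standard deviation (what Excel's STDEV computes), and the Benjamini and Hochberg function uses the standard step-up adjustment, which may differ in detail from the spreadsheet formulas.

```python
from statistics import mean, stdev


def scale_and_center(column):
    """Mirror =(C4-C$2)/C$3: subtract the column average, divide by its
    standard deviation. Blank cells are represented as None and stay blank."""
    values = [x for x in column if x is not None]
    avg = mean(values)
    sd = stdev(values)  # assumption: sample standard deviation
    return [None if x is None else (x - avg) / sd for x in column]


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capping at 1."""
    n = len(pvalues)
    return [min(p * n, 1.0) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Standard step-up adjustment: p * n / rank, made monotone from the
    largest p-value downward. Assumes no missing p-values."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

Each function takes one column at a time, so it would be applied once per data column (for scaling and centering) or once per comparison (for the corrections).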

Edits to Match with Ron

  • Took Compiled_Raw_Data from Ron's spreadsheet
  • Created a Master_Sheet
  • Deleted 3011 #NUM! error cells
  • Deleted 704 rows that started incorrectly
  • Created Scaling_Centering sheet
  • Replaced the 32 cells containing #DIV/0! with the text 'error'
  • Uploaded new updated file (see below)
  • Received split data from Dr. Dahlquist
  • Averaged the two split data points for each replicate at each time point (4 each for C0, C5, etc.)
  • Copied these averages to a new sheet labeled Statistics
  • Averaged the biological replicates for each time point
  • Calculated the average log ratio for C5-C0, C20-C0, C60-C0, F5-C60, F20-C60, and F60-C60
  • Ran a t-test for each of the above comparisons
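
The last two steps above (average log ratio and t-test) can be sketched as follows. This is a rough illustration under assumptions: the function names are hypothetical, the inputs are the replicate log values for one gene at two time points, and the t-test is assumed to be the unequal-variance (Welch) test, matching the TTEST(..., 2, 3) setting used in the initial procedure. Only the t statistic and degrees of freedom are computed here, because the Python standard library has no t-distribution for the p-value.

```python
from math import sqrt
from statistics import mean, variance


def average_log_ratio(treated, baseline):
    """Average of the treated replicates minus average of the baseline
    replicates, e.g. C5 - C0 or F5 - C60."""
    return mean(treated) - mean(baseline)


def welch_t(a, b):
    """Unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up replicate values for one gene, for illustration only.
c0 = [1.0, 2.0, 3.0, 4.0]
c5 = [2.0, 4.0, 6.0, 8.0]
ratio = average_log_ratio(c5, c0)
t_stat, dof = welch_t(c5, c0)
```

For the two-tailed p-value itself, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs the same unequal-variance test, if SciPy is available.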

Files


Weekly Assignment Information

User: Emilysimso

Group Project

Heavy Metal HaterZ