Eyanosch Week 14

From LMU BioDB 2015
Jump to: navigation, search

Erich Yanoschik

Erich Yanoschik

Electronic lab notebook

Dividing the Statistical analysis by the two types of drugs (Kristin is doing RX and I am doing RP) Since we performed the same tasks for the two different types of drugs, I borrowed Kristin's electronic lab notebook layout and some information.

Corrections from talk page

  • Renamed the columns by replacing LR with LogFC, Re-named Sheet 1 "MasterSheet".
  • Renamed column A "ID"
  • Inserted a ccolumn to the left of Column B and re-named it "MasterIndex". Typed "1" and cell B2 and 2 and Cell B3 and selected both cells. Filled in the column with numbers 1-4224
  • Selected the data and sorted from A to Z on the "ID" column. Any rows that had "Empty" or "Blank#### IDs were deleted
  • Sorted by the MasterIndex column to put the IDs back in order from 1 to 3926(new total # of files)
  • 585 cells with "Error" messages replaced with nothing (left them empty)
  • copied into Sheet 2 that was named "scalingCentering'

Followed the instructions from Sample Microarray Analysis Vibrio cholerae.

Normalizing the log ratios for the set of slides in the experiment

  • Copied all the data from MasterSheet and named cell A2 "Average" and named cell A3 "StdDev".
  • In cell C2 I typed =AVERAGE(C4:C3929) and in cell C3 I typed =STDEV(C4:C3929). I pressed enter and copied this equation across the rest of the columns through column AL.
  • I then copied the column headings for all data columns and pasted them to the right of the last column, I renamed each column with "_Scaled_Centered" at the end
  • In cell AM4 I typed =(C4-C$2)/C$3 indicating that I wanted data in cell C4 to have the average subtracted from it and then to divide it by the standard deviation.
  • I then copy and pasted that equation across the entire column. I copied and pasted this equation for each column of the data.

Perform statistical analysis on the ratios

  • Created a new sheet called "Statistics" and copied the information over
  • I copied and pasted the ID column from the ScalingCentering worksheet into the first column of the new worksheet, copied all Scaled_Centered columns from the ScalingCentering worksheet, pasted the values into column B1 of the new sheet, and deleted "Average" and "StDev" columns
  • Here is when we divided the work, Kristin worked on RX and I worked on RP. (I left the RX data in my excel sheet but only worked on RP)
  • Inserted column to the right and typed headers Avg_LogFC RP-0.5-10, Avg_LogFC RP-0.5-30, Avg_LogFC RP-0.5-60, Avg_LogFC RP-1-10, Avg_LogFC RP-1-30, and Avg_LogFC RP-1-60 into the top cells of the next 6 columns.
  • Computed the average log fold change per treatment and time period by typing the following equations: =AVERAGE(T2:V2), =AVERAGE(AB2:AD2), =AVERAGE(AJ2:AL2), =AVERAGE(AR2:AT2)), =AVERAGE(AZ2:BB2), and =AVERAGE(BH2:BJ2) below each corresponding Avg_LogFC column. Copied and pasted the equations throughout the entire column.
  • I then inserted a new column to the right and named it "Avg_LogFC_all" and typed the equation =AVERAGE(T2:Y2).
  • I inserted a new "Tstat" column next to Avg_LogFC. Into column AA2 I typed =AVERAGE(BE2:BG2)/(STDEV(BE2:BG2)/SQRT(6)), indicating 6 replicates. I copied this equation into the whole column. I also used this format for the Tstats of the other RP trials.
  • I inserted a new column and called it "Pvalue". I entered the equation =TDIST(ABS(BI2),5,2) for BI2, the T-stat column and 5 degrees of freedom.

Calculate Bonferroni p value correction

  • Inserted a new column to the right labeled Bonferroni_Pvalue (did this twice)
  • In the first Bonferroni_Pvalue column I typed =BJ2*3928 and copied this equation for the entire column.
  • In order to replace any corrected P value that was greater than 1 with the number 1, I typed the equation =IF(BK2>1,1,BK2) and pasted it through the entire second Bonferroni_Pvalue column.
  • I then saved the most updated version of the file to my ThawSpace and completed the work on thursday.

Analysis corrections

After uploading my 12/1 file to the OTS Files page and consulting with Dr. Dahlquist, I realized that I needed to make several corrections. Originally, I calculated the T-stat, Pvalue, Bonferroni P value, and adjusted Bonferroni P value using a LogFC_all column, which was incorrect. Because this experiment involved multiple treatments (of which I was analyzing 6 for the RX samples) I needed to calculate those four things for all 6 treatments (RX-0.5-10, RX-0.5-30, RX-0.5-60, RX-1-10, RX-1-30, and RX-1-60). I inserted columns for all four items for all 6 treatments I was looking at, and following the same procedure as before, I calculated the T-stat, P value, Bonferroni P value, and adjusted Bonferroni P value for each treatment. Here are some sample equations for the RX-0.5-10 treatment:

  • Column Z named Tstat_LogFC RX-0.5-10: =AVERAGE(B2:D2)/(STDEV(B2:D2)/SQRT(3)), denoting that I want a T statistic representative of all three RX-0.5-10 replicates.
  • Column AA named Pvalue_LogFC RX-0.5-10: =TDIST(ABS(Z2),2,2), indicating that i wanted the P value calculated from the T statistic column.
  • Column AB named Bonferroni_Pvalue_LogFC RX-0.5-10: =AA2*3926
  • Column AC named Bonferroni_Pvalue_LogFC RX-0.5-10: =IF(AB2>1,1,AB2) in reference to the first Bonferroni P value column.

Dr Dahlquist also made a change to the file where she replaced all of the nothing boxes with a space/character/space. The amount of replacements was still the same as before but this was done to insure that the analysis was as accurate as possible. I proceeded to do a sanity check before moving on to the Benjamini-Hochberg corrections.

  • This was taken from Kristens page, she talked with Dr. Dhalquist about the necessary corrections to complete our portion of the project.

Prepare file for GenMAPP

  • I inserted a new worksheet and named it "forGenMAPP."
  • I selected all from the Statistics worksheet and pasted the values on the new sheet.
  • I selected all fold changes and changed them to 2 decimal places by selecting Format > Cells.
  • I selected all columns with p values including the T stat column and changed them to 4 decimal places by selecting Format > Cells.
  • I deleted the left-most Bonferroni p value columns (the ones without the "IF" statements) from each treatment.
  • I inserted a column to the right of the ID column and named it "SystemCode". I filled the whole column with the letter N by typing N in the first cell after the heading and double-clicking on the little black cross.
  • I selected File > Save As "Text (Tab-delimited) (*.txt). I clicked through the different warnings and uploaded both files to the team's Wiki page. (CompiledRaw_data_GenMAPP_ready)

Sanity Check

Next I performed a sanity check, I opened the "forGenMAPP" tab and selected "Custom Filter" from the filter options. I then performed the following procedure for all 6 RX treatments.

  • Filtered P value column by less than 0.05, 0.01, and 0.001. Recorded results and percentages out of 3,926.
  • Filtered Bonferroni p value column by less than 0.05 and recorded results and percentages out of 3,926. The table can be located on my updated excel file from 12/3 found on the team Wiki.

The RP tests that showed the most gene manipulations were the 1 MIC at 60 minutes and 30 minutes with 64.32% and 64.34% respectively. The next closest was RP at 0.5 MIC at the 30 minute marker with 41.14%.

Running GenMAPP and MAPPFinder

  • upon running genmapp and producing the GO tree. A runtime error occurs ("Run-time error '53': File not found") when clicking on the GO terms.

Instructions from Dr. Dahlquist

  • Increased LogFC>0.25 and p<0.05
  • Decreased LogFC<-0.25 and p<0.05
  • Do Benjamini-Hochberg correction for the treatment with the most significant genes out of all 6
  • need to do this by tomorrow (the Benjamini-Hochberg correction)

All the current files are located under RP (Erich) of the Wiki page

Class Journals

Class Journal Week 1

Class Journal Week 2

Class Journal Week 3

Class Journal Week 4

Class Journal Week 5

Class Journal Week 6

Class Journal Week 7

Class Journal Week 8

Class Journal Week 9

Class Journal Week 10

Class Journal Week 11

Weekly Assignments

Week 1

Week 2

Week 3

Week 4

Week 5

Week 6

Week 7

Week 8

Week 9

Week 10

Week 11

Week 12

Week 13

Personal Journal

Eyanosch Week 2

Eyanosch Week 3

Eyanosch Week 4

Eyanosch Week 5

Eyanosch Week 6

Eyanosch Week 7

Eyanosch Week 8

Eyanosch Week 9

Eyanosch Week 10

Eyanosch Week 11

Eyanosch Week 12

Eyanosch Week 13

Eyanosch Week 14

Eyanosch Week 15

Electronic Notes (E-notes)

Week 3 E-notes Eyanosch

Week 4 E-notes Eyanosch

Week 5 E-notes Eyanosch

Week 6 E-notes Eyanosch

Week 7 E-notes Eyanosch

Week 8 E-notes Eyanosch

Week 9 E-notes Eyanosch

Week 10 E-notes Eyanosch

Week 11 E-notes Eyanosch

Week 12 E-notes Eyanosch

Week 13 E-notes Eyanosch

Week 14 E-notes Eyanosch

Week 15 E-notes Eyanosch


Class (personal) Notes

Week 3 Notes

Week 4 Notes

Week 5 Notes

Week 6 Notes

Week 7 Notes

Week 8 Notes

Week 9 Notes

Week 10 Notes

Week 11 Notes

Week 12 Notes

Week 13 Notes