Kzebrows Week 12

From LMU BioDB 2015

Electronic Lab Notebook

First, I accessed the microarray data from [ArrayExpress] and downloaded the files labeled "E-GEOD-32978.sdrf.txt" and "E-GEOD-32978.raw.1.zip".

We then analyzed the data by opening the sdrf.txt file. Examining this file made it apparent that no dye swaps had been performed. We compared the Comment [Sample_Title] column with the Label and FactorValue[TREATMENT] columns and found that each sample/control combination was repeated three times, for a total of 72 samples, but every control was labeled with Cy3 dye and every treated sample was labeled with Cy5 dye. No sample was hybridized with the dyes reversed, which is improper technique because the two dyes have different affinities. This could potentially skew the results, but we moved forward with our analysis.
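The dye-swap check we did by eye could also be sketched in Python. This is only a rough sketch, not what we actually ran: the function names are hypothetical, the demo rows are made up, and it assumes the sdrf.txt file is tab-delimited with the Label and FactorValue[TREATMENT] column headers spelled exactly as shown.

```python
import csv
import io
from collections import defaultdict


def labels_by_treatment(sdrf_text, label_col="Label",
                        treatment_col="FactorValue[TREATMENT]"):
    """Map each treatment value to the set of dye labels it appears with."""
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    seen = defaultdict(set)
    for row in reader:
        seen[row[treatment_col]].add(row[label_col])
    return dict(seen)


def has_dye_swap(sdrf_text):
    """True if any treatment value was labeled with more than one dye."""
    return any(len(dyes) > 1
               for dyes in labels_by_treatment(sdrf_text).values())


# Made-up rows for illustration only; not copied from E-GEOD-32978.sdrf.txt.
DEMO_SDRF = (
    "Source Name\tLabel\tFactorValue[TREATMENT]\n"
    "sample 1\tCy5\ttreated\n"
    "sample 2\tCy3\tcontrol\n"
    "sample 3\tCy5\ttreated\n"
    "sample 4\tCy3\tcontrol\n"
)

print(labels_by_treatment(DEMO_SDRF))
print(has_dye_swap(DEMO_SDRF))
```

If every treatment value maps to exactly one dye, as in the demo rows, no dye swap is present in the design.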

Next, I unzipped the raw data file, which expanded to 36 files. Each file contains one treated sample and one control hybridized together. We eliminated extraneous columns, leaving only Column E ("ID") and Column AS ("Log Ratio 635/535"). Because every file uses these same column titles, Column E was re-labeled with the sample name and replicate number. Column AS received the same label prefixed with "LR" to mark it as the log ratio column for that hybridization. The columns therefore looked like this:

  • RX-0.5-10-rep1
  • LR RX-0.5-10-rep1
  • RX-0.5-10-rep2
  • LR RX-0.5-10-rep2

and so on. I processed all of the RX samples (18 files) and Erich processed all of the RP samples (18 files). We created a table, found here, showing which samples correspond to which files. There are 36 files, but each file corresponds to one hybridized treatment/control pair; e.g., GSM815858.gpr corresponds to GSM815858 1 (treatment) and GSM815858 2 (control).
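The column trimming and re-labeling was done by hand in Excel, but the same steps could be sketched in Python. This sketch assumes the .gpr file is tab-delimited text with some header lines before the column-title row; the function name is hypothetical, the demo text is made up, and the log ratio column title should be copied exactly from the real file.

```python
import csv
import io


def extract_log_ratios(gpr_text, sample_label, ratio_col, id_col="ID"):
    """Keep only the ID and log ratio columns and re-label them by sample."""
    rows = csv.reader(io.StringIO(gpr_text), delimiter="\t")
    # Skip header lines until the row that holds the column titles.
    for fields in rows:
        if id_col in fields and ratio_col in fields:
            id_idx = fields.index(id_col)
            ratio_idx = fields.index(ratio_col)
            break
    else:
        raise ValueError("column title row not found")
    # First output row is the new pair of labels, e.g. X and "LR X".
    out = [(sample_label, "LR " + sample_label)]
    for fields in rows:
        if len(fields) > max(id_idx, ratio_idx):
            out.append((fields[id_idx], fields[ratio_idx]))
    return out


# Made-up file contents for illustration only; not a real .gpr file.
DEMO_GPR = (
    "ATF\t1.0\n"
    "2\t4\n"
    "Type=example\n"
    "Note=made-up data\n"
    "Block\tID\tName\tLog Ratio 635/535\n"
    "1\tSF0001\tgeneA\t0.52\n"
    "1\tSF0002\tgeneB\t-1.10\n"
)

table = extract_log_ratios(DEMO_GPR, "RX-0.5-10-rep1", "Log Ratio 635/535")
for row in table:
    print(row)
```

Looking the columns up by title rather than by position (E and AS) means the sketch would still work if a file had its columns in a different order.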

We then combined all files into one master data file by copying and pasting the spreadsheets into one document, and uploaded it to the team's file page, found here.
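The copy/paste step amounts to placing each two-column table next to the others. A minimal Python sketch of that idea follows; the function name and the demo values are hypothetical, and each table is assumed to be a list of (ID, log ratio) pairs with its label row first.

```python
from itertools import zip_longest


def paste_side_by_side(tables, width=2):
    """Place tables next to each other, padding shorter ones with blanks."""
    blank = ("",) * width
    return [
        [cell for pair in row for cell in pair]
        for row in zip_longest(*tables, fillvalue=blank)
    ]


# Made-up values for illustration only.
rep1 = [("RX-0.5-10-rep1", "LR RX-0.5-10-rep1"),
        ("SF0001", "0.52"),
        ("SF0002", "-1.10")]
rep2 = [("RX-0.5-10-rep2", "LR RX-0.5-10-rep2"),
        ("SF0001", "0.48")]

master = paste_side_by_side([rep1, rep2])
for row in master:
    print("\t".join(row))
```

Like the manual paste, this keeps each file's own ID column and does not line rows up by gene ID, so it only gives aligned rows if every file lists the IDs in the same order.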

To verify that the gene IDs in the data match the species and strain being used to create the .gdb, we consulted with the QA person and Coder. We confirmed that the gene IDs were in the format SF####, which is consistent with the strain being used to create the .gdb.
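A format check like this could also be done with a regular expression. The sketch below assumes that "SF####" means the letters SF followed by exactly four digits, which is my reading of the format and not something we confirmed for every ID; the function name and the demo IDs are hypothetical.

```python
import re

# Assumption: "SF####" means "SF" followed by exactly four digits.
SF_ID = re.compile(r"SF\d{4}")


def nonconforming_ids(ids, pattern=SF_ID):
    """Return the IDs that do not match the expected format."""
    return [i for i in ids if not pattern.fullmatch(i.strip())]


# Made-up IDs for illustration only.
demo_ids = ["SF0001", "SF1234", "b0001", ""]
print(nonconforming_ids(demo_ids))
```

An empty result would mean every ID fits the pattern; anything returned (blank cells, IDs from another naming scheme) would need a closer look before the data is used with the .gdb.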

Class Notes 11/19

  • Open "raw" data file as .gpr in Excel
  • Open "sdrf.txt" file in Excel
  • Use Column E "ID"
  • Use Column AS "Log Ratio 635/535"

