Kevin McGee Assessment and Reflection

From LMU BioDB 2013

Statement of Work

Describe exactly what you did on the project.

Kevinmcgee Week 10

    • I used the PubMed database to find the reference genome of Leishmania major.
    • I searched with the terms "Leishmania Major [MeSH Terms] AND Genome [Title]".
    • The search returned 30 articles.
    • The 9th article was titled "The Genome of the Kinetoplastid Parasite, Leishmania major" (Ivens et al., 2005). This is the reference genome.
    • Link to Article Online
    • On Web of Science, I searched using "Leishmania Major" for the title and "Ivens AC" for the author.
    • The search returned 7 articles. The 1st article was the reference genome paper that I found on PubMed.
    • Among these results, the genome sequence was the last thing Ivens published on Leishmania major. However, the articles that cite the reference genome show many of the directions this research has taken. Many articles from the last year address Leishmania drug resistance and the properties of various Leishmania proteins.
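
The PubMed search above can also be written as a request to NCBI's E-utilities `esearch` endpoint. The sketch below only builds the request URL and does not contact NCBI. The helper name `build_pubmed_search_url` and the `retmax` value are illustrative assumptions, not part of the original workflow.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search endpoint.
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term, retmax=30):
    """Return an esearch URL for a PubMed query (hypothetical helper)."""
    # db selects the database, term is the query string as typed into PubMed,
    # retmax caps how many record IDs the service would return.
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return ESEARCH_BASE + "?" + urlencode(params)


url = build_pubmed_search_url("Leishmania Major [MeSH Terms] AND Genome [Title]")
print(url)
```

Opening the printed URL in a browser should return an XML list of PubMed IDs for the same query.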

Kevinmcgee Week 11

Journal club

Reference article

Kevinmcgee Week 12

  1. Downloaded the SDRF file from the wiki.
  2. Started to edit the SDRF file
    • Kept the following columns and deleted the rest:
      • Source Name
      • Characteristics
      • Comment (Sample_description)
      • Comment (Sample_source_name)
      • Label
      • Array Data File
  3. Filtered the file down to only L. infantum samples
  4. Was left with the following image
    • SDRFL.Infantum (1).PNG
    • This image showed me which data were in which raw data file
  5. Went through each L. infantum data file, keeping the name of each gene along with its log ratio.
  6. Compiled all data into a single sheet
  7. Was left with the following image
    • L.InfantumCompiledRawData.PNG
  8. Uploaded the compiled raw data file onto the wiki
    • The SDRF file was uploaded by Viktoria
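
The column trimming and sample filtering in steps 2 and 3 were done by hand in a spreadsheet. A minimal Python sketch of the same idea follows, assuming a tab-delimited SDRF file. Real SDRF headers may carry qualifiers (for example `Characteristics [Organism]`), so columns are matched by prefix; the sample rows and the `filter_sdrf` helper are made up for illustration.

```python
import csv
import io

# Column-name prefixes to keep, taken from the list in step 2.
KEEP_PREFIXES = (
    "Source Name",
    "Characteristics",
    "Comment (Sample_description)",
    "Comment (Sample_source_name)",
    "Label",
    "Array Data File",
)


def filter_sdrf(text, organism="infantum"):
    """Keep the selected columns and only rows that mention the organism."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    # Indices of the columns whose header starts with one of the prefixes.
    keep = [i for i, name in enumerate(header)
            if name.strip().startswith(KEEP_PREFIXES)]
    rows = [[header[i] for i in keep]]
    for row in reader:
        # A row is kept if any of its cells mentions the organism.
        if any(organism.lower() in cell.lower() for cell in row):
            rows.append([row[i] for i in keep])
    return rows


# Tiny made-up SDRF excerpt, for illustration only.
sample = (
    "Source Name\tCharacteristics [Organism]\tProtocol REF\tLabel\tArray Data File\n"
    "S1\tLeishmania infantum\tP-1\tCy3\tchipA.txt\n"
    "S2\tLeishmania major\tP-1\tCy5\tchipB.txt\n"
)
for row in filter_sdrf(sample):
    print("\t".join(row))
```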

Kevinmcgee Week 13

  • Opened L.infantumCompliedRawData(A).txt
  • Finished the formatting by flipping the sign of the dye-swap chips
  • Created a column next to each dye-swap chip and entered the formula:
=-1*(dye swap chip column)
  • Made a new sheet
  • Added all data from the old sheet, using the flipped values in place of the original dye-swap columns
  • Looked for background information in the array paper
    • L. infantum MHOM/MA/67/ITMAP-263 and L. major LV39 MRHO/SU/59/P strains used in this study
    • All microarray data will be freely available on the NCBI GEO database in the MIAME format
      • The series accession number for our manuscript is GSE10407.
    • Each chip compares promastigote vs. amastigote with different replicates
    • Found the following data files:

LmjSampleInfo.PNG
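
As a small illustration of the dye-swap step, the sketch below negates the log ratios of the chips marked as dye swaps, which is what the `=-1*(...)` spreadsheet formula does. The chip names and values are made up, and `flip_dye_swaps` is a hypothetical helper.

```python
def flip_dye_swaps(log_ratios, dye_swap_chips):
    """Return a copy of the data with dye-swap chips multiplied by -1.

    log_ratios maps a chip name to its list of log ratios; missing values
    are stored as None and left untouched.
    """
    flipped = {}
    for chip, values in log_ratios.items():
        sign = -1 if chip in dye_swap_chips else 1
        flipped[chip] = [None if v is None else sign * v for v in values]
    return flipped


# Hypothetical chip names and values, for illustration only.
data = {"chip1": [0.5, -1.2, None], "chip2_dyeswap": [-0.4, 1.0, 0.3]}
result = flip_dye_swaps(data, dye_swap_chips={"chip2_dyeswap"})
print(result)
```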

  • Finished labeling the sheet with descriptive names so that each column is easy to identify
  • Ready for statistical analysis
  • Began the analysis by taking the average and standard deviation of each chip separately and using those values to scale and center the data:
=(B4-B$2)/B$3 This is the formula we used to scale and center, where row 2 holds the chip average and row 3 holds the chip standard deviation.
  • Copied the scaled and centered data and pasted it as values onto a new sheet. From there, we cleared all #VALUE! cells and left them blank. GenMAPP will ignore these blanks when we import our data.
  • Made a column of the average fold change for each gene called Avg_LogFC_All
=AVERAGE(B2:G2)
  • Made columns for the T statistic and P-value of the fold changes of each gene:
=AVERAGE(B2:G2)/(STDEV(B2:G2)/SQRT(6)) TStat
=TDIST(ABS(I2),5,2) Pvalue
  • Created a new sheet titled forGENMAPP
    • Copied and pasted all values from the statistics sheet
  • Cut columns H-J and pasted them into columns B-D
  • Inserted a new column at B called System Code. Filled in column with the letter N
  • File is now ready for GenMAPP import
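
The spreadsheet statistics above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the procedure actually run: the t statistic uses the usual one-sample form mean / (sd / sqrt(n)), the p-value is obtained by numerically integrating the Student's t density (a stand-in for Excel's two-tailed `TDIST`), and the replicate values are made up.

```python
import math
import statistics


def scale_and_center(column):
    """Subtract the chip mean and divide by the chip standard deviation."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)  # sample standard deviation, like STDEV
    return [(x - mean) / sd for x in column]


def t_statistic(values):
    """One-sample t statistic against zero: mean / (sd / sqrt(n))."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value of Student's t via Simpson integration of the pdf.

    steps must be even for Simpson's rule.
    """
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


# Six made-up replicate log ratios for one gene, for illustration only.
gene = [0.8, 1.1, 0.9, 1.3, 0.7, 1.2]
t = t_statistic(gene)
p = two_tailed_p(t, df=len(gene) - 1)
print(round(t, 2), p)
```

With six replicates the degrees of freedom are 5, matching the second argument of the `TDIST` formula.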


  • Sample of what the final file looked like

L.InfantumforGenMAPP.PNG

Kevinmcgee Week 15

Uploading Into GenMAPP

Datasheet

  1. Compiled all data, including both L. major and L. infantum, onto a single datasheet and uploaded it into GenMAPP
    • Ran into some problems uploading (almost everything was an error)
  2. Filtered out any Lin genes and created a new datasheet for only LmjF genes.
    • Still ran into problems uploading
  3. Looked at the database OrthologicalNames sheet and saw that the gene IDs there were in a different format than in the spreadsheet.
    • Made a quick-fix file by changing the names on the spreadsheet, but long-term fixes are being made to the code so that other users do not have to change the names on their spreadsheets every time.
  4. We were then able to upload our data and continue with the project.

Sanity Check

Leishmania infantum

  1. Filtered P-value
    • 1392 genes were <.05
    • 327 genes were <.01
    • 67 genes were <.001
    • 28 genes were <.0001
  2. Filtered Average Log Fold Change
    • 748 genes were >0
    • 646 genes were <0
    • 699 genes were >.25 while 646 were <.05
    • 748 genes were >.05 while 606 were <-.25
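
The sanity-check counts above were obtained with spreadsheet filters. A minimal sketch of the same counting in Python follows; the helper names and the input values are made up for illustration, and blank cells are represented as `None`.

```python
def count_below(p_values, thresholds=(0.05, 0.01, 0.001, 0.0001)):
    """Count how many p-values fall below each cutoff; blanks are skipped."""
    present = [p for p in p_values if p is not None]
    return {cut: sum(p < cut for p in present) for cut in thresholds}


def count_fold_change(log_fcs, cutoff=0.0):
    """Return (number above +cutoff, number below -cutoff)."""
    present = [x for x in log_fcs if x is not None]
    up = sum(x > cutoff for x in present)
    down = sum(x < -cutoff for x in present)
    return up, down


# Made-up values, for illustration only.
p_values = [0.2, 0.03, 0.004, 0.0005, 0.00002, None]
log_fcs = [0.1, 0.6, -0.3, -0.1, 0.26, None]
print(count_below(p_values))
print(count_fold_change(log_fcs, cutoff=0.25))
```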

MAPPFinder Color Changes

  1. Colors were assigned according to two main criteria
    • Increased relative to control: Log FC > 0.25 and P-value < 0.05; these were colored blue
    • Decreased relative to control: Log FC < -0.25 and P-value < 0.05; these were colored purple
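
The two criteria above amount to a simple rule per gene. The sketch below expresses that rule in Python; the function name `classify` and the returned labels are illustrative assumptions, while the cutoffs are the ones listed above.

```python
def classify(log_fc, p_value, fc_cutoff=0.25, p_cutoff=0.05):
    """Return the criterion a gene meets, or None if it meets neither."""
    # Blank cells and non-significant genes get no color.
    if log_fc is None or p_value is None or p_value >= p_cutoff:
        return None
    if log_fc > fc_cutoff:
        return "Increased"  # colored blue
    if log_fc < -fc_cutoff:
        return "Decreased"  # colored purple
    return None


print(classify(0.5, 0.01))
print(classify(-0.5, 0.01))
print(classify(0.5, 0.2))
```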

Running MAPPFinder

  • Set up MAPPFinder to run with the file name LMajorGOMap
  • Ran for about 1 1/2 hours
  1. Top Ten GO Terms
    • catalytic activity
    • endonuclease activity
    • DNA catabolic process
    • aromatic compound catabolic process
    • cellular nitrogen compound catabolic process
    • nucleobase-containing compound catabolic process
    • organic cyclic compound catabolic process
    • heterocycle catabolic process
    • oxidoreductase activity
    • macromolecular complex


Picture GO.PNG

Statement of Work

  • Describe exactly what you did on the project.
  • Provide references or links to artifacts of your work, such as:
    • Wiki pages
    • Other files or documents
    • Code or scripts

Assessment of Project

  • Give an objective assessment of the success of your project workflow and teamwork.
  • What worked and what didn't work?
  • What would you do differently if you could do it all over again?
  • Evaluate the Gene Database Project and Group Report in the following areas:
    1. Content: What is the quality of the work?
    2. Organization: Comment on the organization of the project and of your group's wiki pages.
    3. Completeness: Did your team achieve all of the project objectives? Why or why not?

Reflection on the Process

  • What did you learn?
    • With your head (biological or computer science principles)
    • With your heart (personal qualities and teamwork qualities that make things work or not work)?
    • With your hands (technical skills)?
  • What lesson will you take away from this project that you will still use a year from now?