HDelgadi Week 8

From LMU BioDB 2013

Digital Lab Notebook

Continuation of classwork on Thursday, October 10th [Microarray Analysis Vibrio cholerae]:

  1. Select columns B through Q: click the very first cell of column B, scroll down to the last row, hold down the 'Shift' button, and click the last cell of column Q so that all 16 columns are highlighted. Then click on the 'Format' button, which is directly below the 'Delete' button under the 'Home' tab.
  2. Click on 'Format Cells' and select 'Number' in the category list under the 'Number' tab.
  3. Select two decimal places and click 'OK'.
  4. Follow steps 1 through 3 with columns R and S and select 4 decimal places rather than 2.
  5. Select columns N through S in full, starting from the very first cells and not just the values, then right click and choose 'Cut'.
  6. Right click on the column B header and select 'Insert Cut Cells' (the existing cells should shift 6 columns to the right).
  7. Right click on the B header and click on 'Insert'.
  8. The previous B column should shift one column to the right.
  9. Under the new B header, type 'System Code' into the top cell of this column.
  10. Below 'System Code' write the letter 'N'.
  11. Left click on the cell containing 'N' and press Ctrl C to copy it. Then go to the last cell of the column, hold down the 'Shift' button, and left click this last cell (the entire column should be highlighted).
  12. Press Ctrl V to paste the letter 'N' to all of the cells of the column.
  13. Select 'File', then 'Save As', and under 'Save as type' select 'Text (Tab delimited) (*.txt)'. Click 'OK' on the warnings that pop up. The *.txt file is necessary to proceed.
  14. Rename the last tab, next to 'statistics', back to 'forGenMAPP', since it should have changed to the name of the file when saved as the text file.
  15. Click on the arrow pointing down next to the A header (all columns should be highlighted at this point).
  16. Select the 'Data' tab and click on 'Filter' (drop-down arrows should appear for each column).
  17. Click on the drop-down arrow next to 'Pvalue'.
  18. Click on 'Number Filters'.
  19. Click on 'Custom Filter...', select 'less than' on the first drop-down menu, and type 0.05 in the box right next to it. Repeat steps 17-19 for each of the other p-value cutoffs below.
    • Genes that have a p-value < 0.05:
      1. 948 out of 5222 values
    • Genes that have a p-value < 0.01:
      1. 235 out of 5222 values
    • Genes that have a p-value < 0.001:
      1. 24 out of 5222 values
    • Genes that have a p-value < 0.0001:
      1. 2 out of 5222 values
  20. (A p-value < 0.05 means there is less than a 5% probability of seeing a difference this large by chance if expression had not changed, so the difference is considered significant and demonstrates a change in gene expression.)
  21. Change the p-value filter back to less than 0.05.
  22. Click on the drop-down arrow next to "Avg_LogFC_all".
  23. Click on 'Number Filters'.
  24. Click on 'Custom Filter...'.
  25. On the first drop-down menu select 'greater than' and type 0 in the box next to it.
  26. Repeat steps 22-25 for each of the average log fold change cutoffs below, keeping the p-value filter set to less than 0.05.
    • Average log fold change greater than zero:
      1. 352
    • Average log fold change less than zero:
      1. 596
    • Average log fold change greater than 0.25:
      1. 339
    • Average log fold change less than -0.25:
      1. 579
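The filter steps above (17 through 26) can also be sketched in code. This is only a minimal illustration in Python and not part of the Excel protocol: the rows are made-up placeholder values, and only the column names 'Pvalue' and 'Avg_LogFC_all' come from the worksheet.

```python
# Hypothetical rows standing in for the worksheet; the values are
# illustrative placeholders, not the real microarray data.
rows = [
    {"ID": "gene1", "Avg_LogFC_all": 0.80, "Pvalue": 0.0300},
    {"ID": "gene2", "Avg_LogFC_all": -0.40, "Pvalue": 0.0008},
    {"ID": "gene3", "Avg_LogFC_all": 0.10, "Pvalue": 0.2000},
    {"ID": "gene4", "Avg_LogFC_all": -0.10, "Pvalue": 0.0400},
]


def count_below(rows, cutoff):
    """Count rows whose p-value is below the cutoff (steps 17-19)."""
    return sum(1 for r in rows if r["Pvalue"] < cutoff)


def split_by_fold_change(rows, p_cutoff=0.05, fc_cutoff=0.0):
    """Keep rows with p < p_cutoff, then count increased and decreased genes (steps 21-26)."""
    significant = [r for r in rows if r["Pvalue"] < p_cutoff]
    up = sum(1 for r in significant if r["Avg_LogFC_all"] > fc_cutoff)
    down = sum(1 for r in significant if r["Avg_LogFC_all"] < -fc_cutoff)
    return up, down


for cutoff in (0.05, 0.01, 0.001, 0.0001):
    print(cutoff, count_below(rows, cutoff))

print(split_by_fold_change(rows))                  # cutoffs of 0 and 0
print(split_by_fold_change(rows, fc_cutoff=0.25))  # cutoffs of 0.25 and -0.25
```

With the p-value filter left at less than 0.05, the "greater than zero" and "less than zero" counts should add up to the p < 0.05 count, as they do in the notebook (352 + 596 = 948).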
The Significance Analysis of Microarrays (SAM) program was used to determine the statistically significant differences in gene expression. Merrell et al. (2002) required at least a twofold change to call a gene significantly changed, whereas we are using a 20% change, so our cutoff is considerably less stringent than theirs.
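To make this comparison concrete, a log fold change can be converted back to a plain expression ratio. The sketch below assumes the log fold changes in the worksheet are base 2, which this notebook does not state explicitly; the function name is hypothetical.

```python
import math


def log2_to_ratio(log_fc):
    """Convert a log fold change to a plain ratio, assuming the log is base 2."""
    return 2 ** log_fc


print(round(log2_to_ratio(0.25), 3))   # ratio for the 0.25 cutoff
print(round(log2_to_ratio(-0.25), 3))  # ratio for the -0.25 cutoff
print(math.log2(2.0))                  # log fold change of a twofold increase
```

Under the base 2 assumption, the 0.25 cutoff corresponds to a ratio of about 1.19 (close to the 20% change mentioned above), while a twofold change corresponds to a log fold change of 1.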
    • Sanity Check: Compare individual genes with known data (AvgLogFC_all from Column F, p-value from Column H)
      1. VC0028: AvgLogFC_all 1.65, p-value 0.0474
      2. VC0941: AvgLogFC_all 0.09, p-value 0.6759
      3. VC0869: AvgLogFC_all 1.50, p-value 0.0174
      4. VC0051: AvgLogFC_all 1.92, p-value 0.0139
      5. VC0647: AvgLogFC_all -1.11, p-value 0.0003
      6. VC0468: AvgLogFC_all -0.17, p-value 0.3350
      7. VC2350: AvgLogFC_all -2.40, p-value 0.0130
      8. VCA0583: AvgLogFC_all 1.06, p-value 0.1011

Five of the eight genes are statistically significant (p < 0.05).
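The sanity-check count can be reproduced with a short Python sketch. The values are copied from the list above; the variable names are only for illustration.

```python
# (AvgLogFC_all, p-value) for each gene, copied from the sanity-check list.
genes = {
    "VC0028": (1.65, 0.0474),
    "VC0941": (0.09, 0.6759),
    "VC0869": (1.50, 0.0174),
    "VC0051": (1.92, 0.0139),
    "VC0647": (-1.11, 0.0003),
    "VC0468": (-0.17, 0.3350),
    "VC2350": (-2.40, 0.0130),
    "VCA0583": (1.06, 0.1011),
}

P_CUTOFF = 0.05

# Keep the genes whose p-value falls below the cutoff.
significant = [gene for gene, (log_fc, p) in genes.items() if p < P_CUTOFF]

print(len(significant), "of", len(genes), "genes have p <", P_CUTOFF)
print(significant)
```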

Class Journal Week 1

Week 1

Class Journal Week 2

Week 2

HDelgadi Week 2

Class Journal Week 3

HDelgadi Week 3

Week 4

Class Journal Week 4

HDelgadi Week 4

Week 5

HDelgadi Week 5

Class Journal Week 5

Week 6

HDelgadi Week 6

Class Journal Week 6

Week 7

HDelgadi Week 7

Class Journal Week 7

Week 8

HDelgadi Week 8

Class Journal Week 8

Week 9

HDelgadi Week 9

Class Journal Week 9

Week 10

HDelgadi Week 10

Week 11

Team H(oo)KD

HDelgadi Week 11

HDelgadi Project Notebook

Week 12 Status Report

Week 13 Status Report

Week 15 Status Report

Hilda Delgadillo

HDelgadi (talk) 18:48, 13 October 2013 (PDT)

Digital Lab Notebook 2

Continuation of classwork on Tuesday, October 14th [BIOL367/F10:GenMAPP and MAPPFinder Protocols]:

  1. Select the top row, click 'Data' and under data click 'Filter'. Upon clicking 'Filter' drop-down arrows will appear for the entire row that was selected.
  2. Click on the drop-down arrow for 'Errors' and uncheck the 'Blank' box so that rows without errors are hidden. Press 'OK'.
      1. Record the number of errors. For your journal assignment, open the .EX.txt file and use the Data > Filter > Autofilter function to determine what the errors were for the rows that were not converted. Record this information in your individual journal page.
    • 122 errors were found in the .EX.txt file.
    • This is what appears in the 'Error' column: 'Gene not found in OrderedLocusNames or any related system'. The last row, however, also contains 'No Gene ID' in addition to the previous comment.
      1. It is likely that you will have a different number of errors than your buddy who is using a different version of the Vibrio cholerae Gene Database. Which of you has more errors? Why do you think that is? Record your answers in your journal page.
    • My buddy has more errors, 722, since his Gene Database is from 2009 whereas mine is from 2010; the updates made to the newer database corrected some of the earlier errors.
      1. Upload your exceptions file: EX.txt to your wiki page.
    • .EX.txt HD_20131014
      1. Upload your .gex file to your journal entry page for later retrieval.
    • .gex HD_20131014
      1. List the top 10 Gene Ontology terms in your individual journal entry. (DECREASED Criteria for MAPPFinder Procedure)
    • Glucose Catabolic Process
    • Hexose Catabolic Process
    • Glycolysis
    • Monosaccharide Catabolic Process
    • Cytoplasm
    • Alcohol Catabolic Process
    • Cellular Carbohydrate Catabolic Process
    • Glucose Metabolic Process
    • Protein Folding
    • Hexose Metabolic Process
      1. Compare your list with your buddy who used a different version of the Gene Database. Are your terms the same or different? Why do you think that is? Record your answer in your individual journal entry.
    • My GO terms are all different from those of my buddy, Taurus. This could be because genes were added when the database was updated, so more genes now correspond to different terms.
  3. On the MAPPFinder Browser, type in or copy and paste the name of the given genes mentioned by Merrell et al. (2002). The first one on the list is VC0028. Look to the right and you will find the drop-down menu, click on it, and select OrderedLocusNames.
  4. Click on Gene ID Search.
  5. The GO terms associated with this gene are highlighted in blue.
  6. Follow the previous three steps for the rest of the genes.
      1. List the GO terms associated with each of those genes in your individual journal. (Note: they might not all be found.) Are they the same as your buddy who is using a different Gene Database? Why or why not?
      2. The GO terms associated with the genes below were all found in my database, whereas my buddy Taurus was only able to find GO terms for the gene VC0647. This could simply be because the more recently updated database (2010) contains more recent information and, in turn, more GO terms associated with more genes.
    • VC0028:
      1. Branched chain family amino acid biosynthetic process
      2. Cellular Amino Acid Biosynthetic process
      3. Metabolic Process
      4. Metal Ion Binding
      5. Iron-Sulfur Cluster Binding
      6. 4 Iron, 4 Sulfur Cluster Binding
      7. Catalytic Activity
      8. Lyase Activity
      9. Dihydroxy-acid Dehydratase Activity
    • VC0941:
      1. Glycine Metabolic Process
      2. L-serine Metabolic Process
      3. One-Carbon Metabolic Process
      4. Cytoplasm
      5. Pyridoxal Phosphate Binding
      6. Catalytic Activity
      7. Transferase Activity
      8. Glycine Hydroxymethyltransferase Activity
    • VC0869:
      1. Glutamine Metabolic Process
      2. Purine Nucleotide Biosynthetic Process
      3. 'de novo' IMP Biosynthetic Process
      4. Cytoplasm
      5. Nucleotide Binding
      6. ATP binding
      7. Catalytic Activity
      8. Ligase Activity
      9. Phosphoribosylformylglycinamidine Synthase Activity
    • VC0051:
      1. Purine Nucleotide Biosynthetic Process
      2. 'de novo' IMP Biosynthetic Process
      3. Nucleotide Binding
      4. ATP Binding
      5. Catalytic Activity
      6. Lyase Activity
      7. Carboxy-lyase Activity
      8. Phosphoribosylaminoimidazole
    • VC0647:
      1. mRNA Catabolic Process
      2. RNA Processing
      3. Cytoplasm
      4. Mitochondrion
      5. RNA Binding
      6. 3'-5'-exoribonuclease Activity
      7. Transferase Activity
      8. Nucleotidyltransferase Activity
      9. Polyribonucleotide Nucleotidyltransferase Activity
    • VC0468:
      1. Glutathione Biosynthetic Process
      2. Metal Ion Binding
      3. Nucleotide Binding
      4. ATP Binding
      5. Catalytic Activity
      6. Ligase Activity
      7. Glutathione Synthase Activity
    • VC2350:
      1. Deoxyribonucleotide Catabolic Process
      2. Metabolic Process
      3. Cytoplasm
      4. Catalytic Activity
      5. Lyase Activity
      6. Deoxyribose-phosphate aldolase Activity
    • VCA0583:
      1. Transport
      2. Outer Membrane-Bounded Periplasmic Space
      3. Transporter Activity
  7. Double click on one of the GO terms that are associated with the genes that are seen above, VC0028, VC0941, VC0869, VC0051, VC0647, VC0468, VC2350, and VCA0583.
  8. GenMAPP will open upon double clicking with all of the genes associated with the GO term.
    • List in your journal entry the name of the GO term you clicked on and whether the expression of the gene you were looking for changed significantly in the experiment.
      1. The GO term I chose is Transporter Activity under the gene VCA0583.
  9. Now right click or double click on a colored gene box. Right clicking will take you to 'GeneFinder'; double clicking will open the 'Backpage' for this gene in a web browser.
  10. Click on the UniProt ID, Q9KSL2, which is right next to 'ID:'.
  11. You are now at the UniProt database, and under 'General annotation (Comments)' you will find the 'Function' of the gene. According to the database, this gene is involved in vitamin B-12 import as well as in the "translocation of the substrate across the membrane" (UniProt KB).
  12. Copy C:\GenMAPP 2 Data\MAPPs\VC GO and paste it into 'Search programs and files'. Locate the MAPP file for the GO term that was chosen; this file consists of the genes associated with this GO term.
  13. Drag the file to your desktop so that it is easier to upload to the wiki.
    • The MAPP that has just been created is stored in the directory, C:\GenMAPP 2 Data\MAPPs\VC GO. Upload this file and link to it in your journal.
      1. .mapp File HD_20131014
  14. Go to desktop where the Decrease HD-Criterion1-GO was saved.
  15. Right click on the file.
  16. Click on 'Copy'.
  17. Click on 'Paste' anywhere on your desktop.
  18. The copy of the file has been made, Decrease HD-Criterion1-GO - Copy.
  19. Open Microsoft Excel and click on 'Open'.
  20. Click on Desktop or wherever your file, Decrease_HD-Criterion1-GO_-_Copy.txt, is saved and click on 'Open' for the file to show up.
  21. Make sure 'All Files' is selected under the drop-down menu next to the drop-down menu 'File Name'.
  22. When the import dialog appears as the file opens, click 'Finish' at the bottom right corner.
    • Compare this information with your buddy who used a different version of the Vibrio Gene Database. Which numbers are different? Why are they different? Record this information in your individual journal entry.
  23. Right click on the number of the row where the headings begin such as 'GO Name'.
  24. When the row is highlighted click on 'Data'.
  25. Click on 'Filter' and now all of the headers should have a drop-down menu to the right.
  26. Now we must set these filters:
  27. Click on the drop-down button next to Z score.
  28. Click on Number Filters.
  29. Click on Custom Filter.
  30. On the first drop-down column indicate greater than.
  31. On the drop-down column next to it insert the number 2.
  32. Follow steps 27 to 29 but for PermuteP.
  33. On the first drop-down column indicate less than.
  34. On the drop-down column next to it insert 0.05.
  35. Follow steps 27-29 for Number Changed.
  36. On the first drop-down column indicate greater than or equal.
  37. On the drop-down column next to it insert 4 or 5.
  38. Beneath the first drop-down menu make sure to click on the circle 'AND'.
  39. Beneath this circle make sure to choose less than for the drop-down menu.
  40. Next to this drop-down column insert 100.
  41. Follow steps 27-29 but for Percent Changed.
  42. On the first drop-down column indicate greater than.
  43. On the drop-down column next to it insert 25.
  44. Beneath the first drop-down menu make sure to click on the circle 'OR'.
  45. On the drop-down column below the first drop-down menu indicate equals.
  46. On the drop-down column next to it insert 50.
  47. Select 'File'.
  48. Select 'Save As'.
  49. Select Excel workbook (.xls) in the drop-down menu titled 'Save As Type'.
  50. Go to the 'MAPPFinder Browser' and search for GO terms by typing them into the 'Search for GO term or MAPP' line.
  51. Make sure the 'Keyword' bubble is filled.
  52. Click on 'Word Search'.
  53. The GO terms typed in were 20 GO terms from the GO term Excel file.
  54. I highlighted terms that share a parent-child relationship with the same color; terms without such a relationship still have a designated color.
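The filter criteria set in steps 27 through 46 can be sketched as one combined test in Python. This is a minimal illustration, not the Excel procedure itself: the result rows and GO names are made-up placeholders, the column names are taken from the headers mentioned above, and the lower bound for Number Changed is set to 4 as an assumption because the step says "4 or 5".

```python
# Hypothetical MAPPFinder result rows; GO names and numbers are placeholders.
results = [
    {"GO Name": "term A", "Z Score": 3.1, "PermuteP": 0.010, "Number Changed": 6, "Percent Changed": 40.0},
    {"GO Name": "term B", "Z Score": 1.5, "PermuteP": 0.020, "Number Changed": 8, "Percent Changed": 60.0},
    {"GO Name": "term C", "Z Score": 2.4, "PermuteP": 0.200, "Number Changed": 5, "Percent Changed": 30.0},
    {"GO Name": "term D", "Z Score": 4.0, "PermuteP": 0.001, "Number Changed": 150, "Percent Changed": 35.0},
    {"GO Name": "term E", "Z Score": 2.2, "PermuteP": 0.030, "Number Changed": 4, "Percent Changed": 20.0},
]

MIN_CHANGED = 4  # assumption: the notebook says "4 or 5"


def passes(row):
    """Apply the four filters exactly as written in steps 27-46."""
    z_ok = row["Z Score"] > 2
    p_ok = row["PermuteP"] < 0.05
    number_ok = MIN_CHANGED <= row["Number Changed"] < 100
    # 'greater than 25' OR 'equals 50', as entered in the custom filter.
    percent_ok = row["Percent Changed"] > 25 or row["Percent Changed"] == 50
    return z_ok and p_ok and number_ok and percent_ok


kept = [row["GO Name"] for row in results if passes(row)]
print(kept)
```

Each placeholder row other than "term A" fails exactly one filter, which makes it easy to check that every criterion is applied.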

Decrease_HD-Criterion1-GO Highlighted

  1. The pathogenicity of the bacterium Vibrio cholerae appears to rely mostly on the use of glucose and on glycolysis. The GO terms 'glucose catabolic process', 'hexose catabolic process', and 'glycolysis' are part of a family, so they indicate how heavily the bacterium depends on glucose breakdown as the source of energy it needs for its pathogenicity to flourish. The 'glucose metabolic process' and 'hexose metabolic process' are also part of a family, and they too demonstrate the focus on chemical reactions and pathways involving glucose. Furthermore, nicotinamide nucleotide metabolic process and pyridine nucleotide metabolic process are part of a family, so they are involved in the metabolic processes of nicotinamide nucleotides and pyridine nucleotides. Moreover, peptidyl-prolyl cis-trans isomerase activity and cis-trans isomerase activity are a family. Cellular protein localization and cellular macromolecule localization are also part of a family, which has to do with the movement of macromolecules within the cell. In addition, the GO terms that do not have families are translation factor activity, nucleic acid binding, intracellular protein transport, intracellular transport, protein-N(PI)-phosphohistidine-sugar phosphotransferase activity, organelle organization, protein folding, cellular carbohydrate, alcohol catabolic process, and monosaccharide catabolic process.
    • There is one other file you need to save to your journal page. It has a .gmf extension and should be in the same folder as the .gex file that you created with the GenMAPP Expression Dataset Manager. You will need this file to re-open your results in MAPPFinder.
      1. gmf File HD_20131010

HDelgadi (talk) 00:22, 18 October 2013 (PDT)