Mbalducc Week 14

From LMU BioDB 2017

Files

Excel Sheet of Profile #22 TF from YEASTRACT

PowerPoint of STEM profiles and Gene Regulatory Network from GRNsight

media:Mbalducc_profile 22_RegulationMatrix_Documented_2017.xlsx

Using YEASTRACT to Infer which Transcription Factors Regulate a Cluster of Genes

  1. Opened the gene list in Excel for one of the significant profiles from my STEM analysis. I chose a cluster with a clear cold shock/recovery up/down or down/up pattern.
    • Copied the list of gene IDs onto my clipboard.
  2. Launched a web browser and went to the YEASTRACT database.
    • On the left panel of the window, clicked on the link to Rank by TF.
    • Pasted my list of genes from my cluster into the box labeled ORFs/Genes.
    • Checked the box for Check for all TFs.
    • Accepted the defaults for the Regulations Filter (Documented, DNA binding plus expression evidence).
    • Did not apply a filter for "Filter Documented Regulations by environmental condition".
    • Ranked genes by TF using: The % of genes in the list and in YEASTRACT regulated by each TF.
    • Clicked the Search button.
  3. Answered the following questions:
    • In the results window that appears, the p values colored green are considered "significant", the ones colored yellow are considered "borderline significant" and the ones colored pink are considered "not significant". How many transcription factors are green or "significant"?
      • 43 are green or "significant".
    • Copied the table of results from the web page and pasted it into a new Excel workbook to preserve the results.
      • Uploaded the Excel file to OWW or Box and linked to it in my electronic lab notebook.
      • Is your transcription factor on the list? If so, what are its "% in user set", "% in YEASTRACT", and "p value"? (Note that this doesn't apply to the wt strain.)
        • My transcription factor, Cin5, was on the list. Here is the data associated with it:
          • % in user set: 40.68%
          • % in YEASTRACT: 3.30%
          • p-value: 0.001495074628175
    • List of 20 significant transcription factors:
      1. Aft2p
      2. Mga2p
      3. Gis1p
      4. Hot1p
      5. Sut1p
      6. Zap1p
      7. Mig1p
      8. Hap2p
      9. Sut2p
      10. Bas1p
      11. Pho2p
      12. Opi1p
      13. Ert1p
      14. Oaf1p
      15. Sfp1p
      16. Mig3p
      17. Rlm1p
      18. Sum1p
      19. Ace2p
      20. Pho4p
    • Went back to the YEASTRACT database and followed the link to Generate Regulation Matrix.
    • Copied and pasted the list of transcription factors I identified (plus the transcription factor deleted in my strain: Cin5p) into both the "Transcription factors" field and the "Target ORF/Genes" field.
    • I used the "Regulations Filter" options of "Documented" and "Only DNA binding evidence".
      • Clicked the "Generate" button.
      • In the results window that appeared, I clicked on the link to the "Regulation matrix (Semicolon Separated Values (CSV) file)" that appeared and saved it to my Desktop. I renamed this file with a meaningful name so that I could distinguish it from the other files I will generate.
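The regulation matrix file downloaded in the last step uses semicolons rather than commas as delimiters. As a minimal sketch, such a file could also be read outside Excel with Python's standard csv module. The layout assumed here (regulator names down the first column, target names across the first row, 0/1 values in between) and the toy data are assumptions for illustration, not a copy of my actual file; read_regulation_matrix is a hypothetical helper name.

```python
import csv
import io


def read_regulation_matrix(text):
    """Parse semicolon-separated text into (column labels, row labels, 0/1 rows).

    Assumed layout: first row holds the target names, first column holds the
    regulator names, and every other cell is 0 or 1.
    """
    reader = csv.reader(io.StringIO(text), delimiter=";")
    rows = [row for row in reader if row]  # skip completely empty lines
    # Ignore empty cells, which would appear if a line ends with a semicolon.
    col_labels = [c.strip() for c in rows[0][1:] if c.strip()]
    row_labels = [r[0].strip() for r in rows[1:]]
    values = [[int(v) for v in r[1:] if v.strip()] for r in rows[1:]]
    return col_labels, row_labels, values


# Toy data (made up for illustration, not real YEASTRACT output).
sample = "TF;ACE2;CIN5;PHO4\nAce2p;0;1;0\nCin5p;0;0;1\nPho4p;0;0;0\n"
cols, regs, values = read_regulation_matrix(sample)
```

For a file on disk, the same function could be called on the result of open(path).read(), where path is wherever the renamed file was saved.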

Visualizing My Gene Regulatory Networks with GRNsight

  1. Followed these steps for each of the three files I generated:
    • Opened the file in Excel. It did not open properly in Excel because a semicolon was used as the column delimiter instead of a comma. To fix this, I selected the entire Column A. Then I went to the "Data" tab and selected "Text to columns". In the Wizard that appeared, I selected "Delimited" and clicked "Next". In the next window, I selected "Semicolon" and clicked "Next". In the next window, I left the data format at "General" and clicked "Finish". This now looked like a table with the names of the transcription factors across the top and down the first column and all of the zeros and ones distributed throughout the rows and columns. This is called an "adjacency matrix." If there is a "1" in a cell, that means there is a connection between the transcription factor in that row and the one in that column.
    • Saved this file in Microsoft Excel workbook format (.xlsx).
    • Checked to see that all of the transcription factors in the matrix were connected to at least one of the other transcription factors by making sure that there was at least one "1" in a row or column for that transcription factor. If a factor was not connected to any other factor, I deleted its row and column from the matrix. I made sure that I still had somewhere between 15 and 30 transcription factors in my network after this pruning.
      • I only deleted the transcription factor if there were all zeros in its column AND all zeros in its row. Visualizing the matrix in GRNsight (below) helped me find these easily.
    • For this adjacency matrix to be usable in GRNmap (the modeling software) and GRNsight (the visualization software), I needed to transpose the matrix. I inserted a new worksheet into my Excel file and named it "network". I went back to the previous sheet, selected the entire matrix, and copied it. I went to my new worksheet and clicked on the A1 cell in the upper left. I selected "Paste special" from the "Home" tab. In the window that appeared, I checked the box for "Transpose". This pasted my data with the columns transposed to rows and vice versa. This was necessary because I wanted the transcription factors that are the "regulatORS" across the top and the "regulatEES" along the side.
    • The labels for the genes in the columns and rows needed to match. Thus, I deleted the "p" from each of the gene names in the columns. I adjusted the case of the labels to make them all upper case.
    • In cell A1, I copied and pasted the text "rows genes affected/cols genes controlling".
    • Finally, for ease of working with the adjacency matrix in Excel, I wanted to alphabetize the gene labels both across the top and side.
      • I selected the area of the entire adjacency matrix.
      • I clicked the Data tab and clicked the custom sort button.
      • I sorted Column A alphabetically, being sure to exclude the header row.
      • Next I sorted row 1 from left to right, excluding cell A1. In the Custom Sort window, I clicked the Options button and selected "Sort left to right", excluding column 1.
    • Saved the workbook.
  2. Next I visualized what these gene regulatory networks look like with the GRNsight software.
    • I went to the GRNsight home page.
    • I selected the menu item File > Open and selected the regulation matrix .xlsx file containing the "network" worksheet that I formatted above. GRNsight created a graph of the network; I took a screenshot of it and added it to my PowerPoint.
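The Excel steps above (pruning unconnected factors, transposing, removing the trailing "p", upper-casing, and alphabetizing) can also be sketched in Python. This is only an illustration of the same logic under stated assumptions, not the procedure I actually followed: prepare_network and clean_label are hypothetical helper names, the input is assumed to have regulators as rows and targets as columns, and the toy matrix is made up.

```python
HEADER = "rows genes affected/cols genes controlling"


def clean_label(label):
    """Drop the protein-style trailing 'p' (e.g. Cin5p -> Cin5) and upper-case."""
    label = label.strip()
    # Only strip a 'p' that follows a digit, so the gene name itself is untouched.
    if len(label) > 1 and label[-1] == "p" and label[-2].isdigit():
        label = label[:-1]
    return label.upper()


def prepare_network(regulators, targets, matrix):
    """Build the 'network' sheet; matrix[i][j] == 1 means regulators[i] regulates targets[j]."""
    regs = [clean_label(r) for r in regulators]
    tgts = [clean_label(t) for t in targets]
    edges = {
        (regs[i], tgts[j])
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value == 1
    }
    # Keep a factor if it has a 1 anywhere in its row OR its column; sort A to Z.
    connected = sorted({name for edge in edges for name in edge})
    # Transposed layout: regulators across the top, regulated genes down the side.
    sheet = [[HEADER] + connected]
    for target in connected:
        sheet.append([target] + [1 if (reg, target) in edges else 0 for reg in connected])
    return sheet


# Toy example (made up): Ace2p regulates CIN5, Cin5p regulates PHO4, Zap1p is unconnected.
sheet = prepare_network(
    ["Ace2p", "Cin5p", "Pho4p", "Zap1p"],
    ["ACE2", "CIN5", "PHO4", "ZAP1"],
    [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
)
```

In this toy example ZAP1 is dropped because its row and column are all zeros. With real data, the number of remaining factors (len(sheet) - 1) should still be checked against the 15 to 30 range mentioned above.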

Acknowledgements

I worked with the rest of the Data Analysis guild: Antonio Porras, Dina Bashoura, and Emma Tyrnauer on this assignment. We spoke in class and created a group chat so we could ask each other questions and help each other as we worked on the project. I also used the instructions from the Week 10 assignment. I edited the instructions slightly so they would be specific to the profile I chose and the procedure I followed.

References

GRNsight. (2017). Home. Retrieved November 30, 2017 from http://dondi.github.io/GRNsight/

LMU BioDB 2017. (2017). Week 10. Retrieved November 28, 2017, from https://xmlpipedb.cs.lmu.edu/biodb/fall2017/index.php/Week_10

YEASTRACT. Generate Regulation Matrix. Retrieved November 30, 2017, from http://www.yeastract.com/formgenerateregulationmatrix.php

YEASTRACT. Rank by TF. Retrieved November 30, 2017, from http://www.yeastract.com/formrankbytf.php

