Dbashour Week 9

From LMU BioDB 2017

Homework Instructions

Hands-On with GRNsight

Each homework pair has been assigned one subset of the GRNsight client-side testing protocol for the current beta version of GRNsight. Follow this protocol and report the results of your tests in the electronic journal. Homework partners have one testing subset each so that you can talk to each other about the requested tests, but the testing itself should still be done and reported individually, in the spirit of seeking reproducible results.

  1. Each feature is to be tested in combination with all three formats that GRNsight can read (Excel workbook, SIF, GraphML). This is already specified in the testing document. Choose one file for each of these formats from this web page for use in your tests and specify them in your electronic notebook. In order to have a basis for comparison, homework partners should use the same test files for their individual test sequences.
    • For the Excel workbooks (.xlsx) in the linked collection above, click on the file then click Download to save the file to your computer.
    • For the SIF (.sif) and GraphML (.graphml) files, click on the file, click on the Raw button, then either copy-paste or save the resulting file to your computer.
  2. Each test specifies a sequence of actions to perform, followed by their expected results. Use the latter to determine whether GRNsight passed a particular test. Report the result of each test in your electronic notebook.
    • The version of GRNsight that you are testing is a beta version, so results that diverge from the expected ones are certainly possible.
    • If the observed result is the same as the expected result, indicate that GRNsight passed that particular test.
    • If the observed result is not the same as the expected result, indicate that GRNsight failed that particular test and document what was different. For many tests, a screenshot will be the most effective way to document this difference, so do not hesitate to supply one.
  3. If you see any other behavior that appears incorrect, erroneous, or confusing, please report those observations in a section of your electronic notebook as well.
  4. As always, make sure to document and acknowledge your interactions with your homework partner in the Acknowledgments section of your individual journal.

Web Service API Exploration

Each homework pair has been assigned one of the four gene-related web services that we have used for the “favorite gene page” assignments (Ensembl, NCBI, UniProt, SGD/YeastMine). Because there are only four such services, two homework pairs will be working on the same service, so if you wish, you may join “fources” (sorry) to explore the same web service together. Still, you must write up your findings individually in your own respective words.

Your Mission

For the web service that has been assigned to you, use the information given on this page to discover how to take a gene name/symbol (e.g., ACT1, BRO1, SPT15, etc.) and find your way to its full “data profile” within that service. This process may require multiple web service calls and will involve “reading” web service data formats such as JSON or XML.

Your foundational knowledge for this exercise begins with what you have learned from working with “your favorite gene” and from using the services’ corresponding websites. Furthermore, the final URLs that lead to the full gene data are already known to you: they are in the ajax-starter files from the Week 7 assignment. You will want to use a combination of a web browser and curl, with a code-savvy editor like Atom or Visual Studio Code to help make any received data more readable to you.

The Deliverable

Upon determining how to go from a gene name/symbol to that gene’s individual data record (as shown in the Week 7 ajax-starter files), write up this process as a reproducible “recipe” in your electronic journal. In general, this recipe will consist of:

  • The URLs to access in order to retrieve the desired data
  • Any portions in these URLs that need to be substituted for specific queries, such as the gene name or ID within that web service
  • Specific instructions on how to interpret the data returned by each URL so that you can extract exactly the information you need in order to proceed to the next step

This exercise is somewhat unusual in that the work lies in the process of figuring out how to use the web service. Once the steps are known, actually performing these steps is quite straightforward. Thus, although the prospect of doing this may be quite intimidating to those who are new to it, please rest assured that the journey itself is the reward here and it is the very open-endedness of this exploration that we’d like you to experience in this exercise.

That said, it is again imperative that you take good notes about the things you try, and their results, so that you don’t go around in circles and can eventually narrow down your exploration to the desired set of steps.

Electronic Journal

GRNsight Beta Version Test: Edge Weights + Normalization

My partner and I used these three files in order to perform the following tests:

  • https://github.com/dondi/GRNsight/blob/beta/test-files/demo-files/21-genes_31-edges_Schade-data_estimation_output.graphml
  • https://github.com/dondi/GRNsight/blob/beta/test-files/demo-files/21-genes_31-edges_Schade-data_estimation_output.sif
  • https://github.com/dondi/GRNsight/blob/beta/test-files/demo-files/21-genes_31-edges_Schade-data_estimation_output.xlsx

After loading each of these files into GRNsight individually, I found that...

Test 1: passed
Test 2: passed
Test 3: passed
Test 4: passed
Test 5: passed
Test 6: passed
Test 7: passed
Test 8: passed
Test 9: passed
Test 10: passed
Test 11: passed
Test 12: passed
Test 13: passed
Test 14: passed
Test 15: passed
Test 16: passed
Test 17: passed
Test 18: passed

Observations: When performing the tests involving normalization, I noticed that because the graph reloads every time the normalization factor is set, it is hard to see what changed when the new factor is close to the existing edge weights. When the factor is set to an extreme value, however, the change in edge thickness is clearly visible. My partner noticed this as well when he performed his own tests.

I performed these tests by first downloading the three files linked above and saving them to my computer. Once they were downloaded, I opened them in GRNsight and followed the instructions for the test subset that my partner and I were assigned. The instructions were very clear and easy to follow. By comparing the expected outcome to the actual outcome, I was able to see whether each test passed or failed, and thankfully, they all passed.
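
The normalization observation above can be illustrated with a small sketch. This is not GRNsight's actual code: it assumes a simple linear mapping in which thickness is proportional to the absolute edge weight divided by the normalization factor, capped at a maximum. The function name, the pixel limits, and the sample weights are all made up for illustration.

```python
def edge_thickness(weight, normalization_factor, max_px=10.0, min_px=1.0):
    """Hypothetical linear mapping from edge weight to line thickness.

    Assumption: thickness grows with |weight| / normalization_factor and is
    capped once that ratio reaches 1. This is NOT taken from GRNsight.
    """
    if normalization_factor <= 0:
        raise ValueError("normalization factor must be positive")
    scaled = min(abs(weight) / normalization_factor, 1.0)
    return min_px + scaled * (max_px - min_px)


if __name__ == "__main__":
    weights = [0.5, -1.2, 2.0]  # made-up edge weights, not from the test files
    # 2.0 matches the largest weight, 2.2 is close to it, 20.0 and 0.1 are extreme.
    for factor in (2.0, 2.2, 20.0, 0.1):
        thicknesses = [round(edge_thickness(w, factor), 2) for w in weights]
        print(f"factor={factor}: {thicknesses}")
```

Under these assumptions, a factor close to the largest existing weight barely changes the thicknesses, while a very large factor makes every edge thin and a very small one makes every edge hit the maximum, which is consistent with what we saw on screen.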

Web Service API Exploration

  • We used the link listed below to retrieve the desired data:

https://yeastmine.yeastgenome.org/yeastmine/service/data/Gene

  • We added ?symbol=BRO1 to the end of the URL in order to specify that we were looking for the specific gene symbol, BRO1.

https://yeastmine.yeastgenome.org/yeastmine/service/data/Gene?symbol=BRO1

  • Copying and pasting that URL into a web browser automatically downloads a .json file containing the data on the gene BRO1. This file can be opened in a plain text editor and read by anyone who understands JSON. The same steps work for any other gene: replace BRO1 in the second step with any gene symbol that YeastMine recognizes.
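
The three steps above can also be sketched in Python instead of a web browser. The base URL and the symbol parameter come from the recipe; the function names are my own, and the layout of the returned JSON is not assumed here, so the sketch only parses and pretty-prints whatever the service returns.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base endpoint taken from the first step of the recipe.
BASE_URL = "https://yeastmine.yeastgenome.org/yeastmine/service/data/Gene"


def build_gene_url(symbol):
    """Return the recipe URL with ?symbol=<symbol> appended (second step)."""
    return BASE_URL + "?" + urlencode({"symbol": symbol})


def fetch_gene_record(symbol, timeout=30):
    """Download and parse the JSON for one gene symbol (needs network access)."""
    with urlopen(build_gene_url(symbol), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def pretty(record):
    """Indent parsed JSON so it is easier to read than the raw download."""
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(build_gene_url("BRO1"))
    # Uncomment to contact the live service (third step of the recipe):
    # print(pretty(fetch_gene_record("BRO1")))
```

Calling build_gene_url with a different symbol, such as ADH1 or ACT1, gives the URL for that gene in the same way.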


To figure out the process listed above, my partners and I used the hints that Dondi provided in order to get the JSON URL from the ajax-starter files. We started by using the browser's Inspect function on the page and locating the gene ID. Once we found it, we used the link attached to that gene ID to retrieve the URL, https://yeastmine.yeastgenome.org/yeastmine/service/data/Gene?id=66043773 . Since either the gene ID or the gene symbol can be used, we replaced the id=66043773 portion of that URL with symbol=BRO1, giving https://yeastmine.yeastgenome.org/yeastmine/service/data/Gene?symbol=BRO1 . We then found that simply copying and pasting that URL into a web browser downloads a .json file automatically. We repeated this for other genes, such as ADH1 and ACT1, and retrieved the .json files for those genes as well, which led us to conclude that swapping in any other gene symbol retrieves that gene's data.
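
The substitution described above, swapping the id query for a symbol query, can be written as a short Python sketch. The starting URL is the one we found; the helper name with_query is my own, and the sketch only builds URLs without contacting the service.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# The URL we found by inspecting the page, which identifies the gene by ID.
ID_URL = "https://yeastmine.yeastgenome.org/yeastmine/service/data/Gene?id=66043773"


def with_query(url, key, value):
    """Return url with its whole query string replaced by key=value."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query=urlencode({key: value})))


if __name__ == "__main__":
    # The same genes we tried by hand.
    for symbol in ("BRO1", "ADH1", "ACT1"):
        print(with_query(ID_URL, "symbol", symbol))
```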

Acknowledgements

I met with Simon on Sunday 10/29 to start working on the tests for GRNsight. We met in the biodatabases lab in person in order to collaborate on this section. On Monday 10/30 I met with Simon, John, and Hayden in the lab to work on the API exploration of the assignment.

Dondi helped answer our lifeline question and provided us with the hints we needed in order to complete the API exploration.

While I worked with the people noted above, this individual journal entry was completed by me and not copied from another source. Dbashour (talk) 23:03, 30 October 2017 (PDT)

References

LMU BioDB 2017. (2017). Week 9. Retrieved August 29, 2017, from https://xmlpipedb.cs.lmu.edu/biodb/fall2017/index.php/Week_9
GRNsight Client Side Testing Document: Edge Weights + Normalization. (2017). Retrieved October 30, 2017, from https://xmlpipedb.cs.lmu.edu/biodb/fall2017/images/b/b7/GRNsight_Testing-Edge_Weights_and_Normalization.pdf
GRNsight. (2017). Retrieved October 29, 2017, from http://dondi.github.io/GRNsight/

Dina Bashoura
