Johnllopez Week 14


Electronic Lab Notebook

Pulling Data

Thanks to Corrine Wong, I was able to use the following table to figure out what processes had to be built in order to pull each piece of data from its source:

NCBI

UniProt

  • Protein type/name ← Parse XML?
  • Protein sequence ← Parse XML
  • Gene ID ← Parse XML
  • Similar proteins ← could not find in the XML; found on the page

Ensembl

SGD

JASPAR

  • Sequence logo
  • Frequency matrix

Learning XML DOM

I was first tasked with figuring out how to extract data from XML files. This was necessary to pull the pieces above marked "XML". I did so by using parts of the jQuery library and the XML Document Object Model.

The functions I used from jQuery were $.get() and .append(), which I learned how to use thanks to the Week 7 assignment. They allowed me to pull XML files directly from a query and append the result to a webpage. The next challenge was figuring out how to parse the data given to me and extract what I needed.
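A minimal sketch of that pattern is below; the UniProt accession, query URL, and target element are placeholders rather than the exact ones used in the project, and it assumes jQuery is already loaded on the page.

  // Assumes jQuery is loaded; the accession in the URL is a placeholder.
  var queryUrl = "https://www.uniprot.org/uniprot/P12345.xml";
  $.get(queryUrl, function (data) {
      // For an XML response, jQuery hands the callback a parsed XML document
      $("body").append($("<p>").text("Root element: " + data.documentElement.nodeName));
  });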

I then figured out that, since XML is a markup language like HTML, I could use the same Document Object Model functions that I would use to parse HTML. Of course, I had no idea how to do either, so I used MDN to explain certain aspects of it.

I found that serializeToString() (from XMLSerializer) and getElementsByTagName() were useful, and together they allowed me to pull lines of XML.
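As a rough illustration of those two calls (the query URL and the tag name "sequence" are only examples, not necessarily what the project used):

  $.get("https://www.uniprot.org/uniprot/P12345.xml", function (xmlDoc) {
      // Turn the whole document back into one string of XML text
      var xmlText = new XMLSerializer().serializeToString(xmlDoc);
      console.log(xmlText.substring(0, 200));   // peek at the raw XML text

      // Grab every <sequence> element and read the first one's text content
      var sequences = xmlDoc.getElementsByTagName("sequence");
      if (sequences.length > 0) {
          console.log(sequences[0].textContent);
      }
  });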

One problem was that, as in the case of extracting the locus tag and Other Names data from NCBI, the returned string contained both pieces of data! The way I got around this was by taking the data as a string, using the split() function to make each word an element of an array, then further manipulating this array with the splice() function to remove the locus tag once it was retrieved.
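A hypothetical reconstruction of that split()/splice() step, using a made-up stand-in for the combined string:

  // Placeholder string: locus tag first, then the other names
  var combined = "LOCUS123 aliasA aliasB";
  var words = combined.split(" ");        // ["LOCUS123", "aliasA", "aliasB"]
  var locusTag = words.splice(0, 1)[0];   // removes and returns "LOCUS123"
  var otherNames = words.join(" ");       // what remains: "aliasA aliasB"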

Another problem I encountered was that some elements shared tag names, so I had to narrow getElementsByTagName() by specifying that I wanted to view the nodes and children of larger, more accessible XML tags. An example of this was getting the "Protein Type/Name" from UniProt, which was done by getting the child node of a child node of the "protein" tag.
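A sketch of how that narrowing might look, assuming UniProt's <protein>/<recommendedName>/<fullName> nesting and a placeholder query URL:

  $.get("https://www.uniprot.org/uniprot/P12345.xml", function (xmlDoc) {
      // Start from the <protein> tag, then walk down through its children
      var protein = xmlDoc.getElementsByTagName("protein")[0];
      var fullName = protein
          .getElementsByTagName("recommendedName")[0]
          .getElementsByTagName("fullName")[0];
      console.log(fullName.textContent);   // the protein type/name
  });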

There was one piece of data that I unfortunately could not figure out how to extract, because I could not find the API call for it. This is something I will discuss with my partners on Tuesday.

You may see the work I completed on GitHub here.

Acknowledgements and References

Acknowledgements

This week required collaboration between the coders Eddie Azinge, Eddie Bachoura, and Simon Wroblewski. While they developed the necessary components for JSON, I developed the XML portion. In addition, I discussed with Corrine Wong which functions were necessary to pull from the data provided.

While I worked with the people noted above, this individual journal entry was completed by me and not copied from another source. Johnllopez616 (talk) 23:06, 4 December 2017 (PST)


References

MDN. (n.d.). Document Object Model. Retrieved December 1, 2017, from https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model
LMU BioDB 2017. (2017). Week 14. Retrieved December 4, 2017, from https://xmlpipedb.cs.lmu.edu/biodb/fall2017/index.php/Week_14
LMU BioDB 2017. (2017). Week 7. Retrieved December 1, 2017, from https://xmlpipedb.cs.lmu.edu/biodb/fall2017/index.php/Week_7
