Johnllopez Week 3

From LMU BioDB 2017

Hack-a-Page

In this portion of the assignment, I decided to modify the following webpage:

http://www.laloyolan.com/news/reflecting-on-the-u-s-constitution-and-immigrants-rights-advocacy/article_4d76c1d7-6cbd-5cdf-a6c6-cb2e42a4221e.html

I did this by right-clicking on the heading that read "Reflecting on the U.S. Constitution and immigrants' rights advocacy", selecting "Inspect", and changing the text between the span tags. I did the same with the first paragraph by changing the text within the paragraph tags. For the image, I right-clicked on it, selected "Inspect", and found the 'src' attribute, which pointed to the original image. I replaced that image with the one found here:

tvtropes.org (n.d.) Rick and Mortys 1. Retrieved from http://static.tvtropes.org/pmwiki/pub/images/rickandmortys1.png


With the developer panel open, it looks like this:
JohnLopezWeek3Screenshot1.png
Without the developer panel, it looks like this:
JohnLopezWeek3Screenshot2.png

"DMing" The Server with Curl

I found that the best way to invoke curl is with the following template command:

  • Let x = the sequence you want to translate
  • Let y = one of the three output formats, seen within the 'option' tags.
  • Let z = one of the sixteen genetic codes, seen within the 'option' tags.
curl -d "pre_text=x&output=y&code=z" http://web.expasy.org/cgi-bin/translate/dna_aa
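As a concrete sketch of the template above (the sequence here is made up, and the request itself is left commented out), the three placeholders can be filled in like this:

```shell
# Hypothetical example values for the three placeholders
seq="atggccatt"   # x: the DNA sequence to translate
fmt="Verbose"     # y: one of the output formats from the 'option' tags
code="Standard"   # z: one of the genetic codes from the 'option' tags

# Assemble the POST body that curl sends with -d
body="pre_text=${seq}&output=${fmt}&code=${code}"
echo "$body"

# The actual request (commented out to avoid hitting the live server):
# curl -d "$body" http://web.expasy.org/cgi-bin/translate/dna_aa
```

The form field names (pre_text, output, code) come straight from the translation page's HTML; only the example values are invented.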

I developed this method thanks to assistance from the following online source:

CURL and click a button in a website. (n.d.). Retrieved September 18, 2017, from https://stackoverflow.com/questions/2366549/curl-and-click-a-button-in-a-website


Study the Curl'ed Code

Question 1

Within the ExPASy translation server's responses, there are several links. Within the header portion of the page, the following links can be found:

*http://www.isb-sib.ch/
*http://www.expasy.org/
*http://web.expasy.org/translate (this link appears twice)
*http://web.expasy.org/contact

The body contains several links: one to a Wikipedia page, and several that follow the translated results URI. The latter appear to display data specific to the reading frame mentioned in the text of each link.

*http://en.wikipedia.org/wiki/Open_reading_frame
*http://web.expasy.org/cgi-bin/translate/dna_sequences?/work/expasy/tmp/http/seqdna.31977,1
*http://web.expasy.org/cgi-bin/translate/dna_sequences?/work/expasy/tmp/http/seqdna.31977,2
*http://web.expasy.org/cgi-bin/translate/dna_sequences?/work/expasy/tmp/http/seqdna.31977,3
*http://web.expasy.org/cgi-bin/translate/dna_sequences?/work/expasy/tmp/http/seqdna.31977,4
*http://web.expasy.org/cgi-bin/translate/dna_sequences?/work/expasy/tmp/http/seqdna.31977,5
*http://web.expasy.org/cgi-bin/translate/dna_sequences?/work/expasy/tmp/http/seqdna.31977,6

Finally, there is a footer area which contains the following links:

*http://www.isb-sib.ch/
*http://www.expasy.org/disclaimer.html

Question 2

It would appear that several of the links have uniform resource identifiers (URIs), listed below:

*http://web.expasy.org/contact
*http://web.expasy.org/cgi-bin/translate/
*http://www.expasy.org/disclaimer

These identify where on the main server (http://web.expasy.org) these webpages can be found. However, six of the links contain both a URI and a local ID; the local IDs point to the server-side locations that store the translation responses.

*/work/expasy/tmp/http/seqdna.31977,1
*/work/expasy/tmp/http/seqdna.31977,2
*/work/expasy/tmp/http/seqdna.31977,3
*/work/expasy/tmp/http/seqdna.31977,4
*/work/expasy/tmp/http/seqdna.31977,5
*/work/expasy/tmp/http/seqdna.31977,6
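To make that split explicit, one of the links above can be divided at the '?' into the script's URI and the local ID using shell parameter expansion (a small sketch, using the first link):

```shell
link="http://web.expasy.org/cgi-bin/translate/dna_sequences?/work/expasy/tmp/http/seqdna.31977,1"

# Everything before the first '?': the URI of the script on the main server
base="${link%%\?*}"
# Everything after the first '?': the local ID pointing at the stored response
localid="${link#*\?}"

echo "$base"     # http://web.expasy.org/cgi-bin/translate/dna_sequences
echo "$localid"  # /work/expasy/tmp/http/seqdna.31977,1
```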

Using the Command Line to Extract Just the Answers

Below is the single command that extracts just the answers (the translated reading frames):

curl -d "pre_text=cgatggtacatggagtccagtagccgtagtgatgagatcgatgagctagc&output=Verbose&code=Standard" http://web.expasy.org/cgi-bin/translate/dna_aa | grep -E '(PRE|Frame)' | sed 's/<[^>]*>//g'

I developed this command in several steps.

  1. I used what I knew from answering the "DMing the Server with Curl" question to develop the curl command.
  2. Once I realized that I could not simply chain several 'grep' commands, I searched for a way to match two patterns at once, which was necessary for identifying the lines of HTML that contained the text I wanted. I used the following source to develop my solution: https://unix.stackexchange.com/questions/82990/how-can-i-grep-for-this-or-that-2-things-in-a-file
  3. For the 'sed' command, I initially used a chain of 'sed' commands to eliminate each individual HTML tag. This was because 'sed' patterns are greedy: a naive pattern would swallow an entire line that began and ended with tags. To find a more pragmatic solution, I sought the advice of Dondi, who directed me to the Dynamic Text Processing page. There I learned how to avoid the greediness of 'sed' (by matching '[^>]*' instead of '.*' between the angle brackets), and that is how my solution was developed.
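The two filtering stages described above can be demonstrated on a few made-up lines (these sample lines only imitate the server's output, not its actual markup):

```shell
# grep -E keeps only lines matching either pattern;
# sed then strips every tag by matching '<', any non-'>' characters, then '>'
printf '%s\n' '<PRE>atg gta cat</PRE>' '<b>Frame 2</b>' '<p>ignore me</p>' \
  | grep -E '(PRE|Frame)' \
  | sed 's/<[^>]*>//g'
# Prints:
# atg gta cat
# Frame 2
```

Because `[^>]*` cannot cross a closing `>`, each tag is removed individually instead of the whole line being swallowed.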

Acknowledgements and References

Acknowledgements

I worked with my homework partner, Katie Wright, twice in class. We met face-to-face once outside of class on Monday. We texted on Sunday and Monday to discuss the assignment, and we reviewed the assignment and worked simultaneously up to "DMing the Server with Curl". Furthermore, I reviewed the assignment with Simon Wroblewski on Monday. I also went to Dondi's office hours to discuss the 'grep' and 'sed' commands, as well as the "Study the Curl'ed Code" section.

While I worked with the people noted above, this individual journal entry was completed by me and not copied from another source. Johnllopez616 (talk) 19:06, 19 September 2017 (PDT)

References

* CURL and click a button in a website. (n.d.). Retrieved September 18, 2017, from https://stackoverflow.com/questions/2366549/curl-and-click-a-button-in-a-website
* How can I grep for this or that (2 things) in a file? (n.d.). Retrieved September 18, 2017, from https://unix.stackexchange.com/questions/82990/how-can-i-grep-for-this-or-that-2-things-in-a-file
* McMurry et al. (2017) Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data. PLoS Biol 15(6): e2001414. doi: 10.1371/journal.pbio.2001414
* Dynamic Text Processing. (n.d.). Retrieved September 19, 2017, from https://xmlpipedb.cs.lmu.edu/biodb/fall2017/index.php/Dynamic_Text_Processing
* LMU BioDB 2017. (2017). Week 3. Retrieved September 18, 2017, from https://xmlpipedb.cs.lmu.edu/biodb/fall2017/index.php/Week_3
