Week 4 E-notes Eyanosch

From LMU BioDB 2015

3 command sequences

  • one for each question

So far, for the first question:

cat infA-E.coli-K12.txt | grep "[ct]at[at]at" | grep "tt[gt]ac[at]" | sed "s/tttact/ <minus35box> & <\/minus35box> /g" | sed "s/cattat/ <minus10box> & <\/minus10box> \n/" | sed "s/a/ <TSS>&<\/TSS>/" | sed "2s/gagg/ <RBS>&<\/RBS> /" | grep "aaaaggt.*gcctttt"

  • The only problem I'm having is that only the second line shows when I add grep "aaaaggt.*gcctttt" at the end to find the hairpin loop.
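
This behaviour is consistent with how grep works: it prints only the lines that match the pattern, and the earlier sed step that writes \n splits the sequence into two lines. A minimal sketch, assuming a made-up two-line input rather than the real infA sequence:

```shell
# Hypothetical two-line input standing in for the sequence after the split.
two_lines=$(printf 'ttgacacattat\naaaaggtcccgcctttt\n')

# grep keeps only the line that matches, so the first line is dropped.
matched=$(printf '%s\n' "$two_lines" | grep "aaaaggt.*gcctttt")

printf '%s\n' "$matched"
```

Joining the lines back into one before the final grep avoids losing the first line.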

| sed "s/aaaaggt.*tttttatt/ <Terminator>&<\/Terminator>/g"

  • adds the description of the terminator sequence

What I'm trying to do is use the sed ':a;N;$!ba;s/\n//g' form to combine line 1 with line 2, but so far I have been unable to do so. I think it has to do with the way I'm writing the code into the Mac terminal. My reasoning was that, to find the "a" for the TSS, I started a new line and counted down 12 nucleotides, which happened to land on an a; no prior nucleotides were adenine. The problem is combining lines 1 and 2 after finding the TSS.

  • Changed my plan of attack after going through more of the wiki. Copied and pasted the 3 sed commands for manipulating lines and it worked out.
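
For reference, a hedged sketch of two ways to join lines that should behave the same on a Mac and on Linux. The assumption here is that the trouble comes from the sed that ships with macOS, which as far as I know does not accept a label followed by ; in a one-line script the way GNU sed does. The two-line input is made up for illustration:

```shell
# Hypothetical two-line input standing in for the split sequence.
input=$(printf 'ccggttc\ngaggaata\n')

# Option 1: the same sed program, with each command in its own -e argument.
joined_sed=$(printf '%s\n' "$input" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n//g')

# Option 2: delete every newline with tr (this also drops the final newline).
joined_tr=$(printf '%s\n' "$input" | tr -d '\n')

printf '%s\n%s\n' "$joined_sed" "$joined_tr"
```

Both options put the whole sequence on a single line, so later sed and grep steps see one line only.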

Code as is:

cat infA-E.coli-K12.txt | grep "[ct]at[at]at" | grep "tt[gt]ac[at]" | sed "s/tttact/ <minus35box> & <\/minus35box> /g" |
sed "s/cattat/ <minus10box> & <\/minus10box> /" | sed "s/ccggttc/&\n/g" | sed "2s/a/ <TSS>&<\/TSS> /1" | sed ':a;N;$!ba;s/\n//g' |
sed "2s/gagg/ <RBS>&<\/RBS> /" | grep "aaaaggt.*gcctttt"


Question 1 code:

eyanosch@ab201:/nfs/home/dondi/xmlpipedb/data$ cat infA-E.coli-K12.txt | grep "[ct]at[at]at" | grep "tt[gt]ac[at]" |
sed "s/tttact/ <minus35box> & <\/minus35box> /g" | sed "s/cattat/ <minus10box> & <\/minus10box> /" |
sed "s/ccggttc/&\n/g" | sed "2s/a/ <TSS>&<\/TSS>/1" | sed ':a;N;$!ba;s/\n//g' | sed "s/gagg/ <RBS>&<\/RBS> /g" |
grep "aaaaggt.*gcctttt" | sed "s/aaaaggtc.*tttttatt/ <terminator>&<\/terminator> /g" |
sed "s/atg/ <start codon>&<\/start codon> /4" | sed "s/tga/ <stop codon>&<\/stop codon> /11"


  • After making a few adjustments:

cat infA-E.coli-K12.txt | grep "[ct]at[at]at" | grep "tt[gt]ac[at]" | sed "s/tttact/ <MINUS35BOX> & <\/MINUS35BOX> /g" |
sed "s/cattat/ <MINUS10BOX> & <\/MINUS10BOX> /" | sed "s/cttgcc/&\n/g" | sed "2s/g/ <TSS>&<\/TSS>/1" | sed ':a;N;$!ba;s/\n//g' |
sed "s/gagg/ <RBS>&<\/RBS> /g" | grep "aaaaggt.*gcctttt" | sed "s/aaaaggtc.*tttttatt/ <TERMINATOR>&<\/TERMINATOR> /g" |
sed "s/tac/ <START CODON>&<\/START CODON> /7" | sed "s/att/ <STOP CODON>&<\/STOP CODON> /11"
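
The trailing numbers in the last two sed commands (for example /7 and /11) select the Nth match of the pattern on the line. A minimal sketch of that flag, using a made-up sequence and a hypothetical <X> tag:

```shell
# Hypothetical sequence with three occurrences of "atg".
seq="atgccatgaatgtt"

# The trailing 2 tells sed to replace only the second match on the line.
tagged=$(printf '%s\n' "$seq" | sed "s/atg/<X>&<\/X>/2")

printf '%s\n' "$tagged"
```

sed counts every non-overlapping match from left to right, so the count does not follow the reading frame; the number has to be worked out against the actual sequence.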