Kmeilak Week 5

From LMU BioDB 2013

UniProt Lab Journal

  • Began at UniProt website http://www.uniprot.org
  • Entered the primary accession number P00533 into the UniProt query box, which led to the entry page for that protein (entry name EGFR_HUMAN).
  • Entry information section near bottom of the page contained entry name, primary and secondary accession numbers, entry history and status
  • Found protein names, gene names, organism, taxonomic identifier and lineage in Names and Origin Section at the top of the entry.
  • Reference section listed the published papers reporting research on P00533.
  • Followed link of E.C. number of protein, found and highlighted P00533 in list on this page.
  • Followed highlighted link for P00533, which returned me to P00533 page. Kept this page open and opened E.C. page in a new tab.
  • Found description near top of the page for P00533 under General Annotation (Comments) Section including function, catalytic activity, and tissue specificity, among others.
  • Clicked on Cross-Reference tab at top of P00533 page which took me to that section of the page. Below are the descriptions of information found on different databases asked for in the homework assignment. The link at the end of each database entry is the same one that was followed from the Cross-Reference section of the P00533 page.
  1. EMBL - This database has information concerning genomic DNA, mRNA, and their translation into protein. It lists the base pair sequences for a particular gene as found in its genomic DNA or mRNA sequence, and a graphic displaying what parts of that gene are significant in affecting the final protein product. http://www.ebi.ac.uk/ena/data/view/X00588
  2. InterPro - This database has textual information describing the function, and sometimes morphology and bodily significance, of particular protein domains and families. http://www.ebi.ac.uk/interpro/entry/IPR006212
  3. PDB - This database has information focusing on the 3-dimensional structure of proteins, and has information concerning the domains that comprise the protein. http://www.pdb.org/pdb/explore/explore.do?pdbId=1IVO
  4. Pfam - This database has information from its own archives and other databases about protein families and domains. It includes a 3-dimensional graphic of the protein or domain but focuses on providing links to all sources of information. http://pfam.sanger.ac.uk/family/PF00757
  5. RefSeq - This database lists all journal articles where the protein of interest is mentioned or has been researched, a summary of what the protein's structure and function are, and the protein's amino acid sequence. http://www.ncbi.nlm.nih.gov/protein/NP_005219.2
  6. GeneID - This database gives general information about both the gene and the protein product translated from the gene. This includes genomic region, transcripts, products, phenotypes, variation, interactions with other genes, and related sequences. This database in particular has a lot of information about all aspects of the gene and gene product, starting from gene, intermediate RNAs, and final gene product including relations to other genes and proteins. http://www.ncbi.nlm.nih.gov/gene/1956
  • Viewed Ontologies section and found keywords for entry.
  • Viewed Sequence Annotation (Features) section, which gives access to the entire amino acid sequence of the protein. The sequence is accessible by following any of the blue links in this section that show a range of numbers. For example, following the 646-668 link next to the Transmembrane region takes the user to a page with the amino acid sequence for that part of the protein.
  • Viewed data in TXT, XML, and GFF formats by clicking on the corresponding link near the top of the page.
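The format links and the numbered feature ranges described above can also be reached from a script. The following is a minimal sketch, not the method used for this journal. It assumes the current UniProt REST host (rest.uniprot.org/uniprotkb), which is not the address used in 2013, and it assumes a FASTA download in addition to the TXT, XML, and GFF formats listed above. The function names are hypothetical helpers made up for illustration.

```python
from urllib.request import urlopen

# Assumption: current UniProt REST host and path layout (not from the journal).
BASE_URL = "https://rest.uniprot.org/uniprotkb"
# TXT, XML, and GFF are the formats named in the journal; FASTA is an added assumption.
FORMATS = ("txt", "xml", "gff", "fasta")


def entry_url(accession, fmt):
    """Build the download URL for one entry in one file format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return f"{BASE_URL}/{accession}.{fmt}"


def fetch_entry(accession, fmt):
    """Download one entry as text (requires network access)."""
    with urlopen(entry_url(accession, fmt)) as response:
        return response.read().decode("utf-8")


def fasta_sequence(fasta_text):
    """Join the sequence lines of a single-record FASTA file, skipping the header."""
    lines = fasta_text.strip().splitlines()
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def feature_range(sequence, start, end):
    """Return residues start..end using 1-based, inclusive positions."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range outside sequence")
    return sequence[start - 1:end]
```

With network access, feature_range(fasta_sequence(fetch_entry("P00533", "fasta")), 646, 668) should return the 23 residues of the transmembrane region, since UniProt feature positions are 1-based and inclusive.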

Summary

The human EGFR (epidermal growth factor receptor) protein is a receptor tyrosine kinase: a transmembrane protein whose primary function is to receive a signal and translate it into an appropriate cell response. It is 1210 amino acids long, has one protein kinase domain, and is involved in lung cancer. It catalyzes the phosphorylation of an L-tyrosine residue in a protein, using ATP as the source of both the energy and the phosphate group. It is found in several places in the cell, including the endoplasmic reticulum membrane and the Golgi apparatus membrane.

Questions

  1. The purpose of this exercise was to familiarize myself with the function of multiple databases. It seeks to teach what each database is useful for, especially UniProt, which will be used in this class for our final project.
  2. I learned how to navigate a database and what each of several different databases is primarily useful for, and I learned more about the human EGFR protein.
  3. It was challenging to understand the finer points of each database, and to work out what every figure and group of links represented on a first or even second look.


Kmeilak (talk) 21:20, 12 September 2013 (PDT)
