Laurmagee: Week 5

From LMU BioDB 2013
  • I followed the link http://www.uniprot.org to the UniProt system and entered the query P00533 into the search bar at the top middle of the home page.
  • To look at the "General Information" of the entry, I scrolled down to the bottom of the page to find the section entitled "Entry Information". This was surprisingly hard to find because the page was so full of information, but I eventually found it above the "Relevant Documents" section.
  • The "Name and Origin" section was the first section on the page, and the references were in the middle of the page. These were easy to find because there was a long list of references included for this specific gene. The "General Annotation" section, third from the top, served as a venue for commenting on specific aspects of the gene, such as its involvement with disease.
  • The Cross-References featured on the page were organized into categories and sub-categories based on the databases they belong to.
    1. EMBL
      • The database lists the type of organism the gene is commonly seen in, as well as the molecule type, topology, data class, sequence length, sequence version, and the date the gene page was first made public in the database. It also includes the base range of the gene, its nucleotide and amino acid sequences, and the product of the gene's expression.
    2. InterPro
      • This database shows the domain relationships associated with your gene of interest. It also offers a very detailed description of the gene and what its possible interactions are. It lists the proteins that are coded by the gene and gives their specific name, species, and family for identification. The gene's domain organisations are shown in an easy-to-read chart, including the number of proteins in each domain. The species that express the gene, the gene's structure, and further literature regarding the gene are also offered. InterPro also offers 3-D models of the gene, located in the "structure" section.
    3. PDB
      • This database contains links to specific articles on the query you provide.
    4. Pfam
      • This database highlights the domain organisation, clan, alignments, species, interactions, and structures of the gene entered into the query. It includes many visual representations of the gene's descriptors and makes it easy to move around the information.
    5. RefSeq
      • This database is formatted almost identically to GeneID, but it highlights different aspects of the gene. It notes the gene locus, definition, accession, and version, along with the details of multiple articles relating to the gene of interest. The FASTA file is also available from this source, so you can see the complete amino acid sequence of the gene.
    6. GeneID
      • This database offers the official symbol, full name, primary source, gene type, and lineage of the gene, as well as the organism the gene comes from, the location and sequence of the gene in the genome, and the genomic regions, transcripts, and products. This database is very concise, without any separate sections or imaging.
  • I learned that the human EGFR protein comes from Homo sapiens and that its full name is epidermal growth factor receptor. The gene has three interactions within its family: Recep L domain, V-set, and Furin-like. The gene's family is a member of the clan GF_recep_C-rich, which has two other members, Furin-like and GF_recep_IV. Many sequences match certain sections of the gene's architecture, showing the parts of the gene that are not unique to its structure. EGFR has no domain relationships, but its domain is found in 1773 different proteins. There are 198 different domain organisations, which sort the proteins coded for by the gene according to their domain organisation. In addition to humans (56 proteins), other species that carry this gene are: fruit flies (252 proteins), zebrafish (36 proteins), mice (21 proteins), baker's yeast (2 proteins), fission yeast (2 proteins), and Caenorhabditis elegans (75 proteins). A large amount of research has been published on the EGFR gene, and there are many cross-references as well.
  • Reflection Questions:
    1. The purpose of this exercise was to expose us to working with databases and to help us understand what type of information they can provide for specific gene queries. We were able to identify which database we feel most comfortable using in future assignments and which would be best for certain tasks. We learned to comb through the databases to find useful information for determining numerous aspects of our gene.
    2. I feel that I accomplished the goals above, and I also learned how simply and interactively some of the data can be displayed in the databases. I really enjoyed the maps where you could highlight a certain area and find out more about how it related back to the entire gene. Just having the time to explore the databases before using them for a specific task was a helpful aspect of this assignment.
    3. I actually didn't have much trouble understanding this assignment, because I have had previous experience with gene databases. I was in the HHMI lab as a freshman, and all of second semester was dedicated to looking through GenBank and GeneID to find out more about the genes in the genome of our sequenced bacteriophage.
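
As a side note to the species list recorded above, the protein counts can be tallied and ranked with a few lines of Python. This is only a rough sketch: the numbers are copied by hand from my notes rather than pulled from any database, and the names `protein_counts`, `rank_species`, and `total_proteins` are made up for this illustration.

```python
# Protein counts per species, copied by hand from the notes above
# (not downloaded from any database).
protein_counts = {
    "human": 56,
    "fruit fly": 252,
    "zebrafish": 36,
    "mouse": 21,
    "baker's yeast": 2,
    "fission yeast": 2,
    "Caenorhabditis elegans": 75,
}


def rank_species(counts):
    """Return (species, count) pairs sorted from most to fewest proteins."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


def total_proteins(counts):
    """Return the sum of the protein counts across all listed species."""
    return sum(counts.values())


if __name__ == "__main__":
    # Print one aligned row per species, then the overall total.
    for species, count in rank_species(protein_counts):
        print(f"{species:<24}{count:>5}")
    print(f"{'total':<24}{total_proteins(protein_counts):>5}")
```

Ranking the counts this way makes it easy to see that fruit flies have by far the most proteins in the list, while the two yeasts have the fewest.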

Laurmagee (talk) 23:05, 26 September 2013 (PDT)
