Malverso Week 11

From LMU BioDB 2015

Personal Goals

  • Prepare for journal club presentations
  • Set up coding/testing environment
  • Determine the regular expression for the ordered locus ID for your species
  • Identify the appropriate model organism database for your species.
  • Perform an initial Gene Database export and Gene Database Testing Report

Electronic Journal

  • After struggling to find a Model Organism Gene Database for Shewanella oneidensis, I asked Dr. Dahlquist for assistance.
  • The only database link we could find was broken.
  • We decided that our next plan of action would be to use the website bacteria.ensembl.org as our MOD.

Journal Club Presentation

File:GenomePPT 20151123 HMH.pdf

Genome Paper Outline

Heidelberg, J. F., Paulsen, I. T., Nelson, K. E., Gaidos, E. J., Nelson, W. C., Read, T. D., ... & Fraser, C. M. (2002). Genome sequence of the dissimilatory metal ion–reducing bacterium Shewanella oneidensis. Nature biotechnology, 20(11), 1118-1123. doi:10.1038/nbt749

The Significance of this Work

  • Shewanella oneidensis has the potential to be used for bioremediation. It is a respiratory generalist: it can use oxygen during aerobic respiration, but it can also survive in anaerobic conditions by reducing other electron acceptors such as oxidized metals. This is important because it suggests that introducing Shewanella oneidensis to an environment could reduce metal pollutants. Before using Shewanella oneidensis in this capacity, however, it is necessary to understand the possible negative effects of introducing it to an environment as well as the positive effects, and that is why sequencing the whole Shewanella oneidensis genome is important.
  • From the genome sequencing and analysis, the authors were able to come to the conclusion that although they discovered virulence determinants in the Shewanella oneidensis genome, S. oneidensis is infrequently a human pathogen. They also found a lambda-like phage during their analysis that introduces the possibility of genetically manipulating the genes so that a microbe could be created to address a specific area needing bioremediation. The genome sequencing will allow others to conduct further experiments to be able to better predict the behaviors of S. oneidensis, which will decide if and how S. oneidensis can be used for bioremediation purposes.

Methods Used in this Study

  • The genome was sequenced using the whole-genome sequencing method. Shewanella oneidensis was grown from a single, isolated colony. It was then cloned, sequenced, and assembled by TIGR. First, a plasmid library and a shotgun library were created. Each was sequenced, and the sequences were then joined by the TIGR assembler using the criterion that every position had to have at least double clone coverage. The sequence was then edited manually, and PCR as well as sequencing reactions were used to close gaps and improve coverage. In total, 71,777 sequences were used to create the final genome.
  • Analysis of the genome was then conducted. Glimmer software was used to identify open reading frames that were likely to encode proteins. The predicted open reading frames were translated into amino acid sequences, and the resulting likely proteins were searched against a protein database. TopPred was then used to find membrane-spanning domains in the proteins. Markov models were also used to find and mask repeated domains within proteins. It is important to note that the authors operated under the assumption that the DNA composition was fairly uniform.
  • Finally, the authors compared the final genome to all other complete genomes available. They did this using both the National Center for Biotechnology Information and the TIGR Comprehensive Microbial Resource Database.

Overview of the Results

  • The authors found that the genome is a circular chromosome of 4,969,803 base pairs containing 4,758 predicted protein-encoding open reading frames (CDSs). The circular representation of the S. oneidensis genome is shown in Figure 1. The figure depicts predicted coding regions on the outer two circles, genes involved with electron transport on the third circle, and phage-related genes on the fourth circle.
  • After the analysis, the authors were able to assign biological function to 54.4% of the open reading frames predicted to encode proteins, using a classification scheme adapted from Riley. The statistics are described in detail in Table 1, with 2430 out of 4758 predicted CDSs being similar to known proteins. Table 1 also shows the number of CDSs similar to proteins of unknown function as being 843 out of 4758 CDSs total. This supports their statistic that 22.2% of the CDSs matched predicted coding sequences from other organisms that were not yet assigned a function. The rest of the CDSs, 23.1%, were found to be unique to Shewanella oneidensis.
  • They also found that the genome sequence is the most similar to the genome sequence of Vibrio cholerae. As shown in Figure 2, the CDSs by far had the most similarities to the CDSs of V. cholerae, with 32.33% of the V. cholerae genome being similar to the S. oneidensis genome. The Shewanella oneidensis genome was also found to have many similarities to itself. In fact, 683 of the CDSs are very similar to other genes from Shewanella oneidensis, which implies that there are many duplications of genes within its genome. The types of CDSs that were most often duplicated reflect how important it is for Shewanella oneidensis to be able to function as a respiratory generalist. Genome analysis also found a 51,857 bp lambda-like phage genome integrated in the Shewanella oneidensis genome, which is shown in Figure 4. The importance of this finding has been described above.
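
As a quick arithmetic check, the Table 1 counts quoted above can be converted into percentages of the 4,758 predicted CDSs. This is only a sketch: the function name `percent_of_cds` is hypothetical, and the counts are copied from this outline rather than rechecked against Table 1. The results need not equal the percentages quoted from the text of the paper, since the table rows and the quoted percentages may not describe exactly the same categories.

```python
# Total number of predicted CDSs, as quoted in this outline.
TOTAL_CDS = 4758


def percent_of_cds(count: int, total: int = TOTAL_CDS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts as quoted in this outline (not independently verified against Table 1).
similar_to_known = percent_of_cds(2430)
similar_to_unknown_function = percent_of_cds(843)

print(f"Similar to known proteins: {similar_to_known}%")
print(f"Similar to proteins of unknown function: {similar_to_unknown_function}%")
```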

The Results Compared to Previous Studies

  • The authors noted a few times when their findings either supported or contrasted with previous studies. Their findings supported the theory that the metal reduction occurs extracellularly. According to this theory, instead of the metal ions being transported into the cell, the metal is reduced through direct contact with the bacterial surface. The authors found that Shewanella oneidensis did not have a large number of metal ion transporters. In fact, it has fewer than E. coli and V. cholerae.
  • In contrast to the theory that Shewanella oneidensis had a lot of regulatory genes because S. oneidensis had been observed in diverse conditions, the authors found that there were only 88 two-component regulatory system proteins. The regulatory system proteins allow the bacteria to adapt to changing and diverse conditions, but the number they found was significantly lower than the number other environmental bacteria contain.

Journal Club Presentation Prep

Defined Biological Terms

  1. bioremediation: When bacteria, plants, or other biological agents are used to get rid of pollutants in things such as soil or water. Found at this link.
  2. plasmid: DNA that can replicate itself without the chromosomal DNA. Found at this link.
  3. phage: Also known as bacteriophage. Viruses that usually cause the disintegration of specific bacteria through infection. Found at this link.
  4. cytochromes: A protein whose main function is electron transport. Found at this link.
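  5. redox: An abbreviation of reduction-oxidation, a reaction in which one substance is reduced (gains electrons, classically described as losing oxygen) while another is oxidized. Found at this link.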
  6. heme: An iron compound that has oxygen carrying properties and is the non-protein part of hemoglobin. Found at this link.
  7. paralogous genes: Genes that are similar but at two different locations in the chromosome of the organism. This indicates that the sets came from an ancestral gene. Found at this link.
  8. hydrogenase: A catalyst for the formation/oxidation of H2. Found at this link.
  9. heterodimeric: An adjective that describes a protein composed of two differing polypeptide chains. Found at this link.
  10. aquaporin: A channel that allows water to pass through the membrane (selectively), but not ions. Found at this link.
  11. efflux: The process of flowing out. Found at this link.
  12. biofilm: A colony of microorganisms that are encased in a protective coating of their own secretions. They can form on solid and liquid surfaces. Found at this link.
  13. virulence determinants: Factors that allow bacteria to cause disease. Found at this link.

Model Organism Database (MOD) Review

  • What types of data can be found in the database (sequence, structures, annotations, etc.)?
    • You can search for genes, regions of the chromosome, or descriptions such as enzymes or proteins, along with splice variants, cDNA and protein sequences, and non-coding RNAs. You can download the DNA sequence, view a karyotype of the genome, and find other information such as the number of base pairs and the number of genes. [1]
  • Is it a primary or “meta” database; is it curated electronically, manually [in-house], or manually [community]?
    • This is a meta database. I know this because it gets its information about S. oneidensis from a different database, the European Nucleotide Archive.
    • The database is curated electronically in the sense that the genomes are automatically annotated. However, Ensembl as a whole is updated about every two months, in releases made by the Ensembl organization, which implies that some curation is manual (in-house). The site also says that each release includes new data such as new genomes, assemblies, and gene models, as well as software updates and additions to existing data sets. I conclude that the database is curated both electronically and manually.

  • What individual or organization maintains the database?
    • Ensembl is a joint project between the European Bioinformatics Institute (EMBL-EBI), an outstation of the European Molecular Biology Laboratory (EMBL), and the Wellcome Trust Sanger Institute (WTSI). [2]
  • What is their funding source(s)?
    • It is funded by the European Molecular Biology Laboratory as well as other grants. [3]
  • Is there a license agreement or any restrictions on access to the database?
    • No, the database is open for all. [4]
  • How often is the database updated?
    • The entire database website is updated every two months. [5] However, when I clicked on the link to the provider of the S. oneidensis information, it showed that this specific information was last updated on the 16th of November, 2012. [6]
  • Are there links to other databases?
  • Can the information be downloaded?
    • Yes, it can be downloaded.
  • In what file formats?
    • It can be downloaded in both FASTA and GFF3 formats.
  • Evaluate the “user-friendliness” of the database.
    • This website is user-friendly. There are pictures and easy-to-understand labels on the buttons. The search bar is at the top, and it is clear that it can be used in a few different ways.
  • Is the Web site well-organized?
    • Yes, the resources available are clearly shown on the home screen, along with pictures, links to more information, and download buttons.
  • Does it have a help section or tutorial?
    • Yes, there is a website help button, as well as FAQ pages and buttons to click if you want more information on a variety of items. There are also examples of searches to help out.
  • Run a sample query. Do the results make sense?
    • I searched for SO_2097, which was mentioned in the genome paper. The results provided me with a chromosome location, a description of the gene, and a synonym for the gene. The description on the site matched the brief description in the genome paper, so the results made sense.
  • What is the format (regular expression) of the main type of gene ID for this species (the "ordered locus name" ID)? (for example, for Vibrio cholerae it was VC#### or VC_####).
    • SO_#### is the gene ID format: the prefix SO_ followed by four digits, which can be written as the regular expression SO_\d{4}.
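
The SO_#### format above can be tried out with a small Python sketch. The pattern and the helper names (`is_ordered_locus_id`, `find_ordered_locus_ids`) are assumptions for illustration: they take "####" to mean exactly four digits, and they would need to be checked against the actual gene database export, which may contain IDs in other forms.

```python
import re
from typing import List

# Assumed pattern: "SO_" followed by exactly four digits (the SO_#### format).
# This is an assumption to verify against the real export, not a confirmed rule.
ORDERED_LOCUS_RE = re.compile(r"SO_\d{4}")


def is_ordered_locus_id(text: str) -> bool:
    """Return True only if the whole string has the form SO_####."""
    return ORDERED_LOCUS_RE.fullmatch(text) is not None


def find_ordered_locus_ids(text: str) -> List[str]:
    """Return every SO_#### ID found in free text, in order of appearance."""
    # Word boundaries keep longer tokens such as SO_20975 from matching.
    return re.findall(r"\bSO_\d{4}\b", text)


print(is_ordered_locus_id("SO_2097"))
print(find_ordered_locus_ids("Compare SO_2097 with SO_0001, but not VC_1234."))
```

SO_2097 is the gene used in the sample query above; the other IDs in the example text are made up to show what the pattern accepts and rejects.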


Team Page

Heavy Metal HaterZ

Assignments

Individual Journal Entries

Shared Journal Entries