Jwoodlee Week 11

From LMU BioDB 2015

Individual Journal Assignment

Citation

Jin, Q., Yuan, Z., Xu, J., Wang, Y., Shen, Y., Lu, W., … Yu, J. (2002). Genome sequence of Shigella flexneri 2a: insights into pathogenicity through comparison with genomes of Escherichia coli K12 and O157. Nucleic Acids Research, 30(20), 4432–4441. Link: HERE

Preparation for Journal Club on Your Species

Some steps are taken from here.

  1. I made a list of 12 biological terms for which I did not know the definitions when I first read the article. I then defined each of the terms.
    1. enterohemorrhagic - Based on what my teammates have told me and what I've found on the CDC website, enterohemorrhagic describes a strain of E. coli that induces hemorrhagic diarrhea, which is a result of bleeding into the intestines.
    2. plasmid - Plasmids are circular DNA molecules that replicate independently of the chromosome and hold a few genes; some can be inserted into genomes. [1]
    3. inversions - An inversion is a rearrangement of a chromosome. Physically it occurs when a segment of the chromosome breaks off and is reinserted at the same spot in the reverse orientation. Source
    4. translocations - A translocation is another type of chromosomal rearrangement, in which a segment breaks off and is moved to another location within the same chromosome or to another chromosome. Source
    5. bacteriophage - A bacteriophage is a virus that can infect bacteria. Biology Online Dictionary
    6. pathogenicity islands - The segment of genetic material within an organism (in this case, within the bacteria) that gives the organism the ability to cause disease. This helped.
    7. enteric - I just used the google dictionary for this: [2] It means relating to the intestines, so it isn't really strictly a biological term.
    8. pseudogenes - A pseudogene is a stretch of DNA that resembles a gene but has lost its function, so it is typically not expressed as a working protein. Source
    9. accession number - From what I can tell, an accession number in Biology is a unique identifier given to a protein or DNA sequence so it can be tracked. [3]
    10. virulence - The severity of the disease the bacteria cause and a measure of how infectious they are. source
    11. operons - An operon is a cluster of genes with a single promoter. source
    12. serotype - Serotypes are classifications within a species of bacteria or virus. These classifications are based on distinctive surface structures on the bacteria. CDC.gov
  2. Write an outline of the article. The length should be a minimum of the equivalent of 2 pages of standard 8 1/2 by 11 inch paper (you can use the "Print Preview" option in your browser to see the length). Your outline can be in any form you choose, but you should utilize the wiki syntax of headers and either numbered or bulleted lists to create it. The text of the outline does not have to be complete sentences, but it should answer the questions listed below and have enough information so that others can follow it. However, your outline should be in YOUR OWN WORDS, not copied straight from the article.

Outline:

  1. Introduction
    1. The scientists behind this paper decided to sequence the genome of Shigella flexneri serotype 2a. Shigella flexneri causes bacillary dysentery, also called shigellosis, which accounts for an estimated 160 million cases and 1.1 million deaths. This is especially a problem in developing countries. In China, 10 million cases occur a year, of which 50-70% are caused by serotype 2a. Because it can reproduce in the cytoplasm of host cells, Shigella flexneri causes an inflammatory reaction in the colon and rectum. Shigella was identified as the agent of bacillary dysentery in the 1890s. A recent study showed that Shigella emerged from multiple independent origins within E. coli, which this paper supports. Because a genome for serotype 5a had become available, the authors felt it necessary to sequence serotype 2a as well. The virulence plasmid from 2a diverges slightly from that of 5a, and the paper reveals the highly dynamic nature of Shigella.
  2. Materials and Methods
    1. Growth Conditions
      1. The strain they sequenced (Sf301) was isolated from a patient with severe clinical manifestations of shigellosis in Beijing in 1984. This isolate has been used as a reference strain for S. flexneri in China. The strain was grown at 37 degrees Celsius overnight on tryptic soy agar containing 0.01% Congo red. The colonies were inoculated into tryptic soy broth and grown to the stationary phase at 37 degrees Celsius for isolating plasmid and chromosomal DNA.
    2. Sequencing and Sequence Assembly
      1. The plasmid and the chromosome were sequenced and assembled separately using a variety of programs and methods. They used shotgun sequencing: about 48,000 clones were run on automated sequencers, which gave roughly 10-fold coverage.
    3. Open Reading Frames and identification of gene families
      1. Glimmer 2.0 is a program that searches for regions within DNA that code for protein. In this sequencing experiment it was used to find ORFs that possessed more than 30 consecutive codons.
    4. Accession of the genome sequence
      1. In GenBank, the accession number for the Sf301 chromosome is AE005674 and the accession number for the plasmid pCP301 is AF386526.
  3. Results and Discussion
    1. General Features of the Genome
      1. The whole genome is made of a 4,607,203 base pair chromosome and a 221,618 base pair virulence plasmid. About 3.9 Mb (million base pairs) of the chromosome are shared with E. coli K12 (MG1655) and O157 (EDL933), and these shared regions are essentially collinear.
      2. However, they are not entirely collinear: they are interrupted by segments of K12, O157, and Shigella DNA, which the authors call “islands”. Collinearity is also broken by inversions and translocations between the species.
    2. The Shigella islands
      1. There are 64 Shigella islands with sizes greater than 1000 base pairs (detailed in ‘linear map 1’). Among these islands they identified pathogenicity islands.
    3. The Pseudogenes
      1. Frame shifts, stop codons, and insertions in the coding regions appear to play a major part in creating pseudogenes.
    4. Virulence Plasmid pCP301
      1. Like the previously described virulence plasmids from serotype 5a, pCP301 carries many virulence-related genes, IS elements, maintenance genes, and ORFs of unknown function.
  4. Conclusion
    1. Comparison of S. flexneri with E. coli shows that the two are closely related and may turn out to belong to the same genus.
    2. S. flexneri is more closely related to the non-pathogenic E. coli K12 strain than to the pathogenic O157 strain.
    • What is the importance or significance of this work (i.e., your species)?
      • This paper releases the Shigella flexneri genome data to other scientists. This is important because it allows further study of a widespread bacterium that causes shigellosis in China and other developing countries. Since the disease is responsible for an estimated 1.1 million deaths, it is important that people are able to study it, and this paper enables them to do just that.
    • What were the methods used in the study?
      • This strain of S. flexneri was isolated in Beijing in 1984 and has been used as a reference strain for shigellosis in China. It was grown in tryptic soy broth at 37 degrees Celsius and sequenced using shotgun sequencing. They used phred/phrap, a set of programs that automates assembly of the sequences. Using Glimmer 2.0, they identified the open reading frames, and then they compared the genome to those of E. coli using the GenomeComp software.
    • Briefly state the result shown in each of the figures and tables.
      • Figure 1 is a circular map of the sequenced S. flexneri genome compared to E. coli K12. These organisms share the same "backbone structure", which is about 3.9 million base pairs long.
      • Figure 2 is a representation of translocations, inversions, and strain-specific islands in S. flexneri when compared to the two strains of E. coli mentioned.
      • Table 1 lists general features of the Sf301 genome compared to E. coli K12 and O157.
      • Figure 3 shows the N-terminal halves of a class of protein identified in Sf301.
      • Table 2 lists the insertion sequence (IS) elements identified in the genomes of Sf301, MG1655, and EDL933, in the virulence plasmid, and in pWR501 from S. flexneri 5a.
      • Figure 4 is a comparison of two regions on 3 different genomes. The regions are the rfa/waa regions that are responsible for LPS biogenesis.
      • Table 3 is a list of pseudogenes within Sf301 that have a known function; the function is listed next to each pseudogene.
    • How do the results of this study compare to the results of previous studies (See Discussion).
      • This study supported a previous study that asserted that E. coli and S. flexneri are very closely related to each other. Although S. flexneri is pathogenic, it is more closely related to the non-pathogenic E. coli strain K12. This study and the previous study went as far as to say that S. flexneri should potentially be recategorized as part of the Escherichia genus.
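
The ORF step in the Materials and Methods outline can be illustrated with a small sketch. This is not how Glimmer 2.0 works internally (Glimmer relies on a statistical model of coding sequence); it is only a naive scan of the three forward reading frames for a start codon followed by more than 30 codons before a stop. The function name `find_orfs`, the use of ATG as the only start codon, and the choice to count codons from the start codon up to but not including the stop are all assumptions made for illustration.

```python
# Naive ORF scan: a simplified stand-in for the idea of an ORF length cutoff,
# NOT a reimplementation of Glimmer 2.0.
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_CODONS = 30  # cutoff from the outline: ORFs with more than 30 codons


def find_orfs(seq, min_codons=MIN_CODONS):
    """Return (frame, start, end) for each ATG..stop run on the forward strand.

    start is 0-based, end is exclusive and includes the stop codon.
    Only runs with more than min_codons codons before the stop are kept.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # codons before the stop
                if n_codons > min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


# Made-up test sequence: ATG, 31 alanine codons, then a stop codon.
toy = "ATG" + "GCT" * 31 + "TAA"
print(find_orfs(toy))
```

A real analysis would also scan the reverse complement strand and would use a trained gene finder rather than a length cutoff alone.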

For the genome paper (Coder and QA only): in addition to the journal article, please find and review the Model Organism Database (MOD) for your species similarly to what you did to review your assigned database for the NAR assignment. After an email exchange with Dr. Dahlquist, Trixie and I decided to go with this database.

  1. What types of data can be found in the database (sequence, structures, annotations, etc.); is it a primary or “meta” database; is it curated electronically, manually [in-house], or manually [community])?
    • This is a gene database; it houses the sequence of each gene in S. flexneri. Genome maps, gene comparisons, analysis tools, and links to other databases are also included. This is a meta database, and from what we can tell it is curated manually in-house by the scientists.
  2. What individual or organization maintains the database?
    • State Key Laboratory for Molecular Virology and Genetic Engineering, Beijing, China
  3. What is their funding source(s)?
    • State Key Basic Research Program and High Technology Project from the Ministry of Science and Technology of China are their supporters.
  4. Is there a license agreement or any restrictions on access to the database?
    • This database appears to be open access.
  5. How often is the database updated?
    • The last update was June 11, 2014. Unfortunately there isn't data on how often it is updated, but it doesn't seem to have been updated for a while.
  6. Are there links to other databases?
    • Yes there are, on the left hand side there are two links to other databases. It links here and here.
  7. Can the information be downloaded?
    • Yes the information can be downloaded.
  8. In what file formats?
    • It can be downloaded as a .fas file, which we aren't sure how to open on Windows.
  9. Is the Web site well-organized?
    • It is very well organized, easy to get around.
  10. Does it have a help section or tutorial?
    • It has a help section but no tutorial.
  11. Run a sample query. Do the results make sense?
    • The results make sense. The "quick search" function was a little wonky but the advanced search found us what we wanted every time.
  12. What is the format (regular expression) of the main type of gene ID for this species (the "ordered locus name" ID)? (for example, for Vibrio cholerae it was VC#### or VC_####).
    • The format is: "SF####"
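
As a rough illustration of the last few answers: the .fas extension is usually used for FASTA files, which are plain text and should open in any text editor. The sketch below reads FASTA-formatted text and pulls out gene IDs of the form "SF####" with a regular expression. The example records and their header layout are made up; the actual headers in the database download may look different, and the names `read_fasta` and `LOCUS_ID` are just illustrative.

```python
import re

# Assumed ID pattern: "SF" followed by exactly four digits, as recorded above.
LOCUS_ID = re.compile(r"\bSF\d{4}\b")


def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading ">"
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {h: "".join(parts) for h, parts in records.items()}


# Hypothetical records, NOT real database entries.
example = """>SF0001 hypothetical example record
ATGAAACGC
ATTAGC
>SF0002 another made-up record
ATGCCC
"""

records = read_fasta(example)
locus_ids = [LOCUS_ID.search(header).group() for header in records]
print(locus_ids)
```

To use this on a downloaded file, the text could be read with `open(path).read()` and passed to `read_fasta`.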

Journal Club Presentation

Each pair of students will prepare and give a 15 minute PowerPoint presentation for their paper in class on Tuesday, November 17 or Tuesday, November 24.

  • Please follow the Presentation Guidelines for how to format your slides.
  • You will need to prepare ~15 slides (assume 1 slide per minute of presentation).
  • You need to present the information in the outline of your journal article listed above, but organized as a presentation.
    • Specifically, you need to show each of the figures and tables in your article as part of your presentation. Do not have a separate section of your presentation for Methods. Instead, show each of the results (figures/tables) and just explain the methods used to obtain those results on that slide.
  • Your PowerPoint slides must be uploaded to the wiki and linked to from your individual journal page and your team page by midnight, Tuesday, November 17, even if you are presenting the following week.
    • You can update your slides before your presentation, but we will be grading the ones you upload by the deadline.
  • Your presentation (both the slides and the oral presentation) will be evaluated by the instructors using the Presentation Rubric.
  • Your presentation will also be evaluated by your fellow classmates (anonymously) who will answer the following questions:
    1. What is the speaker's take-home message (one short sentence)?
    2. What are the best points about the presentation's organization, visuals, and delivery? Please give at least 2 specific examples.
    3. What points need improvement? Please give at least 2 specific examples.
  • Although you may be working with different partners on this presentation than before, we expect that you will take the feedback from your previous presentation into account when doing this presentation.


BIOL 367, Fall 2015, User Page, Team Page
