Oregon Trail Survivors

From LMU BioDB 2015
The third leading cause of death on the Oregon Trail.

Group Members

Helpful Links

Team Links

  • Files: OTS Deliverables, Gene Database Testing Report
  • Team Members (Week 11, Week 12, Week 14, and Week 15 Assignments):
    • Trixie: Week 11, Week 12, Week 14, Week 15
    • Jake: Week 11, Week 12, Week 14, Week 15
    • Erich: Week 11, Week 12, Week 14, Week 15
    • Kristin: Week 11, Week 12, Week 14, Week 15

Gene Database Project Links

  • Overview, Deliverables, Reference Format
  • Guilds: Project Manager, GenMAPP User, Quality Assurance, Coder
  • Teams: Heavy Metal HaterZ, The Class Whoopers, GÉNialOMICS, Oregon Trail Survivors

Group Meeting Times

  • Thursday, November 5th at 8:00 pm
  • Met most Sunday and Monday evenings in the Biol DB lab to check in with one another.

Goals

Over the upcoming weeks our group will be investigating Shigella flexneri.

Week 10

  1. Find genome sequence paper
  2. Find 4-8 microarray data sets and a paper to go with the genome paper
  3. Compile team page and create a ranked annotated bibliography

Week 11

  1. Prepare for journal club presentations in Weeks 12 and 13
  2. Begin initial tasks on research project

Click on username links for more information regarding each team member's contributions for Week 11.

Jake: Read through the genome paper and worked through the parts I was able to understand. Made an outline for the genome paper. Worked on the presentation with Trixie and found a database. I also answered the assigned questions.

Trixie: Mainly focused on the Genome paper presentation with Jake. This includes searching for a viable database that we will be using for the rest of the group assignment and actually creating the presentation we will be doing for October 17th, 2015. I've also updated our group page to reflect what Dr. Dahlquist suggested would improve our team page.

Erich: Analyzed the microarray paper in order to describe the experimental design of the microarray data, treatments, number of replicates, and dye swaps. Worked with Kristin to produce the PowerPoint for the GenMAPP Users' presentation at Journal Club. Worked on the individual journal entry and created an outline of the microarray paper.

Kristin: Using the team's selected microarray paper I developed an outline including background information, experimental outline/methods and how samples corresponded to the data, a brief description of the results, and a discussion including the implications of the research and its results in comparison to previous studies. Using this outline, I created a flow chart corresponding to the research. I also worked with Erich in order to create a PowerPoint for the Journal Club presentation on Nov. 24.

Week 12

  1. QA will be doing an initial database export.
  2. Coder will be setting up version control.
  3. GenMAPP Users will compile the raw data from the microarray file to prepare for normalization and statistical analysis (will begin if time permits after consultation with Dr. Dahlquist). Additionally, the GenMAPP Users will be determining the number of biological or technical replicates and how samples were labeled.
  4. Coder and QA will present on genome paper in class Tuesday, Nov. 24.

Click on username links for more information regarding each team member's contributions for Week 12.

  • Jake: Set up my environment in Eclipse, created the s-flexneri branch, created my own copy of GenMAPP that I can modify for later use, and cloned the repository with Git commands.
  • Trixie: Finished the preliminary export of the XML and GOA files and the corresponding Gene Database Testing Report. Also started identifying the gene IDs for the species. Decided on a file management system with Jake.
  • Erich: Worked with Kristin to determine the total number of biological and technical replicates. Compiled the raw data for RP samples, specifically the ID and Log Ratio columns. Incorporated the RP and RX data into one spreadsheet with Kristin's data. We created a table showing which file each sample corresponds to, and also figured out that there were no dye swaps in the experiment (the control was labeled with the Cy3 dye and the treatment with the Cy5 dye).
  • Kristin: Determined that there were 3 biological replicates per treatment for 6 treatments total. Compiled raw data for RX samples by re-naming columns for ID and Log Ratio and putting into same worksheet, which was later combined with Erich's worksheet for RP samples. Erich and I met and worked together to create a table of which samples correspond to which file.

Week 14

  1. QA will be documenting the IDs using MATCH, Postgres, Microsoft Access, and Excel and get a head start on Milestone 3, which is customizing the TallyEngine.
  2. Coder will determine and document any modified export behavior that the GenMAPP Builder will have and resolve bugs. Coder will also work with QA by uploading GM Builder for additional export.
  3. GenMAPP Users will perform statistical analysis in Excel (normalization, tests) and format the results for import into GenMAPP. Users will also import data into GenMAPP and run MAPPFinder, and then document these test runs.

Click on username links for more information regarding each team member's contributions for Week 14.

  • Jake: Finished custom GenMAPP builder, committed to GitHub, and ran the export with the custom software. This created a custom .gdb which was opened in Microsoft Access and GenMAPP to check for accuracy.
  • Trixie: Trixie has finished identifying the gene IDs using MATCH, Postgres, Microsoft Access, and Excel. It was discovered that some IDs are in "dbReference/property&type&gene ID", and so another export was done on 12/7/15 to add the newly discovered gene IDs.
  • Erich: Kristin and I completed the corrections provided by Dr. Dahlquist on Kristin's talk page. We split the work into two halves and I worked on the RP data. We completed the statistics, Bonferroni p value correction, and the sanity check. I downloaded the database and formatted/exported the file for GenMAPP, and tried to create a GO tree for one of the trail points with RX.
  • Kristin: This week Erich and I made corrections from the talk page and normalized log ratios for the slides in the experiment. I completed the statistical analysis for RX samples and calculated the Bonferroni p value correction. I also performed a sanity check for the RX samples and, going off of that, I calculated the Benjamini & Hochberg p value correction for RX-1-30, which had the most statistically significant changes in gene expression. I also formatted and exported the file for GenMAPP, downloaded the database, and attempted to create color sets to run the data set through MAPPFinder.
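
The two p value corrections mentioned above can be sketched in a few lines of Python. This is only an illustration of the standard Bonferroni and Benjamini & Hochberg (step-up) adjustments, not the team's Excel worksheet; the function names and the example p values are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of tests, capping at 1.0."""
    n = len(p_values)
    return [min(p * n, 1.0) for p in p_values]


def benjamini_hochberg(p_values):
    """Return Benjamini & Hochberg adjusted p values in the original order."""
    n = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down so that each adjusted value is
    # never larger than the adjusted value of the next-larger p value.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p values for four genes (not taken from the team's data).
example = [0.01, 0.04, 0.03, 0.20]
bonf = bonferroni(example)
bh = benjamini_hochberg(example)
```

Bonferroni is the stricter of the two, which may be why the Benjamini & Hochberg correction was calculated only for the sample with the most significant changes.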

Reflection

Each team member should reflect on the team's progress:

  1. What worked?
  2. What didn't work?
  3. What will I do next to fix what didn't work?

Kristin:

  1. What worked?
    • In terms of communication, having a group text worked. We also meet at least once a week outside of class in order to work together on the assignments and make sure we are all on the same page. So far, this has allowed us to troubleshoot and address bugs together as a team quickly. It also worked for Erich and me to divide up the samples so that I did all RX and Erich did all RP. Then we could work at the same time and double-check procedures with each other while still getting the work done twice as quickly.
  2. What didn't work?
    • After creating the initial compiled raw data file, I had to make several corrections before the file could be run through GenMAPP. First of all, I had to get rid of the ".", and I also had to replace all #DIV/0! with a space character for the file to be read at all. Also, we were unable to find all of the b#### and CP#### gene IDs in UniProt or ShiBASE. Finally, after creating my color set and trying to run MAPPFinder, I tried three computers and all of them crashed with the "not responding" message.
  3. What will I do next to fix what didn't work?
    • I will communicate with the QA and Coder in order to create a database with a minimal number of "Gene ID not found's" and then communicate with Erich when we try to run our dataset through MAPPFinder. Once the gene database is re-customized and the export is complete I can try to re-run my dataset to see if that makes a difference.
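
The #DIV/0! cleanup Kristin describes was done by hand in Excel; a minimal Python sketch of the same replacement on a tab-delimited file might look like the following. The column names follow the ID and Log Ratio columns mentioned earlier, but the function name, gene IDs, and values are made up for illustration.

```python
def clean_rows(rows, bad_value="#DIV/0!", replacement=" "):
    """Replace every cell equal to bad_value with replacement."""
    return [[replacement if cell == bad_value else cell for cell in row]
            for row in rows]


# Hypothetical tab-delimited excerpt; the IDs and values are invented.
raw_text = "ID\tLog Ratio\tPvalue\nSF0001\t0.52\t0.01\nSF0002\t#DIV/0!\t#DIV/0!\n"

# Split the text into rows of cells.
rows = [line.split("\t") for line in raw_text.splitlines()]
cleaned = clean_rows(rows)

# Join the cells back into tab-delimited text.
cleaned_text = "\n".join("\t".join(row) for row in cleaned) + "\n"
```

Only cells that are exactly #DIV/0! are touched, so real values are left as they were.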

Trixie :

  1. What worked?
    • What worked in identifying the gene IDs was exporting the .gdb file into Excel and comparing it with what the OrderedLocusNames table had (from Microsoft Access). Doing this made it easier to find which genes were missing from the .gdb file and to look them up in the UniProt XML file. With the Excel file comparing the lists of gene IDs and using the CTRL+F shortcut, I was also able to discern which tags to include in the new builds for the databases. Because of this, I was able to confirm that some genes indeed do not exist in the XML file, while only a couple exist within the "dbReference" tag. In terms of group work, what worked was posting all our files on a single page as we progressed through the assignment. Night meetings were also helpful for communicating better with the rest of my group.
  2. What didn't work?
    • What didn't work was using Match multiple times without thinking. Even when I was trying to match the number of gene IDs with what TallyEngine gave me, Match didn't really help me identify where to find the genes in the XML file. Waiting for the database to finish didn't help much either, since our builds would take more than 4 hours to finish.
  3. What will I do next to fix what didn't work?
    • What I would do next to fix what didn't work is to actually use Match in conjunction with the XML file, or just use the Excel method completely, since that was more helpful in finding the necessary tags than the Match method. I would probably have to time myself to check the lab after about 4.5 hours, since one of our builds lasted that long.
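
The list comparison Trixie did in Excel amounts to a set difference. A minimal sketch, assuming the two ID lists have already been copied out of the tools she used; the function name, variable names, and IDs below are hypothetical.

```python
def missing_ids(expected_ids, exported_ids):
    """Return the IDs in expected_ids that are absent from exported_ids, sorted."""
    return sorted(set(expected_ids) - set(exported_ids))


# Hypothetical ID lists; real lists would come from the reference list of
# gene IDs and from the exported .gdb file.
reference_ids = ["SF0001", "SF0002", "SF0003", "CP0001"]
gdb_export_ids = ["SF0001", "SF0003"]

not_exported = missing_ids(reference_ids, gdb_export_ids)
```

The resulting list is the set of IDs to search for in the UniProt XML file when deciding which tags to add to the next build.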

Jake:

  1. What worked?
    • Almost every procedural action I took from Dondi worked. The only hiccup I had was in regard to Eclipse and navigating the directories.
  2. What didn't work?
    • In Eclipse, my edits to the GenMAPP builder source code were causing red error marks, but after selecting "Organize Imports" from the source menu the errors were fixed easily and the proper classes were imported. Also I had difficulty navigating to the dist file in my Temp drive, however I traced this back within Eclipse and was able to make a zip that I could hand off to Trixie for export.
  3. What will I do next week to fix what didn't work?
    • It seems to me that there wasn't a whole lot that went wrong with my procedure. What wasn't working I already fixed. Currently Trixie and I are running an export that will take 4 hours with the new additions in the property files, so there may be some new hiccups when that export is finished but we will have to wait and see.

Erich:

  1. What worked?
    • Having a GenMAPP user meeting with Dr. Dahlquist helped us focus on the goals we wanted to achieve by the time of our next meeting. A group text helped organize meeting times for both the coders and the GenMAPP users, which kept us on schedule.
  2. What didn't work?
    • The GenMAPP Gene Ontology Tree was unable to pull files for each GO selection. We need to work on this and make sure the GO files can be found. We also had to edit our compiled raw data files so that they could be read by GenMAPP.
  3. What will I do next to fix what didn't work?
    • A new .gex was created, so this might help with the problems experienced in the MappBuilder. I will also communicate with the QA and Coder to make sure we finish the GO tree smoothly so that we can assess the results of the publication we chose for Shigella flexneri.

Week 15

  1. Coder: Work with QA to fix bugs.
  2. QA: Work with coder to fix bugs in the .gdb.
  3. GenMAPP Users: Finish Milestone 3. Run tests with GenMAPP. Do a journal club outline of the paper to use in the Discussion section of group report and presentation. Create a .mapp file showing one changed pathway from the data.
  4. All team members will be working together to put together deliverables including the final report and presentation for next Tuesday.

Final PowerPoint Presentation

Click on username links for more information regarding each team member's contributions for Week 15.

  • Jake: Pulled Dondi's changes, and then created a new clean distribution. I then uploaded that distribution to our OTS Files page. Edited properties file for TallyEngine.
  • Trixie: Had to re-import to PostgreSQL due to having imported twice -- this resulted in the number of counts being twice as much as what was in the XML file. Also worked with Dr. Dionisio in order to find ~92 new IDs from the XML file that were not caught before and collaborated with Jake in order to make 2 more builds that should, ideally, produce the intended 92 genes.
  • Erich: Used Kristin's color set criterion GO files to fill out my gene expression MAPP. Made MAPPs for pathways that were significantly affected, such as metabolic processes (glycolysis, TCA cycle), flagellar assembly, and ribosome. Incorporated the data into slides for the PowerPoint and compared the data obtained with that reported in the microarray paper.
  • Kristin: I created color sets with Increased/Decreased criteria for all of the 12 treatment/time point combos. Then, based on the criterion.go files, I created tables by filtering the results comparing the most commonly induced or repressed genes for the 1 x MIC at 60 minutes and 0.5 x MIC at 10 minutes between RX and RP. Strikingly, we found that between RX and RP the effects were very similar. I then compared them with the .mapp files that Erich created and put my portion of the project (compiled sanity check, color set, comparison tables) in the PowerPoint.

Overview of Genome Paper

  • Used the genome sequencing article to perform a prospective search in the Web of Science database.
  • Overview of the search:
    • How many articles does this article cite? 37
    • How many articles cite this article? 303
    • Based on the titles and abstracts of the papers, what type of research directions have been taken now that the genome for that organism has been sequenced?
      • Now that the genome has been sequenced, a majority of the research has focused on discovering which genes are responsible for virulence and pathogenesis, as well as on potential antibiotics. Genomic research is also focused on how S. flexneri has been able to develop resistance to multiple drugs. Furthermore, Shigella is suspected to have evolved from Escherichia coli, so a lot of research has been done on how and when pathogenic Shigella split from E. coli on the evolutionary tree.

Annotated Bibliography

Genome Paper

Jin, Q., Yuan, Z., Xu, J., Wang, Y., Shen, Y., Lu, W., … Yu, J. (2002). Genome sequence of Shigella flexneri 2a: insights into pathogenicity through comparison with genomes of Escherichia coli K12 and O157. Nucleic Acids Research, 30(20), 4432–4441.

Microarray Paper

Fu H, Liu L, Zhang X, Zhu Y, Zhao L, Peng J, et al. (2012) Common Changes in Global Gene Expression Induced by RNA Polymerase Inhibitors in Shigella flexneri. PLoS ONE 7(3): e33240. doi:10.1371/journal.pone.0033240

  • The link to the abstract
  • The link to the full text of the article in PubMed Central
  • The link to the full text of the article (HTML format) from the publisher web site.
  • The link to the full PDF version of the article from the publisher web site.
  • Copyright: © 2012 Fu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
  • Does the journal own the copyright? No
  • Do the authors own the copyright? Yes
  • Do the authors own the rights under a Creative Commons license? Yes
  • Is the article available “Open Access”? Yes
  • What organization is the publisher of the article? What type of organization is it? PLoS One is the publisher/Journal. It hosts open access research articles. (Public Library of Science)
  • Is this article available in print or online only? Online only
  • Has LMU paid a subscription or other fee for your access to this article? No, LMU has not paid a subscription or other fee because the article is open access through the Public Library of Science.
  • Use the genome sequencing article you found to perform a prospective search in the ISI Web of Science/Knowledge database.
    • How many articles does this article cite? 25 cited references
    • How many articles cite this article? 0 articles cite this article
    • Based on the titles and abstracts of the papers, what type of research directions have been taken now that the genome for that organism has been sequenced?
      • Given that no papers cite this paper, nothing has yet been done to build on this specific topic. With regard to the genome, I think this paper has built on the work of the people who sequenced the first genome of Shigella flexneri, as well as on the other microarray papers.
  • State which database you used to find the data and article: ArrayExpress
  • State what you used as search terms and what type of search terms they were: "shigella flexneri" filtered by organism, experiment type: "rna assay", experiment type: "array assay"
  • Give an overview of the results of the search.
    • How many results did you get? 7 results returned with 6 viable options due to the number of assays.
    • Give an assessment of how relevant the results were: Very relevant, 6/7 results were viable.
  • Link to microarray data
  • What experiment was performed? What was the "treatment" and what was the "control" in the experiment?
    • Antibiotics (RNA Polymerase Inhibitors) were added to Shigella flexneri in order to see if bacteria became less active. The control was a group of bacteria with no drugs added to them, and the treatment was a group of bacteria with drugs added to them.
  • Were replicate experiments of the "treatment" and "control" conditions conducted? Were these biological or technical replicates? How many of each?
    • There are two drugs RX and RP with 6 samples per drug. The experiment was run 3 times which yielded 36 assays. I believe that means 3 biological replicates and 12 technical replicates within each experiment, but I am not 100 percent sure.