Talk:Chlamydia trachomatis

Week 13 Feedback

The IDs of the form pCTA_#### probably come from a plasmid and should be kept. I'm glad that changing the regular expression in match to capture these IDs gave a count that matched the one in the OrderedLocusNames table in Access.
You will need to talk with Dr. Dionisio to find out why the SQL query and the TallyEngine gave different numbers and what the fix will be.
Your assumption about the "suffix" added to the IDs in the microarray data is correct. Excel has a function "Text-to-columns" that will allow you to separate the suffix off so that you just have the IDs that you want. I can show you this in class on Tuesday. — Kdahlquist (talk) 16:13, 25 November 2013 (PST)
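To make the ID pattern concrete, here is a minimal Python sketch of a regular expression that accepts both the chromosomal CTA_#### form and the plasmid pCTA_#### form and counts the distinct IDs. It is only an illustration: the four-digit width, the function name `count_unique_ids`, and the sample text are assumptions, and this is not the expression actually passed to match.

```python
import re

# Assumed pattern: optional "p" prefix (plasmid genes), then "CTA_" and four digits.
ID_PATTERN = re.compile(r"\bp?CTA_\d{4}\b")


def count_unique_ids(text):
    """Return the number of distinct CTA_####/pCTA_#### IDs found in text."""
    return len(set(ID_PATTERN.findall(text)))


# Hypothetical text standing in for the real source file.
sample = "CTA_0001 CTA_0002 pCTA_0001 CTA_0002 pCTA_0008"
print(count_unique_ids(sample))
```

Dropping the `p?` from the pattern would skip the plasmid IDs entirely, which is one way the counts could disagree with the Access table.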

Affymetrix RML Custom Pathogenic chip 3 CDF file needed to analyze data with dChip from NCBI GEO
dChip web site
2013-11-26: Worked with Dillon in office hours to demonstrate statistics that are needed for analyzing the data. Demo file uploaded as Master_Spreadsheet_Chlamydia_20131125_KD.xls. — Kdahlquist (talk) 12:45, 26 November 2013 (PST)
- The groups to be compared are (from sdrf):
  - EB in axenic media (2 replicates) to RB in axenic media (4 replicates)
  - EB + rifampicin (3 replicates) to RB + rifampicin (3 replicates)
  - There are also 4 replicates labeled "carry over of EB" that we don't know what to do with, so we are ignoring them.
- The calculations to carry out are as follows:
  - Compute the average of each group (EB, RB, EB+rif, RB+rif)
  - Compute the ratios: EB/RB and EB+rif/RB+rif
  - Take the Log2 of the ratios
  - Compute p values comparing the EB and RB groups and the EB+rif and RB+rif groups using the TTEST function in Excel. For example,

=TTEST(range of EB values, range of RB values, 2, 3)

Filter the results for probe set IDs that contain "CTA_" and paste these to a new sheet for GenMAPP. Working with just these values will make figuring out any errors easier and will restrict the analysis to just the Chlamydia genes.
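For anyone who wants to cross-check the spreadsheet, the calculations above can be sketched in Python. In the TTEST call, the argument 2 requests a two-tailed test and 3 requests the two-sample, unequal-variance test (Welch's t-test). The sketch below computes the Log2 ratio plus the Welch t statistic and degrees of freedom for one probe set; it stops short of the p value because the standard library has no t distribution. The function names and the intensity values are made up for illustration.

```python
import math
from statistics import mean, variance


def log2_ratio(group_a, group_b):
    """Log2 of (average of group_a / average of group_b)."""
    return math.log2(mean(group_a) / mean(group_b))


def welch_t(group_a, group_b):
    """Welch (unequal variance) t statistic and degrees of freedom."""
    va = variance(group_a) / len(group_a)  # squared standard error of mean A
    vb = variance(group_b) / len(group_b)  # squared standard error of mean B
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return t, df


# Made-up intensities for one probe set: 2 EB replicates, 4 RB replicates.
eb = [120.0, 136.0]
rb = [60.0, 68.0, 64.0, 64.0]

print(log2_ratio(eb, rb))
print(welch_t(eb, rb))
```

If SciPy is available, `scipy.stats.ttest_ind(eb, rb, equal_var=False)` runs the same unequal-variance test and returns a two-sided p value, which should be comparable to the Excel result.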
- We need to talk to Dr. Dionisio about how to extract the CTA_#### IDs from the full probe set IDs. Text to columns did not work because:
  - You can only use one character as the delimiter. If you use "R", then you get IDs of the form "CTA_####_" with a trailing underscore. If you use an underscore, then the CTA gets separated from the #### and Excel removes the leading zeros from the number. Katrina and Hilda may already have found a solution for this.
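One possible way around the delimiter problem is to pull the ID out with a regular expression and keep it as text, so the leading zeros survive. This is a sketch only: the probe set ID format shown ("CTA_0042_Rsuffix") is a hypothetical stand-in based on the description above, and `extract_id` is an illustrative name, not an existing tool.

```python
import re

# Assumed shape: the gene ID (optionally with a plasmid "p" prefix) followed by a suffix.
GENE_ID = re.compile(r"p?CTA_\d{4}")


def extract_id(probe_set_id):
    """Return the CTA_#### (or pCTA_####) part as a string, or None if absent."""
    found = GENE_ID.search(probe_set_id)
    return found.group(0) if found else None


# Hypothetical probe set IDs; the real suffixes may look different.
probe_set_ids = ["CTA_0042_Rsuffix", "pCTA_0003_Rsuffix", "AFFX-control"]
for probe in probe_set_ids:
    print(probe, "->", extract_id(probe))
```

Within Excel itself, if every ID sits at the start of the cell and has the same length, a formula such as =LEFT(A2, 8) would also return "CTA_####" as text, although the pCTA_#### IDs are one character longer and would need separate handling.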

Thank you for submitting your team page on time.
We are expecting roughly equal contributions from all team members; Dillon only made a few saves compared to Katrina and Hilda.
As for the construction of your team page, you have only provided links to the User pages for your team. There are some additional useful links that you could provide on your team's page; what should they be?
- We encourage you to create a template for your team's useful links.
The full bibliographic reference needs to be provided for microarray reference 1.
You have only provided links to the HTML versions of the papers; please provide direct links to the PDF downloads for these articles.
Microarray reference one is approved for your project (you can comment out the references to the other microarray papers so that they don't appear on the page). You will need to talk to me in class to discuss how to proceed with the data.
- It appears that there are indeed four biological replicates, not just technical replicates of the samples.
- Because of some issues with the gene IDs provided on the microarray chip, you may opt to use the following paper for your journal club instead of the one you found in your literature search:
  - Carlson, J.H. et al. (2005) Comparative Genomic Analysis of Chlamydia trachomatis Oculotropic and Genitotropic Strains. Infect. Immun. 73: 6407-6418.

— Kdahlquist (talk) 14:41, 4 November 2013 (PST)

As I mentioned to you in class today, the strain of Chlamydia from which the IDs on the microarray are taken (Omsland et al. paper) is different from the reference strain for which you found your original genome paper.
- The reference strain is: Chlamydia trachomatis (strain D/UW-3/Cx) with a taxonomic ID of 272561.
- The strain used for the gene IDs on the chip is: Chlamydia trachomatis serovar A (strain HAR-13 / ATCC VR-571B) with a taxonomic ID of 305277.
You may choose to do your journal club on either strain, but for your project to work, you need to match the UniProt and GOA downloads with the microarray chip IDs.

— Kdahlquist (talk) 16:27, 5 November 2013 (PST)

Please use the following format in naming all of the files that you work with: <Descriptive term(s)>_v<version number>_<Initials>_<yyyymmdd>

Ksherbina (talk) 09:44, 7 November 2013 (PST)
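As an illustration of the naming format above, a short Python sketch that assembles a name from its four parts. The function name `build_name` and the example values are assumptions; the format itself does not say anything about the file extension, so none is added here.

```python
from datetime import date


def build_name(description, version, initials, when):
    """Assemble <Descriptive term(s)>_v<version number>_<Initials>_<yyyymmdd>."""
    return f"{description}_v{version}_{initials}_{when:%Y%m%d}"


# Example values chosen for illustration only.
print(build_name("Master_Spreadsheet_Chlamydia", 1, "KD", date(2013, 11, 25)))
# prints "Master_Spreadsheet_Chlamydia_v1_KD_20131125"
```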