Class Journal Week 8

From LMU BioDB 2017

Zachary Van Ysseldyk's Responses

  1. The main issue identified with the data and analysis was that the results were not reproducible; in fact, they were very far from reproducible. The authors reused the same genes in their statistical analysis, their indexing was off by one, and some of their verification tables (namely the 59-gene ovarian cancer model) matched 0% of what the line was supposed to be. Based on his overall observations, Dr. Baggerly says that the most common mistakes are simple ones, and that this simplicity is often hidden. Specifically, the most common mistakes he found were mixing up sample labels, mixing up gene labels, mixing up group labels, and incomplete documentation. He notes that the most common mistake of all is complete confounding in the experimental design. Many of the points Dr. Baggerly raises also came up when we looked at DataONE. For one, not all of the labels are clear. Furthermore, the workflow is not easily reproducible.
  2. Baggerly first suggests that the data should be labeled so that it is clear which data is which. The biggest thing he stresses, of course, is the reproducibility of the workflow. All of Baggerly's suggestions basically point toward making the data reproducible. DataONE also strongly advocates proper data documentation as an essential practice.
  3. The main best practice that we performed was making the analysis reproducible, as outlined on the individual work page. We were able to tailor the instructions to our specific gene so that the analyses could be easily reproduced. We also made sure that all of the genes were labeled. Putting in summaries and electronic workbooks gives the user an overview of the project, which lets them go into it with a clear objective.
  4. It seemed like the press and organizations didn't take him that seriously at first. Baggerly didn't seem too upset about it during his lecture, but I would be a little angry having gone through that much work just to have been brushed off. Although I am not looking to go into a biology-related workplace, his enthusiasm about the subject was inspiring. Also, I liked his quirkiness.


Zvanysse (talk) 19:40, 23 October 2017 (PDT)

BIOL/CMSI 367-01: Biological Databases Fall 2017


QLanners Responses

  1. There were a number of issues with the data and analysis identified by Baggerly and Coombs. Some of the main issues included a universal off-by-one indexing error brought about by poor attention to the software being used, an inaccurate use of secondary source data (their data labels seemed to be flipped from the published data labels), the use of duplicate data, and very poor documentation in general. The review panel even said that they could not figure out from the published data how to reproduce the work without some sort of outside help. A number of best practices enumerated by DataONE were broken, including a failure to maintain dataset provenance, a lack of documentation of all assumptions, a lack of any form of reproducible workflow documentation, and very poor labeling techniques. Dr. Baggerly claimed that several of these were common mistakes, most prominently the off-by-one indexing error and the mixing-up of labels. Dr. Baggerly also pointed out that it is poor documentation that often lets these easy mistakes go undetected.
  2. Dr. Baggerly recommends more thorough documentation of the data, namely through labels for all published data. He also recommends a stricter requirement for data provenance and for code to be published along with the data. Overall, Dr. Baggerly stresses the need for the research to be reproducible. This corresponds very closely with what DataONE recommends, as several of the best practices (as outlined above) are essential for properly documenting data and data analysis and ensuring that someone else can perform the exact same steps on the data in the future using just the documentation.
  3. In this week's assignment we performed a number of best practices. We made sure to provide distinct labels for all of our data points, we gave our data files descriptive names, we appropriately handled missing data, and we kept documentation on how we performed our data analysis so that it could be reproduced in the future by somebody else.
  4. I was very surprised at all of the pushback that Dr. Baggerly received from the scientific journals when he shared the errors in the data. I would have thought that scientific journals would have been much more committed to ensuring that the papers that they had published were accurate and would have been more helpful to Dr. Baggerly and tough on the Duke research team. I think going forward a higher sense of accountability needs to be adopted in the scientific field to avoid scenarios like this.
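
The off-by-one indexing error mentioned in item 1 can be illustrated with a minimal sketch. The gene names, scores, and the `top_genes` helper are all hypothetical and are not taken from the Duke data or from Dr. Baggerly's talk; the point is only that a one-position shift between values and labels reports a completely different gene list while looking equally plausible.

```python
# Hypothetical gene identifiers and scores; none of these come from the Duke data.
genes = ["geneA", "geneB", "geneC", "geneD"]
scores = [0.9, 0.1, 0.8, 0.2]


def top_genes(labels, values, threshold, offset=0):
    """Return the labels whose value exceeds threshold.

    offset simulates an indexing shift between the values and their labels
    (offset=0 is the correct pairing, offset=1 is an off-by-one error).
    """
    selected = []
    for i, value in enumerate(values):
        if value > threshold:
            j = i + offset
            if 0 <= j < len(labels):
                selected.append(labels[j])
    return selected


correct = top_genes(genes, scores, 0.5)
shifted = top_genes(genes, scores, 0.5, offset=1)

print("correct:", correct)
print("shifted:", shifted)
print("overlap:", set(correct) & set(shifted))
```

In this toy case the two lists share no genes at all, which is why publishing the labels and the code alongside the data makes such a shift much easier to catch.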

Qlanners (talk) 17:28, 22 October 2017 (PDT)


Mary Balducci's Responses

  1. The main issue with the data and analysis was that the results were not reproducible. There were errors with mislabeling data, as well as indexing errors. The data did not make sense, and their methods were not clear without outside help. Simple mistakes were also easy to miss due to poor documentation and recording of data. Dr. Baggerly claims that the most common mistakes are mixing up sample labels, mixing up gene labels, mixing up group labels, and incomplete documentation.
  2. Dr. Baggerly recommends labelling table columns, providing code, and describing steps, especially steps which are planned in advance. These are very similar to DataONE's recommendations of labelling and having good documentation.
  3. Best practices that I performed for this week were labelling my data points with headers that explain exactly what the data point is. I also kept a detailed outline of every step I took to get my results.
  4. My reaction to this case after viewing the video is that I'm still shocked that this went on for so long. It seems like it was very obvious that the data was not reliable and yet it was allowed to go as far as clinical trials.

Mbalducc (talk) 20:44, 22 October 2017 (PDT)

Eddie Azinge's Responses

  1. The most prevalent issue with the data was that Baggerly and Coombs weren't able to reproduce the data themselves, given a plethora of issues from the original data and analysis. Simple errors such as off-by-one indexing errors, use of duplicate data, poor documentation, and mixed-up sample and gene labels all combined to create a dataset that was irreproducible by reasonable methods. This was compounded by the fact that the lab was not always following best practices, specifically those set forth by DataONE.
  2. Dr. Baggerly recommends more rigorous and strict adherence to proper protocol, such as following conventions for labeling data, heavy documentation of processes, and, most importantly, having a reproducible workflow. DataONE echoes most of these points, specifically emphasizing reproducibility of experiments and proper documentation.
  3. This week, we adhered to best practices by documenting our process as we followed the assignment, practicing proper labeling conventions, consistently dealing with missing data, and ensuring that our results were reproducible by other students.
  4. Learning more about this case makes me understand just how vast this field of biology is. This whole fiasco at Duke, if not properly taken care of, potentially stood to earn people a vast amount of money off of illegitimate practices and false hopes. It really emphasizes how important sticking to best practices is in order to prevent our analyses from causing harm to the society at large.

Cazinge (talk) 20:13, 23 October 2017 (PDT)

Katie Wright's Response

  1. Baggerly and Coombs were not able to reproduce the data they analyzed after poring over the data and using every available means of "Forensic Bioinformatics." And this was not because of one specific error, but because of a multitude of errors. Oftentimes, the data was not labeled correctly, or software was used incorrectly in a way that led to the mislabeling of data (the +1 problem). The best practices violated were consistency in data labeling and documentation. Mislabeling and misdocumentation are some of the most common errors in data analysis, and wouldn't be such an enormous problem if labeling and documentation were just done properly in the first place.
  2. Dr. Baggerly and DataONE both recommend creating a "reproducible workflow." Your process and reasoning should be transparent and understandable so it can be evaluated/critiqued by others. Processes should also be automated wherever possible.
  3. For this week we:
    • Documented the entire procedure in minute detail (thanks to the procedure provided by the professors in the Week 8 assignment).
    • Formatted the Excel spreadsheet with no spaces between rows or columns, and created new worksheets often to provide a step-by-step look at how the dataset was analyzed/manipulated.
  4. I think this talk just made me more angry about the whole fiasco. There were multiple "disturbing" errors (as Dr. Baggerly called them) that were pointed out to journals time after time. It took so long for the scientific community to listen to the biostatisticians and take an in-depth look at the data. I think that Dr. Baggerly makes a very good suggestion when he says that every institution should have their own biostatisticians independently review and reproduce the data analysis for every experiment before it is published.

Kwrigh35 (talk) 14:28, 23 October 2017 (PDT)

Corinne Wong's Response

  1. There was inconsistent data, and the research was not reproducible. Baggerly and Coombs frequently found errors, discrepancies, or missing information in the data that were never fully corrected. The DataONE best practices that were violated concerned consistent data and the handling of missing data. Some of the issues that were common were standard input errors: mixing up the sample labels, gene labels, and group labels.
  2. Dr. Baggerly recommends providing the data and code and using clear labels, which relates to how DataONE says to have accessible and organized data. His recommendations of clear documentation of corrections, assumptions, and errors also correspond to DataONE’s recommendations.
  3. The best practices that we performed for this week’s assignment were consistent and organized data entry, and accessible and reproducible research. We had clear labels for our datasets, and they are on accessible Excel spreadsheets with clear and detailed steps.
  4. I still can’t believe how long it took for them to finally pull their research after all of the red flags that Baggerly and Coombs found. After finding the report of so many errors, you would think the scientific community would look into them, especially when the responses from Potti and Nevins were not clear and did not provide documentation.

Cwong34 (talk) 20:35, 23 October 2017 (PDT)


Emma Tyrnauer's Responses

  1. The main issue with the data and analysis identified by Baggerly and Coombs was that the research was not reproducible. In fact, the review panel admitted that they "were unable to identify a place where the statistical methods were described in sufficient detail to independently replicate the findings of the papers." Furthermore, statistical mistakes were made and propagated through mislabeling of data (through accidental switches and offsets). The best practices enumerated by DataONE that were violated were keeping records of the experimental design to allow for reproducible research and easy identification of errors. The most common errors identified centered on accidental mislabeling of data.
  2. Dr. Baggerly recommends labeling columns and samples and providing code. These correspond to what DataONE recommends because they allow for the research to be easily reproduced by anyone.
  3. This week I made sure to correctly label all columns, and I kept an electronic notebook recording the process I used to analyze my data set.
  4. This week's assignment, as well as learning about the Duke deception, made me realize how easy it is to propagate incorrect data and how important reproducibility is. If researchers took a little more time to follow the recommendations of Dr. Baggerly, errors like this could be avoided or identified earlier.

Emmatyrnauer (talk) 20:48, 23 October 2017 (PDT)


Blair Hamilton's Responses

  1. What were the main issues with the data and analysis identified by Baggerly and Coombs? What best practices enumerated by DataONE were violated? Which of these did Dr. Baggerly claim were common issues?
    • The biggest issue with the data was that they were unable to reproduce it. Much like our Week 8 assignment, if someone is following the exact same steps and is not getting the same answers, clearly something is off. Baggerly and Coombs found errors, discrepancies, inconsistent formatting, and offset data. One of the best practices violated was proper labeling of data, and the procedure section was not adequately written. Dr. Baggerly claimed that the documentation was incomplete, that genes were mislabeled, and that there were data input errors overall.
  2. What recommendations does Dr. Baggerly recommend for reproducible research? How do these correspond to what DataONE recommends?
    • Dr. Baggerly recommended reusing templates to create consistency, more thorough labeling and summarizing of data, data accompanied by descriptions, literate programming, a report structure, and appendices. This is similar to DataONE, which says that consistency among data, procedure walkthroughs, documentation of assumptions, and compatibility are good practices for data management.
  3. What best practices did you perform for this week's assignment?
    • This week we practiced consistent location of information. For example, when following the steps we made sure that each entry lined up and was easy to read and follow. This is especially important for our yeast data because of the size of the data set (i.e. 6189). If the data isn't positioned correctly, it is harder to find, read, and sift through. We also practiced good labeling for columns. Each header explains the type of data below it as well as which data is being referenced. For example, when taking the average of t15, the average is labeled with t15 so a viewer can see that it does not refer to any other time point.
  4. Do you have any further reaction to this case after viewing Dr. Baggerly's talk?
    • I am still amazed at how much ignorance Duke showed in dealing with this case. It is dumbfounding that Baggerly and Coombs found so many discrepancies and yet it took someone finding fault in the doctor's résumé for the case to be explored further. I am also fascinated by how Baggerly and Coombs were able to describe these incidents with the data and turn them into a teachable moment for future researchers. Although what happened at Duke is truly strange, it is a great example of how data should be organized so that this never happens again.
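
The labeling practice described in item 3 (an average of t15 carries t15 in its header) can be sketched as follows. The column names such as `t15_rep1`, the sample values, and the `add_labeled_averages` helper are hypothetical and only stand in for the spreadsheet columns used in the assignment; this is an illustration of the idea, not the assignment's actual procedure.

```python
from statistics import mean

# Hypothetical row of replicate measurements for one gene.
row = {
    "id": "gene1",
    "t15_rep1": 1.0,
    "t15_rep2": 3.0,
    "t30_rep1": 2.0,
    "t30_rep2": 4.0,
}


def add_labeled_averages(record, timepoints):
    """Return a copy of record with one '<timepoint>_avg' column per timepoint.

    Each average is computed only from columns that start with that
    timepoint's prefix, so the header always says which data it summarizes.
    """
    result = dict(record)
    for tp in timepoints:
        replicates = [v for k, v in record.items() if k.startswith(tp + "_")]
        result[f"{tp}_avg"] = mean(replicates)
    return result


labeled = add_labeled_averages(row, ["t15", "t30"])
print(labeled["t15_avg"], labeled["t30_avg"])
```

Deriving the header from the same prefix that selects the replicates means the label cannot drift away from the data it describes.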

Bhamilton18 (talk) 22:23, 23 October 2017 (PDT)


Aporras1 Response

  1. What were the main issues with the data and analysis identified by Baggerly and Coombs? What best practices enumerated by DataONE were violated? Which of these did Dr. Baggerly claim were common issues? They identified that a lot of the data wasn't reproducible and that multiple aspects of the research were mislabeled and even unidentifiable. The violations were incomplete documentation, mixed-up data labels, and overall flaws in experimental design. The most common mistakes were the simplest ones.
  2. What recommendations does Dr. Baggerly recommend for reproducible research? How do these correspond to what DataONE recommends? He recommends literate programming, reusable templates, report structure, executive summaries, and appendices. DataONE, by comparison, recommends entering complete lines of data and validating data as it is being entered; ultimately, both agree on a reproducible workflow.
  3. What best practices did you perform for this week's assignment? From this assignment, making sure my work was reproducible using the electronic notebook was essential along with always ensuring the data I was entering was organized properly in the spreadsheet and easy to follow.
  4. Do you have any further reaction to this case after viewing Dr. Baggerly's talk? I didn't realize it was the same Duke study until a couple minutes in and I was amazed that they sent that many letters without being taken seriously or suspending the trials to review the study. In complete honesty, I am appalled they didn't check their work before continuing and that no one listened to Dr. Baggerly and others.

User Page: Antonio Porras


Simon Wroblewski's Reflections

  1. What were the main issues with the data and analysis identified by Baggerly and Coombs? What best practices enumerated by DataONE were violated? Which of these did Dr. Baggerly claim were common issues?
    • The main issue identified by Baggerly and Coombs was that the data was not reproducible and therefore incapable of being verified. Many DataONE practices were violated through mislabeling, indexing errors, and manipulation of data.
  2. What recommendations does Dr. Baggerly recommend for reproducible research? How do these correspond to what DataONE recommends?
    • Dr. Baggerly recommends a stricter and more regimented protocol, which includes: (1) following conventions for labeling data, (2) heavy documentation of all processes, and (3) most importantly, having a reproducible workflow. DataONE reinforces these practices, especially with regard to reproducibility of all experiments coupled with proper documentation.
  3. What best practices did you perform for this week's assignment?
    • For this assignment, I made sure to save frequently and make clear labels for all headers so that there was no ambiguity about what each column contained. In addition, I wrote each step down on a notepad with pen and paper so that I would be able to modify the assignment steps specifically for my group's assignment.
  4. Do you have any further reaction to this case after viewing Dr. Baggerly's talk?
    • I'm not sure if my reaction has changed. I both was and still am in shock about what happened, especially at the negligence that allowed it to take place. I'm not one for invading others' space, but when this type of autonomy can just be given... too much power can be shifted into the wrong hands, and no one will know until we see horrors like these, until it is too late.

Signature: Simonwro120 (talk) 23:34, 23 October 2017 (PDT)


John Lopez Responses

  1. Perhaps the biggest issue with the data and analysis was that the data could not easily be reproduced. There was no clear way to show how the results were initially obtained. Furthermore, the good, correct labels needed to reproduce the data were absent. These are two of the DataONE best practices that were violated. However, improper labeling would be the more common of the errors presented.
  2. Labeling properly is generally what Dr. Baggerly recommends for reproducible research, as well as making sure the research is in fact reproducible. The labeling ties in heavily with the DataONE recommendations, since it allows for clear interpretation of what is presented and makes the work easier to reproduce.
  3. I performed several of the best practices for this week's assignment, including using consistent column names, creating descriptive column names without spaces/special characters, using descriptive file names, and leaving missing data blank.
  4. After viewing Dr. Baggerly's talk, and realizing how easily errors could arise in my own assignment, my reaction is that data is extremely tricky to work with and that it is still very easy to manipulate it and end up with undesirable results. Although the presentation made it clear that the data should have easily been identified as incorrect, it's still difficult to determine exactly how far down the line of research things went wrong.
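
The column-naming practice listed in item 3 (descriptive names without spaces or special characters) could be automated with a small helper. This is only a sketch under assumed conventions: the `clean_column_name` function and the choice of lowercase names joined by underscores are illustrative, not a rule stated by DataONE or used in the assignment.

```python
import re


def clean_column_name(name):
    """Replace runs of non-alphanumeric characters with '_' and lowercase the result."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    return cleaned.strip("_").lower()


# Hypothetical spreadsheet headers.
headers = ["Avg LogFC (t15)", "  Gene ID ", "p-value"]
cleaned_headers = [clean_column_name(h) for h in headers]
print(cleaned_headers)
```

Applying one function to every header keeps the names consistent across worksheets, which is the other half of the practice.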


Arash Lari's Responses

  1. The biggest problem that they had, which is a big sin with regard to the scientific method, is that their results couldn't be reproduced. They also mislabeled data and didn't name things clearly. Another big instance of dishonesty and inaccuracy was the manipulation of data.
  2. Dr. Baggerly recommends proper labeling and ensuring that the experiment is reproducible. DataONE also heavily emphasizes doing these things.
  3. The whole point of this assignment was to create a reproducible step by step process and it revolved around proper labeling and regular saving and communication.
  4. I'm still surprised that the original event happened, regardless of watching the talk, because I feel as though much of this information was common sense for acquiring legitimate data. Even though Dr. Baggerly made it seem like it should be obvious to spot phony data, I think the fact that they weren't able to is proof that it's not that easy. I would say that I'm most surprised that the country's government doesn't establish a part of the government for the advancement of science that could unbiasedly and objectively review research from different labs.

ArashLari (talk) 23:38, 23 October 2017 (PDT) Arash Lari



Hayden Hinsch's Responses

  1. The main issue with the data and analysis identified by Baggerly and Coombs was that the data was not reproducible. They found many simple mistakes, mentioning that simple mistakes are the most common and that clear documentation of statistical analysis could be a way to fix this. The practices that were violated were that the data was not easily reproducible, the data had different names within the table, and the data was not consistent. Baggerly claimed that the fact that the data was not well documented was a common issue. The indexes were off by one, and when the calculations were reproduced, most of the heat maps were not even similar.
  2. Baggerly recommends that everything is extremely well documented. This will help with the reduction of simple errors. This corresponds with the DataONE recommendations but DataONE specifies that all of the data should be on one table.
  3. This week we created a description of the statistical analysis that we performed so that someone else may reproduce the results we obtained. We also maintained descriptive table names and had a singular table with multiple sheets for all of our data.
  4. I am not very surprised that so many simple mistakes were made. It seems that humans make simple mistakes all the time, so it makes sense that it would happen. I am glad I am learning about this, but am sad that the habit was not formed within me much earlier in life.

Hhinsch (talk) 23:49, 23 October 2017 (PDT)



Nicole Kalcic's Responses

  1. The biggest issue with the data was that it could not be reproduced. After this week's assignment and readings, I feel like I have a much greater understanding of the importance of our electronic journals. If someone is not able to see exactly what steps I took to get my final product, then my data cannot be verified (making it somewhat useless). Several best practices enumerated by DataONE were broken. They include the lack of workflow documentation I just mentioned and a lack of labeling techniques. Dr. Baggerly claimed that improper labeling was the most common mistake.
  2. Following that, it would be easy to guess that Dr. Baggerly recommends proper and careful labeling for reproducible research. Again, DataONE has labeling as one of its best practices, meaning that Baggerly and DataONE are both strongly emphasizing the same thing.
  3. This week, we used clear labeling on our data sets. We had consistent data entry and reproducible research.
  4. I feel like the case is more shocking for me at this point, because it has been drilled into our class that our work needs to have a journal and that it needs to be checked/verified. How could so many professionals miss that in regards to the work behind the case?

Nicolekalcic (talk) 23:51, 23 October 2017 (PDT)

Dina Bashoura's Responses

  1. What were the main issues with the data and analysis identified by Baggerly and Coombs? What best practices enumerated by DataONE were violated? Which of these did Dr. Baggerly claim were common issues?
    • The main issue was probably the fact that the data could not be reproduced. Baggerly and Coombs found errors, discrepancies, duplicate data, and poor documentation. The common issues Dr. Baggerly identified centered on the mislabeling of data.
  2. What recommendations does Dr. Baggerly recommend for reproducible research? How do these correspond to what DataONE recommends?
    • Dr. Baggerly recommended labeling data, documenting all the processes, and using templates to have reproducible data. DataONE recommends proper labeling as well to allow for clear, reproducible data.
  3. What best practices did you perform for this week's assignment?
    • This week, we performed the best practice of labeling the data. We highlighted every new section of data a different color to have clear and easily distinguishable columns. We also documented the process in our electronic notebooks as well as used a template, as we do every week.
  4. Do you have any further reaction to this case after viewing Dr. Baggerly's talk?
    • After completing this week's assignment, I can see how tricky it is to organize all the data, and I realized how easy it is to miss incorrect data. But this video clearly shows how many discrepancies were found and how the university still overlooked these findings. I can understand a few mistakes when working with large data sets like the one we worked with this week, but huge discrepancies in the data shouldn't have been overlooked.

Dbashour (talk) 23:59, 23 October 2017 (PDT) Dina Bashoura


Eddie Bachoura's Responses

  1. The biggest issue with the data analysis was that Baggerly and Coombs found the data was not reproducible. Also, the data was not clearly labeled. Reproducibility and clear labeling are both DataONE best practices, and both were violated. Baggerly agrees that the fact that the data wasn't documented well was a big issue.
  2. Dr. Baggerly recommends that, when working with data analysis, one should strictly follow protocols such as labeling data, documenting the process well, and, most importantly, having a reproducible workflow. DataONE agrees with all three of these, with more focus on the reproducibility aspect.
  3. This week, we followed the best practices of documenting our data well and ensuring that our results were reproducible by others.
  4. Same reactions as before, but it helps me understand a bit about the professional biological world and how important it is to follow these protocols.

Ebachour (talk) 00:00, 24 October 2017 (PDT)