5 Techniques For Dramatic Improvements In Reproducibility
It’s not easy to improve reproducibility in your experiments.
Image manipulation has become a major problem in science, whether intentional or accidental. This has exploded with the advent of digital imaging and software like Photoshop. There are even mobile applications like Instagram filters that can be used for imaging trickery. It should go without saying that image reuse/manipulation represents profound dishonesty in science – a field intended to uphold the most stringent possible standards of truthful inquiry! But what about studies with a sloppy or stunted capacity for reproduction?
These, too, plague science and hinder our ability to seamlessly move forward – irreproducible research demands surplus time and energy from the scientific community and must, therefore, be addressed.
Have you ever tried to reproduce results from another lab only to discover it didn’t work as you expected? What about results from your own lab? Maybe there isn’t a standard operating procedure in place, or maybe this procedure has changed over time (yet no one has bothered to document these changes to ensure that daily methodology is not outdated). For example, another group once published findings that did not agree with those of my own organization.
I wanted to reproduce one of their key experiments and find out why we were getting such different data. I read their materials and methods, and I set out to repeat their experiment. This experiment involved feeding beads to CHO cells (Chinese hamster ovary cells). CHO cells are not immune cells, so they don’t normally take up foreign objects.
Ultimately, I couldn’t get the beads in the cell. I tried everything from spinning them down to coating the beads. I wasted weeks and probably thousands of dollars trying to repeat this experiment. Fortuitously, I ran into the authors at a meeting and asked them how they managed to induce uptake in the CHO cells.
They had transfected the cells with Fc receptors to help bring in the beads. This, of course, had been left out of the materials and methods section of their published research! Journals don’t always give us enough room in the materials and methods section to describe our experiments fully, so this sort of thing can happen from time to time. Alternatively, the materials and methods section might be messy or incomplete due to inadequately archived notes.
Reproducibility has been defined as “the ability to get the same results using the same experimental system.” This is the minimal standard for which we ought to aim. But ideally, we want replicability, which is the ability to obtain the same results using different experimental systems. Repro for Everyone has some guidelines to ensure we meet these minimum standards for reproducibility, but I’m also going to cover 5 aspects of lab operating procedures that can be streamlined and controlled to improve reproducibility.
These 5 practices will help to improve reproducibility and save you time and money. If that isn’t enough, high-quality, reproducible experiments might even get you a few more citations because your work is easier to follow.
1. File Directory Structure
Before you even start a project, the first thing to do is set up a good file directory structure, which should start with the project name. Then you would have subfolders for the methods that you used, all your raw data, the analysis that you did on that data, any manuscripts, and any scripts you developed to analyze the data. Experiments may take months, or even years, to complete. You need to know where the data and analysis are located in order to avoid repeating an experiment unnecessarily.
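As a minimal sketch of the layout described above, the following Python snippet creates a project folder with one subfolder per category. The subfolder names and the `create_project` helper are assumptions made for illustration, not a prescribed standard, so rename them to match your lab's conventions.

```python
from pathlib import Path

# Hypothetical subfolder names, one per category listed in the text.
SUBFOLDERS = ("methods", "raw_data", "analysis", "manuscript", "scripts")


def create_project(root, project_name):
    """Create the project folder and its subfolders, then return its path."""
    project = Path(root) / project_name
    for name in SUBFOLDERS:
        # exist_ok=True makes the function safe to run more than once.
        (project / name).mkdir(parents=True, exist_ok=True)
    return project
```

Calling `create_project("D:/lab_data", "bead_uptake")` (both arguments are placeholders) would give every new project the same skeleton, so everyone in the lab knows where to look.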
Raw data also needs to be backed up, preferably in 3 different places. I use a RAID 5 server, and the drive is copied to a cloud backup once per week. Users are also expected to download their data to their personal computers or hard drives. In this way, we can ensure that the raw data is protected.
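A backup only protects you if the copy matches the original. One way to check this, sketched here under the assumption that both the original and the backup are reachable as ordinary folders, is to compare SHA-256 checksums file by file. The function names are hypothetical, and this is not a feature of any particular backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(original_dir, backup_dir):
    """List files that are missing from the backup or differ from the original."""
    original_dir, backup_dir = Path(original_dir), Path(backup_dir)
    mismatches = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        copy = backup_dir / relative
        if not copy.is_file() or sha256_of(source) != sha256_of(copy):
            mismatches.append(relative.as_posix())
    return mismatches
```

An empty list from `find_mismatches` means every file in the original folder has an identical copy in the backup folder.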
Another area to consider is the file names themselves, which are essential for two reasons. First, they describe the exact experiment without requiring you to go back to your lab notebook to determine exactly what you did that particular day. Second, if you’re doing batch analysis, those tags can be valuable data for grouping your experiments and getting a meaningful output in software such as CellProfiler.
You can write what each different section of the name means, and the software can organize them from there. Although it may take some time in the beginning to set up these protocols, in the end, it will actually save you a lot of time for analysis, figure creation, etc. Simple-to-understand file structures allow someone else to recreate your analysis – i.e., they improve reproducibility.
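To illustrate how tags in a file name become usable metadata, here is a small Python sketch. It assumes a hypothetical naming convention of `date_cellline_treatment_replicate`, separated by underscores; the field names and helper functions are invented for this example and are not CellProfiler's own interface.

```python
from pathlib import Path

# Hypothetical convention: date_cellline_treatment_replicate.ext
FIELDS = ("date", "cell_line", "treatment", "replicate")


def parse_filename(filename):
    """Split a file name into a dict of metadata tags."""
    parts = Path(filename).stem.split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(f"{filename!r} does not match the naming convention")
    return dict(zip(FIELDS, parts))


def group_by(filenames, field):
    """Group file names by the value of one metadata tag."""
    groups = {}
    for name in filenames:
        groups.setdefault(parse_filename(name)[field], []).append(name)
    return groups
```

With this convention, `parse_filename("2024-03-15_CHO_FcR-beads_rep2.tif")` yields the date, cell line, treatment, and replicate as separate fields, and `group_by(files, "treatment")` collects the files for each condition. Rejecting names that do not fit the pattern catches mislabeled files before they reach the analysis.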
2. Electronic Notebooks
Another step to improve reproducibility is to migrate your work to an electronic lab notebook, or “ELN.” There are many different electronic lab notebooks. Harvard has a great collection that lays out the pros and cons of each, in their estimation. My lab uses SciNote.
Why does this work for us? We have a repository of SOPs for the lab that students can pull down for an experiment, checking off the steps as they go. There is also a section for notes on any deviations from the standard operating procedure. I can then see what students have been working on and search within the notebook to find specific experiments.
The SOPs are all digital and in a single place so that I can pass them to other labs with greater ease, giving access to collaborators across the country. Easy sharing of SOPs can accelerate science, and it will improve reproducibility. As a plus, it might even earn your work a few additional citations by other labs using your clear techniques.
3. Digital Analysis and Visualization
I received no formal training on how to construct proper figures, and the same may be true of many readers. We think figures are obvious – just show a few representative images and some type of graph. How hard could it be? Weissgerber et al. suggest that there are 3 main features for effective figures:
- It must be easy to understand the study design using the figure. Using the figure and its legend, can readers understand what you did without reading the main text?
- The figure must illustrate important findings – only the important ones. Additional supportive data can be moved to the supplementary figures.
- The figure needs to allow readers to critically evaluate the data.
When I’m reading a new paper, I scan the abstract to determine whether it’s of interest to me. Then I go through the figures and the figure legends. After that, I try to determine what conclusions I’ve come to, and I check to see if I came to the same conclusions that the authors did. This is how you can quickly evaluate whether a study is rigorously conducted as well as whether you’ll be able to easily understand it. Therefore, carefully constructed figures are another way to ensure reproducible science.
4. Methods & Materials in Published Work
As mentioned previously, even if you do take careful notes and keep everything in an ELN, you may have to abbreviate your methods for publications. Here are my tips on how to craft a good “materials and methods” section in a paper.
- If you cite a method from a previous paper, make sure it is the original paper that fully describes the method. Creating a long chain of papers, each one citing another, is frustrating and not useful to anyone. Small changes might be different in each of these citations, meaning there is no way for the reader to piece together the real method.
- Utilize STAR Methods (Structured, Transparent, Accessible Reporting). In short, you need to report your experimental model; how you performed your quantification and statistical analysis; detailed methods; availability of data and/or code; and key resources used. How many times have you seen someone note that they used a β-tubulin antibody, but not the clone that they used?
- Deposit your protocols into protocols.io. There is no way to get every detail of your protocol in your paper, no matter how generous a journal may be with space. Therefore, utilize public repositories to archive a detailed record of your experimental protocol.
5. Data Repository & Releasing Code
Science is only reproducible if someone can take your raw data and your analysis methods and have the same output. Therefore, one of the ways to improve reproducibility is to openly share both data and code. In the past, putting “data will be shared upon request” was sufficient. Often, you would hear about researchers not sharing their data because they weren’t “done” with it.
Journals are now raising that bar. Nature, for example, states, “authors are required to make materials, data, code, and associated protocols promptly available to readers without undue qualifications.” Here are some tips on how to make your data and code (if you produced custom analysis for the data) publicly available.
- Find a place to host your raw data. One such place is the Image Data Repository, where you can easily submit a 100GB data set with included metadata. Other popular options are Dropbox or Google Drive with links to the dataset on your lab’s webpage. This helps with reproducibility and allows other researchers to mine data from experiments that have already been performed.
- Post your code in a repository like GitHub. GitHub is one of the most popular platforms for labs to post image analysis code, and it’s a great way to maintain version control. How many times have you tried to “improve” something only to make it worse?
Millions of dollars are wasted in research every year on irreproducible data. Moshe Pritsker, CEO of JoVE, said, “The reproducibility of published experiments is the foundation of science. No reproducibility – no science.” Therefore, if you are not striving to add quality data to a body of knowledge, you are wasting time in the lab. If no one can redo your experiment and get the same results, it will not add to current hypotheses.
Furthermore, sloppy data collection and analysis can cause problems with papers, patents, and even promotions. Good science pushes the field forward, so always aim for your best work. It’s never a bad idea to improve reproducibility further.