Why being able to read the whole paper is important

chloroplasts by albertstraub

Sequencing hundreds of chloroplast genomes now possible
[Via Eureka! Science News – Popular science news]

Researchers at the University of Florida and Oberlin College have developed a sequencing method that will allow potentially hundreds of plant chloroplast genomes to be sequenced at once, facilitating studies of molecular biology and evolution in plants.


This paper made me smile.

New approaches to gathering information are always interesting. The idea of sequencing multiple chloroplast genomes at once is kind of cool. So I tracked down the paper and, thankfully, it is open access.

Because it has some interesting stuff buried in it. Stuff that brought a smile to my lips.

Chloroplasts are the parts of plants that really make them plants – it is here that photosynthesis takes place. Just like mitochondria in animal cells, chloroplasts appear to have arisen by the incorporation of bacteria into a symbiotic relationship with the larger eukaryotic cell.

As such, they still retain their own DNA, separate from the cell’s genome in the nucleus. But it is not easy getting at that DNA to sequence it. It is usually difficult to isolate the smaller chloroplast DNA from the large amount of nuclear DNA. Contamination with nuclear DNA makes the signal-to-noise ratio quite bad for many next-generation sequencing techniques.

So this paper describes some techniques they used to enrich for the chloroplast genomes. Essentially, they looked at the chloroplast sequences that had already been generated and designed a huge series of sequences that were then synthesized. These sequences, about 100 nucleotides in length, were from regions that were conserved amongst the chloroplast genomes but not found in the nuclear DNA. They could then use these sequences as baited hooks, separating out the chloroplast DNA that was similar to them, while leaving the nuclear DNA behind.
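To make the baited-hook idea a bit more concrete, here is a toy Python sketch. It is not the authors' pipeline, just an illustration of the selection rule described above: keep short windows that appear in every chloroplast sequence and nowhere in the nuclear sequence. The function name, the made-up sequences, and the window length of 8 (the real baits were about 100 nucleotides) are all my own assumptions.

```python
def candidate_baits(chloroplast_seqs, nuclear_seq, k):
    """Return sorted k-mers found in every chloroplast sequence but not in nuclear_seq.

    Hypothetical helper for illustration only; exact matching on toy data.
    """
    def kmers(seq):
        # All overlapping windows of length k in seq.
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    # Start with the windows of the first genome, keep only those shared by all.
    shared = kmers(chloroplast_seqs[0])
    for seq in chloroplast_seqs[1:]:
        shared &= kmers(seq)

    # Drop anything that also occurs in the nuclear sequence.
    return sorted(shared - kmers(nuclear_seq))


# Made-up toy sequences: three "chloroplast genomes" sharing a conserved core,
# and a "nuclear" sequence that happens to contain part of that core.
chloroplasts = [
    "TTACGGATCCGATAGG",
    "CCACGGATCCGATTTT",
    "GGACGGATCCGATCAC",
]
nuclear = "AAAAGGATCCGAAAAA"

baits = candidate_baits(chloroplasts, nuclear, k=8)
print(baits)
```

Real bait design has to tolerate mismatches between species and deal with far larger genomes, so exact set intersection like this is only a way to picture the idea.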

This approach was so successful that they were able to take a mixture of DNA from 24 different plants, enrich for the chloroplast genomes, and then sequence all of them in a single lane of a sequencer. Using modern software tools, they were able to almost completely reassemble all 24 chloroplast genomes.

Almost 60% of all the reads from the sequencer were from chloroplasts, not from the nucleus, showing that the enrichment was quite successful. They did some back-of-the-envelope calculations that indicate their procedure could work with a mixture of about 1300 different plants! All at once.
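For anyone curious what a back-of-the-envelope estimate of this kind looks like, here is a rough Python sketch. Only the 60% on-target fraction comes from the post above. The reads per lane, read length, genome size, and target coverage are placeholder values I picked, not the paper's figures, so the output should not be expected to match their 1300.

```python
def max_samples_per_lane(reads_per_lane, read_length, on_target_fraction,
                         genome_size, target_coverage):
    """Estimate how many chloroplast genomes fit in one lane at a given coverage."""
    # Bases in the lane that actually come from chloroplast DNA.
    on_target_bases = reads_per_lane * read_length * on_target_fraction
    # Bases needed to cover one genome to the desired depth.
    bases_per_sample = genome_size * target_coverage
    return int(on_target_bases // bases_per_sample)


# All inputs except on_target_fraction are assumed placeholder values.
estimate = max_samples_per_lane(
    reads_per_lane=30_000_000,   # assumed lane output
    read_length=100,             # assumed read length in bases
    on_target_fraction=0.6,      # "almost 60%" from the post
    genome_size=155_000,         # assumed chloroplast genome size in bases
    target_coverage=50,          # assumed depth needed for assembly
)
print(estimate)
```

The point is only that the number of plants scales directly with the on-target fraction and inversely with the coverage you insist on, which is why good enrichment matters so much here.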

This is a pretty nice approach to mass screening for diversity purposes. But that is not what drew my attention, nor is it why I find being able to read the entire report so important.

They did several unusual things in their protocol, things they did not really mention in the abstract, which is all one usually sees. Here is how they write it up in the Methods section (my bold):

We stress that the methods described here deviate substantially from the manufacturer’s protocols (see http://www.genomics.agilent.com/GenericB.aspx?PageType=Custom&SubPageType=Custom&PageID=3120). Additionally, the kit we used has been updated as Agilent has continued to refine its enrichment products (see http://www.genomics.agilent.com/CollectionSubpage.aspx?PageType=Product&SubPageType=ProductDetail&PageID=3033). We provide the information not only as a record of our methods, but also to illustrate the robustness of the kit and to encourage further experimentation among other users.

Specifically, three significant deviations were made from the manufacturer’s recommendations. First, for many reasons beyond our control, the kit was nine months past the manufacturer’s expiration date when it was used—clearly we would not recommend using an expired kit, but our success should reassure others who may find themselves with similarly outdated kits. Second, the kit contains blockers for the adapters that prevent nonspecific capture via adapter-adapter annealing. We used an older kit with blockers for single-end adapters, while our libraries had barcoded paired-end adapters—thus, we did not have the correct blockers in the mix. Lastly, all 24 barcoded libraries were pooled for a single capture, although the SureSelect protocol recommends selecting individual barcoded libraries followed by pooling of samples. Agilent now offers preselection pooling of barcoded libraries, although this is currently limited to 10 libraries, and the cost, while somewhat lower than 10 individual samples, is still significantly higher than one sample. Hence, performing a single selection on pooled barcoded samples is a significant and previously unsupported deviation from the manufacturer’s protocol. However, again we think that our results indicate that this method will work in many situations, and this approach is the only cost-effective option for enrichment of a small region such as the plastid genome.

In the old days labs used their own protocols, not purchased kits. But kits have become so standardized and cheap that we now see people taking a kit and modifying it for their purposes. But the three modifications here are actually not ones a lab would normally make. And it is my reading between the lines of these deviations that made me smile.

First, they used a kit that was well beyond the expiration date. I would imagine that it was simply used without checking the date until it was too late and the run already done. But, for whatever reason, we now know that the procedure will work even with old reagents. Always a nice thing to know.

Second, they used a kit that actually did not have the best ingredients for what they wanted to accomplish. And finally, instead of doing the enrichment on each of the 24 DNA samples before pooling them, they pooled the samples and then did the enrichment. This is a really nice thing to know can be successful.

So, they did 3 things that were suboptimal and still were very successful in enriching for the chloroplast genome and assembling their sequences. Old kit, wrong ingredients and premixing of the samples. 

Just about any scientist would have said “Start over” with even one of these deviations. I imagine that the researchers grabbed a kit they had in the lab and just used it. Perhaps only after they had succeeded, perhaps even when they were finally writing everything up, did they discover these deviations. It made me smile.

I’ve been there, doing something ‘wrong’ and having it destroy the experiment – using the wrong buffer, doing the incubation at the wrong temperature, etc. I can imagine the feeling they had when they discovered they had used a kit that was almost a year past the expiration date. And they still succeeded. It made me smile.

The fact that they not only went ahead but persevered really demonstrates the robustness of their approach. It made me smile.

And that is why I love being able to read the entire paper. Often really interesting aspects of the paper never make it into the abstract.


2 thoughts on “Why being able to read the whole paper is important”

  1. Inquiring minds are delighted that they proved their point, but want to know if the results were the same with all three things done properly! After all, without the same, improper kit, how will anyone duplicate the experiment?

    1. I imagine that will be part of their next paper and/or another lab’s work. I would expect that they could get much better enrichment by changing some things.

      I would not imagine that using the kit past the expiration date would have much real effect. Most of the reagents are pretty stable. I would not be surprised to see that the kit maker was substantially underestimating the dates in order to sell more kits.
