Those of us who have been following the news over the past year and a half¹ have likely become accustomed to hearing the terms “PCR” or “PCR test” all the time. Specifically, you might have heard that a PCR test is the gold standard for detecting viral infections because it is so sensitive, but why is that? And what even is a PCR? And why are my socks wet? These are all very good questions, and luckily I am here to answer them for you (except that last one, you’re on your own there).
Firstly, I think it makes sense to explain what the letters “PCR” stand for before we go any further, just to get it out of the way. “PCR” stands for Polymerase Chain Reaction. We don’t need to unpack that name just yet, because it will become clear later. For now, it is enough to understand that PCR is simply a technique that scientists use to amplify the amount of a specific piece of genetic material, i.e. DNA. To understand how this works, we first have to know a bit more about DNA itself.
DNA, as some of you may know, is an acronym that stands for DeoxyriboNucleic Acid. More significantly, DNA is the basis of almost all life as we know it² and contains the genetic information that makes us into the organisms that we are. Most people are familiar with the following depiction of DNA as a double-helix shape that looks like a twisted ladder.
While this picture is not incorrect, it doesn’t really help us to understand how the molecule of DNA actually works. Instead, I think a more suitable way to visualize it would be the following:
There is something noticeably different (I hope) in this image – namely, the “rungs” of the ladder are marked with pairs of letters. By now, those of you who took biology in school might know what I’m getting at with this image, but for those who didn’t, fear not – I will elaborate. The “rungs” here are the so-called bases of the DNA, made up of four unique molecules: A, T, C, and G (these initials come from the chemical names of the molecules: Adenine, Thymine, Cytosine, and Guanine). Each rung of the ladder is formed by a base on one side pairing up with a base on the opposite side; G always pairs up with C, and A always pairs up with T, in a consistent pattern. It is the specific sequence of these bases – i.e. the order in which they are arranged on the ladder – that determines the genetic information carried by the DNA. When you hear someone talk about a “DNA sequence” or the “genetic code”, this is what they’re talking about.
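For the programmers in the audience, the pairing rule can be sketched in a few lines of Python. This is only an illustration: the example sequence and the function name `pair_bases` are made up, and the code ignores strand direction entirely.

```python
# Base-pairing rules from the text: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pair_bases(strand: str) -> str:
    """Return the partner base for every base, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


top = "GATTACA"           # made-up example sequence
bottom = pair_bases(top)  # the bases sitting on the opposite side of each rung

print(top)     # GATTACA
print(bottom)  # CTAATGT
```

Knowing one side of the ladder is enough to write down the other, which is the whole point of the pairing rule.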
So now we know a little bit about how DNA carries the information necessary to make life. In fact, a reasonably short strand of DNA can carry a huge amount of information – the mathematically inclined among you might recognize that for a strand of DNA with N bases, there are 4^N unique combinations (four choices at each of the N positions). For reference, the human genome is about three billion bases long, which gives an incomprehensibly large number of unique combinations (granted, most of our DNA contains general, species-level information like number of eyes, teeth, limbs, organs, etc. The amount of DNA that makes you unique from another human person is considerably smaller, but still huge).
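To get a feel for how quickly that number grows, here is a small Python sketch. With four choices at each of N positions, the count is 4 to the power N. The function name `sequence_count` is just something I made up for the example.

```python
import math


def sequence_count(n_bases: int) -> int:
    """Number of distinct sequences of a given length: 4 choices per position."""
    return 4 ** n_bases


for n in (1, 2, 10, 20):
    print(f"{n:>2} bases: {sequence_count(n):,} possible sequences")

# The count for a whole genome is far too large to print comfortably, so we
# count its digits instead: a positive integer x has floor(log10(x)) + 1 digits.
genome_length = 3_000_000_000  # "about three billion bases", as in the text
digits = math.floor(genome_length * math.log10(4)) + 1
print(f"The count for {genome_length:,} bases has about {digits:,} digits")
```

Ten bases already give over a million possibilities, and the number for a full genome is one you could not even write out: it has on the order of 1.8 billion digits.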
This knowledge about the genetic code is all well and good and dandy, but to really understand how PCR works, there are a couple more things that we need to understand about our new favorite genetic molecule. First in this category is the fact that DNA has two strands, arranged like so:
“But millibeep” I hear you lament, “we know that DNA has two strands, we saw it in that beautiful picture from a few moments ago.”
Correct, but that picture was still missing a key piece of information. In this drawing, we can see the two strands that make up our stretch of DNA, arranged into a top strand and a bottom strand. Because the bases of each strand always pair up in a specific way (G with C, A with T), the top strand and bottom strand carry the same information as reverse images of each other (one consequence of this is information redundancy – if one strand is damaged, the other strand maintains the information so it can be repaired properly).
Furthermore, you might have noticed the labels on either end of each strand, which look like 5′ and 3′, but are pronounced “five-prime” and “three-prime”. There are technical reasons why they are named this way, but for now the important thing is to know that the two ends of a given strand are different. When a DNA molecule is formed, the top and bottom strands are always oriented opposite to one another, with the 5′ end on the top corresponding to the 3′ end on the bottom. Therefore, to have the complete set of information about how the DNA is arranged, we need to know not only the sequence of the bases, but their orientation with respect to the ends as well.
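Here is a hedged Python sketch of what that opposite orientation means in practice. If we write the top strand from its 5′ end to its 3′ end, then the bottom strand, read from its own 5′ end, is the paired bases in reverse order. The sequence is made up, and `reverse_complement` is simply the usual name for this operation rather than anything specific to this article.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand_5_to_3: str) -> str:
    """Given one strand written 5'->3', return its partner, also written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3.upper()))


top = "ATGGCTCA"                  # made-up example, written 5' -> 3'
bottom = reverse_complement(top)  # partner strand, written 5' -> 3'

# Draw the two strands the way they sit in the molecule: the bottom strand is
# shown running right to left, so that each base lines up with its partner.
print(f"5'-{top}-3'")
print(f"3'-{bottom[::-1]}-5'")
```

Applying the function twice returns the original strand, which is the redundancy mentioned earlier: either strand is enough to rebuild the other.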
This end-to-end orientation is important when it comes to replicating a strand of DNA, something that our own cells do every day, and a process that is crucial to PCR. When a polymerase (the biological machine that replicates DNA) makes a copy of a strand of DNA, two things happen. First, the double-stranded structure is partially “unzipped” to expose a region of single-stranded DNA. Second, a polymerase will “see” the unfinished, single-stranded DNA and rush to the scene. There it begins building the new strand from its 5′ end towards its 3′ end, and only in this direction (which means it reads the existing strand from 3′ to 5′). Due to its nature, the polymerase’s machinery is only capable of functioning in this one direction.
Alright, we are almost ready to understand what PCR is and how it works, I promise. But first, we need to know just one more thing about DNA.
Like all molecules, DNA is held together by chemical bonds between its parts. The strength of these bonds is determined by the exact chemical structure, but the important thing to know is that the bonds between the paired bases are weak enough that they can be melted apart with the right amount of heat. This is just like an ice cube melting into water as the thermal energy breaks the bonds between its molecules. Also like the ice cube, once the thermal energy is removed, the bonds can re-form and the bases will connect back together (in genetics, this process is referred to as annealing, rather than freezing, but the concept is the same).
Alright, now that we know:
a) how DNA is structured,
b) how a polymerase copies DNA, and
c) how DNA can melt apart and re-form,
we are finally ready to see how PCR actually works!
As I said earlier, PCR is a technique used to amplify the amount of a specific piece of DNA in a sample, i.e. to increase the number of copies of that piece many, many times (the keen-minded among you might already be imagining how the polymerase comes into all this). In the case of testing for a viral infection, the PCR will be amplifying a specific part of the virus’ genetic material.
In order to amplify a specific piece of DNA, which I will now call the target sequence, it is predictably necessary to know its sequence (lucky for us, DNA sequencing technology has come a long way!). Once we know the sequence, we need to make short, single-stranded pieces of DNA called primers, each of which matches a short stretch at one end of the target sequence. These can be created synthetically and the process for this is interesting, but too complicated to include here. For each target sequence we need two primers – one for the top strand and one for the bottom strand (respectively called “forward” and “reverse” by convention), like so:
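For those who think better in code, here is a rough Python sketch of how a primer pair relates to a target. It rests on some assumptions of mine. The target sequence is made up. The helper `design_primers` is hypothetical and does nothing clever. The primers are kept to six bases so they fit on a line, whereas real primers are typically around 18 to 25 bases long and are chosen with far more care.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand_5_to_3: str) -> str:
    """Given one strand written 5'->3', return its partner, also written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3.upper()))


def design_primers(top_strand: str, length: int = 6) -> tuple[str, str]:
    """Hypothetical helper: take a primer from each end of the target.

    The forward primer copies the first bases of the top strand. The reverse
    primer copies the first bases of the bottom strand, which is the reverse
    complement of the last bases of the top strand.
    """
    forward = top_strand[:length]
    reverse = reverse_complement(top_strand[-length:])
    return forward, reverse


target_top = "ATGGCTCAGGTTACCGATTG"  # made-up target, written 5' -> 3'
forward, reverse = design_primers(target_top)
print("forward primer:", forward)  # ATGGCT
print("reverse primer:", reverse)  # CAATCG
```

Each primer is written 5′ to 3′, and each one can pair with the strand opposite the one it was copied from.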
By now, you may be able to guess why these short strands are called “primers”; they will be used to “prime” the chain reaction that gives the process its name. Let’s go through it step by step.
To start off, we have our initial sample that contains a mixture of our double-stranded target sequence as well as many, many copies of our single-stranded forward and reverse primers, and our polymerases.
Then, using a piece of lab equipment called a thermocycler, we heat the mixture just enough so that the double-stranded target sequence will melt apart. During the heating, nothing really happens to the primers since they are already single-stranded.
After melting apart, all of the bases of the target sequence are wide open and ready for anything – they’re single and ready to mingle. Taking advantage of this, we cool the mixture back down enough that some of the bonds will start to anneal, or reconnect. Because we have included many, many copies of our forward and reverse primers, chances are that some of the primers will anneal to the target sequence before the original strands can reattach to one another.
At this point, we will have some long strands of the target sequence that are partially double-stranded because of the primers, but not completely! This means that we have unfinished DNA on our hands – luckily, we have a guy for that. The polymerase will jump in to attach itself to the unfinished strand created by the primer, and begin to extend it by copying the target sequence to create a piece of double-stranded DNA.
If this process is able to occur on both strands of the original target sequence, once the polymerases are done copying, we will have two copies of the original DNA – one made from the top strand and one made from the bottom strand. Because we added many, many copies of the primers to the mix, all we have to do is repeat the process over and over again (hence the “cycler” part of the word “thermocycler”). With each repetition, the number of copies of the original target sequence doubles: one becomes two, two become four, and so on, which adds up to exponential growth (hopefully it is clear now where the name “Polymerase Chain Reaction” comes from). This means that in a relatively short time, we can increase how much target sequence DNA we have by a huge amount!
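The arithmetic of repeated doubling is easy to check with a short Python sketch. It assumes the idealized case where every copy is duplicated in every cycle. Real reactions fall somewhat short of that, so treat these numbers as an upper bound. The function name `copies_after` is made up for the example.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Copies of the target after a number of cycles, assuming perfect doubling."""
    return starting_copies * 2 ** cycles


for cycles in (0, 1, 2, 3, 10, 20, 30):
    print(f"after {cycles:>2} cycles: {copies_after(cycles):,} copies")
```

Ten cycles turn one copy into about a thousand, twenty into about a million, and thirty into about a billion.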
The exponential increase of target DNA is the strength of this technique – in a few dozen rounds of thermal cycling, the amount of our target can be increased to an easily detectable level (for the curious, this is usually on the order of a few micrograms per milliliter). This is what makes PCR so powerful for detecting, say, the sequence of a novel bat virus – all you need for a positive result (in theory) is a single copy of the genetic material³,⁴.
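Turning the question around, we can ask how many cycles it takes to get from a handful of copies to a detectable amount. The Python sketch below assumes perfect doubling, and the threshold of one billion copies is a round number I picked purely for illustration, not a real detection limit. The function name `cycles_to_reach` is likewise made up.

```python
def cycles_to_reach(target_copies: int, starting_copies: int = 1) -> int:
    """Count doubling cycles needed to reach at least target_copies."""
    cycles = 0
    copies = starting_copies
    while copies < target_copies:
        copies *= 2  # idealized: every copy is duplicated each cycle
        cycles += 1
    return cycles


threshold = 10**9  # hypothetical "detectable" amount, chosen for illustration

print(cycles_to_reach(threshold))                        # 30
print(cycles_to_reach(threshold, starting_copies=1000))  # 20
```

Even a single starting copy gets there in a few dozen cycles, and a sample that starts with more copies gets there sooner.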
And there we have it, exactly what PCR is and how it works. I hope that I was able to demystify this concept for you and that you’ll now be slightly more informed when writing letters-to-the-editor or shouting from soapboxes or drafting legislation or whatever it is you people do when not reading my articles.
As a final note, I debated with myself about whether to include an explanation here about how the final product of the PCR (the hugely amplified amount of target DNA) is detected. It is an interesting, and (I think) fairly easy to understand process. However, I ultimately decided that will be the subject of a follow-up article, when I can actually demonstrate the technique in the lab (with pictures!).
Until next time.
1 – If you haven’t been following the news, don’t worry; nothing significant whatsoever has happened and you can go back to whatever it is you were doing.
2 – Some life forms, like certain viruses, use a slightly different genetic molecule called RNA.
3 – As an aside, this is much more sensitive than a typical viral antigen test, which tests for the presence of the viral antigens (the parts of the virus that your immune system reacts to). By their nature, antigen tests have a fairly high detection threshold – meaning that a sample needs a large number of antigens to register a positive result.
4 – In case anyone is alarmed at the idea of making many copies of a virus’ genetic material, fear not. It is safe because the PCR would only target a small fraction of the overall genetic code – far less than is necessary to make a functioning virus. In fact, the greater danger comes from handling the samples of patients’ spit.