What’s in a number? The number 7 is the date you were married; it’s the exact number of teaspoons of sugar in a halwa recipe your mom sent you; it’s the age of your eldest son on his next birthday. As we get older, we become acquainted with the texture of these numbers; we begin to treat them as harbingers, or maybe ports in a storm. We associate them with feeling. I don’t like the number 21. Why? Because my favourite batsman gets out most often facing 21 deliveries. 3 is my lucky number. Why? I don’t know, it just is.
Larger numbers are never quite as potent. What’s in the number 23,723? Or 59,338? Nothing, as far as you know. Most of them lose character beyond the hundreds place. Give me a man who murmurs “That train has 138 bogies,” and I’ll give you ten who’d say: “100 bogies is a lot for a train.” (It’s a goods train.)
By the end of this post, you’ll come to know a large number. Not just know it, but know it twice, the first time as too large, the second time as too small. That number is 143,000.
The nature of mutations
You’ve got DNA; every cell in your body harbours two copies of 3 billion bases—there’s a characterless large number again. However, these 3 billion bases (As, Ts, Gs, and Cs), once translated into the right language, make you YOU. In a very real sense, DNA bases are the building blocks of the biological world; they’re what make life happen. DNA is transcribed into RNA and thence to proteins, the stuff that makes things happen in the human body. RNA and proteins are all-important, but they’re also just puppets. It’s DNA that pulls the strings.
You can’t just meddle with DNA. It’s a sunny day, and you’re out watching trains, or perhaps your favourite batsman endure past the twenty-first delivery. You flake some skin, some other skin turns brownish-pink, and within the stern motor of your cutaneous membrane a cell acquires a mutation. What is a mutation? A DNA base has just been altered from a T to a C.
Oh no! Oh wow! Is this how I get cancer? Is this how I get skin cancer? Almost certainly not. Most mutations only knock—and knock weakly—on the door to chimeras. Part of the reason is that you always have two copies of DNA (a phenomenon known as diploidy in biology, meaning that we have not just one set of chromosomes but two), so if one goes awry there’s always another one to bank on (though see next paragraph for a counterpoint).
Besides, DNA is—what’s the geeky word?—redundant. You change a base, and about a quarter of the time it doesn’t matter, because the new base codes for the same protein. How is this possible? In one word: evolution. In 21 words: there are more combinations of codons (64) than amino acids (20), which means that many codons can stand for the same amino acid. Codons are triplets of DNA bases that code for specific amino acids (like leucine or proline), which in turn are lined up to be twisted into proteins. So if a mutation in a particular codon still leads to the same amino acid, we’re home dry. Or even if it doesn’t, the second copy of our DNA might still come to the rescue.
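If you’d rather count than take “about a quarter” on faith, here is a minimal sketch. It assumes the standard genetic code (written in the usual compact T, C, A, G ordering), treats every single-base swap as equally likely, and skips the three stop codons; real mutations are not uniformly distributed, so read the result as a ballpark. The names `CODON_TABLE` and `count_silent_swaps` are mine, purely for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, ordered TTT, TTC, TTA, TTG, TCT, ...
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_silent_swaps(table):
    """Count single-base swaps in amino-acid codons that keep the same amino acid."""
    silent = 0
    total = 0
    for codon, aa in table.items():
        if aa == "*":  # skip stop codons; they do not code for an amino acid
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a mutation
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                if table[mutant] == aa:
                    silent += 1
    return silent, total


silent, total = count_silent_swaps(CODON_TABLE)
amino_acids = set(CODON_TABLE.values()) - {"*"}
print(f"{len(CODON_TABLE)} codons, {len(amino_acids)} amino acids")
print(f"{silent} of {total} single-base swaps are silent ({silent / total:.1%})")
```

Under those assumptions, roughly one swap in four leaves the amino acid untouched, which is where the “quarter of the time” comes from.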
So DNA is tamper-resistant. But, as new parents might grimly testify, resistant and proof are not the same thing. Diploidy doesn’t protect against autosomal dominant mutations: mutations whose presence in a single DNA copy is sufficient to set off a medical condition.
But what if a mutation in a codon means it no longer codes for, say, the amino acid leucine but for proline instead? Does that make a difference? Well, it depends. There’s a delicate chain of events that begins, but only begins, with the DNA. DNA pulls the strings; but what about the puppets? Simply put, changing DNA changes RNA, in turn changing proteins. Proteins “regulate” health, ergo changing DNA changes health. However, the real relationship between DNA, RNA and proteins is very, very complex. Volumes of wisdom have been written about it. Much is known. Still more is less well known.
But this much we do know: Just as redundancy is built into the fabric of DNA, it is also built into the relationship between DNA and “downstream” processes; between puppet and master. Sever a single skein of string, and usually, usually, nothing happens. The puppets just keep on skipping. How? Well, for one thing, the change in amino acid, itself caused by the change in DNA, may not actually, physically change the protein, so it is still able to perform its original function.
The average human has six million mutations. Some of these are passed on through the “germline” (like the well-known BRCA1 and BRCA2 mutations that can cause breast cancer); others, known as “somatic” mutations, are acquired, like the sunlight-induced mutation in the earlier example. The vast majority of these six million mutations are harmless, with only a very small number increasing someone’s chances of developing cancer. It takes a lot to change DNA thanks to human diploidy and DNA redundancy, but there is also a lot of DNA vulnerable to change. So the two extremes cancel each other out: a mutation on a single base can mean the difference between a day out on the green, and cancer.
Breast cancer: or, why 143,000 is both too much and not enough
Breast cancer thrives on mutations like these, single base swaps that wreak a good deal of havoc on, usually, a woman. Many mutations are merely unsettling: they are called Variants of Unknown Significance, or VUS, because neither research nor clinical observation in other patients has yet shown them to cause cancer. Other mutations are more definitive. When a mutation in BRCA1, for example, is found to cut off a protein mid-sequence, it sharply raises its owner’s lifetime risk of developing cancer.
So why is 143,000 both too much and not enough? 143,000 is the number of breast cancer cases recorded in India in the year 2012. Given how much we now know about this disease, the number feels large. Some 200,000 research papers have been written on the BRCA genes; ClinVar (a public database that records genomic variations and their impact on human health) identifies 2,345 mutations in BRCA1 and 2,653 mutations in BRCA2 as pathogenic (that is, disease-causing). Despite all this work, we’re no closer to preventing breast cancer. Earlier this year, Stanford professor and Fields Medalist Maryam Mirzakhani, with access presumably to some of the best healthcare in the world, lost her four-year battle with breast cancer. All of which is to say: there’s a long way to go.
In India, however, the largely generic problem of “too many cancers” bumps up against the specifically subcontinental problem of “too many people.” 143,000 is clearly—clearly—too few. The billion three hundred million in this country only contribute to 143,000 detections a year; meanwhile, a country with a fourth the number of people detects nearly twice as many. How many people does that leave out? Simple math reveals the cruelty of linear proportion; there are probably at least a few hundred thousand, and as many as a million, silent sufferers of breast cancer in India. It’s the large, characterless numbers that are also tyrannical.
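The “simple math” can be spelled out in a few lines. This is a back-of-the-envelope sketch, not an epidemiological estimate: it takes the figures above at face value (1.3 billion people, 143,000 detections, a comparison country with a fourth of the population and twice the detections) and assumes, crudely, that both countries have the same underlying rate of breast cancer. Since the text says “nearly twice,” treat the result as the upper end of the range. The function name `expected_cases` is hypothetical.

```python
def expected_cases(reference_cases, reference_population, target_population):
    """Scale a reference country's detections linearly to another population."""
    return reference_cases * target_population / reference_population


# Figures taken from the text; the comparison country is left unnamed there.
india_population = 1_300_000_000
india_detected = 143_000

other_population = india_population // 4  # "a fourth the number of people"
other_detected = 2 * india_detected       # "nearly twice as many" (upper bound)

expected = expected_cases(other_detected, other_population, india_population)
undetected = expected - india_detected

print(f"Expected at the comparison country's rate: {expected:,.0f}")
print(f"Implied undetected cases: {undetected:,.0f}")
```

At the same detection rate India would record about eight times as many cases as it does, which is where the “as many as a million” figure comes from. Differences in age structure and in true incidence would pull the real number down, hence the lower bound of a few hundred thousand.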
The good news is nascent, probably a drop in the ocean; but hopers gonna hope. Small, focused organisations like Strand Life Sciences now offer clinical genetic tests that can tell you if you’re prone to breast cancer; whether you acquired or inherited particular mutations that are indicative of the disease; and then tell you what you can do about it. Underlying this complex clinical workflow is Strand NGS, which allows you to follow the delicate thread of bioinformatics-based inference and deduction needed to go from the “proband”—the patient—to diagnosis. For India, clinical genetic workflows like Strand NGS coupled with state-of-the-art clinical diagnostic tests, like the ones Strand offers, are powerful tools to battle a disease that, for too long, has thrived unopposed.
About the Author: Radhakrishna Bettadapura, or RK, is a Senior Software Engineer at Strand Life Sciences. RK holds a Ph.D in Computational Biology and Mechanical Engineering from the University of Texas, Austin. Before his stint with the NGS team at Strand, he developed algorithms to model and analyse protein flexibility. At Strand, he manages Strand NGS, the organization’s flagship product for the analysis of next-generation sequencing data.
Editorial Note: This is an abridged version of a post originally published on the Strand NGS Blog.