On Friday I noted news reports claiming that US military intelligence was warning as far back as late November of a possible new virus in China with potential global impact. The problem is that these US intelligence reports would predate by weeks our earliest understanding of when the first cases emerged, and would come well before the Chinese themselves knew they had a new disease on their hands. That chronology of the outbreak comes from news reports from major dailies in the United States and Hong Kong. But there’s another body of evidence which points to a similar and more definitive timeline. It’s hidden in the genome of the COVID-19 virus itself.
One of the great cinematic moments of future movies about the COVID-19 epidemic in the United States came on February 29th, when Trevor Bedford, a virologist at the Fred Hutchinson Cancer Research Center in Seattle, made a startling public announcement. A genomic analysis by Bedford and his colleagues showed that the COVID-19 samples from the assisted living facility in Kirkland, Washington were almost certainly evolutionary descendants of the January 19th sample taken from a Snohomish County man who had been diagnosed with COVID-19 after returning from China. As Bedford put it in his late-night Twitter thread, this finding had “some enormous implications.” The evidence strongly suggested that COVID-19 had been spreading in the Greater Seattle region for six weeks.
This was a first clue that limitations on testing had allowed COVID-19 to get a critical jump on public health authorities, who had assumed the contagion was bottled up in a handful of quarantined travelers from China. But it is also a window into the science of viral genomics, which gives researchers startling new insights into the life histories of pathogens.
As viruses replicate from host to host they pick up small (almost always inconsequential) mutations at a roughly steady, known rate. By counting the mutations that separate one sample from another, you can place the different branches of the disease’s spread in time. Researchers have now mapped the genomes of numerous COVID-19 genetic lineages from across the world. In this way, they can pull each genetic thread back to a common starting point, placing it both in the stream of genetic evolution and in time. All the analyzed lineages lead back to an origin point in late November or early December 2019. As the Nextstrain project puts it, “The common ancestor of circulating viruses appears to have emerged in Wuhan, China, in late Nov or early Dec 2019.”
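To make the arithmetic behind this kind of dating concrete, here is a deliberately simplified sketch in Python. It is not how Nextstrain or Bedford's group actually do it; real analyses build a full evolutionary tree and fit the mutation rate statistically. The rate used here (about 24 mutations per genome per year, or roughly two a month) is an illustrative assumption, and the sample dates and mutation counts are invented for the example. The function names are hypothetical as well.

```python
from datetime import date, timedelta

# ASSUMPTION: an illustrative clock rate of about 24 mutations per genome
# per year (roughly two per month). This figure is not from the article.
ASSUMED_MUTATIONS_PER_YEAR = 24.0


def estimate_origin(sample_date, mutations_from_root,
                    mutations_per_year=ASSUMED_MUTATIONS_PER_YEAR):
    """Walk one sample back in time by the years its mutations imply."""
    years_elapsed = mutations_from_root / mutations_per_year
    return sample_date - timedelta(days=round(years_elapsed * 365.25))


def estimate_common_ancestor(samples,
                             mutations_per_year=ASSUMED_MUTATIONS_PER_YEAR):
    """Average the per-sample origin dates into one rough estimate.

    `samples` is a list of (collection_date, mutations_from_root) pairs.
    """
    ordinals = [
        estimate_origin(collected, mutations, mutations_per_year).toordinal()
        for collected, mutations in samples
    ]
    return date.fromordinal(round(sum(ordinals) / len(ordinals)))


# HYPOTHETICAL samples, invented purely to show the calculation.
EXAMPLE_SAMPLES = [
    (date(2020, 3, 1), 6),
    (date(2020, 2, 1), 4),
    (date(2020, 1, 19), 3),
]
```

With these made-up inputs, `estimate_common_ancestor(EXAMPLE_SAMPLES)` lands on December 2, 2019. That result follows only from the assumed rate and invented counts, so it shows the logic rather than confirming the real estimate.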
Here’s a visualization of what this detective work looks like (click the image to go to the interactive visualization).
If I understand the science correctly, it is conceivable that scientists could still find a lineage that points to an earlier point of origin. But with over 2,000 genomes from around the world already analyzed, this seems quite unlikely. I’m not in a position to comment on the science myself. But my impression from listening to those who can is that the science underlying these estimates is very, very compelling. And again, they place the origin of the disease roughly where the investigative press reports do: late November or early December 2019.
There might be some minor chronological wobbliness built into statistical analyses of the frequency of mutations. But again, the science behind these analyses appears to be very compelling. And they seem to preclude anything happening in November 2019 that US military intelligence could possibly have picked up.