As the novel coronavirus saturates the news, forcing colleges and sports leagues to shut down and infiltrating Hollywood, many Americans are understandably wondering when it will arrive at their doorstep. While the number of known cases in the U.S. appears to be comparatively low as of now, the figures are almost certain to spike very soon, as both testing and exposure increase. COVID-19 has unquestionably spread further than officially known, and it is poised to turn the corner and spread widely across the U.S. By the end of April, there will be no dispute that COVID-19 is not a “foreign virus.”
To better understand outbreaks like this, the Centers for Disease Control and Prevention (CDC) consults a network of academics and industry experts who specialize in modeling the spread of contagious diseases. One of those outside groups, the Laboratory for the Modeling of Biological and Socio-technical Systems at Northeastern University, provided TIME with exclusive access to 100 of the different coronavirus scenarios it has generated in its efforts to support the CDC.
For the following interactive, TIME picked five of Northeastern’s potential scenarios that most closely align with the growth of COVID-19 cases we’ve already seen in the U.S. These scenarios assume detection levels ranging from about 40% of those who contract the illness (under the “High” scenario) to 25% (in the “Low” scenario). They also account for the fact that the actual number of infected individuals is, and will remain, significantly higher than the number of confirmed cases. That’s because not all infected individuals will exhibit symptoms or be tested, even though they remain contagious.
To create this interactive, the Northeastern team provided TIME with potential day-by-day growth in COVID-19 cases across 483 U.S. locations, organized around transportation hubs and dating from the emergence of the virus through April 30. This feature, which TIME produced in-house with the consultation of the researchers to ensure accuracy, will continue to be updated as the model adapts once more is known about the virus’ behavior — for instance, whether it might be highly seasonal, like the flu.
The purpose of this visualization, and of Northeastern’s research more broadly, is not to predict what will happen, but rather to forecast what could occur under a variety of conditions that remain unknown or unknowable. But the conclusion the models offer is clear: The degree to which the U.S. government and the healthcare industry can coordinate efforts to test individuals more effectively (a process that has been confusing, slow and riddled with errors) could mean the difference between tens of thousands of cases over the next six weeks, or well over a million.
“What we’re seeing now is really just the tip of the iceberg,” says Alessandro Vespignani, the director of the Northeastern lab, who worked alongside colleagues Matteo Chinazzi and Ana Pastore y Piontti on this research. “That’s the problem of not doing extensive testing. Because testing has been limited here, I would be inclined toward the worst case scenarios.” (The researchers also provided TIME with a catastrophic scenario in which virtually no one is tested, which is not visualized here because attempts to produce images of the outcome repeatedly crashed this reporter’s computer. Suffice it to say the entire map quickly becomes completely orange.)
The model that produced these scenarios consists of two streams of information. The first is what Vespignani called “the ‘business as usual’ of the world.” This includes a vast amount of data on global populations gathered from each country’s version of the Census Bureau (as well as many other sources), with a focus on population density and mobility, from daily commuting patterns to the volume of international travel.
The second set of parameters fed into the model involves the nature of this coronavirus, which at this point is much less well understood. The challenge with a “novel” coronavirus, after all, is that it’s new. Every communicable disease behaves differently, which poses a problem for gaming out the transmission of one that wasn’t known to exist until very recently.
“For the flu, or Ebola, or more regular diseases, we have quite a good understanding of the mechanism of transmission and so forth,” Vespignani said. “For [COVID-19], the problem is we didn’t know anything until two months ago. Now, every day that goes by we add a little piece to the puzzle and we can fill the model with those numbers.”
The most important factors that researchers like Vespignani need to consider include a virus’ “reproduction number” (a value that represents how contagious it is) as well as its incubation time (the period between infection and the onset of symptoms). Given that COVID-19 can produce minor or no symptoms in healthy individuals, the models in this case must also account for the detection rate.
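To make those three inputs concrete, here is a minimal sketch of a textbook compartmental (SEIR) model in Python. It is not Northeastern's model, which also draws on population and mobility data; the function name, the parameter names and every number in the usage lines are illustrative assumptions. As a simplification, the sketch treats the incubation time as the delay before an infected person becomes contagious, and it counts a fixed fraction of all infections as detected.

```python
def simulate_seir(population, r0, incubation_days, infectious_days,
                  detection_rate, days, initial_infected=1):
    """Deterministic SEIR sketch with a daily time step.

    Returns one (cumulative_infections, detected_cases) pair per day.
    All parameter names are hypothetical, chosen for this illustration.
    """
    beta = r0 / infectious_days    # infections caused per contagious person per day
    sigma = 1.0 / incubation_days  # daily rate of moving from exposed to contagious
    gamma = 1.0 / infectious_days  # daily rate of recovery

    susceptible = float(population - initial_infected)
    exposed = 0.0
    infectious = float(initial_infected)
    cumulative = float(initial_infected)
    history = []

    for _ in range(days):
        new_exposed = beta * susceptible * infectious / population
        new_infectious = sigma * exposed
        new_recovered = gamma * infectious

        susceptible -= new_exposed
        exposed += new_exposed - new_infectious
        infectious += new_infectious - new_recovered
        cumulative += new_exposed

        # Assumption: a fixed share of all infections is ever confirmed by testing.
        history.append((cumulative, cumulative * detection_rate))

    return history


# Illustrative values only; none of these come from the Northeastern scenarios.
for rate in (0.40, 0.25):
    total, detected = simulate_seir(1_000_000, 2.5, 5, 5, rate, 60)[-1]
    print(f"detection {rate:.0%}: {total:,.0f} infections, {detected:,.0f} detected")
```

In this toy version the detection rate changes only how many infections are counted, so the gap between the two columns is the "tip of the iceberg" effect. It does not capture how testing can slow transmission by isolating confirmed cases.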
Even if the coronavirus were better understood, the most complex simulations in the world would still produce scenarios that vary widely in severity. As with all models, whether for election outcomes, sporting events or the path of a hurricane, some variability cannot be predicted or captured in a parameter: the uncertainty introduced by “stochastic” events, which behave randomly.
“Let’s imagine you are sick with COVID, and you go into a coffee place. You might sneeze there or sneeze two minutes later when you get into the car and you are alone,” Vespignani said. “Unfortunately, nobody will ever be able to model for that. For this reason, all models are stochastic models.”
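The coffee-shop example can be sketched as a simple chance-driven simulation. The Python below is a hypothetical illustration, not the researchers' method: each case has a fixed number of contacts, and every contact is infected with the same probability, so the average number of new infections per case equals the reproduction number. The names `simulate_outbreak` and `contacts`, and the values used, are assumptions made for this example.

```python
import random


def simulate_outbreak(r0, generations, seed, contacts=10, initial_cases=1):
    """Branching-process sketch: total cases after a number of generations.

    Each case meets `contacts` people and infects each one with
    probability r0 / contacts. Both names are hypothetical.
    """
    rng = random.Random(seed)  # seeded so a given run can be repeated exactly
    infect_probability = r0 / contacts
    current_cases = initial_cases
    total_cases = initial_cases

    for _ in range(generations):
        new_cases = 0
        # One random draw per contact: the sneeze in the shop or in the car.
        for _ in range(current_cases * contacts):
            if rng.random() < infect_probability:
                new_cases += 1
        current_cases = new_cases
        total_cases += new_cases

    return total_cases


# Same inputs, different random draws: 20 runs of one illustrative scenario.
totals = [simulate_outbreak(r0=2.0, generations=8, seed=s) for s in range(20)]
print("smallest outbreak:", min(totals), "largest outbreak:", max(totals))
```

Because identical inputs yield different totals from run to run, a modeler reports a range of scenarios from many runs rather than a single number.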
It is natural to wonder, then, why so much effort goes into computing models that produce such a range of outcomes. Again, the power of the discipline lies not in correctly predicting what will occur, but in demonstrating how the possible scenarios change based on different inputs. As the maps provided here demonstrate, the effective use of widespread testing, even of asymptomatic individuals, will be critical in mitigating the potentially catastrophic impact. Every variable is a clue, and every adjustment to its value (picture a giant machine with hundreds of levers in different positions) offers another hint as to what can contain a pandemic.