Swedish corona statistics

There is a lot of numeric information available about the spread of Covid-19. The problem of understanding the dynamic evolution from different data sources falls into the broader sensor fusion framework. On this site we have collected information from different sources and added some simple numerical calculations. All data from c19, FHM and SCB are retrieved directly from the sources, and all plots are updated automatically once the retrieval is done. This eliminates manual handling. Besides the imported figures, these pages are static.

One of our ambitions is to explain to hobby epidemiologists and professionals the engineering aspects of modelling dynamic processes based on observations of different kinds. This involves both filtering (estimating the state of the process) and parameter estimation. Complex models are often used in the literature about the spread of Covid-19 to draw conclusions about what has happened and to make recommendations about how to act in the future. Their validity is often motivated by the fact that they can fit actual mortality data. The basic modelling aspects we teach in our courses need more attention in this context:

  • A model can never be proven to be correct. It can however be invalidated and proven to be incorrect. "All models are wrong, but some are useful" (George Box). Model validation is an engineering skill that requires a lot of theoretical insights and practical experience. 
  • The parsimony principle. If several models can explain the data, the simplest one is to be preferred. This general principle dates back to the 14th century and Ockham's razor, but is particularly useful when selecting mathematical models.
  • Sensitivity and robustness. What happens if the input data (sensitivity) or model parameters (robustness) are perturbed a little? A good model should give nearly the same output for small disturbances.
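
The last point can be checked numerically. The sketch below is only an illustration, not the calculation used for the plots on this site: it perturbs one quantity at a time by 1% and reports the relative change in the output divided by the relative change in the perturbed quantity. The function names (relative_sensitivity, exponential_growth) and all numbers are hypothetical, and pure exponential growth is chosen as the example model simply because it is the early phase of most epidemic models.

```python
import math


def relative_sensitivity(model, params, name, rel_step=0.01):
    """Relative change in model output per relative change in params[name].

    A value near 1 means a 1% perturbation changes the output by about 1%;
    a much larger value means small disturbances are strongly amplified.
    """
    base = model(**params)
    perturbed = dict(params)
    perturbed[name] = params[name] * (1.0 + rel_step)
    return ((model(**perturbed) - base) / base) / rel_step


def exponential_growth(i0, rate, t):
    """Number of infected after t days of unchecked exponential growth."""
    return i0 * math.exp(rate * t)


# Hypothetical values, chosen only for illustration.
params = {"i0": 100.0, "rate": 0.2, "t": 30.0}
for name in ("i0", "rate"):
    value = relative_sensitivity(exponential_growth, params, name)
    print(f"{name}: {value:.2f}")
```

With these illustrative numbers, a 1% error in the initial number of infected changes the 30-day prediction by about 1%, while a 1% error in the growth rate changes it by roughly 6%, and the amplification grows with the prediction horizon.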

What we have shown in Ny Teknik is that the simplest possible model, the roughly 100-year-old SIR model, can explain Swedish mortality data very well. The figure below shows actual data (running average over one week) together with the simplest possible SIR model (red line). The residuals of the fit have a standard deviation of 4 deaths per day. By also including one change point where the spread rate is allowed to change, an even better fit is obtained (standard deviation 3 deaths per day).
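
For readers who want to experiment, here is a minimal sketch of a discrete-time SIR model with an optional change point in the spread rate, together with the residual standard deviation used as the measure of fit. It is not the code behind the figure. The daily Euler step, the function names, the parameter values, and the assumption that daily deaths are a fixed fraction of the daily removals are all illustrative choices made here.

```python
def simulate_sir(beta, gamma, n_days, population, i0,
                 change_day=None, beta_after=None):
    """Simulate S, I, R with one Euler step per day.

    beta is the spread rate and gamma the removal rate (both per day).
    If change_day is given, beta_after replaces beta from that day on.
    Returns three lists of length n_days + 1.
    """
    s, i, r = float(population - i0), float(i0), 0.0
    S, I, R = [s], [i], [r]
    for day in range(n_days):
        use_after = change_day is not None and day >= change_day
        b = beta_after if use_after else beta
        new_infections = b * s * i / population
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        S.append(s)
        I.append(i)
        R.append(r)
    return S, I, R


def daily_deaths(I, gamma, fatality_ratio):
    """Assumption: a fixed fraction of each day's removals are deaths."""
    return [fatality_ratio * gamma * i for i in I[:-1]]


def residual_std(model, data):
    """Root-mean-square difference between model output and data."""
    n = len(data)
    return (sum((m - d) ** 2 for m, d in zip(model, data)) / n) ** 0.5


# Hypothetical parameter values, not fitted to Swedish data.
S, I, R = simulate_sir(beta=0.3, gamma=0.1, n_days=200,
                       population=10_000_000, i0=100)
S2, I2, R2 = simulate_sir(beta=0.3, gamma=0.1, n_days=200,
                          population=10_000_000, i0=100,
                          change_day=30, beta_after=0.15)
print(f"peak infected without change point: {max(I):.0f}")
print(f"peak infected with change point:    {max(I2):.0f}")
```

To fit such a model, one would search for the parameter values that minimise residual_std between daily_deaths and the smoothed mortality series. The basic model has only a handful of free parameters, and the change point adds two more (the day of the change and the new spread rate).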

The possible gain of increasing model complexity beyond that of the SIR model needs to be weighed against the risk of overfitting, which may lead to increased sensitivity and decreased robustness.