KIT Press release and additional coverage (T-Online & SPIEGEL) on Risklayer Corona tracking

Risklayer received coverage in national news media and issued a press release together with KIT and CEDIM. A central part of these collaborations was sharing and crowdsourcing global and district-level (Kreis) data on the Corona pandemic. See the English translations below:

KIT Press release

Risklayer GmbH, together with the KIT Center for Disaster Management and Risk Reduction Technology (CEDIM), is actively collecting data on the Corona pandemic as cases are reported across Germany and building a spatial and temporal database that can be used for risk assessment. The maps produced by the team offer a quick overview of the spread of the virus in Germany and worldwide, and identify risk areas down to the district level. "Our goal is to provide an overview of the number of people infected with the Corona virus compared to the population in each district," says James Daniell, scientist at KIT and co-founder of the spin-off Risklayer GmbH. The team uses official statistics from health ministries and local governments. So far, the team has relied on web scraping, i.e. gathering information by collecting targeted data from websites. In this way, more than 5,000 data sources have been analyzed.

The Risklayer and CEDIM team has also launched a crowdsourcing initiative to gather the latest data with the help of many volunteers from Germany. In addition to the number of Coronavirus cases, the team also evaluates demographic information such as the number of inhabitants, healthcare capacity such as the number of hospital beds in each district, and the age structure of the affected population. The data is reported as both absolute and relative case numbers, in relation to population density, on Risklayer's interactive data visualization platform, which is freely accessible to the public. "At the moment there is no open data portal that bundles and evaluates the data at the district level - that's why we came in to fill the gap," says Daniell. "The more precisely we identify risk zones, the better we can protect ourselves." It is also possible to observe trends based on the information evaluated and thus to make estimates for the future. However, these also depend greatly on government measures to contain the virus.
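The difference between absolute and relative case numbers can be illustrated with a minimal sketch. It assumes a normalization per 100,000 inhabitants, which is a common convention but is not confirmed by the press release as the one Risklayer uses. The district names and figures are hypothetical and serve only as an illustration.

```python
def cases_per_100k(cases: int, population: int) -> float:
    """Return confirmed cases per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Hypothetical districts (not real data): same absolute case count,
# very different population sizes.
districts = {
    "District A": {"cases": 120, "population": 300_000},
    "District B": {"cases": 120, "population": 60_000},
}

rates = {
    name: cases_per_100k(d["cases"], d["population"])
    for name, d in districts.items()
}

# List districts from highest to lowest relative case number.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {rate:.1f} cases per 100,000 inhabitants")
```

Both hypothetical districts report the same absolute number of cases, but the smaller one has a far higher relative number, which is why a per-population view can point to risk zones that absolute counts hide.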

Constant updates, including at global level, can be followed on Risklayer's Twitter page.

See the KIT Press Release on 24.04.2020 for the original press release in German. Translated by Bijan Khazai.

Spiegel reports on Risklayer’s crowdsourcing project to fill the data gap on official Coronavirus statistics reported by the Robert Koch Institute

In order to estimate which measures against the corona pandemic make sense, experts need reliable data. However, the case numbers of the Robert Koch Institute sometimes lag behind reality by several days. According to the Robert Koch Institute (RKI), the number of people confirmed to be infected with the new corona virus in Germany rose by almost 5,000 from Monday to Tuesday. On the previous days, only 4,000 (Sunday to Monday) or even only 2,000 (Saturday to Sunday) new cases were added. 5,000 new infections in one day: that cannot be correct, some readers noted. In fact, the number comes about through a peculiarity of the reporting chain. Ideally, a local health office registers a case and transmits it to the state office on the same day, which in turn forwards it to the RKI. The RKI then includes all the numbers transmitted to it electronically in a nationwide statistic that it reports the following day. But that doesn't always work.

Sometimes it takes significantly longer for the infection numbers from individual cities and counties to reach the RKI. This became particularly clear on the weekend. Because not all offices had submitted their data on Saturday, the RKI first reported on Sunday that fewer new infections had been reported than on the previous day, and then had to clarify that there were large data gaps. If one assumes that on Sunday, too, data from the states reached the RKI only in part, numerous cases were also missing from Monday's data. The high number of confirmed new infections on Tuesday thus includes infections that the state authorities did not report to the RKI on Saturday and Sunday. The institute writes: "The data will be forwarded on Monday and will also be available in this statistic from Tuesday."

More than half a week behind

Reliable data are particularly important because they form the basis for the measures against the spread of the new corona virus. After the misleading report on Sunday, there was brief hope that protective measures could be eased soon. But it is now clear that the number of cases continues to increase to a similar extent as at the end of last week.

The new online dashboard of the RKI, which has been available since Friday, provides information about the cities and districts in which the numbers were particularly lagging behind. For each federal state, each city and each district, you can see how many people have been infected and died. It also shows the date and number of new infections registered in a place. The RKI writes that this reporting date is the day on which the health authority becomes aware of a case and records it electronically as such.

According to the dashboard, at least 50 Corona cases have already been registered in 159 German cities and counties. Here one can assume that new infections are added almost daily and the number of cases should increase. Almost all of these locations should have new reports every day. But there are clearly gaps. In the figures published on Tuesday morning by the RKI, reports from Monday are only included for half of the 159 locations. The rest will only arrive at the RKI today or in the next few days. And as the RKI has already made clear: On weekends, some cities and counties do not report any new cases (see graphic).

In almost ten percent of the cities and districts (Landkreise) with at least 50 cases, the latest report known to the RKI still comes from before the weekend. In Cologne, Dortmund, Mannheim and the district of Lörrach, for example, the dashboard names last Thursday as the latest reporting date. Accordingly, only 285 Corona cases are reported for Cologne. As of March 23, the city already spoke of 857 confirmed cases on its own website - that's three times as many.

IT problems distorted regional case numbers

Strange numbers are seen for Heidelberg as well: The local authorities speak of more than 100 infected people, the RKI dashboard only mentioned four as of Monday. There are now 86 cases registered with the RKI. On SPIEGEL's request, the district office responsible for the city's Corona case numbers confirmed that there were coordination and technical problems with the data transmission last week.

In consultation with the State Health Office, a number of cases were initially registered under a different keyword than that required for the registration, said Andreas Welker, Deputy Head of the Health Department Rhein-Neckar-Kreis / Heidelberg. After consultation with the State Health Office, the problem was initially resolved on Thursday. Shortly afterwards, there were further IT problems, the official said. "Now this bug has been fixed, so that a correct transmission can be assumed from Tuesday onwards."

A research project by the Karlsruhe think tank Risklayer and the Center for Disaster Management and Risk Reduction Technology (CEDIM) shows how far the RKI numbers lag behind the recorded cases on site. Volunteers collect the latest numbers directly from the health authorities, bring them together, and check them. While the RKI spoke of a good 27,000 confirmed cases nationwide on Tuesday morning, the crowdsourcing project already had around 32,000 cases at this point.

Only 51 out of 300 cases known to the RKI

If you compare the data of the project with that of the RKI, city by city and district by district, it becomes clear that the case numbers of individual health authorities in North Rhine-Westphalia lag behind in the RKI collection. In some cases, the question even arises whether the reporting chain is not just very slow but has broken down entirely. There are now well over 300 confirmed cases in the Rhein-Sieg district, but only 51 in the RKI statistics. According to the dashboard, last Friday was the reporting date for the most recent case. However, the NRW Ministry of Health spoke of over 70 infected people in the Rhein-Sieg district more than a week ago. When asked by SPIEGEL, the district office there referred to the usual delay in reporting. It is not aware of any technical problems.

The NRW Ministry of Health stated that, in contrast to the RKI, it includes in its statistics not only the figures from the responsible state authority but also reports from local authorities and from crisis teams. "As a result, the data status of the ministry may be closer to the municipal data status than is the case with the RKI," it says. It is unclear whether this alone explains the difference of 249 cases. In other places, too, there are strikingly large differences between the figures from the RKI statistics and the local reports. According to the RKI, there are eleven cases in Bielefeld, while the health office had confirmed almost 100 cases by Monday. In the Rhein-Erft district, the authorities speak of around 200 infected people, but the RKI dashboard currently only mentions 63. In total, about a quarter of the approximately 400 German cities and districts in the Risklayer and CEDIM data sets show values that are at least 25 percent higher than in the RKI statistics.
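The kind of comparison described here can be sketched in a few lines. The figures below are the rounded numbers quoted in this article ("well over 300", "almost 100", "around 200", "more than 100"), so the results are approximate. The function name and the data layout are illustrative assumptions, not the project's actual method.

```python
# Locally reported vs. RKI case numbers as quoted in the article (rounded).
reported = {
    "Rhein-Sieg": {"local": 300, "rki": 51},
    "Bielefeld": {"local": 100, "rki": 11},
    "Rhein-Erft": {"local": 200, "rki": 63},
    "Cologne": {"local": 857, "rki": 285},
    "Heidelberg": {"local": 100, "rki": 86},
}

THRESHOLD = 0.25  # "at least 25 percent higher"


def relative_excess(local: int, rki: int) -> float:
    """Return how much higher the local figure is, relative to the RKI figure."""
    if rki <= 0:
        # No RKI baseline: any local case counts as an unbounded excess.
        return float("inf") if local > 0 else 0.0
    return (local - rki) / rki


# Districts whose local figure exceeds the RKI figure by the threshold or more.
flagged = [
    name
    for name, v in reported.items()
    if relative_excess(v["local"], v["rki"]) >= THRESHOLD
]
share = len(flagged) / len(reported)

print("At least 25 percent higher than RKI:", ", ".join(flagged))
print(f"Share of listed districts: {share:.0%}")
```

The five districts listed are the article's own examples of large gaps, so the share computed for them is not comparable to the roughly one quarter found across all of the approximately 400 districts.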

Data transmission by fax

At the request of SPIEGEL, the city of Bielefeld merely stated that the reporting route for the data follows a fixed pattern. This pattern means that infection numbers arrive late at the RKI. The data is generated in test laboratories. They are legally obliged to report corona cases to the local health authorities, i.e. the cities or counties, within 24 hours.

This is usually done by fax, a spokesman for the Ministry of Health in Baden-Württemberg said on request. The employees of the health authorities in the cities and counties then check the cases and record them manually in a digital reporting system that transmits the data to the state authorities. The state authorities import the case numbers into a database and transmit them to the RKI every day at 3 p.m.

This can happen within a day if the laboratory reports the case very early. However, if the laboratory only manages to report a registered case after almost 24 hours, and if the case is returned to the city or district health office for another day, three days pass before it can be found in the RKI reporting system. In the RKI statistics, it then only appears with a delay of four days.
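The worst case described above can be traced as a simple timeline. This is a rough sketch: the article gives only the totals (three days until the case is in the RKI reporting system, four days until it appears in the statistics), so the split into one day per stage and the example start date are assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumed stage durations in days; only the totals come from the article.
WORST_CASE_STAGES = [
    ("laboratory reports case to local health office", 1),
    ("case is returned to the health office for another day", 1),
    ("entry in reporting system and transmission to the RKI", 1),
    ("publication in the RKI statistics", 1),
]


def timeline(start: date, stages: list[tuple[str, int]]) -> list[tuple[str, date]]:
    """Return the completion date of each stage, starting from `start`."""
    current = start
    events = []
    for label, days in stages:
        current += timedelta(days=days)
        events.append((label, current))
    return events


registered = date(2020, 3, 20)  # hypothetical date on which the case is registered
events = timeline(registered, WORST_CASE_STAGES)
total_delay = (events[-1][1] - registered).days

for label, day in events:
    print(f"{day.isoformat()}: {label}")
print(f"Total delay: {total_delay} days")
```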

The more cases in a region, the higher the risk of outdated data

The risk that the RKI will only find out about newly infected people after many days increases as the number of cases increases. "The current high workload of the local health authorities can lead to a delay in data entry and the submission of cases, particularly for health authorities with large numbers of cases," said the Ministry of Health in Baden-Württemberg. The weekend problem is also known in Baden-Württemberg. "Experience shows that the increases at weekends are lower due to the reporting chain than during the week, since not all health authorities can guarantee full technical data entry," said a spokeswoman for the State Health Office in Stuttgart.

Apart from the delay in reporting to the RKI, there have been criticisms of the data basis of the corona pandemic for several weeks. The German Network for Evidence-based Medicine and the Institute for the World Economy (IfW) are calling for the number of corona infected people to be examined in a representative population sample. This is the only way to record how many people in the general population are infected and how quickly this value increases. The data basis must be improved, since it is the basis for the measures that are currently being taken. "We can only provide the data that is transmitted to us," says Susanne Glasmacher, spokeswoman for the RKI. The corona pandemic is not only a challenge for the health system, but also for bureaucracy.

Original Spiegel article by Julia Merlot and Marcel Pauly, translated by Bijan Khazai.

T-online now using Risklayer’s data feed to report the spread of Coronavirus in Germany’s districts

T-online is now publishing the current situation on Corona infections by district in an interactive map, using a data feed from Risklayer that is updated several times a day as newer data becomes available from the authorities. The updates are supported by digital volunteers in a crowdsourcing effort led by Risklayer.

See the original article published in T-online by Laura Stressing and Lars Wienand, translated by Bijan Khazai.