SARS-CoV-2/COVID-19 Confirmed Cases in Europe
The covid19-eu-zh/covid19-eu-data repository is an experiment in automated scraping and aggregation of COVID-19 confirmed-case data using GitHub Actions.
covid19-eu-zh is a dynamic and energetic team. Please consider following their Telegram channel, which covers COVID-19 in Europe in Chinese.
The Dataset
The dataset is updated regularly.[1]
The following is a table showing the update status and data sources.
Country | Status | Data Source |
---|---|---|
AT | ||
BE | ||
CH | ||
CZ | ||
DE | ||
DK | ||
ES | ||
FR | ||
IT | ||
NL | ||
NO | ||
PL | ||
SE | ||
UK | ||
EU (ECDC) | ||
A Demo: Confirmed Cases in Germany
The data can be loaded directly into your applications. Here is a simple demo using the data file for Germany. Please refer to /flora/covid19_eu_data/ (link to the dataset) for the data files of all available countries.
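As a minimal sketch of what loading the data could look like, the snippet below parses CSV text and sums the case counts per date. The column names `datetime` and `cases`, the function name, and the sample rows are assumptions made for illustration; check the header of the actual file before relying on them.

```python
import csv
import io
from collections import defaultdict


def total_cases_by_date(csv_text, date_column="datetime", cases_column="cases"):
    """Sum the case counts of all rows that share the same date value.

    The default column names are assumptions, not confirmed by the dataset.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(cases_column) or "").strip()
        if not value:
            continue  # skip rows without a case count
        totals[row[date_column]] += int(float(value))
    return dict(totals)


# Made-up sample rows, only to show the expected shape of the input.
SAMPLE = """country,state,cases,datetime
DE,Bayern,10,2020-03-01
DE,Berlin,5,2020-03-01
DE,Bayern,15,2020-03-02
"""

if __name__ == "__main__":
    print(total_cases_by_date(SAMPLE))
```

To try this on real data, read the Germany file from a clone of the repository and pass its text to the function. Following the naming pattern of the other files, that file is presumably `dataset/covid-19-de.csv`, but this name is inferred rather than confirmed.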
Data Collection
The structure of the project is as follows.
.
├── README.md
├── dataset #where the data files live
├── documents #where the raw data and files live
├── now.json #Zeit Now setup for a FaaS service
├── scripts #scripts to download and aggregate data
└── .dataherb #where the metadata lives
Scripts
Each country has its own Python script, so each one can run on its own schedule. The scripts share classes from utils.py, which keeps their structure similar.
scripts
├── download_at.py
├── ...
├── requirements.txt
└── utils.py
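The idea of sharing classes across per-country scripts can be sketched as follows. This is not the content of utils.py: the class and method names (`CountryDownloader`, `fetch`, `parse`, `run`) are hypothetical, and the fetch step returns canned text so that the example runs offline.

```python
import csv
import io
from abc import ABC, abstractmethod


class CountryDownloader(ABC):
    """Hypothetical base class: each country only implements fetch and parse."""

    country = ""

    @abstractmethod
    def fetch(self):
        """Return the raw source text for this country."""

    @abstractmethod
    def parse(self, raw):
        """Turn raw text into a list of (region, cases) tuples."""

    def run(self):
        """Shared pipeline: fetch, parse, and render the rows as CSV text."""
        rows = self.parse(self.fetch())
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["country", "region", "cases"])
        for region, cases in rows:
            writer.writerow([self.country, region, cases])
        return buffer.getvalue()


class DemoDownloader(CountryDownloader):
    """Stand-in for a per-country script; uses canned text instead of HTTP."""

    country = "XX"

    def fetch(self):
        return "North:3;South:4"

    def parse(self, raw):
        pairs = (item.split(":") for item in raw.split(";"))
        return [(name, int(count)) for name, count in pairs]
```

With this kind of layout, a per-country script only needs to describe where its data comes from and how to parse it, while the output format stays identical across countries.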
Dataset
The dataset folder contains the full dataset of each country and the daily updates of each country.
dataset
├── covid-19-at.csv
├── ...
└── daily
    ├── at
    └── ...
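Given this layout, the paths for a country can be derived from its code. The helper below is a small sketch: the function name is hypothetical, and it assumes every country follows the `covid-19-<code>.csv` and `daily/<code>` pattern shown for `at`.

```python
from pathlib import PurePosixPath


def dataset_paths(country_code, root="dataset"):
    """Return (full dataset file, daily folder) for a country code."""
    code = country_code.lower()
    base = PurePosixPath(root)
    return base / f"covid-19-{code}.csv", base / "daily" / code
```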
GitHub Actions
We manage the pipelines using GitHub Actions. The full set of workflows can be found in the original repository.
We use Germany as an example. The workflow for Germany has two triggers: a push to the master branch and a schedule. The job steps are:
- Check out the repository;
- Set up Python and install the Python requirements;
- Run the Python script to download and aggregate data;
- Push the data to the repository.
name: CI Download DE SARS-COV-2 Cases from RKI
on:
  push:
    branches:
      - master
  schedule:
    - cron: '0 7/1 * * *'
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout current repo
        uses: actions/checkout@v2
      - name: Get current directory and files
        run: |
          pwd
          ls
      - uses: actions/setup-python@v1
        with:
          python-version: '3.7' # Version range or exact version of a Python version to use, using SemVer's version range syntax
          architecture: 'x64' # optional x64 or x86. Defaults to x64 if not specified
      - name: Install Python Requirements
        run: |
          python --version
          pip install -r scripts/requirements.txt
      - name: Download Records
        run: |
          python scripts/download_de.py
          ls dataset/daily/de
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git pull
          git status
          git add .
          git commit -m "Update DE Dataset" || echo "Nothing to update"
          git status
      - name: Push changes
        uses: ad-m/github-push-action@master
        with:
          repository: covid19-eu-zh/covid19-eu-data
          github_token: ${{ secrets.GITHUB_TOKEN }}
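The `git commit ... || echo "Nothing to update"` line keeps the step from failing when a run produces no new data, since `git commit` exits with an error when there is nothing to commit. The same decision can be made explicitly by inspecting `git status --porcelain`, which prints one line per changed path and nothing for a clean tree. The sketch below is an alternative for illustration, not what the workflow does; the function names are hypothetical.

```python
import subprocess


def has_changes(porcelain_output):
    """True if `git status --porcelain` output lists at least one path."""
    return any(line.strip() for line in porcelain_output.splitlines())


def commit_if_changed(message, repo_dir="."):
    """Stage and commit everything, but only when the working tree is dirty."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    if not has_changes(status):
        return False  # nothing to update, so skip the commit
    subprocess.run(["git", "add", "."], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
    return True
```

A download script could call `commit_if_changed("Update DE Dataset")` after writing its files, which would make the "nothing changed" case an ordinary return value instead of a suppressed error.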
And Much More
We run a Telegram Channel (in Chinese): 新冠肺炎欧洲中文臺
If you would like to help or track the progress of this project, check out our roadmap.
[1] For some countries, such as Spain, we only download the PDF reports. In other countries, such as Italy, the government provides open and well-organized data.