SQLite database export#

About#

You can use the phenodata export-observations subcommand to export observations including metadata into an SQLite database.

The database will contain the data table dwd_observation, and the metadata tables dwd_station, dwd_species, dwd_phase, dwd_quality_byte, and dwd_quality_level. There is also a database view called dwd_phenology, which will join those tables together properly, and present a denormalized variant of the data, ready for convenient querying.

If you want to create SQLite database archive files, containing all datasets, or just download and use them, please follow up reading the DWD SQLite database archive documentation section.

Usage#

Produce#

This example walks you through the steps needed to store a subset of the phenology observation data available on DWD CDC into an SQLite database file on your machine.

Let’s start by defining the database name.

export DBPATH=phenodata-dwd-annual-recent-hasel.sqlite

Now, export a few selected data points to keep the database size small.

phenodata export-observations \
    --source=dwd --dataset=annual --partition=recent \
    --station=münchen \
    --year=2021,2022,2023 \
    --filename=Hasel \
    --target=sqlite:///${DBPATH}

Attention

Please note that each invocation will overwrite (replace) all tables within the given database without confirmation.

Consume#

sqlite3#

Inquire the database schema and metadata information.

# Display all tables and views.
sqlite3 "${DBPATH}" '.tables'

# Display schema of all tables and views.
sqlite3 "${DBPATH}" '.fullschema --indent'

# Display database metadata information.
sqlite3 "${DBPATH}" -csv -header 'SELECT * FROM dwd_about'

Run a query on the dwd_phenology view.

sqlite3 -csv -header "${DBPATH}" 'SELECT * FROM dwd_phenology ORDER BY date;'

sqlite3 -csv -header "${DBPATH}" <<SQL
SELECT * FROM dwd_phenology ORDER BY date;
SQL

Details#

You can always inspect the database schema using sqlite3 "${DBPATH}" '.fullschema --indent'. In order to learn about how the dwd_phenology database view looks like, this SQL example can be helpful.

sqlite3 -csv -header "${DBPATH}" <<SQL
SELECT
   dwd_observation.*,
   dwd_station.*,
   dwd_station.station_name AS station_name,
   dwd_species.species_name_en AS species_name,
   dwd_phase.phase_name_en AS phase_name
FROM
   dwd_observation, dwd_station, dwd_species, dwd_phase
WHERE true
   AND dwd_observation.station_id=dwd_station.id
   AND dwd_observation.species_id=dwd_species.id
   AND dwd_observation.phase_id=dwd_phase.id
SQL

Note

Please note this SQL example omits joining in the dwd_quality_byte and dwd_quality_level tables for better readability. The view dwd_phenology does include them.

Backlog#

Todo

[o] How to publish using Datasette
[o] How to publish using Grafana SQLite Datasource
[o] Explore compression options
[o] Export to Parquet format