Ready-to-use benchmark dataset for few-shot crop type classification using Sentinel-1 and Sentinel-2 imagery.
Part of the PreTrainAppEO ("Pre-Training Applicability in Earth Observation") research project.
EuroCropsML is a pre-processed and ready-to-use machine learning dataset for crop type classification of agricultural parcels in Europe.
It consists of a total of 706,683 Sentinel-2 and 176,055 Sentinel-1 multi-class labeled data points with a total of 176 distinct classes.
Each data point contains an annual time series of per-parcel median pixel values of Sentinel-1 data and/or Sentinel-2 L1C (top-of-atmosphere) reflectance data for the year 2021. For Sentinel-1, we use the C-band Synthetic Aperture Radar (SAR) Ground Range Detected (GRD) data. Imagery is selected based on the orbit type available for the location, either ascending or descending. We use data acquired in Interferometric Wide (IW) swath mode with the VV (vertical transmit, vertical receive) and VH (vertical transmit, horizontal receive) polarization bands.
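To illustrate what "per-parcel median pixel values" means, here is a minimal sketch using only the Python standard library. It is not the eurocropsml implementation: the function name `parcel_median_series`, the input layout (a mapping from acquisition date to the list of pixels inside the parcel, each pixel being a list of band values), and the toy values are all assumptions made for illustration.

```python
from statistics import median


def parcel_median_series(observations):
    """Reduce each acquisition to one vector of per-band medians.

    `observations` maps an acquisition date (ISO string) to the list of
    pixels inside the parcel on that date; each pixel is a list of band values.
    Returns a list of (date, [median per band]) tuples sorted by date.
    """
    series = []
    for date in sorted(observations):
        pixels = observations[date]
        # zip(*pixels) regroups the values band by band across all pixels.
        band_medians = [median(band) for band in zip(*pixels)]
        series.append((date, band_medians))
    return series


# Toy parcel: two acquisitions, three pixels, two bands (hypothetical values).
toy_parcel = {
    "2021-05-02": [[0.10, 0.30], [0.12, 0.28], [0.50, 0.31]],
    "2021-04-12": [[0.08, 0.20], [0.09, 0.22], [0.07, 0.21]],
}

for date, medians in parcel_median_series(toy_parcel):
    print(date, medians)
```

In this toy input, the outlier pixel value 0.50 in the later acquisition does not shift the first band's median, which is one reason a median is a common choice for aggregating pixels within a parcel.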
The dataset is based on Version 9 of EuroCrops, an open-source collection of remote sensing reference data.
For EuroCropsML, we acquired and aggregated data for the following countries:
Country | Number of distinct classes | Total number of datapoints for Sentinel-2 | Total number of datapoints for Sentinel-1 |
---|---|---|---|
Estonia | 127 | 175,906 | 176,055 |
Latvia | 103 | 431,143 | - |
Portugal | 79 | 99,634 | - |
The distribution of class labels differs substantially between the regions of Estonia, Latvia, and Portugal. This makes transferring knowledge gained in one region to another quite challenging, especially if only a few labeled data points are available. Therefore, this dataset is particularly suited for exploring transfer-learning methods for few-shot crop type classification.
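A few-shot setting like the one described above can be sketched as drawing at most k labeled examples per class from a target region. The following is a minimal, standard-library-only sketch. The function `sample_k_shot`, the `(parcel_id, label)` tuples, and the toy class names are assumptions made for illustration and are not part of the eurocropsml API.

```python
import random
from collections import defaultdict


def sample_k_shot(samples, k, seed=0):
    """Draw up to `k` examples per class from (parcel_id, label) pairs."""
    by_class = defaultdict(list)
    for parcel_id, label in samples:
        by_class[label].append(parcel_id)

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    subset = []
    for label in sorted(by_class):
        ids = by_class[label]
        # Classes with fewer than k examples contribute everything they have.
        chosen = rng.sample(ids, min(k, len(ids)))
        subset.extend((parcel_id, label) for parcel_id in chosen)
    return subset


# Toy, imbalanced target region (hypothetical labels and IDs).
toy_region = (
    [(i, "wheat") for i in range(100)]
    + [(100 + i, "olives") for i in range(5)]
    + [(200, "rye")]
)

few_shot = sample_k_shot(toy_region, k=3)
print(len(few_shot))
```

Rare classes end up with fewer than k examples, which mirrors the difficulty of imbalanced label distributions across regions.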
The data acquisition, aggregation, and pre-processing steps are schematically illustrated below. A more detailed description is given in the dataset section of our documentation.
eurocropsml is a Python package hosted on PyPI.
The recommended installation method is pip-installing into a virtual environment:
$ python -Im pip install eurocropsml
The quickest way to get started with the eurocropsml package and the EuroCropsML dataset is via the provided command-line interface (CLI).
For example, to get help on available commands and options, use
$ eurocropsml-cli --help
To show the currently used (default) configuration for the eurocropsml dataset CLI, use
$ eurocropsml-cli datasets eurocrops config
To download the EuroCropsML dataset as currently configured, use
$ eurocropsml-cli datasets eurocrops download
Alternatively, the dataset can also be manually downloaded from our Zenodo repository.
Comprehensive documentation of the CLI can be found in the CLI Reference section of our documentation.
For a complete example use-case demonstrating the ready-to-use EuroCropsML dataset in action, please refer to the project's associated official repository for benchmarking meta-learning algorithms.
The eurocropsml code repository is released under the MIT License.
Its documentation lives at Read the Docs, the code is on GitHub, and the latest release can be found on PyPI.
It is tested on Python 3.10+.
If you would like to contribute to eurocropsml, you are most welcome. We have written a short guide to help you get started.
The EuroCropsML dataset and the associated eurocropsml code repository are provided and developed as part of the joint PreTrainAppEO research project by the Chair of Remote Sensing Technology at the Technical University of Munich and dida.
The goal of the project is to investigate methods that rely on the approach of pre-training and fine-tuning machine learning models in order to improve generalizability for various standard applications in Earth observation and remote sensing.
The ready-to-use EuroCropsML dataset is developed for the purpose of improving and benchmarking few-shot crop type classification methods.
EuroCropsML is based on Version 9 of EuroCrops, an open-source collection of remote sensing reference data for agriculture from countries of the European Union.
If you use the EuroCropsML dataset or the eurocropsml code repository in your research, please cite our project as follows:
Plain text
Reuss, J., & Macdonald, J. (2024). EuroCropsML [dataset]. Zenodo. https://doi.org/10.5281/zenodo.10629610
BibTeX
@misc{reuss_macdonald_eurocropsml_2024,
author = {Reuss, Joana and Macdonald, Jan},
title = {EuroCropsML},
year = 2024,
publisher = {Zenodo},
doi = {10.5281/zenodo.10629610},
url = {https://doi.org/10.5281/zenodo.10629610}
}
The PreTrainAppEO research project is funded by the German Space Agency at DLR on behalf of the Federal Ministry for Economic Affairs and Climate Action (BMWK).