The role of spatial context and multitask learning in the detection of organic and conventional farming systems based on Sentinel-2 time series

Authors: Jan Hemmerling, Marcel Schwieder, Philippe Rufin, Leon-Friedrich Thomas, Mirela Tulbure, Patrick Hostert, Stefan Erasmi
Date: 2026-03-25
Paper ID: arxiv:2603.24552

Summary

This study proposes a Vision Transformer approach, extended from the TSViT architecture, to classify organic versus conventional farming systems using intra-annual Sentinel-2 time series. The work examines how incorporating crop type prediction via multitask learning and varying the input spatial context (patch size) influences classification performance. While the approach successfully discriminates farming systems for certain cereal crops (F1 >= 0.8), multitask learning offered limited gains, whereas increasing the spatial context consistently improved the accuracy for both classification tasks. The results confirm feasibility but highlight significant challenges in distinguishing management practices for permanent and specialty crops.
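To make the idea of varying spatial context concrete, the following is a minimal sketch of cutting centered windows of different sizes from one image chip. The summary does not say how the study extracts its input patches, so the centering strategy and the name `center_crop` are assumptions made for illustration only.

```python
def center_crop(grid, size):
    """Return the centered size x size window of a 2-D grid (list of rows).

    Hypothetical helper: the study's actual patch extraction is not described
    in this summary.
    """
    height, width = len(grid), len(grid[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds grid dimensions")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in grid[top:top + size]]


# Toy 5 x 5 chip whose values encode position (row * 5 + column).
chip = [[r * 5 + c for c in range(5)] for r in range(5)]
small_context = center_crop(chip, 1)  # only the central pixel
large_context = center_crop(chip, 3)  # central pixel plus its neighbours
```

A larger window gives the model more of the field's surroundings, which is the kind of added context the summary reports as beneficial.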

Key Contributions

Demonstrated the feasibility of discriminating between organic and conventional farming systems using intra-annual Sentinel-2 time series data.
Investigated the impact of incorporating crop type classification as a concurrent task (multitask learning) alongside farming system classification, finding only limited benefit.
Quantified the positive effect of increasing spatial context (via patch size) on the classification accuracy for both farming system and crop type discrimination.
Achieved high F1 scores (>= 0.8) for discriminating organic from conventional farming for specific crops such as winter rye, wheat, and oat, while showing poor performance for permanent grassland and specialty crops.
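One common way to train two classification tasks together is to add a weighted auxiliary loss to the main loss. The sketch below illustrates that pattern for farming system (main task) and crop type (auxiliary task). It is not taken from the paper: the weighting scheme, the parameter `crop_weight`, and the function names are assumptions.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class under predicted probabilities."""
    return -math.log(probs[target])


def multitask_loss(system_probs, system_target, crop_probs, crop_target,
                   crop_weight=0.5):
    """Main farming-system loss plus a weighted crop-type loss.

    crop_weight is a hypothetical hyperparameter; the summary does not state
    how the study balances the two tasks.
    """
    main = cross_entropy(system_probs, system_target)
    auxiliary = cross_entropy(crop_probs, crop_target)
    return main + crop_weight * auxiliary


# Toy example: uncertain about the farming system, certain about the crop.
loss = multitask_loss([0.5, 0.5], 0, [1.0, 0.0], 0, crop_weight=0.5)
```

With `crop_weight=0` the objective reduces to the single-task case, which makes it easy to compare the two setups under otherwise identical conditions.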

Limitations

Classification performance varies substantially across crop types, with poor reliability for permanent grassland, orchards, grapevines, and hops.
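Per-crop variation of this kind is typically surfaced by computing the F1 score separately for each crop type. The following is a minimal sketch of that evaluation, assuming records of the form (crop, true label, predicted label) and treating "organic" as the positive class. The record layout, the function name, and the toy data are illustrative assumptions, not results from the study.

```python
from collections import defaultdict


def f1_per_crop(records, positive="organic"):
    """Compute the F1 score of the positive class separately for each crop.

    records: iterable of (crop, true_label, predicted_label) tuples.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for crop, truth, predicted in records:
        tally = counts[crop]
        if predicted == positive and truth == positive:
            tally["tp"] += 1
        elif predicted == positive:
            tally["fp"] += 1
        elif truth == positive:
            tally["fn"] += 1
    scores = {}
    for crop, tally in counts.items():
        denominator = 2 * tally["tp"] + tally["fp"] + tally["fn"]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives.
        scores[crop] = 2 * tally["tp"] / denominator if denominator else 0.0
    return scores


# Made-up labels purely to exercise the function.
toy_records = [
    ("rye", "organic", "organic"),
    ("rye", "conventional", "conventional"),
    ("rye", "organic", "organic"),
    ("hops", "organic", "conventional"),
    ("hops", "conventional", "organic"),
]
toy_scores = f1_per_crop(toy_records)
```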

Open Questions & Future Work

Key Concepts

Temporo-Spatial Vision Transformer: A Vision Transformer architecture specifically designed to process and leverage spatio-temporal data, such as multi-temporal satellite imagery, by integrating spatial patch processing with temporal sequence modeling.
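As a rough illustration of combining spatial patch processing with temporal sequence modeling, the sketch below splits a single-band image time series into non-overlapping spatial tokens and collects each token's values over time. It only shows the data layout. The name `patch_time_series` and the parameter `token_size` are hypothetical, `token_size` here is the size of a token inside one input image rather than the input patch size varied in the study, and the actual TSViT tokenization may differ.

```python
def patch_time_series(series, token_size):
    """Group a time series of 2-D grids into per-location temporal sequences.

    series: list over T time steps of grids (H rows x W columns).
    Returns a dict mapping (token_row, token_col) to a list of T flattened
    token_size x token_size windows.
    """
    height, width = len(series[0]), len(series[0][0])
    if height % token_size or width % token_size:
        raise ValueError("grid dimensions must be divisible by token_size")
    tokens = {}
    for token_row in range(height // token_size):
        for token_col in range(width // token_size):
            rows = range(token_row * token_size, (token_row + 1) * token_size)
            cols = range(token_col * token_size, (token_col + 1) * token_size)
            # One flattened window per time step, kept in temporal order.
            tokens[(token_row, token_col)] = [
                [grid[r][c] for r in rows for c in cols] for grid in series
            ]
    return tokens


# Toy series: 3 time steps of a 4 x 4 grid, values encode (time, row, column).
toy_series = [[[t * 100 + r * 10 + c for c in range(4)] for r in range(4)]
              for t in range(3)]
toy_tokens = patch_time_series(toy_series, 2)
```

Each dictionary entry is a temporal sequence for one spatial location, the kind of unit a temporal encoder could process before or after information is exchanged across locations.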

Datasets

Sentinel-2 time series
