Abstract: Optical satellite Earth-observation (EO) time series are often obscured by clouds, resulting in sparse and temporally irregular observations. Compositing addresses these issues but is insensitive to changes in vegetation phenology, which is critical for downstream tasks. Instead, we present TESSERA, a pixel-wise foundation model for multi-modal (Sentinel-1/2) EO time series that learns robust, label-efficient embeddings. During training, TESSERA uses Barlow Twins to enforce invariance to the choice of cloud-free observations randomly sampled from the time series, so that the generated embeddings interpolate missing observations. We employ two key regularizers: global shuffling to decorrelate spatial neighborhoods, and mix-based regularization to improve invariance under extreme sparsity. We find that for diverse classification, segmentation, and regression tasks, TESSERA embeddings deliver state-of-the-art accuracy with high label efficiency, often requiring only a tiny task head and minimal computation. To democratize access, adhere to FAIR principles, and simplify use, we release global, annual, 10 m, pixel-wise int8 embeddings together with open weights/code and lightweight adaptation heads, thus providing practical tooling for retrieval and inference at planetary scale.
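For readers unfamiliar with Barlow Twins, the following is a minimal sketch of the general objective, not TESSERA's actual implementation. It assumes two embedding batches produced from two different random subsets of cloud-free observations of the same pixels. The function names (`sample_view`, `standardize`, `barlow_twins_loss`) and the weight `lam` are illustrative assumptions, and pure Python is used only to keep the example dependency-free.

```python
import math
import random


def sample_view(series, cloud_free, k, rng):
    """Pick k cloud-free time steps at random, keeping temporal order.

    Hypothetical helper: `series` is a list of per-date observations and
    `cloud_free` is a parallel list of booleans.
    """
    valid = [t for t, ok in enumerate(cloud_free) if ok]
    chosen = sorted(rng.sample(valid, k))
    return [series[t] for t in chosen]


def standardize(z):
    """Zero-mean, unit-variance per embedding dimension across the batch."""
    n, d = len(z), len(z[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in z]
        mean = sum(col) / n
        # Fall back to 1.0 for a constant column to avoid division by zero.
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        for i in range(n):
            out[i][j] = (col[i] - mean) / std
    return out


def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Barlow Twins loss for two (batch, dim) embedding lists.

    The cross-correlation matrix C between the two views is pushed toward
    the identity: diagonal terms enforce invariance between views, and
    off-diagonal terms (weighted by the assumed `lam`) reduce redundancy.
    """
    n, d = len(z_a), len(z_a[0])
    a, b = standardize(z_a), standardize(z_b)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c_ij = sum(a[k][i] * b[k][j] for k in range(n)) / n
            loss += (1.0 - c_ij) ** 2 if i == j else lam * c_ij ** 2
    return loss
```

In a training loop, each pixel's time series would be sampled twice with `sample_view`, both samples would be passed through the same encoder, and the resulting embeddings compared with `barlow_twins_loss`. The loss is zero when the two views yield perfectly correlated, mutually decorrelated embedding dimensions.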
Bio: Frank Feng is a second-year Ph.D. student in the Department of Computer Science and Technology at the University of Cambridge. His research interests lie at the intersection of machine learning and Earth sciences, with a particular focus on developing self-supervised learning methods for remote sensing.




