Challenges in Practice: Building a Usable Library for Planetary-Scale Embeddings (Video, PROPL 2025) Sadiq Jaffer, Frank Feng, Robin Young, Srinivasan Keshav, and Anil Madhavapeddy (all University of Cambridge, UK)
Abstract: Remote sensing observations from satellites are critical for scientists to understand how our world is changing in the face of climate change, biodiversity loss, and desertification. However, working directly with this data is difficult. For any given satellite constellation, there is a multitude of processed products, the data volume is considerable, and, for optical imagery, users must contend with data sparsity due to cloud cover. This complexity creates a significant barrier for domain experts who are not remote sensing specialists. Pre-trained, self-supervised foundation models such as TESSERA (https://arxiv.org/abs/2506.20380) aim to solve this by offering pre-computed global embeddings. These rich embeddings can be used in place of raw remote sensing data in a powerful “embeddings-as-data” approach. For example, a single 128-dimensional TESSERA embedding for a 10-meter point on Earth can substitute for an entire year of optical and radar imagery, representing that location's temporal and spectral characteristics. While this could democratise access to advanced remote sensing-derived analytics, it also creates a new programming challenge: a lack of tools designed for this new approach. In this talk we will focus on our lessons learnt from the development of geotessera (https://github.com/ucam-eo/geotessera), a library designed for this new embeddings-as-data approach. We will explore key design decisions that focus on both a high-level API for accessibility and tight integration with the existing scientific Python ecosystem. The core user workflow will be demonstrated, showing how our library enables a rapid classification task on this new data paradigm. By presenting this work as a case study, we aim to highlight the critical need for new programming systems research for high-dimensional geospatial embeddings and help build a stronger, more effective bridge between the programming and climate science communities.
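As a rough illustration of the embeddings-as-data idea in the abstract, the sketch below classifies 128-dimensional vectors with a nearest-centroid rule. It is only a minimal sketch under stated assumptions: the vectors are synthetic stand-ins generated at random, not real TESSERA embeddings; the class names "forest" and "water" and every function name are hypothetical; and nothing here uses or reflects the geotessera API. Plain Python lists stand in for the arrays a scientific Python workflow would normally use.

```python
import random

EMBEDDING_DIM = 128  # the abstract describes 128-dimensional embeddings


def synthetic_embeddings(center, n, rng, noise=0.1):
    """Return n synthetic vectors scattered around a class center (stand-in data)."""
    return [[c + rng.gauss(0.0, noise) for c in center] for _ in range(n)]


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(labelled):
    """Map each class label to the centroid of its training embeddings."""
    return {label: centroid(vectors) for label, vectors in labelled.items()}


def predict(centroids, embedding):
    """Return the label whose centroid is closest to the embedding."""
    return min(centroids, key=lambda label: squared_distance(centroids[label], embedding))


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical land-cover classes, each with a random center in embedding space.
centers = {
    label: [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
    for label in ("forest", "water")
}

# A handful of labelled "pixels" per class plays the role of training data.
train = {label: synthetic_embeddings(c, 20, rng) for label, c in centers.items()}
model = fit(train)

# Held-out synthetic pixels drawn from the same class centers.
held_out = {label: synthetic_embeddings(c, 5, rng) for label, c in centers.items()}
predictions = {
    label: [predict(model, e) for e in vectors] for label, vectors in held_out.items()
}

correct = sum(pred == label for label, preds in predictions.items() for pred in preds)
total = sum(len(preds) for preds in predictions.values())
accuracy = correct / total
print(f"accuracy on held-out synthetic pixels: {accuracy:.2f}")
```

The point of the sketch is the shape of the workflow: once each location is a fixed-length vector, a small set of labelled points and a simple classifier can stand in for a pipeline over raw imagery. The synthetic classes here are deliberately well separated, so the result says nothing about accuracy on real data.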
This talk also combines elements of another talk: The Earth generates hundreds of petabytes of satellite data annually, creating unprecedented opportunities to monitor planetary-scale environmental changes. Geospatial foundation models based on satellite data are now demonstrating capabilities from wildfire mapping to biodiversity monitoring that could transform climate adaptation and policy. However, these powerful models remain largely inaccessible or impractical for domain experts. Ecologists, urban planners, disaster managers, and environmental scientists typically lack the machine learning expertise required to leverage foundation model capabilities. This accessibility gap represents a barrier to addressing planetary-scale challenges. We analyze the systematic user experience failures that prevent domain expert adoption and outline a research agenda for making geospatial AI models more accessible to practitioners.
Presentation at the PROPL 2025 workshop (https://conf.researchr.org/home/icfp-splash-2025/propl-2025), Oct 13, 2025. Sponsored by ACM SIGPLAN.




