The THEIA land data centre was created more than ten years ago. THEIA is an alliance of ten French public bodies that federates public research efforts for monitoring continental surfaces. It is part of the Data Terra research infrastructure, which brings together the various environmental data centres (ODATIS, THEIA, FormaTerre, AERIS).
CNES's contribution to THEIA aims to facilitate access to space data for monitoring continental surfaces. CNES contributes through the activities of its research laboratories, the development of new products, the development of processing centres and chains, and the provision of data distribution platforms.
Over the last two years, CNES has completely overhauled the means of producing and accessing data for THEIA users. These changes may have disrupted habits, but they will enable us to make great progress. So it's time to learn about the new infrastructure.
New data and computing facilities
CNES relies heavily on the TREX computing centre, with over 16,000 computing cores, and on a large data space, the datalake, with 35 petabytes on disk and as many on tape. This is where data from CNES satellites (VENµS, SWOT and soon TRISHNA) is stored, as well as data from the European Union's Copernicus observation system, for which CNES provides images from Sentinel-1 and Sentinel-2, and altimetry data from Sentinel-3 and Sentinel-6, for the whole world. It is also in this datalake that we store the data produced by CNES for THEIA and FormaTerre.

Available and forecasted data volumes for Copernicus and Theia products within the datalake
Volume in PiB | 2023 | 2024 | 2025 | 2026 |
Copernicus (Sentinel-1, 2, 3 SRAL & 6) | 24 | 26 | 36 | 41 |
THEIA | 1 | 1.2 | 6 | 9 |
FormaTerre | 0.4 | 0.9 | 1 | 1.2 |
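To make the growth in the table above easier to read, here is a minimal Python sketch that sums the three rows per year and computes a growth factor for one row. The figures are copied from the table; the names `VOLUMES_PIB`, `total_per_year` and `growth_factor` are illustrative only and do not come from any CNES tool.

```python
# Volumes in PiB, copied from the table above (available and forecasted).
VOLUMES_PIB = {
    "Copernicus": {2023: 24, 2024: 26, 2025: 36, 2026: 41},
    "THEIA": {2023: 1, 2024: 1.2, 2025: 6, 2026: 9},
    "FormaTerre": {2023: 0.4, 2024: 0.9, 2025: 1, 2026: 1.2},
}


def total_per_year(volumes):
    """Sum all rows of the table for each year, rounded to 0.1 PiB."""
    years = sorted({year for row in volumes.values() for year in row})
    return {
        year: round(sum(row[year] for row in volumes.values()), 1)
        for year in years
    }


def growth_factor(volumes, source, start, end):
    """Ratio between the volume of one row at two different years."""
    return round(volumes[source][end] / volumes[source][start], 1)


if __name__ == "__main__":
    for year, total in total_per_year(VOLUMES_PIB).items():
        print(f"{year}: {total} PiB")
    print("THEIA 2023 -> 2026: x", growth_factor(VOLUMES_PIB, "THEIA", 2023, 2026))
```

On these figures, the THEIA row grows much faster than the Copernicus row, which is consistent with the increased production capacity described in the rest of the article.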
New production centres
Until now, CNES's contribution has been based on the MUSCATE production centre, designed ten years ago, which had become too cumbersome to operate as demand increased. A new processing centre named Hespérides (the data garden) has been deployed on TREX to improve our production capacity without increasing operating costs: it is at Hespérides that datasets for continental hydrology flow and products dedicated to monitoring vegetation grow.
This centre uses an orchestrator named Chronos to distribute processing to the processors as the data arrives. The processing chains are encapsulated in Docker containers, which include all the libraries required for processing. Hespérides also has an effective production supervision tool. For performance reasons, we decided to split Hespérides into several production instances. One is dedicated to routine production (processing the data as soon as it arrives), another to massive reprocessing (for example, reprocessing all the Sentinel-2 data over Europe since 2015). Another could be reserved for experts who want to test new versions of the processors or parameters.
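The split into production instances can be pictured with a small Python sketch. This is not the Chronos orchestrator or its API: the `Instance`, `Job` and `Dispatcher` names and the product identifiers are all hypothetical, and the sketch only illustrates the idea of keeping one queue per instance so that a large reprocessing backlog never delays freshly arrived data.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class Instance(Enum):
    """Hypothetical labels for the production instances described above."""
    ROUTINE = "routine"            # data processed as soon as it arrives
    REPROCESSING = "reprocessing"  # massive reprocessing campaigns
    EXPERT = "expert"              # tests of new processor versions or parameters


@dataclass(frozen=True)
class Job:
    product_id: str   # identifier of the input product (illustrative)
    chain: str        # processing chain to run, e.g. "MAJA"
    instance: Instance


@dataclass
class Dispatcher:
    """Keeps one FIFO queue per instance (an assumption, not Chronos itself)."""
    queues: dict = field(
        default_factory=lambda: {instance: deque() for instance in Instance}
    )

    def on_data_arrival(self, product_id, chain):
        """Newly arrived data always goes to the routine instance."""
        job = Job(product_id, chain, Instance.ROUTINE)
        self.queues[Instance.ROUTINE].append(job)
        return job

    def submit_campaign(self, product_ids, chain, instance=Instance.REPROCESSING):
        """Queue a batch of archived products on a dedicated instance."""
        jobs = [Job(product_id, chain, instance) for product_id in product_ids]
        self.queues[instance].extend(jobs)
        return jobs

    def next_job(self, instance):
        """Return the oldest pending job of an instance, or None if it is idle."""
        queue = self.queues[instance]
        return queue.popleft() if queue else None


if __name__ == "__main__":
    dispatcher = Dispatcher()
    dispatcher.submit_campaign(["archive_001", "archive_002"], "MAJA")
    dispatcher.on_data_arrival("fresh_001", "MAJA")
    # The routine instance serves the fresh product without waiting
    # for the reprocessing backlog.
    print(dispatcher.next_job(Instance.ROUTINE))
```

Because each instance only reads its own queue, the fresh product is served immediately even though the reprocessing campaign was submitted first.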
New distribution platforms
Finally, new platforms for accessing Earth observation data have been developed. The main one is GEODES, which provides access to all the Earth observation data produced or available at CNES. It offers a data catalogue, a consultation and download interface, on-demand processing, and a Python download API named PyGeodes, which comes with a command-line tool.
The second platform, Hydroweb-next, distributes and presents data relating to continental hydrology, and in particular provides access to SWOT data.

New processors and new products
The Hespérides production centre responds to requests from French scientists, who can submit applications to THEIA's permanent call for projects. Thanks to the new hardware and software developed and operated by CNES for THEIA and Data Terra, our processing capacities have greatly increased, and it has become easier to put new chains into production, even if this requires work. The products currently in production in THEIA are presented in the following table (follow the links to discover the processing chains). Some examples of these products are provided at the end of the article.
Product | Satellite | Processor | Geographic area |
Surface reflectance | Sentinel-2 & VENµS | MAJA | Europe, Maghreb, Sahel, India... |
Monthly syntheses of surface reflectance | Sentinel-2 & VENµS | WASP | Europe, Maghreb, Sahel, India... |
Water surfaces | Sentinel-1 & Sentinel-2 | SurfWater | Europe, Amazonia, West Africa |
Snow cover | Sentinel-2 | LIS | European and Indian mountains |
Land cover (22 classes) | Sentinel-2 | Iota2 | France |
We are also preparing new products, which should appear in the coming weeks or months.
Later on, on-demand processing facilities will be offered, including the Sentinel-2 super-resolution tool. They will be available from GEODES.
Finally, thanks to our new production capacity, CNES has begun reprocessing all Sentinel-2 Level 2A data, starting with the European zone, following the reprocessing of these data by ESA to improve their geometric registration. This reprocessing will also take into account all the improvements made to the MAJA chain over the years (with better cloud mask resolution, advances in atmospheric corrections and bug fixes).

Role of the data campus
The CNES Data Campus plays an important role in this ecosystem. It manages the Hespérides processing centre, as well as the GEODES and Hydroweb-next distribution servers, in liaison with the Orbital Systems and Applications Directorate (DOA), which manages the projects, and with the CNES Strategy Directorate (DS).
The Data Campus also manages the operations of the various production and distribution centres, thanks to funding from the GEODES and THEIA projects. In several cases, the campus develops and maintains the processing chains in operational configuration (with the support of the Processing Performance Instruments (TPI) sub-directorate). Finally, the laboratories associated with the data campus (CESBIO, LEGOS, GET) make a major contribution to the definition of several processing chains (MAJA, WASP, LIS, OBS2CO, etc.).
All this work is supported by research laboratories and IT services and development companies in the space sector. Last but not least, nothing would be possible without the CNES computing centre.
Authors and contributions
Olivier Hagolle, Bernard Specht, Johan Aussenac, Isabelle Soleilhavoup
Example of a Sentinel-2 product over Madagascar, expressed in surface reflectance, with the outlines of detected clouds (in green) and shadows (in yellow) superimposed © CESBIO
Monthly synthesis of cloud-free surface reflectance observations in July 2017, 2018, 2019 and 2020, from Sentinel-2 data © CESBIO
Probability of water occurrence over the Arcachon basin (100% in purple, 0% in white).