

The larger the forest, the more complex its inventory. The Bavarian State Forest is surveyed and statistically recorded every ten years. To do this, 21 foresters travel to 8,000 pre-defined locations, where they collect information on around 100,000 trees – assessing species, size, condition, and the surrounding vegetation. This process is time-consuming and costly. In the future, forests could be surveyed more regularly, and damage could be detected more quickly – from above, using digital tools. “So far, no one in Bavaria or Germany has managed to conduct a timely, large-scale assessment of forest disturbances using satellite data,” says Nikolas Herbst, academic advisor at the Chair of Software Engineering at the Julius Maximilian University of Würzburg. He is part of the interdisciplinary project Real-Time Earth Observation of Forest Dynamics and Biodiversity (ROOT), which also involves the German Remote Sensing Data Center (Deutsches Fernerkundungsdatenzentrum, DFD) and uses terrabyte, the high-performance platform for remote sensing data analysis at the Leibniz Supercomputing Centre (LRZ). For ROOT, Herbst and his colleagues analyze multispectral data from the Sentinel and Landsat satellites and combine them with information from the German Meteorological Service (Deutscher Wetterdienst, DWD), insect traps, and other sources. Approximately every two weeks, enough new satellite images become available to update the forest data. The goal is an automated, digital information service for Bavarian forests that enables continuous monitoring.
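To give a sense of what such an analysis involves: a common first step with multispectral imagery is to compute a vegetation index for every pixel. The following is a minimal sketch, assuming Sentinel-2 red (B04) and near-infrared (B08) bands stored as GeoTIFFs; the file names and the choice of NDVI are illustrative and not a description of the ROOT pipeline.

```python
# Minimal sketch: NDVI from a Sentinel-2 red (B04) and near-infrared (B08) band.
# File names are illustrative assumptions, not part of the ROOT pipeline.
import numpy as np
import rasterio

def ndvi(red_path: str, nir_path: str) -> np.ndarray:
    """Compute the normalized difference vegetation index for one scene."""
    with rasterio.open(red_path) as red_src, rasterio.open(nir_path) as nir_src:
        red = red_src.read(1).astype("float32")
        nir = nir_src.read(1).astype("float32")
    # Suppress warnings over water/no-data pixels where nir + red == 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        index = (nir - red) / (nir + red)
    return np.where(np.isfinite(index), index, np.nan)

# Healthy canopy typically shows NDVI well above 0.6 in summer; a sharp
# drop between acquisitions can hint at damage worth a closer look.
index = ndvi("S2_B04_red.tif", "S2_B08_nir.tif")
print("mean NDVI:", np.nanmean(index))
```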
The forest – specifically the Bavarian Forest, its neighbor Šumava (Bohemian Forest), and the university's own research forest – is also a central focus at Ludwig Maximilian University (LMU) Munich. For the Regulus research projects AI-Klima and Labforest, a research group led by Professor Lukas Lehnert combines Light Detection and Ranging (LiDAR) data from aircraft with satellite data and cameras installed in the forest. They use this to analyze the condition and development of the forest, identify strategies for regeneration, and develop general statistical methods for evaluating remote sensing data – for example, to determine biomass or to understand how forests influence a region’s hydrology. “It’s about creating new methods for processing and analyzing satellite data, which in turn form the basis for AI-based or mathematical-physical models,” Lehnert explains. Like the ROOT team, his group also works with terrabyte: “The platform allows us to use our own datasets, evaluation routines, and algorithms in various programming languages,” Lehnert says. “The results can be easily downloaded or further processed using our own tools.”
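Airborne LiDAR delivers a 3D point cloud of laser returns from canopy and ground; one standard derived product is a canopy height model. Below is a minimal sketch, assuming a laspy-readable .las file; the file name and the 1 m grid size are illustrative assumptions, not the LMU group's actual workflow.

```python
# Minimal sketch: a simple canopy height model from an airborne LiDAR point
# cloud. The input file and grid resolution are hypothetical.
import numpy as np
import laspy

las = laspy.read("flight_strip.las")           # hypothetical input file
x, y, z = np.asarray(las.x), np.asarray(las.y), np.asarray(las.z)

cell = 1.0                                     # grid resolution in metres
col = ((x - x.min()) / cell).astype(int)
row = ((y - y.min()) / cell).astype(int)
shape = (row.max() + 1, col.max() + 1)

# Highest return per cell approximates the canopy surface; the lowest
# return serves as a crude ground model.
top = np.full(shape, -np.inf)
ground = np.full(shape, np.inf)
np.maximum.at(top, (row, col), z)
np.minimum.at(ground, (row, col), z)

canopy_height = np.where(np.isfinite(top) & np.isfinite(ground),
                         top - ground, np.nan)
print("max canopy height [m]:", np.nanmax(canopy_height))
```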
terrabyte is a joint project of the German Aerospace Center (DLR) and the LRZ. Conceptualized in 2019 and subsequently installed at the LRZ, the platform became operational in 2023. “The LRZ hosts and operates the infrastructure, as well as core services like authentication, job scheduling, and the web-based computing portal. The DLR is responsible for data management, specialized web services for the analysis and processing of Earth observation data, and user support,” explains Maximilian Schwinger, describing the division of responsibilities. Schwinger, a computer scientist at DLR, co-develops and maintains terrabyte alongside Dr. Jonas Eberle: “The platform is a hybrid between cloud and high-performance computing.” terrabyte combines a supercomputing cluster with 44,000 virtual CPU cores and 188 NVIDIA GPUs with storage holding around 50 petabytes of data and imagery from various remote sensing missions and satellites (Copernicus, Landsat, MODIS, Sentinel, VIIRS). A high-speed data connection between Oberpfaffenhofen and Garching also allows researchers to access the extensive DLR archive. The platform is used by DLR staff and is available upon request to the LRZ user community. “An evaluation has shown that terrabyte follows the right technical approach,” Eberle notes. “The platform’s setup is not overly complex but tailored to the needs of scientific research. This enables us to operate more independently from U.S.-based services.”
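In practice, the hybrid model means users stage their data and submit batch jobs to the cluster. The following is a minimal sketch of that pattern from Python, assuming a SLURM-style scheduler of the kind common on LRZ systems; the partition name, resource requests, and paths are illustrative, not terrabyte's actual configuration.

```python
# Minimal sketch: submit a processing job to the HPC side of a hybrid
# platform. Assumes a SLURM-style scheduler; partition, paths, and
# resources are hypothetical placeholders.
import subprocess

job_script = """#!/bin/bash
#SBATCH --job-name=ndvi-timeseries
#SBATCH --partition=gpu        # hypothetical partition name
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=8
#SBATCH --time=02:00:00
python process_scenes.py --input /data/sentinel2 --year 2024
"""

# sbatch accepts a script on stdin and prints the job id on success.
result = subprocess.run(["sbatch"], input=job_script, text=True,
                        capture_output=True, check=True)
print(result.stdout.strip())   # e.g. "Submitted batch job 123456"
```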
On closer inspection, terrabyte offers an alternative to platforms provided by hyperscalers such as Amazon or Google. These commercial platforms also provide tools for analyzing image and other data, and they allow users to develop or implement their own code and algorithms. However: “We worked with Google Earth Engine,” reports Lehnert. “It was quite cumbersome, the capabilities were limited, and in the end, it was difficult to download the results.” To retain customers, digital corporations often rely on so-called vendor lock-in, designing their services and technologies in such a way that switching to alternatives becomes complicated. To make science more independent of such services, the terrabyte project was launched six years ago – a forward-looking decision: “There is a lot happening politically right now,” observes DLR’s Schwinger. “We don’t know which data and platforms in the U.S. will be available to Europeans in the future.” Staffing and funding cuts at agencies and institutions like the National Oceanic and Atmospheric Administration (NOAA) – which oversees three satellite programs and has long made its weather and climate data available worldwide – are significantly scaling back programs and disrupting the flow of information. Air quality data once collected by U.S. embassies worldwide is no longer available or may require payment in the future. On top of that, there are risks from emerging trade disputes: if Europe imposes tariffs on digital services, research using hyperscaler platforms could become more expensive. At the same time, high-tech corporations are resisting full compliance with European regulations on digitization, including data protection, content moderation, and the AI Act. “Research depends on ever-growing datasets, and AI applications further increase data volume,” says LMU Professor Lehnert. “If we want to stay competitive internationally, we have no choice but to build more platforms like terrabyte.” Currently, terrabyte is unique in Germany, but a comparable system is being developed in the state of Hesse. Similar platforms are also being built across Europe, with efforts underway to interconnect them. “terrabyte,” concludes software specialist and reviewer Nikolas Herbst, “makes a huge contribution to environmental and climate research. Creating added value from it for organizations, authorities, and companies is the next major goal of the ROOT project.”
Since 2023, the ROOT project has developed modular workloads for processing image data from a wide range of sources. Using terrabyte, the team initially processed long-term data from the DLR archive, generating time series reaching back to the drought years from 2018 onward and thus laying the foundation for the continuous evaluation of current satellite data. In addition, an app (picture above) was developed that displays forest damage in Bavaria on clearly structured maps. “We want to be able to respond quickly to symptoms like snow breakage, drought, wind damage, fires, or insect infestations,” explains Herbst. “While state forests are systematically monitored, many privately owned forests cannot be managed as closely. This makes it harder to detect damage in a timely manner and delays necessary intervention.” As a result, diseases can spread more rapidly.
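Such time series make simple per-pixel change detection possible: compare the newest observation against each pixel's own history and flag unusually sharp drops. The sketch below uses synthetic data and a 3-sigma rule purely for illustration; it is not ROOT's actual change-detection method.

```python
# Minimal sketch of a per-pixel disturbance flag on a vegetation-index
# time series. Data and threshold are synthetic/illustrative.
import numpy as np

rng = np.random.default_rng(0)
# history: (time, rows, cols) stack of cloud-free index composites since 2018
history = rng.normal(loc=0.75, scale=0.03, size=(120, 100, 100))
latest = history[-1].copy()
latest[40:60, 40:60] -= 0.25          # inject a synthetic damage patch

baseline = history[:-1].mean(axis=0)  # each pixel's own long-term mean
spread = history[:-1].std(axis=0)     # and its natural variability

# Flag pixels whose index dropped far below their historical range.
disturbed = latest < baseline - 3 * spread
print("flagged pixels:", int(disturbed.sum()))
```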
The ROOT app, currently in its pilot phase and updated every 14 days, is aimed at forest owners. A web-based desktop version is also in development and has already attracted the interest of the Bavarian Forest National Park administration – a project partner. terrabyte has become an indispensable tool, and as more satellite and environmental data are analyzed, more and more information about the forest becomes digitally accessible. Lehnert’s team is studying how the forest changed between 2000 and 2023 under the influence of climate change, bark beetles, and other factors. As part of the AI-Klima project, researchers are examining high-resolution, distortion-free, georeferenced aerial images of the Bavarian Forest and Šumava National Park, in which individual trees are visible, and comparing them with multispectral data from the Landsat missions. For the Bavarian Forest alone, this amounts to about 1.4 terabytes of data per year.
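Comparing centimeter-scale aerial images with 30 m Landsat pixels first requires bringing both onto a common grid. Here is a minimal sketch of that resampling step, assuming both inputs are georeferenced GeoTIFFs; the file names are illustrative.

```python
# Minimal sketch: aggregate a high-resolution orthophoto onto the coarser
# Landsat grid so the two rasters can be compared pixel by pixel.
# File names are hypothetical.
import numpy as np
import rasterio
from rasterio.warp import reproject, Resampling

with rasterio.open("orthophoto.tif") as fine, \
     rasterio.open("landsat_b4.tif") as coarse:
    resampled = np.zeros((coarse.height, coarse.width), dtype="float32")
    reproject(
        source=fine.read(1).astype("float32"),
        destination=resampled,
        src_transform=fine.transform, src_crs=fine.crs,
        dst_transform=coarse.transform, dst_crs=coarse.crs,
        # 'average' pools the many fine pixels that fall into one coarse cell
        resampling=Resampling.average,
    )
    # 'resampled' now shares the Landsat grid and can be differenced
    # directly against coarse.read(1).
```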
To handle such volumes, terrabyte is equipped with 335 terabytes of RAM in addition to its compute capacity. A further petabyte of data can be stored in the cloud, along with access to 3,000 more virtual CPUs. In combination with InfiniBand interconnects for data transfer, this setup enables computation, caching, and further evaluation of the data. “Whether a spruce is at risk from bark beetles is something we can only tell once the tree is already infested,” Lehnert explains. “So we can currently only quantify the resulting damage.” His team is working on extracting even more information from the image data – potentially even identifying leaf composition and biomass, which could, in turn, indicate tree health. ROOT and the studies conducted by Lehnert’s team – initial papers will be presented in late April at EGU25 in Vienna, the General Assembly of the European Geosciences Union – are also contributing to the optimization of Earth observation and data analysis. A new interdisciplinary project, Scientific Computing for Earth Observation and Sustainability (SOS), has just been launched. “We are continuing to develop technologies for platforms like terrabyte, as well as general tools and programs for Earth observation,” says Herbst, describing the team’s new tasks. “In the future, you shouldn’t need to be a computer scientist to work with high-performance platforms for data analysis.”
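One way to extract more from imagery is to learn a statistical mapping from per-pixel spectral features to a quantity of interest such as biomass. The sketch below trains a random-forest regressor on synthetic data; both the data and the model choice are illustrative assumptions, not the team's method.

```python
# Minimal sketch: learn a mapping from spectral bands to a biomass proxy.
# The data are synthetic and the model choice is illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
bands = rng.uniform(0.0, 0.5, size=(5000, 6))        # six spectral bands
# Synthetic "ground truth": biomass driven mainly by the NIR band.
biomass = 200 * bands[:, 3] - 80 * bands[:, 2] + rng.normal(0, 5, 5000)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(bands[:4000], biomass[:4000])
print("R^2 on held-out pixels:",
      round(model.score(bands[4000:], biomass[4000:]), 3))
```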
Meanwhile, Lehnert and his colleagues are developing methods for refining and enriching satellite data with information from other sources to advance the geosciences and climate research. At DLR, there are also ongoing efforts to improve how researchers use terrabyte. “A platform like this isn’t static – it evolves continuously,” say Schwinger and Eberle. “In the coming months, we’re preparing features that will allow terrabyte users to share datasets, algorithms, and AI models with each other.” Additionally, terrabyte is now operating at full capacity on some days, making wait-time management and the allocation of resources increasingly important. There are also considerations about which current satellite data might be offloaded from the platform to make room for information from more Earth observation programs. (vs)
Photo Credit: © DLR | Sentinel-1