Lifting a treasure trove of data for environmental sciences

The terrabyte HPDA platform at LRZ's compute cube.

terrabyte is the innovative high-performance data platform of the German Aerospace Center (DLR) and the Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities. The platform makes Earth observation data accessible for research and offers practical tools for analytics.

  • terrabyte connects the archive for satellite data of the German Aerospace Center (DLR) in Oberpfaffenhofen with intelligently managed online storage of around 50 petabytes and the supercomputers of the LRZ in Garching via a 10 gigabit/s connection.
  • terrabyte offers a viable alternative to commercial data clouds and meets security and data protection requirements.
  • The data accessible via the terrabyte platform is to be used widely in the future; in addition to DLR, universities in Munich and Bavaria will soon have access.

Extreme weather, droughts, melting glaciers, coastal erosion, even the development of cities over time: eight Copernicus Sentinel satellites, the US Landsat satellites and the radar satellites of the German Aerospace Center currently send approximately 19 terabytes of data per day about the state of Planet Earth.

To explore this gigantic treasure trove of historical and current Earth observation data, DLR's large data sets can now be quickly transferred to the terrabyte High Performance Data Analytics (HPDA) platform at the LRZ. The system connects DLR's satellite data archive in Oberpfaffenhofen with new, intelligently managed online storage of around 50 petabytes and the supercomputers of the scientific computing centre in Garching via a 10 gigabit/s line.
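To put the quoted figures in relation, the following back-of-the-envelope sketch estimates how long the daily satellite volume would occupy the line between Oberpfaffenhofen and Garching. It assumes decimal units (1 terabyte = 10^12 bytes), a fully utilised link and no protocol overhead; the function names are illustrative and not part of terrabyte.

```python
def transfer_hours(volume_terabytes: float, link_gigabits_per_s: float) -> float:
    """Hours needed to move a data volume over a link running at full line rate."""
    volume_bits = volume_terabytes * 1e12 * 8      # terabytes -> bits (decimal units)
    link_bits_per_s = link_gigabits_per_s * 1e9    # gigabit/s -> bit/s
    return volume_bits / link_bits_per_s / 3600    # seconds -> hours


def daily_capacity_terabytes(link_gigabits_per_s: float) -> float:
    """Upper bound on the volume a link can carry in 24 hours, in terabytes."""
    bytes_per_s = link_gigabits_per_s * 1e9 / 8
    return bytes_per_s * 86400 / 1e12


if __name__ == "__main__":
    # Figures from the article: approx. 19 terabytes per day, 10 gigabit/s line.
    print(f"Daily volume needs about {transfer_hours(19, 10):.1f} hours of line time")
    print(f"Theoretical daily capacity: {daily_capacity_terabytes(10):.0f} terabytes")
```

Under these idealised assumptions the daily intake would take a little over four hours of line time, which leaves headroom for transferring historical data from the archive as well.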

Analysis of Earth observation data

"The terrabyte concept allows our scientists to evaluate huge amounts of data highly efficiently without running their algorithms in less protected environments," says Stefan Dech, Director of the German Remote Sensing Data Center (DFD) at DLR. "The data that satellites provide on urbanisation or the melting of glaciers and polar ice caps, for example, can be processed immediately in the future. This is a milestone for environmental research and remote sensing of the Earth. We expect an enormous leap in knowledge." Environmental protection, society and the economy should benefit from this. For DLR, terrabyte also offers an alternative to the data clouds of commercial providers, because the platform meets all security and data protection requirements.

The core of the terrabyte platform is made up of 10 racks packed with ThinkSystem SR630 servers and variously sized DSS-G storage systems from Lenovo. Together they offer 49 petabytes of storage. The data is organised by IBM's Spectrum Scale file system, and the InfiniBand network ensures extremely fast data transfers between storage and compute capacities. "Internally, we transfer the data at 300 gigabytes per second, which opens up new possibilities for their processing," says Dieter Kranzlmüller, Director of the LRZ. "The collaboration with DLR is a challenge, which we gladly accept. terrabyte is not only about very large compute capacities, but above all about processing mass data. The platform also shows the growing importance of storage volumes for research. In more and more scientific fields, data should be easily accessible and ideally processed on site." To be able to check research results or process them further, open access is increasingly in demand. terrabyte is the technical implementation; the high-performance analytics platform therefore also serves the LRZ as a model for further storage and compute offerings in other research areas.
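The case for processing data on site can be illustrated with a rough calculation based on the figures quoted above. The sketch below compares how long one full pass over 49 petabytes would take at the internal rate of 300 gigabytes per second and at the 10 gigabit/s rate of the external line. It assumes decimal units and sustained peak rates, which real workloads will not reach; the function name is illustrative only.

```python
def full_pass_seconds(volume_petabytes: float, rate_gigabytes_per_s: float) -> float:
    """Seconds needed to read a data volume once at a sustained rate."""
    return volume_petabytes * 1e15 / (rate_gigabytes_per_s * 1e9)


if __name__ == "__main__":
    internal = full_pass_seconds(49, 300)       # internal transfer rate
    external = full_pass_seconds(49, 10 / 8)    # 10 gigabit/s = 1.25 gigabytes/s
    print(f"Internal, 300 GB/s: about {internal / 3600:.0f} hours")
    print(f"External, 10 Gbit/s: about {external / 86400:.0f} days")
```

At these idealised rates, a pass that takes about two days inside the platform would take more than a year over the external line, which is why the compute capacity sits next to the storage.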

Understanding the environment better

Xiaoxiang Zhu, who holds a chair at the Technical University of Munich and heads a department at DLR's Earth Observation Center (EOC), has been working with satellite data for years. The engineer has developed a wide variety of algorithms to depict megacities three-dimensionally and with the highest accuracy. Today, her models can be used to optimise spatial and urban planning or disaster control. Easily accessible Earth observation data also advance environmental and climate research, simplify the construction of mobile phone and IT networks, and provide evidence for the calculation of subsidies.

Another EOC team, led by Thomas Esch, uses this data to create the "World Settlement Footprint" (WSF), which serves as a comparison and monitoring instrument for urbanisation. For this purpose, information on the extent, structure and development of settlement areas as well as on population density and distribution is evaluated automatically. The WSF provides valuable information for science, politics and business, enabling them to react to the impoverishment of city districts, weather changes or the loss of biodiversity.

The development of cities over three decades: Bangkok. Picture courtesy of DLR / EOC