Coupled Storage System for Efficient Management of Self-Describing Data Formats (CoSEMoS)
The goal of the project is to explore the benefits of a coupled storage system for self-describing data formats. It will introduce a novel hybrid approach that combines storage technologies from high-performance computing and database systems, using each technology according to its respective strengths and weaknesses. Because the storage system is tightly coupled with self-describing data formats, it can use structural information to select appropriate storage technologies and tiers. Such information is currently not available to storage systems, so they have to employ heuristics, which often lead to suboptimal performance as well as unnecessary and expensive data movements. Moreover, the storage system will support adaptable I/O semantics so that its performance can be tuned to application and data format requirements. Together, these features will enable completely new data management methods and provide significant performance improvements. Existing workflows of scientific users will be supported through a dedicated data analysis interface. All changes will be thoroughly tested to ensure backwards compatibility with existing applications and interfaces. Consequently, no modifications will be necessary to run applications on top of CoSEMoS, which helps preserve past investments in scientific software development.
More information can be found in the Problem Statement.
This project is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project number 417705296.
Prof. Dr. Thomas Ludwig (director of the German Climate Computing Center and professor at Universität Hamburg)
Johann Lombardi (principal architect in the High Performance Data Division at Intel)
Uwe Schulzweida (one of the main developers of the Climate Data Operators at Max Planck Institute for Meteorology)
Erxleben, Timm Leon; Duwe, Kira; Saak, Jens; Köhler, Martin; Kuhn, Michael
Energy Efficiency of Parallel File Systems on an ARM Cluster
In: The Twelfth International Conference on Smart Grids, Green Communications and IT Energy-aware Technologies,
ENERGY 2022 - IARIA, 2022. In press
Coupling storage systems and self-describing data formats for global metadata management
In: 2020 International Conference on Computational Science and Computational Intelligence (CSCI) - Piscataway, NJ:
IEEE, 2021
Dissecting self-describing data formats to enable advanced querying of file metadata
In: SYSTOR 2021: Proceedings of the 14th ACM International Systems and Storage Conference, June 14-16, 2021 -
New York: Association for Computing Machinery, 2021, 7 pp.
Duwe, Kira; Kuhn, Michael
Using Ceph's BlueStore as object storage in HPC storage framework
In: Proceedings of the Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems
(CHEOPS), in conjunction with EuroSys 2021 - Kuhn, Michael [editor] - New York: ACM, 2021, 6 pp.
Kuhn, Michael [editor]; Duwe, Kira [editor]; Acquaviva, Jean-Thomas [editor];
Chasapis, Konstantinos [editor]; Boukhobza, Jalil [editor]
Proceedings of the Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems (CHEOPS) -
in conjunction with EuroSys 2021
In: New York: ACM, 2021, 1 online resource
Duwe, Kira; Lüttgau, Jakob; Mania, Georgiana; Squar, Jannek; Fuchs, Anna; Kuhn, Michael; Betke, Eugen;
State of the Art and Future Trends in Data Reduction for High-Performance Computing
In: Supercomputing Frontiers and Innovations, Publishing Center of South Ural State University, pp. 4-36, 2020