Interactive Storage Layout Visualization in Distributed Storage
Supercomputers are typically accompanied by high-performance storage systems that provide fast access to large volumes of data. Files containing scientific data routinely reach hundreds of GiB to TiB in size, even for single timesteps. A single storage device, such as a hard drive or an SSD, offers neither the access performance nor the capacity required for meaningful analysis across large volumes of data. For this reason, high-performance storage systems aggregate the performance of many (tens of thousands of) devices. Typically, on the order of 50 to 100 disks are managed by a so-called storage server, and hundreds of storage servers form a high-performance storage system that may hold hundreds of petabytes of data at data rates reaching into the TiB/s range. In practice, most applications achieve far lower throughput because data is distributed suboptimally across the available storage servers. In fact, by default, many files are not spread across multiple storage servers at all, which is a fair strategy for smaller files but leaves the aggregate performance of the system untapped for large ones.
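To make the effect of striping concrete, the following sketch models a simple round-robin stripe layout (an assumption for illustration, not the layout policy of any particular file system): a file is cut into fixed-size stripes that are assigned to storage targets in turn, so a file with a stripe count of one touches only a single target regardless of its size.

```python
# Minimal sketch of round-robin striping (assumed model, not tied to a
# specific parallel file system): map byte offsets of a file to targets.

def target_for_offset(offset, stripe_size, stripe_count):
    """Index of the storage target holding the given byte offset."""
    stripe_index = offset // stripe_size   # which stripe the offset falls in
    return stripe_index % stripe_count     # round-robin over the targets

def targets_used(file_size, stripe_size, stripe_count):
    """Set of storage targets a file of file_size bytes is spread across."""
    n_stripes = -(-file_size // stripe_size)  # ceiling division
    return {i % stripe_count for i in range(n_stripes)}

# A 1 GiB file with 1 MiB stripes and the common default stripe count
# of 1 lives entirely on one target, so reads are limited to that
# target's bandwidth; striping over 8 targets engages all eight.
print(targets_used(2**30, 2**20, 1))
print(targets_used(2**30, 2**20, 8))
```

Under this model, aggregate throughput scales with the number of distinct targets a file's stripes engage, which is why the default single-target placement caps large-file performance at the speed of one device group.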
Most parallel file systems and many object stores offer command-line utilities to inspect and fine-tune how a file is striped across the available storage targets. To achieve optimal read and write performance, however, it becomes necessary to take into account the topological relationship between the compute allocation, the network, and the storage system. The textual output of existing storage APIs offers no intuitive representation that would allow users to quickly spot problematic data mappings, nor does it take the relationship to the rest of the supercomputer into account.
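One example of a problematic mapping that is hard to spot in textual output but easy to highlight visually: several stripes of the same file landing on the same storage server, so that they compete for that server's bandwidth. The sketch below assumes a hypothetical, already-parsed layout (a list naming the server behind each stripe's target; the server names are invented for illustration) and flags such collisions.

```python
# Sketch of the kind of check a layout visualization could surface:
# given a hypothetical parsed stripe layout (stripe index -> server name),
# flag servers that host more than one stripe of the same file and would
# therefore serve multiple stripes from the same device group.

from collections import Counter

def overloaded_servers(stripe_to_server):
    """Return the servers hosting more than one stripe of the file."""
    counts = Counter(stripe_to_server)
    return {server for server, n in counts.items() if n > 1}

# Hypothetical layout: 4 stripes, two of which map to server "oss03",
# halving the parallelism the stripe count suggests.
layout = ["oss01", "oss03", "oss02", "oss03"]
print(overloaded_servers(layout))
```

A graphical representation could render the same check as a heat map over servers, which is precisely the kind of at-a-glance insight the textual utilities do not provide.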