|
A Distributed Active Storage Testbed for Scientific Applications on NWICG |
|
Users with large computing needs have easy access to an enormous array of CPUs via systems such as the TeraGrid, the Open Science Grid, and (soon) NWICG. However, users of these systems are almost universally constrained by the performance of storage systems. Even the highest-end archival systems cannot keep up with the bandwidth and latency requirements of thousands of CPUs. To attack this problem, we propose to intermix computing and storage capacity within a single cluster, an approach known as distributed active storage. Each node of a computing cluster already has a disk from which it loads the OS; why not use those disks for data storage as well? Although an individual disk cannot match the performance of a high-end archival system, each one can provide for the I/O needs of its attached CPU. Ideally, one should be able to scale up both the computing and I/O capacity of a cluster by simply adding nodes one at a time. This architecture requires new system software to manage each local storage device, to track the location of the various data units, and to direct computations to the data that they require.
|
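The last two duties of that system software, tracking where data units live and directing computations to them, can be illustrated with a minimal Python sketch. It is not the project's design: the names `DataCatalog` and `place_job`, the node and file names, and the least-loaded tie-breaking policy are all hypothetical, chosen only to make the idea concrete.

```python
from collections import defaultdict


class DataCatalog:
    """Hypothetical catalog: records which nodes hold each data unit."""

    def __init__(self):
        self._locations = defaultdict(set)

    def register(self, data_unit, node):
        # Note that `node` stores a copy of `data_unit` on its local disk.
        self._locations[data_unit].add(node)

    def locate(self, data_unit):
        # Return a copy so callers cannot mutate the catalog's state.
        return set(self._locations.get(data_unit, set()))


def place_job(catalog, data_unit, load):
    """Pick the least-loaded node that already stores `data_unit`.

    Returns None when no node holds the data; a real system would then
    have to stage the data somewhere or reject the job (assumption).
    """
    candidates = catalog.locate(data_unit)
    if not candidates:
        return None
    # Break ties by node name so the choice is deterministic.
    return min(candidates, key=lambda node: (load.get(node, 0), node))


catalog = DataCatalog()
catalog.register("run42.dat", "node03")
catalog.register("run42.dat", "node07")

load = {"node03": 2, "node07": 0}  # jobs currently running on each node
chosen = place_job(catalog, "run42.dat", load)
print(chosen)
```

The point of the sketch is that the job moves to a node whose local disk already holds the input, so the read is served by that disk and not by a shared archival system.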