ABOUT pNFS
BACKGROUND
Parallel NFS (pNFS) is a part of the NFS v4.1 standard that allows compute clients to access storage devices directly and in parallel. The pNFS architecture eliminates the scalability and performance issues associated with NFS servers deployed today. This is achieved by the separation of data and metadata, and moving the metadata server out of the data path.
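To make the separation of data and metadata concrete, the following is a minimal in-memory sketch, not a real pNFS client. Every name in it (`Layout`, `DataServer`, `MetadataServer`, `seed_file`, `parallel_read`) is hypothetical and chosen only for illustration, and simple round-robin striping is assumed. The client asks the metadata server once for a layout, then fetches stripes from the data servers directly and in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Layout:
    """Hypothetical stand-in for a file layout: where a file's stripes live."""
    stripe_unit: int      # bytes per stripe
    data_servers: list    # ordered data-server names, striped round-robin
    file_size: int        # total file length in bytes


class DataServer:
    """Toy in-memory data server holding stripes keyed by (path, stripe index)."""
    def __init__(self):
        self.stripes = {}

    def read(self, path, index):
        return self.stripes[(path, index)]


class MetadataServer:
    """Toy metadata server: hands out layouts and never serves file data."""
    def __init__(self, server_names, stripe_unit):
        self.server_names = list(server_names)
        self.stripe_unit = stripe_unit
        self.sizes = {}

    def get_layout(self, path):
        return Layout(self.stripe_unit, self.server_names, self.sizes[path])


def seed_file(mds, servers, path, data):
    """Demo-only helper (an assumption): place stripes round-robin on the servers."""
    for offset in range(0, len(data), mds.stripe_unit):
        idx = offset // mds.stripe_unit
        name = mds.server_names[idx % len(mds.server_names)]
        servers[name].stripes[(path, idx)] = data[offset:offset + mds.stripe_unit]
    mds.sizes[path] = len(data)


def parallel_read(mds, servers, path):
    """Fetch the layout once, then read all stripes directly from data servers."""
    layout = mds.get_layout(path)                        # single metadata request
    count = -(-layout.file_size // layout.stripe_unit)   # ceiling division

    def fetch(idx):
        name = layout.data_servers[idx % len(layout.data_servers)]
        return servers[name].read(path, idx)             # bypasses the metadata server

    with ThreadPoolExecutor(max_workers=len(layout.data_servers)) as pool:
        return b"".join(pool.map(fetch, range(count)))   # map preserves stripe order


servers = {"ds0": DataServer(), "ds1": DataServer(), "ds2": DataServer()}
mds = MetadataServer(sorted(servers), stripe_unit=4)
seed_file(mds, servers, "/demo/file", b"hello parallel nfs")
result = parallel_read(mds, servers, "/demo/file")
```

In this sketch the metadata server appears only in the `get_layout` call, so adding data servers increases read parallelism without putting more load on the metadata path.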
High-performance data centers have been aggressively moving toward parallel technologies like clustered computing and multi-core processors. While this increased use of parallelism overcomes the vast majority of computational bottlenecks, it shifts the performance bottlenecks to the storage I/O system. To ensure that compute clusters deliver the maximum performance, storage systems must be optimized for parallelism. Legacy Network Attached Storage (NAS) architectures based on NFS v4.0 and earlier have serious performance bottlenecks and management challenges when implemented in conjunction with large scale, high performance compute clusters.
Starting in 2002, a consortium of storage industry technology leaders created the initial parallel NFS (pNFS) protocol as an optional extension of the NFS v4.1 standard. The aim was a parallel file system client, standard in Linux, that could talk to multiple parallel file system backends, including block-, object-, and file-based backends.
A first generation of parallel file systems, both open and proprietary, as well as scale-out NAS, emerged more than 20 years ago to serve the tightly coupled HPC and loosely coupled scale-out communities. With the advent of AI, both tightly and loosely coupled workloads exist, at scales that now dwarf most HPC use cases. The need for solutions in this area has increased dramatically as AI use has become widespread.
To meet this market growth, new proprietary solutions have emerged. pNFS is also growing in importance: it is the only standards-based solution, multiple providers offer solutions built on pNFS technologies, and new pNFS features are coming out rapidly. Currently, most solutions are largely open source and ship in Linux distributions. There is at least one fully open solution entering the solution space.
GOALS AND CURRENT PLANS
To promote multiple standards-based (pNFS) and open source solutions, both the technology providers and the potential users of pNFS technology need support. Technology providers may not have scalable test resources or in-depth knowledge of scalable use cases to ensure their solutions work well at scale. Technology users need an independent source of information about how well the solutions work at scale and with the workloads they care about.
To support the broad adoption of pNFS, both the provider and user communities need testing of performance, correctness, and availability at scale, running important workloads. LANL, DOD, and CMU are all engaging in activities to assist in this testing over time.
Promoting community involvement in open source activities and contributions is also important. Open source solutions will enable more interested parties to engage in innovation to keep the pNFS solution space growing and healthy. We intend to assist in whatever way we can to promote a healthy open source community through information sharing, testing, and promoting research into future enhancements.
With open source clients, metadata servers, and data servers, including user space versions where possible, we hope to encourage growth in innovation in the pNFS solution space. Much as FUSE, a user space file system technology in operating systems, has enabled massive innovation in file system technology, we hope that open source and potentially user space components will do similar things for parallel/scalable/distributed file systems.
This project will provide at-scale testing resources for technology providers to test their solutions, conduct independent testing of pNFS technology and solutions with important workloads, provide a clearinghouse for testing results and information regarding solutions, promote participation in the open source community, and help the pNFS community grow.
HIGH-LEVEL TEST OBJECTIVES
Workload areas and aspects include:
- HPC large-scale simulation
- Scale-out NAS (analogous to n-to-n HPC but asynchronous)
- Read-mostly: read-write scenarios that are particularly read-heavy
- Long-distance distribution of filesystems/namespaces
- Scale-up: high single-stream and few-stream performance and IOPS from a single shared-memory resource
- Correctness
- Failover/availability
- Aging
- Computational storage enablement, including replacing HDFS-like functionality with functionality fully supported in standards (we will support computational storage standardization efforts where possible within the pNFS standard framework)
We will also work to make testing of pNFS technology at scale easier for all by providing tools for allocating scalable resources, sharing testing environments, and enabling extreme-scale testing via virtual cluster technology.
Additional background on pNFS.
EXTERNAL PARTIES’ INVOLVEMENT
We are currently determining the guidelines for collaboration and resource use, and plan to contact collaborators in this space to guide how standards technology providers participate. We will also solicit input from potential user communities.
CONTACTS
, Faculty
Associate Professor, ECE & CS
Carnegie Mellon University
, Faculty
Professor, ECE & CS
Carnegie Mellon University
, HPC Division Leader
Los Alamos National Laboratory
