Compute-intensive systems place demands on storage that call for a specialized distributed file system.
The high requirements for performance, scalability, and tolerance of hardware failures in these environments have motivated solutions that now carry over to less demanding fields such as medicine, navigation, and genomics.
What is GlusterFS?
The Gluster File System, or GlusterFS, is a scalable file system for network-attached storage (NAS), initially developed by Gluster Inc.
It allows multiple file servers to be aggregated over Ethernet or InfiniBand RDMA interconnects into one large parallel network file system.
The Design of Gluster File Storage
GlusterFS is designed to run in user space without compromising performance. It is used in a wide variety of environments and applications, such as cloud computing, biomedical sciences, and file storage.
GlusterFS is licensed under the GNU General Public License version 3. Gluster Inc. was the main commercial sponsor of GlusterFS and offered both commercial products and support for the development of free solutions based on it. In October 2011, Red Hat Inc. announced its acquisition of Gluster Inc.
GlusterFS is built on the interaction of client and server components.
Servers are typically deployed as storage bricks: on each server, the glusterfsd daemon process exports a local file system as a volume.
The glusterfs client process connects to the servers over TCP/IP, InfiniBand, or SDP (Sockets Direct Protocol) and uses translators to compose virtual composite volumes from multiple remote servers.
How are files stored in GlusterFS?
By default, files are stored whole, but GlusterFS can also be configured to split them into multiple chunks spread across the servers. Volumes can be mounted on client machines via the FUSE module, or accessed via the libglusterfs client library without incurring the overhead of the FUSE file system layer. Most GlusterFS functionality is implemented as translators, including:
● Mirroring and file replication.
● File fragmentation or Data striping.
● Load balancing for reading and writing files.
● Fault-tolerant volumes.
● I/O scheduling and disk caching.
● Storage quotas.
The GlusterFS server is kept deliberately simple: it exports an existing file system as-is, leaving the storage structure up to client-side translators.
The clients themselves are stateless and do not communicate directly with each other; data stays consistent as long as their translator configurations agree with one another.
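To make the idea of client-side translators more concrete, here is a minimal sketch in Python. It is not GlusterFS code: the class names (Brick, Replicate, Stripe), the in-memory storage, and the 4-byte chunk size are all hypothetical and chosen only for illustration. It assumes that translators can be stacked, so that a striping layer sits on top of two mirrored pairs of servers while the servers themselves just store whatever bytes they are given.

```python
from typing import Dict, List, Optional


class Brick:
    """Hypothetical stand-in for one server exporting its local file system."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.files: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def read(self, path: str) -> Optional[bytes]:
        return self.files.get(path)


class Replicate:
    """Mirrors each write to every child; reads from the first child holding the file."""

    def __init__(self, children: List) -> None:
        self.children = children

    def write(self, path: str, data: bytes) -> None:
        for child in self.children:
            child.write(path, data)

    def read(self, path: str) -> Optional[bytes]:
        for child in self.children:
            data = child.read(path)
            if data is not None:
                return data
        return None


class Stripe:
    """Splits a file into fixed-size chunks placed round-robin on the children."""

    def __init__(self, children: List, chunk_size: int = 4) -> None:
        self.children = children
        self.chunk_size = chunk_size

    def write(self, path: str, data: bytes) -> None:
        parts = [bytearray() for _ in self.children]
        for offset in range(0, len(data), self.chunk_size):
            index = (offset // self.chunk_size) % len(self.children)
            parts[index] += data[offset:offset + self.chunk_size]
        for child, part in zip(self.children, parts):
            child.write(path, bytes(part))

    def read(self, path: str) -> Optional[bytes]:
        parts = [child.read(path) for child in self.children]
        if any(part is None for part in parts):
            return None
        result = bytearray()
        offsets = [0] * len(parts)
        index = 0
        # Take one chunk from each child in turn until a child runs out.
        while offsets[index] < len(parts[index]):
            start = offsets[index]
            result += parts[index][start:start + self.chunk_size]
            offsets[index] += self.chunk_size
            index = (index + 1) % len(parts)
        return bytes(result)


# Four simple servers; all structure lives in the client-side stack.
b1, b2, b3, b4 = (Brick(f"server{i}") for i in range(1, 5))
volume = Stripe([Replicate([b1, b2]), Replicate([b3, b4])], chunk_size=4)
volume.write("/data.bin", b"abcdefghij")
restored = volume.read("/data.bin")
```

In this sketch each brick holds only part of the file, and every part exists on two bricks, so the file can still be read after one brick of a mirrored pair loses its copy. The real translators handle far more (locking, self-healing, partial writes) than this toy version attempts.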
The GlusterFS algorithm
GlusterFS relies on an elastic hashing algorithm instead of a centralized or distributed metadata model. Since version 3.1, volumes can be added, removed, or migrated dynamically, which helps prevent consistency issues. This allows GlusterFS to scale to several petabytes on low-cost hardware while avoiding the bottlenecks that typically affect distributed file systems under heavy concurrent access.
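The core idea of locating a file by hashing rather than by metadata lookup can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual GlusterFS algorithm: the function names (name_hash, locate) are hypothetical, MD5 stands in for whatever hash function GlusterFS really uses, and the hash space is simply divided into equal slices, one per server.

```python
import hashlib
from typing import List

HASH_SPACE = 2 ** 32  # assumed 32-bit hash space for this sketch


def name_hash(filename: str) -> int:
    """Map a file name to a 32-bit integer (MD5 is only a stand-in hash)."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def locate(filename: str, bricks: List[str]) -> str:
    """Return the brick whose slice of the hash space contains the file's hash."""
    slice_size = HASH_SPACE // len(bricks)
    # Clamp the index so hashes in the rounding remainder go to the last brick.
    index = min(name_hash(filename) // slice_size, len(bricks) - 1)
    return bricks[index]


bricks = ["server1:/export", "server2:/export", "server3:/export"]

# Two independent clients compute the same location without asking anyone.
client_a = locate("results/run-042.csv", bricks)
client_b = locate("results/run-042.csv", bricks)
```

Because every client derives the location from the file name alone, no metadata server has to be consulted on each access, which is the bottleneck the hashing approach avoids. What this sketch leaves out is the "elastic" part: with naive equal slices, adding a brick changes the slice boundaries and would require moving files, which a production system has to manage through rebalancing.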