Distributed File System (DFS) Architecture Components Explained

In this post, we introduce the Distributed File System (DFS) and its features, then explain its components in detail.

Distributed File System (DFS) Architecture

Distributed file systems are used to store and share files across different computers or servers. They allow people to share information with others without keeping it all on one computer. In many designs, this is done by dividing a file into pieces, which are then distributed to different servers so that they can be accessed from any other computer on the network.
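As a rough illustration of dividing a file into pieces and spreading them over servers, here is a minimal Python sketch. The function names, the fixed chunk size, and the round-robin placement are assumptions made for this example; real distributed file systems use their own placement policies.

```python
from typing import Dict, List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Split data into fixed-size pieces (the last piece may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def distribute(chunks: List[bytes], servers: List[str]) -> Dict[str, List[int]]:
    """Assign chunk indexes to servers in round-robin order (an assumed policy)."""
    if not servers:
        raise ValueError("at least one server is required")
    placement: Dict[str, List[int]] = {name: [] for name in servers}
    for index in range(len(chunks)):
        placement[servers[index % len(servers)]].append(index)
    return placement


def reassemble(chunks: List[bytes]) -> bytes:
    """Join the pieces back together in their original order."""
    return b"".join(chunks)


# Hypothetical file content and server names, for illustration only.
original = b"hello distributed world"
chunks = split_into_chunks(original, 8)
placement = distribute(chunks, ["server-a", "server-b"])
```

Any client that knows the placement can fetch the pieces from the servers and call `reassemble` to get the original bytes back.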

This is what distinguishes a distributed file system (DFS) from typical local file systems such as NTFS and HFS: a DFS allows hosts direct access to the same file data across multiple locations.

As noted, files are distributed among multiple storage servers and across multiple locations, which allows users to share data and storage resources. In case of a disaster or high load, two components, DFS Namespaces and DFS Replication, work together to improve data availability. They allow data from several locations to be logically combined into one folder, known as the DFS root.

There are several reasons to consider a DFS solution for an environment, but they all boil down to the need to access the same data from multiple locations. Especially in a world of unstructured data, a DFS plays a critical role by providing a single, logical view of data scattered between local and remote locations, including the cloud.

In addition, DFS makes it easy to share information and files between users across the network in a controlled and authorized way, using permissions.

Applications of Distributed File System

Some of the major applications of the distributed file system are shown below:

NFS

Network File System (NFS) is a file sharing protocol that works in a client-server architecture. It allows users to mount and access directories located on a remote system. It is one of several DFS standards for network-attached storage. NFS uses a file locking system that allows many clients to share the same files, and an NFS server handles requests from multiple application or compute threads.

Hadoop

Hadoop is a free, open source framework whose distributed file system (HDFS) is used to store, process, and analyse very large volumes of data. It is designed to process large data sets across clusters of computers using simple programming models. With Hadoop, you can scale from single servers to thousands of machines, each offering local computation and storage.

SMB

Server Message Block (SMB) is a file sharing protocol originally developed by IBM. It allows you to read and write files on a remote server over the local area network. With SMB, you can share files, directories, printers, and other resources on a company’s internal network.

NetWare

NetWare is a network operating system developed by Novell. It uses the IPX network protocol to run different services on personal computer hardware. It also supports several client operating systems, including Microsoft Windows, DOS, IBM OS/2, and Unix.

Features of DFS

  • Easy to use and highly available.
  • File locking.
  • Coherent access.
  • Support for multiple networks and multi-protocol access.
  • User mobility.
  • Scalable and reliable.
  • Data integrity.
  • Secure, protecting information from unwanted and unauthorized access.

In Windows Server, DFS Namespaces and DFS Replication are part of the File and Storage Services role.

DFS Namespaces is a role service on Windows Server that allows shared folders located on various servers to be grouped into one or more logically structured namespaces.
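To make the idea of a logical namespace more concrete, the following Python sketch maps a folder in a namespace to the shared folders that back it. It is a simplified model and not the Windows Server implementation; the `Namespace` class and the server and share names are hypothetical.

```python
from typing import Dict, List


class Namespace:
    """Toy model of a namespace: logical folders mapped to folder targets."""

    def __init__(self, root: str) -> None:
        self.root = root.rstrip("\\")
        self._folders: Dict[str, List[str]] = {}

    def add_folder(self, name: str, targets: List[str]) -> None:
        """Register a logical folder and the shared folders that back it."""
        self._folders[name.lower()] = list(targets)

    def resolve(self, path: str) -> List[str]:
        """Translate a namespace path into the matching target paths."""
        prefix = self.root + "\\"
        if not path.lower().startswith(prefix.lower()):
            raise ValueError("path is outside this namespace")
        parts = path[len(prefix):].split("\\", 1)
        folder = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        targets = self._folders.get(folder.lower())
        if targets is None:
            raise KeyError(folder)
        return [t + ("\\" + rest if rest else "") for t in targets]


# Hypothetical namespace root and file servers, for illustration only.
ns = Namespace(r"\\example.local\public")
ns.add_folder("docs", [r"\\fs1\docs", r"\\fs2\docs"])
resolved = ns.resolve(r"\\example.local\public\docs\report.txt")
```

Users only see the namespace path, while the list returned by `resolve` shows which servers could actually serve the file.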

DFS Replication is the multi-master replication mechanism in Microsoft Windows Server, used for synchronizing folders across servers, including over low-bandwidth network connections. The Enterprise and Datacenter editions of Windows Server can host multiple DFS roots on a single server.

You are not required to use these components together: you can use a namespace without file replication, and you can use file replication between servers without a namespace. Below, we list a few more components of distributed file systems.

DFS Components

A distributed file system is a set of computers that work together to store and retrieve data. It is used for storing data securely and for sharing that data across the network. It has several components, which are listed below.

This section covers the cache manager, the components that run on DCE file server machines, the administrative server processes, and the tools that help keep track of DFS use and activity. It also explains the DFS/NFS Secure Gateway, which offers authorized access to DFS from NFS clients.

Cache Manager

A cache manager is a program that stores data in a cache. It acts as an intermediary between a computer’s main memory and the permanent storage system. The cache manager has two main functions: it improves performance by keeping copies of recently accessed data in memory, and it protects against data loss by keeping copies of data on permanent storage.

Cache managers are used for many different purposes, such as caching web pages for faster access, caching files for faster loading times, and optimizing disk usage.

In DFS, the cache manager is the client-side component. When it receives a user’s request, it first checks the local cache. If the requested file is not found there, the cache manager forwards the request to the file server machine and caches the returned data on disk or in memory.
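The lookup flow described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the `CacheManager` class, the in-memory dictionary cache, and the `fake_file_server` function are hypothetical stand-ins, and cache invalidation is left out.

```python
from typing import Callable, Dict, List


class CacheManager:
    """Client-side cache that falls back to the file server on a miss."""

    def __init__(self, fetch_from_server: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_server
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def read(self, path: str) -> bytes:
        # Check the local cache first.
        if path in self._cache:
            self.hits += 1
            return self._cache[path]
        # Cache miss: forward the request to the file server and keep a copy.
        self.misses += 1
        data = self._fetch(path)
        self._cache[path] = data
        return data


# A stand-in for the file server machine (hypothetical, for illustration).
server_files = {"/fs/readme.txt": b"hello"}
requests_seen: List[str] = []


def fake_file_server(path: str) -> bytes:
    requests_seen.append(path)
    return server_files[path]


manager = CacheManager(fake_file_server)
first = manager.read("/fs/readme.txt")   # miss, goes to the server
second = manager.read("/fs/readme.txt")  # hit, served from the cache
```

Only the first read reaches the server; the second one is answered from the local copy.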

File Exporter

The File Exporter is the server-side component of DFS. It makes the files stored on a file server machine available to DFS clients, whether they request entire directories, individual files, or groups of files. The component runs on the file server machine, where it receives requests and manages files.

When the File Exporter receives an RPC request, it accesses its own local file system to fulfil the request. This local file system can be the DCE Local File System (LFS) or a UNIX File System (UFS). Using the token manager, the File Exporter synchronizes multiple clients that access the same file simultaneously and provides each client with the information it needs.

Token Manager

The Token Manager issues tokens that grant DFS clients permission to carry out operations on files, which helps synchronize access to files by numerous clients. The access privileges associated with these tokens are typically read or write. The Token Manager can issue four types of tokens: data tokens, status tokens, lock tokens, and open tokens.

To manage tokens, the token management layer in the cache manager works with the token manager that runs on a file server machine. If a client requests an operation that conflicts with a token that another client holds, the token manager must revoke the existing token and grant a new one before the requested operation can complete.
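The revoke-then-grant behaviour can be sketched as follows. This is a simplified Python model, not the DCE implementation: it assumes a single kind of token per file with only read and write privileges, it treats a write as conflicting with any token held by another client, and the `TokenManager` class and client names are hypothetical.

```python
from typing import Dict, List, Tuple


class TokenManager:
    """Toy token manager: many readers or one writer per file."""

    def __init__(self) -> None:
        # file -> {client: mode}, where mode is "read" or "write"
        self._holders: Dict[str, Dict[str, str]] = {}
        # Log of revoked tokens as (client, file, mode) tuples.
        self.revoked: List[Tuple[str, str, str]] = []

    def request(self, client: str, file: str, mode: str) -> None:
        if mode not in ("read", "write"):
            raise ValueError("mode must be 'read' or 'write'")
        holders = self._holders.setdefault(file, {})
        for other, held in list(holders.items()):
            # Two tokens conflict when at least one of them is a write token.
            if other != client and (mode == "write" or held == "write"):
                del holders[other]
                self.revoked.append((other, file, held))
        # Conflicting tokens are gone, so the new token can be granted.
        holders[client] = mode

    def holders(self, file: str) -> Dict[str, str]:
        return dict(self._holders.get(file, {}))


tokens = TokenManager()
tokens.request("client-a", "report.txt", "read")
tokens.request("client-b", "report.txt", "read")    # readers can coexist
shared = tokens.holders("report.txt")
tokens.request("client-c", "report.txt", "write")   # revokes both read tokens
exclusive = tokens.holders("report.txt")
```

Two read tokens coexist without conflict, while the write request forces both of them to be revoked before it is granted.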

DCE Local File System

The DCE Local File System (LFS) is a file system that stores and retrieves files on the computer’s hard drive. Its design goal is to provide an open and reliable file system that can be used by all applications in the distributed computing environment.

The DCE Local File System provides a high-level interface for storing and retrieving files. It also supports directories, access control lists, and data integrity checks.

This type of file system is used to share files among workgroups, departments, or whole organizations. It also provides a way to control access to files so that only authorized users can read or change them.

Fileset Server

The Fileset Server is an administrative server process that runs on file server machines. Using this component, administrators create, delete, move, or perform other operations on filesets, which are the units in which files shared among multiple clients are stored and managed.

Basic Overseer Server

The Basic Overseer Server (BOS Server) monitors the DFS server processes that run on a server machine and restarts them when they fail. It gives administrators an overview of the DFS servers in the organization and helps monitor their health.

Replication Server

A Replication Server is an administrative server that manages the replication of filesets. It synchronizes the changes made on one server with the copies held on another server by copying the data from the source fileset to its replicas.

Replicas can be updated manually or automatically. If one copy of a fileset becomes unavailable, you can still access another copy of the fileset from another file server machine.
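The following Python sketch illustrates replication with failover in a very reduced form. It assumes a single master copy that is pushed to the other servers by an explicit `replicate` call (the manual case); the `ReplicatedFileset` class and the server names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


class ReplicatedFileset:
    """Toy fileset with one master copy and read-only replicas."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self.master = servers[0]
        self._copies: Dict[str, Dict[str, bytes]] = {s: {} for s in servers}
        self._offline: Set[str] = set()

    def write(self, path: str, data: bytes) -> None:
        """Changes are made on the master copy only."""
        self._copies[self.master][path] = data

    def replicate(self) -> None:
        """Copy the master's content to every replica that is online."""
        for server in self._servers:
            if server != self.master and server not in self._offline:
                self._copies[server] = dict(self._copies[self.master])

    def set_offline(self, server: str) -> None:
        self._offline.add(server)

    def read(self, path: str) -> Tuple[str, bytes]:
        """Return (server, data) from the first online server holding the file."""
        for server in self._servers:
            if server not in self._offline and path in self._copies[server]:
                return server, self._copies[server][path]
        raise FileNotFoundError(path)


fileset = ReplicatedFileset(["fs1", "fs2"])
fileset.write("notes.txt", b"v1")
fileset.replicate()
fileset.set_offline("fs1")              # the master becomes unavailable
served_by, data = fileset.read("notes.txt")
```

Because the replica on `fs2` was synchronized before `fs1` went offline, the read still succeeds.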

Update Server

Update servers distribute binary files or administrative data to DFS servers. The update server consists of two programs, upserver and upclient. The upclient software is installed on any system that needs to receive updated binary files or administrative data. The upserver program runs on a master system and propagates any updates to binaries or administrative data to the machines running upclient.

Fileset Location Server

The Fileset Location Server (FL Server) manages filesets and their locations. It offers a replicated directory service that maintains a record of each fileset and the server where it resides. As a result, you can access a fileset just by its name, without knowing its location. DFS also updates the Fileset Location Database (FLDB) automatically.
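Access by name can be pictured as a simple lookup table. The Python sketch below is an assumption-laden illustration: the `FilesetLocationDatabase` class, the fileset name, and the server names are hypothetical, and the real FLDB is a replicated database rather than a dictionary.

```python
from typing import Dict


class FilesetLocationDatabase:
    """Toy location database: fileset name -> file server machine."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def register(self, fileset: str, server: str) -> None:
        """Record (or update) the server that currently holds a fileset."""
        self._locations[fileset] = server

    def locate(self, fileset: str) -> str:
        """Return the server for a fileset, looked up by name only."""
        try:
            return self._locations[fileset]
        except KeyError:
            raise LookupError(f"unknown fileset: {fileset}") from None


fldb = FilesetLocationDatabase()
fldb.register("user.alice", "fs1")
before = fldb.locate("user.alice")
fldb.register("user.alice", "fs2")   # the fileset was moved to another server
after = fldb.locate("user.alice")
```

The client keeps using the same fileset name before and after the move; only the database entry changes.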

Backup Server

The backup server in DFS is used to create backups of data on file server machines and to maintain backup schedules. It keeps the records in the replicated backup database up to date and can run full and incremental dumps. The fileset serves as the unit of backup.
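The difference between a full and an incremental dump can be shown with a short Python sketch. It assumes a fileset modelled as a dictionary of paths with a modification timestamp and content, and an incremental dump defined as everything changed after the previous dump; the function names and sample data are hypothetical.

```python
from typing import Dict, Tuple

# A fileset modelled as: path -> (modification time, content)
Fileset = Dict[str, Tuple[int, bytes]]


def full_dump(fileset: Fileset) -> Dict[str, bytes]:
    """Copy every file in the fileset."""
    return {path: data for path, (_, data) in fileset.items()}


def incremental_dump(fileset: Fileset, since: int) -> Dict[str, bytes]:
    """Copy only the files modified after the previous dump time."""
    return {
        path: data
        for path, (modified, data) in fileset.items()
        if modified > since
    }


# Hypothetical fileset content, for illustration only.
fileset = {
    "a.txt": (100, b"old"),
    "b.txt": (250, b"new"),
}
full = full_dump(fileset)
incremental = incremental_dump(fileset, since=200)
```

Here the full dump contains both files, while the incremental dump contains only `b.txt`, the one file changed after time 200.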

Scout

Scout is an administrative tool that collects and displays data about the file exporters running on file server machines. It helps administrators keep track of how DFS is being used.

The dfstrace Utility

Using the dfstrace utility, administrators and system developers can trace DFS processes that run in user space or in the kernel. The utility also offers a suite of commands for low-level diagnostic and debugging data.

DFS/NFS Secure Gateway

The DFS/NFS Secure Gateway provides NFS clients with secure, authorized access to files stored in DFS. It offers a number of features that make it easy for users to work with the files they have stored on the server.

The gateway provides access to files and folders on the server, along with file system navigation and file management operations such as copy, delete, rename, and create folder. It can also upload files to and download files from the server.

You can also perform file system tasks such as splitting archive files, synchronizing directories, and backing up data from the server onto local drives.

Thank you for reading Distributed File System (DFS) Architecture Components Explained. We shall conclude.

Distributed File System (DFS) Architecture Components Explained Conclusion

Distributed file systems (DFS) are file systems that span several file servers or several physical locations. They are highly scalable and offer high performance. They are also fault tolerant and can be used as a replacement for a centralized server.

A DFS provides data transparency and allows data to be shared remotely. It is also highly secure and helps protect data in the file system from unauthorized access. In addition, it supports load sharing and file locking.

Distributed file systems are designed to overcome the limits of traditional local storage, where data resides on only one computer. They are used for both large-scale and small-scale storage, as well as for backup purposes. The cache manager, file exporter, token manager, replication server, and backup server are a few of the main components of DFS covered in this post.

Hitesh Jethva

I am a fan of open source technology and have more than 10 years of experience working with Linux and Open Source technologies. I am one of the Linux technical writers for Cloud Infrastructure Services.
