This is the second post of the SLURM configuration and installation guide on Azure (part I is here). In this part, we configure the shared NFS storage; in the third post, we will set up the SLURM environment.
NFS: Shared Directories
Since the computers in the cluster have to share some files and directories, we have decided to configure one node as Network Attached Storage (NAS). To do this, we have configured the node that we named nasnode to store the information that other nodes may require, and to serve it through the Network File System (NFS) protocol that Linux provides.
While connected to nasnode, we run the following commands to install the NFS server:
jjorge@nasnode:~$ sudo apt-get update
jjorge@nasnode:~$ sudo apt-get install rpcbind nfs-kernel-server
Next, we edit /etc/fstab and /etc/exports to include the following lines, which bind-mount /home onto /nfs and make that folder available to the nodes.
jjorge@nasnode:~$ sudo vi /etc/fstab
# ...Rest of the file
# Adding this line at bottom
/home /nfs none bind 0 0
# ...Rest of the file
jjorge@nasnode:~$ sudo vi /etc/exports
# ...Rest of the file
# Adding this line at bottom
/nfs 10.0.0.8/24(fsid=0,rw,sync,no_subtree_check,no_root_squash)
# ...Rest of the file
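If this step is scripted instead of being done by hand in vi, it helps to make it safe to run twice, so the same line is never appended to the file more than once. Below is a minimal sketch of that idea. The helper name append_once is hypothetical (it is not part of this guide or of any NFS tool), and the demonstration writes to a scratch file rather than to the real /etc/exports.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present in it.
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Scratch file for the demonstration; on nasnode the target would be
# /etc/exports, and the script would have to run as root.
exports_file=$(mktemp)
export_line='/nfs 10.0.0.8/24(fsid=0,rw,sync,no_subtree_check,no_root_squash)'

append_once "$export_line" "$exports_file"
append_once "$export_line" "$exports_file"   # second call changes nothing

cat "$exports_file"
```

The same helper could be used for the bind-mount line in /etc/fstab.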
Then we create the directory that will hold the shared files and mount it. A different name can be used, as long as it is kept consistent across nodes.
jjorge@nasnode:~$ sudo mkdir /nfs
jjorge@nasnode:~$ sudo mount /nfs
jjorge@nasnode:~$ sudo /etc/init.d/nfs-kernel-server restart
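Before moving on, it can be worth confirming that /nfs really is a mount point, since exporting an empty, unmounted directory fails silently from the clients' point of view. The sketch below looks the directory up in the kernel's mount table. The helper is_mounted is a hypothetical name, and the optional second argument (an alternative mounts table) is only there so the function can be tried out on a sample file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given directory is listed as a mount
# point in a mounts table (the kernel's /proc/mounts by default).
is_mounted() {
    dir=$1
    table=${2:-/proc/mounts}
    # Field 2 of each line in the mounts table is the mount point.
    awk -v dir="$dir" '$2 == dir { found = 1 } END { exit !found }' "$table"
}

if is_mounted /nfs; then
    echo "/nfs is mounted"
else
    echo "/nfs is NOT mounted"
fi
```

The same check applies unchanged on the client nodes once they have mounted the share.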
We have finished with nasnode. Now we are going to configure the rest of the cluster to access this directory. The following steps have to be repeated on each node; as an example, we configure the compute0 node.
jjorge@compute0:~$ sudo apt-get update
jjorge@compute0:~$ sudo apt-get install nfs-common
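We modify the local fstab as well (the entry refers to the server as nas, so that name must resolve to the address of nasnode on every client), and then create the directory and mount the volume: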
jjorge@compute0:~$ sudo vi /etc/fstab
# ...Rest of the file
# Adding this line at bottom
nas:/nfs /nfs nfs auto,rsize=8192,wsize=8192 0 0
# ...Rest of the file
jjorge@compute0:~$ sudo mkdir /nfs
jjorge@compute0:~$ sudo mount /nfs/
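The client entry packs six whitespace-separated fields into a single line, which makes it easy to misread. The following sketch does nothing more than split the entry above into named fields and print them; the variable names are illustrative and the script does not touch /etc/fstab.

```shell
#!/bin/sh
# The client-side entry, copied from the fstab step above.
entry='nas:/nfs /nfs nfs auto,rsize=8192,wsize=8192 0 0'

# Split the six whitespace-separated fstab fields into named variables.
read -r device mountpoint fstype options dump pass <<EOF
$entry
EOF

echo "device:     $device"      # remote export, written as host:path
echo "mountpoint: $mountpoint"  # local directory where the share appears
echo "fstype:     $fstype"      # filesystem type, here nfs
echo "options:    $options"     # auto: mount at boot; rsize/wsize: max bytes per read/write request
echo "dump:       $dump"        # 0: ignored by the dump backup tool
echo "pass:       $pass"        # 0: never checked by fsck at boot
```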
Now, logged in as the main user, we can work in that user's folder on the share:
jjorge@compute0:/nfs/jjorge$ cd
jjorge@compute0:~$ cd /nfs/jjorge/
jjorge@compute0:/nfs/jjorge$ cat > example.txt
hello world!
jjorge@compute0:/nfs/jjorge$ cat example.txt
hello world!
jjorge@compute0:/nfs/jjorge$ ls
example.txt
And we can access this file from every node in the cluster.
jjorge@controller:~$ cd /nfs/jjorge/
jjorge@controller:/nfs/jjorge$ ls
example.txt
jjorge@controller:/nfs/jjorge$ cat example.txt
hello world!
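Checking node by node gets tedious as the cluster grows, so the same test can be looped over SSH. This is only a sketch: it assumes that the user can SSH from the current machine to every node, which this guide has not set up explicitly. The node list uses the two names that appear in this post, and DRY_RUN is a hypothetical switch that makes the script print the commands instead of running them (the default).

```shell
#!/bin/sh
# Node names taken from this post; extend the list to match the real cluster.
nodes="compute0 controller"

# Hypothetical switch: 1 (default) prints each command, 0 runs it over SSH.
DRY_RUN=${DRY_RUN:-1}

# Read the shared test file on one node, or show the command that would do it.
check_node() {
    target=$1
    cmd="cat /nfs/jjorge/example.txt"
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "ssh $target $cmd"
    else
        ssh "$target" "$cmd"
    fi
}

for node in $nodes; do
    check_node "$node"
done
```

Running it with DRY_RUN=0 should print the contents of example.txt once per node if every mount is working.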
Now that the shared directory is set up, we will install SLURM in the next post.