1 Tools

In this section we introduce the tools necessary to use the WU-Cluster.

1.1 Virtual Private Network (VPN)

A virtual private network (VPN) is a service that encrypts your internet traffic and hides your online identity. It lets you access websites and services that are blocked or censored in your region and protects your data from anyone who might be watching. Here it is mainly used to create a secure connection to the local network of your organization, i.e., the WU network.

You will need a VPN whenever you want to connect to the WU-Cluster and you are not connected to the WU network. WU uses the GlobalProtect VPN client; the setup is described at https://www.wu.ac.at/en/it/services/network/vpn/.

1.2 Secure Shell Protocol (SSH)

The secure shell protocol (SSH) provides a secure way to connect to a remote server in order to run commands, transfer files, or forward ports.

1.2.1 Unix

On Linux and macOS an SSH client should already be installed.

1.2.2 Windows

On Windows you have to install one of the following tools to use SSH: RTools or MobaXterm.

  • RTools and MobaXterm are both based on Cygwin, a large collection of tools that provides functionality similar to a Linux distribution on Windows.
  • RTools has the advantage that it is needed anyhow if you want to build R packages from source on Windows.
  • MobaXterm has the advantage that it has a GUI more familiar to Windows users.


2 WU Cluster

2.1 Connect to the WU cluster

To connect to the WU cluster you have to be inside the WU network (either on campus or via VPN, see Section 1.1). Check that you are able to connect to the cluster by using,

ssh <user_name>@wucluster.wu.ac.at

2.2 SSH Key-Based (passwordless) Authentication

To avoid typing your password every time you want to connect to the cluster, SSH key-based authentication can be used. Follow these two simple steps to set up SSH key-based authentication: 1. create an SSH key and 2. add your public key (id_rsa.pub) to the ~/.ssh/authorized_keys file on the server.

2.2.1 Create an SSH key

Create an SSH key by typing

ssh-keygen

into the terminal and pressing Enter until finished (the defaults can be accepted).
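
The key pair is stored in ~/.ssh/ (the steps below assume the default RSA key files id_rsa and id_rsa.pub; newer OpenSSH versions may default to a different key type). To verify that the key pair was created, you can list the folder,

# list the contents of ~/.ssh/; the private and public key should appear here
ls -l ~/.ssh/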

2.2.2 Add your public key to the server

If ssh-copy-id is available, use the following command

ssh-copy-id <user_name>@wucluster.wu.ac.at

to add your public key (id_rsa.pub) into the ~/.ssh/authorized_keys file on the server.
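
If ssh-copy-id is not available, the public key can also be appended manually, e.g. with the following sketch (assuming the default key file ~/.ssh/id_rsa.pub),

# append the local public key to ~/.ssh/authorized_keys on the cluster
cat ~/.ssh/id_rsa.pub | ssh <user_name>@wucluster.wu.ac.at "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"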

More information can be found, e.g., in "How To Set up SSH Keys on a Linux / Unix System".

2.2.3 Setup ~/.ssh/config

For convenience it is also recommended to create a ~/.ssh/config file on your local PC,

Host wucluster
    Hostname wucluster.wu.ac.at
    User <user_name>

After setting up SSH key-based authentication and creating the config file, you can connect to the cluster by just using,

ssh wucluster

2.3 File Upload to / Download from the WU cluster

For file upload / download via SSH, scp or rsync is recommended. The advantage of scp is that it is typically available once ssh is installed; the advantage of rsync is that it has much more functionality.

2.3.1 Secure copy protocol (SCP)

2.3.1.1 Upload to the Host (Cluster)

scp <source_file> <user_name>@hostname:<target_file>

e.g.,

scp test.R <user_name>@wucluster.wu.ac.at:~/exercises/test.R

uploads the file test.R from your current working directory into the folder ~/exercises on the cluster. If you set up the ~/.ssh/config as shown above, this reduces to

scp test.R wucluster:~/exercises/test.R

Equivalently, the following command could be used,

scp test.R wucluster:~/exercises/

To upload entire folders, the recursive copy argument -r has to be used, as shown below.
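
For example, the following command would upload a local folder exercises (a placeholder name) to the ${HOME} folder on the cluster,

# -r copies the folder exercises and all of its contents recursively
scp -r exercises wucluster:~/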

2.3.1.2 Download from the Host (Cluster)

scp <user_name>@hostname:<source_file> <target_file> 

e.g.,

scp <user_name>@wucluster.wu.ac.at:~/exercises/test.R test.R 

downloads the file ~/exercises/test.R from the cluster into your current working directory. If you set up the ~/.ssh/config as shown above, this reduces to

scp wucluster:~/exercises/test.R test.R 

Equivalently, the following command could be used,

scp wucluster:~/exercises/test.R ./
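
Entire folders can also be downloaded recursively with -r; for example, the following sketch downloads the folder ~/exercises (assuming it exists on the cluster) into your current directory,

# -r copies the remote folder and all of its contents recursively
scp -r wucluster:~/exercises ./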

2.3.2 rsync

In principle rsync works similarly to scp but has many more options. Here only one set of options is shown: -r (recursive), -t (preserve modification times), -v (verbose) and -P (show progress and keep partially transferred files).

2.3.2.1 Upload to the Host (Cluster)

The command

rsync -rtvP HPC wucluster:~/

uploads the folder HPC located in your current directory to the ${HOME} folder on the cluster. Similarly, the command

rsync -rtvP HPC/* wucluster:~/HPC/

would upload the content of the HPC folder located in your current directory to the HPC folder located in the ${HOME} folder on the cluster.

2.3.2.2 Download from the Host (Cluster)

The command

rsync -rtvP wucluster:~/HPC ./

downloads the folder HPC located in your ${HOME} folder on the cluster to your current directory. Similarly, the command

rsync -rtvP wucluster:~/HPC/* HPC/

would download the content of the HPC folder located in your ${HOME} folder on the cluster into the HPC folder located in your current directory.
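
Before transferring a larger amount of data it can be useful to additionally pass -n (--dry-run), which only lists what rsync would transfer without copying anything, e.g.,

# dry run: show which files would be transferred, but do not copy anything
rsync -rtvPn HPC wucluster:~/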


3 Introduction to Linux Command Line Basics

There exist many introductions to the Linux command line online.

For this course it should be enough if you know the commands pwd, cd, ls, mkdir, touch, nano, mv, cp, cat, head, tail, less, rm, rmdir and man.
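
A small example session illustrating some of these commands (the folder and file names are just placeholders),

# print the current working directory
pwd
# create a new folder and change into it
mkdir exercises
cd exercises
# create an empty file and list the folder content in long format
touch test.R
ls -l
# move (rename) the file, show its content and remove it
mv test.R example.R
cat example.R
rm example.R
# go back up and remove the now empty folder
cd ..
rmdir exercises
# show the manual page of a command (quit with q)
man ls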


4 Using SLURM with LMOD and R

4.1 SLURM

The Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management (SLURM), or simply Slurm, is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world’s supercomputers and computer clusters.

4.1.1 Commands

  • sbatch: Submit a batch script to Slurm. Usage: sbatch [options] script [args], e.g., sbatch job.sh
  • scancel: Send a signal (TERM, KILL, …) to jobs or job steps that are under the control of Slurm. Usage: scancel [options] job_ids, e.g., scancel 123456
  • scontrol: View or modify Slurm configuration and state. Usage: scontrol hold [job_id], e.g., scontrol hold 123456
  • sinfo: View information about Slurm nodes and partitions. Usage: sinfo [options], e.g., sinfo -al, sinfo -Nel
  • squeue: View information about jobs located in the Slurm scheduling queue. Usage: squeue [options], e.g., squeue -u $USER

The command

sinfo -al

displays information about all partitions in long format. The command

squeue -u $USER

lists all jobs of the current user. The command

squeue -t R

lists all currently running jobs. The command

scontrol hold <job_id>

holds the Slurm batch job with job id <job_id>. To release the job from hold,

scontrol release <job_id>

has to be used. The command

scancel <job_id>

cancels the job with job id <job_id>. Finally, the command

scontrol update jobid=1 ArrayTaskThrottle=200

limits the number of tasks of array job 1 that are allowed to run simultaneously to 200.

4.1.2 Submitting jobs

Jobs are submitted to Slurm by using the batch command sbatch.

  • sbatch --usage gives an overview of all the available options.
  • sbatch --help shows the help.

As a first example, create a simple array job script run.slurm,

#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --ntasks=1
#SBATCH --partition=compute-test
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2G
#SBATCH --output=test-%A-%a.out

#
# To submit the job just type
# sbatch --array=1-3 run.slurm
#

ml purge   # cleanup
module restore R430

R --no-save --no-restore -f test.R > test-${SLURM_ARRAY_JOB_ID}-${SLURM_ARRAY_TASK_ID}.Rout 2>&1 

exit 0

which executes the R script test.R three times (once per array task). The script test.R contains

task_id <- as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID"))

writeLines(sprintf("TASK_ID: %i", task_id))

To run this job, the module collection R430 has to be created beforehand (see Section 4.2.2). Load the collection by typing

module purge  # cleanup
module restore R430

into the terminal. Afterwards the array job can be started via

sbatch --array=1-3 run.slurm
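
Once the three tasks have finished, each task leaves a test-<job_id>-<task_id>.out file (from the --output option) and a test-<job_id>-<task_id>.Rout file (from the redirect in run.slurm) in the submission directory; these can be inspected, e.g., with

# show the R output of all three array tasks
cat test-*.Rout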

4.1.3 Selected SBATCH options

The following gives an overview of selected options; to see all the available options use sbatch --usage or sbatch --help. A minimal batch script combining these options is sketched after the list.

  • --job-name selects a name for the job.
  • --ntasks advises Slurm that a certain number of tasks will be launched from the job e.g. --ntasks=16 will tell Slurm that 16 different tasks will be launched from the job script.
  • --partition gives the name of the partition; all available partitions can be listed via sinfo -a.
  • --cpus-per-task specifies the number of vCPUs (virtual central processing units) required per task on the same node, e.g., --cpus-per-task=4 will request that each task has 4 vCPUs allocated on the same node. The default is 1 vCPU per task.
  • --output gives a name of the file where the output should be stored.
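
A minimal sketch of a batch script combining these options (the job name, output file name and echoed message are made up for illustration; the partition compute-test is taken from the array job example above),

#!/bin/bash
#SBATCH --job-name=cpu-test
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --partition=compute-test
#SBATCH --mem-per-cpu=2G
#SBATCH --output=cpu-test-%j.out

# Slurm exposes the number of allocated vCPUs via SLURM_CPUS_PER_TASK
echo "Running with ${SLURM_CPUS_PER_TASK} vCPUs on $(hostname)"

exit 0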

4.2 LMOD

4.2.1 Cheat sheet

  • module list: List active modules in the user environment
  • module av [module]: List available modules in MODULEPATH
  • module spider [module]: Query all modules in MODULEPATH and any module hierarchy
  • module overview [module]: List all modules with count of each module
  • module load [module]: Load a module file in the user environment
  • module unload [module]: Remove a loaded module from the user environment
  • module purge: Remove all modules from the user environment
  • module swap [module1] [module2]: Replace module1 with module2
  • module show [module]: Show content of commands performed by loading module file
  • module --raw show [module]: Show raw content of module file
  • module help [module]: Show help for a given module
  • module whatis [module]: A brief description of the module, generally a single line
  • module savelist: List all user collections
  • module save [collection]: Save active modules in a user collection
  • module describe [collection]: Show content of user collection
  • module restore [collection]: Load modules from a collection
  • module disable [collection]: Disable a user collection
  • module --config: Show Lmod configuration
  • module use [-a] [path]: Prepend or append path to MODULEPATH
  • module unuse [path]: Remove path from MODULEPATH
  • module --show_hidden av: Show all available modules in MODULEPATH including hidden modules
  • module --show_hidden spider: Show all possible modules in MODULEPATH and module hierarchy including hidden modules
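
For example, to search the module hierarchy for modules whose name matches "r" (such as the R module loaded in the next section), one could type

module spider r

into the terminal.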

4.2.2 Create a new collection

  1. Load the desired modules
  2. Save the collection

4.2.2.1 Load the desired modules

To see which modules are available type

module av

into the terminal.

When using R on Linux, packages are compiled from source; therefore gcc and cmake are often needed. Based on the output of module av these modules can be loaded via,

module load gcc-12.2.0-gcc-11.3.0-bqzusfx
module load cmake-3.24.3-gcc-12.2.0-uzbdae4
module load r-4.3.0-t7suvif

4.2.2.2 Save the collection

Now we save the loaded modules in the collection R430,

module save R430

4.2.3 Load an existing collection

All the available collections can be listed via

module savelist

To load the previously created R430 collection type

module restore R430

into the terminal.
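
Afterwards the currently loaded modules can be checked by typing

module list

into the terminal.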

5 Examples

Examples for the SLURM and SGE cluster can be found on GitLab.

Additional SLURM examples can be found on the cluster at /opt/apps/wucluster-examples.