Software Modules

ARC uses the Lmod environment modules system to provide access to centrally installed (ARC-maintained) scientific software packages. Modules dynamically modify a user’s environment for an application or set of applications, which streamlines the management of software versions and dependencies.

The modules on ARC’s systems rely on EasyBuild for module deployment.

EasyBuild

Newer (2020 and later) ARC clusters use a module system mostly built around EasyBuild, a software build and installation framework that allows you to manage (scientific) software on High Performance Computing (HPC) systems in an efficient way. EasyBuild is maintained by a broad user community and makes it easier for ARC to provide stable, performant, and updated scientific software. It also makes it trivial in some cases for users to install their own versions of packages if they so desire.

Toolchains

EasyBuild is built around toolchains, which define the set of core dependencies, such as compiler, linear algebra library, and MPI implementation, used to build packages. There are two main toolchains:

  • foss (“Free and Open Source Software”): GCC compilers, OpenBLAS for linear algebra, OpenMPI for MPI, plus FFTW and ScaLAPACK

  • intel: Intel compilers, Intel MKL for linear algebra, Intel MPI

We have also supported other toolchains upon request, such as:

  • iomkl: Intel compilers, Intel MKL for linear algebra, and OpenMPI for MPI

  • gomkl: GCC compilers, Intel MKL for linear algebra, and OpenMPI for MPI

Please reach out if the toolchains that we provide are not what you need.

Toolchains are typically released twice per year (the a and b versions, e.g., 2020a and 2020b), and we try to keep up with those releases.

As an example, the modules active after loading the foss/2020b toolchain are (note that the first few modules in the list are defaults provided by ARC):

[arcuser@tinkercliffs2 ~]$ module reset; module load foss/2020b; module list
Resetting modules to system default

Currently Loaded Modules:
  1) shared                              8) useful_scripts                 15) XZ/5.2.5-GCCcore-10.2.0           22) PMIx/3.1.5-GCCcore-10.2.0
  2) slurm/20.02.3                       9) DefaultModules                 16) libxml2/2.9.10-GCCcore-10.2.0     23) OpenMPI/4.0.5-GCC-10.2.0
  3) apps                               10) GCCcore/10.2.0                 17) libpciaccess/0.16-GCCcore-10.2.0  24) OpenBLAS/0.3.12-GCC-10.2.0
  4) site/tinkercliffs/easybuild/setup  11) zlib/1.2.11-GCCcore-10.2.0     18) hwloc/2.2.0-GCCcore-10.2.0        25) gompi/2020b
  5) cray                               12) binutils/2.35-GCCcore-10.2.0   19) libevent/2.1.12-GCCcore-10.2.0    26) FFTW/3.3.8-gompi-2020b
  6) craype-x86-rome                    13) GCC/10.2.0                     20) UCX/1.9.0-GCCcore-10.2.0          27) ScaLAPACK/2.1.0-gompi-2020b
  7) craype-network-infiniband          14) numactl/2.0.13-GCCcore-10.2.0  21) libfabric/1.11.0-GCCcore-10.2.0   28) foss/2020b

We see here that lower-level software (e.g., binutils) is also included in the module structure, which reduces the risk of conflicts when new versions are added later.
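To see at a glance which loaded modules were built at which toolchain level, the module names can be filtered by their suffix. The following is a minimal sketch, not part of ARC's tooling: `modules_with_suffix` is a hypothetical helper name, and it assumes one module name per line on standard input, as produced by Lmod's terse listing.

```shell
# Hypothetical helper: print only the module names that end with the given
# toolchain suffix (for example "GCCcore-10.2.0" or "gompi-2020b").
# Reads one module name per line from standard input.
modules_with_suffix() {
    while IFS= read -r name; do
        case $name in
            *-"$1") printf '%s\n' "$name" ;;
        esac
    done
}

# Usage on a cluster (Lmod normally prints listings to standard error,
# hence the 2>&1):
#   module --terse list 2>&1 | modules_with_suffix GCCcore-10.2.0
```

With the foss/2020b listing above, filtering on GCCcore-10.2.0 would show the low-level packages such as zlib and binutils, while filtering on gompi-2020b would show FFTW and ScaLAPACK.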

Usage

In this section we describe how to use EasyBuild’s module system, with GROMACS as the example software. Although GROMACS is a popular software package on HPC systems, its executable gmx is not on the search path at login:

[arcuser@tinkercliffs2 ~]$ which gmx
/usr/bin/which: no gmx in (/apps/useful_scripts/bin:/cm/shared/apps/slurm/20.02.3/sbin:/cm/shared/apps/slurm/20.02.3/bin:/home/arcuser/util:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/usr/share/lmod/lmod/libexec)

To make it available, we need to load the GROMACS module. To search for a software package, you can use module spider. For example:

[arcuser@tinkercliffs2 ~]$ module spider gromacs

----------------------------------------------------------------------------------------------------------------------------------------------------------
  GROMACS:
----------------------------------------------------------------------------------------------------------------------------------------------------------
    Description:
      GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions
      of particles. This is a CPU only build, containing both MPI and threadMPI builds for both single and double precision. It also contains the gmxapi
      extension for the single precision MPI build. 

     Versions:
        GROMACS/2020.1-foss-2020a-Python-3.8.2
        GROMACS/2020.3-foss-2020a-Python-3.8.2

----------------------------------------------------------------------------------------------------------------------------------------------------------
  For detailed information about a specific "GROMACS" module (including how to load the modules) use the module's full name.
  For example:

     $ module spider GROMACS/2020.3-foss-2020a-Python-3.8.2
----------------------------------------------------------------------------------------------------------------------------------------------------------

Note

You can also use module avail to list all modules, although the output is quite long. We provide it here, in case it helps you find what you need.
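When the full module avail listing is too long to read, it can be filtered. The following is a minimal sketch, assuming Lmod's `--terse` option (one module name per line) and its usual behavior of writing listings to standard error; `search_modules` is a hypothetical helper name, not an ARC-provided command.

```shell
# Hypothetical helper: case-insensitive search through the available modules.
# "2>&1" is needed because Lmod normally writes its listings to standard error.
search_modules() {
    module --terse avail 2>&1 | grep -i -- "$1"
}

# Usage on a cluster:
#   search_modules gromacs
```

Unlike module spider, this only searches what module avail reports, so it is best treated as a quick filter rather than a replacement for spider.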

To then load the module, you can use module load:

[arcuser@tinkercliffs2 ~]$ module reset; module load GROMACS/2020.3-foss-2020a-Python-3.8.2
Resetting modules to system default

We can use module list to list the modules we have loaded now:

[arcuser@tinkercliffs2 ~]$ module list

Currently Loaded Modules:
  1) shared                             14) numactl/2.0.13-GCCcore-9.3.0     27) ncurses/6.2-GCCcore-9.3.0
  2) slurm/20.02.3                      15) XZ/5.2.5-GCCcore-9.3.0           28) libreadline/8.0-GCCcore-9.3.0
  3) apps                               16) libxml2/2.9.10-GCCcore-9.3.0     29) Tcl/8.6.10-GCCcore-9.3.0
  4) site/tinkercliffs/easybuild/setup  17) libpciaccess/0.16-GCCcore-9.3.0  30) SQLite/3.31.1-GCCcore-9.3.0
  5) cray                               18) hwloc/2.2.0-GCCcore-9.3.0        31) GMP/6.2.0-GCCcore-9.3.0
  6) craype-x86-rome                    19) UCX/1.8.0-GCCcore-9.3.0          32) libffi/3.3-GCCcore-9.3.0
  7) craype-network-infiniband          20) OpenMPI/4.0.3-GCC-9.3.0          33) Python/3.8.2-GCCcore-9.3.0
  8) useful_scripts                     21) OpenBLAS/0.3.9-GCC-9.3.0         34) pybind11/2.4.3-GCCcore-9.3.0-Python-3.8.2
  9) DefaultModules                     22) gompi/2020a                      35) SciPy-bundle/2020.03-foss-2020a-Python-3.8.2
 10) GCCcore/9.3.0                      23) FFTW/3.3.8-gompi-2020a           36) networkx/2.4-foss-2020a-Python-3.8.2
 11) zlib/1.2.11-GCCcore-9.3.0          24) ScaLAPACK/2.1.0-gompi-2020a      37) GROMACS/2020.3-foss-2020a-Python-3.8.2
 12) binutils/2.34-GCCcore-9.3.0        25) foss/2020a
 13) GCC/9.3.0                          26) bzip2/1.0.8-GCCcore-9.3.0

The system can now find the GROMACS gmx executable:

[arcuser@tinkercliffs2 ~]$ which gmx
/apps/easybuild/software/tinkercliffs-rome/GROMACS/2020.3-foss-2020a-Python-3.8.2/bin/gmx
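In a job script it can help to fail early if an expected executable is missing, rather than partway through a run. The following is a minimal sketch that uses the shell's built-in `command -v` in place of which; `require_cmd` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the command is on PATH, otherwise print a
# hint to standard error and return non-zero.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    printf '%s not found; did you load its module?\n' "$1" >&2
    return 1
}

# Usage after loading the GROMACS module:
#   require_cmd gmx || exit 1
```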

Finally, to clear out modules, we recommend using module reset, which will return the modules to their default state:

[arcuser@tinkercliffs2 ~]$ module reset; module list
Resetting modules to system default

Currently Loaded Modules:
  1) shared          3) apps                                5) cray              7) craype-network-infiniband   9) DefaultModules
  2) slurm/20.02.3   4) site/tinkercliffs/easybuild/setup   6) craype-x86-rome   8) useful_scripts

Warning

Do not use module purge. As you can see above, ARC includes a number of important packages, such as the Slurm scheduler, in the default modules. module purge removes those too, breaking key functionality. If you accidentally use module purge, simply use module reset to return to the defaults.
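One way to make scripts robust against a stray module purge, or against modules left over from an interactive session, is to always start from the defaults. The following is a minimal sketch; `load_from_defaults` is a hypothetical helper name.

```shell
# Hypothetical helper: return to ARC's default modules, then load the
# requested modules. Stops without loading anything if the reset fails.
load_from_defaults() {
    module reset || return 1
    module load "$@"
}

# Usage in a job script:
#   load_from_defaults GROMACS/2020.3-foss-2020a-Python-3.8.2
```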

Using EasyBuild to Build Your Own Software

EasyBuild can also be used by users to install packages. We describe the steps briefly below; see also our video tutorial on the subject.

The basic steps are:

  1. Load the EasyBuild module to get access to the eb executable:

    module reset; module load EasyBuild
    
  2. Use eb -S to search for the software package that you need (the output is quite long in this case so we only show a snippet):

    [arcuser@tinkercliffs2 ~]$ eb -S ^netCDF
     * $CFGS3/n/netCDF/netCDF-4.7.1-iimpi-2019b.eb
     * $CFGS3/n/netCDF/netCDF-4.7.1-iimpic-2019b.eb
     * $CFGS3/n/netCDF/netCDF-4.7.4-fix-mpi-info-f2c.patch
     * $CFGS3/n/netCDF/netCDF-4.7.4-gompi-2020a.eb
     * $CFGS3/n/netCDF/netCDF-4.7.4-gompi-2020b.eb
     * $CFGS3/n/netCDF/netCDF-4.7.4-gompic-2020a.eb
    
  3. Pick one of the versions and use eb -Dr filename.eb to see what it would do (the D is for “dry run”; the r enables dependency resolution). The [x] lines indicate packages that are already installed. The [ ] lines are packages that will need to be installed.

    [arcuser@tinkercliffs2 ~]$ eb -Dr netCDF-4.7.4-gompi-2020b.eb
    == Temporary log file in case of crash /localscratch/eb-ceKHhw/easybuild-asf_l0.log
    == found valid index for /apps/easybuild/software/tinkercliffs-rome/EasyBuild/4.4.0/easybuild/easyconfigs, so using it...
    == found valid index for /apps/easybuild/software/tinkercliffs-rome/EasyBuild/4.4.0/easybuild/easyconfigs, so using it...
    Dry run: printing build status of easyconfigs and dependencies
    CFGS=/apps/easybuild
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/M4/M4-1.4.18.eb (module: M4/1.4.18)
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/Bison/Bison-3.7.1.eb (module: Bison/3.7.1)
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/bzip2/bzip2-1.0.8-GCCcore-10.2.0.eb (module: bzip2/1.0.8-GCCcore-10.2.0)
     * [ ] $CFGS/software/tinkercliffs-rome/EasyBuild/4.4.0/easybuild/easyconfigs/l/libiconv/libiconv-1.16-GCCcore-10.2.0.eb (module: libiconv/1.16-GCCcore-10.2.0)
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/expat/expat-2.2.9-GCCcore-10.2.0.eb (module: expat/2.2.9-GCCcore-10.2.0)
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/CMake/CMake-3.18.4-GCCcore-10.2.0.eb (module: CMake/3.18.4-GCCcore-10.2.0)
     * [ ] $CFGS/software/tinkercliffs-rome/EasyBuild/4.4.0/easybuild/easyconfigs/d/Doxygen/Doxygen-1.8.20-GCCcore-10.2.0.eb (module: Doxygen/1.8.20-GCCcore-10.2.0)
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/libevent/libevent-2.1.12-GCCcore-10.2.0.eb (module: libevent/2.1.12-GCCcore-10.2.0)
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/numactl/numactl-2.0.13-GCCcore-10.2.0.eb (module: numactl/2.0.13-GCCcore-10.2.0)
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/OpenMPI/OpenMPI-4.0.5-GCC-10.2.0.eb (module: OpenMPI/4.0.5-GCC-10.2.0)
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/gompi/gompi-2020b.eb (module: gompi/2020b)
     * [x] $CFGS/ebfiles_repo/tinkercliffs-rome/HDF5/HDF5-1.10.7-gompi-2020b.eb (module: HDF5/1.10.7-gompi-2020b)
     * [ ] $CFGS/software/tinkercliffs-rome/EasyBuild/4.4.0/easybuild/easyconfigs/n/netCDF/netCDF-4.7.4-gompi-2020b.eb (module: netCDF/4.7.4-gompi-2020b)
    == Temporary log file(s) /localscratch/eb-ceKHhw/easybuild-asf_l0.log* have been removed.
    == Temporary directory /localscratch/eb-ceKHhw has been removed.
    
  4. If you are okay with installing the packages marked with [ ], you can install them with eb -r filename.eb (i.e., remove the D for “dry run” from the previous command):

    [arcuser@tinkercliffs2 ~]$ eb -r netCDF-4.7.4-gompi-2020b.eb
    == Temporary log file in case of crash /localscratch/eb-lsT7pO/easybuild-zdQblI.log
    == found valid index for /apps/easybuild/software/tinkercliffs-rome/EasyBuild/4.4.0/easybuild/easyconfigs, so using it...
    == found valid index for /apps/easybuild/software/tinkercliffs-rome/EasyBuild/4.4.0/easybuild/easyconfigs, so using it...
    == resolving dependencies ...
    == processing EasyBuild easyconfig /apps/easybuild/software/tinkercliffs-rome/EasyBuild/4.4.0/easybuild/easyconfigs/l/libiconv/libiconv-1.16-GCCcore-10.2.0.eb
    == building and installing libiconv/1.16-GCCcore-10.2.0...
    == fetching files...
    == creating build dir, resetting environment...
    == unpacking...
    == patching...
    == preparing...
    == configuring...
    == building...
    == testing...
    == installing...
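For long dry-run listings like the one in step 3, it can be convenient to show only what still has to be built. The following is a minimal sketch that keeps the `[ ]` lines and extracts the module name from the trailing `(module: ...)` field; `missing_modules` is a hypothetical helper name, and it assumes the dry-run output format shown above.

```shell
# Hypothetical helper: read "eb -Dr" output on standard input and print the
# module names of the entries that are not yet installed (marked "[ ]").
missing_modules() {
    grep -F -- '* [ ] ' |
        sed -n 's/.*(module: \(.*\))[[:space:]]*$/\1/p'
}

# Usage on a cluster (2>&1 in case any of the listing goes to standard error):
#   eb -Dr netCDF-4.7.4-gompi-2020b.eb 2>&1 | missing_modules
```

For the dry run above, this would list libiconv, Doxygen, and netCDF, the three entries marked [ ].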
    

This process can be time-consuming depending on the package, so it may be worth starting it in, e.g., a screen session. If the process ultimately completes with a line that looks like

== COMPLETED: Installation ended successfully

then you have successfully installed your software package. It can then be loaded from the module system like any other module. In this case, we would use

module reset; module load netCDF/4.7.4-gompi-2020b

where we get the module name by dropping the .eb extension and converting the first - in the file name to a / (this shortcut assumes the software name itself contains no hyphen), or by noting that EasyBuild printed the following during installation:

== building and installing netCDF/4.7.4-gompi-2020b...
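The completion check and the file-name-to-module-name rule described above can be sketched as two small shell functions. Both names (`build_succeeded`, `easyconfig_to_module`) and the log file name build.log are hypothetical. The conversion follows the "first hyphen" rule only, so it would give the wrong answer for software whose name contains a hyphen, such as SciPy-bundle; in that case the EasyBuild log line is the safer source.

```shell
# Hypothetical helper: succeed if a saved build log contains EasyBuild's
# success line.
build_succeeded() {
    grep -q -F -- '== COMPLETED: Installation ended successfully' "$1"
}

# Hypothetical helper: turn an easyconfig file name into a module name by
# dropping the directory and ".eb", then replacing the first "-" with "/".
easyconfig_to_module() {
    base=${1##*/}
    base=${base%.eb}
    case $base in
        *-*) printf '%s/%s\n' "${base%%-*}" "${base#*-}" ;;
        *)   return 1 ;;
    esac
}

# Usage on a cluster, e.g. inside a screen session so the build survives a
# dropped connection:
#   eb -r netCDF-4.7.4-gompi-2020b.eb 2>&1 | tee build.log
#   build_succeeded build.log &&
#       module load "$(easyconfig_to_module netCDF-4.7.4-gompi-2020b.eb)"
```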

Environment variables

Sometimes it is important to know where a software package is installed so that you can point to it when building other software. For this purpose, each EasyBuild module sets an environment variable of the form $EBROOTSOFTWARE, where SOFTWARE is the software name in upper case, that points to the installation directory. For example:

[arcuser@tinkercliffs2 ~]$ module reset; module load netCDF/4.7.4-gompi-2020a
Resetting modules to system default
[arcuser@tinkercliffs2 ~]$ ls $EBROOTNETCDF
bin  easybuild  include  lib64  share

So, to link against the netCDF libraries, one might use -L$EBROOTNETCDF/lib64.
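The same idea extends to header files. The following is a minimal sketch that builds both the include and the library flags from an installation root, assuming the include/ and lib64/ layout shown in the listing above; `root_flags` is a hypothetical helper name, and my_prog.c in the usage comment is a placeholder source file.

```shell
# Hypothetical helper: print compiler and linker search-path flags for a
# package root that contains include/ and lib64/ directories.
root_flags() {
    printf '%s\n' "-I$1/include -L$1/lib64"
}

# Usage on a cluster, after loading the netCDF module (-lnetcdf links the
# netCDF C library):
#   gcc my_prog.c $(root_flags "$EBROOTNETCDF") -lnetcdf -o my_prog
#
# List every EasyBuild root variable set by the currently loaded modules:
#   env | grep '^EBROOT'
```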