The NHR system
The NHR system consists of two independent systems named Lise (named after Lise Meitner) and Emmy (named after Emmy Noether). The systems are located at the Zuse Institute Berlin (Lise) and the University of Göttingen (Emmy). In total, the NHR system comprises 1270 compute nodes with 121,920 cores. You can learn more about the system and the differences between the sites on the NHR website.
System Lise at NHR@ZIB consists of two partitions: the CPU partition and the GPU partition. The GPU partition is still under development; specific documentation for power users is available on the GPU page.
Please log in to the gateway nodes using Secure Shell, `ssh` (protocol version 2); see the example below. The standard gateways are called
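A minimal login command might look as follows; the username and gateway hostname are placeholders to be replaced with your own account name and the actual gateway name:

```shell
# <user> and <gateway> are placeholders; substitute your username
# and the standard gateway hostname of the respective site
ssh <user>@<gateway>
```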
Please note that there is a per-user memory limit (currently 64 GByte) on the login nodes. Memory- and CPU-intensive tasks should be submitted as jobs to our Slurm batch system.
Login authentication is possible via SSH keys only. For information and instructions please see our SSH Pubkey tutorial.
File systems
- Home file system with 340 TiByte capacity
- Lustre parallel file system with 8.1 PiByte capacity, containing:
  - project data directories at `/scratch/projects/<projectID>/` (not yet available)
- Tape archive with 120 TiByte capacity (accessible on the login nodes only)
- On Emmy: SSD for temporary data at `$LOCAL_TMPDIR` (400 GB shared among all jobs running on the node)
Software and environment modules
The webpage Software gives you information about available software on the NHR systems.
NHR provides a number of compilers and software packages for parallel computing and (serial) pre- and postprocessing:
- Compilers: Intel, GNU
- Libraries: NetCDF, LAPACK, ScaLAPACK, BLAS, FFTW, ...
- Debuggers: Allinea DDT, Rogue Wave TotalView, ...
- Tools: Octave, Python, R, ...
- Visualisation: mostly tools to investigate gridded data sets from earth-system modelling
- Application software: mostly for engineering and chemistry (molecular dynamics)
Environment Modules are used to manage access to software and libraries. The `module` command offers the following functionality:
- Shows lists of available software
- Enables access to software in different versions
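For illustration, a typical session might use the `module` command like this (the module name `intel` is an assumption; check the output of `module avail` for the names actually installed):

```shell
module avail          # show lists of available software
module load intel     # enable access to the Intel compiler (default version)
module list           # show which modules are currently loaded
```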
To avoid conflicts between different compilers and compiler versions, builds of the most important libraries are provided for all compilers and major release numbers.
Here, only a brief introduction to program building using the Intel compiler is given. For more detailed instructions, including important compiler flags and special libraries, refer to our webpage Compilation Guide.
Examples for building a program on the Atos system
To build executables for the Atos system, call the standard compiler executables (icc, ifort, gcc, gfortran) directly.
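As a sketch, compiling a serial C or Fortran program directly with the Intel compilers could look like this (source and output file names are placeholders):

```shell
icc   -O2 -o hello hello.c     # C
ifort -O2 -o prog  prog.f90    # Fortran
```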
MPI, Communication Libraries, OpenMP
We provide several communication libraries:
- Intel MPI
As Intel MPI is the communication library recommended by the system vendor, currently only documentation for Intel MPI is provided, apart from application-specific documentation.
OpenMP support is available with the compilers from Intel and GNU.
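A brief sketch of building parallel programs: Intel MPI provides compiler wrappers, and OpenMP is enabled with a compiler flag (file names are placeholders; note that the flag spelling differs between the Intel and GNU compilers):

```shell
# MPI program, using the Intel MPI compiler wrappers
mpiicc   -O2 -o mpi_prog mpi_prog.c
mpiifort -O2 -o mpi_prog mpi_prog.f90

# OpenMP program
icc -qopenmp -o omp_prog omp_prog.c    # Intel
gcc -fopenmp -o omp_prog omp_prog.c    # GNU
```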
Using the batch system
To run your applications on the systems, you need to go through our batch system/scheduler: Slurm. The scheduler uses meta information about the job (requested node and core count, wall time, etc.) and then runs your program on the compute nodes, once the resources are available and your job is next in line. For a more in depth introduction, visit our Slurm documentation.
We distinguish two kinds of jobs:
- Interactive job execution
- Job script execution
Multiple flags can be used to request resources when submitting a job:

| Resource         | Flag                 | Default |
|------------------|----------------------|---------|
| # tasks          | `-n #`               | 1       |
| # nodes          | `-N #`               | 1       |
| # tasks per node | `--tasks-per-node #` |         |
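For example, the flags above can be combined in a single submission (the program name is a placeholder):

```shell
srun -N 2 --tasks-per-node 4 ./myprog   # 2 nodes, 4 tasks per node (8 tasks total)
```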
For using compute resources interactively, e.g. to follow the execution of MPI programs, the following steps are required. Note that non-interactive batch jobs via job scripts (see below) are the primary way of using the compute resources.
- First, request a resource allocation for interactive usage with the `salloc --interactive` command, which should also include your resource requirements.
- Once `salloc` has successfully allocated the requested resources, at the Göttingen complex (Emmy) you are automatically logged in to one of the allocated compute nodes. In Berlin (Lise), you have to issue an additional `srun` command (see the example below) if you want to work on one of the allocated compute nodes.
- `srun` or MPI launch commands such as `mpiexec` can be used to start parallel programs (see the respective user guides).
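Putting these steps together, an interactive session on Lise might be started as follows (node count and time limit are example values; the program name is a placeholder):

```shell
# Request an interactive allocation of 1 node for 30 minutes
salloc --interactive -N 1 -t 00:30:00

# On Lise only: open a shell on one of the allocated compute nodes
srun --pty bash

# Inside the allocation, start a parallel program
mpiexec ./mpi_prog
```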
Please go to our webpage MPI, OpenMP start Guide for more details about job scripts. As an introduction, standard batch jobs are executed by following these steps:
- Provide (write) a batch job script, see the examples below.
- Submit the job script with the command `sbatch`.
- Monitor and control the job execution, e.g. with the commands `squeue` (inspect the job status) and `scancel` (cancel the job).
A job script is a shell script (e.g. in csh syntax) containing Slurm keywords, which are used as arguments for the `sbatch` command.
Requesting 4 nodes in the medium partition with 96 cores (no hyperthreading) for 10 minutes, using Intel MPI.
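A job script for this request might look as follows; the `#SBATCH` keywords mirror the resource request above, while the module name, the program name, and the use of bash are assumptions for illustration:

```shell
#!/bin/bash
#SBATCH --partition=medium        # medium partition
#SBATCH --nodes=4                 # 4 nodes
#SBATCH --ntasks-per-node=96      # 96 MPI tasks per node, no hyperthreading
#SBATCH --time=00:10:00           # 10 minutes wall time

module load impi                  # module name is an assumption
mpirun ./mpi_prog                 # program name is a placeholder
```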
Requesting 1 large node with 96 CPUs (physical cores) for 20 minutes, and then using 192 hyperthreads
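A corresponding sketch for this second request; the partition name, program name, and use of bash are assumptions:

```shell
#!/bin/bash
#SBATCH --partition=large         # partition name is an assumption
#SBATCH --nodes=1                 # 1 node
#SBATCH --cpus-per-task=192       # 96 physical cores, 2 hyperthreads each
#SBATCH --time=00:20:00           # 20 minutes wall time

export OMP_NUM_THREADS=192        # use all 192 hyperthreads
./omp_prog                        # program name is a placeholder
```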
The webpage Accounting and NPL gives you more information about job accounting.
Every batch job on Lise and Emmy is accounted. The account (project) that is debited for a batch job can be specified using the option `--account <account>`. If a batch job does not state an account (project), a default is taken from the account database. It defaults to the personal project of the user, which has the same name as the user. Users may modify their default project by visiting the Service portal.
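For example, to charge a specific project when submitting a job (the project and script names are placeholders):

```shell
sbatch --account myproject job.sh   # 'myproject' and 'job.sh' are placeholders
```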
For questions, please contact the support crew email@example.com.