
Orange.intersect.org.au Handbook

The following information is also available here as a PDF.

  1. Orange Facility - Technical Specifications
  2. Connecting to Orange
  3. Transferring files to and from Orange
  4. Setting up software environments
  5. Compiling code
  6. Running jobs
  7. Tricks to PBSPro and Orange
  8. Disks

 

All applications for merit-based resource subsidies for HPC, or 'Time', are made through NCI's forms.

See how to apply screen-by-screen here

An application must be made by academic staff at Intersect member institutions. We welcome PhD students to make use of the facilities, but require that the lead CI on an application be an academic staff member.

New project applications:    https://my.nci.org.au/mancini/login?next=/mancini/project/propose/

         ***In Step 5, please select "INTERSECT (NSW)" as Scheme/Partner***

Create a new ID at: https://my.nci.org.au/mancini/signup/0

To be added to an existing project, use this form: https://my.nci.org.au/mancini/project/

1.  Orange Facility - Technical Specifications

The SGI 30+ TFlop distributed-memory cluster features 103 cluster nodes with 1660 cores powered by the Intel® Xeon® E5-2600 processor series. It has 200 TB of local scratch disk space and 101 TB of usable high-speed shared storage in a parallel file system.

The SGI HPC cluster comprises 13 large-memory (256 GB) compute nodes and 90 normal (64 GB) compute nodes with Intel Xeon E5-2600 8-core processors. System software provided includes SGI Management Center, SGI Performance Suite, the PBS Pro scheduler and the SUSE® Linux Enterprise Server operating system.

The clusters are connected with QDR InfiniBand® Non-blocking Interconnect technology.  A single administration node and a system console are also provided. Storage capabilities consist of an SGI NAS Storage Server and a Panasas® ActiveStor™ PAS-12 parallel file system.

2.  Connecting to Orange           

Linux/Mac: ssh orange.intersect.org.au or ssh -X orange.intersect.org.au         

Windows: use an SSH client such as PuTTY.
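For example, to log in with X11 forwarding enabled (jsmith is just a placeholder; use your own Orange username):

ssh -X jsmith@orange.intersect.org.au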

3.  Transferring files to and from Orange      

Linux/Mac: use sftp or scp        

Windows: use an SFTP client such as FileZilla.

Files must be transferred through a secure protocol.
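As a sketch, copying files with scp could look like this (jsmith, input.dat and results.dat are placeholders):

# copy an input file from your workstation to your home directory on Orange
scp input.dat jsmith@orange.intersect.org.au:~/
# copy a result file from Orange back to the current directory on your workstation
scp jsmith@orange.intersect.org.au:~/results.dat .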

4.  Setting up software environments

To set up the environment for a software package, you need to use the module system. The module system is the same on Orange and on NCI's Raijin, but the module names may differ. To see the exact module names, visit the NCI software list or use the command:

module avail

To load, for example, the latest Intel compilers on Orange, you would use the command:

module load intel-tools-15

This sets up your environment (variables and paths). The module system exists to allow different versions of the same software package to coexist.

Other useful module commands include:      

A list of already loaded modules: module list       

To show you what a module does: module show package-name

To unload a package: module unload <package>
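A typical module session might look like this (using intel-tools-15 as the example package):

module load intel-tools-15      # load the Intel compilers and MKL
module list                     # check which modules are currently loaded
module show intel-tools-15      # see which variables and paths the module sets
module unload intel-tools-15    # remove the package from your environment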

If you would like to have software installed on Orange, please email time@intersect.org.au and specify the download site, or download the software yourself and let us know where to find it in your home directory. You are also welcome to install software yourself in your home directory.

For software requests at NCI please use the following form:  http://nf.nci.org.au/accounts/forms/software.php

5.  Compiling code          

The GNU and Intel compilers are installed on Orange.        

GNU: gcc, g++ and gfortran. Version 4.3.4 is the system default. A newer version (4.9.1) is installed in /apps/gcc. It is experimental and not supported, but feel free to try it. This newer version can make use of the vector instructions of the Sandy Bridge processors (for GCC via -mavx; the Intel compilers use -xAVX). We nevertheless advise using the Intel compilers; see below.

For the Intel compilers run:          

module load  intel-tools-15

to load version 15 of the C and Fortran compilers and set up the Intel Math Kernel Library (MKL).

After that you can use icc, icpc and ifort.

For a complete list of modules use the command:        

module avail

For Message Passing Interface (MPI) libraries such as OpenMPI, check the available modules with:

module avail openmpi

For the Intel compilers load SGI's Message Passing Toolkit (MPT):

module load sgi-mpt/2.11

To get the best performance out of the Sandy Bridge processors, use the Intel compilers with the option:

-xHost

which generates code for the AVX vector units.
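As an example, a serial program could be compiled like this (myprog.c and myprog.f90 are placeholder source files):

module load intel-tools-15
icc   -O2 -xHost myprog.c   -o myprog     # C
ifort -O2 -xHost myprog.f90 -o myprog     # Fortran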

For MPI we suggest using MPT, SGI's MPI implementation based on MPICH. Use the command

module load sgi-mpt/2.11

which sets up the environment. Compiler wrappers such as mpicc and mpif90 are supplied with MPT.

You can also use Intel MPI, which is more complicated to use. Load the corresponding module to set up that environment.

OpenMPI and MPICH 2 and 3 are also installed under /apps. They deliver lower performance but are compatible with many programs. These MPIs come without warranty; only MPT is supported.

If you use MPT, the default is to use the GNU compilers. To override this and use the Intel compilers instead, set the following environment variables:

export MPICC_CC=icc

export MPIF90_F90=ifort
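Putting this together, compiling an MPI program with MPT and the Intel compilers could look like the following sketch (mympi.c and mympi.f90 are placeholder sources):

module load intel-tools-15
module load sgi-mpt/2.11
export MPICC_CC=icc        # make the mpicc wrapper call the Intel C compiler
export MPIF90_F90=ifort    # make the mpif90 wrapper call the Intel Fortran compiler
mpicc  -O2 -xHost mympi.c   -o mympi
mpif90 -O2 -xHost mympi.f90 -o mympi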

Code compiled this way should be started in batch jobs with MPT's mpiexec_mpt:

mpiexec_mpt -np $NPROC ./myprog

Note that this command is only available in batch jobs. If you test your code on the login node, use mpirun instead.

It is also advisable to pin MPI processes to cores by setting:

export MPI_DSM_DISTRIBUTE=1

6.  Running jobs

All jobs have to go through the batch queue. No program should run on the login node, except short tests and compilation. Orange uses PBS Pro as its scheduler.

For full details see the PBS Pro User Guide

Here is a sample script:  

#!/bin/bash -login

#PBS -N my-jobname

#PBS -W group_list=project-code

#PBS -q workq

#PBS -l select=2:ncpus=16:mem=60G:mpiprocs=16,walltime=03:00:00

ulimit -s unlimited

cd $PBS_O_WORKDIR

source /usr/share/modules/init/bash

module load  intel-tools-15

module load sgi-mpt/2.11

mpiexec_mpt -n 32 ./myprog

This script asks for 2 nodes with 16 cores each, 60GB memory in each node and a wall-time of 3 hours.

It sets the stack to the maximum size and changes into the directory the job was submitted from. It initialises the module system for bash and loads the Intel compilers and MPT. Then it runs the program with mpiexec_mpt on 32 cores.

There is a wall-time limit of 200 hours. Within your approved allocation there is no maximum on the number of cores per job or on the number of jobs you run.
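Jobs are submitted and monitored with the standard PBS Pro commands, for example (myjob.pbs is a placeholder script name):

qsub myjob.pbs      # submit the script; prints the job ID
qstat -u $USER      # list your queued and running jobs
qdel <job-id>       # remove a job from the queue if necessary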

Another example using MPT would be:

#!/bin/bash

#PBS -l select=4:ncpus=16:mem=60G:mpiprocs=16

#PBS -l walltime=03:00:00

#PBS -W group_list=project-code

source /usr/share/modules/init/bash

module load intel-tools-15

module load sgi-mpt/2.11

NPROC=`cat $PBS_NODEFILE |wc -l`

export MPI_DSM_DISTRIBUTE=1

mpiexec_mpt -np $NPROC ./myprog

A more complex example, running VASP with MPT, would be:

#!/bin/bash

#PBS -W group_list=project-code

#PBS -q workq

#PBS -l select=8:ncpus=16:mpiprocs=16:ompthreads=1

#PBS -l walltime=200:00:00

source /usr/share/modules/init/bash

module load intel-fc-12 intel-cmkl-15 sgi-mpt/2.07

export MPI_HOME=/apps/sgi-mpt/2.07

export MPI_DISPLAY_SETTING=S

export MPI_USE_IB=

export MPI_VERBOSE=

export MPI_DSM_VERBOSE=

export MPI_DEFAULT_SINGLE_COPY_OFF=

export MPI_IB_RAILS=1

export MPI_DSM_DISTRIBUTE=

export OMP_NUM_THREADS=1

export MKL_SERIAL=yes

ulimit -c 0

ulimit -s unlimited

cd $PBS_O_WORKDIR

mpiexec_mpt -v -n 128 vasp

7.  Tricks to PBSPro and Orange

If you used ANUPBS before and relied on JOBFS in your scripts, you need a workaround in PBS Pro to create local, temporary storage for your jobs. This is how you should handle high disk I/O:

#!/bin/bash -login

#PBS -N myPBSjob

#PBS -W group_list=project-code

#PBS -q workq

cd $PBS_O_WORKDIR

#Create temporary disk dir on local disk

JOBFS2=$(mktemp -d --tmpdir=/data2)

#copy input file to that location

cp file.in "${JOBFS2}"

cd "${JOBFS2}"

/apps/application/prog.exe

#Copy result back

cp result "${PBS_O_WORKDIR}"

#Clean up

rm -rf "${JOBFS2}"
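If prog.exe can fail partway through, you can register the clean-up with a trap so the temporary directory is removed even on an error. A minimal sketch using the same placeholders as above:

JOBFS2=$(mktemp -d --tmpdir=/data2)
trap 'rm -rf "${JOBFS2}"' EXIT    # clean up automatically when the script exits
cp file.in "${JOBFS2}"
cd "${JOBFS2}"
/apps/application/prog.exe
cp result "${PBS_O_WORKDIR}"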

Also note that NCI uses a heavily modified version of PBS Pro in which JOBFS is available. For details, have a look at the Raijin handbook.

8.  Disks

Each user gets a /home disk quota of 60 GB on the SGI NAS.
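To see how much of your home directory quota you are using, a generic check with du works (Orange may also provide a dedicated quota command; this is just a sketch):

du -sh $HOME      # total size of your home directory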

Projects that requested large disk space during the resource allocation round get space in /projects/project-name, with a corresponding quota for the project. This space is on the Panasas® ActiveStor™ PAS-12 parallel file system.

Each job can use the local scratch disks. Files will be deleted automatically after the job finishes.

Disk Performance:

/projects resides on a parallel file system which is faster than /home.

If the job needs high disk I/O, it should use the local scratch disks in the nodes. Do not use /home in this case, as this will slow down the whole cluster.

For help, contact time@intersect.org.au.