The HPCC maintains a large number of computing resources designed to support research.
In general, users write their programs and then submit a request (a "job") to the scheduling system to run them on the cluster. The job request specifies how much time and how many computing resources will be needed. Because these resources are shared, users or programs that overutilize the system and cause nodes to become unresponsive may be terminated without prior notice.
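As a concrete illustration of the time and resource request described above, here is a minimal sketch of a batch job script, assuming a SLURM-style scheduler. The resource values, the script name `myjob.sb`, and the executable `my_program` are placeholders; consult the HPCC documentation for the options supported on the cluster.

```shell
# Write an example job script; every value below is illustrative only.
cat > myjob.sb <<'EOF'
#!/bin/bash
#SBATCH --time=02:00:00        # wall-clock time requested from the scheduler
#SBATCH --nodes=1              # number of compute nodes
#SBATCH --ntasks=1             # number of tasks (processes)
#SBATCH --cpus-per-task=4      # CPU cores per task
#SBATCH --mem=8G               # memory per node
#SBATCH --job-name=example     # name shown in the queue

srun ./my_program              # my_program stands in for your executable
EOF

# The script would then be submitted with: sbatch myjob.sb
grep -c '^#SBATCH' myjob.sb    # counts the resource directives above; prints 6
```

The `#SBATCH` directives are how the job tells the scheduler its time and resource needs; jobs that exceed what they requested are subject to termination, as noted above.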
ICER’s HPC clusters include 900 compute nodes with more than 23,000 cores. The clusters contain 7 NVIDIA Volta V100 GPUs, 40 other NVIDIA GPUs, and 14 nodes equipped with Xeon Phi coprocessors, all linked together by a high-throughput, low-latency InfiniBand network. The HPCC provides persistent storage on ZFS with 6.5 PB of total capacity, plus a high-speed Lustre file system with 1.9 PB of temporary storage. See the User Documentation for more information.
For each hardware type in the compute cluster, a single node is set aside for software development and testing. Users can connect directly to these development nodes via SSH through the HPCC gateway. Because these nodes are shared resources, a program may run for at most two CPU hours before being terminated; longer-running work should be submitted to the cluster as a scheduled job.
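The connection path described above can be sketched as the following session. The gateway address `hpcc.msu.edu` matches ICER's public documentation, but the development node name `dev-intel18` is only an example; the current list of development nodes is in the User Documentation.

```shell
# Hypothetical session; replace "username" and the node name with your own.
ssh username@hpcc.msu.edu   # first, SSH to the HPCC gateway
ssh dev-intel18             # then hop from the gateway to a development node
```

Once on a development node, code can be compiled and tested interactively, keeping in mind the two-CPU-hour limit noted above.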