CUI, Restricted Data Projects

Overview

The Controlled Unclassified Information (CUI) cluster is a CPU and GPU system built through a partnership between ARC and the Hume Center. CUI came online in 2021 with 15 nodes, 960 CPU cores, 12 TB of RAM, and 24 NVIDIA A100 GPUs.

Technical details are below:

Node type             CPU                GPU
Manufacturer          HPE                HPE Apollo
Chassis               HPE XL225n         HPE Apollo 6500 Gen10 Plus
Chip                  AMD EPYC 7542      AMD EPYC 7542
Nodes                 12                 3
Cores/Node            64                 64
GPU Model             -                  NVIDIA Ampere A100-80GB
GPUs/Node             -                  8
Memory (GB)/Node      512                2,048
Total Cores           768                192
Total Memory (GB)     6,144              6,144
Local Disk            240 GB SSD         240 GB SSD
Interconnect          HDR-100 IB         HDR-100 IB

A VAST flash storage system with 656 TB of capacity is attached to provide network-based storage for the cluster.

Access

The CUI system is set up to host projects which require some computational scale but are subject to controlled access restrictions such as the International Traffic in Arms Regulations (ITAR). Access to the CUI system requires a technology control plan (TCP) established with the Office of Export and Secure Research Compliance (OESRC) and consultation with ARC personnel to set up access and provide instructions for use.

Networks from which CUI is accessible

The login node for the CUI system, cui1.arc.vt.edu, accepts connections only from secured hosts on VT networks. Connections from the VT VPN are not allowed, since VPN sessions can originate from arbitrary locations.

If you do not have access to a secured host on a VT network, you will likely need to connect from OESRC’s COMPASS system. COMPASS is also accessible from off-campus locations within the US by first connecting to OESRC’s Barracuda VPN.
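
From an allowed host, connecting to the login node is a standard SSH session. As an illustrative example (the username shown is a placeholder for your VT username):

    # Connect to the CUI login node; replace "yourpid" with your VT username.
    # This will only succeed from a secured host on a VT network or from COMPASS.
    ssh yourpid@cui1.arc.vt.edu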

Running jobs

The CUI cluster uses SLURM for resource management and job scheduling. One key difference from other ARC clusters is that users’ jobs run under the shared SLURM account “cui” (e.g., #SBATCH --account=cui) instead of the personalized accounts you may use elsewhere.
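
As a minimal sketch of a batch script for CUI, only the --account=cui line comes from this documentation; the resource requests and program name below are illustrative assumptions to adjust for your own job:

    #!/bin/bash
    #SBATCH --account=cui        # required on CUI: the shared "cui" SLURM account
    #SBATCH --nodes=1            # illustrative resource requests; adjust as needed
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=4
    #SBATCH --time=01:00:00      # one hour of walltime (illustrative)
    ##SBATCH --gres=gpu:1       # uncomment to request one A100 on a GPU node

    # Replace with your own application; "my_program" is a placeholder.
    ./my_program

The script is submitted with the standard SLURM command, e.g. sbatch myjob.sh.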

Maintenance Downtimes

Patching and updates that resolve security vulnerabilities are a high priority for this cluster, so it may be subject to maintenance and security downtimes with little or no advance notice.