ARC System Changes: 2026-05
Open OnDemand: ood.arc.vt.edu system, firmware, and software updates
Hosted LLMs: llm.arc.vt.edu authentication framework change
Cluster changes (Tinkercliffs, Owl, Falcon):
update to Slurm version 25.11 (required job queues to be dropped)
cluster manager software updates
updates to correctly auto-drain nodes when GPUs report ECC errors
Configuration changes:
tweak to reduce the number of nodes drained after some jobs that encounter a timeout
increase throughput of backfill scheduler cycle
introduce scheduling tweaks to reduce wait time for interactive jobs launched via OnDemand or interact
update job priority weights to better balance job age, job size, and fairshare and improve capacity for backfill scheduling
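For readers unfamiliar with how priority weights interact, the sketch below illustrates the weighted-sum idea behind Slurm's multifactor priority plugin, reduced to the three factors named above. It is an illustration only: the class, the function name, and every weight value are hypothetical and are not ARC's actual settings, and the real calculation includes additional terms (for example partition and QOS).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriorityWeights:
    """Hypothetical weights; NOT the values configured on ARC clusters."""
    age: int
    job_size: int
    fairshare: int


def job_priority(weights: PriorityWeights,
                 age_factor: float,
                 job_size_factor: float,
                 fairshare_factor: float) -> int:
    """Simplified weighted sum over three factors, each in [0.0, 1.0]."""
    factors = {
        "age_factor": age_factor,
        "job_size_factor": job_size_factor,
        "fairshare_factor": fairshare_factor,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    total = (weights.age * age_factor
             + weights.job_size * job_size_factor
             + weights.fairshare * fairshare_factor)
    return round(total)


if __name__ == "__main__":
    # Example weights chosen only to show the arithmetic.
    example = PriorityWeights(age=1000, job_size=500, fairshare=2000)
    print(job_priority(example, age_factor=0.5,
                       job_size_factor=0.2, fairshare_factor=0.25))
```

Raising one weight relative to the others makes that factor dominate the ordering of pending jobs, which is why rebalancing the weights changes how long different kinds of jobs wait.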
Storage systems:
VAST (/scratch, /apps), Qumulo (/home), ESS (/projects) storage system updates