ARC Storage Changes - Spring 2022
During spring break in March 2022, ARC brought online a new IBM ESS storage system running IBM's GPFS software to take over the duties of hosting /projects storage for the Tinkercliffs and Infer clusters and to synchronize the data from the old system. /projects was previously hosted by a Dell storage system running BeeGFS, which had been plagued by frequent outages since it came online in fall 2020.
Outline of Impacts
Impact on /projects on Tinkercliffs and Infer
GPFS replaced BeeGFS
/projects remains available but is now hosted on a new platform
aggregate space available on the hosting system is reduced, but is sufficient to host current needs
quotas are tallied differently, but more intuitively
beegfs-ctl --getquota ... commands are no longer valid because they are specific to BeeGFS (see the usage-check sketch below)
a few groups will find they are now over the quota and will need to address this
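Since the BeeGFS-specific quota commands no longer apply, one generic way to check how much a /projects directory is using is to total the file sizes directly. This is only a sketch: the group name below is a placeholder, and ARC may also provide its own quota-reporting tools on the new system.

    # Total size of everything under a /projects directory; this matches how
    # GPFS quota usage is now tallied (aggregate size of all files).
    du -sh /projects/mygroup

    # Break usage down by top-level subdirectory to find the largest consumers.
    du -h --max-depth=1 /projects/mygroup | sort -h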
Impact on /work on Tinkercliffs and Infer
Previously hosted by BeeGFS as 1 TB of personal scratch storage. This mount is being discontinued immediately.
The /work mount point is empty and the $WORK environment variable points to an invalid path, so job scripts reliant on these need to be changed.
/work data will remain temporarily available with read-only access at /beegfs/work on the login nodes only until Monday, March 28, 2022. Users must migrate any data they need to keep from there to other appropriate file systems prior to March 28, 2022.
ARC Begins a “Use Scratch for Jobs” campaign
/fastscratch and local scratch storage provide superior performance and scale for in-job storage operations (input/output, or I/O). Regular cleanup of files will commence in /fastscratch. Files older than 90 days will be purged starting on ______ (date TBA).
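To see which of your files would be candidates for that purge, something like the following can be run ahead of time. This is a sketch only: the per-user layout /fastscratch/$USER is an assumption about how your area is named.

    # List files under your /fastscratch area that have not been modified
    # in more than 90 days (the assumed purge criterion).
    find /fastscratch/$USER -type f -mtime +90 -print

    # Summarize how much space those files occupy (GNU du).
    find /fastscratch/$USER -type f -mtime +90 -print0 | du -ch --files0-from=- | tail -n 1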
Cascades, Dragonstooth, Huckleberry unaffected at this time
/work and /groups on these clusters are hosted by a pre-existing GPFS storage system which remains in production. That GPFS filesystem is, however, reaching its end-of-life, and groups should begin the work of archiving old data and migrating active projects to new systems.
Huckleberry will be decommissioned at the end of the Spring 2022 semester.
Cascades and Dragonstooth clusters do not yet have decommission dates established, but are quickly approaching their end-of-life as well.
Actions you may need to take
Get within quota on /projects
If you are over quota on your /projects directory, everyone in your group will be unable to write to that directory. While the quota limits have not changed, the method of calculating usage has. Quota usage on /projects is now computed as the aggregate size of all files in the directory. Previously, some files in the directory were not counted toward usage if their group ownership was inconsistent with the directory. This is no longer the case, and that change may put some groups over their quota. To get under the quota, you need to reduce the total usage in that directory. Here are some approaches to consider:
delete unneeded data
archive data which is not being actively used
Use /fastscratch on Tinkercliffs to hold short-term working data (90 day limit)
Store large datasets in compressed archive files and only extract the data when it’s needed (this can also provide a big performance boost when paired with use of /localscratch).
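As a sketch of the last two approaches above (the group name and the /localscratch layout are placeholders, not ARC-prescribed paths), a dataset can be kept in /projects as a single compressed archive and extracted to node-local scratch only when a job needs it:

    # Compress a dataset directory into a single archive, then remove the
    # uncompressed copy; this reduces the usage counted against the quota.
    tar -czf /projects/mygroup/dataset.tar.gz -C /projects/mygroup dataset/
    rm -rf /projects/mygroup/dataset/

    # Inside a job, extract the archive to node-local scratch for fast I/O.
    mkdir -p /localscratch/$USER
    tar -xzf /projects/mygroup/dataset.tar.gz -C /localscratch/$USER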
Retrieve data that was previously in /work
If you were actively using /work or need to retain files/data from /work, then you have a short window of time to retrieve that data before the hosting BeeGFS system is taken offline. Those files are available through March 28, 2022 on Tinkercliffs and Infer login nodes at /beegfs/work.
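A minimal sketch of the copy, assuming a personal directory under /beegfs/work and a group directory under /projects (both names below are placeholders):

    # Run on a Tinkercliffs or Infer login node before March 28, 2022.
    # Copy a directory out of the read-only /beegfs/work area into /projects.
    rsync -av /beegfs/work/myusername/results/ /projects/mygroup/results/

    # Spot-check that the copy is complete before depending on it.
    diff -r /beegfs/work/myusername/results /projects/mygroup/results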
Change job scripts so that the /work path and $WORK environment variable are not used
Job scripts for the Tinkercliffs or Infer clusters which reference or otherwise rely on /work/ paths or the $WORK environment variable need to be adjusted to use different filesystems.
Note
The /work mount point is no longer available and the $WORK environment variable now expands to the invalid path /notavailable.
Note
For a limited time, /beegfs/work is available read-only so that data can be retrieved to other locations before BeeGFS is taken fully offline.
We recommend using /fastscratch on Tinkercliffs as a working space for staging jobs, but note that files which are unmodified for 90 days will be automatically deleted from this location. For longer-term storage, use a /projects storage location, or remove/archive files so they no longer occupy ARC's active storage filesystems.
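Below is a minimal sketch of the kind of change a job script might need, assuming a per-user staging area under /fastscratch; the paths and resource requests are placeholders rather than ARC-prescribed values.

    #!/bin/bash
    #SBATCH --job-name=example
    #SBATCH --nodes=1
    #SBATCH --time=01:00:00

    # Old pattern (no longer valid): cd $WORK
    # $WORK now expands to the invalid path /notavailable.

    # New pattern: stage the job in /fastscratch instead.
    SCRATCHDIR=/fastscratch/$USER/$SLURM_JOB_ID
    mkdir -p "$SCRATCHDIR"
    cd "$SCRATCHDIR"

    # ... run the application here, writing output into $SCRATCHDIR ...

    # Copy results you want to keep back to /projects before the job ends,
    # since unmodified /fastscratch files are purged after 90 days.
    cp -r "$SCRATCHDIR" /projects/mygroup/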
Get Help
Submit an ARC Helpdesk ticket: https://arc.vt.edu/help
Come to ARC Office Hours via Zoom: https://arc.vt.edu/office-hours