(Image shows growth of hard drive capacity from 1980-2010. Hankwang, Wikimedia Commons, CC 3.0 SA.)

Between December 17 and 19, during a planned maintenance outage, the scheme for providing data storage space to users of the Duke Compute Cluster will change. The new scheme increases the storage capacity available to researchers and improves the stability and availability of cluster resources for groups. The change is an important part of the maintenance planned for that outage.

The current storage scheme was established over fifteen years ago, and it has shown its age. Some deficiencies we’re aiming to fix:

  • The 250 GB standard capacity does not reflect the realities of much data-driven science and scholarship
  • The provisioning of individual “home directories” within shared group space has led to lock-outs of entire groups if a single cluster user inadvertently consumes all of the storage space allocated to the group
  • Deprovisioning accounts of former group members has been difficult for “points-of-contact” (POCs) and PIs
  • Changes of group membership of individuals (graduate students rotating through labs, for example) have been cumbersome
  • Cluster groups with many members have been pinched by having to share 250 GB of space, while small groups or “groups” of one have often had too much space at hand

The storage scheme will be changed in the following manner:

  • Groups will be granted 1 TB of storage capacity that can be shared by members of the group. This is four times the current allocation. Storage above the 1 TB allocation will be available for labs to purchase.
  • Individual user home directories will no longer be located in a group’s shared storage, where they consumed group storage resources. Instead, each individual user of the cluster will be granted 10 GB in a separate home directory. This is more capacity than is usual on many clusters. If users inadvertently fill their home directory, only they will be affected (they will be unable to log in); the rest of their group will not be locked out.
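
For users who want to keep an eye on the 10 GB home directory allocation, the check can be sketched roughly as follows. This is a minimal illustration, not an official tool: it assumes the allocation is counted in decimal gigabytes and that summing file sizes approximates how the cluster accounts for usage, neither of which is confirmed here. The names `directory_size` and `usage_fraction` are made up for this example.

```python
import os

# Assumption: 10 GB means 10 * 10**9 bytes; the cluster may count differently.
QUOTA_BYTES = 10 * 10**9


def directory_size(root):
    """Sum the sizes in bytes of all regular files and links under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so link targets are not counted.
                total += os.lstat(full).st_size
            except OSError:
                # Skip files that vanish or cannot be read during the walk.
                continue
    return total


def usage_fraction(used_bytes, quota_bytes=QUOTA_BYTES):
    """Return the share of the quota in use, as a number between 0 and 1+."""
    return used_bytes / quota_bytes


if __name__ == "__main__":
    used = directory_size(os.path.expanduser("~"))
    print(f"{used / 10**9:.2f} GB used "
          f"({usage_fraction(used):.0%} of an assumed 10 GB allocation)")
```

The figure reported this way may differ from what the storage system itself reports, so treat it as an early warning rather than an exact measurement.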

These changes will eliminate many of the deficiencies of the old setup and will give Duke researchers more flexibility in managing cluster accounts. However, some scripts may need to be updated, especially if they contain hard-coded paths (“hard paths”) that point to the old home directory locations. Also, scripts and data stored in home directories will not be available to users’ lab groups.
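
One way to find scripts that may need updating is to search them for the old directory prefix. The sketch below is only an illustration: the prefix `/example/old/group/home` is a placeholder (the actual old and new paths are not given in this announcement), and the function name `find_hard_paths` and the list of file extensions are assumptions made for the example.

```python
from pathlib import Path

# Hypothetical placeholder; substitute the real old prefix once it is announced.
OLD_PREFIX = "/example/old/group/home"


def find_hard_paths(root, prefix, suffixes=(".sh", ".py", ".R", ".slurm")):
    """Return (file, line number, line text) for each line that contains prefix."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only look at regular files with a script-like extension.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            # Skip unreadable files rather than aborting the scan.
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if prefix in line:
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for file, lineno, line in find_hard_paths(".", OLD_PREFIX):
        print(f"{file}:{lineno}: {line}")
```

Run from the top of a directory of scripts, this prints each matching line with its file name and line number, which gives a list of places to review by hand. It reports matches only and does not change any files.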

PIs, lab members, and POCs will need to emphasize the importance of storing lab data in their group’s shared storage space, since data in individual home directories will not be available to other group members. In addition, when cluster users leave the University or are deprovisioned on the cluster, their individual home directories will be deleted.

More detailed information about the specific changes to the storage is forthcoming and will be provided before the planned maintenance outage on December 17.

Questions about the outage and the changes should be emailed to rescomputing@duke.edu.