This memo outlines acceptable use of the Duke Compute Cluster and provides useful information about the security of data stored in the cluster. People who are granted access to the Duke Compute Cluster agree to the terms of this notice.
Appropriate uses of the Duke Compute Cluster
The Duke Compute Cluster is a community resource that serves the research, education, and service missions of Duke University. Users of the cluster agree to only run jobs that relate to these missions. For example,
- Bitcoin or other electronic and cryptographic currency “mining” for purposes of financial gain is not appropriate. Research and instructional uses of “mining” tools, not for purposes of financial gain, are not restricted.
- Commercial and business use of the cluster is not appropriate.
- Unauthorized use or storage of copyright-protected or proprietary resources is not appropriate.
Running jobs on the login nodes, or running jobs that repeatedly take over large portions of cluster resources (computational or storage), is an abuse of the system.
Users of the cluster are encouraged to implement “checkpointing” for jobs that run for long periods, since node failures and scheduled maintenance may interrupt processes. Checkpointing is good computing practice for long-running jobs.
Inappropriate use of the cluster may result in a systems administrator temporarily or permanently shutting down your account, depending on the frequency and severity of the infraction.
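As an illustration only, checkpointing can be as simple as saving a job's progress to a file at intervals and reloading it on restart. The sketch below is not a cluster-provided tool: the function names, the JSON state format, and the checkpoint interval are all assumptions chosen for the example, and the summation stands in for real work.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically (temp file in the same directory, then rename)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace swaps the file in one step, so an interruption
    # never leaves a half-written checkpoint at `path`.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh starting state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"next_step": 0, "total": 0}


def run(path, n_steps, checkpoint_every=100):
    """Resume from the last checkpoint (if any) and run up to n_steps."""
    state = load_checkpoint(path)
    for step in range(state["next_step"], n_steps):
        state["total"] += step  # hypothetical stand-in for real work
        state["next_step"] = step + 1
        if state["next_step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)  # record the final state
    return state["total"]
```

If the job is interrupted, resubmitting it with the same checkpoint path repeats at most the steps since the last save. Real jobs would typically save larger state (arrays, model weights) in a format suited to the application.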
Sensitive information is not allowed on the cluster
Security and compliance provisions that are in place on the Duke Compute Cluster are not sufficient for sensitive information, such as HIPAA-regulated “Electronic Protected Health Information” and FERPA-regulated student records. Additionally, some data may be bound by restrictions in “data use agreements,” and those agreements may require more stringent security than is in place on the cluster. However, in many cases, information can be de-identified and then introduced to the cluster for analysis without violating data use agreements or government regulations.
Users of the cluster are responsible for the data that they introduce to the cluster for analysis.
For more information on the classification of data, see the “Data Classification Standard” (PDF; security.duke.edu). Other policies and documents (security.duke.edu) are available from the Information Technology Security Office (ITSO).
“Points-of-Contact” (POCs) have responsibility for members of groups using the cluster
Every group on the Duke Compute Cluster has at least one “Point-of-Contact” who is charged with the following responsibilities:
- managing and vetting a group’s membership,
- serving as a central point-of-contact (hence the moniker) for communications from IT and research computing staff that are pertinent to a group,
- arranging the disposition of data produced and left by former members of the group, either as the data steward for the research being conducted or at the request of the data steward (e.g., grant PI or faculty member) who bears final responsibility for the care of the data,
- acting as an arbiter of trust who “vouches” for secure and responsible uses of the resources by members in their group, and
- helping to assess the group’s responsible use of the shared cluster’s storage and compute resources provided by the University and other researchers in the cluster.
The POC for a group is a person of authority in a lab or research group. For groups that are created for a class, the course instructor serves as POC. If a POC fails to enforce acceptable use of the cluster within their group, they may lose rights to act as POC. If problems persist within the group after a change of POC, additional actions to restrict access by the group may occur, up to and including removal of the group.
Periodic review of group membership and patterns of cluster use by groups
The Point-of-Contact (POC) for a group should review membership at least annually and, more prudently, every semester so that lingering group members can be removed.
Duke Research Computing staff conduct periodic reviews of groups’ uses of the cluster storage and compute nodes in an effort to show groups their use of the cluster in a larger context and to help clarify the balance of use that is implicit in using a shared community resource. Some information about a group’s use of the cluster will be shared with other users of the cluster, and members of the faculty who serve on the Research Computing Advisory Group will also review cluster usage.
Data backups and appropriate use of storage resources
The Duke Compute Cluster (DCC) is primarily for data analysis and is not designed for data storage.
Users of the cluster should retain a copy of their irreplaceable data at a separate location, and they should remove results from the system as soon as they can.
Temporary and “ephemeral” data sets that are not essential should be deleted from cluster storage so that other users can use the capacity.
Files in the cluster’s /work directory will be automatically deleted in regularly scheduled “purges” of data. Individual groups or individual users of the cluster will NOT be notified before their files are deleted, since data management is the researchers’ responsibility and the /work directory is not a place to store important datasets for long periods of time. Backups are not available for the /work directory.
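One way to stay ahead of storage cleanup is to periodically list files that have not been modified recently, then copy anything important elsewhere and delete the rest. The following is a minimal sketch under stated assumptions: the function name `stale_files` and the age threshold are hypothetical, and this notice does not state the actual purge schedule or criteria, so the threshold should not be read as the cluster's rule. The function only reports files; it deletes nothing.

```python
import os
import stat
import time


def stale_files(root, max_age_days, now=None):
    """Yield (path, size_bytes, age_days) for regular files under root
    whose last modification is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks and other non-regular entries
            if info.st_mtime < cutoff:
                yield path, info.st_size, (now - info.st_mtime) / 86400


if __name__ == "__main__":
    # Hypothetical usage: the directory and the 30-day threshold are
    # placeholders, not values taken from cluster policy.
    for path, size, age in stale_files(".", max_age_days=30):
        print(f"{age:7.1f} days  {size:12d} bytes  {path}")
```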
The Duke Compute Cluster is a shared resource for Duke researchers and their collaborators. As a shared resource, the privacy that can be afforded to users is constrained. Users of the cluster must conduct themselves in a manner that respects other researchers’ privacy.
In order to assess and improve the functioning of the cluster, staff members who are involved with the systems administration and organization of the cluster will inspect submission scripts, software, and elements of the system from time to time.
Report security incidents and abuses of the cluster
Examples of a security incident include
- misuse of data and information, such as Duke’s proprietary information and patient information
- unauthorized access or use of Duke systems
- a compromised account — including “shared” account credentials
- a compromised system
If you observe such an incident or a violation of the behaviors and practices outlined in this document, please report it to your faculty advisor, lab manager, or group leader; the Duke Research Computing group (email@example.com); or the Duke University ITSO (security.duke.edu).