The Duke Compute Cluster (“DCC”)

The Duke Compute Cluster (formerly called the Duke Shared Cluster Resource or “DSCR”) consists of machines that the University has provided for community use and that researchers have purchased to conduct their research. At present, the cluster comprises about 7,000 CPU-cores, with underlying hardware from Cisco UCS and Dell M600-series blades in Dell M1000-series chassis. Interconnects are 10 Gbps.

The cluster itself is a project of the University community, with the hardware provided by individual researchers and the University. The University, through Duke Research Computing and the Office of Information Technology, maintains and administers the equipment for its useful life (designated to be four years) and provides support for cluster users. As a result of these incremental purchases, the cluster is heterogeneous, though it spans only a narrow range of Intel chipsets and RAM capacities, since equipment purchases are organized and channeled by Duke Research Computing in order to ease maintenance and exploit economies of scale. New “standard” nodes have 512 GB of RAM and 44 physical CPU-cores, with twice that number of logical cores available through hyperthreading.

In February 2016, machines fitted with Nvidia Tesla K80 GPUs were added and are available for purchase by research groups with a sustained need for GPU-accelerated computing. The machines are also available on a limited basis to cluster users as a common resource.
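Under SLURM, the cluster’s scheduler (described below), a GPU is typically requested through a generic-resource (“gres”) flag. The sketch below is illustrative only; the gres name and resource values are assumptions and would need to match the cluster’s actual configuration:

    #!/bin/bash
    # gpu-example.sh -- a minimal sketch of a GPU job request (names and values assumed)
    #SBATCH --job-name=gpu-example
    #SBATCH --gres=gpu:1
    #SBATCH --mem=8G

    # Report which GPU the job was allocated
    nvidia-smi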

Researchers who have provided equipment have “high priority” access to their own nodes and “low priority” (or “common”) access to others’ nodes, including those purchased by the University, when idle cycles are available. Since researchers tend not to use 100 percent of the CPU capacity of the nodes they have purchased, “low priority” consumption of idle cycles greatly increases the overall efficiency of the cluster, while also giving every user access to more cycles than their own nodes provide when they need them. Jobs submitted at high priority run only on the nodes that the submitting group has purchased, and low-priority jobs running on those machines yield to high-priority jobs.
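In scheduler terms, the two priority tiers are typically exposed as separate SLURM partitions. As a hedged illustration (the partition names used here are assumptions, not the DCC’s actual configuration), a group member might submit the same job at either tier:

    # Submit to your group's high-priority partition (partition name assumed)
    sbatch --partition=yourlab my_job.sh

    # Submit to the shared, preemptible low-priority partition (partition name assumed)
    sbatch --partition=common my_job.sh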

The Duke Compute Cluster is a general-purpose high-performance/high-throughput installation, and it is fitted with software used for a broad array of scientific projects. For the most part, applications on the cluster are Free and Open Source Software (FOSS), though some researchers have arranged for proprietary licenses for software they use on the cluster. The operating system and the software installation and configuration are standard across all nodes (barring license restrictions), with Red Hat Enterprise Linux 6 as the current operating system. SLURM is the scheduler for the entire system, which is professionally managed by systems administrators in the Office of Information Technology; the equipment is housed in enterprise-grade data centers on Duke’s West Campus. Software installations and user support, including training on using the system, are provided by experienced staff of Duke Research Computing.

Users of the cluster agree to an Acceptable Use Policy.

Accessing the Duke Compute Cluster

There are currently two “front-end” machines that users must log in to first. The names of these head nodes are dcc-slogin-01.oit.duke.edu and dcc-slogin-02.oit.duke.edu.
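You can reach a login node with SSH; for example, substituting your Duke NetID (or assigned username) for the placeholder:

    ssh netid@dcc-slogin-01.oit.duke.edu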

Once you are logged in to a front-end, you will be able to log in from there to any node in the cluster. Most of your non-computational work will be done on the front-ends: compilation, job submission, and debugging. Do not use the login nodes for computationally intensive processes; all computationally demanding jobs should be submitted and run through the SLURM queueing system.
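A minimal batch script and submission might look like the sketch below; the resource requests and file names are illustrative only and should be adjusted to your job:

    #!/bin/bash
    # example.sh -- a minimal SLURM batch script (resource values are illustrative)
    #SBATCH --job-name=example
    #SBATCH --output=example-%j.out
    #SBATCH --mem=4G
    #SBATCH --cpus-per-task=1

    # Replace this placeholder command with your actual computation
    hostname

Submit the script from a front-end with “sbatch example.sh” and check its status with “squeue -u $USER”.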

To learn more about gaining access to the Duke Compute Cluster, please see Gaining Access.

If you are a member of a group that already participates in the Duke Compute Cluster, please direct your new account request through your designated Point of Contact.

Using the Duke Compute Cluster

SLURM Queueing System

Office hours: Gross Hall 241, Mondays and Wednesdays, 1:00-5:00

Duke Compute Cluster workshop