You are probably unaware of it, but you are using virtual machines (also known as “VMs”) every day. Your favorite websites are probably being served from a virtual machine. Your web searches are being processed by a virtual machine. Your Amazon purchase is being submitted, processed, and fulfilled on virtual machines. You’re actually using one right now by reading these words.

VMs are pervasive, and they are invisible: from the perspective of the services they perform, they are indistinguishable from the whirring metal boxes of yesterday. There’s a reason they have taken over: they’re more reliable, easier to manage, more portable, and more economical than “bare metal.” And they perform well. This mixture of computing virtues has made virtual machines important to the most recent scientific computing installations in XSEDE, a clearinghouse that provides researchers access to many North American supercomputers.

What is this magical thing?

It’s a little mind-bending, but a virtual machine is essentially a software representation of a whole computer, something we usually associate with real aluminum and plastic sitting on our desks. In a sense, a whole computer runs as software. This means that several virtual machines can run at once on a single piece of “real” computing hardware, just as several applications can run on our laptops. Each VM can itself run software applications, and a single piece of computing hardware can host a mixture of machines running Windows, Linux, or other operating systems. The “cloud,” in fact, is a well-organized collection of virtual machines.

Ease of management and lower acquisition costs have driven the adoption of virtual machines. Studies have also shown that the costs of data center power and cooling are beginning to exceed the cost of computing equipment. Given the reality of climate change, it is important to keep energy consumption to a minimum, so making the most of computing resources is responsible both economically and environmentally. Virtualization makes it possible to match the number of physical machines running in a data center to the demand for computing resources, so that the machines that are running are fully used, or as close to fully used as possible.

That’s why Charley Kneifel, Duke OIT’s Senior Technical Director, says that the three most important reasons to use VMs are “utilization, utilization, utilization.”
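To make the utilization argument concrete, here is a minimal sketch in Python of how workloads might be consolidated onto as few physical hosts as possible. It is only an illustration, not a description of how Duke or any particular hypervisor places VMs: the function name `pack_vms`, the 16-core host size, and the list of per-VM core demands are all hypothetical, and the placement rule is the textbook "first-fit decreasing" heuristic.

```python
def pack_vms(demands, host_capacity):
    """Place VMs on hosts with the first-fit decreasing heuristic.

    demands: CPU cores each VM needs (hypothetical numbers).
    host_capacity: CPU cores available on each physical host (assumed equal).
    Returns a list of hosts, each a list of the VM demands placed on it.
    """
    hosts = []
    # Consider the largest VMs first; they are the hardest to fit.
    for demand in sorted(demands, reverse=True):
        if demand > host_capacity:
            raise ValueError("a VM needs more cores than one host has")
        for host in hosts:
            # Use the first host that still has enough free cores.
            if sum(host) + demand <= host_capacity:
                host.append(demand)
                break
        else:
            # No running host has room, so power on another one.
            hosts.append([demand])
    return hosts


# Ten hypothetical workloads, each of which once had its own physical server.
vm_demands = [2, 4, 1, 8, 3, 2, 6, 1, 4, 1]
host_capacity = 16

hosts = pack_vms(vm_demands, host_capacity)
utilization = sum(vm_demands) / (len(hosts) * host_capacity)

print(f"physical hosts needed: {len(hosts)}")
print(f"average utilization: {utilization:.0%}")
```

With these made-up numbers, ten workloads fit on two fully used hosts instead of ten mostly idle ones. Real placement also has to account for memory, storage, network traffic, and headroom for spikes in demand, so actual utilization is lower than in this toy case.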

Virtual machines are the foundation of Duke Research Computing

The infrastructure for Duke Research Computing is the Ivory soap of virtual computing: almost 99 44/100ths percent pure. The Duke Compute Cluster is composed almost entirely of virtual machines; GPUs are still “bare metal,” as are some computers nearing end of life. The offerings of Duke’s Research Toolkits (RAPID machines) are virtualized and offer templates of different operating systems and configurations. OIT’s “Clockworks” technology automates the provisioning of computers for both “enterprise” use and research computing, and all of those computers are VMs.

“We are highly virtualized at Duke: 95, 97, 98 percent,” said Kneifel. “The PeopleSoft systems that run student registration and all the course catalogue (all of those things used by the Registrar), those systems are all virtualized. The Protected Network has virtual machines associated with it, with a security perimeter and posture that prevents people and services from getting in or out unless they’re approved and gives the researchers a safe space to analyze protected data.”

Duke has built in a “level of automation … puts us ahead of our peers in the ability for a faculty member — or a student even — to go in and grab a VM, within seconds in some cases, and get that provisioned and start using it.”

Further reading

Chase J, Anderson D, Thakur P, Vahdat A (2001) Managing energy and server resources in hosting centers. In: 18th symposium on operating systems principles (SOSP ’01), pp 103–116
Lee YC, Zomaya AY (2012) Energy efficient utilization of resources in cloud computing systems. J Supercomput 60(2):268–280. doi:10.1007/s11227-010-0421-3