On Thursday evening I had a quick conversation with Terry Oas to close the day. Since so much of his research is tied to computing, he’s long been engaged with the people who pull the levers of information technology. We chatted after a meeting of “ITAC” (Duke’s IT Advisory Council).

“It used to be that computing needed big and centralized infrastructure,” he said at one point, recalling the efforts of Triangle Universities Computing Center (TUCC) from the 1950s through the 1980s. “Then when PCs came about, computing just kind of disappeared from view, since it shifted into labs. For the administration, it seemed to have been ‘taken care of’ and wasn’t an issue anymore.”

That’s no longer the case, Terry believes. Requirements have outstripped the power and capacity of PC workstations under desks. Individual labs can’t sustain the computational power and, perhaps more pressing, the data storage that many scientists need to do their work. The circumstances of research have changed for individual researchers since the 1990s and the turn of the century, and needs have certainly increased for those who, like Terry, labor in many of the life sciences.

But the requirements of individual labs aren’t the only factors that make a coordinated, well-supported, and flexible research computing infrastructure necessary. Research habits also seem to be shifting toward more collaborative and coordinated work, so research computing resources can also be used to enrich collaboration. In my view, technologies to support research collaboration are part of the research computing agenda.

We’re not just talking file sharing or video conferencing, either.

Technologies that seamlessly integrate data production with analysis are in the mix. Technologies that make data provenance a routine and essential part of a collaboration also need a place. Bringing together the researchers’ activities and bolstering the confidence that researchers have in their collaborators’ work — these are part of the role of research computing.

I ran across an excerpt from Michael Polanyi’s “The Republic of Science” that appeared long ago (1962!) in Minerva (thanks, Paul Meinshausen and Data Science for Social Good!). Polanyi uses an analogy to describe a structure of scientific endeavor:

Imagine that we have the pieces of a very large jigsaw puzzle, and that for some reason it’s important that our giant puzzle be put together as quickly as possible. We might try to work fastest by recruiting several helpers; the question would be how to structure the work.

Suppose we divide the puzzle pieces equally among the helpers and let each work on her set separately. It’s easy to see that this method, which would work fine for shelling peas, would be totally ineffective for the puzzle, since few of the pieces in each helper’s set would be found to fit together.

We could do better by providing duplicates of all the pieces to each helper separately and eventually somehow bringing their individual results together. But even with this approach the team wouldn’t perform much better than a single individual at her best.

The only way the assistants can effectively cooperate, and thoroughly surpass what any single one of them could do, is to let them work on putting the puzzle together in sight of the others so that every time a piece is fitted in by one participant, all the others will immediately watch out for the next step that becomes possible in consequence. In this system, each participant will act on her own initiative, by responding to the latest achievements of the others, and the completion of their joint task will be greatly accelerated.

When you look at the massive efforts that came together to create the Large Hadron Collider, you see Polanyi’s jigsaw puzzle solvers at work. Now that researchers are teasing out information from the data it produces, you see the puzzle solvers at work there, too: coordinated and coordinating, vigilant, open, independent and responsive. The same is true in varying degrees among researchers across the entire scholarly landscape.

Technologies that make transparency, communication and trust easier I count among the necessary features and qualities of research computing.

— Mark R. DeLong, PhD (mark.delong@duke.edu)