The Duke Compute Cluster has a new login node, dscr-slogin-03, dedicated to running jobs on the Open Science Grid (OSG). This capability was previewed during the OSG Software Carpentry Workshop last October:
SWC-OSG Workshop at Duke University, October 27-29th 2015

Running jobs on the Open Science Grid requires an OSG account, so please sign up for one first. Complete instructions for running OSG jobs can be found in Job Scheduling with HTCondor and in Connecting the Campus to Grid Resources. Below is a terminal session that runs the first OSG tutorial job on the DCC. Give this a try and send any questions to rescomputing@duke.edu.

First, ssh to dscr-slogin-03:

tm103@dscr-slogin-02  ~ $ ssh dscr-slogin-03
################################################################################
## You are about to access a Duke University computer network that is intended #
## for authorized users only. You should have no expectation of privacy in     #
## your use of this network. Use of this network constitutes consent to        #
## monitoring, retrieval, and disclosure of any information stored within the  #
## network for any purpose including criminal prosecution.                     #
################################################################################
tm103@dscr-slogin-03's password: 
Last login by user tm103: Mon Jan 25 10:24 - 10:42 (00:17) from: dscr-slogin-02.oit.duke.edu
-bash-4.1$

Set up the OSG Connect client (this only needs to be done once):

-bash-4.1$ connect setup
Please enter the user name that you created during Connect registration.  Note that
it consists only of letters and numbers, with no @ symbol.

You will be connecting via the login.duke.ci-connect.net server.
Enter your Connect username: tm103
Password for tm103@login.duke.ci-connect.net: 
notice: Ongoing client access has been authorized at login.duke.ci-connect.net.
notice: Use "connect test" to verify access.

Test the connect client:

-bash-4.1$ connect test
Success! Your client access to login.duke.ci-connect.net is working.

Create the tutorial files:

-bash-4.1$ tutorial quickstart
Installing quickstart (master)...
Tutorial files installed in ./tutorial-quickstart.
Running setup in ./tutorial-quickstart...

Change to the tutorial-quickstart directory:

-bash-4.1$ cd tutorial-quickstart
-bash-4.1$ pwd
/dscrhome/tm103/tutorial-quickstart

Look at the tutorial files:

-bash-4.1$ ls -l
total 454
-rw-r--r--. 1 tm103 scsc     0 Oct 29 13:16 job.error
-rw-r--r--. 1 tm103 scsc 28240 Oct 29 13:22 job.log
-rw-r--r--. 1 tm103 scsc   273 Oct 29 13:22 job.output
drwxrwxr-x. 3 tm103 scsc  6797 Dec  8 14:59 log
-rw-rw-r--. 1 tm103 scsc  1204 Oct 28 21:14 osg-template-job.submit
-rw-rw-r--. 1 tm103 scsc 12938 Oct 28 21:14 README.md
-rwxrwxr-x. 1 tm103 scsc   296 Oct 28 21:14 short.sh
-rw-r--r--. 1 tm103 scsc     0 Dec  8 14:29 testjob.error
-rw-r--r--. 1 tm103 scsc  6127 Jan 25 10:28 testjob.log
-rw-r--r--. 1 tm103 scsc   250 Jan 25  2016 testjob.output
-rw-rw-r--. 1 tm103 scsc   800 Dec  8 14:28 tutorial01.submit
-rw-rw-r--. 1 tm103 scsc   204 Oct 28 21:14 tutorial02.submit
-rw-rw-r--. 1 tm103 scsc   237 Oct 28 21:14 tutorial03.submit
drwxrwxr-x. 3 tm103 scsc   220 Jan  7 10:28 tutorial-quickstart

Edit the first sample job script, tutorial01.submit, and change the project name from +ProjectName="ConnectTrain" to +ProjectName="duke-campus":

-bash-4.1$ vim tutorial01.submit
-bash-4.1$ cat tutorial01.submit
# The UNIVERSE defines an execution environment. You will almost always use VANILLA. 
Universe = vanilla 

# EXECUTABLE is the program your job will run. It's often useful 
# to create a shell script to "wrap" your actual work. 
Executable = short.sh 
Arguments = 10

# ERROR and OUTPUT are the error and output channels from your job
# that HTCondor returns from the remote host.
Error = testjob.error
Output = testjob.output

# The LOG file is where HTCondor places information about your 
# job's status, success, and resource consumption. 
Log = testjob.log

# +ProjectName is the name of the project reported to the OSG accounting system
# +ProjectName="ConnectTrain"
+ProjectName="duke-campus"

# QUEUE is the "start button" - it launches any jobs that have been 
# specified thus far. 
Queue 1
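
The project-name edit above can also be made without opening an editor. The following is a minimal sketch, not part of the connect client: set_project_name is a hypothetical helper that rewrites the active +ProjectName line with sed and leaves commented-out lines alone. It assumes the project name contains only letters, digits, hyphens, or underscores.

```shell
#!/bin/bash
# set_project_name FILE PROJECT
# Hypothetical helper (not part of the connect client): rewrites the active
# +ProjectName line in an HTCondor submit file. Lines that start with "#"
# are left untouched. Assumes PROJECT is a simple name (letters, digits,
# hyphens, underscores), so it is safe inside the sed replacement text.
set_project_name() {
    local file=$1
    local project=$2
    local tmp
    tmp=$(mktemp) || return 1
    if ! sed "s/^+ProjectName[[:space:]]*=.*/+ProjectName=\"$project\"/" "$file" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # Copy the contents back instead of moving the temp file, so the
    # submit file keeps its original permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example (run inside the tutorial-quickstart directory):
#   set_project_name tutorial01.submit duke-campus
```

Unlike the vim session above, this sketch replaces the line in place rather than keeping the old value as a comment.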

Look at the short.sh script:

-bash-4.1$ cat short.sh
#!/bin/bash
# short.sh: a short discovery job

printf "Start time: "; /bin/date
printf "Job is running on node: "; /bin/hostname
printf "Job running as user: "; /usr/bin/id
printf "Job is running in directory: "; /bin/pwd

echo
echo "Working hard..."
sleep ${1-15}
echo "Science complete!"
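
The sleep ${1-15} line in short.sh uses shell parameter expansion: the job sleeps for the number of seconds given as its first argument, or for 15 seconds if no argument is passed. Because tutorial01.submit sets Arguments = 10, the tutorial job sleeps for 10 seconds. The short sketch below illustrates the expansion; sleep_seconds is a hypothetical function used only for this demonstration.

```shell
#!/bin/bash
# sleep_seconds mirrors the ${1-15} expansion used in short.sh:
# print the first argument if it is set, otherwise print 15.
sleep_seconds() {
    echo "${1-15}"
}

sleep_seconds 10   # argument supplied: prints 10
sleep_seconds      # no argument: prints 15
```

Note that ${1-15} falls back to the default only when the argument is unset. The related form ${1:-15} also falls back when the argument is set but empty.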

Submit the sample job:

-bash-4.1$ connect submit tutorial01.submit
..............+..+++++++++...........................................................................................................................................................................................................
10 objects sent; 219 objects up to date; 0 errors
Submitting job(s).
1 job(s) submitted to cluster 123956.

Check the progress of the job:

-bash-4.1$ connect q

-- Submitter: duke-login.osgconnect.net : <192.170.227.203:60920> : duke-login.osgconnect.net
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
123956.0   tm103           1/25 09:38   0+00:00:00 I  0   0.0  short.sh 10       

1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended

-bash-4.1$ connect q

-- Submitter: duke-login.osgconnect.net : <192.170.227.203:60920> : duke-login.osgconnect.net
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
123956.0   tm103           1/25 09:38   0+00:00:05 R  0   0.0  short.sh 10       

1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended

-bash-4.1$ connect q

-- Submitter: duke-login.osgconnect.net : <192.170.227.203:60920> : duke-login.osgconnect.net
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
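
Instead of rerunning connect q by hand, the queue can be polled from a small script. This is a sketch only: jobs_in_queue and wait_for_empty_queue are hypothetical helpers, and they assume the summary line has the "N jobs; ..." form shown in the session above, which may differ in other HTCondor versions.

```shell
#!/bin/bash
# jobs_in_queue: read queue output on stdin and print the job count taken
# from the summary line ("N jobs; ..."). Fails if no such line is found.
jobs_in_queue() {
    awk '/^[0-9]+ jobs;/ { print $1; found = 1 } END { if (!found) exit 1 }'
}

# wait_for_empty_queue INTERVAL COMMAND [ARGS...]
# Run COMMAND every INTERVAL seconds until it reports 0 jobs.
# Returns non-zero if the summary line cannot be parsed.
wait_for_empty_queue() {
    local interval=$1
    shift
    local count
    while true; do
        count=$("$@" | jobs_in_queue) || return 1
        [ "$count" -eq 0 ] && return 0
        sleep "$interval"
    done
}

# Example: check every 30 seconds, then fetch the results.
#   wait_for_empty_queue 30 connect q && connect pull
```

Passing the queue command as arguments keeps the loop independent of the connect client, so it can be tried first with any command that prints a similar summary line.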

Retrieve the output files (important!):

-bash-4.1$ connect pull
+.......++...+..................................................................................................................................................................................................................
4 objects retrieved; 220 objects up to date; 0 errors

Look at the job output:

-bash-4.1$ cat testjob.output 
Start time: Mon Jan 25 09:38:38 CST 2016
Job is running on node: iut2-c085.iu.edu
Job running as user: uid=21039(osg) gid=21000(osgvo) groups=21000(osgvo)
Job is running in directory: /var/lib/condor/execute/dir_1093243/glide_6VG8SA/execute/dir_1099394

Working hard...
Science complete!
-bash-4.1$
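
When many jobs are submitted, reading each output file by eye becomes tedious. As a rough sketch, the two hypothetical helpers below pull out the execution node and check for the completion message; they assume the output format that short.sh prints in the session above.

```shell
#!/bin/bash
# job_node FILE: print the host name that short.sh reported for the job.
job_node() {
    sed -n 's/^Job is running on node: //p' "$1"
}

# job_completed FILE: succeed only if the final message from short.sh
# appears in the output file.
job_completed() {
    grep -qF 'Science complete!' "$1"
}

# Example:
#   job_completed testjob.output && echo "Ran on $(job_node testjob.output)"
```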