Torque and Maui Sanity Check: Submitting a Job
From Debian Clusters
This is the last part of a four-part tutorial on installing and configuring a queuing system and scheduler. The full tutorial includes:
- Using a Scheduler and Queue
- Resource Manager: Torque
- Scheduler: Maui
- Torque and Maui Sanity Check: Submitting a Job
There is also a troubleshooting page: Troubleshooting Torque and Maui.
This part of the tutorial assumes you have already installed and configured Torque and Maui. If you haven't, you'll want to visit those pages first.
Torque/Maui Sanity Check: Submitting a Job
A job is a single run of a particular script or program. You won't want to run a job as root, so first, on your head node, become one of your users. (For instance, su - kwanous.)
Jobs are submitted with the qsub command to the job queue run by torque. Maui monitors that queue and decides when and on which worker node each job should run. Torque then tells the pbs_mom client running on the node maui picked to run the job.
Test: Sleep Job
An easy job to submit and monitor is just a sleep command.
As one of your users, enter the command that will create a job that simply sleeps for 30 seconds, as shown below:
echo "sleep 30" | qsub
Immediately afterward, run the torque command qstat to see the job appear in torque's queue, and then the maui command showq. You can even run
pbsnodes | grep -v status | grep -v ntype
to see which node the job is running on. A transcript of my session is shown below.
kwanous@gyrfalcon:~$ echo "sleep 30" | qsub
6.gyrfalcon
kwanous@gyrfalcon:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
6.gyrfalcon               STDIN            kwanous                0 R batch
kwanous@gyrfalcon:~$ showq
ACTIVE JOBS--------------------
JOBNAME USERNAME STATE PROC REMAINING STARTTIME
6 kwanous Running 1 1:00:00 Wed Jan 23 14:00:24
1 Active Job 1 of 28 Processors Active (3.57%)
1 of 7 Nodes Active (14.29%)
... snipped ...
Total Jobs: 1 Active Jobs: 1 Idle Jobs: 0 Blocked Jobs: 0
kwanous@gyrfalcon:~$ pbsnodes | grep -v status | grep -v ntype
eagle
state = free
np = 4
... snipped ...
peregrine
state = free
np = 4
jobs = 0/7.gyrfalcon
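If you have many nodes, scanning the pbsnodes output by eye gets tedious. The sketch below pulls out only the nodes that list a jobs line. The function name nodes_with_jobs is made up for this example, and it assumes pbsnodes prints each node name at the start of a line with that node's attributes indented beneath it.

```shell
# nodes_with_jobs: read pbsnodes-style output on standard input and print
# "node: jobs" for every node that has a "jobs = ..." attribute.
# Assumption: node names start in column one and attributes are indented.
nodes_with_jobs() {
    awk '
        /^[^[:space:]]/ { node = $1 }          # remember the current node name
        /^[[:space:]]+jobs = / {
            sub(/^[[:space:]]+jobs = /, "")    # keep only the job list
            print node ": " $0
        }
    '
}

# Usage on the head node:
#   pbsnodes | nodes_with_jobs
```

Fed the pbsnodes output from my transcript, this would print only the peregrine entry.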
Approximately thirty seconds later, the job should finish running. If you run qstat and showq again, you should no longer see the job (6.gyrfalcon, in my example) running.
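Rather than rerunning qstat by hand, you could poll it from a small script. This is only a sketch: job_state and wait_for_job are names I made up, and it assumes the job ID printed by qsub matches the first column of qstat, as it does in my transcript. Some Torque setups keep finished jobs listed for a while with state C, so the loop treats that state as finished too.

```shell
# job_state JOBID: read qstat-style output on standard input and print the
# state letter (the second-to-last column) for JOBID, or nothing if the
# job is not listed.
job_state() {
    awk -v id="$1" '$1 == id { print $(NF - 1) }'
}

# wait_for_job JOBID: check qstat every 5 seconds until the job is either
# gone from the queue or marked completed (state C).
wait_for_job() {
    while :; do
        _state=$(qstat 2>/dev/null | job_state "$1")
        if [ -z "$_state" ] || [ "$_state" = "C" ]; then
            break
        fi
        sleep 5
    done
    echo "job $1 is no longer running"
}

# Usage on the head node:
#   jobid=$(echo "sleep 30" | qsub)
#   wait_for_job "$jobid"
```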
Sleep Job Results
In the home directory of the user you've submitted the job as, you should now see two files, something like:
- STDIN.o3
- STDIN.e3
where 3 is the job ID. The file ending in .o# holds everything the job wrote to standard output, and the file ending in .e# holds everything it wrote to standard error. For our sleep job, both of these should be empty, because sleep writes nothing to either stream.
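A quick way to confirm that both files exist and are empty is a helper like the one below. The name check_job_files is hypothetical; the function relies only on the standard -e and -s file tests.

```shell
# check_job_files FILE...: report whether each file is missing, empty,
# or has content.
check_job_files() {
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "$f: missing"
        elif [ -s "$f" ]; then
            echo "$f: has content"
        else
            echo "$f: empty"
        fi
    done
}

# Usage, with the job ID from the example above:
#   check_job_files ~/STDIN.o3 ~/STDIN.e3
```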
Test: Standard Output vs Standard Error
qsub can also take its input from a file. Such a file can give torque all sorts of specifications, such as how long the job will run and what resources it needs. (To learn more about qsub submission files, see Torque Qsub Scripts.) We'll write just a simple one. Open your favorite text editor, enter the contents of my Standard Output/Error For Loop Script, and save the file as submission. This script has a simple for loop that runs from 1 to 10. If the number is less than 5, it prints a statement to standard output. If the number is greater than or equal to 5, it prints a statement to standard error.
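If you don't have the linked script handy, something along these lines should behave the same way. This is my own minimal sketch based on the description above, not necessarily the original script, and the function name count_to_ten is made up.

```shell
#!/bin/bash
# Sketch of a submission script: numbers below 5 go to standard output,
# the rest go to standard error.

count_to_ten() {
    for i in $(seq 1 10); do
        if [ "$i" -lt 5 ]; then
            echo "$i is less than 5"
        else
            # >&2 redirects this line to standard error
            echo "$i is greater than or equal to 5" >&2
        fi
    done
}

count_to_ten
```

You can try it outside of torque first with bash submission 2>/dev/null, which hides standard error and leaves only the lines meant for the output file.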
Submit the job with
qsub submission
where submission is the name of the script file.
Job Results
Again, you should have .o# and .e# files in your home directory, but this time they should start with the name of the file submitted to qsub (submission). This time, they should have content in them. Your output file should have the first four lines, which were printed to standard output:
1 is less than 5
2 is less than 5
3 is less than 5
4 is less than 5
and your error file should have the last six, which were printed to standard error:
5 is greater than or equal to 5
6 is greater than or equal to 5
7 is greater than or equal to 5
8 is greater than or equal to 5
9 is greater than or equal to 5
10 is greater than or equal to 5
Hmm...
If you didn't get the results described on this page, visiting the Troubleshooting Torque and Maui page might be of help.

