Running Jobs on HPC
From Montana Tech High Performance Computing
Latest revision as of 13:08, 8 May 2020
When you connect to hpc.mtech.edu, you will be logged into the "head" node, aka the "management" or "login" node. You can compile and test programs, submit jobs, and view results on the head node. Computationally intensive work should be done on the "compute" nodes. Jobs are assigned to the compute nodes through the Slurm job scheduler. You can request a portion of a compute node, an entire node, or multiple nodes for distributed programs.
There are two main ways to run jobs on the HPC:
- Running them interactively.
- Submitting them through the HPC's job scheduler.
Both are done through Slurm commands. The examples below cover only the basics; refer to the Slurm documentation for more details.
Running an interactive job on compute nodes
After you are logged in, you will be at your home directory:
[YourUserName@oredigger ~]$ ❚
To start an interactive job, run:
[YourUserName@oredigger ~]$ srun --pty /bin/bash
Your command-line prompt will then change to:
[YourUserName@cn0 ~]$❚
Note that cn0 is the name of a compute node, so the new prompt means you are now on compute node cn0. Depending on cluster usage, you might get a different node, e.g. cn2 or cn10.
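If the prompt alone is not enough to tell where a shell is running, the short check below may help. It is a minimal sketch: it assumes the hostname command is available and relies on Slurm setting the SLURM_JOB_ID variable inside a job, which is normally unset on the head node.

```shell
# Print the host this shell is running on (a compute node such as cn0 inside an interactive job).
NODE_NAME=$(hostname)
echo "Running on: $NODE_NAME"

# Slurm sets SLURM_JOB_ID inside a job; outside a job the fallback text is printed instead.
echo "Slurm job ID: ${SLURM_JOB_ID:-none (not inside a Slurm job)}"
```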
You can now execute your calculations on the compute node.
By default, the srun command in the example above gives you 1 process and 1 hour of compute time. Again, refer to the Slurm page for more options.
Below is an example of running a MATLAB script on a compute node.
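When the defaults are too small, the resources can be requested explicitly on the srun command line. The sketch below is only an illustration: the values are hypothetical, the partition name normal is borrowed from the batch example on this page, and the variable names are made up for readability. It prints the command first and starts the session only where srun is installed and a terminal is attached.

```shell
# Hypothetical resource values; adjust them to what your work needs.
NODES=1            # number of nodes
CPUS=4             # CPU cores for the single interactive task
WALLTIME=02:00:00  # run time limit, hh:mm:ss
PARTITION=normal   # partition name assumed from the batch example

SRUN_CMD="srun --nodes=$NODES --ntasks=1 --cpus-per-task=$CPUS --time=$WALLTIME --partition=$PARTITION --pty /bin/bash"

# Show the full command so it can be checked before anything is requested.
echo "$SRUN_CMD"

# Start the session only where Slurm is installed and a terminal is attached.
if command -v srun >/dev/null 2>&1 && [ -t 0 ]; then
    $SRUN_CMD   # left unquoted on purpose so the shell splits it into words
fi
```

Requesting one task with several CPUs keeps a single interactive shell while still reserving the cores for it.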
[YourUserName@cn0 ~]$❚
Load the Matlab module so that you can use it in your environment:
[YourUserName@cn0 ~]$ module load MATLAB
Go to the directory containing your MATLAB script (in this example, a test.m script in the test_code directory):
cd test_code
Now run the MATLAB command. The -r option takes the script name without the .m extension:
matlab -nodesktop -nosplash -r "test;quit;"
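The relationship between the script file and the text passed to -r can be sketched as follows. This is an illustration under stated assumptions: the path ~/test_code/test.m follows the example description above, and the variable names are hypothetical. MATLAB is launched only if the matlab command is on the PATH (after module load MATLAB) and the script file exists.

```shell
# Assumed layout from the example: test.m inside ~/test_code.
SCRIPT_PATH="$HOME/test_code/test.m"

SCRIPT_DIR=$(dirname "$SCRIPT_PATH")        # directory to run from
SCRIPT_NAME=$(basename "$SCRIPT_PATH" .m)   # file name without the .m extension

# -r takes MATLAB statements, so the script is named without its extension,
# followed by quit so MATLAB exits when the script finishes.
MATLAB_STATEMENT="$SCRIPT_NAME;quit;"
echo "cd $SCRIPT_DIR && matlab -nodesktop -nosplash -r \"$MATLAB_STATEMENT\""

# Run only if MATLAB is available and the script exists.
if command -v matlab >/dev/null 2>&1 && [ -f "$SCRIPT_PATH" ]; then
    cd "$SCRIPT_DIR" && matlab -nodesktop -nosplash -r "$MATLAB_STATEMENT"
fi
```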
Submitting jobs to compute nodes
To submit a job, you first need to prepare a job submission script. The script is just a text file containing two parts: the job scheduler directives and the commands for your program. You can prepare this file either on your local computer with any text editor you are familiar with, or directly on the HPC using a text editor available on the system, such as vi or nano.
Below is a script that does the same thing as the MATLAB example above.
- First go to the directory. (Alternatively, use the sbatch -D option to specify your working directory.)
Note that you are currently on the head node, oredigger.
[YourUserName@oredigger ~]$ cd test_code
- Create a text file with the name matlabjob.sh with the following contents:
#!/bin/sh
#SBATCH -J MatlabJob   # Name of the job
#SBATCH -N 1           # Total number of nodes requested
#SBATCH -n 4           # Total number of tasks requested
#SBATCH -t 01:00:00    # Total run time requested (1 hour)
#SBATCH -p normal      # Compute node partition requested
module load MATLAB
matlab -nodesktop -nosplash -r "test;quit;"
- Now submit this job:
sbatch matlabjob.sh
- You can check your job status in the queue:
squeue
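The submission steps above can also be scripted. The following is a minimal sketch, not the site's official workflow: the file name matlabjob_sketch.sh is hypothetical (chosen so an existing matlabjob.sh is not overwritten), and the MATLAB script is assumed to be test.m. It writes the job script, counts the #SBATCH directives as a sanity check, and submits only where sbatch is available.

```shell
JOB_SCRIPT=matlabjob_sketch.sh   # hypothetical file name

# Write the job script; the quoted 'EOF' keeps the shell from expanding anything inside it.
cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/sh
#SBATCH -J MatlabJob      # job name
#SBATCH -N 1              # number of nodes
#SBATCH -n 4              # total number of tasks
#SBATCH -t 01:00:00       # run time limit (1 hour)
#SBATCH -p normal         # partition

module load MATLAB
matlab -nodesktop -nosplash -r "test;quit;"
EOF

# Count the scheduler directives as a quick sanity check.
DIRECTIVES=$(grep -c '^#SBATCH' "$JOB_SCRIPT")
echo "$JOB_SCRIPT contains $DIRECTIVES #SBATCH directives"

if command -v sbatch >/dev/null 2>&1; then
    # --parsable makes sbatch print the job ID instead of "Submitted batch job <id>".
    JOB_ID=$(sbatch --parsable "$JOB_SCRIPT")
    JOB_ID=${JOB_ID%%;*}          # drop the optional ";cluster" suffix
    squeue -j "$JOB_ID"           # status of this job only
    squeue -u "$USER"             # all of your jobs in the queue
    # To cancel the job: scancel "$JOB_ID"
else
    echo "sbatch not found; run this on the cluster's head node"
fi
```

Capturing the job ID makes it easier to query or cancel one specific job when several are queued.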
More information is available on the Slurm page.