Running MATLAB on the CHTC
In order to access the CHTC, you will need to register for an account. You can request an account here.
Before running MATLAB on the CHTC, it is important to understand the HTCondor language and run basic code. You can learn how to do this here.
There is a guide available on the CHTC website that shows the steps of running MATLAB code on the CHTC. However, the guide here goes into more detail and also addresses some niche problems.
Connecting to the CHTC and Useful Platforms
Basic instructions to connect to the CHTC can be found here.
Note that you will need to be connected to the UW-Madison VPN to connect to CHTC.
It is recommended that you use a file transfer program to manage the files on the CHTC. I would recommend using WinSCP to manage files and PuTTY to access the command prompt of the CHTC.
If multiple inputs or combinations of inputs are to be run, convert the main script file to a function file so that it can make use of the parallelism of the CHTC. All inputs reach the compiled function as character arrays (strings), so convert each numeric input to a number with MATLAB's str2num() function. Also, remove the semicolons after the final output variables: their values do not appear in the CHTC output automatically, and have to be printed to the command line.
The other requirement is to have all the supporting MATLAB functions in a functions/ folder. Default MATLAB toolboxes are automatically included in the compilation step on the CHTC, so only your own supporting functions and any external toolboxes need to go in this folder. Follow the steps below (the steps that start from opening MATLAB apply only if you are using an external toolbox):
- Create an empty folder and name it 'functions'
- Add all supporting functions to the folder
- Open MATLAB
- Go to Home -> Add-Ons -> Manage Add-Ons -> Click on the Rightmost Icon of the Toolbox -> Open Folder
- There, you will find all the functions that exist in the toolbox. Move all of these to the functions folder from before.
- Remove all non-functions from the folder. For example, in the HHXT MATLAB toolbox, remove the ‘doc’ and ‘resources’ directories.
- The structure of this folder doesn’t matter. The CHTC will recursively search over subfolders and compile all the functions it finds in the compiling step.
To compile everything for the CHTC, follow the steps below:
- Create a build file (which is essentially a .txt file). An example of a build file can be found here. Modify the file to suit your purpose.
- With the build file ready, run the following in the CHTC terminal, in the same directory as the build file:
condor_submit -i buildfile.sub
module load MATLAB/R2018b
mcc -m -R -singleCompThread -R -nodisplay -R -nojvm ZZvsModel.m -a functions
Note that this is specific for running a main script called ‘ZZvsModel.m’. Change that part of the code as needed.
After exiting, a .sh file will be created. If the main MATLAB script is ‘ZZvsModel.m’, this file will be named ‘run_ZZvsModel.sh’. Open this file and add the following to the file:
tar -xzf r2018b.tar.gz
You should also find that the compiled script file has been created. If your main MATLAB script is called 'ZZvsModel.m', the script will be named 'ZZvsModel'.
For submission, we require four things:
- The shell file created previously (For Example: run_ZZvsModel.sh)
- The compiled script, also created in the previous step.
- Input CSV file
- A submit text file
Input CSV File
If you are using the CHTC to run many combinations of inputs, an input csv file will allow you to have each input combination running on a separate node and make use of the CHTC's parallelism.
Take ZZvsModel.m as an example MATLAB script that is going to be run on the CHTC. The function inputs are shown below:
ZZvsModel(run, A, B)
Where "run", "A", and "B" are all the inputs we want to vary. In this example, we would prepare a csv file that appears as below:
0, 0.1, 0.3
1, 0.2, 0.3
2, 0.1, 0.4
3, 0.2, 0.4
Note that the header information should not be added to the csv file. The input 'run' is something that should be added to your MATLAB main script as a way to keep track of the input combinations later. It is also useful to use the 'run' input as a way to vary the names of the output file or any .mat file generated between each input combination. We will see how this CSV file is used in the next section.
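For more than a handful of combinations, the input csv file can be generated rather than typed by hand. The following is a minimal Python sketch, not part of the CHTC workflow itself: the file name 'input.csv', the helper names, and the parameter values are assumptions chosen to reproduce the four rows of the example above.

```python
import csv
import itertools

# Hypothetical parameter values; replace with the values you want to sweep.
A_values = [0.1, 0.2]
B_values = [0.3, 0.4]

def build_rows(a_values, b_values):
    """Return one [run, A, B] row per combination, with run counting from 0."""
    # product() varies its last argument fastest, so B is passed first
    # to make A change fastest, matching the order of the example rows.
    combos = [(a, b) for b, a in itertools.product(b_values, a_values)]
    return [[run, a, b] for run, (a, b) in enumerate(combos)]

def write_input_csv(path, rows):
    """Write the rows with no header line and plain '\n' line endings."""
    with open(path, "w", newline="") as f:
        csv.writer(f, lineterminator="\n").writerows(rows)

rows = build_rows(A_values, B_values)
write_input_csv("input.csv", rows)  # 'input.csv' is an assumed file name
```

The line terminator is set explicitly because Python's csv writer otherwise ends lines with a carriage return and newline, which is best avoided for a file that is read on a Linux system.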
A basic submit file that is not specific to MATLAB can be found here. An example of a submit file tailored for MATLAB can be found here. There are a few things to note about the submit file:
- The 'initialdir' line is not needed here and can be omitted
- The 'executable' line should point to your shell file
- At the bottom of the submit file, there is a 'queue' line. Here, you state the name of each input in the input csv file in the order of the inputs in the csv file. Follow the format in the example.
- Add these input names to the 'arguments' line. The 'v95' specified at the start is needed and is specific to MATLAB 2018b (newest version of MATLAB on the CHTC right now)
- While not shown in the example, you should have a line for errors and a line for logs. Simply add lines such as 'error = error.txt' and 'log = log.txt'. This is a must for debugging purposes.
- 'Transfer_input_files' should follow the format of the example (assuming you are running the 2018b version of MATLAB on the CHTC). In the example shown, the MATLAB script is reading from 'HX_state.mat'. List any similar files here, for example a text file that your script reads from.
- The memory, disk space, and number of CPUs you want to request for each node should be specified. If you don't have a good idea of how much you need, trial and error is fine here. Make sure to test a single node rather than every combination of inputs during this process (have another input text file with only one row). The resources a node actually used are reported in the log file that is generated every time a node is run on the CHTC. If the memory or disk space you request is too low, the CHTC will return an error telling you this.
- Finally, specify the output name and location for each node. Notice how you can use the input argument to vary each output text file as shown in the example.
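The one-row input file suggested above for test runs can be cut from the full input file. This is a small Python sketch under assumed names: the function 'write_test_input' and the file names in the usage note are hypothetical, not something the CHTC provides.

```python
def write_test_input(full_path, test_path, row_index=0):
    """Copy one row of the full input file into a one-row test input file."""
    with open(full_path) as f:
        # Skip blank lines so row_index counts real input combinations only.
        rows = [line for line in f if line.strip()]
    with open(test_path, "w") as f:
        f.write(rows[row_index].rstrip("\n") + "\n")
    return rows[row_index].strip()
```

For example, write_test_input("input.csv", "input_test.csv") would copy the first combination, and the 'queue' line of a test submit file would then name 'input_test.csv' instead of the full file.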
If everything is ready, type "condor_submit submitfile.sub" into the CHTC terminal. Type "condor_q" to check the progress of your jobs.
For the output of your script, you could either save the variables to a .mat file (whose name you should vary by run number in the MATLAB script itself) or read from the output text file. The first is quite straightforward, but these files are generally orders of magnitude larger than the output text files. If you decide to use the .mat files, check out this website to transfer files from the CHTC cloud to your computer more efficiently.
If you decide to use the output text files, there are a few things to take note of.
- The output text file won't have the typical outputs you see in the MATLAB command window. You will need to call out the output variables in your MATLAB script and make sure that there is no semicolon at the end of each callout. (If you want to see the value of a variable called 'Heat', simply have 'Heat' as a line in your MATLAB script.) It is a good idea to have all these callouts in one section, as explained next.
- The output files are generally messy and not in the form that you want. However, they do contain all the information you need to create, for example, an output csv file. This does require some other code to clean the data. If you are familiar with Python, the re (regular expression) and pandas modules, together with basic string operations, are sufficient for cleaning the data.
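As an illustration of this kind of cleaning, here is a minimal Python sketch that pulls scalar values out of an output text file and writes them to a csv file. It rests on several assumptions: each variable is a scalar echoed in MATLAB's usual "name =" form with the number on a later line, the variable names 'Heat' and 'Efficiency' and the sample text are hypothetical, and the standard csv module is used in place of pandas to keep the example self-contained. Matrices, strings, NaN and Inf are not handled.

```python
import csv
import re

# "name =" followed by a single number on a later line, which is how MATLAB
# echoes a scalar variable that is written without a trailing semicolon.
SCALAR = re.compile(
    r"^\s*(\w+)\s*=\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)\s*$",
    re.MULTILINE,
)

def parse_scalars(text):
    """Return {variable name: value} for every scalar echoed in the text."""
    return {name: float(value) for name, value in SCALAR.findall(text)}

def write_summary(path, results):
    """Write one csv row per run; results maps run number -> parsed values."""
    names = sorted({name for values in results.values() for name in values})
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["run"] + names)
        for run in sorted(results):
            writer.writerow([run] + [results[run].get(n, "") for n in names])

# Hypothetical output text of one run, with two scalars echoed.
sample = "\nHeat =\n\n   12.3450\n\n\nEfficiency =\n\n    0.8700\n\n"
parsed = parse_scalars(sample)
write_summary("summary.csv", {0: parsed})  # 'summary.csv' is an assumed name
```

In practice the sample string would be replaced by the contents of each node's output text file, with the run number taken from the file name.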
File last updated: September 16, 2009