Tutorials

The tutorials presented here are designed to help new users become familiar with the Metronome software. You will need to get an account on the UW-Madison Build & Test Lab in order to try out these examples.

This tutorial is also available as a single, printer-friendly document here.

Preparations

You will need a terminal window on a UW NMI Lab submit host, and a browser which can view the B&T web status pages.

In the terminal window, you will need to perform a few setup steps if you want to cut and paste directly from these pages.


bash$ export NMI_BIN=/nmi/bin
bash$ export NMI_LIB=/nmi/lib
bash$ source $NMI_BIN/config.sh    # csh users: source $NMI_BIN/config.csh

bash$ which nmi_submit
/nmi/bin/nmi_submit

bash$ nmi_submit --help
Usage: /nmi/bin/nmi_submit
  --nmiconf=          Select which NMI configuration file to use
  --must-match        Job must match with resources before submitted
  --notify-fail-only  Only send notification if job fails
  --verbose           Enable verbose output
  --quiet             Do not print job submission information
  --timeout=          Number of seconds to wait for runid. Default is 180
  --no-wait           Do not wait for runid. Program returns immediately
  --debug             Enable debug output
  --help              Show this information
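
Before moving on, it can help to confirm that the setup took effect. The following is a minimal sketch and not part of the Metronome distribution: require_cmd is a hypothetical helper name, and the only assumption is that nmi_submit should be reachable through PATH once the config file has been sourced.

```shell
# Hypothetical helper: report whether a command can be found on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $(command -v "$1")"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the tool this tutorial relies on; the message is only a reminder.
require_cmd nmi_submit || echo "nmi_submit not found; repeat the setup steps"
```

If the helper reports the command as missing, re-run the export and source lines in the same terminal window.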

You should also point your browser to the Build & Test Overview page.

Navigating the Build and Test Overview Page

Overview Home.

As shown in the figure below, when the Build & Test Overview page is first displayed, the page lists all of the submissions that have been run on the submit machines. The first thing that needs to be done is to filter the submissions down to a manageable number using the search box. In this case the page displays 1109 submissions.

Sort Buttons.

The submission results can be sorted by any column that has sort buttons. As shown in the previous picture, each sortable column has two buttons, which sort the results in either descending or ascending order.

Search Box.

The search box is used to filter out unwanted submissions. The figure below shows the fields that can be used to filter the results:

  1. Search all fields in the results database for the keyword.
  2. Show only submissions by selected user.
  3. Show only submissions of a selected run type; usually either build or test.
  4. Show only results from the selected platform.
  5. Advanced Search is not implemented at this time.
  6. Show only the submission associated with the selected run id.
  7. Show only results from the selected submit machine.
  8. Show only results from the selected project. This is the value associated with the project submit file command.
  9. Show only results from the selected component. This is the value associated with the component submit file command.
  10. Show results which are Running, Completed, Failed, or Removed.
  11. Show results from submissions run on the selected date.
  12. Show only pinned results when selected.

Details Page


The following picture shows an example details page for the build and test run:

  1. Clicking on this link will drop you into the run directory.
  2. These two buttons point to the output and standard error of the CVS command used to stage the component.
  3. This column shows the platforms the submission was run on in addition to the submit machine.
  4. This column contains links for each machine used in the run. These links display a host page which lists, among other things, the prerequisites that are installed.

Simple "Hello World" Build and Test Run

Introduction

This tutorial introduces the steps needed to run a B&T submission. The dissection sections explain the contents of the submit file and input file used in the exercise. The procedure and examining sections run through the steps of the exercise. Two submit procedures are presented. The version using CVS input shows you how the build and test system retrieves code from your source repository. The version using SCP shows you how the build and test system can download input files from the submit machine. You should have already prepared for this tutorial by completing the preparations section.

Hello World Using CVS Input


This tutorial demonstrates a simple build and test run using the CVS input method. Download the attached files and move them to the working directory of your submit machine and then follow the sections below. The first two pages discuss the contents of the files you have downloaded. This is followed by the procedure to execute the run and examine the results.

Build & Test Submit File Dissection

The following specifies fields that identify the run so that it can be located on the Build & Test Overview page.

project = tutorial
component = perlHelloWorld
component_version = 1.0.0
description = This is a simple example
run_type = build

This line points to the file perlHelloWorld.cvs for an input definition. The B&T system expects the file to be in the directory from which nmi_submit is run.

inputs = perlHelloWorld.cvs

These lines specify that the command "code/perlHelloWorld/helloWorld.pl Remote_Task Task" must be executed on each target platform.

remote_task = code/perlHelloWorld/helloWorld.pl
remote_task_args = Remote_Task Task

This line specifies that the run must be executed on a Fedora Core 3 system and a Sun Solaris 5.8 system.

platforms = x86_fc_3, sun4u_sol_5.8

This line specifies where the B&T system should send the job completion message.

notify =
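
Put together, the fields discussed above form a file like the one sketched here. The sketch writes a copy into a scratch directory, so the downloaded attachment is not overwritten, and reads one field back. The helper get_field is hypothetical, and the notify address is a placeholder rather than a value from the tutorial files.

```shell
# Scratch directory so the downloaded perlHelloWorld.submit is left untouched.
workdir=$(mktemp -d)

# The fields from the dissection, in one file. The notify address is a placeholder.
cat > "$workdir/perlHelloWorld.submit" <<'EOF'
project = tutorial
component = perlHelloWorld
component_version = 1.0.0
description = This is a simple example
run_type = build
inputs = perlHelloWorld.cvs
remote_task = code/perlHelloWorld/helloWorld.pl
remote_task_args = Remote_Task Task
platforms = x86_fc_3, sun4u_sol_5.8
notify = you@example.org
EOF

# Hypothetical helper: print the value of the first "key = value" line for a key.
get_field() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

get_field "$workdir/perlHelloWorld.submit" platforms
```

Because the pattern requires an equals sign directly after the key, asking for remote_task does not pick up the remote_task_args line.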

Input Specification File Dissection

The first line specifies the way the software is transferred to the submit machine.

method = cvs

For the cvs method, the root and module need to be specified. Note that on the target machine, the path to the software will be the same as the module name. In this case the software will be found in the directory userdir/code/perlHelloWorld.

cvs_root = /nmi/tutorial-cvs
cvs_module = code/perlHelloWorld
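
Because the checkout lands under the module name, the remote_task path in the submit file has to start with the cvs_module value. The following sketch shows one way to check that; task_in_module is a hypothetical helper, and the two values are the ones used in this tutorial.

```shell
# Hypothetical helper: succeed when the task path lies inside the module directory.
task_in_module() {
    case "$1" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

cvs_module="code/perlHelloWorld"
remote_task="code/perlHelloWorld/helloWorld.pl"

if task_in_module "$remote_task" "$cvs_module"; then
    echo "remote_task is inside the checked-out module"
else
    echo "remote_task is outside $cvs_module" >&2
fi
```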

Procedure

  1. Make sure you have downloaded the attachment files into a working directory on the submit machine.
  2. Open the submit file perlHelloWorld.submit with your favorite editor and add a notify command. This will tell the build and test system how to notify you when a run is completed. Use the following format:

notify = <your@email.address>
  3. Start the submission by running nmi_submit.

bash$ nmi_submit perlHelloWorld.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201015858_4220
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201015858_4220
Run ID: 72218
The output provided by nmi_submit shows some important information about your run. The first line gives you a unique global identifier (tutorial_nmi-s005.cs.wisc.edu_1201015858_4220) for the job. The second line points to the directory (/nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201015858_4220) where the submission results are stored. The third line shows the Condor run ID (72218), which is used by the Build & Test Overview page.
  4. Check your email. You will receive a message when the run is completed.
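
If you want to reuse the run ID or run directory in later commands, the nmi_submit output can be picked apart with standard tools. This is a sketch that assumes the output has one "label: value" pair per line, as in the sample run; submit_field is a hypothetical helper, and the sample text stands in for a real submission.

```shell
# Hypothetical helper: print the value that follows "<label>: " on stdin.
submit_field() {
    sed -n "s/^$1: *//p" | head -n 1
}

# Sample output copied from the run above; a real script would capture it with
#   submit_output=$(nmi_submit perlHelloWorld.submit)
submit_output='Global ID: tutorial_nmi-s005.cs.wisc.edu_1201015858_4220
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201015858_4220
Run ID: 72218'

run_id=$(printf '%s\n' "$submit_output" | submit_field "Run ID")
run_dir=$(printf '%s\n' "$submit_output" | submit_field "Run Directory")
echo "run $run_id stores its results in $run_dir"
```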

Hello World Using SCP Input


This tutorial demonstrates a simple build and test run using the SCP input method. Download the attached files and move them to the working directory of your submit machine and then follow the sections below. The first two pages discuss the contents of the files you have downloaded. This is followed by the procedure to execute the run and examine the results.

Build & Test Submit File Dissection

The following specifies fields that identify the run so that it can be located on the Build & Test Overview page.

project = tutorial
component = perlHelloWorld
component_version = 1.0.0
description = This is a simple example
run_type = build

This line points to the file perlHelloWorld.scp for an input definition. The B&T system expects the file to be in the directory from which nmi_submit is run.

inputs = perlHelloWorld.scp

These lines specify that the command "helloWorld.pl Remote_Task Task" must be executed on each target platform.

remote_task = helloWorld.pl
remote_task_args = Remote_Task Task

This line specifies that the run must be executed on a Fedora Core 3 system and a Sun Solaris 5.8 system.

platforms = x86_fc_3, sun4u_sol_5.8

This line specifies where the B&T system should send the job completion message.

notify =

Input Specification File Dissection

The first line specifies the way the software is transferred to the submit machine.

method = scp

The scp method downloads files from the submit machine. In this case we want it to download the hello world Perl script included in the attachments you downloaded into the working directory of the submit machine.

scp_file = /YOUR/WORKING/DIRECTORY/helloWorld.pl

Procedure

  1. Make sure you have downloaded the attachment files into a working directory on the submit machine.
  2. Open the submit file perlHelloWorld.submit with your favorite editor and add a notify command. This will tell the build and test system how to notify you when a run is completed. Use the following format:

notify = <your@email.address>
  3. Open the input file perlHelloWorld.scp with the same editor and replace the phrase /YOUR/WORKING/DIRECTORY/ with the pathname of your working directory. You can determine this by executing the command pwd.
  4. Start the submission by running nmi_submit.

bash$ nmi_submit perlHelloWorld.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201016619_4691
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201016619_4691
Run ID: 72219
The output provided by nmi_submit shows some important information about your run. The first line gives you a unique global identifier (tutorial_nmi-s005.cs.wisc.edu_1201016619_4691) for the job. The second line points to the directory (/nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201016619_4691) where the submission results are stored. The third line shows the Condor run ID (72219), which is used by the Build & Test Overview page.
  5. Check your email. You will receive a message when the run is completed.
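
The placeholder replacement in the input file can also be done from the command line instead of an editor. The sketch below is one way to do it with sed; fill_working_dir is a hypothetical helper, it works on a scratch copy of the file, and it assumes the directory name contains no "|" or "&" characters.

```shell
# Hypothetical helper: print the input file with the placeholder replaced by a directory.
# Assumes the directory name contains no "|" or "&" characters.
fill_working_dir() {
    sed "s|/YOUR/WORKING/DIRECTORY|$2|" "$1"
}

# Demonstrate on a scratch copy rather than the downloaded file.
scratch=$(mktemp -d)
printf '%s\n' 'method = scp' 'scp_file = /YOUR/WORKING/DIRECTORY/helloWorld.pl' \
    > "$scratch/perlHelloWorld.scp"

# Print the file with the current directory filled in.
fill_working_dir "$scratch/perlHelloWorld.scp" "$(pwd)"
```

The helper prints to standard output; redirect it to a new file and inspect the result before replacing the original.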

Examining Build & Test Run Results


Here are two ways to look at the results of a build and test run. The first takes a look at the run results directory on the submit machine. The second way uses the build and test overview web page.

Examining Build & Test Run Directory

  1. The nmi_submit command output includes a path to the results directory. The path name begins with /nmi/run/ and ends with your global ID (GID). Change into this directory and list the files.

bash$ cd /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201013789_32264
bash$ ls
  2. The directory is full of Condor artifacts which will become useful in more advanced tutorials. For now our files are in the userdir subdirectory. Change into this directory.

bash$ cd userdir/
bash$ ls
You will see one of two platform naming schemes. (The nmi: prefix is not significant for the purposes of this tutorial.)
common sun4u_sol_5.8 x86_fc_3
OR
common nmi:sun4u_sol_5.8 nmi:x86_fc_3
  3. As you can see, there are subdirectories here for each of the platforms, and a common directory which contains the files from the submit machine.
  4. For this example the common directory does not contain anything significant. Change into the Sun subdirectory and take a look at its contents.

bash$ cd sun4u_sol_5.8/
bash$ ls
code remote_task.err remote_task.out
  5. This directory contains the standard out and standard error logs of the remote task. The code subdirectory contains the hello world Perl script that was checked out of CVS and executed. Print the remote_task.out file and you will see the hello world result:

bash$ cat remote_task.out
Hello World from "Remote_Task Task"
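
The same inspection can be repeated for every platform in one pass. The following is a sketch rather than a tool shipped with the build and test system: show_task_output is a hypothetical helper that assumes the layout described above (a userdir directory with a common subdirectory and one subdirectory per platform, with or without the nmi: prefix).

```shell
# Hypothetical helper: print remote_task.out for every platform under a run directory.
show_task_output() {
    for dir in "$1"/userdir/*/; do
        name=$(basename "$dir")
        # Skip the common directory and anything without task output.
        if [ "$name" = common ] || [ ! -f "$dir/remote_task.out" ]; then
            continue
        fi
        # "${name#nmi:}" drops the optional nmi: prefix from the directory name.
        echo "== ${name#nmi:}"
        cat "$dir/remote_task.out"
    done
}

# Usage, with the run directory taken from the nmi_submit output:
#   show_task_output /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201013789_32264
```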

Examining Build & Test Run Results Using the Build & Test Overview Page

Examine a Build & Test Run

  1. Go to the Build & Test Overview page.
  2. Filter the results by selecting your user name in the search box and clicking the SEARCH button. Review the Search Box section if you need to know how to do this.
  3. Locate your job by matching the run ID that nmi_submit returned. Your job should look similar to the figure below.
  4. Click on the details page and then click on the View link.
  5. Clicking on this link will drop you into the results directory, where you can examine the files that were generated by the submission. From here you can navigate to the remote_task.out file examined in the previous section by clicking on userdir, sun4u_sol_5.8, and then remote_task.out.

The fields for the most part match either what was in the build and test submit file or what nmi_submit returned:

  1. Run ID that nmi_submit returned. Clicking on this number will take you to the Details page.
  2. Shows the job is still running.
  3. Your account name on the submit machine.
  4. The run type from the perlHelloWorld.submit submit file.
  5. The project from the perlHelloWorld.submit submit file.
  6. The component from the perlHelloWorld.submit submit file.
  7. The description from the perlHelloWorld.submit submit file.
  8. The number of platforms listed in the platforms command from the perlHelloWorld.submit submit file.
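
The platform count in the last field can be reproduced from the platforms value itself. This is a small sketch under the assumption that the value is a comma-separated list, as in the tutorial submit file; count_platforms is a hypothetical helper.

```shell
# Hypothetical helper: count the entries in a comma-separated platforms value.
count_platforms() {
    printf '%s\n' "$1" |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -c .
}

count_platforms "x86_fc_3, sun4u_sol_5.8"
```

For the tutorial submit file this prints 2, one for each entry in the list.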

Alternate Inputs

This tutorial demonstrates a couple of variations of the standard fetch used by most of the other tutorials. First a tag is added to the cvs method defined in the Hello World Tutorial. Then the cvs method is replaced by the ftp method.

CVS Tag Procedure

  1. Re-use the input specification and build and test specification files from the Hello World Tutorial.
  2. For this exercise we will add a cvs tag that corresponds to a hello world script with a slightly different output. Open the file perlHelloWorld.cvs in your favorite editor and add the following line:

cvs_tag = helloBranch
  3. Run nmi_submit using these files.

bash$ nmi_submit perlHelloWorld.submit
  4. Locate your run’s details page using the overview page with the skills you learned earlier.
  5. Click on the output of fetch.perlHelloWorld.cvs. You should see something like this:

Doing a cvs check out
U code/perlHelloWorld/.project
U code/perlHelloWorld/helloWorld.pl
system('cvs co -r helloBranch code/perlHelloWorld ') exited with value 0
  6. Notice in the output of the command above where the tag name is used.
  7. Click on one of the remote_task outputs. The output should contain the phrase (Tagged Version), which shows that the task has run the tagged version of the script.
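
If you repeat this exercise, it is easy to add the cvs_tag line twice. The sketch below appends a setting only when the key is not already present; add_setting is a hypothetical helper, and it is shown on a scratch copy of the input specification file.

```shell
# Hypothetical helper: append "key = value" to a file unless the key is already set.
add_setting() {
    if grep -q "^[[:space:]]*$2[[:space:]]*=" "$1"; then
        echo "$2 already set in $1" >&2
    else
        printf '%s = %s\n' "$2" "$3" >> "$1"
    fi
}

# Scratch copy of the input specification file from the Hello World tutorial.
spec=$(mktemp)
printf '%s\n' 'method = cvs' 'cvs_root = /nmi/tutorial-cvs' \
    'cvs_module = code/perlHelloWorld' > "$spec"

add_setting "$spec" cvs_tag helloBranch
cat "$spec"
```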

FTP Procedure

  1. Re-use the build and test specification file from the Hello World Tutorial.
  2. Download the attachment below. This is the input specification file that defines the ftp fetch. It contains the following:

method = ftp
ftp_root = ftp://ftp.cs.wisc.edu/condor/nmi/tutorial/
ftp_target = helloWorld.tar.gz
  3. Edit the file perlHelloWorld.submit with your favorite editor. The input needs to be changed to specify the new ftp input specification file. Change the line inputs = perlHelloWorld.cvs to inputs = perlHelloWorld.ftp.
  4. Run nmi_submit using these files.

bash$ nmi_submit perlHelloWorld.submit
  5. Locate your run’s details page using the overview page with the skills you learned earlier.
  6. Click on the output of fetch.perlHelloWorld.ftp. You should see something like this:

Doing an ftp check out
system('wget ftp://ftp.cs.wisc.edu/condor/nmi/tutorials/helloWorld.tar.gz ') exited with value 0
  7. Click on the outputs of the remote_task tasks. You should see the hello world from the script included in the tar archive.
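
Switching the inputs line is a one-line change, so it can also be scripted. The following is a sketch only: set_inputs is a hypothetical helper that prints the edited submit file to standard output, and the demonstration uses a scratch file containing just two lines.

```shell
# Hypothetical helper: print a submit file with the value of its inputs line replaced.
set_inputs() {
    sed "s|^\([[:space:]]*inputs[[:space:]]*=[[:space:]]*\).*|\1$2|" "$1"
}

# Scratch file standing in for perlHelloWorld.submit.
submit=$(mktemp)
printf '%s\n' 'project = tutorial' 'inputs = perlHelloWorld.cvs' > "$submit"

set_inputs "$submit" perlHelloWorld.ftp
```

Only the inputs line is rewritten; every other line passes through unchanged.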

Task Hooks

Introduction.

This tutorial demonstrates the concept of task hooks. Task hooks are associated with the various stages of a build and test run. The Task Hook Diagram shows where these stages occur. This tutorial will show you how to use these hooks and how to examine the results of the tasks that are run.

Build & Test Submit File Dissection

Task Hooks in a Build and Test Submit File

The following shows the submit file used to demonstrate task hooks. The task hook commands are the lines that name a hook, such as pre_all or remote_task, together with their matching _args lines. The task hook remote_task was used in the Hello World tutorial.

project = tutorial
component = whereAmI
description = Example to demonstrate task hooks
run_type = build
inputs = whereAmI.cvs

The following task hooks are executed on the submit machine.

pre_all = code/perlWhereAmI/whereAmI.pl
pre_all_args = Pre_All Task
platform_pre = code/perlWhereAmI/whereAmI.pl
platform_pre_args = platform_Pre Task
platform_post = code/perlWhereAmI/whereAmI.pl
platform_post_args = platform_Post Task
post_all = code/perlWhereAmI/whereAmI.pl
post_all_args = Post_All Task

The following task hooks are executed on each platform.

remote_pre_declare = code/perlWhereAmI/whereAmI.pl
remote_pre_declare_args = Remote_Pre_Declare Task
remote_declare = code/perlWhereAmI/whereAmI.pl
remote_declare_args = Remote_Declare Task
remote_pre = code/perlWhereAmI/whereAmI.pl
remote_pre_args = Remote_Pre Task
remote_task = code/perlWhereAmI/whereAmI.pl
remote_task_args = Remote_Task Task
remote_post = code/perlWhereAmI/whereAmI.pl
remote_post_args = Remote_Post Task
platforms = x86_fc_3, sun4u_sol_5.8
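
The split between the two groups of hooks follows the hook names, which makes it easy to summarize. The sketch below simply restates the grouping given above; hook_location is a hypothetical helper, and the hooks are listed in the order in which they appear in the submit file.

```shell
# Hypothetical helper: report where a task hook runs, following the grouping above.
hook_location() {
    case "$1" in
        pre_all|platform_pre|platform_post|post_all) echo "submit machine" ;;
        remote_*) echo "platform" ;;
        *) echo "unknown" ;;
    esac
}

# Hooks in the order they appear in the submit file.
for hook in pre_all platform_pre platform_post post_all \
            remote_pre_declare remote_declare remote_pre remote_task remote_post; do
    printf '%-20s %s\n' "$hook" "$(hook_location "$hook")"
done
```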

Procedure

  1. Review the Task Hook Diagram to understand the task hook sequence.
  2. Download the attachment files into a working directory on the submit machine.
  3. Open the submit file whereAmI.submit with your favorite editor and add a notify command. This will tell the build and test system how to notify you when a run is completed. Use the following format:

notify = <your@email.address>
  4. Start the submission by running nmi_submit.

bash$ nmi_submit whereAmI.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201019820_7684
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201019820_7684
Run ID: 72223

Examine the Task Hook Result Files

  1. Change into your run directory, whose path begins with /nmi/run/ and ends with your GID. The result files in this directory are from tasks run on the submit machine.
  2. List all of the platform_pre files. As you can see, there is a set of files for each of the platforms. You’ll see the results using one of several naming schemes. Again, you can ignore the nmi: at the start of the platform name for the purpose of this tutorial.

bash$ cd /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201019820_7684/
bash$ ls platform_pre.*
platform_pre.sun4u_sol_5.8.err  platform_pre.sun4u_sol_5.8.sub  platform_pre.x86_fc_3.out
platform_pre.sun4u_sol_5.8.out  platform_pre.x86_fc_3.err       platform_pre.x86_fc_3.sub
OR
platform_pre.nmi:sun4u_sol_5.8.err  platform_pre.nmi:sun4u_sol_5.8.sub  platform_pre.nmi:x86_fc_3.out
platform_pre.nmi:sun4u_sol_5.8.out  platform_pre.nmi:x86_fc_3.err       platform_pre.nmi:x86_fc_3.sub
  3. Each task runs a script that outputs the time it was executed. Grep for this time value in the output files.

bash$ grep 'Time now' *.out
platform_post.sun4u_sol_5.8.out: Time now is 14:21:28
platform_post.x86_fc_3.out: Time now is 14:21:06
platform_pre.sun4u_sol_5.8.out: Time now is 14:20:27
platform_pre.x86_fc_3.out: Time now is 14:20:08
post_all.out: Time now is 14:22:02
pre_all.out: Time now is 14:19:50
OR
platform_post.nmi:sun4u_sol_5.8.out: Time now is 16:39:09
platform_post.nmi:x86_fc_3.out: Time now is 16:38:49
platform_pre.nmi:sun4u_sol_5.8.out: Time now is 16:38:01
platform_pre.nmi:x86_fc_3.out: Time now is 16:37:46
post_all.out: Time now is 16:39:38
pre_all.out: Time now is 16:37:32
  4. Change into a platform directory to see the task results for a remote platform. The platform subdirectory is located under ./userdir/ and is named after the platform.

bash$ cd userdir/nmi:sun4u_sol_5.8/
bash$ ls
cmdfile  remote_declare.err  remote_post.out         remote_pre.err   remote_task.out
code     remote_declare.out  remote_pre_declare.err  remote_pre.out   task_wrapper.sh
put.pl   remote_post.err     remote_pre_declare.out  remote_task.err
  5. Run the same grep command to examine the execution times of these tasks.

bash$ grep 'Time now' *.out
remote_declare.out: Time now is 14:20:59
remote_post.out: Time now is 14:21:05
remote_pre_declare.out: Time now is 14:20:57
remote_pre.out: Time now is 14:21:01
remote_task.out: Time now is 14:21:03
OR
remote_declare.out: Time now is 16:38:44
remote_post.out: Time now is 16:38:50
remote_pre_declare.out: Time now is 16:38:42
remote_pre.out: Time now is 16:38:46
remote_task.out: Time now is 16:38:48
  6. Return to the original directory and run a combination find/grep command to print out the time of all of the tasks.

bash$ cd ../..
bash$ find . -name "*.out" | xargs grep 'Time now'
./userdir/x86_fc_3/remote_pre_declare.out: Time now is 14:20:38
./userdir/x86_fc_3/remote_task.out: Time now is 14:20:44
./userdir/x86_fc_3/remote_pre.out: Time now is 14:20:42
./userdir/x86_fc_3/remote_post.out: Time now is 14:20:46
./userdir/x86_fc_3/remote_declare.out: Time now is 14:20:40
./userdir/sun4u_sol_5.8/remote_pre_declare.out: Time now is 14:20:57
./userdir/sun4u_sol_5.8/remote_declare.out: Time now is 14:20:59
./userdir/sun4u_sol_5.8/remote_pre.out: Time now is 14:21:01
./userdir/sun4u_sol_5.8/remote_task.out: Time now is 14:21:03
./userdir/sun4u_sol_5.8/remote_post.out: Time now is 14:21:05
./pre_all.out: Time now is 14:19:50
./platform_pre.x86_fc_3.out: Time now is 14:20:08
./platform_pre.sun4u_sol_5.8.out: Time now is 14:20:27
./platform_post.x86_fc_3.out: Time now is 14:21:06
./platform_post.sun4u_sol_5.8.out: Time now is 14:21:28
./post_all.out: Time now is 14:22:02
OR
./userdir/nmi:x86_fc_3/remote_pre_declare.out: Time now is 16:38:20
./userdir/nmi:x86_fc_3/remote_task.out: Time now is 16:38:26
./userdir/nmi:x86_fc_3/remote_pre.out: Time now is 16:38:24
./userdir/nmi:x86_fc_3/remote_post.out: Time now is 16:38:28
./userdir/nmi:x86_fc_3/remote_declare.out: Time now is 16:38:22
./userdir/nmi:sun4u_sol_5.8/remote_pre_declare.out: Time now is 16:38:42
./userdir/nmi:sun4u_sol_5.8/remote_declare.out: Time now is 16:38:44
./userdir/nmi:sun4u_sol_5.8/remote_pre.out: Time now is 16:38:46
./userdir/nmi:sun4u_sol_5.8/remote_task.out: Time now is 16:38:48
./userdir/nmi:sun4u_sol_5.8/remote_post.out: Time now is 16:38:50
./pre_all.out: Time now is 16:37:32
./platform_pre.nmi:x86_fc_3.out: Time now is 16:37:46
./platform_pre.nmi:sun4u_sol_5.8.out: Time now is 16:38:01
./platform_post.nmi:x86_fc_3.out: Time now is 16:38:49
./platform_post.nmi:sun4u_sol_5.8.out: Time now is 16:39:09
./post_all.out: Time now is 16:39:38
  7. Examine these times. The time order should match the task hook sequence shown in the Task Hook Diagram.
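
Comparing the times by eye gets tedious, so it can help to sort them. The sketch below reorders grep output of the form shown above so that the time comes first, then sorts it. The helper sort_by_time is hypothetical, and it assumes every line looks like "file: Time now is HH:MM:SS" with all times from the same day.

```shell
# Hypothetical helper: turn "file: Time now is HH:MM:SS" lines into "HH:MM:SS file",
# sorted by time. File names may themselves contain colons (the nmi: prefix).
sort_by_time() {
    sed -n 's/^\(.*\):[[:space:]]*Time now is \([0-9:]*\).*$/\2 \1/p' | sort
}

# Usage from the run directory:
#   find . -name "*.out" | xargs grep 'Time now' | sort_by_time
# A few lines from the sample run stand in for real output here.
printf '%s\n' \
    './userdir/sun4u_sol_5.8/remote_task.out: Time now is 14:21:03' \
    './post_all.out: Time now is 14:22:02' \
    './platform_pre.sun4u_sol_5.8.out: Time now is 14:20:27' \
    './pre_all.out: Time now is 14:19:50' | sort_by_time
```

Keep in mind that the clocks on different machines may disagree, so small inversions between submit machine tasks and platform tasks do not necessarily indicate a problem.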

Examine Task Hooks with Build and Test Overview Page.

  1. Go to the Build & Test Overview page.
  2. Find the WhereAmI run.
  3. Select its details page by clicking on its ID. You should see results similar to the image below.
  4. Click on the ascending order button ( 1 ) in the Start column. The Name column ( 2 ) should show the tasks in execution order [1].
  5. Notice the platform_pre and platform_post tasks. These tasks are performed once on the submit machine for every platform. The Host column shows the machine that the task was executed on. The Platform column shows which platform the task was executed for.
  6. Click on the Output buttons ( 3 ) to see the results of each task. The whereAmI script prints out the time it was executed.

[1] The times in the Start column may be off because of clock skew between the machines of the Build and Test System. For example, the tasks run on the submit machine may appear to have been executed after one or more of the platform tasks.

Variables and Macros

Introduction.

This tutorial shows how the build and test system’s variables are set, and how user macros can be used. The build and test system publishes information about the run to user-provided scripts in the form of shell environment variables. This section explains each of the variables that the system provides. The build and test system also allows for macros to be defined by the user in the build and test specification file. These also appear to the task scripts as shell environment variables.
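
To make the idea concrete, here is a shell sketch of what a task script can do with these variables. It is not the script used in this tutorial (that one is written in Perl); dump_task_env is a hypothetical helper that prints every environment entry containing NMI, CONDOR, or PATH.

```shell
# Hypothetical helper: print the environment entries that contain NMI, CONDOR, or PATH.
dump_task_env() {
    env | grep -E 'NMI|CONDOR|PATH' | sort
}

dump_task_env
```

Run outside the build and test system, this typically shows little more than PATH; inside a task, the variables published by the system would be expected to appear as well.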

Procedure

  1. Take the build and test submit file from the task hook tutorial, rename the whereAmI.submit file to whereAmIWithMacro.submit and add the following line to the file:

IAmAMacro = I am a macro
This line sets a user macro to the value “I am a macro”.
  2. Submit the file to the build and test system.

bash$ nmi_submit whereAmIWithMacro.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201021357_15130
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201021357_15130
Run ID: 72225

Examine the Task Variables

  1. The Perl script (code/perlWhereAmI/whereAmI.pl) prints out all of the variables whose names contain NMI, CONDOR, or PATH. You can therefore look in the standard output files for each task (stored in the run directory) to determine the variables set by the build and test system.
  2. Change to the user directory.

bash$ cd /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201021357_15130
bash$ cd userdir/
  3. Search for the NMI_PLATFORM variable. As you can see, it is present in every task.

bash$ find . -name "*.out" | xargs grep 'NMI_PLATFORM'
./x86_fc_3/remote_pre_declare.out: NMI_PLATFORM=x86_fc_3
./x86_fc_3/remote_task.out: NMI_PLATFORM=x86_fc_3
./x86_fc_3/remote_pre.out: NMI_PLATFORM=x86_fc_3
./x86_fc_3/remote_post.out: NMI_PLATFORM=x86_fc_3
./x86_fc_3/remote_declare.out: NMI_PLATFORM=x86_fc_3
./sun4u_sol_5.8/remote_pre_declare.out: NMI_PLATFORM=sun4u_sol_5.8
./sun4u_sol_5.8/remote_declare.out: NMI_PLATFORM=sun4u_sol_5.8
./sun4u_sol_5.8/remote_pre.out: NMI_PLATFORM=sun4u_sol_5.8
./sun4u_sol_5.8/remote_task.out: NMI_PLATFORM=sun4u_sol_5.8
./sun4u_sol_5.8/remote_post.out: NMI_PLATFORM=sun4u_sol_5.8
OR
./nmi:x86_fc_3/remote_pre_declare.out: NMI_PLATFORM=x86_fc_3
./nmi:x86_fc_3/remote_task.out: NMI_PLATFORM=x86_fc_3
./nmi:x86_fc_3/remote_pre.out: NMI_PLATFORM=x86_fc_3
./nmi:x86_fc_3/remote_post.out: NMI_PLATFORM=x86_fc_3
./nmi:x86_fc_3/remote_declare.out: NMI_PLATFORM=x86_fc_3
./nmi:sun4u_sol_5.8/remote_pre_declare.out: NMI_PLATFORM=sun4u_sol_5.8
./nmi:sun4u_sol_5.8/remote_declare.out: NMI_PLATFORM=sun4u_sol_5.8
./nmi:sun4u_sol_5.8/remote_pre.out: NMI_PLATFORM=sun4u_sol_5.8
./nmi:sun4u_sol_5.8/remote_task.out: NMI_PLATFORM=sun4u_sol_5.8
./nmi:sun4u_sol_5.8/remote_post.out: NMI_PLATFORM=sun4u_sol_5.8
Most of the output filenames above correspond to tasks you may recognize from your run specification file, but some of them do not (e.g., platform_job.*). These are output files corresponding to internal scripts run by the NMI build & test system itself, and can be ignored for now.

  4. Search for a specification file command. All of these are also present in every task.

bash$ find . -name "*.out" | xargs grep 'NMI_remote_task'
./x86_fc_3/remote_pre_declare.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./x86_fc_3/remote_pre_declare.out: NMI_remote_task_args=Remote_Task Task
./x86_fc_3/remote_task.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./x86_fc_3/remote_task.out: NMI_remote_task_args=Remote_Task Task
./x86_fc_3/remote_pre.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./x86_fc_3/remote_pre.out: NMI_remote_task_args=Remote_Task Task
./x86_fc_3/remote_post.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./x86_fc_3/remote_post.out: NMI_remote_task_args=Remote_Task Task
./x86_fc_3/remote_declare.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./x86_fc_3/remote_declare.out: NMI_remote_task_args=Remote_Task Task
./sun4u_sol_5.8/remote_pre_declare.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./sun4u_sol_5.8/remote_pre_declare.out: NMI_remote_task_args=Remote_Task Task
./sun4u_sol_5.8/remote_declare.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./sun4u_sol_5.8/remote_declare.out: NMI_remote_task_args=Remote_Task Task
./sun4u_sol_5.8/remote_pre.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./sun4u_sol_5.8/remote_pre.out: NMI_remote_task_args=Remote_Task Task
./sun4u_sol_5.8/remote_task.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./sun4u_sol_5.8/remote_task.out: NMI_remote_task_args=Remote_Task Task
./sun4u_sol_5.8/remote_post.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./sun4u_sol_5.8/remote_post.out: NMI_remote_task_args=Remote_Task Task
OR
./nmi:x86_fc_3/remote_pre_declare.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:x86_fc_3/remote_pre_declare.out: NMI_remote_task_args=Remote_Task Task
./nmi:x86_fc_3/remote_task.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:x86_fc_3/remote_task.out: NMI_remote_task_args=Remote_Task Task
./nmi:x86_fc_3/remote_pre.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:x86_fc_3/remote_pre.out: NMI_remote_task_args=Remote_Task Task
./nmi:x86_fc_3/remote_post.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:x86_fc_3/remote_post.out: NMI_remote_task_args=Remote_Task Task
./nmi:x86_fc_3/remote_declare.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:x86_fc_3/remote_declare.out: NMI_remote_task_args=Remote_Task Task
./nmi:sun4u_sol_5.8/remote_pre_declare.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:sun4u_sol_5.8/remote_pre_declare.out: NMI_remote_task_args=Remote_Task Task
./nmi:sun4u_sol_5.8/remote_declare.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:sun4u_sol_5.8/remote_declare.out: NMI_remote_task_args=Remote_Task Task
./nmi:sun4u_sol_5.8/remote_pre.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:sun4u_sol_5.8/remote_pre.out: NMI_remote_task_args=Remote_Task Task
./nmi:sun4u_sol_5.8/remote_task.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:sun4u_sol_5.8/remote_task.out: NMI_remote_task_args=Remote_Task Task
./nmi:sun4u_sol_5.8/remote_post.out: NMI_remote_task=code/perlWhereAmI/whereAmI.pl
./nmi:sun4u_sol_5.8/remote_post.out: NMI_remote_task_args=Remote_Task Task

  5. Search for the _NMI_GID variable, which is set only for tasks run on the submit machine. It therefore appears only in the output files in the run directory.

bash$ find ../ -name "*.out" | xargs grep '_NMI_GID'
./pre_all.out: _NMI_GID=mbletzin_grandcentral.cs.wisc.edu_1151605501_23834
./platform_pre.x86_fc_3.out: _NMI_GID=mbletzin_grandcentral.cs.wisc.edu_1151605501_23834
./platform_pre.sun4u_sol_5.8.out: _NMI_GID=mbletzin_grandcentral.cs.wisc.edu_1151605501_23834
./platform_post.x86_fc_3.out: _NMI_GID=mbletzin_grandcentral.cs.wisc.edu_1151605501_23834
./platform_post.sun4u_sol_5.8.out: _NMI_GID=mbletzin_grandcentral.cs.wisc.edu_1151605501_23834
./post_all.out: _NMI_GID=mbletzin_grandcentral.cs.wisc.edu_1151605501_23834
OR
../pre_all.out: _NMI_GID=tutorial_nmi-s005.cs.wisc.edu_1201021357_15130
../platform_pre.nmi:x86_fc_3.out: _NMI_GID=tutorial_nmi-s005.cs.wisc.edu_1201021357_15130
../platform_pre.nmi:sun4u_sol_5.8.out: _NMI_GID=tutorial_nmi-s005.cs.wisc.edu_1201021357_15130
../platform_post.nmi:sun4u_sol_5.8.out: _NMI_GID=tutorial_nmi-s005.cs.wisc.edu_1201021357_15130
../platform_post.nmi:x86_fc_3.out: _NMI_GID=tutorial_nmi-s005.cs.wisc.edu_1201021357_15130
../post_all.out: _NMI_GID=tutorial_nmi-s005.cs.wisc.edu_1201021357_15130

  6. Search for the user macro IAmAMacro.

bash$ find . -name "*.out" | xargs grep -i 'iamamacro'
./x86_fc_3/remote_pre_declare.out: NMI_IAmAMacro=Here I Am
./x86_fc_3/remote_task.out: NMI_IAmAMacro=Here I Am
./x86_fc_3/remote_pre.out: NMI_IAmAMacro=Here I Am
./x86_fc_3/remote_post.out: NMI_IAmAMacro=Here I Am
./x86_fc_3/remote_declare.out: NMI_IAmAMacro=Here I Am
./sun4u_sol_5.8/remote_pre_declare.out: NMI_IAmAMacro=Here I Am
./sun4u_sol_5.8/remote_declare.out: NMI_IAmAMacro=Here I Am
./sun4u_sol_5.8/remote_pre.out: NMI_IAmAMacro=Here I Am
./sun4u_sol_5.8/remote_task.out: NMI_IAmAMacro=Here I Am
./sun4u_sol_5.8/remote_post.out: NMI_IAmAMacro=Here I Am
OR, when the platform directories carry the nmi: prefix, the same lines appear under ./nmi:x86_fc_3/ and ./nmi:sun4u_sol_5.8/.

Retrieving Output

Introduction.

This tutorial explains how output files from tasks can be retrieved. Up to this point, the only output that you have dealt with is the messages written to standard out and standard error for each task. This is also the first tutorial for which what actually happens inside the component distribution is important. For this reason we will first take a look at the perl script that is used to generate the output.

Examining generateOutput.pl

The script used for this tutorial is attached. The script creates a file called “CreatedFile.txt” into which each task writes an identification message. Locate and examine the following code snippet:

if (-e 'CreatedFile.txt') {
    open IN, 'CreatedFile.txt';
    my @lines = <IN>;
    close IN;
    print "Found created file:\n";
    for my $l (@lines) {
        print $l;
    }
} else {
    print "Created file not found\n";
}

This script checks for the existence of the file CreatedFile.txt in the current working directory. If the file exists, the script will read its contents and print them to standard out.

Immediately following this section is the section that appends an identification message to the file. This will also create the file if it does not exist:

open OUT, ">>CreatedFile.txt";
print OUT "Greetings from $taskhook. The time is $time.\n";
close OUT;

By tracking the contents of this file, we can see how the build and test system manages the contents of the component for the various tasks.
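The read-then-append behaviour described above can be sketched in shell. This is only an illustration under stated assumptions: record_task is a hypothetical helper name, the time format is a guess at what the script prints, and the demo runs in a scratch directory rather than inside a real build and test run.

```shell
#!/usr/bin/env bash
# Sketch of the read-then-append pattern used by generateOutput.pl.
# record_task is a hypothetical name; it is not part of Metronome.
record_task() {
    local taskhook=$1
    local file=CreatedFile.txt
    if [ -e "$file" ]; then
        echo "Found created file:"
        cat "$file"
    else
        echo "Created file not found"
    fi
    # ">>" appends, and creates the file on first use
    echo "Greetings from $taskhook. The time is $(date +%H:%M:%S)." >> "$file"
}

# Demo in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
record_task pre_all        # first call: the file does not exist yet
record_task platform_pre   # second call: shows the pre_all greeting
```

Each call first reports what earlier tasks left behind and then adds its own line, which is why the file doubles as a trace of which tasks saw which copy.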

Procedure

Submitting the Job

  1. Download the attachment files into a working directory on the submit machine.
  1. Start the submission by running nmi_submit.

bash$ nmi_submit generateOutput.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201088902_12633
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201088902_12633
Run ID: 72308

Examine The Output Files

  1. Open the details page for the run. You should see the user tasks listed on the page.
  2. The following table shows the outputs for the tasks with the ppc_aix_5.3 platform. The other platforms should show something similar. See if your logs also reflect this output by clicking on the output log of each task.

Task                 Output
pre_all              I am running on nmi-s005.cs.wisc.edu which is a i386-linux-thread-multi platform
                     I am processing for platform local and task hook pre_all
                     I am processing for nmi platform
                     The time now is 11:48:53
platform_pre         Found created file:
                     Greetings from pre_all. The time is 11:48:53.
remote_pre_declare   Found created file:
                     Greetings from pre_all. The time is 11:48:53.
                     Greetings from platform_pre. The time is 11:49:23.
remote_declare       Found created file:
                     Greetings from pre_all. The time is 11:48:53.
                     Greetings from platform_pre. The time is 11:49:23.
                     Greetings from remote_pre_declare. The time is 21:14:41.
remote_pre           Found created file:
                     Greetings from pre_all. The time is 11:48:53.
                     Greetings from platform_pre. The time is 11:49:23.
                     Greetings from remote_pre_declare. The time is 17:49:47.
                     Greetings from remote_declare. The time is 17:49:49.
remote_task          Found created file:
                     Greetings from pre_all. The time is 11:48:53.
                     Greetings from platform_pre. The time is 11:49:23.
                     Greetings from remote_pre_declare. The time is 17:49:47.
                     Greetings from remote_declare. The time is 17:49:49.
                     Greetings from remote_pre. The time is 17:49:51.
remote_post          Found created file:
                     Greetings from pre_all. The time is 11:48:53.
                     Greetings from platform_pre. The time is 11:49:23.
                     Greetings from remote_pre_declare. The time is 17:49:47.
                     Greetings from remote_declare. The time is 17:49:49.
                     Greetings from remote_pre. The time is 17:49:51.
                     Greetings from remote_task. The time is 17:49:53.
platform_post        Found created file:
                     Greetings from pre_all. The time is 11:48:53.
                     Greetings from platform_pre. The time is 11:49:23.
post_all             Found created file:
                     Greetings from pre_all. The time is 11:48:53.

Here are some things to notice with the various task outputs:

  1. The pre_all task creates the first instance of the file.
  2. The remote_pre_declare task shows that it has a copy of the file created by the previous tasks on the submit machine. It is a copy because the original was created on the submit machine.
  3. The platform_post task shows that it accessed the version last appended by the platform_pre task.
  4. The post_all task shows that it accessed the version that was created by the pre_all task.

Examine Output Copies in the Run Directory

Let's find out where all of these instances of CreatedFile.txt are located.

  1. Change into the run directory:

bash$ cd /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201088902_12633
  2. Search the directory for the file:

bash$ find . -name "CreatedFile.txt"
./userdir/common/CreatedFile.txt
./userdir/x86_fc_3/CreatedFile.txt
./userdir/x86_winnt_5.1/CreatedFile.txt
./userdir/ppc_aix_5.3/CreatedFile.txt
OR
./userdir/common/CreatedFile.txt
./userdir/nmi:x86_fc_3/CreatedFile.txt
./userdir/nmi:x86_winnt_5.1/CreatedFile.txt
./userdir/nmi:ppc_aix_5.3/CreatedFile.txt
  3. Examine the contents of these files:

bash$ cat userdir/common/CreatedFile.txt
Greetings from pre_all. The time is 11:48:53.
Greetings from post_all. The time is 11:50:54.
bash$ cat userdir/x86_fc_3/CreatedFile.txt
Greetings from pre_all. The time is 11:48:53.
Greetings from platform_pre. The time is 11:49:08.
Greetings from platform_post. The time is 11:50:08.
  4. What this shows:
    • Each platform has its own copy of the file, altered by all of the tasks running on the submit machine.
    • The copy in the common directory was altered only by the pre_all and post_all tasks.
    • The copy altered by remote tasks is not present in the run directory.
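To compare the copies side by side, the find and cat steps above can be combined into one loop. The following is a sketch only: show_copies is a hypothetical helper, and the mock run directory stands in for a real one so the example runs anywhere.

```shell
#!/usr/bin/env bash
# show_copies is a hypothetical helper name; it is not part of Metronome.
# It prints every CreatedFile.txt under a run directory with a header line.
show_copies() {
    local dir=$1
    local f rel
    find "$dir" -name CreatedFile.txt | sort | while IFS= read -r f; do
        rel=${f#"$dir"/}          # path relative to the run directory
        echo "== $rel"
        cat "$f"
    done
}

# Mock run directory; the layout mirrors the find output shown above.
rundir=$(mktemp -d)
mkdir -p "$rundir/userdir/common" "$rundir/userdir/x86_fc_3"
echo "Greetings from pre_all." > "$rundir/userdir/common/CreatedFile.txt"
printf '%s\n' "Greetings from pre_all." "Greetings from platform_pre." \
    > "$rundir/userdir/x86_fc_3/CreatedFile.txt"

show_copies "$rundir"
```

On a real submit host you would pass the Run Directory printed by nmi_submit instead of the mock directory.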

Summary in a Picture

The following diagram illustrates the flow of the CreatedFile.txt file through the build and test run:

Bit Bucket is Not a Mistake.

At first glance it looks like the bit bucket behaviour illustrated by the previous diagram is a design flaw. However, this is a case where implementation realities intrude on the design. The build and test system is designed to build software. One characteristic of these kinds of jobs is that the output is usually orders of magnitude larger than the input, leading to system scalability problems. For this reason, the system forces the user to choose what output is retrieved from a remote platform rather than retrieving everything by default. The next section demonstrates how this works.

generateOutput.pl Revisited

As was discussed previously, a results.tar.gz file needs to be created by one or more of the scripts called by the remote task hooks in the build and test specification file. In our example, all of the task hooks call the same generateOutput.pl script. Locate and examine the following code snippet [1]:

if (defined $genresultfile) {
    print "Creating results.tar.gz file\n";
    system("tar czvf results.tar.gz CreatedFile.txt");
}

The $genresultfile flag is set by the Getopt::Long module. This snippet assumes that a tar program which supports compression is visible. For this particular program this assumption is correct because the build and test system has to support the creation of the results.tar.gz. Other programs may require prerequisites which will be discussed later.

[1] A copy of the script is attached on this page.
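The flag-guarded archive step can be sketched in shell as follows. This is not the tutorial's script: make_results is a hypothetical helper, and its yes/no argument stands in for the -genresult flag.

```shell
#!/usr/bin/env bash
# make_results is a hypothetical helper; its argument stands in for the
# script's -genresult flag ("yes" means the flag was given).
make_results() {
    local genresult=$1
    if [ "$genresult" = "yes" ]; then
        echo "Creating results.tar.gz file"
        # c = create, z = gzip-compress, f = archive file name
        tar czf results.tar.gz CreatedFile.txt
    fi
}

# Demo in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "Greetings from remote_post." > CreatedFile.txt
make_results yes
tar tzf results.tar.gz     # list the archive members without unpacking
```

Listing the archive with tar tzf is a quick way to confirm what will be copied back before you submit a long run.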

results.tar.gz Procedure

Submitting the Job

  1. Open the generateOutput.submit file with your favorite editor.
  2. Add the flag -genresult to the arguments of the remote_post task hook. This task is the last one executed on the remote platform, so it makes sense to collect the output at this point. The -genresult flag will invoke the code snippet discussed earlier. Change the line:
remote_post_args = -taskhook=remote_post
to:
remote_post_args = -taskhook=remote_post -genresult
  3. Start the submission by running nmi_submit.

bash$ nmi_submit generateOutput.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201098668_29904
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201098668_29904
Run ID: 72315

Use the Overview Page

  1. Open the details page for the run.
  2. Locate the download box. This contains links to the results.tar.gz for each platform. The image below shows where the box is located. Download one of these files and look at the CreatedFile.txt file inside.

Examine the Run Directory

  1. Change into the run directory:

bash$ cd /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201098668_29904
  2. Find all of the result files:

bash$ find . -name results.tar.gz
./userdir/nmi:x86_fc_3/results.tar.gz
./userdir/nmi:x86_winnt_5.1/results.tar.gz
./userdir/nmi:ppc_aix_5.3/results.tar.gz
  3. Open one of these archives and take a look at the CreatedFile.txt inside it:

bash$ tar xzvf ./userdir/nmi:x86_winnt_5.1/results.tar.gz
CreatedFile.txt
bash$ cat CreatedFile.txt
Greetings from pre_all . The time is 14:31:40.
Greetings from platform_pre . The time is 14:32:10.
Greetings from remote_pre_declare . The time is 14:32:35.
Greetings from remote_declare . The time is 14:32:37.
Greetings from remote_pre . The time is 14:32:39.
Greetings from remote_task . The time is 14:32:41.
Greetings from remote_post . The time is 14:32:44.

Summary in a Picture Revisited.

Here is the flow diagram again with the results.tar.gz file. Notice that once the results file is created, it is automatically copied to the submit machine's run directory:

User Defined Tasks and Task Failures

Introduction

This tutorial explains how to tell the build and test system to subdivide the remote_task. It also demonstrates how the system handles failures. Before starting this tutorial you should review the failure handling reference here.

The key here is the generation of the file tasklist.nmi. As before we will start by examining the script that generates this file.

Examining userTasks.pl

The script we will examine is attached. Rather than create a different script for each task we wish to define, we’ll use a single script, userTasks.pl, capable of performing each step, depending on its arguments.

The main execution flow of the script is contained within the hash %table. The hash contains a list of task names which are associated with a subroutine. The following snippet shows which tasks associate with what subroutines:
my %table = (
    default         => \&default,
    failure         => \&failure,
    timeout_failure => \&timeout_failed,
    remote_declare  => \&generate_tasklist_nmi,
    remote_post     => \&generate_results_file,
);

The table has entries for the remote_declare and remote_post tasks. The remote_declare task calls the function which creates the tasklist.nmi file. The remote_post task calls the function which generates the results.tar.gz file. It should be noted that these files can be generated during other tasks. Generating the files during these tasks follows the model that the system was designed for.

The script expects two flags, -taskhook and -fail. The remaining arguments to the script are assumed to be user defined task names that are passed as a list to the associated function in %table. The only function that uses this list is generate_tasklist_nmi.

The last thing to note is the following:

my $name = $taskhook;
$name = $ENV{'_NMI_TASKNAME'} if defined $ENV{'_NMI_TASKNAME'};

As this snippet shows, the variable $name can either contain the name of a taskhook or the name of a user defined task. The script treats taskhook tasks the same as user defined tasks.
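The same fallback can be written as a shell parameter expansion. The sketch below assumes only what the snippet shows, namely that _NMI_TASKNAME wins whenever it is defined; resolve_task_name is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# resolve_task_name is a hypothetical helper that mirrors the Perl
# snippet: use _NMI_TASKNAME when it is defined, else the task hook name.
resolve_task_name() {
    local taskhook=$1
    # ${VAR-default} falls back only when VAR is unset, like Perl's "defined"
    echo "${_NMI_TASKNAME-$taskhook}"
}

unset _NMI_TASKNAME
resolve_task_name remote_post                                # prints remote_post
_NMI_TASKNAME=customTaskOne resolve_task_name remote_task    # prints customTaskOne
```

Because the user-defined task name takes priority, one script can serve both the task hooks and every task listed in tasklist.nmi.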

Generating tasklist.nmi

The function for generating tasklist.nmi is below:
sub generate_tasklist_nmi {
    my ($taskhook, @list) = @_;
    open LIST, ">tasklist.nmi";
    print "Generating tasklist.nmi for $taskhook\n";
    for my $l (@list) {
        print LIST "$l 1\n";
        print "$l 1\n";
    }
    close LIST;
}

Each task name in @list is given a timeout of 1 minute. The function assumes that the script is being executed in the top level directory where the tasklist.nmi needs to reside.
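For readers who prefer shell, the same file can be produced with a short loop. This is a sketch under the assumption, taken from the function above, that each line of tasklist.nmi is a task name followed by a timeout in minutes; generate_tasklist is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# generate_tasklist is a hypothetical helper. It assumes the line format
# "<task name> <timeout in minutes>" shown by the Perl function above.
generate_tasklist() {
    local timeout_minutes=$1
    shift
    local name
    : > tasklist.nmi                 # create or truncate, like Perl's ">" mode
    for name in "$@"; do
        echo "$name $timeout_minutes" >> tasklist.nmi
    done
}

# Demo in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
generate_tasklist 1 customTaskOne customTaskTwo customTaskThree
cat tasklist.nmi
```

Taking the timeout as a parameter, rather than hard-coding 1, makes it easy to experiment with longer limits for slow tasks.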

Creating Failures

The script has two functions that generate a failure. The first, failure, executes a perl die command, which ends the script with a nonzero return code. The second, timeout_failed (registered under the task name timeout_failure), hangs the script for 80 seconds, which exceeds the 1 minute timeout specified in the generated tasklist.nmi file. The functions are shown below:

sub failure {
    my ($taskhook) = @_;
    print "Running failure with $taskhook\n";
    die "Task $taskhook has failed\n";
}

sub timeout_failed {
    my ($taskhook) = @_;
    print "Running timeout failed with $taskhook\n";
    sleep 80;
    print "Timeout is done\n";
}
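Both failure modes can be reproduced locally in shell. The sketch below is an analogy rather than the system's own mechanism: it assumes the GNU coreutils timeout command is installed and uses it to stand in for the build and test system's task timeout, with seconds instead of minutes so the demo finishes quickly.

```shell
#!/usr/bin/env bash
# failing_task imitates a task that dies: it writes to standard error and
# returns a nonzero status, which is what marks a task as failed.
failing_task() {
    echo "Task failure has failed" >&2
    return 1
}

status=0
failing_task 2>/dev/null || status=$?
echo "failure exit status: $status"

# A task that outlives its limit. GNU coreutils "timeout" (an assumption;
# it is not part of Metronome) kills the command after 1 second and
# reports exit status 124.
status=0
timeout 1 sleep 3 || status=$?
echo "timeout exit status: $status"
```

In both cases the caller sees only a nonzero status, which is why the details page needs the standard error log to tell a crash from a timeout.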

Dumping the Environment Variables

The var_dump function records the shell environment a bit differently than what the previous tutorials have done. The function is as follows:

sub var_dump {
    my ($taskhook) = @_;
    mkdir 'CreatedDir' unless -d 'CreatedDir';
    my $filename = File::Spec->catfile('CreatedDir', $taskhook . "VarDump.txt");
    print "Creating $filename file\n";
    open OUT, ">$filename";
    for my $e (sort keys %ENV) {
        next unless $e =~ m!CONDOR! or $e =~ m!NMI_! or $e eq 'PATH';
        print OUT "\t\t$e=" . $ENV{$e} . "\n";
    }
    close OUT;
}

As can be seen, the function creates a dump file specific to the task name under the directory CreatedDir. This means that the shell environment for each task will be stored in a separate file.

Note that the variables _NMI_TASKNAME and _NMI_STEP_FAILED are also dumped to standard out in the main script. The importance of these variables will be discussed later.
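A comparable filter can be written in shell with env and grep. This is a sketch, not the tutorial's code: dump_task_env is a hypothetical helper, and unlike the Perl version it does not indent the lines with tabs.

```shell
#!/usr/bin/env bash
# dump_task_env is a hypothetical helper. Like var_dump, it keeps only
# variables whose names contain CONDOR or NMI_, plus PATH itself.
dump_task_env() {
    local taskhook=$1
    mkdir -p CreatedDir
    env | grep -E '^([^=]*(CONDOR|NMI_)[^=]*|PATH)=' | sort \
        > "CreatedDir/${taskhook}VarDump.txt"
}

# Demo in a scratch directory with one NMI-style variable set by hand.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
export _NMI_TASKNAME=customTaskOne
dump_task_env customTaskOne
cat CreatedDir/customTaskOneVarDump.txt
```

The pattern is anchored to the variable name (the part before the first =), so a variable whose value merely mentions NMI_ is not included.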

Creating results.tar.gz

The function generate_results_file creates the file results.tar.gz. Unlike the previous tutorial where only one file was archived, the function archives two files (tasklist.nmi and CreatedFile.txt) and a directory (CreatedDir). The function is as follows:

sub generate_results_file {
    my ($taskhook) = @_;
    print "Creating results.tar.gz file for $taskhook\n";
    if (-e 'tasklist.nmi') {
        print "Including tasklist.nmi\n";
        system("tar czvf results.tar.gz CreatedFile.txt CreatedDir tasklist.nmi");
    } else {
        system("tar czvf results.tar.gz CreatedFile.txt CreatedDir");
    }
}

Procedure

  1. Download the attached files to the submit machine.
  2. Examine the file userTasks.submit. As shown in the snippet below, all of the task arguments are similar except for remote_declare. While all of the tasks are given their name with the -taskhook flag, remote_declare is also provided with user-defined task names that will be inserted into the tasklist.nmi file.

remote_declare = code/perlUserTasks/userTasks.pl
remote_declare_args = -taskhook=remote_declare customTaskOne customTaskTwo customTaskThree
pre_all = code/perlUserTasks/userTasks.pl
pre_all_args = -taskhook=pre_all
  3. Run nmi_submit.

bash$ nmi_submit userTasks.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201099302_30463
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201099302_30463
Run ID: 72316
  4. Open the details page for the run. In addition to the usual task names, you should see the user-defined tasks listed on the page, similar to this:

  5. Click on the standard out log of one of the user tasks. Notice that the value of _NMI_TASKNAME is the same as the task name.
  6. Click on the standard out log of one of the task hook tasks such as remote_post. Notice that the value of _NMI_TASKNAME is not defined.
  7. Download and unpack the results file. Look at the tasklist.nmi file contained inside. It should have contents similar to this:

customTaskOne 1
customTaskTwo 1
customTaskThree 1
Notice that all of the user tasks listed in the details page are also in this file.

Add a User-Defined Task Failure: Procedure

  1. Now let’s add a failed user-defined task. This can be done by simply adding the task name failure to the arguments of the remote_declare task hook. Open the file userTasks.submit in your favorite editor and change the arguments to remote_declare as shown below:

remote_declare = code/perlUserTasks/userTasks.pl
remote_declare_args = -taskhook=remote_declare customTaskOne failure customTaskTwo customTaskThree
  2. Run nmi_submit.

bash$ nmi_submit userTasks.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201100669_2545
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201100669_2545
Run ID: 72321
  3. Open the details page for the run. You should see the user tasks listed on the page similar to this:
  4. Notice the following:
    1. The “failure” task is the one that failed.
    2. The user tasks following the failure still ran, showing that this failure is of the type continue remote / abort platform discussed here.
    3. The failure is propagated up through the system to the user. This propagation causes the “meta-tasks” remote_task and platform_job to be marked as failed.
  5. Click on the standard error log for the failure task (it's the red X’d box next to the standard out button). The log should contain the statement "Task failure has failed", which was generated by the perl die command.

Add a Timeout Failure: Procedure

  1. Now let's add a timeout failure. This can be done by simply adding the task name timeout_failure to the arguments of the remote_declare task hook. Open the file userTasks.submit in your favorite editor and change the arguments to remote_declare as shown below:

remote_declare = code/perlUserTasks/userTasks.pl
remote_declare_args = -taskhook=remote_declare customTaskOne failure customTaskTwo timeout_failure customTaskThree
  2. Run nmi_submit.

bash$ nmi_submit userTasks.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201101119_2843
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201101119_2843
Run ID: 72322
  3. Open the details page for the run. You should see the user tasks listed on the page similar to this:
  4. Notice that the timeout_failure task failed as a failure_noted type. This is the same behaviour as the failure task added in the previous section.
  5. Click on the standard error log for the timeout_failure task and notice that the build and test system killed the task after 60 seconds.

Add a Remote Preparation Failure: Procedure

  1. Now let's add a remote preparation failure. This is a failure of one of the remote tasks executed before remote_task. We are going to make this happen by adding a -fail flag to the arguments of remote_pre. Open the file userTasks.submit in your favorite editor and change the arguments to remote_pre as shown below:

remote_pre = code/perlUserTasks/userTasks.pl
remote_pre_args = -taskhook=remote_pre -fail
  2. Run nmi_submit.

bash$ nmi_submit userTasks.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201101416_3804
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201101416_3804
Run ID: 72325
  3. Open the details page for the run. You should see the user tasks listed on the page similar to this:
  4. Notice that the remote_pre task failed as a remote_post / abort platform type (remember the failure handling reference). After the system detected the failure, it skipped the execution of the remaining remote tasks and executed remote_post. After this it aborted the run.

Add a Platform-Preparation Failure: Procedure

  1. Now let's add a platform-preparation failure. This is a failure of the initial platform-specific task on the submit machine. We are going to make this happen by adding a -fail flag to the arguments of platform_pre. Open the file userTasks.submit in your favorite editor and change the arguments to platform_pre as shown below:

platform_pre = code/perlUserTasks/userTasks.pl
platform_pre_args = -taskhook=platform_pre -fail
  2. Run nmi_submit.

bash$ nmi_submit userTasks.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201101812_5132
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201101812_5132
Run ID: 72326
  3. Open the details page for the run. You should see the user tasks listed on the page similar to this:
  4. Notice that the platform_pre task failed as an abort platform type (remember the failure handling reference). The platform, and thus the run as a whole (since we were running on only one platform), aborted after the failure was detected.

Prerequisites

Introduction

This tutorial explains how to specify prereqs (short for prerequisites) to the build and test system. Your software may depend on one or more external tools or libraries in order to build or test. Often these components are not bundled as part of the “stock” operating system you are targeting. The build and test system, by default, provides a “clean” environment with only the default tools that are provided by the OS in question.

This need for external tools can be addressed in two different ways:

  1. Specify prereqs to NMI in your build and test specification file. These prereqs must be installed by the NMI pool administrator on one or more of the machines you require before your jobs will be able to run. The prereq list is used by NMI to ensure your submission only runs on resources which offer the necessary external tools, and to ensure that your submission runs with each of its specified prereqs automatically present in its PATH. You may use the nmi_list_prereqs tool to find or view any and all prereqs installed in the NMI B&T Lab. This tutorial demonstrates this approach.
  2. Include the external tool yourself as an additional input to your build or test. The advantage is that you will have a more portable build or test that can run on unmodified instances of the given platform, without requiring special software to be installed in advance. This may provide you with access to many more resources. The disadvantage is that, depending on the number and size of your dependencies, your inputs could become large and unwieldy.
One way to do this is to use a precompiled binary of the tool or library as input; another is to submit a separate NMI build of the tool itself, and then specify to NMI that this build’s output should be used as input to a subsequent NMI build or test which requires it. This approach is sometimes called “build (or test) chaining”, and is demonstrated here. It has the advantage of allowing you to more easily target a new platform without having to manually locate or build binaries for each of your external tools yourself.

Procedure

  1. Download the first two attached files to the submit machine. The third file is the perl script executed during the remote task. You are welcome to look at this file to see what is being executed.
  2. Run nmi_submit.

bash$ nmi_submit prereqCoreUtils.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201102326_5467
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201102326_5467
Run ID: 72327
  3. When the run is completed, open the details page. Notice that the run has failed.
  4. Click on the error output for the remote_task. You should see the following:

system_wrapper: running system(/Users/condor/execute/dir_10129/task_wrapper.sh environment.source code/perlPrereqs/prereqTest.pl)...
md5sum does not exist at /Users/condor/execute/dir_10129/userdir/code/perlPrereqs/prereqTest.pl line 8.
system_wrapper: system() has returned.
  5. The middle line shows that the script died because it could not find the md5sum program. This program is contained in the prereq coreutils.
  6. Click on the details page.
  7. Click on the host machine link. This will display a list of prereqs that are installed. You can check out other machines by selecting the Pool Overview link at the top of the page.
  8. Locate the coreutils package as shown below:

  9. Add a prereq requirement to the build and test specification file:

project = tutorial
component = prereqCoreUtils
description = Demonstrates prereqs
run_type = build
inputs = prereqs.cvs
prereqs = coreutils-5.2.1
remote_task = code/perlPrereqs/prereqTest.pl
platforms = x86_macos_10.4
notify = bgietzel@cs.wisc.edu
  10. Resubmit the file:

bash$ nmi_submit prereqCoreUtils.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201102587_5684
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201102587_5684
Run ID: 72328
If you see the following warning then you have misspelled the prereq:
bash$ nmi_submit prereqCoreUtils.submit
WARNING: no machines were found that matched your requirement of: '(nmi_platform == "x86_macos_10.4" && Arch != "" && OpSys != "" && Memory >= 0) && (has_coreutilsOOPS_5_2_1 =!= Undefined)' at /usr/local/nmi-2.1.5/bin/nmi_submit line 680.
  11. When the run is completed, open the details page. Notice that the run has passed now.
  12. Click on the output for the remote_task. You should see an MD5 sum for the file TestFile.txt. Notice also that the PATH variable now contains a path to the coreutils programs. In addition, the system created a new variable called _NMI_PREREQ_coreutils_5_2_1_ROOT which contains the path to the coreutils installation. The system creates a separate variable for every prereq requested and found as described here.

PATH=/prereq/coreutils-5.2.1/bin:/bin:/usr/bin:/Users/condor/execute/dir_10247/userdir
_CONDOR_ANCESTOR_10247=10249:1154729035:3503905792
_CONDOR_ANCESTOR_12083=10247:1154729034:1340517094
_CONDOR_ANCESTOR_346=12083:1153930469:1192128401
_CONDOR_SCRATCH_DIR=/Users/condor/execute/dir_10247
_NMI_PREREQ_coreutils_5_2_1_ROOT=/prereq/coreutils-5.2.1
_NMI_TASKNAME=remote_task
68b37b75d0d1aba037dcaaf558fe1985 TestFile.txt
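A remote task can check for a prereq itself before relying on it. The sketch below uses two hypothetical helpers. require_tool looks a program up on PATH. prereq_root_var guesses the name of the root variable by replacing the dashes and dots in the prereq name with underscores; that naming rule is an assumption inferred from the single coreutils-5.2.1 example above, not a documented guarantee.

```shell
#!/usr/bin/env bash
# require_tool is a hypothetical helper: it succeeds if the program is
# on PATH and fails with a message otherwise.
require_tool() {
    local tool=$1
    if command -v "$tool" > /dev/null 2>&1; then
        echo "$tool found at $(command -v "$tool")"
    else
        echo "$tool does not exist" >&2
        return 1
    fi
}

# prereq_root_var is a hypothetical helper. ASSUMPTION: the variable name
# is the prereq name with "-" and "." turned into "_", as the
# coreutils-5.2.1 example suggests.
prereq_root_var() {
    local mangled
    mangled=$(printf '%s' "$1" | tr '.-' '__')
    echo "_NMI_PREREQ_${mangled}_ROOT"
}

require_tool sh
prereq_root_var coreutils-5.2.1
```

Failing early with a clear message, as prereqTest.pl does with md5sum, makes a missing prereq much easier to spot in the error log than a later, unrelated crash.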

Saved Runs and Workflows

Introduction

This tutorial explains how to tell the build and test system to use an existing build as a starting point for a build and test run. The tutorial re-uses the example demonstrated in the Retrieving Output tutorial. The first part of the tutorial also shows how to save a run using nmi_pin.

Pinning the generateOutput Run Procedure

  1. Locate the run you submitted as part of the generateOutput section. Make sure that you have added the -genresult flag as part of the submission. The run ID in this case is 72315.
  2. Use this run ID with nmi_pin to save the run.

bash$ nmi_pin --runid=72315
UPDATE Run SET archive_results_until='2008-03-23 21:50:45' WHERE (((runid=72315)) AND (archived=1))
  3. Notice that this saves the run for the default 2 months. Hopefully you will be finished with this example before that time :).

Workflow Procedure

  1. Download the attachment files into a working directory on the submit machine.
    1. Take a look at the generateMoreOutput.pl script, which is executed for every task. It is similar to the generateOutput.pl script discussed here. Notice the $extras variable, which is appended to the output and prepended to the name of the output file. Notice also that the script prints the contents of any file ending with CreatedFile.txt.
  2. Take a look at the file generateMoreOutput.submit, a portion of which is shown below. Notice that the inputs command has two entries. The first, generateOutput.cvs, fetches the module containing the generateMoreOutput.pl script. The second, generateMoreOutput.nmi, reuses a previous build. Notice also that the _args commands now have an extra argument, MoreOutput. The script stores this in the $extras variable discussed earlier.

run_type = build
inputs = generateOutput.cvs, generateMoreOutput.nmi
pre_all = code/perlGenerateOutput/generateMoreOutput.pl
pre_all_args = -taskhook=pre_all MoreOutput
  3. Open the file generateMoreOutput.nmi with your favorite editor. This input file uses the nmi method, which fetches the results from a previous build and unpacks them into the working directory of the submit machine before platform_pre is executed. The argument we will use here is input_runids. Type in the run ID you noted in the previous section:

input_runids = 72315
  4. Start the submission by running nmi_submit.

bash$ nmi_submit generateMoreOutput.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201103817_6182
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201103817_6182
Run ID: 72330
  5. When the run is completed, open the details page for the run.
  6. Click on the standard output for the various tasks. The following output indicates that the tasks have found and accessed the results from the previous run:

Found created file CreatedFile.txt:
Greetings from pre_all . The time is 14:31:40.
Greetings from platform_pre . The time is 14:31:55.
Greetings from remote_pre_declare . The time is 14:32:36.
Greetings from remote_declare . The time is 14:32:38.
Greetings from remote_pre . The time is 14:32:40.
Greetings from remote_task . The time is 14:32:43.
Greetings from remote_post . The time is 14:32:45.
This output should be present in all of the tasks except pre_all and post_all.

Workflow with Multiple Runs Procedure

  1. The nmi method’s input_runids command can accept more than one run ID, so let's add another run. The first thing that needs to be done is to modify the build and test specification file so that the results from its run can be differentiated from the previous run. Open the file generateMoreOutput.submit with your favorite editor and replace the phrase “MoreOutput” in each *_args command with “EvenMoreOutput” [1]. For example:

pre_all = code/perlGenerateOutput/generateMoreOutput.pl
pre_all_args = -taskhook=pre_all EvenMoreOutput
platform_pre = code/perlGenerateOutput/generateMoreOutput.pl
platform_pre_args = -taskhook=platform_pre EvenMoreOutput
platform_post = code/perlGenerateOutput/generateMoreOutput.pl
platform_post_args = -taskhook=platform_post EvenMoreOutput
post_all = code/perlGenerateOutput/generateMoreOutput.pl
post_all_args = -taskhook=post_all EvenMoreOutput
remote_pre_declare = code/perlGenerateOutput/generateMoreOutput.pl
remote_pre_declare_args = -taskhook=remote_pre_declare EvenMoreOutput
remote_declare = code/perlGenerateOutput/generateMoreOutput.pl
remote_declare_args = -taskhook=remote_declare EvenMoreOutput
remote_pre = code/perlGenerateOutput/generateMoreOutput.pl
remote_pre_args = -taskhook=remote_pre EvenMoreOutput
remote_task = code/perlGenerateOutput/generateMoreOutput.pl
remote_task_args = -taskhook=remote_task EvenMoreOutput
remote_post = code/perlGenerateOutput/generateMoreOutput.pl
remote_post_args = -taskhook=remote_post EvenMoreOutput -genresult
  2. Next add the run ID from the previous workflow example to the file generateMoreOutput.nmi. Be sure the run IDs are separated by a comma:

input_runids = 72315, 72330
  3. Start the submission by running nmi_submit.

bash$ nmi_submit generateMoreOutput.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201104281_8698
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201104281_8698
Run ID: 72331
  4. When the run is completed, open the details page.
  5. Click on the standard output for the various tasks. The following output indicates that the tasks have found and accessed the results from both previous runs:

Found created file CreatedFile.txt:
Greetings from pre_all . The time is 14:31:40.
Greetings from platform_pre . The time is 14:32:10.
Greetings from remote_pre_declare . The time is 20:32:37.
Greetings from remote_declare . The time is 20:32:39.
Greetings from remote_pre . The time is 20:32:41.
Greetings from remote_task . The time is 20:32:43.
Greetings from remote_post . The time is 20:32:46.
Found created file MoreOutput-CreatedFile.txt:
Greetings from pre_all MoreOutput. The time is 15:57:32.
Greetings from platform_pre MoreOutput. The time is 15:58:02.
Greetings from remote_pre_declare MoreOutput. The time is 21:58:38.
Greetings from remote_declare MoreOutput. The time is 21:58:40.
Greetings from remote_pre MoreOutput. The time is 21:58:42.
Greetings from remote_task MoreOutput. The time is 21:58:44.
Greetings from remote_post MoreOutput. The time is 21:58:46.

This output should be present in all of the tasks except pre_all and post_all.

1 If your favorite editor is vi, you can do a global search and replace with the command :1,$s/ MoreOutput/ EvenMoreOutput/. Be sure to preserve the space before each string; otherwise you will also rename the script to generateEvenMoreOutput.pl, which will cause the run to fail.
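
For readers who do not use vi, the same substitution can be sketched with sed. This is a minimal sketch and not part of the tutorial files: the function name rename_output_arg is hypothetical, and it prints the result to standard output instead of editing the file in place, so the change can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with " MoreOutput" replaced by " EvenMoreOutput".
# The leading space in the pattern keeps the script name generateMoreOutput.pl
# intact, because "MoreOutput" inside that name is not preceded by a space.
rename_output_arg() {
    sed 's/ MoreOutput/ EvenMoreOutput/' "$1"
}
```

A possible use is rename_output_arg generateMoreOutput.submit > edited.submit, followed by a diff of the two files before replacing the original.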

Workflow with Repeated Runs Procedure

  1. Add the runid from the previous workflow example to the file generateMoreOutput.nmi. This time we are going to rerun the same submission and see what happens.
  2. Start the submission by running nmi_submit.

bash$ nmi_submit generateMoreOutput.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201105749_9782
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201105749_9782
Run ID: 72333
  3. When the run is completed, open the details page for the run.
  4. Click on the standard output of the platform_pre task, which is the first task that accesses the results from the previous runs. Notice this section:

Found created file EvenMoreOutput-CreatedFile.txt:
Greetings from pre_all EvenMoreOutput. The time is 16:29:45.

This is wrong because the EvenMoreOutput messages from the previous run are missing. What happened is that the build and test system copied the results of the pre_all task after it unpacked the results from the previous run. This caused the EvenMoreOutput-CreatedFile.txt file contained in the previous run’s results to be replaced by the version generated by the pre_all task. To prove this, we need to remove the pre_all task from generateMoreOutput.submit.

  5. Open generateMoreOutput.submit with your favorite editor and comment out the pre_all command and its arguments.
  6. Start the submission by running nmi_submit.

bash$ nmi_submit generateMoreOutput.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201106123_15159
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201106123_15159
Run ID: 72334
  7. When the run is completed, open the details page for the run.
  8. Click on the standard output of the platform_pre task, which is the first task that accesses the results from the previous runs. This time the task found the file from the previous run:

Found created file EvenMoreOutput-CreatedFile.txt:
Greetings from pre_all EvenMoreOutput. The time is 16:05:17.
Greetings from platform_pre EvenMoreOutput. The time is 16:05:34.
Greetings from remote_pre_declare EvenMoreOutput. The time is 16:06:01.
Greetings from remote_declare EvenMoreOutput. The time is 16:06:03.
Greetings from remote_pre EvenMoreOutput. The time is 16:06:05.
Greetings from remote_post EvenMoreOutput. The time is 16:06:09.
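
The overwrite seen in this procedure can be reproduced locally. The following is a minimal simulation, not the build and test system's actual code: the function name and the temporary directories are hypothetical, and plain cp stands in for the system's unpack and copy steps. It illustrates that when a file with the same name is copied in after the previous results are unpacked, only the later copy survives.

```shell
#!/bin/sh
# Simulate a run directory that first receives the previous run's results
# and then receives the output of the current pre_all task.
# Prints the final contents of the shared file.
simulate_overwrite() {
    work=$(mktemp -d) || return 1
    mkdir "$work/previous" "$work/pre_all" "$work/run"

    # File as left behind by the previous run (several greetings).
    printf '%s\n' 'Greetings from pre_all' 'Greetings from platform_pre' \
        > "$work/previous/EvenMoreOutput-CreatedFile.txt"
    # File as freshly created by the current pre_all task (one greeting).
    printf '%s\n' 'Greetings from pre_all' \
        > "$work/pre_all/EvenMoreOutput-CreatedFile.txt"

    cp "$work/previous/"* "$work/run/"   # step 1: unpack previous results
    cp "$work/pre_all/"* "$work/run/"    # step 2: copy pre_all results on top

    cat "$work/run/EvenMoreOutput-CreatedFile.txt"
    rm -rf "$work"
}
```

Calling simulate_overwrite prints only the single greeting written by the simulated pre_all task, which mirrors the truncated file seen in the first rerun.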

Building and Testing Zsh

Introduction

This tutorial applies the build and test concepts learned in the previous tutorials to an actual software application: Zsh. We will start by reviewing how one might normally build zsh by hand, and then walk through the process of automating this inside the NMI Build & Test System. We will start with a simple automation, and then introduce more advanced features and show how they add flexibility and power to the build or test process.

Building Zsh Manually

  1. Download the source for zsh here.
  2. Build it on a local machine. Follow the instructions in the INSTALL file. (Since we’re building from the source tarball, and not directly from CVS, the configure scripts are already present and you do not need to run autoconf.)
  3. Run the tests for zsh by typing make check. You may have some problems with some of the tests hanging. For example, the Y0 test hangs on Mac OS X 10.4. The tests can also be run individually by using the TESTNUM argument. For example, make TESTNUM=B check runs the B* set of tests.
  4. Let’s review the steps you had to do to build and test zsh:
    1. Retrieve the source code tarball from an FTP site.
    2. Unpack the source tar file.
    3. Change into the source directory zsh-4.3.2.
    4. Execute ./configure. You may have had to add some flags depending on your local machine.
    5. Execute make.
    6. Execute make check.

In order to repeat this every night, or on multiple remote platforms, these steps will need to be automated. The way to do this is by adding what is commonly known as a glue script.
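
As a rough illustration of what such automation looks like outside the build and test system, the manual steps after retrieval can be strung together in a small shell function. This is only a sketch under stated assumptions: the function name build_from_tarball is hypothetical, the tarball is assumed to have been downloaded already, and any configure flags are passed through unchanged.

```shell
#!/bin/sh
# Hypothetical helper that repeats the manual steps in order and stops at the
# first failure.
# Usage: build_from_tarball TARBALL SRCDIR [configure flags...]
build_from_tarball() {
    tarball=$1
    srcdir=$2
    shift 2
    tar xzf "$tarball" || return 1      # unpack the source tar file
    (
        cd "$srcdir" || exit 1          # change into the source directory
        ./configure "$@" || exit 1      # configure, with any extra flags
        make || exit 1                  # build
        make check                      # run the tests
    )
}
```

For example, build_from_tarball zsh-4.3.2.tar.gz zsh-4.3.2 would repeat the sequence listed above. The subshell keeps the caller's working directory unchanged.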

Simple Build and Test

This first cut demonstrates a minimal build and test framework that shows you how easily a traditional build process can be added into the build and test system. Download the attachments and put them into your working directory on the submit machine.

Examining the Submission files

Several files need to be created to run the ZShell build on the system. A build/test specification file and an input specification file are always required. These files will be called Simple.submit and zshell.ftp, respectively. Also needed is a build script (also called a glue script) that changes the working directory to the top of the ZShell source directory and runs the build command. The build script in this case will be called run.sh.

Finally, another input specification is needed to retrieve the build script itself. Although this script is itself code and should really be kept in a reliable repository like CVS to ensure the repeatability of this build in the future, for now we’re just going to use the scp input method to “retrieve” it from a directory on the local submission host. This input specification file will be called glue.scp.

Examining zshell.ftp

The input method is ftp, which retrieves the tar archive from the FTP site and untars it into the run directory:

method = ftp
ftp_root = ftp://ftp.cs.wisc.edu/condor/nmi/tutorials/
ftp_target = zsh-4.3.2.tar.gz
untar = true

Examining Simple.submit

This is the quick and dirty version of a submit file. As seen below, we simply put our three build steps in three predefined remote tasks that we know will be run in the correct sequence.

project = tutorial
component = ZShell Simple
description = ZShell Build and Test Simple Example
run_type = build

The following shows the two input files being referenced:

inputs = zshell.ftp, glue.scp

Here is the configure step. Notice how the glue script is what is actually called and the configure command is passed in as an argument:

remote_pre = run.sh
remote_pre_args = ./configure --without-tcsetpgrp

Here are the rest of the build steps and the platform list:

remote_task = run.sh
remote_task_args = make
remote_post = run.sh
remote_post_args = make check
platforms = ia64_rhas_3, x86_64_fc_4

Examining glue.scp

The glue.scp file uses the scp input method to copy run.sh from your working directory on the submit host.

Examining run.sh

run.sh is a two-line script that changes to the zsh-4.3.2 directory and then runs whatever build command is provided as an argument.
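
The real run.sh is in the tutorial attachments. The following is only a sketch of what a script matching that description might look like, written as a function so it can be tried in isolation. The function name run_in_srcdir is hypothetical.

```shell
#!/bin/sh
# Sketch of a minimal glue script: enter the source directory, then run
# whatever command was passed as arguments (for example: make check).
# The subshell keeps the caller's working directory unchanged.
run_in_srcdir() {
    (
        cd zsh-4.3.2 || exit 1
        "$@"
    )
}
```

With this shape, run_in_srcdir ./configure --without-tcsetpgrp, run_in_srcdir make, and run_in_srcdir make check correspond to the three remote tasks in Simple.submit.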

Procedure

  1. Make sure you have downloaded the attachments from here and put them in your working directory on the submit machine.
  2. Open the file glue.scp with your favorite editor and replace the phrase /YOUR/WORKING/DIRECTORY/ with the path name of your working directory.
  3. Start the submission by running nmi_submit.

bash$ nmi_submit Simple.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201179662_15953
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201179662_15953
Run ID: 72428
  4. When the run is completed, open the details page for the run.
  5. Check the output logs of the remote tasks. The output should be the same as what you saw when you ran the build manually.

Adding User Defined Tasks

The simple build and test run is sufficient for one-off tests. However, the real strength of the build and test system is running automated builds that are constantly repeated. Because of this, it makes sense to improve the submission files so that build and test results can be understood more easily by the casual user. In this tutorial, we will start by moving the build steps into explicitly named user defined tasks, so that the build steps can easily be identified on the details page. To accomplish this we need to define an NMI_TASKLIST file as we did in the user tasks tutorial.

To start this tutorial, download the attachments into your working directory on the submit machine.

Examining the Submission Files

For this version of the submission files we adapt the script used in the user tasks tutorial. The submit file, tasks.submit, calls two tasks, remote_declare and remote_task. Just as in the user tasks tutorial, remote_declare is called to create the tasklist.nmi file and remote_task is called repeatedly, once for each user defined task. The zshell.ftp file is the same as in the simple example earlier. The file glue.scp is also similar except that it retrieves the script glue.pl instead of run.sh.

Examining the Script glue.pl

This script is adapted from the userTasks.pl script used in the user tasks tutorial. The task names and their associated functions have been replaced by the following:

  • configure – runs the configure command.
  • build – runs the make command.
  • test – runs the command “make check”.

The tasklist is hard-coded in the generate_tasklist_nmi function, with each task getting a timeout of 10 minutes.

Procedure

  1. Make sure you have downloaded the attachments from here and put them in your working directory on the submit machine.
  2. Open the file glue.scp with your favorite editor and replace the phrase /YOUR/WORKING/DIRECTORY/ with the path name of your working directory.
  3. Start the submission by running nmi_submit.

bash$ nmi_submit tasks.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201179264_15130
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201179264_15130
Run ID: 72423
  4. When the run is completed, open the details page for the run.
  5. Notice the configure, build, and test tasks. Check the output logs of these tasks. The output should be the same as what you saw when you ran the build manually.

Robust Build and Test Files

In the previous section we introduced the idea that the build and test system should be used to run continuously repeated tasks. This kind of use can be part of software development practices such as continuous integration. However, as with other internal tools, a good design is needed up front to keep developers from falling into the trap of spending more time fixing internal tools than working on the software deliverables (see footnote 1). In this tutorial we show some of the tricks that can be used to make the build and test system a vital but invisible part of your development process. We will demonstrate the following design principles:

  1. Localize Anticipated Changes to the Submission Files. Some aspect of the submission files will need to be changed every time the build and test system is used. The trick here is to anticipate what these changes are and then provide one place where these changes are made. Here are some frequent changes:
    • Input Changes. The build and test system may need to retrieve the software using a different tag, pathname, or repository.
    • Version Number. Every release of an application requires a new version number. This number can appear in a build and test run in a number of places. For example, the version number for the ZShell example shows up in the name of the archive and in the path to the working directory of the build.
    • Test Customization. The purpose of a build and test run is to run tests that detect errors. A run that fails needs to be re-run with different parameters in order to isolate and debug the problem that caused the failure.
    • Platform Changes. The list of platforms is constantly modified based on testing needs and the availability of platforms.
  2. Provide Readable Logs. A build and test run is executed as a batch job. Because of this, the only way to understand the cause of failures is to scan the output and error logs. From these logs the failure has to be isolated and the information to reproduce the failure needs to be collected. One of the things we will show here is how to provide information about each step that is executed, such as:
    • The full path of the executable.
    • The command line arguments.
    • The working directory that the command is executed in.
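
The three pieces of log information listed above can be produced by a small wrapper that is called in place of the command itself. This is a minimal sketch rather than code from the tutorial's glue script: the function name run_logged and the log line labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: record where and how a command is invoked, then run it.
# The exit status of the wrapped command is passed back to the caller.
run_logged() {
    cmd=$1
    shift
    echo "cwd:  $(pwd)"                  # working directory of the command
    echo "exec: $(command -v "$cmd")"    # how the shell resolves the executable
    echo "args: $*"                      # command line arguments
    "$cmd" "$@"
}
```

A glue script could then call run_logged make check instead of make check, so that every step leaves the same three header lines in the batch log.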

We will use four features of the build and test system.

  • Build and test system shell variable substitution explained here.

1 This trap is humorously explained in the book The Peter Pyramid. In the book, Dr. Peter explains that one of the symptoms of an organization suffering from too much bureaucracy is that it spends more time on external processes than on talking to its customers.

Retrieving the Submission Files.

  1. Copy the archive located here to the working directory of the submit machine:

bash$ wget ftp.cs.wisc.edu/condor/nmi/tutorials/zsh-scripts.tar.gz
--10:42:37-- http://ftp.cs.wisc.edu/condor/nmi/tutorials/zsh-scripts.tar.gz
=> `zsh-scripts.tar.gz'
Resolving ftp.cs.wisc.edu... 128.105.2.28
Connecting to ftp.cs.wisc.edu|128.105.2.28|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD /condor/nmi/tutorials ... done.
==> PASV ... done. ==> RETR zsh-scripts.tar.gz ... done.
Length: 2,497 (2.4K) (unauthoritative)
100%[======================================================================>] 2,497 --.--K/s
10:42:36 (1.03 MB/s) - `zsh-scripts.tar.gz' saved [2497]
  2. Unpack the archive and change into the zshell directory.

bash$ tar xzvf zsh-scripts.tar.gz
drwxr-xr-x tutorial/tutorial 0 2006-08-14 10:33:31 zshell/
-rwxr-xr-x tutorial/tutorial 3583 2006-08-13 11:29:04 zshell/glue.pl
-rw-r--r-- tutorial/tutorial 45 2006-08-12 21:01:43 zshell/glue.scp
-rw-r--r-- tutorial/tutorial 80 2006-08-12 19:33:14 zshell/submit_env.sh
-rwxr-xr-x tutorial/tutorial 856 2006-08-13 11:29:05 zshell/test.sh
-rw-r--r-- tutorial/tutorial 110 2006-08-12 21:38:25 zshell/zshell.ftp
-rw-r--r-- tutorial/tutorial 443 2006-08-13 09:14:07 zshell/zshell.submit
drwxr-xr-x tutorial/tutorial 0 2006-08-12 20:25:21 zshell/srctest/
-rw-r--r-- tutorial/tutorial 76 2006-08-12 19:33:14 zshell/srctest/Makefile
-rwxr-xr-x tutorial/tutorial 37 2006-08-12 19:33:14 zshell/srctest/configure
bash$ cd zshell/

Examining glue.pl

Take a look at the file glue.pl. This is the script that automates the build steps you did earlier. The script is larger than it could be because it implements some of the design principles discussed in the Glue Script Design Principles subsection.

The first section to note is the lookup table which lays out the tasks that the glue script is expecting to run under (this is similar to the script used in the user tasks tutorial). The script is expecting the build and test of zsh to be broken up into configure, build, and test tasks. In addition, the script expects the remote_declare task to generate the tasklist file. Finally the script has a scan task available which will print out all of the CONDOR and NMI variables found.

The second section is a comment block that describes the macros that the script is expecting. Only _NMI_TASKNAME is provided by the build and test system. All of the others are user macros which need to be defined in the build and test specification file as demonstrated here. Remember that the NMI_ prefix is added by the build and test system. So, for example, the macro NMI_TASKLIST would be defined as TASKLIST = task1... in the build and test specification file. You should also notice the convention used with the NMI_ARGS and NMI_TIMEOUT macros. These macros indicate a specific task and an optional platform by including the labels in the macro name. For example, NMI_ARGS_configure_x86_fc_4 specifies arguments for the configure task run on the x86_fc_4 platform, while NMI_ARGS_build specifies arguments for the build task on all platforms. See the subroutine assemble for the Perl logic behind this convention.
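
To make the naming convention concrete, here is a shell sketch of one plausible lookup order. It is an assumption, not a transcription of the assemble subroutine in glue.pl: the function name lookup_macro is hypothetical, and it simply prefers the macro that names both task and platform over the one that names only the task.

```shell
#!/bin/sh
# Hypothetical lookup: print the value of the most specific macro that is set.
# Order tried: NMI_<KIND>_<task>_<platform>, then NMI_<KIND>_<task>.
# Usage: lookup_macro KIND TASK PLATFORM
# TASK and PLATFORM must consist of letters, digits and underscores only,
# because they are spliced into a shell variable name.
lookup_macro() {
    for name in "NMI_${1}_${2}_${3}" "NMI_${1}_${2}"; do
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1
}
```

With NMI_ARGS_configure_x86_fc_4 set, lookup_macro ARGS configure x86_fc_4 prints that value. On any other platform the lookup falls back to NMI_ARGS_configure if it is set, and fails otherwise.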

The third thing to note is the for loop which contains the actual lookup code. Unlike the previous script, which expects to match task names exactly, this script looks for the task name as a prefix. So, for example, test1, test2, and testFinal would all execute the test task. This is done so that a task can be executed multiple times with different arguments and timeouts.
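
The prefix rule can be sketched in shell with a case statement. This is an illustration of the idea only, with a hypothetical function name; glue.pl implements it in Perl with a loop over its lookup table.

```shell
#!/bin/sh
# Map a task name to a base task by prefix, so that test1, test2 and
# testFinal all resolve to "test". Prints the base task name and fails
# for names that match no known prefix.
base_task() {
    case $1 in
        configure*) echo configure ;;
        build*)     echo build ;;
        test*)      echo test ;;
        scan*)      echo scan ;;
        *)          return 1 ;;
    esac
}
```

Note that prefix matching means task names must be chosen so that no base name is a prefix of another.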

The last things you should examine are the functions which execute the tasks. For example, you should see that the test command actually adds its arguments between make and check.
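
The argument placement for the test task can be pictured with a one-line shell function. The function name is hypothetical and it only prints the command line instead of running it.

```shell
#!/bin/sh
# Print the command line for the test task: any extra arguments are placed
# between "make" and "check".
test_command() {
    echo make "$@" check
}
```

For instance, test_command TESTNUM=B prints make TESTNUM=B check, which is the form used earlier to run only the B* set of tests.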

Glue Script Design Principles

One of the traps that you can fall into when you run a task as a remote batch job is to keep the same debugging habits that you used when you ran on your local machine. Since build and test runs take much longer than a local execution, you will find your time wasting away if you let simple mistakes such as syntax errors creep into your builds. To prevent this from happening, you should follow some simple design principles with your glue scripts:

  • Print out your execution commands. You need to make sure that a script was invoked correctly before you start looking for bugs within the script. It is also a good idea to print out the directory path that the command is invoked from.
  • Document your glue script arguments. List out every macro that the script is expecting. This will help others understand your script and also helps you review its design.
  • Test your glue script locally. Create stubs for the different build and test scripts that are invoked and run your glue script on your local machine to make sure it is working correctly. Debugging a glue script by using the build and test system is a waste of your time. See the file test.sh in the tutorial distribution for an example.
  • Pretty print your logs. For large builds it is a good idea to create a format that allows you to easily find execution sections and errors. One tool that could help with this is Log4perl which allows you to redirect and format log messages.
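
As an illustration of the local testing principle, the sketch below creates a throwaway source tree whose configure script and Makefile only report what they were asked to do. It is a hypothetical example and does not reproduce the contents of the srctest directory or test.sh shipped with the tutorial.

```shell
#!/bin/sh
# Create a stub source tree for exercising a glue script locally.
# Prints the path of the new tree; the caller is responsible for removing it.
make_stub_tree() {
    tree=$(mktemp -d) || return 1
    # Stub configure: report the arguments instead of probing the system.
    cat > "$tree/configure" <<'EOF'
#!/bin/sh
echo "stub configure called with: $*"
EOF
    chmod +x "$tree/configure"
    # Stub Makefile: the recipe lines must start with a tab, hence printf.
    printf 'all:\n\t@echo stub build\ncheck:\n\t@echo stub check\n' > "$tree/Makefile"
    printf '%s\n' "$tree"
}
```

Pointing the glue script at such a tree lets you check argument handling and task dispatch in seconds before submitting a real run.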

Examining zshell.submit

The file zshell.submit contains the start of what is needed for the Zsh build and test. This is the file where we expect all of the anticipated changes discussed earlier to be made. Note the macros section where the macros TASKLIST and SRCDIR are set. In the user-defined tasks version of the submission files, the task list and source directory were hard-coded in the glue script. The TASKLIST macro initially contains only the scan task so that the run environment can be seen. During this tutorial you will be adding more macros to this section.

Examining the glue.scp and submit_env.sh

This file tells the build and test system where to fetch the glue script. For this tutorial the glue script is assumed to be on the machine and path that you run nmi_submit from. The script can also be included within the source distribution or kept as a separate archive or CVS module.

To fetch the glue script from the submit directory we will need to use the scp method. We also need to use variable substitution as discussed here.

method = scp
scp_file = $(SUBMITDIR)/glue.pl

The file submit_env.sh should be sourced before the run is submitted so that SUBMITDIR is set to the proper value. This file sets all of the environment variables used by the submission files, as shown below:

export _NMI_HOME=$HOME
export _NMI_HOSTNAME=$HOSTNAME
export _NMI_SUBMITDIR=$PWD

The file is written for the Bourne shell. If you prefer a different shell, you will need to create your own version of the file.
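
Because a forgotten source step only shows up after the run has been submitted, it can help to check the variables first. The following is a minimal sketch with a hypothetical function name. It assumes the three variable names shown above and checks only that each one is set and non-empty.

```shell
#!/bin/sh
# Fail early if any variable set by submit_env.sh is missing or empty.
# Prints the name of the first missing variable to standard error.
check_submit_env() {
    for name in _NMI_HOME _NMI_HOSTNAME _NMI_SUBMITDIR; do
        eval "value=\${$name-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            return 1
        fi
    done
}
```

One way to use it is check_submit_env && nmi_submit zshell.submit, so that the submission is skipped when the environment is incomplete.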

Add Build and Test Steps Procedure

  1. Source the submit_env.sh file.

bash# . submit_env.sh
bash# set | grep _NMI_
_NMI_HOME=/home/tutorial
_NMI_HOSTNAME=nmi-s005.cs.wisc.edu
_NMI_SUBMITDIR=/home/tutorial/zshell
  2. Submit the zshell.submit file as is to see if the macros are set correctly.

bash# nmi_submit zshell.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201270245_29836
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201270245_29836
Run ID: 72525
  3. When the run is done, take a look at the standard output log for the scan task. Among the variables you should see are the ones for the task list and source directory.

NMI_SRCDIR=zsh-4.3.2
NMI_TASKLIST=scan
  4. Open the zshell.submit file with your favorite editor and add the configure, build, and test tasks to the TASKLIST macro.

TASKLIST = scan,configure,build,test
  5. Submit the zshell.submit file again.

bash# nmi_submit zshell.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201384769_17678
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201384769_17678
Run ID: 72610
  6. After the run is completed, notice that the configure task has failed. The error log shows that the configure command is complaining that it has no controlling tty and recommends two command switches:

system_wrapper: running system(/home/condor/execute/dir_25030/task_wrapper.sh environment.source glue.pl -taskhook=remote_task)...
configure: error: no controlling tty
Try running configure with --with-tcsetpgrp or --without-tcsetpgrp
Task configure has failed at /home/condor/execute/dir_25030/userdir/glue.pl line 58.

So how do we add a command switch to the configure task? We could add the switch to the arguments of remote_task. However, this would cause every remote task to get the switch. In this tutorial this is harmless, but in complex projects it could lead to unwanted side effects. The next procedure shows how this switch can be added.

Add a Command Switch to Configure Task.

  1. Take another look at the glue.pl script discussed in Examining glue.pl. In the macro documentation section we see that the script looks for macros that are targeted at specific tasks and platforms. This means we can add a task-specific command switch using the ARGS macro with a suffix of configure.
  2. Open zshell.submit with your favorite editor and add the following to the macro section:

ARGS_configure = --with-tcsetpgrp
  3. Submit the zshell.submit file. Make sure you have set the needed environment variables by sourcing submit_env.sh.

bash# nmi_submit zshell.submit
Global ID: tutorial_nmi-s005.cs.wisc.edu_1201385114_20490
Run Directory: /nmi/run/tutorial/2008/01/tutorial_nmi-s005.cs.wisc.edu_1201385114_20490
Run ID: 72613
  4. Verify that the run used the command switch.
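
If you save the standard output of the configure task to a local file, the check can be sketched as a fixed-string search. This is a minimal sketch under stated assumptions: the function name log_has_switch is hypothetical, and it assumes the glue script prints the command line it executes, as recommended in the design principles.

```shell
#!/bin/sh
# Succeed if LOGFILE contains SWITCH as a literal string.
# -F treats the switch as plain text; -e allows it to start with dashes.
# Usage: log_has_switch LOGFILE SWITCH
log_has_switch() {
    grep -F -q -e "$2" "$1"
}
```

For example, log_has_switch configure.out --with-tcsetpgrp succeeds only when the switch appears somewhere in the saved log, where configure.out is a placeholder for whatever file name you saved the log under.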

Reproducibility of Builds

Things To Look Out For:

  • Don’t check source code out of the “head” of a CVS trunk or branch. It isn’t a fixed target, so future builds submitted using the same input spec may check out different source and fail or produce different results.
  • Avoid less reliable, non-archival input methods, such as scp. Instead, use cvs or another revision control system. This is just as important for your build scripts as it is for the source code you’re building. If you don’t treat your build scripts as code, and archive, tag, and store them accordingly, it will be exceedingly difficult to reproduce old builds in the future.