Administrators
Physical Plant contact information
Contacts in Physical Plant for fixing problems with the B240 water supply
Nancy Lyons – handles paperwork at the shops and pays the bills for service; she also calls when the police report a low-water alert
263-3333, ask for Nancy Lyons by name
(The customer number tied to the CHTC WARF fund is B34914; this needs to be verified.)
Kevin Corcoran – unofficial steam fitter who is around nearly 24×7. We were once told to give him a call anytime that weekend if there were problems, and he’ll come in.
444-4371 (possibly call the emergency CARS number first, and then call him)
Kriss Viney – probably the backup for the building manager when he is out.
SLES 9 on new hardware
To install SLES 9 amd64 on the new machines in Rack 7:
- Make sure the hard drive is in compatibility mode.
- Make sure the install is done with safe settings.
Egenera pServer VM Host Setup How-to
First, log in to the web front end for the Egenera.
To set up a pServer, first go to the nmi-lpan and click create in the pServer section.
Next assign the pServer the following:
- the next available pBlade
- a disk
- an ethernet device (make sure to assign the ethernet device to the private network switch)
- basic_sles9_boot boot image
After this, select the disk and partition the drive:
1 GB: linux
16 GB: linux swap
Rest: linux
Save the drive’s settings and select Install Root Partition from above.
Select simple_sles9 and partition 3, then click submit. This will take a while.
Once the install has been done, start the machine and watch it via console.
After it has booted, log in and create the following directories: /sys, /proc, /proc/egenera.
Also go to /etc/sysconfig/network and rename ifcfg-eth-id-(mac address) so that the file name carries the correct MAC address for that machine.
Reboot and make sure the machine boots correctly.
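The interface-file rename in the steps above can be sketched as follows. This is a minimal sketch, not a tested procedure. The helper names ifcfg_name and rename_ifcfg are hypothetical, and the MAC address is assumed to be supplied by the administrator. The lowercase colon-separated spelling of the file name is an assumption, as is the presence of exactly one ifcfg-eth-id-* file in the directory.

```python
import os

PREFIX = "ifcfg-eth-id-"

def ifcfg_name(mac):
    """Build the interface file name for a MAC address (lowercase form assumed)."""
    return PREFIX + mac.strip().lower()

def rename_ifcfg(mac, directory="/etc/sysconfig/network"):
    """Rename the single existing ifcfg-eth-id-* file so it matches `mac`."""
    existing = [name for name in os.listdir(directory) if name.startswith(PREFIX)]
    if len(existing) != 1:
        # Refuse to guess when the directory does not look as assumed.
        raise RuntimeError(
            "expected exactly one %s* file, found %d" % (PREFIX, len(existing))
        )
    target = ifcfg_name(mac)
    if existing[0] != target:
        os.rename(os.path.join(directory, existing[0]),
                  os.path.join(directory, target))
    return target
```

Run it as root on the new pServer, passing the MAC address reported for that machine; the function returns the resulting file name so it can be checked before rebooting.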
Install gcc
scp 192.168.1.1:/root/gcc-3.3.3-43.28.i586.rpm .
scp 192.168.1.1:/root/glibc-devel-2.3.3-98.38.i686.rpm .
rpm -i glibc-devel-2.3.3-98.38.i686.rpm
rpm -i gcc-3.3.3-43.28.i586.rpm
Install VMware
scp 192.168.1.1:/root/VMware-server-1.0.4-56528.i386.rpm .
rpm -i VMware-server-1.0.4-56528.i386.rpm
vmware-config.pl
Follow the prompts to complete the installation; serial numbers are available at http://nmi.cs.wisc.edu/node/1150
Metronome 2.5.0
This is a development release of Metronome. It contains new features, and may be unstable.
Download
Release Date: 2008-04-16
MD5 checksum: ae1672b027af1d56377894e199d17446
Metronome-2.5.0-0.noarch.rpm not created due to technical difficulties
MD5 checksum: TBD
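A downloaded file can be checked against a published MD5 checksum before it is installed. The following is a minimal sketch using Python's standard hashlib module; the function names are illustrative and not part of Metronome.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_md5(path, published):
    """Compare a file's MD5 digest with a published checksum string."""
    return md5_of_file(path) == published.strip().lower()
```

For example, matches_published_md5(path_to_download, "ae1672b027af1d56377894e199d17446") should return True for an intact copy of the file that checksum was published for, where path_to_download stands for wherever the file was saved.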
Release Notes
The RPM packaging of this release has been delayed. Please contact us if this becomes a problem.
Backwards-incompatible syntax change: as a result of adding the ability to support multiple platform namespaces, we had to change the syntax of the platforms command in input specification files for the nmi input method. Instead of using a single colon to separate the source and destination platforms, users must now separate the platforms with two colons. This does not affect Metronome 2.5.0’s ability to use runs from earlier Metronome releases.
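When updating older input specification files, the separator change can be applied mechanically to each source/destination pair. The sketch below is illustrative only: to_double_colon is a hypothetical helper, it operates on a single pair rather than a whole platforms line, and it assumes platform names themselves contain no colons.

```python
def to_double_colon(pair):
    """Convert a legacy 'source:destination' pair to 'source::destination'."""
    if "::" in pair:
        return pair  # already uses the two-colon separator
    source, separator, destination = pair.partition(":")
    if not separator:
        raise ValueError("no colon separator in %r" % pair)
    return source + "::" + destination
```

The function leaves already-converted pairs untouched, so it can be run repeatedly over the same file without doubling the separator again.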
You cannot run parallel jobs with Condor 6.9.5 and this release. Condor versions 6.9.4 and earlier, and 7.0.0 and later, do not have the problematic bug. Condor versions before 6.9.5 do not have the improved parallel job exit policies, which can dramatically simplify parallel testing, so we recommend using Condor 7.0.0 or above.
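The version guidance above can be summarized as a small lookup. This is a sketch under stated assumptions: the function names and status strings are invented for illustration, version strings are assumed to have three numeric parts, and the 6.9.0 minimum for parallel jobs is taken from the Special Feature Requirements of this release. Versions after 6.9.5 and before 7.0.0 are reported as not covered, because the release notes do not mention them.

```python
def parse_version(text):
    """Turn a version string such as '6.9.5' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))

def parallel_job_status(condor_version):
    """Classify a Condor version for parallel jobs with Metronome 2.5.0."""
    version = parse_version(condor_version)
    if version < (6, 9, 0):
        return "too old for parallel jobs"
    if version < (6, 9, 5):
        return "works, without the improved exit policies"
    if version == (6, 9, 5):
        return "not usable with this release"
    if version < (7, 0, 0):
        return "not covered by the release notes"
    return "recommended"
```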
Some of Metronome 2.5.0’s new features require new tables in the database. Support for ‘git’ requires a new table, and this table has been added to the schema files. Support for nmi_resubmit_run remains more experimental, and its table is defined in a new file in the distribution, “database/Metronome-2.5.0”, which also includes a table for use with nmi_update_machine_table. This schema has only been tested against MySQL (although if you are using Metronome with Postgres, please let us know).
New Features
- renamed “Hosts” to “CPU Slots” in pool statistics sidebar, to reflect reality
- the Run Details web status pages now display the path to a run’s archive directory on the archive host (feature request 1176)
- Added remote_task_is_null flag to submit files to support local-only jobs.
- Added ability to handle multiple platform namespaces in a single submit file. See the new platforms and prereqs_ documentation.
- Added option for use with Condor 6.9.5 and later which throttles potentially IO-intensive jobs on the submit node. See the documentation for a discussion on this feature.
- Rewrote nmi_migrate_run to better handle large run directories.
- Added support for individually specifying remote_*_timeouts, as well as remote_default_timeout to replace the functionality of the 2.4.x remote_task_timeout. See <taskname>_timeout.
- Large run workspaces are now more efficiently packaged in preparation for transfer to remote machines (feature requests 1327 and 1328).
- Add wgetrc option to use a separate wgetrc file for each input.
- Added ability to recreate a run entirely from the database. See documentation for nmi_resubmit_run.
Bugs Fixed
- This release contains all bug fixes from the Metronome 2.4.3 stable release.
Known Bugs
- View all Metronome bugs, as of the current release.
Requirements
- Metronome Submit/Archive Host
- Condor >= 6.8.0 or Condor >= 6.9.0
- Perl >= 5.005 (including DBI and DBD::mysql modules)
- Apache >= 2.0
- PHP >= 4.2.3 (i.e., with Session & MySQL support)
- Metronome DB Host
- MySQL 4.1.20
- Condor Central Manager Host
- Condor >= 6.8.0 or Condor >= 6.9.0
- Build/Test Execution Hosts
- Condor >= 6.8.0 or Condor >= 6.9.0
- Perl >= 5.005
Special Feature Requirements
- For Cross-Facility Job Migration: Condor >= 6.8.0
- For Parallel jobs: Condor >= 6.9.0 on central manager, submit and execute hosts.
Metronome 2.4.0
This is a stable release of Metronome. It contains only new bug fixes or new platform support.
Download
Release Date: 2007-08-30
MD5 checksum: 1c2fb3ea8fed3626760ad644fb98ae15
MD5 checksum: 42775e9de4e14c26378077ff0e11c4bf
Release Notes
None.
New Features
- Metronome now supports the Sony PlayStation 3 platform (requires Condor 6.9.4+)
Bugs Fixed
- The web status pages’ pool statistics sidebar now correctly reports the total number of Condor CPU Slots in the pool, instead of incorrectly reporting the number of hosts.
- The web status pages once again report prereq information. Additionally, many small bugs in the sorting of various related tables have been eliminated.
- Pinned runs whose pins have expired no longer display the pinned icon in the runs overview page.
- nmi_rm no longer throws a spurious Perl warning.
- nmi_condor_q handles remote (migrated) jobs better.
Known Bugs
- Automatic email notification of run completion is broken in 2.4.0; it will be fixed in 2.4.1, but to fix it by hand in the meantime, simply edit line 15 of notify.pl. The broken line reads:
use lib $ENV{'NMI_LIB'} || "/usr/local/nmi-2.2.7/lib";
Just change the path at the end to be your installation’s actual NMI lib/ directory, and email notification will once again work.
- View all Metronome bugs, as of the current release.
Requirements
- Metronome Submit/Archive Host
- Condor >= 6.8.0 or Condor >= 6.9.0
- Perl >= 5.005 (including DBI and DBD::mysql modules)
- Apache >= 2.0
- PHP >= 4.2.3 (i.e., with Session & MySQL support)
- Metronome DB Host
- MySQL 4.1.20
- Condor Central Manager Host
- Condor >= 6.8.0 or Condor >= 6.9.0
- Build/Test Execution Hosts
- Condor >= 6.8.0 or Condor >= 6.9.0
- Perl >= 5.005
Special Feature Requirements
- For Cross-Facility Job Migration: Condor >= 6.8.0
- For Parallel jobs: Condor >= 6.9.0 on central manager, submit and execute hosts.
WARNING_PAGE
This is the file that is loaded to display warning messages in the web interface. It is not advised to change this parameter.
define('WARNING_PAGE', 'msgs/warning');
SESSION_NAME
The cookie name used for the web sessions. You can change this if you would like to have multiple Metronome installations within the same domain.
define('SESSION_NAME', 'MetronomeSessID');
NMI serial console scripts (node-reboot, node-console)
The scripts are located in /usr/local/bin on nmi-net.
They use config files in /usr/local/condor/admin; this is where to add new machines or serial consoles.
Metronome 2.2.8
Download
Release Date: 2007-08-08
MD5 checksum: 1e9060cb5b141e1ed228d6d5fe27dda3
MD5 checksum: 71a59e60fd90fefe05e0aef96c9658f1
Release Notes
To support the web status pages’ ability to retain user preferences across multiple submit nodes, the web page database user must now be able to write to the ‘sessions’ table. The schema file, which now also describes the ‘sessions’ table, has been renamed from schema.sql to schema.mysql; please see it for details. We regret that sites with only one submit node cannot at present readily disable this feature.
New Features
- The NMI_CONF environment variable is now respected.
- Added support for platform types, so that users who already have a platform naming scheme can retain it. See the PLATFORM_TYPE configuration variable, and the platform_type submit file variable.
- You can now advertise ‘nmi_condor_release_dir’ for machines which don’t have Condor (in particular, ‘chirp’) in their default PATH. This allows parallel jobs to invoke nmi_[get|put]attr on these platforms.
- If the configuration variable use_condor_job_leases is true, Metronome now sets a two-hour Condor job lease for all platform_jobs. If a submit host goes down or is disconnected from an execute host for less than two hours, running jobs are no longer interrupted and forced to restart. Note that this does not function with Condor versions earlier than 6.9.3.
- The web status pages can now retain user preferences across multiple submit nodes in a Metronome pool.
- The web status pages now allow users to set their local timezone.
- Searching for a run in the web status pages based on runid or gid in the new top-right search box will now automatically jump to the task view for that run.
- The web status pages now display “condor submission failure” instead of “-1001” for that error code.
- The user may now suppress the pop-up window used to display standard output and error.
Bugs Fixed
- Metronome by default now correctly looks for nmi.conf under the Metronome install path specified at install time.
- Metronome no longer mangles submit files containing ‘queue’ when trying to insert the GID.
- Metronome once again respects the advertised attribute ‘nmi_gnutar’.
- nmi_condor_q now respects the PATH_NMI configuration variable.
- E-mail notification of run completion works again.
- The monitor job (which sends updates to the Metronome DB) no longer keeps a persistent DB connection open; it also now retries after any failed DB updates. This should increase the maximum number of jobs in the system for a given database connection limit.
- The monitor now properly drains the event queue when it detects job completion. This should eliminate inconsistencies between the database and the on-disk state of the job.
Known Bugs
- nmi_condor_q fails to properly display jobs migrated to other Metronome sites.
- View all Metronome bugs, as of the current release.
Requirements
- Metronome Submit/Archive Host
- Condor >= 6.8.0 or Condor >= 6.9.0
- Perl >= 5.005 (including DBI and DBD::mysql modules)
- Apache >= 2.0
- PHP >= 4.2.3 (i.e., with Session & MySQL support)
- Metronome DB Host
- MySQL 4.1.20
- Condor Central Manager Host
- Condor >= 6.8.0 or Condor >= 6.9.0
- Build/Test Execution Hosts
- Condor >= 6.8.0 or Condor >= 6.9.0
- Perl >= 5.005
Special Feature Requirements
- For Cross-Facility Job Migration: Condor >= 6.8.0
- For Parallel jobs: Condor >= 6.9.0 on central manager, submit and execute hosts.
- If use_condor_job_leases is enabled in nmi.conf: Condor >= 6.9.4
