vn.physics.ubc.ca PIII/Linux Cluster Homepage

Over 1,750,000 CPU Hours served

Usage Summaries
	08/00	09/00	10/00	11/00	12/00
01/01	02/01	03/01	04/01	05/01	06/01
07/01	08/01	09/01	10/01	11/01	12/01
01/02	02/02	03/02	04/02	05/02	06/02
07/02	08/02	09/02	10/02

Important: The information contained below is subject to change without warning. Please e-mail Matthew Choptuik immediately if you encounter problems using the cluster. Please check here FREQUENTLY while using the system.

Status & News [News Archive]

MAY 2, 2007, 12 NOON:

All compute nodes of the cluster, except for those belonging to Steve Plotkin, are now off-line, per the message below.
APRIL 23, 2007: IMPORTANT!!!

After 7+ years of service, the vn.physics.ubc.ca cluster will very shortly be decommissioned!!

Since the cluster has been almost completely idle, it is not expected that this will present much of a hardship to any users of the facility. Beginning later this week, management will start the process of shutting down the compute nodes and moving them out of the co-location room in the LS Klinck Building in order to free up space for new projects.

The single most important issue that users of the cluster may need to deal with is the disposition of files currently stored on either of the front-end machines, vnfe1.physics.ubc.ca and vnfe3.physics.ubc.ca, on any of the following partitions: /d/vnfe1/home, /d/vnfe1/home2, /d/vnfe1/home3, /d/vnfe3/home and /d/vnfe3/home2. Users with useful and/or important files on any of those partitions are asked to offload such files as soon as is feasible. Note, however, that management intends to keep running the two front-ends (vnfe1 and vnfe3) for at least a few months following the cluster shutdown, so files located on the front-ends will remain available for some time even after decommissioning. In addition, this web site will continue to be maintained for some time to come, and will be updated as necessary to provide information concerning access to the front ends and other pertinent matters.

Please contact Matt Choptuik (choptuik@phas.ubc.ca) should you have any questions/concerns about this matter.
See HERE for a recent snapshot of usage on the cluster.
See HERE for recent node load factors.
See HERE for usage summary by user.
IMPORTANT: See HERE for disk usage summary by partition and user. PLEASE try to keep partitions below 80% usage.
To date, this cluster has had about 190 "crashes" (mean time between node crashes: 1.9 YEARS).

JULY 27, 8:45 AM:

MPI is MOSTLY functional again on the cluster, as the following table indicates:
```
          C    F77  C++  F90
PGI       Yes  Yes   ?    ?
INTEL     Yes  No    ?    ?
```
where '?' denotes 'unknown/who uses that stuff anyway?'.

Management will continue to work on the Intel/F77 and other issues, but, e.g., 3 of the 4 examples below should work again.
JANUARY 26 12:00 NOON: IMPORTANT!! ALL USERS OF THE CLUSTER SHOULD RE-READ THE FOLLOWING!!

SUMMARY OF WHAT HAS CHANGED WITH THE UPGRADE
1. There are now two (rather than one) distinct partitions (partitions / and /home) on each of the machines (front-ends and compute nodes). The / partition occupies about 20% of the space of the physical partition, /home about 80%. This has two key consequences of which all users MUST be aware, and that some users will need to act on SOON.
  1. The /home partitions on vnfe1 and vnfe3 are NOW 20% SMALLER THAN THEY USED TO BE. THIS MEANS THAT USERS MUST REDUCE THE AMOUNT OF INFORMATION THAT THEY STORE ON THIS CLUSTER. Note that the WestGrid HSM (Hierarchical Storage Management) Facility is perfect for all of your long term, high volume storage needs.
    
    If you don't yet have a WestGrid account, get your supervisor to take the 15 minutes needed to fill out the on-line application for a group account (she/he doesn't ever have to log into a WestGrid computer, just fill out the form). That group account will be assigned a project number, which you can then use to create your WestGrid account.
  2. /tmp on all the compute nodes is mounted on /, so /tmp is not ideal for high volume, LOCAL, disk storage. Thus, what would normally be the /home partition on the compute nodes has been renamed /scratch and given the attributes of a /tmp directory. On ANY machine (front-end or node) to which you log in, the directory /scratch/$USER should be automatically created. If this is NOT the case, please report that fact to management immediately.
    - PLEASE DO ALL HIGH-VOLUME I/O ON THE LOCAL PARTITIONS I.E. on /scratch
    - USERS WHO VIOLATE THIS POLICY WILL BE DEEMED NEOPHYTES AND WILL BE SUBJECT TO HAVING ALL OF THEIR PROCESSES ON THE CLUSTER SUMMARILY TERMINATED.
2. The cluster is now running Mandrake 10.1 with 2.6 kernels. In 1999, when we last did an install of the OS (Mandrake 6.1), we simply installed everything. This is no longer an easy task, so if you find some software and/or capability that is missing, please feel free to report that fact to management, and we will do our best to rectify the situation, provided that the software and its use is compatible with the operating principles of the cluster.
3. The PGI compiler suite, version 5.2-2, is installed and available on ALL machines (not just the front ends as previously, license now served off VNP4). If your compilation is QUICK, then by all means, compile on a node. If not, COMPILE ON YOUR FRONT-END. Since vnfe2 no longer exists, if you used to use vnfe2 as your front-end you will need to figure out (via cd; pwd) which is "your" front-end (i.e. the one on which your NFS home directory for the cluster is physically mounted).
  - BUILDING MPI APPS WITH MPICH: The version of MPICH that has been compiled with the PGI compilers is mpich.1.2.6
4. The INTEL compiler suite, version 8.1, is installed and available on ALL machines (license served off VNP4). If your compilation is QUICK (i.e. 1-20 seconds of compile time), then by all means compile on a node. If not, COMPILE ON YOUR FRONT-END. Read 3. above if you don't know which machine is your front-end.
  - BUILDING MPI APPS WITH MPICH: The version of MPICH that has been compiled with the PGI compilers is mpich.1.2.6
5. NEW!! PARALLEL CONVENIENCE FEATURES
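The cd; pwd check mentioned in item 3 can be wrapped in a small shell function. This is only a sketch: it assumes that home directories live under paths of the form /d/<front-end>/..., as in the partition names listed in the decommissioning notice above, and the name front_end_of is made up for illustration.

```shell
#!/bin/sh
# front_end_of: hypothetical helper, not installed on the cluster.
# Given a path such as /d/vnfe1/home/matt, print the component after /d/
# (here vnfe1). Prints "unknown" and returns 1 for any other path layout.
front_end_of() {
    case $1 in
        /d/*/*)
            rest=${1#/d/}          # strip the leading /d/
            echo "${rest%%/*}"     # keep everything up to the next slash
            ;;
        *)
            echo "unknown"
            return 1
            ;;
    esac
}

# Equivalent of "cd; pwd": look at the home directory path.
front_end_of "$(cd && pwd)" || true
```

On a machine where the home directory is not under /d/, the function simply reports unknown.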
SUMMARY OF WHAT HASN'T CHANGED WITH THE UPGRADE
1. ANY OF THE (AMENDED) BASIC OPERATIONAL RULES (SEE SYSTEM USE SECTION BELOW).
2. THIS STILL ISN'T A SYSTEM FOR NEOPHYTES!

System Overview

See HERE for a complete list of currently available machine names and IP numbers
2 front-end nodes, 62 compute nodes. Each compute node has dual PIII 850 MHz processors and 512 MByte of RAM.
Use vnN (/d/vnfe1/home/matt/scripts/vnN) to see currently active compute nodes (caveat emptor)
If you are in Steven Plotkin's group, see HERE.

Accounts

GETTING AN ACCOUNT: Supply the requested information in the template available HERE, and send it to Matthew Choptuik.

System Access

All users MUST log in to the cluster machines using ssh (the secure shell). Note that the cluster is now running OpenSSH Version 3.9p1, Protocols 1.5 and 2.0.

System Use

Please also see the Warnings below.

Please read our Backup Policy.

At least while the cluster is under construction (and possibly after that), the cluster will be operated essentially as a cluster of workstations. To this end, you will be able to ssh directly to any of the compute nodes, as well as the front end nodes, and do pretty much everything on a compute node that you would on a front end node.

There is currently NO BATCH SYSTEM on the cluster. Users should feel free to interactively start a reasonable number of production jobs on whatever machine(s) are available. Use ruptime on one of the nodes to see load averages on all machines in the cluster.
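As a rough illustration of reading ruptime output, the following sketch lists the compute nodes that are up, least loaded first. It assumes the usual ruptime line layout (host, up/down, uptime, user count, then the three load averages) and that front-end names begin with vnfe. The name nodes_by_load is a hypothetical helper, not something installed on the cluster.

```shell
#!/bin/sh
# nodes_by_load: hypothetical helper, not installed on the cluster.
# Reads ruptime-style lines on stdin; assumed layout:
#   host  up  3+01:00,  1 user,  load 1.98, 2.00, 2.00
# Prints "host one-minute-load" for hosts that are up, lowest load first,
# skipping the front ends (assumed to be named vnfe*).
nodes_by_load() {
    awk '$2 == "up" && $1 !~ /^vnfe/ {
             load = $7
             sub(/,$/, "", load)   # drop the trailing comma
             print $1, load
         }' | LC_ALL=C sort -k 2,2n
}

# Only query the cluster when ruptime is actually available.
if command -v ruptime >/dev/null 2>&1; then
    ruptime | nodes_by_load
fi
```

Since each compute node has two processors, a one-minute load near 2 suggests that the node is already fully occupied. Confirm with top before starting anything.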

In order to maximize the usefulness of the cluster, users should abide by the following guidelines:

BE AWARE AND CONSIDERATE OF OTHER USERS.
Please ensure that you have a valid .forward file in your home directory on the cluster so that mail sent to you will actually get to you.
Minimize the amount of network traffic to and from the cluster. The cluster's current link to the outside world is 100 MegaBIT/s maximum (about 10 MByte/s). Extremely large files should definitely be moved to and fro during off-peak hours, Pacific Time.
Treat the cluster as a remote computing environment:
- If at all possible, develop and debug on another Linux/Unix system.
- If at all possible, build the executable from source on the cluster, to avoid problems with (e.g.) machine-dependent run-time support.
- From time to time, nodes in the cluster may have to be taken down for reboots on very short notice. Users running very long jobs that do not periodically checkpoint themselves do so at their own peril. There is currently no system-wide mechanism for suspending, then restarting, jobs. Management accepts no responsibility for lost time and/or data.
DO NOT use the front-end nodes for long jobs (more than a few CPU minutes), except by arrangement with the management.
If demand for the compute nodes is high, minimize the amount of development work you do on those nodes.
DO NOT start long jobs (more than a few minutes) if there are already two jobs running on the node. Use top to determine the number of CPU intensive jobs that are currently running.
DO start one additional job on a node that is currently running a job, unless there are completely free nodes. (Again use ruptime and top.)
DO NOT start a job that will result in total memory usage on a node exceeding 90%. (Once again, use top to see what percentage memory a currently running process is using)
WATCH YOUR DISK USAGE, particularly on the front end nodes. This is not a system for computing neophytes, and users unable or unwilling to keep /home directories under control will be [severe punishment to be determined later].
SCRATCH SPACE: Each user has a scratch directory /scratch/$USER that should be used for local storage on the compute nodes. PLEASE DO NOT WRITE LARGE DATA FILES TO /tmp ITSELF and PLEASE WATCH YOUR SCRATCH USAGE, ESPECIALLY IF /scratch's PARTITION IS 80% FULL OR MORE. Scratch should be used especially in those cases when a user has many I/O intensive jobs running simultaneously since, in such instances, writing to the NFS mounted partitions can easily swamp the front-ends.
BE AWARE AND CONSIDERATE OF OTHER USERS.
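The 80% guideline for partitions can be checked from the command line. The sketch below uses df -P to read the usage of whichever partition holds a given directory. The function names usage_pct and check_partition and the variable SCRATCH_DIR are hypothetical; only the /scratch/$USER path and the 80% figure come from the guidelines above.

```shell
#!/bin/sh
# usage_pct: print the percentage in use of the partition holding DIR,
# taken from the Capacity column of POSIX-format df output.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_partition DIR [LIMIT]: report whether the partition holding DIR
# is at or above LIMIT percent full (default 80). Returns 1 when it is.
check_partition() {
    dir=$1
    limit=${2:-80}
    pct=$(usage_pct "$dir")
    [ -n "$pct" ] || return 2      # df produced nothing usable
    if [ "$pct" -ge "$limit" ]; then
        echo "WARNING: $dir is ${pct}% full (limit ${limit}%)"
        return 1
    fi
    echo "OK: $dir is ${pct}% full"
}

# Check the per-user scratch directory when it exists on this machine.
SCRATCH_DIR="/scratch/${USER:-$(id -un)}"
if [ -d "$SCRATCH_DIR" ]; then
    check_partition "$SCRATCH_DIR" || echo "Please clean up $SCRATCH_DIR"
fi
```

The same function can be pointed at a home directory on a front end, for example check_partition "$HOME".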

Software

Send mail to Matthew Choptuik, if there is software you wish to have installed. Please include a description of the software, and, if possible, a distribution site from which it can be downloaded.

Unless otherwise specified, all software is available on all machines (both front-end and compute nodes)

Linux: Mandrake 10.1 (previously 6.1!)
MPI: mpich Version 1.2.6
- Access control is via rsh and ~/.rhosts, NOT ssh and ~/.ssh/authorized_keys. You can use the perl script ~matt/scripts/mkvnrhosts to dump an appropriate .rhosts fragment to standard output.
- The default machine file on each node is /usr/local/util/machines/machines.LINUX
Portland Group HPF, F90, C and C++ compilers, version 8.1.
See HERE for usage information and HERE for on-line documentation. Note: These products are now available on all nodes.
- The following libraries have been built with the PG compilers and, unless otherwise noted, are installed in /usr/local/PGI/lib etc. on all machines.
  - MPI: mpich Version 1.2.6
  Note that in contrast to our original practice, the libraries have NOT been compiled with the -Msecond_underscore flag to pgf77.
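To give an idea of what a .rhosts fragment for MPI use might contain, the following sketch turns an MPICH machine file into lines of the form "host user". It is not the mkvnrhosts script mentioned above, whose output may differ. It assumes a machine file with one host or host:ncpus entry per line and # starting a comment, and rhosts_fragment is a made-up name.

```shell
#!/bin/sh
# rhosts_fragment: hypothetical helper, NOT the mkvnrhosts script.
# Usage: rhosts_fragment MACHINE_FILE [USER]
# Assumes one "host" or "host:ncpus" entry per line, "#" starting a comment.
# Prints one "host user" line per distinct host, suitable for ~/.rhosts.
rhosts_fragment() {
    machine_file=$1
    user=${2:-${USER:-$(id -un)}}
    sed -e 's/#.*//' -e 's/:.*//' "$machine_file" |
        awk -v user="$user" 'NF { print $1, user }' |
        LC_ALL=C sort -u
}

# Use the default machine file named above when it is readable.
MACHINE_FILE=/usr/local/util/machines/machines.LINUX
if [ -r "$MACHINE_FILE" ]; then
    rhosts_fragment "$MACHINE_FILE"
fi
```

The output would be appended to ~/.rhosts by hand after checking it. The file must not be writable by anyone but its owner, or rsh will typically ignore it.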
OTHER SOFTWARE

Tips & Tricks

avail: Alias to display nodes in order of available capacity.

tcsh/csh:
     alias avail 'ruptime | grep -v down | grep -v vnfe | sort -n -k 8 | more'

bash:
     alias avail='ruptime | grep -v down | grep -v vnfe | sort -n -k 8 | more'

Warnings

By default, accounts will be created with world-readable home directories, and users of tcsh or csh will have umask 022 set by default. Users are, of course, free to explicitly read-protect material as they see fit.
You can only change your password while logged into vnfe1, and your changed password will probably not be propagated to the other machines very quickly. Eventually this situation may change, but for the time being, try to choose an initial password that you can live with for a while.
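For users who want to read-protect material, as the first warning allows, a minimal sketch using standard chmod and umask follows. The function name make_private is hypothetical.

```shell
#!/bin/sh
# make_private: hypothetical helper that removes all group and other
# permissions from each named file or directory.
make_private() {
    chmod go-rwx "$@"
}

# Show the current umask. With 022, newly created files are world-readable;
# with 077, they are readable by their owner only.
umask
```

For example, make_private ~/thesis would close off a directory named thesis. Putting umask 077 in a shell startup file makes newly created files private by default.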

Maintained by choptuik@physics.ubc.ca. Supported by CIAR, CFI and NSERC.