Module Version: 3.064

NAME

Schedule::Load - Load distribution and status across multiple host machines

SYNOPSIS

  #*** See the SETUP section of the Schedule::Load manpage.
  #*** Daemons must be running for this test

  # Get per-host or per top process information
  use Schedule::Load::Hosts;
  my $hosts = Schedule::Load::Hosts->fetch();
  foreach my $host ($hosts->hosts_sorted) {
      print $host->hostname, " is on our network\n";
  }

  # Choose hosts
  use Schedule::Load::Schedule;
  my $scheduler = Schedule::Load::Schedule->fetch();
  print "Best host for a new job: ", $scheduler->best(), "\n";

  # User access (shell command, not Perl)
  rschedule reserve <hostname>

DESCRIPTION

This package provides useful utilities for load distribution and status across multiple machines in a network. To just see what is up in the network, see the rschedule command. For initial setup, see below.

Most users do not need the Perl API and can use the command line utilities that come with this package, which are installed in your standard binary directory like other Unix applications. This package provides the first four of the following Unix programs; the fifth comes from a separate package:

rschedule

rschedule is a command line interface to this package. It and the potential aliases rtop, rhosts, and rloads report the current state of the network including hosts and top loading. rschedule also allows reserving hosts and setting the classes of the machines, as described later.

slchoosed

slchoosed is run on one host in the network. This host is specified in the SLCHOOSED_HOST environment variable, which may also specify additional cold standby hosts in case the first host goes down. slchoosed collects connections from the slreportd reporters, and maintains an internal database of the entire network. User clients also connect to the chooser, which then gets updated information from the reporters, and returns the information to the user client. As the chooser has the entire network state, it can also choose the best host across all CPUs in the network.

slreportd

slreportd must be running on every host in the network, usually started with an init.d script. It reports itself to the slchoosed daemon periodically, and is responsible for checking loading and top processes specific to the host that it runs on.

slreportd may also be invoked with some variables set. This allows static host information, such as class settings, to be passed to applications.

slpolice

slpolice is an optional client daemon which is run as a cron job. When a user process has over an hour of CPU time, it nices that process and sends mail to the user. It is intended as an example which can be used directly or changed to suit the system manager's preferences.

lockerd

lockerd is part of the IPC::PidStat package. If running, it allows the scheduler to automatically cancel held resources if the process that requested the resource exits or is even killed without cleaning up.

MODULES

For those desiring finer control, or automation of new scripts, the Perl API may be used. The Perl API includes the following major modules:

Schedule::Load::Hosts

Schedule::Load::Hosts provides the connectivity to the slchoosed daemon, and accessors to load and modify that information.

Schedule::Load::Schedule

Schedule::Load::Schedule provides functions for choosing the best host for a new job, reserving hosts, and setting which hosts specific classes of jobs can run on.

Schedule::Load::Reporter

Schedule::Load::Reporter implements the internals of slreportd.

Schedule::Load::Chooser

Schedule::Load::Chooser implements the internals of slchoosed.

RESERVATIONS

Occasionally clusters have members that are only to be used by specific people, and not for general use. A host may be reserved with "rschedule reserve". This will place a special comment on the machine that "rschedule hosts" will show. Reservations also prevent the Schedule::Load::Schedule package from picking that host as the best host.

To be able to reserve a host, the reservable variable must be set on that host. This is generally done when slreportd is invoked on the reservable host by using "slreportd reservable=1".

CLASSES

Different hosts often have different properties, and jobs need to be able to select a host with certain properties, such as hardware or licensing requirements. Classes are generally just boolean variables which start with class_. Classes can be specified when slreportd is invoked on the host, for example "slreportd class_foo=1". The class setting may be seen with "rschedule classes" or may be read (as may any other variable) as an accessor from a Schedule::Load::Hosts::Host object.

Once a class is defined, a scheduling call can include it in the classes array that is passed when the best host is requested. Only machines which match one of those classes will be selected.
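The selection rule above can be sketched with plain Perl hashes. This is only an illustration and does not call the Schedule::Load API: the host names, the load numbers, the pick_best function, and the "lowest load wins" criterion are all assumptions made for the example, and the real chooser may weigh other factors.

```perl
use strict;
use warnings;

# Hypothetical snapshot of host state; in a real network this comes from slchoosed.
my %hosts = (
    alpha => { load => 2, reserved => 0, class_foo => 1 },
    beta  => { load => 0, reserved => 0, class_bar => 1 },
    gamma => { load => 1, reserved => 1, class_foo => 1 },
    delta => { load => 3, reserved => 0, class_foo => 1, class_bar => 1 },
);

# Return the least-loaded, unreserved host that matches at least one of the
# requested classes, or undef if no host qualifies. (Assumed criterion.)
sub pick_best {
    my ($hosts, @classes) = @_;
    my @candidates = grep {
        my $h = $hosts->{$_};
        !$h->{reserved} && grep { $h->{$_} } @classes;
    } sort keys %$hosts;
    my ($best) = sort { $hosts->{$a}{load} <=> $hosts->{$b}{load} } @candidates;
    return $best;
}

print pick_best(\%hosts, 'class_foo'), "\n";   # prints "alpha"
```

Here gamma is skipped because it is reserved, even though its load is lower than alpha's.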

COMMAND COMMENTS

"rschedule loads" or rloads shows the command that is being run. By default this is the basename of the command invoked, as reported by the operating system. Often this is of little use, especially when the same program is used by many people. The "rschedule cmnd_comment" command or Schedule::Load::Schedule::cmnd_comment function will assign a more verbose command to that process id. For example, we use dc_shell, and put the name of the module being compiled into the comment, so rather than several copies of the generic "dc_shell" we see "dc module", "dc module2", etc.

HOLD KEYS

Hold keys allow a job request to be queued, so that when the resource is freed, it will be issued to the oldest requester. The hold will persist for a specified time until a process actually starts up on the selected host, and enough CPU time elapses for that new process to claim CPU time.

For this limited time, the load on the host will be incremented. The hold is released by a hold_release call once the job has begun and a little CPU time has elapsed, by the timer expiring, or by IPC::PidStat detecting that the holding process died. This will cause the load reported by "rschedule hosts" to occasionally be higher than the number of jobs on that host.
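A minimal sketch of this bookkeeping, assuming each outstanding hold adds exactly one to the reported load. It models a single host in memory and is not the chooser's actual data structure. The names hold_add, hold_oldest, and reported_load are hypothetical; hold_release here only borrows the name of the call mentioned above and says nothing about its real signature.

```perl
use strict;
use warnings;

# Hypothetical in-memory list of holds on one host, oldest first.
my @holds;   # each entry: { key => ..., expires => time in seconds }

# Queue a hold that lapses $timeout seconds after $now.
sub hold_add {
    my ($key, $now, $timeout) = @_;
    push @holds, { key => $key, expires => $now + $timeout };
}

# Explicit release, e.g. once the job has started and claimed CPU time.
sub hold_release {
    my ($key) = @_;
    @holds = grep { $_->{key} ne $key } @holds;
}

# The oldest requester is first in line when the resource frees up.
sub hold_oldest { return @holds ? $holds[0]{key} : undef }

# Drop expired holds, then report running jobs plus outstanding holds.
sub reported_load {
    my ($running_jobs, $now) = @_;
    @holds = grep { $_->{expires} > $now } @holds;
    return $running_jobs + @holds;
}

hold_add('job_a', 0, 60);
hold_add('job_b', 10, 60);
print reported_load(1, 20), "\n";    # 3: one job plus two holds
hold_release('job_a');               # job_a has started running
print reported_load(2, 30), "\n";    # 3: two jobs plus job_b's hold
print reported_load(2, 100), "\n";   # 2: job_b's hold has timed out
```

The first two calls show the effect described above: the reported load is higher than the number of running jobs while a hold is outstanding.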

FIXED LOADS

Some jobs have CPU usage patterns which contain long periods of low CPU activity, such as when doing disk IO. make is a typical example; the parent make process uses little CPU time, but the children of the make pop in and out of the CPU run list.

When scheduling, it is useful to have such jobs always count as one (or more) job, so that the idle time is not misinterpreted and another job scheduled onto that machine. Fixed loading allows all children of a given parent to count as a given fixed CPU load. Using make again, if the parent make process is set as a fixed_load of one, the make and all children will always count as one load, even if not consuming CPU resources. The "rschedule loads" or rloads command includes not only top CPU users, but also all fixed loads. If a child process is using CPU time, that is what is displayed. If no children are using appreciable CPU time (~2%), the parent is the one shown in the loads list.
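One possible way to account for a fixed load is sketched below. It is an illustration under stated assumptions, not the reporter's actual algorithm: the process table, the fixed_owner and host_load functions, and the 50% "busy" threshold for ordinary processes are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical process table: pid => { ppid, cmd, pctcpu, optional fixed_load }.
my %procs = (
    100 => { ppid => 1,   cmd => 'make', pctcpu => 0.1, fixed_load => 1 },
    101 => { ppid => 100, cmd => 'cc',   pctcpu => 85 },
    102 => { ppid => 100, cmd => 'ld',   pctcpu => 0.5 },
    200 => { ppid => 1,   cmd => 'sim',  pctcpu => 97 },
);

# Walk up the parent chain and return the pid of the nearest process
# (the process itself or an ancestor) that has a fixed load, else undef.
sub fixed_owner {
    my ($procs, $pid) = @_;
    while (defined $pid && exists $procs->{$pid}) {
        return $pid if $procs->{$pid}{fixed_load};
        $pid = $procs->{$pid}{ppid};
    }
    return undef;
}

# Count each fixed-load tree once at its fixed value; count any other
# process as one load if it uses at least $busy_pct percent CPU (assumed rule).
sub host_load {
    my ($procs, $busy_pct) = @_;
    my ($load, %counted) = (0);
    for my $pid (sort { $a <=> $b } keys %$procs) {
        my $owner = fixed_owner($procs, $pid);
        if (defined $owner) {
            $load += $procs->{$owner}{fixed_load} unless $counted{$owner}++;
        } elsif ($procs->{$pid}{pctcpu} >= $busy_pct) {
            $load++;
        }
    }
    return $load;
}

print host_load(\%procs, 50), "\n";   # 2: the make tree counts once, plus sim
```

With this accounting the make tree still contributes one load when all of its children are idle, which is the point of a fixed load.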

SETUP

When setting up a new site with Schedule::Load, first read the DESCRIPTION section about the various daemons.

First, make sure you've built and installed this package on all of your machines.

Then, pick a reliable master machine for the chooser. Set the SLCHOOSED_HOST environment variable to include this host name, and add this setting to a site wide file so that all users including daemons may see it when booting. You may add additional colon separated hostnames which will be backups if the first machine is down. Run slchoosed on the SLCHOOSED_HOST specified host(s).

On all the hosts in the network you wish to schedule onto, check SLCHOOSED_HOST is set appropriately, then run slreportd. Optionally run pidstatd (from IPC::Locker) on these hosts also.

The "rschedule hosts" command should now show your hosts.

If you run slreportd before slchoosed, there may be a 60 second wait before slreportd detects the new slchoosed process is running. During this time rschedule won't show all of the hosts.

When everything is working manually, it's a good idea to set things up to run at boot time. Manually kill all of the daemons you started. Then, make init files in /etc/init.d so the daemons start at boot time. Some examples are in the init.d directory provided by the distribution, but you will need to edit them. Exactly how this works is OS dependent; please consult your documentation or the web.

ENVIRONMENT

SLCHOOSED_HOST

A colon separated list of hostnames to contact to find slchoosed. They will be contacted in order; after the first connection is established, remaining hostnames will be backups.

SLCHOOSED_PORT

Default port number that slchoosed uses. If not defined, defaults to the slchoosed port number assigned in /etc/services, or if not specified there, 1752.
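The two lookups described in this section can be sketched as follows. This is not the package's own code: chooser_hosts and chooser_port are hypothetical helper names, and using the 'tcp' protocol for the /etc/services lookup is an assumption.

```perl
use strict;
use warnings;

# Split the colon separated host list, keeping the order in which hosts
# would be contacted. Empty entries are dropped.
sub chooser_hosts {
    my ($env) = @_;
    return grep { length } split /:/, ($env->{SLCHOOSED_HOST} // '');
}

# Port lookup in the documented order: the environment variable, then the
# /etc/services entry for slchoosed, then 1752.
sub chooser_port {
    my ($env) = @_;
    return $env->{SLCHOOSED_PORT} if defined $env->{SLCHOOSED_PORT};
    # getservbyname returns (name, aliases, port, proto); 'tcp' is assumed here.
    my $port = (getservbyname('slchoosed', 'tcp'))[2];
    return defined $port ? $port : 1752;
}

print join(', ', chooser_hosts(\%ENV)), "\n";
print chooser_port(\%ENV), "\n";
```

Passing the environment in as a hash reference keeps the helpers easy to try with test values, for example chooser_hosts({ SLCHOOSED_HOST => 'host1:host2' }).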

DISTRIBUTION

The latest version is available from CPAN and from http://www.veripool.org/.

Copyright 1998-2011 by Wilson Snyder. This package is free software; you can redistribute it and/or modify it under the terms of either the GNU Lesser General Public License Version 3 or the Perl Artistic License Version 2.0.

AUTHORS

Wilson Snyder <wsnyder@wsnyder.org>

SEE ALSO

User programs for viewing loading, etc.:

rschedule, slrsh, slpolice

Daemons:

slreportd, slchoosed, slpolice

Perl modules:

Schedule::Load::Chooser, Schedule::Load::FakeReporter, Schedule::Load::Hosts, Schedule::Load::Hosts::Host, Schedule::Load::Hosts::Proc, Schedule::Load::Reporter, Schedule::Load::ResourceReq, Schedule::Load::Schedule
