LinuxHPC.org/Cluster Builder 1.3

Workload Manager

A Workload Manager is a software application that works in conjunction with one or more resource managers to improve overall efficiency and to integrate different platforms, software solutions, and scheduling capabilities in a cluster environment.

A Workload Manager combines intelligent scheduling of resources with advanced reservations to process jobs on the right resources at the right time. It also provides flexible policy and event engines that process workloads faster and in line with set business requirements and priorities.

Differentiation

Workload Managers are designed to offer users improved resource management and job scheduling while allowing them to continue 'business as usual'.

A resource manager, such as TORQUE, PBS Pro, LoadLeveler, or LSF, manages the job queue and the compute nodes. A scheduler tells the resource manager what to do: which jobs to run, when, and where. Users typically submit jobs and query the state of the machine and of their jobs through the resource manager. When a workload manager is running, users can continue to issue the same resource manager commands as before. Workload managers also offer their own commands, which provide additional information, and a superset of resource management and advanced scheduling capabilities in the following areas:

  • Ease of Use - full graphical cluster administration with integrated documentation and wizards
  • Multi-Cluster Support - efficient integration of resources from multiple clusters
  • On Demand Support - the most advanced on demand intelligence and policy controls
  • Integrated Billing - highly flexible costing, allocation management, and billing
  • Extended Resource Control - monitoring, co-allocating, reservation, provisioning, and reporting on compute, data, network, software, and other resources
  • Cluster Visualization - real-time visualization of what the cluster is doing and has been doing, including usage breakdowns, failures, inefficiencies, etc.
  • Instant Charting and Reporting - instant display of graphs and professional reports covering any timeframe and combination of data
  • ROI - features that reduce staff time, empower users, increase resource availability, improve cluster efficiency, diagnose failures more quickly, and steer the cluster toward its intended goals with comprehensive feedback and dynamic policies
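
The division of labor described above, where the resource manager owns the job queue and the nodes while the scheduler decides what runs where, can be illustrated with a minimal sketch. Everything here is hypothetical: the `Job` and `ResourceManager` classes, the `schedule` function, and the "highest priority first, first node that fits" policy are assumptions chosen for illustration and do not reflect the interface or algorithm of TORQUE, PBS Pro, LoadLeveler, LSF, or any particular workload manager.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cpus: int
    priority: int = 0


@dataclass
class ResourceManager:
    """Hypothetical stand-in for a resource manager: owns the queue and the nodes."""
    free_cpus: dict                               # node name -> free CPU count
    queue: list = field(default_factory=list)     # jobs waiting to run
    running: dict = field(default_factory=dict)   # job name -> node name

    def submit(self, job):
        # Users keep submitting through the resource manager.
        self.queue.append(job)

    def start(self, job, node):
        # Carries out a placement decision made by the scheduler.
        self.queue.remove(job)
        self.free_cpus[node] -= job.cpus
        self.running[job.name] = node


def schedule(rm):
    """Hypothetical scheduler pass: highest priority first, first node that fits."""
    for job in sorted(rm.queue, key=lambda j: -j.priority):
        for node, free in rm.free_cpus.items():
            if free >= job.cpus:
                rm.start(job, node)
                break


rm = ResourceManager(free_cpus={"node1": 4, "node2": 8})
rm.submit(Job("small", cpus=2, priority=1))
rm.submit(Job("large", cpus=8, priority=5))
rm.submit(Job("medium", cpus=4, priority=3))
schedule(rm)
print(rm.running)                    # jobs the scheduler placed
print([j.name for j in rm.queue])    # jobs still waiting
```

In this sketch the submission path (`submit`) is untouched by the scheduler, which mirrors the point that users can keep using their existing resource manager commands while a separate component makes the placement decisions.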

Benefits

  • Improves overall scheduling performance, allowing users to run more jobs on the same system and get their results back more quickly.
  • Prioritizes workloads and preempts active jobs to accommodate higher-priority workloads.
  • Allows resource reservations, which guarantee resource availability at particular times.
  • Gives users access to additional resources, maximizing system utilization and minimizing end-user support time.
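
Two of the benefits above, reservations and priority-based preemption, come down to simple checks that can be sketched in a few lines. This is an illustrative sketch only: the `Reservation` and `RunningJob` types, the hour-based time units, and the "preempt the lowest-priority job that the new job outranks" rule are assumptions, not the behavior of any specific workload manager.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    node: str
    start: int  # hours from now (assumed unit)
    end: int    # exclusive end, hours from now


@dataclass
class RunningJob:
    name: str
    node: str
    priority: int


def conflicts(reservations, node, start, end):
    """True if the interval [start, end) on this node overlaps any reservation."""
    return any(r.node == node and start < r.end and r.start < end
               for r in reservations)


def pick_victim(running, new_priority):
    """Return the lowest-priority running job that the new job outranks, or None."""
    candidates = [j for j in running if j.priority < new_priority]
    return min(candidates, key=lambda j: j.priority, default=None)


reservations = [Reservation("node1", start=10, end=14)]
running = [RunningJob("batch", "node1", priority=1),
           RunningJob("nightly", "node2", priority=2)]

print(conflicts(reservations, "node1", 8, 11))   # overlaps the reserved window
print(conflicts(reservations, "node1", 14, 16))  # starts as the reservation ends
print(pick_victim(running, new_priority=5))      # a job that could be preempted
print(pick_victim(running, new_priority=1))      # nothing is outranked
```

A scheduler built along these lines would refuse to place ordinary jobs into a reserved window, which is what lets a reservation guarantee availability, and would only preempt when an incoming job strictly outranks something already running.
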
The information used to create this article was provided by Cluster Resources, Inc.