Web Interface

How to use the web interface?

1: Settings
2: Searchbar
3: Plots
4: Filters
5: Views

5.1: My Jobs
5.2: User Jobs
5.3: Job List
5.4: Job
5.5: Users
5.6: Projects
5.7: Tags
5.8: Nodes
5.9: Node
5.10: Analysis
5.11: Status

Home

The entrypoint for each login via the login mask is a table containing each configured cluster as a row with the following columns:

Name: The configured cluster’s name
Running Jobs: Number of Jobs currently running longer than 5 minutes (or configured shortRunning amount of time)
- Clicking the Link will forward to the job list with preset filters for cluster and running jobs
Total Jobs: Number of Jobs in the respective job-archive
- Clicking the Link will forward to the job list with preset filter for cluster
Status View: Link to the status view of the respective cluster
- This column is only shown for users with admin authority.
Systems View: Link to the nodes view of the respective cluster
- This column is only shown for users with admin authority.

The navigation bar allows direct access to ClusterCockpit’s different views and functions. Depending on the user’s authorization, the selectable views can differ.

For most viewports, the navigation bar is rendered fully expanded:

Item	Title	Description
1	Home Button	Leads back to the home table
2	Views	Leads to ClusterCockpit’s different views, which change depending on user authority
3	Searchbar	Top-Level Searchbar, see full usage information here
4	Documentation	Leads to this Documentation
5	Settings	Leads to ClusterCockpit settings page
6	Logout	Logs out the active user

Adaptive Render Versions

On smaller viewports, the navigation bar will be rendered in one of two collapsed states:

ClusterCockpit Collapsed Navbar — Partially collapsed navigation bar. ‘Groups’ will expand to show links for Users, Projects, Tags, and Nodes views. ‘Stats’ will expand to show links for Analysis and Status views. Searchbar, Logout and Settings not shown here, but are still rendered explicitly in this case.

ClusterCockpit Burger Navbar — On mobile devices, the navigation bar as a whole is reduced into a burger navigation icon, and will display all views, as well as the searchbar, as stacked navigation menu.

1 - Settings

Webinterface Settings Page

The settings view allows non-privileged users to customize how metric plots are rendered. This includes line width, number of plots per row (where applicable), whether backgrounds should be colored, and the color scheme of multi-line metric plots.

Privileged users will also find an administrative interface for handling local user accounts. This includes creating local accounts from the interface, editing user roles, listing and deleting existing users, generating JSON Web Tokens for API usage, and delegating managed projects for manager role users.

Plotting Options

Field	Options	Note
Line Width	# Pixels	Width of the lines in the timeseries plots
Plots Per Row	# Plots	How many plots to show next to each other on pages such as the job or nodes views
Colored Backgrounds	Yes / No	Color plot backgrounds indicating mean values within warning thresholds
Color Scheme	See Below	Render multi-line metric plots in different color ranges

Color Schemes

Name	Colors
Default
Autumn
Beach
BlueRed
Rainbow
Binary
GistEarth
BlueWaves
BlueGreenRedYellow

Administration Options

Create User

New users can be created directly via the web interface. On successful creation, a green response message is returned, and the user is directly visible in the “Special Users” table if the user has at least two roles, or a single role other than user.

Error messages will also be displayed if the user creation process failed. No user account is saved to the database in this case.

Please note: Users are usually imported via LDAP on ClusterCockpit startup.

Field	Option	Note
Username (ID)	`string`	Required, must be unique
Password	`string`	Only API users are allowed to have a blank password; users with a blank password can only authenticate via JWT
Project	`string`	Only manager users can have a project
Name	`string`	Name of the user, optional, can be blank
Email Address	`string`	User’s email, optional, can be blank
Role	Select one	See roles for more detailed information
	`API`	Allowed to interact with REST API
Default	`User`	Same as if created via LDAP sync
	`Manager`	Allows inspecting jobs and users of the given project
	`Support`	Allows inspecting jobs and users of all projects, has no admin view or settings access
	`Admin`	General access

Special Users

This table does not contain users whose only role saved in the database is user. This is the case for all users created by LDAP import, and thus, these users will not be shown here. However, LDAP users’ roles can still be edited, and such users will appear in the table as soon as an authority higher than user, or a second authority, has been granted.

All other special case users, e.g. new users manually created with support role, will appear in the list.
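The listing rule described above can be expressed as a small predicate. This is only an illustrative sketch: the function name `is_special_user` and the lowercase role strings are assumptions made for this example, not ClusterCockpit's actual implementation.

```python
from typing import Iterable


def is_special_user(roles: Iterable[str]) -> bool:
    """Return True if a user with these roles would be listed as a special user.

    Hypothetical helper: a user is listed with at least two roles,
    or with a single role other than "user".
    """
    unique_roles = set(roles)
    if len(unique_roles) >= 2:
        return True
    return len(unique_roles) == 1 and "user" not in unique_roles


# A plain LDAP-imported user is not listed, a support user is.
print(is_special_user(["user"]))
print(is_special_user(["support"]))
```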

User accounts can be deleted by pressing the respective button displayed for each user entry. A verification pop-up window will appear to prevent accidental user deletion.

Additionally, JWTs for specific users can be generated here.

Column	Example	Description
Username	`abcd1`	Username of this user
Name	`Paul Atreides`	Name of this user
Project(s)	`abcd`	Managed project(s) of this user
Email	`demo@demo.com`	Email address of this user
Roles	`admin,api`	Role(s) of this user
JWT	Press button to reveal freshly generated token	Generate a JWT for this user for use with the CC REST API endpoints
Delete	Press button to verify deletion	Delete this user

Edit User Role

On creation, users can only have one role. However, multiple roles can be assigned to a user account. The addition or removal of roles is performed here.

Enter an existing username and select an existing (for removal) or new (for addition) role in the drop-down menu.

Then press the respective button to add the selected authority to, or remove it from, the user account. Errors will be displayed if already assigned roles are added or unassigned roles are removed.

Edit Managed Projects

On creation, users can only have one managed project. However, it is allowed to assign multiple projects to a manager account. The addition or removal of projects is performed here.

Enter an existing username and select an existing (for removal) or new (for addition) project by entering the respective projectId.

Then press the respective button to add the selected project to, or remove it from, the manager account. Errors will be displayed if already assigned projects are added, unassigned projects are removed, or if the user account is not authorized to manage projects at all.

2 - Searchbar

Toplevel Searchbar Functionality

The top searchbar will handle page wide searches either by entering a searchterm directly as <query>, or by using a “keyword” implemented in the form of <keyword>:<query>. Entering a searchterm directly will start a hierarchical search which will return the first match in the hierarchy (see table below). It is recommended to supply the search with a keyword to specify the searched entity. For example, jobName:myJobName will specifically search for all jobs which have the queried string (or a part thereof) in their metadata jobName field. For all keywords with examples, see the table below.

Both keywords and queries are trimmed of all spaces before performing the search, returning the same results independently of location and number of spaces, e.g. name : Paul and name: paul are both handled identically.

Unprocessable queries will return a message detailing the cause of the error.

Available Keywords

Please note: Hovering over the information icon right of the query field will list all keywords in the webinterface.

Keyword	Example Query	Destination	Note
No Keyword Used	`abcd100`	Joblist or User Joblist	Performs hierarchical search `jobId -> username -> name -> projectId -> jobName`
JobId	`jobId:123456`	Joblist	Allows multiple identical matches, e.g. JobIds from different clusters
JobName	`jobName:myJobName`	Joblist	Works with partial queries. Allows multiple identical matches, e.g. JobNames from different clusters
ProjectId	`projectId:abcd100`	Joblist	All Jobs of the given project
Username	`username:abcd100a`	Users Table	Only active users are returned; users without jobs are not shown. Also, a `Last 30 Days` filter is active by default and might filter out expected users. Admin Only
Name	`name:Paul`	Users Table	Works with partial queries. Only active users are returned; users without jobs are not shown. Also, a `Last 30 Days` filter is active by default and might filter out expected users. Admin Only
ArrayJobId	`arrayJobId:891011`	Joblist	All Jobs of the given arrayJobId
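The keyword handling described in this section can be sketched as follows. This is a minimal illustration and not the actual ClusterCockpit code: the function names `parse_search_term` and `fields_to_search` are hypothetical, and trimming is assumed to remove the spaces surrounding the keyword and the query.

```python
from typing import List, Optional, Tuple

# Keywords and hierarchy as listed in the table above.
KEYWORDS = {"jobId", "jobName", "projectId", "username", "name", "arrayJobId"}
HIERARCHY = ["jobId", "username", "name", "projectId", "jobName"]


def parse_search_term(term: str) -> Tuple[Optional[str], str]:
    """Split a search term into (keyword, query); keyword is None for direct searches."""
    keyword: Optional[str] = None
    query = term
    if ":" in term:
        keyword, query = term.split(":", 1)
        keyword = keyword.strip()
        if keyword not in KEYWORDS:
            raise ValueError(f"unknown keyword: {keyword!r}")
    query = query.strip()
    if not query:
        raise ValueError("empty search query")
    return keyword, query


def fields_to_search(keyword: Optional[str]) -> List[str]:
    """Return the fields to try, in order: one field for a keyword, else the hierarchy."""
    return [keyword] if keyword is not None else list(HIERARCHY)


keyword, query = parse_search_term("name : Paul")
print(keyword, query, fields_to_search(keyword))
```

Raising an error for unknown keywords or empty queries mirrors the statement that unprocessable queries return a message detailing the cause.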

3 - Plots

Plot Descriptions and Functionality

Most plots visible in the ClusterCockpit webinterface are implemented via uPlot or Chart.js, which both offer various functionality to the user.

Metric Plots

The main plot component of ClusterCockpit renders the metric values retrieved from the systems in a time dependent manner.

Interactivity

A selector crosshair is shown when hovering over the rendered data, and the data points corresponding to the legend are highlighted.

It is possible to zoom in by dragging a selection square with your mouse. Double-Clicking into the plot will reset the zoom.

Conditional Legends

Hovering over the rendered data will display a legend as a hovering box colored in yellow. Depending on the amount of data shown, this legend will render differently:

Single Dataset: Runtime and Dataset Identifier Only
2 to 6 Datasets: Runtime, Line Color and Dataset Identifier
7 to 12 Datasets: Runtime and Dataset Identifier Only
More than 12 Datasets: No Legend
Statistics Datasets: Runtime and Dataset Identifier Only (See below)

The “no legend” case is required to not clutter the display in case of high data volume, e.g. core granularity data for more than 128 cores, which would result in 128 legend entries, possibly blocking the plotting area of metric graphs below.
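The legend rules listed above amount to a simple selection by dataset count. The following sketch only illustrates that decision; the function name `legend_mode` and the returned labels are made up for this example and do not correspond to identifiers in the web interface.

```python
def legend_mode(num_datasets: int, is_statistics: bool = False) -> str:
    """Pick the legend variant for a metric plot (hypothetical labels)."""
    if is_statistics or num_datasets == 1:
        return "runtime+identifier"
    if 2 <= num_datasets <= 6:
        return "runtime+color+identifier"
    if 7 <= num_datasets <= 12:
        return "runtime+identifier"
    # More than 12 datasets (or none at all): no legend is rendered.
    return "none"


for count in (1, 4, 10, 128):
    print(count, legend_mode(count))
```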

Example

Colored Backgrounds

The plot’s background is colored depending on the average value of the viewed metric with respect to its configured threshold values. The three cases are:

White: Metric average within expected parameters. No performance impact.
Yellow: Metric average below expected parameters, but not yet critical. Possible performance impact.
Red: Metric average unexpectedly low. Indicator for suboptimal usage of resources. Performance impact to be expected.
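As a rough sketch, the three cases can be thought of as comparing the metric average against two thresholds. The parameter names `caution` and `alert`, as well as the handling of values exactly on a threshold, are assumptions for this illustration and may differ from the configured threshold names and the actual rendering logic.

```python
def background_color(average: float, caution: float, alert: float) -> str:
    """Map a metric average to a plot background color.

    Assumes alert < caution, and that lower averages indicate worse usage.
    """
    if average < alert:
        return "red"      # unexpectedly low, performance impact expected
    if average < caution:
        return "yellow"   # below expectations, but not yet critical
    return "white"        # within expected parameters


print(background_color(average=42.0, caution=50.0, alert=10.0))
```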

Example

Statistics Variant

In the job list views, high amounts of data are by default rendered as a statistical representation of the numerous, single datasets:

Maximum: The maximum values of the base datasets of each point in time, over time. Colored in green.
Average: The average values of the base datasets of each point in time, over time. Colored in black.
Minimum: The minimal values of the base datasets of each point in time, over time. Colored in red.
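The three statistics series can be derived from the base datasets point by point. The sketch below assumes that all base series have the same length and contain no missing values; the function name `statistics_series` is chosen for this example only.

```python
from typing import Dict, List


def statistics_series(series: List[List[float]]) -> Dict[str, List[float]]:
    """Reduce several equally long time series to max/avg/min per point in time."""
    # zip(*series) groups the values of all datasets belonging to the same timestep.
    columns = list(zip(*series))
    return {
        "max": [max(column) for column in columns],
        "avg": [sum(column) / len(column) for column in columns],
        "min": [min(column) for column in columns],
    }


node_a = [1.0, 2.0, 3.0]
node_b = [3.0, 4.0, 5.0]
print(statistics_series([node_a, node_b]))
```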

Example

Histograms

Histograms display (binned) data, allowing distributions of the respective data source to be visualized. Data highlighting, zooming, and resetting the zoom work as described for metric plots.

Example

Roofline Plot

A roofline plot, or roofline model, represents the utilization of available resources as the relation between computation and memory usage.
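For orientation, the "roof" in the standard roofline model is the minimum of the peak compute rate and the product of arithmetic intensity and peak memory bandwidth. The sketch below shows that textbook relation only; the function name and the example peak values are hypothetical and say nothing about how ClusterCockpit computes or configures its roofs.

```python
def attainable_performance(intensity: float, peak_flops: float, peak_bandwidth: float) -> float:
    """Upper performance bound of the standard roofline model.

    intensity:      arithmetic intensity in FLOP/byte
    peak_flops:     peak compute rate, e.g. in GFLOP/s
    peak_bandwidth: peak memory bandwidth, e.g. in GB/s
    """
    return min(peak_flops, intensity * peak_bandwidth)


# Hypothetical node: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
for intensity in (0.5, 10.0, 20.0):
    print(intensity, attainable_performance(intensity, 1000.0, 100.0))
```

Points far below the sloped part of the roof are limited by memory bandwidth, while points under the flat part are limited by compute.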

Dotted Roofline

Roofline models rendered as dotted plots display the utilization of hardware resources over time.

Please note: The roofline models rendered in the status view are not job-derived, but display the utilization of single nodes at the moment of data collection. Therefore, no time information is required, and all dots are colored blue.

Example

Heatmap Roofline

The roofline model shown in the analysis view is, as the single exception, rendered as a heatmap. This is because the displayed data is derived from more than one job, since the analysis view returns all jobs matching the selected filters. The roofline therefore colors regions of accumulated activity in increasing shades of red, depicting the regions below the roofs in which the returned jobs primarily perform.

Please note: The plot is rendered in double-logarithmic scaling, yet the lines in the background seem linear: The heatmap roofline is rendered manually (and directly) using only HTML canvas, while the dotted roofline model is rendered with the help of the uPlot package, which allows easy display of double-log scales.

Example

Polar Plots

A polar, or radar, plot represents the utilization of three key metrics: flops_any, mem_used, and mem_bw. Both the maximum and the average utilization as a fraction of the 100% theoretical maximum (labelled as 1.0) are rendered on three axes. A larger area therefore marks increasingly optimal resource usage. In principle, this is a graphic representation of data also shown in the footprint component.

By clicking on one of the two legends, the respective dataset will be hidden. This can be useful if high overlap reduces visibility.

Example

Scatter / Bubble Plot

Bubble scatter plots show the position of the averages of two selected metrics in relation to each other.

Each circle represents one job, while the size of a circle is proportional to its node hours. Darker circles mean multiple jobs have the same averages for the respective metric selection.

Example

4 - Filters

Webinterface Filter Options

Filter Button as displayed in Job List Views

The ClusterCockpit filter component is used for reducing the number of jobs, either for direct display in job list views, or to specify the data source for collecting information displayed in user or project tables, as well as the analysis view.

Multiple Active Filters — Three active filters have reduced the total job count considerably

Multiple filters can be easily combined by selecting more than one option of the available filters.

By clicking on the respective filter pill, colored in blue and located to the right of the filter component, one can directly access the respective filter’s menu for editing or removing the filter.

At the moment, the following filters are implemented:

Cluster/Partition

Select a configured cluster, or a specified partition of a given cluster, and display only jobs started on that cluster (and partition).

Options: All cluster names, and nested partition names, configured in config.json

Default: Any Cluster (Any Partition)

Job States

Select one or more job states, and display only jobs matching the selected criteria.

Options: running, completed, failed, cancelled, stopped, timeout, preempted, out_of_memory

Default: All states

Start Time

Select the timeframe in which jobs were started, and display only jobs matching the selected criteria.

Options: Free selection of date dd.mm.YYYY and time hh:mm for from and to limits.

Default: All Starttimes

Preset: Jobs started one month ago until $now

Duration

Select the duration of jobs, and display only jobs matching the selected criteria.

Options: Duration less than hh:mm, duration more than hh:mm, duration between two duration selections. Only one of the three options can be used at a time.

Default: All Durations

Resources

Select a named node or specify an amount of used resources, and display only jobs matching the selected criteria.

Options:

Named node free text field: Enter a hostname here to only return jobs which were run on this node.
Range selectors: Select a range of allocated job resources ranging from the minimal to the maximum configured resource count of all clusters. If the cluster filter is set, the ranges are limited to the respective resources’ configuration. Available resources are:
- Nodes
- HWThreads
- Accelerators (if available)

Default: No named node, full resource ranges of all configured clusters

Statistics

Specify ranges of metric statistics, and display only jobs matching the selected criteria.

Options:

FLOPs (Avg.): Select Range From-To by dragging the slider or entering values directly.
Memory Bandwidth (Avg.): Select Range From-To by dragging the slider or entering values directly.
Load (Avg.): Select Range From-To by dragging the slider or entering values directly.
Memory Used (Max.): Select Range From-To by dragging the slider or entering values directly.

Default: Full metric statistics ranges as configured

Start Time Quick Selections

Please note: Not available in all views!

Quickly select a preconfigured range of job start times. Will display as named start time filter.

Options: Last 6 hours, Last 24 hours, Last 7 Days, Last 30 Days

Default: No selection
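Each quick selection corresponds to a start time range ending at the current time. The following sketch shows how such a range could be computed; the mapping `QUICK_SELECTIONS` and the function `quick_selection_range` are illustrative names, not part of ClusterCockpit.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Labels as offered in the web interface, mapped to their time spans.
QUICK_SELECTIONS = {
    "Last 6 hours": timedelta(hours=6),
    "Last 24 hours": timedelta(hours=24),
    "Last 7 Days": timedelta(days=7),
    "Last 30 Days": timedelta(days=30),
}


def quick_selection_range(label: str, now: datetime) -> Tuple[datetime, datetime]:
    """Return the (from, to) start time limits for a quick selection."""
    return now - QUICK_SELECTIONS[label], now


start, end = quick_selection_range("Last 7 Days", datetime.now())
print(start.isoformat(), end.isoformat())
```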

5 - Views

View-Specific Frontend Usage Information.

Usage descriptions for each view of the ClusterCockpit web interface.

5.1 - My Jobs

All Jobs as Table of the Active User

The “My Jobs” View is available to all users regardless of authority and displays the user’s personal jobs, i.e. jobs started by this user’s username on the cluster systems.

The view is a personal variant of the user job view and therefore also consists of three components: Basic information about the user’s jobs, selectable statistic histograms of the jobs, and a generalized job list.

Users are able to change the sorting, select and reorder the rendered metrics, filter, and activate a periodic reload of the data.

User Information and Basic Distributions

The top row always displays personal usage information, independent of the selected filters.

Additional histograms depicting the distribution of job duration and number of nodes occupied by the returned jobs are affected by the selected filters.

Information displayed:

Username
Total Jobs
Short Jobs (as defined by the configuration, default: less than 300 second runtime)
Total Walltime
Total Core Hours

Selectable Histograms

Histograms depicting the distribution of the selected jobs’ statistics can be selected from the top navbar “Select Histograms” button. The displayed data is based on the jobs returned from active filters, and will be pulled from the database, or in case of running jobs, calculated from the available metric data directly.

Available Metrics for Histograms: cpu_load, flops_any, mem_used, mem_bw, net_bw, file_bw

Job List

The job list displays all jobs started by your username on the systems. Additional filters will always respect this limitation. For a detailed description of the job list component, see the related documentation.

5.2 - User Jobs

All Jobs as Table of a Selected User

The “User Jobs” View is only available to management and supporting staff and displays jobs of the selected user, i.e. jobs started by this user’s username on the cluster systems.

The view consists of three components: Basic information about the user’s jobs, selectable statistic histograms of the jobs, and a generalized job list.

Users are able to change the sorting, select and reorder the rendered metrics, filter, and activate a periodic reload of the data.

User Information and Basic Distributions

The top row always displays information about the user, independent of the selected filters.

Additional histograms depicting the distribution of job duration and number of nodes occupied by the returned jobs are affected by the selected filters.

Information displayed:

Username
Total Jobs
Short Jobs (as defined by the configuration, default: less than 300 second runtime)
Total Walltime
Total Core Hours

Selectable Histograms

Histograms depicting the distribution of the selected jobs’ statistics can be selected from the top navbar “Select Histograms” button. The displayed data is based on the jobs returned from active filters, and will be pulled from the database, or in case of running jobs, calculated from the available metric data directly.

Available Metrics for Histograms: cpu_load, flops_any, mem_used, mem_bw, net_bw, file_bw

Job List

The job list displays all jobs started by this user’s username on the systems. Additional filters will always respect this limitation. For a detailed description of the job list component, see the related documentation.

5.3 - Job List

A Configurable Table Displaying Jobs According to Filters

Job View — Job List. In this example, the optional footprint is displayed, two filters are active, and the table is refreshed every minute. The first job has a high node count, therefore the plots are rendered in the statistics variant. The ‘mem_bw’ metric likely has artifacts as shown by the grey footprint. The second job has tags and displays less than optimal performance in the ‘flops_any’ metric, coloring the respective plot background in orange.

The primary view of ClusterCockpit’s web interface is the tabular listing of jobs, which displays various information about the jobs returned by the selected filters. This information includes the jobs’ full meta data, such as runtime or job state, as well as an optional footprint, allowing quick assessment of the jobs’ performance.

Most importantly, the list displays a selectable array of metrics as time dependent metric plots, which allows detailed insight into the jobs’ performance at a glance.

Default Users: For users without additional roles, this view is labelled as ‘Job Search’. Displayed jobs are limited to jobs started by the active user, otherwise the functionality is identical, e.g. filtering or footprint display.

Manager Users: For users with additional manager role, this view is labelled as ‘Managed Jobs’. Displayed jobs are limited to jobs started by users of the managed projects (usergroups), otherwise the functionality is identical, e.g. filtering or footprint display.

Several options allow configuration of the displayed data, which are also persisted for each user individually, either for general usage or by cluster.

Sorting

Basic selection of sorting parameter and direction. By default, jobs are sorted by starting timestamp in descending order (latest jobs first). Other selections to sort by are

Duration
Number of Nodes
Maximum Memory Used
Average FLOPs
Average Memory Bandwidth
Average Network Bandwidth

Switching of the sort direction is achieved by clicking on the arrow icon next to the desired sorting parameter.

Metrics

Selection of metrics shown in the tabular view for each job. The list is compiled from all available configured metrics of the ClusterCockpit instance, and the tabular view will be updated upon applying the changes.

In addition to the metric names themselves, the availability by cluster is indicated as a comma separated list next to the metric identifier. This information will change to the availability by partition if the cluster filter is active.

It is furthermore possible to edit the order of the selected metrics. This can be achieved by dragging and dropping the metric selectors to the desired order, where the topmost metric will be displayed next to the “Job Info” column, and additional metrics will be added on the right side.

Lastly, the optional “Footprint” Column can be activated (and deactivated) here. It will always be rendered next to the “Job Info” column, while metrics start right of the “Footprint” column, if activated.

Job Count

The total number of jobs returned by the backend for the given set of filters.

Filters

Selection of filters applied to the queried jobs. By default, no filters are activated if the view was opened via the navigation bar. At multiple locations throughout the web interface, direct links will lead to this view with one or more preset filters active, e.g. selecting a cluster’s “running jobs” from the home page will open this view displaying only running jobs of that cluster.

Possible options are:

Cluster/Partition: Filter by configured cluster (and partitions thereof)
Job State: Filter by defined job state(s)
Start Time: Filter by start timestamp
Duration: Filter by job duration
Tags: Filter by tags assigned to jobs
Resources: Filter by allocated resources or named node
Statistics: Filter by average usage of defined metrics

Each filter and its default value is described in detail here.

Search and Reload

Search for a specific username or project using the searchbox, force a complete reload of the table data, or set a timed periodic reload (30, 60, 120, 300 seconds).

Job List Table

The main component of the job list view renders data pulled from the database, the job archive (completed jobs) and the configured metric data source (running jobs).

Job Info

The meta data containing general information about the job is represented in the “Job Info” column, which is always the first column to be rendered. From here, users can navigate to the detailed view of one specific job as well as the user or project specific job lists.

Field	Example	Description	Destination
Job Id	`123456`	The JobId of the job assigned by the scheduling daemon	Job View
Job Name	`myJobName`	The name of the job as supplied by the user	-
Username	`abcd10`	The username of the submitting user	User Jobs
Project	`abcd`	The name of the usergroup the submitting user belongs to	Joblist with preset Filter
Resources	`n100`	Indicator for the allocated resources. Single resources will be displayed by name, i.e. exclusive single-node jobs or shared resources. Multiples of resources will be indicated by icons for nodes, CPU Threads, and accelerators.	-
Partition	`main`	The cluster partition this job was started on	-
Start Timestamp	`10.1.2024, 10:00:00`	The epoch timestamp the job was started at, formatted for human readability	-
Duration	`0:21:10`	The runtime of the job, will be updated for running jobs on reload. Additionally indicates the state of the job as colored pill	-
Walltime	`24:00:00`	The allocated walltime for the job as per job submission script	-

Footprint

The optional footprint column will show base metrics for job performance at a glance, and will hint to performance (and performance problems) in regard to configurable metric thresholds.

Field	Description	Note
cpu_load	Average CPU utilization	-
flops_any	Floprate calculated as `f_any = (f_double x 2) + f_single`	-
mem_bw	Average memory bandwidth used	Non-GPU Cluster only
mem_used	Maximum memory used	Non-GPU Cluster only
acc_utilization	Average accelerator utilization	GPU Cluster Only

Colors and icons differentiate between the different warning states based on the configured threshold of the metrics. Reported metric values below the warning threshold simply report bad performance in one or more metrics, and should therefore be inspected by the user for future performance improvement.

Metric values colored in blue, however, usually report performance above the expected levels, which is exactly why these metrics should be inspected as well. The “maximum” thresholds are often the theoretically achievable performance of the respective hardware component, but they are rarely actually reached. Inspecting jobs that report such levels can uncover averaging errors, unrealistic spikes in the metric data, or even bugs in the code of ClusterCockpit.

Color	Level	Description	Note
Blue	Info	Metric value below maximum configured peak threshold	Job performance above expected parameters - Inspection recommended
Green	OK	Metric value below normal configured threshold	Job performance within expected parameters
Yellow	Caution	Metric value below configured caution threshold	Job performance might be impacted
Red	Warning	Metric value below configured warning threshold	Job performance impacted with high probability - Inspection recommended
Dark Grey	Error	Metric value extremely above maximum configured threshold	Inspection required - Metric spikes in affected metrics can lead to erroneous average values

For examples, see images in the job view section.

Metric Row

Selected metrics are rendered here in the selected order as metric lineplots. Aspects of the rendering can be configured at the settings page.

5.4 - Job

Detailed Single Job Information View

The job view displays all data related to one specific job in full detail, and allows detailed inspection of all metrics at several scopes, as well as manual tagging of the job.

Top Bar

The top bar of each job view replicates the “Job Info” and “Footprint” seen in the job list, and additionally renders general metric information in specialized plots.

For shared jobs, a list of jobs which run (or ran) concurrently is shown as well.

Job Info

Identical to the job list equivalent, this component displays meta data containing general information about the job. From here, users can navigate to the detailed view of one specific job as well as the user or project specific job lists.

Field	Example	Description	Destination
Job Id	`123456`	The JobId of the job assigned by the scheduling daemon	Job View
Job Name	`myJobName`	The name of the job as supplied by the user	-
Username	`abcd10`	The username of the submitting user	User Jobs
Project	`abcd`	The name of the usergroup the submitting user belongs to	Joblist with preset Filter
Resources	`n100`	Indicator for the allocated resources. Single resources will be displayed by name, i.e. exclusive single-node jobs or shared resources. Multiples of resources will be indicated by icons for nodes, CPU Threads, and accelerators.	-
Partition	`main`	The cluster partition this job was started on	-
Start Timestamp	`10.1.2024, 10:00:00`	The epoch timestamp the job was started at, formatted for human readability	-
Duration	`0:21:10`	The runtime of the job, will be updated for running jobs on reload. Additionally indicates the state of the job as colored pill	-
Walltime	`24:00:00`	The allocated walltime for the job as per job submission script	-

Footprint

Identical to the job list equivalent, this component will show base metrics for job performance at a glance, and will hint to job quality and problems in regard to configurable metric thresholds. In contrast to the job list, it is always active and shown in the detailed job view.

Field	Description	Note
cpu_load	Average CPU utilization	-
flops_any	Floprate calculated as `f_any = (f_double x 2) + f_single`	-
mem_bw	Average memory bandwidth used	-
mem_used	Maximum memory used	Non-GPU Cluster only
acc_utilization	Average accelerator utilization	GPU Cluster Only

Colors and icons differentiate between the different warning states based on the configured thresholds of the metrics. Reported metric values below the warning threshold simply report bad performance in one or more metrics, and should therefore be inspected by the user for future performance improvement.

Metric values colored in blue, however, usually report performance above the expected levels, which is exactly why these metrics should be inspected as well. The “maximum” thresholds are often the theoretically achievable performance of the respective hardware component, but they are rarely actually reached. Inspecting jobs that report such levels can uncover averaging errors, unrealistic spikes in the metric data, or even bugs in the code of ClusterCockpit.

Color	Level	Description	Note
Blue	Info	Metric value below maximum configured peak threshold	Job performance above expected parameters - Inspection recommended
Green	OK	Metric value below normal configured threshold	Job performance within expected parameters
Yellow	Caution	Metric value below configured caution threshold	Job performance might be impacted
Red	Warning	Metric value below configured warning threshold	Job performance impacted with high probability - Inspection recommended
Dark Grey	Error	Metric value extremely above maximum configured threshold	Inspection required - Metric spikes in affected metrics can lead to erroneous average values

Specific to the job view: The footprint component also allows for 1:1 rendering of HTML code saved within the job’s meta data section of the database. This is intended for administrative messages towards the user who created the job, e.g. for displaying warnings, hints, or contact information.

Examples

Footprint with good Performance — Footprint of a job with performance well within expected parameters, ‘mem_bw’ even overperforms.

Footprint with mixed Performance — Footprint of an accelerated job with mixed performance parameters.

Footprint with Errors — Footprint of a job with performance averages way above the expected maxima - Look for artifacts!

Concurrent Jobs

For a shared job, this component displays all jobs that ran on the same hardware at the same time. “At the same time” is defined as “has a starting or ending time which lies between the starting and ending time of the reference job” for this purpose.

A safety margin of five minutes is applied to both limits. This excludes jobs which have too little overlap and would only clutter the resulting list of jobs.

Each overlapping job is listed with its jobId as a link leading to that job’s detailed job view.
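
The overlap rule described above can be sketched as follows. This is a literal reading of the definition, assuming the five-minute margin shrinks the reference window on both sides and that the limits are inclusive. The check for shared hardware is left out, and all names are illustrative assumptions:

```python
from datetime import datetime, timedelta

# Five-minute margin applied to both limits of the reference job.
MARGIN = timedelta(minutes=5)


def is_concurrent(ref_start, ref_end, job_start, job_end, margin=MARGIN):
    """Return True if the job starts or ends inside the shrunken reference window."""
    window_start = ref_start + margin
    window_end = ref_end - margin
    starts_inside = window_start <= job_start <= window_end
    ends_inside = window_start <= job_end <= window_end
    return starts_inside or ends_inside


ref_start = datetime(2024, 1, 1, 10, 0)
ref_end = datetime(2024, 1, 1, 12, 0)

# Starts one hour into the reference job: listed.
print(is_concurrent(ref_start, ref_end,
                    datetime(2024, 1, 1, 11, 0), datetime(2024, 1, 1, 13, 0)))
# Ends only three minutes into the reference job: not listed.
print(is_concurrent(ref_start, ref_end,
                    datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 3)))
```

Note that, read literally, a job that starts before and ends after the reference job matches neither condition in this sketch.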

Polar Representation

A polar plot representing the utilization of three key metrics: flops_any, mem_used, and mem_bw. Both the maximum and the average are rendered. In principle, this is a graphic representation of data also shown in the footprint component.

Roofline Representation

A roofline plot representing the utilization of available resources as the relation between computation and memory usage over time (color scale blue -> red).
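
For orientation, the standard roofline model behind such a plot can be sketched as follows. The arithmetic intensity is the ratio of floating point rate to memory bandwidth, and the attainable performance is capped by either the peak floating point rate or the memory bandwidth. The function names, example values, and units are illustrative assumptions; the exact metrics and peak values used for the axes depend on the cluster configuration:

```python
def arithmetic_intensity(flops_any: float, mem_bw: float) -> float:
    """Floating point operations per byte moved (e.g. GFLOP/s divided by GB/s)."""
    return flops_any / mem_bw


def roofline_ceiling(intensity: float, peak_flops: float, peak_mem_bw: float) -> float:
    """Attainable performance at a given intensity under the roofline model."""
    return min(peak_flops, intensity * peak_mem_bw)


# Hypothetical job sample: 200 GFLOP/s at 100 GB/s memory bandwidth.
intensity = arithmetic_intensity(200.0, 100.0)
# Hypothetical node limits: 1000 GFLOP/s peak, 200 GB/s peak bandwidth.
print(intensity, roofline_ceiling(intensity, 1000.0, 200.0))
```

A point far below the ceiling at its intensity indicates that the job uses neither the compute units nor the memory subsystem close to their limits.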

Metric Plot Table

The view’s middle section consists of metric plots for each metric selected in the “Metrics” selector, which defaults to all configured metrics.

The data shown per metric defaults to the smallest available granularity of the metric with data of all nodes, but can be changed at will by using the drop down selectors above each plot.

Please note: The statistical representation is not yet available for metric plots in this view. Jobs with high allocated node counts will show one line for each core if switched to core granularity!

Tagging

Manual tagging of jobs is performed by using the “Manage Tags” option.

Existing tags are listed, and can be added to the job’s database entry simply by pressing the respective button.

The list can be filtered for specific tags by using the “Search Tags” prompt.

New tags can be created by entering a new type:name combination in the search prompt, which will display a button for creating this new tag.

Statistics and Meta Data

At the bottom of the job view, additional information about the job is collected. By default, the statistics of the selected metrics are shown in tabular form, each in its metric’s native granularity.

Statistics Table

The statistics table collects all metric statistical values (min, max, avg) for each allocated node and each granularity.

The metrics to be displayed can be selected using the “Metrics” selection pop-up window. In the header, next to the metric name, a second drop down allows the selection of the displayed granularity.

Core and Accelerator metrics default to their respective native granularities automatically.
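
What the statistics table collects per node can be illustrated with a small sketch. It assumes the metric data is available as one list of samples per node. The function name and the data layout are hypothetical and only serve to show the min, max, and avg values:

```python
from statistics import fmean


def node_statistics(series_by_node: dict[str, list[float]]) -> dict[str, dict[str, float]]:
    """Compute min, max and avg per node; nodes without samples are skipped."""
    return {
        node: {"min": min(values), "max": max(values), "avg": fmean(values)}
        for node, values in series_by_node.items()
        if values
    }


# Hypothetical samples of one metric for two allocated nodes.
samples = {"node01": [1.0, 2.0, 3.0], "node02": [4.0, 4.0]}
for node, stats in node_statistics(samples).items():
    print(node, stats)
```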

Job Script

This tab displays the job script with which this job was started on the system.

Slurm Info

This tab displays information returned from the SLURM batch process management software.

5.5 - Users

Table of All Users Running Jobs on the Clusters

User Table, sorted by ‘Total Jobs’ in descending order. In addition, active filters reduce the underlying data to jobs with more than one hour runtime, started on the GPU accelerated cluster.

This view lists all users who are, or were, active on the configured clusters. For each user, the total number of jobs, the walltime, and the compute usage are shown.

It is possible to filter the list by username using the equally named prompt, which also accepts partial queries.

The filter component allows limitation of the returned users based on job parameters like start timestamp or memory usage.

The table can be sorted by clicking the respective icon next to the column headers.

Please Note: A “Last 30 Days” filter is activated by default when opening this view.

Managers Only: For users with manager authority, this view will be titled ‘Managed Users’ in the navigation bar. Managers will only be able to see other user accounts of the managed projects.

Details

Column	Description	Note
User Name	The user jobs are associated with	Links to the users’ job list with preset filter returning only jobs of this user and additional histograms
Total Jobs	Users’ total of all started jobs
Total Walltime	Users’ total requested walltime
Total Core Hours	Users’ total of all used core hours
Total Accelerator Hours	Users’ total of all used accelerator hours	Please Note: This column is always shown, and will return `0` for clusters without installed accelerators

5.6 - Projects

Table of All Projects Running Jobs on the Clusters

This view lists all projects (usergroups) which are, or were, active on the configured clusters. For each project, the total number of jobs, the walltime, and the compute usage are shown.

It is possible to filter the list by project name using the equally named prompt, which also accepts partial queries.

The filter component allows limitation of the returned projects based on job parameters like start timestamp or memory usage.

The table can be sorted by clicking the respective icon next to the column headers.

Please Note: A “Last 30 Days” filter is activated by default when opening this view.

Details

Column	Description	Note
Project Name	The project (usergroup) jobs are associated with	Links to a job list with preset filter returning only jobs of this project
Total Jobs	Project total of all started Jobs
Total Walltime	Project total requested walltime
Total Core Hours	Project total of all used core hours
Total Accelerator Hours	Project total of all used accelerator hours	Please Note: This column is always shown, and will return `0` for clusters without installed accelerators

5.7 - Tags

Lists Active Tags Used in the Frontend

This view lists all tags currently used within the ClusterCockpit instance:

The type of the tag(s) is displayed as dark grey header, collecting all tags which share it.
The names of all tags sharing one type are rendered as yellow pills below the header.
How often a tag was applied to a job is shown in the number following the tag’s name.

Each tag’s pill is clickable, and leads to a job list with a preset filter matching only jobs tagged with this specific label.

Please note: Creating tags and adding them to jobs is either done by using the respective REST API call, or manually from the job view.

5.8 - Nodes

Node Based Metric Information of one Cluster

The nodes view, or systems view, is always called in respect to one specified cluster. It displays the current state of all nodes in that cluster in respect to one selected metric, rendered in form of metric plots, and independent of job meta data, i.e. without consideration for job start and end timestamps.

Please note: The X-Axis of all plots rendered in this view is relative to the latest data point received from the collector daemon, and thus, the time displayed reaches backward as indicated by negative X-axis labels.
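
The relative time axis can be illustrated with a short sketch. It assumes timestamps in seconds. The function name is hypothetical and not taken from the ClusterCockpit code:

```python
def relative_x_axis(timestamps: list[int]) -> list[int]:
    """Shift timestamps so that the latest data point sits at 0."""
    latest = max(timestamps)
    return [t - latest for t in timestamps]


# Hypothetical samples received every 60 seconds.
print(relative_x_axis([1000, 1060, 1120]))
```

The latest sample ends up at 0, and all older samples receive negative values.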

Selection Bar

Selections regarding the display, and update, of the plots rendered in the node table can be performed here:

(Periodic) Reload: Force reload of fresh data from the backend or set a periodic reload in specified intervals
- 30 Seconds, 60 Seconds, 120 Seconds, 5 Minutes
Displayed Time: Select the timeframe to be rendered in the node table
- Custom: Select timestamp from and to in which the data should be fetched. It is possible to select date and time.
- 15 Minutes, 30 Minutes, 1 Hour, 2 Hours, 4 Hours, 12 Hours, 24 Hours
Metric: Select the metric to be fetched for all nodes. If no data can be fetched, messages are displayed per node.
Find Node: Filter the node table by hostname. Partial queries are possible.

Node Table

Nodes (hosts) are ordered alphanumerically in this table, rendering the selected metric in the selected timeframe.

Each heading links to the singular node view of the respective host.

5.9 - Node

All Metrics of One Selected Node

The node view is always called in respect to one specified cluster and one specified node (host). It displays the current state of all metrics for that node, rendered in form of metric plots, and independent of job meta data, i.e. without consideration for job start and end timestamps.

Please note: The X-Axis of all plots rendered in this view is relative to the latest data point received from the collector daemon, and thus, the time displayed reaches backward as indicated by negative X-axis labels.

Selection Bar

Information about the node is shown here, and selections regarding the data of the plots rendered in the node table can be made:

Name: The hostname of the inspected node
Concurrent Jobs: Number of jobs currently allocated to this node. Exclusively used nodes will always display 1 if a job is running at the moment, or 0 if not.
- A link is provided which leads to the joblist with preset filter fetching only currently allocated jobs.
(Periodic) Reload: Force reload of fresh data from the backend or set a periodic reload in specified intervals
- 30 Seconds, 60 Seconds, 120 Seconds, 5 Minutes
Displayed Time: Select the timeframe to be rendered in the node table
- Custom: Select timestamp from and to in which the data should be fetched. It is possible to select date and time.
- 15 Minutes, 30 Minutes, 1 Hour, 2 Hours, 4 Hours, 12 Hours, 24 Hours

Node Table

Metrics are ordered alphanumerically in this table, rendering each metric in the selected timeframe.

5.10 - Analysis

Metric Data Analysis View

The analysis view is always called in respect to one specified cluster. It collects and renders data based on the jobs returned by the active filters, which can be specified to a high detail, allowing analysis of specific aspects.

Please note: By default, the requested data is limited by a preset start time filter to jobs started within the last 6 hours. In addition, some results are not calculated when the number of returned jobs exceeds 500, in order to save on rendering time.

General Information

The general information section of the analysis view is always rendered and consists of the following elements

Totals

Total counts of collected data based on the returned jobs matching the requested filters:

Total Jobs
Total Short Jobs (By default defined as jobs shorter than 5 minutes)
Total Walltime
Total Node Hours
Total Core Hours
Total Accelerator Hours
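
How such totals can be derived from a list of jobs is sketched below. The field names (`duration` in seconds, `num_nodes`, `num_cores`) and the function name are assumptions made for this illustration, and the short job limit of 300 seconds reflects the default of 5 minutes mentioned above:

```python
def analysis_totals(jobs: list[dict], short_running: int = 300) -> dict:
    """Aggregate job counts and resource hours for a list of jobs."""
    total_seconds = sum(j["duration"] for j in jobs)
    node_seconds = sum(j["duration"] * j["num_nodes"] for j in jobs)
    core_seconds = sum(j["duration"] * j["num_cores"] for j in jobs)
    return {
        "total_jobs": len(jobs),
        "short_jobs": sum(1 for j in jobs if j["duration"] < short_running),
        "walltime_hours": total_seconds / 3600,
        "node_hours": node_seconds / 3600,
        "core_hours": core_seconds / 3600,
    }


# Two hypothetical jobs: a two-hour job on 2 nodes and a three-minute job on 1 node.
jobs = [
    {"duration": 7200, "num_nodes": 2, "num_cores": 64},
    {"duration": 180, "num_nodes": 1, "num_cores": 32},
]
print(analysis_totals(jobs))
```

Accelerator hours would follow the same pattern with an accelerator count per job.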

Top Users and Projects

The ten most active users or projects are rendered in a combination of pie chart and tabular legend with values displayed. By default, the top ten users with the most jobs matching the selected filters will be shown.

Hovering over one of the pie chart fractions will display a legend featuring the identifier and value of the selected parameter.

The selection can be changed directly in the headers of the pie chart and the table. The available options are:

Element	Options
Pie Chart	`Users, Projects`
Table	`Walltime, Node Hours, Core Hours, Accelerator Hours`

The selection is saved for each user and cluster, and will select the last chosen types of list as default the next time this view is opened.

“User Names” and “Project Codes” are rendered as links, leading to user job lists or project job lists with preset filters for cluster and entity ID.

Please note: The legend colors are fixed by their position, and not by their respective identifier. This means that the orange fraction will always be the largest fraction, even if the contributing user or project changes.
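
The position-based coloring can be illustrated with a small sketch. Entries are ranked by value first, and colors are then assigned by rank, so the first palette color always marks the largest fraction. Apart from orange being first, the palette and all names are hypothetical:

```python
def assign_colors(totals: dict[str, float], palette: list[str]) -> list[tuple[str, float, str]]:
    """Rank entries by value (descending) and assign colors by position."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    top = ranked[: len(palette)]
    return [(name, value, palette[i]) for i, (name, value) in enumerate(top)]


palette = ["orange", "blue", "green"]  # only the leading orange is taken from the text

# The same user gets a different color once the ranking changes.
print(assign_colors({"alice": 5, "bob": 9}, palette))
print(assign_colors({"alice": 12, "bob": 9}, palette))
```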

Heatmap Roofline

A roofline plot representing the utilization of available resources as the relation between computation and memory for all jobs matching the filters. In order to represent the data in a meaningful way, the time information of the raw data is abstracted and represented as a heat map, with increasingly red sections of the roofline plot being the most populated regions of utilization.

Histograms

Two histograms depicting the distributions of job duration and number of allocated cores for the returned jobs matching the filters.

Selectable Data Representations

The second half of the analysis view consists of areas reserved for rendering user-selected data representations.

Select Plots for Histograms: Opens a selector listing all configured metrics of the respective cluster. One or more metrics can be selected, and the data returned will be rendered as average distributions normalized by node hours (core hours, accelerator hours; depending on the metric).
Select Plots in Scatter Plots: Opens a selector which allows selection of user chosen combinations of configured metrics for the respective cluster. Selected duplets will be rendered as scatter bubble plots for each selected pair of metrics.

Analysis View Scatter Selection — Three pairs of metrics are already selected for scatter representation. Remove a selected pair by pressing the ‘x’ button, add a new pair by selecting two metrics from the dropdown menu, and confirming by pressing ‘Add Plot’.

Average Distribution Histograms

Analysis View Average Distributions — Three selected metrics are represented as normalized, average distributions based on returned jobs.

These histograms show the distribution of the normalized averages of all jobs matching the filters, split into 50 bins for high detail.

Normalization is achieved by weighting the selected metric data job averages by node hours (default), or by either accelerator hours (for native accelerator scope metrics) or core hours (for native core scope metrics).
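
A weighted histogram of this kind can be sketched as follows. Each job adds its weight (for example its node hours) to the bin that contains its metric average, instead of counting as one. The binning details, the function name, and the example values are assumptions for illustration, and any further scaling applied by ClusterCockpit is not covered:

```python
def weighted_histogram(averages: list[float], weights: list[float], bins: int = 50) -> list[float]:
    """Sum the weights of all jobs per bin of their metric average."""
    lo, hi = min(averages), max(averages)
    width = (hi - lo) / bins or 1.0  # avoid a zero width if all averages are equal
    counts = [0.0] * bins
    for avg, weight in zip(averages, weights):
        # The maximum value is placed into the last bin.
        idx = min(int((avg - lo) / width), bins - 1)
        counts[idx] += weight
    return counts


# Three hypothetical jobs: metric averages and their node hours.
print(weighted_histogram([0.0, 10.0, 10.0], [1.0, 2.0, 3.0], bins=5))
```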

Please note: Metrics that are disabled for specific subclusters as per metric configuration file will be returned as null values if data is requested for the whole cluster, which can affect the rendered distributions. Select a specific partition using the cluster filter to avoid this artifact.

User Defined Scatterplots

Analysis View Scatter Plots — Three user defined scatter plots.

Bubble scatter plots show the position of the averages of two selected metrics in relation to each other.

Each circle represents one job, while the size of a circle is proportional to its node hours. Darker circles mean multiple jobs have the same averages for the respective metric selection.

5.11 - Status

Hardware Usage Information

The status view is always called in respect to one specified cluster. It displays the current state of utilization of the respective cluster’s resources, as well as user and project top lists and distribution histograms of the allocated resources per job.

Please note: By default, the periodic reload function is set to 2 Minutes.

Utilization Information

For each subcluster, utilization is displayed in two parts rendered in one row.

Gauges

Simple gauge representation of the current utilization of available resources

Field	Description	Note
Allocated Nodes	Number of nodes currently allocated in respect to maximum available	-
Flop Rate (Any)	Currently achieved flop rate in respect to theoretical maximum	Floprate calculated as `f_any = (f_double x 2) + f_single`
MemBW Rate	Currently achieved memory bandwidth in respect to technical maximum	-

Roofline

A roofline plot representing the utilization of available resources as the relation between computation and memory for each currently allocated, running job at the time of the latest data retrieval. Therefore, no time information is represented (all dots in blue, representing one job each).

Top Users and Projects

The ten most active users or projects are rendered in a combination of pie chart and tabular legend. By default, the top ten users or projects with the most allocated, running jobs are listed.

The selection can be changed directly in the table’s header at Number of ..., with the following options:

Jobs (Default)
Nodes
Cores
Accelerators

The selection is saved for each user and cluster, and will select the last chosen type of list as default the next time this view is rendered.

Hovering over one of the pie chart fractions will display a legend featuring the identifier and value of the selected parameter.

“User Names” and “Project Codes” are rendered as links, leading to user job lists or project job lists with preset filters for cluster, entity ID, and state == running.

Please note: The legend colors are fixed by their position, and not by their respective identifier. This means that the orange fraction will always be the largest fraction, even if the contributing user or project changes.

Statistic Histograms

Several histograms depicting the utilization of the cluster’s resources, based on all currently running jobs, are rendered here:

Duration Distribution
Number of Nodes Distribution
Number of Cores Distribution
Number of Accelerators Distribution
