1 - Backend ClusterCockpit Backend References
Reference information regarding the primary ClusterCockpit component “cc-backend” (GitHub Repo).
1.1 - Command Line ClusterCockpit Command Line Options
This page describes the command line options for the cc-backend executable.
-add-user <username>:[admin,support,manager,api,user]:<password>
Function: Adds a new user to the database. Only one role can be assigned.
Example: -add-user abcduser:manager:somepass
-config <path>
Function: Specifies an alternative path to the application configuration file.
Default: ./config.json
Example: -config ./configfiles/configuration.json
-del-user <username>
Function: Removes a user from the database by username.
Example: -del-user abcduser
-dev
Function: Enables development components: GraphQL Playground and Swagger UI.
-gops
Function: The Go server listens via github.com/google/gops/agent (for debugging).
-import-job <path-to-meta.json>:<path-to-data.json>, ...
Function: Imports one or more jobs from a comma-separated list of paths to meta.json and data.json.
Example: -import-job ./to-import/job1-meta.json:./to-import/job1-data.json,./to-import/job2-meta.json:./to-import/job2-data.json
-init
Function: Sets up the var directory. Initializes the sqlite database file, config.json, and the .env environment variable file.
-init-db
Function: Iterates the job-archive and re-initializes the ‘job’, ‘tag’, and ‘jobtag’ tables based on archived jobs.
Caution: All running jobs will be lost!
-jwt <username>
Function: Generates and prints a JWT for the user specified by username.
Example: -jwt abcduser
-logdate
Function: Set this flag to add date and time to log messages.
-loglevel <level>
Function: Sets the log level of the running ClusterCockpit instance. “debug” prints all levels, “crit” logs only critical messages.
Arguments: debug | info | warn | err | crit
Default: info
Example: -loglevel debug
-migrate-db
Function: Migrates the database to the latest supported version and exits.
-server
Function: Starts a server that continues listening on the configured port (default: :8080) after initialization and argument handling.
-sync-ldap
Function: Synchronizes the ‘user’ table with LDAP.
-version
Function: Shows version information and exits.
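The argument formats of -add-user and -import-job can be illustrated with a small Python sketch. The helper names parse_add_user and build_import_job_arg are hypothetical and not part of cc-backend; the sketch only mirrors the syntax documented on this page. Treating only the first two colons as separators, so that a password may contain colons, is an assumption.

```python
# Hypothetical helpers that mirror the documented argument syntax;
# they are not part of cc-backend itself.
ROLES = ("admin", "support", "manager", "api", "user")


def parse_add_user(spec):
    """Split '<username>:<role>:<password>' into its three parts."""
    # Assumption: only the first two colons are separators, so a
    # password may itself contain ':'.
    parts = spec.split(":", 2)
    if len(parts) != 3:
        raise ValueError("expected <username>:<role>:<password>")
    username, role, password = parts
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    return username, role, password


def build_import_job_arg(pairs):
    """Join (meta.json, data.json) path pairs into one -import-job value."""
    return ",".join(f"{meta}:{data}" for meta, data in pairs)


if __name__ == "__main__":
    print(parse_add_user("abcduser:manager:somepass"))
    print(build_import_job_arg([
        ("./to-import/job1-meta.json", "./to-import/job1-data.json"),
        ("./to-import/job2-meta.json", "./to-import/job2-data.json"),
    ]))
```

The second helper can be handy when a whole directory of job files needs to be turned into a single -import-job value.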
1.2 - Configuration ClusterCockpit Configuration Option References
CC-Backend requires a JSON configuration file that specifies the cluster systems to be used. The schema of the configuration is described in the schema documentation.
To override the default location (./config.json), specify the location of a JSON configuration file with the -config <file path> command line option.
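As a rough illustration, the following Python sketch writes a small configuration file and checks that the keys marked as required in this reference (jwts with max-age, and clusters with name, metricDataRepository, and filterRanges) are present. The helper names and all example values (cluster name, URL, port, the 24h token lifetime) are hypothetical placeholders; cc-backend performs its own validation against the JSON schema.

```python
import json
import os
import tempfile

# Example values are placeholders, not recommended settings.
MINIMAL_CONFIG = {
    "addr": "localhost:8080",
    "jwts": {"max-age": "24h"},
    "clusters": [
        {
            "name": "testcluster",
            "metricDataRepository": {
                "kind": "cc-metric-store",
                "url": "http://localhost:8082",
                "token": "",
            },
            "filterRanges": {
                "numNodes": {"from": 1, "to": 64},
                "duration": {"from": 0, "to": 86400},
                "startTime": {"from": "2022-01-01T00:00:00Z", "to": None},
            },
        }
    ],
}


def check_required(config):
    """Return a list of required keys that are missing from config."""
    missing = []
    if "max-age" not in config.get("jwts", {}):
        missing.append("jwts.max-age")
    clusters = config.get("clusters")
    if not clusters:
        missing.append("clusters")
    for i, cluster in enumerate(clusters or []):
        for key in ("name", "metricDataRepository", "filterRanges"):
            if key not in cluster:
                missing.append(f"clusters[{i}].{key}")
    return missing


def write_and_reload(config, path):
    """Write config as JSON to path and read it back."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "config.json")
    print(check_required(write_and_reload(MINIMAL_CONFIG, path)))
```

A file written this way could then be passed to cc-backend with the -config option.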
Configuration Options
addr: Type string. Address on which the HTTP (or HTTPS) server listens (for example: ‘localhost:80’). Default: :8080
apiAllowedIPs: Type string array. Addresses from which the secured API endpoints (/users and other auth-related endpoints) can be reached.
user: Type string. Drop root permissions once .env was read and the port was taken. Only applicable if using a privileged port.
group: Type string. Drop root permissions once .env was read and the port was taken. Only applicable if using a privileged port.
disable-authentication: Type bool. Disable authentication (for everything: API, Web-UI, …). Default: false
embed-static-files: Type bool. Whether all files in web/frontend/public should be served from within the binary itself (they are embedded) or not. Default: true
static-files: Type string. Folder where static assets can be found if embed-static-files is false. No default.
db-driver: Type string. ‘sqlite3’ or ‘mysql’ (mysql will work for mariadb as well). Default: sqlite3
db: Type string. For sqlite3 a filename, for mysql a DSN in this format: https://github.com/go-sql-driver/mysql#dsn-data-source-name (without query parameters!). Default: ./var/job.db
job-archive: Type object.
  kind: Type string. At the moment only file is supported as value.
  path: Type string. Path to the job-archive. Default: ./var/job-archive
  compression: Type integer. Set up automatic compression for jobs older than the given number of days.
  retention: Type object.
    policy: Type string (required). Retention policy. Possible values: none, delete, move.
    includeDB: Type boolean. Also remove jobs from the database.
    age: Type integer. Act on jobs with startTime older than age (in days).
    location: Type string. The target directory for retention. Only applicable for retention policy move.
disable-archive: Type bool. Keep all metric data in the metric data repositories, do not write to the job-archive. Default: false
validate: Type bool. Validate all input JSON documents against the JSON schema.
session-max-age: Type string. Specifies for how long a session shall be valid, as a string parsable by time.ParseDuration(). If 0 or empty, the session/token does not expire! Default: 168h
https-cert-file and https-key-file: Type string. If both options are not empty, use HTTPS with those certificates.
redirect-http-to: Type string. If not the empty string and addr does not end in “:80”, redirect every request incoming at port 80 to that URL.
machine-state-dir: Type string. Where to store MachineState files. TODO: Explain in more detail!
stop-jobs-exceeding-walltime: Type int. If not zero, automatically mark jobs as stopped that run X seconds longer than their walltime. Only applies if walltime is set for the job. Default: 0
short-running-jobs-duration: Type int. Do not show running jobs shorter than X seconds. Default: 300
jwts: Type object (required). For JWT authentication.
  max-age: Type string (required). Configures how long a token is valid, as a string parsable by time.ParseDuration().
  cookieName: Type string. Cookie that should be checked for a JWT token.
  validateUser: Type boolean. Deny login for users not in the database (but defined in the JWT). Overwrite roles in the JWT with database roles.
  trustedIssuer: Type string. Issuer that should be accepted when validating external JWTs.
  syncUserOnLogin: Type boolean. Add a non-existent user to the DB at login attempt with values provided in the JWT.
ldap: Type object. For LDAP authentication and user synchronisation. Default: nil
  url: Type string (required). URL of the LDAP directory server.
  user_base: Type string (required). Base DN of the user tree root.
  search_dn: Type string (required). DN for authenticating the LDAP admin account with general read rights.
  user_bind: Type string (required). Expression used to authenticate users via LDAP bind. Must contain uid={username}.
  user_filter: Type string (required). Filter to extract users for syncing.
  username_attr: Type string. Attribute with the full user name. Defaults to gecos if not provided.
  sync_interval: Type string. Interval used for syncing the local user table with the LDAP directory. Parsed using time.ParseDuration.
  sync_del_old_users: Type boolean. Delete obsolete users in the database.
  syncUserOnLogin: Type boolean. Add a non-existent user to the DB at login attempt if the user exists in the LDAP directory.
clusters: Type array of objects (required).
  name: Type string. The name of the cluster.
  metricDataRepository: Type object with properties: kind (Type string, can be one of cc-metric-store, influxdb), url (Type string), token (Type string).
  filterRanges: Type object. This option controls the slider ranges for the UI controls of numNodes, duration, and startTime. Example:
    "filterRanges": {
      "numNodes": { "from": 1, "to": 64 },
      "duration": { "from": 0, "to": 86400 },
      "startTime": { "from": "2022-01-01T00:00:00Z", "to": null }
    }
ui-defaults: Type object. Default configuration for UI views. If overwritten, all options must be provided! Most options can be overwritten by the user via the web interface.
  analysis_view_histogramMetrics: Type string array. Metrics to show as job count histograms in analysis view. Default: ["flops_any", "mem_bw", "mem_used"]
  analysis_view_scatterPlotMetrics: Type array of string array. Initial scatter plot configuration in analysis view. Default: [["flops_any", "mem_bw"], ["flops_any", "cpu_load"], ["cpu_load", "mem_bw"]]
  job_view_nodestats_selectedMetrics: Type string array. Initial metrics shown in node statistics table of single job view. Default: ["flops_any", "mem_bw", "mem_used"]
  job_view_polarPlotMetrics: Type string array. Metrics shown in polar plot of single job view. Default: ["flops_any", "mem_bw", "mem_used", "net_bw", "file_bw"]
  job_view_selectedMetrics: Type string array. Default: ["flops_any", "mem_bw", "mem_used"]
  plot_general_colorBackground: Type bool. Color plot background according to job average threshold limits. Default: true
  plot_general_colorscheme: Type string array. Initial color scheme. Default: ["#00bfff", "#0000ff", "#ff00ff", "#ff0000", "#ff8000", "#ffff00", "#80ff00"]
  plot_general_lineWidth: Type int. Initial line width. Default: 3
  plot_list_jobsPerPage: Type int. Jobs shown per page in job lists. Default: 50
  plot_list_selectedMetrics: Type string array. Initial metric plots shown in job lists. Default: ["cpu_load", "ipc", "mem_used", "flops_any", "mem_bw"]
  plot_view_plotsPerRow: Type int. Number of plots per row in single job view. Default: 3
  plot_view_showPolarplot: Type bool. Option to toggle polar plot in single job view. Default: true
  plot_view_showRoofline: Type bool. Option to toggle roofline plot in single job view. Default: true
  plot_view_showStatTable: Type bool. Option to toggle the node statistic table in single job view. Default: true
  system_view_selectedMetric: Type string. Initial metric shown in system view. Default: cpu_load
Some of the ui-defaults keys can be suffixed with :<clustername> in order to have different settings depending on the current cluster. Those are notably job_view_nodestats_selectedMetrics, job_view_polarPlotMetrics, job_view_selectedMetrics, and plot_list_selectedMetrics.
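The per-cluster suffix can be pictured with a short Python sketch. The lookup function resolve_ui_default and the cluster name alex are hypothetical; how cc-backend resolves these keys internally is not described here, so this only illustrates the documented <key>:<clustername> naming with a fallback to the plain key.

```python
# Keys named in the text above as accepting a ':<clustername>' suffix.
PER_CLUSTER_KEYS = {
    "job_view_nodestats_selectedMetrics",
    "job_view_polarPlotMetrics",
    "job_view_selectedMetrics",
    "plot_list_selectedMetrics",
}


def resolve_ui_default(ui_defaults, key, cluster=None):
    """Prefer '<key>:<cluster>' if present, else fall back to '<key>'."""
    if cluster is not None and key in PER_CLUSTER_KEYS:
        specific = f"{key}:{cluster}"
        if specific in ui_defaults:
            return ui_defaults[specific]
    return ui_defaults[key]


if __name__ == "__main__":
    ui_defaults = {
        "job_view_selectedMetrics": ["flops_any", "mem_bw", "mem_used"],
        # 'alex' is a made-up cluster name.
        "job_view_selectedMetrics:alex": ["flops_any", "mem_bw"],
    }
    print(resolve_ui_default(ui_defaults, "job_view_selectedMetrics", "alex"))
    print(resolve_ui_default(ui_defaults, "job_view_selectedMetrics", "other"))
```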
1.3 - Environment ClusterCockpit Environment Variables
All security-related configuration, e.g. keys and passwords, is set using environment variables. These can also be set by means of a .env file in the project root.
Environment Variables
An example env file is found in this directory. Copy it as .env into the project root and adapt it to your needs.
JWT_PUBLIC_KEY and JWT_PRIVATE_KEY: Base64-encoded Ed25519 keys used for JSON Web Token (JWT) authentication. You can generate your own keypair using go run ./cmd/gen-keypair/gen-keypair.go. For more information, see the JWT documentation.
SESSION_KEY: Some random bytes used as secret for cookie-based sessions.
LDAP_ADMIN_PASSWORD: The LDAP admin user password (optional).
CROSS_LOGIN_JWT_HS512_KEY: Used for token-based logins via another authentication service.
LOGLEVEL: Can be crit, err, warn, info, or debug. Can be used to reduce logging. Default: info
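A minimal Python sketch for preparing such a file follows. It generates a random value for SESSION_KEY and renders .env-style lines. The hex encoding, the length of 32 random bytes, the KEY="value" quoting, and the helper names are assumptions made for illustration. The JWT keys are left as placeholders because they should come from the gen-keypair tool mentioned above.

```python
import secrets

VALID_LOGLEVELS = ("crit", "err", "warn", "info", "debug")


def new_session_key(num_bytes=32):
    """Return random bytes as a hex string (the encoding is an assumption)."""
    return secrets.token_hex(num_bytes)


def render_env(values):
    """Render a dict as KEY="value" lines, one per variable."""
    level = values.get("LOGLEVEL", "info")
    if level not in VALID_LOGLEVELS:
        raise ValueError(f"invalid LOGLEVEL: {level}")
    return "\n".join(f'{key}="{val}"' for key, val in values.items()) + "\n"


if __name__ == "__main__":
    print(render_env({
        "JWT_PUBLIC_KEY": "<base64 public key from gen-keypair>",
        "JWT_PRIVATE_KEY": "<base64 private key from gen-keypair>",
        "SESSION_KEY": new_session_key(),
        "LOGLEVEL": "info",
    }), end="")
```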
1.4 - REST API ClusterCockpit RESTful API Endpoint Reference
Usage of Swagger UI
To use the Swagger UI for testing, you have to run an instance of cc-backend on localhost (using the default port 8080).
You may want to start the demo as described here.
This Swagger UI is also available as part of cc-backend if you start it with the -dev option:
./cc-backend -server -dev
You may access it at this URL.
Swagger API Reference
Non-Interactive Documentation: This reference is rendered using the swaggerui plugin based on the original definition file found in the ClusterCockpit repository, but without a serving backend. This means that all interactivity (“Try It Out”) will not return actual data. However, a Curl call and a compiled Request URL will still be displayed if an API endpoint is executed.
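For scripted access outside the Swagger UI, a request against a local cc-backend instance might be prepared along the following lines. This Python sketch only builds the request object and does not send it. The endpoint path /api/jobs/ and the use of an Authorization: Bearer header are assumptions that should be checked against the Swagger reference; the token would be one generated with the -jwt command line option.

```python
import urllib.request


def build_api_request(base_url, path, token):
    """Build (but do not send) a GET request carrying a JWT."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    request = urllib.request.Request(url, method="GET")
    # Assumption: the API accepts the JWT as a Bearer token.
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Accept", "application/json")
    return request


if __name__ == "__main__":
    # '/api/jobs/' is an assumed endpoint; '<token>' is a placeholder.
    req = build_api_request("http://localhost:8080", "/api/jobs/", "<token>")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```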
1.5 - Authentication Handbook How to configure and use the authentication backends
1.6 - Job Archive Handbook All you need to know about the ClusterCockpit Job Archive
1.7 - Schemas ClusterCockpit Schema References
ClusterCockpit Schema References for
Application Configuration
Cluster Configuration
Job Data
Job Statistics
Units
Job Archive Job Metadata
Job Archive Job Metricdata
The schemas in their raw form can be found in the ClusterCockpit GitHub repository.
Manual Updates: Changes to the original JSON schemas found in the repository are not automatically rendered in this reference documentation. The raw JSON schemas are parsed and rendered for better readability using the json-schema-for-humans utility.
Last Update: 02.02.2024
1.7.1 - Application Config Schema ClusterCockpit Application Config Schema Reference
A detailed description of each of the application configuration options can be found in the config documentation.
The following schema in its raw form can be found in the ClusterCockpit GitHub repository.
Manual Updates: Changes to the original JSON schema found in the repository are not automatically rendered in this reference documentation.
Last Update: 02.02.2024
Title: cc-backend configuration file schema
1. [Optional] Property cc-backend configuration file schema > addr
Description: Address where the http (or https) server will listen on (for example: ‘localhost:80’).
2. [Optional] Property cc-backend configuration file schema > user
Description: Drop root permissions once .env was read and the port was taken. Only applicable if using privileged port.
3. [Optional] Property cc-backend configuration file schema > group
Description: Drop root permissions once .env was read and the port was taken. Only applicable if using privileged port.
4. [Optional] Property cc-backend configuration file schema > disable-authentication
Description: Disable authentication (for everything: API, Web-UI, …).
5. [Optional] Property cc-backend configuration file schema > embed-static-files
Description: If all files in web/frontend/public should be served from within the binary itself (they are embedded) or not.
6. [Optional] Property cc-backend configuration file schema > static-files
Description: Folder where static assets can be found, if embed-static-files is false.
7. [Optional] Property cc-backend configuration file schema > db-driver
Type: enum (of string)
Required: No
Description: sqlite3 or mysql (mysql will work for mariadb as well).
Must be one of: “sqlite3” “mysql”
8. [Optional] Property cc-backend configuration file schema > db
Description: For sqlite3 a filename, for mysql a DSN in this format: https://github.com/go-sql-driver/mysql#dsn-data-source-name (Without query parameters!).
9. [Optional] Property cc-backend configuration file schema > job-archive
Description: Configuration keys for job-archive
9.1. [Required] Property cc-backend configuration file schema > job-archive > kind
Type: enum (of string)
Required: Yes
Description: Backend type for job-archive
Must be one of:
9.2. [Optional] Property cc-backend configuration file schema > job-archive > path
Description: Path to job archive for file backend
9.3. [Optional] Property cc-backend configuration file schema > job-archive > compression
Description: Setup automatic compression for jobs older than number of days
9.4. [Optional] Property cc-backend configuration file schema > job-archive > retention
Description: Configuration keys for retention
9.4.1. [Required] Property cc-backend configuration file schema > job-archive > retention > policy
Type: enum (of string)
Required: Yes
Description: Retention policy
Must be one of: “none” “delete” “move”
9.4.2. [Optional] Property cc-backend configuration file schema > job-archive > retention > includeDB
Description: Also remove jobs from database
9.4.3. [Optional] Property cc-backend configuration file schema > job-archive > retention > age
Description: Act on jobs with startTime older than age (in days)
9.4.4. [Optional] Property cc-backend configuration file schema > job-archive > retention > location
Description: The target directory for retention. Only applicable for retention move.
10. [Optional] Property cc-backend configuration file schema > disable-archive
Description: Keep all metric data in the metric data repositories, do not write to the job-archive.
11. [Optional] Property cc-backend configuration file schema > validate
Description: Validate all input json documents against json schema.
12. [Optional] Property cc-backend configuration file schema > session-max-age
Description: Specifies for how long a session shall be valid as a string parsable by time.ParseDuration(). If 0 or empty, the session/token does not expire!
13. [Optional] Property cc-backend configuration file schema > https-cert-file
Description: Filepath to SSL certificate. If also https-key-file is set use HTTPS using those certificates.
14. [Optional] Property cc-backend configuration file schema > https-key-file
Description: Filepath to SSL key file. If also https-cert-file is set use HTTPS using those certificates.
15. [Optional] Property cc-backend configuration file schema > redirect-http-to
Description: If not the empty string and addr does not end in :80, redirect every request incoming at port 80 to that url.
16. [Optional] Property cc-backend configuration file schema > stop-jobs-exceeding-walltime
Description: If not zero, automatically mark jobs as stopped running X seconds longer than their walltime. Only applies if walltime is set for job.
17. [Optional] Property cc-backend configuration file schema > short-running-jobs-duration
Description: Do not show running jobs shorter than X seconds.
18. [Required] Property cc-backend configuration file schema > jwts
Description: For JWT token authentication.
18.1. [Required] Property cc-backend configuration file schema > jwts > max-age
Description: Configure how long a token is valid. As string parsable by time.ParseDuration()
18.2. [Optional] Property cc-backend configuration file schema > jwts > cookieName
Description: Cookie that should be checked for a JWT token.
18.3. [Optional] Property cc-backend configuration file schema > jwts > validateUser
Description: Deny login for users not in database (but defined in JWT). Overwrite roles in JWT with database roles.
18.4. [Optional] Property cc-backend configuration file schema > jwts > trustedIssuer
Description: Issuer that should be accepted when validating external JWTs
18.5. [Optional] Property cc-backend configuration file schema > jwts > syncUserOnLogin
Description: Add non-existent user to DB at login attempt with values provided in JWT.
19. [Optional] Property cc-backend configuration file schema > ldap
Description: For LDAP Authentication and user synchronisation.
19.1. [Required] Property cc-backend configuration file schema > ldap > url
Description: URL of LDAP directory server.
19.2. [Required] Property cc-backend configuration file schema > ldap > user_base
Description: Base DN of user tree root.
19.3. [Required] Property cc-backend configuration file schema > ldap > search_dn
Description: DN for authenticating LDAP admin account with general read rights.
19.4. [Required] Property cc-backend configuration file schema > ldap > user_bind
Description: Expression used to authenticate users via LDAP bind. Must contain uid={username}.
19.5. [Required] Property cc-backend configuration file schema > ldap > user_filter
Description: Filter to extract users for syncing.
19.6. [Optional] Property cc-backend configuration file schema > ldap > username_attr
Description: Attribute with full username. Default: gecos
19.7. [Optional] Property cc-backend configuration file schema > ldap > sync_interval
Description: Interval used for syncing local user table with LDAP directory. Parsed using time.ParseDuration.
19.8. [Optional] Property cc-backend configuration file schema > ldap > sync_del_old_users
Description: Delete obsolete users in database.
19.9. [Optional] Property cc-backend configuration file schema > ldap > syncUserOnLogin
Description: Add non-existent user to DB at login attempt if user exists in LDAP directory
20. [Required] Property cc-backend configuration file schema > clusters
Type: array of object
Required: Yes
Description: Configuration for the clusters to be displayed.
Array restrictions: Min items N/A, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
20.1. cc-backend configuration file schema > clusters > clusters items
20.1.1. [Required] Property cc-backend configuration file schema > clusters > clusters items > name
Description: The name of the cluster.
20.1.2. [Required] Property cc-backend configuration file schema > clusters > clusters items > metricDataRepository
Description: Type of the metric data repository for this cluster
20.1.2.1. [Required] Property cc-backend configuration file schema > clusters > clusters items > metricDataRepository > kind
Type: enum (of string)
Required: Yes
Must be one of: “influxdb” “prometheus” “cc-metric-store” “test”
20.1.2.2. [Required] Property cc-backend configuration file schema > clusters > clusters items > metricDataRepository > url
20.1.2.3. [Optional] Property cc-backend configuration file schema > clusters > clusters items > metricDataRepository > token
20.1.3. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges
Description: This option controls the slider ranges for the UI controls of numNodes, duration, and startTime.
20.1.3.1. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges > numNodes
Description: UI slider range for number of nodes
20.1.3.1.1. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges > numNodes > from
20.1.3.1.2. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges > numNodes > to
20.1.3.2. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges > duration
Description: UI slider range for duration
20.1.3.2.1. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges > duration > from
20.1.3.2.2. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges > duration > to
20.1.3.3. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges > startTime
Description: UI slider range for start time
20.1.3.3.1. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges > startTime > from
Type: string
Required: Yes
Format: date-time
20.1.3.3.2. [Required] Property cc-backend configuration file schema > clusters > clusters items > filterRanges > startTime > to
21. [Optional] Property cc-backend configuration file schema > ui-defaults
Description: Default configuration for web UI
21.1. [Required] Property cc-backend configuration file schema > ui-defaults > plot_general_colorBackground
Description: Color plot background according to job average threshold limits
21.2. [Required] Property cc-backend configuration file schema > ui-defaults > plot_general_lineWidth
Description: Initial linewidth
21.3. [Required] Property cc-backend configuration file schema > ui-defaults > plot_list_jobsPerPage
Description: Jobs shown per page in job lists
21.4. [Required] Property cc-backend configuration file schema > ui-defaults > plot_view_plotsPerRow
Description: Number of plots per row in single job view
21.5. [Required] Property cc-backend configuration file schema > ui-defaults > plot_view_showPolarplot
Description: Option to toggle polar plot in single job view
21.6. [Required] Property cc-backend configuration file schema > ui-defaults > plot_view_showRoofline
Description: Option to toggle roofline plot in single job view
21.7. [Required] Property cc-backend configuration file schema > ui-defaults > plot_view_showStatTable
Description: Option to toggle the node statistic table in single job view
21.8. [Required] Property cc-backend configuration file schema > ui-defaults > system_view_selectedMetric
Description: Initial metric shown in system view
21.9. [Required] Property cc-backend configuration file schema > ui-defaults > analysis_view_histogramMetrics
Type: array of string
Required: Yes
Description: Metrics to show as job count histograms in analysis view
Array restrictions: Min items N/A, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
21.9.1. cc-backend configuration file schema > ui-defaults > analysis_view_histogramMetrics > analysis_view_histogramMetrics items
21.10. [Required] Property cc-backend configuration file schema > ui-defaults > analysis_view_scatterPlotMetrics
Type: array of array
Required: Yes
Description: Initial scatter plot configuration in analysis view
Array restrictions: Min items N/A, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
21.10.1. cc-backend configuration file schema > ui-defaults > analysis_view_scatterPlotMetrics > analysis_view_scatterPlotMetrics items
Type: array of string
Required: No
Array restrictions: Min items 1, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
21.10.1.1. cc-backend configuration file schema > ui-defaults > analysis_view_scatterPlotMetrics > analysis_view_scatterPlotMetrics items > analysis_view_scatterPlotMetrics items items
21.11. [Required] Property cc-backend configuration file schema > ui-defaults > job_view_nodestats_selectedMetrics
Type: array of string
Required: Yes
Description: Initial metrics shown in node statistics table of single job view
Array restrictions: Min items N/A, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
21.11.1. cc-backend configuration file schema > ui-defaults > job_view_nodestats_selectedMetrics > job_view_nodestats_selectedMetrics items
21.12. [Required] Property cc-backend configuration file schema > ui-defaults > job_view_polarPlotMetrics
Type: array of string
Required: Yes
Description: Metrics shown in polar plot of single job view
Array restrictions: Min items N/A, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
21.12.1. cc-backend configuration file schema > ui-defaults > job_view_polarPlotMetrics > job_view_polarPlotMetrics items
21.13. [Required] Property cc-backend configuration file schema > ui-defaults > job_view_selectedMetrics
Type: array of string
Required: Yes
Array restrictions: Min items N/A, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
21.13.1. cc-backend configuration file schema > ui-defaults > job_view_selectedMetrics > job_view_selectedMetrics items
21.14. [Required] Property cc-backend configuration file schema > ui-defaults > plot_general_colorscheme
Type: array of string
Required: Yes
Description: Initial color scheme
Array restrictions: Min items N/A, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
21.14.1. cc-backend configuration file schema > ui-defaults > plot_general_colorscheme > plot_general_colorscheme items
21.15. [Required] Property cc-backend configuration file schema > ui-defaults > plot_list_selectedMetrics
Type: array of string
Required: Yes
Description: Initial metric plots shown in jobs lists
Array restrictions: Min items N/A, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
21.15.1. cc-backend configuration file schema > ui-defaults > plot_list_selectedMetrics > plot_list_selectedMetrics items
Generated using json-schema-for-humans on 2024-02-02 at 14:36:54 +0100
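Because every ui-defaults property is marked as required in the schema above, a partial ui-defaults object would presumably not validate. The following Python sketch lists missing keys before the file is handed to cc-backend. The function name is hypothetical, and the check covers key presence only, not value types.

```python
# The fifteen ui-defaults properties listed as [Required] in the schema.
REQUIRED_UI_DEFAULTS = (
    "plot_general_colorBackground",
    "plot_general_lineWidth",
    "plot_list_jobsPerPage",
    "plot_view_plotsPerRow",
    "plot_view_showPolarplot",
    "plot_view_showRoofline",
    "plot_view_showStatTable",
    "system_view_selectedMetric",
    "analysis_view_histogramMetrics",
    "analysis_view_scatterPlotMetrics",
    "job_view_nodestats_selectedMetrics",
    "job_view_polarPlotMetrics",
    "job_view_selectedMetrics",
    "plot_general_colorscheme",
    "plot_list_selectedMetrics",
)


def missing_ui_defaults(config):
    """Return required ui-defaults keys that are absent.

    An absent 'ui-defaults' object is fine, since it is optional.
    """
    if "ui-defaults" not in config:
        return []
    present = config["ui-defaults"]
    return [key for key in REQUIRED_UI_DEFAULTS if key not in present]


if __name__ == "__main__":
    partial = {"ui-defaults": {"plot_general_lineWidth": 3}}
    for key in missing_ui_defaults(partial):
        print("missing:", key)
```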
1.7.2 - Cluster Schema ClusterCockpit Cluster Schema Reference
The following schema in its raw form can be found in the ClusterCockpit GitHub repository.
Manual Updates: Changes to the original JSON schema found in the repository are not automatically rendered in this reference documentation.
Last Update: 02.02.2024
Title: HPC cluster description
Description: Meta data information of an HPC cluster
1. [Required] Property HPC cluster description > name
Description: The unique identifier of a cluster
2. [Required] Property HPC cluster description > metricConfig
Type: array of object
Required: Yes
Description: Metric specifications
Array restrictions: Min items 1, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
2.1. HPC cluster description > metricConfig > metricConfig items
2.1.1. [Required] Property HPC cluster description > metricConfig > metricConfig items > name
Description: Metric name
2.1.2. [Required] Property HPC cluster description > metricConfig > metricConfig items > unit
Description: Metric unit
2.1.2.1. [Required] Property HPC cluster description > metricConfig > metricConfig items > unit > base
Type: enum (of string)
Required: Yes
Description: Metric base unit
Must be one of: “B” “F” “B/s” “F/s” “CPI” “IPC” “Hz” “W” “°C” ""
2.1.2.2. [Optional] Property HPC cluster description > metricConfig > metricConfig items > unit > prefix
Type: enum (of string)
Required: No
Description: Unit prefix
Must be one of:
2.1.3. [Required] Property HPC cluster description > metricConfig > metricConfig items > scope
Description: Native measurement resolution
2.1.4. [Required] Property HPC cluster description > metricConfig > metricConfig items > timestep
Description: Frequency of timeseries points
2.1.5. [Required] Property HPC cluster description > metricConfig > metricConfig items > aggregation
Type: enum (of string)
Required: Yes
Description: How the metric is aggregated
Must be one of:
2.1.6. [Required] Property HPC cluster description > metricConfig > metricConfig items > peak
Description: Metric peak threshold (Upper metric limit)
2.1.7. [Required] Property HPC cluster description > metricConfig > metricConfig items > normal
Description: Metric normal threshold
2.1.8. [Required] Property HPC cluster description > metricConfig > metricConfig items > caution
Description: Metric caution threshold (Suspicious but does not require immediate action)
2.1.9. [Required] Property HPC cluster description > metricConfig > metricConfig items > alert
Description: Metric alert threshold (Requires immediate action)
2.1.10. [Optional] Property HPC cluster description > metricConfig > metricConfig items > subClusters
Type: array of object
Required: No
Description: Array of cluster hardware partition metric thresholds
Array restrictions: Min items N/A, Max items N/A, Items unicity False, Additional items False, Tuple validation See below
2.1.10.1. HPC cluster description > metricConfig > metricConfig items > subClusters > subClusters items
2.1.10.1.1. [Required] Property HPC cluster description > metricConfig > metricConfig items > subClusters > subClusters items > name
Description: Hardware partition name
2.1.10.1.2. [Optional] Property HPC cluster description > metricConfig > metricConfig items > subClusters > subClusters items > peak
2.1.10.1.3. [Optional] Property HPC cluster description > metricConfig > metricConfig items > subClusters > subClusters items > normal
2.1.10.1.4. [Optional] Property HPC cluster description > metricConfig > metricConfig items > subClusters > subClusters items > caution
2.1.10.1.5. [Optional] Property HPC cluster description > metricConfig > metricConfig items > subClusters > subClusters items > alert
2.1.10.1.6. [Optional] Property HPC cluster description > metricConfig > metricConfig items > subClusters > subClusters items > remove
3. [Required] Property HPC cluster description > subClusters
Type: array of object
Required Yes
Description: Array of cluster hardware partitions
Array restrictions Min items 1 Max items N/A Items unicity False Additional items False Tuple validation See below
3.1. HPC cluster description > subClusters > subClusters items
3.1.1. [Required] Property HPC cluster description > subClusters > subClusters items > name
Description: Hardware partition name
3.1.2. [Required] Property HPC cluster description > subClusters > subClusters items > processorType
Description: Processor type
3.1.3. [Required] Property HPC cluster description > subClusters > subClusters items > socketsPerNode
Description: Number of sockets per node
3.1.4. [Required] Property HPC cluster description > subClusters > subClusters items > coresPerSocket
Description: Number of cores per socket
3.1.5. [Required] Property HPC cluster description > subClusters > subClusters items > threadsPerCore
Description: Number of SMT threads per core
3.1.6. [Required] Property HPC cluster description > subClusters > subClusters items > flopRateScalar
Description: Theoretical node peak flop rate for scalar code in GFlops/s
3.1.6.1. [Optional] Property HPC cluster description > subClusters > subClusters items > flopRateScalar > unit
Description: Metric unit
3.1.6.2. [Optional] Property HPC cluster description > subClusters > subClusters items > flopRateScalar > value
3.1.7. [Required] Property HPC cluster description > subClusters > subClusters items > flopRateSimd
Description: Theoretical node peak flop rate for SIMD code in GFlops/s
3.1.7.1. [Optional] Property HPC cluster description > subClusters > subClusters items > flopRateSimd > unit
Description: Metric unit
3.1.7.2. [Optional] Property HPC cluster description > subClusters > subClusters items > flopRateSimd > value
3.1.8. [Required] Property HPC cluster description > subClusters > subClusters items > memoryBandwidth
Description: Theoretical node peak memory bandwidth in GB/s
3.1.8.1. [Optional] Property HPC cluster description > subClusters > subClusters items > memoryBandwidth > unit
Description: Metric unit
3.1.8.2. [Optional] Property HPC cluster description > subClusters > subClusters items > memoryBandwidth > value
3.1.9. [Required] Property HPC cluster description > subClusters > subClusters items > nodes
Description: Node list expression
3.1.10. [Required] Property HPC cluster description > subClusters > subClusters items > topology
Description: Node topology
3.1.10.1. [Required] Property HPC cluster description > subClusters > subClusters items > topology > node
Type array of integer
Required Yes
Description: HwThread lists of node
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description node items -
3.1.10.1.1. HPC cluster description > subClusters > subClusters items > topology > node > node items
3.1.10.2. [Required] Property HPC cluster description > subClusters > subClusters items > topology > socket
Type array of array
Required Yes
Description: HwThread lists of sockets
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
3.1.10.2.1. HPC cluster description > subClusters > subClusters items > topology > socket > socket items
Type array of integer
Required No
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
3.1.10.2.1.1. HPC cluster description > subClusters > subClusters items > topology > socket > socket items > socket items items
3.1.10.3. [Required] Property HPC cluster description > subClusters > subClusters items > topology > memoryDomain
Type array of array
Required Yes
Description: HwThread lists of memory domains
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
3.1.10.3.1. HPC cluster description > subClusters > subClusters items > topology > memoryDomain > memoryDomain items
Type array of integer
Required No
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
3.1.10.3.1.1. HPC cluster description > subClusters > subClusters items > topology > memoryDomain > memoryDomain items > memoryDomain items items
3.1.10.4. [Optional] Property HPC cluster description > subClusters > subClusters items > topology > die
Type array of array
Required No
Description: HwThread lists of dies
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description die items -
3.1.10.4.1. HPC cluster description > subClusters > subClusters items > topology > die > die items
Type array of integer
Required No
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
3.1.10.4.1.1. HPC cluster description > subClusters > subClusters items > topology > die > die items > die items items
3.1.10.5. [Optional] Property HPC cluster description > subClusters > subClusters items > topology > core
Type array of array
Required No
Description: HwThread lists of cores
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description core items -
3.1.10.5.1. HPC cluster description > subClusters > subClusters items > topology > core > core items
Type array of integer
Required No
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
3.1.10.5.1.1. HPC cluster description > subClusters > subClusters items > topology > core > core items > core items items
3.1.10.6. [Optional] Property HPC cluster description > subClusters > subClusters items > topology > accelerators
Type array of object
Required No
Description: List of accelerator devices
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
3.1.10.6.1. HPC cluster description > subClusters > subClusters items > topology > accelerators > accelerators items
3.1.10.6.1.1. [Required] Property HPC cluster description > subClusters > subClusters items > topology > accelerators > accelerators items > id
Description: The unique device id
3.1.10.6.1.2. [Required] Property HPC cluster description > subClusters > subClusters items > topology > accelerators > accelerators items > type
Type enum (of string)
Required Yes
Description: The accelerator type
Must be one of:
“Nvidia GPU” “AMD GPU” “Intel GPU”
3.1.10.6.1.3. [Required] Property HPC cluster description > subClusters > subClusters items > topology > accelerators > accelerators items > model
Description: The accelerator model
Generated using json-schema-for-humans on 2024-02-02 at 14:36:54 +0100
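As an illustration of the subClusters layout described above, the following minimal Python sketch builds one hypothetical subClusters item and checks it against the required properties listed in this reference. All values, the helper name check_sub_cluster, and the node list syntax are assumptions made for the example. The hwthread count check is also an assumption, since the schema itself does not state that the node list must match the product of sockets, cores and threads.

```python
# Required keys as listed in sections 3.1.1 to 3.1.10 and 3.1.10.1 to 3.1.10.3.
REQUIRED_SUBCLUSTER_KEYS = {
    "name", "processorType", "socketsPerNode", "coresPerSocket",
    "threadsPerCore", "flopRateScalar", "flopRateSimd",
    "memoryBandwidth", "nodes", "topology",
}
REQUIRED_TOPOLOGY_KEYS = {"node", "socket", "memoryDomain"}
ACCELERATOR_TYPES = {"Nvidia GPU", "AMD GPU", "Intel GPU"}


def check_sub_cluster(sc):
    """Return a list of problems found in one subClusters item."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_SUBCLUSTER_KEYS - sc.keys())]
    topo = sc.get("topology", {})
    problems += [f"missing topology key: {k}" for k in sorted(REQUIRED_TOPOLOGY_KEYS - topo.keys())]
    for acc in topo.get("accelerators", []):
        if acc.get("type") not in ACCELERATOR_TYPES:
            problems.append(f"unknown accelerator type: {acc.get('type')}")
    # Assumption (not stated by the schema): the node list enumerates every
    # hardware thread, so its length should match the product below.
    if not problems:
        expected = sc["socketsPerNode"] * sc["coresPerSocket"] * sc["threadsPerCore"]
        if len(topo["node"]) != expected:
            problems.append(f"expected {expected} hwthreads, found {len(topo['node'])}")
    return problems


# Hypothetical sub-cluster: 2 sockets x 2 cores x 1 thread. All values are made up.
sub_cluster = {
    "name": "example",
    "processorType": "Example CPU",
    "socketsPerNode": 2,
    "coresPerSocket": 2,
    "threadsPerCore": 1,
    "flopRateScalar": {"value": 10.0},
    "flopRateSimd": {"value": 80.0},
    "memoryBandwidth": {"value": 100.0},
    "nodes": "node[01-04]",  # node list syntax is assumed here
    "topology": {
        "node": [0, 1, 2, 3],
        "socket": [[0, 1], [2, 3]],
        "memoryDomain": [[0, 1], [2, 3]],
        "core": [[0], [1], [2], [3]],
        "accelerators": [
            {"id": "0000:00:00.0", "type": "Nvidia GPU", "model": "Example GPU"},
        ],
    },
}

print(check_sub_cluster(sub_cluster))
```

For real deployments, validating against the raw JSON schema from the repository is the more reliable route; the sketch only mirrors the properties shown on this page.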
1.7.3 - Job Data Schema ClusterCockpit Job Data Schema Reference
The following schema in its raw form can be found in the ClusterCockpit GitHub repository.
Manual Updates
Changes to the original JSON schema found in the repository are not automatically rendered in this reference documentation.
Last Update: 02.02.2024
Job metric data list
Title: Job metric data list
Description: Collection of metric data of a HPC job
1. [Required] Property Job metric data list > mem_used
Description: Memory capacity used
1.1. [Required] Property Job metric data list > mem_used > node
Type object
Required Yes Additional properties [Any type: allowed] Defined in job-metric-data.schema.json
Description: Metric data of a HPC job
1.1.1. [Required] Property Job metric data list > mem_used > node > unit
Description: Metric unit
1.1.1.1. [Required] Property Job metric data list > mem_used > node > unit > base
Type enum (of string)
Required Yes
Description: Metric base unit
Must be one of:
“B” “F” “B/s” “F/s” “CPI” “IPC” “Hz” “W” “°C” ""
1.1.1.2. [Optional] Property Job metric data list > mem_used > node > unit > prefix
Type enum (of string)
Required No
Description: Unit prefix
Must be one of:
1.1.2. [Required] Property Job metric data list > mem_used > node > timestep
Description: Measurement interval in seconds
1.1.3. [Optional] Property Job metric data list > mem_used > node > thresholds
Description: Metric thresholds for specific system
1.1.3.1. [Optional] Property Job metric data list > mem_used > node > thresholds > peak
1.1.3.2. [Optional] Property Job metric data list > mem_used > node > thresholds > normal
1.1.3.3. [Optional] Property Job metric data list > mem_used > node > thresholds > caution
1.1.3.4. [Optional] Property Job metric data list > mem_used > node > thresholds > alert
1.1.4. [Optional] Property Job metric data list > mem_used > node > statisticsSeries
Description: Statistics series across topology
1.1.4.1. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > min
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description min items -
1.1.4.1.1. Job metric data list > mem_used > node > statisticsSeries > min > min items
1.1.4.2. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > max
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description max items -
1.1.4.2.1. Job metric data list > mem_used > node > statisticsSeries > max > max items
1.1.4.3. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > mean
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description mean items -
1.1.4.3.1. Job metric data list > mem_used > node > statisticsSeries > mean > mean items
1.1.4.4. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles
1.1.4.4.1. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 10
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 10 items -
1.1.4.4.1.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 10 > 10 items
1.1.4.4.2. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 20
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 20 items -
1.1.4.4.2.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 20 > 20 items
1.1.4.4.3. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 30
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 30 items -
1.1.4.4.3.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 30 > 30 items
1.1.4.4.4. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 40
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 40 items -
1.1.4.4.4.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 40 > 40 items
1.1.4.4.5. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 50
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 50 items -
1.1.4.4.5.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 50 > 50 items
1.1.4.4.6. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 60
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 60 items -
1.1.4.4.6.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 60 > 60 items
1.1.4.4.7. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 70
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 70 items -
1.1.4.4.7.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 70 > 70 items
1.1.4.4.8. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 80
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 80 items -
1.1.4.4.8.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 80 > 80 items
1.1.4.4.9. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 90
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 90 items -
1.1.4.4.9.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 90 > 90 items
1.1.4.4.10. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 25
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 25 items -
1.1.4.4.10.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 25 > 25 items
1.1.4.4.11. [Optional] Property Job metric data list > mem_used > node > statisticsSeries > percentiles > 75
Type array of number
Required No
Array restrictions Min items 3 Max items N/A Items unicity False Additional items False Tuple validation See below
Each item of this array must be Description 75 items -
1.1.4.4.11.1. Job metric data list > mem_used > node > statisticsSeries > percentiles > 75 > 75 items
1.1.5. [Required] Property Job metric data list > mem_used > node > series
Type array of object
Required Yes
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
1.1.5.1. Job metric data list > mem_used > node > series > series items
1.1.5.1.1. [Required] Property Job metric data list > mem_used > node > series > series items > hostname
1.1.5.1.2. [Optional] Property Job metric data list > mem_used > node > series > series items > id
1.1.5.1.3. [Required] Property Job metric data list > mem_used > node > series > series items > statistics
Description: Statistics across time dimension
1.1.5.1.3.1. [Required] Property Job metric data list > mem_used > node > series > series items > statistics > avg
Description: Series average
1.1.5.1.3.2. [Required] Property Job metric data list > mem_used > node > series > series items > statistics > min
Description: Series minimum
1.1.5.1.3.3. [Required] Property Job metric data list > mem_used > node > series > series items > statistics > max
Description: Series maximum
1.1.5.1.4. [Required] Property Job metric data list > mem_used > node > series > series items > data
Array restrictions Min items 1 Max items N/A Items unicity False Additional items False Tuple validation See below
1.1.5.1.4.1. At least one of the items must be
2. [Required] Property Job metric data list > flops_any
Description: Total flop rate with DP flops scaled up
2.1. [Optional] Property Job metric data list > flops_any > node
Description: Metric data of a HPC job
2.2. [Optional] Property Job metric data list > flops_any > socket
Description: Metric data of a HPC job
2.3. [Optional] Property Job metric data list > flops_any > memoryDomain
Description: Metric data of a HPC job
2.4. [Optional] Property Job metric data list > flops_any > core
Description: Metric data of a HPC job
2.5. [Optional] Property Job metric data list > flops_any > hwthread
Description: Metric data of a HPC job
3. [Required] Property Job metric data list > mem_bw
Description: Main memory bandwidth
3.1. [Optional] Property Job metric data list > mem_bw > node
Description: Metric data of a HPC job
3.2. [Optional] Property Job metric data list > mem_bw > socket
Description: Metric data of a HPC job
3.3. [Optional] Property Job metric data list > mem_bw > memoryDomain
Description: Metric data of a HPC job
4. [Required] Property Job metric data list > net_bw
Description: Total fast interconnect network bandwidth
4.1. [Required] Property Job metric data list > net_bw > node
Description: Metric data of a HPC job
5. [Optional] Property Job metric data list > ipc
Description: Instructions executed per cycle
5.1. [Optional] Property Job metric data list > ipc > node
Description: Metric data of a HPC job
5.2. [Optional] Property Job metric data list > ipc > socket
Description: Metric data of a HPC job
5.3. [Optional] Property Job metric data list > ipc > memoryDomain
Description: Metric data of a HPC job
5.4. [Optional] Property Job metric data list > ipc > core
Description: Metric data of a HPC job
5.5. [Optional] Property Job metric data list > ipc > hwthread
Description: Metric data of a HPC job
6. [Required] Property Job metric data list > cpu_user
Description: CPU user active core utilization
6.1. [Optional] Property Job metric data list > cpu_user > node
Description: Metric data of a HPC job
6.2. [Optional] Property Job metric data list > cpu_user > socket
Description: Metric data of a HPC job
6.3. [Optional] Property Job metric data list > cpu_user > memoryDomain
Description: Metric data of a HPC job
6.4. [Optional] Property Job metric data list > cpu_user > core
Description: Metric data of a HPC job
6.5. [Optional] Property Job metric data list > cpu_user > hwthread
Description: Metric data of a HPC job
7. [Required] Property Job metric data list > cpu_load
Description: CPU requested core utilization (load 1m)
7.1. [Required] Property Job metric data list > cpu_load > node
Description: Metric data of a HPC job
8. [Optional] Property Job metric data list > flops_dp
Description: Double precision flop rate
8.1. [Optional] Property Job metric data list > flops_dp > node
Description: Metric data of a HPC job
8.2. [Optional] Property Job metric data list > flops_dp > socket
Description: Metric data of a HPC job
8.3. [Optional] Property Job metric data list > flops_dp > memoryDomain
Description: Metric data of a HPC job
8.4. [Optional] Property Job metric data list > flops_dp > core
Description: Metric data of a HPC job
8.5. [Optional] Property Job metric data list > flops_dp > hwthread
Description: Metric data of a HPC job
9. [Optional] Property Job metric data list > flops_sp
Description: Single precision flops rate
9.1. [Optional] Property Job metric data list > flops_sp > node
Description: Metric data of a HPC job
9.2. [Optional] Property Job metric data list > flops_sp > socket
Description: Metric data of a HPC job
9.3. [Optional] Property Job metric data list > flops_sp > memoryDomain
Description: Metric data of a HPC job
9.4. [Optional] Property Job metric data list > flops_sp > core
Description: Metric data of a HPC job
9.5. [Optional] Property Job metric data list > flops_sp > hwthread
Description: Metric data of a HPC job
10. [Optional] Property Job metric data list > vectorization_ratio
Description: Fraction of arithmetic instructions using SIMD instructions
10.1. [Optional] Property Job metric data list > vectorization_ratio > node
Description: Metric data of a HPC job
10.2. [Optional] Property Job metric data list > vectorization_ratio > socket
Description: Metric data of a HPC job
10.3. [Optional] Property Job metric data list > vectorization_ratio > memoryDomain
Description: Metric data of a HPC job
10.4. [Optional] Property Job metric data list > vectorization_ratio > core
Description: Metric data of a HPC job
10.5. [Optional] Property Job metric data list > vectorization_ratio > hwthread
Description: Metric data of a HPC job
11. [Optional] Property Job metric data list > cpu_power
Description: CPU power consumption
11.1. [Optional] Property Job metric data list > cpu_power > node
Description: Metric data of a HPC job
11.2. [Optional] Property Job metric data list > cpu_power > socket
Description: Metric data of a HPC job
12. [Optional] Property Job metric data list > mem_power
Description: Memory power consumption
12.1. [Optional] Property Job metric data list > mem_power > node
Description: Metric data of a HPC job
12.2. [Optional] Property Job metric data list > mem_power > socket
Description: Metric data of a HPC job
13. [Optional] Property Job metric data list > acc_utilization
Description: GPU utilization
13.1. [Required] Property Job metric data list > acc_utilization > accelerator
Description: Metric data of a HPC job
14. [Optional] Property Job metric data list > acc_mem_used
Description: GPU memory capacity used
14.1. [Required] Property Job metric data list > acc_mem_used > accelerator
Description: Metric data of a HPC job
15. [Optional] Property Job metric data list > acc_power
Description: GPU power consumption
15.1. [Required] Property Job metric data list > acc_power > accelerator
Description: Metric data of a HPC job
16. [Optional] Property Job metric data list > clock
Description: Average core frequency
16.1. [Optional] Property Job metric data list > clock > node
Description: Metric data of a HPC job
16.2. [Optional] Property Job metric data list > clock > socket
Description: Metric data of a HPC job
16.3. [Optional] Property Job metric data list > clock > memoryDomain
Description: Metric data of a HPC job
16.4. [Optional] Property Job metric data list > clock > core
Description: Metric data of a HPC job
16.5. [Optional] Property Job metric data list > clock > hwthread
Description: Metric data of a HPC job
17. [Optional] Property Job metric data list > eth_read_bw
Description: Ethernet read bandwidth
17.1. [Required] Property Job metric data list > eth_read_bw > node
Description: Metric data of a HPC job
18. [Optional] Property Job metric data list > eth_write_bw
Description: Ethernet write bandwidth
18.1. [Required] Property Job metric data list > eth_write_bw > node
Description: Metric data of a HPC job
19. [Required] Property Job metric data list > filesystems
Type array of object
Required Yes
Description: Array of filesystems
Array restrictions Min items 1 Max items N/A Items unicity False Additional items False Tuple validation See below
19.1. Job metric data list > filesystems > filesystems items
19.1.1. [Required] Property Job metric data list > filesystems > filesystems items > name
19.1.2. [Required] Property Job metric data list > filesystems > filesystems items > type
Type enum (of string)
Required Yes
Must be one of:
“nfs” “lustre” “gpfs” “nvme” “ssd” “hdd” “beegfs”
19.1.3. [Required] Property Job metric data list > filesystems > filesystems items > read_bw
Description: File system read bandwidth
19.1.3.1. [Required] Property Job metric data list > filesystems > filesystems items > read_bw > node
Description: Metric data of a HPC job
19.1.4. [Required] Property Job metric data list > filesystems > filesystems items > write_bw
Description: File system write bandwidth
19.1.4.1. [Required] Property Job metric data list > filesystems > filesystems items > write_bw > node
Description: Metric data of a HPC job
19.1.5. [Optional] Property Job metric data list > filesystems > filesystems items > read_req
Description: File system read requests
19.1.5.1. [Required] Property Job metric data list > filesystems > filesystems items > read_req > node
Description: Metric data of a HPC job
19.1.6. [Optional] Property Job metric data list > filesystems > filesystems items > write_req
Description: File system write requests
19.1.6.1. [Required] Property Job metric data list > filesystems > filesystems items > write_req > node
Description: Metric data of a HPC job
19.1.7. [Optional] Property Job metric data list > filesystems > filesystems items > inodes
Description: File system write requests
19.1.7.1. [Required] Property Job metric data list > filesystems > filesystems items > inodes > node
Description: Metric data of a HPC job
19.1.8. [Optional] Property Job metric data list > filesystems > filesystems items > accesses
Description: File system open and close
19.1.8.1. [Required] Property Job metric data list > filesystems > filesystems items > accesses > node
Description: Metric data of a HPC job
19.1.9. [Optional] Property Job metric data list > filesystems > filesystems items > fsync
Description: File system fsync
19.1.9.1. [Required] Property Job metric data list > filesystems > filesystems items > fsync > node
Description: Metric data of a HPC job
19.1.10. [Optional] Property Job metric data list > filesystems > filesystems items > create
Description: File system create
19.1.10.1. [Required] Property Job metric data list > filesystems > filesystems items > create > node
Description: Metric data of a HPC job
19.1.11. [Optional] Property Job metric data list > filesystems > filesystems items > open
Description: File system open
19.1.11.1. [Required] Property Job metric data list > filesystems > filesystems items > open > node
Description: Metric data of a HPC job
19.1.12. [Optional] Property Job metric data list > filesystems > filesystems items > close
Description: File system close
19.1.12.1. [Required] Property Job metric data list > filesystems > filesystems items > close > node
Description: Metric data of a HPC job
19.1.13. [Optional] Property Job metric data list > filesystems > filesystems items > seek
Description: File system seek
19.1.13.1. [Required] Property Job metric data list > filesystems > filesystems items > seek > node
Description: Metric data of a HPC job
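To make the per-scope metric layout above more concrete, here is a minimal Python sketch that assembles a hypothetical mem_used > node object with unit, timestep, series and statisticsSeries. The hostnames, sample values and helper names (series_statistics, statistics_series) are assumptions for illustration; how cc-backend actually derives these values is not covered by this reference.

```python
from statistics import mean


def series_statistics(data):
    """Statistics across the time dimension of a single series."""
    return {"avg": mean(data), "min": min(data), "max": max(data)}


def statistics_series(series):
    """Per-timestep min, max and mean across all series of one scope."""
    # zip(*...) turns the per-host rows into per-timestep columns.
    columns = list(zip(*(s["data"] for s in series)))
    return {
        "min": [min(c) for c in columns],
        "max": [max(c) for c in columns],
        "mean": [mean(c) for c in columns],
    }


# Hypothetical samples for two nodes, three timesteps each.
node_data = {
    "node01": [1.0, 2.0, 3.0],
    "node02": [3.0, 4.0, 5.0],
}

mem_used_node = {
    "unit": {"base": "B"},  # the optional prefix is left out
    "timestep": 60,         # measurement interval in seconds
    "series": [
        {"hostname": host, "statistics": series_statistics(data), "data": data}
        for host, data in node_data.items()
    ],
}
mem_used_node["statisticsSeries"] = statistics_series(mem_used_node["series"])

print(mem_used_node["statisticsSeries"])
```

Each statisticsSeries array in the sketch has three entries, which matches the minimum of 3 items listed in the array restrictions above.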
1.7.4 - Job Statistics Schema ClusterCockpit Job Statistics Schema Reference
The following schema in its raw form can be found in the ClusterCockpit GitHub repository.
Manual Updates
Changes to the original JSON schema found in the repository are not automatically rendered in this reference documentation.
Last Update: 02.02.2024
Job statistics
Title: Job statistics
Description: Format specification for job metric statistics
1. [Required] Property Job statistics > unit
Description: Metric unit
1.1. [Required] Property Job statistics > unit > base
Type enum (of string)
Required Yes
Description: Metric base unit
Must be one of:
“B” “F” “B/s” “F/s” “CPI” “IPC” “Hz” “W” “°C” ""
1.2. [Optional] Property Job statistics > unit > prefix
Type enum (of string)
Required No
Description: Unit prefix
Must be one of:
2. [Required] Property Job statistics > avg
Description: Job metric average
3. [Required] Property Job statistics > min
Description: Job metric minimum
4. [Required] Property Job statistics > max
Description: Job metric maximum
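The four required properties can be illustrated with a short Python sketch that builds a job statistics object from a flat list of samples and rejects base units outside the enum listed above. The function name make_job_statistics and the sample values are hypothetical, and the optional prefix is omitted.

```python
# Base units exactly as enumerated for "unit > base" in this reference.
BASE_UNITS = {"B", "F", "B/s", "F/s", "CPI", "IPC", "Hz", "W", "°C", ""}


def make_job_statistics(values, base):
    """Build a dict with the required unit, avg, min and max properties."""
    if base not in BASE_UNITS:
        raise ValueError(f"unsupported base unit: {base!r}")
    if not values:
        raise ValueError("at least one sample is needed")
    return {
        "unit": {"base": base},
        "avg": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }


stats = make_job_statistics([2.0, 4.0, 6.0], "B/s")
print(stats)
```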
1.7.5 - Unit Schema ClusterCockpit Unit Schema Reference
The following schema in its raw form can be found in the ClusterCockpit GitHub repository.
Manual Updates
Changes to the original JSON schema found in the repository are not automatically rendered in this reference documentation.
Last Update: 02.02.2024
Metric unit
Title: Metric unit
Description: Format specification for job metric units
1. [Required] Property Metric unit > base
Type enum (of string)
Required Yes
Description: Metric base unit
Must be one of:
“B” “F” “B/s” “F/s” “CPI” “IPC” “Hz” “W” “°C” ""
2. [Optional] Property Metric unit > prefix
Type enum (of string)
Required No
Description: Unit prefix
Must be one of:
1.7.6 - Job Archive Metadata Schema ClusterCockpit Job Archive Metadata Schema Reference
The following schema in its raw form can be found in the ClusterCockpit GitHub repository.
Manual Updates
Changes to the original JSON schema found in the repository are not automatically rendered in this reference documentation.
Last Update: 02.02.2024
Title: Job meta data
Description: Meta data information of a HPC job
1. [Required] Property Job meta data > jobId
Description: The unique identifier of a job
2. [Required] Property Job meta data > user
Description: The unique identifier of a user
3. [Required] Property Job meta data > project
Description: The unique identifier of a project
4. [Required] Property Job meta data > cluster
Description: The unique identifier of a cluster
5. [Required] Property Job meta data > subCluster
Description: The unique identifier of a sub cluster
6. [Optional] Property Job meta data > partition
Description: The Slurm partition to which the job was submitted
7. [Optional] Property Job meta data > arrayJobId
Description: The unique identifier of an array job
8. [Required] Property Job meta data > numNodes
Description: Number of nodes used
9. [Optional] Property Job meta data > numHwthreads
Description: Number of HWThreads used
10. [Optional] Property Job meta data > numAcc
Description: Number of accelerators used
11. [Required] Property Job meta data > exclusive
Description: Specifies how nodes are shared. 0 - Shared among multiple jobs of multiple users, 1 - Job exclusive, 2 - Shared among multiple jobs of same user
Restrictions Minimum ≥ 0 Maximum ≤ 2
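The three allowed values of exclusive map naturally onto an enumeration. The sketch below is one possible way to model them in Python; the class and member names are assumptions and do not come from the schema.

```python
from enum import IntEnum


class Exclusive(IntEnum):
    """Node sharing modes as described for the exclusive property."""
    SHARED = 0            # shared among multiple jobs of multiple users
    EXCLUSIVE = 1         # job exclusive
    SHARED_SAME_USER = 2  # shared among multiple jobs of the same user


def parse_exclusive(value):
    """Convert an integer from meta.json; raises ValueError outside 0..2."""
    return Exclusive(value)


print(parse_exclusive(1).name)
```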
12. [Optional] Property Job meta data > monitoringStatus
Description: State of monitoring system during job run
13. [Optional] Property Job meta data > smt
Description: SMT threads used by job
14. [Optional] Property Job meta data > walltime
Description: Requested walltime of job in seconds
15. [Required] Property Job meta data > jobState
Type enum (of string)
Required Yes
Description: Final state of job
Must be one of:
“completed” “failed” “cancelled” “stopped” “out_of_memory” “timeout”
16. [Required] Property Job meta data > startTime
Description: Start epoch time stamp in seconds
17. [Required] Property Job meta data > duration
Description: Duration of job in seconds
18. [Required] Property Job meta data > resources
Type array of object
Required Yes
Description: Resources used by job
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
18.1.1. [Required] Property Job meta data > resources > resources items > hostname
18.1.2. [Optional] Property Job meta data > resources > resources items > hwthreads
Type array of integer
Required No
Description: List of OS processor ids
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
18.1.3. [Optional] Property Job meta data > resources > resources items > accelerators
Type array of string
Required No
Description: List of accelerator device ids
Array restrictions Min items N/A Max items N/A Items unicity False Additional items False Tuple validation See below
18.1.4. [Optional] Property Job meta data > resources > resources items > configuration
Description: The configuration options of the node
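A resources array could look like the hypothetical Python list below. The sketch also derives node, hwthread and accelerator counts from it. That these counts correspond to numNodes, numHwthreads and numAcc is an assumption based on the property descriptions, not something the schema enforces, and the hostnames and device id are made up.

```python
# Hypothetical resources array for a two-node job.
resources = [
    {"hostname": "node01", "hwthreads": [0, 1, 2, 3], "accelerators": ["0000:00:00.0"]},
    {"hostname": "node02", "hwthreads": [0, 1, 2, 3]},  # optional keys may be absent
]


def summarize_resources(resources):
    """Count distinct hosts, hwthreads and accelerators in a resources array."""
    return {
        "numNodes": len({r["hostname"] for r in resources}),
        "numHwthreads": sum(len(r.get("hwthreads", [])) for r in resources),
        "numAcc": sum(len(r.get("accelerators", [])) for r in resources),
    }


print(summarize_resources(resources))
```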
19. [Optional] Property Job meta data > metaData
Description: Additional information about the job
19.1. [Optional] Property Job meta data > metaData > jobScript
Description: The batch script of the job
19.2. [Optional] Property Job meta data > metaData > jobName
Description: Slurm Job name
19.3. [Optional] Property Job meta data > metaData > slurmInfo
Description: Additional Slurm information as shown by scontrol show job
20. [Optional] Property Job meta data > tags
Type array of object
Required No
Description: List of tags
Array restrictions Min items N/A Max items N/A Items unicity True Additional items False Tuple validation See below
Each item of this array must be Description tags items -
20.1.1. [Required] Property Job meta data > tags > tags items > name
20.1.2. [Required] Property Job meta data > tags > tags items > type
21. [Required] Property Job meta data > statistics
Description: Job statistic data
21.1. [Required] Property Job meta data > statistics > mem_used
Type object
Required Yes Additional properties [Any type: allowed] Defined in job-metric-statistics.schema.json
Description: Memory capacity used (required)
21.1.1. [Required] Property Job meta data > statistics > mem_used > unit
Description: Metric unit
21.1.1.1. [Required] Property Job meta data > statistics > mem_used > unit > base
Type enum (of string)
Required Yes
Description: Metric base unit
Must be one of:
“B” “F” “B/s” “F/s” “CPI” “IPC” “Hz” “W” “°C” ""
21.1.1.2. [Optional] Property Job meta data > statistics > mem_used > unit > prefix
Type enum (of string)
Required No
Description: Unit prefix
Must be one of:
21.1.2. [Required] Property Job meta data > statistics > mem_used > avgDescription: Job metric average
21.1.3. [Required] Property Job meta data > statistics > mem_used > minDescription: Job metric minimum
21.1.4. [Required] Property Job meta data > statistics > mem_used > maxDescription: Job metric maximum
21.2. [Required] Property Job meta data > statistics > cpu_load
Description: CPU requested core utilization (load 1m) (required)
21.3. [Required] Property Job meta data > statistics > flops_any
Description: Total flop rate with DP flops scaled up (required)
21.4. [Required] Property Job meta data > statistics > mem_bw
Description: Main memory bandwidth (required)
21.5. [Optional] Property Job meta data > statistics > net_bw
Description: Total fast interconnect network bandwidth (required)
21.6. [Optional] Property Job meta data > statistics > file_bw
Description: Total file IO bandwidth (required)
21.7. [Optional] Property Job meta data > statistics > ipc
Description: Instructions executed per cycle
21.8. [Required] Property Job meta data > statistics > cpu_user
Description: CPU user active core utilization
21.9. [Optional] Property Job meta data > statistics > flops_dp
Description: Double precision flop rate
21.10. [Optional] Property Job meta data > statistics > flops_sp
Description: Single precision flops rate
21.11. [Optional] Property Job meta data > statistics > rapl_power
Description: CPU power consumption
21.12. [Optional] Property Job meta data > statistics > acc_used
Description: GPU utilization
21.13. [Optional] Property Job meta data > statistics > acc_mem_used
Description: GPU memory capacity used
21.14. [Optional] Property Job meta data > statistics > acc_power
Description: GPU power consumption
21.15. [Optional] Property Job meta data > statistics > clock
Description: Average core frequency
21.16. [Optional] Property Job meta data > statistics > eth_read_bw
Description: Ethernet read bandwidth
21.17. [Optional] Property Job meta data > statistics > eth_write_bw
Description: Ethernet write bandwidth
21.18. [Optional] Property Job meta data > statistics > ic_rcv_packets
Description: Network interconnect read packets
21.19. [Optional] Property Job meta data > statistics > ic_send_packets
Description: Network interconnect send packet
21.20. [Optional] Property Job meta data > statistics > ic_read_bw
Description: Network interconnect read bandwidth
21.21. [Optional] Property Job meta data > statistics > ic_write_bw
Description: Network interconnect write bandwidth
21.22. [Optional] Property Job meta data > statistics > filesystems
Type: array of object
Required: No
Description: Array of filesystems
Array restrictions: Min items: 1; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
21.22.1.1. [Required] Property Job meta data > statistics > filesystems > filesystems items > name
21.22.1.2. [Required] Property Job meta data > statistics > filesystems > filesystems items > type
Type: enum (of string)
Required: Yes
Must be one of: "nfs", "lustre", "gpfs", "nvme", "ssd", "hdd", "beegfs"
21.22.1.3. [Required] Property Job meta data > statistics > filesystems > filesystems items > read_bw
Description: File system read bandwidth
21.22.1.4. [Required] Property Job meta data > statistics > filesystems > filesystems items > write_bw
Description: File system write bandwidth
21.22.1.5. [Optional] Property Job meta data > statistics > filesystems > filesystems items > read_req
Description: File system read requests
21.22.1.6. [Optional] Property Job meta data > statistics > filesystems > filesystems items > write_req
Description: File system write requests
21.22.1.7. [Optional] Property Job meta data > statistics > filesystems > filesystems items > inodes
Description: File system write requests
21.22.1.8. [Optional] Property Job meta data > statistics > filesystems > filesystems items > accesses
Description: File system open and close
21.22.1.9. [Optional] Property Job meta data > statistics > filesystems > filesystems items > fsync
Description: File system fsync
21.22.1.10. [Optional] Property Job meta data > statistics > filesystems > filesystems items > create
Description: File system create
21.22.1.11. [Optional] Property Job meta data > statistics > filesystems > filesystems items > open
Description: File system open
21.22.1.12. [Optional] Property Job meta data > statistics > filesystems > filesystems items > close
Description: File system close
21.22.1.13. [Optional] Property Job meta data > statistics > filesystems > filesystems items > seek
Description: File system seek
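The required parts of the statistics object listed above can be checked with a few lines of code. The following is a minimal sketch, not the validation used by cc-backend: the function name missing_statistics_fields and all sample values are made up for illustration, and it assumes that every required metric entry has the same shape as the one spelled out for mem_used (unit with base, plus avg, min and max).

```python
# Statistics entries marked [Required] in the listing above.
REQUIRED_METRICS = ("mem_used", "cpu_load", "flops_any", "mem_bw", "cpu_user")
# Fields marked [Required] for mem_used; assumed to apply to every metric entry.
REQUIRED_FIELDS = ("unit", "avg", "min", "max")


def missing_statistics_fields(statistics):
    """Return a sorted list of dotted paths for required fields that are absent."""
    missing = []
    for metric in REQUIRED_METRICS:
        entry = statistics.get(metric)
        if entry is None:
            missing.append(metric)
            continue
        for field in REQUIRED_FIELDS:
            if field not in entry:
                missing.append(f"{metric}.{field}")
        # The unit object itself requires a base unit.
        if "unit" in entry and "base" not in entry["unit"]:
            missing.append(f"{metric}.unit.base")
    return sorted(missing)


# Hypothetical sample values, for illustration only.
example_statistics = {
    "mem_used": {"unit": {"base": "B"}, "avg": 12.5e9, "min": 3.1e9, "max": 20.4e9},
    "cpu_load": {"unit": {"base": ""}, "avg": 63.2, "min": 1.0, "max": 72.0},
    "flops_any": {"unit": {"base": "F/s"}, "avg": 4.1e11, "min": 0.0, "max": 8.9e11},
    "mem_bw": {"unit": {"base": "B/s"}, "avg": 5.5e10, "min": 2.0e8, "max": 1.2e11},
    "cpu_user": {"unit": {"base": ""}, "avg": 95.0, "min": 10.0, "max": 100.0},
}

print(missing_statistics_fields(example_statistics))
```

For a complete check, validating against the raw JSON schema from the repository is the more reliable route; the sketch only mirrors the required markers shown in this reference.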
Generated using json-schema-for-humans on 2024-02-02 at 14:36:54 +0100
1.7.7 - Job Archive Metrics Data Schema ClusterCockpit Job Archive Metrics Data Schema Reference
The following schema in its raw form can be found in the ClusterCockpit GitHub repository.
Manual Updates
Changes to the original JSON schema found in the repository are not automatically rendered in this reference documentation.
Last Update: 02.02.2024
Job metric data
Title: Job metric data
Description: Metric data of a HPC job
1. [Required] Property Job metric data > unit
Description: Metric unit
1.1. [Required] Property Job metric data > unit > base
Type: enum (of string)
Required: Yes
Description: Metric base unit
Must be one of: "B", "F", "B/s", "F/s", "CPI", "IPC", "Hz", "W", "°C", ""
1.2. [Optional] Property Job metric data > unit > prefix
Type: enum (of string)
Required: No
Description: Unit prefix
Must be one of:
2. [Required] Property Job metric data > timestep
Description: Measurement interval in seconds
3. [Optional] Property Job metric data > thresholds
Description: Metric thresholds for specific system
3.1. [Optional] Property Job metric data > thresholds > peak
3.2. [Optional] Property Job metric data > thresholds > normal
3.3. [Optional] Property Job metric data > thresholds > caution
3.4. [Optional] Property Job metric data > thresholds > alert
4. [Optional] Property Job metric data > statisticsSeries
Description: Statistics series across topology
4.1. [Optional] Property Job metric data > statisticsSeries > min
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: min items (Description: -)
4.1.1. Job metric data > statisticsSeries > min > min items
4.2. [Optional] Property Job metric data > statisticsSeries > max
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: max items (Description: -)
4.2.1. Job metric data > statisticsSeries > max > max items
4.3. [Optional] Property Job metric data > statisticsSeries > mean
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: mean items (Description: -)
4.3.1. Job metric data > statisticsSeries > mean > mean items
4.4. [Optional] Property Job metric data > statisticsSeries > percentiles
4.4.1. [Optional] Property Job metric data > statisticsSeries > percentiles > 10
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 10 items (Description: -)
4.4.1.1. Job metric data > statisticsSeries > percentiles > 10 > 10 items
4.4.2. [Optional] Property Job metric data > statisticsSeries > percentiles > 20
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 20 items (Description: -)
4.4.2.1. Job metric data > statisticsSeries > percentiles > 20 > 20 items
4.4.3. [Optional] Property Job metric data > statisticsSeries > percentiles > 30
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 30 items (Description: -)
4.4.3.1. Job metric data > statisticsSeries > percentiles > 30 > 30 items
4.4.4. [Optional] Property Job metric data > statisticsSeries > percentiles > 40
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 40 items (Description: -)
4.4.4.1. Job metric data > statisticsSeries > percentiles > 40 > 40 items
4.4.5. [Optional] Property Job metric data > statisticsSeries > percentiles > 50
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 50 items (Description: -)
4.4.5.1. Job metric data > statisticsSeries > percentiles > 50 > 50 items
4.4.6. [Optional] Property Job metric data > statisticsSeries > percentiles > 60
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 60 items (Description: -)
4.4.6.1. Job metric data > statisticsSeries > percentiles > 60 > 60 items
4.4.7. [Optional] Property Job metric data > statisticsSeries > percentiles > 70
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 70 items (Description: -)
4.4.7.1. Job metric data > statisticsSeries > percentiles > 70 > 70 items
4.4.8. [Optional] Property Job metric data > statisticsSeries > percentiles > 80
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 80 items (Description: -)
4.4.8.1. Job metric data > statisticsSeries > percentiles > 80 > 80 items
4.4.9. [Optional] Property Job metric data > statisticsSeries > percentiles > 90
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 90 items (Description: -)
4.4.9.1. Job metric data > statisticsSeries > percentiles > 90 > 90 items
4.4.10. [Optional] Property Job metric data > statisticsSeries > percentiles > 25
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 25 items (Description: -)
4.4.10.1. Job metric data > statisticsSeries > percentiles > 25 > 25 items
4.4.11. [Optional] Property Job metric data > statisticsSeries > percentiles > 75
Type: array of number
Required: No
Array restrictions: Min items: 3; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
Each item of this array must be: 75 items (Description: -)
4.4.11.1. Job metric data > statisticsSeries > percentiles > 75 > 75 items
5. [Required] Property Job metric data > series
Type: array of object
Required: Yes
Array restrictions: Min items: N/A; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
5.1. Job metric data > series > series items
5.1.1. [Required] Property Job metric data > series > series items > hostname
5.1.2. [Optional] Property Job metric data > series > series items > id
5.1.3. [Required] Property Job metric data > series > series items > statistics
Description: Statistics across time dimension
5.1.3.1. [Required] Property Job metric data > series > series items > statistics > avg
Description: Series average
5.1.3.2. [Required] Property Job metric data > series > series items > statistics > min
Description: Series minimum
5.1.3.3. [Required] Property Job metric data > series > series items > statistics > max
Description: Series maximum
5.1.4. [Required] Property Job metric data > series > series items > data
Array restrictions: Min items: 1; Max items: N/A; Items unicity: False; Additional items: False; Tuple validation: See below
5.1.4.1. At least one of the items must be
Generated using json-schema-for-humans on 2024-02-02 at 14:36:54 +0100
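To make the relationship between series and statisticsSeries in the job metric data schema more concrete, the following sketch builds a small metric document. It is an illustration only: the hostnames and values are invented, the helper names are hypothetical, and it assumes that "statistics across time dimension" means one avg/min/max per series, while "statistics series across topology" means one min/max/mean value per timestep computed over all series.

```python
from statistics import mean


def series_statistics(data):
    """Reduce one series over time to its avg, min and max."""
    return {"avg": mean(data), "min": min(data), "max": max(data)}


def build_statistics_series(series):
    """Compute per-timestep min, max and mean across all series items."""
    # zip(*...) regroups the values so that each column holds one timestep.
    columns = list(zip(*(item["data"] for item in series)))
    return {
        "min": [min(col) for col in columns],
        "max": [max(col) for col in columns],
        "mean": [mean(col) for col in columns],
    }


# Hypothetical measurements for two nodes, three timesteps each.
hosts = {"node01": [1.0, 2.0, 3.0], "node02": [3.0, 4.0, 5.0]}

job_metric = {
    "unit": {"base": "F/s"},
    "timestep": 60,
    "series": [
        {"hostname": host, "statistics": series_statistics(data), "data": data}
        for host, data in hosts.items()
    ],
}
job_metric["statisticsSeries"] = build_statistics_series(job_metric["series"])

print(job_metric["series"][0]["statistics"])
print(job_metric["statisticsSeries"])
```

The sample uses three timesteps because the statisticsSeries arrays are listed with a minimum of three items.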
2 - Metric Store ClusterCockpit Metric Store References
Reference information regarding the ClusterCockpit component “cc-metric-store” (GitHub Repo).
2.1 - Command Line ClusterCockpit Metric Store Command Line Options
This page describes the command line options for the cc-metric-store executable.
Function: Specifies alternative path to application configuration file.
Default: ./config.json
Example: -config ./configfiles/configuration.json
Function: Go server listens via github.com/google/gops/agent (for debugging).
2.2 - Configuration ClusterCockpit Metric Store Configuration Option References
All durations are specified as strings with a unit suffix (allowed suffixes: s, m, h, …).
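As a rough illustration of the duration format, the sketch below converts such strings into seconds. It is not the parser used by cc-metric-store: the function parse_duration is hypothetical, it handles only the three suffixes named above, and the sample values are invented. The keys in example_config mirror option names described in this section.

```python
import re

# Seconds per suffix; only the suffixes named above are handled in this sketch.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
_PART = re.compile(r"(\d+(?:\.\d+)?)([smh])")


def parse_duration(text):
    """Convert a string such as '48h' or '1h30m' into seconds."""
    pos = 0
    total = 0.0
    for match in _PART.finditer(text):
        if match.start() != pos:
            # Unrecognized characters between two number/suffix pairs.
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        # Empty input, missing suffix, or trailing characters.
        raise ValueError(f"invalid duration: {text!r}")
    return total


# Hypothetical configuration fragment; values are for illustration only.
example_config = {
    "retention-on-memory": "48h",
    "checkpoints": {"interval": "12h", "restore": "48h"},
    "archive": {"interval": "168h"},
}

print(parse_duration(example_config["retention-on-memory"]))
print(parse_duration(example_config["checkpoints"]["interval"]))
print(parse_duration("1h30m"))
```

Rejecting strings without a suffix is a deliberate choice in this sketch, so that a bare number is never silently interpreted as seconds.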
- metrics: Map of metric-name to objects with the following properties
  - frequency: Timestep/Interval/Resolution of this metric
  - aggregation: Can be "sum", "avg" or null
    - null means aggregation across nodes is forbidden for this metric
    - "sum" means that values from the child levels are summed up for the parent level
    - "avg" means that values from the child levels are averaged for the parent level
  - scope: Unused at the moment, should be something like "node", "socket" or "hwthread"
- nats:
  - address: URL of NATS.io server, example: "nats://localhost:4222"
  - username and password: Optional, if provided use those for the connection
  - subscriptions:
    - subscribe-to: Where to expect the measurements to be published
    - cluster-tag: Default value for the cluster tag
- http-api:
  - address: Address to bind to, for example 0.0.0.0:8080
  - https-cert-file and https-key-file: Optional, if provided enable HTTPS using those files as certificate/key
- jwt-public-key: Base64 encoded string, use this to verify requests to the HTTP API
- retention-on-memory: Keep all values in memory for at least that amount of time
- checkpoints:
  - interval: Do checkpoints every X seconds/minutes/hours
  - directory: Path to a directory
  - restore: After a restart, load the last X seconds/minutes/hours of data back into memory
- archive:
  - interval: Move and compress all checkpoints not needed anymore every X seconds/minutes/hours
  - directory: Path to a directory
2.3 - REST API ClusterCockpit Metric Store RESTful API Endpoint description
Open API Reference
Non-Interactive Documentation
This reference is rendered using the redoc plugin based on the original definition file found in the ClusterCockpit Metric Store repository, but without a serving backend. This means that all interactivity (“Try It Out”) will not return actual data. However, a Curl call and a compiled Request URL will still be displayed if an API endpoint is executed.
3 - Metric Collector ClusterCockpit Metric Collector References
Reference information regarding the ClusterCockpit component “cc-metric-collector” is currently documented only in the GitHub repository.
Quick References
Topic | Link | Note
Overview | Link | Overview and example usage scenario
Metric Collector Configuration | Link | Configure Metric Collector
Active Collector Configuration | Link | Configure which available collectors are used
Active Receiver Configuration | Link | Configure which available receivers are used
Active Sink Configuration | Link | Configure which available sinks are used