System metrics with Prometheus
Anaconda Server system performance can be monitored to understand system health, evaluate network traffic, and detect issues. Each Anaconda Server service exposes a set of metrics that can be visualized using the built-in Prometheus expression browser. Metrics are provided in OpenMetrics (Prometheus) format.
From the dashboard, open the user dropdown menu and select Metrics to open Prometheus in a new tab.
Alternatively, you can navigate directly to the dashboard in your web browser by appending /prometheus/ to your Anaconda Server fully qualified domain name (FQDN). For example: https://<FQDN>.com/prometheus/
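As a minimal sketch, the same metrics could also be requested programmatically rather than through the browser. This assumes the standard Prometheus HTTP API (/api/v1/query) is served under the same /prometheus/ prefix on your instance and that your session is allowed to reach it; neither is confirmed by this page. The hostname anaconda.example.com and the helper name prometheus_query_url are hypothetical.

```python
from urllib.parse import urlencode


def prometheus_query_url(fqdn: str, expr: str) -> str:
    """Build an instant-query URL for a Prometheus expression.

    Assumption: the Prometheus HTTP API is exposed under the same
    /prometheus/ prefix as the expression browser.
    """
    base = f"https://{fqdn}/prometheus"
    # urlencode escapes spaces and operators in the expression.
    return f"{base}/api/v1/query?{urlencode({'query': expr})}"


# Hypothetical hostname, used only for illustration.
print(prometheus_query_url("anaconda.example.com", "up"))
```

The helper only builds the URL; sending the request and handling authentication depend on how your deployment is secured.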
Creating graphs for metrics
Prometheus uses a built-in expression browser for time series visualizations of system metrics.
To create system metric graphs in Prometheus:
Enter the name of an expression you want to view in the search box.
Select your expression from the list that appears.
Select the Graph tab.
The graph is populated by the selected metric, and a console readout appears below it.
To isolate a specific resource, select it from the legend below the graph.
The up metric tells you whether your instance is running: it is 1 when the most recent scrape of the service succeeded and 0 when it failed.
The process_open_fds metric counts the file descriptors the process currently has open. This tells you how many regular files, sockets, pseudo terminals, and so on are open at the moment.
The process_max_fds metric reads /proc/<PID>/limits and uses the Soft Limit from the Max Open Files row.
/proc/<PID>/limits lists both soft and hard limits. The soft limit is the value the kernel enforces for the corresponding resource, while the hard limit acts as the ceiling for the soft limit.
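To make the soft and hard limit distinction concrete, here is a minimal sketch that extracts both values from the text of a limits file. The sample text is illustrative rather than taken from a real Anaconda Server process, and the function name parse_max_open_files is hypothetical. The sketch assumes the usual Linux layout, in which the row label is followed by the soft limit and then the hard limit.

```python
from typing import Optional, Tuple


def _to_limit(value: str) -> Optional[int]:
    """Convert a limits-file value to an int, or None for 'unlimited'."""
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (soft, hard) from the 'Max open files' row of a limits file."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # After the label, the columns are: soft limit, hard limit, units.
            fields = line[len(label):].split()
            return _to_limit(fields[0]), _to_limit(fields[1])
    raise ValueError("no 'Max open files' row found")


# Illustrative sample only; real values vary per process.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            1024                 4096                 files
"""

soft, hard = parse_max_open_files(SAMPLE_LIMITS)
print(soft, hard)
```

On a Linux host, the same function could be given the contents of /proc/self/limits to inspect the current process.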
Setting a file limit alert
Using the two metrics above, process_open_fds and process_max_fds, you can write an alert to warn you when a process stays above 80% of its limit for 10 minutes:
groups:
- name: example
  rules:
  - alert: ProcessNearFDLimits
    expr: process_open_fds / process_max_fds > 0.8
    for: 10m
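The role of the for clause can be illustrated with a simplified sketch: the alert fires only once the ratio has stayed above the threshold for the whole hold period, and any dip below it restarts the clock. This is not how Prometheus is implemented (it evaluates rules at its own interval and tracks a pending state); the function name alert_is_firing and the sample values are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# (unix_time_seconds, open_fds, max_fds)
Sample = Tuple[float, float, float]


def alert_is_firing(
    samples: Iterable[Sample],
    threshold: float = 0.8,
    hold_seconds: float = 600.0,
) -> bool:
    """Return True if open/max exceeded `threshold` continuously for at
    least `hold_seconds`, as of the last sample."""
    breach_started: Optional[float] = None
    firing = False
    for ts, open_fds, max_fds in samples:
        if max_fds > 0 and open_fds / max_fds > threshold:
            if breach_started is None:
                breach_started = ts  # start of the current breach
            firing = ts - breach_started >= hold_seconds
        else:
            # Condition no longer holds: reset the hold timer.
            breach_started = None
            firing = False
    return firing


# One sample per minute for ten minutes, always at about 88% of the limit.
steady = [(t, 900, 1024) for t in range(0, 601, 60)]
# Same series, but usage drops briefly at the five-minute mark.
dipped = [(t, 100 if t == 300 else 900, 1024) for t in range(0, 601, 60)]

print(alert_is_firing(steady), alert_is_firing(dipped))
```

The brief dip in the second series resets the timer, which is why a for duration helps suppress alerts on short spikes.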