Monitoring

Overview

The lab uses a self-hosted monitoring stack to track CPU, GPU, memory, disk, network, and per-process resource usage across all lab servers. Metrics are visualised in Grafana, which is available at https://grafana.lab.pyarelal.xyz. Log in with your lab account by clicking the "Sign in with Kanidm" button.

What is monitored

Monitored hosts

Host        GPU monitoring
orca        Yes (NVIDIA)
kraken      Yes (NVIDIA)
leviathan   Yes (NVIDIA)
starfish    No
eel         No

Using the dashboard

After logging in, open the "Infrastructure Overview" dashboard. Use the Host dropdown at the top to switch between servers. The time range selector in the top right controls how far back in time the graphs extend.
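The host and time range can usually also be preselected through the dashboard link itself, since Grafana reads template variables (`var-<name>`) and relative time ranges (`from`, `to`) from URL query parameters. The following is a minimal sketch rather than a confirmed recipe for this dashboard: the dashboard UID and slug are hypothetical placeholders, and it assumes the variable behind the Host dropdown is actually named `Host`, which may not be the case.

```python
from urllib.parse import urlencode

GRAFANA_BASE = "https://grafana.lab.pyarelal.xyz"

# Hypothetical placeholders: the real UID and slug are not documented here.
# Copy them from the address bar after opening the dashboard.
DASHBOARD_UID = "REPLACE_WITH_UID"
DASHBOARD_SLUG = "infrastructure-overview"


def dashboard_url(host: str, lookback: str = "6h") -> str:
    """Build a link that preselects a host and a relative time range."""
    params = {
        "var-Host": host,           # assumes the dropdown variable is named "Host"
        "from": f"now-{lookback}",  # relative start, e.g. now-24h
        "to": "now",                # end of the range is the current time
    }
    return f"{GRAFANA_BASE}/d/{DASHBOARD_UID}/{DASHBOARD_SLUG}?{urlencode(params)}"


print(dashboard_url("orca", "24h"))
```

If the link opens the dashboard without the expected host selected, the variable name is probably different; selecting a host manually and inspecting the resulting URL shows the name Grafana actually uses.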

The dashboard is divided into three sections:


Revision #3
Created 10 April 2026 08:10:51 by Adarsh Pyarelal
Updated 10 April 2026 08:14:24 by Adarsh Pyarelal