Monitoring Overview

The lab uses a self-hosted monitoring stack to track CPU, GPU, memory, disk, network, and per-process resource usage across all lab servers.

Metrics are visualised in Grafana, which is available at https://grafana.lab.pyarelal.xyz. Log in with your lab account via the "Sign in with Kanidm" button.

What is monitored

- CPU usage (by type: user, system, iowait, etc.)
- RAM usage (used, cached, buffers)
- Network traffic (sent and received)
- Disk I/O (read and write)
- GPU utilisation, memory, temperature, and power draw (on GPU-equipped hosts)
- Top processes by CPU and memory

Monitored hosts

Host         GPU monitoring
orca         Yes (NVIDIA)
kraken       Yes (NVIDIA)
leviathan    Yes (NVIDIA)
starfish     No
eel          No

Using the dashboard

After logging in, open the Infrastructure Overview dashboard. Use the Host dropdown at the top to switch between servers. The time range selector in the top right controls how far back the graphs show.

The dashboard is divided into three sections:

- System: CPU, RAM, network, and disk panels, visible for all hosts
- GPU: GPU panels, populated only for GPU-equipped hosts
- Processes: top 10 processes by CPU and memory usage
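
The Host dropdown and time range selector described under "Using the dashboard" can usually also be set through the dashboard URL, which is handy for bookmarking or sharing a particular view. Grafana dashboards generally accept `var-<name>`, `from`, and `to` query parameters. The sketch below builds such a link. Note that the dashboard UID (`abc123`), the slug (`infrastructure-overview`), and the variable name (`host`) are hypothetical placeholders, not values taken from this lab's setup; copy the real ones from your browser's address bar after opening the dashboard.

```python
from urllib.parse import urlencode

# Base URL as given in this document.
GRAFANA_BASE = "https://grafana.lab.pyarelal.xyz"


def dashboard_url(uid, slug, host, time_from="now-6h", time_to="now",
                  host_var="host"):
    """Build a link to a dashboard with a host and time range preselected.

    uid, slug, and host_var are assumptions: check the real dashboard URL
    for the actual UID and the exact (case-sensitive) variable name.
    """
    query = urlencode({
        f"var-{host_var}": host,  # dashboard variable behind the Host dropdown
        "from": time_from,        # start of the time range, e.g. "now-24h"
        "to": time_to,            # end of the time range
    })
    return f"{GRAFANA_BASE}/d/{uid}/{slug}?{query}"


# Hypothetical UID and slug, for illustration only.
print(dashboard_url("abc123", "infrastructure-overview", "orca"))
```

Relative ranges such as `now-24h` or `now-7d` correspond to the presets in the time range selector. You still need to be logged in for the link to open.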