hardware-observer docs - index

Hardware Observer

Hardware Observer is a subordinate machine charm that provides monitoring and alerting of hardware resources on bare metal infrastructure.

Hardware Observer collects and exports Prometheus metrics from BMCs (baseboard management controllers), using the IPMI (Intelligent Platform Management Interface) and newer Redfish protocols, and from various SAS (Serial Attached SCSI) and RAID controllers, through the use of the Prometheus Hardware Exporter project. The charm also collects S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) metrics, which monitor the drives on the system and check for possible failures. Hardware Observer additionally configures Prometheus alert rules that fire when the status of any metric is suboptimal.

Appropriate collectors and alert rules are installed based on the availability of one or more of the RAID/SAS controllers mentioned below:

  • Broadcom MegaRAID controller
  • Dell PowerEdge RAID Controller
  • LSI SAS-2 controller
  • LSI SAS-3 controller
  • HPE Smart Array controller
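
The selection described above can be pictured with a minimal Python sketch. It is not the charm's actual code: the function name, the mapping, and the collector labels are all hypothetical and exist only to illustrate that collectors are chosen from whatever supported controllers are present on the machine.

```python
from typing import Iterable, List

# Hypothetical mapping from a detected controller family to an
# illustrative collector label. These labels are assumptions, not the
# identifiers used by the charm or by Prometheus Hardware Exporter.
COLLECTORS_BY_CONTROLLER = {
    "Broadcom MegaRAID controller": "megaraid",
    "Dell PowerEdge RAID Controller": "poweredge-raid",
    "LSI SAS-2 controller": "lsi-sas-2",
    "LSI SAS-3 controller": "lsi-sas-3",
    "HPE Smart Array controller": "hpe-smart-array",
}


def select_collectors(detected: Iterable[str]) -> List[str]:
    """Return collector labels for supported controllers.

    Labels keep detection order, duplicates are dropped, and
    controllers that are not in the mapping are ignored.
    """
    selected: List[str] = []
    for controller in detected:
        label = COLLECTORS_BY_CONTROLLER.get(controller)
        if label is not None and label not in selected:
            selected.append(label)
    return selected
```

A machine on which none of these controllers is detected would simply get an empty list here, so no controller-specific collectors would be chosen in this sketch.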

This charm is ideal for monitoring hardware resources when used in conjunction with the Canonical Observability Stack (COS).

Security, bugs and feature requests

If you find a bug in this application or want to request a specific feature, use the following links:

  • Raise issues or feature requests in GitHub.
  • Report security issues in Hardware Observer through Launchpad. Please do not file GitHub issues about security issues.

Contributing

Please see the Juju SDK docs for best-practice guidelines on enhancing this charm, and CONTRIBUTING.md for developer guidance.

License

Hardware Observer is free software, distributed under the Apache Software License, version 2.0. See LICENSE for more information.

Navigation

Mapping table

| Level | Path | Navlink |
|-------|------|---------|
| 1 | tutorial | Tutorial |
| 1 | how-to | How to |
| 2 | integrate-with-cos | Integrate with COS |
| 2 | monitor-hw-raid-controller | Monitor hardware RAID controllers |
| 2 | migrate-from-hw-health | Migrate from hw-health |
| 1 | explanation | Explanation |
| 2 | hw-support-detection | Hardware support detection |
| 2 | exporters | Exporters |
| 1 | reference | Reference |
| 2 | resources | Resources |
| 2 | configurations | Configurations |
| 2 | integrations | Integrations |
| 2 | metrics-and-alerts | Metrics and alerts |
| 3 | metrics-and-alerts-common | Common |
| 3 | metrics-and-alerts-ipmi | IPMI |
| 3 | metrics-and-alerts-redfish | Redfish |
| 3 | metrics-and-alerts-megaraid | MegaRAID |
| 3 | metrics-and-alerts-poweredge | PowerEdge RAID |
| 3 | metrics-and-alerts-sas | LSI SAS |
| 3 | metrics-and-alerts-hpe | HPE Smart Array |
| 3 | metrics-and-alerts-smart | S.M.A.R.T. |
