Hardware Observer
Hardware Observer is a subordinate machine charm that provides monitoring and alerting of hardware resources on bare metal infrastructure.
Hardware Observer collects and exports Prometheus metrics from BMCs (baseboard management controllers), using the IPMI (Intelligent Platform Management Interface) and newer Redfish protocols, and from various SAS (Serial Attached SCSI) and RAID controllers, through the Prometheus Hardware Exporter project. The charm also collects S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) metrics, which monitor the drives on the system and check for possible failures. Hardware Observer additionally configures Prometheus alert rules that fire when the status of any metric is suboptimal.
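To make the export-and-alert flow above concrete, the following minimal sketch renders one sample in the Prometheus text exposition format and checks whether its status is suboptimal. The metric name `hypothetical_drive_healthy`, the `drive` label, and the convention that `1` means healthy are assumptions made for illustration; they are not taken from the charm or from Prometheus Hardware Exporter.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline as the text format requires."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_sample(name, labels, value):
    """Render a single sample line in the Prometheus text exposition format."""
    if not labels:
        return f"{name} {value}"
    label_text = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{label_text}}} {value}"


def is_suboptimal(value, healthy_value=1):
    """Assumed convention: any value other than healthy_value should alert."""
    return value != healthy_value


# Hypothetical metric: a drive reporting an unhealthy status.
line = render_sample("hypothetical_drive_healthy", {"drive": "sda"}, 0)
print(line)
print("alert" if is_suboptimal(0) else "ok")
```

In a real deployment the alert condition lives in Prometheus alert rules rather than in Python; the function here only mirrors the idea of comparing a status metric against its healthy value.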
Appropriate collectors and alert rules are installed based on the availability of one or more of the RAID/SAS controllers mentioned below:
- Broadcom MegaRAID controller
- Dell PowerEdge RAID Controller
- LSI SAS-2 controller
- LSI SAS-3 controller
- HPE Smart Array controller
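The selection described above can be sketched as a simple lookup from detected controllers to the collectors that should be enabled. This is only an illustration: the collector identifiers and the `select_collectors` helper are hypothetical, and the charm's actual detection logic and naming may differ.

```python
# Hypothetical mapping from a supported controller to a collector identifier.
# The keys mirror the list above; the values are illustrative names only.
COLLECTORS_BY_CONTROLLER = {
    "Broadcom MegaRAID controller": "mega_raid",
    "Dell PowerEdge RAID Controller": "poweredge_raid",
    "LSI SAS-2 controller": "lsi_sas_2",
    "LSI SAS-3 controller": "lsi_sas_3",
    "HPE Smart Array controller": "hpe_smart_array",
}


def select_collectors(detected_controllers):
    """Return the sorted, de-duplicated collectors for the detected controllers.

    Controllers that are not in the supported list are ignored.
    """
    return sorted(
        {
            COLLECTORS_BY_CONTROLLER[controller]
            for controller in detected_controllers
            if controller in COLLECTORS_BY_CONTROLLER
        }
    )


# A machine with one supported and one unsupported controller.
print(select_collectors(["LSI SAS-3 controller", "Some other controller"]))
```

Keeping the mapping in a single table means that supporting a new controller only requires adding one entry, with the matching alert rules keyed on the same identifier.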
This charm is ideal for monitoring hardware resources when used in conjunction with the Canonical Observability Stack (COS).
Security, bugs and feature requests
If you find a bug in this application or want to request a specific feature, here are the useful links:
- Raise issues or feature requests in GitHub.
- Security issues in Hardware Observer can be reported through Launchpad. Please do not report security issues through GitHub issues.
Contributing
Please see the Juju SDK docs for best-practice guidelines on enhancing this charm, and CONTRIBUTING.md for developer guidance.
License
Hardware Observer is free software, distributed under the Apache Software License, version 2.0. See LICENSE for more information.
Navigation
Mapping table
Level | Path | Navlink |
---|---|---|
1 | tutorial | Tutorial |
1 | how-to | How to |
2 | integrate-with-cos | Integrate with COS |
2 | monitor-hw-raid-controller | Monitor hardware RAID controllers |
2 | migrate-from-hw-health | Migrate from hw-health |
1 | explanation | Explanation |
2 | hw-support-detection | Hardware support detection |
2 | exporters | Exporters |
1 | reference | Reference |
2 | resources | Resources |
2 | configurations | Configurations |
2 | integrations | Integrations |
2 | metrics-and-alerts | Metrics and alerts |
3 | metrics-and-alerts-common | Common |
3 | metrics-and-alerts-ipmi | IPMI |
3 | metrics-and-alerts-redfish | Redfish |
3 | metrics-and-alerts-megaraid | MegaRAID |
3 | metrics-and-alerts-poweredge | PowerEdge RAID |
3 | metrics-and-alerts-sas | LSI SAS |
3 | metrics-and-alerts-hpe | HPE Smart Array |
3 | metrics-and-alerts-smart | S.M.A.R.T. |