mirror of
https://github.com/prometheus-community/ansible
synced 2024-11-26 22:00:22 +00:00
initial migration of roles from cloudalchemy
Signed-off-by: Paweł Krupa (paulfantom) <pawel@krupa.net.pl>
parent
35aceab5d3
commit
dcfbfa84d2
125 changed files with 4826 additions and 0 deletions
6
.ansible-lint
Normal file
|
@ -0,0 +1,6 @@
|
|||
---
skip_list:
  - '106'
  - '204'
  - '208'
  - '602'
|
98
CONTRIBUTING.md
Normal file
|
@ -0,0 +1,98 @@
|
|||
# Contributor Guideline

This document provides an overview of how you can participate in improving this project or extending it. We are
grateful for all your help: bug reports and fixes, code contributions, documentation or ideas. Feel free to join, we
appreciate your support!

## Communication

### GitHub repositories

Many of the issues, goals and ideas are tracked in the respective projects on GitHub. Please use this channel to report
bugs, ask questions, and request new features.

## git and GitHub

In order to contribute code please:

1. Fork the project on GitHub
2. Clone the project
3. Add changes (and tests)
4. Commit and push
5. Create a pull request

To have your code merged, see the expectations listed below.

You can find a well-written guide [here](https://help.github.com/articles/fork-a-repo).

Please follow common commit best practices. Be explicit, have a short summary, a well-written description and
references. This is especially important for the pull request.

Some great guidelines can be found [here](https://wiki.openstack.org/wiki/GitCommitMessages) and
[here](http://robots.thoughtbot.com/5-useful-tips-for-a-better-commit-message).

## Releases

We try to stick to semantic versioning and our releases are automated. A release is created by adding a keyword (in a
way similar to the CI keyword [`[ci skip]`](https://docs.travis-ci.com/user/customizing-the-build#Skipping-a-build))
to the commit of a pull request. Available keywords are (square brackets are important!):

* `[patch]`, `[fix]`, `[bugfix]` - for PATCH version release
* `[minor]`, `[feature]`, `[feat]` - for MINOR version release
* `[major]`, `[breaking change]` - for MAJOR version release
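
The keyword list above maps directly onto a semantic-version bump. The following is only a sketch of that mapping, not the project's actual release automation (which this document does not show): the helper names `bump_level` and `next_version` are hypothetical, and the rule that the most significant keyword wins when several appear is an assumption.

```python
import re

# Keyword -> semantic-versioning level, copied from the list above.
KEYWORD_LEVELS = {
    "patch": "patch", "fix": "patch", "bugfix": "patch",
    "minor": "minor", "feature": "minor", "feat": "minor",
    "major": "major", "breaking change": "major",
}
# Assumption: when several keywords appear, the most significant one wins.
LEVEL_RANK = {"patch": 0, "minor": 1, "major": 2}


def bump_level(message):
    """Return 'major', 'minor', 'patch', or None if no keyword is present."""
    # Only text inside square brackets counts, as the guideline stresses.
    tags = re.findall(r"\[([^\[\]]+)\]", message)
    levels = [KEYWORD_LEVELS[t] for t in tags if t in KEYWORD_LEVELS]
    return max(levels, key=LEVEL_RANK.get) if levels else None


def next_version(current, message):
    """Apply the bump implied by the commit message to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in current.split("."))
    level = bump_level(message)
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current  # no keyword, so no release
```

A message such as `[feature] add amtool config` would move `1.2.3` to `1.3.0`, while the same text without brackets leaves the version untouched.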
|
||||
|
||||
## Changelog

The changelog is generated automatically during the release process; all information is taken from GitHub issues, PRs and
labels.

## Expectations

### Keep it simple

We try to provide production-ready Ansible roles that are as close to zero-conf as possible, but this is no reason to
overcomplicate things. Just follow [KISS](https://en.wikipedia.org/wiki/KISS_principle).

### Be explicit

* Please avoid using nonsensical property and variable names.
* Use self-describing attribute names for user configuration.
* In case of failures, communicate to the user what happened and why the failure occurred. Make it easy to track the code
  or action that produced the error. Try to catch and handle errors if possible to provide improved failure messages.


### Add tests

We strive to have at least two test scenarios located in the [/molecule](molecule) directory. The first one
([default](molecule/default)) tests the default configuration without any additional variables; the second one
([alternative](molecule/alternative)) tests what happens when many variables from
[/defaults/main.yml](defaults/main.yml) are changed. When adding new functionality, please add tests to the proper
scenarios. Tests are written with the testinfra framework and are located in the `/tests` subdirectory of the scenario directory
(for example, default tests are in [/molecule/default/tests](molecule/default/tests)).
More information about:
- [testinfra](http://testinfra.readthedocs.io/en/latest/index.html)
- [molecule](https://molecule.readthedocs.io/en/latest/index.html)
|
||||
|
||||
### Follow best practices

Please follow [ansible best practices](http://docs.ansible.com/ansible/latest/playbooks_best_practices.html) and
especially provide meaningful names to tasks and even comments where needed.

Our test framework automatically lints code with [`yamllint`](https://github.com/adrienverge/yamllint),
[`ansible-lint`](https://github.com/willthames/ansible-lint), and [`flake8`](https://gitlab.com/pycqa/flake8) programs
so be sure to follow their rules.

Remember: Code is generally read much more often than written.

### Use Markdown

Wherever possible, please refrain from any other formats and stick to simple markdown.

## Requirements regarding roles design

We are trying to create the best and most secure installation method for non-containerized prometheus stack components.
To accomplish this all roles need to support:

- current and at least one previous ansible version
- systemd as the only available process manager
- at least the latest Debian and CentOS distributions
|
43
galaxy.yml
Normal file
|
@ -0,0 +1,43 @@
|
|||
### REQUIRED
# The namespace of the collection. This can be a company/brand/organization or product namespace under which all
# content lives. May only contain alphanumeric lowercase characters and underscores. Namespaces cannot start with
# underscores or numbers and cannot contain consecutive underscores
namespace: community
name: prometheus
version: 1.0.0

readme: README.md
authors:
  - Ben Kochie (https://github.com/SuperQ)
  - Paweł Krupa (https://github.com/paulfantom)

description: your collection description
license_file: LICENSE
tags:
  - monitoring
  - prometheus
  - metrics
  - alerts
  - alerting
  - molecule
  - cloud

# Collections that this collection requires to be installed for it to be usable. The key of the dict is the
# collection label 'namespace.name'. The value is a version range
# L(specifiers,https://python-semanticversion.readthedocs.io/en/latest/#requirement-specification). Multiple version
# range specifiers can be set and are separated by ','
dependencies: {}

repository: https://github.com/prometheus-community/ansible
documentation: https://github.com/prometheus-community/ansible/blob/main/docs
homepage: https://prometheus.io
issues: https://github.com/prometheus-community/ansible/issues

# A list of file glob-like patterns used to filter any files or directories that should not be included in the build
# artifact. A pattern is matched from the relative path of the file or directory of the collection directory. This
# uses 'fnmatch' to match the files or directories. Some directories and files like 'galaxy.yml', '*.pyc', '*.retry',
# and '.git' are always filtered
build_ignore:
  - 'tests/*'
  - '*.tar.gz'
  - 'docs/*'
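
The namespace rules stated in the comment at the top of `galaxy.yml` can be checked mechanically. This is a small sketch of those three rules as written there, not the validator that Ansible Galaxy itself runs; the function name `is_valid_namespace` is hypothetical.

```python
import re


def is_valid_namespace(name):
    """Apply the namespace rules quoted in the galaxy.yml comment."""
    # Rule 1: only lowercase alphanumerics and underscores (and not empty).
    if not re.fullmatch(r"[a-z0-9_]+", name):
        return False
    # Rule 2: must not start with an underscore or a digit.
    if name[0] == "_" or name[0].isdigit():
        return False
    # Rule 3: no consecutive underscores.
    return "__" not in name
```

Under these rules the value used in this file, `community`, is accepted, while something like `_community` or `my__namespace` would be rejected.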
|
1
meta/runtime.yml
Normal file
|
@ -0,0 +1 @@
|
|||
requires_ansible: '>=2.9.10'
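
A note on the `>=2.9.10` constraint: it only makes sense when versions are compared numerically, component by component, because a plain string comparison would rank `2.9.9` above `2.9.10`. The sketch below illustrates that point with hypothetical helper names; Ansible's own handling of `requires_ansible` supports richer specifiers than this.

```python
def version_tuple(version):
    """Turn '2.9.10' into (2, 9, 10) so components compare as numbers."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum="2.9.10"):
    """True if the installed version satisfies '>=minimum'."""
    return version_tuple(installed) >= version_tuple(minimum)
```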
|
31
plugins/README.md
Normal file
|
@ -0,0 +1,31 @@
|
|||
# Collections Plugins Directory

This directory can be used to ship various plugins inside an Ansible collection. Each plugin is placed in a folder that
is named after the type of plugin it is in. It can also include the `module_utils` and `modules` directory that
would contain module utils and modules respectively.

Here is an example directory of the majority of plugins currently supported by Ansible:

```
└── plugins
    ├── action
    ├── become
    ├── cache
    ├── callback
    ├── cliconf
    ├── connection
    ├── filter
    ├── httpapi
    ├── inventory
    ├── lookup
    ├── module_utils
    ├── modules
    ├── netconf
    ├── shell
    ├── strategy
    ├── terminal
    ├── test
    └── vars
```

A full list of plugin types can be found at [Working With Plugins](https://docs.ansible.com/ansible-core/2.12/plugins/plugins.html).
|
99
roles/alertmanager/README.md
Normal file
|
@ -0,0 +1,99 @@
|
|||
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/1d/Human-dialog-warning.svg/2000px-Human-dialog-warning.svg.png" alt="alert logo" title="alert" align="right" height="60" /></p>

# Ansible Role: alertmanager

## Description

Deploy and manage Prometheus [alertmanager](https://github.com/prometheus/alertmanager) service using ansible.

## Requirements

- Ansible >= 2.7 (It might work on previous versions, but we cannot guarantee it)

It would be nice to have prometheus installed somewhere

## Role Variables

All variables which can be overridden are stored in [defaults/main.yml](defaults/main.yml) file as well as in table below.

| Name           | Default Value | Description                        |
| -------------- | ------------- | -----------------------------------|
| `alertmanager_version` | 0.21.0 | Alertmanager package version. Also accepts `latest` as parameter. |
| `alertmanager_binary_local_dir` | "" | Allows using local packages instead of the ones distributed on GitHub. Takes a directory where the `alertmanager` AND `amtool` binaries are stored on the host on which ansible is run. This overrides the `alertmanager_version` parameter |
| `alertmanager_web_listen_address` | 0.0.0.0:9093 | Address on which alertmanager will be listening |
| `alertmanager_web_external_url` | http://localhost:9093/ | External address on which alertmanager is available. Useful when behind reverse proxy. Ex. example.org/alertmanager |
| `alertmanager_config_dir` | /etc/alertmanager | Path to directory with alertmanager configuration |
| `alertmanager_db_dir` | /var/lib/alertmanager | Path to directory with alertmanager database |
| `alertmanager_config_file` | alertmanager.yml.j2 | Variable used to provide custom alertmanager configuration file in form of ansible template |
| `alertmanager_config_flags_extra` | {} | Additional configuration flags passed to the alertmanager binary at startup |
| `alertmanager_template_files` | ['alertmanager/templates/*.tmpl'] | List of folders where ansible will look for template files which will be copied to `{{ alertmanager_config_dir }}/templates/`. Files must have `*.tmpl` extension |
| `alertmanager_resolve_timeout` | 3m | Time after which an alert is declared resolved |
| `alertmanager_smtp` | {} | SMTP (email) configuration |
| `alertmanager_http_config` | {} | Http config for using custom webhooks |
| `alertmanager_slack_api_url` | "" | Slack webhook url |
| `alertmanager_pagerduty_url` | "" | Pagerduty webhook url |
| `alertmanager_opsgenie_api_key` | "" | Opsgenie webhook key |
| `alertmanager_opsgenie_api_url` | "" | Opsgenie webhook url |
| `alertmanager_victorops_api_key` | "" | VictorOps webhook key |
| `alertmanager_victorops_api_url` | "" | VictorOps webhook url |
| `alertmanager_hipchat_api_url` | "" | Hipchat webhook url |
| `alertmanager_hipchat_auth_token` | "" | Hipchat authentication token |
| `alertmanager_wechat_url` | "" | Enterprise WeChat webhook url |
| `alertmanager_wechat_secret` | "" | Enterprise WeChat secret token |
| `alertmanager_wechat_corp_id` | "" | Enterprise WeChat corporation id |
| `alertmanager_cluster` | {listen-address: ""} | HA cluster network configuration. Disabled by default. More information in [alertmanager readme](https://github.com/prometheus/alertmanager#high-availability) |
| `alertmanager_receivers` | [] | A list of notification receivers. Configuration same as in [official docs](https://prometheus.io/docs/alerting/configuration/#<receiver>) |
| `alertmanager_inhibit_rules` | [] | List of inhibition rules. Same as in [official docs](https://prometheus.io/docs/alerting/configuration/#inhibit_rule) |
| `alertmanager_route` | {} | Alert routing. More in [official docs](https://prometheus.io/docs/alerting/configuration/#<route>) |
| `alertmanager_amtool_config_file` | amtool.yml.j2 | Template for amtool config |
| `alertmanager_amtool_config_alertmanager_url` | `alertmanager_web_external_url` | URL of the alertmanager |
| `alertmanager_amtool_config_output` | extended | Extended output, use `""` for simple output. |

## Example

### Playbook

```yaml
---
- hosts: all
  roles:
    - ansible-alertmanager
  vars:
    alertmanager_version: latest
    alertmanager_slack_api_url: "http://example.com"
    alertmanager_receivers:
      - name: slack
        slack_configs:
          - send_resolved: true
            channel: '#alerts'
    alertmanager_route:
      group_by: ['alertname', 'cluster', 'service']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 3h
      receiver: slack
```

### Demo site

We provide a demo site for a full monitoring solution based on prometheus and grafana. The repository with code and links to running instances is [available on github](https://github.com/prometheus/demo-site) and the site is hosted on [DigitalOcean](https://digitalocean.com).

## Local Testing

The preferred way of locally testing the role is to use Docker and [molecule](https://github.com/ansible-community/molecule) (v3.x). You will have to install Docker on your system. See "Get started" for a Docker package suitable for your system. Running your tests is as simple as executing `molecule test`.

## Continuous Integration

Combining molecule and circle CI allows us to test how new PRs will behave when used with multiple ansible versions and multiple operating systems. This also allows us to create test scenarios for different role configurations. As a result we have a quite large test matrix which can take more time than local testing, so please be patient.

## Contributing

See [contributor guideline](CONTRIBUTING.md).

## Troubleshooting

See [troubleshooting](TROUBLESHOOTING.md).

## License

This project is licensed under MIT License. See [LICENSE](/LICENSE) for more details.
|
124
roles/alertmanager/defaults/main.yml
Normal file
|
@ -0,0 +1,124 @@
|
|||
---
alertmanager_version: 0.21.0
alertmanager_binary_local_dir: ''

alertmanager_config_dir: /etc/alertmanager
alertmanager_db_dir: /var/lib/alertmanager

alertmanager_config_file: 'alertmanager.yml.j2'

alertmanager_template_files:
  - alertmanager/templates/*.tmpl

alertmanager_web_listen_address: '0.0.0.0:9093'
alertmanager_web_external_url: 'http://localhost:9093/'

alertmanager_http_config: {}

alertmanager_resolve_timeout: 3m

alertmanager_config_flags_extra: {}
# alertmanager_config_flags_extra:
#   data.retention: 10

# SMTP default params
alertmanager_smtp: {}
# alertmanager_smtp:
#   from: ''
#   smarthost: ''
#   auth_username: ''
#   auth_password: ''
#   auth_secret: ''
#   auth_identity: ''
#   require_tls: "True"

# Default values you can see here -> https://prometheus.io/docs/alerting/configuration/
alertmanager_slack_api_url: ''
alertmanager_pagerduty_url: ''
alertmanager_opsgenie_api_key: ''
alertmanager_opsgenie_api_url: ''
alertmanager_victorops_api_key: ''
alertmanager_victorops_api_url: ''
alertmanager_hipchat_api_url: ''
alertmanager_hipchat_auth_token: ''
alertmanager_wechat_url: ''
alertmanager_wechat_secret: ''
alertmanager_wechat_corp_id: ''

# First read: https://github.com/prometheus/alertmanager#high-availability
alertmanager_cluster:
  listen-address: ""
# alertmanager_cluster:
#   listen-address: "{{ ansible_default_ipv4.address }}:6783"
#   peers:
#     - "{{ ansible_default_ipv4.address }}:6783"
#     - "demo.cloudalchemy.org:6783"

alertmanager_receivers: []
# alertmanager_receivers:
#   - name: slack
#     slack_configs:
#       - send_resolved: true
#         channel: '#alerts'

alertmanager_inhibit_rules: []
# alertmanager_inhibit_rules:
#   - target_match:
#       label: value
#     source_match:
#       label: value
#     equal: ['dc', 'rack']
#   - target_match_re:
#       label: value1|value2
#     source_match_re:
#       label: value3|value5

alertmanager_route: {}
# alertmanager_route:
#   group_by: ['alertname', 'cluster', 'service']
#   group_wait: 30s
#   group_interval: 5m
#   repeat_interval: 4h
#   receiver: slack
#   # This route performs a regular expression match on alert labels to
#   # catch alerts that are related to a list of services.
#   - match_re:
#       service: ^(foo1|foo2|baz)$
#     receiver: team-X-mails
#     # The service has a sub-route for critical alerts, any alerts
#     # that do not match, i.e. severity != critical, fall-back to the
#     # parent node and are sent to 'team-X-mails'
#     routes:
#       - match:
#           severity: critical
#         receiver: team-X-pager
#   - match:
#       service: files
#     receiver: team-Y-mails
#     routes:
#       - match:
#           severity: critical
#         receiver: team-Y-pager
#   # This route handles all alerts coming from a database service. If there's
#   # no team to handle it, it defaults to the DB team.
#   - match:
#       service: database
#     receiver: team-DB-pager
#     # Also group alerts by affected database.
#     group_by: [alertname, cluster, database]
#     routes:
#       - match:
#           owner: team-X
#         receiver: team-X-pager
#       - match:
#           owner: team-Y
#         receiver: team-Y-pager

# The template for amtool's configuration
alertmanager_amtool_config_file: 'amtool.yml.j2'

# Location (URL) of the alertmanager
alertmanager_amtool_config_alertmanager_url: "{{ alertmanager_web_external_url }}"

# Extended output of `amtool` commands, use '' for less verbosity
alertmanager_amtool_config_output: 'extended'
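
The commented `alertmanager_route` example in this file describes a routing tree: an alert descends into the first child route whose matchers hold and falls back to the parent's receiver when no child matches. The sketch below is a deliberately simplified model of that behaviour, written to illustrate the comments, and is not Alertmanager's implementation. It ignores `continue`, grouping and timing options, assumes child routes sit under a `routes` key, and the function names are hypothetical.

```python
import re


def route_matches(route, labels):
    """True if every `match` and `match_re` condition of the route holds."""
    for name, value in route.get("match", {}).items():
        if labels.get(name, "") != value:
            return False
    for name, pattern in route.get("match_re", {}).items():
        # A missing label is treated as an empty string.
        if not re.fullmatch(pattern, labels.get(name, "")):
            return False
    return True


def pick_receiver(route, labels, inherited=None):
    """Walk the tree depth-first and return the receiver that handles the alert."""
    receiver = route.get("receiver", inherited)
    for child in route.get("routes", []):
        if route_matches(child, labels):
            return pick_receiver(child, labels, receiver)
    return receiver  # no child matched: the current node handles it


# Part of the commented example from this file, expressed as a dict.
EXAMPLE_ROUTE = {
    "receiver": "slack",
    "routes": [
        {
            "match_re": {"service": "^(foo1|foo2|baz)$"},
            "receiver": "team-X-mails",
            "routes": [
                {"match": {"severity": "critical"}, "receiver": "team-X-pager"},
            ],
        },
        {
            "match": {"service": "database"},
            "receiver": "team-DB-pager",
            "routes": [
                {"match": {"owner": "team-X"}, "receiver": "team-X-pager"},
                {"match": {"owner": "team-Y"}, "receiver": "team-Y-pager"},
            ],
        },
    ],
}
```

With this tree, a non-critical alert for `service: foo1` stays with `team-X-mails`, which is the fall-back the comment in the example describes.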
|
13
roles/alertmanager/handlers/main.yml
Normal file
|
@ -0,0 +1,13 @@
|
|||
---
- name: restart alertmanager
  become: true
  systemd:
    daemon_reload: true
    name: alertmanager
    state: restarted

- name: reload alertmanager
  become: true
  systemd:
    name: alertmanager
    state: reloaded
|
31
roles/alertmanager/meta/main.yml
Normal file
|
@ -0,0 +1,31 @@
|
|||
---
galaxy_info:
  author: Prometheus Community
  description: Prometheus Alertmanager service
  license: Apache
  company: none
  min_ansible_version: "2.7"
  platforms:
    - name: Ubuntu
      versions:
        - bionic
        - xenial
    - name: Debian
      versions:
        - stretch
        - buster
    - name: EL
      versions:
        - 7
        - 8
    - name: Fedora
      versions:
        - 30
        - 31
  galaxy_tags:
    - monitoring
    - prometheus
    - alerting
    - alert

dependencies: []
|
70
roles/alertmanager/molecule/alternative/molecule.yml
Normal file
|
@ -0,0 +1,70 @@
|
|||
---
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: bionic
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:ubuntu-18.04
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: xenial
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:ubuntu-16.04
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: stretch
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:debian-9
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: buster
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:debian-10
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: centos7
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:centos-7
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: centos8
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:centos-8
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    groups:
      - python3
  - name: fedora
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:fedora-30
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    groups:
      - python3
provisioner:
  name: ansible
  playbooks:
    prepare: prepare.yml
    converge: playbook.yml
  inventory:
    group_vars:
      python3:
        ansible_python_interpreter: /usr/bin/python3
verifier:
  name: testinfra
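
Every platform entry sets `docker_host` to `${DOCKER_HOST:-unix://var/run/docker.sock}`. In shell-style expansion, `:-` means "use the default when the variable is unset or empty". The sketch below spells that rule out in Python purely as an illustration, on the assumption that molecule's interpolation follows the shell semantics here; `resolve_docker_host` is a hypothetical name, not part of molecule.

```python
import os

DEFAULT_DOCKER_HOST = "unix://var/run/docker.sock"  # default taken from molecule.yml


def resolve_docker_host(env=None):
    """Mimic `${DOCKER_HOST:-default}`: fall back when unset or empty."""
    env = os.environ if env is None else env
    value = env.get("DOCKER_HOST", "")
    return value if value else DEFAULT_DOCKER_HOST
```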
|
34
roles/alertmanager/molecule/alternative/playbook.yml
Normal file
|
@ -0,0 +1,34 @@
|
|||
---
- hosts: all
  any_errors_fatal: true
  roles:
    - cloudalchemy.alertmanager
  vars:
    alertmanager_binary_local_dir: '/tmp/alertmanager-linux-amd64'
    alertmanager_config_dir: /opt/am/etc
    alertmanager_db_dir: /opt/am/lib
    alertmanager_web_listen_address: '127.0.0.1:9093'
    alertmanager_web_external_url: 'http://localhost:9093/alertmanager'
    alertmanager_resolve_timeout: 10m
    alertmanager_slack_api_url: "http://example.com"
    alertmanager_receivers:
      - name: slack
        slack_configs:
          - send_resolved: true
            api_url: 'https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX'
            channel: '#alerts'
    alertmanager_route:
      group_by: ['alertname', 'cluster', 'service']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 3h
      receiver: slack
      routes:
        - match_re:
            service: ^(foo1|foo2|baz)$
          receiver: slack
    alertmanager_mesh:
      listen-address: "127.0.0.1:6783"
      peers:
        - "127.0.0.1:6783"
        - "demo.cloudalchemy.org:6783"
|
37
roles/alertmanager/molecule/alternative/prepare.yml
Normal file
|
@ -0,0 +1,37 @@
|
|||
---
- name: Prepare
  hosts: localhost
  gather_facts: false
  vars:
    # Version needs to be specified here as molecule doesn't have access to ansible_version at this stage
    version: 0.19.0
  tasks:
    - name: download alertmanager binary to local folder
      become: false
      get_url:
        url: "https://github.com/prometheus/alertmanager/releases/download/v{{ version }}/alertmanager-{{ version }}.linux-amd64.tar.gz"
        dest: "/tmp/alertmanager-{{ version }}.linux-amd64.tar.gz"
      register: _download_archive
      until: _download_archive is succeeded
      retries: 5
      delay: 2
      run_once: true
      check_mode: false

    - name: unpack alertmanager binaries
      become: false
      unarchive:
        src: "/tmp/alertmanager-{{ version }}.linux-amd64.tar.gz"
        dest: "/tmp"
        creates: "/tmp/alertmanager-{{ version }}.linux-amd64/alertmanager"
      run_once: true
      check_mode: false

    - name: link to alertmanager binaries directory
      become: false
      file:
        src: "/tmp/alertmanager-{{ version }}.linux-amd64"
        dest: "/tmp/alertmanager-linux-amd64"
        state: link
      run_once: true
      check_mode: false
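
The three tasks above all derive their paths from the single `version` variable. As a quick way to see what the templated strings expand to, here is a sketch that rebuilds them in Python; the function names are hypothetical and the URL pattern is copied from the `get_url` task.

```python
def release_basename(version):
    """Directory / archive stem used by all three prepare tasks."""
    return f"alertmanager-{version}.linux-amd64"


def release_url(version):
    """Download URL as templated in the get_url task."""
    return (
        "https://github.com/prometheus/alertmanager/releases/download/"
        f"v{version}/{release_basename(version)}.tar.gz"
    )


def local_paths(version):
    """Archive, unpacked binary, and symlink target used by the scenario."""
    stem = release_basename(version)
    return {
        "archive": f"/tmp/{stem}.tar.gz",
        "binary": f"/tmp/{stem}/alertmanager",
        "link": "/tmp/alertmanager-linux-amd64",  # version-independent symlink
    }
```

The version-independent symlink is what lets the alternative playbook point `alertmanager_binary_local_dir` at a fixed path.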
|
|
@ -0,0 +1,43 @@
|
|||
import pytest
import os
import testinfra.utils.ansible_runner

testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')


@pytest.mark.parametrize("dirs", [
    "/opt/am/etc",
    "/opt/am/etc/templates",
    "/opt/am/lib"
])
def test_directories(host, dirs):
    d = host.file(dirs)
    assert d.is_directory
    assert d.exists


@pytest.mark.parametrize("files", [
    "/usr/local/bin/alertmanager",
    "/usr/local/bin/amtool",
    "/opt/am/etc/alertmanager.yml",
    "/etc/systemd/system/alertmanager.service"
])
def test_files(host, files):
    f = host.file(files)
    assert f.exists
    assert f.is_file


def test_service(host):
    s = host.service("alertmanager")
    # assert s.is_enabled
    assert s.is_running


@pytest.mark.parametrize("sockets", [
    "tcp://127.0.0.1:9093",
    "tcp://127.0.0.1:6783"
])
def test_socket(host, sockets):
    assert host.socket(sockets).is_listening
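
For debugging a failing `test_socket` by hand, a connect-based probe is often enough. Note that this swaps in a different technique: testinfra inspects the target host's socket state, whereas the sketch below simply tries to open a TCP connection from wherever it runs. The function name `can_connect` is hypothetical.

```python
import socket


def can_connect(address, port, timeout=1.0):
    """Return True if a TCP connection to address:port can be opened."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `can_connect("127.0.0.1", 9093)` on the target machine approximates the first parametrized case above.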
|
70
roles/alertmanager/molecule/default/molecule.yml
Normal file
|
@ -0,0 +1,70 @@
|
|||
---
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: bionic
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:ubuntu-18.04
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: xenial
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:ubuntu-16.04
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: stretch
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:debian-9
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: buster
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:debian-10
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: centos7
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:centos-7
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: centos8
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:centos-8
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    groups:
      - python3
  - name: fedora
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:fedora-30
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    groups:
      - python3
provisioner:
  name: ansible
  playbooks:
    prepare: prepare.yml
    converge: playbook.yml
  inventory:
    group_vars:
      python3:
        ansible_python_interpreter: /usr/bin/python3
verifier:
  name: testinfra
|
18
roles/alertmanager/molecule/default/playbook.yml
Normal file
|
@ -0,0 +1,18 @@
|
|||
---
- hosts: all
  any_errors_fatal: true
  roles:
    - cloudalchemy.alertmanager
  vars:
    alertmanager_slack_api_url: "http://example.com"
    alertmanager_receivers:
      - name: slack
        slack_configs:
          - send_resolved: true
            channel: '#alerts'
    alertmanager_route:
      group_by: ['alertname', 'cluster', 'service']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 3h
      receiver: slack
|
5
roles/alertmanager/molecule/default/prepare.yml
Normal file
|
@ -0,0 +1,5 @@
|
|||
---
- name: Prepare
  hosts: all
  gather_facts: false
  tasks: []
|
39
roles/alertmanager/molecule/default/tests/test_default.py
Normal file
|
@ -0,0 +1,39 @@
|
|||
import pytest
import os
import testinfra.utils.ansible_runner

testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')


@pytest.mark.parametrize("dirs", [
    "/etc/alertmanager",
    "/etc/alertmanager/templates",
    "/var/lib/alertmanager"
])
def test_directories(host, dirs):
    d = host.file(dirs)
    assert d.is_directory
    assert d.exists


@pytest.mark.parametrize("files", [
    "/usr/local/bin/alertmanager",
    "/usr/local/bin/amtool",
    "/etc/alertmanager/alertmanager.yml",
    "/etc/systemd/system/alertmanager.service"
])
def test_files(host, files):
    f = host.file(files)
    assert f.exists
    assert f.is_file


def test_service(host):
    s = host.service("alertmanager")
    # assert s.is_enabled
    assert s.is_running


def test_socket(host):
    assert host.socket("tcp://0.0.0.0:9093").is_listening
|
35
roles/alertmanager/molecule/latest/molecule.yml
Normal file
|
@ -0,0 +1,35 @@
|
|||
---
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: buster
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:debian-10
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
  - name: fedora
    pre_build_image: true
    image: quay.io/paulfantom/molecule-systemd:fedora-30
    docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    groups:
      - python3
provisioner:
  name: ansible
  playbooks:
    create: ../default/create.yml
    prepare: ../default/prepare.yml
    converge: playbook.yml
    destroy: ../default/destroy.yml
  inventory:
    group_vars:
      python3:
        ansible_python_interpreter: /usr/bin/python3
verifier:
  name: testinfra
|
20
roles/alertmanager/molecule/latest/playbook.yml
Normal file
|
@ -0,0 +1,20 @@
|
|||
---
|
||||
- name: Run role
|
||||
hosts: all
|
||||
any_errors_fatal: true
|
||||
roles:
|
||||
- cloudalchemy.alertmanager
|
||||
vars:
|
||||
alertmanager_version: latest
|
||||
alertmanager_slack_api_url: "http://example.com"
|
||||
alertmanager_receivers:
|
||||
- name: slack
|
||||
slack_configs:
|
||||
- send_resolved: true
|
||||
channel: '#alerts'
|
||||
alertmanager_route:
|
||||
group_by: ['alertname', 'cluster', 'service']
|
||||
group_wait: 30s
|
||||
group_interval: 5m
|
||||
repeat_interval: 3h
|
||||
receiver: slack
|
28
roles/alertmanager/molecule/latest/tests/test_latest.py
Normal file
|
@ -0,0 +1,28 @@
|
|||
import pytest
|
||||
import os
|
||||
import testinfra.utils.ansible_runner
|
||||
|
||||
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
|
||||
os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
|
||||
|
||||
|
||||
@pytest.mark.parametrize("files", [
|
||||
"/etc/systemd/system/alertmanager.service",
|
||||
"/usr/local/bin/alertmanager",
|
||||
"/usr/local/bin/amtool"
|
||||
])
|
||||
def test_files(host, files):
|
||||
f = host.file(files)
|
||||
assert f.exists
|
||||
assert f.is_file
|
||||
|
||||
|
||||
def test_service(host):
|
||||
s = host.service("alertmanager")
|
||||
# assert s.is_enabled
|
||||
assert s.is_running
|
||||
|
||||
|
||||
def test_socket(host):
|
||||
s = host.socket("tcp://0.0.0.0:9093")
|
||||
assert s.is_listening
|
43
roles/alertmanager/tasks/configure.yml
Normal file
|
@ -0,0 +1,43 @@
|
|||
---
|
||||
- name: copy amtool config
|
||||
template:
|
||||
force: true
|
||||
src: "{{ alertmanager_amtool_config_file }}"
|
||||
dest: "{{ _alertmanager_amtool_config_dir }}/config.yml"
|
||||
owner: alertmanager
|
||||
group: alertmanager
|
||||
mode: 0644
|
||||
|
||||
- name: copy alertmanager config
|
||||
template:
|
||||
force: true
|
||||
src: "{{ alertmanager_config_file }}"
|
||||
dest: "{{ alertmanager_config_dir }}/alertmanager.yml"
|
||||
owner: alertmanager
|
||||
group: alertmanager
|
||||
mode: 0644
|
||||
validate: "{{ _alertmanager_binary_install_dir }}/amtool check-config %s"
|
||||
notify:
|
||||
- restart alertmanager
|
||||
|
||||
- name: create systemd service unit
|
||||
template:
|
||||
src: alertmanager.service.j2
|
||||
dest: /etc/systemd/system/alertmanager.service
|
||||
owner: root
|
||||
group: root
|
||||
mode: 0644
|
||||
notify:
|
||||
- restart alertmanager
|
||||
|
||||
- name: copy alertmanager template files
|
||||
copy:
|
||||
src: "{{ item }}"
|
||||
dest: "{{ alertmanager_config_dir }}/templates/"
|
||||
force: true
|
||||
owner: alertmanager
|
||||
group: alertmanager
|
||||
mode: 0644
|
||||
with_fileglob: "{{ alertmanager_template_files }}"
|
||||
notify:
|
||||
- restart alertmanager
|
80
roles/alertmanager/tasks/install.yml
Normal file
|
@ -0,0 +1,80 @@
|
|||
---
|
||||
- name: create alertmanager system group
|
||||
group:
|
||||
name: alertmanager
|
||||
system: true
|
||||
state: present
|
||||
|
||||
- name: create alertmanager system user
|
||||
user:
|
||||
name: alertmanager
|
||||
system: true
|
||||
shell: "/usr/sbin/nologin"
|
||||
group: alertmanager
|
||||
createhome: false
|
||||
|
||||
- name: create alertmanager directories
|
||||
file:
|
||||
path: "{{ item }}"
|
||||
state: directory
|
||||
owner: alertmanager
|
||||
group: alertmanager
|
||||
mode: 0755
|
||||
with_items:
|
||||
- "{{ alertmanager_config_dir }}"
|
||||
- "{{ alertmanager_config_dir }}/templates"
|
||||
- "{{ alertmanager_db_dir }}"
|
||||
- "{{ _alertmanager_amtool_config_dir }}"
|
||||
|
||||
- block:
|
||||
- name: download alertmanager binary to local folder
|
||||
become: false
|
||||
get_url:
|
||||
url: "https://github.com/prometheus/alertmanager/releases/download/v{{ alertmanager_version }}/alertmanager-{{ alertmanager_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
dest: "/tmp/alertmanager-{{ alertmanager_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
checksum: "sha256:{{ __alertmanager_checksum }}"
|
||||
register: _download_archive
|
||||
until: _download_archive is succeeded
|
||||
retries: 5
|
||||
delay: 2
|
||||
# run_once: true # <-- this can't be set due to multi-arch support
|
||||
delegate_to: localhost
|
||||
check_mode: false
|
||||
|
||||
- name: unpack alertmanager binaries
|
||||
become: false
|
||||
unarchive:
|
||||
src: "/tmp/alertmanager-{{ alertmanager_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
dest: "/tmp"
|
||||
mode: 0755
|
||||
creates: "/tmp/alertmanager-{{ alertmanager_version }}.linux-{{ go_arch }}/alertmanager"
|
||||
delegate_to: localhost
|
||||
check_mode: false
|
||||
|
||||
- name: propagate official alertmanager and amtool binaries
|
||||
copy:
|
||||
src: "/tmp/alertmanager-{{ alertmanager_version }}.linux-{{ go_arch }}/{{ item }}"
|
||||
dest: "{{ _alertmanager_binary_install_dir }}/{{ item }}"
|
||||
mode: 0755
|
||||
owner: root
|
||||
group: root
|
||||
with_items:
|
||||
- alertmanager
|
||||
- amtool
|
||||
notify:
|
||||
- restart alertmanager
|
||||
when: alertmanager_binary_local_dir | length == 0
|
||||
|
||||
- name: propagate locally distributed alertmanager and amtool binaries
|
||||
copy:
|
||||
src: "{{ alertmanager_binary_local_dir }}/{{ item }}"
|
||||
dest: "{{ _alertmanager_binary_install_dir }}/{{ item }}"
|
||||
mode: 0755
|
||||
owner: root
|
||||
group: root
|
||||
with_items:
|
||||
- alertmanager
|
||||
- amtool
|
||||
when: alertmanager_binary_local_dir | length > 0
|
||||
notify:
|
||||
- restart alertmanager
|
35
roles/alertmanager/tasks/main.yml
Normal file
|
@ -0,0 +1,35 @@
|
|||
---
|
||||
- include: preflight.yml
|
||||
tags:
|
||||
- alertmanager_install
|
||||
- alertmanager_configure
|
||||
- alertmanager_run
|
||||
|
||||
- include: install.yml
|
||||
become: true
|
||||
tags:
|
||||
- alertmanager_install
|
||||
|
||||
- import_tasks: selinux.yml
|
||||
become: true
|
||||
when: ansible_selinux.status == "enabled"
|
||||
tags:
|
||||
- alertmanager_configure
|
||||
|
||||
- include: configure.yml
|
||||
become: true
|
||||
tags:
|
||||
- alertmanager_configure
|
||||
|
||||
- name: ensure alertmanager service is started and enabled
|
||||
become: true
|
||||
systemd:
|
||||
daemon_reload: true
|
||||
name: alertmanager
|
||||
state: started
|
||||
enabled: true
|
||||
tags:
|
||||
- alertmanager_run
|
||||
|
||||
- name: Flush alertmanager handlers after run.
|
||||
meta: flush_handlers
|
135
roles/alertmanager/tasks/preflight.yml
Normal file
|
@ -0,0 +1,135 @@
|
|||
---
|
||||
- name: Assert usage of systemd as an init system
|
||||
assert:
|
||||
that: ansible_service_mgr == 'systemd'
|
||||
msg: "This role only works with systemd"
|
||||
|
||||
- name: Get systemd version
|
||||
command: systemctl --version
|
||||
changed_when: false
|
||||
check_mode: false
|
||||
register: __systemd_version
|
||||
tags:
|
||||
- skip_ansible_lint
|
||||
|
||||
- name: Set systemd version fact
|
||||
set_fact:
|
||||
alertmanager_systemd_version: "{{ __systemd_version.stdout_lines[0].split(' ')[-1] }}"
|
||||
|
||||
- block:
|
||||
- name: Get latest release
|
||||
uri:
|
||||
url: "https://api.github.com/repos/prometheus/alertmanager/releases/latest"
|
||||
method: GET
|
||||
return_content: true
|
||||
status_code: 200
|
||||
body_format: json
|
||||
user: "{{ lookup('env', 'GH_USER') | default(omit, true) }}"
|
||||
password: "{{ lookup('env', 'GH_TOKEN') | default(omit, true) }}"
|
||||
no_log: "{{ not lookup('env', 'MOLECULE_DEBUG') | bool }}"
|
||||
register: _latest_release
|
||||
until: _latest_release.status == 200
|
||||
retries: 5
|
||||
|
||||
- name: "Set alertmanager version to {{ _latest_release.json.tag_name[1:] }}"
|
||||
set_fact:
|
||||
alertmanager_version: "{{ _latest_release.json.tag_name[1:] }}"
|
||||
alertmanager_checksum_url: "https://github.com/prometheus/alertmanager/releases/download/v{{ alertmanager_version }}/sha256sums.txt"
|
||||
when:
|
||||
- alertmanager_version == "latest"
|
||||
- alertmanager_binary_local_dir | length == 0
|
||||
|
||||
- block:
|
||||
- name: "Get checksum list"
|
||||
set_fact:
|
||||
__alertmanager_checksums: "{{ lookup('url', 'https://github.com/prometheus/alertmanager/releases/download/v' + alertmanager_version + '/sha256sums.txt', wantlist=True) | list }}"
|
||||
run_once: true
|
||||
|
||||
- name: "Get checksum for {{ go_arch }} architecture"
|
||||
set_fact:
|
||||
__alertmanager_checksum: "{{ item.split(' ')[0] }}"
|
||||
with_items: "{{ __alertmanager_checksums }}"
|
||||
when:
|
||||
- "('linux-' + go_arch + '.tar.gz') in item"
|
||||
delegate_to: localhost
|
||||
when:
|
||||
- alertmanager_binary_local_dir | length == 0
|
||||
|
||||
|
||||
- name: Fail when extra config flags are duplicating ansible variables
|
||||
fail:
|
||||
msg: "Detected duplicate configuration entry. Please check your ansible variables and role README.md."
|
||||
when:
|
||||
(alertmanager_config_flags_extra['config.file'] is defined) or
|
||||
(alertmanager_config_flags_extra['storage.path'] is defined) or
|
||||
(alertmanager_config_flags_extra['web.listen-address'] is defined) or
|
||||
(alertmanager_config_flags_extra['web.external-url'] is defined)
|
||||
|
||||
- name: Fail when there are no receivers defined
|
||||
fail:
|
||||
msg: "Configure alert receivers (`alertmanager_receivers`). Otherwise alertmanager won't know where to send alerts."
|
||||
when:
|
||||
- alertmanager_config_file == 'alertmanager.yml.j2'
|
||||
- alertmanager_receivers == []
|
||||
|
||||
- name: Fail when there is no alert route defined
|
||||
fail:
|
||||
msg: "Configure alert routing (`alertmanager_route`). Otherwise alertmanager won't know how to send alerts."
|
||||
when:
|
||||
- alertmanager_config_file == 'alertmanager.yml.j2'
|
||||
- alertmanager_route == {}
|
||||
|
||||
- name: "DEPRECATION WARNING: alertmanager versions 0.15 and earlier are no longer supported and will be dropped from future releases"
|
||||
ignore_errors: true
|
||||
fail:
|
||||
msg: "Please use `alertmanager_version >= v0.16.0`"
|
||||
when: alertmanager_version is version_compare('0.16.0', '<')
|
||||
|
||||
- block:
|
||||
- name: Backward compatibility of variable [part 1]
|
||||
set_fact:
|
||||
alertmanager_config_flags_extra: "{{ alertmanager_cli_flags }}"
|
||||
|
||||
- name: "DEPRECATION WARNING: `alertmanager_cli_flags` is no longer supported and will be dropped from future releases"
|
||||
ignore_errors: true
|
||||
fail:
|
||||
msg: "Please use `alertmanager_config_flags_extra` instead of `alertmanager_cli_flags`"
|
||||
when: alertmanager_cli_flags is defined
|
||||
|
||||
- block:
|
||||
- name: Backward compatibility of variable [part 2]
|
||||
set_fact:
|
||||
alertmanager_web_listen_address: "{{ alertmanager_listen_address }}"
|
||||
|
||||
- name: "DEPRECATION WARNING: `alertmanager_listen_address` is no longer supported and will be dropped from future releases"
|
||||
ignore_errors: true
|
||||
fail:
|
||||
msg: "Please use `alertmanager_web_listen_address` instead of `alertmanager_listen_address`"
|
||||
when: alertmanager_listen_address is defined
|
||||
|
||||
- block:
|
||||
- name: Backward compatibility of variable [part 3]
|
||||
set_fact:
|
||||
alertmanager_web_external_url: "{{ alertmanager_external_url }}"
|
||||
|
||||
- name: "DEPRECATION WARNING: `alertmanager_external_url` is no longer supported and will be dropped from future releases"
|
||||
ignore_errors: true
|
||||
fail:
|
||||
msg: "Please use `alertmanager_web_external_url` instead of `alertmanager_external_url`"
|
||||
when: alertmanager_external_url is defined
|
||||
|
||||
- block:
|
||||
- name: HA config compatibility with alertmanager<0.15.0
|
||||
set_fact:
|
||||
alertmanager_cluster: "{{ alertmanager_mesh }}"
|
||||
|
||||
- name: "DEPRECATION WARNING: `alertmanager_mesh` is no longer supported and will be dropped from future releases"
|
||||
ignore_errors: true
|
||||
fail:
|
||||
msg: "Please use `alertmanager_cluster` instead of `alertmanager_mesh`"
|
||||
when: alertmanager_mesh is defined
|
||||
|
||||
- name: "`alertmanager_child_routes` is no longer supported"
|
||||
fail:
|
||||
msg: "Please move content of `alertmanager_child_routes` to `alertmanager_route.routes` as the former variable is deprecated and will be removed in future versions."
|
||||
when: alertmanager_child_routes is defined
|
39
roles/alertmanager/tasks/selinux.yml
Normal file
|
@ -0,0 +1,39 @@
|
|||
---
|
||||
- name: Install selinux python packages [RHEL]
|
||||
package:
|
||||
name:
|
||||
- "{{ ( (ansible_facts.distribution_major_version | int) < 8) | ternary('libselinux-python','python3-libselinux') }}"
|
||||
- "{{ ( (ansible_facts.distribution_major_version | int) < 8) | ternary('policycoreutils-python','python3-policycoreutils') }}"
|
||||
state: present
|
||||
register: _install_selinux_packages
|
||||
until: _install_selinux_packages is success
|
||||
retries: 5
|
||||
delay: 2
|
||||
when:
|
||||
- (ansible_distribution | lower == "redhat") or
|
||||
(ansible_distribution | lower == "centos")
|
||||
|
||||
- name: Install selinux python packages [Fedora]
|
||||
package:
|
||||
name:
|
||||
- "{{ ( (ansible_facts.distribution_major_version | int) < 29) | ternary('libselinux-python','python3-libselinux') }}"
|
||||
- "{{ ( (ansible_facts.distribution_major_version | int) < 29) | ternary('policycoreutils-python','python3-policycoreutils') }}"
|
||||
state: present
|
||||
register: _install_selinux_packages
|
||||
until: _install_selinux_packages is success
|
||||
retries: 5
|
||||
delay: 2
|
||||
|
||||
when:
|
||||
- ansible_distribution | lower == "fedora"
|
||||
|
||||
- name: Install selinux python packages [clearlinux]
|
||||
package:
|
||||
name: sysadmin-basic
|
||||
state: present
|
||||
register: _install_selinux_packages
|
||||
until: _install_selinux_packages is success
|
||||
retries: 5
|
||||
delay: 2
|
||||
when:
|
||||
- ansible_distribution | lower == "clearlinux"
|
65
roles/alertmanager/templates/alertmanager.service.j2
Normal file
|
@ -0,0 +1,65 @@
|
|||
{%- if alertmanager_version is version_compare('0.13.0', '>=') %}
|
||||
{%- set pre = '-' %}
|
||||
{%- else %}
|
||||
{%- set pre = '' %}
|
||||
{%- endif %}
|
||||
{%- if alertmanager_version is version_compare('0.15.0', '<') %}
|
||||
{%- set cluster_flag = 'mesh' %}
|
||||
{%- else %}
|
||||
{%- set cluster_flag = 'cluster' %}
|
||||
{%- endif %}
|
||||
{{ ansible_managed | comment }}
|
||||
[Unit]
|
||||
Description=Prometheus Alertmanager
|
||||
After=network-online.target
|
||||
StartLimitInterval=0
|
||||
StartLimitIntervalSec=0
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
PIDFile=/var/run/alertmanager.pid
|
||||
User=alertmanager
|
||||
Group=alertmanager
|
||||
ExecReload=/bin/kill -HUP $MAINPID
|
||||
ExecStart={{ _alertmanager_binary_install_dir }}/alertmanager \
|
||||
{% for option, value in (alertmanager_cluster.items() | sort) %}
|
||||
{% if option == "peers" %}
|
||||
{% for peer in value %}
|
||||
{{ pre }}-{{ cluster_flag }}.peer={{ peer }} \
|
||||
{% endfor %}
|
||||
{% else %}
|
||||
{{ pre }}-{{ cluster_flag }}.{{ option }}={{ value }} \
|
||||
{% endif %}
|
||||
{% endfor %}
|
||||
{{ pre }}-config.file={{ alertmanager_config_dir }}/alertmanager.yml \
|
||||
{{ pre }}-storage.path={{ alertmanager_db_dir }} \
|
||||
{{ pre }}-web.listen-address={{ alertmanager_web_listen_address }} \
|
||||
{{ pre }}-web.external-url={{ alertmanager_web_external_url }}{% for flag, flag_value in alertmanager_config_flags_extra.items() %} \
|
||||
{{ pre }}-{{ flag }}={{ flag_value }}{% endfor %}
|
||||
|
||||
SyslogIdentifier=alertmanager
|
||||
Restart=always
|
||||
RestartSec=5
|
||||
|
||||
CapabilityBoundingSet=CAP_SET_UID
|
||||
LockPersonality=true
|
||||
NoNewPrivileges=true
|
||||
MemoryDenyWriteExecute=true
|
||||
PrivateTmp=true
|
||||
ProtectHome=true
|
||||
ReadWriteDirectories={{ alertmanager_db_dir }}
|
||||
RemoveIPC=true
|
||||
RestrictSUIDSGID=true
|
||||
|
||||
{% if alertmanager_systemd_version | int >= 232 %}
|
||||
PrivateUsers=true
|
||||
ProtectControlGroups=true
|
||||
ProtectKernelModules=true
|
||||
ProtectKernelTunables=yes
|
||||
ProtectSystem=strict
|
||||
{% else %}
|
||||
ProtectSystem=full
|
||||
{% endif %}
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
56
roles/alertmanager/templates/alertmanager.yml.j2
Normal file
|
@ -0,0 +1,56 @@
|
|||
{{ ansible_managed | comment }}
|
||||
|
||||
global:
|
||||
resolve_timeout: {{ alertmanager_resolve_timeout | quote }}
|
||||
{% for key, value in alertmanager_smtp.items() %}
|
||||
smtp_{{ key }}: {{ value | quote }}
|
||||
{% endfor %}
|
||||
{% if alertmanager_slack_api_url | string | length %}
|
||||
slack_api_url: {{ alertmanager_slack_api_url | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_http_config | length %}
|
||||
http_config:
|
||||
{{ alertmanager_http_config | to_nice_yaml(indent=2) | indent(4, False) }}
|
||||
{% endif %}
|
||||
{% if alertmanager_pagerduty_url | string | length %}
|
||||
pagerduty_url: {{ alertmanager_pagerduty_url | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_opsgenie_api_key | string | length %}
|
||||
opsgenie_api_key: {{ alertmanager_opsgenie_api_key | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_opsgenie_api_url | string | length %}
|
||||
opsgenie_api_url: {{ alertmanager_opsgenie_api_url | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_victorops_api_key | string | length %}
|
||||
victorops_api_key: {{ alertmanager_victorops_api_key | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_victorops_api_url | string | length %}
|
||||
victorops_api_url: {{ alertmanager_victorops_api_url | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_hipchat_api_url | string | length %}
|
||||
hipchat_api_url: {{ alertmanager_hipchat_api_url | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_hipchat_auth_token | string | length %}
|
||||
hipchat_auth_token: {{ alertmanager_hipchat_auth_token | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_wechat_url | string | length %}
|
||||
wechat_api_url: {{ alertmanager_wechat_url | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_wechat_secret | string | length %}
|
||||
wechat_api_secret: {{ alertmanager_wechat_secret | quote }}
|
||||
{% endif %}
|
||||
{% if alertmanager_wechat_corp_id | string | length %}
|
||||
wechat_api_corp_id: {{ alertmanager_wechat_corp_id | quote }}
|
||||
{% endif %}
|
||||
templates:
|
||||
- '{{ alertmanager_config_dir }}/templates/*.tmpl'
|
||||
{% if alertmanager_receivers | length %}
|
||||
receivers:
|
||||
{{ alertmanager_receivers | to_nice_yaml(indent=2) }}
|
||||
{% endif %}
|
||||
{% if alertmanager_inhibit_rules | length %}
|
||||
inhibit_rules:
|
||||
{{ alertmanager_inhibit_rules | to_nice_yaml(indent=2) }}
|
||||
{% endif %}
|
||||
route:
|
||||
{{ alertmanager_route | to_nice_yaml(indent=2) | indent(2, False) }}
|
4
roles/alertmanager/templates/amtool.yml.j2
Normal file
|
@ -0,0 +1,4 @@
|
|||
alertmanager.url: "{{ alertmanager_amtool_config_alertmanager_url }}"
|
||||
{% if alertmanager_amtool_config_output != "" %}
|
||||
output: "{{ alertmanager_amtool_config_output }}"
|
||||
{% endif %}
|
13
roles/alertmanager/vars/main.yml
Normal file
|
@ -0,0 +1,13 @@
|
|||
---
|
||||
go_arch_map:
|
||||
i386: '386'
|
||||
x86_64: 'amd64'
|
||||
aarch64: 'arm64'
|
||||
armv7l: 'armv7'
|
||||
armv6l: 'armv6'
|
||||
|
||||
go_arch: "{{ go_arch_map[ansible_architecture] | default(ansible_architecture) }}"
|
||||
_alertmanager_binary_install_dir: '/usr/local/bin'
|
||||
|
||||
# The expected location of the amtool configuration file
|
||||
_alertmanager_amtool_config_dir: '/etc/amtool'
|
58
roles/blackbox_exporter/README.md
Normal file
|
@ -0,0 +1,58 @@
|
|||
<p><img src="http://jacobsmedia.com/wp-content/uploads/2015/08/black-box-edit.png" alt="blackbox logo" title="blackbox" align="right" height="60" /></p>
|
||||
|
||||
# Ansible Role: Blackbox Exporter
|
||||
|
||||
## Description
|
||||
|
||||
Deploy and manage [blackbox exporter](https://github.com/prometheus/blackbox_exporter) which allows blackbox probing of endpoints over HTTP, HTTPS, DNS, TCP and ICMP.
|
||||
|
||||
## Requirements
|
||||
|
||||
- Ansible >= 2.7 (It might work on previous versions, but we cannot guarantee it)
|
||||
- gnu-tar on Mac deployer host (`brew install gnu-tar`)
|
||||
|
||||
## Role Variables
|
||||
|
||||
All variables which can be overridden are stored in the [defaults/main.yml](defaults/main.yml) file as well as in the table below.
|
||||
|
||||
| Name | Default Value | Description |
|
||||
| -------------- | ------------- | -----------------------------------|
|
||||
| `blackbox_exporter_version` | 0.18.0 | Blackbox exporter package version |
|
||||
| `blackbox_exporter_web_listen_address` | 0.0.0.0:9115 | Address on which blackbox exporter will be listening |
|
||||
| `blackbox_exporter_cli_flags` | {} | Additional configuration flags passed to blackbox exporter binary at startup |
|
||||
| `blackbox_exporter_configuration_modules` | http_2xx: { prober: http, timeout: 5s, http: '' } | |
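
As a hedged illustration of overriding these defaults, the sketch below sets the listen address, one extra CLI flag, and a single probe module from a playbook. The values mirror the `alternative` molecule scenario shipped with this role; treat them as examples rather than recommendations.

```yaml
- hosts: all
  become: true
  roles:
    - cloudalchemy.blackbox-exporter
  vars:
    # Listen on localhost only instead of the default 0.0.0.0:9115
    blackbox_exporter_web_listen_address: "127.0.0.1:9000"
    # Extra flags passed to the blackbox exporter binary at startup
    blackbox_exporter_cli_flags:
      log.level: "warn"
    # With Ansible's default (non-merging) variable handling, this
    # replaces the default http_2xx module with a plain TCP probe
    blackbox_exporter_configuration_modules:
      tcp_connect:
        prober: tcp
        timeout: 5s
```

Because the modules dictionary is replaced rather than merged under Ansible's default settings, any default module you still need has to be listed again alongside your own.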
|
||||
|
||||
## Example
|
||||
|
||||
### Playbook
|
||||
|
||||
```yaml
|
||||
- hosts: all
|
||||
become: true
|
||||
roles:
|
||||
- cloudalchemy.blackbox-exporter
|
||||
```
|
||||
|
||||
### Demo site
|
||||
|
||||
We provide a demo site for a full monitoring solution based on Prometheus and Grafana. The repository with code and links to running instances is [available on GitHub](https://github.com/prometheus/demo-site) and the site is hosted on [DigitalOcean](https://digitalocean.com).
|
||||
|
||||
## Local Testing
|
||||
|
||||
The preferred way of locally testing the role is to use Docker and [molecule](https://github.com/ansible-community/molecule) (v3.x). You will have to install Docker on your system. See "Get started" for a Docker package suitable for your system. Running your tests is as simple as executing `molecule test`.
|
||||
|
||||
## Continuous Integration
|
||||
|
||||
Combining molecule and CircleCI allows us to test how new PRs will behave when used with multiple Ansible versions and multiple operating systems. This also allows us to create test scenarios for different role configurations. As a result we have a quite large test matrix, which can take more time than local testing, so please be patient.
|
||||
|
||||
## Contributing
|
||||
|
||||
See [contributor guideline](CONTRIBUTING.md).
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
See [troubleshooting](TROUBLESHOOTING.md).
|
||||
|
||||
## License
|
||||
|
||||
This project is licensed under MIT License. See [LICENSE](/LICENSE) for more details.
|
63
roles/blackbox_exporter/defaults/main.yml
Normal file
|
@ -0,0 +1,63 @@
|
|||
---
|
||||
blackbox_exporter_version: 0.18.0
|
||||
|
||||
blackbox_exporter_web_listen_address: "0.0.0.0:9115"
|
||||
|
||||
blackbox_exporter_cli_flags: {}
|
||||
# blackbox_exporter_cli_flags:
|
||||
# log.level: "warn"
|
||||
|
||||
blackbox_exporter_configuration_modules:
|
||||
http_2xx:
|
||||
prober: http
|
||||
timeout: 5s
|
||||
http:
|
||||
method: GET
|
||||
valid_status_codes: []
|
||||
# http_post_2xx:
|
||||
# prober: http
|
||||
# timeout: 5s
|
||||
# http:
|
||||
# method: POST
|
||||
# basic_auth:
|
||||
# username: "username"
|
||||
# password: "mysecret"
|
||||
# tcp_connect:
|
||||
# prober: tcp
|
||||
# timeout: 5s
|
||||
# pop3s_banner:
|
||||
# prober: tcp
|
||||
# tcp:
|
||||
# query_response:
|
||||
# - expect: "^+OK"
|
||||
# tls: true
|
||||
# tls_config:
|
||||
# insecure_skip_verify: false
|
||||
# ssh_banner:
|
||||
# prober: tcp
|
||||
# timeout: 5s
|
||||
# tcp:
|
||||
# query_response:
|
||||
# - expect: "^SSH-2.0-"
|
||||
# irc_banner:
|
||||
# prober: tcp
|
||||
# timeout: 5s
|
||||
# tcp:
|
||||
# query_response:
|
||||
# - send: "NICK prober"
|
||||
# - send: "USER prober prober prober :prober"
|
||||
# - expect: "PING :([^ ]+)"
|
||||
# send: "PONG ${1}"
|
||||
# - expect: "^:[^ ]+ 001"
|
||||
# icmp_test:
|
||||
# prober: icmp
|
||||
# timeout: 5s
|
||||
# icmp:
|
||||
# preferred_ip_protocol: ip4
|
||||
# dns_test:
|
||||
# prober: dns
|
||||
# timeout: 5s
|
||||
# dns:
|
||||
# preferred_ip_protocol: ip6
|
||||
# validate_answer_rrs:
|
||||
# fail_if_matches_regexp: [test]
|
13
roles/blackbox_exporter/handlers/main.yml
Normal file
|
@ -0,0 +1,13 @@
|
|||
---
|
||||
- name: restart blackbox exporter
|
||||
become: true
|
||||
systemd:
|
||||
daemon_reload: true
|
||||
name: blackbox_exporter
|
||||
state: restarted
|
||||
|
||||
- name: reload blackbox exporter
|
||||
become: true
|
||||
systemd:
|
||||
name: blackbox_exporter
|
||||
state: reloaded
|
33
roles/blackbox_exporter/meta/main.yml
Normal file
|
@ -0,0 +1,33 @@
|
|||
---
|
||||
galaxy_info:
|
||||
author: Prometheus Community
|
||||
description: Prometheus Blackbox Exporter
|
||||
license: Apache
|
||||
company: none
|
||||
min_ansible_version: "2.7"
|
||||
platforms:
|
||||
- name: Ubuntu
|
||||
versions:
|
||||
- bionic
|
||||
- xenial
|
||||
- name: Debian
|
||||
versions:
|
||||
- stretch
|
||||
- buster
|
||||
- name: EL
|
||||
versions:
|
||||
- 7
|
||||
- 8
|
||||
- name: Fedora
|
||||
versions:
|
||||
- 30
|
||||
- 31
|
||||
galaxy_tags:
|
||||
- exporter
|
||||
- monitoring
|
||||
- prometheus
|
||||
- metrics
|
||||
- blackbox
|
||||
- probe
|
||||
|
||||
dependencies: []
|
70
roles/blackbox_exporter/molecule/alternative/molecule.yml
Normal file
|
@ -0,0 +1,70 @@
|
|||
---
|
||||
dependency:
|
||||
name: galaxy
|
||||
driver:
|
||||
name: docker
|
||||
platforms:
|
||||
- name: bionic
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-18.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: xenial
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-16.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: stretch
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-9
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: buster
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-10
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos7
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-7
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos8
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-8
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
- name: fedora
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:fedora-30
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
provisioner:
|
||||
name: ansible
|
||||
playbooks:
|
||||
prepare: prepare.yml
|
||||
converge: playbook.yml
|
||||
inventory:
|
||||
group_vars:
|
||||
python3:
|
||||
ansible_python_interpreter: /usr/bin/python3
|
||||
verifier:
|
||||
name: testinfra
|
14
roles/blackbox_exporter/molecule/alternative/playbook.yml
Normal file
|
@ -0,0 +1,14 @@
|
|||
---
|
||||
- name: Run role
|
||||
hosts: all
|
||||
any_errors_fatal: true
|
||||
roles:
|
||||
- ansible-blackbox-exporter
|
||||
vars:
|
||||
blackbox_exporter_web_listen_address: "127.0.0.1:9000"
|
||||
blackbox_exporter_cli_flags:
|
||||
log.level: "warn"
|
||||
blackbox_exporter_configuration_modules:
|
||||
tcp_connect:
|
||||
prober: tcp
|
||||
timeout: 5s
|
5
roles/blackbox_exporter/molecule/alternative/prepare.yml
Normal file
|
@ -0,0 +1,5 @@
|
|||
---
|
||||
- name: Prepare
|
||||
hosts: all
|
||||
gather_facts: false
|
||||
tasks: []
|
|
@ -0,0 +1,28 @@
|
|||
import pytest
|
||||
import os
|
||||
import testinfra.utils.ansible_runner
|
||||
|
||||
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
|
||||
os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
|
||||
|
||||
|
||||
@pytest.mark.parametrize("files", [
|
||||
"/etc/blackbox_exporter.yml",
|
||||
"/etc/systemd/system/blackbox_exporter.service",
|
||||
"/usr/local/bin/blackbox_exporter"
|
||||
])
|
||||
def test_files(host, files):
|
||||
f = host.file(files)
|
||||
assert f.exists
|
||||
assert f.is_file
|
||||
|
||||
|
||||
def test_service(host):
|
||||
s = host.service("blackbox_exporter")
|
||||
assert s.is_running
|
||||
# assert s.is_enabled
|
||||
|
||||
|
||||
def test_socket(host):
|
||||
s = host.socket("tcp://127.0.0.1:9000")
|
||||
assert s.is_listening
|
70
roles/blackbox_exporter/molecule/default/molecule.yml
Normal file
|
@ -0,0 +1,70 @@
|
|||
---
|
||||
dependency:
|
||||
name: galaxy
|
||||
driver:
|
||||
name: docker
|
||||
platforms:
|
||||
- name: bionic
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-18.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: xenial
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-16.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: stretch
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-9
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: buster
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-10
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos7
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-7
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos8
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-8
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
- name: fedora
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:fedora-30
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
provisioner:
|
||||
name: ansible
|
||||
playbooks:
|
||||
prepare: prepare.yml
|
||||
converge: playbook.yml
|
||||
inventory:
|
||||
group_vars:
|
||||
python3:
|
||||
ansible_python_interpreter: /usr/bin/python3
|
||||
verifier:
|
||||
name: testinfra
|
5
roles/blackbox_exporter/molecule/default/playbook.yml
Normal file
|
@ -0,0 +1,5 @@
|
|||
---
|
||||
- hosts: all
|
||||
any_errors_fatal: true
|
||||
roles:
|
||||
- ansible-blackbox-exporter
|
5
roles/blackbox_exporter/molecule/default/prepare.yml
Normal file
|
@ -0,0 +1,5 @@
|
|||
---
|
||||
- name: Prepare
|
||||
hosts: all
|
||||
gather_facts: false
|
||||
tasks: []
|
|
@ -0,0 +1,28 @@
|
|||
import pytest
|
||||
import os
|
||||
import testinfra.utils.ansible_runner
|
||||
|
||||
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
|
||||
os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
|
||||
|
||||
|
||||
@pytest.mark.parametrize("files", [
|
||||
"/etc/blackbox_exporter.yml",
|
||||
"/etc/systemd/system/blackbox_exporter.service",
|
||||
"/usr/local/bin/blackbox_exporter"
|
||||
])
|
||||
def test_files(host, files):
|
||||
f = host.file(files)
|
||||
assert f.exists
|
||||
assert f.is_file
|
||||
|
||||
|
||||
def test_service(host):
|
||||
s = host.service("blackbox_exporter")
|
||||
assert s.is_running
|
||||
# assert s.is_enabled
|
||||
|
||||
|
||||
def test_socket(host):
|
||||
s = host.socket("tcp://0.0.0.0:9115")
|
||||
assert s.is_listening
|
20
roles/blackbox_exporter/tasks/configure.yml
Normal file
|
@ -0,0 +1,20 @@
|
|||
---
|
||||
- name: create systemd service unit
|
||||
template:
|
||||
src: blackbox_exporter.service.j2
|
||||
dest: /etc/systemd/system/blackbox_exporter.service
|
||||
owner: root
|
||||
group: root
|
||||
mode: 0644
|
||||
notify:
|
||||
- restart blackbox exporter
|
||||
|
||||
- name: configure blackbox exporter
|
||||
template:
|
||||
src: blackbox_exporter.yml.j2
|
||||
dest: /etc/blackbox_exporter.yml
|
||||
owner: blackbox-exp
|
||||
group: blackbox-exp
|
||||
mode: 0644
|
||||
notify:
|
||||
- reload blackbox exporter
|
60
roles/blackbox_exporter/tasks/install.yml
Normal file
|
@ -0,0 +1,60 @@
|
|||
---
|
||||
- name: create blackbox_exporter system group
|
||||
group:
|
||||
name: blackbox-exp
|
||||
system: true
|
||||
state: present
|
||||
|
||||
- name: create blackbox_exporter system user
|
||||
user:
|
||||
name: blackbox-exp
|
||||
system: true
|
||||
shell: "/usr/sbin/nologin"
|
||||
group: blackbox-exp
|
||||
createhome: false
|
||||
|
||||
- name: download blackbox exporter binary to local folder
|
||||
become: false
|
||||
unarchive:
|
||||
src: "https://github.com/prometheus/blackbox_exporter/releases/download/v{{ blackbox_exporter_version }}/blackbox_exporter-{{ blackbox_exporter_version }}.linux-{{ go_arch_map[ansible_architecture] | default(ansible_architecture) }}.tar.gz"
|
||||
dest: "/tmp"
|
||||
remote_src: true
|
||||
creates: "/tmp/blackbox_exporter-{{ blackbox_exporter_version }}.linux-{{ go_arch_map[ansible_architecture] | default(ansible_architecture) }}/blackbox_exporter"
|
||||
register: _download_binary
|
||||
until: _download_binary is succeeded
|
||||
retries: 5
|
||||
delay: 2
|
||||
delegate_to: localhost
|
||||
check_mode: false
|
||||
|
||||
- name: propagate blackbox exporter binary
|
||||
copy:
|
||||
src: "/tmp/blackbox_exporter-{{ blackbox_exporter_version }}.linux-{{ go_arch_map[ansible_architecture] | default(ansible_architecture) }}/blackbox_exporter"
|
||||
dest: "/usr/local/bin/blackbox_exporter"
|
||||
mode: 0750
|
||||
owner: blackbox-exp
|
||||
group: blackbox-exp
|
||||
notify:
|
||||
- restart blackbox exporter
|
||||
|
||||
- name: Install libcap on Debian systems
|
||||
package:
|
||||
name: "libcap2-bin"
|
||||
state: present
|
||||
register: _download_packages
|
||||
until: _download_packages is succeeded
|
||||
retries: 5
|
||||
delay: 2
|
||||
when: ansible_os_family | lower == "debian"
|
||||
|
||||
- name: Ensure blackbox exporter binary has cap_net_raw capability
|
||||
capabilities:
|
||||
path: '/usr/local/bin/blackbox_exporter'
|
||||
capability: cap_net_raw+ep
|
||||
state: present
|
||||
when: not ansible_check_mode
|
||||
|
||||
- name: Check Debug Message
|
||||
debug:
|
||||
msg: "The capabilities module is skipped during check mode, as the file may not exist, causing execution to fail."
|
||||
when: ansible_check_mode
|
26
roles/blackbox_exporter/tasks/main.yml
Normal file
|
@ -0,0 +1,26 @@
|
|||
---
|
||||
- include: preflight.yml
|
||||
tags:
|
||||
- blackbox_exporter_install
|
||||
- blackbox_exporter_configure
|
||||
- blackbox_exporter_run
|
||||
|
||||
- include: install.yml
|
||||
become: true
|
||||
tags:
|
||||
- blackbox_exporter_install
|
||||
|
||||
- include: configure.yml
|
||||
become: true
|
||||
tags:
|
||||
- blackbox_exporter_configure
|
||||
|
||||
- name: ensure blackbox_exporter service is started and enabled
|
||||
become: true
|
||||
systemd:
|
||||
daemon_reload: true
|
||||
name: blackbox_exporter
|
||||
state: started
|
||||
enabled: true
|
||||
tags:
|
||||
- blackbox_exporter_run
|
22
roles/blackbox_exporter/tasks/preflight.yml
Normal file
|
@ -0,0 +1,22 @@
|
|||
---
|
||||
- name: Assert usage of systemd as an init system
|
||||
assert:
|
||||
that: ansible_service_mgr == 'systemd'
|
||||
msg: "This role only works with systemd"
|
||||
|
||||
- name: Get systemd version
|
||||
command: systemctl --version
|
||||
changed_when: false
|
||||
check_mode: false
|
||||
register: __systemd_version
|
||||
tags:
|
||||
- skip_ansible_lint
|
||||
|
||||
- name: Set systemd version fact
|
||||
set_fact:
|
||||
blackbox_exporter_systemd_version: "{{ __systemd_version.stdout_lines[0] | regex_replace('^systemd\\s(\\d+).*$', '\\1') }}"
|
||||
|
||||
- name: Naive assertion of proper listen address
|
||||
assert:
|
||||
that:
|
||||
- "':' in blackbox_exporter_web_listen_address"
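The "Set systemd version fact" and "Naive assertion of proper listen address" tasks above can be sketched in plain Python. This is only an illustration of what the two expressions compute: the function names and the sample `systemctl --version` line are hypothetical, and the regular expression is the one passed to `regex_replace` in the task.

```python
import re

# Same pattern the "Set systemd version fact" task passes to regex_replace.
SYSTEMD_VERSION_PATTERN = r"^systemd\s(\d+).*$"


def parse_systemd_version(first_line: str) -> int:
    """Return the major systemd version from the first line of `systemctl --version`."""
    return int(re.sub(SYSTEMD_VERSION_PATTERN, r"\1", first_line))


def listen_address_looks_valid(address: str) -> bool:
    """Mirror the role's naive check: the address only has to contain a colon."""
    return ":" in address


# Hypothetical sample line; real output varies by distribution.
print(parse_systemd_version("systemd 237"))
print(listen_address_looks_valid("0.0.0.0:9115"))
```

The check is deliberately naive: it only catches a value given without a `host:port` separator and does not validate the host or the port.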
|
|
@ -0,0 +1,45 @@
|
|||
{{ ansible_managed | comment }}
|
||||
[Unit]
|
||||
Description=Blackbox Exporter
|
||||
After=network-online.target
|
||||
StartLimitInterval=0
|
||||
StartLimitIntervalSec=0
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
User=blackbox-exp
|
||||
Group=blackbox-exp
|
||||
PermissionsStartOnly=true
|
||||
ExecReload=/bin/kill -HUP $MAINPID
|
||||
ExecStart=/usr/local/bin/blackbox_exporter \
|
||||
--config.file=/etc/blackbox_exporter.yml \
|
||||
{% for flag, flag_value in blackbox_exporter_cli_flags.items() -%}
|
||||
--{{ flag }}={{ flag_value }} \
|
||||
{% endfor -%}
|
||||
--web.listen-address={{ blackbox_exporter_web_listen_address }}
|
||||
|
||||
SyslogIdentifier=blackbox_exporter
|
||||
KillMode=process
|
||||
Restart=always
|
||||
RestartSec=5
|
||||
|
||||
LockPersonality=true
|
||||
NoNewPrivileges=true
|
||||
MemoryDenyWriteExecute=true
|
||||
PrivateTmp=true
|
||||
ProtectHome=true
|
||||
RemoveIPC=true
|
||||
RestrictSUIDSGID=true
|
||||
|
||||
AmbientCapabilities=CAP_NET_RAW
|
||||
{% if blackbox_exporter_systemd_version | int >= 232 %}
|
||||
ProtectControlGroups=true
|
||||
ProtectKernelModules=true
|
||||
ProtectKernelTunables=yes
|
||||
ProtectSystem=strict
|
||||
{% else %}
|
||||
ProtectSystem=full
|
||||
{% endif %}
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
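To make the templating in the unit file above easier to follow, here is a minimal Python sketch of how the `ExecStart` line and the `ProtectSystem` value are derived. It is not the role's implementation (the role renders the Jinja2 template); the function names, the two-space indentation, and the `log.level` flag used in the usage line are assumptions for illustration.

```python
def render_exec_start(cli_flags: dict, listen_address: str) -> str:
    """Build the ExecStart line: fixed config flag, extra flags, then the listen address."""
    lines = [
        "ExecStart=/usr/local/bin/blackbox_exporter \\",
        "  --config.file=/etc/blackbox_exporter.yml \\",
    ]
    # One "--flag=value" continuation line per entry of blackbox_exporter_cli_flags.
    for flag, value in cli_flags.items():
        lines.append(f"  --{flag}={value} \\")
    # The listen address is always last, so it carries no trailing backslash.
    lines.append(f"  --web.listen-address={listen_address}")
    return "\n".join(lines)


def protect_system_mode(systemd_version: int) -> str:
    """systemd 232 and newer get ProtectSystem=strict; older versions fall back to full."""
    return "strict" if systemd_version >= 232 else "full"


# Example flag chosen for illustration only.
print(render_exec_start({"log.level": "debug"}, "0.0.0.0:9115"))
print(protect_system_mode(229))
```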
|
|
@ -0,0 +1,2 @@
|
|||
modules:
|
||||
{{ blackbox_exporter_configuration_modules | to_nice_yaml(indent=2) | indent(2,False) }}
|
7
roles/blackbox_exporter/vars/main.yml
Normal file
|
@ -0,0 +1,7 @@
|
|||
---
|
||||
go_arch_map:
|
||||
i386: '386'
|
||||
x86_64: 'amd64'
|
||||
aarch64: 'arm64'
|
||||
armv7l: 'armv7'
|
||||
armv6l: 'armv6'
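The `go_arch_map` lookup is combined with `default(ansible_architecture)` in `tasks/install.yml` to build the release download URL. A rough Python equivalent is sketched below; the function name and the version number in the usage line are hypothetical, while the mapping and the URL layout are copied from the role.

```python
# Mapping copied from vars/main.yml: machine architecture -> Go architecture name.
GO_ARCH_MAP = {
    "i386": "386",
    "x86_64": "amd64",
    "aarch64": "arm64",
    "armv7l": "armv7",
    "armv6l": "armv6",
}


def release_url(version: str, machine: str) -> str:
    """Build the blackbox_exporter tarball URL for a version and machine architecture."""
    # Unknown architectures fall back to the raw value, like `| default(...)` in the task.
    go_arch = GO_ARCH_MAP.get(machine, machine)
    return (
        "https://github.com/prometheus/blackbox_exporter/releases/download/"
        f"v{version}/blackbox_exporter-{version}.linux-{go_arch}.tar.gz"
    )


# The version here is an example value, not the role default.
print(release_url("0.18.0", "x86_64"))
```

The fallback means an architecture missing from the map still produces a URL, which then only works if upstream publishes a tarball under that exact name.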
|
0
roles/blackbox_exporter/vars/redhat-7.yml
Normal file
0
roles/blackbox_exporter/vars/ubuntu.yml
Normal file
99
roles/node_exporter/README.md
Normal file
|
@ -0,0 +1,99 @@
|
|||
<p><img src="https://www.circonus.com/wp-content/uploads/2015/03/sol-icon-itOps.png" alt="graph logo" title="graph" align="right" height="60" /></p>
|
||||
|
||||
# Ansible Role: node exporter
|
||||
|
||||
## Warning
|
||||
|
||||
Due to limitations of galaxy.ansible.com we had to move the role to https://galaxy.ansible.com/cloudalchemy/node_exporter and use `_` instead of `-` in role name. This is a breaking change and unfortunately, it affects all versions of node_exporter role as ansible galaxy doesn't offer any form of redirection. We are sorry for the inconvenience.
|
||||
|
||||
## Description
|
||||
|
||||
Deploy prometheus [node exporter](https://github.com/prometheus/node_exporter) using ansible.
|
||||
|
||||
## Requirements
|
||||
|
||||
- Ansible >= 2.7 (It might work on previous versions, but we cannot guarantee it)
|
||||
- gnu-tar on Mac deployer host (`brew install gnu-tar`)
|
||||
- Passlib is required when using the basic authentication feature (`pip install passlib[bcrypt]`)
|
||||
|
||||
## Role Variables
|
||||
|
||||
All variables which can be overridden are stored in [defaults/main.yml](defaults/main.yml) and are listed in the table below.
|
||||
|
||||
| Name | Default Value | Description |
|
||||
| -------------- | ------------- | -----------------------------------|
|
||||
| `node_exporter_version` | 1.1.2 | Node exporter package version. Also accepts `latest` as a value. |
|
||||
| `node_exporter_binary_local_dir` | "" | Enables the use of local packages instead of those distributed on github. The parameter may be set to a directory where the `node_exporter` binary is stored on the host where ansible is run. This overrides the `node_exporter_version` parameter |
|
||||
| `node_exporter_web_listen_address` | "0.0.0.0:9100" | Address on which node exporter will listen |
|
||||
| `node_exporter_web_telemetry_path` | "/metrics" | Path under which to expose metrics |
|
||||
| `node_exporter_enabled_collectors` | ```["systemd",{textfile: {directory: "{{node_exporter_textfile_dir}}"}}]``` | List of dicts defining additionally enabled collectors and their configuration. It adds collectors to [those enabled by default](https://github.com/prometheus/node_exporter#enabled-by-default). |
|
||||
| `node_exporter_disabled_collectors` | [] | List of disabled collectors. By default node_exporter disables collectors listed [here](https://github.com/prometheus/node_exporter#disabled-by-default). |
|
||||
| `node_exporter_textfile_dir` | "/var/lib/node_exporter" | Directory used by the [Textfile Collector](https://github.com/prometheus/node_exporter#textfile-collector). To write metrics to this directory, users must be in the `node-exp` system group. __Note__: See the TROUBLESHOOTING.md guide for more information. |
|
||||
| `node_exporter_tls_server_config` | {} | Configuration for TLS authentication. Keys and values are the same as in [node_exporter docs](https://github.com/prometheus/node_exporter/blob/master/https/README.md#sample-config). |
|
||||
| `node_exporter_http_server_config` | {} | Config for HTTP/2 support. Keys and values are the same as in [node_exporter docs](https://github.com/prometheus/node_exporter/blob/master/https/README.md#sample-config). |
|
||||
| `node_exporter_basic_auth_users` | {} | Dictionary of users and passwords for basic authentication. Passwords are automatically hashed with bcrypt. |
|
||||
|
||||
## Example
|
||||
|
||||
### Playbook
|
||||
|
||||
Use it in a playbook as follows:
|
||||
```yaml
|
||||
- hosts: all
|
||||
roles:
|
||||
- cloudalchemy.node_exporter
|
||||
```
|
||||
|
||||
### TLS config
|
||||
|
||||
Before running the node_exporter role, the user needs to provision their own certificate and key.
|
||||
```yaml
|
||||
- hosts: all
|
||||
pre_tasks:
|
||||
- name: Create node_exporter cert dir
|
||||
file:
|
||||
path: "/etc/node_exporter"
|
||||
state: directory
|
||||
owner: root
|
||||
group: root
|
||||
|
||||
- name: Create cert and key
|
||||
openssl_certificate:
|
||||
path: /etc/node_exporter/tls.cert
|
||||
csr_path: /etc/node_exporter/tls.csr
|
||||
privatekey_path: /etc/node_exporter/tls.key
|
||||
provider: selfsigned
|
||||
roles:
|
||||
- cloudalchemy.node_exporter
|
||||
vars:
|
||||
node_exporter_tls_server_config:
|
||||
cert_file: /etc/node_exporter/tls.cert
|
||||
key_file: /etc/node_exporter/tls.key
|
||||
node_exporter_basic_auth_users:
|
||||
randomuser: examplepassword
|
||||
```
|
||||
|
||||
|
||||
### Demo site
|
||||
|
||||
We provide an example site that demonstrates a full monitoring solution based on prometheus and grafana. The repository with code and links to running instances is [available on github](https://github.com/prometheus/demo-site) and the site is hosted on [DigitalOcean](https://digitalocean.com).
|
||||
|
||||
## Local Testing
|
||||
|
||||
The preferred way of locally testing the role is to use Docker and [molecule](https://github.com/ansible-community/molecule) (v3.x). You will have to install Docker on your system. See "Get started" for a Docker package suitable for your system. Running your tests is as simple as executing `molecule test`.
|
||||
|
||||
## Continuous Integration
|
||||
|
||||
Combining molecule and CircleCI allows us to test how new PRs will behave when used with multiple ansible versions and multiple operating systems. This also allows us to create test scenarios for different role configurations. As a result, we have quite a large test matrix which can take more time than local testing, so please be patient.
|
||||
|
||||
## Contributing
|
||||
|
||||
See [contributor guideline](CONTRIBUTING.md).
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
See [troubleshooting](TROUBLESHOOTING.md).
|
||||
|
||||
## License
|
||||
|
||||
This project is licensed under MIT License. See [LICENSE](/LICENSE) for more details.
|
43
roles/node_exporter/TROUBLESHOOTING.md
Normal file
|
@ -0,0 +1,43 @@
|
|||
# Troubleshooting
|
||||
|
||||
## Bad requests (HTTP 400)
|
||||
|
||||
This role downloads checksums from the GitHub project to verify the integrity of artifacts installed on your servers. When downloading the checksums, a "bad request" error might occur.
|
||||
|
||||
This happens in environments which (knowingly or unknowingly) use the [netrc mechanism](https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html) to auto-login into servers.
|
||||
|
||||
Unless netrc is needed by your playbook and ansible roles, please unset the var like so:
|
||||
|
||||
```
|
||||
$ NETRC= ansible-playbook ...
|
||||
```
|
||||
|
||||
Or:
|
||||
|
||||
```
|
||||
$ export NETRC=
|
||||
$ ansible-playbook ...
|
||||
```
|
||||
|
||||
## node_exporter doesn't report data from textfile collector
|
||||
|
||||
There are three potential reasons why node_exporter doesn't pick up data:
|
||||
|
||||
1. Duplicated metrics across multiple files.
|
||||
2. File is not readable by node_exporter process.
|
||||
3. Textfile collector is not enabled.
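The first cause in the list above can be checked with a short script. The sketch below is a simplified illustration, not part of the role: it assumes the collector reads `*.prom` files, treats "duplicated" as the same metric name and label set appearing in more than one file, and uses a deliberately simple line parser. The function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def series_key(line: str) -> str:
    """Return the metric name plus label set of one sample line (simplified parsing)."""
    if "}" in line:
        return line[: line.rindex("}") + 1]
    return line.split()[0]


def find_duplicate_series(directory: str) -> dict:
    """Map each series found in more than one *.prom file to the files containing it."""
    seen = defaultdict(set)
    for path in sorted(Path(directory).glob("*.prom")):
        for raw in path.read_text().splitlines():
            line = raw.strip()
            # Skip blank lines and HELP/TYPE comments.
            if not line or line.startswith("#"):
                continue
            seen[series_key(line)].add(path.name)
    return {key: sorted(files) for key, files in seen.items() if len(files) > 1}
```

Calling `find_duplicate_series("/var/lib/node_exporter")` on the default textfile directory returns an empty dictionary when no series is shared between files.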
|
||||
|
||||
Solving the first possibility is out of scope for this role, as the data is created somewhere else. When creating that data, ensure
|
||||
files are readable by the `node-exp` user. To get access to the directory with the files, your process needs to be in the `node-exp`
|
||||
group.
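One way to produce readable files is sketched below. It assumes the collector only reads `*.prom` files, so a temporary file with a different suffix is ignored until it is renamed into place. The function name and the file name used afterwards are hypothetical, and the writing process still needs write access to the directory through the `node-exp` group.

```python
import os
import tempfile


def write_metrics(directory: str, filename: str, body: str) -> str:
    """Write `body` to directory/filename atomically and make it world-readable."""
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(body)
        # mkstemp creates the file with mode 0600; widen it so node_exporter can read it.
        os.chmod(tmp_path, 0o644)
        final_path = os.path.join(directory, filename)
        # Rename within the same directory so readers never see a half-written file.
        os.replace(tmp_path, final_path)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return final_path
```

For example, `write_metrics("/var/lib/node_exporter", "demo.prom", "demo_metric 1\n")` would publish a single sample in the default textfile directory.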
|
||||
|
||||
Lastly, a misconfigured role can also lead to data not being picked up. Check that the `node_exporter` textfile
|
||||
collector is enabled in `node_exporter_enabled_collectors` as follows:
|
||||
|
||||
```yaml
|
||||
node_exporter_enabled_collectors:
|
||||
- textfile:
|
||||
directory: "{{ node_exporter_textfile_dir }}"
|
||||
```
|
||||
|
||||
__Note__: the `node_exporter_textfile_dir` variable only creates the directory; it does not enable the collector.
|
28
roles/node_exporter/defaults/main.yml
Normal file
|
@ -0,0 +1,28 @@
|
|||
---
|
||||
node_exporter_version: 1.1.2
|
||||
node_exporter_binary_local_dir: ""
|
||||
node_exporter_web_listen_address: "0.0.0.0:9100"
|
||||
node_exporter_web_telemetry_path: "/metrics"
|
||||
|
||||
node_exporter_textfile_dir: "/var/lib/node_exporter"
|
||||
|
||||
node_exporter_tls_server_config: {}
|
||||
|
||||
node_exporter_http_server_config: {}
|
||||
|
||||
node_exporter_basic_auth_users: {}
|
||||
|
||||
node_exporter_enabled_collectors:
|
||||
- systemd
|
||||
- textfile:
|
||||
directory: "{{ node_exporter_textfile_dir }}"
|
||||
# - filesystem:
|
||||
# ignored-mount-points: "^/(sys|proc|dev)($|/)"
|
||||
# ignored-fs-types: "^(sys|proc|auto)fs$"
|
||||
|
||||
node_exporter_disabled_collectors: []
|
||||
|
||||
# Internal variables.
|
||||
_node_exporter_binary_install_dir: "/usr/local/bin"
|
||||
_node_exporter_system_group: "node-exp"
|
||||
_node_exporter_system_user: "{{ _node_exporter_system_group }}"
|
9
roles/node_exporter/handlers/main.yml
Normal file
|
@ -0,0 +1,9 @@
|
|||
---
|
||||
- name: restart node_exporter
|
||||
become: true
|
||||
systemd:
|
||||
daemon_reload: true
|
||||
name: node_exporter
|
||||
state: restarted
|
||||
when:
|
||||
- not ansible_check_mode
|
32
roles/node_exporter/meta/main.yml
Normal file
|
@ -0,0 +1,32 @@
|
|||
---
|
||||
galaxy_info:
|
||||
author: Prometheus Community
|
||||
description: Prometheus Node Exporter
|
||||
license: Apache
|
||||
company: none
|
||||
min_ansible_version: "2.7"
|
||||
platforms:
|
||||
- name: Ubuntu
|
||||
versions:
|
||||
- bionic
|
||||
- xenial
|
||||
- name: Debian
|
||||
versions:
|
||||
- stretch
|
||||
- buster
|
||||
- name: EL
|
||||
versions:
|
||||
- 7
|
||||
- 8
|
||||
- name: Fedora
|
||||
versions:
|
||||
- 30
|
||||
- 31
|
||||
galaxy_tags:
|
||||
- monitoring
|
||||
- prometheus
|
||||
- exporter
|
||||
- metrics
|
||||
- system
|
||||
|
||||
dependencies: []
|
70
roles/node_exporter/molecule/alternative/molecule.yml
Normal file
|
@ -0,0 +1,70 @@
|
|||
---
|
||||
dependency:
|
||||
name: galaxy
|
||||
driver:
|
||||
name: docker
|
||||
platforms:
|
||||
- name: bionic
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-18.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: xenial
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-16.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: stretch
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-9
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: buster
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-10
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos7
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-7
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos8
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-8
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
- name: fedora
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:fedora-30
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
provisioner:
|
||||
name: ansible
|
||||
playbooks:
|
||||
prepare: prepare.yml
|
||||
converge: playbook.yml
|
||||
inventory:
|
||||
group_vars:
|
||||
python3:
|
||||
ansible_python_interpreter: /usr/bin/python3
|
||||
verifier:
|
||||
name: testinfra
|
38
roles/node_exporter/molecule/alternative/playbook.yml
Normal file
|
@ -0,0 +1,38 @@
|
|||
---
|
||||
- name: Run role
|
||||
hosts: all
|
||||
any_errors_fatal: true
|
||||
roles:
|
||||
- cloudalchemy.node_exporter
|
||||
pre_tasks:
|
||||
- name: Create node_exporter cert dir
|
||||
file:
|
||||
path: "{{ node_exporter_tls_server_config.cert_file | dirname }}"
|
||||
state: directory
|
||||
owner: root
|
||||
group: root
|
||||
- name: Copy cert and key
|
||||
copy:
|
||||
src: "{{ item.src }}"
|
||||
dest: "{{ item.dest }}"
|
||||
with_items:
|
||||
- src: "/tmp/tls.cert"
|
||||
dest: "{{ node_exporter_tls_server_config.cert_file }}"
|
||||
- src: "/tmp/tls.key"
|
||||
dest: "{{ node_exporter_tls_server_config.key_file }}"
|
||||
vars:
|
||||
node_exporter_binary_local_dir: "/tmp/node_exporter-linux-amd64"
|
||||
node_exporter_web_listen_address: "127.0.0.1:8080"
|
||||
node_exporter_textfile_dir: ""
|
||||
node_exporter_enabled_collectors:
|
||||
- entropy
|
||||
node_exporter_disabled_collectors:
|
||||
- diskstats
|
||||
|
||||
node_exporter_tls_server_config:
|
||||
cert_file: /etc/node_exporter/tls.cert
|
||||
key_file: /etc/node_exporter/tls.key
|
||||
node_exporter_http_server_config:
|
||||
http2: true
|
||||
node_exporter_basic_auth_users:
|
||||
randomuser: examplepassword
|
57
roles/node_exporter/molecule/alternative/prepare.yml
Normal file
|
@ -0,0 +1,57 @@
|
|||
---
|
||||
- name: Prepare
|
||||
hosts: localhost
|
||||
gather_facts: false
|
||||
vars:
|
||||
go_arch: amd64
|
||||
node_exporter_version: 1.0.0
|
||||
tasks:
|
||||
- name: Download node_exporter binary to local folder
|
||||
become: false
|
||||
get_url:
|
||||
url: "https://github.com/prometheus/node_exporter/releases/download/v{{ node_exporter_version }}/node_exporter-{{ node_exporter_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
dest: "/tmp/node_exporter-{{ node_exporter_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
register: _download_binary
|
||||
until: _download_binary is succeeded
|
||||
retries: 5
|
||||
delay: 2
|
||||
run_once: true
|
||||
check_mode: false
|
||||
|
||||
- name: Unpack node_exporter binary
|
||||
become: false
|
||||
unarchive:
|
||||
src: "/tmp/node_exporter-{{ node_exporter_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
dest: "/tmp"
|
||||
creates: "/tmp/node_exporter-{{ node_exporter_version }}.linux-{{ go_arch }}/node_exporter"
|
||||
run_once: true
|
||||
check_mode: false
|
||||
|
||||
- name: link to node_exporter binaries directory
|
||||
become: false
|
||||
file:
|
||||
src: "/tmp/node_exporter-{{ node_exporter_version }}.linux-amd64"
|
||||
dest: "/tmp/node_exporter-linux-amd64"
|
||||
state: link
|
||||
run_once: true
|
||||
check_mode: false
|
||||
|
||||
- name: install pyOpenSSL for certificate generation
|
||||
pip:
|
||||
name: "pyOpenSSL"
|
||||
|
||||
- name: Create private key
|
||||
openssl_privatekey:
|
||||
path: "/tmp/tls.key"
|
||||
|
||||
- name: Create CSR
|
||||
openssl_csr:
|
||||
path: "/tmp/tls.csr"
|
||||
privatekey_path: "/tmp/tls.key"
|
||||
|
||||
- name: Create certificate
|
||||
openssl_certificate:
|
||||
path: "/tmp/tls.cert"
|
||||
csr_path: "/tmp/tls.csr"
|
||||
privatekey_path: "/tmp/tls.key"
|
||||
provider: selfsigned
|
|
@ -0,0 +1,29 @@
|
|||
import os
|
||||
import testinfra.utils.ansible_runner
|
||||
|
||||
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
|
||||
os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
|
||||
|
||||
|
||||
def test_directories(host):
|
||||
dirs = [
|
||||
"/var/lib/node_exporter"
|
||||
]
|
||||
for dir in dirs:
|
||||
d = host.file(dir)
|
||||
assert not d.exists
|
||||
|
||||
|
||||
def test_service(host):
|
||||
s = host.service("node_exporter")
|
||||
# assert s.is_enabled
|
||||
assert s.is_running
|
||||
|
||||
|
||||
def test_socket(host):
|
||||
sockets = [
|
||||
"tcp://127.0.0.1:8080"
|
||||
]
|
||||
for socket in sockets:
|
||||
s = host.socket(socket)
|
||||
assert s.is_listening
|
70
roles/node_exporter/molecule/default/molecule.yml
Normal file
|
@ -0,0 +1,70 @@
|
|||
---
|
||||
dependency:
|
||||
name: galaxy
|
||||
driver:
|
||||
name: docker
|
||||
platforms:
|
||||
- name: bionic
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-18.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: xenial
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-16.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: stretch
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-9
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: buster
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-10
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos7
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-7
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos8
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-8
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
- name: fedora
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:fedora-30
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
provisioner:
|
||||
name: ansible
|
||||
playbooks:
|
||||
prepare: prepare.yml
|
||||
converge: playbook.yml
|
||||
inventory:
|
||||
group_vars:
|
||||
python3:
|
||||
ansible_python_interpreter: /usr/bin/python3
|
||||
verifier:
|
||||
name: testinfra
|
7
roles/node_exporter/molecule/default/playbook.yml
Normal file
|
@ -0,0 +1,7 @@
|
|||
---
|
||||
- hosts: all
|
||||
any_errors_fatal: true
|
||||
roles:
|
||||
- cloudalchemy.node_exporter
|
||||
vars:
|
||||
node_exporter_web_listen_address: "127.0.0.1:9100"
|
5
roles/node_exporter/molecule/default/prepare.yml
Normal file
|
@ -0,0 +1,5 @@
|
|||
---
|
||||
- name: Prepare
|
||||
hosts: all
|
||||
gather_facts: false
|
||||
tasks: []
|
63
roles/node_exporter/molecule/default/tests/test_default.py
Normal file
|
@ -0,0 +1,63 @@
|
|||
import os
|
||||
import testinfra.utils.ansible_runner
|
||||
|
||||
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
|
||||
os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
|
||||
|
||||
|
||||
def test_directories(host):
|
||||
dirs = [
|
||||
"/var/lib/node_exporter"
|
||||
]
|
||||
for dir in dirs:
|
||||
d = host.file(dir)
|
||||
assert d.is_directory
|
||||
assert d.exists
|
||||
|
||||
|
||||
def test_files(host):
|
||||
files = [
|
||||
"/etc/systemd/system/node_exporter.service",
|
||||
"/usr/local/bin/node_exporter"
|
||||
]
|
||||
for file in files:
|
||||
f = host.file(file)
|
||||
assert f.exists
|
||||
assert f.is_file
|
||||
|
||||
|
||||
def test_permissions_didnt_change(host):
|
||||
dirs = [
|
||||
"/etc",
|
||||
"/root",
|
||||
"/usr",
|
||||
"/var"
|
||||
]
|
||||
for file in dirs:
|
||||
f = host.file(file)
|
||||
assert f.exists
|
||||
assert f.is_directory
|
||||
assert f.user == "root"
|
||||
assert f.group == "root"
|
||||
|
||||
|
||||
def test_user(host):
|
||||
assert host.group("node-exp").exists
|
||||
assert "node-exp" in host.user("node-exp").groups
|
||||
assert host.user("node-exp").shell == "/usr/sbin/nologin"
|
||||
assert host.user("node-exp").home == "/"
|
||||
|
||||
|
||||
def test_service(host):
|
||||
s = host.service("node_exporter")
|
||||
# assert s.is_enabled
|
||||
assert s.is_running
|
||||
|
||||
|
||||
def test_socket(host):
|
||||
sockets = [
|
||||
"tcp://127.0.0.1:9100"
|
||||
]
|
||||
for socket in sockets:
|
||||
s = host.socket(socket)
|
||||
assert s.is_listening
|
35
roles/node_exporter/molecule/latest/molecule.yml
Normal file
|
@ -0,0 +1,35 @@
|
|||
---
|
||||
dependency:
|
||||
name: galaxy
|
||||
driver:
|
||||
name: docker
|
||||
platforms:
|
||||
- name: buster
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-10
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: fedora
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:fedora-30
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
provisioner:
|
||||
name: ansible
|
||||
playbooks:
|
||||
create: ../default/create.yml
|
||||
prepare: ../default/prepare.yml
|
||||
converge: playbook.yml
|
||||
destroy: ../default/destroy.yml
|
||||
inventory:
|
||||
group_vars:
|
||||
python3:
|
||||
ansible_python_interpreter: /usr/bin/python3
|
||||
verifier:
|
||||
name: testinfra
|
8
roles/node_exporter/molecule/latest/playbook.yml
Normal file
|
@ -0,0 +1,8 @@
|
|||
---
|
||||
- name: Run role
|
||||
hosts: all
|
||||
any_errors_fatal: true
|
||||
roles:
|
||||
- cloudalchemy.node_exporter
|
||||
vars:
|
||||
node_exporter_version: latest
|
|
@ -0,0 +1,27 @@
|
|||
import pytest
|
||||
import os
|
||||
import testinfra.utils.ansible_runner
|
||||
|
||||
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
|
||||
os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
|
||||
|
||||
|
||||
@pytest.mark.parametrize("files", [
|
||||
"/etc/systemd/system/node_exporter.service",
|
||||
"/usr/local/bin/node_exporter"
|
||||
])
|
||||
def test_files(host, files):
|
||||
f = host.file(files)
|
||||
assert f.exists
|
||||
assert f.is_file
|
||||
|
||||
|
||||
def test_service(host):
|
||||
s = host.service("node_exporter")
|
||||
# assert s.is_enabled
|
||||
assert s.is_running
|
||||
|
||||
|
||||
def test_socket(host):
|
||||
s = host.socket("tcp://0.0.0.0:9100")
|
||||
assert s.is_listening
|
51
roles/node_exporter/tasks/configure.yml
Normal file
|
@ -0,0 +1,51 @@
|
|||
---
|
||||
- name: Copy the node_exporter systemd service file
|
||||
template:
|
||||
src: node_exporter.service.j2
|
||||
dest: /etc/systemd/system/node_exporter.service
|
||||
owner: root
|
||||
group: root
|
||||
mode: 0644
|
||||
notify: restart node_exporter
|
||||
|
||||
- block:
|
||||
- name: Create node_exporter config directory
|
||||
file:
|
||||
path: "/etc/node_exporter"
|
||||
state: directory
|
||||
owner: root
|
||||
group: root
|
||||
mode: u+rwX,g+rwX,o=rX
|
||||
|
||||
- name: Copy the node_exporter config file
|
||||
template:
|
||||
src: config.yaml.j2
|
||||
dest: /etc/node_exporter/config.yaml
|
||||
owner: root
|
||||
group: root
|
||||
mode: 0644
|
||||
notify: restart node_exporter
|
||||
when:
|
||||
( node_exporter_tls_server_config | length > 0 ) or
|
||||
( node_exporter_http_server_config | length > 0 ) or
|
||||
( node_exporter_basic_auth_users | length > 0 )
|
||||
|
||||
- name: Create textfile collector dir
|
||||
file:
|
||||
path: "{{ node_exporter_textfile_dir }}"
|
||||
state: directory
|
||||
owner: "{{ _node_exporter_system_user }}"
|
||||
group: "{{ _node_exporter_system_group }}"
|
||||
recurse: true
|
||||
mode: u+rwX,g+rwX,o=rX
|
||||
when: node_exporter_textfile_dir | length > 0
|
||||
|
||||
- name: Allow node_exporter port in SELinux on RedHat OS family
|
||||
seport:
|
||||
ports: "{{ node_exporter_web_listen_address.split(':')[-1] }}"
|
||||
proto: tcp
|
||||
setype: http_port_t
|
||||
state: present
|
||||
when:
|
||||
- ansible_version.full is version_compare('2.4', '>=')
|
||||
- ansible_selinux.status == "enabled"
|
63
roles/node_exporter/tasks/install.yml
Normal file
|
@ -0,0 +1,63 @@
|
|||
---
|
||||
- name: Create the node_exporter group
|
||||
group:
|
||||
name: "{{ _node_exporter_system_group }}"
|
||||
state: present
|
||||
system: true
|
||||
when: _node_exporter_system_group != "root"
|
||||
|
||||
- name: Create the node_exporter user
|
||||
user:
|
||||
name: "{{ _node_exporter_system_user }}"
|
||||
groups: "{{ _node_exporter_system_group }}"
|
||||
append: true
|
||||
shell: /usr/sbin/nologin
|
||||
system: true
|
||||
create_home: false
|
||||
home: /
|
||||
when: _node_exporter_system_user != "root"
|
||||
|
||||
- block:
|
||||
- name: Download node_exporter binary to local folder
|
||||
become: false
|
||||
get_url:
|
||||
url: "https://github.com/prometheus/node_exporter/releases/download/v{{ node_exporter_version }}/node_exporter-{{ node_exporter_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
dest: "/tmp/node_exporter-{{ node_exporter_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
checksum: "sha256:{{ node_exporter_checksum }}"
|
||||
mode: '0644'
|
||||
register: _download_binary
|
||||
until: _download_binary is succeeded
|
||||
retries: 5
|
||||
delay: 2
|
||||
delegate_to: localhost
|
||||
check_mode: false
|
||||
|
||||
- name: Unpack node_exporter binary
|
||||
become: false
|
||||
unarchive:
|
||||
src: "/tmp/node_exporter-{{ node_exporter_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
dest: "/tmp"
|
||||
creates: "/tmp/node_exporter-{{ node_exporter_version }}.linux-{{ go_arch }}/node_exporter"
|
||||
delegate_to: localhost
|
||||
check_mode: false
|
||||
|
||||
- name: Propagate node_exporter binaries
|
||||
copy:
|
||||
src: "/tmp/node_exporter-{{ node_exporter_version }}.linux-{{ go_arch }}/node_exporter"
|
||||
dest: "{{ _node_exporter_binary_install_dir }}/node_exporter"
|
||||
mode: 0755
|
||||
owner: root
|
||||
group: root
|
||||
notify: restart node_exporter
|
||||
when: not ansible_check_mode
|
||||
when: node_exporter_binary_local_dir | length == 0
|
||||
|
||||
- name: propagate locally distributed node_exporter binary
|
||||
copy:
|
||||
src: "{{ node_exporter_binary_local_dir }}/node_exporter"
|
||||
dest: "{{ _node_exporter_binary_install_dir }}/node_exporter"
|
||||
mode: 0755
|
||||
owner: root
|
||||
group: root
|
||||
when: node_exporter_binary_local_dir | length > 0
|
||||
notify: restart node_exporter
|
39
roles/node_exporter/tasks/main.yml
Normal file
|
@ -0,0 +1,39 @@
|
|||
---
|
||||
- import_tasks: preflight.yml
|
||||
tags:
|
||||
- node_exporter_install
|
||||
- node_exporter_configure
|
||||
- node_exporter_run
|
||||
|
||||
- import_tasks: install.yml
|
||||
become: true
|
||||
when:
|
||||
( not __node_exporter_is_installed.stat.exists ) or
|
||||
( (__node_exporter_current_version_output.stderr_lines | length > 0) and (__node_exporter_current_version_output.stderr_lines[0].split(" ")[2] != node_exporter_version) ) or
|
||||
( (__node_exporter_current_version_output.stdout_lines | length > 0) and (__node_exporter_current_version_output.stdout_lines[0].split(" ")[2] != node_exporter_version) ) or
|
||||
( node_exporter_binary_local_dir | length > 0 )
|
||||
tags:
|
||||
- node_exporter_install
|
||||
|
||||
- import_tasks: selinux.yml
|
||||
become: true
|
||||
when: ansible_selinux.status == "enabled"
|
||||
tags:
|
||||
- node_exporter_configure
|
||||
|
||||
- import_tasks: configure.yml
|
||||
become: true
|
||||
tags:
|
||||
- node_exporter_configure
|
||||
|
||||
- name: Ensure Node Exporter is enabled on boot
|
||||
become: true
|
||||
systemd:
|
||||
daemon_reload: true
|
||||
name: node_exporter
|
||||
enabled: true
|
||||
state: started
|
||||
when:
|
||||
- not ansible_check_mode
|
||||
tags:
|
||||
- node_exporter_run
|
111
roles/node_exporter/tasks/preflight.yml
Normal file
|
@ -0,0 +1,111 @@
|
|||
---
|
||||
- name: Assert usage of systemd as an init system
|
||||
assert:
|
||||
that: ansible_service_mgr == 'systemd'
|
||||
msg: "This role only works with systemd"
|
||||
|
||||
- name: Get systemd version
|
||||
command: systemctl --version
|
||||
changed_when: false
|
||||
check_mode: false
|
||||
register: __systemd_version
|
||||
tags:
|
||||
- skip_ansible_lint
|
||||
|
||||
- name: Set systemd version fact
|
||||
set_fact:
|
||||
node_exporter_systemd_version: "{{ __systemd_version.stdout_lines[0] | regex_replace('^systemd\\s(\\d+).*$', '\\1') }}"
|
||||
|
||||
- name: Naive assertion of proper listen address
|
||||
assert:
|
||||
that:
|
||||
- "':' in node_exporter_web_listen_address"
|
||||
|
||||
- name: Assert collectors are not both disabled and enabled at the same time
|
||||
assert:
|
||||
that:
|
||||
- "item not in node_exporter_enabled_collectors"
|
||||
with_items: "{{ node_exporter_disabled_collectors }}"
|
||||
|
||||
- block:
|
||||
- name: Assert that TLS key and cert path are set
|
||||
assert:
|
||||
that:
|
||||
- "node_exporter_tls_server_config.cert_file is defined"
|
||||
- "node_exporter_tls_server_config.key_file is defined"
|
||||
|
||||
- name: Check existence of TLS cert file
|
||||
stat:
|
||||
path: "{{ node_exporter_tls_server_config.cert_file }}"
|
||||
register: __node_exporter_cert_file
|
||||
|
||||
- name: Check existence of TLS key file
|
||||
stat:
|
||||
path: "{{ node_exporter_tls_server_config.key_file }}"
|
||||
register: __node_exporter_key_file
|
||||
|
||||
- name: Assert that TLS key and cert are present
|
||||
assert:
|
||||
that:
|
||||
- "__node_exporter_cert_file.stat.exists"
|
||||
- "__node_exporter_key_file.stat.exists"
|
||||
when: node_exporter_tls_server_config | length > 0
|
||||
|
||||
- name: Check if node_exporter is installed
|
||||
stat:
|
||||
path: "{{ _node_exporter_binary_install_dir }}/node_exporter"
|
||||
register: __node_exporter_is_installed
|
||||
check_mode: false
|
||||
tags:
|
||||
- node_exporter_install
|
||||
|
||||
- name: Gather currently installed node_exporter version (if any)
|
||||
command: "{{ _node_exporter_binary_install_dir }}/node_exporter --version"
|
||||
args:
|
||||
warn: false
|
||||
changed_when: false
|
||||
register: __node_exporter_current_version_output
|
||||
check_mode: false
|
||||
when: __node_exporter_is_installed.stat.exists
|
||||
tags:
|
||||
- node_exporter_install
|
||||
- skip_ansible_lint
|
||||
|
||||
- block:
|
||||
- name: Get latest release
|
||||
uri:
|
||||
url: "https://api.github.com/repos/prometheus/node_exporter/releases/latest"
|
||||
method: GET
|
||||
return_content: true
|
||||
status_code: 200
|
||||
body_format: json
|
||||
user: "{{ lookup('env', 'GH_USER') | default(omit) }}"
|
||||
password: "{{ lookup('env', 'GH_TOKEN') | default(omit) }}"
|
||||
no_log: "{{ not lookup('env', 'MOLECULE_DEBUG') | bool }}"
|
||||
register: _latest_release
|
||||
until: _latest_release.status == 200
|
||||
retries: 5
|
||||
|
||||
- name: "Set node_exporter version to {{ _latest_release.json.tag_name[1:] }}"
|
||||
set_fact:
|
||||
node_exporter_version: "{{ _latest_release.json.tag_name[1:] }}"
|
||||
when:
|
||||
- node_exporter_version == "latest"
|
||||
- node_exporter_binary_local_dir | length == 0
|
||||
delegate_to: localhost
|
||||
run_once: true
|
||||
|
||||
- block:
|
||||
- name: Get checksum list from github
|
||||
set_fact:
|
||||
_checksums: "{{ lookup('url', 'https://github.com/prometheus/node_exporter/releases/download/v' + node_exporter_version + '/sha256sums.txt', wantlist=True) | list }}"
|
||||
run_once: true
|
||||
|
||||
- name: "Get checksum for {{ go_arch }} architecture"
|
||||
set_fact:
|
||||
node_exporter_checksum: "{{ item.split(' ')[0] }}"
|
||||
with_items: "{{ _checksums }}"
|
||||
when:
|
||||
- "('linux-' + go_arch + '.tar.gz') in item"
|
||||
delegate_to: localhost
|
||||
when: node_exporter_binary_local_dir | length == 0
|
39
roles/node_exporter/tasks/selinux.yml
Normal file
|
@ -0,0 +1,39 @@
|
|||
---
|
||||
- name: Install selinux python packages [RHEL]
|
||||
package:
|
||||
name:
|
||||
- "{{ ( (ansible_facts.distribution_major_version | int) < 8) | ternary('libselinux-python','python3-libselinux') }}"
|
||||
- "{{ ( (ansible_facts.distribution_major_version | int) < 8) | ternary('policycoreutils-python','python3-policycoreutils') }}"
|
||||
state: present
|
||||
register: _install_selinux_packages
|
||||
until: _install_selinux_packages is success
|
||||
retries: 5
|
||||
delay: 2
|
||||
when:
|
||||
- (ansible_distribution | lower == "redhat") or
|
||||
(ansible_distribution | lower == "centos")
|
||||
|
||||
- name: Install selinux python packages [Fedora]
|
||||
package:
|
||||
name:
|
||||
- "{{ ( (ansible_facts.distribution_major_version | int) < 29) | ternary('libselinux-python','python3-libselinux') }}"
|
||||
- "{{ ( (ansible_facts.distribution_major_version | int) < 29) | ternary('policycoreutils-python','python3-policycoreutils') }}"
|
||||
state: present
|
||||
register: _install_selinux_packages
|
||||
until: _install_selinux_packages is success
|
||||
retries: 5
|
||||
delay: 2
|
||||
|
||||
when:
|
||||
- ansible_distribution | lower == "fedora"
|
||||
|
||||
- name: Install selinux python packages [clearlinux]
|
||||
package:
|
||||
name: sysadmin-basic
|
||||
state: present
|
||||
register: _install_selinux_packages
|
||||
until: _install_selinux_packages is success
|
||||
retries: 5
|
||||
delay: 2
|
||||
when:
|
||||
- ansible_distribution | lower == "clearlinux"
|
18
roles/node_exporter/templates/config.yaml.j2
Normal file
|
@ -0,0 +1,18 @@
|
|||
---
|
||||
{{ ansible_managed | comment }}
|
||||
{% if node_exporter_tls_server_config | length > 0 %}
|
||||
tls_server_config:
|
||||
{{ node_exporter_tls_server_config | to_nice_yaml | indent(2, true) }}
|
||||
{% endif %}
|
||||
|
||||
{% if node_exporter_http_server_config | length > 0 %}
|
||||
http_server_config:
|
||||
{{ node_exporter_http_server_config | to_nice_yaml | indent(2, true) }}
|
||||
{% endif %}
|
||||
|
||||
{% if node_exporter_basic_auth_users | length > 0 %}
|
||||
basic_auth_users:
|
||||
{% for k, v in node_exporter_basic_auth_users.items() %}
|
||||
{{ k }}: {{ v | password_hash('bcrypt', ('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890' | shuffle(seed=inventory_hostname) | join)[:22], rounds=9) }}
|
||||
{% endfor %}
|
||||
{% endif %}
|
54
roles/node_exporter/templates/node_exporter.service.j2
Normal file
|
@ -0,0 +1,54 @@
|
|||
{{ ansible_managed | comment }}
|
||||
|
||||
[Unit]
|
||||
Description=Prometheus Node Exporter
|
||||
After=network-online.target
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
User={{ _node_exporter_system_user }}
|
||||
Group={{ _node_exporter_system_group }}
|
||||
ExecStart={{ _node_exporter_binary_install_dir }}/node_exporter \
|
||||
{% for collector in node_exporter_enabled_collectors -%}
|
||||
{% if not collector is mapping %}
|
||||
--collector.{{ collector }} \
|
||||
{% else -%}
|
||||
{% set name, options = (collector.items()|list)[0] -%}
|
||||
--collector.{{ name }} \
|
||||
{% for k,v in options|dictsort %}
|
||||
--collector.{{ name }}.{{ k }}={{ v | quote }} \
|
||||
{% endfor -%}
|
||||
{% endif -%}
|
||||
{% endfor -%}
|
||||
{% for collector in node_exporter_disabled_collectors %}
|
||||
--no-collector.{{ collector }} \
|
||||
{% endfor %}
|
||||
{% if node_exporter_tls_server_config | length > 0 or node_exporter_http_server_config | length > 0 or node_exporter_basic_auth_users | length > 0 %}
|
||||
--web.config=/etc/node_exporter/config.yaml \
|
||||
{% endif %}
|
||||
--web.listen-address={{ node_exporter_web_listen_address }} \
|
||||
--web.telemetry-path={{ node_exporter_web_telemetry_path }}
|
||||
|
||||
SyslogIdentifier=node_exporter
|
||||
Restart=always
|
||||
RestartSec=1
|
||||
StartLimitInterval=0
|
||||
|
||||
{% for m in ansible_mounts if m.mount == '/home' %}
|
||||
ProtectHome=read-only
|
||||
{% else %}
|
||||
ProtectHome=yes
|
||||
{% endfor %}
|
||||
NoNewPrivileges=yes
|
||||
|
||||
{% if node_exporter_systemd_version | int >= 232 %}
|
||||
ProtectSystem=strict
|
||||
ProtectControlGroups=true
|
||||
ProtectKernelModules=true
|
||||
ProtectKernelTunables=yes
|
||||
{% else %}
|
||||
ProtectSystem=full
|
||||
{% endif %}
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
9
roles/node_exporter/vars/main.yml
Normal file
|
@ -0,0 +1,9 @@
|
|||
---
|
||||
go_arch_map:
|
||||
i386: '386'
|
||||
x86_64: 'amd64'
|
||||
aarch64: 'arm64'
|
||||
armv7l: 'armv7'
|
||||
armv6l: 'armv6'
|
||||
|
||||
go_arch: "{{ go_arch_map[ansible_architecture] | default(ansible_architecture) }}"
|
153
roles/prometheus/README.md
Normal file
|
@ -0,0 +1,153 @@
|
|||
<p><img src="https://cdn.worldvectorlogo.com/logos/prometheus.svg" alt="prometheus logo" title="prometheus" align="right" height="60" /></p>
|
||||
|
||||
# Ansible Role: prometheus
|
||||
|
||||
## Description
|
||||
|
||||
Deploy [Prometheus](https://github.com/prometheus/prometheus) monitoring system using ansible.
|
||||
|
||||
### Upgradability notice
|
||||
|
||||
When upgrading this role from version <= 2.4.0 to version >= 2.4.1, please turn off your Prometheus instance first. More details are in the [2.4.1 release notes](https://github.com/cloudalchemy/ansible-prometheus/releases/tag/2.4.1).
|
||||
|
||||
## Requirements
|
||||
|
||||
- Ansible >= 2.7 (It might work on previous versions, but we cannot guarantee it)
|
||||
- jmespath on deployer machine. If you are using Ansible from a Python virtualenv, install *jmespath* to the same virtualenv via pip.
|
||||
- gnu-tar on Mac deployer host (`brew install gnu-tar`)
|
||||
|
||||
## Role Variables
|
||||
|
||||
All variables that can be overridden are stored in the [defaults/main.yml](defaults/main.yml) file and are listed in the table below.
|
||||
|
||||
| Name | Default Value | Description |
|
||||
| -------------- | ------------- | -----------------------------------|
|
||||
| `prometheus_version` | 2.27.0 | Prometheus package version. Also accepts `latest` as parameter. Only prometheus 2.x is supported |
|
||||
| `prometheus_skip_install` | false | Prometheus installation tasks are skipped when set to true. |
|
||||
| `prometheus_binary_local_dir` | "" | Allows using local packages instead of the ones distributed on GitHub. Takes a directory where the `prometheus` AND `promtool` binaries are stored on the host on which Ansible is run. This overrides the `prometheus_version` parameter |
|
||||
| `prometheus_config_dir` | /etc/prometheus | Path to directory with prometheus configuration |
|
||||
| `prometheus_db_dir` | /var/lib/prometheus | Path to directory with prometheus database |
|
||||
| `prometheus_read_only_dirs`| [] | Additional paths that Prometheus is allowed to read (useful for SSL certs outside of the config directory) |
|
||||
| `prometheus_web_listen_address` | "0.0.0.0:9090" | Address on which prometheus will be listening |
|
||||
| `prometheus_web_config` | {} | A Prometheus [web config yaml](https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-configuration.md) for configuring TLS and auth. |
|
||||
| `prometheus_web_external_url` | "" | External address on which prometheus is available. Useful when behind reverse proxy. Ex. `http://example.org/prometheus` |
|
||||
| `prometheus_storage_retention` | "30d" | Data retention period |
|
||||
| `prometheus_storage_retention_size` | "0" | Data retention period by size |
|
||||
| `prometheus_config_flags_extra` | {} | Additional configuration flags passed to prometheus binary at startup |
|
||||
| `prometheus_alertmanager_config` | [] | Configuration responsible for pointing where alertmanagers are. This should be specified as list in yaml format. It is compatible with official [<alertmanager_config>](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alertmanager_config) |
|
||||
| `prometheus_alert_relabel_configs` | [] | Alert relabeling rules. This should be specified as list in yaml format. It is compatible with the official [<alert_relabel_configs>](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alert_relabel_configs) |
|
||||
| `prometheus_global` | { scrape_interval: 15s, scrape_timeout: 10s, evaluation_interval: 15s } | Prometheus global config. Compatible with [official configuration](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file) |
|
||||
| `prometheus_remote_write` | [] | Remote write. Compatible with [official configuration](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write) |
|
||||
| `prometheus_remote_read` | [] | Remote read. Compatible with [official configuration](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_read) |
|
||||
| `prometheus_external_labels` | environment: "{{ ansible_fqdn \| default(ansible_host) \| default(inventory_hostname) }}" | Provide map of additional labels which will be added to any time series or alerts when communicating with external systems |
|
||||
| `prometheus_targets` | {} | Targets which will be scraped. A fuller example is provided in our [demo site](https://github.com/cloudalchemy/demo-site/blob/2a8a56fc10ce613d8b08dc8623230dace6704f9a/group_vars/all/vars#L8) |
|
||||
| `prometheus_scrape_configs` | [defaults/main.yml#L58](https://github.com/cloudalchemy/ansible-prometheus/blob/ff7830d06ba57be1177f2b6fca33a4dd2d97dc20/defaults/main.yml#L47) | Prometheus scrape jobs provided in same format as in [official docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) |
|
||||
| `prometheus_config_file` | "prometheus.yml.j2" | Variable used to provide custom prometheus configuration file in form of ansible template |
|
||||
| `prometheus_alert_rules` | [defaults/main.yml#L81](https://github.com/cloudalchemy/ansible-prometheus/blob/73d6df05a775ee5b736ac8f28d5605f2a975d50a/defaults/main.yml#L85) | Full list of alerting rules which will be copied to `{{ prometheus_config_dir }}/rules/ansible_managed.rules`. Alerting rules can be also provided by other files located in `{{ prometheus_config_dir }}/rules/` which have `*.rules` extension |
|
||||
| `prometheus_alert_rules_files` | [defaults/main.yml#L78](https://github.com/cloudalchemy/ansible-prometheus/blob/73d6df05a775ee5b736ac8f28d5605f2a975d50a/defaults/main.yml#L78) | List of folders where ansible will look for files containing alerting rules which will be copied to `{{ prometheus_config_dir }}/rules/`. Files must have `*.rules` extension |
|
||||
| `prometheus_static_targets_files` | [defaults/main.yml#L78](https://github.com/cloudalchemy/ansible-prometheus/blob/73d6df05a775ee5b736ac8f28d5605f2a975d50a/defaults/main.yml#L81) | List of folders where ansible will look for files containing custom static target configuration files which will be copied to `{{ prometheus_config_dir }}/file_sd/`. |
|
||||
|
||||
|
||||
### Relation between `prometheus_scrape_configs` and `prometheus_targets`
|
||||
|
||||
#### Short version
|
||||
|
||||
`prometheus_targets` is just a map used to create multiple files in the `{{ prometheus_config_dir }}/file_sd` directory. File names are composed of the top-level keys in that map plus a `.yml` suffix. Those files store [file_sd scrape targets data](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config) and they need to be read in `prometheus_scrape_configs`.
|
||||
|
||||
#### Long version
|
||||
|
||||
The part of the *prometheus.yml* configuration file that describes what is scraped by Prometheus is stored in `prometheus_scrape_configs`. This variable accepts the same configuration options as described in the [prometheus docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config).
|
||||
|
||||
Meanwhile `prometheus_targets` is our way of adopting the [prometheus scrape type `file_sd`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config). It defines a map of files with their content. The top-level keys are base names of files, each of which needs its own scrape job in `prometheus_scrape_configs`, and the values are the content of those files.
|
||||
|
||||
All this means that you CAN use custom `prometheus_scrape_configs` with `prometheus_targets` set to `{}`. However, when you set anything in `prometheus_targets`, it needs to be mapped to `prometheus_scrape_configs`. If it isn't, you'll get an error in the preflight checks.
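
The mapping rule can be sketched in Python. This is only an illustration under stated assumptions, not the role's actual preflight check: the function name `unmapped_targets` and the default `config_dir` value are hypothetical, and glob patterns inside `files` are compared literally rather than expanded.

```python
import posixpath


def unmapped_targets(prometheus_targets, prometheus_scrape_configs,
                     config_dir="/etc/prometheus"):
    """Return the keys of prometheus_targets that no scrape job loads.

    A key counts as mapped when some job lists
    <config_dir>/file_sd/<key>.yml under file_sd_configs[].files.
    """
    referenced = set()
    for job in prometheus_scrape_configs:
        for sd_config in job.get("file_sd_configs", []):
            for path in sd_config.get("files", []):
                referenced.add(posixpath.normpath(path))

    missing = []
    for name in sorted(prometheus_targets):
        expected = posixpath.normpath(
            posixpath.join(config_dir, "file_sd", name + ".yml"))
        if expected not in referenced:
            missing.append(name)
    return missing
```

An empty result means every entry in `prometheus_targets` has a matching scrape job; a non-empty result corresponds to the situation the preflight checks reject.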
|
||||
|
||||
#### Example
|
||||
|
||||
Let's look at our default configuration, which shows all features. By default we have this `prometheus_targets`:
|
||||
```
|
||||
prometheus_targets:
|
||||
node: # This is a base file name. File is located in "{{ prometheus_config_dir }}/file_sd/<<BASENAME>>.yml"
|
||||
- targets: #
|
||||
- localhost:9100 # All this is a targets section in file_sd format
|
||||
labels: #
|
||||
env: test #
|
||||
```
|
||||
This configuration creates a single file named `node.yml` in the `{{ prometheus_config_dir }}/file_sd` directory.
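
As a rough sketch of the naming rule, the following Python snippet derives the file path for each top-level key. The helper name `file_sd_paths` is hypothetical, and `config_dir` is assumed to be the default `prometheus_config_dir`; the role itself does this with Ansible tasks.

```python
import posixpath


def file_sd_paths(prometheus_targets, config_dir="/etc/prometheus"):
    """Map each top-level key to the file_sd file created for it."""
    return {
        name: posixpath.join(config_dir, "file_sd", name + ".yml")
        for name in prometheus_targets
    }


example_targets = {
    "node": [
        {"targets": ["localhost:9100"], "labels": {"env": "test"}},
    ],
}
print(file_sd_paths(example_targets))
```

Each returned path is what a `file_sd_configs` entry in `prometheus_scrape_configs` would need to reference.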
|
||||
|
||||
Next, this file needs to be loaded into the scrape config. Here is a modified version of our default `prometheus_scrape_configs`:
|
||||
```
|
||||
prometheus_scrape_configs:
|
||||
- job_name: "prometheus" # Custom scrape job, here using `static_configs`
|
||||
metrics_path: "/metrics"
|
||||
static_configs:
|
||||
- targets:
|
||||
- "localhost:9090"
|
||||
- job_name: "example-node-file-servicediscovery"
|
||||
file_sd_configs:
|
||||
- files:
|
||||
- "{{ prometheus_config_dir }}/file_sd/node.yml" # This line loads file created from `prometheus_targets`
|
||||
```
|
||||
|
||||
## Example
|
||||
|
||||
### Playbook
|
||||
|
||||
```yaml
|
||||
---
|
||||
- hosts: all
|
||||
roles:
|
||||
- cloudalchemy.prometheus
|
||||
vars:
|
||||
prometheus_targets:
|
||||
node:
|
||||
- targets:
|
||||
- localhost:9100
|
||||
- demo.cloudalchemy.org:9100
|
||||
labels:
|
||||
env: demosite
|
||||
```
|
||||
|
||||
### Demo site
|
||||
|
||||
The Prometheus organization provides a demo site for a full monitoring solution based on Prometheus and Grafana. The repository with code and links to running instances is [available on GitHub](https://github.com/prometheus/demo-site).
|
||||
|
||||
### Defining alerting rules files
|
||||
|
||||
Alerting rules are defined in the `prometheus_alert_rules` variable. The format is almost identical to the one defined in the [Prometheus 2.0 documentation](https://prometheus.io/docs/prometheus/latest/configuration/template_examples/).
|
||||
Because Ansible (Jinja2) and Prometheus (Go templates) use similar template syntax, every template should be wrapped in `{% raw %}` and `{% endraw %}` statements. An example is provided in the [defaults/main.yml](defaults/main.yml) file.
|
||||
|
||||
## Local Testing
|
||||
|
||||
The preferred way of locally testing the role is to use Docker and [molecule](https://github.com/metacloud/molecule) (v2.x). You will have to install Docker on your system. See "Get started" for a Docker package suitable for your system.
|
||||
We use tox to simplify testing on multiple Ansible versions. To install tox, execute:
|
||||
```sh
|
||||
pip3 install tox
|
||||
```
|
||||
To run tests on all Ansible versions (WARNING: this can take some time):
|
||||
```sh
|
||||
tox
|
||||
```
|
||||
To run a custom molecule command in a custom environment with only the default test scenario:
|
||||
```sh
|
||||
tox -e py35-ansible28 -- molecule test -s default
|
||||
```
|
||||
For more information about molecule go to their [docs](http://molecule.readthedocs.io/en/latest/).
|
||||
|
||||
If you would like to run tests on a remote Docker host, just set the `DOCKER_HOST` variable before running the tox tests.
|
||||
|
||||
## CircleCI
|
||||
|
||||
Combining molecule and CircleCI allows us to test how new PRs will behave when used with multiple Ansible versions and multiple operating systems. This also allows us to create test scenarios for different role configurations. As a result, we have quite a large test matrix, which takes more time than local testing, so please be patient.
|
||||
|
||||
## Contributing
|
||||
|
||||
See [contributor guideline](CONTRIBUTING.md).
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
See [troubleshooting](TROUBLESHOOTING.md).
|
||||
|
||||
## License
|
||||
|
||||
This project is licensed under MIT License. See [LICENSE](/LICENSE) for more details.
|
25
roles/prometheus/bump_version.sh
Executable file
|
@ -0,0 +1,25 @@
|
|||
#!/usr/bin/env bash
|
||||
#
|
||||
# Description: Generate the next release version
|
||||
|
||||
set -uo pipefail
|
||||
|
||||
latest_tag="$(git semver)"
|
||||
if [[ -z "${latest_tag}" ]]; then
|
||||
echo "ERROR: Couldn't get latest tag from git semver, try 'pip install git-semver'" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Use HEAD if CIRCLE_SHA1 is not set.
|
||||
now="${CIRCLE_SHA1:-HEAD}"
|
||||
|
||||
new_tag='none'
|
||||
git_log="$(git log --format=%B "${latest_tag}..${now}")"
|
||||
|
||||
case "${git_log}" in
|
||||
*"[major]"*|*"[breaking change]"* ) new_tag=$(git semver --next-major) ;;
|
||||
*"[minor]"*|*"[feat]"*|*"[feature]"* ) new_tag=$(git semver --next-minor) ;;
|
||||
*"[patch]"*|*"[fix]"*|*"[bugfix]"* ) new_tag=$(git semver --next-patch) ;;
|
||||
esac
|
||||
|
||||
echo "NEW_TAG=${new_tag}"
|
219
roles/prometheus/defaults/main.yml
Normal file
|
@ -0,0 +1,219 @@
|
|||
---
|
||||
prometheus_version: 2.27.0
|
||||
prometheus_binary_local_dir: ''
|
||||
prometheus_skip_install: false
|
||||
|
||||
prometheus_config_dir: /etc/prometheus
|
||||
prometheus_db_dir: /var/lib/prometheus
|
||||
prometheus_read_only_dirs: []
|
||||
|
||||
prometheus_web_listen_address: "0.0.0.0:9090"
|
||||
prometheus_web_external_url: ''
|
||||
# See https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-configuration.md
|
||||
prometheus_web_config:
|
||||
tls_server_config: {}
|
||||
http_server_config: {}
|
||||
basic_auth_users: {}
|
||||
|
||||
prometheus_storage_retention: "30d"
|
||||
# Available since Prometheus 2.7.0
|
||||
# [EXPERIMENTAL] Maximum number of bytes that can be stored for blocks. Units
|
||||
# supported: KB, MB, GB, TB, PB.
|
||||
prometheus_storage_retention_size: "0"
|
||||
|
||||
prometheus_config_flags_extra: {}
|
||||
# prometheus_config_flags_extra:
|
||||
# storage.tsdb.retention: 15d
|
||||
# alertmanager.timeout: 10s
|
||||
|
||||
prometheus_alertmanager_config: []
|
||||
# prometheus_alertmanager_config:
|
||||
# - scheme: https
|
||||
# path_prefix: alertmanager/
|
||||
# basic_auth:
|
||||
# username: user
|
||||
# password: pass
|
||||
# static_configs:
|
||||
# - targets: ["127.0.0.1:9093"]
|
||||
# proxy_url: "127.0.0.2"
|
||||
|
||||
prometheus_alert_relabel_configs: []
|
||||
# prometheus_alert_relabel_configs:
|
||||
# - action: labeldrop
|
||||
# regex: replica
|
||||
|
||||
prometheus_global:
|
||||
scrape_interval: 15s
|
||||
scrape_timeout: 10s
|
||||
evaluation_interval: 15s
|
||||
|
||||
prometheus_remote_write: []
|
||||
# prometheus_remote_write:
|
||||
# - url: https://dev.kausal.co/prom/push
|
||||
# basic_auth:
|
||||
# password: FOO
|
||||
|
||||
prometheus_remote_read: []
|
||||
# prometheus_remote_read:
|
||||
# - url: https://demo.cloudalchemy.org:9201/read
|
||||
# basic_auth:
|
||||
# password: FOO
|
||||
|
||||
prometheus_external_labels:
|
||||
environment: "{{ ansible_fqdn | default(ansible_host) | default(inventory_hostname) }}"
|
||||
|
||||
prometheus_targets: {}
|
||||
# node:
|
||||
# - targets:
|
||||
# - localhost:9100
|
||||
# labels:
|
||||
# env: test
|
||||
|
||||
prometheus_scrape_configs:
|
||||
- job_name: "prometheus"
|
||||
metrics_path: "{{ prometheus_metrics_path }}"
|
||||
static_configs:
|
||||
- targets:
|
||||
- "{{ ansible_fqdn | default(ansible_host) | default('localhost') }}:9090"
|
||||
- job_name: "node"
|
||||
file_sd_configs:
|
||||
- files:
|
||||
- "{{ prometheus_config_dir }}/file_sd/node.yml"
|
||||
|
||||
# Alternative config file name, searched in ansible templates path.
|
||||
prometheus_config_file: 'prometheus.yml.j2'
|
||||
|
||||
prometheus_alert_rules_files:
|
||||
- prometheus/rules/*.rules
|
||||
|
||||
prometheus_static_targets_files:
|
||||
- prometheus/targets/*.yml
|
||||
- prometheus/targets/*.json
|
||||
|
||||
prometheus_alert_rules:
|
||||
- alert: Watchdog
|
||||
expr: vector(1)
|
||||
for: 10m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
description: "This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty."
|
||||
summary: 'Ensure entire alerting pipeline is functional'
|
||||
- alert: InstanceDown
|
||||
expr: 'up == 0'
|
||||
for: 5m
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
description: '{% raw %}{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.{% endraw %}'
|
||||
summary: '{% raw %}Instance {{ $labels.instance }} down{% endraw %}'
|
||||
- alert: RebootRequired
|
||||
expr: 'node_reboot_required > 0'
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
description: '{% raw %}{{ $labels.instance }} requires a reboot.{% endraw %}'
|
||||
summary: '{% raw %}Instance {{ $labels.instance }} - reboot required{% endraw %}'
|
||||
- alert: NodeFilesystemSpaceFillingUp
|
||||
annotations:
|
||||
description: '{% raw %}Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available space left and is filling up.{% endraw %}'
|
||||
summary: 'Filesystem is predicted to run out of space within the next 24 hours.'
|
||||
expr: "(\n node_filesystem_avail_bytes{job=\"node\",fstype!=\"\"} / node_filesystem_size_bytes{job=\"node\",fstype!=\"\"} * 100 < 40\nand\n predict_linear(node_filesystem_avail_bytes{job=\"node\",fstype!=\"\"}[6h], 24*60*60) < 0\nand\n node_filesystem_readonly{job=\"node\",fstype!=\"\"} == 0\n)\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: warning
|
||||
- alert: NodeFilesystemSpaceFillingUp
|
||||
annotations:
|
||||
description: '{% raw %}Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available space left and is filling up fast.{% endraw %}'
|
||||
summary: 'Filesystem is predicted to run out of space within the next 4 hours.'
|
||||
expr: "(\n node_filesystem_avail_bytes{job=\"node\",fstype!=\"\"} / node_filesystem_size_bytes{job=\"node\",fstype!=\"\"} * 100 < 20\nand\n predict_linear(node_filesystem_avail_bytes{job=\"node\",fstype!=\"\"}[6h], 4*60*60) < 0\nand\n node_filesystem_readonly{job=\"node\",fstype!=\"\"} == 0\n)\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: critical
|
||||
- alert: NodeFilesystemAlmostOutOfSpace
|
||||
annotations:
|
||||
description: '{% raw %}Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available space left.{% endraw %}'
|
||||
summary: 'Filesystem has less than 5% space left.'
|
||||
expr: "(\n node_filesystem_avail_bytes{job=\"node\",fstype!=\"\"} / node_filesystem_size_bytes{job=\"node\",fstype!=\"\"} * 100 < 5\nand\n node_filesystem_readonly{job=\"node\",fstype!=\"\"} == 0\n)\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: warning
|
||||
- alert: NodeFilesystemAlmostOutOfSpace
|
||||
annotations:
|
||||
description: '{% raw %}Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available space left.{% endraw %}'
|
||||
summary: 'Filesystem has less than 3% space left.'
|
||||
expr: "(\n node_filesystem_avail_bytes{job=\"node\",fstype!=\"\"} / node_filesystem_size_bytes{job=\"node\",fstype!=\"\"} * 100 < 3\nand\n node_filesystem_readonly{job=\"node\",fstype!=\"\"} == 0\n)\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: critical
|
||||
- alert: NodeFilesystemFilesFillingUp
|
||||
annotations:
|
||||
description: '{% raw %}Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available inodes left and is filling up.{% endraw %}'
|
||||
summary: 'Filesystem is predicted to run out of inodes within the next 24 hours.'
|
||||
expr: "(\n node_filesystem_files_free{job=\"node\",fstype!=\"\"} / node_filesystem_files{job=\"node\",fstype!=\"\"} * 100 < 40\nand\n predict_linear(node_filesystem_files_free{job=\"node\",fstype!=\"\"}[6h], 24*60*60) < 0\nand\n node_filesystem_readonly{job=\"node\",fstype!=\"\"} == 0\n)\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: warning
|
||||
- alert: NodeFilesystemFilesFillingUp
|
||||
annotations:
|
||||
description: '{% raw %}Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available inodes left and is filling up fast.{% endraw %}'
|
||||
summary: 'Filesystem is predicted to run out of inodes within the next 4 hours.'
|
||||
expr: "(\n node_filesystem_files_free{job=\"node\",fstype!=\"\"} / node_filesystem_files{job=\"node\",fstype!=\"\"} * 100 < 20\nand\n predict_linear(node_filesystem_files_free{job=\"node\",fstype!=\"\"}[6h], 4*60*60) < 0\nand\n node_filesystem_readonly{job=\"node\",fstype!=\"\"} == 0\n)\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: critical
|
||||
- alert: NodeFilesystemAlmostOutOfFiles
|
||||
annotations:
|
||||
description: '{% raw %}Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available inodes left.{% endraw %}'
|
||||
summary: 'Filesystem has less than 5% inodes left.'
|
||||
expr: "(\n node_filesystem_files_free{job=\"node\",fstype!=\"\"} / node_filesystem_files{job=\"node\",fstype!=\"\"} * 100 < 5\nand\n node_filesystem_readonly{job=\"node\",fstype!=\"\"} == 0\n)\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: warning
|
||||
- alert: NodeFilesystemAlmostOutOfFiles
|
||||
annotations:
|
||||
description: '{% raw %}Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available inodes left.{% endraw %}'
|
||||
summary: 'Filesystem has less than 3% inodes left.'
|
||||
expr: "(\n node_filesystem_files_free{job=\"node\",fstype!=\"\"} / node_filesystem_files{job=\"node\",fstype!=\"\"} * 100 < 3\nand\n node_filesystem_readonly{job=\"node\",fstype!=\"\"} == 0\n)\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: critical
|
||||
- alert: NodeNetworkReceiveErrs
|
||||
annotations:
|
||||
description: '{% raw %}{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf "%.0f" $value }} receive errors in the last two minutes.{% endraw %}'
|
||||
summary: 'Network interface is reporting many receive errors.'
|
||||
expr: "increase(node_network_receive_errs_total[2m]) > 10\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: warning
|
||||
- alert: NodeNetworkTransmitErrs
|
||||
annotations:
|
||||
description: '{% raw %}{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf "%.0f" $value }} transmit errors in the last two minutes.{% endraw %}'
|
||||
summary: 'Network interface is reporting many transmit errors.'
|
||||
expr: "increase(node_network_transmit_errs_total[2m]) > 10\n"
|
||||
for: 1h
|
||||
labels:
|
||||
severity: warning
|
||||
- alert: NodeHighNumberConntrackEntriesUsed
|
||||
annotations:
|
||||
description: '{% raw %}{{ $value | humanizePercentage }} of conntrack entries are used{% endraw %}'
|
||||
summary: 'Number of conntrack entries is getting close to the limit'
|
||||
expr: "(node_nf_conntrack_entries / node_nf_conntrack_entries_limit) > 0.75\n"
|
||||
labels:
|
||||
severity: warning
|
||||
- alert: NodeClockSkewDetected
|
||||
annotations:
|
||||
message: '{% raw %}Clock on {{ $labels.instance }} is out of sync by more than 0.05s. Ensure NTP is configured correctly on this host.{% endraw %}'
|
||||
summary: 'Clock skew detected.'
|
||||
expr: "(\n node_timex_offset_seconds > 0.05\nand\n deriv(node_timex_offset_seconds[5m]) >= 0\n)\nor\n(\n node_timex_offset_seconds < -0.05\nand\n deriv(node_timex_offset_seconds[5m]) <= 0\n)\n"
|
||||
for: 10m
|
||||
labels:
|
||||
severity: warning
|
||||
- alert: NodeClockNotSynchronising
|
||||
annotations:
|
||||
message: '{% raw %}Clock on {{ $labels.instance }} is not synchronising. Ensure NTP is configured on this host.{% endraw %}'
|
||||
summary: 'Clock not synchronising.'
|
||||
expr: "min_over_time(node_timex_sync_status[5m]) == 0\n"
|
||||
for: 10m
|
||||
labels:
|
||||
severity: warning
|
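The two `NodeFilesystemSpaceFillingUp` rules above combine a free-space threshold with a `predict_linear` extrapolation and a read-only check. The following is a minimal Python sketch of that logic, not Prometheus's implementation: the helper names `predict_linear` and `filling_up` and the sample data are hypothetical, and the sketch extrapolates from the last sample rather than from the rule evaluation time.

```python
def predict_linear(samples, seconds_ahead):
    """Fit a least-squares line through (timestamp, value) samples and
    extrapolate it `seconds_ahead` seconds past the last sample.

    Hypothetical helper that only illustrates the idea behind PromQL's
    predict_linear().
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    intercept = mean_v - slope * mean_t
    last_t = samples[-1][0]
    return intercept + slope * (last_t + seconds_ahead)


def filling_up(samples, size_bytes, horizon_s, percent_threshold, read_only=False):
    """Mirror the three `and`-ed conditions of the alert expression."""
    avail = samples[-1][1]
    return (
        avail / size_bytes * 100 < percent_threshold
        and predict_linear(samples, horizon_s) < 0
        and not read_only
    )


# Assumed data: six hours of hourly samples on a 100 GB filesystem,
# starting at 30 GB available and losing 2 GB per hour.
samples = [(h * 3600, 30e9 - h * 2e9) for h in range(7)]

warning = filling_up(samples, 100e9, 24 * 60 * 60, 40)   # 24-hour horizon, < 40 %
critical = filling_up(samples, 100e9, 4 * 60 * 60, 20)   # 4-hour horizon, < 20 %
```

With this assumed data the 24-hour rule matches while the 4-hour rule does not, because the fitted line only reaches zero after roughly nine more hours.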
13
roles/prometheus/handlers/main.yml
Normal file
@@ -0,0 +1,13 @@
---
- name: restart prometheus
  become: true
  systemd:
    daemon_reload: true
    name: prometheus
    state: restarted

- name: reload prometheus
  become: true
  systemd:
    name: prometheus
    state: reloaded
34
roles/prometheus/meta/main.yml
Normal file
|
@ -0,0 +1,34 @@
|
|||
---
|
||||
galaxy_info:
|
||||
author: Prometheus Community
|
||||
description: Prometheus monitoring system configuration and management
|
||||
license: Apache
|
||||
company: none
|
||||
min_ansible_version: "2.7"
|
||||
platforms:
|
||||
- name: Ubuntu
|
||||
versions:
|
||||
- bionic
|
||||
- xenial
|
||||
- name: Debian
|
||||
versions:
|
||||
- stretch
|
||||
- buster
|
||||
- name: EL
|
||||
versions:
|
||||
- 7
|
||||
- 8
|
||||
- name: Fedora
|
||||
versions:
|
||||
- 30
|
||||
- 31
|
||||
galaxy_tags:
|
||||
- monitoring
|
||||
- prometheus
|
||||
- metrics
|
||||
- alerts
|
||||
- alerting
|
||||
- molecule
|
||||
- cloud
|
||||
|
||||
dependencies: []
|
70
roles/prometheus/molecule/alternative/molecule.yml
Normal file
|
@ -0,0 +1,70 @@
|
|||
---
|
||||
dependency:
|
||||
name: galaxy
|
||||
driver:
|
||||
name: docker
|
||||
platforms:
|
||||
- name: bionic
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-18.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: xenial
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-16.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: stretch
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-9
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: buster
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-10
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos7
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-7
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos8
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-8
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
- name: fedora
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:fedora-30
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
provisioner:
|
||||
name: ansible
|
||||
playbooks:
|
||||
prepare: prepare.yml
|
||||
converge: playbook.yml
|
||||
inventory:
|
||||
group_vars:
|
||||
python3:
|
||||
ansible_python_interpreter: /usr/bin/python3
|
||||
verifier:
|
||||
name: testinfra
|
89
roles/prometheus/molecule/alternative/playbook.yml
Normal file
|
@ -0,0 +1,89 @@
|
|||
---
|
||||
- name: Run role
|
||||
hosts: all
|
||||
any_errors_fatal: true
|
||||
roles:
|
||||
- cloudalchemy.prometheus
|
||||
vars:
|
||||
prometheus_binary_local_dir: '/tmp/prometheus-linux-amd64'
|
||||
prometheus_config_dir: /opt/prom/etc
|
||||
prometheus_db_dir: /opt/prom/lib
|
||||
prometheus_web_listen_address: "127.0.0.1:9090"
|
||||
prometheus_web_external_url: "http://127.0.0.1:9090/prometheus"
|
||||
prometheus_read_only_dirs:
|
||||
- /etc
|
||||
prometheus_storage_retention: "60d"
|
||||
prometheus_storage_retention_size: "1GB"
|
||||
prometheus_config_flags_extra:
|
||||
alertmanager.timeout: 10s
|
||||
web.enable-admin-api:
|
||||
enable-feature:
|
||||
- promql-at-modifier
|
||||
- remote-write-receiver
|
||||
prometheus_alertmanager_config:
|
||||
- scheme: https
|
||||
path_prefix: /alertmanager
|
||||
basic_auth:
|
||||
username: user
|
||||
password: pass
|
||||
static_configs:
|
||||
- targets: ["127.0.0.1:9090"]
|
||||
proxy_url: "127.0.0.2"
|
||||
prometheus_alert_relabel_configs:
|
||||
- action: labeldrop
|
||||
regex: replica
|
||||
prometheus_global:
|
||||
scrape_interval: 3s
|
||||
scrape_timeout: 2s
|
||||
evaluation_interval: 10s
|
||||
prometheus_remote_write:
|
||||
- url: http://influx.cloudalchemy.org:8086/api/v1/prom/write?db=test
|
||||
basic_auth:
|
||||
username: prometheus
|
||||
password: SuperSecret
|
||||
prometheus_remote_read:
|
||||
- url: http://influx.cloudalchemy.org:8086/api/v1/prom/read?db=cloudalchemy
|
||||
prometheus_external_labels:
|
||||
environment: "alternative"
|
||||
prometheus_targets:
|
||||
node:
|
||||
- targets:
|
||||
- demo.cloudalchemy.org:9100
|
||||
- influx.cloudalchemy.org:9100
|
||||
labels:
|
||||
env: cloudalchemy
|
||||
docker:
|
||||
- targets:
|
||||
- demo.cloudalchemy.org:8080
|
||||
- influx.cloudalchemy.org:8080
|
||||
labels:
|
||||
env: cloudalchemy
|
||||
prometheus_scrape_configs:
|
||||
- job_name: "prometheus"
|
||||
metrics_path: "{{ prometheus_metrics_path }}"
|
||||
static_configs:
|
||||
- targets:
|
||||
- "{{ ansible_fqdn | default(ansible_host) | default('localhost') }}:9090"
|
||||
- job_name: "node"
|
||||
file_sd_configs:
|
||||
- files:
|
||||
- "{{ prometheus_config_dir }}/file_sd/node.yml"
|
||||
- job_name: "docker"
|
||||
file_sd_configs:
|
||||
- files:
|
||||
- "{{ prometheus_config_dir }}/file_sd/docker.yml"
|
||||
- job_name: 'blackbox'
|
||||
metrics_path: /probe
|
||||
params:
|
||||
module: [http_2xx]
|
||||
static_configs:
|
||||
- targets:
|
||||
- http://demo.cloudalchemy.org:9100
|
||||
- http://influx.cloudalchemy.org:9100
|
||||
relabel_configs:
|
||||
- source_labels: [__address__]
|
||||
target_label: __param_target
|
||||
- source_labels: [__param_target]
|
||||
target_label: instance
|
||||
- target_label: __address__
|
||||
replacement: 127.0.0.1:9115 # Blackbox exporter.
|
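The `blackbox` job above uses three relabel steps so that Prometheus scrapes the exporter while keeping the probed URL as the `instance` label. The following Python sketch traces those steps for one target. The function names are hypothetical, and the URL construction assumes the default `http` scrape scheme and an arbitrary query-parameter order.

```python
from urllib.parse import urlencode


def apply_blackbox_relabeling(target, exporter_address="127.0.0.1:9115"):
    """Apply the three relabel steps of the `blackbox` job to one static target."""
    labels = {"__address__": target}
    # source_labels: [__address__] -> target_label: __param_target
    labels["__param_target"] = labels["__address__"]
    # source_labels: [__param_target] -> target_label: instance
    labels["instance"] = labels["__param_target"]
    # target_label: __address__, replacement: <blackbox exporter address>
    labels["__address__"] = exporter_address
    return labels


def probe_url(labels, module="http_2xx"):
    """Build the URL that would be scraped, assuming the `http` scheme."""
    query = urlencode({"module": module, "target": labels["__param_target"]})
    return f"http://{labels['__address__']}/probe?{query}"


labels = apply_blackbox_relabeling("http://demo.cloudalchemy.org:9100")
url = probe_url(labels)
```

The scrape goes to the exporter at `127.0.0.1:9115`, while `instance` still names the probed endpoint, so alerts and dashboards refer to the real target.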
38
roles/prometheus/molecule/alternative/prepare.yml
Normal file
|
@ -0,0 +1,38 @@
|
|||
---
|
||||
- name: Prepare
|
||||
hosts: localhost
|
||||
gather_facts: false
|
||||
vars:
|
||||
# This is meant to test a local prepared binary. It needs to be updated to support the minimum
|
||||
# flag features in the systemd service file.
|
||||
version: 2.25.2
|
||||
tasks:
|
||||
- name: download prometheus binary to local folder
|
||||
become: false
|
||||
get_url:
|
||||
url: "https://github.com/prometheus/prometheus/releases/download/v{{ version }}/prometheus-{{ version }}.linux-amd64.tar.gz"
|
||||
dest: "/tmp/prometheus-{{ version }}.linux-amd64.tar.gz"
|
||||
register: _download_archive
|
||||
until: _download_archive is succeeded
|
||||
retries: 5
|
||||
delay: 2
|
||||
run_once: true
|
||||
check_mode: false
|
||||
|
||||
- name: unpack prometheus binaries
|
||||
become: false
|
||||
unarchive:
|
||||
src: "/tmp/prometheus-{{ version }}.linux-amd64.tar.gz"
|
||||
dest: "/tmp"
|
||||
creates: "/tmp/prometheus-{{ version }}.linux-amd64/prometheus"
|
||||
run_once: true
|
||||
check_mode: false
|
||||
|
||||
- name: link to prometheus binaries directory
|
||||
become: false
|
||||
file:
|
||||
src: "/tmp/prometheus-{{ version }}.linux-amd64"
|
||||
dest: "/tmp/prometheus-linux-amd64"
|
||||
state: link
|
||||
run_once: true
|
||||
check_mode: false
|
|
@ -0,0 +1,58 @@
|
|||
import pytest
|
||||
import os
|
||||
import testinfra.utils.ansible_runner
|
||||
|
||||
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
|
||||
os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
|
||||
|
||||
|
||||
@pytest.mark.parametrize("dirs", [
|
||||
"/opt/prom/etc",
|
||||
"/opt/prom/etc/rules",
|
||||
"/opt/prom/etc/file_sd",
|
||||
"/opt/prom/lib"
|
||||
])
|
||||
def test_directories(host, dirs):
|
||||
d = host.file(dirs)
|
||||
assert d.is_directory
|
||||
assert d.exists
|
||||
|
||||
|
||||
@pytest.mark.parametrize("files", [
|
||||
"/opt/prom/etc/prometheus.yml",
|
||||
"/opt/prom/etc/rules/ansible_managed.rules",
|
||||
"/opt/prom/etc/file_sd/node.yml",
|
||||
"/opt/prom/etc/file_sd/docker.yml",
|
||||
"/usr/local/bin/prometheus",
|
||||
"/usr/local/bin/promtool"
|
||||
])
|
||||
def test_files(host, files):
|
||||
f = host.file(files)
|
||||
assert f.exists
|
||||
assert f.is_file
|
||||
|
||||
|
||||
@pytest.mark.parametrize('file, content', [
|
||||
("/etc/systemd/system/prometheus.service",
|
||||
"ReadOnly.*=/etc"),
|
||||
("/etc/systemd/system/prometheus.service",
|
||||
"enable-feature=promql-at-modifier"),
|
||||
("/etc/systemd/system/prometheus.service",
|
||||
"enable-feature=remote-write-receiver"),
|
||||
])
|
||||
def test_file_contents(host, file, content):
|
||||
f = host.file(file)
|
||||
assert f.exists
|
||||
assert f.is_file
|
||||
assert f.contains(content)
|
||||
|
||||
|
||||
def test_service(host):
|
||||
s = host.service("prometheus")
|
||||
# assert s.is_enabled
|
||||
assert s.is_running
|
||||
|
||||
|
||||
def test_socket(host):
|
||||
s = host.socket("tcp://127.0.0.1:9090")
|
||||
assert s.is_listening
|
75
roles/prometheus/molecule/default/molecule.yml
Normal file
|
@ -0,0 +1,75 @@
|
|||
---
|
||||
dependency:
|
||||
name: galaxy
|
||||
driver:
|
||||
name: docker
|
||||
# lint: |
|
||||
# set -e
|
||||
# yamllint .
|
||||
# ansible-lint
|
||||
# flake8
|
||||
platforms:
|
||||
- name: bionic
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-18.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: xenial
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:ubuntu-16.04
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: stretch
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-9
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: buster
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-10
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos7
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-7
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: centos8
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:centos-8
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
- name: fedora
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:fedora-30
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
provisioner:
|
||||
name: ansible
|
||||
playbooks:
|
||||
prepare: prepare.yml
|
||||
converge: playbook.yml
|
||||
inventory:
|
||||
group_vars:
|
||||
python3:
|
||||
ansible_python_interpreter: /usr/bin/python3
|
||||
verifier:
|
||||
name: testinfra
|
6
roles/prometheus/molecule/default/playbook.yml
Normal file
@@ -0,0 +1,6 @@
---
- name: Run role
  hosts: all
  any_errors_fatal: true
  roles:
    - cloudalchemy.prometheus
5
roles/prometheus/molecule/default/prepare.yml
Normal file
@@ -0,0 +1,5 @@
---
- name: Prepare
  hosts: all
  gather_facts: false
  tasks: []
73
roles/prometheus/molecule/default/tests/test_default.py
Normal file
|
@ -0,0 +1,73 @@
|
|||
import pytest
|
||||
import os
|
||||
import yaml
|
||||
import testinfra.utils.ansible_runner
|
||||
|
||||
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
|
||||
os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def AnsibleDefaults():
|
||||
with open("defaults/main.yml", 'r') as stream:
|
||||
return yaml.safe_load(stream)
|
||||
|
||||
|
||||
@pytest.mark.parametrize("dirs", [
|
||||
"/etc/prometheus",
|
||||
"/etc/prometheus/console_libraries",
|
||||
"/etc/prometheus/consoles",
|
||||
"/etc/prometheus/rules",
|
||||
"/etc/prometheus/file_sd",
|
||||
"/var/lib/prometheus"
|
||||
])
|
||||
def test_directories(host, dirs):
|
||||
d = host.file(dirs)
|
||||
assert d.is_directory
|
||||
assert d.exists
|
||||
|
||||
|
||||
@pytest.mark.parametrize("files", [
|
||||
"/etc/prometheus/prometheus.yml",
|
||||
"/etc/prometheus/console_libraries/prom.lib",
|
||||
"/etc/prometheus/consoles/prometheus.html",
|
||||
"/etc/prometheus/web.yml",
|
||||
"/etc/systemd/system/prometheus.service",
|
||||
"/usr/local/bin/prometheus",
|
||||
"/usr/local/bin/promtool"
|
||||
])
|
||||
def test_files(host, files):
|
||||
f = host.file(files)
|
||||
assert f.exists
|
||||
assert f.is_file
|
||||
|
||||
|
||||
@pytest.mark.parametrize("files", [
|
||||
"/etc/prometheus/rules/ansible_managed.rules"
|
||||
])
|
||||
def test_absent(host, files):
|
||||
f = host.file(files)
|
||||
assert f.exists
|
||||
|
||||
|
||||
def test_user(host):
|
||||
assert host.group("prometheus").exists
|
||||
assert host.user("prometheus").exists
|
||||
|
||||
|
||||
def test_service(host):
|
||||
s = host.service("prometheus")
|
||||
# assert s.is_enabled
|
||||
assert s.is_running
|
||||
|
||||
|
||||
def test_socket(host):
|
||||
s = host.socket("tcp://0.0.0.0:9090")
|
||||
assert s.is_listening
|
||||
|
||||
|
||||
def test_version(host, AnsibleDefaults):
|
||||
version = os.getenv('PROMETHEUS', AnsibleDefaults['prometheus_version'])
|
||||
run = host.run("/usr/local/bin/prometheus --version")
|
||||
out = run.stdout+run.stderr
|
||||
assert "prometheus, version " + version in out
|
35
roles/prometheus/molecule/latest/molecule.yml
Normal file
|
@ -0,0 +1,35 @@
|
|||
---
|
||||
dependency:
|
||||
name: galaxy
|
||||
driver:
|
||||
name: docker
|
||||
platforms:
|
||||
- name: buster
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:debian-10
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
- name: fedora
|
||||
pre_build_image: true
|
||||
image: quay.io/paulfantom/molecule-systemd:fedora-30
|
||||
docker_host: "${DOCKER_HOST:-unix://var/run/docker.sock}"
|
||||
privileged: true
|
||||
volumes:
|
||||
- /sys/fs/cgroup:/sys/fs/cgroup:ro
|
||||
groups:
|
||||
- python3
|
||||
provisioner:
|
||||
name: ansible
|
||||
playbooks:
|
||||
create: ../default/create.yml
|
||||
prepare: ../default/prepare.yml
|
||||
converge: playbook.yml
|
||||
destroy: ../default/destroy.yml
|
||||
inventory:
|
||||
group_vars:
|
||||
python3:
|
||||
ansible_python_interpreter: /usr/bin/python3
|
||||
verifier:
|
||||
name: testinfra
|
8
roles/prometheus/molecule/latest/playbook.yml
Normal file
@@ -0,0 +1,8 @@
---
- name: Run role
  hosts: all
  any_errors_fatal: true
  roles:
    - cloudalchemy.prometheus
  vars:
    prometheus_version: latest
28
roles/prometheus/molecule/latest/tests/test_alternative.py
Normal file
|
@ -0,0 +1,28 @@
|
|||
import pytest
|
||||
import os
|
||||
import testinfra.utils.ansible_runner
|
||||
|
||||
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
|
||||
os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
|
||||
|
||||
|
||||
@pytest.mark.parametrize("files", [
|
||||
"/etc/systemd/system/prometheus.service",
|
||||
"/usr/local/bin/prometheus",
|
||||
"/usr/local/bin/promtool"
|
||||
])
|
||||
def test_files(host, files):
|
||||
f = host.file(files)
|
||||
assert f.exists
|
||||
assert f.is_file
|
||||
|
||||
|
||||
def test_service(host):
|
||||
s = host.service("prometheus")
|
||||
# assert s.is_enabled
|
||||
assert s.is_running
|
||||
|
||||
|
||||
def test_socket(host):
|
||||
s = host.socket("tcp://0.0.0.0:9090")
|
||||
assert s.is_listening
|
69
roles/prometheus/tasks/configure.yml
Normal file
|
@ -0,0 +1,69 @@
|
|||
---
|
||||
- name: alerting rules file
|
||||
template:
|
||||
src: "alert.rules.j2"
|
||||
dest: "{{ prometheus_config_dir }}/rules/ansible_managed.rules"
|
||||
owner: root
|
||||
group: prometheus
|
||||
mode: 0640
|
||||
validate: "{{ _prometheus_binary_install_dir }}/promtool check rules %s"
|
||||
when:
|
||||
- prometheus_alert_rules != []
|
||||
notify:
|
||||
- reload prometheus
|
||||
|
||||
- name: copy custom alerting rule files
|
||||
copy:
|
||||
src: "{{ item }}"
|
||||
dest: "{{ prometheus_config_dir }}/rules/"
|
||||
owner: root
|
||||
group: prometheus
|
||||
mode: 0640
|
||||
validate: "{{ _prometheus_binary_install_dir }}/promtool check rules %s"
|
||||
with_fileglob: "{{ prometheus_alert_rules_files }}"
|
||||
notify:
|
||||
- reload prometheus
|
||||
|
||||
- name: configure prometheus
|
||||
template:
|
||||
src: "{{ prometheus_config_file }}"
|
||||
dest: "{{ prometheus_config_dir }}/prometheus.yml"
|
||||
force: true
|
||||
owner: root
|
||||
group: prometheus
|
||||
mode: 0640
|
||||
validate: "{{ _prometheus_binary_install_dir }}/promtool check config %s"
|
||||
notify:
|
||||
- reload prometheus
|
||||
|
||||
- name: configure Prometheus web
|
||||
copy:
|
||||
content: "{{ prometheus_web_config | to_nice_yaml(indent=2,sort_keys=False) }}"
|
||||
dest: "{{ prometheus_config_dir }}/web.yml"
|
||||
force: true
|
||||
owner: root
|
||||
group: prometheus
|
||||
mode: 0640
|
||||
|
||||
- name: configure prometheus static targets
|
||||
copy:
|
||||
content: |
|
||||
#jinja2: lstrip_blocks: True
|
||||
{{ item.value | to_nice_yaml(indent=2,sort_keys=False) }}
|
||||
dest: "{{ prometheus_config_dir }}/file_sd/{{ item.key }}.yml"
|
||||
force: true
|
||||
owner: root
|
||||
group: prometheus
|
||||
mode: 0640
|
||||
with_dict: "{{ prometheus_targets }}"
|
||||
when: prometheus_targets != {}
|
||||
|
||||
- name: copy prometheus custom static targets
|
||||
copy:
|
||||
src: "{{ item }}"
|
||||
dest: "{{ prometheus_config_dir }}/file_sd/"
|
||||
force: true
|
||||
owner: root
|
||||
group: prometheus
|
||||
mode: 0640
|
||||
with_fileglob: "{{ prometheus_static_targets_files }}"
|
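The `configure prometheus static targets` task above writes one file per key of `prometheus_targets` into `file_sd/`. The following Python sketch shows that mapping only as an illustration. The function `render_file_sd` is hypothetical, and it serializes with the standard-library `json` module (JSON flow syntax is also valid YAML) instead of the `to_nice_yaml` filter the role uses.

```python
import json
from pathlib import PurePosixPath


def render_file_sd(targets, config_dir="/etc/prometheus"):
    """Map each job in `targets` to a file_sd path and its serialized target groups."""
    rendered = {}
    for job, groups in targets.items():
        path = PurePosixPath(config_dir) / "file_sd" / f"{job}.yml"
        rendered[str(path)] = json.dumps(groups, indent=2)
    return rendered


# Target groups taken from the "alternative" molecule scenario.
example_targets = {
    "node": [
        {
            "targets": ["demo.cloudalchemy.org:9100", "influx.cloudalchemy.org:9100"],
            "labels": {"env": "cloudalchemy"},
        }
    ],
}

files = render_file_sd(example_targets, config_dir="/opt/prom/etc")
```

An empty `prometheus_targets` produces no files, which corresponds to the task's `when: prometheus_targets != {}` guard.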
137
roles/prometheus/tasks/install.yml
Normal file
|
@ -0,0 +1,137 @@
|
|||
---
|
||||
- name: create prometheus system group
|
||||
group:
|
||||
name: prometheus
|
||||
system: true
|
||||
state: present
|
||||
|
||||
- name: create prometheus system user
|
||||
user:
|
||||
name: prometheus
|
||||
system: true
|
||||
shell: "/usr/sbin/nologin"
|
||||
group: prometheus
|
||||
createhome: false
|
||||
home: "{{ prometheus_db_dir }}"
|
||||
|
||||
- name: create prometheus data directory
|
||||
file:
|
||||
path: "{{ prometheus_db_dir }}"
|
||||
state: directory
|
||||
owner: prometheus
|
||||
group: prometheus
|
||||
mode: 0755
|
||||
|
||||
- name: create prometheus configuration directories
|
||||
file:
|
||||
path: "{{ item }}"
|
||||
state: directory
|
||||
owner: root
|
||||
group: prometheus
|
||||
mode: 0770
|
||||
with_items:
|
||||
- "{{ prometheus_config_dir }}"
|
||||
- "{{ prometheus_config_dir }}/rules"
|
||||
- "{{ prometheus_config_dir }}/file_sd"
|
||||
|
||||
- block:
|
||||
- name: download prometheus binary to local folder
|
||||
become: false
|
||||
get_url:
|
||||
url: "https://github.com/prometheus/prometheus/releases/download/v{{ prometheus_version }}/prometheus-{{ prometheus_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
dest: "/tmp/prometheus-{{ prometheus_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
checksum: "sha256:{{ __prometheus_checksum }}"
|
||||
register: _download_archive
|
||||
until: _download_archive is succeeded
|
||||
retries: 5
|
||||
delay: 2
|
||||
# run_once: true # <-- this cannot be set due to multi-arch support
|
||||
delegate_to: localhost
|
||||
check_mode: false
|
||||
|
||||
- name: unpack prometheus binaries
|
||||
become: false
|
||||
unarchive:
|
||||
src: "/tmp/prometheus-{{ prometheus_version }}.linux-{{ go_arch }}.tar.gz"
|
||||
dest: "/tmp"
|
||||
creates: "/tmp/prometheus-{{ prometheus_version }}.linux-{{ go_arch }}/prometheus"
|
||||
delegate_to: localhost
|
||||
check_mode: false
|
||||
|
||||
- name: propagate official prometheus and promtool binaries
|
||||
copy:
|
||||
src: "/tmp/prometheus-{{ prometheus_version }}.linux-{{ go_arch }}/{{ item }}"
|
||||
dest: "{{ _prometheus_binary_install_dir }}/{{ item }}"
|
||||
mode: 0755
|
||||
owner: root
|
||||
group: root
|
||||
with_items:
|
||||
- prometheus
|
||||
- promtool
|
||||
notify:
|
||||
- restart prometheus
|
||||
|
||||
- name: propagate official console templates
|
||||
copy:
|
||||
src: "/tmp/prometheus-{{ prometheus_version }}.linux-{{ go_arch }}/{{ item }}/"
|
||||
dest: "{{ prometheus_config_dir }}/{{ item }}/"
|
||||
mode: 0644
|
||||
owner: root
|
||||
group: root
|
||||
with_items:
|
||||
- console_libraries
|
||||
- consoles
|
||||
notify:
|
||||
- restart prometheus
|
||||
when:
|
||||
- prometheus_binary_local_dir | length == 0
|
||||
- not prometheus_skip_install
|
||||
|
||||
- name: propagate locally distributed prometheus and promtool binaries
|
||||
copy:
|
||||
src: "{{ prometheus_binary_local_dir }}/{{ item }}"
|
||||
dest: "{{ _prometheus_binary_install_dir }}/{{ item }}"
|
||||
mode: 0755
|
||||
owner: root
|
||||
group: root
|
||||
with_items:
|
||||
- prometheus
|
||||
- promtool
|
||||
when:
|
||||
- prometheus_binary_local_dir | length > 0
|
||||
- not prometheus_skip_install
|
||||
notify:
|
||||
- restart prometheus
|
||||
|
||||
- name: create systemd service unit
|
||||
template:
|
||||
src: prometheus.service.j2
|
||||
dest: /etc/systemd/system/prometheus.service
|
||||
owner: root
|
||||
group: root
|
||||
mode: 0644
|
||||
notify:
|
||||
- restart prometheus
|
||||
|
||||
- name: Install SELinux dependencies
|
||||
package:
|
||||
name: "{{ item }}"
|
||||
state: present
|
||||
with_items: "{{ prometheus_selinux_packages }}"
|
||||
register: _install_packages
|
||||
until: _install_packages is succeeded
|
||||
retries: 5
|
||||
delay: 2
|
||||
when:
|
||||
- ansible_version.full is version('2.4', '>=')
|
||||
- ansible_selinux.status == "enabled"
|
||||
|
||||
- name: Allow prometheus to bind to port in SELinux
|
||||
seport:
|
||||
ports: "{{ prometheus_web_listen_address.split(':')[1] }}"
|
||||
proto: tcp
|
||||
setype: http_port_t
|
||||
state: present
|
||||
when:
|
||||
- ansible_version.full is version('2.4', '>=')
|
||||
- ansible_selinux.status == "enabled"
|
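The SELinux task above derives the port with `prometheus_web_listen_address.split(':')[1]`. The following Python sketch shows what that expression yields for a few address shapes. Both function names are hypothetical, and the `rpartition` variant is only one possible alternative, not something the role implements.

```python
def listen_port(listen_address):
    """Same expression as in the task: the second ':'-separated field."""
    return listen_address.split(":")[1]


def listen_port_from_last_colon(listen_address):
    """Alternative sketch: take whatever follows the last ':'."""
    _host, sep, port = listen_address.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError(f"no port found in {listen_address!r}")
    return port


ipv4 = listen_port("0.0.0.0:9090")
any_host = listen_port(":9090")
ipv6_naive = listen_port("[::1]:9090")
ipv6_last = listen_port_from_last_colon("[::1]:9090")
```

For `host:port` and `:port` values both approaches agree. With a bracketed IPv6 literal the second field is empty, so such addresses would need different handling.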
38
roles/prometheus/tasks/main.yml
Normal file
|
@ -0,0 +1,38 @@
|
|||
---
|
||||
- name: Gather variables for each operating system
|
||||
include_vars: "{{ item }}"
|
||||
with_first_found:
|
||||
- "{{ ansible_distribution | lower }}-{{ ansible_distribution_major_version }}.yml"
|
||||
- "{{ ansible_distribution | lower }}.yml"
|
||||
- "{{ ansible_os_family | lower }}-{{ ansible_distribution_major_version | lower }}.yml"
|
||||
- "{{ ansible_os_family | lower }}.yml"
|
||||
tags:
|
||||
- prometheus_configure
|
||||
- prometheus_install
|
||||
- prometheus_run
|
||||
|
||||
- include: preflight.yml
|
||||
tags:
|
||||
- prometheus_configure
|
||||
- prometheus_install
|
||||
- prometheus_run
|
||||
|
||||
- include: install.yml
|
||||
become: true
|
||||
tags:
|
||||
- prometheus_install
|
||||
|
||||
- include: configure.yml
|
||||
become: true
|
||||
tags:
|
||||
- prometheus_configure
|
||||
|
||||
- name: ensure prometheus service is started and enabled
|
||||
become: true
|
||||
systemd:
|
||||
daemon_reload: true
|
||||
name: prometheus
|
||||
state: started
|
||||
enabled: true
|
||||
tags:
|
||||
- prometheus_run
|
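Note (illustrative, not part of this commit): a minimal sketch of how a playbook might apply the role so that the tags defined in tasks/main.yml can be used. The playbook file name `site.yml`, the play name, and the inventory group `monitoring` are assumptions; only the role directory `roles/prometheus` and the tag names come from the diff.

```yaml
# site.yml (hypothetical playbook placed next to the roles/ directory)
- name: Deploy Prometheus
  hosts: monitoring        # assumed inventory group
  roles:
    - role: prometheus     # expected to resolve to roles/prometheus
```

Running `ansible-playbook site.yml --tags prometheus_configure` should then limit the run to the variable gathering, the preflight checks, and configure.yml, since those are the entries carrying that tag.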
114
roles/prometheus/tasks/preflight.yml
Normal file
@@ -0,0 +1,114 @@
---
- name: Assert usage of systemd as an init system
  assert:
    that: ansible_service_mgr == 'systemd'
    msg: "This role only works with systemd"

- name: Get systemd version
  command: systemctl --version
  changed_when: false
  check_mode: false
  register: __systemd_version
  tags:
    - skip_ansible_lint

- name: Set systemd version fact
  set_fact:
    prometheus_systemd_version: "{{ __systemd_version.stdout_lines[0].split(' ')[1] }}"

- name: Assert no duplicate config flags
  assert:
    that:
      - prometheus_config_flags_extra['config.file'] is not defined
      - prometheus_config_flags_extra['storage.tsdb.path'] is not defined
      - prometheus_config_flags_extra['storage.local.path'] is not defined
      - prometheus_config_flags_extra['web.listen-address'] is not defined
      - prometheus_config_flags_extra['web.external-url'] is not defined
    msg: "Detected duplicate configuration entry. Please check your ansible variables and role README.md."

- name: Assert external_labels aren't configured twice
  assert:
    that: prometheus_global.external_labels is not defined
    msg: "Use prometheus_external_labels to define external labels"

- name: Set prometheus external metrics path
  set_fact:
    prometheus_metrics_path: "/{{ ( prometheus_web_external_url + '/metrics' ) | regex_replace('^(.*://)?(.*?)/') }}"

- name: Fail when prometheus_config_flags_extra duplicates parameters set by other variables
  fail:
    msg: >
      Whoops. You are duplicating configuration. Please look at your prometheus_config_flags_extra
      and check against other variables in defaults/main.yml
  with_items:
    - 'storage.tsdb.retention'
    - 'storage.tsdb.path'
    - 'storage.local.retention'
    - 'storage.local.path'
    - 'config.file'
    - 'web.listen-address'
    - 'web.external-url'
  when: item in prometheus_config_flags_extra.keys()

- name: Get all file_sd files from scrape_configs
  set_fact:
    file_sd_files: "{{ prometheus_scrape_configs | json_query('[*][].file_sd_configs[*][].files[]') }}"

- name: Fail when file_sd targets are not defined in scrape_configs
  fail:
    msg: >
      Oh, snap! `{{ item.key }}` couldn't be found in your scrape configs. Please ensure you provided
      all targets from prometheus_targets in prometheus_scrape_configs
  when: not prometheus_config_dir + "/file_sd/" + item.key + ".yml" in file_sd_files
  # when: not item | basename | splitext | difference(['.yml']) | join('') in prometheus_targets.keys()
  with_dict: "{{ prometheus_targets }}"

- name: Alert when prometheus_alertmanager_config is empty, but prometheus_alert_rules is specified
  debug:
    msg: >
      No alertmanager configuration was specified. If you want your alerts to be sent make sure to
      specify a prometheus_alertmanager_config in defaults/main.yml.
  when:
    - prometheus_alertmanager_config == []
    - prometheus_alert_rules != []

- block:
    - name: Get latest release
      uri:
        url: "https://api.github.com/repos/prometheus/prometheus/releases/latest"
        method: GET
        return_content: true
        status_code: 200
        body_format: json
        validate_certs: false
        user: "{{ lookup('env', 'GH_USER') | default(omit, true) }}"
        password: "{{ lookup('env', 'GH_TOKEN') | default(omit, true) }}"
      no_log: "{{ not lookup('env', 'ANSIBLE_DEBUG') | bool }}"
      register: _latest_release
      until: _latest_release.status == 200
      retries: 5

    - name: "Set prometheus version to {{ _latest_release.json.tag_name[1:] }}"
      set_fact:
        prometheus_version: "{{ _latest_release.json.tag_name[1:] }}"
  when:
    - prometheus_version == "latest"
    - prometheus_binary_local_dir | length == 0
    - not prometheus_skip_install

- block:
    - name: "Get checksum list"
      set_fact:
        __prometheus_checksums: "{{ lookup('url', 'https://github.com/prometheus/prometheus/releases/download/v' + prometheus_version + '/sha256sums.txt', wantlist=True) | list }}"
      run_once: true

    - name: "Get checksum for {{ go_arch }} architecture"
      set_fact:
        __prometheus_checksum: "{{ item.split(' ')[0] }}"
      with_items: "{{ __prometheus_checksums }}"
      when:
        - "('linux-' + go_arch + '.tar.gz') in item"
  delegate_to: localhost
  when:
    - prometheus_binary_local_dir | length == 0
    - not prometheus_skip_install
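Note (illustrative, not part of this commit): the file_sd preflight check in preflight.yml compares every key of `prometheus_targets` against the file paths collected from `prometheus_scrape_configs`. Below is a sketch of variables that should satisfy it for a single target group named `node`. The job name, the exporter address, and the shape of each `prometheus_targets` entry (a list of target groups) are assumptions and are not shown in this part of the diff.

```yaml
prometheus_targets:
  node:                      # key "node" -> expected path <prometheus_config_dir>/file_sd/node.yml
    - targets:
        - "localhost:9100"   # assumed exporter address

prometheus_scrape_configs:
  - job_name: "node"
    file_sd_configs:
      - files:
          - "{{ prometheus_config_dir }}/file_sd/node.yml"
```

If the `files` entry pointed at a different path, the `fail` task would be expected to stop the play with the "couldn't be found in your scrape configs" message.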
6
roles/prometheus/templates/alert.rules.j2
Normal file
@@ -0,0 +1,6 @@
{{ ansible_managed | comment }}

groups:
- name: ansible managed alert rules
  rules:
  {{ prometheus_alert_rules | to_nice_yaml(indent=2,sort_keys=False) | indent(2,False) }}
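Note (illustrative, not part of this commit): the alert.rules.j2 template renders `prometheus_alert_rules` as the `rules:` list of a single group. A minimal sketch of such a variable follows; the rule name, duration, label, and annotation text are assumptions chosen for illustration.

```yaml
prometheus_alert_rules:
  - alert: InstanceDown          # hypothetical rule name
    expr: "up == 0"
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Instance is unreachable"
```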
85
roles/prometheus/templates/prometheus.service.j2
Normal file
@@ -0,0 +1,85 @@
{{ ansible_managed | comment }}

[Unit]
Description=Prometheus
After=network-online.target
Requires=local-fs.target
After=local-fs.target

[Service]
Type=simple
Environment="GOMAXPROCS={{ ansible_processor_vcpus|default(ansible_processor_count) }}"
User=prometheus
Group=prometheus
ExecReload=/bin/kill -HUP $MAINPID
ExecStart={{ _prometheus_binary_install_dir }}/prometheus \
  --storage.tsdb.path={{ prometheus_db_dir }} \
{% if prometheus_version is version('2.7.0', '>=') %}
  --storage.tsdb.retention.time={{ prometheus_storage_retention }} \
  --storage.tsdb.retention.size={{ prometheus_storage_retention_size }} \
{% else %}
  --storage.tsdb.retention={{ prometheus_storage_retention }} \
{% endif %}
{% if prometheus_version is version('2.24.0', '>=') %}
  --web.config.file={{ prometheus_config_dir }}/web.yml \
{% endif %}
  --web.console.libraries={{ prometheus_config_dir }}/console_libraries \
  --web.console.templates={{ prometheus_config_dir }}/consoles \
  --web.listen-address={{ prometheus_web_listen_address }} \
  --web.external-url={{ prometheus_web_external_url }} \
{% for flag, flag_value in prometheus_config_flags_extra.items() %}
{% if not flag_value %}
  --{{ flag }} \
{% elif flag_value is string %}
  --{{ flag }}={{ flag_value }} \
{% elif flag_value is sequence %}
{% for flag_value_item in flag_value %}
  --{{ flag }}={{ flag_value_item }} \
{% endfor %}
{% endif %}
{% endfor %}
  --config.file={{ prometheus_config_dir }}/prometheus.yml

CapabilityBoundingSet=CAP_SETUID
LimitNOFILE=65000
LockPersonality=true
NoNewPrivileges=true
MemoryDenyWriteExecute=true
PrivateDevices=true
PrivateTmp=true
ProtectHome=true
RemoveIPC=true
RestrictSUIDSGID=true
#SystemCallFilter=@signal @timer

{% if prometheus_systemd_version | int >= 231 %}
ReadWritePaths={{ prometheus_db_dir }}
{% for path in prometheus_read_only_dirs %}
ReadOnlyPaths={{ path }}
{% endfor %}
{% else %}
ReadWriteDirectories={{ prometheus_db_dir }}
{% for path in prometheus_read_only_dirs %}
ReadOnlyDirectories={{ path }}
{% endfor %}
{% endif %}

{% if prometheus_systemd_version | int >= 232 %}
PrivateUsers=true
ProtectControlGroups=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectSystem=strict
{% else %}
ProtectSystem=full
{% endif %}

{% if http_proxy is defined %}
Environment="HTTP_PROXY={{ http_proxy }}"{% if https_proxy is defined %} "HTTPS_PROXY={{ https_proxy }}"{% endif %}
{% endif %}

SyslogIdentifier=prometheus
Restart=always

[Install]
WantedBy=multi-user.target
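Note (illustrative, not part of this commit): the loop over `prometheus_config_flags_extra` in prometheus.service.j2 has three branches (empty value, string, sequence). A sketch of a variable that exercises each branch is shown below. The flag names are examples only and should be checked against the Prometheus version in use; the feature names under `enable-feature` are assumptions.

```yaml
prometheus_config_flags_extra:
  web.enable-lifecycle:            # empty value -> rendered as a bare --web.enable-lifecycle
  log.level: "debug"               # string      -> --log.level=debug
  enable-feature:                  # list        -> one --enable-feature=<item> per entry
    - "promql-at-modifier"
    - "remote-write-receiver"
```

A value that is neither empty, a string, nor a sequence (for example an unquoted `10` or `true`) matches none of the three branches, so the flag would not be rendered at all. Quoting values is therefore the safer choice.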
34
roles/prometheus/templates/prometheus.yml.j2
Normal file
@@ -0,0 +1,34 @@
#jinja2: trim_blocks: True, lstrip_blocks: True
{{ ansible_managed | comment }}
# http://prometheus.io/docs/operating/configuration/

global:
  {{ prometheus_global | to_nice_yaml(indent=2,sort_keys=False) | indent(2, False) }}
  external_labels:
    {{ prometheus_external_labels | to_nice_yaml(indent=2,sort_keys=False) | indent(4, False) }}

{% if prometheus_remote_write != [] %}
remote_write:
  {{ prometheus_remote_write | to_nice_yaml(indent=2,sort_keys=False) | indent(2, False) }}
{% endif %}

{% if prometheus_remote_read != [] %}
remote_read:
  {{ prometheus_remote_read | to_nice_yaml(indent=2,sort_keys=False) | indent(2, False) }}
{% endif %}

rule_files:
  - {{ prometheus_config_dir }}/rules/*.rules

{% if prometheus_alertmanager_config | length > 0 %}
alerting:
  alertmanagers:
  {{ prometheus_alertmanager_config | to_nice_yaml(indent=2,sort_keys=False) | indent(2,False) }}
  {% if prometheus_alert_relabel_configs | length > 0 %}
  alert_relabel_configs:
  {{ prometheus_alert_relabel_configs | to_nice_yaml(indent=2,sort_keys=False) | indent(2,False) }}
  {% endif %}
{% endif %}

scrape_configs:
  {{ prometheus_scrape_configs | to_nice_yaml(indent=2,sort_keys=False) | indent(2,False) }}
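Note (illustrative, not part of this commit): prometheus.yml.j2 writes `external_labels` from its own variable, and the preflight check rejects an `external_labels` key inside `prometheus_global`. A sketch of how the two variables might be split is shown below; the interval values and the label are assumptions.

```yaml
prometheus_global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
  # no external_labels key here; the preflight assert would fail if it were present

prometheus_external_labels:
  environment: "production"   # assumed label
```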
4
roles/prometheus/vars/centos-8.yml
Normal file
@@ -0,0 +1,4 @@
---
prometheus_selinux_packages:
  - python3-libselinux
  - python3-policycoreutils
4
roles/prometheus/vars/centos.yml
Normal file
@@ -0,0 +1,4 @@
---
prometheus_selinux_packages:
  - libselinux-python
  - policycoreutils-python
4
roles/prometheus/vars/debian.yml
Normal file
@@ -0,0 +1,4 @@
---
prometheus_selinux_packages:
  - python-selinux
  - policycoreutils
Some files were not shown because too many files have changed in this diff.