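I came up with a solution that works for me.
I created the uptime check and the uptime check alert as two separate Terraform modules.
The Terraform uptime check module looks like this: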
resource "google_monitoring_uptime_check_config" "uptime-check" {
  project      = var.project_id
  display_name = var.display_name
  timeout      = "10s"
  period       = "60s"

  http_check {
    path         = var.path
    port         = var.port
    use_ssl      = true
    validate_ssl = true
  }

  monitored_resource {
    type = "uptime_url"
    labels = {
      host       = var.hostname,
      project_id = var.project_id
    }
  }

  content_matchers {
    content = "\"status\":\"UP\""
  }
}
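The resource above reads five input variables. A minimal sketch of a matching variables.tf could look like the following; the variable names come from the resource, but the types, descriptions, and the port default are my assumptions rather than part of the original module.

```hcl
# variables.tf for the uptime check module.
# Names match the var.* references in the resource above;
# types, descriptions, and defaults are assumptions.

variable "project_id" {
  description = "GCP project that owns the uptime check"
  type        = string
}

variable "display_name" {
  description = "Human-readable name shown in Cloud Monitoring"
  type        = string
}

variable "hostname" {
  description = "Host to probe, without scheme or path"
  type        = string
}

variable "path" {
  description = "HTTP path requested by the check"
  type        = string
}

variable "port" {
  description = "TCP port to probe"
  type        = number
  default     = 443 # assumed default, chosen because use_ssl is true
}
```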
Then, in outputs.tf for that module, I have:
output "uptime_check_id" {
  value = google_monitoring_uptime_check_config.uptime-check.uptime_check_id
}
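For the output to be reachable as module.uptime-check.uptime_check_id, the root configuration has to instantiate the module under the name "uptime-check". A sketch of what that call might look like is below; the source path and all argument values are hypothetical placeholders, not taken from my actual setup.

```hcl
# Root module: instantiate the uptime check module.
# The source path and every value below are hypothetical placeholders.
module "uptime-check" {
  source = "./modules/uptime-check" # assumed location of the module

  project_id   = var.project_id
  display_name = "example-service uptime check" # placeholder
  hostname     = "example.com"                  # placeholder
  path         = "/health"                      # placeholder
  port         = 443
}
```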
Then, for the alerts module, I followed the Terraform documentation but modified the example as follows:
module "medallies-common-alerts" {
  source                  = "./modules/alerts"
  project_id              = var.project_id
  uptime_check_depends_on = [module.uptime-check]
  check_id                = module.uptime-check.uptime_check_id
}
...
resource "google_monitoring_alert_policy" "alert_policy_uptime_check" {
  project    = var.project_id
  enabled    = true
  depends_on = [var.uptime_check_depends_on]
  ....
  condition_threshold {
    filter          = format("metric.type=\"monitoring.googleapis.com/uptime_check/check_passed\" AND metric.label.\"check_id\"=\"%s\" AND resource.type=\"uptime_url\"", var.check_id)
    duration        = "300s"
    comparison      = "COMPARISON_GT"
    threshold_value = "1"
    trigger {
      count = 1
    }
  ...
}
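The alerts module needs to declare the three inputs passed in by the module call. A minimal sketch, assuming these types (the names are the ones used above; the types and the empty-list default are my assumptions):

```hcl
# variables.tf for the alerts module.
# Names match the module call above; types and defaults are assumptions.

variable "project_id" {
  description = "GCP project that owns the alert policy"
  type        = string
}

variable "check_id" {
  description = "uptime_check_id exported by the uptime check module"
  type        = string
}

variable "uptime_check_depends_on" {
  description = "Objects the alert policy should wait for before being created"
  type        = any
  default     = []
}
```

Because check_id is fed from module.uptime-check.uptime_check_id, Terraform already infers that the alert policy must be created after the uptime check. The extra uptime_check_depends_on variable just makes that ordering explicit.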
Hope this helps someone else too.