Logstash collects a batch of events every week and forwards them to Elasticsearch.
How do I configure Logstash so that it tells Elasticsearch to delete the old events?
Edit 2018-03-28:
Input:
{host:"host1", type:"packages", records: [{name:"pkg1", ver: "1"}, {name: "pkg2", ver: "2"}, ...]}
{host:"host1", type:"mounts", records: [{path:"path1", dev: "dev1"}, {path:"path2", dev: "dev2"}, ...]}
{host:"host1", type:"???", records: [{???}, {???}, ...]}
...
{host:"host2", type:"packages", records: [{name:"pkg1", ver: "1"}, {name: "pkg2", ver: "2"}, ...]}
{host:"host2", type:"mounts", records: [{path:"path1", dev: "dev1"}, {path:"path2", dev: "dev2"}, ...]}
{host:"host2", type:"???", records: [{???}, {???}, ...]}
These are various kinds of events, one batch per host. The schema of each event's inner records cannot be determined in advance.
To be able to search fields inside the array precisely, I have to split the array into multiple Elasticsearch documents.
(I know there are ways to search inside an array without splitting; that is another story: Nested Objects. In my case the inner objects have no fixed schema, so I cannot provide every inner field definition up front.)
Output:
{host: "host1", type:"packages", record: {name: "pkg1", ver: "1"}}
{host: "host1", type:"packages", record: {name: "pkg2", ver: "2"}}
{host: "host1", type:"mounts", record: {path: "path1", dev: "dev1"}}
{host: "host1", type:"???", record: {???}}
{host: "host1", type:"???", record: {???}}
{host: "host1", type:"mounts", record: {path: "path2", dev: "dev2"}}
{host: "host2", type:"packages", record: {name: "pkg1", ver: "1"}}
{host: "host2", type:"packages", record: {name: "pkg2", ver: "2"}}
{host: "host2", type:"mounts", record: {path: "path1", dev: "dev1"}}
{host: "host2", type:"mounts", record: {path: "path2", dev: "dev2"}}
{host: "host2", type:"???", record: {???}}
{host: "host2", type:"???", record: {???}}
...
The logstash.conf:
input { ... }
filter {
split {
    # split the records array into multiple new events, one per element
field => "records"
}
mutate {
rename => { "records" => "record" }
}
}
output {
elasticsearch {
hosts => ["ELASTIC_IP:PORT"]
index => "packages-%{+YYYY.MM.dd}"
}
}
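For reference, what the split + rename filters above do to a single event can be sketched in plain Python (the sample data below is hypothetical, mirroring the input/output shown earlier):

```python
def split_event(event):
    """One event with a 'records' array becomes one document per element,
    like Logstash's split filter followed by the rename in mutate."""
    records = event.pop("records")
    return [{**event, "record": r} for r in records]

event = {
    "host": "host1",
    "type": "packages",
    "records": [{"name": "pkg1", "ver": "1"}, {"name": "pkg2", "ver": "2"}],
}

docs = split_event(event)
# docs[0] == {"host": "host1", "type": "packages", "record": {"name": "pkg1", "ver": "1"}}
```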
The problem: for each host and type, Elasticsearch keeps accumulating more and more stale events.
So I want to delete a host's old data once new data for that host arrives.
Note on some failed attempts:
Because the output is multiple documents rather than a single one (sometimes more, sometimes fewer than before), this is not a simple update. It has to be a delete-all & add.
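One direction I am considering (a sketch only, not verified against my setup): before indexing a new batch for a given host and type, issue an Elasticsearch `_delete_by_query` request that removes that host's previous documents. Building the request body in Python, assuming `host` and `type` are indexed as keyword fields:

```python
import json

def delete_old_docs_query(host, doc_type):
    # Body for POST <index>/_delete_by_query: match all previous
    # documents of this host + type so the new batch can replace them.
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"host": host}},
                    {"term": {"type": doc_type}},
                ]
            }
        }
    }

body = delete_old_docs_query("host1", "packages")
print(json.dumps(body, indent=2))
```

The open question would remain how to trigger this from Logstash itself at the right moment, i.e. exactly once per incoming batch.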