Parsing logs collected by Filebeat with Elasticsearch's ingest pipeline feature
Use the grok processor in an ES ingest pipeline to parse the logs.
The pipeline can be created either from Kibana's Ingest Pipelines page or through the ES API.
To parse Apache access logs, create the ES pipeline with curl:
curl -XPUT "http://10.30.4.50:9200/_ingest/pipeline/apache2" -H 'Content-Type: application/json' -d'
{
  "description" : "This pipeline is the regular rule that Grok uses to match the access logs of Apache 2",
  "processors" : [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{COMBINEDAPACHELOG}"
        ]
      }
    }
  ]
}
'
Query the ES pipeline:
curl -XGET http://10.30.4.50:9200/_ingest/pipeline/apache2?pretty
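The pipeline can also be tested before Filebeat is started by running a sample access-log line through the Simulate Pipeline API; a minimal sketch (the log line below is just an illustrative example, not taken from the real server):
curl -XPOST "http://10.30.4.50:9200/_ingest/pipeline/apache2/_simulate?pretty" -H 'Content-Type: application/json' -d'
{
  "docs": [
    {
      "_source": {
        "message": "127.0.0.1 - - [10/Oct/2023:13:55:36 +0800] \"GET /index.html HTTP/1.1\" 200 2326 \"-\" \"Mozilla/5.0\""
      }
    }
  ]
}
'
The response shows the fields that grok extracted (clientip, verb, request, response, and so on), confirming the pattern matches before any data is indexed.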
Create a temporary Filebeat container and copy the entire /usr/share/filebeat directory out of it.
Then delete the container and recreate it with that directory mounted back in, together with the log paths that need to be collected; a sketch of the copy-out steps follows below.
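One way to do the copy-out, assuming the host directory /app/dockerdata used by the docker run command below and a temporary container name filebeat_tmp:
# create (but do not start) a throwaway container just to obtain the default /usr/share/filebeat contents
docker create --name filebeat_tmp 10.30.4.50:8082/soimt/beats/soimt-filebeat:7.17.5
# copy the whole directory to the host, producing /app/dockerdata/filebeat
mkdir -p /app/dockerdata
docker cp filebeat_tmp:/usr/share/filebeat /app/dockerdata/
# remove the temporary container
docker rm -f filebeat_tmp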
docker run -dit --name filebeat -u root -v /app/dockerdata/filebeat:/usr/share/filebeat -v /var/run/docker.sock:/var/run/docker.sock:ro -v /app/dockerdata/ldap_data/log:/ldap/log:ro 10.30.4.50:8082/soimt/beats/soimt-filebeat:7.17.5
Contents of filebeat.yml:
# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /ldap/log/apache2/access.log
  pipeline: "apache2"
  scan_frequency: 30s
  tags: ["LDAP-Apache"]

setup.ilm.enabled: false
setup.template.name: "10.30.4.57"
setup.template.pattern: "10.30.4.57-*"

# ============================== Filebeat modules ==============================
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

# ======================= Elasticsearch template setting =======================
setup.template.settings:
  index.number_of_shards: 5
  index.number_of_replicas: 0

# =================================== Kibana ===================================
setup.kibana:
  host: "10.30.4.50:5601"

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  hosts: ["10.30.4.50:9200"]
  #protocol: "https"
  #username: "elastic"
  #password: "admin1234"
  indices:
    - index: "10.30.4.57-LDAP_apache-%{+yyyy.MM.dd}"
      when.contains:
        tags: "LDAP-Apache"

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
If ES has HTTPS and a password configured, uncomment the protocol/username/password lines above and fill in the real values.
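A minimal sketch of the output section with TLS and basic auth enabled, assuming a self-signed certificate (ssl.verification_mode: none is only suitable for testing, and the credentials are placeholders):
output.elasticsearch:
  hosts: ["10.30.4.50:9200"]
  protocol: "https"
  username: "elastic"
  password: "admin1234"
  # skip certificate verification for a self-signed cert; point at a CA file in production instead
  ssl.verification_mode: none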
After Filebeat starts successfully, an index template named 10.30.4.57 is created automatically in ES; the index pattern still has to be created in Kibana under Stack Management --> Index Patterns.
Once the index pattern is created, open Discover, select it, and expand the message field of a document to see that the log line has been parsed into separate fields.
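To check from the command line instead, a sketch assuming the container name filebeat from the docker run command above:
# validate filebeat.yml and the connection to the Elasticsearch output
docker exec filebeat filebeat test config
docker exec filebeat filebeat test output
# confirm that the daily indices are being created
curl "http://10.30.4.50:9200/_cat/indices/10.30.4.57*?v"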
Problems encountered:
If, after starting Filebeat, its log reports an error similar to:
{"type":"mapper_parsing_exception","reason":"object mapping for [agent] tried to parse field [agent] as object, but found a concrete value"}, dropping event!
This is a mapping conflict on the agent field: Elasticsearch tries to parse the field as an object but finds a concrete (scalar) value, which raises the mapping parse exception. The COMBINEDAPACHELOG grok pattern stores the HTTP user agent string in a field named agent, which collides with the agent object (agent.type, agent.version, and so on) that Filebeat itself adds to every event.
The fix is simply to drop the grok-produced agent field when creating the ES pipeline, using a remove processor:
curl -XPUT "http://10.30.4.50:9200/_ingest/pipeline/apache2" -H 'Content-Type: application/json' -d'
{
  "description" : "This pipeline is the regular rule that Grok uses to match the access logs of Apache 2",
  "processors" : [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{COMBINEDAPACHELOG}"
        ]
      }
    },
    {
      "remove": {
        "ignore_failure": true,
        "field": "agent"
      }
    }
  ]
}
'
Command to delete the existing indices: curl -X DELETE http://10.30.4.50:9200/10.30.4.57*
Uploading ES pipelines to another cluster
Download all pipelines and write them to pipelines.json:
curl -XGET 'http://localhost:9200/_ingest/pipeline?pretty' > pipelines.json
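The response is a JSON object keyed by pipeline id, which is what the script below iterates over; trimmed down to the apache2 pipeline it looks roughly like:
{
  "apache2" : {
    "description" : "This pipeline is the regular rule that Grok uses to match the access logs of Apache 2",
    "processors" : [ ... ]
  }
}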
Use a script to read pipelines.json and upload each pipeline to the new ES cluster.
Script contents:
#!/bin/bash
# iterate over every pipeline id (top-level key) in pipelines.json
for pipeline_id in $(cat pipelines.json | jq -r '. | keys[]'); do
  pipeline_file="${pipeline_id}.json"
  # extract the body of this pipeline into its own file
  cat pipelines.json | jq -r ".[\"${pipeline_id}\"]" > "${pipeline_file}"
  # create the pipeline on the target cluster
  curl -XPUT "http://localhost:9200/_ingest/pipeline/${pipeline_id}" -H 'Content-Type: application/json' -d @"${pipeline_file}"
done
# If the jq command is not available, install it first, or copy pipelines.json to a machine that can reach the target ES cluster and run the script there.