Setting Up ELK on k8s

Overview

This article is a record of my experience setting up ELK on K8s. The configuration is generic; anything that needs to be adapted to your own environment is called out in each section. The overall layout of the configuration files:

├── elk
│   ├── elk-data.yaml
│   ├── elk-kibana.yaml
│   ├── elk-logstash.yaml
│   ├── elk-master.yaml
│   ├── elk-ns.yaml
│   └── elk-service.yaml

My server is a 4C8G box. With ELK + Grafana and a few other projects configured, resource usage looks like this:

(screenshot: resource usage of the cluster after deployment)

A few things to note:

  • X-Pack is not enabled in this version; I will cover it in a later update.

  • All of the PVs here are configured fairly small; size them according to your actual situation.

  • My machine is modest, only 4C8G. Even after trimming the configuration wherever possible, ELK still occupies 2.28Gi of memory, though that is plenty for personal use.

(screenshot: ELK memory footprint)

Namespace

First, create a namespace to isolate everything the ELK stack needs. Mine is named elk, and every resource below uses this namespace.

elk-ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: elk
  labels:
    app: elasticsearch
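
To create it, apply the file and confirm the namespace exists (a quick sketch, assuming the directory layout from the overview):

# Create the elk namespace and verify it
kubectl apply -f elk/elk-ns.yaml
kubectl get ns elk --show-labels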

Elasticsearch

Elasticsearch is split into master nodes and data nodes, and the index data needs to be mapped out of the pods. Since I had only one server when setting this up, I mounted it directly onto the host disk.

The Elasticsearch config defines four kinds of objects:

  1. PersistentVolume
  2. PersistentVolumeClaim
  3. StatefulSet
  4. PodDisruptionBudget

Things to note:

  1. The master and data index data must be mapped out of the pods. My host directories are /data/elk/master and /data/elk/data. Size these two PVs according to your actual needs; since I only use this myself and the data volume is small, I set each to just 10Gi.
  2. Both master and data can run multiple pods, but my machine's CPU and memory are limited, so I run only one of each. To change this, edit spec.replicas in the StatefulSet.
  3. Elasticsearch recommends a JVM heap of at least 2GB, which I don't have to spare, so I configured it smaller. If you have enough memory, adjust ES_JAVA_OPTS in the data node's StatefulSet. I use -Xms512m -Xmx512m; with enough memory, 2GB or more is recommended, e.g. -Xms4g -Xmx4g. The master node doesn't need much; 512MB is fine.
  4. For the master node, mind the value of env.cluster.initial_master_nodes. It has to match the number of pods you actually run. I run one pod, so the value is elasticsearch-master-0. With more pods, append names in numeric order, e.g. three pods gives elasticsearch-master-0,elasticsearch-master-1,elasticsearch-master-2.
  5. On both the master and data nodes, check env.discovery.seed_hosts and change the elk in my value to your own namespace.
  6. The PodDisruptionBudget limits how many pods of a replicated application may be down simultaneously due to voluntary disruptions; I set it to 1.
  7. Both master and data nodes expose ports 9200 and 9300; only the master's 9200 is exposed as a node port. Replace "your node port" in elk-service.yaml with the node port you want to expose externally. Once everything in this section is applied, you can verify the node roles as shown below.
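A quick role check once the pods are up (a sketch; it assumes the official elasticsearch image ships curl, and the pod name follows from replicas: 1):

# m = master, d = data, i = ingest in the node.role column
kubectl -n elk exec elasticsearch-master-0 -- curl -s "localhost:9200/_cat/nodes?v"
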
elk-master.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-volume-elastic-master
  namespace: elk
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/elk/master"

---

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pv-claim-elastic-master
  namespace: elk
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

---

apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: elk
  name: elasticsearch-master
  labels:
    app: elasticsearch
    role: master
spec:
  serviceName: elasticsearch-master
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
      role: master
  template:
    metadata:
      labels:
        app: elasticsearch
        role: master
    spec:
      volumes:
        - name: pv-storage-elastic-master
          persistentVolumeClaim:
            claimName: pv-claim-elastic-master
      containers:
        - name: elasticsearch
          image: elasticsearch:7.2.0
          resources:
            requests:
              memory: 1Gi
              cpu: 0.5
            limits:
              memory: 1Gi
              cpu: 0.5
          command: ["bash", "-c", "ulimit -l unlimited && sysctl -w vm.max_map_count=262144 && chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data && exec su elasticsearch docker-entrypoint.sh"]
          ports:
            - containerPort: 9200
              name: http
            - containerPort: 9300
              name: transport
          env:
            - name: discovery.seed_hosts
              value: "elasticsearch-master.elk.svc.cluster.local"
            - name: cluster.initial_master_nodes
              value: "elasticsearch-master-0"
            - name: ES_JAVA_OPTS
              value: -Xms512m -Xmx512m

            - name: node.master
              value: "true"
            - name: node.ingest
              value: "false"
            - name: node.data
              value: "false"

            - name: cluster.name
              value: "elasticsearch-cluster-v7"
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name

          volumeMounts:
            - mountPath: /usr/share/elasticsearch/data
              name: pv-storage-elastic-master

          # Privileged so the startup command can set vm.max_map_count and chown the data dir
          securityContext:
            privileged: true

      # Pull image from private repo
      imagePullSecrets:
        - name: regcred-elastic
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  namespace: elk
  name: elasticsearch-master
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: elasticsearch
      role: master
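
The hostPath directory is best created up front on the node; then apply and wait for the pod (a sketch, assuming a single-node cluster so the pod lands on this host):

# Prepare the hostPath, then roll out the master node
mkdir -p /data/elk/master
kubectl apply -f elk/elk-master.yaml
kubectl -n elk rollout status statefulset/elasticsearch-master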
elk-data.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-volume-elastic-data
  namespace: elk
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/elk/data"

---

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pv-claim-elastic-data
  namespace: elk
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

---

apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: elk
  name: elasticsearch-data
  labels:
    app: elasticsearch
    role: data
spec:
  serviceName: elasticsearch-data
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
      role: data
  template:
    metadata:
      labels:
        app: elasticsearch
        role: data
    spec:
      volumes:
        - name: pv-storage-elastic-data
          persistentVolumeClaim:
            claimName: pv-claim-elastic-data
      containers:
        - name: elasticsearch
          image: elasticsearch:7.2.0
          resources:
            requests:
              memory: 1Gi
              cpu: 0.5
            limits:
              memory: 1Gi
              cpu: 0.5
          command: ["bash", "-c", "ulimit -l unlimited && sysctl -w vm.max_map_count=262144 && chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data && exec su elasticsearch docker-entrypoint.sh"]
          ports:
            - containerPort: 9200
              name: http
            - containerPort: 9300
              name: transport
          env:
            - name: discovery.seed_hosts
              value: "elasticsearch-master.elk.svc.cluster.local"
            - name: ES_JAVA_OPTS
              value: -Xms512m -Xmx512m

            - name: node.master
              value: "false"
            - name: node.ingest
              value: "true"
            - name: node.data
              value: "true"

            - name: cluster.name
              value: "elasticsearch-cluster-v7"
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          volumeMounts:
            - mountPath: /usr/share/elasticsearch/data
              name: pv-storage-elastic-data

          # Privileged so the startup command can set vm.max_map_count and chown the data dir
          securityContext:
            privileged: true

      # Pull image from private repo
      imagePullSecrets:
        - name: regcred-elastic
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  namespace: elk
  name: elasticsearch-data
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: elasticsearch
      role: data
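
Same drill for the data node (a sketch):

# Prepare the hostPath, then roll out the data node
mkdir -p /data/elk/data
kubectl apply -f elk/elk-data.yaml
kubectl -n elk get pods -l app=elasticsearch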
elk-service.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: elk
  name: elasticsearch-master
  labels:
    app: elasticsearch
    role: master
spec:
  clusterIP: None
  selector:
    app: elasticsearch
    role: master
  ports:
    - port: 9200
      name: http
    - port: 9300
      name: node-to-node

---

apiVersion: v1
kind: Service
metadata:
  namespace: elk
  name: elasticsearch
  labels:
    app: elasticsearch
    role: data
spec:
  clusterIP: None
  selector:
    app: elasticsearch
    role: data
  ports:
    - port: 9200
      name: http
    - port: 9300
      name: node-to-node

---

apiVersion: v1
kind: Service
metadata:
  namespace: elk
  name: elasticsearch-service
  labels:
    app: elasticsearch
    role: master
spec:
  type: NodePort
  ports:
    - port: 9200
      targetPort: 9200
      nodePort: your node port
  selector:
    app: elasticsearch
    role: master
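
With the services applied, Elasticsearch should answer on the node port from outside the cluster (a sketch; <node-ip> and <your-node-port> are placeholders for your values):

kubectl apply -f elk/elk-service.yaml
# The root endpoint should report "cluster_name" : "elasticsearch-cluster-v7"
curl http://<node-ip>:<your-node-port>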

Kibana

There is not much to watch out for with Kibana. The only thing to note is that the value of env.ELASTICSEARCH_URL is the name of the master-facing service configured above, and the port is the 9200 configured there.

elk-kibana.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    app: kibana
  name: kibana
  namespace: elk
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
        - name: kibana
          image: kibana:7.2.0
          ports:
            - containerPort: 5601
              protocol: TCP
          env:
            - name: "ELASTICSEARCH_URL"
              value: "http://elasticsearch-service:9200"
      imagePullSecrets:
        - name: regcred-elastic
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

---

kind: Service
apiVersion: v1
metadata:
  labels:
    app: kibana
  name: kibana-service
  namespace: elk
spec:
  type: NodePort
  ports:
    - port: 5601
      targetPort: 5601
      nodePort: 32502
  selector:
    app: kibana
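
To deploy and smoke-test Kibana (a sketch; <node-ip> is a placeholder for your node's address):

kubectl apply -f elk/elk-kibana.yaml
kubectl -n elk rollout status deploy/kibana
# Kibana listens on the fixed NodePort 32502 configured above
curl -I http://<node-ip>:32502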

Logstash

Logstash is the main event. There are quite a few things to get right here, so take it slowly.

  1. nginx: I tested this setup against the nginx logs on the host, mounting them into the pod and changing the nginx log format to JSON (see the quick check after this list). If you have the resources, Kafka is of course an option here instead.

    The nginx log format looks like this:

    log_format  json  '{"@timestamp":"$time_iso8601",'
                      '"@source":"$server_addr",'
                      '"hostname":"$hostname",'
                      '"ip":"$remote_addr",'
                      '"client":"$remote_addr",'
                      '"request_method":"$request_method",'
                      '"scheme":"$scheme",'
                      '"domain":"$server_name",'
                      '"referer":"$http_referer",'
                      '"request":"$request_uri",'
                      '"args":"$args",'
                      '"size":$body_bytes_sent,'
                      '"status": $status,'
                      '"responsetime":$request_time,'
                      '"upstreamtime":"$upstream_response_time",'
                      '"upstreamaddr":"$upstream_addr",'
                      '"http_user_agent":"$http_user_agent",'
                      '"https":"$https"'
                      '}';

    Then switch the access log of each relevant virtual host to the json format:

    server {
        ...
        access_log /path/to/your/log/file json;
        ...
    }
  2. GeoIP: Log parsing includes IP address geolocation, so GeoIP needs to be installed on the host; after updating the city database, map the data file into the pod. If the installation or update steps are unclear, see my article GeoIP的安装和更新 (installing and updating GeoIP).

  3. In the nginx and GeoIP PV definitions, change the nginx log path and the GeoIP file path to your own.

  4. Adjust the following in the ConfigMap to your actual situation:

    • input.file.path must match the access log path as mapped into the Pod.
    • filter.geoip.database must match the path of the GeoLite2-City data file as mapped into the Pod.
    • output.elasticsearch.hosts is the internal IP + node port configured for the ES service above.
  5. I collect all nginx logs into a single index, so that later, together with Grafana, a single dashboard can show traffic across all services, while switching domains shows the data for a single service.
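
A minimal sanity check for the nginx side, assuming nginx runs on the host and /path/to/your/log/file stands in for your access log path:

# Validate and reload nginx so the json log format takes effect
nginx -t && nginx -s reload
# The newest access log line should now be one JSON object per request
tail -n 1 /path/to/your/log/file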

elk-logstash.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-volume-log
  namespace: elk
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/path/to/your/nginx/logs"

---

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pv-claim-log
  namespace: elk
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi

---

kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-volume-geoip
  namespace: elk
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 200Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/path/to/GeoIP"

---

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pv-claim-geoip
  namespace: elk
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Mi

---

kind: ConfigMap
apiVersion: v1
metadata:
  name: logstash-config
  namespace: elk
data:
  logstash-config-named-k8s: |
    input {
      file {
        path => "/var/log/nginx/*access.log"
        type => "nginx-access-log"
        ignore_older => 0
        codec => json
        start_position => "beginning"
      }
    }

    filter {
      mutate {
        convert => [ "status","integer" ]
        convert => [ "size","integer" ]
        convert => [ "upstreamtime","float" ]
        convert => ["[geoip][coordinates]", "float"]
        remove_field => "message"
      }
      date {
        match => [ "timestamp" ,"dd/MMM/YYYY:HH:mm:ss Z" ]
      }
      geoip {
        source => "client"
        target => "geoip"
        database =>"/usr/share/GeoIP/GeoLite2-City.mmdb"
        add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
        add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
      }
      mutate {
        remove_field => "timestamp"
      }
      # GeoIP lookup fails for internal IPs and tags the event with
      # _geoip_lookup_failure; drop those events (internal traffic).
      if "_geoip_lookup_failure" in [tags] { drop { } }
    }

    output {
      #stdout { codec => rubydebug }
      elasticsearch {
        hosts => ["your internal ip:your node port"]
        index => "nginx"
      }
    }

---

kind: Deployment
apiVersion: apps/v1
metadata:
  name: logstash
  namespace: elk
  labels:
    app: logstash
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      containers:
        - name: logstash
          image: logstash:7.2.0
          command: ["/bin/sh","-c"]
          args: ["/usr/share/logstash/bin/logstash -f /usr/share/logstash/config/indexer-kafka-named-k8s.conf"]
          volumeMounts:
            - name: vm-config
              mountPath: /usr/share/logstash/config
            - name: pv-storage-log
              mountPath: /var/log/nginx
            - name: pv-storage-geoip
              mountPath: /usr/share/GeoIP
      imagePullSecrets:
        - name: regcred-elastic
      volumes:
        - name: vm-config
          configMap:
            name: logstash-config
            items:
              - key: logstash-config-named-k8s
                path: indexer-kafka-named-k8s.conf
        - name: pv-storage-log
          persistentVolumeClaim:
            claimName: pv-claim-log
        - name: pv-storage-geoip
          persistentVolumeClaim:
            claimName: pv-claim-geoip
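
To roll it out and confirm events reach Elasticsearch (a sketch; <node-ip> and <your-node-port> are the placeholders from the service section):

kubectl apply -f elk/elk-logstash.yaml
# Logstash logs should show the pipeline starting and the file input reading
kubectl -n elk logs -f deploy/logstash
# After some traffic, the nginx index should exist and grow
curl "http://<node-ip>:<your-node-port>/_cat/indices/nginx?v"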

Summary

With ELK now deployed on K8s, the handful of services I use regularly have all been migrated from docker-compose to k8s, and the steps above also feed the nginx logs into ELK. Next, combining this with Grafana, we can visualize the nginx logs and get a single dashboard for the status of every service.

Author: Jormin
Link: https://blog.lerzen.com/k8s搭建ELK/
License: Unless otherwise noted, all articles on this blog are licensed under CC BY-NC-SA 3.0 CN. Please credit the source when reposting!
