副标题[/!--empirenews.page--]

前言
在一个阳光明媚的下午,电脑右下角传来一片片邮件提醒,同时伴随着微信钉钉的震动,打开一看,应用各种出错,天兔告警,数据库服务器内存爆红,MySql 数据库实例挂掉了。
排查
先交代一下数据库版本:
- mysql> status
- --------------
- mysql Ver 14.14 Distrib 5.7.22-22, for Linux (x86_64) using 6.2
- Connection id: 59568
- Current database:
- Current user: root@localhost
- SSL: Not in use
- Current pager: stdout
- Using outfile: ''
- Using delimiter: ;
- Server version: 5.7.22-22-log Percona Server (GPL), Release 22, Revision f62d93c
- Protocol version: 10
崩溃故障排除绝不是一项有趣的任务,特别是如果MySQL没有报告崩溃的原因。例如,当MySQL内存不足时。
数据库邮件告警提醒发来的消息:
- Type: mysql
- Tags: 生产主库
- Host: 172.16.1.66:3306
- Level: critical
- Item: connect
- Value: down
- Message: mysql server down
登录 Grafana 监控面板,数据库连接在哪个时间段曾有幅度的增长。

顺手检查一下之前的服务器邮件监控告警记录,上一个时间点,内存占用率99%,这说明了数据库连接的幅度增长,可能是压垮服务器的最后一根稻草。
其实导致OOM的直接原因并不复杂,就是因为服务器内存不足,内核需要回收内存,回收内存就是kill掉服务器上使用内存最多的程序,而MySQL服务可能就是使用内存最多,所以就OOM了。
- Type: os
- Tags: 66数据库
- Host: 172.16.1.66:
- Level: critical
- Item: memory
- Value: 99%
- Message: too more memory usage
查看系统日志
我们带着这个疑问来排查一下日志:
- # 查看日志
- tail -500f /var/log/messages
- # 以下是 oom-killer
- Nov 27 14:55:48 itstyledb1 kernel: mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
- Nov 27 14:55:48 itstyledb1 kernel: mysqld cpuset=/ mems_allowed=0-1
- Nov 27 14:55:48 itstyledb1 kernel: CPU: 2 PID: 895 Comm: mysqld Kdump: loaded Not tainted 3.10.0-862.3.2.el7.x86_64 #1
- Nov 27 14:55:48 itstyledb1 kernel: Hardware name: Huawei RH1288 V3/BC11HGSC0, BIOS 3.22 05/16/2016
- Nov 27 14:55:48 itstyledb1 kernel: Call Trace:
小伙伴们继续往下看:
- 0 pages HighMem/MovableOnly
- Nov 27 14:55:48 itstyledb1 kernel: 291281 pages reserved
- Nov 27 14:55:48 itstyledb1 kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
- Nov 27 14:55:48 itstyledb1 kernel: [ 468] 0 468 28271 4326 62 55 0 systemd-journal
- Nov 27 14:55:48 itstyledb1 kernel: [ 490] 0 490 11492 2 24 553 -1000 systemd-udevd
- Nov 27 14:55:48 itstyledb1 kernel: [ 787] 0 787 13877 18 27 96 -1000 auditd
- Nov 27 14:55:48 itstyledb1 kernel: [ 810] 81 810 14552 81 34 89 -900 dbus-daemon
- Nov 27 14:55:48 itstyledb1 kernel: [ 815] 0 815 55956 1 60 466 0 abrtd
- Nov 27 14:55:48 itstyledb1 kernel: [ 816] 0 816 55327 9 64 346 0 abrt-watch-log
- Nov 27 14:55:48 itstyledb1 kernel: [ 818] 0 818 121607 220 90 495 0 NetworkManager
- Nov 27 14:55:48 itstyledb1 kernel: [ 822] 0 822 5415 49 16 33 0 irqbalance
- Nov 27 14:55:48 itstyledb1 kernel: [ 823] 997 823 134634 97 60 1306 0 polkitd
- Nov 27 14:55:48 itstyledb1 kernel: [ 825] 0 825 6594 42 20 41 0 systemd-logind
- Nov 27 14:55:48 itstyledb1 kernel: [ 830] 0 830 31578 28 21 139 0 crond
- Nov 27 14:55:48 itstyledb1 kernel: [ 839] 0 839 27522 2 10 31 0 agetty
- Nov 27 14:55:48 itstyledb1 kernel: [ 1142] 0 1142 143454 114 97 2672 0 tuned
- Nov 27 14:55:48 itstyledb1 kernel: [ 1144] 0 1144 28203 11 59 246 -1000 sshd
- Nov 27 14:55:48 itstyledb1 kernel: [ 1145] 0 1145 97438 694 103 328 0 rsyslogd
- Nov 27 14:55:48 itstyledb1 kernel: [ 1369] 0 1369 22526 20 44 256 0 master
- Nov 27 14:55:48 itstyledb1 kernel: [ 1371] 89 1371 22596 32 46 251 0 qmgr
- Nov 27 14:55:48 itstyledb1 kernel: [ 5140] 0 5140 5102 1617 15 239 0 mysqld_exporter
- Nov 27 14:55:48 itstyledb1 kernel: [ 9430] 0 9430 55966 378 62 790 0 snmpd
- Nov 27 14:55:48 itstyledb1 kernel: [30320] 27 30320 22951376 13928375 43437 8163662 0 mysqld
- Nov 27 14:55:48 itstyledb1 kernel: [ 688] 89 688 22552 271 46 0 0 pickup
- Nov 27 14:55:48 itstyledb1 kernel: Out of memory: Kill process 30320 (mysqld) score 984 or sacrifice child
- Nov 27 14:55:48 itstyledb1 kernel: Killed process 30320 (mysqld) total-vm:91805504kB, anon-rss:55713500kB, file-rss:0kB, shmem-rss:0kB
- Nov 27 14:56:00 itstyledb1 systemd: mysqld.service: main process exited, code=killed, status=9/KILL
- Nov 27 14:56:00 itstyledb1 systemd: Unit mysqld.service entered failed state.
- Nov 27 14:56:00 itstyledb1 systemd: mysqld.service failed.
- Nov 27 14:56:00 itstyledb1 systemd: mysqld.service holdoff time over, scheduling restart.
- Nov 27 14:56:01 itstyledb1 systemd: Starting MySQL Server...
当out of memory发生时,outofmemory函数会选择一个内核认为犯有分配过多内存 “罪行”的进程,并杀死该进程。显然 Mysql 就是哪个“罪人”。
(编辑:晋中站长网)
【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容!
|