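When writing an Nginx module, a bug in the code can cause an Nginx worker to exit abnormally. To find out which line is at fault, you need the kernel's core dump facility. Much of the material online about generating nginx core dump files is somewhat outdated, so this post records how to generate and debug an nginx core dump file on CentOS 8.

When a worker crashes, the nginx error log contains an entry like this: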

2023/05/11 14:34:45 [alert] 55005#55005: worker process 55559 exited on signal 11 (core dumped)

Use coredumpctl to list the core dumps recorded on the system:

[root@test openresty-1.21.4.1]# coredumpctl
TIME                            PID   UID   GID SIG COREFILE  EXE
Thu 2023-05-11 11:30:22 CST   54597 65534 65534  11 none      /usr/local/openresty/nginx/sbin/nginx
Thu 2023-05-11 11:30:23 CST   54602 65534 65534  11 none      /usr/local/openresty/nginx/sbin/nginx
Thu 2023-05-11 11:30:28 CST   54612 65534 65534  11 none      /usr/local/openresty/nginx/sbin/nginx
Thu 2023-05-11 11:30:39 CST   54621 65534 65534  11 none      /usr/local/openresty/nginx/sbin/nginx
Thu 2023-05-11 11:31:09 CST   54631 65534 65534  11 none      /usr/local/openresty/nginx/sbin/nginx

The COREFILE column is none, so no core dump file was saved.
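When the list is long, it can help to pull out just the crashes that have no stored core file. The following is a minimal sketch, not part of the original setup: the helper name `pids_without_core` is made up, and it assumes the column layout shown above, where the TIME column spans four whitespace-separated fields (weekday, date, time, timezone).

```shell
# Print the PID of every coredumpctl entry whose COREFILE column is "none".
# Assumption: TIME occupies fields 1-4, so PID is field 5 and COREFILE is field 9.
# The header line has fewer than nine fields and is therefore skipped.
pids_without_core() {
    awk '$9 == "none" { print $5 }'
}

# Possible usage (reads the listing from stdin):
#   coredumpctl list | pids_without_core
```

Reading from stdin keeps the helper independent of how the listing was produced, so it also works on saved output.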

Use coredumpctl info to view the details:

[root@test openresty-1.21.4.1]# coredumpctl info 54631
           PID: 54631 (nginx)
           UID: 65534 (nobody)
           GID: 65534 (nobody)
        Signal: 11 (SEGV)
     Timestamp: Thu 2023-05-11 11:31:09 CST (1 day 3h ago)
  Command Line: nginx: worker process
    Executable: /usr/local/openresty/nginx/sbin/nginx
 Control Group: /system.slice/openresty.service
          Unit: openresty.service
         Slice: system.slice
       Boot ID: 9a4536ecba224f2a8484a31eba0c5a65
    Machine ID: 8fca9ee8630a4867a1a6a14e20972e8f
      Hostname: test
       Storage: none
       Message: Process 54631 (nginx) of user 65534 dumped core.

Storage is none as well.
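One thing worth checking at this point is the core file size limit the worker process is actually running with, since a soft limit of 0 is a common reason for a core not being stored. The sketch below assumes the usual Linux `/proc/<pid>/limits` layout; the helper name `core_soft_limit` and the way the worker PID is looked up are illustrative assumptions.

```shell
# Print the soft "Max core file size" limit from a /proc/<pid>/limits style file.
# In that file the line reads: "Max core file size  <soft>  <hard>  bytes",
# so the soft limit is the fifth whitespace-separated field.
core_soft_limit() {
    awk '/^Max core file size/ { print $5 }' "$1"
}

# Possible usage against a running worker (PID lookup is an assumption):
#   pid=$(pgrep -f 'nginx: worker process' | head -n 1)
#   core_soft_limit "/proc/$pid/limits"
```

A result of `0` would suggest the limit is the thing to raise; `unlimited` or a positive byte count points elsewhere.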

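Why was no core dump file generated? After searching around and trying various approaches, it turned out that on CentOS 8 all that is needed is one extra line in the nginx.conf configuration file: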

worker_rlimit_core  50M;

There is no particular requirement on the size; it only needs to be large enough for the core dump file to be written.
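To confirm that the directive took effect after a restart, the configured size can be converted to bytes and compared with the "Max core file size" value in the worker's `/proc/<pid>/limits`. This is a rough sketch: the helper name `nginx_size_to_bytes` is made up, and it assumes the value uses an optional `k`, `m`, or `g` suffix with 1024-based multipliers.

```shell
# Convert an nginx-style size such as "50M" to a byte count.
# Assumption: an optional k/m/g suffix (either case), multiples of 1024.
nginx_size_to_bytes() {
    size=$1
    num=${size%[kKmMgG]}          # strip a single trailing unit letter, if any
    case $size in
        *[kK]) echo $((num * 1024)) ;;
        *[mM]) echo $((num * 1024 * 1024)) ;;
        *[gG]) echo $((num * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;
    esac
}

# Possible usage:
#   nginx_size_to_bytes 50M
```

For the `50M` used here the result is 52428800, which is the number to look for in the limits file.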

Restart the OpenResty process and check again with coredumpctl:

Thu 2023-05-11 15:50:24 CST   67246 65534 65534  11 present   /usr/local/openresty/nginx/sbin/nginx

The COREFILE column now shows present.

coredumpctl info {PID} now shows the details of the stored file:

[root@test openresty-1.21.4.1]# coredumpctl info 67246
           PID: 67246 (nginx)
           UID: 65534 (nobody)
           GID: 65534 (nobody)
        Signal: 11 (SEGV)
     Timestamp: Thu 2023-05-11 15:50:24 CST (23h ago)
  Command Line: nginx: worker process
    Executable: /usr/local/openresty/nginx/sbin/nginx
 Control Group: /system.slice/openresty.service
          Unit: openresty.service
         Slice: system.slice
       Boot ID: 9a4536ecba224f2a8484a31eba0c5a65
    Machine ID: 8fca9ee8630a4867a1a6a14e20972e8f
      Hostname: test
       Storage: /var/lib/systemd/coredump/core.nginx.65534.9a4536ecba224f2a8484a31eba0c5a65.67246.1683791424000000.lz4
       Message: Process 67246 (nginx) of user 65534 dumped core.

                Stack trace of thread 67246:
                #0  0x00000000005500d6 ngx_http_xnile_real_ip_from (nginx)
                #1  0x00007f5007a6658a lj_vm_ffi_call (libluajit-5.1.so.2)
                #2  0x00007f5007ab1fe7 lj_ccall_func (libluajit-5.1.so.2)
                #3  0x00007f5007ac886d lj_cf_ffi_meta___call (libluajit-5.1.so.2)
                #4  0x00007f5007a63fb6 lj_BC_FUNCC (libluajit-5.1.so.2)
                #5  0x00000000004f8cf3 ngx_http_lua_run_thread (nginx)
                #6  0x00000000004fdf29 ngx_http_lua_access_by_chunk (nginx)
                #7  0x000000000045f1cc ngx_http_core_access_phase (nginx)
                #8  0x000000000045abc5 ngx_http_core_run_phases (nginx)
                #9  0x0000000000465927 ngx_http_process_request_headers (nginx)
                #10 0x0000000000465cc6 ngx_http_process_request_line (nginx)
                #11 0x000000000044d1fb ngx_epoll_process_events (nginx)
                #12 0x0000000000443e05 ngx_process_events_and_timers (nginx)
                #13 0x000000000044b3b2 ngx_worker_process_cycle (nginx)
                #14 0x0000000000449b17 ngx_spawn_process (nginx)
                #15 0x000000000044a854 ngx_start_worker_processes (nginx)
                #16 0x000000000044bd2c ngx_master_process_cycle (nginx)
                #17 0x0000000000423cc0 main (nginx)
                #18 0x00007f500734a202 __libc_start_main (libc.so.6)
                #19 0x0000000000423d1e _start (nginx)
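When several crashes have piled up, it can be handy to extract just the stored file path and the function at the top of the stack from this output. The helpers below are an illustrative sketch (the names `storage_path` and `crash_function` are made up) and rely on the `coredumpctl info` text format shown above.

```shell
# Print the path from the "Storage:" line of `coredumpctl info` output on stdin.
storage_path() {
    awk '$1 == "Storage:" { print $2 }'
}

# Print the function name of frame #0 from the stack trace on stdin.
# A frame line looks like: "#0  0x<address> <function> (<module>)".
crash_function() {
    awk '$1 == "#0" { print $3; exit }'
}

# Possible usage:
#   coredumpctl info 67246 | storage_path
#   coredumpctl info 67246 | crash_function
```

Both helpers only parse text, so they can also be run against output that was saved to a file earlier.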

Use coredumpctl gdb {PID} to open the core dump in gdb and inspect the crash:

[root@test openresty-1.21.4.1]# coredumpctl gdb 67246
           PID: 67246 (nginx)
           UID: 65534 (nobody)
           GID: 65534 (nobody)
        Signal: 11 (SEGV)
     Timestamp: Thu 2023-05-11 15:50:24 CST (23h ago)
  Command Line: nginx: worker process
    Executable: /usr/local/openresty/nginx/sbin/nginx
 Control Group: /system.slice/openresty.service
          Unit: openresty.service
         Slice: system.slice
       Boot ID: 9a4536ecba224f2a8484a31eba0c5a65
    Machine ID: 8fca9ee8630a4867a1a6a14e20972e8f
      Hostname: test
       Storage: /var/lib/systemd/coredump/core.nginx.65534.9a4536ecba224f2a8484a31eba0c5a65.67246.1683791424000000.lz4
       Message: Process 67246 (nginx) of user 65534 dumped core.

                Stack trace of thread 67246:
                #0  0x00000000005500d6 ngx_http_xnile_real_ip_from (nginx)
                #1  0x00007f5007a6658a lj_vm_ffi_call (libluajit-5.1.so.2)
                #2  0x00007f5007ab1fe7 lj_ccall_func (libluajit-5.1.so.2)
                #3  0x00007f5007ac886d lj_cf_ffi_meta___call (libluajit-5.1.so.2)
                #4  0x00007f5007a63fb6 lj_BC_FUNCC (libluajit-5.1.so.2)
                #5  0x00000000004f8cf3 ngx_http_lua_run_thread (nginx)
                #6  0x00000000004fdf29 ngx_http_lua_access_by_chunk (nginx)
                #7  0x000000000045f1cc ngx_http_core_access_phase (nginx)
                #8  0x000000000045abc5 ngx_http_core_run_phases (nginx)
                #9  0x0000000000465927 ngx_http_process_request_headers (nginx)
                #10 0x0000000000465cc6 ngx_http_process_request_line (nginx)
                #11 0x000000000044d1fb ngx_epoll_process_events (nginx)
                #12 0x0000000000443e05 ngx_process_events_and_timers (nginx)
                #13 0x000000000044b3b2 ngx_worker_process_cycle (nginx)
                #14 0x0000000000449b17 ngx_spawn_process (nginx)
                #15 0x000000000044a854 ngx_start_worker_processes (nginx)
                #16 0x000000000044bd2c ngx_master_process_cycle (nginx)
                #17 0x0000000000423cc0 main (nginx)
                #18 0x00007f500734a202 __libc_start_main (libc.so.6)
                #19 0x0000000000423d1e _start (nginx)

GNU gdb (GDB) Red Hat Enterprise Linux 9.2-7.1.al8
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/openresty/nginx/sbin/nginx...
[New LWP 67246]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `nginx: worker process                '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000005500d6 in ngx_http_xnile_real_ip_from (r=0x2502960, text=0x7f50072b9c88 "xnile")
    at /root/openresty-1.21.4.1/../test-nginx-module/src/ngx_http_test_module.c:852
852	    ngx_str_t header_str = ngx_string("xnile");
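From here the usual gdb commands apply, for example `bt` for the backtrace and `info locals` for the local variables of the selected frame. For repeated crashes it may be more convenient to run them non-interactively. The following is a sketch under assumptions: the helper name `write_gdb_script` and the `/tmp` paths are hypothetical, and source line numbers only appear if the binary was built with debug information, as it was here.

```shell
# Write a small gdb command file that prints the backtrace and the
# arguments and locals of the crashing frame.
write_gdb_script() {
    cat > "$1" <<'EOF'
bt
frame 0
info args
info locals
EOF
}

# Possible usage (paths are placeholders):
#   write_gdb_script /tmp/nginx-crash.gdb
#   coredumpctl -o /tmp/core.nginx dump 67246
#   gdb -batch -x /tmp/nginx-crash.gdb /usr/local/openresty/nginx/sbin/nginx /tmp/core.nginx
```

Dumping the core to a plain file first also makes it possible to copy it to another machine that has the same binary and libraries.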

References:

http://nginx.org/en/docs/ngx_core_module.html