************************ crashinfo ************************* /exports/testreports/38859/testresults/racer-special9-ldiskfs-centos7_x86_64-centos7_x86_64/oleg156-server-timeout-core (3.10.0-7.9-debug) +==========================+ | *** Crashinfo v1.3.7 *** | +==========================+ +++WARNING+++ PARTIAL DUMP with size(vmcore) < 25% size(RAM) KERNEL: /tmp/crash-anaysis.6FaMv/vmlinux [TAINTED] DUMPFILE: /exports/testreports/38859/testresults/racer-special9-ldiskfs-centos7_x86_64-centos7_x86_64/oleg156-server-timeout-core [PARTIAL DUMP] CPUS: 4 DATE: Fri Jan 19 12:55:14 EST 2024 UPTIME: 01:01:31 LOAD AVERAGE: 19.00, 18.99, 18.57 TASKS: 267 NODENAME: oleg156-server.virtnet RELEASE: 3.10.0-7.9-debug VERSION: #1 SMP Sat Mar 26 23:28:42 EDT 2022 MACHINE: x86_64 (2399 Mhz) MEMORY: 4 GB PANIC: "" +--------------------------+ >------------------------| Per-cpu Stacks ('bt -a') |------------------------< +--------------------------+ -- CPU#0 -- PID=0 CPU=0 CMD=swapper/0 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 rest_init+0x8e #7 start_kernel+0x456 #8 x86_64_start_reservations+0x2a #9 x86_64_start_kernel+0x152 #10 start_cpu+0x5 -- CPU#1 -- PID=0 CPU=1 CMD=swapper/1 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#2 -- PID=0 CPU=2 CMD=swapper/2 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#3 -- PID=0 CPU=3 CMD=swapper/3 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 +--------------------------------+ >---------------------| How This Dump Has Been Created |---------------------< +--------------------------------+ Cannot identify the specific condition that triggered vmcore +---------------+ >------------------------------| Tasks Summary |------------------------------< +---------------+ Number of Threads That Ran Recently ----------------------------------- last second 25 last 5s 46 last 60s 58 ----- Total Numbers of Threads per State ------ TASK_INTERRUPTIBLE 244 TASK_RUNNING 1 TASK_UNINTERRUPTIBLE 19 +++WARNING+++ There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details +-----------------------+ >--------------------------| 5 Most Recent Threads |--------------------------< +-----------------------+ PID CMD Age ARGS ----- -------------- ------ ---------------------------- 1 systemd 0 ms /usr/lib/systemd/systemd --switched-root --system --deserialize 22 6827 ldlm_bl_03 0 ms (no user stack) 27 rcuos/2 0 ms (no user stack) 9 rcu_sched 0 ms (no user stack) 784 gmain 2 ms /usr/sbin/NetworkManager --no-daemon +------------------------+ >-------------------------| Memory Usage (kmem -i) |-------------------------< +------------------------+ PAGES TOTAL PERCENTAGE TOTAL MEM 955067 3.6 GB ---- FREE 726478 2.8 GB 76% of TOTAL MEM USED 228589 892.9 MB 23% of TOTAL MEM SHARED 17325 67.7 MB 1% of TOTAL MEM BUFFERS 14349 56.1 MB 1% of TOTAL MEM CACHED 83654 326.8 MB 8% of TOTAL MEM SLAB 17531 68.5 MB 1% of TOTAL MEM TOTAL HUGE 0 0 ---- HUGE FREE 0 0 0% of TOTAL HUGE TOTAL SWAP 262143 1024 MB ---- SWAP USED 0 0 0% of TOTAL SWAP SWAP FREE 262143 1024 MB 100% of TOTAL SWAP COMMIT LIMIT 739676 2.8 GB ---- COMMITTED 59497 232.4 MB 8% of TOTAL LIMIT +++WARNING+++ 2 processes in UNINTERRUPTIBLE state are committing journal +++three oldest UNINTERRUPTIBLE threads ... ran 3517s ago PID=5674 CPU=3 CMD=mdt00_000 #0 __schedule+0x2e2 #1 schedule+0x29 #2 wait_transaction_locked+0x93 #3 add_transaction_credits+0x270 #4 start_this_handle+0x1cc #5 jbd2__journal_start+0xf3 #6 __ldiskfs_journal_start_sb+0x6d #7 osd_trans_start+0x233 #8 top_trans_start+0x7d9 #9 lod_trans_start+0x7f #10 mdd_trans_start+0x14 #11 mdd_xattr_del+0x2ae #12 mdt_reint_setxattr+0xa88 #13 mdt_reint_rec+0x87 #14 mdt_reint_internal+0x74c #15 mdt_reint+0x67 #16 tgt_request_handle+0x74e #17 ptlrpc_server_handle_request+0x26e #18 ptlrpc_main+0xc37 #19 kthread+0xe4 #20 ret_from_fork_nospec_begin+0x7 ... ran 3517s ago PID=11977 CPU=2 CMD=mdt00_010 #0 __schedule+0x2e2 #1 schedule+0x29 #2 wait_transaction_locked+0x93 #3 add_transaction_credits+0x270 #4 start_this_handle+0x1cc #5 jbd2__journal_start+0xf3 #6 __ldiskfs_journal_start_sb+0x6d #7 osd_trans_start+0x233 #8 mdt_empty_transno+0x265 #9 mdt_mfd_open+0x94f #10 mdt_finish_open+0x50b #11 mdt_reint_open+0x1f0a #12 mdt_reint_rec+0x87 #13 mdt_reint_internal+0x74c #14 mdt_intent_open+0x93 #15 mdt_intent_opc+0x1c8 #16 mdt_intent_policy+0xfa #17 ldlm_lock_enqueue+0x3b1 #18 ldlm_handle_enqueue+0x359 #19 tgt_enqueue+0x68 #20 tgt_request_handle+0x74e #21 ptlrpc_server_handle_request+0x26e #22 ptlrpc_main+0xc37 #23 kthread+0xe4 #24 ret_from_fork_nospec_begin+0x7 ... ran 3517s ago PID=11983 CPU=0 CMD=mdt00_016 #0 __schedule+0x2e2 #1 schedule+0x29 #2 rwsem_down_write_failed+0x195 #3 call_rwsem_down_write_failed+0x17 #4 down_write+0x40 #5 osd_write_lock+0x5f #6 lod_write_lock+0x2a #7 mdd_write_lock+0x2a #8 mdd_xattr_set+0x361 #9 mdt_reint_setxattr+0x737 #10 mdt_reint_rec+0x87 #11 mdt_reint_internal+0x74c #12 mdt_reint+0x67 #13 tgt_request_handle+0x74e #14 ptlrpc_server_handle_request+0x26e #15 ptlrpc_main+0xc37 #16 kthread+0xe4 #17 ret_from_fork_nospec_begin+0x7 +-------------------------------+ >----------------------| Scheduler Runqueues (per CPU) |----------------------< +-------------------------------+ ---+ CPU=0 ---- | CURRENT TASK , CMD=swapper/0 ---+ CPU=1 ---- | CURRENT TASK , CMD=swapper/1 ---+ CPU=2 ---- | CURRENT TASK , CMD=swapper/2 ---+ CPU=3 ---- | CURRENT TASK , CMD=swapper/3 +------------------------+ >-------------------------| Network Status Summary |-------------------------< +------------------------+ TCP Connection Info ------------------- ESTABLISHED 4 LISTEN 3 NAGLE disabled (TCP_NODELAY): 3 user_data set (NFS etc.): 3 UDP Connection Info ------------------- 2 UDP sockets, 0 in ESTABLISHED Unix Connection Info ------------------------ ESTABLISHED 26 CLOSE 17 LISTEN 8 Raw sockets info -------------------- ESTABLISHED 1 Interfaces Info --------------- How long ago (in seconds) interfaces transmitted/received? Name RX TX ---- ---------- --------- lo n/a 3688.5 eth0 n/a 4.8 RSS_TOTAL=52520 pages, %mem= 0.9 +++WARNING+++ Possible hang +++WARNING+++ Run 'hanginfo' to get more details +------------+ >-------------------------------| Mounted FS |-------------------------------< +------------+ MOUNT SUPERBLK TYPE DEVNAME DIRNAME ffff880138cca000 ffff880139940800 rootfs rootfs / ffff88012a246000 ffff88012a328000 sysfs sysfs /sys ffff88012a2461c0 ffff880139944000 proc proc /proc ffff88012a246380 ffff880137668000 devtmpfs devtmpfs /dev ffff88012a246540 ffff8800b5249000 securityfs securityfs /sys/kernel/security ffff88012a246700 ffff88012a328800 tmpfs tmpfs /dev/shm ffff88012a2468c0 ffff880137187000 devpts devpts /dev/pts ffff88012a246a80 ffff88012a329000 tmpfs tmpfs /run ffff88012a246c40 ffff88012a329800 tmpfs tmpfs /sys/fs/cgroup ffff88012a246e00 ffff88012a32a000 cgroup cgroup /sys/fs/cgroup/systemd ffff88012a246fc0 ffff88012a32a800 pstore pstore /sys/fs/pstore ffff88012a247180 ffff88012a32c800 cgroup cgroup /sys/fs/cgroup/cpu,cpuacct ffff88012a247340 ffff88012a32c000 cgroup cgroup /sys/fs/cgroup/pids ffff88012a247500 ffff88012a32b800 cgroup cgroup /sys/fs/cgroup/memory ffff88012a2476c0 ffff88012a32b000 cgroup cgroup /sys/fs/cgroup/blkio ffff88012a247880 ffff88012a32d000 cgroup cgroup /sys/fs/cgroup/perf_event ffff88012a247a40 ffff88012a32d800 cgroup cgroup /sys/fs/cgroup/freezer ffff88012a247c00 ffff88012a32e000 cgroup cgroup /sys/fs/cgroup/cpuset ffff88012a247dc0 ffff88012a32e800 cgroup cgroup /sys/fs/cgroup/net_cls,net_prio ffff88012a378000 ffff88012a32f000 cgroup cgroup /sys/fs/cgroup/devices ffff88012a3781c0 ffff88012a32f800 cgroup cgroup /sys/fs/cgroup/hugetlb ffff880137652380 ffff88013766d000 configfs configfs /sys/kernel/config ffff88012a378380 ffff8800b5264000 ext4 /dev/nbd0 / ffff880137652540 ffff8800b524a800 rpc_pipefs rpc_pipefs /var/lib/nfs/rpc_pipefs ffff88012e186000 ffff88013680b800 autofs systemd-1 /proc/sys/fs/binfmt_misc ffff880137652700 ffff880139947800 debugfs debugfs /sys/kernel/debug ffff88012e1861c0 ffff88012b270000 mqueue mqueue /dev/mqueue ffff88012e186380 ffff88013680e000 hugetlbfs hugetlbfs /dev/hugepages ffff8801376528c0 ffff88013766c000 binfmt_misc binfmt_misc /proc/sys/fs/binfmt_misc/ ffff88012e186540 ffff880129ede800 ramfs none /mnt ffff880138ccbc00 ffff880129f6d000 tmpfs none /var/lib/stateless/writable ffff880138ccbdc0 ffff880129f6c800 squashfs /dev/vda /home/green/git/lustre-release ffff88012a378a80 ffff880129f6d000 tmpfs none /var/cache/man ffff88012e186700 ffff880129f6d000 tmpfs none /var/log ffff88012a378c40 ffff880129f6d000 tmpfs none /var/lib/dbus ffff880137652c40 ffff880129f6d000 tmpfs none /tmp ffff88012e1868c0 ffff880129f6d000 tmpfs none /var/lib/dhclient ffff880137652e00 ffff880129f6d000 tmpfs none /var/tmp ffff88012a378e00 ffff880129f6d000 tmpfs none /var/lib/NetworkManager ffff880138ccb6c0 ffff880129f6d000 tmpfs none /var/lib/systemd/random-seed ffff880137652fc0 ffff880129f6d000 tmpfs none /var/spool ffff88012a378fc0 ffff880129f6d000 tmpfs none /var/lib/nfs ffff88012e186a80 ffff880129f6d000 tmpfs none /var/lib/gssproxy ffff88012a379180 ffff880129f6d000 tmpfs none /var/lib/logrotate ffff880137653180 ffff880129f6d000 tmpfs none /etc ffff88012e186c40 ffff880129f6d000 tmpfs none /var/lib/rsyslog ffff88012a379340 ffff880129f6d000 tmpfs none /var/lib/dhclient/var/lib/dhclient ffff88012a379500 ffff8800b524a800 rpc_pipefs sunrpc /var/lib/nfs/var/lib/nfs/rpc_pipefs ffff8800ae537dc0 ffff880129f6c800 squashfs /dev/vda /usr/sbin/mount.lustre ffff8800ae59f340 ffff880129f54000 lustre /dev/mapper/mds1_flakey /mnt/lustre-mds1 ffff8800ae59f880 ffff8800b40da000 lustre /dev/mapper/ost1_flakey /mnt/lustre-ost1 ffff8800ae59ea80 ffff880078434000 lustre /dev/mapper/ost2_flakey /mnt/lustre-ost2 +-------------------------------+ >----------------------| Last 40 lines of dmesg buffer |----------------------< +-------------------------------+ [ 361.661021] [] ret_from_fork_nospec_begin+0x7/0x21 [ 361.663465] [] ? kthread_create_on_node+0x140/0x140 [ 474.366250] LustreError: 5675:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1705683396, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-lustre-MDT0000_UUID lock: ffff880080e24280/0xd30f0896b85197fe lrc: 3/0,1 mode: --/EX res: [0x200000401:0x16d6:0x0].0x0 bits 0x12/0x0 rrc: 8 type: IBT flags: 0x40210400000020 pid: 5675 initiator: MDT0 [ 474.394571] LustreError: dumping log to /tmp/lustre-log.1705683696.5675 [ 474.907213] LustreError: 11973:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1705683397, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-lustre-MDT0000_UUID lock: ffff88007aa58a00/0xd30f0896b8519898 lrc: 3/1,0 mode: --/PR res: [0x200000401:0x16d6:0x0].0x0 bits 0x13/0x8 rrc: 8 type: IBT flags: 0x40210000000000 pid: 11973 initiator: MDT0 [ 769.133245] Lustre: 11982:0:(service.c:1455:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply req@ffff88007c69d500 x1788538529196800/t0(0) o101->1cd3db5d-077d-4dfa-9e22-5aaec155bd62@192.168.201.56@tcp:76/0 lens 1072/3912 e 24 to 0 dl 1705683996 ref 2 fl Interpret:/200/0 rc 0/0 job:'ls.0' uid:0 gid:0 [ 770.135210] Lustre: 11982:0:(service.c:1455:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply req@ffff88007c69df00 x1788538529196928/t0(0) o101->1cd3db5d-077d-4dfa-9e22-5aaec155bd62@192.168.201.56@tcp:77/0 lens 576/3888 e 24 to 0 dl 1705683997 ref 2 fl Interpret:/200/0 rc 0/0 job:'cp.0' uid:0 gid:0 [ 770.141732] Lustre: 11982:0:(service.c:1455:ptlrpc_at_send_early_reply()) Skipped 8 previous similar messages [ 778.718479] Lustre: lustre-MDT0000: Client 1cd3db5d-077d-4dfa-9e22-5aaec155bd62 (at 192.168.201.56@tcp) reconnecting [ 779.375427] Lustre: lustre-MDT0000: Client f042c446-6c28-44d9-932f-243f4ae12647 (at 192.168.201.56@tcp) reconnecting [ 862.957219] ptlrpc_watchdog_fire: 12 callbacks suppressed [ 862.960048] Lustre: mdt_rdpg00_002: service thread pid 10283 was inactive for 474.882 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [ 862.966092] Pid: 10283, comm: mdt_rdpg00_002 3.10.0-7.9-debug #1 SMP Sat Mar 26 23:28:42 EDT 2022 [ 862.968640] Call Trace: [ 862.969720] [<0>] wait_transaction_locked+0x93/0xe0 [jbd2] [ 862.971337] [<0>] add_transaction_credits+0x270/0x310 [jbd2] [ 862.973027] [<0>] start_this_handle+0x1cc/0x460 [jbd2] [ 862.974550] [<0>] jbd2__journal_start+0xf3/0x200 [jbd2] [ 862.976306] [<0>] __ldiskfs_journal_start_sb+0x6d/0x100 [ldiskfs] [ 862.978300] [<0>] osd_trans_start+0x233/0x6b0 [osd_ldiskfs] [ 862.980231] [<0>] mdt_empty_transno+0x265/0x850 [mdt] [ 862.981968] [<0>] mdt_close+0x29e/0x990 [mdt] [ 862.983757] [<0>] tgt_request_handle+0x74e/0x19d0 [ptlrpc] [ 862.987371] [<0>] ptlrpc_server_handle_request+0x26e/0xcf0 [ptlrpc] [ 862.989967] [<0>] ptlrpc_main+0xc37/0x16d0 [ptlrpc] [ 862.991678] [<0>] kthread+0xe4/0xf0 [ 862.993198] [<0>] ret_from_fork_nospec_begin+0x7/0x21 [ 862.995418] [<0>] 0xfffffffffffffffe [ 1238.765238] Lustre: 5678:0:(service.c:1455:ptlrpc_at_send_early_reply()) @@@ Could not add any time (3/-251), not sending early reply req@ffff880079b46900 x1788538529219264/t0(0) o35->1cd3db5d-077d-4dfa-9e22-5aaec155bd62@192.168.201.56@tcp:544/0 lens 392/456 e 2 to 0 dl 1705684464 ref 2 fl Interpret:/200/0 rc 0/0 job:'ls.0' uid:0 gid:0 [ 1246.730873] Lustre: lustre-MDT0000: Client 1cd3db5d-077d-4dfa-9e22-5aaec155bd62 (at 192.168.201.56@tcp) reconnecting [ 1361.645210] Lustre: 12109:0:(service.c:1455:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/-553), not sending early reply req@ffff880079802300 x1788538529202752/t0(0) o4->f042c446-6c28-44d9-932f-243f4ae12647@192.168.201.56@tcp:668/0 lens 496/448 e 23 to 0 dl 1705684588 ref 2 fl Interpret:/200/0 rc 0/0 job:'dir_create.sh.0' uid:0 gid:0 [ 1370.927568] Lustre: lustre-MDT0000: Client f042c446-6c28-44d9-932f-243f4ae12647 (at 192.168.201.56@tcp) reconnecting [ 1714.757625] Lustre: lustre-MDT0000: Client 1cd3db5d-077d-4dfa-9e22-5aaec155bd62 (at 192.168.201.56@tcp) reconnecting [ 1962.985795] Lustre: lustre-MDT0000: Client f042c446-6c28-44d9-932f-243f4ae12647 (at 192.168.201.56@tcp) reconnecting [ 2182.781951] Lustre: lustre-MDT0000: Client 1cd3db5d-077d-4dfa-9e22-5aaec155bd62 (at 192.168.201.56@tcp) reconnecting [ 2554.927608] Lustre: lustre-MDT0000: Client f042c446-6c28-44d9-932f-243f4ae12647 (at 192.168.201.56@tcp) reconnecting [ 2650.801372] Lustre: lustre-MDT0000: Client 1cd3db5d-077d-4dfa-9e22-5aaec155bd62 (at 192.168.201.56@tcp) reconnecting [ 3118.828980] Lustre: lustre-MDT0000: Client 1cd3db5d-077d-4dfa-9e22-5aaec155bd62 (at 192.168.201.56@tcp) reconnecting [ 3586.864540] Lustre: lustre-MDT0000: Client 1cd3db5d-077d-4dfa-9e22-5aaec155bd62 (at 192.168.201.56@tcp) reconnecting [ 3586.867456] Lustre: Skipped 1 previous similar message ****************************************************************************** ************************ A Summary Of Problems Found ************************* ****************************************************************************** -------------------- A list of all +++WARNING+++ messages -------------------- PARTIAL DUMP with size(vmcore) < 25% size(RAM) There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details 2 processes in UNINTERRUPTIBLE state are committing journal Possible hang Run 'hanginfo' to get more details ------------------------------------------------------------------------------ ** Execution took 11.57s (real) 6.63s (CPU), Child processes: 4.92s