************************ crashinfo *************************
/exports/testreports/38859/testresults/racer-special2-ldiskfs-centos7_x86_64-centos7_x86_64/oleg310-server-timeout-core (3.10.0-7.9-debug)

                          +==========================+
                          | *** Crashinfo v1.3.7 *** |
                          +==========================+

+++WARNING+++ PARTIAL DUMP with size(vmcore) < 25% size(RAM)

      KERNEL: /tmp/crash-anaysis.dElFW/vmlinux  [TAINTED]
    DUMPFILE: /exports/testreports/38859/testresults/racer-special2-ldiskfs-centos7_x86_64-centos7_x86_64/oleg310-server-timeout-core  [PARTIAL DUMP]
        CPUS: 4
        DATE: Fri Jan 19 12:55:13 EST 2024
      UPTIME: 01:01:30
LOAD AVERAGE: 19.00, 18.99, 18.61
       TASKS: 261
    NODENAME: oleg310-server.virtnet
     RELEASE: 3.10.0-7.9-debug
     VERSION: #1 SMP Sat Mar 26 23:28:42 EDT 2022
     MACHINE: x86_64  (2399 Mhz)
      MEMORY: 4 GB
       PANIC: ""

                          +--------------------------+
>------------------------| Per-cpu Stacks ('bt -a') |------------------------<
                          +--------------------------+

      -- CPU#0 --
PID=0  CPU=0  CMD=swapper/0
  #-1  native_safe_halt+0xb, 449 bytes of data
  #0   default_idle+0x1e
  #1   default_enter_idle+0x45
  #2   cpuidle_enter_state+0x40
  #3   cpuidle_idle_call+0xd8
  #4   arch_cpu_idle+0xe
  #5   cpu_startup_entry+0x14a
  #6   rest_init+0x8e
  #7   start_kernel+0x456
  #8   x86_64_start_reservations+0x2a
  #9   x86_64_start_kernel+0x152
  #10  start_cpu+0x5

      -- CPU#1 --
PID=0  CPU=1  CMD=swapper/1
  #-1  native_safe_halt+0xb, 449 bytes of data
  #0   default_idle+0x1e
  #1   default_enter_idle+0x45
  #2   cpuidle_enter_state+0x40
  #3   cpuidle_idle_call+0xd8
  #4   arch_cpu_idle+0xe
  #5   cpu_startup_entry+0x14a
  #6   start_secondary+0x1eb
  #7   start_cpu+0x5

      -- CPU#2 --
PID=0  CPU=2  CMD=swapper/2
  #-1  native_safe_halt+0xb, 449 bytes of data
  #0   default_idle+0x1e
  #1   default_enter_idle+0x45
  #2   cpuidle_enter_state+0x40
  #3   cpuidle_idle_call+0xd8
  #4   arch_cpu_idle+0xe
  #5   cpu_startup_entry+0x14a
  #6   start_secondary+0x1eb
  #7   start_cpu+0x5

      -- CPU#3 --
PID=0  CPU=3  CMD=swapper/3
  #-1  native_safe_halt+0xb, 449 bytes of data
  #0   default_idle+0x1e
  #1   default_enter_idle+0x45
  #2   cpuidle_enter_state+0x40
  #3   cpuidle_idle_call+0xd8
  #4   arch_cpu_idle+0xe
  #5   cpu_startup_entry+0x14a
  #6   start_secondary+0x1eb
  #7   start_cpu+0x5

                       +--------------------------------+
>---------------------| How This Dump Has Been Created |---------------------<
                       +--------------------------------+

   Cannot identify the specific condition that triggered vmcore

                               +---------------+
>------------------------------| Tasks Summary |------------------------------<
                               +---------------+

Number of Threads That Ran Recently
-----------------------------------
   last second      21
   last     5s      46
   last    60s      63

 ----- Total Numbers of Threads per State ------
  TASK_INTERRUPTIBLE               238
  TASK_RUNNING                       1
  TASK_UNINTERRUPTIBLE              19

+++WARNING+++ There are 3 threads running in their own namespaces
    Use 'taskinfo --ns' to get more details

                           +-----------------------+
>--------------------------| 5 Most Recent Threads |--------------------------<
                           +-----------------------+

   PID   CMD              Age    ARGS
 -----   --------------  ------  ----------------------------
   880   in:imjournal      0 ms  /usr/sbin/rsyslogd -n
  5782   ldlm_bl_01        0 ms  (no user stack)
  3581   l2arc_feed        0 ms  (no user stack)
    46   kworker/0:1       0 ms  (no user stack)
  3577   zthr_procedure  123 ms  (no user stack)

                          +------------------------+
>-------------------------| Memory Usage (kmem -i) |-------------------------<
                          +------------------------+

                 PAGES        TOTAL      PERCENTAGE
    TOTAL MEM   955067       3.6 GB         ----
         FREE   748092       2.9 GB   78% of TOTAL MEM
         USED   206975     808.5 MB   21% of TOTAL MEM
       SHARED    10610      41.4 MB    1% of TOTAL MEM
      BUFFERS     7650      29.9 MB    0% of TOTAL MEM
       CACHED    72835     284.5 MB    7% of TOTAL MEM
         SLAB    15345      59.9 MB    1% of TOTAL MEM

   TOTAL HUGE        0            0         ----
    HUGE FREE        0            0    0% of TOTAL HUGE

   TOTAL SWAP   262143      1024 MB         ----
    SWAP USED        0            0    0% of TOTAL SWAP
    SWAP FREE   262143      1024 MB  100% of TOTAL SWAP

 COMMIT LIMIT   739676       2.8 GB         ----
    COMMITTED    59591     232.8 MB    8% of TOTAL LIMIT
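The COMMIT LIMIT above is consistent with the kernel's overcommit formula,
CommitLimit = (TotalRAM - HugePages) * overcommit_ratio / 100 + TotalSwap,
assuming the default vm.overcommit_ratio of 50 (the ratio itself is not shown
in the report, and huge pages are zero here). A minimal Python sketch checking
the arithmetic; the variable names are ours, not crashinfo's:

    # Check COMMIT LIMIT against the kmem -i numbers above.
    # vm.overcommit_ratio = 50 is an assumed default, not read from the dump.
    total_mem_pages  = 955067   # TOTAL MEM
    total_swap_pages = 262143   # TOTAL SWAP
    overcommit_ratio = 50       # assumed default

    commit_limit = total_mem_pages * overcommit_ratio // 100 + total_swap_pages
    print(commit_limit)         # -> 739676, matching COMMIT LIMIT in the report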
+++WARNING+++ 2 processes in UNINTERRUPTIBLE state are committing journal

+++ three oldest UNINTERRUPTIBLE threads

 ... ran 3599s ago
 PID=12125  CPU=2  CMD=mdt_rdpg00_004
  #0   __schedule+0x2e2
  #1   schedule+0x29
  #2   wait_transaction_locked+0x93
  #3   add_transaction_credits+0x270
  #4   start_this_handle+0x1cc
  #5   jbd2__journal_start+0xf3
  #6   __ldiskfs_journal_start_sb+0x6d
  #7   osd_trans_start+0x233
  #8   top_trans_start+0x7d9
  #9   lod_trans_start+0x7f
  #10  mdd_trans_start+0x14
  #11  mdd_xattr_set+0x27c
  #12  mdt_set_som+0xf1
  #13  mdt_lsom_update+0x334
  #14  mdt_mfd_close+0x4ce
  #15  mdt_close_internal+0xb4
  #16  mdt_close+0x28c
  #17  tgt_request_handle+0x74e
  #18  ptlrpc_server_handle_request+0x26e
  #19  ptlrpc_main+0xc37
  #20  kthread+0xe4
  #21  ret_from_fork_nospec_begin+0x7

 ... ran 3599s ago
 PID=5775  CPU=2  CMD=jbd2/dm-0-8
  #0   __schedule+0x2e2
  #1   schedule+0x29
  #2   jbd2_journal_commit_transaction+0x25d
  #3   kjournald2+0xcd
  #4   kthread+0xe4
  #5   ret_from_fork_nospec_begin+0x7

 ... ran 3599s ago
 PID=12103  CPU=2  CMD=mdt00_010
  #0   __schedule+0x2e2
  #1   schedule+0x29
  #2   jbd2_log_wait_commit+0xc5
  #3   jbd2_journal_stop+0x36b
  #4   __ldiskfs_journal_stop+0x3c
  #5   osd_trans_stop+0x1ce
  #6   top_trans_stop+0x9f
  #7   lod_trans_stop+0x72
  #8   mdd_trans_stop+0x28
  #9   mdd_attr_set+0x5ee
  #10  mdt_reint_setattr+0x91b
  #11  mdt_reint_rec+0x87
  #12  mdt_reint_internal+0x74c
  #13  mdt_reint+0x67
  #14  tgt_request_handle+0x74e
  #15  ptlrpc_server_handle_request+0x26e
  #16  ptlrpc_main+0xc37
  #17  kthread+0xe4
  #18  ret_from_fork_nospec_begin+0x7

                       +-------------------------------+
>----------------------| Scheduler Runqueues (per CPU) |----------------------<
                       +-------------------------------+

  ---+ CPU=0 ----
     | CURRENT TASK , CMD=swapper/0
  ---+ CPU=1 ----
     | CURRENT TASK , CMD=swapper/1
  ---+ CPU=2 ----
     | CURRENT TASK , CMD=swapper/2
  ---+ CPU=3 ----
     | CURRENT TASK , CMD=swapper/3

                          +------------------------+
>-------------------------| Network Status Summary |-------------------------<
                          +------------------------+

TCP Connection Info
-------------------
     ESTABLISHED      5
          LISTEN      3
            NAGLE disabled (TCP_NODELAY):  4
            user_data set (NFS etc.):      4

UDP Connection Info
-------------------
  2 UDP sockets, 0 in ESTABLISHED

Unix Connection Info
------------------------
     ESTABLISHED     26
           CLOSE     17
          LISTEN      8

Raw sockets info
--------------------
     ESTABLISHED      1

Interfaces Info
---------------
  How long ago (in seconds) interfaces transmitted/received?
      Name        RX          TX
      ----    ----------   ---------
        lo       n/a         3688.4
      eth0       n/a            1.2
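The three stacks form the usual jbd2 wait chain: mdt_rdpg00_004 cannot start a
new handle (wait_transaction_locked), the jbd2/dm-0-8 commit thread is stuck
inside jbd2_journal_commit_transaction, which typically means it is waiting for
an already-open handle to be closed, and mdt00_010 has stopped its handle and is
waiting for that commit to finish (jbd2_log_wait_commit). The likely culprit is
therefore a fourth thread holding a journal handle while blocked on something
else; the LDLM lock timeouts in the dmesg section below are a good candidate.
To list every D-state thread touching the journal, crash can be scripted
through pykdump; a minimal sketch, assuming the pykdump extension is loaded so
its exec_crash_command wrapper is available (run via its epython command):

    # List UNINTERRUPTIBLE tasks whose stacks mention the journal.
    # A sketch of manual triage, not what crashinfo itself runs.
    from pykdump.API import exec_crash_command

    out = exec_crash_command("foreach UN bt")   # 'UN' = TASK_UNINTERRUPTIBLE
    for block in out.split("\n\n"):             # crash prints one block per task
        if "jbd2" in block or "start_this_handle" in block:
            print(block.splitlines()[0])        # 'PID: ... COMMAND: ...' header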
RSS_TOTAL=51796 pages, %mem= 0.8

+++WARNING+++ Possible hang
+++WARNING+++ Run 'hanginfo' to get more details

                                 +------------+
>-------------------------------| Mounted FS |-------------------------------<
                                 +------------+

     MOUNT           SUPERBLK     TYPE        DEVNAME                 DIRNAME
ffff880138cca000 ffff880139940800 rootfs      rootfs                  /
ffff88012a312000 ffff8800b7e8a000 sysfs       sysfs                   /sys
ffff88012a3121c0 ffff880139944000 proc        proc                    /proc
ffff88012a312380 ffff880137668000 devtmpfs    devtmpfs                /dev
ffff88012a312540 ffff8800b51e0800 securityfs  securityfs              /sys/kernel/security
ffff88012a312700 ffff8800b7e8a800 tmpfs       tmpfs                   /dev/shm
ffff88012a3128c0 ffff88013717f000 devpts      devpts                  /dev/pts
ffff88012a312a80 ffff8800b7e8b000 tmpfs       tmpfs                   /run
ffff88012a312c40 ffff8800b7e8b800 tmpfs       tmpfs                   /sys/fs/cgroup
ffff88012a312e00 ffff8800b7e8c000 cgroup      cgroup                  /sys/fs/cgroup/systemd
ffff88012a312fc0 ffff8800b7e8c800 pstore      pstore                  /sys/fs/pstore
ffff88012a313180 ffff8800b7e8e800 cgroup      cgroup                  /sys/fs/cgroup/pids
ffff88012a313340 ffff8800b7e8e000 cgroup      cgroup                  /sys/fs/cgroup/cpu,cpuacct
ffff88012a313500 ffff8800b7e8d800 cgroup      cgroup                  /sys/fs/cgroup/cpuset
ffff88012a3136c0 ffff8800b7e8d000 cgroup      cgroup                  /sys/fs/cgroup/hugetlb
ffff88012a313880 ffff8800b7e8f000 cgroup      cgroup                  /sys/fs/cgroup/freezer
ffff88012a313a40 ffff8800b7e8f800 cgroup      cgroup                  /sys/fs/cgroup/devices
ffff88012a313c00 ffff88012a328000 cgroup      cgroup                  /sys/fs/cgroup/net_cls,net_prio
ffff88012a313dc0 ffff88012a328800 cgroup      cgroup                  /sys/fs/cgroup/perf_event
ffff88012a2f4000 ffff88012a329000 cgroup      cgroup                  /sys/fs/cgroup/memory
ffff88012a2f41c0 ffff88012a329800 cgroup      cgroup                  /sys/fs/cgroup/blkio
ffff88012a2f4380 ffff88012a32d000 configfs    configfs                /sys/kernel/config
ffff8800b524c000 ffff8800b5233000 ext4        /dev/nbd0               /
ffff880138ccba40 ffff88013766a000 rpc_pipefs  rpc_pipefs              /var/lib/nfs/rpc_pipefs
ffff880137652540 ffff88013685d000 autofs      systemd-1               /proc/sys/fs/binfmt_misc
ffff8800b524c1c0 ffff880139947800 debugfs     debugfs                 /sys/kernel/debug
ffff88012a2f4a80 ffff88012b298000 mqueue      mqueue                  /dev/mqueue
ffff880137652700 ffff88013685e000 hugetlbfs   hugetlbfs               /dev/hugepages
ffff88012a2f4c40 ffff8800b51e6800 binfmt_misc binfmt_misc             /proc/sys/fs/binfmt_misc/
ffff880137652a80 ffff8800b4aa0800 ramfs       none                    /mnt
ffff88012a2f4e00 ffff8800b3c7a800 squashfs    /dev/vda                /home/green/git/lustre-release
ffff880137652c40 ffff8800b5234800 tmpfs       none                    /var/lib/stateless/writable
ffff880137652e00 ffff8800b5234800 tmpfs       none                    /var/cache/man
ffff880138ccbc00 ffff8800b5234800 tmpfs       none                    /var/log
ffff880137652fc0 ffff8800b5234800 tmpfs       none                    /var/lib/dbus
ffff880137653180 ffff8800b5234800 tmpfs       none                    /tmp
ffff880137653340 ffff8800b5234800 tmpfs       none                    /var/lib/dhclient
ffff88012a2f4fc0 ffff8800b5234800 tmpfs       none                    /var/tmp
ffff88012a2f5180 ffff8800b5234800 tmpfs       none                    /var/lib/NetworkManager
ffff8800b524c380 ffff8800b5234800 tmpfs       none                    /var/lib/systemd/random-seed
ffff88012a2f5340 ffff8800b5234800 tmpfs       none                    /var/spool
ffff88012a2f5500 ffff8800b5234800 tmpfs       none                    /var/lib/nfs
ffff880137653500 ffff8800b5234800 tmpfs       none                    /var/lib/gssproxy
ffff8801376536c0 ffff8800b5234800 tmpfs       none                    /var/lib/logrotate
ffff8800b524c540 ffff8800b5234800 tmpfs       none                    /etc
ffff88012a2f56c0 ffff8800b5234800 tmpfs       none                    /var/lib/rsyslog
ffff88012a2f5880 ffff8800b5234800 tmpfs       none                    /var/lib/dhclient/var/lib/dhclient
ffff880137653880 ffff8800b4bdf800 nfs4        192.168.200.253:/exports/state/oleg310-server.virtnet  /var/lib/stateless/state
ffff88012a2f5a40 ffff8800b4bdf800 nfs4        192.168.200.253:/exports/state/oleg310-server.virtnet  /boot
ffff88012a2f5c00 ffff8800b4bdf800 nfs4        192.168.200.253:/exports/state/oleg310-server.virtnet  /etc/etc/kdump.conf
ffff88012a2f5dc0 ffff88013766a000 rpc_pipefs  sunrpc                  /var/lib/nfs/var/lib/nfs/rpc_pipefs
ffff8800b1ef2700 ffff8800b3c7a800 squashfs    /dev/vda                /usr/sbin/mount.lustre
ffff8800b3cc08c0 ffff880136859800 lustre      /dev/mapper/mds1_flakey /mnt/lustre-mds1
ffff8800b3cc16c0 ffff8800b3c7c000 lustre      /dev/mapper/ost1_flakey /mnt/lustre-ost1
ffff8800b3cc0700 ffff88007515b000 lustre      /dev/mapper/ost2_flakey /mnt/lustre-ost2
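Note that most of the tmpfs rows share a single superblock, ffff8800b5234800:
that is likely one writable tmpfs surfaced at many mount points (bind mounts,
typical for a stateless/readonly-root test image), not dozens of independent
filesystems. A throwaway Python sketch that groups rows by SUPERBLK to make
such sharing visible; the sample rows are copied from the table above:

    # Group Mounted FS rows by SUPERBLK to spot mounts sharing one superblock.
    # 'report' holds rows pasted from the table; a parsing sketch only.
    from collections import defaultdict

    report = """\
    ffff880137652c40 ffff8800b5234800 tmpfs none /var/lib/stateless/writable
    ffff880138ccbc00 ffff8800b5234800 tmpfs none /var/log
    ffff8800b524c540 ffff8800b5234800 tmpfs none /etc
    """

    by_sb = defaultdict(list)
    for line in report.strip().splitlines():
        mount, sb, fstype, devname, dirname = line.split()
        by_sb[sb].append(dirname)

    for sb, dirs in by_sb.items():
        if len(dirs) > 1:
            print(sb, "->", ", ".join(dirs))   # one superblock, many mounts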
                       +-------------------------------+
>----------------------| Last 40 lines of dmesg buffer |----------------------<
                       +-------------------------------+

[ 241.742598]  [] mdd_xattr_set+0x27c/0x1040 [mdd]
[ 241.745446]  [] ? lustre_msg_get_versions+0x22/0xf0 [ptlrpc]
[ 241.748435]  [] mdt_reint_setxattr+0x737/0x1140 [mdt]
[ 241.749982]  [] mdt_reint_rec+0x87/0x240 [mdt]
[ 241.751391]  [] mdt_reint_internal+0x74c/0xba0 [mdt]
[ 241.752864]  [] mdt_reint+0x67/0x150 [mdt]
[ 241.754258]  [] tgt_request_handle+0x74e/0x19d0 [ptlrpc]
[ 241.755985]  [] ptlrpc_server_handle_request+0x26e/0xcf0 [ptlrpc]
[ 241.758860]  [] ptlrpc_main+0xc37/0x16d0 [ptlrpc]
[ 241.760787]  [] ? put_prev_entity+0x31/0x400
[ 241.762817]  [] ? __switch_to+0xcd/0x4e0
[ 241.765125]  [] ? ptlrpc_wait_event+0x5e0/0x5e0 [ptlrpc]
[ 241.767736]  [] kthread+0xe4/0xf0
[ 241.769785]  [] ? kthread_create_on_node+0x140/0x140
[ 241.771852]  [] ret_from_fork_nospec_begin+0x7/0x21
[ 241.773873]  [] ? kthread_create_on_node+0x140/0x140
[ 391.835207] LustreError: 12104:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1705683314, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-lustre-MDT0000_UUID lock: ffff880076598780/0xba37f4addd1edc06 lrc: 3/1,0 mode: --/PR res: [0x200000401:0x1:0x0].0x0 bits 0x13/0x0 rrc: 20 type: IBT flags: 0x40210000000000 pid: 12104 initiator: MDT0
[ 391.845315] LustreError: 12104:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) Skipped 5 previous similar messages
[ 391.851500] LustreError: dumping log to /tmp/lustre-log.1705683614.8719
[ 392.758689] LustreError: 12131:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1705683315, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-lustre-MDT0000_UUID lock: ffff880079cbb480/0xba37f4addd1edc68 lrc: 3/1,0 mode: --/PR res: [0x200000401:0x1:0x0].0x0 bits 0x13/0x0 rrc: 19 type: IBT flags: 0x40210000000000 pid: 12131 initiator: MDT0
[ 392.769419] LustreError: 12131:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) Skipped 3 previous similar messages
[ 395.754510] LustreError: 12138:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1705683318, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-lustre-MDT0000_UUID lock: ffff880079cbb700/0xba37f4addd1edcae lrc: 3/1,0 mode: --/PR res: [0x200000401:0x1:0x0].0x0 bits 0x13/0x0 rrc: 19 type: IBT flags: 0x40210000000000 pid: 12138 initiator: MDT0
[ 395.765789] LustreError: 12138:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[ 686.783368] Lustre: 12110:0:(service.c:1455:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply  req@ffff88012db9a850 x1788538518009216/t0(0) o35->61010ae6-e609-4603-a64d-e10e61d02192@192.168.203.10@tcp:749/0 lens 392/456 e 24 to 0 dl 1705683914 ref 2 fl Interpret:/200/0 rc 0/0 job:'mrename.0' uid:0 gid:0
[ 686.793178] Lustre: 12110:0:(service.c:1455:ptlrpc_at_send_early_reply()) Skipped 18 previous similar messages
[ 687.785519] Lustre: 12142:0:(service.c:1455:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply  req@ffff880075a1d000 x1788538518012160/t0(0) o101->61010ae6-e609-4603-a64d-e10e61d02192@192.168.203.10@tcp:750/0 lens 576/3808 e 24 to 0 dl 1705683915 ref 2 fl Interpret:/200/0 rc 0/0 job:'ls.0' uid:0 gid:0
[ 687.792056] Lustre: 12142:0:(service.c:1455:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
[ 696.781375] Lustre: lustre-MDT0000: Client e16192e2-cf47-4b90-ab85-ac3ca28b2241 (at 192.168.203.10@tcp) reconnecting
[ 696.785128] Lustre: Skipped 1 previous similar message
[ 1281.727709] Lustre: 12170:0:(service.c:1455:ptlrpc_at_send_early_reply()) @@@ Could not add any time (4/-557), not sending early reply  req@ffff880075a1bc00 x1788538518016832/t0(0) o4->61010ae6-e609-4603-a64d-e10e61d02192@192.168.203.10@tcp:587/0 lens 488/448 e 23 to 0 dl 1705684507 ref 2 fl Interpret:/200/0 rc 0/0 job:'cp.0' uid:0 gid:0
[ 1281.742787] Lustre: 12170:0:(service.c:1455:ptlrpc_at_send_early_reply()) Skipped 5 previous similar messages
[ 1290.104986] Lustre: lustre-MDT0000: Client 61010ae6-e609-4603-a64d-e10e61d02192 (at 192.168.203.10@tcp) reconnecting
[ 1301.787391] Lustre: lustre-MDT0000: Client e16192e2-cf47-4b90-ab85-ac3ca28b2241 (at 192.168.203.10@tcp) reconnecting
[ 1883.178749] Lustre: lustre-MDT0000: Client 61010ae6-e609-4603-a64d-e10e61d02192 (at 192.168.203.10@tcp) reconnecting
[ 1906.793070] Lustre: lustre-MDT0000: Client e16192e2-cf47-4b90-ab85-ac3ca28b2241 (at 192.168.203.10@tcp) reconnecting
[ 2475.858069] Lustre: lustre-MDT0000: Client 61010ae6-e609-4603-a64d-e10e61d02192 (at 192.168.203.10@tcp) reconnecting
[ 2511.809894] Lustre: lustre-MDT0000: Client e16192e2-cf47-4b90-ab85-ac3ca28b2241 (at 192.168.203.10@tcp) reconnecting
[ 3068.833666] Lustre: lustre-MDT0000: Client 61010ae6-e609-4603-a64d-e10e61d02192 (at 192.168.203.10@tcp) reconnecting
[ 3662.345836] Lustre: lustre-MDT0000: Client 61010ae6-e609-4603-a64d-e10e61d02192 (at 192.168.203.10@tcp) reconnecting
[ 3662.349335] Lustre: Skipped 1 previous similar message
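The epoch stamps quoted by Lustre line up with the dump header: DATE minus
UPTIME puts boot at about 11:53:43 EST, so "enqueued at 1705683314" is roughly
uptime 91 s, and the 300 s timeout fires right at the [391.8] mark where the
first ldlm_expired_completion_wait message appears. A small Python sketch of
that conversion, with the times and zone taken from the header above (EST is
UTC-5 in January):

    # Correlate dmesg uptime stamps with the epoch times quoted by Lustre.
    from datetime import datetime, timedelta, timezone

    est = timezone(timedelta(hours=-5))
    crash_time = datetime(2024, 1, 19, 12, 55, 13, tzinfo=est)   # DATE
    boot_time = crash_time - timedelta(hours=1, minutes=1, seconds=30)  # UPTIME

    enqueued = datetime.fromtimestamp(1705683314, est)  # from the ldlm message
    print(boot_time)                          # ~2024-01-19 11:53:43-05:00
    print(enqueued - boot_time)               # ~0:01:31 -> uptime ~91 s
    print(enqueued + timedelta(seconds=300))  # ~12:00:14, i.e. the [391.8] stamp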
******************************************************************************
************************ A Summary Of Problems Found *************************
******************************************************************************

-------------------- A list of all +++WARNING+++ messages --------------------
   PARTIAL DUMP with size(vmcore) < 25% size(RAM)
   There are 3 threads running in their own namespaces
      Use 'taskinfo --ns' to get more details
   2 processes in UNINTERRUPTIBLE state are committing journal
   Possible hang
      Run 'hanginfo' to get more details
------------------------------------------------------------------------------

** Execution took  15.21s (real)  9.53s (CPU), Child processes:  5.48s