************************ crashinfo ************************* /exports/testreports/38859/testresults/racer-ldiskfs-DNE-centos7_x86_64-centos7_x86_64-retry1/oleg343-server-timeout-core (3.10.0-7.9-debug) +==========================+ | *** Crashinfo v1.3.7 *** | +==========================+ +++WARNING+++ PARTIAL DUMP with size(vmcore) < 25% size(RAM) KERNEL: /tmp/crash-anaysis.IsIdi/vmlinux [TAINTED] DUMPFILE: /exports/testreports/38859/testresults/racer-ldiskfs-DNE-centos7_x86_64-centos7_x86_64-retry1/oleg343-server-timeout-core [PARTIAL DUMP] CPUS: 4 DATE: Fri Jan 19 12:05:44 EST 2024 UPTIME: 00:11:57 LOAD AVERAGE: 0.00, 0.49, 0.56 TASKS: 303 NODENAME: oleg343-server.virtnet RELEASE: 3.10.0-7.9-debug VERSION: #1 SMP Sat Mar 26 23:28:42 EDT 2022 MACHINE: x86_64 (2399 Mhz) MEMORY: 4 GB PANIC: "" +--------------------------+ >------------------------| Per-cpu Stacks ('bt -a') |------------------------< +--------------------------+ -- CPU#0 -- PID=0 CPU=0 CMD=swapper/0 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 rest_init+0x8e #7 start_kernel+0x456 #8 x86_64_start_reservations+0x2a #9 x86_64_start_kernel+0x152 #10 start_cpu+0x5 -- CPU#1 -- PID=0 CPU=1 CMD=swapper/1 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#2 -- PID=0 CPU=2 CMD=swapper/2 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#3 -- PID=0 CPU=3 CMD=swapper/3 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 +--------------------------------+ >---------------------| How This Dump Has Been Created |---------------------< +--------------------------------+ Cannot identify the specific condition that triggered vmcore +---------------+ >------------------------------| Tasks Summary |------------------------------< +---------------+ Number of Threads That Ran Recently ----------------------------------- last second 20 last 5s 69 last 60s 81 ----- Total Numbers of Threads per State ------ TASK_INTERRUPTIBLE 299 TASK_RUNNING 1 +++WARNING+++ There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details +-----------------------+ >--------------------------| 5 Most Recent Threads |--------------------------< +-----------------------+ PID CMD Age ARGS ----- -------------- ------ ---------------------------- 3837 dbuf_evict 0 ms (no user stack) 3071 socknal_reaper 0 ms (no user stack) 1 systemd 0 ms /usr/lib/systemd/systemd --switched-root --system --deserialize 22 3076 monitor_thread 3 ms (no user stack) 46 kworker/0:1 19 ms (no user stack) +------------------------+ >-------------------------| Memory Usage (kmem -i) |-------------------------< +------------------------+ PAGES TOTAL PERCENTAGE TOTAL MEM 955067 3.6 GB ---- FREE 700332 2.7 GB 73% of TOTAL MEM USED 254735 995.1 MB 26% of TOTAL MEM SHARED 23429 91.5 MB 2% of TOTAL MEM BUFFERS 21257 83 MB 2% of TOTAL MEM CACHED 91259 356.5 MB 9% of TOTAL MEM SLAB 21396 83.6 MB 2% of TOTAL MEM TOTAL HUGE 0 0 ---- HUGE FREE 0 0 0% of TOTAL HUGE TOTAL SWAP 262143 1024 MB ---- SWAP USED 0 0 0% of TOTAL SWAP SWAP FREE 262143 1024 MB 100% of TOTAL SWAP COMMIT LIMIT 739676 2.8 GB ---- COMMITTED 60549 236.5 MB 8% of TOTAL LIMIT +-------------------------------+ >----------------------| Scheduler Runqueues (per CPU) |----------------------< +-------------------------------+ ---+ CPU=0 ---- | CURRENT TASK , CMD=swapper/0 ---+ CPU=1 ---- | CURRENT TASK , CMD=swapper/1 ---+ CPU=2 ---- | CURRENT TASK , CMD=swapper/2 ---+ CPU=3 ---- | CURRENT TASK , CMD=swapper/3 +------------------------+ >-------------------------| Network Status Summary |-------------------------< +------------------------+ TCP Connection Info ------------------- ESTABLISHED 5 LISTEN 3 NAGLE disabled (TCP_NODELAY): 4 user_data set (NFS etc.): 4 UDP Connection Info ------------------- 2 UDP sockets, 0 in ESTABLISHED Unix Connection Info ------------------------ ESTABLISHED 26 CLOSE 17 LISTEN 8 Raw sockets info -------------------- ESTABLISHED 1 Interfaces Info --------------- How long ago (in seconds) interfaces transmitted/received? Name RX TX ---- ---------- --------- lo n/a 715.2 eth0 n/a 1.8 RSS_TOTAL=52188 pages, %mem= 0.9 +------------+ >-------------------------------| Mounted FS |-------------------------------< +------------+ MOUNT SUPERBLK TYPE DEVNAME DIRNAME ffff880138cca000 ffff880139940800 rootfs rootfs / ffff88012a274000 ffff88012a318000 sysfs sysfs /sys ffff88012a2741c0 ffff880139944000 proc proc /proc ffff88012a274380 ffff880137668000 devtmpfs devtmpfs /dev ffff88012a274540 ffff8800b5229800 securityfs securityfs /sys/kernel/security ffff88012a274700 ffff88012a318800 tmpfs tmpfs /dev/shm ffff88012a2748c0 ffff88013716f000 devpts devpts /dev/pts ffff88012a274a80 ffff88012a319000 tmpfs tmpfs /run ffff88012a274c40 ffff88012a319800 tmpfs tmpfs /sys/fs/cgroup ffff88012a274e00 ffff88012a31a000 cgroup cgroup /sys/fs/cgroup/systemd ffff88012a274fc0 ffff88012a31a800 pstore pstore /sys/fs/pstore ffff88012a275180 ffff88012a31c800 cgroup cgroup /sys/fs/cgroup/freezer ffff88012a275340 ffff88012a31c000 cgroup cgroup /sys/fs/cgroup/memory ffff88012a275500 ffff88012a31b800 cgroup cgroup /sys/fs/cgroup/devices ffff88012a2756c0 ffff88012a31b000 cgroup cgroup /sys/fs/cgroup/cpuset ffff88012a275880 ffff88012a31d000 cgroup cgroup /sys/fs/cgroup/perf_event ffff88012a275a40 ffff88012a31d800 cgroup cgroup /sys/fs/cgroup/pids ffff88012a275c00 ffff88012a31e000 cgroup cgroup /sys/fs/cgroup/blkio ffff88012a275dc0 ffff88012a31e800 cgroup cgroup /sys/fs/cgroup/cpu,cpuacct ffff88012a34c000 ffff88012a31f000 cgroup cgroup /sys/fs/cgroup/hugetlb ffff88012a34c1c0 ffff88012a31f800 cgroup cgroup /sys/fs/cgroup/net_cls,net_prio ffff880138ccb880 ffff8800b51fd000 configfs configfs /sys/kernel/config ffff88012a34c380 ffff8800b5261800 ext4 /dev/nbd0 / ffff88012a34c540 ffff8800b522b800 rpc_pipefs rpc_pipefs /var/lib/nfs/rpc_pipefs ffff880137652700 ffff8800b4205800 autofs systemd-1 /proc/sys/fs/binfmt_misc ffff88012a34cc40 ffff88012b260000 mqueue mqueue /dev/mqueue ffff8801376528c0 ffff880139947800 debugfs debugfs /sys/kernel/debug ffff880137652a80 ffff880129cd7800 hugetlbfs hugetlbfs /dev/hugepages ffff88012b4ca000 ffff88013683a800 binfmt_misc binfmt_misc /proc/sys/fs/binfmt_misc/ ffff88012a34ce00 ffff8800b4a59000 ramfs none /mnt ffff88012a34cfc0 ffff8800b4208800 squashfs /dev/vda /home/green/git/lustre-release ffff88012b4ca1c0 ffff8800b4a9c000 tmpfs none /var/lib/stateless/writable ffff880137652e00 ffff8800b4a9c000 tmpfs none /var/cache/man ffff880138ccba40 ffff8800b4a9c000 tmpfs none /var/log ffff880137652fc0 ffff8800b4a9c000 tmpfs none /var/lib/dbus ffff880137653180 ffff8800b4a9c000 tmpfs none /tmp ffff880137653340 ffff8800b4a9c000 tmpfs none /var/lib/dhclient ffff880138ccbc00 ffff8800b4a9c000 tmpfs none /var/tmp ffff880137653500 ffff8800b4a9c000 tmpfs none /var/lib/NetworkManager ffff88012a34d180 ffff8800b4a9c000 tmpfs none /var/lib/systemd/random-seed ffff8801376536c0 ffff8800b4a9c000 tmpfs none /var/spool ffff88012a34d340 ffff8800b4a9c000 tmpfs none /var/lib/nfs ffff880137653880 ffff8800b4a9c000 tmpfs none /var/lib/gssproxy ffff88012a34d500 ffff8800b4a9c000 tmpfs none /var/lib/logrotate ffff88012b4ca380 ffff8800b4a9c000 tmpfs none /etc ffff88012a34d6c0 ffff8800b4a9c000 tmpfs none /var/lib/rsyslog ffff88012a34d880 ffff8800b4a9c000 tmpfs none /var/lib/dhclient/var/lib/dhclient ffff88012a34da40 ffff8801373f4800 nfs4 192.168.200.253:/exports/state/oleg343-server.virtnet /var/lib/stateless/state ffff88012a34dc00 ffff8801373f4800 nfs4 192.168.200.253:/exports/state/oleg343-server.virtnet /boot ffff880137653a40 ffff8801373f4800 nfs4 192.168.200.253:/exports/state/oleg343-server.virtnet /etc/etc/kdump.conf ffff88012a34ddc0 ffff8800b522b800 rpc_pipefs sunrpc /var/lib/nfs/var/lib/nfs/rpc_pipefs ffff8800b234a8c0 ffff8800b4208800 squashfs /dev/vda /usr/sbin/mount.lustre ffff88012e189dc0 ffff8800b4a9f000 lustre /dev/mapper/mds1_flakey /mnt/lustre-mds1 ffff88012e188700 ffff88006faac800 lustre /dev/mapper/mds2_flakey /mnt/lustre-mds2 ffff88012e188e00 ffff88008b006000 lustre /dev/mapper/ost1_flakey /mnt/lustre-ost1 ffff88012e189180 ffff88009d8b7000 lustre /dev/mapper/ost2_flakey /mnt/lustre-ost2 +-------------------------------+ >----------------------| Last 40 lines of dmesg buffer |----------------------< +-------------------------------+ [ 256.486188] [<0>] ptlrpc_main+0xc37/0x16d0 [ptlrpc] [ 256.486194] [<0>] kthread+0xe4/0xf0 [ 256.486196] [<0>] ret_from_fork_nospec_begin+0x7/0x21 [ 256.486205] [<0>] 0xfffffffffffffffe [ 256.522662] Lustre: Skipped 1 previous similar message [ 256.524407] Pid: 6531, comm: mdt00_002 3.10.0-7.9-debug #1 SMP Sat Mar 26 23:28:42 EDT 2022 [ 256.527158] Call Trace: [ 256.528065] [<0>] ldlm_completion_ast+0x963/0xe80 [ptlrpc] [ 256.530837] [<0>] ldlm_cli_enqueue_local+0x259/0x880 [ptlrpc] [ 256.532552] [<0>] mdt_object_lock_internal+0x1a9/0x420 [mdt] [ 256.534195] [<0>] mdt_object_lock_try+0xa0/0x250 [mdt] [ 256.536011] [<0>] mdt_object_open_lock+0x679/0xb40 [mdt] [ 256.542557] [<0>] mdt_reint_open+0x24cf/0x2e60 [mdt] [ 256.544423] [<0>] mdt_reint_rec+0x87/0x240 [mdt] [ 256.545994] [<0>] mdt_reint_internal+0x74c/0xba0 [mdt] [ 256.552885] [<0>] mdt_intent_open+0x93/0x480 [mdt] [ 256.554601] [<0>] mdt_intent_opc+0x1c8/0xc50 [mdt] [ 256.562738] [<0>] mdt_intent_policy+0xfa/0x460 [mdt] [ 256.564083] [<0>] ldlm_lock_enqueue+0x3b1/0xbb0 [ptlrpc] [ 256.565195] [<0>] ldlm_handle_enqueue+0x359/0x17c0 [ptlrpc] [ 256.569016] [<0>] tgt_enqueue+0x68/0x240 [ptlrpc] [ 256.571390] [<0>] tgt_request_handle+0x74e/0x19d0 [ptlrpc] [ 256.572819] [<0>] ptlrpc_server_handle_request+0x26e/0xcf0 [ptlrpc] [ 256.577080] [<0>] ptlrpc_main+0xc37/0x16d0 [ptlrpc] [ 256.595894] [<0>] kthread+0xe4/0xf0 [ 256.598054] [<0>] ret_from_fork_nospec_begin+0x7/0x21 [ 256.599654] [<0>] 0xfffffffffffffffe [ 256.997877] Lustre: mdt00_023: service thread pid 14254 was inactive for 40.076 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [ 257.004247] Lustre: Skipped 8 previous similar messages [ 316.773807] LustreError: 6521:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 192.168.203.43@tcp ns: mdt-lustre-MDT0001_UUID lock: ffff880099c00c80/0xc10c714cea1e5769 lrc: 3/0,0 mode: PW/PW res: [0x240000402:0x171f:0x0].0x0 bits 0x4/0x0 rrc: 4 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.203.43@tcp remote: 0xb8dd65ac952ec71b expref: 948 pid: 14204 timeout: 315 lvb_type: 0 [ 316.800114] LustreError: 6530:0:(mdt_reint.c:2443:mdt_reint_migrate()) lustre-MDT0001: migrate [0x200000402:0x19c9:0x0]/12 failed: rc = -16 [ 316.805765] LustreError: 6530:0:(mdt_reint.c:2443:mdt_reint_migrate()) Skipped 188 previous similar messages [ 317.096996] LustreError: 14198:0:(mdd_dir.c:4452:mdd_migrate_cmd_check()) lustre-MDD0000: '10' migration was interrupted, run 'lfs migrate -m 0 -c 2 -H crush 10' to finish migration. [ 317.108629] LustreError: 14198:0:(mdd_dir.c:4452:mdd_migrate_cmd_check()) Skipped 22 previous similar messages [ 418.661926] LustreError: 6521:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 192.168.203.43@tcp ns: mdt-lustre-MDT0000_UUID lock: ffff88012f000a00/0xc10c714cea1f7bc5 lrc: 3/0,0 mode: PW/PW res: [0x200000403:0x176b:0x0].0x0 bits 0x4/0x0 rrc: 5 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.203.43@tcp remote: 0xb8dd65ac952f1d28 expref: 831 pid: 14255 timeout: 417 lvb_type: 0 [ 418.687091] LustreError: 6521:0:(mdd_object.c:3645:mdd_close()) lustre-MDD0000: failed to get lu_attr of [0x200000403:0x78a:0x0]: rc = -2 [ 418.691462] LustreError: 6521:0:(mdd_object.c:3645:mdd_close()) Skipped 2 previous similar messages [ 418.699502] LustreError: 14198:0:(ldlm_lockd.c:1495:ldlm_handle_enqueue()) ### lock on destroyed export ffff88009e3a0800 ns: mdt-lustre-MDT0000_UUID lock: ffff88009b9b0a00/0xc10c714cea1fdf3f lrc: 3/0,0 mode: PR/PR res: [0x200000403:0x176b:0x0].0x0 bits 0x1b/0x0 rrc: 4 type: IBT gid 0 flags: 0x50200000000000 nid: 192.168.203.43@tcp remote: 0xb8dd65ac952f208c expref: 5 pid: 14198 timeout: 0 lvb_type: 0 [ 501.699798] LustreError: 14240:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1705683427, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-lustre-OST0001_UUID lock: ffff88009baed680/0xc10c714cea16e046 lrc: 3/0,1 mode: --/PW res: [0x2c0000400:0x2ab:0x0].0x0 rrc: 6 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x40010400000020 nid: local remote: 0x0 expref: -99 pid: 14240 timeout: 0 lvb_type: 0 [ 501.709859] LustreError: dumping log to /tmp/lustre-log.1705683727.14240 ****************************************************************************** ************************ A Summary Of Problems Found ************************* ****************************************************************************** -------------------- A list of all +++WARNING+++ messages -------------------- PARTIAL DUMP with size(vmcore) < 25% size(RAM) There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details ------------------------------------------------------------------------------ ** Execution took 11.14s (real) 5.99s (CPU), Child processes: 5.15s