************************ crashinfo ************************* /exports/testreports/38859/testresults/racer-special10-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/oleg117-server-timeout-core (3.10.0-7.9-debug) +==========================+ | *** Crashinfo v1.3.7 *** | +==========================+ +++WARNING+++ PARTIAL DUMP with size(vmcore) < 25% size(RAM) KERNEL: /tmp/crash-anaysis.9OD8I/vmlinux [TAINTED] DUMPFILE: /exports/testreports/38859/testresults/racer-special10-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/oleg117-server-timeout-core [PARTIAL DUMP] CPUS: 4 DATE: Fri Jan 19 12:55:32 EST 2024 UPTIME: 01:01:49 LOAD AVERAGE: 0.00, 0.01, 0.05 TASKS: 319 NODENAME: oleg117-server.virtnet RELEASE: 3.10.0-7.9-debug VERSION: #1 SMP Sat Mar 26 23:28:42 EDT 2022 MACHINE: x86_64 (2399 Mhz) MEMORY: 4 GB PANIC: "" +--------------------------+ >------------------------| Per-cpu Stacks ('bt -a') |------------------------< +--------------------------+ -- CPU#0 -- PID=0 CPU=0 CMD=swapper/0 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 rest_init+0x8e #7 start_kernel+0x456 #8 x86_64_start_reservations+0x2a #9 x86_64_start_kernel+0x152 #10 start_cpu+0x5 -- CPU#1 -- PID=0 CPU=1 CMD=swapper/1 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#2 -- PID=0 CPU=2 CMD=swapper/2 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#3 -- PID=0 CPU=3 CMD=swapper/3 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 +--------------------------------+ >---------------------| How This Dump Has Been Created |---------------------< +--------------------------------+ Cannot identify the specific condition that triggered vmcore +---------------+ >------------------------------| Tasks Summary |------------------------------< +---------------+ Number of Threads That Ran Recently ----------------------------------- last second 47 last 5s 64 last 60s 80 ----- Total Numbers of Threads per State ------ TASK_INTERRUPTIBLE 315 TASK_RUNNING 1 +++WARNING+++ There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details +-----------------------+ >--------------------------| 5 Most Recent Threads |--------------------------< +-----------------------+ PID CMD Age ARGS ----- -------------- ------ ---------------------------- 3084 lnet_discovery 0 ms (no user stack) 3842 dbuf_evict 0 ms (no user stack) 14287 mdt_out00_005 0 ms (no user stack) 3086 ptlrpcd_00_00 0 ms (no user stack) 8814 ll_ost_create00 0 ms (no user stack) +------------------------+ >-------------------------| Memory Usage (kmem -i) |-------------------------< +------------------------+ PAGES TOTAL PERCENTAGE TOTAL MEM 955067 3.6 GB ---- FREE 697931 2.7 GB 73% of TOTAL MEM USED 257136 1004.4 MB 26% of TOTAL MEM SHARED 32319 126.2 MB 3% of TOTAL MEM BUFFERS 30160 117.8 MB 3% of TOTAL MEM CACHED 80216 313.3 MB 8% of TOTAL MEM SLAB 23818 93 MB 2% of TOTAL MEM TOTAL HUGE 0 0 ---- HUGE FREE 0 0 0% of TOTAL HUGE TOTAL SWAP 262143 1024 MB ---- SWAP USED 0 0 0% of TOTAL SWAP SWAP FREE 262143 1024 MB 100% of TOTAL SWAP COMMIT LIMIT 739676 2.8 GB ---- COMMITTED 58959 230.3 MB 7% of TOTAL LIMIT +-------------------------------+ >----------------------| Scheduler Runqueues (per CPU) |----------------------< +-------------------------------+ ---+ CPU=0 ---- | CURRENT TASK , CMD=swapper/0 ---+ CPU=1 ---- | CURRENT TASK , CMD=swapper/1 ---+ CPU=2 ---- | CURRENT TASK , CMD=swapper/2 ---+ CPU=3 ---- | CURRENT TASK , CMD=swapper/3 +------------------------+ >-------------------------| Network Status Summary |-------------------------< +------------------------+ TCP Connection Info ------------------- ESTABLISHED 5 LISTEN 3 NAGLE disabled (TCP_NODELAY): 4 user_data set (NFS etc.): 4 UDP Connection Info ------------------- 2 UDP sockets, 0 in ESTABLISHED Unix Connection Info ------------------------ ESTABLISHED 26 CLOSE 17 LISTEN 8 Raw sockets info -------------------- ESTABLISHED 1 Interfaces Info --------------- How long ago (in seconds) interfaces transmitted/received? Name RX TX ---- ---------- --------- lo n/a 3706.7 eth0 n/a 1.4 RSS_TOTAL=50460 pages, %mem= 0.8 +------------+ >-------------------------------| Mounted FS |-------------------------------< +------------+ MOUNT SUPERBLK TYPE DEVNAME DIRNAME ffff880138cca000 ffff880139940800 rootfs rootfs / ffff880137652380 ffff880137668800 sysfs sysfs /sys ffff880137652540 ffff880139944000 proc proc /proc ffff880137652700 ffff880137668000 devtmpfs devtmpfs /dev ffff8801376528c0 ffff8800b5242800 securityfs securityfs /sys/kernel/security ffff880137652a80 ffff880137669000 tmpfs tmpfs /dev/shm ffff880137652c40 ffff880137315000 devpts devpts /dev/pts ffff880137652e00 ffff880137669800 tmpfs tmpfs /run ffff880137652fc0 ffff88013766a000 tmpfs tmpfs /sys/fs/cgroup ffff880137653180 ffff88013766a800 cgroup cgroup /sys/fs/cgroup/systemd ffff880137653340 ffff88013766b000 pstore pstore /sys/fs/pstore ffff880137653500 ffff88013766d000 cgroup cgroup /sys/fs/cgroup/pids ffff8801376536c0 ffff88013766c800 cgroup cgroup /sys/fs/cgroup/cpuset ffff880137653880 ffff88013766c000 cgroup cgroup /sys/fs/cgroup/memory ffff880137653a40 ffff88013766b800 cgroup cgroup /sys/fs/cgroup/net_cls,net_prio ffff880137653c00 ffff88013766d800 cgroup cgroup /sys/fs/cgroup/devices ffff880137653dc0 ffff88013766e000 cgroup cgroup /sys/fs/cgroup/perf_event ffff88012a2a2000 ffff88013766e800 cgroup cgroup /sys/fs/cgroup/freezer ffff88012a2a21c0 ffff88013766f000 cgroup cgroup /sys/fs/cgroup/cpu,cpuacct ffff88012a2a2380 ffff88013766f800 cgroup cgroup /sys/fs/cgroup/blkio ffff88012a2a2540 ffff88012a330000 cgroup cgroup /sys/fs/cgroup/hugetlb ffff880138ccb6c0 ffff8800b5747800 configfs configfs /sys/kernel/config ffff88012a308380 ffff8800b5243000 ext4 /dev/nbd0 / ffff88012a308540 ffff8800b5246000 rpc_pipefs rpc_pipefs /var/lib/nfs/rpc_pipefs ffff8801377e81c0 ffff8800b4f6d800 autofs systemd-1 /proc/sys/fs/binfmt_misc ffff880138ccb500 ffff880137316000 mqueue mqueue /dev/mqueue ffff88012a2a2c40 ffff8800b410e000 hugetlbfs hugetlbfs /dev/hugepages ffff880138ccb880 ffff880139947800 debugfs debugfs /sys/kernel/debug ffff8801377e8380 ffff88012a330800 binfmt_misc binfmt_misc /proc/sys/fs/binfmt_misc/ ffff88012a308700 ffff8800b421a000 ramfs none /mnt ffff88012a2a2e00 ffff8800b414e800 squashfs /dev/vda /home/green/git/lustre-release ffff8801377e8540 ffff8800b40e5800 tmpfs none /var/lib/stateless/writable ffff88012a3088c0 ffff8800b40e5800 tmpfs none /var/cache/man ffff88012a308a80 ffff8800b40e5800 tmpfs none /var/log ffff88012a2a2fc0 ffff8800b40e5800 tmpfs none /var/lib/dbus ffff8801377e8700 ffff8800b40e5800 tmpfs none /tmp ffff88012a308c40 ffff8800b40e5800 tmpfs none /var/lib/dhclient ffff8801377e88c0 ffff8800b40e5800 tmpfs none /var/tmp ffff88012a308e00 ffff8800b40e5800 tmpfs none /var/lib/NetworkManager ffff88012a308fc0 ffff8800b40e5800 tmpfs none /var/lib/systemd/random-seed ffff88012a2a3180 ffff8800b40e5800 tmpfs none /var/spool ffff88012a2a3340 ffff8800b40e5800 tmpfs none /var/lib/nfs ffff8801377e8a80 ffff8800b40e5800 tmpfs none /var/lib/gssproxy ffff88012a2a3500 ffff8800b40e5800 tmpfs none /var/lib/logrotate ffff88012a2a36c0 ffff8800b40e5800 tmpfs none /etc ffff88012a309180 ffff8800b40e5800 tmpfs none /var/lib/rsyslog ffff88012a309340 ffff8800b40e5800 tmpfs none /var/lib/dhclient/var/lib/dhclient ffff88012a309500 ffff8800b4219800 nfs4 192.168.200.253:/exports/state/oleg117-server.virtnet /var/lib/stateless/state ffff88012a309c00 ffff8800b4219800 nfs4 192.168.200.253:/exports/state/oleg117-server.virtnet /boot ffff8801377e8c40 ffff8800b4219800 nfs4 192.168.200.253:/exports/state/oleg117-server.virtnet /etc/etc/kdump.conf ffff88012a2a3880 ffff8800b5246000 rpc_pipefs sunrpc /var/lib/nfs/var/lib/nfs/rpc_pipefs ffff8800b1e90e00 ffff8800b414e800 squashfs /dev/vda /usr/sbin/mount.lustre ffff880129c06700 ffff880129c9f800 lustre /dev/mapper/mds1_flakey /mnt/lustre-mds1 ffff880129c07a40 ffff8800835bf800 lustre /dev/mapper/mds2_flakey /mnt/lustre-mds2 ffff880129c06fc0 ffff8800a5b34000 lustre /dev/mapper/ost1_flakey /mnt/lustre-ost1 ffff880129c06a80 ffff8800b410e800 lustre /dev/mapper/ost2_flakey /mnt/lustre-ost2 +-------------------------------+ >----------------------| Last 40 lines of dmesg buffer |----------------------< +-------------------------------+ [ 239.788317] Lustre: 14311:0:(osd_handler.c:2003:osd_trans_dump_creds()) ref_add: 8/8/0, ref_del: 8/8/1 [ 239.793405] Lustre: 14311:0:(osd_handler.c:2003:osd_trans_dump_creds()) Skipped 35 previous similar messages [ 246.656958] LustreError: 7695:0:(llog_cat.c:737:llog_cat_cancel_arr_rec()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 llog-records: rc = -2 [ 246.666373] LustreError: 7695:0:(llog_cat.c:773:llog_cat_cancel_records()) lustre-MDT0000-osp-MDT0001: fail to cancel 1 of 1 llog-records: rc = -2 [ 252.067970] LustreError: 6528:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 192.168.201.17@tcp ns: filter-lustre-OST0000_UUID lock: ffff88009d8c7980/0xf5f2f6416dbffd4f lrc: 3/0,0 mode: PW/PW res: [0x280000400:0x169:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x60000480020020 nid: 192.168.201.17@tcp remote: 0x43a885fd81f11732 expref: 60 pid: 14229 timeout: 251 lvb_type: 0 [ 253.610672] Lustre: 14309:0:(osd_handler.c:2050:osd_trans_start()) lustre-MDT0000: credits 3218 > trans_max 3200 [ 253.614122] Pid: 14309, comm: mdt00_026 3.10.0-7.9-debug #1 SMP Sat Mar 26 23:28:42 EDT 2022 [ 253.617006] Call Trace: [ 253.617842] [<0>] libcfs_call_trace+0x90/0xf0 [libcfs] [ 253.619276] [<0>] libcfs_debug_dumpstack+0x26/0x30 [libcfs] [ 253.620757] [<0>] osd_trans_start+0x643/0x6b0 [osd_ldiskfs] [ 253.622694] [<0>] top_trans_start+0x351/0xab0 [ptlrpc] [ 253.624606] [<0>] lod_trans_start+0x7f/0x2e0 [lod] [ 253.625561] [<0>] mdd_trans_start+0x14/0x20 [mdd] [ 253.627682] [<0>] mdd_migrate_object+0x1439/0x1e00 [mdd] [ 253.629477] [<0>] mdd_migrate+0x28/0x30 [mdd] [ 253.631242] [<0>] mdt_reint_migrate+0x1b58/0x2330 [mdt] [ 253.633200] [<0>] mdt_reint_rec+0x87/0x240 [mdt] [ 253.634673] [<0>] mdt_reint_internal+0x74c/0xba0 [mdt] [ 253.636611] [<0>] mdt_reint+0x67/0x150 [mdt] [ 253.637917] [<0>] tgt_request_handle+0x74e/0x19d0 [ptlrpc] [ 253.640328] [<0>] ptlrpc_server_handle_request+0x26e/0xcf0 [ptlrpc] [ 253.642394] [<0>] ptlrpc_main+0xc37/0x16d0 [ptlrpc] [ 253.644455] [<0>] kthread+0xe4/0xf0 [ 253.645515] [<0>] ret_from_fork_nospec_begin+0x7/0x21 [ 253.647877] [<0>] 0xfffffffffffffffe [ 265.046820] Lustre: 14345:0:(lod_lov.c:1429:lod_parse_striping()) lustre-MDT0001-mdtlov: EXTENSION flags=40 set on component[2]=1 of non-SEL file [0x240000403:0x289a:0x0] with magic=0xbd60bd0 [ 265.052657] Lustre: 14345:0:(lod_lov.c:1429:lod_parse_striping()) Skipped 83 previous similar messages [ 294.179973] LustreError: 6528:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 192.168.201.17@tcp ns: mdt-lustre-MDT0000_UUID lock: ffff880131b81400/0xf5f2f6416ddd5b64 lrc: 3/0,0 mode: CR/CR res: [0x200000402:0x21ab:0x0].0x0 bits 0xa/0x0 rrc: 18 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.201.17@tcp remote: 0x43a885fd81f9ad5c expref: 1211 pid: 14237 timeout: 293 lvb_type: 0 [ 294.206783] LustreError: 6528:0:(mdd_object.c:3645:mdd_close()) lustre-MDD0000: failed to get lu_attr of [0x200000402:0x2b25:0x0]: rc = -2 [ 294.209390] LustreError: 14201:0:(ldlm_lockd.c:1495:ldlm_handle_enqueue()) ### lock on destroyed export ffff88012b0ce000 ns: mdt-lustre-MDT0000_UUID lock: ffff8801316be300/0xf5f2f6416dddd38a lrc: 3/0,0 mode: PR/PR res: [0x200000402:0x21ab:0x0].0x0 bits 0x5b/0x0 rrc: 16 type: IBT gid 0 flags: 0x50200400000020 nid: 192.168.201.17@tcp remote: 0x43a885fd81f9ceb1 expref: 514 pid: 14201 timeout: 0 lvb_type: 0 [ 329.251953] LustreError: 6528:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 192.168.201.17@tcp ns: filter-lustre-OST0000_UUID lock: ffff880093595400/0xf5f2f6416decfdce lrc: 3/0,0 mode: PW/PW res: [0x280000401:0x4d1:0x0].0x0 rrc: 4 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x60000400020020 nid: 192.168.201.17@tcp remote: 0x43a885fd81fe52d0 expref: 18 pid: 14216 timeout: 328 lvb_type: 0 [ 345.283924] LustreError: 6528:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 192.168.201.17@tcp ns: mdt-lustre-MDT0001_UUID lock: ffff880095135900/0xf5f2f6416df356ca lrc: 3/0,0 mode: CR/CR res: [0x240000402:0x2401:0x0].0x0 bits 0xa/0x0 rrc: 7 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.201.17@tcp remote: 0x43a885fd82003d1d expref: 1160 pid: 14352 timeout: 344 lvb_type: 0 [ 345.297002] LustreError: 14311:0:(ldlm_lockd.c:1495:ldlm_handle_enqueue()) ### lock on destroyed export ffff88009ebe7800 ns: mdt-lustre-MDT0001_UUID lock: ffff880131ee9900/0xf5f2f6416df716aa lrc: 3/0,0 mode: PR/PR res: [0x240000400:0xd1:0x0].0x0 bits 0x12/0x0 rrc: 5 type: IBT gid 0 flags: 0x50200000000000 nid: 192.168.201.17@tcp remote: 0x43a885fd82014628 expref: 1130 pid: 14311 timeout: 0 lvb_type: 0 [ 345.306083] LustreError: 14311:0:(ldlm_lockd.c:1495:ldlm_handle_enqueue()) Skipped 3 previous similar messages [ 357.316352] LustreError: 6528:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 192.168.201.17@tcp ns: filter-lustre-OST0001_UUID lock: ffff88009e2e7200/0xf5f2f6416df45ea8 lrc: 3/0,0 mode: PW/PW res: [0x2c0000401:0x531:0x0].0x0 rrc: 4 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x60000400000020 nid: 192.168.201.17@tcp remote: 0x43a885fd82008681 expref: 29 pid: 14231 timeout: 356 lvb_type: 0 [ 389.316038] Lustre: lustre-OST0001: haven't heard from client 90cfae4a-9d85-4078-a66d-4e0f0920944b (at 192.168.201.17@tcp) in 31 seconds. I think it's dead, and I am evicting it. exp ffff880097fee800, cur 1705683611 expire 1705683581 last 1705683580 [ 447.651999] LustreError: 6528:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 192.168.201.17@tcp ns: mdt-lustre-MDT0001_UUID lock: ffff880096624780/0xf5f2f6416df850f4 lrc: 3/0,0 mode: PW/PW res: [0x240000402:0x24b9:0x0].0x0 bits 0x4/0x0 rrc: 5 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.201.17@tcp remote: 0x43a885fd8201b2b6 expref: 79 pid: 14199 timeout: 446 lvb_type: 0 [ 447.660821] LustreError: 14201:0:(mdt_reint.c:2443:mdt_reint_migrate()) lustre-MDT0001: migrate [0x240000402:0x1:0x0]/13 failed: rc = -16 [ 447.664887] LustreError: 14201:0:(mdt_reint.c:2443:mdt_reint_migrate()) Skipped 126 previous similar messages ****************************************************************************** ************************ A Summary Of Problems Found ************************* ****************************************************************************** -------------------- A list of all +++WARNING+++ messages -------------------- PARTIAL DUMP with size(vmcore) < 25% size(RAM) There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details ------------------------------------------------------------------------------ ** Execution took 12.21s (real) 6.54s (CPU), Child processes: 5.66s