************************ crashinfo ************************* /exports/testreports/38859/testresults/racer-special10-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/oleg117-client-timeout-core (3.10.0-7.9-debug) +==========================+ | *** Crashinfo v1.3.7 *** | +==========================+ +++WARNING+++ PARTIAL DUMP with size(vmcore) < 25% size(RAM) KERNEL: /tmp/crash-anaysis.Bfx2o/vmlinux [TAINTED] DUMPFILE: /exports/testreports/38859/testresults/racer-special10-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/oleg117-client-timeout-core [PARTIAL DUMP] CPUS: 4 DATE: Fri Jan 19 12:55:32 EST 2024 UPTIME: 01:01:47 LOAD AVERAGE: 0.00, 0.01, 3.88 TASKS: 157 NODENAME: oleg117-client.virtnet RELEASE: 3.10.0-7.9-debug VERSION: #1 SMP Sat Mar 26 23:28:42 EDT 2022 MACHINE: x86_64 (2399 Mhz) MEMORY: 4 GB PANIC: "" +--------------------------+ >------------------------| Per-cpu Stacks ('bt -a') |------------------------< +--------------------------+ -- CPU#0 -- PID=0 CPU=0 CMD=swapper/0 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 rest_init+0x8e #7 start_kernel+0x456 #8 x86_64_start_reservations+0x2a #9 x86_64_start_kernel+0x152 #10 start_cpu+0x5 -- CPU#1 -- PID=0 CPU=1 CMD=swapper/1 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#2 -- PID=0 CPU=2 CMD=swapper/2 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#3 -- PID=0 CPU=3 CMD=swapper/3 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 +--------------------------------+ >---------------------| How This Dump Has Been Created |---------------------< +--------------------------------+ Cannot identify the specific condition that triggered vmcore +---------------+ >------------------------------| Tasks Summary |------------------------------< +---------------+ Number of Threads That Ran Recently ----------------------------------- last second 17 last 5s 29 last 60s 36 ----- Total Numbers of Threads per State ------ TASK_INTERRUPTIBLE 153 TASK_RUNNING 1 +++WARNING+++ There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details +-----------------------+ >--------------------------| 5 Most Recent Threads |--------------------------< +-----------------------+ PID CMD Age ARGS ----- -------------- ------ ---------------------------- 1850 monitor_thread 0 ms (no user stack) 34 rcuos/3 0 ms (no user stack) 1 systemd 0 ms /usr/lib/systemd/systemd --switched-root --system --deserialize 22 9 rcu_sched 21 ms (no user stack) 1851 lnet_discovery 101 ms (no user stack) +------------------------+ >-------------------------| Memory Usage (kmem -i) |-------------------------< +------------------------+ PAGES TOTAL PERCENTAGE TOTAL MEM 955079 3.6 GB ---- FREE 839751 3.2 GB 87% of TOTAL MEM USED 115328 450.5 MB 12% of TOTAL MEM SHARED 16294 63.6 MB 1% of TOTAL MEM BUFFERS 5146 20.1 MB 0% of TOTAL MEM CACHED 53888 210.5 MB 5% of TOTAL MEM SLAB 21501 84 MB 2% of TOTAL MEM TOTAL HUGE 0 0 ---- HUGE FREE 0 0 0% of TOTAL HUGE TOTAL SWAP 262143 1024 MB ---- SWAP USED 0 0 0% of TOTAL SWAP SWAP FREE 262143 1024 MB 100% of TOTAL SWAP COMMIT LIMIT 739682 2.8 GB ---- COMMITTED 64732 252.9 MB 8% of TOTAL LIMIT +-------------------------------+ >----------------------| Scheduler Runqueues (per CPU) |----------------------< +-------------------------------+ ---+ CPU=0 ---- | CURRENT TASK , CMD=swapper/0 ---+ CPU=1 ---- | CURRENT TASK , CMD=swapper/1 ---+ CPU=2 ---- | CURRENT TASK , CMD=swapper/2 ---+ CPU=3 ---- | CURRENT TASK , CMD=swapper/3 +------------------------+ >-------------------------| Network Status Summary |-------------------------< +------------------------+ TCP Connection Info ------------------- ESTABLISHED 6 LISTEN 3 NAGLE disabled (TCP_NODELAY): 5 user_data set (NFS etc.): 4 UDP Connection Info ------------------- 2 UDP sockets, 0 in ESTABLISHED Unix Connection Info ------------------------ ESTABLISHED 26 CLOSE 18 LISTEN 8 Raw sockets info -------------------- ESTABLISHED 1 Interfaces Info --------------- How long ago (in seconds) interfaces transmitted/received? Name RX TX ---- ---------- --------- lo n/a 3704.9 eth0 n/a 0.2 RSS_TOTAL=78304 pages, %mem= 1.3 +------------+ >-------------------------------| Mounted FS |-------------------------------< +------------+ MOUNT SUPERBLK TYPE DEVNAME DIRNAME ffff880138cca000 ffff880139940800 rootfs rootfs / ffff880137652380 ffff880137668800 sysfs sysfs /sys ffff880137652540 ffff880139944000 proc proc /proc ffff880137652700 ffff880137668000 devtmpfs devtmpfs /dev ffff8801376528c0 ffff8800b6cf1000 securityfs securityfs /sys/kernel/security ffff880137652a80 ffff880137669000 tmpfs tmpfs /dev/shm ffff880137652c40 ffff880137315000 devpts devpts /dev/pts ffff880137652e00 ffff880137669800 tmpfs tmpfs /run ffff880137652fc0 ffff88013766a000 tmpfs tmpfs /sys/fs/cgroup ffff880137653180 ffff88013766a800 cgroup cgroup /sys/fs/cgroup/systemd ffff880137653340 ffff88013766b000 pstore pstore /sys/fs/pstore ffff880137653500 ffff88013766d000 cgroup cgroup /sys/fs/cgroup/net_cls,net_prio ffff8801376536c0 ffff88013766c800 cgroup cgroup /sys/fs/cgroup/cpu,cpuacct ffff880137653880 ffff88013766c000 cgroup cgroup /sys/fs/cgroup/perf_event ffff880137653a40 ffff88013766b800 cgroup cgroup /sys/fs/cgroup/cpuset ffff880137653c00 ffff88013766d800 cgroup cgroup /sys/fs/cgroup/hugetlb ffff880137653dc0 ffff88013766e000 cgroup cgroup /sys/fs/cgroup/freezer ffff88012a9e4000 ffff88013766e800 cgroup cgroup /sys/fs/cgroup/pids ffff88012a9e41c0 ffff88013766f000 cgroup cgroup /sys/fs/cgroup/devices ffff88012a9e4380 ffff88013766f800 cgroup cgroup /sys/fs/cgroup/blkio ffff88012a9e4540 ffff88012aa88000 cgroup cgroup /sys/fs/cgroup/memory ffff8801377e01c0 ffff8801377f3000 configfs configfs /sys/kernel/config ffff88012aa761c0 ffff8800b6cf2000 ext4 /dev/nbd0 / ffff88012a9e48c0 ffff88012aa8c800 rpc_pipefs rpc_pipefs /var/lib/nfs/rpc_pipefs ffff88012a9e4a80 ffff8800b67ac800 autofs systemd-1 /proc/sys/fs/binfmt_misc ffff8801377e08c0 ffff880137316000 mqueue mqueue /dev/mqueue ffff88012aa76380 ffff880139947800 debugfs debugfs /sys/kernel/debug ffff88012a9e4c40 ffff8800b67a9800 hugetlbfs hugetlbfs /dev/hugepages ffff88012aa76540 ffff8800b6518000 binfmt_misc binfmt_misc /proc/sys/fs/binfmt_misc/ ffff8801377e0c40 ffff88012a5ad000 ramfs none /mnt ffff8801377e0e00 ffff8801377f4800 tmpfs none /var/lib/stateless/writable ffff880138ccb6c0 ffff88012a5ac000 squashfs /dev/vda /home/green/git/lustre-release ffff88012aa76700 ffff8801377f4800 tmpfs none /var/cache/man ffff8801377e0fc0 ffff8801377f4800 tmpfs none /var/log ffff88012a9e4e00 ffff8801377f4800 tmpfs none /var/lib/dbus ffff88012aa768c0 ffff8801377f4800 tmpfs none /tmp ffff88012a9e4fc0 ffff8801377f4800 tmpfs none /var/lib/dhclient ffff8801377e1180 ffff8801377f4800 tmpfs none /var/tmp ffff88012aa76a80 ffff8801377f4800 tmpfs none /var/lib/NetworkManager ffff88012aa76c40 ffff8801377f4800 tmpfs none /var/lib/systemd/random-seed ffff8801377e1340 ffff8801377f4800 tmpfs none /var/spool ffff88012aa76e00 ffff8801377f4800 tmpfs none /var/lib/nfs ffff88012aa76fc0 ffff8801377f4800 tmpfs none /var/lib/gssproxy ffff8801377e1500 ffff8801377f4800 tmpfs none /var/lib/logrotate ffff88012aa77180 ffff8801377f4800 tmpfs none /etc ffff8801377e16c0 ffff8801377f4800 tmpfs none /var/lib/rsyslog ffff88012a9e5180 ffff8801377f4800 tmpfs none /var/lib/dhclient/var/lib/dhclient ffff88012a9e5340 ffff88012a532800 nfs4 192.168.200.253:/exports/state/oleg117-client.virtnet /var/lib/stateless/state ffff8801377e1880 ffff88012a532800 nfs4 192.168.200.253:/exports/state/oleg117-client.virtnet /boot ffff88012aa776c0 ffff88012a532800 nfs4 192.168.200.253:/exports/state/oleg117-client.virtnet /etc/etc/kdump.conf ffff88012a9e56c0 ffff88012aa8c800 rpc_pipefs sunrpc /var/lib/nfs/var/lib/nfs/rpc_pipefs ffff88012a9e5880 ffff88012bf8c000 nfs4 192.168.200.253://exports/testreports/38859/testresults/racer-special10-ldiskfs-DNE-centos7_x86_64-centos7_x86_64 /tmp/tmp/testlogs ffff8800b08e48c0 ffff8800b42ed800 tmpfs tmpfs /run/user/0 ffff8800b08e4fc0 ffff88012a5ac000 squashfs /dev/vda /usr/sbin/mount.lustre ffff880138ccb880 ffff88012a5a9800 lustre 192.168.201.117@tcp:/lustre /mnt/lustre ffff880138ccba40 ffff8800b651c000 lustre 192.168.201.117@tcp:/lustre /mnt/lustre2 +-------------------------------+ >----------------------| Last 40 lines of dmesg buffer |----------------------< +-------------------------------+ [ 356.985831] LustreError: 167-0: lustre-OST0001-osc-ffff8800b651c000: This client was evicted by lustre-OST0001; in progress operations using this service will fail. [ 356.992901] Lustre: 1853:0:(llite_lib.c:4016:ll_dirty_page_discard_warn()) lustre: dirty page discard: 192.168.201.117@tcp:/lustre/fid: [0x240000404:0xa0:0x0]/ may get corrupted (rc -108) [ 356.992982] Lustre: 1854:0:(llite_lib.c:4016:ll_dirty_page_discard_warn()) lustre: dirty page discard: 192.168.201.117@tcp:/lustre/fid: [0x240000404:0x7a:0x0]// may get corrupted (rc -108) [ 356.993341] Lustre: 1854:0:(llite_lib.c:4016:ll_dirty_page_discard_warn()) lustre: dirty page discard: 192.168.201.117@tcp:/lustre/fid: [0x200000403:0x1db3:0x0]/ may get corrupted (rc -108) [ 356.993374] Lustre: 1854:0:(llite_lib.c:4016:ll_dirty_page_discard_warn()) lustre: dirty page discard: 192.168.201.117@tcp:/lustre/fid: [0x200000404:0x2e:0x0]// may get corrupted (rc -108) [ 356.993415] Lustre: 1854:0:(llite_lib.c:4016:ll_dirty_page_discard_warn()) lustre: dirty page discard: 192.168.201.117@tcp:/lustre/fid: [0x200000403:0xbfb1:0x0]// may get corrupted (rc -108) [ 356.993439] Lustre: 1854:0:(llite_lib.c:4016:ll_dirty_page_discard_warn()) lustre: dirty page discard: 192.168.201.117@tcp:/lustre/fid: [0x200000403:0x1bc2:0x0]/ may get corrupted (rc -108) [ 356.993471] Lustre: 1854:0:(llite_lib.c:4016:ll_dirty_page_discard_warn()) lustre: dirty page discard: 192.168.201.117@tcp:/lustre/fid: [0x200000403:0xc039:0x0]// may get corrupted (rc -108) [ 356.993597] Lustre: 1854:0:(llite_lib.c:4016:ll_dirty_page_discard_warn()) lustre: dirty page discard: 192.168.201.117@tcp:/lustre/fid: [0x240000404:0x82:0x0]// may get corrupted (rc -108) [ 357.046263] LustreError: 8951:0:(ldlm_resource.c:1125:ldlm_resource_complain()) lustre-OST0001-osc-ffff8800b651c000: namespace resource [0x2c0000401:0x531:0x0].0x0 (ffff8801354d8700) refcount nonzero (1) after lock cleanup; forcing cleanup. [ 360.694749] INFO: task mrename:23101 blocked for more than 120 seconds. [ 360.698596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 360.701893] mrename D ffff8800a2c7b600 11736 23101 9612 0x00000000 [ 360.704129] Call Trace: [ 360.704871] [] schedule_preempt_disabled+0x39/0x90 [ 360.707251] [] __mutex_lock_slowpath+0x13a/0x340 [ 360.709056] [] mutex_lock+0x2d/0x40 [ 360.711610] [] lock_rename+0x31/0xe0 [ 360.713176] [] SYSC_renameat2+0x22f/0x570 [ 360.715192] [] ? handle_mm_fault+0xc2/0x150 [ 360.716962] [] ? system_call_after_swapgs+0xa2/0x13a [ 360.720320] [] ? system_call_after_swapgs+0x96/0x13a [ 360.726210] [] ? system_call_after_swapgs+0xa2/0x13a [ 360.730572] [] ? system_call_after_swapgs+0x96/0x13a [ 360.734957] [] SyS_renameat2+0xe/0x10 [ 360.738038] [] SyS_rename+0x1e/0x20 [ 360.739698] [] system_call_fastpath+0x1f/0x24 [ 360.741203] [] ? system_call_after_swapgs+0xa2/0x13a [ 446.645128] LustreError: 11-0: lustre-MDT0001-mdc-ffff8800b651c000: operation mds_close to node 192.168.201.117@tcp failed: rc = -107 [ 446.651901] Lustre: lustre-MDT0001-mdc-ffff8800b651c000: Connection to lustre-MDT0001 (at 192.168.201.117@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 446.659550] LustreError: 167-0: lustre-MDT0001-mdc-ffff8800b651c000: This client was evicted by lustre-MDT0001; in progress operations using this service will fail. [ 446.668521] LustreError: 27375:0:(file.c:246:ll_close_inode_openhandle()) lustre-clilmv-ffff8800b651c000: inode [0x240000402:0x24b9:0x0] mdc close failed: rc = -5 [ 446.671466] LustreError: 28381:0:(llite_lib.c:1997:ll_md_setattr()) md_setattr fails: rc = -108 [ 446.671469] LustreError: 28381:0:(llite_lib.c:1997:ll_md_setattr()) Skipped 2 previous similar messages [ 446.677675] Lustre: lustre-OST0001-osc-ffff8800b651c000: Connection restored to (at 192.168.201.117@tcp) [ 446.677677] Lustre: Skipped 1 previous similar message [ 446.691729] LustreError: 27375:0:(file.c:246:ll_close_inode_openhandle()) Skipped 98 previous similar messages [ 446.698307] LustreError: 27318:0:(mdc_request.c:1465:mdc_read_page()) lustre-MDT0001-mdc-ffff8800b651c000: [0x240000402:0x1:0x0] lock enqueue fails: rc = -5 [ 446.699443] LustreError: 16905:0:(ldlm_resource.c:1125:ldlm_resource_complain()) lustre-MDT0001-mdc-ffff8800b651c000: namespace resource [0x240000402:0x1:0x0].0x0 (ffff8800929c2e00) refcount nonzero (1) after lock cleanup; forcing cleanup. [ 446.706965] LustreError: 27318:0:(mdc_request.c:1465:mdc_read_page()) Skipped 28 previous similar messages ****************************************************************************** ************************ A Summary Of Problems Found ************************* ****************************************************************************** -------------------- A list of all +++WARNING+++ messages -------------------- PARTIAL DUMP with size(vmcore) < 25% size(RAM) There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details ------------------------------------------------------------------------------ ** Execution took 11.86s (real) 5.92s (CPU), Child processes: 5.88s