************************ crashinfo ************************* /exports/testreports/38859/testresults/racer-special9-ldiskfs-centos7_x86_64-centos7_x86_64/oleg156-client-timeout-core (3.10.0-7.9-debug) +==========================+ | *** Crashinfo v1.3.7 *** | +==========================+ +++WARNING+++ PARTIAL DUMP with size(vmcore) < 25% size(RAM) KERNEL: /tmp/crash-anaysis.KuSlj/vmlinux [TAINTED] DUMPFILE: /exports/testreports/38859/testresults/racer-special9-ldiskfs-centos7_x86_64-centos7_x86_64/oleg156-client-timeout-core [PARTIAL DUMP] CPUS: 4 DATE: Fri Jan 19 12:55:13 EST 2024 UPTIME: 01:01:31 LOAD AVERAGE: 133.00, 132.99, 130.24 TASKS: 301 NODENAME: oleg156-client.virtnet RELEASE: 3.10.0-7.9-debug VERSION: #1 SMP Sat Mar 26 23:28:42 EDT 2022 MACHINE: x86_64 (2399 Mhz) MEMORY: 4 GB PANIC: "" +--------------------------+ >------------------------| Per-cpu Stacks ('bt -a') |------------------------< +--------------------------+ -- CPU#0 -- PID=0 CPU=0 CMD=swapper/0 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 rest_init+0x8e #7 start_kernel+0x456 #8 x86_64_start_reservations+0x2a #9 x86_64_start_kernel+0x152 #10 start_cpu+0x5 -- CPU#1 -- PID=0 CPU=1 CMD=swapper/1 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#2 -- PID=0 CPU=2 CMD=swapper/2 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 -- CPU#3 -- PID=0 CPU=3 CMD=swapper/3 #-1 native_safe_halt+0xb, 449 bytes of data #0 default_idle+0x1e #1 default_enter_idle+0x45 #2 cpuidle_enter_state+0x40 #3 cpuidle_idle_call+0xd8 #4 arch_cpu_idle+0xe #5 cpu_startup_entry+0x14a #6 start_secondary+0x1eb #7 start_cpu+0x5 +--------------------------------+ >---------------------| How This Dump Has Been Created |---------------------< +--------------------------------+ Cannot identify the specific condition that triggered vmcore +++WARNING+++ High Load Averages: 133.00, 132.99, 130.24 +---------------+ >------------------------------| Tasks Summary |------------------------------< +---------------+ Number of Threads That Ran Recently ----------------------------------- last second 9 last 5s 29 last 60s 38 ----- Total Numbers of Threads per State ------ TASK_INTERRUPTIBLE 164 TASK_RUNNING 1 TASK_UNINTERRUPTIBLE 133 +++WARNING+++ There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details +-----------------------+ >--------------------------| 5 Most Recent Threads |--------------------------< +-----------------------+ PID CMD Age ARGS ----- -------------- ------ ---------------------------- 34 rcuos/3 0 ms (no user stack) 1742 lnet_discovery 0 ms (no user stack) 1 systemd 0 ms /usr/lib/systemd/systemd --switched-root --system --deserialize 22 9 rcu_sched 19 ms (no user stack) 9117 ldlm_bl_05 102 ms (no user stack) +------------------------+ >-------------------------| Memory Usage (kmem -i) |-------------------------< +------------------------+ PAGES TOTAL PERCENTAGE TOTAL MEM 955079 3.6 GB ---- FREE 830848 3.2 GB 86% of TOTAL MEM USED 124231 485.3 MB 13% of TOTAL MEM SHARED 31221 122 MB 3% of TOTAL MEM BUFFERS 5147 20.1 MB 0% of TOTAL MEM CACHED 61055 238.5 MB 6% of TOTAL MEM SLAB 16727 65.3 MB 1% of TOTAL MEM TOTAL HUGE 0 0 ---- HUGE FREE 0 0 0% of TOTAL HUGE TOTAL SWAP 262143 1024 MB ---- SWAP USED 0 0 0% of TOTAL SWAP SWAP FREE 262143 1024 MB 100% of TOTAL SWAP COMMIT LIMIT 739682 2.8 GB ---- COMMITTED 78174 305.4 MB 10% of TOTAL LIMIT +++three oldest UNINTERRUPTIBLE threads ... ran 3516s ago PID=3292 CPU=2 CMD=ls #0 __schedule+0x2e2 #1 schedule_preempt_disabled+0x39 #2 __mutex_lock_slowpath+0x13a #3 mutex_lock+0x2d #4 lookup_slow+0x33 #5 link_path_walk+0x81f #6 path_openat+0xae #7 do_filp_open+0x4d #8 do_sys_open+0x124 #9 sys_openat+0x14 #10 system_call_fastpath+0x1f, 477 bytes of data ... ran 3516s ago PID=3354 CPU=2 CMD=ls #0 __schedule+0x2e2 #1 schedule_preempt_disabled+0x39 #2 __mutex_lock_slowpath+0x13a #3 mutex_lock+0x2d #4 do_last+0x296 #5 path_openat+0xcd #6 do_filp_open+0x4d #7 do_sys_open+0x124 #8 sys_openat+0x14 #9 system_call_fastpath+0x1f, 477 bytes of data ... ran 3516s ago PID=3361 CPU=2 CMD=ls #0 __schedule+0x2e2 #1 schedule_preempt_disabled+0x39 #2 __mutex_lock_slowpath+0x13a #3 mutex_lock+0x2d #4 do_last+0x296 #5 path_openat+0xcd #6 do_filp_open+0x4d #7 do_sys_open+0x124 #8 sys_openat+0x14 #9 system_call_fastpath+0x1f, 477 bytes of data +-------------------------------+ >----------------------| Scheduler Runqueues (per CPU) |----------------------< +-------------------------------+ ---+ CPU=0 ---- | CURRENT TASK , CMD=swapper/0 ---+ CPU=1 ---- | CURRENT TASK , CMD=swapper/1 ---+ CPU=2 ---- | CURRENT TASK , CMD=swapper/2 ---+ CPU=3 ---- | CURRENT TASK , CMD=swapper/3 +------------------------+ >-------------------------| Network Status Summary |-------------------------< +------------------------+ TCP Connection Info ------------------- ESTABLISHED 6 LISTEN 3 NAGLE disabled (TCP_NODELAY): 5 user_data set (NFS etc.): 4 UDP Connection Info ------------------- 2 UDP sockets, 0 in ESTABLISHED Unix Connection Info ------------------------ ESTABLISHED 26 CLOSE 18 LISTEN 8 Raw sockets info -------------------- ESTABLISHED 1 Interfaces Info --------------- How long ago (in seconds) interfaces transmitted/received? Name RX TX ---- ---------- --------- lo n/a 3689.1 eth0 n/a 2.5 RSS_TOTAL=189440 pages, %mem= 1.4 +++WARNING+++ Possible hang +++WARNING+++ Run 'hanginfo' to get more details +------------+ >-------------------------------| Mounted FS |-------------------------------< +------------+ MOUNT SUPERBLK TYPE DEVNAME DIRNAME ffff880138cca000 ffff880139940800 rootfs rootfs / ffff88012aac2000 ffff88012aac8000 sysfs sysfs /sys ffff88012aac21c0 ffff880139944000 proc proc /proc ffff88012aac2380 ffff880137668000 devtmpfs devtmpfs /dev ffff88012aac2540 ffff8800b6ce1800 securityfs securityfs /sys/kernel/security ffff88012aac2700 ffff88012aac8800 tmpfs tmpfs /dev/shm ffff88012aac28c0 ffff88013717f000 devpts devpts /dev/pts ffff88012aac2a80 ffff88012aac9000 tmpfs tmpfs /run ffff88012aac2c40 ffff88012aac9800 tmpfs tmpfs /sys/fs/cgroup ffff88012aac2e00 ffff88012aaca000 cgroup cgroup /sys/fs/cgroup/systemd ffff88012aac2fc0 ffff88012aaca800 pstore pstore /sys/fs/pstore ffff88012aac3180 ffff88012aacc800 cgroup cgroup /sys/fs/cgroup/freezer ffff88012aac3340 ffff88012aacc000 cgroup cgroup /sys/fs/cgroup/devices ffff88012aac3500 ffff88012aacb800 cgroup cgroup /sys/fs/cgroup/net_cls,net_prio ffff88012aac36c0 ffff88012aacb000 cgroup cgroup /sys/fs/cgroup/cpu,cpuacct ffff88012aac3880 ffff88012aacd000 cgroup cgroup /sys/fs/cgroup/cpuset ffff88012aac3a40 ffff88012aacd800 cgroup cgroup /sys/fs/cgroup/hugetlb ffff88012aac3c00 ffff88012aace000 cgroup cgroup /sys/fs/cgroup/perf_event ffff88012aac3dc0 ffff88012aace800 cgroup cgroup /sys/fs/cgroup/memory ffff88012aad6000 ffff88012aacf000 cgroup cgroup /sys/fs/cgroup/pids ffff88012aad61c0 ffff88012aacf800 cgroup cgroup /sys/fs/cgroup/blkio ffff880138ccb880 ffff880000094800 configfs configfs /sys/kernel/config ffff880137652700 ffff8800b6ce7000 ext4 /dev/nbd0 / ffff8800b603c000 ffff8800b6ce6000 rpc_pipefs rpc_pipefs /var/lib/nfs/rpc_pipefs ffff8800b603c1c0 ffff8800b6279800 autofs systemd-1 /proc/sys/fs/binfmt_misc ffff8801376528c0 ffff88012b268000 mqueue mqueue /dev/mqueue ffff880137652540 ffff880139947800 debugfs debugfs /sys/kernel/debug ffff880138ccba40 ffff8800b5899800 hugetlbfs hugetlbfs /dev/hugepages ffff88012aad6380 ffff88012e158800 binfmt_misc binfmt_misc /proc/sys/fs/binfmt_misc/ ffff88012aad6540 ffff8800b60df000 ramfs none /mnt ffff88012aad6700 ffff8800b6275000 tmpfs none /var/lib/stateless/writable ffff88012aad68c0 ffff8800b6272000 squashfs /dev/vda /home/green/git/lustre-release ffff880137652c40 ffff8800b6275000 tmpfs none /var/cache/man ffff880138ccbdc0 ffff8800b6275000 tmpfs none /var/log ffff880137652e00 ffff8800b6275000 tmpfs none /var/lib/dbus ffff880138ccb6c0 ffff8800b6275000 tmpfs none /tmp ffff880137652fc0 ffff8800b6275000 tmpfs none /var/lib/dhclient ffff8800b603c380 ffff8800b6275000 tmpfs none /var/tmp ffff8800b603c540 ffff8800b6275000 tmpfs none /var/lib/NetworkManager ffff8800b5968000 ffff8800b6275000 tmpfs none /var/lib/systemd/random-seed ffff880137653180 ffff8800b6275000 tmpfs none /var/spool ffff880137653340 ffff8800b6275000 tmpfs none /var/lib/nfs ffff880137653500 ffff8800b6275000 tmpfs none /var/lib/gssproxy ffff8800b59681c0 ffff8800b6275000 tmpfs none /var/lib/logrotate ffff8801376536c0 ffff8800b6275000 tmpfs none /etc ffff8800b5968380 ffff8800b6275000 tmpfs none /var/lib/rsyslog ffff880137653880 ffff8800b6275000 tmpfs none /var/lib/dhclient/var/lib/dhclient ffff88012aad6e00 ffff8800b6ce6000 rpc_pipefs sunrpc /var/lib/nfs/var/lib/nfs/rpc_pipefs ffff88012e153880 ffff8800b63e1000 nfs4 192.168.200.253://exports/testreports/38859/testresults/racer-special9-ldiskfs-centos7_x86_64-centos7_x86_64 /tmp/tmp/testlogs ffff8800b5969340 ffff8800b6331000 tmpfs tmpfs /run/user/0 ffff88012aad6fc0 ffff8800b6272000 squashfs /dev/vda /usr/sbin/mount.lustre ffff8800b5969a40 ffff8800b6ec0800 lustre 192.168.201.156@tcp:/lustre /mnt/lustre ffff88012aad7880 ffff8800b636b800 lustre 192.168.201.156@tcp:/lustre /mnt/lustre2 +-------------------------------+ >----------------------| Last 40 lines of dmesg buffer |----------------------< +-------------------------------+ [ 361.229992] [] SyS_openat+0x14/0x20 [ 361.231750] [] system_call_fastpath+0x1f/0x24 [ 361.233454] [] ? system_call_after_swapgs+0xa2/0x13a [ 779.602699] Lustre: 1746:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705683396/real 1705683396] req@ffff88012c096e00 x1788538529194240/t0(0) o22->lustre-MDT0000-mdc-ffff8800b636b800@192.168.201.156@tcp:12/10 lens 440/432 e 24 to 1 dl 1705684001 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'fallocate.0' uid:0 gid:0 [ 779.608749] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 779.615628] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection restored to (at 192.168.201.156@tcp) [ 780.257736] Lustre: 3871:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705683396/real 1705683396] req@ffff8800a1463200 x1788538529194176/t0(0) o36->lustre-MDT0000-mdc-ffff8800b6ec0800@192.168.201.156@tcp:12/10 lens 544/440 e 24 to 1 dl 1705684001 ref 2 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'setfattr.0' uid:0 gid:0 [ 780.263723] Lustre: lustre-MDT0000-mdc-ffff8800b6ec0800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 780.274195] Lustre: 3871:0:(client.c:2337:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [ 780.275099] Lustre: lustre-MDT0000-mdc-ffff8800b6ec0800: Connection restored to (at 192.168.201.156@tcp) [ 1247.614815] Lustre: 3310:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705684001/real 1705684001] req@ffff88012b030f00 x1788538529219264/t0(0) o35->lustre-MDT0000-mdc-ffff8800b636b800@192.168.201.156@tcp:23/10 lens 392/1048 e 2 to 1 dl 1705684469 ref 2 fl Rpc:XQr/202/ffffffff rc -11/-1 job:'ls.0' uid:0 gid:0 [ 1247.621200] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 1247.627829] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection restored to (at 192.168.201.156@tcp) [ 1371.809712] Lustre: 1744:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705684001/real 1705684001] req@ffff8800a820da00 x1788538529202752/t0(0) o4->lustre-MDT0000-mdc-ffff8800b6ec0800@192.168.201.156@tcp:13/10 lens 496/448 e 23 to 1 dl 1705684593 ref 2 fl Rpc:XQr/202/ffffffff rc -11/-1 job:'dir_create.sh.0' uid:0 gid:0 [ 1371.817058] Lustre: lustre-MDT0000-mdc-ffff8800b6ec0800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 1371.825151] Lustre: lustre-MDT0000-mdc-ffff8800b6ec0800: Connection restored to (at 192.168.201.156@tcp) [ 1715.629762] Lustre: 3310:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705684469/real 1705684469] req@ffff88012b030f00 x1788538529219264/t0(0) o35->lustre-MDT0000-mdc-ffff8800b636b800@192.168.201.156@tcp:23/10 lens 392/1048 e 2 to 1 dl 1705684937 ref 2 fl Rpc:XQr/202/ffffffff rc -11/-1 job:'ls.0' uid:0 gid:0 [ 1715.643045] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 1715.659761] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection restored to (at 192.168.201.156@tcp) [ 1963.857897] Lustre: 1744:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705684593/real 1705684593] req@ffff8800a820da00 x1788538529202752/t0(0) o4->lustre-MDT0000-mdc-ffff8800b6ec0800@192.168.201.156@tcp:13/10 lens 496/448 e 23 to 1 dl 1705685185 ref 2 fl Rpc:XQr/202/ffffffff rc -11/-1 job:'dir_create.sh.0' uid:0 gid:0 [ 1963.871556] Lustre: lustre-MDT0000-mdc-ffff8800b6ec0800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 1963.888578] Lustre: lustre-MDT0000-mdc-ffff8800b6ec0800: Connection restored to (at 192.168.201.156@tcp) [ 2183.664753] Lustre: 3310:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705684937/real 1705684937] req@ffff88012b030f00 x1788538529219264/t0(0) o35->lustre-MDT0000-mdc-ffff8800b636b800@192.168.201.156@tcp:23/10 lens 392/1048 e 2 to 1 dl 1705685405 ref 2 fl Rpc:XQr/202/ffffffff rc -11/-1 job:'ls.0' uid:0 gid:0 [ 2183.671548] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 2183.680860] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection restored to (at 192.168.201.156@tcp) [ 2555.809681] Lustre: 1744:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705685185/real 1705685185] req@ffff8800a820da00 x1788538529202752/t0(0) o4->lustre-MDT0000-mdc-ffff8800b6ec0800@192.168.201.156@tcp:13/10 lens 496/448 e 23 to 1 dl 1705685777 ref 2 fl Rpc:XQr/202/ffffffff rc -11/-1 job:'dir_create.sh.0' uid:0 gid:0 [ 2555.817303] Lustre: lustre-MDT0000-mdc-ffff8800b6ec0800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 2555.824921] Lustre: lustre-MDT0000-mdc-ffff8800b6ec0800: Connection restored to (at 192.168.201.156@tcp) [ 2651.683750] Lustre: 3310:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705685405/real 1705685405] req@ffff88012b030f00 x1788538529219264/t0(0) o35->lustre-MDT0000-mdc-ffff8800b636b800@192.168.201.156@tcp:23/10 lens 392/1048 e 2 to 1 dl 1705685873 ref 2 fl Rpc:XQr/202/ffffffff rc -11/-1 job:'ls.0' uid:0 gid:0 [ 2651.691159] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 2651.700045] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection restored to (at 192.168.201.156@tcp) [ 3119.701788] Lustre: 3310:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705685873/real 1705685873] req@ffff88012b030f00 x1788538529219264/t0(0) o35->lustre-MDT0000-mdc-ffff8800b636b800@192.168.201.156@tcp:23/10 lens 392/1048 e 2 to 1 dl 1705686341 ref 2 fl Rpc:XQr/202/ffffffff rc -11/-1 job:'ls.0' uid:0 gid:0 [ 3119.714273] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 3119.729855] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection restored to (at 192.168.201.156@tcp) [ 3587.731728] Lustre: 3310:0:(client.c:2337:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1705686341/real 1705686341] req@ffff88012b030f00 x1788538529219264/t0(0) o35->lustre-MDT0000-mdc-ffff8800b636b800@192.168.201.156@tcp:23/10 lens 392/1048 e 2 to 1 dl 1705686809 ref 2 fl Rpc:XQr/202/ffffffff rc -11/-1 job:'ls.0' uid:0 gid:0 [ 3587.743454] Lustre: 3310:0:(client.c:2337:ptlrpc_expire_one_request()) Skipped 1 previous similar message [ 3587.747116] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection to lustre-MDT0000 (at 192.168.201.156@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 3587.754435] Lustre: Skipped 1 previous similar message [ 3587.763543] Lustre: lustre-MDT0000-mdc-ffff8800b636b800: Connection restored to (at 192.168.201.156@tcp) [ 3587.766779] Lustre: Skipped 1 previous similar message ****************************************************************************** ************************ A Summary Of Problems Found ************************* ****************************************************************************** -------------------- A list of all +++WARNING+++ messages -------------------- PARTIAL DUMP with size(vmcore) < 25% size(RAM) High Load Averages: 133.00, 132.99, 130.24 There are 3 threads running in their own namespaces Use 'taskinfo --ns' to get more details Possible hang Run 'hanginfo' to get more details ------------------------------------------------------------------------------ ** Execution took 10.81s (real) 5.73s (CPU), Child processes: 5.08s