[syzbot] [ocfs2?] possible deadlock in ocfs2_fiemap

syzbot

Sep 3, 2024, 11:43:25 PM

to [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Hello,

syzbot found the following issue on:

HEAD commit: 6cd90e5ea72f Merge branch 'fixes' of git://git.kernel.org/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12c6b543980000
kernel config: https://syzkaller.appspot.com/x/.config?x=714e8373ca1f0bb3
dashboard link: https://syzkaller.appspot.com/bug?extid=ca440b457d21568f8021
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-6cd90e5e.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/5118b413b1ad/vmlinux-6cd90e5e.xz
kernel image: https://storage.googleapis.com/syzbot-assets/6a2ee4a9243a/bzImage-6cd90e5e.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

loop0: detected capacity change from 0 to 32768
=======================================================
WARNING: The mand mount option has been deprecated and
and is ignored by this kernel. Remove the mand
option from the mount to silence this warning.
=======================================================
JBD2: Ignoring recovery information on journal
ocfs2: Mounting device (7,0) on (node local, slot 0) with ordered data mode.
======================================================
WARNING: possible circular locking dependency detected
6.11.0-rc5-syzkaller-00316-g6cd90e5ea72f #0 Not tainted
------------------------------------------------------
syz.0.0/5152 is trying to acquire lock:
ffff888035ca6a18 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0xaa/0x120 mm/memory.c:6387

but task is already holding lock:
ffff888012decda0 (&oi->ip_alloc_sem){++++}-{3:3}, at: ocfs2_fiemap+0x377/0xf80 fs/ocfs2/extent_map.c:755

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&oi->ip_alloc_sem){++++}-{3:3}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
down_write+0x99/0x220 kernel/locking/rwsem.c:1579
ocfs2_page_mkwrite+0x347/0xed0 fs/ocfs2/mmap.c:142
do_page_mkwrite+0x19b/0x480 mm/memory.c:3142
do_shared_fault mm/memory.c:5133 [inline]
do_fault mm/memory.c:5195 [inline]
do_pte_missing mm/memory.c:3947 [inline]
handle_pte_fault+0x126b/0x6fc0 mm/memory.c:5521
__handle_mm_fault mm/memory.c:5664 [inline]
handle_mm_fault+0x1109/0x1bc0 mm/memory.c:5832
do_user_addr_fault arch/x86/mm/fault.c:1389 [inline]
handle_page_fault arch/x86/mm/fault.c:1481 [inline]
exc_page_fault+0x2b9/0x8c0 arch/x86/mm/fault.c:1539
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623

-> #1 (sb_pagefaults){.+.+}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
__sb_start_write include/linux/fs.h:1676 [inline]
sb_start_pagefault include/linux/fs.h:1841 [inline]
ocfs2_page_mkwrite+0x223/0xed0 fs/ocfs2/mmap.c:122
do_page_mkwrite+0x19b/0x480 mm/memory.c:3142
do_shared_fault mm/memory.c:5133 [inline]
do_fault mm/memory.c:5195 [inline]
do_pte_missing mm/memory.c:3947 [inline]
handle_pte_fault+0x126b/0x6fc0 mm/memory.c:5521
__handle_mm_fault mm/memory.c:5664 [inline]
handle_mm_fault+0x1109/0x1bc0 mm/memory.c:5832
do_user_addr_fault arch/x86/mm/fault.c:1389 [inline]
handle_page_fault arch/x86/mm/fault.c:1481 [inline]
exc_page_fault+0x2b9/0x8c0 arch/x86/mm/fault.c:1539
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623

-> #0 (&mm->mmap_lock){++++}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3133 [inline]
check_prevs_add kernel/locking/lockdep.c:3252 [inline]
validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3868
__lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5142
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
__might_fault+0xc6/0x120 mm/memory.c:6387
_inline_copy_to_user include/linux/uaccess.h:176 [inline]
_copy_to_user+0x2a/0xb0 lib/usercopy.c:26
copy_to_user include/linux/uaccess.h:209 [inline]
fiemap_fill_next_extent+0x235/0x410 fs/ioctl.c:145
ocfs2_fiemap+0x9f1/0xf80 fs/ocfs2/extent_map.c:796
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1c07/0x2e50 fs/ioctl.c:841
__do_sys_ioctl fs/ioctl.c:905 [inline]
__se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
&mm->mmap_lock --> sb_pagefaults --> &oi->ip_alloc_sem

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rlock(&oi->ip_alloc_sem);
                               lock(sb_pagefaults);
                               lock(&oi->ip_alloc_sem);
  rlock(&mm->mmap_lock);

*** DEADLOCK ***

1 lock held by syz.0.0/5152:
#0: ffff888012decda0 (&oi->ip_alloc_sem){++++}-{3:3}, at: ocfs2_fiemap+0x377/0xf80 fs/ocfs2/extent_map.c:755

stack backtrace:
CPU: 0 UID: 0 PID: 5152 Comm: syz.0.0 Not tainted 6.11.0-rc5-syzkaller-00316-g6cd90e5ea72f #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2186
check_prev_add kernel/locking/lockdep.c:3133 [inline]
check_prevs_add kernel/locking/lockdep.c:3252 [inline]
validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3868
__lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5142
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
__might_fault+0xc6/0x120 mm/memory.c:6387
_inline_copy_to_user include/linux/uaccess.h:176 [inline]
_copy_to_user+0x2a/0xb0 lib/usercopy.c:26
copy_to_user include/linux/uaccess.h:209 [inline]
fiemap_fill_next_extent+0x235/0x410 fs/ioctl.c:145
ocfs2_fiemap+0x9f1/0xf80 fs/ocfs2/extent_map.c:796
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1c07/0x2e50 fs/ioctl.c:841
__do_sys_ioctl fs/ioctl.c:905 [inline]
__se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fa8b7f79eb9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fa8b8cc7038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fa8b8115f80 RCX: 00007fa8b7f79eb9
RDX: 00000000200001c0 RSI: 00000000c020660b RDI: 0000000000000005
RBP: 00007fa8b7fe793e R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fa8b8115f80 R15: 00007ffc6d1b5088
</TASK>
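
As a rough illustration of what the report above describes, the chain can be modelled as a directed graph of lock classes in which an edge A -> B means "B was acquired while A was held". The sketch below is a toy userspace model, not lockdep's actual implementation (which also tracks read/write state and IRQ context); the names lock_graph, reachable and record_acquire are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the report; the numbering is arbitrary. */
enum lock_class { MMAP_LOCK, SB_PAGEFAULTS, IP_ALLOC_SEM, NR_CLASSES };

/* dep[a][b] is true when b has been acquired while a was held. */
struct lock_graph {
	bool dep[NR_CLASSES][NR_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record the dependency "held -> wanted". Returns false, and records
 * nothing, if 'held' is already reachable from 'wanted', because the
 * new edge would then close a cycle.
 */
static bool record_acquire(struct lock_graph *g, int held, int wanted)
{
	bool seen[NR_CLASSES] = { false };

	assert(held >= 0 && held < NR_CLASSES);
	assert(wanted >= 0 && wanted < NR_CLASSES);
	if (reachable(g, wanted, held, seen))
		return false;
	g->dep[held][wanted] = true;
	return true;
}
```

With this model, recording MMAP_LOCK -> SB_PAGEFAULTS and SB_PAGEFAULTS -> IP_ALLOC_SEM (the ocfs2_page_mkwrite() path) succeeds, and a later attempt to record IP_ALLOC_SEM -> MMAP_LOCK (the copy_to_user() under ocfs2_fiemap()) is rejected because it would close the cycle.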

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Sep 29, 2024, 10:52:26 AM

to [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

syzbot has found a reproducer for the following issue on:

HEAD commit: 3efc57369a0c Merge tag 'for-linus' of git://git.kernel.org..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11480127980000
kernel config: https://syzkaller.appspot.com/x/.config?x=a4fcb065287cdb84
dashboard link: https://syzkaller.appspot.com/bug?extid=ca440b457d21568f8021
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

syz repro: https://syzkaller.appspot.com/x/repro.syz?x=109ccd9f980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1476de80580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-3efc5736.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d0988c372a39/vmlinux-3efc5736.xz
kernel image: https://storage.googleapis.com/syzbot-assets/8547f30d7e9d/bzImage-3efc5736.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/ac59be7d6f54/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

ocfs2: Mounting device (7,0) on (node local, slot 0) with ordered data mode.
======================================================
WARNING: possible circular locking dependency detected
6.11.0-syzkaller-11993-g3efc57369a0c #0 Not tainted
------------------------------------------------------
syz-executor356/5112 is trying to acquire lock:
ffff8880119d4418 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0xaa/0x120 mm/memory.c:6700

but task is already holding lock:
ffff888041bdbf60 (&oi->ip_alloc_sem){++++}-{3:3}, at: ocfs2_fiemap+0x377/0xf80 fs/ocfs2/extent_map.c:755

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&oi->ip_alloc_sem){++++}-{3:3}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5822
down_write+0x99/0x220 kernel/locking/rwsem.c:1579
ocfs2_page_mkwrite+0x346/0xed0 fs/ocfs2/mmap.c:142
do_page_mkwrite+0x198/0x480 mm/memory.c:3162
do_shared_fault mm/memory.c:5358 [inline]
do_fault mm/memory.c:5420 [inline]
do_pte_missing mm/memory.c:3965 [inline]
handle_pte_fault+0x11fa/0x6800 mm/memory.c:5751
__handle_mm_fault mm/memory.c:5894 [inline]
handle_mm_fault+0x1106/0x1bb0 mm/memory.c:6062
do_user_addr_fault arch/x86/mm/fault.c:1389 [inline]
handle_page_fault arch/x86/mm/fault.c:1481 [inline]
exc_page_fault+0x2b9/0x8c0 arch/x86/mm/fault.c:1539
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623

-> #1 (sb_pagefaults){.+.+}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5822
percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
__sb_start_write include/linux/fs.h:1716 [inline]
sb_start_pagefault include/linux/fs.h:1881 [inline]
ocfs2_page_mkwrite+0x222/0xed0 fs/ocfs2/mmap.c:122
do_page_mkwrite+0x198/0x480 mm/memory.c:3162
do_shared_fault mm/memory.c:5358 [inline]
do_fault mm/memory.c:5420 [inline]
do_pte_missing mm/memory.c:3965 [inline]
handle_pte_fault+0x11fa/0x6800 mm/memory.c:5751
__handle_mm_fault mm/memory.c:5894 [inline]
handle_mm_fault+0x1106/0x1bb0 mm/memory.c:6062
do_user_addr_fault arch/x86/mm/fault.c:1389 [inline]
handle_page_fault arch/x86/mm/fault.c:1481 [inline]
exc_page_fault+0x2b9/0x8c0 arch/x86/mm/fault.c:1539
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623

-> #0 (&mm->mmap_lock){++++}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3158 [inline]
check_prevs_add kernel/locking/lockdep.c:3277 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3901
__lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5199
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5822
__might_fault+0xc6/0x120 mm/memory.c:6700
_inline_copy_to_user include/linux/uaccess.h:183 [inline]
_copy_to_user+0x2a/0xb0 lib/usercopy.c:26
copy_to_user include/linux/uaccess.h:216 [inline]
fiemap_fill_next_extent+0x235/0x410 fs/ioctl.c:145
ocfs2_fiemap+0x9f1/0xf80 fs/ocfs2/extent_map.c:796
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1bf8/0x2e40 fs/ioctl.c:841
__do_sys_ioctl fs/ioctl.c:905 [inline]
__se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
&mm->mmap_lock --> sb_pagefaults --> &oi->ip_alloc_sem

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rlock(&oi->ip_alloc_sem);
                               lock(sb_pagefaults);
                               lock(&oi->ip_alloc_sem);
  rlock(&mm->mmap_lock);

*** DEADLOCK ***

1 lock held by syz-executor356/5112:
#0: ffff888041bdbf60 (&oi->ip_alloc_sem){++++}-{3:3}, at: ocfs2_fiemap+0x377/0xf80 fs/ocfs2/extent_map.c:755

stack backtrace:
CPU: 0 UID: 0 PID: 5112 Comm: syz-executor356 Not tainted 6.11.0-syzkaller-11993-g3efc57369a0c #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2203
check_prev_add kernel/locking/lockdep.c:3158 [inline]
check_prevs_add kernel/locking/lockdep.c:3277 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3901
__lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5199
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5822
__might_fault+0xc6/0x120 mm/memory.c:6700
_inline_copy_to_user include/linux/uaccess.h:183 [inline]
_copy_to_user+0x2a/0xb0 lib/usercopy.c:26
copy_to_user include/linux/uaccess.h:216 [inline]
fiemap_fill_next_extent+0x235/0x410 fs/ioctl.c:145
ocfs2_fiemap+0x9f1/0xf80 fs/ocfs2/extent_map.c:796
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1bf8/0x2e40 fs/ioctl.c:841
__do_sys_ioctl fs/ioctl.c:905 [inline]
__se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f0c21255d99
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc7eb343e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0c21255d99
RDX: 00000000200001c0 RSI: 00000000c020660b RDI: 0000000000000005
RBP: 00007f0c212ce5f0 R08: 0000000000000000 R09: 00005555783d34c0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f0c212ce5f0 R14: 431bde82d7b634db R15: 00007f0c2129f03b
</TASK>

---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

syzbot

Oct 11, 2024, 11:59:04 PM

to [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

syzbot has bisected this issue to:

commit a3c06ae158dd6fa8336157c31d9234689d068d02
Author: Parav Pandit <[email protected]>
Date: Tue Jan 5 10:32:03 2021 +0000

vdpa_sim_net: Add support for user supported devices

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13a0a840580000
start commit: 1d227fcc7222 Merge tag 'net-6.12-rc3' of git://git.kernel...
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=1060a840580000
console output: https://syzkaller.appspot.com/x/log.txt?x=17a0a840580000
kernel config: https://syzkaller.appspot.com/x/.config?x=7a3fccdd0bb995
dashboard link: https://syzkaller.appspot.com/bug?extid=ca440b457d21568f8021
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=142fc840580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15026b27980000

Reported-by: [email protected]
Fixes: a3c06ae158dd ("vdpa_sim_net: Add support for user supported devices")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Lizhi Xu

Oct 17, 2024, 4:52:20 AM

to [email protected], [email protected]

#syz test

diff --git a/fs/ocfs2/mmap.c b/fs/ocfs2/mmap.c
index 6ef4cb045ccd..f7863f7fb4a1 100644
--- a/fs/ocfs2/mmap.c
+++ b/fs/ocfs2/mmap.c
@@ -119,9 +119,6 @@ static vm_fault_t ocfs2_page_mkwrite(struct vm_fault *vmf)
 	int err;
 	vm_fault_t ret;

-	sb_start_pagefault(inode->i_sb);
-	ocfs2_block_signals(&oldset);
-
 	/*
 	 * The cluster locks taken will block a truncate from another
 	 * node. Taking the data lock will also ensure that we don't
@@ -131,7 +128,7 @@ static vm_fault_t ocfs2_page_mkwrite(struct vm_fault *vmf)
 	if (err < 0) {
 		mlog_errno(err);
 		ret = vmf_error(err);
-		goto out;
+		return ret;
 	}

 	/*
@@ -141,16 +138,19 @@ static vm_fault_t ocfs2_page_mkwrite(struct vm_fault *vmf)
 	 */
 	down_write(&OCFS2_I(inode)->ip_alloc_sem);

+	sb_start_pagefault(inode->i_sb);
+	ocfs2_block_signals(&oldset);
+
 	ret = __ocfs2_page_mkwrite(vmf->vma->vm_file, di_bh, page);

+	ocfs2_unblock_signals(&oldset);
+	sb_end_pagefault(inode->i_sb);
+
 	up_write(&OCFS2_I(inode)->ip_alloc_sem);

 	brelse(di_bh);
 	ocfs2_inode_unlock(inode, 1);

-out:
-	ocfs2_unblock_signals(&oldset);
-	sb_end_pagefault(inode->i_sb);
 	return ret;
 }

syzbot

Oct 17, 2024, 5:18:06 AM

to [email protected], [email protected], [email protected]

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in ocfs2_fiemap

option from the mount to silence this warning.
=======================================================

ocfs2: Mounting device (7,0) on (node local, slot 0) with ordered data mode.
======================================================
WARNING: possible circular locking dependency detected
6.12.0-rc3-syzkaller-00087-gc964ced77262-dirty #0 Not tainted
------------------------------------------------------
syz.0.15/6035 is trying to acquire lock:
ffff88807a5ee098 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0xaa/0x120 mm/memory.c:6700

but task is already holding lock:
ffff888071633f60 (&oi->ip_alloc_sem){++++}-{3:3}, at: ocfs2_fiemap+0x377/0xf80 fs/ocfs2/extent_map.c:755

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&oi->ip_alloc_sem){++++}-{3:3}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
down_write+0x99/0x220 kernel/locking/rwsem.c:1577
ocfs2_page_mkwrite+0x1e9/0xed0 fs/ocfs2/mmap.c:139
do_page_mkwrite+0x198/0x480 mm/memory.c:3162
do_shared_fault mm/memory.c:5358 [inline]
do_fault mm/memory.c:5420 [inline]
do_pte_missing mm/memory.c:3965 [inline]
handle_pte_fault+0x11fa/0x6800 mm/memory.c:5751
__handle_mm_fault mm/memory.c:5894 [inline]
handle_mm_fault+0x1053/0x1ad0 mm/memory.c:6062
do_user_addr_fault arch/x86/mm/fault.c:1389 [inline]
handle_page_fault arch/x86/mm/fault.c:1481 [inline]
exc_page_fault+0x2b9/0x8c0 arch/x86/mm/fault.c:1539
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623

-> #0 (&mm->mmap_lock){++++}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3161 [inline]
check_prevs_add kernel/locking/lockdep.c:3280 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
__lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5202
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
__might_fault+0xc6/0x120 mm/memory.c:6700
_inline_copy_to_user include/linux/uaccess.h:183 [inline]
_copy_to_user+0x2a/0xb0 lib/usercopy.c:26
copy_to_user include/linux/uaccess.h:216 [inline]
fiemap_fill_next_extent+0x235/0x410 fs/ioctl.c:145
ocfs2_fiemap+0x9f1/0xf80 fs/ocfs2/extent_map.c:796
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1bf8/0x2e40 fs/ioctl.c:841
__do_sys_ioctl fs/ioctl.c:905 [inline]
__se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rlock(&oi->ip_alloc_sem);
                               lock(&mm->mmap_lock);
                               lock(&oi->ip_alloc_sem);
  rlock(&mm->mmap_lock);

*** DEADLOCK ***

1 lock held by syz.0.15/6035:
#0: ffff888071633f60 (&oi->ip_alloc_sem){++++}-{3:3}, at: ocfs2_fiemap+0x377/0xf80 fs/ocfs2/extent_map.c:755

stack backtrace:
CPU: 1 UID: 0 PID: 6035 Comm: syz.0.15 Not tainted 6.12.0-rc3-syzkaller-00087-gc964ced77262-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2206
check_prev_add kernel/locking/lockdep.c:3161 [inline]
check_prevs_add kernel/locking/lockdep.c:3280 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
__lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5202
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
__might_fault+0xc6/0x120 mm/memory.c:6700
_inline_copy_to_user include/linux/uaccess.h:183 [inline]
_copy_to_user+0x2a/0xb0 lib/usercopy.c:26
copy_to_user include/linux/uaccess.h:216 [inline]
fiemap_fill_next_extent+0x235/0x410 fs/ioctl.c:145
ocfs2_fiemap+0x9f1/0xf80 fs/ocfs2/extent_map.c:796
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1bf8/0x2e40 fs/ioctl.c:841
__do_sys_ioctl fs/ioctl.c:905 [inline]
__se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f351397dff9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f35147e5038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f3513b35f80 RCX: 00007f351397dff9
RDX: 00000000200001c0 RSI: 00000000c020660b RDI: 0000000000000005
RBP: 00007f35139f0296 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f3513b35f80 R15: 00007ffef5debf08
</TASK>

Tested on:

commit: c964ced7 Merge tag 'for-linus' of git://git.kernel.org..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14483030580000
kernel config: https://syzkaller.appspot.com/x/.config?x=164d2822debd8b0d
dashboard link: https://syzkaller.appspot.com/bug?extid=ca440b457d21568f8021
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

patch: https://syzkaller.appspot.com/x/patch.diff?x=12bdf727980000

Lizhi Xu

Oct 17, 2024, 6:19:44 AM

to [email protected], [email protected]

#syz test

diff --git a/fs/ocfs2/extent_map.c b/fs/ocfs2/extent_map.c
index f7672472fa82..6d5ffa803b31 100644
--- a/fs/ocfs2/extent_map.c
+++ b/fs/ocfs2/extent_map.c
@@ -793,8 +793,10 @@ int ocfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		phys_bytes = le64_to_cpu(rec.e_blkno) << osb->sb->s_blocksize_bits;
 		virt_bytes = (u64)le32_to_cpu(rec.e_cpos) << osb->s_clustersize_bits;

+		up_read(&OCFS2_I(inode)->ip_alloc_sem);
 		ret = fiemap_fill_next_extent(fieinfo, virt_bytes, phys_bytes,
 					      len_bytes, fe_flags);
+		down_read(&OCFS2_I(inode)->ip_alloc_sem);
 		if (ret)
 			break;

syzbot

Oct 17, 2024, 6:51:08 AM

to [email protected], [email protected], [email protected]

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: [email protected]
Tested-by: [email protected]

Tested on:

commit: c964ced7 Merge tag 'for-linus' of git://git.kernel.org..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=135ab887980000
kernel config: https://syzkaller.appspot.com/x/.config?x=164d2822debd8b0d
dashboard link: https://syzkaller.appspot.com/bug?extid=ca440b457d21568f8021
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

patch: https://syzkaller.appspot.com/x/patch.diff?x=1383dc40580000

Note: testing is done by a robot and is best-effort only.

Lizhi Xu

Oct 17, 2024, 11:55:37 AM

to [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Syzbot reported a possible circular locking dependency.

WARNING: possible circular locking dependency detected
6.12.0-rc2-syzkaller-00205-g1d227fcc7222 #0 Not tainted
------------------------------------------------------
syz-executor161/5226 is trying to acquire lock:
ffff88807e907398 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0xaa/0x120 mm/memory.c:6700

but task is already holding lock:
ffff8880720d0660 (&oi->ip_alloc_sem){++++}-{3:3}, at: ocfs2_fiemap+0x377/0xf80 fs/ocfs2/extent_map.c:755

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&oi->ip_alloc_sem){++++}-{3:3}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
down_write+0x99/0x220 kernel/locking/rwsem.c:1577
ocfs2_page_mkwrite+0x346/0xed0 fs/ocfs2/mmap.c:142
do_page_mkwrite+0x198/0x480 mm/memory.c:3162
do_shared_fault mm/memory.c:5358 [inline]
do_fault mm/memory.c:5420 [inline]
do_pte_missing mm/memory.c:3965 [inline]
handle_pte_fault+0x11fa/0x6800 mm/memory.c:5751
__handle_mm_fault mm/memory.c:5894 [inline]
handle_mm_fault+0x1053/0x1ad0 mm/memory.c:6062
do_user_addr_fault arch/x86/mm/fault.c:1389 [inline]
handle_page_fault arch/x86/mm/fault.c:1481 [inline]
exc_page_fault+0x2b9/0x8c0 arch/x86/mm/fault.c:1539
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623

-> #1 (sb_pagefaults){.+.+}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
__sb_start_write include/linux/fs.h:1716 [inline]
sb_start_pagefault include/linux/fs.h:1881 [inline]
ocfs2_page_mkwrite+0x222/0xed0 fs/ocfs2/mmap.c:122
do_page_mkwrite+0x198/0x480 mm/memory.c:3162
do_shared_fault mm/memory.c:5358 [inline]
do_fault mm/memory.c:5420 [inline]
do_pte_missing mm/memory.c:3965 [inline]
handle_pte_fault+0x11fa/0x6800 mm/memory.c:5751
__handle_mm_fault mm/memory.c:5894 [inline]
handle_mm_fault+0x1053/0x1ad0 mm/memory.c:6062
do_user_addr_fault arch/x86/mm/fault.c:1389 [inline]
handle_page_fault arch/x86/mm/fault.c:1481 [inline]
exc_page_fault+0x2b9/0x8c0 arch/x86/mm/fault.c:1539
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623

-> #0 (&mm->mmap_lock){++++}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3161 [inline]
check_prevs_add kernel/locking/lockdep.c:3280 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
__lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5202
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
__might_fault+0xc6/0x120 mm/memory.c:6700
_inline_copy_to_user include/linux/uaccess.h:183 [inline]
_copy_to_user+0x2a/0xb0 lib/usercopy.c:26
copy_to_user include/linux/uaccess.h:216 [inline]
fiemap_fill_next_extent+0x235/0x410 fs/ioctl.c:145
ocfs2_fiemap+0x9f1/0xf80 fs/ocfs2/extent_map.c:796
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1bf8/0x2e40 fs/ioctl.c:841
__do_sys_ioctl fs/ioctl.c:905 [inline]
__se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
&mm->mmap_lock --> sb_pagefaults --> &oi->ip_alloc_sem

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rlock(&oi->ip_alloc_sem);
                               lock(sb_pagefaults);
                               lock(&oi->ip_alloc_sem);
  rlock(&mm->mmap_lock);

*** DEADLOCK ***

Fix it by reordering the sb_pagefaults and ip_alloc_sem locks in
ocfs2_page_mkwrite(). In addition, because fiemap_fill_next_extent()
does not need to run under ip_alloc_sem, release ip_alloc_sem before
calling it and reacquire it afterwards.

Reported-and-tested-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=ca440b457d21568f8021
Signed-off-by: Lizhi Xu <[email protected]>
---
fs/ocfs2/extent_map.c | 2 ++
fs/ocfs2/mmap.c | 14 +++++++-------
2 files changed, 9 insertions(+), 7 deletions(-)

--
2.43.0
