[syzbot] [ntfs3?] BUG: unable to handle kernel NULL pointer dereference in generic_file_read_iter


syzbot

Apr 10, 2025, 10:15:36 PM
Hello,

syzbot found the following issue on:

HEAD commit: 0af2f6be1b42 Linux 6.15-rc1
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=145dc7e4580000
kernel config: https://syzkaller.appspot.com/x/.config?x=bae073f4634b7fd
dashboard link: https://syzkaller.appspot.com/bug?extid=e36cc3297bd3afd25e19
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1441eb4c580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=161b0c04580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/f359042635eb/disk-0af2f6be.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/bd095707eff2/vmlinux-0af2f6be.xz
kernel image: https://storage.googleapis.com/syzbot-assets/9257d0cc2f0f/bzImage-0af2f6be.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/93de6f4d2865/mount_0.gz

The issue was bisected to:

commit b432163ebd15a0fb74051949cb61456d6c55ccbd
Author: Konstantin Komarov <[email protected]>
Date: Thu Jan 30 14:03:41 2025 +0000

fs/ntfs3: Update inode->i_mapping->a_ops on compression state

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1351523f980000
final oops: https://syzkaller.appspot.com/x/report.txt?x=10d1523f980000
console output: https://syzkaller.appspot.com/x/log.txt?x=1751523f980000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: b432163ebd15 ("fs/ntfs3: Update inode->i_mapping->a_ops on compression state")

loop0: detected capacity change from 0 to 4096
ntfs3(loop0): Different NTFS sector size (4096) and media sector size (512).
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 0 P4D 0
Oops: Oops: 0010 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 5858 Comm: syz-executor328 Not tainted 6.15.0-rc1-syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90003f1f880 EFLAGS: 00010246
RAX: 1ffffffff18fac93 RBX: 0000000000000000 RCX: ffff8880312cda00
RDX: 0000000000000000 RSI: ffffc90003f1f980 RDI: ffffc90003f1f9d0
RBP: ffffffff8c7d6498 R08: ffffffff82450731 R09: 1ffff1100ee119e1
R10: dffffc0000000000 R11: 0000000000000000 R12: 1ffff920007e3f33
R13: ffffc90003f1f980 R14: dffffc0000000000 R15: ffffc90003f1f9d0
FS: 00007efd457ef6c0(0000) GS:ffff888124fc9000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000034abc000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
generic_file_read_iter+0x343/0x550 mm/filemap.c:2870
copy_splice_read+0x63f/0xb50 fs/splice.c:363
do_splice_read fs/splice.c:978 [inline]
splice_direct_to_actor+0x4f0/0xc90 fs/splice.c:1083
do_splice_direct_actor fs/splice.c:1201 [inline]
do_splice_direct+0x281/0x3d0 fs/splice.c:1227
do_sendfile+0x582/0x8c0 fs/read_write.c:1368
__do_sys_sendfile64 fs/read_write.c:1429 [inline]
__se_sys_sendfile64+0x17e/0x1e0 fs/read_write.c:1415
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7efd4583d069
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 91 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007efd457ef168 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007efd458e3708 RCX: 00007efd4583d069
RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000004
RBP: 00007efd458e3700 R08: 00007efd457ef6c0 R09: 0000000000000000
R10: 0001000000201005 R11: 0000000000000246 R12: 00007efd458e370c
R13: 000000000000000b R14: 00007ffdcf3e22c0 R15: 00007ffdcf3e23a8
</TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90003f1f880 EFLAGS: 00010246
RAX: 1ffffffff18fac93 RBX: 0000000000000000 RCX: ffff8880312cda00
RDX: 0000000000000000 RSI: ffffc90003f1f980 RDI: ffffc90003f1f9d0
RBP: ffffffff8c7d6498 R08: ffffffff82450731 R09: 1ffff1100ee119e1
R10: dffffc0000000000 R11: 0000000000000000 R12: 1ffff920007e3f33
R13: ffffc90003f1f980 R14: dffffc0000000000 R15: ffffc90003f1f9d0
FS: 00007efd457ef6c0(0000) GS:ffff888124fc9000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 0000000034abc000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Lizhi Xu

Apr 11, 2025, 3:51:44 AM
missing direct io func.
#syz test

diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index 3e2957a1e360..50524f573d3a 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -2068,5 +2068,6 @@ const struct address_space_operations ntfs_aops_cmpr = {
 	.read_folio = ntfs_read_folio,
 	.readahead = ntfs_readahead,
 	.dirty_folio = block_dirty_folio,
+	.direct_IO = ntfs_direct_IO,
 };
 // clang-format on

syzbot

Apr 11, 2025, 4:15:05 AM
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
unregister_netdevice: waiting for DEV to become free

unregister_netdevice: waiting for batadv0 to become free. Usage count = 3


Tested on:

commit: 0c7cae12 Merge tag 'irq-urgent-2025-04-10' of git://gi..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1373f070580000
kernel config: https://syzkaller.appspot.com/x/.config?x=fb8650d88e9fb80f
dashboard link: https://syzkaller.appspot.com/bug?extid=e36cc3297bd3afd25e19
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=11a8474c580000

Lizhi Xu

Apr 11, 2025, 4:24:34 AM
The ntfs3 can use the page cache directly, so its address_space_operations
need direct_IO.

Fixes: b432163ebd15 ("fs/ntfs3: Update inode->i_mapping->a_ops on compression state")
Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=e36cc3297bd3afd25e19
Signed-off-by: Lizhi Xu <[email protected]>
---
fs/ntfs3/inode.c | 1 +
1 file changed, 1 insertion(+)
--
2.43.0

Christoph Hellwig

Apr 14, 2025, 8:50:58 AM
On Fri, Apr 11, 2025 at 09:24:27AM +0800, Lizhi Xu wrote:
> The ntfs3 can use the page cache directly, so its address_space_operations
> need direct_IO.

I can't parse that sentence. What are you trying to say with it?

Lizhi Xu

Apr 15, 2025, 4:05:25 AM
The comment [1] on generic_file_read_iter() clearly states "read_iter()
for all filesystems that can use the page cache directly".

In the call trace of this report, it is clear that direct_IO is not set.
In [3], it is also clear that the lack of direct_IO in ntfs_aops_cmpr
caused this problem.

In summary, direct_IO must be set to fix this issue.

[1]
* generic_file_read_iter - generic filesystem read routine
* @iocb: kernel I/O control block
* @iter: destination for the data read
*
* This is the "read_iter()" routine for all filesystems
* that can use the page cache directly.

[2]
generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
{
	size_t count = iov_iter_count(iter);
	ssize_t retval = 0;

	if (!count)
		return 0; /* skip atime */

	if (iocb->ki_flags & IOCB_DIRECT) {
		struct file *file = iocb->ki_filp;
		struct address_space *mapping = file->f_mapping;
		struct inode *inode = mapping->host;

		retval = kiocb_write_and_wait(iocb, count);
		if (retval < 0)
			return retval;
		file_accessed(file);

		retval = mapping->a_ops->direct_IO(iocb, iter);

[3]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b432163ebd15a0fb74051949cb61456d6c55ccbd
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 4d9d84cc3c6f55..9b6a3f8d2e7c5c 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -101,8 +101,26 @@ int ntfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
 	/* Allowed to change compression for empty files and for directories only. */
 	if (!is_dedup(ni) && !is_encrypted(ni) &&
 	    (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode))) {
-		/* Change compress state. */
-		int err = ni_set_compress(inode, flags & FS_COMPR_FL);
+		int err = 0;
+		struct address_space *mapping = inode->i_mapping;
+
+		/* write out all data and wait. */
+		filemap_invalidate_lock(mapping);
+		err = filemap_write_and_wait(mapping);
+
+		if (err >= 0) {
+			/* Change compress state. */
+			bool compr = flags & FS_COMPR_FL;
+			err = ni_set_compress(inode, compr);
+
+			/* For files change a_ops too. */
+			if (!err)
+				mapping->a_ops = compr ? &ntfs_aops_cmpr :
+							 &ntfs_aops;
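
To make the failure mode in [2] and [3] concrete, here is a rough userspace model, not kernel code. Every name in it (model_aops, model_mapping, model_direct_read, MODEL_EOPNOTSUPP and so on) is a hypothetical stand-in invented for illustration. It assumes only what the excerpts above show: the read path calls a hook through the ops table, and the table can be swapped to one that leaves the hook unset. Unlike the kernel path, the model checks for NULL so that it can be run safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct address_space_operations. */
struct model_aops {
	long (*read_folio)(void);
	long (*direct_io)(size_t count);	/* may be left NULL */
};

static long model_read_folio(void) { return 0; }
static long model_direct_io(size_t count) { return (long)count; }

/* Models ntfs_aops: the direct I/O hook is present. */
static const struct model_aops model_aops_plain = {
	.read_folio = model_read_folio,
	.direct_io = model_direct_io,
};

/* Models ntfs_aops_cmpr as reported: the hook is not set, so it is NULL. */
static const struct model_aops model_aops_cmpr = {
	.read_folio = model_read_folio,
};

struct model_mapping {
	const struct model_aops *a_ops;
};

/* Models the a_ops switch performed when the compression flag changes. */
static void model_set_compressed(struct model_mapping *m, int compr)
{
	m->a_ops = compr ? &model_aops_cmpr : &model_aops_plain;
}

/* Hypothetical error value used by this model only. */
#define MODEL_EOPNOTSUPP (-95L)

/*
 * Models the IOCB_DIRECT branch. The excerpt in [2] calls the hook
 * without a check; this model returns an error instead, because calling
 * a NULL function pointer is exactly the jump to address 0 seen in the oops.
 */
static long model_direct_read(const struct model_mapping *m, size_t count)
{
	assert(m != NULL && m->a_ops != NULL);

	if (!m->a_ops->direct_io)
		return MODEL_EOPNOTSUPP;
	return m->a_ops->direct_io(count);
}
```

With the plain table, model_direct_read() reaches the hook; after model_set_compressed(&m, 1) the hook is NULL and only the explicit check prevents the call.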

BR,
Lizhi

Jan Kara

Apr 15, 2025, 12:05:29 PM
On Tue 15-04-25 09:05:18, Lizhi Xu wrote:
> On Sun, 13 Apr 2025 22:50:54 -0700, Christoph Hellwig wrote:
> > On Fri, Apr 11, 2025 at 09:24:27AM +0800, Lizhi Xu wrote:
> > > The ntfs3 can use the page cache directly, so its address_space_operations
> > > need direct_IO.
> >
> > I can't parse that sentence. What are you trying to say with it?
> The comment [1] on generic_file_read_iter() clearly states "read_iter()
> for all filesystems that can use the page cache directly".
>
> In the call trace of this report, it is clear that direct_IO is not set.
> In [3], it is also clear that the lack of direct_IO in ntfs_aops_cmpr
> caused this problem.
>
> In summary, direct_IO must be set to fix this issue.

I agree that you need to set .direct_IO in ntfs_aops_cmpr but since
compressed files do not *support* direct IO (at least I don't see any such
support in ntfs_direct_IO()) you either need to also handle these files in
ntfs_direct_IO() or you need to set special direct IO handler that will
just return 0 and thus fall back to buffered IO. So I don't think your
patch is correct as is.
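
The "return 0 and fall back to buffered IO" option can be sketched in userspace as follows. This is a simplified model, not the kernel read path: model_file, model_direct_io, model_buffered_read and model_read are hypothetical names, and the model assumes only that a direct I/O hook reporting 0 transferred bytes (rather than an error) lets the caller serve the remainder through the buffered path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical file model: contents, a compression flag, and a counter. */
struct model_file {
	const char *data;
	size_t size;
	int compressed;
	int buffered_reads;	/* how many times the buffered path ran */
};

/*
 * Models a direct I/O hook that declines compressed files by reporting
 * 0 bytes transferred instead of failing.
 */
static long model_direct_io(struct model_file *f, char *buf, size_t count)
{
	if (f->compressed)
		return 0;
	if (count > f->size)
		count = f->size;
	memcpy(buf, f->data, count);
	return (long)count;
}

/* Models the buffered (page cache) read path, starting at offset pos. */
static long model_buffered_read(struct model_file *f, size_t pos,
				char *buf, size_t count)
{
	f->buffered_reads++;
	if (pos >= f->size)
		return 0;
	if (count > f->size - pos)
		count = f->size - pos;
	memcpy(buf, f->data + pos, count);
	return (long)count;
}

/* Models the caller: try direct I/O first, then read the rest buffered. */
static long model_read(struct model_file *f, char *buf, size_t count)
{
	long done;

	assert(f != NULL && buf != NULL);

	done = model_direct_io(f, buf, count);
	if (done < 0 || (size_t)done == count || (size_t)done >= f->size)
		return done;
	return done + model_buffered_read(f, (size_t)done, buf + done,
					  count - (size_t)done);
}
```

In this model an uncompressed file is read entirely by the direct hook, while a compressed file gets 0 from the hook and the same bytes from the buffered path.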

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

Lizhi Xu

Apr 15, 2025, 12:26:44 PM
The ntfs3 can use the page cache directly, so its address_space_operations
need direct_IO. Exit ntfs_direct_IO() if it is a compressed file.

Fixes: b432163ebd15 ("fs/ntfs3: Update inode->i_mapping->a_ops on compression state")
Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=e36cc3297bd3afd25e19
Signed-off-by: Lizhi Xu <[email protected]>
---
V1 -> V2: exit direct io if it is a compressed file.

fs/ntfs3/inode.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index 3e2957a1e360..0f0d27d4644a 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -805,6 +805,10 @@ static ssize_t ntfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 		ret = 0;
 		goto out;
 	}
+	if (is_compressed(ni)) {
+		ret = 0;
+		goto out;
+	}

 	ret = blockdev_direct_IO(iocb, inode, iter,
 				 wr ? ntfs_get_block_direct_IO_W :
@@ -2068,5 +2072,6 @@ const struct address_space_operations ntfs_aops_cmpr = {

Jan Kara

Apr 15, 2025, 2:08:55 PM
On Tue 15-04-25 17:26:37, Lizhi Xu wrote:
> The ntfs3 can use the page cache directly, so its address_space_operations
> need direct_IO. Exit ntfs_direct_IO() if it is a compressed file.
>
> Fixes: b432163ebd15 ("fs/ntfs3: Update inode->i_mapping->a_ops on compression state")
> Reported-by: [email protected]
> Closes: https://syzkaller.appspot.com/bug?extid=e36cc3297bd3afd25e19
> Signed-off-by: Lizhi Xu <[email protected]>

OK, this looks sensible to me. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

Christoph Hellwig

Apr 16, 2025, 7:37:08 AM
On Tue, Apr 15, 2025 at 05:26:37PM +0800, Lizhi Xu wrote:
> The ntfs3 can use the page cache directly, so its address_space_operations
> need direct_IO.

This sentence still does not make any sense.

Lizhi Xu

Apr 16, 2025, 8:34:34 AM
On Tue, 15 Apr 2025 21:37:04 -0700, Christoph Hellwig wrote:
> > The ntfs3 can use the page cache directly, so its address_space_operations
> > need direct_IO.
>
> This sentence still does not make any sense.
Did you see the following comments?
https://lore.kernel.org/all/[email protected]/

Christoph Hellwig

Apr 16, 2025, 8:35:34 AM
I did, but that changes nothing about the fact that the above sentence
doesn't make sense.

Lizhi Xu

Apr 16, 2025, 9:03:57 AM
On Tue, 15 Apr 2025 22:35:30 -0700, Christoph Hellwig wrote:
> > > > The ntfs3 can use the page cache directly, so its address_space_operations
> > > > need direct_IO.
> > >
> > > This sentence still does not make any sense.
> > Did you see the following comments?
> > https://lore.kernel.org/all/[email protected]/
>
> I did, but that changes nothing about the fact that the above sentence
> doesn't make sense.
In the reproducer, the second file passed in by the system call sendfile()
sets the file flag O_DIRECT when opening the file, which bypasses the page
cache and accesses the direct io interface of the ntfs3 file system.
However, ntfs3 does not set direct_IO for compressed files in ntfs_aops_cmpr.
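
For reference, the userspace side of that pattern can be sketched roughly as below. This is not the syzbot reproducer; sendfile_from_direct() is a hypothetical helper, and the sketch assumes only what is described above: the input file is opened with O_DIRECT and then handed to sendfile().

```c
#define _GNU_SOURCE	/* needed for O_DIRECT with glibc */
#include <assert.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open src_path with O_DIRECT and copy up to count
 * bytes to out_fd with sendfile(). Returns the sendfile() result, or -1
 * if the file cannot be opened.
 */
static ssize_t sendfile_from_direct(const char *src_path, int out_fd,
				    size_t count)
{
	off_t offset = 0;
	ssize_t ret;
	int in_fd;

	assert(src_path != NULL);

	/* O_DIRECT on the input file is what selects the direct I/O path. */
	in_fd = open(src_path, O_RDONLY | O_DIRECT);
	if (in_fd < 0)
		return -1;

	ret = sendfile(out_fd, in_fd, &offset, count);
	close(in_fd);
	return ret;
}
```

Whether open() accepts O_DIRECT depends on the filesystem holding src_path, so the helper treats a failed open as an ordinary error.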

Christoph Hellwig

Apr 16, 2025, 9:06:02 AM
On Wed, Apr 16, 2025 at 02:03:51PM +0800, Lizhi Xu wrote:
> In the reproducer, the second file passed in by the system call sendfile()
> sets the file flag O_DIRECT when opening the file, which bypasses the page
> cache and accesses the direct io interface of the ntfs3 file system.
> However, ntfs3 does not set direct_IO for compressed files in ntfs_aops_cmpr.

Not allowing direct I/O is perfectly fine. If you think you need to
support direct I/O for this case it is also fine. But none of this
has anything to do with 'can use the page cache' and there are also
plenty of ways to support direct I/O without ->direct_IO.

Lizhi Xu

Apr 16, 2025, 9:18:35 AM
On Tue, 15 Apr 2025 23:05:56 -0700, Christoph Hellwig wrote:
> > In the reproducer, the second file passed in by the system call sendfile()
> > sets the file flag O_DIRECT when opening the file, which bypasses the page
> > cache and accesses the direct io interface of the ntfs3 file system.
> > However, ntfs3 does not set direct_IO for compressed files in ntfs_aops_cmpr.
>
> Not allowing direct I/O is perfectly fine. If you think you need to
> support direct I/O for this case it is also fine. But none of this
> has anything to do with 'can use the page cache' and there are also
The sentence "The ntfs3 can use the page cache directly" in the patch
description was meant to explain that the call trace shows the ntfs3
direct I/O hook being called from generic_file_read_iter().