summaryrefslogtreecommitdiff
path: root/fs/9p/vfs_super.c
AgeCommit message (Collapse)Author
2024-04-22 fs/9p: mitigate inode collisionsEric Van Hensbergen
Detect and mitigate inode collsions that now occur since we fixed 9p generating duplicate inode structures. Underlying cause of these appears to be a race condition between reuse of inode numbers in underlying file system and cleanup of inode numbers in the client. Enabling caching makes this much more likely to happen as it increases cleanup latency due to writebacks. Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2024-04-11fs/9p: drop inodes immediately on non-.L tooJoakim Sindholt
Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2024-04-11fs/9p: Revert "fs/9p: fix dups even in uncached mode"Eric Van Hensbergen
This reverts commit be57855f505003c5cafff40338d5d0f23b00ba4d. It caused a regression involving duplicate inode numbers in some tester trees. The bad behavior seems to be dependent on inode reuse policy in underlying file system, so it did not trigger in my test setup. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2024-01-26fs/9p: fix dups even in uncached modeEric Van Hensbergen
In uncached mode we were still seeing duplicate getattr requests because of aggressive dropping of inodes. Inode "freshness" is guarded by other mechanisms when caches are disabled so this is unnecessary and increases overhead of almost every operation. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2024-01-26fs/9p: simplify iget to remove unnecessary pathsEric Van Hensbergen
Remove the additional comparison operators and switch to simply lookup by inode number (aka qid.path). Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2024-01-26fs/9p: switch vfsmount to use v9fs_get_new_inodeEric Van Hensbergen
In the process of cleaning up inode number allocation, I noticed several functions which didn't use the standard helper allocators. This patch fixes the allocation in the mount entrypoint. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2023-12-24netfs: Move pinning-for-writeback from fscache to netfsDavid Howells
Move the resource pinning-for-writeback from fscache code to netfslib code. This is used to keep a cache backing object pinned whilst we have dirty pages on the netfs inode in the pagecache such that VM writeback will be able to reach it. Whilst we're at it, switch the parameters of netfs_unpin_writeback() to match ->write_inode() so that it can be used for that directly. Note that this mechanism could be more generically useful than that for network filesystems. Quite often they have to keep around other resources (e.g. authentication tokens or network connections) until the writeback is complete. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2023-05-05Merge tag 'net-6.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter. Current release - regressions: - sched: act_pedit: free pedit keys on bail from offset check Current release - new code bugs: - pds_core: - Kconfig fixes (DEBUGFS and AUXILIARY_BUS) - fix mutex double unlock in error path Previous releases - regressions: - sched: cls_api: remove block_cb from driver_list before freeing - nf_tables: fix ct untracked match breakage - eth: mtk_eth_soc: drop generic vlan rx offload - sched: flower: fix error handler on replace Previous releases - always broken: - tcp: fix skb_copy_ubufs() vs BIG TCP - ipv6: fix skb hash for some RST packets - af_packet: don't send zero-byte data in packet_sendmsg_spkt() - rxrpc: timeout handling fixes after moving client call connection to the I/O thread - ixgbe: fix panic during XDP_TX with > 64 CPUs - igc: RMW the SRRCTL register to prevent losing timestamp config - dsa: mt7530: fix corrupt frames using TRGMII on 40 MHz XTAL MT7621 - r8152: - fix flow control issue of RTL8156A - fix the poor throughput for 2.5G devices - move setting r8153b_rx_agg_chg_indicate() to fix coalescing - enable autosuspend - ncsi: clear Tx enable mode when handling a Config required AEN - octeontx2-pf: macsec: fixes for CN10KB ASIC rev Misc: - 9p: remove INET dependency" * tag 'net-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) net: bcmgenet: Remove phy_stop() from bcmgenet_netif_stop() pds_core: fix mutex double unlock in error path net/sched: flower: fix error handler on replace Revert "net/sched: flower: Fix wrong handle assignment during filter change" net/sched: flower: fix filter idr initialization net: fec: correct the counting of XDP sent frames bonding: add xdp_features support net: enetc: check the index of the SFI rather than the handle sfc: Add back mailing list virtio_net: suppress cpu stall when free_unused_bufs ice: block LAN in case of VF to VF offload net: dsa: mt7530: fix network connectivity with multiple CPU ports net: dsa: mt7530: fix corrupt frames using trgmii on 40 MHz XTAL MT7621 9p: Remove INET dependency netfilter: nf_tables: fix ct untracked match breakage af_packet: Don't send zero-byte data in packet_sendmsg_spkt(). igc: read before write to SRRCTL register pds_core: add AUXILIARY_BUS and NET_DEVLINK to Kconfig pds_core: remove CONFIG_DEBUG_FS from makefile ionic: catch failure from devlink_alloc ...
2023-05-049p: Remove INET dependencyJason Andryuk
9pfs can run over assorted transports, so it doesn't have an INET dependency. Drop it and remove the includes of linux/inet.h. NET_9P_FD/trans_fd.o builds without INET or UNIX and is usable over plain file descriptors. However, tcp and unix functionality is still built and would generate runtime failures if used. Add imply INET and UNIX to NET_9P_FD, so functionality is enabled by default but can still be explicitly disabled. This allows configuring 9pfs over Xen with INET and UNIX disabled. Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-09fs/9p: Rework cache modes and add new options to DocumentationEric Van Hensbergen
Switch cache modes to a bit-mask and use legacy cache names as shortcuts. Update documentation to include information on both shortcuts and bitmasks. This patch also fixes missing guards related to fscache. Update the documentation for new mount flags and cache modes. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2023-03-27fs/9p: remove writeback fid and fix per-file modesEric Van Hensbergen
This patch removes the creating of an additional writeback_fid for opened files. The patch addresses problems when files were opened write-only or getattr on files with dirty caches. This patch also incorporates information about cache behavior in the fid for every file. This allows us to reflect cache behavior from mount flags, open mode, and information from the server to inform readahead and writeback behavior. This includes adding support for a 9p semantic that qid.version==0 is used to mark a file as non-cachable which is important for synthetic files. This may have a side-effect of not supporting caching on certain legacy file servers that do not properly set qid.version. There is also now a mount flag which can disable the qid.version behavior. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2023-03-27fs/9p: allow disable of xattr support on mountEric Van Hensbergen
xattr creates a lot of additional messages for 9p in the current implementation. This allows users to conditionalize xattr support on 9p mount if they are on a connection with bad latency. Using this flag is also useful when debugging other aspects of 9p as it reduces the noise in the trace files. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>
2023-03-27fs/9p: Remove unnecessary superblock flagsEric Van Hensbergen
These flags just add unnecessary extra operations. When 9p is run without cache, it inherently implements these options so we don't need them in the superblock (which ends up sending extraneous fsyncs, etc.). User can still request these options on mount, but we don't need to set them as default. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2022-12-029p/fs: Remove unneeded idr.h #includeChristophe JAILLET
The 9p fs does not use IDR or IDA functionalities. So there is no point in including <linux/idr.h>. Remove it. Link: https://lkml.kernel.org/r/3d1e0ed9714eaee7e18d9f5b0b4bfa49b00b286d.1669553950.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> [Dominique: reword subject] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-07-029p fid refcount: cleanup p9_fid_put callsDominique Martinet
Simplify p9_fid_put cleanup path in many 9p functions since the function is noop on null or error fids. Also make the *_add_fid() helpers "steal" the fid by nulling its pointer, so put after them will be noop. This should lead to no change of behaviour Link: https://lkml.kernel.org/r/20220612085330.1451496-7-asmadeus@codewreck.org Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-07-029p fid refcount: add p9_fid_get/put wrappersDominique Martinet
I was recently reminded that it is not clear that p9_client_clunk() was actually just decrementing refcount and clunking only when that reaches zero: make it clear through a set of helpers. This will also allow instrumenting refcounting better for debugging next patch Link: https://lkml.kernel.org/r/20220612085330.1451496-5-asmadeus@codewreck.org Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-01-109p: Copy local writes to the cache when writing to the serverDavid Howells
When writing to the server from v9fs_vfs_writepage(), copy the data to the cache object too. To make this possible, the cookie must have its active users count incremented when the page is dirtied and kept incremented until we manage to clean up all the pages. This allows the writeback to take place after the last file struct is released. This is done by taking a use on the cookie in v9fs_set_page_dirty() if we haven't already done so (controlled by the I_PINNING_FSCACHE_WB flag) and dropping the pin in v9fs_write_inode() if __writeback_single_inode() clears all the outstanding dirty pages (conveyed by the unpinned_fscache_wb flag in the writeback_control struct). Inode eviction must also clear the flag after truncating away all the outstanding pages. In the future this will be handled more gracefully by netfslib. Changes ======= ver #3: - Canonicalise the coherency data to make it endianness-independent. ver #2: - Fix an unused-var warning due to CONFIG_9P_FSCACHE=n[1]. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@gmail.com> cc: Latchesar Ionkov <lucho@ionkov.net> cc: v9fs-developer@lists.sourceforge.net cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819667027.215744.13815687931204222995.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906978015.143852.10646669694345706328.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967180760.1823006.5831751873616248910.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021574522.640689.13849966660182529125.stgit@warthog.procyon.org.uk/ # v4
2021-11-049p: fix a bunch of checkpatch warningsDominique Martinet
Sohaib Mohamed started a serie of tiny and incomplete checkpatch fixes but seemingly stopped halfway -- take over and do most of it. This is still missing net/9p/trans* and net/9p/protocol.c for a later time... Link: http://lkml.kernel.org/r/20211102134608.1588018-3-dominique.martinet@atmark-techno.com Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2021-11-049p: set readahead and io size according to maxsizeDominique Martinet
having a readahead of 128k with a msize of 128k (with overhead) lead to reading 124+4k everytime, making two roundtrips needlessly. tune readahead according to msize when cache is enabled for better performance Link: http://lkml.kernel.org/r/20211104120323.2189376-1-asmadeus@codewreck.org Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2021-11-039p: fix file headersDominique Martinet
- add missing SPDX-License-Identifier - remove (sometimes incorrect) file name from file header Link: http://lkml.kernel.org/r/20211102134608.1588018-2-dominique.martinet@atmark-techno.com Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2020-11-199p: add refcount to p9_fid structJianyong Wu
Fix race issue in fid contention. Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall bug in 9p. But there is race issue in fid parallel accesses. As Greg's patch stores all of fids from opened files into according inode, so all the lookup fid ops can retrieve fid from inode preferentially. But there is no mechanism to handle the fid contention issue. For example, there are two threads get the same fid in the same time and one of them clunk the fid before the other thread ready to discard the fid. In this scenario, it will lead to some fatal problems, even kernel core dump. I introduce a mechanism to fix this race issue. A counter field introduced into p9_fid struct to store the reference counter to the fid. When a fid is allocated from the inode or dentry, the counter will increase, and will decrease at the end of its occupation. It is guaranteed that the fid won't be clunked before the reference counter go down to 0, then we can avoid the clunked fid to be used. tests: race issue test from the old test case: for file in {01..50}; do touch f.${file}; done seq 1 1000 | xargs -n 1 -P 50 -I{} cat f.* > /dev/null open-unlink-f*syscall test: I have tested for f*syscall include: ftruncate fstat fchown fchmod faccessat. Link: http://lkml.kernel.org/r/20200923141146.90046-5-jianyong.wu@arm.com Fixes: 478ba09edc1f ("fs/9p: search open fids first") Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2020-10-24Merge branch 'work.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted stuff all over the place (the largest group here is Christoph's stat cleanups)" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: remove KSTAT_QUERY_FLAGS fs: remove vfs_stat_set_lookup_flags fs: move vfs_fstatat out of line fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat fs: remove vfs_statx_fd fs: omfs: use kmemdup() rather than kmalloc+memcpy [PATCH] reduce boilerplate in fsid handling fs: Remove duplicated flag O_NDELAY occurring twice in VALID_OPEN_FLAGS selftests: mount: add nosymfollow tests Add a "nosymfollow" mount option.
2020-09-24bdi: initialize ->ra_pages and ->io_pages in bdi_initChristoph Hellwig
Set up a readahead size by default, as very few users have a good reason to change it. This means code, ecryptfs, and orangefs now set up the values while they were previously missing it, while ubifs, mtd and vboxsf manually set it to 0 to avoid readahead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: Richard Weinberger <richard@nod.at> [ubifs, mtd] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-18[PATCH] reduce boilerplate in fsid handlingAl Viro
Get rid of boilerplate in most of ->statfs() instances... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-27Merge tag '9p-for-5.4' of git://github.com/martinetd/linuxLinus Torvalds
Pull 9p updates from Dominique Martinet: "Some of the usual small fixes and cleanup. Small fixes all around: - avoid overlayfs copy-up for PRIVATE mmaps - KUMSAN uninitialized warning for transport error - one syzbot memory leak fix in 9p cache - internal API cleanup for v9fs_fill_super" * tag '9p-for-5.4' of git://github.com/martinetd/linux: 9p/vfs_super.c: Remove unused parameter data in v9fs_fill_super 9p/cache.c: Fix memory leak in v9fs_cache_session_get_cookie 9p: Transport error uninitialized 9p: avoid attaching writeback_fid on mmap with type PRIVATE
2019-09-039p/vfs_super.c: Remove unused parameter data in v9fs_fill_superBharath Vedartham
v9fs_fill_super has a param 'void *data' which is unused in the function. This patch removes the 'void *data' param in v9fs_fill_super and changes the parameters in all function calls of v9fs_fill_super. Link: http://lkml.kernel.org/r/20190523165619.GA4209@bharath12345-Inspiron-5559 Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
2019-08-309p: Fill min and max timestamps in sbDeepa Dinamani
struct p9_wstat and struct p9_stat_dotl indicate that the wire transport uses u32 and u64 fields for timestamps. Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range. Note that the upper bound for V9FS_PROTO_2000L is retained as S64_MAX. This is because that is the upper bound supported by vfs. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Jeff Layton <jlayton@kernel.org> Cc: ericvh@gmail.com Cc: lucho@ionkov.net Cc: asmadeus@codewreck.org Cc: v9fs-developer@lists.sourceforge.net
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 188Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to free software foundation 51 franklin street fifth floor boston ma 02111 1301 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 27 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528170026.981318839@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-019p: switch to ->free_inode()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-17Merge tag '9p-for-5.1' of git://github.com/martinetd/linuxLinus Torvalds
Pull 9p updates from Dominique Martinet: "Here is a 9p update for 5.1; there honestly hasn't been much. Two fixes (leak on invalid mount argument and possible deadlock on i_size update on 32bit smp) and a fall-through warning cleanup" * tag '9p-for-5.1' of git://github.com/martinetd/linux: 9p/net: fix memory leak in p9_client_create 9p: use inode->i_lock to protect i_size_write() under 32-bit 9p: mark expected switch fall-through
2019-03-12mm: refactor readahead defines in mm.hNikolay Borisov
All users of VM_MAX_READAHEAD actually convert it to kbytes and then to pages. Define the macro explicitly as (SZ_128K / PAGE_SIZE). This simplifies the expression in every filesystem. Also rename the macro to VM_READAHEAD_PAGES to properly convey its meaning. Finally remove unused VM_MIN_READAHEAD [akpm@linux-foundation.org: fix fs/io_uring.c, per Stephen] Link: http://lkml.kernel.org/r/20181221144053.24318-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: David Howells <dhowells@redhat.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: David Sterba <dsterba@suse.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-039p: use inode->i_lock to protect i_size_write() under 32-bitHou Tao
Use inode->i_lock to protect i_size_write(), else i_size_read() in generic_fillattr() may loop infinitely in read_seqcount_begin() when multiple processes invoke v9fs_vfs_getattr() or v9fs_vfs_getattr_dotl() simultaneously under 32-bit SMP environment, and a soft lockup will be triggered as show below: watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [stat:2217] Modules linked in: CPU: 5 PID: 2217 Comm: stat Not tainted 5.0.0-rc1-00005-g7f702faf5a9e #4 Hardware name: Generic DT based system PC is at generic_fillattr+0x104/0x108 LR is at 0xec497f00 pc : [<802b8898>] lr : [<ec497f00>] psr: 200c0013 sp : ec497e20 ip : ed608030 fp : ec497e3c r10: 00000000 r9 : ec497f00 r8 : ed608030 r7 : ec497ebc r6 : ec497f00 r5 : ee5c1550 r4 : ee005780 r3 : 0000052d r2 : 00000000 r1 : ec497f00 r0 : ed608030 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: ac48006a DAC: 00000051 CPU: 5 PID: 2217 Comm: stat Not tainted 5.0.0-rc1-00005-g7f702faf5a9e #4 Hardware name: Generic DT based system Backtrace: [<8010d974>] (dump_backtrace) from [<8010dc88>] (show_stack+0x20/0x24) [<8010dc68>] (show_stack) from [<80a1d194>] (dump_stack+0xb0/0xdc) [<80a1d0e4>] (dump_stack) from [<80109f34>] (show_regs+0x1c/0x20) [<80109f18>] (show_regs) from [<801d0a80>] (watchdog_timer_fn+0x280/0x2f8) [<801d0800>] (watchdog_timer_fn) from [<80198658>] (__hrtimer_run_queues+0x18c/0x380) [<801984cc>] (__hrtimer_run_queues) from [<80198e60>] (hrtimer_run_queues+0xb8/0xf0) [<80198da8>] (hrtimer_run_queues) from [<801973e8>] (run_local_timers+0x28/0x64) [<801973c0>] (run_local_timers) from [<80197460>] (update_process_times+0x3c/0x6c) [<80197424>] (update_process_times) from [<801ab2b8>] (tick_nohz_handler+0xe0/0x1bc) [<801ab1d8>] (tick_nohz_handler) from [<80843050>] (arch_timer_handler_virt+0x38/0x48) [<80843018>] (arch_timer_handler_virt) from [<80180a64>] (handle_percpu_devid_irq+0x8c/0x240) [<801809d8>] (handle_percpu_devid_irq) from [<8017ac20>] (generic_handle_irq+0x34/0x44) [<8017abec>] (generic_handle_irq) from [<8017b344>] (__handle_domain_irq+0x6c/0xc4) [<8017b2d8>] (__handle_domain_irq) from [<801022e0>] (gic_handle_irq+0x4c/0x88) [<80102294>] (gic_handle_irq) from [<80101a30>] (__irq_svc+0x70/0x98) [<802b8794>] (generic_fillattr) from [<8056b284>] (v9fs_vfs_getattr_dotl+0x74/0xa4) [<8056b210>] (v9fs_vfs_getattr_dotl) from [<802b8904>] (vfs_getattr_nosec+0x68/0x7c) [<802b889c>] (vfs_getattr_nosec) from [<802b895c>] (vfs_getattr+0x44/0x48) [<802b8918>] (vfs_getattr) from [<802b8a74>] (vfs_statx+0x9c/0xec) [<802b89d8>] (vfs_statx) from [<802b9428>] (sys_lstat64+0x48/0x78) [<802b93e0>] (sys_lstat64) from [<80101000>] (ret_fast_syscall+0x0/0x28) [dominique.martinet@cea.fr: updated comment to not refer to a function in another subsystem] Link: http://lkml.kernel.org/r/20190124063514.8571-2-houtao1@huawei.com Cc: stable@vger.kernel.org Fixes: 7549ae3e81cc ("9p: Use the i_size_[read, write]() macros instead of using inode->i_size directly.") Reported-by: Xing Gaopeng <xingaopeng@huawei.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
2018-04-05fs/9p: don't set SB_NOATIME by defaultYiwen Jiang
When the user uses some syscall, for example mmap(v9fs_file_mmap), it will not update atime even if user's was set mnt_flags without MNT_NOATIME, because v9fs defaults to settine SB_NOATIME in v9fs_set_super. For supporting access time updating when the user mounts with relatime, we should not set SB_NOATIME by default. Link: http://lkml.kernel.org/r/5AB9A377.6080906@huawei.com Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: Greg Kurz <groug@kaod.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-27Rename superblock flags (MS_xyz -> SB_xyz)Linus Torvalds
This is a pure automated search-and-replace of the internal kernel superblock flags. The s_flags are now called SB_*, with the names and the values for the moment mirroring the MS_* flags that they're equivalent to. Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags. The script to do this was: # places to look in; re security/*: it generally should *not* be # touched (that stuff parses mount(2) arguments directly), but # there are two places where we really deal with superblock flags. FILES="drivers/mtd drivers/staging/lustre fs ipc mm \ include/linux/fs.h include/uapi/linux/bfs_fs.h \ security/apparmor/apparmorfs.c security/apparmor/include/lib.h" # the list of MS_... constants SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \ DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \ POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \ I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \ ACTIVE NOUSER" SED_PROG= for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done # we want files that contain at least one of MS_..., # with fs/namespace.c and fs/pnode.c excluded. L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c') for f in $L; do sed -i $f $SED_PROG; done Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-119p: Implement show_optionsDavid Howells
Implement the show_options superblock op for 9p as part of a bid to get rid of s_options and generic_show_options() to make it easier to implement a context-based mount where the mount options can be passed individually over a file descriptor. Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Van Hensbergen <ericvh@gmail.com> cc: Ron Minnich <rminnich@sandia.gov> cc: Latchesar Ionkov <lucho@ionkov.net> cc: v9fs-developer@lists.sourceforge.net Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-209p: Convert to separately allocated bdiJan Kara
Allocate struct backing_dev_info separately instead of embedding it inside session. This unifies handling of bdi among users. CC: Eric Van Hensbergen <ericvh@gmail.com> CC: Ron Minnich <rminnich@sandia.gov> CC: Latchesar Ionkov <lucho@ionkov.net> CC: v9fs-developer@lists.sourceforge.net Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-08v9fs: fix error handling in v9fs_session_init()Tejun Heo
On failure, v9fs_session_init() returns with the v9fs_session_info struct partially initialized and expects the caller to invoke v9fs_session_close() to clean it up; however, it doesn't track whether the bdi is initialized or not and curiously invokes bdi_destroy() in both vfs_session_init() failure path too. A. If v9fs_session_init() fails before the bdi is initialized, the follow-up v9fs_session_close() will invoke bdi_destroy() on an uninitialized bdi. B. If v9fs_session_init() fails after the bdi is initialized, bdi_destroy() will be called twice on the same bdi - once in the failure path of v9fs_session_init() and then by v9fs_session_close(). A is broken no matter what. B used to be okay because bdi_destroy() allowed being invoked multiple times on the same bdi, which BTW was broken in its own way - if bdi_destroy() was invoked on an initialiezd but !registered bdi, it'd fail to free percpu counters. Since f0054bb1e1f3 ("writeback: move backing_dev_info->wb_lock and ->worklist into bdi_writeback"), this no longer work - bdi_destroy() on an initialized but not registered bdi works correctly but multiple invocations of bdi_destroy() is no longer allowed. The obvious culprit here is v9fs_session_init()'s odd and broken error behavior. It should simply clean up after itself on failures. This patch makes the following updates to v9fs_session_init(). * @rc -> @retval error return propagation removed. It didn't serve any purpose. Just use @rc. * Move addition to v9fs_sessionlist to the end of the function so that incomplete sessions are not put on the list or iterated and error path doesn't have to worry about it. * Update error handling so that it cleans up after itself. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-04-15VFS: normal filesystems (and lustre): d_inode() annotationsDavid Howells
that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-109P: introduction of a new cache=mmap model.Dominique Martinet
- Add cache=mmap option - Make mmap read-write while keeping it as synchronous as possible - Build writeback fid on mmap creation if it is writable Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2013-03-03fs: Limit sys_mount to only request filesystem modules.Eric W. Biederman
Modify the request_module to prefix the file system type with "fs-" and add aliases to all of the filesystems that can be built as modules to match. A common practice is to build all of the kernel code and leave code that is not commonly needed as modules, with the result that many users are exposed to any bug anywhere in the kernel. Looking for filesystems with a fs- prefix limits the pool of possible modules that can be loaded by mount to just filesystems trivially making things safer with no real cost. Using aliases means user space can control the policy of which filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf with blacklist and alias directives. Allowing simple, safe, well understood work-arounds to known problematic software. This also addresses a rare but unfortunate problem where the filesystem name is not the same as it's module name and module auto-loading would not work. While writing this patch I saw a handful of such cases. The most significant being autofs that lives in the module autofs4. This is relevant to user namespaces because we can reach the request module in get_fs_type() without having any special permissions, and people get uncomfortable when a user specified string (in this case the filesystem type) goes all of the way to request_module. After having looked at this issue I don't think there is any particular reason to perform any filtering or permission checks beyond making it clear in the module request that we want a filesystem module. The common pattern in the kernel is to call request_module() without regards to the users permissions. In general all a filesystem module does once loaded is call register_filesystem() and go to sleep. Which means there is not much attack surface exposed by loading a filesytem module unless the filesystem is mounted. In a user namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT, which most filesystems do not set today. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Reported-by: Kees Cook <keescook@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-26vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry opJeff Layton
The following set of operations on a NFS client and server will cause server# mkdir a client# cd a server# mv a a.bak client# sleep 30 # (or whatever the dir attrcache timeout is) client# stat . stat: cannot stat `.': Stale NFS file handle Obviously, we should not be getting an ESTALE error back there since the inode still exists on the server. The problem is that the lookup code will call d_revalidate on the dentry that "." refers to, because NFS has FS_REVAL_DOT set. nfs_lookup_revalidate will see that the parent directory has changed and will try to reverify the dentry by redoing a LOOKUP. That of course fails, so the lookup code returns ESTALE. The problem here is that d_revalidate is really a bad fit for this case. What we really want to know at this point is whether the inode is still good or not, but we don't really care what name it goes by or whether the dcache is still valid. Add a new d_op->d_weak_revalidate operation and have complete_walk call that instead of d_revalidate. The intent there is to allow for a "weaker" d_revalidate that just checks to see whether the inode is still good. This is also gives us an opportunity to kill off the FS_REVAL_DOT special casing. [AV: changed method name, added note in porting, fixed confusion re having it possibly called from RCU mode (it won't be)] Cc: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14VFS: Pass mount flags to sget()David Howells
Pass mount flags to sget() so that it can use them in initialising a new superblock before the set function is called. They could also be passed to the compare function. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-28Merge tag 'for-linus-3.4-merge-window' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p changes for the 3.4 merge window from Eric Van Hensbergen. * tag 'for-linus-3.4-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: statfs should not override server f_type net/9p: handle flushed Tclunk/Tremove net/9p: don't allow Tflush to be interrupted
2012-03-20switch open-coded instances of d_make_root() to new helperAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-109p: statfs should not override server f_typeJim Garlick
Allow a 9p2000.L server to supply the statfs f_type value rather than hardwiring V9FS_MAGIC. It is desirable to give the server this option in some applications, e.g. I/O forwarding. Signed-off-by: Jim Garlick <garlick@llnl.gov> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2012-01-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: iattr_valid flags are kernel internal flags map them to 9p values. fs/9p: We should not allocate a new inode when creating hardlines. fs/9p: v9fs_stat2inode should update suid/sgid bits. 9p: Reduce object size with CONFIG_NET_9P_DEBUG fs/9p: check schedule_timeout_interruptible return value Fix up trivial conflicts in fs/9p/{vfs_inode.c,vfs_inode_dotl.c} due to debug messages having changed to use p9_debug() on one hand, and the changes for umode_t on the other.
2012-01-059p: Reduce object size with CONFIG_NET_9P_DEBUGJoe Perches
Reduce object size by deduplicating formats. Use vsprintf extension %pV. Rename P9_DPRINTK uses to p9_debug, align arguments. Add function for _p9_debug and macro to add __func__. Add missing "\n"s to p9_debug uses. Remove embedded function names as p9_debug adds it. Remove P9_EPRINTK macro and convert use to pr_<level>. Add and use pr_fmt and pr_<level>. $ size fs/9p/built-in.o* text data bss dec hex filename 62133 984 16000 79117 1350d fs/9p/built-in.o.new 67342 984 16928 85254 14d06 fs/9p/built-in.o.old $ size net/9p/built-in.o* text data bss dec hex filename 88792 4148 22024 114964 1c114 net/9p/built-in.o.new 94072 4148 23232 121452 1da6c net/9p/built-in.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2012-01-039p: propagate umode_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-09-06fs/9p: Don't update file type when updating file attributesAneesh Kumar K.V
We should only update attributes that we can change on stat2inode. Also do file type initialization in v9fs_init_inode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>