linux.git - dakr's fork of kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Age	Commit message (Collapse)	Author
2020-01-20	fs/adfs: super: extract filesystem block probe	Russell King
	Separate the filesystem block probing from the superblock filling so we can support other ADFS filesystem formats, such as the single-zone E and E+ floppy image formats which do not have a boot block. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: remove debug in adfs_dir_update()	Russell King
	Remove the noisy debug in adfs_dir_update(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: super: fix inode dropping	Russell King
	When we have write support enabled, we must not drop inodes before they have been written back, otherwise we lose updates to the filesystem on umount. Keep the inodes around unless we are built in read-only mode, or we are mounted read-only. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: bigdir: implement directory update support	Russell King
	Implement big directory entry update support in the same way that we do for new directories. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: bigdir: calculate and validate directory checkbyte	Russell King
	When reading a big directory, calculate the validate the directory checkbyte to ensure that the directory contents are valid. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: bigdir: directory validation strengthening	Russell King
	Strengthen the directory validation by ensuring that the header fields contain sensible values that fit inside the directory, and limit the directory size to 4MB as per RISC OS requirements. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: bigdir: extract directory validation	Russell King
	Extract the directory validation from the directory reading function as we will want to re-use this code. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: bigdir: factor out directory entry offset calculation	Russell King
	Factor out the directory entry byte offset calculation. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: newdir: split out directory commit from update	Russell King
	After changing a directory, we need to update the sequence numbers and calculate the new check byte before the directory is scheduled to be written back to the media. Since this needs to happen for any change to the directory, move this into a separate method. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: newdir: clean up adfs_f_update()	Russell King
	__adfs_dir_put() and adfs_dir_find_entry() are only called from adfs_f_update(), so move them into this function, removing some unnecessary entry copying by doing so. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: newdir: merge adfs_dir_read() into adfs_f_read()	Russell King
	adfs_dir_read() is only called from adfs_f_read(), so merge it into that function. As new directories are always 2048 bytes in size, (which we rely on elsewhere) we can consolidate some of the code. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: newdir: improve directory validation	Russell King
	Check that the lastmask and reserved fields are all zero, as per the documentation. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: newdir: factor out directory format validation	Russell King
	We have two locations where we validate the new directory format, so factor this out to a helper. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: use pointers to access directory head/tails	Russell King
	Add and use pointers in the adfs_dir structure to access the directory head and tail structures, which will always be contiguous in a buffer. This allows us to avoid memcpy()ing the data in the new directory code, making it slightly more efficient. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: add more efficient iterate() per-format method	Russell King
	Rather than using setpos + getnext to iterate through the directory entries, pass iterate() down to the dir format code to populate the dirents. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: switch to iterate_shared method	Russell King
	There is nothing in our readdir (aka iterate) method that relies on the directory inode being exclusively locked, so switch to using the iterate_shared() hook rather than iterate(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: improve compiler coverage in adfs_dir_update	Russell King
	Get rid of the ifdef, using IS_ENABLED() instead to detect whether the code should be callable. This allows the compiler to always parse the following code, reducing the chances of errors being missed. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: improve update failure handling	Russell King
	When we update a directory, a number of errors may happen. If we failed to find the entry to update, we can just release the directory buffers as normal. However, if we have some other error, we may have partially updated the buffers, resulting in an invalid directory. In this case, we need to discard the buffers to avoid writing the contents back to the media, and later re-read the directory from the media. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: modernise on-disk directory structures	Russell King
	Use __u8 and pack the structures for on-disk directories. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: update directory locking	Russell King
	Update directory locking such that it covers the validation of the directory, which could fail if another thread is concurrently writing to the same directory. Since we may sleep, we need to use a rwsem rather than a rw spinlock. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: add helper to mark directory buffers dirty	Russell King
	Provide a helper for marking directory buffers dirty so they get written back to disk. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: add helper to read directory using inode	Russell King
	Add a helper to read a directory using the inode, which we do in two places. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: add generic directory reading	Russell King
	Both directory formats code the mechanics of fetching the directory buffers using their own implementations. Consolidate these into one implementation. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: add generic copy functions	Russell King
	Directories can span multiple buffers, and we currently open-code memcpy access to these buffers, including dealing with entries that are split across multiple buffers. Such code exists in both directory format implementations. Provide common functions to allow data to be copied from/to the directory buffers as if they were a contiguous set of buffers, and use them when accessing directories. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: add common directory sync method	Russell King
	adfs_fplus_sync() can be used for both directory formats since we now have a common way to access the buffer heads, so move it into dir.c and appropriately rename it. Remove the directory-format specific implementations. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: add common directory buffer release method	Russell King
	With the bhs pointer in place, we have no need for separate per-format free() methods, since a generic version will do. Provide a generic implementation, remove the format specific implementations and the method function pointer. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: add common dir object initialisation	Russell King
	Initialise the dir object before we pass it down to the directory format specific read handler. This allows us to get rid of the initialisation inside those handlers. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: dir: rename bh_fplus to bhs	Russell King
	Rename bh_fplus to bhs in preparation to make some of the directory handling code sharable between implementations. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: map: fix map scanning	Russell King
	When scanning the map for a fragment id, we need to keep track of the free space links, so we don't inadvertently believe that the freespace link is a valid fragment id. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: map: move map-specific sb initialisation to map.c	Russell King
	Move map specific superblock initialisation to map.c, rather than having it spread into super.c. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: map: use find_next_bit_le() rather than open coding it	Russell King
	Use find_next_bit_le() to find the end of a fragment in the map rather than open-coding this functionality. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: map: incorporate map offsets into layout	Russell King
	lookup_zone() and scan_free_map() cope in different ways with the location of the map data within a zone: 1. lookup_zone() adds a four byte offset to the map data pointer to skip over the check and free link bytes. 2. scan_free_map() needs to use the free link pointer, which is an offset from itself, so we end up adding a 32-bit offset to the end pointer (aka mapsize) which is really confusing. Rename mapsize to endbit as this is really what it is, and incorporate the 32-bit offset into the map layout. This means that both dm_startbit and dm_endbit are now bit offsets from the start of the buffer, rather than four bytes in to the buffer. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: map: factor out map cleanup	Russell King
	We have several places which deal with releasing the map buffers and freeing the map array. Provide a helper for this. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: map: break up adfs_read_map()	Russell King
	Split up adfs_read_map() into separate helpers to layout the map, read the map, and release the map buffers. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: map: rename adfs_map_free() to adfs_map_statfs()	Russell King
	adfs_map_free() is not obvious whether it is freeing the map or returning the number of free blocks on the filesystem. Rename it to the more generic statfs() to make it clear that it's a statistic function. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: map: move map reading and validation to map.c	Russell King
	Keep all the map code together in map.c, rather than having some in super.c Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: inode: fix adfs_mode2atts()	Russell King
	Fix adfs_mode2atts() to actually update the file permissions on the media rather than using the current inode mode. Note also that directories do not have read/write permissions stored on the media. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-20	fs/adfs: inode: update timestamps to centisecond precision	Russell King
	Despite ADFS timestamps having centi-second granularity, and Linux gaining fine-grained timestamp support in v2.5.48, fs/adfs was never updated. Update fs/adfs to centi-second support, and ensure that the inode ctime always reflects what is written in underlying media. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-12-08	Merge tag '5.5-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6	Linus Torvalds
	Pull cifs fixes from Steve French: "Nine cifs/smb3 fixes: - one fix for stable (oops during oplock break) - two timestamp fixes including important one for updating mtime at close to avoid stale metadata caching issue on dirty files (also improves perf by using SMB2_CLOSE_FLAG_POSTQUERY_ATTRIB over the wire) - two fixes for "modefromsid" mount option for file create (now allows mode bits to be set more atomically and accurately on create by adding "sd_context" on create when modefromsid specified on mount) - two fixes for multichannel found in testing this week against different servers - two small cleanup patches" * tag '5.5-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: smb3: improve check for when we send the security descriptor context on create smb3: fix mode passed in on create for modetosid mount option cifs: fix possible uninitialized access and race on iface_list cifs: Fix lookup of SMB connections on multichannel smb3: query attributes on file close smb3: remove unused flag passed into close functions cifs: remove redundant assignment to pointer pneg_ctxt fs: cifs: Fix atime update check vs mtime CIFS: Fix NULL-pointer dereference in smb2_push_mandatory_locks
2019-12-08	Merge branch 'work.misc' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs cleanups from Al Viro: "No common topic, just three cleanups". * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: make __d_alloc() static fs/namespace: add __user to open_tree and move_mount syscalls fs/fnctl: fix missing __user in fcntl_rw_hint()
2019-12-07	Merge tag 'iomap-5.5-merge-14' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux	Linus Torvalds
	Pull iomap fixes from Darrick Wong: "Fix a race condition and a use-after-free error: - Fix a UAF when reporting writeback errors - Fix a race condition when handling page uptodate on fragmented file with blocksize < pagesize" * tag 'iomap-5.5-merge-14' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: stop using ioend after it's been freed in iomap_finish_ioend() iomap: fix sub-page uptodate handling
2019-12-07	Merge tag 'xfs-5.5-merge-17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux	Linus Torvalds
	Pull xfs fixes from Darrick Wong: "Fix a couple of resource management errors and a hang: - fix a crash in the log setup code when log mounting fails - fix a hang when allocating space on the realtime device - fix a block leak when freeing space on the realtime device" * tag 'xfs-5.5-merge-17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix mount failure crash on invalid iclog memory access xfs: don't check for AG deadlock for realtime files in bunmapi xfs: fix realtime file data space leak
2019-12-07	Merge tag 'for-linus-5.5-ofs1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux Pull orangefs update from Mike Marshall: "orangefs: posix open permission checking... Orangefs has no open, and orangefs checks file permissions on each file access. Posix requires that file permissions be checked on open and nowhere else. Orangefs-through-the-kernel needs to seem posix compliant. The VFS opens files, even if the filesystem provides no method. We can see if a file was successfully opened for read and or for write by looking at file->f_mode. When writes are flowing from the page cache, file is no longer available. We can trust the VFS to have checked file->f_mode before writing to the page cache. The mode of a file might change between when it is opened and IO commences, or it might be created with an arbitrary mode. We'll make sure we don't hit EACCES during the IO stage by using UID 0" [ This is "posixish", but not a great solution in the long run, since a proper secure network server shouldn't really trust the client like this. But proper and secure POSIX behavior requires an open method and a resulting cookie for IO of some kind, or similar. - Linus ] * tag 'for-linus-5.5-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: orangefs: posix open permission checking...
2019-12-07	Merge tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux	Linus Torvalds
	Pull nfsd updates from Bruce Fields: "This is a relatively quiet cycle for nfsd, mainly various bugfixes. Possibly most interesting is Trond's fixes for some callback races that were due to my incomplete understanding of rpc client shutdown. Unfortunately at the last minute I've started noticing a new intermittent failure to send callbacks. As the logic seems basically correct, I'm leaving Trond's patches in for now, and hope to find a fix in the next week so I don't have to revert those patches" * tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux: (24 commits) nfsd: depend on CRYPTO_MD5 for legacy client tracking NFSD fixing possible null pointer derefering in copy offload nfsd: check for EBUSY from vfs_rmdir/vfs_unink. nfsd: Ensure CLONE persists data and metadata changes to the target file SUNRPC: Fix backchannel latency metrics nfsd: restore NFSv3 ACL support nfsd: v4 support requires CRYPTO_SHA256 nfsd: Fix cld_net->cn_tfm initialization lockd: remove __KERNEL__ ifdefs sunrpc: remove __KERNEL__ ifdefs race in exportfs_decode_fh() nfsd: Drop LIST_HEAD where the variable it declares is never used. nfsd: document callback_wq serialization of callback code nfsd: mark cb path down on unknown errors nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback() nfsd: minor 4.1 callback cleanup SUNRPC: Fix svcauth_gss_proxy_init() SUNRPC: Trace gssproxy upcall results sunrpc: fix crash when cache_head become valid before update nfsd: remove private bin2hex implementation ...
2019-12-07	Merge tag 'nfs-for-5.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds
	Pull NFS client updates from Trond Myklebust: "Highlights include: Features: - NFSv4.2 now supports cross device offloaded copy (i.e. offloaded copy of a file from one source server to a different target server). - New RDMA tracepoints for debugging congestion control and Local Invalidate WRs. Bugfixes and cleanups - Drop the NFSv4.1 session slot if nfs4_delegreturn_prepare waits for layoutreturn - Handle bad/dead sessions correctly in nfs41_sequence_process() - Various bugfixes to the delegation return operation. - Various bugfixes pertaining to delegations that have been revoked. - Cleanups to the NFS timespec code to avoid unnecessary conversions between timespec and timespec64. - Fix unstable RDMA connections after a reconnect - Close race between waking an RDMA sender and posting a receive - Wake pending RDMA tasks if connection fails - Fix MR list corruption, and clean up MR usage - Fix another RPCSEC_GSS issue with MIC buffer space" * tag 'nfs-for-5.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits) SUNRPC: Capture completion of all RPC tasks SUNRPC: Fix another issue with MIC buffer space NFS4: Trace lock reclaims NFS4: Trace state recovery operation NFSv4.2 fix memory leak in nfs42_ssc_open NFSv4.2 fix kfree in __nfs42_copy_file_range NFS: remove duplicated include from nfs4file.c NFSv4: Make _nfs42_proc_copy_notify() static NFS: Fallocate should use the nfs4_fattr_bitmap NFS: Return -ETXTBSY when attempting to write to a swapfile fs: nfs: sysfs: Remove NULL check before kfree NFS: remove unneeded semicolon NFSv4: add declaration of current_stateid NFSv4.x: Drop the slot if nfs4_delegreturn_prepare waits for layoutreturn NFSv4.x: Handle bad/dead sessions correctly in nfs41_sequence_process() nfsv4: Move NFSPROC4_CLNT_COPY_NOTIFY to end of list SUNRPC: Avoid RPC delays when exiting suspend NFS: Add a tracepoint in nfs_fh_to_dentry() NFSv4: Don't retry the GETATTR on old stateid in nfs4_delegreturn_done() NFSv4: Handle NFS4ERR_OLD_STATEID in delegreturn ...
2019-12-07	smb3: improve check for when we send the security descriptor context on create	Steve French
	We had cases in the previous patch where we were sending the security descriptor context on SMB3 open (file create) in cases when we hadn't mounted with with "modefromsid" mount option. Add check for that mount flag before calling ad_sd_context in open init. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-12-07	pipe: don't use 'pipe_wait() for basic pipe IO	Linus Torvalds
	pipe_wait() may be simple, but since it relies on the pipe lock, it means that we have to do the wakeup while holding the lock. That's unfortunate, because the very first thing the waked entity will want to do is to get the pipe lock for itself. So get rid of the pipe_wait() usage by simply releasing the pipe lock, doing the wakeup (if required) and then using wait_event_interruptible() to wait on the right condition instead. wait_event_interruptible() handles races on its own by comparing the wakeup condition before and after adding itself to the wait queue, so you can use an optimistic unlocked condition for it. Cc: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-07	pipe: remove 'waiting_writers' merging logic	Linus Torvalds
	This code is ancient, and goes back to when we only had a single page for the pipe buffers. The exact history is hidden in the mists of time (ie "before git", and in fact predates the BK repository too). At that long-ago point in time, it actually helped to try to merge big back-and-forth pipe reads and writes, and not limit pipe reads to the single pipe buffer in length just because that was all we had at a time. However, since then we've expanded the pipe buffers to multiple pages, and this logic really doesn't seem to make sense. And a lot of it is somewhat questionable (ie "hmm, the user asked for a non-blocking read, but we see that there's a writer pending, so let's wait anyway to get the extra data that the writer will have"). But more importantly, it makes the "go to sleep" logic much less obvious, and considering the wakeup issues we've had, I want to make for less of those kinds of things. Cc: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-07	pipe: fix and clarify pipe read wakeup logic	Linus Torvalds
	This is the read side version of the previous commit: it simplifies the logic to only wake up waiting writers when necessary, and makes sure to use a synchronous wakeup. This time not so much for GNU make jobserver reasons (that pipe never fills up), but simply to get the writer going quickly again. A bit less verbose commentary this time, if only because I assume that the write side commentary isn't going to be ignored if you touch this code. Cc: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-07	pipe: fix and clarify pipe write wakeup logic	Linus Torvalds
	The pipe rework ends up having been extra painful, partly becaused of actual bugs with ordering and caching of the pipe state, but also because of subtle performance issues. In particular, the pipe rework caused the kernel build to inexplicably slow down. The reason turns out to be that the GNU make jobserver (which limits the parallelism of the build) uses a pipe to implement a "token" system: a parallel submake will read a character from the pipe to get the job token before starting a new job, and will write a character back to the pipe when it is done. The overall job limit is thus easily controlled by just writing the appropriate number of initial token characters into the pipe. But to work well, that really means that the old behavior of write wakeups being synchronous (WF_SYNC) is very important - when the pipe writer wakes up a reader, we want the reader to actually get scheduled immediately. Otherwise you lose the parallelism of the build. The pipe rework lost that synchronous wakeup on write, and we had clearly all forgotten the reasons and rules for it. This rewrites the pipe write wakeup logic to do the required Wsync wakeups, but also clarifies the logic and avoids extraneous wakeups. It also ends up addign a number of comments about what oit does and why, so that we hopefully don't end up forgetting about this next time we change this code. Cc: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>