summaryrefslogtreecommitdiff
path: root/fs/ecryptfs
AgeCommit message (Collapse)Author
2011-04-25eCryptfs: Flush dirty pages in setattrTyler Hicks
After 57db4e8d73ef2b5e94a3f412108dff2576670a8a changed eCryptfs to write-back caching, eCryptfs page writeback updates the lower inode times due to the use of vfs_write() on the lower file. To preserve inode metadata changes, such as 'cp -p' does with utimensat(), we need to flush all dirty pages early in ecryptfs_setattr() so that the user-updated lower inode metadata isn't clobbered later in writeback. https://bugzilla.kernel.org/show_bug.cgi?id=33372 Reported-by: Rocko <rockorequin@hotmail.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-04-25eCryptfs: Handle failed metadata read in lookupTyler Hicks
When failing to read the lower file's crypto metadata during a lookup, eCryptfs must continue on without throwing an error. For example, there may be a plaintext file in the lower mount point that the user wants to delete through the eCryptfs mount. If an error is encountered while reading the metadata in lookup(), the eCryptfs inode's size could be incorrect. We must be sure to reread the plaintext inode size from the metadata when performing an open() or setattr(). The metadata is already being read in those paths, so this adds minimal performance overhead. This patch introduces a flag which will track whether or not the plaintext inode size has been read so that an incorrect i_size can be fixed in the open() or setattr() paths. https://bugs.launchpad.net/bugs/509180 Cc: <stable@kernel.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-04-25eCryptfs: Add reference counting to lower filesTyler Hicks
For any given lower inode, eCryptfs keeps only one lower file open and multiplexes all eCryptfs file operations through that lower file. The lower file was considered "persistent" and stayed open from the first lookup through the lifetime of the inode. This patch keeps the notion of a single, per-inode lower file, but adds reference counting around the lower file so that it is closed when not currently in use. If the reference count is at 0 when an operation (such as open, create, etc.) needs to use the lower file, a new lower file is opened. Since the file is no longer persistent, all references to the term persistent file are changed to lower file. Locking is added around the sections of code that opens the lower file and assign the pointer in the inode info, as well as the code the fputs the lower file when all eCryptfs users are done with it. This patch is needed to fix issues, when mounted on top of the NFSv3 client, where the lower file is left silly renamed until the eCryptfs inode is destroyed. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-04-25eCryptfs: dput dentries returned from dget_parentTyler Hicks
Call dput on the dentries previously returned by dget_parent() in ecryptfs_rename(). This is needed for supported eCryptfs mounts on top of the NFSv3 client. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-04-25eCryptfs: Remove extra d_delete in ecryptfs_rmdirTyler Hicks
vfs_rmdir() already calls d_delete() on the lower dentry. That was being duplicated in ecryptfs_rmdir() and caused a NULL pointer dereference when NFSv3 was the lower filesystem. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-28eCryptfs: write lock requested keysRoberto Sassu
A requested key is write locked in order to prevent modifications on the authentication token while it is being used. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-28eCryptfs: move ecryptfs_find_auth_tok_for_sig() call before mutex_lockRoberto Sassu
The ecryptfs_find_auth_tok_for_sig() call is moved before the mutex_lock(s->tfm_mutex) instruction in order to avoid possible deadlocks that may occur by holding the lock on the two semaphores 'key->sem' and 's->tfm_mutex' in reverse order. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-28eCryptfs: verify authentication tokens before their useRoberto Sassu
Authentication tokens content may change if another requestor calls the update() method of the corresponding key. The new function ecryptfs_verify_auth_tok_from_key() retrieves the authentication token from the provided key and verifies if it is still valid before being used to encrypt or decrypt an eCryptfs file. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> [tyhicks: Minor formatting changes] Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-28eCryptfs: modified size of keysig in the ecryptfs_key_sig structureRoberto Sassu
The size of the 'keysig' array is incremented of one byte in order to make room for the NULL character. The 'keysig' variable is used, in the function ecryptfs_generate_key_packet_set(), to find an authentication token with the given signature and is printed a debug message if it cannot be retrieved. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-28eCryptfs: removed num_global_auth_toks from ecryptfs_mount_crypt_statRoberto Sassu
This patch removes the 'num_global_auth_toks' field of the ecryptfs_mount_crypt_stat structure, used to count the number of items in the 'global_auth_tok_list' list. This variable is not needed because there are no checks based upon it. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-28eCryptfs: ecryptfs_keyring_auth_tok_for_sig() bug fixRoberto Sassu
The pointer '(*auth_tok_key)' is set to NULL in case request_key() fails, in order to prevent its use by functions calling ecryptfs_keyring_auth_tok_for_sig(). Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Cc: <stable@kernel.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-28eCryptfs: Unlock page in write_begin error pathTyler Hicks
Unlock the page in error path of ecryptfs_write_begin(). This may happen, for example, if decryption fails while bring the page up-to-date. Cc: <stable@kernel.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-28ecryptfs: modify write path to encrypt page in writepageThieu Le
Change the write path to encrypt the data only when the page is written to disk in ecryptfs_writepage. Previously, ecryptfs encrypts the page in ecryptfs_write_end which means that if there are multiple write requests to the same page, ecryptfs ends up re-encrypting that page over and over again. This patch minimizes the number of encryptions needed. Signed-off-by: Thieu Le <thieule@chromium.org> [tyhicks: Changed NULL .drop_inode sop pointer to generic_drop_inode] Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-28eCryptfs: Remove ECRYPTFS_NEW_FILE crypt stat flagTyler Hicks
Now that grow_file() is not called in the ecryptfs_create() path, the ECRYPTFS_NEW_FILE flag is no longer needed. It helped ecryptfs_readpage() know not to decrypt zeroes that were read from the lower file in the grow_file() path. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-03-28eCryptfs: Remove unnecessary grow_file() functionTyler Hicks
When creating a new eCryptfs file, the crypto metadata is written out and then the lower file was being "grown" with 4 kB of encrypted zeroes. I suspect that growing the encrypted file was to prevent an information leak that the unencrypted file was empty. However, the unencrypted file size is stored, in plaintext, in the metadata so growing the file is unnecessary. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-02-21eCryptfs: Copy up lower inode attrs in getattrTyler Hicks
The lower filesystem may do some type of inode revalidation during a getattr call. eCryptfs should take advantage of that by copying the lower inode attributes to the eCryptfs inode after a call to vfs_getattr() on the lower inode. I originally wrote this fix while working on eCryptfs on nfsv3 support, but discovered it also fixed an eCryptfs on ext4 nanosecond timestamp bug that was reported. https://bugs.launchpad.net/bugs/613873 Cc: <stable@kernel.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-02-21ecryptfs: read on a directory should return EISDIR if not supportedAndy Whitcroft
read() calls against a file descriptor connected to a directory are incorrectly returning EINVAL rather than EISDIR: [EISDIR] [XSI] [Option Start] The fildes argument refers to a directory and the implementation does not allow the directory to be read using read() or pread(). The readdir() function should be used instead. [Option End] This occurs because we do not have a .read operation defined for ecryptfs directories. Connect this up to generic_read_dir(). BugLink: http://bugs.launchpad.net/bugs/719691 Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-02-21eCryptfs: Handle NULL nameidata pointersTyler Hicks
Allow for NULL nameidata pointers in eCryptfs create, lookup, and d_revalidate functions. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-02-17eCryptfs: Revert "dont call lookup_one_len to avoid NULL nameidata"Tyler Hicks
This reverts commit 21edad32205e97dc7ccb81a85234c77e760364c8 and commit 93c3fe40c279f002906ad14584c30671097d4394, which fixed a regression by the former. Al Viro pointed out bypassed dcache lookups in ecryptfs_new_lower_dentry(), misuse of vfs_path_lookup() in ecryptfs_lookup_one_lower() and a dislike of passing nameidata to the lower filesystem. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-01-17ecryptfs: remove unnecessary decrypt when extending a fileFrank Swiderski
Removes an unecessary page decrypt from ecryptfs_begin_write when the page is beyond the current file size. Previously, the call to ecryptfs_decrypt_page would result in a read of 0 bytes, but still attempt to decrypt an entire page. This patch detects that case and merely zeros the page before marking it up-to-date. Signed-off-by: Frank Swiderski <fes@chromium.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-01-17ecryptfs: Fix ecryptfs_printk() size_t warningsTyler Hicks
Commit cb55d21f6fa19d8c6c2680d90317ce88c1f57269 revealed a number of missing 'z' length modifiers in calls to ecryptfs_printk() when printing variables of type size_t. This patch fixes those compiler warnings. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-01-17fs/ecryptfs: Add printf format/argument verification and fix falloutJoe Perches
Add __attribute__((format... to __ecryptfs_printk Make formats and arguments match. Add casts to (unsigned long long) for %llu. Signed-off-by: Joe Perches <joe@perches.com> [tyhicks: 80 columns cleanup and fixed typo] Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-01-17ecryptfs: fixed testing of file descriptor flagsRoberto Sassu
This patch replaces the check (lower_file->f_flags & O_RDONLY) with ((lower_file & O_ACCMODE) == O_RDONLY). Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-01-17ecryptfs: test lower_file pointer when lower_file_mutex is lockedRoberto Sassu
This patch prevents the lower_file pointer in the 'ecryptfs_inode_info' structure to be checked when the mutex 'lower_file_mutex' is not locked. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-01-17ecryptfs: missing initialization of the superblock 'magic' fieldRoberto Sassu
This patch initializes the 'magic' field of ecryptfs filesystems to ECRYPTFS_SUPER_MAGIC. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> [tyhicks: merge with 66cb76666d69] Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-01-17ecryptfs: moved ECRYPTFS_SUPER_MAGIC definition to linux/magic.hRoberto Sassu
The definition of ECRYPTFS_SUPER_MAGIC has been moved to the include file 'linux/magic.h' to become available to other kernel subsystems. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-01-17ecryptfs: fix truncation error in ecryptfs_read_update_atimeEdward Shishkin
This is similar to the bug found in direct-io not so long ago. Fix up truncation (ssize_t->int). This only matters with >2G reads/writes, which the kernel doesn't permit. Signed-off-by: Edward Shishkin <edward.shishkin@gmail.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Eric Sandeen <esandeen@redhat.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-01-13ecryptfs: fix broken buildLinus Torvalds
Stephen Rothwell reports that the vfs merge broke the build of ecryptfs. The breakage comes from commit 66cb76666d69 ("sanitize ecryptfs ->mount()") which was obviously not even build tested. Tssk, tssk, Al. This is the minimal build fixup for the situation, although I don't have a filesystem to actually test it with. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-12sanitize ecryptfs ->mount()Al Viro
kill ecryptfs_read_super(), reorder code allowing to use normal d_alloc_root() instead of opencoding it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-01-07fs: provide rcu-walk aware permission i_opsNick Piggin
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: rcu-walk aware d_revalidate methodNick Piggin
Require filesystems be aware of .d_revalidate being called in rcu-walk mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning -ECHILD from all implementations. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: dcache reduce branches in lookup pathNick Piggin
Reduce some branches and memory accesses in dcache lookup by adding dentry flags to indicate common d_ops are set, rather than having to check them. This saves a pointer memory access (dentry->d_op) in common path lookup situations, and saves another pointer load and branch in cases where we have d_op but not the particular operation. Patched with: git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: icache RCU free inodesNick Piggin
RCU free the struct inode. This will allow: - Subsequent store-free path walking patch. The inode must be consulted for permissions when walking, so an RCU inode reference is a must. - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want to take i_lock no longer need to take sb_inode_list_lock to walk the list in the first place. This will simplify and optimize locking. - Could remove some nested trylock loops in dcache code - Could potentially simplify things a bit in VM land. Do not need to take the page lock to follow page->mapping. The downsides of this is the performance cost of using RCU. In a simple creat/unlink microbenchmark, performance drops by about 10% due to inability to reuse cache-hot slab objects. As iterations increase and RCU freeing starts kicking over, this increases to about 20%. In cases where inode lifetimes are longer (ie. many inodes may be allocated during the average life span of a single inode), a lot of this cache reuse is not applicable, so the regression caused by this patch is smaller. The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU, however this adds some complexity to list walking and store-free path walking, so I prefer to implement this at a later date, if it is shown to be a win in real situations. I haven't found a regression in any non-micro benchmark so I doubt it will be a problem. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: dcache scale dentry refcountNick Piggin
Make d_count non-atomic and protect it with d_lock. This allows us to ensure a 0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when we start protecting many other dentry members with d_lock. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: change d_hash for rcu-walkNick Piggin
Change d_hash so it may be called from lock-free RCU lookups. See similar patch for d_compare for details. For in-tree filesystems, this is just a mechanical change. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2010-11-17BKL: remove extraneous #include <smp_lock.h>Arnd Bergmann
The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-29Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6: eCryptfs: Print mount_auth_tok_only param in ecryptfs_show_options ecryptfs: added ecryptfs_mount_auth_tok_only mount parameter ecryptfs: checking return code of ecryptfs_find_auth_tok_for_sig() ecryptfs: release keys loaded in ecryptfs_keyring_auth_tok_for_sig() eCryptfs: Clear LOOKUP_OPEN flag when creating lower file ecryptfs: call vfs_setxattr() in ecryptfs_setxattr()
2010-10-29eCryptfs: Print mount_auth_tok_only param in ecryptfs_show_optionsTyler Hicks
When printing mount options, print the new ecryptfs_mount_auth_tok_only mount option. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-10-29ecryptfs: added ecryptfs_mount_auth_tok_only mount parameterRoberto Sassu
This patch adds a new mount parameter 'ecryptfs_mount_auth_tok_only' to force ecryptfs to use only authentication tokens which signature has been specified at mount time with parameters 'ecryptfs_sig' and 'ecryptfs_fnek_sig'. In this way, after disabling the passthrough and the encrypted view modes, it's possible to make available to users only files encrypted with the specified authentication token. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: James Morris <jmorris@namei.org> [Tyler: Clean up coding style errors found by checkpatch] Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-10-29ecryptfs: checking return code of ecryptfs_find_auth_tok_for_sig()Roberto Sassu
This patch replaces the check of the 'matching_auth_tok' pointer with the exit status of ecryptfs_find_auth_tok_for_sig(). This avoids to use authentication tokens obtained through the function ecryptfs_keyring_auth_tok_for_sig which are not valid. Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-10-29ecryptfs: release keys loaded in ecryptfs_keyring_auth_tok_for_sig()Roberto Sassu
This patch allows keys requested in the function ecryptfs_keyring_auth_tok_for_sig()to be released when they are no longer required. In particular keys are directly released in the same function if the obtained authentication token is not valid. Further, a new function parameter 'auth_tok_key' has been added to ecryptfs_find_auth_tok_for_sig() in order to provide callers the key pointer to be passed to key_put(). Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: James Morris <jmorris@namei.org> [Tyler: Initialize auth_tok_key to NULL in ecryptfs_parse_packet_set] Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-10-29eCryptfs: Clear LOOKUP_OPEN flag when creating lower fileTyler Hicks
eCryptfs was passing the LOOKUP_OPEN flag through to the lower file system, even though ecryptfs_create() doesn't support the flag. A valid filp for the lower filesystem could be returned in the nameidata if the lower file system's create() function supported LOOKUP_OPEN, possibly resulting in unencrypted writes to the lower file. However, this is only a potential problem in filesystems (FUSE, NFS, CIFS, CEPH, 9p) that eCryptfs isn't known to support today. https://bugs.launchpad.net/ecryptfs/+bug/641703 Reported-by: Kevin Buhr Cc: stable <stable@kernel.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-10-29ecryptfs: call vfs_setxattr() in ecryptfs_setxattr()Roberto Sassu
Ecryptfs is a stackable filesystem which relies on lower filesystems the ability of setting/getting extended attributes. If there is a security module enabled on the system it updates the 'security' field of inodes according to the owned extended attribute set with the function vfs_setxattr(). When this function is performed on a ecryptfs filesystem the 'security' field is not updated for the lower filesystem since the call security_inode_post_setxattr() is missing for the lower inode. Further, the call security_inode_setxattr() is missing for the lower inode, leading to policy violations in the security module because specific checks for this hook are not performed (i. e. filesystem 'associate' permission on SELinux is not checked for the lower filesystem). This patch replaces the call of the setxattr() method of the lower inode in the function ecryptfs_setxattr() with vfs_setxattr(). Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Cc: stable <stable@kernel.org> Cc: Dustin Kirkland <kirkland@canonical.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-10-29convert ecryptfsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-24Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Update broken web addresses in arch directory. Update broken web addresses in the kernel. Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget Revert "Fix typo: configuation => configuration" partially ida: document IDA_BITMAP_LONGS calculation ext2: fix a typo on comment in ext2/inode.c drivers/scsi: Remove unnecessary casts of private_data drivers/s390: Remove unnecessary casts of private_data net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data drivers/infiniband: Remove unnecessary casts of private_data drivers/gpu/drm: Remove unnecessary casts of private_data kernel/pm_qos_params.c: Remove unnecessary casts of private_data fs/ecryptfs: Remove unnecessary casts of private_data fs/seq_file.c: Remove unnecessary casts of private_data arm: uengine.c: remove C99 comments arm: scoop.c: remove C99 comments Fix typo configue => configure in comments Fix typo: configuation => configuration Fix typo interrest[ing|ed] => interest[ing|ed] Fix various typos of valid in comments ... Fix up trivial conflicts in: drivers/char/ipmi/ipmi_si_intf.c drivers/usb/gadget/rndis.c net/irda/irnet/irnet_ppp.c
2010-10-22Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bklLinus Torvalds
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek
2010-10-15llseek: automatically add .llseek fopArnd Bergmann
All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
2010-10-04BKL: Remove BKL from ecryptfsArnd Bergmann
The BKL is only used in fill_super, which is protected by the superblocks s_umount rw_semaphorei, and in fasync, which does not do anything that could require the BKL. Therefore it is safe to remove the BKL entirely. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Cc: ecryptfs-devel@lists.launchpad.net
2010-10-04BKL: Explicitly add BKL around get_sb/fill_superJan Blunck
This patch is a preparation necessary to remove the BKL from do_new_mount(). It explicitly adds calls to lock_kernel()/unlock_kernel() around get_sb/fill_super operations for filesystems that still uses the BKL. I've read through all the code formerly covered by the BKL inside do_kern_mount() and have satisfied myself that it doesn't need the BKL any more. do_kern_mount() is already called without the BKL when mounting the rootfs and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called from various places without BKL: simple_pin_fs(), nfs_do_clone_mount() through nfs_follow_mountpoint(), afs_mntpt_do_automount() through afs_mntpt_follow_link(). Both later functions are actually the filesystems follow_link inode operation. vfs_kern_mount() is calling the specified get_sb function and lets the filesystem do its job by calling the given fill_super function. Therefore I think it is safe to push down the BKL from the VFS to the low-level filesystems get_sb/fill_super operation. [arnd: do not add the BKL to those file systems that already don't use it elsewhere] Signed-off-by: Jan Blunck <jblunck@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Christoph Hellwig <hch@infradead.org>