summaryrefslogtreecommitdiff
path: root/fs/ocfs2
AgeCommit message (Collapse)Author
2007-02-07[PATCH] ocfs2 heartbeat: clean up bio submission codePhilipp Reisner
As was already pointed out Mathieu Avila on Thu, 07 Sep 2006 03:15:25 -0700 that OCFS2 is expecting bio_add_page() to add pages to BIOs in an easily predictable manner. That is not true, especially for devices with own merge_bvec_fn(). Therefore OCFS2's heartbeat code is very likely to fail on such devices. Move the bio_put() call into the bio's bi_end_io() function. This makes the whole idea of trying to predict the behaviour of bio_add_page() unnecessary. Removed compute_max_sectors() and o2hb_compute_request_limits(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2: introduce sc->sc_send_lock to protect outbound outbound messagesZhen Wei
When there is a lot of multithreaded I/O usage, two threads can collide while sending out a message to the other nodes. This is due to the lack of locking between threads while sending out the messages. When a connected TCP send(), sendto(), or sendmsg() arrives in the Linux kernel, it eventually comes through tcp_sendmsg(). tcp_sendmsg() protects itself by acquiring a lock at invocation by calling lock_sock(). tcp_sendmsg() then loops over the buffers in the iovec, allocating associated sk_buff's and cache pages for use in the actual send. As it does so, it pushes the data out to tcp for actual transmission. However, if one of those allocation fails (because a large number of large sends is being processed, for example), it must wait for memory to become available. It does so by jumping to wait_for_sndbuf or wait_for_memory, both of which eventually cause a call to sk_stream_wait_memory(). sk_stream_wait_memory() contains a code path that calls sk_wait_event(). Finally, sk_wait_event() contains the call to release_sock(). The following patch adds a lock to the socket container in order to properly serialize outbound requests. From: Zhen Wei <zwei@novell.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Add timeout to dlm join domainSunil Mushran
Currently the ocfs2 dlm has no timeout during dlm join domain. While this is not a problem in normal operation, this does become an issue if, say, the other node is refusing to let the node join the domain because of a stuck recovery. This patch adds a 90 sec timeout. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Silence some messages during join domainSunil Mushran
These messages can easily be activated using the mlog infrastructure and don't need to be enabled by default. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: disallow a domain join if node maps mismatchSrinivas Eeda
There is a small window where a joining node may not see the node(s) that just died but are still part of the domain. To fix this, we must disallow join requests if the joining node has a different node map. A new field node_map is added to dlm_query_join_request to send the current nodes nodemap along with join request. On the receiving end the nodes that are part of the cluster verifies if this new node sees all the nodes that are still part of the cluster. They disallow the join if the maps mismatch. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Ensure correct ordering of set/clear refmap bit on lockresSunil Mushran
Eventhough the set refmap bit message is sent before the clear refmap message, currently there is no guarentee that the set message will be handled before the clear. This patch prevents the clear refmap to be processed while the node is sending assert master messages to other nodes. (The set refmap message is sent as a response to the assert master request). Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2: Binds listener to the configured ip addressSunil Mushran
This patch binds the o2net listener to the configured ip address instead of INADDR_ANY for security. Fixes oss.oracle.com bugzilla#814. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Calling post handler function in assert master handlerKurt Hackel
This patch prevents the dlm from sending the clear refmap message before the set refmap. We use the newly created post function handler routine to accomplish the task. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2: Added post handler callable function in o2net message handlerKurt Hackel
Currently o2net allows one handler function per message type. This patch adds the ability to call another function to be called after the handler has returned the message to the other node. Handlers are now given the option of returning a context (in the form of a void **) which will be passed back into the post message handler function. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Cookies in locks not being printed correctly in error messagesKurt Hackel
The dlm encodes the node number and a sequence number in the lock cookie. It also stores the cookie in the lockres in the big endian format to avoid swapping 8 bytes on each lock request. The bug here was that it was assuming the cookie to be in the cpu format when decoding it for printing the error message. This patch swaps the bytes before the print. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Silence a failed convertKurt Hackel
When the lockres is in migrate or recovery state, all convert requests are denied with the appropriate error status that is handled on the requester node. This patch silences the erroneous error message printed on the master node. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: wake up sleepers on the lockres waitqueueKurt Hackel
The dlm was not waking up threads waiting on the lockres wait queue, waiting for the lockres to be no longer be in the DLM_LOCK_RES_IN_PROGRESS and the DLM_LOCK_RES_MIGRATING states. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Dlm dispatch was stopping too earlyKurt Hackel
dlm_dispatch_work was not processing the queued up tasks at the first sign of the node leaving the domain leading to not only incompleted tasks but also a mismatch in the dlm refcnt. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Drop inflight refmap even if no locks found on the lockresKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Flush dlm workqueue before starting to migrateKurt Hackel
This is to prevent the condition in which a previously queued up assert master asserts after we start the migration. Now migration ensures the workqueue is flushed before proceeding with migrating the lock to another node. This condition is typically encountered during parallel umounts. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Fix migrate lockres handler queue scanningKurt Hackel
The migrate lockres handler was only searching for its lock on migrated lockres on the expected queue. This could be problematic as the new master could have also issued a convert request during the migration and thus moved the lock to the convert queue. We now search for the lock on all three queues. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Make dlmunlock() wait for migration to completeKurt Hackel
dlmunlock() was not waiting for migration to complete before releasing locks on locally mastered locks. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: Fixes race between migrate and dirtyKurt Hackel
dlmthread was removing lockres' from the dirty list and resetting the dirty flag before shuffling the list. This patch retains the dirty state flag until the lists are shuffled. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07[PATCH] fs/ocfs2/dlm/: make functions staticAdrian Bunk
This patch makes some needlessly global functions static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-07ocfs2_dlm: fix cluster-wide refcounting of lock resourcesKurt Hackel
This was previously broken and migration of some locks had to be temporarily disabled. We use a new (and backward-incompatible) set of network messages to account for all references to a lock resources held across the cluster. once these are all freed, the master node may then free the lock resource memory once its local references are dropped. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-02-01ocfs2: ocfs2_link() journal credits updateMark Fasheh
Commit 592282cf2eaa33409c6511ddd3f3ecaa57daeaaa fixed some missing directory c/mtime updates in part by introducing a dinode update in ocfs2_add_entry(). Unfortunately, ocfs2_link() (which didn't update the directory inode before) is now missing a single journal credit. Fix this by doubling the number of inode updates expected during hard link creation. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-01-26[PATCH] ocfs2: fix thinko in ocfs2_backup_super_blkno()Mark Fasheh
Fix a bug which was introduced when I synced up ocfs2_fs.h with ocfs2-tools. We can't do u64/u32 in kernel. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-21ocfs2: Add backup superblock info to ocfs2_fs.hMark Fasheh
This synchronizes us with recent ocfs2-tools changes. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-01-21ocfs2: cleanup ocfs2_iget() errorsMark Fasheh
Get rid of some error prints in the ocfs2_iget() path from ocfs2_get_dentry(). NFSD can easily cause us to read stale inodes. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-01-21ocfs2: Directory c/mtime update fixesMark Fasheh
ocfs2 wasn't updating c/mtime on directories during dirent creation/deletion. Fix ocfs2_unlink(), ocfs2_rename() and __ocfs2_add_entry() by adding the proper code to update the struct inode and push the change out to disk. This helps rename/unlink on nfs exported file systems in particular as those clients compare directory time values to avoid a full re-reading a directory which hasn't changed. ocfs2_rename() loses some superfluous error handling as a result of this patch. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-01-21ocfs2: Don't print errors when following symlinksMark Fasheh
We shouldn't print errors returned from vfs_follow_link(). This was causing spurious errors to show up in the logs. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-28ocfs2: export heartbeat thread pid via configfsZhen Wei
The patch allows the ocfs2 heartbeat thread to prioritize I/O which may help cut down on spurious fencing. Most of this will be in the tools - we can have a pid configfs attribute and let userspace (ocfs2_hb_ctl) calls the ioprio_set syscall after starting heartbeat, but only cfq scheduler supports I/O priorities now. Signed-off-by: Zhen Wei <zwei@novell.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-28ocfs2: always unmap in ocfs2_data_convert_worker()Mark Fasheh
Mmap-heavy clustered workloads were sometimes finding stale data on mmap reads. The solution is to call unmap_mapping_range() on any down convert of a data lock. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-28ocfs2: ignore NULL vfsmnt in ocfs2_should_update_atime()Mark Fasheh
This can come from NFSD. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-28ocfs2: Allow direct I/O read past end of fileMark Fasheh
ocfs2_direct_IO_get_blocks() was incorrectly returning -EIO for a direct I/O read whose start block was past the end of the file allocation tree. Fix things so that we return a hole instead. do_direct_IO() will then notice that the range start is past eof and return a short read. While there, remove the unused vbo_max variable. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-28ocfs2: don't print error in ocfs2_permission()Mark Fasheh
Errors from generic_permission() can happen in valid cases and shouldn't be reported. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-13[PATCH] Fix numerous kcalloc() calls, convert to kzalloc()Robert P. J. Day
All kcalloc() calls of the form "kcalloc(1,...)" are converted to the equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect ordering of the first two arguments are fixed. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Adam Belay <ambx1@neo.rr.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13[PATCH] ocfs2: relative atime supportMark Fasheh
Update ocfs2_should_update_atime() to understand the MNT_RELATIME flag and to test against mtime / ctime accordingly. [akpm@osdl.org: cleanups] Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Valerie Henson <val_henson@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-12Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: [patch 3/3] OCFS2 Configurable timeouts - Protocol changes [patch 2/3] OCFS2 Configurable timeouts [patch 1/3] OCFS2 - Expose struct o2nm_cluster ocfs2: Synchronize feature incompat flags in ocfs2_fs.h ocfs2: update mount option documentation ocfs2: local mounts
2006-12-11[patch 3/3] OCFS2 Configurable timeouts - Protocol changesAndrew Beekhof
Modify the OCFS2 handshake to ensure essential timeouts are configured identically on all nodes. Only allow changes when there are no connected peers Improves the logic in o2net_advance_rx() which broke now that sizeof(struct o2net_handshake) is greater than sizeof(struct o2net_msg) Included is the field for userspace-heartbeat timeout to avoid the need for further protocol changes. Uses a global spinlock to ensure the decisions to update configfs entries are made on the correct value. The region covered by the spinlock when incrementing the counter is much larger as this is the more critical case. Small cleanup contributed by Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Beekhof <abeekhof@suse.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-08[PATCH] struct path: convert ocfs2Josef Sipek
Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[patch 2/3] OCFS2 Configurable timeoutsJeff Mahoney
Allow configuration of OCFS2 timeouts from userspace via configfs Signed-off-by: Andrew Beekhof <abeekhof@suse.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-07[patch 1/3] OCFS2 - Expose struct o2nm_clusterAndrew Beekhof
Subsequent patches (namely userspace heartbeat and configurable timeouts) require access to the o2nm_cluster struct. This patch does the necessary shuffling. Signed-off-by: Andrew Beekhof <abeekhof@suse.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-07ocfs2: Synchronize feature incompat flags in ocfs2_fs.hMark Fasheh
These got a little bit out of date with ocfs2-tools, make things consistent again. We reserve a flag for sparse allocation code as that's pretty close to testable at this point. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-07ocfs2: local mountsSunil Mushran
This allows users to format an ocfs2 file system with a special flag, OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT. When the file system sees this flag, it will not use any cluster services, nor will it require a cluster configuration, thus acting like a 'local' file system. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-07[PATCH] fs/*: trivial vsnprintf() conversionAlexey Dobriyan
It would very lame to get buffer overflow via one of the following. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[PATCH] slab: remove kmem_cache_tChristoph Lameter
Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[PATCH] slab: remove SLAB_NOFSChristoph Lameter
SLAB_NOFS is an alias of GFP_NOFS. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-05Merge branch 'master' of ↵David Howells
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/ata/libata-scsi.c include/linux/libata.h Futher merge of Linus's head and compilation fixups. Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-01ocfs2: implement i_op->permissionTiger Yang
Implement .permission() in ocfs2_file_iops, ocfs2_special_file_iops and ocfs2_dir_iops. This helps us avoid some multi-node races with mode change and vfs operations. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-01ocfs2: update file system paths to set atimeTiger Yang
Conditionally update atime in ocfs2_file_aio_read(), ocfs2_readdir() and ocfs2_mmap(). Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-01ocfs2: core atime update functionsTiger Yang
This patch adds the core routines for updating atime in ocfs2. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-01ocfs2: Add splice supportTiger Yang
Add splice read/write support in ocfs2. ocfs2_file_splice_read/write are very similar to ocfs2_file_aio_read/write. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-01ocfs2: Remove ocfs2_write_should_remove_suid()Mark Fasheh
Use should_remove_suid() instead. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-12-01ocfs2: Remove struct ocfs2_journal_handle in favor of handle_tMark Fasheh
This is mostly a search and replace as ocfs2_journal_handle is now no more than a container for a handle_t pointer. ocfs2_commit_trans() becomes very straight forward, and we remove some out of date comments / code. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>