summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-07-09NFS: Use NFSDBG_FILE for all fopsChuck Lever
Clean up: some fops use NFSDBG_FILE, some use NFSDBG_VFS. Let's use NFSDBG_FILE for all fops, and consistently report file names instead of inode numbers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Add debugging facility for NFS aopsChuck Lever
Recent work in fs/nfs/file.c neglected to add appropriate trace debugging for the NFS client's address space operations. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Make nfs_open methods consistentChuck Lever
Clean up: Report the same debugging info and count function calls the same for files and directories in nfs_opendir() and nfs_file_open(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Make nfs_llseek methods consistentChuck Lever
Clean up: Report the same debugging info in nfs_llseek_dir() and nfs_llseek_file(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Make nfs_fsync methods consistentChuck Lever
Clean up: Report the same debugging info, count function calls the same, and use similar function naming in nfs_fsync_dir() and nfs_fsync(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09SUNRPC: Display some debugging information as text rather than numbersChuck Lever
In rpc_show_tasks(), display the program name, version number, procedure name and tk_action as human-readable variable-length text fields rather than columnar numbers. Doing the symbol lookup here helps in cases where we have actual debugging output from a kernel log, but don't have access to the kernel image or RPC module that generated the output. Sample output: -pid- flgs status -client- --rqstp- -timeout ---ops-- 5608 0001 -11 eeb42690 f6d93710 0 f8fa1764 nfsv3 WRITE a:call_transmit_status q:none 5609 0001 -11 eeb42690 f6d937e0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5610 0001 -11 eeb42690 f6d93230 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5611 0001 -11 eeb42690 f6d93300 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5612 0001 -11 eeb42690 f6d93090 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5613 0001 -11 eeb42690 f6d933d0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5614 0001 -11 eeb42690 f6d93cc0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5615 0001 -11 eeb42690 f6d93a50 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5616 0001 -11 eeb42690 f6d93640 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5617 0001 -11 eeb42690 f6d93b20 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5618 0001 -11 eeb42690 f6d93160 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09SUNRPC: Refactor rpc_show_tasksChuck Lever
Clean up: move the logic that displays each task to its own function. This removes indentation and makes future changes easier. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09SUNRPC: Don't display the rpc_show_tasks header if there are no tasksChuck Lever
Clean up: don't display the rpc_show_tasks column header unless there is at least one task to display. As far as I can tell, it is safe to let the list_for_each_entry macro decide that each list is empty. scripts/checkpatch.pl also wants a KERN_FOO at the start of any newly added printk() calls, so this and subsequent patches will also add KERN_INFO. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09SUNRPC: Rename "call_" functions that are no longer FSM statesChuck Lever
The RPC client uses a finite state machine to move RPC tasks through each step of an RPC request. Each state is contained in a function in net/sunrpc/clnt.c, and named call_foo. Some of the functions named call_foo have changed over the past few years and are no longer states in the FSM. These include: call_encode, call_header, and call_verify. As a clean up, rename the functions that have changed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09SUNRPC: Add a function to display the name of an RPC procedureChuck Lever
Improve debugging messages in call_start() and call_verify() by having them show the RPC procedure name instead of the procedure number. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Update help text for CONFIG_NFS_FSChuck Lever
Clean up: refresh the help text for Kconfig items related to the NFS client. Remove obsolete URLs, and make the language consistent among the options. Also move the ROOT_NFS config option next to the options related to the NFS client. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: do_setlk(): don't flush caches when we have a delegationTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09SUNRPC: Use GFP_NOFS when allocating credentialsTrond Myklebust
Since the credentials may be allocated during the call to rpc_new_task(), which again may be called by a memory allocator... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Revert commit 44dd151dTrond Myklebust
Revert commit 44dd151d "NFS: Don't mark a written page as uptodate until it is on disk". While it is true that the write may fail, that is always the case. There is no reason why we should treat data on pages that are not already marked as PG_uptodate as being special. The only thing we gain is a noticeable slowdown when re-reading these pages. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Optimise append writes with holesTrond Myklebust
If a file is being extended, and we're creating a hole, we might as well declare the entire page to be up to date. This patch significantly improves the write performance for sparse files in the case where lseek(SEEK_END) is used to append several non-contiguous writes at intervals of < PAGE_SIZE. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09SUNRPC: An ENOMEM error from call_encode is always fatalTrond Myklebust
The special 'ENOMEM' case that was previously flagged as non-fatal is bogus: auth_gss always returns EAGAIN for non-fatal errors, and may in fact return ENOMEM in the special case where xdr_buf_read_netobj runs out of preallocated buffer space (invariably a _fatal_ error, since there is no provision for preallocating larger buffers). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09SUNRPC: Ensure we exit early in case of an encode errorTrond Myklebust
All errors from call_encode(), with exception of EAGAIN are fatal, so we should immediately return instead of proceeding to xprt_transmit(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Add correct bounds checking to NFSv2 locksTrond Myklebust
NFSv2 file locking currently fails the Connectathon tests, because the calls to the VFS locking code do not return an EINVAL error if the struct file_lock overflows the 32-bit boundaries. The problem is due to the fact that we occasionally call helpers from fs/locks.c in order to avoid RPC calls to the server when we know that a local process holds the lock. These helpers are, of course, always 64-bit enabled, so EINVAL is not returned in cases when it would if the call had gone to the NLM code. For consistency, we therefore add support for a bounds-checking helper. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Fix a preemption count leak in nfs_update_requestTrond Myklebust
The commit 2785259631697ebb0749a3782cca206e2e542939 (nfs: use GFP_NOFS preloads for radix-tree insertion) appears to have introduced a bug: We only want to call radix_tree_preload() once after creating a request. Calling it every time we loop after we created the request, will cause preemption count leaks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Nick Piggin <npiggin@suse.de>
2008-07-09NFS: Reduce the stack usage in NFSv3 create operationsTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09NFS: Reduce the stack usage in NFSv4 create operationsTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-08Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds
* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: SUNRPC: Fix an rpcbind breakage for the case of IPv6 lookups SUNRPC: Fix a double-free in rpcbind NFS: Fix readdir cache invalidation
2008-07-08Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] Fix 32bit kernels on R4k with 128 byte cache line size [MIPS] Atlas, decstation: Fix section mismatches triggered by defconfigs
2008-07-08reiserfs: discard prealloc in reiserfs_delete_inodeJeff Mahoney
With the removal of struct file from the xattr code, reiserfs_file_release() isn't used anymore, so the prealloc isn't discarded. This causes hangs later down the line. This patch adds it to reiserfs_delete_inode. In most cases it will be a no-op due to it already having been called, but will avoid hangs with xattrs. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-08SUNRPC: Fix an rpcbind breakage for the case of IPv6 lookupsTrond Myklebust
Now that rpcb_next_version has been split into an IPv4 version and an IPv6 version, we Oops when rpcb_call_async attempts to look up the IPv6-specific RPC procedure in rpcb_next_version. Fix the Oops simply by having rpcb_getport_async pass the correct RPC procedure as an argument. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-08SUNRPC: Fix a double-free in rpcbindTrond Myklebust
It is wrong to be freeing up the rpcbind arguments if the call to rpcb_call_async() fails, since they should already have been freed up by rpcb_map_release(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-08NFS: Fix readdir cache invalidationTrond Myklebust
invalidate_inode_pages2_range() takes page offset arguments, not byte ranges. Another thought is that individual pages might perhaps get evicted by VM pressure, in which case we might perhaps want to re-read not only the evicted page, but all subsequent pages too (in case the server returns more/less data per page so that the alignment of the next entry changes). We should therefore remove the condition that we only do this on page->index==0. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-08[MIPS] Fix 32bit kernels on R4k with 128 byte cache line sizeThomas Bogendoerfer
The generated copy_page for R4k CPU with a 128 byte cache line size used Create Dirty Exclusive cache line operations even if only part of the cache line was filled. This change avoids generating cache operations, if only part of the cache line size is copied in one loop. It also increases the maxmimum loop size, because the generated code even fits into the available space for r4k CPUs with 128 byte cache line size. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-08[MIPS] Atlas, decstation: Fix section mismatches triggered by defconfigsShane McDonald
Resolve these mismatches by defining affected functions with the __cpuinit attribute, rather than __init. Signed-off-by: Shane McDonald <mcdonald.shane@gmail.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: it8213: fix return value in it8213_init_one() palm_bk3710: fix IDECLK period calculation ide: add __ide_default_irq() inline helper
2008-07-08it8213: fix return value in it8213_init_one()Bartlomiej Zolnierkiewicz
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-08palm_bk3710: fix IDECLK period calculationSergei Shtylyov
The driver uses completely bogus rounding formula for calculating period from the IDECLK frequency which gives one-off period values (e.g. 11 ns with 100 MHz IDECLK) which in turn can lead to overclocked IDE transfer timings. Actually, rounding is just wrong in this case, so use a mere division for a safe result. While at it, also: - give 'ide_palm_clk' variable a more suitable name; - get rid of the useless 'ideclkp' variable; - drop the LISP stype 'p' postfix from the 'clkp' variable's name. :-) Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: mcherkashin@ru.mvista.com Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-08ide: add __ide_default_irq() inline helperBartlomiej Zolnierkiewicz
Add __ide_default_irq() inline helper and use it instead of ide_default_irq() in ide-probe.c and ns87415.c (all host drivers except IDE PCI ones always setup hwif->irq so it is enough to check only for I/O bases 0x1f0 and 0x170). This fixes post-2.6.25 regression since ide_default_irq() define could shadow ide_default_irq() inline. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-08Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6Linus Torvalds
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] protect _PAGE_SPECIAL bit against mprotect
2008-07-08Correct hash flushing from huge_ptep_set_wrprotect()David Gibson
As Andy Whitcroft recently pointed out, the current powerpc version of huge_ptep_set_wrprotect() has a bug. It just calls ptep_set_wrprotect() which in turn calls pte_update() then hpte_need_flush() with the 'huge' argument set to 0. This will cause hpte_need_flush() to flush the wrong hash entries (of any). Andy's fix for this is already in the powerpc tree as commit 016b33c4958681c24056abed8ec95844a0da80a3. I have confirmed this is a real bug, not masked by some other synchronization, with a new testcase for libhugetlbfs. A process write a (MAP_PRIVATE) hugepage mapping, fork(), then alter the mapping and have the child incorrectly see the second write. Therefore, this should be fixed for 2.6.26, and for the stable tree. Here is a suitable patch for 2.6.26, which I think will also be suitable for the stable tree (neither of the headers in question has been changed much recently). It is cut down slighlty from Andy's original version, in that it does not include a 32-bit version of huge_ptep_set_wrprotect(). Currently, hugepages are not supported on any 32-bit powerpc platform. When they are, a suitable 32-bit version can be added - the only 32-bit hardware which supports hugepages does not use the conventional hashtable MMU and so will have different needs anyway. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-08[S390] protect _PAGE_SPECIAL bit against mprotectNick Piggin
Stop mprotect's pte_modify from wiping out the s390 pte_special bit, which caused oops thereafter when vm_normal_page thought X's abnormal was normal. Debugged-by: Ryan Hope <rmh3093@gmail.com> Debugged-by: Zan Lynx <zlynx@acm.org> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-07-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: Revert "PCI: Correct last two HP entries in the bfsort whitelist"
2008-07-07Revert "PCI: Correct last two HP entries in the bfsort whitelist"Jesse Barnes
This reverts commit a1676072558854b95336c8f7db76b0504e909a0a. It duplicates the change from 8d64c781f0c5fbfdf8016bd1634506ff2ad1376a and only one should be applied, otherwise some of the Dell quirks are lost. Thanks to Tony Camuso for catching this. Acked-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-07[UML] fix gcc ICEs and unresolved externsJeff Dike
There are various constraints on the use of unit-at-a-time: - i386 uses no-unit-at-a-time for pre-4.0 (not 4.3) - x86_64 uses unit-at-a-time always Uli reported a crash on x86_64 with gcc 4.1.2 with unit-at-a-time, resulting in commit c0a18111e571138747a98af18b3a2124df56a0d1 Ingo reported a gcc internal error with gcc 4.3 with no-unit-at-a-timem, resulting in 22eecde2f9034764a3fd095eecfa3adfb8ec9a98 Benny Halevy is seeing extern inlines not resolved with gcc 4.3 with no-unit-at-a-time This patch reintroduces unit-at-a-time for gcc >= 4.0, bringing back the possibility of Uli's crash. If that happens, we'll debug it. I started seeing both the internal compiler errors and unresolved inlines on Fedora 9. This patch fixes both problems, without so far reintroducing the crash reported by Uli. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Adrian Bunk <bunk@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: can: add sanity checks fs_enet: restore promiscuous and multicast settings in restart() ibm_newemac: Fixes entry of short packets ibm_newemac: Fixes kernel crashes when speed of cable connected changes pasemi_mac: Access iph->tot_len with correct endianness ehea: Access iph->tot_len with correct endianness ehea: fix race condition ehea: add MODULE_DEVICE_TABLE ehea: fix might sleep problem forcedeth: fix lockdep warning on ethtool -s Add missing skb->dev assignment in Frame Relay RX code bridge: fix use-after-free in br_cleanup_bridges() tcp: fix a size_t < 0 comparison in tcp_read_sock tcp: net/ipv4/tcp.c needs linux/scatterlist.h libertas: support USB persistence on suspend/resume (resend) iwlwifi: drop skb silently for Tx request in monitor mode iwlwifi: fix incorrect 5GHz rates reported in monitor mode
2008-07-07powerpc: Fix unterminated of_device_id array in legacy_serial.cBenjamin Herrenschmidt
A recent patch to legacy_serial.c factored out some code by using the of_match_node() facility to match a node against an array of possible matches. However, the patch didn't properly terminate the array causing potential crashes in cases where no match is found. In addition, the name of the array was poorly chosen for a static symbol making debugging harder. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-06vsprintf: add support for '%pS' and '%pF' pointer formatsLinus Torvalds
They print out a pointer in symbolic format, if possible (ie using symbolic KALLSYMS information). The '%pS' format is for regular direct pointers (which can point to data or code and that you find on the stack during backtraces etc), while '%pF' is for C function pointer types. On most architectures, the two mean exactly the same thing, but some architectures use an indirect pointer for C function pointers, where the function pointer points to a function descriptor (which in turn contains the actual pointer to the code). The '%pF' code automatically does the appropriate function descriptor dereference on such architectures. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-06vsprintf: add infrastructure support for extended '%p' specifiersLinus Torvalds
This expands the kernel '%p' handling with an arbitrary alphanumberic specifier extension string immediately following the '%p'. Right now it's just being ignored, but the next commit will start adding some specific pointer type extensions. NOTE! The reason the extension is appended to the '%p' is to allow minimal gcc type checking: gcc will still see the '%p' and will check that the argument passed in is indeed a pointer, and yet will not complain about the extended information that gcc doesn't understand about (on the other hand, it also won't actually check that the pointer type and the extension are compatible). Alphanumeric characters were chosen because there is no sane existing use for a string format with a hex pointer representation immediately followed by alphanumerics (which is what such a format string would have traditionally resulted in). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-06vsprintf: split out '%p' handling logicLinus Torvalds
The actual code is the same, just split out into a helper function. This makes it easier to read, and allows for simple future extension of %p handling. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-06vsprintf: split out '%s' handling logicLinus Torvalds
The actual code is the same, just split out into a helper function. This makes it easier to read, and allows for future sharing of the string code. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-06Merge branch 'kvm-updates-2.6.26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: KVM: IOAPIC: Fix level-triggered irq injection hang x86: KVM guest: Add memory clobber to hypercalls
2008-07-06pxamci: fix byte aligned DMA transfersPhilipp Zabel
The pxa27x DMA controller defaults to 64-bit alignment. This caused the SCR reads to fail (and, depending on card type, error out) when card->raw_scr was not aligned on a 8-byte boundary. For performance reasons all scatter-gather addresses passed to pxamci_request should be aligned on 8-byte boundaries, but if this can't be guaranteed, byte aligned DMA transfers in the have to be enabled in the controller to get correct behaviour. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-06Revert "USB: don't explicitly reenable root-hub status interrupts"Linus Torvalds
This reverts commit e872154921a6b5256a3c412dd69158ac0b135176. Andrey Borzenkov reports that it resulted in a totally hung machine for him when loading the OHCI driver. Extensive netconsole capture with SysRq output shows that modprobe gets stuck in ohci_hub_status_data() when probing and enabling the OHCI controller, see for example http://lkml.org/lkml/2008/7/5/236 for an analysis. The problem appears to be an interrupt flood triggered by the commit that gets reverted, and Andrey confirmed that the revert makes things work for him again. Reported-and-tested-by: Andrey Borzenkov <arvidjaar@mail.ru> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: David Brownell <david-b@pacbell.net> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-06KVM: IOAPIC: Fix level-triggered irq injection hangMark McLoughlin
The "remote_irr" variable is used to indicate an interrupt which has been received by the LAPIC, but not acked. In our EOI handler, we unset remote_irr and re-inject the interrupt if the interrupt line is still asserted. However, we do not set remote_irr here, leading to a situation where if kvm_ioapic_set_irq() is called, then we go ahead and call ioapic_service(). This means that IRR is re-asserted even though the interrupt is currently in service (i.e. LAPIC IRR is cleared and ISR/TMR set) The issue with this is that when the currently executing interrupt handler finishes and writes LAPIC EOI, then TMR is unset and EOI sent to the IOAPIC. Since IRR is now asserted, but TMR is not, then when the second interrupt is handled, no EOI is sent and if there is any pending interrupt, it is not re-injected. This fixes a hang only seen while running mke2fs -j on an 8Gb virtio disk backed by a fully sparse raw file, with aliguori "avoid fragmented virtio-blk transfers by copying" changes. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-06x86: KVM guest: Add memory clobber to hypercallsAnthony Liguori
Hypercalls can modify arbitrary regions of memory. Make sure to indicate this in the clobber list. This fixes a hang when using KVM_GUEST kernel built with GCC 4.3.0. This was originally spotted and analyzed by Marcelo. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>