summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2006-06-20[PATCH] log more info for directory entry change eventsAmy Griffis
When an audit event involves changes to a directory entry, include a PATH record for the directory itself. A few other notable changes: - fixed audit_inode_child() hooks in fsnotify_move() - removed unused flags arg from audit_inode() - added audit log routines for logging a portion of a string Here's some sample output. before patch: type=SYSCALL msg=audit(1149821605.320:26): arch=40000003 syscall=39 success=yes exit=0 a0=bf8d3c7c a1=1ff a2=804e1b8 a3=bf8d3c7c items=1 ppid=739 pid=800 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255 type=CWD msg=audit(1149821605.320:26): cwd="/root" type=PATH msg=audit(1149821605.320:26): item=0 name="foo" parent=164068 inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0 after patch: type=SYSCALL msg=audit(1149822032.332:24): arch=40000003 syscall=39 success=yes exit=0 a0=bfdd9c7c a1=1ff a2=804e1b8 a3=bfdd9c7c items=2 ppid=714 pid=777 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255 type=CWD msg=audit(1149822032.332:24): cwd="/root" type=PATH msg=audit(1149822032.332:24): item=0 name="/root" inode=164068 dev=03:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_dir_t:s0 type=PATH msg=audit(1149822032.332:24): item=1 name="foo" inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0 Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] fix AUDIT_FILTER_PREPEND handlingAmy Griffis
Clear AUDIT_FILTER_PREPEND flag after adding rule to list. This fixes three problems when a rule is added with the -A syntax: - auditctl displays filter list as "(null)" - the rule cannot be removed using -d - a duplicate rule can be added with -a Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] validate rule fields' typesAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] audit: path-based rulesAmy Griffis
In this implementation, audit registers inotify watches on the parent directories of paths specified in audit rules. When audit's inotify event handler is called, it updates any affected rules based on the filesystem event. If the parent directory is renamed, removed, or its filesystem is unmounted, audit removes all rules referencing that inotify watch. To keep things simple, this implementation limits location-based auditing to the directory entries in an existing directory. Given a path-based rule for /foo/bar/passwd, the following table applies: passwd modified -- audit event logged passwd replaced -- audit event logged, rules list updated bar renamed -- rule removed foo renamed -- untracked, meaning that the rule now applies to the new location Audit users typically want to have many rules referencing filesystem objects, which can significantly impact filtering performance. This patch also adds an inode-number-based rule hash to mitigate this situation. The patch is relative to the audit git tree: http://kernel.org/git/?p=linux/kernel/git/viro/audit-current.git;a=summary and uses the inotify kernel API: http://lkml.org/lkml/2006/6/1/145 Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] Audit of POSIX Message Queue Syscalls v.2George C. Wilson
This patch adds audit support to POSIX message queues. It applies cleanly to the lspp.b15 branch of Al Viro's git tree. There are new auxiliary data structures, and collection and emission routines in kernel/auditsc.c. New hooks in ipc/mqueue.c collect arguments from the syscalls. I tested the patch by building the examples from the POSIX MQ library tarball. Build them -lrt, not against the old MQ library in the tarball. Here's the URL: http://www.geocities.com/wronski12/posix_ipc/libmqueue-4.41.tar.gz Do auditctl -a exit,always -S for mq_open, mq_timedsend, mq_timedreceive, mq_notify, mq_getsetattr. mq_unlink has no new hooks. Please see the corresponding userspace patch to get correct output from auditd for the new record types. [fixes folded] Signed-off-by: George Wilson <ltcgcw@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] fix se_sen audit filterDarrel Goeddel
Fix a broken comparison that causes the process clearance to be checked for both se_clr and se_sen audit filters. Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] deprecate AUDIT_POSSBILEAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] inline more audit helpersAl Viro
pull checks for ->audit_context into inlined wrappers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] proc_loginuid_write() uses simple_strtoul() on non-terminated arrayAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] update of IPC audit record cleanupLinda Knippers
The following patch addresses most of the issues with the IPC_SET_PERM records as described in: https://www.redhat.com/archives/linux-audit/2006-May/msg00010.html and addresses the comments I received on the record field names. To summarize, I made the following changes: 1. Changed sys_msgctl() and semctl_down() so that an IPC_SET_PERM record is emitted in the failure case as well as the success case. This matches the behavior in sys_shmctl(). I could simplify the code in sys_msgctl() and semctl_down() slightly but it would mean that in some error cases we could get an IPC_SET_PERM record without an IPC record and that seemed odd. 2. No change to the IPC record type, given no feedback on the backward compatibility question. 3. Removed the qbytes field from the IPC record. It wasn't being set and when audit_ipc_obj() is called from ipcperms(), the information isn't available. If we want the information in the IPC record, more extensive changes will be necessary. Since it only applies to message queues and it isn't really permission related, it doesn't seem worth it. 4. Removed the obj field from the IPC_SET_PERM record. This means that the kern_ipc_perm argument is no longer needed. 5. Removed the spaces and renamed the IPC_SET_PERM field names. Replaced iuid and igid fields with ouid and ogid in the IPC record. I tested this with the lspp.22 kernel on an x86_64 box. I believe it applies cleanly on the latest kernel. -- ljk Signed-off-by: Linda Knippers <linda.knippers@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] minor audit updatesSerge E. Hallyn
Just a few minor proposed updates. Only the last one will actually affect behavior. The rest are just misleading code. Several AUDIT_SET functions return 'old' value, but only return value <0 is checked for. So just return 0. propagate audit_set_rate_limit and audit_set_backlog_limit error values In audit_buffer_free, the audit_freelist_count was being incremented even when we discard the return buffer, so audit_freelist_count can end up wrong. This could cause the actual freelist to shrink over time, eventually threatening to degrate audit performance. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] fix audit_krule_to_{rule,data} return valuesAmy Griffis
Don't return -ENOMEM when callers of these functions are checking for a NULL return. Bug noticed by Serge Hallyn. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] add filtering by ppidAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] log ppidAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] collect sid of those who send signals to auditdAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] execve argument loggingAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] fix deadlocks in AUDIT_LIST/AUDIT_LIST_RULESAl Viro
We should not send a pile of replies while holding audit_netlink_mutex since we hold the same mutex when we receive commands. As the result, we can get blocked while sending and sit there holding the mutex while auditctl is unable to send the next command and get around to receiving what we'd sent. Solution: create skb and put them into a queue instead of sending; once we are done, send what we've got on the list. The former can be done synchronously while we are handling AUDIT_LIST or AUDIT_LIST_RULES; we are holding audit_netlink_mutex at that point. The latter is done asynchronously and without messing with audit_netlink_mutex. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] audit_panic() is audit-internalAl Viro
... no need to provide a stub; note that extern is already gone from include/linux/audit.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] inotify (5/5): update kernel documentationAmy Griffis
Update kernel documentation to include a description of the inotify kernel API. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Acked-by: Robert Love <rml@novell.com> Acked-by: John McCutchan <john@johnmccutchan.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] inotify (4/5): allow watch removal from event handlerAmy Griffis
Allow callers to remove watches from their event handler via inotify_remove_watch_locked(). This functionality can be used to achieve IN_ONESHOT-like functionality for a subset of events in the mask. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Acked-by: Robert Love <rml@novell.com> Acked-by: John McCutchan <john@johnmccutchan.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] inotify (3/5): add interfaces to kernel APIAmy Griffis
Add inotify_init_watch() so caller can use inotify_watch refcounts before calling inotify_add_watch(). Add inotify_find_watch() to find an existing watch for an (ih,inode) pair. This is similar to inotify_find_update_watch(), but does not update the watch's mask if one is found. Add inotify_rm_watch() to remove a watch via the watch pointer instead of the watch descriptor. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Acked-by: Robert Love <rml@novell.com> Acked-by: John McCutchan <john@johnmccutchan.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] inotify (2/5): add name's inode to event handlerAmy Griffis
When an inotify event includes a dentry name, also include the inode associated with that name. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Acked-by: Robert Love <rml@novell.com> Acked-by: John McCutchan <john@johnmccutchan.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] inotify (1/5): split kernel API from userspace supportAmy Griffis
The following series of patches introduces a kernel API for inotify, making it possible for kernel modules to benefit from inotify's mechanism for watching inodes. With these patches, inotify will maintain for each caller a list of watches (via an embedded struct inotify_watch), where each inotify_watch is associated with a corresponding struct inode. The caller registers an event handler and specifies for which filesystem events their event handler should be called per inotify_watch. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Acked-by: Robert Love <rml@novell.com> Acked-by: John McCutchan <john@johnmccutchan.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20[PATCH] remove config.h from inotify.hAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-17Linux v2.6.17v2.6.17Linus Torvalds
Being named "Crazed Snow-Weasel" instills a lot of confidence in this release, so I'm sure this will be one of the better ones.
2006-06-17[PATCH] powerpc: enable CPU_FTR_CI_LARGE_PAGE for cellArnd Bergmann
Reflect the fact that the Cell Broadband Engine supports 64k pages by adding the bit to the CPU features. Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] powerpc: Fix 64k pages on non-partitioned machinesArnd Bergmann
The page size encoding passed to tlbie is incorrect for new-style large pages. This fixes it. This doesn't affect anything on older machines because mmu_psize_defs[psize].penc (the page size encoding) is 0 for 4k and 16M pages (the two are distinguished by a separate "is a large page" bit). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] arm_timer: remove a racy and obsolete PF_EXITING checkOleg Nesterov
arm_timer() checks PF_EXITING to prevent BUG_ON(->exit_state) in run_posix_cpu_timers(). However, for some reason it does so only for CPUCLOCK_PERTHREAD case (which is imho wrong). Also, this check is not reliable, PF_EXITING could be set on another cpu without any locks/barriers just after the check, so it can't prevent from attaching the timer to the exiting task. The previous patch makes this check unneeded. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] run_posix_cpu_timers: remove a bogus BUG_ON()Oleg Nesterov
do_exit() clears ->it_##clock##_expires, but nothing prevents another cpu to attach the timer to exiting process after that. arm_timer() tries to protect against this race, but the check is racy. After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and before do_exit() calls 'schedule() local timer interrupt can find tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu does sys_wait4) interrupted task has ->signal == NULL. At this moment exiting task has no pending cpu timers, they were cleanuped in __exit_signal()->posix_cpu_timers_exit{,_group}(), so we can just return from irq. John Stultz recently confirmed this bug, see http://marc.theaimsgroup.com/?l=linux-kernel&m=115015841413687 Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] check_process_timers: fix possible lockupOleg Nesterov
If the local timer interrupt happens just after do_exit() sets PF_EXITING (and before it clears ->it_xxx_expires) run_posix_cpu_timers() will call check_process_timers() with tasklist_lock + ->siglock held and check_process_timers: t = tsk; do { .... do { t = next_thread(t); } while (unlikely(t->flags & PF_EXITING)); } while (t != tsk); the outer loop will never stop. Actually, the window is bigger. Another process can attach the timer after ->it_xxx_expires was cleared (see the next commit) and the 'if (PF_EXITING)' check in arm_timer() is racy (see the one after that). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] sky2: netconsole suspend/resume interactionStephen Hemminger
A couple of fixes that should prevent crashes when using netconsole and suspend/resume. First, netconsole poll routine shouldn't run unless the device is up; second, the NAPI poll should be disabled during suspend. This is only an issue on sky2, because it has to have one NAPI poll routine for both ports on dual port boards. Normal drivers use netif_rx_schedule_prep and that checks for netif_running. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] Fix missing ret assignment in __bio_map_user() error pathJens Axboe
If get_user_pages() returns less pages than what we asked for, we jump to out_unmap which will return ERR_PTR(ret). But ret can contain a positive number just smaller than local_nr_pages, so be sure to set it to -EFAULT always. Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] fix cdrom openJens Axboe
Some time ago the cdrom open routine was changed so that we call the driver's open routine before checking to see if it is read only. However, if we discovered that a read write open was not possible and the open flags required a writable open, we just returned -EROFS without calling the driver's release routine. This seems to work for most cdrom drivers, but breaks the Powerpc iSeries virtual cdrom rather badly. This just inserts the release call in the error path to balance the call to "->open()" done by "open_for_data()". Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-14[PATCH] cfq-iosched: fix crash in do_div()Jens Axboe
We don't clear the seek stat values in cfq_alloc_io_context(), and if ->seek_mean is unlucky enough to be set to -36 by chance, the first invocation of cfq_update_io_seektime() will oops with a divide by zero in do_div(). Just memset the entire cic instead of filling invididual values independently. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-14[PATCH] Return error in case flock_lock_file failureKirill Korotaev
If flock_lock_file() failed to allocate flock with locks_alloc_lock() then "error = 0" is returned. Need to return some non-zero. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: stop/start hardware idle timer on suspend/resumeStephen Hemminger
The resume bug was caused not by an early interrupt but because the idle timeout was not being stopped on suspend. Also disable hardware IRQ's on suspend. Will need to revisit this with hotplug? Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: save/restore base hardware irq during suspend/resumeStephen Hemminger
The hardware should be fully shut off during suspend, and the base irq mask restored during resume. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: fix hotplug detect during pollStephen Hemminger
If the poll routine detects no hardware available, it needs to dequeue it self from the network poll list. Linus didn't understand NAPI. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: don't hard code number of portsStephen Hemminger
It is cleaner, to not loop over both ports if only one exists. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: set_power_state should be voidStephen Hemminger
The set power state function is cleaner if it doesn't return anything. The only caller that could fail is in suspend() and it can check the argument there. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12[PATCH] alpha: generic hweight build fixRandy Dunlap
From: Randy Dunlap <rdunlap@xenotime.net> According to include/asm-alpha/bitops.h, only ALPHA_EV67 has hardware hweight support, so ALPHA_EV6 needs to use GENERIC_HWEIGHT. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Ernst Herzberg <earny@net4u.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12[PATCH] tmpfs: Decrement i_nlink correctly in shmem_rmdir()Sergey Vlasov
shmem_rmdir() must undo the increment of i_nlink done in shmem_get_inode() for directories, otherwise at least IN_DELETE_SELF inotify event generation is broken. Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12[PATCH] tmpfs: time granularity fix for [acm]time going backwardsRobin H. Johnson
I noticed a strange behavior in a tmpfs file system the other day, while building packages - occasionally, and seemingly at random, make decided to rebuild a target. However, only on tmpfs. A file would be created, and if checked, it had a sub-second timestamp. However, after an utimes related call where sub-seconds should be set, they were zeroed instead. In the case that a file was created, and utimes(...,NULL) was used on it in the same second, the timestamp on the file moved backwards. After some digging, I found that this was being caused by tmpfs not having a time granularity set, thus inheriting the default 1 second granularity. Hugh adds: yes, we missed tmpfs when the s_time_gran mods went into 2.6.11. Unfortunately, the granularity of CURRENT_TIME, often used in filesystems, does not match the default granularity set by alloc_super. A few more such discrepancies have been found, but this is the most important to fix now. Signed-off-by: Robin H. Johnson <robbat2@gentoo.org> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Do not double-export sys_close() when CONFIG_SOLARIS_EMUL_MODULE
2006-06-12Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV4]: Increment ipInHdrErrors when TTL expires. [TCP]: continued: reno sacked_out count fix [DCCP] Ackvec: fix soft lockup in ackvec handling code
2006-06-12Merge master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds
* master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] Fix Integrator and Versatile interrupt initialisation [ARM] 3546/1: PATCH: subtle lost interrupts bug on i.MX [ARM] 3547/1: PXA-OHCI: Allow platforms to specify a power budget [ARM] Fix Neponset IRQ handling
2006-06-12[IPV4]: Increment ipInHdrErrors when TTL expires.Weidong
Signed-off-by: Weidong <weid@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-12[sky2] Fix sky2 network driver suspend/resumeLinus Torvalds
This fixes two independent problems: it would not save the PCI state on suspend (and thus try to resume a nonexistent state on resume), and while shut off, if an interrupt happened on the same shared irq, the irq handler would react very badly to the interrupt status being an invalid all-ones state. Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12Merge branch 'upstream-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [PATCH] sata_mv: grab host lock inside eng_timeout
2006-06-11[TCP]: continued: reno sacked_out count fixAki M Nyrhinen
From: Aki M Nyrhinen <anyrhine@cs.helsinki.fi> IMHO the current fix to the problem (in_flight underflow in reno) is incorrect. it treats the symptons but ignores the problem. the problem is timing out packets other than the head packet when we don't have sack. i try to explain (sorry if explaining the obvious). with sack, scanning the retransmit queue for timed out packets is fine because we know which packets in our retransmit queue have been acked by the receiver. without sack, we know only how many packets in our retransmit queue the receiver has acknowledged, but no idea which packets. think of a "typical" slow-start overshoot case, where for example every third packet in a window get lost because a router buffer gets full. with sack, we check for timeouts on those every third packet (as the rest have been sacked). the packet counting works out and if there is no reordering, we'll retransmit exactly the packets that were lost. without sack, however, we check for timeout on every packet and end up retransmitting consecutive packets in the retransmit queue. in our slow-start example, 2/3 of those retransmissions are unnecessary. these unnecessary retransmissions eat the congestion window and evetually prevent fast recovery from continuing, if enough packets were lost. Signed-off-by: David S. Miller <davem@davemloft.net>