linux.git - dakr's fork of kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Age	Commit message (Collapse)	Author
2009-08-07	lib/decompress_*: only include <linux/slab.h> if STATIC is not defined	Albin Tonnerre
	These includes were added by 079effb6933f34b9b1b67b08bd4fd7fb672d16ef ("kmemtrace, kbuild: fix slab.h dependency problem in lib/decompress_inflate.c") to fix the build when using kmemtrace. However this is not necessary when used to create a compressed kernel, and actually creates issues (brings a lot of things unavailable in the decompression environment), so don't include it if STATIC is defined. Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Phillip Lougher <phillip@lougher.demon.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	bzip2/lzma: remove nasty uncompressed size hack in pre-boot environment	Phillip Lougher
	decompress_bunzip2 and decompress_unlzma have a nasty hack that subtracts 4 from the input length if being called in the pre-boot environment. This is a nasty hack because it relies on the fact that flush = NULL only when called from the pre-boot environment (i.e. arch/x86/boot/compressed/misc.c). initramfs.c/do_mounts_rd.c pass in a flush buffer (flush != NULL). This hack prevents the decompressors from being used with flush = NULL by other callers unless knowledge of the hack is propagated to them. This patch removes the hack by making decompress (called only from the pre-boot environment) a wrapper function that subtracts 4 from the input length before calling the decompressor. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	bzip2/lzma/gzip: fix comments describing decompressor API	Phillip Lougher
	Fix and improve comments in decompress/generic.h that describe the decompressor API. Also remove an unused definition, and rename INBUF_LEN in lib/decompress_inflate.c to conform to bzip2/lzma naming. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	execve: must clear current->clear_child_tid	Eric Dumazet
	While looking at Jens Rosenboom bug report (http://lkml.org/lkml/2009/7/27/35) about strange sys_futex call done from a dying "ps" program, we found following problem. clone() syscall has special support for TID of created threads. This support includes two features. One (CLONE_CHILD_SETTID) is to set an integer into user memory with the TID value. One (CLONE_CHILD_CLEARTID) is to clear this same integer once the created thread dies. The integer location is a user provided pointer, provided at clone() time. kernel keeps this pointer value into current->clear_child_tid. At execve() time, we should make sure kernel doesnt keep this user provided pointer, as full user memory is replaced by a new one. As glibc fork() actually uses clone() syscall with CLONE_CHILD_SETTID and CLONE_CHILD_CLEARTID set, chances are high that we might corrupt user memory in forked processes. Following sequence could happen: 1) bash (or any program) starts a new process, by a fork() call that glibc maps to a clone( ... CLONE_CHILD_SETTID \| CLONE_CHILD_CLEARTID ...) syscall 2) When new process starts, its current->clear_child_tid is set to a location that has a meaning only in bash (or initial program) context (&THREAD_SELF->tid) 3) This new process does the execve() syscall to start a new program. current->clear_child_tid is left unchanged (a non NULL value) 4) If this new program creates some threads, and initial thread exits, kernel will attempt to clear the integer pointed by current->clear_child_tid from mm_release() : if (tsk->clear_child_tid && !(tsk->flags & PF_SIGNALED) && atomic_read(&mm->mm_users) > 1) { u32 __user * tidptr = tsk->clear_child_tid; tsk->clear_child_tid = NULL; /* * We don't check the error code - if userspace has * not set up a proper pointer then tough luck. */ << here >> put_user(0, tidptr); sys_futex(tidptr, FUTEX_WAKE, 1, NULL, NULL, 0); } 5) OR : if new program is not multi-threaded, but spied by /proc/pid users (ps command for example), mm_users > 1, and the exiting program could corrupt 4 bytes in a persistent memory area (shm or memory mapped file) If current->clear_child_tid points to a writeable portion of memory of the new program, kernel happily and silently corrupts 4 bytes of memory, with unexpected effects. Fix is straightforward and should not break any sane program. Reported-by: Jens Rosenboom <jens@mcbone.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sonny Rao <sonnyrao@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	drivers/mmc: correct error-handling code	Julia Lawall
	sdhci_alloc_host returns an ERR_PTR value in an error case instead of NULL. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @match exists@ expression x, E; statement S1, S2; @@ x = sdhci_alloc_host(...) ... when != x = E ( * if (x == NULL \|\| ...) S1 else S2 \| * if (x == NULL && ...) S1 else S2 ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Anton Vorontsov <avorontsov@ru.mvista.com> Cc: Matt Fleming <matt@console-pimps.org> Cc: Ian Molton <ian@mnementh.co.uk> Cc: "Roberto A. Foglietta" <roberto.foglietta@gmail.com> Cc: Philip Langdale <philipl@overt.org> Cc: Pierre Ossman <pierre@ossman.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	i.MX31: fix framebuffer locking regressions	Guennadi Liakhovetski
	Recent framebuffer locking patches first made affected systems unbootable, then the dead-lock has been fixed but as of 2.6.31-rc4 the framebuffer on mx3 machines doesn't work. Fix this. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	vfs: mnt_want_write_file(): fix special file handling	OGAWA Hirofumi
	I suspect that mnt_want_write_file() may have wrong assumption. I think mnt_want_write_file() is assuming it increments ->mnt_writers if (file->f_mode & FMODE_WRITE). But, if it's special_file(), it is false? Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	compat_ioctl: hook up compat handler for FIEMAP ioctl	Eric Sandeen
	The FIEMAP_IOC_FIEMAP mapping ioctl was missing a 32-bit compat handler, which means that 32-bit suerspace on 64-bit kernels cannot use this ioctl command. The structure is nicely aligned, padded, and sized, so it is just this simple. Tested w/ 32-bit ioctl tester (from Josef) on a 64-bit kernel on ext4. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Mark Lord <lkml@rtr.ca> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Josef Bacik <josef@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	fbcon: don't use vc_resize() on initialization	Johannes Weiner
	Catalin and kmemleak spotted a leak of a VC screen buffer in vc_allocate() due to the following chain of events: vc_allocate() visual_init(init=1) vc->vc_sw->con_init(init=1) fbcon_init() vc_resize() vc->screen_buf = kmalloc() vc->screen_buf = kmalloc() The common way for the VC drivers is to set the screen dimension parameters manually in the init case and only call vc_resize() for !init - which allocates a screen buffer according to the new dimensions. fbcon instead would do vc_resize() unconditionally and afterwards set the dimensions manually (again) for !init - i.e. completely upside down. The vc_resize() allocated buffer would then get lost by vc_allocate() allocating a fresh one. Use vc_resize() only for actual resizing to close the leak. Set the dimensions manually only in initialization mode to remove the redundant setting in resize mode. The kmemleak trace from Catalin: unreferenced object 0xde158000 (size 12288): comm "Xorg", pid 1439, jiffies 4294961016 hex dump (first 32 bytes): 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 . . . . . . . . 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 . . . . . . . . backtrace: [<c006f74b>] __save_stack_trace+0x17/0x1c [<c006f81d>] create_object+0xcd/0x188 [<c01f5457>] kmemleak_alloc+0x1b/0x3c [<c006e303>] __kmalloc+0xdb/0xe8 [<c012cc4b>] vc_do_resize+0x73/0x1e0 [<c012cdf1>] vc_resize+0x15/0x18 [<c011afc1>] fbcon_init+0x1f9/0x2b8 [<c0129e87>] visual_init+0x9f/0xdc [<c012aff3>] vc_allocate+0x7f/0xfc [<c012b087>] con_open+0x17/0x80 [<c0120e43>] tty_open+0x1f7/0x2e4 [<c0072fa1>] chrdev_open+0x101/0x118 [<c006ffad>] __dentry_open+0x105/0x1cc [<c00700fd>] nameidata_to_filp+0x2d/0x38 [<c00788cd>] do_filp_open+0x2c1/0x54c [<c006fdff>] do_sys_open+0x3b/0xb4 Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Tested-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	viafb: fix rmmod bug	Florian Tobias Schandinat
	This fixes a bug caused by changing pointers (viafb_mode, viafb_mode1) assigned by module_param. It reduces driver complexity by not needlessly changing these vars as they are only read once and removing now superfluous code. On unpatched kernels loading viafb with viafb_mode or viafb_mode1 option used and afterwards unloading it results in: kernel BUG at mm/slub.c:2926! invalid opcode: 0000 [#1] PREEMPT last sysfs file: /sys/devices/virtual/block/loop0/removable Modules linked in: snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm rtl8187 snd_timer eeprom_93cx6 mmc_block snd soundcore via_sdmmc fb snd_page_alloc i2c_algo_bit i2c_viapro ehci_hcd uhci_hcd cfbcopyarea mmc_core cfbimgblt cfbfillrect video output [last unloaded: viafb] Pid: 3355, comm: rmmod Not tainted (2.6.31-rc1 #0) EIP: 0060:[<c106a759>] EFLAGS: 00010246 CPU: 0 EIP is at kfree+0x80/0xda EAX: c17c2da0 EBX: dc7edbdc ECX: 0000010f EDX: 00000000 ESI: c102c700 EDI: dc7ed8fa EBP: d703ff2c ESP: d703ff20 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 Process rmmod (pid: 3355, ti=d703e000 task=db1412c0 task.ti=d703e000) Stack: dc7edbdc 00000014 00000016 d703ff40 c102c700 dc7f45d4 dc7f45d4 00000880 d703ff4c c103e571 00000000 d703ffac c103e751 66616976 da140062 db89ba80 00000328 d702edf8 db89ba80 d703ff9c c105d0f0 00000200 da14f898 00000014 Call Trace: [<c102c700>] ? destroy_params+0x1e/0x2b [<c103e571>] ? free_module+0xa2/0xd7 [<c103e751>] ? sys_delete_module+0x1ab/0x1da [<c105d0f0>] ? do_munmap+0x20a/0x225 [<c10029b4>] ? sysenter_do_call+0x12/0x26 Code: 10 76 7a 8d 87 00 00 00 40 c1 e8 0c c1 e0 05 03 05 1c 87 41 c1 66 83 38 00 79 03 8b 40 0c 8b 10 84 d2 78 12 66 f7 c2 00 c0 75 04 <0f> 0b eb fe e8 6f 5a fe ff eb 47 8b 55 04 8b 58 0c 9c 5e fa 3b EIP: [<c106a759>] kfree+0x80/0xda SS:ESP 0068:d703ff20 This is caused by the current code changing the pointers assigned by module_param. During unload it tries to free the memory the pointers point at which is now part of an internal structure. The patch simply avoids changing the pointers. This is okay as they are read only once during the initialization process. Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Scott Fang <ScottFang@viatech.com.cn> Cc: Joseph Chan <JosephChan@via.com.tw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	mm: make set_mempolicy(MPOL_INTERLEAV) N_HIGH_MEMORY aware	KAMEZAWA Hiroyuki
	At first, init_task's mems_allowed is initialized as this. init_task->mems_allowed == node_state[N_POSSIBLE] And cpuset's top_cpuset mask is initialized as this top_cpuset->mems_allowed = node_state[N_HIGH_MEMORY] Before 2.6.29: policy's mems_allowed is initialized as this. 1. update tasks->mems_allowed by its cpuset->mems_allowed. 2. policy->mems_allowed = nodes_and(tasks->mems_allowed, user's mask) Updating task's mems_allowed in reference to top_cpuset's one. cpuset's mems_allowed is aware of N_HIGH_MEMORY, always. In 2.6.30: After commit 58568d2a8215cb6f55caf2332017d7bdff954e1c ("cpuset,mm: update tasks' mems_allowed in time"), policy's mems_allowed is initialized as this. 1. policy->mems_allowd = nodes_and(task->mems_allowed, user's mask) Here, if task is in top_cpuset, task->mems_allowed is not updated from init's one. Assume user excutes command as #numactrl --interleave=all ,.... policy->mems_allowd = nodes_and(N_POSSIBLE, ALL_SET_MASK) Then, policy's mems_allowd can includes a possible node, which has no pgdat. MPOL's INTERLEAVE just scans nodemask of task->mems_allowd and access this directly. NODE_DATA(nid)->zonelist even if NODE_DATA(nid)==NULL Then, what's we need is making policy->mems_allowed be aware of N_HIGH_MEMORY. This patch does that. But to do so, extra nodemask will be on statck. Because I know cpumask has a new interface of CPUMASK_ALLOC(), I added it to node. This patch stands on old behavior. But I feel this fix itself is just a Band-Aid. But to do fundametal fix, we have to take care of memory hotplug and it takes time. (task->mems_allowd should be N_HIGH_MEMORY, I think.) mpol_set_nodemask() should be aware of N_HIGH_MEMORY and policy's nodemask should be includes only online nodes. In old behavior, this is guaranteed by frequent reference to cpuset's code. Now, most of them are removed and mempolicy has to check it by itself. To do check, a few nodemask_t will be used for calculating nodemask. But, size of nodemask_t can be big and it's not good to allocate them on stack. Now, cpumask_t has CPUMASK_ALLOC/FREE an easy code for get scratch area. NODEMASK_ALLOC/FREE shoudl be there. [akpm@linux-foundation.org: cleanups & tweaks] Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Paul Menage <menage@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: David Rientjes <rientjes@google.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	fbcon: fix rotate upside down crash	Stefani Seibold
	Fix the rotate_ud() function not to crash in case of a font which has not a width of multiple by 8: The inner loop of the font pixel copy should not access a bit outside the font memory area. Subtract the shift offset from the font width will prevent this. Signed-off-by: Stefani Seibold <stefani@seibold.net> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	generic-ipi: fix hotplug_cfd()	Xiao Guangrong
	Use CONFIG_HOTPLUG_CPU, not CONFIG_CPU_HOTPLUG When hot-unpluging a cpu, it will leak memory allocated at cpu hotplug, but only if CPUMASK_OFFSTACK=y, which is default to n. The bug was introduced by 8969a5ede0f9e17da4b943712429aef2c9bcd82b ("generic-ipi: remove kmalloc()"). Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07	drivers/w1/masters/omap_hdq.c: fix missing mutex unlock	Stoyan Gaydarov
	This was found using a semantic patch, more info can be found at: http://www.emn.fr/x-info/coccinelle/ Signed-off-by: Stoyan Gaydarov <sgayda2@uiuc.edu> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-05	drm/radeon: Add support for RS880 chips	Alex Deucher
	These are new AMD IGP chips Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-04	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds
	* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (32 commits) MIPS: Wire up accept4 syscall. MIPS: VPE: Delete unused function get_tc_unused(). MIPS: VPE: Fix bogus indentation. MIPS: VPE: Make various functions static. MIPS: VPE: Free relocation chain on error. MIPS: VPE: Fix compiler warning. MIPS: Module: Make error messages unique. MIPS: Octeon: Run IPI code with interrupts disabled. MIPS: Jazz: Fix read buffer overflow MIPS: Use DIV_ROUND_CLOSEST MIPS: MTX-1: Request button GPIO before setting its direction MIPS: AR7: Override CFLAGS with -Werror MIPS: AR7: Remove unused tnetd7200_get_clock function MIPS: AR7: Use DMA_BIT_MASK(nn) instead of deprecated DMA_nnBIT_MASK MIPS: AR7: Fix build failures when CONFIG_SERIAL_8250 is not enabled MIPS: Fix read buffer overflow MIPS: AR7: Fix build warning on memory.c MIPS: Octeon PCIe: Make hardware and software bus numbers match. MIPS: RBTX4939: Fix IOC pin-enable register updating MIPS: Simplify and correct interrupt handling for MSP4200 ...
2009-08-04	Merge branch 'fix/hda' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: hda - Read buffer overflow ALSA: hda: Correct EAPD for Dell Inspiron 1525 ALSA: hda: warn on spurious response ALSA: hda: remember last command for each codec ALSA: hda: read CORBWP inside reg_lock ALSA: hda: take reg_lock in azx_init_cmd_io/azx_free_cmd_io ALSA: hda: take cmd_mutex in probe_codec() ALSA: hda: track CIRB/CORB command/response states for each codec ALSA: hda - Fix quirk for Toshiba Satellite A135-S4527
2009-08-04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: tty-ldisc: be more careful in 'put_ldisc' locking tty-ldisc: turn ldisc user count into a proper refcount tty-ldisc: make refcount be atomic_t 'users' count
2009-08-04	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds
	* 'for-linus' of git://git.kernel.dk/linux-2.6-block: Make SCSI SG v4 driver enabled by default and remove EXPERIMENTAL dependency, since udev depends on BSG block: Update topology documentation block: Stack optimal I/O size block: Add a wrapper for setting minimum request size without a queue block: Make blk_queue_stack_limits use the new stacking interface
2009-08-04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits) ehea: Fix napi list corruption on ifconfig down igbvf: Allow VF driver to correctly recognize failure to set mac 3c59x: Fix build failure with gcc 3.2 sky2: Avoid transmits during sky2_down() iwlagn: do not send key clear commands when rfkill enabled libertas: Read buffer overflow drivers/net/wireless: introduce missing kfree drivers/net/wireless/iwlwifi: introduce missing kfree zd1211rw: fix unaligned access in zd_mac_rx cfg80211: fix regression on beacon world roaming feature cfg80211: add two missing NULL pointer checks ixgbe: Patch to modify 82598 PCIe completion timeout values bluetooth: rfcomm_init bug fix mlx4_en: Fix double pci unmapping. mISDN: Fix handling of receive buffer size in L1oIP pcnet32: VLB support fixes pcnet32: remove superfluous NULL pointer check in pcnet32_probe1() net: restore the original spinlock to protect unicast list netxen: fix coherent dma mask setting mISDN: Read buffer overflow ...
2009-08-04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (23 commits) [SCSI] sd: Avoid sending extended inquiry to legacy devices [SCSI] libsas: fix wide port hotplug issues [SCSI] libfc: fix a circular locking warning during sending RRQ [SCSI] qla4xxx: Remove hiwat code so scsi eh does not get escalated when we can make progress [SCSI] qla4xxx: Fix srb lookup in qla4xxx_eh_device_reset [SCSI] qla4xxx: Fix Driver Fault Recovery Completion [SCSI] qla4xxx: add timeout handler [SCSI] qla4xxx: Correct Extended Sense Data Errors [SCSI] libiscsi: disable bh in and abort handler. [SCSI] zfcp: Fix tracing of request id for abort requests [SCSI] zfcp: Fix wka port processing [SCSI] zfcp: avoid double notify in lowmem scenario [SCSI] zfcp: Add port only once to FC transport class [SCSI] zfcp: Recover from stalled outbound queue [SCSI] zfcp: Fix erp escalation procedure [SCSI] zfcp: Fix logic for physical port close [SCSI] zfcp: Use -EIO for SBAL allocation failures [SCSI] zfcp: Use unchained mode for small ct and els requests [SCSI] zfcp: Use correct flags for zfcp_erp_notify [SCSI] zfcp: Return -ENOMEM for allocation failures in zfcp_fsf ...
2009-08-04	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp	Linus Torvalds
	* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: print debug statements only on error amd64_edac: fix ECC checking
2009-08-04	Merge branch 'drm-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/ttm: Read buffer overflow drm/radeon: Read buffer overflow drm/ttm: Fix a sync object leak. drm/radeon/kms: fix memory leak in radeon_driver_load_kms drm/radeon/kms: fix nomodeset. drm/ttm: Fix a potential comparison of structs. drm/radeon/kms: fix rv515 VRAM initialisation. drm/radeon: add some new r7xx pci ids drm: Catch stop possible NULL pointer reference drm: Small logic fix in drm_mode_setcrtc
2009-08-04	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-2.6: parisc: hppb.c - fix printk format strings parisc: parisc-agp.c - use correct page_mask function parisc: sticore.c - check return values parisc: dino.c - check return value of pci_assign_resource() parisc: hp_sdc_mlc.c - check return value of down_trylock() parisc: includecheck fix for ccio-dma.c parisc: Set correct bit in protection flags parisc: isa-eeprom - Fix loff_t usage parisc: fixed faulty check in lba_pci parisc: Fix read buffer overflow in pdc_stable driver parisc: Fix GOT overflow during module load on 64bit kernel
2009-08-04	flex_array: remove unneeded index calculation	Jonathan Corbet
	flex_array_get() calculates an index value, then drops it on the floor; simply remove it. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-04	Merge branch 'perfcounters-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Set the CONFIG_PERF_COUNTERS default to y if CONFIG_PROFILING=y perf: Fix read buffer overflow perf top: Add mwait_idle_with_hints to skip_symbols[] perf tools: Fix faulty check perf report: Update for the new FORK/EXIT events perf_counter: Full task tracing perf_counter: Collapse inherit on read() tracing, perf_counter: Add help text to CONFIG_EVENT_PROFILE perf_counter tools: Fix link errors with older toolchains
2009-08-04	Merge branch 'sched-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix race in cpupri introduced by cpumask_var changes sched: Fix latencytop and sleep profiling vs group scheduling
2009-08-04	Merge branch 'timers-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: posix-timers: Fix oops in clock_nanosleep() with CLOCK_MONOTONIC_RAW
2009-08-04	Merge branch 'tracing-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Fix missing function_graph events when we splice_read from trace_pipe tracing: Fix invalid function_graph entry trace: stop tracer in oops_enter() ftrace: Only update $offset when we update $ref_func ftrace: Fix the conditional that updates $ref_func tracing: only truncate ftrace files when O_TRUNC is set tracing: show proper address for trace-printk format
2009-08-04	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: mfd: twl4030 irq fixes
2009-08-04	Merge branch 'x86-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Work around compilation warning in arch/x86/kernel/apm_32.c x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq() x86, 32-bit: Fix double accounting in reserve_top_address() x86: Don't use current_cpu_data in x2apic phys_pkg_id x86, UV: Fix UV apic mode x86, UV: Fix macros for accessing large node numbers x86, UV: Delete mapping of MMR rangs mapped by BIOS x86, UV: Handle missing blade-local memory correctly x86: fix assembly constraints in native_save_fl() x86, msr: execute on the correct CPU subset x86: Fix assert syntax in vmlinux.lds.S x86: Make 64-bit efi_ioremap use ioremap on MMIO regions x86: Add quirk to make Apple MacBook5,2 use reboot=pci x86: Fix CPA memtype reserving in the set_pages_array*() cases x86, pat: Fix set_memory_wc related corruption x86: fix section mismatch for i386 init code
2009-08-04	Merge branch 'fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Make cpufreq suspend code conditional on powerpc. [CPUFREQ] Fix a kobject reference bug related to managed CPUs [CPUFREQ] Do not set policy for offline cpus [CPUFREQ] Fix NULL pointer dereference regression in conservative governor
2009-08-04	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: fix missing unlock in error path of nilfs_mdt_write_page nilfs2: fix oops due to inconsistent state in page with discrete b-tree nodes
2009-08-04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Update readme to reflect forceuid mount parms cifs: Read buffer overflow cifs: show noforceuid/noforcegid mount options (try #2) cifs: reinstate original behavior when uid=/gid= options are specified [CIFS] Updates fs/cifs/CHANGES cifs: fix error handling in mount-time DFS referral chasing code
2009-08-04	tty-ldisc: be more careful in 'put_ldisc' locking	Linus Torvalds
	Use 'atomic_dec_and_lock()' to make sure that we always hold the tty_ldisc_lock when the ldisc count goes to zero. That way we can never race against 'tty_ldisc_try()' increasing the count again. Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-04	tty-ldisc: turn ldisc user count into a proper refcount	Linus Torvalds
	By using the user count for the actual lifetime rules, we can get rid of the silly "wait_for_idle" logic, because any busy ldisc will automatically stay around until the last user releases it. This avoids a host of odd issues, and simplifies the code. So now, when the last ldisc reference is dropped, we just release the ldisc operations struct reference, and free the ldisc. It looks obvious enough, and it does work for me, but the counting _could_ be off. It probably isn't (bad counting in the new version would generally imply that the old code did something really bad, like free an ldisc with a non-zero count), but it does need some testing, and preferably somebody looking at it. With this change, both 'tty_ldisc_put()' and 'tty_ldisc_deref()' are just aliases for the new ref-counting 'put_ldisc()'. Both of them decrement the ldisc user count and free it if it goes down to zero. They're identical functions, in other words. But the reason they still exist as sepate functions is that one of them was exported (tty_ldisc_deref) and had a stupid name (so I don't want to use it as the main name), and the other one was used in multiple places (and I didn't want to make the patch larger just to rename the users). In addition to the refcounting, I did do some minimal cleanup. For example, now "tty_ldisc_try()" actually returns the ldisc it got under the lock, rather than returning true/false and then the caller would look up the ldisc again (now without the protection of the lock). That said, there's tons of dubious use of 'tty->ldisc' without obviously proper locking or refcounting left. I expressly did _not_ want to try to fix it all, keeping the patch minimal. There may or may not be bugs in that kind of code, but they wouldn't be _new_ bugs. That said, even if the bugs aren't new, the timing and lifetime will change. For example, some silly code may depend on the 'tty->ldisc' pointer not changing because they hold a refcount on the 'ldisc'. And that's no longer true - if you hold a ref on the ldisc, the 'ldisc' itself is safe, but tty->ldisc may change. So the proper locking (remains) to hold tty->ldisc_mutex if you expect tty->ldisc to be stable. That's not really a _new_ rule, but it's an example of something that the old code might have unintentionally depended on and hidden bugs. Whatever. The patch _looks_ sensible to me. The only users of ldisc->users are: - get_ldisc() - atomically increment the count - put_ldisc() - atomically decrements the count and releases if zero - tty_ldisc_try_get() - creates the ldisc, and sets the count to 1. The ldisc should then either be released, or be attached to a tty. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-04	tty-ldisc: make refcount be atomic_t 'users' count	Linus Torvalds
	This is pure preparation of changing the ldisc reference counting to be a true refcount that defines the lifetime of the ldisc. But this is a purely syntactic change for now to make the next steps easier. This patch should make no semantic changes at all. But I wanted to make the ldisc refcount be an atomic (I will be touching it without locks soon enough), and I wanted to rename it so that there isn't quite as much confusion between 'ldo->refcount' (ldisk operations refcount) and 'ld->refcount' (ldisc refcount itself) in the same file. So it's now an atomic 'ld->users' count. It still starts at zero, despite having a reference from 'tty->ldisc', but that will change once we turn it into a _real_ refcount. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-04	Make SCSI SG v4 driver enabled by default and remove EXPERIMENTAL ↵	John Stoffel
	dependency, since udev depends on BSG Make Block Layer SG support v4 the default, since recent udev versions depend on this to access serial numbers and other low level info properly. This should be backported to older kernels as well, since most distros have enabled this for a long time. Signed-off-by: John Stoffel <john@stoffel.org> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-08-04	ehea: Fix napi list corruption on ifconfig down	Hannes Hering
	This patch fixes the napi list handling when an ehea interface is shut down to avoid corruption of the napi list. Signed-off-by: Hannes Hering <hering2@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-04	igbvf: Allow VF driver to correctly recognize failure to set mac	Alexander Duyck
	The VF driver was not correctly recognizing that it did not correctly set it's mac address. As a result the VF driver was unable to receive network traffic until being unloaded and reloaded. The issue was root caused to the fact that the CTS bit was not taken into account when checking for the request being NAKed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-04	[CPUFREQ] Make cpufreq suspend code conditional on powerpc.	Dave Jones
	The suspend code runs with interrupts disabled, and the powerpc workaround we do in the cpufreq suspend hook calls the drivers ->get method. powernow-k8's ->get does an smp_call_function_single which needs interrupts enabled cpufreq's suspend/resume code was added in 42d4dc3f4e1e to work around a hardware problem on ppc powerbooks. If we make all this code conditional on powerpc, we avoid the issue above. Signed-off-by: Dave Jones <davej@redhat.com>
2009-08-04	[CPUFREQ] Fix a kobject reference bug related to managed CPUs	Thomas Renninger
	The first offline/online cycle is successful, the second not. Doing: echo 0 >cpu1/online echo 1 >cpu1/online echo 0 >cpu1/online The last command will trigger: Jul 22 14:39:50 linux kernel: [ 593.210125] ------------[ cut here ]------------ Jul 22 14:39:50 linux kernel: [ 593.210139] WARNING: at lib/kref.c:43 kref_get+0x23/0x2b() Jul 22 14:39:50 linux kernel: [ 593.210144] Hardware name: To Be Filled By O.E.M. Jul 22 14:39:50 linux kernel: [ 593.210148] Modules linked in: powernow_k8 Jul 22 14:39:50 linux kernel: [ 593.210158] Pid: 378, comm: kondemand/2 Tainted: G W 2.6.31-rc2 #38 Jul 22 14:39:50 linux kernel: [ 593.210163] Call Trace: Jul 22 14:39:50 linux kernel: [ 593.210171] [<ffffffff812008e8>] ? kref_get+0x23/0x2b Jul 22 14:39:50 linux kernel: [ 593.210181] [<ffffffff81041926>] warn_slowpath_common+0x77/0xa4 Jul 22 14:39:50 linux kernel: [ 593.210190] [<ffffffff81041962>] warn_slowpath_null+0xf/0x11 Jul 22 14:39:50 linux kernel: [ 593.210198] [<ffffffff812008e8>] kref_get+0x23/0x2b Jul 22 14:39:50 linux kernel: [ 593.210206] [<ffffffff811ffa19>] kobject_get+0x1a/0x22 Jul 22 14:39:50 linux kernel: [ 593.210214] [<ffffffff813e815d>] cpufreq_cpu_get+0x8a/0xcb Jul 22 14:39:50 linux kernel: [ 593.210222] [<ffffffff813e87d1>] __cpufreq_driver_getavg+0x1d/0x67 Jul 22 14:39:50 linux kernel: [ 593.210231] [<ffffffff813ea18f>] do_dbs_timer+0x158/0x27f Jul 22 14:39:50 linux kernel: [ 593.210240] [<ffffffff810529ea>] worker_thread+0x200/0x313 ... The output continues on every do_dbs_timer ondemand freq checking poll. This regression was introduced by git commit: 3f4a782b5ce2698b1870b5a7b573cd721d4fce33 The policy is released when the cpufreq device is removed in: __cpufreq_remove_dev(): /* if this isn't the CPU which is the parent of the kobj, we * only need to unlink, put and exit */ Not creating the symlink is not sever at all. As long as: sysfs_remove_link(&sys_dev->kobj, "cpufreq"); handles it gracefully that the symlink did not exist. Possibly no error should be returned at all, because ondemand governor would still provide the same functionality. Userspace in userspace gov case might be confused if the link is missing. Resolves http://bugzilla.kernel.org/show_bug.cgi?id=13903 CC: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>
2009-08-04	[CPUFREQ] Do not set policy for offline cpus	Prarit Bhargava
	Suspend/Resume fails on multi socket, multi core systems because the cpufreq code erroneously sets the per_cpu policy_cpu value when a logical cpu is offline. This most notably results in missing sysfs files that are used to set the cpu frequencies of the various cpus. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>
2009-08-04	[CPUFREQ] Fix NULL pointer dereference regression in conservative governor	Pallipadi, Venkatesh
	Commit ee88415caf736b89500f16e0a545614541a45005 introduced this regression when it removed enable bit in cpu_dbs_info_s. That added a possibility of dbs_cpufreq_notifier getting called for a CPU that is not yet managed by conservative governor. That will happen as the transition notifier is set as soon as one CPU switches to conservative governor and other CPUs can get a NULL pointer dereference without the enable bit check. Add the enable bit back again. Reported-by: Lermytte Christophe <Christophe.Lermytte@thomson.net> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>
2009-08-04	mfd: twl4030 irq fixes	Russell King
	The TWL4030 IRQ handler has a bug which leads to spinlock lock-up. It is calling the 'unmask' function in a process context. :The mask/unmask/ack functions are only designed to be called from the IRQ handler code, or the proper API interfaces found in linux/interrupt.h. Also there is no need to have IRQ chaining mechanism. The right way to handle this is to claim the parent interrupt as a standard interrupt and arrange for handle_twl4030_pih to take care of the rest of the devices. Mail thread on this issue can be found at: http://marc.info/?l=linux-arm-kernel&m=124629940123396&w=2 Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-08-04	perf_counter: Set the CONFIG_PERF_COUNTERS default to y if CONFIG_PROFILING=y	Ingo Molnar
	If user has already enabled profiling support in the kernel (for oprofile, old-style profiling of ftrace) then offer up perfcounters with a y default in interactive kconfig sessions. Still keep it off by default otherwise. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04	x86: Work around compilation warning in arch/x86/kernel/apm_32.c	Subrata Modak
	The following fix was initially inspired by David Howells fix few days back: http://lkml.org/lkml/2009/7/9/109 However, Ingo disapproves such fixes as it's dangerous (it can hide future, relevant warnings) - in something as performance-uncritical. So, initialize 'err' to '0' to work around a GCC false positive warning: http://lkml.org/lkml/2009/7/18/89 Signed-off-by: Subrata Modak<subrata@linux.vnet.ibm.com> Cc: Sachin P Sant <sachinp@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20090721023226.31855.67236.sendpatchset@subratamodak.linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04	x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq()	Jack Steiner
	In uv_setup_irq(), the call to create_irq() initially assigns IRQ vectors to cpu 0. The subsequent call to assign_irq_vector() in arch_enable_uv_irq() migrates the IRQ to another cpu and frees the cpu 0 vector - at least it will be freed as soon as the "IRQ move" completes. arch_enable_uv_irq() needs to send a cleanup IPI to complete the IRQ move. Otherwise, assignment of GRU interrupts on large systems (>200 cpus) will exhaust the cpu 0 interrupt vectors and initialization of the GRU driver will fail. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20090720142840.GA8885@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04	x86, 32-bit: Fix double accounting in reserve_top_address()	Jan Beulich
	With VMALLOC_END included in the calculation of MAXMEM (as of 2.6.28) it is no longer correct to also bump __VMALLOC_RESERVE in reserve_top_address(). Doing so results in needlessly small lowmem. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4A71DD2A020000780000D482@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04	x86: Don't use current_cpu_data in x2apic phys_pkg_id	Yinghai Lu
	One system has socket 1 come up as BSP. kexeced kernel reports BSP as: [ 1.524550] Initializing cgroup subsys cpuacct [ 1.536064] initial_apicid:20 [ 1.537135] ht_mask_width:1 [ 1.538128] core_select_mask:f [ 1.539126] core_plus_mask_width:5 [ 1.558479] CPU: Physical Processor ID: 0 [ 1.559501] CPU: Processor Core ID: 0 [ 1.560539] CPU: L1 I cache: 32K, L1 D cache: 32K [ 1.579098] CPU: L2 cache: 256K [ 1.580085] CPU: L3 cache: 24576K [ 1.581108] CPU 0/0x20 -> Node 0 [ 1.596193] CPU 0 microcode level: 0xffff0008 It doesn't have correct physical processor id and will get an error: [ 38.840859] CPU0 attaching sched-domain: [ 38.848287] domain 0: span 0,8,72 level SIBLING [ 38.851151] groups: 0 8 72 [ 38.858137] domain 1: span 0,8-15,72-79 level MC [ 38.868944] groups: 0,8,72 9,73 10,74 11,75 12,76 13,77 14,78 15,79 [ 38.881383] ERROR: parent span is not a superset of domain->span [ 38.890724] domain 2: span 0-7,64-71 level CPU [ 38.899237] ERROR: domain->groups does not contain CPU0 [ 38.909229] groups: 8-15,72-79 [ 38.912547] ERROR: groups don't span domain->span [ 38.919665] domain 3: span 0-127 level NODE [ 38.930739] groups: 0-7,64-71 8-15,72-79 16-23,80-87 24-31,88-95 32-39,96-103 40-47,104-111 48-55,112-119 56-63,120-127 it turns out: we can not use current_cpu_data in phys_pgd_id for x2apic. identify_boot_cpu() is called by check_bugs() before smp_prepare_cpus() and till smp_prepare_cpus() current_cpu_data for bsp is assigned with boot_cpu_data. Just make phys_pkg_id for x2apic is aligned to xapic. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4A6ADD0D.10002@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>