summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2016-03-11powerpc32: Remove clear_pages() and define clear_page() inlineChristophe Leroy
clear_pages() is never used expect by clear_page, and PPC32 is the only architecture (still) having this function. Neither PPC64 nor any other architecture has it. This patch removes clear_pages() and moves clear_page() function inline (same as PPC64) as it only is a few isns Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc: add inline functions for cache related instructionsChristophe Leroy
This patch adds inline functions to use dcbz, dcbi, dcbf, dcbst from C functions Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc/8xx: rewrite flush_instruction_cache() in CChristophe Leroy
On PPC8xx, flushing instruction cache is performed by writing in register SPRN_IC_CST. This registers suffers CPU6 ERRATA. The patch rewrites the fonction in C so that CPU6 ERRATA will be handled transparently Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc/8xx: rewrite set_context() in CChristophe Leroy
There is no real need to have set_context() in assembly. Now that we have mtspr() handling CPU6 ERRATA directly, we can rewrite set_context() in C language for easier maintenance. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc/8xx: remove special handling of CPU6 errata in set_dec()Christophe Leroy
CPU6 ERRATA is now handled directly in mtspr(), so we can use the standard set_dec() fonction in all cases. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macroChristophe Leroy
MPC8xx has an ERRATA on the use of mtspr() for some registers This patch includes the ERRATA handling directly into mtspr() macro so that mtspr() users don't need to bother about that errata Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc/8xx: Add missing SPRN defines into reg_8xx.hChristophe Leroy
Add missing SPRN defines into reg_8xx.h Some of them are defined in mmu-8xx.h, so we include mmu-8xx.h in reg_8xx.h, for that we remove references to PAGE_SHIFT in mmu-8xx.h to have it self sufficient, as includers of reg_8xx.h don't all include asm/page.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc32: remove ioremap_baseChristophe Leroy
ioremap_base is not initialised and is nowhere used so remove it Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc32: Remove useless/wrong MMU:setio progress messageChristophe Leroy
Commit 771168494719 ("[POWERPC] Remove unused machine call outs") removed the call to setup_io_mappings(), so remove the associated progress line message Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() togetherChristophe Leroy
x_mapped_by_bats() and x_mapped_by_tlbcam() serve the same kind of purpose, and are never defined at the same time. So rename them x_block_mapped() and define them in the relevant places Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc32: Fix pte_offset_kernel() to return NULL for bad pagesChristophe Leroy
The fixmap related functions try to map kernel pages that are already mapped through Large TLBs. pte_offset_kernel() has to return NULL for LTLBs, otherwise the caller will try to access level 2 table which doesn't exist Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.cChristophe Leroy
Now we have a 8xx specific .c file for that so put it in there as other powerpc variants do Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc/8xx: Map linear kernel RAM with 8M pagesChristophe Leroy
On a live running system (VoIP gateway for Air Trafic Control), over a 10 minutes period (with 277s idle), we get 87 millions DTLB misses and approximatly 35 secondes are spent in DTLB handler. This represents 5.8% of the overall time and even 10.8% of the non-idle time. Among those 87 millions DTLB misses, 15% are on user addresses and 85% are on kernel addresses. And within the kernel addresses, 93% are on addresses from the linear address space and only 7% are on addresses from the virtual address space. MPC8xx has no BATs but it has 8Mb page size. This patch implements mapping of kernel RAM using 8Mb pages, on the same model as what is done on the 40x. In 4k pages mode, each PGD entry maps a 4Mb area: we map every two entries to the same 8Mb physical page. In each second entry, we add 4Mb to the page physical address to ease life of the FixupDAR routine. This is just ignored by HW. In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry will point to the first page of the area. The DTLB handler adds the 3 bits from EPN to map the correct page. With this patch applied, we now get only 13 millions TLB misses during the 10 minutes period. The idle time has increased to 313s and the overall time spent in DTLB miss handler is 6.3s, which represents 1% of the overall time and 2.2% of non-idle time. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11powerpc/8xx: Save r3 all the time in DTLB miss handlerChristophe Leroy
We are spending between 40 and 160 cycles with a mean of 65 cycles in the DTLB handling routine (measured with mftbl) so make it more simple althought it adds one instruction. With this modification, we get three registers available at all time, which will help with following patch. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-11Merge branch 'topic/mprofile-kernel' into nextMichael Ellerman
Merge the ftrace changes to support -mprofile-kernel on ppc64le. This is a prerequisite for live patching, the support for which will be merged via the livepatch tree based on this topic branch.
2016-03-10powerpc/perf: Fix misleading comment in pmao_restore_workaround()Madhavan Srinivasan
The current comment in pmao_restore_workaround() regarding hard_irq_disable() is wrong. It should say to hard *disable* interrupts instead of *enable*. Fix it. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10powerpc/perf/24x7: Eliminate domain suffix in event namesSukadev Bhattiprolu
The Physical Core events of the 24x7 PMU can be monitored across various domains (physical core, vcpu home core, vcpu home node etc). For each of these core events, we currently create multiple events in sysfs, one for each domain the event can be monitored in. These events are distinguished by their suffixes like __PHYS_CORE, __VCPU_HOME_CORE etc. Rather than creating multiple such entries, we could let the user specify make 'domain' index a required parameter and let the user specify a value for it (like they currently specify the core index). $ cat /sys/bus/event_source/devices/hv_24x7/events/HPM_CCYC domain=?,offset=0x98,core=?,lpar=0x0 $ perf stat -C 0 -e hv_24x7/HPM_CCYC,domain=2,core=1/ true (the 'domain=?' and 'core=?' in sysfs tell perf tool to enforce them as required parameters). This simplifies the interface and allows users to identify events by the name specified in the catalog (User can determine the domain index by referring to '/sys/bus/event_source/devices/hv_24x7/interface/domains'). Eliminating the event suffix eliminates several functions and simplifies code. Note that Physical Chip events can only be monitored in the chip domain so those events have the domain set to 1 (rather than =?) and users don't need to specify the domain index for the Chip events. $ cat /sys/bus/event_source/devices/hv_24x7/events/PM_XLINK_CYCLES domain=1,offset=0x230,chip=?,lpar=0x0 $ perf stat -C 0 -e hv_24x7/PM_XLINK_CYCLES,chip=1/ true Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10powerpc/perf/hv-24x7: Display domain indices in sysfsSukadev Bhattiprolu
To help users determine domains, display the domain indices used by the kernel in sysfs. $ cat /sys/bus/event_source/devices/hv_24x7/interface/domains 1: Physical Chip 2: Physical Core 3: VCPU Home Core 4: VCPU Home Chip 5: VCPU Home Node 6: VCPU Remote Node Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10powerpc/perf/hv-24x7: Display change in counter valuesSukadev Bhattiprolu
For 24x7 counters, perf displays the raw value of the 24x7 counter, which is a monotonically increasing value. perf stat -C 0 -e \ 'hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/' \ sleep 1 Performance counter stats for 'CPU(s) 0': 9,105,403,170 hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/ 0.000425751 seconds time elapsed In the typical usage of 'perf stat' this counter value is not as useful as the _change_ in the counter value over the duration of the application. Have h_24x7_event_init() set the event's prev_count to the raw value of the 24x7 counter at the time of initialization. When the application terminates, hv_24x7_event_read() will compute the change in value and report to the perf tool. Similarly, for the transaction interface, clear the event count to 0 at the beginning of the transaction. perf stat -C 0 -e \ 'hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/' \ sleep 1 Performance counter stats for 'CPU(s) 0': 245,758 hv_24x7/HPM_0THRD_NON_IDLE_CCYC__PHYS_CORE,core=1/ 1.006366383 seconds time elapsed Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10powerpc/perf/hv-24x7: Fix usage with chip events.Sukadev Bhattiprolu
24x7 counters can belong to different domains (core, chip, virtual CPU etc). For events in the 'chip' domain, sysfs entry currently looks like: $ cd /sys/bus/event_source/devices/hv_24x7/events $ cat PM_XLINK_CYCLES__PHYS_CHIP domain=0x1,offset=0x230,core=?,lpar=0x0 where the required parameter, 'core=?' is specified with perf as: perf stat -C 0 -e hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,core=1/ \ /bin/true This is inconsistent in that 'core' is a required parameter for a chip event. Instead, have the the sysfs entry display 'chip=?' for chip events: $ cd /sys/bus/event_source/devices/hv_24x7/events $ cat PM_XLINK_CYCLES__PHYS_CHIP domain=0x1,offset=0x230,chip=?,lpar=0x0 We also need to add a 'chip' entry in the sysfs format directory: $ ls /sys/bus/event_source/devices/hv_24x7/format chip core domain lpar offset vcpu ^^^^ (new) so the perf tool can automatically check usage and format the chip parameter correctly: $ perf stat -C 0 -v -e hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP/ \ /bin/true Required parameter 'chip' not specified invalid or unsupported event: 'hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP/' $ perf stat -C 0 -v -e hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,chip=1/ \ /bin/true hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,chip=1/: 0 6628908 6628908 Performance counter stats for 'CPU(s) 0': 0 hv_24x7/PM_XLINK_CYCLES__PHYS_CHIP,chip=1/ 0.006606970 seconds time elapsed Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10powerpc/perf: Export Power8 generic and cache events to sysfsSukadev Bhattiprolu
Power8 supports a large number of events in each susbystem so when a user runs: perf stat -e branch-instructions sleep 1 perf stat -e L1-dcache-loads sleep 1 it is not clear as to which PMU events were monitored. Export the generic hardware and cache perf events for Power8 to sysfs, so users can precisely determine the PMU event monitored by the generic event. Eg: cat /sys/bus/event_source/devices/cpu/events/branch-instructions event=0x10068 $ cat /sys/bus/event_source/devices/cpu/events/L1-dcache-loads event=0x100ee Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10powerpc/perf: Remove PME_ prefix for power7 eventsSukadev Bhattiprolu
We used the PME_ prefix earlier to avoid some macro/variable name collisions. We have since changed the way we define/use the event macros so we no longer need the prefix. By dropping the prefix, we keep the the event macros consistent with their official names. Reported-by: Michael Ellerman <ellerman@au1.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-10Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-09powerpc/p5040: Add device node for RAID EngineXuelin Shi
add the missing RAID Engine device node for p5040. otherwise, the device can not be detected. Signed-off-by: Xuelin Shi <xuelin.shi@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09powerpc: optimise csum_partial() call when len is constantChristophe Leroy
csum_partial is often called for small fixed length packets for which it is suboptimal to use the generic csum_partial() function. For instance, in my configuration, I got: * One place calling it with constant len 4 * Seven places calling it with constant len 8 * Three places calling it with constant len 14 * One place calling it with constant len 20 * One place calling it with constant len 24 * One place calling it with constant len 32 This patch renames csum_partial() to __csum_partial() and implements csum_partial() as a wrapper inline function which * uses csum_add() for small 16bits multiple constant length * uses ip_fast_csum() for other 32bits multiple constant * uses __csum_partial() in all other cases Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09powerpc/fsl-lbc: Modify suspend/resume entry sequenceRaghav Dogra
Modify platform driver suspend/resume to syscore suspend/resume. This is because p1022ds needs to use localbus when entering the PCIE resume. Signed-off-by: Raghav Dogra <raghav.dogra@nxp.com> [scottwood: dropped makefile churn] Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09powerpc/8xx: CONFIG_DEBUG_PAGEALLOC requires ITLBmiss for kernel addressesChristophe Leroy
When CONFIG_DEBUG_PAGEALLOC is activated, the initial TLB mapping gets flushed to track accesses to wrong areas. Therefore, kernel addresses will also generate ITLB misses. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09powerpc/885: set SDCR to 0x40Christophe Leroy
The MPC885 reference manual says that SDCR shall have value 0x40, but most exemples set SDCR to 0x1 With 0x1 in SDCR, we observe TX underruns on SCC when using it in QMC mode. According the NXP technical support, this is a copy/paste error from MPC860 reference manual, 0x40 being the only value supported by the MPC885 HW. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09powerpc/86xx: disable IDE subsystem in mpc8610_hpcd_defconfigBartlomiej Zolnierkiewicz
This patch disables deprecated IDE subsystem in mpc8610_hpcd_defconfig (no IDE host drivers are selected in this config so there is no valid reason to enable IDE subsystem itself). Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09powerpc/85xx: disable IDE subsystem in stx_gp3_defconfigBartlomiej Zolnierkiewicz
This patch disables deprecated IDE subsystem in stx_gp3_defconfig (no IDE host drivers are selected in this config so there is no valid reason to enable IDE subsystem itself). Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09powerpc/85xx: disable IDE subsystem in ksi8560_defconfigBartlomiej Zolnierkiewicz
This patch disables deprecated IDE subsystem in ksi8560_defconfig (no IDE host drivers are selected in this config so there is no valid reason to enable IDE subsystem itself). Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09powerpc/83xx: disable IDE subsystem in mpc834x_itx_defconfigBartlomiej Zolnierkiewicz
This patch disables deprecated IDE subsystem in mpc834x_itx_defconfig (no IDE host drivers are selected in this config so there is no valid reason to enable IDE subsystem itself). Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-09Merge tag 'kvm-arm-for-4.6' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.6 - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - Various optimizations to the vgic save/restore code Conflicts: include/uapi/linux/kvm.h
2016-03-09Merge branch 'ib-mfd-regulator-gpio-4.6' of ↵Linus Walleij
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into devel
2016-03-09powerpc: New possible return value from hcallChristophe Lombard
The hcalls introduced for cxl use a possible new value: H_STATE (invalid state). Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: eeh_pci_enable(): fix checking of post-request stateAndrew Donnellan
In eeh_pci_enable(), after making the request to set the new options, we call eeh_ops->wait_state() to check that the request finished successfully. At the moment, if eeh_ops->wait_state() returns 0, we return 0 without checking that it reflects the expected outcome. This can lead to callers further up the chain incorrectly assuming the slot has been successfully unfrozen and continuing to attempt recovery. On powernv, this will occur if pnv_eeh_get_pe_state() or pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or OPAL_EEH_PHB_ERROR respectively. On pseries, this will occur if pseries_eeh_get_state() returns 0, which in turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA Stopped states. Obviously, none of these cases represent a successful completion of a request to thaw MMIO or DMA. Fix the check so that a wait_state() return value of 0 won't be considered successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Remove duplicated check in eeh_dump_pe_log()Gavin Shan
When eeh_dump_pe_log() is only called by eeh_slot_error_detail(), we already have the check that the PE isn't in PCI config blocked state in eeh_slot_error_detail(). So we needn't the duplicated check in eeh_dump_pe_log(). This removes the duplicated check in eeh_dump_pe_log(). No logical changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Synchronize recovery in host/guestGavin Shan
When passing through SRIOV VFs to guest, we possibly encounter EEH error on PF. In this case, the VF PEs are put into frozen state. The error could be reported to guest before it's captured by the host. That means the guest could attempt to recover errors on VFs before host gets chance to recover errors on PFs. The VFs won't be recovered successfully. This enforces the recovery order for above case: the recovery on child PE in guest is hold until the recovery on parent PE in host is completed. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Don't remove passed VFsGavin Shan
When we have partial hotplug as part of the error recovery on PF, the VFs that are bound with vfio-pci driver will experience hotplug. That's not allowed. This checks if the VF PE is passed or not. If it does, we leave the VF without removing it. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Don't propagate error to guestGavin Shan
When EEH error happened to the parent PE of those PEs that have been passed through to guest, the error is propagated to guest domain and the VFIO driver's error handlers are called. It's not correct as the error in the host domain shouldn't be propagated to guests and affect them. This adds one more limitation when calling EEH error handlers. If the PE has been passed through to guest, the error handlers won't be called. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: powerpc/eeh: Support error recovery for VF PEWei Yang
PFs are enumerated on PCI bus, while VFs are created by PF's driver. In EEH recovery, it has two cases: 1. Device and driver is EEH aware, error handlers are called. 2. Device and driver is not EEH aware, un-plug the device and plug it again by enumerating it. The special thing happens on the second case. For a PF, we could use the original pci core to enumerate the bus, while for VF we need to record the VFs which aer un-plugged then plug it again. Also The patch caches the VF index in pci_dn, which can be used to calculate VF's bus, device and function number. Those information helps to locate the VF's PCI device instance when doing hotplug during EEH recovery if necessary. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/powernv: Support PCI config restore for VFsWei Yang
After PE reset, OPAL API opal_pci_reinit() is called on all devices contained in the PE to reinitialize them. While skiboot is not aware of VFs, we have to implement the function in kernel to reinitialize VFs after reset on PE for VFs. In this patch, two functions pnv_pci_fixup_vf_mps() and pnv_eeh_restore_vf_config() both manipulate the MPS of the VF, since for a VF it has three cases. 1. Normal creation for a VF In this case, pnv_pci_fixup_vf_mps() is called to make the MPS a proper value compared with its parent. 2. EEH recovery without VF removed In this case, MPS is stored in pci_dn and pnv_eeh_restore_vf_config() is called to restore it and reinitialize other part. 3. EEH recovery with VF removed In this case, VF will be removed then re-created. Both functions are called. First pnv_pci_fixup_vf_mps() is called to store the proper MPS to pci_dn and then pnv_eeh_restore_vf_config() is called to do proper thing. This introduces two functions: pnv_pci_fixup_vf_mps() to fixup the VF's MPS to make sure it is equal to parent's and store this value in pci_dn for future use. pnv_eeh_restore_vf_config() to re-initialize on VF by restoring MPS, disabling completion timeout, enabling SERR, etc. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/powernv: Support EEH reset for VF PEWei Yang
PEs for VFs don't have primary bus. So they have to have their own reset backend, which is used during EEH recovery. The patch implements the reset backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained in the PE. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Create PE for VFsWei Yang
This creates PEs for VFs in the weak function pcibios_bus_add_device(). Those PEs for VFs are identified with newly introduced flag EEH_PE_VF so that we treat them differently during EEH recovery. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: EEH device for VFWei Yang
VFs and their corresponding pdn are created and released dynamically when their PF's SRIOV capability is enabled and disabled. This creates and releases EEH devices for VFs when creating and releasing their pdn instances, which means EEH devices and pdn instances have same life cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn). Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Cache normal BARs, not windows or IOV BARsWei Yang
This restricts the EEH address cache to use only the first 7 BARs. This makes __eeh_addr_cache_insert_dev() ignore PCI bridge window and IOV BARs. As the result of this change, eeh_addr_cache_get_dev() will return VFs from VF's resource addresses instead of parent PFs. This also removes PCI bridge check as we limit __eeh_addr_cache_insert_dev() to 7 BARs and this effectively excludes PCI bridges from being cached. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/pci: Remove VFs prior to PFWei Yang
As commit ac205b7bb72f ("PCI: make sriov work with hotplug remove") indicates, VFs which is on the same PCI bus as their PF, should be removed before the PF. Otherwise, we might run into kernel crash at PCI unplugging time. This applies the above pattern to powerpc PCI hotplug path. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Reworked eeh_pe_bus_get()Gavin Shan
The original implementation is ugly: unnecessary if statements and "out" tag. This reworks the function to avoid above weaknesses. No functional changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-08PCI: Include pci/hotplug Kconfig directly from pci/KconfigBjorn Helgaas
Include pci/hotplug/Kconfig directly from pci/Kconfig, so arches don't have to source both pci/Kconfig and pci/hotplug/Kconfig. Note that this effectively adds pci/hotplug/Kconfig to the following arches, because they already sourced drivers/pci/Kconfig but they previously did not source drivers/pci/hotplug/Kconfig: alpha arm avr32 frv m68k microblaze mn10300 sparc unicore32 Inspired-by-patch-from: Bogicevic Sasa <brutallesale@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-08PCI: Include pci/pcie/Kconfig directly from pci/KconfigBogicevic Sasa
Include pci/pcie/Kconfig directly from pci/Kconfig, so arches don't have to source both pci/Kconfig and pci/pcie/Kconfig. Note that this effectively adds pci/pcie/Kconfig to the following arches, because they already sourced drivers/pci/Kconfig but they previously did not source drivers/pci/pcie/Kconfig: alpha avr32 blackfin frv m32r m68k microblaze mn10300 parisc sparc unicore32 xtensa [bhelgaas: changelog, source pci/pcie/Kconfig at top of pci/Kconfig, whitespace] Signed-off-by: Sasa Bogicevic <brutallesale@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>