path: root/arch/powerpc/lib/Makefile
Age  Commit message  Author
2024-05-07  powerpc/Makefile: Remove bits related to the previous use of -mcmodel=large  (Naveen N Rao)
All supported compilers today (gcc v5.1+ and clang v11+) have support for -mcmodel=medium. As such, NO_MINIMAL_TOC is no longer being set. Remove NO_MINIMAL_TOC as well as the fallback to -mminimal-toc. Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240110141237.3179199-1-naveen@kernel.org
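As a kbuild sketch of the kind of fallback removed (the exact spelling in the old Makefile is an assumption here):

  # Removed fallback: prefer -mcmodel=medium, else fall back to -mminimal-toc
  CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcmodel=medium,$(call cc-option,-mminimal-toc))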
2024-03-07  powerpc: xor_vmx: Add '-mhard-float' to CFLAGS  (Nathan Chancellor)
arch/powerpc/lib/xor_vmx.o is built with '-msoft-float' (from the main powerpc Makefile) and '-maltivec' (from its CFLAGS), which causes an error when building with clang after a recent change in main:

  error: option '-msoft-float' cannot be specified with '-maltivec'
  make[6]: *** [scripts/Makefile.build:243: arch/powerpc/lib/xor_vmx.o] Error 1

Explicitly add '-mhard-float' before '-maltivec' in xor_vmx.o's CFLAGS to override the previous inclusion of '-msoft-float' (as the last option wins), which matches how other areas of the kernel use '-maltivec', such as AMDGPU.

Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/1986
Link: https://github.com/llvm/llvm-project/commit/4792f912b232141ecba4cbae538873be3c28556c
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240127-ppc-xor_vmx-drop-msoft-float-v1-1-f24140e81376@kernel.org
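A minimal sketch of the resulting CFLAGS line (any flag beyond the -mhard-float/-maltivec ordering is an assumption):

  # The last float option on the command line wins, so -mhard-float must
  # precede -maltivec to override the arch-wide -msoft-float.
  CFLAGS_xor_vmx.o += -mhard-float -maltivec $(call cc-option,-mabi=altivec)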
2023-11-27  powerpc: add crtsavres.o to always-y instead of extra-y  (Masahiro Yamada)
crtsavres.o is linked to modules. However, as explained in commit d0e628cd817f ("kbuild: doc: clarify the difference between extra-y and always-y"), 'make modules' does not build extra-y. For example, the following command fails:

  $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig modules
  [snip]
    LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
  ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or directory
  make[3]: *** [scripts/Makefile.modfinal:56: arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
  make[2]: *** [Makefile:1844: modules] Error 2
  make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: __build_one_by_one] Error 2
  make: *** [Makefile:234: __sub-make] Error 2

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Fixes: baa25b571a16 ("powerpc/64: Do not link crtsavres.o in vmlinux")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231120232332.4100288-1-masahiroy@kernel.org
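The shape of the fix, as a sketch:

  # extra-y objects are only built when vmlinux is; always-y objects are
  # also built by 'make modules', which is what module links need here.
  always-$(CONFIG_PPC64) += crtsavres.o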
2023-08-24  powerpc: Drop zalloc_maybe_bootmem()  (Michael Ellerman)
The only callers of zalloc_maybe_bootmem() are PCI setup routines. These used to be called early during boot before slab setup, and also during runtime due to hotplug. But commit 5537fcb319d0 ("powerpc/pci: Add ppc_md.discover_phbs()") moved the boot-time calls later, after slab setup, meaning there's no longer any need for zalloc_maybe_bootmem(); kzalloc() can be used in all cases.
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230823055430.752550-1-mpe@ellerman.id.au
2023-06-27  powerpc: remove checks for binutils older than 2.25  (Masahiro Yamada)
Commit e4412739472b ("Documentation: raise minimum supported version of binutils to 2.25") allows us to remove the checks for old binutils. There are no more users of ld-ifversion, so remove it as well.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230119082250.151485-1-masahiroy@kernel.org
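For reference, a sketch of the kind of guard this removes (flag and version number are illustrative, not from the powerpc Makefile):

  # With binutils >= 2.25 guaranteed, version-conditional flags like this
  # collapse to unconditional ones:
  # ldflags-y += $(call ld-ifversion,-ge,22500,--some-flag)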
2023-02-10  powerpc/kcsan: Add exclusions from instrumentation  (Rohan McLure)
Exclude various incompatible compilation units from KCSAN instrumentation. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230206021801.105268-2-rmclure@linux.ibm.com
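The exclusions follow the standard per-object kbuild pattern; a sketch with illustrative file names:

  # Opt these objects out of KCSAN instrumentation
  KCSAN_SANITIZE_code-patching.o := n
  KCSAN_SANITIZE_feature-fixups.o := n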
2022-12-02  powerpc/qspinlock: powerpc qspinlock implementation  (Nicholas Piggin)
Add a powerpc specific implementation of queued spinlocks. This is the build framework with a very simple (non-queued) spinlock implementation to begin with. Later changes add queueing, and other features and optimisations one-at-a-time. It is done this way to more easily see how the queued spinlocks are built, and to make performance and correctness bisects more useful. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Drop paravirt.h & processor.h changes to fix 32-bit build] [mpe: Fix 32-bit build of qspinlock.o & disallow GENERIC_LOCKBREAK per Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/CONLLQB6DCJU.2ZPOS7T6S5GRR@bobo
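On the build side, the framework amounts to hooking the new object into this Makefile; a sketch (the config symbol spelling is an assumption):

  obj-$(CONFIG_PPC_QUEUED_SPINLOCKS) += qspinlock.o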
2022-05-22  powerpc/kasan: Don't instrument non-maskable or raw interrupts  (Daniel Axtens)
Disable address sanitization for raw and non-maskable interrupt handlers, because they can run in real mode, where we cannot access the shadow memory. (Note that kasan_arch_is_ready() doesn't test for real mode, since it is a static branch for speed, and in any case not all the entry points to the generic KASAN code are protected by kasan_arch_is_ready guards.)

The changes to interrupt_nmi_enter/exit_prepare() look larger than they actually are. The changes are equivalent to adding !IS_ENABLED(CONFIG_KASAN) to the conditions for calling nmi_enter() or nmi_exit() in real mode. That is, the code is equivalent to using the following condition for calling nmi_enter/exit:

  if (((!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
        !firmware_has_feature(FW_FEATURE_LPAR) ||
        radix_enabled()) &&
       !IS_ENABLED(CONFIG_KASAN)) ||
      (mfmsr() & MSR_DR))

That unwieldy condition has been split into several statements with comments, for easier reading.

The nmi_ipi_lock functions that call atomic functions (i.e., nmi_ipi_lock_start(), nmi_ipi_lock() and nmi_ipi_unlock()), besides being marked noinstr, now call arch_atomic_* functions instead of atomic_* functions because with KASAN enabled, the atomic_* functions are wrappers which explicitly do address sanitization on their arguments. Since we are trying to avoid address sanitization, we have to use the lower-level arch_atomic_* versions.

In hv_nmi_check_nonrecoverable(), the regs_set_unrecoverable() call has been open-coded so as to avoid having to either trust the inlining or mark regs_set_unrecoverable() as noinstr.

[paulus@ozlabs.org: combined a few work-in-progress commits of Daniel's and wrote the commit message.]
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoTFGaKM8Pd46PIK@cleo
2021-12-23  powerpc/32: Fix boot failure with GCC latent entropy plugin  (Christophe Leroy)
Boot fails with the GCC latent entropy plugin enabled. This is due to early boot functions trying to access 'latent_entropy' global data while the kernel is not relocated at its final destination yet. As there is no way to tell GCC to use PTRRELOC() to access it, disable the latent entropy plugin in early_32.o, feature-fixups.o and code-patching.o.
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable@vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.1640178426.git.christophe.leroy@csgroup.eu
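The per-object knob kbuild provides for this, as a sketch:

  # DISABLE_LATENT_ENTROPY_PLUGIN is defined by scripts/Makefile.gcc-plugins
  CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
  CFLAGS_feature-fixups.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
  CFLAGS_code-patching.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)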
2021-12-23  powerpc/code-patching: Move code patching selftests in its own file  (Christophe Leroy)
Code patching selftests are half of code-patching.c. As they are guarded by CONFIG_CODE_PATCHING_SELFTESTS, they'd be better in their own file. Also add a missing __init for instr_is_branch_to_addr() Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c0c30504f04eb546a48ff77127a8bccd12a3d809.1638446239.git.christophe.leroy@csgroup.eu
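Build-wise the split boils down to one new conditional object; a sketch (object name assumed, config symbol taken from the commit text):

  obj-$(CONFIG_CODE_PATCHING_SELFTESTS) += test-code-patching.o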
2021-12-23  powerpc/code-patching: Use test_trampoline for prefixed patch test  (Christophe Leroy)
Use the dedicated test_trampoline function for testing prefixed patching like other tests and remove the hand coded assembly stuff. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a450ef3f8653f75e1bd9aaf7a3889d379752f33b.1638446239.git.christophe.leroy@csgroup.eu
2021-09-22  isystem: delete global -isystem compile option  (Alexey Dobriyan)
Further isolate kernel from userspace, prevent accidental inclusion of undesireable headers, mainly float.h and stdatomic.h. nds32 keeps -isystem globally due to intrinsics used in entrenched header. -isystem is selectively reenabled for some files, again, for intrinsics. Compile tested on: hexagon-defconfig hexagon-allmodconfig alpha-allmodconfig alpha-allnoconfig alpha-defconfig arm64-allmodconfig arm64-allnoconfig arm64-defconfig arm-am200epdkit arm-aspeed_g4 arm-aspeed_g5 arm-assabet arm-at91_dt arm-axm55xx arm-badge4 arm-bcm2835 arm-cerfcube arm-clps711x arm-cm_x300 arm-cns3420vb arm-colibri_pxa270 arm-colibri_pxa300 arm-collie arm-corgi arm-davinci_all arm-dove arm-ep93xx arm-eseries_pxa arm-exynos arm-ezx arm-footbridge arm-gemini arm-h3600 arm-h5000 arm-hackkit arm-hisi arm-imote2 arm-imx_v4_v5 arm-imx_v6_v7 arm-integrator arm-iop32x arm-ixp4xx arm-jornada720 arm-keystone arm-lart arm-lpc18xx arm-lpc32xx arm-lpd270 arm-lubbock arm-magician arm-mainstone arm-milbeaut_m10v arm-mini2440 arm-mmp2 arm-moxart arm-mps2 arm-multi_v4t arm-multi_v5 arm-multi_v7 arm-mv78xx0 arm-mvebu_v5 arm-mvebu_v7 arm-mxs arm-neponset arm-netwinder arm-nhk8815 arm-omap1 arm-omap2plus arm-orion5x arm-oxnas_v6 arm-palmz72 arm-pcm027 arm-pleb arm-pxa arm-pxa168 arm-pxa255-idp arm-pxa3xx arm-pxa910 arm-qcom arm-realview arm-rpc arm-s3c2410 arm-s3c6400 arm-s5pv210 arm-sama5 arm-shannon arm-shmobile arm-simpad arm-socfpga arm-spear13xx arm-spear3xx arm-spear6xx arm-spitz arm-stm32 arm-sunxi arm-tct_hammer arm-tegra arm-trizeps4 arm-u8500 arm-versatile arm-vexpress arm-vf610m4 arm-viper arm-vt8500_v6_v7 arm-xcep arm-zeus csky-allmodconfig csky-allnoconfig csky-defconfig h8300-edosk2674 h8300-h8300h-sim h8300-h8s-sim i386-allmodconfig i386-allnoconfig i386-defconfig ia64-allmodconfig ia64-allnoconfig ia64-bigsur ia64-generic ia64-gensparse ia64-tiger ia64-zx1 m68k-amcore m68k-amiga m68k-apollo m68k-atari m68k-bvme6000 m68k-hp300 m68k-m5208evb m68k-m5249evb m68k-m5272c3 m68k-m5275evb m68k-m5307c3 m68k-m5407c3 m68k-m5475evb m68k-mac m68k-multi m68k-mvme147 m68k-mvme16x m68k-q40 m68k-stmark2 m68k-sun3 m68k-sun3x microblaze-allmodconfig microblaze-allnoconfig microblaze-mmu mips-ar7 mips-ath25 mips-ath79 mips-bcm47xx mips-bcm63xx mips-bigsur mips-bmips_be mips-bmips_stb mips-capcella mips-cavium_octeon mips-ci20 mips-cobalt mips-cu1000-neo mips-cu1830-neo mips-db1xxx mips-decstation mips-decstation_64 mips-decstation_r4k mips-e55 mips-fuloong2e mips-gcw0 mips-generic mips-gpr mips-ip22 mips-ip27 mips-ip28 mips-ip32 mips-jazz mips-jmr3927 mips-lemote2f mips-loongson1b mips-loongson1c mips-loongson2k mips-loongson3 mips-malta mips-maltaaprp mips-malta_kvm mips-malta_qemu_32r6 mips-maltasmvp mips-maltasmvp_eva mips-maltaup mips-maltaup_xpa mips-mpc30x mips-mtx1 mips-nlm_xlp mips-nlm_xlr mips-omega2p mips-pic32mzda mips-pistachio mips-qi_lb60 mips-rb532 mips-rbtx49xx mips-rm200 mips-rs90 mips-rt305x mips-sb1250_swarm mips-tb0219 mips-tb0226 mips-tb0287 mips-vocore2 mips-workpad mips-xway nds32-allmodconfig nds32-allnoconfig nds32-defconfig nios2-10m50 nios2-3c120 nios2-allmodconfig nios2-allnoconfig openrisc-allmodconfig openrisc-allnoconfig openrisc-or1klitex openrisc-or1ksim openrisc-simple_smp parisc-allnoconfig parisc-generic-32bit parisc-generic-64bit powerpc-acadia powerpc-adder875 powerpc-akebono powerpc-amigaone powerpc-arches powerpc-asp8347 powerpc-bamboo powerpc-bluestone powerpc-canyonlands powerpc-cell powerpc-chrp32 powerpc-cm5200 powerpc-currituck powerpc-ebony powerpc-eiger powerpc-ep8248e powerpc-ep88xc 
powerpc-fsp2 powerpc-g5 powerpc-gamecube powerpc-ge_imp3a powerpc-holly powerpc-icon powerpc-iss476-smp powerpc-katmai powerpc-kilauea powerpc-klondike powerpc-kmeter1 powerpc-ksi8560 powerpc-linkstation powerpc-lite5200b powerpc-makalu powerpc-maple powerpc-mgcoge powerpc-microwatt powerpc-motionpro powerpc-mpc512x powerpc-mpc5200 powerpc-mpc7448_hpc2 powerpc-mpc8272_ads powerpc-mpc8313_rdb powerpc-mpc8315_rdb powerpc-mpc832x_mds powerpc-mpc832x_rdb powerpc-mpc834x_itx powerpc-mpc834x_itxgp powerpc-mpc834x_mds powerpc-mpc836x_mds powerpc-mpc836x_rdk powerpc-mpc837x_mds powerpc-mpc837x_rdb powerpc-mpc83xx powerpc-mpc8540_ads powerpc-mpc8560_ads powerpc-mpc85xx_cds powerpc-mpc866_ads powerpc-mpc885_ads powerpc-mvme5100 powerpc-obs600 powerpc-pasemi powerpc-pcm030 powerpc-pmac32 powerpc-powernv powerpc-ppa8548 powerpc-ppc40x powerpc-ppc44x powerpc-ppc64 powerpc-ppc64e powerpc-ppc6xx powerpc-pq2fads powerpc-ps3 powerpc-pseries powerpc-rainier powerpc-redwood powerpc-sam440ep powerpc-sbc8548 powerpc-sequoia powerpc-skiroot powerpc-socrates powerpc-storcenter powerpc-stx_gp3 powerpc-taishan powerpc-tqm5200 powerpc-tqm8540 powerpc-tqm8541 powerpc-tqm8548 powerpc-tqm8555 powerpc-tqm8560 powerpc-tqm8xx powerpc-walnut powerpc-warp powerpc-wii powerpc-xes_mpc85xx riscv-allmodconfig riscv-allnoconfig riscv-nommu_k210 riscv-nommu_k210_sdcard riscv-nommu_virt riscv-rv32 s390-allmodconfig s390-allnoconfig s390-debug s390-zfcpdump sh-ap325rxa sh-apsh4a3a sh-apsh4ad0a sh-dreamcast sh-ecovec24 sh-ecovec24-romimage sh-edosk7705 sh-edosk7760 sh-espt sh-hp6xx sh-j2 sh-kfr2r09 sh-kfr2r09-romimage sh-landisk sh-lboxre2 sh-magicpanelr2 sh-microdev sh-migor sh-polaris sh-r7780mp sh-r7785rp sh-rsk7201 sh-rsk7203 sh-rsk7264 sh-rsk7269 sh-rts7751r2d1 sh-rts7751r2dplus sh-sdk7780 sh-sdk7786 sh-se7206 sh-se7343 sh-se7619 sh-se7705 sh-se7712 sh-se7721 sh-se7722 sh-se7724 sh-se7750 sh-se7751 sh-se7780 sh-secureedge5410 sh-sh03 sh-sh2007 sh-sh7710voipgw sh-sh7724_generic sh-sh7757lcr sh-sh7763rdp sh-sh7770_generic sh-sh7785lcr sh-sh7785lcr_32bit sh-shmin sh-shx3 sh-titan sh-ul2 sh-urquell sparc-allmodconfig sparc-allnoconfig sparc-sparc32 sparc-sparc64 um-i386-allmodconfig um-i386-allnoconfig um-i386-defconfig um-x86_64-allmodconfig um-x86_64-allnoconfig x86_64-allmodconfig x86_64-allnoconfig x86_64-defconfig xtensa-allmodconfig xtensa-allnoconfig xtensa-audio_kc705 xtensa-cadence_csp xtensa-common xtensa-generic_kc705 xtensa-iss xtensa-nommu_kc705 xtensa-smp_lx200 xtensa-virt xtensa-xip_kc705 Tested-by: Nathan Chancellor <nathan@kernel.org> # build (hexagon) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
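The selective re-enable mentioned above follows this per-object pattern (file name illustrative):

  # Give this object the compiler's own intrinsics headers back
  CFLAGS_somefile.o += -isystem $(shell $(CC) -print-file-name=include)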
2021-07-01  powerpc: Only build restart_table.c for 64s  (Michael Ellerman)
Commit 9b69d48c7516 ("powerpc/64e: remove implicit soft-masking and interrupt exit restart logic") limited the implicit soft masking and restart logic to 64-bit Book3S only. However we are still building restart_table.c for all 64-bit, ie. Book3E also. There's no need to build it for 64e, and it also causes missing prototype warnings for 64e builds, because the prototype is already behind an #ifdef PPC_BOOK3S_64. Fixes: 9b69d48c7516 ("powerpc/64e: remove implicit soft-masking and interrupt exit restart logic") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701125026.292224-1-mpe@ellerman.id.au
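The fix is a one-line scoping change in this Makefile; a sketch:

  # Was built for all of CONFIG_PPC64; restart tables only exist on Book3S-64
  obj-$(CONFIG_PPC_BOOK3S_64) += restart_table.o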
2021-06-25  powerpc/64: allow alternate return locations for soft-masked interrupts  (Nicholas Piggin)
The exception table fixup adjusts a failed page fault's interrupt return location if it was taken at an address specified in the exception table, to a corresponding fixup handler address. Introduce a variation of that idea which adds a fixup table for NMIs and soft-masked asynchronous interrupts. This will be used to protect certain critical sections that are sensitive to being clobbered by interrupts coming in (due to using the same SPRs and/or irq soft-mask state). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-10-npiggin@gmail.com
2021-05-04  powerpc/32: Fix boot failure with CONFIG_STACKPROTECTOR  (Christophe Leroy)
Commit 7c95d8893fb5 ("powerpc: Change calling convention for create_branch() et. al.") made the stack frame of do_feature_fixups() more complex, leading to GCC setting up a stack guard when CONFIG_STACKPROTECTOR is selected. The problem is that do_feature_fixups() is called very early, while 'current' in r2 is not set up yet and the code is still not at the final address used at link time. So, like other instrumentation, stack protection needs to be deactivated for feature-fixups.c and code-patching.c.
Fixes: 7c95d8893fb5 ("powerpc: Change calling convention for create_branch() et. al.")
Cc: stable@vger.kernel.org # v5.8+
Reported-by: Jonathan Neuschaefer <j.neuschaefer@gmx.net>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Jonathan Neuschaefer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b688fe82927b330349d9e44553363fa451ea4d95.1619715114.git.christophe.leroy@csgroup.eu
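The deactivation uses the usual per-object flag override; a sketch:

  # These run before the stack canary infrastructure is usable
  CFLAGS_feature-fixups.o += -fno-stack-protector
  CFLAGS_code-patching.o += -fno-stack-protector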
2021-04-21  powerpc: Move copy_inst_from_kernel_nofault()  (Christophe Leroy)
When probe_kernel_read_inst() was created, there was no good place to put it, so a file called lib/inst.c was dedicated for it. Since then, probe_kernel_read_inst() has been renamed copy_inst_from_kernel_nofault(). And mm/maccess.h didn't exist at that time. Today, mm/maccess.h is related to copy_from_kernel_nofault(). Move copy_inst_from_kernel_nofault() into mm/maccess.c Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9655d8957313906b77b8db5700a0e33ce06f45e5.1618405715.git.christophe.leroy@csgroup.eu
2021-02-12  kbuild: LD_VERSION redenomination  (Masahiro Yamada)
Commit ccbef1674a15 ("Kbuild, lto: add ld-version and ld-ifversion macros") introduced scripts/ld-version.sh for GCC LTO. At that time, this script handled 5 version fields because GCC LTO needed the downstream binutils. (https://lkml.org/lkml/2014/4/8/272) The code snippet from the submitted patch was as follows:

  # We need HJ Lu's Linux binutils because mainline binutils does not
  # support mixing assembler and LTO code in the same ld -r object.
  # XXX check if the gcc plugin ld is the expected one too
  # XXX some Fedora binutils should also support it. How to check for that?
  ifeq ($(call ld-ifversion,-ge,22710001,y),y)
  ...

However, GCC LTO was not merged into the mainline after all. (https://lkml.org/lkml/2014/4/8/272) So, the 4th and 5th fields were never used, and finally removed by commit 0d61ed17dd30 ("ld-version: Drop the 4th and 5th version components"). Since then, the last 4 digits returned by this script are always zeros. Remove the meaningless last 4 digits. This makes the version format consistent with GCC_VERSION, CLANG_VERSION, LLD_VERSION.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-10-06  x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()  (Dan Williams)
In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled.

Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case:

  On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
  >
  > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote:
  > >
  > > However now I see that copy_user_generic() works for the wrong reason.
  > > It works because the exception on the source address due to poison
  > > looks no different than a write fault on the user address to the
  > > caller, it's still just a short copy. So it makes copy_to_user() work
  > > for the wrong reason relative to the name.
  >
  > Right.
  >
  > And it won't work that way on other architectures. On x86, we have a
  > generic function that can take faults on either side, and we use it
  > for both cases (and for the "in_user" case too), but that's an
  > artifact of the architecture oddity.
  >
  > In fact, it's probably wrong even on x86 - because it can hide bugs -
  > but writing those things is painful enough that everybody prefers
  > having just one function.

Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel().

Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch.

One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks.

[ bp: Massage a bit. ]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: <stable@vger.kernel.org>
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-07-27  powerpc/64s: Implement queued spinlocks and rwlocks  (Nicholas Piggin)
These have shown significantly improved performance and fairness when spinlock contention is moderate to high on very large systems. With this series including subsequent patches, on a 16 socket 1536 thread POWER9, a stress test such as same-file open/close from all CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs 384158op/s (33x faster), where the difference in throughput between the fastest and slowest thread goes from 7x to 1.4x.

Thanks to the fast path being identical in terms of atomics and barriers (after a subsequent optimisation patch), single threaded performance is not changed (no measurable difference).

On smaller systems, performance and fairness seem to be generally improved. Using dbench on tmpfs as a test (that starts to run into kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was tested with bare metal and KVM guest configurations. Results can be found here:

  https://github.com/linuxppc/issues/issues/305#issuecomment-663487453

Observations are:
 - Queued spinlocks are equal when contention is insignificant, as expected and as measured with microbenchmarks.
 - When there is contention, on bare metal queued spinlocks have better throughput and max latency at all points.
 - When virtualised, queued spinlocks are slightly worse approaching peak throughput, but significantly better throughput and max latency at all points beyond peak, until queued spinlock maximum latency rises when clients are 2x vCPUs.

The regressions haven't been analysed very well yet, there are a lot of things that can be tuned, particularly the paravirtualised locking, but the numbers already look like a good net win even on relatively small systems.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com
2020-05-19  powerpc: Test prefixed code patching  (Jordan Niethe)
Expand the code-patching self-tests to include tests for patching prefixed instructions.
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
[mpe: Use CONFIG_PPC64 not __powerpc64__]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200506034050.24806-25-jniethe5@gmail.com
2020-05-19  powerpc: Add a probe_user_read_inst() function  (Jordan Niethe)
Introduce a probe_user_read_inst() function to use in cases where probe_user_read() is used for getting an instruction. This will be more useful for prefixed instructions. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> [mpe: Don't write to *inst on error, fold in __user annotations] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200506034050.24806-14-jniethe5@gmail.com
2019-08-21  powerpc/memcpy: Add memcpy_mcsafe for pmem  (Balbir Singh)
The pmem infrastructure uses memcpy_mcsafe in the pmem layer to convert a machine check exception encountered during the memcpy into a return value on failure. The return value is the number of bytes remaining to be copied. This patch largely borrows from the copyuser_power7 logic and does not add the VMX optimizations, largely to keep the patch simple. If needed those optimizations can be folded in.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[arbab@linux.ibm.com: Added symbol export]
Co-developed-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820081352.8641-7-santosh@fossix.org
2019-08-05  powerpc/32: activate ARCH_HAS_PMEM_API and ARCH_HAS_UACCESS_FLUSHCACHE  (Christophe Leroy)
PPC32 also has flush_dcache_range(), so it can also support ARCH_HAS_PMEM_API and ARCH_HAS_UACCESS_FLUSHCACHE without changes.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a682a2f9db308c5cfe77e45aa3352e41bc9f4e33.1564554634.git.christophe.leroy@c-s.fr
2019-05-28  powerpc/lib: only build ldstfp.o when CONFIG_PPC_FPU is set  (Christophe Leroy)
The entire code in ldstfp.o is enclosed into #ifdef CONFIG_PPC_FPU, so there is no point in building it when this config is not selected. Fixes: cd64d1697cf0 ("powerpc: mtmsrd not defined") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
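The fix, sketched:

  # Was obj-y; the whole file is inside #ifdef CONFIG_PPC_FPU anyway
  obj-$(CONFIG_PPC_FPU) += ldstfp.o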
2019-05-28  powerpc/lib: fix redundant inclusion of quad.o  (Christophe Leroy)
quad.o is only for PPC64, and already included in obj64-y, so it doesn't have to be in obj-y Fixes: 31bfdb036f12 ("powerpc: Use instruction emulation infrastructure to handle alignment faults") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
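After the fix, quad.o appears only in the 64-bit list; a sketch:

  # quad.o is PPC64-only, so it belongs in obj64-y, not obj-y
  obj64-y += quad.o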
2019-05-03  powerpc: disable KASAN instrumentation on early/critical files.  (Christophe Leroy)
All files containing functions run before kasan_early_init() is called must have KASAN instrumentation disabled. For those files, branch profiling also has to be disabled, otherwise each if () generates a call to ftrace_likely_update().
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
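A sketch of the pattern this describes (file names illustrative):

  # No KASAN instrumentation before kasan_early_init() has run
  KASAN_SANITIZE_code-patching.o := n
  KASAN_SANITIZE_feature-fixups.o := n
  ifdef CONFIG_KASAN
  # ...and no branch profiling either, so if () stays a plain branch
  CFLAGS_code-patching.o += -DDISABLE_BRANCH_PROFILING
  CFLAGS_feature-fixups.o += -DDISABLE_BRANCH_PROFILING
  endif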
2019-05-03  powerpc: prepare string/mem functions for KASAN  (Christophe Leroy)
CONFIG_KASAN implements wrappers for memcpy(), memmove() and memset(). Those wrappers do the verification then call __memcpy(), __memmove() and __memset() respectively. The arches are therefore expected to rename their optimised functions that way. For files on which KASAN is inhibited, #defines are used to allow them to directly call optimised versions of the functions without going through the KASAN wrappers. See commit 393f203f5fd5 ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions") for details. Other string / mem functions do not (yet) have kasan wrappers; we therefore have to fall back to the generic versions when KASAN is active, otherwise KASAN checks will be skipped.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Fixups to keep selftests working]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-23  powerpc: sstep: Add tests for compute type instructions  (Sandipan Das)
This enhances the current selftest framework for validating the in-kernel instruction emulation infrastructure by adding support for compute type instructions i.e. integer ALU-based instructions. Originally, this framework was limited to only testing load and store instructions. While most of the GPRs can be validated, support for SPRs is limited to LR, CR and XER for now. When writing the test cases, one must ensure that the Stack Pointer (GPR1) or the Thread Pointer (GPR13) are not touched by any means as these are vital non-volatile registers. Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> [mpe: Use patch_site for the code patching] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20  powerpc: Add support for function error injection  (Naveen N. Rao)
We implement regs_set_return_value() and override_function_with_return() for this purpose. On powerpc, a return from a function (blr) just branches to the location contained in the link register. So, we can just update pt_regs rather than redirecting execution to a dummy function that returns. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19  powerpc: Add -Werror at arch/powerpc level  (Michael Ellerman)
Back when I added -Werror in commit ba55bd74360e ("powerpc: Add configurable -Werror for arch/powerpc") I did it by adding it to most of the arch Makefiles. At the time we excluded math-emu, because apparently it didn't build cleanly. But that seems to have been fixed somewhere in the interim. So move the -Werror addition to the top-level of the arch, this saves us from repeating it in every Makefile and means we won't forget to add it to any new sub-dirs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
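Centralising the flag looks roughly like this in arch/powerpc/Makefile (the Kconfig gate is an assumption):

  # Applies to every subdirectory, so new sub-dirs get it for free
  subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror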
2018-08-07  powerpc/lib: Implement strlen() in assembly for PPC32  (Christophe Leroy)
The generic implementation of strlen() reads strings byte per byte. This patch implements strlen() in assembly based on a read of entire words, in the same spirit as what some other arches and glibc do. On an 8xx the time spent in strlen is reduced by 3/4 for long strings.

strlen() selftest on an 8xx provides the following values.

Before the patch (ie with the generic strlen() in lib/string.c):

  len 256 : time = 1.195055
  len 016 : time = 0.083745
  len 008 : time = 0.046828
  len 004 : time = 0.028390

After the patch:

  len 256 : time = 0.272185  ==> 78% improvement
  len 016 : time = 0.040632  ==> 51% improvement
  len 008 : time = 0.033060  ==> 29% improvement
  len 004 : time = 0.029149  ==> 2% degradation

On a 832x:

Before the patch:

  len 256 : time = 0.236125
  len 016 : time = 0.018136
  len 008 : time = 0.011000
  len 004 : time = 0.007229

After the patch:

  len 256 : time = 0.094950  ==> 60% improvement
  len 016 : time = 0.013357  ==> 26% improvement
  len 008 : time = 0.010586  ==> 4% improvement
  len 004 : time = 0.008784

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04  powerpc/lib: optimise PPC32 memcmp  (Christophe Leroy)
At the time being, memcmp() compares two chunks of memory byte per byte. This patch optimises the comparison by comparing word by word. In the same way as commit 15c2d45d17418 ("powerpc: Add 64bit optimised memcmp"), this patch moves memcmp() into a dedicated file named memcmp_32.S.

A small benchmark performed on an 8xx comparing two chunks of 512 bytes performed 100000 times gives:

  Before: 5852274 TB ticks
  After:  1488638 TB ticks

This is almost 4 times faster.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04  powerpc/lib: optimise 32 bits __clear_user()  (Christophe Leroy)
Rewrite clear_user() on the same principle as memset(0), making use of dcbz to clear complete cache lines. This code is a copy/paste of memset(), with some modifications in order to retrieve the remaining number of bytes to be cleared, as it needs to be returned in case of error. In the same way as done on PPC64 in commit 17968fbbd19f1 ("powerpc: 64bit optimised __clear_user"), the patch moves __clear_user() into a dedicated file string_32.S.

On a MPC885, throughput is almost doubled:

  Before:
  ~# dd if=/dev/zero of=/dev/null bs=1M count=1000
  1048576000 bytes (1000.0MB) copied, 18.990779 seconds, 52.7MB/s

  After:
  ~# dd if=/dev/zero of=/dev/null bs=1M count=1000
  1048576000 bytes (1000.0MB) copied, 9.611468 seconds, 104.0MB/s

On a MPC8321, throughput is multiplied by 2.12:

  Before:
  root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000
  1048576000 bytes (1000.0MB) copied, 6.844352 seconds, 146.1MB/s

  After:
  root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000
  1048576000 bytes (1000.0MB) copied, 3.218854 seconds, 310.7MB/s

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-01  powerpc/64s: Set assembler machine type to POWER4  (Nicholas Piggin)
Rather than override the machine type in .S code (which can hide wrong or ambiguous code generation for the target), set the type to power4 for all assembly. This also means we need to be careful not to build power4-only code when we're not building for Book3S, such as the "power7" versions of copyuser/page/memcpy. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix Book3E build, don't build the "power7" variants for non-Book3S] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-16  Merge tag 'powerpc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds)
Pull powerpc updates from Michael Ellerman:
 "A bit of a small release, I suspect in part due to me travelling for KS. But my backlog of patches to review is smaller than usual, so I think in part folks just didn't send as much this cycle.

  Non-highlights:
   - Five fixes for the >128T address space handling, both to fix bugs in our implementation and to bring the semantics exactly into line with x86.

  Highlights:
   - Support for a new OPAL call on bare metal machines which gives us a true NMI (ie. is not masked by MSR[EE]=0) for debugging etc.
   - Support for Power9 DD2 in the CXL driver.
   - Improvements to machine check handling so that uncorrectable errors can be reported into the generic memory_failure() machinery.
   - Some fixes and improvements for VPHN, which is used under PowerVM to notify the Linux partition of topology changes.
   - Plumbing to enable TM (transactional memory) without suspend on some Power9 processors (PPC_FEATURE2_HTM_NO_SUSPEND).
   - Support for emulating vector loads from cache-inhibited memory, on some Power9 revisions.
   - Disable the fast-endian switch "syscall" by default (behind a CONFIG), we believe it has never had any users.
   - A major rework of the API drivers use when initiating and waiting for long running operations performed by OPAL firmware, and changes to the powernv_flash driver to use the new API.
   - Several fixes for the handling of FP/VMX/VSX while processes are using transactional memory.
   - Optimisations of TLB range flushes when using the radix MMU on Power9.
   - Improvements to the VAS facility used to access coprocessors on Power9, and related improvements to the way the NX crypto driver handles requests.
   - Implementation of PMEM_API and UACCESS_FLUSHCACHE for 64-bit.

  Thanks to: Alexey Kardashevskiy, Alistair Popple, Allen Pais, Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Balbir Singh, Benjamin Herrenschmidt, Breno Leitao, Christophe Leroy, Christophe Lombard, Cyril Bur, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo Romero, Haren Myneni, Joel Stanley, Kamalesh Babulal, Kautuk Consul, Markus Elfring, Masami Hiramatsu, Michael Bringmann, Michael Neuling, Michal Suchanek, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud, Sandipan Das, Seth Forshee, Shriya, Stephen Rothwell, Stewart Smith, Sukadev Bhattiprolu, Tyrel Datwyler, Vaibhav Jain, Vaidyanathan Srinivasan, and William A. Kennington III"

* tag 'powerpc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (151 commits)
  powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature
  powerpc/64s: Fix masking of SRR1 bits on instruction fault
  powerpc/64s: mm_context.addr_limit is only used on hash
  powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation
  powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary
  powerpc/64s/hash: Fix fork() with 512TB process address space
  powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation
  powerpc/64s/hash: Fix 512T hint detection to use >= 128T
  powerpc: Fix DABR match on hash based systems
  powerpc/signal: Properly handle return value from uprobe_deny_signal()
  powerpc/fadump: use kstrtoint to handle sysfs store
  powerpc/lib: Implement UACCESS_FLUSHCACHE API
  powerpc/lib: Implement PMEM API
  powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  powerpc/powernv/npu: Use flush_all_mm() instead of flush_tlb_mm()
  powerpc/powernv/idle: Round up latency and residency values
  powerpc/kprobes: refactor kprobe_lookup_name for safer string operations
  powerpc/kprobes: Blacklist emulate_update_regs() from kprobes
  powerpc/kprobes: Do not disable interrupts for optprobes and kprobes_on_ftrace
  powerpc/kprobes: Disable preemption before invoking probe handler for optprobes
  ...
2017-11-13  powerpc/lib: Implement PMEM API  (Oliver O'Halloran)
Implement the architecture specific cache maintenance functions that make up the "PMEM API". Currently the writeback and invalidate functions are the same since the function of the DCBST (data cache block store) instruction is typically interpreted as "writeback to the point of coherency" rather than to memory. As a result implementing the API requires a full cache flush rather than just a cache write back. This will probably change in the not-too-distant future.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-02  License cleanup: add SPDX GPL-2.0 license identifier to files with no license  (Greg Kroah-Hartman)
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases:
 - file had no licensing information in it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5 lines of source
 - File already had some variant of a license header in it (even if <5 lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license identifiers to apply.

 - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was:

     SPDX license identifier                             # files
     ----------------------------------------------------|-------
     GPL-2.0                                               11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

     SPDX license identifier                             # files
     ----------------------------------------------------|-------
     GPL-2.0 WITH Linux-syscall-note                         930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary:

     SPDX license identifier                             # files
     ----------------------------------------------------|------
     GPL-2.0 WITH Linux-syscall-note                        270
     GPL-2.0+ WITH Linux-syscall-note                       169
     ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)     21
     ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)     17
     LGPL-2.1+ WITH Linux-syscall-note                       15
     GPL-1.0+ WITH Linux-syscall-note                        14
     ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)     5
     LGPL-2.0+ WITH Linux-syscall-note                        4
     LGPL-2.1 WITH Linux-syscall-note                         3
     ((GPL-2.0 WITH Linux-syscall-note) OR MIT)               3
     ((GPL-2.0 WITH Linux-syscall-note) AND MIT)              1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became the concluded license(s).

 - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time.

In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there were new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related.

Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with the SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files.

In the initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with:
 - a full scancode scan run, collecting the matched texts, detected license ids and scores
 - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-01  powerpc: Use instruction emulation infrastructure to handle alignment faults  (Paul Mackerras)
This replaces almost all of the instruction emulation code in fix_alignment() with calls to analyse_instr(), emulate_loadstore() and emulate_dcbz(). The only emulation code left is the SPE emulation code; analyse_instr() etc. do not handle SPE instructions at present. One result of this is that we can now handle alignment faults on all the new VSX load and store instructions that were added in POWER9. VSX loads/stores will take alignment faults for unaligned accesses to cache-inhibited memory. Another effect is that we no longer rely on the DAR and DSISR values set by the processor. With this, we now need to include the instruction emulation code unconditionally. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01  powerpc: Handle most loads and stores in instruction emulation code  (Paul Mackerras)
This extends the instruction emulation infrastructure in sstep.c to handle all the load and store instructions defined in the Power ISA v3.0, except for the atomic memory operations, ldmx (which was never implemented), lfdp/stfdp, and the vector element load/stores. The instructions added are: Integer loads and stores: lbarx, lharx, lqarx, stbcx., sthcx., stqcx., lq, stq. VSX loads and stores: lxsiwzx, lxsiwax, stxsiwx, lxvx, lxvl, lxvll, lxvdsx, lxvwsx, stxvx, stxvl, stxvll, lxsspx, lxsdx, stxsspx, stxsdx, lxvw4x, lxsibzx, lxvh8x, lxsihzx, lxvb16x, stxvw4x, stxsibx, stxvh8x, stxsihx, stxvb16x, lxsd, lxssp, lxv, stxsd, stxssp, stxv. These instructions are handled both in the analyse_instr phase and in the emulate_step phase. The code for lxvd2ux and stxvd2ux has been taken out, as those instructions were never implemented in any processor and have been taken out of the architecture, and their opcodes have been reused for other instructions in POWER9 (lxvb16x and stxvb16x). The emulation for the VSX loads and stores uses helper functions which don't access registers or memory directly, which can hopefully be reused by KVM later. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-02  powerpc/lib/xor_vmx: Ensure no altivec code executes before enable_kernel_altivec()  (Matt Brown)
The xor_vmx.c file is used for the RAID5 xor operations. In these functions altivec is enabled to run the operation and then disabled. The code uses enable_kernel_altivec() around the core of the algorithm, however the whole file is built with -maltivec, so the compiler is within its rights to generate altivec code anywhere. This has been seen at least once in the wild:

  0:mon> di $xor_altivec_2
  c0000000000b97d0  3c4c01d9  addis   r2,r12,473
  c0000000000b97d4  3842db30  addi    r2,r2,-9424
  c0000000000b97d8  7c0802a6  mflr    r0
  c0000000000b97dc  f8010010  std     r0,16(r1)
  c0000000000b97e0  60000000  nop
  c0000000000b97e4  7c0802a6  mflr    r0
  c0000000000b97e8  faa1ffa8  std     r21,-88(r1)
  ...
  c0000000000b981c  f821ff41  stdu    r1,-192(r1)
  c0000000000b9820  7f8101ce  stvx    v28,r1,r0    <-- POP
  c0000000000b9824  38000030  li      r0,48
  c0000000000b9828  7fa101ce  stvx    v29,r1,r0
  ...
  c0000000000b984c  4bf6a06d  bl      c0000000000238b8  # enable_kernel_altivec

This patch splits the non-altivec code into xor_vmx_glue.c which calls the altivec functions in xor_vmx.c. By compiling xor_vmx_glue.c without -maltivec we can guarantee that altivec instructions will not be executed outside of the enable/disable block.

Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
[mpe: Rework change log and include disassembly]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30  powerpc/64: Linker on-demand sfpr functions for modules  (Nicholas Piggin)
For final link, the powerpc64 linker generates fpr save/restore functions on-demand, placing them in the .sfpr section. Starting with binutils 2.25, these can be provided for non-final links with --save-restore-funcs. Use that where possible for module links. This saves about 200 bytes per module (~60kB) on powernv defconfig build. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
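A sketch of the module-link side (the variable spelling is an assumption; the binutils 2.25 gate matches the commit text):

  ifeq ($(call ld-ifversion,-ge,22500,y),y)
  # Let the linker synthesise the .sfpr save/restore functions for
  # module (non-final) links instead of carrying separate copies.
  KBUILD_LDFLAGS_MODULE += --save-restore-funcs
  endif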
2017-05-30  powerpc/64: Do not link crtsavres.o in vmlinux  (Nicholas Piggin)
The 64-bit linker creates save/restore functions on demand with final links, so vmlinux does not require crtsavres.o. Make crtsavres.o extra-y on 64-bit (it is still required by modules). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-06  powerpc: get rid of zeroing, switch to RAW_COPY_USER  (Al Viro)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-03  powerpc: emulate_step() tests for load/store instructions  (Ravi Bangoria)
Add a new selftest that tests emulate_step for Normal, Floating Point, Vector and Vector Scalar load/store instructions. The test should run at boot time if CONFIG_KPROBES_SANITY_TEST and CONFIG_PPC64 is set.

Sample log:

  emulate_step_test: ld             : PASS
  emulate_step_test: lwz            : PASS
  emulate_step_test: lwzx           : PASS
  emulate_step_test: std            : PASS
  emulate_step_test: ldarx / stdcx. : PASS
  emulate_step_test: lfsx           : PASS
  emulate_step_test: stfsx          : PASS
  emulate_step_test: lfdx           : PASS
  emulate_step_test: stfdx          : PASS
  emulate_step_test: lvx            : PASS
  emulate_step_test: stvx           : PASS
  emulate_step_test: lxvd2x         : PASS
  emulate_step_test: stxvd2x        : PASS

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
[mpe: Drop start/complete lines, make it all __init]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-25  powerpc/64: Use optimized checksum routines on little-endian  (Paul Mackerras)
Currently we have optimized hand-coded assembly checksum routines for big-endian 64-bit systems, but for little-endian we use the generic C routines. This modifies the optimized routines to work for little-endian. With this, we no longer need to enable CONFIG_GENERIC_CSUM. This also fixes a couple of comments in checksum_64.S so they accurately reflect what the associated instruction does. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Use the more common __BIG_ENDIAN__] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-14  Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild  (Linus Torvalds)
Pull kbuild updates from Michal Marek:

 - EXPORT_SYMBOL for asm source by Al Viro. This does bring a regression, because genksyms no longer generates checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is working on a patch to fix this. Plus, we are talking about functions like strcpy(), which rarely change prototypes.
 - Fixes for PPC fallout of the above by Stephen Rothwell and Nick Piggin
 - fixdep speedup by Alexey Dobriyan.
 - preparatory work by Nick Piggin to allow architectures to build with -ffunction-sections, -fdata-sections and --gc-sections
 - CONFIG_THIN_ARCHIVES support by Stephen Rothwell
 - fix for filenames with colons in the initramfs source by me.

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
  initramfs: Escape colons in depfile
  ppc: there is no clear_pages to export
  powerpc/64: whitelist unresolved modversions CRCs
  kbuild: -ffunction-sections fix for archs with conflicting sections
  kbuild: add arch specific post-link Makefile
  kbuild: allow archs to select link dead code/data elimination
  kbuild: allow architectures to use thin archives instead of ld -r
  kbuild: Regenerate genksyms lexer
  kbuild: genksyms fix for typeof handling
  fixdep: faster CONFIG_ search
  ia64: move exports to definitions
  sparc32: debride memcpy.S a bit
  [sparc] unify 32bit and 64bit string.h
  sparc: move exports to definitions
  ppc: move exports to definitions
  arm: move exports to definitions
  s390: move exports to definitions
  m68k: move exports to definitions
  alpha: move exports to actual definitions
  x86: move exports to actual definitions
  ...
2016-09-13  powerpc/Makefile: Drop CONFIG_WORD_SIZE for BITS  (Michael Ellerman)
Commit 2578bfae84a7 ("[POWERPC] Create and use CONFIG_WORD_SIZE") added CONFIG_WORD_SIZE, and suggests that other arches were going to do likewise. But that never happened, powerpc is the only architecture which uses it. So switch to using a simple make variable, BITS, like x86, sh, sparc and tile. It is also easier to spell and simpler, avoiding any confusion about whether it's defined due to ordering of make vs kconfig. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
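The replacement is a plain make variable; a sketch of the pattern (the object name in the usage line is illustrative):

  # BITS mirrors what x86, sh, sparc and tile do
  ifdef CONFIG_PPC64
  BITS := 64
  else
  BITS := 32
  endif
  # used as e.g.: obj-y += foo_$(BITS).o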
2016-08-07  ppc: move exports to definitions  (Al Viro)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14  Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next  (Michael Ellerman)
Freescale updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt bits, and minor fixes/cleanup."
2016-03-07  powerpc/ftrace: Use $(CC_FLAGS_FTRACE) when disabling ftrace  (Torsten Duwe)
Rather than open-coding -pg wherever we want to disable ftrace, use the existing $(CC_FLAGS_FTRACE) variable. This has the advantage that it will work in future when we use a different set of flags to enable ftrace.
Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
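The resulting pattern, as used across the tree (object names illustrative):

  # Strip ftrace flags however they are spelled (-pg, -mprofile-kernel, ...)
  CFLAGS_REMOVE_code-patching.o = $(CC_FLAGS_FTRACE)
  CFLAGS_REMOVE_feature-fixups.o = $(CC_FLAGS_FTRACE)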