summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-07-24powerpc/mm/hash: Remove the superfluous bitwise operation when find hpte groupAneesh Kumar K.V
When computing the starting slot number for a hash page table group we used to do this hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL; Multiplying with 8 (HPTES_PER_GROUP) imply the last three bits are 0. Hence we really don't need to clear then separately. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/mm: Increase MAX_PHYSMEM_BITS to 128TB with SPARSEMEM_VMEMMAP configAneesh Kumar K.V
We do this only with VMEMMAP config so that our page_to_[nid/section] etc are not impacted. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/mm: Check memblock_add against MAX_PHYSMEM_BITS rangeAneesh Kumar K.V
With SPARSEMEM config enabled, we make sure that we don't add sections beyond MAX_PHYSMEM_BITS range. This results in not building vmemmap mapping for range beyond max range. But our memblock layer looks the device tree and create mapping for the full memory range. Prevent this by checking against MAX_PHSYSMEM_BITS when doing memblock_add. We don't do similar check for memeblock_reserve_range. If reserve range is beyond MAX_PHYSMEM_BITS we expect that to be configured with 'nomap'. Any other reserved range should come from existing memblock ranges which we already filtered while adding. This avoids crash as below when running on a system with system ram config above MAX_PHSYSMEM_BITS Unable to handle kernel paging request for data at address 0xc00a001000000440 Faulting instruction address: 0xc000000001034118 cpu 0x0: Vector: 300 (Data Access) at [c00000000124fb30] pc: c000000001034118: __free_pages_bootmem+0xc0/0x1c0 lr: c00000000103b258: free_all_bootmem+0x19c/0x22c sp: c00000000124fdb0 msr: 9000000002001033 dar: c00a001000000440 dsisr: 40000000 current = 0xc00000000120dd00 paca = 0xc000000001f60000^I irqmask: 0x03^I irq_happened: 0x01 pid = 0, comm = swapper [c00000000124fe20] c00000000103b258 free_all_bootmem+0x19c/0x22c [c00000000124fee0] c000000001010a68 mem_init+0x3c/0x5c [c00000000124ff00] c00000000100401c start_kernel+0x298/0x5e4 [c00000000124ff90] c00000000000b57c start_here_common+0x1c/0x520 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc: Add ppc64le and ppc64_book3e allmodconfig targetsMichael Ellerman
Similarly as we just did for 32-bit, add phony targets for generating a little endian and Book3E allmodconfig. These aren't covered by the regular allmodconfig, which is big endian and Book3S due to the way the Kconfig symbols are structured. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc: Add ppc32_allmodconfig defconfig targetMichael Ellerman
Because the allmodconfig logic just sets every symbol to M or Y, it has the effect of always generating a 64-bit config, because CONFIG_PPC64 becomes Y. So to make it easier for folks to test 32-bit code, provide a phony defconfig target that generates a 32-bit allmodconfig. The 32-bit port has several mutually exclusive CPU types, we choose the Book3S variants as that's what the help text in Kconfig says is most common. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc64s: Show ori31 availability in spectre_v1 sysfs file not v2Michael Ellerman
When I added the spectre_v2 information in sysfs, I included the availability of the ori31 speculation barrier. Although the ori31 barrier can be used to mitigate v2, it's primarily intended as a spectre v1 mitigation. Spectre v2 is mitigated by hardware changes. So rework the sysfs files to show the ori31 information in the spectre_v1 file, rather than v2. Currently we display eg: $ grep . spectre_v* spectre_v1:Mitigation: __user pointer sanitization spectre_v2:Mitigation: Indirect branch cache disabled, ori31 speculation barrier enabled After: $ grep . spectre_v* spectre_v1:Mitigation: __user pointer sanitization, ori31 speculation barrier enabled spectre_v2:Mitigation: Indirect branch cache disabled Fixes: d6fbe1c55c55 ("powerpc/64s: Wire up cpu_show_spectre_v2()") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc: NMI IPI make NMI IPIs fully sychronousNicholas Piggin
There is an asynchronous aspect to smp_send_nmi_ipi. The caller waits for all CPUs to call in to the handler, but it does not wait for completion of the handler. This is a needless complication, so remove it and always wait synchronously. The synchronous wait allows the caller to easily time out and clear the wait for completion (zero nmi_ipi_busy_count) in the case of badly behaved handlers. This would have prevented the recent smp_send_stop NMI IPI bug from causing the system to hang. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closelyNicholas Piggin
When the masked interrupt handler clears MSR[EE] for an interrupt in the PACA_IRQ_MUST_HARD_MASK set, it does not set PACA_IRQ_HARD_DIS. This makes them get out of synch. With that taken into account, it's only low level irq manipulation (and interrupt entry before reconcile) where they can be out of synch. This makes the code less surprising. It also allows the IRQ replay code to rely on the IRQ_HARD_DIS value and not have to mtmsrd again in this case (e.g., for an external interrupt that has been masked). The bigger benefit might just be that there is not such an element of surprise in these two bits of state. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24selftests/powerpc: Fix ptrace-pkey for default execute permission changeRam Pai
The test case assumes execute-permissions of unallocated keys are enabled by default, which is incorrect. Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24selftests/powerpc: Fix core-pkey for default execute permission changeRam Pai
Only when the key is allocated, its permission are enabled. Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/pkeys: make protection key 0 less specialRam Pai
Applications need the ability to associate an address-range with some key and latter revert to its initial default key. Pkey-0 comes close to providing this function but falls short, because the current implementation disallows applications to explicitly associate pkey-0 to the address range. Lets make pkey-0 less special and treat it almost like any other key. Thus it can be explicitly associated with any address range, and can be freed. This gives the application more flexibility and power. The ability to free pkey-0 must be used responsibily, since pkey-0 is associated with almost all address-range by default. Even with this change pkey-0 continues to be slightly more special from the following point of view. (a) it is implicitly allocated. (b) it is the default key assigned to any address-range. (c) its permissions cannot be modified by userspace. NOTE: (c) is specific to powerpc only. pkey-0 is associated by default with all pages including kernel pages, and pkeys are also active in kernel mode. If any permission is denied on pkey-0, the kernel running in the context of the application will be unable to operate. Tested on powerpc. Signed-off-by: Ram Pai <linuxram@us.ibm.com> [mpe: Drop #define PKEY_0 0 in favour of plain old 0] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/pkeys: Preallocate execute-only keyRam Pai
execute-only key is allocated dynamically. This is a problem. When a thread implicitly creates an execute-only key, and resets the UAMOR for that key, the UAMOR value does not percolate to all the other threads. Any other thread may ignorantly change the permissions on the key. This can cause the key to be not execute-only for that thread. Preallocate the execute-only key and ensure that no thread can change the permission of the key, by resetting the corresponding bit in UAMOR. Fixes: 5586cf61e108 ("powerpc: introduce execute-only pkey") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/pkeys: Fix calculation of total pkeys.Ram Pai
Total number of pkeys calculation is off by 1. Fix it. Fixes: 4fb158f65ac5 ("powerpc: track allocation status of all pkeys") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/pkeys: Save the pkey registers before forkRam Pai
When a thread forks the contents of AMR, IAMR, UAMOR registers in the newly forked thread are not inherited. Save the registers before forking, for content of those registers to be automatically copied into the new thread. Fixes: cf43d3b26452 ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/pkeys: key allocation/deallocation must not change pkey registersRam Pai
Key allocation and deallocation has the side effect of programming the UAMOR/AMR/IAMR registers. This is wrong, since its the responsibility of the application and not that of the kernel, to modify the permission on the key. Do not modify the pkey registers at key allocation/deallocation. This patch also fixes a bug where a sys_pkey_free() resets the UAMOR bits of the key, thus making its permissions unmodifiable from user space. Later if the same key gets reallocated from a different thread this thread will no longer be able to change the permissions on the key. Fixes: cf43d3b26452 ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/pkeys: Deny read/write/execute by defaultRam Pai
Deny all permissions on all keys, with some exceptions. pkey-0 must allow all permissions, or else everything comes to a screaching halt. Execute-only key must allow execute permission. Fixes: cf43d3b26452 ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/pkeys: Give all threads control of their key permissionsRam Pai
Currently in a multithreaded application, a key allocated by one thread is not usable by other threads. By "not usable" we mean that other threads are unable to change the access permissions for that key for themselves. When a new key is allocated in one thread, the corresponding UAMOR bits for that thread get enabled, however the UAMOR bits for that key for all other threads remain disabled. Other threads have no way to set permissions on the key, and the current default permissions are that read/write is enabled for all keys, which means the key has no effect for other threads. Although that may be the desired behaviour in some circumstances, having all threads able to control their permissions for the key is more flexible. The current behaviour also differs from the x86 behaviour, which is problematic for users. To fix this, enable the UAMOR bits for all keys, at process creation (in start_thread(), ie exec time). Since the contents of UAMOR are inherited at fork, all threads are capable of modifying the permissions on any key. This is technically an ABI break on powerpc, but pkey support is fairly new on powerpc and not widely used, and this brings us into line with x86. Fixes: cf43d3b26452 ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Tested-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> [mpe: Reword some of the changelog] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-20selftests/powerpc: Consolidate copy/paste test logicMichael Ellerman
This logic was shared between multiple tests, but now that we have removed all but one of them we can just move it into that test. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-20selftests/powerpc: Remove Power9 paste testsMichael Ellerman
Paste on POWER9 only works to accelerators and not on real memory. So these tests just generate a SIGILL. So just delete them. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-20selftests/powerpc: Remove Power9 copy_unaligned testMichael Ellerman
This is a test of the ISA 3.0 "copy" instruction. That instruction has an L field, which if set to 1 specifies that "the instruction identifies the beginning of a move group" (pp 858). That's also referred to as "copy first" vs "copy". In ISA 3.0B the copy instruction does not have an L field, and the corresponding bit in the instruction must be set to 1. This test is generating a "copy" instruction, not a "copy first", and so on Power9 (which implements 3.0B), this results in an illegal instruction. So just drop the test entirely. We still have copy_first_unaligned to test the "copy first" behaviour. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-20powerpc/prom_init: Remove linux,stdout-package propertyMurilo Opsfelder Araujo
This property was added in 2004 and the only use of it, which was already inside `#if 0`, was removed a month later. Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-20powerpc/ps3: Set driver coherent_dma_maskGeoff Levand
Set the coherent_dma_mask for the PS3 ehci, ohci, and snd devices. Silences WARN_ON_ONCE messages emitted by the dma_alloc_attrs() routine. Reported-by: Fredrik Noring <noring@nocrew.org> Signed-off-by: Geoff Levand <geoff@infradead.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-19cxl: Fix wrong comparison in cxl_adapter_context_get()Vaibhav Jain
Function atomic_inc_unless_negative() returns a bool to indicate success/failure. However cxl_adapter_context_get() wrongly compares the return value against '>=0' which will always be true. The patch fixes this comparison to '==0' there by also fixing this compile time warning: drivers/misc/cxl/main.c:290 cxl_adapter_context_get() warn: 'atomic_inc_unless_negative(&adapter->contexts_num)' is unsigned Fixes: 70b565bbdb91 ("cxl: Prevent adapter reset if an active context exists") Cc: stable@vger.kernel.org # v4.9+ Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-19powerpc/powernv/npu: Add a debugfs setting to change ATSD thresholdAlistair Popple
The threshold at which it becomes more efficient to coalesce a range of ATSDs into a single per-PID ATSD is currently not well understood due to a lack of real-world work loads. This patch adds a debugfs parameter allowing the threshold to be altered at runtime in order to aid future development and refinement of the value. Signed-off-by: Alistair Popple <alistair@popple.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-19MAINTAINERS: Remove the entry for the orphaned ams driverMichael Hanselmann
I no longer have any hardware with the Apple motion sensor and thus relinquish maintainership of the driver. Remove the maintainers entry entirely, meaning the code will now fall under "LINUX FOR POWER MACINTOSH". Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch> [mpe: Drop the entry entirely, munge change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-19powerpc/mpic: Pass first free vector number to mpic_setup_error_int()Bharat Bhushan
Update the comment to account for the spurious interrupt number. The code was already accounting for it, but that was unclear because it was achieved by mpic_setup_error_int() knowing that the number it was passed was the last used vector, rather than the first free vector. So change the meaning of the argument to the first free vector and update the caller to pass 13, instead of 12, to achieve the same result. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com> [mpe: Rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-19powerpc/hugetlbpage: Rmove unhelpful HUGEPD_*_SHIFT macrosDavid Gibson
The HUGEPD_*_SHIFT macros are always defined to be PGDIR_SHIFT and PUD_SHIFT, and have to have those values to work properly. They once used to have different values, but that was really only because they were used to mean different things in different contexts. 6fa50483 "powerpc/mm/hugetlb: initialize the pagetable cache correctly for hugetlb" removed that double meaning, but left the now useless constants. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-19chrp/nvram.c: add MODULE_LICENSE()Randy Dunlap
Add MODULE_LICENSE() to the chrp nvram.c driver to fix the build warning message: WARNING: modpost: missing MODULE_LICENSE() in arch/powerpc/platforms/chrp/nvram.o Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-19powerpc/8xx: fix handling of early NULL pointer dereferenceChristophe Leroy
NULL pointers are pointers to user memory space. So user pagetable has to be set in order to avoid random behaviour in case of NULL pointer dereference, otherwise we may encounter random memory access hence Machine Check Exception from TLB Miss handlers. Set user pagetable as early as possible in order to properly catch early kernel NULL pointer dereference. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-19Merge branch 'topic/ppc-kvm' into nextMichael Ellerman
Merge in some commits we're sharing with the KVM tree. I manually propagated the change from commit d3d4ffaae439 ("powerpc/powernv/ioda2: Reduce upper limit for DMA window size") into pci-ioda-tce.c. Conflicts: arch/powerpc/include/asm/cputable.h arch/powerpc/platforms/powernv/pci-ioda.c arch/powerpc/platforms/powernv/pci.h
2018-07-16powerpc/powernv/ioda: Allocate indirect TCE levels on demandAlexey Kardashevskiy
At the moment we allocate the entire TCE table, twice (hardware part and userspace translation cache). This normally works as we normally have contigous memory and the guest will map entire RAM for 64bit DMA. However if we have sparse RAM (one example is a memory device), then we will allocate TCEs which will never be used as the guest only maps actual memory for DMA. If it is a single level TCE table, there is nothing we can really do but if it a multilevel table, we can skip allocating TCEs we know we won't need. This adds ability to allocate only first level, saving memory. This changes iommu_table::free() to avoid allocating of an extra level; iommu_table::set() will do this when needed. This adds @alloc parameter to iommu_table::exchange() to tell the callback if it can allocate an extra level; the flag is set to "false" for the realmode KVM handlers of H_PUT_TCE hcalls and the callback returns H_TOO_HARD. This still requires the entire table to be counted in mm::locked_vm. To be conservative, this only does on-demand allocation when the usespace cache table is requested which is the case of VFIO. The example math for a system replicating a powernv setup with NVLink2 in a guest: 16GB RAM mapped at 0x0 128GB GPU RAM window (16GB of actual RAM) mapped at 0x244000000000 the table to cover that all with 64K pages takes: (((0x244000000000 + 0x2000000000) >> 16)*8)>>20 = 4556MB If we allocate only necessary TCE levels, we will only need: (((0x400000000 + 0x400000000) >> 16)*8)>>20 = 4MB (plus some for indirect levels). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-16powerpc/powernv: Rework TCE level allocationAlexey Kardashevskiy
This moves actual pages allocation to a separate function which is going to be reused later in on-demand TCE allocation. While we are at it, remove unnecessary level size round up as the caller does this already. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-16powerpc/powernv: Add indirect levels to it_userspaceAlexey Kardashevskiy
We want to support sparse memory and therefore huge chunks of DMA windows do not need to be mapped. If a DMA window big enough to require 2 or more indirect levels, and a DMA window is used to map all RAM (which is a default case for 64bit window), we can actually save some memory by not allocation TCE for regions which we are not going to map anyway. The hardware tables alreary support indirect levels but we also keep host-physical-to-userspace translation array which is allocated by vmalloc() and is a flat array which might use quite some memory. This converts it_userspace from vmalloc'ed array to a multi level table. As the format becomes platform dependend, this replaces the direct access to it_usespace with a iommu_table_ops::useraddrptr hook which returns a pointer to the userspace copy of a TCE; future extension will return NULL if the level was not allocated. This should not change non-KVM handling of TCE tables and it_userspace will not be allocated for non-KVM tables. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-16KVM: PPC: Make iommu_table::it_userspace big endianAlexey Kardashevskiy
We are going to reuse multilevel TCE code for the userspace copy of the TCE table and since it is big endian, let's make the copy big endian too. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-16powerpc/powernv: Move TCE manupulation code to its own fileAlexey Kardashevskiy
Right now we have allocation code in pci-ioda.c and traversing code in pci.c, let's keep them toghether. However both files are big enough already so let's move this business to a new file. While we at it, move the code which links IOMMU table groups to IOMMU tables as it is not specific to any PNV PHB model. These puts exported symbols from the new file together. This fixes several warnings from checkpatch.pl like this: "WARNING: Prefer 'unsigned int' to bare use of 'unsigned'". As this is almost cut-n-paste, there should be no behavioral change. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-16powerpc/powernv: Remove useless wrapperAlexey Kardashevskiy
This gets rid of a useless wrapper around pnv_pci_ioda2_table_free_pages(). Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-16powerpc/64s: Remove POWER9 DD1 supportNicholas Piggin
POWER9 DD1 was never a product. It is no longer supported by upstream firmware, and it is not effectively supported in Linux due to lack of testing. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [mpe: Remove arch_make_huge_pte() entirely] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-12powerpc/xive: Replace msleep(x) with msleep(OPAL_BUSY_DELAY_MS)Daniel Klamt
Replace msleep(x) with with msleep(OPAL_BUSY_DELAY_MS) to document these sleeps are to wait for opal (firmware). Signed-off-by: Daniel Klamt <eleon@ele0n.de> Signed-off-by: Bjoern Noetel <bjoern@br3ak3r.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-12powerpc/64s: Report SLB multi-hit rather than parity errorMichael Ellerman
When we take an SLB multi-hit on bare metal, we see both the multi-hit and parity error bits set in DSISR. The user manuals indicates this is expected to always happen on Power8, whereas on Power9 it says a multi-hit will "usually" also cause a parity error. We decide what to do based on the various error tables in mce_power.c, and because we process them in order and only report the first, we currently always report a parity error but not the multi-hit, eg: Severe Machine check interrupt [Recovered] Initiator: CPU Error type: SLB [Parity] Effective address: c000000ffffd4300 Although this is correct, it leaves the user wondering why they got a parity error. It would be clearer instead if we reported the multi-hit because that is more likely to be simply a software bug, whereas a true parity error is possibly an indication of a bad core. We can do that simply by reordering the error tables so that multi-hit appears before parity. That doesn't affect the error recovery at all, because we flush the SLB either way. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-12powerpc: Remove Power8 DD1 from cputableJoel Stanley
This was added to support an early version of Power8 that did not have working doorbells. These machines were not publicly available, and all of the internal users have long since upgraded. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-10powerpc/dts: Use a correct at24 compatible fallback in ac14xxBartosz Golaszewski
Using 'at24' as fallback is now deprecated - use the full 'atmel,<model>' string. Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-10powerpc/dts: Use 'atmel' as at24 manufacturer for kmcent2Bartosz Golaszewski
Using compatible strings without the <manufacturer> part for at24 is now deprecated. Use a correct 'atmel,<model>' value. Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-10powerpc/dts: Use 'atmel' as at24 manufacturer for pdm360ngBartosz Golaszewski
Using 'at' as the <manufacturer> part of the compatible string is now deprecated. Use a correct string: 'atmel,<model>'. Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-09cpufreq: powernv: Remove global pstate ramp-down timer in POWER9Shilpasri G Bhat
POWER9 does not support global pstate requests for the chip. So remove the timer logic which slowly ramps down the global pstate in P9 platforms. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [mpe: Drop NULL check before kfree(policy->driver_data)] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-04powerpc: Enable kernel XZ compression option on BOOK3S_32Aaro Koskinen
Enable kernel XZ compression option on BOOK3S_32. Tested on G4 PowerBook. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> [mpe: Use one select under the PPC symbol guarded by if PPC_BOOK3S] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-04powerpc/msi: Remove VLA usageKees Cook
In the quest to remove all stack VLA usage from the kernel[1], this switches from an unchanging variable to a constant expression to eliminate the VLA generation. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-04powerpc/powernv/ioda2: Add 256M IOMMU page size to the default POWER8 caseAlexey Kardashevskiy
The sketchy bypass uses 256M pages so add this page size as well. This should cause no behavioral change but will be used later. Fixes: 477afd6ea6 "powerpc/ioda: Use ibm,supported-tce-sizes for IOMMU page size mask" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-04powerpc/kdump: Handle crashkernel memory reservation failureHari Bathini
Memory reservation for crashkernel could fail if there are holes around kdump kernel offset (128M). Fail gracefully in such cases and print an error message. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Tested-by: David Gibson <dgibson@redhat.com> Reviewed-by: Dave Young <dyoung@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-03powerpc/mpc5200: Remove VLA usageKees Cook
In the quest to remove all stack VLA usage from the kernel[1], this switches to using a stack size large enough for the saved routine and adds a sanity check making sure the routine doesn't overflow into the 0x600 exception handler. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-02ocxl: Fix page fault handler in case of fault on dying processFrederic Barrat
If a process exits without doing proper cleanup, there's a window where an opencapi device can try to access the memory of the dying process and may trigger a page fault. That's an expected scenario and the ocxl driver holds a reference on the mm_struct of the process until the opencapi device is notified of the process exiting. However, if mm_users is already at 0, i.e. the address space of the process has already been destroyed, the driver shouldn't try resolving the page fault, as it will fail, but it can also try accessing already freed data. It is fixed by only calling the bottom half of the page fault handler if mm_users is greater than 0 and get a reference on mm_users instead of mm_count. Otherwise, we can safely return a translation fault to the device, as its associated memory context is being removed. The opencapi device will be properly cleaned up shortly after when closing the file descriptors. Fixes: 5ef3166e8a32 ("ocxl: Driver code for 'generic' opencapi devices") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-By: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>