diff options
author | David S. Miller <davem@davemloft.net> | 2018-11-28 22:10:54 -0800 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2018-11-28 22:10:54 -0800 |
commit | e561bb29b650d2817d10a4858f1817836ed08399 (patch) | |
tree | 0bc92b5bb8a287a8e4a88732f3c64b56d126da58 | |
parent | 62e3a931788223048120357ab3f29dcb55c5ef79 (diff) | |
parent | 60b548237fed4b4164bab13c994dd9615f6c4323 (diff) |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Trivial conflict in net/core/filter.c, a locally computed
'sdif' is now an argument to the function.
Signed-off-by: David S. Miller <davem@davemloft.net>
121 files changed, 2189 insertions, 1064 deletions
@@ -2204,6 +2204,10 @@ S: Post Office Box 371 S: North Little Rock, Arkansas 72115 S: USA +N: Christopher Li +E: sparse@chrisli.org +D: Sparse maintainer 2009 - 2018 + N: Stephan Linz E: linz@mazet.de E: Stephan.Linz@gmx.de diff --git a/Documentation/core-api/xarray.rst b/Documentation/core-api/xarray.rst index a4e705108f42..dbe96cb5558e 100644 --- a/Documentation/core-api/xarray.rst +++ b/Documentation/core-api/xarray.rst @@ -74,7 +74,8 @@ using :c:func:`xa_load`. xa_store will overwrite any entry with the new entry and return the previous entry stored at that index. You can use :c:func:`xa_erase` instead of calling :c:func:`xa_store` with a ``NULL`` entry. There is no difference between an entry that has never -been stored to and one that has most recently had ``NULL`` stored to it. +been stored to, one that has been erased and one that has most recently +had ``NULL`` stored to it. You can conditionally replace an entry at an index by using :c:func:`xa_cmpxchg`. Like :c:func:`cmpxchg`, it will only succeed if @@ -105,23 +106,44 @@ may result in the entry being marked at some, but not all of the other indices. Storing into one index may result in the entry retrieved by some, but not all of the other indices changing. +Sometimes you need to ensure that a subsequent call to :c:func:`xa_store` +will not need to allocate memory. The :c:func:`xa_reserve` function +will store a reserved entry at the indicated index. Users of the normal +API will see this entry as containing ``NULL``. If you do not need to +use the reserved entry, you can call :c:func:`xa_release` to remove the +unused entry. If another user has stored to the entry in the meantime, +:c:func:`xa_release` will do nothing; if instead you want the entry to +become ``NULL``, you should use :c:func:`xa_erase`. + +If all entries in the array are ``NULL``, the :c:func:`xa_empty` function +will return ``true``. + Finally, you can remove all entries from an XArray by calling :c:func:`xa_destroy`. If the XArray entries are pointers, you may wish to free the entries first. You can do this by iterating over all present entries in the XArray using the :c:func:`xa_for_each` iterator. -ID assignment -------------- +Allocating XArrays +------------------ + +If you use :c:func:`DEFINE_XARRAY_ALLOC` to define the XArray, or +initialise it by passing ``XA_FLAGS_ALLOC`` to :c:func:`xa_init_flags`, +the XArray changes to track whether entries are in use or not. You can call :c:func:`xa_alloc` to store the entry at any unused index in the XArray. If you need to modify the array from interrupt context, you can use :c:func:`xa_alloc_bh` or :c:func:`xa_alloc_irq` to disable -interrupts while allocating the ID. Unlike :c:func:`xa_store`, allocating -a ``NULL`` pointer does not delete an entry. Instead it reserves an -entry like :c:func:`xa_reserve` and you can release it using either -:c:func:`xa_erase` or :c:func:`xa_release`. To use ID assignment, the -XArray must be defined with :c:func:`DEFINE_XARRAY_ALLOC`, or initialised -by passing ``XA_FLAGS_ALLOC`` to :c:func:`xa_init_flags`, +interrupts while allocating the ID. + +Using :c:func:`xa_store`, :c:func:`xa_cmpxchg` or :c:func:`xa_insert` +will mark the entry as being allocated. Unlike a normal XArray, storing +``NULL`` will mark the entry as being in use, like :c:func:`xa_reserve`. +To free an entry, use :c:func:`xa_erase` (or :c:func:`xa_release` if +you only want to free the entry if it's ``NULL``). + +You cannot use ``XA_MARK_0`` with an allocating XArray as this mark +is used to track whether an entry is free or not. The other marks are +available for your use. Memory allocation ----------------- @@ -158,6 +180,8 @@ Takes RCU read lock: Takes xa_lock internally: * :c:func:`xa_store` + * :c:func:`xa_store_bh` + * :c:func:`xa_store_irq` * :c:func:`xa_insert` * :c:func:`xa_erase` * :c:func:`xa_erase_bh` @@ -167,6 +191,9 @@ Takes xa_lock internally: * :c:func:`xa_alloc` * :c:func:`xa_alloc_bh` * :c:func:`xa_alloc_irq` + * :c:func:`xa_reserve` + * :c:func:`xa_reserve_bh` + * :c:func:`xa_reserve_irq` * :c:func:`xa_destroy` * :c:func:`xa_set_mark` * :c:func:`xa_clear_mark` @@ -177,6 +204,7 @@ Assumes xa_lock held on entry: * :c:func:`__xa_erase` * :c:func:`__xa_cmpxchg` * :c:func:`__xa_alloc` + * :c:func:`__xa_reserve` * :c:func:`__xa_set_mark` * :c:func:`__xa_clear_mark` @@ -234,7 +262,8 @@ Sharing the XArray with interrupt context is also possible, either using :c:func:`xa_lock_irqsave` in both the interrupt handler and process context, or :c:func:`xa_lock_irq` in process context and :c:func:`xa_lock` in the interrupt handler. Some of the more common patterns have helper -functions such as :c:func:`xa_erase_bh` and :c:func:`xa_erase_irq`. +functions such as :c:func:`xa_store_bh`, :c:func:`xa_store_irq`, +:c:func:`xa_erase_bh` and :c:func:`xa_erase_irq`. Sometimes you need to protect access to the XArray with a mutex because that lock sits above another mutex in the locking hierarchy. That does @@ -322,7 +351,8 @@ to :c:func:`xas_retry`, and retry the operation if it returns ``true``. - :c:func:`xa_is_zero` - Zero entries appear as ``NULL`` through the Normal API, but occupy an entry in the XArray which can be used to reserve the index for - future use. + future use. This is used by allocating XArrays for allocated entries + which are ``NULL``. Other internal entries may be added in the future. As far as possible, they will be handled by :c:func:`xas_retry`. diff --git a/Documentation/devicetree/bindings/spi/spi-uniphier.txt b/Documentation/devicetree/bindings/spi/spi-uniphier.txt index 504a4ecfc7b1..b04e66a52de5 100644 --- a/Documentation/devicetree/bindings/spi/spi-uniphier.txt +++ b/Documentation/devicetree/bindings/spi/spi-uniphier.txt @@ -5,18 +5,20 @@ UniPhier SoCs have SCSSI which supports SPI single channel. Required properties: - compatible: should be "socionext,uniphier-scssi" - reg: address and length of the spi master registers - - #address-cells: must be <1>, see spi-bus.txt - - #size-cells: must be <0>, see spi-bus.txt - - clocks: A phandle to the clock for the device. - - resets: A phandle to the reset control for the device. + - interrupts: a single interrupt specifier + - pinctrl-names: should be "default" + - pinctrl-0: pin control state for the default mode + - clocks: a phandle to the clock for the device + - resets: a phandle to the reset control for the device Example: spi0: spi@54006000 { compatible = "socionext,uniphier-scssi"; reg = <0x54006000 0x100>; - #address-cells = <1>; - #size-cells = <0>; + interrupts = <0 39 4>; + pinctrl-names = "default"; + pinctrl-0 = <&pinctrl_spi0>; clocks = <&peri_clk 11>; resets = <&peri_rst 11>; }; diff --git a/Documentation/input/event-codes.rst b/Documentation/input/event-codes.rst index cef220c176a4..a8c0873beb95 100644 --- a/Documentation/input/event-codes.rst +++ b/Documentation/input/event-codes.rst @@ -190,16 +190,7 @@ A few EV_REL codes have special meanings: * REL_WHEEL, REL_HWHEEL: - These codes are used for vertical and horizontal scroll wheels, - respectively. The value is the number of "notches" moved on the wheel, the - physical size of which varies by device. For high-resolution wheels (which - report multiple events for each notch of movement, or do not have notches) - this may be an approximation based on the high-resolution scroll events. - -* REL_WHEEL_HI_RES: - - - If a vertical scroll wheel supports high-resolution scrolling, this code - will be emitted in addition to REL_WHEEL. The value is the (approximate) - distance travelled by the user's finger, in microns. + respectively. EV_ABS ------ diff --git a/MAINTAINERS b/MAINTAINERS index 81319971ca9a..fb88b6863d10 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2801,7 +2801,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git T: git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git Q: https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147 S: Supported -F: arch/x86/net/bpf_jit* +F: arch/*/net/* F: Documentation/networking/filter.txt F: Documentation/bpf/ F: include/linux/bpf* @@ -2821,6 +2821,67 @@ F: tools/bpf/ F: tools/lib/bpf/ F: tools/testing/selftests/bpf/ +BPF JIT for ARM +M: Shubham Bansal <illusionist.neo@gmail.com> +L: netdev@vger.kernel.org +S: Maintained +F: arch/arm/net/ + +BPF JIT for ARM64 +M: Daniel Borkmann <daniel@iogearbox.net> +M: Alexei Starovoitov <ast@kernel.org> +M: Zi Shen Lim <zlim.lnx@gmail.com> +L: netdev@vger.kernel.org +S: Supported +F: arch/arm64/net/ + +BPF JIT for MIPS (32-BIT AND 64-BIT) +M: Paul Burton <paul.burton@mips.com> +L: netdev@vger.kernel.org +S: Maintained +F: arch/mips/net/ + +BPF JIT for NFP NICs +M: Jakub Kicinski <jakub.kicinski@netronome.com> +L: netdev@vger.kernel.org +S: Supported +F: drivers/net/ethernet/netronome/nfp/bpf/ + +BPF JIT for POWERPC (32-BIT AND 64-BIT) +M: Naveen N. Rao <naveen.n.rao@linux.ibm.com> +M: Sandipan Das <sandipan@linux.ibm.com> +L: netdev@vger.kernel.org +S: Maintained +F: arch/powerpc/net/ + +BPF JIT for S390 +M: Martin Schwidefsky <schwidefsky@de.ibm.com> +M: Heiko Carstens <heiko.carstens@de.ibm.com> +L: netdev@vger.kernel.org +S: Maintained +F: arch/s390/net/ +X: arch/s390/net/pnet.c + +BPF JIT for SPARC (32-BIT AND 64-BIT) +M: David S. Miller <davem@davemloft.net> +L: netdev@vger.kernel.org +S: Maintained +F: arch/sparc/net/ + +BPF JIT for X86 32-BIT +M: Wang YanQing <udknight@gmail.com> +L: netdev@vger.kernel.org +S: Maintained +F: arch/x86/net/bpf_jit_comp32.c + +BPF JIT for X86 64-BIT +M: Alexei Starovoitov <ast@kernel.org> +M: Daniel Borkmann <daniel@iogearbox.net> +L: netdev@vger.kernel.org +S: Supported +F: arch/x86/net/ +X: arch/x86/net/bpf_jit_comp32.c + BROADCOM B44 10/100 ETHERNET DRIVER M: Michael Chan <michael.chan@broadcom.com> L: netdev@vger.kernel.org @@ -13997,11 +14058,10 @@ F: drivers/tty/serial/sunzilog.h F: drivers/tty/vcc.c SPARSE CHECKER -M: "Christopher Li" <sparse@chrisli.org> +M: "Luc Van Oostenryck" <luc.vanoostenryck@gmail.com> L: linux-sparse@vger.kernel.org W: https://sparse.wiki.kernel.org/ T: git git://git.kernel.org/pub/scm/devel/sparse/sparse.git -T: git git://git.kernel.org/pub/scm/devel/sparse/chrisl/sparse.git S: Maintained F: include/linux/compiler.h @@ -2,8 +2,8 @@ VERSION = 4 PATCHLEVEL = 20 SUBLEVEL = 0 -EXTRAVERSION = -rc3 -NAME = "People's Front" +EXTRAVERSION = -rc4 +NAME = Shy Crocodile # *DOCUMENTATION* # To see a list of typical targets execute "make help" diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index a6fdaea07c63..89198017e8e6 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -351,7 +351,8 @@ static void build_epilogue(struct jit_ctx *ctx) * >0 - successfully JITed a 16-byte eBPF instruction. * <0 - failed to JIT. */ -static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) +static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, + bool extra_pass) { const u8 code = insn->code; const u8 dst = bpf2a64[insn->dst_reg]; @@ -625,12 +626,19 @@ emit_cond_jmp: case BPF_JMP | BPF_CALL: { const u8 r0 = bpf2a64[BPF_REG_0]; - const u64 func = (u64)__bpf_call_base + imm; + bool func_addr_fixed; + u64 func_addr; + int ret; - if (ctx->prog->is_func) - emit_addr_mov_i64(tmp, func, ctx); + ret = bpf_jit_get_func_addr(ctx->prog, insn, extra_pass, + &func_addr, &func_addr_fixed); + if (ret < 0) + return ret; + if (func_addr_fixed) + /* We can use optimized emission here. */ + emit_a64_mov_i64(tmp, func_addr, ctx); else - emit_a64_mov_i64(tmp, func, ctx); + emit_addr_mov_i64(tmp, func_addr, ctx); emit(A64_BLR(tmp), ctx); emit(A64_MOV(1, r0, A64_R(0)), ctx); break; @@ -753,7 +761,7 @@ emit_cond_jmp: return 0; } -static int build_body(struct jit_ctx *ctx) +static int build_body(struct jit_ctx *ctx, bool extra_pass) { const struct bpf_prog *prog = ctx->prog; int i; @@ -762,7 +770,7 @@ static int build_body(struct jit_ctx *ctx) const struct bpf_insn *insn = &prog->insnsi[i]; int ret; - ret = build_insn(insn, ctx); + ret = build_insn(insn, ctx, extra_pass); if (ret > 0) { i++; if (ctx->image == NULL) @@ -858,7 +866,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) /* 1. Initial fake pass to compute ctx->idx. */ /* Fake pass to fill in ctx->offset. */ - if (build_body(&ctx)) { + if (build_body(&ctx, extra_pass)) { prog = orig_prog; goto out_off; } @@ -888,7 +896,7 @@ skip_init_ctx: build_prologue(&ctx, was_classic); - if (build_body(&ctx)) { + if (build_body(&ctx, extra_pass)) { bpf_jit_binary_free(header); prog = orig_prog; goto out_off; diff --git a/arch/ia64/include/asm/numa.h b/arch/ia64/include/asm/numa.h index ebef7f40aabb..c5c253cb9bd6 100644 --- a/arch/ia64/include/asm/numa.h +++ b/arch/ia64/include/asm/numa.h @@ -59,7 +59,9 @@ extern struct node_cpuid_s node_cpuid[NR_CPUS]; */ extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES]; -#define node_distance(from,to) (numa_slit[(from) * MAX_NUMNODES + (to)]) +#define slit_distance(from,to) (numa_slit[(from) * MAX_NUMNODES + (to)]) +extern int __node_distance(int from, int to); +#define node_distance(from,to) __node_distance(from, to) extern int paddr_to_nid(unsigned long paddr); diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c index 1dacbf5e9e09..41eb281709da 100644 --- a/arch/ia64/kernel/acpi.c +++ b/arch/ia64/kernel/acpi.c @@ -578,8 +578,8 @@ void __init acpi_numa_fixup(void) if (!slit_table) { for (i = 0; i < MAX_NUMNODES; i++) for (j = 0; j < MAX_NUMNODES; j++) - node_distance(i, j) = i == j ? LOCAL_DISTANCE : - REMOTE_DISTANCE; + slit_distance(i, j) = i == j ? + LOCAL_DISTANCE : REMOTE_DISTANCE; return; } @@ -592,7 +592,7 @@ void __init acpi_numa_fixup(void) if (!pxm_bit_test(j)) continue; node_to = pxm_to_node(j); - node_distance(node_from, node_to) = + slit_distance(node_from, node_to) = slit_table->entry[i * slit_table->locality_count + j]; } } diff --git a/arch/ia64/mm/numa.c b/arch/ia64/mm/numa.c index 3861d6e32d5f..a03803506b0c 100644 --- a/arch/ia64/mm/numa.c +++ b/arch/ia64/mm/numa.c @@ -36,6 +36,12 @@ struct node_cpuid_s node_cpuid[NR_CPUS] = */ u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES]; +int __node_distance(int from, int to) +{ + return slit_distance(from, to); +} +EXPORT_SYMBOL(__node_distance); + /* Identify which cnode a physical address resides on */ int paddr_to_nid(unsigned long paddr) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index d65b961661fb..a56f8413758a 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -983,6 +983,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) ret = kvmhv_enter_nested_guest(vcpu); if (ret == H_INTERRUPT) { kvmppc_set_gpr(vcpu, 3, 0); + vcpu->arch.hcall_needed = 0; return -EINTR; } break; diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c index 50b129785aee..17482f5de3e2 100644 --- a/arch/powerpc/net/bpf_jit_comp64.c +++ b/arch/powerpc/net/bpf_jit_comp64.c @@ -166,7 +166,33 @@ static void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx) PPC_BLR(); } -static void bpf_jit_emit_func_call(u32 *image, struct codegen_context *ctx, u64 func) +static void bpf_jit_emit_func_call_hlp(u32 *image, struct codegen_context *ctx, + u64 func) +{ +#ifdef PPC64_ELF_ABI_v1 + /* func points to the function descriptor */ + PPC_LI64(b2p[TMP_REG_2], func); + /* Load actual entry point from function descriptor */ + PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_2], 0); + /* ... and move it to LR */ + PPC_MTLR(b2p[TMP_REG_1]); + /* + * Load TOC from function descriptor at offset 8. + * We can clobber r2 since we get called through a + * function pointer (so caller will save/restore r2) + * and since we don't use a TOC ourself. + */ + PPC_BPF_LL(2, b2p[TMP_REG_2], 8); +#else + /* We can clobber r12 */ + PPC_FUNC_ADDR(12, func); + PPC_MTLR(12); +#endif + PPC_BLRL(); +} + +static void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context *ctx, + u64 func) { unsigned int i, ctx_idx = ctx->idx; @@ -273,7 +299,7 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, { const struct bpf_insn *insn = fp->insnsi; int flen = fp->len; - int i; + int i, ret; /* Start of epilogue code - will only be valid 2nd pass onwards */ u32 exit_addr = addrs[flen]; @@ -284,8 +310,9 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 src_reg = b2p[insn[i].src_reg]; s16 off = insn[i].off; s32 imm = insn[i].imm; + bool func_addr_fixed; + u64 func_addr; u64 imm64; - u8 *func; u32 true_cond; u32 tmp_idx; @@ -711,23 +738,15 @@ emit_clear: case BPF_JMP | BPF_CALL: ctx->seen |= SEEN_FUNC; - /* bpf function call */ - if (insn[i].src_reg == BPF_PSEUDO_CALL) - if (!extra_pass) - func = NULL; - else if (fp->aux->func && off < fp->aux->func_cnt) - /* use the subprog id from the off - * field to lookup the callee address - */ - func = (u8 *) fp->aux->func[off]->bpf_func; - else - return -EINVAL; - /* kernel helper call */ - else - func = (u8 *) __bpf_call_base + imm; - - bpf_jit_emit_func_call(image, ctx, (u64)func); + ret = bpf_jit_get_func_addr(fp, &insn[i], extra_pass, + &func_addr, &func_addr_fixed); + if (ret < 0) + return ret; + if (func_addr_fixed) + bpf_jit_emit_func_call_hlp(image, ctx, func_addr); + else + bpf_jit_emit_func_call_rel(image, ctx, func_addr); /* move return value from r3 to BPF_REG_0 */ PPC_MR(b2p[BPF_REG_0], 3); break; diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c index 222785af550b..5fda4f7bf15d 100644 --- a/arch/sparc/net/bpf_jit_comp_64.c +++ b/arch/sparc/net/bpf_jit_comp_64.c @@ -791,7 +791,7 @@ static int emit_compare_and_branch(const u8 code, const u8 dst, u8 src, } /* Just skip the save instruction and the ctx register move. */ -#define BPF_TAILCALL_PROLOGUE_SKIP 16 +#define BPF_TAILCALL_PROLOGUE_SKIP 32 #define BPF_TAILCALL_CNT_SP_OFF (STACK_BIAS + 128) static void build_prologue(struct jit_ctx *ctx) @@ -824,9 +824,15 @@ static void build_prologue(struct jit_ctx *ctx) const u8 vfp = bpf2sparc[BPF_REG_FP]; emit(ADD | IMMED | RS1(FP) | S13(STACK_BIAS) | RD(vfp), ctx); + } else { + emit_nop(ctx); } emit_reg_move(I0, O0, ctx); + emit_reg_move(I1, O1, ctx); + emit_reg_move(I2, O2, ctx); + emit_reg_move(I3, O3, ctx); + emit_reg_move(I4, O4, ctx); /* If you add anything here, adjust BPF_TAILCALL_PROLOGUE_SKIP above. */ } @@ -1270,6 +1276,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) const u8 tmp2 = bpf2sparc[TMP_REG_2]; u32 opcode = 0, rs2; + if (insn->dst_reg == BPF_REG_FP) + ctx->saw_frame_pointer = true; + ctx->tmp_2_used = true; emit_loadimm(imm, tmp2, ctx); @@ -1308,6 +1317,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) const u8 tmp = bpf2sparc[TMP_REG_1]; u32 opcode = 0, rs2; + if (insn->dst_reg == BPF_REG_FP) + ctx->saw_frame_pointer = true; + switch (BPF_SIZE(code)) { case BPF_W: opcode = ST32; @@ -1340,6 +1352,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) const u8 tmp2 = bpf2sparc[TMP_REG_2]; const u8 tmp3 = bpf2sparc[TMP_REG_3]; + if (insn->dst_reg == BPF_REG_FP) + ctx->saw_frame_pointer = true; + ctx->tmp_1_used = true; ctx->tmp_2_used = true; ctx->tmp_3_used = true; @@ -1360,6 +1375,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx) const u8 tmp2 = bpf2sparc[TMP_REG_2]; const u8 tmp3 = bpf2sparc[TMP_REG_3]; + if (insn->dst_reg == BPF_REG_FP) + ctx->saw_frame_pointer = true; + ctx->tmp_1_used = true; ctx->tmp_2_used = true; ctx->tmp_3_used = true; @@ -1425,12 +1443,12 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) struct bpf_prog *tmp, *orig_prog = prog; struct sparc64_jit_data *jit_data; struct bpf_binary_header *header; + u32 prev_image_size, image_size; bool tmp_blinded = false; bool extra_pass = false; struct jit_ctx ctx; - u32 image_size; u8 *image_ptr; - int pass; + int pass, i; if (!prog->jit_requested) return orig_prog; @@ -1461,61 +1479,82 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) header = jit_data->header; extra_pass = true; image_size = sizeof(u32) * ctx.idx; + prev_image_size = image_size; + pass = 1; goto skip_init_ctx; } memset(&ctx, 0, sizeof(ctx)); ctx.prog = prog; - ctx.offset = kcalloc(prog->len, sizeof(unsigned int), GFP_KERNEL); + ctx.offset = kmalloc_array(prog->len, sizeof(unsigned int), GFP_KERNEL); if (ctx.offset == NULL) { prog = orig_prog; goto out_off; } - /* Fake pass to detect features used, and get an accurate assessment - * of what the final image size will be. + /* Longest sequence emitted is for bswap32, 12 instructions. Pre-cook + * the offset array so that we converge faster. */ - if (build_body(&ctx)) { - prog = orig_prog; - goto out_off; - } - build_prologue(&ctx); - build_epilogue(&ctx); - - /* Now we know the actual image size. */ - image_size = sizeof(u32) * ctx.idx; - header = bpf_jit_binary_alloc(image_size, &image_ptr, - sizeof(u32), jit_fill_hole); - if (header == NULL) { - prog = orig_prog; - goto out_off; - } + for (i = 0; i < prog->len; i++) + ctx.offset[i] = i * (12 * 4); - ctx.image = (u32 *)image_ptr; -skip_init_ctx: - for (pass = 1; pass < 3; pass++) { + prev_image_size = ~0U; + for (pass = 1; pass < 40; pass++) { ctx.idx = 0; build_prologue(&ctx); - if (build_body(&ctx)) { - bpf_jit_binary_free(header); prog = orig_prog; goto out_off; } - build_epilogue(&ctx); if (bpf_jit_enable > 1) - pr_info("Pass %d: shrink = %d, seen = [%c%c%c%c%c%c]\n", pass, - image_size - (ctx.idx * 4), + pr_info("Pass %d: size = %u, seen = [%c%c%c%c%c%c]\n", pass, + ctx.idx * 4, ctx.tmp_1_used ? '1' : ' ', ctx.tmp_2_used ? '2' : ' ', ctx.tmp_3_used ? '3' : ' ', ctx.saw_frame_pointer ? 'F' : ' ', ctx.saw_call ? 'C' : ' ', ctx.saw_tail_call ? 'T' : ' '); + + if (ctx.idx * 4 == prev_image_size) + break; + prev_image_size = ctx.idx * 4; + cond_resched(); + } + + /* Now we know the actual image size. */ + image_size = sizeof(u32) * ctx.idx; + header = bpf_jit_binary_alloc(image_size, &image_ptr, + sizeof(u32), jit_fill_hole); + if (header == NULL) { + prog = orig_prog; + goto out_off; + } + + ctx.image = (u32 *)image_ptr; +skip_init_ctx: + ctx.idx = 0; + + build_prologue(&ctx); + + if (build_body(&ctx)) { + bpf_jit_binary_free(header); + prog = orig_prog; + goto out_off; + } + + build_epilogue(&ctx); + + if (ctx.idx * 4 != prev_image_size) { + pr_err("bpf_jit: Failed to converge, prev_size=%u size=%d\n", + prev_image_size, ctx.idx * 4); + bpf_jit_binary_free(header); + prog = orig_prog; + goto out_off; } if (bpf_jit_enable > 1) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 55e51ff7e421..fbda5a917c5b 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1094,7 +1094,8 @@ struct kvm_x86_ops { bool (*has_wbinvd_exit)(void); u64 (*read_l1_tsc_offset)(struct kvm_vcpu *vcpu); - void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset); + /* Returns actual tsc_offset set in active VMCS */ + u64 (*write_l1_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset); void (*get_exit_info)(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2); diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 89db20f8cb70..c4533d05c214 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -55,7 +55,7 @@ #define PRIo64 "o" /* #define apic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg) */ -#define apic_debug(fmt, arg...) +#define apic_debug(fmt, arg...) do {} while (0) /* 14 is the version for Xeon and Pentium 8.4.8*/ #define APIC_VERSION (0x14UL | ((KVM_APIC_LVT_NUM - 1) << 16)) @@ -576,6 +576,11 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low, rcu_read_lock(); map = rcu_dereference(kvm->arch.apic_map); + if (unlikely(!map)) { + count = -EOPNOTSUPP; + goto out; + } + if (min > map->max_apic_id) goto out; /* Bits above cluster_size are masked in the caller. */ diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index cf5f572f2305..7c03c0f35444 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -5074,9 +5074,9 @@ static bool need_remote_flush(u64 old, u64 new) } static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa, - const u8 *new, int *bytes) + int *bytes) { - u64 gentry; + u64 gentry = 0; int r; /* @@ -5088,22 +5088,12 @@ static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa, /* Handle a 32-bit guest writing two halves of a 64-bit gpte */ *gpa &= ~(gpa_t)7; *bytes = 8; - r = kvm_vcpu_read_guest(vcpu, *gpa, &gentry, 8); - if (r) - gentry = 0; - new = (const u8 *)&gentry; } - switch (*bytes) { - case 4: - gentry = *(const u32 *)new; - break; - case 8: - gentry = *(const u64 *)new; - break; - default: - gentry = 0; - break; + if (*bytes == 4 || *bytes == 8) { + r = kvm_vcpu_read_guest_atomic(vcpu, *gpa, &gentry, *bytes); + if (r) + gentry = 0; } return gentry; @@ -5207,8 +5197,6 @@ static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, pgprintk("%s: gpa %llx bytes %d\n", __func__, gpa, bytes); - gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, new, &bytes); - /* * No need to care whether allocation memory is successful * or not since pte prefetch is skiped if it does not have @@ -5217,6 +5205,9 @@ static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, mmu_topup_memory_caches(vcpu); spin_lock(&vcpu->kvm->mmu_lock); + + gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, &bytes); + ++vcpu->kvm->stat.mmu_pte_write; kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE); diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 0e21ccc46792..cc6467b35a85 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1446,7 +1446,7 @@ static u64 svm_read_l1_tsc_offset(struct kvm_vcpu *vcpu) return vcpu->arch.tsc_offset; } -static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) +static u64 svm_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) { struct vcpu_svm *svm = to_svm(vcpu); u64 g_tsc_offset = 0; @@ -1464,6 +1464,7 @@ static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) svm->vmcb->control.tsc_offset = offset + g_tsc_offset; mark_dirty(svm->vmcb, VMCB_INTERCEPTS); + return svm->vmcb->control.tsc_offset; } static void avic_init_vmcb(struct vcpu_svm *svm) @@ -1664,20 +1665,23 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu, static int avic_init_access_page(struct kvm_vcpu *vcpu) { struct kvm *kvm = vcpu->kvm; - int ret; + int ret = 0; + mutex_lock(&kvm->slots_lock); if (kvm->arch.apic_access_page_done) - return 0; + goto out; - ret = x86_set_memory_region(kvm, - APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, - APIC_DEFAULT_PHYS_BASE, - PAGE_SIZE); + ret = __x86_set_memory_region(kvm, + APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, + APIC_DEFAULT_PHYS_BASE, + PAGE_SIZE); if (ret) - return ret; + goto out; kvm->arch.apic_access_page_done = true; - return 0; +out: + mutex_unlock(&kvm->slots_lock); + return ret; } static int avic_init_backing_page(struct kvm_vcpu *vcpu) @@ -2189,21 +2193,31 @@ out: return ERR_PTR(err); } +static void svm_clear_current_vmcb(struct vmcb *vmcb) +{ + int i; + + for_each_online_cpu(i) + cmpxchg(&per_cpu(svm_data, i)->current_vmcb, vmcb, NULL); +} + static void svm_free_vcpu(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); + /* + * The vmcb page can be recycled, causing a false negative in + * svm_vcpu_load(). So, ensure that no logical CPU has this + * vmcb page recorded as its current vmcb. + */ + svm_clear_current_vmcb(svm->vmcb); + __free_page(pfn_to_page(__sme_clr(svm->vmcb_pa) >> PAGE_SHIFT)); __free_pages(virt_to_page(svm->msrpm), MSRPM_ALLOC_ORDER); __free_page(virt_to_page(svm->nested.hsave)); __free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER); kvm_vcpu_uninit(vcpu); kmem_cache_free(kvm_vcpu_cache, svm); - /* - * The vmcb page can be recycled, causing a false negative in - * svm_vcpu_load(). So do a full IBPB now. - */ - indirect_branch_prediction_barrier(); } static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu) @@ -7149,7 +7163,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = { .has_wbinvd_exit = svm_has_wbinvd_exit, .read_l1_tsc_offset = svm_read_l1_tsc_offset, - .write_tsc_offset = svm_write_tsc_offset, + .write_l1_tsc_offset = svm_write_l1_tsc_offset, .set_tdp_cr3 = set_tdp_cr3, diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 4555077d69ce..02edd9960e9d 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -174,6 +174,7 @@ module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO); * refer SDM volume 3b section 21.6.13 & 22.1.3. */ static unsigned int ple_gap = KVM_DEFAULT_PLE_GAP; +module_param(ple_gap, uint, 0444); static unsigned int ple_window = KVM_VMX_DEFAULT_PLE_WINDOW; module_param(ple_window, uint, 0444); @@ -984,6 +985,7 @@ struct vcpu_vmx { struct shared_msr_entry *guest_msrs; int nmsrs; int save_nmsrs; + bool guest_msrs_dirty; unsigned long host_idt_base; #ifdef CONFIG_X86_64 u64 msr_host_kernel_gs_base; @@ -1306,7 +1308,7 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked); static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12, u16 error_code); static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu); -static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr, int type); static DEFINE_PER_CPU(struct vmcs *, vmxarea); @@ -1610,12 +1612,6 @@ static int nested_enable_evmcs(struct kvm_vcpu *vcpu, { struct vcpu_vmx *vmx = to_vmx(vcpu); - /* We don't support disabling the feature for simplicity. */ - if (vmx->nested.enlightened_vmcs_enabled) - return 0; - - vmx->nested.enlightened_vmcs_enabled = true; - /* * vmcs_version represents the range of supported Enlightened VMCS * versions: lower 8 bits is the minimal version, higher 8 bits is the @@ -1625,6 +1621,12 @@ static int nested_enable_evmcs(struct kvm_vcpu *vcpu, if (vmcs_version) *vmcs_version = (KVM_EVMCS_VERSION << 8) | 1; + /* We don't support disabling the feature for simplicity. */ + if (vmx->nested.enlightened_vmcs_enabled) + return 0; + + vmx->nested.enlightened_vmcs_enabled = true; + vmx->nested.msrs.pinbased_ctls_high &= ~EVMCS1_UNSUPPORTED_PINCTRL; vmx->nested.msrs.entry_ctls_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL; vmx->nested.msrs.exit_ctls_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL; @@ -2897,6 +2899,20 @@ static void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) vmx->req_immediate_exit = false; + /* + * Note that guest MSRs to be saved/restored can also be changed + * when guest state is loaded. This happens when guest transitions + * to/from long-mode by setting MSR_EFER.LMA. + */ + if (!vmx->loaded_cpu_state || vmx->guest_msrs_dirty) { + vmx->guest_msrs_dirty = false; + for (i = 0; i < vmx->save_nmsrs; ++i) + kvm_set_shared_msr(vmx->guest_msrs[i].index, + vmx->guest_msrs[i].data, + vmx->guest_msrs[i].mask); + + } + if (vmx->loaded_cpu_state) return; @@ -2957,11 +2973,6 @@ static void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) vmcs_writel(HOST_GS_BASE, gs_base); host_state->gs_base = gs_base; } - - for (i = 0; i < vmx->save_nmsrs; ++i) - kvm_set_shared_msr(vmx->guest_msrs[i].index, - vmx->guest_msrs[i].data, - vmx->guest_msrs[i].mask); } static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx) @@ -3436,6 +3447,7 @@ static void setup_msrs(struct vcpu_vmx *vmx) move_msr_up(vmx, index, save_nmsrs++); vmx->save_nmsrs = save_nmsrs; + vmx->guest_msrs_dirty = true; if (cpu_has_vmx_msr_bitmap()) vmx_update_msr_bitmap(&vmx->vcpu); @@ -3452,11 +3464,9 @@ static u64 vmx_read_l1_tsc_offset(struct kvm_vcpu *vcpu) return vcpu->arch.tsc_offset; } -/* - * writes 'offset' into guest's timestamp counter offset register - */ -static void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) +static u64 vmx_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) { + u64 active_offset = offset; if (is_guest_mode(vcpu)) { /* * We're here if L1 chose not to trap WRMSR to TSC. According @@ -3464,17 +3474,16 @@ static void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) * set for L2 remains unchanged, and still needs to be added * to the newly set TSC to get L2's TSC. */ - struct vmcs12 *vmcs12; - /* recalculate vmcs02.TSC_OFFSET: */ - vmcs12 = get_vmcs12(vcpu); - vmcs_write64(TSC_OFFSET, offset + - (nested_cpu_has(vmcs12, CPU_BASED_USE_TSC_OFFSETING) ? - vmcs12->tsc_offset : 0)); + struct vmcs12 *vmcs12 = get_vmcs12(vcpu); + if (nested_cpu_has(vmcs12, CPU_BASED_USE_TSC_OFFSETING)) + active_offset += vmcs12->tsc_offset; } else { trace_kvm_write_tsc_offset(vcpu->vcpu_id, vmcs_read64(TSC_OFFSET), offset); - vmcs_write64(TSC_OFFSET, offset); } + + vmcs_write64(TSC_OFFSET, active_offset); + return active_offset; } /* @@ -5944,7 +5953,7 @@ static void free_vpid(int vpid) spin_unlock(&vmx_vpid_lock); } -static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr, int type) { int f = sizeof(unsigned long); @@ -5982,7 +5991,7 @@ static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bit } } -static void __always_inline vmx_enable_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_enable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr, int type) { int f = sizeof(unsigned long); @@ -6020,7 +6029,7 @@ static void __always_inline vmx_enable_intercept_for_msr(unsigned long *msr_bitm } } -static void __always_inline vmx_set_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_set_intercept_for_msr(unsigned long *msr_bitmap, u32 msr, int type, bool value) { if (value) @@ -8664,8 +8673,6 @@ static int copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx) struct vmcs12 *vmcs12 = vmx->nested.cached_vmcs12; struct hv_enlightened_vmcs *evmcs = vmx->nested.hv_evmcs; - vmcs12->hdr.revision_id = evmcs->revision_id; - /* HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE */ vmcs12->tpr_threshold = evmcs->tpr_threshold; vmcs12->guest_rip = evmcs->guest_rip; @@ -9369,7 +9376,30 @@ static int nested_vmx_handle_enlightened_vmptrld(struct kvm_vcpu *vcpu, vmx->nested.hv_evmcs = kmap(vmx->nested.hv_evmcs_page); - if (vmx->nested.hv_evmcs->revision_id != VMCS12_REVISION) { + /* + * Currently, KVM only supports eVMCS version 1 + * (== KVM_EVMCS_VERSION) and thus we expect guest to set this + * value to first u32 field of eVMCS which should specify eVMCS + * VersionNumber. + * + * Guest should be aware of supported eVMCS versions by host by + * examining CPUID.0x4000000A.EAX[0:15]. Host userspace VMM is + * expected to set this CPUID leaf according to the value + * returned in vmcs_version from nested_enable_evmcs(). + * + * However, it turns out that Microsoft Hyper-V fails to comply + * to their own invented interface: When Hyper-V use eVMCS, it + * just sets first u32 field of eVMCS to revision_id specified + * in MSR_IA32_VMX_BASIC. Instead of used eVMCS version number + * which is one of the supported versions specified in + * CPUID.0x4000000A.EAX[0:15]. + * + * To overcome Hyper-V bug, we accept here either a supported + * eVMCS version or VMCS12 revision_id as valid values for first + * u32 field of eVMCS. + */ + if ((vmx->nested.hv_evmcs->revision_id != KVM_EVMCS_VERSION) && + (vmx->nested.hv_evmcs->revision_id != VMCS12_REVISION)) { nested_release_evmcs(vcpu); return 0; } @@ -9390,9 +9420,11 @@ static int nested_vmx_handle_enlightened_vmptrld(struct kvm_vcpu *vcpu, * present in struct hv_enlightened_vmcs, ...). Make sure there * are no leftovers. */ - if (from_launch) - memset(vmx->nested.cached_vmcs12, 0, - sizeof(*vmx->nested.cached_vmcs12)); + if (from_launch) { + struct vmcs12 *vmcs12 = get_vmcs12(vcpu); + memset(vmcs12, 0, sizeof(*vmcs12)); + vmcs12->hdr.revision_id = VMCS12_REVISION; + } } return 1; @@ -15062,7 +15094,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = { .has_wbinvd_exit = cpu_has_vmx_wbinvd_exit, .read_l1_tsc_offset = vmx_read_l1_tsc_offset, - .write_tsc_offset = vmx_write_tsc_offset, + .write_l1_tsc_offset = vmx_write_l1_tsc_offset, .set_tdp_cr3 = vmx_set_cr3, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5cd5647120f2..d02937760c3b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1665,8 +1665,7 @@ EXPORT_SYMBOL_GPL(kvm_read_l1_tsc); static void kvm_vcpu_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) { - kvm_x86_ops->write_tsc_offset(vcpu, offset); - vcpu->arch.tsc_offset = offset; + vcpu->arch.tsc_offset = kvm_x86_ops->write_l1_tsc_offset(vcpu, offset); } static inline bool kvm_check_tsc_unstable(void) @@ -1794,7 +1793,8 @@ EXPORT_SYMBOL_GPL(kvm_write_tsc); static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu, s64 adjustment) { - kvm_vcpu_write_tsc_offset(vcpu, vcpu->arch.tsc_offset + adjustment); + u64 tsc_offset = kvm_x86_ops->read_l1_tsc_offset(vcpu); + kvm_vcpu_write_tsc_offset(vcpu, tsc_offset + adjustment); } static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 adjustment) @@ -6918,6 +6918,7 @@ static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr, clock_pairing.nsec = ts.tv_nsec; clock_pairing.tsc = kvm_read_l1_tsc(vcpu, cycle); clock_pairing.flags = 0; + memset(&clock_pairing.pad, 0, sizeof(clock_pairing.pad)); ret = 0; if (kvm_write_guest(vcpu->kvm, paddr, &clock_pairing, @@ -7455,7 +7456,8 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu) else { if (vcpu->arch.apicv_active) kvm_x86_ops->sync_pir_to_irr(vcpu); - kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors); + if (ioapic_in_kernel(vcpu->kvm)) + kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors); } if (is_guest_mode(vcpu)) diff --git a/arch/xtensa/kernel/asm-offsets.c b/arch/xtensa/kernel/asm-offsets.c index 67904f55f188..120dd746a147 100644 --- a/arch/xtensa/kernel/asm-offsets.c +++ b/arch/xtensa/kernel/asm-offsets.c @@ -94,14 +94,14 @@ int main(void) DEFINE(THREAD_SP, offsetof (struct task_struct, thread.sp)); DEFINE(THREAD_CPENABLE, offsetof (struct thread_info, cpenable)); #if XTENSA_HAVE_COPROCESSORS - DEFINE(THREAD_XTREGS_CP0, offsetof (struct thread_info, xtregs_cp)); - DEFINE(THREAD_XTREGS_CP1, offsetof (struct thread_info, xtregs_cp)); - DEFINE(THREAD_XTREGS_CP2, offsetof (struct thread_info, xtregs_cp)); - DEFINE(THREAD_XTREGS_CP3, offsetof (struct thread_info, xtregs_cp)); - DEFINE(THREAD_XTREGS_CP4, offsetof (struct thread_info, xtregs_cp)); - DEFINE(THREAD_XTREGS_CP5, offsetof (struct thread_info, xtregs_cp)); - DEFINE(THREAD_XTREGS_CP6, offsetof (struct thread_info, xtregs_cp)); - DEFINE(THREAD_XTREGS_CP7, offsetof (struct thread_info, xtregs_cp)); + DEFINE(THREAD_XTREGS_CP0, offsetof(struct thread_info, xtregs_cp.cp0)); + DEFINE(THREAD_XTREGS_CP1, offsetof(struct thread_info, xtregs_cp.cp1)); + DEFINE(THREAD_XTREGS_CP2, offsetof(struct thread_info, xtregs_cp.cp2)); + DEFINE(THREAD_XTREGS_CP3, offsetof(struct thread_info, xtregs_cp.cp3)); + DEFINE(THREAD_XTREGS_CP4, offsetof(struct thread_info, xtregs_cp.cp4)); + DEFINE(THREAD_XTREGS_CP5, offsetof(struct thread_info, xtregs_cp.cp5)); + DEFINE(THREAD_XTREGS_CP6, offsetof(struct thread_info, xtregs_cp.cp6)); + DEFINE(THREAD_XTREGS_CP7, offsetof(struct thread_info, xtregs_cp.cp7)); #endif DEFINE(THREAD_XTREGS_USER, offsetof (struct thread_info, xtregs_user)); DEFINE(XTREGS_USER_SIZE, sizeof(xtregs_user_t)); diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c index 483dcfb6e681..4bb68133a72a 100644 --- a/arch/xtensa/kernel/process.c +++ b/arch/xtensa/kernel/process.c @@ -94,18 +94,21 @@ void coprocessor_release_all(struct thread_info *ti) void coprocessor_flush_all(struct thread_info *ti) { - unsigned long cpenable; + unsigned long cpenable, old_cpenable; int i; preempt_disable(); + RSR_CPENABLE(old_cpenable); cpenable = ti->cpenable; + WSR_CPENABLE(cpenable); for (i = 0; i < XCHAL_CP_MAX; i++) { if ((cpenable & 1) != 0 && coprocessor_owner[i] == ti) coprocessor_flush(ti, i); cpenable >>= 1; } + WSR_CPENABLE(old_cpenable); preempt_enable(); } diff --git a/arch/xtensa/kernel/ptrace.c b/arch/xtensa/kernel/ptrace.c index c0845cb1cbb9..d9541be0605a 100644 --- a/arch/xtensa/kernel/ptrace.c +++ b/arch/xtensa/kernel/ptrace.c @@ -127,12 +127,37 @@ static int ptrace_setregs(struct task_struct *child, void __user *uregs) } +#if XTENSA_HAVE_COPROCESSORS +#define CP_OFFSETS(cp) \ + { \ + .elf_xtregs_offset = offsetof(elf_xtregs_t, cp), \ + .ti_offset = offsetof(struct thread_info, xtregs_cp.cp), \ + .sz = sizeof(xtregs_ ## cp ## _t), \ + } + +static const struct { + size_t elf_xtregs_offset; + size_t ti_offset; + size_t sz; +} cp_offsets[] = { + CP_OFFSETS(cp0), + CP_OFFSETS(cp1), + CP_OFFSETS(cp2), + CP_OFFSETS(cp3), + CP_OFFSETS(cp4), + CP_OFFSETS(cp5), + CP_OFFSETS(cp6), + CP_OFFSETS(cp7), +}; +#endif + static int ptrace_getxregs(struct task_struct *child, void __user *uregs) { struct pt_regs *regs = task_pt_regs(child); struct thread_info *ti = task_thread_info(child); elf_xtregs_t __user *xtregs = uregs; int ret = 0; + int i __maybe_unused; if (!access_ok(VERIFY_WRITE, uregs, sizeof(elf_xtregs_t))) return -EIO; @@ -140,8 +165,13 @@ static int ptrace_getxregs(struct task_struct *child, void __user *uregs) #if XTENSA_HAVE_COPROCESSORS /* Flush all coprocessor registers to memory. */ coprocessor_flush_all(ti); - ret |= __copy_to_user(&xtregs->cp0, &ti->xtregs_cp, - sizeof(xtregs_coprocessor_t)); + + for (i = 0; i < ARRAY_SIZE(cp_offsets); ++i) + ret |= __copy_to_user((char __user *)xtregs + + cp_offsets[i].elf_xtregs_offset, + (const char *)ti + + cp_offsets[i].ti_offset, + cp_offsets[i].sz); #endif ret |= __copy_to_user(&xtregs->opt, ®s->xtregs_opt, sizeof(xtregs->opt)); @@ -157,6 +187,7 @@ static int ptrace_setxregs(struct task_struct *child, void __user *uregs) struct pt_regs *regs = task_pt_regs(child); elf_xtregs_t *xtregs = uregs; int ret = 0; + int i __maybe_unused; if (!access_ok(VERIFY_READ, uregs, sizeof(elf_xtregs_t))) return -EFAULT; @@ -166,8 +197,11 @@ static int ptrace_setxregs(struct task_struct *child, void __user *uregs) coprocessor_flush_all(ti); coprocessor_release_all(ti); - ret |= __copy_from_user(&ti->xtregs_cp, &xtregs->cp0, - sizeof(xtregs_coprocessor_t)); + for (i = 0; i < ARRAY_SIZE(cp_offsets); ++i) + ret |= __copy_from_user((char *)ti + cp_offsets[i].ti_offset, + (const char __user *)xtregs + + cp_offsets[i].elf_xtregs_offset, + cp_offsets[i].sz); #endif ret |= __copy_from_user(®s->xtregs_opt, &xtregs->opt, sizeof(xtregs->opt)); diff --git a/drivers/atm/firestream.c b/drivers/atm/firestream.c index 4e46dc9e41ad..11e1663bdc4d 100644 --- a/drivers/atm/firestream.c +++ b/drivers/atm/firestream.c @@ -1410,7 +1410,7 @@ static int init_q(struct fs_dev *dev, struct queue *txq, int queue, func_enter (); - fs_dprintk (FS_DEBUG_INIT, "Inititing queue at %x: %d entries:\n", + fs_dprintk (FS_DEBUG_INIT, "Initializing queue at %x: %d entries:\n", queue, nentries); p = aligned_kmalloc (sz, GFP_KERNEL, 0x10); @@ -1443,7 +1443,7 @@ static int init_fp(struct fs_dev *dev, struct freepool *fp, int queue, { func_enter (); - fs_dprintk (FS_DEBUG_INIT, "Inititing free pool at %x:\n", queue); + fs_dprintk (FS_DEBUG_INIT, "Initializing free pool at %x:\n", queue); write_fs (dev, FP_CNF(queue), (bufsize * RBFP_RBS) | RBFP_RBSVAL | RBFP_CME); write_fs (dev, FP_SA(queue), 0); diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index c0d668944dbe..ed35c9a9a110 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -275,6 +275,9 @@ #define USB_VENDOR_ID_CIDC 0x1677 +#define I2C_VENDOR_ID_CIRQUE 0x0488 +#define I2C_PRODUCT_ID_CIRQUE_121F 0x121F + #define USB_VENDOR_ID_CJTOUCH 0x24b8 #define USB_DEVICE_ID_CJTOUCH_MULTI_TOUCH_0020 0x0020 #define USB_DEVICE_ID_CJTOUCH_MULTI_TOUCH_0040 0x0040 @@ -707,6 +710,7 @@ #define USB_VENDOR_ID_LG 0x1fd2 #define USB_DEVICE_ID_LG_MULTITOUCH 0x0064 #define USB_DEVICE_ID_LG_MELFAS_MT 0x6007 +#define I2C_DEVICE_ID_LG_8001 0x8001 #define USB_VENDOR_ID_LOGITECH 0x046d #define USB_DEVICE_ID_LOGITECH_AUDIOHUB 0x0a0e @@ -805,6 +809,7 @@ #define USB_DEVICE_ID_MS_TYPE_COVER_2 0x07a9 #define USB_DEVICE_ID_MS_POWER_COVER 0x07da #define USB_DEVICE_ID_MS_XBOX_ONE_S_CONTROLLER 0x02fd +#define USB_DEVICE_ID_MS_PIXART_MOUSE 0x00cb #define USB_VENDOR_ID_MOJO 0x8282 #define USB_DEVICE_ID_RETRO_ADAPTER 0x3201 @@ -1043,6 +1048,7 @@ #define USB_VENDOR_ID_SYMBOL 0x05e0 #define USB_DEVICE_ID_SYMBOL_SCANNER_1 0x0800 #define USB_DEVICE_ID_SYMBOL_SCANNER_2 0x1300 +#define USB_DEVICE_ID_SYMBOL_SCANNER_3 0x1200 #define USB_VENDOR_ID_SYNAPTICS 0x06cb #define USB_DEVICE_ID_SYNAPTICS_TP 0x0001 @@ -1204,6 +1210,8 @@ #define USB_DEVICE_ID_PRIMAX_MOUSE_4D22 0x4d22 #define USB_DEVICE_ID_PRIMAX_KEYBOARD 0x4e05 #define USB_DEVICE_ID_PRIMAX_REZEL 0x4e72 +#define USB_DEVICE_ID_PRIMAX_PIXART_MOUSE_4D0F 0x4d0f +#define USB_DEVICE_ID_PRIMAX_PIXART_MOUSE_4E22 0x4e22 #define USB_VENDOR_ID_RISO_KAGAKU 0x1294 /* Riso Kagaku Corp. */ diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c index a2f74e6adc70..d6fab5798487 100644 --- a/drivers/hid/hid-input.c +++ b/drivers/hid/hid-input.c @@ -325,6 +325,9 @@ static const struct hid_device_id hid_battery_quirks[] = { { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_ELECOM, USB_DEVICE_ID_ELECOM_BM084), HID_BATTERY_QUIRK_IGNORE }, + { HID_USB_DEVICE(USB_VENDOR_ID_SYMBOL, + USB_DEVICE_ID_SYMBOL_SCANNER_3), + HID_BATTERY_QUIRK_IGNORE }, {} }; @@ -1838,47 +1841,3 @@ void hidinput_disconnect(struct hid_device *hid) } EXPORT_SYMBOL_GPL(hidinput_disconnect); -/** - * hid_scroll_counter_handle_scroll() - Send high- and low-resolution scroll - * events given a high-resolution wheel - * movement. - * @counter: a hid_scroll_counter struct describing the wheel. - * @hi_res_value: the movement of the wheel, in the mouse's high-resolution - * units. - * - * Given a high-resolution movement, this function converts the movement into - * microns and emits high-resolution scroll events for the input device. It also - * uses the multiplier from &struct hid_scroll_counter to emit low-resolution - * scroll events when appropriate for backwards-compatibility with userspace - * input libraries. - */ -void hid_scroll_counter_handle_scroll(struct hid_scroll_counter *counter, - int hi_res_value) -{ - int low_res_value, remainder, multiplier; - - input_report_rel(counter->dev, REL_WHEEL_HI_RES, - hi_res_value * counter->microns_per_hi_res_unit); - - /* - * Update the low-res remainder with the high-res value, - * but reset if the direction has changed. - */ - remainder = counter->remainder; - if ((remainder ^ hi_res_value) < 0) - remainder = 0; - remainder += hi_res_value; - - /* - * Then just use the resolution multiplier to see if - * we should send a low-res (aka regular wheel) event. - */ - multiplier = counter->resolution_multiplier; - low_res_value = remainder / multiplier; - remainder -= low_res_value * multiplier; - counter->remainder = remainder; - - if (low_res_value) - input_report_rel(counter->dev, REL_WHEEL, low_res_value); -} -EXPORT_SYMBOL_GPL(hid_scroll_counter_handle_scroll); diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c index f01280898b24..19cc980eebce 100644 --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -64,14 +64,6 @@ MODULE_PARM_DESC(disable_tap_to_click, #define HIDPP_QUIRK_NO_HIDINPUT BIT(23) #define HIDPP_QUIRK_FORCE_OUTPUT_REPORTS BIT(24) #define HIDPP_QUIRK_UNIFYING BIT(25) -#define HIDPP_QUIRK_HI_RES_SCROLL_1P0 BIT(26) -#define HIDPP_QUIRK_HI_RES_SCROLL_X2120 BIT(27) -#define HIDPP_QUIRK_HI_RES_SCROLL_X2121 BIT(28) - -/* Convenience constant to check for any high-res support. */ -#define HIDPP_QUIRK_HI_RES_SCROLL (HIDPP_QUIRK_HI_RES_SCROLL_1P0 | \ - HIDPP_QUIRK_HI_RES_SCROLL_X2120 | \ - HIDPP_QUIRK_HI_RES_SCROLL_X2121) #define HIDPP_QUIRK_DELAYED_INIT HIDPP_QUIRK_NO_HIDINPUT @@ -157,7 +149,6 @@ struct hidpp_device { unsigned long capabilities; struct hidpp_battery battery; - struct hid_scroll_counter vertical_wheel_counter; }; /* HID++ 1.0 error codes */ @@ -409,53 +400,32 @@ static void hidpp_prefix_name(char **name, int name_length) #define HIDPP_SET_LONG_REGISTER 0x82 #define HIDPP_GET_LONG_REGISTER 0x83 -/** - * hidpp10_set_register_bit() - Sets a single bit in a HID++ 1.0 register. - * @hidpp_dev: the device to set the register on. - * @register_address: the address of the register to modify. - * @byte: the byte of the register to modify. Should be less than 3. - * Return: 0 if successful, otherwise a negative error code. - */ -static int hidpp10_set_register_bit(struct hidpp_device *hidpp_dev, - u8 register_address, u8 byte, u8 bit) +#define HIDPP_REG_GENERAL 0x00 + +static int hidpp10_enable_battery_reporting(struct hidpp_device *hidpp_dev) { struct hidpp_report response; int ret; u8 params[3] = { 0 }; ret = hidpp_send_rap_command_sync(hidpp_dev, - REPORT_ID_HIDPP_SHORT, - HIDPP_GET_REGISTER, - register_address, - NULL, 0, &response); + REPORT_ID_HIDPP_SHORT, + HIDPP_GET_REGISTER, + HIDPP_REG_GENERAL, + NULL, 0, &response); if (ret) return ret; memcpy(params, response.rap.params, 3); - params[byte] |= BIT(bit); + /* Set the battery bit */ + params[0] |= BIT(4); return hidpp_send_rap_command_sync(hidpp_dev, - REPORT_ID_HIDPP_SHORT, - HIDPP_SET_REGISTER, - register_address, - params, 3, &response); -} - - -#define HIDPP_REG_GENERAL 0x00 - -static int hidpp10_enable_battery_reporting(struct hidpp_device *hidpp_dev) -{ - return hidpp10_set_register_bit(hidpp_dev, HIDPP_REG_GENERAL, 0, 4); -} - -#define HIDPP_REG_FEATURES 0x01 - -/* On HID++ 1.0 devices, high-res scroll was called "scrolling acceleration". */ -static int hidpp10_enable_scrolling_acceleration(struct hidpp_device *hidpp_dev) -{ - return hidpp10_set_register_bit(hidpp_dev, HIDPP_REG_FEATURES, 0, 6); + REPORT_ID_HIDPP_SHORT, + HIDPP_SET_REGISTER, + HIDPP_REG_GENERAL, + params, 3, &response); } #define HIDPP_REG_BATTERY_STATUS 0x07 @@ -1167,100 +1137,6 @@ static int hidpp_battery_get_property(struct power_supply *psy, } /* -------------------------------------------------------------------------- */ -/* 0x2120: Hi-resolution scrolling */ -/* -------------------------------------------------------------------------- */ - -#define HIDPP_PAGE_HI_RESOLUTION_SCROLLING 0x2120 - -#define CMD_HI_RESOLUTION_SCROLLING_SET_HIGHRES_SCROLLING_MODE 0x10 - -static int hidpp_hrs_set_highres_scrolling_mode(struct hidpp_device *hidpp, - bool enabled, u8 *multiplier) -{ - u8 feature_index; - u8 feature_type; - int ret; - u8 params[1]; - struct hidpp_report response; - - ret = hidpp_root_get_feature(hidpp, - HIDPP_PAGE_HI_RESOLUTION_SCROLLING, - &feature_index, - &feature_type); - if (ret) - return ret; - - params[0] = enabled ? BIT(0) : 0; - ret = hidpp_send_fap_command_sync(hidpp, feature_index, - CMD_HI_RESOLUTION_SCROLLING_SET_HIGHRES_SCROLLING_MODE, - params, sizeof(params), &response); - if (ret) - return ret; - *multiplier = response.fap.params[1]; - return 0; -} - -/* -------------------------------------------------------------------------- */ -/* 0x2121: HiRes Wheel */ -/* -------------------------------------------------------------------------- */ - -#define HIDPP_PAGE_HIRES_WHEEL 0x2121 - -#define CMD_HIRES_WHEEL_GET_WHEEL_CAPABILITY 0x00 -#define CMD_HIRES_WHEEL_SET_WHEEL_MODE 0x20 - -static int hidpp_hrw_get_wheel_capability(struct hidpp_device *hidpp, - u8 *multiplier) -{ - u8 feature_index; - u8 feature_type; - int ret; - struct hidpp_report response; - - ret = hidpp_root_get_feature(hidpp, HIDPP_PAGE_HIRES_WHEEL, - &feature_index, &feature_type); - if (ret) - goto return_default; - - ret = hidpp_send_fap_command_sync(hidpp, feature_index, - CMD_HIRES_WHEEL_GET_WHEEL_CAPABILITY, - NULL, 0, &response); - if (ret) - goto return_default; - - *multiplier = response.fap.params[0]; - return 0; -return_default: - hid_warn(hidpp->hid_dev, - "Couldn't get wheel multiplier (error %d), assuming %d.\n", - ret, *multiplier); - return ret; -} - -static int hidpp_hrw_set_wheel_mode(struct hidpp_device *hidpp, bool invert, - bool high_resolution, bool use_hidpp) -{ - u8 feature_index; - u8 feature_type; - int ret; - u8 params[1]; - struct hidpp_report response; - - ret = hidpp_root_get_feature(hidpp, HIDPP_PAGE_HIRES_WHEEL, - &feature_index, &feature_type); - if (ret) - return ret; - - params[0] = (invert ? BIT(2) : 0) | - (high_resolution ? BIT(1) : 0) | - (use_hidpp ? BIT(0) : 0); - - return hidpp_send_fap_command_sync(hidpp, feature_index, - CMD_HIRES_WHEEL_SET_WHEEL_MODE, - params, sizeof(params), &response); -} - -/* -------------------------------------------------------------------------- */ /* 0x4301: Solar Keyboard */ /* -------------------------------------------------------------------------- */ @@ -2523,8 +2399,7 @@ static int m560_raw_event(struct hid_device *hdev, u8 *data, int size) input_report_rel(mydata->input, REL_Y, v); v = hid_snto32(data[6], 8); - hid_scroll_counter_handle_scroll( - &hidpp->vertical_wheel_counter, v); + input_report_rel(mydata->input, REL_WHEEL, v); input_sync(mydata->input); } @@ -2653,72 +2528,6 @@ static int g920_get_config(struct hidpp_device *hidpp) } /* -------------------------------------------------------------------------- */ -/* High-resolution scroll wheels */ -/* -------------------------------------------------------------------------- */ - -/** - * struct hi_res_scroll_info - Stores info on a device's high-res scroll wheel. - * @product_id: the HID product ID of the device being described. - * @microns_per_hi_res_unit: the distance moved by the user's finger for each - * high-resolution unit reported by the device, in - * 256ths of a millimetre. - */ -struct hi_res_scroll_info { - __u32 product_id; - int microns_per_hi_res_unit; -}; - -static struct hi_res_scroll_info hi_res_scroll_devices[] = { - { /* Anywhere MX */ - .product_id = 0x1017, .microns_per_hi_res_unit = 445 }, - { /* Performance MX */ - .product_id = 0x101a, .microns_per_hi_res_unit = 406 }, - { /* M560 */ - .product_id = 0x402d, .microns_per_hi_res_unit = 435 }, - { /* MX Master 2S */ - .product_id = 0x4069, .microns_per_hi_res_unit = 406 }, -}; - -static int hi_res_scroll_look_up_microns(__u32 product_id) -{ - int i; - int num_devices = sizeof(hi_res_scroll_devices) - / sizeof(hi_res_scroll_devices[0]); - for (i = 0; i < num_devices; i++) { - if (hi_res_scroll_devices[i].product_id == product_id) - return hi_res_scroll_devices[i].microns_per_hi_res_unit; - } - /* We don't have a value for this device, so use a sensible default. */ - return 406; -} - -static int hi_res_scroll_enable(struct hidpp_device *hidpp) -{ - int ret; - u8 multiplier = 8; - - if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL_X2121) { - ret = hidpp_hrw_set_wheel_mode(hidpp, false, true, false); - hidpp_hrw_get_wheel_capability(hidpp, &multiplier); - } else if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL_X2120) { - ret = hidpp_hrs_set_highres_scrolling_mode(hidpp, true, - &multiplier); - } else /* if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL_1P0) */ - ret = hidpp10_enable_scrolling_acceleration(hidpp); - - if (ret) - return ret; - - hidpp->vertical_wheel_counter.resolution_multiplier = multiplier; - hidpp->vertical_wheel_counter.microns_per_hi_res_unit = - hi_res_scroll_look_up_microns(hidpp->hid_dev->product); - hid_info(hidpp->hid_dev, "multiplier = %d, microns = %d\n", - multiplier, - hidpp->vertical_wheel_counter.microns_per_hi_res_unit); - return 0; -} - -/* -------------------------------------------------------------------------- */ /* Generic HID++ devices */ /* -------------------------------------------------------------------------- */ @@ -2763,11 +2572,6 @@ static void hidpp_populate_input(struct hidpp_device *hidpp, wtp_populate_input(hidpp, input, origin_is_hid_core); else if (hidpp->quirks & HIDPP_QUIRK_CLASS_M560) m560_populate_input(hidpp, input, origin_is_hid_core); - - if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL) { - input_set_capability(input, EV_REL, REL_WHEEL_HI_RES); - hidpp->vertical_wheel_counter.dev = input; - } } static int hidpp_input_configured(struct hid_device *hdev, @@ -2886,27 +2690,6 @@ static int hidpp_raw_event(struct hid_device *hdev, struct hid_report *report, return 0; } -static int hidpp_event(struct hid_device *hdev, struct hid_field *field, - struct hid_usage *usage, __s32 value) -{ - /* This function will only be called for scroll events, due to the - * restriction imposed in hidpp_usages. - */ - struct hidpp_device *hidpp = hid_get_drvdata(hdev); - struct hid_scroll_counter *counter = &hidpp->vertical_wheel_counter; - /* A scroll event may occur before the multiplier has been retrieved or - * the input device set, or high-res scroll enabling may fail. In such - * cases we must return early (falling back to default behaviour) to - * avoid a crash in hid_scroll_counter_handle_scroll. - */ - if (!(hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL) || value == 0 - || counter->dev == NULL || counter->resolution_multiplier == 0) - return 0; - - hid_scroll_counter_handle_scroll(counter, value); - return 1; -} - static int hidpp_initialize_battery(struct hidpp_device *hidpp) { static atomic_t battery_no = ATOMIC_INIT(0); @@ -3118,9 +2901,6 @@ static void hidpp_connect_event(struct hidpp_device *hidpp) if (hidpp->battery.ps) power_supply_changed(hidpp->battery.ps); - if (hidpp->quirks & HIDPP_QUIRK_HI_RES_SCROLL) - hi_res_scroll_enable(hidpp); - if (!(hidpp->quirks & HIDPP_QUIRK_NO_HIDINPUT) || hidpp->delayed_input) /* if the input nodes are already created, we can stop now */ return; @@ -3306,63 +3086,35 @@ static void hidpp_remove(struct hid_device *hdev) mutex_destroy(&hidpp->send_mutex); } -#define LDJ_DEVICE(product) \ - HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE, \ - USB_VENDOR_ID_LOGITECH, (product)) - static const struct hid_device_id hidpp_devices[] = { { /* wireless touchpad */ - LDJ_DEVICE(0x4011), + HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE, + USB_VENDOR_ID_LOGITECH, 0x4011), .driver_data = HIDPP_QUIRK_CLASS_WTP | HIDPP_QUIRK_DELAYED_INIT | HIDPP_QUIRK_WTP_PHYSICAL_BUTTONS }, { /* wireless touchpad T650 */ - LDJ_DEVICE(0x4101), + HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE, + USB_VENDOR_ID_LOGITECH, 0x4101), .driver_data = HIDPP_QUIRK_CLASS_WTP | HIDPP_QUIRK_DELAYED_INIT }, { /* wireless touchpad T651 */ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_T651), .driver_data = HIDPP_QUIRK_CLASS_WTP }, - { /* Mouse Logitech Anywhere MX */ - LDJ_DEVICE(0x1017), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_1P0 }, - { /* Mouse Logitech Cube */ - LDJ_DEVICE(0x4010), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2120 }, - { /* Mouse Logitech M335 */ - LDJ_DEVICE(0x4050), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { /* Mouse Logitech M515 */ - LDJ_DEVICE(0x4007), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2120 }, { /* Mouse logitech M560 */ - LDJ_DEVICE(0x402d), - .driver_data = HIDPP_QUIRK_DELAYED_INIT | HIDPP_QUIRK_CLASS_M560 - | HIDPP_QUIRK_HI_RES_SCROLL_X2120 }, - { /* Mouse Logitech M705 (firmware RQM17) */ - LDJ_DEVICE(0x101b), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_1P0 }, - { /* Mouse Logitech M705 (firmware RQM67) */ - LDJ_DEVICE(0x406d), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { /* Mouse Logitech M720 */ - LDJ_DEVICE(0x405e), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { /* Mouse Logitech MX Anywhere 2 */ - LDJ_DEVICE(0x404a), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { LDJ_DEVICE(0xb013), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { LDJ_DEVICE(0xb018), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { LDJ_DEVICE(0xb01f), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { /* Mouse Logitech MX Anywhere 2S */ - LDJ_DEVICE(0x406a), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { /* Mouse Logitech MX Master */ - LDJ_DEVICE(0x4041), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { LDJ_DEVICE(0x4060), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { LDJ_DEVICE(0x4071), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { /* Mouse Logitech MX Master 2S */ - LDJ_DEVICE(0x4069), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_X2121 }, - { /* Mouse Logitech Performance MX */ - LDJ_DEVICE(0x101a), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_1P0 }, + HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE, + USB_VENDOR_ID_LOGITECH, 0x402d), + .driver_data = HIDPP_QUIRK_DELAYED_INIT | HIDPP_QUIRK_CLASS_M560 }, { /* Keyboard logitech K400 */ - LDJ_DEVICE(0x4024), + HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE, + USB_VENDOR_ID_LOGITECH, 0x4024), .driver_data = HIDPP_QUIRK_CLASS_K400 }, { /* Solar Keyboard Logitech K750 */ - LDJ_DEVICE(0x4002), + HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE, + USB_VENDOR_ID_LOGITECH, 0x4002), .driver_data = HIDPP_QUIRK_CLASS_K750 }, - { LDJ_DEVICE(HID_ANY_ID) }, + { HID_DEVICE(BUS_USB, HID_GROUP_LOGITECH_DJ_DEVICE, + USB_VENDOR_ID_LOGITECH, HID_ANY_ID)}, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_G920_WHEEL), .driver_data = HIDPP_QUIRK_CLASS_G920 | HIDPP_QUIRK_FORCE_OUTPUT_REPORTS}, @@ -3371,19 +3123,12 @@ static const struct hid_device_id hidpp_devices[] = { MODULE_DEVICE_TABLE(hid, hidpp_devices); -static const struct hid_usage_id hidpp_usages[] = { - { HID_GD_WHEEL, EV_REL, REL_WHEEL }, - { HID_ANY_ID - 1, HID_ANY_ID - 1, HID_ANY_ID - 1} -}; - static struct hid_driver hidpp_driver = { .name = "logitech-hidpp-device", .id_table = hidpp_devices, .probe = hidpp_probe, .remove = hidpp_remove, .raw_event = hidpp_raw_event, - .usage_table = hidpp_usages, - .event = hidpp_event, .input_configured = hidpp_input_configured, .input_mapping = hidpp_input_mapping, .input_mapped = hidpp_input_mapped, diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index f7c6de2b6730..dca0a3a90fb8 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -1814,6 +1814,12 @@ static const struct hid_device_id mt_devices[] = { MT_USB_DEVICE(USB_VENDOR_ID_CHUNGHWAT, USB_DEVICE_ID_CHUNGHWAT_MULTITOUCH) }, + /* Cirque devices */ + { .driver_data = MT_CLS_WIN_8_DUAL, + HID_DEVICE(BUS_I2C, HID_GROUP_MULTITOUCH_WIN_8, + I2C_VENDOR_ID_CIRQUE, + I2C_PRODUCT_ID_CIRQUE_121F) }, + /* CJTouch panels */ { .driver_data = MT_CLS_NSMU, MT_USB_DEVICE(USB_VENDOR_ID_CJTOUCH, diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index 8237dd86fb17..c85a79986b6a 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -107,6 +107,7 @@ static const struct hid_device_id hid_quirks[] = { { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_MOUSE_C05A), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_MOUSE_C06A), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_MCS, USB_DEVICE_ID_MCS_GAMEPADBLOCK), HID_QUIRK_MULTI_INPUT }, + { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_PIXART_MOUSE), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_POWER_COVER), HID_QUIRK_NO_INIT_REPORTS }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_SURFACE_PRO_2), HID_QUIRK_NO_INIT_REPORTS }, { HID_USB_DEVICE(USB_VENDOR_ID_MICROSOFT, USB_DEVICE_ID_MS_TOUCH_COVER_2), HID_QUIRK_NO_INIT_REPORTS }, @@ -129,6 +130,8 @@ static const struct hid_device_id hid_quirks[] = { { HID_USB_DEVICE(USB_VENDOR_ID_PIXART, USB_DEVICE_ID_PIXART_OPTICAL_TOUCH_SCREEN), HID_QUIRK_NO_INIT_REPORTS }, { HID_USB_DEVICE(USB_VENDOR_ID_PIXART, USB_DEVICE_ID_PIXART_USB_OPTICAL_MOUSE), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_PRIMAX, USB_DEVICE_ID_PRIMAX_MOUSE_4D22), HID_QUIRK_ALWAYS_POLL }, + { HID_USB_DEVICE(USB_VENDOR_ID_PRIMAX, USB_DEVICE_ID_PRIMAX_PIXART_MOUSE_4D0F), HID_QUIRK_ALWAYS_POLL }, + { HID_USB_DEVICE(USB_VENDOR_ID_PRIMAX, USB_DEVICE_ID_PRIMAX_PIXART_MOUSE_4E22), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_PRODIGE, USB_DEVICE_ID_PRODIGE_CORDLESS), HID_QUIRK_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_QUANTA, USB_DEVICE_ID_QUANTA_OPTICAL_TOUCH_3001), HID_QUIRK_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_QUANTA, USB_DEVICE_ID_QUANTA_OPTICAL_TOUCH_3003), HID_QUIRK_NOGET }, diff --git a/drivers/hid/hid-steam.c b/drivers/hid/hid-steam.c index 0422ec2b13d2..dc4128bfe2ca 100644 --- a/drivers/hid/hid-steam.c +++ b/drivers/hid/hid-steam.c @@ -23,8 +23,9 @@ * In order to avoid breaking them this driver creates a layered hidraw device, * so it can detect when the client is running and then: * - it will not send any command to the controller. - * - this input device will be disabled, to avoid double input of the same + * - this input device will be removed, to avoid double input of the same * user action. + * When the client is closed, this input device will be created again. * * For additional functions, such as changing the right-pad margin or switching * the led, you can use the user-space tool at: @@ -113,7 +114,7 @@ struct steam_device { spinlock_t lock; struct hid_device *hdev, *client_hdev; struct mutex mutex; - bool client_opened, input_opened; + bool client_opened; struct input_dev __rcu *input; unsigned long quirks; struct work_struct work_connect; @@ -279,18 +280,6 @@ static void steam_set_lizard_mode(struct steam_device *steam, bool enable) } } -static void steam_update_lizard_mode(struct steam_device *steam) -{ - mutex_lock(&steam->mutex); - if (!steam->client_opened) { - if (steam->input_opened) - steam_set_lizard_mode(steam, false); - else - steam_set_lizard_mode(steam, lizard_mode); - } - mutex_unlock(&steam->mutex); -} - static int steam_input_open(struct input_dev *dev) { struct steam_device *steam = input_get_drvdata(dev); @@ -301,7 +290,6 @@ static int steam_input_open(struct input_dev *dev) return ret; mutex_lock(&steam->mutex); - steam->input_opened = true; if (!steam->client_opened && lizard_mode) steam_set_lizard_mode(steam, false); mutex_unlock(&steam->mutex); @@ -313,7 +301,6 @@ static void steam_input_close(struct input_dev *dev) struct steam_device *steam = input_get_drvdata(dev); mutex_lock(&steam->mutex); - steam->input_opened = false; if (!steam->client_opened && lizard_mode) steam_set_lizard_mode(steam, true); mutex_unlock(&steam->mutex); @@ -400,7 +387,7 @@ static int steam_battery_register(struct steam_device *steam) return 0; } -static int steam_register(struct steam_device *steam) +static int steam_input_register(struct steam_device *steam) { struct hid_device *hdev = steam->hdev; struct input_dev *input; @@ -414,17 +401,6 @@ static int steam_register(struct steam_device *steam) return 0; } - /* - * Unlikely, but getting the serial could fail, and it is not so - * important, so make up a serial number and go on. - */ - if (steam_get_serial(steam) < 0) - strlcpy(steam->serial_no, "XXXXXXXXXX", - sizeof(steam->serial_no)); - - hid_info(hdev, "Steam Controller '%s' connected", - steam->serial_no); - input = input_allocate_device(); if (!input) return -ENOMEM; @@ -492,11 +468,6 @@ static int steam_register(struct steam_device *steam) goto input_register_fail; rcu_assign_pointer(steam->input, input); - - /* ignore battery errors, we can live without it */ - if (steam->quirks & STEAM_QUIRK_WIRELESS) - steam_battery_register(steam); - return 0; input_register_fail: @@ -504,27 +475,88 @@ input_register_fail: return ret; } -static void steam_unregister(struct steam_device *steam) +static void steam_input_unregister(struct steam_device *steam) { struct input_dev *input; + rcu_read_lock(); + input = rcu_dereference(steam->input); + rcu_read_unlock(); + if (!input) + return; + RCU_INIT_POINTER(steam->input, NULL); + synchronize_rcu(); + input_unregister_device(input); +} + +static void steam_battery_unregister(struct steam_device *steam) +{ struct power_supply *battery; rcu_read_lock(); - input = rcu_dereference(steam->input); battery = rcu_dereference(steam->battery); rcu_read_unlock(); - if (battery) { - RCU_INIT_POINTER(steam->battery, NULL); - synchronize_rcu(); - power_supply_unregister(battery); + if (!battery) + return; + RCU_INIT_POINTER(steam->battery, NULL); + synchronize_rcu(); + power_supply_unregister(battery); +} + +static int steam_register(struct steam_device *steam) +{ + int ret; + + /* + * This function can be called several times in a row with the + * wireless adaptor, without steam_unregister() between them, because + * another client send a get_connection_status command, for example. + * The battery and serial number are set just once per device. + */ + if (!steam->serial_no[0]) { + /* + * Unlikely, but getting the serial could fail, and it is not so + * important, so make up a serial number and go on. + */ + if (steam_get_serial(steam) < 0) + strlcpy(steam->serial_no, "XXXXXXXXXX", + sizeof(steam->serial_no)); + + hid_info(steam->hdev, "Steam Controller '%s' connected", + steam->serial_no); + + /* ignore battery errors, we can live without it */ + if (steam->quirks & STEAM_QUIRK_WIRELESS) + steam_battery_register(steam); + + mutex_lock(&steam_devices_lock); + list_add(&steam->list, &steam_devices); + mutex_unlock(&steam_devices_lock); } - if (input) { - RCU_INIT_POINTER(steam->input, NULL); - synchronize_rcu(); + + mutex_lock(&steam->mutex); + if (!steam->client_opened) { + steam_set_lizard_mode(steam, lizard_mode); + ret = steam_input_register(steam); + } else { + ret = 0; + } + mutex_unlock(&steam->mutex); + + return ret; +} + +static void steam_unregister(struct steam_device *steam) +{ + steam_battery_unregister(steam); + steam_input_unregister(steam); + if (steam->serial_no[0]) { hid_info(steam->hdev, "Steam Controller '%s' disconnected", steam->serial_no); - input_unregister_device(input); + mutex_lock(&steam_devices_lock); + list_del(&steam->list); + mutex_unlock(&steam_devices_lock); + steam->serial_no[0] = 0; } } @@ -600,6 +632,9 @@ static int steam_client_ll_open(struct hid_device *hdev) mutex_lock(&steam->mutex); steam->client_opened = true; mutex_unlock(&steam->mutex); + + steam_input_unregister(steam); + return ret; } @@ -609,13 +644,13 @@ static void steam_client_ll_close(struct hid_device *hdev) mutex_lock(&steam->mutex); steam->client_opened = false; - if (steam->input_opened) - steam_set_lizard_mode(steam, false); - else - steam_set_lizard_mode(steam, lizard_mode); mutex_unlock(&steam->mutex); hid_hw_close(steam->hdev); + if (steam->connected) { + steam_set_lizard_mode(steam, lizard_mode); + steam_input_register(steam); + } } static int steam_client_ll_raw_request(struct hid_device *hdev, @@ -744,11 +779,6 @@ static int steam_probe(struct hid_device *hdev, } } - mutex_lock(&steam_devices_lock); - steam_update_lizard_mode(steam); - list_add(&steam->list, &steam_devices); - mutex_unlock(&steam_devices_lock); - return 0; hid_hw_open_fail: @@ -774,10 +804,6 @@ static void steam_remove(struct hid_device *hdev) return; } - mutex_lock(&steam_devices_lock); - list_del(&steam->list); - mutex_unlock(&steam_devices_lock); - hid_destroy_device(steam->client_hdev); steam->client_opened = false; cancel_work_sync(&steam->work_connect); @@ -792,12 +818,14 @@ static void steam_remove(struct hid_device *hdev) static void steam_do_connect_event(struct steam_device *steam, bool connected) { unsigned long flags; + bool changed; spin_lock_irqsave(&steam->lock, flags); + changed = steam->connected != connected; steam->connected = connected; spin_unlock_irqrestore(&steam->lock, flags); - if (schedule_work(&steam->work_connect) == 0) + if (changed && schedule_work(&steam->work_connect) == 0) dbg_hid("%s: connected=%d event already queued\n", __func__, connected); } @@ -1019,13 +1047,8 @@ static int steam_raw_event(struct hid_device *hdev, return 0; rcu_read_lock(); input = rcu_dereference(steam->input); - if (likely(input)) { + if (likely(input)) steam_do_input_event(steam, input, data); - } else { - dbg_hid("%s: input data without connect event\n", - __func__); - steam_do_connect_event(steam, true); - } rcu_read_unlock(); break; case STEAM_EV_CONNECT: @@ -1074,7 +1097,10 @@ static int steam_param_set_lizard_mode(const char *val, mutex_lock(&steam_devices_lock); list_for_each_entry(steam, &steam_devices, list) { - steam_update_lizard_mode(steam); + mutex_lock(&steam->mutex); + if (!steam->client_opened) + steam_set_lizard_mode(steam, lizard_mode); + mutex_unlock(&steam->mutex); } mutex_unlock(&steam_devices_lock); return 0; diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c index 3cde7c1b9c33..8555ce7e737b 100644 --- a/drivers/hid/i2c-hid/i2c-hid-core.c +++ b/drivers/hid/i2c-hid/i2c-hid-core.c @@ -177,6 +177,8 @@ static const struct i2c_hid_quirks { I2C_HID_QUIRK_NO_RUNTIME_PM }, { I2C_VENDOR_ID_RAYDIUM, I2C_PRODUCT_ID_RAYDIUM_4B33, I2C_HID_QUIRK_DELAY_AFTER_SLEEP }, + { USB_VENDOR_ID_LG, I2C_DEVICE_ID_LG_8001, + I2C_HID_QUIRK_NO_RUNTIME_PM }, { 0, 0 } }; diff --git a/drivers/hid/uhid.c b/drivers/hid/uhid.c index 3c5507313606..840634e0f1e3 100644 --- a/drivers/hid/uhid.c +++ b/drivers/hid/uhid.c @@ -12,6 +12,7 @@ #include <linux/atomic.h> #include <linux/compat.h> +#include <linux/cred.h> #include <linux/device.h> #include <linux/fs.h> #include <linux/hid.h> @@ -496,12 +497,13 @@ static int uhid_dev_create2(struct uhid_device *uhid, goto err_free; } - len = min(sizeof(hid->name), sizeof(ev->u.create2.name)); - strlcpy(hid->name, ev->u.create2.name, len); - len = min(sizeof(hid->phys), sizeof(ev->u.create2.phys)); - strlcpy(hid->phys, ev->u.create2.phys, len); - len = min(sizeof(hid->uniq), sizeof(ev->u.create2.uniq)); - strlcpy(hid->uniq, ev->u.create2.uniq, len); + /* @hid is zero-initialized, strncpy() is correct, strlcpy() not */ + len = min(sizeof(hid->name), sizeof(ev->u.create2.name)) - 1; + strncpy(hid->name, ev->u.create2.name, len); + len = min(sizeof(hid->phys), sizeof(ev->u.create2.phys)) - 1; + strncpy(hid->phys, ev->u.create2.phys, len); + len = min(sizeof(hid->uniq), sizeof(ev->u.create2.uniq)) - 1; + strncpy(hid->uniq, ev->u.create2.uniq, len); hid->ll_driver = &uhid_hid_driver; hid->bus = ev->u.create2.bus; @@ -722,6 +724,17 @@ static ssize_t uhid_char_write(struct file *file, const char __user *buffer, switch (uhid->input_buf.type) { case UHID_CREATE: + /* + * 'struct uhid_create_req' contains a __user pointer which is + * copied from, so it's unsafe to allow this with elevated + * privileges (e.g. from a setuid binary) or via kernel_write(). + */ + if (file->f_cred != current_cred() || uaccess_kernel()) { + pr_err_once("UHID_CREATE from different security context by process %d (%s), this is not allowed.\n", + task_tgid_vnr(current), current->comm); + ret = -EACCES; + goto unlock; + } ret = uhid_dev_create(uhid, &uhid->input_buf); break; case UHID_CREATE2: diff --git a/drivers/hwmon/ina2xx.c b/drivers/hwmon/ina2xx.c index 71d3445ba869..07ee19573b3f 100644 --- a/drivers/hwmon/ina2xx.c +++ b/drivers/hwmon/ina2xx.c @@ -274,7 +274,7 @@ static int ina2xx_get_value(struct ina2xx_data *data, u8 reg, break; case INA2XX_CURRENT: /* signed register, result in mA */ - val = regval * data->current_lsb_uA; + val = (s16)regval * data->current_lsb_uA; val = DIV_ROUND_CLOSEST(val, 1000); break; case INA2XX_CALIBRATION: @@ -491,7 +491,7 @@ static int ina2xx_probe(struct i2c_client *client, } data->groups[group++] = &ina2xx_group; - if (id->driver_data == ina226) + if (chip == ina226) data->groups[group++] = &ina226_group; hwmon_dev = devm_hwmon_device_register_with_groups(dev, client->name, @@ -500,7 +500,7 @@ static int ina2xx_probe(struct i2c_client *client, return PTR_ERR(hwmon_dev); dev_info(dev, "power monitor %s (Rshunt = %li uOhm)\n", - id->name, data->rshunt); + client->name, data->rshunt); return 0; } diff --git a/drivers/hwmon/mlxreg-fan.c b/drivers/hwmon/mlxreg-fan.c index de46577c7d5a..d8fa4bea4bc8 100644 --- a/drivers/hwmon/mlxreg-fan.c +++ b/drivers/hwmon/mlxreg-fan.c @@ -51,7 +51,7 @@ */ #define MLXREG_FAN_GET_RPM(rval, d, s) (DIV_ROUND_CLOSEST(15000000 * 100, \ ((rval) + (s)) * (d))) -#define MLXREG_FAN_GET_FAULT(val, mask) (!!((val) ^ (mask))) +#define MLXREG_FAN_GET_FAULT(val, mask) (!((val) ^ (mask))) #define MLXREG_FAN_PWM_DUTY2STATE(duty) (DIV_ROUND_CLOSEST((duty) * \ MLXREG_FAN_MAX_STATE, \ MLXREG_FAN_MAX_DUTY)) diff --git a/drivers/hwmon/raspberrypi-hwmon.c b/drivers/hwmon/raspberrypi-hwmon.c index be5ba4690895..0d0457245e7d 100644 --- a/drivers/hwmon/raspberrypi-hwmon.c +++ b/drivers/hwmon/raspberrypi-hwmon.c @@ -115,7 +115,6 @@ static int rpi_hwmon_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; struct rpi_hwmon_data *data; - int ret; data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); if (!data) @@ -124,11 +123,6 @@ static int rpi_hwmon_probe(struct platform_device *pdev) /* Parent driver assure that firmware is correct */ data->fw = dev_get_drvdata(dev->parent); - /* Init throttled */ - ret = rpi_firmware_property(data->fw, RPI_FIRMWARE_GET_THROTTLED, - &data->last_throttled, - sizeof(data->last_throttled)); - data->hwmon_dev = devm_hwmon_device_register_with_info(dev, "rpi_volt", data, &rpi_chip_info, diff --git a/drivers/hwmon/w83795.c b/drivers/hwmon/w83795.c index 49276bbdac3d..1bb80f992aa8 100644 --- a/drivers/hwmon/w83795.c +++ b/drivers/hwmon/w83795.c @@ -1691,7 +1691,7 @@ store_sf_setup(struct device *dev, struct device_attribute *attr, * somewhere else in the code */ #define SENSOR_ATTR_TEMP(index) { \ - SENSOR_ATTR_2(temp##index##_type, S_IRUGO | (index < 4 ? S_IWUSR : 0), \ + SENSOR_ATTR_2(temp##index##_type, S_IRUGO | (index < 5 ? S_IWUSR : 0), \ show_temp_mode, store_temp_mode, NOT_USED, index - 1), \ SENSOR_ATTR_2(temp##index##_input, S_IRUGO, show_temp, \ NULL, TEMP_READ, index - 1), \ diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 55af04fa03a7..6c8dcb65ff03 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -1441,6 +1441,9 @@ static void nic_remove(struct pci_dev *pdev) { struct nicpf *nic = pci_get_drvdata(pdev); + if (!nic) + return; + if (nic->flags & NIC_SRIOV_ENABLED) pci_disable_sriov(pdev); diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c index be268dcde8fa..f9a4e76c5a8b 100644 --- a/drivers/net/ethernet/hisilicon/hip04_eth.c +++ b/drivers/net/ethernet/hisilicon/hip04_eth.c @@ -915,10 +915,8 @@ static int hip04_mac_probe(struct platform_device *pdev) } ret = register_netdev(ndev); - if (ret) { - free_netdev(ndev); + if (ret) goto alloc_fail; - } return 0; diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 47f0fdadbac9..6d5b13f69dec 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -1417,7 +1417,7 @@ void __i40e_del_filter(struct i40e_vsi *vsi, struct i40e_mac_filter *f) } vsi->flags |= I40E_VSI_FLAG_FILTER_CHANGED; - set_bit(__I40E_MACVLAN_SYNC_PENDING, vsi->state); + set_bit(__I40E_MACVLAN_SYNC_PENDING, vsi->back->state); } /** diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index add1e457886d..433c8e688c78 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -33,7 +33,7 @@ static int i40e_alloc_xsk_umems(struct i40e_vsi *vsi) } /** - * i40e_add_xsk_umem - Store an UMEM for a certain ring/qid + * i40e_add_xsk_umem - Store a UMEM for a certain ring/qid * @vsi: Current VSI * @umem: UMEM to store * @qid: Ring/qid to associate with the UMEM @@ -56,7 +56,7 @@ static int i40e_add_xsk_umem(struct i40e_vsi *vsi, struct xdp_umem *umem, } /** - * i40e_remove_xsk_umem - Remove an UMEM for a certain ring/qid + * i40e_remove_xsk_umem - Remove a UMEM for a certain ring/qid * @vsi: Current VSI * @qid: Ring/qid associated with the UMEM **/ @@ -130,7 +130,7 @@ static void i40e_xsk_umem_dma_unmap(struct i40e_vsi *vsi, struct xdp_umem *umem) } /** - * i40e_xsk_umem_enable - Enable/associate an UMEM to a certain ring/qid + * i40e_xsk_umem_enable - Enable/associate a UMEM to a certain ring/qid * @vsi: Current VSI * @umem: UMEM * @qid: Rx ring to associate UMEM to @@ -189,7 +189,7 @@ static int i40e_xsk_umem_enable(struct i40e_vsi *vsi, struct xdp_umem *umem, } /** - * i40e_xsk_umem_disable - Diassociate an UMEM from a certain ring/qid + * i40e_xsk_umem_disable - Disassociate a UMEM from a certain ring/qid * @vsi: Current VSI * @qid: Rx ring to associate UMEM to * @@ -255,12 +255,12 @@ int i40e_xsk_umem_query(struct i40e_vsi *vsi, struct xdp_umem **umem, } /** - * i40e_xsk_umem_query - Queries a certain ring/qid for its UMEM + * i40e_xsk_umem_setup - Enable/disassociate a UMEM to/from a ring/qid * @vsi: Current VSI * @umem: UMEM to enable/associate to a ring, or NULL to disable * @qid: Rx ring to (dis)associate UMEM (from)to * - * This function enables or disables an UMEM to a certain ring. + * This function enables or disables a UMEM to a certain ring. * * Returns 0 on success, <0 on failure **/ @@ -276,7 +276,7 @@ int i40e_xsk_umem_setup(struct i40e_vsi *vsi, struct xdp_umem *umem, * @rx_ring: Rx ring * @xdp: xdp_buff used as input to the XDP program * - * This function enables or disables an UMEM to a certain ring. + * This function enables or disables a UMEM to a certain ring. * * Returns any of I40E_XDP_{PASS, CONSUMED, TX, REDIR} **/ diff --git a/drivers/net/ethernet/intel/igb/e1000_i210.c b/drivers/net/ethernet/intel/igb/e1000_i210.c index c54ebedca6da..c393cb2c0f16 100644 --- a/drivers/net/ethernet/intel/igb/e1000_i210.c +++ b/drivers/net/ethernet/intel/igb/e1000_i210.c @@ -842,6 +842,7 @@ s32 igb_pll_workaround_i210(struct e1000_hw *hw) nvm_word = E1000_INVM_DEFAULT_AL; tmp_nvm = nvm_word | E1000_INVM_PLL_WO_VAL; igb_write_phy_reg_82580(hw, I347AT4_PAGE_SELECT, E1000_PHY_PLL_FREQ_PAGE); + phy_word = E1000_PHY_PLL_UNCONF; for (i = 0; i < E1000_MAX_PLL_TRIES; i++) { /* check current state directly from internal PHY */ igb_read_phy_reg_82580(hw, E1000_PHY_PLL_FREQ_REG, &phy_word); diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c index 10dbaf4f6e80..9c42f741ed5e 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c @@ -2262,7 +2262,9 @@ static s32 ixgbe_get_link_capabilities_X550em(struct ixgbe_hw *hw, *autoneg = false; if (hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core0 || - hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core1) { + hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core1 || + hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core0 || + hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core1) { *speed = IXGBE_LINK_SPEED_1GB_FULL; return 0; } diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c index 867cddba840f..e8ca98c070f6 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.c +++ b/drivers/net/ethernet/microchip/lan743x_main.c @@ -1672,7 +1672,7 @@ static int lan743x_tx_napi_poll(struct napi_struct *napi, int weight) netif_wake_queue(adapter->netdev); } - if (!napi_complete_done(napi, weight)) + if (!napi_complete(napi)) goto done; /* enable isr */ @@ -1681,7 +1681,7 @@ static int lan743x_tx_napi_poll(struct napi_struct *napi, int weight) lan743x_csr_read(adapter, INT_STS); done: - return weight; + return 0; } static void lan743x_tx_ring_cleanup(struct lan743x_tx *tx) @@ -1870,9 +1870,9 @@ static int lan743x_tx_open(struct lan743x_tx *tx) tx->vector_flags = lan743x_intr_get_vector_flags(adapter, INT_BIT_DMA_TX_ (tx->channel_number)); - netif_napi_add(adapter->netdev, - &tx->napi, lan743x_tx_napi_poll, - tx->ring_size - 1); + netif_tx_napi_add(adapter->netdev, + &tx->napi, lan743x_tx_napi_poll, + tx->ring_size - 1); napi_enable(&tx->napi); data = 0; @@ -3017,6 +3017,7 @@ static const struct dev_pm_ops lan743x_pm_ops = { static const struct pci_device_id lan743x_pcidev_tbl[] = { { PCI_DEVICE(PCI_VENDOR_ID_SMSC, PCI_DEVICE_ID_SMSC_LAN7430) }, + { PCI_DEVICE(PCI_VENDOR_ID_SMSC, PCI_DEVICE_ID_SMSC_LAN7431) }, { 0, } }; diff --git a/drivers/net/ethernet/microchip/lan743x_main.h b/drivers/net/ethernet/microchip/lan743x_main.h index 0e82b6368798..2d6eea18973e 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.h +++ b/drivers/net/ethernet/microchip/lan743x_main.h @@ -548,6 +548,7 @@ struct lan743x_adapter; /* SMSC acquired EFAR late 1990's, MCHP acquired SMSC 2012 */ #define PCI_VENDOR_ID_SMSC PCI_VENDOR_ID_EFAR #define PCI_DEVICE_ID_SMSC_LAN7430 (0x7430) +#define PCI_DEVICE_ID_SMSC_LAN7431 (0x7431) #define PCI_CONFIG_LENGTH (0x1000) diff --git a/drivers/net/ethernet/qlogic/qed/qed_debug.c b/drivers/net/ethernet/qlogic/qed/qed_debug.c index 78a638ec7c0a..979f1e4bc18b 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_debug.c +++ b/drivers/net/ethernet/qlogic/qed/qed_debug.c @@ -6071,7 +6071,7 @@ static const char * const s_igu_fifo_error_strs[] = { "no error", "length error", "function disabled", - "VF sent command to attnetion address", + "VF sent command to attention address", "host sent prod update command", "read of during interrupt register while in MIMD mode", "access to PXP BAR reserved address", diff --git a/drivers/net/ethernet/via/via-velocity.c b/drivers/net/ethernet/via/via-velocity.c index ef9538ee53d0..82412691ee66 100644 --- a/drivers/net/ethernet/via/via-velocity.c +++ b/drivers/net/ethernet/via/via-velocity.c @@ -3605,7 +3605,7 @@ static const char velocity_gstrings[][ETH_GSTRING_LEN] = { "tx_jumbo", "rx_mac_control_frames", "tx_mac_control_frames", - "rx_frame_alignement_errors", + "rx_frame_alignment_errors", "rx_long_ok", "rx_long_err", "tx_sqe_errors", diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index e06613f2d431..0904002b19a2 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -2255,6 +2255,14 @@ int phy_driver_register(struct phy_driver *new_driver, struct module *owner) new_driver->mdiodrv.driver.remove = phy_remove; new_driver->mdiodrv.driver.owner = owner; + /* The following works around an issue where the PHY driver doesn't bind + * to the device, resulting in the genphy driver being used instead of + * the dedicated driver. The root cause of the issue isn't known yet + * and seems to be in the base driver core. Once this is fixed we may + * remove this workaround. + */ + new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS; + retval = driver_register(&new_driver->mdiodrv.driver); if (retval) { pr_err("%s: Error %d in registering driver\n", diff --git a/drivers/net/rionet.c b/drivers/net/rionet.c index e9f101c9bae2..bfbb39f93554 100644 --- a/drivers/net/rionet.c +++ b/drivers/net/rionet.c @@ -216,9 +216,9 @@ static int rionet_start_xmit(struct sk_buff *skb, struct net_device *ndev) * it just report sending a packet to the target * (without actual packet transfer). */ - dev_kfree_skb_any(skb); ndev->stats.tx_packets++; ndev->stats.tx_bytes += skb->len; + dev_kfree_skb_any(skb); } } diff --git a/drivers/net/usb/ipheth.c b/drivers/net/usb/ipheth.c index 7275761a1177..3d8a70d3ea9b 100644 --- a/drivers/net/usb/ipheth.c +++ b/drivers/net/usb/ipheth.c @@ -140,7 +140,6 @@ struct ipheth_device { struct usb_device *udev; struct usb_interface *intf; struct net_device *net; - struct sk_buff *tx_skb; struct urb *tx_urb; struct urb *rx_urb; unsigned char *tx_buf; @@ -230,6 +229,7 @@ static void ipheth_rcvbulk_callback(struct urb *urb) case -ENOENT: case -ECONNRESET: case -ESHUTDOWN: + case -EPROTO: return; case 0: break; @@ -281,7 +281,6 @@ static void ipheth_sndbulk_callback(struct urb *urb) dev_err(&dev->intf->dev, "%s: urb status: %d\n", __func__, status); - dev_kfree_skb_irq(dev->tx_skb); if (status == 0) netif_wake_queue(dev->net); else @@ -423,7 +422,7 @@ static int ipheth_tx(struct sk_buff *skb, struct net_device *net) if (skb->len > IPHETH_BUF_SIZE) { WARN(1, "%s: skb too large: %d bytes\n", __func__, skb->len); dev->net->stats.tx_dropped++; - dev_kfree_skb_irq(skb); + dev_kfree_skb_any(skb); return NETDEV_TX_OK; } @@ -443,12 +442,11 @@ static int ipheth_tx(struct sk_buff *skb, struct net_device *net) dev_err(&dev->intf->dev, "%s: usb_submit_urb: %d\n", __func__, retval); dev->net->stats.tx_errors++; - dev_kfree_skb_irq(skb); + dev_kfree_skb_any(skb); } else { - dev->tx_skb = skb; - dev->net->stats.tx_packets++; dev->net->stats.tx_bytes += skb->len; + dev_consume_skb_any(skb); netif_stop_queue(net); } diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index ae972461860e..e63e03143ca7 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -4513,8 +4513,8 @@ static int qeth_snmp_command_cb(struct qeth_card *card, { struct qeth_ipa_cmd *cmd; struct qeth_arp_query_info *qinfo; - struct qeth_snmp_cmd *snmp; unsigned char *data; + void *snmp_data; __u16 data_len; QETH_CARD_TEXT(card, 3, "snpcmdcb"); @@ -4522,7 +4522,6 @@ static int qeth_snmp_command_cb(struct qeth_card *card, cmd = (struct qeth_ipa_cmd *) sdata; data = (unsigned char *)((char *)cmd - reply->offset); qinfo = (struct qeth_arp_query_info *) reply->param; - snmp = &cmd->data.setadapterparms.data.snmp; if (cmd->hdr.return_code) { QETH_CARD_TEXT_(card, 4, "scer1%x", cmd->hdr.return_code); @@ -4535,10 +4534,15 @@ static int qeth_snmp_command_cb(struct qeth_card *card, return 0; } data_len = *((__u16 *)QETH_IPA_PDU_LEN_PDU1(data)); - if (cmd->data.setadapterparms.hdr.seq_no == 1) - data_len -= (__u16)((char *)&snmp->data - (char *)cmd); - else - data_len -= (__u16)((char *)&snmp->request - (char *)cmd); + if (cmd->data.setadapterparms.hdr.seq_no == 1) { + snmp_data = &cmd->data.setadapterparms.data.snmp; + data_len -= offsetof(struct qeth_ipa_cmd, + data.setadapterparms.data.snmp); + } else { + snmp_data = &cmd->data.setadapterparms.data.snmp.request; + data_len -= offsetof(struct qeth_ipa_cmd, + data.setadapterparms.data.snmp.request); + } /* check if there is enough room in userspace */ if ((qinfo->udata_len - qinfo->udata_offset) < data_len) { @@ -4551,16 +4555,9 @@ static int qeth_snmp_command_cb(struct qeth_card *card, QETH_CARD_TEXT_(card, 4, "sseqn%i", cmd->data.setadapterparms.hdr.seq_no); /*copy entries to user buffer*/ - if (cmd->data.setadapterparms.hdr.seq_no == 1) { - memcpy(qinfo->udata + qinfo->udata_offset, - (char *)snmp, - data_len + offsetof(struct qeth_snmp_cmd, data)); - qinfo->udata_offset += offsetof(struct qeth_snmp_cmd, data); - } else { - memcpy(qinfo->udata + qinfo->udata_offset, - (char *)&snmp->request, data_len); - } + memcpy(qinfo->udata + qinfo->udata_offset, snmp_data, data_len); qinfo->udata_offset += data_len; + /* check if all replies received ... */ QETH_CARD_TEXT_(card, 4, "srtot%i", cmd->data.setadapterparms.hdr.used_total); diff --git a/drivers/spi/spi-mt65xx.c b/drivers/spi/spi-mt65xx.c index 3dc31627c655..0c2867deb36f 100644 --- a/drivers/spi/spi-mt65xx.c +++ b/drivers/spi/spi-mt65xx.c @@ -522,11 +522,11 @@ static irqreturn_t mtk_spi_interrupt(int irq, void *dev_id) mdata->xfer_len = min(MTK_SPI_MAX_FIFO_SIZE, len); mtk_spi_setup_packet(master); - cnt = len / 4; + cnt = mdata->xfer_len / 4; iowrite32_rep(mdata->base + SPI_TX_DATA_REG, trans->tx_buf + mdata->num_xfered, cnt); - remainder = len % 4; + remainder = mdata->xfer_len % 4; if (remainder > 0) { reg_val = 0; memcpy(®_val, diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c index f024c3fc3679..2fd8881fcd65 100644 --- a/drivers/spi/spi-omap2-mcspi.c +++ b/drivers/spi/spi-omap2-mcspi.c @@ -1540,13 +1540,26 @@ static int omap2_mcspi_remove(struct platform_device *pdev) /* work with hotplug and coldplug */ MODULE_ALIAS("platform:omap2_mcspi"); -#ifdef CONFIG_SUSPEND -static int omap2_mcspi_suspend_noirq(struct device *dev) +static int __maybe_unused omap2_mcspi_suspend(struct device *dev) { - return pinctrl_pm_select_sleep_state(dev); + struct spi_master *master = dev_get_drvdata(dev); + struct omap2_mcspi *mcspi = spi_master_get_devdata(master); + int error; + + error = pinctrl_pm_select_sleep_state(dev); + if (error) + dev_warn(mcspi->dev, "%s: failed to set pins: %i\n", + __func__, error); + + error = spi_master_suspend(master); + if (error) + dev_warn(mcspi->dev, "%s: master suspend failed: %i\n", + __func__, error); + + return pm_runtime_force_suspend(dev); } -static int omap2_mcspi_resume_noirq(struct device *dev) +static int __maybe_unused omap2_mcspi_resume(struct device *dev) { struct spi_master *master = dev_get_drvdata(dev); struct omap2_mcspi *mcspi = spi_master_get_devdata(master); @@ -1557,17 +1570,17 @@ static int omap2_mcspi_resume_noirq(struct device *dev) dev_warn(mcspi->dev, "%s: failed to set pins: %i\n", __func__, error); - return 0; -} + error = spi_master_resume(master); + if (error) + dev_warn(mcspi->dev, "%s: master resume failed: %i\n", + __func__, error); -#else -#define omap2_mcspi_suspend_noirq NULL -#define omap2_mcspi_resume_noirq NULL -#endif + return pm_runtime_force_resume(dev); +} static const struct dev_pm_ops omap2_mcspi_pm_ops = { - .suspend_noirq = omap2_mcspi_suspend_noirq, - .resume_noirq = omap2_mcspi_resume_noirq, + SET_SYSTEM_SLEEP_PM_OPS(omap2_mcspi_suspend, + omap2_mcspi_resume) .runtime_resume = omap_mcspi_runtime_resume, }; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 3f0b6d1936e8..6d776717d8b3 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -477,9 +477,9 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info, int mirror_num = 0; int failed_mirror = 0; - clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags); io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree; while (1) { + clear_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags); ret = read_extent_buffer_pages(io_tree, eb, WAIT_COMPLETE, mirror_num); if (!ret) { @@ -493,15 +493,6 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info, break; } - /* - * This buffer's crc is fine, but its contents are corrupted, so - * there is no reason to read the other copies, they won't be - * any less wrong. - */ - if (test_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags) || - ret == -EUCLEAN) - break; - num_copies = btrfs_num_copies(fs_info, eb->start, eb->len); if (num_copies == 1) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index a3c22e16509b..58e93bce3036 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2089,6 +2089,30 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) atomic_inc(&root->log_batch); /* + * Before we acquired the inode's lock, someone may have dirtied more + * pages in the target range. We need to make sure that writeback for + * any such pages does not start while we are logging the inode, because + * if it does, any of the following might happen when we are not doing a + * full inode sync: + * + * 1) We log an extent after its writeback finishes but before its + * checksums are added to the csum tree, leading to -EIO errors + * when attempting to read the extent after a log replay. + * + * 2) We can end up logging an extent before its writeback finishes. + * Therefore after the log replay we will have a file extent item + * pointing to an unwritten extent (and no data checksums as well). + * + * So trigger writeback for any eventual new dirty pages and then we + * wait for all ordered extents to complete below. + */ + ret = start_ordered_ops(inode, start, end); + if (ret) { + inode_unlock(inode); + goto out; + } + + /* * We have to do this here to avoid the priority inversion of waiting on * IO of a lower priority task while holding a transaciton open. */ diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 45868fd76209..f70825af6438 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -2659,7 +2659,7 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid, int i; u64 *i_qgroups; struct btrfs_fs_info *fs_info = trans->fs_info; - struct btrfs_root *quota_root = fs_info->quota_root; + struct btrfs_root *quota_root; struct btrfs_qgroup *srcgroup; struct btrfs_qgroup *dstgroup; u32 level_size = 0; @@ -2669,6 +2669,7 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid, if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags)) goto out; + quota_root = fs_info->quota_root; if (!quota_root) { ret = -EINVAL; goto out; diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 924116f654a1..a3f75b8926d4 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3959,6 +3959,7 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc) restart: if (update_backref_cache(trans, &rc->backref_cache)) { btrfs_end_transaction(trans); + trans = NULL; continue; } diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 094cc1444a90..5be83b5a1b43 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -3340,7 +3340,8 @@ static void free_pending_move(struct send_ctx *sctx, struct pending_dir_move *m) kfree(m); } -static void tail_append_pending_moves(struct pending_dir_move *moves, +static void tail_append_pending_moves(struct send_ctx *sctx, + struct pending_dir_move *moves, struct list_head *stack) { if (list_empty(&moves->list)) { @@ -3351,6 +3352,10 @@ static void tail_append_pending_moves(struct pending_dir_move *moves, list_add_tail(&moves->list, stack); list_splice_tail(&list, stack); } + if (!RB_EMPTY_NODE(&moves->node)) { + rb_erase(&moves->node, &sctx->pending_dir_moves); + RB_CLEAR_NODE(&moves->node); + } } static int apply_children_dir_moves(struct send_ctx *sctx) @@ -3365,7 +3370,7 @@ static int apply_children_dir_moves(struct send_ctx *sctx) return 0; INIT_LIST_HEAD(&stack); - tail_append_pending_moves(pm, &stack); + tail_append_pending_moves(sctx, pm, &stack); while (!list_empty(&stack)) { pm = list_first_entry(&stack, struct pending_dir_move, list); @@ -3376,7 +3381,7 @@ static int apply_children_dir_moves(struct send_ctx *sctx) goto out; pm = get_pending_dir_moves(sctx, parent_ino); if (pm) - tail_append_pending_moves(pm, &stack); + tail_append_pending_moves(sctx, pm, &stack); } return 0; diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index cbc9d0d2c12d..645fc81e2a94 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -2237,6 +2237,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd, vol = memdup_user((void __user *)arg, sizeof(*vol)); if (IS_ERR(vol)) return PTR_ERR(vol); + vol->name[BTRFS_PATH_NAME_MAX] = '\0'; switch (cmd) { case BTRFS_IOC_SCAN_DEV: @@ -98,12 +98,6 @@ static void *dax_make_entry(pfn_t pfn, unsigned long flags) return xa_mk_value(flags | (pfn_t_to_pfn(pfn) << DAX_SHIFT)); } -static void *dax_make_page_entry(struct page *page) -{ - pfn_t pfn = page_to_pfn_t(page); - return dax_make_entry(pfn, PageHead(page) ? DAX_PMD : 0); -} - static bool dax_is_locked(void *entry) { return xa_to_value(entry) & DAX_LOCKED; @@ -116,12 +110,12 @@ static unsigned int dax_entry_order(void *entry) return 0; } -static int dax_is_pmd_entry(void *entry) +static unsigned long dax_is_pmd_entry(void *entry) { return xa_to_value(entry) & DAX_PMD; } -static int dax_is_pte_entry(void *entry) +static bool dax_is_pte_entry(void *entry) { return !(xa_to_value(entry) & DAX_PMD); } @@ -222,9 +216,8 @@ static void *get_unlocked_entry(struct xa_state *xas) ewait.wait.func = wake_exceptional_entry_func; for (;;) { - entry = xas_load(xas); - if (!entry || xa_is_internal(entry) || - WARN_ON_ONCE(!xa_is_value(entry)) || + entry = xas_find_conflict(xas); + if (!entry || WARN_ON_ONCE(!xa_is_value(entry)) || !dax_is_locked(entry)) return entry; @@ -255,6 +248,7 @@ static void dax_unlock_entry(struct xa_state *xas, void *entry) { void *old; + BUG_ON(dax_is_locked(entry)); xas_reset(xas); xas_lock_irq(xas); old = xas_store(xas, entry); @@ -352,16 +346,27 @@ static struct page *dax_busy_page(void *entry) return NULL; } +/* + * dax_lock_mapping_entry - Lock the DAX entry corresponding to a page + * @page: The page whose entry we want to lock + * + * Context: Process context. + * Return: %true if the entry was locked or does not need to be locked. + */ bool dax_lock_mapping_entry(struct page *page) { XA_STATE(xas, NULL, 0); void *entry; + bool locked; + /* Ensure page->mapping isn't freed while we look at it */ + rcu_read_lock(); for (;;) { struct address_space *mapping = READ_ONCE(page->mapping); + locked = false; if (!dax_mapping(mapping)) - return false; + break; /* * In the device-dax case there's no need to lock, a @@ -370,8 +375,9 @@ bool dax_lock_mapping_entry(struct page *page) * otherwise we would not have a valid pfn_to_page() * translation. */ + locked = true; if (S_ISCHR(mapping->host->i_mode)) - return true; + break; xas.xa = &mapping->i_pages; xas_lock_irq(&xas); @@ -382,28 +388,35 @@ bool dax_lock_mapping_entry(struct page *page) xas_set(&xas, page->index); entry = xas_load(&xas); if (dax_is_locked(entry)) { + rcu_read_unlock(); entry = get_unlocked_entry(&xas); - /* Did the page move while we slept? */ - if (dax_to_pfn(entry) != page_to_pfn(page)) { - xas_unlock_irq(&xas); - continue; - } + xas_unlock_irq(&xas); + put_unlocked_entry(&xas, entry); + rcu_read_lock(); + continue; } dax_lock_entry(&xas, entry); xas_unlock_irq(&xas); - return true; + break; } + rcu_read_unlock(); + return locked; } void dax_unlock_mapping_entry(struct page *page) { struct address_space *mapping = page->mapping; XA_STATE(xas, &mapping->i_pages, page->index); + void *entry; if (S_ISCHR(mapping->host->i_mode)) return; - dax_unlock_entry(&xas, dax_make_page_entry(page)); + rcu_read_lock(); + entry = xas_load(&xas); + rcu_read_unlock(); + entry = dax_make_entry(page_to_pfn_t(page), dax_is_pmd_entry(entry)); + dax_unlock_entry(&xas, entry); } /* @@ -445,11 +458,9 @@ static void *grab_mapping_entry(struct xa_state *xas, retry: xas_lock_irq(xas); entry = get_unlocked_entry(xas); - if (xa_is_internal(entry)) - goto fallback; if (entry) { - if (WARN_ON_ONCE(!xa_is_value(entry))) { + if (!xa_is_value(entry)) { xas_set_err(xas, EIO); goto out_unlock; } @@ -1628,8 +1639,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) /* Did we race with someone splitting entry or so? */ if (!entry || (order == 0 && !dax_is_pte_entry(entry)) || - (order == PMD_ORDER && (xa_is_internal(entry) || - !dax_is_pmd_entry(entry)))) { + (order == PMD_ORDER && !dax_is_pmd_entry(entry))) { put_unlocked_entry(&xas, entry); xas_unlock_irq(&xas); trace_dax_insert_pfn_mkwrite_no_entry(mapping->host, vmf, diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c index 7b861bbc0b43..315967354954 100644 --- a/fs/nfs/callback_proc.c +++ b/fs/nfs/callback_proc.c @@ -686,20 +686,24 @@ __be32 nfs4_callback_offload(void *data, void *dummy, { struct cb_offloadargs *args = data; struct nfs_server *server; - struct nfs4_copy_state *copy; + struct nfs4_copy_state *copy, *tmp_copy; bool found = false; + copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS); + if (!copy) + return htonl(NFS4ERR_SERVERFAULT); + spin_lock(&cps->clp->cl_lock); rcu_read_lock(); list_for_each_entry_rcu(server, &cps->clp->cl_superblocks, client_link) { - list_for_each_entry(copy, &server->ss_copies, copies) { + list_for_each_entry(tmp_copy, &server->ss_copies, copies) { if (memcmp(args->coa_stateid.other, - copy->stateid.other, + tmp_copy->stateid.other, sizeof(args->coa_stateid.other))) continue; - nfs4_copy_cb_args(copy, args); - complete(©->completion); + nfs4_copy_cb_args(tmp_copy, args); + complete(&tmp_copy->completion); found = true; goto out; } @@ -707,15 +711,11 @@ __be32 nfs4_callback_offload(void *data, void *dummy, out: rcu_read_unlock(); if (!found) { - copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS); - if (!copy) { - spin_unlock(&cps->clp->cl_lock); - return htonl(NFS4ERR_SERVERFAULT); - } memcpy(©->stateid, &args->coa_stateid, NFS4_STATEID_SIZE); nfs4_copy_cb_args(copy, args); list_add_tail(©->copies, &cps->clp->pending_cb_stateids); - } + } else + kfree(copy); spin_unlock(&cps->clp->cl_lock); return 0; diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index 86bcba40ca61..74b36ed883ca 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -1361,12 +1361,7 @@ static void ff_layout_read_prepare_v4(struct rpc_task *task, void *data) task)) return; - if (ff_layout_read_prepare_common(task, hdr)) - return; - - if (nfs4_set_rw_stateid(&hdr->args.stateid, hdr->args.context, - hdr->args.lock_context, FMODE_READ) == -EIO) - rpc_exit(task, -EIO); /* lost lock, terminate I/O */ + ff_layout_read_prepare_common(task, hdr); } static void ff_layout_read_call_done(struct rpc_task *task, void *data) @@ -1542,12 +1537,7 @@ static void ff_layout_write_prepare_v4(struct rpc_task *task, void *data) task)) return; - if (ff_layout_write_prepare_common(task, hdr)) - return; - - if (nfs4_set_rw_stateid(&hdr->args.stateid, hdr->args.context, - hdr->args.lock_context, FMODE_WRITE) == -EIO) - rpc_exit(task, -EIO); /* lost lock, terminate I/O */ + ff_layout_write_prepare_common(task, hdr); } static void ff_layout_write_call_done(struct rpc_task *task, void *data) @@ -1742,6 +1732,10 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr) fh = nfs4_ff_layout_select_ds_fh(lseg, idx); if (fh) hdr->args.fh = fh; + + if (!nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid)) + goto out_failed; + /* * Note that if we ever decide to split across DSes, * then we may need to handle dense-like offsets. @@ -1804,6 +1798,9 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync) if (fh) hdr->args.fh = fh; + if (!nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid)) + goto out_failed; + /* * Note that if we ever decide to split across DSes, * then we may need to handle dense-like offsets. diff --git a/fs/nfs/flexfilelayout/flexfilelayout.h b/fs/nfs/flexfilelayout/flexfilelayout.h index 411798346e48..de50a342d5a5 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.h +++ b/fs/nfs/flexfilelayout/flexfilelayout.h @@ -215,6 +215,10 @@ unsigned int ff_layout_fetch_ds_ioerr(struct pnfs_layout_hdr *lo, unsigned int maxnum); struct nfs_fh * nfs4_ff_layout_select_ds_fh(struct pnfs_layout_segment *lseg, u32 mirror_idx); +int +nfs4_ff_layout_select_ds_stateid(struct pnfs_layout_segment *lseg, + u32 mirror_idx, + nfs4_stateid *stateid); struct nfs4_pnfs_ds * nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg, u32 ds_idx, diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c index 74d8d5352438..d23347389626 100644 --- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c +++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c @@ -370,6 +370,25 @@ out: return fh; } +int +nfs4_ff_layout_select_ds_stateid(struct pnfs_layout_segment *lseg, + u32 mirror_idx, + nfs4_stateid *stateid) +{ + struct nfs4_ff_layout_mirror *mirror = FF_LAYOUT_COMP(lseg, mirror_idx); + + if (!ff_layout_mirror_valid(lseg, mirror, false)) { + pr_err_ratelimited("NFS: %s: No data server for mirror offset index %d\n", + __func__, mirror_idx); + goto out; + } + + nfs4_stateid_copy(stateid, &mirror->stateid); + return 1; +out: + return 0; +} + /** * nfs4_ff_layout_prepare_ds - prepare a DS connection for an RPC call * @lseg: the layout segment we're operating on diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c index ac5b784a1de0..fed06fd9998d 100644 --- a/fs/nfs/nfs42proc.c +++ b/fs/nfs/nfs42proc.c @@ -137,31 +137,32 @@ static int handle_async_copy(struct nfs42_copy_res *res, struct file *dst, nfs4_stateid *src_stateid) { - struct nfs4_copy_state *copy; + struct nfs4_copy_state *copy, *tmp_copy; int status = NFS4_OK; bool found_pending = false; struct nfs_open_context *ctx = nfs_file_open_context(dst); + copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS); + if (!copy) + return -ENOMEM; + spin_lock(&server->nfs_client->cl_lock); - list_for_each_entry(copy, &server->nfs_client->pending_cb_stateids, + list_for_each_entry(tmp_copy, &server->nfs_client->pending_cb_stateids, copies) { - if (memcmp(&res->write_res.stateid, ©->stateid, + if (memcmp(&res->write_res.stateid, &tmp_copy->stateid, NFS4_STATEID_SIZE)) continue; found_pending = true; - list_del(©->copies); + list_del(&tmp_copy->copies); break; } if (found_pending) { spin_unlock(&server->nfs_client->cl_lock); + kfree(copy); + copy = tmp_copy; goto out; } - copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS); - if (!copy) { - spin_unlock(&server->nfs_client->cl_lock); - return -ENOMEM; - } memcpy(©->stateid, &res->write_res.stateid, NFS4_STATEID_SIZE); init_completion(©->completion); copy->parent_state = ctx->state; diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index 8d59c9655ec4..1b994b527518 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -41,6 +41,8 @@ enum nfs4_client_state { NFS4CLNT_MOVED, NFS4CLNT_LEASE_MOVED, NFS4CLNT_DELEGATION_EXPIRED, + NFS4CLNT_RUN_MANAGER, + NFS4CLNT_DELEGRETURN_RUNNING, }; #define NFS4_RENEW_TIMEOUT 0x01 diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index ffea57885394..d8decf2ec48f 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -1210,6 +1210,7 @@ void nfs4_schedule_state_manager(struct nfs_client *clp) struct task_struct *task; char buf[INET6_ADDRSTRLEN + sizeof("-manager") + 1]; + set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state); if (test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state) != 0) return; __module_get(THIS_MODULE); @@ -2503,6 +2504,7 @@ static void nfs4_state_manager(struct nfs_client *clp) /* Ensure exclusive access to NFSv4 state */ do { + clear_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state); if (test_bit(NFS4CLNT_PURGE_STATE, &clp->cl_state)) { section = "purge state"; status = nfs4_purge_lease(clp); @@ -2593,14 +2595,18 @@ static void nfs4_state_manager(struct nfs_client *clp) } nfs4_end_drain_session(clp); - if (test_and_clear_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state)) { - nfs_client_return_marked_delegations(clp); - continue; + nfs4_clear_state_manager_bit(clp); + + if (!test_and_set_bit(NFS4CLNT_DELEGRETURN_RUNNING, &clp->cl_state)) { + if (test_and_clear_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state)) { + nfs_client_return_marked_delegations(clp); + set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state); + } + clear_bit(NFS4CLNT_DELEGRETURN_RUNNING, &clp->cl_state); } - nfs4_clear_state_manager_bit(clp); /* Did we race with an attempt to give us more work? */ - if (clp->cl_state == 0) + if (!test_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state)) return; if (test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state) != 0) return; diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c index de99db518571..f2129a5d9f23 100644 --- a/fs/nilfs2/btnode.c +++ b/fs/nilfs2/btnode.c @@ -266,9 +266,7 @@ void nilfs_btnode_abort_change_key(struct address_space *btnc, return; if (nbh == NULL) { /* blocksize == pagesize */ - xa_lock_irq(&btnc->i_pages); - __xa_erase(&btnc->i_pages, newkey); - xa_unlock_irq(&btnc->i_pages); + xa_erase_irq(&btnc->i_pages, newkey); unlock_page(ctxt->bh->b_page); } else brelse(nbh); diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h index bd73e7a91410..9e66bfe369aa 100644 --- a/include/linux/dma-direct.h +++ b/include/linux/dma-direct.h @@ -5,7 +5,7 @@ #include <linux/dma-mapping.h> #include <linux/mem_encrypt.h> -#define DIRECT_MAPPING_ERROR 0 +#define DIRECT_MAPPING_ERROR (~(dma_addr_t)0) #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA #include <asm/dma-direct.h> diff --git a/include/linux/filter.h b/include/linux/filter.h index cc17f5f32fbb..d16deead65c6 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -852,6 +852,10 @@ void bpf_jit_binary_free(struct bpf_binary_header *hdr); void bpf_jit_free(struct bpf_prog *fp); +int bpf_jit_get_func_addr(const struct bpf_prog *prog, + const struct bpf_insn *insn, bool extra_pass, + u64 *func_addr, bool *func_addr_fixed); + struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *fp); void bpf_jit_prog_release_other(struct bpf_prog *fp, struct bpf_prog *fp_other); diff --git a/include/linux/hid.h b/include/linux/hid.h index 387c70df6f29..a355d61940f2 100644 --- a/include/linux/hid.h +++ b/include/linux/hid.h @@ -1139,34 +1139,6 @@ static inline u32 hid_report_len(struct hid_report *report) int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, u32 size, int interrupt); - -/** - * struct hid_scroll_counter - Utility class for processing high-resolution - * scroll events. - * @dev: the input device for which events should be reported. - * @microns_per_hi_res_unit: the amount moved by the user's finger for each - * high-resolution unit reported by the mouse, in - * microns. - * @resolution_multiplier: the wheel's resolution in high-resolution mode as a - * multiple of its lower resolution. For example, if - * moving the wheel by one "notch" would result in a - * value of 1 in low-resolution mode but 8 in - * high-resolution, the multiplier is 8. - * @remainder: counts the number of high-resolution units moved since the last - * low-resolution event (REL_WHEEL or REL_HWHEEL) was sent. Should - * only be used by class methods. - */ -struct hid_scroll_counter { - struct input_dev *dev; - int microns_per_hi_res_unit; - int resolution_multiplier; - - int remainder; -}; - -void hid_scroll_counter_handle_scroll(struct hid_scroll_counter *counter, - int hi_res_value); - /* HID quirks API */ unsigned long hid_lookup_quirk(const struct hid_device *hdev); int hid_quirks_init(char **quirks_param, __u16 bus, int count); diff --git a/include/linux/netfilter/nf_conntrack_proto_gre.h b/include/linux/netfilter/nf_conntrack_proto_gre.h index b8d95564bd53..14edb795ab43 100644 --- a/include/linux/netfilter/nf_conntrack_proto_gre.h +++ b/include/linux/netfilter/nf_conntrack_proto_gre.h @@ -21,6 +21,19 @@ struct nf_ct_gre_keymap { struct nf_conntrack_tuple tuple; }; +enum grep_conntrack { + GRE_CT_UNREPLIED, + GRE_CT_REPLIED, + GRE_CT_MAX +}; + +struct netns_proto_gre { + struct nf_proto_net nf; + rwlock_t keymap_lock; + struct list_head keymap_list; + unsigned int gre_timeouts[GRE_CT_MAX]; +}; + /* add new tuple->key_reply pair to keymap */ int nf_ct_gre_keymap_add(struct nf_conn *ct, enum ip_conntrack_dir dir, struct nf_conntrack_tuple *t); diff --git a/include/linux/xarray.h b/include/linux/xarray.h index d9514928ddac..564892e19f8c 100644 --- a/include/linux/xarray.h +++ b/include/linux/xarray.h @@ -289,9 +289,7 @@ struct xarray { void xa_init_flags(struct xarray *, gfp_t flags); void *xa_load(struct xarray *, unsigned long index); void *xa_store(struct xarray *, unsigned long index, void *entry, gfp_t); -void *xa_cmpxchg(struct xarray *, unsigned long index, - void *old, void *entry, gfp_t); -int xa_reserve(struct xarray *, unsigned long index, gfp_t); +void *xa_erase(struct xarray *, unsigned long index); void *xa_store_range(struct xarray *, unsigned long first, unsigned long last, void *entry, gfp_t); bool xa_get_mark(struct xarray *, unsigned long index, xa_mark_t); @@ -344,65 +342,6 @@ static inline bool xa_marked(const struct xarray *xa, xa_mark_t mark) } /** - * xa_erase() - Erase this entry from the XArray. - * @xa: XArray. - * @index: Index of entry. - * - * This function is the equivalent of calling xa_store() with %NULL as - * the third argument. The XArray does not need to allocate memory, so - * the user does not need to provide GFP flags. - * - * Context: Process context. Takes and releases the xa_lock. - * Return: The entry which used to be at this index. - */ -static inline void *xa_erase(struct xarray *xa, unsigned long index) -{ - return xa_store(xa, index, NULL, 0); -} - -/** - * xa_insert() - Store this entry in the XArray unless another entry is - * already present. - * @xa: XArray. - * @index: Index into array. - * @entry: New entry. - * @gfp: Memory allocation flags. - * - * If you would rather see the existing entry in the array, use xa_cmpxchg(). - * This function is for users who don't care what the entry is, only that - * one is present. - * - * Context: Process context. Takes and releases the xa_lock. - * May sleep if the @gfp flags permit. - * Return: 0 if the store succeeded. -EEXIST if another entry was present. - * -ENOMEM if memory could not be allocated. - */ -static inline int xa_insert(struct xarray *xa, unsigned long index, - void *entry, gfp_t gfp) -{ - void *curr = xa_cmpxchg(xa, index, NULL, entry, gfp); - if (!curr) - return 0; - if (xa_is_err(curr)) - return xa_err(curr); - return -EEXIST; -} - -/** - * xa_release() - Release a reserved entry. - * @xa: XArray. - * @index: Index of entry. - * - * After calling xa_reserve(), you can call this function to release the - * reservation. If the entry at @index has been stored to, this function - * will do nothing. - */ -static inline void xa_release(struct xarray *xa, unsigned long index) -{ - xa_cmpxchg(xa, index, NULL, NULL, 0); -} - -/** * xa_for_each() - Iterate over a portion of an XArray. * @xa: XArray. * @entry: Entry retrieved from array. @@ -455,6 +394,7 @@ void *__xa_store(struct xarray *, unsigned long index, void *entry, gfp_t); void *__xa_cmpxchg(struct xarray *, unsigned long index, void *old, void *entry, gfp_t); int __xa_alloc(struct xarray *, u32 *id, u32 max, void *entry, gfp_t); +int __xa_reserve(struct xarray *, unsigned long index, gfp_t); void __xa_set_mark(struct xarray *, unsigned long index, xa_mark_t); void __xa_clear_mark(struct xarray *, unsigned long index, xa_mark_t); @@ -487,6 +427,58 @@ static inline int __xa_insert(struct xarray *xa, unsigned long index, } /** + * xa_store_bh() - Store this entry in the XArray. + * @xa: XArray. + * @index: Index into array. + * @entry: New entry. + * @gfp: Memory allocation flags. + * + * This function is like calling xa_store() except it disables softirqs + * while holding the array lock. + * + * Context: Any context. Takes and releases the xa_lock while + * disabling softirqs. + * Return: The entry which used to be at this index. + */ +static inline void *xa_store_bh(struct xarray *xa, unsigned long index, + void *entry, gfp_t gfp) +{ + void *curr; + + xa_lock_bh(xa); + curr = __xa_store(xa, index, entry, gfp); + xa_unlock_bh(xa); + + return curr; +} + +/** + * xa_store_irq() - Erase this entry from the XArray. + * @xa: XArray. + * @index: Index into array. + * @entry: New entry. + * @gfp: Memory allocation flags. + * + * This function is like calling xa_store() except it disables interrupts + * while holding the array lock. + * + * Context: Process context. Takes and releases the xa_lock while + * disabling interrupts. + * Return: The entry which used to be at this index. + */ +static inline void *xa_store_irq(struct xarray *xa, unsigned long index, + void *entry, gfp_t gfp) +{ + void *curr; + + xa_lock_irq(xa); + curr = __xa_store(xa, index, entry, gfp); + xa_unlock_irq(xa); + + return curr; +} + +/** * xa_erase_bh() - Erase this entry from the XArray. * @xa: XArray. * @index: Index of entry. @@ -495,7 +487,7 @@ static inline int __xa_insert(struct xarray *xa, unsigned long index, * the third argument. The XArray does not need to allocate memory, so * the user does not need to provide GFP flags. * - * Context: Process context. Takes and releases the xa_lock while + * Context: Any context. Takes and releases the xa_lock while * disabling softirqs. * Return: The entry which used to be at this index. */ @@ -535,6 +527,61 @@ static inline void *xa_erase_irq(struct xarray *xa, unsigned long index) } /** + * xa_cmpxchg() - Conditionally replace an entry in the XArray. + * @xa: XArray. + * @index: Index into array. + * @old: Old value to test against. + * @entry: New value to place in array. + * @gfp: Memory allocation flags. + * + * If the entry at @index is the same as @old, replace it with @entry. + * If the return value is equal to @old, then the exchange was successful. + * + * Context: Any context. Takes and releases the xa_lock. May sleep + * if the @gfp flags permit. + * Return: The old value at this index or xa_err() if an error happened. + */ +static inline void *xa_cmpxchg(struct xarray *xa, unsigned long index, + void *old, void *entry, gfp_t gfp) +{ + void *curr; + + xa_lock(xa); + curr = __xa_cmpxchg(xa, index, old, entry, gfp); + xa_unlock(xa); + + return curr; +} + +/** + * xa_insert() - Store this entry in the XArray unless another entry is + * already present. + * @xa: XArray. + * @index: Index into array. + * @entry: New entry. + * @gfp: Memory allocation flags. + * + * If you would rather see the existing entry in the array, use xa_cmpxchg(). + * This function is for users who don't care what the entry is, only that + * one is present. + * + * Context: Process context. Takes and releases the xa_lock. + * May sleep if the @gfp flags permit. + * Return: 0 if the store succeeded. -EEXIST if another entry was present. + * -ENOMEM if memory could not be allocated. + */ +static inline int xa_insert(struct xarray *xa, unsigned long index, + void *entry, gfp_t gfp) +{ + void *curr = xa_cmpxchg(xa, index, NULL, entry, gfp); + if (!curr) + return 0; + if (xa_is_err(curr)) + return xa_err(curr); + return -EEXIST; +} + +/** * xa_alloc() - Find somewhere to store this entry in the XArray. * @xa: XArray. * @id: Pointer to ID. @@ -575,7 +622,7 @@ static inline int xa_alloc(struct xarray *xa, u32 *id, u32 max, void *entry, * Updates the @id pointer with the index, then stores the entry at that * index. A concurrent lookup will not see an uninitialised @id. * - * Context: Process context. Takes and releases the xa_lock while + * Context: Any context. Takes and releases the xa_lock while * disabling softirqs. May sleep if the @gfp flags permit. * Return: 0 on success, -ENOMEM if memory allocation fails or -ENOSPC if * there is no more space in the XArray. @@ -621,6 +668,98 @@ static inline int xa_alloc_irq(struct xarray *xa, u32 *id, u32 max, void *entry, return err; } +/** + * xa_reserve() - Reserve this index in the XArray. + * @xa: XArray. + * @index: Index into array. + * @gfp: Memory allocation flags. + * + * Ensures there is somewhere to store an entry at @index in the array. + * If there is already something stored at @index, this function does + * nothing. If there was nothing there, the entry is marked as reserved. + * Loading from a reserved entry returns a %NULL pointer. + * + * If you do not use the entry that you have reserved, call xa_release() + * or xa_erase() to free any unnecessary memory. + * + * Context: Any context. Takes and releases the xa_lock. + * May sleep if the @gfp flags permit. + * Return: 0 if the reservation succeeded or -ENOMEM if it failed. + */ +static inline +int xa_reserve(struct xarray *xa, unsigned long index, gfp_t gfp) +{ + int ret; + + xa_lock(xa); + ret = __xa_reserve(xa, index, gfp); + xa_unlock(xa); + + return ret; +} + +/** + * xa_reserve_bh() - Reserve this index in the XArray. + * @xa: XArray. + * @index: Index into array. + * @gfp: Memory allocation flags. + * + * A softirq-disabling version of xa_reserve(). + * + * Context: Any context. Takes and releases the xa_lock while + * disabling softirqs. + * Return: 0 if the reservation succeeded or -ENOMEM if it failed. + */ +static inline +int xa_reserve_bh(struct xarray *xa, unsigned long index, gfp_t gfp) +{ + int ret; + + xa_lock_bh(xa); + ret = __xa_reserve(xa, index, gfp); + xa_unlock_bh(xa); + + return ret; +} + +/** + * xa_reserve_irq() - Reserve this index in the XArray. + * @xa: XArray. + * @index: Index into array. + * @gfp: Memory allocation flags. + * + * An interrupt-disabling version of xa_reserve(). + * + * Context: Process context. Takes and releases the xa_lock while + * disabling interrupts. + * Return: 0 if the reservation succeeded or -ENOMEM if it failed. + */ +static inline +int xa_reserve_irq(struct xarray *xa, unsigned long index, gfp_t gfp) +{ + int ret; + + xa_lock_irq(xa); + ret = __xa_reserve(xa, index, gfp); + xa_unlock_irq(xa); + + return ret; +} + +/** + * xa_release() - Release a reserved entry. + * @xa: XArray. + * @index: Index of entry. + * + * After calling xa_reserve(), you can call this function to release the + * reservation. If the entry at @index has been stored to, this function + * will do nothing. + */ +static inline void xa_release(struct xarray *xa, unsigned long index) +{ + xa_cmpxchg(xa, index, NULL, NULL, 0); +} + /* Everything below here is the Advanced API. Proceed with caution. */ /* diff --git a/include/net/netfilter/ipv4/nf_nat_masquerade.h b/include/net/netfilter/ipv4/nf_nat_masquerade.h index cd24be4c4a99..13d55206bb9f 100644 --- a/include/net/netfilter/ipv4/nf_nat_masquerade.h +++ b/include/net/netfilter/ipv4/nf_nat_masquerade.h @@ -9,7 +9,7 @@ nf_nat_masquerade_ipv4(struct sk_buff *skb, unsigned int hooknum, const struct nf_nat_range2 *range, const struct net_device *out); -void nf_nat_masquerade_ipv4_register_notifier(void); +int nf_nat_masquerade_ipv4_register_notifier(void); void nf_nat_masquerade_ipv4_unregister_notifier(void); #endif /*_NF_NAT_MASQUERADE_IPV4_H_ */ diff --git a/include/net/netfilter/ipv6/nf_nat_masquerade.h b/include/net/netfilter/ipv6/nf_nat_masquerade.h index 0c3b5ebf0bb8..2917bf95c437 100644 --- a/include/net/netfilter/ipv6/nf_nat_masquerade.h +++ b/include/net/netfilter/ipv6/nf_nat_masquerade.h @@ -5,7 +5,7 @@ unsigned int nf_nat_masquerade_ipv6(struct sk_buff *skb, const struct nf_nat_range2 *range, const struct net_device *out); -void nf_nat_masquerade_ipv6_register_notifier(void); +int nf_nat_masquerade_ipv6_register_notifier(void); void nf_nat_masquerade_ipv6_unregister_notifier(void); #endif /* _NF_NAT_MASQUERADE_IPV6_H_ */ diff --git a/include/uapi/linux/input-event-codes.h b/include/uapi/linux/input-event-codes.h index 6d180cc60a5d..3eb5a4c3d60a 100644 --- a/include/uapi/linux/input-event-codes.h +++ b/include/uapi/linux/input-event-codes.h @@ -716,7 +716,6 @@ * the situation described above. */ #define REL_RESERVED 0x0a -#define REL_WHEEL_HI_RES 0x0b #define REL_MAX 0x0f #define REL_CNT (REL_MAX+1) @@ -753,15 +752,6 @@ #define ABS_MISC 0x28 -/* - * 0x2e is reserved and should not be used in input drivers. - * It was used by HID as ABS_MISC+6 and userspace needs to detect if - * the next ABS_* event is correct or is just ABS_MISC + n. - * We define here ABS_RESERVED so userspace can rely on it and detect - * the situation described above. - */ -#define ABS_RESERVED 0x2e - #define ABS_MT_SLOT 0x2f /* MT slot being modified */ #define ABS_MT_TOUCH_MAJOR 0x30 /* Major axis of touching ellipse */ #define ABS_MT_TOUCH_MINOR 0x31 /* Minor axis (omit if circular) */ diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 16d77012ad3e..539e8575689f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -685,6 +685,40 @@ void __weak bpf_jit_free(struct bpf_prog *fp) bpf_prog_unlock_free(fp); } +int bpf_jit_get_func_addr(const struct bpf_prog *prog, + const struct bpf_insn *insn, bool extra_pass, + u64 *func_addr, bool *func_addr_fixed) +{ + s16 off = insn->off; + s32 imm = insn->imm; + u8 *addr; + + *func_addr_fixed = insn->src_reg != BPF_PSEUDO_CALL; + if (!*func_addr_fixed) { + /* Place-holder address till the last pass has collected + * all addresses for JITed subprograms in which case we + * can pick them up from prog->aux. + */ + if (!extra_pass) + addr = NULL; + else if (prog->aux->func && + off >= 0 && off < prog->aux->func_cnt) + addr = (u8 *)prog->aux->func[off]->bpf_func; + else + return -EINVAL; + } else { + /* Address of a BPF helper call. Since part of the core + * kernel, it's always at a fixed location. __bpf_call_base + * and the helper with imm relative to it are both in core + * kernel. + */ + addr = (u8 *)__bpf_call_base + imm; + } + + *func_addr = (unsigned long)addr; + return 0; +} + static int bpf_jit_blind_insn(const struct bpf_insn *from, const struct bpf_insn *aux, struct bpf_insn *to_buff) diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c index 9e94b1cc6cf2..b65017dead44 100644 --- a/kernel/bpf/local_storage.c +++ b/kernel/bpf/local_storage.c @@ -138,7 +138,8 @@ static int cgroup_storage_update_elem(struct bpf_map *map, void *_key, return -ENOENT; new = kmalloc_node(sizeof(struct bpf_storage_buffer) + - map->value_size, __GFP_ZERO | GFP_USER, + map->value_size, + __GFP_ZERO | GFP_ATOMIC | __GFP_NOWARN, map->numa_node); if (!new) return -ENOMEM; diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c index 8bbd72d3a121..b384ea9f3254 100644 --- a/kernel/bpf/queue_stack_maps.c +++ b/kernel/bpf/queue_stack_maps.c @@ -7,6 +7,7 @@ #include <linux/bpf.h> #include <linux/list.h> #include <linux/slab.h> +#include <linux/capability.h> #include "percpu_freelist.h" #define QUEUE_STACK_CREATE_FLAG_MASK \ @@ -45,8 +46,12 @@ static bool queue_stack_map_is_full(struct bpf_queue_stack *qs) /* Called from syscall */ static int queue_stack_map_alloc_check(union bpf_attr *attr) { + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + /* check sanity of attributes */ if (attr->max_entries == 0 || attr->key_size != 0 || + attr->value_size == 0 || attr->map_flags & ~QUEUE_STACK_CREATE_FLAG_MASK) return -EINVAL; @@ -63,15 +68,10 @@ static struct bpf_map *queue_stack_map_alloc(union bpf_attr *attr) { int ret, numa_node = bpf_map_attr_numa_node(attr); struct bpf_queue_stack *qs; - u32 size, value_size; - u64 queue_size, cost; - - size = attr->max_entries + 1; - value_size = attr->value_size; - - queue_size = sizeof(*qs) + (u64) value_size * size; + u64 size, queue_size, cost; - cost = queue_size; + size = (u64) attr->max_entries + 1; + cost = queue_size = sizeof(*qs) + size * attr->value_size; if (cost >= U32_MAX - PAGE_SIZE) return ERR_PTR(-E2BIG); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f102c4fd0c5a..4ce049cd30a3 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5771,7 +5771,7 @@ static void adjust_subprog_starts(struct bpf_verifier_env *env, u32 off, u32 len return; /* NOTE: fake 'exit' subprog should be updated as well. */ for (i = 0; i <= env->subprog_cnt; i++) { - if (env->subprog_info[i].start < off) + if (env->subprog_info[i].start <= off) continue; env->subprog_info[i].start += len - 1; } diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 5731daa09a32..045930e32c0e 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -679,7 +679,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, } if (!dev_is_dma_coherent(dev) && - (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0) + (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0 && + dev_addr != DIRECT_MAPPING_ERROR) arch_sync_dma_for_device(dev, phys, size, dir); return dev_addr; diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 08fcfe440c63..9864a35c8bb5 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -196,11 +196,13 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1, i++; } else if (fmt[i] == 'p' || fmt[i] == 's') { mod[fmt_cnt]++; - i++; - if (!isspace(fmt[i]) && !ispunct(fmt[i]) && fmt[i] != 0) + /* disallow any further format extensions */ + if (fmt[i + 1] != 0 && + !isspace(fmt[i + 1]) && + !ispunct(fmt[i + 1])) return -EINVAL; fmt_cnt++; - if (fmt[i - 1] == 's') { + if (fmt[i] == 's') { if (str_seen) /* allow only one '%s' per fmt string */ return -EINVAL; diff --git a/lib/test_xarray.c b/lib/test_xarray.c index aa47754150ce..0598e86af8fc 100644 --- a/lib/test_xarray.c +++ b/lib/test_xarray.c @@ -208,15 +208,19 @@ static noinline void check_xa_mark_1(struct xarray *xa, unsigned long index) XA_BUG_ON(xa, xa_get_mark(xa, i, XA_MARK_2)); /* We should see two elements in the array */ + rcu_read_lock(); xas_for_each(&xas, entry, ULONG_MAX) seen++; + rcu_read_unlock(); XA_BUG_ON(xa, seen != 2); /* One of which is marked */ xas_set(&xas, 0); seen = 0; + rcu_read_lock(); xas_for_each_marked(&xas, entry, ULONG_MAX, XA_MARK_0) seen++; + rcu_read_unlock(); XA_BUG_ON(xa, seen != 1); } XA_BUG_ON(xa, xa_get_mark(xa, next, XA_MARK_0)); @@ -373,6 +377,12 @@ static noinline void check_reserve(struct xarray *xa) xa_erase_index(xa, 12345678); XA_BUG_ON(xa, !xa_empty(xa)); + /* And so does xa_insert */ + xa_reserve(xa, 12345678, GFP_KERNEL); + XA_BUG_ON(xa, xa_insert(xa, 12345678, xa_mk_value(12345678), 0) != 0); + xa_erase_index(xa, 12345678); + XA_BUG_ON(xa, !xa_empty(xa)); + /* Can iterate through a reserved entry */ xa_store_index(xa, 5, GFP_KERNEL); xa_reserve(xa, 6, GFP_KERNEL); @@ -436,7 +446,9 @@ static noinline void check_multi_store_1(struct xarray *xa, unsigned long index, XA_BUG_ON(xa, xa_load(xa, max) != NULL); XA_BUG_ON(xa, xa_load(xa, min - 1) != NULL); + xas_lock(&xas); XA_BUG_ON(xa, xas_store(&xas, xa_mk_value(min)) != xa_mk_value(index)); + xas_unlock(&xas); XA_BUG_ON(xa, xa_load(xa, min) != xa_mk_value(min)); XA_BUG_ON(xa, xa_load(xa, max - 1) != xa_mk_value(min)); XA_BUG_ON(xa, xa_load(xa, max) != NULL); @@ -452,9 +464,11 @@ static noinline void check_multi_store_2(struct xarray *xa, unsigned long index, XA_STATE(xas, xa, index); xa_store_order(xa, index, order, xa_mk_value(0), GFP_KERNEL); + xas_lock(&xas); XA_BUG_ON(xa, xas_store(&xas, xa_mk_value(1)) != xa_mk_value(0)); XA_BUG_ON(xa, xas.xa_index != index); XA_BUG_ON(xa, xas_store(&xas, NULL) != xa_mk_value(1)); + xas_unlock(&xas); XA_BUG_ON(xa, !xa_empty(xa)); } #endif @@ -498,7 +512,7 @@ static noinline void check_multi_store(struct xarray *xa) rcu_read_unlock(); /* We can erase multiple values with a single store */ - xa_store_order(xa, 0, 63, NULL, GFP_KERNEL); + xa_store_order(xa, 0, BITS_PER_LONG - 1, NULL, GFP_KERNEL); XA_BUG_ON(xa, !xa_empty(xa)); /* Even when the first slot is empty but the others aren't */ @@ -702,7 +716,7 @@ static noinline void check_multi_find_2(struct xarray *xa) } } -static noinline void check_find(struct xarray *xa) +static noinline void check_find_1(struct xarray *xa) { unsigned long i, j, k; @@ -748,6 +762,34 @@ static noinline void check_find(struct xarray *xa) XA_BUG_ON(xa, xa_get_mark(xa, i, XA_MARK_0)); } XA_BUG_ON(xa, !xa_empty(xa)); +} + +static noinline void check_find_2(struct xarray *xa) +{ + void *entry; + unsigned long i, j, index = 0; + + xa_for_each(xa, entry, index, ULONG_MAX, XA_PRESENT) { + XA_BUG_ON(xa, true); + } + + for (i = 0; i < 1024; i++) { + xa_store_index(xa, index, GFP_KERNEL); + j = 0; + index = 0; + xa_for_each(xa, entry, index, ULONG_MAX, XA_PRESENT) { + XA_BUG_ON(xa, xa_mk_value(index) != entry); + XA_BUG_ON(xa, index != j++); + } + } + + xa_destroy(xa); +} + +static noinline void check_find(struct xarray *xa) +{ + check_find_1(xa); + check_find_2(xa); check_multi_find(xa); check_multi_find_2(xa); } @@ -1067,7 +1109,7 @@ static noinline void check_store_range(struct xarray *xa) __check_store_range(xa, 4095 + i, 4095 + j); __check_store_range(xa, 4096 + i, 4096 + j); __check_store_range(xa, 123456 + i, 123456 + j); - __check_store_range(xa, UINT_MAX + i, UINT_MAX + j); + __check_store_range(xa, (1 << 24) + i, (1 << 24) + j); } } } @@ -1146,10 +1188,12 @@ static noinline void check_account(struct xarray *xa) XA_STATE(xas, xa, 1 << order); xa_store_order(xa, 0, order, xa, GFP_KERNEL); + rcu_read_lock(); xas_load(&xas); XA_BUG_ON(xa, xas.xa_node->count == 0); XA_BUG_ON(xa, xas.xa_node->count > (1 << order)); XA_BUG_ON(xa, xas.xa_node->nr_values != 0); + rcu_read_unlock(); xa_store_order(xa, 1 << order, order, xa_mk_value(1 << order), GFP_KERNEL); diff --git a/lib/xarray.c b/lib/xarray.c index 8b176f009c08..bbacca576593 100644 --- a/lib/xarray.c +++ b/lib/xarray.c @@ -610,8 +610,8 @@ static int xas_expand(struct xa_state *xas, void *head) * (see the xa_cmpxchg() implementation for an example). * * Return: If the slot already existed, returns the contents of this slot. - * If the slot was newly created, returns NULL. If it failed to create the - * slot, returns NULL and indicates the error in @xas. + * If the slot was newly created, returns %NULL. If it failed to create the + * slot, returns %NULL and indicates the error in @xas. */ static void *xas_create(struct xa_state *xas) { @@ -1334,44 +1334,31 @@ void *__xa_erase(struct xarray *xa, unsigned long index) XA_STATE(xas, xa, index); return xas_result(&xas, xas_store(&xas, NULL)); } -EXPORT_SYMBOL_GPL(__xa_erase); +EXPORT_SYMBOL(__xa_erase); /** - * xa_store() - Store this entry in the XArray. + * xa_erase() - Erase this entry from the XArray. * @xa: XArray. - * @index: Index into array. - * @entry: New entry. - * @gfp: Memory allocation flags. + * @index: Index of entry. * - * After this function returns, loads from this index will return @entry. - * Storing into an existing multislot entry updates the entry of every index. - * The marks associated with @index are unaffected unless @entry is %NULL. + * This function is the equivalent of calling xa_store() with %NULL as + * the third argument. The XArray does not need to allocate memory, so + * the user does not need to provide GFP flags. * - * Context: Process context. Takes and releases the xa_lock. May sleep - * if the @gfp flags permit. - * Return: The old entry at this index on success, xa_err(-EINVAL) if @entry - * cannot be stored in an XArray, or xa_err(-ENOMEM) if memory allocation - * failed. + * Context: Any context. Takes and releases the xa_lock. + * Return: The entry which used to be at this index. */ -void *xa_store(struct xarray *xa, unsigned long index, void *entry, gfp_t gfp) +void *xa_erase(struct xarray *xa, unsigned long index) { - XA_STATE(xas, xa, index); - void *curr; - - if (WARN_ON_ONCE(xa_is_internal(entry))) - return XA_ERROR(-EINVAL); + void *entry; - do { - xas_lock(&xas); - curr = xas_store(&xas, entry); - if (xa_track_free(xa) && entry) - xas_clear_mark(&xas, XA_FREE_MARK); - xas_unlock(&xas); - } while (xas_nomem(&xas, gfp)); + xa_lock(xa); + entry = __xa_erase(xa, index); + xa_unlock(xa); - return xas_result(&xas, curr); + return entry; } -EXPORT_SYMBOL(xa_store); +EXPORT_SYMBOL(xa_erase); /** * __xa_store() - Store this entry in the XArray. @@ -1395,10 +1382,12 @@ void *__xa_store(struct xarray *xa, unsigned long index, void *entry, gfp_t gfp) if (WARN_ON_ONCE(xa_is_internal(entry))) return XA_ERROR(-EINVAL); + if (xa_track_free(xa) && !entry) + entry = XA_ZERO_ENTRY; do { curr = xas_store(&xas, entry); - if (xa_track_free(xa) && entry) + if (xa_track_free(xa)) xas_clear_mark(&xas, XA_FREE_MARK); } while (__xas_nomem(&xas, gfp)); @@ -1407,45 +1396,33 @@ void *__xa_store(struct xarray *xa, unsigned long index, void *entry, gfp_t gfp) EXPORT_SYMBOL(__xa_store); /** - * xa_cmpxchg() - Conditionally replace an entry in the XArray. + * xa_store() - Store this entry in the XArray. * @xa: XArray. * @index: Index into array. - * @old: Old value to test against. - * @entry: New value to place in array. + * @entry: New entry. * @gfp: Memory allocation flags. * - * If the entry at @index is the same as @old, replace it with @entry. - * If the return value is equal to @old, then the exchange was successful. + * After this function returns, loads from this index will return @entry. + * Storing into an existing multislot entry updates the entry of every index. + * The marks associated with @index are unaffected unless @entry is %NULL. * - * Context: Process context. Takes and releases the xa_lock. May sleep - * if the @gfp flags permit. - * Return: The old value at this index or xa_err() if an error happened. + * Context: Any context. Takes and releases the xa_lock. + * May sleep if the @gfp flags permit. + * Return: The old entry at this index on success, xa_err(-EINVAL) if @entry + * cannot be stored in an XArray, or xa_err(-ENOMEM) if memory allocation + * failed. */ -void *xa_cmpxchg(struct xarray *xa, unsigned long index, - void *old, void *entry, gfp_t gfp) +void *xa_store(struct xarray *xa, unsigned long index, void *entry, gfp_t gfp) { - XA_STATE(xas, xa, index); void *curr; - if (WARN_ON_ONCE(xa_is_internal(entry))) - return XA_ERROR(-EINVAL); - - do { - xas_lock(&xas); - curr = xas_load(&xas); - if (curr == XA_ZERO_ENTRY) - curr = NULL; - if (curr == old) { - xas_store(&xas, entry); - if (xa_track_free(xa) && entry) - xas_clear_mark(&xas, XA_FREE_MARK); - } - xas_unlock(&xas); - } while (xas_nomem(&xas, gfp)); + xa_lock(xa); + curr = __xa_store(xa, index, entry, gfp); + xa_unlock(xa); - return xas_result(&xas, curr); + return curr; } -EXPORT_SYMBOL(xa_cmpxchg); +EXPORT_SYMBOL(xa_store); /** * __xa_cmpxchg() - Store this entry in the XArray. @@ -1471,6 +1448,8 @@ void *__xa_cmpxchg(struct xarray *xa, unsigned long index, if (WARN_ON_ONCE(xa_is_internal(entry))) return XA_ERROR(-EINVAL); + if (xa_track_free(xa) && !entry) + entry = XA_ZERO_ENTRY; do { curr = xas_load(&xas); @@ -1478,7 +1457,7 @@ void *__xa_cmpxchg(struct xarray *xa, unsigned long index, curr = NULL; if (curr == old) { xas_store(&xas, entry); - if (xa_track_free(xa) && entry) + if (xa_track_free(xa)) xas_clear_mark(&xas, XA_FREE_MARK); } } while (__xas_nomem(&xas, gfp)); @@ -1488,7 +1467,7 @@ void *__xa_cmpxchg(struct xarray *xa, unsigned long index, EXPORT_SYMBOL(__xa_cmpxchg); /** - * xa_reserve() - Reserve this index in the XArray. + * __xa_reserve() - Reserve this index in the XArray. * @xa: XArray. * @index: Index into array. * @gfp: Memory allocation flags. @@ -1496,33 +1475,32 @@ EXPORT_SYMBOL(__xa_cmpxchg); * Ensures there is somewhere to store an entry at @index in the array. * If there is already something stored at @index, this function does * nothing. If there was nothing there, the entry is marked as reserved. - * Loads from @index will continue to see a %NULL pointer until a - * subsequent store to @index. + * Loading from a reserved entry returns a %NULL pointer. * * If you do not use the entry that you have reserved, call xa_release() * or xa_erase() to free any unnecessary memory. * - * Context: Process context. Takes and releases the xa_lock, IRQ or BH safe - * if specified in XArray flags. May sleep if the @gfp flags permit. + * Context: Any context. Expects the xa_lock to be held on entry. May + * release the lock, sleep and reacquire the lock if the @gfp flags permit. * Return: 0 if the reservation succeeded or -ENOMEM if it failed. */ -int xa_reserve(struct xarray *xa, unsigned long index, gfp_t gfp) +int __xa_reserve(struct xarray *xa, unsigned long index, gfp_t gfp) { XA_STATE(xas, xa, index); - unsigned int lock_type = xa_lock_type(xa); void *curr; do { - xas_lock_type(&xas, lock_type); curr = xas_load(&xas); - if (!curr) + if (!curr) { xas_store(&xas, XA_ZERO_ENTRY); - xas_unlock_type(&xas, lock_type); - } while (xas_nomem(&xas, gfp)); + if (xa_track_free(xa)) + xas_clear_mark(&xas, XA_FREE_MARK); + } + } while (__xas_nomem(&xas, gfp)); return xas_error(&xas); } -EXPORT_SYMBOL(xa_reserve); +EXPORT_SYMBOL(__xa_reserve); #ifdef CONFIG_XARRAY_MULTI static void xas_set_range(struct xa_state *xas, unsigned long first, @@ -1587,8 +1565,9 @@ void *xa_store_range(struct xarray *xa, unsigned long first, do { xas_lock(&xas); if (entry) { - unsigned int order = (last == ~0UL) ? 64 : - ilog2(last + 1); + unsigned int order = BITS_PER_LONG; + if (last + 1) + order = __ffs(last + 1); xas_set_order(&xas, last, order); xas_create(&xas); if (xas_error(&xas)) @@ -1662,7 +1641,7 @@ EXPORT_SYMBOL(__xa_alloc); * @index: Index of entry. * @mark: Mark number. * - * Attempting to set a mark on a NULL entry does not succeed. + * Attempting to set a mark on a %NULL entry does not succeed. * * Context: Any context. Expects xa_lock to be held on entry. */ @@ -1674,7 +1653,7 @@ void __xa_set_mark(struct xarray *xa, unsigned long index, xa_mark_t mark) if (entry) xas_set_mark(&xas, mark); } -EXPORT_SYMBOL_GPL(__xa_set_mark); +EXPORT_SYMBOL(__xa_set_mark); /** * __xa_clear_mark() - Clear this mark on this entry while locked. @@ -1692,7 +1671,7 @@ void __xa_clear_mark(struct xarray *xa, unsigned long index, xa_mark_t mark) if (entry) xas_clear_mark(&xas, mark); } -EXPORT_SYMBOL_GPL(__xa_clear_mark); +EXPORT_SYMBOL(__xa_clear_mark); /** * xa_get_mark() - Inquire whether this mark is set on this entry. @@ -1732,7 +1711,7 @@ EXPORT_SYMBOL(xa_get_mark); * @index: Index of entry. * @mark: Mark number. * - * Attempting to set a mark on a NULL entry does not succeed. + * Attempting to set a mark on a %NULL entry does not succeed. * * Context: Process context. Takes and releases the xa_lock. */ @@ -1829,6 +1808,8 @@ void *xa_find_after(struct xarray *xa, unsigned long *indexp, entry = xas_find_marked(&xas, max, filter); else entry = xas_find(&xas, max); + if (xas.xa_node == XAS_BOUNDS) + break; if (xas.xa_shift) { if (xas.xa_index & ((1UL << xas.xa_shift) - 1)) continue; @@ -1899,7 +1880,7 @@ static unsigned int xas_extract_marked(struct xa_state *xas, void **dst, * * The @filter may be an XArray mark value, in which case entries which are * marked with that mark will be copied. It may also be %XA_PRESENT, in - * which case all entries which are not NULL will be copied. + * which case all entries which are not %NULL will be copied. * * The entries returned may not represent a snapshot of the XArray at a * moment in time. For example, if another thread stores to index 5, then diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index c09219e7f230..5dbec21856f4 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -939,7 +939,7 @@ static int __ip_append_data(struct sock *sk, unsigned int fraglen; unsigned int fraggap; unsigned int alloclen; - unsigned int pagedlen = 0; + unsigned int pagedlen; struct sk_buff *skb_prev; alloc_new_skb: skb_prev = skb; @@ -956,6 +956,7 @@ alloc_new_skb: if (datalen > mtu - fragheaderlen) datalen = maxfraglen - fragheaderlen; fraglen = datalen + fragheaderlen; + pagedlen = 0; if ((flags & MSG_MORE) && !(rt->dst.dev->features&NETIF_F_SG)) diff --git a/net/ipv4/netfilter/ipt_MASQUERADE.c b/net/ipv4/netfilter/ipt_MASQUERADE.c index ce1512b02cb2..fd3f9e8a74da 100644 --- a/net/ipv4/netfilter/ipt_MASQUERADE.c +++ b/net/ipv4/netfilter/ipt_MASQUERADE.c @@ -81,9 +81,12 @@ static int __init masquerade_tg_init(void) int ret; ret = xt_register_target(&masquerade_tg_reg); + if (ret) + return ret; - if (ret == 0) - nf_nat_masquerade_ipv4_register_notifier(); + ret = nf_nat_masquerade_ipv4_register_notifier(); + if (ret) + xt_unregister_target(&masquerade_tg_reg); return ret; } diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c index a9d5e013e555..41327bb99093 100644 --- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c +++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c @@ -147,28 +147,50 @@ static struct notifier_block masq_inet_notifier = { .notifier_call = masq_inet_event, }; -static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0); +static int masq_refcnt; +static DEFINE_MUTEX(masq_mutex); -void nf_nat_masquerade_ipv4_register_notifier(void) +int nf_nat_masquerade_ipv4_register_notifier(void) { + int ret = 0; + + mutex_lock(&masq_mutex); /* check if the notifier was already set */ - if (atomic_inc_return(&masquerade_notifier_refcount) > 1) - return; + if (++masq_refcnt > 1) + goto out_unlock; /* Register for device down reports */ - register_netdevice_notifier(&masq_dev_notifier); + ret = register_netdevice_notifier(&masq_dev_notifier); + if (ret) + goto err_dec; /* Register IP address change reports */ - register_inetaddr_notifier(&masq_inet_notifier); + ret = register_inetaddr_notifier(&masq_inet_notifier); + if (ret) + goto err_unregister; + + mutex_unlock(&masq_mutex); + return ret; + +err_unregister: + unregister_netdevice_notifier(&masq_dev_notifier); +err_dec: + masq_refcnt--; +out_unlock: + mutex_unlock(&masq_mutex); + return ret; } EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv4_register_notifier); void nf_nat_masquerade_ipv4_unregister_notifier(void) { + mutex_lock(&masq_mutex); /* check if the notifier still has clients */ - if (atomic_dec_return(&masquerade_notifier_refcount) > 0) - return; + if (--masq_refcnt > 0) + goto out_unlock; unregister_netdevice_notifier(&masq_dev_notifier); unregister_inetaddr_notifier(&masq_inet_notifier); +out_unlock: + mutex_unlock(&masq_mutex); } EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv4_unregister_notifier); diff --git a/net/ipv4/netfilter/nft_masq_ipv4.c b/net/ipv4/netfilter/nft_masq_ipv4.c index f1193e1e928a..6847de1d1db8 100644 --- a/net/ipv4/netfilter/nft_masq_ipv4.c +++ b/net/ipv4/netfilter/nft_masq_ipv4.c @@ -69,7 +69,9 @@ static int __init nft_masq_ipv4_module_init(void) if (ret < 0) return ret; - nf_nat_masquerade_ipv4_register_notifier(); + ret = nf_nat_masquerade_ipv4_register_notifier(); + if (ret) + nft_unregister_expr(&nft_masq_ipv4_type); return ret; } diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index f32397890b6d..78752746a6e2 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -579,10 +579,12 @@ static inline void tcp_rcv_rtt_measure_ts(struct sock *sk, u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr; u32 delta_us; - if (!delta) - delta = 1; - delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ); - tcp_rcv_rtt_update(tp, delta_us, 0); + if (likely(delta < INT_MAX / (USEC_PER_SEC / TCP_TS_HZ))) { + if (!delta) + delta = 1; + delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ); + tcp_rcv_rtt_update(tp, delta_us, 0); + } } } @@ -2910,9 +2912,11 @@ static bool tcp_ack_update_rtt(struct sock *sk, const int flag, if (seq_rtt_us < 0 && tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr && flag & FLAG_ACKED) { u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr; - u32 delta_us = delta * (USEC_PER_SEC / TCP_TS_HZ); - seq_rtt_us = ca_rtt_us = delta_us; + if (likely(delta < INT_MAX / (USEC_PER_SEC / TCP_TS_HZ))) { + seq_rtt_us = delta * (USEC_PER_SEC / TCP_TS_HZ); + ca_rtt_us = seq_rtt_us; + } } rs->rtt_us = ca_rtt_us; /* RTT of last (S)ACKed packet (or -1) */ if (seq_rtt_us < 0) diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 5f8b6d3cd855..091c53925e4d 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -40,15 +40,17 @@ static u32 tcp_clamp_rto_to_user_timeout(const struct sock *sk) { struct inet_connection_sock *icsk = inet_csk(sk); u32 elapsed, start_ts; + s32 remaining; start_ts = tcp_retransmit_stamp(sk); if (!icsk->icsk_user_timeout || !start_ts) return icsk->icsk_rto; elapsed = tcp_time_stamp(tcp_sk(sk)) - start_ts; - if (elapsed >= icsk->icsk_user_timeout) + remaining = icsk->icsk_user_timeout - elapsed; + if (remaining <= 0) return 1; /* user timeout has passed; fire ASAP */ - else - return min_t(u32, icsk->icsk_rto, msecs_to_jiffies(icsk->icsk_user_timeout - elapsed)); + + return min_t(u32, icsk->icsk_rto, msecs_to_jiffies(remaining)); } /** @@ -209,7 +211,7 @@ static bool retransmits_timed_out(struct sock *sk, (boundary - linear_backoff_thresh) * TCP_RTO_MAX; timeout = jiffies_to_msecs(timeout); } - return (tcp_time_stamp(tcp_sk(sk)) - start_ts) >= timeout; + return (s32)(tcp_time_stamp(tcp_sk(sk)) - start_ts - timeout) >= 0; } /* A write timeout has occurred. Process the after effects. */ diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 89e0d5118afe..827a3f5ff3bb 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1354,7 +1354,7 @@ emsgsize: unsigned int fraglen; unsigned int fraggap; unsigned int alloclen; - unsigned int pagedlen = 0; + unsigned int pagedlen; alloc_new_skb: /* There's no room in the current skb */ if (skb) @@ -1378,6 +1378,7 @@ alloc_new_skb: if (datalen > (cork->length <= mtu && !(cork->flags & IPCORK_ALLFRAG) ? mtu : maxfraglen) - fragheaderlen) datalen = maxfraglen - fragheaderlen - rt->dst.trailer_len; fraglen = datalen + fragheaderlen; + pagedlen = 0; if ((flags & MSG_MORE) && !(rt->dst.dev->features&NETIF_F_SG)) diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c index 5ae8e1c51079..8b075f0bc351 100644 --- a/net/ipv6/netfilter.c +++ b/net/ipv6/netfilter.c @@ -24,7 +24,8 @@ int ip6_route_me_harder(struct net *net, struct sk_buff *skb) unsigned int hh_len; struct dst_entry *dst; struct flowi6 fl6 = { - .flowi6_oif = sk ? sk->sk_bound_dev_if : 0, + .flowi6_oif = sk && sk->sk_bound_dev_if ? sk->sk_bound_dev_if : + rt6_need_strict(&iph->daddr) ? skb_dst(skb)->dev->ifindex : 0, .flowi6_mark = skb->mark, .flowi6_uid = sock_net_uid(net, sk), .daddr = iph->daddr, diff --git a/net/ipv6/netfilter/ip6t_MASQUERADE.c b/net/ipv6/netfilter/ip6t_MASQUERADE.c index 491f808e356a..29c7f1915a96 100644 --- a/net/ipv6/netfilter/ip6t_MASQUERADE.c +++ b/net/ipv6/netfilter/ip6t_MASQUERADE.c @@ -58,8 +58,12 @@ static int __init masquerade_tg6_init(void) int err; err = xt_register_target(&masquerade_tg6_reg); - if (err == 0) - nf_nat_masquerade_ipv6_register_notifier(); + if (err) + return err; + + err = nf_nat_masquerade_ipv6_register_notifier(); + if (err) + xt_unregister_target(&masquerade_tg6_reg); return err; } diff --git a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c index 3e4bf2286abe..0ad0da5a2600 100644 --- a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c +++ b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c @@ -132,8 +132,8 @@ static void iterate_cleanup_work(struct work_struct *work) * of ipv6 addresses being deleted), we also need to add an upper * limit to the number of queued work items. */ -static int masq_inet_event(struct notifier_block *this, - unsigned long event, void *ptr) +static int masq_inet6_event(struct notifier_block *this, + unsigned long event, void *ptr) { struct inet6_ifaddr *ifa = ptr; const struct net_device *dev; @@ -171,30 +171,53 @@ static int masq_inet_event(struct notifier_block *this, return NOTIFY_DONE; } -static struct notifier_block masq_inet_notifier = { - .notifier_call = masq_inet_event, +static struct notifier_block masq_inet6_notifier = { + .notifier_call = masq_inet6_event, }; -static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0); +static int masq_refcnt; +static DEFINE_MUTEX(masq_mutex); -void nf_nat_masquerade_ipv6_register_notifier(void) +int nf_nat_masquerade_ipv6_register_notifier(void) { + int ret = 0; + + mutex_lock(&masq_mutex); /* check if the notifier is already set */ - if (atomic_inc_return(&masquerade_notifier_refcount) > 1) - return; + if (++masq_refcnt > 1) + goto out_unlock; + + ret = register_netdevice_notifier(&masq_dev_notifier); + if (ret) + goto err_dec; + + ret = register_inet6addr_notifier(&masq_inet6_notifier); + if (ret) + goto err_unregister; - register_netdevice_notifier(&masq_dev_notifier); - register_inet6addr_notifier(&masq_inet_notifier); + mutex_unlock(&masq_mutex); + return ret; + +err_unregister: + unregister_netdevice_notifier(&masq_dev_notifier); +err_dec: + masq_refcnt--; +out_unlock: + mutex_unlock(&masq_mutex); + return ret; } EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv6_register_notifier); void nf_nat_masquerade_ipv6_unregister_notifier(void) { + mutex_lock(&masq_mutex); /* check if the notifier still has clients */ - if (atomic_dec_return(&masquerade_notifier_refcount) > 0) - return; + if (--masq_refcnt > 0) + goto out_unlock; - unregister_inet6addr_notifier(&masq_inet_notifier); + unregister_inet6addr_notifier(&masq_inet6_notifier); unregister_netdevice_notifier(&masq_dev_notifier); +out_unlock: + mutex_unlock(&masq_mutex); } EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv6_unregister_notifier); diff --git a/net/ipv6/netfilter/nft_masq_ipv6.c b/net/ipv6/netfilter/nft_masq_ipv6.c index dd0122f3cffe..e06c82e9dfcd 100644 --- a/net/ipv6/netfilter/nft_masq_ipv6.c +++ b/net/ipv6/netfilter/nft_masq_ipv6.c @@ -70,7 +70,9 @@ static int __init nft_masq_ipv6_module_init(void) if (ret < 0) return ret; - nf_nat_masquerade_ipv6_register_notifier(); + ret = nf_nat_masquerade_ipv6_register_notifier(); + if (ret) + nft_unregister_expr(&nft_masq_ipv6_type); return ret; } diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index 83395bf6dc35..432141f04af3 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -3980,6 +3980,9 @@ static void __net_exit ip_vs_control_net_cleanup_sysctl(struct netns_ipvs *ipvs) static struct notifier_block ip_vs_dst_notifier = { .notifier_call = ip_vs_dst_event, +#ifdef CONFIG_IP_VS_IPV6 + .priority = ADDRCONF_NOTIFY_PRIORITY + 5, +#endif }; int __net_init ip_vs_control_net_init(struct netns_ipvs *ipvs) diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c index 02ca7df793f5..b6d0f6deea86 100644 --- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -49,6 +49,7 @@ struct nf_conncount_tuple { struct nf_conntrack_zone zone; int cpu; u32 jiffies32; + bool dead; struct rcu_head rcu_head; }; @@ -106,15 +107,16 @@ nf_conncount_add(struct nf_conncount_list *list, conn->zone = *zone; conn->cpu = raw_smp_processor_id(); conn->jiffies32 = (u32)jiffies; - spin_lock(&list->list_lock); + conn->dead = false; + spin_lock_bh(&list->list_lock); if (list->dead == true) { kmem_cache_free(conncount_conn_cachep, conn); - spin_unlock(&list->list_lock); + spin_unlock_bh(&list->list_lock); return NF_CONNCOUNT_SKIP; } list_add_tail(&conn->node, &list->head); list->count++; - spin_unlock(&list->list_lock); + spin_unlock_bh(&list->list_lock); return NF_CONNCOUNT_ADDED; } EXPORT_SYMBOL_GPL(nf_conncount_add); @@ -132,19 +134,22 @@ static bool conn_free(struct nf_conncount_list *list, { bool free_entry = false; - spin_lock(&list->list_lock); + spin_lock_bh(&list->list_lock); - if (list->count == 0) { - spin_unlock(&list->list_lock); - return free_entry; + if (conn->dead) { + spin_unlock_bh(&list->list_lock); + return free_entry; } list->count--; + conn->dead = true; list_del_rcu(&conn->node); - if (list->count == 0) + if (list->count == 0) { + list->dead = true; free_entry = true; + } - spin_unlock(&list->list_lock); + spin_unlock_bh(&list->list_lock); call_rcu(&conn->rcu_head, __conn_free); return free_entry; } @@ -245,7 +250,7 @@ void nf_conncount_list_init(struct nf_conncount_list *list) { spin_lock_init(&list->list_lock); INIT_LIST_HEAD(&list->head); - list->count = 1; + list->count = 0; list->dead = false; } EXPORT_SYMBOL_GPL(nf_conncount_list_init); @@ -259,6 +264,7 @@ bool nf_conncount_gc_list(struct net *net, struct nf_conn *found_ct; unsigned int collected = 0; bool free_entry = false; + bool ret = false; list_for_each_entry_safe(conn, conn_n, &list->head, node) { found = find_or_evict(net, list, conn, &free_entry); @@ -288,7 +294,15 @@ bool nf_conncount_gc_list(struct net *net, if (collected > CONNCOUNT_GC_MAX_NODES) return false; } - return false; + + spin_lock_bh(&list->list_lock); + if (!list->count) { + list->dead = true; + ret = true; + } + spin_unlock_bh(&list->list_lock); + + return ret; } EXPORT_SYMBOL_GPL(nf_conncount_gc_list); @@ -309,11 +323,8 @@ static void tree_nodes_free(struct rb_root *root, while (gc_count) { rbconn = gc_nodes[--gc_count]; spin_lock(&rbconn->list.list_lock); - if (rbconn->list.count == 0 && rbconn->list.dead == false) { - rbconn->list.dead = true; - rb_erase(&rbconn->node, root); - call_rcu(&rbconn->rcu_head, __tree_nodes_free); - } + rb_erase(&rbconn->node, root); + call_rcu(&rbconn->rcu_head, __tree_nodes_free); spin_unlock(&rbconn->list.list_lock); } } @@ -414,6 +425,7 @@ insert_tree(struct net *net, nf_conncount_list_init(&rbconn->list); list_add(&conn->node, &rbconn->list.head); count = 1; + rbconn->list.count = count; rb_link_node(&rbconn->node, parent, rbnode); rb_insert_color(&rbconn->node, root); diff --git a/net/netfilter/nf_conntrack_proto_gre.c b/net/netfilter/nf_conntrack_proto_gre.c index 9b48dc8b4b88..2a5e56c6d8d9 100644 --- a/net/netfilter/nf_conntrack_proto_gre.c +++ b/net/netfilter/nf_conntrack_proto_gre.c @@ -43,24 +43,12 @@ #include <linux/netfilter/nf_conntrack_proto_gre.h> #include <linux/netfilter/nf_conntrack_pptp.h> -enum grep_conntrack { - GRE_CT_UNREPLIED, - GRE_CT_REPLIED, - GRE_CT_MAX -}; - static const unsigned int gre_timeouts[GRE_CT_MAX] = { [GRE_CT_UNREPLIED] = 30*HZ, [GRE_CT_REPLIED] = 180*HZ, }; static unsigned int proto_gre_net_id __read_mostly; -struct netns_proto_gre { - struct nf_proto_net nf; - rwlock_t keymap_lock; - struct list_head keymap_list; - unsigned int gre_timeouts[GRE_CT_MAX]; -}; static inline struct netns_proto_gre *gre_pernet(struct net *net) { @@ -402,6 +390,8 @@ static int __init nf_ct_proto_gre_init(void) { int ret; + BUILD_BUG_ON(offsetof(struct netns_proto_gre, nf) != 0); + ret = register_pernet_subsys(&proto_gre_net_ops); if (ret < 0) goto out_pernet; diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 42487d01a3ed..2e61aab6ed73 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -2457,7 +2457,7 @@ err: static void nf_tables_rule_destroy(const struct nft_ctx *ctx, struct nft_rule *rule) { - struct nft_expr *expr; + struct nft_expr *expr, *next; /* * Careful: some expressions might not be initialized in case this @@ -2465,8 +2465,9 @@ static void nf_tables_rule_destroy(const struct nft_ctx *ctx, */ expr = nft_expr_first(rule); while (expr != nft_expr_last(rule) && expr->ops) { + next = nft_expr_next(expr); nf_tables_expr_destroy(ctx, expr); - expr = nft_expr_next(expr); + expr = next; } kfree(rule); } @@ -2589,17 +2590,14 @@ static int nf_tables_newrule(struct net *net, struct sock *nlsk, if (chain->use == UINT_MAX) return -EOVERFLOW; - } - - if (nla[NFTA_RULE_POSITION]) { - if (!(nlh->nlmsg_flags & NLM_F_CREATE)) - return -EOPNOTSUPP; - pos_handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_POSITION])); - old_rule = __nft_rule_lookup(chain, pos_handle); - if (IS_ERR(old_rule)) { - NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_POSITION]); - return PTR_ERR(old_rule); + if (nla[NFTA_RULE_POSITION]) { + pos_handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_POSITION])); + old_rule = __nft_rule_lookup(chain, pos_handle); + if (IS_ERR(old_rule)) { + NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_POSITION]); + return PTR_ERR(old_rule); + } } } @@ -2669,21 +2667,14 @@ static int nf_tables_newrule(struct net *net, struct sock *nlsk, } if (nlh->nlmsg_flags & NLM_F_REPLACE) { - if (!nft_is_active_next(net, old_rule)) { - err = -ENOENT; - goto err2; - } - trans = nft_trans_rule_add(&ctx, NFT_MSG_DELRULE, - old_rule); + trans = nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule); if (trans == NULL) { err = -ENOMEM; goto err2; } - nft_deactivate_next(net, old_rule); - chain->use--; - - if (nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule) == NULL) { - err = -ENOMEM; + err = nft_delrule(&ctx, old_rule); + if (err < 0) { + nft_trans_destroy(trans); goto err2; } @@ -6324,7 +6315,7 @@ static void nf_tables_commit_chain_free_rules_old(struct nft_rule **rules) call_rcu(&old->h, __nf_tables_commit_chain_free_rules_old); } -static void nf_tables_commit_chain_active(struct net *net, struct nft_chain *chain) +static void nf_tables_commit_chain(struct net *net, struct nft_chain *chain) { struct nft_rule **g0, **g1; bool next_genbit; @@ -6441,11 +6432,8 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) /* step 2. Make rules_gen_X visible to packet path */ list_for_each_entry(table, &net->nft.tables, list) { - list_for_each_entry(chain, &table->chains, list) { - if (!nft_is_active_next(net, chain)) - continue; - nf_tables_commit_chain_active(net, chain); - } + list_for_each_entry(chain, &table->chains, list) + nf_tables_commit_chain(net, chain); } /* diff --git a/net/netfilter/nfnetlink_cttimeout.c b/net/netfilter/nfnetlink_cttimeout.c index a518eb162344..109b0d27345a 100644 --- a/net/netfilter/nfnetlink_cttimeout.c +++ b/net/netfilter/nfnetlink_cttimeout.c @@ -455,7 +455,8 @@ static int cttimeout_default_get(struct net *net, struct sock *ctnl, case IPPROTO_TCP: timeouts = nf_tcp_pernet(net)->timeouts; break; - case IPPROTO_UDP: + case IPPROTO_UDP: /* fallthrough */ + case IPPROTO_UDPLITE: timeouts = nf_udp_pernet(net)->timeouts; break; case IPPROTO_DCCP: @@ -471,11 +472,21 @@ static int cttimeout_default_get(struct net *net, struct sock *ctnl, timeouts = nf_sctp_pernet(net)->timeouts; #endif break; + case IPPROTO_GRE: +#ifdef CONFIG_NF_CT_PROTO_GRE + if (l4proto->net_id) { + struct netns_proto_gre *net_gre; + + net_gre = net_generic(net, *l4proto->net_id); + timeouts = net_gre->gre_timeouts; + } +#endif + break; case 255: timeouts = &nf_generic_pernet(net)->timeout; break; default: - WARN_ON_ONCE(1); + WARN_ONCE(1, "Missing timeouts for proto %d", l4proto->l4proto); break; } diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c index 9d0ede474224..7334e0b80a5e 100644 --- a/net/netfilter/nft_compat.c +++ b/net/netfilter/nft_compat.c @@ -520,6 +520,7 @@ __nft_match_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr, void *info) { struct xt_match *match = expr->ops->data; + struct module *me = match->me; struct xt_mtdtor_param par; par.net = ctx->net; @@ -530,7 +531,7 @@ __nft_match_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr, par.match->destroy(&par); if (nft_xt_put(container_of(expr->ops, struct nft_xt, ops))) - module_put(match->me); + module_put(me); } static void diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c index e82d9a966c45..974525eb92df 100644 --- a/net/netfilter/nft_flow_offload.c +++ b/net/netfilter/nft_flow_offload.c @@ -214,7 +214,9 @@ static int __init nft_flow_offload_module_init(void) { int err; - register_netdevice_notifier(&flow_offload_netdev_notifier); + err = register_netdevice_notifier(&flow_offload_netdev_notifier); + if (err) + goto err; err = nft_register_expr(&nft_flow_offload_type); if (err < 0) @@ -224,6 +226,7 @@ static int __init nft_flow_offload_module_init(void) register_expr: unregister_netdevice_notifier(&flow_offload_netdev_notifier); +err: return err; } diff --git a/net/netfilter/xt_RATEEST.c b/net/netfilter/xt_RATEEST.c index dec843cadf46..9e05c86ba5c4 100644 --- a/net/netfilter/xt_RATEEST.c +++ b/net/netfilter/xt_RATEEST.c @@ -201,18 +201,8 @@ static __net_init int xt_rateest_net_init(struct net *net) return 0; } -static void __net_exit xt_rateest_net_exit(struct net *net) -{ - struct xt_rateest_net *xn = net_generic(net, xt_rateest_id); - int i; - - for (i = 0; i < ARRAY_SIZE(xn->hash); i++) - WARN_ON_ONCE(!hlist_empty(&xn->hash[i])); -} - static struct pernet_operations xt_rateest_net_ops = { .init = xt_rateest_net_init, - .exit = xt_rateest_net_exit, .id = &xt_rateest_id, .size = sizeof(struct xt_rateest_net), }; diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c index 3e7d259e5d8d..1ad4017f9b73 100644 --- a/net/netfilter/xt_hashlimit.c +++ b/net/netfilter/xt_hashlimit.c @@ -295,9 +295,10 @@ static int htable_create(struct net *net, struct hashlimit_cfg3 *cfg, /* copy match config into hashtable config */ ret = cfg_copy(&hinfo->cfg, (void *)cfg, 3); - - if (ret) + if (ret) { + vfree(hinfo); return ret; + } hinfo->cfg.size = size; if (hinfo->cfg.max == 0) @@ -814,7 +815,6 @@ hashlimit_mt_v1(const struct sk_buff *skb, struct xt_action_param *par) int ret; ret = cfg_copy(&cfg, (void *)&info->cfg, 1); - if (ret) return ret; @@ -830,7 +830,6 @@ hashlimit_mt_v2(const struct sk_buff *skb, struct xt_action_param *par) int ret; ret = cfg_copy(&cfg, (void *)&info->cfg, 2); - if (ret) return ret; @@ -921,7 +920,6 @@ static int hashlimit_mt_check_v1(const struct xt_mtchk_param *par) return ret; ret = cfg_copy(&cfg, (void *)&info->cfg, 1); - if (ret) return ret; @@ -940,7 +938,6 @@ static int hashlimit_mt_check_v2(const struct xt_mtchk_param *par) return ret; ret = cfg_copy(&cfg, (void *)&info->cfg, 2); - if (ret) return ret; diff --git a/net/sctp/output.c b/net/sctp/output.c index b0e74a3e77ec..025f48e14a91 100644 --- a/net/sctp/output.c +++ b/net/sctp/output.c @@ -410,6 +410,7 @@ static void sctp_packet_gso_append(struct sk_buff *head, struct sk_buff *skb) head->truesize += skb->truesize; head->data_len += skb->len; head->len += skb->len; + refcount_add(skb->truesize, &head->sk->sk_wmem_alloc); __skb_header_release(skb); } diff --git a/net/tipc/node.c b/net/tipc/node.c index 2afc4f8c37a7..488019766433 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -584,12 +584,15 @@ static void tipc_node_clear_links(struct tipc_node *node) /* tipc_node_cleanup - delete nodes that does not * have active links for NODE_CLEANUP_AFTER time */ -static int tipc_node_cleanup(struct tipc_node *peer) +static bool tipc_node_cleanup(struct tipc_node *peer) { struct tipc_net *tn = tipc_net(peer->net); bool deleted = false; - spin_lock_bh(&tn->node_list_lock); + /* If lock held by tipc_node_stop() the node will be deleted anyway */ + if (!spin_trylock_bh(&tn->node_list_lock)) + return false; + tipc_node_write_lock(peer); if (!node_is_up(peer) && time_after(jiffies, peer->delete_at)) { diff --git a/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst b/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst index edbe81534c6d..d07ccf8a23f7 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst @@ -137,4 +137,10 @@ EXAMPLES SEE ALSO ======== - **bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-map**\ (8) + **bpf**\ (2), + **bpf-helpers**\ (7), + **bpftool**\ (8), + **bpftool-prog**\ (8), + **bpftool-map**\ (8), + **bpftool-net**\ (8), + **bpftool-perf**\ (8) diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst index 9e827e342d9e..5318dcb2085e 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-map.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst @@ -172,4 +172,10 @@ The following three commands are equivalent: SEE ALSO ======== - **bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-cgroup**\ (8) + **bpf**\ (2), + **bpf-helpers**\ (7), + **bpftool**\ (8), + **bpftool-prog**\ (8), + **bpftool-cgroup**\ (8), + **bpftool-net**\ (8), + **bpftool-perf**\ (8) diff --git a/tools/bpf/bpftool/Documentation/bpftool-net.rst b/tools/bpf/bpftool/Documentation/bpftool-net.rst index 408ec30d8872..ed87c9b619ad 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-net.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-net.rst @@ -136,4 +136,10 @@ EXAMPLES SEE ALSO ======== - **bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-map**\ (8) + **bpf**\ (2), + **bpf-helpers**\ (7), + **bpftool**\ (8), + **bpftool-prog**\ (8), + **bpftool-map**\ (8), + **bpftool-cgroup**\ (8), + **bpftool-perf**\ (8) diff --git a/tools/bpf/bpftool/Documentation/bpftool-perf.rst b/tools/bpf/bpftool/Documentation/bpftool-perf.rst index e3eb0eab7641..f4c5e5538bb8 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-perf.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-perf.rst @@ -78,4 +78,10 @@ EXAMPLES SEE ALSO ======== - **bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-map**\ (8) + **bpf**\ (2), + **bpf-helpers**\ (7), + **bpftool**\ (8), + **bpftool-prog**\ (8), + **bpftool-map**\ (8), + **bpftool-cgroup**\ (8), + **bpftool-net**\ (8) diff --git a/tools/bpf/bpftool/Documentation/bpftool-prog.rst b/tools/bpf/bpftool/Documentation/bpftool-prog.rst index 8db78ed82a71..ab36e920e552 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-prog.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-prog.rst @@ -136,7 +136,8 @@ OPTIONS Generate human-readable JSON output. Implies **-j**. -f, --bpffs - Show file names of pinned programs. + When showing BPF programs, show file names of pinned + programs. EXAMPLES ======== @@ -218,4 +219,10 @@ EXAMPLES SEE ALSO ======== - **bpftool**\ (8), **bpftool-map**\ (8), **bpftool-cgroup**\ (8) + **bpf**\ (2), + **bpf-helpers**\ (7), + **bpftool**\ (8), + **bpftool-map**\ (8), + **bpftool-cgroup**\ (8), + **bpftool-net**\ (8), + **bpftool-perf**\ (8) diff --git a/tools/bpf/bpftool/Documentation/bpftool.rst b/tools/bpf/bpftool/Documentation/bpftool.rst index 04cd4f92ab89..129b7a9c0f9b 100644 --- a/tools/bpf/bpftool/Documentation/bpftool.rst +++ b/tools/bpf/bpftool/Documentation/bpftool.rst @@ -63,5 +63,10 @@ OPTIONS SEE ALSO ======== - **bpftool-map**\ (8), **bpftool-prog**\ (8), **bpftool-cgroup**\ (8) - **bpftool-perf**\ (8), **bpftool-net**\ (8) + **bpf**\ (2), + **bpf-helpers**\ (7), + **bpftool-prog**\ (8), + **bpftool-map**\ (8), + **bpftool-cgroup**\ (8), + **bpftool-net**\ (8), + **bpftool-perf**\ (8) diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c index cb06a5b6e016..4e217d57118e 100644 --- a/tools/bpf/bpftool/common.c +++ b/tools/bpf/bpftool/common.c @@ -138,16 +138,17 @@ static int mnt_bpffs(const char *target, char *buff, size_t bufflen) return 0; } -int open_obj_pinned(char *path) +int open_obj_pinned(char *path, bool quiet) { int fd; fd = bpf_obj_get(path); if (fd < 0) { - p_err("bpf obj get (%s): %s", path, - errno == EACCES && !is_bpffs(dirname(path)) ? - "directory not in bpf file system (bpffs)" : - strerror(errno)); + if (!quiet) + p_err("bpf obj get (%s): %s", path, + errno == EACCES && !is_bpffs(dirname(path)) ? + "directory not in bpf file system (bpffs)" : + strerror(errno)); return -1; } @@ -159,7 +160,7 @@ int open_obj_pinned_any(char *path, enum bpf_obj_type exp_type) enum bpf_obj_type type; int fd; - fd = open_obj_pinned(path); + fd = open_obj_pinned(path, false); if (fd < 0) return -1; @@ -311,7 +312,7 @@ char *get_fdinfo(int fd, const char *key) return NULL; } - while ((n = getline(&line, &line_n, fdi))) { + while ((n = getline(&line, &line_n, fdi)) > 0) { char *value; int len; @@ -391,7 +392,7 @@ int build_pinned_obj_table(struct pinned_obj_table *tab, while ((ftse = fts_read(fts))) { if (!(ftse->fts_info & FTS_F)) continue; - fd = open_obj_pinned(ftse->fts_path); + fd = open_obj_pinned(ftse->fts_path, true); if (fd < 0) continue; diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h index 3e8979567cf1..7431669fae0a 100644 --- a/tools/bpf/bpftool/main.h +++ b/tools/bpf/bpftool/main.h @@ -129,7 +129,7 @@ int cmd_select(const struct cmd *cmds, int argc, char **argv, int get_fd_type(int fd); const char *get_fd_type_name(enum bpf_obj_type type); char *get_fdinfo(int fd, const char *key); -int open_obj_pinned(char *path); +int open_obj_pinned(char *path, bool quiet); int open_obj_pinned_any(char *path, enum bpf_obj_type exp_type); int mount_bpffs_for_pin(const char *name); int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32)); diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 37b1daf19da6..cb18429818ec 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -359,10 +359,9 @@ static void print_prog_plain(struct bpf_prog_info *info, int fd) if (!hash_empty(prog_table.table)) { struct pinned_obj *obj; - printf("\n"); hash_for_each_possible(prog_table.table, obj, hash, info->id) { if (obj->id == info->id) - printf("\tpinned %s\n", obj->path); + printf("\n\tpinned %s", obj->path); } } @@ -912,6 +911,7 @@ static int load_with_options(int argc, char **argv, bool first_prog_only) } NEXT_ARG(); } else if (is_prefix(*argv, "map")) { + void *new_map_replace; char *endptr, *name; int fd; @@ -945,12 +945,15 @@ static int load_with_options(int argc, char **argv, bool first_prog_only) if (fd < 0) goto err_free_reuse_maps; - map_replace = reallocarray(map_replace, old_map_fds + 1, - sizeof(*map_replace)); - if (!map_replace) { + new_map_replace = reallocarray(map_replace, + old_map_fds + 1, + sizeof(*map_replace)); + if (!new_map_replace) { p_err("mem alloc failed"); goto err_free_reuse_maps; } + map_replace = new_map_replace; + map_replace[old_map_fds].idx = idx; map_replace[old_map_fds].name = name; map_replace[old_map_fds].fd = fd; diff --git a/tools/include/uapi/linux/pkt_cls.h b/tools/include/uapi/linux/pkt_cls.h new file mode 100644 index 000000000000..401d0c1e612d --- /dev/null +++ b/tools/include/uapi/linux/pkt_cls.h @@ -0,0 +1,612 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef __LINUX_PKT_CLS_H +#define __LINUX_PKT_CLS_H + +#include <linux/types.h> +#include <linux/pkt_sched.h> + +#define TC_COOKIE_MAX_SIZE 16 + +/* Action attributes */ +enum { + TCA_ACT_UNSPEC, + TCA_ACT_KIND, + TCA_ACT_OPTIONS, + TCA_ACT_INDEX, + TCA_ACT_STATS, + TCA_ACT_PAD, + TCA_ACT_COOKIE, + __TCA_ACT_MAX +}; + +#define TCA_ACT_MAX __TCA_ACT_MAX +#define TCA_OLD_COMPAT (TCA_ACT_MAX+1) +#define TCA_ACT_MAX_PRIO 32 +#define TCA_ACT_BIND 1 +#define TCA_ACT_NOBIND 0 +#define TCA_ACT_UNBIND 1 +#define TCA_ACT_NOUNBIND 0 +#define TCA_ACT_REPLACE 1 +#define TCA_ACT_NOREPLACE 0 + +#define TC_ACT_UNSPEC (-1) +#define TC_ACT_OK 0 +#define TC_ACT_RECLASSIFY 1 +#define TC_ACT_SHOT 2 +#define TC_ACT_PIPE 3 +#define TC_ACT_STOLEN 4 +#define TC_ACT_QUEUED 5 +#define TC_ACT_REPEAT 6 +#define TC_ACT_REDIRECT 7 +#define TC_ACT_TRAP 8 /* For hw path, this means "trap to cpu" + * and don't further process the frame + * in hardware. For sw path, this is + * equivalent of TC_ACT_STOLEN - drop + * the skb and act like everything + * is alright. + */ +#define TC_ACT_VALUE_MAX TC_ACT_TRAP + +/* There is a special kind of actions called "extended actions", + * which need a value parameter. These have a local opcode located in + * the highest nibble, starting from 1. The rest of the bits + * are used to carry the value. These two parts together make + * a combined opcode. + */ +#define __TC_ACT_EXT_SHIFT 28 +#define __TC_ACT_EXT(local) ((local) << __TC_ACT_EXT_SHIFT) +#define TC_ACT_EXT_VAL_MASK ((1 << __TC_ACT_EXT_SHIFT) - 1) +#define TC_ACT_EXT_OPCODE(combined) ((combined) & (~TC_ACT_EXT_VAL_MASK)) +#define TC_ACT_EXT_CMP(combined, opcode) (TC_ACT_EXT_OPCODE(combined) == opcode) + +#define TC_ACT_JUMP __TC_ACT_EXT(1) +#define TC_ACT_GOTO_CHAIN __TC_ACT_EXT(2) +#define TC_ACT_EXT_OPCODE_MAX TC_ACT_GOTO_CHAIN + +/* Action type identifiers*/ +enum { + TCA_ID_UNSPEC=0, + TCA_ID_POLICE=1, + /* other actions go here */ + __TCA_ID_MAX=255 +}; + +#define TCA_ID_MAX __TCA_ID_MAX + +struct tc_police { + __u32 index; + int action; +#define TC_POLICE_UNSPEC TC_ACT_UNSPEC +#define TC_POLICE_OK TC_ACT_OK +#define TC_POLICE_RECLASSIFY TC_ACT_RECLASSIFY +#define TC_POLICE_SHOT TC_ACT_SHOT +#define TC_POLICE_PIPE TC_ACT_PIPE + + __u32 limit; + __u32 burst; + __u32 mtu; + struct tc_ratespec rate; + struct tc_ratespec peakrate; + int refcnt; + int bindcnt; + __u32 capab; +}; + +struct tcf_t { + __u64 install; + __u64 lastuse; + __u64 expires; + __u64 firstuse; +}; + +struct tc_cnt { + int refcnt; + int bindcnt; +}; + +#define tc_gen \ + __u32 index; \ + __u32 capab; \ + int action; \ + int refcnt; \ + int bindcnt + +enum { + TCA_POLICE_UNSPEC, + TCA_POLICE_TBF, + TCA_POLICE_RATE, + TCA_POLICE_PEAKRATE, + TCA_POLICE_AVRATE, + TCA_POLICE_RESULT, + TCA_POLICE_TM, + TCA_POLICE_PAD, + __TCA_POLICE_MAX +#define TCA_POLICE_RESULT TCA_POLICE_RESULT +}; + +#define TCA_POLICE_MAX (__TCA_POLICE_MAX - 1) + +/* tca flags definitions */ +#define TCA_CLS_FLAGS_SKIP_HW (1 << 0) /* don't offload filter to HW */ +#define TCA_CLS_FLAGS_SKIP_SW (1 << 1) /* don't use filter in SW */ +#define TCA_CLS_FLAGS_IN_HW (1 << 2) /* filter is offloaded to HW */ +#define TCA_CLS_FLAGS_NOT_IN_HW (1 << 3) /* filter isn't offloaded to HW */ +#define TCA_CLS_FLAGS_VERBOSE (1 << 4) /* verbose logging */ + +/* U32 filters */ + +#define TC_U32_HTID(h) ((h)&0xFFF00000) +#define TC_U32_USERHTID(h) (TC_U32_HTID(h)>>20) +#define TC_U32_HASH(h) (((h)>>12)&0xFF) +#define TC_U32_NODE(h) ((h)&0xFFF) +#define TC_U32_KEY(h) ((h)&0xFFFFF) +#define TC_U32_UNSPEC 0 +#define TC_U32_ROOT (0xFFF00000) + +enum { + TCA_U32_UNSPEC, + TCA_U32_CLASSID, + TCA_U32_HASH, + TCA_U32_LINK, + TCA_U32_DIVISOR, + TCA_U32_SEL, + TCA_U32_POLICE, + TCA_U32_ACT, + TCA_U32_INDEV, + TCA_U32_PCNT, + TCA_U32_MARK, + TCA_U32_FLAGS, + TCA_U32_PAD, + __TCA_U32_MAX +}; + +#define TCA_U32_MAX (__TCA_U32_MAX - 1) + +struct tc_u32_key { + __be32 mask; + __be32 val; + int off; + int offmask; +}; + +struct tc_u32_sel { + unsigned char flags; + unsigned char offshift; + unsigned char nkeys; + + __be16 offmask; + __u16 off; + short offoff; + + short hoff; + __be32 hmask; + struct tc_u32_key keys[0]; +}; + +struct tc_u32_mark { + __u32 val; + __u32 mask; + __u32 success; +}; + +struct tc_u32_pcnt { + __u64 rcnt; + __u64 rhit; + __u64 kcnts[0]; +}; + +/* Flags */ + +#define TC_U32_TERMINAL 1 +#define TC_U32_OFFSET 2 +#define TC_U32_VAROFFSET 4 +#define TC_U32_EAT 8 + +#define TC_U32_MAXDEPTH 8 + + +/* RSVP filter */ + +enum { + TCA_RSVP_UNSPEC, + TCA_RSVP_CLASSID, + TCA_RSVP_DST, + TCA_RSVP_SRC, + TCA_RSVP_PINFO, + TCA_RSVP_POLICE, + TCA_RSVP_ACT, + __TCA_RSVP_MAX +}; + +#define TCA_RSVP_MAX (__TCA_RSVP_MAX - 1 ) + +struct tc_rsvp_gpi { + __u32 key; + __u32 mask; + int offset; +}; + +struct tc_rsvp_pinfo { + struct tc_rsvp_gpi dpi; + struct tc_rsvp_gpi spi; + __u8 protocol; + __u8 tunnelid; + __u8 tunnelhdr; + __u8 pad; +}; + +/* ROUTE filter */ + +enum { + TCA_ROUTE4_UNSPEC, + TCA_ROUTE4_CLASSID, + TCA_ROUTE4_TO, + TCA_ROUTE4_FROM, + TCA_ROUTE4_IIF, + TCA_ROUTE4_POLICE, + TCA_ROUTE4_ACT, + __TCA_ROUTE4_MAX +}; + +#define TCA_ROUTE4_MAX (__TCA_ROUTE4_MAX - 1) + + +/* FW filter */ + +enum { + TCA_FW_UNSPEC, + TCA_FW_CLASSID, + TCA_FW_POLICE, + TCA_FW_INDEV, /* used by CONFIG_NET_CLS_IND */ + TCA_FW_ACT, /* used by CONFIG_NET_CLS_ACT */ + TCA_FW_MASK, + __TCA_FW_MAX +}; + +#define TCA_FW_MAX (__TCA_FW_MAX - 1) + +/* TC index filter */ + +enum { + TCA_TCINDEX_UNSPEC, + TCA_TCINDEX_HASH, + TCA_TCINDEX_MASK, + TCA_TCINDEX_SHIFT, + TCA_TCINDEX_FALL_THROUGH, + TCA_TCINDEX_CLASSID, + TCA_TCINDEX_POLICE, + TCA_TCINDEX_ACT, + __TCA_TCINDEX_MAX +}; + +#define TCA_TCINDEX_MAX (__TCA_TCINDEX_MAX - 1) + +/* Flow filter */ + +enum { + FLOW_KEY_SRC, + FLOW_KEY_DST, + FLOW_KEY_PROTO, + FLOW_KEY_PROTO_SRC, + FLOW_KEY_PROTO_DST, + FLOW_KEY_IIF, + FLOW_KEY_PRIORITY, + FLOW_KEY_MARK, + FLOW_KEY_NFCT, + FLOW_KEY_NFCT_SRC, + FLOW_KEY_NFCT_DST, + FLOW_KEY_NFCT_PROTO_SRC, + FLOW_KEY_NFCT_PROTO_DST, + FLOW_KEY_RTCLASSID, + FLOW_KEY_SKUID, + FLOW_KEY_SKGID, + FLOW_KEY_VLAN_TAG, + FLOW_KEY_RXHASH, + __FLOW_KEY_MAX, +}; + +#define FLOW_KEY_MAX (__FLOW_KEY_MAX - 1) + +enum { + FLOW_MODE_MAP, + FLOW_MODE_HASH, +}; + +enum { + TCA_FLOW_UNSPEC, + TCA_FLOW_KEYS, + TCA_FLOW_MODE, + TCA_FLOW_BASECLASS, + TCA_FLOW_RSHIFT, + TCA_FLOW_ADDEND, + TCA_FLOW_MASK, + TCA_FLOW_XOR, + TCA_FLOW_DIVISOR, + TCA_FLOW_ACT, + TCA_FLOW_POLICE, + TCA_FLOW_EMATCHES, + TCA_FLOW_PERTURB, + __TCA_FLOW_MAX +}; + +#define TCA_FLOW_MAX (__TCA_FLOW_MAX - 1) + +/* Basic filter */ + +enum { + TCA_BASIC_UNSPEC, + TCA_BASIC_CLASSID, + TCA_BASIC_EMATCHES, + TCA_BASIC_ACT, + TCA_BASIC_POLICE, + __TCA_BASIC_MAX +}; + +#define TCA_BASIC_MAX (__TCA_BASIC_MAX - 1) + + +/* Cgroup classifier */ + +enum { + TCA_CGROUP_UNSPEC, + TCA_CGROUP_ACT, + TCA_CGROUP_POLICE, + TCA_CGROUP_EMATCHES, + __TCA_CGROUP_MAX, +}; + +#define TCA_CGROUP_MAX (__TCA_CGROUP_MAX - 1) + +/* BPF classifier */ + +#define TCA_BPF_FLAG_ACT_DIRECT (1 << 0) + +enum { + TCA_BPF_UNSPEC, + TCA_BPF_ACT, + TCA_BPF_POLICE, + TCA_BPF_CLASSID, + TCA_BPF_OPS_LEN, + TCA_BPF_OPS, + TCA_BPF_FD, + TCA_BPF_NAME, + TCA_BPF_FLAGS, + TCA_BPF_FLAGS_GEN, + TCA_BPF_TAG, + TCA_BPF_ID, + __TCA_BPF_MAX, +}; + +#define TCA_BPF_MAX (__TCA_BPF_MAX - 1) + +/* Flower classifier */ + +enum { + TCA_FLOWER_UNSPEC, + TCA_FLOWER_CLASSID, + TCA_FLOWER_INDEV, + TCA_FLOWER_ACT, + TCA_FLOWER_KEY_ETH_DST, /* ETH_ALEN */ + TCA_FLOWER_KEY_ETH_DST_MASK, /* ETH_ALEN */ + TCA_FLOWER_KEY_ETH_SRC, /* ETH_ALEN */ + TCA_FLOWER_KEY_ETH_SRC_MASK, /* ETH_ALEN */ + TCA_FLOWER_KEY_ETH_TYPE, /* be16 */ + TCA_FLOWER_KEY_IP_PROTO, /* u8 */ + TCA_FLOWER_KEY_IPV4_SRC, /* be32 */ + TCA_FLOWER_KEY_IPV4_SRC_MASK, /* be32 */ + TCA_FLOWER_KEY_IPV4_DST, /* be32 */ + TCA_FLOWER_KEY_IPV4_DST_MASK, /* be32 */ + TCA_FLOWER_KEY_IPV6_SRC, /* struct in6_addr */ + TCA_FLOWER_KEY_IPV6_SRC_MASK, /* struct in6_addr */ + TCA_FLOWER_KEY_IPV6_DST, /* struct in6_addr */ + TCA_FLOWER_KEY_IPV6_DST_MASK, /* struct in6_addr */ + TCA_FLOWER_KEY_TCP_SRC, /* be16 */ + TCA_FLOWER_KEY_TCP_DST, /* be16 */ + TCA_FLOWER_KEY_UDP_SRC, /* be16 */ + TCA_FLOWER_KEY_UDP_DST, /* be16 */ + + TCA_FLOWER_FLAGS, + TCA_FLOWER_KEY_VLAN_ID, /* be16 */ + TCA_FLOWER_KEY_VLAN_PRIO, /* u8 */ + TCA_FLOWER_KEY_VLAN_ETH_TYPE, /* be16 */ + + TCA_FLOWER_KEY_ENC_KEY_ID, /* be32 */ + TCA_FLOWER_KEY_ENC_IPV4_SRC, /* be32 */ + TCA_FLOWER_KEY_ENC_IPV4_SRC_MASK,/* be32 */ + TCA_FLOWER_KEY_ENC_IPV4_DST, /* be32 */ + TCA_FLOWER_KEY_ENC_IPV4_DST_MASK,/* be32 */ + TCA_FLOWER_KEY_ENC_IPV6_SRC, /* struct in6_addr */ + TCA_FLOWER_KEY_ENC_IPV6_SRC_MASK,/* struct in6_addr */ + TCA_FLOWER_KEY_ENC_IPV6_DST, /* struct in6_addr */ + TCA_FLOWER_KEY_ENC_IPV6_DST_MASK,/* struct in6_addr */ + + TCA_FLOWER_KEY_TCP_SRC_MASK, /* be16 */ + TCA_FLOWER_KEY_TCP_DST_MASK, /* be16 */ + TCA_FLOWER_KEY_UDP_SRC_MASK, /* be16 */ + TCA_FLOWER_KEY_UDP_DST_MASK, /* be16 */ + TCA_FLOWER_KEY_SCTP_SRC_MASK, /* be16 */ + TCA_FLOWER_KEY_SCTP_DST_MASK, /* be16 */ + + TCA_FLOWER_KEY_SCTP_SRC, /* be16 */ + TCA_FLOWER_KEY_SCTP_DST, /* be16 */ + + TCA_FLOWER_KEY_ENC_UDP_SRC_PORT, /* be16 */ + TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK, /* be16 */ + TCA_FLOWER_KEY_ENC_UDP_DST_PORT, /* be16 */ + TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK, /* be16 */ + + TCA_FLOWER_KEY_FLAGS, /* be32 */ + TCA_FLOWER_KEY_FLAGS_MASK, /* be32 */ + + TCA_FLOWER_KEY_ICMPV4_CODE, /* u8 */ + TCA_FLOWER_KEY_ICMPV4_CODE_MASK,/* u8 */ + TCA_FLOWER_KEY_ICMPV4_TYPE, /* u8 */ + TCA_FLOWER_KEY_ICMPV4_TYPE_MASK,/* u8 */ + TCA_FLOWER_KEY_ICMPV6_CODE, /* u8 */ + TCA_FLOWER_KEY_ICMPV6_CODE_MASK,/* u8 */ + TCA_FLOWER_KEY_ICMPV6_TYPE, /* u8 */ + TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */ + + TCA_FLOWER_KEY_ARP_SIP, /* be32 */ + TCA_FLOWER_KEY_ARP_SIP_MASK, /* be32 */ + TCA_FLOWER_KEY_ARP_TIP, /* be32 */ + TCA_FLOWER_KEY_ARP_TIP_MASK, /* be32 */ + TCA_FLOWER_KEY_ARP_OP, /* u8 */ + TCA_FLOWER_KEY_ARP_OP_MASK, /* u8 */ + TCA_FLOWER_KEY_ARP_SHA, /* ETH_ALEN */ + TCA_FLOWER_KEY_ARP_SHA_MASK, /* ETH_ALEN */ + TCA_FLOWER_KEY_ARP_THA, /* ETH_ALEN */ + TCA_FLOWER_KEY_ARP_THA_MASK, /* ETH_ALEN */ + + TCA_FLOWER_KEY_MPLS_TTL, /* u8 - 8 bits */ + TCA_FLOWER_KEY_MPLS_BOS, /* u8 - 1 bit */ + TCA_FLOWER_KEY_MPLS_TC, /* u8 - 3 bits */ + TCA_FLOWER_KEY_MPLS_LABEL, /* be32 - 20 bits */ + + TCA_FLOWER_KEY_TCP_FLAGS, /* be16 */ + TCA_FLOWER_KEY_TCP_FLAGS_MASK, /* be16 */ + + TCA_FLOWER_KEY_IP_TOS, /* u8 */ + TCA_FLOWER_KEY_IP_TOS_MASK, /* u8 */ + TCA_FLOWER_KEY_IP_TTL, /* u8 */ + TCA_FLOWER_KEY_IP_TTL_MASK, /* u8 */ + + TCA_FLOWER_KEY_CVLAN_ID, /* be16 */ + TCA_FLOWER_KEY_CVLAN_PRIO, /* u8 */ + TCA_FLOWER_KEY_CVLAN_ETH_TYPE, /* be16 */ + + TCA_FLOWER_KEY_ENC_IP_TOS, /* u8 */ + TCA_FLOWER_KEY_ENC_IP_TOS_MASK, /* u8 */ + TCA_FLOWER_KEY_ENC_IP_TTL, /* u8 */ + TCA_FLOWER_KEY_ENC_IP_TTL_MASK, /* u8 */ + + TCA_FLOWER_KEY_ENC_OPTS, + TCA_FLOWER_KEY_ENC_OPTS_MASK, + + TCA_FLOWER_IN_HW_COUNT, + + __TCA_FLOWER_MAX, +}; + +#define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1) + +enum { + TCA_FLOWER_KEY_ENC_OPTS_UNSPEC, + TCA_FLOWER_KEY_ENC_OPTS_GENEVE, /* Nested + * TCA_FLOWER_KEY_ENC_OPT_GENEVE_ + * attributes + */ + __TCA_FLOWER_KEY_ENC_OPTS_MAX, +}; + +#define TCA_FLOWER_KEY_ENC_OPTS_MAX (__TCA_FLOWER_KEY_ENC_OPTS_MAX - 1) + +enum { + TCA_FLOWER_KEY_ENC_OPT_GENEVE_UNSPEC, + TCA_FLOWER_KEY_ENC_OPT_GENEVE_CLASS, /* u16 */ + TCA_FLOWER_KEY_ENC_OPT_GENEVE_TYPE, /* u8 */ + TCA_FLOWER_KEY_ENC_OPT_GENEVE_DATA, /* 4 to 128 bytes */ + + __TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX, +}; + +#define TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX \ + (__TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX - 1) + +enum { + TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0), + TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST = (1 << 1), +}; + +/* Match-all classifier */ + +enum { + TCA_MATCHALL_UNSPEC, + TCA_MATCHALL_CLASSID, + TCA_MATCHALL_ACT, + TCA_MATCHALL_FLAGS, + __TCA_MATCHALL_MAX, +}; + +#define TCA_MATCHALL_MAX (__TCA_MATCHALL_MAX - 1) + +/* Extended Matches */ + +struct tcf_ematch_tree_hdr { + __u16 nmatches; + __u16 progid; +}; + +enum { + TCA_EMATCH_TREE_UNSPEC, + TCA_EMATCH_TREE_HDR, + TCA_EMATCH_TREE_LIST, + __TCA_EMATCH_TREE_MAX +}; +#define TCA_EMATCH_TREE_MAX (__TCA_EMATCH_TREE_MAX - 1) + +struct tcf_ematch_hdr { + __u16 matchid; + __u16 kind; + __u16 flags; + __u16 pad; /* currently unused */ +}; + +/* 0 1 + * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + * +-----------------------+-+-+---+ + * | Unused |S|I| R | + * +-----------------------+-+-+---+ + * + * R(2) ::= relation to next ematch + * where: 0 0 END (last ematch) + * 0 1 AND + * 1 0 OR + * 1 1 Unused (invalid) + * I(1) ::= invert result + * S(1) ::= simple payload + */ +#define TCF_EM_REL_END 0 +#define TCF_EM_REL_AND (1<<0) +#define TCF_EM_REL_OR (1<<1) +#define TCF_EM_INVERT (1<<2) +#define TCF_EM_SIMPLE (1<<3) + +#define TCF_EM_REL_MASK 3 +#define TCF_EM_REL_VALID(v) (((v) & TCF_EM_REL_MASK) != TCF_EM_REL_MASK) + +enum { + TCF_LAYER_LINK, + TCF_LAYER_NETWORK, + TCF_LAYER_TRANSPORT, + __TCF_LAYER_MAX +}; +#define TCF_LAYER_MAX (__TCF_LAYER_MAX - 1) + +/* Ematch type assignments + * 1..32767 Reserved for ematches inside kernel tree + * 32768..65535 Free to use, not reliable + */ +#define TCF_EM_CONTAINER 0 +#define TCF_EM_CMP 1 +#define TCF_EM_NBYTE 2 +#define TCF_EM_U32 3 +#define TCF_EM_META 4 +#define TCF_EM_TEXT 5 +#define TCF_EM_VLAN 6 +#define TCF_EM_CANID 7 +#define TCF_EM_IPSET 8 +#define TCF_EM_IPT 9 +#define TCF_EM_MAX 9 + +enum { + TCF_EM_PROG_TC +}; + +enum { + TCF_EM_OPND_EQ, + TCF_EM_OPND_GT, + TCF_EM_OPND_LT +}; + +#endif diff --git a/tools/include/uapi/linux/tc_act/tc_bpf.h b/tools/include/uapi/linux/tc_act/tc_bpf.h new file mode 100644 index 000000000000..6e89a5df49a4 --- /dev/null +++ b/tools/include/uapi/linux/tc_act/tc_bpf.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ +/* + * Copyright (c) 2015 Jiri Pirko <jiri@resnulli.us> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#ifndef __LINUX_TC_BPF_H +#define __LINUX_TC_BPF_H + +#include <linux/pkt_cls.h> + +#define TCA_ACT_BPF 13 + +struct tc_act_bpf { + tc_gen; +}; + +enum { + TCA_ACT_BPF_UNSPEC, + TCA_ACT_BPF_TM, + TCA_ACT_BPF_PARMS, + TCA_ACT_BPF_OPS_LEN, + TCA_ACT_BPF_OPS, + TCA_ACT_BPF_FD, + TCA_ACT_BPF_NAME, + TCA_ACT_BPF_PAD, + TCA_ACT_BPF_TAG, + TCA_ACT_BPF_ID, + __TCA_ACT_BPF_MAX, +}; +#define TCA_ACT_BPF_MAX (__TCA_ACT_BPF_MAX - 1) + +#endif diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index f1fe492c8e17..f0017c831e57 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -24,6 +24,7 @@ TARGETS += memory-hotplug TARGETS += mount TARGETS += mqueue TARGETS += net +TARGETS += netfilter TARGETS += nsfs TARGETS += powerpc TARGETS += proc diff --git a/tools/testing/selftests/bpf/test_netcnt.c b/tools/testing/selftests/bpf/test_netcnt.c index 7887df693399..44ed7f29f8ab 100644 --- a/tools/testing/selftests/bpf/test_netcnt.c +++ b/tools/testing/selftests/bpf/test_netcnt.c @@ -81,7 +81,10 @@ int main(int argc, char **argv) goto err; } - assert(system("ping localhost -6 -c 10000 -f -q > /dev/null") == 0); + if (system("which ping6 &>/dev/null") == 0) + assert(!system("ping6 localhost -c 10000 -f -q > /dev/null")); + else + assert(!system("ping -6 localhost -c 10000 -f -q > /dev/null")); if (bpf_prog_query(cgroup_fd, BPF_CGROUP_INET_EGRESS, 0, NULL, NULL, &prog_cnt)) { diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 537a8f91af02..17021d2b6bfe 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -13953,6 +13953,25 @@ static struct bpf_test tests[] = { .prog_type = BPF_PROG_TYPE_SCHED_CLS, .result = ACCEPT, }, + { + "calls: ctx read at start of subprog", + .insns = { + BPF_MOV64_REG(BPF_REG_6, BPF_REG_1), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 5), + BPF_JMP_REG(BPF_JSGT, BPF_REG_0, BPF_REG_0, 0), + BPF_MOV64_REG(BPF_REG_1, BPF_REG_6), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2), + BPF_MOV64_REG(BPF_REG_1, BPF_REG_0), + BPF_EXIT_INSN(), + BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_1, 0), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, + .errstr_unpriv = "function calls to other bpf functions are allowed for root only", + .result_unpriv = REJECT, + .result = ACCEPT, + }, }; static int probe_filter_length(const struct bpf_insn *fp) diff --git a/tools/testing/selftests/netfilter/Makefile b/tools/testing/selftests/netfilter/Makefile new file mode 100644 index 000000000000..47ed6cef93fb --- /dev/null +++ b/tools/testing/selftests/netfilter/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: GPL-2.0 +# Makefile for netfilter selftests + +TEST_PROGS := nft_trans_stress.sh + +include ../lib.mk diff --git a/tools/testing/selftests/netfilter/config b/tools/testing/selftests/netfilter/config new file mode 100644 index 000000000000..1017313e41a8 --- /dev/null +++ b/tools/testing/selftests/netfilter/config @@ -0,0 +1,2 @@ +CONFIG_NET_NS=y +NF_TABLES_INET=y diff --git a/tools/testing/selftests/netfilter/nft_trans_stress.sh b/tools/testing/selftests/netfilter/nft_trans_stress.sh new file mode 100755 index 000000000000..f1affd12c4b1 --- /dev/null +++ b/tools/testing/selftests/netfilter/nft_trans_stress.sh @@ -0,0 +1,78 @@ +#!/bin/bash +# +# This test is for stress-testing the nf_tables config plane path vs. +# packet path processing: Make sure we never release rules that are +# still visible to other cpus. +# +# set -e + +# Kselftest framework requirement - SKIP code is 4. +ksft_skip=4 + +testns=testns1 +tables="foo bar baz quux" + +nft --version > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: Could not run test without nft tool" + exit $ksft_skip +fi + +ip -Version > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: Could not run test without ip tool" + exit $ksft_skip +fi + +tmp=$(mktemp) + +for table in $tables; do + echo add table inet "$table" >> "$tmp" + echo flush table inet "$table" >> "$tmp" + + echo "add chain inet $table INPUT { type filter hook input priority 0; }" >> "$tmp" + echo "add chain inet $table OUTPUT { type filter hook output priority 0; }" >> "$tmp" + for c in $(seq 1 400); do + chain=$(printf "chain%03u" "$c") + echo "add chain inet $table $chain" >> "$tmp" + done + + for c in $(seq 1 400); do + chain=$(printf "chain%03u" "$c") + for BASE in INPUT OUTPUT; do + echo "add rule inet $table $BASE counter jump $chain" >> "$tmp" + done + echo "add rule inet $table $chain counter return" >> "$tmp" + done +done + +ip netns add "$testns" +ip -netns "$testns" link set lo up + +lscpu | grep ^CPU\(s\): | ( read cpu cpunum ; +cpunum=$((cpunum-1)) +for i in $(seq 0 $cpunum);do + mask=$(printf 0x%x $((1<<$i))) + ip netns exec "$testns" taskset $mask ping -4 127.0.0.1 -fq > /dev/null & + ip netns exec "$testns" taskset $mask ping -6 ::1 -fq > /dev/null & +done) + +sleep 1 + +for i in $(seq 1 10) ; do ip netns exec "$testns" nft -f "$tmp" & done + +for table in $tables;do + randsleep=$((RANDOM%10)) + sleep $randsleep + ip netns exec "$testns" nft delete table inet $table 2>/dev/null +done + +randsleep=$((RANDOM%10)) +sleep $randsleep + +pkill -9 ping + +wait + +rm -f "$tmp" +ip netns del "$testns" |