kernel-hacking-2024-linux-steffo

unimore/kernel-hacking-2024-linux-steffo

Author	SHA1	Message	Date
Daniel Borkmann	a8cf76a902	bpf: fix sg shift repair start offset in bpf_msg_pull_data When we perform the sg shift repair for the scatterlist ring, we currently start out at i = first_sg + 1. However, this is not correct since the first_sg could point to the sge sitting at slot MAX_SKB_FRAGS - 1, and a subsequent i = MAX_SKB_FRAGS will access the scatterlist ring (sg) out of bounds. Add the sk_msg_iter_var() helper for iterating through the ring, and apply the same rule for advancing to the next ring element as we do elsewhere. Later work will use this helper also in other places. Fixes: `015632bb30` ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-08-29 10:47:17 -07:00
Daniel Borkmann	2e43f95dd8	bpf: fix shift upon scatterlist ring wrap-around in bpf_msg_pull_data If first_sg and last_sg wraps around in the scatterlist ring, then we need to account for that in the shift as well. E.g. crafting such msgs where this is the case leads to a hang as shift becomes negative. E.g. consider the following scenario: first_sg := 14 \|=> shift := -12 msg->sg_start := 10 last_sg := 3 \| msg->sg_end := 5 round 1: i := 15, move_from := 3, sg[15] := sg[ 3] round 2: i := 0, move_from := -12, sg[ 0] := sg[-12] round 3: i := 1, move_from := -11, sg[ 1] := sg[-11] round 4: i := 2, move_from := -10, sg[ 2] := sg[-10] [...] round 13: i := 11, move_from := -1, sg[ 2] := sg[ -1] round 14: i := 12, move_from := 0, sg[ 2] := sg[ 0] round 15: i := 13, move_from := 1, sg[ 2] := sg[ 1] round 16: i := 14, move_from := 2, sg[ 2] := sg[ 2] round 17: i := 15, move_from := 3, sg[ 2] := sg[ 3] [...] This means we will loop forever and never hit the msg->sg_end condition to break out of the loop. When we see that the ring wraps around, then the shift should be MAX_SKB_FRAGS - first_sg + last_sg - 1. Meaning, the remainder slots from the tail of the ring and the head until last_sg combined. Fixes: `015632bb30` ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-08-29 10:47:17 -07:00
Daniel Borkmann	0e06b227c5	bpf: fix msg->data/data_end after sg shift repair in bpf_msg_pull_data In the current code, msg->data is set as sg_virt(&sg[i]) + start - offset and msg->data_end relative to it as msg->data + bytes. Using iterator i to point to the updated starting scatterlist element holds true for some cases, however not for all where we'd end up pointing out of bounds. It is /correct/ for these ones: 1) When first finding the starting scatterlist element (sge) where we find that the page is already privately owned by the msg and where the requested bytes and headroom fit into the sge's length. However, it's /incorrect/ for the following ones: 2) After we made the requested area private and updated the newly allocated page into first_sg slot of the scatterlist ring; when we find that no shift repair of the ring is needed where we bail out updating msg->data and msg->data_end. At that point i will point to last_sg, which in this case is the next elem of first_sg in the ring. The sge at that point might as well be invalid (e.g. i == msg->sg_end), which we use for setting the range of sg_virt(&sg[i]). The correct one would have been first_sg. 3) Similar as in 2) but when we find that a shift repair of the ring is needed. In this case we fix up all sges and stop once we've reached the end. In this case i will point to will point to the new msg->sg_end, and the sge at that point will be invalid. Again here the requested range sits in first_sg. Fixes: `015632bb30` ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-08-29 10:47:17 -07:00
Jens Axboe	52bd456a66	Merge branch 'nvme-4.19' of git://git.infradead.org/nvme into for-linus Pull NVMe fixes from Christoph. * 'nvme-4.19' of git://git.infradead.org/nvme: nvmet: free workqueue object if module init fails nvme-fcloop: Fix dropped LS's to removed target port nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event	2018-08-29 11:05:20 -06:00
Scott Bauer	8f3fafc9c2	cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status Like `d88b6d04`: "cdrom: information leak in cdrom_ioctl_media_changed()" There is another cast from unsigned long to int which causes a bounds check to fail with specially crafted input. The value is then used as an index in the slot array in cdrom_slot_status(). Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Scott Bauer <sbauer@plzdonthack.me> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-08-29 08:09:20 -06:00
Johan Hovold	36156f9241	of: add helper to lookup compatible child node Add of_get_compatible_child() helper that can be used to lookup compatible child nodes. Several drivers currently use of_find_compatible_node() to lookup child nodes while failing to notice that the of_find_ functions search the entire tree depth-first (from a given start node) and therefore can match unrelated nodes. The fact that these functions also drop a reference to the node they start searching from (e.g. the parent node) is typically also overlooked, something which can lead to use-after-free bugs. Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Rob Herring <robh@kernel.org>	2018-08-29 08:06:46 -05:00
Alexey Khoroshilov	a618cf4800	gpio: dwapb: Fix error handling in dwapb_gpio_probe() If dwapb_gpio_add_port() fails in dwapb_gpio_probe(), gpio->clk is left undisabled. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2018-08-29 14:04:04 +02:00
Hans de Goede	78d3a92edb	gpiolib-acpi: Register GpioInt ACPI event handlers from a late_initcall GpioInt ACPI event handlers may see there IRQ triggered immediately after requesting the IRQ (esp. level triggered ones). This means that they may run before any other (builtin) drivers have had a chance to register their OpRegion handlers, leading to errors like this: [ 1.133274] ACPI Error: No handler for Region [PMOP] ((____ptrval____)) [UserDefinedRegion] (20180531/evregion-132) [ 1.133286] ACPI Error: Region UserDefinedRegion (ID=141) has no handler (20180531/exfldio-265) [ 1.133297] ACPI Error: Method parse/execution failed \_SB.GPO2._L01, AE_NOT_EXIST (20180531/psparse-516) We already defer the manual initial trigger of edge triggered interrupts by running it from a late_initcall handler, this commit replaces this with deferring the entire acpi_gpiochip_request_interrupts() call till then, fixing the problem of some OpRegions not being registered yet. Note that this removes the need to have a list of edge triggered handlers which need to run, since the entire acpi_gpiochip_request_interrupts() call is now delayed, acpi_gpiochip_request_interrupt() can call these directly now. Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2018-08-29 13:32:00 +02:00
Andy Shevchenko	993b9bc5c4	gpiolib: acpi: Switch to cansleep version of GPIO library call The commit `ca876c7483` ("gpiolib-acpi: make sure we trigger edge events at least once on boot") added a initial value check for pin which is about to be locked as IRQ. Unfortunately, not all GPIO drivers can do that atomically. Thus, switch to cansleep version of the call. Otherwise we have a warning: ... WARNING: CPU: 2 PID: 1408 at drivers/gpio/gpiolib.c:2883 gpiod_get_value+0x46/0x50 ... RIP: 0010:gpiod_get_value+0x46/0x50 ... The change tested on Intel Broxton with Whiskey Cove PMIC GPIO controller. Fixes: `ca876c7483` ("gpiolib-acpi: make sure we trigger edge events at least once on boot") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2018-08-29 13:29:01 +02:00
Marc Zyngier	1d8f574708	arm/arm64: smccc-1.1: Make return values unsigned long An unfortunate consequence of having a strong typing for the input values to the SMC call is that it also affects the type of the return values, limiting r0 to 32 bits and r{1,2,3} to whatever was passed as an input. Let's turn everything into "unsigned long", which satisfies the requirements of both architectures, and allows for the full range of return values. Reported-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-08-29 11:42:20 +01:00
Julien Grall	afce0cc9ad	Documentation/arm64/sve: Couple of improvements and typos - Fix mismatch between SVE registers (Z) and FPSIMD register (V) - Don't prefix the path for [3] with Linux to stay consistent with [1] and [2]. Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-08-29 11:33:19 +01:00
Sara Sharon	166ac9d55b	mac80211: avoid kernel panic when building AMSDU from non-linear SKB When building building AMSDU from non-linear SKB, we hit a kernel panic when trying to push the padding to the tail. Instead, put the padding at the head of the next subframe. This also fixes the A-MSDU subframes to not have the padding accounted in the length field and not have pad at all for the last subframe, both required by the spec. Fixes: `6e0456b545` ("mac80211: add A-MSDU tx support") Signed-off-by: Sara Sharon <sara.sharon@intel.com> Reviewed-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-08-29 12:17:55 +02:00
Yuan-Chi Pang	1f631c3201	mac80211: mesh: fix HWMP sequence numbering to follow standard IEEE 802.11-2016 14.10.8.3 HWMP sequence numbering says: If it is a target mesh STA, it shall update its own HWMP SN to maximum (current HWMP SN, target HWMP SN in the PREQ element) + 1 immediately before it generates a PREP element in response to a PREQ element. Signed-off-by: Yuan-Chi Pang <fu3mo6goo@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-08-29 11:15:30 +02:00
Michael Hennerich	6537886cdc	gpio: adp5588: Fix sleep-in-atomic-context bug This fixes: [BUG] gpio: gpio-adp5588: A possible sleep-in-atomic-context bug in adp5588_gpio_write() [BUG] gpio: gpio-adp5588: A possible sleep-in-atomic-context bug in adp5588_gpio_direction_input() Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2018-08-29 10:54:38 +02:00
Daniel Borkmann	5b24109b05	bpf: fix several offset tests in bpf_msg_pull_data While recently going over bpf_msg_pull_data(), I noticed three issues which are fixed in here: 1) When we attempt to find the first scatterlist element (sge) for the start offset, we add len to the offset before we check for start < offset + len, whereas it should come after when we iterate to the next sge to accumulate the offsets. For example, given a start offset of 12 with a sge length of 8 for the first sge in the list would lead us to determine this sge as the first sge thinking it covers first 16 bytes where start is located, whereas start sits in subsequent sges so we would end up pulling in the wrong data. 2) After figuring out the starting sge, we have a short-cut test in !msg->sg_copy[i] && bytes <= len. This checks whether it's not needed to make the page at the sge private where we can just exit by updating msg->data and msg->data_end. However, the length test is not fully correct. bytes <= len checks whether the requested bytes (end - start offsets) fit into the sge's length. The part that is missing is that start must not be sge length aligned. Meaning, the start offset into the sge needs to be accounted as well on top of the requested bytes as otherwise we can access the sge out of bounds. For example the sge could have length of 8, our requested bytes could have length of 8, but at a start offset of 4, so we also would need to pull in 4 bytes of the next sge, when we jump to the out label we do set msg->data to sg_virt(&sg[i]) + start - offset and msg->data_end to msg->data + bytes which would be oob. 3) The subsequent bytes < copy test for finding the last sge has the same issue as in point 2) but also it tests for less than rather than less or equal to. Meaning if the sge length is of 8 and requested bytes of 8 while having the start aligned with the sge, we would unnecessarily go and pull in the next sge as well to make it private. Fixes: `015632bb30` ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-08-28 22:23:45 -07:00
Linus Torvalds	3f16503b7d	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal fixes from Eduardo Valentin: "Minor fixes to OF thermal, qoriq, and rcar drivers" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: of-thermal: disable passive polling when thermal zone is disabled thermal: rcar_gen3_thermal: convert to SPDX identifiers thermal: rcar_thermal: convert to SPDX identifiers thermal: qoriq: Switch to SPDX identifier thermal: qoriq: Simplify the 'site' variable assignment thermal: qoriq: Use devm_thermal_zone_of_sensor_register()	2018-08-28 16:11:34 -07:00
Gustavo A. R. Silva	450b6b9b16	clk: npcm7xx: fix memory allocation One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void entry[]; }; instance = kzalloc(sizeof(struct foo) + sizeof(void ) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); Notice that, currently, there is a bug during the allocation: sizeof(npcm7xx_clk_data) should be sizeof(*npcm7xx_clk_data) Fix this bug by using struct_size() in kzalloc() This issue was detected with the help of Coccinelle. Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Avi Fishman <avifishman70@gmail.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2018-08-28 15:12:59 -07:00
Colin Ian King	6d3c8ce012	x86/xen: remove redundant variable save_pud Variable save_pud is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: variable 'save_pud' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2018-08-28 17:37:40 -04:00
Joe Jin	076e2cedd6	xen: export device state to sysfs Export device state to sysfs to allow for easier get device state. Signed-off-by: Joe Jin <joe.jin@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2018-08-28 17:37:40 -04:00
Palmer Dabbelt	47d80a68f1	RISC-V: Use a less ugly workaround for unused variable warnings Thanks to Christoph Hellwig for pointing out a cleaner way to do this, as my approach was quite ugly. CC: Christoph Hellwig <hch@lst.de> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>	2018-08-28 12:58:36 -07:00
Will Deacon	0ce5671c44	riscv: tlb: Provide definition of tlb_flush() before including tlb.h As of commit `fd1102f0aa` ("mm: mmu_notifier fix for tlb_end_vma"), asm-generic/tlb.h now calls tlb_flush() from a static inline function, so we need to make sure that it's declared before #including the asm-generic header in the arch header. Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: `fd1102f0aa` ("mm: mmu_notifier fix for tlb_end_vma") Signed-off-by: Will Deacon <will.deacon@arm.com> [groeck: Use forward declaration instead of moving inline function] Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>	2018-08-28 12:58:35 -07:00
Palmer Dabbelt	11f65ad111	dt-bindings: riscv,cpu-intc: Cleanups from a missed review I managed to miss one of Rob's code reviews on the mailing list <http://lists.infradead.org/pipermail/linux-riscv/2018-August/001139.html>. The patch has already been merged, so I'm submitting a fixup. Sorry! Fixes: `b67bc7cb40` ("dt-bindings: interrupt-controller: RISC-V local interrupt controller") Cc: Rob Herring <robh@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Karsten Merker <merker@debian.org> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>	2018-08-28 12:58:35 -07:00
Rodrigo Vivi	1b1b116274	drm/i915: Free write_buf that we allocated with kzalloc. We use kzalloc to allocate the write_buf that we use for i2c transfer on hdcp write. But it seems that we are forgetting to free the memory that is not needed after i2c transfer is completed. Reported-by: Brian J Wood <brian.j.wood@intel.com> Fixes: `2320175feb` ("drm/i915: Implement HDCP for HDMI") Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v4.17+ Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180823205136.31310-1-rodrigo.vivi@intel.com (cherry picked from commit `62d3a8deaa`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2018-08-28 12:50:43 -07:00
Ville Syrjälä	5b2695fd4b	drm/i915: Fix glk/cnl display w/a #1175 The workaround was supposed to look at the plane destination coordinates. Currently it's looking at some mixture of src and dst coordinates that doesn't make sense. Fix it up. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180719182214.4323-2-ville.syrjala@linux.intel.com Fixes: `394676f05b` (drm/i915: Add WA for planes ending close to left screen edge) Reviewed-by: Imre Deak <imre.deak@intel.com> (cherry picked from commit `b1f1c2c11f`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2018-08-28 12:50:37 -07:00
Emily Deng	6ddd9769db	drm/amdgpu: Need to set moved to true when evict bo Fix the VMC page fault when the running sequence is as below: 1.amdgpu_gem_create_ioctl 2.ttm_bo_swapout->amdgpu_vm_bo_invalidate, as not called amdgpu_vm_bo_base_init, so won't called list_add_tail(&base->bo_list, &bo->va). Even the bo was evicted, it won't set the bo_base->moved. 3.drm_gem_open_ioctl->amdgpu_vm_bo_base_init, here only called list_move_tail(&base->vm_status, &vm->evicted), but not set the bo_base->moved. 4.amdgpu_vm_bo_map->amdgpu_vm_bo_insert_map, as the bo_base->moved is not set true, the function amdgpu_vm_bo_insert_map will call list_move(&bo_va->base.vm_status, &vm->moved) 5.amdgpu_cs_ioctl won't validate the swapout bo, as it is only in the moved list, not in the evict list. So VMC page fault occurs. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-08-28 12:42:48 -05:00
Tony Lindgren	2d59bb6023	ARM: dts: omap4-droid4: Fix emmc errors seen on some devices Otherwise we can get the following errors occasionally on some devices: mmc1: tried to HW reset card, got error -110 mmcblk1: error -110 requesting status mmcblk1: recovery failed! print_req_error: I/O error, dev mmcblk1, sector 14329 ... I have one device that hits this error almost on every boot, and another one that hits it only rarely with the other ones I've used behave without problems. I'm not sure if the issue is related to a particular eMMC card model, but in case it is, both of the machines with issues have: # cat /sys/class/mmc_host/mmc1/mmc1:0001/manfid \ /sys/class/mmc_host/mmc1/mmc1:0001/oemid \ /sys/class/mmc_host/mmc1/mmc1:0001/name 0x000045 0x0100 SEM16G and the working ones have: 0x000011 0x0100 016G92 Note that "ti,non-removable" is different as omap_hsmmc_reg_get() does not call omap_hsmmc_disable_boot_regulators() if no_regulator_off_init is set. And currently we set no_regulator_off_init only for "ti,non-removable" and not for "non-removable". It seems that we should have "non-removable" with some other mmc generic property behave in the same way instead of having to use a non-generic property. But let's fix the issue first. Fixes: `7e2f8c0ae6` ("ARM: dts: Add minimal support for motorola droid 4 xt894") Cc: Marcel Partap <mpartap@gmx.net> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Michael Scott <hashcode0f@gmail.com> Cc: NeKit <nekit1000@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com>	2018-08-28 09:59:46 -07:00
Tony Lindgren	ea4d65f14f	Merge branch 'perm-fix' into omap-for-v4.19/fixes-v2	2018-08-28 09:58:03 -07:00
Neeraj Dantu	496f3347d8	ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts Fix wrong mode for dts file added by commit `bb3e3fbbac` ("ARM: dts: Add DT support for Octavo Systems OSD3358-SM-RED based on TI AM335x"). Signed-off-by: Neeraj Dantu <neeraj.dantu@octavosystems.com> CC: Robert Nelson <robertcnelson@gmail.com> CC: Jason Kridner <jkridner@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com>	2018-08-28 09:54:16 -07:00
Haim Dreyfuss	b88d26d97c	nl80211: Pass center frequency in kHz instead of MHz freq_reg_info expects to get the frequency in kHz. Instead we accidently pass it in MHz. Thus, currently the function always return ERR rule. Fix that. Fixes: `50f32718e1` ("nl80211: Add wmm rule attribute to NL80211_CMD_GET_WIPHY dump command") Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> [fix kHz/MHz in commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-08-28 11:41:23 +02:00
Haim Dreyfuss	d3c89bbc74	nl80211: Fix nla_put_u8 to u16 for NL80211_WMMR_TXOP TXOP (also known as Channel Occupancy Time) is u16 and should be added using nla_put_u16 instead of u8, fix that. Fixes: `50f32718e1` ("nl80211: Add wmm rule attribute to NL80211_CMD_GET_WIPHY dump command") Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-08-28 11:37:28 +02:00
Jinbum Park	3a2af7cccb	mac80211_hwsim: Fix possible Spectre-v1 for hwsim_world_regdom_custom User controls @idx which to be used as index of hwsim_world_regdom_custom. So, It can be exploited via Spectre-like attack. (speculative execution) This kind of attack leaks address of hwsim_world_regdom_custom, It leads an attacker to bypass security mechanism such as KASLR. So sanitize @idx before using it to prevent attack. I leveraged strategy [1] to find and exploit this gadget. [1] https://github.com/jinb-park/linux-exploit/tree/master/exploit-remaining-spectre-gadget/ Signed-off-by: Jinbum Park <jinb.park7@gmail.com> [johannes: unwrap URL] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-08-28 11:14:56 +02:00
Emmanuel Grumbach	20932750d9	mac80211: don't update the PM state of a peer upon a multicast frame I changed the way mac80211 updates the PM state of the peer. I forgot that we could also have multicast frames from the peer and that those frame should of course not change the PM state of the peer: A peer goes to power save when it needs to scan, but it won't send the broadcast Probe Request with the PM bit set. This made us mark the peer as awake when it wasn't and then Intel's firmware would fail to transmit because the peer is asleep according to its database. The driver warned about this and it looked like this: WARNING: CPU: 0 PID: 184 at /usr/src/linux-4.16.14/drivers/net/wireless/intel/iwlwifi/mvm/tx.c:1369 iwl_mvm_rx_tx_cmd+0x53b/0x860 CPU: 0 PID: 184 Comm: irq/124-iwlwifi Not tainted 4.16.14 #1 RIP: 0010:iwl_mvm_rx_tx_cmd+0x53b/0x860 Call Trace: iwl_pcie_rx_handle+0x220/0x880 iwl_pcie_irq_handler+0x6c9/0xa20 ? irq_forced_thread_fn+0x60/0x60 ? irq_thread_dtor+0x90/0x90 The relevant code that spits the WARNING is: case TX_STATUS_FAIL_DEST_PS: /* the FW should have stopped the queue and not * return this status */ WARN_ON(1); info->flags \|= IEEE80211_TX_STAT_TX_FILTERED; This fixes https://bugzilla.kernel.org/show_bug.cgi?id=199967. Fixes: `9fef654433` ("mac80211: always update the PM state of a peer on MGMT / DATA frames") Cc: <stable@vger.kernel.org> #4.16+ Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-08-28 11:12:42 +02:00
Stanislaw Gruszka	38cb87ee47	cfg80211: make wmm_rule part of the reg_rule structure Make wmm_rule be part of the reg_rule structure. This simplifies the code a lot at the cost of having bigger memory usage. However in most cases we have only few reg_rule's and when we do have many like in iwlwifi we do not save memory as it allocates a separate wmm_rule for each channel anyway. This also fixes a bug reported in various places where somewhere the pointers were corrupted and we ended up doing a null-dereference. Fixes: `230ebaa189` ("cfg80211: read wmm rules from regulatory database") Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> [rephrase commit message slightly] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-08-28 11:11:47 +02:00
Danek Duvall	d7c863a2f6	mac80211_hwsim: correct use of IEEE80211_VHT_CAP_RXSTBC_X The mac80211_hwsim driver intends to say that it supports up to four STBC receive streams, but instead it ends up saying something undefined. The IEEE80211_VHT_CAP_RXSTBC_X macros aren't independent bits that can be ORed together, but values. In this case, _4 is the appropriate one to use. Signed-off-by: Danek Duvall <duvall@comfychair.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-08-28 11:09:05 +02:00
Danek Duvall	67d1ba8a6d	mac80211: correct use of IEEE80211_VHT_CAP_RXSTBC_X The mod mask for VHT capabilities intends to say that you can override the number of STBC receive streams, and it does, but only by accident. The IEEE80211_VHT_CAP_RXSTBC_X aren't bits to be set, but values (albeit left-shifted). ORing the bits together gets the right answer, but we should use the _MASK macro here instead. Signed-off-by: Danek Duvall <duvall@comfychair.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-08-28 11:08:59 +02:00
John Fastabend	501ca81760	bpf: sockmap, decrement copied count correctly in redirect error case Currently, when a redirect occurs in sockmap and an error occurs in the redirect call we unwind the scatterlist once in the error path of bpf_tcp_sendmsg_do_redirect() and then again in sendmsg(). Then in the error path of sendmsg we decrement the copied count by the send size. However, its possible we partially sent data before the error was generated. This can happen if do_tcp_sendpages() partially sends the scatterlist before encountering a memory pressure error. If this happens we need to decrement the copied value (the value tracking how many bytes were actually sent to TCP stack) by the number of remaining bytes _not_ the entire send size. Otherwise we risk confusing userspace. Also we don't need two calls to free the scatterlist one is good enough. So remove the one in bpf_tcp_sendmsg_do_redirect() and then properly reduce copied by the number of remaining bytes which may in fact be the entire send size if no bytes were sent. To do this use bool to indicate if free_start_sg() should do mem accounting or not. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-28 09:01:06 +02:00
Chaitanya Kulkarni	04db0e5ec5	nvmet: free workqueue object if module init fails Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-08-28 08:40:44 +02:00
James Smart	afd299ca99	nvme-fcloop: Fix dropped LS's to removed target port When a targetport is removed from the config, fcloop will avoid calling the LS done() routine thinking the targetport is gone. This leaves the initiator reset/reconnect hanging as it waits for a status on the Create_Association LS for the reconnect. Change the filter in the LS callback path. If tport null (set when failed validation before "sending to remote port"), be sure to call done. This was the main bug. But, continue the logic that only calls done if tport was set but there is no remoteport (e.g. case where remoteport has been removed, thus host doesn't expect a completion). Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-08-28 08:40:43 +02:00
Michal Wnukowski	f1ed3df20d	nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event In many architectures loads may be reordered with older stores to different locations. In the nvme driver the following two operations could be reordered: - Write shadow doorbell (dbbuf_db) into memory. - Read EventIdx (dbbuf_ei) from memory. This can result in a potential race condition between driver and VM host processing requests (if given virtual NVMe controller has a support for shadow doorbell). If that occurs, then the NVMe controller may decide to wait for MMIO doorbell from guest operating system, and guest driver may decide not to issue MMIO doorbell on any of subsequent commands. This issue is purely timing-dependent one, so there is no easy way to reproduce it. Currently the easiest known approach is to run "Oracle IO Numbers" (orion) that is shipped with Oracle DB: orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \ concat -write 40 -duration 120 -matrix row -testname nvme_test Where nvme_test is a .lun file that contains a list of NVMe block devices to run test against. Limiting number of vCPUs assigned to given VM instance seems to increase chances for this bug to occur. On test environment with VM that got 4 NVMe drives and 1 vCPU assigned the virtual NVMe controller hang could be observed within 10-20 minutes. That correspond to about 400-500k IO operations processed (or about 100GB of IO read/writes). Orion tool was used as a validation and set to run in a loop for 36 hours (equivalent of pushing 550M IO operations). No issues were observed. That suggest that the patch fixes the issue. Fixes: `f9f38e3338` ("nvme: improve performance for virtual NVMe devices") Signed-off-by: Michal Wnukowski <wnukowski@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> [hch: updated changelog and comment a bit] Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-08-28 08:40:42 +02:00
Stefan Agner	3f6e138d41	bpf: fix build error with clang Building the newly introduced BPF_PROG_TYPE_SK_REUSEPORT leads to a compile time error when building with clang: net/core/filter.o: In function `sk_reuseport_convert_ctx_access': ../net/core/filter.c:7284: undefined reference to `__compiletime_assert_7284' It seems that clang has issues resolving hweight_long at compile time. Since SK_FL_PROTO_MASK is a constant, we can use the interface for known constant arguments which works fine with clang. Fixes: `2dbb9b9e6d` ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-08-27 20:37:05 -07:00
Daniel Borkmann	15c480efab	bpf, sockmap: fix psock refcount leak in bpf_tcp_recvmsg In bpf_tcp_recvmsg() we first took a reference on the psock, however once we find that there are skbs in the normal socket's receive queue we return with processing them through tcp_recvmsg(). Problem is that we leak the taken reference on the psock in that path. Given we don't really do anything with the psock at this point, move the skb_queue_empty() test before we fetch the psock to fix this case. Fixes: `8934ce2fd0` ("bpf: sockmap redirect ingress support") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-08-27 20:22:05 -07:00
Daniel Borkmann	e06fa9c16c	bpf, sockmap: fix potential use after free in bpf_tcp_close bpf_tcp_close() we pop the psock linkage to a map via psock_map_pop(). A parallel update on the sock hash map can happen between psock_map_pop() and lookup_elem_raw() where we override the element under link->hash / link->key. In bpf_tcp_close()'s lookup_elem_raw() we subsequently only test whether an element is present, but we do not test whether the element is infact the element we were looking for. We lock the sock in bpf_tcp_close() during that time, so do we hold the lock in sock_hash_update_elem(). However, the latter locks the sock which is newly updated, not the one we're purging from the hash table. This means that while one CPU is doing the lookup from bpf_tcp_close(), another CPU is doing the map update in parallel, dropped our sock from the hlist and released the psock. Subsequently the first CPU will find the new sock and attempts to drop and release the old sock yet another time. Fix is that we need to check the elements for a match after lookup, similar as we do in the sock map. Note that the hash tab elems are freed via RCU, so access to their link->hash / link->key is fine since we're under RCU read side there. Fixes: `e9db4ef6bf` ("bpf: sockhash fix omitted bucket lock in sock_close") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-08-27 20:22:05 -07:00
John Pittman	db193954ed	block: bsg: move atomic_t ref_count variable to refcount API Currently, variable ref_count within the bsg_device struct is of type atomic_t. For variables being used as reference counters, the refcount API should be used instead of atomic. The newer refcount API works to prevent counter overflows and use-after-free bugs. So, move this varable from the atomic API to refcount, potentially avoiding the issues mentioned. Signed-off-by: John Pittman <jpittman@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-08-27 19:17:02 -06:00
Chengguang Xu	62d2a19407	block: remove unnecessary condition check kmem_cache_destroy() can handle NULL pointer correctly, so there is no need to check e->icq_cache before calling kmem_cache_destroy(). Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-08-27 19:16:06 -06:00
Zhu Yanjun	53ae914d89	net/rds: Use rdma_read_gids to get connection SGID/DGID in IPv6 In IPv4, the newly introduced rdma_read_gids is used to read the SGID/DGID for the connection which returns GID correctly for RoCE transport as well. In IPv6, rdma_read_gids is also used. The following are why rdma_read_gids is introduced. rdma_addr_get_dgid() for RoCE for client side connections returns MAC address, instead of DGID. rdma_addr_get_sgid() for RoCE doesn't return correct SGID for IPv6 and when more than one IP address is assigned to the netdevice. So the transport agnostic rdma_read_gids() API is provided by rdma_cm module. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-27 15:26:01 -07:00
Linus Walleij	ad8619864f	net: dsa: Drop GPIO includes Commit `52638f71fc` ("dsa: Move gpio reset into switch driver") moved the GPIO handling into the switch drivers but forgot to remove the GPIO header includes. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-27 15:24:33 -07:00
Haiqing Bai	30935198b7	tipc: fix the big/little endian issue in tipc_dest In function tipc_dest_push, the 32bit variables 'node' and 'port' are stored separately in uppper and lower part of 64bit 'value'. Then this value is assigned to dst->value which is a union like: union { struct { u32 port; u32 node; }; u64 value; } This works on little-endian machines like x86 but fails on big-endian machines. The fix remove the 'value' stack parameter and even the 'value' member of the union in tipc_dest, assign the 'node' and 'port' member directly with the input parameter to avoid the endian issue. Fixes: `a80ae5306a` ("tipc: improve destination linked list") Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Haiqing Bai <Haiqing.Bai@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-27 15:23:31 -07:00
David S. Miller	ca2b1d2d42	Merge branch 'net-sched-fixes' Jiri Pirko says: ==================== net: sched: couple of small fixes Jiri Pirko (2): net: sched: fix extack error message when chain is failed to be created net: sched: return -ENOENT when trying to remove filter from non-existent chain ==================== Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-27 15:16:16 -07:00
Jiri Pirko	b7b4247d55	net: sched: return -ENOENT when trying to remove filter from non-existent chain When chain 0 was implicitly created, removal of non-existent filter from chain 0 gave -ENOENT. Once chain 0 became non-implicit, the same call is giving -EINVAL. Fix this by returning -ENOENT in that case. Reported-by: Roman Mashak <mrv@mojatatu.com> Fixes: `f71e0ca4db` ("net: sched: Avoid implicit chain 0 creation") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-27 15:16:16 -07:00
Jiri Pirko	d5ed72a55b	net: sched: fix extack error message when chain is failed to be created Instead "Cannot find" say "Cannot create". Fixes: `c35a4acc29` ("net: sched: cls_api: handle generic cls errors") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-27 15:16:16 -07:00

... 3 4 5 6 7 ...

782088 commits