Felipe Balbi [Tue, 29 Oct 2019 10:56:11 +0000 (12:56 +0200)]
usb: dwc3: of-simple: add a shutdown
In case we're loading a new kernel via kexec, let's make sure to
cleanup the dwc3 address space correctly. This means that we should
run the same steps from driver remove, so just extract a reusable
function for both cases.
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Roger Quadros [Mon, 28 Oct 2019 09:32:49 +0000 (11:32 +0200)]
usb: cdns3: Add TI specific wrapper driver
The J721e platform comes with 2 Cadence USB3 controller
instances. This driver supports the TI specific wrapper
on this platform.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Reviewed-by: Pawel Laszczak <pawell@cadence.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Roger Quadros [Mon, 28 Oct 2019 09:32:48 +0000 (11:32 +0200)]
dt-bindings: usb: Add binding for the TI wrapper for Cadence USB3 controller
TI platforms have a wrapper module around the Cadence USB3
controller. Add binding information for that.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Cc: Rob Herring <robh@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Chunfeng Yun [Wed, 9 Oct 2019 09:05:00 +0000 (17:05 +0800)]
usb: mtu3: fix race condition about delayed_status
usb_composite_setup_continue() may be called before composite_setup()
return USB_GADGET_DELAYED_STATUS, then the controller driver will
delay status stage after composite_setup() finish, but the class driver
don't ask the controller to continue delayed status anymore, this will
cause control transfer timeout.
happens when use mass storage (also enabled other class driver):
cpu1: cpu2
handle_setup(SET_CONFIG) //gadget driver
unlock (g->lock)
gadget_driver->setup()
composite_setup()
lock(cdev->lock)
set_config()
fsg_set_alt() // maybe some times due to many class are enabled
raise FSG_STATE_CONFIG_CHANGE
return USB_GADGET_DELAYED_STATUS
handle_exception()
usb_composite_setup_continue()
unlock(cdev->lock)
lock(cdev->lock)
ep0_queue()
lock (g->lock)
//no delayed status, nothing todo
unlock (g->lock)
unlock(cdev->lock)
return USB_GADGET_DELAYED_STATUS // composite_setup
lock (g->lock)
get USB_GADGET_DELAYED_STATUS //handle_setup [1]
Try to fix the race condition as following:
After the driver gets USB_GADGET_DELAYED_STATUS at [1], if we find
there is a usb_request in ep0 request list, it means composite already
asked us to continue delayed status by usb_composite_setup_continue(),
so we skip request about delayed_status by composite_setup() and still
do status stage.
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Nagarjuna Kristam [Fri, 18 Oct 2019 09:38:15 +0000 (15:08 +0530)]
usb: gadget: Add UDC driver for tegra XUSB device mode controller
This patch adds UDC driver for tegra XUSB 3.0 device mode controller.
XUSB device mode controller supports SS, HS and FS modes
Based on work by:
Mark Kuo <mkuo@nvidia.com>
Hui Fu <hfu@nvidia.com>
Andrew Bresticker <abrestic@chromium.org>
Signed-off-by: Nagarjuna Kristam <nkristam@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Thinh Nguyen [Tue, 22 Oct 2019 19:10:16 +0000 (12:10 -0700)]
usb: dwc3: debug: Remove newline printout
The newline from the unknown link state tracepoint doesn't follow the
other tracepoints, and it looks unsightly. Let's remove it.
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Mathias Kresin [Sun, 7 Jul 2019 14:22:01 +0000 (16:22 +0200)]
usb: dwc2: use a longer core rest timeout in dwc2_core_reset()
Testing on different generations of Lantiq MIPS SoC based boards, showed
that it takes up to 1500 us until the core reset bit is cleared.
The driver from the vendor SDK (ifxhcd) uses a 1 second timeout. Use the
same timeout to fix wrong hang detections and make the driver work for
Lantiq MIPS SoCs.
At least till kernel 4.14 the hanging reset only caused a warning but
the driver was probed successful. With kernel 4.19 errors out with
EBUSY.
Cc: linux-stable <stable@vger.kernel.org> # 4.19+
Signed-off-by: Mathias Kresin <dev@kresin.me>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Markus Elfring [Thu, 19 Sep 2019 13:47:24 +0000 (15:47 +0200)]
usb: gadget: udc: lpc32xx: Use devm_platform_ioremap_resource() in lpc32xx_udc_probe()
Simplify this function implementation by using a known wrapper function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Colin Ian King [Fri, 27 Sep 2019 08:50:31 +0000 (09:50 +0100)]
USB: gadget: udc: clean up an indentation issue
There is a statement that is indented too deeply, remove
the extraneous tabs.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Joel Stanley [Mon, 30 Sep 2019 13:14:34 +0000 (22:44 +0930)]
usb: gadget: Quieten gadget config message
On a system that often re-configures a USB gadget device the kernel log
is filled with:
configfs-gadget gadget: high-speed config #1: c
Reduce the verbosity of this print to debug.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Geert Uytterhoeven [Tue, 1 Oct 2019 18:11:09 +0000 (20:11 +0200)]
phy: renesas: rcar-gen3-usb2: Use platform_get_irq_optional() for optional irq
As platform_get_irq() now prints an error when the interrupt does not
exist, a scary warning may be printed for an optional interrupt:
phy_rcar_gen3_usb2
ee0a0200.usb-phy: IRQ index 0 not found
Fix this by calling platform_get_irq_optional() instead.
Fixes:
7723f4c5ecdb8d83 ("driver core: platform: Add an error message to platform_get_irq*()")
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
zhengbin [Wed, 9 Oct 2019 08:40:34 +0000 (16:40 +0800)]
usb: gadget: Remove set but not used variable 'opts' in msg_do_config
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/usb/gadget/legacy/mass_storage.c: In function msg_do_config:
drivers/usb/gadget/legacy/mass_storage.c:108:19: warning: variable opts set but not used [-Wunused-but-set-variable]
It is not used since commit
f78bbcae86e6 ("usb: f_mass_storage:
test whether thread is running before starting another")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
zhengbin [Wed, 9 Oct 2019 08:40:33 +0000 (16:40 +0800)]
usb: gadget: Remove set but not used variable 'opts' in acm_ms_do_config
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/usb/gadget/legacy/acm_ms.c: In function acm_ms_do_config:
drivers/usb/gadget/legacy/acm_ms.c:108:19: warning: variable opts set but not used [-Wunused-but-set-variable]
It is not used since commit
f78bbcae86e6 ("usb: f_mass_storage:
test whether thread is running before starting another")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Chunfeng Yun [Wed, 9 Oct 2019 09:04:59 +0000 (17:04 +0800)]
usb: mtu3: add a new function to do status stage
Exact a new static function to do status stage
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Peter Chen [Mon, 14 Oct 2019 06:39:51 +0000 (14:39 +0800)]
usb: gadget: configfs: fix concurrent issue between composite APIs
We meet several NULL pointer issues if configfs_composite_unbind
and composite_setup (or composite_disconnect) are running together.
These issues occur when do the function switch stress test, the
configfs_compsoite_unbind is called from user mode by
echo "" to /sys/../UDC entry, and meanwhile, the setup interrupt
or disconnect interrupt occurs by hardware. The composite_setup
will get the cdev from get_gadget_data, but configfs_composite_unbind
will set gadget data as NULL, so the NULL pointer issue occurs.
This concurrent is hard to reproduce by native kernel, but can be
reproduced by android kernel.
In this commit, we introduce one spinlock belongs to structure
gadget_info since we can't use the same spinlock in usb_composite_dev
due to exclusive running together between composite_setup and
configfs_composite_unbind. And one bit flag 'unbind' to indicate the
code is at unbind routine, this bit is needed due to we release the
lock at during configfs_composite_unbind sometimes, and composite_setup
may be run at that time.
Several oops:
oops 1:
android_work: sent uevent USB_STATE=CONNECTED
configfs-gadget gadget: super-speed config #1: b
android_work: sent uevent USB_STATE=CONFIGURED
init: Received control message 'start' for 'adbd' from pid: 3515 (system_server)
Unable to handle kernel NULL pointer dereference at virtual address
0000002a
init: Received control message 'stop' for 'adbd' from pid: 3375 (/vendor/bin/hw/android.hardware.usb@1.1-servic)
Mem abort info:
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgd =
ffff8008f1b7f000
[
000000000000002a] *pgd=
0000000000000000
Internal error: Oops:
96000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 PID: 2457 Comm: irq/125-
5b11000 Not tainted
4.14.98-07846-g0b40a9b-dirty #16
Hardware name: Freescale i.MX8QM MEK (DT)
task:
ffff8008f2a98000 task.stack:
ffff00000b7b8000
PC is at composite_setup+0x44/0x1508
LR is at android_setup+0xb8/0x13c
pc : [<
ffff0000089ffb3c>] lr : [<
ffff000008a032fc>] pstate:
800001c5
sp :
ffff00000b7bbb80
x29:
ffff00000b7bbb80 x28:
ffff8008f2a3c010
x27:
0000000000000001 x26:
0000000000000000 [1232/1897]
audit: audit_lost=25791 audit_rate_limit=5 audit_backlog_limit=64
x25:
00000000ffffffa1 x24:
ffff8008f2a3c010
audit: rate limit exceeded
x23:
0000000000000409 x22:
ffff000009c8e000
x21:
ffff8008f7a8b428 x20:
ffff00000afae000
x19:
ffff0000089ff000 x18:
0000000000000000
x17:
0000000000000000 x16:
ffff0000082b7c9c
x15:
0000000000000000 x14:
f1866f5b952aca46
x13:
e35502e30d44349c x12:
0000000000000008
x11:
0000000000000008 x10:
0000000000000a30
x9 :
ffff00000b7bbd00 x8 :
ffff8008f2a98a90
x7 :
ffff8008f27a9c90 x6 :
0000000000000001
x5 :
0000000000000000 x4 :
0000000000000001
x3 :
0000000000000000 x2 :
0000000000000006
x1 :
ffff0000089ff8d0 x0 :
732a010310b9ed00
X7: 0xffff8008f27a9c10:
9c10
00000002 00000000 00000001 00000000 13110000 ffff0000 00000002 00208040
9c30
00000000 00000000 00000000 00000000 00000000 00000005 00000029 00000000
9c50
00051778 00000001 f27a8e00 ffff8008 00000005 00000000 00000078 00000078
9c70
00000078 00000000 09031d48 ffff0000 00100000 00000000 00400000 00000000
9c90
00000001 00000000 00000000 00000000 00000000 00000000 ffefb1a0 ffff8008
9cb0
f27a9ca8 ffff8008 00000000 00000000 b9d88037 00000173 1618a3eb 00000001
9cd0
870a792a 0000002e 16188fe6 00000001 0000242b 00000000 00000000 00000000
using random self ethernet address
9cf0
019a4646 00000000 000547f3 00000000 ecfd6c33 00000002 00000000
using random host ethernet address
00000000
X8: 0xffff8008f2a98a10:
8a10
00000000 00000000 f7788d00 ffff8008 00000001 00000000 00000000 00000000
8a30
eb218000 ffff8008 f2a98000 ffff8008 f2a98000 ffff8008 09885000 ffff0000
8a50
f34df480 ffff8008 00000000 00000000 f2a98648 ffff8008 09c8e000 ffff0000
8a70
fff2c800 ffff8008 09031d48 ffff0000 0b7bbd00 ffff0000 0b7bbd00 ffff0000
8a90
080861bc ffff0000 00000000 00000000 00000000 00000000 00000000 00000000
8ab0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
8ad0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
8af0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
X21: 0xffff8008f7a8b3a8:
b3a8
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
b3c8
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
b3e8
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
b408
00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000
b428
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
b448
0053004d 00540046 00300031 00010030 eb07b520 ffff8008 20011201 00000003
b468
e418d109 0104404e 00010302 00000000 eb07b558 ffff8008 eb07b558 ffff8008
b488
f7a8b488 ffff8008 f7a8b488 ffff8008 f7a8b300 ffff8008 00000000 00000000
X24: 0xffff8008f2a3bf90:
bf90
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfb0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfd0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bff0
00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008
c010
00000000 00000000 f2a3c018 ffff8008 f2a3c018 ffff8008 08a067dc ffff0000
c030
f2a5a000 ffff8008 091c3650 ffff0000 f716fd18 ffff8008 f716fe30 ffff8008
c050
f2ce4a30 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000
c070
f76c8010 ffff8008 f2ce4b00 ffff8008 095cac68 ffff0000 f2a5a028 ffff8008
X28: 0xffff8008f2a3bf90:
bf90
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfb0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfd0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bff0
00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008
c010
00000000 00000000 f2a3c018 ffff8008 f2a3c018 ffff8008 08a067dc ffff0000
c030
f2a5a000 ffff8008 091c3650 ffff0000 f716fd18 ffff8008 f716fe30 ffff8008
c050
f2ce4a30 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000
c070
f76c8010 ffff8008 f2ce4b00 ffff8008 095cac68 ffff0000 f2a5a028 ffff8008
Process irq/125-
5b11000 (pid: 2457, stack limit = 0xffff00000b7b8000)
Call trace:
Exception stack(0xffff00000b7bba40 to 0xffff00000b7bbb80)
ba40:
732a010310b9ed00 ffff0000089ff8d0 0000000000000006 0000000000000000
ba60:
0000000000000001 0000000000000000 0000000000000001 ffff8008f27a9c90
ba80:
ffff8008f2a98a90 ffff00000b7bbd00 0000000000000a30 0000000000000008
baa0:
0000000000000008 e35502e30d44349c f1866f5b952aca46 0000000000000000
bac0:
ffff0000082b7c9c 0000000000000000 0000000000000000 ffff0000089ff000
bae0:
ffff00000afae000 ffff8008f7a8b428 ffff000009c8e000 0000000000000409
bb00:
ffff8008f2a3c010 00000000ffffffa1 0000000000000000 0000000000000001
bb20:
ffff8008f2a3c010 ffff00000b7bbb80 ffff000008a032fc ffff00000b7bbb80
bb40:
ffff0000089ffb3c 00000000800001c5 ffff00000b7bbb80 732a010310b9ed00
bb60:
ffffffffffffffff ffff0000080f777c ffff00000b7bbb80 ffff0000089ffb3c
[<
ffff0000089ffb3c>] composite_setup+0x44/0x1508
[<
ffff000008a032fc>] android_setup+0xb8/0x13c
[<
ffff0000089bd9a8>] cdns3_ep0_delegate_req+0x44/0x70
[<
ffff0000089bdff4>] cdns3_check_ep0_interrupt_proceed+0x33c/0x654
[<
ffff0000089bca44>] cdns3_device_thread_irq_handler+0x4b0/0x4bc
[<
ffff0000089b77b4>] cdns3_thread_irq+0x48/0x68
[<
ffff000008145bf0>] irq_thread_fn+0x28/0x88
[<
ffff000008145e38>] irq_thread+0x13c/0x228
[<
ffff0000080fed70>] kthread+0x104/0x130
[<
ffff000008085064>] ret_from_fork+0x10/0x18
oops2:
composite_disconnect: Calling disconnect on a Gadget that is not connected
android_work: did not send uevent (0 0 (null))
init: Received control message 'stop' for 'adbd' from pid: 3359 (/vendor/bin/hw/android.hardware.usb@1.1-service.imx)
init: Sending signal 9 to service 'adbd' (pid 22343) process group...
------------[ cut here ]------------
audit: audit_lost=180038 audit_rate_limit=5 audit_backlog_limit=64
audit: rate limit exceeded
WARNING: CPU: 0 PID: 3468 at kernel_imx/drivers/usb/gadget/composite.c:2009 composite_disconnect+0x80/0x88
Modules linked in:
CPU: 0 PID: 3468 Comm: HWC-UEvent-Thre Not tainted
4.14.98-07846-g0b40a9b-dirty #16
Hardware name: Freescale i.MX8QM MEK (DT)
task:
ffff8008f2349c00 task.stack:
ffff00000b0a8000
PC is at composite_disconnect+0x80/0x88
LR is at composite_disconnect+0x80/0x88
pc : [<
ffff0000089ff9b0>] lr : [<
ffff0000089ff9b0>] pstate:
600001c5
sp :
ffff000008003dd0
x29:
ffff000008003dd0 x28:
ffff8008f2349c00
x27:
ffff000009885018 x26:
ffff000008004000
Timeout for IPC response!
x25:
ffff000009885018 x24:
ffff000009c8e280
x23:
ffff8008f2d98010 x22:
00000000000001c0
x21:
ffff8008f2d98394 x20:
ffff8008f2d98010
x19:
0000000000000000 x18:
0000e3956f4f075a
fxos8700 4-001e: i2c block read acc failed
x17:
0000e395735727e8 x16:
ffff00000829f4d4
x15:
ffffffffffffffff x14:
7463656e6e6f6320
x13:
746f6e2009090920 x12:
7369207461687420
x11:
7465676461472061 x10:
206e6f207463656e
x9 :
6e6f637369642067 x8 :
ffff000009c8e280
x7 :
ffff0000086ca6cc x6 :
ffff000009f15e78
x5 :
0000000000000000 x4 :
0000000000000000
x3 :
ffffffffffffffff x2 :
c3f28b86000c3900
x1 :
c3f28b86000c3900 x0 :
000000000000004e
X20: 0xffff8008f2d97f90:
7f90
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fb0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
libprocessgroup: Failed to kill process cgroup uid 0 pid 22343 in 215ms, 1 processes remain
7fd0
Timeout for IPC response!
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
using random self ethernet address
7ff0
00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008
8010
00000100 00000000 f2d98018 ffff8008 f2d98018 ffff8008 08a067dc
using random host ethernet address
ffff0000
8030
f206d800 ffff8008 091c3650 ffff0000 f7957b18 ffff8008 f7957730 ffff8008
8050
f716a630 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000
8070
f76c8010 ffff8008 f716a800 ffff8008 095cac68 ffff0000 f206d828 ffff8008
X21: 0xffff8008f2d98314:
8314
ffff8008 00000000 00000000 00000000 00000000 00000000 00000000 00000000
8334
00000000 00000000 00000000 00000000 00000000 08a04cf4 ffff0000 00000000
8354
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
8374
00000000 00000000 00000000 00001001 00000000 00000000 00000000 00000000
8394
e4bbe4bb 0f230000 ffff0000 0afae000 ffff0000 ae001000 00000000 f206d400
Timeout for IPC response!
83b4
ffff8008 00000000 00000000 f7957b18 ffff8008 f7957718 ffff8008 f7957018
83d4
ffff8008 f7957118 ffff8008 f7957618 ffff8008 f7957818 ffff8008 f7957918
83f4
ffff8008 f7957d18 ffff8008 00000000 00000000 00000000 00000000 00000000
X23: 0xffff8008f2d97f90:
7f90
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fb0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fd0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7ff0
00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008
8010
00000100 00000000 f2d98018 ffff8008 f2d98018 ffff8008 08a067dc ffff0000
8030
f206d800 ffff8008 091c3650 ffff0000 f7957b18 ffff8008 f7957730 ffff8008
8050
f716a630 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000
8070
f76c8010 ffff8008 f716a800 ffff8008 095cac68 ffff0000 f206d828 ffff8008
X28: 0xffff8008f2349b80:
9b80
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9ba0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9bc0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9be0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9c00
00000022 00000000 ffffffff ffffffff 00010001 00000000 00000000 00000000
9c20
0b0a8000 ffff0000 00000002 00404040 00000000 00000000 00000000 00000000
9c40
00000001 00000000 00000001 00000000 001ebd44 00000001 f390b800 ffff8008
9c60
00000000 00000001 00000070 00000070 00000070 00000000 09031d48 ffff0000
Call trace:
Exception stack(0xffff000008003c90 to 0xffff000008003dd0)
3c80:
000000000000004e c3f28b86000c3900
3ca0:
c3f28b86000c3900 ffffffffffffffff 0000000000000000 0000000000000000
3cc0:
ffff000009f15e78 ffff0000086ca6cc ffff000009c8e280 6e6f637369642067
3ce0:
206e6f207463656e 7465676461472061 7369207461687420 746f6e2009090920
3d00:
7463656e6e6f6320 ffffffffffffffff ffff00000829f4d4 0000e395735727e8
3d20:
0000e3956f4f075a 0000000000000000 ffff8008f2d98010 ffff8008f2d98394
3d40:
00000000000001c0 ffff8008f2d98010 ffff000009c8e280 ffff000009885018
3d60:
ffff000008004000 ffff000009885018 ffff8008f2349c00 ffff000008003dd0
3d80:
ffff0000089ff9b0 ffff000008003dd0 ffff0000089ff9b0 00000000600001c5
3da0:
ffff8008f33f2cd8 0000000000000000 0000ffffffffffff 0000000000000000
init: Received control message 'start' for 'adbd' from pid: 3359 (/vendor/bin/hw/android.hardware.usb@1.1-service.imx)
3dc0:
ffff000008003dd0 ffff0000089ff9b0
[<
ffff0000089ff9b0>] composite_disconnect+0x80/0x88
[<
ffff000008a044d4>] android_disconnect+0x3c/0x68
[<
ffff0000089ba9f8>] cdns3_device_irq_handler+0xfc/0x2c8
[<
ffff0000089b84c0>] cdns3_irq+0x44/0x94
[<
ffff00000814494c>] __handle_irq_event_percpu+0x60/0x24c
[<
ffff000008144c0c>] handle_irq_event+0x58/0xc0
[<
ffff00000814873c>] handle_fasteoi_irq+0x98/0x180
[<
ffff000008143a10>] generic_handle_irq+0x24/0x38
[<
ffff000008144170>] __handle_domain_irq+0x60/0xac
[<
ffff0000080819c4>] gic_handle_irq+0xd4/0x17c
Cc: Andrzej Pietrasiewicz <andrzejtp2010@gmail.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Jayshri Pawar [Mon, 21 Oct 2019 06:35:00 +0000 (07:35 +0100)]
usb: gadget: f_tcm: Provide support to get alternate setting in tcm function
Providing tcm_get_alt in tcm function to support Bulk only protocol and
USB Attached SCSI protocol
Signed-off-by: Jayshri Pawar <jpawar@cadence.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Nikhil Badola [Mon, 21 Oct 2019 10:21:53 +0000 (18:21 +0800)]
usb: gadget: Correct NULL pointer checking in fsl gadget
Correct NULL pointer checking for endpoint descriptor
before it gets dereferenced
Signed-off-by: Nikhil Badola <nikhil.badola@freescale.com>
Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
Reviewed-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Nikhil Badola [Mon, 21 Oct 2019 10:21:52 +0000 (18:21 +0800)]
usb: fsl: Remove unused variable
Remove unused variable td_complete
Signed-off-by: Nikhil Badola <nikhil.badola@freescale.com>
Reviewed-by: Ran Wang <ran.wang_1@nxp.com>
Reviewed-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Andrey Konovalov [Mon, 21 Oct 2019 14:20:59 +0000 (16:20 +0200)]
USB: dummy-hcd: use usb_urb_dir_in instead of usb_pipein
Commit
fea3409112a9 ("USB: add direction bit to urb->transfer_flags") has
added a usb_urb_dir_in() helper function that can be used to determine
the direction of the URB. With that patch USB_DIR_IN control requests with
wLength == 0 are considered out requests by real USB HCDs. This patch
changes dummy-hcd to use the usb_urb_dir_in() helper to match that
behavior.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Andrey Konovalov [Mon, 21 Oct 2019 14:20:58 +0000 (16:20 +0200)]
USB: dummy-hcd: increase max number of devices to 32
When fuzzing the USB subsystem with syzkaller, we currently use 8 testing
processes within one VM. To isolate testing processes from one another it
is desirable to assign a dedicated USB bus to each of those, which means
we need at least 8 Dummy UDC/HCD devices.
This patch increases the maximum number of Dummy UDC/HCD devices to 32
(more than 8 in case we need more of them in the future).
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Veeraiyan Chidambaram [Mon, 9 Sep 2019 10:54:41 +0000 (12:54 +0200)]
usb: renesas_usbhs: add suspend event support in gadget mode
When R-Car Gen3 USB 2.0 is in Gadget mode, if host is detached an interrupt
will be generated and Suspended state bit is set in interrupt status
register. Interrupt handler will call driver->suspend(composite_suspend)
if suspended state bit is set. composite_suspend will call
ffs_func_suspend which will post FUNCTIONFS_SUSPEND and will be consumed
by user space application via /dev/ep0.
To be able to detect host detach, extend the DVSQ_MASK to cover the
Suspended bit of the DVSQ[2:0] bitfield from the Interrupt Status
Register 0 (INTSTS0) register and perform appropriate action in the
DVST interrupt handler (usbhsg_irq_dev_state).
Without this commit, disconnection of the phone from R-Car H3 ES2.0
Salvator-X CN9 port is not recognized and reverse role switch does
not not happen. If phone is connected again it does not enumerate.
With this commit, disconnection will be recognized and reverse role
switch will happen by a user space application. If phone is connected
again it will enumerate properly and will become visible in the output
of 'lsusb'.
Signed-off-by: Veeraiyan Chidambaram <veeraiyan.chidambaram@in.bosch.com>
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Eugeniu Rosca [Mon, 9 Sep 2019 10:54:40 +0000 (12:54 +0200)]
usb: renesas_usbhs: simplify usbhs_status_get_device_state()
Similar to usbhs_status_get_ctrl_stage(), *_get_device_state() is not
supposed to return any error code since its return value is the DVSQ
bitfield of the INTSTS0 register. According to SoC HW manual rev1.00,
every single value of DVSQ[2:0] is valid and none is an error:
----8<----
Device State
000: Powered state
001: Default state
010: Address state
011: Configuration state
1xx: Suspended state
----8<----
Hence, simplify the function body. The motivation behind dropping the
switch/case construct is being able to implement reading the suspended
state. The latter (based on the above DVSQ[2:0] description) doesn't
have a unique value, but is rather a list of states (which makes
switch/case less suitable for reading/validating it):
100: (Suspended) Powered state
101: (Suspended) Default state
110: (Suspended) Address state
111: (Suspended) Configuration state
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Veeraiyan Chidambaram <veeraiyan.chidambaram@in.bosch.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Eugeniu Rosca [Mon, 9 Sep 2019 10:54:39 +0000 (12:54 +0200)]
usb: renesas_usbhs: enable DVSE interrupt
Commit [1] enabled the possibility of checking the DVST (Device State
Transition) bit of INTSTS0 (Interrupt Status Register 0) and calling
the irq_dev_state() handler if the DVST bit is set. But neither
commit [1] nor commit [2] actually enabled the DVSE (Device State
Transition Interrupt Enable) bit in the INTENB0 (Interrupt Enable
Register 0). As a consequence, irq_dev_state() handler is getting
called as a side effect of other (non-DVSE) interrupts being fired,
which definitely can't be relied upon, if DVST notifications are of
any value.
Why this doesn't hurt is because usbhsg_irq_dev_state() currently
doesn't do much except of a dev_dbg(). Once more work is added to
the handler (e.g. detecting device "Suspended" state and notifying
other USB gadget components about it), enabling DVSE becomes a hard
requirement. Do it in a standalone commit for better visibility and
clear explanation.
[1]
f1407d5 ("usb: renesas_usbhs: Add Renesas USBHS common code")
[2]
2f98382 ("usb: renesas_usbhs: Add Renesas USBHS Gadget")
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Veeraiyan Chidambaram <veeraiyan.chidambaram@in.bosch.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Veeraiyan Chidambaram [Thu, 5 Sep 2019 09:17:54 +0000 (11:17 +0200)]
usb: gadget: udc: renesas_usb3: add suspend event support
In R-Car Gen3 USB 3.0 Function, if host is detached an interrupt
will be generated and Suspended state bit is set in interrupt status
register. Interrupt handler will call driver->suspend(composite_suspend)
if suspended state bit is set. composite_suspend will call
ffs_func_suspend which will post FUNCTIONFS_SUSPEND and will be consumed
by user space application via /dev/ep0.
To be able to detect the host detach, USB_INT_1_B2_SPND to cover the
Suspended bit of the B2_SPND_OUT[9] from the USB Status Register
(USB_STA) register and perform appropriate action in the
usb3_irq_epc_int_1 function.
Without this commit, disconnection of the phone from R-Car H3 ES2.0
Salvator-X CN11 port is not recognized and reverse role switch does
not happen. If phone is connected again it does not enumerate.
With this commit, disconnection will be recognized and reverse role
switch will happen by a user space application. If phone is connected
again it will enumerate properly and will become visible in the
output of 'lsusb'.
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Veeraiyan Chidambaram <veeraiyan.chidambaram@in.bosch.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 09:50:22 +0000 (17:50 +0800)]
usb: gadget: s3c-hsudc: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 09:48:36 +0000 (17:48 +0800)]
usb: gadget: renesas_usb3: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 09:47:38 +0000 (17:47 +0800)]
usb: gadget: r8a66597-udc: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 09:45:57 +0000 (17:45 +0800)]
usb: gadget: pxa27x_udc: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 09:42:22 +0000 (17:42 +0800)]
usb: gadget: pxa25x_udc: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 09:40:33 +0000 (17:40 +0800)]
usb: gadget: gr_udc: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 09:33:35 +0000 (17:33 +0800)]
usb: bdc: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 09:32:27 +0000 (17:32 +0800)]
usb: gadget: bcm63xx_udc: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 09:02:39 +0000 (17:02 +0800)]
usb: gadget: at91_udc: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 08:50:45 +0000 (16:50 +0800)]
usb: renesas_usbhs: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 08:48:27 +0000 (16:48 +0800)]
usb: phy: mxs: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
YueHaibing [Wed, 4 Sep 2019 08:45:58 +0000 (16:45 +0800)]
usb: phy: keystone: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Michał Mirosław [Sat, 10 Aug 2019 08:42:53 +0000 (10:42 +0200)]
usb: gadget: u_serial: use mutex for serialising open()s
Remove home-made waiting mechanism from gs_open() and rely on
portmaster's mutex to do the job.
Note: This releases thread waiting on close() when another thread
open()s simultaneously.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Michał Mirosław [Sat, 10 Aug 2019 08:42:52 +0000 (10:42 +0200)]
usb: gadget: u_serial: diagnose missed console messages
Insert markers in console stream marking places where data
is missing. This makes the hole in the data stand out clearly
instead of glueing together unrelated messages.
Example output as seen from USB host side:
[ 0.064078] pinctrl core: registered pin 16 (UART3_RTS_N PC0) on
70000868.pinmux
[ 0.064130] pinctrl
[missed 114987 bytes]
[ 4.302299] udevd[134]: starting version 3.2.5
[ 4.306845] random: udevd: uninitialized urandom read (16 bytes read)
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Michał Mirosław [Sat, 10 Aug 2019 08:42:52 +0000 (10:42 +0200)]
usb: gadget: legacy/serial: allow dynamic removal
Legacy serial USB gadget is still useful as an early console,
before userspace is up. Later it could be replaced with proper
configfs-configured composite gadget - that use case is enabled
by this patch.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Michał Mirosław [Sat, 10 Aug 2019 08:42:51 +0000 (10:42 +0200)]
usb: gadget: u_serial: allow more console gadget ports
Allow configuring more than one console using USB serial or ACM gadget.
By default, only first (ttyGS0) is a console, but this may be changed
using function's new "console" attribute.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Michał Mirosław [Sat, 10 Aug 2019 08:42:50 +0000 (10:42 +0200)]
usb: gadget: u_serial: make OBEX port not a console
Prevent OBEX serial port from ever becoming a console. Console messages
will definitely break the protocol, and since you have to instantiate
the port making it explicitly for OBEX, there is no point in allowing
console to break it by mistake.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Michał Mirosław [Sat, 10 Aug 2019 08:42:49 +0000 (10:42 +0200)]
usb: gadget: u_serial: reimplement console support
Rewrite console support to fix a few shortcomings of the old code
preventing its use with multiple ports. This removes some duplicated
code and replaces a custom kthread with simpler workqueue item.
Only port ttyGS0 gets to be a console for now.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Ladislav Michl <ladis@linux-mips.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Michał Mirosław [Sat, 10 Aug 2019 08:42:48 +0000 (10:42 +0200)]
usb: gadget: u_serial: add missing port entry locking
gserial_alloc_line() misses locking (for a release barrier) while
resetting port entry on TTY allocation failure. Fix this.
Cc: stable@vger.kernel.org
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Ladislav Michl <ladis@linux-mips.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Thinh Nguyen [Fri, 9 Aug 2019 19:15:52 +0000 (12:15 -0700)]
usb: dwc3: Disable phy suspend after power-on reset
For DRD controllers, the programming guide recommended that
GUSB3PIPECTL.SUSPENDABLE and GUSB2PHYCFG.SUSPHY to be cleared after
power-on reset and only set after the controller initialization is
completed. This can be done after device soft-reset in dwc3_core_init().
This patch makes sure to clear GUSB3PIPECTL.SUSPENDABLE and
GUSB2PHYCFG.SUSPHY before core initialization and only set them after
the device soft-reset is completed.
Reference: DWC_usb3 3.30a and DWC_usb31 1.90a programming guide section
1.2.49 and 1.2.45
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Linus Torvalds [Sun, 20 Oct 2019 19:56:22 +0000 (15:56 -0400)]
Linux 5.4-rc4
Linus Torvalds [Sun, 20 Oct 2019 16:36:57 +0000 (12:36 -0400)]
Merge tag 'kbuild-fixes-v5.4-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild fixes from Masahiro Yamada:
- fix a bashism of setlocalversion
- do not use the too new --sort option of tar
* tag 'kbuild-fixes-v5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kheaders: substituting --sort in archive creation
scripts: setlocalversion: fix a bashism
kbuild: update comment about KBUILD_ALLDIRS
Linus Torvalds [Sun, 20 Oct 2019 10:31:14 +0000 (06:31 -0400)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A small set of x86 fixes:
- Prevent a NULL pointer dereference in the X2APIC code in case of a
CPU hotplug failure.
- Prevent boot failures on HP superdome machines by invalidating the
level2 kernel pagetable entries outside of the kernel area as
invalid so BIOS reserved space won't be touched unintentionally.
Also ensure that memory holes are rounded up to the next PMD
boundary correctly.
- Enable X2APIC support on Hyper-V to prevent boot failures.
- Set the paravirt name when running on Hyper-V for consistency
- Move a function under the appropriate ifdef guard to prevent build
warnings"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/acpi: Move get_cmdline_acpi_rsdp() under #ifdef guard
x86/hyperv: Set pv_info.name to "Hyper-V"
x86/apic/x2apic: Fix a NULL pointer deref when handling a dying cpu
x86/hyperv: Make vapic support x2apic mode
x86/boot/64: Round memory hole size up to next PMD page
x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area
Linus Torvalds [Sun, 20 Oct 2019 10:27:54 +0000 (06:27 -0400)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A small set of irq chip driver fixes and updates:
- Update the SIFIVE PLIC interrupt driver to use the fasteoi handler
to address the shortcomings of the existing flow handling which was
prone to lose interrupts
- Use the proper limit for GIC interrupt line numbers
- Add retrigger support for the recently merged Anapurna Labs Fabric
interrupt controller to make it complete
- Enable the ATMEL AIC5 interrupt controller driver on the new
SAM9X60 SoC"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/sifive-plic: Switch to fasteoi flow
irqchip/gic-v3: Fix GIC_LINE_NR accessor
irqchip/atmel-aic5: Add support for sam9x60 irqchip
irqchip/al-fic: Add support for irq retrigger
Linus Torvalds [Sun, 20 Oct 2019 10:25:12 +0000 (06:25 -0400)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull hrtimer fixlet from Thomas Gleixner:
"A single commit annotating the lockcless access to timer->base with
READ_ONCE() and adding the WRITE_ONCE() counterparts for completeness"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hrtimer: Annotate lockless access to timer->base
Linus Torvalds [Sun, 20 Oct 2019 10:22:25 +0000 (06:22 -0400)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull stop-machine fix from Thomas Gleixner:
"A single fix, amending stop machine with WRITE/READ_ONCE() to address
the fallout of KCSAN"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stop_machine: Avoid potential race behaviour
Linus Torvalds [Sat, 19 Oct 2019 21:09:11 +0000 (17:09 -0400)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
"I was battling a cold after some recent trips, so quite a bit piled up
meanwhile, sorry about that.
Highlights:
1) Fix fd leak in various bpf selftests, from Brian Vazquez.
2) Fix crash in xsk when device doesn't support some methods, from
Magnus Karlsson.
3) Fix various leaks and use-after-free in rxrpc, from David Howells.
4) Fix several SKB leaks due to confusion of who owns an SKB and who
should release it in the llc code. From Eric Biggers.
5) Kill a bunc of KCSAN warnings in TCP, from Eric Dumazet.
6) Jumbo packets don't work after resume on r8169, as the BIOS resets
the chip into non-jumbo mode during suspend. From Heiner Kallweit.
7) Corrupt L2 header during MPLS push, from Davide Caratti.
8) Prevent possible infinite loop in tc_ctl_action, from Eric
Dumazet.
9) Get register bits right in bcmgenet driver, based upon chip
version. From Florian Fainelli.
10) Fix mutex problems in microchip DSA driver, from Marek Vasut.
11) Cure race between route lookup and invalidation in ipv4, from Wei
Wang.
12) Fix performance regression due to false sharing in 'net'
structure, from Eric Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (145 commits)
net: reorder 'struct net' fields to avoid false sharing
net: dsa: fix switch tree list
net: ethernet: dwmac-sun8i: show message only when switching to promisc
net: aquantia: add an error handling in aq_nic_set_multicast_list
net: netem: correct the parent's backlog when corrupted packet was dropped
net: netem: fix error path for corrupted GSO frames
macb: propagate errors when getting optional clocks
xen/netback: fix error path of xenvif_connect_data()
net: hns3: fix mis-counting IRQ vector numbers issue
net: usb: lan78xx: Connect PHY before registering MAC
vsock/virtio: discard packets if credit is not respected
vsock/virtio: send a credit update when buffer size is changed
mlxsw: spectrum_trap: Push Ethernet header before reporting trap
net: ensure correct skb->tstamp in various fragmenters
net: bcmgenet: reset 40nm EPHY on energy detect
net: bcmgenet: soft reset 40nm EPHYs before MAC init
net: phy: bcm7xxx: define soft_reset for 40nm EPHY
net: bcmgenet: don't set phydev->link from MAC
net: Update address for MediaTek ethernet driver in MAINTAINERS
ipv4: fix race condition between route lookup and invalidation
...
Eric Dumazet [Fri, 18 Oct 2019 22:20:05 +0000 (15:20 -0700)]
net: reorder 'struct net' fields to avoid false sharing
Intel test robot reported a ~7% regression on TCP_CRR tests
that they bisected to the cited commit.
Indeed, every time a new TCP socket is created or deleted,
the atomic counter net->count is touched (via get_net(net)
and put_net(net) calls)
So cpus might have to reload a contended cache line in
net_hash_mix(net) calls.
We need to reorder 'struct net' fields to move @hash_mix
in a read mostly cache line.
We move in the first cache line fields that can be
dirtied often.
We probably will have to address in a followup patch
the __randomize_layout that was added in linux-4.13,
since this might break our placement choices.
Fixes:
355b98553789 ("netns: provide pure entropy for net_hash_mix()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 18 Oct 2019 21:02:46 +0000 (17:02 -0400)]
net: dsa: fix switch tree list
If there are multiple switch trees on the device, only the last one
will be listed, because the arguments of list_add_tail are swapped.
Fixes:
83c0afaec7b7 ("net: dsa: Add new binding implementation")
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mans Rullgard [Fri, 18 Oct 2019 16:56:58 +0000 (17:56 +0100)]
net: ethernet: dwmac-sun8i: show message only when switching to promisc
Printing the info message every time more than the max number of mac
addresses are requested generates unnecessary log spam. Showing it only
when the hw is not already in promiscous mode is equally informative
without being annoying.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chenwandun [Fri, 18 Oct 2019 10:20:37 +0000 (18:20 +0800)]
net: aquantia: add an error handling in aq_nic_set_multicast_list
add an error handling in aq_nic_set_multicast_list, it may not
work when hw_multicast_list_set error; and at the same time
it will remove gcc Wunused-but-set-variable warning.
Signed-off-by: Chenwandun <chenwandun@huawei.com>
Reviewed-by: Igor Russkikh <igor.russkikh@aquantia.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 19 Oct 2019 19:12:36 +0000 (12:12 -0700)]
Merge branch 'netem-fix-further-issues-with-packet-corruption'
Jakub Kicinski says:
====================
net: netem: fix further issues with packet corruption
This set is fixing two more issues with the netem packet corruption.
First patch (which was previously posted) avoids NULL pointer dereference
if the first frame gets freed due to allocation or checksum failure.
v2 improves the clarity of the code a little as requested by Cong.
Second patch ensures we don't return SUCCESS if the frame was in fact
dropped. Thanks to this commit message for patch 1 no longer needs the
"this will still break with a single-frame failure" disclaimer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 18 Oct 2019 16:16:58 +0000 (09:16 -0700)]
net: netem: correct the parent's backlog when corrupted packet was dropped
If packet corruption failed we jump to finish_segs and return
NET_XMIT_SUCCESS. Seeing success will make the parent qdisc
increment its backlog, that's incorrect - we need to return
NET_XMIT_DROP.
Fixes:
6071bd1aa13e ("netem: Segment GSO packets on enqueue")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 18 Oct 2019 16:16:57 +0000 (09:16 -0700)]
net: netem: fix error path for corrupted GSO frames
To corrupt a GSO frame we first perform segmentation. We then
proceed using the first segment instead of the full GSO skb and
requeue the rest of the segments as separate packets.
If there are any issues with processing the first segment we
still want to process the rest, therefore we jump to the
finish_segs label.
Commit
177b8007463c ("net: netem: fix backlog accounting for
corrupted GSO frames") started using the pointer to the first
segment in the "rest of segments processing", but as mentioned
above the first segment may had already been freed at this point.
Backlog corrections for parent qdiscs have to be adjusted.
Fixes:
177b8007463c ("net: netem: fix backlog accounting for corrupted GSO frames")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Tretter [Fri, 18 Oct 2019 14:11:43 +0000 (16:11 +0200)]
macb: propagate errors when getting optional clocks
The tx_clk, rx_clk, and tsu_clk are optional. Currently the macb driver
marks clock as not available if it receives an error when trying to get
a clock. This is wrong, because a clock controller might return
-EPROBE_DEFER if a clock is not available, but will eventually become
available.
In these cases, the driver would probe successfully but will never be
able to adjust the clocks, because the clocks were not available during
probe, but became available later.
For example, the clock controller for the ZynqMP is implemented in the
PMU firmware and the clocks are only available after the firmware driver
has been probed.
Use devm_clk_get_optional() in instead of devm_clk_get() to get the
optional clock and propagate all errors to the calling function.
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Tested-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Juergen Gross [Fri, 18 Oct 2019 07:45:49 +0000 (09:45 +0200)]
xen/netback: fix error path of xenvif_connect_data()
xenvif_connect_data() calls module_put() in case of error. This is
wrong as there is no related module_get().
Remove the superfluous module_put().
Fixes:
279f438e36c0a7 ("xen-netback: Don't destroy the netdev until the vif is shut down")
Cc: <stable@vger.kernel.org> # 3.12
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonglong Liu [Fri, 18 Oct 2019 03:42:59 +0000 (11:42 +0800)]
net: hns3: fix mis-counting IRQ vector numbers issue
Currently, the num_msi_left means the vector numbers of NIC,
but if the PF supported RoCE, it contains the vector numbers
of NIC and RoCE(Not expected).
This may cause interrupts lost in some case, because of the
NIC module used the vector resources which belongs to RoCE.
This patch adds a new variable num_nic_msi to store the vector
numbers of NIC, and adjust the default TQP numbers and rss_size
according to the value of num_nic_msi.
Fixes:
46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 19 Oct 2019 10:53:59 +0000 (06:53 -0400)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"Rather a lot of fixes, almost all affecting mm/"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (26 commits)
scripts/gdb: fix debugging modules on s390
kernel/events/uprobes.c: only do FOLL_SPLIT_PMD for uprobe register
mm/thp: allow dropping THP from page cache
mm/vmscan.c: support removing arbitrary sized pages from mapping
mm/thp: fix node page state in split_huge_page_to_list()
proc/meminfo: fix output alignment
mm/init-mm.c: include <linux/mman.h> for vm_committed_as_batch
mm/filemap.c: include <linux/ramfs.h> for generic_file_vm_ops definition
mm: include <linux/huge_mm.h> for is_vma_temporary_stack
zram: fix race between backing_dev_show and backing_dev_store
mm/memcontrol: update lruvec counters in mem_cgroup_move_account
ocfs2: fix panic due to ocfs2_wq is null
hugetlbfs: don't access uninitialized memmaps in pfn_range_valid_gigantic()
mm: memblock: do not enforce current limit for memblock_phys* family
mm: memcg: get number of pages on the LRU list in memcgroup base on lru_zone_size
mm/gup: fix a misnamed "write" argument, and a related bug
mm/gup_benchmark: add a missing "w" to getopt string
ocfs2: fix error handling in ocfs2_setattr()
mm: memcg/slab: fix panic in __free_slab() caused by premature memcg pointer release
mm/memunmap: don't access uninitialized memmap in memunmap_pages()
...
Ilya Leoshkevich [Sat, 19 Oct 2019 03:20:43 +0000 (20:20 -0700)]
scripts/gdb: fix debugging modules on s390
Currently lx-symbols assumes that module text is always located at
module->core_layout->base, but s390 uses the following layout:
+------+ <- module->core_layout->base
| GOT |
+------+ <- module->core_layout->base + module->arch->plt_offset
| PLT |
+------+ <- module->core_layout->base + module->arch->plt_offset +
| TEXT | module->arch->plt_size
+------+
Therefore, when trying to debug modules on s390, all the symbol
addresses are skewed by plt_offset + plt_size.
Fix by adding plt_offset + plt_size to module_addr in
load_module_symbols().
Link: http://lkml.kernel.org/r/20191017085917.81791-1-iii@linux.ibm.com
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Song Liu [Sat, 19 Oct 2019 03:20:40 +0000 (20:20 -0700)]
kernel/events/uprobes.c: only do FOLL_SPLIT_PMD for uprobe register
Attaching uprobe to text section in THP splits the PMD mapped page table
into PTE mapped entries. On uprobe detach, we would like to regroup PMD
mapped page table entry to regain performance benefit of THP.
However, the regroup is broken For perf_event based trace_uprobe. This
is because perf_event based trace_uprobe calls uprobe_unregister twice
on close: first in TRACE_REG_PERF_CLOSE, then in
TRACE_REG_PERF_UNREGISTER. The second call will split the PMD mapped
page table entry, which is not the desired behavior.
Fix this by only use FOLL_SPLIT_PMD for uprobe register case.
Add a WARN() to confirm uprobe unregister never work on huge pages, and
abort the operation when this WARN() triggers.
Link: http://lkml.kernel.org/r/20191017164223.2762148-6-songliubraving@fb.com
Fixes:
5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kirill A. Shutemov [Sat, 19 Oct 2019 03:20:36 +0000 (20:20 -0700)]
mm/thp: allow dropping THP from page cache
Once a THP is added to the page cache, it cannot be dropped via
/proc/sys/vm/drop_caches. Fix this issue with proper handling in
invalidate_mapping_pages().
Link: http://lkml.kernel.org/r/20191017164223.2762148-5-songliubraving@fb.com
Fixes:
99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
William Kucharski [Sat, 19 Oct 2019 03:20:33 +0000 (20:20 -0700)]
mm/vmscan.c: support removing arbitrary sized pages from mapping
__remove_mapping() assumes that pages can only be either base pages or
HPAGE_PMD_SIZE. Ask the page what size it is.
Link: http://lkml.kernel.org/r/20191017164223.2762148-4-songliubraving@fb.com
Fixes:
99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kirill A. Shutemov [Sat, 19 Oct 2019 03:20:30 +0000 (20:20 -0700)]
mm/thp: fix node page state in split_huge_page_to_list()
Make sure split_huge_page_to_list() handles the state of shmem THP and
file THP properly.
Link: http://lkml.kernel.org/r/20191017164223.2762148-3-songliubraving@fb.com
Fixes:
60fbf0ab5da1 ("mm,thp: stats for file backed THP")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kirill A. Shutemov [Sat, 19 Oct 2019 03:20:27 +0000 (20:20 -0700)]
proc/meminfo: fix output alignment
Patch series "Fixes for THP in page cache", v2.
This patch (of 5):
Add extra space for FileHugePages and FilePmdMapped, so the output is
aligned with other rows.
Link: http://lkml.kernel.org/r/20191017164223.2762148-2-songliubraving@fb.com
Fixes:
60fbf0ab5da1 ("mm,thp: stats for file backed THP")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks (Codethink) [Sat, 19 Oct 2019 03:20:24 +0000 (20:20 -0700)]
mm/init-mm.c: include <linux/mman.h> for vm_committed_as_batch
mm_init.c needs to include <linux/mman.h> for the definition of
vm_committed_as_batch. Fixes the following sparse warning:
mm/mm_init.c:141:5: warning: symbol 'vm_committed_as_batch' was not declared. Should it be static?
Link: http://lkml.kernel.org/r/20191016091509.26708-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Sat, 19 Oct 2019 03:20:20 +0000 (20:20 -0700)]
mm/filemap.c: include <linux/ramfs.h> for generic_file_vm_ops definition
The generic_file_vm_ops is defined in <linux/ramfs.h> so include it to
fix the following warning:
mm/filemap.c:2717:35: warning: symbol 'generic_file_vm_ops' was not declared. Should it be static?
Link: http://lkml.kernel.org/r/20191008102311.25432-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Sat, 19 Oct 2019 03:20:17 +0000 (20:20 -0700)]
mm: include <linux/huge_mm.h> for is_vma_temporary_stack
Include <linux/huge_mm.h> for the definition of is_vma_temporary_stack
to fix the following sparse warning:
mm/rmap.c:1673:6: warning: symbol 'is_vma_temporary_stack' was not declared. Should it be static?
Link: http://lkml.kernel.org/r/20191009151155.27763-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chenwandun [Sat, 19 Oct 2019 03:20:14 +0000 (20:20 -0700)]
zram: fix race between backing_dev_show and backing_dev_store
CPU0: CPU1:
backing_dev_show backing_dev_store
...... ......
file = zram->backing_dev;
down_read(&zram->init_lock); down_read(&zram->init_init_lock)
file_path(file, ...); zram->backing_dev = backing_dev;
up_read(&zram->init_lock); up_read(&zram->init_lock);
gets the value of zram->backing_dev too early in backing_dev_show, which
resultin the value being NULL at the beginning, and not NULL later.
backtrace:
d_path+0xcc/0x174
file_path+0x10/0x18
backing_dev_show+0x40/0xb4
dev_attr_show+0x20/0x54
sysfs_kf_seq_show+0x9c/0x10c
kernfs_seq_show+0x28/0x30
seq_read+0x184/0x488
kernfs_fop_read+0x5c/0x1a4
__vfs_read+0x44/0x128
vfs_read+0xa0/0x138
SyS_read+0x54/0xb4
Link: http://lkml.kernel.org/r/1571046839-16814-1-git-send-email-chenwandun@huawei.com
Signed-off-by: Chenwandun <chenwandun@huawei.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Konstantin Khlebnikov [Sat, 19 Oct 2019 03:20:11 +0000 (20:20 -0700)]
mm/memcontrol: update lruvec counters in mem_cgroup_move_account
Mapped, dirty and writeback pages are also counted in per-lruvec stats.
These counters needs update when page is moved between cgroups.
Currently is nobody *consuming* the lruvec versions of these counters and
that there is no user-visible effect.
Link: http://lkml.kernel.org/r/157112699975.7360.1062614888388489788.stgit@buzz
Fixes:
00f3ca2c2d66 ("mm: memcontrol: per-lruvec stats infrastructure")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yi Li [Sat, 19 Oct 2019 03:20:08 +0000 (20:20 -0700)]
ocfs2: fix panic due to ocfs2_wq is null
mount.ocfs2 failed when reading ocfs2 filesystem superblock encounters
an error. ocfs2_initialize_super() returns before allocating ocfs2_wq.
ocfs2_dismount_volume() triggers the following panic.
Oct 15 16:09:27 cnwarekv-205120 kernel: On-disk corruption discovered.Please run fsck.ocfs2 once the filesystem is unmounted.
Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_read_locked_inode:537 ERROR: status = -30
Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_init_global_system_inodes:458 ERROR: status = -30
Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_init_global_system_inodes:491 ERROR: status = -30
Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_initialize_super:2313 ERROR: status = -30
Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_fill_super:1033 ERROR: status = -30
------------[ cut here ]------------
Oops: 0002 [#1] SMP NOPTI
CPU: 1 PID: 11753 Comm: mount.ocfs2 Tainted: G E
4.14.148-200.ckv.x86_64 #1
Hardware name: Sugon H320-G30/35N16-US, BIOS 0SSDX017 12/21/2018
task:
ffff967af0520000 task.stack:
ffffa5f05484000
RIP: 0010:mutex_lock+0x19/0x20
Call Trace:
flush_workqueue+0x81/0x460
ocfs2_shutdown_local_alloc+0x47/0x440 [ocfs2]
ocfs2_dismount_volume+0x84/0x400 [ocfs2]
ocfs2_fill_super+0xa4/0x1270 [ocfs2]
? ocfs2_initialize_super.isa.211+0xf20/0xf20 [ocfs2]
mount_bdev+0x17f/0x1c0
mount_fs+0x3a/0x160
Link: http://lkml.kernel.org/r/1571139611-24107-1-git-send-email-yili@winhong.com
Signed-off-by: Yi Li <yilikernel@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Hildenbrand [Sat, 19 Oct 2019 03:20:05 +0000 (20:20 -0700)]
hugetlbfs: don't access uninitialized memmaps in pfn_range_valid_gigantic()
Uninitialized memmaps contain garbage and in the worst case trigger
kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get
touched.
Let's make sure that we only consider online memory (managed by the
buddy) that has initialized memmaps. ZONE_DEVICE is not applicable.
page_zone() will call page_to_nid(), which will trigger
VM_BUG_ON_PGFLAGS(PagePoisoned(page), page) with CONFIG_PAGE_POISONING
and CONFIG_DEBUG_VM_PGFLAGS when called on uninitialized memmaps. This
can be the case when an offline memory block (e.g., never onlined) is
spanned by a zone.
Note: As explained by Michal in [1], alloc_contig_range() will verify
the range. So it boils down to the wrong access in this function.
[1] http://lkml.kernel.org/r/
20180423000943.GO17484@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20191015120717.4858-1-david@redhat.com
Fixes:
f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after
d0dc12e86b319]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Sat, 19 Oct 2019 03:20:01 +0000 (20:20 -0700)]
mm: memblock: do not enforce current limit for memblock_phys* family
Until commit
92d12f9544b7 ("memblock: refactor internal allocation
functions") the maximal address for memblock allocations was forced to
memblock.current_limit only for the allocation functions returning
virtual address. The changes introduced by that commit moved the limit
enforcement into the allocation core and as a result the allocation
functions returning physical address also started to limit allocations
to memblock.current_limit.
This caused breakage of etnaviv GPU driver:
etnaviv etnaviv: bound 130000.gpu (ops gpu_ops)
etnaviv etnaviv: bound 134000.gpu (ops gpu_ops)
etnaviv etnaviv: bound
2204000.gpu (ops gpu_ops)
etnaviv-gpu 130000.gpu: model: GC2000, revision: 5108
etnaviv-gpu 130000.gpu: command buffer outside valid memory window
etnaviv-gpu 134000.gpu: model: GC320, revision: 5007
etnaviv-gpu 134000.gpu: command buffer outside valid memory window
etnaviv-gpu
2204000.gpu: model: GC355, revision: 1215
etnaviv-gpu
2204000.gpu: Ignoring GPU with VG and FE2.0
Restore the behaviour of memblock_phys* family so that these functions
will not enforce memblock.current_limit.
Link: http://lkml.kernel.org/r/1570915861-17633-1-git-send-email-rppt@kernel.org
Fixes:
92d12f9544b7 ("memblock: refactor internal allocation functions")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reported-by: Adam Ford <aford173@gmail.com>
Tested-by: Adam Ford <aford173@gmail.com> [imx6q-logicpd]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Honglei Wang [Sat, 19 Oct 2019 03:19:58 +0000 (20:19 -0700)]
mm: memcg: get number of pages on the LRU list in memcgroup base on lru_zone_size
Commit
1a61ab8038e72 ("mm: memcontrol: replace zone summing with
lruvec_page_state()") has made lruvec_page_state to use per-cpu counters
instead of calculating it directly from lru_zone_size with an idea that
this would be more effective.
Tim has reported that this is not really the case for their database
benchmark which is showing an opposite results where lruvec_page_state
is taking up a huge chunk of CPU cycles (about 25% of the system time
which is roughly 7% of total cpu cycles) on 5.3 kernels. The workload
is running on a larger machine (96cpus), it has many cgroups (500) and
it is heavily direct reclaim bound.
Tim Chen said:
: The problem can also be reproduced by running simple multi-threaded
: pmbench benchmark with a fast Optane SSD swap (see profile below).
:
:
: 6.15% 3.08% pmbench [kernel.vmlinux] [k] lruvec_lru_size
: |
: |--3.07%--lruvec_lru_size
: | |
: | |--2.11%--cpumask_next
: | | |
: | | --1.66%--find_next_bit
: | |
: | --0.57%--call_function_interrupt
: | |
: | --0.55%--smp_call_function_interrupt
: |
: |--1.59%--0x441f0fc3d009
: | _ops_rdtsc_init_base_freq
: | access_histogram
: | page_fault
: | __do_page_fault
: | handle_mm_fault
: | __handle_mm_fault
: | |
: | --1.54%--do_swap_page
: | swapin_readahead
: | swap_cluster_readahead
: | |
: | --1.53%--read_swap_cache_async
: | __read_swap_cache_async
: | alloc_pages_vma
: | __alloc_pages_nodemask
: | __alloc_pages_slowpath
: | try_to_free_pages
: | do_try_to_free_pages
: | shrink_node
: | shrink_node_memcg
: | |
: | |--0.77%--lruvec_lru_size
: | |
: | --0.76%--inactive_list_is_low
: | |
: | --0.76%--lruvec_lru_size
: |
: --1.50%--measure_read
: page_fault
: __do_page_fault
: handle_mm_fault
: __handle_mm_fault
: do_swap_page
: swapin_readahead
: swap_cluster_readahead
: |
: --1.48%--read_swap_cache_async
: __read_swap_cache_async
: alloc_pages_vma
: __alloc_pages_nodemask
: __alloc_pages_slowpath
: try_to_free_pages
: do_try_to_free_pages
: shrink_node
: shrink_node_memcg
: |
: |--0.75%--inactive_list_is_low
: | |
: | --0.75%--lruvec_lru_size
: |
: --0.73%--lruvec_lru_size
The likely culprit is the cache traffic the lruvec_page_state_local
generates. Dave Hansen says:
: I was thinking purely of the cache footprint. If it's reading
: pn->lruvec_stat_local->count[idx] is three separate cachelines, so 192
: bytes of cache *96 CPUs = 18k of data, mostly read-only. 1 cgroup would
: be 18k of data for the whole system and the caching would be pretty
: efficient and all 18k would probably survive a tight page fault loop in
: the L1. 500 cgroups would be ~90k of data per CPU thread which doesn't
: fit in the L1 and probably wouldn't survive a tight page fault loop if
: both logical threads were banging on different cgroups.
:
: It's just a theory, but it's why I noted the number of cgroups when I
: initially saw this show up in profiles
Fix the regression by partially reverting the said commit and calculate
the lru size explicitly.
Link: http://lkml.kernel.org/r/20190905071034.16822-1-honglei.wang@oracle.com
Fixes:
1a61ab8038e72 ("mm: memcontrol: replace zone summing with lruvec_page_state()")
Signed-off-by: Honglei Wang <honglei.wang@oracle.com>
Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Tested-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: <stable@vger.kernel.org> [5.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Hubbard [Sat, 19 Oct 2019 03:19:53 +0000 (20:19 -0700)]
mm/gup: fix a misnamed "write" argument, and a related bug
In several routines, the "flags" argument is incorrectly named "write".
Change it to "flags".
Also, in one place, the misnaming led to an actual bug:
"flags & FOLL_WRITE" is required, rather than just "flags".
(That problem was flagged by krobot, in v1 of this patch.)
Also, change the flags argument from int, to unsigned int.
You can see that this was a simple oversight, because the
calling code passes "flags" to the fifth argument:
gup_pgd_range():
...
if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr,
PGDIR_SHIFT, next, flags, pages, nr))
...which, until this patch, the callees referred to as "write".
Also, change two lines to avoid checkpatch line length
complaints, and another line to fix another oversight
that checkpatch called out: missing "int" on pdshift.
Link: http://lkml.kernel.org/r/20191014184639.1512873-3-jhubbard@nvidia.com
Fixes:
b798bec4741b ("mm/gup: change write parameter to flags in fast walk")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reported-by: kbuild test robot <lkp@intel.com>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Hubbard [Sat, 19 Oct 2019 03:19:50 +0000 (20:19 -0700)]
mm/gup_benchmark: add a missing "w" to getopt string
Even though gup_benchmark.c has code to handle the -w command-line option,
the "w" is not part of the getopt string. It looks as if it has been
missing the whole time.
On my machine, this leads naturally to the following predictable result:
$ sudo ./gup_benchmark -w
./gup_benchmark: invalid option -- 'w'
...which is fixed with this commit.
Link: http://lkml.kernel.org/r/20191014184639.1512873-2-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: kbuild test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chengguang Xu [Sat, 19 Oct 2019 03:19:47 +0000 (20:19 -0700)]
ocfs2: fix error handling in ocfs2_setattr()
Should set transfer_to[USRQUOTA/GRPQUOTA] to NULL on error case before
jumping to do dqput().
Link: http://lkml.kernel.org/r/20191010082349.1134-1-cgxu519@mykernel.net
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Gushchin [Sat, 19 Oct 2019 03:19:44 +0000 (20:19 -0700)]
mm: memcg/slab: fix panic in __free_slab() caused by premature memcg pointer release
Karsten reported the following panic in __free_slab() happening on a s390x
machine:
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address:
0000000000000000 TEID:
0000000000000483
Fault in home space mode while using kernel ASCE.
AS:
00000000017d4007 R3:
000000007fbd0007 S:
000000007fbff000 P:
000000000000003d
Oops: 0004 ilc:3 Ý#1¨ PREEMPT SMP
Modules linked in: tcp_diag inet_diag xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_at nf_nat
CPU: 0 PID: 0 Comm: swapper/0 Not tainted
5.3.0-05872-g6133e3e4bada-dirty #14
Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
Krnl PSW :
0704d00180000000 00000000003cadb6 (__free_slab+0x686/0x6b0)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS:
00000000f3a32928 0000000000000000 000000007fbf5d00 000000000117c4b8
0000000000000000 000000009e3291c1 0000000000000000 0000000000000000
0000000000000003 0000000000000008 000000002b478b00 000003d080a97600
0000000000000003 0000000000000008 000000002b478b00 000003d080a97600
000000000117ba00 000003e000057db0 00000000003cabcc 000003e000057c78
Krnl Code:
00000000003cada6:
e310a1400004 lg %r1,320(%r10)
00000000003cadac:
c0e50046c286 brasl %r14,ca32b8
#
00000000003cadb2:
a7f4fe36 brc 15,3caa1e
>
00000000003cadb6:
e32060800024 stg %r2,128(%r6)
00000000003cadbc:
a7f4fd9e brc 15,3ca8f8
00000000003cadc0:
c0e50046790c brasl %r14,c99fd8
00000000003cadc6:
a7f4fe2c brc 15,3caa
00000000003cadc6:
a7f4fe2c brc 15,3caa1e
00000000003cadca:
ecb1ffff00d9 aghik %r11,%r1,-1
Call Trace:
(<
00000000003cabcc> __free_slab+0x49c/0x6b0)
<
00000000001f5886> rcu_core+0x5a6/0x7e0
<
0000000000ca2dea> __do_softirq+0xf2/0x5c0
<
0000000000152644> irq_exit+0x104/0x130
<
000000000010d222> do_IRQ+0x9a/0xf0
<
0000000000ca2344> ext_int_handler+0x130/0x134
<
0000000000103648> enabled_wait+0x58/0x128
(<
0000000000103634> enabled_wait+0x44/0x128)
<
0000000000103b00> arch_cpu_idle+0x40/0x58
<
0000000000ca0544> default_idle_call+0x3c/0x68
<
000000000018eaa4> do_idle+0xec/0x1c0
<
000000000018ee0e> cpu_startup_entry+0x36/0x40
<
000000000122df34> arch_call_rest_init+0x5c/0x88
<
0000000000000000> 0x0
INFO: lockdep is turned off.
Last Breaking-Event-Address:
<
00000000003ca8f4> __free_slab+0x1c4/0x6b0
Kernel panic - not syncing: Fatal exception in interrupt
The kernel panics on an attempt to dereference the NULL memcg pointer.
When shutdown_cache() is called from the kmem_cache_destroy() context, a
memcg kmem_cache might have empty slab pages in a partial list, which are
still charged to the memory cgroup.
These pages are released by free_partial() at the beginning of
shutdown_cache(): either directly or by scheduling a RCU-delayed work
(if the kmem_cache has the SLAB_TYPESAFE_BY_RCU flag). The latter case
is when the reported panic can happen: memcg_unlink_cache() is called
immediately after shrinking partial lists, without waiting for scheduled
RCU works. It sets the kmem_cache->memcg_params.memcg pointer to NULL,
and the following attempt to dereference it by __free_slab() from the
RCU work context causes the panic.
To fix the issue, let's postpone the release of the memcg pointer to
destroy_memcg_params(). It's called from a separate work context by
slab_caches_to_rcu_destroy_workfn(), which contains a full RCU barrier.
This guarantees that all scheduled page release RCU works will complete
before the memcg pointer will be zeroed.
Big thanks for Karsten for the perfect report containing all necessary
information, his help with the analysis of the problem and testing of the
fix.
Link: http://lkml.kernel.org/r/20191010160549.1584316-1-guro@fb.com
Fixes:
fb2f2b0adb98 ("mm: memcg/slab: reparent memcg kmem_caches on cgroup removal")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Karsten Graul <kgraul@linux.ibm.com>
Tested-by: Karsten Graul <kgraul@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Karsten Graul <kgraul@linux.ibm.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aneesh Kumar K.V [Sat, 19 Oct 2019 03:19:39 +0000 (20:19 -0700)]
mm/memunmap: don't access uninitialized memmap in memunmap_pages()
Patch series "mm/memory_hotplug: Shrink zones before removing memory",
v6.
This series fixes the access of uninitialized memmaps when shrinking
zones/nodes and when removing memory. Also, it contains all fixes for
crashes that can be triggered when removing certain namespace using
memunmap_pages() - ZONE_DEVICE, reported by Aneesh.
We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be
more involved (we don't have SECTION_IS_ONLINE as an indicator), and
shrinking is only of limited use (set_zone_contiguous() cannot detect
the ZONE_DEVICE as contiguous).
We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount
of code to a minimum. Shrinking is especially necessary to keep
zone->contiguous set where possible, especially, on memory unplug of
DIMMs at zone boundaries.
--------------------------------------------------------------------------
Zones are now properly shrunk when offlining memory blocks or when
onlining failed. This allows to properly shrink zones on memory unplug
even if the separate memory blocks of a DIMM were onlined to different
zones or re-onlined to a different zone after offlining.
Example:
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 0
present 0
managed 0
:/# echo "online_movable" > /sys/devices/system/memory/memory41/state
:/# echo "online_movable" > /sys/devices/system/memory/memory43/state
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 98304
present 65536
managed 65536
:/# echo 0 > /sys/devices/system/memory/memory43/online
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 32768
present 32768
managed 32768
:/# echo 0 > /sys/devices/system/memory/memory41/online
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 0
present 0
managed 0
This patch (of 10):
With an altmap, the memmap falling into the reserved altmap space are not
initialized and, therefore, contain a garbage NID and a garbage zone.
Make sure to read the NID/zone from a memmap that was initialized.
This fixes a kernel crash that is observed when destroying a namespace:
kernel BUG at include/linux/mm.h:1107!
cpu 0x1: Vector: 700 (Program Check) at [
c000000274087890]
pc:
c0000000004b9728: memunmap_pages+0x238/0x340
lr:
c0000000004b9724: memunmap_pages+0x234/0x340
...
pid = 3669, comm = ndctl
kernel BUG at include/linux/mm.h:1107!
devm_action_release+0x30/0x50
release_nodes+0x268/0x2d0
device_release_driver_internal+0x174/0x240
unbind_store+0x13c/0x190
drv_attr_store+0x44/0x60
sysfs_kf_write+0x70/0xa0
kernfs_fop_write+0x1ac/0x290
__vfs_write+0x3c/0x70
vfs_write+0xe4/0x200
ksys_write+0x7c/0x140
system_call+0x5c/0x68
The "page_zone(pfn_to_page(pfn)" was introduced by
69324b8f4833 ("mm,
devm_memremap_pages: add MEMORY_DEVICE_PRIVATE support"), however, I
think we will never have driver reserved memory with
MEMORY_DEVICE_PRIVATE (no altmap AFAIKS).
[david@redhat.com: minimze code changes, rephrase description]
Link: http://lkml.kernel.org/r/20191006085646.5768-2-david@redhat.com
Fixes:
2c2a5af6fed2 ("mm, memory_hotplug: add nid parameter to arch_remove_memory")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Damian Tometzki <damian.tometzki@gmail.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pagupta@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Rich Felker <dalias@libc.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org> [5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Hildenbrand [Sat, 19 Oct 2019 03:19:33 +0000 (20:19 -0700)]
mm/memory_hotplug: don't access uninitialized memmaps in shrink_pgdat_span()
We might use the nid of memmaps that were never initialized. For
example, if the memmap was poisoned, we will crash the kernel in
pfn_to_nid() right now. Let's use the calculated boundaries of the
separate zones instead. This now also avoids having to iterate over a
whole bunch of subsections again, after shrinking one zone.
Before commit
d0dc12e86b31 ("mm/memory_hotplug: optimize memory
hotplug"), the memmap was initialized to 0 and the node was set to the
right value. After that commit, the node might be garbage.
We'll have to fix shrink_zone_span() next.
Link: http://lkml.kernel.org/r/20191006085646.5768-4-david@redhat.com
Fixes:
f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [
d0dc12e86b319]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Damian Tometzki <damian.tometzki@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pankaj Gupta <pagupta@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Rich Felker <dalias@libc.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qian Cai [Sat, 19 Oct 2019 03:19:29 +0000 (20:19 -0700)]
mm/page_owner: don't access uninitialized memmaps when reading /proc/pagetypeinfo
Uninitialized memmaps contain garbage and in the worst case trigger
kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get
touched.
For example, when not onlining a memory block that is spanned by a zone
and reading /proc/pagetypeinfo with CONFIG_DEBUG_VM_PGFLAGS and
CONFIG_PAGE_POISONING, we can trigger a kernel BUG:
:/# echo 1 > /sys/devices/system/memory/memory40/online
:/# echo 1 > /sys/devices/system/memory/memory42/online
:/# cat /proc/pagetypeinfo > test.file
page:
fffff2c585200000 is uninitialized and poisoned
raw:
ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
raw:
ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
There is not page extension available.
------------[ cut here ]------------
kernel BUG at include/linux/mm.h:1107!
invalid opcode: 0000 [#1] SMP NOPTI
Please note that this change does not affect ZONE_DEVICE, because
pagetypeinfo_showmixedcount_print() is called from
mm/vmstat.c:pagetypeinfo_showmixedcount() only for populated zones, and
ZONE_DEVICE is never populated (zone->present_pages always 0).
[david@redhat.com: move check to outer loop, add comment, rephrase description]
Link: http://lkml.kernel.org/r/20191011140638.8160-1-david@redhat.com
Fixes:
f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visible after
d0dc12e86b319
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joel Colledge [Sat, 19 Oct 2019 03:19:26 +0000 (20:19 -0700)]
scripts/gdb: fix lx-dmesg when CONFIG_PRINTK_CALLER is set
When CONFIG_PRINTK_CALLER is set, struct printk_log contains an
additional member caller_id. This affects the offset of the log text.
Account for this by using the type information from gdb to determine all
the offsets instead of using hardcoded values.
This fixes following error:
(gdb) lx-dmesg
Python Exception <class 'ValueError'> embedded null character:
Error occurred in Python command: embedded null character
The read_u* utility functions now take an offset argument to make them
easier to use.
Link: http://lkml.kernel.org/r/20191011142500.2339-1-joel.colledge@linbit.com
Signed-off-by: Joel Colledge <joel.colledge@linbit.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Hildenbrand [Sat, 19 Oct 2019 03:19:23 +0000 (20:19 -0700)]
mm/memory-failure.c: don't access uninitialized memmaps in memory_failure()
We should check for pfn_to_online_page() to not access uninitialized
memmaps. Reshuffle the code so we don't have to duplicate the error
message.
Link: http://lkml.kernel.org/r/20191009142435.3975-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Fixes:
f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after
d0dc12e86b319]
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Hildenbrand [Sat, 19 Oct 2019 03:19:20 +0000 (20:19 -0700)]
fs/proc/page.c: don't access uninitialized memmaps in fs/proc/page.c
There are three places where we access uninitialized memmaps, namely:
- /proc/kpagecount
- /proc/kpageflags
- /proc/kpagecgroup
We have initialized memmaps either when the section is online or when the
page was initialized to the ZONE_DEVICE. Uninitialized memmaps contain
garbage and in the worst case trigger kernel BUGs, especially with
CONFIG_PAGE_POISONING.
For example, not onlining a DIMM during boot and calling /proc/kpagecount
with CONFIG_PAGE_POISONING:
:/# cat /proc/kpagecount > tmp.test
BUG: unable to handle page fault for address:
fffffffffffffffe
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD
114616067 P4D
114616067 PUD
114618067 PMD 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 PID: 469 Comm: cat Not tainted 5.4.0-rc1-next-
20191004+ #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
RIP: 0010:kpagecount_read+0xce/0x1e0
Code: e8 09 83 e0 3f 48 0f a3 02 73 2d 4c 89 e7 48 c1 e7 06 48 03 3d ab 51 01 01 74 1d 48 8b 57 08 480
RSP: 0018:
ffffa14e409b7e78 EFLAGS:
00010202
RAX:
fffffffffffffffe RBX:
0000000000020000 RCX:
0000000000000000
RDX:
0000000000000001 RSI:
00007f76b5595000 RDI:
fffff35645000000
RBP:
00007f76b5595000 R08:
0000000000000001 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000000 R12:
0000000000140000
R13:
0000000000020000 R14:
00007f76b5595000 R15:
ffffa14e409b7f08
FS:
00007f76b577d580(0000) GS:
ffff8f41bd400000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
fffffffffffffffe CR3:
0000000078960000 CR4:
00000000000006f0
Call Trace:
proc_reg_read+0x3c/0x60
vfs_read+0xc5/0x180
ksys_read+0x68/0xe0
do_syscall_64+0x5c/0xa0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
For now, let's drop support for ZONE_DEVICE from the three pseudo files
in order to fix this. To distinguish offline memory (with garbage
memmap) from ZONE_DEVICE memory with properly initialized memmaps, we
would have to check get_dev_pagemap() and pfn_zone_device_reserved()
right now. The usage of both (especially, special casing devmem) is
frowned upon and needs to be reworked.
The fundamental issue we have is:
if (pfn_to_online_page(pfn)) {
/* memmap initialized */
} else if (pfn_valid(pfn)) {
/*
* ???
* a) offline memory. memmap garbage.
* b) devmem: memmap initialized to ZONE_DEVICE.
* c) devmem: reserved for driver. memmap garbage.
* (d) devmem: memmap currently initializing - garbage)
*/
}
We'll leave the pfn_zone_device_reserved() check in stable_page_flags()
in place as that function is also used from memory failure. We now no
longer dump information about pages that are not in use anymore -
offline.
Link: http://lkml.kernel.org/r/20191009142435.3975-2-david@redhat.com
Fixes:
f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after
d0dc12e86b319]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>
Cc: Pankaj gupta <pagupta@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Anthony Yznaga <anthony.yznaga@oracle.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Hildenbrand [Sat, 19 Oct 2019 03:19:16 +0000 (20:19 -0700)]
drivers/base/memory.c: don't access uninitialized memmaps in soft_offline_page_store()
Uninitialized memmaps contain garbage and in the worst case trigger kernel
BUGs, especially with CONFIG_PAGE_POISONING. They should not get touched.
Right now, when trying to soft-offline a PFN that resides on a memory
block that was never onlined, one gets a misleading error with
CONFIG_PAGE_POISONING:
:/# echo
5637144576 > /sys/devices/system/memory/soft_offline_page
[ 23.097167] soft offline: 0x150000 page already poisoned
But the actual result depends on the garbage in the memmap.
soft_offline_page() can only work with online pages, it returns -EIO in
case of ZONE_DEVICE. Make sure to only forward pages that are online
(iow, managed by the buddy) and, therefore, have an initialized memmap.
Add a check against pfn_to_online_page() and similarly return -EIO.
Link: http://lkml.kernel.org/r/20191010141200.8985-1-david@redhat.com
Fixes:
f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after
d0dc12e86b319]
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 19 Oct 2019 02:29:36 +0000 (22:29 -0400)]
Merge tag 'for-linus-2019-10-18' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe pull request from Keith that address deadlocks, double resets,
memory leaks, and other regression.
- Fixup elv_support_iosched() for bio based devices (Damien)
- Fixup for the ahci PCS quirk (Dan)
- Socket O_NONBLOCK handling fix for io_uring (me)
- Timeout sequence io_uring fixes (yangerkun)
- MD warning fix for parameter default_layout (Song)
- blkcg activation fixes (Tejun)
- blk-rq-qos node deletion fix (Tejun)
* tag 'for-linus-2019-10-18' of git://git.kernel.dk/linux-block:
nvme-pci: Set the prp2 correctly when using more than 4k page
io_uring: fix logic error in io_timeout
io_uring: fix up O_NONBLOCK handling for sockets
md/raid0: fix warning message for parameter default_layout
libata/ahci: Fix PCS quirk application
blk-rq-qos: fix first node deletion of rq_qos_del()
blkcg: Fix multiple bugs in blkcg_activate_policy()
io_uring: consider the overflow of sequence for timeout req
nvme-tcp: fix possible leakage during error flow
nvmet-loop: fix possible leakage during error flow
block: Fix elv_support_iosched()
nvme-tcp: Initialize sk->sk_ll_usec only with NET_RX_BUSY_POLL
nvme: Wait for reset state when required
nvme: Prevent resets during paused controller state
nvme: Restart request timers in resetting state
nvme: Remove ADMIN_ONLY state
nvme-pci: Free tagset if no IO queues
nvme: retain split access workaround for capability reads
nvme: fix possible deadlock when nvme_update_formats fails
Linus Torvalds [Sat, 19 Oct 2019 02:26:18 +0000 (22:26 -0400)]
Merge tag 'riscv/for-v5.4-rc4' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"Some RISC-V fixes:
- Fix the virtual memory layout so the fixaddr region doesn't overlap
with other regions. (This was originally intended to go in as part
of an earlier patch, but I inadvertently dropped it during a
rebase)
- Add the DT chosen/stdout-path property to the HiFive Unleashed DT
file. This is so "earlycon" can be specified with no arguments on
the kernel command line, and the correct UART will be automatically
selected.
And two cleanup patches:
- Simplify the code in our breakpoint trap handler.
- Drop a comment in our TLB flush code that has caused some
confusion"
* tag 'riscv/for-v5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: fix virtual address overlapped in FIXADDR_START and VMEMMAP_START
riscv: tlbflush: remove confusing comment on local_flush_tlb_all()
riscv: dts: HiFive Unleashed: add default chosen/stdout-path
riscv: remove the switch statement in do_trap_break()
Linus Torvalds [Fri, 18 Oct 2019 22:41:16 +0000 (18:41 -0400)]
filldir[64]: remove WARN_ON_ONCE() for bad directory entries
This was always meant to be a temporary thing, just for testing and to
see if it actually ever triggered.
The only thing that reported it was syzbot doing disk image fuzzing, and
then that warning is expected. So let's just remove it before -rc4,
because the extra sanity testing should probably go to -stable, but we
don't want the warning to do so.
Reported-by: syzbot+3031f712c7ad5dd4d926@syzkaller.appspotmail.com
Fixes:
8a23eb804ca4 ("Make filldir[64]() verify the directory entry filename is valid")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 18 Oct 2019 22:30:09 +0000 (18:30 -0400)]
Merge tag 'ceph-for-5.4-rc4' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A future-proofing decoding fix from Jeff intended for stable and a
patch for a mostly benign race from Dongsheng"
* tag 'ceph-for-5.4-rc4' of git://github.com/ceph/ceph-client:
rbd: cancel lock_dwork if the wait is interrupted
ceph: just skip unrecognized info in ceph_reply_info_extra
Linus Torvalds [Fri, 18 Oct 2019 22:26:07 +0000 (18:26 -0400)]
Merge tag 'for-5.4/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix DM snapshot deadlock that can occur due to COW throttling
preventing locks from being released.
- Fix DM cache's GFP_NOWAIT allocation failure error paths by switching
to GFP_NOIO.
- Make __hash_find() static in the DM clone target.
* tag 'for-5.4/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache: fix bugs when a GFP_NOWAIT allocation fails
dm snapshot: rework COW throttling to fix deadlock
dm snapshot: introduce account_start_copy() and account_end_copy()
dm clone: Make __hash_find static
Linus Torvalds [Fri, 18 Oct 2019 22:23:16 +0000 (18:23 -0400)]
Merge tag 'iommu-fixes-v5.4-rc3' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- Fixes for page-table issues on Mali GPUs
- Missing free in an error path for ARM-SMMU
- PASID decoding in the AMD IOMMU Event log code
- Another update for the locking fixes in the AMD IOMMU driver
- Reduce the calls to platform_get_irq() in the IPMMU-VMSA and Rockchip
IOMMUs to get rid of the warning message added to this function
recently
* tag 'iommu-fixes-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Check PM_LEVEL_SIZE() condition in locked section
iommu/amd: Fix incorrect PASID decoding from event log
iommu/ipmmu-vmsa: Only call platform_get_irq() when interrupt is mandatory
iommu/rockchip: Don't use platform_get_irq to implicitly count irqs
iommu/io-pgtable-arm: Support all Mali configurations
iommu/io-pgtable-arm: Correct Mali attributes
iommu/arm-smmu: Free context bitmap in the err path of arm_smmu_init_domain_context
Linus Torvalds [Fri, 18 Oct 2019 22:19:04 +0000 (18:19 -0400)]
Merge tag 'copy-struct-from-user-v5.4-rc4' of gitolite.pub/scm/linux/kernel/git/brauner/linux
Pull usercopy test fixlets from Christian Brauner:
"This contains two improvements for the copy_struct_from_user() tests:
- a coding style change to get rid of the ugly "if ((ret |= test()))"
pointed out when pulling the original patchset.
- avoid a soft lockups when running the usercopy tests on machines
with large page sizes by scanning only a 1024 byte region"
* tag 'copy-struct-from-user-v5.4-rc4' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
usercopy: Avoid soft lockups in test_check_nonzero_user()
lib: test_user_copy: style cleanup
Andrew Lunn [Thu, 17 Oct 2019 19:29:26 +0000 (21:29 +0200)]
net: usb: lan78xx: Connect PHY before registering MAC
As soon as the netdev is registers, the kernel can start using the
interface. If the driver connects the MAC to the PHY after the netdev
is registered, there is a race condition where the interface can be
opened without having the PHY connected.
Change the order to close this race condition.
Fixes:
92571a1aae40 ("lan78xx: Connect phy early")
Reported-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 18 Oct 2019 17:19:43 +0000 (10:19 -0700)]
Merge branch 'vsock-virtio-make-the-credit-mechanism-more-robust'
Stefano Garzarella says:
====================
vsock/virtio: make the credit mechanism more robust
This series makes the credit mechanism implemented in the
virtio-vsock devices more robust.
Patch 1 sends an update to the remote peer when the buf_alloc
change.
Patch 2 prevents a malicious peer (especially the guest) can
consume all the memory of the other peer, discarding packets
when the credit available is not respected.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Thu, 17 Oct 2019 12:44:03 +0000 (14:44 +0200)]
vsock/virtio: discard packets if credit is not respected
If the remote peer doesn't respect the credit information
(buf_alloc, fwd_cnt), sending more data than it can send,
we should drop the packets to prevent a malicious peer
from using all of our memory.
This is patch follows the VIRTIO spec: "VIRTIO_VSOCK_OP_RW data
packets MUST only be transmitted when the peer has sufficient
free buffer space for the payload"
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Thu, 17 Oct 2019 12:44:02 +0000 (14:44 +0200)]
vsock/virtio: send a credit update when buffer size is changed
When the user application set a new buffer size value, we should
update the remote peer about this change, since it uses this
information to calculate the credit available.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 17 Oct 2019 07:11:03 +0000 (10:11 +0300)]
mlxsw: spectrum_trap: Push Ethernet header before reporting trap
devlink maintains packets and bytes statistics for each trap. Since
eth_type_trans() was called to set the skb's protocol, the data pointer
no longer points to the start of the packet and the bytes accounting is
off by 14 bytes.
Fix this by pushing the skb's data pointer to the start of the packet.
Fixes:
b5ce611fd96e ("mlxsw: spectrum: Add devlink-trap support")
Reported-by: Alex Kushnarov <alexanderk@mellanox.com>
Tested-by: Alex Kushnarov <alexanderk@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>