linux-2.6-microblaze.git
Roman Gushchin [Fri, 7 Aug 2020 06:20:28 +0000 (23:20 -0700)]
mm: kmem: make memcg_kmem_enabled() irreversible

Historically, kernel memory accounting was an opt-in feature that could be
enabled for individual cgroups.  That is no longer the case: it is now on
by default on both cgroup v1 and cgroup v2, and as long as a user has
at least one non-root memory cgroup, kernel memory accounting is on.
So in most setups it is either always on (if memory cgroups are in use and
kmem accounting is not disabled) or always off (otherwise).

memcg_kmem_enabled() is used in many places to guard the kernel memory
accounting code.  If memcg_kmem_enabled() can reverse from returning true
to returning false (as now), we can't rely on it on release paths and have
to check if it was on before.

If we make memcg_kmem_enabled() irreversible (always returning true
after returning true for the first time), the general logic becomes
simpler and more robust.  It also allows us to guard some checks which
otherwise would stay unguarded.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Link: http://lkml.kernel.org/r/20200702180926.1330769-1-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Down [Fri, 7 Aug 2020 06:20:25 +0000 (23:20 -0700)]
tmpfs: support 64-bit inums per-sb

The default is still set to inode32 for backwards compatibility, but
system administrators can opt in to the new 64-bit inode numbers by
either:

1. Passing inode64 on the command line when mounting, or
2. Configuring the kernel with CONFIG_TMPFS_INODE64=y

The inode64 and inode32 names are used based on existing precedent from
XFS.

[hughd@google.com: Kconfig fixes]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008011928010.13320@eggly.anvils
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/8b23758d0c66b5e2263e08baf9c4b6a7565cbd8f.1594661218.git.chris@chrisdown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Down [Fri, 7 Aug 2020 06:20:20 +0000 (23:20 -0700)]
tmpfs: per-superblock i_ino support

Patch series "tmpfs: inode: Reduce risk of inum overflow", v7.

In Facebook production we are seeing heavy i_ino wraparounds on tmpfs.  On
affected tiers, in excess of 10% of hosts show multiple files with
different content and the same inode number, with some servers even having
as many as 150 duplicated inode numbers with differing file content.

This causes actual, tangible problems in production.  For example, we have
complaints from those working on remote caches that their application is
reporting cache corruptions because it uses (device, inodenum) to
establish the identity of a particular cache object, but because it's not
unique any more, the application refuses to continue and reports cache
corruption.  Even worse, sometimes applications may not even detect the
corruption but may continue anyway, causing phantom and hard-to-debug
behaviour.

In general, userspace applications expect that (device, inodenum) should
be enough to uniquely identify one inode, which seems fair enough.  One
might also need to check the generation, but in this case:

1. That's not currently exposed to userspace
   (ioctl(...FS_IOC_GETVERSION...) returns ENOTTY on tmpfs);
2. Even with generation, there shouldn't be two live inodes with the
   same inode number on one device.
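
As an illustration of the (device, inodenum) expectation, here is a
minimal userspace sketch of the identity check such applications tend to
perform.  The helper names same_file_identity() and paths_are_same_file()
are hypothetical; only stat(2) and the st_dev/st_ino fields are real.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Two stat results name the same file only if both fields match. */
bool same_file_identity(const struct stat *a, const struct stat *b)
{
	assert(a != NULL && b != NULL);
	return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/*
 * Returns 1 if both paths resolve to the same (device, inode) pair,
 * 0 if they do not, and -1 if either path cannot be stat()ed.
 */
int paths_are_same_file(const char *p1, const char *p2)
{
	struct stat s1, s2;

	if (stat(p1, &s1) != 0 || stat(p2, &s2) != 0)
		return -1;
	return same_file_identity(&s1, &s2) ? 1 : 0;
}
```

When inode numbers wrap around on one device, this check reports two
files with different content as identical, which is the failure mode
described above.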

In order to mitigate this, we take a two-pronged approach:

1. Moving inum generation from being global to per-sb for tmpfs. This
   itself allows some reduction in i_ino churn. This works on both 64-
   and 32-bit machines.
2. Adding inode{64,32} for tmpfs. This fix is supported on machines with
   64-bit ino_t only: we allow users to mount tmpfs with a new inode64
   option that uses the full width of ino_t, or CONFIG_TMPFS_INODE64.

You can see how this compares to previous related patches which didn't
implement this per-superblock:

- https://patchwork.kernel.org/patch/11254001/
- https://patchwork.kernel.org/patch/11023915/

This patch (of 2):

get_next_ino has a number of problems:

- It uses and returns a uint, which is susceptible to overflow if a lot
  of volatile inodes that use get_next_ino are created.
- It's global, with no specificity per-sb or even per-filesystem. This
  means it's not that difficult to cause inode number wraparounds on a
  single device, which can result in having multiple distinct inodes
  with the same inode number.
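
The first problem can be shown with a small userspace sketch.  The
counter types and the ino32_next()/ino64_next() helpers below are
hypothetical stand-ins, not the kernel's get_next_ino(); they only show
how a 32-bit counter wraps while a 64-bit one keeps counting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit and 64-bit inode number counters. */
struct ino_counter32 { uint32_t next; };
struct ino_counter64 { uint64_t next; };

/* Hand out the next number; unsigned arithmetic wraps modulo 2^32. */
uint32_t ino32_next(struct ino_counter32 *c)
{
	assert(c != NULL);
	return c->next++;
}

/* Same scheme with the full 64-bit width: no wrap in practice. */
uint64_t ino64_next(struct ino_counter64 *c)
{
	assert(c != NULL);
	return c->next++;
}
```

Starting both counters at UINT32_MAX, the 32-bit one returns UINT32_MAX
and then 0, so a long-lived inode that already holds a low number can
collide with a new one, while the 64-bit counter simply continues.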

This patch adds a per-superblock counter that mitigates the second case.
This design also allows us to later have a specific i_ino size per-device,
for example, allowing users to choose whether to use 32- or 64-bit inodes
for each tmpfs mount.  This is implemented in the next commit.

For internal shmem mounts which may be less tolerant to spinlock delays,
we implement a percpu batching scheme which only takes the stat_lock at
each batch boundary.
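
The batching idea can be sketched in userspace as follows.  This is an
illustration under stated assumptions, not the shmem code: INO_BATCH,
struct sb_ino_state, struct cpu_ino_cursor and batched_next_ino() are
hypothetical, a pthread mutex stands in for stat_lock, and a per-caller
cursor stands in for per-CPU data.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define INO_BATCH 4 /* hypothetical batch size, chosen for illustration */

/* Shared per-"superblock" state, protected by lock. */
struct sb_ino_state {
	pthread_mutex_t lock;
	uint64_t next_ino; /* start of the next unreserved batch */
};

/* Per-CPU (here: per-caller) cursor into the currently reserved batch. */
struct cpu_ino_cursor {
	uint64_t next;
	unsigned int remaining;
};

uint64_t batched_next_ino(struct sb_ino_state *sb, struct cpu_ino_cursor *cur)
{
	assert(sb != NULL && cur != NULL);
	if (cur->remaining == 0) {
		/* Batch boundary: the only place the shared lock is taken. */
		pthread_mutex_lock(&sb->lock);
		cur->next = sb->next_ino;
		sb->next_ino += INO_BATCH;
		pthread_mutex_unlock(&sb->lock);
		cur->remaining = INO_BATCH;
	}
	cur->remaining--;
	return cur->next++;
}
```

Each cursor takes the lock once per INO_BATCH allocations, at the cost
of inode numbers no longer being handed out in strictly global order.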

Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/cover.1594661218.git.chris@chrisdown.name
Link: http://lkml.kernel.org/r/1986b9d63b986f08ec07a4aa4b2275e718e47d8a.1594661218.git.chris@chrisdown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xianting Tian [Fri, 7 Aug 2020 06:20:17 +0000 (23:20 -0700)]
mm/page_io.c: use blk_io_schedule() for avoiding task hung in sync io

swap_readpage() does sync io for a single page.  The io is small and
normally finishes quickly, but it may take a long time or wait forever
in case of io failure or discard.

This patch uses blk_io_schedule() instead of io_schedule() to avoid a hung
task report and a crash (when /proc/sys/kernel/hung_task_panic is set) when
the above exception occurs.

This is similar to the hung task avoidance in submit_bio_wait(),
blk_execute_rq() and __blkdev_direct_IO().

Signed-off-by: Xianting Tian <xianting_tian@126.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/1596461807-21087-1-git-send-email-xianting_tian@126.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Kozlowski [Fri, 7 Aug 2020 06:20:14 +0000 (23:20 -0700)]
mm: swap: fix kerneldoc of swap_vma_readahead()

Fix W=1 compile warnings (invalid kerneldoc):

    mm/swap_state.c:742: warning: Function parameter or member 'fentry' not described in 'swap_vma_readahead'
    mm/swap_state.c:742: warning: Excess function parameter 'entry' description in 'swap_vma_readahead'

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200728171109.28687-2-krzk@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zhen Lei [Fri, 7 Aug 2020 06:20:11 +0000 (23:20 -0700)]
mm/swap_slots.c: remove redundant check for swap_slot_cache_initialized

swap_slot_cache_enabled can only become true in
enable_swap_slots_cache(), and only once swap_slot_cache_initialized is
already true.  That means that whenever swap_slot_cache_enabled is true,
swap_slot_cache_initialized is also true.

So the condition:
"swap_slot_cache_enabled && swap_slot_cache_initialized"
can be reduced to "swap_slot_cache_enabled"

And by De Morgan's law:
"!swap_slot_cache_enabled || !swap_slot_cache_initialized"
is equal to "!(swap_slot_cache_enabled && swap_slot_cache_initialized)"

So no functional change.
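
The equivalence argued above can be checked exhaustively.  The sketch
below is a standalone userspace truth-table check, not code from
mm/swap_slots.c; the function names are hypothetical and the two flags
are passed as plain parameters.

```c
#include <assert.h>
#include <stdbool.h>

/* The disjunction quoted in the commit message. */
bool old_check(bool enabled, bool initialized)
{
	return !enabled || !initialized;
}

/* De Morgan's law rewrites it as a negated conjunction. */
bool demorgan_check(bool enabled, bool initialized)
{
	return !(enabled && initialized);
}

/* Valid only under the invariant "enabled implies initialized". */
bool reduced_check(bool enabled)
{
	return !enabled;
}

/*
 * Count the inputs allowed by the invariant for which the reduced form
 * disagrees with the original one.
 */
int count_mismatches(void)
{
	int mismatches = 0;

	for (int e = 0; e <= 1; e++) {
		for (int i = 0; i <= 1; i++) {
			assert(old_check(e, i) == demorgan_check(e, i));
			if (e && !i)
				continue; /* excluded by the invariant */
			if (old_check(e, i) != reduced_check(e))
				mismatches++;
		}
	}
	return mismatches;
}
```

The only input on which the reduced form differs is enabled without
initialized, which is exactly the state the invariant rules out.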

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/20200430061143.450-4-thunder.leizhen@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zhen Lei [Fri, 7 Aug 2020 06:20:08 +0000 (23:20 -0700)]
mm/swap_slots.c: simplify enable_swap_slots_cache()

__reenable_swap_slots_cache() is always called, whether
swap_slot_cache_initialized is true or false.  To make this clear, leave
only one call to __reenable_swap_slots_cache().  This also makes it
clearer what extra work needs to be done when swap_slot_cache_initialized
is false.

No functional change.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/20200430061143.450-3-thunder.leizhen@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zhen Lei [Fri, 7 Aug 2020 06:20:05 +0000 (23:20 -0700)]
mm/swap_slots.c: simplify alloc_swap_slot_cache()

Patch series "clean up some functions in mm/swap_slots.c".

While studying the code of mm/swap_slots.c, I found some places that can
be improved.

This patch (of 3):

Both "slots" and "slots_ret" only need to be freed when the cache is
already allocated.  Move the two frees closer together, which makes this
clearer.

No functional change.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/20200430061143.450-1-thunder.leizhen@huawei.com
Link: http://lkml.kernel.org/r/20200430061143.450-2-thunder.leizhen@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tang Yizhou [Fri, 7 Aug 2020 06:20:01 +0000 (23:20 -0700)]
mm/gup.c: fix the comment of return value for populate_vma_page_range()

The return value of populate_vma_page_range() is consistent with that of
__get_user_pages(), so the comment describing its return value should be
consistent as well.

Signed-off-by: Tang Yizhou <tangyizhou@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: http://lkml.kernel.org/r/20200720034303.29920-1-tangyizhou@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Shi [Fri, 7 Aug 2020 06:19:58 +0000 (23:19 -0700)]
mm: filemap: add missing FGP_ flags in kerneldoc comment for pagecache_get_page

FGP_{WRITE|NOFS|NOWAIT} were missing from pagecache_get_page's kerneldoc
comment.

Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Gang Deng <gavin.dg@linux.alibaba.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Link: http://lkml.kernel.org/r/1593031747-4249-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Shi [Fri, 7 Aug 2020 06:19:55 +0000 (23:19 -0700)]
mm: filemap: clear idle flag for writes

Since commit bbddabe2e436aa ("mm: filemap: only do access activations on
reads"), mark_page_accessed() is called for reads only.  But the idle flag
is cleared by mark_page_accessed(), so the idle flag won't get cleared if
the page is only accessed by writes.

Idle page tracking is used to estimate the workingset size of a workload,
so a noticeable part of the workingset might be missed if the idle flag
is not maintained correctly.

It seems good enough to just clear idle flag for write operations.

Fixes: bbddabe2e436 ("mm: filemap: only do access activations on reads")
Reported-by: Gang Deng <gavin.dg@linux.alibaba.com>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Link: http://lkml.kernel.org/r/1593020612-13051-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Hubbard [Fri, 7 Aug 2020 06:19:51 +0000 (23:19 -0700)]
mm, dump_page: do not crash with bad compound_mapcount()

If a compound page is being split while dump_page() is being run on that
page, we can end up calling compound_mapcount() on a page that is no
longer compound.  This leads to a crash (already seen at least once in the
field), due to the VM_BUG_ON_PAGE() assertion inside compound_mapcount().

(The above is from Matthew Wilcox's analysis of Qian Cai's bug report.)

A similar problem is possible, via compound_pincount() instead of
compound_mapcount().

In order to avoid this kind of crash, make dump_page() slightly more
robust, by providing a pair of simpler routines that don't contain
assertions: head_mapcount() and head_pincount().

For debug tools, we don't want to go *too* far in this direction, but this
is a simple small fix, and the crash has already been seen, so it's a good
trade-off.
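
To illustrate the difference between an asserting accessor and an
assertion-free one, here is a heavily simplified userspace sketch.
struct mock_page, strict_mapcount() and relaxed_mapcount() are
hypothetical; they are not the kernel's struct page, compound_mapcount()
or head_mapcount().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct page. */
struct mock_page {
	bool compound; /* may flip to false if the page is split concurrently */
	int mapcount;
};

/* Strict accessor: aborts if the page is not (or no longer) compound. */
int strict_mapcount(const struct mock_page *page)
{
	assert(page != NULL && page->compound);
	return page->mapcount;
}

/* Relaxed accessor for debug output: never asserts, just reads. */
int relaxed_mapcount(const struct mock_page *page)
{
	return page ? page->mapcount : 0;
}
```

A dump routine built on the relaxed accessor may print a stale value
during a concurrent split, but it no longer turns a diagnostic print
into a crash.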

Reported-by: Qian Cai <cai@lca.pw>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Link: http://lkml.kernel.org/r/20200804214807.169256-1-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox (Oracle) [Fri, 7 Aug 2020 06:19:48 +0000 (23:19 -0700)]
mm/debug: print hashed address of struct page

The actual address of the struct page isn't particularly helpful, while
the hashed address helps match with other messages elsewhere.  Add the PFN
that the page refers to in order to help diagnose problems where the page
is improperly aligned for the purpose.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Link: http://lkml.kernel.org/r/20200709202117.7216-7-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox (Oracle) [Fri, 7 Aug 2020 06:19:45 +0000 (23:19 -0700)]
mm/debug: print the inode number in dump_page

The inode number helps correlate this page with debug messages elsewhere
in the kernel.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Link: http://lkml.kernel.org/r/20200709202117.7216-6-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox (Oracle) [Fri, 7 Aug 2020 06:19:42 +0000 (23:19 -0700)]
mm/debug: switch dump_page to get_kernel_nofault

This is simpler to use than copy_from_kernel_nofault().  Also make some of
the related error messages less verbose.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: William Kucharski <william.kucharski@oracle.com>
Link: http://lkml.kernel.org/r/20200709202117.7216-5-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox (Oracle) [Fri, 7 Aug 2020 06:19:39 +0000 (23:19 -0700)]
mm/debug: print head flags in dump_page

Tail page flags contain very little useful information.  Print the head
page's flags instead.  While the flags will contain "head" for tail pages,
this should not be too confusing as the previous line starts with the word
"head:" and so the flags should be interpreted as belonging to the head
page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: William Kucharski <william.kucharski@oracle.com>
Link: http://lkml.kernel.org/r/20200709202117.7216-4-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox (Oracle) [Fri, 7 Aug 2020 06:19:35 +0000 (23:19 -0700)]
mm/debug: dump compound page information on a second line

Simplify both the implementation and the output by splitting all the
compound page information onto a second line.

Reported-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: William Kucharski <william.kucharski@oracle.com>
Link: http://lkml.kernel.org/r/20200709202117.7216-3-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox (Oracle) [Fri, 7 Aug 2020 06:19:32 +0000 (23:19 -0700)]
mm/debug: handle page->mapping better in dump_page

Patch series "Improvements for dump_page()", v2.

Here's a sample dump of a pagecache tail page with all of the patches
applied:

page:000000006d1c49ca refcount:6 mapcount:0 mapping:00000000136b8d90 index:0x109 pfn:0x6c645
head:000000008bd38076 order:2 compound_mapcount:0 compound_pincount:0
aops:xfs_address_space_operations ino:800042 dentry name:"fd"
flags: 0x4000000000012014(uptodate|lru|private|head)
raw: 4000000000000000 ffffd46ac1b19101 ffffffff00000202 dead000000000004
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
head: 4000000000012014 ffffd46ac1b1bbc8 ffffd46ac1b1bc08 ffff91976f659560
head: 0000000000000108 ffff919773220680 00000006ffffffff 0000000000000000
page dumped because: testing

This patch (of 6):

If we can't call page_mapping() to get the page mapping, handle the
anon/ksm/movable bits correctly.

[akpm@linux-foundation.org: augmented code comment from John]
Link: http://lkml.kernel.org/r/15cff11a-6762-8a6a-3f0e-dd227280cd6f@nvidia.com
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Link: http://lkml.kernel.org/r/20200709202117.7216-1-willy@infradead.org
Link: http://lkml.kernel.org/r/20200709202117.7216-2-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 7 Aug 2020 06:19:28 +0000 (23:19 -0700)]
Documentation/mm: add descriptions for arch page table helpers

This adds a specific description file for all arch page table helpers which
is in sync with the semantics being tested via CONFIG_DEBUG_VM_PGTABLE. All
future changes either to these descriptions here or the debug test should
always remain in sync.

[anshuman.khandual@arm.com: fold in Mike's patch for the rst document, fix typos in the rst document]
Link: http://lkml.kernel.org/r/1594610587-4172-5-git-send-email-anshuman.khandual@arm.com
Suggested-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Zi Yan <ziy@nvidia.com>
Link: http://lkml.kernel.org/r/1593996516-7186-5-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 7 Aug 2020 06:19:25 +0000 (23:19 -0700)]
mm/debug_vm_pgtable: add debug prints for individual tests

This adds debug prints that list all tests being executed on a given
platform.  With dynamic debug enabled, the following information will be
printed during boot.  For compactness, both the time stamp and the prefix
(i.e. debug_vm_pgtable) have been dropped from this sample output.

[debug_vm_pgtable      ]: Validating architecture page table helpers
[pte_basic_tests       ]: Validating PTE basic
[pmd_basic_tests       ]: Validating PMD basic
[p4d_basic_tests       ]: Validating P4D basic
[pgd_basic_tests       ]: Validating PGD basic
[pte_clear_tests       ]: Validating PTE clear
[pmd_clear_tests       ]: Validating PMD clear
[pte_advanced_tests    ]: Validating PTE advanced
[pmd_advanced_tests    ]: Validating PMD advanced
[hugetlb_advanced_tests]: Validating HugeTLB advanced
[pmd_leaf_tests        ]: Validating PMD leaf
[pmd_huge_tests        ]: Validating PMD huge
[pte_savedwrite_tests  ]: Validating PTE saved write
[pmd_savedwrite_tests  ]: Validating PMD saved write
[pmd_populate_tests    ]: Validating PMD populate
[pte_special_tests     ]: Validating PTE special
[pte_protnone_tests    ]: Validating PTE protnone
[pmd_protnone_tests    ]: Validating PMD protnone
[pte_devmap_tests      ]: Validating PTE devmap
[pmd_devmap_tests      ]: Validating PMD devmap
[pte_swap_tests        ]: Validating PTE swap
[swap_migration_tests  ]: Validating swap migration
[hugetlb_basic_tests   ]: Validating HugeTLB basic
[pmd_thp_tests         ]: Validating PMD based THP
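
The layout of the sample lines above can be reproduced with a small
userspace sketch.  format_test_banner() and TEST_BANNER() are
hypothetical helpers, and the pad width of 22 is an assumption read off
the sample output; the kernel's actual print format is not shown in this
log.

```c
#include <assert.h>
#include <stdio.h>

/* Format one line as "[<function name, padded to 22>]: <message>". */
int format_test_banner(char *buf, size_t len, const char *func,
		       const char *msg)
{
	assert(buf != NULL && func != NULL && msg != NULL);
	return snprintf(buf, len, "[%-22s]: %s", func, msg);
}

/* Convenience macro that fills in the calling function's name. */
#define TEST_BANNER(buf, len, msg) \
	format_test_banner((buf), (len), __func__, (msg))
```

Called from a function named pte_basic_tests with the message
"Validating PTE basic", TEST_BANNER() produces a line in the same shape
as the second sample line.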

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Link: http://lkml.kernel.org/r/1593996516-7186-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 7 Aug 2020 06:19:20 +0000 (23:19 -0700)]
mm/debug_vm_pgtable: add tests validating advanced arch page table helpers

This adds new tests validating the following advanced arch page table
helpers.  These tests create and test specific mapping types at various
page table levels.

1. pxxp_set_wrprotect()
2. pxxp_get_and_clear()
3. pxxp_set_access_flags()
4. pxxp_get_and_clear_full()
5. pxxp_test_and_clear_young()
6. pxx_leaf()
7. pxx_set_huge()
8. pxx_(clear|mk)_savedwrite()
9. huge_pxxp_xxx()

[anshuman.khandual@arm.com: drop RANDOM_ORVALUE from hugetlb_advanced_tests()]
Link: http://lkml.kernel.org/r/1594610587-4172-3-git-send-email-anshuman.khandual@arm.com
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Link: http://lkml.kernel.org/r/1593996516-7186-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 7 Aug 2020 06:19:16 +0000 (23:19 -0700)]
mm/debug_vm_pgtable: add tests validating arch helpers for core MM features

Patch series "mm/debug_vm_pgtable: Add some more tests", v5.

This series adds some more arch page table helper validation tests which
are related to core and advanced memory functions.  It also adds
documentation listing the expected semantics for all page table helpers,
as suggested by Mike Rapoport previously
(https://lkml.org/lkml/2020/1/30/40).

There are many TRANSPARENT_HUGEPAGE and ARCH_HAS_TRANSPARENT_HUGEPAGE_PUD
ifdefs scattered across the test.  But consolidating all the fallback
stubs is not very straightforward because
ARCH_HAS_TRANSPARENT_HUGEPAGE_PUD is not explicitly dependent on
ARCH_HAS_TRANSPARENT_HUGEPAGE.

Tested on arm64 and x86 platforms, but only build tested on all other
platforms enabled through ARCH_HAS_DEBUG_VM_PGTABLE, i.e. powerpc, arc
and s390.  The following failure on arm64, mentioned previously, still
exists.  It will be fixed by the upcoming series enabling THP migration
on arm64.

WARNING .... mm/debug_vm_pgtable.c:860 debug_vm_pgtable+0x940/0xa54
WARN_ON(!pmd_present(pmd_mkinvalid(pmd_mkhuge(pmd))))

This patch (of 4):

This adds new tests validating arch page table helpers for the following
core memory features.  These tests create and test specific mapping types
at various page table levels.

1. SPECIAL mapping
2. PROTNONE mapping
3. DEVMAP mapping
4. SOFTDIRTY mapping
5. SWAP mapping
6. MIGRATION mapping
7. HUGETLB mapping
8. THP mapping

Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Link: http://lkml.kernel.org/r/1594610587-4172-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1593996516-7186-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1593996516-7186-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marco Elver [Fri, 7 Aug 2020 06:19:12 +0000 (23:19 -0700)]
mm, kcsan: instrument SLAB/SLUB free with "ASSERT_EXCLUSIVE_ACCESS"

Provide the necessary KCSAN checks to assist with debugging racy
use-after-frees.  While KASAN is more reliable at generally catching such
use-after-frees (due to its use of a quarantine), it can be difficult to
debug racy use-after-frees.  If a reliable reproducer exists, KCSAN can
assist in debugging such issues.

Note: ASSERT_EXCLUSIVE_ACCESS is a convenience wrapper if the size is
simply sizeof(var).  Instead, here we just use __kcsan_check_access()
explicitly to pass the correct size.
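
The point about sizes can be illustrated in userspace.  The sketch below
does not use KCSAN; check_access(), CHECK_VAR() and last_check are
hypothetical stand-ins that only record which range a check would cover,
to show why a sizeof(var)-based wrapper is the wrong tool when all you
have is a pointer to the object being freed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of the most recently checked range. */
struct checked_range {
	const void *ptr;
	size_t size;
};

struct checked_range last_check;

/* Explicit-size check: the caller states how many bytes are covered. */
void check_access(const void *ptr, size_t size)
{
	assert(ptr != NULL);
	last_check.ptr = ptr;
	last_check.size = size;
}

/* Wrapper in the style of a sizeof(var)-based convenience macro. */
#define CHECK_VAR(var) check_access(&(var), sizeof(var))
```

Given a void pointer objp to a 128-byte object, CHECK_VAR(objp) covers
only the pointer variable itself (sizeof(void *) bytes), while
check_access(objp, 128) covers the whole object.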

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200623072653.114563-1-elver@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sebastian Andrzej Siewior [Fri, 7 Aug 2020 06:19:09 +0000 (23:19 -0700)]
mm/slub.c: drop lockdep_assert_held() from put_map()

There is no point in using lockdep_assert_held() on a lock that is about to
be unlocked.  It works only with lockdep, and lockdep will complain if
spin_unlock() is used on a lock that has not been locked.

Remove superfluous lockdep_assert_held().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200618201234.795692-2-bigeasy@linutronix.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slab/slub: improve error reporting and overhead of cache_from_obj()
Vlastimil Babka [Fri, 7 Aug 2020 06:19:05 +0000 (23:19 -0700)]
mm, slab/slub: improve error reporting and overhead of cache_from_obj()

cache_from_obj() was added by commit b9ce5ef49f00 ("sl[au]b: always get
the cache from its page in kmem_cache_free()") to support kmemcg, where
per-memcg cache can be different from the root one, so we can't use the
kmem_cache pointer given to kmem_cache_free().

Prior to that commit, SLUB already had debugging check+warning that could
be enabled to compare the given kmem_cache pointer to one referenced by
the slab page where the object-to-be-freed resides.  This check was moved
to cache_from_obj().  Later the check was also enabled for
SLAB_FREELIST_HARDENED configs by commit 598a0717a816 ("mm/slab: validate
cache membership under freelist hardening").

These checks and warnings can be useful especially for the debugging,
which can be improved.  Commit 598a0717a816 changed the pr_err() with
WARN_ON_ONCE() to WARN_ONCE() so only the first hit is now reported,
others are silent.  This patch changes it to WARN() so that all errors are
reported.

It's also useful to print SLUB allocation/free tracking info for the
offending object, if tracking is enabled.  Thus, export the SLUB
print_tracking() function and provide an empty one for SLAB.

For SLUB we can also benefit from the static key check in
kmem_cache_debug_flags(), but we need to move this function to slab.h and
declare the static key there.

[1] https://lore.kernel.org/r/20200608230654.828134-18-guro@fb.com

[vbabka@suse.cz: avoid bogus WARN()]
Link: https://lore.kernel.org/r/20200623090213.GW5535@shao2-debian
Link: http://lkml.kernel.org/r/b33e0fa7-cd28-4788-9e54-5927846329ef@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Garrett <mjg59@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Link: http://lkml.kernel.org/r/afeda7ac-748b-33d8-a905-56b708148ad5@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slab/slub: move and improve cache_from_obj()
Vlastimil Babka [Fri, 7 Aug 2020 06:19:01 +0000 (23:19 -0700)]
mm, slab/slub: move and improve cache_from_obj()

The function cache_from_obj() was added by commit b9ce5ef49f00 ("sl[au]b:
always get the cache from its page in kmem_cache_free()") to support
kmemcg, where per-memcg cache can be different from the root one, so we
can't use the kmem_cache pointer given to kmem_cache_free().

Prior to that commit, SLUB already had debugging check+warning that could
be enabled to compare the given kmem_cache pointer to one referenced by
the slab page where the object-to-be-freed resides.  This check was moved
to cache_from_obj().  Later the check was also enabled for
SLAB_FREELIST_HARDENED configs by commit 598a0717a816 ("mm/slab: validate
cache membership under freelist hardening").

These checks and warnings can be useful especially for the debugging,
which can be improved.  Commit 598a0717a816 changed the pr_err() with
WARN_ON_ONCE() to WARN_ONCE() so only the first hit is now reported,
others are silent.  This patch changes it to WARN() so that all errors are
reported.

It's also useful to print SLUB allocation/free tracking info for the
offending object, if tracking is enabled.  We could export the SLUB
print_tracking() function and provide an empty one for SLAB, or realize
that both the debugging and hardening cases in cache_from_obj() are only
supported by SLUB anyway.  So this patch moves cache_from_obj() from
slab.h to separate instances in slab.c and slub.c, where the SLAB version
only does the kmemcg lookup and even could be completely removed once the
kmemcg rework [1] is merged.  The SLUB version can thus easily use the
print_tracking() function.  It can also use the kmem_cache_debug_flags()
static key check for improved performance in kernels without the hardening
and with debugging not enabled on boot.

[1] https://lore.kernel.org/r/20200608230654.828134-18-guro@fb.com

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-10-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slub: extend checks guarded by slub_debug static key
Vlastimil Babka [Fri, 7 Aug 2020 06:18:58 +0000 (23:18 -0700)]
mm, slub: extend checks guarded by slub_debug static key

There are a few more places in SLUB that could benefit from the reduced
overhead of the static key introduced by a previous patch:

- setup_object_debug() called on each object in newly allocated slab page
- setup_page_debug() called on newly allocated slab page
- __free_slab() called on freed slab page

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-9-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slub: introduce kmem_cache_debug_flags()
Vlastimil Babka [Fri, 7 Aug 2020 06:18:55 +0000 (23:18 -0700)]
mm, slub: introduce kmem_cache_debug_flags()

There are a few places that call kmem_cache_debug(s) (which tests if any of
debug flags are enabled for a cache) immediately followed by a test for a
specific flag.  The compiler can probably eliminate the extra check, but
we can make the code nicer by introducing kmem_cache_debug_flags() that
works like kmem_cache_debug() (including the static key check) but tests
for specific flag(s).  The next patches will add more users.
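
As an illustration only, the helper described above can be sketched in
user-space C.  This is not the kernel code: the flag values, the
demo_cache structure and the plain bool standing in for the static key
are all assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; the kernel's differ. */
#define DEMO_RED_ZONE   0x1u
#define DEMO_POISON     0x2u
#define DEMO_STORE_USER 0x4u

/* Stand-in for the static key: set once any cache has debugging on. */
static bool demo_debug_enabled;

/* Hypothetical, minimal stand-in for struct kmem_cache. */
struct demo_cache {
	unsigned int flags;
};

/*
 * Return true if any of the requested debug flags is set for the cache.
 * The global check comes first, so the per-cache flags are not even read
 * while debugging is disabled everywhere.
 */
static bool demo_cache_debug_flags(const struct demo_cache *s,
				   unsigned int flags)
{
	if (!demo_debug_enabled)
		return false;
	return (s->flags & flags) != 0;
}
```

A caller that previously tested "any debug flag" and then a specific flag
would make a single call such as demo_cache_debug_flags(s, DEMO_POISON).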

[vbabka@suse.cz: change return from int to bool, per Kees.  Add VM_WARN_ON_ONCE() for invalid flags, per Roman]
Link: http://lkml.kernel.org/r/949b90ed-e0f0-07d7-4d21-e30ec0958a7c@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-8-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slub: introduce static key for slub_debug()
Vlastimil Babka [Fri, 7 Aug 2020 06:18:51 +0000 (23:18 -0700)]
mm, slub: introduce static key for slub_debug()

One advantage of CONFIG_SLUB_DEBUG is that a generic distro kernel can be
built with the option enabled, but it's inactive until simply enabled on
boot, without rebuilding the kernel.  With a static key, we can further
eliminate the overhead of checking whether a cache has a particular debug
flag enabled if we know that there are no such caches (slub_debug was not
enabled during boot).  We use the same mechanism also for e.g.
page_owner, debug_pagealloc or kmemcg functionality.

This patch introduces the static key and makes the general check for
per-cache debug flags kmem_cache_debug() use it.  This benefits several
call sites, including (slow path but still rather frequent) __slab_free().
The next patches will add more uses.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-7-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slub: make reclaim_account attribute read-only
Vlastimil Babka [Fri, 7 Aug 2020 06:18:48 +0000 (23:18 -0700)]
mm, slub: make reclaim_account attribute read-only

The attribute reflects the SLAB_RECLAIM_ACCOUNT cache flag.  It's not
clear why this attribute was writable in the first place, as it's tied to
how the cache is used by its creator, it's not a user tunable.
Furthermore:

- it affects slab merging, but that's not being checked while toggled
- it affects whether __GFP_RECLAIMABLE flag is used to allocate page, but
  the runtime toggle doesn't update allocflags
- it affects cache_vmstat_idx() so runtime toggling might lead to inconsistency
  of NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE

Thus make it read-only.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-6-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slub: make remaining slub_debug related attributes read-only
Vlastimil Babka [Fri, 7 Aug 2020 06:18:45 +0000 (23:18 -0700)]
mm, slub: make remaining slub_debug related attributes read-only

SLUB_DEBUG creates several files under /sys/kernel/slab/<cache>/ that can
be read to check if the respective debugging options are enabled for given
cache.  Some options, namely sanity_checks, trace, and failslab can be
also enabled and disabled at runtime by writing into the files.

The runtime toggling is racy.  Some options disable __CMPXCHG_DOUBLE when
enabled, which means that in case of concurrent allocations, some can
still use __CMPXCHG_DOUBLE and some not, leading to potential corruption.
The s->flags field is also not updated or checked atomically.  The
simplest solution is to remove the runtime toggling.  The extended
slub_debug boot parameter syntax introduced by an earlier patch should allow
fine-tuning the debugging configuration during boot with the same
granularity.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-5-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slub: remove runtime allocation order changes
Vlastimil Babka [Fri, 7 Aug 2020 06:18:41 +0000 (23:18 -0700)]
mm, slub: remove runtime allocation order changes

SLUB allows runtime changing of page allocation order by writing into the
/sys/kernel/slab/<cache>/order file.  Jann has reported [1] that this
interface allows the order to be set too small, leading to crashes.

While it's possible to fix the immediate issue, closer inspection reveals
potential races.  Storing the new order calls calculate_sizes() which
non-atomically updates a lot of kmem_cache fields while the cache is still
in use.  Unexpected behavior might occur even if the fields are set to the
same value as they were.

This could be fixed by splitting out the part of calculate_sizes() that
depends on forced_order, so that we only update kmem_cache.oo field.  This
could still race with init_cache_random_seq(), shuffle_freelist(),
allocate_slab().  While it might be possible to audit these and e.g.  add
some READ_ONCE/WRITE_ONCE accesses, it is easier just to remove the
runtime order changes, which is what this patch does.  If there are valid
usecases for per-cache order setting, we could e.g.  extend the boot
parameters to do that.

[1] https://lore.kernel.org/r/CAG48ez31PP--h6_FzVyfJ4H86QYczAFPdxtJHUEEan+7VJETAQ@mail.gmail.com

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-4-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slub: make some slub_debug related attributes read-only
Vlastimil Babka [Fri, 7 Aug 2020 06:18:38 +0000 (23:18 -0700)]
mm, slub: make some slub_debug related attributes read-only

SLUB_DEBUG creates several files under /sys/kernel/slab/<cache>/ that can
be read to check if the respective debugging options are enabled for given
cache.  The options can be also toggled at runtime by writing into the
files.  Some of those, namely red_zone, poison, and store_user can be
toggled only when no objects yet exist in the cache.

Vijayanand reports [1] that there is a problem with freelist randomization
if changing the debugging option's state results in a different number of
objects per page, so that the random sequence cache needs to be
recomputed.

However, another problem is that the check for "no objects yet exist in
the cache" is racy, as noted by Jann [2] and fixing that would add
overhead or otherwise complicate the allocation/freeing paths.  Thus it
would be much simpler just to remove the runtime toggling support.  The
documentation describes it as useful "In case you forgot to enable debugging
on the kernel command line", but the necessity of having no objects limits
its usefulness anyway for many caches.

Vijayanand describes a use case [3] where debugging is enabled for all
but zram caches for memory overhead reasons, and using the runtime toggles
was the only way to achieve such configuration.  After the previous patch
it's now possible to do that directly from the kernel boot option, so we
can remove the dangerous runtime toggles by making the /sys attribute
files read-only.

While updating it, also improve the documentation of the debugging /sys files.

[1] https://lkml.kernel.org/r/1580379523-32272-1-git-send-email-vjitta@codeaurora.org
[2] https://lore.kernel.org/r/CAG48ez31PP--h6_FzVyfJ4H86QYczAFPdxtJHUEEan+7VJETAQ@mail.gmail.com
[3] https://lore.kernel.org/r/1383cd32-1ddc-4dac-b5f8-9c42282fa81c@codeaurora.org

Reported-by: Vijayanand Jitta <vjitta@codeaurora.org>
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-3-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slub: extend slub_debug syntax for multiple blocks
Vlastimil Babka [Fri, 7 Aug 2020 06:18:35 +0000 (23:18 -0700)]
mm, slub: extend slub_debug syntax for multiple blocks

Patch series "slub_debug fixes and improvements".

The slub_debug kernel boot parameter can either apply a single set of
options to all caches or a list of caches.  There is a use case where
debugging is applied for all caches and then disabled at runtime for
specific caches, for performance and memory consumption reasons [1].  As
runtime changes are dangerous, extend the boot parameter syntax so that
multiple blocks of either global or slab-specific options can be
specified, with blocks delimited by ';'.  This will also support the use
case of [1] without runtime changes.
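
To make the block syntax concrete, here is a small user-space sketch.  It
is not the kernel's parser: the function name, the exact precedence rule
(a block naming the cache wins over a global block) and the sample string
used in the note below are assumptions for illustration, and glob matching
of cache names is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Find the option letters that a slub_debug-style string assigns to one
 * cache.  Blocks are separated by ';'.  Inside a block, the first
 * ','-separated field holds the options and any further fields name
 * caches.  A block without cache names is treated as global.
 */
static bool demo_slub_debug_opts(const char *param, const char *cache,
				 char *out, size_t outlen)
{
	bool found = false, named = false;
	const char *blk = param;

	while (*blk) {
		size_t blen = strcspn(blk, ";");   /* whole block */
		size_t olen = strcspn(blk, ",;");  /* options field */
		bool global = (olen == blen);
		bool match = false;
		const char *p = blk + olen;

		/* Walk the cache names that follow the options field. */
		while (p < blk + blen) {
			size_t nlen;

			p++;                       /* skip the ',' */
			nlen = strcspn(p, ",;");
			if (nlen == strlen(cache) && !strncmp(p, cache, nlen))
				match = true;
			p += nlen;
		}

		/* A global block applies only until a named block matched. */
		if ((match || (global && !named)) && olen < outlen) {
			memcpy(out, blk, olen);
			out[olen] = '\0';
			found = true;
			named = named || match;
		}

		blk += blen;
		if (*blk == ';')
			blk++;
	}
	return found;
}
```

With a string such as "FZ;-,zs_handle,zspage" (cache names chosen for the
example), the sketch yields "-" for zspage and "FZ" for any cache not
named in the second block.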

For details see the updated Documentation/vm/slub.rst

[1] https://lore.kernel.org/r/1383cd32-1ddc-4dac-b5f8-9c42282fa81c@codeaurora.org

[weiyongjun1@huawei.com: make parse_slub_debug_flags() static]
Link: http://lkml.kernel.org/r/20200702150522.4940-1-weiyongjun1@huawei.com
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200610163135.17364-2-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm/slab.c: update outdated kmem_list3 in a comment
Xiao Yang [Fri, 7 Aug 2020 06:18:31 +0000 (23:18 -0700)]
mm/slab.c: update outdated kmem_list3 in a comment

kmem_list3 was renamed to kmem_cache_node long ago, so update it.

References:
6744f087ba2a ("slab: Common name for the per node structures")
ce8eb6c424c7 ("slab: Rename list3/l3 to node")

Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200722033355.26908-1-yangx.jy@cn.fujitsu.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order
Long Li [Fri, 7 Aug 2020 06:18:28 +0000 (23:18 -0700)]
mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order

kmalloc cannot allocate memory from HIGHMEM.  Allocating large amounts of
memory currently bypasses the check and will simply leak the memory when
page_address() returns NULL.  To fix this, factor the GFP_SLAB_BUG_MASK
check out of slab & slub, and call it from kmalloc_order() as well.  In
order to make the code clear, the warning message is put in one place.
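
The idea of a single, shared flag check can be sketched in user-space C as
follows.  The bit values and names are hypothetical, and the sketch assumes
the check warns once in a central place and masks off the offending bits
before the allocation proceeds.

```c
#include <stdio.h>

/* Hypothetical bit values for illustration; the kernel's differ. */
#define DEMO_GFP_HIGHMEM 0x02u
#define DEMO_GFP_DMA32   0x04u
#define DEMO_BUG_MASK    (DEMO_GFP_HIGHMEM | DEMO_GFP_DMA32)

/*
 * Single place that validates allocation flags: report any bits from the
 * bug mask and return the flags with those bits cleared.
 */
static unsigned int demo_fix_flags(unsigned int flags)
{
	unsigned int invalid = flags & DEMO_BUG_MASK;

	if (invalid) {
		fprintf(stderr, "unexpected flags %#x, clearing them\n",
			invalid);
		flags &= ~DEMO_BUG_MASK;
	}
	return flags;
}
```

Every allocation path, including the large-allocation one, would call this
helper before asking for pages, so the warning text lives in one function.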

Signed-off-by: Long Li <lonuxli.64@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200704035027.GA62481@lilong
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm/slab: add naive detection of double free
Kees Cook [Fri, 7 Aug 2020 06:18:24 +0000 (23:18 -0700)]
mm/slab: add naive detection of double free

Similar to commit ce6fa91b9363 ("mm/slub.c: add a naive detection of
double free or corruption"), add a very cheap double-free check for SLAB
under CONFIG_SLAB_FREELIST_HARDENED.  With this added, the
"SLAB_FREE_DOUBLE" LKDTM test passes under SLAB:

  lkdtm: Performing direct entry SLAB_FREE_DOUBLE
  lkdtm: Attempting double slab free ...
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 2193 at mm/slab.c:757 ___cache_free+0x325/0x390
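
A cheap check of this kind can be sketched in user-space C.  The sketch
assumes that "naive" means comparing the object being freed only against
the most recently freed one; the structure and names are hypothetical and
do not mirror the SLAB data structures.

```c
#include <stddef.h>

#define DEMO_CACHE_SLOTS 8
#define DEMO_DOUBLE_FREE (-1)
#define DEMO_CACHE_FULL  (-2)

/* Hypothetical LIFO array of recently freed objects. */
struct demo_free_cache {
	void *entry[DEMO_CACHE_SLOTS];
	size_t avail;
};

/*
 * Record obj as freed.  Only the most recently freed entry is inspected,
 * so the check costs one comparison and catches back-to-back double frees
 * but not ones separated by other frees.
 */
static int demo_free_one(struct demo_free_cache *ac, void *obj)
{
	if (ac->avail > 0 && ac->entry[ac->avail - 1] == obj)
		return DEMO_DOUBLE_FREE;
	if (ac->avail == DEMO_CACHE_SLOTS)
		return DEMO_CACHE_FULL;
	ac->entry[ac->avail++] = obj;
	return 0;
}
```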

[keescook@chromium.org: fix misplaced __free_one()]
Link: http://lkml.kernel.org/r/202006261306.0D82A2B@keescook
Link: https://lore.kernel.org/lkml/7ff248c7-d447-340c-a8e2-8c02972aca70@infradead.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Randy Dunlap <rdunlap@infradead.org> [build tested]
Cc: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Matthew Garrett <mjg59@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Link: http://lkml.kernel.org/r/20200625215548.389774-3-keescook@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm/slab: expand CONFIG_SLAB_FREELIST_HARDENED to include SLAB
Kees Cook [Fri, 7 Aug 2020 06:18:20 +0000 (23:18 -0700)]
mm/slab: expand CONFIG_SLAB_FREELIST_HARDENED to include SLAB

Patch series "mm: Expand CONFIG_SLAB_FREELIST_HARDENED to include SLAB"

In reviewing Vlastimil Babka's latest slub debug series, I realized[1]
that several checks under CONFIG_SLAB_FREELIST_HARDENED weren't being
applied to SLAB.  Fix this by expanding the Kconfig coverage, and adding a
simple double-free test for SLAB.

This patch (of 2):

Include SLAB caches when performing kmem_cache pointer verification.  A
defense against such corruption[1] should be applied to all the
allocators.  With this added, the "SLAB_FREE_CROSS" and "SLAB_FREE_PAGE"
LKDTM tests now pass on SLAB:

  lkdtm: Performing direct entry SLAB_FREE_CROSS
  lkdtm: Attempting cross-cache slab free ...
  ------------[ cut here ]------------
  cache_from_obj: Wrong slab cache. lkdtm-heap-b but object is from lkdtm-heap-a
  WARNING: CPU: 2 PID: 2195 at mm/slab.h:530 kmem_cache_free+0x8d/0x1d0
  ...
  lkdtm: Performing direct entry SLAB_FREE_PAGE
  lkdtm: Attempting non-Slab slab free ...
  ------------[ cut here ]------------
  virt_to_cache: Object is not a Slab page!
  WARNING: CPU: 1 PID: 2202 at mm/slab.h:489 kmem_cache_free+0x196/0x1d0

Additionally clean up neighboring Kconfig entries for clarity,
readability, and redundant option removal.

[1] https://github.com/ThomasKing2014/slides/raw/master/Building%20universal%20Android%20rooting%20with%20a%20type%20confusion%20vulnerability.pdf

Fixes: 598a0717a816 ("mm/slab: validate cache membership under freelist hardening")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Garrett <mjg59@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Link: http://lkml.kernel.org/r/20200625215548.389774-1-keescook@chromium.org
Link: http://lkml.kernel.org/r/20200625215548.389774-2-keescook@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm: ksize() should silently accept a NULL pointer
William Kucharski [Fri, 7 Aug 2020 06:18:17 +0000 (23:18 -0700)]
mm: ksize() should silently accept a NULL pointer

Other mm routines such as kfree() and kzfree() silently do the right thing
if passed a NULL pointer, so ksize() should do the same.
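
The convention can be illustrated with a toy user-space allocator.  All
names and the size header are hypothetical; the point is only that the
size query, like the free routine, returns quietly for NULL.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header placed in front of every allocation. */
struct demo_hdr {
	size_t size;
};

static void *demo_alloc(size_t size)
{
	struct demo_hdr *h = malloc(sizeof(*h) + size);

	if (!h)
		return NULL;
	h->size = size;
	return h + 1;
}

/* Size query: a NULL pointer has no backing object, so report 0. */
static size_t demo_size(const void *p)
{
	if (!p)
		return 0;
	return ((const struct demo_hdr *)p - 1)->size;
}

/* Free routine: accepts NULL silently, as free() itself does. */
static void demo_obj_free(void *p)
{
	if (p)
		free((struct demo_hdr *)p - 1);
}
```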

Signed-off-by: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200616225409.4670-1-william.kucharski@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago mm, treewide: rename kzfree() to kfree_sensitive()
Waiman Long [Fri, 7 Aug 2020 06:18:13 +0000 (23:18 -0700)]
mm, treewide: rename kzfree() to kfree_sensitive()

As said by Linus:

  A symmetric naming is only helpful if it implies symmetries in use.
  Otherwise it's actively misleading.

  In "kzalloc()", the z is meaningful and an important part of what the
  caller wants.

  In "kzfree()", the z is actively detrimental, because maybe in the
  future we really _might_ want to use that "memfill(0xdeadbeef)" or
  something. The "zero" part of the interface isn't even _relevant_.

The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.

Rename kzfree() to kfree_sensitive() to follow the example of the recently
added kvfree_sensitive() and make the intention of the API more explicit.
In addition, memzero_explicit() is used to clear the memory to make sure
that it won't get optimized away by the compiler.
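
A user-space analogue of "clear, then free" might look like the sketch
below.  It is an assumption-laden illustration: the names are
hypothetical, the caller passes the length explicitly, and a volatile
pointer stands in for whatever mechanism the kernel's memzero_explicit()
uses to keep the compiler from dropping the stores.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Zero n bytes through a volatile pointer so the stores cannot be
 * treated as dead, even though the buffer is released right afterwards.
 */
static void demo_memzero_explicit(void *p, size_t n)
{
	volatile unsigned char *vp = p;

	while (n--)
		*vp++ = 0;
}

/* Clear a sensitive buffer before handing it back to the allocator. */
static void demo_free_sensitive(void *p, size_t n)
{
	if (!p)
		return;
	demo_memzero_explicit(p, n);
	free(p);
}
```

A plain memset() immediately before free() is the pattern that an
optimizer may remove, which is why the explicit variant matters.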

The renaming is done by using the command sequence:

  git grep -w --name-only kzfree |\
  xargs sed -i 's/kzfree/kfree_sensitive/'

followed by some editing of the kfree_sensitive() kerneldoc and adding
a kzfree backward compatibility macro in slab.h.

[akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h]
[akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more]

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago ocfs2: fix unbalanced locking
Pavel Machek [Fri, 7 Aug 2020 06:18:09 +0000 (23:18 -0700)]
ocfs2: fix unbalanced locking

Depending on what fails, the function can return with nfs_sync_rwlock either
locked or unlocked.  That cannot be right.

Always return with lock unlocked on error.
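
The rule can be sketched in user-space C with a single unlock label that
every error path goes through.  The lock type, the function name and the
fail_step parameter are hypothetical; the sketch assumes the function is
meant to return with the lock held on success.

```c
/* Hypothetical lock: 'held' counts how many times it is currently taken. */
struct demo_lock {
	int held;
};

/*
 * Take the lock, then run two steps that may fail.  fail_step selects
 * which step fails (0 means none).  Every failure leaves through
 * out_unlock, so an error return never leaves the lock held.
 */
static int demo_lock_and_prepare(struct demo_lock *l, int fail_step)
{
	int ret;

	l->held++;                      /* lock */

	if (fail_step == 1) {
		ret = -1;
		goto out_unlock;
	}
	if (fail_step == 2) {
		ret = -2;
		goto out_unlock;
	}
	return 0;                       /* success: caller unlocks later */

out_unlock:
	l->held--;                      /* unlock on every error path */
	return ret;
}
```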

Fixes: 4cd9973f9ff6 ("ocfs2: avoid inode removal while nfsd is accessing it")
Signed-off-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200724124443.GA28164@duo.ucw.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago ocfs2: replace HTTP links with HTTPS ones
Alexander A. Klimov [Fri, 7 Aug 2020 06:18:06 +0000 (23:18 -0700)]
ocfs2: replace HTTP links with HTTPS ones

Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `xmlns`:
        For each link, `http://[^#  ]*(?:\w|/)`:
          If neither `gnu\.org/license`, nor `mozilla\.org/MPL`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200713174456.36596-1-grandmaster@al2klimov.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago ocfs2: change slot number type s16 to u16
Junxiao Bi [Fri, 7 Aug 2020 06:18:02 +0000 (23:18 -0700)]
ocfs2: change slot number type s16 to u16

Dan Carpenter reported the following static checker warning.

fs/ocfs2/super.c:1269 ocfs2_parse_options() warn: '(-1)' 65535 can't fit into 32767 'mopt->slot'
fs/ocfs2/suballoc.c:859 ocfs2_init_inode_steal_slot() warn: '(-1)' 65535 can't fit into 32767 'osb->s_inode_steal_slot'
fs/ocfs2/suballoc.c:867 ocfs2_init_meta_steal_slot() warn: '(-1)' 65535 can't fit into 32767 'osb->s_meta_steal_slot'

That's because OCFS2_INVALID_SLOT is (u16)-1.  Slot numbers in ocfs2 can
never be negative, so change s16 to u16.
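
The mismatch behind the warning can be reproduced in a few lines of
user-space C.  The names are hypothetical stand-ins; only the (u16)-1
definition comes from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the definition quoted above: all 16 bits set. */
#define DEMO_INVALID_SLOT ((uint16_t)-1)

/* Store the slot in a signed 16-bit field, as the old code did. */
static int16_t demo_store_s16(uint16_t slot)
{
	return (int16_t)slot;
}

/*
 * Both operands are promoted to int: the signed field becomes -1 while
 * the constant becomes 65535, so the comparison can never be true.
 */
static bool demo_is_invalid_s16(int16_t slot)
{
	return slot == DEMO_INVALID_SLOT;
}

/* With an unsigned field both sides are 65535 and the test works. */
static bool demo_is_invalid_u16(uint16_t slot)
{
	return slot == DEMO_INVALID_SLOT;
}
```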

Fixes: 9277f8334ffc ("ocfs2: fix value of OCFS2_INVALID_SLOT")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200627001259.19757-1-junxiao.bi@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago ocfs2: suballoc.h: delete a duplicated word
Randy Dunlap [Fri, 7 Aug 2020 06:17:59 +0000 (23:17 -0700)]
ocfs2: suballoc.h: delete a duplicated word

Drop the repeated word "is" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Link: http://lkml.kernel.org/r/20200720001421.28823-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years ago ocfs2: fix remounting needed after setfacl command
Gang He [Fri, 7 Aug 2020 06:17:56 +0000 (23:17 -0700)]
ocfs2: fix remounting needed after setfacl command

When the setfacl command is used to change a file's ACL, the user cannot
get the latest ACL information from the file via the getfacl command until
the file system is remounted.

e.g.
setfacl -m u:ivan:rw /ocfs2/ivan
getfacl /ocfs2/ivan
getfacl: Removing leading '/' from absolute path names
file: ocfs2/ivan
owner: root
group: root
user::rw-
group::r--
mask::r--
other::r--

The latest ACL record ("u:ivan:rw") is not returned by the getfacl
command until the file system is remounted.

Signed-off-by: Gang He <ghe@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200717023751.9922-1-ghe@suse.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agontfs: fix ntfs_test_inode and ntfs_init_locked_inode function type
Luca Stefani [Fri, 7 Aug 2020 06:17:53 +0000 (23:17 -0700)]
ntfs: fix ntfs_test_inode and ntfs_init_locked_inode function type

Clang's Control Flow Integrity (CFI) is a security mechanism that can help
prevent JOP chains.  It is deployed extensively in downstream kernels used
in Android.

Its deployment is hindered by mismatches in function signatures.  For this
case, we make callbacks match their intended function signature, and cast
parameters within them rather than casting the callback when passed as a
parameter.

When running `mount -t ntfs ...` we observe the following trace:

Call trace:
__cfi_check_fail+0x1c/0x24
name_to_dev_t+0x0/0x404
iget5_locked+0x594/0x5e8
ntfs_fill_super+0xbfc/0x43ec
mount_bdev+0x30c/0x3cc
ntfs_mount+0x18/0x24
mount_fs+0x1b0/0x380
vfs_kern_mount+0x90/0x398
do_mount+0x5d8/0x1a10
SyS_mount+0x108/0x144
el0_svc_naked+0x34/0x38

Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: freak07 <michalechner92@googlemail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Anton Altaparmakov <anton@tuxera.com>
Link: http://lkml.kernel.org/r/20200718112513.533800-1-luca.stefani.ge1@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoscripts/spelling.txt: add more spellings to spelling.txt
Colin Ian King [Fri, 7 Aug 2020 06:17:50 +0000 (23:17 -0700)]
scripts/spelling.txt: add more spellings to spelling.txt

Here are some of the more common spelling mistakes and typos that I've
found while fixing up spelling mistakes in the kernel since April 2020.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200714092837.173796-1-colin.king@canonical.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoconst_structs.checkpatch: add regulator_ops
Joe Perches [Fri, 7 Aug 2020 06:17:46 +0000 (23:17 -0700)]
const_structs.checkpatch: add regulator_ops

Add regulator_ops to the list of structs that are expected to be const.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Pi-Hsun Shih <pihsun@chromium.org>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Benson Leung <bleung@chromium.org>
Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: http://lkml.kernel.org/r/dab1ba1aa03a8236933cfb7a28937efb0b808f13.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoscripts/decode_stacktrace.sh: guess path to vmlinux by release name
Konstantin Khlebnikov [Fri, 7 Aug 2020 06:17:43 +0000 (23:17 -0700)]
scripts/decode_stacktrace.sh: guess path to vmlinux by release name

Add the option decode_stacktrace -r <release>, which takes only the release
name.  This is enough to guess the standard paths to vmlinux and modules:

$ echo -e 'schedule+0x0/0x0
tap_open+0x0/0x0 [tap]' |
./scripts/decode_stacktrace.sh -r 5.4.0-37-generic
schedule (kernel/sched/core.c:4138)
tap_open (drivers/net/tap.c:502) tap

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Sasha Levin <sashal@kernel.org>
Link: http://lkml.kernel.org/r/159282923334.248444.2399153100007347838.stgit@buzz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoscripts/decode_stacktrace.sh: guess path to modules
Konstantin Khlebnikov [Fri, 7 Aug 2020 06:17:41 +0000 (23:17 -0700)]
scripts/decode_stacktrace.sh: guess path to modules

Try to find the module in the directory that holds vmlinux (for a fresh
build).  Then try the standard paths where debuginfo is usually placed.
Pick the first file which has the ELF section '.debug_line'.

Before:

$ echo 'tap_open+0x0/0x0 [tap]' |
  ./scripts/decode_stacktrace.sh /usr/lib/debug/boot/vmlinux-5.4.0-37-generic
WARNING! Modules path isn't set, but is needed to parse this symbol
tap_open+0x0/0x0 tap

After:

$ echo 'tap_open+0x0/0x0 [tap]' |
  ./scripts/decode_stacktrace.sh /usr/lib/debug/boot/vmlinux-5.4.0-37-generic
tap_open (drivers/net/tap.c:502) tap

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Sasha Levin <sashal@kernel.org>
Link: http://lkml.kernel.org/r/159282923068.248444.5461337458421616083.stgit@buzz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoscripts/decode_stacktrace.sh: guess basepath if not specified
Konstantin Khlebnikov [Fri, 7 Aug 2020 06:17:38 +0000 (23:17 -0700)]
scripts/decode_stacktrace.sh: guess basepath if not specified

Guess path to kernel sources using known location of symbol "kernel_init".
Make basepath argument optional.

Before:

$ echo 'vfs_open+0x0/0x0' | ./scripts/decode_stacktrace.sh vmlinux ""
vfs_open (home/khlebnikov/src/linux/fs/open.c:912)

After:

$ echo 'vfs_open+0x0/0x0' | ./scripts/decode_stacktrace.sh vmlinux
vfs_open (fs/open.c:912)
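
The guess described above can be sketched as follows.  This is a minimal
Python sketch, not the script's bash implementation.  It assumes the debug
info reports an absolute "path:line" location for kernel_init, which is
defined in init/main.c; the function names are illustrative only.

```python
def guess_basepath(kernel_init_location, anchor="init/main.c"):
    """Derive the source-tree prefix from where kernel_init was resolved.

    kernel_init_location is an addr2line-style "path:line" string
    (assumed format).  Returns the prefix ending in "/", or None when
    the location does not end in the anchor file.
    """
    path = kernel_init_location.rsplit(":", 1)[0]
    if not path.endswith("/" + anchor):
        return None
    return path[: -len(anchor)]


def strip_basepath(location, basepath):
    """Drop the guessed prefix so paths print relative to the source tree."""
    if basepath and location.startswith(basepath):
        return location[len(basepath):]
    return location
```

Once the prefix is known, every decoded location can be printed relative
to the source tree, which is the difference between the "Before" and
"After" output.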

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Sasha Levin <sashal@kernel.org>
Link: http://lkml.kernel.org/r/159282922803.248444.2379229451667913634.stgit@buzz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoscripts/decode_stacktrace.sh: skip missing symbols
Konstantin Khlebnikov [Fri, 7 Aug 2020 06:17:35 +0000 (23:17 -0700)]
scripts/decode_stacktrace.sh: skip missing symbols

Currently the script turns missing symbols into '0' and produces a bogus
decode.  Skip them instead.  Also simplify parsing the output of 'nm'.

Before:

$ echo 'xxx+0x0/0x0' | ./scripts/decode_stacktrace.sh vmlinux ""
xxx (home/khlebnikov/src/linux/./arch/x86/include/asm/processor.h:398)

After:

$ echo 'xxx+0x0/0x0' | ./scripts/decode_stacktrace.sh vmlinux ""
xxx+0x0/0x0
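
The skip behaviour can be sketched like this.  It is a hedged Python
illustration rather than the script's bash code: symtab stands in for a
symbol table built from 'nm', addr_to_line stands in for an addr2line
lookup, and both names are hypothetical.

```python
import re

# Matches the "symbol+0xoffset/0xlength" part of a stack-trace frame.
FRAME_RE = re.compile(r"^(?P<sym>[^\s+]+)\+(?P<off>0x[0-9a-fA-F]+)/0x[0-9a-fA-F]+")


def decode_frame(frame, symtab, addr_to_line):
    """Decode one frame, or return it unchanged if the symbol is unknown."""
    m = FRAME_RE.match(frame.strip())
    if m is None or m.group("sym") not in symtab:
        # Missing symbol: pass the line through instead of decoding
        # a made-up address of 0.
        return frame
    addr = symtab[m.group("sym")] + int(m.group("off"), 16)
    return "{} ({})".format(m.group("sym"), addr_to_line(addr))
```

With a symbol table that lacks 'xxx', decode_frame("xxx+0x0/0x0", ...)
returns the input line untouched, matching the "After" output.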

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Sasha Levin <sashal@kernel.org>
Link: http://lkml.kernel.org/r/159282922499.248444.4883465570858385250.stgit@buzz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoscripts/bloat-o-meter: Support comparing library archives
Nikolay Borisov [Fri, 7 Aug 2020 06:17:32 +0000 (23:17 -0700)]
scripts/bloat-o-meter: Support comparing library archives

Library archives (.a) usually contain multiple object files so their
output of nm --size-sort contains lines like:

<omitted for brevity>
00000000000003a8 t run_test

extent-map-tests.o:
<omitted for brevity>

bloat-o-meter currently doesn't handle them which results in errors when
calling .split() on them.  Fix this by simply ignoring them.  This enables
diffing subsystems which generate built-in.a files.
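
One way to ignore such lines is sketched below.  This is an illustration
under assumptions, not necessarily the check the patch uses: it keeps only
lines that split into exactly three fields (size, type, name), and the
function name is made up for the example.

```python
def parse_nm_sizes(nm_output):
    """Return {symbol: size} from `nm --size-sort` text.

    Archive output interleaves blank lines and "member.o:" headers
    between the symbol rows; neither has the three-field shape of a
    symbol row, so both are skipped.
    """
    sizes = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        size, _sym_type, name = fields
        sizes[name] = sizes.get(name, 0) + int(size, 16)
    return sizes
```

Unpacking line.split() directly into three names is what fails on a
header such as "extent-map-tests.o:", since it yields a single field.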

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200603103513.3712-1-nborisov@suse.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoscripts/tags.sh: collect compiled source precisely
Jialu Xu [Fri, 7 Aug 2020 06:17:29 +0000 (23:17 -0700)]
scripts/tags.sh: collect compiled source precisely

Parse compiled source from *.cmd but don't 'find' too many files that are
not related to compilation.

[xujialu@vimux.org: don't expand symlinks, by adding option -s to realpath]
Link: http://lkml.kernel.org/r/5efc5bfb.1c69fb81.41bf5.7131SMTPIN_ADDED_MISSING@mx.google.com
Signed-off-by: Jialu Xu <xujialu@vimux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/5ee5d8e3.1c69fb81.9b804.47b2SMTPIN_ADDED_MISSING@mx.google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agotools/testing/selftests/cgroup/cgroup_util.c: cg_read_strcmp: fix null pointer derefe...
Gaurav Singh [Fri, 7 Aug 2020 06:17:25 +0000 (23:17 -0700)]
tools/testing/selftests/cgroup/cgroup_util.c: cg_read_strcmp: fix null pointer dereference

Haven't reproduced this issue. This PR does a minor code cleanup.

Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Chris Down <chris@chrisdown.name>
Link: http://lkml.kernel.org/r/20200726013808.22242-1-gaurav1086@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agotools/: replace HTTP links with HTTPS ones
Alexander A. Klimov [Fri, 7 Aug 2020 06:17:22 +0000 (23:17 -0700)]
tools/: replace HTTP links with HTTPS ones

Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200726120752.16768-1-grandmaster@al2klimov.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agokthread: remove incorrect comment in kthread_create_on_cpu()
Ilias Stamatis [Fri, 7 Aug 2020 06:17:19 +0000 (23:17 -0700)]
kthread: remove incorrect comment in kthread_create_on_cpu()

Originally kthread_create_on_cpu() parked and woke up the new thread.
However, since commit a65d40961dc7 ("kthread/smpboot: do not park in
kthread_create_on_cpu()") this is no longer the case.  This patch removes
the comment that has been left behind and is now incorrect / stale.

Fixes: a65d40961dc7 ("kthread/smpboot: do not park in kthread_create_on_cpu()")
Signed-off-by: Ilias Stamatis <stamatis.iliass@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: http://lkml.kernel.org/r/20200611135920.240551-1-stamatis.iliass@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: fix kthread_use_mm() vs TLB invalidate
Peter Zijlstra [Fri, 7 Aug 2020 06:17:16 +0000 (23:17 -0700)]
mm: fix kthread_use_mm() vs TLB invalidate

For SMP systems using IPI based TLB invalidation, looking at
current->active_mm is entirely reasonable.  This then presents the
following race condition:

  CPU0                    CPU1

  flush_tlb_mm(mm)        use_mm(mm)
    <send-IPI>
                            tsk->active_mm = mm;
                            <IPI>
                              if (tsk->active_mm == mm)
                                // flush TLBs
                            </IPI>
                            switch_mm(old_mm,mm,tsk);

Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
because the IPI lands before we actually switched.

Avoid this by disabling IRQs across changing ->active_mm and
switch_mm().
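
The window can be replayed with a toy model.  The Python sketch below is
not kernel code and every name in it is illustrative: it tracks the
software view (active_mm) separately from the address space actually
loaded in hardware, and it assumes an IPI that arrives with IRQs disabled
stays pending until they are re-enabled.

```python
class Cpu:
    """Toy model of the CPU that runs use_mm(); not kernel code."""

    def __init__(self, mm):
        self.active_mm = mm      # software bookkeeping (tsk->active_mm)
        self.hw_mm = mm          # address space actually loaded in hardware
        self.irqs_enabled = True
        self.pending_ipi = None
        self.flushed = []        # address space each TLB flush really hit

    def receive_ipi(self, mm):
        # With IRQs off the IPI stays pending until they are re-enabled.
        if self.irqs_enabled:
            self._flush_handler(mm)
        else:
            self.pending_ipi = mm

    def _flush_handler(self, mm):
        if self.active_mm == mm:
            self.flushed.append(self.hw_mm)

    def enable_irqs(self):
        self.irqs_enabled = True
        if self.pending_ipi is not None:
            mm, self.pending_ipi = self.pending_ipi, None
            self._flush_handler(mm)


def use_mm(cpu, mm, irqs_off):
    """Replay the interleaving above; return which mm each flush hit."""
    if irqs_off:
        cpu.irqs_enabled = False
    cpu.active_mm = mm
    cpu.receive_ipi(mm)   # remote flush_tlb_mm(mm) IPI arrives in the window
    cpu.hw_mm = mm        # switch_mm(old_mm, mm, tsk)
    if irqs_off:
        cpu.enable_irqs()
    return cpu.flushed
```

In this model, use_mm(Cpu("old_mm"), "mm", irqs_off=False) records a flush
of "old_mm", while the same call with irqs_off=True records a flush of
"mm", because the handler only runs after the switch.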

Of the (SMP) architectures that have IPI based TLB invalidate:

  Alpha    - checks active_mm
  ARC      - ASID specific
  IA64     - checks active_mm
  MIPS     - ASID specific flush
  OpenRISC - shoots down world
  PARISC   - shoots down world
  SH       - ASID specific
  SPARC    - ASID specific
  x86      - N/A
  xtensa   - checks active_mm

So at the very least Alpha, IA64 and Xtensa are suspect.

On top of this, for scheduler consistency we need at least preemption
disabled across changing tsk->mm and doing switch_mm(), which is
currently provided by task_lock(), but that's not sufficient for
PREEMPT_RT.

[akpm@linux-foundation.org: add comment]

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200721154106.GE10769@hirez.programming.kicks-ass.net
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm/shuffle: don't move pages between zones and don't read garbage memmaps
David Hildenbrand [Fri, 7 Aug 2020 06:17:13 +0000 (23:17 -0700)]
mm/shuffle: don't move pages between zones and don't read garbage memmaps

Especially with memory hotplug, we can have offline sections (with a
garbage memmap) and overlapping zones.  We have to make sure to only touch
initialized memmaps (online sections managed by the buddy) and that the
zone matches, to not move pages between zones.

To test if this can actually happen, I added a simple

BUG_ON(page_zone(page_i) != page_zone(page_j));

right before the swap.  When hotplugging a 256M DIMM to a 4G x86-64 VM and
onlining the first memory block "online_movable" and the second memory
block "online_kernel", it will trigger the BUG, as both zones (NORMAL and
MOVABLE) overlap.

This might result in all kinds of weird situations (e.g., double
allocations, list corruptions, unmovable allocations ending up in the
movable zone).

Fixes: e900a918b098 ("mm: shuffle initial free memory to improve memory-side-cache utilization")
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [5.2+]
Link: http://lkml.kernel.org/r/20200624094741.9918-2-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm/migrate: fix migrate_pgmap_owner w/o CONFIG_MMU_NOTIFIER
Ralph Campbell [Fri, 7 Aug 2020 06:17:09 +0000 (23:17 -0700)]
mm/migrate: fix migrate_pgmap_owner w/o CONFIG_MMU_NOTIFIER

On x86_64, when CONFIG_MMU_NOTIFIER is not set/enabled, there is a
compiler error:

   mm/migrate.c: In function 'migrate_vma_collect':
   mm/migrate.c:2481:7: error: 'struct mmu_notifier_range' has no member named 'migrate_pgmap_owner'
     range.migrate_pgmap_owner = migrate->pgmap_owner;
          ^

Fixes: 998427b3ad2c ("mm/notifier: add migration invalidation type")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Jason Gunthorpe" <jgg@mellanox.com>
Link: http://lkml.kernel.org/r/20200806193353.7124-1-rcampbell@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoMerge tag 'tty-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Thu, 6 Aug 2020 21:56:11 +0000 (14:56 -0700)]
Merge tag 'tty-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial updates from Greg KH:
 "Here is the large set of TTY and Serial driver patches for 5.9-rc1.

  Lots of bugfixes in here, thanks to syzbot fuzzing for serial and vt
  and console code.

  Other highlights include:

   - much needed vt/vc code cleanup from Jiri Slaby

   - 8250 driver fixes and additions

   - various serial driver updates and feature enhancements

   - locking cleanup for serial/console initializations

   - other minor cleanups

  All of these have been in linux-next with no reported issues"

* tag 'tty-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (90 commits)
  MAINTAINERS: enlist Greg formally for console stuff
  vgacon: Fix for missing check in scrollback handling
  Revert "serial: 8250: Let serial core initialise spin lock"
  serial: 8250: Let serial core initialise spin lock
  tty: keyboard, do not speculate on func_table index
  serial: stm32: Add RS485 RTS GPIO control
  serial: 8250_dw: Fix common clocks usage race condition
  serial: 8250_dw: Pass the same rate to the clk round and set rate methods
  serial: 8250_dw: Simplify the ref clock rate setting procedure
  serial: 8250: Add 8250 port clock update method
  tty: serial: imx: add imx earlycon driver
  tty: serial: imx: enable imx serial console port as module
  tty/synclink: remove leftover bits of non-PCI card support
  tty: Use the preferred form for passing the size of a structure type
  tty: Fix identation issues in struct serial_struct32
  tty: Avoid the use of one-element arrays
  serial: msm_serial: add sparse context annotation
  serial: pmac_zilog: add sparse context annotation
  newport_con: vc_color is now in state
  serial: imx: use hrtimers for rs485 delays
  ...

3 years agoMerge tag 'staging-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Thu, 6 Aug 2020 21:36:13 +0000 (14:36 -0700)]
Merge tag 'staging-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging/IIO driver updates from Greg KH:
 "Here is the large set of Staging and IIO driver patches for 5.9-rc1.

  Lots of churn here, but overall the size increase in lines added is
  small, while adding a load of new IIO drivers.

  Major things in here:

   - lots and lots of IIO new drivers and frameworks added

   - IIO driver fixes and updates

   - lots of tiny coding style cleanups for staging drivers

   - vc04_services major reworks and cleanups

  We had 3 sets of drivers move out of staging in this round as well:

   - wilc1000 wireless driver moved out of staging

   - speakup moved out of staging

   - most USB driver moved out of staging

  Full details are in the shortlog.

  All of these have been in linux-next with no reported issues. The last
  few changes here were to resolve reported linux-next issues, and they
  seem to have resolved the problems"

* tag 'staging-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (428 commits)
  staging: most: fix up movement of USB driver
  staging: rts5208: clear alignment style issues
  staging: r8188eu: replace rtw_netdev_priv define with inline function
  staging: netlogic: clear alignment style issues
  staging: android: ashmem: Fix lockdep warning for write operation
  drivers: most: add USB adapter driver
  staging: most: Use %pM format specifier for MAC addresses
  staging: ks7010: Use %pM format specifier for MAC addresses
  staging: qlge: qlge_dbg: removed comment repition
  staging: wfx: Use flex_array_size() helper in memcpy()
  staging: rtl8723bs: Align macro definitions
  staging: rtl8723bs: Clean up function declations
  staging: rtl8723bs: Fix coding style errors
  drivers: staging: audio: Fix the missing header file for helper file
  staging: greybus: audio: Enable GB codec, audio module compilation.
  staging: greybus: audio: Add helper APIs for dynamic audio modules
  staging: greybus: audio: Resolve compilation error in topology parser
  staging: greybus: audio: Resolve compilation errors for GB codec module
  staging: greybus: audio: Maintain jack list within GB Audio module
  staging: greybus: audio: Update snd_jack FW usage as per new APIs
  ...

3 years agoMerge tag 'sound-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 6 Aug 2020 21:27:31 +0000 (14:27 -0700)]
Merge tag 'sound-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "This became wide and scattered updates all over the sound tree as
  diffstat shows: lots of (still ongoing) refactoring works in ASoC,
  fixes and cleanups caught by static analysis, inclusive term
  conversions as well as lots of new drivers. Below are highlights:

  ASoC core:
   - API cleanups and conversions to the unified mute_stream() call
   - Simplify I/O helper functions
   - Use helper macros to retrieve RTD from substreams

  ASoC drivers:
   - Lots of fixes and cleanups in Intel ASoC drivers
   - Lots of new stuff: Freescale MQS and i.MX6sx, Intel KeemBay I2S,
     Maxim MAX98360A and MAX98373 SoundWire, various Mediatek boards,
     nVidia Tegra 186 and 210, RealTek RL6231, Samsung Midas and Aries
     boards, TI J721e EVM

  ALSA core:
   - Minor code refactoring for SG-buffer handling

  HD-audio:
   - Generalization of mute-LED handling with LED classdev
   - Intel silent stream support for HDMI
   - Device-specific fixes: CA0132, Loongson-3

  Others:
   - Usual USB- and HD-audio quirks for various devices
   - Fixes for echoaudio DMA position handling
   - Various documents and trivial fixes for sparse warnings
   - Conversion to adopt inclusive terms"

* tag 'sound-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (479 commits)
  ALSA: pci: delete repeated words in comments
  ALSA: isa: delete repeated words in comments
  ALSA: hda/tegra: Add 100us dma stop delay
  ALSA: hda: Add dma stop delay variable
  ASoC: hda/tegra: Set buffer alignment to 128 bytes
  ALSA: seq: oss: Serialize ioctls
  ALSA: hda/hdmi: Add quirk to force connectivity
  ALSA: usb-audio: add startech usb audio dock name
  ALSA: usb-audio: Add support for Lenovo ThinkStation P620
  Revert "ALSA: hda: call runtime_allow() for all hda controllers"
  ALSA: hda/ca0132 - Fix AE-5 microphone selection commands.
  ALSA: hda/ca0132 - Add new quirk ID for Recon3D.
  ALSA: hda/ca0132 - Fix ZxR Headphone gain control get value.
  ALSA: hda/realtek: Add alc269/alc662 pin-tables for Loongson-3 laptops
  ALSA: docs: fix typo
  ALSA: doc: use correct config variable name
  ASoC: core: Two step component registration
  ASoC: core: Simplify snd_soc_component_initialize declaration
  ASoC: core: Relocate and expose snd_soc_component_initialize
  ASoC: sh: Replace 'select' DMADEVICES 'with depends on'
  ...

3 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Thu, 6 Aug 2020 19:59:31 +0000 (12:59 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "s390:
   - implement diag318

  x86:
   - Report last CPU for debugging
   - Emulate smaller MAXPHYADDR in the guest than in the host
   - .noinstr and tracing fixes from Thomas
   - nested SVM page table switching optimization and fixes

  Generic:
   - Unify shadow MMU cache data structures across architectures"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
  KVM: SVM: Fix sev_pin_memory() error handling
  KVM: LAPIC: Set the TDCR settable bits
  KVM: x86: Specify max TDP level via kvm_configure_mmu()
  KVM: x86/mmu: Rename max_page_level to max_huge_page_level
  KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR
  KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch
  KVM: x86: Pull the PGD's level from the MMU instead of recalculating it
  KVM: VMX: Make vmx_load_mmu_pgd() static
  KVM: x86/mmu: Add separate helper for shadow NPT root page role calc
  KVM: VMX: Drop a duplicate declaration of construct_eptp()
  KVM: nSVM: Correctly set the shadow NPT root level in its MMU role
  KVM: Using macros instead of magic values
  MIPS: KVM: Fix build error caused by 'kvm_run' cleanup
  KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF
  KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
  KVM: VMX: optimize #PF injection when MAXPHYADDR does not match
  KVM: VMX: Add guest physical address check in EPT violation and misconfig
  KVM: VMX: introduce vmx_need_pf_intercept
  KVM: x86: update exception bitmap on CPUID changes
  KVM: x86: rename update_bp_intercept to update_exception_bitmap
  ...

3 years agoRevert "x86/mm/64: Do not sync vmalloc/ioremap mappings"
Linus Torvalds [Thu, 6 Aug 2020 19:02:58 +0000 (12:02 -0700)]
Revert "x86/mm/64: Do not sync vmalloc/ioremap mappings"

This reverts commit 8bb9bf242d1fee925636353807c511d54fde8986.

It seems the vmalloc page tables aren't always preallocated in all
situations, because Jason Donenfeld reports an oops with this commit:

  BUG: unable to handle page fault for address: ffffe8ffffd00608
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP
  CPU: 2 PID: 22 Comm: kworker/2:0 Not tainted 5.8.0+ #154
  RIP: process_one_work+0x2c/0x2d0
  Code: 41 56 41 55 41 54 55 48 89 f5 53 48 89 fb 48 83 ec 08 48 8b 06 4c 8b 67 40 49 89 c6 45 30 f6 a8 04 b8 00 00 00 00 4c 0f 44 f0 <49> 8b 46 08 44 8b a8 00 01 05
  Call Trace:
   worker_thread+0x4b/0x3b0
   ? rescuer_thread+0x360/0x360
   kthread+0x116/0x140
   ? __kthread_create_worker+0x110/0x110
   ret_from_fork+0x1f/0x30
  CR2: ffffe8ffffd00608

and that page fault address is right in that vmalloc space, and we
clearly don't have a PGD/P4D entry for it.
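
The address arithmetic can be checked with a short sketch.  It assumes the
default x86-64 4-level layout with no base randomization, in which the
vmalloc/ioremap region spans ffffc90000000000 to ffffe8ffffffffff and each
top-level (PGD) entry covers 512 GB; the constants and helper names are
written for this illustration and are not taken from the commit.

```python
VMALLOC_START = 0xFFFFC90000000000  # assumed: default 4-level base, no randomization
VMALLOC_END = 0xFFFFE8FFFFFFFFFF    # assumed: end of the 32 TB region

PGDIR_SHIFT = 39      # 4-level paging: each top-level entry covers 512 GB
PTRS_PER_PGD = 512


def in_vmalloc(addr):
    """True if addr lies inside the assumed vmalloc/ioremap range."""
    return VMALLOC_START <= addr <= VMALLOC_END


def pgd_index(addr):
    """Index of the top-level page-table entry that maps addr."""
    return (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)
```

Under those assumptions, in_vmalloc(0xffffe8ffffd00608) holds and the
address falls under top-level index 465, the last of the slots (402 to
465) that cover the vmalloc range.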

Looking at the "Code:" line, the actual fault seems to come from the
'pwq->wq' dereference at the top of the process_one_work() function:

        struct pool_workqueue *pwq = get_work_pwq(work);
        struct worker_pool *pool = worker->pool;
        bool cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;

so 'struct pool_workqueue *pwq' is the allocation that hasn't been
synchronized across CPUs.

Just revert for now, while Joerg figures out the cause.

Reported-and-bisected-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoMerge tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 6 Aug 2020 18:55:43 +0000 (11:55 -0700)]
Merge tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull sched/fifo updates from Ingo Molnar:
 "This adds the sched_set_fifo*() encapsulation APIs to remove static
  priority level knowledge from non-scheduler code.

  The three APIs for non-scheduler code to set SCHED_FIFO are:

   - sched_set_fifo()
   - sched_set_fifo_low()
   - sched_set_normal()

  These are two FIFO priority levels: default (high), and a 'low'
  priority level, plus sched_set_normal() to set the policy back to
  non-SCHED_FIFO.

  Since the changes affect a lot of non-scheduler code, we kept this in
  a separate tree"

* tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  sched,tracing: Convert to sched_set_fifo()
  sched: Remove sched_set_*() return value
  sched: Remove sched_setscheduler*() EXPORTs
  sched,psi: Convert to sched_set_fifo_low()
  sched,rcutorture: Convert to sched_set_fifo_low()
  sched,rcuperf: Convert to sched_set_fifo_low()
  sched,locktorture: Convert to sched_set_fifo()
  sched,irq: Convert to sched_set_fifo()
  sched,watchdog: Convert to sched_set_fifo()
  sched,serial: Convert to sched_set_fifo()
  sched,powerclamp: Convert to sched_set_fifo()
  sched,ion: Convert to sched_set_normal()
  sched,powercap: Convert to sched_set_fifo*()
  sched,spi: Convert to sched_set_fifo*()
  sched,mmc: Convert to sched_set_fifo*()
  sched,ivtv: Convert to sched_set_fifo*()
  sched,drm/scheduler: Convert to sched_set_fifo*()
  sched,msm: Convert to sched_set_fifo*()
  sched,psci: Convert to sched_set_fifo*()
  sched,drbd: Convert to sched_set_fifo*()
  ...

3 years agoMerge tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar...
Linus Torvalds [Thu, 6 Aug 2020 18:35:57 +0000 (11:35 -0700)]
Merge tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity updates from Mimi Zohar:
 "The nicest change is the IMA policy rule checking. The other changes
  include allowing the kexec boot command line measure policy rules to
  be defined in terms of the inode associated with the kexec kernel
  image, making the IMA_APPRAISE_BOOTPARAM, which governs the IMA
  appraise mode (log, fix, enforce), a runtime decision based on the
  secure boot mode of the system, and including errno in the audit log"

* tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  integrity: remove redundant initialization of variable ret
  ima: move APPRAISE_BOOTPARAM dependency on ARCH_POLICY to runtime
  ima: AppArmor satisfies the audit rule requirements
  ima: Rename internal filter rule functions
  ima: Support additional conditionals in the KEXEC_CMDLINE hook function
  ima: Use the common function to detect LSM conditionals in a rule
  ima: Move comprehensive rule validation checks out of the token parser
  ima: Use correct type for the args_p member of ima_rule_entry.lsm elements
  ima: Shallow copy the args_p member of ima_rule_entry.lsm elements
  ima: Fail rule parsing when appraise_flag=blacklist is unsupportable
  ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid cond
  ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an invalid cond
  ima: Fail rule parsing when buffer hook functions have an invalid action
  ima: Free the entire rule if it fails to parse
  ima: Free the entire rule when deleting a list of rules
  ima: Have the LSM free its audit rule
  IMA: Add audit log for failure conditions
  integrity: Add errno field in audit message

3 years agoMerge branch 'for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux
Linus Torvalds [Thu, 6 Aug 2020 18:34:35 +0000 (11:34 -0700)]
Merge branch 'for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux

Pull coccinelle updates from Julia Lawall:
 "New semantic patches and semantic patch improvements from Denis
  Efremov"

* 'for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  coccinelle: api: filter out memdup_user definitions
  coccinelle: api: extend memdup_user rule with vmemdup_user()
  coccinelle: api: extend memdup_user transformation with GFP_USER
  coccinelle: api: add kzfree script
  coccinelle: misc: add array_size_dup script to detect missed overflow checks
  coccinelle: api/kstrdup: fix coccinelle position
  coccinelle: api: add device_attr_show script

3 years agoMerge tag 'livepatching-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 6 Aug 2020 18:33:20 +0000 (11:33 -0700)]
Merge tag 'livepatching-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching

Pull livepatching updates from Petr Mladek:
 "Improvements and cleanups of livepatching selftests"

* tag 'livepatching-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  selftests/livepatch: adopt to newer sysctl error format
  selftests/livepatch: Use "comm" instead of "diff" for dmesg
  selftests/livepatch: add test delimiter to dmesg
  selftests/livepatch: refine dmesg 'taints' in dmesg comparison
  selftests/livepatch: Don't clear dmesg when running tests
  selftests/livepatch: fix mem leaks in test-klp-shadow-vars
  selftests/livepatch: more verification in test-klp-shadow-vars
  selftests/livepatch: rework test-klp-shadow-vars
  selftests/livepatch: simplify test-klp-callbacks busy target tests

3 years agoMerge tag 'Smack-for-5.9' of git://github.com/cschaufler/smack-next
Linus Torvalds [Thu, 6 Aug 2020 18:02:23 +0000 (11:02 -0700)]
Merge tag 'Smack-for-5.9' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Minor fixes to Smack for the v5.9 release.

  All were found by automated checkers and have straightforward
  resolution"

* tag 'Smack-for-5.9' of git://github.com/cschaufler/smack-next:
  Smack: prevent underflow in smk_set_cipso()
  Smack: fix another vsscanf out of bounds
  Smack: fix use-after-free in smk_write_relabel_self()

3 years ago Merge tag 'mips_5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Linus Torvalds [Thu, 6 Aug 2020 17:54:07 +0000 (10:54 -0700)]
Merge tag 'mips_5.9' of git://git./linux/kernel/git/mips/linux

Pull MIPS updates from Thomas Bogendoerfer:

 - improvements for Loongson64

 - extended ingenic support

 - removal of the unmaintained paravirt system type

 - cleanups and fixes

* tag 'mips_5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (81 commits)
  MIPS: SGI-IP27: always enable NUMA in Kconfig
  MAINTAINERS: Update KVM/MIPS maintainers
  MIPS: Update default config file for Loongson-3
  MIPS: KVM: Add kvm guest support for Loongson-3
  dt-bindings: mips: Document Loongson kvm guest board
  MIPS: handle Loongson-specific GSExc exception
  MIPS: add definitions for Loongson-specific CP0.Diag1 register
  MIPS: only register FTLBPar exception handler for supported models
  MIPS: ingenic: Hardcode mem size for qi,lb60 board
  MIPS: DTS: ingenic/qi,lb60: Add model and memory node
  MIPS: ingenic: Use fw_passed_dtb even if CONFIG_BUILTIN_DTB
  MIPS: head.S: Init fw_passed_dtb to builtin DTB
  of: address: Fix parser address/size cells initialization
  of_address: Guard of_bus_pci_get_flags with CONFIG_PCI
  MIPS: DTS: Fix number of msi vectors for Loongson64G
  MIPS: Loongson64: Add ISA node for LS7A PCH
  MIPS: Loongson64: DTS: Fix ISA and PCI I/O ranges for RS780E PCH
  MIPS: Loongson64: Enlarge IO_SPACE_LIMIT
  MIPS: Loongson64: Process ISA Node in DeviceTree
  of_address: Add bus type match for pci ranges parser
  ...

3 years ago Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Linus Torvalds [Thu, 6 Aug 2020 17:17:00 +0000 (10:17 -0700)]
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM updates from Russell King:

 - add arch/arm/Kbuild from Masahiro Yamada.

 - simplify act_mm macro, since it contains an open-coded
   get_thread_info.

 - VFP updates for Clang from Stefan Agner.

 - Fix unwinder for Clang from Nathan Huckleberry.

 - Remove unused it8152 PCI host controller, used by the removed cm-x2xx
   platforms from Mike Rapoport.

 - Further explanation of __range_ok().

 - Remove kimage_voffset that isn't used anymore from Marc Zyngier.

 - Drop ancient Thumb-2 workaround for old binutils from Ard Biesheuvel.

 - Documentation cleanup for mach-* from Pete Zaitcev.

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8996/1: Documentation/Clean up the description of mach-<class>
  ARM: 8995/1: drop Thumb-2 workaround for ancient binutils
  ARM: 8994/1: mm: drop kimage_voffset which was only used by KVM
  ARM: uaccess: add further explanation of __range_ok()
  ARM: 8993/1: remove it8152 PCI controller driver
  ARM: 8992/1: Fix unwind_frame for clang-built kernels
  ARM: 8991/1: use VFP assembler mnemonics if available
  ARM: 8990/1: use VFP assembler mnemonics in register load/store macros
  ARM: 8989/1: use .fpu assembler directives instead of assembler arguments
  ARM: 8982/1: mm: Simplify act_mm macro
  ARM: 8981/1: add arch/arm/Kbuild

3 years ago Merge tag 'csky-for-linus-5.9-rc1' of https://github.com/c-sky/csky-linux
Linus Torvalds [Thu, 6 Aug 2020 17:15:28 +0000 (10:15 -0700)]
Merge tag 'csky-for-linus-5.9-rc1' of https://github.com/c-sky/csky-linux

Pull arch/csky updates from Guo Ren:
 "New features:
   - seccomp-filter
   - err-injection
   - top-down&random mmap-layout
   - irq_work
   - show_ipi
   - context-tracking

  Fixes & Optimizations:
   - kprobe_on_ftrace
   - optimize panic print"

* tag 'csky-for-linus-5.9-rc1' of https://github.com/c-sky/csky-linux:
  csky: Add context tracking support
  csky: Add arch_show_interrupts for IPI interrupts
  csky: Add irq_work support
  csky: Fixup warning by EXPORT_SYMBOL(kmap)
  csky: Set CONFIG_NR_CPU 4 as default
  csky: Use top-down mmap layout
  csky: Optimize the trap processing flow
  csky: Add support for function error injection
  csky: Fixup kprobes handler couldn't change pc
  csky: Fixup duplicated restore sp in RESTORE_REGS_FTRACE
  csky: Add cpu feature register hint for smp
  csky: Add SECCOMP_FILTER supported
  csky: remove unusued thread_saved_pc and *_segments functions/macros

3 years ago Merge tag 'xtensa-20200805' of git://github.com/jcmvbkbc/linux-xtensa
Linus Torvalds [Thu, 6 Aug 2020 17:07:40 +0000 (10:07 -0700)]
Merge tag 'xtensa-20200805' of git://github.com/jcmvbkbc/linux-xtensa

Pull Xtensa updates from Max Filippov:

 - add syscall audit support

 - add seccomp filter support

 - clean up make rules under arch/xtensa/boot

 - fix state management for exclusive access opcodes

 - fix build with PMU enabled

* tag 'xtensa-20200805' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: add missing exclusive access state management
  xtensa: fix xtensa_pmu_setup prototype
  xtensa: add boot subdirectories build artifacts to 'targets'
  xtensa: add uImage and xipImage to targets
  xtensa: move vmlinux.bin[.gz] to boot subdirectory
  xtensa: initialize_mmu.h: fix a duplicated word
  selftests/seccomp: add xtensa support
  xtensa: add seccomp support
  xtensa: expose syscall through user_pt_regs
  xtensa: add audit support

3 years ago Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyper...
Linus Torvalds [Thu, 6 Aug 2020 16:26:10 +0000 (09:26 -0700)]
Merge tag 'hyperv-next-signed' of git://git./linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - A patch series from Andrea to improve vmbus code

 - Two clean-up patches from Alexander and Randy

* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  hyperv: hyperv.h: drop a duplicated word
  tools: hv: change http to https in hv_kvp_daemon.c
  Drivers: hv: vmbus: Remove the lock field from the vmbus_channel struct
  scsi: storvsc: Introduce the per-storvsc_device spinlock
  Drivers: hv: vmbus: Remove unnecessary channel->lock critical sections (sc_list updaters)
  Drivers: hv: vmbus: Use channel_mutex in channel_vp_mapping_show()
  Drivers: hv: vmbus: Remove unnecessary channel->lock critical sections (sc_list readers)
  Drivers: hv: vmbus: Replace cpumask_test_cpu(, cpu_online_mask) with cpu_online()
  Drivers: hv: vmbus: Remove the numa_node field from the vmbus_channel struct
  Drivers: hv: vmbus: Remove the target_vp field from the vmbus_channel struct

3 years ago ALSA: pci: delete repeated words in comments
Randy Dunlap [Thu, 6 Aug 2020 02:19:26 +0000 (19:19 -0700)]
ALSA: pci: delete repeated words in comments

Drop duplicated words in sound/pci/.
{and, the, at}

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20200806021926.32418-1-rdunlap@infradead.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 years ago ALSA: isa: delete repeated words in comments
Randy Dunlap [Thu, 6 Aug 2020 02:19:16 +0000 (19:19 -0700)]
ALSA: isa: delete repeated words in comments

Drop duplicated words in sound/isa/.
{be, bit}

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20200806021916.32369-1-rdunlap@infradead.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 years ago Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Linus Torvalds [Thu, 6 Aug 2020 03:13:21 +0000 (20:13 -0700)]
Merge git://git./linux/kernel/git/netdev/net-next

Pull networking updates from David Miller:

 1) Support 6GHz band in ath11k driver, from Rajkumar Manoharan.

 2) Support UDP segmentation in core TSO code, from Eric Dumazet.

 3) Allow flashing different flash images in cxgb4 driver, from Vishal
    Kulkarni.

 4) Add drop frames counter and flow status to tc flower offloading,
    from Po Liu.

 5) Support n-tuple filters in cxgb4, from Vishal Kulkarni.

 6) Various new indirect call avoidance, from Eric Dumazet and Brian
    Vazquez.

 7) Fix BPF verifier failures on 32-bit pointer arithmetic, from
    Yonghong Song.

 8) Support querying and setting hardware address of a port function via
    devlink, use this in mlx5, from Parav Pandit.

 9) Support hw ipsec offload on bonding slaves, from Jarod Wilson.

10) Switch qca8k driver over to phylink, from Jonathan McDowell.

11) In bpftool, show list of processes holding BPF FD references to
    maps, programs, links, and btf objects. From Andrii Nakryiko.

12) Several conversions over to generic power management, from Vaibhav
    Gupta.

13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry
    Yakunin.

14) Various https url conversions, from Alexander A. Klimov.

15) Timestamping and PHC support for mscc PHY driver, from Antoine
    Tenart.

16) Support bpf iterating over tcp and udp sockets, from Yonghong Song.

17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov.

18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan.

19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several
    drivers. From Luc Van Oostenryck.

20) XDP support for xen-netfront, from Denis Kirjanov.

21) Support receive buffer autotuning in MPTCP, from Florian Westphal.

22) Support EF100 chip in sfc driver, from Edward Cree.

23) Add XDP support to mvpp2 driver, from Matteo Croce.

24) Support MPTCP in sock_diag, from Paolo Abeni.

25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic
    infrastructure, from Jakub Kicinski.

26) Several pci_ --> dma_ API conversions, from Christophe JAILLET.

27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel.

28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki.

29) Refactor a lot of networking socket option handling code in order to
    avoid set_fs() calls, from Christoph Hellwig.

30) Add rfc4884 support to icmp code, from Willem de Bruijn.

31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei.

32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin.

33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin.

34) Support TCP syncookies in MPTCP, from Florian Westphal.

35) Fix several tricky cases of PMTU handling wrt. bridging, from Stefano
    Brivio.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits)
  net: thunderx: initialize VF's mailbox mutex before first usage
  usb: hso: remove bogus check for EINPROGRESS
  usb: hso: no complaint about kmalloc failure
  hso: fix bailout in error case of probe
  ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
  selftests/net: relax cpu affinity requirement in msg_zerocopy test
  mptcp: be careful on subflow creation
  selftests: rtnetlink: make kci_test_encap() return sub-test result
  selftests: rtnetlink: correct the final return value for the test
  net: dsa: sja1105: use detected device id instead of DT one on mismatch
  tipc: set ub->ifindex for local ipv6 address
  ipv6: add ipv6_dev_find()
  net: openvswitch: silence suspicious RCU usage warning
  Revert "vxlan: fix tos value before xmit"
  ptp: only allow phase values lower than 1 period
  farsync: switch from 'pci_' to 'dma_' API
  wan: wanxl: switch from 'pci_' to 'dma_' API
  hv_netvsc: do not use VF device if link is down
  dpaa2-eth: Fix passing zero to 'PTR_ERR' warning
  net: macb: Properly handle phylink on at91sam9x
  ...

3 years ago Merge tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Thu, 6 Aug 2020 02:50:06 +0000 (19:50 -0700)]
Merge tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "New xilinx displayport driver, AMD support for two new GPUs (more
  header files), i915 initial support for RocketLake and some work on
  their DG1 (discrete chip).

  The core also grew some lockdep annotations to try and constrain what
  drivers do with dma-fences, and added some documentation on why the
  idea of indefinite fences doesn't work.

  The long list is below.

  I do have some fixes trees outstanding, but I'll follow up with those
  later.

  core:
   - add user def flag to cmd line modes
   - dma_fence_wait added might_sleep
   - dma-fence lockdep annotations
   - documentation on why indefinite fences are bad
   - gem CMA functions used in more drivers
   - struct mutex removal
   - more drm_ debug macro usage
   - set/drop master api fixes
   - fix for drm/mm hole size comparison
   - drm/mm remove invalid entry optimization
   - optimise drm/mm hole handling
   - VRR debugfs added
   - uncompressed AFBC modifier support
   - multiple display id blocks in EDID
   - multiple driver sg handling fixes
   - __drm_atomic_helper_crtc_reset in all drivers
   - managed vram helpers

  ttm:
   - ttm_mem_reg handling cleanup
   - remove bo offset field
   - drop CMA memtype flag
   - drop mappable flag

  xilinx:
   - New Xilinx ZynqMP DisplayPort Subsystem driver

  nouveau:
   - add CRC support
   - start using NVIDIA published class header files
   - convert all push buffer emission to new macros
   - Proper push buffer space management for EVO/NVD channels.
   - firmware loading fixes
   - 2MiB system memory pages support on Pascal and newer

  vkms:
   - larger cursor support

  i915:
   - Rocketlake platform enablement
   - Early DG1 enablement
   - Numerous GEM refactorings
   - DP MST fixes
   - FBC, PSR, Cursor, Color, Gamma fixes
   - TGL, RKL, EHL workaround updates
   - TGL 8K display support fixes
   - SDVO/HDMI/DVI fixes

  amdgpu:
   - Initial support for Sienna Cichlid GPU
   - Initial support for Navy Flounder GPU
   - SI UVD/VCE support
   - expose rotation property
   - Add support for unique id on Arcturus
   - Enable runtime PM on vega10 boards that support BACO
   - Skip BAR resizing if the bios already did it
   - Major swSMU code cleanup
   - Fixes for DCN bandwidth calculations

  amdkfd:
   - Track SDMA usage per process
   - SMI events interface

  radeon:
   - Default to on chip GART for AGP boards on all arches
   - Runtime PM reference count fixes

  msm:
   - headers regenerated causing churn
   - a650/a640 display and GPU enablement
   - dpu dither support for 6bpc panels
   - dpu cursor fix
   - dsi/mdp5 enablement for sdm630/sdm636/sdm660

  tegra:
   - video capture prep support
   - reflection support

  mediatek:
   - convert mtk_dsi to bridge API

  meson:
   - FBC support

  sun4i:
   - iommu support

  rockchip:
   - register locking fix
   - per-pixel alpha support PX30 VOP

  mgag200:
   - ported to simple and shmem helpers
   - device init cleanups
   - use managed pci functions
   - dropped hw cursor support

  ast:
   - use managed pci functions
   - use managed VRAM helpers
   - rework cursor support

  malidp:
   - dev_groups support

  hibmc:
   - refactor hibmc_drv_vdac

  vc4:
   - create TXP CRTC

  imx:
   - error path fixes and cleanups

  etnaviv:
   - clock handling and error handling cleanups
   - use pin_user_pages"

* tag 'drm-next-2020-08-06' of git://anongit.freedesktop.org/drm/drm: (1747 commits)
  drm/msm: use kthread_create_worker instead of kthread_run
  drm/msm/mdp5: Add MDP5 configuration for SDM636/660
  drm/msm/dsi: Add DSI configuration for SDM660
  drm/msm/mdp5: Add MDP5 configuration for SDM630
  drm/msm/dsi: Add phy configuration for SDM630/636/660
  drm/msm/a6xx: add A640/A650 hwcg
  drm/msm/a6xx: hwcg tables in gpulist
  drm/msm/dpu: add SM8250 to hw catalog
  drm/msm/dpu: add SM8150 to hw catalog
  drm/msm/dpu: intf timing path for displayport
  drm/msm/dpu: set missing flush bits for INTF_2 and INTF_3
  drm/msm/dpu: don't use INTF_INPUT_CTRL feature on sdm845
  drm/msm/dpu: move some sspp caps to dpu_caps
  drm/msm/dpu: update UBWC config for sm8150 and sm8250
  drm/msm/dpu: use right setup_blend_config for sm8150 and sm8250
  drm/msm/a6xx: set ubwc config for A640 and A650
  drm/msm/adreno: un-open-code some packets
  drm/msm: sync generated headers
  drm/msm/a6xx: add build_bw_table for A640/A650
  drm/msm/a6xx: fix crashstate capture for A650
  ...

3 years ago Merge tag 'leds-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux...
Linus Torvalds [Thu, 6 Aug 2020 02:24:27 +0000 (19:24 -0700)]
Merge tag 'leds-5.9-rc1' of git://git./linux/kernel/git/pavel/linux-leds

Pull LED updates from Pavel Machek:
 "Okay, so... this one is interesting. RGB LEDs are very common, and we
  need to have some kind of support for them. Multicolor is for an
  arbitrary set of LEDs in one package, RGB is for LEDs that can produce
  a full range of colors. We do not have a real multicolor LED that is
  not RGB in the pipeline, so that one is disabled for now.

  You can expect this saga to continue with next pull requests"

* tag 'leds-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds: (37 commits)
  MAINTAINERS: Remove myself as LED subsystem maintainer
  leds: disallow /sys/class/leds/*:multi:* for now
  leds: add RGB color option, as that is different from multicolor.
  Make LEDS_LP55XX_COMMON depend on I2C to fix build errors:
  Documentation: ABI: leds-turris-omnia: document sysfs attribute
  leds: initial support for Turris Omnia LEDs
  dt-bindings: leds: add cznic,turris-omnia-leds binding
  leds: pattern trigger -- check pattern for validity
  leds: Replace HTTP links with HTTPS ones
  leds: trigger: add support for LED-private device triggers
  leds: lp5521: Add multicolor framework multicolor brightness support
  leds: lp5523: Update the lp5523 code to add multicolor brightness function
  leds: lp55xx: Add multicolor framework support to lp55xx
  leds: lp55xx: Convert LED class registration to devm_*
  dt-bindings: leds: Convert leds-lp55xx to yaml
  leds: multicolor: Introduce a multicolor class definition
  leds: Add multicolor ID to the color ID list
  dt: bindings: Add multicolor class dt bindings documention
  leds: lp5523: Fix various formatting issues in the code
  leds: lp55xx: Fix file permissions to use DEVICE_ATTR macros
  ...

3 years ago net: thunderx: initialize VF's mailbox mutex before first usage
Dean Nelson [Wed, 5 Aug 2020 18:18:48 +0000 (13:18 -0500)]
net: thunderx: initialize VF's mailbox mutex before first usage

A VF's mailbox mutex is not initialized by nicvf_probe() until after its
first use, and that early use results in the following warning:

[   28.270927] ------------[ cut here ]------------
[   28.270934] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[   28.270980] WARNING: CPU: 9 PID: 675 at kernel/locking/mutex.c:938 __mutex_lock+0xdac/0x12f0
[   28.270985] Modules linked in: ast(+) nicvf(+) i2c_algo_bit drm_vram_helper drm_ttm_helper ttm nicpf(+) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ixgbe(+) sg thunder_bgx mdio i2c_thunderx mdio_thunder thunder_xcv mdio_cavium dm_mirror dm_region_hash dm_log dm_mod
[   28.271064] CPU: 9 PID: 675 Comm: systemd-udevd Not tainted 4.18.0+ #1
[   28.271070] Hardware name: GIGABYTE R120-T34-00/MT30-GS2-00, BIOS F02 08/06/2019
[   28.271078] pstate: 60000005 (nZCv daif -PAN -UAO)
[   28.271086] pc : __mutex_lock+0xdac/0x12f0
[   28.271092] lr : __mutex_lock+0xdac/0x12f0
[   28.271097] sp : ffff800d42146fb0
[   28.271103] x29: ffff800d42146fb0 x28: 0000000000000000
[   28.271113] x27: ffff800d24361180 x26: dfff200000000000
[   28.271122] x25: 0000000000000000 x24: 0000000000000002
[   28.271132] x23: ffff20001597cc80 x22: ffff2000139e9848
[   28.271141] x21: 0000000000000000 x20: 1ffff001a8428e0c
[   28.271151] x19: ffff200015d5d000 x18: 1ffff001ae0f2184
[   28.271160] x17: 0000000000000000 x16: 0000000000000000
[   28.271170] x15: ffff800d70790c38 x14: ffff20001597c000
[   28.271179] x13: ffff20001597cc80 x12: ffff040002b2f779
[   28.271189] x11: 1fffe40002b2f778 x10: ffff040002b2f778
[   28.271199] x9 : 0000000000000000 x8 : 00000000f1f1f1f1
[   28.271208] x7 : 00000000f2f2f2f2 x6 : 0000000000000000
[   28.271217] x5 : 1ffff001ae0f2186 x4 : 1fffe400027eb03c
[   28.271227] x3 : dfff200000000000 x2 : ffff1001a8428dbe
[   28.271237] x1 : c87fdfac7ea11d00 x0 : 0000000000000000
[   28.271246] Call trace:
[   28.271254]  __mutex_lock+0xdac/0x12f0
[   28.271261]  mutex_lock_nested+0x3c/0x50
[   28.271297]  nicvf_send_msg_to_pf+0x40/0x3a0 [nicvf]
[   28.271316]  nicvf_register_misc_interrupt+0x20c/0x328 [nicvf]
[   28.271334]  nicvf_probe+0x508/0xda0 [nicvf]
[   28.271344]  local_pci_probe+0xc4/0x180
[   28.271352]  pci_device_probe+0x3ec/0x528
[   28.271363]  driver_probe_device+0x21c/0xb98
[   28.271371]  device_driver_attach+0xe8/0x120
[   28.271379]  __driver_attach+0xe0/0x2a0
[   28.271386]  bus_for_each_dev+0x118/0x190
[   28.271394]  driver_attach+0x48/0x60
[   28.271401]  bus_add_driver+0x328/0x558
[   28.271409]  driver_register+0x148/0x398
[   28.271416]  __pci_register_driver+0x14c/0x1b0
[   28.271437]  nicvf_init_module+0x54/0x10000 [nicvf]
[   28.271447]  do_one_initcall+0x18c/0xc18
[   28.271457]  do_init_module+0x18c/0x618
[   28.271464]  load_module+0x2bc0/0x4088
[   28.271472]  __se_sys_finit_module+0x110/0x188
[   28.271479]  __arm64_sys_finit_module+0x70/0xa0
[   28.271490]  el0_svc_handler+0x15c/0x380
[   28.271496]  el0_svc+0x8/0xc
[   28.271502] irq event stamp: 52649
[   28.271513] hardirqs last  enabled at (52649): [<ffff200011b4d790>] _raw_spin_unlock_irqrestore+0xc0/0xd8
[   28.271522] hardirqs last disabled at (52648): [<ffff200011b4d3c4>] _raw_spin_lock_irqsave+0x3c/0xf0
[   28.271530] softirqs last  enabled at (52330): [<ffff200010082af4>] __do_softirq+0xacc/0x117c
[   28.271540] softirqs last disabled at (52313): [<ffff20001019b354>] irq_exit+0x3cc/0x500
[   28.271545] ---[ end trace a9b90324c8a0d4ee ]---

This problem is resolved by moving the call to mutex_init() up earlier
in nicvf_probe().

Fixes: 609ea65c65a0 ("net: thunderx: add mutex to protect mailbox from concurrent calls for same VF")
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
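
A minimal userspace sketch of the ordering this fix restores, assuming
POSIX threads in place of the kernel mutex API. The struct, its fields and
the mbx_demo_* functions are hypothetical and do not mirror the nicvf
driver; the lock_ready flag only stands in for the kernel's lock debugging
check.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-device state; not the nicvf layout. */
struct mbx_demo_dev {
	pthread_mutex_t mbx_lock;	/* protects the mailbox */
	bool lock_ready;		/* demo-only: set once the lock is initialised */
	int msgs_sent;
};

/* Every mailbox send takes the lock, so the lock must already exist. */
int mbx_demo_send_msg(struct mbx_demo_dev *dev)
{
	if (!dev->lock_ready)
		return -1;	/* the kernel's lock debugging would warn here */

	pthread_mutex_lock(&dev->mbx_lock);
	dev->msgs_sent++;
	pthread_mutex_unlock(&dev->mbx_lock);
	return 0;
}

/* Probe: initialise the lock first, then run setup that sends messages. */
int mbx_demo_probe(struct mbx_demo_dev *dev)
{
	dev->msgs_sent = 0;

	if (pthread_mutex_init(&dev->mbx_lock, NULL) != 0)
		return -1;
	dev->lock_ready = true;

	/* Stand-in for a setup step that talks to the mailbox. */
	return mbx_demo_send_msg(dev);
}
```

In this sketch, calling mbx_demo_send_msg() on a zeroed device fails, while
mbx_demo_probe() succeeds because the lock is set up before the first send.
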
3 years ago Merge branch 'misc-bug-fixes-for-the-hso-driver'
David S. Miller [Thu, 6 Aug 2020 00:43:39 +0000 (17:43 -0700)]
Merge branch 'misc-bug-fixes-for-the-hso-driver'

Oliver Neukum says:

====================
misc bug fixes for the hso driver

1. Code reuse led to an unregistration of a net driver that has not been
registered
2. The kernel complains generically if kmalloc with GFP_KERNEL fails
3. A race that can lead to a URB that is in use being reused or
a use after free
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years ago usb: hso: remove bogus check for EINPROGRESS
Oliver Neukum [Wed, 5 Aug 2020 12:07:09 +0000 (14:07 +0200)]
usb: hso: remove bogus check for EINPROGRESS

This check has an inherent race. It opens a window in which
an error code has already been set or cleared, yet
the URB has not been given back. We cannot make
such an optimization and must unlink unconditionally.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
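
The window described above can be modelled with a small sketch. Everything
here is hypothetical (struct demo_urb, its fields and the demo_* functions
are not the USB core API); the only assumption carried over from the commit
text is that the status field can change before the request is actually
given back.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical request state; not the kernel's struct urb. */
struct demo_urb {
	int status;	/* -EINPROGRESS while queued, updated on completion */
	bool in_flight;	/* true until the request has been given back */
};

/* Hypothetical unlink: reclaims the request, harmless if it is idle. */
void demo_unlink(struct demo_urb *urb)
{
	urb->in_flight = false;
}

/* Old pattern: skip the unlink when the status no longer says "queued". */
void demo_stop_conditional(struct demo_urb *urb)
{
	if (urb->status == -EINPROGRESS)
		demo_unlink(urb);
}

/* Fixed pattern: always unlink, whatever the status currently reads. */
void demo_stop_unconditional(struct demo_urb *urb)
{
	demo_unlink(urb);
}
```

With status already updated but in_flight still true, the conditional
variant leaves the request in flight, while the unconditional variant
reclaims it.
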
3 years ago usb: hso: no complaint about kmalloc failure
Oliver Neukum [Wed, 5 Aug 2020 12:07:08 +0000 (14:07 +0200)]
usb: hso: no complaint about kmalloc failure

If this fails, kmalloc() will print a report including
a stack trace. There is no need for a separate complaint.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years ago hso: fix bailout in error case of probe
Oliver Neukum [Wed, 5 Aug 2020 12:07:07 +0000 (14:07 +0200)]
hso: fix bailout in error case of probe

The driver tries to reuse code for disconnect in case
of a failed probe.
If resources need to be freed after an error in probe, the
netdev must not be freed because it has never been registered.
Fix it by telling the helper which path we are in.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years ago MIPS: SGI-IP27: always enable NUMA in Kconfig
Mike Rapoport [Wed, 5 Aug 2020 12:51:41 +0000 (15:51 +0300)]
MIPS: SGI-IP27: always enable NUMA in Kconfig

When a configuration has NUMA disabled and SGI_IP27 enabled, the build
fails:

  CC      kernel/bounds.s
  CC      arch/mips/kernel/asm-offsets.s
In file included from arch/mips/include/asm/topology.h:11,
                 from include/linux/topology.h:36,
                 from include/linux/gfp.h:9,
                 from include/linux/slab.h:15,
                 from include/linux/crypto.h:19,
                 from include/crypto/hash.h:11,
                 from include/linux/uio.h:10,
                 from include/linux/socket.h:8,
                 from include/linux/compat.h:15,
                 from arch/mips/kernel/asm-offsets.c:12:
include/linux/topology.h: In function 'numa_node_id':
arch/mips/include/asm/mach-ip27/topology.h:16:27: error: implicit declaration of function 'cputonasid'; did you mean 'cpu_vpe_id'? [-Werror=implicit-function-declaration]
 #define cpu_to_node(cpu) (cputonasid(cpu))
                           ^~~~~~~~~~
include/linux/topology.h:119:9: note: in expansion of macro 'cpu_to_node'
  return cpu_to_node(raw_smp_processor_id());
         ^~~~~~~~~~~
include/linux/topology.h: In function 'cpu_cpu_mask':
arch/mips/include/asm/mach-ip27/topology.h:19:7: error: implicit declaration of function 'hub_data' [-Werror=implicit-function-declaration]
      &hub_data(node)->h_cpus)
       ^~~~~~~~
include/linux/topology.h:210:9: note: in expansion of macro 'cpumask_of_node'
  return cpumask_of_node(cpu_to_node(cpu));
         ^~~~~~~~~~~~~~~
arch/mips/include/asm/mach-ip27/topology.h:19:21: error: invalid type argument of '->' (have 'int')
      &hub_data(node)->h_cpus)
                     ^~
include/linux/topology.h:210:9: note: in expansion of macro 'cpumask_of_node'
  return cpumask_of_node(cpu_to_node(cpu));
         ^~~~~~~~~~~~~~~

Before the switch from DISCONTIGMEM to SPARSEMEM,
CONFIG_NEED_MULTIPLE_NODES=y was always set because it was selected by
DISCONTIGMEM. Without DISCONTIGMEM it is possible to have SPARSEMEM
without NUMA for SGI_IP27, and since many things there rely on the custom
node definition, the build breaks.

As Thomas noted, "... there are right now too many places in IP27 code,
which assumes NUMA enabled", so the simplest solution is to always
enable NUMA for SGI-IP27 builds.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 397dc00e249e ("mips: sgi-ip27: switch from DISCONTIGMEM to SPARSEMEM")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
3 years ago MAINTAINERS: Remove myself as LED subsystem maintainer
Jacek Anaszewski [Tue, 4 Aug 2020 10:15:25 +0000 (12:15 +0200)]
MAINTAINERS: Remove myself as LED subsystem maintainer

I don't have enough time for reviewing patches and thus don't
want to be listed as a regular LED maintainer. Nonetheless I may still
give a review from time to time.

Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
3 years ago Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Wed, 5 Aug 2020 20:28:50 +0000 (13:28 -0700)]
Merge tag 'for-linus-hmm' of git://git./linux/kernel/git/rdma/rdma

Pull hmm updates from Jason Gunthorpe:
 "Ralph has been working on nouveau's use of hmm_range_fault() and
  migrate_vma() which resulted in this small series. It adds reporting
  of the page table order from hmm_range_fault() and some optimization
  of migrate_vma():

   - Report the size of the page table mapping out of hmm_range_fault().

     This makes it easier to establish a large/huge/etc mapping in the
     device's page table.

   - Allow devices to ignore the invalidations during migration in cases
     where the migration is not going to change pages.

     For instance migrating pages to a device does not require the
     device to invalidate pages already in the device.

   - Update nouveau and hmm_tests to use the above"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  mm/hmm/test: use the new migration invalidation
  nouveau/svm: use the new migration invalidation
  mm/notifier: add migration invalidation type
  mm/migrate: add a flags parameter to migrate_vma
  nouveau: fix storing invalid ptes
  nouveau/hmm: support mapping large sysmem pages
  nouveau: fix mapping 2MB sysmem pages
  nouveau/hmm: fault one page at a time
  mm/hmm: add tests for hmm_pfn_to_map_order()
  mm/hmm: provide the page mapping order in hmm_range_fault()
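
The pull message above says that reporting the page table mapping size
makes it easier to set up large mappings on the device side. A rough
sketch of the arithmetic involved, assuming 4 KiB base pages and that the
reported value is a power-of-two order; the demo_* helpers are
hypothetical and are not the kernel's HMM API.

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Bytes covered by a mapping of the given order: page size << order. */
size_t demo_map_bytes(unsigned int order)
{
	return (size_t)1 << (DEMO_PAGE_SHIFT + order);
}

/*
 * Largest order a device could use at addr: the address must be aligned
 * to the mapping size and the mapping must fit in the remaining length.
 */
unsigned int demo_usable_order(size_t addr, size_t len,
			       unsigned int reported_order)
{
	unsigned int order = reported_order;

	while (order > 0 &&
	       ((addr & (demo_map_bytes(order) - 1)) != 0 ||
		demo_map_bytes(order) > len))
		order--;
	return order;
}
```

Under these assumptions, order 9 corresponds to a 2 MiB mapping, and a
caller falls back to smaller orders when the address is not aligned or the
range is too short.
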

3 years ago Merge tag 'mmc-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Wed, 5 Aug 2020 20:23:24 +0000 (13:23 -0700)]
Merge tag 'mmc-v5.9' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC updates from Ulf Hansson:
 "MMC core:

   - Add a new host cap bit and a corresponding DT property, to support
     power cycling of the card by FW at system suspend/resume.

   - Fix clock rate setting for SDIO in SDR12/SDR25 speed-mode

   - Fix switch to 1/4-bit mode at system suspend/resume for SD-combo
     cards

   - Convert the mmc-pwrseq DT bindings to the json-schema

   - Always allow the card detect uevent to be consumed by userspace

  MMC host controllers:

   - Convert a few DT bindings to the json-schema

   - mtk-sd:
      - Add support for command queue through cqhci
      - Add support for the MT6779 variant

   - renesas_sdhi_internal_dmac:
      - Fix dma unmapping in the error path

   - sdhci_am654:
      - Add support for the AM65x PG2.0 variant
      - Extend support for phys/clocks

   - sdhci-cadence:
      - Drop incorrect HW tuning for SD mode

   - sdhci-msm:
      - Add support for interconnect bandwidth scaling
      - Enable internal voltage control
      - Enable low power state for pinctrls

   - sdhci-of-at91:
      - Ludovic Desroches hands over maintenance to Eugen Hristev

   - sdhci-pci-gli:
      - Improve clock handling for GL975x

   - sdhci-pci-o2micro:
      - Add HW tuning for SDR104 mode
      - Fix support for O2 host controller Seabird1"

* tag 'mmc-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (66 commits)
  mmc: mediatek: make function msdc_cqe_disable() static
  MAINTAINERS: mmc: sdhci-of-at91: handover maintenance to Eugen Hristev
  dt-bindings: mmc: mediatek: Add document for mt6779
  mmc: mediatek: command queue support
  mmc: mediatek: refine msdc timeout api
  mmc: mediatek: add MT6779 MMC driver support
  mmc: sdhci-pci-o2micro: Add HW tuning for SDR104 mode
  mmc: sdhci-pci-o2micro: Bug fix for O2 host controller Seabird1
  mmc: via-sdmmc: use generic power management
  memstick: jmb38x_ms: use generic power management
  mmc: sdhci-cadence: do not use hardware tuning for SD mode
  mmc: sdhci-pci-gli: Set SDR104's clock to 205MHz and enable SSC for GL975x
  mmc: cqhci: Fix a print format for the task descriptor
  mmc: sdhci-of-arasan: fix timings allocation code
  mmc: sdhci: Fix a potential uninitialized variable
  dt-bindings: mmc: renesas,sdhi: convert to YAML
  dt-bindings: mmc: convert arasan sdhci bindings to yaml
  mmc: sdhci: Fix potential null pointer access while accessing vqmmc
  mmc: core: Add MMC_CAP2_FULL_PWR_CYCLE_IN_SUSPEND
  dt-bindings: mmc: Add full-pwr-cycle-in-suspend property
  ...

3 years ago Merge tag 'hwmon-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck...
Linus Torvalds [Wed, 5 Aug 2020 20:13:57 +0000 (13:13 -0700)]
Merge tag 'hwmon-for-v5.9' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon updates from Guenter Roeck:
 "Highlights:
   - New driver for Sparx5 SoC temperature sensor
   - New driver for Corsair Commander Pro
   - MAX20710 support added to max20730 driver

  Enhancements:
   - max6697: Allow max6581 to create tempX_offset attributes
   - gsc (Gateworks System Controller): add 16bit pre-scaled voltage mode
   - adm1275: Enable adm1278 ADM1278_TEMP1_EN
   - dell-smm: Add Latitude 5480 to fan control whitelist

  Fixes:
   - adc128d818: Fix advanced configuration register init
   - pmbus/core: Use s64 instead of long for calculations to fix
     overflow issues with 32-bit architectures

  Plus various cleanups in several drivers"

* tag 'hwmon-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (32 commits)
  hwmon: (adc128d818) Fix advanced configuration register init
  hwmon: (axi-fan-control) remove duplicate macros
  hwmon: (i5k_amb, vt8231) Drop uses of pci_read_config_*() return value
  hwmon: (sparx5) Make symbol 's5_temp_match' static
  hwmon: (corsair-cpro) add reading pwm values
  hwmon: sparx5: Add Sparx5 SoC temperature driver
  dt-bindings: hwmon: Add Sparx5 temperature sensor
  hwmon: (tmp401) Replace HTTP links with HTTPS ones
  hwmon: (lm95234) Replace HTTP links with HTTPS ones
  hwmon: (lm90) Replace HTTP links with HTTPS ones
  hwmon: (k8temp) Replace HTTP links with HTTPS ones
  hwmon: (jc42) Replace HTTP links with HTTPS ones
  hwmon: (ina2xx) Replace HTTP links with HTTPS ones
  hwmon: (ina209) Replace HTTP links with HTTPS ones
  hwmon: Replace HTTP links with HTTPS ones
  docs: hwmon: Replace HTTP links with HTTPS ones
  hwmon: (adm1025) Replace HTTP links with HTTPS ones
  hwmon: add Corsair Commander Pro driver
  hwmon: (max6697) Allow max6581 to create tempX_offset
  hwmon: (tmmp513) Replace HTTP links with HTTPS links
  ...

3 years agoMerge tag 'devicetree-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/robh...
Linus Torvalds [Wed, 5 Aug 2020 20:02:45 +0000 (13:02 -0700)]
Merge tag 'devicetree-for-5.9' of git://git./linux/kernel/git/robh/linux

Pull Devicetree updates from Rob Herring:

 - Improve device links cycle detection and breaking. Add more bindings
   for device link dependencies.

 - Refactor parsing 'no-map' in __reserved_mem_alloc_size()

 - Improve DT unittest 'ranges' and 'dma-ranges' test case to check
   differing cell sizes

 - Various http to https link conversions

 - Add a schema check to prevent 'syscon' from being used by itself
   without a more specific compatible

 - A bunch more DT binding conversions to schema

* tag 'devicetree-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (55 commits)
  of: reserved-memory: remove duplicated call to of_get_flat_dt_prop() for no-map node
  of: unittest: Use bigger address cells to catch parser regressions
  dt-bindings: memory-controllers: Convert mmdc to json-schema
  dt-bindings: mtd: Convert imx nand to json-schema
  dt-bindings: mtd: Convert gpmi nand to json-schema
  dt-bindings: iio: io-channel-mux: Fix compatible string in example code
  of: property: Add device link support for pinctrl-0 through pinctrl-8
  of: property: Add device link support for multiple DT bindings
  dt-bindings: phy: ti: phy-gmii-sel: convert bindings to json-schema
  dt-bindings: mux: mux.h: drop a duplicated word
  dt-bindings: misc: Convert olpc,xo1.75-ec to json-schema
  dt-bindings: aspeed-lpc: Replace HTTP links with HTTPS ones
  dt-bindings: drm/bridge: Replace HTTP links with HTTPS ones
  drm/tilcdc: Replace HTTP links with HTTPS ones
  dt-bindings: iommu: renesas,ipmmu-vmsa: Add r8a774e1 support
  dt-bindings: fpga: Replace HTTP links with HTTPS ones
  dt-bindings: virtio: Replace HTTP links with HTTPS ones
  dt-bindings: media: imx274: Add optional input clock and supplies
  dt-bindings: i2c-gpio: Use 'deprecated' keyword on deprecated properties
  dt-bindings: interrupt-controller: Fix typos in loongson,liointc.yaml
  ...

3 years agoMerge tag 'gpio-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux...
Linus Torvalds [Wed, 5 Aug 2020 19:56:27 +0000 (12:56 -0700)]
Merge tag 'gpio-v5.9-1' of git://git./linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for the v5.9 kernel cycle.

  There is nothing too exciting in it, but a new macro that fixes a
  build failure on a minor ARM32 platform that appeared yesterday is
  part of it so we better merge it.

  Core changes:

   - Introduce the for_each_requested_gpio() macro to help in dependent
     code all over the place. Also patch a few locations to use it while
     we are at it.

   - Split out the sysfs code into its own file.

   - Split out the character device code into its own file, then make a
     set of refactorings and improvements to this code. We are setting
     the stage to revamp the userspace API a bit in the next cycle.

   - Fix a whole slew of kerneldoc that was wrong or missing.

  New drivers:

   - The PCA953x driver now supports the PCAL9535.

  Driver improvements:

   - A host of incremental modernizations and improvements to the
     PCA953x driver.

   - Incremental improvements to the Xilinx Zynq driver.

   - Some improvements to the GPIO aggregator driver.

   - I ran all over the place switching all threaded and other drivers
     requesting their own IRQ while using the core GPIO IRQ helpers to
     pass the GPIO irq chip as a template instead of calling the
     explicit set-up functions. Next merge window we may retire the old
     code altogether"

* tag 'gpio-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (97 commits)
  gpio: wcove: Request IRQ after all initialisation done
  gpio: crystalcove: Free IRQ on error path
  gpio: pca953x: Request IRQ after all initialisation done
  gpio: don't use same lockdep class for all devm_gpiochip_add_data users
  gpio: max732x: Use irqchip template
  gpio: stmpe: Move chip registration
  gpio: rcar: Use irqchip template
  gpio: regmap: fix type clash
  gpio: Correct kernel-doc inconsistency
  gpio: pci-idio-16: Use irqchip template
  gpio: pcie-idio-24: Use irqchip template
  gpio: 104-idio-16: Use irqchip template
  gpio: 104-idi-48: Use irqchip template
  gpio: 104-dio-48e: Use irqchip template
  gpio: ws16c48: Use irqchip template
  gpio: omap: improve coding style for pin config flags
  gpio: dln2: Use irqchip template
  gpio: sch: Add a blank line between declaration and code
  gpio: sch: changed every 'unsigned' to 'unsigned int'
  gpio: ich: changed every 'unsigned' to 'unsigned int'
  ...

3 years agorandom: random.h should include archrandom.h, not the other way around
Linus Torvalds [Wed, 5 Aug 2020 19:39:48 +0000 (12:39 -0700)]
random: random.h should include archrandom.h, not the other way around

This is hopefully the final piece of the crazy puzzle with random.h
dependencies.

And by "hopefully" I obviously mean "Linus is a hopeless optimist".

Reported-and-tested-by: Daniel Díaz <daniel.diaz@linaro.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
Stefano Brivio [Wed, 5 Aug 2020 13:39:31 +0000 (15:39 +0200)]
ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM

On architectures defining _HAVE_ARCH_IPV6_CSUM, we get
csum_ipv6_magic() defined by means of arch checksum.h headers. On
other architectures, we actually need to include net/ip6_checksum.h
to be able to use it.

Without this include, building with defconfig breaks at least for
s390.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests/net: relax cpu affinity requirement in msg_zerocopy test
Willem de Bruijn [Wed, 5 Aug 2020 08:40:45 +0000 (04:40 -0400)]
selftests/net: relax cpu affinity requirement in msg_zerocopy test

The msg_zerocopy test pins the sender and receiver threads to separate
cores to reduce variance between runs.

But it hardcodes the cores and skips core 0, so it fails on machines
with the selected cores offline, or simply fewer cores.

The test mainly gives code coverage in automated runs. The throughput
of zerocopy ('-z') and non-zerocopy runs is logged for manual
inspection.

Continue even when sched_setaffinity fails. Just log to warn anyone
interpreting the data.
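
A minimal sketch of the non-fatal pinning described above. The helper
name try_setcpu() and the exact wording of the warnings are assumptions
made for illustration; the selftest's own code may differ.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/*
 * Hypothetical helper: try to pin the calling thread to `cpu`.
 * Returns 0 on success and -1 on failure. It never aborts, so the
 * caller can keep running the test without affinity.
 */
static int try_setcpu(int cpu)
{
	cpu_set_t mask;

	/* Reject CPU numbers that do not fit in a cpu_set_t. */
	if (cpu < 0 || cpu >= CPU_SETSIZE) {
		fprintf(stderr, "cpu %d: out of range, not pinning\n", cpu);
		return -1;
	}

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	if (sched_setaffinity(0, sizeof(mask), &mask)) {
		/* Warn only: throughput numbers may vary more between runs. */
		fprintf(stderr, "cpu %d: unable to pin, may increase variance\n",
			cpu);
		return -1;
	}

	return 0;
}
```

A caller would ignore the return value for control flow and use it only
to decide whether the logged throughput can be compared across runs.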

Fixes: 07b65c5b31ce ("test: add msg_zerocopy test")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomptcp: be careful on subflow creation
Paolo Abeni [Tue, 4 Aug 2020 16:31:06 +0000 (18:31 +0200)]
mptcp: be careful on subflow creation

Nicolas reported the following oops:

[ 1521.392541] BUG: kernel NULL pointer dereference, address: 00000000000000c0
[ 1521.394189] #PF: supervisor read access in kernel mode
[ 1521.395376] #PF: error_code(0x0000) - not-present page
[ 1521.396607] PGD 0 P4D 0
[ 1521.397156] Oops: 0000 [#1] SMP PTI
[ 1521.398020] CPU: 0 PID: 22986 Comm: kworker/0:2 Not tainted 5.8.0-rc4+ #109
[ 1521.399618] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 1521.401728] Workqueue: events mptcp_worker
[ 1521.402651] RIP: 0010:mptcp_subflow_create_socket+0xf1/0x1c0
[ 1521.403954] Code: 24 08 89 44 24 04 48 8b 7a 18 e8 2a 48 d4 ff 8b 44 24 04 85 c0 75 7a 48 8b 8b 78 02 00 00 48 8b 54 24 08 48 8d bb 80 00 00 00 <48> 8b 89 c0 00 00 00 48 89 8a c0 00 00 00 48 8b 8b 78 02 00 00 8b
[ 1521.408201] RSP: 0000:ffffabc4002d3c60 EFLAGS: 00010246
[ 1521.409433] RAX: 0000000000000000 RBX: ffffa0b9ad8c9a00 RCX: 0000000000000000
[ 1521.411096] RDX: ffffa0b9ae78a300 RSI: 00000000fffffe01 RDI: ffffa0b9ad8c9a80
[ 1521.412734] RBP: ffffa0b9adff2e80 R08: ffffa0b9af02d640 R09: ffffa0b9ad923a00
[ 1521.414333] R10: ffffabc4007139f8 R11: fefefefefefefeff R12: ffffabc4002d3cb0
[ 1521.415918] R13: ffffa0b9ad91fa58 R14: ffffa0b9ad8c9f9c R15: 0000000000000000
[ 1521.417592] FS:  0000000000000000(0000) GS:ffffa0b9af000000(0000) knlGS:0000000000000000
[ 1521.419490] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1521.420839] CR2: 00000000000000c0 CR3: 000000002951e006 CR4: 0000000000160ef0
[ 1521.422511] Call Trace:
[ 1521.423103]  __mptcp_subflow_connect+0x94/0x1f0
[ 1521.425376]  mptcp_pm_create_subflow_or_signal_addr+0x200/0x2a0
[ 1521.426736]  mptcp_worker+0x31b/0x390
[ 1521.431324]  process_one_work+0x1fc/0x3f0
[ 1521.432268]  worker_thread+0x2d/0x3b0
[ 1521.434197]  kthread+0x117/0x130
[ 1521.435783]  ret_from_fork+0x22/0x30

on some unconventional configuration.

The MPTCP protocol is trying to create a subflow for an
unaccepted server socket. That is allowed by the RFC, even
if subflow creation will likely fail.
Unaccepted sockets still have a NULL sk_socket field;
avoid the issue by failing earlier.
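
The early bail-out can be sketched in plain C as follows. The struct
and function names below are simplified stand-ins, not the kernel's
definitions, and the -EINVAL return value is an assumption made for
this illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's struct socket and struct sock. */
struct socket_stub {
	int placeholder;
};

struct sock_stub {
	struct socket_stub *sk_socket;	/* NULL until the socket is accepted */
};

/*
 * Hypothetical model of subflow creation: check sk_socket before any
 * code dereferences it, and fail early if it is still NULL.
 */
static int subflow_create_sketch(const struct sock_stub *sk)
{
	/* Un-accepted server socket: bail out instead of oopsing later. */
	if (!sk->sk_socket)
		return -EINVAL;

	/* ... the real code would go on to use sk->sk_socket here ... */
	return 0;
}
```

With this check in place, the worker sees an error return for the
unaccepted socket instead of a NULL pointer dereference.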

Reported-and-tested-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Fixes: 7d14b0d2b9b3 ("mptcp: set correct vfs info for subflows")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'selftests-rtnetlink-Fix-for-false-negative-return-values'
David S. Miller [Wed, 5 Aug 2020 19:23:29 +0000 (12:23 -0700)]
Merge branch 'selftests-rtnetlink-Fix-for-false-negative-return-values'

Po-Hsu Lin says:

====================
selftests: rtnetlink: Fix for false-negative return values

This patchset addresses the false-negative return value issue
caused by the following:
  1. The return value "ret" in this script is reset to 0 at
     the beginning of each sub-test in rtnetlink.sh, so this
     rtnetlink test always passes if the last sub-test has passed.
  2. The test results from two sub-tests in kci_test_encap() were not
     being processed, so they did not affect the final test result
     of this test.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests: rtnetlink: make kci_test_encap() return sub-test result
Po-Hsu Lin [Tue, 4 Aug 2020 10:18:03 +0000 (18:18 +0800)]
selftests: rtnetlink: make kci_test_encap() return sub-test result

kci_test_encap() is actually composed of two different sub-tests,
kci_test_encap_vxlan() and kci_test_encap_fou().

Therefore we should check the test result of these two in
kci_test_encap() to let the script be aware of the pass / fail status.
Otherwise it will generate a false-negative result like the one below:
    $ sudo ./test.sh
    PASS: policy routing
    PASS: route get
    PASS: preferred_lft addresses have expired
    PASS: promote_secondaries complete
    PASS: tc htb hierarchy
    PASS: gre tunnel endpoint
    PASS: gretap
    PASS: ip6gretap
    PASS: erspan
    PASS: ip6erspan
    PASS: bridge setup
    PASS: ipv6 addrlabel
    PASS: set ifalias 5b193daf-0a08-46d7-af2c-e7aadd422ded for test-dummy0
    PASS: vrf
    PASS: vxlan
    FAIL: can't add fou port 7777, skipping test
    PASS: macsec
    PASS: bridge fdb get
    PASS: neigh get
    $ echo $?
    0

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests: rtnetlink: correct the final return value for the test
Po-Hsu Lin [Tue, 4 Aug 2020 10:18:02 +0000 (18:18 +0800)]
selftests: rtnetlink: correct the final return value for the test

The return value "ret" is reset to 0 at the beginning of each
sub-test in rtnetlink.sh, so this test always passes if the
last sub-test has passed:
    $ sudo ./rtnetlink.sh
    PASS: policy routing
    PASS: route get
    PASS: preferred_lft addresses have expired
    PASS: promote_secondaries complete
    PASS: tc htb hierarchy
    PASS: gre tunnel endpoint
    PASS: gretap
    PASS: ip6gretap
    PASS: erspan
    PASS: ip6erspan
    PASS: bridge setup
    PASS: ipv6 addrlabel
    PASS: set ifalias a39ee707-e36b-41d3-802f-63179ed4d580 for test-dummy0
    PASS: vrf
    PASS: vxlan
    FAIL: can't add fou port 7777, skipping test
    PASS: macsec
    PASS: ipsec
    3,7c3,7
    < sa[0]    spi=0x00000009 proto=0x32 salt=0x64636261 crypt=1
    < sa[0]    key=0x31323334 35363738 39303132 33343536
    < sa[1] rx ipaddr=0x00000000 00000000 00000000 c0a87b03
    < sa[1]    spi=0x00000009 proto=0x32 salt=0x64636261 crypt=1
    < sa[1]    key=0x31323334 35363738 39303132 33343536
    ---
    > sa[0]    spi=0x00000009 proto=0x32 salt=0x61626364 crypt=1
    > sa[0]    key=0x34333231 38373635 32313039 36353433
    > sa[1] rx ipaddr=0x00000000 00000000 00000000 037ba8c0
    > sa[1]    spi=0x00000009 proto=0x32 salt=0x61626364 crypt=1
    > sa[1]    key=0x34333231 38373635 32313039 36353433
    FAIL: ipsec_offload incorrect driver data
    FAIL: ipsec_offload
    PASS: bridge fdb get
    PASS: neigh get
    $ echo $?
    0

Make "ret" become a local variable for all sub-tests.
Also, check the sub-test results in kci_test_rtnl() and return the
final result for this test.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: sja1105: use detected device id instead of DT one on mismatch
Vladimir Oltean [Mon, 3 Aug 2020 16:48:23 +0000 (19:48 +0300)]
net: dsa: sja1105: use detected device id instead of DT one on mismatch

Although we can detect the chip revision 100% at runtime, it is useful
to specify it in the device tree compatible string too, because
otherwise there would be no way to assess the correctness of device tree
bindings statically, without booting a board (only some switch versions
have internal RGMII delays and/or an SGMII port).

But for testing the P/Q/R/S support, what I have is a reworked board
with the SJA1105T replaced by a pin-compatible SJA1105Q, and I don't
want to keep a separate device tree blob just for this one-off board.
Since just the chip has been replaced, its RGMII delay setup is
inherently the same (meaning: delays added by the PHY on the slave
ports, and by PCB traces on the fixed-link CPU port).

For this board, I'd rather have the driver shout at me, but go ahead and
use what it found even if it doesn't match what it's been told is there.

[    2.970826] sja1105 spi0.1: Device tree specifies chip SJA1105T but found SJA1105Q, please fix it!
[    2.980010] sja1105 spi0.1: Probed switch chip: SJA1105Q
[    3.005082] sja1105 spi0.1: Enabled switch tagging
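
The "warn, then trust the hardware" policy can be sketched as below.
The table, the device id values (placeholders, not the real register
contents) and the function name pick_chip() are all assumptions made
for illustration and are not taken from the driver.

```c
#include <stddef.h>
#include <stdio.h>

struct chip_info {
	unsigned int device_id;	/* placeholder ids, not real hardware values */
	const char *name;
};

/* Hypothetical table of supported chips. */
static const struct chip_info known_chips[] = {
	{ 0x1, "SJA1105T" },
	{ 0x2, "SJA1105Q" },
};

/*
 * Return the chip that was actually detected. If it differs from the
 * one named in the device tree, complain but still use the detected
 * chip. Return NULL if the detected id is not in the table.
 */
static const struct chip_info *pick_chip(const struct chip_info *from_dt,
					  unsigned int detected_id)
{
	size_t i;

	for (i = 0; i < sizeof(known_chips) / sizeof(known_chips[0]); i++) {
		const struct chip_info *found = &known_chips[i];

		if (found->device_id != detected_id)
			continue;

		if (found != from_dt)
			fprintf(stderr,
				"Device tree specifies chip %s but found %s, please fix it!\n",
				from_dt->name, found->name);
		return found;
	}

	return NULL;
}
```

The device tree entry is only consulted for the warning; every later
decision is based on the chip returned by the lookup.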

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>