hugetlb: remove prep_compound_huge_page cleanup
author: Mike Kravetz <mike.kravetz@oracle.com>
Thu, 1 Jul 2021 01:48:31 +0000 (18:48 -0700)
committer: Linus Torvalds <torvalds@linux-foundation.org>
Thu, 1 Jul 2021 03:47:26 +0000 (20:47 -0700)
Patch series "Fix prep_compound_gigantic_page ref count adjustment".

These patches address the possible race between
prep_compound_gigantic_page and __page_cache_add_speculative as described
by Jann Horn in [1].

The first patch simply removes the unnecessary/obsolete helper routine
prep_compound_huge_page to make the actual fix a little simpler.

The second patch is the actual fix and has a detailed explanation in the
commit message.

This potential issue has existed for almost 10 years and I am unaware of
anyone actually hitting the race.  I did not cc stable, but would be happy
to squash the patches and send to stable if anyone thinks that is a good
idea.

[1] https://lore.kernel.org/linux-mm/CAG48ez23q0Jy9cuVnwAe7t_fdhMk2S7N5Hdi-GLcCeq5bsfLxw@mail.gmail.com/

This patch (of 2):

I could not think of a reliable way to recreate the issue for testing.
Rather, I 'simulated errors' to exercise all the error paths.

The routine prep_compound_huge_page is a simple wrapper that calls either
prep_compound_gigantic_page or prep_compound_page.  However, its only
caller is gather_bootmem_prealloc, which only processes gigantic pages.
Eliminate the routine and call prep_compound_gigantic_page directly.
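
The reasoning above can be checked with a small userspace model.  This is
only a sketch, not kernel code: MAX_ORDER is assumed to be 11 here (the real
value depends on the kernel configuration), and the names old_wrapper_path,
order_is_gigantic and wrapper_is_redundant are hypothetical stand-ins.  The
model mirrors the order test in the removed wrapper and shows that, for
every order the gigantic check accepts, the wrapper could only ever select
the gigantic path.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for illustration; the real MAX_ORDER is configuration dependent. */
#define MAX_ORDER 11

/* Which preparation routine the removed wrapper would have called. */
enum prep_path { PREP_COMPOUND_PAGE, PREP_COMPOUND_GIGANTIC_PAGE };

/* Hypothetical model of the order test in the removed prep_compound_huge_page(). */
static enum prep_path old_wrapper_path(unsigned int order)
{
	if (order > (MAX_ORDER - 1))
		return PREP_COMPOUND_GIGANTIC_PAGE;
	return PREP_COMPOUND_PAGE;
}

/* Hypothetical model of the gigantic check: order >= MAX_ORDER. */
static bool order_is_gigantic(unsigned int order)
{
	return order >= MAX_ORDER;
}

/*
 * Return true if every gigantic order from MAX_ORDER up to highest_order
 * dispatches to the gigantic path, i.e. the else branch of the wrapper
 * is never taken for the orders its only caller passes in.
 */
static bool wrapper_is_redundant(unsigned int highest_order)
{
	for (unsigned int order = MAX_ORDER; order <= highest_order; order++) {
		if (!order_is_gigantic(order))
			return false;
		if (old_wrapper_path(order) != PREP_COMPOUND_GIGANTIC_PAGE)
			return false;
	}
	return true;
}

/* Simple self-check that a caller could run from main(). */
static void self_check(void)
{
	assert(wrapper_is_redundant(MAX_ORDER + 8));
}
```

Because "order > (MAX_ORDER - 1)" and "order >= MAX_ORDER" are the same test
for integer orders, the prep_compound_page branch was dead code for this
caller.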

Link: https://lkml.kernel.org/r/20210622021423.154662-1-mike.kravetz@oracle.com
Link: https://lkml.kernel.org/r/20210622021423.154662-2-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Youquan Song <youquan.song@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b14f4d1..8048763 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1320,8 +1320,6 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
        return alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask);
 }
 
-static void prep_new_huge_page(struct hstate *h, struct page *page, int nid);
-static void prep_compound_gigantic_page(struct page *page, unsigned int order);
 #else /* !CONFIG_CONTIG_ALLOC */
 static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
                                        int nid, nodemask_t *nodemask)
@@ -2759,16 +2757,10 @@ found:
        return 1;
 }
 
-static void __init prep_compound_huge_page(struct page *page,
-               unsigned int order)
-{
-       if (unlikely(order > (MAX_ORDER - 1)))
-               prep_compound_gigantic_page(page, order);
-       else
-               prep_compound_page(page, order);
-}
-
-/* Put bootmem huge pages into the standard lists after mem_map is up */
+/*
+ * Put bootmem huge pages into the standard lists after mem_map is up.
+ * Note: This only applies to gigantic (order > MAX_ORDER) pages.
+ */
 static void __init gather_bootmem_prealloc(void)
 {
        struct huge_bootmem_page *m;
@@ -2777,20 +2769,19 @@ static void __init gather_bootmem_prealloc(void)
                struct page *page = virt_to_page(m);
                struct hstate *h = m->hstate;
 
+               VM_BUG_ON(!hstate_is_gigantic(h));
                WARN_ON(page_count(page) != 1);
-               prep_compound_huge_page(page, huge_page_order(h));
+               prep_compound_gigantic_page(page, huge_page_order(h));
                WARN_ON(PageReserved(page));
                prep_new_huge_page(h, page, page_to_nid(page));
                put_page(page); /* free it into the hugepage allocator */
 
                /*
-                * If we had gigantic hugepages allocated at boot time, we need
-                * to restore the 'stolen' pages to totalram_pages in order to
-                * fix confusing memory reports from free(1) and another
-                * side-effects, like CommitLimit going negative.
+                * We need to restore the 'stolen' pages to totalram_pages
+                * in order to fix confusing memory reports from free(1) and
+                * other side-effects, like CommitLimit going negative.
                 */
-               if (hstate_is_gigantic(h))
-                       adjust_managed_page_count(page, pages_per_huge_page(h));
+               adjust_managed_page_count(page, pages_per_huge_page(h));
                cond_resched();
        }
 }