mm/page_owner: record and dump free_pid and free_tgid
author Barry Song <21cnbao@gmail.com>
Tue, 14 Nov 2023 03:42:02 +0000 (16:42 +1300)
committer Andrew Morton <akpm@linux-foundation.org>
Mon, 11 Dec 2023 00:51:40 +0000 (16:51 -0800)
While investigating complex memory allocation and free bugs, especially in
multi-process and multi-threaded cases, I have repeatedly found that the
free stack alone isn't sufficient, because a page can be freed by a process
or thread other than the one that allocated it.  Those other processes and
threads often have exactly the same free stack as the one that allocated
the page.  So the free stack by itself cannot tell us who freed the page,
even though page_owner already records the pid and tgid of the allocating
task.  This often makes such bugs hard to investigate.

So this patch records the pid and tgid of the freeing task in page_owner
and dumps them along with the free stack, so that we can easily tell
whether a page was freed across processes or threads.
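
As a rough illustration of how the recorded values can be read, the
following userspace sketch compares the allocation-side pid/tgid with the
free-side pid/tgid and classifies the free.  It is not part of the patch:
struct owner_ids, enum free_kind, classify_free() and free_kind_name() are
hypothetical names used only for this example, and the sketch assumes the
page has actually been freed and that the IDs were not reused in between.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical userspace mirror of the IDs page_owner keeps per page. */
struct owner_ids {
	pid_t pid;		/* thread that allocated the page */
	pid_t tgid;		/* thread group (process) that allocated it */
	pid_t free_pid;		/* thread that freed the page */
	pid_t free_tgid;	/* thread group (process) that freed it */
};

enum free_kind {
	FREE_SAME_THREAD,
	FREE_CROSS_THREAD,
	FREE_CROSS_PROCESS,
};

/* Compare tgid first: a different process implies a different thread. */
static enum free_kind classify_free(const struct owner_ids *o)
{
	assert(o != NULL);

	if (o->tgid != o->free_tgid)
		return FREE_CROSS_PROCESS;
	if (o->pid != o->free_pid)
		return FREE_CROSS_THREAD;
	return FREE_SAME_THREAD;
}

/* Human-readable label for a classification result. */
static const char *free_kind_name(enum free_kind kind)
{
	switch (kind) {
	case FREE_SAME_THREAD:
		return "same thread";
	case FREE_CROSS_THREAD:
		return "cross-thread";
	case FREE_CROSS_PROCESS:
		return "cross-process";
	}
	return "unknown";
}
```

For example, a record with pid 101, tgid 100, free_pid 102 and
free_tgid 100 would be classified as FREE_CROSS_THREAD, since both
threads belong to thread group 100.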

Link: https://lkml.kernel.org/r/20231114034202.73098-1-v-songbaohua@oppo.com
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Cc: Audra Mitchell <audra@redhat.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kassey Li <quic_yingangl@quicinc.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/page_owner.c

index 4f13ce7..e7eba76 100644
@@ -32,6 +32,8 @@ struct page_owner {
        char comm[TASK_COMM_LEN];
        pid_t pid;
        pid_t tgid;
+       pid_t free_pid;
+       pid_t free_tgid;
 };
 
 static bool page_owner_enabled __initdata;
@@ -152,6 +154,8 @@ void __reset_page_owner(struct page *page, unsigned short order)
                page_owner = get_page_owner(page_ext);
                page_owner->free_handle = handle;
                page_owner->free_ts_nsec = free_ts_nsec;
+               page_owner->free_pid = current->pid;
+               page_owner->free_tgid = current->tgid;
                page_ext = page_ext_next(page_ext);
        }
        page_ext_put(page_ext);
@@ -253,6 +257,8 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old)
        new_page_owner->handle = old_page_owner->handle;
        new_page_owner->pid = old_page_owner->pid;
        new_page_owner->tgid = old_page_owner->tgid;
+       new_page_owner->free_pid = old_page_owner->free_pid;
+       new_page_owner->free_tgid = old_page_owner->free_tgid;
        new_page_owner->ts_nsec = old_page_owner->ts_nsec;
        new_page_owner->free_ts_nsec = old_page_owner->ts_nsec;
        strcpy(new_page_owner->comm, old_page_owner->comm);
@@ -495,7 +501,8 @@ void __dump_page_owner(const struct page *page)
        if (!handle) {
                pr_alert("page_owner free stack trace missing\n");
        } else {
-               pr_alert("page last free stack trace:\n");
+               pr_alert("page last free pid %d tgid %d stack trace:\n",
+                         page_owner->free_pid, page_owner->free_tgid);
                stack_depot_print(handle);
        }