prctl: Allow local CAP_SYS_ADMIN changing exe_file
authorKirill Tkhai <ktkhai@virtuozzo.com>
Fri, 12 May 2017 14:33:36 +0000 (17:33 +0300)
committerEric W. Biederman <ebiederm@xmission.com>
Thu, 20 Jul 2017 12:46:07 +0000 (07:46 -0500)
commit4d28df6152aa3ffd0ad0389bb1d31f5b1c1c2b1f
tree4f756fb4aadf88d6760420ed2b30bd01ea5ac076
parent64db4c7f4c1dde23d47b60f887000e28f82b268f
prctl: Allow local CAP_SYS_ADMIN changing exe_file

During checkpointing and restore of userspace tasks
we bumped into the situation, that it's not possible
to restore the tasks, which user namespace does not
have uid 0 or gid 0 mapped.

People create user namespace mappings like they want,
and there is no a limitation on obligatory uid and gid
"must be mapped". So, if there is no uid 0 or gid 0
in the mapping, it's impossible to restore mm->exe_file
of the processes belonging to this user namespace.

Also, there is no a workaround. It's impossible
to create a temporary uid/gid mapping, because
only one write to /proc/[pid]/uid_map and gid_map
is allowed during a namespace lifetime.
If there is an entry, then no more mapings can't be
written. If there isn't an entry, we can't write
there too, otherwise user task won't be able
to do that in the future.

The patch changes the check, and looks for CAP_SYS_ADMIN
instead of zero uid and gid. This allows to restore
a task independently of its user namespace mappings.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Serge Hallyn <serge@hallyn.com>
CC: "Eric W. Biederman" <ebiederm@xmission.com>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Michal Hocko <mhocko@suse.com>
CC: Andrei Vagin <avagin@openvz.org>
CC: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
CC: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
kernel/sys.c