Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options



On Thu, May 28, 2015 at 12:42 PM, Eric W. Biederman
<ebiederm xmission com> wrote:
Andy Lutomirski <luto amacapital net> writes:

On Thu, May 28, 2015 at 10:01 AM, Alexander Larsson <alexl redhat com> wrote:
On Thu, 2015-05-28 at 11:44 -0500, Eric W. Biederman wrote:
Andy Lutomirski <luto amacapital net> writes:

On Thu, Apr 2, 2015 at 11:27 AM, Eric W. Biederman
<ebiederm xmission com> wrote:
Andy Lutomirski <luto amacapital net> writes:

On Thu, Apr 2, 2015 at 7:29 AM, Alexander Larsson <alexl redhat com> wrote:
On Thu, 2015-04-02 at 07:06 -0700, Andy Lutomirski wrote:
On Thu, Apr 2, 2015 at 3:12 AM, James Bottomley
<James Bottomley hansenpartnership com> wrote:
On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson wrote:
On tis, 2015-03-31 at 17:08 +0300, James Bottomley wrote:
On Tue, 2015-03-31 at 06:59 -0700, Andy Lutomirski wrote:

I don't think that this is correct.  That user can already create a
nested userns and map themselves as 0 inside it.  Then they can mount
devpts.

I don't mind if they create a container and control the isolated ttys
in that sub container in the VPS; that's fine.  I do mind if they get
access to the ttys in the VPS.

If you can convince me (and the rest of Linux) that the tty subsystem
should be mountable by an unprivileged user generally, then what you
propose is OK.

That is controlled by the general rights to mount stuff. I.e. unless
you have CAP_SYS_ADMIN in the VPS container you will not be able to
mount devpts there. You can only do it in a subcontainer where you got
permissions to mount via using user namespaces.

OK let me try again.  Fine, if you want to speak capabilities, you've
given a non-root user an unexpected capability (the capability of
creating a ptmx device).  But you haven't used a capability separation
to do this, you've just hard coded it via a mount parameter mechanism.

If you want to do this thing, do it properly, so it's acceptable to the
whole of Linux, not a special corner case for one particular type of
container.

Security breaches are created when people code in special, little used,
corner cases because they don't get as thoroughly tested and inspected
as generally applicable mechanisms.

What you want is to be able to use the tty subsystem as a non root
user: fine, but set that up globally, don't hide it in containers so a
lot fewer people care.

I tend to agree, and not just for the tty subsystem.  This is an
attack surface issue.  With unprivileged user namespaces, unprivileged
users can create mount namespaces (probably a good thing for bind
mounts, etc), network namespaces (reasonably safe by themselves),
network interfaces and iptables rules (scary), fresh
instances/superblocks of some filesystems (scariness depends on the fs
-- tmpfs is probably fine), and more.

I think we should have real controls for this, and this is mostly
Eric's domain.  Eric?  A silly issue that sometimes prevents devpts
from being mountable isn't a real control, though.

I thought the controls for limiting how much of the userspace API
an application could use were called seccomp and seccomp2.

Do we need something like a PAM module so that we can set up these
controls during login?

I'm honestly surprised that non-root is allowed to mount things in
general with user namespaces. This was long disabled for non-root use
in Fedora, but it is now enabled.

For instance, using loopback mounted files you could probably attack
some of the less well tested filesystem implementations by feeding them
fuzzed data.


You actually can't do that right now.  Filesystems have to opt in to
being mounted in unprivileged user namespaces, and no filesystems with
backing stores have opted in.  devpts has, but it's buggy without this
patch IMO.

Arguably you should use two user namespaces: the first to do what you
want to do as root, the second to run as the uid you want to run as.

Anyway, I don't see how this affects devpts though. If you're running
in a container (or uncontained), as a regular user with no mount
capabilities, you can already mount a devpts filesystem if you create a
subcontainer with user namespaces and map your uid to 0 in the
subcontainer. Then you get a new ptmx device that you can do whatever
you want with. The mount option would let you do the same, except be
your regular uid in the subcontainer.

The only difference outside of the subcontainer arises if the outer
container has no uid 0 mapped, yet the user has CAP_SYS_ADMIN rights in
that container. Then he can mount devpts in the outer container, where
before he could only mount it in an inner container.


Agreed.  Also, devpts doesn't seem scary at all to me from a userns
perspective.  Regular users on normal systems can already use ptmx,
and AFAICS basically all of the attack surface is already available
through the normal /dev/ptmx node.

My only real take is that there are a lot more places that you need to
tweak beyond devpts.  So this patch seemed lacking and boring.

Beyond that, until I get the mount namespace sorted out things are
pretty much in a feature freeze because I can't multitask well enough
to do complicated patches and take feature patches.


Eric, do you think you have time now to take a look at this patch?

I am much closer.  Escaping bind mounts is still not yet fixed but I
have code that almost works.

My gut feel still says that two user namespaces, one where your 0 is
mapped to your uid and a second where your uid is identity mapped, is
the preferable configuration, and makes this patch unnecessary.

I don't really understand this. My use case is that I want a desktop app
sandbox; it should run as the actual user that is running the graphical
session, mapped to its real uid. In this namespace I want a /dev/pts so
that I can e.g. shell out to ssh and feed it a password on the tty
prompt or similar. And I don't want to bind-mount in the host /dev/pts,
because then the sandbox can read from the ttys of other apps.

Where does the second namespace enter into this?


I think Eric is suggesting making a user namespace that maps your uid
as 0, then making a mount namespace and mounting devpts, then making
*another* user namespace that maps your uid (seen as 0) back to
whatever nonzero number you want.
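
The sequence described above can be sketched roughly as follows. This is
a minimal sketch, not code from the thread: the function names
(format_id_map, write_file, enter_userns, setup_private_devpts) are
hypothetical, the devpts mount options are only one plausible choice,
and it assumes a kernel that permits unprivileged user namespaces and a
single-threaded caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <unistd.h>

/* Format a one-entry id map line: "<inside> <outside> 1\n". */
int format_id_map(char *buf, size_t len, unsigned inside, unsigned outside)
{
    return snprintf(buf, len, "%u %u 1\n", inside, outside);
}

/* Write a string to an existing file; 0 on success, -1 on error. */
int write_file(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t want = (ssize_t)strlen(text);
    ssize_t got = write(fd, text, (size_t)want);
    close(fd);
    return got == want ? 0 : -1;
}

/* Enter a new user namespace in which (in_uid, in_gid) stand for the
 * ids the caller currently sees as (out_uid, out_gid). */
int enter_userns(unsigned in_uid, unsigned out_uid,
                 unsigned in_gid, unsigned out_gid)
{
    char map[64];

    if (unshare(CLONE_NEWUSER) != 0)
        return -1;
    /* An unprivileged writer must deny setgroups before gid_map. */
    if (write_file("/proc/self/setgroups", "deny") != 0)
        return -1;
    format_id_map(map, sizeof map, in_uid, out_uid);
    if (write_file("/proc/self/uid_map", map) != 0)
        return -1;
    format_id_map(map, sizeof map, in_gid, out_gid);
    return write_file("/proc/self/gid_map", map);
}

/* Hypothetical helper: private devpts at pts_dir, ending up as uid/gid. */
int setup_private_devpts(uid_t uid, gid_t gid, const char *pts_dir)
{
    /* 1. First user namespace: the caller's ids appear as 0. */
    if (enter_userns(0, uid, 0, gid) != 0)
        return -1;
    /* 2. New mount namespace, then a fresh devpts instance in it. */
    if (unshare(CLONE_NEWNS) != 0)
        return -1;
    if (mount("devpts", pts_dir, "devpts", MS_NOSUID | MS_NOEXEC,
              "newinstance,ptmxmode=0666") != 0)
        return -1;
    /* 3. Second user namespace: 0 is mapped back to the original ids. */
    return enter_userns(uid, 0, gid, 0);
}
```

A caller would read getuid() and getgid() before the first unshare and
pass them in, together with an existing directory to mount on. The two
extra unshare steps and the setgroups handling give a sense of why this
workaround is described as ugly.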

That would probably work, but I think it's really ugly.

I just looked and the number of places where we actually care if uid 0
is mapped is very small.  Mostly just the places that have to deal with
setuid applications.  So I think the maintenance burden is much smaller
than I would have expected.

That said if we update /dev/pts to handle being mounted by a non-root
user I expect what we actually want is to use the fsuid and fsgid
of the caller of mount.  That is less code and it does the right thing
without effort, and it makes sense even outside of a user namespace
context.

Something like:

diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index add566303c68..8fdaa6740f23 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -245,13 +245,8 @@ static int mknod_ptmx(struct super_block *sb)
        struct dentry *root = sb->s_root;
        struct pts_fs_info *fsi = DEVPTS_SB(sb);
        struct pts_mount_opts *opts = &fsi->mount_opts;
-       kuid_t root_uid;
-       kgid_t root_gid;
-
-       root_uid = make_kuid(current_user_ns(), 0);
-       root_gid = make_kgid(current_user_ns(), 0);
-       if (!uid_valid(root_uid) || !gid_valid(root_gid))
-               return -EINVAL;
+       kuid_t ptmx_uid = current_fsuid();
+       kgid_t ptmx_gid = current_fsgid();

        mutex_lock(&d_inode(root)->i_mutex);

@@ -282,8 +277,8 @@ static int mknod_ptmx(struct super_block *sb)

        mode = S_IFCHR|opts->ptmxmode;
        init_special_inode(inode, mode, MKDEV(TTYAUX_MAJOR, 2));
-       inode->i_uid = root_uid;
-       inode->i_gid = root_gid;
+       inode->i_uid = ptmx_uid;
+       inode->i_gid = ptmx_gid;

        d_add(dentry, inode);
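
To illustrate why the check removed by the diff above fails when uid 0
is not mapped, here is a small user-space model of the lookup. It is a
sketch only: struct id_extent, model_make_kuid and model_uid_valid are
hypothetical stand-ins for the kernel's uid_map extents, make_kuid()
and uid_valid(), and the model ignores nested namespaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's "no mapping" kuid value (assumption). */
#define INVALID_ID ((uint32_t)-1)

/* One extent of an id map: `count` ids starting at `first` inside the
 * namespace correspond to ids starting at `lower_first` outside it. */
struct id_extent {
    uint32_t first;
    uint32_t lower_first;
    uint32_t count;
};

/* Translate an id as seen inside the namespace to the id outside it,
 * or INVALID_ID if no extent covers it. */
uint32_t model_make_kuid(const struct id_extent *map, size_t n, uint32_t id)
{
    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (id >= map[i].first && id - map[i].first < map[i].count)
            return map[i].lower_first + (id - map[i].first);
    }
    return INVALID_ID;
}

/* Model of uid_valid(): true when the lookup found a mapping. */
int model_uid_valid(uint32_t kuid)
{
    return kuid != INVALID_ID;
}
```

With an identity map such as { 1000, 1000, 1 }, looking up id 0 yields
INVALID_ID, which corresponds to the -EINVAL path in the old code. With
{ 0, 1000, 1 }, id 0 resolves to 1000 and the mount would proceed.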

Apparently alexl is encountering some annoyances related to the
current workaround, and the workaround is certainly ugly.

Your proposal seems like it could break some use cases involving
fscaps on a mount or mount-like binary.

What if we change it to use the owner of the userns that owns the
current mount ns?  For anything that doesn't explicitly use
namespaces, this will be zero.  For namespace users, it should do the
right thing.

--Andy

