1
0
mirror of https://github.com/opencontainers/runc.git synced 2026-02-05 18:45:28 +01:00

27 Commits

Author SHA1 Message Date
Kir Kolyshkin
1b954f1f06 libct: fix mips compilation
On MIPS arches, Rdev is uint32 so we have to convert it.

Fixes issue 4962.

Fixes: 8476df83 ("libct: add/use isDevNull, verifyDevNull")
Fixes: de87203e ("console: verify /dev/pts/ptmx before use")
Fixes: 398955bc ("console: add fallback for pre-TIOCGPTPEER kernels")
Reported-by: Tianon Gravi <admwiggin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-11-05 17:56:14 -08:00
Aleksa Sarai
de87203e62 console: verify /dev/pts/ptmx before use
This is primarily done out of an abudance of caution against runc exec
being attacked by a container where /dev/pts/ptmx has been replaced with
some other bad inode (a disconnected NFS handle, a symlink that goes
through a leaked runc file descriptor to reference a host ptmx, etc).

Unfortunately, we cannot trivially verify that /dev/pts/ptmx is actually
the /dev/pts from the container without storing stuff like the fsid in
the runc state.json, which is probably not worth the extra effort. This
should at least avoid the most concerning cases.

Reported-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-11-01 21:24:03 +11:00
Aleksa Sarai
9be1dbf4ac console: avoid trivial symlink attacks for /dev/console
An attacker could make /dev/console a symlink. This presents two
possible issues:

 1. os.Create will happily truncate targets, which could have resulted
    in a worse version of CVE-2024-4531. Luckily, this all happens after
    pivot_root(2) so the scope of that particular attack is fairly
    limited (you are unlikely to be able to easily access host rootfs
    files -- though it might be possible to take advantage of leaks such
    as in CVE-2024-21626). However, O_CREAT|O_NOFOLLOW is what we should
    be doing for all file creations.

 2. Because we passed /dev/console as the only mount path (as opposed to
    using a /proc/self/fd/$n path), an attacker could swap the symlink
    to point to any other path and thus cause us to mount over some
    other path. This is not as big of a problem because all the mounts
    are in the container namespace after pivot_root(2), and users
    usually can create arbitrary mount targets inside the container.

These issues don't seem particularly exploitable, but they deserve to be
hardened regardless.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-11-01 21:24:03 +11:00
Aleksa Sarai
398955bccb console: add fallback for pre-TIOCGPTPEER kernels
The pty driver has very consistent allocation rules for the major:minor
numbers of /dev/pts/$n inodes, so it is possible to somewhat safely open
/dev/pts/* paths if we validate that the inode is the one we expect.

It is possible for an attacker to have over-mounted a pts peer from a
different devpts instance, but to fix this would require more tracking
of devpts instances than runc currently can do.

This means runc should continue to work on very old kernels.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-11-01 21:24:03 +11:00
Aleksa Sarai
531ef794e4 console: use TIOCGPTPEER when allocating peer PTY
When opening the peer end of a pty, the old kernel API required us to
open /dev/pts/$num inside the container (at least since we fixed console
handling many years ago in commit 244c9fc426 ("*: console rewrite")).

The problem is that in a hostile container it is possible for
/dev/pts/$num to be an attacker-controlled symlink that runc can be
tricked into resolving when doing bind-mounts. This allows the attacker
to (among other things) persist /proc/... entries that are later masked
by runc, allowing an attacker to escape through the kernel.core_pattern
sysctl (/proc/sys/kernel/core_pattern). This is the original issue
reported by Lei Wang and Li Fu Bang in CVE-2025-52565.

However, it should be noted that this is not entirely a newly-discovered
problem. Way back in Linux 4.13 (2017), I added the TIOCGPTPEER ioctl,
which allows us to get a pty peer without touching the /dev/pts inside
the container. The original threat model was around an attacker
replacing /dev/pts/$n or /dev/pts/ptmx with some malicious inode (a DoS
inode, or possibly a PTY they wanted a confused deputy to operate on).
Unfortunately, there was no practical way for runc to cache a safe
O_PATH handle to /dev/pts/ptmx (unlike other runtimes like LXC, which
switched to TIOCGPTPEER way back in 2017). Since it wasn't clear how we
could protect against the main attack TIOCGPTPEER was meant to protect
against, we never switched to it (even though I implemented it
specifically to harden container runtimes).

Unfortunately, It turns out that mount *sources* are a threat we didn't
fully consider. Since TIOCGPTPEER already solves this problem entirely
for us in a race free way, we should just use that. In a later patch, we
will add some hardening for /dev/pts/$num opening to maintain support
for very old kernels (Linux 4.13 is very old at this point, but RHEL 7
is still kicking and is stuck on Linux 3.10).

Fixes: GHSA-qw9x-cqr3-wc7r CVE-2025-52565
Reported-by: Lei Wang <ssst0n3@gmail.com> (CVE-2025-52565)
Reported-by: lfbzhm <lifubang@acmcoder.com> (CVE-2025-52565)
Reported-by: Aleksa Sarai <cyphar@cyphar.com> (TIOCGPTPEER)
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2025-11-01 21:24:03 +11:00
Kir Kolyshkin
e655abc0da int/linux: add/use Dup3, Open, Openat
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-03-26 14:16:53 -07:00
Kir Kolyshkin
2e2ecf29ff libct: use chmod instead of umask
Umask is problematic for Go programs as it affects other goroutines
(see [1] for more details).

Instead of using it, let's just prop up with Chmod.

Note this patch misses the MkdirAll call in createDeviceNode. Since the
runtime spec does not say anything about creating intermediary
directories for device nodes, let's assume that doing it via mkdir with
the current umask set is sufficient (if not, we have to reimplement
MkdirAll from scratch, with added call to os.Chmod).

[1] https://github.com/opencontainers/runc/pull/3563#discussion_r990293788

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-09-27 16:46:53 -07:00
Kir Kolyshkin
976748e8d6 libct: add mountViaFDs, simplify mount
1. Simplify mount call by removing the procfd argument, and use the new
   mount() where procfd is not used. Now, the mount() arguments are the
   same as for unix.Mount.

2. Introduce a new mountViaFDs function, which is similar to the old
   mount(), except it can take procfd for both source and target.
   The new arguments are called srcFD and dstFD.

3. Modify the mount error to show both srcFD and dstFD so it's clear
   which one is used for which purpose. This fixes the issue of having
   a somewhat cryptic errors like this:

> mount /proc/self/fd/11:/sys/fs/cgroup/systemd (via /proc/self/fd/12), flags: 0x20502f: operation not permitted

  (in which fd 11 is actually the source, and fd 12 is the target).

   After this change, it looks like

> mount src=/proc/self/fd/11, dst=/sys/fs/cgroup/systemd, dstFD=/proc/self/fd/12, flags=0x20502f: operation not permitted

   so it's clear that 12 is a destination fd.

4. Fix the mountViaFDs callers to use dstFD (rather than procfd) for the
   variable name.

5. Use srcFD where mountFd is set.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-05-02 18:41:09 -07:00
Kir Kolyshkin
36aefad45d libct: wrap unix.Mount/Unmount errors
Errors returned by unix are bare. In some cases it's impossible to find
out what went wrong because there's is not enough context.

Add a mountError type (mostly copy-pasted from github.com/moby/sys/mount),
and mount/unmount helpers. Use these where appropriate, and convert error
checks to use errors.Is.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-22 16:09:37 -07:00
Kir Kolyshkin
e6048715e4 Use gofumpt to format code
gofumpt (mvdan.cc/gofumpt) is a fork of gofmt with stricter rules.

Brought to you by

	git ls-files \*.go | grep -v ^vendor/ | xargs gofumpt -s -w

Looking at the diff, all these changes make sense.

Also, replace gofmt with gofumpt in golangci.yml.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-06-01 12:17:27 -07:00
Daniel Dao
91eafcbc65 tty: move IO of master pty to be done with epoll
This moves all console code to use github.com/containerd/console library to
handle console I/O. Also move to use EpollConsole by default when user requests
a terminal so we can still cope when the other side temporarily goes away.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2017-07-28 12:35:02 +01:00
Tobias Klauser
078e903296 libcontainer: use ioctl wrappers from x/sys/unix
Use IoctlGetInt and IoctlGetTermios/IoctlSetTermios instead of manually
reimplementing them.

Because of unlockpt, the ioctl wrapper is still needed as it needs to
pass a pointer to a value, which is not supported by any ioctl function
in x/sys/unix yet.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2017-07-10 10:56:58 +02:00
W. Trevor King
830c0d70df libcontainer/console_linux.go: Make SaneTerminal public
And use it only in local tooling that is forwarding the pseudoterminal
master.  That way runC no longer has an opinion on the onlcr setting
for folks who are creating a terminal and detaching.  They'll use
--console-socket and can setup the pseudoterminal however they like
without runC having an opinion.  With this commit, the only cases
where runC still has applies SaneTerminal is when *it* is the process
consuming the master descriptor.

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-06-07 21:32:41 -07:00
Christy Perez
fca53109c1 Fix console syscalls
Fixes opencontainers/runc/issues/1364

Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>
2017-04-06 16:51:54 -05:00
Michael Crosby
00a0ecf554 Add separate console socket
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-03-16 10:23:59 -07:00
Aleksa Sarai
7df64f8886 runc: implement --console-socket
This allows for higher-level orchestrators to be able to have access to
the master pty file descriptor without keeping the runC process running.
This is key to having (detach && createTTY) with a _real_ pty created
inside the container, which is then sent to a higher level orchestrator
over an AF_UNIX socket.

This patch is part of the console rewrite patchset.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-12-01 15:49:36 +11:00
Mrunal Patel
f1324a9fc1 Don't label the console as it already has the right label
[@cyphar: removed mountLabel argument from .mount().]

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-12-01 15:49:36 +11:00
Aleksa Sarai
c0c8edb9e8 console: don't chown(2) the slave PTY
Since the gid=X and mode=Y flags can be set inside config.json as mount
options, don't override them with our own defaults. This avoids
/dev/pts/* not being owned by tty in a regular container, as well as all
of the issues with us implementing grantpt(3) manually. This is the
least opinionated approach to take.

This patch is part of the console rewrite patchset.

Reported-by: Mrunal Patel <mrunalp@gmail.com>
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-12-01 15:49:36 +11:00
Aleksa Sarai
244c9fc426 *: console rewrite
This implements {createTTY, detach} and all of the combinations and
negations of the two that were previously implemented. There are some
valid questions about out-of-OCI-scope topics like !createTTY and how
things should be handled (why do we dup the current stdio to the
process, and how is that not a security issue). However, these will be
dealt with in a separate patchset.

In order to allow for late console setup, split setupRootfs into the
"preparation" section where all of the mounts are created and the
"finalize" section where we pivot_root and set things as ro. In between
the two we can set up all of the console mountpoints and symlinks we
need.

We use two-stage synchronisation to ensures that when the syscalls are
reordered in a suboptimal way, an out-of-place read() on the parentPipe
will not gobble the ancilliary information.

This patch is part of the console rewrite patchset.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-12-01 15:49:36 +11:00
Michael Crosby
5f24c9a61a Merge pull request #1146 from cyphar/io-set-termios-onlcr
libcontainer: io: stop screwing with \n in console output
2016-11-03 09:49:50 -07:00
Aleksa Sarai
eea28f480d libcontainer: io: stop screwing with \n in console output
The default terminal setting for a new pty on Linux (unix98) has +ONLCR,
resulting in '\n' writes by a container process to be converted to
'\r\n' reads by the managing process. This is quite unexpected, and
causes multiple issues with things like bats testing. To fix it, make
the terminal sane after opening it by setting -ONLCR.

This patch might need to be rewritten after the console rewrite patchset
is merged.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-11-01 14:40:54 +11:00
Qiang Huang
b15668b36d Fix all typos found by misspell
I use the same tool (https://github.com/client9/misspell)
as Daniel used a few days ago, don't why he missed these
typos at that time.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
2016-10-29 14:14:42 +08:00
Daniel Dao
1b876b0bf2 fix typos with misspell
pipe the source through https://github.com/client9/misspell. typos be gone!

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2016-10-11 23:22:48 +00:00
Michael Crosby
9c9aac5385 Export console New func
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-12-09 11:59:10 -08:00
Antonio Murdaca
d6e6462478 Cleanup unused func arguments
Signed-off-by: Antonio Murdaca <runcom@linux.com>
2015-09-21 11:50:29 +02:00
Michael Crosby
080df7ab88 Update import paths for new repository
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-06-21 19:29:59 -07:00
Michael Crosby
8f97d39dd2 Move libcontainer into subdirectory
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2015-06-21 19:29:15 -07:00