opencontainers/runc - runc - Linuxmonk: Open Source Repository Mirror

mirror of https://github.com/opencontainers/runc.git synced 2026-02-05 18:45:28 +01:00

Author	SHA1	Message	Date
Kir Kolyshkin	1b954f1f06	libct: fix mips compilation On MIPS arches, Rdev is uint32 so we have to convert it. Fixes issue 4962. Fixes: `8476df83` ("libct: add/use isDevNull, verifyDevNull") Fixes: `de87203e` ("console: verify /dev/pts/ptmx before use") Fixes: `398955bc` ("console: add fallback for pre-TIOCGPTPEER kernels") Reported-by: Tianon Gravi <admwiggin@gmail.com> Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-11-05 17:56:14 -08:00
Aleksa Sarai	de87203e62	console: verify /dev/pts/ptmx before use This is primarily done out of an abundance of caution against runc exec being attacked by a container where /dev/pts/ptmx has been replaced with some other bad inode (a disconnected NFS handle, a symlink that goes through a leaked runc file descriptor to reference a host ptmx, etc). Unfortunately, we cannot trivially verify that /dev/pts/ptmx is actually the /dev/pts from the container without storing stuff like the fsid in the runc state.json, which is probably not worth the extra effort. This should at least avoid the most concerning cases. Reported-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2025-11-01 21:24:03 +11:00
Aleksa Sarai	9be1dbf4ac	console: avoid trivial symlink attacks for /dev/console An attacker could make /dev/console a symlink. This presents two possible issues: 1. os.Create will happily truncate targets, which could have resulted in a worse version of CVE-2024-45310. Luckily, this all happens after pivot_root(2) so the scope of that particular attack is fairly limited (you are unlikely to be able to easily access host rootfs files -- though it might be possible to take advantage of leaks such as in CVE-2024-21626). However, O_CREAT|O_NOFOLLOW is what we should be doing for all file creations. 2. Because we passed /dev/console as the only mount path (as opposed to using a /proc/self/fd/$n path), an attacker could swap the symlink to point to any other path and thus cause us to mount over some other path. This is not as big of a problem because all the mounts are in the container namespace after pivot_root(2), and users usually can create arbitrary mount targets inside the container. These issues don't seem particularly exploitable, but they deserve to be hardened regardless. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2025-11-01 21:24:03 +11:00
Aleksa Sarai	398955bccb	console: add fallback for pre-TIOCGPTPEER kernels The pty driver has very consistent allocation rules for the major:minor numbers of /dev/pts/$n inodes, so it is possible to somewhat safely open /dev/pts/* paths if we validate that the inode is the one we expect. It is possible for an attacker to have over-mounted a pts peer from a different devpts instance, but to fix this would require more tracking of devpts instances than runc currently can do. This means runc should continue to work on very old kernels. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2025-11-01 21:24:03 +11:00
Aleksa Sarai	531ef794e4	console: use TIOCGPTPEER when allocating peer PTY When opening the peer end of a pty, the old kernel API required us to open /dev/pts/$num inside the container (at least since we fixed console handling many years ago in commit `244c9fc426` ("*: console rewrite")). The problem is that in a hostile container it is possible for /dev/pts/$num to be an attacker-controlled symlink that runc can be tricked into resolving when doing bind-mounts. This allows the attacker to (among other things) persist /proc/... entries that are later masked by runc, allowing an attacker to escape through the kernel.core_pattern sysctl (/proc/sys/kernel/core_pattern). This is the original issue reported by Lei Wang and Li Fu Bang in CVE-2025-52565. However, it should be noted that this is not entirely a newly-discovered problem. Way back in Linux 4.13 (2017), I added the TIOCGPTPEER ioctl, which allows us to get a pty peer without touching the /dev/pts inside the container. The original threat model was around an attacker replacing /dev/pts/$n or /dev/pts/ptmx with some malicious inode (a DoS inode, or possibly a PTY they wanted a confused deputy to operate on). Unfortunately, there was no practical way for runc to cache a safe O_PATH handle to /dev/pts/ptmx (unlike other runtimes like LXC, which switched to TIOCGPTPEER way back in 2017). Since it wasn't clear how we could protect against the main attack TIOCGPTPEER was meant to protect against, we never switched to it (even though I implemented it specifically to harden container runtimes). Unfortunately, it turns out that mount sources are a threat we didn't fully consider. Since TIOCGPTPEER already solves this problem entirely for us in a race free way, we should just use that. In a later patch, we will add some hardening for /dev/pts/$num opening to maintain support for very old kernels (Linux 4.13 is very old at this point, but RHEL 7 is still kicking and is stuck on Linux 3.10). Fixes: GHSA-qw9x-cqr3-wc7r CVE-2025-52565 Reported-by: Lei Wang <ssst0n3@gmail.com> (CVE-2025-52565) Reported-by: lfbzhm <lifubang@acmcoder.com> (CVE-2025-52565) Reported-by: Aleksa Sarai <cyphar@cyphar.com> (TIOCGPTPEER) Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2025-11-01 21:24:03 +11:00
Kir Kolyshkin	e655abc0da	int/linux: add/use Dup3, Open, Openat Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-03-26 14:16:53 -07:00
Kir Kolyshkin	2e2ecf29ff	libct: use chmod instead of umask Umask is problematic for Go programs as it affects other goroutines (see [1] for more details). Instead of using it, let's just prop up with Chmod. Note this patch misses the MkdirAll call in createDeviceNode. Since the runtime spec does not say anything about creating intermediary directories for device nodes, let's assume that doing it via mkdir with the current umask set is sufficient (if not, we have to reimplement MkdirAll from scratch, with added call to os.Chmod). [1] https://github.com/opencontainers/runc/pull/3563#discussion_r990293788 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-09-27 16:46:53 -07:00
Kir Kolyshkin	976748e8d6	libct: add mountViaFDs, simplify mount 1. Simplify mount call by removing the procfd argument, and use the new mount() where procfd is not used. Now, the mount() arguments are the same as for unix.Mount. 2. Introduce a new mountViaFDs function, which is similar to the old mount(), except it can take procfd for both source and target. The new arguments are called srcFD and dstFD. 3. Modify the mount error to show both srcFD and dstFD so it's clear which one is used for which purpose. This fixes the issue of having somewhat cryptic errors like this: > mount /proc/self/fd/11:/sys/fs/cgroup/systemd (via /proc/self/fd/12), flags: 0x20502f: operation not permitted (in which fd 11 is actually the source, and fd 12 is the target). After this change, it looks like > mount src=/proc/self/fd/11, dst=/sys/fs/cgroup/systemd, dstFD=/proc/self/fd/12, flags=0x20502f: operation not permitted so it's clear that 12 is a destination fd. 4. Fix the mountViaFDs callers to use dstFD (rather than procfd) for the variable name. 5. Use srcFD where mountFd is set. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-05-02 18:41:09 -07:00
Kir Kolyshkin	36aefad45d	libct: wrap unix.Mount/Unmount errors Errors returned by unix are bare. In some cases it's impossible to find out what went wrong because there is not enough context. Add a mountError type (mostly copy-pasted from github.com/moby/sys/mount), and mount/unmount helpers. Use these where appropriate, and convert error checks to use errors.Is. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-22 16:09:37 -07:00
Kir Kolyshkin	e6048715e4	Use gofumpt to format code gofumpt (mvdan.cc/gofumpt) is a fork of gofmt with stricter rules. Brought to you by git ls-files \*.go | grep -v ^vendor/ | xargs gofumpt -s -w Looking at the diff, all these changes make sense. Also, replace gofmt with gofumpt in golangci.yml. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-01 12:17:27 -07:00
Daniel Dao	91eafcbc65	tty: move IO of master pty to be done with epoll This moves all console code to use github.com/containerd/console library to handle console I/O. Also move to use EpollConsole by default when user requests a terminal so we can still cope when the other side temporarily goes away. Signed-off-by: Daniel Dao <dqminh89@gmail.com>	2017-07-28 12:35:02 +01:00
Tobias Klauser	078e903296	libcontainer: use ioctl wrappers from x/sys/unix Use IoctlGetInt and IoctlGetTermios/IoctlSetTermios instead of manually reimplementing them. Because of unlockpt, the ioctl wrapper is still needed as it needs to pass a pointer to a value, which is not supported by any ioctl function in x/sys/unix yet. Signed-off-by: Tobias Klauser <tklauser@distanz.ch>	2017-07-10 10:56:58 +02:00
W. Trevor King	830c0d70df	libcontainer/console_linux.go: Make SaneTerminal public And use it only in local tooling that is forwarding the pseudoterminal master. That way runC no longer has an opinion on the onlcr setting for folks who are creating a terminal and detaching. They'll use --console-socket and can set up the pseudoterminal however they like without runC having an opinion. With this commit, the only cases where runC still applies SaneTerminal is when it is the process consuming the master descriptor. Signed-off-by: W. Trevor King <wking@tremily.us>	2017-06-07 21:32:41 -07:00
Christy Perez	fca53109c1	Fix console syscalls Fixes opencontainers/runc/issues/1364 Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>	2017-04-06 16:51:54 -05:00
Michael Crosby	00a0ecf554	Add separate console socket Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2017-03-16 10:23:59 -07:00
Aleksa Sarai	7df64f8886	runc: implement --console-socket This allows for higher-level orchestrators to be able to have access to the master pty file descriptor without keeping the runC process running. This is key to having (detach && createTTY) with a _real_ pty created inside the container, which is then sent to a higher level orchestrator over an AF_UNIX socket. This patch is part of the console rewrite patchset. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-12-01 15:49:36 +11:00
Mrunal Patel	f1324a9fc1	Don't label the console as it already has the right label [@cyphar: removed mountLabel argument from .mount().] Signed-off-by: Mrunal Patel <mrunalp@gmail.com> Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-12-01 15:49:36 +11:00
Aleksa Sarai	c0c8edb9e8	console: don't chown(2) the slave PTY Since the gid=X and mode=Y flags can be set inside config.json as mount options, don't override them with our own defaults. This avoids /dev/pts/* not being owned by tty in a regular container, as well as all of the issues with us implementing grantpt(3) manually. This is the least opinionated approach to take. This patch is part of the console rewrite patchset. Reported-by: Mrunal Patel <mrunalp@gmail.com> Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-12-01 15:49:36 +11:00
Aleksa Sarai	244c9fc426	*: console rewrite This implements {createTTY, detach} and all of the combinations and negations of the two that were previously implemented. There are some valid questions about out-of-OCI-scope topics like !createTTY and how things should be handled (why do we dup the current stdio to the process, and how is that not a security issue). However, these will be dealt with in a separate patchset. In order to allow for late console setup, split setupRootfs into the "preparation" section where all of the mounts are created and the "finalize" section where we pivot_root and set things as ro. In between the two we can set up all of the console mountpoints and symlinks we need. We use two-stage synchronisation to ensure that when the syscalls are reordered in a suboptimal way, an out-of-place read() on the parentPipe will not gobble the ancillary information. This patch is part of the console rewrite patchset. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-12-01 15:49:36 +11:00
Michael Crosby	5f24c9a61a	Merge pull request #1146 from cyphar/io-set-termios-onlcr libcontainer: io: stop screwing with \n in console output	2016-11-03 09:49:50 -07:00
Aleksa Sarai	eea28f480d	libcontainer: io: stop screwing with \n in console output The default terminal setting for a new pty on Linux (unix98) has +ONLCR, resulting in '\n' writes by a container process to be converted to '\r\n' reads by the managing process. This is quite unexpected, and causes multiple issues with things like bats testing. To fix it, make the terminal sane after opening it by setting -ONLCR. This patch might need to be rewritten after the console rewrite patchset is merged. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2016-11-01 14:40:54 +11:00
Qiang Huang	b15668b36d	Fix all typos found by misspell I use the same tool (https://github.com/client9/misspell) as Daniel used a few days ago, don't know why he missed these typos at that time. Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>	2016-10-29 14:14:42 +08:00
Daniel Dao	1b876b0bf2	fix typos with misspell pipe the source through https://github.com/client9/misspell. typos be gone! Signed-off-by: Daniel Dao <dqminh89@gmail.com>	2016-10-11 23:22:48 +00:00
Michael Crosby	9c9aac5385	Export console New func Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2015-12-09 11:59:10 -08:00
Antonio Murdaca	d6e6462478	Cleanup unused func arguments Signed-off-by: Antonio Murdaca <runcom@linux.com>	2015-09-21 11:50:29 +02:00
Michael Crosby	080df7ab88	Update import paths for new repository Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2015-06-21 19:29:59 -07:00
Michael Crosby	8f97d39dd2	Move libcontainer into subdirectory Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2015-06-21 19:29:15 -07:00

27 Commits