mirror of https://github.com/opencontainers/runc.git synced 2026-02-05 09:46:08 +01:00

Kir Kolyshkin cb31d62f1c Fix exec vs Go 1.26

Since [PR 4812], runc exec tries to use clone3 syscall with
CLONE_INTO_CGROUP, falling back to the old method if it is not
supported.

One issue with that approach is that a

> Cmd cannot be reused after calling its [Cmd.Start], [Cmd.Run],
> [Cmd.Output], or [Cmd.CombinedOutput] methods.

(from https://pkg.go.dev/os/exec#Cmd).

This is enforced since Go 1.26, see [CL 728642], and so runc exec
actually fails in specific scenarios (go1.26 and no CLONE_INTO_CGROUP
support).

The easiest workaround is to pre-copy the p.cmd structure (copy = *cmd).
Judging by [CL 734200], this looks acceptable, but it might break in the
future because it also copies the private fields, so let's do a proper
field-by-field copy. If upstream adds a Cmd.Clone method, we will switch
to it.

Also, we can probably be fine with a post-copy (once the first Start has
failed), but let's be conservative here and do a pre-copy.

[PR 4812]: https://github.com/opencontainers/runc/pull/4812
[CL 728642]: https://go.dev/cl/728642
[CL 734200]: https://go.dev/cl/734200

Reported-by: Efim Verzakov <efimverzakov@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

2026-01-29 13:49:34 -08:00

README.md

libcontainer

Libcontainer provides a native Go implementation for creating containers with namespaces, cgroups, capabilities, and filesystem access controls. It allows you to manage the lifecycle of the container and to perform additional operations after the container is created.

Container

A container is a self-contained execution environment that shares the kernel of the host system and which is (optionally) isolated from other containers in the system.

Using libcontainer

Container init

Because containers are spawned in a two-step process, you will need a binary that will be executed as the init process for the container. In libcontainer, we use the current binary (/proc/self/exe) as the init process and pass it the argument "init". We call this first step "bootstrap", so you always need an "init" function as the entry point of "bootstrap".

In addition to the Go init function, the early-stage bootstrap is handled by importing nsenter.

For details on how runc implements such "init", see init.go and libcontainer/init_linux.go.
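
The re-exec pattern described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not libcontainer's code: the helper names isInitInvocation and initCommand are hypothetical, the check lives in main rather than in a Go init function to keep the example short, and the nsenter bootstrap and the real container init are omitted.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// isInitInvocation reports whether the binary was started as the container
// init, that is, with "init" as its first argument. Hypothetical helper.
func isInitInvocation(args []string) bool {
	return len(args) > 1 && args[1] == "init"
}

// initCommand builds a command that re-executes the current binary through
// /proc/self/exe (Linux-only) with the "init" argument. Hypothetical helper.
func initCommand() *exec.Cmd {
	cmd := exec.Command("/proc/self/exe", "init")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	if isInitInvocation(os.Args) {
		// Second step: this copy of the binary is the init process.
		// A real libcontainer consumer would hand control to the
		// container init code at this point.
		fmt.Println("running as init, pid", os.Getpid())
		return
	}

	// First step: the parent re-executes itself as init.
	if err := initCommand().Run(); err != nil {
		fmt.Fprintln(os.Stderr, "re-exec failed:", err)
		os.Exit(1)
	}
}
```

Because the same binary serves both roles, the "init" check has to run before any normal command-line handling.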

Device management

If you want containers that have access to some devices, you need to import this package into your code:

    import (
        _ "github.com/opencontainers/cgroups/devices"
    )

Without doing this, the libcontainer cgroup manager won't be able to set up device access rules, and will fail if devices are specified in the container configuration.

Container creation

To create a container you first have to create a configuration struct describing how the container is to be created. A sample would look similar to this:

defaultMountFlags := unix.MS_NOEXEC | unix.MS_NOSUID | unix.MS_NODEV
var devices []*devices.Rule
for _, device := range specconv.AllowedDevices {
	devices = append(devices, &device.Rule)
}
config := &configs.Config{
	Rootfs: "/your/path/to/rootfs",
	Capabilities: &configs.Capabilities{
		Bounding: []string{
			"CAP_KILL",
			"CAP_AUDIT_WRITE",
		},
		Effective: []string{
			"CAP_KILL",
			"CAP_AUDIT_WRITE",
		},
		Permitted: []string{
			"CAP_KILL",
			"CAP_AUDIT_WRITE",
		},
	},
	Namespaces: configs.Namespaces([]configs.Namespace{
		{Type: configs.NEWNS},
		{Type: configs.NEWUTS},
		{Type: configs.NEWIPC},
		{Type: configs.NEWPID},
		{Type: configs.NEWUSER},
		{Type: configs.NEWNET},
		{Type: configs.NEWCGROUP},
	}),
	Cgroups: &configs.Cgroup{
		Name:   "test-container",
		Parent: "system",
		Resources: &configs.Resources{
			MemorySwappiness: nil,
			Devices:          devices,
		},
	},
	MaskPaths: []string{
		"/proc/kcore",
		"/sys/firmware",
	},
	ReadonlyPaths: []string{
		"/proc/sys", "/proc/sysrq-trigger", "/proc/irq", "/proc/bus",
	},
	Devices:  specconv.AllowedDevices,
	Hostname: "testing",
	Mounts: []*configs.Mount{
		{
			Source:      "proc",
			Destination: "/proc",
			Device:      "proc",
			Flags:       defaultMountFlags,
		},
		{
			Source:      "tmpfs",
			Destination: "/dev",
			Device:      "tmpfs",
			Flags:       unix.MS_NOSUID | unix.MS_STRICTATIME,
			Data:        "mode=755",
		},
		{
			Source:      "devpts",
			Destination: "/dev/pts",
			Device:      "devpts",
			Flags:       unix.MS_NOSUID | unix.MS_NOEXEC,
			Data:        "newinstance,ptmxmode=0666,mode=0620,gid=5",
		},
		{
			Device:      "tmpfs",
			Source:      "shm",
			Destination: "/dev/shm",
			Data:        "mode=1777,size=65536k",
			Flags:       defaultMountFlags,
		},
		{
			Source:      "mqueue",
			Destination: "/dev/mqueue",
			Device:      "mqueue",
			Flags:       defaultMountFlags,
		},
		{
			Source:      "sysfs",
			Destination: "/sys",
			Device:      "sysfs",
			Flags:       defaultMountFlags | unix.MS_RDONLY,
		},
	},
	UIDMappings: []configs.IDMap{
		{
			ContainerID: 0,
			HostID:      1000,
			Size:        65536,
		},
	},
	GIDMappings: []configs.IDMap{
		{
			ContainerID: 0,
			HostID:      1000,
			Size:        65536,
		},
	},
	Networks: []*configs.Network{
		{
			Type:    "loopback",
			Address: "127.0.0.1/0",
			Gateway: "localhost",
		},
	},
	Rlimits: []configs.Rlimit{
		{
			Type: unix.RLIMIT_NOFILE,
			Hard: uint64(1025),
			Soft: uint64(1025),
		},
	},
}

Once you have the configuration populated you can create a container with a specified ID under a specified state directory:

container, err := libcontainer.Create("/run/containers", "container-id", config)
if err != nil {
	logrus.Fatal(err)
	return
}

To spawn bash as the initial process inside the container and have the process's PID returned in order to wait, signal, or kill the process:

process := &libcontainer.Process{
	Args:   []string{"/bin/bash"},
	Env:    []string{"PATH=/bin"},
	User:   "daemon",
	Stdin:  os.Stdin,
	Stdout: os.Stdout,
	Stderr: os.Stderr,
	Init:   true,
}

// err was already declared when the container was created, so assign with =.
err = container.Run(process)
if err != nil {
	container.Destroy()
	logrus.Fatal(err)
	return
}

// wait for the process to finish.
_, err = process.Wait()
if err != nil {
	logrus.Fatal(err)
}

// destroy the container.
container.Destroy()

Additional ways to interact with a running container are:

// return all the pids for all processes running inside the container.
processes, err := container.Processes()

// get detailed cpu, memory, io, and network statistics for the container and
// its processes.
stats, err := container.Stats()

// pause all processes inside the container.
container.Pause()

// resume all paused processes.
container.Resume()

// send signal to container's init process.
container.Signal(signal)

// update container resource constraints.
container.Set(config)

// get current status of the container.
status, err := container.Status()

// get current container's state information.
state, err := container.State()

Checkpoint & Restore

libcontainer now integrates CRIU for checkpointing and restoring containers. This lets you save the state of a process running inside a container to disk, and then restore that state into a new process, on the same machine or on another machine.

criu version 1.5.2 or higher is required to use checkpoint and restore. If you don't already have criu installed, you can build it from source, following the online instructions. criu is also installed in the docker image generated when building libcontainer with docker.
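
A minimal sketch of checking a version string against the minimum stated above. The helper names parseCriuVersion and criuSupported are hypothetical and not part of libcontainer, and the integer encoding (major*10000 + minor*100 + patch) is a choice made for this sketch; obtaining the version string from an installed criu is left out.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minCriuVersion encodes the minimum from the text above, 1.5.2,
// as major*10000 + minor*100 + patch.
const minCriuVersion = 10502

// parseCriuVersion turns "major.minor" or "major.minor.patch" into a single
// comparable integer. Hypothetical helper for illustration only.
func parseCriuVersion(s string) (int, error) {
	parts := strings.Split(strings.TrimSpace(s), ".")
	if len(parts) < 2 || len(parts) > 3 {
		return 0, fmt.Errorf("unexpected version format %q", s)
	}
	var nums [3]int // a missing patch component stays 0
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return 0, fmt.Errorf("invalid version component %q in %q", p, s)
		}
		nums[i] = n
	}
	return nums[0]*10000 + nums[1]*100 + nums[2], nil
}

// criuSupported reports whether the given version meets the minimum.
func criuSupported(s string) (bool, error) {
	v, err := parseCriuVersion(s)
	if err != nil {
		return false, err
	}
	return v >= minCriuVersion, nil
}

func main() {
	for _, v := range []string{"1.5.1", "1.5.2", "3.17"} {
		ok, err := criuSupported(v)
		fmt.Println(v, ok, err)
	}
}
```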

Copyright and license

Code and documentation copyright 2014 Docker, Inc. The code and documentation are released under the Apache 2.0 license. The documentation is also released under the Creative Commons Attribution 4.0 International License. You may obtain a copy of the license, titled CC-BY-4.0, at http://creativecommons.org/licenses/by/4.0/.