mirror of https://github.com/lxc/lxcfs.git synced 2026-02-05 09:46:18 +01:00

303 Commits

Author SHA1 Message Date
Alexander Mikhalitsyn
456047f344 lxcfs: add enable-psi-poll cmdline option
Let's make PSI trigger virtualization an opt-in feature.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-11-03 15:19:47 +01:00
Alexander Mikhalitsyn
f664a57fa7 proc_fuse: add /proc/pressure/{cpu, io, memory} virtualization
Virtualize /proc/pressure/{cpu, io, memory} by doing a simple
passthrough of write()/poll() syscalls to the underlying cgroup's
/sys/fs/cgroup/x/y/z/{cpu, io, memory}.pressure file.

The implementation is a bit tricky because FUSE notifications must
be issued asynchronously, so we have to use a separate thread for this.

My main concern here was to ensure that no thread leaks are possible,
because they could be a potential DoS vector for the host.

If the PSITRIGGERTEST macro is defined, then instead of polling on a real
fd we do a simple nanosleep() with a 1 second delay. This is needed to enable
CI testing of this feature.

For "real-world" testing, I used the example program from [1],
but with the cpu counter instead of memory. To generate cpu pressure I used the
"sysbench --threads="$(nproc)" cpu run" command.

Link: https://www.kernel.org/doc/Documentation/accounting/psi.rst [1]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-10-31 19:04:50 +01:00
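
The write()/poll() passthrough in f664a57fa7 follows the kernel's PSI trigger protocol: a trigger string of the form "<some|full> <stall us> <window us>" is written to the pressure file, and the caller then polls the same fd for POLLPRI. Below is a minimal client-side sketch of that protocol, not code from LXCFS. The helper names psi_format_trigger() and psi_wait_once() are hypothetical, and the thresholds in the usage note are just the values from the kernel documentation example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build a PSI trigger string such as
 * "some 150000 1000000". Returns the string length, or -1 if the kind
 * is not "some"/"full" or the buffer is too small.
 */
static int psi_format_trigger(char *buf, size_t size, const char *kind,
			      unsigned long stall_us, unsigned long window_us)
{
	int n;

	assert(buf != NULL && kind != NULL);

	if (strcmp(kind, "some") != 0 && strcmp(kind, "full") != 0)
		return -1;

	n = snprintf(buf, size, "%s %lu %lu", kind, stall_us, window_us);
	if (n < 0 || (size_t)n >= size)
		return -1;

	return n;
}

/*
 * Hypothetical helper: register the trigger on a pressure file and wait
 * for a single event. Returns 1 on an event, 0 on timeout, -1 on error.
 */
static int psi_wait_once(const char *path, const char *trigger, int timeout_ms)
{
	struct pollfd pfd;
	int fd, ret;

	fd = open(path, O_RDWR | O_NONBLOCK);
	if (fd < 0)
		return -1;

	/* The kernel example writes the trigger including its trailing NUL. */
	if (write(fd, trigger, strlen(trigger) + 1) < 0) {
		close(fd);
		return -1;
	}

	pfd.fd = fd;
	pfd.events = POLLPRI;
	pfd.revents = 0;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret > 0) {
		if (pfd.revents & POLLERR)
			ret = -1;	/* event source is gone */
		else if (pfd.revents & POLLPRI)
			ret = 1;	/* threshold was crossed */
		else
			ret = 0;
	}

	close(fd);
	return ret;
}
```

Inside a container this would be called as psi_wait_once("/proc/pressure/cpu", "some 150000 1000000", 10000), assuming the PSI poll feature is enabled on the LXCFS side.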
Alexander Mikhalitsyn
34d5daf679 lxcfs: install noop signal handler for SIGRTMIN + 0
Define SIG_NOTIFY_POLL_WAKEUP as SIGRTMIN + 0 and install
noop signal handler. This signal will be used to manage
notification threads.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-10-31 14:41:02 +01:00
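
A no-op handler matters because a signal with the default disposition would terminate the process, while an ignored signal would not interrupt blocking syscalls. The sketch below shows one common way to install such a handler with sigaction(). It is an illustration under assumptions, not the code from 34d5daf679: only the SIG_NOTIFY_POLL_WAKEUP name comes from the commit message, and install_noop_handler() is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Name taken from the commit message; defined as SIGRTMIN + 0. */
#define SIG_NOTIFY_POLL_WAKEUP (SIGRTMIN + 0)

/* Does nothing on purpose: delivery alone is what interrupts a syscall. */
static void noop_handler(int sig)
{
	(void)sig;
}

/* Hypothetical helper: install the no-op handler for the given signal. */
static int install_noop_handler(int signo)
{
	struct sigaction sa;

	assert(signo > 0);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = noop_handler;
	sigemptyset(&sa.sa_mask);
	/* No SA_RESTART, so a blocked poll() or nanosleep() returns EINTR. */
	sa.sa_flags = 0;

	return sigaction(signo, &sa, NULL);
}
```

With the handler installed, a managing thread could presumably use pthread_kill(thread, SIG_NOTIFY_POLL_WAKEUP) to wake a notification thread that is blocked in poll(); that usage is an assumption about how the signal is applied.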
Alexander Mikhalitsyn
84ef19e424 proc_fuse: move release/releasedir at the end
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-10-31 14:39:51 +01:00
Alexander Mikhalitsyn
8fae19b23d lxcfs: wire up ->poll callback for /proc
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-10-31 14:38:47 +01:00
Alexander Mikhalitsyn
142b0cfe57 lxcfs: wire up ->write callback for /proc
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-10-31 14:20:23 +01:00
Alexander Mikhalitsyn
d411fda7fa bindings: add private_data field to struct file_info
This change is safe because I'm adding a union and we have enough
space for a void *.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-10-31 14:15:18 +01:00
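
The safety argument in d411fda7fa is about layout: wrapping an existing member in a union with a pointer does not change the struct size as long as the existing member is at least as large and as strictly aligned as void *. The following sketch demonstrates the idea on a purely hypothetical layout. The structs info_before and info_after and their members are invented for illustration and do not reflect the real struct file_info.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout before the change. */
struct info_before {
	char *buf;
	int size;
	uint64_t reserved;
};

/* Same layout, with the last member overlaid by a pointer (C11 anonymous union). */
struct info_after {
	char *buf;
	int size;
	union {
		uint64_t reserved;
		void *private_data;
	};
};

/* Compile-time proof that size and member offset are unchanged. */
static_assert(sizeof(struct info_before) == sizeof(struct info_after),
	      "adding the union must not change the struct size");
static_assert(offsetof(struct info_before, reserved) ==
	      offsetof(struct info_after, private_data),
	      "private_data must overlay the existing member");
```

Checks like these fail the build instead of silently breaking binary compatibility if the layout assumption ever stops holding.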
Alexander Mikhalitsyn
282f237494 proc_fuse: deduplicate read() handlers code for /proc/pressure files
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-10-29 13:49:23 +01:00
Stéphane Graber
6957cc1121 Merge pull request #686 from kadinsayani/feat/zswap
Add zswap accounting
2025-10-09 10:07:41 -04:00
Alexander Mikhalitsyn
af454ab8d4 src/utils: fix in_same_namespace helper
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-10-09 15:34:33 +02:00
Alexander Mikhalitsyn
82481b6a39 lxcfs: use macro to generate liblxcfs call helpers
Let's reduce code duplication by using a macro for this.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-10-09 15:11:45 +02:00
Kadin Sayani
c503b12083 cgroups: replace dup() call with openat_safe()
Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
2025-10-08 09:44:52 -06:00
Kadin Sayani
6bd2ebd751 proc_fuse: add zswap information to /proc/meminfo
The following new metrics are now available in /proc/meminfo:
- Zswap: the total amount of memory consumed by the zswap compression
backend
- Zswapped: amount of application memory swapped out to zswap

Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
2025-10-08 09:44:52 -06:00
Kadin Sayani
0dc531d2f6 bindings: add zswap feature detection
Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
2025-10-08 09:44:51 -06:00
Kadin Sayani
777505614a lxcfs: add disable-zswap opt
Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
2025-10-08 09:44:15 -06:00
Kadin Sayani
21ce4aa4b9 cgroups: add zswap feature detection
Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
2025-10-07 12:47:00 -06:00
Kadin Sayani
b54e16a09e cgroups: extract cgfsng_can_use_memory_feature() util function
Signed-off-by: Kadin Sayani <kadin.sayani@canonical.com>
2025-10-07 12:46:40 -06:00
last-las
255b7a7477 proc_fuse: fix proc_stat_read reporting host cpu count under cgroup v2
Signed-off-by: Quanjin Lin <478942543@qq.com>
2025-09-02 10:15:35 +08:00
Deyan Doychev
2ea5561141 proc_loadavg: Prevent integer overflow calculating the sleep interval
If the loadavg thread falls behind schedule for any reason, the calculation can
overflow, resulting in an unintended sleep duration of approximately 70 minutes.
To prevent this, the logic has been updated to skip the sleep in cases where the
calculation would overflow.

Signed-off-by: Deyan Doychev <deyan@siteground.com>
2025-04-01 18:46:37 +03:00
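
The figure of roughly 70 minutes in 2ea5561141 is consistent with an unsigned 32-bit microsecond counter wrapping around: 2^32 microseconds is about 4295 seconds, or about 71.6 minutes. The sketch below reproduces that kind of wraparound and the "skip the sleep" fix. It is an assumption about the general shape of the bug, not the actual LXCFS code, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Naive version: if the work took longer than the period, the unsigned
 * subtraction wraps around to a value just below 2^32 microseconds.
 */
static uint32_t naive_sleep_us(uint32_t period_us, uint32_t elapsed_us)
{
	return period_us - elapsed_us;
}

/*
 * Guarded version: when the thread is already behind schedule, return 0
 * so that the caller skips the sleep entirely.
 */
static uint32_t remaining_sleep_us(uint32_t period_us, uint64_t elapsed_us)
{
	assert(period_us > 0);

	if (elapsed_us >= period_us)
		return 0;

	return period_us - (uint32_t)elapsed_us;
}
```

For a 5 second period and 6 seconds of elapsed work, the naive version yields 4293967296 microseconds (about 71.6 minutes), while the guarded version yields 0.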
Alexander Mikhalitsyn
28be637ae4 lxcfs: use strlcpy when handle runtime-dir parameter
Fixes: Coverity 451805
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-01-31 14:29:07 +01:00
Alexander Mikhalitsyn
0f253c7d4c utils: move strlcpy/strlcat helpers from cgroup_utils to utils
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-01-31 14:25:10 +01:00
Alexander Mikhalitsyn
3aa1bb65b7 cpuset_parse: make a check for an empty string in cpu_in_cpuset()
Fixes: Coverity 382195
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-01-31 10:44:22 +01:00
Alexander Mikhalitsyn
531a988d49 utils: fix wait_for_sock to use time_t instead of int
Fixes: Coverity 382186
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2025-01-31 10:29:36 +01:00
Stéphane Graber
19e2c99899 Merge pull request #671 from asainkujovic/excludesrecl
meminfo: Add slab_reclaimable to MemAvailable
2025-01-07 11:29:08 -05:00
Asain Kujovic
abdecf18cf meminfo: Add slab_reclaimable to MemAvailable
Signed-off-by: Asain Kujovic <asainnp@gmail.com>
2025-01-05 06:15:08 +01:00
Feng Sun
31da3ae731 proc_fuse: add psi(pressure stall information) procfs
The kernel has supported PSI (pressure stall information) since 4.20
via procfs /proc/pressure/{io,cpu,memory} and
cgroupv2 {io.pressure, cpu.pressure, memory.pressure}.

This patch adds a read-only PSI procfs,
so people can now get pressure information.
Full monitoring functionality is still under investigation.

Signed-off-by: Feng Sun <loyou85@gmail.com>
2025-01-03 10:53:20 -05:00
Alexander Mikhalitsyn
49e862b696 cgroups/cgfsng: improve swap accounting support detection
When the LXCFS daemon runs in the root cgroup of the cgroup2 tree,
we need to go down the tree when checking for memory.swap.current.

We already have some logic to go up the tree from the LXCFS daemon's
cgroup, but it's useless when the LXCFS daemon sits in the cgroup2 tree root.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2024-10-02 15:59:52 +02:00
Alexander Mikhalitsyn
56fd97e62e lxcfs: fix readdir for procfs subtree
After #640 was merged, the entire
procfs subtree became unavailable.

Fixes: #640
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2024-10-02 10:50:36 +02:00
Stéphane Graber
68fa858e03 Merge pull request #640 from DevonSchwartz/fix_lxcfs_read_null
Fix lxcfs read null
2024-09-24 16:20:33 -04:00
Devon Schwartz
bcb1b0a930 lxcfs_read: Added LXCFS_TYPE macro to all FUSE filesystem calls
Signed-off-by: Devon Schwartz <devon.s.schwartz@utexas.edu>
2024-09-24 14:21:26 -05:00
Stéphane Graber
1e4e1841d6 Add missing linux/limits.h include
Closes #657

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
2024-09-15 23:19:23 +02:00
Sebastien Dabdoub
5c42da9c19 lxcfs/bindings: add a flag for overriding the runtime dir
Adds a --runtime-dir cli flag which overrides the /run dir in the lxcfslib.
This ended up being kind of tricky because of how lxcfslib can be reloaded and
its use of a library constructor.

In order to read the cli flag and then set a variable in the library, I removed
the constructor and made init happen as part of the fuse load/reload.

I also added the runtime field to the lxcfs_opts struct and upped its version
for backwards compatibility.

Signed-off-by: Sebastien Dabdoub <sebastien.dabdoub@gmail.com>
- permission mask change 0755 -> 0700
- prevent potential NULL-pointer dereference in lxcfs_fuse_init()
- commit message edits
- one commit was squashed
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2024-06-06 11:41:11 +02:00
Sebastien Dabdoub
328a30b1de lxcfs/bindings: Refactor RUNTIME_PATH so that it can be overridden on startup
What was RUNTIME_PATH is now named DEFAULT_RUNTIME_PATH so this should not change current behavior.
It is done in preparation for adding a flag to override the runtime path on startup.

Signed-off-by: Sebastien Dabdoub <sebastien.dabdoub@gmail.com>
- permission mask change 0755 -> 0700
- commit message edits
- use tabs everywhere instead of spaces
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2024-06-06 11:35:19 +02:00
Samuel FORESTIER
276cc1cbf1 proc: checks system security policy before trying to get personalities
096972f7 and fc8f593b introduced task personality retrieval to fix
incorrect /proc file info in some cases.
Linux governs access to personalities based on system ptrace policy,
which may be restricted by an LSM (e.g. Yama).

This patch implements a simple check for init's personality access to
make sure ptrace usage is allowed, and prevent access from containers to
proc files with "Permission denied" error if not.

> closes #636 (follow-up to #553 and #609).

Signed-off-by: Samuel FORESTIER <samuel+dev@forestier.app>
2024-05-01 11:10:27 +02:00
Alexander Mikhalitsyn
86d93a312b cgroup_utils: explicitly check for cgroup2 FDs in cgroup_walkup_to_root
See:
https://github.com/lxc/lxcfs/pull/617#discussion_r1533524372

Suggested-by: Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2024-03-27 14:55:35 +01:00
Alex Hudspith
a6c309be27 proc: Fix swap handling for cgroups v2 (zero limits)
Since memory.swap.max = 0 is valid under v2, limits of 0 must not be
treated differently. Instead, use UINT64_MAX as the default limit. This aligns
with cgroups v1 behaviour anyway since 'limit_in_bytes' files contain a large
number for unspecified limits (2^63).

Resolves: #534
Signed-off-by: Alex Hudspith <alex@hudspith.io>
2024-03-27 13:39:23 +01:00
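
The point of a6c309be27 is that 0 can no longer double as "no limit", because memory.swap.max = 0 is a real limit under cgroup v2. A sketch of that sentinel choice follows. It is illustrative only: the helper names are hypothetical, and how LXCFS actually combines the limit with the host value is an assumption here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser for a cgroup v2 limit file such as memory.swap.max.
 * "max" and unparsable input map to UINT64_MAX (no limit); "0" stays 0.
 */
static uint64_t parse_cgroup_limit(const char *s)
{
	unsigned long long v;
	char *end;

	assert(s != NULL);

	if (strncmp(s, "max", 3) == 0)
		return UINT64_MAX;

	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno != 0 || end == s)
		return UINT64_MAX;

	return (uint64_t)v;
}

/* With UINT64_MAX as "unlimited", a plain minimum handles every case. */
static uint64_t effective_swap_total(uint64_t host_swap, uint64_t limit)
{
	return limit < host_swap ? limit : host_swap;
}
```

A limit of 0 then yields a swap total of 0 without any special casing, and an unset limit falls back to the host value.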
Alex Hudspith
f496e62cdb proc: Fix swap handling for cgroups v2 (can_use_swap)
On cgroups v2, there are no swap current/max files at the cgroup root, so
can_use_swap must look lower in the hierarchy to determine if swap accounting
is enabled. To also account for memory accounting being turned off at some
level, walk the hierarchy upwards from lxcfs' own cgroup.

Signed-off-by: Alex Hudspith <alex@hudspith.io>
[ added check cgroup pointer is not NULL in lxcfs_init() ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2024-03-27 13:38:32 +01:00
Alex Hudspith
b50a9a3d67 proc_fuse: Fix get_swap_info typo swtotal == 0 -> *swtotal == 0
Signed-off-by: Alex Hudspith <alex@hudspith.io>
2024-03-27 13:34:41 +01:00
Alexander Mikhalitsyn
000b539f1b lxcfs: introduce new option --enable-cgroup
During our private discussion, Stéphane proposed
adding a new option --enable-cgroup to explicitly
enable the old cgroup emulation code.

It's worth mentioning that the cgroup code in LXCFS
is not widely used, because it was written before
the cgroup namespace era and is no longer relevant these days.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2024-03-15 18:28:45 +01:00
Alexander Mikhalitsyn
ce8e6e973c sysfs: forbid write()
It's just dangerous to allow passthrough of the write()
syscall anywhere under the emulated sysfs subtree.

Let's forbid it.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2024-03-15 16:47:57 +01:00
Stéphane Graber
7d508dfb1e Merge pull request #622 from zhaixiaojuan/main
Add macro pivot&bpf for loongarch64
2024-01-14 22:27:01 +01:00
zhaixiaojuan
10f17f5dde Add macro pivot&bpf for loongarch64
Signed-off-by: zhaixiaojuan <zhaixiaojuan@loongson.cn>
2024-01-14 16:25:34 -05:00
vegbir
c27c750ba0 typofix: fix incorrect printing in lxcfs help interface
Signed-off-by: vegbir <yangjiaqi16@huawei.com>
2023-12-14 07:27:07 +00:00
Kyeong Yoo
5340b27fc5 proc: fix MemAvailable in /proc/meminfo to exclude tmpfs files
The "total_cache" from memory.stat of cgroup includes
the memory used by tmpfs files ("total_shmem"). Considering
it as available memory is wrong because files created
on a tmpfs file system cannot be simply reclaimed.

So the available memory is calculated as the sum of:
 * Memory the kernel knows is free
 * Memory that is contained in the kernel active file LRU,
   that can be reclaimed if necessary
 * Memory that is contained in the kernel non-active file
   LRU, that can be reclaimed if necessary

Signed-off-by: Kyeong Yoo <kyeong.yoo@alliedtelesis.co.nz>
2023-10-03 16:36:51 +13:00
Stéphane Graber
87a2fe91b8 Merge pull request #612 from mihalicyn/load_daemon_signature_v2
loadavg: make cleanup of start_loadavg
2023-09-29 12:15:13 -04:00
Christian Brauner
4c7965d92c Merge pull request #614 from stgraber/main
lxcfs: Add startup message
2023-09-29 18:12:05 +02:00
Stéphane Graber
8ba6e228f7 Merge pull request #613 from mihalicyn/cpuview_debug_print_fix
cpuview: pass a correct argument to lxcfs_debug
2023-09-29 12:07:36 -04:00
Stéphane Graber
926a698f73 lxcfs: Add startup message
Closes #560

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
2023-09-29 12:06:45 -04:00
Alexander Mikhalitsyn
a019277c7f cpuview: pass a correct argument to lxcfs_debug
struct cg_proc_stat *cur;
...
lxcfs_debug("Removing stat node for %s\n", cur);

should be:

lxcfs_debug("Removing stat node for %s\n", cur->cg);

Only reproducible when DEBUG macro is defined.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2023-09-29 17:28:46 +02:00
Alexander Mikhalitsyn
e0533550b2 loadavg: make cleanup of start_loadavg
Cleanup start_loadavg code:
- add a new external symbol load_daemon_v2 with the pthread_create-like signature
- make hacky casts of pthread_t to int (and reverse) unnecessary for new API users

Related to: #610

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2023-09-29 16:37:21 +02:00