This implements a way to run a debug container with a provided image on
the node.
The container runs with a privileged profile, which allows running
debugging commands (e.g. using some advanced network tools) to
troubleshoot a machine.
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Add DisableAccessTime and Secure mount options for existing volumes.
DisableAccessTime adds noatime parameter to disable access time updates.
Secure adds nosuid and nodev parameters for security (defaults to true).
Add integration tests for both options.
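A minimal sketch of how the two options could map to mount parameters. The `MountOptions` type and the `mountParams` helper are hypothetical names used for illustration, not the actual Talos types; the `Secure` default of true is modeled here with a nil pointer.

```go
package main

import (
	"fmt"
	"strings"
)

// MountOptions is a hypothetical stand-in for the volume mount settings;
// the field names mirror the options named above, not the real Talos types.
type MountOptions struct {
	DisableAccessTime bool
	Secure            *bool // nil means "use the default", which is true
}

// mountParams translates the options into mount parameter names.
func mountParams(o MountOptions) []string {
	var params []string

	if o.DisableAccessTime {
		params = append(params, "noatime")
	}

	// Secure defaults to true, so only an explicit false disables it.
	if o.Secure == nil || *o.Secure {
		params = append(params, "nosuid", "nodev")
	}

	return params
}

func main() {
	insecure := false

	fmt.Println(strings.Join(mountParams(MountOptions{DisableAccessTime: true}), ","))
	fmt.Println(strings.Join(mountParams(MountOptions{Secure: &insecure}), ","))
}
```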
Signed-off-by: Pranav Patil <pranavppatil767@gmail.com>
Unify a list of all APIs in Talos to a single place, and use them in
associated tests:
* the test for one2many specifics
* the test for deprecated methods
* the test for missing RBAC rules
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Allow some patches to be generated correctly according to the version
contract of the machine configuration.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fix an issue when PercentageSize is used: instead of calling Merge, the code was trying to merge individual unexported fields.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
These new APIs only support one2one proxying, so they don't have any
hacks, and look like regular gRPC APIs.
Old APIs are deprecated, but still supported.
Implement client-side multiplexing in `talosctl`, provide fallback to
old APIs for legacy Talos versions.
New APIs include removing an image and importing an image.
Extracted from #12392
Co-authored-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This command was always hidden, rename it to `debug-tool` to free up the
`talosctl debug` for #12932.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Instead of defaulting to one2many, list explicitly one2many supported
APIs.
The idea is that any new API will only be "normal" gRPC API, so we can
flip the switch, and consider one2many APIs as "legacy".
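A rough sketch of the explicit-list idea, assuming a lookup keyed by full gRPC method name. The method names and the `supportsOne2Many` helper are placeholders for illustration, not the list maintained in Talos.

```go
package main

import "fmt"

// one2manyMethods is an illustrative allowlist; the method names are
// placeholders, not the actual list maintained in Talos.
var one2manyMethods = map[string]struct{}{
	"/example.LegacyService/List": {},
	"/example.LegacyService/Logs": {},
}

// supportsOne2Many reports whether a full gRPC method name is explicitly
// listed as one2many-capable. Anything not listed is treated as a regular
// one2one API, which is what every new method gets by default.
func supportsOne2Many(fullMethod string) bool {
	_, ok := one2manyMethods[fullMethod]

	return ok
}

func main() {
	fmt.Println(supportsOne2Many("/example.LegacyService/List"))
	fmt.Println(supportsOne2Many("/example.NewService/Get"))
}
```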
Extracted from #12392
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This change prevents user-specified exposed ports from overriding the
default ones.
This allows, for example, exposing the Kubernetes endpoint both at the
default random port and at a specified host address.
Signed-off-by: Dmitrii Sharshakov <dmitry.sharshakov@siderolabs.com>
The --k8s-endpoint flag was defined but never used in the rotate-ca
command. This fix passes the flag value through to the Kubernetes
client, allowing users to override the default Kubernetes API endpoint
during CA rotation.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Fixes #12649
The cryptic error was coming from our code, as it never worked if the
decoded node is not a mapping node.
Also annotate errors with line numbers (or document kinds) to make
the problem easier to understand, specifically for multi-doc and long
configs.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
It seems that etcd might derive them incorrectly on an IPv6-only system.
This change is confusing, as it sets the `--initial-` prefixed flag even
after join, but it seems that on the etcd side, the configuration value
is always used despite the flag name.
Fixes #12646 (see the issue for more details)
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Updated `ReadFromVolume` to open the filesystem it's attempting to
read from as read-only. This allows `vfat` cloud init volumes to be
successfully read by Talos Linux. This change was made here and not in
`pkg/xfs/fsopen/fsopen_linux.go` so that it only applies to volumes that
are being read for cloud init configuration, not all volumes.
Fixes https://github.com/siderolabs/talos/issues/12647.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
We always unconditionally create the `BIOS` partition, even on arm64, so
the prefix should be the same on all arches.
We don't use `BIOS` on arm64, but this would still be easier to support
in the future.
Co-authored-by: Dmitrii Sharshakov <dmitry.sharshakov@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This commit changes the way kubespan gets the podCIDR to advertise when
`advertiseKubernetesNetworks` is enabled. Before, it used the interface
address, but some CNIs (such as Cilium in NativeRouting) only set a
single /32 IP to a single interface (`cilium_host` in cilium's case).
This adds the `v1.Node`'s `.spec.podCIDRs` array to the `k8s.NodeStatus`
object and uses this to advertise the kubernetes network.
Signed-off-by: Florian Ströger <stroeger@youniqx.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Same as for any other resource - layering per source, and proper merge
across layers, so we can see where it comes from.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This commit introduces ProbeConfig, a new network configuration document type
that allows users to configure TCP connectivity probes to monitor network
endpoints.
Features:
- ProbeConfig document type with TCP probe support
- ProbeSpec and ProbeStatus resources for probe management
- ProbeConfigController to translate ProbeConfig into ProbeSpec
- ProbeController to execute probes and update ProbeStatus
- Configurable probe interval, timeout, and failure threshold
- Integration tests for API functionality
Signed-off-by: Mickaël Canévet <mickael.canevet@proton.ch>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Add support for using certificates stored in AWS Certificate Manager to
sign secureboot images in imager.
Signed-off-by: Tim Jones <tim.jones@siderolabs.com>
If SMBIOS does not report memory information, fall back to
/proc/meminfo and expose a dummy memory module as a best-effort
approximation.
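A small sketch of the fallback step, assuming the total is taken from the `MemTotal` line of /proc/meminfo. The `memTotalBytes` helper is a hypothetical name; how Talos builds the dummy memory module from this value is not shown here.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// memTotalBytes extracts the MemTotal value from /proc/meminfo-formatted
// text and converts it to bytes (the kernel reports it in units of 1024 bytes).
func memTotalBytes(meminfo string) (uint64, error) {
	scanner := bufio.NewScanner(strings.NewReader(meminfo))

	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 || fields[0] != "MemTotal:" {
			continue
		}

		kib, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return 0, fmt.Errorf("parsing MemTotal: %w", err)
		}

		return kib * 1024, nil
	}

	return 0, errors.New("MemTotal not found")
}

func main() {
	// Sample input; a real caller would read /proc/meminfo instead.
	sample := "MemTotal:       16384000 kB\nMemFree:         1024000 kB\n"

	total, err := memTotalBytes(sample)
	fmt.Println(total, err)
}
```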
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
Present a warning to users when settings that might affect the usage of
802.3ad are missing.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
Simplify the flow a bit by using live partition info,
avoid doing some calculations which are already done in the
partition code.
Remove some steps I believe we don't need to do.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Add support for negative max size values in volume configuration.
Negative max size represents the amount of space to be left free on the device, rather than the size the volume should consume.
For example, a max size of "-10GiB" means the volume can grow to the device size minus 10GiB.
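The arithmetic can be sketched as follows. The `effectiveMaxSize` helper is a hypothetical name, and two details are assumptions made for this example: zero is treated as "no limit", and a reserve larger than the device clamps the result to zero.

```go
package main

import "fmt"

const GiB = 1 << 30

// effectiveMaxSize resolves a configured max size against the device size.
// A positive value is used as is; a negative value is the amount of space
// to leave free on the device. Treating zero as "no limit" and clamping to
// zero when the reserve exceeds the device are assumptions of this sketch.
func effectiveMaxSize(deviceSize uint64, maxSize int64) uint64 {
	switch {
	case maxSize > 0:
		return uint64(maxSize)
	case maxSize < 0:
		reserve := uint64(-maxSize)
		if reserve >= deviceSize {
			return 0
		}

		return deviceSize - reserve
	default:
		return deviceSize
	}
}

func main() {
	// A 100GiB device with max size "-10GiB", printed in GiB.
	fmt.Println(effectiveMaxSize(100*GiB, -10*GiB) / GiB)
}
```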
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
This value for some historical reason (I guess treating empty string as
'none') doesn't use standard enumer's methods.
So we shipped it in Talos 1.12 without proper encoding/decoding
in YAML config documents (it was actually converted to int).
Fix encoding, but keep backwards compatibility for integer values
just in case someone already started relying on it.
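One way to accept both forms is sketched below. The `Mode` enum and its values are hypothetical placeholders for the affected type; only the idea (decode by name first, then fall back to the legacy integer form) comes from the description above.

```go
package main

import (
	"fmt"
	"strconv"
)

// Mode is a hypothetical enum standing in for the affected value.
type Mode int

const (
	ModeNone Mode = iota
	ModeFast
	ModeSafe
)

// modeNames maps textual names to values; the empty string is treated
// as 'none'.
var modeNames = map[string]Mode{
	"":     ModeNone,
	"none": ModeNone,
	"fast": ModeFast,
	"safe": ModeSafe,
}

// UnmarshalText decodes the textual name, falling back to the legacy
// integer representation for backwards compatibility.
func (m *Mode) UnmarshalText(text []byte) error {
	s := string(text)

	if v, ok := modeNames[s]; ok {
		*m = v

		return nil
	}

	n, err := strconv.Atoi(s)
	if err != nil || n < int(ModeNone) || n > int(ModeSafe) {
		return fmt.Errorf("invalid mode %q", s)
	}

	*m = Mode(n)

	return nil
}

func main() {
	for _, in := range []string{"safe", "1", "bogus"} {
		var m Mode

		err := m.UnmarshalText([]byte(in))
		fmt.Println(in, int(m), err)
	}
}
```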
Fixes #12625
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
In addition to the derivative of the full PSI for the affected cgroups,
also look at the avg10 value to provide some hysteresis against small
spikes.
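A simplified sketch of combining the two signals. The `psiSample` type, the `shouldTrigger` helper, and the threshold values are hypothetical; the actual trigger condition in Talos may differ.

```go
package main

import "fmt"

// psiSample holds the two "full" PSI values used here: the cumulative
// stall time in microseconds and the 10-second average in percent.
type psiSample struct {
	totalMicros uint64
	avg10       float64
}

// shouldTrigger requires both a high rate of change of the cumulative
// stall time and a high avg10, so a short spike that has not yet moved
// the 10-second average does not trigger on its own.
func shouldTrigger(prev, cur psiSample, elapsedSeconds, rateThreshold, avg10Threshold float64) bool {
	if elapsedSeconds <= 0 || cur.totalMicros < prev.totalMicros {
		return false
	}

	// Fraction of the interval spent fully stalled.
	rate := float64(cur.totalMicros-prev.totalMicros) / 1e6 / elapsedSeconds

	return rate > rateThreshold && cur.avg10 > avg10Threshold
}

func main() {
	prev := psiSample{totalMicros: 0, avg10: 0}

	// Threshold values below are illustrative only.
	spike := psiSample{totalMicros: 500000, avg10: 1.0}
	sustained := psiSample{totalMicros: 500000, avg10: 20.0}

	fmt.Println(shouldTrigger(prev, spike, 1, 0.1, 5.0))
	fmt.Println(shouldTrigger(prev, sustained, 1, 0.1, 5.0))
}
```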
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
I got a failure where the dual-boot image refuses to format the
EPHEMERAL partition where the `EFI` partition (VFAT) used to be.
So until we have a resolution, do this workaround.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This should reduce false triggers due to high IO activity and similar
events increasing global memory PSI despite free memory being available.
Also add more details for trigger condition and debugging.
Fixes: #12526
Co-authored-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Signed-off-by: Dmitrii Sharshakov <dmitry.sharshakov@siderolabs.com>
Open the blockdevice in `O_EXCL` mode when wiping to ensure that we
don't wipe a mounted device.
This issue was discovered via #12620, where wiping a blockdevice which
was still mounted left it in a wrong state.
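For illustration, a sketch of the exclusive open using only the standard library. On Linux, opening a block device with `O_EXCL` (without `O_CREAT`) fails with `EBUSY` while the device is in use, for example when it is mounted. The `openExclusive` helper and the device path are placeholders, not the Talos code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// openExclusive opens a block device for read/write with O_EXCL, so the
// open fails if the device is currently in use by the system.
func openExclusive(path string) (*os.File, error) {
	return os.OpenFile(path, os.O_RDWR|os.O_EXCL, 0)
}

func main() {
	// "/dev/example" is a placeholder path for this sketch.
	f, err := openExclusive("/dev/example")

	switch {
	case errors.Is(err, syscall.EBUSY):
		fmt.Println("device is in use, refusing to wipe")
	case err != nil:
		fmt.Println("open failed:", err)
	default:
		defer f.Close()

		fmt.Println("device opened exclusively, safe to wipe")
	}
}
```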
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
This reports image pull progress in the console for images pulled by
Talos:
* etcd
* kubelet
* installer
This work was mostly done by @laurazard, I just wrapped it for the
console with Laura's help. (see #12932)
Co-authored-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The problem is that xfs with 6.18 LTS settings is not supported
by GRUB yet. It might be supported with the newly released 2.14 though.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Add Talos details to the Hetzner Cloud client user-agent.
Helps us identify and troubleshoot issues with users running Talos on Hetzner Cloud.
Signed-off-by: Jonas Lammler <jonas.lammler@hetzner-cloud.de>
Signed-off-by: Noel Georgi <git@frezbo.dev>
Changing `.cluster.controlPlane.endpoint=$NEW` will cause old tokens to be no longer valid.
We want to ensure that new tokens are issued using the `.cluster.controlPlane.endpoint=$NEW` value,
but all the existing tokens (issued using `.cluster.controlPlane.endpoint=$OLD`) are still accepted.
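A sketch of one way to get this behavior, relying on the kube-apiserver rule that `--service-account-issuer` can be repeated: the first value is used to issue tokens and all values are accepted. The `issuerArgs` helper and the endpoint values are hypothetical; this is not necessarily how the change is implemented.

```go
package main

import "fmt"

// issuerArgs builds repeated --service-account-issuer flags: the current
// endpoint goes first (used for newly issued tokens), followed by the
// previous endpoints (still accepted for validation).
func issuerArgs(current string, previous []string) []string {
	args := []string{"--service-account-issuer=" + current}

	for _, p := range previous {
		if p == current {
			continue // avoid listing the same issuer twice
		}

		args = append(args, "--service-account-issuer="+p)
	}

	return args
}

func main() {
	// Endpoint values are placeholders.
	for _, arg := range issuerArgs("https://new.example:6443", []string{"https://old.example:6443"}) {
		fmt.Println(arg)
	}
}
```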
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
For the API server, passing `service-account-issuer` in extra args now adds the values to the default one.
Fixes #11694
Signed-off-by: Mateusz Urbanek <mateusz.urbanek@siderolabs.com>
BREAKING: internal resources for the components use a different
representation of ExtraArgs, resulting in modified types in protocol
buffers.
Signed-off-by: Mateusz Urbanek <mateusz.urbanek.98@gmail.com>
This type is used in Image Factory schematic, so move it into machinery
so that it can be imported into IF without pulling Talos core.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>