* agent/installconfig: Add two-node-with-fencing topology and refactor
two-node validation
* feat: add override for control plane fencing creds
Signed-off-by: ehila <ehila@redhat.com>
* Add TNF fencing credentials override test
* Update integration test with new validation result
* Update installer verification and tests to allow only URLs containing redfish for the Two Nodes with Fencing topology
* Update validation check for redfish
* Remove simultaneous dual replica feature set restriction
* Update fencing address validation to include port
* Update validation to disallow http
* Update and expand url validation tests
* Revert "Update validation to disallow http"
This reverts commit e9595a8d4f.
* Update variable name
* Update tests
* Add YAML tags to Credential struct for fencing
Add explicit yaml struct tags to the Credential type to ensure proper
YAML serialization with lowercase field names (e.g., 'hostname' instead
of 'hostName'). This is required for the assisted-service to correctly
parse the fencing credentials file.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add fencing credentials file generation for TNF clusters
Generate /etc/assisted/hostconfig/fencing-credentials.yaml containing
all fencing credentials from controlPlane.fencing.credentials[]. This
file is embedded in the agent ISO and consumed by assisted-service
during TNF cluster installation.
Key changes:
- Add OptionalInstallConfig to Ignition Dependencies()
- Add addFencingCredentials() function to generate the YAML file
- Call addFencingCredentials() in Generate() after NTP sources
- Add comprehensive unit tests for the new function
The single-file approach avoids directory naming collisions between
MAC-based host directories and hostname-based fencing credentials.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Revert fencing credentials override
The fencing credentials are now passed to assisted-service via the
hostconfig/fencing-credentials.yaml file embedded in the ISO, making
the install-config annotation override unnecessary.
This reverts commits:
- 105b3c95c9 Add TNF fencing credentials override test
- a06d1a766b feat: add override for control plane fencing creds
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Improve fencing credentials test coverage
Enhance TestIgnition_addFencingCredentials with:
- File owner verification (assert root ownership)
- Append behavior test with pre-existing files
- Fix misleading test name and add second credential to match
valid TNF configuration (2 credentials required)
- Remove unused expectError field from test struct
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Support vendor-specific redfish schemes in fencing validation
Vendor-specific redfish schemes like idrac-redfish:// and ilo5-redfish://
use HTTPS (port 443) by default, so they should be valid without an
explicit port number.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* unit tests: Add missing OptionalInstallConfig dependency in ignition test
The TestIgnition_Generate test was panicking because the
OptionalInstallConfig asset was missing from the test dependencies.
This caused dependencies.Get() to return a nil value when the
addFencingCredentials function tried to access it.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* agent: Refactor fencing credentials into standalone asset
Move fencing credentials generation from inline ignition.go code into a
proper FencingCredentials asset following the installer's asset pattern.
This refactor:
- Creates pkg/asset/agent/manifests/fencingcredentials.go as a
WritableAsset with Dependencies, Generate, Files, and Load methods
- Adds comprehensive unit tests in fencingcredentials_test.go
- Integrates FencingCredentials into AgentManifests dependency graph
- Removes addFencingCredentials() from ignition.go
- Adds positive integration test for TNF with fencing credentials
- Changes output path from /etc/assisted/hostconfig/ to
/etc/assisted/manifests/ (standard manifests location)
The asset automatically returns empty Files() for non-TNF clusters,
so no fencing-credentials.yaml is generated unless fencing is configured.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* agent: Improve fencing credentials code quality
- Add explicit YAML library aliasing for clarity (goyaml for marshal,
k8syaml for unmarshal) with documentation explaining why different
libraries are used for each operation
- Improve error message to include credential count for debugging
- Add test case for empty fencing credentials array
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* agent: Fix CI failures for gofmt and integration tests
- Add blank line between k8s.io and github.com/openshift import groups
in ignition_test.go to satisfy gci formatting requirements
- Add featureSet: TechPreviewNoUpgrade to tnf_with_fencing_credentials
integration test to enable the DualReplica feature gate required for
TNF fencing configuration
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* agent: Move fencing credentials to agentconfig package
FencingCredentials is a host-scoped configuration asset, not a
cluster-scoped manifest. Moving it from manifests/ to agentconfig/
aligns with the package's purpose and follows the pattern used by
other host configuration assets like AgentHosts.
This change also updates ignition.go to import from the new location
and removes the now-unused fencing credentials from agent.go manifests.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* agent: Add FencingCredentials to ignition test dependencies
The TestIgnition_Generate test was failing with a panic because the
FencingCredentials asset was added as a dependency to Ignition.Generate()
but wasn't included in the test's buildIgnitionAssetDefaultDependencies()
helper function.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* agent: Fix fencing credentials path in integration test
The integration test expected the fencing credentials file at
/etc/assisted/manifests/ but assisted-service reads it from
/etc/assisted/hostconfig/ (HOST_CONFIG_DIR default). The installer
correctly embeds the file at hostconfig/, so the test expectation
was wrong.
Changed test path from manifests to hostconfig to match both the
installer implementation and assisted-service expectations.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* agent: Add nolint directive for gosec G101 false positive
The gosec linter flags fencingCredentialsFilename as "potential
hardcoded credentials" (G101) because the variable name contains
"credentials". This is a false positive - the variable contains
a filename string, not actual credentials.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* agent: Fix expected YAML field order in TNF integration test
The expected fencing-credentials.yaml had fields in a different order
than the actual YAML serialization output. Updated the expected file
to match the actual field order: hostname, username, password, address,
certificateVerification.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Signed-off-by: ehila <ehila@redhat.com>
Co-authored-by: ehila <ehila@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
This commit is an incremental step in migrating AWS API calls
to AWS SDK v2. It focuses only on the logic that reads the default region
from the loaded config for the survey.
Custom DNS (userProvisionedDNS) is not supported on Azure Stack Hub. This
change adds validation to prevent users from setting userProvisionedDNS on
Azure Stack Hub.
This commit is an incremental step in migrating AWS API calls to AWS SDK
v2. It focuses on the ELB/ELBv2 clients in pkg/asset and dependent
package(s).
The ELB and ELBv2 clients now use SDK v2 with custom endpoint resolvers
that maintain backwards compatibility with SDK v1 service endpoint
configurations. Special handling is included for the fact that SDK v1 used
the same endpoint identifier ("elasticloadbalancing") for both ELB classic
and ELBv2, while SDK v2 uses distinct service IDs.
Filter out AI zones when discovering zones in the region. AI zones
do not have quota for general compute resources, so we should not provision
nodes there by default.
This commit is an incremental step in migrating AWS API calls
to AWS SDK v2. It focuses on handlers that retrieve the source
or provider of credentials, for example via a shared credentials file
or via environment variables.
Note: this logic determines whether the credential provider
is static, in which case the credentials are safe to transfer to the
cluster as-is in Mint and Passthrough credentialsMode.
* pkg/destroy/aws/ec2helpers.go
** The bulk of the changes are in the ec2helpers file. All of the SDK v1 imports
are removed except for session, which is currently ingrained in too many files.
pkg/destroy/aws/aws.go
** Add clients for ELB, ELBv2, and IAM to the cluster removal struct. Even though
these changes are mainly to ec2helpers, the other clients were required for
certain operations.
** The rest of the file updates change the ARN import to come from AWS SDK v2.
* pkg/destroy/aws/iamhelpers.go
** Remove/change all imports from AWS SDK v1 to v2.
pkg/destroy/aws/errors.go
pkg/destroy/aws/ec2helpers.go
** Move the error checking/formatting function from ec2helpers into the
errors.go file.
* pkg/destroy/aws/elbhelpers.go
** Remove all SDK v1 imports from the ELB helpers.
* Add a reference to the correct HandleErrorCode function.
* pkg/destroy/aws/aws.go
** Update the Route53, S3, and EFS services to SDK v2. This slowly removes the
requirement for the AWS session.
* ** Vendor updates for the S3 and EFS services.
** This caused updates to other packages such as aws/config, credentials, stscreds, and
a number of AWS internal packages.
* Clean up references and use the exported config creator to create new clients in destroyer.
* ** Migrate the use of the resource tagging API to SDK v2.
pkg/destroy/aws:
** Rename the function from HandleErrorCode to handleErrorCode. The initial thought was that
this function could be used in other areas of the code, but it will remain in destroy for now.
pkg/destroy/aws/shared.go:
** Remove the session import and its uses in the file.
* Fix references to HandleErrorCode.
* pkg/destroy/aws/aws.go:
** Remove session from the imports. Add the agent handler to the configurations.
* Fix package updates for vendoring.
* Use the correct private and public zone clients.
Set a Destroy User Agent.
Clean up pointer references to use the AWS SDK.
* The ListUsers API call does not return tags for the IAM users in the
response. There is a separate call, ListUserTags, to fetch each user's tags
for checking in the installer code.
* rebase: fix other imports after rebase
* revert: use GetRole/GetUser to fetch tags
An older commit used ListRoleTags/ListUserTags in order to save
bandwidth by fetching only tags. However, the minimal permissions
required for the installer do not include iam:ListUserTags or
iam:ListRoleTags, causing deprovisioning to skip users and
roles. This is part of the reason for previous CI leaks.
This commit reverts that optimisation and just uses GetRole/GetUser,
which are covered by the minimal permission policy.
---------
Co-authored-by: barbacbd <barbacbd@gmail.com>
Add NodeIPFamilies configuration to AWS cloud provider config
when dual-stack networking is enabled. The cloud provider now
sets the appropriate IP family ordering (ipv4/ipv6 or ipv6/ipv4)
based on the install config's IPFamily setting.
For dual-stack IPv4 primary clusters, NodeIPFamilies is set to:
NodeIPFamilies=ipv4
NodeIPFamilies=ipv6
For dual-stack IPv6 primary clusters, NodeIPFamilies is set to:
NodeIPFamilies=ipv6
NodeIPFamilies=ipv4
Single-stack IPv4 clusters continue to use the minimal config with an
empty Global section.
** While regional support is valid, we will not be using it in OpenShift. Regional support
requires that each API have its own endpoint; only one API is associated with an endpoint,
and managing this access would be difficult and unnecessary at this time.
In order to attach IPv6 addresses to the ENI of EC2 instances, the
instance type must support IPv6 networking. The installer must validate
it by inspecting the networking capabilities of instance type via EC2
API calls.
pkg/types/gcp/machinepools.go:
Include the n4a instance type in the map as well as the (current) supported disk types:
- hyperdisk-balanced
pkg/asset/installconfig/gcp/validation.go:
Include n4a in the set of Arm instance families.
Based on install-config input, update IPFamily in AWSPlatformStatus
and AzurePlatformStatus fields within the Infrastructure manifest.
Update unit tests to verify Infra manifest creation.
* azure: validate cluster name against Azure reserved words
Azure prohibits the use of certain reserved words and trademarks
in resource names. This change adds validation to reject cluster
names containing any of the 43 reserved words documented by Azure,
preventing deployment failures with ReservedResourceName errors.
Reserved words checked include:
- Complete reserved words (40): AZURE, OFFICE, EXCHANGE, etc.
- Substring forbidden (2): MICROSOFT, WINDOWS
- Prefix forbidden (1): LOGIN
* Update the checking logic for reserved words
* Fix gofmt issues
Control plane machines were intermittently being created in different
availability zones than specified in their machine specs. This occurred
because the zone list returned from FilterZonesBasedOnInstanceType used
a set's UnsortedList() func, which has a non-deterministic order.
When CAPI and MAPI manifest generation independently called this func,
they could receive zones in different orders, causing a mismatch in
machine zone placements between CAPI and MAPI manifests.
This commit ensures that we sort the zone slices before further
processing.