mirror of https://github.com/openshift/installer.git synced 2026-02-05 15:47:14 +01:00

7132 Commits

Author SHA1 Message Date
Pablo Fontanilla
5797d192d6 OCPEDGE-1517: add-tnf-agent-based-installer (#9946)
* agent/installconfig: Add two-node-with-fencing topology and refactor
two-node validation

* feat: add override for control plane fencing creds

Signed-off-by: ehila <ehila@redhat.com>

* Add TNF fencing credentials override test

* Update integration test with new validation result

* Update installer verification and tests to only allow URLs containing redfish for the Two-Node with Fencing topology

* Update validation check for redfish

* Remove simultaneous dual replica feature set restriction

* Update fencing address validation to include port

* Update validation to disallow http

* Update and expand url validation tests

* Revert "Update validation to disallow http"

This reverts commit e9595a8d4f.

* Update variable name

* Update tests

* Add YAML tags to Credential struct for fencing

Add explicit yaml struct tags to the Credential type to ensure proper
YAML serialization with lowercase field names (e.g., 'hostname' instead
of 'hostName'). This is required for the assisted-service to correctly
parse the fencing credentials file.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add fencing credentials file generation for TNF clusters

Generate /etc/assisted/hostconfig/fencing-credentials.yaml containing
all fencing credentials from controlPlane.fencing.credentials[]. This
file is embedded in the agent ISO and consumed by assisted-service
during TNF cluster installation.

Key changes:
- Add OptionalInstallConfig to Ignition Dependencies()
- Add addFencingCredentials() function to generate the YAML file
- Call addFencingCredentials() in Generate() after NTP sources
- Add comprehensive unit tests for the new function

The single-file approach avoids directory naming collisions between
MAC-based host directories and hostname-based fencing credentials.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Revert fencing credentials override

The fencing credentials are now passed to assisted-service via the
hostconfig/fencing-credentials.yaml file embedded in the ISO, making
the install-config annotation override unnecessary.

This reverts commits:
- 105b3c95c9 Add TNF fencing credentials override test
- a06d1a766b feat: add override for control plane fencing creds

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Improve fencing credentials test coverage

Enhance TestIgnition_addFencingCredentials with:
- File owner verification (assert root ownership)
- Append behavior test with pre-existing files
- Fix misleading test name and add second credential to match
  valid TNF configuration (2 credentials required)
- Remove unused expectError field from test struct

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Support vendor-specific redfish schemes in fencing validation

Vendor-specific redfish schemes like idrac-redfish:// and ilo5-redfish://
use HTTPS (port 443) by default, so they should be valid without an
explicit port number.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
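
The scheme check described in this change might look roughly like the sketch below. It is only an illustration: `validateFencingAddress` is a hypothetical helper, not the installer's function, and the rules it applies (scheme must contain "redfish", a host is required, the port is optional) are assumptions drawn from the messages above rather than the actual validation code.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// validateFencingAddress is a hypothetical helper written for illustration.
// Assumed rules: the URL scheme must contain "redfish" (so vendor-specific
// schemes such as idrac-redfish and ilo5-redfish pass), a host must be
// present, and an explicit port is optional.
func validateFencingAddress(address string) error {
	u, err := url.Parse(address)
	if err != nil {
		return fmt.Errorf("invalid fencing address %q: %w", address, err)
	}
	if !strings.Contains(u.Scheme, "redfish") {
		return fmt.Errorf("fencing address %q must use a redfish scheme", address)
	}
	if u.Hostname() == "" {
		return fmt.Errorf("fencing address %q has no host", address)
	}
	return nil
}

func main() {
	for _, addr := range []string{
		"idrac-redfish://192.168.111.1/redfish/v1/Systems/1",
		"redfish://192.168.111.1:8000/redfish/v1/Systems/1",
		"ipmi://192.168.111.1",
	} {
		fmt.Println(addr, "->", validateFencingAddress(addr))
	}
}
```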

* unit tests: Add missing OptionalInstallConfig dependency in ignition test

The TestIgnition_Generate test was panicking because the
OptionalInstallConfig asset was missing from the test dependencies.
This caused dependencies.Get() to return a nil value when the
addFencingCredentials function tried to access it.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* agent: Refactor fencing credentials into standalone asset

Move fencing credentials generation from inline ignition.go code into a
proper FencingCredentials asset following the installer's asset pattern.

This refactor:
- Creates pkg/asset/agent/manifests/fencingcredentials.go as a
  WritableAsset with Dependencies, Generate, Files, and Load methods
- Adds comprehensive unit tests in fencingcredentials_test.go
- Integrates FencingCredentials into AgentManifests dependency graph
- Removes addFencingCredentials() from ignition.go
- Adds positive integration test for TNF with fencing credentials
- Changes output path from /etc/assisted/hostconfig/ to
  /etc/assisted/manifests/ (standard manifests location)

The asset automatically returns empty Files() for non-TNF clusters,
so no fencing-credentials.yaml is generated unless fencing is configured.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* agent: Improve fencing credentials code quality

- Add explicit YAML library aliasing for clarity (goyaml for marshal,
  k8syaml for unmarshal) with documentation explaining why different
  libraries are used for each operation
- Improve error message to include credential count for debugging
- Add test case for empty fencing credentials array

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* agent: Fix CI failures for gofmt and integration tests

- Add blank line between k8s.io and github.com/openshift import groups
  in ignition_test.go to satisfy gci formatting requirements
- Add featureSet: TechPreviewNoUpgrade to tnf_with_fencing_credentials
  integration test to enable the DualReplica feature gate required for
  TNF fencing configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* agent: Move fencing credentials to agentconfig package

FencingCredentials is a host-scoped configuration asset, not a
cluster-scoped manifest. Moving it from manifests/ to agentconfig/
aligns with the package's purpose and follows the pattern used by
other host configuration assets like AgentHosts.

This change also updates ignition.go to import from the new location
and removes the now-unused fencing credentials from agent.go manifests.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* agent: Add FencingCredentials to ignition test dependencies

The TestIgnition_Generate test was failing with a panic because the
FencingCredentials asset was added as a dependency to Ignition.Generate()
but wasn't included in the test's buildIgnitionAssetDefaultDependencies()
helper function.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* agent: Fix fencing credentials path in integration test

The integration test expected the fencing credentials file at
/etc/assisted/manifests/ but assisted-service reads it from
/etc/assisted/hostconfig/ (HOST_CONFIG_DIR default). The installer
correctly embeds the file at hostconfig/, so the test expectation
was wrong.

Changed test path from manifests to hostconfig to match both the
installer implementation and assisted-service expectations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* agent: Add nolint directive for gosec G101 false positive

The gosec linter flags fencingCredentialsFilename as "potential
hardcoded credentials" (G101) because the variable name contains
"credentials". This is a false positive - the variable contains
a filename string, not actual credentials.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* agent: Fix expected YAML field order in TNF integration test

The expected fencing-credentials.yaml had fields in a different order
than the actual YAML serialization output. Updated the expected file
to match the actual field order: hostname, username, password, address,
certificateVerification.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Signed-off-by: ehila <ehila@redhat.com>
Co-authored-by: ehila <ehila@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
2026-02-05 12:30:34 +00:00
Vincenzo Mauro
ef751a9235 Add support for platform External in TNA clusters 2026-02-04 15:15:38 +01:00
openshift-merge-bot[bot]
2e77e61a96 Merge pull request #10236 from vimauro/tna-platform-none-support
OCPEDGE-2276: Add support for platform None in TNA (Two-Node Arbiter) clusters
2026-02-03 23:49:36 +00:00
openshift-merge-bot[bot]
9351917279 Merge pull request #10261 from tthvo/CORS-4055-cred-source
CORS-4055: migrate credential provider check to AWS SDK v2
2026-02-03 08:15:44 +00:00
openshift-merge-bot[bot]
1a283806d7 Merge pull request #10257 from tthvo/CORS-4328
CORS-4328: configure AWS CCM NodeIPFamilies for dual-stack support
2026-02-03 04:50:00 +00:00
openshift-merge-bot[bot]
e10bd06ca9 Merge pull request #10258 from tthvo/CORS-4055-elb
CORS-4055, CORS-4078: migrate ELB/ELBv2 API calls to AWS SDK v2
2026-02-03 00:09:45 +00:00
openshift-merge-bot[bot]
2a7b94f45c Merge pull request #10279 from jinyunma/fix-OCPBUGS-74631
OCPBUGS-74631: Add validation to reject userProvisionedDNS on Azure Stack Hub
2026-01-31 10:48:33 +00:00
Thuan Vo
3330f83b95 CORS-4055: migrate default region check to AWS SDK v2
The commit is an incremental step to migrate AWS API calls
to AWS SDK v2. This focuses only on the logic that gets the default region
from the loaded config for the survey.
2026-01-30 14:25:27 -08:00
Jinyun Ma
de51668ab4 Add validation to reject userProvisionedDNS on Azure Stack Hub
Custom DNS (userProvisionedDNS) is not supported on Azure Stack Hub. This
change adds validation to prevent users from setting userProvisionedDNS on
Azure Stack Hub.
2026-01-30 11:46:50 +08:00
Thuan Vo
08ad0f8617 CORS-4055, CORS-4078: migrate ELB/ELBv2 API calls to AWS SDK v2
The commit is an incremental step to migrate AWS API calls to AWS SDK
v2. This focuses on ELB/ELBv2 clients in the pkg/asset and dependent
pkg(s).

The ELB and ELBv2 clients now use SDK v2 with custom endpoint resolvers
that maintain backwards compatibility with SDK v1 service endpoint
configurations. Special handling is included for the fact that SDK v1 used
the same endpoint identifier ("elasticloadbalancing") for both ELB classic
and ELBv2, while SDK v2 uses distinct service IDs.
2026-01-29 13:07:08 -08:00
Patrick Dillon
fee6f94711 GCP: skip AI zones
Filter out AI zones when discovering zones in the region. AI zones
do not have quota for general compute resources, so we should not provision
nodes there by default.
2026-01-29 10:36:08 -05:00
openshift-merge-bot[bot]
277456d55f Merge pull request #10245 from tthvo/CORS-4055-iam
CORS-4055: migrate IAM API calls to AWS SDK v2
2026-01-28 17:59:20 +00:00
openshift-merge-bot[bot]
c44c2dbd93 Merge pull request #10242 from tthvo/CORS-4055-s3
CORS-4055: migrate S3 API calls to AWS SDK v2
2026-01-28 14:23:48 +00:00
openshift-merge-bot[bot]
cbe2b67c22 Merge pull request #10081 from barbacbd/OCPBUGS-63305
OCPBUGS-63305: Make SimulatePrincipalPolicy optional
2026-01-28 14:23:40 +00:00
openshift-merge-bot[bot]
f77608818d Merge pull request #10224 from rna-afk/azure_client_version
OCPBUGS-67816: Revert storage account API version for client
2026-01-27 21:41:44 +00:00
Thuan Vo
d67b14e479 CORS-4055: migrate credential provider check to AWS SDK v2
This commit is an incremental step to migrate AWS API calls
to AWS SDK v2. This focuses on handlers that retrieve the source
or provider of credentials, for example, via shared credential file
and via environment variables.

Note: this logic determines whether the credential provider is static.
Static credentials are safe to transfer to the cluster as-is in Mint and
Passthrough credentialsMode.
2026-01-27 12:03:27 -08:00
Thuan Vo
552b61936e CORS-4058: Migrate AWS Destroy to SDK v2 (#9982)
* pkg/destroy/aws/ec2helpers.go

** The bulk of the changes are to the ec2helpers file. All of the SDK v1 imports
are removed except for session, as this one is currently ingrained in too many files.

pkg/destroy/aws/aws.go

** Add clients for ELB, ELBv2, and IAM to the Cluster Removal Struct. Even though
these changes are mainly to ec2helpers, the other clients were required for
certain operations.

** The rest of the file updates alter the ARN import to come from AWS SDK v2.

* pkg/destroy/aws/iamhelpers.go

** Remove/Change all imports from AWS sdk v1 to v2.

pkg/destroy/aws/errors.go
pkg/destroy/aws/ec2helpers.go

** Remove the Error checking/formatting function from ec2helpers and put the function
in the errors.go file.

* pkg/destroy/aws/elbhelpers.go

** Remove all SDK v1 imports from elb helpers.

* Add reference to correct HandleErrorCode function.

* pkg/destroy/aws/aws.go

** Update Route53, s3, and efs services to sdk v2. This is slowly removing the
requirement for aws session.

* ** Vendor updates for S3 and EFS services.
** This caused updates to other packages such as aws/config, credentials, stscreds, and
a list of aws internal packages.

* Clean up references and use the exported config creator to create new clients in destroyer.

* ** Migrate the use of resource tagging api to the sdk V2.

pkg/destroy/aws:

** Alter the function name from HandleErrorCode to handleErrorCode. The initial thought was that
this function could be used in other areas of the code, but it will remain in destroy for now.

pkg/destroy/aws/shared.go:

** Remove the session import and uses in the file.

* Fix references to HandleErrorCode.

* pkg/destroy/aws/aws.go:

** Remove session from the imports. Added the agent handler to the configurations.

* Fix package updates for vendoring.

* Use the correct private and public zone clients.
Set a Destroy User Agent.
Cleanup pointer references to use the aws sdk.

* The ListUsers API call does not return tags for the IAM users in the
response. A separate call, ListUserTags, fetches the tags that the
installer code checks.

* rebase: fix other imports after rebase

* revert: use GetRole/GetUser to fetch tags

An older commit uses ListRoleTags/ListUserTags in order to save
bandwidth by fetching only tags. However, the minimal permission
required for the installer does not have permission iam:ListUserTags or
iam:ListRoleTags, thus causing the deprovisioning to skip users and
roles. This is one of the reasons for previous CI leaks.

This commit reverts the optimisation and goes back to using GetRole/GetUser,
which the minimal permission policy should already cover.

---------

Co-authored-by: barbacbd <barbacbd@gmail.com>
2026-01-27 11:55:23 +00:00
openshift-merge-bot[bot]
5aa688f0a7 Merge pull request #10211 from barbacbd/installer-n4a-instances
CORS-4299,CORS-4300: Allow N4A Instance Types in the installer
2026-01-27 06:23:08 +00:00
Thuan Vo
edb4e5af40 tests: add unit tests for NodeIPFamilies configurations 2026-01-26 18:34:15 -08:00
Thuan Vo
ab593238e2 CORS-4328: configure NodeIPFamilies for dual-stack support
Add NodeIPFamilies configuration to AWS cloud provider config
when dual-stack networking is enabled. The cloud provider now
sets the appropriate IP family ordering (ipv4/ipv6 or ipv6/ipv4)
based on the install config's IPFamily setting.

For dual-stack IPv4 primary clusters, NodeIPFamilies is set to:

NodeIPFamilies=ipv4
NodeIPFamilies=ipv6

For dual-stack IPv6 primary clusters, NodeIPFamilies is set to:

NodeIPFamilies=ipv6
NodeIPFamilies=ipv4

Single-stack IPv4 clusters continue to use the minimal config with an
empty Global section.
2026-01-26 18:31:08 -08:00
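
The ordering rule described in ab593238e2 can be sketched as below. This is an illustration under assumptions, not the installer's code: `cloudConfig` is a hypothetical function, the `[Global]` header is assumed from the "Global section" wording, and the `DualStackIPv4Primary` / `DualStackIPv6Primary` values are taken from the install-config commits further down this log.

```go
package main

import (
	"fmt"
	"strings"
)

// cloudConfig is a hypothetical sketch of rendering the AWS cloud provider
// config. The primary IP family is written first; any other ipFamily value
// yields the minimal config with an empty Global section.
func cloudConfig(ipFamily string) string {
	var families []string
	switch ipFamily {
	case "DualStackIPv4Primary":
		families = []string{"ipv4", "ipv6"}
	case "DualStackIPv6Primary":
		families = []string{"ipv6", "ipv4"}
	}

	var b strings.Builder
	b.WriteString("[Global]\n") // assumed section header
	for _, f := range families {
		fmt.Fprintf(&b, "NodeIPFamilies=%s\n", f)
	}
	return b.String()
}

func main() {
	fmt.Print(cloudConfig("DualStackIPv6Primary"))
}
```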
openshift-merge-bot[bot]
56e3874a13 Merge pull request #10238 from tthvo/CORS-4073
CORS-4073: validate instance type support IPv6 in dual-stack
2026-01-26 20:06:58 +00:00
barbacbd
8066014ea0 OCPBUGS-74363: Remove region option for the GCP Private Service Connect Endpoint
** While regional support is valid, we will not be using it in OpenShift. Regional support
requires that each API have its own endpoint. Only one API is associated with an endpoint, and managing
this access would be difficult and unnecessary at this time.
2026-01-23 09:19:39 -05:00
Thuan Vo
352241d9f5 CORS-4055: migrate IAM API calls to AWS SDK v2
The commit is an incremental step to migrate AWS API calls to AWS SDK
v2. This focuses on IAM clients in the pkg/asset and dependent pkg(s).
2026-01-21 17:53:00 -08:00
Thuan Vo
deb94a3815 CORS-4055: migrate S3 API calls to AWS SDK v2
The commit is an incremental step to migrate AWS API calls to AWS SDK
v2. This focuses on S3 clients in the pkg/asset and dependent pkg(s).
2026-01-20 16:59:19 -08:00
Thuan Vo
adfe5e7b4a tests: add unit tests for IPv6 networking validations 2026-01-20 13:38:10 -08:00
Thuan Vo
3a2f742642 CORS-4073: validate instance type support IPv6 in dual-stack
In order to attach IPv6 addresses to the ENI of EC2 instances, the
instance type must support IPv6 networking. The installer must validate
it by inspecting the networking capabilities of instance type via EC2
API calls.
2026-01-20 13:38:10 -08:00
openshift-merge-bot[bot]
dfdec6e1da Merge pull request #10176 from pawanpinjarkar/modify-hw-storage-requirements-for-ove
AGENT-1309: Modify NoRegistryClusterInstall storage requirements
2026-01-19 20:19:23 +00:00
openshift-merge-bot[bot]
617269249e Merge pull request #10223 from gpei/fix-OCPBUGS-56770
OCPBUGS-56770: Honor user-specified bootDiagnostics on Azure Stack Hub
2026-01-19 16:35:56 +00:00
barbacbd
f7eb72b373 CORS-4300: Update installer to allow n4a instances
pkg/types/gcp/machinepools.go:

Include the n4a instance type in the map as well as the (currently) supported disk types:
- hyperdisk-balanced

pkg/asset/installconfig/gcp/validation.go:

Include n4a in the types of arm instance families.
2026-01-19 11:28:45 -05:00
Vincenzo Mauro
bb7d56e927 Added support for platform None in TNA clusters 2026-01-19 16:05:04 +01:00
openshift-merge-bot[bot]
19e15798a0 Merge pull request #10193 from abhay-nutanix/OCPBUGS-63028
OCPBUGS-63028: filtering only PEs from cluster list
2026-01-19 09:01:40 +00:00
openshift-merge-bot[bot]
e04b9d5eab Merge pull request #10207 from sadasu/dual-stack-config
CORS-4075, CORS-4113: Install-config and Infra manifest updates for DualStack for AWS and Azure
2026-01-17 02:18:31 +00:00
Gaoyun
e7bd4cae84 Check whether the user has explicitly configured bootDiagnostics in the mpool's bootDiagnostics field. If not configured, the Azure Stack Hub default is applied 2026-01-16 00:42:21 +00:00
openshift-merge-bot[bot]
f075df5766 Merge pull request #10213 from patrickdillon/ocpbugs-69735-private-ssh
OCPBUGS-69735: handle SSH rule deletion for Azure private
2026-01-15 22:19:38 +00:00
Sandhya Dasu
3a1ca8f3dd Check for FeatureGates when ipFamily can be set to DualStack
Make sure that ipFamily can be set to DualStackIPv4Primary and
DualStackIPv6Primary only when the platform based featuregates
have been enabled.
2026-01-15 13:17:58 -05:00
Sandhya Dasu
a99b4a05ae Update Infrastructure manifest with IPFamily for AWS and Azure
Based on install-config input, update IPFamily in AWSPlatformStatus
and AzurePlatformStatus fields within the Infrastructure manifest.
Update unit tests to verify Infra manifest creation.
2026-01-15 13:17:58 -05:00
Sandhya Dasu
8812b8e56f Add ipFamily as an install-config field for AWS and Azure
Includes validation for input values and unit tests for this new
install-config parameter.
2026-01-15 13:17:42 -05:00
Aditya Narayanaswamy
c7127f680d azure: Revert storage account API version for client
Reverting the API version for storage account in the call
to check if exists as it's causing an issue with the boot
diagnostics.
2026-01-15 10:35:39 -05:00
Gaoyun Pei
15d1d85a87 OCPBUGS-66943: Validate cluster name against Azure reserved words (#10221)
* azure: validate cluster name against Azure reserved words

  Azure prohibits the use of certain reserved words and trademarks
  in resource names. This change adds validation to reject cluster
  names containing any of the 43 reserved words documented by Azure,
  preventing deployment failures with ReservedResourceName errors.

  Reserved words checked include:
  - Complete reserved words (40): AZURE, OFFICE, EXCHANGE, etc.
  - Substring forbidden (2): MICROSOFT, WINDOWS
  - Prefix forbidden (1): LOGIN

* update the checking logic on reserved words

* fix the gofmt issues
2026-01-15 04:17:16 +00:00
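
The three matching modes listed in 15d1d85a87 (complete word, substring, prefix) could be sketched as follows. Everything here is illustrative: `validateClusterName` is a hypothetical function, the complete-word list contains only the three words named in the commit message (the real list has 40), and treating "complete" as "the whole name equals the word" is an assumption about the checking logic.

```go
package main

import (
	"fmt"
	"strings"
)

var (
	// Only the words named in the commit message; the full list is longer.
	completeWords  = []string{"AZURE", "OFFICE", "EXCHANGE"}
	substringWords = []string{"MICROSOFT", "WINDOWS"}
	prefixWords    = []string{"LOGIN"}
)

// validateClusterName is a hypothetical sketch of the reserved-word check.
// The comparison is case-insensitive.
func validateClusterName(name string) error {
	upper := strings.ToUpper(name)
	for _, w := range completeWords {
		if upper == w { // assumed: the whole name must match
			return fmt.Errorf("cluster name %q is the reserved word %s", name, w)
		}
	}
	for _, w := range substringWords {
		if strings.Contains(upper, w) {
			return fmt.Errorf("cluster name %q contains the reserved word %s", name, w)
		}
	}
	for _, w := range prefixWords {
		if strings.HasPrefix(upper, w) {
			return fmt.Errorf("cluster name %q starts with the reserved word %s", name, w)
		}
	}
	return nil
}

func main() {
	for _, n := range []string{"azure", "mywindowsbox", "login-test", "test-login"} {
		fmt.Println(n, "->", validateClusterName(n))
	}
}
```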
Pawan Pinjarkar
cb6f36ef8f AGENT-1309: Modify OVE storage requirements 2026-01-14 13:21:37 -05:00
openshift-merge-bot[bot]
d9fb2e0510 Merge pull request #10188 from tthvo/OCPBUGS-69923
OCPBUGS-69923: ensure deterministic zone ordering for control plane machines
2026-01-13 21:47:16 +00:00
Mark Old
be0a05a9fe Fix nil pointer exception in azure mapiImage 2026-01-07 14:31:26 -08:00
Patrick Dillon
f22a3a3956 OCPBUGS-69735: handle SSH rule deletion for Azure private
In private clusters, no inbound nat rule is created for SSH; this
commit handles that scenario gracefully.
2026-01-05 12:43:31 -05:00
Abhay
ea748700e2 OCPBUGS-63028: filtering only PEs from cluster list 2026-01-05 16:56:11 +05:30
Thuan Vo
1957abe09b OCPBUGS-69923: ensure deterministic zone ordering for control plane machines
Control plane machines were intermittently being created in different
availability zones than specified in their machine specs. This occurred
because the zone list returned from FilterZonesBasedOnInstanceType used
a set's UnsortedList() func, which has a non-deterministic order.

When CAPI and MAPI manifest generation independently called this func,
they could receive zones in different orders, causing a mismatch in
machine zone placements between CAPI and MAPI manifests.

This commit ensures that we sort the zone slices before further
processing.
2025-12-22 13:37:13 -08:00
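
The fix in 1957abe09b comes down to sorting a set's contents before use. A minimal sketch, assuming a plain Go map stands in for the set type and using a hypothetical `sortedZones` helper (the installer's actual function names and set type are not shown here):

```go
package main

import (
	"fmt"
	"sort"
)

// sortedZones is a hypothetical helper. Go map iteration order is not
// deterministic, so two callers ranging over the same set may see zones in
// different orders. Sorting the resulting slice gives every caller the same
// ordering.
func sortedZones(zones map[string]struct{}) []string {
	out := make([]string, 0, len(zones))
	for z := range zones {
		out = append(out, z)
	}
	sort.Strings(out)
	return out
}

func main() {
	zones := map[string]struct{}{
		"us-east-1c": {},
		"us-east-1a": {},
		"us-east-1b": {},
	}
	// Both generators (CAPI and MAPI in the commit message) would call the
	// same helper and therefore agree on zone placement.
	fmt.Println(sortedZones(zones))
}
```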
openshift-merge-bot[bot]
ff6438bc69 Merge pull request #10138 from barbacbd/fix-basic-linting-issues
no-jira: Fix linting issues for golangci-lint v2
2025-12-19 20:45:29 +00:00
openshift-merge-bot[bot]
03c237e1fd Merge pull request #10175 from hamzy/PowerVC-PostProvision
OCPBUGS-69840: PowerVC: fix PostProvision
2025-12-18 20:08:41 +00:00
Mark Hamzy
a43f8cc5df PowerVC: fix PostProvision
New code was added that we need to avoid.  Also, we need to create
OpenStack's Metadata structure.
2025-12-18 09:22:59 -06:00
openshift-merge-bot[bot]
63876c32e4 Merge pull request #10169 from jcpowermac/OCPBUGS-69434-2
SPLAT-2584,OCPBUGS-69434: Added ability to install different IPAM version when in TP.
2025-12-18 14:15:41 +00:00
openshift-merge-bot[bot]
93ba4638d6 Merge pull request #10086 from jcpowermac/OCPBUGS-17667
OCPBUGS-17667: Validate vCenter datacenters in failure domain topology
2025-12-18 10:59:07 +00:00