* Updated the link to the rhcos.json to point to the new location in the
Installer repo (data/data/coreos/rhcos.json)
* Updated the JSON path to include the architecture and the images
content (see the sketch after this list).
* Changed the instruction to use the existing boot image in the
rhcos-cloud project instead of copying it as a new image.
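For orientation, a minimal Python sketch of reading the GCP image
details from the relocated file, assuming the architecture-keyed
layout described above; the exact keys may differ:

    import json

    # Load the RHCOS build metadata from its new location in the
    # installer repo.
    with open("data/data/coreos/rhcos.json") as f:
        meta = json.load(f)

    # Assumed layout: architectures.<arch>.images.gcp holds the boot
    # image details (project, image name, ...).
    gcp_image = meta["architectures"]["x86_64"]["images"]["gcp"]
    print(gcp_image)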
This patch updates the "cloud-install" links in the documentation to
point to the current location.
Signed-off-by: Juan Hernandez <juan.hernandez@redhat.com>
GCP limits instance group names to 63 characters, and most of that
budget is taken up by the -instance-group suffix that is appended to
make sure the resources have unique names. Shortening the suffix from
-instance-group to -ig keeps the names within the limit while still
keeping them unique.
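As a rough sketch of the name construction and the 63-character
budget (the helper and example names are illustrative, not the actual
template code):

    # GCP resource names, including instance group names, are limited
    # to 63 characters.
    GCP_NAME_LIMIT = 63

    def instance_group_name(infra_id, role, zone_suffix, suffix="-ig"):
        # e.g. "mycluster-abc12-master-a-ig" instead of
        # "mycluster-abc12-master-a-instance-group"
        name = f"{infra_id}-{role}-{zone_suffix}{suffix}"
        assert len(name) <= GCP_NAME_LIMIT, "exceeds GCP's 63-char limit"
        return name

    print(instance_group_name("mycluster-abc12", "master", "a"))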
The consuming script has not been changed since it landed in
cbe6f1549d (upi/gcp: initial deployment manager templates,
2019-07-29, #2117).
The associated Markdown landed with a 'region' in the docs and YAML in
cbe6f1549d. It was removed from the YAML in 998a518a17 (gcp upi:
split templates to simplify shared vpc workflow, 2019-10-25, #2574),
but not the surrounding Markdown (until this commit).
This mirrors changes to GCP IPI in #3544
The infra ID of clusters on GCP was reduced to 12 characters in #2088
because we could not handle the hostname seen by the RHCOS machines
being longer than 64 characters.
More details on this are available in
https://bugzilla.redhat.com/show_bug.cgi?id=1809345
Now that BZ 1809345 is fixed by openshift/machine-config-operator#1711
and openshift/cluster-api-provider-gcp#88, the installer can relax the
restriction on the infra ID to match the other platforms.
Why is it important?
On GCP all resources are prefixed with the infra ID, which is
currently 12 characters, 6 of which are taken by the random suffix,
leaving only 6 characters from the cluster name. This makes it hard to
associate a cluster with its jobs in CI, as most of the identifiable
characters are dropped from the resource names due to this
restriction.
Also, because of the previous restriction, only one character of the
pool's name was used, making collisions highly likely when there is
more than one pool.
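A minimal sketch of the naming scheme described above; the real logic
lives in the installer's Go code, and the lengths shown are only
illustrative:

    import random
    import string

    def infra_id(cluster_name, max_len):
        # A short random suffix is appended for uniqueness, so the
        # cluster name is truncated to whatever room is left.
        suffix = "".join(
            random.choices(string.ascii_lowercase + string.digits, k=5))
        base = cluster_name[: max_len - len(suffix) - 1]
        return f"{base}-{suffix}"

    print(infra_id("ci-op-mycluster", 12))  # old cap: ~6 name chars survive
    print(infra_id("ci-op-mycluster", 27))  # relaxed cap (27 is illustrative)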
The current disk size restriction for GCP only requires the size to be
above 0. The RHCOS image is 16GB, so disk sizes must be at least 16GB.
Also, the maximum disk size on GCP is 65536GB, so users must not be
allowed to create disks above that limit. Added these validations to
the install-config input for GCP.
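A minimal sketch of the bounds being enforced, assuming sizes in GB;
the real check lives in the installer's install-config validation:

    MIN_DISK_SIZE_GB = 16      # current RHCOS image size
    MAX_DISK_SIZE_GB = 65536   # GCP persistent disk ceiling

    def validate_disk_size_gb(size_gb):
        if size_gb < MIN_DISK_SIZE_GB:
            raise ValueError(f"disk size must be at least {MIN_DISK_SIZE_GB}GB")
        if size_gb > MAX_DISK_SIZE_GB:
            raise ValueError(f"disk size must not exceed {MAX_DISK_SIZE_GB}GB")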
Currently, the installer does not allow users to customize the type
and size of the disks for the workers and control plane. Added the
option for the user to specify the type and size of the disks for both
machine pools on GCP.
The user can specify two disk types, pd-standard and pd-ssd, which are
the options that GCP/Terraform provides. pd-standard is not
recommended for control planes and will not be allowed as a value in
the DefaultMachinePlatform or the control-plane section.
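For illustration only, a machine pool carrying the new knobs might
look roughly like this (shown here as a Python dict; the osDisk field
names are an assumption, not confirmed by this commit):

    control_plane = {
        "name": "master",
        "platform": {
            "gcp": {
                # pd-ssd or pd-standard; pd-standard is rejected for the
                # control plane and for defaultMachinePlatform.
                "osDisk": {"diskType": "pd-ssd", "diskSizeGB": 128},
            }
        },
    }

    compute = {
        "name": "worker",
        "platform": {
            "gcp": {
                "osDisk": {"diskType": "pd-standard", "diskSizeGB": 128},
            }
        },
    }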
Add an optional parameter in the GCP install-config that
contains a list of license URLs to be added to the compute image
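A hedged sketch of the new knob, again as a Python dict for
illustration; the field name and its placement under the GCP platform
are assumptions:

    platform = {
        "gcp": {
            "projectID": "example-project",   # illustrative
            "region": "us-central1",
            # Optional list of license URLs applied to the compute
            # (RHCOS) image.
            "licenses": [
                "https://www.googleapis.com/compute/v1/projects/"
                "vm-options/global/licenses/enable-vmx",
            ],
        }
    }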
Credits:
Based on the work by Colin Walters
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
This change documents how to add custom tags to the bootstrap, master,
and worker nodes at install time. This enables users with existing
firewall rules to reuse previously known tags so that traffic can
reach their cluster.
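In the Deployment Manager templates the tags end up on the instances
through the standard tags.items field; an abbreviated sketch, with the
custom_tags property name being an assumption:

    def GenerateConfig(context):
        # Abbreviated: only the tag wiring is shown; disks, network
        # interfaces, etc. are omitted.
        instance = {
            'name': context.properties['infra_id'] + '-bootstrap',
            'type': 'compute.v1.instance',
            'properties': {
                'zone': context.properties['zone'],
                'tags': {
                    # Built-in tag plus any user-supplied custom tags, so
                    # pre-existing firewall rules keep matching the nodes.
                    'items': [context.properties['infra_id'] + '-bootstrap']
                             + context.properties.get('custom_tags', []),
                },
            },
        }
        return {'resources': [instance]}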
This change adds directions to the GCP UPI install document about how
to install a cluster using a Shared VPC. Because the VPC, networks,
subnets, and DNS zones are in a different project (the host project),
the installer has problems finding them while creating the Ignition
files. Furthermore, some changes are required to the cloud provider in
order for the cluster to properly provision resources in the subnets.
In addition, it is assumed the service account in the service project
will likely not have sufficient permissions in the host project to
perform all of the required tasks.
Previously, the bootstrap host was being added to the first master
instance group. This causes an issue if the GCP cloud provider attempts
to create internal load balancers for the cluster, because it ignores
the first master's instance group and tries to put the master into a
new instance group. If there are workers in a different subnet, the
cloud provider throws an error and never creates the ingress load
balancers.
This change creates an instance group for the bootstrap host and
updates the doc to utilize it. It also removes the steps of adding and
removing the bootstrap host from the external target pools, as that is
not what we do with IPI.
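A hedged sketch of the added resource, modeled on the existing
Deployment Manager templates; names and properties are illustrative
rather than a copy of the real template:

    def GenerateConfig(context):
        # Unmanaged instance group holding only the bootstrap host, so
        # the cloud provider never needs to touch the masters' groups.
        resources = [{
            'name': context.properties['infra_id'] + '-bootstrap-ig',
            'type': 'compute.v1.instanceGroup',
            'properties': {
                'namedPorts': [
                    {'name': 'ignition', 'port': 22623},
                    {'name': 'https', 'port': 6443},
                ],
                'network': context.properties['cluster_network'],
                'zone': context.properties['zone'],
            },
        }]
        return {'resources': resources}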
This change adds the 02_lb_int.py template to the workflow to enable
internal load balancers. The cluster will begin communicating with the
API and MCS through the internal load balancers. The external load
balancer can optionally be disabled for private clusters.
This change also updates the documentation to use the $(command) syntax
to be in line with the other platforms.
In addition, the variable definitions were all moved to immediately
after the associated resources were created. This will help make clear
where their origins are.
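For orientation, a trimmed sketch of the kind of resources such a
template declares, using standard Deployment Manager compute types;
the properties are illustrative and not a copy of 02_lb_int.py:

    def GenerateConfig(context):
        infra_id = context.properties['infra_id']
        resources = [{
            # Internal TCP load balancing: a regional backend service...
            'name': infra_id + '-api-internal-backend-service',
            'type': 'compute.v1.regionBackendService',
            'properties': {
                'region': context.properties['region'],
                'loadBalancingScheme': 'INTERNAL',
                'protocol': 'TCP',
                'backends': [],  # master (and bootstrap) instance groups
                'healthChecks': [context.properties['health_check']],
            },
        }, {
            # ...fronted by an internal forwarding rule for the API and
            # MCS ports.
            'name': infra_id + '-api-internal-forwarding-rule',
            'type': 'compute.v1.forwardingRule',
            'properties': {
                'region': context.properties['region'],
                'loadBalancingScheme': 'INTERNAL',
                'IPAddress': context.properties['cluster_ip'],
                'ports': ['6443', '22623'],
                'subnetwork': context.properties['control_subnet'],
                'backendService': '$(ref.' + infra_id
                                  + '-api-internal-backend-service.selfLink)',
            },
        }]
        return {'resources': resources}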
Prior to this change, users needed to edit the GCP UPI Python
templates in order to provision a cluster using a shared VPC. This was
prone to user error.
This change breaks up the templates so that only the yaml files need to
be modified, thus greatly simplifying the process. All of the resources
that would be provisioned in the host project are now in their own
python templates (01_vpc.py, 02_dns.py, and 03_firewall.py). These
resources can be removed from the yaml files to be run against the
service project and placed into yaml files to be run against the host
project instead.
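As a rough sketch of the shape of one of the split host-project
templates (01_vpc.py-style; property names are illustrative):

    def GenerateConfig(context):
        # In a shared VPC install, everything in this template lives in
        # the host project; service-project resources stay in the other
        # templates.
        infra_id = context.properties['infra_id']
        resources = [{
            'name': infra_id + '-network',
            'type': 'compute.v1.network',
            'properties': {'autoCreateSubnetworks': False},
        }, {
            'name': infra_id + '-master-subnet',
            'type': 'compute.v1.subnetwork',
            'properties': {
                'region': context.properties['region'],
                'network': '$(ref.' + infra_id + '-network.selfLink)',
                'ipCidrRange': context.properties['master_subnet_cidr'],
            },
        }]
        return {'resources': resources}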
Move enabling the GCP service APIs into its own step. These APIs must
also be enabled prior to DNS configuration.
Signed-off-by: Christoph Blecker <cblecker@redhat.com>
We grew replicas-zeroing in c22d042 (docs/user/aws/install_upi: Add
'sed' call to zero compute replicas, 2019-05-02, #1649) to set the
stage for changing the 'replicas: 0' semantics from "we'll make you
some dummy MachineSets" to "we won't make you MachineSets". But that
hasn't happened yet, and since 64f96df (scheduler: Use schedulable
masters if no compute hosts defined, 2019-07-16, #2004) 'replicas: 0'
for compute has also meant "add the 'worker' role to control-plane
nodes". That leads to racy problems when ingress comes through a load
balancer, because Kubernetes load balancers exclude control-plane
nodes from their target set [1,2] (although this may get relaxed
soonish [3]). If the router pods get scheduled on the control plane
machines due to the 'worker' role, they are not reachable from the
load balancer and ingress routing breaks [4]. Seth says:
> pod nodeSelectors are not like taints/tolerations. They only have
> effect at scheduling time. They are not continually enforced.
which means that attempting to address this issue as a day-2 operation
would mean removing the 'worker' role from the control-plane nodes and
then manually evicting the router pods to force rescheduling. So
until we get the changes from [3], we can either drop the zeroing [5]
or adjust the scheduler configuration to remove the effect of the
zeroing. In both cases, this is a change we'll want to revert later
once we bump Kubernetes to pick up a fix for the service load-balancer
targets.
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1671136#c1
[2]: https://github.com/kubernetes/kubernetes/issues/65618
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1744370#c6
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=1755073
[5]: https://github.com/openshift/installer/pull/2402/
Before this change, GCP used individual firewall rules for each
service/port used. This caused quota issues when multiple clusters
were provisioned in the same project.
This change collapses the firewall rules where appropriate to reduce
the number of firewall rules used.
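A hedged sketch of a collapsed rule: one compute.v1.firewall resource
carrying several protocols and port ranges that would previously have
been separate rules (the concrete port list is illustrative):

    def GenerateConfig(context):
        infra_id = context.properties['infra_id']
        resources = [{
            'name': infra_id + '-internal-cluster',
            'type': 'compute.v1.firewall',
            'properties': {
                'network': context.properties['cluster_network'],
                # Several previously separate service/port rules folded
                # into one, using less of the per-project firewall quota.
                'allowed': [
                    {'IPProtocol': 'tcp',
                     'ports': ['9000-9999', '10250', '30000-32767']},
                    {'IPProtocol': 'udp',
                     'ports': ['4789', '6081', '9000-9999', '30000-32767']},
                ],
                'sourceTags': [infra_id + '-master', infra_id + '-worker'],
                'targetTags': [infra_id + '-master', infra_id + '-worker'],
            },
        }]
        return {'resources': resources}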
Before this change, the GCP UPI workflow hard coded the zones in the
bootstrap and control-plane templates. It assumed every region had zones
$REGION-{a,b,c}. However, in some regions this is not the case.
This change adds the zone(s) as parameters to the templates and
updates the docs accordingly. The list of zones is now fetched from
GCP and then used to populate the templates.
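A hedged sketch of the template side of the change: zones arrive as a
property and instances are spread across them instead of assuming
<region>-a/b/c (names and properties are illustrative; disks and
network interfaces are omitted from this sketch):

    def GenerateConfig(context):
        infra_id = context.properties['infra_id']
        # Zones are passed in (fetched from GCP beforehand) rather than
        # being hard coded as <region>-a/b/c.
        zones = context.properties['zones']
        resources = []
        for index in range(3):
            zone = zones[index % len(zones)]
            resources.append({
                'name': infra_id + '-master-' + str(index),
                'type': 'compute.v1.instance',
                'properties': {
                    'zone': zone,
                    # disks, networkInterfaces, etc. omitted here
                },
            })
        return {'resources': resources}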