mirror of https://github.com/openshift/installer.git synced 2026-02-06 18:47:19 +01:00
Commit Graph

370 Commits

Author SHA1 Message Date
OpenShift Merge Robot
aaa3542cc8 Merge pull request #692 from deads2k/fix-etcd-ca
bootkube: CA for configmaps should not be base64
2018-11-16 13:48:10 -08:00
W. Trevor King
09469eba1b pkg/tfvars: Drop YAML tags
We haven't needed these since we dropped the parsers in f7a4e68b
(pkg/types/config: Drop ParseConfig and other Parse* methods,
2018-10-02, #403).

Generated with:

  $ sed -i 's/ yaml:.*/`/' $(git grep -l yaml pkg/tfvars)
2018-11-16 09:31:05 -08:00
David Eads
b94ecccfaa bootkube: CA for configmaps should not be base64
secrets are base64 encoded, but configmaps are raw strings
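
The distinction can be sketched in Go as follows. This is only an illustration of the encoding difference, not the installer's template code; the helper names `secretValue` and `configMapValue` and the placeholder CA string are assumptions.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// secretValue (hypothetical helper) returns the form a value takes in the
// data field of a Secret manifest: base64-encoded.
func secretValue(raw []byte) string {
	return base64.StdEncoding.EncodeToString(raw)
}

// configMapValue (hypothetical helper) returns the form a value takes in the
// data field of a ConfigMap manifest: the raw string, with no encoding.
func configMapValue(raw []byte) string {
	return string(raw)
}

func main() {
	ca := []byte("example-ca-pem") // placeholder, not a real certificate
	fmt.Printf("secret:    %s\n", secretValue(ca))
	fmt.Printf("configmap: %s\n", configMapValue(ca))
}
```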
2018-11-16 08:47:29 -05:00
Steven Hardy
0b6d686f87 OpenStack Destroy Support
This adds initial support for destroy for OpenStack

Note that because gophercloud doesn't currently support nova tags[1]
I've used Metadata for servers (which is already populated with the
tectonicClusterId via the server properties).

Also note this requires changes to the terraform-provider-openstack which
are in the latest 1.12 release:

https://github.com/terraform-providers/terraform-provider-openstack/releases/tag/v1.12.0
https://github.com/terraform-providers/terraform-provider-openstack/issues/453
2018-11-16 09:10:48 +00:00
OpenShift Merge Robot
6a098fd06d Merge pull request #666 from cgwalters/edit-motd
bootkube: Add info to /etc/motd and a bit to /root/.bash_history
2018-11-15 13:12:46 -08:00
Colin Walters
b81e635909 bootstrap: Add info to /etc/motd
The intention here is to help people debug.  If you like this
I may also try to do something similar for masters.

V2: Applied code changes mostly written by @wking
2018-11-15 15:34:11 -05:00
W. Trevor King
ac11b74cc5 *: Drop tectonic-system namespace
The kube-addon operator was the last remaining component in that
namespace, and it was just controlling a metrics server.  Metrics
aren't critical to cluster functions, and dropping kube-addon means we
don't need the old pull secret anymore (although we will shortly need
new pull secrets for pulling private release images [1]).

Also drop the admin and user roles [2], although I'm less clear on
their connection.

[1]: https://github.com/openshift/installer/pull/663
[2]: https://github.com/openshift/installer/pull/682#issuecomment-439145907
2018-11-15 11:10:31 -08:00
OpenShift Merge Robot
4ebf8f4df2 Merge pull request #639 from staebler/do_not_generate_unneeded_dependents
Skip generation of dependencies for on-disk assets
2018-11-14 11:10:00 -08:00
W. Trevor King
ee519b421c pkg/asset/installconfig: Push platform-specific logic into subdirectories
To keep the platform-specific code separate.

The only thing that wasn't perfectly clean about this separation is
that now the AWS and OpenStack packages have their own defaultVPCCIDR.
There's no reason that they need to agree on that CIDR, though, so I
haven't bothered setting up a common-default package for them to share
a single constant.
2018-11-13 22:54:31 -08:00
staebler
5e50cbf754 asset/store: do not generate dependencies for on-disk assets when none of its dependencies are dirty
This allows the user to supply and use an on-disk asset (such as
install-config.yml) without the need to also supply the state file that was
created. This is helpful when re-using an on-disk asset for multiple
installations. In particular, hive would like to run openshift-install
with a supplied install-config.yml and no state file.

To effect this behavior, the asset store loads all of the on-disk assets that a
fetched asset depends upon prior to fetching the dependencies for the fetched
asset. From this, the asset store can determine whether the fetched asset is
dirty or not. If the fetched asset is not dirty and is on-disk or in the state
file, then the asset is used as is without generating any of the dependent
assets--as they would be ignored in resolving the fetched asset anyway.

Conflicts can occur when the asset store resolves in different ways the fetch of
two assets that share a dependency. For example, let us say that there are two
assets, A and B, that both depend upon asset C. Asset A is present on disk, and
asset B is not present on disk. When the asset store fetches asset A, then asset
C will not be generated. However, when the asset store fetches asset B, then
asset C will be generated in order to generate asset B. Asset A could
potentially have data that conflicts with the data that would have been taken
from the asset C that was generated.

The new load function creates new asset instances to store the asset state loaded
from on-disk and from the state file. The store tests were relying on the same
asset instance being used throughout the test. Unfortunately, the tests now
need to use a lot of global variables, making the tests more fragile.

The assetToPurge field was removed since the same information can be obtained by
iterating over the assets map. Also, the parameter passed to the purge function
was changed to a single Asset instead of an Asset slice since the function is
only ever called with a single Asset.

Fixes https://github.com/openshift/installer/issues/545
2018-11-13 23:17:17 -05:00
W. Trevor King
0dd848f246 pkg/asset/cluster: Push InstallConfig -> Metadata into subpackages
To keep the platform-specific code separate.  These converters are a
bit tiny for their own packages, but they seemed too
application-specific to belong to pkg/types.
2018-11-13 16:14:30 -08:00
W. Trevor King
18ca912897 pkg/destroy/libvirt: Use prefix-based deletion
To avoid wiping out the caller's whole libvirt environment, regardless
of whether it was associated with our cluster or not.  Using
cluster-name prefixes still makes me a bit jumpy, so I've added
warnings to both the environment-variable and asset-prompt docs
warning libvirt users to pick something sufficiently unique.
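
A minimal sketch of prefix-based selection, and of why the prefix needs to be unique. The helper `filterByPrefix` and the resource names are hypothetical; the real destroyer talks to libvirt rather than filtering a string slice.

```go
package main

import (
	"fmt"
	"strings"
)

// filterByPrefix (hypothetical helper) returns the names that start with
// prefix, preserving their order.
func filterByPrefix(names []string, prefix string) []string {
	var matched []string
	for _, name := range names {
		if strings.HasPrefix(name, prefix) {
			matched = append(matched, name)
		}
	}
	return matched
}

func main() {
	names := []string{"test1-master-0", "test1-worker-0-zd7hd", "testing-vm", "other-vm"}

	// A sufficiently unique cluster name selects only that cluster's resources.
	fmt.Println(filterByPrefix(names, "test1"))

	// A short cluster name such as "test" also matches unrelated resources.
	fmt.Println(filterByPrefix(names, "test"))
}
```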

Also:

* Use {cluster-name}-master-{count} naming.  We used to use
  master{count}, which diverged from other usage (e.g. AWS, which has
  used master-{count} since way back in ca443c5e (openstack/nova:
  replace cloud-init with ignition, 2017-02-27,
  coreos/tectonic-installer#7)).

* Rename module.libvirt_base_volume -> module.volume.  There's no
  reason to diverge from the module source for that name.
2018-11-13 14:52:48 -08:00
W. Trevor King
57bb8bc677 pkg/destroy/libvirt: Single pass (instead of looping goroutines)
And I've rerolled deletion to use a single call to each deleter,
failing fast if they error.  That should address cases where we cannot
destroy a shut-off domain [1]:

  $ virsh -c $OPENSHIFT_INSTALL_LIBVIRT_URI list --all
   Id    Name                           State
  ----------------------------------------------------
   -     master0                        shut off
   -     test1-worker-0-zd7hd           shut off

  $ bin/openshift-install destroy cluster --dir test --log-level debug
  DEBUG Deleting libvirt volumes
  DEBUG Deleting libvirt domains
  DEBUG Deleting libvirt network
  DEBUG Exiting deleting libvirt network
  DEBUG goroutine deleteNetwork complete
  ERROR Error destroying domain test1-worker-0-zd7hd: virError(Code=55, Domain=10, Message='Requested operation is not valid: domain is not running')
  DEBUG Exiting deleting libvirt domains
  DEBUG Exiting deleting libvirt volumes
  DEBUG goroutine deleteVolumes complete
  DEBUG Deleting libvirt domains
  ERROR Error destroying domain test1-worker-0-zd7hd: virError(Code=55, Domain=10, Message='Requested operation is not valid: domain is not running')
  [...]

Now we'll fail-fast in those cases, allowing the caller to clear the
stuck domains, after which they can restart deletion.

The previous goroutine approach was borrowed from the AWS destroyer.
But AWS has a large, complicated resource dependency graph which
includes cycles.  Libvirt is much simpler, with volumes and a network
that are all independent, followed by domains which depend on the
network and some of the volumes.  With this commit we now take a
single pass at destroying those resources starting at the leaf domains
and working our way rootwards.

I've retained some looping (although no longer in a separate
goroutine) for domain deletion.  This guards against racing domain
creation, as discussed in the new godocs for deleteDomains.

Also:

* Rename from libvirt_prefix_deprovision.go to libvirt.go.  The name
  is from 998ba306 (cmd,pkg/destroy: add non-terraform destroy,
  2018-09-25, #324), but the implementation doesn't need to be
  represented in the filename.  This commit renames to libvirt.go to
  match the package name, since this file is the guts of this package.

* Simplify the AlwaysTrueFilter implementation.  No semantic changes,
  but this saves us a few lines of code.

* Add trailing periods for godocs to comply with [2].

[1]: https://github.com/openshift/installer/issues/656#issue-379634884
[2]: https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences
2018-11-13 14:52:47 -08:00
W. Trevor King
ec68d771a5 pkg/types/installconfig/libvirt: Drop Network.Name
We'd been defaulting it to ClusterName in InstallConfig.Generate, and
I see no reason for the user to want to create a separate name for the
network alone.  The variable dates back to 4a08942c (steps: bootstrap
/ etcd / topology support for libvirt, 2018-04-24,
coreos/tectonic-installer#3213), where it is not explicitly motivated.
2018-11-13 14:52:16 -08:00
W. Trevor King
1e129fe32b pkg/asset/installconfig/platform: Drop *PlatformType for types.{platform}.Name
The old *PlatformType are from cccbb37a (Generate installation assets
via a dependency graph, 2018-08-10, #120), but since 476be073
(pkg/asset: use vendored cluster-api instead of go templates,
2018-10-30, #573), we've had variables for the name strings in the
more central pkg/types.  With this commit, we drop the more peripheral
forms.  I've also pushed the types.PlatformName{Platform} variables
down into types.{platform}.Name at Abhinav's suggestion [1].

I've added a unit test to enforce sorting in PlatformNames, because
the order is required by sort.SearchStrings in queryUserForPlatform.

[1]: https://github.com/openshift/installer/pull/659#discussion_r232849156
2018-11-13 10:36:34 -08:00
OpenShift Merge Robot
a6bddcd493 Merge pull request #657 from wking/type-platform-subdirs
pkg/types: Push platform-specific types (AWS, etc.) into subdirs
2018-11-13 09:52:54 -08:00
OpenShift Merge Robot
08018ca79e Merge pull request #661 from staebler/clean_up_test_files_from_pkg_asset
asset: use temporary directory for asset store in tests
2018-11-13 05:33:42 -08:00
Flavio Percoco
40e438cebe Read OpenStack creds from standard paths
OpenStack creds could be in 3 different paths (etc, home config and
current dir). Instead of re-implementing the logic to find and read the
clouds.yaml file, we should use gophercloud which is the standard
go library for OpenStack.

Note that deployments on OpenStack are currently broken unless there's
a clouds.yaml under /etc/openstack.

Fixes #550
2018-11-13 11:37:35 +01:00
staebler
91ccc25aa7 asset: use temporary directory for asset store in tests
The asset store tests that call Fetch create residual state files (and could also use
any state files left over from previous tests). This change uses a temporary directory
for each test run so that the environment is clean before and after the tests.
2018-11-12 22:23:22 -05:00
W. Trevor King
6c5e90485d pkg/types: Push platform-specific types (AWS, etc.) into subdirs
This decouples our platforms a bit and makes it easier to distinguish
between platform-specific and platform-agnostic code.  It also gives
us much more compact struct names, since now we don't need to
distinguish between many flavors of machine pool, etc. in a single
package.

I've also updated pkg/types/doc.go; pkg/types includes more than
user-specified configuration since 78c31183 (pkg: add ClusterMetadata
asset,type that can be used for destroy, 2018-09-25, #324).

I've also added OWNERS files for some OpenStack-specific directories
that were missing them before.

There's still more work to go in this direction (e.g. pushing default
logic into subdirs), but this seems like a reasonable chunk.
2018-11-12 11:56:45 -08:00
Casey Callendrello
687ff31b69 pkg/manifests: move cluster_k8s_io to manifests from machines.
This is to break an import loop between pkg/assets/machines and
pkg/assets/manifests.
2018-11-12 10:56:50 +01:00
Casey Callendrello
31b78ea746 replace tectonic-network-operator with cluster-network-operator
* Generate cluster-network-operator config from install-config
* Refactor install-config to better reflect network config
* remove tectonic-network-operator
* Remove temporary kube-proxy and cvo override
2018-11-12 10:56:35 +01:00
staebler
64bb638a85 validate: remove unused prefixError function
We are using github.com/pkg/errors to handle error wrapping.
The prefixError function is obsolete.
2018-11-11 13:54:02 -05:00
OpenShift Merge Robot
5b3964decd Merge pull request #614 from wking/libvirt-bootstrap-dns-removal
pkg/destroy/bootstrap: Remove bootstrap from DNS for libvirt
2018-11-09 14:28:33 -08:00
OpenShift Merge Robot
6d57970715 Merge pull request #592 from rajatchopra/manifest-templates
target manifest-templates
2018-11-09 12:05:22 -08:00
Rajat Chopra
166a9f1eb3 pkg/asset: new target manifest-templates
1. Move files from manifests/content to templates directory
2. Create new asset called templates that the target manifest-templates can directly call
3. All template files are separate assets by themselves, and 'templates' asset depends on all leaf template assets
4. Manifest/tectonic assets now use templates as parent assets that they depend upon

Other templates (e.g. ignition/machines) are not moved into assets in this commit.

data/data/manifests: move all yaml content to its own files

So that a yaml lint check can catch the inappropriate ones.
No functional change at runtime.
2018-11-08 17:08:40 -05:00
OpenShift Merge Robot
4617ea711a Merge pull request #510 from wking/bootstrap-ignition-from-data-directory
data/bootstrap: Pull content out of pkg/asset/ignition/bootstrap
2018-11-08 11:03:11 -08:00
OpenShift Merge Robot
b7798874b6 Merge pull request #640 from simonpasquier/fix-typo
pkg/asset/cluster: fix typo
2018-11-08 07:52:28 -08:00
Simon Pasquier
c661b2c0d3 pkg/asset/cluster: fix typo
2018-11-08 09:53:32 +01:00
W. Trevor King
6c4160fabe data/bootstrap: Pull content out of pkg/asset/ignition/bootstrap
It's easier for humans and linters to find this content if it's not
hidden in Go variables.

Since we're effectively pulling these files from Git now (either at
build time or at run-time depending on release vs. dev mode in
hack/build.sh), I'm being a bit more relaxed about file modes than the
previous implementation.  Files are now either 0555 (if they are in a
'bin' directory) or 0600 (if they aren't).  This is a change for files
like manifests.Manifests, which had previously been 0644.
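
As a rough sketch of that mode rule: the function name `modeFor`, the example paths, and the reading that "in a 'bin' directory" means the immediate parent directory are all assumptions here, not taken from the installer source.

```go
package main

import (
	"fmt"
	"os"
	"path"
)

// modeFor (hypothetical helper) returns 0555 for files whose immediate
// parent directory is named "bin" and 0600 for everything else. Paths are
// slash-separated.
func modeFor(name string) os.FileMode {
	if path.Base(path.Dir(name)) == "bin" {
		return 0555
	}
	return 0600
}

func main() {
	for _, name := range []string{
		"files/usr/local/bin/example.sh", // hypothetical path
		"files/etc/example.conf",         // hypothetical path
	} {
		fmt.Printf("%04o %s\n", modeFor(name), name)
	}
}
```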

I've flattened the manifest overrides into a single directory, because
the filenames are sufficient for sorting them by operator.  And all of
the override manifests now have their own comment explaining their
target and eventual location.
2018-11-07 21:25:48 -08:00
OpenShift Merge Robot
e64a43d293 Merge pull request #630 from wking/improve-docs-for-cluster-name-base-domain-interaction
pkg/asset/installconfig/basedomain: Document cluster-name subdomains
2018-11-07 17:18:40 -08:00
W. Trevor King
e6b90babce pkg/asset/installconfig/platform: Remove libvirt image prompt
Keep the environment variable (with a warning about using it), but
drop the interactive prompt.  The default is solid, and users
manipulating it are more likely to break something (e.g. by continuing
to use the old v1 pipeline), while the installer can update its
default to track the RHCOS folks (e.g. like 11178211, rhcos: implement
image discovery for new pipeline, 2018-10-26, #554).
2018-11-07 09:48:26 -08:00
W. Trevor King
b81fef787e pkg/asset/installconfig/basedomain: Document cluster-name subdomains
Brian says [1]:

  The third time I had to abort was when prompted for base domain
  followed by cluster name (I included my cluster name in my base
  domain because I'm using a well structure name server delegation
  structure).

And I've seen a number of other cases where folks suggest including
the cluster name again in the base domain.  Sometimes you might need
to do that (e.g. if you cannot create subdomains without the
additional namespacing).  But in most cases, including the cluster
name in the base domain is redundant.

[1]: https://github.com/openshift/installer/issues/627#issue-378069672
2018-11-06 18:17:14 -08:00
W. Trevor King
b2930b94b9 pkg/asset/installconfig/pullsecret: Help for single JSON line
Brian says [1]:

  It appears that the pull-secret prompt expects the input to be
  compacted. This should be more clearly specified.

The auths hint is from our existing OPENSHIFT_INSTALL_PULL_SECRET
docs.

[1]: https://github.com/openshift/installer/issues/627#issuecomment-436441021
2018-11-06 18:10:16 -08:00
W. Trevor King
e244ee083c pkg/tfvars/libvirt/cache: Remove .gz suffix handling
The new pipeline handles Content-Encoding gzip:

  $ curl -LI --compressed https://releases-rhcos.svc.ci.openshift.org/storage/releases/maipo/47.73/redhat-coreos-maipo-47.73-qemu.qcow2
  HTTP/1.1 302 Moved Temporarily
  Server: nginx/1.12.1
  Date: Tue, 06 Nov 2018 20:55:06 GMT
  Content-Type: text/html
  Content-Length: 161
  Location: https://d26v6vn1y7q7fv.cloudfront.net/releases/maipo/47.73/redhat-coreos-maipo-47.73-qemu.qcow2
  Set-Cookie: ...; path=/; HttpOnly; Secure

  HTTP/1.1 200 OK
  Content-Type: binary/octet-stream
  Content-Length: 726878219
  Connection: keep-alive
  Date: Tue, 06 Nov 2018 20:23:37 GMT
  Last-Modified: Tue, 06 Nov 2018 17:56:36 GMT
  ETag: "021080ef3b515d2443be3749ebbb0b08-87"
  Content-Encoding: gzip
  Accept-Ranges: bytes
  Server: AmazonS3
  Age: 1890
  X-Cache: Hit from cloudfront
  Via: 1.1 d85d7507ed6501757cfe600c02a26c7d.cloudfront.net (CloudFront)
  X-Amz-Cf-Id: ...

so we can rely on Go's default compression handling.  See
DisableCompression, which defaults to false, in [1].  To confirm the
default handling, try to retrieve from a local server:

  $ export OPENSHIFT_INSTALL_LIBVIRT_IMAGE=http://localhost:8080/example.qcow
  $ openshift-install --dir=wking create cluster
  INFO Fetching OS image...
  FATAL Error executing openshift-install: failed to fetch Terraform Variables: failed to generate asset "Terraform Variables": failed to get Tfvars: failed to use cached libvirt image: Get http://localhost:8080/example.qcow: EOF

In another terminal, I was using Ncat as a dummy server to capture
request headers:

  $ nc -l -p 8080 </dev/null
  GET /example.qcow HTTP/1.1
  Host: localhost:8080
  User-Agent: Go-http-client/1.1
  Accept-Encoding: gzip

This commit drops the now-unnecessary suffix handling from 7abe94d4
(config/libvirt/cache: Decompress when URI has ".gz" suffix,
2018-09-21, #280).

[1]: https://golang.org/pkg/net/http/#Transport
2018-11-06 13:01:02 -08:00
Abhinav Dahiya
cb37d8fbfd manifets: add config map for root-ca
operators like MachineConfigOperator can use this configmap to sync up the root ca
that needs to be provided to nodes.
2018-11-06 09:41:45 -08:00
Dr. Stefan Schimanski
9556ff32cf bootkube: remove ctrl mgr --disable-phase-2 after default change
2018-11-06 09:29:40 +01:00
OpenShift Merge Robot
6106d58ddf Merge pull request #613 from abhinavdahiya/mp_to_ms
asset/machines/aws: fix num of machine objects created.
2018-11-05 15:59:33 -08:00
Ravi Sankar Penta
70991ddf21 Remove kube-dns from installer
- Now openshift-cluster-dns-operator will provide the needed internal domain resolution functionality
2018-11-05 12:58:52 -08:00
W. Trevor King
5e285cc2f8 pkg/destroy/bootstrap: Remove bootstrap from DNS for libvirt
Without this, round-robin clients will fail when they hit the
bootstrap DNS entry (after the bootstrap node stops serving its
control plane).

The implementation is a bit awkward; I'd have preferred the AWS
approach, with:

  resource "aws_lb_target_group_attachment" "bootstrap" {
    count = "${var.target_group_arns_length}"

    target_group_arn = "${var.target_group_arns[count.index]}"
    target_id        = "${aws_instance.bootstrap.private_ip}"
  }

in the bootstrap module.  But the libvirt host entries are only
available as a subsection of a libvirt_network resource (because the
whole network is defined in a single XML object, including the DNS
entries [1]).  So instead I've added an additional variable which we
can tweak to disable the bootstrap entry.  The default value for the
new variable includes the bootstrap entry for the initial cluster
'apply' call; on destroy I override it via an *.auto.tfvars file (which
Terraform loads automatically [2]) to remove the bootstrap entry.

[1]: https://libvirt.org/formatnetwork.html#elementsAddress
[2]: https://www.terraform.io/docs/configuration/variables.html
2018-11-05 10:12:20 -08:00
Abhinav Dahiya
ea51957936 asset/machines/aws: fix num of machine objects created.
https://github.com/openshift/installer/pull/573 had wrong calculation for machine objects
in AWS.
2018-11-05 10:08:28 -08:00
Luis Sanchez
7294a3dad2 manifests: new secret etcd-client.kube-system
The manifest to create this resource belongs in the installer. The renderer in
cluster-kube-apiserver-operator will also need to be changed to stop
creating the same manifest file.

Also, updated the dependency graph at docs/design/resource_dep.svg
2018-11-04 13:03:58 -05:00
OpenShift Merge Robot
0f78fadf92 Merge pull request #547 from abhinavdahiya/state_cleanup
cmd/destroy: delete cluster asset after destroying cluster
2018-11-02 22:20:17 -07:00
OpenShift Merge Robot
0013548d02 Merge pull request #604 from crawford/restart
asset/ignition: fix bootkube and tectonic retries
2018-11-02 19:10:43 -07:00
Clayton Coleman
6a20a4c477 assets: The CVO object name has changed
The name changed between when this PR was created and the newer CVO
PR was updated.
2018-11-02 21:07:41 -04:00
OpenShift Merge Robot
07c2c4f54a Merge pull request #601 from abhinavdahiya/mp_to_ms
pkg/asset/machines/libvirt: allow empty machinepool
2018-11-02 16:56:03 -07:00
OpenShift Merge Robot
0d6145ab18 Merge pull request #549 from smarterclayton/rename_cvoconfig
cvo: Rename the API object for CVOConfig to ClusterVersion
2018-11-02 16:23:44 -07:00
Abhinav Dahiya
30b2ca543f asset: trim Store interface and add Destroy to Store
Save and Purge functions don't really seem to belong to Store interface.

Fetch fetches the asset, checkpoints the state file to disk and consumes
all the assets that were loaded from disk to create the asset.

Also fixes the error in save() where it would save only the assets in memory and
drop all the other assets that were present in the state file but had not been fetched
into the store's asset map,
e.g.:
```console
# Before this commit:
./openshift-install ign configs # state has everything
./openshift-install install-config # state now only has assets up to install-config; all the ign config assets are lost from the state file

# After this commit:
./openshift-install ign configs # state has everything
./openshift-install install-config # state still includes all the state from the 'ign configs' run.
```

Destroy deletes the asset from all the possible sources, state file and disk.
2018-11-02 16:18:24 -07:00
Alex Crawford
945040c10c asset/ignition: fix bootkube and tectonic retries
Back in 54242f8, RemainAfterExit was added to prevent bootkube.service
and tectonic.service from running multiple times when progress.service
restarted. Due to a systemd bug [1], this causes the restart to be
ignored on those services.

This new approach uses conditions instead. This allows systemd to
activate the services but not actually start them if they have already
completed.

[1]: https://github.com/systemd/systemd/issues/3396
2018-11-02 15:59:38 -07:00
Abhinav Dahiya
63c2ee8c83 pkg/asset/machines/libvirt: allow empty machinepool
An empty machinepool is valid for libvirt because we don't use it.

Fixes the error
```console
FATAL Error executing openshift-install: non-Libvirt machine-pool: ""
```
introduced in https://github.com/openshift/installer/pull/573
2018-11-02 12:16:57 -07:00