mirror of https://github.com/openshift/installer.git synced 2026-02-06 18:47:19 +01:00
Commit Graph

5336 Commits

Author SHA1 Message Date
Casey Callendrello
f8441fc14f docs: add some networking troubleshooting docs
2018-11-19 12:04:33 +01:00
OpenShift Merge Robot
3d0ba6a0b0 Merge pull request #674 from abhinavdahiya/pull_secret
data/data/bootstrap: add --pull-secret flag to mco bootstrap in bootkube.sh
2018-11-16 14:22:40 -08:00
OpenShift Merge Robot
aaa3542cc8 Merge pull request #692 from deads2k/fix-etcd-ca
bootkube: CA for configmaps should not be base64
2018-11-16 13:48:10 -08:00
OpenShift Merge Robot
aa396312ca Merge pull request #695 from wking/drop-tfvars-yaml
pkg/tfvars: Drop YAML tags
2018-11-16 12:13:22 -08:00
W. Trevor King
09469eba1b pkg/tfvars: Drop YAML tags
We haven't needed these since we dropped the parsers in f7a4e68b
(pkg/types/config: Drop ParseConfig and other Parse* methods,
2018-10-02, #403).

Generated with:

  $ sed -i 's/ yaml:.*/`/' $(git grep -l yaml pkg/tfvars)
2018-11-16 09:31:05 -08:00
David Eads
b94ecccfaa bootkube: CA for configmaps should not be base64
secrets are base64 encoded, but configmaps are raw strings
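
The distinction can be sketched in Go with only the standard library. This is an illustration, not the bootkube.sh change itself: the helper names `encodeForSecret` and `encodeForConfigMap` and the placeholder certificate text are hypothetical.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeForSecret returns the value as it would be written into a Secret's
// data field, which carries base64-encoded bytes.
func encodeForSecret(raw string) string {
	return base64.StdEncoding.EncodeToString([]byte(raw))
}

// encodeForConfigMap returns the value as it would be written into a
// ConfigMap's data field, which carries the raw string unchanged.
func encodeForConfigMap(raw string) string {
	return raw
}

func main() {
	// Placeholder text standing in for a real CA certificate.
	ca := "-----BEGIN CERTIFICATE-----\n(placeholder)\n-----END CERTIFICATE-----\n"
	fmt.Printf("secret data:    %s\n", encodeForSecret(ca))
	fmt.Printf("configmap data: %s", encodeForConfigMap(ca))
}
```

Base64-encoding a value before placing it in a ConfigMap would hand consumers the encoded text rather than the certificate, which is the mistake this commit corrects.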
2018-11-16 08:47:29 -05:00
OpenShift Merge Robot
5a740603a7 Merge pull request #391 from hardys/openstack_destroy
OpenStack Destroy Cluster Support
2018-11-16 02:47:16 -08:00
Steven Hardy
0b6d686f87 OpenStack Destroy Support
This adds initial support for destroy for OpenStack

Note that because gophercloud doesn't currently support nova tags[1]
I've used Metadata for servers (which is already populated with the
tectonicClusterId via the server properties).

Also note this requires changes to the terraform-provider-openstack which
are in the latest 1.12 release:

https://github.com/terraform-providers/terraform-provider-openstack/releases/tag/v1.12.0
https://github.com/terraform-providers/terraform-provider-openstack/issues/453
2018-11-16 09:10:48 +00:00
Steven Hardy
f304be43e4 vendor: gophercloud/gophercloud
This adds compute and networking gophercloud APIs, required for
OpenStack destroy support.
2018-11-16 08:53:39 +00:00
OpenShift Merge Robot
954b7f65f0 Merge pull request #686 from wking/drop-aws-region-current
data/data/aws/vpc: Drop 'current' from aws_region
2018-11-15 15:56:57 -08:00
OpenShift Merge Robot
c25196ad18 Merge pull request #683 from smarterclayton/make_consistent_with_3x
security: Open ports 9000-9999 inside the cluster for host network services
2018-11-15 15:18:51 -08:00
W. Trevor King
fe93bbe6a0 data/data/aws/vpc: Drop 'current' from aws_region
It's the default.  The old parameter was deprecated before
terraform-providers/terraform-provider-aws@1cc81c92
(docs/data-source/aws_region: Remove now deprecated current argument,
2018-02-09, v1.9.0), and we're pinning 1.39.0 since ac5aeed6
(data/aws: bump aws provider, 2018-11-01, #594).  Changing this
avoids:

  $ openshift-install --log-level=debug create cluster
  ...
  - Downloading plugin for provider "aws" (1.39.0)...
  ...
  Warning: module.vpc.data.aws_region.current: "current": [DEPRECATED] Defaults to current provider region if no other filtering is enabled
  ...
2018-11-15 13:13:21 -08:00
OpenShift Merge Robot
6a098fd06d Merge pull request #666 from cgwalters/edit-motd
bootkube: Add info to /etc/motd and a bit to /root/.bash_history
2018-11-15 13:12:46 -08:00
Clayton Coleman
3248996de2 security: Open ports 9000-9999 inside the cluster for host network services
In OpenShift 3.x we opened 9000-9999 for TCP for all internal connections
between masters, infra, and workers so that we could have a range that
host level services inside the cluster could coordinate on. This range
is analogous to node ports, except unlike node ports it is only available
on the inside. The most common consumers are node network metrics ports
(node exporter, cluster version operator, network operator, sdn, node
proxy) that need to be reachable from prometheus without magic tricks.
A second set is internal secured services that want to connect but must
be host network, like gluster, storage services, or other cluster level
proxies.

Open the range 9000-9999 by default so that new services don't require
either a reinstall or manual management. Future changes in the platform
may autoallocate from this range, but for now teams must reserve.
2018-11-15 15:35:46 -05:00
Colin Walters
b81e635909 bootstrap: Add info to /etc/motd
The intention here is to help people debug.  If you like this
I may also try to do something similar for masters.

V2: Applied code changes mostly written by @wking
2018-11-15 15:34:11 -05:00
OpenShift Merge Robot
31567af33f Merge pull request #682 from wking/drop-kube-addon
*: Drop tectonic-system namespace
2018-11-15 11:47:12 -08:00
W. Trevor King
a14da047c5 vendor: Drop tectonic-config
No longer needed since we dropped the tectonic-system namespace.
I edited Gopkg.toml by hand and then ran:

  $ dep ensure

using:

  $ dep version
  dep:
   version     : v0.5.0
   build date  :
   git hash    : 22125cf
   go version  : go1.10.3
   go compiler : gc
   platform    : linux/amd64
   features    : ImportDuringSolve=false
2018-11-15 11:10:32 -08:00
W. Trevor King
cd20c83c56 docs/design/resource_dep: Rebuild after tectonic-system prune
Generated with:

  $ openshift-install graph | dot -Tsvg >docs/design/resource_dep.svg

using:

  $ dot -V
  dot - graphviz version 2.30.1 (20170916.1124)
2018-11-15 11:10:31 -08:00
W. Trevor King
ac11b74cc5 *: Drop tectonic-system namespace
The kube-addon operator was the last remaining component in that
namespace, and it was just controlling a metrics server.  Metrics
aren't critical to cluster functions, and dropping kube-addon means we
don't need the old pull secret anymore (although we will shortly need
new pull secrets for pulling private release images [1]).

Also drop the admin and user roles [2], although I'm less clear on
their connection.

[1]: https://github.com/openshift/installer/pull/663
[2]: https://github.com/openshift/installer/pull/682#issuecomment-439145907
2018-11-15 11:10:31 -08:00
OpenShift Merge Robot
2507518e40 Merge pull request #678 from wking/issue-template-libvirt-version
.github/ISSUE_TEMPLATE: Ask for libvirt version information
2018-11-15 06:21:15 -08:00
W. Trevor King
a5e6fe4ce3 .github/ISSUE_TEMPLATE: Ask for libvirt version information
We see a lot of libvirt issues due to ancient libvirt versions.  Just
today, I fielded one from someone running QEMU 1.5.3 (from 2013!).
Asking for this information up front saves a follow up
request/response, and hopefully it's obvious enough that folks using
other providers can skip it.

Terraform can load plugins from a few places [1], but I've used the
standard path on POSIX systems.  I expect folks who install to
non-standard locations will be able to adapt, and we'll worry about
making it easy for Windows when we get users on Windows ;).

[1]: https://www.terraform.io/docs/extend/how-terraform-works.html#plugin-locations
2018-11-14 21:46:55 -08:00
OpenShift Merge Robot
8edb29d8a6 Merge pull request #676 from wking/dependencies-go-header-level
docs/dev/dependencies: Nest "Go" under "Build Dependencies"
2018-11-14 16:11:16 -08:00
W. Trevor King
7ad6d9ddd1 docs/dev/dependencies: Nest "Go" under "Build Dependencies"
This should have happened when "Build Dependencies" landed in 4278ba3f
(docs/libvirt-howto: Add dependency installation, 2018-09-24, #315).
2018-11-14 15:53:38 -08:00
Abhinav Dahiya
460cd6a0f7 data/data/bootstrap: add --pull-secret flag to mco bootstrap in bootkube.sh
2018-11-14 14:39:03 -08:00
OpenShift Merge Robot
aa0e4b336b Merge pull request #673 from wking/docs-prefix-libvirt-master
docs/dev/libvirt: Update to {cluster-name}-master-0
2018-11-14 13:05:52 -08:00
W. Trevor King
393aeaf9ce docs/dev/libvirt: Update to {cluster-name}-master-0
Catching up with 18ca9128 (pkg/destroy/libvirt: Use prefix-based
deletion, 2018-11-12, #660).
2018-11-14 12:48:30 -08:00
OpenShift Merge Robot
23cc80b950 Merge pull request #664 from wking/generic-troubleshooting-section
docs/user/troubleshooting: Add generic advice for getting pod details
2018-11-14 12:17:16 -08:00
OpenShift Merge Robot
31dfbd9204 Merge pull request #625 from sallyom/danmace-libvirt-scripts
scripts for nested-libvirt ci
2018-11-14 12:07:27 -08:00
OpenShift Merge Robot
4ebf8f4df2 Merge pull request #639 from staebler/do_not_generate_unneeded_dependents
Skip generation of dependencies for on-disk assets
2018-11-14 11:10:00 -08:00
OpenShift Merge Robot
e1d784bcd5 Merge pull request #667 from wking/install-config-platform-subdirs
pkg/asset/installconfig: Push platform-specific logic into subdirectories
2018-11-14 10:18:11 -08:00
Sally O'Malley
652b825331 scripts for nested-libvirt ci
2018-11-14 11:50:43 -05:00
OpenShift Merge Robot
f3c63c4a24 Merge pull request #670 from pmorie/fix-typo
Fix typo in libvirt instructions
2018-11-14 08:37:09 -08:00
Paul Morie
b6f9192658 Fix typo in libvirt instructions
2018-11-14 11:04:14 -05:00
W. Trevor King
ee519b421c pkg/asset/installconfig: Push platform-specific logic into subdirectories
To keep the platform-specific code separate.

The only thing that wasn't perfectly clean about this separation is
that now the AWS and OpenStack packages have their own defaultVPCCIDR.
There's no reason that they need to agree on that CIDR, though, so I
haven't bothered setting up a common-default package for them to share
a single constant.
2018-11-13 22:54:31 -08:00
staebler
5e50cbf754 asset/store: do not generate dependencies for on-disk assets when none of its dependencies are dirty
This allows the user to supply and use an on-disk asset (such as
install-config.yml) without the need to also supply the state file that was
created. This is helpful when re-using an on-disk asset for multiple
installations. In particular, hive would like to run openshift-install
with a supplied install-config.yml and no state file.

To effect this behavior, the asset store loads all of the on-disk assets that a
fetched asset depends upon prior to fetching the dependencies for the fetched
asset. From this, the asset store can determine whether the fetched asset is
dirty or not. If the fetched asset is not dirty and is on-disk or in the state
file, then the asset is used as is without generating any of the dependent
assets--as they would be ignored in resolving the fetched asset anyway.

Conflicts can occur when the asset store resolves in different ways the fetch of
two assets that share a dependency. For example, let us say that there are two
assets, A and B, that both depend upon asset C. Asset A is present on disk, and
asset B is not present on disk. When the asset store fetches asset A, then asset
C will not be generated. However, when the asset store fetches asset B, then
asset C will be generated in order to generate asset B. Asset A could
potentially have data that conflicts with the data that would have been taken
from the asset C that was generated.
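
A rough Go sketch of the fetch behaviour and the A/B/C scenario described above. The `asset` and `store` types are simplified, hypothetical stand-ins for the real asset store, and the sketch assumes that an asset counts as dirty when any of its transitive dependencies was supplied on disk; the actual rule in pkg/asset may differ.

```go
package main

import "fmt"

// asset is a hypothetical, simplified description of an installer asset.
type asset struct {
	onDisk  bool     // supplied by the user as an on-disk file
	inState bool     // recorded in the state file
	deps    []string // names of the assets this asset depends upon
}

// store is a hypothetical, simplified asset store.
type store struct {
	assets    map[string]asset
	done      map[string]bool // assets already resolved
	generated []string        // assets generated, in order
}

// dirty reports whether any transitive dependency was supplied on disk
// (an assumption made for this sketch).
func (s *store) dirty(name string) bool {
	for _, d := range s.assets[name].deps {
		if s.assets[d].onDisk || s.dirty(d) {
			return true
		}
	}
	return false
}

// fetch resolves an asset. A clean asset found on disk or in the state file
// is used as is, so none of its dependencies are generated.
func (s *store) fetch(name string) {
	if s.done[name] {
		return
	}
	s.done[name] = true
	a := s.assets[name]
	if !s.dirty(name) && (a.onDisk || a.inState) {
		return
	}
	for _, d := range a.deps {
		s.fetch(d)
	}
	s.generated = append(s.generated, name)
}

// newStore builds the example from the text: A (on disk) and B both depend on C.
func newStore() *store {
	return &store{
		assets: map[string]asset{
			"A": {onDisk: true, deps: []string{"C"}},
			"B": {deps: []string{"C"}},
			"C": {},
		},
		done: map[string]bool{},
	}
}

func main() {
	s := newStore()
	s.fetch("A")
	fmt.Println(s.generated) // A is used as is, so C is not generated
	s.fetch("B")
	fmt.Println(s.generated) // C is generated in order to generate B
}
```

After both fetches, A still holds whatever it was written with, while B was built from a freshly generated C, which is where the conflict can arise.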

The new load function creates new asset instances to store the asset state loaded
from on-disk and from the state file. The store tests were relying on the same
asset instance being used throughout the test. Unfortunately, the tests now
need to use a lot of global variables, making the tests more fragile.

The assetToPurge field was removed since the same information can be obtained by
iterating over the assets map. Also, the parameter passed to the purge function
was changed to a single Asset instead of an Asset slice since the function is
only ever called with a single Asset.

Fixes https://github.com/openshift/installer/issues/545
2018-11-13 23:17:17 -05:00
OpenShift Merge Robot
54b49cdf0f Merge pull request #665 from wking/subpackages-for-install-config-to-metadata
pkg/asset/cluster: Push InstallConfig -> Metadata into subpackages
2018-11-13 18:52:00 -08:00
W. Trevor King
a40cdc1445 docs/user/troubleshooting: Add generic advice for getting pod details
Because this is the next step beyond "my cluster didn't install,
here's the --log-level=debug output", for folks working up bug reports
that don't match one of the common failures.
2018-11-13 16:30:07 -08:00
W. Trevor King
0dd848f246 pkg/asset/cluster: Push InstallConfig -> Metadata into subpackages
To keep the platform-specific code separate.  These converters are a
bit tiny for their own packages, but they seemed too
application-specific to belong to pkg/types.
2018-11-13 16:14:30 -08:00
OpenShift Merge Robot
32a15a5330 Merge pull request #660 from wking/libvirt-destroy
pkg/destroy/libvirt: Use prefix-based deletion
2018-11-13 15:27:19 -08:00
W. Trevor King
18ca912897 pkg/destroy/libvirt: Use prefix-based deletion
To avoid wiping out the caller's whole libvirt environment, regardless
of whether it was associated with our cluster or not.  Using
cluster-name prefixes still makes me a bit jumpy, so I've added
notes to both the environment-variable and asset-prompt docs
warning libvirt users to pick something sufficiently unique.

Also:

* Use {cluster-name}-master-{count} naming.  We used to use
  master{count}, which diverged from other usage (e.g. AWS, which has
  used master-{count} since way back in ca443c5e (openstack/nova:
  replace cloud-init with ignition, 2017-02-27,
  coreos/tectonic-installer#7)).

* Rename module.libvirt_base_volume -> module.volume.  There's no
  reason to diverge from the module source for that name.
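
A minimal Go sketch of prefix-based selection, to show why the cluster name should be sufficiently unique. The `withPrefix` helper and every domain name except test1-worker-0-zd7hd are hypothetical; this is not the code in pkg/destroy/libvirt.

```go
package main

import (
	"fmt"
	"strings"
)

// withPrefix returns the resource names that start with the cluster name.
func withPrefix(names []string, clusterName string) []string {
	var matched []string
	for _, n := range names {
		if strings.HasPrefix(n, clusterName) {
			matched = append(matched, n)
		}
	}
	return matched
}

func main() {
	domains := []string{
		"test1-master-0",
		"test1-worker-0-zd7hd",
		"unrelated-vm",
		"test10-master-0", // belongs to a different cluster named test10
	}
	// The test10 domain is selected too, because "test1" is a prefix of "test10".
	fmt.Println(withPrefix(domains, "test1"))
}
```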
2018-11-13 14:52:48 -08:00
W. Trevor King
57bb8bc677 pkg/destroy/libvirt: Single pass (instead of looping goroutines)
And I've rerolled deletion to use a single call to each deleter,
failing fast if they error.  That should address cases where we cannot
destroy a shut-off domain [1]:

  $ virsh -c $OPENSHIFT_INSTALL_LIBVIRT_URI list --all
   Id    Name                           State
  ----------------------------------------------------
   -     master0                        shut off
   -     test1-worker-0-zd7hd           shut off

  $ bin/openshift-install destroy cluster --dir test --log-level debug
  DEBUG Deleting libvirt volumes
  DEBUG Deleting libvirt domains
  DEBUG Deleting libvirt network
  DEBUG Exiting deleting libvirt network
  DEBUG goroutine deleteNetwork complete
  ERROR Error destroying domain test1-worker-0-zd7hd: virError(Code=55, Domain=10, Message='Requested operation is not valid: domain is not running')
  DEBUG Exiting deleting libvirt domains
  DEBUG Exiting deleting libvirt volumes
  DEBUG goroutine deleteVolumes complete
  DEBUG Deleting libvirt domains
  ERROR Error destroying domain test1-worker-0-zd7hd: virError(Code=55, Domain=10, Message='Requested operation is not valid: domain is not running')
  [...]

Now we'll fail-fast in those cases, allowing the caller to clear the
stuck domains, after which they can restart deletion.

The previous goroutine approach was borrowed from the AWS destroyer.
But AWS has a large, complicated resource dependency graph which
includes cycles.  Libvirt is much simpler, with volumes and a network
that are all independent, followed by domains which depend on the
network and some of the volumes.  With this commit we now take a
single pass at destroying those resources starting at the leaf domains
and working our way rootwards.

I've retained some looping (although no longer in a separate
goroutine) for domain deletion.  This guards against racing domain
creation, as discussed in the new godocs for deleteDomains.
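
The single-pass, fail-fast ordering might be sketched in Go as follows. The `deleter` type, the `destroy` and `demo` functions, and the simulated error are hypothetical illustrations, not the functions in this package; the retry loop for domain deletion is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// deleter is a hypothetical named deletion step.
type deleter struct {
	name string
	run  func() error
}

// destroy calls each deleter exactly once, in order, and returns on the
// first error so the caller can fix the problem and restart deletion.
func destroy(deleters []deleter) error {
	for _, d := range deleters {
		if err := d.run(); err != nil {
			return fmt.Errorf("%s: %w", d.name, err)
		}
	}
	return nil
}

// demo runs the steps leaf-first (domains, then volumes and network) with a
// simulated stuck domain, and reports which steps actually ran.
func demo() ([]string, error) {
	var ran []string
	step := func(name string, err error) deleter {
		return deleter{name: name, run: func() error {
			ran = append(ran, name)
			return err
		}}
	}
	err := destroy([]deleter{
		step("domains", errors.New("domain is not running")),
		step("volumes", nil),
		step("network", nil),
	})
	return ran, err
}

func main() {
	ran, err := demo()
	fmt.Println("steps run:", ran)
	fmt.Println("error:", err)
}
```

Because the domain step fails, the volume and network steps are never attempted, in contrast to the interleaved goroutine output shown earlier in this message.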

Also:

* Rename from libvirt_prefix_deprovision.go to libvirt.go.  The name
  is from 998ba306 (cmd,pkg/destroy: add non-terraform destroy,
  2018-09-25, #324), but the implementation doesn't need to be
  represented in the filename.  This commit renames to libvirt.go to
  match the package name, since this file is the guts of this package.

* Simplify the AlwaysTrueFilter implementation.  No semantic changes,
  but this saves us a few lines of code.

* Add trailing periods for godocs to comply with [2].

[1]: https://github.com/openshift/installer/issues/656#issue-379634884
[2]: https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences
2018-11-13 14:52:47 -08:00
W. Trevor King
ec68d771a5 pkg/types/installconfig/libvirt: Drop Network.Name
We'd been defaulting it to ClusterName in InstallConfig.Generate, and
I see no reason for the user to want to create a separate name for the
network alone.  The variable dates back to 4a08942c (steps: bootstrap
/ etcd / topology support for libvirt, 2018-04-24,
coreos/tectonic-installer#3213), where it is not explicitly motivated.
2018-11-13 14:52:16 -08:00
OpenShift Merge Robot
8096592bd2 Merge pull request #659 from wking/consolidate-package-name-variables
pkg/asset/installconfig/platform: Drop *PlatformType for types.{platform}.Name
2018-11-13 11:08:43 -08:00
W. Trevor King
1e129fe32b pkg/asset/installconfig/platform: Drop *PlatformType for types.{platform}.Name
The old *PlatformType are from cccbb37a (Generate installation assets
via a dependency graph, 2018-08-10, #120), but since 476be073
(pkg/asset: use vendored cluster-api instead of go templates,
2018-10-30, #573), we've had variables for the name strings in the
more central pkg/types.  With this commit, we drop the more peripheral
forms.  I've also pushed the types.PlatformName{Platform} variables
down into types.{platform}.Name at Abhinav's suggestion [1].

I've added a unit test to enforce sorting in PlatformNames, because
the order is required by sort.SearchStrings in queryUserForPlatform.
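
A small Go sketch of why the ordering matters. `platformNames` and `isValid` are hypothetical stand-ins for PlatformNames and the lookup in queryUserForPlatform; `sort.SearchStrings` and `sort.StringsAreSorted` are the standard-library functions.

```go
package main

import (
	"fmt"
	"sort"
)

// platformNames is a hypothetical stand-in for the sorted PlatformNames slice.
var platformNames = []string{"aws", "libvirt", "openstack"}

// isValid reports whether name is present in names, which must be sorted in
// ascending order because sort.SearchStrings performs a binary search.
func isValid(names []string, name string) bool {
	i := sort.SearchStrings(names, name)
	return i < len(names) && names[i] == name
}

func main() {
	// The property a unit test would enforce.
	fmt.Println("sorted:", sort.StringsAreSorted(platformNames))
	fmt.Println("libvirt found:", isValid(platformNames, "libvirt"))

	// With an unsorted slice the binary search can miss an entry that is present.
	unsorted := []string{"openstack", "libvirt", "aws"}
	fmt.Println("aws found in unsorted slice:", isValid(unsorted, "aws"))
}
```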

[1]: https://github.com/openshift/installer/pull/659#discussion_r232849156
2018-11-13 10:36:34 -08:00
OpenShift Merge Robot
a6bddcd493 Merge pull request #657 from wking/type-platform-subdirs
pkg/types: Push platform-specific types (AWS, etc.) into subdirs
2018-11-13 09:52:54 -08:00
OpenShift Merge Robot
08018ca79e Merge pull request #661 from staebler/clean_up_test_files_from_pkg_asset
asset: use temporary directory for asset store in tests
2018-11-13 05:33:42 -08:00
OpenShift Merge Robot
3c5b2bef01 Merge pull request #588 from flaper87/openstack-creds-read
Read OpenStack creds from standard paths
2018-11-13 04:49:45 -08:00
Flavio Percoco
40e438cebe Read OpenStack creds from standard paths
OpenStack creds could be in 3 different paths (etc, home config and
current dir). Instead of re-implementing the logic to find and read the
clouds.yaml file, we should use gophercloud which is the standard
go library for OpenStack.

Note that deployments on OpenStack are currently broken unless there's
a clouds.yaml under /etc/openstack.

Fixes #550
2018-11-13 11:37:35 +01:00
Flavio Percoco
f0bb560f92 vendor: gophercloud/utils
gophercloud and gophercloud utils are a library and a set of utilities
that provide common functionality to interact with OpenStack clouds,
configurations, etc.

We need these libraries to manage OpenStack configs as it's done
upstream and for future work like adding an OpenStack destroyer.
2018-11-13 11:37:35 +01:00
staebler
91ccc25aa7 asset: use temporary directory for asset store in tests
The asset store tests that call Fetch create residual state files (and could also use
any state files left over from previous tests). This change uses a temporary directory
for each test run so that the environment is clean before and after the tests.
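
The pattern can be sketched in Go as below. This is an assumption-laden illustration rather than the test code from this commit: `withTempDir` and `dirExists` are hypothetical helpers, the state file name is made up, and `os.MkdirTemp` is the current standard-library call rather than whatever the 2018 tests used.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// withTempDir creates a fresh directory, passes it to fn, and removes the
// directory afterwards so no residual files leak into later runs.
func withTempDir(fn func(dir string) error) error {
	dir, err := os.MkdirTemp("", "asset-store-test-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	return fn(dir)
}

// dirExists reports whether path exists and is a directory.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

func main() {
	var used string
	err := withTempDir(func(dir string) error {
		used = dir
		// Hypothetical state file written by the code under test.
		return os.WriteFile(filepath.Join(dir, "state.json"), []byte("{}"), 0o600)
	})
	fmt.Println("error:", err)
	fmt.Println("directory left behind:", dirExists(used))
}
```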
2018-11-12 22:23:22 -05:00