Migration of such VMs will be rejected by libvirt, so it doesn't make
sense to start the migration machinery for them. It's also good to
avoid resuming such VMs during migration preparation if possible;
resuming may fail and prevent the VM from resuming later, or it may
reveal unhandled races or corner cases.
Migration of such VMs is already prevented in Engine, but let's add an
additional check to Vdsm to handle races, similarly to the
_not_migrating API guard.
This patch doesn't handle the case when the VM gets paused and resumed
while the migration process is already running. That will be addressed
in another patch.
Change-Id: Iec4e343f6c3cf39f36b339987d27c9c32b40c0a4
Bug-Url: https://bugzilla.redhat.com/2010478
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
This patch adds a Timer class to the concurrent module.
The class is based on and behaves pretty much the same
as the threading.Timer class, except that the thread
which carries out the target function is created with
concurrent.thread instead of the regular
threading.Thread. This makes the Timer class available
for use in vdsm, while still ensuring every thread is
created with concurrent.thread.
Unlike threading.Timer, this Timer doesn't inherit from the
threading.Thread class; instead it keeps the thread object as an
attribute.
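A minimal sketch of the idea, not the actual vdsm code, assuming
concurrent.thread(target, name=...) returns a thread object that has
not been started yet:

    import threading

    from vdsm.common import concurrent  # assumed module path


    class Timer:
        """Run a function after a given delay, like threading.Timer."""

        def __init__(self, interval, function, args=(), kwargs=None):
            self._interval = interval
            self._function = function
            self._args = args
            self._kwargs = kwargs or {}
            self._finished = threading.Event()
            # Keep the thread as an attribute instead of inheriting
            # from threading.Thread.
            self._thread = concurrent.thread(self._run, name="timer")

        def start(self):
            self._thread.start()

        def cancel(self):
            # Prevent the function from running if it did not run yet.
            self._finished.set()

        def _run(self):
            if not self._finished.wait(self._interval):
                self._function(*self._args, **self._kwargs)
            self._finished.set()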
Change-Id: I28f7f0a7f254088129964bc7d30e5fae846eb3fb
Signed-off-by: Filip Januska <fjanuska@redhat.com>
When using a block based scratch disk, keep the scratch disk info in
the Drive object during the backup.
Currently we have only the index, which can be used to register block
threshold events. We need to modify Engine to also send the scratch
disk domain, image, and volume ids so we can also extend them.
Change-Id: I7df8312386da2bc628efc7bf2fba6669c59f81d0
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Convert storageserver_test to pytest. Also remove __future__ imports,
which are not needed any more.
Change-Id: Ibf26c7a6af8e0ce047fdf9b51e6a1b25537bce69
Signed-off-by: Vojtěch Juránek <vjuranek@redhat.com>
Instead of using the parent class for connecting prepared connections,
return the child class from the prepare connection function and use it
for connecting. This finally delegates the connection to the sub-class;
after this change we can override the connect function in a sub-class
and implement parallel connection for that sub-class.
Mixing different types of connections is not supported, so we don't
have to care about this case.
Change-Id: Iee4d531496a3b1754db42941d8e0c9364d64dbdf
Bug-Url: https://bugzilla.redhat.com/1787192
Signed-off-by: Vojtěch Juránek <vjuranek@redhat.com>
Create a parent class for all storage server connections and move the
bulk connect and disconnect functionality there. This will allow child
classes to override these methods and implement a different way to
create multiple connections, e.g. creating them in parallel.
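An illustrative sketch of the intended structure, with hypothetical
class and method names (the real classes live in storageServer.py):

    from concurrent.futures import ThreadPoolExecutor


    class Connection:
        """Base class for storage server connections."""

        def connect(self):
            raise NotImplementedError

        @classmethod
        def connect_all(cls, connections):
            # Default bulk connect: one connection after another.
            return [con.connect() for con in connections]


    class NFSConnection(Connection):

        def connect(self):
            # The real implementation mounts the NFS export here.
            return 0

        @classmethod
        def connect_all(cls, connections):
            # A child class may override the bulk connect, for example
            # to connect in parallel.
            with ThreadPoolExecutor(max_workers=10) as executor:
                return list(executor.map(lambda con: con.connect(), connections))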
Change-Id: I11a2d9fee45b4251973cb6b8703c0d83b3b7f441
Bug-Url: https://bugzilla.redhat.com/1787192
Signed-off-by: Vojtěch Juránek <vjuranek@redhat.com>
Fix wrong test added in commit ceb07387f2
storage: move connectStorageOverIser() function into IscsiConnection
This is probably fixed in a pending patch. GitHub CI runs only the top
patch of the series, but we merge the earlier patches.
Change-Id: Ied480b4d808044d4cf0fb42c4adeda02469a5a44
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Encapsulating connectStorageOverIser() into the IscsiConnection object
allows us to simplify this method a little bit and also unifies the
connection of all storage servers. Now, all storage servers are
connected by calling the connect() method on the respective object,
without any need to call a pre-connect method specific to a given
storage server.
To check if initiatorName was passed in as a connection parameter from
the engine (which shouldn't be possible anyway as it's not covered in
IscsiConnectionParameters [1]) we need to create the iface object of
the connection. However, the implementation of iscsi.IscsiInterface
seems to be buggy and throws KeyError when initiatorName is not set,
so we need to check for KeyError when accessing it. This deserves a
fix and a whole module revision, but that is a different task than
BZ #1787192 and can be done later.
[1] https://github.com/oVirt/vdsm/blob/v4.50.0/lib/vdsm/api/vdsm-api.yml#L379
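A rough illustration of the workaround; the helper name and the
attribute access are hypothetical:

    def _has_initiator_name(iface):
        try:
            return iface.initiatorName is not None
        except KeyError:
            # iscsi.IscsiInterface raises KeyError when initiatorName
            # was never set; treat that as "not provided".
            return False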
Change-Id: I1549393cc75faf2d195d62c13f89491f07301d14
Bug-Url: https://bugzilla.redhat.com/1787192
Signed-off-by: Vojtěch Juránek <vjuranek@redhat.com>
When adding more CPUs to VMs with a CPU policy, Engine has to dedicate
the new CPUs to match the policy. For this, the API has to be extended
to allow passing the CPU sets for new VMs. Engine will decide whether
or not it can pass the argument based on the cluster version. For older
cluster versions this argument does not have to be specified even on
new VDSM, because VMs with a CPU policy cannot be used there.
To make the API more flexible, we expect that Engine passes the
configuration for all CPUs and not just the new ones. This potentially
allows Engine to also relocate the already assigned CPUs of the VM to
optimize the use of resources on the host. When decreasing the number
of CPUs there is nothing special to do for VMs with dedicated CPUs.
In all cases the shared CPU pool has to be updated and VMs without
dedicated CPUs reconfigured.
Change-Id: I3f05f4ae71101513df0c34f17245fdad290f9e20
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Bug-Url: https://bugzilla.redhat.com/1782077
When migrating the VM we need to remove any CPU pinning that was
defined by VDSM. The reason is that the pinned CPUs may serve a
different purpose on the destination host (they may be dedicated to
another VM) or may not be present on the destination at all.
For VMs with no policy the pinning is simply dropped and will be filled
in on the destination. For VMs with manual CPU pinning or VMs that use
NUMA auto-pinning this affects only vCPUs that don't have any pinning
defined by the user and are using the shared CPU pool. Such
configuration is also removed and will be filled in again on the
destination.
For VMs with a policy we expect Engine to pass a new pinning
configuration to the source VDSM. The information is passed in the
"cpusets" parameter in the form of a list. Each item of the list
corresponds to a vCPU and contains a string with a cpuset definition.
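For illustration only; the actual values depend on what Engine computes
for the destination host:

    # One item per vCPU; each item is a cpuset string.
    cpusets = [
        "2",      # vCPU 0 pinned to pCPU 2
        "3",      # vCPU 1 pinned to pCPU 3
        "8-11",   # vCPU 2 may run on pCPUs 8-11
        "8-11",   # vCPU 3 may run on pCPUs 8-11
    ]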
Change-Id: I3de2d50e8ab26a8728beb662339fdbecb8aacf74
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Bug-Url: https://bugzilla.redhat.com/1782077
A new function for running commands, run_command(), will replace
LVMCache.cmd().
The new function raises on failure (rc != 0), so there is no need to
check the rc all over the place in tests.
We also need to use pytest.raises now when a command failure is
expected.
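A sketch of the new test pattern; the import paths and the exact
run_command() signature are assumptions:

    import pytest

    from vdsm.storage import lvm


    def test_failing_command_raises():
        # run_command() raises on rc != 0, so there is no need to
        # inspect rc, out and err in the test.
        with pytest.raises(lvm.LVMCommandError):
            lvm.run_command(["vgs", "no-such-vg"])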
Signed-off-by: Roman Bednar <rbednar@redhat.com>
Bug-Url: https://bugzilla.redhat.com/1536880
Change-Id: I4540270b32ecef5126f776baf9615e1b6c9ede4d
Modify the changeVGTags() flow to use the new run_command(), which
raises LVMCommandError.
The exceptions raised in this flow now inherit from LVMCommandError and
provide more details for better debugging.
Replace VolumeGroupReplaceTagError with ValueError where appropriate,
the same as was already done in changeLVsTags() here:
https://gerrit.ovirt.org/c/vdsm/+/116780/4/lib/vdsm/storage/lvm.py#1836
Signed-off-by: Roman Bednar <rbednar@redhat.com>
Bug-Url: https://bugzilla.redhat.com/1536880
Change-Id: I536196c037f5cbe6565f32fd6c65b1ad51c614ce
lvmetad was removed in RHEL 8, but we could not remove the code
disabling it while we still supported Fedora 30. Remove the now
useless code.
Change-Id: I6d94e3621e5791abd45d537db86c9afd7cf76309
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Before VM CPU stats are available, Vdsm reports zero initial values
for them. ovirt-hosted-engine-ha relies on those stats when handling
the Engine VM. The initial fake VM stats may confuse Engine VM
monitoring and induce undesirable actions such as restarting the VM on
another host without a good reason.
There is no good way to distinguish the initial fake CPU stats from
real CPU stats on the Engine side. Let's add a new flag, cpuActual,
distinguishing the two cases. It is set to true when all the CPU
stats are based on actual measured values.
It would be better to simply omit the initial fake CPU stats. But we
must keep them for compatibility with Engine 4.2, which expects their
presence.
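An illustrative excerpt of the reported VM stats; only cpuActual is the
new field, the other keys follow the existing layout approximately:

    {
        "cpuUser": "0.00",
        "cpuSys": "0.00",
        "cpuUsage": "0",
        "cpuActual": False,  # initial fake values, not measured yet
        ...
    }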
Change-Id: I5adb1b01653b0029a30949ecb89219fde794dfd8
Bug-Url: https://bugzilla.redhat.com/2026263
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Creating a loop device with the --sector-size option may fail randomly
if the device has dirty pages from previous usage. This was fixed in
losetup from util-linux 2.37.1, but this version is not available in
Centos Stream 8.
Fixed by adding a retry loop, similar to the retry loop used internally
in losetup.
Here is an example of the random failure, fixed on the first retry:
[userstorage] WARNING Attempt 1/20 failed: losetup: /dev/loop5: set
logical block size failed: Resource temporarily unavailable
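A sketch of the retry approach; the helper name, options, and retry
counts are illustrative, not the actual code:

    import subprocess
    import time


    def create_loop_device(backing_file, sector_size, retries=20, delay=1.0):
        for attempt in range(1, retries + 1):
            try:
                return subprocess.run(
                    ["losetup", "--find", "--show",
                     "--sector-size", str(sector_size), backing_file],
                    check=True, capture_output=True, text=True,
                ).stdout.strip()
            except subprocess.CalledProcessError:
                if attempt == retries:
                    raise
                # losetup may fail while the kernel still flushes dirty
                # pages from previous device usage; wait and retry.
                time.sleep(delay)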
Change-Id: I285dedd09abd89e62152b887a3b05807c627041e
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Create and then remove a bond to load the bonding module.
Since we are already running inside a container we cannot simply use
"modprobe bonding". Fortunately, creating a bond through iproute2 loads
the module for us.
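The equivalent steps, sketched with subprocess; the bond name is
arbitrary and the real change lives in the CI setup:

    import subprocess

    # Creating a bond via iproute2 loads the bonding module as a side
    # effect; the bond itself is not needed, so remove it right away.
    subprocess.run(["ip", "link", "add", "bond-ci", "type", "bond"], check=True)
    subprocess.run(["ip", "link", "del", "bond-ci"], check=True)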
Change-Id: I0d894b6914a692fbb9ddda35543388926f9e66f4
Signed-off-by: Ales Musil <amusil@redhat.com>
IPv6 is disabled in docker containers on GH actions.
Enable IPv6 and remove skips for working tests.
Change-Id: I6cdc6a5b66e233a26723252e4cdd054415da4c02
Signed-off-by: Ales Musil <amusil@redhat.com>
Modify the removeLVs() flow to use the new run_command(), which raises
LVMCommandError.
The exceptions raised in this flow now inherit from LVMCommandError and
provide more details for better debugging.
The original error (CannotRemoveLogicalVolume) is used as a wrapper for
other errors, so we cannot change it to inherit from LVMCommandError,
which is what we need in removeLVs(). In this case we can add a new
exception for this flow - LogicalVolumeRemoveError.
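A simplified sketch of the resulting error handling, with stand-in
definitions rather than the actual vdsm code; run_command is passed in
only to keep the sketch self-contained:

    class LVMCommandError(Exception):
        """Stand-in for the real error carrying rc, out and err."""


    class LogicalVolumeRemoveError(LVMCommandError):
        """Flow specific error raised when removing LVs fails."""


    def removeLVs(vg_name, lv_names, run_command):
        cmd = ["lvremove", "-f"] + [
            "{}/{}".format(vg_name, lv) for lv in lv_names]
        try:
            run_command(cmd)
        except LVMCommandError as e:
            # Wrap the command failure in the flow specific error.
            raise LogicalVolumeRemoveError(str(e)) from e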
Signed-off-by: Roman Bednar <rbednar@redhat.com>
Bug-Url: https://bugzilla.redhat.com/1536880
Change-Id: Ibca33d32dbdddc5cfff914a85807f5c688670592
The HSM module is huge and contains various functions related only to
some other module, e.g. helpers for connecting to storage servers.
Move these functions partially into the storageServer module, and those
related only to iSCSI connections into the iscsi module.
Change-Id: Ie139f3d0d99fc01ab51d8fa95f9ce6637beba326
Bug-Url: https://bugzilla.redhat.com/1787192
Signed-off-by: Vojtěch Juránek <vjuranek@redhat.com>
When the device is a multipath device, the udev link points to the
actual path (/dev/sda) during early boot. The link is updated to point
to the multipath device only later. This creates a race during early
boot when lvm can grab the device before multipath.
Since lvm2-2.03.14-1.el8.x86_64 (Centos Stream 8) oVirt node boot breaks
when using "stable" udev links. According to David Teigland:
Using a link with the PVID may have sort of worked in the past, but
it probably should not have. I'd call it accidental, it's depending
on a quirk of udev processing, not anything in lvm itself.
The change in lvm may be reverted, but I think we should get rid of the
udev "stable" links anyway.
This change reverts commit db13e4bc58
lvmfilter: Use /dev/disk/by-id/lvm-pv-uuid devlinks for pv naming
but it is not possible to simply revert the commit since additional
code was added based on that commit. Instead we changed the behavior:
1. When computing a filter, use the device names reported by lvm.
This fixes the attached bug. When lvm reports /dev/mapper/xxx this is
the device that will be used in the filter.
2. When analyzing an existing filter, recommend replacing a filter
including "stable" udev links with a filter including the device names.
The original bug[1] will be solved in 4.5 by using the new lvm devices
feature[2].
[1] https://bugzilla.redhat.com/1635614
[2] https://bugzilla.redhat.com/2012830
Change-Id: I538e4a078dfba2ba28408f6e2178ca5082ed808b
Bug-Url: https://bugzilla.redhat.com/2016173
Bug-Url: https://bugzilla.redhat.com/2026370
Related-to: https://bugzilla.redhat.com/2026640
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
We used to call qemu-ga directly with the guest-get-disks command to
get the names and serial numbers of all disks in the VM. We can now use
the API provided by libvirt (since 7.3.0).
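A rough sketch of the libvirt-based approach; the exact keys in the
returned mapping should be checked against the libvirt documentation
for virDomainGetGuestInfo():

    import libvirt


    def guest_disks(dom):
        # Returns a flat mapping such as {"disk.count": 2,
        # "disk.0.name": "/dev/vda", "disk.0.serial": "...", ...}
        return dom.guestInfo(libvirt.VIR_DOMAIN_GUEST_INFO_DISKS, 0)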
Change-Id: I938c049af930682c37db60956b16330424bb546e
Bug-Url: https://bugzilla.redhat.com/1919857
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
NUMA code is extended to build a list of CPUs on a core for each core
available. This is then used to build a list of all CPUs in a shared
pool -- i.e. CPUs available to all VMs without any specific CPU policy.
The list is built in a way that:
- CPUs of VMs with manual CPU pinning or NUMA auto-pinning policy are
included in the shared pool
- CPUs of VMs with dedicated policy are excluded from the shared pool
- CPUs of VMs with isolate-threads or siblings policy are excluded from
the shared pool, and all their siblings as well, so that whole cores
are removed from the shared pool and left exclusive to the particular
VM
The updates need to be exclusive and cannot run concurrently, otherwise
the assigned CPU sets may be wrong. Two racing VM.destroy calls are
problematic because we could fail to increase the shared pool, so the
configured CPU set would be smaller than it could be for some VMs.
Racing VM.destroy and VM.create is much more problematic as it can lead
to situations where a dedicated CPU would be used by other VMs.
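A simplified sketch of the computation described above; the data
structures are illustrative, not the actual vdsm internals:

    def shared_pool(all_cpus, core_siblings, vms):
        """
        all_cpus: set of all online pCPU ids.
        core_siblings: mapping of each pCPU id to the set of CPUs on
            the same core (including itself).
        vms: iterable of (policy, pinned_cpus) pairs.
        """
        pool = set(all_cpus)
        for policy, pinned in vms:
            if policy in ("none", "pin"):
                # Manual pinning / NUMA auto-pinning stays in the pool.
                continue
            elif policy == "dedicated":
                pool -= pinned
            elif policy in ("siblings", "isolate-threads"):
                # Remove whole cores, leaving them exclusive to the VM.
                for cpu in pinned:
                    pool -= core_siblings[cpu]
        return pool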
Change-Id: Ife3797cda4419ecd153a136dea9fe35663f07f18
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Bug-Url: https://bugzilla.redhat.com/1782077
The _numa() call is cached (memoized) based on the arguments used when
calling the function. The _numa() function optionally took libvirt
capabilities as an argument. This argument, however, was never used
except in tests, meaning that _numa() was normally called without
arguments, in which case libvirt capabilities were evaluated only the
first time the function was called (by the _numa() function itself).
A recent patch changed how we treat _numa() in the Host.getCapabilities
API call. We fetch fresh libvirt capabilities and pass them as an
argument to _numa(). This causes two problems. First, it now causes a
small leak because the cache is allowed to grow indefinitely. Secondly,
the results of the re-evaluated _numa() call are not available to the
rest of the VDSM code that calls _numa() without arguments (e.g. in
sampling.py for every sample).
The first problem could be easily solved by changing the memoizing
decorator to functools.lru_cache(maxsize=N). But to also solve the
second problem, the caching cannot be done based on arguments. We need
something special, because we don't want to fetch the libvirt
capabilities every time we need to call _numa(), and we also don't want
to re-evaluate _numa() on every call.
This patch removes the @cache.memoized decorator and creates caching in
numa.py. There is a separate update call that fetches fresh libvirt
capabilities. The capabilities are examined only when they change.
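A minimal sketch of the caching approach; the real code in numa.py is
more involved, and _parse_topology() is only a placeholder:

    _last_capabilities = None
    _topology = None


    def _parse_topology(capabilities_xml):
        # Placeholder for the real parsing of the capabilities XML.
        return capabilities_xml


    def update(capabilities_xml):
        # Called with freshly fetched libvirt capabilities, e.g. from
        # Host.getCapabilities; re-parse only when they changed.
        global _last_capabilities, _topology
        if capabilities_xml != _last_capabilities:
            _last_capabilities = capabilities_xml
            _topology = _parse_topology(capabilities_xml)


    def topology():
        # Callers without fresh capabilities (e.g. sampling) get the
        # last cached result without any libvirt call.
        return _topology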
Change-Id: I9f7bc5596fb3b3c25dcf585e3d3a07a3a1a86a9e
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
This is the first patch of the CPU policies series. We should have the
CPU policy in the VM metadata. If it is not there, it has to be
detected and stored there so that we know it on recovery. Otherwise,
when we start managing the vCPUs and add pinning, it may be falsely
mistaken for manual CPU pinning. For the same reason we also store
which vCPUs are manually pinned. Later, the vCPUs without pinning will
be using pCPUs from the shared pool, and we need to remember which
vCPUs were pinned by the user and which have pinning defined by VDSM.
The defined policy names are:
- none: no policy defined, CPUs from the shared pool will be used
- pin: manual CPU pinning or NUMA auto-pinning policy
- dedicated: each vCPU is pinned to a single pCPU that cannot be used
by any other VM
- siblings: like dedicated, but the physical cores used by the VM are
blocked from use by other VMs
- isolate-threads: like siblings, but only one vCPU can be assigned to
each physical core
The related feature page for the policies is:
https://ovirt.org/develop/release-management/features/virt/dedicated-cpu.html
Change-Id: I7bc7bad82a20d47d06135c82fa58572a2327badd
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Bug-Url: https://bugzilla.redhat.com/1782077
The element value does not have to correspond to the real vCPU count.
It is the maximum number of vCPUs a VM can have, but the actual count
may be lower to make CPU hot-plugging possible. The actual count is
specified in the (optional) attribute "current". Only if the attribute
is not present is the element value the real count.
The original implementation returned None in case there was no <vcpu>
element present. But a missing <vcpu> suggests a broken domain XML, so
now we raise an exception instead.
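A sketch of the parsing rule described above:

    import xml.etree.ElementTree as etree


    def vcpu_count(dom_xml):
        vcpu = etree.fromstring(dom_xml).find("./vcpu")
        if vcpu is None:
            # Missing <vcpu> suggests a broken domain XML.
            raise RuntimeError("Domain XML has no <vcpu> element")
        # "current" is the actual vCPU count; the element value is only
        # the maximum, used when "current" is not present.
        return int(vcpu.get("current", vcpu.text))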
Change-Id: I67835dc2b05cb11cb7c0fcb6d0ce1c802f968c28
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Read-only mode has not been useful since RHEL 8. Remove tests for
read-only mode and changes to read-only mode during tests.
Change-Id: I368cbc3285cdcfb2965ea411528b107c53946262
Bug-Url: https://bugzilla.redhat.com/2025527
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This option is deprecated and useless since RHEL 8. We could not remove
it in the past since we had to support older versions of lvm on Fedora.
It looks like this option became harmful in lvm2-2.03.14-1.el8.x86_64,
which converts locking_type=4 to --readonly. We see this failure in the
vdsm tests:
WARNING: locking_type (4) is deprecated, using --sysinit --readonly.
Operation prohibited while --readonly is set.
Can\'t get lock for b4512b9d-84dc-43ba-865d-32c4a1cd148a.
and the lvm command fails.
Converting locking_type=4 to --readonly does not look correct, so this
is likely a regression in lvm. But we should not use locking_type in
vdsm. This option is used only in tests using read only mode, which is
never used in the real application.
As a quick fix to unbreak the tests, remove the locking_type
configuration. We need to remove the tests for read only mode and remove
the entire read only mode feature later.
Change-Id: Ia9af81756c07c26517805633af5d90d523e60fe7
Bug-Url: https://bugzilla.redhat.com/2025527
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Replace the call to blockInfo() with DriveMonitor.get_block_stats(),
which returns block info for all volumes. To extract the block stats
for the drive's active volume we need the index of the drive.
We get block stats only if we have drives that should be extended, or
when we pre-extend a drive when starting replication.
getExtendInfo() was modified to amend block info from libvirt with
information from the replica in case the drive is not chunked but
replicating to a chunked replica. This method should be removed once we
start using libvirt block stats for the replica drive.
Change-Id: I5600bedf886be993233df67dcad39078e7c920c8
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Since we stopped calling getExtendInfo() during live merge, we don't
need to mock blockInfo().
Change-Id: Ia9f78d37b8eb4673a9bf0f7ef09fa4ae70a15a18
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
When extending the base drive before live merge we used
Vm.getExtendInfo() to get the current block info. This is wrong for
several reasons:
- The function returns block info for the drive's active volume,
instead of the base volume for the merge, which is never the active
volume.
- The function replaces Drive.blockinfo with the new block info, which
is an unwanted side effect when we try to extend the base volume.
- It calls libvirt for no reason.
Since we need the volume capacity, add the capacity to the job.extend
dict when starting a merge, so we don't need to get it from the storage
API on each call to _start_extend().
Change-Id: Ic4b6dcccab4336e72a000b511d0e10c003eac9c5
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
When extending volumes, we use libvirt.virDomain.blockInfo() to get
the volume allocation. This API is easy to use, but it does not work
for backup scratch disks, volumes in the backing chain, or the
blockCopy destination volume.
We want to replace usage of libvirt.virDomain.blockInfo() with
libvirt.virConnect.domainListGetStats(), which works for all block
nodes, including backup scratch disks [1] (since RHEL 8.6).
Vdsm already collects block stats for sampling purposes, but we cannot
use this code:
- We need the allocation info at the time of the call, and sampling
collects values only every 15 seconds.
- Sampling does not collect info for the backing chain (and should not)
but we must collect info for the backing chain.
- Sampling collects all stats while we need only block stats.
- Sampling collects stats for all VMs while we need only a single VM.
- Sampling skips non-responsive VMs, while we don't skip blockInfo()
calls.
- Drive monitoring cannot depend on sampling, a subsystem with different
requirements (best effort) and maintained by different teams.
Using libvirt.virConnect.domainListGetStats() is tricky, since it
requires a libvirt.virDomain object as a parameter, and this object is
wrapped by the VM._dom object. Since the VM owns the _dom object, it is
natural to provide a method to get block stats in the VM object:
VM.get_block_stats().
Another problem with libvirt.virConnect.domainListGetStats() is the
unhelpful return value, a flat mapping of "block.N.KEY" to VALUE for
all block nodes:
{
    ...
    "block.0.fl.times": 0,
    "block.1.name": "sda",
    "block.1.path": "/rhev/.../44d498a1-54a5-4371-8eda-02d839d7c840",
    "block.1.backingIndex": 2,
    "block.1.rd.reqs": 13448,
    "block.1.rd.bytes": 415614976,
    "block.1.rd.times": 9940902315,
    "block.1.wr.reqs": 4909,
    "block.1.wr.bytes": 82999296,
    "block.1.wr.times": 47469574949,
    "block.1.fl.reqs": 683,
    "block.1.fl.times": 4204366339,
    "block.1.allocation": 216006656,
    "block.1.capacity": 6442450944,
    "block.1.physical": 7113539584,
    "block.1.threshold": 6576668672,
    "block.2.name": "sda",
    ...
}
There is a lot of unneeded information for the monitoring context, and
no easy way to extract the single value we actually need. A new method,
DriveMonitor.get_block_stats(), was added, extracting this info in a
usable form:
{
    2: drivemonitor.BlockInfo(
        index=2,
        name='sda',
        path='/rhev/.../44d498a1-54a5-4371-8eda-02d839d7c840',
        allocation=216006656,
        capacity=6442450944,
        physical=7113539584,
        threshold=6576668672,
    ),
    ...
}
We will use this in extend flows to fetch block info for all volumes
when we try to extend drives.
[1] https://bugzilla.redhat.com/2017928
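A sketch of how the flat mapping can be converted into the form above;
the real BlockInfo and conversion live in drivemonitor.py, and the
field handling here is simplified:

    from collections import namedtuple

    BlockInfo = namedtuple(
        "BlockInfo",
        "index name path allocation capacity physical threshold")


    def block_stats_by_index(bulk_stats):
        result = {}
        for i in range(bulk_stats.get("block.count", 0)):
            # Use backingIndex when reported, otherwise the position.
            index = bulk_stats.get(f"block.{i}.backingIndex", i)
            result[index] = BlockInfo(
                index=index,
                name=bulk_stats.get(f"block.{i}.name"),
                path=bulk_stats.get(f"block.{i}.path"),
                allocation=bulk_stats.get(f"block.{i}.allocation", 0),
                capacity=bulk_stats.get(f"block.{i}.capacity", 0),
                physical=bulk_stats.get(f"block.{i}.physical", 0),
                threshold=bulk_stats.get(f"block.{i}.threshold", 0),
            )
        return result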
Change-Id: I8cdaaaf56c9f1e078809e4400a86219fc8086c41
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Some code needs access to the underlying libvirt.virDomain object.
Make a public accessor so we don't access private attributes.
The underlying virDomain is needed when calling
libvirt.virConnect.domainListGetStats(), which requires a list of
virDomain objects. It should not be used for anything else.
Change-Id: Iff39fb25d218d73d3ee8493a5c052a55d3270013
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
FakeVm was using FakeLogger, which hides all the logs emitted via the
vm logger. The fake logger class should be used only for testing
logging; when running tests we want to see real logs when a test fails.
Change-Id: I69fbb822fd1e33a4a53bf5060afe860459c199de
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
When we start a backup, we need to parse the libvirt backup xml to get
the index of the backup scratch disk. This requires that we have a
backup xml for every test running backup.start_backup(). We have 22
invocations, so setting this manually is not the way to go.
FakeDomainAdapter now generates the backup xml from the backup_xml
argument to backupBegin(). Since we always have a backup xml, we can
verify that the backup was started correctly by comparing the backup
xml instead of keeping and comparing the input xml.
A similar change is needed for verifying the checkpoint xml.
Change-Id: I64a5202b8f1ef3cad0092c6949c0a32c65fb9f10
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
When generating the backup input xml, use transientdisk.disk_path().
The cleanest way to do this is to use temporary variables and include
them in the xml using an f-string (introduced in Python 3.6).
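A sketch of the pattern; the element names, the import path, and the
transientdisk.disk_path() arguments are illustrative:

    from vdsm.storage import transientdisk  # assumed import path

    scratch_disk = transientdisk.disk_path("backup-owner", "scratch-sda")

    backup_xml = f"""
        <domainbackup mode='pull'>
            <disks>
                <disk name='sda' type='file'>
                    <scratch file='{scratch_disk}'/>
                </disk>
            </disks>
        </domainbackup>
        """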
Change-Id: Id7e15e59be1c9208fab5e00c5c82d6afce6cf544
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
DOMAIN_ID was used only once, when generating drives info.
VOLUME_ID was used only once, to create drives with the same volume id,
which is an invalid configuration.
We now create a new uuid instead of using a global.
Change-Id: I3171ae0e3f125ff361b458e621a1780c9b89d4ed
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Pass the fake vm to all the helper functions that need to access the
drive list. This will make it possible to create different drive
configurations, for example block based drives that need to be extended
during backup.
Now that the fake vm is available in the helper functions, we can use
its id attribute instead of duplicating it.
Change-Id: I5c67319b0f27a0ae74c586e3b90dff565e843d85
Bug-Url: https://bugzilla.redhat.com/1913387
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
We have been using the vm_kill_paused_time config option to specify
the time after which we can kill VMs with the "kill" resume behavior
that are paused due to an I/O error. If a user modifies the sanlock
timeout settings, the option must be adjusted accordingly.
The sanlock I/O timeout is now configurable using the sanlock.io_timeout
option. The VM killing timeout can be computed directly from it by
multiplying it by 8, and should no longer be taken from a different
option. This patch removes the vm_kill_paused_time option and computes
the corresponding value from io_timeout.
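The relation described above, as a sketch; the config accessor follows
the usual vdsm pattern and should be treated as an assumption:

    from vdsm.config import config


    def vm_kill_paused_timeout():
        # sanlock gives up the lease after roughly 8 * io_timeout
        # seconds, so only then is it safe to kill paused VMs with the
        # "kill" resume behavior.
        return 8 * config.getint("sanlock", "io_timeout")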
Change-Id: Icaa097008544c280da0f6122f0bb378cc14b873c
Bug-Url: https://bugzilla.redhat.com/2010205
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
The DHCP monitoring was recently switched to use netlink monitoring,
which makes the whole process simpler. However, there was a major
oversight. The DHCP monitor runs in an unprivileged context (vdsmd),
which brings two issues:
1) We cannot set up the source route rules, because that needs to be
done in the root context (supervdsmd).
2) The pool of monitored items is operated by supervdsmd, and vdsmd
couldn't see the available items. That resulted in skipping every
valid opportunity to notify Engine about a new IP and to create the
source route rules.
To fix that, the dhcp monitor is kept in vdsmd, but the monitoring
check and removal are delegated to supervdsmd. The same goes for the
setup of the source route rules.
Change-Id: I0481d5badfe2929a112fb47a945cbe7395341a71
Signed-off-by: Ales Musil <amusil@redhat.com>
During vdsm shutdown, we must keep storage alive, since VMs or image
transfers may still use the storage domain. We had a special check for
the host id, keeping it alive during shutdown, but we were missing a
similar check for teardown.
In the past StorageDomain.teardown() was not effective, but during 4.4
we fixed it several times, and now it really tears down the storage
domain: shutting down vdsm deactivates entire volume groups and removes
the device mapper devices for the logical volumes.
When logical volumes are used, we see these errors during shutdown:
2021-11-14 14:00:03,911+0200 INFO (monitor/313e6d7) [storage.blocksd]
Tearing down domain 313e6d78-80f7-41ab-883b-d1bddf77a5da (blockSD:996)
2021-11-14 14:00:03,911+0200 DEBUG (monitor/313e6d7) [common.commands]
/usr/bin/taskset --cpu-list 0-1 /usr/bin/sudo -n /usr/sbin/lvm vgchange
--config 'devices ... --available n 313e6d78-80f7-41ab-883b-d1bddf77a5da
(cwd None) (commands:154)
2021-11-14 14:00:09,114+0200 DEBUG (monitor/313e6d7) [common.commands]
FAILED: <err> = b' Logical volume 313e6d78-80f7-41ab-883b-d1bddf77a5da/ids
in use.\n Can\'t deactivate volume group "313e6d78-80f7-41ab-883b-d1bddf77a5da"
with 1 open logical volume(s)\n'; <rc> = 5 (commands:186)
If we have logical volumes in use, tearing down the storage domain will
leave them active, so running VMs and active image transfers are safe.
However, failed LVM commands are retried several times, which slows
down the shutdown process, and shutting down is likely to time out.
I think this may be related to the hosted engine local maintenance
issue.
Change-Id: Ic2a2d219868d869eb946047f6cdafeffc17704fb
Bug-Url: https://bugzilla.redhat.com/2023344
Related-to: https://bugzilla.redhat.com/1986732
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
It seems that OST is broken - it merged a patch without CI+1, and now
CI fails on master.
Change-Id: I7f50936e51fbf49ff942007f615b957041e63533
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
During live merge we call syncVolumeChain multiple times to make sure
the actual/libvirt chain is synced to the current/vdsm chain.
When the pivot starts, the new requested chain is passed to
syncVolumeChain, which compares it to the current vdsm chain. This way
we can tell which volume is being removed.
If the volume being removed is a leaf/active layer, it is flagged as
ILLEGAL in vdsm to prevent usage.
Then the libvirt block job abort is started, and if it fails the old
code never recovered the volume from the ILLEGAL state, so manual
intervention was required.
This patch adds a helper to switch the leaf volume back to LEGAL and
calls the helper if libvirt fails to abort the block job.
Signed-off-by: Roman Bednar <rbednar@redhat.com>
Bug-Url: https://bugzilla.redhat.com/1949475
Change-Id: Ia57c26529cf60d381d3df143a4f7195948ce1cea
Sync volume chain needs to be able to recover the top volume legality.
This sync function can be used after libvirt fails the pivot, to make
sure the top volume in the vdsm chain is not left in the ILLEGAL state.
Signed-off-by: Roman Bednar <rbednar@redhat.com>
Bug-Url: https://bugzilla.redhat.com/1949475
Change-Id: I3b3fd83fa0fd9fa90ac9b330f4454b2916eee4c8