The block hosting volume continues to exist even after all of its
blocks are deleted, and deleting it manually requires an extra
unmounting step. With this patch the block hosting volume is deleted
automatically once it no longer hosts any blocks.
The availability check of the block hosting volume's free space is not
done within the lock, hence there is a possibility that the available
space has changed by the time we decide to create the volume. This
patch also fixes that race condition.
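A minimal sketch of the race fix (the lock, type and helper names below
are illustrative, not the actual gd2 code): the free-space check and the
reservation happen under the same lock.

    package blockvolume

    import (
        "errors"
        "sync"
    )

    // Hypothetical names for illustration only.
    var bhvLock sync.Mutex

    type blockHostingVolume struct {
        availableSize uint64
    }

    // reserveSpace checks free space and reserves it while holding the
    // same lock, so the available size cannot change between the check
    // and the decision to create the block volume.
    func reserveSpace(bhv *blockHostingVolume, reqSize uint64) error {
        bhvLock.Lock()
        defer bhvLock.Unlock()
        if bhv.availableSize < reqSize {
            return errors.New("insufficient space on block hosting volume")
        }
        bhv.availableSize -= reqSize
        return nil
    }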
Signed-off-by: Poornima G <pgurusid@redhat.com>
Downgrade lvm2 to prevent hangs while using udev.
dmeventd was never designed to be executed inside a
container, so it assumes that only a single instance of
'dmeventd' is running on the whole host system.
Signed-off-by: Kotresh HR <khiremat@redhat.com>
This is a lightweight container based on Alpine Linux that runs as a
sidecar container in a gcs cluster alongside the containers running gd2.
It is mainly used to parse Gluster-specific logs and normalize them for
Elasticsearch. Right now only gd2 logs are parsed, and only to some extent.
For this, an rsyslog configuration and a rulebase file are included.
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
The initiator adds transactions under the pending-transaction/transaction/
key. The Txn Engine keeps a watch on this key for new transactions
and executes them.
e.g. the etcd key storing the details of a single transaction:
pending-transaction/transaction/<txn-ID>
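A minimal sketch of such a watcher using the etcd clientv3 API (the prefix
comes from this commit; the import path may differ by etcd version, and the
surrounding names are illustrative, not the actual Txn Engine code):

    package txnengine

    import (
        "context"
        "log"

        "go.etcd.io/etcd/clientv3"
    )

    const txnPrefix = "pending-transaction/transaction/"

    // watchPendingTxns watches every key under the pending-transaction
    // prefix; each PUT is a new transaction added by an initiator.
    func watchPendingTxns(cli *clientv3.Client) {
        for resp := range cli.Watch(context.Background(), txnPrefix, clientv3.WithPrefix()) {
            for _, ev := range resp.Events {
                if ev.Type == clientv3.EventTypePut {
                    log.Printf("executing transaction %s", ev.Kv.Key)
                    // The Txn Engine would deserialize ev.Kv.Value and
                    // execute the transaction here.
                }
            }
        }
    }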
Signed-off-by: Oshank Kumar <okumar@redhat.com>
Decouple the tracing code from the RunStepFuncLocally() func and
add a middleware which implements transaction.StepManager
to trace step funcs. This provides better trace
management and cleaner code.
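A rough sketch of the middleware pattern (the StepManager method set shown
here is an assumption for illustration; the real gd2 interface may differ):

    package transaction

    import (
        "context"

        "go.opencensus.io/trace"
    )

    // StepManager here is a simplified, assumed method set.
    type StepManager interface {
        RunStep(ctx context.Context, stepName string) error
    }

    // tracingStepManager is a middleware that wraps another StepManager
    // and records a span around every step it runs.
    type tracingStepManager struct {
        next StepManager
    }

    func (t *tracingStepManager) RunStep(ctx context.Context, stepName string) error {
        ctx, span := trace.StartSpan(ctx, stepName)
        defer span.End()
        return t.next.RunStep(ctx, stepName)
    }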
Signed-off-by: Oshank Kumar <okumar@redhat.com>
If a peer restarts, it should check for pending
transactions and resume all of them.
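A minimal sketch of the resume-on-restart scan, again with the etcd
clientv3 API (the surrounding wiring is illustrative, not the actual gd2
code):

    package txnengine

    import (
        "context"
        "log"

        "go.etcd.io/etcd/clientv3"
    )

    // resumePendingTxns runs when a peer starts up: any keys still present
    // under the pending-transaction prefix are transactions that did not
    // finish before the restart and must be resumed.
    func resumePendingTxns(cli *clientv3.Client) error {
        resp, err := cli.Get(context.Background(),
            "pending-transaction/transaction/", clientv3.WithPrefix())
        if err != nil {
            return err
        }
        for _, kv := range resp.Kvs {
            log.Printf("resuming pending transaction %s", kv.Key)
            // The pending transaction would be deserialized from kv.Value
            // and resumed here.
        }
        return nil
    }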
Signed-off-by: Oshank Kumar <okumar@redhat.com>
Implement glustercli command to disable and delete the current tracing
configuration on the cluster. The changes include a gd2 transaction that
first deletes the trace configuration from the store on one node, and then
a subsequent step clears the in-memory trace configuration on all nodes.
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Implement glustercli command to update the current tracing status on the
cluster. All trace config options are passed as flags to the command. If
any option is not passed, the existing value for that option will be
retained.
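A sketch of how such retain-if-unset behaviour can be done with cobra flags
(illustrative only; the flag names and the struct are assumptions, not
necessarily what glustercli uses):

    package cmd

    import "github.com/spf13/cobra"

    // traceConfig is an illustrative struct, not the actual type.
    type traceConfig struct {
        JaegerEndpoint       string
        JaegerSampleFraction float64
    }

    // applyTraceFlags only overwrites a setting when its flag was
    // explicitly passed on the command line; otherwise the existing
    // value is retained.
    func applyTraceFlags(cmd *cobra.Command, cfg *traceConfig) {
        if cmd.Flags().Changed("jaeger-endpoint") {
            cfg.JaegerEndpoint, _ = cmd.Flags().GetString("jaeger-endpoint")
        }
        if cmd.Flags().Changed("jaeger-sample-fraction") {
            cfg.JaegerSampleFraction, _ = cmd.Flags().GetFloat64("jaeger-sample-fraction")
        }
    }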
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Implement glustercli command to get the current tracing status on the
cluster. The tracing info is read from the store and presented to the
user in table format with info like Status, Jaeger Endpoints, Sampler
type and sample fraction. For example:
+------------------------+----------------------------+
| TRACE OPTION | VALUE |
+------------------------+----------------------------+
| Status | enabled |
| Jaeger Endpoint | http://192.168.122.1:14268 |
| Jaeger Agent Endpoint | http://192.168.122.1:6831 |
| Jaeger Sampler | 2 (Probabilistic) |
| Jaeger Sample Fraction | 0.99 |
+------------------------+----------------------------+
Add "trace enable" e2e test cases. The tests also exercise the
"trace status" request.
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Implement glustercli command to enable tracing. The REST client performs
basic checks on the tracing options prior to sending the request.
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
If no start-up options are specified (i.e. command line or config file),
but a valid trace config exists in the store, then read the config as
part of the GD2 start-up sequence and apply it on the node that
is coming up. This way, all the GD2 nodes read the trace configuration
and apply it on themselves at start-up. This scenario applies when,
- the trace configuration was applied via glustercli (not start-up config)
  and,
- GD2 node(s) restart for some reason. If there's a valid trace config
  in the store, then it must be applied when a node comes up.
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Implement the undo step for the trace enable transaction. This
step removes the trace configuration from the store if it was
written.
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
This commit implements a gd2 plugin that allows management of tracing
operations across the cluster. This change-set implements the server-side
handling of the request to enable tracing on all gd2 nodes. The
pre-condition for executing this transaction is that there shouldn't be any
existing trace configuration in etcd. The steps
involved in the transaction are,
1. the node receiving the request validates the passed tracing configuration,
2. the node stores the tracing configuration in etcd, and,
3. the in-memory trace configuration is set on all nodes.
A failure in steps 2 or 3 results in the undo logic restoring the
previous configuration both in memory and in etcd.
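As a rough sketch of the do/undo shape described above (the types and
wiring are assumptions for illustration, not the actual gd2 transaction
framework):

    package tracemgmt

    // step pairs a do function with its undo; illustrative only.
    type step struct {
        do   func() error
        undo func() error
    }

    // runTxn executes the steps in order (validate, store in etcd, apply
    // in memory). On failure it runs the undo functions of the already
    // completed steps in reverse order, restoring the previous
    // configuration.
    func runTxn(steps []step) error {
        for i, s := range steps {
            if err := s.do(); err != nil {
                for j := i - 1; j >= 0; j-- {
                    if steps[j].undo != nil {
                        _ = steps[j].undo()
                    }
                }
                return err
            }
        }
        return nil
    }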
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
The new client-configurable options that allow trace control are:
jaeger-sampler:
---------------
This option can take any one of the following values:
- 0 (Never): Disables tracing.
- 1 (Always): Samples all traces.
- 2 (Probabilistic): A probabilistic sampler. Traces are sampled based on
  an additional option called "sampling fraction" (described below), using
  which Jaeger decides whether to trace the operation or not.
jaeger-sample-fraction:
-----------------------
If "jaeger-sampler-type" is set to 2 (probabilistic), then an option to
set the sampling fraction (or frequency) can be specified. By default this
value is set to 0.1 (i.e. sample every 1 in 10 traces). Valid values for
this option ranges from >0.0 and <1.0. Higher the value, the greater the
probability that a trace is sampled. This allows a control on the volume
of traces captured in a highly scaled environment.
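A sketch of how these two options can map onto OpenCensus samplers in Go
(illustrative; the function and variable names are not from the gd2 code):

    package tracing

    import "go.opencensus.io/trace"

    // applySampler maps the jaeger-sampler value (0, 1 or 2) and the
    // sample fraction onto an OpenCensus sampler.
    func applySampler(samplerType int, sampleFraction float64) {
        var sampler trace.Sampler
        switch samplerType {
        case 0: // Never: disable tracing.
            sampler = trace.NeverSample()
        case 1: // Always: sample every trace.
            sampler = trace.AlwaysSample()
        case 2: // Probabilistic: sample the given fraction of traces.
            sampler = trace.ProbabilitySampler(sampleFraction)
        }
        trace.ApplyConfig(trace.Config{DefaultSampler: sampler})
    }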
Other changes in this commit:
- Encapsulate Jaeger-specific trace configuration information in a separate
  structure that is instantiated only once and updated subsequently.
- Factor out the code that validates and sets trace config information into
  separate functions for other modules to reuse.
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Include the spanContext structure within the main Txn structure, which
holds all the important information about a transaction. The spanContext is
used to track the parent or root span of the transaction. This is necessary
to build the entire span tree for the transaction in question.
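A minimal sketch of the idea (the Txn fields other than the span context
are illustrative, not the actual gd2 definition):

    package transaction

    import "go.opencensus.io/trace"

    // Txn here is a simplified stand-in for the real gd2 Txn structure.
    type Txn struct {
        ID    string
        Nodes []string

        // spanContext tracks the parent/root span of the transaction so
        // that spans created for individual steps can be attached to the
        // same span tree.
        spanContext trace.SpanContext
    }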
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Remove the option to always sample traces within the ocgrpc server and
client handlers. The global sampler option provided via the CLI or config
file should take precedence.
closes #1368
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Currently, to GET/DELETE any block volume we mount all the block hosting
volumes, readdir each of them and loop through the entries to find the
block hosting volume that the block belongs to. This approach is not scalable.
Hence, in the metadata of the block hosting volume we keep a list of all
the block volumes present in that hosting volume. This way, GET/DELETE walks
through the volume metadata rather than doing the readdir.
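A sketch of the metadata-based lookup (the metadata key and the volume type
are assumptions for illustration, not necessarily what this patch uses):

    package blockvolume

    import "strings"

    // Volume is a simplified stand-in for the gd2 volume info structure,
    // which carries a free-form metadata map.
    type Volume struct {
        Name     string
        Metadata map[string]string
    }

    // findHostingVolume scans only the metadata of the block hosting
    // volumes instead of mounting and readdir-ing each one.
    func findHostingVolume(bhvs []*Volume, blockName string) *Volume {
        for _, v := range bhvs {
            // Assumed layout: a comma-separated list of block volume
            // names stored under a "block-volumes" metadata key.
            for _, b := range strings.Split(v.Metadata["block-volumes"], ",") {
                if b == blockName {
                    return v
                }
            }
        }
        return nil
    }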
Signed-off-by: Poornima G <pgurusid@redhat.com>
When the block hosting cluster options are set to values other
than the defaults, block volume creation fails on the first
attempt but succeeds on subsequent attempts. This is because the
block volume create request was initialized before the cluster
options were read. This patch swaps that order to fix the issue.
Signed-off-by: Poornima G <pgurusid@redhat.com>
Initialize the config and then logging before the main() function
executes, so that logs emitted from the init() funcs of other packages
are not sent to the console.
Signed-off-by: Oshank Kumar <okumar@redhat.com>
Brick paths are generated while creating a smart volume using the volume
name, subvol number and brick number, so collisions will not happen
when multiple smart volumes are created.
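A minimal sketch of such a path scheme (the exact format string and base
directory are assumptions, not the ones used in this patch):

    package smartvol

    import (
        "fmt"
        "path/filepath"
    )

    // brickPath builds a path that is unique per volume name, subvolume
    // number and brick number, so bricks of different smart volumes never
    // collide.
    func brickPath(baseDir, volName string, subvol, brick int) string {
        return filepath.Join(baseDir,
            fmt.Sprintf("%s-s%d-b%d", volName, subvol, brick), "brick")
    }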
Signed-off-by: Aravinda VK <avishwan@redhat.com>
in brickmux tests. Increase the waiting time for bricks to sign in, and
add a sleep wherever a new gd2 instance is respawned, to give the bricks
enough time to sign in properly and multiplex.
Signed-off-by: Vishal Pandey <vpandey@redhat.com>
If a volume is cloned from another volume, the bricks of the cloned volume
also belong to the same LV thinpool as the original volume. So before
removing the thinpool, a check was made to confirm that the number of LVs in
that thinpool is zero. This check was causing a hang when parallel
volume delete commands were issued.
With this PR, the LV count check is removed; instead, the failure of the
thinpool delete is captured and handled gracefully.
This PR also adds support for gracefully deleting the volume if the LV or
thinpool was already deleted by a previous failed transaction or a manual delete.
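A rough sketch of the graceful handling (the lvremove invocation and the
error-matching string are assumptions, not the exact gd2 code):

    package lvmutils

    import (
        "os/exec"
        "strings"
    )

    // removeThinPool deletes the thinpool but treats an already-deleted
    // LV/thinpool as success, so a previously failed transaction or a
    // manual delete does not block the volume delete.
    func removeThinPool(vgName, poolName string) error {
        out, err := exec.Command("lvremove", "-f", vgName+"/"+poolName).CombinedOutput()
        if err != nil {
            // Assumed error text; lvremove reports "Failed to find" when
            // the logical volume or thinpool no longer exists.
            if strings.Contains(string(out), "Failed to find") {
                return nil
            }
            return err
        }
        return nil
    }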
Signed-off-by: Aravinda VK <avishwan@redhat.com>
- Moved the hosts parameter from a mandatory to an optional field in the
  CreateBlockVolume method of the BlockProvider interface, since the hosts
  field may not be required for other block providers such as loopback.
- Added a common function for updating the available hosting volume size
  to avoid duplicate code (see the sketch below).
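A minimal sketch of the shared helper and of hosts as an optional field
(all type and field names are illustrative, not the actual gd2 API):

    package blockprovider

    import "errors"

    // CreateBlockVolumeReq is a simplified request type; Hosts is optional
    // and may be left empty by providers (e.g. loopback) that do not need
    // it.
    type CreateBlockVolumeReq struct {
        Name  string
        Size  uint64
        Hosts []string // optional
    }

    // HostingVolume is a simplified stand-in for the hosting volume info.
    type HostingVolume struct {
        AvailableSize uint64
    }

    // UpdateHostingVolumeSize is the common helper block providers share
    // to adjust the hosting volume's available size in one place.
    func UpdateHostingVolumeSize(hv *HostingVolume, delta int64) error {
        if delta < 0 && uint64(-delta) > hv.AvailableSize {
            return errors.New("hosting volume does not have enough free space")
        }
        hv.AvailableSize = uint64(int64(hv.AvailableSize) + delta)
        return nil
    }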
Signed-off-by: Oshank Kumar <okumar@redhat.com>