mirror of https://github.com/gluster/glusterfs.git synced 2026-02-05 06:47:35 +01:00

13708 Commits

Author SHA1 Message Date
Barak Sason Rofman
620158475f dht - fixing xattr inconsistency
Setting an xattr on a dir, killing one of the bricks, removing the
xattr and then bringing the brick back results in xattr
inconsistency: the downed brick will still have the xattr, but the rest
won't.
This patch adds a mechanism that removes the extra xattrs during
lookup.

fixes: #1324
Change-Id: Ibcc449bad6c7cb46bcae380e42e4496d733b453d
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
2020-06-25 07:05:26 +00:00
Dmitry Antipov
9f0beedd55 storage/posix, libglusterfs: library function to sync filesystem
Convert an ad-hoc hack to a regular library function gf_syncfs().

Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Change-Id: I3ed93e9f28f22c273df1466ba4a458eacb8df395
Fixes: #1329
2020-06-22 17:16:26 +03:00
Sanju Rakonde
c18782bc91 glusterd: add-brick command failure
Problem: add-brick operation is failing when replica or disperse
count is not mentioned in the add-brick command.

Reason: with commit a113d93 we check brick order while
doing an add-brick operation for replica and disperse volumes. If
the replica count or disperse count is not mentioned in the command,
the dict get fails and causes the add-brick operation to fail.

fixes: #1306

Change-Id: Ie957540e303bfb5f2d69015661a60d7e72557353
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
2020-06-21 04:21:02 +00:00
Anoop C S
d552187818 extras/systemd: Move StartLimitIntervalSec to [Unit] section
It has been a while since systemd moved[1] the StartLimitInterval= and
StartLimitBurst= options (along with some others) from the [Service] to
the [Unit] section. Additionally, StartLimitInterval= got renamed[2] to
StartLimitIntervalSec= and can be configured only in the [Unit] section.
Therefore make the necessary modifications to avoid the following warning:

$ sudo systemd-analyze verify glusterd.service
/usr/local/lib/systemd/system/glusterd.service:21: Unknown key name 'StartLimitIntervalSec' in section 'Service', ignoring.

For backward compatibility reasons those options configured in the [Service]
section are also honoured, but they are officially documented in man
systemd.unit(5)[3].

[1] 6bf0f408e4
[2] f0367da7d1
[3] https://www.freedesktop.org/software/systemd/man/systemd.unit.html

Change-Id: I72a5b65930ddcf1d84c7e66f11685fa9a6fbda9a
Updates: #1000
Signed-off-by: Anoop C S <anoopcs@redhat.com>
2020-06-19 07:06:38 +00:00
Dmitry Antipov
2ef75183ab cli: fix data race when handling connection status
Found with GCC ThreadSanitizer:

WARNING: ThreadSanitizer: data race (pid=287943)
  Write of size 4 at 0x00000047dfa0 by thread T4:
    #0 cli_rpc_notify /path/to/glusterfs/cli/src/cli.c:313 (gluster+0x40a6df)
    #1 rpc_clnt_handle_disconnect /path/to/glusterfs/rpc/rpc-lib/src/rpc-clnt.c:821 (libgfrpc.so.0+0x13f04)
    #2 rpc_clnt_notify /path/to/glusterfs/rpc/rpc-lib/src/rpc-clnt.c:882 (libgfrpc.so.0+0x13f04)
    #3 rpc_transport_notify /path/to/glusterfs/rpc/rpc-lib/src/rpc-transport.c:520 (libgfrpc.so.0+0xf070)
    #4 socket_event_poll_err /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:1364 (socket.so+0x812c)
    #5 socket_event_handler /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2958 (socket.so+0xc453)
    #6 socket_event_handler /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2854 (socket.so+0xc453)
    #7 event_dispatch_epoll_handler /path/to/glusterfs/libglusterfs/src/event-epoll.c:640 (libglusterfs.so.0+0xcaf23)
    #8 event_dispatch_epoll_worker /path/to/glusterfs/libglusterfs/src/event-epoll.c:751 (libglusterfs.so.0+0xcaf23)
    #9 <null> <null> (libtsan.so.0+0x2d33f)

  Previous read of size 4 at 0x00000047dfa0 by thread T3 (mutexes: write M3587):
    #0 cli_cmd_await_connected /path/to/glusterfs/cli/src/cli-cmd.c:321 (gluster+0x40ca37)
    #1 cli_cmd_process /path/to/glusterfs/cli/src/cli-cmd.c:123 (gluster+0x40cc74)
    #2 cli_batch /path/to/glusterfs/cli/src/input.c:29 (gluster+0x40c2b9)
    #3 <null> <null> (libtsan.so.0+0x2d33f)

  Location is global 'connected' of size 4 at 0x00000047dfa0 (gluster+0x00000047dfa0)

Change-Id: Ie85a8a80a2c5b82252c0c1d45e68ebe9938da2eb
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1311
2020-06-18 16:22:11 +00:00
Pranith Kumar K
3510916573 mount/fuse: use cookies to get fuse-interrupt-record instead of xdata
Problem:
On executing tests/features/flock_interrupt.t the following error log
appears
[2020-06-16 11:51:54.631072 +0000] E
[fuse-bridge.c:4791:fuse_setlk_interrupt_handler_cbk] 0-glusterfs-fuse:
interrupt record not found

This happens because fuse-interrupt-record is never sent on the wire by
getxattr fop and there is no guarantee that in the cbk it will be
available in case of failures.

Fix:
wind getxattr fop with fuse-interrupt-record as cookie and recover it
in the cbk

Fixes: #1310
Change-Id: I4cfff154321a449114fc26e9440db0f08e5c7daa
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
2020-06-18 06:12:07 +00:00
Pranith Kumar K
4a2a6b9888 features/locks: posixlk-clear-lock should set error as EINTR
Problem:
fuse on receiving interrupt for setlk sends clear-lock "fop"
using virtual-getxattr. At the moment blocked locks which are
cleared return EAGAIN errno as opposed to EINTR errno

Fix:
Return EINTR errno.

Updates: #1310
Change-Id: I47de0fcaec370b267f2f5f89deeb37e1b9c0ee9b
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
2020-06-18 06:11:39 +00:00
Dmitry Antipov
3db89cf23a rpc: fix undefined behaviour in __builtin_ctz
Found with GCC UBsan:

rpcsvc.c:102:36: runtime error: passing zero to ctz(), which is not a valid argument
    #0 0x7fcd1ff6faa4 in rpcsvc_get_free_queue_index /path/to/glusterfs/rpc/rpc-lib/src/rpcsvc.c:102
    #1 0x7fcd1ff81e12 in rpcsvc_handle_rpc_call /path/to/glusterfs/rpc/rpc-lib/src/rpcsvc.c:837
    #2 0x7fcd1ff833ad in rpcsvc_notify /path/to/glusterfs/rpc/rpc-lib/src/rpcsvc.c:1000
    #3 0x7fcd1ff8829d in rpc_transport_notify /path/to/glusterfs/rpc/rpc-lib/src/rpc-transport.c:520
    #4 0x7fcd0dd72f16 in socket_event_poll_in_async /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2502
    #5 0x7fcd0dd8986a in gf_async ../../../../libglusterfs/src/glusterfs/async.h:189
    #6 0x7fcd0dd8986a in socket_event_poll_in /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2543
    #7 0x7fcd0dd8986a in socket_event_handler /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2934
    #8 0x7fcd0dd8986a in socket_event_handler /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2854
    #9 0x7fcd2048aff7 in event_dispatch_epoll_handler /path/to/glusterfs/libglusterfs/src/event-epoll.c:640
    #10 0x7fcd2048aff7 in event_dispatch_epoll_worker /path/to/glusterfs/libglusterfs/src/event-epoll.c:751
    ...

Fix, simplify, and prefer 'unsigned long' as underlying bitmap type.
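
The class of fix described above can be pictured with a minimal sketch. This
is not the actual rpcsvc.c change; the helper name lowest_set_bit() is
hypothetical. It handles an all-zero bitmap before calling the builtin and
uses the 'unsigned long' variant, __builtin_ctzl():

```c
/* Hypothetical helper, not the actual rpcsvc code: return the index of
 * the lowest set bit in a bitmap, or -1 if no bit is set.
 * __builtin_ctzl() is a GCC/Clang builtin whose result is undefined for
 * a zero argument, so the zero case is handled before the call. */
static int
lowest_set_bit(unsigned long bitmap)
{
    if (bitmap == 0)
        return -1;
    return __builtin_ctzl(bitmap);
}
```

A caller would treat -1 as "no free slot" instead of passing zero on to the
builtin.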

Change-Id: If3f24dfe7bef8bc7a11a679366e219a73caeb9e4
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1283
2020-06-17 17:26:12 +00:00
Amar Tumballi
9c17cd3b9c volgen: add an option to disable acl
Also add a message saying this is to be used
for 'debug' purposes only. This helps to narrow an
issue down to acl. Many issues were reported recently
related to permissions and acl access-denied bugs.
The bugs were elsewhere, but to validate them and to
get people back to service (in certain cases like oVirt,
where gluster volumes are used mostly by a single user),
this option can be used.

Updates: #876
Change-Id: I7be4401153607e11c9efb831ab794df4176604df
Signed-off-by: Amar Tumballi <amar@kadalu.io>
2020-06-17 17:25:39 +00:00
Sanju Rakonde
c325082370 tests/glusterd: spurious failure of tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
Test Summary Report
-------------------
tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
(Wstat: 0 Tests: 23 Failed: 3)
Failed tests:  21-23

After a glusterd restart, volume start fails. It looks like it needs some
time to sync the data, so a sleep is added for that.

Note: All other changes are made to avoid spurious failures in the future.

fixes: #1272

Change-Id: Ib184757fb936e03b5b6208465e44a8e790b71c1c
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
2020-06-17 13:15:26 +00:00
Xavi Hernandez
efaab5ec02 locks: prevent deletion of locked entries
To keep consistency inside transactions started by locking an entry or
an inode, this change delays the removal of entries that are currently
locked by one or more clients. Once all locks are released, the removal
is processed.

The detection of stale inodes in the locking code of EC has also been
improved.

Fixes: #990
Change-Id: Ic8ba23d9480f80c7f74e7a310bf8a15922320fd5
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
2020-06-17 12:22:15 +00:00
Sanju Rakonde
8a55d6b65b glusterd: migrating remove-brick commands to mgmt v3 framework
Currently remove-brick commands follow the sync-op framework. For code
extensibility (for example, adding more phases in the transaction) we are
migrating the command to the mgmt v3 framework.

fixes: #1164

Change-Id: I5d363223d6f9dc7a70b61adb9d3a5250e84a71b4
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
2020-06-17 05:17:47 +00:00
Kaleb S. KEITHLEY
3bb459c304 packaging: refactor to align with common practices
Apparently some additional Obsoletes: are required

Change-Id: I919ae5a0fcc6f720e3eab4784af36977b9eef044
Fixes: #1126
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2020-06-16 07:16:59 -04:00
Ravishankar N
c4a6748f25 afr: more quorum checks in lookup and new entry marking
Problem: See github issue for details.

Fix:
-In lookup if the entry exists in 2 out of 3 bricks, don't fail the
lookup with ENOENT just because there is an entrylk on the parent.
Consider quorum before deciding.

-If entry FOP does not succeed on quorum no. of bricks, do not perform
new entry mark.

Fixes: #1303
Change-Id: I56df8c89ad53b29fa450c7930a7b7ccec9f4a6c5
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
2020-06-16 04:53:03 +00:00
Krutika Dhananjay
5391f16fc4 extras: Modify group 'virt' to include network-related options
This is needed to work around an issue seen where vms running on
online hosts are getting killed when a different host is rebooted
in ovirt-gluster hyperconverged environments. Actual RCA is quite
lengthy and documented in the github issue. Please refer to it
for more details.

Change-Id: Ic25b5f50144ad42458e5c847e1e7e191032396c1
Fixes: #1217
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
2020-06-15 12:55:56 +00:00
Csaba Henk
9a4f91abc0 Indicate timezone offsets in timestamps
Logs and other output carrying timestamps
will now have timezone offsets indicated, e.g.:

[2020-03-12 07:01:05.584482 +0000] I [MSGID: 106143] [glusterd-pmap.c:388:pmap_registry_remove] 0-pmap: removing brick (null) on port 49153

To this end,

- gf_time_fmt() now inserts timezone offset via %z strftime(3) template.
- A new utility function has been added, gf_time_fmt_tv(), that
  takes a struct timeval pointer (*tv) instead of a time_t value to
  specify the time. If tv->tv_usec is negative,

  gf_time_fmt_tv(... tv ...)

  is equivalent to

  gf_time_fmt(... tv->tv_sec ...)

  Otherwise it also inserts tv->tv_usec to the formatted string.
- Building timestamps of usec precision has been converted to
  gf_time_fmt_tv, which is necessary because the method of appending
  a period and the usec value to the end of the timestamp does not work
  if the timestamp has zone offset, but it's also beneficial in terms of
  eliminating repetition.
- The buffer passed to gf_time_fmt/gf_time_fmt_tv has been unified to
  be of GF_TIMESTR_SIZE size (256). We need slightly larger buffer space
  to accommodate the zone offset and it's preferable to use a buffer
  which is undisputedly large enough.
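
A minimal sketch of the formatting rule in the list above, under stated
assumptions: format_tv() is a hypothetical stand-in and not gf_time_fmt_tv()
itself, it formats in UTC, and it uses one fixed date layout. It shows why the
microseconds have to be placed before the "%z" offset rather than appended to
the end of the string:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for gmtime_r() */
#endif
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: write "YYYY-MM-DD HH:MM:SS[.uuuuuu] +ZZZZ" into buf.
 * A negative tv_usec means "no sub-second part", as in the description
 * above. Returns the string length, or 0 on error or truncation. */
static size_t
format_tv(char *buf, size_t size, const struct timeval *tv)
{
    struct tm tm;
    char date[64];
    char zone[16];
    time_t sec = tv->tv_sec;
    int n;

    if (gmtime_r(&sec, &tm) == NULL)
        return 0;
    if (strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", &tm) == 0)
        return 0;
    /* %z expands to the numeric zone offset, e.g. +0000 */
    if (strftime(zone, sizeof(zone), "%z", &tm) == 0)
        return 0;

    if (tv->tv_usec < 0)
        n = snprintf(buf, size, "%s %s", date, zone);
    else
        n = snprintf(buf, size, "%s.%06ld %s", date, (long)tv->tv_usec, zone);

    return (n < 0 || (size_t)n >= size) ? 0 : (size_t)n;
}
```

The date and the offset are formatted separately so that the microsecond
field can sit between them.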

This change does *not* do the following:

- Retaining a method of timestamp creation without timezone offset.
  As to my understanding we don't need such backward compatibility
  as the code just emits timestamps to logs and other diagnostic
  texts, and doesn't do any later processing on them that would rely
  on their format. An exception to this, ie. a case where timestamp
  is built for internal use, is graph.c:fill_uuid(). As far as I can
  see, what matters in that case is the uniqueness of the produced
  string, not the format.
- Implementing a single-token (space free) timestamp format.
  While some timestamp formats used to be single-token, now all of
  them will include a space preceding the offset indicator. Again,
  I did not see a use case where this could be significant in terms
  of representation.
- Moving the codebase to a single unified timestamp format and
  dropping the fmt argument of gf_time_fmt/gf_time_fmt_tv.
  While the gf_timefmt_FT format is almost ubiquitous, there are
  a few cases where different formats are used. I'm not convinced
  there is any reason to not use gf_timefmt_FT in those cases too,
  but I did not want to make a decision in this regard.

Change-Id: I0af73ab5d490cca7ed8d07a2ce7ac22a6df2920a
Updates: #837
Signed-off-by: Csaba Henk <csaba@redhat.com>
2020-06-15 12:41:10 +00:00
Vinayakswami Hariharmath
71dd19f710 features/shard: Use fd lookup post file open
Issue:
When a process has an open fd and the same file is
unlinked in the middle of the operations, a file-based
lookup fails with ENOENT or a stale file error.

Solution:
When the file is already open and an fd is available, use fstat
to get the file attributes.
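
The underlying POSIX behaviour can be shown outside of the shard translator
with a small sketch. The function name and the scratch file path are
hypothetical. After the name is unlinked, stat() on the path fails with
ENOENT, while fstat() on the still-open fd keeps working:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for mkstemp() */
#endif
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if, after unlinking a still-open file,
 * the path lookup fails with ENOENT and the fd lookup succeeds. */
static int
fstat_survives_unlink(void)
{
    char path[] = "/tmp/fd-lookup-XXXXXX"; /* hypothetical scratch file */
    struct stat st;
    int ret = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (write(fd, "data", 4) != 4 || unlink(path) != 0) {
        unlink(path);
        close(fd);
        return -1;
    }

    /* Path-based lookup: the name is gone. */
    if (stat(path, &st) == -1 && errno == ENOENT) {
        /* fd-based lookup: still reports the 4 bytes written. */
        if (fstat(fd, &st) == 0 && st.st_size == 4)
            ret = 0;
    }

    close(fd);
    return ret;
}
```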

Change-Id: I0e83aee9f11b616dcfe13769ebfcda6742e4e0f4
Fixes: #1281
Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
2020-06-11 16:50:41 +05:30
Dmitry Antipov
fda036e79e libglusterfs: fix use-after-destroy mutex error
Found with GCC ThreadSanitizer:

WARNING: ThreadSanitizer: use of an invalid mutex (e.g. uninitialized or destroyed) (pid=188590)
    #0 pthread_mutex_lock <null> (libtsan.so.0+0x528ac)
    #1 client_ctx_del /path/to/glusterfs/libglusterfs/src/client_t.c:535 (libglusterfs.so.0+0xc681a)
    #2 client_destroy_cbk /path/to/glusterfs/xlators/protocol/server/src/server.c:944 (server.so+0xaf6e)
    #3 gf_client_destroy_recursive /path/to/glusterfs/libglusterfs/src/client_t.c:295 (libglusterfs.so.0+0xc5058)
    #4 client_destroy /path/to/glusterfs/libglusterfs/src/client_t.c:330 (libglusterfs.so.0+0xc60e4)
    ...

  Location is heap block of size 272 at 0x7b440001a180 allocated by thread T7:
    #0 calloc <null> (libtsan.so.0+0x3075a)
    #1 __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:151 (libglusterfs.so.0+0x6e42b)
    #2 gf_client_get /path/to/glusterfs/libglusterfs/src/client_t.c:155 (libglusterfs.so.0+0xc571a)
    ...

The problem is that client_destroy() may call client_ctx_del() (which attempts to lock
'scratch_ctx.lock') via recursive deletion from gf_client_destroy_recursive(), so
destroying the mutex before entering recursive deletion is an error. It should be destroyed
later, just before the client context is freed.

Change-Id: I730a628714d2b404e3f019ae552403da16b51b68
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1285
2020-06-11 03:29:32 +00:00
Dmitry Antipov
9af05a85a0 glusterd: destroy all volume info locks and mutexes
Add destroy calls for 'store_volinfo_lock' and 'lock' of volume info.
Move initialization of 'store_volinfo_lock' from glusterd_op_create_volume()
to a common place, namely glusterd_volinfo_new().

Change-Id: I5fae4469f28eab80c4fa6f5947646528e6aedad7
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1291
2020-06-10 15:47:03 +00:00
Dmitry Antipov
e150d75825 cli: fix several signed integer overflows and format specifiers
Initially found with GCC UBsan:

cli/src/cli-rpc-ops.c:5347:73: runtime error: left shift of 1 by 31
                               places cannot be represented in type 'int'
cli/src/cli-rpc-ops.c:5355:74: runtime error: left shift of 1 by 31
                               places cannot be represented in type 'int'

Ditto in cli/src/cli-xml-output.c.
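
The shift error reported above can be reproduced and avoided in a few lines.
This is a sketch with a hypothetical helper name, not the code from
cli-rpc-ops.c. Shifting the signed constant 1 by 31 places moves a bit into
the sign bit of 'int', which is undefined; shifting an unsigned 32-bit value
is well defined for shift counts 0..31:

```c
#include <stdint.h>

/* Hypothetical helper: a 32-bit mask with only bit 'n' set.
 * UINT32_C(1) makes the left operand unsigned, so n == 31 is fine.
 * Shift counts of 32 or more are rejected because they are undefined
 * for a 32-bit operand. */
static uint32_t
bit_mask(unsigned int n)
{
    return (n < 32) ? (UINT32_C(1) << n) : 0;
}
```

When such a mask is printed, the format specifier has to match the unsigned
type as well, for example PRIu32 from <inttypes.h>.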

Change-Id: I14ed51d06dafe5039f154b0c4edf25a0997d696e
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1279
2020-06-10 15:43:44 +00:00
Mohit Agrawal
955bfd5673 test: Test case brick-mux-validation-in-cluster.t is failing on RHEL-8
Brick processes are not properly attached on a cluster node when
some volume options are changed on a peer node while glusterd is down on
that specific node.

Solution: When glusterd restarts, it gets a friend update request
from a peer node if that peer has changes to a volume. If the brick
process is started before the friend update request is received,
brick_mux does not work properly: all bricks are attached to
the same process even though the volume options are not the same. To avoid
the issue, introduce an atomic flag volpeerupdate and update its value when
glusterd has received a friend update request from a peer for a specific
volume. If the volpeerupdate flag is 1, the volume is started by the
glusterd_import_friend_volume synctask.

Change-Id: I4c026f1e7807ded249153670e6967a2be8d22cb7
Credit: Sanju Rakaonde <srakonde@redhat.com>
fixes: #1290
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
2020-06-09 16:01:04 +05:30
yinkui
06fa986907 glusterd: To do full heal in different online node when do ec/afr full heal
For example:
    We have 3 nodes and create an ec 3*(2+1) volume with
test-disperse-0/test-disperse-1/test-disperse-2. When we run
'gluster v heal test full' on node-1, the glustershd on node-1,
node-2 and node-3 each get op=GF_EVENT_TRANSLATOR_OP
and then do a full heal in a different disperse group.
    Let us say we have a 2X(2+1) disperse volume with each brick
on a different machine m0, m1, m2, m3, m4, m5, and candidate_max is m5.
When a full heal is done, '*index' is 3 and !gf_uuid_compare(MY_UUID, brickinfo->uuid)
will be true on m3, so m3's glustershd will be the heal-xlator.

Id: I5c6762e6cfb375aed32d3fc11fe5eae3ee41aab4
Signed-off-by: yinkui <13965432176@163.com>

Change-Id: Ic7ef3ddfd30b5f4714ba99b4e7b708c927d68764
fixes: bz#1724948
2020-06-09 01:40:01 +00:00
Kaleb S. KEITHLEY
d21d656bc4 packaging: refactor to align with common practices
The claim that Fedora package guidelines do not require this
scheme is a non-argument. Not only do they not require it, they
don't prohibit it either. (And you can't prove a negative. It's
a specious argument.)

Change-Id: I7748c7531d52dedd71b3a7f5df049742258a6aba
Fixes: #1126
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2020-06-09 01:36:36 +00:00
Florian Smeets
5419d280a9 Make glusterfs compile on all recent and supported FreeBSD releases.
I'm currently trying to update the FreeBSD port of glusterfs from 3.11 to
7.6. With this change I was able to compile everything again on 11.3,
11.4RC1, 12.1 and 13 (head)

Change-Id: I867fa51e931f7ef486529eecb58d903d2d23f79a
Fixes: #1275
Signed-off-by: Florian Smeets <flo@FreeBSD.org>
2020-06-09 01:34:06 +00:00
Dmitry Antipov
a15640a5c7 libglusterfs: drop useless 'const' on function return type
Using the 'const' qualifier on a function return type has no effect in
ISO C, as reported with -Wignored-qualifiers of gcc-10 and clang-10.

Change-Id: I83de7ab4c8255284683bb462cd9f584ebf0f983b
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1249
2020-06-09 01:32:48 +00:00
Tamar Shacked
dbd74e0b8d When creating new file don't set xattr "trusted.glusterfs.dht"
The current call to delete the xattr from the dict fails to find the key: dict_del_sizen(xdata, xattr_name);
This is because keysize is calculated as sizeof of xattr_name, which is a pointer. This leads to a wrong size and therefore a wrong hash.
Fix: call dict_deln, which gets keysize using strlen.
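
The size mismatch described above is easy to demonstrate in isolation. The
sketch below uses hypothetical stand-ins (KEYLEN_SIZEOF and the two helper
functions) and not the dict API itself. sizeof gives the string length only
for a literal or an array; applied to a 'char *' it gives the size of the
pointer:

```c
#include <string.h>

/* Hypothetical stand-in for a sizeof-based key length: correct for
 * string literals and arrays, wrong for pointers. */
#define KEYLEN_SIZEOF(key) (sizeof(key) - 1)

/* Wrong: 'key' is a pointer here, so this is sizeof(char *) - 1
 * regardless of the string it points to. */
static size_t
keylen_from_pointer(const char *key)
{
    return KEYLEN_SIZEOF(key);
}

/* Right: the length is computed from the string contents. */
static size_t
keylen_from_strlen(const char *key)
{
    return strlen(key);
}
```

With the wrong length the key hashes differently, which matches the failed
lookup described in the message.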

fixes: #1282
Change-Id: I23ce1f8f7928e9daa43bc3a9fa8d3611e81bbc36
Signed-off-by: Tamar Shacked <tshacked@redhat.com>
2020-06-09 01:31:20 +00:00
Pranith Kumar K
af89d9e623 cluster/afr: Delay post-op for fsync
Problem:
AFR doesn't delay post-op for fsync fop. For fsync heavy workloads
this leads to un-necessary fxattrop/finodelk for every fsync leading
to bad performance.

Fix:
Have delayed post-op for fsync. Add special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or graph-switch would happen. Otherwise it leads
to un-necessary heals when the graph-switch/process-termination
happens before delayed-post-op completes.

Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
2020-06-08 13:49:12 +00:00
Sheetal Pamecha
a113d93621 glusterd: check for same node while adding bricks in disperse volume
The optimal way for configuring disperse and replicate volumes
is to have all bricks in different nodes.

The create operation fails with a message that the layout is not optimal,
and the user must use force to override this behavior. Implement the same
check during the add-brick operation to avoid a situation where all the added
bricks end up on the same host. The operation errors out accordingly,
and this can be overridden by using force, same as for create.

fixes: #1047
Change-Id: I3ee9c97c1a14b73f4532893bc00187ef9355238b
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
2020-06-06 10:18:33 +05:30
Dmitry Antipov
cd3978eda0 glusterd: fix memory leak in glusterd_store_retrieve_bricks()
Found with GCC's address sanitizer:

==67190==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24624 byte(s) in 6 object(s) allocated from:
    #0 0x7f62535c0837 in __interceptor_calloc (/usr/lib64/libasan.so.6+0xb0837)
    #1 0x7f62532a1690 in __gf_default_calloc glusterfs/mem-pool.h:122
    #2 0x7f62532a20ca in __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:144
    #3 0x7f62532c8128 in gf_store_iter_new /path/to/glusterfs/libglusterfs/src/store.c:511
    #4 0x7f623e2f9ed7 in glusterd_store_retrieve_bricks /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:2389

Direct leak of 8208 byte(s) in 2 object(s) allocated from:
    #0 0x7f62535c0837 in __interceptor_calloc (/usr/lib64/libasan.so.6+0xb0837)
    #1 0x7f62532a1690 in __gf_default_calloc glusterfs/mem-pool.h:122
    #2 0x7f62532a20ca in __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:144
    #3 0x7f62532c8128 in gf_store_iter_new /path/to/glusterfs/libglusterfs/src/store.c:511
    #4 0x7f623e2f9cf0 in glusterd_store_retrieve_bricks /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:2363
    #5 0x7fff5cb70bcf  ([stack]+0x15bcf)
    #6 0x7f623e309113 in glusterd_store_retrieve_volumes /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:3505
    #7 0xfffeb96e61d  (<unknown module>)
    #8 0x7f623e4586d7  (/usr/lib64/glusterfs/9dev/xlator/mgmt/glusterd.so+0x2f86d7)

Change-Id: I9b2a543dc095f4fa739cd664fd4d608bf8c87d60
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1263
2020-06-05 17:43:52 +00:00
karthik-us
fa63b45ca5 cluster/afr: Prioritize ENOSPC over other errors
Problem:
In a replicate/arbiter volume, if file creations or writes fail on
a quorum number of bricks, and on one brick the failure is due to ENOSPC
while on another brick it fails for a different reason, the operation may
fail with errors other than ENOSPC in some cases.

Fix:
Prioritize ENOSPC over other lesser priority errors and do not set
op_errno in posix_gfid_set if op_ret is 0 to avoid receiving any
error_no which can be misinterpreted by __afr_dir_write_finalize().

Also remove the function afr_has_arbiter_fop_cbk_quorum(), which
might consider a successful reply from a single brick as quorum
success in some cases, whereas we always need the fop to be successful
on a quorum number of bricks in an arbiter configuration.

Change-Id: I106e267f8b9451f681022f1cccb410d9bc824c08
Fixes: #1254
Signed-off-by: karthik-us <ksubrahm@redhat.com>
2020-06-05 09:23:28 +05:30
Xavi Hernandez
d405498e37 open-behind: rewrite of internal logic
There was a critical flaw in the previous implementation of open-behind.

When an open is done in the background, it's necessary to take a
reference on the fd_t object because once we "fake" the open answer,
the fd could be destroyed. However as long as there's a reference,
the release function won't be called. So, if the application closes
the file descriptor without having actually opened it, there will
always remain at least 1 reference, causing a leak.

To avoid this problem, the previous implementation didn't take a
reference on the fd_t, so there were races where the fd could be
destroyed while it was still in use.

To fix this, I've implemented a new xlator cbk that gets called from
fuse when the application closes a file descriptor.

The whole logic of handling background opens has been simplified and
it's more efficient now. A stub is created only if the fop needs to be
delayed until an open completes. Otherwise no memory allocations are
needed.

Correctly handling the close request while the open is still pending
has added a bit of complexity, but overall normal operation is simpler.

Change-Id: I6376a5491368e0e1c283cc452849032636261592
Fixes: #1225
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
2020-06-04 07:34:39 +00:00
Dmitry Antipov
cab995fd9a rpc, gf_attach: add minimal proper synchronization
Implement minimal proper synchronization between gf_attach
and underlying RPC layer using convenient POSIX primitives.

Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1260

Change-Id: Ib5130b586a8b65ed5cf5f9156c111b161570224b
2020-06-03 03:06:44 +00:00
Csaba Henk
443e8d8e6f meta: indicate writability of tunables in file mode
All files under .meta (synthetic subtree facilitating
reflection for glusterfs clients, implemented by meta
xlator) were shown as writable, but the vast majority of
them are usable only for querying parameters or stats, not
for setting them. (The exceptions are loglevel and
measure_latency.)

However, one could find out about this only by trial
and error, or by reading the code.

With this change we align file permissions with
tunability, stripping the writable bits for those nodes
which are only for querying.

Also strip writable bits from directory permissions.

updates: #1000
Change-Id: I82954e165ffc31cdf7307f4d990ef60b8154a2e2
Signed-off-by: Csaba Henk <csaba@redhat.com>
2020-06-03 02:25:21 +00:00
Ritesh Chikatwar
8972bf7fd1 events: fix python3 compatibility. As there is no "unicode" in py3, remove it.
Problem: the vdsm-client command fails with the traceback below

[root@dhcp35-179 ~]# cat <<EOF | vdsm-client  --gluster-enabled -f - GlusterEvent webhookUpdate
{
"url": "https://mail.google.com/mail/u/1/#inbox",
"bearerToken": "ritesh"
}
EOF
vdsm-client: Command GlusterEvent.webhookUpdate with args
{'url': 'https://mail.google.com/mail/u/1/#inbox', 'bearerToken': 'ritesh'} failed:
(code=4752, message=Failed to update webhook: rc=1 out=()
err=b'Traceback (most recent call last):  File "/sbin/gluster-eventsapi", line 673,
in <module> runcli()
   File "/usr/lib/python3.6/site-packages/gluster/cliutils/cliutils.py", line 232,
in runcli  cls.run(args)\n
   File "/sbin/gluster-eventsapi", line 357,
in run
   isinstance(data[args.url], unicode):NameError: name \'unicode\' is not defined')

Solution:

In py3 str can hold unicode string.

Change-Id: I3dc59df8b812f236380d2d57cfcf8e3aba91e582
Fixes: #1226
Signed-off-by: Ritesh Chikatwar <rchikatw@redhat.com>
2020-06-02 10:39:42 +00:00
Barak Sason Rofman
c9e1b58efa fuse - remove unnecessary code block
Per a "TODO" comment in the code, a code block can be removed as it's no
longer required

Change-Id: I60e064ece985ff2ea2a686bbd2f0e6cc850899e9
updates: #1000
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
2020-06-02 10:35:50 +00:00
Csaba Henk
9541796664 io-cache,quick-read: deprecate volume options with flawed semantics or naming
- performance.cache-size has flawed semantics, as it's
  dispatched on two independent translators, io-cache
  and quick-read.
- performance.qr-cache-timeout has a confusing name, as
  other options affecting quick-read have an unabbreviated
  "quick-read-..." prefix in their names.

We keep these options with unchanged operation, but in the
help output we indicate their deprecation.

The following better alternatives are introduced:

- performance.io-cache-size to tune cache-size option of io-cache
- performance.quick-read-cache-size to tune cache-size option of
  quick-read
- performance.quick-read-cache-timeout as a preferred synonym for
  performance.qr-cache-timeout

Fixes: #952
Change-Id: Ibd04fb638de8cac450ba992ad8a415154f9f4281
Signed-off-by: Csaba Henk <csaba@redhat.com>
2020-06-02 10:34:51 +00:00
Dmitry Antipov
6ac77f8525 afr: fix memory leak in afr_priv_destroy()
Found with GCC ASan:

Direct leak of 202 byte(s) in 2 object(s) allocated from:
    #0 0x7fc6c6ef0667 in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb0667)
    #1 0x7fc6c6bd145b in __gf_malloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:175
    #2 0x7fc6c6bd17a3 in gf_vasprintf /path/to/glusterfs/libglusterfs/src/mem-pool.c:223
    #3 0x7fc6c6bd1993 in gf_asprintf /path/to/glusterfs/libglusterfs/src/mem-pool.c:243
    #4 0x7fc6b0dc92f6 in init /path/to/glusterfs/xlators/cluster/afr/src/afr.c:590
    ...

Change-Id: I29feb1d30a045fb70472758e6ed4e195888090b2
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1278
2020-06-01 15:42:18 +03:00
Barak Sason Rofman
7b7559733c dht - sparse files rebalance enhancements
Currently data migration in rebalance reads sparse files sequentially,
disregarding which segments are holes and which are data. This can lead
to extremely long migration times for large sparse files.
Data migration mechanism needs to be enhanced so only data segments are
read and migrated. This can be achieved using lseek to seek for holes
and data in the file.
This enhancement is a consequence of
https://bugzilla.redhat.com/show_bug.cgi?id=1823703
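
The lseek-based approach mentioned above can be sketched as follows. This is
an illustration under assumptions, not the rebalance code: the function names
and the scratch path are hypothetical, and the loop only counts the bytes a
migrator would have to read. SEEK_DATA and SEEK_HOLE are the lseek(2) whence
values available on Linux and some other systems:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes SEEK_DATA / SEEK_HOLE on glibc */
#endif
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA /* fallback: Linux values, if the macro above came too late */
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Hypothetical helper: walk the data segments of an open file and
 * return how many bytes lie in data regions, or -1 on error. */
static off_t
count_data_bytes(int fd)
{
    off_t end = lseek(fd, 0, SEEK_END);
    off_t pos = 0;
    off_t total = 0;

    if (end < 0)
        return -1;

    while (pos < end) {
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)
                break; /* only a hole remains before end of file */
            return -1;
        }
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole <= data)
            return -1;
        /* A migrator would copy the bytes in [data, hole) here. */
        total += hole - data;
        pos = hole;
    }
    return total;
}

/* Hypothetical demo: 4 bytes of data followed by a hole up to 1 MiB. */
static off_t
sparse_demo(void)
{
    char path[] = "/tmp/sparse-demo-XXXXXX";
    off_t n = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (write(fd, "data", 4) == 4 && ftruncate(fd, 1 << 20) == 0)
        n = count_data_bytes(fd);
    unlink(path);
    close(fd);
    return n;
}
```

The exact count depends on the filesystem. One that tracks holes reports
roughly one block of data, while one without hole support reports the whole
file as data.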

fixes: #1222
Change-Id: If5f448a0c532926464e1f34f504c5c94749b08c3
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
2020-06-01 10:47:33 +00:00
Dmitry Antipov
4ccd078a00 cli: fix memory leak in gf_cli_gsync_status_output()
In gf_cli_gsync_status_output(), call to gf_cli_read_status_data()
overwrites 'sts_vals' pointers to areas allocated by GF_CALLOC()
with pointers to dict data, thus making the allocated areas not
accessible.

Change-Id: I00c310aec1a1413caf13ade14dc4fed37b51962c
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1259
2020-05-31 18:37:16 +00:00
Kaleb S. KEITHLEY
eaf126f4b0 common-ha: ganesha-ha.sh bad test for {rhel,centos} for pcs options
bash [[ ... =~ ... ]] built-in returns _0_ when the regex matches,
not 1, thus the sense of the test is backwards and never correctly
detects rhel or centos.

Change-Id: Ic9e60aae4ea38aff8f13979080995e60621a68fe
Fixes: #1269
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
2020-05-28 08:27:55 -04:00
Krutika Dhananjay
3251952510 features/shard: Aggregate file size, block-count before unwinding removexattr
Posix translator returns pre and postbufs in the dict in {F}REMOVEXATTR fops.
These iatts are further cached at layers like md-cache.
Shard translator, in its current state, simply returns these values without
updating the aggregated file size and block-count.

This patch fixes this problem.

Change-Id: I4b2dd41ede472c5829af80a67401ec5a6376d872
Fixes: #1243
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
2020-05-26 15:17:44 +05:30
Susant Palai
66ee6467d9 dht: add null check in gf_defrag_free_dir_dfmeta
fixes: #1258
Change-Id: I9d1fb512072bcc540d21d47da5b15ae1b79cf2b8
Signed-off-by: Susant Palai <spalai@redhat.com>
2020-05-26 03:04:11 +00:00
Ashish Pandey
61c4695ea1 afr/changelog: fix NULL dereferences and error handling
This patch includes the following CID from Coverity Scan:
 *1419116
 *1420206

Change-Id: Id92fd6a78c8a00726a61aa4697b5c126ced8ed4d
Updates: #1202
2020-05-26 03:03:18 +00:00
Mohit Agrawal
177cc09d24 socket: Use AES128 cipher in SSL if AES is supported by CPU
SSL performance improves with the AES128 cipher, so use AES128
as the default cipher on CPUs that have the AES bits
enabled; otherwise SSL uses the AES256 cipher.

Change-Id: I91c50fe987cbb22ed76f8012094730c592c63506
Fixes: #1050
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
2020-05-26 02:58:12 +00:00
Mohammed Rafi KC
2ae5f8eaaf glusterd/snapshot: Improve log message during snapshot clone
While taking a snapshot clone, if the snapshot is not activated,
the cli was returning that the bricks are down.
This patch clearly prints that the error is due to the snapshot
state.

Change-Id: Ia840e6e071342e061ad38bf15e2e2ff2b0dacdfa
Fixes: #1255
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
2020-05-22 15:50:25 +00:00
Krutika Dhananjay
29ec66c6ab features/shard: Aggregate size, block-count in iatt before unwinding setxattr
Posix translator returns pre and postbufs in the dict in {F}SETXATTR fops.
These iatts are further cached at layers like md-cache.
Shard translator, in its current state, simply returns these values without
updating the aggregated file size and block-count.

This patch fixes this problem.

Change-Id: I4da0eceb4235b91546df79270bcc0af8cd64e9ea
Fixes: #1243
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
2020-05-21 20:06:32 +00:00
Csaba Henk
c1baf3c68b fuse: occasional logging for fuse device 'weird' write errors
This change is a followup to
I510158843e4b1d482bdc496c2e97b1860dc1ba93.

In referred change we pushed log messages about 'weird'
write errors to fuse device out of sight, by reporting
them at Debug loglevel instead of Error (where
'weird' means errno is not POSIX compliant but having
meaningful semantics for FUSE protocol).

This solved the issue of spurious error reporting.
And so far so good: these messages don't indicate
an error condition by themselves. However, when they
come in high repetitions, that indicates a suboptimal
condition which should be reported.[1]

Therefore now we shall emit a Warning if a certain
errno occurs a certain number of times[2] as the
outcome of a write to the fuse device.

___
[1] typically ENOENTs and ENOTDIRs accumulate
when glusterfs' inode invalidation lags behind
the kernel's internal inode garbage collection
(in this case above errnos mean that the inode
which we requested to be invalidated is not found
in kernel). This can be mitigated with the
invalidate-limit command line / mount option,
cf. bz#1732717.

[2] 256, as of the current implementation.
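
A minimal sketch of the counting idea described above, under stated
assumptions: the names, the per-errno array and its bound are hypothetical,
the sketch warns on every 256th occurrence (the message only says "a certain
number of times"), and it is not thread-safe:

```c
#include <errno.h>

#define WARN_THRESHOLD 256     /* the count named in footnote [2] */
#define MAX_TRACKED_ERRNO 1024 /* hypothetical bound for this sketch */

static unsigned long write_err_count[MAX_TRACKED_ERRNO];

/* Record one failed write to the fuse device with the given errno.
 * Returns 1 when a Warning should be emitted (on every
 * WARN_THRESHOLD-th occurrence of that errno), 0 otherwise. */
static int
note_write_error(int err)
{
    if (err <= 0 || err >= MAX_TRACKED_ERRNO)
        return 0;
    write_err_count[err]++;
    return (write_err_count[err] % WARN_THRESHOLD) == 0;
}
```

Each errno is counted separately, so a burst of ENOENT does not hide or
trigger a warning for ENOTDIR.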

Change-Id: I8cc7fe104da43a88875f93b0db49d5677cc16045
Updates: #1000
Signed-off-by: Csaba Henk <csaba@redhat.com>
2020-05-21 12:08:07 +00:00
Xavi Hernandez
c888ef5749 glusterd: add missing synccond_broadcast()
After the changes in commit 3da22f8cb0,
there was a place where synccond_broadcast() was missing. It could
cause a hang if another synctask was waiting on the condition variable.

Change-Id: I92bfe4e15c5c3591e4854a64aa9e1566d50dd204
Fixes: #1116
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
2020-05-21 08:35:01 +02:00
Dmitry Antipov
d3538f3c32 storage/posix: fix thread name to comply with common convention
Rename disk space checking thread to comply with
common convention, adjust related docs as well.

Change-Id: I36d642cf09773a28abd95bbe337ce29134ad96a4
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1248
2020-05-18 18:59:05 +03:00
Dmitry Antipov
8e550380f7 features/bit-rot: invalid snprintf() buffer size
Found with clang-10 -Wfortify-source:

bit-rot-scrub.c:1802:15: warning: 'snprintf' size argument is too large;
destination buffer has size 32, but size argument is 4096 [-Wfortify-source]
        len = snprintf(key, PATH_MAX, "quarantine-%d", j);
              ^
bit-rot-scrub.c:1813:9: warning: 'snprintf' size argument is too large;
destination buffer has size 32, but size argument is 4096 [-Wfortify-source]
        snprintf(main_key, PATH_MAX, "quarantine-%d", tmp_count);
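
The warnings above come down to passing a size that is larger than the
destination buffer. A minimal sketch of the safer pattern, with a
hypothetical helper name, takes the real buffer size from the caller
(typically sizeof on the array) instead of PATH_MAX:

```c
#include <stdio.h>

/* Hypothetical helper: format "quarantine-<j>" into 'key', bounded by
 * the buffer's real size. Returns what snprintf() returns: the length
 * the full string needs, which may exceed key_size - 1 on truncation. */
static int
format_quarantine_key(char *key, size_t key_size, int j)
{
    return snprintf(key, key_size, "quarantine-%d", j);
}
```

A caller with 'char key[32]' would pass sizeof(key), so snprintf() can never
write past the 32 bytes.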

Change-Id: I9b9c09ef2223ed181d81215154345de976b82f13
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1221
2020-05-18 07:17:40 +00:00