From de7d64d9f57977eb92ebad6cb2e0f48eaf8afafe Mon Sep 17 00:00:00 2001 From: "black.dragon74" Date: Thu, 19 May 2022 13:20:48 +0530 Subject: [PATCH 01/21] [Install-Guide] Fixes and improvements Also, cleaned up the syntax. Signed-off-by: black.dragon74 --- docs/Install-Guide/Common-criteria.md | 82 +++++++------- docs/Install-Guide/Community-Packages.md | 131 +++++++++++------------ docs/Install-Guide/Configure.md | 46 ++++---- docs/Install-Guide/Install.md | 4 +- docs/Install-Guide/Overview.md | 69 ++++++------ docs/Install-Guide/Setup-Bare-metal.md | 50 ++++----- docs/Install-Guide/Setup-aws.md | 94 ++++++++-------- docs/Install-Guide/Setup-virt.md | 21 ++-- mkdocs.yml | 2 +- 9 files changed, 249 insertions(+), 250 deletions(-) diff --git a/docs/Install-Guide/Common-criteria.md b/docs/Install-Guide/Common-criteria.md index 5c6aa4f..d43edbe 100644 --- a/docs/Install-Guide/Common-criteria.md +++ b/docs/Install-Guide/Common-criteria.md @@ -8,9 +8,9 @@ setting up Gluster. Next, choose the method you want to use to set up your first cluster: -- Within a virtual machine -- To bare metal servers -- To EC2 instances in Amazon +- Within a virtual machine +- To bare metal servers +- To EC2 instances in Amazon Finally, we will install Gluster, create a few volumes, and test using them. @@ -27,50 +27,50 @@ Gluster gets distributed across multiple hosts simultaneously. This means that you can use space from any host that you have available. Typically, XFS is recommended but it can be used with other filesystems as well. Most commonly EXT4 is used when XFS isn’t, but you can (and -many, many people do) use another filesystem that suits you. +many, many people do) use another filesystem that suits you. Now that we understand that, we can define a few of the common terms used in Gluster. -- A **trusted pool** refers collectively to the hosts in a given - Gluster Cluster. -- A **node** or “server” refers to any server that is part of a - trusted pool. In general, this assumes all nodes are in the same - trusted pool. -- A **brick** is used to refer to any device (really this means - filesystem) that is being used for Gluster storage. -- An **export** refers to the mount path of the brick(s) on a given - server, for example, /export/brick1 -- The term **Global Namespace** is a fancy way of saying a Gluster - volume -- A **Gluster volume** is a collection of one or more bricks (of - course, typically this is two or more). This is analogous to - /etc/exports entries for NFS. -- **GNFS** and **kNFS**. GNFS is how we refer to our inline NFS - server. kNFS stands for kernel NFS, or, as most people would say, - just plain NFS. Most often, you will want kNFS services disabled on - the Gluster nodes. Gluster NFS doesn't take any additional - configuration and works just like you would expect with NFSv3. It is - possible to configure Gluster and NFS to live in harmony if you want - to. +- A **trusted pool** refers collectively to the hosts in a given + Gluster Cluster. +- A **node** or “server” refers to any server that is part of a + trusted pool. In general, this assumes all nodes are in the same + trusted pool. +- A **brick** is used to refer to any device (really this means + filesystem) that is being used for Gluster storage. 
+- An **export** refers to the mount path of the brick(s) on a given + server, for example, /export/brick1 +- The term **Global Namespace** is a fancy way of saying a Gluster + volume +- A **Gluster volume** is a collection of one or more bricks (of + course, typically this is two or more). This is analogous to + /etc/exports entries for NFS. +- **GNFS** and **kNFS**. GNFS is how we refer to our inline NFS + server. kNFS stands for kernel NFS, or, as most people would say, + just plain NFS. Most often, you will want kNFS services disabled on + the Gluster nodes. Gluster NFS doesn't take any additional + configuration and works just like you would expect with NFSv3. It is + possible to configure Gluster and NFS to live in harmony if you want + to. Other notes: -- For this test, if you do not have DNS set up, you can get away with - using /etc/hosts entries for the two nodes. However, when you move - from this basic setup to using Gluster in production, correct DNS - entries (forward and reverse) and NTP are essential. -- When you install the Operating System, do not format the Gluster - storage disks! We will use specific settings with the mkfs command - later on when we set up Gluster. If you are testing with a single - disk (not recommended), make sure to carve out a free partition or - two to be used by Gluster later, so that you can format or reformat - at will during your testing. -- Firewalls are great, except when they aren’t. For storage servers, - being able to operate in a trusted environment without firewalls can - mean huge gains in performance, and is recommended. In case you absolutely - need to set up a firewall, have a look at - [Setting up clients](../Administrator-Guide/Setting-Up-Clients.md) for - information on the ports used. +- For this test, if you do not have DNS set up, you can get away with + using /etc/hosts entries for the two nodes. However, when you move + from this basic setup to using Gluster in production, correct DNS + entries (forward and reverse) and NTP are essential. +- When you install the Operating System, do not format the Gluster + storage disks! We will use specific settings with the mkfs command + later on when we set up Gluster. If you are testing with a single + disk (not recommended), make sure to carve out a free partition or + two to be used by Gluster later, so that you can format or reformat + at will during your testing. +- Firewalls are great, except when they aren’t. For storage servers, + being able to operate in a trusted environment without firewalls can + mean huge gains in performance, and is recommended. In case you absolutely + need to set up a firewall, have a look at + [Setting up clients](../Administrator-Guide/Setting-Up-Clients.md) for + information on the ports used. Click here to [get started](../Quick-Start-Guide/Quickstart.md) diff --git a/docs/Install-Guide/Community-Packages.md b/docs/Install-Guide/Community-Packages.md index cb68800..0611f03 100644 --- a/docs/Install-Guide/Community-Packages.md +++ b/docs/Install-Guide/Community-Packages.md @@ -8,81 +8,78 @@ A **yes** means packages are (or will be) provided in the respective repository. A **no** means no plans to build new updates. Existing packages will remain in the repos. The following GlusterFS versions have reached EOL[1]: 8, 7, 6 and earlier. 
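Before consulting the tables below, it can help to confirm which GlusterFS version a node is actually running, for example:

```console
gluster --version
```

On RPM-based distributions, `rpm -q glusterfs-server` reports the installed package version as well.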
-| | | 10 | 9 | -|--------------|----------------|:---------:|:---------:| -|CentOS Storage SIG[2]|7 | no | yes | -| |8 | yes | yes | -| |Stream 8 | yes | yes | -| |Stream 9 | yes | yes | -| | | | | -|Fedora[3] |F34 | yes | yes¹ | -| |F35 | yes | yes¹ | -| |F36(rawhide) | yes | yes¹ | -| | | | | -|Debian[3] |Stretch/9 | no | yes | -| |Buster/10 | yes | yes | -| |Bullseye/11 | yes | yes | -| |Bookworm/12(sid)| yes | yes | -| | | | | -|Ubuntu Launchpad[4]|Xenial/16.04 | no | yes | -| |Bionic/18.04 | yes | yes | -| |Focal/20.04 | yes | yes | -| |Impish/21.10 | yes | yes | -| |Jammy/22.04 | yes | yes | -| |Kinetic/22.10 | yes | yes | -| | | | | -|OpenSUSE Build Service[5]|Leap15.2 | no | yes | -| |Leap15.3 | yes | yes | -| |Leap15.4 | yes | yes | -| |SLES12SP5 | no | yes | -| |SLES15SP2 | no | yes | -| |SLES15SP3 | yes | yes | -| |SLES15SP4 | yes | yes | -| |Tumbleweed | yes | yes | - +| | | 10 | 9 | +| ------------------------- | ---------------- | :-: | :--: | +| CentOS Storage SIG[2] | 7 | no | yes | +| | 8 | yes | yes | +| | Stream 8 | yes | yes | +| | Stream 9 | yes | yes | +| | | | | +| Fedora[3] | F34 | yes | yes¹ | +| | F35 | yes | yes¹ | +| | F36(rawhide) | yes | yes¹ | +| | | | | +| Debian[3] | Stretch/9 | no | yes | +| | Buster/10 | yes | yes | +| | Bullseye/11 | yes | yes | +| | Bookworm/12(sid) | yes | yes | +| | | | | +| Ubuntu Launchpad[4] | Xenial/16.04 | no | yes | +| | Bionic/18.04 | yes | yes | +| | Focal/20.04 | yes | yes | +| | Impish/21.10 | yes | yes | +| | Jammy/22.04 | yes | yes | +| | Kinetic/22.10 | yes | yes | +| | | | | +| OpenSUSE Build Service[5] | Leap15.2 | no | yes | +| | Leap15.3 | yes | yes | +| | Leap15.4 | yes | yes | +| | SLES12SP5 | no | yes | +| | SLES15SP2 | no | yes | +| | SLES15SP3 | yes | yes | +| | SLES15SP4 | yes | yes | +| | Tumbleweed | yes | yes | **NOTE** - We are not building Debian arm packages due to resource constraints for a while now. There will be only amd64 packages present on [download.gluster.org](https://download.gluster.org/pub/gluster/glusterfs/LATEST/) #### Related Packages -| | | glusterfs-selinux | gdeploy | gluster-block | glusterfs-coreutils | nfs-ganesha | Samba | -|--------------|----------------|:-----------------:|:-------:|:-------------:|:-------------------:|:-----------:|:-----:| -|CentOS Storage SIG[2]|7 | yes | yes | yes | yes | yes | yes | -| |8 | yes | tbd | yes | yes | yes | yes | -| |Stream 8 | yes | tbd | yes | yes | yes | yes | -| |Stream 9 | yes | tbd | yes | yes | yes | yes | -| | | | | | | | | -|Fedora[3] |F34 | yes | yes | yes | yes | yes | ? | -| |F35 | yes | yes | yes | yes | yes | ? | -| |F36(rawhide) | yes | yes | yes | yes | yes | ? | -| | | | | | | | | -|Debian[3] |Stretch/9 | n/a | no | no | yes | yes | ? | -| |Buster/10 | n/a | no | no | yes | yes | ? | -| |Bullseye/11 | n/a | no | no | yes | yes | ? | -| |Bookworm/12(sid)| n/a | no | no | yes | yes | ? | -| | | | | | | | | -|Ubuntu Launchpad[4]|Xenial/16.04 | n/a/ | no | no | yes | yes | ? | -| |Bionic/18.04 | n/a | no | no | yes | yes | ? | -| |Focal/20.04 | n/a | no | no | yes | yes | ? | -| |Impish/21.10 | n/a | no | no | yes | yes | ? | -| |Jammy/22.04 | n/a | no | no | yes | yes | ? | -| |Kinetic/22.10 | n/a | no | no | yes | yes | ? | -| | | | | | | | | -|OpenSUSE Build Service[5]|Leap15.2| n/a | yes | yes | yes | yes | ? | -| |Leap15.3 | n/a | yes | yes | yes | yes | ? | -| |Leap15.4 | n/a | yes | yes | yes | yes | ? | -| |SLES12SP5 | n/a | yes | yes | yes | yes | ? | -| |SLES15SP2 | n/a | yes | yes | yes | yes | ? 
| -| |SLES15SP3 | n/a | yes | yes | yes | yes | ? | -| |SLES15SP4 | n/a | yes | yes | yes | yes | ? | -| |Tumbleweed | n/a | yes | yes | yes | yes | ? | - - +| | | glusterfs-selinux | gdeploy | gluster-block | glusterfs-coreutils | nfs-ganesha | Samba | +| ------------------------- | ---------------- | :---------------: | :-----: | :-----------: | :-----------------: | :---------: | :---: | +| CentOS Storage SIG[2] | 7 | yes | yes | yes | yes | yes | yes | +| | 8 | yes | tbd | yes | yes | yes | yes | +| | Stream 8 | yes | tbd | yes | yes | yes | yes | +| | Stream 9 | yes | tbd | yes | yes | yes | yes | +| | | | | | | | | +| Fedora[3] | F34 | yes | yes | yes | yes | yes | ? | +| | F35 | yes | yes | yes | yes | yes | ? | +| | F36(rawhide) | yes | yes | yes | yes | yes | ? | +| | | | | | | | | +| Debian[3] | Stretch/9 | n/a | no | no | yes | yes | ? | +| | Buster/10 | n/a | no | no | yes | yes | ? | +| | Bullseye/11 | n/a | no | no | yes | yes | ? | +| | Bookworm/12(sid) | n/a | no | no | yes | yes | ? | +| | | | | | | | | +| Ubuntu Launchpad[4] | Xenial/16.04 | n/a/ | no | no | yes | yes | ? | +| | Bionic/18.04 | n/a | no | no | yes | yes | ? | +| | Focal/20.04 | n/a | no | no | yes | yes | ? | +| | Impish/21.10 | n/a | no | no | yes | yes | ? | +| | Jammy/22.04 | n/a | no | no | yes | yes | ? | +| | Kinetic/22.10 | n/a | no | no | yes | yes | ? | +| | | | | | | | | +| OpenSUSE Build Service[5] | Leap15.2 | n/a | yes | yes | yes | yes | ? | +| | Leap15.3 | n/a | yes | yes | yes | yes | ? | +| | Leap15.4 | n/a | yes | yes | yes | yes | ? | +| | SLES12SP5 | n/a | yes | yes | yes | yes | ? | +| | SLES15SP2 | n/a | yes | yes | yes | yes | ? | +| | SLES15SP3 | n/a | yes | yes | yes | yes | ? | +| | SLES15SP4 | n/a | yes | yes | yes | yes | ? | +| | Tumbleweed | n/a | yes | yes | yes | yes | ? | [1] [2] [3] [4] -[5] +[5] -¹ Fedora Updates, UpdatesTesting, or Rawhide repository. Use dnf to install. +¹ Fedora Updates, UpdatesTesting, or Rawhide repository. Use dnf to install. diff --git a/docs/Install-Guide/Configure.md b/docs/Install-Guide/Configure.md index 6f624a0..21051c6 100644 --- a/docs/Install-Guide/Configure.md +++ b/docs/Install-Guide/Configure.md @@ -4,7 +4,7 @@ For the Gluster to communicate within a cluster either the firewalls have to be turned off or enable communication for each server. ```console -# iptables -I INPUT -p all -s `` -j ACCEPT +iptables -I INPUT -p all -s `` -j ACCEPT ``` ### Configure the trusted pool @@ -21,7 +21,7 @@ or IP address if you don’t have DNS or `/etc/hosts` entries. Let say we want to connect to `node02`: ```console -# gluster peer probe node02 +gluster peer probe node02 ``` Notice that running `gluster peer status` from the second node shows @@ -29,10 +29,10 @@ that the first node has already been added. 
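For reference, a rough sketch of what `gluster peer status` reports on `node01` once the probe succeeds (the UUID below is only a placeholder):

```{ .console .no-copy }
Number of Peers: 1

Hostname: node02
Uuid: 5e987bda-16dd-43c2-835b-08b7d55e94e5
State: Peer in Cluster (Connected)
```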
### Partition the disk -Assuming you have an empty disk at `/dev/sdb`: *(You can check the partitions on your system using* `fdisk -l`*)* +Assuming you have an empty disk at `/dev/sdb`: _(You can check the partitions on your system using_ `fdisk -l`_)_ ```console -# fdisk /dev/sdb +fdisk /dev/sdb ``` And then create a single XFS partition using fdisk @@ -40,19 +40,19 @@ And then create a single XFS partition using fdisk ### Format the partition ```console -# mkfs.xfs -i size=512 /dev/sdb1 +mkfs.xfs -i size=512 /dev/sdb1 ``` ### Add an entry to /etc/fstab ```console -# echo "/dev/sdb1 /export/sdb1 xfs defaults 0 0" >> /etc/fstab +echo "/dev/sdb1 /export/sdb1 xfs defaults 0 0" >> /etc/fstab ``` ### Mount the partition as a Gluster "brick" ```console -# mkdir -p /export/sdb1 && mount -a && mkdir -p /export/sdb1/brick +mkdir -p /export/sdb1 && mount -a && mkdir -p /export/sdb1/brick ``` #### Set up a Gluster volume @@ -70,22 +70,22 @@ something goes wrong. To set up a replicated volume: ```console -# gluster volume create gv0 replica 3 node01.mydomain.net:/export/sdb1/brick \ - node02.mydomain.net:/export/sdb1/brick \ - node03.mydomain.net:/export/sdb1/brick +gluster volume create gv0 replica 3 node01.mydomain.net:/export/sdb1/brick \ + node02.mydomain.net:/export/sdb1/brick \ + node03.mydomain.net:/export/sdb1/brick ``` Breaking this down into pieces: - the first part says to create a gluster volume named gv0 -(the name is arbitrary, `gv0` was chosen simply because -it’s less typing than `gluster_volume_0`). + (the name is arbitrary, `gv0` was chosen simply because + it’s less typing than `gluster_volume_0`). - make the volume a replica volume - keep a copy of the data on at least 3 bricks at any given time. -Since we only have three bricks total, this -means each server will house a copy of the data. + Since we only have three bricks total, this + means each server will house a copy of the data. - we specify which nodes to use, and which bricks on those nodes. The order here is -important when you have more bricks. + important when you have more bricks. It is possible (as of the most current release as of this writing, Gluster 3.3) to specify the bricks in such a way that you would make both copies of the data reside on a @@ -96,12 +96,12 @@ cluster comes to a grinding halt when a single point of failure occurs. Now, we can check to make sure things are working as expected: ```console -# gluster volume info +gluster volume info ``` And you should see results similar to the following: -```console +```{ .console .no-copy } Volume Name: gv0 Type: Replicate Volume ID: 8bc3e96b-a1b6-457d-8f7a-a91d1d4dc019 @@ -115,14 +115,12 @@ Brick3: node03.yourdomain.net:/export/sdb1/brick ``` This shows us essentially what we just specified during the volume -creation. The one this to mention is the `Status`. A status of `Created` -means that the volume has been created, but hasn’t yet been started, -which would cause any attempt to mount the volume fail. +creation. The one key output worth noticing is `Status`. +A status of `Created` means that the volume has been created, +but hasn’t yet been started, which would cause any attempt to mount the volume fail. -Now, we should start the volume. +Now, we should start the volume before we try to mount it. 
``` -# gluster volume start gv0 +gluster volume start gv0 ``` - -Find all documentation [here](../index.md) diff --git a/docs/Install-Guide/Install.md b/docs/Install-Guide/Install.md index 09f16b8..9feb09f 100644 --- a/docs/Install-Guide/Install.md +++ b/docs/Install-Guide/Install.md @@ -65,8 +65,8 @@ Finally, install the packages: apt install glusterfs-server ``` -*Note: Packages exist for Ubuntu 16.04 LTS, 18.04 -LTS, 20.04 LTS, 20.10, 21.04* +_Note: Packages exist for Ubuntu 16.04 LTS, 18.04 +LTS, 20.04 LTS, 20.10, 21.04_ ###### For Red Hat/CentOS diff --git a/docs/Install-Guide/Overview.md b/docs/Install-Guide/Overview.md index 5f0e42c..adda33e 100644 --- a/docs/Install-Guide/Overview.md +++ b/docs/Install-Guide/Overview.md @@ -1,4 +1,5 @@ # Overview + ### Purpose The Install Guide (IG) is aimed at providing the sequence of steps needed for @@ -30,40 +31,40 @@ this is accomplished without a centralized metadata server. #### What is Gluster without making me learn an extra glossary of terminology? -- Gluster is an easy way to provision your own storage backend NAS - using almost any hardware you choose. -- You can add as much as you want to start with, and if you need more - later, adding more takes just a few steps. -- You can configure failover automatically, so that if a server goes - down, you don’t lose access to the data. No manual steps are - required for failover. When you fix the server that failed and bring - it back online, you don’t have to do anything to get the data back - except wait. In the meantime, the most current copy of your data - keeps getting served from the node that was still running. -- You can build a clustered filesystem in a matter of minutes… it is - trivially easy for basic setups -- It takes advantage of what we refer to as “commodity hardware”, - which means, we run on just about any hardware you can think of, - from that stack of decomm’s and gigabit switches in the corner no - one can figure out what to do with (how many license servers do you - really need, after all?), to that dream array you were speccing out - online. Don’t worry, I won’t tell your boss. -- It takes advantage of commodity software too. No need to mess with - kernels or fine tune the OS to a tee. We run on top of most unix - filesystems, with XFS and ext4 being the most popular choices. We do - have some recommendations for more heavily utilized arrays, but - these are simple to implement and you probably have some of these - configured already anyway. -- Gluster data can be accessed from just about anywhere – You can use - traditional NFS, SMB/CIFS for Windows clients, or our own native - GlusterFS (a few additional packages are needed on the client - machines for this, but as you will see, they are quite small). -- There are even more advanced features than this, but for now we will - focus on the basics. -- It’s not just a toy. Gluster is enterprise-ready, and commercial - support is available if you need it. It is used in some of the most - taxing environments like media serving, natural resource - exploration, medical imaging, and even as a filesystem for Big Data. +- Gluster is an easy way to provision your own storage backend NAS + using almost any hardware you choose. +- You can add as much as you want to start with, and if you need more + later, adding more takes just a few steps. +- You can configure failover automatically, so that if a server goes + down, you don’t lose access to the data. No manual steps are + required for failover. 
When you fix the server that failed and bring + it back online, you don’t have to do anything to get the data back + except wait. In the meantime, the most current copy of your data + keeps getting served from the node that was still running. +- You can build a clustered filesystem in a matter of minutes… it is + trivially easy for basic setups +- It takes advantage of what we refer to as “commodity hardware”, + which means, we run on just about any hardware you can think of, + from that stack of decomm’s and gigabit switches in the corner no + one can figure out what to do with (how many license servers do you + really need, after all?), to that dream array you were speccing out + online. Don’t worry, I won’t tell your boss. +- It takes advantage of commodity software too. No need to mess with + kernels or fine tune the OS to a tee. We run on top of most unix + filesystems, with XFS and ext4 being the most popular choices. We do + have some recommendations for more heavily utilized arrays, but + these are simple to implement and you probably have some of these + configured already anyway. +- Gluster data can be accessed from just about anywhere – You can use + traditional NFS, SMB/CIFS for Windows clients, or our own native + GlusterFS (a few additional packages are needed on the client + machines for this, but as you will see, they are quite small). +- There are even more advanced features than this, but for now we will + focus on the basics. +- It’s not just a toy. Gluster is enterprise-ready, and commercial + support is available if you need it. It is used in some of the most + taxing environments like media serving, natural resource + exploration, medical imaging, and even as a filesystem for Big Data. #### Is Gluster going to work for me and what I need it to do? diff --git a/docs/Install-Guide/Setup-Bare-metal.md b/docs/Install-Guide/Setup-Bare-metal.md index 682123d..97e0ced 100644 --- a/docs/Install-Guide/Setup-Bare-metal.md +++ b/docs/Install-Guide/Setup-Bare-metal.md @@ -1,5 +1,7 @@ # Setup Bare Metal -*Note: You only need one of the three setup methods!* + +_Note: You only need one of the three setup methods!_ + ### Setup, Method 2 – Setting up on physical servers To set up Gluster on physical servers, we recommend two servers of very @@ -14,16 +16,16 @@ would to a production environment (in case it becomes one, as mentioned above). That being said, here is a reminder of some of the best practices we mentioned before: -- Make sure DNS and NTP are setup, correct, and working -- If you have access to a backend storage network, use it! 10GBE or - InfiniBand are great if you have access to them, but even a 1GBE - backbone can help you get the most out of your deployment. Make sure - that the interfaces you are going to use are also in DNS since we - will be using the hostnames when we deploy Gluster -- When it comes to disks, the more the merrier. Although you could - technically fake things out with a single disk, there would be - performance issues as soon as you tried to do any real work on the - servers +- Make sure DNS and NTP are setup, correct, and working +- If you have access to a backend storage network, use it! 10GBE or + InfiniBand are great if you have access to them, but even a 1GBE + backbone can help you get the most out of your deployment. Make sure + that the interfaces you are going to use are also in DNS since we + will be using the hostnames when we deploy Gluster +- When it comes to disks, the more the merrier. 
Although you could + technically fake things out with a single disk, there would be + performance issues as soon as you tried to do any real work on the + servers With the explosion of commodity hardware, you don’t need to be a hardware expert these days to deploy a server. Although this is @@ -31,19 +33,19 @@ generally a good thing, it also means that paying attention to some important, performance-impacting BIOS settings is commonly ignored. Several points that might cause issues when if you're unaware of them: -- Most manufacturers enable power saving mode by default. This is a - great idea for servers that do not have high-performance - requirements. For the average storage server, the performance-impact - of the power savings is not a reasonable tradeoff -- Newer motherboards and processors have lots of nifty features! - Enhancements in virtualization, newer ways of doing predictive - algorithms and NUMA are just a few to mention. To be safe, many - manufactures ship hardware with settings meant to work with as - massive a variety of workloads and configurations as they have - customers. One issue you could face is when you set up that blazing-fast - 10GBE card you were so thrilled about installing? In many - cases, it would end up being crippled by a default 1x speed put in - place on the PCI-E bus by the motherboard. +- Most manufacturers enable power saving mode by default. This is a + great idea for servers that do not have high-performance + requirements. For the average storage server, the performance-impact + of the power savings is not a reasonable tradeoff +- Newer motherboards and processors have lots of nifty features! + Enhancements in virtualization, newer ways of doing predictive + algorithms and NUMA are just a few to mention. To be safe, many + manufactures ship hardware with settings meant to work with as + massive a variety of workloads and configurations as they have + customers. One issue you could face is when you set up that blazing-fast + 10GBE card you were so thrilled about installing? In many + cases, it would end up being crippled by a default 1x speed put in + place on the PCI-E bus by the motherboard. Thankfully, most manufacturers show all the BIOS settings, including the defaults, right in the manual. It only takes a few minutes to download, diff --git a/docs/Install-Guide/Setup-aws.md b/docs/Install-Guide/Setup-aws.md index d71d124..342f85b 100644 --- a/docs/Install-Guide/Setup-aws.md +++ b/docs/Install-Guide/Setup-aws.md @@ -1,5 +1,6 @@ # Setup AWS -*Note: You only need one of the three setup methods!* + +_Note: You only need one of the three setup methods!_ ### Setup, Method 3 – Deploying in AWS @@ -7,54 +8,53 @@ Deploying in Amazon can be one of the fastest ways to get up and running with Gluster. Of course, most of what we cover here will work with other cloud platforms. -- Deploy at least two instances. For testing, you can use micro - instances (I even go as far as using spot instances in most cases). - Debates rage on what size instance to use in production, and there - is really no correct answer. As with most things, the real answer is - “whatever works for you”, where the trade-offs between cost and - performance are balanced in a continual dance of trying to make your - project successful while making sure there is enough money left over - in the budget for you to get that sweet new ping pong table in the - break room. -- For cloud platforms, your data is wide open right from the start. 
As - such, you shouldn’t allow open access to all ports in your security - groups if you plan to put a single piece of even the least valuable - information on the test instances. By least valuable, I mean “Cash - value of this coupon is 1/100th of 1 cent” kind of least valuable. - Don’t be the next one to end up as a breaking news flash on the - latest inconsiderate company to allow their data to fall into the - hands of the baddies. See Step 2 for the minimum ports you will need - open to use Gluster -- You can use the free “ephemeral” storage for the Gluster bricks - during testing, but make sure to use some form of protection against - data loss when you move to production. Typically this means EBS - backed volumes or using S3 to periodically back up your data bricks. +- Deploy at least two instances. For testing, you can use micro + instances (I even go as far as using spot instances in most cases). + Debates rage on what size instance to use in production, and there + is really no correct answer. As with most things, the real answer is + “whatever works for you”, where the trade-offs between cost and + performance are balanced in a continual dance of trying to make your + project successful while making sure there is enough money left over + in the budget for you to get that sweet new ping pong table in the + break room. +- For cloud platforms, your data is wide open right from the start. As + such, you shouldn’t allow open access to all ports in your security + groups if you plan to put a single piece of even the least valuable + information on the test instances. By least valuable, I mean “Cash + value of this coupon is 1/100th of 1 cent” kind of least valuable. + Don’t be the next one to end up as a breaking news flash on the + latest inconsiderate company to allow their data to fall into the + hands of the baddies. See Step 2 for the minimum ports you will need + open to use Gluster +- You can use the free “ephemeral” storage for the Gluster bricks + during testing, but make sure to use some form of protection against + data loss when you move to production. Typically this means EBS + backed volumes or using S3 to periodically back up your data bricks. Other notes: -- In production, it is recommended to replicate your VM’s across - multiple zones. For purpose of this tutorial, it is overkill, but if - anyone is interested in this please let us know since we are always - looking to write articles on the most requested features and - questions. -- Using EBS volumes and Elastic IPs are also recommended in - production. For testing, you can safely ignore these as long as you - are aware that the data could be lost at any moment, so make sure - your test deployment is just that, testing only. -- Performance can fluctuate wildly in a cloud environment. If - performance issues are seen, there are several possible strategies, - but keep in mind that this is the perfect place to take advantage of - the scale-out capability of Gluster. While it is not true in all - cases that deploying more instances will necessarily result in a - “faster” cluster, in general, you will see that adding more nodes - means more performance for the cluster overall. -- If a node reboots, you will typically need to do some extra work to - get Gluster running again using the default EC2 configuration. If a - node is shut down, it can mean absolute loss of the node (depending - on how you set things up). 
This is well beyond the scope of this - document but is discussed in any number of AWS-related forums and - posts. Since I found out the hard way myself (oh, so you read the - manual every time?!), I thought it worth at least mentioning. - +- In production, it is recommended to replicate your VM’s across + multiple zones. For purpose of this tutorial, it is overkill, but if + anyone is interested in this please let us know since we are always + looking to write articles on the most requested features and + questions. +- Using EBS volumes and Elastic IPs are also recommended in + production. For testing, you can safely ignore these as long as you + are aware that the data could be lost at any moment, so make sure + your test deployment is just that, testing only. +- Performance can fluctuate wildly in a cloud environment. If + performance issues are seen, there are several possible strategies, + but keep in mind that this is the perfect place to take advantage of + the scale-out capability of Gluster. While it is not true in all + cases that deploying more instances will necessarily result in a + “faster” cluster, in general, you will see that adding more nodes + means more performance for the cluster overall. +- If a node reboots, you will typically need to do some extra work to + get Gluster running again using the default EC2 configuration. If a + node is shut down, it can mean absolute loss of the node (depending + on how you set things up). This is well beyond the scope of this + document but is discussed in any number of AWS-related forums and + posts. Since I found out the hard way myself (oh, so you read the + manual every time?!), I thought it worth at least mentioning. Once you have both instances up, you can proceed to the [install](./Install.md) page. diff --git a/docs/Install-Guide/Setup-virt.md b/docs/Install-Guide/Setup-virt.md index 950182c..c6b51c7 100644 --- a/docs/Install-Guide/Setup-virt.md +++ b/docs/Install-Guide/Setup-virt.md @@ -1,5 +1,6 @@ # Setup on Virtual Machine -*Note: You only need one of the three setup methods!* + +_Note: You only need one of the three setup methods!_ ### Setup, Method 1 – Setting up in virtual machines @@ -16,18 +17,18 @@ distribution already. Create or clone two VM’s, with the following setup on each: -- 2 disks using the VirtIO driver, one for the base OS and one that we - will use as a Gluster “brick”. You can add more later to try testing - some more advanced configurations, but for now let’s keep it simple. +- 2 disks using the VirtIO driver, one for the base OS and one that we + will use as a Gluster “brick”. You can add more later to try testing + some more advanced configurations, but for now let’s keep it simple. -*Note: If you have ample space available, consider allocating all the -disk space at once.* +_Note: If you have ample space available, consider allocating all the +disk space at once._ -- 2 NIC’s using VirtIO driver. The second NIC is not strictly - required, but can be used to demonstrate setting up a separate - network for storage and management traffic. +- 2 NIC’s using VirtIO driver. The second NIC is not strictly + required, but can be used to demonstrate setting up a separate + network for storage and management traffic. -*Note: Attach each NIC to a separate network.* +_Note: Attach each NIC to a separate network._ Other notes: Make sure that if you clone the VM, that Gluster has not already been installed. 
Gluster generates a UUID to “fingerprint” each diff --git a/mkdocs.yml b/mkdocs.yml index 4ea5df7..74b72da 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -33,8 +33,8 @@ nav: - Setting up on physical servers: Install-Guide/Setup-Bare-metal.md - Deploying in AWS: Install-Guide/Setup-aws.md - Install: Install-Guide/Install.md - - Community Packages: Install-Guide/Community-Packages.md - Configure: Install-Guide/Configure.md + - Community Packages: Install-Guide/Community-Packages.md - Administration Guide: - Overview: Administrator-Guide/overview.md - Index: Administrator-Guide/index.md From 19101cc61c433033b1281a83a3598c87609c80fe Mon Sep 17 00:00:00 2001 From: "black.dragon74" Date: Tue, 31 May 2022 15:30:20 +0530 Subject: [PATCH 02/21] [contrib-guide] Cleanup syntax Signed-off-by: black.dragon74 --- docs/Contributors-Guide/Adding-your-blog.md | 2 - .../Bug-Reporting-Guidelines.md | 132 +++++++++--------- docs/Contributors-Guide/Bug-Triage.md | 92 ++++++------ .../GlusterFS-Release-process.md | 50 ++++--- .../Guidelines-For-Maintainers.md | 24 ++-- docs/Contributors-Guide/Index.md | 33 ++--- 6 files changed, 165 insertions(+), 168 deletions(-) diff --git a/docs/Contributors-Guide/Adding-your-blog.md b/docs/Contributors-Guide/Adding-your-blog.md index ebcd227..0d17eb3 100644 --- a/docs/Contributors-Guide/Adding-your-blog.md +++ b/docs/Contributors-Guide/Adding-your-blog.md @@ -7,5 +7,3 @@ OK, you can do that by editing planet-gluster [feeds](https://github.com/gluster Please find instructions mentioned in the file and send a pull request. Once approved, all your gluster related posts will appear in [planet.gluster.org](http://planet.gluster.org) website. - - diff --git a/docs/Contributors-Guide/Bug-Reporting-Guidelines.md b/docs/Contributors-Guide/Bug-Reporting-Guidelines.md index fe266d7..4b984e0 100644 --- a/docs/Contributors-Guide/Bug-Reporting-Guidelines.md +++ b/docs/Contributors-Guide/Bug-Reporting-Guidelines.md @@ -1,31 +1,29 @@ -Before filing an issue ----------------------- +## Before filing an issue If you are finding any issues, these preliminary checks as useful: -- Is SELinux enabled? (you can use `getenforce` to check) -- Are iptables rules blocking any data traffic? (`iptables -L` can - help check) -- Are all the nodes reachable from each other? [ Network problem ] -- Please search [issues](https://github.com/gluster/glusterfs/issues) - to see if the bug has already been reported - - If an issue has been already filed for a particular release and - you found the issue in another release, add a comment in issue. +- Is SELinux enabled? (you can use `getenforce` to check) +- Are iptables rules blocking any data traffic? (`iptables -L` can + help check) +- Are all the nodes reachable from each other? [ Network problem ] +- Please search [issues](https://github.com/gluster/glusterfs/issues) + to see if the bug has already been reported + + - If an issue has been already filed for a particular release and you found the issue in another release, add a comment in issue. Anyone can search in github issues, you don't need an account. Searching requires some effort, but helps avoid duplicates, and you may find that your problem has already been solved. 
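As a quick reference, the preliminary checks above boil down to a handful of commands (here `node02` is a placeholder for the peer you are trying to reach):

```console
getenforce
iptables -L
ping -c 3 node02
```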
-Reporting An Issue ------------------- +## Reporting An Issue -- You should have an account with github.com -- Here is the link to file an issue: - [Github](https://github.com/gluster/glusterfs/issues/new) +- You should have an account with github.com +- Here is the link to file an issue: + [Github](https://github.com/gluster/glusterfs/issues/new) -*Note: Please go through all below sections to understand what +_Note: Please go through all below sections to understand what information we need to put in a bug. So it will help the developer to -root cause and fix it* +root cause and fix it_ ### Required Information @@ -33,84 +31,86 @@ You should gather the information below before creating the bug report. #### Package Information -- Location from which the packages are used -- Package Info - version of glusterfs package installed +- Location from which the packages are used +- Package Info - version of glusterfs package installed #### Cluster Information -- Number of nodes in the cluster -- Hostnames and IPs of the gluster Node [if it is not a security - issue] - - Hostname / IP will help developers in understanding & - correlating with the logs -- Output of `gluster peer status` -- Node IP, from which the "x" operation is done - - "x" here means any operation that causes the issue +- Number of nodes in the cluster +- Hostnames and IPs of the gluster Node [if it is not a security + issue] + + - Hostname / IP will help developers in understanding & correlating with the logs + +- Output of `gluster peer status` +- Node IP, from which the "x" operation is done + + - "x" here means any operation that causes the issue #### Volume Information -- Number of volumes -- Volume Names -- Volume on which the particular issue is seen [ if applicable ] -- Type of volumes -- Volume options if available -- Output of `gluster volume info` -- Output of `gluster volume status` -- Get the statedump of the volume with the problem - -`$ gluster volume statedump ` +- Number of volumes +- Volume Names +- Volume on which the particular issue is seen [ if applicable ] +- Type of volumes +- Volume options if available +- Output of `gluster volume info` +- Output of `gluster volume status` +- Get the statedump of the volume with the problem `gluster volume statedump ` This dumps statedump per brick process in `/var/run/gluster` -*NOTE: Collect statedumps from one gluster Node in a directory.* +_NOTE: Collect statedumps from one gluster Node in a directory._ Repeat it in all Nodes containing the bricks of the volume. All the so collected directories could be archived, compressed and attached to bug #### Brick Information -- xfs options when a brick partition was done - - This could be obtained with this command : +- xfs options when a brick partition was done -`$ xfs_info /dev/mapper/vg1-brick` + - This could be obtained with this command: `xfs_info /dev/mapper/vg1-brick` -- Extended attributes on the bricks - - This could be obtained with this command: +- Extended attributes on the bricks -`$ getfattr -d -m. -ehex /rhs/brick1/b1` + - This could be obtained with this command: `getfattr -d -m. 
-ehex /rhs/brick1/b1` #### Client Information -- OS Type ( Ubuntu, Fedora, RHEL ) -- OS Version: In case of Linux distro get the following : +- OS Type ( Ubuntu, Fedora, RHEL ) +- OS Version: In case of Linux distro get the following : -`uname -r` -`cat /etc/issue` +```console +uname -r +cat /etc/issue +``` -- Fuse or NFS Mount point on the client with output of mount commands -- Output of `df -Th` command +- Fuse or NFS Mount point on the client with output of mount commands +- Output of `df -Th` command #### Tool Information -- If any tools are used for testing, provide the info/version about it -- if any IO is simulated using a script, provide the script +- If any tools are used for testing, provide the info/version about it +- if any IO is simulated using a script, provide the script #### Logs Information -- You can check logs for issues/warnings/errors. - - Self-heal logs - - Rebalance logs - - Glusterd logs - - Brick logs - - NFS logs (if applicable) - - Samba logs (if applicable) - - Client mount log -- Add the entire logs as attachment, if its very large to paste as a - comment +- You can check logs for issues/warnings/errors. + + - Self-heal logs + - Rebalance logs + - Glusterd logs + - Brick logs + - NFS logs (if applicable) + - Samba logs (if applicable) + - Client mount log + +- Add the entire logs as attachment, if its very large to paste as a + comment #### SOS report for CentOS/Fedora -- Get the sosreport from the involved gluster Node and Client [ in - case of CentOS /Fedora ] -- Add a meaningful name/IP to the sosreport, by renaming/adding - hostname/ip to the sosreport name +- Get the sosreport from the involved gluster Node and Client [ in + case of CentOS /Fedora ] +- Add a meaningful name/IP to the sosreport, by renaming/adding + hostname/ip to the sosreport name diff --git a/docs/Contributors-Guide/Bug-Triage.md b/docs/Contributors-Guide/Bug-Triage.md index 04e7eaa..7eb167f 100644 --- a/docs/Contributors-Guide/Bug-Triage.md +++ b/docs/Contributors-Guide/Bug-Triage.md @@ -1,25 +1,24 @@ -Issues Triage Guidelines -======================== +# Issues Triage Guidelines -- Triaging of issues is an important task; when done correctly, it can - reduce the time between reporting an issue and the availability of a - fix enormously. +- Triaging of issues is an important task; when done correctly, it can + reduce the time between reporting an issue and the availability of a + fix enormously. -- Triager should focus on new issues, and try to define the problem - easily understandable and as accurate as possible. The goal of the - triagers is to reduce the time that developers need to solve the bug - report. +- Triager should focus on new issues, and try to define the problem + easily understandable and as accurate as possible. The goal of the + triagers is to reduce the time that developers need to solve the bug + report. -- A triager is like an assistant that helps with the information - gathering and possibly the debugging of a new bug report. Because a - triager helps preparing a bug before a developer gets involved, it - can be a very nice role for new community members that are - interested in technical aspects of the software. +- A triager is like an assistant that helps with the information + gathering and possibly the debugging of a new bug report. Because a + triager helps preparing a bug before a developer gets involved, it + can be a very nice role for new community members that are + interested in technical aspects of the software. 
-- Triagers will stumble upon many different kind of issues, ranging - from reports about spelling mistakes, or unclear log messages to - memory leaks causing crashes or performance issues in environments - with several hundred storage servers. +- Triagers will stumble upon many different kind of issues, ranging + from reports about spelling mistakes, or unclear log messages to + memory leaks causing crashes or performance issues in environments + with several hundred storage servers. Nobody expects that triagers can prepare all bug reports. Therefore most developers will be able to assist the triagers, answer questions and @@ -28,17 +27,16 @@ more experienced and will rely less on developers. **Issue triage can be summarized as below points:** -- Is the issue a bug? an enhancement request? or a question? Assign the relevant label. -- Is there enough information in the issue description? -- Is it a duplicate issue? -- Is it assigned to correct component of GlusterFS? -- Is the bug summary is correct? -- Assigning issue or Adding people's github handle in the comment, so they get notified. +- Is the issue a bug? an enhancement request? or a question? Assign the relevant label. +- Is there enough information in the issue description? +- Is it a duplicate issue? +- Is it assigned to correct component of GlusterFS? +- Is the bug summary is correct? +- Assigning issue or Adding people's github handle in the comment, so they get notified. The detailed discussion about the above points are below. -Is there enough information? ----------------------------- +## Is there enough information? It's hard to generalize what makes a good report. For "average" reporters is definitely often helpful to have good steps to reproduce, @@ -46,42 +44,38 @@ GlusterFS software version , and information about the test/production environment, Linux/GNU distribution. If the reporter is a developer, steps to reproduce can sometimes be -omitted as context is obvious. *However, this can create a problem for +omitted as context is obvious. _However, this can create a problem for contributors that need to find their way, hence it is strongly advised -to list the steps to reproduce an issue.* +to list the steps to reproduce an issue._ Other tips: -- There should be only one issue per report. Try not to mix related or - similar looking bugs per report. +- There should be only one issue per report. Try not to mix related or + similar looking bugs per report. -- It should be possible to call the described problem fixed at some - point. "Improve the documentation" or "It runs slow" could never be - called fixed, while "Documentation should cover the topic Embedding" - or "The page at should load - in less than five seconds" would have a criterion. A good summary of - the bug will also help others in finding existing bugs and prevent - filing of duplicates. +- It should be possible to call the described problem fixed at some + point. "Improve the documentation" or "It runs slow" could never be + called fixed, while "Documentation should cover the topic Embedding" + or "The page at should load + in less than five seconds" would have a criterion. A good summary of + the bug will also help others in finding existing bugs and prevent + filing of duplicates. -- If the bug is a graphical problem, you may want to ask for a - screenshot to attach to the bug report. Make sure to ask that the - screenshot should not contain any confidential information. 
+- If the bug is a graphical problem, you may want to ask for a + screenshot to attach to the bug report. Make sure to ask that the + screenshot should not contain any confidential information. -Is it a duplicate? ------------------- +## Is it a duplicate? If you think that you have found a duplicate but you are not totally sure, just add a comment like "This issue looks related to issue #NNN" (and replace NNN by issue-id) so somebody else can take a look and help judging. - -Is it assigned with correct label? ----------------------------------- +## Is it assigned with correct label? Go through the labels and assign the appropriate label -Are the fields correct? ------------------------ +## Are the fields correct? ### Description @@ -89,8 +83,8 @@ Sometimes the description does not summarize the bug itself well. You may want to update the bug summary to make the report distinguishable. A good title may contain: -- A brief explanation of the root cause (if it was found) -- Some of the symptoms people are experiencing +- A brief explanation of the root cause (if it was found) +- Some of the symptoms people are experiencing ### Assigning issue or Adding people's github handle in the comment diff --git a/docs/Contributors-Guide/GlusterFS-Release-process.md b/docs/Contributors-Guide/GlusterFS-Release-process.md index 5888bcf..1ef6bff 100644 --- a/docs/Contributors-Guide/GlusterFS-Release-process.md +++ b/docs/Contributors-Guide/GlusterFS-Release-process.md @@ -15,7 +15,7 @@ Minor releases will have guaranteed backwards compatibilty with earlier minor re Each GlusterFS major release has a 4-6 month release window, in which changes get merged. This window is split into two phases. 1. A Open phase, where all changes get merged -1. A Stability phase, where only changes that stabilize the release get merged. +2. A Stability phase, where only changes that stabilize the release get merged. The first 2-4 months of a release window will be the Open phase, and the last month will be the stability phase. @@ -30,8 +30,8 @@ All changes will be accepted during the Open phase. The changes have a few requi - a change fixing a bug SHOULD have public test case - a change introducing a new feature MUST have a disable switch that can disable the feature during a build - #### Stability phase + This phase is used to stabilize any new features introduced in the open phase, or general bug fixes for already existing features. A new `release-` branch is created at the beginning of this phase. All changes need to be sent to the master branch before getting backported to the new release branch. @@ -54,6 +54,7 @@ Patches accepted in the Stability phase have the following requirements: Patches that do not satisfy the above requirements can still be submitted for review, but cannot be merged. ## Release procedure + This procedure is followed by a release maintainer/manager, to perform the actual release. The release procedure for both major releases and minor releases is nearly the same. @@ -63,6 +64,7 @@ The procedure for the major releases starts at the beginning of the Stability ph _TODO: Add the release verification procedure_ ### Release steps + The release-manager needs to follow the following steps, to actually perform the release once ready. #### Create tarball @@ -73,9 +75,11 @@ The release-manager needs to follow the following steps, to actually perform the 4. 
create the tarball with the [release job in Jenkins](http://build.gluster.org/job/release/) #### Notify packagers + Notify the packagers that we need packages created. Provide the link to the source tarball from the Jenkins release job to the [packagers mailinglist](mailto:packaging@gluster.org). A list of the people involved in the package maintenance for the different distributions is in the `MAINTAINERS` file in the sources, all of them should be subscribed to the packagers mailinglist. #### Create a new Tracker Bug for the next release + The tracker bugs are used as guidance for blocker bugs and should get created when a release is made. To create one - Create a [new milestone](https://github.com/gluster/glusterfs/milestones/new) @@ -83,19 +87,21 @@ The tracker bugs are used as guidance for blocker bugs and should get created wh - issues that were not fixed in previous release, but in milestone should be moved to the new milestone. #### Create Release Announcement -(Major releases) -The Release Announcement is based off the release notes. This needs to indicate: - * What this release's overall focus is - * Which versions will stop receiving updates as of this release - * Links to the direct download folder - * Feature set - -Best practice as of version-8 is to create a collaborative version of the release notes that both the release manager and community lead work on together, and the release manager posts to the mailing lists (gluster-users@, gluster-devel@, announce@). +(Major releases) +The Release Announcement is based off the release notes. This needs to indicate: + +- What this release's overall focus is +- Which versions will stop receiving updates as of this release +- Links to the direct download folder +- Feature set + +Best practice as of version-8 is to create a collaborative version of the release notes that both the release manager and community lead work on together, and the release manager posts to the mailing lists (gluster-users@, gluster-devel@, announce@). #### Create Upgrade Guide -(Major releases) -If required, as in the case of a major release, an upgrade guide needs to be available at the same time as the release. + +(Major releases) +If required, as in the case of a major release, an upgrade guide needs to be available at the same time as the release. This document should go under the [Upgrade Guide](https://github.com/gluster/glusterdocs/tree/master/Upgrade-Guide) section of the [glusterdocs](https://github.com/gluster/glusterdocs) repository. #### Send Release Announcement @@ -103,13 +109,15 @@ This document should go under the [Upgrade Guide](https://github.com/gluster/glu Once the Fedora/EL RPMs are ready (and any others that are ready by then), send the release announcement: - Gluster Mailing lists - - [gluster-announce](https://lists.gluster.org/mailman/listinfo/announce/) - - [gluster-devel](https://lists.gluster.org/mailman/listinfo/gluster-devel) - - [gluster-users](https://lists.gluster.org/mailman/listinfo/gluster-users/) - -- [Gluster Blog](https://planet.gluster.org/) -The blog will automatically post to both Facebook and Twitter. Be careful with this! 
- - [Gluster Twitter account](https://twitter.com/gluster) - - [Gluster Facebook page](https://www.facebook.com/GlusterInc) -- [Gluster LinkedIn group](https://www.linkedin.com/company/gluster/about/) + - [gluster-announce](https://lists.gluster.org/mailman/listinfo/announce/) + - [gluster-devel](https://lists.gluster.org/mailman/listinfo/gluster-devel) + - [gluster-users](https://lists.gluster.org/mailman/listinfo/gluster-users/) + +- [Gluster Blog](https://planet.gluster.org/) + The blog will automatically post to both Facebook and Twitter. Be careful with this! + + - [Gluster Twitter account](https://twitter.com/gluster) + - [Gluster Facebook page](https://www.facebook.com/GlusterInc) + +- [Gluster LinkedIn group](https://www.linkedin.com/company/gluster/about/) diff --git a/docs/Contributors-Guide/Guidelines-For-Maintainers.md b/docs/Contributors-Guide/Guidelines-For-Maintainers.md index ad7366d..f3dc3a9 100644 --- a/docs/Contributors-Guide/Guidelines-For-Maintainers.md +++ b/docs/Contributors-Guide/Guidelines-For-Maintainers.md @@ -13,8 +13,10 @@ explicitly called out. ### Guidelines that Maintainers are expected to adhere to -1. Ensure qualitative and timely management of patches sent for review. -2. For merging patches into the repository, it is expected of maintainers to: +1. Ensure qualitative and timely management of patches sent for review. + +2. For merging patches into the repository, it is expected of maintainers to: + - Merge patches of owned components only. - Seek approvals from all maintainers before merging a patchset spanning multiple components. @@ -28,14 +30,15 @@ explicitly called out. quality of the codebase. - Not merge patches written by themselves until there is a +2 Code Review vote by other reviewers. -3. The responsibility of merging a patch into a release branch in normal - circumstances will be that of the release maintainer's. Only in exceptional - situations, maintainers & sub-maintainers will merge patches into a release - branch. -4. Release maintainers will ensure approval from appropriate maintainers before - merging a patch into a release branch. -5. Maintainers have a responsibility to the community, it is expected of - maintainers to: + +3. The responsibility of merging a patch into a release branch in normal + circumstances will be that of the release maintainer's. Only in exceptional + situations, maintainers & sub-maintainers will merge patches into a release + branch. +4. Release maintainers will ensure approval from appropriate maintainers before + merging a patch into a release branch. + +5. Maintainers have a responsibility to the community, it is expected of maintainers to: - Facilitate the community in all aspects. - Be very active and visible in the community. 
- Be objective and consider the larger interests of the community ahead of @@ -53,4 +56,3 @@ Any questions or comments regarding these guidelines can be routed to Github can be used to list patches that need reviews and/or can get merged from [Pull Requests](https://github.com/gluster/glusterfs/pulls) - diff --git a/docs/Contributors-Guide/Index.md b/docs/Contributors-Guide/Index.md index 4d91291..f198112 100644 --- a/docs/Contributors-Guide/Index.md +++ b/docs/Contributors-Guide/Index.md @@ -1,28 +1,23 @@ # Workflow Guide -Bug Handling ------------- +## Bug Handling -- [Bug reporting guidelines](./Bug-Reporting-Guidelines.md) - - Guideline for reporting a bug in GlusterFS -- [Bug triage guidelines](./Bug-Triage.md) - Guideline on how to - triage bugs for GlusterFS +- [Bug reporting guidelines](./Bug-Reporting-Guidelines.md) - + Guideline for reporting a bug in GlusterFS +- [Bug triage guidelines](./Bug-Triage.md) - Guideline on how to + triage bugs for GlusterFS -Release Process ---------------- +## Release Process -- [GlusterFS Release process](./GlusterFS-Release-process.md) - - Our release process / checklist +- [GlusterFS Release process](./GlusterFS-Release-process.md) - + Our release process / checklist -Patch Acceptance ----------------- +## Patch Acceptance -- The [Guidelines For Maintainers](./Guidelines-For-Maintainers.md) explains when - maintainers can merge patches. +- The [Guidelines For Maintainers](./Guidelines-For-Maintainers.md) explains when + maintainers can merge patches. -Blogging about gluster ----------------- - -- The [Adding your gluster blog](./Adding-your-blog.md) explains how to add your -gluster blog to Community blogger. +## Blogging about gluster +- The [Adding your gluster blog](./Adding-your-blog.md) explains how to add your + gluster blog to Community blogger. From 3e390eb10641b44a6c914c6bc7ae222eaf79e220 Mon Sep 17 00:00:00 2001 From: black-dragon74 Date: Thu, 2 Jun 2022 14:09:26 +0530 Subject: [PATCH 03/21] [devel-guide] Fix broken links and cleanup syntax Signed-off-by: black-dragon74 --- docs/Developer-guide/Backport-Guidelines.md | 25 ++-- docs/Developer-guide/Building-GlusterFS.md | 12 +- docs/Developer-guide/Developers-Index.md | 55 +++---- docs/Developer-guide/Development-Workflow.md | 137 +++++++++--------- docs/Developer-guide/Easy-Fix-Bugs.md | 16 +- ...orted-by-tools-for-static-code-analysis.md | 58 ++++---- docs/Developer-guide/Projects.md | 8 +- .../Simplified-Development-Workflow.md | 53 +++---- docs/Developer-guide/compiling-rpms.md | 129 ++++++++--------- .../coredump-on-customer-setup.md | 61 ++++---- 10 files changed, 276 insertions(+), 278 deletions(-) diff --git a/docs/Developer-guide/Backport-Guidelines.md b/docs/Developer-guide/Backport-Guidelines.md index f013338..068f9af 100644 --- a/docs/Developer-guide/Backport-Guidelines.md +++ b/docs/Developer-guide/Backport-Guidelines.md @@ -1,4 +1,5 @@ # Backport Guidelines + In GlusterFS project, as a policy, any new change, bug fix, etc., are to be fixed in 'devel' branch before release branches. When a bug is fixed in the devel branch, it might be desirable or necessary in release branch. @@ -9,17 +10,17 @@ understand how to request for backport from community. ## Policy -* No feature from devel would be backported to the release branch -* CVE ie., security vulnerability [(listed on the CVE database)](https://cve.mitre.org/cve/search_cve_list.html) -reported in the existing releases would be backported, after getting fixed -in devel branch. 
-* Only topics which bring about data loss or, unavailability would be -backported to the release. -* For any other issues, the project recommends that the installation be -upgraded to a newer release where the specific bug has been addressed. +- No feature from devel would be backported to the release branch +- CVE ie., security vulnerability [(listed on the CVE database)](https://cve.mitre.org/cve/search_cve_list.html) + reported in the existing releases would be backported, after getting fixed + in devel branch. +- Only topics which bring about data loss or, unavailability would be + backported to the release. +- For any other issues, the project recommends that the installation be + upgraded to a newer release where the specific bug has been addressed. - Gluster provides 'rolling' upgrade support, i.e., one can upgrade their -server version without stopping the application I/O, so we recommend migrating -to higher version. + server version without stopping the application I/O, so we recommend migrating + to higher version. ## Things to pay attention to while backporting a patch. @@ -27,12 +28,10 @@ If your patch meets the criteria above, or you are a user, who prefer to have a fix backported, because your current setup is facing issues, below are the steps you need to take care to submit a patch on release branch. -* The patch should have same 'Change-Id'. - +- The patch should have same 'Change-Id'. ### How to contact release owners? All release owners are part of 'gluster-devel@gluster.org' mailing list. Please write your expectation from next release there, so we can take that to consideration while making the release. - diff --git a/docs/Developer-guide/Building-GlusterFS.md b/docs/Developer-guide/Building-GlusterFS.md index 1a9c9fd..4164761 100644 --- a/docs/Developer-guide/Building-GlusterFS.md +++ b/docs/Developer-guide/Building-GlusterFS.md @@ -7,9 +7,11 @@ This page describes how to build and install GlusterFS. The following packages are required for building GlusterFS, - GNU Autotools - - Automake - - Autoconf - - Libtool + + - Automake + - Autoconf + - Libtool + - lex (generally flex) - GNU Bison - OpenSSL @@ -258,9 +260,9 @@ cd extras/LinuxRPM make glusterrpms ``` -This will create rpms from the source in 'extras/LinuxRPM'. *(Note: You +This will create rpms from the source in 'extras/LinuxRPM'. _(Note: You will need to install the rpmbuild requirements including rpmbuild and -mock)*
+mock)_
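As a rough sketch of satisfying that note (assuming a dnf-based Fedora/EL system; package names can differ on other distributions), the rpmbuild and mock requirements could be pulled in before running the target:

```console
sudo dnf -y install rpm-build mock   # provides rpmbuild and mock needed by 'make glusterrpms'
```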
For CentOS / Enterprise Linux 8 the dependencies can be installed via: ```console diff --git a/docs/Developer-guide/Developers-Index.md b/docs/Developer-guide/Developers-Index.md index 3f324fa..669576f 100644 --- a/docs/Developer-guide/Developers-Index.md +++ b/docs/Developer-guide/Developers-Index.md @@ -1,8 +1,8 @@ -Developers -========== +# Developers ### Contributing to the Gluster community -------------------------------------- + +--- Are you itching to send in patches and participate as a developer in the Gluster community? Here are a number of starting points for getting @@ -10,36 +10,37 @@ involved. All you need is your 'github' account to be handy. Remember that, [Gluster community](https://github.com/gluster) has multiple projects, each of which has its own way of handling PRs and patches. Decide on which project you want to contribute. Below documents are mostly about 'GlusterFS' project, which is the core of Gluster Community. -Workflow --------- +## Workflow -- [Simplified Developer Workflow](./Simplified-Development-Workflow.md) - - A simpler and faster intro to developing with GlusterFS, than the document below -- [Developer Workflow](./Development-Workflow.md) - - Covers detail about requirements from a patch; tools and toolkits used by developers. - This is recommended reading in order to begin contributions to the project. -- [GD2 Developer Workflow](https://github.com/gluster/glusterd2/blob/master/doc/development-guide.md) - - Helps in on-boarding developers to contribute in GlusterD2 project. +- [Simplified Developer Workflow](./Simplified-Development-Workflow.md) -Compiling Gluster ------------------ + - A simpler and faster intro to developing with GlusterFS, than the document below -- [Building GlusterFS](./Building-GlusterFS.md) - How to compile - Gluster from source code. +- [Developer Workflow](./Development-Workflow.md) -Developing ----------- + - Covers detail about requirements from a patch; tools and toolkits used by developers. + This is recommended reading in order to begin contributions to the project. -- [Projects](./Projects.md) - Ideas for projects you could - create -- [Fixing issues reported by tools for static code - analysis](./Fixing-issues-reported-by-tools-for-static-code-analysis.md) - - This is a good starting point for developers to fix bugs in - GlusterFS project. +- [GD2 Developer Workflow](https://github.com/gluster/glusterd2/blob/master/doc/development-guide.md) -Releases and Backports ----------------------- + - Helps in on-boarding developers to contribute in GlusterD2 project. -- [Backport Guidelines](./Backport-Guidelines.md) describe the steps that branches too. +## Compiling Gluster + +- [Building GlusterFS](./Building-GlusterFS.md) - How to compile + Gluster from source code. + +## Developing + +- [Projects](./Projects.md) - Ideas for projects you could + create +- [Fixing issues reported by tools for static code + analysis](./Fixing-issues-reported-by-tools-for-static-code-analysis.md) + + - This is a good starting point for developers to fix bugs in GlusterFS project. + +## Releases and Backports + +- [Backport Guidelines](./Backport-Guidelines.md) describe the steps that branches too. 
Some more GlusterFS Developer documentation can be found [in glusterfs documentation directory](https://github.com/gluster/glusterfs/tree/master/doc/developer-guide) diff --git a/docs/Developer-guide/Development-Workflow.md b/docs/Developer-guide/Development-Workflow.md index fa6096e..585f712 100644 --- a/docs/Developer-guide/Development-Workflow.md +++ b/docs/Developer-guide/Development-Workflow.md @@ -1,12 +1,10 @@ -Development workflow of Gluster -================================ +# Development workflow of Gluster This document provides a detailed overview of the development model followed by the GlusterFS project. For a simpler overview visit [Simplified development workflow](./Simplified-Development-Workflow.md). -##Basics --------- +## Basics The GlusterFS development model largely revolves around the features and functionality provided by Git version control system, Github and Jenkins @@ -31,8 +29,7 @@ all builds and tests can be viewed at 'regression' job which is designed to execute test scripts provided as part of the code change. -##Preparatory Setup -------------------- +## Preparatory Setup Here is a list of initial one-time steps before you can start hacking on code. @@ -46,9 +43,9 @@ Fork [GlusterFS repository](https://github.com/gluster/glusterfs/fork) Get yourself a working tree by cloning the development repository from ```console -# git clone git@github.com:${username}/glusterfs.git -# cd glusterfs/ -# git remote add upstream git@github.com:gluster/glusterfs.git +git clone git@github.com:${username}/glusterfs.git +cd glusterfs/ +git remote add upstream git@github.com:gluster/glusterfs.git ``` ### Preferred email and set username @@ -69,13 +66,14 @@ get alerts. Set up a filter rule in your mail client to tag or classify emails with the header + ```text list: ``` + as mails originating from the github system. -##Development & Other flows ---------------------------- +## Development & Other flows ### Issue @@ -90,17 +88,17 @@ as mails originating from the github system. - Make sure clang-format is installed and is run on the patch. ### Keep up-to-date + - GlusterFS is a large project with many developers, so there would be one or the other patch everyday. - It is critical for developer to be up-to-date with devel repo to be Conflict-Free when PR is opened. - Git provides many options to keep up-to-date, below is one of them ```console -# git fetch upstream -# git rebase upstream/devel +git fetch upstream +git rebase upstream/devel ``` -##Branching policy ------------------- +## Branching policy This section describes both, the branching policies on the public repo as well as the suggested best-practice for local branching @@ -130,13 +128,12 @@ change. The name of the branch on your personal fork can start with issueNNNN, followed by anything of your choice. If you are submitting changes to the devel branch, first create a local task branch like this - -```console +```{ .console .no-copy } # git checkout -b issueNNNN upstream/main ... 
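# git branch -vv    # illustrative extra check (hypothetical issueNNNN branch): confirm the new branch was created as expected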
``` -##Building ----------- +## Building ### Environment Setup @@ -147,18 +144,19 @@ refer : [Building GlusterFS](./Building-GlusterFS.md) Once the required packages are installed for your appropiate system, generate the build configuration: + ```console -# ./autogen.sh -# ./configure --enable-fusermount +./autogen.sh +./configure --enable-fusermount ``` ### Build and install + ```console -# make && make install +make && make install ``` -##Commit policy / PR description --------------------------------- +## Commit policy / PR description Typically you would have a local branch per task. You will need to sign-off your commit (git commit -s) before sending the @@ -169,22 +167,21 @@ CONTRIBUTING file available in the repository root. Provide a meaningful commit message. Your commit message should be in the following format -- A short one-line title of format 'component: title', describing what the patch accomplishes -- An empty line following the subject -- Situation necessitating the patch -- Description of the code changes -- Reason for doing it this way (compared to others) -- Description of test cases -- When you open a PR, having a reference Issue for the commit is mandatory in GlusterFS. -- Commit message can have, either Fixes: #NNNN or Updates: #NNNN in a separate line in the commit message. - Here, NNNN is the Issue ID in glusterfs repository. -- Each commit needs the author to have the 'Signed-off-by: Name ' line. - Can do this by -s option for git commit. -- If the PR is not ready for review, apply the label work-in-progress. - Check the availability of "Draft PR" is present for you, if yes, use that instead. +- A short one-line title of format 'component: title', describing what the patch accomplishes +- An empty line following the subject +- Situation necessitating the patch +- Description of the code changes +- Reason for doing it this way (compared to others) +- Description of test cases +- When you open a PR, having a reference Issue for the commit is mandatory in GlusterFS. +- Commit message can have, either Fixes: #NNNN or Updates: #NNNN in a separate line in the commit message. + Here, NNNN is the Issue ID in glusterfs repository. +- Each commit needs the author to have the 'Signed-off-by: Name ' line. + Can do this by -s option for git commit. +- If the PR is not ready for review, apply the label work-in-progress. + Check the availability of "Draft PR" is present for you, if yes, use that instead. -##Push the change ------------------ +## Push the change After doing the local commit, it is time to submit the code for review. There is a script available inside glusterfs.git called rfc.sh. It is @@ -192,31 +189,34 @@ recommended you keep pushing to your repo every day, so you don't loose any work. You can submit your changes for review by simply executing ```console -# ./rfc.sh +./rfc.sh ``` + or + ```console -# git push origin HEAD:issueNNN +git push origin HEAD:issueNNN ``` This script rfc.sh does the following: -- The first time it is executed, it downloads a git hook from - and sets it up - locally to generate a Change-Id: tag in your commit message (if it - was not already generated.) -- Rebase your commit against the latest upstream HEAD. This rebase - also causes your commits to undergo massaging from the just - downloaded commit-msg hook. -- Prompt for a Reference Id for each commit (if it was not already provided) - and include it as a "fixes: #n" tag in the commit log. You can just hit - at this prompt if your submission is purely for review - purposes. 
-- Push the changes for review. On a successful push, you will see a URL pointing to - the change in [Pull requests](https://github.com/gluster/glusterfs/pulls) section. +- The first time it is executed, it downloads a git hook from + and sets it up + locally to generate a Change-Id: tag in your commit message (if it + was not already generated.) +- Rebase your commit against the latest upstream HEAD. This rebase + also causes your commits to undergo massaging from the just + downloaded commit-msg hook. +- Prompt for a Reference Id for each commit (if it was not already provided) + and include it as a "fixes: #n" tag in the commit log. You can just hit + at this prompt if your submission is purely for review + purposes. +- Push the changes for review. On a successful push, you will see a URL pointing to + the change in [Pull requests](https://github.com/gluster/glusterfs/pulls) section. ## Test cases and Verification ------------------------------- + +--- ### Auto-triggered tests @@ -258,13 +258,13 @@ To check and run all regression tests locally, run the below script from glusterfs root directory. ```console -# ./run-tests.sh +./run-tests.sh ``` To run a single regression test locally, run the below command. ```console -# prove -vf +prove -vf ``` **NOTE:** The testing framework needs perl-Test-Harness package to be installed. @@ -284,18 +284,17 @@ of the feature. Please go through glusto-tests project to understand more information on how to write and execute the tests in glusto. 1. Extend/Modify old test cases in existing scripts - This is typically -when present behavior (default values etc.) of code is changed. + when present behavior (default values etc.) of code is changed. 2. No test cases - This is typically when a code change is trivial -(e.g. fixing typos in output strings, code comments). + (e.g. fixing typos in output strings, code comments). 3. Only test case and no code change - This is typically when we are -adding test cases to old code (already existing before this regression -test policy was enforced). More details on how to work with test case -scripts can be found in tests/README. + adding test cases to old code (already existing before this regression + test policy was enforced). More details on how to work with test case + scripts can be found in tests/README. -##Reviewing / Commenting ------------------------- +## Reviewing / Commenting Code review with Github is relatively easy compared to other available tools. Each change is presented as multiple files and each file can be @@ -304,8 +303,7 @@ on each line by clicking on '+' icon and writing in your comments in the text box. Such in-line comments are saved as drafts, till you finally publish them by Starting a Review. -##Incorporate, rfc.sh, Reverify --------------------------------------- +## Incorporate, rfc.sh, Reverify Code review comments are notified via email. After incorporating the changes in code, you can mark each of the inline comments as 'done' @@ -313,8 +311,9 @@ changes in code, you can mark each of the inline comments as 'done' commits in the same branch with - ```console -# git commit -a -s +git commit -a -s ``` + Push the commit by executing rfc.sh. If your previous push was an "rfc" push (i.e, without a Issue Id) you will be prompted for a Issue Id again. You can re-push an rfc change without any other code change too @@ -332,8 +331,7 @@ comments can be made on the new patch as well, and the same cycle repeats. If no further changes are necessary, the reviewer can approve the patch. 
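A minimal sketch of that incorporate-and-resubmit loop, using only the commands described above:

```console
# address the review comments in your working tree, then record them as a new signed-off commit
git commit -a -s
# re-push the same branch; the Change-Id added by the commit-msg hook keeps the review thread on the existing PR
./rfc.sh
```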
-##Submission Qualifiers ------------------------ +## Submission Qualifiers GlusterFS project follows 'Squash and Merge' method. @@ -350,8 +348,7 @@ The project maintainer will merge the changes once a patch meets these qualifiers. If you feel there is delay, feel free to add a comment, discuss the same in Slack channel, or send email. -##Submission Disqualifiers --------------------------- +## Submission Disqualifiers - +2 : is equivalent to "Approve" from the people in the maintainer's group. - +1 : can be given by a maintainer/reviewer by explicitly stating that in the comment. diff --git a/docs/Developer-guide/Easy-Fix-Bugs.md b/docs/Developer-guide/Easy-Fix-Bugs.md index 96db08c..54ec30e 100644 --- a/docs/Developer-guide/Easy-Fix-Bugs.md +++ b/docs/Developer-guide/Easy-Fix-Bugs.md @@ -2,8 +2,8 @@ Fixing easy issues is an excellent method to start contributing patches to Gluster. -Sometimes an *Easy Fix* issue has a patch attached. In those cases, -the *Patch* keyword has been added to the bug. These bugs can be +Sometimes an _Easy Fix_ issue has a patch attached. In those cases, +the _Patch_ keyword has been added to the bug. These bugs can be used by new contributors that would like to verify their workflow. [Bug 1099645](https://bugzilla.redhat.com/1099645) is one example of those. @@ -11,12 +11,12 @@ All such issues can be found [here](https://github.com/gluster/glusterfs/labels/ ### Guidelines for new comers -- While trying to write a patch, do not hesitate to ask questions. -- If something in the documentation is unclear, we do need to know so - that we can improve it. -- There are no stupid questions, and it's more stupid to not ask - questions that others can easily answer. Always assume that if you - have a question, someone else would like to hear the answer too. +- While trying to write a patch, do not hesitate to ask questions. +- If something in the documentation is unclear, we do need to know so + that we can improve it. +- There are no stupid questions, and it's more stupid to not ask + questions that others can easily answer. Always assume that if you + have a question, someone else would like to hear the answer too. [Reach out](https://www.gluster.org/community/) to the developers in #gluster on [Gluster Slack](https://gluster.slack.com) channel, or on diff --git a/docs/Developer-guide/Fixing-issues-reported-by-tools-for-static-code-analysis.md b/docs/Developer-guide/Fixing-issues-reported-by-tools-for-static-code-analysis.md index 268e98e..e0d7769 100644 --- a/docs/Developer-guide/Fixing-issues-reported-by-tools-for-static-code-analysis.md +++ b/docs/Developer-guide/Fixing-issues-reported-by-tools-for-static-code-analysis.md @@ -1,7 +1,6 @@ -Static Code Analysis Tools --------------------------- +## Static Code Analysis Tools -Bug fixes for issues reported by *Static Code Analysis Tools* should +Bug fixes for issues reported by _Static Code Analysis Tools_ should follow [Development Work Flow](./Development-Workflow.md) ### Coverity @@ -9,49 +8,48 @@ follow [Development Work Flow](./Development-Workflow.md) GlusterFS is part of [Coverity's](https://scan.coverity.com/) scan program. -- To see Coverity issues you have to be a member of the GlusterFS - project in Coverity scan website. -- Here is the link to [Coverity scan website](https://scan.coverity.com/projects/987) -- Go to above link and subscribe to GlusterFS project (as - contributor). It will send a request to Admin for including you in - the Project. 
-- Once admins for the GlusterFS Coverity scan approve your request, - you will be able to see the defects raised by Coverity. -- [Issue #1060](https://github.com/gluster/glusterfs/issues/1060) - can be used as a umbrella bug for Coverity issues in master - branch unless you are trying to fix a specific issue. -- When you decide to work on some issue, please assign it to your name - in the same Coverity website. So that we don't step on each others - work. -- When marking a bug intentional in Coverity scan website, please put - an explanation for the same. So that it will help others to - understand the reasoning behind it. +- To see Coverity issues you have to be a member of the GlusterFS + project in Coverity scan website. +- Here is the link to [Coverity scan website](https://scan.coverity.com/projects/987) +- Go to above link and subscribe to GlusterFS project (as + contributor). It will send a request to Admin for including you in + the Project. +- Once admins for the GlusterFS Coverity scan approve your request, + you will be able to see the defects raised by Coverity. +- [Issue #1060](https://github.com/gluster/glusterfs/issues/1060) + can be used as a umbrella bug for Coverity issues in master + branch unless you are trying to fix a specific issue. +- When you decide to work on some issue, please assign it to your name + in the same Coverity website. So that we don't step on each others + work. +- When marking a bug intentional in Coverity scan website, please put + an explanation for the same. So that it will help others to + understand the reasoning behind it. -*If you have more questions please send it to +_If you have more questions please send it to [gluster-devel](https://lists.gluster.org/mailman/listinfo/gluster-devel) mailing -list* +list_ ### CPP Check Cppcheck is available in Fedora and EL's EPEL repo -- Install Cppcheck +- Install Cppcheck - # dnf install cppcheck + dnf install cppcheck -- Clone GlusterFS code +- Clone GlusterFS code - # git clone https://github.com/gluster/glusterfs + git clone https://github.com/gluster/glusterfs -- Run Cpp check - - # cppcheck glusterfs/ 2>cppcheck.log +- Run Cpp check + cppcheck glusterfs/ 2>cppcheck.log ### Clang-Scan Daily Runs We have daily runs of static source code analysis tool clang-scan on -the glusterfs sources. There are daily analyses of the master and +the glusterfs sources. There are daily analyses of the master and on currently supported branches. Results are posted at diff --git a/docs/Developer-guide/Projects.md b/docs/Developer-guide/Projects.md index e394315..f204491 100644 --- a/docs/Developer-guide/Projects.md +++ b/docs/Developer-guide/Projects.md @@ -3,9 +3,7 @@ This page contains a list of project ideas which will be suitable for students (for GSOC, internship etc.) -Projects/Features which needs contributors ------------------------------------------- - +## Projects/Features which needs contributors ### RIO @@ -13,27 +11,23 @@ Issue: https://github.com/gluster/glusterfs/issues/243 This is a new distribution logic, which can scale Gluster to 1000s of nodes. - ### Composition xlator for small files Merge small files into a designated large file using our own custom semantics. This can improve our small file performance. - ### Path based geo-replication Issue: https://github.com/gluster/glusterfs/issues/460 This would allow remote volume to be of different type (NFS/S3 etc etc) too. 
- ### Project Quota support Issue: https://github.com/gluster/glusterfs/issues/184 This will make Gluster's Quota faster, and also provide desired behavior. - ### Cluster testing framework based on gluster-tester Repo: https://github.com/aravindavk/gluster-tester diff --git a/docs/Developer-guide/Simplified-Development-Workflow.md b/docs/Developer-guide/Simplified-Development-Workflow.md index c9a9cab..f3261f7 100644 --- a/docs/Developer-guide/Simplified-Development-Workflow.md +++ b/docs/Developer-guide/Simplified-Development-Workflow.md @@ -1,5 +1,4 @@ -Simplified development workflow for GlusterFS -============================================= +# Simplified development workflow for GlusterFS This page gives a simplified model of the development workflow used by the GlusterFS project. This will give the steps required to get a patch @@ -8,8 +7,7 @@ accepted into the GlusterFS source. Visit [Development Work Flow](./Development-Workflow.md) a more detailed description of the workflow. -##Initial preparation ---------------------- +## Initial preparation The GlusterFS development workflow revolves around [GitHub](http://github.com/gluster/glusterfs/) and @@ -17,13 +15,15 @@ The GlusterFS development workflow revolves around Using these both tools requires some initial preparation. ### Get the source + Git clone the GlusterFS source using -```console - git clone git@github.com:${username}/glusterfs.git - cd glusterfs/ - git remote add upstream git@github.com:gluster/glusterfs.git +```{ .console .no-copy } +git clone git@github.com:${username}/glusterfs.git +cd glusterfs/ +git remote add upstream git@github.com:gluster/glusterfs.git ``` + This will clone the GlusterFS source into a subdirectory named glusterfs with the devel branch checked out. @@ -34,7 +34,7 @@ distribution specific package manger to install git. After installation configure git. At the minimum, set a git user email. To set the email do, -```console +```{ .console .no-copy } git config --global user.name git config --global user.email ``` @@ -43,8 +43,7 @@ Next, install the build requirements for GlusterFS. Refer [Building GlusterFS - Build Requirements](./Building-GlusterFS.md#Build Requirements) for the actual requirements. -##Actual development --------------------- +## Actual development The commands in this section are to be run inside the glusterfs source directory. @@ -55,23 +54,25 @@ It is recommended to use separate local development branches for each change you want to contribute to GlusterFS. To create a development branch, first checkout the upstream branch you want to work on and update it. More details on the upstream branching model for GlusterFS -can be found at [Development Work Flow - Branching\_policy](./Development-Workflow.md#branching-policy). +can be found at [Development Work Flow - Branching_policy](./Development-Workflow.md#branching-policy). For example if you want to develop on the devel branch, ```console -# git checkout devel -# git pull +git checkout devel +git pull ``` Now, create a new branch from devel and switch to the new branch. It is recommended to have descriptive branch names. Do, -```console +```{ .console .no-copy } git branch issueNNNN git checkout issueNNNN ``` + or, -```console + +```{ .console .no-copy } git checkout -b issueNNNN upstream/main ``` @@ -100,8 +101,8 @@ working GlusterFS installation and needs to be run as root. 
To run the regression test suite, do ```console -# make install -# ./run-tests.sh +make install +./run-tests.sh ``` or, After uploading the patch The regression tests would be triggered @@ -113,7 +114,7 @@ If you haven't broken anything, you can now commit your changes. First identify the files that you modified/added/deleted using git-status and stage these files. -```console +```{ .console .no-copy } git status git add ``` @@ -121,7 +122,7 @@ git add Now, commit these changes using ```console -# git commit -s +git commit -s ``` Provide a meaningful commit message. The commit message policy is @@ -134,18 +135,19 @@ sign-off the commit with your configured email. To submit your change for review, run the rfc.sh script, ```console -# ./rfc.sh +./rfc.sh ``` + or -```console + +```{ .console .no-copy } git push origin HEAD:issueNNN ``` More details on the rfc.sh script are available at [Development Work Flow - rfc.sh](./Development-Workflow.md#rfc.sh). -##Review process ----------------- +## Review process Your change will now be reviewed by the GlusterFS maintainers and component owners. You can follow and take part in the review process @@ -186,8 +188,9 @@ review comments. Build and test to see if the new changes are working. Stage your changes and commit your new changes in new commits using, ```console -# git commit -a -s +git commit -a -s ``` + Now you can resubmit the commit for review using the rfc.sh script or git push. The formal review process could take a long time. To increase chances diff --git a/docs/Developer-guide/compiling-rpms.md b/docs/Developer-guide/compiling-rpms.md index f28933d..ab4783f 100644 --- a/docs/Developer-guide/compiling-rpms.md +++ b/docs/Developer-guide/compiling-rpms.md @@ -1,5 +1,4 @@ -How to compile GlusterFS RPMs from git source, for RHEL/CentOS, and Fedora --------------------------------------------------------------------------- +## How to compile GlusterFS RPMs from git source, for RHEL/CentOS, and Fedora Creating rpm's of GlusterFS from git source is fairly easy, once you know the steps. @@ -21,13 +20,13 @@ Specific instructions for compiling are below. If you're using: ### Preparation steps for Fedora 16-20 (only) -1. Install gcc, the python development headers, and python setuptools: +1. Install gcc, the python development headers, and python setuptools: - # sudo yum -y install gcc python-devel python-setuptools + sudo yum -y install gcc python-devel python-setuptools -2. If you're compiling GlusterFS version 3.4, then install python-swiftclient. Other GlusterFS versions don't need it: +2. If you're compiling GlusterFS version 3.4, then install python-swiftclient. Other GlusterFS versions don't need it: - # sudo easy_install simplejson python-swiftclient + sudo easy_install simplejson python-swiftclient Now follow through with the **Common Steps** part below. @@ -35,15 +34,15 @@ Now follow through with the **Common Steps** part below. You'll need EPEL installed first and some CentOS-specific packages. The commands below will get that done for you. After that, follow through the "Common steps" section. -1. Install EPEL first: +1. Install EPEL first: - # curl -OL `[`http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm`](http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm) - # sudo yum -y install epel-release-5-4.noarch.rpm --nogpgcheck + curl -OL http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm + sudo yum -y install epel-release-5-4.noarch.rpm --nogpgcheck -2. 
Install the packages required only on CentOS 5.x: +2. Install the packages required only on CentOS 5.x: - # sudo yum -y install buildsys-macros gcc ncurses-devel \ - python-ctypes python-sphinx10 redhat-rpm-config + sudo yum -y install buildsys-macros gcc ncurses-devel \ + python-ctypes python-sphinx10 redhat-rpm-config Now follow through with the **Common Steps** part below. @@ -51,32 +50,31 @@ Now follow through with the **Common Steps** part below. You'll need EPEL installed first and some CentOS-specific packages. The commands below will get that done for you. After that, follow through the "Common steps" section. -1. Install EPEL first: +1. Install EPEL first: - # sudo yum -y install `[`http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm`](http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm) + sudo yum -y install http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm -2. Install the packages required only on CentOS: +2. Install the packages required only on CentOS: - # sudo yum -y install python-webob1.0 python-paste-deploy1.5 python-sphinx10 redhat-rpm-config + sudo yum -y install python-webob1.0 python-paste-deploy1.5 python-sphinx10 redhat-rpm-config Now follow through with the **Common Steps** part below. - ### Preparation steps for CentOS 8.x (only) -You'll need EPEL installed and then the powertools package enabled. +You'll need EPEL installed and then the powertools package enabled. -1. Install EPEL first: - - # sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm +1. Install EPEL first: -2. Enable the PowerTools repo and install CentOS 8.x specific packages for building the rpms. + sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm - # sudo yum --enablerepo=PowerTools install automake autoconf libtool flex bison openssl-devel \ - libxml2-devel libaio-devel libibverbs-devel librdmacm-devel readline-devel lvm2-devel \ - glib2-devel userspace-rcu-devel libcmocka-devel libacl-devel sqlite-devel fuse-devel \ - redhat-rpm-config rpcgen libtirpc-devel make python3-devel rsync libuuid-devel \ - rpm-build dbench perl-Test-Harness attr libcurl-devel selinux-policy-devel -y +2. Enable the PowerTools repo and install CentOS 8.x specific packages for building the rpms. + + sudo yum --enablerepo=PowerTools install automake autoconf libtool flex bison openssl-devel \ + libxml2-devel libaio-devel libibverbs-devel librdmacm-devel readline-devel lvm2-devel \ + glib2-devel userspace-rcu-devel libcmocka-devel libacl-devel sqlite-devel fuse-devel \ + redhat-rpm-config rpcgen libtirpc-devel make python3-devel rsync libuuid-devel \ + rpm-build dbench perl-Test-Harness attr libcurl-devel selinux-policy-devel -y Now follow through from Point 2 in the **Common Steps** part below. @@ -84,14 +82,14 @@ Now follow through from Point 2 in the **Common Steps** part below. You'll need EPEL installed first and some RHEL specific packages. The 2 commands below will get that done for you. After that, follow through the "Common steps" section. -1. Install EPEL first: +1. Install EPEL first: - # sudo yum -y install `[`http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm`](http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm) + sudo yum -y install http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm -2. Install the packages required only on RHEL: +2. 
Install the packages required only on RHEL: - # sudo yum -y --enablerepo=rhel-6-server-optional-rpms install python-webob1.0 \ - python-paste-deploy1.5 python-sphinx10 redhat-rpm-config + sudo yum -y --enablerepo=rhel-6-server-optional-rpms install python-webob1.0 \ + python-paste-deploy1.5 python-sphinx10 redhat-rpm-config Now follow through with the **Common Steps** part below. @@ -104,64 +102,65 @@ These steps are for both Fedora and RHEL/CentOS. At the end you'll have the comp - If you're on RHEL/CentOS 5.x and get a message about lvm2-devel not being available, it's ok. You can ignore it. :) - If you're on RHEL/CentOS 6.x and get any messages about python-eventlet, python-netifaces, python-sphinx and/or pyxattr not being available, it's ok. You can ignore them. :) - If you're on CentOS 8.x, you can skip step 1 and start from step 2. Also, for CentOS 8.x, the steps have been -tested for the master branch. It is unknown if it would work for older branches. + tested for the master branch. It is unknown if it would work for older branches.
-1. Install the needed packages +1. Install the needed packages - # sudo yum -y --disablerepo=rhs* --enablerepo=*optional-rpms install git autoconf \ - automake bison dos2unix flex fuse-devel glib2-devel libaio-devel \ - libattr-devel libibverbs-devel librdmacm-devel libtool libxml2-devel lvm2-devel make \ - openssl-devel pkgconfig pyliblzma python-devel python-eventlet python-netifaces \ - python-paste-deploy python-simplejson python-sphinx python-webob pyxattr readline-devel \ - rpm-build systemtap-sdt-devel tar libcmocka-devel + sudo yum -y --disablerepo=rhs* --enablerepo=*optional-rpms install git autoconf \ + automake bison dos2unix flex fuse-devel glib2-devel libaio-devel \ + libattr-devel libibverbs-devel librdmacm-devel libtool libxml2-devel lvm2-devel make \ + openssl-devel pkgconfig pyliblzma python-devel python-eventlet python-netifaces \ + python-paste-deploy python-simplejson python-sphinx python-webob pyxattr readline-devel \ + rpm-build systemtap-sdt-devel tar libcmocka-devel -2. Clone the GlusterFS git repository +2. Clone the GlusterFS git repository - # git clone `[`git://git.gluster.org/glusterfs`](git://git.gluster.org/glusterfs) - # cd glusterfs + git clone git://git.gluster.org/glusterfs + cd glusterfs -3. Choose which branch to compile +3. Choose which branch to compile If you want to compile the latest development code, you can skip this step and go on to the next one. :) If instead, you want to compile the code for a specific release of GlusterFS (such as v3.4), get the list of release names here: - # git branch -a | grep release - remotes/origin/release-2.0 - remotes/origin/release-3.0 - remotes/origin/release-3.1 - remotes/origin/release-3.2 - remotes/origin/release-3.3 - remotes/origin/release-3.4 - remotes/origin/release-3.5 + # git branch -a | grep release + remotes/origin/release-2.0 + remotes/origin/release-3.0 + remotes/origin/release-3.1 + remotes/origin/release-3.2 + remotes/origin/release-3.3 + remotes/origin/release-3.4 + remotes/origin/release-3.5 Then switch to the correct release using the git "checkout" command, and the name of the release after the "remotes/origin/" bit from the list above: - # git checkout release-3.4 + git checkout release-3.4 **NOTE -** The CentOS 5.x instructions have only been tested for the master branch in GlusterFS git. It is unknown (yet) if they work for branches older than release-3.5. - --- - If you are compiling the latest development code you can skip steps **4** and **5**. Instead, you can run the below command and you will get the RPMs. + *** - - # extras/LinuxRPM/make_glusterrpms - --- + If you are compiling the latest development code you can skip steps **4** and **5**. Instead, you can run the below command and you will get the RPMs. -4. Configure and compile GlusterFS + extras/LinuxRPM/make_glusterrpms + + *** + +4. Configure and compile GlusterFS Now you're ready to compile Gluster: - # ./autogen.sh - # ./configure --enable-fusermount - # make dist + ./autogen.sh + ./configure --enable-fusermount + make dist -5. Create the GlusterFS RPMs +5. Create the GlusterFS RPMs - # cd extras/LinuxRPM - # make glusterrpms + cd extras/LinuxRPM + make glusterrpms That should complete with no errors, leaving you with a directory containing the RPMs. 
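As an illustrative follow-up (assuming the build succeeded; the exact package file names depend on the version or branch you compiled), the resulting packages can be listed and installed straight from that directory:

```{ .console .no-copy }
ls extras/LinuxRPM/*.rpm
sudo yum -y install extras/LinuxRPM/glusterfs*.rpm
```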
diff --git a/docs/Developer-guide/coredump-on-customer-setup.md b/docs/Developer-guide/coredump-on-customer-setup.md index 7c8ae174..734c2b5 100644 --- a/docs/Developer-guide/coredump-on-customer-setup.md +++ b/docs/Developer-guide/coredump-on-customer-setup.md @@ -1,47 +1,52 @@ # Get core dump on a customer set up without killing the process ### Why do we need this? + Finding the root cause of an issue that occurred in the customer/production setup is a challenging task. Most of the time we cannot replicate/setup the environment and scenario which is leading to the issue on our test setup. In such cases, we got to grab most of the information from the system where the problem has occurred. -
+ ### What information we look for and also useful? + The information like a core dump is very helpful to catch the root cause of an issue by adding ASSERT() in the code at the places where we feel something is wrong and install the custom build on the affected setup. But the issue is ASSERT() would kill the process and produce the core dump. -
+ ### Is it a good idea to do ASSERT() on customer setup? -Remember we are seeking help from customer setup, they unlikely agree to kill the process and produce the + +Remember we are seeking help from customer setup, they unlikely agree to kill the process and produce the core dump for us to root cause it. It affects the customer’s business and nobody agrees with this proposal. -
+ ### What if we have a way to produce a core dump without a kill? -Yes, Glusterfs provides a way to do this. Gluster has customized ASSERT() i.e GF_ASSERT() in place which helps -in producing the core dump without killing the associated process and also provides a script which can be run on -the customer set up that produces the core dump without harming the running process (This presumes we already have -GF_ASSERT() at the expected place in the current build running on customer setup. If not, we need to install custom + +Yes, Glusterfs provides a way to do this. Gluster has customized ASSERT() i.e GF_ASSERT() in place which helps +in producing the core dump without killing the associated process and also provides a script which can be run on +the customer set up that produces the core dump without harming the running process (This presumes we already have +GF_ASSERT() at the expected place in the current build running on customer setup. If not, we need to install custom build on that setup by adding GF_ASSERT()). -
+ ### Is GF_ASSERT() newly introduced in Gluster code? -No. GF_ASSERT() is already there in the codebase before this improvement. In the debug build, GF_ASSERT() kills the -process and produces the core dump but in the production build, it just logs the error and moves on. What we have done -is we just changed the implementation of the code and now in production build also we get the core dump but the process + +No. GF_ASSERT() is already there in the codebase before this improvement. In the debug build, GF_ASSERT() kills the +process and produces the core dump but in the production build, it just logs the error and moves on. What we have done +is we just changed the implementation of the code and now in production build also we get the core dump but the process won’t be killed. The code places where GF_ASSERT() is not covered, please add it as per the requirement. -
## Here are the steps to achieve the goal: -- Add GF_ASSERT() in the Gluster code path where you expect something wrong is happening. -- Build the Gluster code, install and mount the Gluster volume (For detailed steps refer: Gluster quick start guide). -- Now, in the other terminal, run the gfcore.py script - `# ./extras/debug/gfcore.py $PID 1 /tmp/` (PID of the gluster process you are interested in, got it by `ps -ef | grep gluster` - in the previous step. For more details, check `# ./extras/debug/gfcore.py --help`) -- Hit the code path where you have introduced GF_ASSERT(). If GF_ASSERT() is in fuse_write() path, you can hit the code - path by writing on to a file present under Gluster moun. Ex: `# dd if=/dev/zero of=/mnt/glustrefs/abcd bs=1M count=1` - where `/mnt/glusterfs` is the gluster mount -- Go to the terminal where the gdb is running (step 3) and observe that the gdb process is terminated -- Go to the directory where the core-dump is produced. Default would be present working directory. -- Access the core dump using gdb Ex: `# gdb -ex "core-file $GFCORE_FILE" $GLUSTER_BINARY` - (1st arg would be core file name and 2nd arg is o/p of file command in the previous step) -- Observe that the Gluster process is unaffected by checking its process state. Check pid status using `ps -ef | grep gluster` -
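The same flow, condensed into a rough sketch (the PID, mount path, core file name and binary are illustrative and depend on your setup):

```{ .console .no-copy }
ps -ef | grep gluster                                   # find the PID of the process built with your GF_ASSERT()
./extras/debug/gfcore.py $PID 1 /tmp/                   # arm gdb to capture one core dump into /tmp/
dd if=/dev/zero of=/mnt/glusterfs/abcd bs=1M count=1    # trigger the instrumented code path (fuse_write example)
gdb -ex "core-file $GFCORE_FILE" $GLUSTER_BINARY        # inspect the core; the gluster process keeps running
```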
-Thanks, Xavi Hernandez(jahernan@redhat.com) for the idea. This will ease many Gluster developer's/maintainer’s life. + +- Add GF_ASSERT() in the Gluster code path where you expect something wrong is happening. +- Build the Gluster code, install and mount the Gluster volume (For detailed steps refer: Gluster quick start guide). +- Now, in the other terminal, run the gfcore.py script + `# ./extras/debug/gfcore.py $PID 1 /tmp/` (PID of the gluster process you are interested in, got it by `ps -ef | grep gluster` + in the previous step. For more details, check `# ./extras/debug/gfcore.py --help`) +- Hit the code path where you have introduced GF_ASSERT(). If GF_ASSERT() is in fuse_write() path, you can hit the code + path by writing on to a file present under Gluster moun. Ex: `# dd if=/dev/zero of=/mnt/glustrefs/abcd bs=1M count=1` + where `/mnt/glusterfs` is the gluster mount +- Go to the terminal where the gdb is running (step 3) and observe that the gdb process is terminated +- Go to the directory where the core-dump is produced. Default would be present working directory. +- Access the core dump using gdb Ex: `# gdb -ex "core-file $GFCORE_FILE" $GLUSTER_BINARY` + (1st arg would be core file name and 2nd arg is o/p of file command in the previous step) +- Observe that the Gluster process is unaffected by checking its process state. Check pid status using `ps -ef | grep gluster` + + Thanks, Xavi Hernandez(jahernan@redhat.com) for the idea. This will ease many Gluster developer's/maintainer’s life. From 0a3f07bd4a9943a18ef1f4defd5225b6f5b975c5 Mon Sep 17 00:00:00 2001 From: black-dragon74 Date: Fri, 3 Jun 2022 16:17:31 +0530 Subject: [PATCH 04/21] [troubleshooting] Fix AFR and Split brain pages and cleanup the syntax Signed-off-by: black-dragon74 --- docs/Troubleshooting/README.md | 8 +- docs/Troubleshooting/gfid-to-path.md | 12 +- docs/Troubleshooting/gluster-crash.md | 14 +- docs/Troubleshooting/resolving-splitbrain.md | 381 ++++++++++-------- docs/Troubleshooting/statedump.md | 90 ++--- docs/Troubleshooting/troubleshooting-afr.md | 142 ++++--- .../troubleshooting-filelocks.md | 18 +- .../Troubleshooting/troubleshooting-georep.md | 92 +++-- .../troubleshooting-glusterd.md | 72 ++-- docs/Troubleshooting/troubleshooting-gnfs.md | 53 ++- .../Troubleshooting/troubleshooting-memory.md | 4 +- 11 files changed, 471 insertions(+), 415 deletions(-) diff --git a/docs/Troubleshooting/README.md b/docs/Troubleshooting/README.md index 0741662..4ec0122 100644 --- a/docs/Troubleshooting/README.md +++ b/docs/Troubleshooting/README.md @@ -1,9 +1,8 @@ -Troubleshooting Guide ---------------------- +## Troubleshooting Guide + This guide describes some commonly seen issues and steps to recover from them. If that doesn’t help, reach out to the [Gluster community](https://www.gluster.org/community/), in which case the guide also describes what information needs to be provided in order to debug the issue. At minimum, we need the version of gluster running and the output of `gluster volume info`. - ### Where Do I Start? Is the issue already listed in the component specific troubleshooting sections? @@ -15,7 +14,6 @@ Is the issue already listed in the component specific troubleshooting sections? - [Gluster NFS Issues](./troubleshooting-gnfs.md) - [File Locks](./troubleshooting-filelocks.md) - If that didn't help, here is how to debug further. Identifying the problem and getting the necessary information to diagnose it is the first step in troubleshooting your Gluster setup. 
As Gluster operations involve interactions between multiple processes, this can involve multiple steps. @@ -25,5 +23,3 @@ Identifying the problem and getting the necessary information to diagnose it is - An operation failed - [High Memory Usage](./troubleshooting-memory.md) - [A Gluster process crashed](./gluster-crash.md) - - diff --git a/docs/Troubleshooting/gfid-to-path.md b/docs/Troubleshooting/gfid-to-path.md index 275fb71..3a25a1b 100644 --- a/docs/Troubleshooting/gfid-to-path.md +++ b/docs/Troubleshooting/gfid-to-path.md @@ -8,24 +8,26 @@ normal filesystem. The GFID of a file is stored in its xattr named #### Special mount using gfid-access translator: ```console -# mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol +mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol ``` Assuming, you have `GFID` of a file from changelog (or somewhere else). For trying this out, you can get `GFID` of a file from mountpoint: ```console -# getfattr -n glusterfs.gfid.string /mnt/testvol/dir/file +getfattr -n glusterfs.gfid.string /mnt/testvol/dir/file ``` --- + ### Get file path from GFID (Method 1): + **(Lists hardlinks delimited by `:`, returns path as seen from mountpoint)** #### Turn on build-pgfid option ```console -# gluster volume set test build-pgfid on +gluster volume set test build-pgfid on ``` Read virtual xattr `glusterfs.ancestry.path` which contains the file path @@ -36,7 +38,7 @@ getfattr -n glusterfs.ancestry.path -e text /mnt/testvol/.gfid/ **Example:** -```console +```{ .console .no-copy } [root@vm1 glusterfs]# ls -il /mnt/testvol/dir/ total 1 10610563327990022372 -rw-r--r--. 2 root root 3 Jul 17 18:05 file @@ -54,6 +56,7 @@ glusterfs.ancestry.path="/dir/file:/dir/file3" ``` ### Get file path from GFID (Method 2): + **(Does not list all hardlinks, returns backend brick path)** ```console @@ -70,4 +73,5 @@ trusted.glusterfs.pathinfo="( info` This lists all the files that require healing (and will be processed by the self-heal daemon). It prints either their path or their GFID. ### Interpreting the output + All the files listed in the output of this command need to be healed. The files listed may also be accompanied by the following tags: a) 'Is in split-brain' -A file in data or metadata split-brain will -be listed with " - Is in split-brain" appended after its path/GFID. E.g. +A file in data or metadata split-brain will +be listed with " - Is in split-brain" appended after its path/GFID. E.g. "/file4" in the output provided below. However, for a file in GFID split-brain, - the parent directory of the file is shown to be in split-brain and the file -itself is shown to be needing healing, e.g. "/dir" in the output provided below +the parent directory of the file is shown to be in split-brain and the file +itself is shown to be needing healing, e.g. "/dir" in the output provided below is in split-brain because of GFID split-brain of file "/dir/a". Files in split-brain cannot be healed without resolving the split-brain. @@ -36,11 +37,13 @@ b) 'Is possibly undergoing heal' When the heal info command is run, it (or to be more specific, the 'glfsheal' binary that is executed when you run the command) takes locks on each file to find if it needs healing. However, if the self-heal daemon had already started healing the file, it would have taken locks which glfsheal wouldn't be able to acquire. In such a case, it could print this message. Another possible case could be multiple glfsheal processes running simultaneously (e.g. 
multiple users ran a heal info command at the same time) and competing for same lock. The following is an example of heal info command's output. + ### Example + Consider a replica volume "test" with two bricks b1 and b2; self-heal daemon off, mounted at /mnt. -```console +```{ .console .no-copy } # gluster volume heal test info Brick \ - Is in split-brain @@ -63,24 +66,27 @@ Number of entries: 6 ``` ### Analysis of the output -It can be seen that -A) from brick b1, four entries need healing: -      1) file with gfid:6dc78b20-7eb6-49a3-8edb-087b90142246 needs healing -      2) "aaca219f-0e25-4576-8689-3bfd93ca70c2", -"39f301ae-4038-48c2-a889-7dac143e82dd" and "c3c94de2-232d-4083-b534-5da17fc476ac" - are in split-brain -B) from brick b2 six entries need healing- -      1) "a", "file2" and "file3" need healing -      2) "file1", "file4" & "/dir" are in split-brain +It can be seen that + +A) from brick b1, four entries need healing: + +- file with gfid:6dc78b20-7eb6-49a3-8edb-087b90142246 needs healing +- "aaca219f-0e25-4576-8689-3bfd93ca70c2", "39f301ae-4038-48c2-a889-7dac143e82dd" and "c3c94de2-232d-4083-b534-5da17fc476ac" are in split-brain + +B) from brick b2 six entries need healing- + +- "a", "file2" and "file3" need healing +- "file1", "file4" & "/dir" are in split-brain # 2. Volume heal info split-brain + Usage: `gluster volume heal info split-brain` This command only shows the list of files that are in split-brain. The output is therefore a subset of `gluster volume heal info` ### Example -```console +```{ .console .no-copy } # gluster volume heal test info split-brain Brick @@ -95,19 +101,22 @@ Brick Number of entries in split-brain: 3 ``` -Note that similar to the heal info command, for GFID split-brains (same filename but different GFID) +Note that similar to the heal info command, for GFID split-brains (same filename but different GFID) their parent directories are listed to be in split-brain. # 3. Resolution of split-brain using gluster CLI + Once the files in split-brain are identified, their resolution can be done from the gluster command line using various policies. Type-mismatch cannot be healed using this methods. Split-brain resolution commands let the user resolve data, metadata, and GFID split-brains. ## 3.1 Resolution of data/metadata split-brain using gluster CLI + Data and metadata split-brains can be resolved using the following policies: ## i) Select the bigger-file as source + This command is useful for per file healing where it is known/decided that the -file with bigger size is to be considered as source. +file with bigger size is to be considered as source. `gluster volume heal split-brain bigger-file ` Here, `` can be either the full file name as seen from the root of the volume (or) the GFID-string representation of the file, which sometimes gets displayed @@ -115,13 +124,14 @@ in the heal info command's output. Once this command is executed, the replica co size is found and healing is completed with that brick as a source. ### Example : + Consider the earlier output of the heal info split-brain command. 
-Before healing the file, notice file size and md5 checksums : +Before healing the file, notice file size and md5 checksums : On brick b1: -```console +```{ .console .no-copy } [brick1]# stat b1/dir/file1 File: ‘b1/dir/file1’ Size: 17 Blocks: 16 IO Block: 4096 regular file @@ -138,7 +148,7 @@ Change: 2015-03-06 13:55:37.206880347 +0530 On brick b2: -```console +```{ .console .no-copy } [brick2]# stat b2/dir/file1 File: ‘b2/dir/file1’ Size: 13 Blocks: 16 IO Block: 4096 regular file @@ -153,7 +163,7 @@ Change: 2015-03-06 13:52:22.910758923 +0530 cb11635a45d45668a403145059c2a0d5 b2/dir/file1 ``` -**Healing file1 using the above command** :- +**Healing file1 using the above command** :- `gluster volume heal test split-brain bigger-file /dir/file1` Healed /dir/file1. @@ -161,7 +171,7 @@ After healing is complete, the md5sum and file size on both bricks should be the On brick b1: -```console +```{ .console .no-copy } [brick1]# stat b1/dir/file1 File: ‘b1/dir/file1’ Size: 17 Blocks: 16 IO Block: 4096 regular file @@ -178,7 +188,7 @@ Change: 2015-03-06 14:17:12.880343950 +0530 On brick b2: -```console +```{ .console .no-copy } [brick2]# stat b2/dir/file1 File: ‘b2/dir/file1’ Size: 17 Blocks: 16 IO Block: 4096 regular file @@ -195,7 +205,7 @@ Change: 2015-03-06 14:17:12.881343955 +0530 ## ii) Select the file with the latest mtime as source -```console +```{ .console .no-copy } gluster volume heal split-brain latest-mtime ``` @@ -203,20 +213,21 @@ As is perhaps self-explanatory, this command uses the brick having the latest mo ## iii) Select one of the bricks in the replica as the source for a particular file -```console +```{ .console .no-copy } gluster volume heal split-brain source-brick ``` Here, `` is selected as source brick and `` present in the source brick is taken as the source for healing. ### Example : + Notice the md5 checksums and file size before and after healing. Before heal : On brick b1: -```console +```{ .console .no-copy } [brick1]# stat b1/file4 File: ‘b1/file4’ Size: 4 Blocks: 16 IO Block: 4096 regular file @@ -233,7 +244,7 @@ b6273b589df2dfdbd8fe35b1011e3183 b1/file4 On brick b2: -```console +```{ .console .no-copy } [brick2]# stat b2/file4 File: ‘b2/file4’ Size: 4 Blocks: 16 IO Block: 4096 regular file @@ -251,7 +262,7 @@ Change: 2015-03-06 13:52:35.769833142 +0530 **Healing the file with gfid c3c94de2-232d-4083-b534-5da17fc476ac using the above command** : ```console -# gluster volume heal test split-brain source-brick test-host:/test/b1 gfid:c3c94de2-232d-4083-b534-5da17fc476ac +gluster volume heal test split-brain source-brick test-host:/test/b1 gfid:c3c94de2-232d-4083-b534-5da17fc476ac ``` Healed gfid:c3c94de2-232d-4083-b534-5da17fc476ac. @@ -260,7 +271,7 @@ After healing : On brick b1: -```console +```{ .console .no-copy } # stat b1/file4 File: ‘b1/file4’ Size: 4 Blocks: 16 IO Block: 4096 regular file @@ -276,7 +287,7 @@ b6273b589df2dfdbd8fe35b1011e3183 b1/file4 On brick b2: -```console +```{ .console .no-copy } # stat b2/file4 File: ‘b2/file4’ Size: 4 Blocks: 16 IO Block: 4096 regular file @@ -292,7 +303,7 @@ b6273b589df2dfdbd8fe35b1011e3183 b2/file4 ## iv) Select one brick of the replica as the source for all files -```console +```{ .console .no-copy } gluster volume heal split-brain source-brick ``` @@ -301,9 +312,10 @@ replica pair is source. As the result of the above command all split-brained files in `` are selected as source and healed to the sink. ### Example: + Consider a volume having three entries "a, b and c" in split-brain. 
-```console +```{ .console .no-copy } # gluster volume heal test split-brain source-brick test-host:/test/b1 Healed gfid:944b4764-c253-4f02-b35f-0d0ae2f86c0f. Healed gfid:3256d814-961c-4e6e-8df2-3a3143269ced. @@ -312,19 +324,24 @@ Number of healed entries: 3 ``` # 3.2 Resolution of GFID split-brain using gluster CLI + GFID split-brains can also be resolved by the gluster command line using the same policies that are used to resolve data and metadata split-brains. ## i) Selecting the bigger-file as source + This method is useful for per file healing and where you can decided that the file with bigger size is to be considered as source. Run the following command to obtain the path of the file that is in split-brain: -```console + +```{ .console .no-copy } # gluster volume heal VOLNAME info split-brain ``` From the output, identify the files for which file operations performed from the client failed with input/output error. + ### Example : -```console + +```{ .console .no-copy } # gluster volume heal testvol info Brick 10.70.47.45:/bricks/brick2/b0 /f5 @@ -340,19 +357,22 @@ Brick 10.70.47.144:/bricks/brick2/b1 Status: Connected Number of entries: 2 ``` + > **Note** > Entries which are in GFID split-brain may not be shown as in split-brain by the heal info or heal info split-brain commands always. For entry split-brains, it is the parent directory which is shown as being in split-brain. So one might need to run info split-brain to get the dir names and then heal info to get the list of files under that dir which might be in split-brain (it could just be needing heal without split-brain). In the above command, testvol is the volume name, b0 and b1 are the bricks. Execute the below getfattr command on the brick to fetch information if a file is in GFID split-brain or not. -```console +```{ .console .no-copy } # getfattr -d -e hex -m. ``` ### Example : + On brick /b0 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b0/f5 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b0/f5 @@ -364,7 +384,8 @@ trusted.gfid2path.9cde09916eabc845=0x30303030303030302d303030302d303030302d30303 ``` On brick /b1 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b1/f5 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b1/f5 @@ -379,7 +400,8 @@ You can notice the difference in GFID for the file f5 in both the bricks. You can find the differences in the file size by executing stat command on the file from the bricks. On brick /b0 -```console + +```{ .console .no-copy } # stat /bricks/brick2/b0/f5 File: ‘/bricks/brick2/b0/f5’ Size: 15 Blocks: 8 IO Block: 4096 regular file @@ -393,7 +415,8 @@ Birth: - ``` On brick /b1 -```console + +```{ .console .no-copy } # stat /bricks/brick2/b1/f5 File: ‘/bricks/brick2/b1/f5’ Size: 2 Blocks: 8 IO Block: 4096 regular file @@ -408,12 +431,13 @@ Birth: - Execute the following command along with the full filename as seen from the root of the volume which is displayed in the heal info command's output: -```console +```{ .console .no-copy } # gluster volume heal VOLNAME split-brain bigger-file FILE ``` ### Example : -```console + +```{ .console .no-copy } # gluster volume heal testvol split-brain bigger-file /f5 GFID split-brain resolved for file /f5 ``` @@ -421,7 +445,8 @@ GFID split-brain resolved for file /f5 After the healing is complete, the GFID of the file on both the bricks must be the same as that of the file which had the bigger size. 
The following is a sample output of the getfattr command after completion of healing the file. On brick /b0 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b0/f5 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b0/f5 @@ -431,7 +456,8 @@ trusted.gfid2path.9cde09916eabc845=0x30303030303030302d303030302d303030302d30303 ``` On brick /b1 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b1/f5 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b1/f5 @@ -441,14 +467,16 @@ trusted.gfid2path.9cde09916eabc845=0x30303030303030302d303030302d303030302d30303 ``` ## ii) Selecting the file with latest mtime as source + This method is useful for per file healing and if you want the file with latest mtime has to be considered as source. ### Example : + Lets take another file which is in GFID split-brain and try to heal that using the latest-mtime option. On brick /b0 -```console +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b0/f4 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b0/f4 @@ -460,7 +488,8 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303 ``` On brick /b1 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b1/f4 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b1/f4 @@ -475,7 +504,8 @@ You can notice the difference in GFID for the file f4 in both the bricks. You can find the difference in the modification time by executing stat command on the file from the bricks. On brick /b0 -```console + +```{ .console .no-copy } # stat /bricks/brick2/b0/f4 File: ‘/bricks/brick2/b0/f4’ Size: 14 Blocks: 8 IO Block: 4096 regular file @@ -489,7 +519,8 @@ Birth: - ``` On brick /b1 -```console + +```{ .console .no-copy } # stat /bricks/brick2/b1/f4 File: ‘/bricks/brick2/b1/f4’ Size: 2 Blocks: 8 IO Block: 4096 regular file @@ -503,12 +534,14 @@ Birth: - ``` Execute the following command: -```console + +```{ .console .no-copy } # gluster volume heal VOLNAME split-brain latest-mtime FILE ``` ### Example : -```console + +```{ .console .no-copy } # gluster volume heal testvol split-brain latest-mtime /f4 GFID split-brain resolved for file /f4 ``` @@ -516,7 +549,9 @@ GFID split-brain resolved for file /f4 After the healing is complete, the GFID of the files on both bricks must be same. The following is a sample output of the getfattr command after completion of healing the file. You can notice that the file has been healed using the brick having the latest mtime as the source. On brick /b0 -```console# getfattr -d -m . -e hex /bricks/brick2/b0/f4 + +```{ .console .no-copy } +# getfattr -d -m . -e hex /bricks/brick2/b0/f4 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b0/f4 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 @@ -525,7 +560,8 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303 ``` On brick /b1 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b1/f4 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b1/f4 @@ -535,13 +571,16 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303 ``` ## iii) Select one of the bricks in the replica as source for a particular file + This method is useful for per file healing and if you know which copy of the file is good. 
### Example : + Lets take another file which is in GFID split-brain and try to heal that using the source-brick option. On brick /b0 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b0/f3 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b0/f3 @@ -553,7 +592,8 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303 ``` On brick /b1 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b1/f3 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b0/f3 @@ -567,14 +607,16 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303 You can notice the difference in GFID for the file f3 in both the bricks. Execute the following command: -```console + +```{ .console .no-copy } # gluster volume heal VOLNAME split-brain source-brick HOSTNAME:export-directory-absolute-path FILE ``` In this command, FILE present in HOSTNAME : export-directory-absolute-path is taken as source for healing. ### Example : -```console + +```{ .console .no-copy } # gluster volume heal testvol split-brain source-brick 10.70.47.144:/bricks/brick2/b1 /f3 GFID split-brain resolved for file /f3 ``` @@ -582,7 +624,8 @@ GFID split-brain resolved for file /f3 After the healing is complete, the GFID of the file on both the bricks should be same as that of the brick which was chosen as source for healing. The following is a sample output of the getfattr command after the file is healed. On brick /b0 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b0/f3 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b0/f3 @@ -592,7 +635,8 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303 ``` On brick /b1 -```console + +```{ .console .no-copy } # getfattr -d -m . -e hex /bricks/brick2/b1/f3 getfattr: Removing leading '/' from absolute path names file: bricks/brick2/b1/f3 @@ -602,19 +646,22 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303 ``` > **Note** ->- One cannot use the GFID of the file as an argument with any of the CLI options to resolve GFID split-brain. It should be the absolute path as seen from the mount point to the file considered as source. > ->- With source-brick option there is no way to resolve all the GFID split-brain in one shot by not specifying any file path in the CLI as done while resolving data or metadata split-brain. For each file in GFID split-brain, run the CLI with the policy you want to use. +> - One cannot use the GFID of the file as an argument with any of the CLI options to resolve GFID split-brain. It should be the absolute path as seen from the mount point to the file considered as source. > ->- Resolving directory GFID split-brain using CLI with the "source-brick" option in a "distributed-replicated" volume needs to be done on all the sub-volumes explicitly, which are in this state. Since directories get created on all the sub-volumes, using one particular brick as source for directory GFID split-brain heals the directory for that particular sub-volume. Source brick should be chosen in such a way that after heal all the bricks of all the sub-volumes have the same GFID. +> - With source-brick option there is no way to resolve all the GFID split-brain in one shot by not specifying any file path in the CLI as done while resolving data or metadata split-brain. 
For each file in GFID split-brain, run the CLI with the policy you want to use. +> +> - Resolving directory GFID split-brain using CLI with the "source-brick" option in a "distributed-replicated" volume needs to be done on all the sub-volumes explicitly, which are in this state. Since directories get created on all the sub-volumes, using one particular brick as source for directory GFID split-brain heals the directory for that particular sub-volume. Source brick should be chosen in such a way that after heal all the bricks of all the sub-volumes have the same GFID. ## Note: + As mentioned earlier, type-mismatch can not be resolved using CLI. Type-mismatch means different st_mode values (for example, the entry is a file in one brick while it is a directory on the other). Trying to heal such entry would fail. ### Example + The entry named "entry1" is of different types on the bricks of the replica. Lets try to heal that using the split-brain CLI. -```console +```{ .console .no-copy } # gluster volume heal test split-brain source-brick test-host:/test/b1 /entry1 Healing /entry1 failed:Operation not permitted. Volume heal failed. @@ -623,22 +670,23 @@ Volume heal failed. However, they can be fixed by deleting the file from all but one bricks. See [Fixing Directory entry split-brain](#dir-split-brain) # An overview of working of heal info commands -When these commands are invoked, a "glfsheal" process is spawned which reads -the entries from the various sub-directories under `//.glusterfs/indices/` of all -the bricks that are up (that it can connect to) one after another. These -entries are GFIDs of files that might need healing. Once GFID entries from a -brick are obtained, based on the lookup response of this file on each -participating brick of replica-pair & trusted.afr.* extended attributes it is -found out if the file needs healing, is in split-brain etc based on the + +When these commands are invoked, a "glfsheal" process is spawned which reads +the entries from the various sub-directories under `//.glusterfs/indices/` of all +the bricks that are up (that it can connect to) one after another. These +entries are GFIDs of files that might need healing. Once GFID entries from a +brick are obtained, based on the lookup response of this file on each +participating brick of replica-pair & trusted.afr.\* extended attributes it is +found out if the file needs healing, is in split-brain etc based on the requirement of each command and displayed to the user. - # 4. Resolution of split-brain from the mount point + A set of getfattr and setfattr commands have been provided to detect the data and metadata split-brain status of a file and resolve split-brain, if any, from mount point. Consider a volume "test", having bricks b0, b1, b2 and b3. -```console +```{ .console .no-copy } # gluster volume info test Volume Name: test @@ -656,7 +704,7 @@ Brick4: test-host:/test/b3 Directory structure of the bricks is as follows: -```console +```{ .console .no-copy } # tree -R /test/b? /test/b0 ├── dir @@ -683,7 +731,7 @@ Directory structure of the bricks is as follows: Some files in the volume are in split-brain. 
-```console +```{ .console .no-copy } # gluster v heal test info split-brain Brick test-host:/test/b0/ /file100 @@ -708,7 +756,7 @@ Number of entries in split-brain: 2 ### To know data/metadata split-brain status of a file: -```console +```{ .console .no-copy } getfattr -n replica.split-brain-status ``` @@ -716,50 +764,52 @@ The above command executed from mount provides information if a file is in data/ This command is not applicable to gfid/directory split-brain. ### Example: -1) "file100" is in metadata split-brain. Executing the above mentioned command for file100 gives : -```console +1. "file100" is in metadata split-brain. Executing the above mentioned command for file100 gives : + +```{ .console .no-copy } # getfattr -n replica.split-brain-status file100 file: file100 replica.split-brain-status="data-split-brain:no metadata-split-brain:yes Choices:test-client-0,test-client-1" ``` -2) "file1" is in data split-brain. +2. "file1" is in data split-brain. -```console +```{ .console .no-copy } # getfattr -n replica.split-brain-status file1 file: file1 replica.split-brain-status="data-split-brain:yes metadata-split-brain:no Choices:test-client-2,test-client-3" ``` -3) "file99" is in both data and metadata split-brain. +3. "file99" is in both data and metadata split-brain. -```console +```{ .console .no-copy } # getfattr -n replica.split-brain-status file99 file: file99 replica.split-brain-status="data-split-brain:yes metadata-split-brain:yes Choices:test-client-2,test-client-3" ``` -4) "dir" is in directory split-brain but as mentioned earlier, the above command is not applicable to such split-brain. So it says that the file is not under data or metadata split-brain. +4. "dir" is in directory split-brain but as mentioned earlier, the above command is not applicable to such split-brain. So it says that the file is not under data or metadata split-brain. -```console +```{ .console .no-copy } # getfattr -n replica.split-brain-status dir file: dir replica.split-brain-status="The file is not under data or metadata split-brain" ``` -5) "file2" is not in any kind of split-brain. +5. "file2" is not in any kind of split-brain. -```console +```{ .console .no-copy } # getfattr -n replica.split-brain-status file2 file: file2 replica.split-brain-status="The file is not under data or metadata split-brain" ``` ### To analyze the files in data and metadata split-brain + Trying to do operations (say cat, getfattr etc) from the mount on files in split-brain, gives an input/output error. To enable the users analyze such files, a setfattr command is provided. -```console +```{ .console .no-copy } # setfattr -n replica.split-brain-choice -v "choiceX" ``` @@ -767,9 +817,9 @@ Using this command, a particular brick can be chosen to access the file in split ### Example: -1) "file1" is in data-split-brain. Trying to read from the file gives input/output error. +1. "file1" is in data-split-brain. Trying to read from the file gives input/output error. -```console +```{ .console .no-copy } # cat file1 cat: file1: Input/output error ``` @@ -778,13 +828,13 @@ Split-brain choices provided for file1 were test-client-2 and test-client-3. Setting test-client-2 as split-brain choice for file1 serves reads from b2 for the file. -```console +```{ .console .no-copy } # setfattr -n replica.split-brain-choice -v test-client-2 file1 ``` Now, read operations on the file can be done. 
-```console +```{ .console .no-copy } # cat file1 xyz ``` @@ -793,18 +843,18 @@ Similarly, to inspect the file from other choice, replica.split-brain-choice is Trying to inspect the file from a wrong choice errors out. -To undo the split-brain-choice that has been set, the above mentioned setfattr command can be used +To undo the split-brain-choice that has been set, the above mentioned setfattr command can be used with "none" as the value for extended attribute. ### Example: -```console +```{ .console .no-copy } # setfattr -n replica.split-brain-choice -v none file1 ``` Now performing cat operation on the file will again result in input/output error, as before. -```console +```{ .console .no-copy } # cat file cat: file1: Input/output error ``` @@ -812,13 +862,13 @@ cat: file1: Input/output error Once the choice for resolving split-brain is made, source brick is supposed to be set for the healing to be done. This is done using the following command: -```console +```{ .console .no-copy } # setfattr -n replica.split-brain-heal-finalize -v ``` ## Example -```console +```{ .console .no-copy } # setfattr -n replica.split-brain-heal-finalize -v test-client-2 file1 ``` @@ -826,18 +876,19 @@ The above process can be used to resolve data and/or metadata split-brain on all **NOTE**: -1) If "fopen-keep-cache" fuse mount option is disabled then inode needs to be invalidated each time before selecting a new replica.split-brain-choice to inspect a file. This can be done by using: +1. If "fopen-keep-cache" fuse mount option is disabled then inode needs to be invalidated each time before selecting a new replica.split-brain-choice to inspect a file. This can be done by using: -```console +```{ .console .no-copy } # sefattr -n inode-invalidate -v 0 ``` -2) The above mentioned process for split-brain resolution from mount will not work on nfs mounts as it doesn't provide xattrs support. +2. The above mentioned process for split-brain resolution from mount will not work on nfs mounts as it doesn't provide xattrs support. # 5. Automagic unsplit-brain by [ctime|mtime|size|majority] -The CLI and fuse mount based resolution methods require intervention in the sense that the admin/ user needs to run the commands manually. There is a `cluster.favorite-child-policy` volume option which when set to one of the various policies available, automatically resolve split-brains without user intervention. The default value is 'none', i.e. it is disabled. -```console +The CLI and fuse mount based resolution methods require intervention in the sense that the admin/ user needs to run the commands manually. There is a `cluster.favorite-child-policy` volume option which when set to one of the various policies available, automatically resolve split-brains without user intervention. The default value is 'none', i.e. it is disabled. + +```{ .console .no-copy } # gluster volume set help | grep -A3 cluster.favorite-child-policy Option: cluster.favorite-child-policy Default Value: none @@ -846,40 +897,41 @@ Description: This option can be used to automatically resolve split-brains using `cluster.favorite-child-policy` applies to all files of the volume. It is assumed that if this option is enabled with a particular policy, you don't care to examine the split-brain files on a per file basis but just want the split-brain to be resolved as and when it occurs based on the set policy. - - # Manual Split-Brain Resolution: -Quick Start: -============ -1. 
Get the path of the file that is in split-brain: -> It can be obtained either by -> a) The command `gluster volume heal info split-brain`. -> b) Identify the files for which file operations performed - from the client keep failing with Input/Output error. +# Quick Start: -2. Close the applications that opened this file from the mount point. -In case of VMs, they need to be powered-off. +1. Get the path of the file that is in split-brain: -3. Decide on the correct copy: -> This is done by observing the afr changelog extended attributes of the file on -the bricks using the getfattr command; then identifying the type of split-brain -(data split-brain, metadata split-brain, entry split-brain or split-brain due to -gfid-mismatch); and finally determining which of the bricks contains the 'good copy' -of the file. -> `getfattr -d -m . -e hex `. -It is also possible that one brick might contain the correct data while the -other might contain the correct metadata. + > It can be obtained either by + > a) The command `gluster volume heal info split-brain`. + > b) Identify the files for which file operations performed from the client keep failing with Input/Output error. -4. Reset the relevant extended attribute on the brick(s) that contains the -'bad copy' of the file data/metadata using the setfattr command. -> `setfattr -n -v ` +1. Close the applications that opened this file from the mount point. + In case of VMs, they need to be powered-off. -5. Trigger self-heal on the file by performing lookup from the client: -> `ls -l ` +1. Decide on the correct copy: + + > This is done by observing the afr changelog extended attributes of the file on + > the bricks using the getfattr command; then identifying the type of split-brain + > (data split-brain, metadata split-brain, entry split-brain or split-brain due to + > gfid-mismatch); and finally determining which of the bricks contains the 'good copy' + > of the file. + > `getfattr -d -m . -e hex `. + > It is also possible that one brick might contain the correct data while the + > other might contain the correct metadata. + +1. Reset the relevant extended attribute on the brick(s) that contains the + 'bad copy' of the file data/metadata using the setfattr command. + + > `setfattr -n -v ` + +1. Trigger self-heal on the file by performing lookup from the client: + + > `ls -l ` + +# Detailed Instructions for steps 3 through 5: -Detailed Instructions for steps 3 through 5: -=========================================== To understand how to resolve split-brain we need to know how to interpret the afr changelog extended attributes. @@ -887,7 +939,7 @@ Execute `getfattr -d -m . -e hex ` Example: -```console +```{ .console .no-copy } [root@store3 ~]# getfattr -d -e hex -m. brick-a/file.txt \#file: brick-a/file.txt security.selinux=0x726f6f743a6f626a6563745f723a66696c655f743a733000 @@ -900,7 +952,7 @@ The extended attributes with `trusted.afr.-client-` are used by afr to maintain changelog of the file.The values of the `trusted.afr.-client-` are calculated by the glusterfs client (fuse or nfs-server) processes. When the glusterfs client modifies a file -or directory, the client contacts each brick and updates the changelog extended +or directory, the client contacts each brick and updates the changelog extended attribute according to the response of the brick. 'subvolume-index' is nothing but (brick number - 1) in @@ -908,7 +960,7 @@ attribute according to the response of the brick. 
Example: -```console +```{ .console .no-copy } [root@pranithk-laptop ~]# gluster volume info vol Volume Name: vol Type: Distributed-Replicate @@ -929,7 +981,7 @@ Example: In the example above: -```console +```{ .console .no-copy } Brick | Replica set | Brick subvolume index ---------------------------------------------------------------------------- -/gfs/brick-a | 0 | 0 @@ -945,25 +997,25 @@ Brick | Replica set | Brick subvolume index Each file in a brick maintains the changelog of itself and that of the files present in all the other bricks in its replica set as seen by that brick. -In the example volume given above, all files in brick-a will have 2 entries, +In the example volume given above, all files in brick-a will have 2 entries, one for itself and the other for the file present in its replica pair, i.e.brick-b: trusted.afr.vol-client-0=0x000000000000000000000000 -->changelog for itself (brick-a) -trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for brick-b as seen by brick-a +trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for brick-b as seen by brick-a Likewise, all files in brick-b will have: trusted.afr.vol-client-0=0x000000000000000000000000 -->changelog for brick-a as seen by brick-b -trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for itself (brick-b) +trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for itself (brick-b) -The same can be extended for other replica pairs. +The same can be extended for other replica pairs. Interpreting Changelog (roughly pending operation count) Value: Each extended attribute has a value which is 24 hexa decimal digits. First 8 digits represent changelog of data. Second 8 digits represent changelog -of metadata. Last 8 digits represent Changelog of directory entries. +of metadata. Last 8 digits represent Changelog of directory entries. Pictorially representing the same, we have: -```text +```{ .text .no-copy } 0x 000003d7 00000001 00000000 | | | | | \_ changelog of directory entries @@ -971,17 +1023,16 @@ Pictorially representing the same, we have: \ _ changelog of data ``` - For Directories metadata and entry changelogs are valid. For regular files data and metadata changelogs are valid. For special files like device files etc metadata changelog is valid. When a file split-brain happens it could be either data split-brain or meta-data split-brain or both. When a split-brain happens the changelog of the -file would be something like this: +file would be something like this: Example:(Lets consider both data, metadata split-brain on same file). -```console +```{ .console .no-copy } [root@pranithk-laptop vol]# getfattr -d -m . -e hex /gfs/brick-?/a getfattr: Removing leading '/' from absolute path names \#file: gfs/brick-a/a @@ -1007,7 +1058,7 @@ on itself but failed on /gfs/brick-b/a. The second 8 digits of trusted.afr.vol-client-0 are all zeros (0x........00000000........), and the second 8 digits of trusted.afr.vol-client-1 are not all zeros (0x........00000001........). -So the changelog on /gfs/brick-a/a implies that some metadata operations succeeded +So the changelog on /gfs/brick-a/a implies that some metadata operations succeeded on itself but failed on /gfs/brick-b/a. #### According to Changelog extended attributes on file /gfs/brick-b/a: @@ -1029,12 +1080,12 @@ file, it is in both data and metadata split-brain. 
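Reading the three 8-digit fields by eye is easy to get wrong, so here is a small illustrative helper (not part of any Gluster tooling; the value used is simply `trusted.afr.vol-client-1` of `/gfs/brick-a/a` from the example above) that splits a changelog value into its data, metadata and entry counters:

```console
# Split a trusted.afr changelog value (0x followed by 24 hex digits) into its three counters.
val=0x000003d70000000100000000      # trusted.afr.vol-client-1 on /gfs/brick-a/a
hex=${val#0x}                       # strip the leading 0x
printf 'data     changelog: 0x%s (~%d pending ops)\n' "${hex:0:8}"  "$((16#${hex:0:8}))"
printf 'metadata changelog: 0x%s (~%d pending ops)\n' "${hex:8:8}"  "$((16#${hex:8:8}))"
printf 'entry    changelog: 0x%s (~%d pending ops)\n' "${hex:16:8}" "$((16#${hex:16:8}))"
```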
#### Deciding on the correct copy: -The user may have to inspect stat,getfattr output of the files to decide which +The user may have to inspect stat,getfattr output of the files to decide which metadata to retain and contents of the file to decide which data to retain. Continuing with the example above, lets say we want to retain the data of /gfs/brick-a/a and metadata of /gfs/brick-b/a. -#### Resetting the relevant changelogs to resolve the split-brain: +#### Resetting the relevant changelogs to resolve the split-brain: For resolving data-split-brain: @@ -1068,27 +1119,31 @@ For trusted.afr.vol-client-1 Hence execute `setfattr -n trusted.afr.vol-client-1 -v 0x000003d70000000000000000 /gfs/brick-a/a` -Thus after the above operations are done, the changelogs look like this: -[root@pranithk-laptop vol]# getfattr -d -m . -e hex /gfs/brick-?/a -getfattr: Removing leading '/' from absolute path names -\#file: gfs/brick-a/a -trusted.afr.vol-client-0=0x000000000000000000000000 -trusted.afr.vol-client-1=0x000003d70000000000000000 -trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57 +Thus after the above operations are done, the changelogs look like this: -\#file: gfs/brick-b/a -trusted.afr.vol-client-0=0x000000000000000100000000 -trusted.afr.vol-client-1=0x000000000000000000000000 -trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57 +```{ .console .no-copy } +[root@pranithk-laptop vol]# getfattr -d -m . -e hex /gfs/brick-?/a +getfattr: Removing leading '/' from absolute path names +\#file: gfs/brick-a/a +trusted.afr.vol-client-0=0x000000000000000000000000 +trusted.afr.vol-client-1=0x000003d70000000000000000 +trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57 +\#file: gfs/brick-b/a +trusted.afr.vol-client-0=0x000000000000000100000000 +trusted.afr.vol-client-1=0x000000000000000000000000 +trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57 +``` + +## Triggering Self-heal: -Triggering Self-heal: ---------------------- Perform `ls -l ` to trigger healing. Fixing Directory entry split-brain: ----------------------------------- + +--- + Afr has the ability to conservatively merge different entries in the directories when there is a split-brain on directory. If on one brick directory 'd' has entries '1', '2' and has entries '3', '4' on @@ -1108,9 +1163,11 @@ needs to be removed.The gfid-link files are present in the .glusterfs folder in the top-level directory of the brick. If the gfid of the file is 0x307a5c9efddd4e7c96e94fd4bcdcbd1b (the trusted.gfid extended attribute got from the getfattr command earlier),the gfid-link file can be found at + > /gfs/brick-a/.glusterfs/30/7a/307a5c9efddd4e7c96e94fd4bcdcbd1b #### Word of caution: + Before deleting the gfid-link, we have to ensure that there are no hard links to the file present on that brick. If hard-links exist,they must be deleted as well. diff --git a/docs/Troubleshooting/statedump.md b/docs/Troubleshooting/statedump.md index 3c33810..b89345d 100644 --- a/docs/Troubleshooting/statedump.md +++ b/docs/Troubleshooting/statedump.md @@ -2,20 +2,18 @@ A statedump is, as the name suggests, a dump of the internal state of a glusterfs process. It captures information about in-memory structures such as frames, call stacks, active inodes, fds, mempools, iobufs, and locks as well as xlator specific data structures. This can be an invaluable tool for debugging memory leaks and hung processes. 
+- [Generate a Statedump](#generate-a-statedump) +- [Read a Statedump](#read-a-statedump) +- [Debug with a Statedump](#debug-with-statedumps) - - - [Generate a Statedump](#generate-a-statedump) - - [Read a Statedump](#read-a-statedump) - - [Debug with a Statedump](#debug-with-statedumps) - -************************ - +--- ## Generate a Statedump + Run the command ```console -# gluster --print-statedumpdir +gluster --print-statedumpdir ``` on a gluster server node to find out which directory the statedumps will be created in. This directory may need to be created if not already present. @@ -38,7 +36,6 @@ kill -USR1 There are specific commands to generate statedumps for all brick processes/nfs server/quotad which can be used instead of the above. Run the following commands on one of the server nodes: - For bricks: ```console @@ -59,16 +56,17 @@ gluster volume statedump quotad The statedumps will be created in `statedump-directory` on each node. The statedumps for brick processes will be created with the filename `hyphenated-brick-path..dump.timestamp` while for all other processes it will be `glusterdump..dump.timestamp`. -*** +--- ## Read a Statedump Statedumps are text files and can be opened in any text editor. The first and last lines of the file contain the start and end time (in UTC)respectively of when the statedump file was written. ### Mallinfo + The mallinfo return status is printed in the following format. Please read _man mallinfo_ for more information about what each field means. -``` +```{.text .no-copy } [mallinfo] mallinfo_arena=100020224 /* Non-mmapped space allocated (bytes) */ mallinfo_ordblks=69467 /* Number of free chunks */ @@ -83,19 +81,19 @@ mallinfo_keepcost=133712 /* Top-most, releasable space (bytes) */ ``` ### Memory accounting stats + Each xlator defines data structures specific to its requirements. The statedump captures information about the memory usage and allocations of these structures for each xlator in the call-stack and prints them in the following format: For the xlator with the name _glusterfs_ -``` +```{.text .no-copy } [global.glusterfs - Memory usage] #[global. - Memory usage] num_types=119 #The number of data types it is using ``` - followed by the memory usage for each data-type for that translator. The following example displays a sample for the gf_common_mt_gf_timer_t type -``` +```{.text .no-copy } [global.glusterfs - usage-type gf_common_mt_gf_timer_t memusage] #[global. - usage-type memusage] size=112 #Total size allocated for data-type when the statedump was taken i.e. num_allocs * sizeof (data-type) @@ -113,7 +111,7 @@ Mempools are an optimization intended to reduce the number of allocations of a d Memory pool allocations by each xlator are displayed in the following format: -``` +```{.text .no-copy } [mempool] #Section name -----=----- pool-name=fuse:fd_t #pool-name=: @@ -129,10 +127,9 @@ max-stdalloc=0 #Maximum number of allocations from heap that were in active This information is also useful while debugging high memory usage issues as large hot_count and cur-stdalloc values may point to an element not being freed after it has been used. - ### Iobufs -``` +```{.text .no-copy } [iobuf.global] iobuf_pool=0x1f0d970 #The memory pool for iobufs iobuf_pool.default_page_size=131072 #The default size of iobuf (if no iobuf size is specified the default size is allocated) @@ -148,7 +145,7 @@ There are 3 lists of arenas 2. Purge list: arenas that can be purged(no active iobufs, active_cnt == 0). 3. Filled list: arenas without free iobufs. 
-``` +```{.text .no-copy } [purge.1] #purge. purge.1.mem_base=0x7fc47b35f000 #The address of the arena structure purge.1.active_cnt=0 #The number of iobufs active in that arena @@ -168,7 +165,7 @@ arena.5.page_size=32768 If the active_cnt of any arena is non zero, then the statedump will also have the iobuf list. -``` +```{.text .no-copy } [arena.6.active_iobuf.1] #arena..active_iobuf. arena.6.active_iobuf.1.ref=1 #refcount of the iobuf arena.6.active_iobuf.1.ptr=0x7fdb921a9000 #address of the iobuf @@ -180,12 +177,11 @@ arena.6.active_iobuf.2.ptr=0x7fdb92189000 A lot of filled arenas at any given point in time could be a sign of iobuf leaks. - ### Call stack The fops received by gluster are handled using call stacks. A call stack contains information about the uid/gid/pid etc of the process that is executing the fop. Each call stack contains different call-frames for each xlator which handles that fop. -``` +```{.text .no-copy } [global.callpool.stack.3] #global.callpool.stack. stack=0x7fc47a44bbe0 #Stack address uid=0 #Uid of the process executing the fop @@ -199,9 +195,10 @@ cnt=9 #Number of frames in this stack. ``` ### Call-frame + Each frame will have information about which xlator the frame belongs to, which function it wound to/from and which it will be unwound to, and whether it has unwound. -``` +```{.text .no-copy } [global.callpool.stack.3.frame.2] #global.callpool.stack..frame. frame=0x7fc47a611dbc #Frame address ref_count=0 #Incremented at the time of wind and decremented at the time of unwind. @@ -215,12 +212,11 @@ unwind_to=afr_lookup_cbk #Parent xlator function to unwind to To debug hangs in the system, see which xlator has not yet unwound its fop by checking the value of the _complete_ tag in the statedump. (_complete=0_ indicates the xlator has not yet unwound). - ### FUSE Operation History Gluster Fuse maintains a history of the operations that it has performed. -``` +```{.text .no-copy } [xlator.mount.fuse.history] TIME=2014-07-09 16:44:57.523364 message=[0] fuse_release: RELEASE(): 4590:, fd: 0x1fef0d8, gfid: 3afb4968-5100-478d-91e9-76264e634c9f @@ -234,7 +230,7 @@ message=[0] fuse_getattr_resume: 4591, STAT, path: (/iozone.tmp), gfid: (3afb496 ### Xlator configuration -``` +```{.text .no-copy } [cluster/replicate.r2-replicate-0] #Xlator type, name information child_count=2 #Number of children for the xlator #Xlator specific configuration below @@ -255,7 +251,7 @@ wait_count=1 ### Graph/inode table -``` +```{.text .no-copy } [active graph - 1] conn.1.bound_xl./data/brick01a/homegfs.hashsize=14057 @@ -268,7 +264,7 @@ conn.1.bound_xl./data/brick01a/homegfs.purge_size=0 #Number of inodes present ### Inode -``` +```{.text .no-copy } [conn.1.bound_xl./data/brick01a/homegfs.active.324] #324th inode in active inode list gfid=e6d337cf-97eb-44b3-9492-379ba3f6ad42 #Gfid of the inode nlookup=13 #Number of times lookups happened from the client or from fuse kernel @@ -285,9 +281,10 @@ ia_type=2 ``` ### Inode context + Each xlator can store information specific to it in the inode context. This context can also be printed in the statedump. 
Here is the inode context of the locks xlator -``` +```{.text .no-copy } [xlator.features.locks.homegfs-locks.inode] path=/homegfs/users/dfrobins/gfstest/r4/SCRATCH/fort.5102 - path of the file mandatory=0 @@ -301,10 +298,11 @@ lock-dump.domain.domain=homegfs-replicate-0:metadata #Domain name where metadata lock-dump.domain.domain=homegfs-replicate-0 #Domain name where entry/data operations take locks to maintain replication consistency inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=11141120, len=131072, pid = 18446744073709551615, owner=080b1ada117f0000, client=0xb7fc30, connection-id=compute-30-029.com-3505-2014/06/29-14:46:12:477358-homegfs-client-0-0-1, granted at Sun Jun 29 11:10:36 2014 #Active lock information ``` - -*** + +--- ## Debug With Statedumps + ### Memory leaks Statedumps can be used to determine whether the high memory usage of a process is caused by a leak. To debug the issue, generate statedumps for that process at regular intervals, or before and after running the steps that cause the memory used to increase. Once you have multiple statedumps, compare the memory allocation stats to see if any of them are increasing steadily as those could indicate a potential memory leak. @@ -315,7 +313,7 @@ The following examples walk through using statedumps to debug two different memo [BZ 1120151](https://bugzilla.redhat.com/show_bug.cgi?id=1120151) reported high memory usage by the self heal daemon whenever one of the bricks was wiped in a replicate volume and a full self-heal was invoked to heal the contents. This issue was debugged using statedumps to determine which data-structure was leaking memory. -A statedump of the self heal daemon process was taken using +A statedump of the self heal daemon process was taken using ```console kill -USR1 `` @@ -323,7 +321,7 @@ kill -USR1 `` On examining the statedump: -``` +```{.text .no-copy } grep -w num_allocs glusterdump.5225.dump.1405493251 num_allocs=77078 num_allocs=87070 @@ -338,6 +336,7 @@ hot-count=4095 ``` On searching for num_allocs with high values in the statedump, a `grep` of the statedump revealed a large number of allocations for the following data-types under the replicate xlator: + 1. gf_common_mt_asprintf 2. gf_common_mt_char 3. gf_common_mt_mem_pool. @@ -345,16 +344,15 @@ On searching for num_allocs with high values in the statedump, a `grep` of the s On checking the afr-code for allocations with tag `gf_common_mt_char`, it was found that the `data-self-heal` code path does not free one such allocated data structure. `gf_common_mt_mem_pool` suggests that there is a leak in pool memory. The `replicate-0:dict_t`, `glusterfs:data_t` and `glusterfs:data_pair_t` pools are using a lot of memory, i.e. cold_count is `0` and there are too many allocations. Checking the source code of dict.c shows that `key` in `dict` is allocated with `gf_common_mt_char` i.e. `2.` tag and value is created using gf_asprintf which in-turn uses `gf_common_mt_asprintf` i.e. `1.`. Checking the code for leaks in self-heal code paths led to a line which over-writes a variable with new dictionary even when it was already holding a reference to another dictionary. After fixing these leaks, we ran the same test to verify that none of the `num_allocs` values increased in the statedump of the self-daemon after healing 10,000 files. Please check [http://review.gluster.org/8316](http://review.gluster.org/8316) for more info about the patch/code. 
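If you want to automate the comparison of allocation counts between two statedumps of the same process, something along these lines works as a rough sketch (the dump file names below are placeholders for two dumps taken some time apart):

```console
# Extract "data-type num_allocs" pairs from each statedump, then diff them;
# data-types whose count keeps growing between the dumps are leak suspects.
for dump in glusterdump.5225.dump.first glusterdump.5225.dump.second; do
    awk -F'[][=]' '/ memusage\]$/ {type=$2} /^num_allocs=/ {print type, $2}' "$dump" > "$dump.allocs"
done
diff glusterdump.5225.dump.first.allocs glusterdump.5225.dump.second.allocs
```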
- #### Leaks in mempools: -The statedump output of mempools was used to test and verify the fixes for [BZ 1134221](https://bugzilla.redhat.com/show_bug.cgi?id=1134221). On code analysis, dict_t objects were found to be leaking (due to missing unref's) during name self-heal. + +The statedump output of mempools was used to test and verify the fixes for [BZ 1134221](https://bugzilla.redhat.com/show_bug.cgi?id=1134221). On code analysis, dict_t objects were found to be leaking (due to missing unref's) during name self-heal. Glusterfs was compiled with the -DDEBUG flags to have cold count set to 0 by default. The test involved creating 100 files on plain replicate volume, removing them from one of the backend bricks, and then triggering lookups on them from the mount point. A statedump of the mount process was taken before executing the test case and after it was completed. Statedump output of the fuse mount process before the test case was executed: -``` - +```{.text .no-copy } pool-name=glusterfs:dict_t hot-count=0 cold-count=0 @@ -364,12 +362,11 @@ max-alloc=0 pool-misses=33 cur-stdalloc=14 max-stdalloc=18 - ``` + Statedump output of the fuse mount process after the test case was executed: -``` - +```{.text .no-copy } pool-name=glusterfs:dict_t hot-count=0 cold-count=0 @@ -379,15 +376,15 @@ max-alloc=0 pool-misses=2841 cur-stdalloc=214 max-stdalloc=220 - ``` + Here, as cold count was 0 by default, cur-stdalloc indicates the number of dict_t objects that were allocated from the heap using mem_get(), and are yet to be freed using mem_put(). After running the test case (named selfheal of 100 files), there was a rise in the cur-stdalloc value (from 14 to 214) for dict_t. After the leaks were fixed, glusterfs was again compiled with -DDEBUG flags and the steps were repeated. Statedumps of the FUSE mount were taken before and after executing the test case to ascertain the validity of the fix. And the results were as follows: Statedump output of the fuse mount process before executing the test case: -``` +```{.text .no-copy } pool-name=glusterfs:dict_t hot-count=0 cold-count=0 @@ -397,11 +394,11 @@ max-alloc=0 pool-misses=33 cur-stdalloc=14 max-stdalloc=18 - ``` + Statedump output of the fuse mount process after executing the test case: -``` +```{.text .no-copy } pool-name=glusterfs:dict_t hot-count=0 cold-count=0 @@ -411,17 +408,18 @@ max-alloc=0 pool-misses=2837 cur-stdalloc=14 max-stdalloc=119 - ``` + The value of cur-stdalloc remained 14 after the test, indicating that the fix indeed does what it's supposed to do. ### Hangs caused by frame loss + [BZ 994959](https://bugzilla.redhat.com/show_bug.cgi?id=994959) reported that the Fuse mount hangs on a readdirp operation. Here are the steps used to locate the cause of the hang using statedump. Statedumps were taken for all gluster processes after reproducing the issue. The following stack was seen in the FUSE mount's statedump: -``` +```{.text .no-copy } [global.callpool.stack.1.frame.1] ref_count=1 translator=fuse @@ -463,8 +461,8 @@ parent=r2-quick-read wind_from=qr_readdirp wind_to=FIRST_CHILD (this)->fops->readdirp unwind_to=qr_readdirp_cbk - ``` + `unwind_to` shows that call was unwound to `afr_readdirp_cbk` from the r2-client-1 xlator. Inspecting that function revealed that afr is not unwinding the stack when fop failed. Check [http://review.gluster.org/5531](http://review.gluster.org/5531) for more info about patch/code changes. 
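If you need to track down a similar hang yourself, a small sketch like the one below (the dump file name is a placeholder) lists every call-frame that has not unwound together with the translator it belongs to, which usually points straight at the layer responsible:

```console
# Print every call-frame with complete=0 (not yet unwound) and the xlator it belongs to.
awk '/^\[global.callpool.stack/ {frame=$0; xl=""}
     /^translator=/            {xl=$0}
     /^complete=0/             {print frame, xl}' glusterdump.12345.dump.sample
```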
diff --git a/docs/Troubleshooting/troubleshooting-afr.md b/docs/Troubleshooting/troubleshooting-afr.md index 42bc2b4..8d85562 100644 --- a/docs/Troubleshooting/troubleshooting-afr.md +++ b/docs/Troubleshooting/troubleshooting-afr.md @@ -8,7 +8,7 @@ The first level of analysis always starts with looking at the log files. Which o Sometimes, you might need more verbose logging to figure out what’s going on: `gluster volume set $volname client-log-level $LEVEL` -where LEVEL can be any one of `DEBUG, WARNING, ERROR, INFO, CRITICAL, NONE, TRACE`. This should ideally make all the log files mentioned above to start logging at `$LEVEL`. The default is `INFO` but you can temporarily toggle it to `DEBUG` or `TRACE` if you want to see under-the-hood messages. Useful when the normal logs don’t give a clue as to what is happening. +where LEVEL can be any one of `DEBUG, WARNING, ERROR, INFO, CRITICAL, NONE, TRACE`. This should ideally make all the log files mentioned above to start logging at `$LEVEL`. The default is `INFO` but you can temporarily toggle it to `DEBUG` or `TRACE` if you want to see under-the-hood messages. Useful when the normal logs don’t give a clue as to what is happening. ## Heal related issues: @@ -20,17 +20,19 @@ Most issues I’ve seen on the mailing list and with customers can broadly fit i If the number of entries are large, then heal info will take longer than usual. While there are performance improvements to heal info being planned, a faster way to get an approx. count of the pending entries is to use the `gluster volume heal $VOLNAME statistics heal-count` command. -**Knowledge Hack:** Since we know that during the write transaction. the xattrop folder will capture the gfid-string of the file if it needs heal, we can also do an `ls /brick/.glusterfs/indices/xattrop|wc -l` on each brick to get the approx. no of entries that need heal. If this number reduces over time, it is a sign that the heal backlog is reducing. You will also see messages whenever a particular type of heal starts/ends for a given gfid, like so: +**Knowledge Hack:** Since we know that during the write transaction. the xattrop folder will capture the gfid-string of the file if it needs heal, we can also do an `ls /brick/.glusterfs/indices/xattrop|wc -l` on each brick to get the approx. no of entries that need heal. If this number reduces over time, it is a sign that the heal backlog is reducing. You will also see messages whenever a particular type of heal starts/ends for a given gfid, like so: -`[2019-05-07 12:05:14.460442] I [MSGID: 108026] [afr-self-heal-entry.c:883:afr_selfheal_entry_do] 0-testvol-replicate-0: performing entry selfheal on d120c0cf-6e87-454b-965b-0d83a4c752bb` +```{.text .no-copy } +[2019-05-07 12:05:14.460442] I [MSGID: 108026] [afr-self-heal-entry.c:883:afr_selfheal_entry_do] 0-testvol-replicate-0: performing entry selfheal on d120c0cf-6e87-454b-965b-0d83a4c752bb -`[2019-05-07 12:05:14.474710] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed entry selfheal on d120c0cf-6e87-454b-965b-0d83a4c752bb. sources=[0] 2 sinks=1` +[2019-05-07 12:05:14.474710] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed entry selfheal on d120c0cf-6e87-454b-965b-0d83a4c752bb. sources=[0] 2 sinks=1 -`[2019-05-07 12:05:14.493506] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed data selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5. 
sources=[0] 2 sinks=1` +[2019-05-07 12:05:14.493506] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed data selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1 -`[2019-05-07 12:05:14.494577] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-testvol-replicate-0: performing metadata selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5` +[2019-05-07 12:05:14.494577] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-testvol-replicate-0: performing metadata selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5 -`[2019-05-07 12:05:14.498398] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed metadata selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1` +[2019-05-07 12:05:14.498398] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed metadata selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1 +``` ### ii) Self-heal is stuck/ not getting completed. @@ -38,69 +40,88 @@ If a file seems to be forever appearing in heal info and not healing, check the - Examine the afr xattrs- Do they clearly indicate the good and bad copies? If there isn’t at least one good copy, then the file is in split-brain and you would need to use the split-brain resolution CLI. - Identify which node’s shds would be picking up the file for heal. If a file is listed in the heal info output under brick1 and brick2, then the shds on the nodes which host those bricks would attempt (and one of them would succeed) in doing the heal. - - Once the shd is identified, look at the shd logs to see if it is indeed connected to the bricks. +- Once the shd is identified, look at the shd logs to see if it is indeed connected to the bricks. This is good: -`[2019-05-07 09:53:02.912923] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-testvol-client-2: Connected to testvol-client-2, attached to remote volume '/bricks/brick3'` + +```{.text .no-copy } +[2019-05-07 09:53:02.912923] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-testvol-client-2: Connected to testvol-client-2, attached to remote volume '/bricks/brick3' +``` This indicates a disconnect: -`[2019-05-07 11:44:47.602862] I [MSGID: 114018] [client.c:2334:client_rpc_notify] 0-testvol-client-2: disconnected from testvol-client-2. Client process will keep trying to connect to glusterd until brick's port is available` -`[2019-05-07 11:44:50.953516] E [MSGID: 114058] [client-handshake.c:1456:client_query_portmap_cbk] 0-testvol-client-2: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.` +```{.text .no-copy } +[2019-05-07 11:44:47.602862] I [MSGID: 114018] [client.c:2334:client_rpc_notify] 0-testvol-client-2: disconnected from testvol-client-2. Client process will keep trying to connect to glusterd until brick's port is available + +[2019-05-07 11:44:50.953516] E [MSGID: 114058] [client-handshake.c:1456:client_query_portmap_cbk] 0-testvol-client-2: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running. +``` Alternatively, take a statedump of the self-heal daemon (shd) and check if all client xlators are connected to the respective bricks. The shd must have `connected=1` for all the client xlators, meaning it can talk to all the bricks. 
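One way to do that check is sketched below; it assumes the shd is running on the local node and that statedumps land in the directory reported by `gluster --print-statedumpdir` (typically `/var/run/gluster`).

```console
# Trigger a statedump of the local self-heal daemon ...
kill -USR1 $(pgrep -f glustershd)
# ... then print the connection flag of every client xlator from the dump.
grep -A1 'xlator\.protocol\.client\..*\.priv\]' /var/run/gluster/glusterdump.*.dump.*
```

The table below shows what the matched entries look like when the client xlator is connected to its brick and when it is not.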
-| Shd’s statedump entry of a client xlator that is connected to the 3rd brick | Shd’s statedump entry of the same client xlator if it is diconnected from the 3rd brick | -|:--------------------------------------------------------------------------------------------------------------------------------------------------------:|:--------------------------------------------------------------------------------------------------------------------------------------------------------:| +| Shd’s statedump entry of a client xlator that is connected to the 3rd brick | Shd’s statedump entry of the same client xlator if it is diconnected from the 3rd brick | +| :------------------------------------------------------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------: | | [xlator.protocol.client.testvol-client-2.priv] connected=1 total_bytes_read=75004 ping_timeout=42 total_bytes_written=50608 ping_msgs_sent=0 msgs_sent=0 | [xlator.protocol.client.testvol-client-2.priv] connected=0 total_bytes_read=75004 ping_timeout=42 total_bytes_written=50608 ping_msgs_sent=0 msgs_sent=0 | If there are connection issues (i.e. `connected=0`), you would need to investigate and fix them. Check if the pid and the TCP/RDMA Port of the brick proceess from gluster volume status $VOLNAME matches that of `ps aux|grep glusterfsd|grep $brick-path` -`[root@tuxpad glusterfs]# gluster volume status` +```{.text .no-copy } +# gluster volume status Status of volume: testvol -Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------- -Brick 127.0.0.2:/bricks/brick1 49152 0 Y 12527 +Gluster process TCP Port RDMA Port Online Pid -`[root@tuxpad glusterfs]# ps aux|grep brick1` +--- -`root 12527 0.0 0.1 1459208 20104 ? Ssl 11:20 0:01 /usr/local/sbin/glusterfsd -s 127.0.0.2 --volfile-id testvol.127.0.0.2.bricks-brick1 -p /var/run/gluster/vols/testvol/127.0.0.2-bricks-brick1.pid -S /var/run/gluster/70529980362a17d6.socket --brick-name /bricks/brick1 -l /var/log/glusterfs/bricks/bricks-brick1.log --xlator-option *-posix.glusterd-uuid=d90b1532-30e5-4f9d-a75b-3ebb1c3682d4 --process-name brick --brick-port 49152 --xlator-option testvol-server.listen-port=49152` +Brick 127.0.0.2:/bricks/brick1 49152 0 Y 12527 +``` + +```{.text .no-copy } +# ps aux|grep brick1 + +root 12527 0.0 0.1 1459208 20104 ? Ssl 11:20 0:01 /usr/local/sbin/glusterfsd -s 127.0.0.2 --volfile-id testvol.127.0.0.2.bricks-brick1 -p /var/run/gluster/vols/testvol/127.0.0.2-bricks-brick1.pid -S /var/run/gluster/70529980362a17d6.socket --brick-name /bricks/brick1 -l /var/log/glusterfs/bricks/bricks-brick1.log --xlator-option *-posix.glusterd-uuid=d90b1532-30e5-4f9d-a75b-3ebb1c3682d4 --process-name brick --brick-port 49152 --xlator-option testvol-server.listen-port=49152 +``` Though this will likely match, sometimes there could be a bug leading to stale port usage. A quick workaround would be to restart glusterd on that node and check if things match. Report the issue to the devs if you see this problem. - I have seen some cases where a file is listed in heal info, and the afr xattrs indicate pending metadata or data heal but the file itself is not present on all bricks. 
Ideally, the parent directory of the file must have pending entry heal xattrs so that the file either gets created on the missing bricks or gets deleted from the ones where it is present. But if the parent dir doesn’t have xattrs, the entry heal can’t proceed. In such cases, you can - -- Either do a lookup directly on the file from the mount so that name heal is triggered and then shd can pickup the data/metadata heal. - -- Or manually set entry xattrs on the parent dir to emulate an entry heal so that the file gets created as a part of it. - -- If a brick’s underlying filesystem/lvm was damaged and fsck’d to recovery, some files/dirs might be missing on it. If there is a lot of missing info on the recovered bricks, it might be better to just to a replace-brick or reset-brick and let the heal fully sync everything rather than fiddling with afr xattrs of individual entries. -**Hack:** How to trigger heal on *any* file/directory + - Either do a lookup directly on the file from the mount so that name heal is triggered and then shd can pickup the data/metadata heal. + - Or manually set entry xattrs on the parent dir to emulate an entry heal so that the file gets created as a part of it. + - If a brick’s underlying filesystem/lvm was damaged and fsck’d to recovery, some files/dirs might be missing on it. If there is a lot of missing info on the recovered bricks, it might be better to just to a replace-brick or reset-brick and let the heal fully sync everything rather than fiddling with afr xattrs of individual entries. + +**Hack:** How to trigger heal on _any_ file/directory Knowing about self-heal logic and index heal from the previous post, we can sort of emulate a heal with the following steps. This is not something that you should be doing on your cluster but it pays to at least know that it is possible when push comes to shove. 1. Picking one brick as good and setting the afr pending xattr on it blaming the bad bricks. 2. Capture the gfid inside .glusterfs/indices/xattrop so that the shd can pick it up during index heal. 3. Finally, trigger index heal: gluster volume heal $VOLNAME . -*Example:* Let us say a FILE-1 exists with `trusted.gfid=0x1ad2144928124da9b7117d27393fea5c` on all bricks of a replica 3 volume called testvol. It has no afr xattrs. But you still need to emulate a heal. Let us say you choose brick-2 as the source. Let us do the steps listed above: +_Example:_ Let us say a FILE-1 exists with `trusted.gfid=0x1ad2144928124da9b7117d27393fea5c` on all bricks of a replica 3 volume called testvol. It has no afr xattrs. But you still need to emulate a heal. Let us say you choose brick-2 as the source. Let us do the steps listed above: -1. Make brick-2 blame the other 2 bricks: -[root@tuxpad fuse_mnt]# setfattr -n trusted.afr.testvol-client-2 -v 0x000000010000000000000000 /bricks/brick2/FILE-1 -[root@tuxpad fuse_mnt]# setfattr -n trusted.afr.testvol-client-1 -v 0x000000010000000000000000 /bricks/brick2/FILE-1 +1. Make brick-2 blame the other 2 bricks: -2. Store the gfid string inside xattrop folder as a hardlink to the base entry: -root@tuxpad ~]# cd /bricks/brick2/.glusterfs/indices/xattrop/ -[root@tuxpad xattrop]# ls -li -total 0 -17829255 ----------. 1 root root 0 May 10 11:20 xattrop-a400ca91-cec9-4463-a183-aca9eaff9fa7` -[root@tuxpad xattrop]# ln xattrop-a400ca91-cec9-4463-a183-aca9eaff9fa7 1ad21449-2812-4da9-b711-7d27393fea5c -[root@tuxpad xattrop]# ll -total 0 -----------. 2 root root 0 May 10 11:20 1ad21449-2812-4da9-b711-7d27393fea5c -----------. 
2 root root 0 May 10 11:20 xattrop-a400ca91-cec9-4463-a183-aca9eaff9fa7 + setfattr -n trusted.afr.testvol-client-2 -v 0x000000010000000000000000 /bricks/brick2/FILE-1 + setfattr -n trusted.afr.testvol-client-1 -v 0x000000010000000000000000 /bricks/brick2/FILE-1 -3. Trigger heal: gluster volume heal testvol -The glustershd.log of node-2 should log about the heal. -[2019-05-10 06:10:46.027238] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed data selfheal on 1ad21449-2812-4da9-b711-7d27393fea5c. sources=[1] sinks=0 2 -So the data was healed from the second brick to the first and third brick. +2. Store the gfid string inside xattrop folder as a hardlink to the base entry: + + # cd /bricks/brick2/.glusterfs/indices/xattrop/ + # ls -li + total 0 + 17829255 ----------. 1 root root 0 May 10 11:20 xattrop-a400ca91-cec9-4463-a183-aca9eaff9fa7` + + # ln xattrop-a400ca91-cec9-4463-a183-aca9eaff9fa7 1ad21449-2812-4da9-b711-7d27393fea5c + # ll + total 0 + ----------. 2 root root 0 May 10 11:20 1ad21449-2812-4da9-b711-7d27393fea5c + ----------. 2 root root 0 May 10 11:20 xattrop-a400ca91-cec9-4463-a183-aca9eaff9fa7 + +3. Trigger heal: `gluster volume heal testvol` + + The glustershd.log of node-2 should log about the heal. + + [2019-05-10 06:10:46.027238] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed data selfheal on 1ad21449-2812-4da9-b711-7d27393fea5c. sources=[1] sinks=0 2 + + So the data was healed from the second brick to the first and third brick. ### iii) Self-heal is too slow @@ -109,7 +130,7 @@ If the heal backlog is decreasing and you see glustershd logging heals but you Option: cluster.shd-max-threads Default Value: 1 Description: Maximum number of parallel heals SHD can do per local brick. This can substantially lower heal times, but can also crush your bricks if you don’t have the storage hardware to support this. - + Option: cluster.shd-wait-qlength Default Value: 1024 Description: This option can be used to control number of heals that can wait in SHD per subvolume @@ -118,38 +139,45 @@ I’m not covering it here but it is possible to launch multiple shd instances ( ### iv) Self-heal is too aggressive and slows down the system. -If shd-max-threads are at the lowest value (i.e. 1) and you see if CPU usage of the bricks is too high, you can check if the volume’s profile info shows a lot of RCHECKSUM fops. Data self-heal does checksum calculation (i.e the `posix_rchecksum()` FOP) which can be CPU intensive. You can the `cluster.data-self-heal-algorithm` option to full. This does a full file copy instead of computing rolling checksums and syncing only the mismatching blocks. The tradeoff is that the network consumption will be increased. +If shd-max-threads are at the lowest value (i.e. 1) and you see if CPU usage of the bricks is too high, you can check if the volume’s profile info shows a lot of RCHECKSUM fops. Data self-heal does checksum calculation (i.e the `posix_rchecksum()` FOP) which can be CPU intensive. You can the `cluster.data-self-heal-algorithm` option to full. This does a full file copy instead of computing rolling checksums and syncing only the mismatching blocks. The tradeoff is that the network consumption will be increased. -You can also disable all client-side heals if they are turned on so that the client bandwidth is consumed entirely by the application FOPs and not the ones by client side background heals. i.e. 
turn off `cluster.metadata-self-heal, cluster.data-self-heal and cluster.entry-self-heal`. -Note: In recent versions of gluster, client-side heals are disabled by default. +You can also disable all client-side heals if they are turned on so that the client bandwidth is consumed entirely by the application FOPs and not the ones by client side background heals. i.e. turn off `cluster.metadata-self-heal, cluster.data-self-heal and cluster.entry-self-heal`. +Note: In recent versions of gluster, client-side heals are disabled by default. ## Mount related issues: - ### i) All fops are failing with ENOTCONN + +### i) All fops are failing with ENOTCONN Check mount log/ statedump for loss of quorum, just like for glustershd. If this is a fuse client (as opposed to an nfs/ gfapi client), you can also check the .meta folder to check the connection status to the bricks. -`[root@tuxpad ~]# cat /mnt/fuse_mnt/.meta/graphs/active/testvol-client-*/private |grep connected` -`connected = 0` -`connected = 1` -`connected = 1` +```{.text .no-copy } +# cat /mnt/fuse_mnt/.meta/graphs/active/testvol-client-*/private |grep connected -If `connected=0`, the connection to that brick is lost. Find out why. If the client is not connected to quorum number of bricks, then AFR fails lookups (and therefore any subsequent FOP) with Transport endpoint is not connected +connected = 0 +connected = 1 +connected = 1 +``` + +If `connected=0`, the connection to that brick is lost. Find out why. If the client is not connected to quorum number of bricks, then AFR fails lookups (and therefore any subsequent FOP) with Transport endpoint is not connected ### ii) FOPs on some files are failing with ENOTCONN Check mount log for the file being unreadable: -`[2019-05-10 11:04:01.607046] W [MSGID: 108027] [afr-common.c:2268:afr_attempt_readsubvol_set] 13-testvol-replicate-0: no read subvols for /FILE.txt` -`[2019-05-10 11:04:01.607775] W [fuse-bridge.c:939:fuse_entry_cbk] 0-glusterfs-fuse: 234: LOOKUP() /FILE.txt => -1 (Transport endpoint is not connected)` -This means there was only 1 good copy and the client has lost connection to that brick. You need to ensure that the client is connected to all bricks. +```{.text .no-copy } +[2019-05-10 11:04:01.607046] W [MSGID: 108027] [afr-common.c:2268:afr_attempt_readsubvol_set] 13-testvol-replicate-0: no read subvols for /FILE.txt +[2019-05-10 11:04:01.607775] W [fuse-bridge.c:939:fuse_entry_cbk] 0-glusterfs-fuse: 234: LOOKUP() /FILE.txt => -1 (Transport endpoint is not connected) +``` + +This means there was only 1 good copy and the client has lost connection to that brick. You need to ensure that the client is connected to all bricks. ### iii) Mount is hung It can be difficult to pin-point the issue immediately and might require assistance from the developers but the first steps to debugging could be to - - strace the fuse mount; see where it is hung. - - Take a statedump of the mount to see which xlator has frames that are not wound (i.e. complete=0) and for which FOP. Then check the source code to see if there are any unhanded cases where the xlator doesn’t wind the FOP to its child. - - Take statedump of bricks to see if there are any stale locks. An indication of stale locks is the same lock being present in multiple statedumps or the ‘granted’ date being very old. +- strace the fuse mount; see where it is hung. +- Take a statedump of the mount to see which xlator has frames that are not wound (i.e. complete=0) and for which FOP. 
Then check the source code to see if there are any unhanded cases where the xlator doesn’t wind the FOP to its child. +- Take statedump of bricks to see if there are any stale locks. An indication of stale locks is the same lock being present in multiple statedumps or the ‘granted’ date being very old. Excerpt from a brick statedump: diff --git a/docs/Troubleshooting/troubleshooting-filelocks.md b/docs/Troubleshooting/troubleshooting-filelocks.md index ec5da40..aaf42b5 100644 --- a/docs/Troubleshooting/troubleshooting-filelocks.md +++ b/docs/Troubleshooting/troubleshooting-filelocks.md @@ -1,6 +1,4 @@ -Troubleshooting File Locks -========================== - +# Troubleshooting File Locks Use [statedumps](./statedump.md) to find and list the locks held on files. The statedump output also provides information on each lock @@ -13,11 +11,11 @@ lock using the following `clear lock` commands. 1. **Perform statedump on the volume to view the files that are locked using the following command:** - # gluster volume statedump inode + gluster volume statedump inode For example, to display statedump of test-volume: - # gluster volume statedump test-volume + gluster volume statedump test-volume Volume statedump successful The statedump files are created on the brick servers in the` /tmp` @@ -58,25 +56,23 @@ lock using the following `clear lock` commands. 2. **Clear the lock using the following command:** - # gluster volume clear-locks + gluster volume clear-locks For example, to clear the entry lock on `file1` of test-volume: - # gluster volume clear-locks test-volume / kind granted entry file1 + gluster volume clear-locks test-volume / kind granted entry file1 Volume clear-locks successful vol-locks: entry blocked locks=0 granted locks=1 3. **Clear the inode lock using the following command:** - # gluster volume clear-locks + gluster volume clear-locks For example, to clear the inode lock on `file1` of test-volume: - # gluster volume clear-locks test-volume /file1 kind granted inode 0,0-0 + gluster volume clear-locks test-volume /file1 kind granted inode 0,0-0 Volume clear-locks successful vol-locks: inode blocked locks=0 granted locks=1 Perform statedump on test-volume again to verify that the above inode and entry locks are cleared. - - diff --git a/docs/Troubleshooting/troubleshooting-georep.md b/docs/Troubleshooting/troubleshooting-georep.md index 9ef49fe..cb66538 100644 --- a/docs/Troubleshooting/troubleshooting-georep.md +++ b/docs/Troubleshooting/troubleshooting-georep.md @@ -8,13 +8,13 @@ to GlusterFS Geo-replication. 
For every Geo-replication session, the following three log files are associated to it (four, if the secondary is a gluster volume): -- **Primary-log-file** - log file for the process which monitors the Primary - volume -- **Secondary-log-file** - log file for process which initiates the changes in - secondary -- **Primary-gluster-log-file** - log file for the maintenance mount point - that Geo-replication module uses to monitor the Primary volume -- **Secondary-gluster-log-file** - is the secondary's counterpart of it +- **Primary-log-file** - log file for the process which monitors the Primary + volume +- **Secondary-log-file** - log file for process which initiates the changes in + secondary +- **Primary-gluster-log-file** - log file for the maintenance mount point + that Geo-replication module uses to monitor the Primary volume +- **Secondary-gluster-log-file** - is the secondary's counterpart of it **Primary Log File** @@ -28,7 +28,7 @@ gluster volume geo-replication config log-file For example: ```console -# gluster volume geo-replication Volume1 example.com:/data/remote_dir config log-file +gluster volume geo-replication Volume1 example.com:/data/remote_dir config log-file ``` **Secondary Log File** @@ -38,13 +38,13 @@ running on secondary machine), use the following commands: 1. On primary, run the following command: - # gluster volume geo-replication Volume1 example.com:/data/remote_dir config session-owner 5f6e5200-756f-11e0-a1f0-0800200c9a66 + gluster volume geo-replication Volume1 example.com:/data/remote_dir config session-owner 5f6e5200-756f-11e0-a1f0-0800200c9a66 Displays the session owner details. 2. On secondary, run the following command: - # gluster volume geo-replication /data/remote_dir config log-file /var/log/gluster/${session-owner}:remote-mirror.log + gluster volume geo-replication /data/remote_dir config log-file /var/log/gluster/${session-owner}:remote-mirror.log 3. Replace the session owner details (output of Step 1) to the output of Step 2 to get the location of the log file. @@ -52,7 +52,7 @@ running on secondary machine), use the following commands: /var/log/gluster/5f6e5200-756f-11e0-a1f0-0800200c9a66:remote-mirror.log ### Rotating Geo-replication Logs - + Administrators can rotate the log file of a particular primary-secondary session, as needed. When you run geo-replication's ` log-rotate` command, the log file is backed up with the current timestamp suffixed @@ -61,34 +61,34 @@ log file. 
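Tying together steps 1–3 of the secondary log file procedure above, a hedged walk-through using the illustrative volume, secondary, and session-owner UUID that appear elsewhere on this page:

```{ .text .no-copy }
# Step 1, on the primary: print the session owner (illustrative output shown)
gluster volume geo-replication Volume1 example.com:/data/remote_dir config session-owner
5f6e5200-756f-11e0-a1f0-0800200c9a66

# Step 2, on the secondary: the log-file template for that session
gluster volume geo-replication /data/remote_dir config log-file /var/log/gluster/${session-owner}:remote-mirror.log

# Step 3: substituting the UUID from step 1 into the template gives the actual location
/var/log/gluster/5f6e5200-756f-11e0-a1f0-0800200c9a66:remote-mirror.log
```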
**To rotate a geo-replication log file** -- Rotate log file for a particular primary-secondary session using the - following command: +- Rotate log file for a particular primary-secondary session using the + following command: - # gluster volume geo-replication log-rotate + gluster volume geo-replication log-rotate - For example, to rotate the log file of primary `Volume1` and secondary - `example.com:/data/remote_dir` : + For example, to rotate the log file of primary `Volume1` and secondary + `example.com:/data/remote_dir` : - # gluster volume geo-replication Volume1 example.com:/data/remote_dir log rotate + gluster volume geo-replication Volume1 example.com:/data/remote_dir log rotate log rotate successful -- Rotate log file for all sessions for a primary volume using the - following command: +- Rotate log file for all sessions for a primary volume using the + following command: - # gluster volume geo-replication log-rotate + gluster volume geo-replication log-rotate - For example, to rotate the log file of primary `Volume1`: + For example, to rotate the log file of primary `Volume1`: - # gluster volume geo-replication Volume1 log rotate + gluster volume geo-replication Volume1 log rotate log rotate successful -- Rotate log file for all sessions using the following command: +- Rotate log file for all sessions using the following command: - # gluster volume geo-replication log-rotate + gluster volume geo-replication log-rotate - For example, to rotate the log file for all sessions: + For example, to rotate the log file for all sessions: - # gluster volume geo-replication log rotate + gluster volume geo-replication log rotate log rotate successful ### Synchronization is not complete @@ -102,16 +102,14 @@ GlusterFS geo-replication begins synchronizing all the data. All files are compared using checksum, which can be a lengthy and high resource utilization operation on large data sets. - ### Issues in Data Synchronization **Description**: Geo-replication display status as OK, but the files do not get synced, only directories and symlink gets synced with the following error message in the log: -```console -[2011-05-02 13:42:13.467644] E [primary:288:regjob] GMaster: failed to -sync ./some\_file\` +```{ .text .no-copy } +[2011-05-02 13:42:13.467644] E [primary:288:regjob] GMaster: failed to sync ./some\_file\` ``` **Solution**: Geo-replication invokes rsync v3.0.0 or higher on the host @@ -123,7 +121,7 @@ required version. **Description**: Geo-replication displays status as faulty very often with a backtrace similar to the following: -```console +```{ .text .no-copy } 2011-04-28 14:06:18.378859] E [syncdutils:131:log\_raise\_exception] \: FAIL: Traceback (most recent call last): File "/usr/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line @@ -139,28 +137,28 @@ the primary gsyncd module and secondary gsyncd module is broken and this can happen for various reasons. Check if it satisfies all the following pre-requisites: -- Password-less SSH is set up properly between the host and the remote - machine. -- If FUSE is installed in the machine, because geo-replication module - mounts the GlusterFS volume using FUSE to sync data. -- If the **Secondary** is a volume, check if that volume is started. -- If the Secondary is a plain directory, verify if the directory has been - created already with the required permissions. 
-- If GlusterFS 3.2 or higher is not installed in the default location - (in Primary) and has been prefixed to be installed in a custom - location, configure the `gluster-command` for it to point to the - exact location. -- If GlusterFS 3.2 or higher is not installed in the default location - (in secondary) and has been prefixed to be installed in a custom - location, configure the `remote-gsyncd-command` for it to point to - the exact place where gsyncd is located. +- Password-less SSH is set up properly between the host and the remote + machine. +- If FUSE is installed in the machine, because geo-replication module + mounts the GlusterFS volume using FUSE to sync data. +- If the **Secondary** is a volume, check if that volume is started. +- If the Secondary is a plain directory, verify if the directory has been + created already with the required permissions. +- If GlusterFS 3.2 or higher is not installed in the default location + (in Primary) and has been prefixed to be installed in a custom + location, configure the `gluster-command` for it to point to the + exact location. +- If GlusterFS 3.2 or higher is not installed in the default location + (in secondary) and has been prefixed to be installed in a custom + location, configure the `remote-gsyncd-command` for it to point to + the exact place where gsyncd is located. ### Intermediate Primary goes to Faulty State **Description**: In a cascading set-up, the intermediate primary goes to faulty state with the following log: -```console +```{ .text .no-copy } raise RuntimeError ("aborting on uuid change from %s to %s" % \\ RuntimeError: aborting on uuid change from af07e07c-427f-4586-ab9f- 4bf7d299be81 to de6b5040-8f4e-4575-8831-c4f55bd41154 diff --git a/docs/Troubleshooting/troubleshooting-glusterd.md b/docs/Troubleshooting/troubleshooting-glusterd.md index c42936b..dfa2ed7 100644 --- a/docs/Troubleshooting/troubleshooting-glusterd.md +++ b/docs/Troubleshooting/troubleshooting-glusterd.md @@ -4,45 +4,40 @@ The glusterd daemon runs on every trusted server node and is responsible for the The gluster CLI sends commands to the glusterd daemon on the local node, which executes the operation and returns the result to the user. -
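When a CLI command misbehaves, it is often worth a quick sanity check that the daemon is running and reachable before digging into logs or statedumps; a minimal sketch, assuming the service is managed by systemd under the unit name glusterd:

```console
systemctl status glusterd
gluster peer status
```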
- ### Debugging glusterd #### Logs + Start by looking at the log files for clues as to what went wrong when you hit a problem. The default directory for Gluster logs is /var/log/glusterfs. The logs for the CLI and glusterd are: - - glusterd : /var/log/glusterfs/glusterd.log - - gluster CLI : /var/log/glusterfs/cli.log - +- glusterd : /var/log/glusterfs/glusterd.log +- gluster CLI : /var/log/glusterfs/cli.log #### Statedumps + Statedumps are useful in debugging memory leaks and hangs. See [Statedump](./statedump.md) for more details. -
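As a minimal sketch, assuming glusterd honours SIGUSR1 for statedump generation and uses the default dump directory of /var/run/gluster, a statedump of the running daemon can be captured with:

```console
kill -SIGUSR1 $(pidof glusterd)
ls -l /var/run/gluster/    # dump files are typically named glusterdump.<pid>.dump.<timestamp>
```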
- ### Common Issues and How to Resolve Them - -**"*Another transaction is in progress for volname*" or "*Locking failed on xxx.xxx.xxx.xxx"*** +**"_Another transaction is in progress for volname_" or "_Locking failed on xxx.xxx.xxx.xxx"_** As Gluster is distributed by nature, glusterd takes locks when performing operations to ensure that configuration changes made to a volume are atomic across the cluster. These errors are returned when: -* More than one transaction contends on the same lock. -> *Solution* : These are likely to be transient errors and the operation will succeed if retried once the other transaction is complete. +- More than one transaction contends on the same lock. -* A stale lock exists on one of the nodes. -> *Solution* : Repeating the operation will not help until the stale lock is cleaned up. Restart the glusterd process holding the lock + > _Solution_ : These are likely to be transient errors and the operation will succeed if retried once the other transaction is complete. - * Check the glusterd.log file to find out which node holds the stale lock. Look for the message: - `lock being held by ` - * Run `gluster peer status` to identify the node with the uuid in the log message. - * Restart glusterd on that node. +- A stale lock exists on one of the nodes. + > _Solution_ : Repeating the operation will not help until the stale lock is cleaned up. Restart the glusterd process holding the lock -
+ - Check the glusterd.log file to find out which node holds the stale lock. Look for the message: + `lock being held by ` + - Run `gluster peer status` to identify the node with the uuid in the log message. + - Restart glusterd on that node. **"_Transport endpoint is not connected_" errors but all bricks are up** @@ -51,51 +46,40 @@ Gluster client processes query glusterd for the ports the bricks processes are l If the port information in glusterd is incorrect, the client will fail to connect to the brick even though it is up. Operations which would need to access that brick may fail with "Transport endpoint is not connected". -*Solution* : Restart the glusterd service. - -
+_Solution_ : Restart the glusterd service. **"Peer Rejected"** `gluster peer status` returns "Peer Rejected" for a node. -```console +```{ .text .no-copy } Hostname: Uuid: State: Peer Rejected (Connected) ``` -This indicates that the volume configuration on the node is not in sync with the rest of the trusted storage pool. +This indicates that the volume configuration on the node is not in sync with the rest of the trusted storage pool. You should see the following message in the glusterd log for the node on which the peer status command was run: -```console +```{ .text .no-copy } Version of Cksums differ. local cksum = xxxxxx, remote cksum = xxxxyx on peer ``` -*Solution*: Update the cluster.op-version +_Solution_: Update the cluster.op-version - * Run `gluster volume get all cluster.max-op-version` to get the latest supported op-version. - * Update the cluster.op-version to the latest supported op-version by executing `gluster volume set all cluster.op-version `. - -
+- Run `gluster volume get all cluster.max-op-version` to get the latest supported op-version. +- Update the cluster.op-version to the latest supported op-version by executing `gluster volume set all cluster.op-version `. **"Accepted Peer Request"** -If the glusterd handshake fails while expanding a cluster, the view of the cluster will be inconsistent. The state of the peer in `gluster peer status` will be “accepted peer request” and subsequent CLI commands will fail with an error. -Eg. `Volume create command will fail with "volume create: testvol: failed: Host is not in 'Peer in Cluster' state` - +If the glusterd handshake fails while expanding a cluster, the view of the cluster will be inconsistent. The state of the peer in `gluster peer status` will be “accepted peer request” and subsequent CLI commands will fail with an error. +Eg. `Volume create command will fail with "volume create: testvol: failed: Host is not in 'Peer in Cluster' state` + In this case the value of the state field in `/var/lib/glusterd/peers/` will be other than 3. -*Solution*: - -* Stop glusterd -* Open `/var/lib/glusterd/peers/` -* Change state to 3 -* Start glusterd - - - - - - +_Solution_: +- Stop glusterd +- Open `/var/lib/glusterd/peers/` +- Change state to 3 +- Start glusterd diff --git a/docs/Troubleshooting/troubleshooting-gnfs.md b/docs/Troubleshooting/troubleshooting-gnfs.md index 7e2c61a..9d7c455 100644 --- a/docs/Troubleshooting/troubleshooting-gnfs.md +++ b/docs/Troubleshooting/troubleshooting-gnfs.md @@ -11,14 +11,14 @@ This error is encountered when the server has not started correctly. On most Linux distributions this is fixed by starting portmap: ```console -# /etc/init.d/portmap start +/etc/init.d/portmap start ``` On some distributions where portmap has been replaced by rpcbind, the following command is required: ```console -# /etc/init.d/rpcbind start +/etc/init.d/rpcbind start ``` After starting portmap or rpcbind, gluster NFS server needs to be @@ -32,13 +32,13 @@ This error can arise in case there is already a Gluster NFS server running on the same machine. 
This situation can be confirmed from the log file, if the following error lines exist: -```text +```{ .text .no-copy } [2010-05-26 23:40:49] E [rpc-socket.c:126:rpcsvc_socket_listen] rpc-socket: binding socket failed:Address already in use -[2010-05-26 23:40:49] E [rpc-socket.c:129:rpcsvc_socket_listen] rpc-socket: Port is already in use -[2010-05-26 23:40:49] E [rpcsvc.c:2636:rpcsvc_stage_program_register] rpc-service: could not create listening connection -[2010-05-26 23:40:49] E [rpcsvc.c:2675:rpcsvc_program_register] rpc-service: stage registration of program failed -[2010-05-26 23:40:49] E [rpcsvc.c:2695:rpcsvc_program_register] rpc-service: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465 -[2010-05-26 23:40:49] E [nfs.c:125:nfs_init_versions] nfs: Program init failed +[2010-05-26 23:40:49] E [rpc-socket.c:129:rpcsvc_socket_listen] rpc-socket: Port is already in use +[2010-05-26 23:40:49] E [rpcsvc.c:2636:rpcsvc_stage_program_register] rpc-service: could not create listening connection +[2010-05-26 23:40:49] E [rpcsvc.c:2675:rpcsvc_program_register] rpc-service: stage registration of program failed +[2010-05-26 23:40:49] E [rpcsvc.c:2695:rpcsvc_program_register] rpc-service: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465 +[2010-05-26 23:40:49] E [nfs.c:125:nfs_init_versions] nfs: Program init failed [2010-05-26 23:40:49] C [nfs.c:531:notify] nfs: Failed to initialize protocols ``` @@ -50,7 +50,7 @@ multiple NFS servers on the same machine. If the mount command fails with the following error message: -```console +```{ .text .no-copy } mount.nfs: rpc.statd is not running but is required for remote locking. mount.nfs: Either use '-o nolock' to keep locks local, or start statd. ``` @@ -59,7 +59,7 @@ For NFS clients to mount the NFS server, rpc.statd service must be running on the clients. Start rpc.statd service by running the following command: ```console -# rpc.statd +rpc.statd ``` ### mount command takes too long to finish. @@ -71,14 +71,14 @@ NFS client. The resolution for this is to start either of these services by running the following command: ```console -# /etc/init.d/portmap start +/etc/init.d/portmap start ``` On some distributions where portmap has been replaced by rpcbind, the following command is required: ```console -# /etc/init.d/rpcbind start +/etc/init.d/rpcbind start ``` ### NFS server glusterfsd starts but initialization fails with “nfsrpc- service: portmap registration of program failed” error message in the log. @@ -88,8 +88,8 @@ still fail preventing clients from accessing the mount points. 
Such a situation can be confirmed from the following error messages in the log file: -```text -[2010-05-26 23:33:47] E [rpcsvc.c:2598:rpcsvc_program_register_portmap] rpc-service: Could notregister with portmap +```{ .text .no-copy } +[2010-05-26 23:33:47] E [rpcsvc.c:2598:rpcsvc_program_register_portmap] rpc-service: Could notregister with portmap [2010-05-26 23:33:47] E [rpcsvc.c:2682:rpcsvc_program_register] rpc-service: portmap registration of program failed [2010-05-26 23:33:47] E [rpcsvc.c:2695:rpcsvc_program_register] rpc-service: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465 [2010-05-26 23:33:47] E [nfs.c:125:nfs_init_versions] nfs: Program init failed @@ -104,12 +104,12 @@ file: On most Linux distributions, portmap can be started using the following command: - # /etc/init.d/portmap start + /etc/init.d/portmap start On some distributions where portmap has been replaced by rpcbind, run the following command: - # /etc/init.d/rpcbind start + /etc/init.d/rpcbind start After starting portmap or rpcbind, gluster NFS server needs to be restarted. @@ -126,8 +126,8 @@ file: On Linux, kernel NFS servers can be stopped by using either of the following commands depending on the distribution in use: - # /etc/init.d/nfs-kernel-server stop - # /etc/init.d/nfs stop + /etc/init.d/nfs-kernel-server stop + /etc/init.d/nfs stop 3. **Restart Gluster NFS server** @@ -135,7 +135,7 @@ file: mount command fails with following error -```console +```{ .text .no-copy } mount: mount to NFS server '10.1.10.11' failed: timed out (retrying). ``` @@ -175,14 +175,13 @@ Perform one of the following to resolve this issue: forcing the NFS client to use version 3. The **vers** option to mount command is used for this purpose: - # mount -o vers=3 + mount -o vers=3 -### showmount fails with clnt\_create: RPC: Unable to receive +### showmount fails with clnt_create: RPC: Unable to receive Check your firewall setting to open ports 111 for portmap requests/replies and Gluster NFS server requests/replies. Gluster NFS -server operates over the following port numbers: 38465, 38466, and -38467. +server operates over the following port numbers: 38465, 38466, and 38467. ### Application fails with "Invalid argument" or "Value too large for defined data type" error. @@ -193,9 +192,9 @@ numbers instead: nfs.enable-ino32 \ Applications that will benefit are those that were either: -- built 32-bit and run on 32-bit machines such that they do not - support large files by default -- built 32-bit on 64-bit systems +- built 32-bit and run on 32-bit machines such that they do not + support large files by default +- built 32-bit on 64-bit systems This option is disabled by default so NFS returns 64-bit inode numbers by default. @@ -203,6 +202,6 @@ by default. Applications which can be rebuilt from source are recommended to rebuild using the following flag with gcc: -``` +```console -D_FILE_OFFSET_BITS=64 ``` diff --git a/docs/Troubleshooting/troubleshooting-memory.md b/docs/Troubleshooting/troubleshooting-memory.md index 12336d5..70d83b2 100644 --- a/docs/Troubleshooting/troubleshooting-memory.md +++ b/docs/Troubleshooting/troubleshooting-memory.md @@ -1,5 +1,4 @@ -Troubleshooting High Memory Utilization -======================================= +# Troubleshooting High Memory Utilization If the memory utilization of a Gluster process increases significantly with time, it could be a leak caused by resources not being freed. 
If you suspect that you may have hit such an issue, try using [statedumps](./statedump.md) to debug the issue. @@ -12,4 +11,3 @@ If you are unable to figure out where the leak is, please [file an issue](https: - Steps to reproduce the issue if available - Statedumps for the process collected at intervals as the memory utilization increases - The Gluster log files for the process (if possible) - From aa6172d2ab6c305529bc3a24dd1e96ab90b81099 Mon Sep 17 00:00:00 2001 From: black-dragon74 Date: Thu, 9 Jun 2022 13:38:33 +0530 Subject: [PATCH 05/21] [glfs-tools] Cleanup syntax Signed-off-by: black-dragon74 --- docs/GlusterFS-Tools/README.md | 7 ++-- docs/GlusterFS-Tools/gfind-missing-files.md | 16 +++---- docs/GlusterFS-Tools/glusterfind.md | 46 +++++++++++++-------- 3 files changed, 40 insertions(+), 29 deletions(-) diff --git a/docs/GlusterFS-Tools/README.md b/docs/GlusterFS-Tools/README.md index bafd575..5c4cbe9 100644 --- a/docs/GlusterFS-Tools/README.md +++ b/docs/GlusterFS-Tools/README.md @@ -1,5 +1,4 @@ -GlusterFS Tools ---------------- +## GlusterFS Tools -- [glusterfind](./glusterfind.md) -- [gfind missing files](./gfind-missing-files.md) +- [glusterfind](./glusterfind.md) +- [gfind missing files](./gfind-missing-files.md) diff --git a/docs/GlusterFS-Tools/gfind-missing-files.md b/docs/GlusterFS-Tools/gfind-missing-files.md index f7f9e08..1275d6a 100644 --- a/docs/GlusterFS-Tools/gfind-missing-files.md +++ b/docs/GlusterFS-Tools/gfind-missing-files.md @@ -54,15 +54,15 @@ bash gfid_to_path.sh ## Things to keep in mind when running the tool -1. Running this tool can result in a crawl of the backend filesystem at each - brick which can be intensive. To ensure there is no impact on ongoing I/O on - RHS volumes, we recommend that this tool be run at a low I/O scheduling class - (best-effort) and priority. +1. Running this tool can result in a crawl of the backend filesystem at each + brick which can be intensive. To ensure there is no impact on ongoing I/O on + RHS volumes, we recommend that this tool be run at a low I/O scheduling class + (best-effort) and priority. - ionice -c 2 -p + ionice -c 2 -p -2. We do not recommend interrupting the tool when it is running - (e.g. by doing CTRL^C). It is better to wait for the tool to finish +2. We do not recommend interrupting the tool when it is running + (e.g. by doing CTRL^C). It is better to wait for the tool to finish execution. In case it is interrupted, manually unmount the Slave Volume. - umount + umount diff --git a/docs/GlusterFS-Tools/glusterfind.md b/docs/GlusterFS-Tools/glusterfind.md index 442b3f4..67f76a4 100644 --- a/docs/GlusterFS-Tools/glusterfind.md +++ b/docs/GlusterFS-Tools/glusterfind.md @@ -6,11 +6,23 @@ This tool should be run in one of the node, which will get Volume info and gets ## Session Management -Create a glusterfind session to remember the time when last sync or processing complete. For example, your backup application runs every day and gets incremental results on each run. The tool maintains session in `$GLUSTERD_WORKDIR/glusterfind/`, for each session it creates and directory and creates a sub directory with Volume name. (Default working directory is /var/lib/glusterd, in some systems this location may change. To find Working dir location run `grep working-directory /etc/glusterfs/glusterd.vol` or `grep working-directory /usr/local/etc/glusterfs/glusterd.vol` if source install) +Create a glusterfind session to remember the time when last sync or processing complete. 
For example, your backup application runs every day and gets incremental results on each run. The tool maintains session in `$GLUSTERD_WORKDIR/glusterfind/`, for each session it creates and directory and creates a sub directory with Volume name. (Default working directory is /var/lib/glusterd, in some systems this location may change. To find Working dir location run + +```{ .text .no-copy } +grep working-directory /etc/glusterfs/glusterd.vol +``` + +or + +```{ .text .no-copy } +grep working-directory /usr/local/etc/glusterfs/glusterd.vol +``` + +if you installed from the source. For example, if the session name is "backup" and volume name is "datavol", then the tool creates `$GLUSTERD_WORKDIR/glusterfind/backup/datavol`. Now onwards we refer this directory as `$SESSION_DIR`. -```text +```{ .text .no-copy } create => pre => post => [delete] ``` @@ -34,13 +46,13 @@ Incremental find uses Changelogs to get the list of GFIDs modified/created. Any If we set build-pgfid option in Volume GlusterFS starts recording each files parent directory GFID as xattr in file on any ENTRY fop. -```text +```{ .text .no-copy } trusted.pgfid.=NUM_LINKS ``` To convert from GFID to path, we can mount Volume with aux-gfid-mount option, and get Path information by a getfattr query. -```console +```{ .console .no-copy } getfattr -n glusterfs.ancestry.path -e text /mnt/datavol/.gfid/ ``` @@ -54,7 +66,7 @@ Tool collects the list of GFIDs failed to convert with above method and does a f ### Create the session -```console +```{ .console .no-copy } glusterfind create SESSION_NAME VOLNAME [--force] glusterfind create --help ``` @@ -63,7 +75,7 @@ Where, SESSION_NAME is any name without space to identify when run second time. Examples, -```console +```{ .console .no-copy } # glusterfind create --help # glusterfind create backup datavol # glusterfind create antivirus_scanner datavol @@ -72,7 +84,7 @@ Examples, ### Pre Command -```console +```{ .console .no-copy } glusterfind pre SESSION_NAME VOLUME_NAME OUTFILE glusterfind pre --help ``` @@ -83,7 +95,7 @@ To trigger the full find, call the pre command with `--full` argument. 
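Before the individual examples below, here is a minimal end-to-end sketch of the create => pre => post cycle described earlier. The session and volume names follow the examples on this page, and the middle step is only a placeholder for whatever consumes the output file:

```console
# One-time setup: create the session (session "backup", volume "datavol")
glusterfind create backup datavol

# Each run: emit the list of files changed since the last successful run
glusterfind pre backup datavol /root/backup.txt

# ... process /root/backup.txt with your backup tooling here (placeholder step) ...

# Update the stored session time only after the output has been consumed successfully,
# so that a failed run can simply be retried
glusterfind post backup datavol
```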
Multiple Examples, -```console +```{ .console .no-copy } # glusterfind pre backup datavol /root/backup.txt # glusterfind pre backup datavol /root/backup.txt --full @@ -96,7 +108,7 @@ Examples, Output file contains list of files/dirs relative to the Volume mount, if we need to prefix with any path to have absolute path then, -```console +```{ .console .no-copy } # glusterfind pre backup datavol /root/backup.txt --file-prefix=/mnt/datavol/ ``` @@ -104,20 +116,20 @@ Output file contains list of files/dirs relative to the Volume mount, if we need To get the list of sessions and respective session time, -```console +```{ .console .no-copy } glusterfind list [--session SESSION_NAME] [--volume VOLUME_NAME] ``` Examples, -```console +```{ .console .no-copy } # glusterfind list # glusterfind list --session backup ``` Example output, -```console +```{ .text .no-copy } SESSION VOLUME SESSION TIME --------------------------------------------------------------------------- backup datavol 2015-03-04 17:35:34 @@ -125,25 +137,25 @@ backup datavol 2015-03-04 17:35:34 ### Post Command -```console +```{ .console .no-copy } glusterfind post SESSION_NAME VOLUME_NAME ``` Examples, -```console +```{ .console .no-copy } # glusterfind post backup datavol ``` ### Delete Command -```console +```{ .console .no-copy } glusterfind delete SESSION_NAME VOLUME_NAME ``` Examples, -```console +```{ .console .no-copy } # glusterfind delete backup datavol ``` @@ -170,7 +182,7 @@ Custom crawler can be executable script/binary which accepts volume name, brick For example, -```console +```{ .console .no-copy } /root/parallelbrickcrawl SESSION_NAME VOLUME BRICK_PATH OUTFILE START_TIME [--debug] ``` From 5069c34769981a625dfafb44f3f37daef326ac1b Mon Sep 17 00:00:00 2001 From: black-dragon74 Date: Thu, 9 Jun 2022 13:47:55 +0530 Subject: [PATCH 06/21] [glossary] Cleanup syntax Signed-off-by: black-dragon74 --- docs/glossary.md | 253 ++++++++++++++++++++++++----------------------- 1 file changed, 127 insertions(+), 126 deletions(-) diff --git a/docs/glossary.md b/docs/glossary.md index 1a4858e..e9a78a0 100644 --- a/docs/glossary.md +++ b/docs/glossary.md @@ -1,57 +1,58 @@ -Glossary -======== +# Glossary **Access Control Lists** -: Access Control Lists (ACLs) allow you to assign different permissions - for different users or groups even though they do not correspond to the - original owner or the owning group. +: Access Control Lists (ACLs) allow you to assign different permissions +for different users or groups even though they do not correspond to the +original owner or the owning group. **Block Storage** -: Block special files, or block devices, correspond to devices through which the system moves - data in the form of blocks. These device nodes often represent addressable devices such as - hard disks, CD-ROM drives, or memory regions. GlusterFS requires a filesystem (like XFS) that - supports extended attributes. +: Block special files, or block devices, correspond to devices through which the system moves +data in the form of blocks. These device nodes often represent addressable devices such as +hard disks, CD-ROM drives, or memory regions. GlusterFS requires a filesystem (like XFS) that +supports extended attributes. **Brick** -: A Brick is the basic unit of storage in GlusterFS, represented by an export directory - on a server in the trusted storage pool. 
- A brick is expressed by combining a server with an export directory in the following format: +: A Brick is the basic unit of storage in GlusterFS, represented by an export directory +on a server in the trusted storage pool. +A brick is expressed by combining a server with an export directory in the following format: - `SERVER:EXPORT` - For example: - `myhostname:/exports/myexportdir/` +```{ .text .no-copy } +SERVER:EXPORT +For example: +myhostname:/exports/myexportdir/ +``` **Client** -: Any machine that mounts a GlusterFS volume. Any applications that use libgfapi access - mechanism can also be treated as clients in GlusterFS context. +: Any machine that mounts a GlusterFS volume. Any applications that use libgfapi access +mechanism can also be treated as clients in GlusterFS context. **Cluster** -: A trusted pool of linked computers working together, resembling a single computing resource. - In GlusterFS, a cluster is also referred to as a trusted storage pool. +: A trusted pool of linked computers working together, resembling a single computing resource. +In GlusterFS, a cluster is also referred to as a trusted storage pool. **Distributed File System** -: A file system that allows multiple clients to concurrently access data which is spread across - servers/bricks in a trusted storage pool. Data sharing among multiple locations is fundamental - to all distributed file systems. +: A file system that allows multiple clients to concurrently access data which is spread across +servers/bricks in a trusted storage pool. Data sharing among multiple locations is fundamental +to all distributed file systems. **Extended Attributes** -: Extended file attributes (abbreviated xattr) is a filesystem feature that enables - users/programs to associate files/dirs with metadata. Gluster stores metadata in xattrs. +: Extended file attributes (abbreviated xattr) is a filesystem feature that enables +users/programs to associate files/dirs with metadata. Gluster stores metadata in xattrs. **Filesystem** -: A method of storing and organizing computer files and their data. - Essentially, it organizes these files into a database for the - storage, organization, manipulation, and retrieval by the computer's - operating system. +: A method of storing and organizing computer files and their data. +Essentially, it organizes these files into a database for the +storage, organization, manipulation, and retrieval by the computer's +operating system. -Source [Wikipedia][Wikipedia] +Source [Wikipedia][wikipedia] **FUSE** : Filesystem in Userspace (FUSE) is a loadable kernel module for Unix-like - computer operating systems that lets non-privileged users create their - own file systems without editing kernel code. This is achieved by - running file system code in user space while the FUSE module provides - only a "bridge" to the actual kernel interfaces. +computer operating systems that lets non-privileged users create their +own file systems without editing kernel code. This is achieved by +running file system code in user space while the FUSE module provides +only a "bridge" to the actual kernel interfaces. Source: [Wikipedia][1] **GFID** @@ -60,156 +61,156 @@ associated with it called the GFID. This is analogous to inode in a regular filesystem. **glusterd** -: The Gluster daemon/service that manages volumes and cluster membership. It is required to - run on all the servers in the trusted storage pool. +: The Gluster daemon/service that manages volumes and cluster membership. 
It is required to +run on all the servers in the trusted storage pool. **Geo-Replication** -: Geo-replication provides a continuous, asynchronous, and incremental - replication service from site to another over Local Area Networks - (LANs), Wide Area Network (WANs), and across the Internet. - +: Geo-replication provides a continuous, asynchronous, and incremental +replication service from site to another over Local Area Networks +(LANs), Wide Area Network (WANs), and across the Internet. **Infiniband** - InfiniBand is a switched fabric computer network communications link - used in high-performance computing and enterprise data centers. +InfiniBand is a switched fabric computer network communications link +used in high-performance computing and enterprise data centers. **Metadata** -: Metadata is defined as data providing information about one or more - other pieces of data. There is no special metadata storage concept in - GlusterFS. The metadata is stored with the file data itself usually in the - form of extended attributes +: Metadata is defined as data providing information about one or more +other pieces of data. There is no special metadata storage concept in +GlusterFS. The metadata is stored with the file data itself usually in the +form of extended attributes **Namespace** -: A namespace is an abstract container or environment created to hold a - logical grouping of unique identifiers or symbols. Each Gluster volume - exposes a single namespace as a POSIX mount point that contains every - file in the cluster. +: A namespace is an abstract container or environment created to hold a +logical grouping of unique identifiers or symbols. Each Gluster volume +exposes a single namespace as a POSIX mount point that contains every +file in the cluster. **Node** -: A server or computer that hosts one or more bricks. +: A server or computer that hosts one or more bricks. **N-way Replication** -: Local synchronous data replication which is typically deployed across campus - or Amazon Web Services Availability Zones. +: Local synchronous data replication which is typically deployed across campus +or Amazon Web Services Availability Zones. **Petabyte** -: A petabyte (derived from the SI prefix peta- ) is a unit of - information equal to one quadrillion (short scale) bytes, or 1000 - terabytes. The unit symbol for the petabyte is PB. The prefix peta- - (P) indicates a power of 1000: +: A petabyte (derived from the SI prefix peta- ) is a unit of +information equal to one quadrillion (short scale) bytes, or 1000 +terabytes. The unit symbol for the petabyte is PB. The prefix peta- +(P) indicates a power of 1000: - 1 PB = 1,000,000,000,000,000 B = 10005 B = 1015 B. +```{ .text .no-copy } +1 PB = 1,000,000,000,000,000 B = 10005 B = 1015 B. - The term "pebibyte" (PiB), using a binary prefix, is used for the - corresponding power of 1024. +The term "pebibyte" (PiB), using a binary prefix, is used for the +corresponding power of 1024. +``` Source: [Wikipedia][3] **POSIX** -: Portable Operating System Interface (for Unix) is the name of a family - of related standards specified by the IEEE to define the application - programming interface (API), along with shell and utilities interfaces - for software compatible with variants of the Unix operating system - Gluster exports a POSIX compatible file system. 
+: Portable Operating System Interface (for Unix) is the name of a family +of related standards specified by the IEEE to define the application +programming interface (API), along with shell and utilities interfaces +for software compatible with variants of the Unix operating system +Gluster exports a POSIX compatible file system. **Quorum** -: The configuration of quorum in a trusted storage pool determines the - number of server failures that the trusted storage pool can sustain. - If an additional failure occurs, the trusted storage pool becomes - unavailable. +: The configuration of quorum in a trusted storage pool determines the +number of server failures that the trusted storage pool can sustain. +If an additional failure occurs, the trusted storage pool becomes +unavailable. **Quota** -: Quota allows you to set limits on usage of disk space by directories or - by volumes. +: Quota allows you to set limits on usage of disk space by directories or +by volumes. **RAID** -: Redundant Array of Inexpensive Disks (RAID) is a technology that provides - increased storage reliability through redundancy, combining multiple - low-cost, less-reliable disk drives components into a logical unit where - all drives in the array are interdependent. +: Redundant Array of Inexpensive Disks (RAID) is a technology that provides +increased storage reliability through redundancy, combining multiple +low-cost, less-reliable disk drives components into a logical unit where +all drives in the array are interdependent. **RDMA** -: Remote direct memory access (RDMA) is a direct memory access from the - memory of one computer into that of another without involving either - one's operating system. This permits high-throughput, low-latency - networking, which is especially useful in massively parallel computer - clusters +: Remote direct memory access (RDMA) is a direct memory access from the +memory of one computer into that of another without involving either +one's operating system. This permits high-throughput, low-latency +networking, which is especially useful in massively parallel computer +clusters **Rebalance** -: The process of redistributing data in a distributed volume when a - brick is added or removed. +: The process of redistributing data in a distributed volume when a +brick is added or removed. **RRDNS** -: Round Robin Domain Name Service (RRDNS) is a method to distribute load - across application servers. It is implemented by creating multiple A - records with the same name and different IP addresses in the zone file - of a DNS server. +: Round Robin Domain Name Service (RRDNS) is a method to distribute load +across application servers. It is implemented by creating multiple A +records with the same name and different IP addresses in the zone file +of a DNS server. **Samba** -: Samba allows file and print sharing between computers running Windows and - computers running Linux. It is an implementation of several services and - protocols including SMB and CIFS. +: Samba allows file and print sharing between computers running Windows and +computers running Linux. It is an implementation of several services and +protocols including SMB and CIFS. **Scale-Up Storage** -: Increases the capacity of the storage device in a single dimension. - For example, adding additional disk capacity to an existing trusted storage pool. +: Increases the capacity of the storage device in a single dimension. +For example, adding additional disk capacity to an existing trusted storage pool. 
**Scale-Out Storage** -: Scale out systems are designed to scale on both capacity and performance. - It increases the capability of a storage device in single dimension. - For example, adding more systems of the same size, or adding servers to a trusted storage pool - that increases CPU, disk capacity, and throughput for the trusted storage pool. +: Scale out systems are designed to scale on both capacity and performance. +It increases the capability of a storage device in single dimension. +For example, adding more systems of the same size, or adding servers to a trusted storage pool +that increases CPU, disk capacity, and throughput for the trusted storage pool. **Self-Heal** -: The self-heal daemon that runs in the background, identifies - inconsistencies in files/dirs in a replicated or erasure coded volume and then resolves - or heals them. This healing process is usually required when one or more - bricks of a volume goes down and then comes up later. +: The self-heal daemon that runs in the background, identifies +inconsistencies in files/dirs in a replicated or erasure coded volume and then resolves +or heals them. This healing process is usually required when one or more +bricks of a volume goes down and then comes up later. **Server** -: The machine (virtual or bare metal) that hosts the bricks in which data is stored. +: The machine (virtual or bare metal) that hosts the bricks in which data is stored. **Split-brain** -: A situation where data on two or more bricks in a replicated - volume start to diverge in terms of content or metadata. In this state, - one cannot determine programmatically which set of data is "right" and - which is "wrong". +: A situation where data on two or more bricks in a replicated +volume start to diverge in terms of content or metadata. In this state, +one cannot determine programmatically which set of data is "right" and +which is "wrong". **Subvolume** -: A brick after being processed by at least one translator. +: A brick after being processed by at least one translator. **Translator** -: Translators (also called xlators) are stackable modules where each - module has a very specific purpose. Translators are stacked in a - hierarchical structure called as graph. A translator receives data - from its parent translator, performs necessary operations and then - passes the data down to its child translator in hierarchy. +: Translators (also called xlators) are stackable modules where each +module has a very specific purpose. Translators are stacked in a +hierarchical structure called as graph. A translator receives data +from its parent translator, performs necessary operations and then +passes the data down to its child translator in hierarchy. **Trusted Storage Pool** -: A storage pool is a trusted network of storage servers. When you start - the first server, the storage pool consists of that server alone. +: A storage pool is a trusted network of storage servers. When you start +the first server, the storage pool consists of that server alone. **Userspace** -: Applications running in user space don’t directly interact with - hardware, instead using the kernel to moderate access. Userspace - applications are generally more portable than applications in kernel - space. Gluster is a user space application. +: Applications running in user space don’t directly interact with +hardware, instead using the kernel to moderate access. Userspace +applications are generally more portable than applications in kernel +space. Gluster is a user space application. 
**Virtual File System (VFS)** -: VFS is a kernel software layer which handles all system calls related to the standard Linux file system. - It provides a common interface to several kinds of file systems. +: VFS is a kernel software layer which handles all system calls related to the standard Linux file system. +It provides a common interface to several kinds of file systems. **Volume** -: A volume is a logical collection of bricks. +: A volume is a logical collection of bricks. **Vol file** : Vol files or volume (.vol) files are configuration files that determine the behavior of the - Gluster trusted storage pool. It is a textual representation of a - collection of modules (also known as translators) that together implement the - various functions required. +Gluster trusted storage pool. It is a textual representation of a +collection of modules (also known as translators) that together implement the +various functions required. - - [Wikipedia]: http://en.wikipedia.org/wiki/Filesystem - [1]: http://en.wikipedia.org/wiki/Filesystem_in_Userspace - [2]: http://en.wikipedia.org/wiki/Open_source - [3]: http://en.wikipedia.org/wiki/Petabyte +[wikipedia]: http://en.wikipedia.org/wiki/Filesystem +[1]: http://en.wikipedia.org/wiki/Filesystem_in_Userspace +[2]: http://en.wikipedia.org/wiki/Open_source +[3]: http://en.wikipedia.org/wiki/Petabyte From 9ae1ee6e41edfb1162b4b030db483319ff24cd14 Mon Sep 17 00:00:00 2001 From: Niraj Kumar Yadav Date: Tue, 14 Jun 2022 09:54:21 +0530 Subject: [PATCH 07/21] [ops-guide] Cleanup Syntax (#746) Signed-off-by: black-dragon74 --- docs/Ops-Guide/Overview.md | 2 +- docs/Ops-Guide/Tools.md | 25 +++++++++++++------------ 2 files changed, 14 insertions(+), 13 deletions(-) diff --git a/docs/Ops-Guide/Overview.md b/docs/Ops-Guide/Overview.md index 743c9dc..fa2374a 100644 --- a/docs/Ops-Guide/Overview.md +++ b/docs/Ops-Guide/Overview.md @@ -6,7 +6,7 @@ planning but the growth has mostly been ad-hoc and need-based. Central to the plan of revitalizing the Gluster.org community is the ability to provide well-maintained infrastructure services with predictable uptimes and -resilience. We're migrating the existing services into the Community Cage. The +resilience. We're migrating the existing services into the Community Cage. The implied objective is that the transition would open up ways and means of the formation of a loose coalition among Infrastructure Administrators who provide expertise and guidance to the community projects within the OSAS team. 
diff --git a/docs/Ops-Guide/Tools.md b/docs/Ops-Guide/Tools.md index e2b5bb8..287e74a 100644 --- a/docs/Ops-Guide/Tools.md +++ b/docs/Ops-Guide/Tools.md @@ -1,23 +1,24 @@ ## Tools We Use -| Service/Tool | Purpose | Hosted At | -|----------------------|----------------------------------------------------|-----------------| -| Github | Code Review | Github | -| Jenkins | CI, build-verification-test | Temporary Racks | -| Backups | Website, Gerrit and Jenkins backup | Rackspace | -| Docs | Documentation content | mkdocs.org | -| download.gluster.org | Official download site of the binaries | Rackspace | -| Mailman | Lists mailman | Rackspace | -| www.gluster.org | Web asset | Rackspace | +| Service/Tool | Purpose | Hosted At | +| :------------------- | :------------------------------------: | --------------: | +| Github | Code Review | Github | +| Jenkins | CI, build-verification-test | Temporary Racks | +| Backups | Website, Gerrit and Jenkins backup | Rackspace | +| Docs | Documentation content | mkdocs.org | +| download.gluster.org | Official download site of the binaries | Rackspace | +| Mailman | Lists mailman | Rackspace | +| www.gluster.org | Web asset | Rackspace | ## Notes -* download.gluster.org: Resiliency is important for availability and metrics. + +- download.gluster.org: Resiliency is important for availability and metrics. Since it's official download, access need to restricted as much as possible. Few developers building the community packages have access. If anyone requires access can raise an issue at [gluster/project-infrastructure](https://github.com/gluster/project-infrastructure/issues/new) with valid reason -* Mailman: Should be migrated to a separate host. Should be made more redundant +- Mailman: Should be migrated to a separate host. Should be made more redundant (ie, more than 1 MX). -* www.gluster.org: Framework, Artifacts now exist under gluster.github.com. Has +- www.gluster.org: Framework, Artifacts now exist under gluster.github.com. Has various legacy installation of software (mediawiki, etc ), being cleaned as we find them. From e172b71c900261cf186c45bb43a13d58059538ee Mon Sep 17 00:00:00 2001 From: black-dragon74 Date: Wed, 15 Jun 2022 13:25:14 +0530 Subject: [PATCH 08/21] Fix copying of some commands Signed-off-by: black-dragon74 --- docs/GlusterFS-Tools/glusterfind.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/GlusterFS-Tools/glusterfind.md b/docs/GlusterFS-Tools/glusterfind.md index 67f76a4..7424e3e 100644 --- a/docs/GlusterFS-Tools/glusterfind.md +++ b/docs/GlusterFS-Tools/glusterfind.md @@ -8,13 +8,13 @@ This tool should be run in one of the node, which will get Volume info and gets Create a glusterfind session to remember the time when last sync or processing complete. For example, your backup application runs every day and gets incremental results on each run. The tool maintains session in `$GLUSTERD_WORKDIR/glusterfind/`, for each session it creates and directory and creates a sub directory with Volume name. (Default working directory is /var/lib/glusterd, in some systems this location may change. 
To find Working dir location run -```{ .text .no-copy } +```console grep working-directory /etc/glusterfs/glusterd.vol ``` or -```{ .text .no-copy } +```console grep working-directory /usr/local/etc/glusterfs/glusterd.vol ``` @@ -108,8 +108,8 @@ Examples, Output file contains list of files/dirs relative to the Volume mount, if we need to prefix with any path to have absolute path then, -```{ .console .no-copy } -# glusterfind pre backup datavol /root/backup.txt --file-prefix=/mnt/datavol/ +```console +glusterfind pre backup datavol /root/backup.txt --file-prefix=/mnt/datavol/ ``` ### List Command @@ -143,8 +143,8 @@ glusterfind post SESSION_NAME VOLUME_NAME Examples, -```{ .console .no-copy } -# glusterfind post backup datavol +```console +glusterfind post backup datavol ``` ### Delete Command @@ -155,8 +155,8 @@ glusterfind delete SESSION_NAME VOLUME_NAME Examples, -```{ .console .no-copy } -# glusterfind delete backup datavol +```console +glusterfind delete backup datavol ``` ## Adding more Crawlers From 14cc346c4b96565a1439d2f342f3a4c0e31377b6 Mon Sep 17 00:00:00 2001 From: black-dragon74 Date: Wed, 15 Jun 2022 14:05:05 +0530 Subject: [PATCH 09/21] Replace tilda with backticks to enable code blocks Signed-off-by: black-dragon74 --- docs/release-notes/10.0.md | 351 ++++++++++--------- docs/release-notes/10.1.md | 3 + docs/release-notes/10.2.md | 6 +- docs/release-notes/3.10.0.md | 146 ++++---- docs/release-notes/3.10.1.md | 18 +- docs/release-notes/3.10.10.md | 4 +- docs/release-notes/3.10.11.md | 6 +- docs/release-notes/3.10.12.md | 2 +- docs/release-notes/3.10.2.md | 26 +- docs/release-notes/3.10.3.md | 25 +- docs/release-notes/3.10.4.md | 22 +- docs/release-notes/3.10.5.md | 23 +- docs/release-notes/3.10.6.md | 23 +- docs/release-notes/3.10.7.md | 19 +- docs/release-notes/3.10.8.md | 19 +- docs/release-notes/3.10.9.md | 19 +- docs/release-notes/3.11.0.md | 87 +++-- docs/release-notes/3.11.1.md | 17 +- docs/release-notes/3.11.2.md | 17 +- docs/release-notes/3.11.3.md | 11 +- docs/release-notes/3.12.0.md | 83 +++-- docs/release-notes/3.12.1.md | 19 +- docs/release-notes/3.12.10.md | 3 +- docs/release-notes/3.12.11.md | 1 + docs/release-notes/3.12.12.md | 1 + docs/release-notes/3.12.13.md | 4 +- docs/release-notes/3.12.14.md | 11 +- docs/release-notes/3.12.2.md | 26 +- docs/release-notes/3.12.3.md | 16 +- docs/release-notes/3.12.4.md | 12 +- docs/release-notes/3.12.5.md | 11 +- docs/release-notes/3.12.6.md | 15 +- docs/release-notes/3.12.7.md | 6 +- docs/release-notes/3.12.8.md | 4 +- docs/release-notes/3.12.9.md | 1 + docs/release-notes/3.13.0.md | 40 ++- docs/release-notes/3.13.1.md | 10 +- docs/release-notes/3.13.2.md | 4 +- docs/release-notes/3.5.0.md | 25 +- docs/release-notes/3.5.1.md | 143 ++++---- docs/release-notes/3.5.2.md | 39 +-- docs/release-notes/3.5.3.md | 32 +- docs/release-notes/3.5.4.md | 37 +- docs/release-notes/3.6.0.md | 23 +- docs/release-notes/3.6.3.md | 30 +- docs/release-notes/3.7.0.md | 52 +-- docs/release-notes/3.7.1.md | 119 ++++--- docs/release-notes/3.9.0.md | 120 ++++--- docs/release-notes/4.0.0.md | 132 +++++-- docs/release-notes/4.0.1.md | 2 + docs/release-notes/4.0.2.md | 2 + docs/release-notes/4.1.0.md | 100 +++--- docs/release-notes/4.1.1.md | 1 + docs/release-notes/4.1.10.md | 2 +- docs/release-notes/4.1.2.md | 8 +- docs/release-notes/4.1.3.md | 6 +- docs/release-notes/4.1.4.md | 36 +- docs/release-notes/4.1.6.md | 5 +- docs/release-notes/4.1.7.md | 2 +- docs/release-notes/4.1.9.md | 2 +- docs/release-notes/5.0.md | 55 +-- docs/release-notes/5.1.md | 
7 +- docs/release-notes/5.10.md | 2 +- docs/release-notes/5.13.md | 2 +- docs/release-notes/5.5.md | 2 +- docs/release-notes/5.6.md | 2 +- docs/release-notes/6.0.md | 62 ++-- docs/release-notes/6.1.md | 1 - docs/release-notes/6.10.md | 3 +- docs/release-notes/6.3.md | 3 +- docs/release-notes/6.4.md | 4 +- docs/release-notes/6.5.md | 2 +- docs/release-notes/6.6.md | 10 +- docs/release-notes/6.7.md | 2 +- docs/release-notes/6.8.md | 4 +- docs/release-notes/6.9.md | 2 +- docs/release-notes/7.0.md | 43 +-- docs/release-notes/7.1.md | 5 +- docs/release-notes/7.2.md | 2 - docs/release-notes/7.3.md | 3 - docs/release-notes/7.4.md | 1 - docs/release-notes/7.5.md | 1 - docs/release-notes/7.6.md | 1 - docs/release-notes/7.7.md | 3 +- docs/release-notes/7.8.md | 2 +- docs/release-notes/7.9.md | 3 +- docs/release-notes/8.0.md | 96 +++-- docs/release-notes/8.1.md | 8 +- docs/release-notes/8.2.md | 2 +- docs/release-notes/8.3.md | 6 +- docs/release-notes/8.4.md | 6 +- docs/release-notes/8.5.md | 5 +- docs/release-notes/8.6.md | 8 +- docs/release-notes/9.0.md | 64 ++-- docs/release-notes/9.1.md | 6 +- docs/release-notes/9.2.md | 6 +- docs/release-notes/9.3.md | 6 +- docs/release-notes/9.4.md | 6 +- docs/release-notes/9.5.md | 11 +- docs/release-notes/geo-rep-in-3.7.md | 33 +- docs/release-notes/glusterfs-selinux2.0.1.md | 4 +- docs/release-notes/index.md | 29 +- 102 files changed, 1387 insertions(+), 1165 deletions(-) diff --git a/docs/release-notes/10.0.md b/docs/release-notes/10.0.md index 2a215c5..211a574 100644 --- a/docs/release-notes/10.0.md +++ b/docs/release-notes/10.0.md @@ -7,28 +7,29 @@ This is a major release that includes a range of features, code improvements and A selection of the key features and changes are documented in this page. A full list of bugs that have been addressed is included further below. -- [Announcements](#announcements) -- [Highlights](#highlights) -- [Bugs addressed in the release](#bugs-addressed) +- [Release notes for Gluster 10.0](#release-notes-for-gluster-100) + - [Announcements](#announcements) + - [Builds are available at -](#builds-are-available-at--) + - [Highlights](#highlights) + - [Bugs addressed](#bugs-addressed) ## Announcements 1. Releases that receive maintenance updates post release 10 is 9 -([reference](https://www.gluster.org/release-schedule/)) -2. Release 10 will receive maintenance updates around the 15th of every alternative month, and the release 9 will recieve maintainance updates around 15th every three months. - + ([reference](https://www.gluster.org/release-schedule/)) +2. Release 10 will receive maintenance updates around the 15th of every alternative month, and the release 9 will recieve maintainance updates around 15th every three months. ## Builds are available at - -[https://download.gluster.org/pub/gluster/glusterfs/10/10.0/](https://download.gluster.org/pub/gluster/glusterfs/10/10.0/) +[https://download.gluster.org/pub/gluster/glusterfs/10/10.0/](https://download.gluster.org/pub/gluster/glusterfs/10/10.0/) ## Highlights - Major performance improvement of ~20% w.r.t small files as well as large files testing in controlled lab environments [#2771](https://github.com/gluster/glusterfs/issues/2771) - - **NOTE**: The above improvement requires tcmalloc library to be enabled for building. We have tested and verified tcmalloc in X86_64 platforms and is enabled only for x86_64 builds in current release. - + + **NOTE**: The above improvement requires tcmalloc library to be enabled for building. 
We have tested and verified tcmalloc in X86_64 platforms and is enabled only for x86_64 builds in current release. + - Randomized port selection for bricks, improves startup time [#786](https://github.com/gluster/glusterfs/issues/786) - Performance improvement with use of readdir instead of readdirp in fix-layout [#2241](https://github.com/gluster/glusterfs/issues/2241) - Heal time improvement with bigger window size [#2067](https://github.com/gluster/glusterfs/issues/2067) @@ -37,168 +38,168 @@ A full list of bugs that have been addressed is included further below. Bugs addressed since release-10 are listed below. - - [#504](https://github.com/gluster/glusterfs/issues/504) AFR: remove memcpy() + ntoh32() pattern - - [#705](https://github.com/gluster/glusterfs/issues/705) gf_backtrace_save inefficiencies - - [#782](https://github.com/gluster/glusterfs/issues/782) Do not explicitly call strerror(errnum) when logging - - [#786](https://github.com/gluster/glusterfs/issues/786) glusterd-pmap binds to 10K ports on startup (using IPv4) - - [#904](https://github.com/gluster/glusterfs/issues/904) [bug:1649037] Translators allocate too much memory in their xlator_ - - [#1000](https://github.com/gluster/glusterfs/issues/1000) [bug:1193929] GlusterFS can be improved - - [#1002](https://github.com/gluster/glusterfs/issues/1002) [bug:1679998] GlusterFS can be improved - - [#1052](https://github.com/gluster/glusterfs/issues/1052) [bug:1693692] Increase code coverage from regression tests - - [#1060](https://github.com/gluster/glusterfs/issues/1060) [bug:789278] Issues reported by Coverity static analysis tool - - [#1096](https://github.com/gluster/glusterfs/issues/1096) [bug:1622665] clang-scan report: glusterfs issues - - [#1101](https://github.com/gluster/glusterfs/issues/1101) [bug:1813029] volume brick fails to come online because other proce - - [#1251](https://github.com/gluster/glusterfs/issues/1251) performance: improve __afr_fd_ctx_get() function - - [#1339](https://github.com/gluster/glusterfs/issues/1339) Rebalance status is not shown correctly after node reboot - - [#1358](https://github.com/gluster/glusterfs/issues/1358) features/shard: wrong "inode->ref" leading to ASSERT in inode_unref - - [#1359](https://github.com/gluster/glusterfs/issues/1359) Cleanup --disable-mempool - - [#1380](https://github.com/gluster/glusterfs/issues/1380) fd_unref() optimization - do an atomic decrement outside the lock a - - [#1384](https://github.com/gluster/glusterfs/issues/1384) mount glusterfs volume, files larger than 64Mb only show 64Mb - - [#1406](https://github.com/gluster/glusterfs/issues/1406) shared storage volume fails to mount in ipv6 environment - - [#1415](https://github.com/gluster/glusterfs/issues/1415) Removing problematic language in geo-replication - - [#1423](https://github.com/gluster/glusterfs/issues/1423) shard_make_block_abspath() should be called with a string of of the - - [#1536](https://github.com/gluster/glusterfs/issues/1536) Improve dict_reset() efficiency - - [#1545](https://github.com/gluster/glusterfs/issues/1545) fuse_invalidate_entry() - too many repetitive calls to uuid_utoa() - - [#1583](https://github.com/gluster/glusterfs/issues/1583) Rework stats structure (xl->stats.total.metrics[fop_idx] and friend - - [#1584](https://github.com/gluster/glusterfs/issues/1584) MAINTAINERS file needs to be revisited and updated - - [#1596](https://github.com/gluster/glusterfs/issues/1596) 'this' NULL check relies on 'THIS' not being NULL - - 
[#1600](https://github.com/gluster/glusterfs/issues/1600) Save and re-use MYUUID - - [#1678](https://github.com/gluster/glusterfs/issues/1678) Improve gf_error_to_errno() and gf_errno_to_error() positive flow - - [#1695](https://github.com/gluster/glusterfs/issues/1695) Rebalance has a redundant lookup operation - - [#1702](https://github.com/gluster/glusterfs/issues/1702) Move GF_CLIENT_PID_GSYNCD check to start of the function. - - [#1703](https://github.com/gluster/glusterfs/issues/1703) Remove trivial check for GF_XATTR_SHARD_FILE_SIZE before calling sh - - [#1707](https://github.com/gluster/glusterfs/issues/1707) PL_LOCAL_GET_REQUESTS access the dictionary twice for the same info - - [#1717](https://github.com/gluster/glusterfs/issues/1717) glusterd: sequence of rebalance and replace/reset-brick presents re - - [#1723](https://github.com/gluster/glusterfs/issues/1723) DHT: further investigation for treating an ongoing mknod's linkto file - - [#1749](https://github.com/gluster/glusterfs/issues/1749) brick-process: call 'notify()' and 'fini()' of brick xlators in a p - - [#1755](https://github.com/gluster/glusterfs/issues/1755) Reduce calls to 'THIS' in fd_destroy() and others, where 'THIS' is - - [#1761](https://github.com/gluster/glusterfs/issues/1761) CONTRIBUTING.md regression can only be run by maintainers - - [#1764](https://github.com/gluster/glusterfs/issues/1764) Slow write on ZFS bricks after healing millions of files due to add - - [#1772](https://github.com/gluster/glusterfs/issues/1772) build: add LTO as a configure option - - [#1773](https://github.com/gluster/glusterfs/issues/1773) DHT/Rebalance - Remove unused variable dht_migrate_file - - [#1779](https://github.com/gluster/glusterfs/issues/1779) Add-brick command should check hostnames with bricks present in vol - - [#1825](https://github.com/gluster/glusterfs/issues/1825) Latency in io-stats should be in nanoseconds resolution, not micros - - [#1872](https://github.com/gluster/glusterfs/issues/1872) Question: How to check heal info without glusterd management layer - - [#1885](https://github.com/gluster/glusterfs/issues/1885) __posix_writev() - reduce memory copies and unneeded zeroing - - [#1888](https://github.com/gluster/glusterfs/issues/1888) GD_OP_VERSION needs to be updated for release-10 - - [#1898](https://github.com/gluster/glusterfs/issues/1898) schedule_georep.py resulting in failure when used with python3 - - [#1909](https://github.com/gluster/glusterfs/issues/1909) core: Avoid several dict OR key is NULL message in brick logs - - [#1925](https://github.com/gluster/glusterfs/issues/1925) dht_pt_getxattr does not seem to handle virtual xattrs. 
- - [#1935](https://github.com/gluster/glusterfs/issues/1935) logging to syslog instead of any glusterfs logs - - [#1943](https://github.com/gluster/glusterfs/issues/1943) glusterd-volgen: Add functionality to accept any custom xlator - - [#1952](https://github.com/gluster/glusterfs/issues/1952) posix-aio: implement GF_FOP_FSYNC - - [#1959](https://github.com/gluster/glusterfs/issues/1959) Broken links in the 2 replicas split-brain-issue - [Bug][Enhancemen - - [#1960](https://github.com/gluster/glusterfs/issues/1960) Add missing LOCK_DESTROY() calls - - [#1966](https://github.com/gluster/glusterfs/issues/1966) Can't print trace details due to memory allocation issues - - [#1977](https://github.com/gluster/glusterfs/issues/1977) Inconsistent locking in presence of disconnects - - [#1978](https://github.com/gluster/glusterfs/issues/1978) test case ./tests/bugs/core/bug-1432542-mpx-restart-crash.t is gett - - [#1981](https://github.com/gluster/glusterfs/issues/1981) Reduce posix_fdstat() calls in IO paths - - [#1991](https://github.com/gluster/glusterfs/issues/1991) mdcache: bug causes getxattr() to report ENODATA when fetching samb - - [#1992](https://github.com/gluster/glusterfs/issues/1992) dht: var decommission_subvols_cnt becomes invalid when config is up - - [#1996](https://github.com/gluster/glusterfs/issues/1996) Analyze if spinlocks have any benefit and remove them if not - - [#2001](https://github.com/gluster/glusterfs/issues/2001) Error handling in /usr/sbin/gluster-eventsapi produces AttributeErr - - [#2005](https://github.com/gluster/glusterfs/issues/2005) ./tests/bugs/replicate/bug-921231.t is continuously failing - - [#2013](https://github.com/gluster/glusterfs/issues/2013) dict_t hash-calculation can be removed when hash_size=1 - - [#2024](https://github.com/gluster/glusterfs/issues/2024) Remove gfs_id variable or at least set to appropriate value - - [#2025](https://github.com/gluster/glusterfs/issues/2025) list_del() should not set prev and next - - [#2033](https://github.com/gluster/glusterfs/issues/2033) tests/bugs/nfs/bug-1053579.t fails on CentOS 8 - - [#2038](https://github.com/gluster/glusterfs/issues/2038) shard_unlink() fails due to no space to create marker file - - [#2039](https://github.com/gluster/glusterfs/issues/2039) Do not allow POSIX IO backend switch when the volume is running - - [#2042](https://github.com/gluster/glusterfs/issues/2042) mount ipv6 gluster volume with serveral backup-volfile-servers,use - - [#2052](https://github.com/gluster/glusterfs/issues/2052) Revert the commit 50e953e2450b5183988c12e87bdfbc997e0ad8a8 - - [#2054](https://github.com/gluster/glusterfs/issues/2054) cleanup call_stub_t from unused variables - - [#2063](https://github.com/gluster/glusterfs/issues/2063) Provide autoconf option to enable/disable storage.linux-io_uring du - - [#2067](https://github.com/gluster/glusterfs/issues/2067) Change self-heal-window-size to 1MB by default - - [#2075](https://github.com/gluster/glusterfs/issues/2075) Annotate synctasks with valgrind API if --enable-valgrind[=memcheck - - [#2080](https://github.com/gluster/glusterfs/issues/2080) Glustereventsd default port - - [#2083](https://github.com/gluster/glusterfs/issues/2083) GD_MSG_DICT_GET_FAILED should not include 'errno' but 'ret' - - [#2086](https://github.com/gluster/glusterfs/issues/2086) Move tests/00-geo-rep/00-georep-verify-non-root-setup.t to tests/00 - - [#2096](https://github.com/gluster/glusterfs/issues/2096) iobuf_arena structure doesn't need passive and active iobufs, but l - - 
[#2099](https://github.com/gluster/glusterfs/issues/2099) 'force' option does not work in the replicated volume snapshot crea - - [#2101](https://github.com/gluster/glusterfs/issues/2101) Move 00-georep-verify-non-root-setup.t back to tests/00-geo-rep/ - - [#2107](https://github.com/gluster/glusterfs/issues/2107) mount crashes when setfattr -n distribute.fix.layout -v "yes" is ex - - [#2116](https://github.com/gluster/glusterfs/issues/2116) enable quota for multiple volumes take more time - - [#2117](https://github.com/gluster/glusterfs/issues/2117) Concurrent quota enable causes glusterd deadlock - - [#2123](https://github.com/gluster/glusterfs/issues/2123) Implement an I/O framework - - [#2129](https://github.com/gluster/glusterfs/issues/2129) CID 1445996 Null pointer dereferences (FORWARD_NULL) /xlators/mgmt/ - - [#2130](https://github.com/gluster/glusterfs/issues/2130) stack.h/c: remove unused variable and reorder struct - - [#2133](https://github.com/gluster/glusterfs/issues/2133) Changelog History Crawl failed after resuming stopped geo-replicati - - [#2134](https://github.com/gluster/glusterfs/issues/2134) Fix spurious failures caused by change in profile info duration to - - [#2138](https://github.com/gluster/glusterfs/issues/2138) glfs_write() dumps a core file file when buffer size is 1GB - - [#2154](https://github.com/gluster/glusterfs/issues/2154) "Operation not supported" doing a chmod on a symlink - - [#2159](https://github.com/gluster/glusterfs/issues/2159) Remove unused component tests - - [#2161](https://github.com/gluster/glusterfs/issues/2161) Crash caused by memory corruption - - [#2169](https://github.com/gluster/glusterfs/issues/2169) Stack overflow when parallel-readdir is enabled - - [#2180](https://github.com/gluster/glusterfs/issues/2180) CID 1446716: Memory - illegal accesses (USE_AFTER_FREE) /xlators/mg - - [#2187](https://github.com/gluster/glusterfs/issues/2187) [Input/output error] IO failure while performing shrink operation w - - [#2190](https://github.com/gluster/glusterfs/issues/2190) Move a test case tests/basic/glusterd-restart-shd-mux.t to flaky - - [#2192](https://github.com/gluster/glusterfs/issues/2192) 4+1 arbiter setup is broken - - [#2198](https://github.com/gluster/glusterfs/issues/2198) There are blocked inodelks for a long time - - [#2216](https://github.com/gluster/glusterfs/issues/2216) Fix coverity issues - - [#2232](https://github.com/gluster/glusterfs/issues/2232) "Invalid argument" when reading a directory with gfapi - - [#2234](https://github.com/gluster/glusterfs/issues/2234) Segmentation fault in directory quota daemon for replicated volume - - [#2239](https://github.com/gluster/glusterfs/issues/2239) rebalance crashes in dht on master - - [#2241](https://github.com/gluster/glusterfs/issues/2241) Using readdir instead of readdirp for fix-layout increases performa - - [#2253](https://github.com/gluster/glusterfs/issues/2253) Disable lookup-optimize by default in the virt group - - [#2258](https://github.com/gluster/glusterfs/issues/2258) Provide option to disable fsync in data migration - - [#2260](https://github.com/gluster/glusterfs/issues/2260) failed to list quota info after setting limit-usage - - [#2268](https://github.com/gluster/glusterfs/issues/2268) dht_layout_unref() only uses 'this' to check that 'this->private' i - - [#2278](https://github.com/gluster/glusterfs/issues/2278) nfs-ganesha does not start due to shared storage not ready, but ret - - [#2287](https://github.com/gluster/glusterfs/issues/2287) runner infrastructure 
fails to provide platfrom independent error c - - [#2294](https://github.com/gluster/glusterfs/issues/2294) dict.c: remove some strlen() calls if using DICT_LIST_IMP - - [#2308](https://github.com/gluster/glusterfs/issues/2308) Developer sessions for glusterfs - - [#2313](https://github.com/gluster/glusterfs/issues/2313) Long setting names mess up the columns and break parsing - - [#2317](https://github.com/gluster/glusterfs/issues/2317) Rebalance doesn't migrate some sparse files - - [#2328](https://github.com/gluster/glusterfs/issues/2328) "gluster volume set group samba" needs to include write-b - - [#2330](https://github.com/gluster/glusterfs/issues/2330) gf_msg can cause relock deadlock - - [#2334](https://github.com/gluster/glusterfs/issues/2334) posix_handle_soft() is doing an unnecessary stat - - [#2337](https://github.com/gluster/glusterfs/issues/2337) memory leak observed in lock fop - - [#2348](https://github.com/gluster/glusterfs/issues/2348) Gluster's test suite on RHEL 8 runs slower than on RHEL 7 - - [#2351](https://github.com/gluster/glusterfs/issues/2351) glusterd: After upgrade on release 9.1 glusterd protocol is broken - - [#2353](https://github.com/gluster/glusterfs/issues/2353) Permission issue after upgrading to Gluster v9.1 - - [#2360](https://github.com/gluster/glusterfs/issues/2360) extras: postscript fails on logrotation of snapd logs - - [#2364](https://github.com/gluster/glusterfs/issues/2364) After the service is restarted, a large number of handles are not r - - [#2370](https://github.com/gluster/glusterfs/issues/2370) glusterd: Issues with custom xlator changes - - [#2378](https://github.com/gluster/glusterfs/issues/2378) Remove sys_fstatat() from posix_handle_unset_gfid() function - not - - [#2380](https://github.com/gluster/glusterfs/issues/2380) Remove sys_lstat() from posix_acl_xattr_set() - not needed - - [#2388](https://github.com/gluster/glusterfs/issues/2388) Geo-replication gets delayed when there are many renames on primary - - [#2394](https://github.com/gluster/glusterfs/issues/2394) Spurious failure in tests/basic/fencing/afr-lock-heal-basic.t - - [#2398](https://github.com/gluster/glusterfs/issues/2398) Bitrot and scrub process showed like unknown in the gluster volume - - [#2404](https://github.com/gluster/glusterfs/issues/2404) Spurious failure of tests/bugs/ec/bug-1236065.t - - [#2407](https://github.com/gluster/glusterfs/issues/2407) configure glitch with CC=clang - - [#2410](https://github.com/gluster/glusterfs/issues/2410) dict_xxx_sizen variant compilation should fail on passing a variabl - - [#2414](https://github.com/gluster/glusterfs/issues/2414) Prefer mallinfo2() to mallinfo() if available - - [#2421](https://github.com/gluster/glusterfs/issues/2421) rsync should not try to sync internal xattrs. 
- - [#2429](https://github.com/gluster/glusterfs/issues/2429) Use file timestamps with nanosecond precision - - [#2431](https://github.com/gluster/glusterfs/issues/2431) Drop --disable-syslog configuration option - - [#2440](https://github.com/gluster/glusterfs/issues/2440) Geo-replication not working on Ubuntu 21.04 - - [#2443](https://github.com/gluster/glusterfs/issues/2443) Core dumps on Gluster 9 - 3 replicas - - [#2446](https://github.com/gluster/glusterfs/issues/2446) client_add_lock_for_recovery() - new_client_lock() should be called - - [#2467](https://github.com/gluster/glusterfs/issues/2467) failed to open /proc/0/status: No such file or directory - - [#2470](https://github.com/gluster/glusterfs/issues/2470) sharding: [inode.c:1255:__inode_unlink] 0-inode: dentry not found - - [#2480](https://github.com/gluster/glusterfs/issues/2480) Brick going offline on another host as well as the host which reboo - - [#2502](https://github.com/gluster/glusterfs/issues/2502) xlator/features/locks/src/common.c has code duplication - - [#2507](https://github.com/gluster/glusterfs/issues/2507) Use appropriate msgid in gf_msg() - - [#2515](https://github.com/gluster/glusterfs/issues/2515) Unable to mount the gluster volume using fuse unless iptables is fl - - [#2522](https://github.com/gluster/glusterfs/issues/2522) ganesha_ha (extras/ganesha/ocf): ganesha_grace RA fails in start() - - [#2540](https://github.com/gluster/glusterfs/issues/2540) delay-gen doesn't work correctly for delays longer than 2 seconds - - [#2551](https://github.com/gluster/glusterfs/issues/2551) Sometimes the lock notification feature doesn't work - - [#2581](https://github.com/gluster/glusterfs/issues/2581) With strict-locks enabled clients which are holding posix locks sti - - [#2590](https://github.com/gluster/glusterfs/issues/2590) trusted.io-stats-dump extended attribute usage description error - - [#2611](https://github.com/gluster/glusterfs/issues/2611) Granular entry self-heal is taking more time than full entry self h - - [#2617](https://github.com/gluster/glusterfs/issues/2617) High CPU utilization of thread glfs_fusenoti and huge delays in som - - [#2620](https://github.com/gluster/glusterfs/issues/2620) Granular entry heal purging of index name trigger two lookups in th - - [#2625](https://github.com/gluster/glusterfs/issues/2625) auth.allow value is corrupted after add-brick operation - - [#2626](https://github.com/gluster/glusterfs/issues/2626) entry self-heal does xattrops unnecessarily in many cases - - [#2649](https://github.com/gluster/glusterfs/issues/2649) glustershd failed in bind with error "Address already in use" - - [#2652](https://github.com/gluster/glusterfs/issues/2652) Removal of deadcode: Pump - - [#2659](https://github.com/gluster/glusterfs/issues/2659) tests/basic/afr/afr-anon-inode.t crashed - - [#2664](https://github.com/gluster/glusterfs/issues/2664) Test suite produce uncompressed logs - - [#2693](https://github.com/gluster/glusterfs/issues/2693) dht: dht_local_wipe is crashed while running rename operation - - [#2771](https://github.com/gluster/glusterfs/issues/2771) Smallfile improvement in glusterfs - - [#2782](https://github.com/gluster/glusterfs/issues/2782) Glustereventsd does not listen on IPv4 when IPv6 is not available - - [#2789](https://github.com/gluster/glusterfs/issues/2789) An improper locking bug(e.g., deadlock) on the lock up_inode_ctx->c - - [#2798](https://github.com/gluster/glusterfs/issues/2798) FUSE mount option for localtime-logging is not exposed - - 
[#2816](https://github.com/gluster/glusterfs/issues/2816) Glusterfsd memory leak when subdir_mounting a volume - - [#2835](https://github.com/gluster/glusterfs/issues/2835) dht: found anomalies in dht_layout after commit c4cbdbcb3d02fb56a62 - - [#2857](https://github.com/gluster/glusterfs/issues/2857) variable twice initialization. +- [#504](https://github.com/gluster/glusterfs/issues/504) AFR: remove memcpy() + ntoh32() pattern +- [#705](https://github.com/gluster/glusterfs/issues/705) gf_backtrace_save inefficiencies +- [#782](https://github.com/gluster/glusterfs/issues/782) Do not explicitly call strerror(errnum) when logging +- [#786](https://github.com/gluster/glusterfs/issues/786) glusterd-pmap binds to 10K ports on startup (using IPv4) +- [#904](https://github.com/gluster/glusterfs/issues/904) [bug:1649037] Translators allocate too much memory in their xlator\_ +- [#1000](https://github.com/gluster/glusterfs/issues/1000) [bug:1193929] GlusterFS can be improved +- [#1002](https://github.com/gluster/glusterfs/issues/1002) [bug:1679998] GlusterFS can be improved +- [#1052](https://github.com/gluster/glusterfs/issues/1052) [bug:1693692] Increase code coverage from regression tests +- [#1060](https://github.com/gluster/glusterfs/issues/1060) [bug:789278] Issues reported by Coverity static analysis tool +- [#1096](https://github.com/gluster/glusterfs/issues/1096) [bug:1622665] clang-scan report: glusterfs issues +- [#1101](https://github.com/gluster/glusterfs/issues/1101) [bug:1813029] volume brick fails to come online because other proce +- [#1251](https://github.com/gluster/glusterfs/issues/1251) performance: improve \_\_afr_fd_ctx_get() function +- [#1339](https://github.com/gluster/glusterfs/issues/1339) Rebalance status is not shown correctly after node reboot +- [#1358](https://github.com/gluster/glusterfs/issues/1358) features/shard: wrong "inode->ref" leading to ASSERT in inode_unref +- [#1359](https://github.com/gluster/glusterfs/issues/1359) Cleanup --disable-mempool +- [#1380](https://github.com/gluster/glusterfs/issues/1380) fd_unref() optimization - do an atomic decrement outside the lock a +- [#1384](https://github.com/gluster/glusterfs/issues/1384) mount glusterfs volume, files larger than 64Mb only show 64Mb +- [#1406](https://github.com/gluster/glusterfs/issues/1406) shared storage volume fails to mount in ipv6 environment +- [#1415](https://github.com/gluster/glusterfs/issues/1415) Removing problematic language in geo-replication +- [#1423](https://github.com/gluster/glusterfs/issues/1423) shard_make_block_abspath() should be called with a string of of the +- [#1536](https://github.com/gluster/glusterfs/issues/1536) Improve dict_reset() efficiency +- [#1545](https://github.com/gluster/glusterfs/issues/1545) fuse_invalidate_entry() - too many repetitive calls to uuid_utoa() +- [#1583](https://github.com/gluster/glusterfs/issues/1583) Rework stats structure (xl->stats.total.metrics[fop_idx] and friend +- [#1584](https://github.com/gluster/glusterfs/issues/1584) MAINTAINERS file needs to be revisited and updated +- [#1596](https://github.com/gluster/glusterfs/issues/1596) 'this' NULL check relies on 'THIS' not being NULL +- [#1600](https://github.com/gluster/glusterfs/issues/1600) Save and re-use MYUUID +- [#1678](https://github.com/gluster/glusterfs/issues/1678) Improve gf_error_to_errno() and gf_errno_to_error() positive flow +- [#1695](https://github.com/gluster/glusterfs/issues/1695) Rebalance has a redundant lookup operation +- 
[#1702](https://github.com/gluster/glusterfs/issues/1702) Move GF_CLIENT_PID_GSYNCD check to start of the function. +- [#1703](https://github.com/gluster/glusterfs/issues/1703) Remove trivial check for GF_XATTR_SHARD_FILE_SIZE before calling sh +- [#1707](https://github.com/gluster/glusterfs/issues/1707) PL_LOCAL_GET_REQUESTS access the dictionary twice for the same info +- [#1717](https://github.com/gluster/glusterfs/issues/1717) glusterd: sequence of rebalance and replace/reset-brick presents re +- [#1723](https://github.com/gluster/glusterfs/issues/1723) DHT: further investigation for treating an ongoing mknod's linkto file +- [#1749](https://github.com/gluster/glusterfs/issues/1749) brick-process: call 'notify()' and 'fini()' of brick xlators in a p +- [#1755](https://github.com/gluster/glusterfs/issues/1755) Reduce calls to 'THIS' in fd_destroy() and others, where 'THIS' is +- [#1761](https://github.com/gluster/glusterfs/issues/1761) CONTRIBUTING.md regression can only be run by maintainers +- [#1764](https://github.com/gluster/glusterfs/issues/1764) Slow write on ZFS bricks after healing millions of files due to add +- [#1772](https://github.com/gluster/glusterfs/issues/1772) build: add LTO as a configure option +- [#1773](https://github.com/gluster/glusterfs/issues/1773) DHT/Rebalance - Remove unused variable dht_migrate_file +- [#1779](https://github.com/gluster/glusterfs/issues/1779) Add-brick command should check hostnames with bricks present in vol +- [#1825](https://github.com/gluster/glusterfs/issues/1825) Latency in io-stats should be in nanoseconds resolution, not micros +- [#1872](https://github.com/gluster/glusterfs/issues/1872) Question: How to check heal info without glusterd management layer +- [#1885](https://github.com/gluster/glusterfs/issues/1885) \_\_posix_writev() - reduce memory copies and unneeded zeroing +- [#1888](https://github.com/gluster/glusterfs/issues/1888) GD_OP_VERSION needs to be updated for release-10 +- [#1898](https://github.com/gluster/glusterfs/issues/1898) schedule_georep.py resulting in failure when used with python3 +- [#1909](https://github.com/gluster/glusterfs/issues/1909) core: Avoid several dict OR key is NULL message in brick logs +- [#1925](https://github.com/gluster/glusterfs/issues/1925) dht_pt_getxattr does not seem to handle virtual xattrs. 
+- [#1935](https://github.com/gluster/glusterfs/issues/1935) logging to syslog instead of any glusterfs logs +- [#1943](https://github.com/gluster/glusterfs/issues/1943) glusterd-volgen: Add functionality to accept any custom xlator +- [#1952](https://github.com/gluster/glusterfs/issues/1952) posix-aio: implement GF_FOP_FSYNC +- [#1959](https://github.com/gluster/glusterfs/issues/1959) Broken links in the 2 replicas split-brain-issue - [Bug]Enhancemen +- [#1960](https://github.com/gluster/glusterfs/issues/1960) Add missing LOCK_DESTROY() calls +- [#1966](https://github.com/gluster/glusterfs/issues/1966) Can't print trace details due to memory allocation issues +- [#1977](https://github.com/gluster/glusterfs/issues/1977) Inconsistent locking in presence of disconnects +- [#1978](https://github.com/gluster/glusterfs/issues/1978) test case ./tests/bugs/core/bug-1432542-mpx-restart-crash.t is gett +- [#1981](https://github.com/gluster/glusterfs/issues/1981) Reduce posix_fdstat() calls in IO paths +- [#1991](https://github.com/gluster/glusterfs/issues/1991) mdcache: bug causes getxattr() to report ENODATA when fetching samb +- [#1992](https://github.com/gluster/glusterfs/issues/1992) dht: var decommission_subvols_cnt becomes invalid when config is up +- [#1996](https://github.com/gluster/glusterfs/issues/1996) Analyze if spinlocks have any benefit and remove them if not +- [#2001](https://github.com/gluster/glusterfs/issues/2001) Error handling in /usr/sbin/gluster-eventsapi produces AttributeErr +- [#2005](https://github.com/gluster/glusterfs/issues/2005) ./tests/bugs/replicate/bug-921231.t is continuously failing +- [#2013](https://github.com/gluster/glusterfs/issues/2013) dict_t hash-calculation can be removed when hash_size=1 +- [#2024](https://github.com/gluster/glusterfs/issues/2024) Remove gfs_id variable or at least set to appropriate value +- [#2025](https://github.com/gluster/glusterfs/issues/2025) list_del() should not set prev and next +- [#2033](https://github.com/gluster/glusterfs/issues/2033) tests/bugs/nfs/bug-1053579.t fails on CentOS 8 +- [#2038](https://github.com/gluster/glusterfs/issues/2038) shard_unlink() fails due to no space to create marker file +- [#2039](https://github.com/gluster/glusterfs/issues/2039) Do not allow POSIX IO backend switch when the volume is running +- [#2042](https://github.com/gluster/glusterfs/issues/2042) mount ipv6 gluster volume with serveral backup-volfile-servers,use +- [#2052](https://github.com/gluster/glusterfs/issues/2052) Revert the commit 50e953e2450b5183988c12e87bdfbc997e0ad8a8 +- [#2054](https://github.com/gluster/glusterfs/issues/2054) cleanup call_stub_t from unused variables +- [#2063](https://github.com/gluster/glusterfs/issues/2063) Provide autoconf option to enable/disable storage.linux-io_uring du +- [#2067](https://github.com/gluster/glusterfs/issues/2067) Change self-heal-window-size to 1MB by default +- [#2075](https://github.com/gluster/glusterfs/issues/2075) Annotate synctasks with valgrind API if --enable-valgrind[=memcheck +- [#2080](https://github.com/gluster/glusterfs/issues/2080) Glustereventsd default port +- [#2083](https://github.com/gluster/glusterfs/issues/2083) GD_MSG_DICT_GET_FAILED should not include 'errno' but 'ret' +- [#2086](https://github.com/gluster/glusterfs/issues/2086) Move tests/00-geo-rep/00-georep-verify-non-root-setup.t to tests/00 +- [#2096](https://github.com/gluster/glusterfs/issues/2096) iobuf_arena structure doesn't need passive and active iobufs, but l +- 
[#2099](https://github.com/gluster/glusterfs/issues/2099) 'force' option does not work in the replicated volume snapshot crea +- [#2101](https://github.com/gluster/glusterfs/issues/2101) Move 00-georep-verify-non-root-setup.t back to tests/00-geo-rep/ +- [#2107](https://github.com/gluster/glusterfs/issues/2107) mount crashes when setfattr -n distribute.fix.layout -v "yes" is ex +- [#2116](https://github.com/gluster/glusterfs/issues/2116) enable quota for multiple volumes take more time +- [#2117](https://github.com/gluster/glusterfs/issues/2117) Concurrent quota enable causes glusterd deadlock +- [#2123](https://github.com/gluster/glusterfs/issues/2123) Implement an I/O framework +- [#2129](https://github.com/gluster/glusterfs/issues/2129) CID 1445996 Null pointer dereferences (FORWARD_NULL) /xlators/mgmt/ +- [#2130](https://github.com/gluster/glusterfs/issues/2130) stack.h/c: remove unused variable and reorder struct +- [#2133](https://github.com/gluster/glusterfs/issues/2133) Changelog History Crawl failed after resuming stopped geo-replicati +- [#2134](https://github.com/gluster/glusterfs/issues/2134) Fix spurious failures caused by change in profile info duration to +- [#2138](https://github.com/gluster/glusterfs/issues/2138) glfs_write() dumps a core file file when buffer size is 1GB +- [#2154](https://github.com/gluster/glusterfs/issues/2154) "Operation not supported" doing a chmod on a symlink +- [#2159](https://github.com/gluster/glusterfs/issues/2159) Remove unused component tests +- [#2161](https://github.com/gluster/glusterfs/issues/2161) Crash caused by memory corruption +- [#2169](https://github.com/gluster/glusterfs/issues/2169) Stack overflow when parallel-readdir is enabled +- [#2180](https://github.com/gluster/glusterfs/issues/2180) CID 1446716: Memory - illegal accesses (USE_AFTER_FREE) /xlators/mg +- [#2187](https://github.com/gluster/glusterfs/issues/2187) [Input/output error] IO failure while performing shrink operation w +- [#2190](https://github.com/gluster/glusterfs/issues/2190) Move a test case tests/basic/glusterd-restart-shd-mux.t to flaky +- [#2192](https://github.com/gluster/glusterfs/issues/2192) 4+1 arbiter setup is broken +- [#2198](https://github.com/gluster/glusterfs/issues/2198) There are blocked inodelks for a long time +- [#2216](https://github.com/gluster/glusterfs/issues/2216) Fix coverity issues +- [#2232](https://github.com/gluster/glusterfs/issues/2232) "Invalid argument" when reading a directory with gfapi +- [#2234](https://github.com/gluster/glusterfs/issues/2234) Segmentation fault in directory quota daemon for replicated volume +- [#2239](https://github.com/gluster/glusterfs/issues/2239) rebalance crashes in dht on master +- [#2241](https://github.com/gluster/glusterfs/issues/2241) Using readdir instead of readdirp for fix-layout increases performa +- [#2253](https://github.com/gluster/glusterfs/issues/2253) Disable lookup-optimize by default in the virt group +- [#2258](https://github.com/gluster/glusterfs/issues/2258) Provide option to disable fsync in data migration +- [#2260](https://github.com/gluster/glusterfs/issues/2260) failed to list quota info after setting limit-usage +- [#2268](https://github.com/gluster/glusterfs/issues/2268) dht_layout_unref() only uses 'this' to check that 'this->private' i +- [#2278](https://github.com/gluster/glusterfs/issues/2278) nfs-ganesha does not start due to shared storage not ready, but ret +- [#2287](https://github.com/gluster/glusterfs/issues/2287) runner infrastructure fails to provide platfrom 
independent error c +- [#2294](https://github.com/gluster/glusterfs/issues/2294) dict.c: remove some strlen() calls if using DICT_LIST_IMP +- [#2308](https://github.com/gluster/glusterfs/issues/2308) Developer sessions for glusterfs +- [#2313](https://github.com/gluster/glusterfs/issues/2313) Long setting names mess up the columns and break parsing +- [#2317](https://github.com/gluster/glusterfs/issues/2317) Rebalance doesn't migrate some sparse files +- [#2328](https://github.com/gluster/glusterfs/issues/2328) "gluster volume set group samba" needs to include write-b +- [#2330](https://github.com/gluster/glusterfs/issues/2330) gf_msg can cause relock deadlock +- [#2334](https://github.com/gluster/glusterfs/issues/2334) posix_handle_soft() is doing an unnecessary stat +- [#2337](https://github.com/gluster/glusterfs/issues/2337) memory leak observed in lock fop +- [#2348](https://github.com/gluster/glusterfs/issues/2348) Gluster's test suite on RHEL 8 runs slower than on RHEL 7 +- [#2351](https://github.com/gluster/glusterfs/issues/2351) glusterd: After upgrade on release 9.1 glusterd protocol is broken +- [#2353](https://github.com/gluster/glusterfs/issues/2353) Permission issue after upgrading to Gluster v9.1 +- [#2360](https://github.com/gluster/glusterfs/issues/2360) extras: postscript fails on logrotation of snapd logs +- [#2364](https://github.com/gluster/glusterfs/issues/2364) After the service is restarted, a large number of handles are not r +- [#2370](https://github.com/gluster/glusterfs/issues/2370) glusterd: Issues with custom xlator changes +- [#2378](https://github.com/gluster/glusterfs/issues/2378) Remove sys_fstatat() from posix_handle_unset_gfid() function - not +- [#2380](https://github.com/gluster/glusterfs/issues/2380) Remove sys_lstat() from posix_acl_xattr_set() - not needed +- [#2388](https://github.com/gluster/glusterfs/issues/2388) Geo-replication gets delayed when there are many renames on primary +- [#2394](https://github.com/gluster/glusterfs/issues/2394) Spurious failure in tests/basic/fencing/afr-lock-heal-basic.t +- [#2398](https://github.com/gluster/glusterfs/issues/2398) Bitrot and scrub process showed like unknown in the gluster volume +- [#2404](https://github.com/gluster/glusterfs/issues/2404) Spurious failure of tests/bugs/ec/bug-1236065.t +- [#2407](https://github.com/gluster/glusterfs/issues/2407) configure glitch with CC=clang +- [#2410](https://github.com/gluster/glusterfs/issues/2410) dict_xxx_sizen variant compilation should fail on passing a variabl +- [#2414](https://github.com/gluster/glusterfs/issues/2414) Prefer mallinfo2() to mallinfo() if available +- [#2421](https://github.com/gluster/glusterfs/issues/2421) rsync should not try to sync internal xattrs. 
+- [#2429](https://github.com/gluster/glusterfs/issues/2429) Use file timestamps with nanosecond precision +- [#2431](https://github.com/gluster/glusterfs/issues/2431) Drop --disable-syslog configuration option +- [#2440](https://github.com/gluster/glusterfs/issues/2440) Geo-replication not working on Ubuntu 21.04 +- [#2443](https://github.com/gluster/glusterfs/issues/2443) Core dumps on Gluster 9 - 3 replicas +- [#2446](https://github.com/gluster/glusterfs/issues/2446) client_add_lock_for_recovery() - new_client_lock() should be called +- [#2467](https://github.com/gluster/glusterfs/issues/2467) failed to open /proc/0/status: No such file or directory +- [#2470](https://github.com/gluster/glusterfs/issues/2470) sharding: [inode.c:1255:__inode_unlink] 0-inode: dentry not found +- [#2480](https://github.com/gluster/glusterfs/issues/2480) Brick going offline on another host as well as the host which reboo +- [#2502](https://github.com/gluster/glusterfs/issues/2502) xlator/features/locks/src/common.c has code duplication +- [#2507](https://github.com/gluster/glusterfs/issues/2507) Use appropriate msgid in gf_msg() +- [#2515](https://github.com/gluster/glusterfs/issues/2515) Unable to mount the gluster volume using fuse unless iptables is fl +- [#2522](https://github.com/gluster/glusterfs/issues/2522) ganesha_ha (extras/ganesha/ocf): ganesha_grace RA fails in start() +- [#2540](https://github.com/gluster/glusterfs/issues/2540) delay-gen doesn't work correctly for delays longer than 2 seconds +- [#2551](https://github.com/gluster/glusterfs/issues/2551) Sometimes the lock notification feature doesn't work +- [#2581](https://github.com/gluster/glusterfs/issues/2581) With strict-locks enabled clients which are holding posix locks sti +- [#2590](https://github.com/gluster/glusterfs/issues/2590) trusted.io-stats-dump extended attribute usage description error +- [#2611](https://github.com/gluster/glusterfs/issues/2611) Granular entry self-heal is taking more time than full entry self h +- [#2617](https://github.com/gluster/glusterfs/issues/2617) High CPU utilization of thread glfs_fusenoti and huge delays in som +- [#2620](https://github.com/gluster/glusterfs/issues/2620) Granular entry heal purging of index name trigger two lookups in th +- [#2625](https://github.com/gluster/glusterfs/issues/2625) auth.allow value is corrupted after add-brick operation +- [#2626](https://github.com/gluster/glusterfs/issues/2626) entry self-heal does xattrops unnecessarily in many cases +- [#2649](https://github.com/gluster/glusterfs/issues/2649) glustershd failed in bind with error "Address already in use" +- [#2652](https://github.com/gluster/glusterfs/issues/2652) Removal of deadcode: Pump +- [#2659](https://github.com/gluster/glusterfs/issues/2659) tests/basic/afr/afr-anon-inode.t crashed +- [#2664](https://github.com/gluster/glusterfs/issues/2664) Test suite produce uncompressed logs +- [#2693](https://github.com/gluster/glusterfs/issues/2693) dht: dht_local_wipe is crashed while running rename operation +- [#2771](https://github.com/gluster/glusterfs/issues/2771) Smallfile improvement in glusterfs +- [#2782](https://github.com/gluster/glusterfs/issues/2782) Glustereventsd does not listen on IPv4 when IPv6 is not available +- [#2789](https://github.com/gluster/glusterfs/issues/2789) An improper locking bug(e.g., deadlock) on the lock up_inode_ctx->c +- [#2798](https://github.com/gluster/glusterfs/issues/2798) FUSE mount option for localtime-logging is not exposed +- 
[#2816](https://github.com/gluster/glusterfs/issues/2816) Glusterfsd memory leak when subdir_mounting a volume +- [#2835](https://github.com/gluster/glusterfs/issues/2835) dht: found anomalies in dht_layout after commit c4cbdbcb3d02fb56a62 +- [#2857](https://github.com/gluster/glusterfs/issues/2857) variable twice initialization. diff --git a/docs/release-notes/10.1.md b/docs/release-notes/10.1.md index fd79600..bd3396e 100644 --- a/docs/release-notes/10.1.md +++ b/docs/release-notes/10.1.md @@ -12,15 +12,18 @@ This is a bugfix and improvement release. The release notes for [10.0](10.0.md) - Users are highly encouraged to upgrade to newer releases of GlusterFS. ## Important fixes in this release + - Fix missing stripe count issue with upgrade from 9.x to 10.x - Fix IO failure when shrinking distributed dispersed volume with ongoing IO - Fix log spam introduced with glusterfs 10.0 - Enable ltcmalloc_minimal instead of ltcmalloc ## Builds are available at - + [https://download.gluster.org/pub/gluster/glusterfs/10/10.1/](https://download.gluster.org/pub/gluster/glusterfs/10/10.1/) ## Bugs addressed + - [#2846](https://github.com/gluster/glusterfs/issues/2846) Avoid redundant logs in gluster - [#2903](https://github.com/gluster/glusterfs/issues/2903) Fix worker disconnect due to AttributeError in geo-replication - [#2910](https://github.com/gluster/glusterfs/issues/2910) Check for available ports in port_range in glusterd diff --git a/docs/release-notes/10.2.md b/docs/release-notes/10.2.md index c3f4f92..c66065f 100644 --- a/docs/release-notes/10.2.md +++ b/docs/release-notes/10.2.md @@ -3,22 +3,26 @@ This is a bugfix and improvement release. The release notes for [10.0](10.0.md) and [10.1](10.1.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 10 stable release. **NOTE:** + - Next minor release tentative date: Week of 15th Aug, 2022 - Users are highly encouraged to upgrade to newer releases of GlusterFS. 
## Important fixes in this release + - Optimize server functionality by enhancing server_process_event_upcall code path during the handling of upcall event - Fix all bricks not starting issue on node reboot when brick count is high(>750) - Fix stale posix locks that appear after client disconnection ## Builds are available at + [https://download.gluster.org/pub/gluster/glusterfs/10/10.2/](https://download.gluster.org/pub/gluster/glusterfs/10/10.2/) ## Bugs addressed + - [#3182](https://github.com/gluster/glusterfs/issues/3182) Fix stale posix locks that appear after client disconnection - [#3187](https://github.com/gluster/glusterfs/issues/3187) Fix Locks xlator fd leaks - [#3234](https://github.com/gluster/glusterfs/issues/3234) Fix incorrect directory check inorder to successfully locate the SSL certificate -- [#3262](https://github.com/gluster/glusterfs/issues/3262) Synchronize layout_(ref|unref) during layout_(get|set) in dht +- [#3262](https://github.com/gluster/glusterfs/issues/3262) Synchronize layout*(ref|unref) during layout*(get|set) in dht - [#3321](https://github.com/gluster/glusterfs/issues/3321) Optimize server functionality by enhancing server_process_event_upcall code path during the handling of upcall event - [#3334](https://github.com/gluster/glusterfs/issues/3334) Fix errors and timeouts when creating qcow2 file via libgfapi - [#3375](https://github.com/gluster/glusterfs/issues/3375) Fix all bricks not starting issue on node reboot when brick count is high(>750) diff --git a/docs/release-notes/3.10.0.md b/docs/release-notes/3.10.0.md index 2e9bd0e..fa856d6 100644 --- a/docs/release-notes/3.10.0.md +++ b/docs/release-notes/3.10.0.md @@ -11,28 +11,30 @@ of bugs that has been addressed is included further below. ## Major changes and features ### Brick multiplexing -*Notes for users:* -Multiplexing reduces both port and memory usage. It does *not* improve + +_Notes for users:_ +Multiplexing reduces both port and memory usage. It does _not_ improve performance vs. non-multiplexing except when memory is the limiting factor, though there are other related changes that improve performance overall (e.g. compared to 3.9). -Multiplexing is off by default. It can be enabled with +Multiplexing is off by default. It can be enabled with ```bash # gluster volume set all cluster.brick-multiplex on ``` -*Limitations:* +_Limitations:_ There are currently no tuning options for multiplexing - it's all or nothing. This will change in the near future. -*Known Issues:* +_Known Issues:_ The only feature or combination of features known not to work with multiplexing -is USS and SSL. Anyone using that combination should leave multiplexing off. +is USS and SSL. Anyone using that combination should leave multiplexing off. ### Support to display op-version information from clients -*Notes for users:* + +_Notes for users:_ To get information on what op-version are supported by the clients, users can invoke the `gluster volume status` command for clients. Along with information on hostname, port, bytes read, bytes written and number of clients connected @@ -43,12 +45,13 @@ operate. Following is the example usage: # gluster volume status clients ``` -*Limitations:* +_Limitations:_ -*Known Issues:* +_Known Issues:_ ### Support to get maximum op-version in a heterogeneous cluster -*Notes for users:* + +_Notes for users:_ A heterogeneous cluster operates on a common op-version that can be supported across all the nodes in the trusted storage pool. 
Upon upgrade of the nodes in the cluster, the cluster might support a higher op-version. Users can retrieve @@ -60,12 +63,13 @@ the `gluster volume get` command on the newly introduced global option, # gluster volume get all cluster.max-op-version ``` -*Limitations:* +_Limitations:_ -*Known Issues:* +_Known Issues:_ ### Support for rebalance time to completion estimation -*Notes for users:* + +_Notes for users:_ Users can now see approximately how much time the rebalance operation will take to complete across all nodes. @@ -76,27 +80,27 @@ as part of the rebalance status. Use the command: # gluster volume rebalance status ``` -*Limitations:* +_Limitations:_ The rebalance process calculates the time left based on the rate at while files are processed on the node and the total number of files on the brick which is determined using statfs. The limitations of this are: - * A single fs partition must host only one brick. Multiple bricks on -the same fs partition will cause the statfs results to be invalid. +- A single fs partition must host only one brick. Multiple bricks on + the same fs partition will cause the statfs results to be invalid. - * The estimates are dynamic and are recalculated every time the rebalance status -command is invoked.The estimates become more accurate over time so short running -rebalance operations may not benefit. +- The estimates are dynamic and are recalculated every time the rebalance status + command is invoked.The estimates become more accurate over time so short running + rebalance operations may not benefit. -*Known Issues:* +_Known Issues:_ As glusterfs does not stored the number of files on the brick, we use statfs to guess the number. The .glusterfs directory contents can significantly skew this number and affect the calculated estimates. - ### Separation of tier as its own service -*Notes for users:* + +_Notes for users:_ This change is to move the management of the tier daemon into the gluster service framework, thereby improving it stability and manageability by the service framework. @@ -104,24 +108,26 @@ service framework. This has no change to any of the tier commands or user facing interfaces and operations. -*Limitations:* +_Limitations:_ -*Known Issues:* +_Known Issues:_ ### Statedump support for gfapi based applications -*Notes for users:* + +_Notes for users:_ gfapi based applications now can dump state information for better trouble shooting of issues. A statedump can be triggered in two ways: 1. by executing the following on one of the Gluster servers, + ```bash # gluster volume statedump client : ``` - - `` should be replaced by the name of the volume - - `` should be replaced by the hostname of the system running the - gfapi application - - `` should be replaced by the PID of the gfapi application +- `` should be replaced by the name of the volume +- `` should be replaced by the hostname of the system running the + gfapi application +- `` should be replaced by the PID of the gfapi application 2. through calling `glfs_sysrq(, GLFS_SYSRQ_STATEDUMP)` within the application @@ -131,7 +137,7 @@ shooting of issues. A statedump can be triggered in two ways: All statedumps (`*.dump.*` files) will be located at the usual location, on most distributions this would be `/var/run/gluster/`. -*Limitations:* +_Limitations:_ It is not possible to trigger statedumps from the Gluster CLI when the gfapi application has lost its management connection to the GlusterD servers. 
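[Editor's illustration, not part of the patch above.] The statedump feature described in this hunk can be triggered from inside a gfapi application via `glfs_sysrq(fs, GLFS_SYSRQ_STATEDUMP)`. A minimal sketch of such an application follows; the volume name `datavol`, the host `server1`, and the source file name are placeholders, and the build line assumes the usual `glusterfs-api` pkg-config module is installed.

```c
/*
 * Hedged sketch of an in-application statedump trigger (GlusterFS >= 3.10).
 * "datavol" and "server1" are hypothetical; replace with your volume/host.
 * Typical build: gcc statedump-demo.c $(pkg-config --cflags --libs glusterfs-api)
 */
#include <stdio.h>
#include <glusterfs/api/glfs.h>

int
main(void)
{
    glfs_t *fs = glfs_new("datavol");   /* hypothetical volume name */
    if (!fs)
        return 1;

    /* Point the handle at a management server (default glusterd port). */
    glfs_set_volfile_server(fs, "tcp", "server1", 24007);

    if (glfs_init(fs) != 0) {
        fprintf(stderr, "glfs_init failed\n");
        glfs_fini(fs);
        return 1;
    }

    /* Ask the library to write a statedump; the *.dump.* file lands in the
     * usual location (commonly /var/run/gluster/). */
    if (glfs_sysrq(fs, GLFS_SYSRQ_STATEDUMP) != 0)
        fprintf(stderr, "statedump request failed\n");

    glfs_fini(fs);
    return 0;
}
```

In practice an application would call `glfs_sysrq()` from a signal handler or admin hook rather than immediately after `glfs_init()`; the CLI path (`gluster volume statedump <volname> client <host>:<pid>`) shown earlier in this hunk remains available as long as the management connection is up.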
@@ -141,24 +147,26 @@ GlusterFS 3.10 is the first release that contains support for the new debugging will need to be adapted to call this function. At the time of the release of 3.10, no applications are known to call `glfs_sysrq()`. -*Known Issues:* +_Known Issues:_ ### Disabled creation of trash directory by default -*Notes for users:* + +_Notes for users:_ From now onwards trash directory, namely .trashcan, will not be be created by default upon creation of new volumes unless and until the feature is turned ON and the restrictions on the same will be applicable as long as features.trash is set for a particular volume. -*Limitations:* +_Limitations:_ After upgrade for pre-existing volumes, trash directory will be still present at root of the volume. Those who are not interested in this feature may have to manually delete the directory from the mount point. -*Known Issues:* +_Known Issues:_ ### Implemented parallel readdirp with distribute xlator -*Notes for users:* + +_Notes for users:_ Currently the directory listing gets slower as the number of bricks/nodes increases in a volume, though the file/directory numbers remain unchanged. With this feature, the performance of directory listing is made mostly @@ -167,28 +175,32 @@ exponentially reduce the directory listing performance. (On a 2, 5, 10, 25 brick setup we saw ~5, 100, 400, 450% improvement consecutively) To enable this feature: + ```bash # gluster volume set performance.readdir-ahead on # gluster volume set performance.parallel-readdir on ``` To disable this feature: + ```bash # gluster volume set performance.parallel-readdir off ``` If there are more than 50 bricks in the volume it is good to increase the cache size to be more than 10Mb (default value): + ```bash # gluster volume set performance.rda-cache-limit ``` -*Limitations:* +_Limitations:_ -*Known Issues:* +_Known Issues:_ ### md-cache can optionally -ve cache security.ima xattr -*Notes for users:* + +_Notes for users:_ From kernel version 3.X or greater, creating of a file results in removexattr call on security.ima xattr. This xattr is not set on the file unless IMA feature is active. With this patch, removxattr call returns ENODATA if it is @@ -197,18 +209,20 @@ not found in the cache. The end benefit is faster create operations where IMA is not enabled. To cache this xattr use, + ```bash # gluster volume set performance.cache-ima-xattrs on ``` The above option is on by default. -*Limitations:* +_Limitations:_ -*Known Issues:* +_Known Issues:_ ### Added support for CPU extensions in disperse computations -*Notes for users:* + +_Notes for users:_ To improve disperse computations, a new way of generating dynamic code targeting specific CPU extensions like SSE and AVX on Intel processors is implemented. The available extensions are detected on run time. 
This can @@ -226,18 +240,18 @@ command: Valid values are: -* none: Completely disable dynamic code generation -* auto: Automatically detect available extensions and use the best one -* x64: Use dynamic code generation using standard 64 bits instructions -* sse: Use dynamic code generation using SSE extensions (128 bits) -* avx: Use dynamic code generation using AVX extensions (256 bits) +- none: Completely disable dynamic code generation +- auto: Automatically detect available extensions and use the best one +- x64: Use dynamic code generation using standard 64 bits instructions +- sse: Use dynamic code generation using SSE extensions (128 bits) +- avx: Use dynamic code generation using AVX extensions (256 bits) The default value is 'auto'. If a value is specified that is not detected on run-time, it will automatically fall back to the next available option. -*Limitations:* +_Limitations:_ -*Known Issues:* +_Known Issues:_ To solve a conflict between the dynamic code generator and SELinux, it has been necessary to create a dynamic file on runtime in the directory /usr/libexec/glusterfs. This directory only exists if the server package @@ -271,20 +285,20 @@ Bugs addressed since release-3.9 are listed below. - [#1325531](https://bugzilla.redhat.com/1325531): Statedump: Add per xlator ref counting for inode - [#1325792](https://bugzilla.redhat.com/1325792): "gluster vol heal test statistics heal-count replica" seems doesn't work - [#1330604](https://bugzilla.redhat.com/1330604): out-of-tree builds generate XDR headers and source files in the original directory -- [#1336371](https://bugzilla.redhat.com/1336371): Sequential volume start&stop is failing with SSL enabled setup. -- [#1341948](https://bugzilla.redhat.com/1341948): DHT: Rebalance- Misleading log messages from __dht_check_free_space function +- [#1336371](https://bugzilla.redhat.com/1336371): Sequential volume start&stop is failing with SSL enabled setup. +- [#1341948](https://bugzilla.redhat.com/1341948): DHT: Rebalance- Misleading log messages from \_\_dht_check_free_space function - [#1344714](https://bugzilla.redhat.com/1344714): removal of file from nfs mount crashs ganesha server - [#1349385](https://bugzilla.redhat.com/1349385): [FEAT]jbr: Add rollbacking of failed fops - [#1355956](https://bugzilla.redhat.com/1355956): RFE : move ganesha related configuration into shared storage - [#1356076](https://bugzilla.redhat.com/1356076): DHT doesn't evenly balance files on FreeBSD with ZFS -- [#1356960](https://bugzilla.redhat.com/1356960): OOM Kill on client when heal is in progress on 1*(2+1) arbiter volume +- [#1356960](https://bugzilla.redhat.com/1356960): OOM Kill on client when heal is in progress on 1\*(2+1) arbiter volume - [#1357753](https://bugzilla.redhat.com/1357753): JSON output for all Events CLI commands - [#1357754](https://bugzilla.redhat.com/1357754): Delayed Events if any one Webhook is slow - [#1358296](https://bugzilla.redhat.com/1358296): tier: breaking down the monolith processing function tier_migrate_using_query_file() - [#1359612](https://bugzilla.redhat.com/1359612): [RFE] Geo-replication Logging Improvements - [#1360670](https://bugzilla.redhat.com/1360670): Add output option `--xml` to man page of gluster - [#1363595](https://bugzilla.redhat.com/1363595): Node remains in stopped state in pcs status with "/usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]" messages in logs. 
-- [#1363965](https://bugzilla.redhat.com/1363965): geo-replication *changes.log does not respect the log-level configured +- [#1363965](https://bugzilla.redhat.com/1363965): geo-replication \*changes.log does not respect the log-level configured - [#1364420](https://bugzilla.redhat.com/1364420): [RFE] History Crawl performance improvement - [#1365395](https://bugzilla.redhat.com/1365395): Support for rc.d and init for Service management - [#1365740](https://bugzilla.redhat.com/1365740): dht: Update stbuf from servers having layout @@ -298,7 +312,7 @@ Bugs addressed since release-3.9 are listed below. - [#1368138](https://bugzilla.redhat.com/1368138): Crash of glusterd when using long username with geo-replication - [#1368312](https://bugzilla.redhat.com/1368312): Value of `replica.split-brain-status' attribute of a directory in metadata split-brain in a dist-rep volume reads that it is not in split-brain - [#1368336](https://bugzilla.redhat.com/1368336): [RFE] Tier Events -- [#1369077](https://bugzilla.redhat.com/1369077): The directories get renamed when data bricks are offline in 4*(2+1) volume +- [#1369077](https://bugzilla.redhat.com/1369077): The directories get renamed when data bricks are offline in 4\*(2+1) volume - [#1369124](https://bugzilla.redhat.com/1369124): fix unused variable warnings from out-of-tree builds generate XDR headers and source files i... - [#1369397](https://bugzilla.redhat.com/1369397): segment fault in changelog_cleanup_dispatchers - [#1369403](https://bugzilla.redhat.com/1369403): [RFE]: events from protocol server @@ -366,14 +380,14 @@ Bugs addressed since release-3.9 are listed below. - [#1384142](https://bugzilla.redhat.com/1384142): crypt: changes needed for openssl-1.1 (coming in Fedora 26) - [#1384297](https://bugzilla.redhat.com/1384297): glusterfs can't self heal character dev file for invalid dev_t parameters - [#1384906](https://bugzilla.redhat.com/1384906): arbiter volume write performance is bad with sharding -- [#1385104](https://bugzilla.redhat.com/1385104): invalid argument warning messages seen in fuse client logs 2016-09-30 06:34:58.938667] W [dict.c:418ict_set] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x58722) 0-dict: !this || !value for key=link-count [Invalid argument] +- [#1385104](https://bugzilla.redhat.com/1385104): invalid argument warning messages seen in fuse client logs 2016-09-30 06:34:58.938667] W [dict.c:418ict_set] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x58722) 0-dict: !this || !value for key=link-count [Invalid argument] - [#1385575](https://bugzilla.redhat.com/1385575): pmap_signin event fails to update brickinfo->signed_in flag - [#1385593](https://bugzilla.redhat.com/1385593): Fix some spelling mistakes in comments and log messages -- [#1385839](https://bugzilla.redhat.com/1385839): Incorrect volume type in the "glusterd_state" file generated using CLI "gluster get-state" +- [#1385839](https://bugzilla.redhat.com/1385839): Incorrect volume type in the "glusterd_state" file generated using CLI "gluster get-state" - [#1386088](https://bugzilla.redhat.com/1386088): Memory Leaks in snapshot code path -- [#1386097](https://bugzilla.redhat.com/1386097): 4 of 8 bricks (2 dht subvols) crashed on systemic setup +- [#1386097](https://bugzilla.redhat.com/1386097): 4 of 8 bricks (2 dht subvols) crashed on systemic setup - [#1386123](https://bugzilla.redhat.com/1386123): geo-replica slave node goes faulty for non-root user session due to fail to locate gluster binary -- 
[#1386141](https://bugzilla.redhat.com/1386141): Error and warning message getting while removing glusterfs-events package +- [#1386141](https://bugzilla.redhat.com/1386141): Error and warning message getting while removing glusterfs-events package - [#1386188](https://bugzilla.redhat.com/1386188): Asynchronous Unsplit-brain still causes Input/Output Error on system calls - [#1386200](https://bugzilla.redhat.com/1386200): Log all published events - [#1386247](https://bugzilla.redhat.com/1386247): [Eventing]: 'gluster volume tier start force' does not generate a TIER_START event @@ -417,7 +431,7 @@ Bugs addressed since release-3.9 are listed below. - [#1395648](https://bugzilla.redhat.com/1395648): ganesha-ha.conf --status should validate if the VIPs are assigned to right nodes - [#1395660](https://bugzilla.redhat.com/1395660): Checkpoint completed event missing master node detail - [#1395687](https://bugzilla.redhat.com/1395687): Client side IObuff leaks at a high pace consumes complete client memory and hence making gluster volume inaccessible -- [#1395993](https://bugzilla.redhat.com/1395993): heal info --xml when bricks are down in a systemic environment is not displaying anything even after more than 30minutes +- [#1395993](https://bugzilla.redhat.com/1395993): heal info --xml when bricks are down in a systemic environment is not displaying anything even after more than 30minutes - [#1396038](https://bugzilla.redhat.com/1396038): refresh-config fails and crashes ganesha when mdcache is enabled on the volume. - [#1396048](https://bugzilla.redhat.com/1396048): A hard link is lost during rebalance+lookup - [#1396062](https://bugzilla.redhat.com/1396062): [geo-rep]: Worker crashes seen while renaming directories in loop @@ -447,11 +461,11 @@ Bugs addressed since release-3.9 are listed below. 
- [#1400013](https://bugzilla.redhat.com/1400013): [USS,SSL] .snaps directory is not reachable when I/O encryption (SSL) is enabled - [#1400026](https://bugzilla.redhat.com/1400026): Duplicate value assigned to GD_MSG_DAEMON_STATE_REQ_RCVD and GD_MSG_BRICK_CLEANUP_SUCCESS messages - [#1400237](https://bugzilla.redhat.com/1400237): Ganesha services are not stopped when pacemaker quorum is lost -- [#1400613](https://bugzilla.redhat.com/1400613): [GANESHA] failed to create directory of hostname of new node in var/lib/nfs/ganesha/ in already existing cluster nodes -- [#1400818](https://bugzilla.redhat.com/1400818): possible memory leak on client when writing to a file while another client issues a truncate +- [#1400613](https://bugzilla.redhat.com/1400613): [GANESHA] failed to create directory of hostname of new node in var/lib/nfs/ganesha/ in already existing cluster nodes +- [#1400818](https://bugzilla.redhat.com/1400818): possible memory leak on client when writing to a file while another client issues a truncate - [#1401095](https://bugzilla.redhat.com/1401095): log the error when locking the brick directory fails - [#1401218](https://bugzilla.redhat.com/1401218): Fix compound fops memory leaks -- [#1401404](https://bugzilla.redhat.com/1401404): [Arbiter] IO's Halted and heal info command hung +- [#1401404](https://bugzilla.redhat.com/1401404): [Arbiter] IO's Halted and heal info command hung - [#1401777](https://bugzilla.redhat.com/1401777): atime becomes zero when truncating file via ganesha (or gluster-NFS) - [#1401801](https://bugzilla.redhat.com/1401801): [RFE] Use Host UUID to find local nodes to spawn workers - [#1401812](https://bugzilla.redhat.com/1401812): RFE: Make readdirp parallel in dht @@ -463,7 +477,7 @@ Bugs addressed since release-3.9 are listed below. - [#1402369](https://bugzilla.redhat.com/1402369): Getting the warning message while erasing the gluster "glusterfs-server" package. - [#1402710](https://bugzilla.redhat.com/1402710): ls and move hung on disperse volume - [#1402730](https://bugzilla.redhat.com/1402730): self-heal not happening, as self-heal info lists the same pending shards to be healed -- [#1402828](https://bugzilla.redhat.com/1402828): Snapshot: Snapshot create command fails when gluster-shared-storage volume is stopped +- [#1402828](https://bugzilla.redhat.com/1402828): Snapshot: Snapshot create command fails when gluster-shared-storage volume is stopped - [#1402841](https://bugzilla.redhat.com/1402841): Files remain unhealed forever if shd is disabled and re-enabled while healing is in progress. - [#1403130](https://bugzilla.redhat.com/1403130): [GANESHA] Adding a node to cluster failed to allocate resource-agents to new node. - [#1403780](https://bugzilla.redhat.com/1403780): Incorrect incrementation of volinfo refcnt during volume start @@ -495,7 +509,7 @@ Bugs addressed since release-3.9 are listed below. 
- [#1408757](https://bugzilla.redhat.com/1408757): Fix failure of split-brain-favorite-child-policy.t in CentOS7 - [#1408758](https://bugzilla.redhat.com/1408758): tests/bugs/glusterd/bug-913555.t fails spuriously - [#1409078](https://bugzilla.redhat.com/1409078): RFE: Need a command to check op-version compatibility of clients -- [#1409186](https://bugzilla.redhat.com/1409186): Dict_t leak in dht_migration_complete_check_task and dht_rebalance_inprogress_task +- [#1409186](https://bugzilla.redhat.com/1409186): Dict_t leak in dht_migration_complete_check_task and dht_rebalance_inprogress_task - [#1409202](https://bugzilla.redhat.com/1409202): Warning messages throwing when EC volume offline brick comes up are difficult to understand for end user. - [#1409206](https://bugzilla.redhat.com/1409206): Extra lookup/fstats are sent over the network when a brick is down. - [#1409727](https://bugzilla.redhat.com/1409727): [ganesha + EC]posix compliance rename tests failed on EC volume with nfs-ganesha mount. @@ -531,7 +545,7 @@ Bugs addressed since release-3.9 are listed below. - [#1417042](https://bugzilla.redhat.com/1417042): glusterd restart is starting the offline shd daemon on other node in the cluster - [#1417135](https://bugzilla.redhat.com/1417135): [Stress] : SHD Logs flooded with "Heal Failed" messages,filling up "/" quickly - [#1417521](https://bugzilla.redhat.com/1417521): [SNAPSHOT] With all USS plugin enable .snaps directory is not visible in cifs mount as well as windows mount -- [#1417527](https://bugzilla.redhat.com/1417527): glusterfind: After glusterfind pre command execution all temporary files and directories /usr/var/lib/misc/glusterfsd/glusterfind/// should be removed +- [#1417527](https://bugzilla.redhat.com/1417527): glusterfind: After glusterfind pre command execution all temporary files and directories /usr/var/lib/misc/glusterfsd/glusterfind/// should be removed - [#1417804](https://bugzilla.redhat.com/1417804): debug/trace: Print iatts of individual entries in readdirp callback for better debugging experience - [#1418091](https://bugzilla.redhat.com/1418091): [RFE] Support multiple bricks in one process (multiplexing) - [#1418536](https://bugzilla.redhat.com/1418536): Portmap allocates way too much memory (256KB) on stack @@ -555,11 +569,11 @@ Bugs addressed since release-3.9 are listed below. - [#1420987](https://bugzilla.redhat.com/1420987): warning messages seen in glusterd logs while setting the volume option - [#1420989](https://bugzilla.redhat.com/1420989): when server-quorum is enabled, volume get returns 0 value for server-quorum-ratio - [#1420991](https://bugzilla.redhat.com/1420991): Modified volume options not synced once offline nodes comes up. -- [#1421017](https://bugzilla.redhat.com/1421017): CLI option "--timeout" is accepting non numeric and negative values. +- [#1421017](https://bugzilla.redhat.com/1421017): CLI option "--timeout" is accepting non numeric and negative values. 
- [#1421956](https://bugzilla.redhat.com/1421956): Disperse: Fallback to pre-compiled code execution when dynamic code generation fails - [#1422350](https://bugzilla.redhat.com/1422350): glustershd process crashed on systemic setup - [#1422363](https://bugzilla.redhat.com/1422363): [Replicate] "RPC call decoding failed" leading to IO hang & mount inaccessible -- [#1422391](https://bugzilla.redhat.com/1422391): Gluster NFS server crashing in __mnt3svc_umountall +- [#1422391](https://bugzilla.redhat.com/1422391): Gluster NFS server crashing in \_\_mnt3svc_umountall - [#1422766](https://bugzilla.redhat.com/1422766): Entry heal messages in glustershd.log while no entries shown in heal info - [#1422777](https://bugzilla.redhat.com/1422777): DHT doesn't evenly balance files on FreeBSD with ZFS - [#1422819](https://bugzilla.redhat.com/1422819): [Geo-rep] Recreating geo-rep session with same slave after deleting with reset-sync-time fails to sync diff --git a/docs/release-notes/3.10.1.md b/docs/release-notes/3.10.1.md index d1d48ba..47eb068 100644 --- a/docs/release-notes/3.10.1.md +++ b/docs/release-notes/3.10.1.md @@ -6,17 +6,17 @@ bugs in the GlusterFS 3.10 stable release. ## Major changes, features and limitations addressed in this release -1. auth-allow setting was broken with 3.10 release and is now fixed (#1429117) +1. auth-allow setting was broken with 3.10 release and is now fixed (#1429117) ## Major issues 1. Expanding a gluster volume that is sharded may cause file corruption - - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) - there are reports of VM images getting corrupted. - - If you are using sharded volumes, DO NOT rebalance them till this is - fixed - - Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508) + - Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) + there are reports of VM images getting corrupted. 
+ - If you are using sharded volumes, DO NOT rebalance them till this is + fixed + - Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508) ## Bugs addressed @@ -28,7 +28,7 @@ A total of 31 patches have been merged, addressing 26 bugs: - [#1426222](https://bugzilla.redhat.com/1426222): build: fixes to build 3.9.0rc2 on Debian (jessie) - [#1426323](https://bugzilla.redhat.com/1426323): common-ha: no need to remove nodes one-by-one in teardown - [#1426329](https://bugzilla.redhat.com/1426329): [Ganesha] : Add comment to Ganesha HA config file ,about cluster name's length limitation -- [#1427387](https://bugzilla.redhat.com/1427387): systemic testing: seeing lot of ping time outs which would lead to splitbrains +- [#1427387](https://bugzilla.redhat.com/1427387): systemic testing: seeing lot of ping time outs which would lead to splitbrains - [#1427399](https://bugzilla.redhat.com/1427399): [RFE] capture portmap details in glusterd's statedump - [#1427461](https://bugzilla.redhat.com/1427461): Bricks take up new ports upon volume restart after add-brick op with brick mux enabled - [#1428670](https://bugzilla.redhat.com/1428670): Disconnects in nfs mount leads to IO hang and mount inaccessible @@ -36,7 +36,7 @@ A total of 31 patches have been merged, addressing 26 bugs: - [#1429117](https://bugzilla.redhat.com/1429117): auth failure after upgrade to GlusterFS 3.10 - [#1429402](https://bugzilla.redhat.com/1429402): Restore atime/mtime for symlinks and other non-regular files. - [#1429773](https://bugzilla.redhat.com/1429773): disallow increasing replica count for arbiter volumes -- [#1430512](https://bugzilla.redhat.com/1430512): /libgfxdr.so.0.0.1: undefined symbol: __gf_free +- [#1430512](https://bugzilla.redhat.com/1430512): /libgfxdr.so.0.0.1: undefined symbol: \_\_gf_free - [#1430844](https://bugzilla.redhat.com/1430844): build/packaging: Debian and Ubuntu don't have /usr/libexec/; results in bad packages - [#1431175](https://bugzilla.redhat.com/1431175): volume start command hangs - [#1431176](https://bugzilla.redhat.com/1431176): USS is broken when multiplexing is on diff --git a/docs/release-notes/3.10.10.md b/docs/release-notes/3.10.10.md index 9f07430..a1cf5de 100644 --- a/docs/release-notes/3.10.10.md +++ b/docs/release-notes/3.10.10.md @@ -6,6 +6,7 @@ the new features that were added and bugs fixed in the GlusterFS 3.10 stable release. ## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues @@ -13,10 +14,9 @@ the new features that were added and bugs fixed in the GlusterFS 1. Brick multiplexing is being tested and fixed aggressively but we still have a few crashes and memory leaks to fix. - ## Bugs addressed Bugs addressed since release-3.10.9 are listed below. -- [#1498081](https://bugzilla.redhat.com/1498081): dht_(f)xattrop does not implement migration checks +- [#1498081](https://bugzilla.redhat.com/1498081): dht\_(f)xattrop does not implement migration checks - [#1534848](https://bugzilla.redhat.com/1534848): entries not getting cleared post healing of softlinks (stale entries showing up in heal info) diff --git a/docs/release-notes/3.10.11.md b/docs/release-notes/3.10.11.md index f330546..9440e4b 100644 --- a/docs/release-notes/3.10.11.md +++ b/docs/release-notes/3.10.11.md @@ -6,6 +6,7 @@ the new features that were added and bugs fixed in the GlusterFS 3.10 stable release. 
## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues @@ -13,13 +14,12 @@ the new features that were added and bugs fixed in the GlusterFS 1. Brick multiplexing is being tested and fixed aggressively but we still have a few crashes and memory leaks to fix. - ## Bugs addressed Bugs addressed since release-3.10.10 are listed below. -- [#1486542](https://bugzilla.redhat.com/1486542): "ganesha.so cannot open" warning message in glusterd log in non ganesha setup. +- [#1486542](https://bugzilla.redhat.com/1486542): "ganesha.so cannot open" warning message in glusterd log in non ganesha setup. - [#1544461](https://bugzilla.redhat.com/1544461): 3.8 -> 3.10 rolling upgrade fails (same for 3.12 or 3.13) on Ubuntu 14 - [#1544787](https://bugzilla.redhat.com/1544787): tests/bugs/cli/bug-1169302.t fails spuriously - [#1546912](https://bugzilla.redhat.com/1546912): tests/bugs/posix/bug-990028.t fails in release-3.10 branch -- [#1549482](https://bugzilla.redhat.com/1549482): Quota: After deleting directory from mount point on which quota was configured, quota list command output is blank +- [#1549482](https://bugzilla.redhat.com/1549482): Quota: After deleting directory from mount point on which quota was configured, quota list command output is blank diff --git a/docs/release-notes/3.10.12.md b/docs/release-notes/3.10.12.md index e81b1bf..0db8081 100644 --- a/docs/release-notes/3.10.12.md +++ b/docs/release-notes/3.10.12.md @@ -8,6 +8,7 @@ GlusterFS 3.10 stable release. ## Major changes, features and limitations addressed in this release This release contains a fix for a security vulerability in Gluster as follows, + - http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-1088 - https://nvd.nist.gov/vuln/detail/CVE-2018-1088 @@ -24,7 +25,6 @@ See, this [guide](https://docs.gluster.org/en/v3/Administrator%20Guide/SSL/) for 1. Brick multiplexing is being tested and fixed aggressively but we still have a few crashes and memory leaks to fix. - ## Bugs addressed Bugs addressed since release-3.10.11 are listed below. diff --git a/docs/release-notes/3.10.2.md b/docs/release-notes/3.10.2.md index 0537334..6189de9 100644 --- a/docs/release-notes/3.10.2.md +++ b/docs/release-notes/3.10.2.md @@ -6,18 +6,19 @@ contains a listing of all the new features that were added and bugs in the GlusterFS 3.10 stable release. ## Major changes, features and limitations addressed in this release + 1. Many bugs brick multiplexing and nfs-ganesha+ha bugs have been addressed. 2. Rebalance and remove brick operations have been disabled for sharded volumes to prevent data corruption. - ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption -- Sharded volumes are typically used for VM images, if such volumes are -expanded or possibly contracted (i.e add/remove bricks and rebalance) -there are reports of VM images getting corrupted. -- Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508) +1. Expanding a gluster volume that is sharded may cause file corruption + +- Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) + there are reports of VM images getting corrupted. 
+- Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508) ## Bugs addressed @@ -40,12 +41,12 @@ A total of 63 patches have been merged, addressing 46 bugs: - [#1443349](https://bugzilla.redhat.com/1443349): [Eventing]: Unrelated error message displayed when path specified during a 'webhook-test/add' is missing a schema - [#1441576](https://bugzilla.redhat.com/1441576): [geo-rep]: rsync should not try to sync internal xattrs - [#1441927](https://bugzilla.redhat.com/1441927): [geo-rep]: Worker crashes with [Errno 16] Device or resource busy: '.gfid/00000000-0000-0000-0000-000000000001/dir.166 while renaming directories -- [#1401877](https://bugzilla.redhat.com/1401877): [GANESHA] Symlinks from /etc/ganesha/ganesha.conf to shared\_storage are created on the non-ganesha nodes in 8 node gluster having 4 node ganesha cluster -- [#1425723](https://bugzilla.redhat.com/1425723): nfs-ganesha volume export file remains stale in shared\_storage\_volume when volume is deleted +- [#1401877](https://bugzilla.redhat.com/1401877): [GANESHA] Symlinks from /etc/ganesha/ganesha.conf to shared_storage are created on the non-ganesha nodes in 8 node gluster having 4 node ganesha cluster +- [#1425723](https://bugzilla.redhat.com/1425723): nfs-ganesha volume export file remains stale in shared_storage_volume when volume is deleted - [#1427759](https://bugzilla.redhat.com/1427759): nfs-ganesha: Incorrect error message returned when disable fails - [#1438325](https://bugzilla.redhat.com/1438325): Need to improve remove-brick failure message when the brick process is down. - [#1438338](https://bugzilla.redhat.com/1438338): glusterd is setting replicate volume property over disperse volume or vice versa -- [#1438340](https://bugzilla.redhat.com/1438340): glusterd is not validating for allowed values while setting "cluster.brick-multiplex" property +- [#1438340](https://bugzilla.redhat.com/1438340): glusterd is not validating for allowed values while setting "cluster.brick-multiplex" property - [#1441476](https://bugzilla.redhat.com/1441476): Glusterd crashes when restarted with many volumes - [#1444128](https://bugzilla.redhat.com/1444128): [BrickMultiplex] gluster command not responding and .snaps directory is not visible after executing snapshot related command - [#1445260](https://bugzilla.redhat.com/1445260): [GANESHA] Volume start and stop having ganesha enable on it,turns off cache-invalidation on volume @@ -54,10 +55,10 @@ A total of 63 patches have been merged, addressing 46 bugs: - [#1435779](https://bugzilla.redhat.com/1435779): Inode ref leak on anonymous reads and writes - [#1440278](https://bugzilla.redhat.com/1440278): [GSS] NFS Sub-directory mount not working on solaris10 client - [#1450378](https://bugzilla.redhat.com/1450378): GNFS crashed while taking lock on a file from 2 different clients having same volume mounted from 2 different servers -- [#1449779](https://bugzilla.redhat.com/1449779): quota: limit-usage command failed with error " Failed to start aux mount" +- [#1449779](https://bugzilla.redhat.com/1449779): quota: limit-usage command failed with error " Failed to start aux mount" - [#1450564](https://bugzilla.redhat.com/1450564): glfsheal: crashed(segfault) with disperse volume in RDMA - [#1443501](https://bugzilla.redhat.com/1443501): Don't wind post-op on a brick where the fop phase failed. 
-- [#1444892](https://bugzilla.redhat.com/1444892): When either killing or restarting a brick with performance.stat-prefetch on, stat sometimes returns a bad st\_size value. +- [#1444892](https://bugzilla.redhat.com/1444892): When either killing or restarting a brick with performance.stat-prefetch on, stat sometimes returns a bad st_size value. - [#1449169](https://bugzilla.redhat.com/1449169): Multiple bricks WILL crash after TCP port probing - [#1440805](https://bugzilla.redhat.com/1440805): Update rfc.sh to check Change-Id consistency for backports - [#1443010](https://bugzilla.redhat.com/1443010): snapshot: snapshots appear to be failing with respect to secure geo-rep slave @@ -65,8 +66,7 @@ A total of 63 patches have been merged, addressing 46 bugs: - [#1444773](https://bugzilla.redhat.com/1444773): explicitly specify executor to be bash for tests - [#1445407](https://bugzilla.redhat.com/1445407): remove bug-1421590-brick-mux-reuse-ports.t - [#1440742](https://bugzilla.redhat.com/1440742): Test files clean up for tier during 3.10 -- [#1448790](https://bugzilla.redhat.com/1448790): [Tiering]: High and low watermark values when set to the same level, is allowed +- [#1448790](https://bugzilla.redhat.com/1448790): [Tiering]: High and low watermark values when set to the same level, is allowed - [#1435942](https://bugzilla.redhat.com/1435942): Enabling parallel-readdir causes dht linkto files to be visible on the mount, - [#1437763](https://bugzilla.redhat.com/1437763): File-level WORM allows ftruncate() on read-only files - [#1439148](https://bugzilla.redhat.com/1439148): Parallel readdir on Gluster NFS displays less number of dentries - diff --git a/docs/release-notes/3.10.3.md b/docs/release-notes/3.10.3.md index 3ef65b7..b5a9ba4 100644 --- a/docs/release-notes/3.10.3.md +++ b/docs/release-notes/3.10.3.md @@ -6,18 +6,20 @@ contain a listing of all the new features that were added and bugs in the GlusterFS 3.10 stable release. ## Major changes, features and limitations addressed in this release + 1. No Major changes - ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption -- Sharded volumes are typically used for VM images, if such volumes are -expanded or possibly contracted (i.e add/remove bricks and rebalance) -there are reports of VM images getting corrupted. -- Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508) -2. Brick multiplexing is being tested and fixed aggressively but we still have a - few crashes and memory leaks to fix. +1. Expanding a gluster volume that is sharded may cause file corruption + + - Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) + there are reports of VM images getting corrupted. + - Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508) + +2. Brick multiplexing is being tested and fixed aggressively but we still have a + few crashes and memory leaks to fix. 
## Bugs addressed @@ -27,13 +29,12 @@ A total of 18 patches have been merged, addressing 13 bugs: - [#1450773](https://bugzilla.redhat.com/1450773): Quota: After upgrade from 3.7 to higher version , gluster quota list command shows "No quota configured on volume repvol" - [#1450934](https://bugzilla.redhat.com/1450934): [New] - Replacing an arbiter brick while I/O happens causes vm pause - [#1450947](https://bugzilla.redhat.com/1450947): Autoconf leaves unexpanded variables in path names of non-shell-scripttext files -- [#1451371](https://bugzilla.redhat.com/1451371): crash in dht\_rmdir\_do +- [#1451371](https://bugzilla.redhat.com/1451371): crash in dht_rmdir_do - [#1451561](https://bugzilla.redhat.com/1451561): AFR returns the node uuid of the same node for every file in the replica - [#1451587](https://bugzilla.redhat.com/1451587): cli xml status of detach tier broken - [#1451977](https://bugzilla.redhat.com/1451977): Add logs to identify whether disconnects are voluntary or due to network problems - [#1451995](https://bugzilla.redhat.com/1451995): Log message shows error code as success even when rpc fails to connect -- [#1453056](https://bugzilla.redhat.com/1453056): [DHt] : segfault in dht\_selfheal\_dir\_setattr while running regressions +- [#1453056](https://bugzilla.redhat.com/1453056): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions - [#1453087](https://bugzilla.redhat.com/1453087): Brick Multiplexing: On reboot of a node Brick multiplexing feature lost on that node as multiple brick processes get spawned - [#1456682](https://bugzilla.redhat.com/1456682): tierd listens to a port. -- [#1457054](https://bugzilla.redhat.com/1457054): glusterfs client crash on io-cache.so(\_\_ioc\_page\_wakeup+0x44) - +- [#1457054](https://bugzilla.redhat.com/1457054): glusterfs client crash on io-cache.so(\_\_ioc_page_wakeup+0x44) diff --git a/docs/release-notes/3.10.4.md b/docs/release-notes/3.10.4.md index 95ee0e4..fe19f17 100644 --- a/docs/release-notes/3.10.4.md +++ b/docs/release-notes/3.10.4.md @@ -6,26 +6,28 @@ contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.10 stable release. ## Major changes, features and limitations addressed in this release + 1. No Major changes - ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption -- Sharded volumes are typically used for VM images, if such volumes are -expanded or possibly contracted (i.e add/remove bricks and rebalance) -there are reports of VM images getting corrupted. -- Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508) -2. Brick multiplexing is being tested and fixed aggressively but we still have a - few crashes and memory leaks to fix. -3. Another rebalance related bug is being worked upon [#1467010](https://bugzilla.redhat.com/1467010) +1. Expanding a gluster volume that is sharded may cause file corruption + + - Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) + there are reports of VM images getting corrupted. + - Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508) + +2. Brick multiplexing is being tested and fixed aggressively but we still have a + few crashes and memory leaks to fix. +3. 
Another rebalance related bug is being worked upon [#1467010](https://bugzilla.redhat.com/1467010) ## Bugs addressed A total of 18 patches have been merged, addressing 13 bugs: - [#1457732](https://bugzilla.redhat.com/1457732): "split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf -- [#1459760](https://bugzilla.redhat.com/1459760): Glusterd segmentation fault in ' _Unwind_Backtrace' while running peer probe +- [#1459760](https://bugzilla.redhat.com/1459760): Glusterd segmentation fault in ' \_Unwind_Backtrace' while running peer probe - [#1460649](https://bugzilla.redhat.com/1460649): posix-acl: Whitelist virtual ACL xattrs - [#1460914](https://bugzilla.redhat.com/1460914): Rebalance estimate time sometimes shows negative values - [#1460993](https://bugzilla.redhat.com/1460993): Revert CLI restrictions on running rebalance in VM store use case diff --git a/docs/release-notes/3.10.5.md b/docs/release-notes/3.10.5.md index b91bf5b..cfc161b 100644 --- a/docs/release-notes/3.10.5.md +++ b/docs/release-notes/3.10.5.md @@ -6,19 +6,22 @@ contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.10 stable release. ## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption -- Sharded volumes are typically used for VM images, if such volumes are -expanded or possibly contracted (i.e add/remove bricks and rebalance) -there are reports of VM images getting corrupted. -- The last known cause for corruption [#1467010](https://bugzilla.redhat.com/show_bug.cgi?id=1467010) -has a fix with this release. As further testing is still in progress, the issue -is retained as a major issue. -2. Brick multiplexing is being tested and fixed aggressively but we still have a - few crashes and memory leaks to fix. +1. Expanding a gluster volume that is sharded may cause file corruption + + - Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) + there are reports of VM images getting corrupted. + - The last known cause for corruption [#1467010](https://bugzilla.redhat.com/show_bug.cgi?id=1467010) + has a fix with this release. As further testing is still in progress, the issue + is retained as a major issue. + +2. Brick multiplexing is being tested and fixed aggressively but we still have a + few crashes and memory leaks to fix. ## Bugs addressed @@ -46,4 +49,4 @@ Bugs addressed since release-3.10.4 are listed below. - [#1476212](https://bugzilla.redhat.com/1476212): [geo-rep]: few of the self healed hardlinks on master did not sync to slave - [#1478498](https://bugzilla.redhat.com/1478498): scripts: invalid test in S32gluster_enable_shared_storage.sh - [#1478499](https://bugzilla.redhat.com/1478499): packaging: /var/lib/glusterd/options should be %config(noreplace) -- [#1480594](https://bugzilla.redhat.com/1480594): nfs process crashed in "nfs3_getattr" \ No newline at end of file +- [#1480594](https://bugzilla.redhat.com/1480594): nfs process crashed in "nfs3_getattr" diff --git a/docs/release-notes/3.10.6.md b/docs/release-notes/3.10.6.md index eb911bb..b8cb2b9 100644 --- a/docs/release-notes/3.10.6.md +++ b/docs/release-notes/3.10.6.md @@ -6,18 +6,21 @@ contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.10 stable release. 
## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption -- Sharded volumes are typically used for VM images, if such volumes are -expanded or possibly contracted (i.e add/remove bricks and rebalance) -there are reports of VM images getting corrupted. -- The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081) -is still pending, and not yet a part of this release. -2. Brick multiplexing is being tested and fixed aggressively but we still have a - few crashes and memory leaks to fix. +1. Expanding a gluster volume that is sharded may cause file corruption + + - Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) + there are reports of VM images getting corrupted. + - The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081) + is still pending, and not yet a part of this release. + +2. Brick multiplexing is being tested and fixed aggressively but we still have a + few crashes and memory leaks to fix. ## Bugs addressed @@ -28,7 +31,7 @@ Bugs addressed since release-3.10.5 are listed below. - [#1482857](https://bugzilla.redhat.com/1482857): glusterd fails to start - [#1483997](https://bugzilla.redhat.com/1483997): packaging: use rdma-core(-devel) instead of ibverbs, rdmacm; disable rdma on armv7hl - [#1484443](https://bugzilla.redhat.com/1484443): packaging: /run and /var/run; prefer /run -- [#1486542](https://bugzilla.redhat.com/1486542): "ganesha.so cannot open" warning message in glusterd log in non ganesha setup. +- [#1486542](https://bugzilla.redhat.com/1486542): "ganesha.so cannot open" warning message in glusterd log in non ganesha setup. - [#1487042](https://bugzilla.redhat.com/1487042): AFR returns the node uuid of the same node for every file in the replica - [#1487647](https://bugzilla.redhat.com/1487647): with AFR now making both nodes to return UUID for a file will result in georep consuming more resources - [#1488391](https://bugzilla.redhat.com/1488391): gluster-blockd process crashed and core generated @@ -38,7 +41,7 @@ Bugs addressed since release-3.10.5 are listed below. - [#1491691](https://bugzilla.redhat.com/1491691): rpc: TLSv1_2_method() is deprecated in OpenSSL-1.1 - [#1491966](https://bugzilla.redhat.com/1491966): AFR entry self heal removes a directory's .glusterfs symlink. - [#1491985](https://bugzilla.redhat.com/1491985): Add NULL gfid checks before creating file -- [#1491995](https://bugzilla.redhat.com/1491995): afr: check op_ret value in __afr_selfheal_name_impunge +- [#1491995](https://bugzilla.redhat.com/1491995): afr: check op_ret value in \_\_afr_selfheal_name_impunge - [#1492010](https://bugzilla.redhat.com/1492010): Launch metadata heal in discover code path. - [#1495430](https://bugzilla.redhat.com/1495430): Make event-history feature configurable and have it disabled by default - [#1496321](https://bugzilla.redhat.com/1496321): [afr] split-brain observed on T files post hardlink and rename in x3 volume diff --git a/docs/release-notes/3.10.7.md b/docs/release-notes/3.10.7.md index a3932b7..860fc09 100644 --- a/docs/release-notes/3.10.7.md +++ b/docs/release-notes/3.10.7.md @@ -6,18 +6,21 @@ contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.10 stable release. 
## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption -- Sharded volumes are typically used for VM images, if such volumes are -expanded or possibly contracted (i.e add/remove bricks and rebalance) -there are reports of VM images getting corrupted. -- The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081) -is still pending, and not yet a part of this release. -2. Brick multiplexing is being tested and fixed aggressively but we still have a - few crashes and memory leaks to fix. +1. Expanding a gluster volume that is sharded may cause file corruption + + - Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) + there are reports of VM images getting corrupted. + - The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081) + is still pending, and not yet a part of this release. + +2. Brick multiplexing is being tested and fixed aggressively but we still have a + few crashes and memory leaks to fix. ## Bugs addressed diff --git a/docs/release-notes/3.10.8.md b/docs/release-notes/3.10.8.md index 7315449..1ac00ad 100644 --- a/docs/release-notes/3.10.8.md +++ b/docs/release-notes/3.10.8.md @@ -6,18 +6,21 @@ contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.10 stable release. ## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption -- Sharded volumes are typically used for VM images, if such volumes are -expanded or possibly contracted (i.e add/remove bricks and rebalance) -there are reports of VM images getting corrupted. -- The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081) -is still pending, and not yet a part of this release. -2. Brick multiplexing is being tested and fixed aggressively but we still have a - few crashes and memory leaks to fix. +1. Expanding a gluster volume that is sharded may cause file corruption + + - Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) + there are reports of VM images getting corrupted. + - The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081) + is still pending, and not yet a part of this release. + +2. Brick multiplexing is being tested and fixed aggressively but we still have a + few crashes and memory leaks to fix. ## Bugs addressed diff --git a/docs/release-notes/3.10.9.md b/docs/release-notes/3.10.9.md index fccb5c8..15f8b47 100644 --- a/docs/release-notes/3.10.9.md +++ b/docs/release-notes/3.10.9.md @@ -6,18 +6,21 @@ the new features that were added and bugs fixed in the GlusterFS 3.10 stable release. ## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption -- Sharded volumes are typically used for VM images, if such volumes are -expanded or possibly contracted (i.e add/remove bricks and rebalance) -there are reports of VM images getting corrupted. 
-- The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081) -is still pending, and not yet a part of this release. -2. Brick multiplexing is being tested and fixed aggressively but we still have a - few crashes and memory leaks to fix. +1. Expanding a gluster volume that is sharded may cause file corruption + + - Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) + there are reports of VM images getting corrupted. + - The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081) + is still pending, and not yet a part of this release. + +2. Brick multiplexing is being tested and fixed aggressively but we still have a + few crashes and memory leaks to fix. ## Bugs addressed diff --git a/docs/release-notes/3.11.0.md b/docs/release-notes/3.11.0.md index 985aaf0..1d06e68 100644 --- a/docs/release-notes/3.11.0.md +++ b/docs/release-notes/3.11.0.md @@ -11,6 +11,7 @@ of bugs that have been addressed is included further below. ## Major changes and features ### Switched to storhaug for ganesha and samba high availability + **Notes for users:** High Availability (HA) support for NFS-Ganesha (NFS) and Samba (SMB) @@ -26,6 +27,7 @@ There are many to choose from in most popular Linux distributions. Choose the one the best fits your environment and use it. ### Added SELinux support for Gluster Volumes + **Notes for users:** A new xlator has been introduced (`features/selinux`) to allow setting the @@ -40,17 +42,20 @@ This feature is intended to be the base for implementing Labelled-NFS in NFS-Ganesha and SELinux support for FUSE mounts in the Linux kernel. **Limitations:** + - The Linux kernel does not support mounting of FUSE filesystems with SELinux support, yet. - NFS-Ganesha does not support Labelled-NFS, yet. **Known Issues:** + - There has been limited testing, because other projects can not consume the functionality yet without being part of a release. So far, no problems have been observed, but this might change when other projects start to seriously use this. ### Several memory leaks are fixed in gfapi during graph switches + **Notes for users:** Gluster API (or gfapi), has had a few memory leak issues arising specifically @@ -59,9 +64,11 @@ addressed in this release, and more work towards ironing out the pending leaks are in the works across the next few releases. **Limitations:** + - There are still a few leaks to be addressed when graph switches occur ### get-state CLI is enhanced to provide client and brick capacity related information + **Notes for users:** The get-state CLI output now optionally accommodates client related information @@ -80,11 +87,13 @@ bricks as obtained from `gluster volume status |all detail` has also been added to the get-state output. **Limitations:** + - Information for non-local bricks and clients connected to non-local bricks -won't be available. This is a known limitation of the get-state command, since -get-state command doesn't provide information on non-local bricks. + won't be available. This is a known limitation of the get-state command, since + get-state command doesn't provide information on non-local bricks. 
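For readers who want to script against the enhanced get-state output described above, the following is a minimal sketch only, not part of the release: the `odir`/`file` arguments follow the documented `gluster get-state` syntax, but the output path and the `VolumeN.name` key format are assumptions based on default glusterd behaviour and should be verified on your installation.

```bash
# Minimal sketch (not from the release notes): dump glusterd state to a file
# and list the volume names it records. The output directory, file name, and
# the "VolumeN.name" key layout are assumptions; confirm them on your system.
gluster get-state glusterd odir /var/run/gluster/ file glusterd_state_example
grep -E '^Volume[0-9]+\.name' /var/run/gluster/glusterd_state_example
```

As noted in the limitations above, any such script only sees local bricks and the clients connected to them; it does not aggregate state across the trusted pool.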
### Ability to serve negative lookups from cache has been added + **Notes for users:** Before creating / renaming any file, lookups (around, 5-6 when using the SMB @@ -99,10 +108,13 @@ Execute the following commands to enable negative-lookup cache: # gluster volume set features.cache-invalidation-timeout 600 # gluster volume set nl-cache on ``` + **Limitations** + - This feature is supported only for SMB access, for this release ### New xlator to help developers detecting resource leaks has been added + **Notes for users:** This is intended as a developer feature, and hence there is no direct user @@ -114,6 +126,7 @@ gfapi and any xlator in between the API and the sink xlator. More details can be found in [this](http://lists.gluster.org/pipermail/gluster-devel/2017-April/052618.html) thread on the gluster-devel lists ### Feature for metadata-caching/small file performance is production ready + **Notes for users:** Over the course of releases several fixes and enhancements have been made to @@ -132,15 +145,18 @@ SMB access, by enabling metadata caching: - Renaming files To enable metadata caching execute the following commands: + ```bash # gluster volume set group metadata-cache # gluster volume set network.inode-lru-limit ``` + \, is set to 50000 by default. It should be increased if the number of concurrently accessed files in the volume is very high. Increasing this number increases the memory footprint of the brick processes. ### "Parallel Readdir" feature introduced in 3.10.0 is production ready + **Notes for users:** This feature was introduced in 3.10 and was experimental in nature. Over the @@ -150,6 +166,7 @@ stabilized and is ready for use in production environments. For further details refer: [3.10.0 release notes](https://github.com/gluster/glusterfs/blob/release-3.10/doc/release-notes/3.10.0.md) ### Object versioning is enabled only if bitrot is enabled + **Notes for users:** Object versioning was turned on by default on brick processes by the bitrot @@ -161,6 +178,7 @@ To fix this, object versioning is disabled by default, and is only enabled as a part of enabling the bitrot option. ### Distribute layer provides more robust transactions during directory namespace operations + **Notes for users:** Distribute layer in Gluster, creates and maintains directories in all subvolumes @@ -173,6 +191,7 @@ ensuring better consistency of the file system as a whole, when dealing with racing operations, operating on the same directory object. ### gfapi extended readdirplus API has been added + **Notes for users:** An extended readdirplus API `glfs_xreaddirplus` is added to get extra @@ -184,10 +203,12 @@ involving directory listing. The API syntax and usage can be found in [`glfs.h`](https://github.com/gluster/glusterfs/blob/v3.11.0rc1/api/src/glfs.h#L810) header file. **Limitations:** + - This API currently has support to only return stat and handles (`glfs_object`) -for each dirent of the directory, but can be extended in the future. + for each dirent of the directory, but can be extended in the future. ### Improved adoption of standard refcounting functions across the code + **Notes for users:** This change does not impact users, it is an internal code cleanup activity @@ -195,10 +216,12 @@ that ensures that we ref count in a standard manner, thus avoiding unwanted bugs due to different implementations of the same. **Known Issues:** + - This standardization started with this release and is expected to continue -across releases. + across releases. 
### Performance improvements to rebalance have been made + **Notes for users:** Both crawling and migration improvement has been done in rebalance. The crawler @@ -209,7 +232,7 @@ both the nodes divide the load among each other giving boost to migration performance. And also there have been some optimization to avoid redundant network operations (or RPC calls) in the process of migrating a file. -Further, file migration now avoids syncop framework and is managed entirely by +Further, file migration now avoids syncop framework and is managed entirely by rebalance threads giving performance boost. Also, There is a change to throttle settings in rebalance. Earlier user could @@ -220,21 +243,23 @@ of threads rebalance process will work with, thereby translating to the number of files being migrated in parallel. ### Halo Replication feature in AFR has been introduced + **Notes for users:** Halo Geo-replication is a feature which allows Gluster or NFS clients to write locally to their region (as defined by a latency "halo" or threshold if you like), and have their writes asynchronously propagate from their origin to the -rest of the cluster. Clients can also write synchronously to the cluster +rest of the cluster. Clients can also write synchronously to the cluster simply by specifying a halo-latency which is very large (e.g. 10seconds) which will include all bricks. To enable halo feature execute the following commands: + ```bash # gluster volume set cluster.halo-enabled yes ``` You may have to set the following following options to change defaults. -`cluster.halo-shd-latency`: The threshold below which self-heal daemons will +`cluster.halo-shd-latency`: The threshold below which self-heal daemons will consider children (bricks) connected. `cluster.halo-nfsd-latency`: The threshold below which NFS daemons will consider @@ -249,12 +274,14 @@ If the number of children falls below this threshold the next best (chosen by latency) shall be swapped in. ### FALLOCATE support with EC + **Notes for users** Support for FALLOCATE file operation on EC volume is added with this release. EC volumes can now support basic FALLOCATE functionality. ### Self-heal window-size control option for EC + **Notes for users** Support to control the maximum size of read/write operation carried out @@ -262,14 +289,16 @@ during self-heal process has been added with this release. User has to tune 'disperse.self-heal-window-size' option on disperse volume to adjust the size. ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption + +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - Status of this bug can be tracked here, #1426508 - Latest series of fixes for the issue (which are present in this release as - well) are not showing the previous corruption, and hence the fixes look - good, but this is maintained on the watch list nevetheness. + well) are not showing the previous corruption, and hence the fixes look + good, but this is maintained on the watch list nevetheness. ## Bugs addressed @@ -289,7 +318,7 @@ Bugs addressed since release-3.10.0 are listed below. 
- [#1328342](https://bugzilla.redhat.com/1328342): [tiering]: gluster v reset of watermark levels can allow low watermark level to have a higher value than hi watermark level - [#1353952](https://bugzilla.redhat.com/1353952): [geo-rep]: rsync should not try to sync internal xattrs - [#1356076](https://bugzilla.redhat.com/1356076): DHT doesn't evenly balance files on FreeBSD with ZFS -- [#1359599](https://bugzilla.redhat.com/1359599): BitRot :- bit-rot.signature and bit-rot.version xattr should not be set if bitrot is not enabled on volume +- [#1359599](https://bugzilla.redhat.com/1359599): BitRot :- bit-rot.signature and bit-rot.version xattr should not be set if bitrot is not enabled on volume - [#1369393](https://bugzilla.redhat.com/1369393): dead loop in changelog_rpc_server_destroy - [#1383893](https://bugzilla.redhat.com/1383893): glusterd restart is starting the offline shd daemon on other node in the cluster - [#1384989](https://bugzilla.redhat.com/1384989): libglusterfs : update correct memory segments in glfs-message-id @@ -304,7 +333,7 @@ Bugs addressed since release-3.10.0 are listed below. - [#1399593](https://bugzilla.redhat.com/1399593): Obvious typo in cleanup code in rpc_clnt_notify - [#1401571](https://bugzilla.redhat.com/1401571): bitrot quarantine dir misspelled - [#1401812](https://bugzilla.redhat.com/1401812): RFE: Make readdirp parallel in dht -- [#1401877](https://bugzilla.redhat.com/1401877): [GANESHA] Symlinks from /etc/ganesha/ganesha.conf to shared_storage are created on the non-ganesha nodes in 8 node gluster having 4 node ganesha cluster +- [#1401877](https://bugzilla.redhat.com/1401877): [GANESHA] Symlinks from /etc/ganesha/ganesha.conf to shared_storage are created on the non-ganesha nodes in 8 node gluster having 4 node ganesha cluster - [#1402254](https://bugzilla.redhat.com/1402254): compile warning unused variable - [#1402661](https://bugzilla.redhat.com/1402661): Samba crash when mounting a distributed dispersed volume over CIFS - [#1404424](https://bugzilla.redhat.com/1404424): The data-self-heal option is not honored in AFR @@ -317,10 +346,10 @@ Bugs addressed since release-3.10.0 are listed below. 
- [#1411334](https://bugzilla.redhat.com/1411334): Improve output of "gluster volume status detail" - [#1412135](https://bugzilla.redhat.com/1412135): rename of the same file from multiple clients with caching enabled may result in duplicate files - [#1412549](https://bugzilla.redhat.com/1412549): EXPECT_WITHIN is taking too much time even if the result matches with expected value -- [#1413526](https://bugzilla.redhat.com/1413526): glusterfind: After glusterfind pre command execution all temporary files and directories /usr/var/lib/misc/glusterfsd/glusterfind/// should be removed +- [#1413526](https://bugzilla.redhat.com/1413526): glusterfind: After glusterfind pre command execution all temporary files and directories /usr/var/lib/misc/glusterfsd/glusterfind/// should be removed - [#1413971](https://bugzilla.redhat.com/1413971): Bonnie test suite failed with "Can't open file" error - [#1414287](https://bugzilla.redhat.com/1414287): repeated operation failed warnings in gluster mount logs with disperse volume -- [#1414346](https://bugzilla.redhat.com/1414346): Quota: After upgrade from 3.7 to higher version , gluster quota list command shows "No quota configured on volume repvol" +- [#1414346](https://bugzilla.redhat.com/1414346): Quota: After upgrade from 3.7 to higher version , gluster quota list command shows "No quota configured on volume repvol" - [#1414645](https://bugzilla.redhat.com/1414645): Typo in glusterfs code comments - [#1414782](https://bugzilla.redhat.com/1414782): Add logs to selfheal code path to be helpful for debug - [#1414902](https://bugzilla.redhat.com/1414902): packaging: python/python2(/python3) cleanup @@ -341,7 +370,7 @@ Bugs addressed since release-3.10.0 are listed below. - [#1418095](https://bugzilla.redhat.com/1418095): Portmap allocates way too much memory (256KB) on stack - [#1418213](https://bugzilla.redhat.com/1418213): [Ganesha+SSL] : Bonnie++ hangs during rewrites. - [#1418249](https://bugzilla.redhat.com/1418249): [RFE] Need to have group cli option to set all md-cache options using a single command -- [#1418259](https://bugzilla.redhat.com/1418259): Quota: After deleting directory from mount point on which quota was configured, quota list command output is blank +- [#1418259](https://bugzilla.redhat.com/1418259): Quota: After deleting directory from mount point on which quota was configured, quota list command output is blank - [#1418417](https://bugzilla.redhat.com/1418417): packaging: remove glusterfs-ganesha subpackage - [#1418629](https://bugzilla.redhat.com/1418629): glustershd process crashed on systemic setup - [#1418900](https://bugzilla.redhat.com/1418900): [RFE] Include few more options in virt file @@ -355,7 +384,7 @@ Bugs addressed since release-3.10.0 are listed below. - [#1420619](https://bugzilla.redhat.com/1420619): Entry heal messages in glustershd.log while no entries shown in heal info - [#1420623](https://bugzilla.redhat.com/1420623): [RHV-RHGS]: Application VM paused after add brick operation and VM didn't comeup after power cycle. - [#1420637](https://bugzilla.redhat.com/1420637): Modified volume options not synced once offline nodes comes up. -- [#1420697](https://bugzilla.redhat.com/1420697): CLI option "--timeout" is accepting non numeric and negative values. +- [#1420697](https://bugzilla.redhat.com/1420697): CLI option "--timeout" is accepting non numeric and negative values. 
- [#1420713](https://bugzilla.redhat.com/1420713): glusterd: storhaug, remove all vestiges ganesha - [#1421023](https://bugzilla.redhat.com/1421023): Binary file gf_attach generated during build process should be git ignored - [#1421590](https://bugzilla.redhat.com/1421590): Bricks take up new ports upon volume restart after add-brick op with brick mux enabled @@ -364,9 +393,9 @@ Bugs addressed since release-3.10.0 are listed below. - [#1421653](https://bugzilla.redhat.com/1421653): dht_setxattr returns EINVAL when a file is deleted during the FOP - [#1421721](https://bugzilla.redhat.com/1421721): volume start command hangs - [#1421724](https://bugzilla.redhat.com/1421724): glusterd log is flooded with stale disconnect rpc messages -- [#1421759](https://bugzilla.redhat.com/1421759): Gluster NFS server crashing in __mnt3svc_umountall +- [#1421759](https://bugzilla.redhat.com/1421759): Gluster NFS server crashing in \_\_mnt3svc_umountall - [#1421937](https://bugzilla.redhat.com/1421937): [Replicate] "RPC call decoding failed" leading to IO hang & mount inaccessible -- [#1421938](https://bugzilla.redhat.com/1421938): systemic testing: seeing lot of ping time outs which would lead to splitbrains +- [#1421938](https://bugzilla.redhat.com/1421938): systemic testing: seeing lot of ping time outs which would lead to splitbrains - [#1421955](https://bugzilla.redhat.com/1421955): Disperse: Fallback to pre-compiled code execution when dynamic code generation fails - [#1422074](https://bugzilla.redhat.com/1422074): GlusterFS truncates nanoseconds to microseconds when setting mtime - [#1422152](https://bugzilla.redhat.com/1422152): Bricks not coming up when ran with address sanitizer @@ -387,7 +416,7 @@ Bugs addressed since release-3.10.0 are listed below. - [#1424815](https://bugzilla.redhat.com/1424815): Fix erronous comparaison of flags resulting in UUID always sent - [#1424894](https://bugzilla.redhat.com/1424894): Some switches don't have breaks causing unintended fall throughs. - [#1424905](https://bugzilla.redhat.com/1424905): Coverity: Memory issues and dead code -- [#1425288](https://bugzilla.redhat.com/1425288): glusterd is not validating for allowed values while setting "cluster.brick-multiplex" property +- [#1425288](https://bugzilla.redhat.com/1425288): glusterd is not validating for allowed values while setting "cluster.brick-multiplex" property - [#1425515](https://bugzilla.redhat.com/1425515): tests: quota-anon-fd-nfs.t needs to check if nfs mount is avialable before mounting - [#1425623](https://bugzilla.redhat.com/1425623): Free all xlator specific resources when xlator->fini() gets called - [#1425676](https://bugzilla.redhat.com/1425676): gfids are not populated in release/releasedir requests @@ -415,8 +444,8 @@ Bugs addressed since release-3.10.0 are listed below. - [#1428510](https://bugzilla.redhat.com/1428510): memory leak in features/locks xlator - [#1429198](https://bugzilla.redhat.com/1429198): Restore atime/mtime for symlinks and other non-regular files. 
- [#1429200](https://bugzilla.redhat.com/1429200): disallow increasing replica count for arbiter volumes -- [#1429330](https://bugzilla.redhat.com/1429330): [crawler]: auxiliary mount remains even after crawler finishes -- [#1429696](https://bugzilla.redhat.com/1429696): ldd libgfxdr.so.0.0.1: undefined symbol: __gf_free +- [#1429330](https://bugzilla.redhat.com/1429330): [crawler]: auxiliary mount remains even after crawler finishes +- [#1429696](https://bugzilla.redhat.com/1429696): ldd libgfxdr.so.0.0.1: undefined symbol: \_\_gf_free - [#1430042](https://bugzilla.redhat.com/1430042): Transport endpoint not connected error seen on client when glusterd is restarted - [#1430148](https://bugzilla.redhat.com/1430148): USS is broken when multiplexing is on - [#1430608](https://bugzilla.redhat.com/1430608): [RFE] Pass slave volume in geo-rep as read-only @@ -452,7 +481,7 @@ Bugs addressed since release-3.10.0 are listed below. - [#1438370](https://bugzilla.redhat.com/1438370): rebalance: Allow admin to change thread count for rebalance - [#1438411](https://bugzilla.redhat.com/1438411): [Ganesha + EC] : Input/Output Error while creating LOTS of smallfiles - [#1438738](https://bugzilla.redhat.com/1438738): Inode ref leak on anonymous reads and writes -- [#1438772](https://bugzilla.redhat.com/1438772): build: clang/llvm has __builtin_ffs() and __builtin_popcount() +- [#1438772](https://bugzilla.redhat.com/1438772): build: clang/llvm has **builtin_ffs() and **builtin_popcount() - [#1438810](https://bugzilla.redhat.com/1438810): File-level WORM allows ftruncate() on read-only files - [#1438858](https://bugzilla.redhat.com/1438858): explicitly specify executor to be bash for tests - [#1439527](https://bugzilla.redhat.com/1439527): [disperse] Don't count healing brick as healthy brick @@ -491,7 +520,7 @@ Bugs addressed since release-3.10.0 are listed below. - [#1449004](https://bugzilla.redhat.com/1449004): [Brick Multiplexing] : Bricks for multiple volumes going down after glusterd restart and not coming back up after volume start force - [#1449191](https://bugzilla.redhat.com/1449191): Multiple bricks WILL crash after TCP port probing - [#1449311](https://bugzilla.redhat.com/1449311): [whql][virtio-block+glusterfs]"Disk Stress" and "Disk Verification" job always failed on win7-32/win2012/win2k8R2 guest -- [#1449775](https://bugzilla.redhat.com/1449775): quota: limit-usage command failed with error " Failed to start aux mount" +- [#1449775](https://bugzilla.redhat.com/1449775): quota: limit-usage command failed with error " Failed to start aux mount" - [#1449921](https://bugzilla.redhat.com/1449921): afr: include quorum type and count when dumping afr priv - [#1449924](https://bugzilla.redhat.com/1449924): When either killing or restarting a brick with performance.stat-prefetch on, stat sometimes returns a bad st_size value. - [#1449933](https://bugzilla.redhat.com/1449933): Brick Multiplexing :- resetting a brick bring down other bricks with same PID @@ -499,25 +528,25 @@ Bugs addressed since release-3.10.0 are listed below. 
- [#1450377](https://bugzilla.redhat.com/1450377): GNFS crashed while taking lock on a file from 2 different clients having same volume mounted from 2 different servers - [#1450565](https://bugzilla.redhat.com/1450565): glfsheal: crashed(segfault) with disperse volume in RDMA - [#1450729](https://bugzilla.redhat.com/1450729): Brick Multiplexing: seeing Input/Output Error for .trashcan -- [#1450933](https://bugzilla.redhat.com/1450933): [New] - Replacing an arbiter brick while I/O happens causes vm pause +- [#1450933](https://bugzilla.redhat.com/1450933): [New] - Replacing an arbiter brick while I/O happens causes vm pause - [#1451033](https://bugzilla.redhat.com/1451033): contrib: timer-wheel 32-bit bug, use builtin_fls, license, etc - [#1451573](https://bugzilla.redhat.com/1451573): AFR returns the node uuid of the same node for every file in the replica - [#1451586](https://bugzilla.redhat.com/1451586): crash in dht_rmdir_do - [#1451591](https://bugzilla.redhat.com/1451591): cli xml status of detach tier broken - [#1451887](https://bugzilla.redhat.com/1451887): Add tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t to bad tests - [#1452000](https://bugzilla.redhat.com/1452000): Spacing issue in fix-layout status output -- [#1453050](https://bugzilla.redhat.com/1453050): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions +- [#1453050](https://bugzilla.redhat.com/1453050): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions - [#1453086](https://bugzilla.redhat.com/1453086): Brick Multiplexing: On reboot of a node Brick multiplexing feature lost on that node as multiple brick processes get spawned - [#1453152](https://bugzilla.redhat.com/1453152): [Parallel Readdir] : Mounts fail when performance.parallel-readdir is set to "off" - [#1454533](https://bugzilla.redhat.com/1454533): lock_revocation.t Marked as bad in 3.11 for CentOS as well - [#1454569](https://bugzilla.redhat.com/1454569): [geo-rep + nl]: Multiple crashes observed on slave with "nlc_lookup_cbk" -- [#1454597](https://bugzilla.redhat.com/1454597): [Tiering]: High and low watermark values when set to the same level, is allowed +- [#1454597](https://bugzilla.redhat.com/1454597): [Tiering]: High and low watermark values when set to the same level, is allowed - [#1454612](https://bugzilla.redhat.com/1454612): glusterd on a node crashed after running volume profile command - [#1454686](https://bugzilla.redhat.com/1454686): Implement FALLOCATE FOP for EC - [#1454853](https://bugzilla.redhat.com/1454853): Seeing error "Failed to get the total number of files. Unable to estimate time to complete rebalance" in rebalance logs - [#1455177](https://bugzilla.redhat.com/1455177): ignore incorrect uuid validation in gd_validate_mgmt_hndsk_req - [#1455423](https://bugzilla.redhat.com/1455423): dht: dht self heal fails with no hashed subvol error -- [#1455907](https://bugzilla.redhat.com/1455907): heal info shows the status of the bricks as "Transport endpoint is not connected" though bricks are up +- [#1455907](https://bugzilla.redhat.com/1455907): heal info shows the status of the bricks as "Transport endpoint is not connected" though bricks are up - [#1456224](https://bugzilla.redhat.com/1456224): [gluster-block]:Need a volume group profile option for gluster-block volume to add necessary options to be added. 
- [#1456225](https://bugzilla.redhat.com/1456225): gluster-block is not working as expected when shard is enabled - [#1456331](https://bugzilla.redhat.com/1456331): [Bitrot]: Brick process crash observed while trying to recover a bad file in disperse volume diff --git a/docs/release-notes/3.11.1.md b/docs/release-notes/3.11.1.md index d6deb23..a212751 100644 --- a/docs/release-notes/3.11.1.md +++ b/docs/release-notes/3.11.1.md @@ -7,6 +7,7 @@ GlusterFS 3.11 stable release. ## Major changes, features and limitations addressed in this release ### Improved disperse performance + Fix for bug [#1456259](https://bugzilla.redhat.com/1456259) changes the way messages are read and processed from the socket layers on the Gluster client. This has shown performance improvements on disperse volumes, and is applicable @@ -14,6 +15,7 @@ to other volume types as well, where there maybe multiple applications or users accessing the same mount point. ### Group settings for enabling negative lookup caching provided + Ability to serve negative lookups from cache was added in 3.11.0 and with this release, a group volume set option is added for ease in enabling this feature. @@ -21,6 +23,7 @@ feature. See [group-nl-cache](https://github.com/gluster/glusterfs/blob/release-3.11/extras/group-nl-cache) for more details. ### Gluster fuse now implements "-oauto_unmount" feature. + libfuse has an auto_unmount option which, if enabled, ensures that the file system is unmounted at FUSE server termination by running a separate monitor process that performs the unmount when that occurs. This release implements that @@ -30,15 +33,17 @@ Note that "auto unmount" (robust or not) is a leaky abstraction, as the kernel cannot guarantee that at the path where the FUSE fs is mounted is actually the toplevel mount at the time of the umount(2) call, for multiple reasons, among others, see: + - fuse-devel: ["fuse: feasible to distinguish between umount and abort?"](http://fuse.996288.n3.nabble.com/fuse-feasible-to-distinguish-between-umount-and-abort-tt14358.html) - https://github.com/libfuse/libfuse/issues/122 ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - Status of this bug can be tracked here, #1465123 ## Bugs addressed @@ -46,7 +51,7 @@ among others, see: Bugs addressed since release-3.11.0 are listed below. - [#1456259](https://bugzilla.redhat.com/1456259): limited throughput with disperse volume over small number of bricks -- [#1457058](https://bugzilla.redhat.com/1457058): glusterfs client crash on io-cache.so(__ioc_page_wakeup+0x44) +- [#1457058](https://bugzilla.redhat.com/1457058): glusterfs client crash on io-cache.so(\_\_ioc_page_wakeup+0x44) - [#1457289](https://bugzilla.redhat.com/1457289): tierd listens to a port. - [#1457339](https://bugzilla.redhat.com/1457339): DHT: slow readdirp performance - [#1457616](https://bugzilla.redhat.com/1457616): "split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf @@ -55,8 +60,8 @@ Bugs addressed since release-3.11.0 are listed below. 
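As a concrete illustration of the negative lookup caching group option mentioned in the 3.11.1 notes above, a minimal sketch; the volume name `testvol` is illustrative, and this assumes the `nl-cache` group file is installed under `/var/lib/glusterd/groups/` by the packages:

```
# gluster volume set testvol group nl-cache
# gluster volume get testvol performance.nl-cache
```

The first command applies all settings from the group file in one step; the second is only a spot check that the individual option was switched on.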
- [#1458664](https://bugzilla.redhat.com/1458664): [Geo-rep]: METADATA errors are seen even though everything is in sync - [#1459090](https://bugzilla.redhat.com/1459090): all: spelling errors (debian package maintainer) - [#1459095](https://bugzilla.redhat.com/1459095): extras/hook-scripts: non-portable shell syntax (debian package maintainer) -- [#1459392](https://bugzilla.redhat.com/1459392): possible repeatedly recursive healing of same file with background heal not happening when IO is going on -- [#1459759](https://bugzilla.redhat.com/1459759): Glusterd segmentation fault in ' _Unwind_Backtrace' while running peer probe +- [#1459392](https://bugzilla.redhat.com/1459392): possible repeatedly recursive healing of same file with background heal not happening when IO is going on +- [#1459759](https://bugzilla.redhat.com/1459759): Glusterd segmentation fault in ' \_Unwind_Backtrace' while running peer probe - [#1460647](https://bugzilla.redhat.com/1460647): posix-acl: Whitelist virtual ACL xattrs - [#1460894](https://bugzilla.redhat.com/1460894): Rebalance estimate time sometimes shows negative values - [#1460895](https://bugzilla.redhat.com/1460895): Upcall missing invalidations diff --git a/docs/release-notes/3.11.2.md b/docs/release-notes/3.11.2.md index a25ac59..34b137c 100644 --- a/docs/release-notes/3.11.2.md +++ b/docs/release-notes/3.11.2.md @@ -10,13 +10,14 @@ There are no major features or changes made in this release. ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - The last known cause for corruption (Bug #1465123) has a fix with this - release. As further testing is still in progress, the issue is retained as - a major issue. + release. As further testing is still in progress, the issue is retained as + a major issue. - Status of this bug can be tracked here, #1465123 ## Bugs addressed @@ -26,8 +27,8 @@ Bugs addressed since release-3.11.0 are listed below. 
- [#1463512](https://bugzilla.redhat.com/1463512): USS: stale snap entries are seen when activation/deactivation performed during one of the glusterd's unavailability - [#1463513](https://bugzilla.redhat.com/1463513): [geo-rep]: extended attributes are not synced if the entry and extended attributes are done within changelog roleover/or entry sync - [#1463517](https://bugzilla.redhat.com/1463517): Brick Multiplexing:dmesg shows request_sock_TCP: Possible SYN flooding on port 49152 and memory related backtraces -- [#1463528](https://bugzilla.redhat.com/1463528): [Perf] 35% drop in small file creates on smbv3 on *2 -- [#1463626](https://bugzilla.redhat.com/1463626): [Ganesha]Bricks got crashed while running posix compliance test suit on V4 mount +- [#1463528](https://bugzilla.redhat.com/1463528): [Perf] 35% drop in small file creates on smbv3 on \*2 +- [#1463626](https://bugzilla.redhat.com/1463626): [Ganesha]Bricks got crashed while running posix compliance test suit on V4 mount - [#1464316](https://bugzilla.redhat.com/1464316): DHT: Pass errno as an argument to gf_msg - [#1465123](https://bugzilla.redhat.com/1465123): Fd based fops fail with EBADF on file migration - [#1465854](https://bugzilla.redhat.com/1465854): Regression: Heal info takes longer time when a brick is down @@ -36,7 +37,7 @@ Bugs addressed since release-3.11.0 are listed below. - [#1467268](https://bugzilla.redhat.com/1467268): Heal info shows incorrect status - [#1468118](https://bugzilla.redhat.com/1468118): disperse seek does not correctly handle the end of file - [#1468200](https://bugzilla.redhat.com/1468200): [Geo-rep]: entry failed to sync to slave with ENOENT errror -- [#1468457](https://bugzilla.redhat.com/1468457): selfheal deamon cpu consumption not reducing when IOs are going on and all redundant bricks are brought down one after another +- [#1468457](https://bugzilla.redhat.com/1468457): selfheal deamon cpu consumption not reducing when IOs are going on and all redundant bricks are brought down one after another - [#1469459](https://bugzilla.redhat.com/1469459): Rebalance hangs on remove-brick if the target volume changes - [#1470938](https://bugzilla.redhat.com/1470938): Regression: non-disruptive(in-service) upgrade on EC volume fails - [#1471025](https://bugzilla.redhat.com/1471025): glusterfs process leaking memory when error occurs diff --git a/docs/release-notes/3.11.3.md b/docs/release-notes/3.11.3.md index e8fcc39..58e7502 100644 --- a/docs/release-notes/3.11.3.md +++ b/docs/release-notes/3.11.3.md @@ -14,13 +14,14 @@ There are no major features or changes made in this release. ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - The last known cause for corruption (Bug #1465123) has a fix with the 3.11.2 - release. As further testing is still in progress, the issue is retained as - a major issue. + release. As further testing is still in progress, the issue is retained as + a major issue. 
- Status of this bug can be tracked here, #1465123 ## Bugs addressed diff --git a/docs/release-notes/3.12.0.md b/docs/release-notes/3.12.0.md index d35e420..15172fd 100644 --- a/docs/release-notes/3.12.0.md +++ b/docs/release-notes/3.12.0.md @@ -20,6 +20,7 @@ captures the list of features that were introduced with 3.11. ## Major changes and features ### Ability to mount sub-directories using the Gluster FUSE protocol + **Notes for users:** With this release, it is possible define sub-directories to be mounted by @@ -31,15 +32,19 @@ client. This feature helps sharing a volume among the multiple consumers along with enabling restricting access to the sub-directory of choice. Option controlling sub-directory allow/deny rules can be set as follows: + ``` # gluster volume set auth.allow "/subdir1(192.168.1.*),/(192.168.10.*),/subdir2(192.168.8.*)" ``` How to mount from the client: + ``` # mount -t glusterfs :// / ``` + Or, + ``` # mount -t glusterfs :/ -osubdir_mount= / ``` @@ -47,14 +52,15 @@ Or, **Limitations:** - There are no throttling or QoS support for this feature. The feature will -just provide the namespace isolation for the different clients. + just provide the namespace isolation for the different clients. **Known Issues:** - Once we cross more than 1000s of subdirs in 'auth.allow' option, the -performance of reconnect / authentication would be impacted. + performance of reconnect / authentication would be impacted. ### GFID to path conversion is enabled by default + **Notes for users:** Prior to this feature, only when quota was enabled, did the on disk data have @@ -80,18 +86,20 @@ None None ### Various enhancements have been made to the output of get-state CLI command + **Notes for users:** The command `#gluster get-state` has been enhanced to output more information as below, + - Arbiter bricks are marked more clearly in a volume that has the feature -enabled + enabled - Ability to get all volume options (both set and defaults) in the get-state -output + output - Rebalance time estimates, for ongoing rebalance, is captured in the get-state -output + output - If geo-replication is configured, then get-state now captures the session -details of the same + details of the same **Limitations:** @@ -102,6 +110,7 @@ None None ### Provided an option to set a limit on number of bricks multiplexed in a processes + **Notes for users:** This release includes a global option to be switched on only if brick @@ -111,19 +120,22 @@ node. If the limit set by this option is insufficient for a single process, more processes are spawned for the subsequent bricks. Usage: + ``` #gluster volume set all cluster.max-bricks-per-process ``` ### Provided an option to use localtime timestamps in log entries + **Limitations:** Gluster defaults to UTC timestamps. glusterd, glusterfsd, and server-side glusterfs daemons will use UTC until one of, + 1. command line option is processed, 2. gluster config (/var/lib/glusterd/options) is loaded, 3. admin manually sets localtime-logging (cluster.localtime-logging, e.g. -`#gluster volume set all cluster.localtime-logging enable`). + `#gluster volume set all cluster.localtime-logging enable`). There is no mount option to make the FUSE client enable localtime logging. @@ -144,6 +156,7 @@ and also enhancing the ability for file placement in the distribute translator when used with the option `min-free-disk`. 
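To make the 3.12.0 usage snippets above concrete, here is a sketch with example values filled in; the host name `server1`, volume name `shared`, sub-directory `projects`, and the limit of 250 bricks per process are all illustrative:

```
# gluster volume set shared auth.allow "/projects(192.168.1.*),/(192.168.10.*)"
# mount -t glusterfs server1:/shared/projects /mnt/projects
# gluster volume set all cluster.max-bricks-per-process 250
```

The auth.allow rule restricts which clients may mount which sub-directory, the mount command attaches only that sub-directory, and the last command caps how many bricks a single glusterfsd process will multiplex when brick multiplexing is enabled.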
### Provided a means to resolve GFID split-brain using the gluster CLI + **Notes for users:** The existing CLI commands to heal files under split-brain did not handle cases @@ -152,6 +165,7 @@ the same CLI commands can now address GFID split-brain situations based on the choices provided. The CLI options that are enhanced to help with this situation are, + ``` volume heal split-brain {bigger-file | latest-mtime | @@ -167,14 +181,16 @@ None None ### Developer related: Added a 'site.h' for more vendor/company specific defaults + **Notes for developers:** **NOTE**: Also relevant for users building from sources and needing different defaults for some options Most people consume Gluster in one of two ways: -* From packages provided by their OS/distribution vendor -* By building themselves from source + +- From packages provided by their OS/distribution vendor +- By building themselves from source For the first group it doesn't matter whether configuration is done in a configure script, via command-line options to that configure script, or in a @@ -198,6 +214,7 @@ file. Further guidelines for how to determine whether an option should go in configure.ac or site.h are explained within site.h itself. ### Developer related: Added xxhash library to libglusterfs for required use + **Notes for developers:** Function gf_xxh64_wrapper has been added as a wrapper into libglusterfs for @@ -206,6 +223,7 @@ consumption by interested developers. Reference to code can be found [here](https://github.com/gluster/glusterfs/blob/v3.12.0alpha1/libglusterfs/src/common-utils.h#L835) ### Developer related: glfs_ipc API in libgfapi is removed as a public interface + **Notes for users:** glfs_ipc API was maintained as a public API in the GFAPI libraries. This has @@ -219,14 +237,15 @@ this change. API ## Major issues + 1. Expanding a gluster volume that is sharded may cause file corruption - - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. - - The last known cause for corruption (Bug #1465123) has a fix with this - release. As further testing is still in progress, the issue is retained as - a major issue. - - Status of this bug can be tracked here, #1465123 + - Sharded volumes are typically used for VM images, if such volumes are + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. + - The last known cause for corruption (Bug #1465123) has a fix with this + release. As further testing is still in progress, the issue is retained as + a major issue. + - Status of this bug can be tracked here, #1465123 ## Bugs addressed @@ -243,13 +262,13 @@ Bugs addressed since release-3.11.0 are listed below. 
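A hedged example of the enhanced split-brain commands described above, with made-up volume, brick, and file names standing in for the operands:

```
# gluster volume heal testvol split-brain latest-mtime /dir/file1
# gluster volume heal testvol split-brain source-brick server1:/bricks/brick1 /dir/file1
```

The first form picks the copy with the newest modification time as the heal source; the second nominates a specific brick as the source for the given file.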
- [#1400924](https://bugzilla.redhat.com/1400924): [RFE] Rsync flags for performance improvements - [#1402406](https://bugzilla.redhat.com/1402406): Client stale file handle error in dht-linkfile.c under SPEC SFS 2014 VDA workload - [#1414242](https://bugzilla.redhat.com/1414242): [whql][virtio-block+glusterfs]"Disk Stress" and "Disk Verification" job always failed on win7-32/win2012/win2k8R2 guest -- [#1421938](https://bugzilla.redhat.com/1421938): systemic testing: seeing lot of ping time outs which would lead to splitbrains +- [#1421938](https://bugzilla.redhat.com/1421938): systemic testing: seeing lot of ping time outs which would lead to splitbrains - [#1424817](https://bugzilla.redhat.com/1424817): Fix wrong operators, found by coverty - [#1428061](https://bugzilla.redhat.com/1428061): Halo Replication feature for AFR translator -- [#1428673](https://bugzilla.redhat.com/1428673): possible repeatedly recursive healing of same file with background heal not happening when IO is going on +- [#1428673](https://bugzilla.redhat.com/1428673): possible repeatedly recursive healing of same file with background heal not happening when IO is going on - [#1430608](https://bugzilla.redhat.com/1430608): [RFE] Pass slave volume in geo-rep as read-only - [#1431908](https://bugzilla.redhat.com/1431908): Enabling parallel-readdir causes dht linkto files to be visible on the mount, -- [#1433906](https://bugzilla.redhat.com/1433906): quota: limit-usage command failed with error " Failed to start aux mount" +- [#1433906](https://bugzilla.redhat.com/1433906): quota: limit-usage command failed with error " Failed to start aux mount" - [#1437748](https://bugzilla.redhat.com/1437748): Spacing issue in fix-layout status output - [#1438966](https://bugzilla.redhat.com/1438966): Multiple bricks WILL crash after TCP port probing - [#1439068](https://bugzilla.redhat.com/1439068): Segmentation fault when creating a qcow2 with qemu-img @@ -270,7 +289,7 @@ Bugs addressed since release-3.11.0 are listed below. - [#1447826](https://bugzilla.redhat.com/1447826): potential endless loop in function glusterfs_graph_validate_options - [#1447828](https://bugzilla.redhat.com/1447828): Should use dict_set_uint64 to set fd->pid when dump fd's info to dict - [#1447953](https://bugzilla.redhat.com/1447953): Remove inadvertently merged IPv6 code -- [#1447960](https://bugzilla.redhat.com/1447960): [Tiering]: High and low watermark values when set to the same level, is allowed +- [#1447960](https://bugzilla.redhat.com/1447960): [Tiering]: High and low watermark values when set to the same level, is allowed - [#1447966](https://bugzilla.redhat.com/1447966): 'make cscope' fails on a clean tree due to missing generated XDR files - [#1448150](https://bugzilla.redhat.com/1448150): USS: stale snap entries are seen when activation/deactivation performed during one of the glusterd's unavailability - [#1448265](https://bugzilla.redhat.com/1448265): use common function iov_length to instead of duplicate code @@ -286,7 +305,7 @@ Bugs addressed since release-3.11.0 are listed below. - [#1449329](https://bugzilla.redhat.com/1449329): When either killing or restarting a brick with performance.stat-prefetch on, stat sometimes returns a bad st_size value. 
- [#1449348](https://bugzilla.redhat.com/1449348): disperse seek does not correctly handle the end of file - [#1449495](https://bugzilla.redhat.com/1449495): glfsheal: crashed(segfault) with disperse volume in RDMA -- [#1449610](https://bugzilla.redhat.com/1449610): [New] - Replacing an arbiter brick while I/O happens causes vm pause +- [#1449610](https://bugzilla.redhat.com/1449610): [New] - Replacing an arbiter brick while I/O happens causes vm pause - [#1450010](https://bugzilla.redhat.com/1450010): [gluster-block]:Need a volume group profile option for gluster-block volume to add necessary options to be added. - [#1450559](https://bugzilla.redhat.com/1450559): Error 0-socket.management: socket_poller XX.XX.XX.XX:YYY failed (Input/output error) during any volume operation - [#1450630](https://bugzilla.redhat.com/1450630): [brick multiplexing] detach a brick if posix health check thread complaints about underlying brick @@ -299,7 +318,7 @@ Bugs addressed since release-3.11.0 are listed below. - [#1451724](https://bugzilla.redhat.com/1451724): glusterfind pre crashes with "UnicodeDecodeError: 'utf8' codec can't decode" error when the `--no-encode` is used - [#1452006](https://bugzilla.redhat.com/1452006): tierd listens to a port. - [#1452084](https://bugzilla.redhat.com/1452084): [Ganesha] : Stale linkto files after unsuccessfuly hardlinks -- [#1452102](https://bugzilla.redhat.com/1452102): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions +- [#1452102](https://bugzilla.redhat.com/1452102): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions - [#1452378](https://bugzilla.redhat.com/1452378): Cleanup unnecessary logs in fix_quorum_options - [#1452527](https://bugzilla.redhat.com/1452527): Shared volume doesn't get mounted on few nodes after rebooting all nodes in cluster. - [#1452956](https://bugzilla.redhat.com/1452956): glusterd on a node crashed after running volume profile command @@ -307,9 +326,9 @@ Bugs addressed since release-3.11.0 are listed below. 
- [#1453977](https://bugzilla.redhat.com/1453977): Brick Multiplexing: Deleting brick directories of the base volume must gracefully detach from glusterfsd without impacting other volumes IO(currently seeing transport end point error) - [#1454317](https://bugzilla.redhat.com/1454317): [Bitrot]: Brick process crash observed while trying to recover a bad file in disperse volume - [#1454375](https://bugzilla.redhat.com/1454375): ignore incorrect uuid validation in gd_validate_mgmt_hndsk_req -- [#1454418](https://bugzilla.redhat.com/1454418): Glusterd segmentation fault in ' _Unwind_Backtrace' while running peer probe +- [#1454418](https://bugzilla.redhat.com/1454418): Glusterd segmentation fault in ' \_Unwind_Backtrace' while running peer probe - [#1454701](https://bugzilla.redhat.com/1454701): DHT: Pass errno as an argument to gf_msg -- [#1454865](https://bugzilla.redhat.com/1454865): [Brick Multiplexing] heal info shows the status of the bricks as "Transport endpoint is not connected" though bricks are up +- [#1454865](https://bugzilla.redhat.com/1454865): [Brick Multiplexing] heal info shows the status of the bricks as "Transport endpoint is not connected" though bricks are up - [#1454872](https://bugzilla.redhat.com/1454872): [Geo-rep]: Make changelog batch size configurable - [#1455049](https://bugzilla.redhat.com/1455049): [GNFS+EC] Unable to release the lock when the other client tries to acquire the lock on the same file - [#1455104](https://bugzilla.redhat.com/1455104): dht: dht self heal fails with no hashed subvol error @@ -317,8 +336,8 @@ Bugs addressed since release-3.11.0 are listed below. - [#1455301](https://bugzilla.redhat.com/1455301): gluster-block is not working as expected when shard is enabled - [#1455559](https://bugzilla.redhat.com/1455559): [Geo-rep]: METADATA errors are seen even though everything is in sync - [#1455831](https://bugzilla.redhat.com/1455831): libglusterfs: updates old comment for 'arena_size' -- [#1456361](https://bugzilla.redhat.com/1456361): DHT : for many operation directory/file path is '(null)' in brick log -- [#1456385](https://bugzilla.redhat.com/1456385): glusterfs client crash on io-cache.so(__ioc_page_wakeup+0x44) +- [#1456361](https://bugzilla.redhat.com/1456361): DHT : for many operation directory/file path is '(null)' in brick log +- [#1456385](https://bugzilla.redhat.com/1456385): glusterfs client crash on io-cache.so(\_\_ioc_page_wakeup+0x44) - [#1456405](https://bugzilla.redhat.com/1456405): Brick Multiplexing:dmesg shows request_sock_TCP: Possible SYN flooding on port 49152 and memory related backtraces - [#1456582](https://bugzilla.redhat.com/1456582): "split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf - [#1456653](https://bugzilla.redhat.com/1456653): nlc_lookup_cbk floods logs @@ -333,7 +352,7 @@ Bugs addressed since release-3.11.0 are listed below. 
- [#1458197](https://bugzilla.redhat.com/1458197): io-stats usability/performance statistics enhancements - [#1458539](https://bugzilla.redhat.com/1458539): [Negative Lookup]: negative lookup features doesn't seem to work on restart of volume - [#1458582](https://bugzilla.redhat.com/1458582): add all as volume option in gluster volume get usage -- [#1458768](https://bugzilla.redhat.com/1458768): [Perf] 35% drop in small file creates on smbv3 on *2 +- [#1458768](https://bugzilla.redhat.com/1458768): [Perf] 35% drop in small file creates on smbv3 on \*2 - [#1459402](https://bugzilla.redhat.com/1459402): brick process crashes while running bug-1432542-mpx-restart-crash.t in a loop - [#1459530](https://bugzilla.redhat.com/1459530): [RFE] Need a way to resolve gfid split brains - [#1459620](https://bugzilla.redhat.com/1459620): [geo-rep]: Worker crashed with TypeError: expected string or buffer @@ -349,17 +368,17 @@ Bugs addressed since release-3.11.0 are listed below. - [#1461655](https://bugzilla.redhat.com/1461655): glusterd crashes when statedump is taken - [#1461792](https://bugzilla.redhat.com/1461792): lk fop succeeds even when lock is not acquired on at least quorum number of bricks - [#1461845](https://bugzilla.redhat.com/1461845): [Bitrot]: Inconsistency seen with 'scrub ondemand' - fails to trigger scrub -- [#1462200](https://bugzilla.redhat.com/1462200): glusterd status showing failed when it's stopped in RHEL7 +- [#1462200](https://bugzilla.redhat.com/1462200): glusterd status showing failed when it's stopped in RHEL7 - [#1462241](https://bugzilla.redhat.com/1462241): glusterfind: syntax error due to uninitialized variable 'end' - [#1462790](https://bugzilla.redhat.com/1462790): with AFR now making both nodes to return UUID for a file will result in georep consuming more resources -- [#1463178](https://bugzilla.redhat.com/1463178): [Ganesha]Bricks got crashed while running posix compliance test suit on V4 mount +- [#1463178](https://bugzilla.redhat.com/1463178): [Ganesha]Bricks got crashed while running posix compliance test suit on V4 mount - [#1463365](https://bugzilla.redhat.com/1463365): Changes for Maintainers 2.0 - [#1463648](https://bugzilla.redhat.com/1463648): Use GF_XATTR_LIST_NODE_UUIDS_KEY to figure out local subvols - [#1464072](https://bugzilla.redhat.com/1464072): cns-brick-multiplexing: brick process fails to restart after gluster pod failure - [#1464091](https://bugzilla.redhat.com/1464091): Regression: Heal info takes longer time when a brick is down - [#1464110](https://bugzilla.redhat.com/1464110): [Scale] : Rebalance ETA (towards the end) may be inaccurate,even on a moderately large data set. - [#1464327](https://bugzilla.redhat.com/1464327): glusterfs client crashes when reading large directory -- [#1464359](https://bugzilla.redhat.com/1464359): selfheal deamon cpu consumption not reducing when IOs are going on and all redundant bricks are brought down one after another +- [#1464359](https://bugzilla.redhat.com/1464359): selfheal deamon cpu consumption not reducing when IOs are going on and all redundant bricks are brought down one after another - [#1465024](https://bugzilla.redhat.com/1465024): glusterfind: DELETE path needs to be unquoted before further processing - [#1465075](https://bugzilla.redhat.com/1465075): Fd based fops fail with EBADF on file migration - [#1465214](https://bugzilla.redhat.com/1465214): build failed with GF_DISABLE_MEMPOOL @@ -424,7 +443,7 @@ Bugs addressed since release-3.11.0 are listed below. 
- [#1479717](https://bugzilla.redhat.com/1479717): Running sysbench on vm disk from plain distribute gluster volume causes disk corruption - [#1480448](https://bugzilla.redhat.com/1480448): More useful error - replace 'not optimal' - [#1480459](https://bugzilla.redhat.com/1480459): Gluster puts PID files in wrong place -- [#1481931](https://bugzilla.redhat.com/1481931): [Scale] : I/O errors on multiple gNFS mounts with "Stale file handle" during rebalance of an erasure coded volume. +- [#1481931](https://bugzilla.redhat.com/1481931): [Scale] : I/O errors on multiple gNFS mounts with "Stale file handle" during rebalance of an erasure coded volume. - [#1482804](https://bugzilla.redhat.com/1482804): Negative Test: glusterd crashes for some of the volume options if set at cluster level - [#1482835](https://bugzilla.redhat.com/1482835): glusterd fails to start - [#1483402](https://bugzilla.redhat.com/1483402): DHT: readdirp fails to read some directories. @@ -432,6 +451,6 @@ Bugs addressed since release-3.11.0 are listed below. - [#1484440](https://bugzilla.redhat.com/1484440): packaging: /run and /var/run; prefer /run - [#1484885](https://bugzilla.redhat.com/1484885): [rpc]: EPOLLERR - disconnecting now messages every 3 secs after completing rebalance - [#1486107](https://bugzilla.redhat.com/1486107): /var/lib/glusterd/peers File had a blank line, Stopped Glusterd from starting -- [#1486110](https://bugzilla.redhat.com/1486110): [quorum]: Replace brick is happened when Quorum not met. +- [#1486110](https://bugzilla.redhat.com/1486110): [quorum]: Replace brick is happened when Quorum not met. - [#1486120](https://bugzilla.redhat.com/1486120): symlinks trigger faulty geo-replication state (rsnapshot usecase) - [#1486122](https://bugzilla.redhat.com/1486122): gluster-block profile needs to have strict-o-direct diff --git a/docs/release-notes/3.12.1.md b/docs/release-notes/3.12.1.md index 04ce830..76dca9d 100644 --- a/docs/release-notes/3.12.1.md +++ b/docs/release-notes/3.12.1.md @@ -1,20 +1,23 @@ # Release notes for Gluster 3.12.1 This is a bugfix release. The [Release Notes for 3.12.0](3.12.0.md), - [3.12.1](3.12.1.md) contain a listing of all the new features that - were added and bugs fixed in the GlusterFS 3.12 stable release. +[3.12.1](3.12.1.md) contain a listing of all the new features that +were added and bugs fixed in the GlusterFS 3.12 stable release. ## Major changes, features and limitations addressed in this release + No Major changes ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption + +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - The last known cause for corruption (Bug #1465123) has a fix with this - release. As further testing is still in progress, the issue is retained as - a major issue. + release. As further testing is still in progress, the issue is retained as + a major issue. - Status of this bug can be tracked here, #1465123 ## Bugs addressed @@ -24,7 +27,7 @@ This is a bugfix release. 
The [Release Notes for 3.12.0](3.12.0.md), - [#1486538](https://bugzilla.redhat.com/1486538): [geo-rep+qr]: Crashes observed at slave from qr_lookup_sbk during rename/hardlink/rebalance - [#1486557](https://bugzilla.redhat.com/1486557): Log entry of files skipped/failed during rebalance operation - [#1487033](https://bugzilla.redhat.com/1487033): rpc: client_t and related objects leaked due to incorrect ref counts -- [#1487319](https://bugzilla.redhat.com/1487319): afr: check op_ret value in __afr_selfheal_name_impunge +- [#1487319](https://bugzilla.redhat.com/1487319): afr: check op_ret value in \_\_afr_selfheal_name_impunge - [#1488119](https://bugzilla.redhat.com/1488119): scripts: mount.glusterfs contains non-portable bashisms - [#1488168](https://bugzilla.redhat.com/1488168): Launch metadata heal in discover code path. - [#1488387](https://bugzilla.redhat.com/1488387): gluster-blockd process crashed and core generated diff --git a/docs/release-notes/3.12.10.md b/docs/release-notes/3.12.10.md index 6a66ab3..bdbcca5 100644 --- a/docs/release-notes/3.12.10.md +++ b/docs/release-notes/3.12.10.md @@ -16,11 +16,12 @@ features that were added and bugs fixed in the GlusterFS 3.12 stable release. Bugs addressed since release-3.12.9 are listed below . + - [#1570475](https://bugzilla.redhat.com/1570475): Rebalance on few nodes doesn't seem to complete - stuck at FUTEX_WAIT - [#1576816](https://bugzilla.redhat.com/1576816): GlusterFS can be improved - [#1577164](https://bugzilla.redhat.com/1577164): gfapi: broken symbol versions - [#1577845](https://bugzilla.redhat.com/1577845): Geo-rep: faulty session due to OSError: [Errno 95] Operation not supported -- [#1577862](https://bugzilla.redhat.com/1577862): [geo-rep]: Upgrade fails, session in FAULTY state +- [#1577862](https://bugzilla.redhat.com/1577862): [geo-rep]: Upgrade fails, session in FAULTY state - [#1577868](https://bugzilla.redhat.com/1577868): Glusterd crashed on a few (master) nodes - [#1577871](https://bugzilla.redhat.com/1577871): [geo-rep]: Geo-rep scheduler fails - [#1580519](https://bugzilla.redhat.com/1580519): the regression test "tests/bugs/posix/bug-990028.t" fails diff --git a/docs/release-notes/3.12.11.md b/docs/release-notes/3.12.11.md index 3845a4e..9efe037 100644 --- a/docs/release-notes/3.12.11.md +++ b/docs/release-notes/3.12.11.md @@ -8,6 +8,7 @@ GlusterFS 3.12 stable release. 
## Major changes, features and limitations addressed in this release This release contains a fix for a security vulerability in Gluster as follows, + - http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-10841 - https://nvd.nist.gov/vuln/detail/CVE-2018-10841 diff --git a/docs/release-notes/3.12.12.md b/docs/release-notes/3.12.12.md index 477ddb0..d6caa97 100644 --- a/docs/release-notes/3.12.12.md +++ b/docs/release-notes/3.12.12.md @@ -16,6 +16,7 @@ all the new features that were added and bugs fixed in the GlusterFS 3.12 stable ## Bugs addressed Bugs addressed since release-3.12.12 are listed below + - [#1579673](https://bugzilla.redhat.com/1579673): Remove EIO from the dht_inode_missing macro - [#1595528](https://bugzilla.redhat.com/1595528): rmdir is leaking softlinks to directories in .glusterfs - [#1597120](https://bugzilla.redhat.com/1597120): Add quorum checks in post-op diff --git a/docs/release-notes/3.12.13.md b/docs/release-notes/3.12.13.md index e69b473..946f109 100644 --- a/docs/release-notes/3.12.13.md +++ b/docs/release-notes/3.12.13.md @@ -16,9 +16,9 @@ contain a listing of all the new features that were added and bugs fixed in the ## Bugs addressed Bugs addressed in release-3.12.13 are listed below -- [#1599788](https://bugzilla.redhat.com/1599788): _is_prefix should return false for 0-length strings + +- [#1599788](https://bugzilla.redhat.com/1599788): \_is_prefix should return false for 0-length strings - [#1603093](https://bugzilla.redhat.com/1603093): directories are invisible on client side - [#1613512](https://bugzilla.redhat.com/1613512): Backport glusterfs-client memory leak fix to 3.12.x - [#1618838](https://bugzilla.redhat.com/1618838): gluster bash completion leaks TOP=0 into the environment - [#1618348](https://bugzilla.redhat.com/1618348): [Ganesha] Ganesha crashed in mdcache_alloc_and_check_handle while running bonnie and untars with parallel lookups - diff --git a/docs/release-notes/3.12.14.md b/docs/release-notes/3.12.14.md index 487e623..d911494 100644 --- a/docs/release-notes/3.12.14.md +++ b/docs/release-notes/3.12.14.md @@ -7,7 +7,9 @@ and [3.12.13](3.12.13.md) contain a listing of all the new features that were ad the GlusterFS 3.12 stable release. ## Major changes, features and limitations addressed in this release + 1. This release contains fix for following security vulnerabilities, + - https://nvd.nist.gov/vuln/detail/CVE-2018-10904 - https://nvd.nist.gov/vuln/detail/CVE-2018-10907 - https://nvd.nist.gov/vuln/detail/CVE-2018-10911 @@ -21,10 +23,11 @@ the GlusterFS 3.12 stable release. - https://nvd.nist.gov/vuln/detail/CVE-2018-10930 2. To resolve the security vulnerabilities following limitations were made in GlusterFS - - open,read,write on special files like char and block are no longer permitted - - io-stat xlator can dump stat info only to /var/run/gluster directory -3. Addressed an issue that affected copying a file over SSL/TLS in a volume + - open,read,write on special files like char and block are no longer permitted + - io-stat xlator can dump stat info only to /var/run/gluster directory + +3. Addressed an issue that affected copying a file over SSL/TLS in a volume Installing the updated packages and restarting gluster services on gluster brick hosts, will fix the security issues. @@ -38,7 +41,7 @@ brick hosts, will fix the security issues. Bugs addressed since release-3.12.14 are listed below. 
- [#1622405](https://bugzilla.redhat.com/1622405): Problem with SSL/TLS encryption on Gluster 4.0 & 4.1 -- [#1625286](https://bugzilla.redhat.com/1625286): Information Exposure in posix_get_file_contents function in posix-helpers.c +- [#1625286](https://bugzilla.redhat.com/1625286): Information Exposure in posix_get_file_contents function in posix-helpers.c - [#1625648](https://bugzilla.redhat.com/1625648): I/O to arbitrary devices on storage server - [#1625654](https://bugzilla.redhat.com/1625654): Stack-based buffer overflow in server-rpc-fops.c allows remote attackers to execute arbitrary code - [#1625656](https://bugzilla.redhat.com/1625656): Improper deserialization in dict.c:dict_unserialize() can allow attackers to read arbitrary memory diff --git a/docs/release-notes/3.12.2.md b/docs/release-notes/3.12.2.md index 3f0ab9a..05943a9 100644 --- a/docs/release-notes/3.12.2.md +++ b/docs/release-notes/3.12.2.md @@ -5,6 +5,7 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3. fixed in the GlusterFS 3.12 stable release. ## Major changes, features and limitations addressed in this release + 1.) In a pure distribute volume there is no source to heal the replaced brick from and hence would cause a loss of data that was present in the replaced brick. The CLI has been enhanced to prevent a user from inadvertently using replace brick @@ -12,31 +13,32 @@ fixed in the GlusterFS 3.12 stable release. an existing brick in a pure distribute volume. ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption + +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - The last known cause for corruption #1465123 is still pending, and not yet - part of this release. + part of this release. -2. Gluster volume restarts fail if the sub directory export feature is in use. - Status of this issue can be tracked here, #1501315 +2. Gluster volume restarts fail if the sub directory export feature is in use. + Status of this issue can be tracked here, #1501315 -3. Mounting a gluster snapshot will fail, when attempting a FUSE based mount of - the snapshot. So for the current users, it is recommend to only access snapshot - via ".snaps" directory on a mounted gluster volume. - Status of this issue can be tracked here, #1501378 +3. Mounting a gluster snapshot will fail, when attempting a FUSE based mount of + the snapshot. So for the current users, it is recommend to only access snapshot + via ".snaps" directory on a mounted gluster volume. + Status of this issue can be tracked here, #1501378 ## Bugs addressed A total of 31 patches have been merged, addressing 28 bugs - - [#1490493](https://bugzilla.redhat.com/1490493): Sub-directory mount details are incorrect in /proc/mounts - [#1491178](https://bugzilla.redhat.com/1491178): GlusterD returns a bad memory pointer in glusterd_get_args_from_dict() - [#1491292](https://bugzilla.redhat.com/1491292): Provide brick list as part of VOLUME_CREATE event. 
- [#1491690](https://bugzilla.redhat.com/1491690): rpc: TLSv1_2_method() is deprecated in OpenSSL-1.1 -- [#1492026](https://bugzilla.redhat.com/1492026): set the shard-block-size to 64MB in virt profile +- [#1492026](https://bugzilla.redhat.com/1492026): set the shard-block-size to 64MB in virt profile - [#1492061](https://bugzilla.redhat.com/1492061): CLIENT_CONNECT event not being raised - [#1492066](https://bugzilla.redhat.com/1492066): AFR_SUBVOL_UP and AFR_SUBVOLS_DOWN events not working - [#1493975](https://bugzilla.redhat.com/1493975): disallow replace brick operation on plain distribute volume diff --git a/docs/release-notes/3.12.3.md b/docs/release-notes/3.12.3.md index 97d98e5..1c96e73 100644 --- a/docs/release-notes/3.12.3.md +++ b/docs/release-notes/3.12.3.md @@ -5,22 +5,22 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3. were added and bugs fixed in the GlusterFS 3.12 stable release. ## Major changes, features and limitations addressed in this release -1. The two regression related to with subdir mount got fixed - - gluster volume restart failure (#1465123) - - mounting gluster snapshot via fuse (#1501378) + +1. The two regression related to with subdir mount got fixed - gluster volume restart failure (#1465123) - mounting gluster snapshot via fuse (#1501378) 2. Improvements for "help" command with in gluster cli (#1509786) 3. Introduction of new api glfs_fd_set_lkowner() to set lock owner - ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption + +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - The last known cause for corruption #1465123 is still pending, and not yet - part of this release. + part of this release. ## Bugs addressed diff --git a/docs/release-notes/3.12.4.md b/docs/release-notes/3.12.4.md index d8157b4..ff3cd37 100644 --- a/docs/release-notes/3.12.4.md +++ b/docs/release-notes/3.12.4.md @@ -5,19 +5,21 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3. the new features that were added and bugs fixed in the GlusterFS 3.12 stable release. ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption + +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - The last known cause for corruption #1465123 is still pending, and not yet - part of this release. + part of this release. 
## Bugs addressed A total of 13 patches have been merged, addressing 12 bugs - [#1478411](https://bugzilla.redhat.com/1478411): Directory listings on fuse mount are very slow due to small number of getdents() entries -- [#1511782](https://bugzilla.redhat.com/1511782): In Replica volume 2*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon +- [#1511782](https://bugzilla.redhat.com/1511782): In Replica volume 2\*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon - [#1512432](https://bugzilla.redhat.com/1512432): Test bug-1483058-replace-brick-quorum-validation.t fails inconsistently - [#1513258](https://bugzilla.redhat.com/1513258): NetBSD port - [#1514380](https://bugzilla.redhat.com/1514380): default timeout of 5min not honored for analyzing split-brain files post setfattr replica.split-brain-heal-finalize diff --git a/docs/release-notes/3.12.5.md b/docs/release-notes/3.12.5.md index 050c3e5..9c5edec 100644 --- a/docs/release-notes/3.12.5.md +++ b/docs/release-notes/3.12.5.md @@ -4,16 +4,19 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3. [3.12.2](3.12.2.md), [3.12.3](3.12.3.md), [3.12.4](3.12.4.md), [3.12.5](3.12.5.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.12 stable release. ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption + +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - The last known cause for corruption #1465123 is still pending, and not yet - part of this release. + part of this release. ## Bugs addressed A total of 12 patches have been merged, addressing 11 bugs + - [#1489043](https://bugzilla.redhat.com/1489043): The number of bytes of the quota specified in version 3.7 or later is incorrect - [#1511301](https://bugzilla.redhat.com/1511301): In distribute volume after glusterd restart, brick goes offline - [#1525850](https://bugzilla.redhat.com/1525850): rdma transport may access an obsolete item in gf_rdma_device_t->all_mr, and causes glusterfsd/glusterfs process crash. diff --git a/docs/release-notes/3.12.6.md b/docs/release-notes/3.12.6.md index 4be56b4..b778b48 100644 --- a/docs/release-notes/3.12.6.md +++ b/docs/release-notes/3.12.6.md @@ -3,29 +3,32 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.12.1.md), [3.12.2](3.12.2.md), [3.12.3](3.12.3.md), [3.12.4](3.12.4.md), [3.12.5](3.12.5.md), [3.12.5](3.12.6.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.12 stable release. ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption + +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. 
- The last known cause for corruption #1465123 is still pending, and not yet - part of this release. + part of this release. ## Bugs addressed A total of 16 patches have been merged, addressing 16 bugs + - [#1510342](https://bugzilla.redhat.com/1510342): Not all files synced using geo-replication - [#1533269](https://bugzilla.redhat.com/1533269): Random GlusterFSD process dies during rebalance - [#1534847](https://bugzilla.redhat.com/1534847): entries not getting cleared post healing of softlinks (stale entries showing up in heal info) - [#1536334](https://bugzilla.redhat.com/1536334): [Disperse] Implement open fd heal for disperse volume - [#1537346](https://bugzilla.redhat.com/1537346): glustershd/glusterd is not using right port when connecting to glusterfsd process - [#1539516](https://bugzilla.redhat.com/1539516): DHT log messages: Found anomalies in (null) (gfid = 00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0 -- [#1540224](https://bugzilla.redhat.com/1540224): dht_(f)xattrop does not implement migration checks +- [#1540224](https://bugzilla.redhat.com/1540224): dht\_(f)xattrop does not implement migration checks - [#1541267](https://bugzilla.redhat.com/1541267): dht_layout_t leak in dht_populate_inode_for_dentry - [#1541930](https://bugzilla.redhat.com/1541930): A down brick is incorrectly considered to be online and makes the volume to be started without any brick available - [#1542054](https://bugzilla.redhat.com/1542054): tests/bugs/cli/bug-1169302.t fails spuriously - [#1542475](https://bugzilla.redhat.com/1542475): Random failures in tests/bugs/nfs/bug-974972.t - [#1542601](https://bugzilla.redhat.com/1542601): The used space in the volume increases when the volume is expanded -- [#1542615](https://bugzilla.redhat.com/1542615): tests/bugs/core/multiplex-limit-issue-151.t fails sometimes in upstream master +- [#1542615](https://bugzilla.redhat.com/1542615): tests/bugs/core/multiplex-limit-issue-151.t fails sometimes in upstream master - [#1542826](https://bugzilla.redhat.com/1542826): Mark tests/bugs/posix/bug-990028.t bad on release-3.12 - [#1542934](https://bugzilla.redhat.com/1542934): Seeing timer errors in the rebalance logs - [#1543016](https://bugzilla.redhat.com/1543016): dht_lookup_unlink_of_false_linkto_cbk fails with "Permission denied" diff --git a/docs/release-notes/3.12.7.md b/docs/release-notes/3.12.7.md index 6ff2da3..c841277 100644 --- a/docs/release-notes/3.12.7.md +++ b/docs/release-notes/3.12.7.md @@ -1,17 +1,19 @@ # Release notes for Gluster 3.12.7 This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.12.1.md), [3.12.2](3.12.2.md), [3.12.3](3.12.3.md), [3.12.4](3.12.4.md), [3.12.5](3.12.5.md), [3.12.6](3.12.6.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.12 stable release. + ## Bugs addressed ## Major issues + 1. Consider a case in which one of the nodes goes down in gluster cluster with brick multiplexing enabled, if volume operations are performed then post when the node comes back, brick processes will fail to come up. The issue is tracked in #1543708 and will be fixed by next release. 
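For context on the brick-multiplexing issue above, the feature is a cluster-wide toggle; a minimal sketch, assuming `on` is an accepted value in this release:

```
# gluster volume set all cluster.brick-multiplex on
# gluster volume status
```

Checking `gluster volume status` after a node rejoins the cluster shows whether the affected brick processes came back online.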
- A total of 8 patches have been merged, addressing 8 bugs + A total of 8 patches have been merged, addressing 8 bugs - [#1517260](https://bugzilla.redhat.com/1517260): Volume wrong size - [#1543709](https://bugzilla.redhat.com/1543709): Optimize glusterd_import_friend_volume code path - [#1544635](https://bugzilla.redhat.com/1544635): Though files are in split-brain able to perform writes to the file -- [#1547841](https://bugzilla.redhat.com/1547841): Typo error in __dht_check_free_space function log message +- [#1547841](https://bugzilla.redhat.com/1547841): Typo error in \_\_dht_check_free_space function log message - [#1548078](https://bugzilla.redhat.com/1548078): [Rebalance] "Migrate file failed: : failed to get xattr [No data available]" warnings in rebalance logs - [#1548270](https://bugzilla.redhat.com/1548270): DHT calls dht_lookup_everywhere for 1xn volumes - [#1549505](https://bugzilla.redhat.com/1549505): Backport patch to reduce duplicate code in server-rpc-fops.c diff --git a/docs/release-notes/3.12.8.md b/docs/release-notes/3.12.8.md index 7117a26..048ec3b 100644 --- a/docs/release-notes/3.12.8.md +++ b/docs/release-notes/3.12.8.md @@ -1,9 +1,11 @@ # Release notes for Gluster 3.12.8 This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.12.1.md), [3.12.2](3.12.2.md), [3.12.3](3.12.3.md), [3.12.4](3.12.4.md), [3.12.5](3.12.5.md), [3.12.6](3.12.6.md), [3.12.7](3.12.7.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.12 stable release. + ## Bugs addressed A total of 9 patches have been merged, addressing 9 bugs + - [#1543708](https://bugzilla.redhat.com/1543708): glusterd fails to attach brick during restart of the node - [#1546627](https://bugzilla.redhat.com/1546627): Syntactical errors in hook scripts for managing SELinux context on bricks - [#1549473](https://bugzilla.redhat.com/1549473): possible memleak in glusterfsd process with brick multiplexing on @@ -12,4 +14,4 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3. - [#1558352](https://bugzilla.redhat.com/1558352): [EC] Read performance of EC volume exported over gNFS is significantly lower than write performance - [#1561731](https://bugzilla.redhat.com/1561731): Rebalance failures on a dispersed volume with lookup-optimize enabled - [#1562723](https://bugzilla.redhat.com/1562723): SHD is not healing entries in halo replication -- [#1565590](https://bugzilla.redhat.com/1565590): timer: Possible race condition between gf_timer_* routines +- [#1565590](https://bugzilla.redhat.com/1565590): timer: Possible race condition between gf*timer*\* routines diff --git a/docs/release-notes/3.12.9.md b/docs/release-notes/3.12.9.md index b6d4818..8576c93 100644 --- a/docs/release-notes/3.12.9.md +++ b/docs/release-notes/3.12.9.md @@ -7,6 +7,7 @@ features that were added and bugs fixed in the GlusterFS 3.12 stable release. ## Major changes, features and limitations addressed in this release This release contains a fix for a security vulerability in Gluster as follows, + - http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-1088 - https://nvd.nist.gov/vuln/detail/CVE-2018-1088 diff --git a/docs/release-notes/3.13.0.md b/docs/release-notes/3.13.0.md index 687f69d..b0114ee 100644 --- a/docs/release-notes/3.13.0.md +++ b/docs/release-notes/3.13.0.md @@ -15,11 +15,13 @@ The Gluster heal info CLI now has a 'summary' option displaying the statistics of entries pending heal, in split-brain and currently being healed, per brick. 
Usage: + ``` # gluster volume heal info summary ``` Sample output: + ``` Brick Status: Connected @@ -68,7 +70,7 @@ before, even when only 1 brick is online. Further reference: [mailing list discussions on topic](http://lists.gluster.org/pipermail/gluster-users/2017-September/032524.html) -### Support for max-port range in glusterd.vol +### Support for max-port range in glusterd.vol **Notes for users:** @@ -102,6 +104,7 @@ endpoint (called gfproxy) on the gluster server nodes, thus thinning the client stack. Usage: + ``` # gluster volume set config.gfproxyd enable ``` @@ -110,6 +113,7 @@ The above enables the gfproxy protocol service on the server nodes. To mount a client that interacts with this end point, use the --thin-client mount option. Example: + ``` # glusterfs --thin-client --volfile-id= --volfile-server= ``` @@ -134,6 +138,7 @@ feature is disabled. The option takes a numeric percentage value, that reserves up to that percentage of disk space. Usage: + ``` # gluster volume set storage.reserve ``` @@ -146,6 +151,7 @@ Gluster CLI is enhanced with an option to list all connected clients to a volume volume. Usage: + ``` # gluster volume status client-list ``` @@ -165,6 +171,7 @@ This feature is enabled by default, and can be toggled using the boolean option, This feature enables users to punch hole in files created on disperse volumes. Usage: + ``` # fallocate -p -o -l ``` @@ -186,7 +193,6 @@ There are currently no statistics included in the `statedump` about the actual behavior of the memory pools. This means that the efficiency of the memory pools can not be verified. - ### Gluster APIs added to register callback functions for upcalls **Notes for developers:** @@ -201,8 +207,8 @@ int glfs_upcall_register (struct glfs *fs, uint32_t event_list, glfs_upcall_cbk cbk, void *data); int glfs_upcall_unregister (struct glfs *fs, uint32_t event_list); ``` -libgfapi [header](https://github.com/gluster/glusterfs/blob/release-3.13/api/src/glfs.h#L970) files include the complete synopsis about these APIs definition and their usage. +libgfapi [header](https://github.com/gluster/glusterfs/blob/release-3.13/api/src/glfs.h#L970) files include the complete synopsis about these APIs definition and their usage. **Limitations:** An application can register only a single callback function for all the upcall @@ -237,13 +243,15 @@ responses and enable better qualification of the translator stacks. For usage refer to this [test case](https://github.com/gluster/glusterfs/blob/v3.13.0rc0/tests/features/delay-gen.t). ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption + +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - The last known cause for corruption (Bug #1515434) has a fix with this - release. As further testing is still in progress, the issue is retained as - a major issue. + release. As further testing is still in progress, the issue is retained as + a major issue. - Status of this bug can be tracked here, #1515434 ## Bugs addressed @@ -252,13 +260,13 @@ Bugs addressed since release-3.12.0 are listed below. - [#1248393](https://bugzilla.redhat.com/1248393): DHT: readdirp fails to read some directories. 
- [#1258561](https://bugzilla.redhat.com/1258561): Gluster puts PID files in wrong place -- [#1261463](https://bugzilla.redhat.com/1261463): AFR : [RFE] Improvements needed in "gluster volume heal info" commands +- [#1261463](https://bugzilla.redhat.com/1261463): AFR : [RFE] Improvements needed in "gluster volume heal info" commands - [#1294051](https://bugzilla.redhat.com/1294051): Though files are in split-brain able to perform writes to the file - [#1328994](https://bugzilla.redhat.com/1328994): When a feature fails needing a higher opversion, the message should state what version it needs. - [#1335251](https://bugzilla.redhat.com/1335251): mgmt/glusterd: clang compile warnings in glusterd-snapshot.c - [#1350406](https://bugzilla.redhat.com/1350406): [storage/posix] - posix_do_futimes function not implemented - [#1365683](https://bugzilla.redhat.com/1365683): Fix crash bug when mnt3_resolve_subdir_cbk fails -- [#1371806](https://bugzilla.redhat.com/1371806): DHT :- inconsistent 'custom extended attributes',uid and gid, Access permission (for directories) if User set/modifies it after bringing one or more sub-volume down +- [#1371806](https://bugzilla.redhat.com/1371806): DHT :- inconsistent 'custom extended attributes',uid and gid, Access permission (for directories) if User set/modifies it after bringing one or more sub-volume down - [#1376326](https://bugzilla.redhat.com/1376326): separating attach tier and add brick - [#1388509](https://bugzilla.redhat.com/1388509): gluster volume heal info "healed" and "heal-failed" showing wrong information - [#1395492](https://bugzilla.redhat.com/1395492): trace/error-gen be turned on together while use 'volume set' command to set one of them @@ -314,14 +322,14 @@ Bugs addressed since release-3.12.0 are listed below. - [#1480099](https://bugzilla.redhat.com/1480099): More useful error - replace 'not optimal' - [#1480445](https://bugzilla.redhat.com/1480445): Log entry of files skipped/failed during rebalance operation - [#1480525](https://bugzilla.redhat.com/1480525): Make choose-local configurable through `volume-set` command -- [#1480591](https://bugzilla.redhat.com/1480591): [Scale] : I/O errors on multiple gNFS mounts with "Stale file handle" during rebalance of an erasure coded volume. +- [#1480591](https://bugzilla.redhat.com/1480591): [Scale] : I/O errors on multiple gNFS mounts with "Stale file handle" during rebalance of an erasure coded volume. - [#1481199](https://bugzilla.redhat.com/1481199): mempool: run-time crash when built with --disable-mempool - [#1481600](https://bugzilla.redhat.com/1481600): rpc: client_t and related objects leaked due to incorrect ref counts - [#1482023](https://bugzilla.redhat.com/1482023): snpashots issues with other processes accessing the mounted brick snapshots - [#1482344](https://bugzilla.redhat.com/1482344): Negative Test: glusterd crashes for some of the volume options if set at cluster level - [#1482906](https://bugzilla.redhat.com/1482906): /var/lib/glusterd/peers File had a blank line, Stopped Glusterd from starting -- [#1482923](https://bugzilla.redhat.com/1482923): afr: check op_ret value in __afr_selfheal_name_impunge -- [#1483058](https://bugzilla.redhat.com/1483058): [quorum]: Replace brick is happened when Quorum not met. +- [#1482923](https://bugzilla.redhat.com/1482923): afr: check op_ret value in \_\_afr_selfheal_name_impunge +- [#1483058](https://bugzilla.redhat.com/1483058): [quorum]: Replace brick is happened when Quorum not met. 
- [#1483995](https://bugzilla.redhat.com/1483995): packaging: use rdma-core(-devel) instead of ibverbs, rdmacm; disable rdma on armv7hl - [#1484215](https://bugzilla.redhat.com/1484215): Add Deepshika has CI Peer - [#1484225](https://bugzilla.redhat.com/1484225): [rpc]: EPOLLERR - disconnecting now messages every 3 secs after completing rebalance @@ -344,7 +352,7 @@ Bugs addressed since release-3.12.0 are listed below. - [#1488909](https://bugzilla.redhat.com/1488909): Fix the type of 'len' in posix.c, clang is showing a warning - [#1488913](https://bugzilla.redhat.com/1488913): Sub-directory mount details are incorrect in /proc/mounts - [#1489432](https://bugzilla.redhat.com/1489432): disallow replace brick operation on plain distribute volume -- [#1489823](https://bugzilla.redhat.com/1489823): set the shard-block-size to 64MB in virt profile +- [#1489823](https://bugzilla.redhat.com/1489823): set the shard-block-size to 64MB in virt profile - [#1490642](https://bugzilla.redhat.com/1490642): glusterfs client crash when removing directories - [#1490897](https://bugzilla.redhat.com/1490897): GlusterD returns a bad memory pointer in glusterd_get_args_from_dict() - [#1491025](https://bugzilla.redhat.com/1491025): rpc: TLSv1_2_method() is deprecated in OpenSSL-1.1 @@ -408,13 +416,13 @@ Bugs addressed since release-3.12.0 are listed below. - [#1510022](https://bugzilla.redhat.com/1510022): Revert experimental and 4.0 features to prepare for 3.13 release - [#1511274](https://bugzilla.redhat.com/1511274): Rebalance estimate(ETA) shows wrong details(as intial message of 10min wait reappears) when still in progress - [#1511293](https://bugzilla.redhat.com/1511293): In distribute volume after glusterd restart, brick goes offline -- [#1511768](https://bugzilla.redhat.com/1511768): In Replica volume 2*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon +- [#1511768](https://bugzilla.redhat.com/1511768): In Replica volume 2\*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon - [#1512435](https://bugzilla.redhat.com/1512435): Test bug-1483058-replace-brick-quorum-validation.t fails inconsistently - [#1512460](https://bugzilla.redhat.com/1512460): disperse eager-lock degrades performance for file create workloads - [#1513259](https://bugzilla.redhat.com/1513259): NetBSD port - [#1514419](https://bugzilla.redhat.com/1514419): gluster volume splitbrain info needs to display output of each brick in a stream fashion instead of buffering and dumping at the end - [#1515045](https://bugzilla.redhat.com/1515045): bug-1247563.t is failing on master -- [#1515572](https://bugzilla.redhat.com/1515572): Accessing a file when source brick is down results in that FOP being hung +- [#1515572](https://bugzilla.redhat.com/1515572): Accessing a file when source brick is down results in that FOP being hung - [#1516313](https://bugzilla.redhat.com/1516313): Bringing down data bricks in cyclic order results in arbiter brick becoming the source for heal. - [#1517692](https://bugzilla.redhat.com/1517692): Memory leak in locks xlator - [#1518257](https://bugzilla.redhat.com/1518257): EC DISCARD doesn't punch hole properly diff --git a/docs/release-notes/3.13.1.md b/docs/release-notes/3.13.1.md index 1583ba9..a57dc9d 100644 --- a/docs/release-notes/3.13.1.md +++ b/docs/release-notes/3.13.1.md @@ -5,17 +5,19 @@ contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.13 stable release. 
## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues -1. Expanding a gluster volume that is sharded may cause file corruption + +1. Expanding a gluster volume that is sharded may cause file corruption + - Sharded volumes are typically used for VM images, if such volumes are - expanded or possibly contracted (i.e add/remove bricks and rebalance) there - are reports of VM images getting corrupted. + expanded or possibly contracted (i.e add/remove bricks and rebalance) there + are reports of VM images getting corrupted. - The last known cause for corruption (Bug #1515434) is still under review. - Status of this bug can be tracked here, [#1515434](https://bugzilla.redhat.com/1515434) - ## Bugs addressed Bugs addressed since release-3.13.0 are listed below. diff --git a/docs/release-notes/3.13.2.md b/docs/release-notes/3.13.2.md index 6a2b1d0..9c66947 100644 --- a/docs/release-notes/3.13.2.md +++ b/docs/release-notes/3.13.2.md @@ -5,9 +5,11 @@ contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.13 stable release. ## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues + **No Major iissues** ## Bugs addressed @@ -15,7 +17,7 @@ bugs fixed in the GlusterFS 3.13 stable release. Bugs addressed since release-3.13.1 are listed below. - [#1511293](https://bugzilla.redhat.com/1511293): In distribute volume after glusterd restart, brick goes offline -- [#1515434](https://bugzilla.redhat.com/1515434): dht_(f)xattrop does not implement migration checks +- [#1515434](https://bugzilla.redhat.com/1515434): dht\_(f)xattrop does not implement migration checks - [#1516313](https://bugzilla.redhat.com/1516313): Bringing down data bricks in cyclic order results in arbiter brick becoming the source for heal. - [#1529055](https://bugzilla.redhat.com/1529055): Test case ./tests/bugs/bug-1371806_1.t is failing - [#1529084](https://bugzilla.redhat.com/1529084): fstat returns ENOENT/ESTALE diff --git a/docs/release-notes/3.5.0.md b/docs/release-notes/3.5.0.md index e1a76a8..f0ba5f8 100644 --- a/docs/release-notes/3.5.0.md +++ b/docs/release-notes/3.5.0.md @@ -28,6 +28,7 @@ to files in glusterfs using its GFID For more information refer [here](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/gfid%20access.md). ### Prevent NFS restart on Volume change + Earlier any volume change (volume option, volume start, volume stop, volume delete,brick add, etc) required restarting NFS server. @@ -48,7 +49,7 @@ directory read performance. zerofill feature allows creation of pre-allocated and zeroed-out files on GlusterFS volumes by offloading the zeroing part to server and/or storage (storage offloads use SCSI WRITESAME), thereby achieves quick creation of - pre-allocated and zeroed-out VM disk image by using server/storage off-loads. +pre-allocated and zeroed-out VM disk image by using server/storage off-loads. For more information refer [here](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/Zerofill.md). @@ -93,7 +94,7 @@ The Volume group is represented as directory and logical volumes as files. remove-brick CLI earlier used to remove the brick forcefully ( without data migration ), when called without any arguments. This mode of 'remove-brick' cli, without any -arguments has been deprecated. +arguments has been deprecated. 
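As a hedged sketch of the supported flow (the volume and brick names here are illustrative placeholders), a brick is now removed either with data migration via start/status/commit, or by explicitly requesting the old forceful behaviour:

```
# Remove a brick with data migration
gluster volume remove-brick testvol server1:/export/brick1 start
gluster volume remove-brick testvol server1:/export/brick1 status
gluster volume remove-brick testvol server1:/export/brick1 commit

# Forceful removal without data migration now has to be requested explicitly
gluster volume remove-brick testvol server1:/export/brick1 force
```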
### Experimental Features @@ -126,24 +127,26 @@ The following features are experimental with this release: - AUTH support for exported nfs sub-directories added - ### Known Issues: + - The following configuration changes are necessary for qemu and samba integration with libgfapi to work seamlessly: - 1) gluster volume set server.allow-insecure on +```{ .text .no-copy } +1) gluster volume set server.allow-insecure on - 2) Edit /etc/glusterfs/glusterd.vol to contain this line: - option rpc-auth-allow-insecure on - Post 1), restarting the volume would be necessary. - Post 2), restarting glusterd would be necessary. +2) Edit /etc/glusterfs/glusterd.vol to contain this line: + option rpc-auth-allow-insecure on -- RDMA connection manager needs IPoIB for connection establishment. More - details can be found [here](https://github.com/gluster/glusterfs-specs/blob/master/done/Features/rdmacm.md). +Post 1), restarting the volume would be necessary. +Post 2), restarting glusterd would be necessary. +``` +- RDMA connection manager needs IPoIB for connection establishment. More + details can be found [here](https://github.com/gluster/glusterfs-specs/blob/master/done/Features/rdmacm.md). - For Block Device translator based volumes open-behind translator at the -client side needs to be disabled. + client side needs to be disabled. - libgfapi clients calling glfs_fini before a successfull glfs_init will cause the client to hang as reported [here](http://lists.gnu.org/archive/html/gluster-devel/2014-04/msg00179.html). diff --git a/docs/release-notes/3.5.1.md b/docs/release-notes/3.5.1.md index 18eb07d..0b65810 100644 --- a/docs/release-notes/3.5.1.md +++ b/docs/release-notes/3.5.1.md @@ -15,83 +15,82 @@ additions: ### Bugs Fixed: -* [765202](https://bugzilla.redhat.com/765202): lgetxattr called with invalid keys on the bricks -* [833586](https://bugzilla.redhat.com/833586): inodelk hang from marker_rename_release_newp_lock -* [859581](https://bugzilla.redhat.com/859581): self-heal process can sometimes create directories instead of symlinks for the root gfid file in .glusterfs -* [986429](https://bugzilla.redhat.com/986429): Backupvolfile server option should work internal to GlusterFS framework -* [1039544](https://bugzilla.redhat.com/1039544): [FEAT] "gluster volume heal info" should list the entries that actually required to be healed. 
-* [1046624](https://bugzilla.redhat.com/1046624): Unable to heal symbolic Links -* [1046853](https://bugzilla.redhat.com/1046853): AFR : For every file self-heal there are warning messages reported in glustershd.log file -* [1063190](https://bugzilla.redhat.com/1063190): Volume was not accessible after server side quorum was met -* [1064096](https://bugzilla.redhat.com/1064096): The old Python Translator code (not Glupy) should be removed -* [1066996](https://bugzilla.redhat.com/1066996): Using sanlock on a gluster mount with replica 3 (quorum-type auto) leads to a split-brain -* [1071191](https://bugzilla.redhat.com/1071191): [3.5.1] Sporadic SIGBUS with mmap() on a sparse file created with open(), seek(), write() -* [1078061](https://bugzilla.redhat.com/1078061): Need ability to heal mismatching user extended attributes without any changelogs -* [1078365](https://bugzilla.redhat.com/1078365): New xlators are linked as versioned .so files, creating .so.0.0.0 -* [1086743](https://bugzilla.redhat.com/1086743): Add documentation for the Feature: RDMA-connection manager (RDMA-CM) -* [1086748](https://bugzilla.redhat.com/1086748): Add documentation for the Feature: AFR CLI enhancements -* [1086749](https://bugzilla.redhat.com/1086749): Add documentation for the Feature: Exposing Volume Capabilities -* [1086750](https://bugzilla.redhat.com/1086750): Add documentation for the Feature: File Snapshots in GlusterFS -* [1086751](https://bugzilla.redhat.com/1086751): Add documentation for the Feature: gfid-access -* [1086752](https://bugzilla.redhat.com/1086752): Add documentation for the Feature: On-Wire Compression/Decompression -* [1086754](https://bugzilla.redhat.com/1086754): Add documentation for the Feature: Quota Scalability -* [1086755](https://bugzilla.redhat.com/1086755): Add documentation for the Feature: readdir-ahead -* [1086756](https://bugzilla.redhat.com/1086756): Add documentation for the Feature: zerofill API for GlusterFS -* [1086758](https://bugzilla.redhat.com/1086758): Add documentation for the Feature: Changelog based parallel geo-replication -* [1086760](https://bugzilla.redhat.com/1086760): Add documentation for the Feature: Write Once Read Many (WORM) volume -* [1086762](https://bugzilla.redhat.com/1086762): Add documentation for the Feature: BD Xlator - Block Device translator -* [1086766](https://bugzilla.redhat.com/1086766): Add documentation for the Feature: Libgfapi -* [1086774](https://bugzilla.redhat.com/1086774): Add documentation for the Feature: Access Control List - Version 3 support for Gluster NFS -* [1086781](https://bugzilla.redhat.com/1086781): Add documentation for the Feature: Eager locking -* [1086782](https://bugzilla.redhat.com/1086782): Add documentation for the Feature: glusterfs and oVirt integration -* [1086783](https://bugzilla.redhat.com/1086783): Add documentation for the Feature: qemu 1.3 - libgfapi integration -* [1088848](https://bugzilla.redhat.com/1088848): Spelling errors in rpc/rpc-transport/rdma/src/rdma.c -* [1089054](https://bugzilla.redhat.com/1089054): gf-error-codes.h is missing from source tarball -* [1089470](https://bugzilla.redhat.com/1089470): SMB: Crash on brick process during compile kernel. 
-* [1089934](https://bugzilla.redhat.com/1089934): list dir with more than N files results in Input/output error -* [1091340](https://bugzilla.redhat.com/1091340): Doc: Add glfs_fini known issue to release notes 3.5 -* [1091392](https://bugzilla.redhat.com/1091392): glusterfs.spec.in: minor/nit changes to sync with Fedora spec -* [1095256](https://bugzilla.redhat.com/1095256): Excessive logging from self-heal daemon, and bricks -* [1095595](https://bugzilla.redhat.com/1095595): Stick to IANA standard while allocating brick ports -* [1095775](https://bugzilla.redhat.com/1095775): Add support in libgfapi to fetch volume info from glusterd. -* [1095971](https://bugzilla.redhat.com/1095971): Stopping/Starting a Gluster volume resets ownership -* [1096040](https://bugzilla.redhat.com/1096040): AFR : self-heal-daemon not clearing the change-logs of all the sources after self-heal -* [1096425](https://bugzilla.redhat.com/1096425): i/o error when one user tries to access RHS volume over NFS with 100+ GIDs -* [1099878](https://bugzilla.redhat.com/1099878): Need support for handle based Ops to fetch/modify extended attributes of a file -* [1101647](https://bugzilla.redhat.com/1101647): gluster volume heal volname statistics heal-count not giving desired output. -* [1102306](https://bugzilla.redhat.com/1102306): license: xlators/features/glupy dual license GPLv2 and LGPLv3+ -* [1103413](https://bugzilla.redhat.com/1103413): Failure in gf_log_init reopening stderr -* [1104592](https://bugzilla.redhat.com/1104592): heal info may give Success instead of transport end point not connected when a brick is down. -* [1104915](https://bugzilla.redhat.com/1104915): glusterfsd crashes while doing stress tests -* [1104919](https://bugzilla.redhat.com/1104919): Fix memory leaks in gfid-access xlator. -* [1104959](https://bugzilla.redhat.com/1104959): Dist-geo-rep : some of the files not accessible on slave after the geo-rep sync from master to slave. -* [1105188](https://bugzilla.redhat.com/1105188): Two instances each, of brick processes, glusterfs-nfs and quotad seen after glusterd restart -* [1105524](https://bugzilla.redhat.com/1105524): Disable nfs.drc by default -* [1107937](https://bugzilla.redhat.com/1107937): quota-anon-fd-nfs.t fails spuriously -* [1109832](https://bugzilla.redhat.com/1109832): I/O fails for for glusterfs 3.4 AFR clients accessing servers upgraded to glusterfs 3.5 -* [1110777](https://bugzilla.redhat.com/1110777): glusterfsd OOM - using all memory when quota is enabled +- [765202](https://bugzilla.redhat.com/765202): lgetxattr called with invalid keys on the bricks +- [833586](https://bugzilla.redhat.com/833586): inodelk hang from marker_rename_release_newp_lock +- [859581](https://bugzilla.redhat.com/859581): self-heal process can sometimes create directories instead of symlinks for the root gfid file in .glusterfs +- [986429](https://bugzilla.redhat.com/986429): Backupvolfile server option should work internal to GlusterFS framework +- [1039544](https://bugzilla.redhat.com/1039544): [FEAT] "gluster volume heal info" should list the entries that actually required to be healed. 
+- [1046624](https://bugzilla.redhat.com/1046624): Unable to heal symbolic Links +- [1046853](https://bugzilla.redhat.com/1046853): AFR : For every file self-heal there are warning messages reported in glustershd.log file +- [1063190](https://bugzilla.redhat.com/1063190): Volume was not accessible after server side quorum was met +- [1064096](https://bugzilla.redhat.com/1064096): The old Python Translator code (not Glupy) should be removed +- [1066996](https://bugzilla.redhat.com/1066996): Using sanlock on a gluster mount with replica 3 (quorum-type auto) leads to a split-brain +- [1071191](https://bugzilla.redhat.com/1071191): [3.5.1] Sporadic SIGBUS with mmap() on a sparse file created with open(), seek(), write() +- [1078061](https://bugzilla.redhat.com/1078061): Need ability to heal mismatching user extended attributes without any changelogs +- [1078365](https://bugzilla.redhat.com/1078365): New xlators are linked as versioned .so files, creating .so.0.0.0 +- [1086743](https://bugzilla.redhat.com/1086743): Add documentation for the Feature: RDMA-connection manager (RDMA-CM) +- [1086748](https://bugzilla.redhat.com/1086748): Add documentation for the Feature: AFR CLI enhancements +- [1086749](https://bugzilla.redhat.com/1086749): Add documentation for the Feature: Exposing Volume Capabilities +- [1086750](https://bugzilla.redhat.com/1086750): Add documentation for the Feature: File Snapshots in GlusterFS +- [1086751](https://bugzilla.redhat.com/1086751): Add documentation for the Feature: gfid-access +- [1086752](https://bugzilla.redhat.com/1086752): Add documentation for the Feature: On-Wire Compression/Decompression +- [1086754](https://bugzilla.redhat.com/1086754): Add documentation for the Feature: Quota Scalability +- [1086755](https://bugzilla.redhat.com/1086755): Add documentation for the Feature: readdir-ahead +- [1086756](https://bugzilla.redhat.com/1086756): Add documentation for the Feature: zerofill API for GlusterFS +- [1086758](https://bugzilla.redhat.com/1086758): Add documentation for the Feature: Changelog based parallel geo-replication +- [1086760](https://bugzilla.redhat.com/1086760): Add documentation for the Feature: Write Once Read Many (WORM) volume +- [1086762](https://bugzilla.redhat.com/1086762): Add documentation for the Feature: BD Xlator - Block Device translator +- [1086766](https://bugzilla.redhat.com/1086766): Add documentation for the Feature: Libgfapi +- [1086774](https://bugzilla.redhat.com/1086774): Add documentation for the Feature: Access Control List - Version 3 support for Gluster NFS +- [1086781](https://bugzilla.redhat.com/1086781): Add documentation for the Feature: Eager locking +- [1086782](https://bugzilla.redhat.com/1086782): Add documentation for the Feature: glusterfs and oVirt integration +- [1086783](https://bugzilla.redhat.com/1086783): Add documentation for the Feature: qemu 1.3 - libgfapi integration +- [1088848](https://bugzilla.redhat.com/1088848): Spelling errors in rpc/rpc-transport/rdma/src/rdma.c +- [1089054](https://bugzilla.redhat.com/1089054): gf-error-codes.h is missing from source tarball +- [1089470](https://bugzilla.redhat.com/1089470): SMB: Crash on brick process during compile kernel. 
+- [1089934](https://bugzilla.redhat.com/1089934): list dir with more than N files results in Input/output error +- [1091340](https://bugzilla.redhat.com/1091340): Doc: Add glfs_fini known issue to release notes 3.5 +- [1091392](https://bugzilla.redhat.com/1091392): glusterfs.spec.in: minor/nit changes to sync with Fedora spec +- [1095256](https://bugzilla.redhat.com/1095256): Excessive logging from self-heal daemon, and bricks +- [1095595](https://bugzilla.redhat.com/1095595): Stick to IANA standard while allocating brick ports +- [1095775](https://bugzilla.redhat.com/1095775): Add support in libgfapi to fetch volume info from glusterd. +- [1095971](https://bugzilla.redhat.com/1095971): Stopping/Starting a Gluster volume resets ownership +- [1096040](https://bugzilla.redhat.com/1096040): AFR : self-heal-daemon not clearing the change-logs of all the sources after self-heal +- [1096425](https://bugzilla.redhat.com/1096425): i/o error when one user tries to access RHS volume over NFS with 100+ GIDs +- [1099878](https://bugzilla.redhat.com/1099878): Need support for handle based Ops to fetch/modify extended attributes of a file +- [1101647](https://bugzilla.redhat.com/1101647): gluster volume heal volname statistics heal-count not giving desired output. +- [1102306](https://bugzilla.redhat.com/1102306): license: xlators/features/glupy dual license GPLv2 and LGPLv3+ +- [1103413](https://bugzilla.redhat.com/1103413): Failure in gf_log_init reopening stderr +- [1104592](https://bugzilla.redhat.com/1104592): heal info may give Success instead of transport end point not connected when a brick is down. +- [1104915](https://bugzilla.redhat.com/1104915): glusterfsd crashes while doing stress tests +- [1104919](https://bugzilla.redhat.com/1104919): Fix memory leaks in gfid-access xlator. +- [1104959](https://bugzilla.redhat.com/1104959): Dist-geo-rep : some of the files not accessible on slave after the geo-rep sync from master to slave. +- [1105188](https://bugzilla.redhat.com/1105188): Two instances each, of brick processes, glusterfs-nfs and quotad seen after glusterd restart +- [1105524](https://bugzilla.redhat.com/1105524): Disable nfs.drc by default +- [1107937](https://bugzilla.redhat.com/1107937): quota-anon-fd-nfs.t fails spuriously +- [1109832](https://bugzilla.redhat.com/1109832): I/O fails for for glusterfs 3.4 AFR clients accessing servers upgraded to glusterfs 3.5 +- [1110777](https://bugzilla.redhat.com/1110777): glusterfsd OOM - using all memory when quota is enabled ### Known Issues: - The following configuration changes are necessary for qemu and samba integration with libgfapi to work seamlessly: - 1. gluster volume set server.allow-insecure on - 2. restarting the volume is necessary - ~~~ - gluster volume stop - gluster volume start - ~~~ - 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: - ~~~ - option rpc-auth-allow-insecure on - ~~~ - 4. restarting glusterd is necessary - ~~~ - service glusterd restart - ~~~ + 1. gluster volume set server.allow-insecure on + 2. restarting the volume is necessary - More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. + gluster volume stop + gluster volume start + + 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: + + option rpc-auth-allow-insecure on + + 4. 
restarting glusterd is necessary + + service glusterd restart + +More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. - For Block Device translator based volumes open-behind translator at the client side needs to be disabled. @@ -104,5 +103,5 @@ additions: - After enabling `server.manage-gids`, the volume needs to be stopped and started again to have the option enabled in the brick processes - gluster volume stop - gluster volume start + gluster volume stop + gluster volume start diff --git a/docs/release-notes/3.5.2.md b/docs/release-notes/3.5.2.md index 2b6f2f2..5e43031 100644 --- a/docs/release-notes/3.5.2.md +++ b/docs/release-notes/3.5.2.md @@ -4,12 +4,12 @@ This is mostly a bugfix release. The [Release Notes for 3.5.0](./3.5.0.md) and [ ### Bugs Fixed: -- [1096020](https://bugzilla.redhat.com/1096020): NFS server crashes in _socket_read_vectored_request +- [1096020](https://bugzilla.redhat.com/1096020): NFS server crashes in \_socket_read_vectored_request - [1100050](https://bugzilla.redhat.com/1100050): Can't write to quota enable folder - [1103050](https://bugzilla.redhat.com/1103050): nfs: reset command does not alter the result for nfs options earlier set - [1105891](https://bugzilla.redhat.com/1105891): features/gfid-access: stat on .gfid virtual directory return EINVAL - [1111454](https://bugzilla.redhat.com/1111454): creating symlinks generates errors on stripe volume -- [1112111](https://bugzilla.redhat.com/1112111): Self-heal errors with "afr crawl failed for child 0 with ret -1" while performing rolling upgrade. +- [1112111](https://bugzilla.redhat.com/1112111): Self-heal errors with "afr crawl failed for child 0 with ret -1" while performing rolling upgrade. - [1112348](https://bugzilla.redhat.com/1112348): [AFR] I/O fails when one of the replica nodes go down - [1112659](https://bugzilla.redhat.com/1112659): Fix inode leaks in gfid-access xlator - [1112980](https://bugzilla.redhat.com/1112980): NFS subdir authentication doesn't correctly handle multi-(homed,protocol,etc) network addresses @@ -18,8 +18,8 @@ This is mostly a bugfix release. The [Release Notes for 3.5.0](./3.5.0.md) and [ - [1113749](https://bugzilla.redhat.com/1113749): client_t clienttable cliententries are never expanded when all entries are used - [1113894](https://bugzilla.redhat.com/1113894): AFR : self-heal of few files not happening when a AWS EC2 Instance is back online after a restart - [1113959](https://bugzilla.redhat.com/1113959): Spec %post server does not wait for the old glusterd to exit -- [1114501](https://bugzilla.redhat.com/1114501): Dist-geo-rep : deletion of files on master, geo-rep fails to propagate to slaves. -- [1115369](https://bugzilla.redhat.com/1115369): Allow the usage of the wildcard character '*' to the options "nfs.rpc-auth-allow" and "nfs.rpc-auth-reject" +- [1114501](https://bugzilla.redhat.com/1114501): Dist-geo-rep : deletion of files on master, geo-rep fails to propagate to slaves. 
+- [1115369](https://bugzilla.redhat.com/1115369): Allow the usage of the wildcard character '\*' to the options "nfs.rpc-auth-allow" and "nfs.rpc-auth-reject" - [1115950](https://bugzilla.redhat.com/1115950): glfsheal: Improve the way in which we check the presence of replica volumes - [1116672](https://bugzilla.redhat.com/1116672): Resource cleanup doesn't happen for clients on servers after disconnect - [1116997](https://bugzilla.redhat.com/1116997): mounting a volume over NFS (TCP) with MOUNT over UDP fails @@ -32,34 +32,33 @@ This is mostly a bugfix release. The [Release Notes for 3.5.0](./3.5.0.md) and [ - The following configuration changes are necessary for 'qemu' and 'samba vfs plugin' integration with libgfapi to work seamlessly: - 1. gluster volume set server.allow-insecure on - 2. restarting the volume is necessary + 1. gluster volume set server.allow-insecure on + 2. restarting the volume is necessary - ~~~ - gluster volume stop - gluster volume start - ~~~ + ``` + gluster volume stop + gluster volume start + ``` - 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: + 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: - ~~~ - option rpc-auth-allow-insecure on - ~~~ + ``` + option rpc-auth-allow-insecure on + ``` - 4. restarting glusterd is necessary + 4. restarting glusterd is necessary - ~~~ - service glusterd restart - ~~~ + ``` + service glusterd restart + ``` - More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. + More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. - For Block Device translator based volumes open-behind translator at the client side needs to be disabled. gluster volume set performance.open-behind disabled - - libgfapi clients calling `glfs_fini` before a successfull `glfs_init` will cause the client to hang as reported [here](http://lists.gnu.org/archive/html/gluster-devel/2014-04/msg00179.html). The workaround is NOT to call `glfs_fini` for error cases encountered before a successfull diff --git a/docs/release-notes/3.5.3.md b/docs/release-notes/3.5.3.md index 61ffca1..be58934 100644 --- a/docs/release-notes/3.5.3.md +++ b/docs/release-notes/3.5.3.md @@ -10,7 +10,7 @@ features that were added and bugs fixed in the GlusterFS 3.5 stable release. 
- [1100204](https://bugzilla.redhat.com/1100204): brick failure detection does not work for ext4 filesystems - [1126801](https://bugzilla.redhat.com/1126801): glusterfs logrotate config file pollutes global config - [1129527](https://bugzilla.redhat.com/1129527): DHT :- data loss - file is missing on renaming same file from multiple client at same time -- [1129541](https://bugzilla.redhat.com/1129541): [DHT:REBALANCE]: Rebalance failures are seen with error message " remote operation failed: File exists" +- [1129541](https://bugzilla.redhat.com/1129541): [DHT:REBALANCE]: Rebalance failures are seen with error message " remote operation failed: File exists" - [1132391](https://bugzilla.redhat.com/1132391): NFS interoperability problem: stripe-xlator removes EOF at end of READDIR - [1133949](https://bugzilla.redhat.com/1133949): Minor typo in afr logging - [1136221](https://bugzilla.redhat.com/1136221): The memories are exhausted quickly when handle the message which has multi fragments in a single record @@ -44,27 +44,27 @@ features that were added and bugs fixed in the GlusterFS 3.5 stable release. - The following configuration changes are necessary for 'qemu' and 'samba vfs plugin' integration with libgfapi to work seamlessly: - 1. gluster volume set server.allow-insecure on - 2. restarting the volume is necessary + 1. gluster volume set server.allow-insecure on + 2. restarting the volume is necessary - ~~~ - gluster volume stop - gluster volume start - ~~~ + ``` + gluster volume stop + gluster volume start + ``` - 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: + 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: - ~~~ - option rpc-auth-allow-insecure on - ~~~ + ``` + option rpc-auth-allow-insecure on + ``` - 4. restarting glusterd is necessary + 4. restarting glusterd is necessary - ~~~ - service glusterd restart - ~~~ + ``` + service glusterd restart + ``` - More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. + More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. - For Block Device translator based volumes open-behind translator at the client side needs to be disabled. diff --git a/docs/release-notes/3.5.4.md b/docs/release-notes/3.5.4.md index 6b1d6bd..5d09895 100644 --- a/docs/release-notes/3.5.4.md +++ b/docs/release-notes/3.5.4.md @@ -10,7 +10,7 @@ release. 
- [1092037](https://bugzilla.redhat.com/1092037): Issues reported by Cppcheck static analysis tool - [1101138](https://bugzilla.redhat.com/1101138): meta-data split-brain prevents entry/data self-heal of dir/file respectively - [1115197](https://bugzilla.redhat.com/1115197): Directory quota does not apply on it's sub-directories -- [1159968](https://bugzilla.redhat.com/1159968): glusterfs.spec.in: deprecate *.logrotate files in dist-git in favor of the upstream logrotate files +- [1159968](https://bugzilla.redhat.com/1159968): glusterfs.spec.in: deprecate \*.logrotate files in dist-git in favor of the upstream logrotate files - [1160711](https://bugzilla.redhat.com/1160711): libgfapi: use versioned symbols in libgfapi.so for compatibility - [1161102](https://bugzilla.redhat.com/1161102): self heal info logs are filled up with messages reporting split-brain - [1162150](https://bugzilla.redhat.com/1162150): AFR gives EROFS when fop fails on all subvolumes when client-quorum is enabled @@ -28,8 +28,8 @@ release. - [1190633](https://bugzilla.redhat.com/1190633): self-heal-algorithm with option "full" doesn't heal sparse files correctly - [1191006](https://bugzilla.redhat.com/1191006): Building argp-standalone breaks nightly builds on Fedora Rawhide - [1192832](https://bugzilla.redhat.com/1192832): log files get flooded when removexattr() can't find a specified key or value -- [1200764](https://bugzilla.redhat.com/1200764): [AFR] Core dump and crash observed during disk replacement case -- [1202675](https://bugzilla.redhat.com/1202675): Perf: readdirp in replicated volumes causes performance degrade +- [1200764](https://bugzilla.redhat.com/1200764): [AFR] Core dump and crash observed during disk replacement case +- [1202675](https://bugzilla.redhat.com/1202675): Perf: readdirp in replicated volumes causes performance degrade - [1211841](https://bugzilla.redhat.com/1211841): glusterfs-api.pc versioning breaks QEMU - [1222150](https://bugzilla.redhat.com/1222150): readdirp return 64bits inodes even if enable-ino32 is set @@ -38,34 +38,33 @@ release. - The following configuration changes are necessary for 'qemu' and 'samba vfs plugin' integration with libgfapi to work seamlessly: - 1. gluster volume set server.allow-insecure on - 2. restarting the volume is necessary + 1. gluster volume set server.allow-insecure on + 2. restarting the volume is necessary - ~~~ - gluster volume stop - gluster volume start - ~~~ + ``` + gluster volume stop + gluster volume start + ``` - 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: + 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: - ~~~ - option rpc-auth-allow-insecure on - ~~~ + ``` + option rpc-auth-allow-insecure on + ``` - 4. restarting glusterd is necessary + 4. restarting glusterd is necessary - ~~~ - service glusterd restart - ~~~ + ``` + service glusterd restart + ``` - More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. + More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. - For Block Device translator based volumes open-behind translator at the client side needs to be disabled. 
gluster volume set performance.open-behind disabled - - libgfapi clients calling `glfs_fini` before a successful `glfs_init` will cause the client to hang as reported [here](http://lists.gnu.org/archive/html/gluster-devel/2014-04/msg00179.html). The workaround is NOT to call `glfs_fini` for error cases encountered before a successful diff --git a/docs/release-notes/3.6.0.md b/docs/release-notes/3.6.0.md index 71896aa..60782db 100644 --- a/docs/release-notes/3.6.0.md +++ b/docs/release-notes/3.6.0.md @@ -38,6 +38,7 @@ Prior to 3.6, bricks with heterogeneous sizes were treated as equal regardless o GlusterFS 3.6 provides better support to enable SSL on both management and data connections. This feature is currently being consumed by the GlusterFS native driver in OpenStack Manila. ### Better peer identification + GlusterFS 3.6 improves peer identification. GlusterD will no longer complain when a mixture of FQDNs, shortnames and IP addresses are used. Changes done for this improvement have also laid down a base for improving multi network support in GlusterFS. ### Meta translator @@ -55,11 +56,12 @@ From a user point of view, there is no change in the replication behaviour but t - Bricks in a replica set do not mark any pending change log extended attributes for itself during pre or post op. They only mark it for other bricks in the replica set. For e.g.: -In a replica 2 volume, `trusted.afr.-client-0` for brick-0 and `trusted.afr.-client-1` for brick-1 will always be `0x000000000000000000000000`. +In a replica 2 volume, `trusted.afr.-client-0` for brick-0 and `trusted.afr.-client-1` for brick-1 will always be `0x000000000000000000000000`. - If the post-op changelog updation does not complete successfully on a brick, a `trusted.afr.dirty` extended attribute is set on that brick. ### Barrier translator + The barrier translator allows file operations to be temporarily 'paused' on GlusterFS bricks, which is needed for performing consistent snapshots of a GlusterFS volume. For more information, see [here](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.6/Server-side%20Barrier%20feature.md). @@ -98,19 +100,20 @@ The following features are experimental with this release: - A new logging framework that can suppress repetitive log messages and provide a dictionary of messages has been added. Few translators have now been integrated with the framework. More translators are expected to integrate with this framework in upcoming minor & major releases. ### Known Issues: + - The following configuration changes are necessary for qemu and samba integration with libgfapi to work seamlessly: - 1. `gluster volume set server.allow-insecure on` + 1. `gluster volume set server.allow-insecure on` - 2. Edit `/etc/glusterfs/glusterd.vol` to contain this line: - `option rpc-auth-allow-insecure on` + 2. Edit `/etc/glusterfs/glusterd.vol` to contain this line: + `option rpc-auth-allow-insecure on` - Post 1, restarting the volume would be necessary: - `# gluster volume stop ` - `# gluster volume start ` + Post 1, restarting the volume would be necessary: + `# gluster volume stop ` + `# gluster volume start ` - Post 2, restarting glusterd would be necessary: - `# service glusterd restart` + Post 2, restarting glusterd would be necessary: + `# service glusterd restart` - For Block Device translator based volumes open-behind translator at the client side needs to be disabled. 
@@ -124,7 +127,7 @@ The following features are experimental with this release: Q' = Q / (N - R) - Where Q is the desired quota value, Q' is the new quota value to use, N is the number of bricks per disperse set, and R is the redundancy. + Where Q is the desired quota value, Q' is the new quota value to use, N is the number of bricks per disperse set, and R is the redundancy. ### Upgrading to 3.6.X diff --git a/docs/release-notes/3.6.3.md b/docs/release-notes/3.6.3.md index 2156097..ac8683e 100644 --- a/docs/release-notes/3.6.3.md +++ b/docs/release-notes/3.6.3.md @@ -53,27 +53,27 @@ release. - The following configuration changes are necessary for 'qemu' and 'samba vfs plugin' integration with libgfapi to work seamlessly: - 1. gluster volume set server.allow-insecure on - 2. restarting the volume is necessary + 1. gluster volume set server.allow-insecure on + 2. restarting the volume is necessary - ~~~ - gluster volume stop - gluster volume start - ~~~ + ``` + gluster volume stop + gluster volume start + ``` - 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: + 3. Edit `/etc/glusterfs/glusterd.vol` to contain this line: - ~~~ - option rpc-auth-allow-insecure on - ~~~ + ``` + option rpc-auth-allow-insecure on + ``` - 4. restarting glusterd is necessary + 4. restarting glusterd is necessary - ~~~ - service glusterd restart - ~~~ + ``` + service glusterd restart + ``` - More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. + More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page. - For Block Device translator based volumes open-behind translator at the client side needs to be disabled. diff --git a/docs/release-notes/3.7.0.md b/docs/release-notes/3.7.0.md index 6dbb813..693378d 100644 --- a/docs/release-notes/3.7.0.md +++ b/docs/release-notes/3.7.0.md @@ -2,7 +2,7 @@ Release Notes for GlusterFS 3.7.0 ## Major Changes and Features -Documentation about major changes and features is included in the [`doc/features/`](https://github.com/gluster/glusterfs/tree/master/doc/features/) directory of GlusterFS repository. +Documentation about major changes and features is included in the [`doc/features/`](https://github.com/gluster/glusterfs/tree/master/doc/features/) directory of GlusterFS repository. ### Geo Replication @@ -124,42 +124,42 @@ For more information, see the 'Resolution of split-brain from the mount point' s ### Minor Improvements -* Message ID based logging has been added for several translators. -* Quorum support for reads. -* Snapshot names contain timestamps by default.Subsequent access to the snapshots should be done by the name listed in `gluster snapshot list` -* Support for `gluster volume get ` added. -* libgfapi has added handle based functions to get/set POSIX ACLs based on common libacl structures. +- Message ID based logging has been added for several translators. +- Quorum support for reads. +- Snapshot names contain timestamps by default.Subsequent access to the snapshots should be done by the name listed in `gluster snapshot list` +- Support for `gluster volume get ` added. +- libgfapi has added handle based functions to get/set POSIX ACLs based on common libacl structures. 
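For the `gluster volume get` addition listed above, a minimal hedged example (the volume name is an illustrative placeholder):

```
# Query a single option, or all options, of a volume
gluster volume get testvol cluster.server-quorum-type
gluster volume get testvol all
```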
### Known Issues -* Enabling Bitrot on volumes with more than 2 bricks on a node is known to cause problems. -* Addition of bricks dynamically to cold or hot tiers in a tiered volume is not supported. -* The following configuration changes are necessary for qemu and samba integration with libgfapi to work seamlessly: +- Enabling Bitrot on volumes with more than 2 bricks on a node is known to cause problems. +- Addition of bricks dynamically to cold or hot tiers in a tiered volume is not supported. +- The following configuration changes are necessary for qemu and samba integration with libgfapi to work seamlessly: - ~~~ - # gluster volume set server.allow-insecure on - ~~~ + ``` + # gluster volume set server.allow-insecure on + ``` - Edit `/etc/glusterfs/glusterd.vol` to contain this line: `option rpc-auth-allow-insecure on` + Edit `/etc/glusterfs/glusterd.vol` to contain this line: `option rpc-auth-allow-insecure on` - Post 1, restarting the volume would be necessary: + Post 1, restarting the volume would be necessary: - ~~~ - # gluster volume stop - # gluster volume start - ~~~ + ``` + # gluster volume stop + # gluster volume start + ``` - Post 2, restarting glusterd would be necessary: + Post 2, restarting glusterd would be necessary: - ~~~ - # service glusterd restart - ~~~ + ``` + # service glusterd restart + ``` - or + or - ~~~ - # systemctl restart glusterd - ~~~ + ``` + # systemctl restart glusterd + ``` ### Upgrading to 3.7.0 diff --git a/docs/release-notes/3.7.1.md b/docs/release-notes/3.7.1.md index 0ea1980..4518d66 100644 --- a/docs/release-notes/3.7.1.md +++ b/docs/release-notes/3.7.1.md @@ -3,76 +3,75 @@ This is a bugfix release. The [Release Notes for 3.7.0](./3.7.0.md), contain a listing of all the new features that were added. -```Note: Enabling Bitrot on volumes with more than 2 bricks on a node works with this release. ``` +`Note: Enabling Bitrot on volumes with more than 2 bricks on a node works with this release. ` ### Bugs Fixed -- [1212676](http://bugzilla.redhat.com/1212676): NetBSD port -- [1218863](http://bugzilla.redhat.com/1218863): `ls' on a directory which has files with mismatching gfid's does not list anything -- [1219782](http://bugzilla.redhat.com/1219782): Regression failures in tests/bugs/snapshot/bug-1112559.t -- [1221000](http://bugzilla.redhat.com/1221000): detach-tier status emulates like detach-tier stop -- [1221470](http://bugzilla.redhat.com/1221470): dHT rebalance: Dict_copy log messages when running rebalance on a dist-rep volume -- [1221476](http://bugzilla.redhat.com/1221476): Data Tiering:rebalance fails on a tiered volume -- [1221477](http://bugzilla.redhat.com/1221477): The tiering feature requires counters. -- [1221503](http://bugzilla.redhat.com/1221503): DHT Rebalance : Misleading log messages for linkfiles -- [1221507](http://bugzilla.redhat.com/1221507): NFS-Ganesha: ACL should not be enabled by default -- [1221534](http://bugzilla.redhat.com/1221534): rebalance failed after attaching the tier to the volume. 
-- [1221967](http://bugzilla.redhat.com/1221967): Do not allow detach-tier commands on a non tiered volume -- [1221969](http://bugzilla.redhat.com/1221969): tiering: use sperate log/socket/pid file for tiering -- [1222198](http://bugzilla.redhat.com/1222198): Fix nfs/mount3.c build warnings reported in Koji -- [1222750](http://bugzilla.redhat.com/1222750): non-root geo-replication session goes to faulty state, when the session is started -- [1222869](http://bugzilla.redhat.com/1222869): [SELinux] [BVT]: Selinux throws AVC errors while running DHT automation on Rhel6.6 -- [1223215](http://bugzilla.redhat.com/1223215): gluster volume status fails with locking failed error message -- [1223286](http://bugzilla.redhat.com/1223286): [geo-rep]: worker died with "ESTALE" when performed rm -rf on a directory from mount of master volume -- [1223644](http://bugzilla.redhat.com/1223644): [geo-rep]: With tarssh the file is created at slave but it doesnt get sync -- [1224100](http://bugzilla.redhat.com/1224100): [geo-rep]: Even after successful sync, the DATA counter did not reset to 0 -- [1224241](http://bugzilla.redhat.com/1224241): gfapi: zero size issue in glfs_h_acl_set() -- [1224292](http://bugzilla.redhat.com/1224292): peers connected in the middle of a transaction are participating in the transaction -- [1224647](http://bugzilla.redhat.com/1224647): [RFE] Provide hourly scrubbing option -- [1224650](http://bugzilla.redhat.com/1224650): SIGNING FAILURE Error messages are poping up in the bitd log -- [1224894](http://bugzilla.redhat.com/1224894): Quota: spurious failures with quota testcases -- [1225077](http://bugzilla.redhat.com/1225077): Fix regression test spurious failures -- [1225279](http://bugzilla.redhat.com/1225279): Different client can not execute "for((i=0;i<1000;i++));do ls -al;done" in a same directory at the sametime -- [1225318](http://bugzilla.redhat.com/1225318): glusterd could crash in remove-brick-status when local remove-brick process has just completed -- [1225320](http://bugzilla.redhat.com/1225320): ls command failed with features.read-only on while mounting ec volume. -- [1225331](http://bugzilla.redhat.com/1225331): [geo-rep] stop-all-gluster-processes.sh fails to stop all gluster processes -- [1225543](http://bugzilla.redhat.com/1225543): [geo-rep]: snapshot creation timesout even if geo-replication is in pause/stop/delete state -- [1225552](http://bugzilla.redhat.com/1225552): [Backup]: Unable to create a glusterfind session -- [1225709](http://bugzilla.redhat.com/1225709): [RFE] Move signing trigger mechanism to [f]setxattr() -- [1225743](http://bugzilla.redhat.com/1225743): [AFR-V2] - afr_final_errno() should treat op_ret > 0 also as success -- [1225796](http://bugzilla.redhat.com/1225796): Spurious failure in tests/bugs/disperse/bug-1161621.t -- [1225919](http://bugzilla.redhat.com/1225919): Log EEXIST errors in DEBUG level in fops MKNOD and MKDIR -- [1225922](http://bugzilla.redhat.com/1225922): Sharding - Skip update of block count and size for directories in readdirp callback -- [1226024](http://bugzilla.redhat.com/1226024): cli/tiering:typo errors in tiering -- [1226029](http://bugzilla.redhat.com/1226029): I/O's hanging on tiered volumes (NFS) -- [1226032](http://bugzilla.redhat.com/1226032): glusterd crashed on the node when tried to detach a tier after restoring data from the snapshot. 
-- [1226117](http://bugzilla.redhat.com/1226117): [RFE] Return proper error codes in case of snapshot failure -- [1226120](http://bugzilla.redhat.com/1226120): [Snapshot] Do not run scheduler if ovirt scheduler is running -- [1226139](http://bugzilla.redhat.com/1226139): Implement MKNOD fop in bit-rot. -- [1226146](http://bugzilla.redhat.com/1226146): BitRot :- bitd is not signing Objects if more than 3 bricks are present on same node -- [1226153](http://bugzilla.redhat.com/1226153): Quota: Do not allow set/unset of quota limit in heterogeneous cluster -- [1226629](http://bugzilla.redhat.com/1226629): bug-973073.t fails spuriously -- [1226853](http://bugzilla.redhat.com/1226853): Volume start fails when glusterfs is source compiled with GCC v5.1.1 +- [1212676](http://bugzilla.redhat.com/1212676): NetBSD port +- [1218863](http://bugzilla.redhat.com/1218863): `ls' on a directory which has files with mismatching gfid's does not list anything +- [1219782](http://bugzilla.redhat.com/1219782): Regression failures in tests/bugs/snapshot/bug-1112559.t +- [1221000](http://bugzilla.redhat.com/1221000): detach-tier status emulates like detach-tier stop +- [1221470](http://bugzilla.redhat.com/1221470): dHT rebalance: Dict_copy log messages when running rebalance on a dist-rep volume +- [1221476](http://bugzilla.redhat.com/1221476): Data Tiering:rebalance fails on a tiered volume +- [1221477](http://bugzilla.redhat.com/1221477): The tiering feature requires counters. +- [1221503](http://bugzilla.redhat.com/1221503): DHT Rebalance : Misleading log messages for linkfiles +- [1221507](http://bugzilla.redhat.com/1221507): NFS-Ganesha: ACL should not be enabled by default +- [1221534](http://bugzilla.redhat.com/1221534): rebalance failed after attaching the tier to the volume. 
+- [1221967](http://bugzilla.redhat.com/1221967): Do not allow detach-tier commands on a non tiered volume +- [1221969](http://bugzilla.redhat.com/1221969): tiering: use sperate log/socket/pid file for tiering +- [1222198](http://bugzilla.redhat.com/1222198): Fix nfs/mount3.c build warnings reported in Koji +- [1222750](http://bugzilla.redhat.com/1222750): non-root geo-replication session goes to faulty state, when the session is started +- [1222869](http://bugzilla.redhat.com/1222869): [SELinux] [BVT]: Selinux throws AVC errors while running DHT automation on Rhel6.6 +- [1223215](http://bugzilla.redhat.com/1223215): gluster volume status fails with locking failed error message +- [1223286](http://bugzilla.redhat.com/1223286): [geo-rep]: worker died with "ESTALE" when performed rm -rf on a directory from mount of master volume +- [1223644](http://bugzilla.redhat.com/1223644): [geo-rep]: With tarssh the file is created at slave but it doesnt get sync +- [1224100](http://bugzilla.redhat.com/1224100): [geo-rep]: Even after successful sync, the DATA counter did not reset to 0 +- [1224241](http://bugzilla.redhat.com/1224241): gfapi: zero size issue in glfs_h_acl_set() +- [1224292](http://bugzilla.redhat.com/1224292): peers connected in the middle of a transaction are participating in the transaction +- [1224647](http://bugzilla.redhat.com/1224647): [RFE] Provide hourly scrubbing option +- [1224650](http://bugzilla.redhat.com/1224650): SIGNING FAILURE Error messages are poping up in the bitd log +- [1224894](http://bugzilla.redhat.com/1224894): Quota: spurious failures with quota testcases +- [1225077](http://bugzilla.redhat.com/1225077): Fix regression test spurious failures +- [1225279](http://bugzilla.redhat.com/1225279): Different client can not execute "for((i=0;i<1000;i++));do ls -al;done" in a same directory at the sametime +- [1225318](http://bugzilla.redhat.com/1225318): glusterd could crash in remove-brick-status when local remove-brick process has just completed +- [1225320](http://bugzilla.redhat.com/1225320): ls command failed with features.read-only on while mounting ec volume. +- [1225331](http://bugzilla.redhat.com/1225331): [geo-rep] stop-all-gluster-processes.sh fails to stop all gluster processes +- [1225543](http://bugzilla.redhat.com/1225543): [geo-rep]: snapshot creation timesout even if geo-replication is in pause/stop/delete state +- [1225552](http://bugzilla.redhat.com/1225552): [Backup]: Unable to create a glusterfind session +- [1225709](http://bugzilla.redhat.com/1225709): [RFE] Move signing trigger mechanism to [f]setxattr() +- [1225743](http://bugzilla.redhat.com/1225743): [AFR-V2] - afr_final_errno() should treat op_ret > 0 also as success +- [1225796](http://bugzilla.redhat.com/1225796): Spurious failure in tests/bugs/disperse/bug-1161621.t +- [1225919](http://bugzilla.redhat.com/1225919): Log EEXIST errors in DEBUG level in fops MKNOD and MKDIR +- [1225922](http://bugzilla.redhat.com/1225922): Sharding - Skip update of block count and size for directories in readdirp callback +- [1226024](http://bugzilla.redhat.com/1226024): cli/tiering:typo errors in tiering +- [1226029](http://bugzilla.redhat.com/1226029): I/O's hanging on tiered volumes (NFS) +- [1226032](http://bugzilla.redhat.com/1226032): glusterd crashed on the node when tried to detach a tier after restoring data from the snapshot. 
+- [1226117](http://bugzilla.redhat.com/1226117): [RFE] Return proper error codes in case of snapshot failure +- [1226120](http://bugzilla.redhat.com/1226120): [Snapshot] Do not run scheduler if ovirt scheduler is running +- [1226139](http://bugzilla.redhat.com/1226139): Implement MKNOD fop in bit-rot. +- [1226146](http://bugzilla.redhat.com/1226146): BitRot :- bitd is not signing Objects if more than 3 bricks are present on same node +- [1226153](http://bugzilla.redhat.com/1226153): Quota: Do not allow set/unset of quota limit in heterogeneous cluster +- [1226629](http://bugzilla.redhat.com/1226629): bug-973073.t fails spuriously +- [1226853](http://bugzilla.redhat.com/1226853): Volume start fails when glusterfs is source compiled with GCC v5.1.1 ### Known Issues -- [1227677](http://bugzilla.redhat.com/1227677): Glusterd crashes and cannot start after rebalance -- [1227656](http://bugzilla.redhat.com/1227656): Glusted dies when adding new brick to a distributed volume and converting to replicated volume -- [1210256](http://bugzilla.redhat.com/1210256): gluster volume info --xml gives back incorrect typrStr in xml -- [1212842](http://bugzilla.redhat.com/1212842): tar on a glusterfs mount displays "file changed as we read it" even though the file was not changed -- [1220347](http://bugzilla.redhat.com/1220347): Read operation on a file which is in split-brain condition is successful -- [1213352](http://bugzilla.redhat.com/1213352): nfs-ganesha: HA issue, the iozone process is not moving ahead, once the nfs-ganesha is killed -- [1220270](http://bugzilla.redhat.com/1220270): nfs-ganesha: Rename fails while exectuing Cthon general category test -- [1214169](http://bugzilla.redhat.com/1214169): glusterfsd crashed while rebalance and self-heal were in progress -- [1221941](http://bugzilla.redhat.com/1221941): glusterfsd: bricks crash while executing ls on nfs-ganesha vers=3 -- [1225809](http://bugzilla.redhat.com/1225809): [DHT-REBALANCE]-DataLoss: The data appended to a file during its migration will be lost once the migration is done -- [1225940](http://bugzilla.redhat.com/1225940): DHT: lookup-unhashed feature breaks runtime compatibility with older client versions - +- [1227677](http://bugzilla.redhat.com/1227677): Glusterd crashes and cannot start after rebalance +- [1227656](http://bugzilla.redhat.com/1227656): Glusted dies when adding new brick to a distributed volume and converting to replicated volume +- [1210256](http://bugzilla.redhat.com/1210256): gluster volume info --xml gives back incorrect typrStr in xml +- [1212842](http://bugzilla.redhat.com/1212842): tar on a glusterfs mount displays "file changed as we read it" even though the file was not changed +- [1220347](http://bugzilla.redhat.com/1220347): Read operation on a file which is in split-brain condition is successful +- [1213352](http://bugzilla.redhat.com/1213352): nfs-ganesha: HA issue, the iozone process is not moving ahead, once the nfs-ganesha is killed +- [1220270](http://bugzilla.redhat.com/1220270): nfs-ganesha: Rename fails while exectuing Cthon general category test +- [1214169](http://bugzilla.redhat.com/1214169): glusterfsd crashed while rebalance and self-heal were in progress +- [1221941](http://bugzilla.redhat.com/1221941): glusterfsd: bricks crash while executing ls on nfs-ganesha vers=3 +- [1225809](http://bugzilla.redhat.com/1225809): [DHT-REBALANCE]-DataLoss: The data appended to a file during its migration will be lost once the migration is done +- [1225940](http://bugzilla.redhat.com/1225940): DHT: 
lookup-unhashed feature breaks runtime compatibility with older client versions - Addition of bricks dynamically to cold or hot tiers in a tiered volume is not supported. - The following configuration changes are necessary for qemu and samba integration with libgfapi to work seamlessly: -```# gluster volume set server.allow-insecure on``` +`# gluster volume set server.allow-insecure on` Edit `/etc/glusterfs/glusterd.vol` to contain this line: `option rpc-auth-allow-insecure on` Post 1, restarting the volume would be necessary: diff --git a/docs/release-notes/3.9.0.md b/docs/release-notes/3.9.0.md index be53ac3..da46b21 100644 --- a/docs/release-notes/3.9.0.md +++ b/docs/release-notes/3.9.0.md @@ -12,34 +12,41 @@ of bugs that has been addressed is included further below. ## Major changes and features ### Introducing reset-brick command -*Notes for users:* + +_Notes for users:_ The reset-brick command provides support to reformat/replace the disk(s) represented by a brick within a volume. This is helpful when a disk goes bad etc Start reset process - + ```bash gluster volume reset-brick VOLNAME HOSTNAME:BRICKPATH start ``` + The above command kills the respective brick process. Now the brick can be reformatted. -To restart the brick after modifying configuration - +To restart the brick after modifying configuration - + ```bash gluster volume reset-brick VOLNAME HOSTNAME:BRICKPATH HOSTNAME:BRICKPATH commit ``` + If the brick was killed to replace the brick with same brick path, restart with following command - + ```bash gluster volume reset-brick VOLNAME HOSTNAME:BRICKPATH HOSTNAME:BRICKPATH commit force ``` -*Limitations:* +_Limitations:_ + 1. resetting a brick kills a brick process in concern. During this -period the brick will not be available for IO's. + period the brick will not be available for IO's. 2. Replacing a brick with this command will work only if both the brick paths -are same and belong to same volume. + are same and belong to same volume. ### Get node level status of a cluster -*Notes for users:* +_Notes for users:_ The get-state command provides node level status of a trusted storage pool from the point of view of glusterd in a parseable format. Using get-state command, external applications can invoke the command on all nodes of the cluster, and @@ -49,30 +56,35 @@ picture of the state of the cluster. ```bash # gluster get-state [odir ] ``` + This would dump data points that reflect the local state representation of the cluster as maintained in glusterd (no other daemons are supported as of now) to a file inside the specified output directory. The default output directory -and filename is /var/run/gluster and glusterd_state_ respectively. +and filename is /var/run/gluster and glusterd*state* respectively. Following are the sections in the output: + 1. `Global`: UUID and op-version of glusterd 2. `Global options`: Displays cluster specific options that have been set -explicitly through the volume set command. + explicitly through the volume set command. 3. `Peers`: Displays the peer node information including its hostname and -connection status + connection status 4. `Volumes`: Displays the list of volumes created on this node along with -detailed information on each volume. + detailed information on each volume. 5. `Services`: Displays the list of the services configured on this node along -with their corresponding statuses. + with their corresponding statuses. + +_Limitations:_ -*Limitations:* 1. This only supports glusterd. 2. Does not provide complete cluster state. 
Data to be collated from all nodes -by external application to get the complete cluster state. + by external application to get the complete cluster state. ### Multi threaded self-heal for Disperse volumes -*Notes for users:* + +_Notes for users:_ Users now have the ability to configure multi-threaded self-heal in disperse volumes using the following commands: + ```bash Option below can be used to control number of parallel heals in SHD # gluster volume set disperse.shd-max-threads [1-64] # default is 1 @@ -81,17 +93,21 @@ Option below can be used to control number of heals that can wait in SHD ``` ### Hardware extention acceleration in Disperse volumes -*Notes for users:* + +_Notes for users:_ If the user has hardware that has special instructions which can be used in erasure code calculations on the client it will be automatically used. At the moment this support is added for cpu-extentions: `x64`, `sse`, `avx` ### Lock revocation feature -*Notes for users:* + +_Notes for users:_ + 1. Motivation: Prevents cluster instability by mis-behaving clients causing bricks to OOM due to inode/entry lock pile-ups. 2. Adds option to strip clients of entry/inode locks after N seconds 3. Adds option to clear ALL locks should the revocation threshold get hit 4. Adds option to clear all or granted locks should the max-blocked threshold get hit (can be used in combination w/ revocation-clear-all). 5. Adds logging to indicate revocation event & reason 6. Options are: + ```bash # gluster volume set features.locks-revocation-secs # gluster volume set features.locks-revocation-clear-all [on/off] @@ -99,29 +115,33 @@ If the user has hardware that has special instructions which can be used in eras ``` ### On demand scrubbing for Bitrot Detection: -*Notes for users:* With 'ondemand' scrub option, you don't need to wait for the scrub-frequency + +_Notes for users:_ With 'ondemand' scrub option, you don't need to wait for the scrub-frequency to expire. As the option name itself says, the scrubber can be initiated on demand to detect the corruption. If the scrubber is already running, this option is a no op. + ```bash # gluster volume bitrot scrub ondemand ``` - ### Improvements in Gluster NFS-Ganesha integration -*Notes for users:* + +### Improvements in Gluster NFS-Ganesha integration + +_Notes for users:_ With this release the major change done is to store all the ganesha related configuration files in the shared storage volume mount point instead of having separate local copy in '/etc/ganesha' folder on each node. For new users, before enabling nfs-ganesha -1. create a directory named *nfs-ganesha* in the shared storage mount point (*/var/run/gluster/shared_storage/*) +1. create a directory named _nfs-ganesha_ in the shared storage mount point (_/var/run/gluster/shared_storage/_) -2. Create *ganesha.conf* & *ganesha-ha.conf* in that directory with the required details filled in. +2. Create _ganesha.conf_ & _ganesha-ha.conf_ in that directory with the required details filled in. For existing users, before starting nfs-ganesha service do the following : -1. Copy all the contents of */etc/ganesha* directory (including *.export_added* file) to */var/run/gluster/shared_storage/nfs-ganesha* from any of the ganesha nodes +1. Copy all the contents of _/etc/ganesha_ directory (including _.export_added_ file) to _/var/run/gluster/shared_storage/nfs-ganesha_ from any of the ganesha nodes -2. 
Create symlink using */var/run/gluster/shared_storage/nfs-ganesha/ganesha.conf* on */etc/ganesha* one each node in ganesha-cluster +2. Create symlink using _/var/run/gluster/shared_storage/nfs-ganesha/ganesha.conf_ on _/etc/ganesha_ one each node in ganesha-cluster -3. Change path for each export entry in *ganesha.conf* file +3. Change path for each export entry in _ganesha.conf_ file ```sh Example: if a volume "test" was exported, then ganesha.conf shall have below export entry - @@ -131,8 +151,9 @@ Change that line to ``` In addition, following changes have been made - -* The entity "HA_VOL_SERVER= " in *ganesha-ha.conf* is no longer required. -* A new resource-agent called portblock (available in >= *resource-agents-3.9.5* package) is added to the cluster configuration to speed up the nfs-client connections post IP failover or failback. This may be noticed while looking at the cluster configuration status using the command *pcs status*. + +- The entity "HA_VOL_SERVER= " in _ganesha-ha.conf_ is no longer required. +- A new resource-agent called portblock (available in >= _resource-agents-3.9.5_ package) is added to the cluster configuration to speed up the nfs-client connections post IP failover or failback. This may be noticed while looking at the cluster configuration status using the command _pcs status_. ### Availability of python bindings to libgfapi @@ -144,17 +165,19 @@ The python bindings have been packaged and has been made available over [PyPI](https://pypi.python.org/pypi/gfapi/). ### Small file improvements in Gluster with md-cache (Experimental) -*Notes for users:* + +_Notes for users:_ With this release, metadata cache on the client side is integrated with the cache-invalidation feature so that the clients can cache longer without -compromising on consistency. By enabling, the metadata cache and cache +compromising on consistency. By enabling, the metadata cache and cache invalidation feature and extending the cache timeout to 600s, we have seen performance improvements in metadata operation like creates, ls/stat, chmod, -rename, delete. The perf improvements is significant in SMB access of gluster +rename, delete. The perf improvements is significant in SMB access of gluster volume, but as a cascading effect the improvements is also seen on FUSE/Native access and NFS access. Use the below options in the order mentioned, to enable the features: + ```bash # gluster volume set features.cache-invalidation on # gluster volume set features.cache-invalidation-timeout 600 @@ -165,6 +188,7 @@ Use the below options in the order mentioned, to enable the features: ``` ### Real time Cluster notifications using Events APIs + Let us imagine we have a Gluster monitoring system which displays list of volumes and its state, to show the realtime status, monitoring app need to query the Gluster in regular interval to check volume @@ -180,6 +204,7 @@ Gluster. More details about this new feature is available here. http://docs.gluster.org/en/latest/Administrator%20Guide/Events%20APIs ### Geo-replication improvements + #### Documentation improvements: Upstream documentation is rewritten to reflect the latest version of @@ -189,6 +214,7 @@ to it. Latest version of documentation is available here http://docs.gluster.org/en/latest/Administrator%20Guide/Geo%20Replication #### Geo-replication Events are available for Events API consumers: + Events APIs is the new Gluster feature available with 3.9 release, most of the events from Geo-replication are added to eventsapi. 
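As a minimal sketch of how a monitoring application could subscribe to these notifications instead of polling Gluster (the listener URL below is purely a placeholder), a webhook can be registered and verified with the `gluster-eventsapi` tool shipped with the events package:

```
Register a webhook that will receive event notifications as HTTP POSTs
# gluster-eventsapi webhook-add http://monitor.example.com/listener

Optionally verify that every node in the pool can reach the listener
# gluster-eventsapi webhook-test http://monitor.example.com/listener
```

Registered webhooks then receive the events, including the Geo-replication events described above, as they happen.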
@@ -234,6 +260,7 @@ configuration option provided to modify the changelog log level and defaulted to `INFO` ## Behavior changes + - [#1221623](https://bugzilla.redhat.com/1221623): Earlier the ports GlusterD used to allocate for the daemons like brick processes, quotad, shd et all were persistent through the volume's life cycle, so every restart of the @@ -245,6 +272,7 @@ defaulted to `INFO` etc-glusterfs-glusterd.vol.log ## Known Issues + - [#1387878](https://bugzilla.redhat.com/1387878):add-brick on a vm-store configuration which has sharding enabled is leading to vm corruption. To work around this issue, one can scale up by creating more volumes until this issue @@ -266,7 +294,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1211863](https://bugzilla.redhat.com/1211863): RFE: Support in md-cache to use upcall notifications to invalidate its cache - [#1221623](https://bugzilla.redhat.com/1221623): glusterd: add brick command should re-use the port for listening which is freed by remove-brick. - [#1222915](https://bugzilla.redhat.com/1222915): usage text is wrong for use-readdirp mount default -- [#1223937](https://bugzilla.redhat.com/1223937): Outdated autotools helper config.* files +- [#1223937](https://bugzilla.redhat.com/1223937): Outdated autotools helper config.\* files - [#1225718](https://bugzilla.redhat.com/1225718): [FEAT] DHT - rebalance - rebalance status o/p should be different for 'fix-layout' option, it should not show 'Rebalanced-files' , 'Size', 'Scanned' etc as it is not migrating any files. - [#1227667](https://bugzilla.redhat.com/1227667): Minor improvements and code cleanup for protocol server/client - [#1228142](https://bugzilla.redhat.com/1228142): clang-analyzer: adding clang static analysis support @@ -322,7 +350,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1332134](https://bugzilla.redhat.com/1332134): bitrot: Build generates Compilation Warning. - [#1332136](https://bugzilla.redhat.com/1332136): Detach tier fire before the background fixlayout is complete may result in failure - [#1332156](https://bugzilla.redhat.com/1332156): SMB:while running I/O on cifs mount and doing graph switch causes cifs mount to hang. -- [#1332219](https://bugzilla.redhat.com/1332219): tier: avoid pthread_join if pthread_create fails +- [#1332219](https://bugzilla.redhat.com/1332219): tier: avoid pthread_join if pthread_create fails - [#1332413](https://bugzilla.redhat.com/1332413): Wrong op-version for mandatory-locks volume set option - [#1332419](https://bugzilla.redhat.com/1332419): geo-rep: address potential leak of memory - [#1332460](https://bugzilla.redhat.com/1332460): [features/worm] - when disabled, worm xl should simply pass requested fops to its child xl @@ -362,7 +390,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1335232](https://bugzilla.redhat.com/1335232): features/index: clang compile warnings in index.c - [#1335429](https://bugzilla.redhat.com/1335429): Self heal shows different information for the same volume from each node - [#1335494](https://bugzilla.redhat.com/1335494): Modifying peer ops library -- [#1335531](https://bugzilla.redhat.com/1335531): Modified volume options are not syncing once glusterd comes up. +- [#1335531](https://bugzilla.redhat.com/1335531): Modified volume options are not syncing once glusterd comes up. 
- [#1335652](https://bugzilla.redhat.com/1335652): Heal info shows split-brain for .shard directory though only one brick was down - [#1335717](https://bugzilla.redhat.com/1335717): PREFIX is not honoured during build and install - [#1335776](https://bugzilla.redhat.com/1335776): rpc: change client insecure port ceiling from 65535 to 49151 @@ -378,7 +406,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1336612](https://bugzilla.redhat.com/1336612): one of vm goes to paused state when network goes down and comes up back - [#1336630](https://bugzilla.redhat.com/1336630): ERROR and Warning message on writing a file from mount point "null gfid for path (null)" repeated 3 times between" - [#1336642](https://bugzilla.redhat.com/1336642): [RFE] git-branch-diff: wrapper script for git to visualize backports -- [#1336698](https://bugzilla.redhat.com/1336698): DHT : few Files are not accessible and not listed on mount + more than one Directory have same gfid + (sometimes) attributes has ?? in ls output after renaming Directories from multiple client at same time +- [#1336698](https://bugzilla.redhat.com/1336698): DHT : few Files are not accessible and not listed on mount + more than one Directory have same gfid + (sometimes) attributes has ?? in ls output after renaming Directories from multiple client at same time - [#1336793](https://bugzilla.redhat.com/1336793): assorted typos and spelling mistakes from Debian lintian - [#1336818](https://bugzilla.redhat.com/1336818): Add ability to set oom_score_adj for glusterfs process - [#1336853](https://bugzilla.redhat.com/1336853): scripts: bash-isms in scripts @@ -394,7 +422,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1337899](https://bugzilla.redhat.com/1337899): Misleading error message on rebalance start when one of the glusterd instance is down - [#1338544](https://bugzilla.redhat.com/1338544): fuse: In fuse_first_lookup(), dict is not un-referenced in case create_frame returns an empty pointer. 
- [#1338634](https://bugzilla.redhat.com/1338634): AFR : fuse,nfs mount hangs when directories with same names are created and deleted continuously -- [#1338733](https://bugzilla.redhat.com/1338733): __inode_ctx_put: fix mem leak on failure +- [#1338733](https://bugzilla.redhat.com/1338733): \_\_inode_ctx_put: fix mem leak on failure - [#1338967](https://bugzilla.redhat.com/1338967): common-ha: ganesha.nfsd not put into NFS-GRACE after fail-back - [#1338991](https://bugzilla.redhat.com/1338991): DHT2: Tracker bug - [#1339071](https://bugzilla.redhat.com/1339071): dht/rebalance: mark hardlink migration failure as skipped for rebalance process @@ -408,7 +436,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1339553](https://bugzilla.redhat.com/1339553): gfapi: in case of handle based APIs, close glfd after successful create - [#1339689](https://bugzilla.redhat.com/1339689): RFE - capacity info (df -h on a mount) is incorrect for a tiered volume - [#1340488](https://bugzilla.redhat.com/1340488): copy-export-ganesha.sh does not have a correct shebang -- [#1340623](https://bugzilla.redhat.com/1340623): Directory creation(mkdir) fails when the remove brick is initiated for replicated volumes accessing via nfs-ganesha +- [#1340623](https://bugzilla.redhat.com/1340623): Directory creation(mkdir) fails when the remove brick is initiated for replicated volumes accessing via nfs-ganesha - [#1340853](https://bugzilla.redhat.com/1340853): [geo-rep]: If the session is renamed, geo-rep configuration are not retained - [#1340936](https://bugzilla.redhat.com/1340936): Automount fails because /sbin/mount.glusterfs does not accept the -s option - [#1341007](https://bugzilla.redhat.com/1341007): gfapi : throwing warning message for unused variable in glfs_h_find_handle() @@ -416,7 +444,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1341294](https://bugzilla.redhat.com/1341294): build: RHEL7 unpackaged files /var/lib/glusterd/hooks/.../S57glusterfind-delete-post.{pyc,pyo} - [#1341474](https://bugzilla.redhat.com/1341474): [geo-rep]: Snapshot creation having geo-rep session is broken - [#1341650](https://bugzilla.redhat.com/1341650): conservative merge happening on a x3 volume for a deleted file -- [#1341768](https://bugzilla.redhat.com/1341768): After setting up ganesha on RHEL 6, nodes remains in stopped state and grace related failures observed in pcs status +- [#1341768](https://bugzilla.redhat.com/1341768): After setting up ganesha on RHEL 6, nodes remains in stopped state and grace related failures observed in pcs status - [#1341796](https://bugzilla.redhat.com/1341796): [quota+snapshot]: Directories are inaccessible from activated snapshot, when the snapshot was created during directory creation - [#1342171](https://bugzilla.redhat.com/1342171): O_DIRECT support for sharding - [#1342259](https://bugzilla.redhat.com/1342259): [features/worm] - write FOP should pass for the normal files @@ -449,7 +477,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1346211](https://bugzilla.redhat.com/1346211): cleanup glusterd-georep code - [#1346551](https://bugzilla.redhat.com/1346551): wrong understanding of function's parameter - [#1346719](https://bugzilla.redhat.com/1346719): [Disperse] dd + rm + ls lead to IO hang -- [#1346821](https://bugzilla.redhat.com/1346821): cli core dumped while providing/not wrong values during arbiter replica volume +- [#1346821](https://bugzilla.redhat.com/1346821): cli core dumped while providing/not wrong values during arbiter 
replica volume - [#1347249](https://bugzilla.redhat.com/1347249): libgfapi : variables allocated by glfs_set_volfile_server is not freed - [#1347354](https://bugzilla.redhat.com/1347354): glusterd: SuSE build system error for incorrect strcat, strncat usage - [#1347686](https://bugzilla.redhat.com/1347686): IO error seen with Rolling or non-disruptive upgrade of an distribute-disperse(EC) volume from 3.7.5 to 3.7.9 @@ -481,7 +509,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1350744](https://bugzilla.redhat.com/1350744): GlusterFS 3.9.0 tracker - [#1350793](https://bugzilla.redhat.com/1350793): build: remove absolute paths from glusterfs spec file - [#1350867](https://bugzilla.redhat.com/1350867): RFE: FEATURE: Lock revocation for features/locks xlator -- [#1351021](https://bugzilla.redhat.com/1351021): [DHT]: Rebalance info for remove brick operation is not showing after glusterd restart +- [#1351021](https://bugzilla.redhat.com/1351021): [DHT]: Rebalance info for remove brick operation is not showing after glusterd restart - [#1351071](https://bugzilla.redhat.com/1351071): [geo-rep] Stopped geo-rep session gets started automatically once all the master nodes are upgraded - [#1351134](https://bugzilla.redhat.com/1351134): [SSL] : gluster v set help does not show ssl options - [#1351537](https://bugzilla.redhat.com/1351537): [Bitrot] Need a way to set scrub interval to a minute, for ease of testing @@ -556,7 +584,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1362520](https://bugzilla.redhat.com/1362520): Per xlator logging not working - [#1362602](https://bugzilla.redhat.com/1362602): [Open SSL] : Unable to mount an SSL enabled volume via SMB v3/Ganesha v4 - [#1363591](https://bugzilla.redhat.com/1363591): Geo-replication user driven Events -- [#1363721](https://bugzilla.redhat.com/1363721): [HC]: After bringing down and up of the bricks VM's are getting paused +- [#1363721](https://bugzilla.redhat.com/1363721): [HC]: After bringing down and up of the bricks VM's are getting paused - [#1363948](https://bugzilla.redhat.com/1363948): Spurious failure in tests/bugs/glusterd/bug-1089668.t - [#1364026](https://bugzilla.redhat.com/1364026): glfs_fini() crashes with SIGSEGV - [#1364420](https://bugzilla.redhat.com/1364420): [RFE] History Crawl performance improvement @@ -564,7 +592,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1364529](https://bugzilla.redhat.com/1364529): api: revert glfs_ipc_xd intended for 4.0 - [#1365455](https://bugzilla.redhat.com/1365455): [AFR]: Files not available in the mount point after converting Distributed volume type to Replicated one. - [#1365489](https://bugzilla.redhat.com/1365489): glfs_truncate missing -- [#1365506](https://bugzilla.redhat.com/1365506): gfapi: use const qualifier for glfs_*timens() +- [#1365506](https://bugzilla.redhat.com/1365506): gfapi: use const qualifier for glfs\_\*timens() - [#1366195](https://bugzilla.redhat.com/1366195): [Bitrot - RFE]: On demand scrubbing option to scrub - [#1366222](https://bugzilla.redhat.com/1366222): "heal info --xml" not showing the brick name of offline bricks. 
- [#1366226](https://bugzilla.redhat.com/1366226): Move alloca0 definition to common-utils @@ -589,7 +617,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1369430](https://bugzilla.redhat.com/1369430): Track the client that performed readdirp - [#1369432](https://bugzilla.redhat.com/1369432): IATT cache invalidation should be sent when permission changes on file - [#1369524](https://bugzilla.redhat.com/1369524): segment fault while join thread reaper_thr in fini() -- [#1369530](https://bugzilla.redhat.com/1369530): protocol/server: readlink rsp xdr failed while readlink got an error +- [#1369530](https://bugzilla.redhat.com/1369530): protocol/server: readlink rsp xdr failed while readlink got an error - [#1369638](https://bugzilla.redhat.com/1369638): DHT stale layout issue will be seen often with md-cache prolonged cache of lookups - [#1369721](https://bugzilla.redhat.com/1369721): EventApis will not work if compiled using ./configure --disable-glupy - [#1370053](https://bugzilla.redhat.com/1370053): fix EXPECT_WITHIN @@ -615,7 +643,7 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1374567](https://bugzilla.redhat.com/1374567): [Bitrot]: Recovery fails of a corrupted hardlink (and the corresponding parent file) in a disperse volume - [#1374581](https://bugzilla.redhat.com/1374581): Geo-rep worker Faulty with OSError: [Errno 21] Is a directory - [#1374597](https://bugzilla.redhat.com/1374597): [geo-rep]: AttributeError: 'Popen' object has no attribute 'elines' -- [#1374608](https://bugzilla.redhat.com/1374608): geo-replication *changes.log does not respect the log-level configured +- [#1374608](https://bugzilla.redhat.com/1374608): geo-replication \*changes.log does not respect the log-level configured - [#1374626](https://bugzilla.redhat.com/1374626): Worker crashes with EINVAL errors - [#1374630](https://bugzilla.redhat.com/1374630): [geo-replication]: geo-rep Status is not showing bricks from one of the nodes - [#1374639](https://bugzilla.redhat.com/1374639): glusterfs: create a directory with 0464 mode return EIO error @@ -645,17 +673,17 @@ A total of 571 patches has been sent, addressing 422 bugs: - [#1383692](https://bugzilla.redhat.com/1383692): GlusterFS fails to build on old Linux distros with linux/oom.h missing - [#1383913](https://bugzilla.redhat.com/1383913): spurious heal info as pending heal entries never end on an EC volume while IOs are going on - [#1385224](https://bugzilla.redhat.com/1385224): arbiter volume write performance is bad with sharding -- [#1385236](https://bugzilla.redhat.com/1385236): invalid argument warning messages seen in fuse client logs 2016-09-30 06:34:58.938667] W [dict.c:418ict_set] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x58722) 0-dict: !this || !value for key=link-count [Invalid argument] +- [#1385236](https://bugzilla.redhat.com/1385236): invalid argument warning messages seen in fuse client logs 2016-09-30 06:34:58.938667] W [dict.c:418ict_set] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x58722) 0-dict: !this || !value for key=link-count [Invalid argument] - [#1385451](https://bugzilla.redhat.com/1385451): "nfs.disable: on" is not showing in Vol info by default for the 3.7.x volumes after updating to 3.9.0 - [#1386072](https://bugzilla.redhat.com/1386072): Spurious permission denied problems observed - [#1386178](https://bugzilla.redhat.com/1386178): eventsapi/georep: Events are not available for Checkpoint and Status Change - 
[#1386338](https://bugzilla.redhat.com/1386338): pmap_signin event fails to update brickinfo->signed_in flag - [#1387099](https://bugzilla.redhat.com/1387099): Boolean attributes are published as string -- [#1387492](https://bugzilla.redhat.com/1387492): Error and warning message getting while removing glusterfs-events package -- [#1387502](https://bugzilla.redhat.com/1387502): Incorrect volume type in the "glusterd_state" file generated using CLI "gluster get-state" +- [#1387492](https://bugzilla.redhat.com/1387492): Error and warning message getting while removing glusterfs-events package +- [#1387502](https://bugzilla.redhat.com/1387502): Incorrect volume type in the "glusterd_state" file generated using CLI "gluster get-state" - [#1387564](https://bugzilla.redhat.com/1387564): [Eventing]: UUID is showing zeros in the event message for the peer probe operation. - [#1387894](https://bugzilla.redhat.com/1387894): Regression caused by enabling client-io-threads by default -- [#1387960](https://bugzilla.redhat.com/1387960): Sequential volume start&stop is failing with SSL enabled setup. +- [#1387960](https://bugzilla.redhat.com/1387960): Sequential volume start&stop is failing with SSL enabled setup. - [#1387964](https://bugzilla.redhat.com/1387964): [Eventing]: 'gluster vol bitrot scrub ondemand' does not produce an event - [#1387975](https://bugzilla.redhat.com/1387975): Continuous warning messages getting when one of the cluster node is down on SSL setup. - [#1387981](https://bugzilla.redhat.com/1387981): [Eventing]: 'gluster volume tier start force' does not generate a TIER_START event diff --git a/docs/release-notes/4.0.0.md b/docs/release-notes/4.0.0.md index a3bcd56..e8e6f68 100644 --- a/docs/release-notes/4.0.0.md +++ b/docs/release-notes/4.0.0.md @@ -17,19 +17,19 @@ A full list of bugs that have been addressed is included further below. ## Announcements 1. As 3.13 was a short term maintenance release, features which have been -included in that release are available with 4.0.0 as well.These features may be of -interest to users upgrading to 4.0.0 from older than 3.13 releases. The 3.13 -[release notes](http://docs.gluster.org/en/latest/release-notes/) captures the list of features that were introduced with 3.13. + included in that release are available with 4.0.0 as well.These features may be of + interest to users upgrading to 4.0.0 from older than 3.13 releases. The 3.13 + [release notes](http://docs.gluster.org/en/latest/release-notes/) captures the list of features that were introduced with 3.13. **NOTE:** As 3.13 was a short term maintenance release, it will reach end of life (EOL) with the release of 4.0.0. ([reference](https://www.gluster.org/release-schedule/)) 2. Releases that receive maintenance updates post 4.0 release are, 3.10, 3.12, -4.0 ([reference](https://www.gluster.org/release-schedule/)) + 4.0 ([reference](https://www.gluster.org/release-schedule/)) 3. With this release, the CentOS storage SIG will not build server packages for -CentOS6. Server packages will be available for CentOS7 only. For ease of -migrations, client packages on CentOS6 will be published and maintained. + CentOS6. Server packages will be available for CentOS7 only. For ease of + migrations, client packages on CentOS6 will be published and maintained. **NOTE**: This change was announced [here](http://lists.gluster.org/pipermail/gluster-users/2018-January/033212.html) @@ -59,39 +59,46 @@ daemon. 
More information is available in the [Limitations](#limitations) section GD2 brings many new changes and improvements, that affect both users and developers. #### Features + The most significant new features brought by GD2 are below. + ##### Native REST APIs + GD2 exposes all of its management functionality via [ReST APIs](https://github.com/gluster/glusterd2/blob/master/doc/endpoints.md). The ReST APIs accept and return data encoded in JSON. This enables external projects such as [Heketi](https://github.com/heketi/heketi) to be better integrated with GD2. ##### CLI + GD2 provides a new CLI, `glustercli`, built on top of the ReST API. The CLI retains much of the syntax of the old `gluster` command. In addition we have, + - Improved CLI help messages - Auto completion for sub commands - Improved CLI error messages on failure - Framework to run `glustercli` from outside the Cluster. In this release, the following CLI commands are available, + - Peer management - - Peer Probe/Attach - - Peer Detach - - Peer Status + - Peer Probe/Attach + - Peer Detach + - Peer Status - Volume Management - - Create/Start/Stop/Delete - - Expand - - Options Set/Get + - Create/Start/Stop/Delete + - Expand + - Options Set/Get - Bitrot - - Enable/Disable - - Configure - - Status + - Enable/Disable + - Configure + - Status - Geo-replication - - Create/Start/Pause/Resume/Stop/Delete - - Configure - - Status + - Create/Start/Pause/Resume/Stop/Delete + - Configure + - Status ##### Configuration store + GD2 uses [etcd](https://github.com/coreos/etcd/) to store the Gluster pool configuration, which solves the config synchronize issues reported against the Gluster management daemon. @@ -100,12 +107,14 @@ forming the trusted storage pool. If required, GD2 can also connect to an already existing etcd cluster. ##### Transaction Framework + GD2 brings a newer more flexible distributed framework, to help it perform actions across the storage pool. The transaction framework provides better control for choosing peers for a Gluster operation and it also provides a mechanism to roll back the changes when something goes bad. ##### Volume Options + GD2 intelligently fetches and builds the list of volume options by directly reading `xlators` `*.so` files. It does required validations during volume set without maintaining duplicate list of options. This avoids lot of issues which @@ -117,6 +126,7 @@ options and default options. Work is still in progress to categorize these options and tune the list for better understanding and ease of use. ##### Volfiles generation and management + GD2 has a newer and better structured way for developers to define volfile structure. The new method reduces the effort required to extend graphs or add new graphs. @@ -125,11 +135,13 @@ Also, volfiles are generated in single peer and stored in `etcd` store. This is very important for scalability since Volfiles are not stored in every node. ##### Security + GD2 supports TLS for ReST and internal communication, and authentication for the ReST API.If enabled, ReST APIs are currently limited to CLI, or the users who have access to the Token file present in `$GLUSTERD2_WORKDIR/auth` file. ##### Features integration - Self Heal + Self Heal feature integrated for the new Volumes created using Glusterd2. ##### Geo-replication @@ -137,9 +149,11 @@ Self Heal feature integrated for the new Volumes created using Glusterd2. With GD2 integration Geo-replication setup becomes very easy. 
If Master and Remote volume are available and running, Geo-replication can be setup with just a single command. + ``` glustercli geo-replication create :: ``` + Geo-replication status is improved, Status clearly distinguishes the multiple session details in status output. @@ -149,24 +163,29 @@ release, Master worker status rows will always match with Bricks list in Volume info. Status can be checked using, + ``` glustercli geo-replication status glustercli geo-replication status :: ``` + All the other commands are available as usual. Limitations: - On Remote nodes, Geo-replication is not yet creates the log directories. As -a workaround, create the required log directories in Remote Volume nodes. + a workaround, create the required log directories in Remote Volume nodes. ##### Events APIs + Events API feature is integrated with GD2. Webhooks can be registered to listen for GlusterFS events. Work is in progress for exposing an REST API to view all the events happened in last 15 minutes. #### Limitations + ##### Backward compatibility + GD2 is not backwards compatible with the older GlusterD. Heterogeneous clusters running both GD2 and GlusterD are not possible. @@ -174,6 +193,7 @@ GD2 retains compatibility with Gluster-3.x clients. Old clients will still be able to mount and use volumes exported using GD2. ##### Upgrade and migration + GD2 does not support upgrade from Gluster-3.x releases, in Gluster-4.0. Gluster-4.0 will be shipping with both GD2 and the existing GlusterD. Users will be able to upgrade to Gluster-4.0 while continuing to use GlusterD. @@ -186,23 +206,29 @@ Post Gluster-4.1, GlusterD would be maintained for a couple of releases, post which the only option to manage the cluster would be GD2. ##### Missing and partial commands + Not all commands from GlusterD, have been implemented for GD2. Some have been only partially implemented. This means not all GlusterFS features are available in GD2. We aim to bring most of the commands back in Gluster-4.1. ##### Recovery from full shutdown + With GD2, the process of recovery from a situation of a full cluster shutdown requires reading the [document available](https://github.com/gluster/glusterd2/wiki/Recovery) as well as some expertise. #### Known Issues + ##### 2-node clusters + GD2 does not work well in 2-node clusters. Two main issues exist in this regard. + - Restarting GD2 fails in 2-node clusters [#352](https://github.com/gluster/glusterd2/issues/352) - Detach fails in 2-node clusters [#332](https://github.com/gluster/glusterd2/issues/332) So it is recommended right now to run GD2 only in clusters of 3 or larger. ##### Other issues + Other known issues are tracked on [github issues](https://github.com/gluster/glusterd2/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+) right now. Please file any other issue you find on github issues. @@ -220,6 +246,7 @@ information and avoids the performance penalty and complexities of previous approaches. #### 1. Metrics collection across every FOP in every xlator + **Notes for users:** Now, Gluster now has in-built latency measures in the xlator abstraction, thus enabling capture of metrics and usage patterns across workloads. @@ -230,6 +257,7 @@ These measures are currently enabled by default. This feature is auto-enabled and cannot be disabled. #### 2. 
Monitoring support + **Notes for users:** Currently, the only project which consumes metrics and provides basic monitoring is [glustermetrics](https://github.com/amarts/glustermetrics), which provides a good idea on how to @@ -244,31 +272,38 @@ framework to generate more metrics is present for other translators and core components. However, additional metrics are not added in this release. ### Performance + #### 1. EC: Make metadata [F]GETXATTR operations faster + **Notes for users:** Disperse translator has made performance improvements to the [F]GETXATTR operation. Workloads involving heavy use of extended attributes on files and directories, will gain from the improvements made. #### 2. Allow md-cache to serve nameless lookup from cache + **Notes for users:** The md-cache translator is enhanced to cache nameless lookups (typically seen with NFS workloads). This helps speed up overall operations on the volume reducing the number of lookups done over the network. Typical workloads that will benefit from this enhancement are, + - NFS based access - Directory listing with FUSE, when ACLs are enabled #### 3. md-cache: Allow runtime addition of xattrs to the list of xattrs that md-cache caches + **Notes for users:** md-cache was enhanced to cache extended attributes of a file or directory, for gluster specific attributes. This has now been enhanced to cache user provided attributes (xattrs) as well. To add specific xattrs to the cache list, use the following command: + ``` # gluster volume set xattr-cache-list ",,..." ``` + Existing options, such as "cache-samba-metadata" "cache-swift-metadata" continue to function. The new option "xattr-cache-list" appends to the list generated by the existing options. @@ -278,18 +313,22 @@ Setting this option overwrites the previous value set for this option. The append to the existing list of xattr is not supported with this release. #### 4. Cache last stripe of an EC volume while write is going on + **Notes for users:** Disperse translator now has the option to retain a write-through cache of the last write stripe. This helps in improved small append sequential IO patterns by reducing the need to read a partial stripe for appending operations. To enable this use, + ``` # gluster volume set disperse.stripe-cache ``` + Where, is the number of stripes to cache. #### 5. tie-breaker logic for blocking inodelks/entrylk in SHD + **Notes for users:** Self-heal deamon locking has been enhanced to identify situations where an selfheal deamon is actively working on an inode. This enables other selfheal @@ -297,6 +336,7 @@ daemons to proceed with other entries in the queue, than waiting on a particular entry, thus preventing starvation among selfheal threads. #### 6. Independent eager-lock options for file and directory accesses + **Notes for users:** A new option named 'disperse.other-eager-lock' has been added to make it possible to have different settings for regular file accesses and accesses @@ -308,36 +348,45 @@ from the same directory, you can disable this option to improve the performance for these users while still keeping best performance for file accesses. #### 7. md-cache: Added an option to cache statfs data + **Notes for users:** This can be controlled with option performance.md-cache-statfs + ``` gluster volume set performance.md-cache-statfs ``` #### 8. 
Improved disperse performance due to parallel xattrop updates + **Notes for users:** Disperse translator has been optimized to perform xattrop update operation in parallel on the bricks during self-heal to improve performance. ### Geo-replication + #### 1. Geo-replication: Improve gverify.sh logs + **Notes for users:** gverify.sh is the script which runs during geo-rep session creation which validates pre-requisites. The logs have been improved and locations are changed as follows, + 1. Slave mount log file is changed from `/geo-replication-slaves/slave.log` to, `/geo-replication/gverify-slavemnt.log` 2. Master mount log file is separated from the slave log file under, `/geo-replication/gverify-mastermnt.log` #### 2. Geo-rep: Cleanup stale (unusable) XSYNC changelogs. + **Notes for users:** Stale xsync logs were not cleaned up, causing accumulation of these on the system. This change cleans up the stale xsync logs, if geo-replication has to restart from a faulty state. ### Standalone + #### 1. Ability to force permissions while creating files/directories on a volume + **Notes for users:** Options have been added to the posix translator, to override default umask values with which files and directories are created. This is particularly useful @@ -346,16 +395,19 @@ prevent such useful sharing, and supersede ACLs in this regard, these options are provided to control this behavior. Command usage is as follows: + ``` # gluster volume set storage. ``` + The valid `` ranges from 0000 to 0777 `` are: - - create-mask - - create-directory-mask - - force-create-mode - - force-create-directory + +- create-mask +- create-directory-mask +- force-create-mode +- force-create-directory Options "create-mask" and "create-directory-mask" are added to remove the mode bits set on a file or directory when its created. Default value of these @@ -364,6 +416,7 @@ the default permission for a file or directory irrespective of the clients umask. Default value of these options is 0000. #### 2. Replace MD5 usage to enable FIPS support + **Notes for users:** Previously, if Gluster was run on a FIPS enabled system, it used to crash because MD5 is not FIPS compliant and Gluster consumes MD5 checksum in @@ -385,6 +438,7 @@ Snapshot feature in Gluster still uses md5 checksums, hence running in FIPS compliant systems requires that the snapshot feature is not used. #### 3. Dentry fop serializer xlator on brick stack + **Notes for users:** This feature strengthens consistency of the file system, trading it for some performance and is strongly suggested for workloads where consistency is @@ -397,6 +451,7 @@ become consistent, but a large proportion of applications are not built to handle eventual consistency. This feature can be enabled as follows, + ``` # gluster volume set features.sdfs enable ``` @@ -406,6 +461,7 @@ This feature is released as a technical preview, as performance implications are not known completely. #### 4. Add option to disable nftw() based deletes when purging the landfill directory + **Notes for users:** The gluster brick processes use an optimized manner of deleting entire sub-trees using the nftw call. With this release, an option is being added to toggle this @@ -418,15 +474,18 @@ helps toggle this feature. The default is always enabled, as in the older releases. #### 5. Add option in POSIX to limit hardlinks per inode + **Notes for users:** Added an option to POSIX that limits the number of hard links that can be created against an inode (file). 
This helps when there needs to be a different hardlink limit than what the local FS provides for the bricks. The option to control this behavior is, + ``` # gluster volume set storage.max-hardlinks ``` + Where, `` is 0-0xFFFFFFFF. If the local file system that the brick is using has a lower limit than this setting, that would be honored. @@ -434,6 +493,7 @@ Default is set to 100, setting this to 0 turns it off and leaves it to the local file system defaults. Setting it to 1 turns off hard links. #### 6. Enhancements for directory listing in readdirp + **Notes for users:** Prior to this release, rebalance performed a fix-layout on a directory before healing its subdirectories. If there were a lot of subdirs, it could take a @@ -445,12 +505,14 @@ parents, thereby changing the way rebalance acts (files within sub directories are migrated first) and also resolving the directory listing issue. #### 7. Rebalance skips migration of file if it detects writes from application + **Notes for users:** Rebalance process skips migration of file if it detects writes from application. To force migration even in the presence of writes from application to file, "cluster.force-migration" has to be turned on, which is off by default. The option to control this behavior is, + ``` # gluster volume set cluster.force-migration ``` @@ -464,7 +526,9 @@ remove brick commit is performed. Rebalancing files with active write IO to them has a chance of data corruption. ### Developer related + #### 1. xlators should not provide init(), fini() and others directly, but have class_methods + **Notes for developers:** This release brings in a new unified manner of defining xlator methods. Which avoids certain unwanted side-effects of the older method (like having to have @@ -480,11 +544,13 @@ same can be seen [here](https://github.com/gluster/glusterfs/commit/5b4b25c697f9 The older mechanism is still supported, but not preferred. #### 2. Framework for distributed testing + **Notes for developers:** A new framework for running the regression tests for Gluster is added. The [README](https://github.com/gluster/glusterfs/blob/release-4.0/extras/distributed-testing/README) has details on how to use the same. #### 3. New API for acquiring mandatory locks + **Notes for developers:** The current API for byte-range locks glfs_posix_lock doesn't allow applications to specify whether it is advisory or mandatory type lock. This @@ -498,6 +564,7 @@ A sample test program can be found [here](https://github.com/gluster/glusterfs/b usage of this API. #### 4. New on-wire protocol (XDR) needed to support iattx and cleaner dictionary structure + **Notes for developers:** With changes in the code to adapt to a newer iatt structure, and stricter data format enforcement within dictionaries passed across the wire, and also as a @@ -516,6 +583,7 @@ An example of better encoding dictionary values for wire transfers can be seen [Here](https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/rpc-for-glusterfs.new-versions.md) is some additional information on Gluster RPC programs for the inquisitive. #### 5. The protocol xlators should prevent sending binary values in a dict over the networks + **Notes for developers:** Dict data over the wire in Gluster was sent in binary. This has been changed with this release, as the on-wire protocol wire is also new, to send XDR encoded @@ -523,6 +591,7 @@ dict values across. In the future, any new dict type needs to also handle the required XDR encoding of the same. #### 6. 
Translator to handle 'global' options + **Notes for developers:** GlusterFS process has around 50 command line arguments to itself. While many of the options are initial settings, many others can change its value in volume @@ -553,10 +622,10 @@ Bugs addressed since release-3.13.0 are listed below. - [#1440659](https://bugzilla.redhat.com/1440659): Add events to notify disk getting fill - [#1443145](https://bugzilla.redhat.com/1443145): Free runtime allocated resources upon graph switch or glfs_fini() - [#1446381](https://bugzilla.redhat.com/1446381): detach start does not kill the tierd -- [#1467250](https://bugzilla.redhat.com/1467250): Accessing a file when source brick is down results in that FOP being hung +- [#1467250](https://bugzilla.redhat.com/1467250): Accessing a file when source brick is down results in that FOP being hung - [#1467614](https://bugzilla.redhat.com/1467614): Gluster read/write performance improvements on NVMe backend - [#1469487](https://bugzilla.redhat.com/1469487): sys_xxx() functions should guard against bad return values from fs -- [#1471031](https://bugzilla.redhat.com/1471031): dht_(f)xattrop does not implement migration checks +- [#1471031](https://bugzilla.redhat.com/1471031): dht\_(f)xattrop does not implement migration checks - [#1471753](https://bugzilla.redhat.com/1471753): [disperse] Keep stripe in in-memory cache for the non aligned write - [#1474768](https://bugzilla.redhat.com/1474768): The output of the "gluster help" command is difficult to read - [#1479528](https://bugzilla.redhat.com/1479528): Rebalance estimate(ETA) shows wrong details(as intial message of 10min wait reappears) when still in progress @@ -577,7 +646,7 @@ Bugs addressed since release-3.13.0 are listed below. - [#1506197](https://bugzilla.redhat.com/1506197): [Parallel-Readdir]Warning messages in client log saying 'parallel-readdir' is not recognized. - [#1508898](https://bugzilla.redhat.com/1508898): Add new configuration option to manage deletion of Worm files - [#1508947](https://bugzilla.redhat.com/1508947): glusterfs: Include path in pkgconfig file is wrong -- [#1509189](https://bugzilla.redhat.com/1509189): timer: Possible race condition between gf_timer_* routines +- [#1509189](https://bugzilla.redhat.com/1509189): timer: Possible race condition between gf*timer*\* routines - [#1509254](https://bugzilla.redhat.com/1509254): snapshot remove does not cleans lvm for deactivated snaps - [#1509340](https://bugzilla.redhat.com/1509340): glusterd does not write pidfile correctly when forking - [#1509412](https://bugzilla.redhat.com/1509412): Change default versions of certain features to 3.13 from 4.0 @@ -591,7 +660,7 @@ Bugs addressed since release-3.13.0 are listed below. 
- [#1510874](https://bugzilla.redhat.com/1510874): print-backtrace.sh failing with cpio version 2.11 or older - [#1510940](https://bugzilla.redhat.com/1510940): The number of bytes of the quota specified in version 3.7 or later is incorrect - [#1511310](https://bugzilla.redhat.com/1511310): Test bug-1483058-replace-brick-quorum-validation.t fails inconsistently -- [#1511339](https://bugzilla.redhat.com/1511339): In Replica volume 2*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon +- [#1511339](https://bugzilla.redhat.com/1511339): In Replica volume 2\*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon - [#1512437](https://bugzilla.redhat.com/1512437): parallel-readdir = TRUE prevents directories listing - [#1512451](https://bugzilla.redhat.com/1512451): Not able to create snapshot - [#1512455](https://bugzilla.redhat.com/1512455): glustereventsd hardcodes working-directory @@ -606,13 +675,13 @@ Bugs addressed since release-3.13.0 are listed below. - [#1517068](https://bugzilla.redhat.com/1517068): Unable to change the Slave configurations - [#1517554](https://bugzilla.redhat.com/1517554): help for volume profile is not in man page - [#1517633](https://bugzilla.redhat.com/1517633): Geo-rep: access-mount config is not working -- [#1517904](https://bugzilla.redhat.com/1517904): tests/bugs/core/multiplex-limit-issue-151.t fails sometimes in upstream master +- [#1517904](https://bugzilla.redhat.com/1517904): tests/bugs/core/multiplex-limit-issue-151.t fails sometimes in upstream master - [#1517961](https://bugzilla.redhat.com/1517961): Failure of some regression tests on Centos7 (passes on centos6) - [#1518508](https://bugzilla.redhat.com/1518508): Change GD_OP_VERSION to 3_13_0 from 3_12_0 for RFE https://bugzilla.redhat.com/show_bug.cgi?id=1464350 - [#1518582](https://bugzilla.redhat.com/1518582): Reduce lock contention on fdtable lookup - [#1519598](https://bugzilla.redhat.com/1519598): Reduce lock contention on protocol client manipulating fd - [#1520245](https://bugzilla.redhat.com/1520245): High mem/cpu usage, brick processes not starting and ssl encryption issues while testing scaling with multiplexing (500-800 vols) -- [#1520758](https://bugzilla.redhat.com/1520758): [Disperse] Add stripe in cache even if file/data does not exist +- [#1520758](https://bugzilla.redhat.com/1520758): [Disperse] Add stripe in cache even if file/data does not exist - [#1520974](https://bugzilla.redhat.com/1520974): Compiler warning in dht-common.c because of a switch statement on a boolean - [#1521013](https://bugzilla.redhat.com/1521013): rfc.sh should allow custom remote names for ORIGIN - [#1521014](https://bugzilla.redhat.com/1521014): quota_unlink_cbk crashes when loc.inode is null @@ -641,7 +710,7 @@ Bugs addressed since release-3.13.0 are listed below. 
- [#1529883](https://bugzilla.redhat.com/1529883): glusterfind is extremely slow if there are lots of changes - [#1530281](https://bugzilla.redhat.com/1530281): glustershd fails to start on a volume force start after a brick is down - [#1530910](https://bugzilla.redhat.com/1530910): Use after free in cli_cmd_volume_create_cbk -- [#1531149](https://bugzilla.redhat.com/1531149): memory leak: get-state leaking memory in small amounts +- [#1531149](https://bugzilla.redhat.com/1531149): memory leak: get-state leaking memory in small amounts - [#1531987](https://bugzilla.redhat.com/1531987): increment of a boolean expression warning - [#1532238](https://bugzilla.redhat.com/1532238): Failed to access volume via Samba with undefined symbol from socket.so - [#1532591](https://bugzilla.redhat.com/1532591): Tests: Geo-rep tests are failing in few regression machines @@ -674,11 +743,10 @@ Bugs addressed since release-3.13.0 are listed below. - [#1544638](https://bugzilla.redhat.com/1544638): 3.8 -> 3.10 rolling upgrade fails (same for 3.12 or 3.13) on Ubuntu 14 - [#1545724](https://bugzilla.redhat.com/1545724): libgfrpc does not export IPv6 RPC methods even with --with-ipv6-default - [#1547635](https://bugzilla.redhat.com/1547635): add option to bulld rpm without server -- [#1547842](https://bugzilla.redhat.com/1547842): Typo error in __dht_check_free_space function log message +- [#1547842](https://bugzilla.redhat.com/1547842): Typo error in \_\_dht_check_free_space function log message - [#1548264](https://bugzilla.redhat.com/1548264): [Rebalance] "Migrate file failed: : failed to get xattr [No data available]" warnings in rebalance logs - [#1548271](https://bugzilla.redhat.com/1548271): DHT calls dht_lookup_everywhere for 1xn volumes - [#1550808](https://bugzilla.redhat.com/1550808): memory leak in pre-op in replicate volumes for every write - [#1551112](https://bugzilla.redhat.com/1551112): Rolling upgrade to 4.0 is broken - [#1551640](https://bugzilla.redhat.com/1551640): GD2 fails to dlopen server xlator - [#1554077](https://bugzilla.redhat.com/1554077): 4.0 clients may fail to convert iatt in dict when recieving the same from older (< 4.0) servers - diff --git a/docs/release-notes/4.0.1.md b/docs/release-notes/4.0.1.md index 64158b8..a94aa99 100644 --- a/docs/release-notes/4.0.1.md +++ b/docs/release-notes/4.0.1.md @@ -5,9 +5,11 @@ contain a listing of all the new features that were added and bugs fixed in the GlusterFS 4.0 release. ## Major changes, features and limitations addressed in this release + **No Major changes** ## Major issues + **No Major issues** ## Bugs addressed diff --git a/docs/release-notes/4.0.2.md b/docs/release-notes/4.0.2.md index 6092306..069fe9c 100644 --- a/docs/release-notes/4.0.2.md +++ b/docs/release-notes/4.0.2.md @@ -7,6 +7,7 @@ GlusterFS 4.0 release. ## Major changes, features and limitations addressed in this release This release contains a fix for a security vulerability in Gluster as follows, + - http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-1088 - https://nvd.nist.gov/vuln/detail/CVE-2018-1088 @@ -19,6 +20,7 @@ enabled, and access to the same restricted using the `auth.ssl-allow` option. See, this [guide](https://docs.gluster.org/en/v3/Administrator%20Guide/SSL/) for more details. 
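As a minimal sketch of the TLS-based mitigation described above (the volume name and the allowed identities are placeholders, and the TLS certificates are assumed to be provisioned already as per the linked SSL guide), access to a volume can be restricted as follows:

```
Enable TLS on the volume's I/O path
# gluster volume set <volname> client.ssl on
# gluster volume set <volname> server.ssl on

Only allow the TLS identities (certificate Common Names) listed below
# gluster volume set <volname> auth.ssl-allow 'server1,server2,trusted-client'
```

A running volume typically needs to be stopped and started again for the SSL options to take effect.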
## Major issues + **No Major issues** ## Bugs addressed diff --git a/docs/release-notes/4.1.0.md b/docs/release-notes/4.1.0.md index f05b845..298c185 100644 --- a/docs/release-notes/4.1.0.md +++ b/docs/release-notes/4.1.0.md @@ -15,22 +15,22 @@ A full list of bugs that have been addressed is included further below. ## Announcements 1. As 4.0 was a short term maintenance release, features which have been -included in that release are available with 4.1.0 as well. These features may -be of interest to users upgrading to 4.1.0 from older than 4.0 releases. The 4.0 -[release notes](http://docs.gluster.org/en/latest/release-notes/) captures the list of features that were introduced with 4.0. + included in that release are available with 4.1.0 as well. These features may + be of interest to users upgrading to 4.1.0 from older than 4.0 releases. The 4.0 + [release notes](http://docs.gluster.org/en/latest/release-notes/) captures the list of features that were introduced with 4.0. **NOTE:** As 4.0 was a short term maintenance release, it will reach end of life (EOL) with the release of 4.1.0. ([reference](https://www.gluster.org/release-schedule/)) 2. Releases that receive maintenance updates post 4.1 release are, 3.12, and -4.1 ([reference](https://www.gluster.org/release-schedule/)) + 4.1 ([reference](https://www.gluster.org/release-schedule/)) **NOTE:** 3.10 long term maintenance release, will reach end of life (EOL) with the release of 4.1.0. ([reference](https://www.gluster.org/release-schedule/)) 3. Continuing with this release, the CentOS storage SIG will not build server -packages for CentOS6. Server packages will be available for CentOS7 only. For -ease of migrations, client packages on CentOS6 will be published and maintained. + packages for CentOS6. Server packages will be available for CentOS7 only. For + ease of migrations, client packages on CentOS6 will be published and maintained. **NOTE**: This change was announced [here](http://lists.gluster.org/pipermail/gluster-users/2018-January/033212.html) @@ -60,6 +60,7 @@ provisioning and a lot of other bug fixes and internal changes. ##### Rebalance [#786](https://github.com/gluster/glusterd2/pull/786) GD2 supports running rebalance on volumes. Supported rebalance operations include, + - rebalance start - rebalance start with fix-layout - rebalance stop @@ -72,6 +73,7 @@ Support only exists in the ReST API right now. CLI support will be introduced in Initial support for volume snapshot has been introduced. At the moment, snapshots are supported only on Thin-LVM bricks. Support snapshot operations include, + - create - activate/deactivate - list @@ -98,6 +100,7 @@ for zones is available. [#783](https://github.com/gluster/glusterd2/pull/783) [# ##### Other changes Other notable changes include, + - Support for volume option levels (experimental, advanced, deprecated) [#591](https://github.com/gluster/glusterd2/pull/591) - Support for resetting volume options [#545](https://github.com/gluster/glusterd2/pull/545) - Option hooks for volume set [#708](https://github.com/gluster/glusterd2/pull/708) @@ -140,39 +143,42 @@ These metrics can be dumped and visualized as detailed [here](https://docs.glust #### 1. 
Additional metrics added to negative lookup cache xlator Metrics added are: - - negative_lookup_hit_count - - negative_lookup_miss_count - - get_real_filename_hit_count - - get_real_filename_miss_count - - nameless_lookup_count - - inodes_with_positive_dentry_cache - - inodes_with_negative_dentry_cache - - dentry_invalidations_recieved - - cache_limit - - consumed_cache_size - - inode_limit - - consumed_inodes + +- negative_lookup_hit_count +- negative_lookup_miss_count +- get_real_filename_hit_count +- get_real_filename_miss_count +- nameless_lookup_count +- inodes_with_positive_dentry_cache +- inodes_with_negative_dentry_cache +- dentry_invalidations_recieved +- cache_limit +- consumed_cache_size +- inode_limit +- consumed_inodes #### 2. Additional metrics added to md-cache xlator Metrics added are: - - stat_cache_hit_count - - stat_cache_miss_count - - xattr_cache_hit_count - - xattr_cache_miss_count - - nameless_lookup_count - - negative_lookup_count - - stat_cache_invalidations_received - - xattr_cache_invalidations_received + +- stat_cache_hit_count +- stat_cache_miss_count +- xattr_cache_hit_count +- xattr_cache_miss_count +- nameless_lookup_count +- negative_lookup_count +- stat_cache_invalidations_received +- xattr_cache_invalidations_received #### 3. Additional metrics added to quick-read xlator Metrics added are: - - total_files_cached - - total_cache_used - - cache-hit - - cache-miss - - cache-invalidations + +- total_files_cached +- total_cache_used +- cache-hit +- cache-miss +- cache-invalidations ### Performance @@ -207,6 +213,7 @@ in processing FUSE requests in parallel, than the existing single reader model. This is provided as a mount time option named `reader-thread-count` and can be used as follows, + ``` # mount -t glusterfs -o reader-thread-count= : ``` @@ -223,6 +230,7 @@ configurable option provides the ability to tune this up or down based on the workload to improve performance of writes. Usage: + ``` # gluster volume set performance.aggregate-size ``` @@ -265,17 +273,19 @@ operations (for example tar may report "file changed as we read it"), to maintain and report equal time stamps on the file across the subvolumes. To enable the feature use, + ``` # gluster volume set features.utime ``` **Limitations**: + - Mounting gluster volume with time attribute options (noatime, realatime...) -is not supported with this feature + is not supported with this feature - Certain entry operations (with differing creation flags) would reflect an -eventual consistency w.r.t the time attributes + eventual consistency w.r.t the time attributes - This feature does not guarantee consistent time for directories if hashed -sub-volume for the directory is down + sub-volume for the directory is down - readdirp (or directory listing) is not supported with this feature ### Developer related @@ -320,7 +330,7 @@ Bugs addressed since release-4.0.0 are listed below. - [#1523219](https://bugzilla.redhat.com/1523219): fuse xlator uses block size and fragment size 128KB leading to rounding off in df output - [#1530905](https://bugzilla.redhat.com/1530905): Reducing regression time of glusterd test cases - [#1533342](https://bugzilla.redhat.com/1533342): Syntactical errors in hook scripts for managing SELinux context on bricks -- [#1536024](https://bugzilla.redhat.com/1536024): Rebalance process is behaving differently for AFR and EC volume. +- [#1536024](https://bugzilla.redhat.com/1536024): Rebalance process is behaving differently for AFR and EC volume. 
- [#1536186](https://bugzilla.redhat.com/1536186): build: glibc has removed legacy rpc headers and rpcgen in Fedora28, use libtirpc - [#1537362](https://bugzilla.redhat.com/1537362): glustershd/glusterd is not using right port when connecting to glusterfsd process - [#1537364](https://bugzilla.redhat.com/1537364): [RFE] - get-state option should mark profiling enabled flag at volume level @@ -354,13 +364,13 @@ Bugs addressed since release-4.0.0 are listed below. - [#1546620](https://bugzilla.redhat.com/1546620): DHT calls dht_lookup_everywhere for 1xn volumes - [#1546954](https://bugzilla.redhat.com/1546954): [Rebalance] "Migrate file failed: : failed to get xattr [No data available]" warnings in rebalance logs - [#1547068](https://bugzilla.redhat.com/1547068): Bricks getting assigned to different pids depending on whether brick path is IP or hostname based -- [#1547128](https://bugzilla.redhat.com/1547128): Typo error in __dht_check_free_space function log message +- [#1547128](https://bugzilla.redhat.com/1547128): Typo error in \_\_dht_check_free_space function log message - [#1547662](https://bugzilla.redhat.com/1547662): After a replace brick command, self-heal takes some time to start healing files on disperse volumes - [#1547888](https://bugzilla.redhat.com/1547888): [brick-mux] incorrect event-thread scaling in server_reconfigure() - [#1548361](https://bugzilla.redhat.com/1548361): Make afr_fsync a transaction - [#1549000](https://bugzilla.redhat.com/1549000): line-coverage tests not capturing details properly. - [#1549606](https://bugzilla.redhat.com/1549606): Eager lock should be present for both metadata and data transactions -- [#1549915](https://bugzilla.redhat.com/1549915): [Fuse Sub-dir] After performing add-brick on volume,doing rm -rf * on subdir mount point fails with "Transport endpoint is not connected" +- [#1549915](https://bugzilla.redhat.com/1549915): [Fuse Sub-dir] After performing add-brick on volume,doing rm -rf \* on subdir mount point fails with "Transport endpoint is not connected" - [#1550078](https://bugzilla.redhat.com/1550078): memory leak in pre-op in replicate volumes for every write - [#1550339](https://bugzilla.redhat.com/1550339): glusterd leaks memory when vol status is issued - [#1550895](https://bugzilla.redhat.com/1550895): GD2 fails to dlopen server xlator @@ -383,7 +393,7 @@ Bugs addressed since release-4.0.0 are listed below. - [#1559075](https://bugzilla.redhat.com/1559075): enable ownthread feature for glusterfs4_0_fop_prog - [#1559126](https://bugzilla.redhat.com/1559126): Incorrect error message in /features/changelog/lib/src/gf-history-changelog.c - [#1559130](https://bugzilla.redhat.com/1559130): ssh stderr in glusterfind gets swallowed -- [#1559235](https://bugzilla.redhat.com/1559235): Increase the inode table size on server when upcall enabled +- [#1559235](https://bugzilla.redhat.com/1559235): Increase the inode table size on server when upcall enabled - [#1560319](https://bugzilla.redhat.com/1560319): NFS client gets "Invalid argument" when writing file through nfs-ganesha with quota - [#1560393](https://bugzilla.redhat.com/1560393): Fix regresssion failure for ./tests/basic/md-cache/bug-1418249.t - [#1560411](https://bugzilla.redhat.com/1560411): fallocate created data set is crossing storage reserve space limits resulting 100% brick full @@ -419,12 +429,12 @@ Bugs addressed since release-4.0.0 are listed below. 
- [#1570011](https://bugzilla.redhat.com/1570011): test case is failing ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t while brick mux is enabled - [#1570538](https://bugzilla.redhat.com/1570538): linux untar errors out at completion during disperse volume inservice upgrade - [#1570962](https://bugzilla.redhat.com/1570962): print the path of the corrupted object in scrub status -- [#1571069](https://bugzilla.redhat.com/1571069): [geo-rep]: Lot of changelogs retries and "dict is null" errors in geo-rep logs -- [#1572076](https://bugzilla.redhat.com/1572076): Dictionary response is not captured in syncop_(f)xattrop +- [#1571069](https://bugzilla.redhat.com/1571069): [geo-rep]: Lot of changelogs retries and "dict is null" errors in geo-rep logs +- [#1572076](https://bugzilla.redhat.com/1572076): Dictionary response is not captured in syncop\_(f)xattrop - [#1572581](https://bugzilla.redhat.com/1572581): Remove-brick failed on Distributed volume while rm -rf is in-progress - [#1572586](https://bugzilla.redhat.com/1572586): dht: do not allow migration if file is open - [#1573066](https://bugzilla.redhat.com/1573066): growing glusterd memory usage with connected RHGSWA -- [#1573119](https://bugzilla.redhat.com/1573119): Amends in volume profile option 'gluster-block' +- [#1573119](https://bugzilla.redhat.com/1573119): Amends in volume profile option 'gluster-block' - [#1573220](https://bugzilla.redhat.com/1573220): Memory leak in volume tier status command - [#1574259](https://bugzilla.redhat.com/1574259): Errors unintentionally reported for snapshot status - [#1574305](https://bugzilla.redhat.com/1574305): rm command hangs in fuse_request_send @@ -435,9 +445,9 @@ Bugs addressed since release-4.0.0 are listed below. - [#1576814](https://bugzilla.redhat.com/1576814): GlusterFS can be improved - [#1577162](https://bugzilla.redhat.com/1577162): gfapi: broken symbol versions - [#1579674](https://bugzilla.redhat.com/1579674): Remove EIO from the dht_inode_missing macro -- [#1579736](https://bugzilla.redhat.com/1579736): Additional log messages in dht_readdir(p)_cbk +- [#1579736](https://bugzilla.redhat.com/1579736): Additional log messages in dht_readdir(p)\_cbk - [#1579757](https://bugzilla.redhat.com/1579757): DHT Log flooding in mount log "key=trusted.glusterfs.dht.mds [Invalid argument]" -- [#1580215](https://bugzilla.redhat.com/1580215): [geo-rep]: Lot of changelogs retries and "dict is null" errors in geo-rep logs +- [#1580215](https://bugzilla.redhat.com/1580215): [geo-rep]: Lot of changelogs retries and "dict is null" errors in geo-rep logs - [#1580540](https://bugzilla.redhat.com/1580540): make getfattr return proper response for "glusterfs.gfidtopath" xattr for files created when gfid2path was off - [#1581548](https://bugzilla.redhat.com/1581548): writes succeed when only good brick is down in 1x3 volume - [#1581745](https://bugzilla.redhat.com/1581745): bug-1309462.t is failing reliably due to changes in security.capability changes in the kernel @@ -449,11 +459,11 @@ Bugs addressed since release-4.0.0 are listed below. 
- [#1582199](https://bugzilla.redhat.com/1582199): posix unwinds readdirp calls with readdir signature - [#1582286](https://bugzilla.redhat.com/1582286): Brick-mux regressions failing on 4.1 branch - [#1582531](https://bugzilla.redhat.com/1582531): posix/ctime: Mtime is not updated on setting it to older date -- [#1582549](https://bugzilla.redhat.com/1582549): api: missing __THROW on pub function decls +- [#1582549](https://bugzilla.redhat.com/1582549): api: missing \_\_THROW on pub function decls - [#1583016](https://bugzilla.redhat.com/1583016): libgfapi: glfs init fails on afr volume with ctime feature enabled - [#1583734](https://bugzilla.redhat.com/1583734): rpc_transport_unref() called for an unregistered socket fd - [#1583769](https://bugzilla.redhat.com/1583769): Fix incorrect rebalance log message -- [#1584633](https://bugzilla.redhat.com/1584633): Brick process crashed after upgrade from RHGS-3.3.1 async(7.4) to RHGS-3.4(7.5) +- [#1584633](https://bugzilla.redhat.com/1584633): Brick process crashed after upgrade from RHGS-3.3.1 async(7.4) to RHGS-3.4(7.5) - [#1585894](https://bugzilla.redhat.com/1585894): posix/ctime: EC self heal of directory is blocked with ctime feature enabled - [#1587908](https://bugzilla.redhat.com/1587908): Fix deadlock in failure codepath of shard fsync - [#1590128](https://bugzilla.redhat.com/1590128): xdata is leaking in server3_3_seek diff --git a/docs/release-notes/4.1.1.md b/docs/release-notes/4.1.1.md index d3edd73..5229955 100644 --- a/docs/release-notes/4.1.1.md +++ b/docs/release-notes/4.1.1.md @@ -7,6 +7,7 @@ GlusterFS 4.1 stable release. ## Major changes, features and limitations addressed in this release This release contains a fix for a security vulerability in Gluster as follows, + - http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-10841 - https://nvd.nist.gov/vuln/detail/CVE-2018-10841 diff --git a/docs/release-notes/4.1.10.md b/docs/release-notes/4.1.10.md index b1ea3a7..3029dfc 100644 --- a/docs/release-notes/4.1.10.md +++ b/docs/release-notes/4.1.10.md @@ -1,6 +1,6 @@ # Release notes for Gluster 4.1.10 -This is a bugfix release. The release notes for [4.1.0](4.1.0.md), [4.1.1](4.1.1.md), +This is a bugfix release. The release notes for [4.1.0](4.1.0.md), [4.1.1](4.1.1.md), [4.1.2](4.1.2.md), [4.1.3](4.1.3.md), [4.1.4](4.1.4.md), [4.1.5](4.1.5.md), [4.1.6](4.1.6.md), [4.1.7](4.1.7.md), [4.1.8](4.1.8.md) and [4.1.9](4.1.9.md) contains a listing of all the new features that were added and bugs fixed diff --git a/docs/release-notes/4.1.2.md b/docs/release-notes/4.1.2.md index cfe10e4..f1a5eed 100644 --- a/docs/release-notes/4.1.2.md +++ b/docs/release-notes/4.1.2.md @@ -6,9 +6,9 @@ GlusterFS 4.1 stable release. ## Major changes, features and limitations addressed in this release -1. Release 4.1.0 [notes](4.1.0.md) *incorrectly* reported that all python code in -Gluster packages are python3 compliant, this is not the case and the release -note is amended accordingly. +1. Release 4.1.0 [notes](4.1.0.md) _incorrectly_ reported that all python code in + Gluster packages are python3 compliant, this is not the case and the release + note is amended accordingly. ## Major issues @@ -26,7 +26,7 @@ Bugs addressed since release-4.1.1 are listed below. - [#1597229](https://bugzilla.redhat.com/1597229): glustershd crashes when index heal is launched before graph is initialized. 
- [#1598193](https://bugzilla.redhat.com/1598193): Stale lock with lk-owner all-zeros is observed in some tests - [#1599629](https://bugzilla.redhat.com/1599629): Don't execute statements after decrementing call count in afr -- [#1599785](https://bugzilla.redhat.com/1599785): _is_prefix should return false for 0-length strings +- [#1599785](https://bugzilla.redhat.com/1599785): \_is_prefix should return false for 0-length strings - [#1600941](https://bugzilla.redhat.com/1600941): [geo-rep]: geo-replication scheduler is failing due to unsuccessful umount - [#1603056](https://bugzilla.redhat.com/1603056): When reserve limits are reached, append on an existing file after truncate operation results to hang - [#1603099](https://bugzilla.redhat.com/1603099): directories are invisible on client side diff --git a/docs/release-notes/4.1.3.md b/docs/release-notes/4.1.3.md index c08719e..356872c 100644 --- a/docs/release-notes/4.1.3.md +++ b/docs/release-notes/4.1.3.md @@ -13,8 +13,8 @@ GlusterFS 4.1 stable release. ## Major issues 1. Bug [#1601356](https://bugzilla.redhat.com/show_bug.cgi?id=1601356) titled "Problem with SSL/TLS encryption", -is **not** yet fixed with this release. Patch to fix the same is in progress and -can be tracked [here](https://review.gluster.org/c/glusterfs/+/20993). + is **not** yet fixed with this release. Patch to fix the same is in progress and + can be tracked [here](https://review.gluster.org/c/glusterfs/+/20993). ## Bugs addressed @@ -24,7 +24,7 @@ Bugs addressed since release-4.1.2 are listed below. - [#1596686](https://bugzilla.redhat.com/1596686): key = trusted.glusterfs.protect.writes [Invalid argument]; key = glusterfs.avoid.overwrite [Invalid argument] - [#1609550](https://bugzilla.redhat.com/1609550): glusterfs-resource-agents should not be built for el6 - [#1609551](https://bugzilla.redhat.com/1609551): glusterfs-resource-agents should not be built for el6 -- [#1611104](https://bugzilla.redhat.com/1611104): [geo-rep]: Upgrade fails, session in FAULTY state +- [#1611104](https://bugzilla.redhat.com/1611104): [geo-rep]: Upgrade fails, session in FAULTY state - [#1611106](https://bugzilla.redhat.com/1611106): Glusterd crashed on a few (master) nodes - [#1611108](https://bugzilla.redhat.com/1611108): [geo-rep]: Geo-rep scheduler fails - [#1611110](https://bugzilla.redhat.com/1611110): Glusterd memory leaking in gf_gld_mt_linebuf diff --git a/docs/release-notes/4.1.4.md b/docs/release-notes/4.1.4.md index 4fc74e1..6f50694 100644 --- a/docs/release-notes/4.1.4.md +++ b/docs/release-notes/4.1.4.md @@ -1,26 +1,27 @@ # Release notes for Gluster 4.1.4 This is a bugfix release. The release notes for [4.1.0](4.1.0.md), - [4.1.1](4.1.1.md), [4.1.2](4.1.2.md) and [4.1.3](4.1.3.md) contains a +[4.1.1](4.1.1.md), [4.1.2](4.1.2.md) and [4.1.3](4.1.3.md) contains a listing of all the new features that were added and bugs fixed in the GlusterFS 4.1 stable release. ## Major changes, features and limitations addressed in this release -1. 
This release contains fix for following security vulnerabilities, - - https://nvd.nist.gov/vuln/detail/CVE-2018-10904 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10907 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10911 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10913 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10914 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10923 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10926 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10927 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10928 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10929 - - https://nvd.nist.gov/vuln/detail/CVE-2018-10930 +1. This release contains fix for following security vulnerabilities, -2. To resolve the security vulnerabilities following limitations were made in GlusterFS + - https://nvd.nist.gov/vuln/detail/CVE-2018-10904 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10907 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10911 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10913 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10914 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10923 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10926 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10927 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10928 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10929 + - https://nvd.nist.gov/vuln/detail/CVE-2018-10930 + +2. To resolve the security vulnerabilities following limitations were made in GlusterFS - open,read,write on special files like char and block are no longer permitted - io-stat xlator can dump stat info only to /var/run/gluster directory @@ -30,8 +31,8 @@ brick hosts, will fix the security issues. ## Major issues 1. Bug [#1601356](https://bugzilla.redhat.com/show_bug.cgi?id=1601356) titled "Problem with SSL/TLS encryption", -is **not** yet fixed with this release. Patch to fix the same is in progress and -can be tracked [here](https://review.gluster.org/c/glusterfs/+/20993). + is **not** yet fixed with this release. Patch to fix the same is in progress and + can be tracked [here](https://review.gluster.org/c/glusterfs/+/20993). ## Bugs addressed @@ -41,6 +42,5 @@ Bugs addressed since release-4.1.3 are listed below. - [#1625095](https://bugzilla.redhat.com/1625095): Files can be renamed outside volume - [#1625096](https://bugzilla.redhat.com/1625096): I/O to arbitrary devices on storage server - [#1625097](https://bugzilla.redhat.com/1625097): Stack-based buffer overflow in server-rpc-fops.c allows remote attackers to execute arbitrary code -- [#1625102](https://bugzilla.redhat.com/1625102): Information Exposure in posix_get_file_contents function in posix-helpers.c +- [#1625102](https://bugzilla.redhat.com/1625102): Information Exposure in posix_get_file_contents function in posix-helpers.c - [#1625106](https://bugzilla.redhat.com/1625106): Unsanitized file names in debug/io-stats translator can allow remote attackers to execute arbitrary code - diff --git a/docs/release-notes/4.1.6.md b/docs/release-notes/4.1.6.md index 04d774b..e496e7a 100644 --- a/docs/release-notes/4.1.6.md +++ b/docs/release-notes/4.1.6.md @@ -10,6 +10,7 @@ features that were added and bugs fixed in the GlusterFS 4.1 stable release. This release contains fixes for several security vulnerabilities in Gluster as follows, + - https://nvd.nist.gov/vuln/detail/CVE-2018-14651 - https://nvd.nist.gov/vuln/detail/CVE-2018-14652 - https://nvd.nist.gov/vuln/detail/CVE-2018-14653 @@ -34,13 +35,13 @@ Bugs addressed since release-4.1.5 are listed below. 
- [#1636218](https://bugzilla.redhat.com/1636218): [SNAPSHOT]: with brick multiplexing, snapshot restore will make glusterd send wrong volfile - [#1637953](https://bugzilla.redhat.com/1637953): data-self-heal in arbiter volume results in stale locks. - [#1641761](https://bugzilla.redhat.com/1641761): Spurious failures in bug-1637802-arbiter-stale-data-heal-lock.t -- [#1643052](https://bugzilla.redhat.com/1643052): Seeing defunt translator and discrepancy in volume info when issued from node which doesn't host bricks in that volume +- [#1643052](https://bugzilla.redhat.com/1643052): Seeing defunt translator and discrepancy in volume info when issued from node which doesn't host bricks in that volume - [#1643075](https://bugzilla.redhat.com/1643075): tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t failing - [#1643929](https://bugzilla.redhat.com/1643929): geo-rep: gluster-mountbroker status crashes - [#1644163](https://bugzilla.redhat.com/1644163): geo-rep: geo-replication gets stuck after file rename and gfid conflict - [#1644474](https://bugzilla.redhat.com/1644474): afr/lease: Read child nodes from lease structure - [#1644516](https://bugzilla.redhat.com/1644516): geo-rep: gluster-mountbroker status crashes -- [#1644518](https://bugzilla.redhat.com/1644518): [Geo-Replication] Geo-rep faulty sesion because of the directories are not synced to slave. +- [#1644518](https://bugzilla.redhat.com/1644518): [Geo-Replication] Geo-rep faulty sesion because of the directories are not synced to slave. - [#1644524](https://bugzilla.redhat.com/1644524): Excessive logging in posix_update_utime_in_mdata - [#1645363](https://bugzilla.redhat.com/1645363): CVE-2018-14652 glusterfs: Buffer overflow in "features/locks" translator allows for denial of service [fedora-all] - [#1646200](https://bugzilla.redhat.com/1646200): CVE-2018-14654 glusterfs: "features/index" translator can create arbitrary, empty files [fedora-all] diff --git a/docs/release-notes/4.1.7.md b/docs/release-notes/4.1.7.md index c3e3dac..baf286e 100644 --- a/docs/release-notes/4.1.7.md +++ b/docs/release-notes/4.1.7.md @@ -20,7 +20,7 @@ Bugs addressed since release-4.1.6 are listed below. - [#1654118](https://bugzilla.redhat.com/1654118): [geo-rep]: Failover / Failback shows fault status in a non-root setup - [#1654229](https://bugzilla.redhat.com/1654229): Provide an option to silence glfsheal logs -- [#1655527](https://bugzilla.redhat.com/1655527): Incorrect usage of local->fd in afr_open_ftruncate_cbk +- [#1655527](https://bugzilla.redhat.com/1655527): Incorrect usage of local->fd in afr_open_ftruncate_cbk - [#1655532](https://bugzilla.redhat.com/1655532): Tracker bug for all leases related issues - [#1655561](https://bugzilla.redhat.com/1655561): gfid heal does not happen when there is no source brick - [#1662635](https://bugzilla.redhat.com/1662635): Fix tests/bugs/shard/zero-flag.t diff --git a/docs/release-notes/4.1.9.md b/docs/release-notes/4.1.9.md index 5c51709..f0e96e8 100644 --- a/docs/release-notes/4.1.9.md +++ b/docs/release-notes/4.1.9.md @@ -1,6 +1,6 @@ # Release notes for Gluster 4.1.9 -This is a bugfix release. The release notes for [4.1.0](4.1.0.md), [4.1.1](4.1.1.md), +This is a bugfix release. 
The release notes for [4.1.0](4.1.0.md), [4.1.1](4.1.1.md), [4.1.2](4.1.2.md), [4.1.3](4.1.3.md), [4.1.4](4.1.4.md), [4.1.5](4.1.5.md), [4.1.6](4.1.6.md), [4.1.7](4.1.7.md) and [4.1.8](4.1.8.md) contains a listing of all the new features that were added and bugs fixed diff --git a/docs/release-notes/5.0.md b/docs/release-notes/5.0.md index d096a26..41f647a 100644 --- a/docs/release-notes/5.0.md +++ b/docs/release-notes/5.0.md @@ -14,15 +14,15 @@ A full list of bugs that have been addressed is included further below. ## Announcements 1. Releases that receive maintenance updates post release 5 are, 4.1 -([reference](https://www.gluster.org/release-schedule/)) + ([reference](https://www.gluster.org/release-schedule/)) **NOTE:** 3.12 long term maintenance release, will reach end of life (EOL) with the release of 5.0. ([reference](https://www.gluster.org/release-schedule/)) 2. Release 5 will receive maintenance updates around the 10th of every month -for the first 3 months post release (i.e Nov'18, Dec'18, Jan'18). Post the -initial 3 months, it will receive maintenance updates every 2 months till EOL. -([reference](https://lists.gluster.org/pipermail/announce/2018-July/000103.html)) + for the first 3 months post release (i.e Nov'18, Dec'18, Jan'18). Post the + initial 3 months, it will receive maintenance updates every 2 months till EOL. + ([reference](https://lists.gluster.org/pipermail/announce/2018-July/000103.html)) ## Major changes and features @@ -44,27 +44,27 @@ Features are categorized into the following sections, The following major changes have been committed to GlusterD2 since v4.1.0. 1. Volume snapshots : Most snapshot operations are available including create, -delete, activate, deactivate, clone and restore. + delete, activate, deactivate, clone and restore. 2. Volume heal: Support for full heal and index heal for replicate volumes has -been implemented. + been implemented. 3. Tracing with Opencensus: Support for tracing distributed operations has been -implemented in GD2, using the Opencensus API. Tracing instrumentation has been -done for volume create, list and delete operations. Other operations will -follow subsequently. + implemented in GD2, using the Opencensus API. Tracing instrumentation has been + done for volume create, list and delete operations. Other operations will + follow subsequently. 4. Portmap refactoring: Portmap in GlisterD2 no longer selects a port for the -bricks to listen on, instead leaving the choice upto the bricks. Portmap only -saves port information provided by brick during signin. + bricks to listen on, instead leaving the choice upto the bricks. Portmap only + saves port information provided by brick during signin. 5. Smartvol API merged with volume create API: The smart volume API which allows -user to create a volume by just specifying a size has been merged with the -normal volume create API. + user to create a volume by just specifying a size has been merged with the + normal volume create API. 6. Configure GlusterD2 with environment variables: In addition to CLI flags, and -the config file, GD2 configuration options can be set using environment -variables. + the config file, GD2 configuration options can be set using environment + variables. In addition to the above, many changes have been merged for minor bug-fixes and to help with testing. @@ -101,6 +101,7 @@ for operations that change may trigger an atime update on the file system objects. 
To enable the feature use, + ``` # gluster volume set features.utime on # gluster volume set features.ctime on @@ -121,6 +122,7 @@ The ctime-invalidation option makes quick-read to prefer ctime over mtime to validate staleness of its cache. To enable this option use, + ``` # gluster volume set ctime-invalidation on ``` @@ -138,6 +140,7 @@ The default value is set at 100, but can be increased to delete more shards in parallel for faster space reclamation. To change the defaults for this option use, + ``` # gluster volume set shard-deletion-rate ``` @@ -171,18 +174,22 @@ to actively address code quality in these areas. ## Major issues 1. The following options are removed from the code base and require to be unset -before an upgrade from releases older than release 4.1.0, + before an upgrade from releases older than release 4.1.0, + - features.lock-heal - features.grace-timeout To check if these options are set use, + ``` # gluster volume info ``` + and ensure that the above options are not part of the `Options Reconfigured:` section in the output of all volumes in the cluster. If these are set, then unset them using the following commands, + ``` # gluster volume reset
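# e.g., with VOLNAME as a placeholder for each affected volume, run once per option:
#   gluster volume reset VOLNAME features.lock-heal
#   gluster volume reset VOLNAME features.grace-timeout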