# Split brain and the ways to deal with it
### Split brain:

Split brain is a situation where two or more replicated copies of a file become divergent. When a file is in split brain, there is an inconsistency in either data or metadata of the file amongst the bricks of a replica, and there is not enough information to authoritatively pick a copy as being pristine and heal the bad copies, despite all bricks being up and online. For a directory, there is also an entry split brain where a file inside it can have a different gfid/file-type across the bricks of a replica.

Split brain can happen mainly because of 2 reasons:

1. Due to network disconnect, where a client temporarily loses connection to the bricks.

    - There is a replica pair of 2 bricks, brick1 on server1 and brick2 on server2.
    - Client1 loses connection to brick2 and client2 loses connection to brick1 due to a network split.
    - Writes from client1 go to brick1 and writes from client2 go to brick2, which is nothing but split-brain.

2. Gluster brick processes going down or returning error:

    - Server1 is down and server2 is up: Writes happen on server2.
    - Server1 comes up, server2 goes down (heal has not happened / data on server2 is not replicated to server1): Writes happen on server1.
    - Server2 comes up: Both server1 and server2 have data independent of each other.

If we use a `replica 2` volume, it is not possible to prevent split-brain without losing availability.

### Ways to deal with split brain:

In glusterfs there are ways to resolve split brain. You can see the detailed description of how to resolve a split-brain [here](../Troubleshooting/resolving-splitbrain.md). Moreover, there are ways to reduce the chances of ending up in split-brain situations. They are:

1. Volume with `replica 3`
2. Arbiter volume

Both of these use the client-quorum option of glusterfs to avoid split-brain situations.

### Client quorum:

This is a feature implemented in the Automatic File Replication (AFR from here on) module, to prevent split-brains in the I/O path for replicate/distributed-replicate volumes. By default, if the client-quorum is not met for a particular replica subvol, it becomes read-only. The other subvols (in a dist-rep volume) will still have R/W access. [Here](arbiter-volumes-and-quorum.md#client-quorum) you can see more details about client-quorum.
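
For reference, a minimal sketch of setting the client-quorum type from the CLI (`<VOLNAME>` is a placeholder):

```console
# Require a majority of the replica's bricks ('auto') for writes to be allowed.
gluster volume set <VOLNAME> cluster.quorum-type auto
```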
#### Client quorum in replica 2 volumes:

In a `replica 2` volume it is not possible to achieve high availability and consistency at the same time, without sacrificing tolerance to partition. If we set the client-quorum option to auto, then the first brick must always be up, irrespective of the status of the second brick. If only the second brick is up, the subvolume becomes read-only.

If the quorum-type is set to fixed, and the quorum-count is set to 1, then we may end up in split brain.

- Brick1 is up and brick2 is down. Quorum is met and write happens on brick1.
- Brick1 goes down and brick2 comes up (no heal has happened). Quorum is met, write happens on brick2.
- Brick1 comes up. Quorum is met, but both the bricks have independent writes - split-brain.

To avoid this we have to set the quorum-count to 2, which costs availability. Even if we have one replica brick up and running, the quorum is not met and we end up seeing EROFS.
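
As an illustration (`<VOLNAME>` is a placeholder), the fixed quorum described above could be configured like this:

```console
# With quorum-type 'fixed', quorum-count 2 means both bricks of the replica
# pair must be up for writes to succeed (availability is sacrificed).
gluster volume set <VOLNAME> cluster.quorum-type fixed
gluster volume set <VOLNAME> cluster.quorum-count 2
```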
### 1. Replica 3 volume:

When we create a replicated or distributed replicated volume with replica count 3, the cluster.quorum-type option is set to auto by default. That means at least 2 bricks should be up and running to satisfy the quorum and allow the writes. This is the recommended setting for a `replica 3` volume and this should not be changed. Here is how it prevents files from ending up in split brain:

B1, B2, and B3 are the 3 bricks of a replica 3 volume.

1. B1 & B2 are up and B3 is down. Quorum is met and write happens on B1 & B2.
2. B3 comes up and B2 is down. Quorum is met and write happens on B1 & B3.
3. B2 comes up and B1 goes down. Quorum is met. But when a write request comes, AFR sees that B2 & B3 are blaming each other (B2 says that some writes are pending on B3 and B3 says that some writes are pending on B2), therefore the write is not allowed and is failed with EIO.

Command to create a `replica 3` volume:

```sh
gluster volume create <volname> replica 3 host1:brick1 host2:brick2 host3:brick3
```
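
As a follow-up sketch (`<volname>` is a placeholder), the newly created volume has to be started before clients can mount it, and `gluster volume info` can be used to confirm the 1 x 3 replica layout:

```console
# Start the volume and verify its type and brick count.
gluster volume start <volname>
gluster volume info <volname>
```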
You can find more details on arbiter [here](arbiter-volumes-and-quorum.md).

### Differences between replica 3 and arbiter volumes:

1. In case of a replica 3 volume, we store the entire file in all the bricks and it is recommended to have bricks of the same size. But in the case of arbiter, since we do not store data, the size of the arbiter brick is comparatively smaller than the other bricks.
2. Arbiter is a state between a replica 2 and a replica 3 volume. If only the arbiter and one of the other bricks are up, and the arbiter brick blames the other brick, then we cannot proceed with the FOPs.
3. Replica 3 gives higher availability compared to arbiter, because unlike arbiter, replica 3 has a full copy of the data in all 3 bricks.

The arbiter volume is a special subset of replica volumes that is aimed at
preventing split-brains and providing the same consistency guarantees as a normal
`replica 3` volume without consuming 3x space.

The syntax for creating the volume is:

```
# gluster volume create <VOLNAME> replica 2 arbiter 1 <NEW-BRICK> ...
```

**Note**: The earlier syntax used to be ```replica 3 arbiter 1``` but that was
leading to confusion among users about the total no. of data bricks.

For example:

```
# gluster volume create testvol replica 2 arbiter 1 server{1..6}:/bricks/brick
volume create: testvol: success: please start the volume to access data
```
The arbiter brick will store only the file/directory names (i.e. the tree structure)
and extended attributes (metadata) but not any data, i.e. the file size
(as shown by `ls -l`) will be zero bytes. It will also store other gluster
metadata like the `.glusterfs` folder and its contents.

_**Note:** Enabling the arbiter feature **automatically** configures_
_client-quorum to 'auto'. This setting is **not** to be changed._
## Arbiter brick(s) sizing

Since the arbiter brick does not store file data, its disk usage will be considerably
smaller than for the other bricks of the replica. The sizing of the brick will depend on
how many files you plan to store in the volume. A good estimate will be
4KB times the number of files in the replica. Note that the estimate also
depends on the inode space allocated by the underlying filesystem for a given disk size.

The `maxpct` value in XFS for volumes of size 1TB to 50TB is only 5%.
If you want to store, say, 300 million files, 4KB x 300M gives us 1.2TB.
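
As a rough sketch, assuming an XFS-formatted arbiter brick (the device path below is only an example), the inode percentage can be raised at mkfs time so that the brick does not run out of inodes before it runs out of disk space:

```console
# Format the arbiter brick with a larger maximum inode percentage (maxpct).
mkfs.xfs -i maxpct=25 /dev/mapper/vg1-arbiter_brick
```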
greater than 50%, so that two nodes separated from each other do not believe
they have quorum simultaneously. For a two-node plain replica volume, this would
mean both nodes need to be up and running. So there is no notion of HA/failover.

There are users who create a `replica 2` volume from 2 nodes and peer-probe
a 'dummy' node without bricks and enable server quorum with a ratio of 51%.
This does not prevent files from getting into split-brain. For example, if B1
and B2 are the bricks/nodes of the replica and B3 is the dummy node, we can

The following volume set options are used to configure it:

- `cluster.quorum-count`: to specify the number of bricks to be active to participate in quorum.
  If the quorum-type is auto then this option has no significance.

Earlier, when quorum was not met, the replica subvolume turned read-only. But
since [glusterfs-3.13](https://docs.gluster.org/en/latest/release-notes/3.13.0/#addition-of-checks-for-allowing-lookups-in-afr-and-removal-of-clusterquorum-reads-volume-option) and upwards, the subvolume becomes unavailable, i.e. all
the file operations fail with ENOTCONN error instead of becoming EROFS.
This means the ```cluster.quorum-reads``` volume option is also not supported.
## Replica 2 and Replica 3 volumes

From the above descriptions, it is clear that client-quorum cannot really be applied
to a `replica 2` volume (without costing HA).
If the quorum-type is set to auto, then by the description
given earlier, the first brick must always be up, irrespective of the status of the
second brick. In other words, if only the second brick is up, the subvol returns ENOTCONN, i.e. no HA.
If quorum-type is set to fixed, then the quorum-count *has* to be two
to prevent split-brains (otherwise a write can succeed in brick1, another in brick2 => split-brain).
So for all practical purposes, if you want high availability in a `replica 2` volume,
it is recommended not to enable client-quorum.
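
For completeness (`<VOLNAME>` is a placeholder), client-quorum options that were set earlier on a `replica 2` volume can be returned to their defaults with `volume reset`:

```console
# Reset the client-quorum options back to their default values.
gluster volume reset <VOLNAME> cluster.quorum-type
gluster volume reset <VOLNAME> cluster.quorum-count
```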
In a `replica 3` volume, client-quorum is enabled by default and set to 'auto'.
This means 2 bricks need to be up for the write to succeed. Here is how this
configuration prevents files from ending up in split-brain:

Please find instructions mentioned in the file and send a pull request.

Once approved, all your gluster related posts will appear on the [planet.gluster.org](http://planet.gluster.org) website.
## Before filing an issue

If you are facing any issues, these preliminary checks are useful:

- Is SELinux enabled? (you can use `getenforce` to check)
- Are iptables rules blocking any data traffic? (`iptables -L` can
  help check)
- Are all the nodes reachable from each other? [ Network problem ]
- Please search [issues](https://github.com/gluster/glusterfs/issues)
  to see if the bug has already been reported
- If an issue has already been filed for a particular release and you find the issue in another release, add a comment in the issue.

Anyone can search in github issues, you don't need an account. Searching
requires some effort, but helps avoid duplicates, and you may find that
your problem has already been solved.

## Reporting An Issue

- You should have an account with github.com
- Here is the link to file an issue:
  [Github](https://github.com/gluster/glusterfs/issues/new)

_Note: Please go through all the sections below to understand what
information we need to put in a bug report, so that it will help the developer to
root-cause and fix it._

### Required Information
You should gather the information below before creating the bug report.

#### Package Information

- Location from which the packages are used
- Package Info - version of glusterfs package installed

#### Cluster Information

- Number of nodes in the cluster
- Hostnames and IPs of the gluster Node [if it is not a security
  issue]
    - Hostname / IP will help developers in understanding & correlating with the logs
- Output of `gluster peer status`
- Node IP, from which the "x" operation is done
    - "x" here means any operation that causes the issue
#### Volume Information

- Number of volumes
- Volume Names
- Volume on which the particular issue is seen [ if applicable ]
- Type of volumes
- Volume options if available
- Output of `gluster volume info`
- Output of `gluster volume status`
- Get the statedump of the volume with the problem: `gluster volume statedump <vol-name>`

This dumps a statedump per brick process in `/var/run/gluster`.

_NOTE: Collect statedumps from one gluster Node in a directory._

Repeat it in all Nodes containing the bricks of the volume. All the
collected directories can be archived, compressed and attached to the bug.
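
For instance (the statedump file names under `/var/run/gluster` may differ on your system), the dumps from one node can be collected and compressed like this:

```console
# Gather the statedumps generated in /var/run/gluster and archive them
# so they can be attached to the issue.
mkdir -p /tmp/statedumps-$(hostname)
cp /var/run/gluster/*.dump.* /tmp/statedumps-$(hostname)/
tar -czf /tmp/statedumps-$(hostname).tar.gz -C /tmp statedumps-$(hostname)
```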
#### Brick Information

- xfs options when a brick partition was done
    - This could be obtained with this command: `xfs_info /dev/mapper/vg1-brick`
- Extended attributes on the bricks
    - This could be obtained with this command: `getfattr -d -m. -ehex /rhs/brick1/b1`
#### Client Information

- OS Type (Ubuntu, Fedora, RHEL)
- OS Version: In case of a Linux distro, get the following:

    ```console
    uname -r
    cat /etc/issue
    ```

- Fuse or NFS Mount point on the client with output of mount commands
- Output of `df -Th` command

#### Tool Information

- If any tools are used for testing, provide the info/version about it
- If any IO is simulated using a script, provide the script
#### Logs Information

- You can check logs for issues/warnings/errors.
    - Self-heal logs
    - Rebalance logs
    - Glusterd logs
    - Brick logs
    - NFS logs (if applicable)
    - Samba logs (if applicable)
    - Client mount log
- Add the entire logs as attachment, if it is too large to paste as a
  comment.
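
As a rough sketch, assuming the default log location of `/var/log/glusterfs` (which may differ if configured otherwise), the logs can be bundled for attachment like this:

```console
# Archive the glusterfs log directory on the affected node.
tar -czf /tmp/gluster-logs-$(hostname).tar.gz /var/log/glusterfs
```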
#### SOS report for CentOS/Fedora

- Get the sosreport from the involved gluster Node and Client [ in
  case of CentOS/Fedora ]
- Add a meaningful name/IP to the sosreport, by renaming/adding
  hostname/ip to the sosreport name
# Issues Triage Guidelines

- Triaging of issues is an important task; when done correctly, it can
  reduce the time between reporting an issue and the availability of a
  fix enormously.

- Triager should focus on new issues, and try to describe the problem
  as clearly and accurately as possible. The goal of the
  triagers is to reduce the time that developers need to solve the bug
  report.
- A triager is like an assistant that helps with the information
  gathering and possibly the debugging of a new bug report. Because a
  triager helps prepare a bug before a developer gets involved, it
  can be a very nice role for new community members that are
  interested in technical aspects of the software.

- Triagers will stumble upon many different kinds of issues, ranging
  from reports about spelling mistakes, or unclear log messages to
  memory leaks causing crashes or performance issues in environments
  with several hundred storage servers.

Nobody expects that triagers can prepare all bug reports. Therefore most
developers will be able to assist the triagers, answer questions and

**Issue triage can be summarized in the points below:**

- Is the issue a bug? An enhancement request? Or a question? Assign the relevant label.
- Is there enough information in the issue description?
- Is it a duplicate issue?
- Is it assigned to the correct component of GlusterFS?
- Is the bug summary correct?
- Assigning the issue or adding people's github handle in the comment, so they get notified.

The detailed discussion about the above points is below.

## Is there enough information?
It's hard to generalize what makes a good report. For "average"
reporters it is definitely often helpful to have good steps to reproduce,
the GlusterFS software version, and information about the test/production
environment, Linux/GNU distribution.

If the reporter is a developer, steps to reproduce can sometimes be
omitted as context is obvious. _However, this can create a problem for
contributors that need to find their way, hence it is strongly advised
to list the steps to reproduce an issue._

Other tips:

- There should be only one issue per report. Try not to mix related or
  similar looking bugs per report.

- It should be possible to call the described problem fixed at some
  point. "Improve the documentation" or "It runs slow" could never be
  called fixed, while "Documentation should cover the topic Embedding"
  or "The page at <http://en.wikipedia.org/wiki/Example> should load
  in less than five seconds" would have a criterion. A good summary of
  the bug will also help others in finding existing bugs and prevent
  filing of duplicates.

- If the bug is a graphical problem, you may want to ask for a
  screenshot to attach to the bug report. Make sure to ask that the
  screenshot should not contain any confidential information.
## Is it a duplicate?

If you think that you have found a duplicate but you are not totally
sure, just add a comment like "This issue looks related to issue #NNN" (and
replace NNN by the issue-id) so somebody else can take a look and help judging.
## Is it assigned with correct label?

Go through the labels and assign the appropriate label.

## Are the fields correct?

### Description

Sometimes the description does not summarize the bug itself well. You may
want to update the bug summary to make the report distinguishable. A
good title may contain:

- A brief explanation of the root cause (if it was found)
- Some of the symptoms people are experiencing

### Assigning issue or Adding people's github handle in the comment
Minor releases will have guaranteed backwards compatibility with earlier minor releases.

Each GlusterFS major release has a 4-6 month release window, in which changes get merged. This window is split into two phases.

1. An Open phase, where all changes get merged
2. A Stability phase, where only changes that stabilize the release get merged.

The first 2-4 months of a release window will be the Open phase, and the last month will be the Stability phase.

All changes will be accepted during the Open phase. The changes have a few requirements:

- a change fixing a bug SHOULD have a public test case
- a change introducing a new feature MUST have a disable switch that can disable the feature during a build
#### Stability phase

This phase is used to stabilize any new features introduced in the open phase, or general bug fixes for already existing features.

A new `release-<version>` branch is created at the beginning of this phase. All changes need to be sent to the master branch before getting backported to the new release branch.

Patches that do not satisfy the above requirements can still be submitted for review, but cannot be merged.

## Release procedure

This procedure is followed by a release maintainer/manager, to perform the actual release.

The release procedure for both major releases and minor releases is nearly the same.

The procedure for the major releases starts at the beginning of the Stability phase.

_TODO: Add the release verification procedure_

### Release steps

The release-manager needs to follow these steps to actually perform the release once ready.

#### Create tarball

4. create the tarball with the [release job in Jenkins](http://build.gluster.org/job/release/)
#### Notify packagers

Notify the packagers that we need packages created. Provide the link to the source tarball from the Jenkins release job to the [packagers mailinglist](mailto:packaging@gluster.org). A list of the people involved in the package maintenance for the different distributions is in the `MAINTAINERS` file in the sources, all of them should be subscribed to the packagers mailinglist.

#### Create a new Tracker Bug for the next release

The tracker bugs are used as guidance for blocker bugs and should get created when a release is made. To create one:

- Create a [new milestone](https://github.com/gluster/glusterfs/milestones/new)
- Issues that were not fixed in the previous release, but are in its milestone, should be moved to the new milestone.
#### Create Release Announcement

(Major releases)
The Release Announcement is based on the release notes. This needs to indicate:

- What this release's overall focus is
- Which versions will stop receiving updates as of this release
- Links to the direct download folder
- Feature set

Best practice as of version-8 is to create a collaborative version of the release notes that both the release manager and community lead work on together, and the release manager posts to the mailing lists (gluster-users@, gluster-devel@, announce@).

#### Create Upgrade Guide

(Major releases)
If required, as in the case of a major release, an upgrade guide needs to be available at the same time as the release.
This document should go under the [Upgrade Guide](https://github.com/gluster/glusterdocs/tree/master/Upgrade-Guide) section of the [glusterdocs](https://github.com/gluster/glusterdocs) repository.

#### Send Release Announcement

Once the Fedora/EL RPMs are ready (and any others that are ready by then), send the release announcement:

- Gluster Mailing lists
    - [gluster-announce](https://lists.gluster.org/mailman/listinfo/announce/)
    - [gluster-devel](https://lists.gluster.org/mailman/listinfo/gluster-devel)
    - [gluster-users](https://lists.gluster.org/mailman/listinfo/gluster-users/)
- [Gluster Blog](https://planet.gluster.org/)
  The blog will automatically post to both Facebook and Twitter. Be careful with this!
- [Gluster Twitter account](https://twitter.com/gluster)
- [Gluster Facebook page](https://www.facebook.com/GlusterInc)
- [Gluster LinkedIn group](https://www.linkedin.com/company/gluster/about/)
### Guidelines that Maintainers are expected to adhere to

1. Ensure qualitative and timely management of patches sent for review.

2. For merging patches into the repository, it is expected of maintainers to:

    - Merge patches of owned components only.
    - Seek approvals from all maintainers before merging a patchset spanning
      multiple components.
    - Not merge patches written by themselves until there is a +2 Code Review
      vote by other reviewers.
3. The responsibility of merging a patch into a release branch in normal
   circumstances will be that of the release maintainer's. Only in exceptional
   situations, maintainers & sub-maintainers will merge patches into a release
   branch.

4. Release maintainers will ensure approval from appropriate maintainers before
   merging a patch into a release branch.

5. Maintainers have a responsibility to the community, it is expected of maintainers to:

    - Facilitate the community in all aspects.
    - Be very active and visible in the community.
    - Be objective and consider the larger interests of the community ahead of

Github can be used to list patches that need reviews and/or can get
merged from [Pull Requests](https://github.com/gluster/glusterfs/pulls)
# Workflow Guide

## Bug Handling

- [Bug reporting guidelines](./Bug-Reporting-Guidelines.md) -
  Guideline for reporting a bug in GlusterFS
- [Bug triage guidelines](./Bug-Triage.md) - Guideline on how to
  triage bugs for GlusterFS
## Release Process

- [GlusterFS Release process](./GlusterFS-Release-process.md) -
  Our release process / checklist

## Patch Acceptance

- The [Guidelines For Maintainers](./Guidelines-For-Maintainers.md) explains when
  maintainers can merge patches.

## Blogging about gluster

- The [Adding your gluster blog](./Adding-your-blog.md) explains how to add your
  gluster blog to Community blogger.
# Backport Guidelines

In the GlusterFS project, as a policy, any new change, bug fix, etc., must land
in the 'devel' branch before release branches. When a bug is fixed in
the devel branch, it might be desirable or necessary in a release branch.
## Policy

- No feature from devel would be backported to the release branch
- CVE, i.e., security vulnerability [(listed on the CVE database)](https://cve.mitre.org/cve/search_cve_list.html)
  reported in the existing releases would be backported, after getting fixed
  in the devel branch.
- Only topics which bring about data loss or unavailability would be
  backported to the release.
- For any other issues, the project recommends that the installation be
  upgraded to a newer release where the specific bug has been addressed.
- Gluster provides 'rolling' upgrade support, i.e., one can upgrade their
  server version without stopping the application I/O, so we recommend migrating
  to a higher version.
## Things to pay attention to while backporting a patch.
If your patch meets the criteria above, or you are a user who prefers to have a
fix backported because your current setup is facing issues, below are the
steps you need to take to submit a patch on a release branch.

- The patch should have the same 'Change-Id'.
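
For illustration (the branch name and commit hash are placeholders), a backport that keeps the original 'Change-Id' can be prepared with a cherry-pick, since the original commit message - including its Change-Id: line - is carried over:

```console
# Create a local branch tracking the target release branch (name is an example).
git fetch upstream
git checkout -b backport-issueNNNN upstream/release-11

# Cherry-pick the commit that already landed in devel; -x records the origin commit.
git cherry-pick -x <devel-commit-sha>
```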
### How to contact release owners?

All release owners are part of the 'gluster-devel@gluster.org' mailing list.
Please write your expectations for the next release there, so we can take them
into consideration while making the release.

This page describes how to build and install GlusterFS.

The following packages are required for building GlusterFS:

- GNU Autotools
    - Automake
    - Autoconf
    - Libtool
- lex (generally flex)
- GNU Bison
- OpenSSL

```
cd extras/LinuxRPM
make glusterrpms
```

This will create rpms from the source in 'extras/LinuxRPM'. _(Note: You
will need to install the rpmbuild requirements including rpmbuild and
mock)_<br>
For CentOS / Enterprise Linux 8 the dependencies can be installed via:
# Developers

### Contributing to the Gluster community

---

Are you itching to send in patches and participate as a developer in the
Gluster community? Here are a number of starting points for getting
involved. All you need is your 'github' account to be handy.

Remember that the [Gluster community](https://github.com/gluster) has multiple projects, each of which has its own way of handling PRs and patches. Decide on which project you want to contribute to. The documents below are mostly about the 'GlusterFS' project, which is the core of the Gluster Community.
## Workflow

- [Simplified Developer Workflow](./Simplified-Development-Workflow.md)
    - A simpler and faster intro to developing with GlusterFS, than the document below
- [Developer Workflow](./Development-Workflow.md)
    - Covers detail about requirements from a patch; tools and toolkits used by developers.
      This is recommended reading in order to begin contributions to the project.
- [GD2 Developer Workflow](https://github.com/gluster/glusterd2/blob/master/doc/development-guide.md)
    - Helps in on-boarding developers to contribute in GlusterD2 project.

## Compiling Gluster

- [Building GlusterFS](./Building-GlusterFS.md) - How to compile
  Gluster from source code.

## Developing

- [Projects](./Projects.md) - Ideas for projects you could
  create
- [Fixing issues reported by tools for static code
  analysis](./Fixing-issues-reported-by-tools-for-static-code-analysis.md)
    - This is a good starting point for developers to fix bugs in GlusterFS project.

## Releases and Backports

- [Backport Guidelines](./Backport-Guidelines.md) describes the steps to follow when a fix needs to land in release branches too.

Some more GlusterFS Developer documentation can be found [in glusterfs documentation directory](https://github.com/gluster/glusterfs/tree/master/doc/developer-guide)
# Development workflow of Gluster

This document provides a detailed overview of the development model
followed by the GlusterFS project. For a simpler overview visit
[Simplified development workflow](./Simplified-Development-Workflow.md).

## Basics
The GlusterFS development model largely revolves around the features and
functionality provided by Git version control system, Github and Jenkins

'regression' job which is designed to execute test scripts provided as
part of the code change.

## Preparatory Setup

Here is a list of initial one-time steps before you can start hacking on
code.

Fork [GlusterFS repository](https://github.com/gluster/glusterfs/fork)

Get yourself a working tree by cloning the development repository from

```console
git clone git@github.com:${username}/glusterfs.git
cd glusterfs/
git remote add upstream git@github.com:gluster/glusterfs.git
```
### Preferred email and set username

Set up a filter rule in your mail client to tag or classify emails with
the header

```text
list: <glusterfs.gluster.github.com>
```

as mails originating from the github system.
## Development & Other flows

### Issue

- Make sure clang-format is installed and is run on the patch.

### Keep up-to-date

- GlusterFS is a large project with many developers, so there will be one patch or another merged every day.
- It is critical for a developer to be up-to-date with the devel repo to be conflict-free when a PR is opened.
- Git provides many options to keep up-to-date; below is one of them:

```console
git fetch upstream
git rebase upstream/devel
```
## Branching policy

This section describes both the branching policies on the public repo
as well as the suggested best-practice for local branching

change. The name of the branch on your personal fork can start with issueNNNN,
followed by anything of your choice. If you are submitting changes to the devel
branch, first create a local task branch like this -

```{ .console .no-copy }
# git checkout -b issueNNNN upstream/main
... <hack, commit>
```
## Building

### Environment Setup

For environment setup, refer: [Building GlusterFS](./Building-GlusterFS.md)

Once the required packages are installed for your appropriate system,
generate the build configuration:

```console
./autogen.sh
./configure --enable-fusermount
```

### Build and install

```console
make && make install
```
## Commit policy / PR description

Typically you would have a local branch per task. You will need to
sign-off your commit (git commit -s) before sending the patch for review.

Provide a meaningful commit message. Your commit message should be in
the following format:
- A short one-line title of format 'component: title', describing what the patch accomplishes
- An empty line following the subject
- Situation necessitating the patch
- Description of the code changes
- Reason for doing it this way (compared to others)
- Description of test cases
- When you open a PR, having a reference Issue for the commit is mandatory in GlusterFS.
- The commit message can have either Fixes: #NNNN or Updates: #NNNN in a separate line in the commit message.
  Here, NNNN is the Issue ID in the glusterfs repository.
- Each commit needs the author to have the 'Signed-off-by: Name <email>' line.
  You can do this with the -s option for git commit.
- If the PR is not ready for review, apply the label work-in-progress.
  Check whether the "Draft PR" option is available to you; if yes, use that instead.
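
As an illustration (a hypothetical example), a commit message following the format above could look like this:

```text
component: short one-line title describing what the patch accomplishes

Describe the situation necessitating the patch, the code changes made,
and the reason for doing it this way compared to other approaches.

Describe the test cases added or updated.

Fixes: #NNNN
Signed-off-by: Your Name <you@example.com>
```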
## Push the change

After doing the local commit, it is time to submit the code for review.
There is a script available inside glusterfs.git called rfc.sh. It is
recommended you keep pushing to your repo every day, so you don't lose
any work. You can submit your changes for review by simply executing

```console
./rfc.sh
```
or

```console
git push origin HEAD:issueNNN
```

This script rfc.sh does the following:
- The first time it is executed, it downloads a git hook from
  <http://review.gluster.org/tools/hooks/commit-msg> and sets it up
  locally to generate a Change-Id: tag in your commit message (if it
  was not already generated.)
- Rebase your commit against the latest upstream HEAD. This rebase
  also causes your commits to undergo massaging from the just
  downloaded commit-msg hook.
- Prompt for a Reference Id for each commit (if it was not already provided)
  and include it as a "fixes: #n" tag in the commit log. You can just hit
  <enter> at this prompt if your submission is purely for review
  purposes.
- Push the changes for review. On a successful push, you will see a URL pointing to
  the change in [Pull requests](https://github.com/gluster/glusterfs/pulls) section.
## Test cases and Verification

---

### Auto-triggered tests

To check and run all regression tests locally, run the below script
from the glusterfs root directory.

```console
./run-tests.sh
```
To run a single regression test locally, run the below command.

```console
prove -vf <path_to_the_file>
```

**NOTE:** The testing framework needs the perl-Test-Harness package to be installed.
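
For example (the package manager depends on your distribution), on Fedora/EL systems it can be installed with:

```console
# Install the Test::Harness framework used by run-tests.sh and prove.
dnf install -y perl-Test-Harness
```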
of the feature. Please go through glusto-tests project to understand
more information on how to write and execute the tests in glusto.

1. Extend/Modify old test cases in existing scripts - This is typically
   when present behavior (default values etc.) of code is changed.

2. No test cases - This is typically when a code change is trivial
   (e.g. fixing typos in output strings, code comments).

3. Only test case and no code change - This is typically when we are
   adding test cases to old code (already existing before this regression
   test policy was enforced). More details on how to work with test case
   scripts can be found in tests/README.
## Reviewing / Commenting

Code review with Github is relatively easy compared to other available
tools. Each change is presented as multiple files and each file can be

on each line by clicking on '+' icon and writing in your comments in
the text box. Such in-line comments are saved as drafts, till you
finally publish them by Starting a Review.
## Incorporate, rfc.sh, Reverify

Code review comments are notified via email. After incorporating the
changes in code, you can mark each of the inline comments as 'done',
and update the commits in the same branch with -

```console
git commit -a -s
```
Push the commit by executing rfc.sh. If your previous push was an "rfc"
push (i.e., without an Issue Id) you will be prompted for an Issue Id
again. You can re-push an rfc change without any other code change too

comments can be made on the new patch as well, and the same cycle repeats.

If no further changes are necessary, the reviewer can approve the patch.
|
||||
|
||||
##Submission Qualifiers
|
||||
-----------------------
|
||||
## Submission Qualifiers
|
||||
|
||||
GlusterFS project follows 'Squash and Merge' method.
|
||||
|
||||
@@ -350,8 +348,7 @@ The project maintainer will merge the changes once a patch
|
||||
meets these qualifiers. If you feel there is delay, feel free
|
||||
to add a comment, discuss the same in Slack channel, or send email.
|
||||
|
||||
##Submission Disqualifiers
|
||||
--------------------------
|
||||
## Submission Disqualifiers
|
||||
|
||||
- +2 : is equivalent to "Approve" from the people in the maintainer's group.
|
||||
- +1 : can be given by a maintainer/reviewer by explicitly stating that in the comment.
|
||||
|
||||
@@ -2,8 +2,8 @@
|
||||
|
||||
Fixing easy issues is an excellent method to start contributing patches to Gluster.
|
||||
|
||||
Sometimes an *Easy Fix* issue has a patch attached. In those cases,
|
||||
the *Patch* keyword has been added to the bug. These bugs can be
|
||||
Sometimes an _Easy Fix_ issue has a patch attached. In those cases,
|
||||
the _Patch_ keyword has been added to the bug. These bugs can be
|
||||
used by new contributors that would like to verify their workflow. [Bug
|
||||
1099645](https://bugzilla.redhat.com/1099645) is one example of those.
|
||||
|
||||
@@ -11,12 +11,12 @@ All such issues can be found [here](https://github.com/gluster/glusterfs/labels/
|
||||
|
||||
### Guidelines for new comers
|
||||
|
||||
- While trying to write a patch, do not hesitate to ask questions.
|
||||
- If something in the documentation is unclear, we do need to know so
|
||||
that we can improve it.
|
||||
- There are no stupid questions, and it's more stupid to not ask
|
||||
questions that others can easily answer. Always assume that if you
|
||||
have a question, someone else would like to hear the answer too.
|
||||
- While trying to write a patch, do not hesitate to ask questions.
|
||||
- If something in the documentation is unclear, we do need to know so
|
||||
that we can improve it.
|
||||
- There are no stupid questions, and it's more stupid to not ask
|
||||
questions that others can easily answer. Always assume that if you
|
||||
have a question, someone else would like to hear the answer too.
|
||||
|
||||
[Reach out](https://www.gluster.org/community/) to the developers
|
||||
in #gluster on [Gluster Slack](https://gluster.slack.com) channel, or on
|
||||
|
||||
@@ -1,7 +1,6 @@
|
||||
Static Code Analysis Tools
|
||||
--------------------------
|
||||
## Static Code Analysis Tools
|
||||
|
||||
Bug fixes for issues reported by *Static Code Analysis Tools* should
|
||||
Bug fixes for issues reported by _Static Code Analysis Tools_ should
|
||||
follow [Development Work Flow](./Development-Workflow.md)
|
||||
|
||||
### Coverity
|
||||
@@ -9,49 +8,48 @@ follow [Development Work Flow](./Development-Workflow.md)
|
||||
GlusterFS is part of [Coverity's](https://scan.coverity.com/) scan
|
||||
program.
|
||||
|
||||
- To see Coverity issues you have to be a member of the GlusterFS
|
||||
project in Coverity scan website.
|
||||
- Here is the link to [Coverity scan website](https://scan.coverity.com/projects/987)
|
||||
- Go to above link and subscribe to GlusterFS project (as
|
||||
contributor). It will send a request to Admin for including you in
|
||||
the Project.
|
||||
- Once admins for the GlusterFS Coverity scan approve your request,
|
||||
you will be able to see the defects raised by Coverity.
|
||||
- [Issue #1060](https://github.com/gluster/glusterfs/issues/1060)
|
||||
can be used as a umbrella bug for Coverity issues in master
|
||||
branch unless you are trying to fix a specific issue.
|
||||
- When you decide to work on some issue, please assign it to your name
|
||||
in the same Coverity website. So that we don't step on each others
|
||||
work.
|
||||
- When marking a bug intentional in Coverity scan website, please put
|
||||
an explanation for the same. So that it will help others to
|
||||
understand the reasoning behind it.
|
||||
- To see Coverity issues you have to be a member of the GlusterFS
|
||||
project in Coverity scan website.
|
||||
- Here is the link to [Coverity scan website](https://scan.coverity.com/projects/987)
|
||||
- Go to above link and subscribe to GlusterFS project (as
|
||||
contributor). It will send a request to Admin for including you in
|
||||
the Project.
|
||||
- Once admins for the GlusterFS Coverity scan approve your request,
|
||||
you will be able to see the defects raised by Coverity.
|
||||
- [Issue #1060](https://github.com/gluster/glusterfs/issues/1060)
  can be used as an umbrella bug for Coverity issues in the master
  branch unless you are trying to fix a specific issue.
- When you decide to work on some issue, please assign it to your name
  in the same Coverity website, so that we don't step on each other's
  work.
- When marking a bug intentional in the Coverity scan website, please put
  an explanation for the same, so that it will help others to
  understand the reasoning behind it.
|
||||
|
||||
*If you have more questions please send it to
|
||||
_If you have more questions please send it to
|
||||
[gluster-devel](https://lists.gluster.org/mailman/listinfo/gluster-devel) mailing
|
||||
list*
|
||||
list_
|
||||
|
||||
### CPP Check
|
||||
|
||||
Cppcheck is available in Fedora and EL's EPEL repo
|
||||
|
||||
- Install Cppcheck
|
||||
- Install Cppcheck
|
||||
|
||||
# dnf install cppcheck
|
||||
dnf install cppcheck
|
||||
|
||||
- Clone GlusterFS code
|
||||
- Clone GlusterFS code
|
||||
|
||||
# git clone https://github.com/gluster/glusterfs
|
||||
git clone https://github.com/gluster/glusterfs
|
||||
|
||||
- Run Cpp check
|
||||
|
||||
# cppcheck glusterfs/ 2>cppcheck.log
|
||||
- Run Cpp check
|
||||
|
||||
cppcheck glusterfs/ 2>cppcheck.log
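The resulting log can then be filtered for findings worth fixing; a simple sketch (the pattern is only a rough filter over cppcheck's output):

```console
grep -E "error|warning" cppcheck.log
```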
|
||||
|
||||
### Clang-Scan Daily Runs
|
||||
|
||||
We have daily runs of static source code analysis tool clang-scan on
|
||||
the glusterfs sources. There are daily analyses of the master and
|
||||
the glusterfs sources. There are daily analyses of the master and
|
||||
on currently supported branches.
|
||||
|
||||
Results are posted at
|
||||
|
||||
@@ -3,9 +3,7 @@
|
||||
This page contains a list of project ideas which will be suitable for
|
||||
students (for GSOC, internship etc.)
|
||||
|
||||
Projects/Features which needs contributors
|
||||
------------------------------------------
|
||||
|
||||
## Projects/Features which needs contributors
|
||||
|
||||
### RIO
|
||||
|
||||
@@ -13,27 +11,23 @@ Issue: https://github.com/gluster/glusterfs/issues/243
|
||||
|
||||
This is a new distribution logic, which can scale Gluster to 1000s of nodes.
|
||||
|
||||
|
||||
### Composition xlator for small files
|
||||
|
||||
Merge small files into a designated large file using our own custom
|
||||
semantics. This can improve our small file performance.
|
||||
|
||||
|
||||
### Path based geo-replication
|
||||
|
||||
Issue: https://github.com/gluster/glusterfs/issues/460
|
||||
|
||||
This would allow the remote volume to be of a different type (NFS/S3, etc.) too.
|
||||
|
||||
|
||||
### Project Quota support
|
||||
|
||||
Issue: https://github.com/gluster/glusterfs/issues/184
|
||||
|
||||
This will make Gluster's Quota faster, and also provide desired behavior.
|
||||
|
||||
|
||||
### Cluster testing framework based on gluster-tester
|
||||
|
||||
Repo: https://github.com/aravindavk/gluster-tester
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
Simplified development workflow for GlusterFS
|
||||
=============================================
|
||||
# Simplified development workflow for GlusterFS
|
||||
|
||||
This page gives a simplified model of the development workflow used by
|
||||
the GlusterFS project. This will give the steps required to get a patch
|
||||
@@ -8,8 +7,7 @@ accepted into the GlusterFS source.
|
||||
Visit [Development Work Flow](./Development-Workflow.md) for a more
detailed description of the workflow.
|
||||
|
||||
##Initial preparation
|
||||
---------------------
|
||||
## Initial preparation
|
||||
|
||||
The GlusterFS development workflow revolves around
|
||||
[GitHub](http://github.com/gluster/glusterfs/) and
|
||||
@@ -17,13 +15,15 @@ The GlusterFS development workflow revolves around
|
||||
Using both of these tools requires some initial preparation.
|
||||
|
||||
### Get the source
|
||||
|
||||
Git clone the GlusterFS source using
|
||||
|
||||
```console
|
||||
git clone git@github.com:${username}/glusterfs.git
|
||||
cd glusterfs/
|
||||
git remote add upstream git@github.com:gluster/glusterfs.git
|
||||
```{ .console .no-copy }
|
||||
git clone git@github.com:${username}/glusterfs.git
|
||||
cd glusterfs/
|
||||
git remote add upstream git@github.com:gluster/glusterfs.git
|
||||
```
|
||||
|
||||
This will clone the GlusterFS source into a subdirectory named glusterfs
|
||||
with the devel branch checked out.
|
||||
|
||||
@@ -34,7 +34,7 @@ distribution specific package manger to install git. After installation
|
||||
configure git. At the minimum, set a git user email. To set the email
|
||||
do,
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
git config --global user.name <name>
|
||||
git config --global user.email <email address>
|
||||
```
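For example (the name and address below are placeholders):

```{ .console .no-copy }
git config --global user.name "Jane Developer"
git config --global user.email jane@example.com
```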
|
||||
@@ -43,8 +43,7 @@ Next, install the build requirements for GlusterFS. Refer
|
||||
[Building GlusterFS - Build Requirements](./Building-GlusterFS.md#Build Requirements)
|
||||
for the actual requirements.
|
||||
|
||||
##Actual development
|
||||
--------------------
|
||||
## Actual development
|
||||
|
||||
The commands in this section are to be run inside the glusterfs source
|
||||
directory.
|
||||
@@ -55,23 +54,25 @@ It is recommended to use separate local development branches for each
|
||||
change you want to contribute to GlusterFS. To create a development
|
||||
branch, first checkout the upstream branch you want to work on and
|
||||
update it. More details on the upstream branching model for GlusterFS
|
||||
can be found at [Development Work Flow - Branching\_policy](./Development-Workflow.md#branching-policy).
|
||||
can be found at [Development Work Flow - Branching_policy](./Development-Workflow.md#branching-policy).
|
||||
For example if you want to develop on the devel branch,
|
||||
|
||||
```console
|
||||
# git checkout devel
|
||||
# git pull
|
||||
git checkout devel
|
||||
git pull
|
||||
```
|
||||
|
||||
Now, create a new branch from devel and switch to the new branch. It is
|
||||
recommended to have descriptive branch names. Do,
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
git branch issueNNNN
|
||||
git checkout issueNNNN
|
||||
```
|
||||
|
||||
or,
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
git checkout -b issueNNNN upstream/main
|
||||
```
|
||||
|
||||
@@ -100,8 +101,8 @@ working GlusterFS installation and needs to be run as root. To run the
|
||||
regression test suite, do
|
||||
|
||||
```console
|
||||
# make install
|
||||
# ./run-tests.sh
|
||||
make install
|
||||
./run-tests.sh
|
||||
```
|
||||
|
||||
or, after uploading the patch, the regression tests would be triggered
|
||||
@@ -113,7 +114,7 @@ If you haven't broken anything, you can now commit your changes. First
|
||||
identify the files that you modified/added/deleted using git-status and
|
||||
stage these files.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
git status
|
||||
git add <list of modified files>
|
||||
```
|
||||
@@ -121,7 +122,7 @@ git add <list of modified files>
|
||||
Now, commit these changes using
|
||||
|
||||
```console
|
||||
# git commit -s
|
||||
git commit -s
|
||||
```
|
||||
|
||||
Provide a meaningful commit message. The commit message policy is
|
||||
@@ -134,18 +135,19 @@ sign-off the commit with your configured email.
|
||||
To submit your change for review, run the rfc.sh script,
|
||||
|
||||
```console
|
||||
# ./rfc.sh
|
||||
./rfc.sh
|
||||
```
|
||||
|
||||
or
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
git push origin HEAD:issueNNN
|
||||
```
|
||||
|
||||
More details on the rfc.sh script are available at
|
||||
[Development Work Flow - rfc.sh](./Development-Workflow.md#rfc.sh).
|
||||
|
||||
##Review process
|
||||
----------------
|
||||
## Review process
|
||||
|
||||
Your change will now be reviewed by the GlusterFS maintainers and
|
||||
component owners. You can follow and take part in the review process
|
||||
@@ -186,8 +188,9 @@ review comments. Build and test to see if the new changes are working.
|
||||
Stage your changes and commit your new changes in new commits using,
|
||||
|
||||
```console
|
||||
# git commit -a -s
|
||||
git commit -a -s
|
||||
```
|
||||
|
||||
Now you can resubmit the commit for review using the rfc.sh script or git push.
|
||||
|
||||
The formal review process could take a long time. To increase chances
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
How to compile GlusterFS RPMs from git source, for RHEL/CentOS, and Fedora
|
||||
--------------------------------------------------------------------------
|
||||
## How to compile GlusterFS RPMs from git source, for RHEL/CentOS, and Fedora
|
||||
|
||||
Creating RPMs of GlusterFS from git source is fairly easy, once you know the steps.
|
||||
|
||||
@@ -21,13 +20,13 @@ Specific instructions for compiling are below. If you're using:
|
||||
|
||||
### Preparation steps for Fedora 16-20 (only)
|
||||
|
||||
1. Install gcc, the python development headers, and python setuptools:
|
||||
1. Install gcc, the python development headers, and python setuptools:
|
||||
|
||||
# sudo yum -y install gcc python-devel python-setuptools
|
||||
sudo yum -y install gcc python-devel python-setuptools
|
||||
|
||||
2. If you're compiling GlusterFS version 3.4, then install python-swiftclient. Other GlusterFS versions don't need it:
|
||||
2. If you're compiling GlusterFS version 3.4, then install python-swiftclient. Other GlusterFS versions don't need it:
|
||||
|
||||
# sudo easy_install simplejson python-swiftclient
|
||||
sudo easy_install simplejson python-swiftclient
|
||||
|
||||
Now follow through with the **Common Steps** part below.
|
||||
|
||||
@@ -35,15 +34,15 @@ Now follow through with the **Common Steps** part below.
|
||||
|
||||
You'll need EPEL installed first and some CentOS-specific packages. The commands below will get that done for you. After that, follow through the "Common steps" section.
|
||||
|
||||
1. Install EPEL first:
|
||||
1. Install EPEL first:
|
||||
|
||||
# curl -OL `[`http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm`](http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm)
|
||||
# sudo yum -y install epel-release-5-4.noarch.rpm --nogpgcheck
|
||||
curl -OL http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
|
||||
sudo yum -y install epel-release-5-4.noarch.rpm --nogpgcheck
|
||||
|
||||
2. Install the packages required only on CentOS 5.x:
|
||||
2. Install the packages required only on CentOS 5.x:
|
||||
|
||||
# sudo yum -y install buildsys-macros gcc ncurses-devel \
|
||||
python-ctypes python-sphinx10 redhat-rpm-config
|
||||
sudo yum -y install buildsys-macros gcc ncurses-devel \
|
||||
python-ctypes python-sphinx10 redhat-rpm-config
|
||||
|
||||
Now follow through with the **Common Steps** part below.
|
||||
|
||||
@@ -51,32 +50,31 @@ Now follow through with the **Common Steps** part below.
|
||||
|
||||
You'll need EPEL installed first and some CentOS-specific packages. The commands below will get that done for you. After that, follow through the "Common steps" section.
|
||||
|
||||
1. Install EPEL first:
|
||||
1. Install EPEL first:
|
||||
|
||||
# sudo yum -y install `[`http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm`](http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm)
|
||||
sudo yum -y install http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
|
||||
|
||||
2. Install the packages required only on CentOS:
|
||||
2. Install the packages required only on CentOS:
|
||||
|
||||
# sudo yum -y install python-webob1.0 python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
|
||||
sudo yum -y install python-webob1.0 python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
|
||||
|
||||
Now follow through with the **Common Steps** part below.
|
||||
|
||||
|
||||
### Preparation steps for CentOS 8.x (only)
|
||||
|
||||
You'll need EPEL installed and then the powertools package enabled.
|
||||
You'll need EPEL installed and then the powertools package enabled.
|
||||
|
||||
1. Install EPEL first:
|
||||
|
||||
# sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
|
||||
1. Install EPEL first:
|
||||
|
||||
2. Enable the PowerTools repo and install CentOS 8.x specific packages for building the rpms.
|
||||
sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
|
||||
|
||||
# sudo yum --enablerepo=PowerTools install automake autoconf libtool flex bison openssl-devel \
|
||||
libxml2-devel libaio-devel libibverbs-devel librdmacm-devel readline-devel lvm2-devel \
|
||||
glib2-devel userspace-rcu-devel libcmocka-devel libacl-devel sqlite-devel fuse-devel \
|
||||
redhat-rpm-config rpcgen libtirpc-devel make python3-devel rsync libuuid-devel \
|
||||
rpm-build dbench perl-Test-Harness attr libcurl-devel selinux-policy-devel -y
|
||||
2. Enable the PowerTools repo and install CentOS 8.x specific packages for building the rpms.
|
||||
|
||||
sudo yum --enablerepo=PowerTools install automake autoconf libtool flex bison openssl-devel \
|
||||
libxml2-devel libaio-devel libibverbs-devel librdmacm-devel readline-devel lvm2-devel \
|
||||
glib2-devel userspace-rcu-devel libcmocka-devel libacl-devel sqlite-devel fuse-devel \
|
||||
redhat-rpm-config rpcgen libtirpc-devel make python3-devel rsync libuuid-devel \
|
||||
rpm-build dbench perl-Test-Harness attr libcurl-devel selinux-policy-devel -y
|
||||
|
||||
Now follow through from Point 2 in the **Common Steps** part below.
|
||||
|
||||
@@ -84,14 +82,14 @@ Now follow through from Point 2 in the **Common Steps** part below.
|
||||
|
||||
You'll need EPEL installed first and some RHEL specific packages. The 2 commands below will get that done for you. After that, follow through the "Common steps" section.
|
||||
|
||||
1. Install EPEL first:
|
||||
1. Install EPEL first:
|
||||
|
||||
# sudo yum -y install `[`http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm`](http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm)
|
||||
sudo yum -y install http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
|
||||
|
||||
2. Install the packages required only on RHEL:
|
||||
2. Install the packages required only on RHEL:
|
||||
|
||||
# sudo yum -y --enablerepo=rhel-6-server-optional-rpms install python-webob1.0 \
|
||||
python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
|
||||
sudo yum -y --enablerepo=rhel-6-server-optional-rpms install python-webob1.0 \
|
||||
python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
|
||||
|
||||
Now follow through with the **Common Steps** part below.
|
||||
|
||||
@@ -104,64 +102,65 @@ These steps are for both Fedora and RHEL/CentOS. At the end you'll have the comp
|
||||
- If you're on RHEL/CentOS 5.x and get a message about lvm2-devel not being available, it's ok. You can ignore it. :)
|
||||
- If you're on RHEL/CentOS 6.x and get any messages about python-eventlet, python-netifaces, python-sphinx and/or pyxattr not being available, it's ok. You can ignore them. :)
|
||||
- If you're on CentOS 8.x, you can skip step 1 and start from step 2. Also, for CentOS 8.x, the steps have been
|
||||
tested for the master branch. It is unknown if it would work for older branches.
|
||||
tested for the master branch. It is unknown if it would work for older branches.
|
||||
|
||||
<br/>
|
||||
|
||||
1. Install the needed packages
|
||||
1. Install the needed packages
|
||||
|
||||
# sudo yum -y --disablerepo=rhs* --enablerepo=*optional-rpms install git autoconf \
|
||||
automake bison dos2unix flex fuse-devel glib2-devel libaio-devel \
|
||||
libattr-devel libibverbs-devel librdmacm-devel libtool libxml2-devel lvm2-devel make \
|
||||
openssl-devel pkgconfig pyliblzma python-devel python-eventlet python-netifaces \
|
||||
python-paste-deploy python-simplejson python-sphinx python-webob pyxattr readline-devel \
|
||||
rpm-build systemtap-sdt-devel tar libcmocka-devel
|
||||
sudo yum -y --disablerepo=rhs* --enablerepo=*optional-rpms install git autoconf \
|
||||
automake bison dos2unix flex fuse-devel glib2-devel libaio-devel \
|
||||
libattr-devel libibverbs-devel librdmacm-devel libtool libxml2-devel lvm2-devel make \
|
||||
openssl-devel pkgconfig pyliblzma python-devel python-eventlet python-netifaces \
|
||||
python-paste-deploy python-simplejson python-sphinx python-webob pyxattr readline-devel \
|
||||
rpm-build systemtap-sdt-devel tar libcmocka-devel
|
||||
|
||||
2. Clone the GlusterFS git repository
|
||||
2. Clone the GlusterFS git repository
|
||||
|
||||
# git clone `[`git://git.gluster.org/glusterfs`](git://git.gluster.org/glusterfs)
|
||||
# cd glusterfs
|
||||
git clone git://git.gluster.org/glusterfs
|
||||
cd glusterfs
|
||||
|
||||
3. Choose which branch to compile
|
||||
3. Choose which branch to compile
|
||||
|
||||
If you want to compile the latest development code, you can skip this step and go on to the next one. :)
|
||||
|
||||
If instead, you want to compile the code for a specific release of GlusterFS (such as v3.4), get the list of release names here:
|
||||
|
||||
# git branch -a | grep release
|
||||
remotes/origin/release-2.0
|
||||
remotes/origin/release-3.0
|
||||
remotes/origin/release-3.1
|
||||
remotes/origin/release-3.2
|
||||
remotes/origin/release-3.3
|
||||
remotes/origin/release-3.4
|
||||
remotes/origin/release-3.5
|
||||
# git branch -a | grep release
|
||||
remotes/origin/release-2.0
|
||||
remotes/origin/release-3.0
|
||||
remotes/origin/release-3.1
|
||||
remotes/origin/release-3.2
|
||||
remotes/origin/release-3.3
|
||||
remotes/origin/release-3.4
|
||||
remotes/origin/release-3.5
|
||||
|
||||
Then switch to the correct release using the git "checkout" command, and the name of the release after the "remotes/origin/" bit from the list above:
|
||||
|
||||
# git checkout release-3.4
|
||||
git checkout release-3.4
|
||||
|
||||
**NOTE -** The CentOS 5.x instructions have only been tested for the master branch in GlusterFS git. It is unknown (yet) if they work for branches older than release-3.5.
|
||||
|
||||
---
|
||||
If you are compiling the latest development code you can skip steps **4** and **5**. Instead, you can run the below command and you will get the RPMs.
|
||||
***
|
||||
|
||||
|
||||
# extras/LinuxRPM/make_glusterrpms
|
||||
---
|
||||
If you are compiling the latest development code you can skip steps **4** and **5**. Instead, you can run the below command and you will get the RPMs.
|
||||
|
||||
4. Configure and compile GlusterFS
|
||||
extras/LinuxRPM/make_glusterrpms
|
||||
|
||||
***
|
||||
|
||||
4. Configure and compile GlusterFS
|
||||
|
||||
Now you're ready to compile Gluster:
|
||||
|
||||
# ./autogen.sh
|
||||
# ./configure --enable-fusermount
|
||||
# make dist
|
||||
./autogen.sh
|
||||
./configure --enable-fusermount
|
||||
make dist
|
||||
|
||||
5. Create the GlusterFS RPMs
|
||||
5. Create the GlusterFS RPMs
|
||||
|
||||
# cd extras/LinuxRPM
|
||||
# make glusterrpms
|
||||
cd extras/LinuxRPM
|
||||
make glusterrpms
|
||||
|
||||
That should complete with no errors, leaving you with a directory containing the RPMs.
|
||||
|
||||
|
||||
@@ -1,47 +1,52 @@
|
||||
# Get core dump on a customer setup without killing the process
|
||||
|
||||
### Why do we need this?
|
||||
|
||||
Finding the root cause of an issue that occurred in the customer/production setup is a challenging task.
|
||||
Most of the time we cannot replicate/setup the environment and scenario which is leading to the issue on
|
||||
our test setup. In such cases, we have to grab most of the information from the system where the problem
|
||||
has occurred.
|
||||
<br>
|
||||
|
||||
### What information do we look for and find useful?
|
||||
|
||||
Information like a core dump is very helpful for catching the root cause of an issue: we add ASSERT() in
the code at the places where we suspect something is wrong and install the custom build on the affected setup.
But the issue is that ASSERT() would kill the process in order to produce the core dump.
|
||||
<br>
|
||||
|
||||
### Is it a good idea to do ASSERT() on customer setup?
|
||||
Remember we are seeking help from customer setup, they unlikely agree to kill the process and produce the
|
||||
|
||||
Remember we are seeking help from a customer setup; they are unlikely to agree to kill the process and produce the
core dump for us to root cause it. It affects the customer’s business and nobody agrees with this proposal.
|
||||
<br>
|
||||
|
||||
### What if we have a way to produce a core dump without a kill?
|
||||
Yes, Glusterfs provides a way to do this. Gluster has customized ASSERT() i.e GF_ASSERT() in place which helps
|
||||
in producing the core dump without killing the associated process and also provides a script which can be run on
|
||||
the customer set up that produces the core dump without harming the running process (This presumes we already have
|
||||
GF_ASSERT() at the expected place in the current build running on customer setup. If not, we need to install custom
|
||||
|
||||
Yes, GlusterFS provides a way to do this. Gluster has a customized ASSERT(), i.e. GF_ASSERT(), in place which helps
in producing the core dump without killing the associated process, and also provides a script which can be run on
the customer setup that produces the core dump without harming the running process (this presumes we already have
GF_ASSERT() at the expected place in the current build running on the customer setup. If not, we need to install a custom
build on that setup by adding GF_ASSERT()).
|
||||
<br>
|
||||
|
||||
### Is GF_ASSERT() newly introduced in Gluster code?
|
||||
No. GF_ASSERT() is already there in the codebase before this improvement. In the debug build, GF_ASSERT() kills the
|
||||
process and produces the core dump but in the production build, it just logs the error and moves on. What we have done
|
||||
is we just changed the implementation of the code and now in production build also we get the core dump but the process
|
||||
|
||||
No. GF_ASSERT() was already there in the codebase before this improvement. In the debug build, GF_ASSERT() kills the
process and produces the core dump, but in the production build it just logs the error and moves on. What we have done
is change the implementation so that in the production build we also get the core dump, but the process
won’t be killed. For code paths not yet covered by GF_ASSERT(), please add it as per the requirement.
|
||||
<br>
|
||||
|
||||
## Here are the steps to achieve the goal:
|
||||
- Add GF_ASSERT() in the Gluster code path where you expect something wrong is happening.
|
||||
- Build the Gluster code, install and mount the Gluster volume (For detailed steps refer: Gluster quick start guide).
|
||||
- Now, in the other terminal, run the gfcore.py script
|
||||
`# ./extras/debug/gfcore.py $PID 1 /tmp/` (PID of the gluster process you are interested in, got it by `ps -ef | grep gluster`
|
||||
in the previous step. For more details, check `# ./extras/debug/gfcore.py --help`)
|
||||
- Hit the code path where you have introduced GF_ASSERT(). If GF_ASSERT() is in fuse_write() path, you can hit the code
|
||||
path by writing on to a file present under Gluster moun. Ex: `# dd if=/dev/zero of=/mnt/glustrefs/abcd bs=1M count=1`
|
||||
where `/mnt/glusterfs` is the gluster mount
|
||||
- Go to the terminal where the gdb is running (step 3) and observe that the gdb process is terminated
|
||||
- Go to the directory where the core-dump is produced. Default would be present working directory.
|
||||
- Access the core dump using gdb Ex: `# gdb -ex "core-file $GFCORE_FILE" $GLUSTER_BINARY`
|
||||
(1st arg would be core file name and 2nd arg is o/p of file command in the previous step)
|
||||
- Observe that the Gluster process is unaffected by checking its process state. Check pid status using `ps -ef | grep gluster`
|
||||
<br>
|
||||
Thanks, Xavi Hernandez(jahernan@redhat.com) for the idea. This will ease many Gluster developer's/maintainer’s life.
|
||||
|
||||
- Add GF_ASSERT() in the Gluster code path where you expect something wrong is happening.
|
||||
- Build the Gluster code, install and mount the Gluster volume (For detailed steps refer: Gluster quick start guide).
|
||||
- Now, in the other terminal, run the gfcore.py script
|
||||
`# ./extras/debug/gfcore.py $PID 1 /tmp/` (PID of the gluster process you are interested in, obtained by running `ps -ef | grep gluster`
|
||||
in the previous step. For more details, check `# ./extras/debug/gfcore.py --help`)
|
||||
- Hit the code path where you have introduced GF_ASSERT(). If GF_ASSERT() is in fuse_write() path, you can hit the code
|
||||
path by writing to a file present under the Gluster mount. Ex: `# dd if=/dev/zero of=/mnt/glusterfs/abcd bs=1M count=1`
|
||||
where `/mnt/glusterfs` is the gluster mount
|
||||
- Go to the terminal where the gdb is running (step 3) and observe that the gdb process is terminated
|
||||
- Go to the directory where the core-dump is produced. Default would be present working directory.
|
||||
- Access the core dump using gdb Ex: `# gdb -ex "core-file $GFCORE_FILE" $GLUSTER_BINARY`
|
||||
(1st arg would be core file name and 2nd arg is o/p of file command in the previous step)
|
||||
- Observe that the Gluster process is unaffected by checking its process state. Check pid status using `ps -ef | grep gluster`
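Putting the steps above together, a minimal console walkthrough could look like the following; the PID, core file name, mount point and binary path are hypothetical and will differ on your setup:

```{ .console .no-copy }
# ps -ef | grep gluster                                  # note the PID of the process of interest, say 12345
# ./extras/debug/gfcore.py 12345 1 /tmp/                 # waits for GF_ASSERT() to be hit
# dd if=/dev/zero of=/mnt/glusterfs/abcd bs=1M count=1   # trigger the instrumented code path
# file /tmp/<core-file>                                  # identify which binary produced the core
# gdb -ex "core-file /tmp/<core-file>" /usr/sbin/glusterfs
```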
|
||||
|
||||
Thanks, Xavi Hernandez(jahernan@redhat.com) for the idea. This will ease many Gluster developer's/maintainer’s life.
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
GlusterFS Tools
|
||||
---------------
|
||||
## GlusterFS Tools
|
||||
|
||||
- [glusterfind](./glusterfind.md)
|
||||
- [gfind missing files](./gfind-missing-files.md)
|
||||
- [glusterfind](./glusterfind.md)
|
||||
- [gfind missing files](./gfind-missing-files.md)
|
||||
|
||||
@@ -54,15 +54,15 @@ bash gfid_to_path.sh <BRICK_PATH> <GFID_FILE>
|
||||
|
||||
## Things to keep in mind when running the tool
|
||||
|
||||
1. Running this tool can result in a crawl of the backend filesystem at each
|
||||
brick which can be intensive. To ensure there is no impact on ongoing I/O on
|
||||
RHS volumes, we recommend that this tool be run at a low I/O scheduling class
|
||||
(best-effort) and priority.
|
||||
1. Running this tool can result in a crawl of the backend filesystem at each
|
||||
brick which can be intensive. To ensure there is no impact on ongoing I/O on
|
||||
RHS volumes, we recommend that this tool be run at a low I/O scheduling class
|
||||
(best-effort) and priority.
|
||||
|
||||
ionice -c 2 -p <pid of gfind_missing_files.sh>
|
||||
ionice -c 2 -p <pid of gfind_missing_files.sh>
|
||||
|
||||
2. We do not recommend interrupting the tool when it is running
|
||||
(e.g. by doing CTRL^C). It is better to wait for the tool to finish
|
||||
2. We do not recommend interrupting the tool when it is running
|
||||
(e.g. by doing CTRL^C). It is better to wait for the tool to finish
|
||||
execution. In case it is interrupted, manually unmount the Slave Volume.
|
||||
|
||||
umount <MOUNT_POINT>
|
||||
umount <MOUNT_POINT>
|
||||
|
||||
@@ -6,11 +6,23 @@ This tool should be run in one of the node, which will get Volume info and gets
|
||||
|
||||
## Session Management
|
||||
|
||||
Create a glusterfind session to remember the time when last sync or processing complete. For example, your backup application runs every day and gets incremental results on each run. The tool maintains session in `$GLUSTERD_WORKDIR/glusterfind/`, for each session it creates and directory and creates a sub directory with Volume name. (Default working directory is /var/lib/glusterd, in some systems this location may change. To find Working dir location run `grep working-directory /etc/glusterfs/glusterd.vol` or `grep working-directory /usr/local/etc/glusterfs/glusterd.vol` if source install)
|
||||
Create a glusterfind session to remember the time when the last sync or processing completed. For example, your backup application runs every day and gets incremental results on each run. The tool maintains sessions in `$GLUSTERD_WORKDIR/glusterfind/`; for each session it creates a directory and, under it, a sub-directory with the Volume name. (The default working directory is /var/lib/glusterd, but in some systems this location may change. To find the working directory location run
|
||||
|
||||
```console
|
||||
grep working-directory /etc/glusterfs/glusterd.vol
|
||||
```
|
||||
|
||||
or
|
||||
|
||||
```console
|
||||
grep working-directory /usr/local/etc/glusterfs/glusterd.vol
|
||||
```
|
||||
|
||||
if you installed from the source.
|
||||
|
||||
For example, if the session name is "backup" and volume name is "datavol", then the tool creates `$GLUSTERD_WORKDIR/glusterfind/backup/datavol`. Now onwards we refer this directory as `$SESSION_DIR`.
|
||||
|
||||
```text
|
||||
```{ .text .no-copy }
|
||||
create => pre => post => [delete]
|
||||
```
|
||||
|
||||
@@ -34,13 +46,13 @@ Incremental find uses Changelogs to get the list of GFIDs modified/created. Any
|
||||
|
||||
If we set the build-pgfid option on the Volume, GlusterFS starts recording each file's parent directory GFID as an xattr on the file on any ENTRY fop.
|
||||
|
||||
```text
|
||||
```{ .text .no-copy }
|
||||
trusted.pgfid.<GFID>=NUM_LINKS
|
||||
```
|
||||
|
||||
To convert from GFID to path, we can mount Volume with aux-gfid-mount option, and get Path information by a getfattr query.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
getfattr -n glusterfs.ancestry.path -e text /mnt/datavol/.gfid/<GFID>
|
||||
```
|
||||
|
||||
@@ -54,7 +66,7 @@ Tool collects the list of GFIDs failed to convert with above method and does a f
|
||||
|
||||
### Create the session
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
glusterfind create SESSION_NAME VOLNAME [--force]
|
||||
glusterfind create --help
|
||||
```
|
||||
@@ -63,7 +75,7 @@ Where, SESSION_NAME is any name without space to identify when run second time.
|
||||
|
||||
Examples,
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# glusterfind create --help
|
||||
# glusterfind create backup datavol
|
||||
# glusterfind create antivirus_scanner datavol
|
||||
@@ -72,7 +84,7 @@ Examples,
|
||||
|
||||
### Pre Command
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
glusterfind pre SESSION_NAME VOLUME_NAME OUTFILE
|
||||
glusterfind pre --help
|
||||
```
|
||||
@@ -83,7 +95,7 @@ To trigger the full find, call the pre command with `--full` argument. Multiple
|
||||
|
||||
Examples,
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# glusterfind pre backup datavol /root/backup.txt
|
||||
# glusterfind pre backup datavol /root/backup.txt --full
|
||||
|
||||
@@ -97,27 +109,27 @@ Examples,
|
||||
The output file contains the list of files/dirs relative to the Volume mount; if we need to prefix them with a path to get absolute paths then,
|
||||
|
||||
```console
|
||||
# glusterfind pre backup datavol /root/backup.txt --file-prefix=/mnt/datavol/
|
||||
glusterfind pre backup datavol /root/backup.txt --file-prefix=/mnt/datavol/
|
||||
```
|
||||
|
||||
### List Command
|
||||
|
||||
To get the list of sessions and respective session time,
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
glusterfind list [--session SESSION_NAME] [--volume VOLUME_NAME]
|
||||
```
|
||||
|
||||
Examples,
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# glusterfind list
|
||||
# glusterfind list --session backup
|
||||
```
|
||||
|
||||
Example output,
|
||||
|
||||
```console
|
||||
```{ .text .no-copy }
|
||||
SESSION VOLUME SESSION TIME
|
||||
---------------------------------------------------------------------------
|
||||
backup datavol 2015-03-04 17:35:34
|
||||
@@ -125,26 +137,26 @@ backup datavol 2015-03-04 17:35:34
|
||||
|
||||
### Post Command
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
glusterfind post SESSION_NAME VOLUME_NAME
|
||||
```
|
||||
|
||||
Examples,
|
||||
|
||||
```console
|
||||
# glusterfind post backup datavol
|
||||
glusterfind post backup datavol
|
||||
```
|
||||
|
||||
### Delete Command
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
glusterfind delete SESSION_NAME VOLUME_NAME
|
||||
```
|
||||
|
||||
Examples,
|
||||
|
||||
```console
|
||||
# glusterfind delete backup datavol
|
||||
glusterfind delete backup datavol
|
||||
```
|
||||
|
||||
## Adding more Crawlers
|
||||
@@ -170,7 +182,7 @@ Custom crawler can be executable script/binary which accepts volume name, brick
|
||||
|
||||
For example,
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
/root/parallelbrickcrawl SESSION_NAME VOLUME BRICK_PATH OUTFILE START_TIME [--debug]
|
||||
```
|
||||
|
||||
|
||||
@@ -3,6 +3,7 @@
|
||||
For Gluster to communicate within a cluster, either the firewalls
have to be turned off or communication has to be enabled for each server.
|
||||
|
||||
|
||||
```{ .console .no-copy }
|
||||
iptables -I INPUT -p all -s <ip-address> -j ACCEPT
|
||||
```
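For example, to allow traffic from a peer at 192.0.2.10 (an illustrative address), the rule would look like:

```{ .console .no-copy }
iptables -I INPUT -p all -s 192.0.2.10 -j ACCEPT
```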
|
||||
@@ -115,14 +116,12 @@ Brick3: node03.yourdomain.net:/export/sdb1/brick
|
||||
```
|
||||
|
||||
This shows us essentially what we just specified during the volume
|
||||
creation. The one this to mention is the `Status`. A status of `Created`
|
||||
means that the volume has been created, but hasn’t yet been started,
|
||||
which would cause any attempt to mount the volume fail.
|
||||
creation. The one key output worth noticing is `Status`.
|
||||
A status of `Created` means that the volume has been created,
|
||||
but hasn’t yet been started, which would cause any attempt to mount the volume to fail.
|
||||
|
||||
Now, we should start the volume.
|
||||
Now, we should start the volume before we try to mount it.
|
||||
|
||||
```console
|
||||
gluster volume start gv0
|
||||
```
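Once the volume has started, it can be mounted from a client; for example (the server name and mount point below are placeholders):

```{ .console .no-copy }
mkdir -p /mnt/gv0
mount -t glusterfs node01.yourdomain.net:/gv0 /mnt/gv0
```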
|
||||
|
||||
Find all documentation [here](../index.md)
|
||||
|
||||
@@ -6,7 +6,7 @@ planning but the growth has mostly been ad-hoc and need-based.
|
||||
|
||||
Central to the plan of revitalizing the Gluster.org community is the ability to
|
||||
provide well-maintained infrastructure services with predictable uptimes and
|
||||
resilience. We're migrating the existing services into the Community Cage. The
|
||||
resilience. We're migrating the existing services into the Community Cage. The
|
||||
implied objective is that the transition would open up ways and means of the
|
||||
formation of a loose coalition among Infrastructure Administrators who provide
|
||||
expertise and guidance to the community projects within the OSAS team.
|
||||
|
||||
@@ -1,23 +1,24 @@
|
||||
## Tools We Use
|
||||
|
||||
| Service/Tool | Purpose | Hosted At |
|
||||
|----------------------|----------------------------------------------------|-----------------|
|
||||
| Github | Code Review | Github |
|
||||
| Jenkins | CI, build-verification-test | Temporary Racks |
|
||||
| Backups | Website, Gerrit and Jenkins backup | Rackspace |
|
||||
| Docs | Documentation content | mkdocs.org |
|
||||
| download.gluster.org | Official download site of the binaries | Rackspace |
|
||||
| Mailman | Lists mailman | Rackspace |
|
||||
| www.gluster.org | Web asset | Rackspace |
|
||||
| Service/Tool | Purpose | Hosted At |
|
||||
| :------------------- | :------------------------------------: | --------------: |
|
||||
| Github | Code Review | Github |
|
||||
| Jenkins | CI, build-verification-test | Temporary Racks |
|
||||
| Backups | Website, Gerrit and Jenkins backup | Rackspace |
|
||||
| Docs | Documentation content | mkdocs.org |
|
||||
| download.gluster.org | Official download site of the binaries | Rackspace |
|
||||
| Mailman | Lists mailman | Rackspace |
|
||||
| www.gluster.org | Web asset | Rackspace |
|
||||
|
||||
## Notes
|
||||
* download.gluster.org: Resiliency is important for availability and metrics.
|
||||
|
||||
- download.gluster.org: Resiliency is important for availability and metrics.
|
||||
  Since it's the official download site, access needs to be restricted as much as possible.
  A few developers building the community packages have access. Anyone who requires
  access can raise an issue at [gluster/project-infrastructure](https://github.com/gluster/project-infrastructure/issues/new)
  with a valid reason.
|
||||
* Mailman: Should be migrated to a separate host. Should be made more redundant
|
||||
- Mailman: Should be migrated to a separate host. Should be made more redundant
|
||||
(i.e., more than one MX).
|
||||
* www.gluster.org: Framework, Artifacts now exist under gluster.github.com. Has
|
||||
- www.gluster.org: Framework, Artifacts now exist under gluster.github.com. Has
|
||||
  various legacy installations of software (mediawiki, etc.), being cleaned up as
  we find them.
|
||||
|
||||
@@ -1,9 +1,8 @@
|
||||
Troubleshooting Guide
|
||||
---------------------
|
||||
## Troubleshooting Guide
|
||||
|
||||
This guide describes some commonly seen issues and steps to recover from them.
|
||||
If that doesn’t help, reach out to the [Gluster community](https://www.gluster.org/community/), in which case the guide also describes what information needs to be provided in order to debug the issue. At minimum, we need the version of gluster running and the output of `gluster volume info`.
|
||||
|
||||
|
||||
### Where Do I Start?
|
||||
|
||||
Is the issue already listed in the component specific troubleshooting sections?
|
||||
@@ -15,7 +14,6 @@ Is the issue already listed in the component specific troubleshooting sections?
|
||||
- [Gluster NFS Issues](./troubleshooting-gnfs.md)
|
||||
- [File Locks](./troubleshooting-filelocks.md)
|
||||
|
||||
|
||||
If that didn't help, here is how to debug further.
|
||||
|
||||
Identifying the problem and getting the necessary information to diagnose it is the first step in troubleshooting your Gluster setup. As Gluster operations involve interactions between multiple processes, this can involve multiple steps.
|
||||
@@ -25,5 +23,3 @@ Identifying the problem and getting the necessary information to diagnose it is
|
||||
- An operation failed
|
||||
- [High Memory Usage](./troubleshooting-memory.md)
|
||||
- [A Gluster process crashed](./gluster-crash.md)
|
||||
|
||||
|
||||
|
||||
@@ -8,24 +8,26 @@ normal filesystem. The GFID of a file is stored in its xattr named
|
||||
#### Special mount using gfid-access translator:
|
||||
|
||||
```console
|
||||
# mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol
|
||||
mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol
|
||||
```
|
||||
|
||||
Assuming you have the `GFID` of a file from the changelog (or somewhere else).
|
||||
For trying this out, you can get `GFID` of a file from mountpoint:
|
||||
|
||||
```console
|
||||
# getfattr -n glusterfs.gfid.string /mnt/testvol/dir/file
|
||||
getfattr -n glusterfs.gfid.string /mnt/testvol/dir/file
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Get file path from GFID (Method 1):
|
||||
|
||||
**(Lists hardlinks delimited by `:`, returns path as seen from mountpoint)**
|
||||
|
||||
#### Turn on build-pgfid option
|
||||
|
||||
```console
|
||||
# gluster volume set test build-pgfid on
|
||||
gluster volume set test build-pgfid on
|
||||
```
|
||||
|
||||
Read virtual xattr `glusterfs.ancestry.path` which contains the file path
|
||||
@@ -36,7 +38,7 @@ getfattr -n glusterfs.ancestry.path -e text /mnt/testvol/.gfid/<GFID>
|
||||
|
||||
**Example:**
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
[root@vm1 glusterfs]# ls -il /mnt/testvol/dir/
|
||||
total 1
|
||||
10610563327990022372 -rw-r--r--. 2 root root 3 Jul 17 18:05 file
|
||||
@@ -54,6 +56,7 @@ glusterfs.ancestry.path="/dir/file:/dir/file3"
|
||||
```
|
||||
|
||||
### Get file path from GFID (Method 2):
|
||||
|
||||
**(Does not list all hardlinks, returns backend brick path)**
|
||||
|
||||
```console
|
||||
@@ -70,4 +73,5 @@ trusted.glusterfs.pathinfo="(<DISTRIBUTE:test-dht> <POSIX(/mnt/brick-test/b):vm1
|
||||
```
|
||||
|
||||
#### References and links:
|
||||
|
||||
[posix: placeholders for GFID to path conversion](http://review.gluster.org/5951)
|
||||
|
||||
@@ -1,13 +1,11 @@
|
||||
Debugging a Crash
|
||||
=================
|
||||
# Debugging a Crash
|
||||
|
||||
To find out why a Gluster process terminated abruptly, we need the following:
|
||||
|
||||
* A coredump of the process that crashed
|
||||
* The exact version of Gluster that is running
|
||||
* The Gluster log files
|
||||
* the output of `gluster volume info`
|
||||
* Steps to reproduce the crash if available
|
||||
|
||||
- A coredump of the process that crashed
|
||||
- The exact version of Gluster that is running
|
||||
- The Gluster log files
|
||||
- the output of `gluster volume info`
|
||||
- Steps to reproduce the crash if available
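Most of the items above can be gathered with a few commands; the sketch below assumes a default installation on a systemd-based distribution (the log directory and coredumpctl availability may differ):

```{ .console .no-copy }
# gluster --version                                 # exact Gluster version
# gluster volume info > volume-info.txt
# tar czf gluster-logs.tar.gz /var/log/glusterfs/   # default log directory
# coredumpctl list | grep gluster                   # locate the coredump, if systemd-coredump is in use
```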
|
||||
|
||||
Contact the [community](https://www.gluster.org/community/) with this information or [open an issue](https://github.com/gluster/glusterfs/issues/new)
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
Heal info and split-brain resolution
|
||||
=======================================
|
||||
This document explains the heal info command available in gluster for monitoring pending heals in replicate volumes and the methods available to resolve split-brains.
|
||||
# Heal info and split-brain resolution
|
||||
|
||||
This document explains the heal info command available in gluster for monitoring pending heals in replicate volumes and the methods available to resolve split-brains.
|
||||
|
||||
## Types of Split-Brains:
|
||||
|
||||
@@ -9,26 +9,27 @@ is the correct one.
|
||||
|
||||
There are three types of split-brains:
|
||||
|
||||
- Data split-brain: The data in the file differs on the bricks in the replica set
|
||||
- Metadata split-brain: The metadata differs on the bricks
|
||||
- Entry split-brain: The GFID of the file is different on the bricks in the replica or the type of the file is different on the bricks in the replica. Type-mismatch cannot be healed using any of the split-brain resolution methods while gfid split-brains can be.
|
||||
|
||||
- Data split-brain: The data in the file differs on the bricks in the replica set
|
||||
- Metadata split-brain: The metadata differs on the bricks
|
||||
- Entry split-brain: The GFID of the file is different on the bricks in the replica or the type of the file is different on the bricks in the replica. Type-mismatch cannot be healed using any of the split-brain resolution methods while gfid split-brains can be.
|
||||
|
||||
## 1) Volume heal info:
|
||||
|
||||
Usage: `gluster volume heal <VOLNAME> info`
|
||||
|
||||
This lists all the files that require healing (and will be processed by the self-heal daemon). It prints either their path or their GFID.
|
||||
|
||||
### Interpreting the output
|
||||
|
||||
All the files listed in the output of this command need to be healed.
|
||||
The files listed may also be accompanied by the following tags:
|
||||
|
||||
a) 'Is in split-brain'
|
||||
A file in data or metadata split-brain will
|
||||
be listed with " - Is in split-brain" appended after its path/GFID. E.g.
|
||||
A file in data or metadata split-brain will
|
||||
be listed with " - Is in split-brain" appended after its path/GFID. E.g.
|
||||
"/file4" in the output provided below. However, for a file in GFID split-brain,
|
||||
the parent directory of the file is shown to be in split-brain and the file
|
||||
itself is shown to be needing healing, e.g. "/dir" in the output provided below
|
||||
the parent directory of the file is shown to be in split-brain and the file
|
||||
itself is shown to be needing healing, e.g. "/dir" in the output provided below
|
||||
is in split-brain because of GFID split-brain of file "/dir/a".
|
||||
Files in split-brain cannot be healed without resolving the split-brain.
|
||||
|
||||
@@ -36,11 +37,13 @@ b) 'Is possibly undergoing heal'
|
||||
When the heal info command is run, it (or to be more specific, the 'glfsheal' binary that is executed when you run the command) takes locks on each file to find if it needs healing. However, if the self-heal daemon had already started healing the file, it would have taken locks which glfsheal wouldn't be able to acquire. In such a case, it could print this message. Another possible case could be multiple glfsheal processes running simultaneously (e.g. multiple users ran a heal info command at the same time) and competing for same lock.
|
||||
|
||||
The following is an example of heal info command's output.
|
||||
|
||||
### Example
|
||||
|
||||
Consider a replica volume "test" with two bricks b1 and b2;
|
||||
self-heal daemon off, mounted at /mnt.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal test info
|
||||
Brick \<hostname:brickpath-b1>
|
||||
<gfid:aaca219f-0e25-4576-8689-3bfd93ca70c2> - Is in split-brain
|
||||
@@ -63,24 +66,27 @@ Number of entries: 6
|
||||
```
|
||||
|
||||
### Analysis of the output
|
||||
It can be seen that
|
||||
A) from brick b1, four entries need healing:
|
||||
1) file with gfid:6dc78b20-7eb6-49a3-8edb-087b90142246 needs healing
|
||||
2) "aaca219f-0e25-4576-8689-3bfd93ca70c2",
|
||||
"39f301ae-4038-48c2-a889-7dac143e82dd" and "c3c94de2-232d-4083-b534-5da17fc476ac"
|
||||
are in split-brain
|
||||
|
||||
B) from brick b2 six entries need healing-
|
||||
1) "a", "file2" and "file3" need healing
|
||||
2) "file1", "file4" & "/dir" are in split-brain
|
||||
It can be seen that
|
||||
|
||||
A) from brick b1, four entries need healing:
|
||||
|
||||
- file with gfid:6dc78b20-7eb6-49a3-8edb-087b90142246 needs healing
|
||||
- "aaca219f-0e25-4576-8689-3bfd93ca70c2", "39f301ae-4038-48c2-a889-7dac143e82dd" and "c3c94de2-232d-4083-b534-5da17fc476ac" are in split-brain
|
||||
|
||||
B) from brick b2 six entries need healing-
|
||||
|
||||
- "a", "file2" and "file3" need healing
|
||||
- "file1", "file4" & "/dir" are in split-brain
|
||||
|
||||
# 2. Volume heal info split-brain
|
||||
|
||||
Usage: `gluster volume heal <VOLNAME> info split-brain`
|
||||
This command only shows the list of files that are in split-brain. The output is therefore a subset of `gluster volume heal <VOLNAME> info`
|
||||
|
||||
### Example
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal test info split-brain
|
||||
Brick <hostname:brickpath-b1>
|
||||
<gfid:aaca219f-0e25-4576-8689-3bfd93ca70c2>
|
||||
@@ -95,19 +101,22 @@ Brick <hostname:brickpath-b2>
|
||||
Number of entries in split-brain: 3
|
||||
```
|
||||
|
||||
Note that similar to the heal info command, for GFID split-brains (same filename but different GFID)
|
||||
Note that similar to the heal info command, for GFID split-brains (same filename but different GFID)
|
||||
their parent directories are listed to be in split-brain.
|
||||
|
||||
# 3. Resolution of split-brain using gluster CLI
|
||||
|
||||
Once the files in split-brain are identified, their resolution can be done
|
||||
from the gluster command line using various policies. Type-mismatch cannot be healed using these methods. Split-brain resolution commands let the user resolve data, metadata, and GFID split-brains.
|
||||
|
||||
## 3.1 Resolution of data/metadata split-brain using gluster CLI
|
||||
|
||||
Data and metadata split-brains can be resolved using the following policies:
|
||||
|
||||
## i) Select the bigger-file as source
|
||||
|
||||
This command is useful for per file healing where it is known/decided that the
|
||||
file with bigger size is to be considered as source.
|
||||
file with bigger size is to be considered as source.
|
||||
`gluster volume heal <VOLNAME> split-brain bigger-file <FILE>`
|
||||
Here, `<FILE>` can be either the full file name as seen from the root of the volume
|
||||
(or) the GFID-string representation of the file, which sometimes gets displayed
|
||||
@@ -115,13 +124,14 @@ in the heal info command's output. Once this command is executed, the replica co
|
||||
size is found and healing is completed with that brick as a source.
|
||||
|
||||
### Example :
|
||||
|
||||
Consider the earlier output of the heal info split-brain command.
|
||||
|
||||
Before healing the file, notice file size and md5 checksums :
|
||||
Before healing the file, notice file size and md5 checksums :
|
||||
|
||||
On brick b1:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
[brick1]# stat b1/dir/file1
|
||||
File: ‘b1/dir/file1’
|
||||
Size: 17 Blocks: 16 IO Block: 4096 regular file
|
||||
@@ -138,7 +148,7 @@ Change: 2015-03-06 13:55:37.206880347 +0530
|
||||
|
||||
On brick b2:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
[brick2]# stat b2/dir/file1
|
||||
File: ‘b2/dir/file1’
|
||||
Size: 13 Blocks: 16 IO Block: 4096 regular file
|
||||
@@ -153,7 +163,7 @@ Change: 2015-03-06 13:52:22.910758923 +0530
|
||||
cb11635a45d45668a403145059c2a0d5 b2/dir/file1
|
||||
```
|
||||
|
||||
**Healing file1 using the above command** :-
|
||||
**Healing file1 using the above command** :-
|
||||
`gluster volume heal test split-brain bigger-file /dir/file1`
|
||||
Healed /dir/file1.
|
||||
|
||||
@@ -161,7 +171,7 @@ After healing is complete, the md5sum and file size on both bricks should be the
|
||||
|
||||
On brick b1:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
[brick1]# stat b1/dir/file1
|
||||
File: ‘b1/dir/file1’
|
||||
Size: 17 Blocks: 16 IO Block: 4096 regular file
|
||||
@@ -178,7 +188,7 @@ Change: 2015-03-06 14:17:12.880343950 +0530
|
||||
|
||||
On brick b2:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
[brick2]# stat b2/dir/file1
|
||||
File: ‘b2/dir/file1’
|
||||
Size: 17 Blocks: 16 IO Block: 4096 regular file
|
||||
@@ -195,7 +205,7 @@ Change: 2015-03-06 14:17:12.881343955 +0530
|
||||
|
||||
## ii) Select the file with the latest mtime as source
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
gluster volume heal <VOLNAME> split-brain latest-mtime <FILE>
|
||||
```
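For example, reusing the volume from the earlier outputs (the file chosen here is only illustrative):

```{ .console .no-copy }
gluster volume heal test split-brain latest-mtime /dir/file1
```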
|
||||
|
||||
@@ -203,20 +213,21 @@ As is perhaps self-explanatory, this command uses the brick having the latest mo
|
||||
|
||||
## iii) Select one of the bricks in the replica as the source for a particular file
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME> <FILE>
|
||||
```
|
||||
|
||||
Here, `<HOSTNAME:BRICKNAME>` is selected as source brick and `<FILE>` present in the source brick is taken as the source for healing.
|
||||
|
||||
### Example :
|
||||
|
||||
Notice the md5 checksums and file size before and after healing.
|
||||
|
||||
Before heal :
|
||||
|
||||
On brick b1:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
[brick1]# stat b1/file4
|
||||
File: ‘b1/file4’
|
||||
Size: 4 Blocks: 16 IO Block: 4096 regular file
|
||||
@@ -233,7 +244,7 @@ b6273b589df2dfdbd8fe35b1011e3183 b1/file4
|
||||
|
||||
On brick b2:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
[brick2]# stat b2/file4
|
||||
File: ‘b2/file4’
|
||||
Size: 4 Blocks: 16 IO Block: 4096 regular file
|
||||
@@ -251,7 +262,7 @@ Change: 2015-03-06 13:52:35.769833142 +0530
|
||||
**Healing the file with gfid c3c94de2-232d-4083-b534-5da17fc476ac using the above command** :
|
||||
|
||||
```console
|
||||
# gluster volume heal test split-brain source-brick test-host:/test/b1 gfid:c3c94de2-232d-4083-b534-5da17fc476ac
|
||||
gluster volume heal test split-brain source-brick test-host:/test/b1 gfid:c3c94de2-232d-4083-b534-5da17fc476ac
|
||||
```
|
||||
|
||||
Healed gfid:c3c94de2-232d-4083-b534-5da17fc476ac.
|
||||
@@ -260,7 +271,7 @@ After healing :
|
||||
|
||||
On brick b1:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# stat b1/file4
|
||||
File: ‘b1/file4’
|
||||
Size: 4 Blocks: 16 IO Block: 4096 regular file
|
||||
@@ -276,7 +287,7 @@ b6273b589df2dfdbd8fe35b1011e3183 b1/file4
|
||||
|
||||
On brick b2:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# stat b2/file4
|
||||
File: ‘b2/file4’
|
||||
Size: 4 Blocks: 16 IO Block: 4096 regular file
|
||||
@@ -292,7 +303,7 @@ b6273b589df2dfdbd8fe35b1011e3183 b2/file4
|
||||
|
||||
## iv) Select one brick of the replica as the source for all files
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME>
|
||||
```
|
||||
|
||||
@@ -301,9 +312,10 @@ replica pair is source. As the result of the above command all split-brained
|
||||
files in `<HOSTNAME:BRICKNAME>` are selected as source and healed to the sink.
|
||||
|
||||
### Example:
|
||||
|
||||
Consider a volume having three entries "a, b and c" in split-brain.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal test split-brain source-brick test-host:/test/b1
|
||||
Healed gfid:944b4764-c253-4f02-b35f-0d0ae2f86c0f.
|
||||
Healed gfid:3256d814-961c-4e6e-8df2-3a3143269ced.
|
||||
@@ -312,19 +324,24 @@ Number of healed entries: 3
|
||||
```
|
||||
|
||||
# 3.2 Resolution of GFID split-brain using gluster CLI
|
||||
|
||||
GFID split-brains can also be resolved by the gluster command line using the same policies that are used to resolve data and metadata split-brains.
|
||||
|
||||
## i) Selecting the bigger-file as source
|
||||
|
||||
This method is useful for per file healing and where you can decide that the file with the bigger size is to be considered as the source.
|
||||
|
||||
Run the following command to obtain the path of the file that is in split-brain:
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal VOLNAME info split-brain
|
||||
```
|
||||
|
||||
From the output, identify the files for which file operations performed from the client failed with input/output error.
|
||||
|
||||
### Example :
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal testvol info
|
||||
Brick 10.70.47.45:/bricks/brick2/b0
|
||||
/f5
|
||||
@@ -340,19 +357,22 @@ Brick 10.70.47.144:/bricks/brick2/b1
|
||||
Status: Connected
|
||||
Number of entries: 2
|
||||
```
|
||||
|
||||
> **Note**
|
||||
> Entries which are in GFID split-brain may not always be shown as in split-brain by the heal info or heal info split-brain commands. For entry split-brains, it is the parent directory which is shown as being in split-brain. So one might need to run info split-brain to get the directory names and then heal info to get the list of files under that directory which might be in split-brain (they could also just need heal without being in split-brain).
|
||||
|
||||
In the above command, testvol is the volume name, b0 and b1 are the bricks.
|
||||
Execute the below getfattr command on the brick to fetch information if a file is in GFID split-brain or not.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -e hex -m. <path-to-file>
|
||||
```
|
||||
|
||||
### Example :
|
||||
|
||||
On brick /b0
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b0/f5
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b0/f5
|
||||
@@ -364,7 +384,8 @@ trusted.gfid2path.9cde09916eabc845=0x30303030303030302d303030302d303030302d30303
|
||||
```
|
||||
|
||||
On brick /b1
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b1/f5
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b1/f5
|
||||
@@ -379,7 +400,8 @@ You can notice the difference in GFID for the file f5 in both the bricks.
|
||||
You can find the differences in the file size by executing stat command on the file from the bricks.
|
||||
|
||||
On brick /b0
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# stat /bricks/brick2/b0/f5
|
||||
File: ‘/bricks/brick2/b0/f5’
|
||||
Size: 15 Blocks: 8 IO Block: 4096 regular file
|
||||
@@ -393,7 +415,8 @@ Birth: -
|
||||
```
|
||||
|
||||
On brick /b1
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# stat /bricks/brick2/b1/f5
|
||||
File: ‘/bricks/brick2/b1/f5’
|
||||
Size: 2 Blocks: 8 IO Block: 4096 regular file
|
||||
@@ -408,12 +431,13 @@ Birth: -
|
||||
|
||||
Execute the following command along with the full filename as seen from the root of the volume which is displayed in the heal info command's output:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal VOLNAME split-brain bigger-file FILE
|
||||
```
|
||||
|
||||
### Example :
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal testvol split-brain bigger-file /f5
|
||||
GFID split-brain resolved for file /f5
|
||||
```
|
||||
@@ -421,7 +445,8 @@ GFID split-brain resolved for file /f5
|
||||
After the healing is complete, the GFID of the file on both the bricks must be the same as that of the file which had the bigger size. The following is a sample output of the getfattr command after completion of healing the file.
|
||||
|
||||
On brick /b0
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b0/f5
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b0/f5
|
||||
@@ -431,7 +456,8 @@ trusted.gfid2path.9cde09916eabc845=0x30303030303030302d303030302d303030302d30303
|
||||
```
|
||||
|
||||
On brick /b1
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b1/f5
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b1/f5
|
||||
@@ -441,14 +467,16 @@ trusted.gfid2path.9cde09916eabc845=0x30303030303030302d303030302d303030302d30303
|
||||
```
|
||||
|
||||
## ii) Selecting the file with latest mtime as source
|
||||
|
||||
This method is useful for per file healing and if you want the file with the latest mtime to be considered as the source.
|
||||
|
||||
### Example :
|
||||
|
||||
Let's take another file which is in GFID split-brain and try to heal it using the latest-mtime option.
|
||||
|
||||
On brick /b0
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b0/f4
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b0/f4
|
||||
@@ -460,7 +488,8 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303
|
||||
```
|
||||
|
||||
On brick /b1
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b1/f4
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b1/f4
|
||||
@@ -475,7 +504,8 @@ You can notice the difference in GFID for the file f4 in both the bricks.
|
||||
You can find the difference in the modification time by executing stat command on the file from the bricks.
|
||||
|
||||
On brick /b0
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# stat /bricks/brick2/b0/f4
|
||||
File: ‘/bricks/brick2/b0/f4’
|
||||
Size: 14 Blocks: 8 IO Block: 4096 regular file
|
||||
@@ -489,7 +519,8 @@ Birth: -
|
||||
```
|
||||
|
||||
On brick /b1
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# stat /bricks/brick2/b1/f4
|
||||
File: ‘/bricks/brick2/b1/f4’
|
||||
Size: 2 Blocks: 8 IO Block: 4096 regular file
|
||||
@@ -503,12 +534,14 @@ Birth: -
|
||||
```
|
||||
|
||||
Execute the following command:
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal VOLNAME split-brain latest-mtime FILE
|
||||
```
|
||||
|
||||
### Example :
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal testvol split-brain latest-mtime /f4
|
||||
GFID split-brain resolved for file /f4
|
||||
```
|
||||
@@ -516,7 +549,9 @@ GFID split-brain resolved for file /f4
|
||||
After the healing is complete, the GFID of the files on both bricks must be same. The following is a sample output of the getfattr command after completion of healing the file. You can notice that the file has been healed using the brick having the latest mtime as the source.
|
||||
|
||||
On brick /b0
|
||||
```{ .console .no-copy }
# getfattr -d -m . -e hex /bricks/brick2/b0/f4
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b0/f4
|
||||
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
|
||||
@@ -525,7 +560,8 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303
|
||||
```
|
||||
|
||||
On brick /b1
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b1/f4
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b1/f4
|
||||
@@ -535,13 +571,16 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303
|
||||
```
|
||||
|
||||
## iii) Select one of the bricks in the replica as source for a particular file
|
||||
|
||||
This method is useful for per file healing and if you know which copy of the file is good.
|
||||
|
||||
### Example :
|
||||
|
||||
Let's take another file which is in GFID split-brain and try to heal it using the source-brick option.
|
||||
|
||||
On brick /b0
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b0/f3
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b0/f3
|
||||
@@ -553,7 +592,8 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303
|
||||
```
|
||||
|
||||
On brick /b1
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b1/f3
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b0/f3
|
||||
@@ -567,14 +607,16 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303
|
||||
You can notice the difference in GFID for the file f3 in both the bricks.
|
||||
|
||||
Execute the following command:
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal VOLNAME split-brain source-brick HOSTNAME:export-directory-absolute-path FILE
|
||||
```
|
||||
|
||||
In this command, the FILE present in HOSTNAME:export-directory-absolute-path is taken as the source for healing.
|
||||
|
||||
### Example :
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal testvol split-brain source-brick 10.70.47.144:/bricks/brick2/b1 /f3
|
||||
GFID split-brain resolved for file /f3
|
||||
```
|
||||
@@ -582,7 +624,8 @@ GFID split-brain resolved for file /f3
|
||||
After the healing is complete, the GFID of the file on both the bricks should be same as that of the brick which was chosen as source for healing. The following is a sample output of the getfattr command after the file is healed.
|
||||
|
||||
On brick /b0
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b0/f3
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b0/f3
|
||||
@@ -592,7 +635,8 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303
|
||||
```
|
||||
|
||||
On brick /b1
|
||||
```console
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -d -m . -e hex /bricks/brick2/b1/f3
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
file: bricks/brick2/b1/f3
|
||||
@@ -602,19 +646,22 @@ trusted.gfid2path.364f55367c7bd6f4=0x30303030303030302d303030302d303030302d30303
|
||||
```
|
||||
|
||||
> **Note**
>
> - One cannot use the GFID of the file as an argument with any of the CLI options to resolve GFID split-brain. It should be the absolute path as seen from the mount point to the file considered as source.
>
> - With the source-brick option there is no way to resolve all the GFID split-brains in one shot by not specifying any file path in the CLI, as is done while resolving data or metadata split-brain. For each file in GFID split-brain, run the CLI with the policy you want to use.
>
> - Resolving directory GFID split-brain using the CLI with the "source-brick" option in a "distributed-replicated" volume needs to be done explicitly on all the sub-volumes which are in this state. Since directories get created on all the sub-volumes, using one particular brick as source for directory GFID split-brain heals the directory for that particular sub-volume. The source brick should be chosen in such a way that after the heal all the bricks of all the sub-volumes have the same GFID.
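Since each file in GFID split-brain has to be healed individually, a small wrapper loop can save time when many files are affected. This is only a sketch; the volume name `testvol`, the source brick `10.70.47.144:/bricks/brick2/b1` and the file list `files.txt` are placeholders modelled on the earlier examples:

```console
# files.txt holds one path per line, as seen from the volume root (e.g. /f3)
while read -r f; do
    gluster volume heal testvol split-brain source-brick 10.70.47.144:/bricks/brick2/b1 "$f"
done < files.txt
```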
|
||||
## Note:
|
||||
|
||||
As mentioned earlier, type-mismatch cannot be resolved using the CLI. Type-mismatch means different st_mode values (for example, the entry is a file on one brick while it is a directory on the other). Trying to heal such an entry would fail.
|
||||
|
||||
### Example
|
||||
|
||||
The entry named "entry1" is of different types on the bricks of the replica. Lets try to heal that using the split-brain CLI.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# gluster volume heal test split-brain source-brick test-host:/test/b1 /entry1
|
||||
Healing /entry1 failed:Operation not permitted.
|
||||
Volume heal failed.
|
||||
@@ -623,22 +670,23 @@ Volume heal failed.
|
||||
However, they can be fixed by deleting the file from all but one of the bricks. See [Fixing Directory entry split-brain](#dir-split-brain)
|
||||
|
||||
# An overview of working of heal info commands
|
||||
When these commands are invoked, a "glfsheal" process is spawned which reads
the entries from the various sub-directories under `/<brick-path>/.glusterfs/indices/` of all
the bricks that are up (that it can connect to) one after another. These
entries are GFIDs of files that might need healing. Once GFID entries from a
brick are obtained, based on the lookup response of this file on each
participating brick of the replica pair and the `trusted.afr.*` extended attributes, it is
determined whether the file needs healing, is in split-brain, etc., according to the
requirement of each command, and the result is displayed to the user.
|
||||
|
||||
|
||||
# 4. Resolution of split-brain from the mount point
|
||||
|
||||
A set of getfattr and setfattr commands have been provided to detect the data and metadata split-brain status of a file and resolve split-brain, if any, from mount point.
|
||||
|
||||
Consider a volume "test", having bricks b0, b1, b2 and b3.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# gluster volume info test
|
||||
|
||||
Volume Name: test
|
||||
@@ -656,7 +704,7 @@ Brick4: test-host:/test/b3
|
||||
|
||||
Directory structure of the bricks is as follows:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# tree -R /test/b?
|
||||
/test/b0
|
||||
├── dir
|
||||
@@ -683,7 +731,7 @@ Directory structure of the bricks is as follows:
|
||||
|
||||
Some files in the volume are in split-brain.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# gluster v heal test info split-brain
|
||||
Brick test-host:/test/b0/
|
||||
/file100
|
||||
@@ -708,7 +756,7 @@ Number of entries in split-brain: 2
|
||||
|
||||
### To know data/metadata split-brain status of a file:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
getfattr -n replica.split-brain-status <path-to-file>
|
||||
```
|
||||
|
||||
@@ -716,50 +764,52 @@ The above command executed from mount provides information if a file is in data/
|
||||
This command is not applicable to gfid/directory split-brain.
|
||||
|
||||
### Example:
|
||||
1) "file100" is in metadata split-brain. Executing the above mentioned command for file100 gives :
|
||||
|
||||
```console
|
||||
1. "file100" is in metadata split-brain. Executing the above mentioned command for file100 gives :
|
||||
|
||||
```{ .console .no-copy }
|
||||
# getfattr -n replica.split-brain-status file100
|
||||
file: file100
|
||||
replica.split-brain-status="data-split-brain:no metadata-split-brain:yes Choices:test-client-0,test-client-1"
|
||||
```
|
||||
|
||||
2) "file1" is in data split-brain.
|
||||
2. "file1" is in data split-brain.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# getfattr -n replica.split-brain-status file1
|
||||
file: file1
|
||||
replica.split-brain-status="data-split-brain:yes metadata-split-brain:no Choices:test-client-2,test-client-3"
|
||||
```
|
||||
|
||||
3) "file99" is in both data and metadata split-brain.
|
||||
3. "file99" is in both data and metadata split-brain.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# getfattr -n replica.split-brain-status file99
|
||||
file: file99
|
||||
replica.split-brain-status="data-split-brain:yes metadata-split-brain:yes Choices:test-client-2,test-client-3"
|
||||
```
|
||||
|
||||
4) "dir" is in directory split-brain but as mentioned earlier, the above command is not applicable to such split-brain. So it says that the file is not under data or metadata split-brain.
|
||||
4. "dir" is in directory split-brain but as mentioned earlier, the above command is not applicable to such split-brain. So it says that the file is not under data or metadata split-brain.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# getfattr -n replica.split-brain-status dir
|
||||
file: dir
|
||||
replica.split-brain-status="The file is not under data or metadata split-brain"
|
||||
```
|
||||
|
||||
5) "file2" is not in any kind of split-brain.
|
||||
5. "file2" is not in any kind of split-brain.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# getfattr -n replica.split-brain-status file2
|
||||
file: file2
|
||||
replica.split-brain-status="The file is not under data or metadata split-brain"
|
||||
```
|
||||
|
||||
### To analyze the files in data and metadata split-brain
|
||||
|
||||
Trying to do operations (say cat, getfattr etc) from the mount on files in split-brain, gives an input/output error. To enable the users analyze such files, a setfattr command is provided.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# setfattr -n replica.split-brain-choice -v "choiceX" <path-to-file>
|
||||
```
|
||||
|
||||
@@ -767,9 +817,9 @@ Using this command, a particular brick can be chosen to access the file in split
|
||||
|
||||
### Example:
|
||||
|
||||
1) "file1" is in data-split-brain. Trying to read from the file gives input/output error.
|
||||
1. "file1" is in data-split-brain. Trying to read from the file gives input/output error.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# cat file1
|
||||
cat: file1: Input/output error
|
||||
```
|
||||
@@ -778,13 +828,13 @@ Split-brain choices provided for file1 were test-client-2 and test-client-3.
|
||||
|
||||
Setting test-client-2 as split-brain choice for file1 serves reads from b2 for the file.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# setfattr -n replica.split-brain-choice -v test-client-2 file1
|
||||
```
|
||||
|
||||
Now, read operations on the file can be done.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# cat file1
|
||||
xyz
|
||||
```
|
||||
@@ -793,18 +843,18 @@ Similarly, to inspect the file from other choice, replica.split-brain-choice is
|
||||
|
||||
Trying to inspect the file from a wrong choice errors out.
|
||||
|
||||
To undo the split-brain-choice that has been set, the above mentioned setfattr command can be used
|
||||
with "none" as the value for extended attribute.
|
||||
|
||||
### Example:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# setfattr -n replica.split-brain-choice -v none file1
|
||||
```
|
||||
|
||||
Now performing cat operation on the file will again result in input/output error, as before.
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# cat file1
|
||||
cat: file1: Input/output error
|
||||
```
|
||||
@@ -812,13 +862,13 @@ cat: file1: Input/output error
|
||||
Once the choice for resolving split-brain is made, source brick is supposed to be set for the healing to be done.
|
||||
This is done using the following command:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# setfattr -n replica.split-brain-heal-finalize -v <heal-choice> <path-to-file>
|
||||
```
|
||||
|
||||
## Example
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# setfattr -n replica.split-brain-heal-finalize -v test-client-2 file1
|
||||
```
|
||||
|
||||
@@ -826,18 +876,19 @@ The above process can be used to resolve data and/or metadata split-brain on all
|
||||
|
||||
**NOTE**:
|
||||
|
||||
1) If "fopen-keep-cache" fuse mount option is disabled then inode needs to be invalidated each time before selecting a new replica.split-brain-choice to inspect a file. This can be done by using:
|
||||
1. If "fopen-keep-cache" fuse mount option is disabled then inode needs to be invalidated each time before selecting a new replica.split-brain-choice to inspect a file. This can be done by using:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# sefattr -n inode-invalidate -v 0 <path-to-file>
|
||||
```
|
||||
|
||||
2. The above mentioned process for split-brain resolution from mount will not work on nfs mounts as it doesn't provide xattrs support.
|
||||
|
||||
# 5. Automagic unsplit-brain by [ctime|mtime|size|majority]
|
||||
The CLI and fuse mount based resolution methods require intervention in the sense that the admin/user needs to run the commands manually. There is a `cluster.favorite-child-policy` volume option which, when set to one of the various policies available, automatically resolves split-brains without user intervention. The default value is 'none', i.e. it is disabled.

```{ .console .no-copy }
# gluster volume set help | grep -A3 cluster.favorite-child-policy
|
||||
Option: cluster.favorite-child-policy
|
||||
Default Value: none
|
||||
@@ -846,40 +897,41 @@ Description: This option can be used to automatically resolve split-brains using
|
||||
|
||||
`cluster.favorite-child-policy` applies to all files of the volume. It is assumed that if this option is enabled with a particular policy, you don't care to examine the split-brain files on a per file basis but just want the split-brain to be resolved as and when it occurs based on the set policy.
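For example, to have such split-brains resolved automatically in favour of the copy with the latest modification time, the option can be set (and later disabled again) as follows; `testvol` is just the volume name used in the earlier examples:

```console
gluster volume set testvol cluster.favorite-child-policy mtime
gluster volume set testvol cluster.favorite-child-policy none
```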
|
||||
|
||||
|
||||
<a name="manual-split-brain"></a>
|
||||
# Manual Split-Brain Resolution:
|
||||
|
||||
# Quick Start:

1. Get the path of the file that is in split-brain:

   > It can be obtained either by
   > a) The command `gluster volume heal info split-brain`.
   > b) Identify the files for which file operations performed from the client keep failing with Input/Output error.

1. Close the applications that opened this file from the mount point.
   In case of VMs, they need to be powered-off.

1. Decide on the correct copy:

   > This is done by observing the afr changelog extended attributes of the file on
   > the bricks using the getfattr command; then identifying the type of split-brain
   > (data split-brain, metadata split-brain, entry split-brain or split-brain due to
   > gfid-mismatch); and finally determining which of the bricks contains the 'good copy'
   > of the file.
   > `getfattr -d -m . -e hex <file-path-on-brick>`.
   > It is also possible that one brick might contain the correct data while the
   > other might contain the correct metadata.

1. Reset the relevant extended attribute on the brick(s) that contains the
   'bad copy' of the file data/metadata using the setfattr command.

   > `setfattr -n <attribute-name> -v <attribute-value> <file-path-on-brick>`

1. Trigger self-heal on the file by performing lookup from the client:

   > `ls -l <file-path-on-gluster-mount>`

# Detailed Instructions for steps 3 through 5:
|
||||
To understand how to resolve split-brain we need to know how to interpret the
|
||||
afr changelog extended attributes.
|
||||
|
||||
@@ -887,7 +939,7 @@ Execute `getfattr -d -m . -e hex <file-path-on-brick>`
|
||||
|
||||
Example:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
[root@store3 ~]# getfattr -d -e hex -m. brick-a/file.txt
|
||||
\#file: brick-a/file.txt
|
||||
security.selinux=0x726f6f743a6f626a6563745f723a66696c655f743a733000
|
||||
@@ -900,7 +952,7 @@ The extended attributes with `trusted.afr.<volname>-client-<subvolume-index>`
|
||||
are used by afr to maintain the changelog of the file. The values of the
`trusted.afr.<volname>-client-<subvolume-index>` are calculated by the glusterfs
client (fuse or nfs-server) processes. When the glusterfs client modifies a file
or directory, the client contacts each brick and updates the changelog extended
attribute according to the response of the brick.
|
||||
|
||||
'subvolume-index' is nothing but (brick number - 1) in
|
||||
@@ -908,7 +960,7 @@ attribute according to the response of the brick.
|
||||
|
||||
Example:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
[root@pranithk-laptop ~]# gluster volume info vol
|
||||
Volume Name: vol
|
||||
Type: Distributed-Replicate
|
||||
@@ -929,7 +981,7 @@ Example:
|
||||
|
||||
In the example above:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
Brick | Replica set | Brick subvolume index
|
||||
----------------------------------------------------------------------------
|
||||
-/gfs/brick-a | 0 | 0
|
||||
@@ -945,25 +997,25 @@ Brick | Replica set | Brick subvolume index
|
||||
Each file in a brick maintains the changelog of itself and that of the files
|
||||
present in all the other bricks in its replica set as seen by that brick.
|
||||
|
||||
In the example volume given above, all files in brick-a will have 2 entries,
one for itself and the other for the file present in its replica pair, i.e. brick-b:
trusted.afr.vol-client-0=0x000000000000000000000000 -->changelog for itself (brick-a)
trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for brick-b as seen by brick-a

Likewise, all files in brick-b will have:
trusted.afr.vol-client-0=0x000000000000000000000000 -->changelog for brick-a as seen by brick-b
trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for itself (brick-b)

The same can be extended for other replica pairs.

Interpreting Changelog (roughly pending operation count) Value:
Each extended attribute has a value which is 24 hexadecimal digits.
The first 8 digits represent the changelog of data, the second 8 digits the changelog
of metadata, and the last 8 digits the changelog of directory entries.
|
||||
|
||||
Pictorially representing the same, we have:
|
||||
|
||||
```{ .text .no-copy }
0x 000003d7 00000001 00000000
        |        |        |
        |        |         \_ changelog of directory entries
        |        |
        |         \_ changelog of metadata
        |
         \_ changelog of data
```
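As a quick illustration of how to read such a value (this is not an official tool, just plain shell slicing of the 24-digit sample shown above):

```console
val=000003d70000000100000000
echo "data changelog:     0x${val:0:8}"
echo "metadata changelog: 0x${val:8:8}"
echo "entry changelog:    0x${val:16:8}"
```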
|
||||
|
||||
|
||||
For directories, metadata and entry changelogs are valid.
For regular files, data and metadata changelogs are valid.
For special files like device files etc., the metadata changelog is valid.
When a file split-brain happens it could be either a data split-brain or
a metadata split-brain or both. When a split-brain happens the changelog of the
file would be something like this:

Example: (Let's consider both data and metadata split-brain on the same file).

```{ .console .no-copy }
[root@pranithk-laptop vol]# getfattr -d -m . -e hex /gfs/brick-?/a
|
||||
getfattr: Removing leading '/' from absolute path names
|
||||
\#file: gfs/brick-a/a
|
||||
@@ -1007,7 +1058,7 @@ on itself but failed on /gfs/brick-b/a.
|
||||
The second 8 digits of trusted.afr.vol-client-0 are
|
||||
all zeros (0x........00000000........), and the second 8 digits of
|
||||
trusted.afr.vol-client-1 are not all zeros (0x........00000001........).
|
||||
So the changelog on /gfs/brick-a/a implies that some metadata operations succeeded
|
||||
on itself but failed on /gfs/brick-b/a.
|
||||
|
||||
#### According to Changelog extended attributes on file /gfs/brick-b/a:
|
||||
@@ -1029,12 +1080,12 @@ file, it is in both data and metadata split-brain.
|
||||
|
||||
#### Deciding on the correct copy:
|
||||
|
||||
The user may have to inspect the stat and getfattr output of the files to decide which
|
||||
metadata to retain and contents of the file to decide which data to retain.
|
||||
Continuing with the example above, let's say we want to retain the data
|
||||
of /gfs/brick-a/a and metadata of /gfs/brick-b/a.
|
||||
|
||||
#### Resetting the relevant changelogs to resolve the split-brain:
|
||||
|
||||
For resolving data-split-brain:
|
||||
|
||||
@@ -1068,27 +1119,31 @@ For trusted.afr.vol-client-1
|
||||
Hence execute
|
||||
`setfattr -n trusted.afr.vol-client-1 -v 0x000003d70000000000000000 /gfs/brick-a/a`
|
||||
|
||||
Thus after the above operations are done, the changelogs look like this:

```{ .console .no-copy }
[root@pranithk-laptop vol]# getfattr -d -m . -e hex /gfs/brick-?/a
getfattr: Removing leading '/' from absolute path names
\#file: gfs/brick-a/a
trusted.afr.vol-client-0=0x000000000000000000000000
trusted.afr.vol-client-1=0x000003d70000000000000000
trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57

\#file: gfs/brick-b/a
trusted.afr.vol-client-0=0x000000000000000100000000
trusted.afr.vol-client-1=0x000000000000000000000000
trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57
```
|
||||
|
||||
## Triggering Self-heal:

Perform `ls -l <file-path-on-gluster-mount>` to trigger healing.
|
||||
|
||||
<a name="dir-split-brain"></a>
|
||||
Fixing Directory entry split-brain:
|
||||
----------------------------------
|
||||
|
||||
---
|
||||
|
||||
Afr has the ability to conservatively merge different entries in the directories
|
||||
when there is a split-brain on directory.
|
||||
If on one brick directory 'd' has entries '1', '2' and has entries '3', '4' on
|
||||
@@ -1108,9 +1163,11 @@ needs to be removed.The gfid-link files are present in the .glusterfs folder
|
||||
in the top-level directory of the brick. If the gfid of the file is
|
||||
0x307a5c9efddd4e7c96e94fd4bcdcbd1b (the trusted.gfid extended attribute got
|
||||
from the getfattr command earlier),the gfid-link file can be found at
|
||||
|
||||
> /gfs/brick-a/.glusterfs/30/7a/307a5c9efddd4e7c96e94fd4bcdcbd1b
|
||||
|
||||
#### Word of caution:
|
||||
|
||||
Before deleting the gfid-link, we have to ensure that there are no hard links
|
||||
to the file present on that brick. If hard-links exist, they must be deleted as
|
||||
well.
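A quick way to check for such hard links is to list every path on the brick that shares the inode with the gfid-link; the brick path and gfid below are just the ones from the example above:

```console
find /gfs/brick-a -samefile /gfs/brick-a/.glusterfs/30/7a/307a5c9efddd4e7c96e94fd4bcdcbd1b
```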
|
||||
|
||||
@@ -2,20 +2,18 @@
|
||||
|
||||
A statedump is, as the name suggests, a dump of the internal state of a glusterfs process. It captures information about in-memory structures such as frames, call stacks, active inodes, fds, mempools, iobufs, and locks as well as xlator specific data structures. This can be an invaluable tool for debugging memory leaks and hung processes.
|
||||
|
||||
- [Generate a Statedump](#generate-a-statedump)
- [Read a Statedump](#read-a-statedump)
- [Debug with a Statedump](#debug-with-statedumps)

---
|
||||
|
||||
## Generate a Statedump
|
||||
|
||||
Run the command
|
||||
|
||||
```console
|
||||
gluster --print-statedumpdir
|
||||
```
|
||||
|
||||
on a gluster server node to find out which directory the statedumps will be created in. This directory may need to be created if not already present.
|
||||
@@ -38,7 +36,6 @@ kill -USR1 <pid-of-gluster-mount-process>
|
||||
There are specific commands to generate statedumps for all brick processes/nfs server/quotad which can be used instead of the above. Run the following
|
||||
commands on one of the server nodes:
|
||||
|
||||
|
||||
For bricks:
|
||||
|
||||
```console
|
||||
@@ -59,16 +56,17 @@ gluster volume statedump <volname> quotad
|
||||
|
||||
The statedumps will be created in `statedump-directory` on each node. The statedumps for brick processes will be created with the filename `hyphenated-brick-path.<pid>.dump.timestamp` while for all other processes it will be `glusterdump.<pid>.dump.timestamp`.
|
||||
|
||||
***
|
||||
---
|
||||
|
||||
## Read a Statedump
|
||||
|
||||
Statedumps are text files and can be opened in any text editor. The first and last lines of the file contain the start and end time (in UTC)respectively of when the statedump file was written.
|
||||
|
||||
### Mallinfo
|
||||
|
||||
The mallinfo return status is printed in the following format. Please read _man mallinfo_ for more information about what each field means.
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[mallinfo]
|
||||
mallinfo_arena=100020224 /* Non-mmapped space allocated (bytes) */
|
||||
mallinfo_ordblks=69467 /* Number of free chunks */
|
||||
@@ -83,19 +81,19 @@ mallinfo_keepcost=133712 /* Top-most, releasable space (bytes) */
|
||||
```
|
||||
|
||||
### Memory accounting stats
|
||||
|
||||
Each xlator defines data structures specific to its requirements. The statedump captures information about the memory usage and allocations of these structures for each xlator in the call-stack and prints them in the following format:
|
||||
|
||||
For the xlator with the name _glusterfs_
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[global.glusterfs - Memory usage] #[global.<xlator-name> - Memory usage]
|
||||
num_types=119 #The number of data types it is using
|
||||
```
|
||||
|
||||
|
||||
followed by the memory usage for each data-type for that translator. The following example displays a sample for the gf_common_mt_gf_timer_t type
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[global.glusterfs - usage-type gf_common_mt_gf_timer_t memusage]
|
||||
#[global.<xlator-name> - usage-type <tag associated with the data-type> memusage]
|
||||
size=112 #Total size allocated for data-type when the statedump was taken i.e. num_allocs * sizeof (data-type)
|
||||
@@ -113,7 +111,7 @@ Mempools are an optimization intended to reduce the number of allocations of a d
|
||||
|
||||
Memory pool allocations by each xlator are displayed in the following format:
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[mempool] #Section name
|
||||
-----=-----
|
||||
pool-name=fuse:fd_t #pool-name=<xlator-name>:<data-type>
|
||||
@@ -129,10 +127,9 @@ max-stdalloc=0 #Maximum number of allocations from heap that were in active
|
||||
|
||||
This information is also useful while debugging high memory usage issues as large hot_count and cur-stdalloc values may point to an element not being freed after it has been used.
|
||||
|
||||
|
||||
### Iobufs
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[iobuf.global]
|
||||
iobuf_pool=0x1f0d970 #The memory pool for iobufs
|
||||
iobuf_pool.default_page_size=131072 #The default size of iobuf (if no iobuf size is specified the default size is allocated)
|
||||
@@ -148,7 +145,7 @@ There are 3 lists of arenas
|
||||
2. Purge list: arenas that can be purged(no active iobufs, active_cnt == 0).
|
||||
3. Filled list: arenas without free iobufs.
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[purge.1] #purge.<S.No.>
|
||||
purge.1.mem_base=0x7fc47b35f000 #The address of the arena structure
|
||||
purge.1.active_cnt=0 #The number of iobufs active in that arena
|
||||
@@ -168,7 +165,7 @@ arena.5.page_size=32768
|
||||
|
||||
If the active_cnt of any arena is non zero, then the statedump will also have the iobuf list.
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[arena.6.active_iobuf.1] #arena.<S.No>.active_iobuf.<iobuf.S.No.>
|
||||
arena.6.active_iobuf.1.ref=1 #refcount of the iobuf
|
||||
arena.6.active_iobuf.1.ptr=0x7fdb921a9000 #address of the iobuf
|
||||
@@ -180,12 +177,11 @@ arena.6.active_iobuf.2.ptr=0x7fdb92189000
|
||||
|
||||
A lot of filled arenas at any given point in time could be a sign of iobuf leaks.
|
||||
|
||||
|
||||
### Call stack
|
||||
|
||||
The fops received by gluster are handled using call stacks. A call stack contains information about the uid/gid/pid etc of the process that is executing the fop. Each call stack contains different call-frames for each xlator which handles that fop.
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[global.callpool.stack.3] #global.callpool.stack.<Serial-Number>
|
||||
stack=0x7fc47a44bbe0 #Stack address
|
||||
uid=0 #Uid of the process executing the fop
|
||||
@@ -199,9 +195,10 @@ cnt=9 #Number of frames in this stack.
|
||||
```
|
||||
|
||||
### Call-frame
|
||||
|
||||
Each frame will have information about which xlator the frame belongs to, which function it wound to/from and which it will be unwound to, and whether it has unwound.
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[global.callpool.stack.3.frame.2] #global.callpool.stack.<stack-serial-number>.frame.<frame-serial-number>
|
||||
frame=0x7fc47a611dbc #Frame address
|
||||
ref_count=0 #Incremented at the time of wind and decremented at the time of unwind.
|
||||
@@ -215,12 +212,11 @@ unwind_to=afr_lookup_cbk #Parent xlator function to unwind to
|
||||
|
||||
To debug hangs in the system, see which xlator has not yet unwound its fop by checking the value of the _complete_ tag in the statedump. (_complete=0_ indicates the xlator has not yet unwound).
|
||||
|
||||
|
||||
### FUSE Operation History
|
||||
|
||||
Gluster Fuse maintains a history of the operations that it has performed.
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[xlator.mount.fuse.history]
|
||||
TIME=2014-07-09 16:44:57.523364
|
||||
message=[0] fuse_release: RELEASE(): 4590:, fd: 0x1fef0d8, gfid: 3afb4968-5100-478d-91e9-76264e634c9f
|
||||
@@ -234,7 +230,7 @@ message=[0] fuse_getattr_resume: 4591, STAT, path: (/iozone.tmp), gfid: (3afb496
|
||||
|
||||
### Xlator configuration
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[cluster/replicate.r2-replicate-0] #Xlator type, name information
|
||||
child_count=2 #Number of children for the xlator
|
||||
#Xlator specific configuration below
|
||||
@@ -255,7 +251,7 @@ wait_count=1
|
||||
|
||||
### Graph/inode table
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[active graph - 1]
|
||||
|
||||
conn.1.bound_xl./data/brick01a/homegfs.hashsize=14057
|
||||
@@ -268,7 +264,7 @@ conn.1.bound_xl./data/brick01a/homegfs.purge_size=0 #Number of inodes present
|
||||
|
||||
### Inode
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[conn.1.bound_xl./data/brick01a/homegfs.active.324] #324th inode in active inode list
|
||||
gfid=e6d337cf-97eb-44b3-9492-379ba3f6ad42 #Gfid of the inode
|
||||
nlookup=13 #Number of times lookups happened from the client or from fuse kernel
|
||||
@@ -285,9 +281,10 @@ ia_type=2
|
||||
```
|
||||
|
||||
### Inode context
|
||||
|
||||
Each xlator can store information specific to it in the inode context. This context can also be printed in the statedump. Here is the inode context of the locks xlator
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[xlator.features.locks.homegfs-locks.inode]
|
||||
path=/homegfs/users/dfrobins/gfstest/r4/SCRATCH/fort.5102 - path of the file
|
||||
mandatory=0
|
||||
@@ -301,10 +298,11 @@ lock-dump.domain.domain=homegfs-replicate-0:metadata #Domain name where metadata
|
||||
lock-dump.domain.domain=homegfs-replicate-0 #Domain name where entry/data operations take locks to maintain replication consistency
|
||||
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=11141120, len=131072, pid = 18446744073709551615, owner=080b1ada117f0000, client=0xb7fc30, connection-id=compute-30-029.com-3505-2014/06/29-14:46:12:477358-homegfs-client-0-0-1, granted at Sun Jun 29 11:10:36 2014 #Active lock information
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
---
|
||||
|
||||
## Debug With Statedumps
|
||||
|
||||
### Memory leaks
|
||||
|
||||
Statedumps can be used to determine whether the high memory usage of a process is caused by a leak. To debug the issue, generate statedumps for that process at regular intervals, or before and after running the steps that cause the memory used to increase. Once you have multiple statedumps, compare the memory allocation stats to see if any of them are increasing steadily as those could indicate a potential memory leak.
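For example, assuming two statedumps of the same process were collected some time apart (the file names below are placeholders), the per-type counters can be extracted and compared directly:

```console
grep -E "num_allocs|hot-count" glusterdump.<pid>.dump.<t1> > before.txt
grep -E "num_allocs|hot-count" glusterdump.<pid>.dump.<t2> > after.txt
diff before.txt after.txt
```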
|
||||
@@ -315,7 +313,7 @@ The following examples walk through using statedumps to debug two different memo
|
||||
|
||||
[BZ 1120151](https://bugzilla.redhat.com/show_bug.cgi?id=1120151) reported high memory usage by the self heal daemon whenever one of the bricks was wiped in a replicate volume and a full self-heal was invoked to heal the contents. This issue was debugged using statedumps to determine which data-structure was leaking memory.
|
||||
|
||||
A statedump of the self heal daemon process was taken using
|
||||
|
||||
```console
|
||||
kill -USR1 `<pid-of-gluster-self-heal-daemon>`
|
||||
@@ -323,7 +321,7 @@ kill -USR1 `<pid-of-gluster-self-heal-daemon>`
|
||||
|
||||
On examining the statedump:
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
grep -w num_allocs glusterdump.5225.dump.1405493251
|
||||
num_allocs=77078
|
||||
num_allocs=87070
|
||||
@@ -338,6 +336,7 @@ hot-count=4095
|
||||
```
|
||||
|
||||
On searching for num_allocs with high values in the statedump, a `grep` of the statedump revealed a large number of allocations for the following data-types under the replicate xlator:
|
||||
|
||||
1. gf_common_mt_asprintf
|
||||
2. gf_common_mt_char
|
||||
3. gf_common_mt_mem_pool.
|
||||
@@ -345,16 +344,15 @@ On searching for num_allocs with high values in the statedump, a `grep` of the s
|
||||
On checking the afr-code for allocations with tag `gf_common_mt_char`, it was found that the `data-self-heal` code path does not free one such allocated data structure. `gf_common_mt_mem_pool` suggests that there is a leak in pool memory. The `replicate-0:dict_t`, `glusterfs:data_t` and `glusterfs:data_pair_t` pools are using a lot of memory, i.e. cold_count is `0` and there are too many allocations. Checking the source code of dict.c shows that `key` in `dict` is allocated with `gf_common_mt_char` i.e. `2.` tag and value is created using gf_asprintf which in-turn uses `gf_common_mt_asprintf` i.e. `1.`. Checking the code for leaks in self-heal code paths led to a line which over-writes a variable with new dictionary even when it was already holding a reference to another dictionary. After fixing these leaks, we ran the same test to verify that none of the `num_allocs` values increased in the statedump of the self-daemon after healing 10,000 files.
|
||||
Please check [http://review.gluster.org/8316](http://review.gluster.org/8316) for more info about the patch/code.
|
||||
|
||||
|
||||
#### Leaks in mempools:
|
||||
The statedump output of mempools was used to test and verify the fixes for [BZ 1134221](https://bugzilla.redhat.com/show_bug.cgi?id=1134221). On code analysis, dict_t objects were found to be leaking (due to missing unref's) during name self-heal.
|
||||
|
||||
Glusterfs was compiled with the -DDEBUG flags to have cold count set to 0 by default. The test involved creating 100 files on plain replicate volume, removing them from one of the backend bricks, and then triggering lookups on them from the mount point. A statedump of the mount process was taken before executing the test case and after it was completed.
|
||||
|
||||
Statedump output of the fuse mount process before the test case was executed:
|
||||
|
||||
```
|
||||
|
||||
```{.text .no-copy }
|
||||
pool-name=glusterfs:dict_t
|
||||
hot-count=0
|
||||
cold-count=0
|
||||
@@ -364,12 +362,11 @@ max-alloc=0
|
||||
pool-misses=33
|
||||
cur-stdalloc=14
|
||||
max-stdalloc=18
|
||||
|
||||
```
|
||||
|
||||
Statedump output of the fuse mount process after the test case was executed:
|
||||
|
||||
```
|
||||
|
||||
```{.text .no-copy }
|
||||
pool-name=glusterfs:dict_t
|
||||
hot-count=0
|
||||
cold-count=0
|
||||
@@ -379,15 +376,15 @@ max-alloc=0
|
||||
pool-misses=2841
|
||||
cur-stdalloc=214
|
||||
max-stdalloc=220
|
||||
|
||||
```
|
||||
|
||||
Here, as cold count was 0 by default, cur-stdalloc indicates the number of dict_t objects that were allocated from the heap using mem_get(), and are yet to be freed using mem_put(). After running the test case (named selfheal of 100 files), there was a rise in the cur-stdalloc value (from 14 to 214) for dict_t.
|
||||
|
||||
After the leaks were fixed, glusterfs was again compiled with -DDEBUG flags and the steps were repeated. Statedumps of the FUSE mount were taken before and after executing the test case to ascertain the validity of the fix. And the results were as follows:
|
||||
|
||||
Statedump output of the fuse mount process before executing the test case:
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
pool-name=glusterfs:dict_t
|
||||
hot-count=0
|
||||
cold-count=0
|
||||
@@ -397,11 +394,11 @@ max-alloc=0
|
||||
pool-misses=33
|
||||
cur-stdalloc=14
|
||||
max-stdalloc=18
|
||||
|
||||
```
|
||||
|
||||
Statedump output of the fuse mount process after executing the test case:
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
pool-name=glusterfs:dict_t
|
||||
hot-count=0
|
||||
cold-count=0
|
||||
@@ -411,17 +408,18 @@ max-alloc=0
|
||||
pool-misses=2837
|
||||
cur-stdalloc=14
|
||||
max-stdalloc=119
|
||||
|
||||
```
|
||||
|
||||
The value of cur-stdalloc remained 14 after the test, indicating that the fix indeed does what it's supposed to do.
|
||||
|
||||
### Hangs caused by frame loss
|
||||
|
||||
[BZ 994959](https://bugzilla.redhat.com/show_bug.cgi?id=994959) reported that the Fuse mount hangs on a readdirp operation.
|
||||
Here are the steps used to locate the cause of the hang using statedump.
|
||||
|
||||
Statedumps were taken for all gluster processes after reproducing the issue. The following stack was seen in the FUSE mount's statedump:
|
||||
|
||||
```
|
||||
```{.text .no-copy }
|
||||
[global.callpool.stack.1.frame.1]
|
||||
ref_count=1
|
||||
translator=fuse
|
||||
@@ -463,8 +461,8 @@ parent=r2-quick-read
|
||||
wind_from=qr_readdirp
|
||||
wind_to=FIRST_CHILD (this)->fops->readdirp
|
||||
unwind_to=qr_readdirp_cbk
|
||||
|
||||
```
|
||||
|
||||
`unwind_to` shows that call was unwound to `afr_readdirp_cbk` from the r2-client-1 xlator.
|
||||
Inspecting that function revealed that afr is not unwinding the stack when fop failed.
|
||||
Check [http://review.gluster.org/5531](http://review.gluster.org/5531) for more info about patch/code changes.
|
||||
|
||||
@@ -8,7 +8,7 @@ The first level of analysis always starts with looking at the log files. Which o
|
||||
Sometimes, you might need more verbose logging to figure out what’s going on:
|
||||
`gluster volume set $volname client-log-level $LEVEL`
|
||||
|
||||
where LEVEL can be any one of `DEBUG, WARNING, ERROR, INFO, CRITICAL, NONE, TRACE`. This should ideally make all the log files mentioned above start logging at `$LEVEL`. The default is `INFO` but you can temporarily toggle it to `DEBUG` or `TRACE` if you want to see under-the-hood messages. Useful when the normal logs don’t give a clue as to what is happening.
|
||||
|
||||
## Heal related issues:
|
||||
|
||||
@@ -20,17 +20,19 @@ Most issues I’ve seen on the mailing list and with customers can broadly fit i
|
||||
|
||||
If the number of entries are large, then heal info will take longer than usual. While there are performance improvements to heal info being planned, a faster way to get an approx. count of the pending entries is to use the `gluster volume heal $VOLNAME statistics heal-count` command.
|
||||
|
||||
**Knowledge Hack:** Since we know that during the write transaction, the xattrop folder will capture the gfid-string of the file if it needs heal, we can also do an `ls /brick/.glusterfs/indices/xattrop|wc -l` on each brick to get the approx. no of entries that need heal (see the sketch after the log excerpt below). If this number reduces over time, it is a sign that the heal backlog is reducing. You will also see messages whenever a particular type of heal starts/ends for a given gfid, like so:
|
||||
|
||||
```{.text .no-copy }
[2019-05-07 12:05:14.460442] I [MSGID: 108026] [afr-self-heal-entry.c:883:afr_selfheal_entry_do] 0-testvol-replicate-0: performing entry selfheal on d120c0cf-6e87-454b-965b-0d83a4c752bb

[2019-05-07 12:05:14.474710] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed entry selfheal on d120c0cf-6e87-454b-965b-0d83a4c752bb. sources=[0] 2 sinks=1

[2019-05-07 12:05:14.493506] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed data selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1

[2019-05-07 12:05:14.494577] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-testvol-replicate-0: performing metadata selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5

[2019-05-07 12:05:14.498398] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed metadata selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1
```
|
||||
|
||||
### ii) Self-heal is stuck/ not getting completed.
|
||||
|
||||
@@ -38,69 +40,88 @@ If a file seems to be forever appearing in heal info and not healing, check the
|
||||
|
||||
- Examine the afr xattrs- Do they clearly indicate the good and bad copies? If there isn’t at least one good copy, then the file is in split-brain and you would need to use the split-brain resolution CLI.
|
||||
- Identify which node’s shds would be picking up the file for heal. If a file is listed in the heal info output under brick1 and brick2, then the shds on the nodes which host those bricks would attempt (and one of them would succeed) in doing the heal.
|
||||
- Once the shd is identified, look at the shd logs to see if it is indeed connected to the bricks.
|
||||
|
||||
This is good:

```{.text .no-copy }
|
||||
[2019-05-07 09:53:02.912923] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-testvol-client-2: Connected to testvol-client-2, attached to remote volume '/bricks/brick3'
|
||||
```
|
||||
|
||||
This indicates a disconnect:

```{.text .no-copy }
|
||||
[2019-05-07 11:44:47.602862] I [MSGID: 114018] [client.c:2334:client_rpc_notify] 0-testvol-client-2: disconnected from testvol-client-2. Client process will keep trying to connect to glusterd until brick's port is available
|
||||
|
||||
[2019-05-07 11:44:50.953516] E [MSGID: 114058] [client-handshake.c:1456:client_query_portmap_cbk] 0-testvol-client-2: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
|
||||
```
|
||||
|
||||
Alternatively, take a statedump of the self-heal daemon (shd) and check if all client xlators are connected to the respective bricks. The shd must have `connected=1` for all the client xlators, meaning it can talk to all the bricks.
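A sketch of one way to get that statedump: find the shd pid (listed as "Self-heal Daemon" in `gluster volume status`) and send it `SIGUSR1`; the dump is written under the statedump directory, usually `/var/run/gluster` (the exact pid and path will differ on your system):

```console
# Identify the Self-heal Daemon pid
gluster volume status testvol

# Ask the shd to dump its state
kill -USR1 <shd-pid>

# Look for the freshly written dump file
ls -lrt /var/run/gluster
```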
|
||||
|
||||
| Shd’s statedump entry of a client xlator that is connected to the 3rd brick | Shd’s statedump entry of the same client xlator if it is disconnected from the 3rd brick |
| :--------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------: |
| [xlator.protocol.client.testvol-client-2.priv] connected=1 total_bytes_read=75004 ping_timeout=42 total_bytes_written=50608 ping_msgs_sent=0 msgs_sent=0 | [xlator.protocol.client.testvol-client-2.priv] connected=0 total_bytes_read=75004 ping_timeout=42 total_bytes_written=50608 ping_msgs_sent=0 msgs_sent=0 |
|
||||
|
||||
If there are connection issues (i.e. `connected=0`), you would need to investigate and fix them. Check if the pid and the TCP/RDMA port of the brick process shown in `gluster volume status $VOLNAME` match those of `ps aux|grep glusterfsd|grep $brick-path`.
|
||||
|
||||
```{.text .no-copy }
# gluster volume status
Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 127.0.0.2:/bricks/brick1              49152     0          Y       12527
```

```{.text .no-copy }
# ps aux|grep brick1

root 12527 0.0 0.1 1459208 20104 ? Ssl 11:20 0:01 /usr/local/sbin/glusterfsd -s 127.0.0.2 --volfile-id testvol.127.0.0.2.bricks-brick1 -p /var/run/gluster/vols/testvol/127.0.0.2-bricks-brick1.pid -S /var/run/gluster/70529980362a17d6.socket --brick-name /bricks/brick1 -l /var/log/glusterfs/bricks/bricks-brick1.log --xlator-option *-posix.glusterd-uuid=d90b1532-30e5-4f9d-a75b-3ebb1c3682d4 --process-name brick --brick-port 49152 --xlator-option testvol-server.listen-port=49152
```
|
||||
|
||||
Though this will likely match, sometimes there could be a bug leading to stale port usage. A quick workaround would be to restart glusterd on that node and check if things match. Report the issue to the devs if you see this problem.
|
||||
|
||||
- I have seen some cases where a file is listed in heal info, and the afr xattrs indicate pending metadata or data heal but the file itself is not present on all bricks. Ideally, the parent directory of the file must have pending entry heal xattrs so that the file either gets created on the missing bricks or gets deleted from the ones where it is present. But if the parent dir doesn’t have xattrs, the entry heal can’t proceed. In such cases, you can
- Either do a lookup directly on the file from the mount so that name heal is triggered and then shd can pick up the data/metadata heal.
- Or manually set entry xattrs on the parent dir to emulate an entry heal so that the file gets created as a part of it.
- If a brick’s underlying filesystem/lvm was damaged and fsck’d to recovery, some files/dirs might be missing on it. If there is a lot of missing info on the recovered bricks, it might be better to just do a replace-brick or reset-brick and let the heal fully sync everything rather than fiddling with afr xattrs of individual entries.
|
||||
|
||||
**Hack:** How to trigger heal on _any_ file/directory
|
||||
Knowing about self-heal logic and index heal from the previous post, we can sort of emulate a heal with the following steps. This is not something that you should be doing on your cluster but it pays to at least know that it is possible when push comes to shove.
|
||||
|
||||
1. Pick one brick as good and set the afr pending xattr on it, blaming the bad bricks.
|
||||
2. Capture the gfid inside .glusterfs/indices/xattrop so that the shd can pick it up during index heal.
|
||||
3. Finally, trigger index heal: `gluster volume heal $VOLNAME`.
|
||||
|
||||
_Example:_ Let us say a FILE-1 exists with `trusted.gfid=0x1ad2144928124da9b7117d27393fea5c` on all bricks of a replica 3 volume called testvol. It has no afr xattrs. But you still need to emulate a heal. Let us say you choose brick-2 as the source. Let us do the steps listed above:

1. Make brick-2 blame the other 2 bricks:

        setfattr -n trusted.afr.testvol-client-2 -v 0x000000010000000000000000 /bricks/brick2/FILE-1
        setfattr -n trusted.afr.testvol-client-1 -v 0x000000010000000000000000 /bricks/brick2/FILE-1

2. Store the gfid string inside the xattrop folder as a hardlink to the base entry:

        # cd /bricks/brick2/.glusterfs/indices/xattrop/
        # ls -li
        total 0
        17829255 ----------. 1 root root 0 May 10 11:20 xattrop-a400ca91-cec9-4463-a183-aca9eaff9fa7

        # ln xattrop-a400ca91-cec9-4463-a183-aca9eaff9fa7 1ad21449-2812-4da9-b711-7d27393fea5c
        # ll
        total 0
        ----------. 2 root root 0 May 10 11:20 1ad21449-2812-4da9-b711-7d27393fea5c
        ----------. 2 root root 0 May 10 11:20 xattrop-a400ca91-cec9-4463-a183-aca9eaff9fa7

3. Trigger heal: `gluster volume heal testvol`

    The glustershd.log of node-2 should log about the heal.

        [2019-05-10 06:10:46.027238] I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed data selfheal on 1ad21449-2812-4da9-b711-7d27393fea5c. sources=[1] sinks=0 2

    So the data was healed from the second brick to the first and third brick.
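The afr xattrs used in this example can be inspected directly on the bricks with `getfattr`; a sketch, assuming the brick path from the example (the `trusted.afr.*` entries should be cleared once the heal completes):

```console
# Dump all xattrs of the file on one brick, in hex
getfattr -d -m . -e hex /bricks/brick2/FILE-1
```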
|
||||
|
||||
### iii) Self-heal is too slow
|
||||
|
||||
@@ -109,7 +130,7 @@ If the heal backlog is decreasing and you see glustershd logging heals but you
|
||||
Option: cluster.shd-max-threads
|
||||
Default Value: 1
|
||||
Description: Maximum number of parallel heals SHD can do per local brick. This can substantially lower heal times, but can also crush your bricks if you don’t have the storage hardware to support this.
|
||||
|
||||
|
||||
Option: cluster.shd-wait-qlength
|
||||
Default Value: 1024
|
||||
Description: This option can be used to control number of heals that can wait in SHD per subvolume
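For example, to raise these on a volume (a sketch, assuming a volume named `testvol` and storage hardware that can take the extra load):

```console
gluster volume set testvol cluster.shd-max-threads 4
gluster volume set testvol cluster.shd-wait-qlength 2048
```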
|
||||
@@ -118,38 +139,45 @@ I’m not covering it here but it is possible to launch multiple shd instances (
|
||||
|
||||
### iv) Self-heal is too aggressive and slows down the system.
|
||||
|
||||
If shd-max-threads is already at the lowest value (i.e. 1) and the CPU usage of the bricks is still too high, check whether the volume’s profile info shows a lot of RCHECKSUM fops. Data self-heal does checksum calculation (i.e. the `posix_rchecksum()` FOP) which can be CPU intensive. You can set the `cluster.data-self-heal-algorithm` option to `full`. This does a full file copy instead of computing rolling checksums and syncing only the mismatching blocks. The tradeoff is that the network consumption will be increased.
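A sketch of checking the fop profile and switching the algorithm, assuming a volume named `testvol` and that profiling is acceptable on the cluster:

```console
# Look for a high RCHECKSUM count in the profile output
gluster volume profile testvol start
gluster volume profile testvol info

# Trade CPU for network: copy whole files instead of computing rolling checksums
gluster volume set testvol cluster.data-self-heal-algorithm full
```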
|
||||
|
||||
You can also disable all client-side heals if they are turned on, so that the client bandwidth is consumed entirely by the application FOPs and not by client-side background heals, i.e. turn off `cluster.metadata-self-heal`, `cluster.data-self-heal` and `cluster.entry-self-heal`.

Note: In recent versions of gluster, client-side heals are disabled by default.
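If they are enabled on your installation, they can be switched off like this (a sketch, assuming a volume named `testvol`):

```console
gluster volume set testvol cluster.metadata-self-heal off
gluster volume set testvol cluster.data-self-heal off
gluster volume set testvol cluster.entry-self-heal off
```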
|
||||
|
||||
## Mount related issues:
|
||||
### i) All fops are failing with ENOTCONN
|
||||
|
||||
Check mount log/ statedump for loss of quorum, just like for glustershd. If this is a fuse client (as opposed to an nfs/ gfapi client), you can also check the .meta folder to check the connection status to the bricks.
|
||||
```{.text .no-copy }
# cat /mnt/fuse_mnt/.meta/graphs/active/testvol-client-*/private |grep connected

connected = 0
connected = 1
connected = 1
```
|
||||
|
||||
If `connected=0`, the connection to that brick is lost. Find out why. If the client is not connected to quorum number of bricks, then AFR fails lookups (and therefore any subsequent FOP) with Transport endpoint is not connected
|
||||
|
||||
### ii) FOPs on some files are failing with ENOTCONN
|
||||
|
||||
Check mount log for the file being unreadable:
```{.text .no-copy }
|
||||
[2019-05-10 11:04:01.607046] W [MSGID: 108027] [afr-common.c:2268:afr_attempt_readsubvol_set] 13-testvol-replicate-0: no read subvols for /FILE.txt
|
||||
[2019-05-10 11:04:01.607775] W [fuse-bridge.c:939:fuse_entry_cbk] 0-glusterfs-fuse: 234: LOOKUP() /FILE.txt => -1 (Transport endpoint is not connected)
|
||||
```
|
||||
|
||||
This means there was only 1 good copy and the client has lost connection to that brick. You need to ensure that the client is connected to all bricks.
|
||||
|
||||
### iii) Mount is hung
|
||||
|
||||
It can be difficult to pin-point the issue immediately and might require assistance from the developers but the first steps to debugging could be to
|
||||
|
||||
- strace the fuse mount; see where it is hung.
- Take a statedump of the mount to see which xlator has frames that are not wound (i.e. complete=0) and for which FOP. Then check the source code to see if there are any unhandled cases where the xlator doesn’t wind the FOP to its child.
- Take statedumps of the bricks to see if there are any stale locks. An indication of stale locks is the same lock being present in multiple statedumps or the ‘granted’ date being very old.
|
||||
|
||||
Excerpt from a brick statedump:
|
||||
|
||||
|
||||
@@ -1,6 +1,4 @@
|
||||
Troubleshooting File Locks
|
||||
==========================
|
||||
|
||||
# Troubleshooting File Locks
|
||||
|
||||
Use [statedumps](./statedump.md) to find and list the locks held
|
||||
on files. The statedump output also provides information on each lock
|
||||
@@ -13,11 +11,11 @@ lock using the following `clear lock` commands.
|
||||
1. **Perform statedump on the volume to view the files that are locked
|
||||
using the following command:**
|
||||
|
||||
# gluster volume statedump inode
|
||||
gluster volume statedump inode
|
||||
|
||||
For example, to display statedump of test-volume:
|
||||
|
||||
# gluster volume statedump test-volume
|
||||
gluster volume statedump test-volume
|
||||
Volume statedump successful
|
||||
|
||||
The statedump files are created on the brick servers in the` /tmp`
|
||||
@@ -58,25 +56,23 @@ lock using the following `clear lock` commands.
|
||||
|
||||
2. **Clear the lock using the following command:**
|
||||
|
||||
# gluster volume clear-locks
|
||||
gluster volume clear-locks
|
||||
|
||||
For example, to clear the entry lock on `file1` of test-volume:
|
||||
|
||||
# gluster volume clear-locks test-volume / kind granted entry file1
|
||||
gluster volume clear-locks test-volume / kind granted entry file1
|
||||
Volume clear-locks successful
|
||||
vol-locks: entry blocked locks=0 granted locks=1
|
||||
|
||||
3. **Clear the inode lock using the following command:**
|
||||
|
||||
# gluster volume clear-locks
|
||||
gluster volume clear-locks
|
||||
|
||||
For example, to clear the inode lock on `file1` of test-volume:
|
||||
|
||||
# gluster volume clear-locks test-volume /file1 kind granted inode 0,0-0
|
||||
gluster volume clear-locks test-volume /file1 kind granted inode 0,0-0
|
||||
Volume clear-locks successful
|
||||
vol-locks: inode blocked locks=0 granted locks=1
|
||||
|
||||
Perform statedump on test-volume again to verify that the
|
||||
above inode and entry locks are cleared.
|
||||
|
||||
|
||||
|
||||
@@ -8,13 +8,13 @@ to GlusterFS Geo-replication.
|
||||
For every Geo-replication session, the following three log files are
|
||||
associated to it (four, if the secondary is a gluster volume):
|
||||
|
||||
- **Primary-log-file** - log file for the process which monitors the Primary
|
||||
volume
|
||||
- **Secondary-log-file** - log file for process which initiates the changes in
|
||||
secondary
|
||||
- **Primary-gluster-log-file** - log file for the maintenance mount point
|
||||
that Geo-replication module uses to monitor the Primary volume
|
||||
- **Secondary-gluster-log-file** - is the secondary's counterpart of it
|
||||
- **Primary-log-file** - log file for the process which monitors the Primary
|
||||
volume
|
||||
- **Secondary-log-file** - log file for process which initiates the changes in
|
||||
secondary
|
||||
- **Primary-gluster-log-file** - log file for the maintenance mount point
|
||||
that Geo-replication module uses to monitor the Primary volume
|
||||
- **Secondary-gluster-log-file** - is the secondary's counterpart of it
|
||||
|
||||
**Primary Log File**
|
||||
|
||||
@@ -28,7 +28,7 @@ gluster volume geo-replication <session> config log-file
|
||||
For example:
|
||||
|
||||
```console
|
||||
# gluster volume geo-replication Volume1 example.com:/data/remote_dir config log-file
|
||||
gluster volume geo-replication Volume1 example.com:/data/remote_dir config log-file
|
||||
```
|
||||
|
||||
**Secondary Log File**
|
||||
@@ -38,13 +38,13 @@ running on secondary machine), use the following commands:
|
||||
|
||||
1. On primary, run the following command:
|
||||
|
||||
# gluster volume geo-replication Volume1 example.com:/data/remote_dir config session-owner 5f6e5200-756f-11e0-a1f0-0800200c9a66
|
||||
gluster volume geo-replication Volume1 example.com:/data/remote_dir config session-owner 5f6e5200-756f-11e0-a1f0-0800200c9a66
|
||||
|
||||
Displays the session owner details.
|
||||
|
||||
2. On secondary, run the following command:
|
||||
|
||||
# gluster volume geo-replication /data/remote_dir config log-file /var/log/gluster/${session-owner}:remote-mirror.log
|
||||
gluster volume geo-replication /data/remote_dir config log-file /var/log/gluster/${session-owner}:remote-mirror.log
|
||||
|
||||
3. Replace the session owner details (output of Step 1) to the output
|
||||
of Step 2 to get the location of the log file.
|
||||
@@ -52,7 +52,7 @@ running on secondary machine), use the following commands:
|
||||
/var/log/gluster/5f6e5200-756f-11e0-a1f0-0800200c9a66:remote-mirror.log
|
||||
|
||||
### Rotating Geo-replication Logs
|
||||
|
||||
|
||||
Administrators can rotate the log file of a particular primary-secondary
|
||||
session, as needed. When you run geo-replication's ` log-rotate`
|
||||
command, the log file is backed up with the current timestamp suffixed
|
||||
@@ -61,34 +61,34 @@ log file.
|
||||
|
||||
**To rotate a geo-replication log file**
|
||||
|
||||
- Rotate log file for a particular primary-secondary session using the
|
||||
following command:
|
||||
- Rotate log file for a particular primary-secondary session using the
|
||||
following command:
|
||||
|
||||
# gluster volume geo-replication log-rotate
|
||||
gluster volume geo-replication log-rotate
|
||||
|
||||
For example, to rotate the log file of primary `Volume1` and secondary
|
||||
`example.com:/data/remote_dir` :
|
||||
For example, to rotate the log file of primary `Volume1` and secondary
|
||||
`example.com:/data/remote_dir` :
|
||||
|
||||
# gluster volume geo-replication Volume1 example.com:/data/remote_dir log rotate
|
||||
gluster volume geo-replication Volume1 example.com:/data/remote_dir log rotate
|
||||
log rotate successful
|
||||
|
||||
- Rotate log file for all sessions for a primary volume using the
|
||||
following command:
|
||||
- Rotate log file for all sessions for a primary volume using the
|
||||
following command:
|
||||
|
||||
# gluster volume geo-replication log-rotate
|
||||
gluster volume geo-replication log-rotate
|
||||
|
||||
For example, to rotate the log file of primary `Volume1`:
|
||||
For example, to rotate the log file of primary `Volume1`:
|
||||
|
||||
# gluster volume geo-replication Volume1 log rotate
|
||||
gluster volume geo-replication Volume1 log rotate
|
||||
log rotate successful
|
||||
|
||||
- Rotate log file for all sessions using the following command:
|
||||
- Rotate log file for all sessions using the following command:
|
||||
|
||||
# gluster volume geo-replication log-rotate
|
||||
gluster volume geo-replication log-rotate
|
||||
|
||||
For example, to rotate the log file for all sessions:
|
||||
For example, to rotate the log file for all sessions:
|
||||
|
||||
# gluster volume geo-replication log rotate
|
||||
gluster volume geo-replication log rotate
|
||||
log rotate successful
|
||||
|
||||
### Synchronization is not complete
|
||||
@@ -102,16 +102,14 @@ GlusterFS geo-replication begins synchronizing all the data. All files
|
||||
are compared using checksum, which can be a lengthy and high resource
|
||||
utilization operation on large data sets.
|
||||
|
||||
|
||||
### Issues in Data Synchronization
|
||||
|
||||
**Description**: Geo-replication display status as OK, but the files do
|
||||
not get synced, only directories and symlink gets synced with the
|
||||
following error message in the log:
|
||||
|
||||
```console
|
||||
[2011-05-02 13:42:13.467644] E [primary:288:regjob] GMaster: failed to
|
||||
sync ./some\_file\`
|
||||
```{ .text .no-copy }
|
||||
[2011-05-02 13:42:13.467644] E [primary:288:regjob] GMaster: failed to sync ./some\_file\`
|
||||
```
|
||||
|
||||
**Solution**: Geo-replication invokes rsync v3.0.0 or higher on the host
|
||||
@@ -123,7 +121,7 @@ required version.
|
||||
**Description**: Geo-replication displays status as faulty very often
|
||||
with a backtrace similar to the following:
|
||||
|
||||
```console
|
||||
```{ .text .no-copy }
|
||||
2011-04-28 14:06:18.378859] E [syncdutils:131:log\_raise\_exception]
|
||||
\<top\>: FAIL: Traceback (most recent call last): File
|
||||
"/usr/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line
|
||||
@@ -139,28 +137,28 @@ the primary gsyncd module and secondary gsyncd module is broken and this can
|
||||
happen for various reasons. Check if it satisfies all the following
|
||||
pre-requisites:
|
||||
|
||||
- Password-less SSH is set up properly between the host and the remote
|
||||
machine.
|
||||
- If FUSE is installed in the machine, because geo-replication module
|
||||
mounts the GlusterFS volume using FUSE to sync data.
|
||||
- If the **Secondary** is a volume, check if that volume is started.
|
||||
- If the Secondary is a plain directory, verify if the directory has been
|
||||
created already with the required permissions.
|
||||
- If GlusterFS 3.2 or higher is not installed in the default location
|
||||
(in Primary) and has been prefixed to be installed in a custom
|
||||
location, configure the `gluster-command` for it to point to the
|
||||
exact location.
|
||||
- If GlusterFS 3.2 or higher is not installed in the default location
|
||||
(in secondary) and has been prefixed to be installed in a custom
|
||||
location, configure the `remote-gsyncd-command` for it to point to
|
||||
the exact place where gsyncd is located.
|
||||
- Password-less SSH is set up properly between the host and the remote
|
||||
machine.
|
||||
- If FUSE is installed in the machine, because geo-replication module
|
||||
mounts the GlusterFS volume using FUSE to sync data.
|
||||
- If the **Secondary** is a volume, check if that volume is started.
|
||||
- If the Secondary is a plain directory, verify if the directory has been
|
||||
created already with the required permissions.
|
||||
- If GlusterFS 3.2 or higher is not installed in the default location
|
||||
(in Primary) and has been prefixed to be installed in a custom
|
||||
location, configure the `gluster-command` for it to point to the
|
||||
exact location.
|
||||
- If GlusterFS 3.2 or higher is not installed in the default location
|
||||
(in secondary) and has been prefixed to be installed in a custom
|
||||
location, configure the `remote-gsyncd-command` for it to point to
|
||||
the exact place where gsyncd is located.
|
||||
|
||||
### Intermediate Primary goes to Faulty State
|
||||
|
||||
**Description**: In a cascading set-up, the intermediate primary goes to
|
||||
faulty state with the following log:
|
||||
|
||||
```console
|
||||
```{ .text .no-copy }
|
||||
raise RuntimeError ("aborting on uuid change from %s to %s" % \\
|
||||
RuntimeError: aborting on uuid change from af07e07c-427f-4586-ab9f-
|
||||
4bf7d299be81 to de6b5040-8f4e-4575-8831-c4f55bd41154
|
||||
|
||||
@@ -4,45 +4,40 @@ The glusterd daemon runs on every trusted server node and is responsible for the
|
||||
|
||||
The gluster CLI sends commands to the glusterd daemon on the local node, which executes the operation and returns the result to the user.
|
||||
|
||||
<br>
|
||||
|
||||
### Debugging glusterd
|
||||
|
||||
#### Logs
|
||||
|
||||
Start by looking at the log files for clues as to what went wrong when you hit a problem.
|
||||
The default directory for Gluster logs is /var/log/glusterfs. The logs for the CLI and glusterd are:
|
||||
|
||||
- glusterd : /var/log/glusterfs/glusterd.log
|
||||
- gluster CLI : /var/log/glusterfs/cli.log
|
||||
|
||||
- glusterd : /var/log/glusterfs/glusterd.log
|
||||
- gluster CLI : /var/log/glusterfs/cli.log
|
||||
|
||||
#### Statedumps
|
||||
|
||||
Statedumps are useful in debugging memory leaks and hangs.
|
||||
See [Statedump](./statedump.md) for more details.
|
||||
|
||||
<br>
|
||||
|
||||
### Common Issues and How to Resolve Them
|
||||
|
||||
|
||||
**"*Another transaction is in progress for volname*" or "*Locking failed on xxx.xxx.xxx.xxx"***
|
||||
**"_Another transaction is in progress for volname_" or "_Locking failed on xxx.xxx.xxx.xxx"_**
|
||||
|
||||
As Gluster is distributed by nature, glusterd takes locks when performing operations to ensure that configuration changes made to a volume are atomic across the cluster.
|
||||
These errors are returned when:
|
||||
|
||||
- More than one transaction contends on the same lock.

  > _Solution_ : These are likely to be transient errors and the operation will succeed if retried once the other transaction is complete.

- A stale lock exists on one of the nodes.

  > _Solution_ : Repeating the operation will not help until the stale lock is cleaned up. Restart the glusterd process holding the lock.

  - Check the glusterd.log file to find out which node holds the stale lock. Look for the message:
    `lock being held by <uuid>`
  - Run `gluster peer status` to identify the node with the uuid in the log message.
  - Restart glusterd on that node.
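A minimal sketch of those steps on the command line (hostnames and paths are illustrative):

```console
# Which node is holding the stale lock?
grep "lock being held by" /var/log/glusterfs/glusterd.log

# Map the uuid from the log message to a peer
gluster peer status

# Then, on the node holding the stale lock
systemctl restart glusterd
```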
|
||||
|
||||
**"_Transport endpoint is not connected_" errors but all bricks are up**
|
||||
|
||||
@@ -51,51 +46,40 @@ Gluster client processes query glusterd for the ports the bricks processes are l
|
||||
If the port information in glusterd is incorrect, the client will fail to connect to the brick even though it is up. Operations which
|
||||
would need to access that brick may fail with "Transport endpoint is not connected".
|
||||
|
||||
*Solution* : Restart the glusterd service.
|
||||
|
||||
<br>
|
||||
_Solution_ : Restart the glusterd service.
|
||||
|
||||
**"Peer Rejected"**
|
||||
|
||||
`gluster peer status` returns "Peer Rejected" for a node.
|
||||
|
||||
```console
|
||||
```{ .text .no-copy }
|
||||
Hostname: <hostname>
|
||||
Uuid: <xxxx-xxx-xxxx>
|
||||
State: Peer Rejected (Connected)
|
||||
```
|
||||
|
||||
This indicates that the volume configuration on the node is not in sync with the rest of the trusted storage pool.
|
||||
This indicates that the volume configuration on the node is not in sync with the rest of the trusted storage pool.
|
||||
You should see the following message in the glusterd log for the node on which the peer status command was run:
|
||||
|
||||
```console
|
||||
```{ .text .no-copy }
|
||||
Version of Cksums <vol-name> differ. local cksum = xxxxxx, remote cksum = xxxxyx on peer <hostname>
|
||||
```
|
||||
|
||||
*Solution*: Update the cluster.op-version
|
||||
_Solution_: Update the cluster.op-version
|
||||
|
||||
* Run `gluster volume get all cluster.max-op-version` to get the latest supported op-version.
|
||||
* Update the cluster.op-version to the latest supported op-version by executing `gluster volume set all cluster.op-version <op-version>`.
|
||||
|
||||
<br>
|
||||
- Run `gluster volume get all cluster.max-op-version` to get the latest supported op-version.
|
||||
- Update the cluster.op-version to the latest supported op-version by executing `gluster volume set all cluster.op-version <op-version>`.
|
||||
|
||||
**"Accepted Peer Request"**
|
||||
|
||||
If the glusterd handshake fails while expanding a cluster, the view of the cluster will be inconsistent. The state of the peer in `gluster peer status` will be “accepted peer request” and subsequent CLI commands will fail with an error.
|
||||
Eg. `Volume create command will fail with "volume create: testvol: failed: Host <hostname> is not in 'Peer in Cluster' state`
|
||||
|
||||
In this case the value of the state field in `/var/lib/glusterd/peers/<UUID>` will be other than 3.
|
||||
|
||||
_Solution_:
|
||||
|
||||
- Stop glusterd
|
||||
- Open `/var/lib/glusterd/peers/<UUID>`
|
||||
- Change state to 3
|
||||
- Start glusterd
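A sketch of the same steps, assuming the peer file uses the usual `state=` key and `<UUID>` is the rejected peer's uuid (verify the file contents before editing):

```console
systemctl stop glusterd
# Set the peer state back to 3 (Peer in Cluster)
sed -i 's/^state=.*/state=3/' /var/lib/glusterd/peers/<UUID>
systemctl start glusterd
```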
|
||||
|
||||
@@ -11,14 +11,14 @@ This error is encountered when the server has not started correctly.
|
||||
On most Linux distributions this is fixed by starting portmap:
|
||||
|
||||
```console
|
||||
# /etc/init.d/portmap start
|
||||
/etc/init.d/portmap start
|
||||
```
|
||||
|
||||
On some distributions where portmap has been replaced by rpcbind, the
|
||||
following command is required:
|
||||
|
||||
```console
|
||||
# /etc/init.d/rpcbind start
|
||||
/etc/init.d/rpcbind start
|
||||
```
|
||||
|
||||
After starting portmap or rpcbind, gluster NFS server needs to be
|
||||
@@ -32,13 +32,13 @@ This error can arise in case there is already a Gluster NFS server
|
||||
running on the same machine. This situation can be confirmed from the
|
||||
log file, if the following error lines exist:
|
||||
|
||||
```text
|
||||
```{ .text .no-copy }
|
||||
[2010-05-26 23:40:49] E [rpc-socket.c:126:rpcsvc_socket_listen] rpc-socket: binding socket failed:Address already in use
|
||||
[2010-05-26 23:40:49] E [rpc-socket.c:129:rpcsvc_socket_listen] rpc-socket: Port is already in use
|
||||
[2010-05-26 23:40:49] E [rpcsvc.c:2636:rpcsvc_stage_program_register] rpc-service: could not create listening connection
|
||||
[2010-05-26 23:40:49] E [rpcsvc.c:2675:rpcsvc_program_register] rpc-service: stage registration of program failed
|
||||
[2010-05-26 23:40:49] E [rpcsvc.c:2695:rpcsvc_program_register] rpc-service: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465
|
||||
[2010-05-26 23:40:49] E [nfs.c:125:nfs_init_versions] nfs: Program init failed
|
||||
[2010-05-26 23:40:49] E [rpc-socket.c:129:rpcsvc_socket_listen] rpc-socket: Port is already in use
|
||||
[2010-05-26 23:40:49] E [rpcsvc.c:2636:rpcsvc_stage_program_register] rpc-service: could not create listening connection
|
||||
[2010-05-26 23:40:49] E [rpcsvc.c:2675:rpcsvc_program_register] rpc-service: stage registration of program failed
|
||||
[2010-05-26 23:40:49] E [rpcsvc.c:2695:rpcsvc_program_register] rpc-service: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465
|
||||
[2010-05-26 23:40:49] E [nfs.c:125:nfs_init_versions] nfs: Program init failed
|
||||
[2010-05-26 23:40:49] C [nfs.c:531:notify] nfs: Failed to initialize protocols
|
||||
```
|
||||
|
||||
@@ -50,7 +50,7 @@ multiple NFS servers on the same machine.
|
||||
|
||||
If the mount command fails with the following error message:
|
||||
|
||||
```console
|
||||
```{ .text .no-copy }
|
||||
mount.nfs: rpc.statd is not running but is required for remote locking.
|
||||
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
|
||||
```
|
||||
@@ -59,7 +59,7 @@ For NFS clients to mount the NFS server, rpc.statd service must be
|
||||
running on the clients. Start rpc.statd service by running the following command:
|
||||
|
||||
```console
|
||||
# rpc.statd
|
||||
rpc.statd
|
||||
```
|
||||
|
||||
### mount command takes too long to finish.
|
||||
@@ -71,14 +71,14 @@ NFS client. The resolution for this is to start either of these services
|
||||
by running the following command:
|
||||
|
||||
```console
|
||||
# /etc/init.d/portmap start
|
||||
/etc/init.d/portmap start
|
||||
```
|
||||
|
||||
On some distributions where portmap has been replaced by rpcbind, the
|
||||
following command is required:
|
||||
|
||||
```console
|
||||
# /etc/init.d/rpcbind start
|
||||
/etc/init.d/rpcbind start
|
||||
```
|
||||
|
||||
### NFS server glusterfsd starts but initialization fails with “nfsrpc- service: portmap registration of program failed” error message in the log.
|
||||
@@ -88,8 +88,8 @@ still fail preventing clients from accessing the mount points. Such a
|
||||
situation can be confirmed from the following error messages in the log
|
||||
file:
|
||||
|
||||
```text
|
||||
[2010-05-26 23:33:47] E [rpcsvc.c:2598:rpcsvc_program_register_portmap] rpc-service: Could notregister with portmap
|
||||
```{ .text .no-copy }
|
||||
[2010-05-26 23:33:47] E [rpcsvc.c:2598:rpcsvc_program_register_portmap] rpc-service: Could notregister with portmap
|
||||
[2010-05-26 23:33:47] E [rpcsvc.c:2682:rpcsvc_program_register] rpc-service: portmap registration of program failed
|
||||
[2010-05-26 23:33:47] E [rpcsvc.c:2695:rpcsvc_program_register] rpc-service: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465
|
||||
[2010-05-26 23:33:47] E [nfs.c:125:nfs_init_versions] nfs: Program init failed
|
||||
@@ -104,12 +104,12 @@ file:
|
||||
On most Linux distributions, portmap can be started using the
|
||||
following command:
|
||||
|
||||
# /etc/init.d/portmap start
|
||||
/etc/init.d/portmap start
|
||||
|
||||
On some distributions where portmap has been replaced by rpcbind,
|
||||
run the following command:
|
||||
|
||||
# /etc/init.d/rpcbind start
|
||||
/etc/init.d/rpcbind start
|
||||
|
||||
After starting portmap or rpcbind, gluster NFS server needs to be
|
||||
restarted.
|
||||
@@ -126,8 +126,8 @@ file:
|
||||
On Linux, kernel NFS servers can be stopped by using either of the
|
||||
following commands depending on the distribution in use:
|
||||
|
||||
# /etc/init.d/nfs-kernel-server stop
|
||||
# /etc/init.d/nfs stop
|
||||
/etc/init.d/nfs-kernel-server stop
|
||||
/etc/init.d/nfs stop
|
||||
|
||||
3. **Restart Gluster NFS server**
|
||||
|
||||
@@ -135,7 +135,7 @@ file:
|
||||
|
||||
mount command fails with following error
|
||||
|
||||
```console
|
||||
```{ .text .no-copy }
|
||||
mount: mount to NFS server '10.1.10.11' failed: timed out (retrying).
|
||||
```
|
||||
|
||||
@@ -175,14 +175,13 @@ Perform one of the following to resolve this issue:
|
||||
forcing the NFS client to use version 3. The **vers** option to
|
||||
mount command is used for this purpose:
|
||||
|
||||
# mount -o vers=3
|
||||
mount -o vers=3
|
||||
|
||||
### showmount fails with clnt\_create: RPC: Unable to receive
|
||||
### showmount fails with clnt_create: RPC: Unable to receive
|
||||
|
||||
Check your firewall setting to open ports 111 for portmap
|
||||
requests/replies and Gluster NFS server requests/replies. Gluster NFS
|
||||
server operates over the following port numbers: 38465, 38466, and
|
||||
38467.
|
||||
server operates over the following port numbers: 38465, 38466, and 38467.
|
||||
|
||||
### Application fails with "Invalid argument" or "Value too large for defined data type" error.
|
||||
|
||||
@@ -193,9 +192,9 @@ numbers instead: nfs.enable-ino32 \<on|off\>
|
||||
|
||||
Applications that will benefit are those that were either:
|
||||
|
||||
- built 32-bit and run on 32-bit machines such that they do not
|
||||
support large files by default
|
||||
- built 32-bit on 64-bit systems
|
||||
- built 32-bit and run on 32-bit machines such that they do not
|
||||
support large files by default
|
||||
- built 32-bit on 64-bit systems
|
||||
|
||||
This option is disabled by default so NFS returns 64-bit inode numbers
|
||||
by default.
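To turn 32-bit inode numbers on for gluster NFS (a sketch, assuming a volume named `testvol`):

```console
gluster volume set testvol nfs.enable-ino32 on
```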
|
||||
@@ -203,6 +202,6 @@ by default.
|
||||
Applications which can be rebuilt from source are recommended to rebuild
|
||||
using the following flag with gcc:
|
||||
|
||||
```console
|
||||
-D_FILE_OFFSET_BITS=64
|
||||
```
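For example, a hypothetical `myapp.c` could be rebuilt as:

```console
gcc -D_FILE_OFFSET_BITS=64 -o myapp myapp.c
```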
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
Troubleshooting High Memory Utilization
|
||||
=======================================
|
||||
# Troubleshooting High Memory Utilization
|
||||
|
||||
If the memory utilization of a Gluster process increases significantly with time, it could be a leak caused by resources not being freed.
|
||||
If you suspect that you may have hit such an issue, try using [statedumps](./statedump.md) to debug the issue.
|
||||
@@ -12,4 +11,3 @@ If you are unable to figure out where the leak is, please [file an issue](https:
|
||||
- Steps to reproduce the issue if available
|
||||
- Statedumps for the process collected at intervals as the memory utilization increases
|
||||
- The Gluster log files for the process (if possible)
|
||||
|
||||
|
||||
@@ -1,32 +1,32 @@
|
||||
Upgrading GlusterFS
|
||||
-------------------
|
||||
- [About op-version](./op-version.md)
|
||||
## Upgrading GlusterFS
|
||||
|
||||
- [About op-version](./op-version.md)
|
||||
|
||||
If you are using GlusterFS version 6.x or above, you can upgrade it to the following:
|
||||
|
||||
- [Upgrading to 10](./upgrade-to-10.md)
|
||||
- [Upgrading to 9](./upgrade-to-9.md)
|
||||
- [Upgrading to 8](./upgrade-to-8.md)
|
||||
- [Upgrading to 7](./upgrade-to-7.md)
|
||||
- [Upgrading to 10](./upgrade-to-10.md)
|
||||
- [Upgrading to 9](./upgrade-to-9.md)
|
||||
- [Upgrading to 8](./upgrade-to-8.md)
|
||||
- [Upgrading to 7](./upgrade-to-7.md)
|
||||
|
||||
If you are using GlusterFS version 5.x or above, you can upgrade it to the following:
|
||||
|
||||
- [Upgrading to 8](./upgrade-to-8.md)
|
||||
- [Upgrading to 7](./upgrade-to-7.md)
|
||||
- [Upgrading to 6](./upgrade-to-6.md)
|
||||
- [Upgrading to 8](./upgrade-to-8.md)
|
||||
- [Upgrading to 7](./upgrade-to-7.md)
|
||||
- [Upgrading to 6](./upgrade-to-6.md)
|
||||
|
||||
If you are using GlusterFS version 4.x or above, you can upgrade it to the following:
|
||||
|
||||
- [Upgrading to 6](./upgrade-to-6.md)
|
||||
- [Upgrading to 5](./upgrade-to-5.md)
|
||||
- [Upgrading to 6](./upgrade-to-6.md)
|
||||
- [Upgrading to 5](./upgrade-to-5.md)
|
||||
|
||||
If you are using GlusterFS version 3.4.x or above, you can upgrade it to following:
|
||||
|
||||
- [Upgrading to 3.5](./upgrade-to-3.5.md)
|
||||
- [Upgrading to 3.6](./upgrade-to-3.6.md)
|
||||
- [Upgrading to 3.7](./upgrade-to-3.7.md)
|
||||
- [Upgrading to 3.9](./upgrade-to-3.9.md)
|
||||
- [Upgrading to 3.10](./upgrade-to-3.10.md)
|
||||
- [Upgrading to 3.11](./upgrade-to-3.11.md)
|
||||
- [Upgrading to 3.12](./upgrade-to-3.12.md)
|
||||
- [Upgrading to 3.13](./upgrade-to-3.13.md)
|
||||
- [Upgrading to 3.5](./upgrade-to-3.5.md)
|
||||
- [Upgrading to 3.6](./upgrade-to-3.6.md)
|
||||
- [Upgrading to 3.7](./upgrade-to-3.7.md)
|
||||
- [Upgrading to 3.9](./upgrade-to-3.9.md)
|
||||
- [Upgrading to 3.10](./upgrade-to-3.10.md)
|
||||
- [Upgrading to 3.11](./upgrade-to-3.11.md)
|
||||
- [Upgrading to 3.12](./upgrade-to-3.12.md)
|
||||
- [Upgrading to 3.13](./upgrade-to-3.13.md)
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
# Generic Upgrade procedure
|
||||
|
||||
### Pre-upgrade notes
|
||||
|
||||
- Online upgrade is only possible with replicated and distributed replicate volumes
|
||||
- Online upgrade is not supported for dispersed or distributed dispersed volumes
|
||||
- Ensure no configuration changes are done during the upgrade
|
||||
@@ -9,27 +10,28 @@
|
||||
- It is recommended to have the same client and server, major versions running eventually
|
||||
|
||||
### Online upgrade procedure for servers
|
||||
|
||||
This procedure involves upgrading **one server at a time**, while keeping the volume(s) online and client IO ongoing. This procedure assumes that multiple replicas of a replica set, are not part of the same server in the trusted storage pool.
|
||||
|
||||
> **ALERT:** If there are disperse or, pure distributed volumes in the storage pool being upgraded, this procedure is NOT recommended, use the [Offline upgrade procedure](#offline-upgrade-procedure) instead.
|
||||
|
||||
#### Repeat the following steps, on each server in the trusted storage pool, to upgrade the entire pool to new-version :
|
||||
1. Stop all gluster services, either using the command below, or through other means.
|
||||
|
||||
1. Stop all gluster services, either using the command below, or through other means.
|
||||
|
||||
# systemctl stop glusterd
|
||||
# systemctl stop glustereventsd
|
||||
# killall glusterfs glusterfsd glusterd
|
||||
systemctl stop glusterd
|
||||
systemctl stop glustereventsd
|
||||
killall glusterfs glusterfsd glusterd
|
||||
|
||||
2. Stop all applications that run on this server and access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.)
|
||||
2. Stop all applications that run on this server and access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.)
|
||||
|
||||
3. Install Gluster new-version, below example shows how to create a repository on fedora and use it to upgrade :
|
||||
3. Install Gluster new-version, below example shows how to create a repository on fedora and use it to upgrade :
|
||||
|
||||
3.1 Create a private repository (assuming /new-gluster-rpms/ folder has the new rpms ):
|
||||
3.1 Create a private repository (assuming /new-gluster-rpms/ folder has the new rpms ):
|
||||
|
||||
# createrepo /new-gluster-rpms/
|
||||
createrepo /new-gluster-rpms/
|
||||
|
||||
3.2 Create the .repo file in /etc/yum.d/ :
|
||||
3.2 Create the .repo file in /etc/yum.d/ :
|
||||
|
||||
# cat /etc/yum.d/newglusterrepo.repo
|
||||
[newglusterrepo]
|
||||
@@ -38,76 +40,74 @@ This procedure involves upgrading **one server at a time**, while keeping the vo
|
||||
gpgcheck=0
|
||||
enabled=1
|
||||
|
||||
3.3 Upgrade glusterfs, for example to upgrade glusterfs-server to x.y version :
|
||||
3.3 Upgrade glusterfs, for example to upgrade glusterfs-server to x.y version :
|
||||
|
||||
# yum update glusterfs-server-x.y.fc30.x86_64.rpm
|
||||
yum update glusterfs-server-x.y.fc30.x86_64.rpm
|
||||
|
||||
4. Ensure that version reflects new-version in the output of,
|
||||
4. Ensure that version reflects new-version in the output of,
|
||||
|
||||
# gluster --version
|
||||
gluster --version
|
||||
|
||||
5. Start glusterd on the upgraded server
|
||||
5. Start glusterd on the upgraded server
|
||||
|
||||
# systemctl start glusterd
|
||||
systemctl start glusterd
|
||||
|
||||
6. Ensure that all gluster processes are online by checking the output of,
|
||||
6. Ensure that all gluster processes are online by checking the output of,
|
||||
|
||||
# gluster volume status
|
||||
gluster volume status
|
||||
|
||||
7. If the glustereventsd service was previously enabled, it is required to start it using the commands below, or, through other means,
|
||||
7. If the glustereventsd service was previously enabled, it is required to start it using the commands below, or, through other means,
|
||||
|
||||
# systemctl start glustereventsd
|
||||
systemctl start glustereventsd
|
||||
|
||||
8. Invoke self-heal on all the gluster volumes by running,
|
||||
8. Invoke self-heal on all the gluster volumes by running,
|
||||
|
||||
# for i in `gluster volume list`; do gluster volume heal $i; done
|
||||
for i in `gluster volume list`; do gluster volume heal $i; done
|
||||
|
||||
9. Verify that there are no heal backlog by running the command for all the volumes,
|
||||
9. Verify that there are no heal backlog by running the command for all the volumes,
|
||||
|
||||
# gluster volume heal <volname> info
|
||||
gluster volume heal <volname> info
|
||||
|
||||
> **NOTE:** Before proceeding to upgrade the next server in the pool it is recommended to check the heal backlog. If there is a heal backlog, it is recommended to wait until the backlog is empty, or, the backlog does not contain any entries requiring a sync to the just upgraded server.
|
||||
|
||||
10. Restart any gfapi based application stopped previously in step (2)
|
||||
1. Restart any gfapi based application stopped previously in step (2)
|
||||
|
||||
### Offline upgrade procedure
|
||||
|
||||
This procedure involves cluster downtime and during the upgrade window, clients are not allowed access to the volumes.
|
||||
|
||||
#### Steps to perform an offline upgrade:
|
||||
1. On every server in the trusted storage pool, stop all gluster services, either using the command below, or through other means,
|
||||
|
||||
```sh
|
||||
1. On every server in the trusted storage pool, stop all gluster services, either using the command below, or through other means,
|
||||
|
||||
# systemctl stop glusterd
|
||||
# systemctl stop glustereventsd
|
||||
# killall glusterfs glusterfsd glusterd
|
||||
```
|
||||
2. Stop all applications that access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.), across all servers
|
||||
systemctl stop glusterd
|
||||
systemctl stop glustereventsd
|
||||
killall glusterfs glusterfsd glusterd
|
||||
|
||||
3. Install Gluster new-version, on all servers
|
||||
2. Stop all applications that access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.), across all servers
|
||||
|
||||
4. Ensure that version reflects new-version in the output of the following command on all servers,
|
||||
```sh
|
||||
# gluster --version
|
||||
```
|
||||
3. Install Gluster new-version, on all servers
|
||||
|
||||
5. Start glusterd on all the upgraded servers
|
||||
```sh
|
||||
# systemctl start glusterd
|
||||
```
|
||||
6. Ensure that all gluster processes are online by checking the output of,
|
||||
```sh
|
||||
# gluster volume status
|
||||
```
|
||||
4. Ensure that version reflects new-version in the output of the following command on all servers,
|
||||
|
||||
7. If the glustereventsd service was previously enabled, it is required to start it using the commands below, or, through other means,
|
||||
```sh
|
||||
# systemctl start glustereventsd
|
||||
```
|
||||
gluster --version
|
||||
|
||||
8. Restart any gfapi based application stopped previously in step (2)
|
||||
5. Start glusterd on all the upgraded servers
|
||||
|
||||
systemctl start glusterd
|
||||
|
||||
6. Ensure that all gluster processes are online by checking the output of,
|
||||
|
||||
gluster volume status
|
||||
|
||||
7. If the glustereventsd service was previously enabled, it is required to start it using the commands below, or, through other means,
|
||||
|
||||
systemctl start glustereventsd
|
||||
|
||||
8. Restart any gfapi based application stopped previously in step (2)
|
||||
|
||||
### Post upgrade steps
|
||||
|
||||
Perform the following steps post upgrading the entire trusted storage pool,
|
||||
|
||||
- It is recommended to update the op-version of the cluster. Refer, to the [op-version](./op-version.md) section for further details
|
||||
@@ -117,12 +117,13 @@ Perform the following steps post upgrading the entire trusted storage pool,
|
||||
#### If upgrading from a version lesser than Gluster 7.0
|
||||
|
||||
> **NOTE:** If you have ever enabled quota on your volumes then after the upgrade
|
||||
is done, you will have to restart all the nodes in the cluster one by one so as to
|
||||
fix the checksum values in the quota.cksum file under the `/var/lib/glusterd/vols/<volname>/ directory.`
|
||||
The peers may go into `Peer rejected` state while doing so but once all the nodes are rebooted
|
||||
everything will be back to normal.
|
||||
> is done, you will have to restart all the nodes in the cluster one by one so as to
|
||||
> fix the checksum values in the quota.cksum file under the `/var/lib/glusterd/vols/<volname>/ directory.`
|
||||
> The peers may go into `Peer rejected` state while doing so but once all the nodes are rebooted
|
||||
> everything will be back to normal.
|
||||
|
||||
### Upgrade procedure for clients
|
||||
|
||||
Following are the steps to upgrade clients to the new-version version,
|
||||
|
||||
1. Unmount all glusterfs mount points on the client
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
|
||||
### op-version
|
||||
|
||||
op-version is the operating version of the Gluster which is running.
|
||||
|
||||
op-version was introduced to ensure gluster running with different versions do not end up in a problem and backward compatibility issues can be tackled.
|
||||
@@ -13,19 +13,19 @@ Current op-version can be queried as below:
|
||||
For 3.10 onwards:
|
||||
|
||||
```console
|
||||
# gluster volume get all cluster.op-version
|
||||
gluster volume get all cluster.op-version
|
||||
```
|
||||
|
||||
For release < 3.10:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# gluster volume get <VOLNAME> cluster.op-version
|
||||
```
|
||||
|
||||
To get the maximum possible op-version a cluster can support, the following query can be used (this is available 3.10 release onwards):
|
||||
|
||||
```console
|
||||
# gluster volume get all cluster.max-op-version
|
||||
gluster volume get all cluster.max-op-version
|
||||
```
|
||||
|
||||
For example, if some nodes in a cluster have been upgraded to X and some to X+, then the maximum op-version supported by the cluster is X, and the cluster.op-version can be bumped up to X to support new features.
|
||||
@@ -34,7 +34,7 @@ op-version can be updated as below.
|
||||
For example, after upgrading to glusterfs-4.0.0, set op-version as:
|
||||
|
||||
```console
|
||||
# gluster volume set all cluster.op-version 40000
|
||||
gluster volume set all cluster.op-version 40000
|
||||
```
|
||||
|
||||
Note:
|
||||
@@ -46,11 +46,10 @@ When trying to set a volume option, it might happen that one or more of the conn
|
||||
|
||||
To check op-version information for the connected clients and find the offending client, the following query can be used for 3.10 release onwards:
|
||||
|
||||
```console
|
||||
```{ .console .no-copy }
|
||||
# gluster volume status <all|VOLNAME> clients
|
||||
```
|
||||
|
||||
The respective clients can then be upgraded to the required version.
|
||||
|
||||
This information could also be used to make an informed decision while bumping up the op-version of a cluster, so that connected clients can support all the new features provided by the upgraded cluster as well.
|
||||
|
||||
|
||||
@@ -10,6 +10,7 @@ Refer, to the [generic upgrade procedure](./generic-upgrade-procedure.md) guide

## Major issues

### The following options are removed from the code base and require to be unset

before an upgrade from releases older than release 4.1.0,

- features.lock-heal

@@ -18,7 +19,7 @@ before an upgrade from releases older than release 4.1.0,

To check if these options are set use,

```console
gluster volume info
```

and ensure that the above options are not part of the `Options Reconfigured:`

@@ -26,7 +27,7 @@ section in the output of all volumes in the cluster.

If these are set, then unset them using the following commands,

```{ .console .no-copy }
# gluster volume reset <volname> <option>
```

@@ -40,7 +41,6 @@ If these are set, then unset them using the following commands,

- Tiering support (tier xlator and changetimerecorder)
- Glupy

**NOTE:** Failure to do the above may result in failure during online upgrades,
and the reset of these options to their defaults needs to be done **prior** to
upgrading the cluster.

@@ -48,4 +48,3 @@ upgrading the cluster.

### Deprecated translators and upgrade procedure for volumes using these features

[If you are upgrading from a release prior to release-6 be aware of deprecated xlators and functionality](https://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_6/#deprecated-translators-and-upgrade-procedure-for-volumes-using-these-features).

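The check-and-unset described above can be wrapped in a loop when a cluster has many volumes. This is only a sketch; the option names are the two listed in this guide, and it assumes `gluster volume list` and `gluster volume info` behave as shown earlier on this page:

```sh
#!/bin/sh
# Reset the removed options on every volume that still has them configured.
for vol in $(gluster volume list); do
    for opt in features.lock-heal features.grace-timeout; do
        if gluster volume info "$vol" | grep -q "^$opt"; then
            echo "resetting $opt on $vol"
            gluster volume reset "$vol" "$opt"
        fi
    done
done
```
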
@@ -1,6 +1,7 @@

## Upgrade procedure to Gluster 3.10.0, from Gluster 3.9.x, 3.8.x and 3.7.x

### Pre-upgrade notes

- Online upgrade is only possible with replicated and distributed replicate volumes
- Online upgrade is not supported for dispersed or distributed dispersed volumes
- Ensure no configuration changes are done during the upgrade

@@ -9,83 +10,82 @@

- It is recommended to have the same client and server, major versions running eventually

### Online upgrade procedure for servers

This procedure involves upgrading **one server at a time**, while keeping the volume(s) online and client IO ongoing. This procedure assumes that multiple replicas of a replica set, are not part of the same server in the trusted storage pool.

> **ALERT**: If any of your volumes, in the trusted storage pool that is being upgraded, uses disperse or is a pure distributed volume, this procedure is **NOT** recommended, use the [Offline upgrade procedure](#offline-upgrade-procedure) instead.

#### Repeat the following steps, on each server in the trusted storage pool, to upgrade the entire pool to 3.10 version:

1. Stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd

2. Stop all applications that run on this server and access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.)

3. Install Gluster 3.10

4. Ensure that version reflects 3.10.0 in the output of,

        gluster --version

5. Start glusterd on the upgraded server

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. Self-heal all gluster volumes by running

        for i in `gluster volume list`; do gluster volume heal $i; done

8. Ensure that there is no heal backlog by running the below command for all volumes

        gluster volume heal <volname> info

    > NOTE: If there is a heal backlog, wait till the backlog is empty, or the backlog does not have any entries needing a sync to the just upgraded server, before proceeding to upgrade the next server in the pool

9. Restart any gfapi based application stopped previously in step (2)

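Steps 7 and 8 can be combined into a simple wait loop when upgrading servers one by one. This is a sketch only; it assumes that a volume with no pending heal entries reports `Number of entries: 0` for every brick, and that waiting for a fully empty backlog is acceptable in your environment:

```sh
#!/bin/sh
# Trigger a heal on every volume, then wait until no volume reports pending heal entries.
for vol in $(gluster volume list); do
    gluster volume heal "$vol"
done

for vol in $(gluster volume list); do
    while gluster volume heal "$vol" info | grep -q "Number of entries: [1-9]"; do
        echo "waiting for the heal backlog on $vol to drain ..."
        sleep 60
    done
done
```
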
### Offline upgrade procedure

This procedure involves cluster downtime and during the upgrade window, clients are not allowed access to the volumes.

#### Steps to perform an offline upgrade:

1. On every server in the trusted storage pool, stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd

2. Stop all applications that access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.), across all servers

3. Install Gluster 3.10, on all servers

4. Ensure that version reflects 3.10.0 in the output of the following command on all servers,

        gluster --version

5. Start glusterd on all the upgraded servers

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. Restart any gfapi based application stopped previously in step (2)

### Post upgrade steps

Perform the following steps post upgrading the entire trusted storage pool,

- It is recommended to update the op-version of the cluster. Refer, to the [op-version](./op-version.md) section for further details
- Proceed to [upgrade the clients](#upgrade-procedure-for-clients) to 3.10 version as well

### Upgrade procedure for clients

Following are the steps to upgrade clients to the 3.10.0 version,

1. Unmount all glusterfs mount points on the client

@@ -3,6 +3,7 @@

**NOTE:** Upgrade procedure remains the same as with the 3.10 release

### Pre-upgrade notes

- Online upgrade is only possible with replicated and distributed replicate volumes
- Online upgrade is not supported for dispersed or distributed dispersed volumes
- Ensure no configuration changes are done during the upgrade

@@ -11,87 +12,86 @@

- It is recommended to have the same client and server, major versions running eventually

### Online upgrade procedure for servers

This procedure involves upgrading **one server at a time**, while keeping the volume(s) online and client IO ongoing. This procedure assumes that multiple replicas of a replica set, are not part of the same server in the trusted storage pool.

> **ALERT**: If any of your volumes, in the trusted storage pool that is being upgraded, uses disperse or is a pure distributed volume, this procedure is **NOT** recommended, use the [Offline upgrade procedure](#offline-upgrade-procedure) instead.

#### Repeat the following steps, on each server in the trusted storage pool, to upgrade the entire pool to 3.11 version:

1. Stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd

2. Stop all applications that run on this server and access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.)

3. Install Gluster 3.11

4. Ensure that version reflects 3.11.x in the output of,

        gluster --version

    **NOTE:** x is the minor release number for the release

5. Start glusterd on the upgraded server

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. Self-heal all gluster volumes by running

        for i in `gluster volume list`; do gluster volume heal $i; done

8. Ensure that there is no heal backlog by running the below command for all volumes

        gluster volume heal <volname> info

    > NOTE: If there is a heal backlog, wait till the backlog is empty, or the backlog does not have any entries needing a sync to the just upgraded server, before proceeding to upgrade the next server in the pool

9. Restart any gfapi based application stopped previously in step (2)

### Offline upgrade procedure

This procedure involves cluster downtime and during the upgrade window, clients are not allowed access to the volumes.

#### Steps to perform an offline upgrade:

1. On every server in the trusted storage pool, stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd

2. Stop all applications that access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.), across all servers

3. Install Gluster 3.11, on all servers

4. Ensure that version reflects 3.11.x in the output of the following command on all servers,

        gluster --version

    **NOTE:** x is the minor release number for the release

5. Start glusterd on all the upgraded servers

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. Restart any gfapi based application stopped previously in step (2)

### Post upgrade steps

Perform the following steps post upgrading the entire trusted storage pool,

- It is recommended to update the op-version of the cluster. Refer, to the [op-version](./op-version.md) section for further details
- Proceed to [upgrade the clients](#upgrade-procedure-for-clients) to 3.11 version as well

### Upgrade procedure for clients

Following are the steps to upgrade clients to the 3.11.x version,

**NOTE:** x is the minor release number for the release

@@ -3,6 +3,7 @@

> **NOTE:** Upgrade procedure remains the same as with 3.11 and 3.10 releases

### Pre-upgrade notes

- Online upgrade is only possible with replicated and distributed replicate volumes
- Online upgrade is not supported for dispersed or distributed dispersed volumes
- Ensure no configuration changes are done during the upgrade

@@ -11,90 +12,96 @@

- It is recommended to have the same client and server, major versions running eventually

### Online upgrade procedure for servers

This procedure involves upgrading **one server at a time**, while keeping the volume(s) online and client IO ongoing. This procedure assumes that multiple replicas of a replica set, are not part of the same server in the trusted storage pool.

> **ALERT:** If there are disperse or pure distributed volumes in the storage pool being upgraded, this procedure is NOT recommended, use the [Offline upgrade procedure](#offline-upgrade-procedure) instead.

#### Repeat the following steps, on each server in the trusted storage pool, to upgrade the entire pool to 3.12 version:

1. Stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd
        systemctl stop glustereventsd

2. Stop all applications that run on this server and access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.)

3. Install Gluster 3.12

4. Ensure that version reflects 3.12.x in the output of,

        gluster --version

    > **NOTE:** x is the minor release number for the release

5. Start glusterd on the upgraded server

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. If the glustereventsd service was previously enabled, it is required to start it using the commands below, or, through other means,

        systemctl start glustereventsd

8. Invoke self-heal on all the gluster volumes by running,

        for i in `gluster volume list`; do gluster volume heal $i; done

9. Verify that there is no heal backlog by running the command for all the volumes,

        gluster volume heal <volname> info

    > **NOTE:** Before proceeding to upgrade the next server in the pool it is recommended to check the heal backlog. If there is a heal backlog, it is recommended to wait until the backlog is empty, or, the backlog does not contain any entries requiring a sync to the just upgraded server.

10. Restart any gfapi based application stopped previously in step (2)

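Step 7 only applies if glustereventsd was enabled before the upgrade. On systemd-based servers, a small guard such as the following sketch can be used to restart it only in that case:

```sh
# Start glustereventsd again only if it was enabled before the upgrade.
if systemctl is-enabled glustereventsd >/dev/null 2>&1; then
    systemctl start glustereventsd
fi
```
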
### Offline upgrade procedure

This procedure involves cluster downtime and during the upgrade window, clients are not allowed access to the volumes.

#### Steps to perform an offline upgrade:

1. On every server in the trusted storage pool, stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd glustereventsd
        systemctl stop glustereventsd

2. Stop all applications that access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.), across all servers

3. Install Gluster 3.12, on all servers

4. Ensure that version reflects 3.12.x in the output of the following command on all servers,

        gluster --version

    > **NOTE:** x is the minor release number for the release

5. Start glusterd on all the upgraded servers

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. If the glustereventsd service was previously enabled, it is required to start it using the commands below, or, through other means,

        systemctl start glustereventsd

8. Restart any gfapi based application stopped previously in step (2)

### Post upgrade steps

Perform the following steps post upgrading the entire trusted storage pool,

- It is recommended to update the op-version of the cluster. Refer, to the [op-version](./op-version.md) section for further details
- Proceed to [upgrade the clients](#upgrade-procedure-for-clients) to 3.12 version as well

### Upgrade procedure for clients

Following are the steps to upgrade clients to the 3.12.x version,

> **NOTE:** x is the minor release number for the release

@@ -3,6 +3,7 @@

**NOTE:** Upgrade procedure remains the same as with 3.12 and 3.10 releases

### Pre-upgrade notes

- Online upgrade is only possible with replicated and distributed replicate volumes
- Online upgrade is not supported for dispersed or distributed dispersed volumes
- Ensure no configuration changes are done during the upgrade

@@ -11,80 +12,86 @@

- It is recommended to have the same client and server, major versions running eventually

### Online upgrade procedure for servers

This procedure involves upgrading **one server at a time**, while keeping the volume(s) online and client IO ongoing. This procedure assumes that multiple replicas of a replica set, are not part of the same server in the trusted storage pool.

> **ALERT**: If any of your volumes, in the trusted storage pool that is being upgraded, uses disperse or is a pure distributed volume, this procedure is **NOT** recommended, use the [Offline upgrade procedure](#offline-upgrade-procedure) instead.

#### Repeat the following steps, on each server in the trusted storage pool, to upgrade the entire pool to 3.13 version:

1. Stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd

2. Stop all applications that run on this server and access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.)

3. Install Gluster 3.13

4. Ensure that version reflects 3.13.x in the output of,

        gluster --version

    **NOTE:** x is the minor release number for the release

5. Start glusterd on the upgraded server

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. Self-heal all gluster volumes by running

        for i in `gluster volume list`; do gluster volume heal $i; done

8. Ensure that there is no heal backlog by running the below command for all volumes

        gluster volume heal <volname> info

    > NOTE: If there is a heal backlog, wait till the backlog is empty, or the backlog does not have any entries needing a sync to the just upgraded server, before proceeding to upgrade the next server in the pool

9. Restart any gfapi based application stopped previously in step (2)

### Offline upgrade procedure

This procedure involves cluster downtime and during the upgrade window, clients are not allowed access to the volumes.

#### Steps to perform an offline upgrade:

1. On every server in the trusted storage pool, stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd

2. Stop all applications that access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.), across all servers

3. Install Gluster 3.13, on all servers

4. Ensure that version reflects 3.13.x in the output of the following command on all servers,

        gluster --version

    **NOTE:** x is the minor release number for the release

5. Start glusterd on all the upgraded servers

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. Restart any gfapi based application stopped previously in step (2)

### Post upgrade steps

Perform the following steps post upgrading the entire trusted storage pool,

- It is recommended to update the op-version of the cluster. Refer, to the [op-version](./op-version.md) section for further details
- Proceed to [upgrade the clients](#upgrade-procedure-for-clients) to 3.13 version as well

### Upgrade procedure for clients

Following are the steps to upgrade clients to the 3.13.x version,

**NOTE:** x is the minor release number for the release

@@ -23,7 +23,7 @@ provided below)

1. Execute "pre-upgrade-script-for-quota.sh" mentioned under "Upgrade Steps For Quota" section.
2. Stop all glusterd, glusterfsd and glusterfs processes on your server.
3. Install GlusterFS 3.5.0
4. Start glusterd.
5. Ensure that all started volumes have processes online in “gluster volume status”.
6. Execute "Post-Upgrade Script" mentioned under "Upgrade Steps For Quota" section.

@@ -77,7 +77,7 @@ The upgrade process for quota involves executing two upgrade scripts:

1. pre-upgrade-script-for-quota.sh, and\
2. post-upgrade-script-for-quota.sh

_Pre-Upgrade Script:_

What it does:

@@ -105,11 +105,11 @@ Invocation:

Invoke the script by executing `./pre-upgrade-script-for-quota.sh`
from the shell on any one of the nodes in the cluster.

- Example:

        [root@server1 extras]#./pre-upgrade-script-for-quota.sh

_Post-Upgrade Script:_

What it does:

@@ -164,9 +164,9 @@ In the first case, invoke post-upgrade-script-for-quota.sh from the

shell for each volume with quota enabled, with the name of the volume
passed as an argument in the command-line:

- Example:

    _For a volume "vol1" on which quota is enabled, invoke the script in the following way:_

        [root@server1 extras]#./post-upgrade-script-for-quota.sh vol1

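Where several volumes have quota enabled, the per-volume invocation shown above can be looped. This is an illustrative sketch; it assumes the script is run from the `extras` directory as in the example, and that quota-enabled volumes show `features.quota: on` in `gluster volume info`:

```sh
# Run the post-upgrade quota script once for every quota-enabled volume.
for vol in $(gluster volume list); do
    if gluster volume info "$vol" | grep -q "features.quota: on"; then
        ./post-upgrade-script-for-quota.sh "$vol"
    fi
done
```
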
@@ -176,9 +176,9 @@ procedure on each one of them. In this case, invoke

post-upgrade-script-for-quota.sh from the shell with 'all' passed as an
argument in the command-line:

- Example:

        [root@server1 extras]#./post-upgrade-script-for-quota.sh all

Note:

@@ -1,4 +1,5 @@

# GlusterFS upgrade from 3.5.x to 3.6.x

Now that GlusterFS 3.6.0 is out, here is the process to upgrade from
earlier installed versions of GlusterFS.

@@ -8,15 +9,15 @@ GlusterFS clients. If you are not updating your clients to GlusterFS

version 3.6, you need to disable the client self-healing process. You can
do this with the steps below.

```{ .console .no-copy }
# gluster v set testvol cluster.entry-self-heal off
volume set: success
#

# gluster v set testvol cluster.data-self-heal off
volume set: success

# gluster v set testvol cluster.metadata-self-heal off
volume set: success
#
```

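The transcript above covers a single volume (`testvol`). If several volumes are accessed by older clients, the same three options can be set in a loop; this sketch assumes every listed volume should be treated the same way:

```sh
# Disable client-side self-heal on every volume before upgrading the servers.
for vol in $(gluster volume list); do
    for opt in cluster.entry-self-heal cluster.data-self-heal cluster.metadata-self-heal; do
        gluster volume set "$vol" "$opt" off
    done
done
```
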
### GlusterFS upgrade from 3.5.x to 3.6.x

@@ -27,7 +28,7 @@ For this approach, schedule a downtime and prevent all your clients from

accessing (umount your volumes, stop gluster volumes, etc.) the servers.

1. Stop all glusterd, glusterfsd and glusterfs processes on your server.
2. Install GlusterFS 3.6.0
3. Start glusterd.
4. Ensure that all started volumes have processes online in “gluster volume status”.

@@ -59,7 +60,7 @@ provided below)

1. Execute "pre-upgrade-script-for-quota.sh" mentioned under "Upgrade Steps For Quota" section.
2. Stop all glusterd, glusterfsd and glusterfs processes on your server.
3. Install GlusterFS 3.6.0
4. Start glusterd.
5. Ensure that all started volumes have processes online in “gluster volume status”.
6. Execute "Post-Upgrade Script" mentioned under "Upgrade Steps For Quota" section.

@@ -87,7 +88,7 @@ The upgrade process for quota involves executing two upgrade scripts:

1. pre-upgrade-script-for-quota.sh, and\
2. post-upgrade-script-for-quota.sh

_Pre-Upgrade Script:_

What it does:

@@ -121,7 +122,7 @@ Example:

[root@server1 extras]#./pre-upgrade-script-for-quota.sh
```

_Post-Upgrade Script:_

What it does:

@@ -178,7 +179,7 @@ passed as an argument in the command-line:

Example:

_For a volume "vol1" on which quota is enabled, invoke the script in the following way:_

```console
[root@server1 extras]#./post-upgrade-script-for-quota.sh vol1

@@ -227,7 +228,7 @@ covered in detail here.

**Below are the steps to upgrade:**

1. Stop the geo-replication session in older version ( \< 3.5) using
   the below command

        # gluster volume geo-replication `<master_vol>` `<slave_host>`::`<slave_vol>` stop

@@ -1,4 +1,5 @@

# GlusterFS upgrade to 3.7.x

Now that GlusterFS 3.7.0 is out, here is the process to upgrade from
earlier installed versions of GlusterFS. Please read the entire howto
before proceeding with an upgrade of your deployment

@@ -13,15 +14,15 @@ version 3.6 along with your servers you would need to disable client

self-healing process before the upgrade. You can do this with the steps
below.

```{ .console .no-copy }
# gluster v set testvol cluster.entry-self-heal off
volume set: success
#

# gluster v set testvol cluster.data-self-heal off
volume set: success

# gluster v set testvol cluster.metadata-self-heal off
volume set: success
#
```

### GlusterFS upgrade to 3.7.x

@@ -71,11 +72,11 @@ The upgrade process for quota involves the following:

1. Run pre-upgrade-script-for-quota.sh
2. Upgrade to 3.7.0
3. Run post-upgrade-script-for-quota.sh

More details on the scripts are as under.

_Pre-Upgrade Script:_

What it does:

@@ -109,7 +110,7 @@ Example:

[root@server1 extras]#./pre-upgrade-script-for-quota.sh
```

_Post-Upgrade Script:_

What it does:

@@ -1,12 +1,13 @@

## Upgrade procedure from Gluster 3.7.x

### Pre-upgrade Notes

- Online upgrade is only possible with replicated and distributed replicate volumes.
- Online upgrade is not yet supported for dispersed or distributed dispersed volumes.
- Ensure no configuration changes are done during the upgrade.
- If you are using geo-replication, please upgrade the slave cluster(s) before upgrading the master.
- Upgrading the servers ahead of the clients is recommended.
- Upgrade the clients after the servers are upgraded. It is recommended to have the same client and server major versions.

### Online Upgrade Procedure for Servers

@@ -14,7 +15,7 @@ The procedure involves upgrading one server at a time . On every storage server

- Stop all gluster services using the below command or through your favorite way to stop them.

        killall glusterfs glusterfsd glusterd

- If you are using gfapi based applications (qemu, NFS-Ganesha, Samba etc.) on the servers, please stop those applications too.

@@ -22,38 +23,39 @@ The procedure involves upgrading one server at a time . On every storage server

- Ensure that version reflects 3.8.x in the output of

        gluster --version

- Start glusterd on the upgraded server

        glusterd

- Ensure that all gluster processes are online by executing

        gluster volume status

- Self-heal all gluster volumes by running

        for i in `gluster volume list`; do gluster volume heal $i; done

- Ensure that there is no heal backlog by running the below command for all volumes

        gluster volume heal <volname> info

- Restart any gfapi based application stopped previously.

- After the upgrade is complete on all servers, run the following command:

        gluster volume set all cluster.op-version 30800

### Offline Upgrade Procedure

For this procedure, schedule a downtime and prevent all your clients from accessing the servers.

On every storage server in your trusted storage pool:

- Stop all gluster services using the below command or through your favorite way to stop them.

        killall glusterfs glusterfsd glusterd

- If you are using gfapi based applications (qemu, NFS-Ganesha, Samba etc.) on the servers, please stop those applications too.

@@ -61,25 +63,24 @@ On every storage server in your trusted storage pool:

- Ensure that version reflects 3.8.x in the output of

        gluster --version

- Start glusterd on the upgraded server

        glusterd

- Ensure that all gluster processes are online by executing

        gluster volume status

- Restart any gfapi based application stopped previously.

- After the upgrade is complete on all servers, run the following command:

        gluster volume set all cluster.op-version 30800

### Upgrade Procedure for Clients

- Unmount all glusterfs mount points on the client
- Stop applications using gfapi (qemu etc.)
- Install Gluster 3.8

@@ -9,5 +9,5 @@ Note that there is only a single difference, related to the `op-version`:

After the upgrade is complete on all servers, run the following command:

```console
gluster volume set all cluster.op-version 30900
```

@@ -3,6 +3,7 @@

**NOTE:** Upgrade procedure remains the same as with 3.12 and 3.10 releases

### Pre-upgrade notes

- Online upgrade is only possible with replicated and distributed replicate volumes
- Online upgrade is not supported for dispersed or distributed dispersed volumes
- Ensure no configuration changes are done during the upgrade

@@ -11,74 +12,79 @@

- It is recommended to have the same client and server, major versions running eventually

### Online upgrade procedure for servers

This procedure involves upgrading **one server at a time**, while keeping the volume(s) online and client IO ongoing. This procedure assumes that multiple replicas of a replica set, are not part of the same server in the trusted storage pool.

> **ALERT**: If any of your volumes, in the trusted storage pool that is being upgraded, uses disperse or is a pure distributed volume, this procedure is **NOT** recommended, use the [Offline upgrade procedure](#offline-upgrade-procedure) instead.

#### Repeat the following steps, on each server in the trusted storage pool, to upgrade the entire pool to 4.0 version:

1. Stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd

2. Stop all applications that run on this server and access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.)

3. Install Gluster 4.0

4. Ensure that version reflects 4.0.x in the output of,

        gluster --version

    **NOTE:** x is the minor release number for the release

5. Start glusterd on the upgraded server

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. Self-heal all gluster volumes by running

        for i in `gluster volume list`; do gluster volume heal $i; done

8. Ensure that there is no heal backlog by running the below command for all volumes

        gluster volume heal <volname> info

    > NOTE: If there is a heal backlog, wait till the backlog is empty, or the backlog does not have any entries needing a sync to the just upgraded server, before proceeding to upgrade the next server in the pool

9. Restart any gfapi based application stopped previously in step (2)

### Offline upgrade procedure

This procedure involves cluster downtime and during the upgrade window, clients are not allowed access to the volumes.

#### Steps to perform an offline upgrade:

1. On every server in the trusted storage pool, stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd

2. Stop all applications that access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.), across all servers

3. Install Gluster 4.0, on all servers

4. Ensure that version reflects 4.0.x in the output of the following command on all servers,

        gluster --version

    **NOTE:** x is the minor release number for the release

5. Start glusterd on all the upgraded servers

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. Restart any gfapi based application stopped previously in step (2)

### Post upgrade steps

Perform the following steps post upgrading the entire trusted storage pool,

- It is recommended to update the op-version of the cluster. Refer, to the [op-version](./op-version.md) section for further details

@@ -86,6 +92,7 @@ Perform the following steps post upgrading the entire trusted storage pool,

- Post upgrading the clients, for replicate volumes, it is recommended to enable the option `gluster volume set <volname> fips-mode-rchecksum on` to turn off usage of MD5 checksums during healing. This enables running Gluster on FIPS compliant systems.

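The option above is applied per volume. If all replicate volumes in the pool should receive it, a loop such as the following sketch can be used; it assumes the `Type:` line printed by `gluster volume info` is a reliable way to identify replicate volumes in your setup:

```sh
# Enable fips-mode-rchecksum on every (distributed-)replicate volume after the clients are upgraded.
for vol in $(gluster volume list); do
    if gluster volume info "$vol" | grep -q "^Type: .*Replicate"; then
        gluster volume set "$vol" fips-mode-rchecksum on
    fi
done
```
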
### Upgrade procedure for clients

Following are the steps to upgrade clients to the 4.0.x version,

**NOTE:** x is the minor release number for the release

@@ -3,6 +3,7 @@

> **NOTE:** Upgrade procedure remains the same as with 3.12 and 3.10 releases

### Pre-upgrade notes

- Online upgrade is only possible with replicated and distributed replicate volumes
- Online upgrade is not supported for dispersed or distributed dispersed volumes
- Ensure no configuration changes are done during the upgrade

@@ -11,88 +12,89 @@

- It is recommended to have the same client and server, major versions running eventually

### Online upgrade procedure for servers

This procedure involves upgrading **one server at a time**, while keeping the volume(s) online and client IO ongoing. This procedure assumes that multiple replicas of a replica set, are not part of the same server in the trusted storage pool.

> **ALERT:** If there are disperse or pure distributed volumes in the storage pool being upgraded, this procedure is NOT recommended, use the [Offline upgrade procedure](#offline-upgrade-procedure) instead.

#### Repeat the following steps, on each server in the trusted storage pool, to upgrade the entire pool to 4.1 version:

1. Stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd
        systemctl stop glustereventsd

2. Stop all applications that run on this server and access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.)

3. Install Gluster 4.1

4. Ensure that version reflects 4.1.x in the output of,

        gluster --version

    > **NOTE:** x is the minor release number for the release

5. Start glusterd on the upgraded server

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. If the glustereventsd service was previously enabled, it is required to start it using the commands below, or, through other means,

        systemctl start glustereventsd

8. Invoke self-heal on all the gluster volumes by running,

        for i in `gluster volume list`; do gluster volume heal $i; done

9. Verify that there is no heal backlog by running the command for all the volumes,

        gluster volume heal <volname> info

    > **NOTE:** Before proceeding to upgrade the next server in the pool it is recommended to check the heal backlog. If there is a heal backlog, it is recommended to wait until the backlog is empty, or, the backlog does not contain any entries requiring a sync to the just upgraded server.

10. Restart any gfapi based application stopped previously in step (2)

### Offline upgrade procedure

This procedure involves cluster downtime and during the upgrade window, clients are not allowed access to the volumes.

#### Steps to perform an offline upgrade:

1. On every server in the trusted storage pool, stop all gluster services, either using the command below, or through other means,

        killall glusterfs glusterfsd glusterd glustereventsd
        systemctl stop glustereventsd

2. Stop all applications that access the volumes via gfapi (qemu, NFS-Ganesha, Samba, etc.), across all servers

3. Install Gluster 4.1, on all servers

4. Ensure that version reflects 4.1.x in the output of the following command on all servers,

        gluster --version

    > **NOTE:** x is the minor release number for the release

5. Start glusterd on all the upgraded servers

        glusterd

6. Ensure that all gluster processes are online by checking the output of,

        gluster volume status

7. If the glustereventsd service was previously enabled, it is required to start it using the commands below, or, through other means,

        systemctl start glustereventsd

8. Restart any gfapi based application stopped previously in step (2)

### Post upgrade steps

Perform the following steps post upgrading the entire trusted storage pool,

- It is recommended to update the op-version of the cluster. Refer, to the [op-version](./op-version.md) section for further details

@@ -100,6 +102,7 @@ Perform the following steps post upgrading the entire trusted storage pool,

- Post upgrading the clients, for replicate volumes, it is recommended to enable the option `gluster volume set <volname> fips-mode-rchecksum on` to turn off usage of MD5 checksums during healing. This enables running Gluster on FIPS compliant systems.

### Upgrade procedure for clients

Following are the steps to upgrade clients to the 4.1.x version,

> **NOTE:** x is the minor release number for the release

@@ -8,15 +8,16 @@ version reference.

### Major issues

1. The following options are removed from the code base and require to be unset
   before an upgrade from releases older than release 4.1.0,

    - features.lock-heal
    - features.grace-timeout

To check if these options are set use,

```console
gluster volume info
```

and ensure that the above options are not part of the `Options Reconfigured:`

@@ -24,7 +25,7 @@ section in the output of all volumes in the cluster.

If these are set, then unset them using the following commands,

```{ .console .no-copy }
# gluster volume reset <volname> <option>
```

@@ -11,15 +11,16 @@ version reference.

### Major issues

1. The following options are removed from the code base and require to be unset
   before an upgrade from releases older than release 4.1.0,

    - features.lock-heal
    - features.grace-timeout

To check if these options are set use,

```console
gluster volume info
```

and ensure that the above options are not part of the `Options Reconfigured:`

@@ -27,7 +28,7 @@ section in the output of all volumes in the cluster.

If these are set, then unset them using the following commands,

```{ .console .no-copy }
# gluster volume reset <volname> <option>
```

@@ -10,22 +10,23 @@ documented instructions, replacing 7 when you encounter 4.1 in the guide as the

version reference.

> **NOTE:** If you have ever enabled quota on your volumes then after the upgrade
> is done, you will have to restart all the nodes in the cluster one by one so as to
> fix the checksum values in the quota.cksum file under the `/var/lib/glusterd/vols/<volname>/` directory.
> The peers may go into `Peer rejected` state while doing so but once all the nodes are rebooted
> everything will be back to normal.

### Major issues

1. The following options are removed from the code base and require to be unset
   before an upgrade from releases older than release 4.1.0,

    - features.lock-heal
    - features.grace-timeout

To check if these options are set use,

```console
gluster volume info
```

and ensure that the above options are not part of the `Options Reconfigured:`

@@ -33,7 +34,7 @@ section in the output of all volumes in the cluster.

If these are set, then unset them using the following commands,

```{ .console .no-copy }
# gluster volume reset <volname> <option>
```

aware of the features and fixes provided with the release.

> With version 8, there are certain changes introduced to the directory structure of changelog files in gluster geo-replication.
> Thus, before the upgrade of geo-rep packages, we need to execute the [upgrade script](https://github.com/gluster/glusterfs/commit/2857fe3fad4d2b30894847088a54b847b88a23b9) with the brick path as argument, as described below:
>
> 1. Stop the geo-rep session
> 2. Run the upgrade script with the brick path as the argument. The script can be run in a loop for multiple bricks (see the sketch after this note).
> 3. Start the upgrade process.
>
> This script will update the existing changelog directory structure and the paths inside the htime files to a new format introduced in version 8.
> If the above-mentioned script is not executed, the search algorithm used during the history crawl will fail with the wrong result when upgrading from version 7 or below to version 8 or above.
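A minimal sketch of steps 1 and 2, assuming the script from the linked commit was saved locally as `glusterfs-georep-upgrade.py` and that the bricks live under `/bricks` (both are assumptions; adjust names and paths to your setup):

```{ .console .no-copy }
# gluster volume geo-replication <primary-volume> <secondary-host>::<secondary-volume> stop
# for brick in /bricks/brick1 /bricks/brick2; do python3 glusterfs-georep-upgrade.py "$brick"; done
```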
Refer to the [generic upgrade procedure](./generic-upgrade-procedure.md) guide and follow documented instructions.

## Major issues

### The following options are removed from the code base and require to be unset

before an upgrade from releases older than release 4.1.0,

- features.lock-heal
- features.grace-timeout

To check if these options are set use,

```console
gluster volume info
```

and ensure that the above options are not part of the `Options Reconfigured:`
section in the output of all volumes in the cluster.

If these are set, then unset them using the following commands,

```{ .console .no-copy }
# gluster volume reset <volname> <option>
```

- Tiering support (tier xlator and changetimerecorder)
- Glupy

**NOTE:** Failure to do the above may result in failure during online upgrades,
and the reset of these options to their defaults needs to be done **prior** to
upgrading the cluster.
Refer to the [generic upgrade procedure](./generic-upgrade-procedure.md) guide and follow documented instructions.

## Major issues

### The following options are removed from the code base and require to be unset

before an upgrade from releases older than release 4.1.0,

- features.lock-heal
- features.grace-timeout

To check if these options are set use,

```console
gluster volume info
```

and ensure that the above options are not part of the `Options Reconfigured:`
section in the output of all volumes in the cluster.

If these are set, then unset them using the following commands,

```{ .console .no-copy }
# gluster volume reset <volname> <option>
```

### Make sure you are not using any of the following deprecated features:

- Block device (bd) xlator
- Decompounder feature
- Tiering support (tier xlator and changetimerecorder)
- Glupy

**NOTE:** Failure to do the above may result in failure during online upgrades,
and the reset of these options to their defaults needs to be done **prior** to
upgrading the cluster.
docs/glossary.md

# Glossary

**Access Control Lists**
: Access Control Lists (ACLs) allow you to assign different permissions
for different users or groups even though they do not correspond to the
original owner or the owning group.

**Block Storage**
: Block special files, or block devices, correspond to devices through which the system moves
data in the form of blocks. These device nodes often represent addressable devices such as
hard disks, CD-ROM drives, or memory regions. GlusterFS requires a filesystem (like XFS) that
supports extended attributes.

**Brick**
: A Brick is the basic unit of storage in GlusterFS, represented by an export directory
on a server in the trusted storage pool.
A brick is expressed by combining a server with an export directory in the following format:

```{ .text .no-copy }
SERVER:EXPORT
For example:
myhostname:/exports/myexportdir/
```

**Client**
: Any machine that mounts a GlusterFS volume. Any applications that use libgfapi access
mechanism can also be treated as clients in GlusterFS context.

**Cluster**
: A trusted pool of linked computers working together, resembling a single computing resource.
In GlusterFS, a cluster is also referred to as a trusted storage pool.

**Distributed File System**
: A file system that allows multiple clients to concurrently access data which is spread across
servers/bricks in a trusted storage pool. Data sharing among multiple locations is fundamental
to all distributed file systems.

**Extended Attributes**
: Extended file attributes (abbreviated xattr) is a filesystem feature that enables
users/programs to associate files/dirs with metadata. Gluster stores metadata in xattrs.
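For illustration only (not part of the original glossary), the xattrs Gluster stores on a brick can be inspected directly on the server; the brick path below is a made-up example:

```{ .console .no-copy }
# getfattr -d -m . -e hex /data/brick1/path/to/file
```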
**Filesystem**
: A method of storing and organizing computer files and their data.
Essentially, it organizes these files into a database for the
storage, organization, manipulation, and retrieval by the computer's
operating system.

Source [Wikipedia][wikipedia]

**FUSE**
: Filesystem in Userspace (FUSE) is a loadable kernel module for Unix-like
computer operating systems that lets non-privileged users create their
own file systems without editing kernel code. This is achieved by
running file system code in user space while the FUSE module provides
only a "bridge" to the actual kernel interfaces.

Source: [Wikipedia][1]

**GFID**
: Each file/directory on GlusterFS has a unique 128-bit number
associated with it called the GFID. This is analogous to inode in a
regular filesystem.

**glusterd**
: The Gluster daemon/service that manages volumes and cluster membership. It is required to
run on all the servers in the trusted storage pool.

**Geo-Replication**
: Geo-replication provides a continuous, asynchronous, and incremental
replication service from one site to another over Local Area Networks
(LANs), Wide Area Networks (WANs), and across the Internet.

**Infiniband**
: InfiniBand is a switched fabric computer network communications link
used in high-performance computing and enterprise data centers.

**Metadata**
: Metadata is defined as data providing information about one or more
other pieces of data. There is no special metadata storage concept in
GlusterFS. The metadata is stored with the file data itself, usually in the
form of extended attributes.

**Namespace**
: A namespace is an abstract container or environment created to hold a
logical grouping of unique identifiers or symbols. Each Gluster volume
exposes a single namespace as a POSIX mount point that contains every
file in the cluster.

**Node**
: A server or computer that hosts one or more bricks.

**N-way Replication**
: Local synchronous data replication which is typically deployed across campus
or Amazon Web Services Availability Zones.

**Petabyte**
: A petabyte (derived from the SI prefix peta- ) is a unit of
information equal to one quadrillion (short scale) bytes, or 1000
terabytes. The unit symbol for the petabyte is PB. The prefix peta-
(P) indicates a power of 1000:

```{ .text .no-copy }
1 PB = 1,000,000,000,000,000 B = 1000^5 B = 10^15 B.

The term "pebibyte" (PiB), using a binary prefix, is used for the
corresponding power of 1024.
```

Source: [Wikipedia][3]

**POSIX**
: Portable Operating System Interface (for Unix) is the name of a family
of related standards specified by the IEEE to define the application
programming interface (API), along with shell and utilities interfaces
for software compatible with variants of the Unix operating system.
Gluster exports a POSIX compatible file system.

**Quorum**
: The configuration of quorum in a trusted storage pool determines the
number of server failures that the trusted storage pool can sustain.
If an additional failure occurs, the trusted storage pool becomes
unavailable.

**Quota**
: Quota allows you to set limits on usage of disk space by directories or
by volumes.

**RAID**
: Redundant Array of Inexpensive Disks (RAID) is a technology that provides
increased storage reliability through redundancy, combining multiple
low-cost, less-reliable disk drive components into a logical unit where
all drives in the array are interdependent.

**RDMA**
: Remote direct memory access (RDMA) is a direct memory access from the
memory of one computer into that of another without involving either
one's operating system. This permits high-throughput, low-latency
networking, which is especially useful in massively parallel computer
clusters.

**Rebalance**
: The process of redistributing data in a distributed volume when a
brick is added or removed.

**RRDNS**
: Round Robin Domain Name Service (RRDNS) is a method to distribute load
across application servers. It is implemented by creating multiple A
records with the same name and different IP addresses in the zone file
of a DNS server.

**Samba**
: Samba allows file and print sharing between computers running Windows and
computers running Linux. It is an implementation of several services and
protocols including SMB and CIFS.

**Scale-Up Storage**
: Increases the capacity of the storage device in a single dimension.
For example, adding additional disk capacity to an existing trusted storage pool.

**Scale-Out Storage**
: Scale out systems are designed to scale on both capacity and performance.
It increases the capability of a storage device in a single dimension.
For example, adding more systems of the same size, or adding servers to a trusted storage pool
that increases CPU, disk capacity, and throughput for the trusted storage pool.

**Self-Heal**
: The self-heal daemon runs in the background, identifies
inconsistencies in files/dirs in a replicated or erasure coded volume, and then resolves
or heals them. This healing process is usually required when one or more
bricks of a volume goes down and then comes up later.

**Server**
: The machine (virtual or bare metal) that hosts the bricks in which data is stored.

**Split-brain**
: A situation where data on two or more bricks in a replicated
volume starts to diverge in terms of content or metadata. In this state,
one cannot determine programmatically which set of data is "right" and
which is "wrong".

**Subvolume**
: A brick after being processed by at least one translator.

**Translator**
: Translators (also called xlators) are stackable modules where each
module has a very specific purpose. Translators are stacked in a
hierarchical structure called a graph. A translator receives data
from its parent translator, performs necessary operations and then
passes the data down to its child translator in the hierarchy.

**Trusted Storage Pool**
: A storage pool is a trusted network of storage servers. When you start
the first server, the storage pool consists of that server alone.

**Userspace**
: Applications running in user space don’t directly interact with
hardware, instead using the kernel to moderate access. Userspace
applications are generally more portable than applications in kernel
space. Gluster is a user space application.

**Virtual File System (VFS)**
: VFS is a kernel software layer which handles all system calls related to the standard Linux file system.
It provides a common interface to several kinds of file systems.

**Volume**
: A volume is a logical collection of bricks.

**Vol file**
: Vol files or volume (.vol) files are configuration files that determine the behavior of the
Gluster trusted storage pool. It is a textual representation of a
collection of modules (also known as translators) that together implement the
various functions required.

[wikipedia]: http://en.wikipedia.org/wiki/Filesystem
[1]: http://en.wikipedia.org/wiki/Filesystem_in_Userspace
[2]: http://en.wikipedia.org/wiki/Open_source
[3]: http://en.wikipedia.org/wiki/Petabyte
docs/js/custom-features.js (new file)

// Add ability to copy the current URL using vim-like shortcuts
// There already exist navigation-related shortcuts like
// F/S -- For Searching
// P/N -- For navigating to previous/next pages
// This patch just extends those features

// Expose the internal notification API of mkdocs
// This API isn't exposed publicly, IDK why
// They use it internally to show notifications when the user copies a code block
// I reverse engineered it for ease of use, takes a string arg `msg`
const notifyDOM = (msg) => {
  if (typeof alert$ === "undefined") {
    console.error("Clipboard notification API not available");
    return;
  }

  alert$.next(msg);
};

// Extend the keyboard shortcut features
keyboard$.subscribe((key) => {
  // We want to allow the user to be able to type our modifiers in search
  // Disallowing that would be hilarious
  if (key.mode === "search") {
    return;
  }

  const keyPressed = key.type.toLowerCase();

  // Y is added to honor vim enthusiasts (yank)
  if (keyPressed === "c" || keyPressed === "y") {
    const currLocation = window.location.href;
    if (currLocation) {
      navigator.clipboard
        .writeText(currLocation)
        .then(() => notifyDOM("Address copied to clipboard"))
        .catch((e) => console.error(e));
    }
  }
});
@@ -1,74 +0,0 @@

(function () {
  'use strict';

  $(document).ready(function () {
    fixSearchResults();
    fixSearch();
    warnDomain();
  });

  /**
   * Adds a TOC-style table to each page in the 'Modules' section.
   */
  function fixSearchResults() {
    $('#mkdocs-search-results').text('Searching...');
  }

  /**
   * Warn if the domain is gluster.readthedocs.io
   *
   */
  function warnDomain() {
    var domain = window.location.hostname;
    if (domain.indexOf('readthedocs.io') != -1) {
      $('div.section').prepend('<div class="warning"><p>You are viewing outdated content. We have moved to <a href="http://docs.gluster.org' + window.location.pathname + '">docs.gluster.org.</a></p></div>');
    }
  }

  /*
   * RTD messes up MkDocs' search feature by tinkering with the search box defined in the theme, see
   * https://github.com/rtfd/readthedocs.org/issues/1088. This function sets up a DOM4 MutationObserver
   * to react to changes to the search form (triggered by RTD on doc ready). It then reverts everything
   * the RTD JS code modified.
   */
  function fixSearch() {
    var target = document.getElementById('mkdocs-search-form');
    var config = {attributes: true, childList: true};

    var observer = new MutationObserver(function(mutations) {
      // if it isn't disconnected it'll loop infinitely because the observed element is modified
      observer.disconnect();
      var form = $('#mkdocs-search-form');
      form.empty();
      form.attr('action', 'https://' + window.location.hostname + '/en/' + determineSelectedBranch() + '/search.html');
      $('<input>').attr({
        type: "text",
        name: "q",
        placeholder: "Search docs"
      }).appendTo(form);
    });

    if (window.location.origin.indexOf('readthedocs') > -1 || window.location.origin.indexOf('docs.gluster.org') > -1) {
      observer.observe(target, config);
    }
  }

  /**
   * Analyzes the URL of the current page to find out what the selected GitHub branch is. It's usually
   * part of the location path. The code needs to distinguish between running MkDocs standalone
   * and docs served from RTD. If no valid branch could be determined, 'dev' is returned.
   *
   * @returns GitHub branch name
   */
  function determineSelectedBranch() {
    var branch = 'latest', path = window.location.pathname;
    if (window.location.origin.indexOf('readthedocs') > -1) {
      // path is like /en/<branch>/<lang>/build/ -> extract 'lang'
      // split[0] is an '' because the path starts with the separator
      branch = path.split('/')[2];
    }
    return branch;
  }

}());
This is a major release that includes a range of features, code improvements and bug fixes.
A selection of the key features and changes are documented in this page.
A full list of bugs that have been addressed is included further below.

- [Release notes for Gluster 10.0](#release-notes-for-gluster-100)
  - [Announcements](#announcements)
  - [Builds are available at -](#builds-are-available-at--)
  - [Highlights](#highlights)
  - [Bugs addressed](#bugs-addressed)

## Announcements

1. The release receiving maintenance updates after release 10 is release 9
   ([reference](https://www.gluster.org/release-schedule/))
2. Release 10 will receive maintenance updates around the 15th of every alternate month, and release 9 will receive maintenance updates around the 15th of every third month.

## Builds are available at -

[https://download.gluster.org/pub/gluster/glusterfs/10/10.0/](https://download.gluster.org/pub/gluster/glusterfs/10/10.0/)

## Highlights

- Major performance improvement of ~20% for both small-file and large-file workloads
  in controlled lab testing [#2771](https://github.com/gluster/glusterfs/issues/2771)

  **NOTE**: The above improvement requires the tcmalloc library to be enabled when building. We have tested and verified tcmalloc on x86_64 platforms, and it is enabled only for x86_64 builds in the current release.

- Randomized port selection for bricks, improves startup time [#786](https://github.com/gluster/glusterfs/issues/786)
- Performance improvement with use of readdir instead of readdirp in fix-layout [#2241](https://github.com/gluster/glusterfs/issues/2241)
- Heal time improvement with bigger window size [#2067](https://github.com/gluster/glusterfs/issues/2067)
Bugs addressed since release-10 are listed below.

- [#504](https://github.com/gluster/glusterfs/issues/504) AFR: remove memcpy() + ntoh32() pattern
- [#705](https://github.com/gluster/glusterfs/issues/705) gf_backtrace_save inefficiencies
- [#782](https://github.com/gluster/glusterfs/issues/782) Do not explicitly call strerror(errnum) when logging
- [#786](https://github.com/gluster/glusterfs/issues/786) glusterd-pmap binds to 10K ports on startup (using IPv4)
- [#904](https://github.com/gluster/glusterfs/issues/904) [bug:1649037] Translators allocate too much memory in their xlator\_
- [#1000](https://github.com/gluster/glusterfs/issues/1000) [bug:1193929] GlusterFS can be improved
- [#1002](https://github.com/gluster/glusterfs/issues/1002) [bug:1679998] GlusterFS can be improved
- [#1052](https://github.com/gluster/glusterfs/issues/1052) [bug:1693692] Increase code coverage from regression tests
- [#1060](https://github.com/gluster/glusterfs/issues/1060) [bug:789278] Issues reported by Coverity static analysis tool
- [#1096](https://github.com/gluster/glusterfs/issues/1096) [bug:1622665] clang-scan report: glusterfs issues
- [#1101](https://github.com/gluster/glusterfs/issues/1101) [bug:1813029] volume brick fails to come online because other proce
- [#1251](https://github.com/gluster/glusterfs/issues/1251) performance: improve \_\_afr_fd_ctx_get() function
- [#1339](https://github.com/gluster/glusterfs/issues/1339) Rebalance status is not shown correctly after node reboot
- [#1358](https://github.com/gluster/glusterfs/issues/1358) features/shard: wrong "inode->ref" leading to ASSERT in inode_unref
- [#1359](https://github.com/gluster/glusterfs/issues/1359) Cleanup --disable-mempool
- [#1380](https://github.com/gluster/glusterfs/issues/1380) fd_unref() optimization - do an atomic decrement outside the lock a
- [#1384](https://github.com/gluster/glusterfs/issues/1384) mount glusterfs volume, files larger than 64Mb only show 64Mb
- [#1406](https://github.com/gluster/glusterfs/issues/1406) shared storage volume fails to mount in ipv6 environment
- [#1415](https://github.com/gluster/glusterfs/issues/1415) Removing problematic language in geo-replication
- [#1423](https://github.com/gluster/glusterfs/issues/1423) shard_make_block_abspath() should be called with a string of of the
- [#1536](https://github.com/gluster/glusterfs/issues/1536) Improve dict_reset() efficiency
- [#1545](https://github.com/gluster/glusterfs/issues/1545) fuse_invalidate_entry() - too many repetitive calls to uuid_utoa()
- [#1583](https://github.com/gluster/glusterfs/issues/1583) Rework stats structure (xl->stats.total.metrics[fop_idx] and friend
- [#1584](https://github.com/gluster/glusterfs/issues/1584) MAINTAINERS file needs to be revisited and updated
- [#1596](https://github.com/gluster/glusterfs/issues/1596) 'this' NULL check relies on 'THIS' not being NULL
- [#1600](https://github.com/gluster/glusterfs/issues/1600) Save and re-use MYUUID
- [#1678](https://github.com/gluster/glusterfs/issues/1678) Improve gf_error_to_errno() and gf_errno_to_error() positive flow
- [#1695](https://github.com/gluster/glusterfs/issues/1695) Rebalance has a redundant lookup operation
- [#1702](https://github.com/gluster/glusterfs/issues/1702) Move GF_CLIENT_PID_GSYNCD check to start of the function.
- [#1703](https://github.com/gluster/glusterfs/issues/1703) Remove trivial check for GF_XATTR_SHARD_FILE_SIZE before calling sh
- [#1707](https://github.com/gluster/glusterfs/issues/1707) PL_LOCAL_GET_REQUESTS access the dictionary twice for the same info
- [#1717](https://github.com/gluster/glusterfs/issues/1717) glusterd: sequence of rebalance and replace/reset-brick presents re
- [#1723](https://github.com/gluster/glusterfs/issues/1723) DHT: further investigation for treating an ongoing mknod's linkto file
- [#1749](https://github.com/gluster/glusterfs/issues/1749) brick-process: call 'notify()' and 'fini()' of brick xlators in a p
- [#1755](https://github.com/gluster/glusterfs/issues/1755) Reduce calls to 'THIS' in fd_destroy() and others, where 'THIS' is
- [#1761](https://github.com/gluster/glusterfs/issues/1761) CONTRIBUTING.md regression can only be run by maintainers
- [#1764](https://github.com/gluster/glusterfs/issues/1764) Slow write on ZFS bricks after healing millions of files due to add
- [#1772](https://github.com/gluster/glusterfs/issues/1772) build: add LTO as a configure option
- [#1773](https://github.com/gluster/glusterfs/issues/1773) DHT/Rebalance - Remove unused variable dht_migrate_file
- [#1779](https://github.com/gluster/glusterfs/issues/1779) Add-brick command should check hostnames with bricks present in vol
- [#1825](https://github.com/gluster/glusterfs/issues/1825) Latency in io-stats should be in nanoseconds resolution, not micros
- [#1872](https://github.com/gluster/glusterfs/issues/1872) Question: How to check heal info without glusterd management layer
- [#1885](https://github.com/gluster/glusterfs/issues/1885) \_\_posix_writev() - reduce memory copies and unneeded zeroing
- [#1888](https://github.com/gluster/glusterfs/issues/1888) GD_OP_VERSION needs to be updated for release-10
- [#1898](https://github.com/gluster/glusterfs/issues/1898) schedule_georep.py resulting in failure when used with python3
- [#1909](https://github.com/gluster/glusterfs/issues/1909) core: Avoid several dict OR key is NULL message in brick logs
- [#1925](https://github.com/gluster/glusterfs/issues/1925) dht_pt_getxattr does not seem to handle virtual xattrs.
- [#1935](https://github.com/gluster/glusterfs/issues/1935) logging to syslog instead of any glusterfs logs
- [#1943](https://github.com/gluster/glusterfs/issues/1943) glusterd-volgen: Add functionality to accept any custom xlator
- [#1952](https://github.com/gluster/glusterfs/issues/1952) posix-aio: implement GF_FOP_FSYNC
- [#1959](https://github.com/gluster/glusterfs/issues/1959) Broken links in the 2 replicas split-brain-issue - [Bug]Enhancemen
- [#1960](https://github.com/gluster/glusterfs/issues/1960) Add missing LOCK_DESTROY() calls
- [#1966](https://github.com/gluster/glusterfs/issues/1966) Can't print trace details due to memory allocation issues
- [#1977](https://github.com/gluster/glusterfs/issues/1977) Inconsistent locking in presence of disconnects
- [#1978](https://github.com/gluster/glusterfs/issues/1978) test case ./tests/bugs/core/bug-1432542-mpx-restart-crash.t is gett
- [#1981](https://github.com/gluster/glusterfs/issues/1981) Reduce posix_fdstat() calls in IO paths
- [#1991](https://github.com/gluster/glusterfs/issues/1991) mdcache: bug causes getxattr() to report ENODATA when fetching samb
- [#1992](https://github.com/gluster/glusterfs/issues/1992) dht: var decommission_subvols_cnt becomes invalid when config is up
- [#1996](https://github.com/gluster/glusterfs/issues/1996) Analyze if spinlocks have any benefit and remove them if not
- [#2001](https://github.com/gluster/glusterfs/issues/2001) Error handling in /usr/sbin/gluster-eventsapi produces AttributeErr
- [#2005](https://github.com/gluster/glusterfs/issues/2005) ./tests/bugs/replicate/bug-921231.t is continuously failing
- [#2013](https://github.com/gluster/glusterfs/issues/2013) dict_t hash-calculation can be removed when hash_size=1
- [#2024](https://github.com/gluster/glusterfs/issues/2024) Remove gfs_id variable or at least set to appropriate value
- [#2025](https://github.com/gluster/glusterfs/issues/2025) list_del() should not set prev and next
- [#2033](https://github.com/gluster/glusterfs/issues/2033) tests/bugs/nfs/bug-1053579.t fails on CentOS 8
- [#2038](https://github.com/gluster/glusterfs/issues/2038) shard_unlink() fails due to no space to create marker file
- [#2039](https://github.com/gluster/glusterfs/issues/2039) Do not allow POSIX IO backend switch when the volume is running
- [#2042](https://github.com/gluster/glusterfs/issues/2042) mount ipv6 gluster volume with serveral backup-volfile-servers,use
- [#2052](https://github.com/gluster/glusterfs/issues/2052) Revert the commit 50e953e2450b5183988c12e87bdfbc997e0ad8a8
- [#2054](https://github.com/gluster/glusterfs/issues/2054) cleanup call_stub_t from unused variables
- [#2063](https://github.com/gluster/glusterfs/issues/2063) Provide autoconf option to enable/disable storage.linux-io_uring du
- [#2067](https://github.com/gluster/glusterfs/issues/2067) Change self-heal-window-size to 1MB by default
- [#2075](https://github.com/gluster/glusterfs/issues/2075) Annotate synctasks with valgrind API if --enable-valgrind[=memcheck
- [#2080](https://github.com/gluster/glusterfs/issues/2080) Glustereventsd default port
- [#2083](https://github.com/gluster/glusterfs/issues/2083) GD_MSG_DICT_GET_FAILED should not include 'errno' but 'ret'
- [#2086](https://github.com/gluster/glusterfs/issues/2086) Move tests/00-geo-rep/00-georep-verify-non-root-setup.t to tests/00
- [#2096](https://github.com/gluster/glusterfs/issues/2096) iobuf_arena structure doesn't need passive and active iobufs, but l
- [#2099](https://github.com/gluster/glusterfs/issues/2099) 'force' option does not work in the replicated volume snapshot crea
- [#2101](https://github.com/gluster/glusterfs/issues/2101) Move 00-georep-verify-non-root-setup.t back to tests/00-geo-rep/
- [#2107](https://github.com/gluster/glusterfs/issues/2107) mount crashes when setfattr -n distribute.fix.layout -v "yes" is ex
- [#2116](https://github.com/gluster/glusterfs/issues/2116) enable quota for multiple volumes take more time
- [#2117](https://github.com/gluster/glusterfs/issues/2117) Concurrent quota enable causes glusterd deadlock
- [#2123](https://github.com/gluster/glusterfs/issues/2123) Implement an I/O framework
- [#2129](https://github.com/gluster/glusterfs/issues/2129) CID 1445996 Null pointer dereferences (FORWARD_NULL) /xlators/mgmt/
- [#2130](https://github.com/gluster/glusterfs/issues/2130) stack.h/c: remove unused variable and reorder struct
- [#2133](https://github.com/gluster/glusterfs/issues/2133) Changelog History Crawl failed after resuming stopped geo-replicati
- [#2134](https://github.com/gluster/glusterfs/issues/2134) Fix spurious failures caused by change in profile info duration to
- [#2138](https://github.com/gluster/glusterfs/issues/2138) glfs_write() dumps a core file file when buffer size is 1GB
- [#2154](https://github.com/gluster/glusterfs/issues/2154) "Operation not supported" doing a chmod on a symlink
- [#2159](https://github.com/gluster/glusterfs/issues/2159) Remove unused component tests
|
||||
- [#2161](https://github.com/gluster/glusterfs/issues/2161) Crash caused by memory corruption
|
||||
- [#2169](https://github.com/gluster/glusterfs/issues/2169) Stack overflow when parallel-readdir is enabled
|
||||
- [#2180](https://github.com/gluster/glusterfs/issues/2180) CID 1446716: Memory - illegal accesses (USE_AFTER_FREE) /xlators/mg
|
||||
- [#2187](https://github.com/gluster/glusterfs/issues/2187) [Input/output error] IO failure while performing shrink operation w
|
||||
- [#2190](https://github.com/gluster/glusterfs/issues/2190) Move a test case tests/basic/glusterd-restart-shd-mux.t to flaky
|
||||
- [#2192](https://github.com/gluster/glusterfs/issues/2192) 4+1 arbiter setup is broken
|
||||
- [#2198](https://github.com/gluster/glusterfs/issues/2198) There are blocked inodelks for a long time
|
||||
- [#2216](https://github.com/gluster/glusterfs/issues/2216) Fix coverity issues
|
||||
- [#2232](https://github.com/gluster/glusterfs/issues/2232) "Invalid argument" when reading a directory with gfapi
|
||||
- [#2234](https://github.com/gluster/glusterfs/issues/2234) Segmentation fault in directory quota daemon for replicated volume
|
||||
- [#2239](https://github.com/gluster/glusterfs/issues/2239) rebalance crashes in dht on master
|
||||
- [#2241](https://github.com/gluster/glusterfs/issues/2241) Using readdir instead of readdirp for fix-layout increases performa
|
||||
- [#2253](https://github.com/gluster/glusterfs/issues/2253) Disable lookup-optimize by default in the virt group
|
||||
- [#2258](https://github.com/gluster/glusterfs/issues/2258) Provide option to disable fsync in data migration
|
||||
- [#2260](https://github.com/gluster/glusterfs/issues/2260) failed to list quota info after setting limit-usage
|
||||
- [#2268](https://github.com/gluster/glusterfs/issues/2268) dht_layout_unref() only uses 'this' to check that 'this->private' i
|
||||
- [#2278](https://github.com/gluster/glusterfs/issues/2278) nfs-ganesha does not start due to shared storage not ready, but ret
|
||||
- [#2287](https://github.com/gluster/glusterfs/issues/2287) runner infrastructure fails to provide platfrom independent error c
|
||||
- [#2294](https://github.com/gluster/glusterfs/issues/2294) dict.c: remove some strlen() calls if using DICT_LIST_IMP
|
||||
- [#2308](https://github.com/gluster/glusterfs/issues/2308) Developer sessions for glusterfs
|
||||
- [#2313](https://github.com/gluster/glusterfs/issues/2313) Long setting names mess up the columns and break parsing
|
||||
- [#2317](https://github.com/gluster/glusterfs/issues/2317) Rebalance doesn't migrate some sparse files
|
||||
- [#2328](https://github.com/gluster/glusterfs/issues/2328) "gluster volume set <volname> group samba" needs to include write-b
|
||||
- [#2330](https://github.com/gluster/glusterfs/issues/2330) gf_msg can cause relock deadlock
|
||||
- [#2334](https://github.com/gluster/glusterfs/issues/2334) posix_handle_soft() is doing an unnecessary stat
|
||||
- [#2337](https://github.com/gluster/glusterfs/issues/2337) memory leak observed in lock fop
|
||||
- [#2348](https://github.com/gluster/glusterfs/issues/2348) Gluster's test suite on RHEL 8 runs slower than on RHEL 7
|
||||
- [#2351](https://github.com/gluster/glusterfs/issues/2351) glusterd: After upgrade on release 9.1 glusterd protocol is broken
|
||||
- [#2353](https://github.com/gluster/glusterfs/issues/2353) Permission issue after upgrading to Gluster v9.1
|
||||
- [#2360](https://github.com/gluster/glusterfs/issues/2360) extras: postscript fails on logrotation of snapd logs
|
||||
- [#2364](https://github.com/gluster/glusterfs/issues/2364) After the service is restarted, a large number of handles are not r
|
||||
- [#2370](https://github.com/gluster/glusterfs/issues/2370) glusterd: Issues with custom xlator changes
|
||||
- [#2378](https://github.com/gluster/glusterfs/issues/2378) Remove sys_fstatat() from posix_handle_unset_gfid() function - not
|
||||
- [#2380](https://github.com/gluster/glusterfs/issues/2380) Remove sys_lstat() from posix_acl_xattr_set() - not needed
|
||||
- [#2388](https://github.com/gluster/glusterfs/issues/2388) Geo-replication gets delayed when there are many renames on primary
|
||||
- [#2394](https://github.com/gluster/glusterfs/issues/2394) Spurious failure in tests/basic/fencing/afr-lock-heal-basic.t
|
||||
- [#2398](https://github.com/gluster/glusterfs/issues/2398) Bitrot and scrub process showed like unknown in the gluster volume
|
||||
- [#2404](https://github.com/gluster/glusterfs/issues/2404) Spurious failure of tests/bugs/ec/bug-1236065.t
|
||||
- [#2407](https://github.com/gluster/glusterfs/issues/2407) configure glitch with CC=clang
|
||||
- [#2410](https://github.com/gluster/glusterfs/issues/2410) dict_xxx_sizen variant compilation should fail on passing a variabl
|
||||
- [#2414](https://github.com/gluster/glusterfs/issues/2414) Prefer mallinfo2() to mallinfo() if available
|
||||
- [#2421](https://github.com/gluster/glusterfs/issues/2421) rsync should not try to sync internal xattrs.
|
||||
- [#2429](https://github.com/gluster/glusterfs/issues/2429) Use file timestamps with nanosecond precision
|
||||
- [#2431](https://github.com/gluster/glusterfs/issues/2431) Drop --disable-syslog configuration option
|
||||
- [#2440](https://github.com/gluster/glusterfs/issues/2440) Geo-replication not working on Ubuntu 21.04
|
||||
- [#2443](https://github.com/gluster/glusterfs/issues/2443) Core dumps on Gluster 9 - 3 replicas
|
||||
- [#2446](https://github.com/gluster/glusterfs/issues/2446) client_add_lock_for_recovery() - new_client_lock() should be called
|
||||
- [#2467](https://github.com/gluster/glusterfs/issues/2467) failed to open /proc/0/status: No such file or directory
|
||||
- [#2470](https://github.com/gluster/glusterfs/issues/2470) sharding: [inode.c:1255:__inode_unlink] 0-inode: dentry not found
|
||||
- [#2480](https://github.com/gluster/glusterfs/issues/2480) Brick going offline on another host as well as the host which reboo
|
||||
- [#2502](https://github.com/gluster/glusterfs/issues/2502) xlator/features/locks/src/common.c has code duplication
|
||||
- [#2507](https://github.com/gluster/glusterfs/issues/2507) Use appropriate msgid in gf_msg()
|
||||
- [#2515](https://github.com/gluster/glusterfs/issues/2515) Unable to mount the gluster volume using fuse unless iptables is fl
|
||||
- [#2522](https://github.com/gluster/glusterfs/issues/2522) ganesha_ha (extras/ganesha/ocf): ganesha_grace RA fails in start()
|
||||
- [#2540](https://github.com/gluster/glusterfs/issues/2540) delay-gen doesn't work correctly for delays longer than 2 seconds
|
||||
- [#2551](https://github.com/gluster/glusterfs/issues/2551) Sometimes the lock notification feature doesn't work
|
||||
- [#2581](https://github.com/gluster/glusterfs/issues/2581) With strict-locks enabled clients which are holding posix locks sti
|
||||
- [#2590](https://github.com/gluster/glusterfs/issues/2590) trusted.io-stats-dump extended attribute usage description error
|
||||
- [#2611](https://github.com/gluster/glusterfs/issues/2611) Granular entry self-heal is taking more time than full entry self h
|
||||
- [#2617](https://github.com/gluster/glusterfs/issues/2617) High CPU utilization of thread glfs_fusenoti and huge delays in som
|
||||
- [#2620](https://github.com/gluster/glusterfs/issues/2620) Granular entry heal purging of index name trigger two lookups in th
|
||||
- [#2625](https://github.com/gluster/glusterfs/issues/2625) auth.allow value is corrupted after add-brick operation
|
||||
- [#2626](https://github.com/gluster/glusterfs/issues/2626) entry self-heal does xattrops unnecessarily in many cases
|
||||
- [#2649](https://github.com/gluster/glusterfs/issues/2649) glustershd failed in bind with error "Address already in use"
|
||||
- [#2652](https://github.com/gluster/glusterfs/issues/2652) Removal of deadcode: Pump
|
||||
- [#2659](https://github.com/gluster/glusterfs/issues/2659) tests/basic/afr/afr-anon-inode.t crashed
|
||||
- [#2664](https://github.com/gluster/glusterfs/issues/2664) Test suite produce uncompressed logs
|
||||
- [#2693](https://github.com/gluster/glusterfs/issues/2693) dht: dht_local_wipe is crashed while running rename operation
|
||||
- [#2771](https://github.com/gluster/glusterfs/issues/2771) Smallfile improvement in glusterfs
|
||||
- [#2782](https://github.com/gluster/glusterfs/issues/2782) Glustereventsd does not listen on IPv4 when IPv6 is not available
|
||||
- [#2789](https://github.com/gluster/glusterfs/issues/2789) An improper locking bug(e.g., deadlock) on the lock up_inode_ctx->c
|
||||
- [#2798](https://github.com/gluster/glusterfs/issues/2798) FUSE mount option for localtime-logging is not exposed
|
||||
- [#2816](https://github.com/gluster/glusterfs/issues/2816) Glusterfsd memory leak when subdir_mounting a volume
|
||||
- [#2835](https://github.com/gluster/glusterfs/issues/2835) dht: found anomalies in dht_layout after commit c4cbdbcb3d02fb56a62
|
||||
- [#2857](https://github.com/gluster/glusterfs/issues/2857) variable twice initialization.
|
||||
|
||||
@@ -12,15 +12,18 @@ This is a bugfix and improvement release. The release notes for [10.0](10.0.md)

- Users are highly encouraged to upgrade to newer releases of GlusterFS.

## Important fixes in this release

- Fix missing stripe count issue with upgrade from 9.x to 10.x
- Fix IO failure when shrinking distributed dispersed volume with ongoing IO
- Fix log spam introduced with glusterfs 10.0
- Enable ltcmalloc_minimal instead of ltcmalloc

## Builds are available at -

[https://download.gluster.org/pub/gluster/glusterfs/10/10.1/](https://download.gluster.org/pub/gluster/glusterfs/10/10.1/)

## Bugs addressed

- [#2846](https://github.com/gluster/glusterfs/issues/2846) Avoid redundant logs in gluster
- [#2903](https://github.com/gluster/glusterfs/issues/2903) Fix worker disconnect due to AttributeError in geo-replication
- [#2910](https://github.com/gluster/glusterfs/issues/2910) Check for available ports in port_range in glusterd

@@ -3,22 +3,26 @@

This is a bugfix and improvement release. The release notes for [10.0](10.0.md) and [10.1](10.1.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 10 stable release.

**NOTE:**

- Next minor release tentative date: Week of 15th Aug, 2022
- Users are highly encouraged to upgrade to newer releases of GlusterFS.

## Important fixes in this release

- Optimize server functionality by enhancing server_process_event_upcall code path during the handling of upcall event
- Fix all bricks not starting issue on node reboot when brick count is high (>750)
- Fix stale posix locks that appear after client disconnection

## Builds are available at

[https://download.gluster.org/pub/gluster/glusterfs/10/10.2/](https://download.gluster.org/pub/gluster/glusterfs/10/10.2/)

## Bugs addressed

- [#3182](https://github.com/gluster/glusterfs/issues/3182) Fix stale posix locks that appear after client disconnection
- [#3187](https://github.com/gluster/glusterfs/issues/3187) Fix Locks xlator fd leaks
- [#3234](https://github.com/gluster/glusterfs/issues/3234) Fix incorrect directory check in order to successfully locate the SSL certificate
- [#3262](https://github.com/gluster/glusterfs/issues/3262) Synchronize layout_(ref|unref) during layout_(get|set) in dht
- [#3321](https://github.com/gluster/glusterfs/issues/3321) Optimize server functionality by enhancing server_process_event_upcall code path during the handling of upcall event
- [#3334](https://github.com/gluster/glusterfs/issues/3334) Fix errors and timeouts when creating qcow2 file via libgfapi
- [#3375](https://github.com/gluster/glusterfs/issues/3375) Fix all bricks not starting issue on node reboot when brick count is high (>750)

@@ -11,28 +11,30 @@ of bugs that has been addressed is included further below.

## Major changes and features

### Brick multiplexing

_Notes for users:_
Multiplexing reduces both port and memory usage. It does _not_ improve
performance vs. non-multiplexing except when memory is the limiting factor,
though there are other related changes that improve performance overall (e.g.
compared to 3.9).

Multiplexing is off by default. It can be enabled with

```bash
# gluster volume set all cluster.brick-multiplex on
```

_Limitations:_
There are currently no tuning options for multiplexing - it's all or nothing.
This will change in the near future.

_Known Issues:_
The only feature or combination of features known not to work with multiplexing
is USS and SSL. Anyone using that combination should leave multiplexing off.

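As a quick sanity check (a minimal sketch, assuming a working trusted storage pool and that global options can be read back with `gluster volume get all`), the option can be enabled and then confirmed:

```bash
# Enable brick multiplexing cluster-wide
gluster volume set all cluster.brick-multiplex on

# Read the option back to verify that it is now active
gluster volume get all cluster.brick-multiplex
```
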
### Support to display op-version information from clients

_Notes for users:_
To get information on what op-versions are supported by the clients, users can
invoke the `gluster volume status` command for clients. Along with information
on hostname, port, bytes read, bytes written and number of clients connected
@@ -43,12 +45,13 @@ operate. Following is the example usage:

```bash
# gluster volume status <VOLNAME|all> clients
```

_Limitations:_

_Known Issues:_

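For instance, the per-client op-version of every volume in the pool can be listed in one go; this simply instantiates the command above with `all`:

```bash
# Show connected clients, and the op-version each supports, for all volumes
gluster volume status all clients
```
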
### Support to get maximum op-version in a heterogeneous cluster

_Notes for users:_
A heterogeneous cluster operates on a common op-version that can be supported
across all the nodes in the trusted storage pool. Upon upgrade of the nodes in
the cluster, the cluster might support a higher op-version. Users can retrieve
@@ -60,12 +63,13 @@ the `gluster volume get` command on the newly introduced global option,

```bash
# gluster volume get all cluster.max-op-version
```

_Limitations:_

_Known Issues:_

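A minimal sketch of the typical post-upgrade flow, assuming all nodes have already been upgraded and that the pool's working op-version is controlled by the `cluster.op-version` option; the literal value 31000 below is only a placeholder:

```bash
# Query the highest op-version the pool can currently support
gluster volume get all cluster.max-op-version

# Bump the working op-version to that value (31000 is a placeholder, not a recommendation)
gluster volume set all cluster.op-version 31000
```
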
### Support for rebalance time to completion estimation

_Notes for users:_
Users can now see approximately how much time the rebalance
operation will take to complete across all nodes.

@@ -76,27 +80,27 @@ as part of the rebalance status. Use the command:

```bash
# gluster volume rebalance <VOLNAME> status
```

_Limitations:_
The rebalance process calculates the time left based on the rate
at which files are processed on the node and the total number of files
on the brick, which is determined using statfs. The limitations of this
are:

- A single fs partition must host only one brick. Multiple bricks on
  the same fs partition will cause the statfs results to be invalid.

- The estimates are dynamic and are recalculated every time the rebalance status
  command is invoked. The estimates become more accurate over time, so short running
  rebalance operations may not benefit.

_Known Issues:_
As glusterfs does not store the number of files on the brick, we use statfs to
guess the number. The .glusterfs directory contents can significantly skew this
number and affect the calculated estimates.

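As a usage sketch (assuming a distributed volume that has just been expanded), the estimate shows up in the ordinary rebalance status output:

```bash
# Kick off a rebalance after adding bricks, then poll its status;
# the estimated time to completion is printed as part of the status output.
gluster volume rebalance <VOLNAME> start
gluster volume rebalance <VOLNAME> status
```
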
### Separation of tier as its own service

_Notes for users:_
This change is to move the management of the tier daemon into the gluster
service framework, thereby improving its stability and manageability by the
service framework.
@@ -104,24 +108,26 @@ service framework.

This introduces no change to any of the tier commands or user facing interfaces
and operations.

_Limitations:_

_Known Issues:_

### Statedump support for gfapi based applications

_Notes for users:_
gfapi based applications can now dump state information for better
troubleshooting of issues. A statedump can be triggered in two ways:

1. by executing the following on one of the Gluster servers,

   ```bash
   # gluster volume statedump <VOLNAME> client <HOST>:<PID>
   ```

   - `<VOLNAME>` should be replaced by the name of the volume
   - `<HOST>` should be replaced by the hostname of the system running the
     gfapi application
   - `<PID>` should be replaced by the PID of the gfapi application

2. through calling `glfs_sysrq(<FS>, GLFS_SYSRQ_STATEDUMP)` within the
   application

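For example (a hypothetical invocation: the volume name `myvol`, host `nfs1` and PID `4242` are placeholders for a real gfapi client such as an NFS-Ganesha process):

```bash
# Trigger a statedump of the gfapi application with PID 4242 running on host nfs1
gluster volume statedump myvol client nfs1:4242

# The resulting dump files land in the usual statedump directory
ls /var/run/gluster/*.dump.*
```
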
@@ -131,7 +137,7 @@ shooting of issues. A statedump can be triggered in two ways:

All statedumps (`*.dump.*` files) will be located at the usual location,
on most distributions this would be `/var/run/gluster/`.

_Limitations:_
It is not possible to trigger statedumps from the Gluster CLI when the
gfapi application has lost its management connection to the GlusterD
servers.

@@ -141,24 +147,26 @@ GlusterFS 3.10 is the first release that contains support for the new
debugging will need to be adapted to call this function. At the time of
the release of 3.10, no applications are known to call `glfs_sysrq()`.

_Known Issues:_

### Disabled creation of trash directory by default

_Notes for users:_
From now onwards the trash directory, namely .trashcan, will not be created by
default upon creation of new volumes unless and until the feature is turned ON,
and the restrictions on the same will be applicable as long as features.trash
is set for a particular volume.

_Limitations:_
After an upgrade, the trash directory will still be present at the root of
pre-existing volumes. Those who are not interested in this feature may have to
manually delete the directory from the mount point.

_Known Issues:_

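A short illustration (a sketch, not a migration guide): turning the feature on for one volume re-enables the `.trashcan` semantics for that volume only:

```bash
# Enable the trash feature on a single volume; .trashcan then applies to it
gluster volume set <VOLNAME> features.trash on
```
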
### Implemented parallel readdirp with distribute xlator

_Notes for users:_
Currently the directory listing gets slower as the number of bricks/nodes
increases in a volume, though the file/directory numbers remain unchanged.
With this feature, the performance of directory listing is made mostly
@@ -167,28 +175,32 @@ exponentially reduce the directory listing performance. (On a 2, 5, 10, 25 brick
setup we saw ~5, 100, 400, 450% improvement respectively)

To enable this feature:

```bash
# gluster volume set <VOLNAME> performance.readdir-ahead on
# gluster volume set <VOLNAME> performance.parallel-readdir on
```

To disable this feature:

```bash
# gluster volume set <VOLNAME> performance.parallel-readdir off
```

If there are more than 50 bricks in the volume it is good to increase the cache
size to be more than 10MB (the default value):

```bash
# gluster volume set <VOLNAME> performance.rda-cache-limit <CACHE SIZE>
```

_Limitations:_

_Known Issues:_

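For instance (a hypothetical sizing, assuming the option accepts human-readable size suffixes), the cache limit could be raised on a volume with many bricks:

```bash
# Raise the readdir-ahead cache from the 10MB default; 50MB is an illustrative value only
gluster volume set <VOLNAME> performance.rda-cache-limit 50MB
```
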
### md-cache can optionally -ve cache security.ima xattr

_Notes for users:_
From kernel version 3.X or greater, creation of a file results in a removexattr
call on the security.ima xattr. This xattr is not set on the file unless the IMA
feature is active. With this patch, the removexattr call returns ENODATA if it is
@@ -197,18 +209,20 @@ not found in the cache.

The end benefit is faster create operations where IMA is not enabled.

To cache this xattr use,

```bash
# gluster volume set <VOLNAME> performance.cache-ima-xattrs on
```

The above option is on by default.

_Limitations:_

_Known Issues:_

### Added support for CPU extensions in disperse computations

_Notes for users:_
To improve disperse computations, a new way of generating dynamic code
targeting specific CPU extensions like SSE and AVX on Intel processors is
implemented. The available extensions are detected at run time. This can
@@ -226,18 +240,18 @@ command:

Valid <type> values are:

- none: Completely disable dynamic code generation
- auto: Automatically detect available extensions and use the best one
- x64: Use dynamic code generation using standard 64 bits instructions
- sse: Use dynamic code generation using SSE extensions (128 bits)
- avx: Use dynamic code generation using AVX extensions (256 bits)

The default value is 'auto'. If a value is specified that is not detected at
run time, it will automatically fall back to the next available option.

_Limitations:_

_Known Issues:_
To solve a conflict between the dynamic code generator and SELinux, it
has been necessary to create a dynamic file at runtime in the directory
/usr/libexec/glusterfs. This directory only exists if the server package

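As an illustrative sketch (assuming the `<type>` above is set through a volume option named `disperse.cpu-extensions`, which is not shown in this fragment), a specific target could be forced as follows:

```bash
# Force SSE code generation instead of auto-detection; 'auto' remains the default.
# The option name disperse.cpu-extensions is an assumption for this example.
gluster volume set <VOLNAME> disperse.cpu-extensions sse
```
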
@@ -271,20 +285,20 @@ Bugs addressed since release-3.9 are listed below.
|
||||
- [#1325531](https://bugzilla.redhat.com/1325531): Statedump: Add per xlator ref counting for inode
|
||||
- [#1325792](https://bugzilla.redhat.com/1325792): "gluster vol heal test statistics heal-count replica" seems doesn't work
|
||||
- [#1330604](https://bugzilla.redhat.com/1330604): out-of-tree builds generate XDR headers and source files in the original directory
|
||||
- [#1336371](https://bugzilla.redhat.com/1336371): Sequential volume start&stop is failing with SSL enabled setup.
|
||||
- [#1341948](https://bugzilla.redhat.com/1341948): DHT: Rebalance- Misleading log messages from __dht_check_free_space function
|
||||
- [#1336371](https://bugzilla.redhat.com/1336371): Sequential volume start&stop is failing with SSL enabled setup.
|
||||
- [#1341948](https://bugzilla.redhat.com/1341948): DHT: Rebalance- Misleading log messages from \_\_dht_check_free_space function
|
||||
- [#1344714](https://bugzilla.redhat.com/1344714): removal of file from nfs mount crashs ganesha server
|
||||
- [#1349385](https://bugzilla.redhat.com/1349385): [FEAT]jbr: Add rollbacking of failed fops
|
||||
- [#1355956](https://bugzilla.redhat.com/1355956): RFE : move ganesha related configuration into shared storage
|
||||
- [#1356076](https://bugzilla.redhat.com/1356076): DHT doesn't evenly balance files on FreeBSD with ZFS
|
||||
- [#1356960](https://bugzilla.redhat.com/1356960): OOM Kill on client when heal is in progress on 1*(2+1) arbiter volume
|
||||
- [#1356960](https://bugzilla.redhat.com/1356960): OOM Kill on client when heal is in progress on 1\*(2+1) arbiter volume
|
||||
- [#1357753](https://bugzilla.redhat.com/1357753): JSON output for all Events CLI commands
|
||||
- [#1357754](https://bugzilla.redhat.com/1357754): Delayed Events if any one Webhook is slow
|
||||
- [#1358296](https://bugzilla.redhat.com/1358296): tier: breaking down the monolith processing function tier_migrate_using_query_file()
|
||||
- [#1359612](https://bugzilla.redhat.com/1359612): [RFE] Geo-replication Logging Improvements
|
||||
- [#1360670](https://bugzilla.redhat.com/1360670): Add output option `--xml` to man page of gluster
|
||||
- [#1363595](https://bugzilla.redhat.com/1363595): Node remains in stopped state in pcs status with "/usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]" messages in logs.
|
||||
- [#1363965](https://bugzilla.redhat.com/1363965): geo-replication *changes.log does not respect the log-level configured
|
||||
- [#1363965](https://bugzilla.redhat.com/1363965): geo-replication \*changes.log does not respect the log-level configured
|
||||
- [#1364420](https://bugzilla.redhat.com/1364420): [RFE] History Crawl performance improvement
|
||||
- [#1365395](https://bugzilla.redhat.com/1365395): Support for rc.d and init for Service management
|
||||
- [#1365740](https://bugzilla.redhat.com/1365740): dht: Update stbuf from servers having layout
|
||||
@@ -298,7 +312,7 @@ Bugs addressed since release-3.9 are listed below.
|
||||
- [#1368138](https://bugzilla.redhat.com/1368138): Crash of glusterd when using long username with geo-replication
|
||||
- [#1368312](https://bugzilla.redhat.com/1368312): Value of `replica.split-brain-status' attribute of a directory in metadata split-brain in a dist-rep volume reads that it is not in split-brain
|
||||
- [#1368336](https://bugzilla.redhat.com/1368336): [RFE] Tier Events
|
||||
- [#1369077](https://bugzilla.redhat.com/1369077): The directories get renamed when data bricks are offline in 4*(2+1) volume
|
||||
- [#1369077](https://bugzilla.redhat.com/1369077): The directories get renamed when data bricks are offline in 4\*(2+1) volume
|
||||
- [#1369124](https://bugzilla.redhat.com/1369124): fix unused variable warnings from out-of-tree builds generate XDR headers and source files i...
|
||||
- [#1369397](https://bugzilla.redhat.com/1369397): segment fault in changelog_cleanup_dispatchers
|
||||
- [#1369403](https://bugzilla.redhat.com/1369403): [RFE]: events from protocol server
|
||||
@@ -366,14 +380,14 @@ Bugs addressed since release-3.9 are listed below.
|
||||
- [#1384142](https://bugzilla.redhat.com/1384142): crypt: changes needed for openssl-1.1 (coming in Fedora 26)
|
||||
- [#1384297](https://bugzilla.redhat.com/1384297): glusterfs can't self heal character dev file for invalid dev_t parameters
|
||||
- [#1384906](https://bugzilla.redhat.com/1384906): arbiter volume write performance is bad with sharding
|
||||
- [#1385104](https://bugzilla.redhat.com/1385104): invalid argument warning messages seen in fuse client logs 2016-09-30 06:34:58.938667] W [dict.c:418ict_set] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x58722) 0-dict: !this || !value for key=link-count [Invalid argument]
|
||||
- [#1385104](https://bugzilla.redhat.com/1385104): invalid argument warning messages seen in fuse client logs 2016-09-30 06:34:58.938667] W [dict.c:418ict_set] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x58722) 0-dict: !this || !value for key=link-count [Invalid argument]
|
||||
- [#1385575](https://bugzilla.redhat.com/1385575): pmap_signin event fails to update brickinfo->signed_in flag
|
||||
- [#1385593](https://bugzilla.redhat.com/1385593): Fix some spelling mistakes in comments and log messages
|
||||
- [#1385839](https://bugzilla.redhat.com/1385839): Incorrect volume type in the "glusterd_state" file generated using CLI "gluster get-state"
|
||||
- [#1385839](https://bugzilla.redhat.com/1385839): Incorrect volume type in the "glusterd_state" file generated using CLI "gluster get-state"
|
||||
- [#1386088](https://bugzilla.redhat.com/1386088): Memory Leaks in snapshot code path
|
||||
- [#1386097](https://bugzilla.redhat.com/1386097): 4 of 8 bricks (2 dht subvols) crashed on systemic setup
|
||||
- [#1386097](https://bugzilla.redhat.com/1386097): 4 of 8 bricks (2 dht subvols) crashed on systemic setup
|
||||
- [#1386123](https://bugzilla.redhat.com/1386123): geo-replica slave node goes faulty for non-root user session due to fail to locate gluster binary
|
||||
- [#1386141](https://bugzilla.redhat.com/1386141): Error and warning message getting while removing glusterfs-events package
|
||||
- [#1386141](https://bugzilla.redhat.com/1386141): Error and warning message getting while removing glusterfs-events package
|
||||
- [#1386188](https://bugzilla.redhat.com/1386188): Asynchronous Unsplit-brain still causes Input/Output Error on system calls
|
||||
- [#1386200](https://bugzilla.redhat.com/1386200): Log all published events
|
||||
- [#1386247](https://bugzilla.redhat.com/1386247): [Eventing]: 'gluster volume tier <volname> start force' does not generate a TIER_START event
|
||||
@@ -417,7 +431,7 @@ Bugs addressed since release-3.9 are listed below.
|
||||
- [#1395648](https://bugzilla.redhat.com/1395648): ganesha-ha.conf --status should validate if the VIPs are assigned to right nodes
|
||||
- [#1395660](https://bugzilla.redhat.com/1395660): Checkpoint completed event missing master node detail
|
||||
- [#1395687](https://bugzilla.redhat.com/1395687): Client side IObuff leaks at a high pace consumes complete client memory and hence making gluster volume inaccessible
|
||||
- [#1395993](https://bugzilla.redhat.com/1395993): heal info --xml when bricks are down in a systemic environment is not displaying anything even after more than 30minutes
|
||||
- [#1395993](https://bugzilla.redhat.com/1395993): heal info --xml when bricks are down in a systemic environment is not displaying anything even after more than 30minutes
|
||||
- [#1396038](https://bugzilla.redhat.com/1396038): refresh-config fails and crashes ganesha when mdcache is enabled on the volume.
|
||||
- [#1396048](https://bugzilla.redhat.com/1396048): A hard link is lost during rebalance+lookup
|
||||
- [#1396062](https://bugzilla.redhat.com/1396062): [geo-rep]: Worker crashes seen while renaming directories in loop
|
||||
@@ -447,11 +461,11 @@ Bugs addressed since release-3.9 are listed below.
|
||||
- [#1400013](https://bugzilla.redhat.com/1400013): [USS,SSL] .snaps directory is not reachable when I/O encryption (SSL) is enabled
|
||||
- [#1400026](https://bugzilla.redhat.com/1400026): Duplicate value assigned to GD_MSG_DAEMON_STATE_REQ_RCVD and GD_MSG_BRICK_CLEANUP_SUCCESS messages
|
||||
- [#1400237](https://bugzilla.redhat.com/1400237): Ganesha services are not stopped when pacemaker quorum is lost
|
||||
- [#1400613](https://bugzilla.redhat.com/1400613): [GANESHA] failed to create directory of hostname of new node in var/lib/nfs/ganesha/ in already existing cluster nodes
|
||||
- [#1400818](https://bugzilla.redhat.com/1400818): possible memory leak on client when writing to a file while another client issues a truncate
|
||||
- [#1400613](https://bugzilla.redhat.com/1400613): [GANESHA] failed to create directory of hostname of new node in var/lib/nfs/ganesha/ in already existing cluster nodes
|
||||
- [#1400818](https://bugzilla.redhat.com/1400818): possible memory leak on client when writing to a file while another client issues a truncate
|
||||
- [#1401095](https://bugzilla.redhat.com/1401095): log the error when locking the brick directory fails
|
||||
- [#1401218](https://bugzilla.redhat.com/1401218): Fix compound fops memory leaks
|
||||
- [#1401404](https://bugzilla.redhat.com/1401404): [Arbiter] IO's Halted and heal info command hung
|
||||
- [#1401404](https://bugzilla.redhat.com/1401404): [Arbiter] IO's Halted and heal info command hung
|
||||
- [#1401777](https://bugzilla.redhat.com/1401777): atime becomes zero when truncating file via ganesha (or gluster-NFS)
|
||||
- [#1401801](https://bugzilla.redhat.com/1401801): [RFE] Use Host UUID to find local nodes to spawn workers
|
||||
- [#1401812](https://bugzilla.redhat.com/1401812): RFE: Make readdirp parallel in dht
|
||||
@@ -463,7 +477,7 @@ Bugs addressed since release-3.9 are listed below.
|
||||
- [#1402369](https://bugzilla.redhat.com/1402369): Getting the warning message while erasing the gluster "glusterfs-server" package.
|
||||
- [#1402710](https://bugzilla.redhat.com/1402710): ls and move hung on disperse volume
|
||||
- [#1402730](https://bugzilla.redhat.com/1402730): self-heal not happening, as self-heal info lists the same pending shards to be healed
|
||||
- [#1402828](https://bugzilla.redhat.com/1402828): Snapshot: Snapshot create command fails when gluster-shared-storage volume is stopped
|
||||
- [#1402828](https://bugzilla.redhat.com/1402828): Snapshot: Snapshot create command fails when gluster-shared-storage volume is stopped
|
||||
- [#1402841](https://bugzilla.redhat.com/1402841): Files remain unhealed forever if shd is disabled and re-enabled while healing is in progress.
|
||||
- [#1403130](https://bugzilla.redhat.com/1403130): [GANESHA] Adding a node to cluster failed to allocate resource-agents to new node.
|
||||
- [#1403780](https://bugzilla.redhat.com/1403780): Incorrect incrementation of volinfo refcnt during volume start
|
||||
@@ -495,7 +509,7 @@ Bugs addressed since release-3.9 are listed below.
|
||||
- [#1408757](https://bugzilla.redhat.com/1408757): Fix failure of split-brain-favorite-child-policy.t in CentOS7
|
||||
- [#1408758](https://bugzilla.redhat.com/1408758): tests/bugs/glusterd/bug-913555.t fails spuriously
|
||||
- [#1409078](https://bugzilla.redhat.com/1409078): RFE: Need a command to check op-version compatibility of clients
|
||||
- [#1409186](https://bugzilla.redhat.com/1409186): Dict_t leak in dht_migration_complete_check_task and dht_rebalance_inprogress_task
|
||||
- [#1409186](https://bugzilla.redhat.com/1409186): Dict_t leak in dht_migration_complete_check_task and dht_rebalance_inprogress_task
|
||||
- [#1409202](https://bugzilla.redhat.com/1409202): Warning messages throwing when EC volume offline brick comes up are difficult to understand for end user.
|
||||
- [#1409206](https://bugzilla.redhat.com/1409206): Extra lookup/fstats are sent over the network when a brick is down.
|
||||
- [#1409727](https://bugzilla.redhat.com/1409727): [ganesha + EC]posix compliance rename tests failed on EC volume with nfs-ganesha mount.
|
||||
@@ -531,7 +545,7 @@ Bugs addressed since release-3.9 are listed below.
|
||||
- [#1417042](https://bugzilla.redhat.com/1417042): glusterd restart is starting the offline shd daemon on other node in the cluster
|
||||
- [#1417135](https://bugzilla.redhat.com/1417135): [Stress] : SHD Logs flooded with "Heal Failed" messages,filling up "/" quickly
|
||||
- [#1417521](https://bugzilla.redhat.com/1417521): [SNAPSHOT] With all USS plugin enable .snaps directory is not visible in cifs mount as well as windows mount
|
||||
- [#1417527](https://bugzilla.redhat.com/1417527): glusterfind: After glusterfind pre command execution all temporary files and directories /usr/var/lib/misc/glusterfsd/glusterfind/<session>/<volume>/ should be removed
|
||||
- [#1417527](https://bugzilla.redhat.com/1417527): glusterfind: After glusterfind pre command execution all temporary files and directories /usr/var/lib/misc/glusterfsd/glusterfind/<session>/<volume>/ should be removed
|
||||
- [#1417804](https://bugzilla.redhat.com/1417804): debug/trace: Print iatts of individual entries in readdirp callback for better debugging experience
|
||||
- [#1418091](https://bugzilla.redhat.com/1418091): [RFE] Support multiple bricks in one process (multiplexing)
|
||||
- [#1418536](https://bugzilla.redhat.com/1418536): Portmap allocates way too much memory (256KB) on stack
|
||||
@@ -555,11 +569,11 @@ Bugs addressed since release-3.9 are listed below.
|
||||
- [#1420987](https://bugzilla.redhat.com/1420987): warning messages seen in glusterd logs while setting the volume option
|
||||
- [#1420989](https://bugzilla.redhat.com/1420989): when server-quorum is enabled, volume get returns 0 value for server-quorum-ratio
|
||||
- [#1420991](https://bugzilla.redhat.com/1420991): Modified volume options not synced once offline nodes comes up.
|
||||
- [#1421017](https://bugzilla.redhat.com/1421017): CLI option "--timeout" is accepting non numeric and negative values.
|
||||
- [#1421017](https://bugzilla.redhat.com/1421017): CLI option "--timeout" is accepting non numeric and negative values.
|
||||
- [#1421956](https://bugzilla.redhat.com/1421956): Disperse: Fallback to pre-compiled code execution when dynamic code generation fails
|
||||
- [#1422350](https://bugzilla.redhat.com/1422350): glustershd process crashed on systemic setup
|
||||
- [#1422363](https://bugzilla.redhat.com/1422363): [Replicate] "RPC call decoding failed" leading to IO hang & mount inaccessible
|
||||
- [#1422391](https://bugzilla.redhat.com/1422391): Gluster NFS server crashing in __mnt3svc_umountall
|
||||
- [#1422391](https://bugzilla.redhat.com/1422391): Gluster NFS server crashing in \_\_mnt3svc_umountall
|
||||
- [#1422766](https://bugzilla.redhat.com/1422766): Entry heal messages in glustershd.log while no entries shown in heal info
|
||||
- [#1422777](https://bugzilla.redhat.com/1422777): DHT doesn't evenly balance files on FreeBSD with ZFS
|
||||
- [#1422819](https://bugzilla.redhat.com/1422819): [Geo-rep] Recreating geo-rep session with same slave after deleting with reset-sync-time fails to sync
|
||||
|
||||
@@ -6,17 +6,17 @@ bugs in the GlusterFS 3.10 stable release.

## Major changes, features and limitations addressed in this release

1. auth-allow setting was broken with 3.10 release and is now fixed (#1429117)

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

   - Sharded volumes are typically used for VM images; if such volumes are
     expanded or possibly contracted (i.e. add/remove bricks and rebalance)
     there are reports of VM images getting corrupted.
   - If you are using sharded volumes, DO NOT rebalance them till this is
     fixed
   - Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508)

## Bugs addressed

@@ -28,7 +28,7 @@ A total of 31 patches have been merged, addressing 26 bugs:
|
||||
- [#1426222](https://bugzilla.redhat.com/1426222): build: fixes to build 3.9.0rc2 on Debian (jessie)
|
||||
- [#1426323](https://bugzilla.redhat.com/1426323): common-ha: no need to remove nodes one-by-one in teardown
|
||||
- [#1426329](https://bugzilla.redhat.com/1426329): [Ganesha] : Add comment to Ganesha HA config file ,about cluster name's length limitation
|
||||
- [#1427387](https://bugzilla.redhat.com/1427387): systemic testing: seeing lot of ping time outs which would lead to splitbrains
|
||||
- [#1427387](https://bugzilla.redhat.com/1427387): systemic testing: seeing lot of ping time outs which would lead to splitbrains
|
||||
- [#1427399](https://bugzilla.redhat.com/1427399): [RFE] capture portmap details in glusterd's statedump
|
||||
- [#1427461](https://bugzilla.redhat.com/1427461): Bricks take up new ports upon volume restart after add-brick op with brick mux enabled
|
||||
- [#1428670](https://bugzilla.redhat.com/1428670): Disconnects in nfs mount leads to IO hang and mount inaccessible
|
||||
@@ -36,7 +36,7 @@ A total of 31 patches have been merged, addressing 26 bugs:
|
||||
- [#1429117](https://bugzilla.redhat.com/1429117): auth failure after upgrade to GlusterFS 3.10
|
||||
- [#1429402](https://bugzilla.redhat.com/1429402): Restore atime/mtime for symlinks and other non-regular files.
|
||||
- [#1429773](https://bugzilla.redhat.com/1429773): disallow increasing replica count for arbiter volumes
|
||||
- [#1430512](https://bugzilla.redhat.com/1430512): /libgfxdr.so.0.0.1: undefined symbol: __gf_free
|
||||
- [#1430512](https://bugzilla.redhat.com/1430512): /libgfxdr.so.0.0.1: undefined symbol: \_\_gf_free
|
||||
- [#1430844](https://bugzilla.redhat.com/1430844): build/packaging: Debian and Ubuntu don't have /usr/libexec/; results in bad packages
|
||||
- [#1431175](https://bugzilla.redhat.com/1431175): volume start command hangs
|
||||
- [#1431176](https://bugzilla.redhat.com/1431176): USS is broken when multiplexing is on
|
||||
|
||||
@@ -6,6 +6,7 @@ the new features that were added and bugs fixed in the GlusterFS
3.10 stable release.

## Major changes, features and limitations addressed in this release

**No Major changes**

## Major issues
@@ -13,10 +14,9 @@ the new features that were added and bugs fixed in the GlusterFS

1. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

## Bugs addressed

Bugs addressed since release-3.10.9 are listed below.

- [#1498081](https://bugzilla.redhat.com/1498081): dht_(f)xattrop does not implement migration checks
- [#1534848](https://bugzilla.redhat.com/1534848): entries not getting cleared post healing of softlinks (stale entries showing up in heal info)

@@ -6,6 +6,7 @@ the new features that were added and bugs fixed in the GlusterFS
3.10 stable release.

## Major changes, features and limitations addressed in this release

**No Major changes**

## Major issues
@@ -13,13 +14,12 @@ the new features that were added and bugs fixed in the GlusterFS

1. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

## Bugs addressed

Bugs addressed since release-3.10.10 are listed below.

- [#1486542](https://bugzilla.redhat.com/1486542): "ganesha.so cannot open" warning message in glusterd log in non ganesha setup.
- [#1544461](https://bugzilla.redhat.com/1544461): 3.8 -> 3.10 rolling upgrade fails (same for 3.12 or 3.13) on Ubuntu 14
- [#1544787](https://bugzilla.redhat.com/1544787): tests/bugs/cli/bug-1169302.t fails spuriously
- [#1546912](https://bugzilla.redhat.com/1546912): tests/bugs/posix/bug-990028.t fails in release-3.10 branch
- [#1549482](https://bugzilla.redhat.com/1549482): Quota: After deleting directory from mount point on which quota was configured, quota list command output is blank

@@ -8,6 +8,7 @@ GlusterFS 3.10 stable release.

## Major changes, features and limitations addressed in this release

This release contains a fix for a security vulnerability in Gluster as follows,

- http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-1088
- https://nvd.nist.gov/vuln/detail/CVE-2018-1088

@@ -24,7 +25,6 @@ See, this [guide](https://docs.gluster.org/en/v3/Administrator%20Guide/SSL/) for

1. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

## Bugs addressed

Bugs addressed since release-3.10.11 are listed below.

@@ -6,18 +6,19 @@ contains a listing of all the new features that were added and
bugs in the GlusterFS 3.10 stable release.

## Major changes, features and limitations addressed in this release

1. Many brick multiplexing and nfs-ganesha+ha bugs have been addressed.
2. Rebalance and remove brick operations have been disabled for sharded volumes
   to prevent data corruption.

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

   - Sharded volumes are typically used for VM images; if such volumes are
     expanded or possibly contracted (i.e. add/remove bricks and rebalance)
     there are reports of VM images getting corrupted.
   - Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508)

## Bugs addressed

@@ -40,12 +41,12 @@ A total of 63 patches have been merged, addressing 46 bugs:
|
||||
- [#1443349](https://bugzilla.redhat.com/1443349): [Eventing]: Unrelated error message displayed when path specified during a 'webhook-test/add' is missing a schema
|
||||
- [#1441576](https://bugzilla.redhat.com/1441576): [geo-rep]: rsync should not try to sync internal xattrs
|
||||
- [#1441927](https://bugzilla.redhat.com/1441927): [geo-rep]: Worker crashes with [Errno 16] Device or resource busy: '.gfid/00000000-0000-0000-0000-000000000001/dir.166 while renaming directories
|
||||
- [#1401877](https://bugzilla.redhat.com/1401877): [GANESHA] Symlinks from /etc/ganesha/ganesha.conf to shared\_storage are created on the non-ganesha nodes in 8 node gluster having 4 node ganesha cluster
|
||||
- [#1425723](https://bugzilla.redhat.com/1425723): nfs-ganesha volume export file remains stale in shared\_storage\_volume when volume is deleted
|
||||
- [#1401877](https://bugzilla.redhat.com/1401877): [GANESHA] Symlinks from /etc/ganesha/ganesha.conf to shared_storage are created on the non-ganesha nodes in 8 node gluster having 4 node ganesha cluster
|
||||
- [#1425723](https://bugzilla.redhat.com/1425723): nfs-ganesha volume export file remains stale in shared_storage_volume when volume is deleted
|
||||
- [#1427759](https://bugzilla.redhat.com/1427759): nfs-ganesha: Incorrect error message returned when disable fails
|
||||
- [#1438325](https://bugzilla.redhat.com/1438325): Need to improve remove-brick failure message when the brick process is down.
|
||||
- [#1438338](https://bugzilla.redhat.com/1438338): glusterd is setting replicate volume property over disperse volume or vice versa
|
||||
- [#1438340](https://bugzilla.redhat.com/1438340): glusterd is not validating for allowed values while setting "cluster.brick-multiplex" property
|
||||
- [#1438340](https://bugzilla.redhat.com/1438340): glusterd is not validating for allowed values while setting "cluster.brick-multiplex" property
|
||||
- [#1441476](https://bugzilla.redhat.com/1441476): Glusterd crashes when restarted with many volumes
|
||||
- [#1444128](https://bugzilla.redhat.com/1444128): [BrickMultiplex] gluster command not responding and .snaps directory is not visible after executing snapshot related command
|
||||
- [#1445260](https://bugzilla.redhat.com/1445260): [GANESHA] Volume start and stop having ganesha enable on it,turns off cache-invalidation on volume
|
||||
@@ -54,10 +55,10 @@ A total of 63 patches have been merged, addressing 46 bugs:
|
||||
- [#1435779](https://bugzilla.redhat.com/1435779): Inode ref leak on anonymous reads and writes
|
||||
- [#1440278](https://bugzilla.redhat.com/1440278): [GSS] NFS Sub-directory mount not working on solaris10 client
|
||||
- [#1450378](https://bugzilla.redhat.com/1450378): GNFS crashed while taking lock on a file from 2 different clients having same volume mounted from 2 different servers
|
||||
- [#1449779](https://bugzilla.redhat.com/1449779): quota: limit-usage command failed with error " Failed to start aux mount"
|
||||
- [#1449779](https://bugzilla.redhat.com/1449779): quota: limit-usage command failed with error " Failed to start aux mount"
|
||||
- [#1450564](https://bugzilla.redhat.com/1450564): glfsheal: crashed(segfault) with disperse volume in RDMA
|
||||
- [#1443501](https://bugzilla.redhat.com/1443501): Don't wind post-op on a brick where the fop phase failed.
|
||||
- [#1444892](https://bugzilla.redhat.com/1444892): When either killing or restarting a brick with performance.stat-prefetch on, stat sometimes returns a bad st\_size value.
|
||||
- [#1444892](https://bugzilla.redhat.com/1444892): When either killing or restarting a brick with performance.stat-prefetch on, stat sometimes returns a bad st_size value.
|
||||
- [#1449169](https://bugzilla.redhat.com/1449169): Multiple bricks WILL crash after TCP port probing
|
||||
- [#1440805](https://bugzilla.redhat.com/1440805): Update rfc.sh to check Change-Id consistency for backports
|
||||
- [#1443010](https://bugzilla.redhat.com/1443010): snapshot: snapshots appear to be failing with respect to secure geo-rep slave
|
||||
@@ -65,8 +66,7 @@ A total of 63 patches have been merged, addressing 46 bugs:
|
||||
- [#1444773](https://bugzilla.redhat.com/1444773): explicitly specify executor to be bash for tests
|
||||
- [#1445407](https://bugzilla.redhat.com/1445407): remove bug-1421590-brick-mux-reuse-ports.t
|
||||
- [#1440742](https://bugzilla.redhat.com/1440742): Test files clean up for tier during 3.10
|
||||
- [#1448790](https://bugzilla.redhat.com/1448790): [Tiering]: High and low watermark values when set to the same level, is allowed
|
||||
- [#1448790](https://bugzilla.redhat.com/1448790): [Tiering]: High and low watermark values when set to the same level, is allowed
|
||||
- [#1435942](https://bugzilla.redhat.com/1435942): Enabling parallel-readdir causes dht linkto files to be visible on the mount,
|
||||
- [#1437763](https://bugzilla.redhat.com/1437763): File-level WORM allows ftruncate() on read-only files
|
||||
- [#1439148](https://bugzilla.redhat.com/1439148): Parallel readdir on Gluster NFS displays less number of dentries
|
||||
|
||||
|
||||
@@ -6,18 +6,20 @@ contain a listing of all the new features that were added and
bugs in the GlusterFS 3.10 stable release.

## Major changes, features and limitations addressed in this release

1. No Major changes

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

   - Sharded volumes are typically used for VM images; if such volumes are
     expanded or possibly contracted (i.e. add/remove bricks and rebalance)
     there are reports of VM images getting corrupted.
   - Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508)

2. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

## Bugs addressed

@@ -27,13 +29,12 @@ A total of 18 patches have been merged, addressing 13 bugs:

- [#1450773](https://bugzilla.redhat.com/1450773): Quota: After upgrade from 3.7 to higher version , gluster quota list command shows "No quota configured on volume repvol"
- [#1450934](https://bugzilla.redhat.com/1450934): [New] - Replacing an arbiter brick while I/O happens causes vm pause
- [#1450947](https://bugzilla.redhat.com/1450947): Autoconf leaves unexpanded variables in path names of non-shell-scripttext files
- [#1451371](https://bugzilla.redhat.com/1451371): crash in dht_rmdir_do
- [#1451561](https://bugzilla.redhat.com/1451561): AFR returns the node uuid of the same node for every file in the replica
- [#1451587](https://bugzilla.redhat.com/1451587): cli xml status of detach tier broken
- [#1451977](https://bugzilla.redhat.com/1451977): Add logs to identify whether disconnects are voluntary or due to network problems
- [#1451995](https://bugzilla.redhat.com/1451995): Log message shows error code as success even when rpc fails to connect
- [#1453056](https://bugzilla.redhat.com/1453056): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions
- [#1453087](https://bugzilla.redhat.com/1453087): Brick Multiplexing: On reboot of a node Brick multiplexing feature lost on that node as multiple brick processes get spawned
- [#1456682](https://bugzilla.redhat.com/1456682): tierd listens to a port.
- [#1457054](https://bugzilla.redhat.com/1457054): glusterfs client crash on io-cache.so(\_\_ioc_page_wakeup+0x44)
@@ -6,26 +6,28 @@ contain a listing of all the new features that were added and
bugs fixed in the GlusterFS 3.10 stable release.

## Major changes, features and limitations addressed in this release

1. No Major changes

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance)
      there are reports of VM images getting corrupted.
    - Status of this bug can be tracked here, [#1426508](https://bugzilla.redhat.com/1426508)

2. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

3. Another rebalance related bug is being worked upon [#1467010](https://bugzilla.redhat.com/1467010)

## Bugs addressed

A total of 18 patches have been merged, addressing 13 bugs:

- [#1457732](https://bugzilla.redhat.com/1457732): "split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf
- [#1459760](https://bugzilla.redhat.com/1459760): Glusterd segmentation fault in ' \_Unwind_Backtrace' while running peer probe
- [#1460649](https://bugzilla.redhat.com/1460649): posix-acl: Whitelist virtual ACL xattrs
- [#1460914](https://bugzilla.redhat.com/1460914): Rebalance estimate time sometimes shows negative values
- [#1460993](https://bugzilla.redhat.com/1460993): Revert CLI restrictions on running rebalance in VM store use case
@@ -6,19 +6,22 @@ contain a listing of all the new features that were added and
bugs fixed in the GlusterFS 3.10 stable release.

## Major changes, features and limitations addressed in this release

**No Major changes**

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance)
      there are reports of VM images getting corrupted.
    - The last known cause for corruption [#1467010](https://bugzilla.redhat.com/show_bug.cgi?id=1467010)
      has a fix with this release. As further testing is still in progress, the issue
      is retained as a major issue.

2. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

## Bugs addressed

@@ -46,4 +49,4 @@ Bugs addressed since release-3.10.4 are listed below.

- [#1476212](https://bugzilla.redhat.com/1476212): [geo-rep]: few of the self healed hardlinks on master did not sync to slave
- [#1478498](https://bugzilla.redhat.com/1478498): scripts: invalid test in S32gluster_enable_shared_storage.sh
- [#1478499](https://bugzilla.redhat.com/1478499): packaging: /var/lib/glusterd/options should be %config(noreplace)
- [#1480594](https://bugzilla.redhat.com/1480594): nfs process crashed in "nfs3_getattr"
@@ -6,18 +6,21 @@ contain a listing of all the new features that were added and
bugs fixed in the GlusterFS 3.10 stable release.

## Major changes, features and limitations addressed in this release

**No Major changes**

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance)
      there are reports of VM images getting corrupted.
    - The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081)
      is still pending, and not yet a part of this release.

2. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

## Bugs addressed

@@ -28,7 +31,7 @@ Bugs addressed since release-3.10.5 are listed below.

- [#1482857](https://bugzilla.redhat.com/1482857): glusterd fails to start
- [#1483997](https://bugzilla.redhat.com/1483997): packaging: use rdma-core(-devel) instead of ibverbs, rdmacm; disable rdma on armv7hl
- [#1484443](https://bugzilla.redhat.com/1484443): packaging: /run and /var/run; prefer /run
- [#1486542](https://bugzilla.redhat.com/1486542): "ganesha.so cannot open" warning message in glusterd log in non ganesha setup.
- [#1487042](https://bugzilla.redhat.com/1487042): AFR returns the node uuid of the same node for every file in the replica
- [#1487647](https://bugzilla.redhat.com/1487647): with AFR now making both nodes to return UUID for a file will result in georep consuming more resources
- [#1488391](https://bugzilla.redhat.com/1488391): gluster-blockd process crashed and core generated

@@ -38,7 +41,7 @@ Bugs addressed since release-3.10.5 are listed below.

- [#1491691](https://bugzilla.redhat.com/1491691): rpc: TLSv1_2_method() is deprecated in OpenSSL-1.1
- [#1491966](https://bugzilla.redhat.com/1491966): AFR entry self heal removes a directory's .glusterfs symlink.
- [#1491985](https://bugzilla.redhat.com/1491985): Add NULL gfid checks before creating file
- [#1491995](https://bugzilla.redhat.com/1491995): afr: check op_ret value in \_\_afr_selfheal_name_impunge
- [#1492010](https://bugzilla.redhat.com/1492010): Launch metadata heal in discover code path.
- [#1495430](https://bugzilla.redhat.com/1495430): Make event-history feature configurable and have it disabled by default
- [#1496321](https://bugzilla.redhat.com/1496321): [afr] split-brain observed on T files post hardlink and rename in x3 volume
@@ -6,18 +6,21 @@ contain a listing of all the new features that were added and
bugs fixed in the GlusterFS 3.10 stable release.

## Major changes, features and limitations addressed in this release

**No Major changes**

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance)
      there are reports of VM images getting corrupted.
    - The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081)
      is still pending, and not yet a part of this release.

2. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

## Bugs addressed
@@ -6,18 +6,21 @@ contain a listing of all the new features that were added and
bugs fixed in the GlusterFS 3.10 stable release.

## Major changes, features and limitations addressed in this release

**No Major changes**

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance)
      there are reports of VM images getting corrupted.
    - The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081)
      is still pending, and not yet a part of this release.

2. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

## Bugs addressed
@@ -6,18 +6,21 @@ the new features that were added and bugs fixed in the GlusterFS
3.10 stable release.

## Major changes, features and limitations addressed in this release

**No Major changes**

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance)
      there are reports of VM images getting corrupted.
    - The last known cause for corruption [#1498081](https://bugzilla.redhat.com/show_bug.cgi?id=1498081)
      is still pending, and not yet a part of this release.

2. Brick multiplexing is being tested and fixed aggressively but we still have a
   few crashes and memory leaks to fix.

## Bugs addressed
@@ -11,6 +11,7 @@ of bugs that have been addressed is included further below.

## Major changes and features

### Switched to storhaug for ganesha and samba high availability

**Notes for users:**

High Availability (HA) support for NFS-Ganesha (NFS) and Samba (SMB)

@@ -26,6 +27,7 @@ There are many to choose from in most popular Linux distributions.
Choose the one that best fits your environment and use it.

### Added SELinux support for Gluster Volumes

**Notes for users:**

A new xlator has been introduced (`features/selinux`) to allow setting the

@@ -40,17 +42,20 @@ This feature is intended to be the base for implementing Labelled-NFS in
NFS-Ganesha and SELinux support for FUSE mounts in the Linux kernel.

**Limitations:**

- The Linux kernel does not yet support mounting of FUSE filesystems with
  SELinux support.
- NFS-Ganesha does not support Labelled-NFS yet.

**Known Issues:**

- There has been limited testing, because other projects cannot consume the
  functionality yet without it being part of a release. So far, no problems have
  been observed, but this might change when other projects start to seriously
  use this.
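No volume-level switch is shown in this excerpt; as a rough sketch only, assuming the xlator is exposed through a volume option named `features.selinux` (an assumption, not confirmed here), enabling it could look like:

```bash
# Hypothetical: toggle the SELinux xlator on a volume named "myvol"
# (option name assumed; verify with `gluster volume set help` first)
gluster volume set myvol features.selinux on
```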
### Several memory leaks are fixed in gfapi during graph switches

**Notes for users:**

Gluster API (or gfapi) has had a few memory leak issues arising specifically

@@ -59,9 +64,11 @@ addressed in this release, and more work towards ironing out the pending leaks
are in the works across the next few releases.

**Limitations:**

- There are still a few leaks to be addressed when graph switches occur

### get-state CLI is enhanced to provide client and brick capacity related information

**Notes for users:**

The get-state CLI output now optionally accommodates client related information

@@ -80,11 +87,13 @@ bricks as obtained from `gluster volume status <volname>|all detail` has also
been added to the get-state output.

**Limitations:**

- Information for non-local bricks and clients connected to non-local bricks
  won't be available. This is a known limitation of the get-state command, since
  the get-state command doesn't provide information on non-local bricks.
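For illustration, a hedged sketch of collecting this state on a node (the `odir`/`file` arguments are assumptions based on common get-state usage, not taken from this excerpt):

```bash
# Dump glusterd state for the local node into a chosen directory/file
gluster get-state glusterd odir /var/run/gluster/ file gluster-state.txt
```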
### Ability to serve negative lookups from cache has been added

**Notes for users:**

Before creating / renaming any file, lookups (around 5-6 when using the SMB

@@ -99,10 +108,13 @@ Execute the following commands to enable negative-lookup cache:

```bash
# gluster volume set <volname> features.cache-invalidation-timeout 600
# gluster volume set <volname> nl-cache on
```

**Limitations**

- This feature is supported only for SMB access, for this release
### New xlator to help developers detecting resource leaks has been added

**Notes for users:**

This is intended as a developer feature, and hence there is no direct user

@@ -114,6 +126,7 @@ gfapi and any xlator in between the API and the sink xlator.
More details can be found in [this](http://lists.gluster.org/pipermail/gluster-devel/2017-April/052618.html) thread on the gluster-devel lists
### Feature for metadata-caching/small file performance is production ready

**Notes for users:**

Over the course of releases several fixes and enhancements have been made to

@@ -132,15 +145,18 @@ SMB access, by enabling metadata caching:

- Renaming files

To enable metadata caching execute the following commands:

```bash
# gluster volume set <volname> group metadata-cache
# gluster volume set <volname> network.inode-lru-limit <n>
```

`<n>` is set to 50000 by default. It should be increased if the number of
concurrently accessed files in the volume is very high. Increasing this number
increases the memory footprint of the brick processes.
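A quick usage sketch of the commands above (the volume name `myvol` and the raised limit are illustrative only):

```bash
# Apply the metadata-cache group profile, then raise the inode LRU limit
# for a workload with many concurrently accessed files
gluster volume set myvol group metadata-cache
gluster volume set myvol network.inode-lru-limit 200000
```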
### "Parallel Readdir" feature introduced in 3.10.0 is production ready

**Notes for users:**

This feature was introduced in 3.10 and was experimental in nature. Over the

@@ -150,6 +166,7 @@ stabilized and is ready for use in production environments.
For further details refer: [3.10.0 release notes](https://github.com/gluster/glusterfs/blob/release-3.10/doc/release-notes/3.10.0.md)
### Object versioning is enabled only if bitrot is enabled

**Notes for users:**

Object versioning was turned on by default on brick processes by the bitrot

@@ -161,6 +178,7 @@ To fix this, object versioning is disabled by default, and is only enabled as
a part of enabling the bitrot option.
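For reference, a minimal sketch of the behaviour described above, assuming a volume named `myvol` (double-check the bitrot CLI on your version):

```bash
# Enabling bitrot detection on a volume; with this release, object
# versioning is switched on only as part of this step
gluster volume bitrot myvol enable
```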
### Distribute layer provides more robust transactions during directory namespace operations

**Notes for users:**

Distribute layer in Gluster, creates and maintains directories in all subvolumes

@@ -173,6 +191,7 @@ ensuring better consistency of the file system as a whole, when dealing with
racing operations, operating on the same directory object.
### gfapi extended readdirplus API has been added

**Notes for users:**

An extended readdirplus API `glfs_xreaddirplus` is added to get extra

@@ -184,10 +203,12 @@ involving directory listing.
The API syntax and usage can be found in [`glfs.h`](https://github.com/gluster/glusterfs/blob/v3.11.0rc1/api/src/glfs.h#L810) header file.

**Limitations:**

- This API currently has support to only return stat and handles (`glfs_object`)
  for each dirent of the directory, but can be extended in the future.
### Improved adoption of standard refcounting functions across the code

**Notes for users:**

This change does not impact users, it is an internal code cleanup activity

@@ -195,10 +216,12 @@ that ensures that we ref count in a standard manner, thus avoiding unwanted
bugs due to different implementations of the same.

**Known Issues:**

- This standardization started with this release and is expected to continue
  across releases.
### Performance improvements to rebalance have been made

**Notes for users:**

Both crawling and migration improvements have been made in rebalance. The crawler

@@ -209,7 +232,7 @@ both the nodes divide the load among each other giving boost to migration
performance. There have also been some optimizations to avoid redundant
network operations (or RPC calls) in the process of migrating a file.

Further, file migration now avoids the syncop framework and is managed entirely by
rebalance threads, giving a performance boost.

Also, there is a change to throttle settings in rebalance. Earlier, the user could

@@ -220,21 +243,23 @@ of threads rebalance process will work with, thereby translating to the number
of files being migrated in parallel.
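A hedged sketch of the new throttle behaviour (the option name `cluster.rebal-throttle` and the value are assumptions based on this description, not spelled out in the excerpt):

```bash
# Request a specific number of rebalance threads instead of only
# lazy/normal/aggressive (value is illustrative)
gluster volume set myvol cluster.rebal-throttle 5
```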
### Halo Replication feature in AFR has been introduced

**Notes for users:**

Halo Geo-replication is a feature which allows Gluster or NFS clients to write
locally to their region (as defined by a latency "halo" or threshold if you
like), and have their writes asynchronously propagate from their origin to the
rest of the cluster. Clients can also write synchronously to the cluster
simply by specifying a halo-latency which is very large (e.g. 10 seconds) which
will include all bricks.

To enable the halo feature execute the following command:

```bash
# gluster volume set <volname> cluster.halo-enabled yes
```

You may have to set the following options to change defaults.

`cluster.halo-shd-latency`: The threshold below which self-heal daemons will
consider children (bricks) connected.

`cluster.halo-nfsd-latency`: The threshold below which NFS daemons will consider

@@ -249,12 +274,14 @@ If the number of children falls below this threshold the next
best (chosen by latency) shall be swapped in.
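A minimal sketch of tuning the thresholds named above (volume name and millisecond values are illustrative; defaults are not given in this excerpt):

```bash
# Consider bricks within these latencies part of the halo for the
# self-heal daemon and the NFS daemon respectively
gluster volume set myvol cluster.halo-shd-latency 15
gluster volume set myvol cluster.halo-nfsd-latency 10
```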
### FALLOCATE support with EC

**Notes for users**

Support for the FALLOCATE file operation on EC volumes is added with this release.
EC volumes can now support basic FALLOCATE functionality.
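As an illustration of what this enables, preallocating space for a file on a FUSE-mounted disperse (EC) volume (mount point and file name are placeholders):

```bash
# Preallocate 1 GiB for a VM image on a mounted EC volume
fallocate -l 1G /mnt/ec-volume/vm-image.raw
```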
### Self-heal window-size control option for EC

**Notes for users**

Support to control the maximum size of read/write operation carried out

@@ -262,14 +289,16 @@ during self-heal process has been added with this release. User has to tune
'disperse.self-heal-window-size' option on disperse volume to adjust the size.
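A minimal sketch of tuning it (the value is illustrative; the default and valid range are not stated in this excerpt):

```bash
# Increase the amount of data processed per self-heal read/write cycle
# on a disperse volume named "myvol"
gluster volume set myvol disperse.self-heal-window-size 4
```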
## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance) there
      are reports of VM images getting corrupted.
    - Status of this bug can be tracked here, #1426508
    - Latest series of fixes for the issue (which are present in this release as
      well) are not showing the previous corruption, and hence the fixes look
      good, but this is maintained on the watch list nevertheless.

## Bugs addressed
||||
- [#1328342](https://bugzilla.redhat.com/1328342): [tiering]: gluster v reset of watermark levels can allow low watermark level to have a higher value than hi watermark level
|
||||
- [#1353952](https://bugzilla.redhat.com/1353952): [geo-rep]: rsync should not try to sync internal xattrs
|
||||
- [#1356076](https://bugzilla.redhat.com/1356076): DHT doesn't evenly balance files on FreeBSD with ZFS
|
||||
- [#1359599](https://bugzilla.redhat.com/1359599): BitRot :- bit-rot.signature and bit-rot.version xattr should not be set if bitrot is not enabled on volume
|
||||
- [#1359599](https://bugzilla.redhat.com/1359599): BitRot :- bit-rot.signature and bit-rot.version xattr should not be set if bitrot is not enabled on volume
|
||||
- [#1369393](https://bugzilla.redhat.com/1369393): dead loop in changelog_rpc_server_destroy
|
||||
- [#1383893](https://bugzilla.redhat.com/1383893): glusterd restart is starting the offline shd daemon on other node in the cluster
|
||||
- [#1384989](https://bugzilla.redhat.com/1384989): libglusterfs : update correct memory segments in glfs-message-id
|
||||
@@ -304,7 +333,7 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1399593](https://bugzilla.redhat.com/1399593): Obvious typo in cleanup code in rpc_clnt_notify
|
||||
- [#1401571](https://bugzilla.redhat.com/1401571): bitrot quarantine dir misspelled
|
||||
- [#1401812](https://bugzilla.redhat.com/1401812): RFE: Make readdirp parallel in dht
|
||||
- [#1401877](https://bugzilla.redhat.com/1401877): [GANESHA] Symlinks from /etc/ganesha/ganesha.conf to shared_storage are created on the non-ganesha nodes in 8 node gluster having 4 node ganesha cluster
|
||||
- [#1401877](https://bugzilla.redhat.com/1401877): [GANESHA] Symlinks from /etc/ganesha/ganesha.conf to shared_storage are created on the non-ganesha nodes in 8 node gluster having 4 node ganesha cluster
|
||||
- [#1402254](https://bugzilla.redhat.com/1402254): compile warning unused variable
|
||||
- [#1402661](https://bugzilla.redhat.com/1402661): Samba crash when mounting a distributed dispersed volume over CIFS
|
||||
- [#1404424](https://bugzilla.redhat.com/1404424): The data-self-heal option is not honored in AFR
|
||||
@@ -317,10 +346,10 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1411334](https://bugzilla.redhat.com/1411334): Improve output of "gluster volume status detail"
|
||||
- [#1412135](https://bugzilla.redhat.com/1412135): rename of the same file from multiple clients with caching enabled may result in duplicate files
|
||||
- [#1412549](https://bugzilla.redhat.com/1412549): EXPECT_WITHIN is taking too much time even if the result matches with expected value
|
||||
- [#1413526](https://bugzilla.redhat.com/1413526): glusterfind: After glusterfind pre command execution all temporary files and directories /usr/var/lib/misc/glusterfsd/glusterfind/<session>/<volume>/ should be removed
|
||||
- [#1413526](https://bugzilla.redhat.com/1413526): glusterfind: After glusterfind pre command execution all temporary files and directories /usr/var/lib/misc/glusterfsd/glusterfind/<session>/<volume>/ should be removed
|
||||
- [#1413971](https://bugzilla.redhat.com/1413971): Bonnie test suite failed with "Can't open file" error
|
||||
- [#1414287](https://bugzilla.redhat.com/1414287): repeated operation failed warnings in gluster mount logs with disperse volume
|
||||
- [#1414346](https://bugzilla.redhat.com/1414346): Quota: After upgrade from 3.7 to higher version , gluster quota list command shows "No quota configured on volume repvol"
|
||||
- [#1414346](https://bugzilla.redhat.com/1414346): Quota: After upgrade from 3.7 to higher version , gluster quota list command shows "No quota configured on volume repvol"
|
||||
- [#1414645](https://bugzilla.redhat.com/1414645): Typo in glusterfs code comments
|
||||
- [#1414782](https://bugzilla.redhat.com/1414782): Add logs to selfheal code path to be helpful for debug
|
||||
- [#1414902](https://bugzilla.redhat.com/1414902): packaging: python/python2(/python3) cleanup
|
||||
@@ -341,7 +370,7 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1418095](https://bugzilla.redhat.com/1418095): Portmap allocates way too much memory (256KB) on stack
|
||||
- [#1418213](https://bugzilla.redhat.com/1418213): [Ganesha+SSL] : Bonnie++ hangs during rewrites.
|
||||
- [#1418249](https://bugzilla.redhat.com/1418249): [RFE] Need to have group cli option to set all md-cache options using a single command
|
||||
- [#1418259](https://bugzilla.redhat.com/1418259): Quota: After deleting directory from mount point on which quota was configured, quota list command output is blank
|
||||
- [#1418259](https://bugzilla.redhat.com/1418259): Quota: After deleting directory from mount point on which quota was configured, quota list command output is blank
|
||||
- [#1418417](https://bugzilla.redhat.com/1418417): packaging: remove glusterfs-ganesha subpackage
|
||||
- [#1418629](https://bugzilla.redhat.com/1418629): glustershd process crashed on systemic setup
|
||||
- [#1418900](https://bugzilla.redhat.com/1418900): [RFE] Include few more options in virt file
|
||||
@@ -355,7 +384,7 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1420619](https://bugzilla.redhat.com/1420619): Entry heal messages in glustershd.log while no entries shown in heal info
|
||||
- [#1420623](https://bugzilla.redhat.com/1420623): [RHV-RHGS]: Application VM paused after add brick operation and VM didn't comeup after power cycle.
|
||||
- [#1420637](https://bugzilla.redhat.com/1420637): Modified volume options not synced once offline nodes comes up.
|
||||
- [#1420697](https://bugzilla.redhat.com/1420697): CLI option "--timeout" is accepting non numeric and negative values.
|
||||
- [#1420697](https://bugzilla.redhat.com/1420697): CLI option "--timeout" is accepting non numeric and negative values.
|
||||
- [#1420713](https://bugzilla.redhat.com/1420713): glusterd: storhaug, remove all vestiges ganesha
|
||||
- [#1421023](https://bugzilla.redhat.com/1421023): Binary file gf_attach generated during build process should be git ignored
|
||||
- [#1421590](https://bugzilla.redhat.com/1421590): Bricks take up new ports upon volume restart after add-brick op with brick mux enabled
|
||||
@@ -364,9 +393,9 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1421653](https://bugzilla.redhat.com/1421653): dht_setxattr returns EINVAL when a file is deleted during the FOP
|
||||
- [#1421721](https://bugzilla.redhat.com/1421721): volume start command hangs
|
||||
- [#1421724](https://bugzilla.redhat.com/1421724): glusterd log is flooded with stale disconnect rpc messages
|
||||
- [#1421759](https://bugzilla.redhat.com/1421759): Gluster NFS server crashing in __mnt3svc_umountall
|
||||
- [#1421759](https://bugzilla.redhat.com/1421759): Gluster NFS server crashing in \_\_mnt3svc_umountall
|
||||
- [#1421937](https://bugzilla.redhat.com/1421937): [Replicate] "RPC call decoding failed" leading to IO hang & mount inaccessible
|
||||
- [#1421938](https://bugzilla.redhat.com/1421938): systemic testing: seeing lot of ping time outs which would lead to splitbrains
|
||||
- [#1421938](https://bugzilla.redhat.com/1421938): systemic testing: seeing lot of ping time outs which would lead to splitbrains
|
||||
- [#1421955](https://bugzilla.redhat.com/1421955): Disperse: Fallback to pre-compiled code execution when dynamic code generation fails
|
||||
- [#1422074](https://bugzilla.redhat.com/1422074): GlusterFS truncates nanoseconds to microseconds when setting mtime
|
||||
- [#1422152](https://bugzilla.redhat.com/1422152): Bricks not coming up when ran with address sanitizer
|
||||
@@ -387,7 +416,7 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1424815](https://bugzilla.redhat.com/1424815): Fix erronous comparaison of flags resulting in UUID always sent
|
||||
- [#1424894](https://bugzilla.redhat.com/1424894): Some switches don't have breaks causing unintended fall throughs.
|
||||
- [#1424905](https://bugzilla.redhat.com/1424905): Coverity: Memory issues and dead code
|
||||
- [#1425288](https://bugzilla.redhat.com/1425288): glusterd is not validating for allowed values while setting "cluster.brick-multiplex" property
|
||||
- [#1425288](https://bugzilla.redhat.com/1425288): glusterd is not validating for allowed values while setting "cluster.brick-multiplex" property
|
||||
- [#1425515](https://bugzilla.redhat.com/1425515): tests: quota-anon-fd-nfs.t needs to check if nfs mount is avialable before mounting
|
||||
- [#1425623](https://bugzilla.redhat.com/1425623): Free all xlator specific resources when xlator->fini() gets called
|
||||
- [#1425676](https://bugzilla.redhat.com/1425676): gfids are not populated in release/releasedir requests
|
||||
@@ -415,8 +444,8 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1428510](https://bugzilla.redhat.com/1428510): memory leak in features/locks xlator
|
||||
- [#1429198](https://bugzilla.redhat.com/1429198): Restore atime/mtime for symlinks and other non-regular files.
|
||||
- [#1429200](https://bugzilla.redhat.com/1429200): disallow increasing replica count for arbiter volumes
|
||||
- [#1429330](https://bugzilla.redhat.com/1429330): [crawler]: auxiliary mount remains even after crawler finishes
|
||||
- [#1429696](https://bugzilla.redhat.com/1429696): ldd libgfxdr.so.0.0.1: undefined symbol: __gf_free
|
||||
- [#1429330](https://bugzilla.redhat.com/1429330): [crawler]: auxiliary mount remains even after crawler finishes
|
||||
- [#1429696](https://bugzilla.redhat.com/1429696): ldd libgfxdr.so.0.0.1: undefined symbol: \_\_gf_free
|
||||
- [#1430042](https://bugzilla.redhat.com/1430042): Transport endpoint not connected error seen on client when glusterd is restarted
|
||||
- [#1430148](https://bugzilla.redhat.com/1430148): USS is broken when multiplexing is on
|
||||
- [#1430608](https://bugzilla.redhat.com/1430608): [RFE] Pass slave volume in geo-rep as read-only
|
||||
@@ -452,7 +481,7 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1438370](https://bugzilla.redhat.com/1438370): rebalance: Allow admin to change thread count for rebalance
|
||||
- [#1438411](https://bugzilla.redhat.com/1438411): [Ganesha + EC] : Input/Output Error while creating LOTS of smallfiles
|
||||
- [#1438738](https://bugzilla.redhat.com/1438738): Inode ref leak on anonymous reads and writes
|
||||
- [#1438772](https://bugzilla.redhat.com/1438772): build: clang/llvm has __builtin_ffs() and __builtin_popcount()
|
||||
- [#1438772](https://bugzilla.redhat.com/1438772): build: clang/llvm has **builtin_ffs() and **builtin_popcount()
|
||||
- [#1438810](https://bugzilla.redhat.com/1438810): File-level WORM allows ftruncate() on read-only files
|
||||
- [#1438858](https://bugzilla.redhat.com/1438858): explicitly specify executor to be bash for tests
|
||||
- [#1439527](https://bugzilla.redhat.com/1439527): [disperse] Don't count healing brick as healthy brick
|
||||
@@ -491,7 +520,7 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1449004](https://bugzilla.redhat.com/1449004): [Brick Multiplexing] : Bricks for multiple volumes going down after glusterd restart and not coming back up after volume start force
|
||||
- [#1449191](https://bugzilla.redhat.com/1449191): Multiple bricks WILL crash after TCP port probing
|
||||
- [#1449311](https://bugzilla.redhat.com/1449311): [whql][virtio-block+glusterfs]"Disk Stress" and "Disk Verification" job always failed on win7-32/win2012/win2k8R2 guest
|
||||
- [#1449775](https://bugzilla.redhat.com/1449775): quota: limit-usage command failed with error " Failed to start aux mount"
|
||||
- [#1449775](https://bugzilla.redhat.com/1449775): quota: limit-usage command failed with error " Failed to start aux mount"
|
||||
- [#1449921](https://bugzilla.redhat.com/1449921): afr: include quorum type and count when dumping afr priv
|
||||
- [#1449924](https://bugzilla.redhat.com/1449924): When either killing or restarting a brick with performance.stat-prefetch on, stat sometimes returns a bad st_size value.
|
||||
- [#1449933](https://bugzilla.redhat.com/1449933): Brick Multiplexing :- resetting a brick bring down other bricks with same PID
|
||||
@@ -499,25 +528,25 @@ Bugs addressed since release-3.10.0 are listed below.
|
||||
- [#1450377](https://bugzilla.redhat.com/1450377): GNFS crashed while taking lock on a file from 2 different clients having same volume mounted from 2 different servers
|
||||
- [#1450565](https://bugzilla.redhat.com/1450565): glfsheal: crashed(segfault) with disperse volume in RDMA
|
||||
- [#1450729](https://bugzilla.redhat.com/1450729): Brick Multiplexing: seeing Input/Output Error for .trashcan
|
||||
- [#1450933](https://bugzilla.redhat.com/1450933): [New] - Replacing an arbiter brick while I/O happens causes vm pause
|
||||
- [#1450933](https://bugzilla.redhat.com/1450933): [New] - Replacing an arbiter brick while I/O happens causes vm pause
|
||||
- [#1451033](https://bugzilla.redhat.com/1451033): contrib: timer-wheel 32-bit bug, use builtin_fls, license, etc
|
||||
- [#1451573](https://bugzilla.redhat.com/1451573): AFR returns the node uuid of the same node for every file in the replica
|
||||
- [#1451586](https://bugzilla.redhat.com/1451586): crash in dht_rmdir_do
|
||||
- [#1451591](https://bugzilla.redhat.com/1451591): cli xml status of detach tier broken
|
||||
- [#1451887](https://bugzilla.redhat.com/1451887): Add tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t to bad tests
|
||||
- [#1452000](https://bugzilla.redhat.com/1452000): Spacing issue in fix-layout status output
|
||||
- [#1453050](https://bugzilla.redhat.com/1453050): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions
|
||||
- [#1453050](https://bugzilla.redhat.com/1453050): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions
|
||||
- [#1453086](https://bugzilla.redhat.com/1453086): Brick Multiplexing: On reboot of a node Brick multiplexing feature lost on that node as multiple brick processes get spawned
|
||||
- [#1453152](https://bugzilla.redhat.com/1453152): [Parallel Readdir] : Mounts fail when performance.parallel-readdir is set to "off"
|
||||
- [#1454533](https://bugzilla.redhat.com/1454533): lock_revocation.t Marked as bad in 3.11 for CentOS as well
|
||||
- [#1454569](https://bugzilla.redhat.com/1454569): [geo-rep + nl]: Multiple crashes observed on slave with "nlc_lookup_cbk"
|
||||
- [#1454597](https://bugzilla.redhat.com/1454597): [Tiering]: High and low watermark values when set to the same level, is allowed
|
||||
- [#1454597](https://bugzilla.redhat.com/1454597): [Tiering]: High and low watermark values when set to the same level, is allowed
|
||||
- [#1454612](https://bugzilla.redhat.com/1454612): glusterd on a node crashed after running volume profile command
|
||||
- [#1454686](https://bugzilla.redhat.com/1454686): Implement FALLOCATE FOP for EC
|
||||
- [#1454853](https://bugzilla.redhat.com/1454853): Seeing error "Failed to get the total number of files. Unable to estimate time to complete rebalance" in rebalance logs
|
||||
- [#1455177](https://bugzilla.redhat.com/1455177): ignore incorrect uuid validation in gd_validate_mgmt_hndsk_req
|
||||
- [#1455423](https://bugzilla.redhat.com/1455423): dht: dht self heal fails with no hashed subvol error
|
||||
- [#1455907](https://bugzilla.redhat.com/1455907): heal info shows the status of the bricks as "Transport endpoint is not connected" though bricks are up
|
||||
- [#1455907](https://bugzilla.redhat.com/1455907): heal info shows the status of the bricks as "Transport endpoint is not connected" though bricks are up
|
||||
- [#1456224](https://bugzilla.redhat.com/1456224): [gluster-block]:Need a volume group profile option for gluster-block volume to add necessary options to be added.
|
||||
- [#1456225](https://bugzilla.redhat.com/1456225): gluster-block is not working as expected when shard is enabled
|
||||
- [#1456331](https://bugzilla.redhat.com/1456331): [Bitrot]: Brick process crash observed while trying to recover a bad file in disperse volume
|
||||
|
||||
@@ -7,6 +7,7 @@ GlusterFS 3.11 stable release.

## Major changes, features and limitations addressed in this release

### Improved disperse performance

Fix for bug [#1456259](https://bugzilla.redhat.com/1456259) changes the way
messages are read and processed from the socket layers on the Gluster client.
This has shown performance improvements on disperse volumes, and is applicable

@@ -14,6 +15,7 @@ to other volume types as well, where there may be multiple applications or users
accessing the same mount point.

### Group settings for enabling negative lookup caching provided

Ability to serve negative lookups from cache was added in 3.11.0 and with
this release, a group volume set option is added for ease in enabling this
feature.

@@ -21,6 +23,7 @@ feature.
See [group-nl-cache](https://github.com/gluster/glusterfs/blob/release-3.11/extras/group-nl-cache) for more details.
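A minimal usage sketch of the new group option (the volume name is illustrative; the group name follows the linked `group-nl-cache` file):

```bash
# Apply the nl-cache group profile instead of setting each option by hand
gluster volume set myvol group nl-cache
```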
### Gluster fuse now implements "-oauto_unmount" feature.

libfuse has an auto_unmount option which, if enabled, ensures that the file
system is unmounted at FUSE server termination by running a separate monitor
process that performs the unmount when that occurs. This release implements that

@@ -30,15 +33,17 @@ Note that "auto unmount" (robust or not) is a leaky abstraction, as the kernel
cannot guarantee that the path where the FUSE fs is mounted is actually the
toplevel mount at the time of the umount(2) call, for multiple reasons,
among others, see:

- fuse-devel: ["fuse: feasible to distinguish between umount and abort?"](http://fuse.996288.n3.nabble.com/fuse-feasible-to-distinguish-between-umount-and-abort-tt14358.html)
- https://github.com/libfuse/libfuse/issues/122
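For illustration, mounting with the new option might look like this (server, volume, and mount point are placeholders):

```bash
# Mount with auto_unmount so the mount is cleaned up if the glusterfs
# client process terminates unexpectedly
mount -t glusterfs -o auto_unmount server1:/myvol /mnt/myvol
```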
## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance) there
      are reports of VM images getting corrupted.
    - Status of this bug can be tracked here, #1465123

## Bugs addressed
@@ -46,7 +51,7 @@ among others, see:
Bugs addressed since release-3.11.0 are listed below.

- [#1456259](https://bugzilla.redhat.com/1456259): limited throughput with disperse volume over small number of bricks
- [#1457058](https://bugzilla.redhat.com/1457058): glusterfs client crash on io-cache.so(\_\_ioc_page_wakeup+0x44)
- [#1457289](https://bugzilla.redhat.com/1457289): tierd listens to a port.
- [#1457339](https://bugzilla.redhat.com/1457339): DHT: slow readdirp performance
- [#1457616](https://bugzilla.redhat.com/1457616): "split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf

@@ -55,8 +60,8 @@ Bugs addressed since release-3.11.0 are listed below.

- [#1458664](https://bugzilla.redhat.com/1458664): [Geo-rep]: METADATA errors are seen even though everything is in sync
- [#1459090](https://bugzilla.redhat.com/1459090): all: spelling errors (debian package maintainer)
- [#1459095](https://bugzilla.redhat.com/1459095): extras/hook-scripts: non-portable shell syntax (debian package maintainer)
- [#1459392](https://bugzilla.redhat.com/1459392): possible repeatedly recursive healing of same file with background heal not happening when IO is going on
- [#1459759](https://bugzilla.redhat.com/1459759): Glusterd segmentation fault in ' \_Unwind_Backtrace' while running peer probe
- [#1460647](https://bugzilla.redhat.com/1460647): posix-acl: Whitelist virtual ACL xattrs
- [#1460894](https://bugzilla.redhat.com/1460894): Rebalance estimate time sometimes shows negative values
- [#1460895](https://bugzilla.redhat.com/1460895): Upcall missing invalidations
@@ -10,13 +10,14 @@ There are no major features or changes made in this release.

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance) there
      are reports of VM images getting corrupted.
    - The last known cause for corruption (Bug #1465123) has a fix with this
      release. As further testing is still in progress, the issue is retained as
      a major issue.
    - Status of this bug can be tracked here, #1465123

## Bugs addressed
@@ -26,8 +27,8 @@ Bugs addressed since release-3.11.0 are listed below.

- [#1463512](https://bugzilla.redhat.com/1463512): USS: stale snap entries are seen when activation/deactivation performed during one of the glusterd's unavailability
- [#1463513](https://bugzilla.redhat.com/1463513): [geo-rep]: extended attributes are not synced if the entry and extended attributes are done within changelog roleover/or entry sync
- [#1463517](https://bugzilla.redhat.com/1463517): Brick Multiplexing:dmesg shows request_sock_TCP: Possible SYN flooding on port 49152 and memory related backtraces
- [#1463528](https://bugzilla.redhat.com/1463528): [Perf] 35% drop in small file creates on smbv3 on \*2
- [#1463626](https://bugzilla.redhat.com/1463626): [Ganesha]Bricks got crashed while running posix compliance test suit on V4 mount
- [#1464316](https://bugzilla.redhat.com/1464316): DHT: Pass errno as an argument to gf_msg
- [#1465123](https://bugzilla.redhat.com/1465123): Fd based fops fail with EBADF on file migration
- [#1465854](https://bugzilla.redhat.com/1465854): Regression: Heal info takes longer time when a brick is down

@@ -36,7 +37,7 @@ Bugs addressed since release-3.11.0 are listed below.

- [#1467268](https://bugzilla.redhat.com/1467268): Heal info shows incorrect status
- [#1468118](https://bugzilla.redhat.com/1468118): disperse seek does not correctly handle the end of file
- [#1468200](https://bugzilla.redhat.com/1468200): [Geo-rep]: entry failed to sync to slave with ENOENT errror
- [#1468457](https://bugzilla.redhat.com/1468457): selfheal deamon cpu consumption not reducing when IOs are going on and all redundant bricks are brought down one after another
- [#1469459](https://bugzilla.redhat.com/1469459): Rebalance hangs on remove-brick if the target volume changes
- [#1470938](https://bugzilla.redhat.com/1470938): Regression: non-disruptive(in-service) upgrade on EC volume fails
- [#1471025](https://bugzilla.redhat.com/1471025): glusterfs process leaking memory when error occurs
@@ -14,13 +14,14 @@ There are no major features or changes made in this release.

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images, if such volumes are
      expanded or possibly contracted (i.e add/remove bricks and rebalance) there
      are reports of VM images getting corrupted.
    - The last known cause for corruption (Bug #1465123) has a fix with the 3.11.2
      release. As further testing is still in progress, the issue is retained as
      a major issue.
    - Status of this bug can be tracked here, #1465123

## Bugs addressed
@@ -20,6 +20,7 @@ captures the list of features that were introduced with 3.11.

## Major changes and features

### Ability to mount sub-directories using the Gluster FUSE protocol

**Notes for users:**

With this release, it is possible to define sub-directories to be mounted by

@@ -31,15 +32,19 @@ client. This feature helps sharing a volume among the multiple consumers along
with enabling restricting access to the sub-directory of choice.

The option controlling sub-directory allow/deny rules can be set as follows:

```
# gluster volume set <volname> auth.allow "/subdir1(192.168.1.*),/(192.168.10.*),/subdir2(192.168.8.*)"
```

How to mount from the client:

```
# mount -t glusterfs <hostname>:/<volname>/<subdir> /<mount_point>
```

Or,

```
# mount -t glusterfs <hostname>:/<volname> -osubdir_mount=<subdir> /<mount_point>
```

@@ -47,14 +52,15 @@ Or,
**Limitations:**

- There is no throttling or QoS support for this feature. The feature will
  just provide the namespace isolation for the different clients.

**Known Issues:**

- Once we cross more than 1000s of subdirs in 'auth.allow' option, the
  performance of reconnect / authentication would be impacted.
### GFID to path conversion is enabled by default

**Notes for users:**

Prior to this feature, only when quota was enabled, did the on disk data have

@@ -80,18 +86,20 @@ None
None
### Various enhancements have been made to the output of get-state CLI command

**Notes for users:**

The command `#gluster get-state` has been enhanced to output more information
as below,

- Arbiter bricks are marked more clearly in a volume that has the feature
  enabled
- Ability to get all volume options (both set and defaults) in the get-state
  output
- Rebalance time estimates, for ongoing rebalance, is captured in the get-state
  output
- If geo-replication is configured, then get-state now captures the session
  details of the same

**Limitations:**

@@ -102,6 +110,7 @@ None
None
### Provided an option to set a limit on the number of bricks multiplexed in a process

**Notes for users:**

This release includes a global option to be switched on only if brick

@@ -111,19 +120,22 @@ node. If the limit set by this option is insufficient for a single process,
more processes are spawned for the subsequent bricks.

Usage:

```
# gluster volume set all cluster.max-bricks-per-process <value>
```
### Provided an option to use localtime timestamps in log entries

**Limitations:**

Gluster defaults to UTC timestamps. glusterd, glusterfsd, and server-side
glusterfs daemons will use UTC until one of,

1. command line option is processed,
2. gluster config (/var/lib/glusterd/options) is loaded,
3. admin manually sets localtime-logging (cluster.localtime-logging, e.g.
   `# gluster volume set all cluster.localtime-logging enable`).

There is no mount option to make the FUSE client enable localtime logging.
@@ -144,6 +156,7 @@ and also enhancing the ability for file placement in the distribute translator
when used with the option `min-free-disk`.

### Provided a means to resolve GFID split-brain using the gluster CLI

**Notes for users:**

The existing CLI commands to heal files under split-brain did not handle cases

@@ -152,6 +165,7 @@ the same CLI commands can now address GFID split-brain situations based on the
choices provided.

The CLI options that are enhanced to help with this situation are,

```
volume heal <VOLNAME> split-brain {bigger-file <FILE> |
                                   latest-mtime <FILE> |

@@ -167,14 +181,16 @@ None
None
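A usage sketch based on the syntax above (volume name and file path are placeholders):

```bash
# Resolve a GFID split-brain on one file by keeping the copy with the
# latest modification time
gluster volume heal myvol split-brain latest-mtime /dir/conflicting-file
```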
### Developer related: Added a 'site.h' for more vendor/company specific defaults

**Notes for developers:**

**NOTE**: Also relevant for users building from sources and needing different
defaults for some options

Most people consume Gluster in one of two ways:

- From packages provided by their OS/distribution vendor
- By building themselves from source

For the first group it doesn't matter whether configuration is done in a
configure script, via command-line options to that configure script, or in a

@@ -198,6 +214,7 @@ file. Further guidelines for how to determine whether an option should go in
configure.ac or site.h are explained within site.h itself.
### Developer related: Added xxhash library to libglusterfs for required use

**Notes for developers:**

Function gf_xxh64_wrapper has been added as a wrapper into libglusterfs for

@@ -206,6 +223,7 @@ consumption by interested developers.
Reference to code can be found [here](https://github.com/gluster/glusterfs/blob/v3.12.0alpha1/libglusterfs/src/common-utils.h#L835)

### Developer related: glfs_ipc API in libgfapi is removed as a public interface

**Notes for users:**

glfs_ipc API was maintained as a public API in the GFAPI libraries. This has

@@ -219,14 +237,15 @@ this change.
API
## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images; if such volumes are
      expanded or possibly contracted (i.e. add/remove bricks and rebalance) there
      are reports of VM images getting corrupted.
    - The last known cause for corruption (Bug #1465123) has a fix with this
      release. As further testing is still in progress, the issue is retained as
      a major issue.
    - Status of this bug can be tracked here, #1465123

## Bugs addressed

@@ -243,13 +262,13 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1400924](https://bugzilla.redhat.com/1400924): [RFE] Rsync flags for performance improvements
|
||||
- [#1402406](https://bugzilla.redhat.com/1402406): Client stale file handle error in dht-linkfile.c under SPEC SFS 2014 VDA workload
|
||||
- [#1414242](https://bugzilla.redhat.com/1414242): [whql][virtio-block+glusterfs]"Disk Stress" and "Disk Verification" job always failed on win7-32/win2012/win2k8R2 guest
|
||||
- [#1421938](https://bugzilla.redhat.com/1421938): systemic testing: seeing lot of ping time outs which would lead to splitbrains
|
||||
- [#1421938](https://bugzilla.redhat.com/1421938): systemic testing: seeing lot of ping time outs which would lead to splitbrains
|
||||
- [#1424817](https://bugzilla.redhat.com/1424817): Fix wrong operators, found by coverty
|
||||
- [#1428061](https://bugzilla.redhat.com/1428061): Halo Replication feature for AFR translator
|
||||
- [#1428673](https://bugzilla.redhat.com/1428673): possible repeatedly recursive healing of same file with background heal not happening when IO is going on
|
||||
- [#1428673](https://bugzilla.redhat.com/1428673): possible repeatedly recursive healing of same file with background heal not happening when IO is going on
|
||||
- [#1430608](https://bugzilla.redhat.com/1430608): [RFE] Pass slave volume in geo-rep as read-only
|
||||
- [#1431908](https://bugzilla.redhat.com/1431908): Enabling parallel-readdir causes dht linkto files to be visible on the mount,
|
||||
- [#1433906](https://bugzilla.redhat.com/1433906): quota: limit-usage command failed with error " Failed to start aux mount"
|
||||
- [#1433906](https://bugzilla.redhat.com/1433906): quota: limit-usage command failed with error " Failed to start aux mount"
|
||||
- [#1437748](https://bugzilla.redhat.com/1437748): Spacing issue in fix-layout status output
|
||||
- [#1438966](https://bugzilla.redhat.com/1438966): Multiple bricks WILL crash after TCP port probing
|
||||
- [#1439068](https://bugzilla.redhat.com/1439068): Segmentation fault when creating a qcow2 with qemu-img
|
||||
@@ -270,7 +289,7 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1447826](https://bugzilla.redhat.com/1447826): potential endless loop in function glusterfs_graph_validate_options
|
||||
- [#1447828](https://bugzilla.redhat.com/1447828): Should use dict_set_uint64 to set fd->pid when dump fd's info to dict
|
||||
- [#1447953](https://bugzilla.redhat.com/1447953): Remove inadvertently merged IPv6 code
|
||||
- [#1447960](https://bugzilla.redhat.com/1447960): [Tiering]: High and low watermark values when set to the same level, is allowed
|
||||
- [#1447960](https://bugzilla.redhat.com/1447960): [Tiering]: High and low watermark values when set to the same level, is allowed
|
||||
- [#1447966](https://bugzilla.redhat.com/1447966): 'make cscope' fails on a clean tree due to missing generated XDR files
|
||||
- [#1448150](https://bugzilla.redhat.com/1448150): USS: stale snap entries are seen when activation/deactivation performed during one of the glusterd's unavailability
|
||||
- [#1448265](https://bugzilla.redhat.com/1448265): use common function iov_length to instead of duplicate code
|
||||
@@ -286,7 +305,7 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1449329](https://bugzilla.redhat.com/1449329): When either killing or restarting a brick with performance.stat-prefetch on, stat sometimes returns a bad st_size value.
|
||||
- [#1449348](https://bugzilla.redhat.com/1449348): disperse seek does not correctly handle the end of file
|
||||
- [#1449495](https://bugzilla.redhat.com/1449495): glfsheal: crashed(segfault) with disperse volume in RDMA
|
||||
- [#1449610](https://bugzilla.redhat.com/1449610): [New] - Replacing an arbiter brick while I/O happens causes vm pause
|
||||
- [#1449610](https://bugzilla.redhat.com/1449610): [New] - Replacing an arbiter brick while I/O happens causes vm pause
|
||||
- [#1450010](https://bugzilla.redhat.com/1450010): [gluster-block]:Need a volume group profile option for gluster-block volume to add necessary options to be added.
|
||||
- [#1450559](https://bugzilla.redhat.com/1450559): Error 0-socket.management: socket_poller XX.XX.XX.XX:YYY failed (Input/output error) during any volume operation
|
||||
- [#1450630](https://bugzilla.redhat.com/1450630): [brick multiplexing] detach a brick if posix health check thread complaints about underlying brick
|
||||
@@ -299,7 +318,7 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1451724](https://bugzilla.redhat.com/1451724): glusterfind pre crashes with "UnicodeDecodeError: 'utf8' codec can't decode" error when the `--no-encode` is used
|
||||
- [#1452006](https://bugzilla.redhat.com/1452006): tierd listens to a port.
|
||||
- [#1452084](https://bugzilla.redhat.com/1452084): [Ganesha] : Stale linkto files after unsuccessfuly hardlinks
|
||||
- [#1452102](https://bugzilla.redhat.com/1452102): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions
|
||||
- [#1452102](https://bugzilla.redhat.com/1452102): [DHt] : segfault in dht_selfheal_dir_setattr while running regressions
|
||||
- [#1452378](https://bugzilla.redhat.com/1452378): Cleanup unnecessary logs in fix_quorum_options
|
||||
- [#1452527](https://bugzilla.redhat.com/1452527): Shared volume doesn't get mounted on few nodes after rebooting all nodes in cluster.
|
||||
- [#1452956](https://bugzilla.redhat.com/1452956): glusterd on a node crashed after running volume profile command
|
||||
@@ -307,9 +326,9 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1453977](https://bugzilla.redhat.com/1453977): Brick Multiplexing: Deleting brick directories of the base volume must gracefully detach from glusterfsd without impacting other volumes IO(currently seeing transport end point error)
|
||||
- [#1454317](https://bugzilla.redhat.com/1454317): [Bitrot]: Brick process crash observed while trying to recover a bad file in disperse volume
|
||||
- [#1454375](https://bugzilla.redhat.com/1454375): ignore incorrect uuid validation in gd_validate_mgmt_hndsk_req
|
||||
- [#1454418](https://bugzilla.redhat.com/1454418): Glusterd segmentation fault in ' _Unwind_Backtrace' while running peer probe
|
||||
- [#1454418](https://bugzilla.redhat.com/1454418): Glusterd segmentation fault in ' \_Unwind_Backtrace' while running peer probe
|
||||
- [#1454701](https://bugzilla.redhat.com/1454701): DHT: Pass errno as an argument to gf_msg
|
||||
- [#1454865](https://bugzilla.redhat.com/1454865): [Brick Multiplexing] heal info shows the status of the bricks as "Transport endpoint is not connected" though bricks are up
|
||||
- [#1454865](https://bugzilla.redhat.com/1454865): [Brick Multiplexing] heal info shows the status of the bricks as "Transport endpoint is not connected" though bricks are up
|
||||
- [#1454872](https://bugzilla.redhat.com/1454872): [Geo-rep]: Make changelog batch size configurable
|
||||
- [#1455049](https://bugzilla.redhat.com/1455049): [GNFS+EC] Unable to release the lock when the other client tries to acquire the lock on the same file
|
||||
- [#1455104](https://bugzilla.redhat.com/1455104): dht: dht self heal fails with no hashed subvol error
|
||||
@@ -317,8 +336,8 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1455301](https://bugzilla.redhat.com/1455301): gluster-block is not working as expected when shard is enabled
|
||||
- [#1455559](https://bugzilla.redhat.com/1455559): [Geo-rep]: METADATA errors are seen even though everything is in sync
|
||||
- [#1455831](https://bugzilla.redhat.com/1455831): libglusterfs: updates old comment for 'arena_size'
|
||||
- [#1456361](https://bugzilla.redhat.com/1456361): DHT : for many operation directory/file path is '(null)' in brick log
|
||||
- [#1456385](https://bugzilla.redhat.com/1456385): glusterfs client crash on io-cache.so(__ioc_page_wakeup+0x44)
|
||||
- [#1456361](https://bugzilla.redhat.com/1456361): DHT : for many operation directory/file path is '(null)' in brick log
|
||||
- [#1456385](https://bugzilla.redhat.com/1456385): glusterfs client crash on io-cache.so(\_\_ioc_page_wakeup+0x44)
|
||||
- [#1456405](https://bugzilla.redhat.com/1456405): Brick Multiplexing:dmesg shows request_sock_TCP: Possible SYN flooding on port 49152 and memory related backtraces
|
||||
- [#1456582](https://bugzilla.redhat.com/1456582): "split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf
|
||||
- [#1456653](https://bugzilla.redhat.com/1456653): nlc_lookup_cbk floods logs
|
||||
@@ -333,7 +352,7 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1458197](https://bugzilla.redhat.com/1458197): io-stats usability/performance statistics enhancements
|
||||
- [#1458539](https://bugzilla.redhat.com/1458539): [Negative Lookup]: negative lookup features doesn't seem to work on restart of volume
|
||||
- [#1458582](https://bugzilla.redhat.com/1458582): add all as volume option in gluster volume get usage
|
||||
- [#1458768](https://bugzilla.redhat.com/1458768): [Perf] 35% drop in small file creates on smbv3 on *2
|
||||
- [#1458768](https://bugzilla.redhat.com/1458768): [Perf] 35% drop in small file creates on smbv3 on \*2
|
||||
- [#1459402](https://bugzilla.redhat.com/1459402): brick process crashes while running bug-1432542-mpx-restart-crash.t in a loop
|
||||
- [#1459530](https://bugzilla.redhat.com/1459530): [RFE] Need a way to resolve gfid split brains
|
||||
- [#1459620](https://bugzilla.redhat.com/1459620): [geo-rep]: Worker crashed with TypeError: expected string or buffer
|
||||
@@ -349,17 +368,17 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1461655](https://bugzilla.redhat.com/1461655): glusterd crashes when statedump is taken
|
||||
- [#1461792](https://bugzilla.redhat.com/1461792): lk fop succeeds even when lock is not acquired on at least quorum number of bricks
|
||||
- [#1461845](https://bugzilla.redhat.com/1461845): [Bitrot]: Inconsistency seen with 'scrub ondemand' - fails to trigger scrub
|
||||
- [#1462200](https://bugzilla.redhat.com/1462200): glusterd status showing failed when it's stopped in RHEL7
|
||||
- [#1462200](https://bugzilla.redhat.com/1462200): glusterd status showing failed when it's stopped in RHEL7
|
||||
- [#1462241](https://bugzilla.redhat.com/1462241): glusterfind: syntax error due to uninitialized variable 'end'
|
||||
- [#1462790](https://bugzilla.redhat.com/1462790): with AFR now making both nodes to return UUID for a file will result in georep consuming more resources
|
||||
- [#1463178](https://bugzilla.redhat.com/1463178): [Ganesha]Bricks got crashed while running posix compliance test suit on V4 mount
|
||||
- [#1463178](https://bugzilla.redhat.com/1463178): [Ganesha]Bricks got crashed while running posix compliance test suit on V4 mount
|
||||
- [#1463365](https://bugzilla.redhat.com/1463365): Changes for Maintainers 2.0
|
||||
- [#1463648](https://bugzilla.redhat.com/1463648): Use GF_XATTR_LIST_NODE_UUIDS_KEY to figure out local subvols
|
||||
- [#1464072](https://bugzilla.redhat.com/1464072): cns-brick-multiplexing: brick process fails to restart after gluster pod failure
|
||||
- [#1464091](https://bugzilla.redhat.com/1464091): Regression: Heal info takes longer time when a brick is down
|
||||
- [#1464110](https://bugzilla.redhat.com/1464110): [Scale] : Rebalance ETA (towards the end) may be inaccurate,even on a moderately large data set.
|
||||
- [#1464327](https://bugzilla.redhat.com/1464327): glusterfs client crashes when reading large directory
|
||||
- [#1464359](https://bugzilla.redhat.com/1464359): selfheal deamon cpu consumption not reducing when IOs are going on and all redundant bricks are brought down one after another
|
||||
- [#1464359](https://bugzilla.redhat.com/1464359): selfheal deamon cpu consumption not reducing when IOs are going on and all redundant bricks are brought down one after another
|
||||
- [#1465024](https://bugzilla.redhat.com/1465024): glusterfind: DELETE path needs to be unquoted before further processing
|
||||
- [#1465075](https://bugzilla.redhat.com/1465075): Fd based fops fail with EBADF on file migration
|
||||
- [#1465214](https://bugzilla.redhat.com/1465214): build failed with GF_DISABLE_MEMPOOL
|
||||
@@ -424,7 +443,7 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1479717](https://bugzilla.redhat.com/1479717): Running sysbench on vm disk from plain distribute gluster volume causes disk corruption
|
||||
- [#1480448](https://bugzilla.redhat.com/1480448): More useful error - replace 'not optimal'
|
||||
- [#1480459](https://bugzilla.redhat.com/1480459): Gluster puts PID files in wrong place
|
||||
- [#1481931](https://bugzilla.redhat.com/1481931): [Scale] : I/O errors on multiple gNFS mounts with "Stale file handle" during rebalance of an erasure coded volume.
|
||||
- [#1481931](https://bugzilla.redhat.com/1481931): [Scale] : I/O errors on multiple gNFS mounts with "Stale file handle" during rebalance of an erasure coded volume.
|
||||
- [#1482804](https://bugzilla.redhat.com/1482804): Negative Test: glusterd crashes for some of the volume options if set at cluster level
|
||||
- [#1482835](https://bugzilla.redhat.com/1482835): glusterd fails to start
|
||||
- [#1483402](https://bugzilla.redhat.com/1483402): DHT: readdirp fails to read some directories.
|
||||
@@ -432,6 +451,6 @@ Bugs addressed since release-3.11.0 are listed below.
|
||||
- [#1484440](https://bugzilla.redhat.com/1484440): packaging: /run and /var/run; prefer /run
|
||||
- [#1484885](https://bugzilla.redhat.com/1484885): [rpc]: EPOLLERR - disconnecting now messages every 3 secs after completing rebalance
|
||||
- [#1486107](https://bugzilla.redhat.com/1486107): /var/lib/glusterd/peers File had a blank line, Stopped Glusterd from starting
|
||||
- [#1486110](https://bugzilla.redhat.com/1486110): [quorum]: Replace brick is happened when Quorum not met.
|
||||
- [#1486110](https://bugzilla.redhat.com/1486110): [quorum]: Replace brick is happened when Quorum not met.
|
||||
- [#1486120](https://bugzilla.redhat.com/1486120): symlinks trigger faulty geo-replication state (rsnapshot usecase)
|
||||
- [#1486122](https://bugzilla.redhat.com/1486122): gluster-block profile needs to have strict-o-direct
|
||||
|
||||
@@ -1,20 +1,23 @@

# Release notes for Gluster 3.12.1

This is a bugfix release. The [Release Notes for 3.12.0](3.12.0.md),
[3.12.1](3.12.1.md) contain a listing of all the new features that
were added and bugs fixed in the GlusterFS 3.12 stable release.

## Major changes, features and limitations addressed in this release

No Major changes

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images; if such volumes are
      expanded or possibly contracted (i.e. add/remove bricks and rebalance) there
      are reports of VM images getting corrupted.
    - The last known cause for corruption (Bug #1465123) has a fix with this
      release. As further testing is still in progress, the issue is retained as
      a major issue.
    - Status of this bug can be tracked here, #1465123

## Bugs addressed

@@ -24,7 +27,7 @@ This is a bugfix release. The [Release Notes for 3.12.0](3.12.0.md),
|
||||
- [#1486538](https://bugzilla.redhat.com/1486538): [geo-rep+qr]: Crashes observed at slave from qr_lookup_sbk during rename/hardlink/rebalance
|
||||
- [#1486557](https://bugzilla.redhat.com/1486557): Log entry of files skipped/failed during rebalance operation
|
||||
- [#1487033](https://bugzilla.redhat.com/1487033): rpc: client_t and related objects leaked due to incorrect ref counts
|
||||
- [#1487319](https://bugzilla.redhat.com/1487319): afr: check op_ret value in __afr_selfheal_name_impunge
|
||||
- [#1487319](https://bugzilla.redhat.com/1487319): afr: check op_ret value in \_\_afr_selfheal_name_impunge
|
||||
- [#1488119](https://bugzilla.redhat.com/1488119): scripts: mount.glusterfs contains non-portable bashisms
|
||||
- [#1488168](https://bugzilla.redhat.com/1488168): Launch metadata heal in discover code path.
|
||||
- [#1488387](https://bugzilla.redhat.com/1488387): gluster-blockd process crashed and core generated
|
||||
|
||||
@@ -16,11 +16,12 @@ features that were added and bugs fixed in the GlusterFS 3.12 stable release.
|
||||
|
||||
Bugs addressed since release-3.12.9 are listed below.

- [#1570475](https://bugzilla.redhat.com/1570475): Rebalance on few nodes doesn't seem to complete - stuck at FUTEX_WAIT
|
||||
- [#1576816](https://bugzilla.redhat.com/1576816): GlusterFS can be improved
|
||||
- [#1577164](https://bugzilla.redhat.com/1577164): gfapi: broken symbol versions
|
||||
- [#1577845](https://bugzilla.redhat.com/1577845): Geo-rep: faulty session due to OSError: [Errno 95] Operation not supported
|
||||
- [#1577862](https://bugzilla.redhat.com/1577862): [geo-rep]: Upgrade fails, session in FAULTY state
- [#1577868](https://bugzilla.redhat.com/1577868): Glusterd crashed on a few (master) nodes
|
||||
- [#1577871](https://bugzilla.redhat.com/1577871): [geo-rep]: Geo-rep scheduler fails
|
||||
- [#1580519](https://bugzilla.redhat.com/1580519): the regression test "tests/bugs/posix/bug-990028.t" fails
|
||||
|
||||
@@ -8,6 +8,7 @@ GlusterFS 3.12 stable release.

## Major changes, features and limitations addressed in this release

This release contains a fix for a security vulnerability in Gluster as follows,

- http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-10841
- https://nvd.nist.gov/vuln/detail/CVE-2018-10841

@@ -16,6 +16,7 @@ all the new features that were added and bugs fixed in the GlusterFS 3.12 stable
|
||||
## Bugs addressed
|
||||
|
||||
Bugs addressed since release-3.12.12 are listed below
|
||||
|
||||
- [#1579673](https://bugzilla.redhat.com/1579673): Remove EIO from the dht_inode_missing macro
|
||||
- [#1595528](https://bugzilla.redhat.com/1595528): rmdir is leaking softlinks to directories in .glusterfs
|
||||
- [#1597120](https://bugzilla.redhat.com/1597120): Add quorum checks in post-op
|
||||
|
||||
@@ -16,9 +16,9 @@ contain a listing of all the new features that were added and bugs fixed in the
|
||||
## Bugs addressed
|
||||
|
||||
Bugs addressed in release-3.12.13 are listed below
|
||||
- [#1599788](https://bugzilla.redhat.com/1599788): _is_prefix should return false for 0-length strings
|
||||
|
||||
- [#1599788](https://bugzilla.redhat.com/1599788): \_is_prefix should return false for 0-length strings
|
||||
- [#1603093](https://bugzilla.redhat.com/1603093): directories are invisible on client side
|
||||
- [#1613512](https://bugzilla.redhat.com/1613512): Backport glusterfs-client memory leak fix to 3.12.x
|
||||
- [#1618838](https://bugzilla.redhat.com/1618838): gluster bash completion leaks TOP=0 into the environment
|
||||
- [#1618348](https://bugzilla.redhat.com/1618348): [Ganesha] Ganesha crashed in mdcache_alloc_and_check_handle while running bonnie and untars with parallel lookups
|
||||
|
||||
|
||||
@@ -7,7 +7,9 @@ and [3.12.13](3.12.13.md) contain a listing of all the new features that were ad
the GlusterFS 3.12 stable release.

## Major changes, features and limitations addressed in this release

1. This release contains fixes for the following security vulnerabilities,

    - https://nvd.nist.gov/vuln/detail/CVE-2018-10904
    - https://nvd.nist.gov/vuln/detail/CVE-2018-10907
    - https://nvd.nist.gov/vuln/detail/CVE-2018-10911
@@ -21,10 +23,11 @@ the GlusterFS 3.12 stable release.
    - https://nvd.nist.gov/vuln/detail/CVE-2018-10930

2. To resolve the security vulnerabilities the following limitations were made in GlusterFS

    - open, read, write on special files like char and block are no longer permitted
    - io-stat xlator can dump stat info only to the /var/run/gluster directory
      (an illustrative command follows at the end of this section)

3. Addressed an issue that affected copying a file over SSL/TLS in a volume

Installing the updated packages and restarting gluster services on gluster
brick hosts, will fix the security issues.

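As an illustration of the io-stat restriction noted above, a statistics dump triggered from a client mount now has to target a path under /var/run/gluster. The xattr-based trigger shown here is an assumption carried over from earlier io-stats usage, and the mount point and file name are hypothetical:

```
# setfattr -n trusted.io-stats-dump -v /var/run/gluster/io-stats-dump.txt /mnt/glustervol
```
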
@@ -38,7 +41,7 @@ brick hosts, will fix the security issues.
|
||||
Bugs addressed since release-3.12.14 are listed below.
|
||||
|
||||
- [#1622405](https://bugzilla.redhat.com/1622405): Problem with SSL/TLS encryption on Gluster 4.0 & 4.1
|
||||
- [#1625286](https://bugzilla.redhat.com/1625286): Information Exposure in posix_get_file_contents function in posix-helpers.c
|
||||
- [#1625286](https://bugzilla.redhat.com/1625286): Information Exposure in posix_get_file_contents function in posix-helpers.c
|
||||
- [#1625648](https://bugzilla.redhat.com/1625648): I/O to arbitrary devices on storage server
|
||||
- [#1625654](https://bugzilla.redhat.com/1625654): Stack-based buffer overflow in server-rpc-fops.c allows remote attackers to execute arbitrary code
|
||||
- [#1625656](https://bugzilla.redhat.com/1625656): Improper deserialization in dict.c:dict_unserialize() can allow attackers to read arbitrary memory
|
||||
|
||||
@@ -5,6 +5,7 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.
fixed in the GlusterFS 3.12 stable release.

## Major changes, features and limitations addressed in this release

1. In a pure distribute volume there is no source to heal a replaced brick
   from, and hence replacing a brick would cause a loss of the data that was
   present on the replaced brick. The CLI has been enhanced to prevent a user
   from inadvertently using replace brick
@@ -12,31 +13,32 @@ fixed in the GlusterFS 3.12 stable release.
   an existing brick in a pure distribute volume.

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images; if such volumes are
      expanded or possibly contracted (i.e. add/remove bricks and rebalance) there
      are reports of VM images getting corrupted.
    - The last known cause for corruption #1465123 is still pending, and not yet
      part of this release.

2. Gluster volume restarts fail if the sub directory export feature is in use.
   Status of this issue can be tracked here, #1501315

3. Mounting a gluster snapshot will fail, when attempting a FUSE based mount of
   the snapshot. So for current users, it is recommended to only access snapshots
   via the ".snaps" directory on a mounted gluster volume.
   Status of this issue can be tracked here, #1501378

## Bugs addressed

A total of 31 patches have been merged, addressing 28 bugs

- [#1490493](https://bugzilla.redhat.com/1490493): Sub-directory mount details are incorrect in /proc/mounts
- [#1491178](https://bugzilla.redhat.com/1491178): GlusterD returns a bad memory pointer in glusterd_get_args_from_dict()
- [#1491292](https://bugzilla.redhat.com/1491292): Provide brick list as part of VOLUME_CREATE event.
- [#1491690](https://bugzilla.redhat.com/1491690): rpc: TLSv1_2_method() is deprecated in OpenSSL-1.1
- [#1492026](https://bugzilla.redhat.com/1492026): set the shard-block-size to 64MB in virt profile
- [#1492061](https://bugzilla.redhat.com/1492061): CLIENT_CONNECT event not being raised
- [#1492066](https://bugzilla.redhat.com/1492066): AFR_SUBVOL_UP and AFR_SUBVOLS_DOWN events not working
- [#1493975](https://bugzilla.redhat.com/1493975): disallow replace brick operation on plain distribute volume

@@ -5,22 +5,22 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.
were added and bugs fixed in the GlusterFS 3.12 stable release.

## Major changes, features and limitations addressed in this release

1. The two regressions related to subdir mounts got fixed (an illustrative mount is shown after this list)

    - gluster volume restart failure (#1465123)
    - mounting gluster snapshot via fuse (#1501378)

2. Improvements for the "help" command within the gluster cli (#1509786)

3. Introduction of new api glfs_fd_set_lkowner() to set lock owner

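A sketch of the subdir mount that these fixes relate to, with hypothetical host, volume, and directory names:

```
# mount -t glusterfs server1:/myvol/subdir1 /mnt/subdir1
```
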
## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images; if such volumes are
      expanded or possibly contracted (i.e. add/remove bricks and rebalance) there
      are reports of VM images getting corrupted.
    - The last known cause for corruption #1465123 is still pending, and not yet
      part of this release.

## Bugs addressed
|
||||
|
||||
|
||||
@@ -5,19 +5,21 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.
|
||||
the new features that were added and bugs fixed in the GlusterFS 3.12 stable release.
|
||||
|
||||
## Major issues
|
||||
1. Expanding a gluster volume that is sharded may cause file corruption
|
||||
|
||||
1. Expanding a gluster volume that is sharded may cause file corruption
|
||||
|
||||
- Sharded volumes are typically used for VM images, if such volumes are
|
||||
expanded or possibly contracted (i.e add/remove bricks and rebalance) there
|
||||
are reports of VM images getting corrupted.
|
||||
expanded or possibly contracted (i.e add/remove bricks and rebalance) there
|
||||
are reports of VM images getting corrupted.
|
||||
- The last known cause for corruption #1465123 is still pending, and not yet
|
||||
part of this release.
|
||||
part of this release.
|
||||
|
||||
## Bugs addressed
|
||||
|
||||
A total of 13 patches have been merged, addressing 12 bugs
|
||||
|
||||
- [#1478411](https://bugzilla.redhat.com/1478411): Directory listings on fuse mount are very slow due to small number of getdents() entries
|
||||
- [#1511782](https://bugzilla.redhat.com/1511782): In Replica volume 2*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon
|
||||
- [#1511782](https://bugzilla.redhat.com/1511782): In Replica volume 2\*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon
|
||||
- [#1512432](https://bugzilla.redhat.com/1512432): Test bug-1483058-replace-brick-quorum-validation.t fails inconsistently
|
||||
- [#1513258](https://bugzilla.redhat.com/1513258): NetBSD port
|
||||
- [#1514380](https://bugzilla.redhat.com/1514380): default timeout of 5min not honored for analyzing split-brain files post setfattr replica.split-brain-heal-finalize
|
||||
|
||||
@@ -4,16 +4,19 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.
|
||||
[3.12.2](3.12.2.md), [3.12.3](3.12.3.md), [3.12.4](3.12.4.md), [3.12.5](3.12.5.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.12 stable release.
|
||||
|
||||
## Major issues
|
||||
1. Expanding a gluster volume that is sharded may cause file corruption
|
||||
|
||||
1. Expanding a gluster volume that is sharded may cause file corruption
|
||||
|
||||
- Sharded volumes are typically used for VM images, if such volumes are
|
||||
expanded or possibly contracted (i.e add/remove bricks and rebalance) there
|
||||
are reports of VM images getting corrupted.
|
||||
expanded or possibly contracted (i.e add/remove bricks and rebalance) there
|
||||
are reports of VM images getting corrupted.
|
||||
- The last known cause for corruption #1465123 is still pending, and not yet
|
||||
part of this release.
|
||||
part of this release.
|
||||
|
||||
## Bugs addressed
|
||||
|
||||
A total of 12 patches have been merged, addressing 11 bugs
|
||||
|
||||
- [#1489043](https://bugzilla.redhat.com/1489043): The number of bytes of the quota specified in version 3.7 or later is incorrect
|
||||
- [#1511301](https://bugzilla.redhat.com/1511301): In distribute volume after glusterd restart, brick goes offline
|
||||
- [#1525850](https://bugzilla.redhat.com/1525850): rdma transport may access an obsolete item in gf_rdma_device_t->all_mr, and causes glusterfsd/glusterfs process crash.
|
||||
|
||||
@@ -3,29 +3,32 @@
|
||||
This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.12.1.md), [3.12.2](3.12.2.md), [3.12.3](3.12.3.md), [3.12.4](3.12.4.md), [3.12.5](3.12.5.md), [3.12.5](3.12.6.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.12 stable release.
|
||||
|
||||
## Major issues
|
||||
1. Expanding a gluster volume that is sharded may cause file corruption
|
||||
|
||||
1. Expanding a gluster volume that is sharded may cause file corruption
|
||||
|
||||
- Sharded volumes are typically used for VM images, if such volumes are
|
||||
expanded or possibly contracted (i.e add/remove bricks and rebalance) there
|
||||
are reports of VM images getting corrupted.
|
||||
expanded or possibly contracted (i.e add/remove bricks and rebalance) there
|
||||
are reports of VM images getting corrupted.
|
||||
- The last known cause for corruption #1465123 is still pending, and not yet
|
||||
part of this release.
|
||||
part of this release.
|
||||
|
||||
## Bugs addressed
|
||||
|
||||
A total of 16 patches have been merged, addressing 16 bugs
|
||||
|
||||
- [#1510342](https://bugzilla.redhat.com/1510342): Not all files synced using geo-replication
|
||||
- [#1533269](https://bugzilla.redhat.com/1533269): Random GlusterFSD process dies during rebalance
|
||||
- [#1534847](https://bugzilla.redhat.com/1534847): entries not getting cleared post healing of softlinks (stale entries showing up in heal info)
|
||||
- [#1536334](https://bugzilla.redhat.com/1536334): [Disperse] Implement open fd heal for disperse volume
|
||||
- [#1537346](https://bugzilla.redhat.com/1537346): glustershd/glusterd is not using right port when connecting to glusterfsd process
|
||||
- [#1539516](https://bugzilla.redhat.com/1539516): DHT log messages: Found anomalies in (null) (gfid = 00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0
|
||||
- [#1540224](https://bugzilla.redhat.com/1540224): dht_(f)xattrop does not implement migration checks
|
||||
- [#1540224](https://bugzilla.redhat.com/1540224): dht\_(f)xattrop does not implement migration checks
|
||||
- [#1541267](https://bugzilla.redhat.com/1541267): dht_layout_t leak in dht_populate_inode_for_dentry
|
||||
- [#1541930](https://bugzilla.redhat.com/1541930): A down brick is incorrectly considered to be online and makes the volume to be started without any brick available
|
||||
- [#1542054](https://bugzilla.redhat.com/1542054): tests/bugs/cli/bug-1169302.t fails spuriously
|
||||
- [#1542475](https://bugzilla.redhat.com/1542475): Random failures in tests/bugs/nfs/bug-974972.t
|
||||
- [#1542601](https://bugzilla.redhat.com/1542601): The used space in the volume increases when the volume is expanded
|
||||
- [#1542615](https://bugzilla.redhat.com/1542615): tests/bugs/core/multiplex-limit-issue-151.t fails sometimes in upstream master
|
||||
- [#1542615](https://bugzilla.redhat.com/1542615): tests/bugs/core/multiplex-limit-issue-151.t fails sometimes in upstream master
|
||||
- [#1542826](https://bugzilla.redhat.com/1542826): Mark tests/bugs/posix/bug-990028.t bad on release-3.12
|
||||
- [#1542934](https://bugzilla.redhat.com/1542934): Seeing timer errors in the rebalance logs
|
||||
- [#1543016](https://bugzilla.redhat.com/1543016): dht_lookup_unlink_of_false_linkto_cbk fails with "Permission denied"
|
||||
|
||||
@@ -1,17 +1,19 @@

# Release notes for Gluster 3.12.7

This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.12.1.md), [3.12.2](3.12.2.md), [3.12.3](3.12.3.md), [3.12.4](3.12.4.md), [3.12.5](3.12.5.md), [3.12.6](3.12.6.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.12 stable release.

## Bugs addressed

## Major issues

1. Consider a case in which one of the nodes goes down in a gluster cluster with brick multiplexing enabled. If volume operations are performed, then when the node comes back up, brick processes will fail to come up. The issue is tracked in #1543708 and will be fixed by the next release.

A total of 8 patches have been merged, addressing 8 bugs

- [#1517260](https://bugzilla.redhat.com/1517260): Volume wrong size
|
||||
- [#1543709](https://bugzilla.redhat.com/1543709): Optimize glusterd_import_friend_volume code path
|
||||
- [#1544635](https://bugzilla.redhat.com/1544635): Though files are in split-brain able to perform writes to the file
|
||||
- [#1547841](https://bugzilla.redhat.com/1547841): Typo error in __dht_check_free_space function log message
|
||||
- [#1547841](https://bugzilla.redhat.com/1547841): Typo error in \_\_dht_check_free_space function log message
|
||||
- [#1548078](https://bugzilla.redhat.com/1548078): [Rebalance] "Migrate file failed: <filepath>: failed to get xattr [No data available]" warnings in rebalance logs
|
||||
- [#1548270](https://bugzilla.redhat.com/1548270): DHT calls dht_lookup_everywhere for 1xn volumes
|
||||
- [#1549505](https://bugzilla.redhat.com/1549505): Backport patch to reduce duplicate code in server-rpc-fops.c
|
||||
|
||||
@@ -1,9 +1,11 @@
|
||||
# Release notes for Gluster 3.12.8
|
||||
|
||||
This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.12.1.md), [3.12.2](3.12.2.md), [3.12.3](3.12.3.md), [3.12.4](3.12.4.md), [3.12.5](3.12.5.md), [3.12.6](3.12.6.md), [3.12.7](3.12.7.md) contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.12 stable release.
|
||||
|
||||
## Bugs addressed
|
||||
|
||||
A total of 9 patches have been merged, addressing 9 bugs
|
||||
|
||||
- [#1543708](https://bugzilla.redhat.com/1543708): glusterd fails to attach brick during restart of the node
|
||||
- [#1546627](https://bugzilla.redhat.com/1546627): Syntactical errors in hook scripts for managing SELinux context on bricks
|
||||
- [#1549473](https://bugzilla.redhat.com/1549473): possible memleak in glusterfsd process with brick multiplexing on
|
||||
@@ -12,4 +14,4 @@ This is a bugfix release. The release notes for [3.12.0](3.12.0.md), [3.12.1](3.
|
||||
- [#1558352](https://bugzilla.redhat.com/1558352): [EC] Read performance of EC volume exported over gNFS is significantly lower than write performance
|
||||
- [#1561731](https://bugzilla.redhat.com/1561731): Rebalance failures on a dispersed volume with lookup-optimize enabled
|
||||
- [#1562723](https://bugzilla.redhat.com/1562723): SHD is not healing entries in halo replication
|
||||
- [#1565590](https://bugzilla.redhat.com/1565590): timer: Possible race condition between gf_timer_* routines
|
||||
- [#1565590](https://bugzilla.redhat.com/1565590): timer: Possible race condition between gf*timer*\* routines
|
||||
|
||||
@@ -7,6 +7,7 @@ features that were added and bugs fixed in the GlusterFS 3.12 stable release.

## Major changes, features and limitations addressed in this release

This release contains a fix for a security vulnerability in Gluster as follows,

- http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-1088
- https://nvd.nist.gov/vuln/detail/CVE-2018-1088

@@ -15,11 +15,13 @@ The Gluster heal info CLI now has a 'summary' option displaying the statistics
of entries pending heal, in split-brain and currently being healed, per brick.

Usage:

```
# gluster volume heal <volname> info summary
```

Sample output:

```
Brick <brickname>
Status: Connected
```
@@ -68,7 +70,7 @@ before, even when only 1 brick is online.

Further reference: [mailing list discussions on topic](http://lists.gluster.org/pipermail/gluster-users/2017-September/032524.html)

### Support for max-port range in glusterd.vol

**Notes for users:**

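The body of this note is cut off by the diff above; as an assumed sketch, the setting is placed in /etc/glusterfs/glusterd.vol and caps the highest port that brick processes may be assigned (the option name and value below are illustrative assumptions, not taken from this document):

```
volume management
    # ...existing options...
    option max-port 60999
end-volume
```
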
@@ -102,6 +104,7 @@ endpoint (called gfproxy) on the gluster server nodes, thus thinning the client
stack.

Usage:

```
# gluster volume set <volname> config.gfproxyd enable
```

@@ -110,6 +113,7 @@ The above enables the gfproxy protocol service on the server nodes. To mount a
client that interacts with this end point, use the --thin-client mount option.

Example:

```
# glusterfs --thin-client --volfile-id=<volname> --volfile-server=<host> <mountpoint>
```

@@ -134,6 +138,7 @@ feature is disabled. The option takes a numeric percentage value, that reserves
up to that percentage of disk space.

Usage:

```
# gluster volume set <volname> storage.reserve <number>
```

@@ -146,6 +151,7 @@ Gluster CLI is enhanced with an option to list all connected clients to a volume
volume.

Usage:

```
# gluster volume status <volname/all> client-list
```

@@ -165,6 +171,7 @@ This feature is enabled by default, and can be toggled using the boolean option,

This feature enables users to punch hole in files created on disperse volumes.

Usage:

```
# fallocate -p -o <offset> -l <len> <file_name>
```

@@ -186,7 +193,6 @@ There are currently no statistics included in the `statedump` about the actual
behavior of the memory pools. This means that the efficiency of the memory
pools can not be verified.

### Gluster APIs added to register callback functions for upcalls

**Notes for developers:**

@@ -201,8 +207,8 @@ int glfs_upcall_register (struct glfs *fs, uint32_t event_list,
                          glfs_upcall_cbk cbk, void *data);
int glfs_upcall_unregister (struct glfs *fs, uint32_t event_list);
```

libgfapi [header](https://github.com/gluster/glusterfs/blob/release-3.13/api/src/glfs.h#L970) files include the complete synopsis about these APIs definition and their usage.

**Limitations:**

An application can register only a single callback function for all the upcall
@@ -237,13 +243,15 @@ responses and enable better qualification of the translator stacks.

For usage refer to this [test case](https://github.com/gluster/glusterfs/blob/v3.13.0rc0/tests/features/delay-gen.t).

## Major issues

1. Expanding a gluster volume that is sharded may cause file corruption

    - Sharded volumes are typically used for VM images; if such volumes are
      expanded or possibly contracted (i.e. add/remove bricks and rebalance) there
      are reports of VM images getting corrupted.
    - The last known cause for corruption (Bug #1515434) has a fix with this
      release. As further testing is still in progress, the issue is retained as
      a major issue.
    - Status of this bug can be tracked here, #1515434

## Bugs addressed

@@ -252,13 +260,13 @@ Bugs addressed since release-3.12.0 are listed below.
|
||||
|
||||
- [#1248393](https://bugzilla.redhat.com/1248393): DHT: readdirp fails to read some directories.
|
||||
- [#1258561](https://bugzilla.redhat.com/1258561): Gluster puts PID files in wrong place
|
||||
- [#1261463](https://bugzilla.redhat.com/1261463): AFR : [RFE] Improvements needed in "gluster volume heal info" commands
|
||||
- [#1261463](https://bugzilla.redhat.com/1261463): AFR : [RFE] Improvements needed in "gluster volume heal info" commands
|
||||
- [#1294051](https://bugzilla.redhat.com/1294051): Though files are in split-brain able to perform writes to the file
|
||||
- [#1328994](https://bugzilla.redhat.com/1328994): When a feature fails needing a higher opversion, the message should state what version it needs.
|
||||
- [#1335251](https://bugzilla.redhat.com/1335251): mgmt/glusterd: clang compile warnings in glusterd-snapshot.c
|
||||
- [#1350406](https://bugzilla.redhat.com/1350406): [storage/posix] - posix_do_futimes function not implemented
|
||||
- [#1365683](https://bugzilla.redhat.com/1365683): Fix crash bug when mnt3_resolve_subdir_cbk fails
|
||||
- [#1371806](https://bugzilla.redhat.com/1371806): DHT :- inconsistent 'custom extended attributes',uid and gid, Access permission (for directories) if User set/modifies it after bringing one or more sub-volume down
|
||||
- [#1371806](https://bugzilla.redhat.com/1371806): DHT :- inconsistent 'custom extended attributes',uid and gid, Access permission (for directories) if User set/modifies it after bringing one or more sub-volume down
|
||||
- [#1376326](https://bugzilla.redhat.com/1376326): separating attach tier and add brick
|
||||
- [#1388509](https://bugzilla.redhat.com/1388509): gluster volume heal info "healed" and "heal-failed" showing wrong information
|
||||
- [#1395492](https://bugzilla.redhat.com/1395492): trace/error-gen be turned on together while use 'volume set' command to set one of them
|
||||
@@ -314,14 +322,14 @@ Bugs addressed since release-3.12.0 are listed below.
|
||||
- [#1480099](https://bugzilla.redhat.com/1480099): More useful error - replace 'not optimal'
|
||||
- [#1480445](https://bugzilla.redhat.com/1480445): Log entry of files skipped/failed during rebalance operation
|
||||
- [#1480525](https://bugzilla.redhat.com/1480525): Make choose-local configurable through `volume-set` command
|
||||
- [#1480591](https://bugzilla.redhat.com/1480591): [Scale] : I/O errors on multiple gNFS mounts with "Stale file handle" during rebalance of an erasure coded volume.
|
||||
- [#1480591](https://bugzilla.redhat.com/1480591): [Scale] : I/O errors on multiple gNFS mounts with "Stale file handle" during rebalance of an erasure coded volume.
|
||||
- [#1481199](https://bugzilla.redhat.com/1481199): mempool: run-time crash when built with --disable-mempool
|
||||
- [#1481600](https://bugzilla.redhat.com/1481600): rpc: client_t and related objects leaked due to incorrect ref counts
|
||||
- [#1482023](https://bugzilla.redhat.com/1482023): snpashots issues with other processes accessing the mounted brick snapshots
|
||||
- [#1482344](https://bugzilla.redhat.com/1482344): Negative Test: glusterd crashes for some of the volume options if set at cluster level
|
||||
- [#1482906](https://bugzilla.redhat.com/1482906): /var/lib/glusterd/peers File had a blank line, Stopped Glusterd from starting
|
||||
- [#1482923](https://bugzilla.redhat.com/1482923): afr: check op_ret value in __afr_selfheal_name_impunge
|
||||
- [#1483058](https://bugzilla.redhat.com/1483058): [quorum]: Replace brick is happened when Quorum not met.
|
||||
- [#1482923](https://bugzilla.redhat.com/1482923): afr: check op_ret value in \_\_afr_selfheal_name_impunge
|
||||
- [#1483058](https://bugzilla.redhat.com/1483058): [quorum]: Replace brick is happened when Quorum not met.
|
||||
- [#1483995](https://bugzilla.redhat.com/1483995): packaging: use rdma-core(-devel) instead of ibverbs, rdmacm; disable rdma on armv7hl
|
||||
- [#1484215](https://bugzilla.redhat.com/1484215): Add Deepshika has CI Peer
|
||||
- [#1484225](https://bugzilla.redhat.com/1484225): [rpc]: EPOLLERR - disconnecting now messages every 3 secs after completing rebalance
|
||||
@@ -344,7 +352,7 @@ Bugs addressed since release-3.12.0 are listed below.
|
||||
- [#1488909](https://bugzilla.redhat.com/1488909): Fix the type of 'len' in posix.c, clang is showing a warning
|
||||
- [#1488913](https://bugzilla.redhat.com/1488913): Sub-directory mount details are incorrect in /proc/mounts
|
||||
- [#1489432](https://bugzilla.redhat.com/1489432): disallow replace brick operation on plain distribute volume
|
||||
- [#1489823](https://bugzilla.redhat.com/1489823): set the shard-block-size to 64MB in virt profile
|
||||
- [#1489823](https://bugzilla.redhat.com/1489823): set the shard-block-size to 64MB in virt profile
|
||||
- [#1490642](https://bugzilla.redhat.com/1490642): glusterfs client crash when removing directories
|
||||
- [#1490897](https://bugzilla.redhat.com/1490897): GlusterD returns a bad memory pointer in glusterd_get_args_from_dict()
|
||||
- [#1491025](https://bugzilla.redhat.com/1491025): rpc: TLSv1_2_method() is deprecated in OpenSSL-1.1
|
||||
@@ -408,13 +416,13 @@ Bugs addressed since release-3.12.0 are listed below.
|
||||
- [#1510022](https://bugzilla.redhat.com/1510022): Revert experimental and 4.0 features to prepare for 3.13 release
|
||||
- [#1511274](https://bugzilla.redhat.com/1511274): Rebalance estimate(ETA) shows wrong details(as intial message of 10min wait reappears) when still in progress
|
||||
- [#1511293](https://bugzilla.redhat.com/1511293): In distribute volume after glusterd restart, brick goes offline
|
||||
- [#1511768](https://bugzilla.redhat.com/1511768): In Replica volume 2*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon
|
||||
- [#1511768](https://bugzilla.redhat.com/1511768): In Replica volume 2\*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon
|
||||
- [#1512435](https://bugzilla.redhat.com/1512435): Test bug-1483058-replace-brick-quorum-validation.t fails inconsistently
|
||||
- [#1512460](https://bugzilla.redhat.com/1512460): disperse eager-lock degrades performance for file create workloads
|
||||
- [#1513259](https://bugzilla.redhat.com/1513259): NetBSD port
|
||||
- [#1514419](https://bugzilla.redhat.com/1514419): gluster volume splitbrain info needs to display output of each brick in a stream fashion instead of buffering and dumping at the end
|
||||
- [#1515045](https://bugzilla.redhat.com/1515045): bug-1247563.t is failing on master
|
||||
- [#1515572](https://bugzilla.redhat.com/1515572): Accessing a file when source brick is down results in that FOP being hung
|
||||
- [#1515572](https://bugzilla.redhat.com/1515572): Accessing a file when source brick is down results in that FOP being hung
|
||||
- [#1516313](https://bugzilla.redhat.com/1516313): Bringing down data bricks in cyclic order results in arbiter brick becoming the source for heal.
|
||||
- [#1517692](https://bugzilla.redhat.com/1517692): Memory leak in locks xlator
|
||||
- [#1518257](https://bugzilla.redhat.com/1518257): EC DISCARD doesn't punch hole properly
|
||||
|
||||
@@ -5,17 +5,19 @@ contain a listing of all the new features that were added and
|
||||
bugs fixed in the GlusterFS 3.13 stable release.
|
||||
|
||||
## Major changes, features and limitations addressed in this release
|
||||
|
||||
**No Major changes**
|
||||
|
||||
## Major issues
|
||||
1. Expanding a gluster volume that is sharded may cause file corruption
|
||||
|
||||
1. Expanding a gluster volume that is sharded may cause file corruption
|
||||
|
||||
- Sharded volumes are typically used for VM images, if such volumes are
|
||||
expanded or possibly contracted (i.e add/remove bricks and rebalance) there
|
||||
are reports of VM images getting corrupted.
|
||||
expanded or possibly contracted (i.e add/remove bricks and rebalance) there
|
||||
are reports of VM images getting corrupted.
|
||||
- The last known cause for corruption (Bug #1515434) is still under review.
|
||||
- Status of this bug can be tracked here, [#1515434](https://bugzilla.redhat.com/1515434)
|
||||
|
||||
|
||||
## Bugs addressed
|
||||
|
||||
Bugs addressed since release-3.13.0 are listed below.
|
||||
|
||||
@@ -5,9 +5,11 @@ contain a listing of all the new features that were added and
bugs fixed in the GlusterFS 3.13 stable release.

## Major changes, features and limitations addressed in this release

**No Major changes**

## Major issues

**No Major issues**

## Bugs addressed

@@ -15,7 +17,7 @@ bugs fixed in the GlusterFS 3.13 stable release.
|
||||
Bugs addressed since release-3.13.1 are listed below.
|
||||
|
||||
- [#1511293](https://bugzilla.redhat.com/1511293): In distribute volume after glusterd restart, brick goes offline
|
||||
- [#1515434](https://bugzilla.redhat.com/1515434): dht_(f)xattrop does not implement migration checks
|
||||
- [#1515434](https://bugzilla.redhat.com/1515434): dht\_(f)xattrop does not implement migration checks
|
||||
- [#1516313](https://bugzilla.redhat.com/1516313): Bringing down data bricks in cyclic order results in arbiter brick becoming the source for heal.
|
||||
- [#1529055](https://bugzilla.redhat.com/1529055): Test case ./tests/bugs/bug-1371806_1.t is failing
|
||||
- [#1529084](https://bugzilla.redhat.com/1529084): fstat returns ENOENT/ESTALE
|
||||
|
||||
@@ -28,6 +28,7 @@ to files in glusterfs using its GFID
|
||||
For more information refer [here](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/gfid%20access.md).
|
||||
|
||||
### Prevent NFS restart on Volume change

Earlier any volume change (volume option, volume start, volume stop, volume
delete, brick add, etc.) required restarting the NFS server.
|
||||
|
||||
@@ -48,7 +49,7 @@ directory read performance.

zerofill feature allows creation of pre-allocated and zeroed-out files on
GlusterFS volumes by offloading the zeroing part to server and/or storage
(storage offloads use SCSI WRITESAME), thereby achieving quick creation of
pre-allocated and zeroed-out VM disk images by using server/storage off-loads.

For more information refer [here](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/Zerofill.md).

@@ -93,7 +94,7 @@ The Volume group is represented as directory and logical volumes as files.

remove-brick CLI earlier used to remove the brick forcefully ( without data migration ),
when called without any arguments. This mode of 'remove-brick' cli, without any
arguments, has been deprecated.

### Experimental Features
|
||||
|
||||
@@ -126,24 +127,26 @@ The following features are experimental with this release:
|
||||
|
||||
- AUTH support for exported nfs sub-directories added
|
||||
|
||||
|
||||
### Known Issues:

- The following configuration changes are necessary for qemu and samba
  integration with libgfapi to work seamlessly:

    ```{ .text .no-copy }
    1) gluster volume set <volname> server.allow-insecure on

    2) Edit /etc/glusterfs/glusterd.vol to contain this line:
          option rpc-auth-allow-insecure on

    Post 1), restarting the volume would be necessary.
    Post 2), restarting glusterd would be necessary.
    ```

- RDMA connection manager needs IPoIB for connection establishment. More
  details can be found [here](https://github.com/gluster/glusterfs-specs/blob/master/done/Features/rdmacm.md).

- For Block Device translator based volumes open-behind translator at the
  client side needs to be disabled.

- libgfapi clients calling glfs_fini before a successful glfs_init will cause the client to
  hang as reported [here](http://lists.gnu.org/archive/html/gluster-devel/2014-04/msg00179.html).
|
||||
|
||||
@@ -15,83 +15,82 @@ additions:
|
||||
|
||||
### Bugs Fixed:

- [765202](https://bugzilla.redhat.com/765202): lgetxattr called with invalid keys on the bricks
- [833586](https://bugzilla.redhat.com/833586): inodelk hang from marker_rename_release_newp_lock
- [859581](https://bugzilla.redhat.com/859581): self-heal process can sometimes create directories instead of symlinks for the root gfid file in .glusterfs
- [986429](https://bugzilla.redhat.com/986429): Backupvolfile server option should work internal to GlusterFS framework
- [1039544](https://bugzilla.redhat.com/1039544): [FEAT] "gluster volume heal info" should list the entries that actually required to be healed.
- [1046624](https://bugzilla.redhat.com/1046624): Unable to heal symbolic Links
- [1046853](https://bugzilla.redhat.com/1046853): AFR : For every file self-heal there are warning messages reported in glustershd.log file
- [1063190](https://bugzilla.redhat.com/1063190): Volume was not accessible after server side quorum was met
- [1064096](https://bugzilla.redhat.com/1064096): The old Python Translator code (not Glupy) should be removed
- [1066996](https://bugzilla.redhat.com/1066996): Using sanlock on a gluster mount with replica 3 (quorum-type auto) leads to a split-brain
- [1071191](https://bugzilla.redhat.com/1071191): [3.5.1] Sporadic SIGBUS with mmap() on a sparse file created with open(), seek(), write()
- [1078061](https://bugzilla.redhat.com/1078061): Need ability to heal mismatching user extended attributes without any changelogs
- [1078365](https://bugzilla.redhat.com/1078365): New xlators are linked as versioned .so files, creating <xlator>.so.0.0.0
- [1086743](https://bugzilla.redhat.com/1086743): Add documentation for the Feature: RDMA-connection manager (RDMA-CM)
- [1086748](https://bugzilla.redhat.com/1086748): Add documentation for the Feature: AFR CLI enhancements
- [1086749](https://bugzilla.redhat.com/1086749): Add documentation for the Feature: Exposing Volume Capabilities
- [1086750](https://bugzilla.redhat.com/1086750): Add documentation for the Feature: File Snapshots in GlusterFS
- [1086751](https://bugzilla.redhat.com/1086751): Add documentation for the Feature: gfid-access
- [1086752](https://bugzilla.redhat.com/1086752): Add documentation for the Feature: On-Wire Compression/Decompression
- [1086754](https://bugzilla.redhat.com/1086754): Add documentation for the Feature: Quota Scalability
- [1086755](https://bugzilla.redhat.com/1086755): Add documentation for the Feature: readdir-ahead
- [1086756](https://bugzilla.redhat.com/1086756): Add documentation for the Feature: zerofill API for GlusterFS
- [1086758](https://bugzilla.redhat.com/1086758): Add documentation for the Feature: Changelog based parallel geo-replication
- [1086760](https://bugzilla.redhat.com/1086760): Add documentation for the Feature: Write Once Read Many (WORM) volume
- [1086762](https://bugzilla.redhat.com/1086762): Add documentation for the Feature: BD Xlator - Block Device translator
- [1086766](https://bugzilla.redhat.com/1086766): Add documentation for the Feature: Libgfapi
- [1086774](https://bugzilla.redhat.com/1086774): Add documentation for the Feature: Access Control List - Version 3 support for Gluster NFS
- [1086781](https://bugzilla.redhat.com/1086781): Add documentation for the Feature: Eager locking
- [1086782](https://bugzilla.redhat.com/1086782): Add documentation for the Feature: glusterfs and oVirt integration
- [1086783](https://bugzilla.redhat.com/1086783): Add documentation for the Feature: qemu 1.3 - libgfapi integration
- [1088848](https://bugzilla.redhat.com/1088848): Spelling errors in rpc/rpc-transport/rdma/src/rdma.c
- [1089054](https://bugzilla.redhat.com/1089054): gf-error-codes.h is missing from source tarball
- [1089470](https://bugzilla.redhat.com/1089470): SMB: Crash on brick process during compile kernel.
- [1089934](https://bugzilla.redhat.com/1089934): list dir with more than N files results in Input/output error
- [1091340](https://bugzilla.redhat.com/1091340): Doc: Add glfs_fini known issue to release notes 3.5
- [1091392](https://bugzilla.redhat.com/1091392): glusterfs.spec.in: minor/nit changes to sync with Fedora spec
- [1095256](https://bugzilla.redhat.com/1095256): Excessive logging from self-heal daemon, and bricks
- [1095595](https://bugzilla.redhat.com/1095595): Stick to IANA standard while allocating brick ports
- [1095775](https://bugzilla.redhat.com/1095775): Add support in libgfapi to fetch volume info from glusterd.
- [1095971](https://bugzilla.redhat.com/1095971): Stopping/Starting a Gluster volume resets ownership
- [1096040](https://bugzilla.redhat.com/1096040): AFR : self-heal-daemon not clearing the change-logs of all the sources after self-heal
- [1096425](https://bugzilla.redhat.com/1096425): i/o error when one user tries to access RHS volume over NFS with 100+ GIDs
- [1099878](https://bugzilla.redhat.com/1099878): Need support for handle based Ops to fetch/modify extended attributes of a file
- [1101647](https://bugzilla.redhat.com/1101647): gluster volume heal volname statistics heal-count not giving desired output.
- [1102306](https://bugzilla.redhat.com/1102306): license: xlators/features/glupy dual license GPLv2 and LGPLv3+
- [1103413](https://bugzilla.redhat.com/1103413): Failure in gf_log_init reopening stderr
- [1104592](https://bugzilla.redhat.com/1104592): heal info may give Success instead of transport end point not connected when a brick is down.
- [1104915](https://bugzilla.redhat.com/1104915): glusterfsd crashes while doing stress tests
- [1104919](https://bugzilla.redhat.com/1104919): Fix memory leaks in gfid-access xlator.
- [1104959](https://bugzilla.redhat.com/1104959): Dist-geo-rep : some of the files not accessible on slave after the geo-rep sync from master to slave.
- [1105188](https://bugzilla.redhat.com/1105188): Two instances each, of brick processes, glusterfs-nfs and quotad seen after glusterd restart
- [1105524](https://bugzilla.redhat.com/1105524): Disable nfs.drc by default
- [1107937](https://bugzilla.redhat.com/1107937): quota-anon-fd-nfs.t fails spuriously
- [1109832](https://bugzilla.redhat.com/1109832): I/O fails for for glusterfs 3.4 AFR clients accessing servers upgraded to glusterfs 3.5
- [1110777](https://bugzilla.redhat.com/1110777): glusterfsd OOM - using all memory when quota is enabled

### Known Issues:
- The following configuration changes are necessary for qemu and samba
integration with libgfapi to work seamlessly:

1. gluster volume set <volname> server.allow-insecure on
2. restarting the volume is necessary

~~~
gluster volume stop <volname>
gluster volume start <volname>
~~~

3. Edit `/etc/glusterfs/glusterd.vol` to contain this line:

~~~
option rpc-auth-allow-insecure on
~~~

4. restarting glusterd is necessary

~~~
service glusterd restart
~~~

More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page.

- For Block Device translator based volumes, the open-behind translator at the client side needs to be disabled.

@@ -104,5 +103,5 @@ additions:
- After enabling `server.manage-gids`, the volume needs to be stopped and
started again to have the option enabled in the brick processes
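
A hedged sketch of the enabling step that precedes the restart (the volume
name is a placeholder):

    gluster volume set <volname> server.manage-gids on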

    gluster volume stop <volname>
    gluster volume start <volname>

@@ -4,12 +4,12 @@ This is mostly a bugfix release. The [Release Notes for 3.5.0](./3.5.0.md) and [
### Bugs Fixed:

- [1096020](https://bugzilla.redhat.com/1096020): NFS server crashes in \_socket_read_vectored_request
- [1100050](https://bugzilla.redhat.com/1100050): Can't write to quota enable folder
- [1103050](https://bugzilla.redhat.com/1103050): nfs: reset command does not alter the result for nfs options earlier set
- [1105891](https://bugzilla.redhat.com/1105891): features/gfid-access: stat on .gfid virtual directory return EINVAL
- [1111454](https://bugzilla.redhat.com/1111454): creating symlinks generates errors on stripe volume
- [1112111](https://bugzilla.redhat.com/1112111): Self-heal errors with "afr crawl failed for child 0 with ret -1" while performing rolling upgrade.
- [1112348](https://bugzilla.redhat.com/1112348): [AFR] I/O fails when one of the replica nodes go down
- [1112659](https://bugzilla.redhat.com/1112659): Fix inode leaks in gfid-access xlator
- [1112980](https://bugzilla.redhat.com/1112980): NFS subdir authentication doesn't correctly handle multi-(homed,protocol,etc) network addresses
@@ -18,8 +18,8 @@ This is mostly a bugfix release. The [Release Notes for 3.5.0](./3.5.0.md) and [
- [1113749](https://bugzilla.redhat.com/1113749): client_t clienttable cliententries are never expanded when all entries are used
- [1113894](https://bugzilla.redhat.com/1113894): AFR : self-heal of few files not happening when a AWS EC2 Instance is back online after a restart
- [1113959](https://bugzilla.redhat.com/1113959): Spec %post server does not wait for the old glusterd to exit
- [1114501](https://bugzilla.redhat.com/1114501): Dist-geo-rep : deletion of files on master, geo-rep fails to propagate to slaves.
- [1115369](https://bugzilla.redhat.com/1115369): Allow the usage of the wildcard character '\*' to the options "nfs.rpc-auth-allow" and "nfs.rpc-auth-reject"
- [1115950](https://bugzilla.redhat.com/1115950): glfsheal: Improve the way in which we check the presence of replica volumes
- [1116672](https://bugzilla.redhat.com/1116672): Resource cleanup doesn't happen for clients on servers after disconnect
- [1116997](https://bugzilla.redhat.com/1116997): mounting a volume over NFS (TCP) with MOUNT over UDP fails
@@ -32,34 +32,33 @@ This is mostly a bugfix release. The [Release Notes for 3.5.0](./3.5.0.md) and [
- The following configuration changes are necessary for 'qemu' and 'samba vfs
plugin' integration with libgfapi to work seamlessly:

1. gluster volume set <volname> server.allow-insecure on
2. restarting the volume is necessary

```
gluster volume stop <volname>
gluster volume start <volname>
```

3. Edit `/etc/glusterfs/glusterd.vol` to contain this line:

```
option rpc-auth-allow-insecure on
```

4. restarting glusterd is necessary

```
service glusterd restart
```

More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page.

- For Block Device translator based volumes, the open-behind translator at the
client side needs to be disabled.

    gluster volume set <volname> performance.open-behind disabled

- libgfapi clients calling `glfs_fini` before a successful `glfs_init` will cause the client to
hang as reported [here](http://lists.gnu.org/archive/html/gluster-devel/2014-04/msg00179.html).
The workaround is NOT to call `glfs_fini` for error cases encountered before a successful
`glfs_init`.

@@ -10,7 +10,7 @@ features that were added and bugs fixed in the GlusterFS 3.5 stable release.
- [1100204](https://bugzilla.redhat.com/1100204): brick failure detection does not work for ext4 filesystems
- [1126801](https://bugzilla.redhat.com/1126801): glusterfs logrotate config file pollutes global config
- [1129527](https://bugzilla.redhat.com/1129527): DHT :- data loss - file is missing on renaming same file from multiple client at same time
- [1129541](https://bugzilla.redhat.com/1129541): [DHT:REBALANCE]: Rebalance failures are seen with error message " remote operation failed: File exists"
- [1132391](https://bugzilla.redhat.com/1132391): NFS interoperability problem: stripe-xlator removes EOF at end of READDIR
- [1133949](https://bugzilla.redhat.com/1133949): Minor typo in afr logging
- [1136221](https://bugzilla.redhat.com/1136221): The memories are exhausted quickly when handle the message which has multi fragments in a single record
@@ -44,27 +44,27 @@ features that were added and bugs fixed in the GlusterFS 3.5 stable release.
- The following configuration changes are necessary for 'qemu' and 'samba vfs
plugin' integration with libgfapi to work seamlessly:

1. gluster volume set <volname> server.allow-insecure on
2. restarting the volume is necessary

```
gluster volume stop <volname>
gluster volume start <volname>
```

3. Edit `/etc/glusterfs/glusterd.vol` to contain this line:

```
option rpc-auth-allow-insecure on
```

4. restarting glusterd is necessary

```
service glusterd restart
```

More details are also documented in the Gluster Wiki on the [Libgfapi with qemu libvirt](https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.5/libgfapi%20with%20qemu%20libvirt.md) page.

- For Block Device translator based volumes, the open-behind translator at the
client side needs to be disabled.