mirror of
https://github.com/gluster/glusterdocs.git
synced 2026-02-05 15:47:01 +01:00
Removing gluster developer guide as it is maintained
@https://github.com/gluster/glusterfs/tree/master/doc/developer-guide

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
@@ -1,41 +0,0 @@
Bugs are often fixed in master before they are fixed in release
branches. When a bug is fixed in the master branch, the fix may also be
desirable or necessary in a stable branch. To get it there, the fix has
to be backported to that stable branch.

Anyone in the community can suggest a backport. If you would like to
suggest one, please check the [Backport
Wishlist](./Backport Wishlist.md).

This page describes the steps needed to backport simple changes. Changes
that do not apply cleanly will need some manual modifications, and using
`git cherry-pick` may not always be the easiest solution.

1.  Clone the GlusterFS code

        git clone ssh://username@review.gluster.org/glusterfs

2.  Create and check out a new branch for your work, based on the branch
    for the backport version

        git checkout -t -b bug-123456/release-3.5 origin/release-3.5

3.  Cherry-pick the change from master.

        $ git cherry-pick -x a0b1c2d3e4f5

    - First verify that the change has been merged in the master branch.

4.  Update/correct the commit message.

        $ git commit -s --amend --date="$(date)"

    [This is one example](https://github.com/gluster/glusterfs/commit/40407afb529f6e5fa2f79e9778c2f527122d75eb) of a commit message that has a good description for a backport. Notice the indentation of the patch metadata such as the BUG, Change-Id and Reviewed-on tags. The message also contains the original commit ID that was cherry-picked from the master branch.

    - make sure to quote the review tags
    - update the BUG reference, point to the BUG that is used for this
      particular release branch
    - add a Signed-off-by tag

5.  Run `./rfc.sh` to post the backport for review.

        ./rfc.sh

After submitting the patch(es), make sure to move the bug to the *POST*
status.
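
The cherry-pick steps above can be rehearsed without touching the real repository. The following is a minimal sketch, assuming only that `git` is installed: it builds a throwaway local repository in which a local `release-3.5` branch stands in for `origin/release-3.5`. The `backport_branch` helper, the user identity and the commit messages are hypothetical and exist only for this illustration; the `--date` option from step 4 is left out to keep the sketch short.

```shell
#!/bin/sh
# Rehearse steps 2-4 of the backport procedure in a scratch repository.
set -e

# Hypothetical helper: build the topic-branch name used in step 2.
backport_branch() {
    # $1 = bug number, $2 = release branch name
    printf 'bug-%s/%s\n' "$1" "$2"
}

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "initial commit"
git branch release-3.5                    # stands in for origin/release-3.5

echo fix >> file.txt
git commit -q -a -m "core: fix the example bug"
fix_commit=$(git rev-parse HEAD)          # the change that was merged in master

# Step 2: topic branch based on the release branch.
branch=$(backport_branch 123456 release-3.5)
git checkout -q -b "$branch" release-3.5

# Step 3: -x records the original commit ID in the commit message.
git cherry-pick -x "$fix_commit" >/dev/null

# Step 4: -s adds the Signed-off-by tag; a real backport would also edit
# the BUG reference here instead of using --no-edit.
git commit -q -s --amend --no-edit

git log -1 --format=%B
```

The final `git log` prints the backported commit message, which should end with the "cherry picked from commit" line written by `-x` followed by the Signed-off-by tag.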
@@ -1,193 +0,0 @@
Bugs often get fixed in master before release branches.

When a bug is fixed in the master branch it might be desirable or
necessary to backport the fix to a stable branch.

This page is intended to help organize support (and prioritization) for
backporting bug fixes of importance to the community.

### GlusterFS 3.6

Requested Backports for 3.6.0
-----------------------------

The tracker bug for 3.6.0:
<https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.6.0>

Please add 'glusterfs-3.6.0' in the 'Blocks' field of bugs to propose
inclusion in GlusterFS 3.6.0.

### GlusterFS 3.5

Requested Backports for 3.5.3
-----------------------------

Current [list of bugs planned for
inclusion](https://bugzilla.redhat.com/showdependencytree.cgi?hide_resolved=0&id=glusterfs-3.5.3).

- File a new bug for backporting a patch to 3.5.3:
  [new glusterfs-3.5.3 backport request](https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&blocked=glusterfs-3.5.3&version=3.5.2&short_desc=backport%20request%20for%20...)

### GlusterFS 3.4

Requested Backports for 3.4.6
-----------------------------

The tracker bug for 3.4.6:
<https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.4.6>

Please add 'glusterfs-3.4.6' in the 'Blocks' field of bugs to propose
inclusion in GlusterFS 3.4.6.

<https://bugzilla.redhat.com:443/show_bug.cgi?id=1116150>
<https://bugzilla.redhat.com:443/show_bug.cgi?id=1117851>

Requested Backports for 3.4.4
-----------------------------

<https://bugzilla.redhat.com/show_bug.cgi?id=859581> - "self-heal
process can sometimes create directories instead of symlinks for the
root gfid file in .glusterfs"

<https://bugzilla.redhat.com/show_bug.cgi?id=1041109> - "structure needs
cleaning" message appear when accessing files.

<https://bugzilla.redhat.com/show_bug.cgi?id=1073023> - glusterfs mount
crash after remove brick, detach peer and termination

Requested Backports for 3.4.3
-----------------------------

<https://bugzilla.redhat.com/show_bug.cgi?id=859581> - "self-heal
process can sometimes create directories instead of symlinks for the
root gfid file in .glusterfs"

<https://bugzilla.redhat.com/show_bug.cgi?id=1041109> - "structure needs
cleaning" message appear when accessing files.

<https://bugzilla.redhat.com/show_bug.cgi?id=977492> - large NFS writes
to Gluster slow down then stop

<https://bugzilla.redhat.com/show_bug.cgi?id=1073023> - glusterfs mount
crash after remove brick, detach peer and termination

Requested Backports for 3.3.3
-----------------------------

[Enable fusermount by default, make nightly autobuilding
work](https://bugzilla.redhat.com/1058666)

Requested Backports for 3.4.2
-----------------------------

Please enter bugzilla ID or patch URL here:

1) Until RDMA handling is improved, we should output a warning when
using RDMA volumes -
<https://bugzilla.redhat.com/show_bug.cgi?id=1017176>

2) Unable to shrink volumes without dataloss -
<https://bugzilla.redhat.com/show_bug.cgi?id=1024369>

3) cluster/dht: Allow non-local clients to function with nufa volumes -
<http://review.gluster.org/5414>

Requested Backports for 3.4.1
-----------------------------

Please enter bugzilla ID or patch URL here.

<https://bugzilla.redhat.com/show_bug.cgi?id=812230> - "quota context
not set in inode"

<https://bugzilla.redhat.com/show_bug.cgi?id=893778> - "NFS crash bug"

A note for whoever reviews this list: These are the fixes for issues
that have caused actual service disruption in our production
installation and thus are absolutely required for us (-- Lubomir
Rintel):

<https://bugzilla.redhat.com/show_bug.cgi?id=994392> - "Setting ACL
entries fails with glusterfs-3.4.0"

<https://bugzilla.redhat.com/show_bug.cgi?id=991622> - "fd leaks
observed while running dbench with "open-behind" volume option set to
"on" on a replicate volume"

These are issues that we've stumbled upon during the git log review and
that seemed scary enough for us to cherry-pick them to avoid risk,
despite not being actually hit. Hope that helps deciding whether it's
worthwhile cherry-picking them (-- Lubomir Rintel):

<https://bugzilla.redhat.com/show_bug.cgi?id=961691> "CLI crash upon
executing "gluster peer status" command"

<https://bugzilla.redhat.com/show_bug.cgi?id=965995> "quick-read and
open-behind xlator: Make options (volume\_options ) structure NULL
terminated."

<https://bugzilla.redhat.com/show_bug.cgi?id=958691> "nfs-root-squash:
rename creates a file on a file residing inside a sticky bit set
directory"

<https://bugzilla.redhat.com/show_bug.cgi?id=982919> "DHT : files are
stored on directory which doesn't have hash range(hash layout)"

<https://bugzilla.redhat.com/show_bug.cgi?id=976189> "statedump crashes
in ioc\_inode\_dump"

<https://bugzilla.redhat.com/show_bug.cgi?id=982174> "cli crashes when
setting diagnostics.client-log-level is set to trace"

<https://bugzilla.redhat.com/show_bug.cgi?id=989579> "glusterfsd crashes
on smallfile benchmark"

<http://review.gluster.org/5821>, "tests: call 'cleanup' at the end of
each test", <https://bugzilla.redhat.com/show_bug.cgi?id=1004756>,
backport of 983975

<http://review.gluster.org/5822>, "glusterfs-api.pc.in contains an
rpath", <https://bugzilla.redhat.com/show_bug.cgi?id=1004751>, backport
of 1002220

<http://review.gluster.org/5824> "glusterd.service (systemd), ensure
glusterd starts before any local gluster mounts",
<https://bugzilla.redhat.com/show_bug.cgi?id=1004796>, backport of
1004795

<https://bugzilla.redhat.com/show_bug.cgi?id=819130> meta, check that
glusterfs.spec.in has all relevant updates

<https://bugzilla.redhat.com/show_bug.cgi?id=1012400> - Glusterd would
not store all the volumes when a global options were set leading to peer
rejection

Requested Backports
-------------------

- Please backport [gfapi: Closed the logfile fd and initialize to NULL
  in glfs\_fini](http://review.gluster.org/#/c/6552) into release-3.5 - Done
- Please backport [cluster/dht: Make sure loc has
  gfid](http://review.gluster.org/5178) into release-3.4
- Please backport [Bug 887098](http://goo.gl/QjeMP) into release-3.3
  (FyreFoX) - Done
- Please backport [Bug 856341](http://goo.gl/9cGAC) into release-3.2
  and release-3.3 (the-me o/b/o Debian) - Done for release-3.3
- Please backport [Bug 895656](http://goo.gl/ZNs3J) into release-3.2
  and release-3.3 (semiosis, x4rlos) - Done for release-3.3
- Please backport [Bug 918437](http://goo.gl/1QRyw) into release-3.3
  (tjstansell) - Done
- Please backport [Bug
  884597](https://bugzilla.redhat.com/show_bug.cgi?id=884597) into
  release-3.3 (nocko) - Done

Unaddressed bugs
----------------

- [Bug 838784](https://bugzilla.redhat.com/show_bug.cgi?id=838784)
- [Bug 893778](https://bugzilla.redhat.com/show_bug.cgi?id=893778)
- [Bug 913699](https://bugzilla.redhat.com/show_bug.cgi?id=913699);
  possibly related to [Bug
  884597](https://bugzilla.redhat.com/show_bug.cgi?id=884597)
@@ -1,128 +0,0 @@
Before filing a bug
-------------------

If you run into an issue, these preliminary checks are useful:

- Is SELinux enabled? (you can use `getenforce` to check)
- Are iptables rules blocking any data traffic? (`iptables -L` can
  help check)
- Are all the nodes reachable from each other? [ Network problem ]
- Please search Bugzilla to see if the bug has already been reported
    - Choose GlusterFS as the "product", and then type something
      relevant in the "words" box. If you are seeing a crash or abort,
      searching for part of the abort message might be effective. If
      you are feeling adventurous you can select the "Advanced search"
      tab; this gives a lot more control but isn't much better for
      finding existing bugs.
    - If a bug has already been filed for a particular release and you
      found the bug in another release,
        - please clone the existing bug for the release in which you
          found the issue.
        - If the existing bug is against mainline and you found the
          issue in a release, set the *depends on* field of the cloned
          bug to the BZ of the mainline bug.

Anyone can search in Bugzilla; you don't need an account. Searching
requires some effort, but it helps avoid duplicates, and you may find that
your problem has already been solved.
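
The first three checks in the list above can be scripted. This is only a sketch: the `check` helper and the `PEERS` variable (a space-separated list of the other nodes) are hypothetical names introduced here, and `iptables -L` usually needs root, so the helper reports a failure instead of stopping.

```shell
#!/bin/sh
# Run the preliminary checks, skipping tools that are not installed.

# Hypothetical helper: print a heading, then run the command if it exists.
check() {
    # $1 = description, remaining arguments = command to run
    desc=$1
    shift
    echo "== $desc"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(command failed; it may need root)"
    else
        echo "(skipped: $1 not installed)"
    fi
}

check "SELinux mode" getenforce
check "iptables rules" iptables -L

# PEERS is an assumed variable, e.g. PEERS="node2 node3".
for peer in ${PEERS:-}; do
    check "reachability of $peer" ping -c 1 "$peer"
done
```

Keeping the output of such a run is useful, because the same details are asked for again when a bug is reported.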

Reporting A Bug
---------------

- You should have a Bugzilla account
- Here is the link to file a bug:
  [Bugzilla](https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS)
- The template for filing a bug can be found
  [*here*](./Bug reporting template.md)

*Note: Please go through all the sections below to understand what
information to put in a bug. This helps the developer find the root
cause and fix it.*

### Required Information

You should gather the information below before creating the bug report.

#### Package Information

- Location from which the packages are used
- Package Info - version of glusterfs package installed

#### Cluster Information

- Number of nodes in the cluster
- Hostnames and IPs of the gluster nodes [if it is not a security
  issue]
    - Hostname / IP will help developers in understanding and
      correlating with the logs
- Output of `gluster peer status`
- Node IP, from which the "x" operation is done
    - "x" here means any operation that causes the issue

#### Volume Information

- Number of volumes
- Volume Names
- Volume on which the particular issue is seen [ if applicable ]
- Type of volumes
- Volume options if available
- Output of `gluster volume info`
- Output of `gluster volume status`
- Get the statedump of the volume with the problem

`$ gluster volume statedump <vol-name>`

This writes one statedump per brick process to `/var/run/gluster`.

*NOTE: Collect statedumps from one gluster Node in a directory.*

Repeat this on all nodes containing bricks of the volume. The
directories collected this way can be archived, compressed and attached
to the bug.
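
Packing the collected directories can be sketched as follows. The `archive_statedumps` helper and the archive naming scheme are assumptions made for this example; copying the statedump files from `/var/run/gluster` into the directory is left to the reader.

```shell
#!/bin/sh
# Hypothetical helper: pack the statedumps collected on one node into a
# compressed archive that can be attached to the bug.
archive_statedumps() {
    # $1 = directory holding the statedump files copied from the node
    # $2 = node name to embed in the archive name
    src=$1
    node=$2
    out="statedumps-$node.tar.gz"
    # -C keeps only the last path component inside the archive.
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
    printf '%s\n' "$out"
}
```

For example, `archive_statedumps /tmp/dumps-node1 node1` would write `statedumps-node1.tar.gz` to the current directory and print its name.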

#### Brick Information

- XFS options used when the brick partition was created
    - This can be obtained with this command:

`$ xfs_info /dev/mapper/vg1-brick`

- Extended attributes on the bricks
    - This can be obtained with this command:

`$ getfattr -d -m. -ehex /rhs/brick1/b1`

#### Client Information

- OS Type (Windows, RHEL)
- OS Version: for a Linux distribution, get the following:

`$ uname -r`\
`$ cat /etc/issue`

- Fuse or NFS mount point on the client, with the output of the mount
  commands
- Output of `df -Th` command

#### Tool Information

- If any tools are used for testing, provide the info/version about
  them
- If any IO is simulated using a script, provide the script

#### Logs Information

- Check the logs for issues, warnings and errors.
    - Self-heal logs
    - Rebalance logs
    - Glusterd logs
    - Brick logs
    - NFS logs (if applicable)
    - Samba logs (if applicable)
    - Client mount log
- Attach the entire logs if they are too large to paste as a
  comment

#### SOS report for CentOS/Fedora

- Get the sosreport from the involved gluster Node and Client [ in
  case of CentOS/Fedora ]
- Add a meaningful name/IP to the sosreport, by renaming/adding
  hostname/ip to the sosreport name
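
Much of the command output requested in the sections above can be gathered in one pass. The script below is a rough sketch, not an official tool: the `collect` helper, the `REPORT_DIR` variable and the output file names are hypothetical. Commands that are missing on the machine (for example `gluster` on a client) are recorded as skipped rather than treated as errors.

```shell
#!/bin/sh
# Gather command output for a bug report into one directory.

# Hypothetical helper: run a command and save its output under a label.
collect() {
    # $1 = output directory, $2 = label, remaining arguments = command
    outdir=$1
    label=$2
    shift 2
    mkdir -p "$outdir"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$outdir/$label.txt" 2>&1 || true
    else
        echo "skipped: $1 not found" >"$outdir/$label.txt"
    fi
}

# REPORT_DIR is an assumed override; a temporary directory is used otherwise.
report=${REPORT_DIR:-$(mktemp -d)}

collect "$report" peer-status   gluster peer status
collect "$report" volume-info   gluster volume info
collect "$report" volume-status gluster volume status
collect "$report" kernel        uname -r
collect "$report" os-issue      cat /etc/issue
collect "$report" mounts        mount
collect "$report" disk-usage    df -Th

echo "report files written to $report"
```

Brick-specific commands such as `xfs_info` and `getfattr` take a device or brick path, so they would be added with the paths of the affected bricks.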
@@ -1,400 +0,0 @@
Bug Triage Guidelines
=====================

- Triaging of bugs is an important task; when done correctly, it can
  greatly reduce the time between reporting a bug and the availability
  of a fix.

- Triagers should focus on new bugs and try to describe the problem in
  a way that is easy to understand and as accurate as possible. The
  goal of the triagers is to reduce the time that developers need to
  solve the bug report.

- A triager is like an assistant that helps with the information
  gathering and possibly the debugging of a new bug report. Because a
  triager helps prepare a bug before a developer gets involved, it
  can be a very nice role for new community members who are
  interested in technical aspects of the software.

- Triagers will stumble upon many different kinds of issues, ranging
  from reports about spelling mistakes or unclear log messages to
  memory leaks causing crashes or performance issues in environments
  with several hundred storage servers.

Nobody expects that triagers can prepare all bug reports. Therefore most
developers will be able to assist the triagers, answer questions and
suggest approaches to debug and data to gather. Over time, triagers get
more experienced and will rely less on developers.

**Bug triage can be summarised in the points below:**

- Is there enough information in the bug description?
- Is it a duplicate bug?
- Is it assigned to the correct component of GlusterFS?
- Are the Bugzilla fields correct?
- Is the bug summary correct?
- Assigning bugs or adding people to the "CC" list
- Fix the Severity and Priority.
- What to do if the bug is present in multiple GlusterFS versions.
- Add appropriate Keywords to the bug.

These points are discussed in detail below.

Weekly meeting about Bug Triaging
---------------------------------

We try to meet every week in \#gluster-meeting on Freenode. The date and
time of the next meeting are normally updated in the
[agenda](https://public.pad.fsfe.org/p/gluster-bug-triage).

Getting Started: Find reports to triage
---------------------------------------

There are many different techniques and approaches to find reports to
triage. One easy way is to use these pre-defined Bugzilla reports (a
report is completely defined by its URL, which can be modified
manually):

- New **bugs** that do not have the 'Triaged' keyword [Bugzilla
  link](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&f1=keywords&keywords=Triaged%2CFutureFeature&keywords_type=nowords&list_id=3014117&o1=nowords&product=GlusterFS&query_format=advanced&v1=Triaged)
- New **features** that do not have the 'Triaged' keyword (identified
  by FutureFeature keyword, probably of interest only to project
  leaders) [Bugzilla
  link](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&f1=keywords&f2=keywords&list_id=3014699&o1=nowords&o2=allwords&product=GlusterFS&query_format=advanced&v1=Triaged&v2=FutureFeature)
- New glusterd bugs: [Bugzilla
  link](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&product=GlusterFS&f1=keywords&o1=nowords&v1=Triaged&component=glusterd)
- New Replication (afr) bugs: [Bugzilla
  link](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&component=replicate&f1=keywords&list_id=2816133&o1=nowords&product=GlusterFS&query_format=advanced&v1=Triaged)
- New distribute (DHT) bugs: [Bugzilla
  link](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&component=distribute&f1=keywords&list_id=2816148&o1=nowords&product=GlusterFS&query_format=advanced&v1=Triaged)

- New bugs against version 3.6: [Bugzilla
  link](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&product=GlusterFS&f1=keywords&f2=version&o1=nowords&o2=regexp&v1=Triaged&v2=^3.6)
- New bugs against version 3.5: [Bugzilla
  link](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&product=GlusterFS&f1=keywords&f2=version&o1=nowords&o2=regexp&v1=Triaged&v2=^3.5)
- New bugs against version 3.4: [Bugzilla
  link](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&product=GlusterFS&f1=keywords&f2=version&o1=nowords&o2=regexp&v1=Triaged&v2=^3.4)

- [bugzilla
  tracker](https://bugzilla.redhat.com/page.cgi?id=browse.html&product=GlusterFS&product_version=&bug_status=all&tab=recents)
  (can include already Triaged bugs)

- [Untriaged NetBSD
  bugs](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&keywords=Triaged&keywords_type=nowords&op_sys=NetBSD&product=GlusterFS)
- [Untriaged FreeBSD
  bugs](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&keywords=Triaged&keywords_type=nowords&op_sys=FreeBSD&product=GlusterFS)
- [Untriaged Mac OS
  bugs](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&keywords=Triaged&keywords_type=nowords&op_sys=Mac%20OS&product=GlusterFS)
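
Because each report is just a URL, the per-component and per-version links above can be generated rather than edited by hand. The two functions below are a small sketch with hypothetical names; they reuse only the query parameters that already appear in the links of this section and do no URL encoding, so a component name containing spaces (such as "access control") would need to be encoded first.

```shell
#!/bin/sh
# Hypothetical helpers: print search URLs for NEW bugs that do not have
# the 'Triaged' keyword.

base="https://bugzilla.redhat.com/buglist.cgi"
common="bug_status=NEW&product=GlusterFS&f1=keywords"

untriaged_component_url() {
    # $1 = Bugzilla component name, e.g. glusterd
    printf '%s?%s&o1=nowords&v1=Triaged&component=%s\n' "$base" "$common" "$1"
}

untriaged_version_url() {
    # $1 = version prefix, e.g. 3.6 (matched as the regexp ^3.6)
    printf '%s?%s&f2=version&o1=nowords&o2=regexp&v1=Triaged&v2=^%s\n' \
        "$base" "$common" "$1"
}

untriaged_component_url glusterd
untriaged_version_url 3.6
```

The first call reproduces the "New glusterd bugs" link from the list; swapping in another component name or version prefix gives the corresponding report.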

In addition to manually checking Bugzilla for bugs to triage, it is also
possible to receive emails when new bugs are filed or existing bugs get
updated.

If at any point you feel like you do not know what to do with a certain
report, please first ask on [irc or mailing
lists](http://www.gluster.org/community/index.html) before changing
something.

Is there enough information?
----------------------------

To make a report useful, the same rules apply as for
[bug reporting guidelines](./Bug Reporting Guidelines.md).

It's hard to generalize what makes a good report. For "average"
reporters it is often helpful to have good steps to reproduce, the
GlusterFS software version, and information about the test/production
environment and the GNU/Linux distribution.

If the reporter is a developer, steps to reproduce can sometimes be
omitted as context is obvious. *However, this can create a problem for
contributors that need to find their way, hence it is strongly advised
to list the steps to reproduce an issue.*

Other tips:

- There should be only one issue per report. Try not to mix related or
  similar looking bugs in one report.

- It should be possible to call the described problem fixed at some
  point. "Improve the documentation" or "It runs slow" could never be
  called fixed, while "Documentation should cover the topic Embedding"
  or "The page at <http://en.wikipedia.org/wiki/Example> should load
  in less than five seconds" would have a criterion. A good summary of
  the bug will also help others in finding existing bugs and prevent
  filing of duplicates.

- If the bug is a graphical problem, you may want to ask for a
  screenshot to attach to the bug report. Make sure to ask that the
  screenshot does not contain any confidential information.

Is it a duplicate?
------------------

Some reports in Bugzilla have already been reported before, so you can
[search for an already existing
report](https://bugzilla.redhat.com/query.cgi?format=advanced). We do
not recommend spending too much time on it; if a bug is filed twice,
someone else will mark it as a duplicate later. If the bug is a
duplicate, mark it as a duplicate in the resolution box below the
comment field by setting the **CLOSED DUPLICATE** status, and briefly
explain your action in a comment for the reporter. When marking a bug as
a duplicate, it is required to reference the original bug.

If you think that you have found a duplicate but you are not totally
sure, just add a comment like "This bug looks related to bug XXXXX" (and
replace XXXXX by the bug number) so somebody else can take a look and
help judge.

You can also take a look at the [list of existing
duplicates](https://bugzilla.redhat.com/page.cgi?id=browse.html&product=GlusterFS&product_version=&bug_status=all&tab=duplicates).

Is it assigned to the correct component of GlusterFS?
-----------------------------------------------------

Make sure the bug is assigned to the right component. Below is the list
of GlusterFS components in Bugzilla.

- access control - Access control translator
- BDB - Berkeley DB backend storage
- booster - LD\_PRELOAD'able access client
- build - Compiler, package management and platform specific warnings
  and errors
- cli - gluster command line
- core - Core features of the filesystem
- distribute - Distribute translator (previously DHT)
- errorgen - Error Gen Translator
- fuse - mount/fuse translator and patched fuse library
- georeplication - Gluster Geo-Replication
- glusterd - Management daemon
- HDFS - Hadoop application support over GlusterFS
- ib-verbs - Infiniband verbs transport
- io-cache - IO buffer caching translator
- io-threads - IO threads performance translator
- libglusterfsclient - API interface to access glusterfs volumes
  programmatically
- locks - POSIX and internal locks
- logging - Centralized logging, log messages, log rotation etc
- nfs - NFS component in GlusterFS
- nufa - Non-Uniform Filesystem Scheduler Translator
- object-storage - Object Storage
- porting - Porting GlusterFS to different operating systems and
  platforms
- posix - POSIX (API) based backend storage
- protocol - Client and Server protocol translators
- quick-read - Quick Read Translator
- quota - Volume & Directory quota translator
- rdma - RDMA transport
- read-ahead - Read ahead (file) performance translator
- replicate - Replication translator (previously AFR)
- rpc - RPC Layer
- scripts - Build scripts, mount scripts, etc.
- stat-prefetch - Stat prefetch translator
- stripe - Striping (RAID-0) cluster translator
- trace - Trace translator
- transport - Socket (IPv4, IPv6, unix, ib-sdp) and generic transport
  code
- unclassified - Unclassified - to be reclassified as other components
- unify - Unify translator and schedulers
- write-behind - Write behind performance translator
- libgfapi - APIs for GlusterFS
- tests - GlusterFS Test Framework
- gluster-hadoop - Hadoop support on GlusterFS
- gluster-hadoop-install - Automated Gluster volume configuration for
  Hadoop Environments
- gluster-smb - gluster smb
- puppet-gluster - A puppet module for GlusterFS

Tips for searching:

- As it is often hard for reporters to find the right place (product
  and component) to file a report, also search for duplicates
  outside the product and component of the bug report you are
  triaging.
- Use common words and try several different combinations, as there
  can be several ways to describe the same problem. This makes it
  more likely that you find matching results.
- Drop the ending of a verb (e.g. search for "delet" so you get
  reports for both "delete" and "deleting"), and also try similar
  words (e.g. search both for "delet" and "remov").
- Search using the date range delimiter: Most of the bug reports are
  recent, so you can try to increase the search speed using date
  delimiters by going to "Search by Change History" on the [search
  page](https://bugzilla.redhat.com/query.cgi?format=advanced).
  Example: search from "2011-01-01" or "-730d" (to cover the last two
  years) to "Now".

Are the fields correct?
-----------------------

### Summary

Sometimes the summary does not summarize the bug itself well. You may
want to update the bug summary to make the report distinguishable. A
good title may contain:

- A brief explanation of the root cause (if it was found)
- Some of the symptoms people are experiencing

### Adding people to the "CC" or changing the "Assigned to" field

Normally, developers and potential assignees of an area are already
CC'ed by default, but sometimes reports describe general issues or are
filed against common bugzilla products. Only if you know developers who
work in the area covered by the bug report, and if you know that these
developers accept getting CCed or assigned to certain reports, you can
add that person to the CC field or even assign the bug report to
them.

To get an idea of who works in which area, check the "MAINTAINERS" file
in the root of the glusterfs source tree or query changes in
[Gerrit](http://review.gluster.org) (see
[Simplified dev workflow](./Simplified Development Workflow.md)).

### Severity And Priority

Please see below for information on the available values and their
meanings.

#### Severity

This field is a pull-down of the external weighting of the bug report's
importance and can have the following values:

Severity    | Definition
------------|-----------
urgent      | catastrophic issues which severely impact the mission-critical operations of an organization. This may mean that the operational servers, development systems or customer applications are down or not functioning and no procedural workaround exists.
high        | high-impact issues in which the customer's operation is disrupted, but there is some capacity to produce
medium      | partial non-critical functionality loss, or issues which impair some operations but allow the customer to perform their critical tasks. This may be a minor issue with limited loss or no loss of functionality and limited impact to the customer's functionality
low         | general usage questions, recommendations for product enhancement, or development work
unspecified | importance not specified

#### Priority

This field is a pull-down of the internal weighting of the bug report's
importance and can have the following values:

Priority    | Definition
------------|-----------
urgent      | extremely important
high        | very important
medium      | average importance
low         | not very important
unspecified | importance not specified

### Bugs present in multiple Versions
|
||||
|
||||
During triaging you might come across a particular bug which is present
|
||||
across multiple version of GlusterFS. Here are the course of actions:
|
||||
|
||||
- We should have separate bugs for each release (We should
|
||||
clone bugs if required)
|
||||
- Bugs in released versions should be depended on bug for mainline
|
||||
(master branch) if the bug is applicable for mainline.
|
||||
- This will make sure that the fix would get merged in master
|
||||
branch first then the fix can get ported to other stable
|
||||
releases.
|
||||
|
||||
*Note: When a bug depends on other bugs, that means the bug cannot be
|
||||
fixed unless other bugs are fixed (depends on), or this bug stops other
|
||||
bugs being fixed (blocks)*
|
||||
|
||||
Here are some examples:
|
||||
|
||||
- A bug is raised for GlusterFS 3.5 and the same issue is present in
|
||||
mainline (master branch) and GlusterFS 3.6
|
||||
- Clone the original bug for mainline.
|
||||
- Clone another for 3.6.
|
||||
- And have the GlusterFS 3.6 bug and GlusterFS 3.5 bug 'depend on'
|
||||
the 'mainline' bug
|
||||
|
||||
- A bug is already present for mainline, and the same issue is seen in
|
||||
GlusterFS 3.5.
|
||||
- Clone the original bug for GlusterFS 3.5.
|
||||
- And have the cloned bug (for 3.5) 'depend on' the 'mainline'
|
||||
bug.
|
||||
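The 'depends on' relation in the first example fixes the order in which the bugs get resolved. A minimal sketch of that ordering, assuming made-up bug names (`bug-mainline`, `bug-3.5`, `bug-3.6`) and using the standard `tsort` utility rather than anything Bugzilla provides:

```shell
#!/bin/sh
# Hypothetical bug names, one "bug dependency" pair per line, mirroring the
# first example: the 3.5 and 3.6 bugs both depend on the mainline bug.
deps='bug-3.5 bug-mainline
bug-3.6 bug-mainline'

# tsort expects "earlier later" pairs, so swap the columns: the dependency
# (mainline) has to be fixed before the bugs that depend on it.
echo "$deps" | awk '{ print $2, $1 }' | tsort
```

The mainline bug is printed first; the order of the two release bugs relative to each other is not significant.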
|
||||
### Keywords
|
||||
|
||||
Many predefined searches for Bugzilla include keywords. One example is
|
||||
the searches used for triaging. If the bug is 'NEW' and 'Triaged' is not
|
||||
set, you (as a triager) can pick it and use this page to triage it. When
|
||||
the bug is 'NEW' and 'Triaged' is in the list of keywords, the bug is
|
||||
ready to be picked up by a developer.
|
||||
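Such a search is just a URL with query parameters. A minimal sketch that rebuilds the "NEW and Triaged" search linked on this page; `bugzilla_query_url` is a hypothetical helper, and the parameter names are copied from the links in this document:

```shell
#!/bin/sh
# Build a Bugzilla buglist URL for GlusterFS bugs in a given status that
# carry a given keyword. bugzilla_query_url is a made-up helper name.
bugzilla_query_url() {
    status=$1
    keyword=$2
    printf 'https://bugzilla.redhat.com/buglist.cgi?bug_status=%s&keywords=%s&product=GlusterFS\n' \
        "$status" "$keyword"
}

# Bugs that are triaged and ready to be picked up by a developer.
bugzilla_query_url NEW Triaged
```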
|
||||
**Triaged**
|
||||
: Once you are done with triage add the **Triaged** keyword to the
|
||||
bug, so that others will know the triaged state of the bug. The
|
||||
predefined search at the top of this page will then not list the
|
||||
Triaged bug anymore. Instead, the bug should have moved to [this
|
||||
list](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&keywords=Triaged&product=GlusterFS).
|
||||
|
||||
**EasyFix**
|
||||
: By adding the **EasyFix** keyword, the bug gets added to the [list
|
||||
of bugs that should be simple to fix](./Easy Fix Bugs.md).
|
||||
Adding this keyword is encouraged for simple and well defined bugs
|
||||
or feature enhancements.
|
||||
|
||||
**Patch**
|
||||
: When a patch for the problem has been attached or included inline,
|
||||
add the **Patch** keyword so that it is clear that some preparation
|
||||
for the development has been done already. Of course, it would have
|
||||
been nicer if the patch was sent to Gerrit for review, but not
|
||||
everyone is ready to pass the Gerrit hurdle when they report a bug.
|
||||
|
||||
You can also add the **Patch** keyword when a bug has been fixed in
|
||||
mainline and the patch(es) have been identified. Add a link to the
|
||||
Gerrit change(s) so that backporting to a stable release is made
|
||||
simpler.
|
||||
|
||||
**Documentation**
|
||||
: Add the **Documentation** keyword when a bug has been reported for
|
||||
the documentation. This helps editors and writers in finding the
|
||||
bugs that they can resolve.
|
||||
|
||||
**Tracking**
|
||||
: This keyword is used for bugs which are used to track other bugs for
|
||||
a particular release. For example [3.6 tracker
|
||||
bug](https://bugzilla.redhat.com/showdependencytree.cgi?maxdepth=2&hide_resolved=1&id=glusterfs-3.6.0)
|
||||
|
||||
**FutureFeature**
|
||||
: This keyword is used for bugs that request a
|
||||
feature enhancement (RFE - Requested Feature Enhancement) for
|
||||
future releases of GlusterFS. If you open a bug to request a
feature that you would like to see in a future version of GlusterFS,
please add this keyword.
|
||||
|
||||
Add yourself to the CC list
|
||||
---------------------------
|
||||
|
||||
By adding yourself to the CC list of bug reports that you change, you
|
||||
will receive followup emails with all comments and changes by anybody on
|
||||
that individual report. This helps you learn what further investigations
|
||||
others make. You can change the settings in Bugzilla on which actions
|
||||
you want to receive mail.
|
||||
|
||||
Bugs For Group Triage
|
||||
---------------------
|
||||
|
||||
If you come across one or more bugs that you think should go through the
bug triage group, please set NEEDINFO for bugs@gluster.org on the bug.
|
||||
|
||||
Resolving bug reports
|
||||
---------------------
|
||||
|
||||
See the [Bug report life cycle](./Bug report Life Cycle.md) for
|
||||
the meaning of the bug status and resolutions.
|
||||
|
||||
Example of Triaged Bugs
|
||||
-----------------------
|
||||
|
||||
This Bugzilla
|
||||
[filter](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&keywords=Triaged&keywords_type=anywords&list_id=2739593&product=GlusterFS&query_format=advanced)
|
||||
will list NEW, Triaged Bugs
|
||||
@@ -1,57 +0,0 @@
|
||||
This page describes the life of a bug report.
|
||||
|
||||
- When a bug is first reported, it is given the **NEW** status.
|
||||
- Once a developer has started, or is planning to work on a bug, the
|
||||
status **ASSIGNED** is set. The "Assigned to" field should mention a
|
||||
specific developer.
|
||||
- If an initial
|
||||
[patch](https://en.wikipedia.org/wiki/Patch_(computing)) for a bug
|
||||
has been put into the [Gerrit code review
|
||||
tool](http://review.gluster.org), the status **POST** should be set
|
||||
manually. The status **POST** should only be used when all patches
|
||||
for a specific bug have been posted for review.
|
||||
- After a review of the patch, and passing any automated regression
|
||||
tests, the patch will get merged by one of the maintainers. When the
|
||||
patch has been merged into the git repository, a comment is added to
|
||||
the bug. Only when all needed patches have been merged should the assigned
engineer change the status to **MODIFIED**.
|
||||
- Once a package is available with a fix for the bug, the status should
|
||||
be moved to **ON\_QA**.
|
||||
- The **Fixed in version** field should get the name/release of
|
||||
the package that contains the fix. Packages for multiple
|
||||
distributions will mostly become available within a few days after
|
||||
the *make dist* tarball was created.
|
||||
- This tells the bug reporter that a package is available with a fix
|
||||
for the bug and that they should test the package.
|
||||
- The release maintainer needs to make this change to the bug status;
|
||||
scripts are available (ask *ndevos*).
|
||||
- The status **VERIFIED** is set when a QA tester or the reporter
confirms that a new build containing the merged fix resolves the
issue.
|
||||
- In case the version does not fix the reported bug, the status should
|
||||
be moved back to **ASSIGNED** with a clear note on what exactly
|
||||
failed.
|
||||
- When a report has been solved it is given **CLOSED** status. This
|
||||
can mean:
|
||||
- **CLOSED/CURRENTRELEASE** when a code change that fixes the
|
||||
reported problem has been merged in
|
||||
[Gerrit](http://review.gluster.org).
|
||||
- **CLOSED/WONTFIX** when the reported problem or suggestion is
|
||||
valid, but any fix of the reported problem or implementation of
|
||||
the suggestion would be barred from approval by the project's
|
||||
Developers/Maintainers (or product managers, if existing).
|
||||
- **CLOSED/WORKSFORME** when the problem can not be reproduced,
|
||||
when missing information has not been provided, or when an
|
||||
acceptable workaround exists to achieve a similar outcome as
|
||||
requested.
|
||||
- **CLOSED/CANTFIX** when the problem is not a bug, or when it is
|
||||
a change that is outside the power of GlusterFS development. For
|
||||
example, bugs proposing changes to third-party software can not
|
||||
be fixed in the GlusterFS project itself.
|
||||
- **CLOSED/DUPLICATE** when the problem has been reported before,
|
||||
no matter if the previous report has been already resolved or
|
||||
not.
|
||||
|
||||
If a bug report was marked as *CLOSED* or *VERIFIED* and it turns out
|
||||
that this was incorrect, the bug can be changed to the status *ASSIGNED*
|
||||
or *NEW*.
|
||||
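The usual path through the statuses above can be summarised as a small state machine. This is only a sketch: `next_status` is a hypothetical helper, not a Bugzilla tool, and it ignores the backward transitions (for example **ON\_QA** back to **ASSIGNED** when a fix fails).

```shell
#!/bin/sh
# Print the status that usually follows the given one in the life cycle
# described above. Returns non-zero for anything it does not know.
next_status() {
    case "$1" in
        NEW)      echo ASSIGNED ;;
        ASSIGNED) echo POST ;;
        POST)     echo MODIFIED ;;
        MODIFIED) echo ON_QA ;;
        ON_QA)    echo VERIFIED ;;   # or back to ASSIGNED if the fix fails
        VERIFIED) echo CLOSED ;;
        *)        echo "unknown status: $1" >&2; return 1 ;;
    esac
}

# Walk a bug report from NEW to CLOSED and print each step.
status=NEW
while [ "$status" != CLOSED ]; do
    printf '%s -> ' "$status"
    status=$(next_status "$status")
done
echo "$status"
```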
@@ -1,55 +0,0 @@
|
||||
Template for bug description
|
||||
----------------------------
|
||||
This template should be in line with the [Bug reporting guidelines](./Bug Reporting Guidelines.md).
|
||||
The template is a replacement for the default description template present in [Bugzilla](https://bugzilla.redhat.com).
|
||||
|
||||
work in progress
|
||||
|
||||
------------------------------------------------------------------------
|
||||
|
||||
Description of problem:
|
||||
|
||||
Version of GlusterFS package installed:
|
||||
|
||||
Location from which the packages are used:
|
||||
|
||||
GlusterFS Cluster Information:
|
||||
|
||||
- Number of volumes
|
||||
- Volume Names
|
||||
- Volume on which the particular issue is seen [ if applicable ]
|
||||
- Type of volumes
|
||||
- Volume options if available
|
||||
- Output of `gluster volume info`
|
||||
- Output of `gluster volume status`
|
||||
- Get the statedump of the volume with the problem
|
||||
|
||||
`$ gluster volume statedump <vol-name>`
|
||||
|
||||
- Client Information
|
||||
- OS Type:
|
||||
- Mount type:
|
||||
- OS Version:
|
||||
|
||||
How reproducible:
|
||||
|
||||
Steps to Reproduce:
|
||||
|
||||
- 1.
|
||||
- 2.
|
||||
- 3.
|
||||
|
||||
Actual results:
|
||||
|
||||
Expected results:
|
||||
|
||||
Logs Information:
|
||||
|
||||
- Provide possible issues, warnings, errors as a comment to the bug
|
||||
- Look for issues/warnings/errors in self-heal logs, rebalance logs, glusterd logs, brick logs, mount logs/nfs logs/smb logs
|
||||
- Add the entire logs as an attachment if they are too large to paste as a comment
|
||||
|
||||
Additional info:
|
||||
|
||||
[Bug\_reporting\_guidelines]: Bug_reporting_guidelines "wikilink"
|
||||
[Bugzilla]: https://bugzilla.redhat.com
|
||||
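The cluster information requested in the template can be collected in one go. A minimal sketch, assuming a hypothetical helper `report_commands` and a placeholder volume name `myvol`; it only prints the commands named in the template so that they can be reviewed before being run as root:

```shell
#!/bin/sh
# Print the gluster commands whose output the template asks for.
# The argument is the name of the volume that shows the problem.
report_commands() {
    volname=$1
    echo "gluster volume info"
    echo "gluster volume status"
    echo "gluster volume statedump $volname"
}

# Review the list first; to collect the output on a server, run as root:
#   report_commands myvol | sh > gluster-report.txt 2>&1
report_commands myvol
```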
@@ -1,148 +0,0 @@
|
||||
This page describes how to build and install GlusterFS.
|
||||
|
||||
Build Requirements
|
||||
------------------
|
||||
|
||||
The following packages are required for building GlusterFS,
|
||||
|
||||
- GNU Autotools
|
||||
- Automake
|
||||
- Autoconf
|
||||
- Libtool
|
||||
- lex (generally flex)
|
||||
- GNU Bison
|
||||
- OpenSSL
|
||||
- libxml2
|
||||
- Python 2.x
|
||||
- libaio
|
||||
- libibverbs
|
||||
- librdmacm
|
||||
- readline
|
||||
- lvm2
|
||||
- glib2
|
||||
- liburcu
|
||||
- cmocka
|
||||
- libacl
|
||||
- sqlite
|
||||
|
||||
### Fedora
|
||||
|
||||
The following yum command installs all the build requirements for
|
||||
Fedora,
|
||||
|
||||
# yum install automake autoconf libtool flex bison openssl-devel libxml2-devel python-devel libaio-devel libibverbs-devel librdmacm-devel readline-devel lvm2-devel glib2-devel userspace-rcu-devel libcmocka-devel libacl-devel sqlite-devel
|
||||
|
||||
### Ubuntu
|
||||
|
||||
The following apt-get command will install all the build requirements on
|
||||
Ubuntu,
|
||||
|
||||
$ sudo apt-get install make automake autoconf libtool flex bison pkg-config libssl-dev libxml2-dev python-dev libaio-dev libibverbs-dev librdmacm-dev libreadline-dev liblvm2-dev libglib2.0-dev liburcu-dev libcmocka-dev libsqlite3-dev libacl1-dev
|
||||
|
||||
Building from Source
|
||||
--------------------
|
||||
|
||||
This section describes how to build GlusterFS from source. It is assumed
|
||||
you have a copy of the GlusterFS source (either from a released tarball
|
||||
or a git clone). All the commands below are to be run with the source
|
||||
directory as the working directory.
|
||||
|
||||
### Configuring for building
|
||||
|
||||
Run the below commands once for configuring and setting up the build
|
||||
process.
|
||||
|
||||
Run autogen to generate the configure script.
|
||||
|
||||
$ ./autogen.sh
|
||||
|
||||
Once autogen completes successfully a configure script is generated. Run
|
||||
the configure script to generate the makefiles.
|
||||
|
||||
$ ./configure
|
||||
|
||||
If the above build requirements have been installed, running the
|
||||
configure script should give the below configure summary,
|
||||
|
||||
GlusterFS configure summary
|
||||
===========================
|
||||
FUSE client : yes
|
||||
Infiniband verbs : yes
|
||||
epoll IO multiplex : yes
|
||||
argp-standalone : no
|
||||
fusermount : yes
|
||||
readline : yes
|
||||
georeplication : yes
|
||||
Linux-AIO : yes
|
||||
Enable Debug : no
|
||||
systemtap : no
|
||||
Block Device xlator : yes
|
||||
glupy : yes
|
||||
Use syslog : yes
|
||||
XML output : yes
|
||||
QEMU Block formats : yes
|
||||
Encryption xlator : yes
|
||||
|
||||
During development it is good to enable a debug build. To do this run
|
||||
configure with a '--enable-debug' flag.
|
||||
|
||||
$ ./configure --enable-debug
|
||||
|
||||
Further configuration flags can be found by running configure with a
|
||||
'--help' flag,
|
||||
|
||||
$ ./configure --help
|
||||
|
||||
### Building
|
||||
|
||||
Once configured, GlusterFS can be built with a simple make command.
|
||||
|
||||
$ make
|
||||
|
||||
To speed up the build process on a multicore machine, add a '-jN' flag,
|
||||
where N is the number of parallel jobs.
|
||||
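A common choice for N is the number of CPU cores. A small sketch, assuming the `nproc` utility from GNU coreutils is available (it falls back to a single job otherwise); `njobs` is just an illustrative variable name:

```shell
#!/bin/sh
# One job per CPU core; fall back to 1 if nproc is not installed.
njobs=$(nproc 2>/dev/null || echo 1)

# Print the resulting command; run it from the configured source tree.
echo "make -j$njobs"
```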
|
||||
### Installing
|
||||
|
||||
Run 'make install' to install GlusterFS. By default, GlusterFS will be
|
||||
installed into '/usr/local' prefix. To change the install prefix, give
|
||||
the appropriate option to configure. If installing into the default
|
||||
prefix, you might need to use 'sudo' or 'su -c' to install.
|
||||
|
||||
$ sudo make install
|
||||
|
||||
### Running GlusterFS
|
||||
|
||||
GlusterFS can only be run as root, so the following commands will need
|
||||
to be run as root. If you've installed into the default '/usr/local'
|
||||
prefix, add '/usr/local/sbin' and '/usr/local/bin' to your PATH before
|
||||
running the below commands.
|
||||
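For the default prefix, the PATH change could look like the sketch below. It only affects the current shell session; making it permanent (for example in a shell profile) is left to the reader.

```shell
#!/bin/sh
# Prepend the default install locations so that glusterd and gluster
# are found before any other copies on the system.
PATH=/usr/local/sbin:/usr/local/bin:$PATH
export PATH

# Show where the shell now finds the binaries; prints nothing for a
# binary that has not been installed yet.
command -v glusterd || true
command -v gluster || true
```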
|
||||
A source install will generally not install any init scripts. So you
|
||||
will need to start glusterd manually. To manually start glusterd just
|
||||
run,
|
||||
|
||||
# glusterd
|
||||
|
||||
This will start glusterd and fork it into the background as a daemon
|
||||
process. You can now run 'gluster' commands and make use of GlusterFS.
|
||||
|
||||
Building packages
|
||||
-----------------
|
||||
|
||||
### Building RPMs
|
||||
|
||||
Building RPMs is simple. On an RPM-based system, e.g. Fedora,
|
||||
get the source and do the configuration steps as shown in the 'Building
|
||||
from Source' section. After the configuration step, run the following
|
||||
steps to build RPMs,
|
||||
|
||||
$ cd extras/LinuxRPM
|
||||
$ make glusterrpms
|
||||
|
||||
This will create rpms from the source in 'extras/LinuxRPM'. *(Note: You
|
||||
will need to install the rpmbuild requirements including rpmbuild and
|
||||
mock)*
|
||||
|
||||
A more detailed description for building RPMs can be found at
|
||||
[CompilingRPMS](./Compiling RPMS.md).
|
||||
@@ -1,178 +0,0 @@
|
||||
How to compile GlusterFS RPMs from git source, for RHEL/CentOS, and Fedora
|
||||
--------------------------------------------------------------------------
|
||||
|
||||
Creating RPMs of GlusterFS from git source is fairly easy, once you
|
||||
know the steps.
|
||||
|
||||
RPMs can be compiled on at least the following OSes:
|
||||
|
||||
- Red Hat Enterprise Linux 5, 6 (& 7 when available)
|
||||
- CentOS 5, 6 (& 7 when available)
|
||||
- Fedora 16-20
|
||||
|
||||
Specific instructions for compiling are below. If you're using:
|
||||
|
||||
- Fedora 16-20 - Follow the Fedora steps, then do all of the Common
|
||||
steps.
|
||||
- CentOS 5.x - Follow the CentOS 5.x steps, then do all of the Common
|
||||
steps.
|
||||
- CentOS 6.x - Follow the CentOS 6.x steps, then do all of the Common
|
||||
steps.
|
||||
- RHEL 6.x - Follow the RHEL 6.x steps, then do all of the Common
|
||||
steps.
|
||||
|
||||
Note - these instructions have been explicitly tested on all of CentOS
|
||||
5.10, RHEL 6.4, CentOS 6.4+, and Fedora 16-20. Other releases of
|
||||
RHEL/CentOS and Fedora may work too, but haven't been tested. Please
|
||||
update this page appropriately if you do so. :)
|
||||
|
||||
### Preparation steps for Fedora 16-20 (only)
|
||||
|
||||
1. Install gcc, the python development headers, and python setuptools:
|
||||
|
||||
$ sudo yum -y install gcc python-devel python-setuptools
|
||||
|
||||
2. If you're compiling GlusterFS version 3.4, then install
|
||||
python-swiftclient. Other GlusterFS versions don't need it:
|
||||
|
||||
$ sudo easy_install simplejson python-swiftclient
|
||||
|
||||
Now follow through the **Common Steps** part below.
|
||||
|
||||
### Preparation steps for CentOS 5.x (only)
|
||||
|
||||
You'll need EPEL installed first and some CentOS specific packages. The
|
||||
commands below will get that done for you. After that, follow through
|
||||
the "Common steps" section.
|
||||
|
||||
1. Install EPEL first:
|
||||
|
||||
$ curl -OL http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
|
||||
$ sudo yum -y install epel-release-5-4.noarch.rpm --nogpgcheck
|
||||
|
||||
2. Install the packages required only on CentOS 5.x:
|
||||
|
||||
$ sudo yum -y install buildsys-macros gcc ncurses-devel python-ctypes python-sphinx10 \
|
||||
redhat-rpm-config
|
||||
|
||||
Now follow through the **Common Steps** part below.
|
||||
|
||||
### Preparation steps for CentOS 6.x (only)
|
||||
|
||||
You'll need EPEL installed first and some CentOS specific packages. The
|
||||
commands below will get that done for you. After that, follow through
|
||||
the "Common steps" section.
|
||||
|
||||
1. Install EPEL first:
|
||||
|
||||
$ sudo yum -y install http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
|
||||
|
||||
2. Install the packages required only on CentOS:
|
||||
|
||||
$ sudo yum -y install python-webob1.0 python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
|
||||
|
||||
Now follow through the **Common Steps** part below.
|
||||
|
||||
### Preparation steps for RHEL 6.x (only)
|
||||
|
||||
You'll need EPEL installed first and some RHEL specific packages. The 2
|
||||
commands below will get that done for you. After that, follow through
|
||||
the "Common steps" section.
|
||||
|
||||
1. Install EPEL first:
|
||||
|
||||
$ sudo yum -y install http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
|
||||
|
||||
2. Install the packages required only on RHEL:
|
||||
|
||||
$ sudo yum -y --enablerepo=rhel-6-server-optional-rpms install python-webob1.0 \
|
||||
python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
|
||||
|
||||
Now follow through the **Common Steps** part below.
|
||||
|
||||
### Common Steps
|
||||
|
||||
These steps are for both Fedora and RHEL/CentOS. At the end you'll have
|
||||
the complete set of GlusterFS RPMs for your platform, ready to be
|
||||
installed.
|
||||
|
||||
**NOTES for step 1 below:**
|
||||
|
||||
- If you're on RHEL/CentOS 5.x and get a message about lvm2-devel not
|
||||
being available, it's ok. You can ignore it. :)
|
||||
- If you're on RHEL/CentOS 6.x and get any messages about
|
||||
python-eventlet, python-netifaces, python-sphinx and/or pyxattr not
|
||||
being available, it's ok. You can ignore them. :)
|
||||
|
||||
1. Install the needed packages
|
||||
|
||||
$ sudo yum -y --disablerepo='rhs*' --enablerepo='*optional-rpms' install git autoconf \
|
||||
automake bison cmockery2-devel dos2unix flex fuse-devel glib2-devel libaio-devel \
|
||||
libattr-devel libibverbs-devel librdmacm-devel libtool libxml2-devel lvm2-devel make \
|
||||
openssl-devel pkgconfig pyliblzma python-devel python-eventlet python-netifaces \
|
||||
python-paste-deploy python-simplejson python-sphinx python-webob pyxattr readline-devel \
|
||||
rpm-build systemtap-sdt-devel tar libcmocka-devel
|
||||
|
||||
2. Clone the GlusterFS git repository
|
||||
|
||||
$ git clone git://git.gluster.org/glusterfs
|
||||
$ cd glusterfs
|
||||
|
||||
3. Choose which branch to compile
|
||||
|
||||
If you want to compile the latest development code, you can skip this
|
||||
step and go on to the next one.
|
||||
|
||||
If instead you want to compile the code for a specific release of
|
||||
GlusterFS (such as v3.4), get the list of release names here:
|
||||
|
||||
$ git branch -a | grep release
|
||||
remotes/origin/release-2.0
|
||||
remotes/origin/release-3.0
|
||||
remotes/origin/release-3.1
|
||||
remotes/origin/release-3.2
|
||||
remotes/origin/release-3.3
|
||||
remotes/origin/release-3.4
|
||||
remotes/origin/release-3.5
|
||||
|
||||
Then switch to the correct release using the git "checkout" command, and
|
||||
the name of the release after the "remotes/origin/" bit from the list
|
||||
above:
|
||||
|
||||
$ git checkout release-3.4
|
||||
|
||||
**NOTE -** The CentOS 5.x instructions have only been tested for the
|
||||
master branch in GlusterFS git. It is unknown (yet) if they work for
|
||||
branches older than release-3.5.
|
||||
|
||||
4. Configure and compile GlusterFS
|
||||
|
||||
Now you're ready to compile Gluster:
|
||||
|
||||
$ ./autogen.sh
|
||||
$ ./configure --enable-fusermount
|
||||
$ make dist
|
||||
|
||||
5. Create the GlusterFS RPMs
|
||||
|
||||
$ cd extras/LinuxRPM
|
||||
$ make glusterrpms
|
||||
|
||||
That should complete with no errors, leaving you with a directory
|
||||
containing the RPMs.
|
||||
|
||||
$ ls -l *rpm
|
||||
-rw-rw-r-- 1 jc jc 3966111 Mar 2 12:15 glusterfs-3git-1.el5.centos.src.rpm
|
||||
-rw-rw-r-- 1 jc jc 1548890 Mar 2 12:17 glusterfs-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 66680 Mar 2 12:17 glusterfs-api-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 20399 Mar 2 12:17 glusterfs-api-devel-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 123806 Mar 2 12:17 glusterfs-cli-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 7850357 Mar 2 12:17 glusterfs-debuginfo-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 112677 Mar 2 12:17 glusterfs-devel-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 100410 Mar 2 12:17 glusterfs-fuse-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 187221 Mar 2 12:17 glusterfs-geo-replication-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 299171 Mar 2 12:17 glusterfs-libs-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 44943 Mar 2 12:17 glusterfs-rdma-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 123065 Mar 2 12:17 glusterfs-regression-tests-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 16224 Mar 2 12:17 glusterfs-resource-agents-3git-1.el5.centos.x86_64.rpm
|
||||
-rw-rw-r-- 1 jc jc 654043 Mar 2 12:17 glusterfs-server-3git-1.el5.centos.x86_64.rpm
|
||||
@@ -1,127 +1,7 @@
|
||||
Developers
|
||||
==========
|
||||
|
||||
### From GlusterDocumentation
|
||||
|
||||
Contributing to the Gluster community
|
||||
### Contributing to the Gluster community
|
||||
-------------------------------------
|
||||
|
||||
Are you itching to send in patches and participate as a developer in the
|
||||
Gluster community? Here are a number of starting points for getting
|
||||
involved. We don't require a signed contributor license agreement or
|
||||
copyright assignment, but we do require a "signed-off-by" line on each
|
||||
code check-in.
|
||||
|
||||
- [Simplified Developer Workflow](./Simplified Development Workflow.md)
|
||||
- A simpler and faster intro to developing with GlusterFS, than the
|
||||
doc below.
|
||||
- [Developer Workflow](./Development Workflow.md) - this tells
|
||||
you about our patch requirements, tools we use, and more. Required
|
||||
reading if you want to contribute code.
|
||||
- [License
|
||||
Change](http://www.gluster.org/2012/05/glusterfs-license-change/) -
|
||||
we recently changed the client library code to a dual license under
|
||||
the GPL v2 and the LGPL v3 or later
|
||||
- [GlusterFS Coding Standards](./coding-standard.md)
|
||||
|
||||
Compiling Gluster
|
||||
-----------------
|
||||
|
||||
- [Compiling RPMS](./Compiling RPMS.md) - Step by step
|
||||
instructions for compiling Gluster RPMS
|
||||
- [Building GlusterFS](./Building GlusterFS.md) - How to compile
|
||||
Gluster from source code. Including instructions for Ubuntu.
|
||||
|
||||
Developing
|
||||
----------
|
||||
|
||||
- [Projects](./Projects.md) - Ideas for projects you could
|
||||
create
|
||||
- [Language Bindings](./Language Bindings.md) - Connect to
|
||||
GlusterFS using various language bindings
|
||||
- [EasyFix\_Bugs](./Easy Fix Bugs.md) - Easy to fix bugs of
|
||||
GlusterFS. One of the best places to start contributing to GlusterFS.
|
||||
- [Fixing issues reported by tools for static code
|
||||
analysis](./Fixing issues reported by tools for static code analysis.md)
|
||||
- This is a good starting point for developers to fix bugs in
|
||||
GlusterFS project.
|
||||
- [Backport Wishlist](./Backport Wishlist.md) - Problems fixed
|
||||
in the master branch might need to get fixed in stable release
|
||||
branches too.
|
||||
The [Backport Guidelines](./Backport Guidelines.md) describe the steps that
|
||||
branches too.
|
||||
|
||||
Adding File operations
|
||||
----------------------
|
||||
|
||||
- [Steps to be followed when adding a new FOP to GlusterFS ](./adding-fops.md)
|
||||
|
||||
Automatic File Replication
|
||||
--------------------------
|
||||
|
||||
- [Cluster/afr translator](./afr.md)
|
||||
- [History of Locking in AFR](./afr-locks-evolution.md)
|
||||
- [Self heal Daemon](./afr-self-heal-daemon.md)
|
||||
|
||||
Data Structures
|
||||
---------------
|
||||
|
||||
- [inode data structure](./datastructure-inode.md)
|
||||
- [iobuf data structure](./datastructure-iobuf.md)
|
||||
- [mem-pool data structure](./datastructure-mem-pool.md)
|
||||
|
||||
Find the gfapi symbol versions [here](./gfapi-symbol-versions.md)
|
||||
|
||||
Daemon Management Framework
|
||||
---------------------------
|
||||
|
||||
- [How to introduce new daemons using daemon management framework](./daemon-management-framework.md)
|
||||
|
||||
Translators
|
||||
-----------
|
||||
|
||||
- [Block Device Translator](./bd-xlator.md)
|
||||
- [Performance/write-Behind Translator](./write-behind.md)
|
||||
- [Translator Development](./translator-development.md)
|
||||
- [Storage/posix Translator](./posix.md)
|
||||
- [Compression translator](./network_compression.md)
|
||||
|
||||
Testing/Debugging
|
||||
-----------------
|
||||
|
||||
- [Unit Tests in GlusterFS](./unittest.md)
|
||||
- [Using the Gluster Test
|
||||
Framework](./Using Gluster Test Framework.md) - Step by
|
||||
step instructions for running the Gluster Test Framework
|
||||
- [Our Jenkins Infrastructure](./Jenkins Infrastructure.md) - A
|
||||
braindump of the Jenkins infrastructure we have in place for
|
||||
automated testing
|
||||
- [Manual steps for setting up a Jenkins slave VM in
|
||||
Rackspace](./Jenkins Manual Setup.md) - Steps for setting up a slave
|
||||
VM in Rackspace
|
||||
- [Coredump Analysis](./coredump-analysis.md) - Steps to analyze coredumps generated by regression machines.
|
||||
|
||||
Bug Handling
|
||||
------------
|
||||
|
||||
- [Bug reporting guidelines](./Bug Reporting Guidelines.md) -
|
||||
Guideline for reporting a bug in GlusterFS
|
||||
- [Bug triage guidelines](./Bug Triage.md) - Guideline on how to
|
||||
triage bugs for GlusterFS
|
||||
- [Bug report life cycle in
|
||||
Bugzilla](./Bug report Life Cycle.md) - Information about bug
|
||||
life cycle
|
||||
|
||||
Patch Acceptance
|
||||
----------------
|
||||
|
||||
- The [Guidelines For
|
||||
Maintainers](./Guidelines For Maintainers.md) explains when
|
||||
maintainers can merge patches.
|
||||
|
||||
Release Process
|
||||
---------------
|
||||
|
||||
- [Versioning](./versioning.md)
|
||||
- [GlusterFS Release Process](./GlusterFS Release process.md) -
|
||||
Our release process / checklist
|
||||
Gluster Developer documentation can be found [here](https://github.com/gluster/glusterfs/tree/master/doc/developer-guide)
|
||||
|
||||
@@ -1,457 +0,0 @@
|
||||
Development work flow of Gluster
|
||||
================================
|
||||
|
||||
This document provides a detailed overview of the development model
|
||||
followed by the GlusterFS project.
|
||||
|
||||
For a simpler overview visit
|
||||
[Simplified development workflow](./Simplified Development Workflow.md).
|
||||
|
||||
Basics
|
||||
------
|
||||
|
||||
The GlusterFS development model largely revolves around the features and
|
||||
functionality provided by Git version control system, Gerrit code review
|
||||
system and Jenkins continuous integration system. This document is a primer for a
|
||||
contributor to the project.
|
||||
|
||||
### Git
|
||||
|
||||
Git is an extremely flexible, distributed version control system.
|
||||
GlusterFS' main git repository is at <http://git.gluster.org> and public
|
||||
mirrors are at GlusterForge
|
||||
(https://forge.gluster.org/glusterfs-core/glusterfs) and at GitHub
|
||||
(https://github.com/gluster/glusterfs). The development repo is hosted
|
||||
inside Gerrit and every code merge is instantly replicated to the public
|
||||
mirrors.
|
||||
|
||||
A good introduction to Git can be found at
|
||||
<http://www-cs-students.stanford.edu/~blynn/gitmagic/>.
|
||||
|
||||
### Gerrit
|
||||
|
||||
Gerrit is an excellent code review system which is developed with a git
|
||||
based workflow in mind. The GlusterFS project code review system is
|
||||
hosted at [review.gluster.org](http://review.gluster.org). Gerrit works
|
||||
on "Change"s. A change is a set of modifications to various files in
|
||||
your repository to accomplish a task. It is essentially one large git
|
||||
commit with all the necessary changes which can be both built and
|
||||
tested.
|
||||
|
||||
Gerrit usage is described later in 'Review Process' section.
|
||||
|
||||
### Jenkins
|
||||
|
||||
Jenkins is a Continuous Integration build system. Jenkins is hosted at
|
||||
<http://build.gluster.org>. Jenkins is configured to work with Gerrit by
|
||||
setting up hooks. Every "Change" which is pushed to Gerrit is
|
||||
automatically picked up by Jenkins, built and smoke tested. Output of
|
||||
all builds and tests can be viewed at
|
||||
<http://build.gluster.org/job/smoke/>. Jenkins is also set up with a
|
||||
'regression' job which is designed to execute test scripts provided as
|
||||
part of the code change.
|
||||
|
||||
Preparatory Setup
|
||||
-----------------
|
||||
|
||||
Here is a list of initial one-time steps before you can start hacking on
|
||||
code.
|
||||
|
||||
### Register
|
||||
|
||||
Sign up for an account at <http://review.gluster.org> by clicking
|
||||
'Register' at the top right. You can use your Gmail login as the
|
||||
openID identity.
|
||||
|
||||
### Preferred email
|
||||
|
||||
On first login, add your git/work email to your identity. You will have
|
||||
to click on the URL which is sent to your email and set up a proper Full
|
||||
Name. Make sure you set your git/work email as your preferred email.
|
||||
This should be the email address with which all your code commits are
|
||||
associated.
|
||||
|
||||
### Set Username
|
||||
|
||||
Select a username for yourself.
|
||||
|
||||
### Watch glusterfs
|
||||
|
||||
In Gerrit settings, watch the 'glusterfs' project. Tick all three
|
||||
(New Changes, All Comments, Submitted Changes) types of notifications.
|
||||
|
||||
### Email filters
|
||||
|
||||
Set up a filter rule in your mail client to tag or classify mails with
|
||||
the header
|
||||
|
||||
List-Id: <gerrit-glusterfs.review.gluster.org>
|
||||
|
||||
as mails originating from the review system.
|
||||
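Mail clients differ in how filters are configured, so the following is only a sketch of the matching rule itself, written as a shell function. `is_gerrit_mail` is a hypothetical name and the sample message is made up; for simplicity it scans the whole message on standard input rather than the header block only.

```shell
#!/bin/sh
# Succeed if the message on stdin carries the Gerrit List-Id header.
is_gerrit_mail() {
    grep -qi '^List-Id: <gerrit-glusterfs\.review\.gluster\.org>'
}

# A two-line sample header, for illustration only.
printf 'Subject: Change in glusterfs\nList-Id: <gerrit-glusterfs.review.gluster.org>\n' |
    is_gerrit_mail && echo "tag: gerrit"
```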
|
||||
### SSH keys
|
||||
|
||||
Provide your SSH public key into Gerrit so that you can successfully
|
||||
access the development git repo as well as push changes for
|
||||
review/merge.
|
||||
|
||||
### Clone a working tree
|
||||
|
||||
Get yourself a working tree by cloning the development repository from
|
||||
Gerrit
|
||||
|
||||
sh$ git clone ssh://[username@]git.gluster.org/glusterfs.git glusterfs
|
||||
|
||||
Branching policy
|
||||
----------------
|
||||
|
||||
This section describes both the branching policies on the public repo
|
||||
as well as the suggested best-practice for local branching
|
||||
|
||||
### Master/release branches
|
||||
|
||||
In glusterfs.git, the master branch is the forward development branch.
|
||||
This is where new features come in first. In fact this is where almost
|
||||
every change (commit) comes in first. The master branch is always kept
|
||||
in a buildable state and smoke tests pass.
|
||||
|
||||
Release trains (3.1.z, 3.2.z, 3.3.z) each have a branch originating from
|
||||
master. Code freeze of each new release train is marked by the creation
|
||||
of the release-3.y branch. At this point no new features are added to
|
||||
the release-3.y branch. All fixes and commits first get into master.
|
||||
From there, only bug fixes get backported to the relevant release
|
||||
branches. From the release-3.y branch, actual release code snapshots
|
||||
(e.g. glusterfs-3.2.2) are tagged (as annotated tags, using 'git tag -a')
and shipped as a tarball.
|
||||
|
||||
### Personal per-task branches
|
||||
|
||||
As a best practice, it is recommended you perform all code changes for a
|
||||
task in a local branch in your working tree. The local branch should be
|
||||
created from the upstream branch to which you intend to submit the
|
||||
change. If you are submitting changes to master branch, first create a
|
||||
local task branch like this -
|
||||
|
||||
sh$ git checkout master
|
||||
sh$ git branch bug-XYZ && git checkout bug-XYZ
|
||||
... <hack, commit>
|
||||
|
||||
If you are backporting a fix to a release branch, or making a new change
|
||||
to a release branch, your commands would be slightly different. If you
|
||||
are checking out a release branch in your local working tree for the
|
||||
first time, make sure to set it up as a remote tracking branch like this:
|
||||
|
||||
sh$ git checkout -b release-3.2 origin/release-3.2
|
||||
|
||||
The above step does not need to be repeated. In the future, if you
want to work on the release branch -
|
||||
|
||||
sh$ git checkout release-3.2
|
||||
sh$ git branch bug-XYZ-release-3.2 && git checkout bug-XYZ-release-3.2
|
||||
... <cherry-pick, hack, commit>
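
The release-branch steps above can be tried out safely against a throwaway repository. The sketch below is only an illustration: the temporary directories, the stand-in `upstream` repository and the `bug-XYZ` names are placeholders, not real GlusterFS branches or bugs. It uses `git checkout -b`, which creates and checks out a branch in one step and is equivalent to the `git branch ... && git checkout ...` pair shown above.

```shell
# Build a stand-in "upstream" with master and release-3.2 branches.
work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" -c user.name="Example Dev" -c user.email="dev@example.com" \
    commit -q --allow-empty -m "initial commit"
git -C "$work/upstream" branch release-3.2

# Clone it, as a developer would clone the real repository.
git clone -q "$work/upstream" "$work/glusterfs"

# First time only: create a local release-3.2 that tracks origin/release-3.2.
git -C "$work/glusterfs" checkout -q -b release-3.2 origin/release-3.2

# Per task: branch off the local release branch.
git -C "$work/glusterfs" checkout -q -b bug-XYZ-release-3.2 release-3.2

# Show the branch the working tree is now on.
git -C "$work/glusterfs" rev-parse --abbrev-ref HEAD
```

Keeping one branch per task means an unfinished backport never blocks work on another bug in the same working tree.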
|
||||
|
||||
Building
|
||||
--------
|
||||
|
||||
### Environment Setup
|
||||
|
||||
**For details about the required packages for the build environment,
refer to [Building GlusterFS](./Building GlusterFS.md).**
|
||||
|
||||
Ubuntu:
|
||||
|
||||
To set up the build environment on an Ubuntu system, type the following
|
||||
command to install the required packages:
|
||||
|
||||
sudo apt-get -y install python-pyxattr libreadline-dev systemtap-sdt-dev \
tar python-pastedeploy python-simplejson python-sphinx python-webob libssl-dev \
pkg-config python-dev python-eventlet python-netifaces libaio-dev libibverbs-dev \
libtool libxml2-dev liblvm2-dev make autoconf automake bison dos2unix flex libfuse-dev
|
||||
|
||||
CentOS/RHEL/Fedora:
|
||||
|
||||
On Fedora systems, install the required packages by following the
|
||||
instructions in [CompilingRPMS](./Compiling RPMS.md).
|
||||
|
||||
### Creating build environment
|
||||
|
||||
Once the required packages are installed for your appropriate system,
|
||||
generate the build configuration:
|
||||
|
||||
sh$ ./autogen.sh
|
||||
sh$ ./configure --enable-fusermount
|
||||
|
||||
### Build and install
|
||||
|
||||
#### GlusterFS
|
||||
|
||||
Ubuntu:
|
||||
|
||||
Type the following to build and install GlusterFS on the system:
|
||||
|
||||
sh$ make
|
||||
sh$ make install
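
For repeated builds it can be convenient to chain the configuration and build steps so that a failure stops the sequence early. This is a minimal sketch; the function name `build_glusterfs` is hypothetical, and `make install` is left out because it normally needs root privileges.

```shell
# Hypothetical wrapper: run the configure and build steps in a source tree,
# stopping at the first step that fails.
build_glusterfs() {
    src=${1:-.}
    (
        # The subshell keeps the caller's working directory unchanged.
        cd "$src" || exit 1
        ./autogen.sh &&
        ./configure --enable-fusermount &&
        make
    )
}

# Usage (the path is a placeholder for your working tree):
#   build_glusterfs /path/to/glusterfs && sudo make -C /path/to/glusterfs install
```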
|
||||
|
||||
CentOS/RHEL/Fedora:
|
||||
|
||||
On an RPM-based system, there are two methods to build GlusterFS. One is
to use the method described above for *Ubuntu*. The other is to build and
|
||||
install RPMS as described in [CompilingRPMS](./Compiling RPMS.md).
|
||||
|
||||
#### GlusterFS UFO/SWIFT
|
||||
|
||||
To build and run Gluster UFO you can do the following:
|
||||
|
||||
1. Build, create, and install the RPMS as described in
|
||||
[CompilingRPMS](./Compiling RPMS.md).
|
||||
2. Configure UFO/SWIFT as described in [Howto Using UFO SWIFT - A quick
|
||||
and dirty setup
|
||||
guide](http://www.gluster.org/2012/09/howto-using-ufo-swift-a-quick-and-dirty-setup-guide)
|
||||
|
||||
Commit policy
|
||||
-------------
|
||||
|
||||
For a Gerrit based work flow, each commit should be an independent,
|
||||
buildable and testable change. Typically you would have a local branch
|
||||
per task, and most of the time that branch will have one commit.
|
||||
|
||||
If you have a second task at hand which depends on the changes of the
|
||||
first one, then technically you can have it as a separate commit on top
|
||||
of the first commit. But it is important that the first commit should be
|
||||
a testable change by itself (if not, it is an indication that the two
|
||||
commits are essentially part of a single change). Gerrit accommodates
|
||||
these situations by marking Change 1 as a "dependency" of Change 2
|
||||
(there is a 'Dependencies' tab in the Change page in Gerrit)
|
||||
automatically when you push the changes for review from the same local
|
||||
branch.
|
||||
|
||||
You will need to sign off your commit (git commit -s) before sending the
patch for review. By signing off your patch, you agree to the terms
listed under the "Developer's Certificate of Origin" section in the
CONTRIBUTING file available in the repository root.
|
||||
|
||||
Provide a meaningful commit message. Your commit message should be in
|
||||
the following format:
|
||||
|
||||
- A short one line subject describing what the patch accomplishes
|
||||
- An empty line following the subject
|
||||
- Situation necessitating the patch
|
||||
- Description of the code changes
|
||||
- Reason for doing it this way (compared to others)
|
||||
- Description of test cases
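
The structural parts of this format, together with the sign-off requirement, can be checked mechanically. The following is a sketch only: `check_commit_msg` is a hypothetical helper, not a hook shipped with GlusterFS, and it cannot judge whether the description itself is meaningful.

```shell
# Hypothetical helper: check the basic shape of a commit message file.
check_commit_msg() {
    msg_file=$1
    subject=$(sed -n '1p' "$msg_file")
    second=$(sed -n '2p' "$msg_file")

    # A non-empty one-line subject must come first.
    [ -n "$subject" ] || { echo "missing subject line" >&2; return 1; }
    # The subject must be followed by an empty line.
    [ -z "$second" ] || { echo "line 2 must be empty" >&2; return 1; }
    # 'git commit -s' adds the Signed-off-by line.
    grep -q '^Signed-off-by: ' "$msg_file" ||
        { echo "missing Signed-off-by (use 'git commit -s')" >&2; return 1; }
}

# Usage: check the message of the most recent commit.
#   git log -1 --format=%B > /tmp/msg && check_commit_msg /tmp/msg
```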
|
||||
|
||||
### Test cases
|
||||
|
||||
Part of the workflow is to aggregate and execute pre-commit test cases
|
||||
which accompany patches, cumulatively for every new patch. This
|
||||
guarantees that tests which currently pass are not broken
by the new patch. Every change submitted to Gerrit must include test
cases in
|
||||
|
||||
tests/group/script.t
|
||||
|
||||
as part of the patch. This is so that code changes and accompanying test
|
||||
cases are reviewed together. Every new commit falls into one of the
following categories with respect to test cases:
|
||||
|
||||
#### New 'group' directory and/or 'script.t'
|
||||
|
||||
This is typically when code is adding a new module and/or feature
|
||||
|
||||
#### Extend/Modify old test cases in existing scripts
|
||||
|
||||
This is typically when present behavior (default values etc.) of code is
|
||||
changed
|
||||
|
||||
#### No test cases
|
||||
|
||||
This is typically when code change is trivial (e.g. fixing typos in
|
||||
output strings, code comments)
|
||||
|
||||
#### Only test case and no code change
|
||||
|
||||
This is typically when we are adding test cases to old code (already
|
||||
existing before this regression test policy was enforced)
|
||||
|
||||
More details on how to work with test case scripts can be found in
|
||||
|
||||
tests/README
|
||||
|
||||
Review process
|
||||
--------------
|
||||
|
||||
### rfc.sh
|
||||
|
||||
After doing the local commit, it is time to submit the code for review.
|
||||
There is a script available inside glusterfs.git called rfc.sh. You can
|
||||
submit your changes for review by simply executing
|
||||
|
||||
sh$ ./rfc.sh
|
||||
|
||||
This script does the following:
|
||||
|
||||
- The first time it is executed, it downloads a git hook from
|
||||
<http://review.gluster.org/tools/hooks/commit-msg> and sets it up
|
||||
locally to generate a Change-Id: tag in your commit message (if it
|
||||
was not already generated.)
|
||||
- Rebase your commit against the latest upstream HEAD. This rebase
|
||||
also causes your commits to undergo massaging from the just
|
||||
downloaded commit-msg hook.
|
||||
- Prompt for a Bug Id for each commit (if it was not already provided)
|
||||
and include it as a "BUG:" tag in the commit log. You can just hit
|
||||
<enter> at this prompt if your submission is purely for review
|
||||
purposes.
|
||||
- Push the changes to review.gluster.org for review. If you had
|
||||
provided a bug id, it assigns the topic of the change as "bug-XYZ".
|
||||
If not it sets the topic as "rfc".
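
The topic naming in the last step can be summarised in a few lines of shell. This is an illustration of the behaviour described above, not an excerpt from rfc.sh; the function name `review_topic` is an assumption.

```shell
# Illustration only: choose the review topic the way the text describes.
review_topic() {
    bug_id=$1
    if [ -n "$bug_id" ]; then
        echo "bug-$bug_id"   # a Bug Id was entered at the prompt
    else
        echo "rfc"           # <enter> was pressed without a Bug Id
    fi
}

review_topic 123456
review_topic ""
```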
|
||||
|
||||
On a successful push, you will see a URL pointing to the change in
|
||||
review.gluster.org
|
||||
|
||||
Auto verification
|
||||
-----------------
|
||||
|
||||
The integration between Jenkins and Gerrit triggers an event in Jenkins
|
||||
on every push of changes, to pick up the change and run build and smoke
|
||||
test on it.
|
||||
|
||||
If the build and smoke tests execute successfully, Jenkins marks the
change as '+0 Verified'. If they fail, '-1 Verified' is marked on the
change. This means that passing the automated smoke test is a necessary
but not sufficient condition.
|
||||
|
||||
It is important to note that Jenkins verification is only a generic
|
||||
verification of high level tests. More concentrated testing effort for
|
||||
the patch is necessary with manual verification.
|
||||
|
||||
If auto verification fails, it is a good reason to skip code review till
|
||||
a fixed change is pushed later. You can click on the build URL
|
||||
automatically put as a comment to inspect the reason for auto
|
||||
verification failure. In the Jenkins job page, you can click on the
|
||||
'Console Output' link to see the exact point of failure.
|
||||
|
||||
Reviewing / Commenting
|
||||
----------------------
|
||||
|
||||
Code review with Gerrit is relatively easy compared to other available
|
||||
tools. Each change is presented as multiple files and each file can be
|
||||
reviewed in Side-by-Side mode. While reviewing it is possible to comment
|
||||
on each line by double-clicking on it and writing in your comments in
|
||||
the text box. Such in-line comments are saved as drafts, till you
|
||||
finally publish them as a Review from the 'Change page'.
|
||||
|
||||
There are many small and handy features in Gerrit, like 'starring'
|
||||
changes you are interested in following, setting the amount of context to
|
||||
view in the side-by-side view page etc.
|
||||
|
||||
Incorporate, Amend, rfc.sh, Reverify
|
||||
------------------------------------
|
||||
|
||||
You are notified of code review comments via email. After incorporating the
changes in code, you can mark each of the inline comments as 'done'
(optional). After making all the changes to your local files, amend the
previous commit with these changes with -
|
||||
|
||||
sh$ git commit -a --amend
|
||||
|
||||
Push the amended commit by executing rfc.sh. If your previous push was
|
||||
an "rfc" push (i.e, without a Bug Id) you will be prompted for a Bug Id
|
||||
again. You can re-push an rfc change without any other code change too
|
||||
by giving a Bug Id.
|
||||
|
||||
On the new push, Jenkins will re-verify the new change (independent of
|
||||
what the verification result was for the previous push).
|
||||
|
||||
It is the Change-Id line in the commit log (which does not change) that
|
||||
associates the new push as an update for the old push (even though they
|
||||
had different commit ids) under the same Change. In the side-by-side
|
||||
view page, it is possible to set knobs in the 'Patch History' tab to
|
||||
view changes between patches as well. This is handy to inspect how
|
||||
review comments were incorporated.
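
The role of the Change-Id can be seen in a throwaway repository. The sketch below assumes git is installed; the repository, the file and the Change-Id value are made up for the demonstration (normally the commit-msg hook generates the Change-Id). Amending produces a new commit id while the Change-Id line is carried over unchanged.

```shell
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example Dev"
git -C "$repo" config user.email "dev@example.com"

# First push candidate: a commit carrying a (made-up) Change-Id line.
echo "first attempt" > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" commit -q -m "demo: add file" \
    -m "Change-Id: I0123456789abcdef0123456789abcdef01234567"
old_commit=$(git -C "$repo" rev-parse HEAD)
old_change=$(git -C "$repo" log -1 --format=%B | grep '^Change-Id:')

# Incorporate review comments and amend, keeping the commit message.
echo "after review" > "$repo/file.txt"
git -C "$repo" commit -q -a --amend --no-edit
new_commit=$(git -C "$repo" rev-parse HEAD)
new_change=$(git -C "$repo" log -1 --format=%B | grep '^Change-Id:')

echo "commit id: $old_commit -> $new_commit"
echo "old $old_change"
echo "new $new_change"
```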
|
||||
|
||||
If further changes are found necessary, comments can be made on the new
|
||||
patch as well, and the same cycle repeats.
|
||||
|
||||
If no further changes are necessary, the reviewer can mark the patch as
|
||||
reviewed with a certain score depending on the depth of review and
|
||||
confidence (+1 or +2). A -1 review indicates non-agreement for the
|
||||
change to get merged upstream.
|
||||
|
||||
Regression tests and test cases
|
||||
-------------------------------
|
||||
|
||||
All code changes which are not trivial (typo fixes, code comment
|
||||
changes) must be accompanied by either a new test case script or an
extension/modification of an existing test case script. It is important to review
|
||||
the test case in conjunction with the code change to analyse whether the
|
||||
code change is actually verified by the test case.
|
||||
|
||||
Regression tests (i.e., execution of all test cases accumulated with
every commit) are not triggered automatically, as the test cases can be
extensive and are quite expensive to execute for every change submission
in the review/resubmit cycle. Instead they are triggered by the
maintainers, after code review. Passing the regression tests is a
necessary condition for merge, along with code review points.
|
||||
|
||||
Submission Qualifiers
|
||||
---------------------
|
||||
|
||||
For a change to get merged, there are two qualifiers which are enforced
|
||||
by the Gerrit system: a change should have at least one '+2
|
||||
Reviewed', and a change should have at least one '+1 Verified'
|
||||
(regression test). The project maintainer will merge the changes once a
|
||||
patch meets these qualifiers.
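
Put together with the blocking votes described in the next section, the merge conditions can be sketched as a small predicate. This is a simplified model for illustration, not Gerrit's actual rule engine; the function name and the way votes are passed in (as space-separated lists) are assumptions.

```shell
# Usage: is_mergeable "<Code-Review votes>" "<Verified votes>"
# Simplified model: needs a +2 Code-Review and a +1 Verified, and is
# blocked by any -2 Code-Review or -1 Verified on the current patchset.
is_mergeable() {
    cr_votes=$1
    v_votes=$2
    has_plus2=no; has_minus2=no; has_v_plus1=no; has_v_minus1=no
    for v in $cr_votes; do
        case $v in
            +2) has_plus2=yes ;;
            -2) has_minus2=yes ;;
        esac
    done
    for v in $v_votes; do
        case $v in
            +1) has_v_plus1=yes ;;
            -1) has_v_minus1=yes ;;
        esac
    done
    [ "$has_plus2" = yes ] && [ "$has_v_plus1" = yes ] &&
        [ "$has_minus2" = no ] && [ "$has_v_minus1" = no ]
}

is_mergeable "+1 +2" "+1" && echo "ready for the maintainer to merge"
is_mergeable "+2 -2" "+1" || echo "blocked"
```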
|
||||
|
||||
Submission Disqualifiers
|
||||
------------------------
|
||||
|
||||
There are three types of "negative votes".
|
||||
|
||||
-1 Verified
|
||||
|
||||
-1 Code-Review ("I would prefer that you didn't submit this")
|
||||
|
||||
-2 Code-Review ("Do not submit")
|
||||
|
||||
The implication and scope of each of the three are different. They
|
||||
behave differently as changes are resubmitted as new patchsets.
|
||||
|
||||
### -1 Verified
|
||||
|
||||
Anybody voting -1 Verified will prevent \*that patchset only\* from
|
||||
getting merged. The flag is automatically cleared on the next patchset
|
||||
post. The intention is that this vote is based on the result of some
|
||||
kind of testing. A voter is expected to explain the test case which
|
||||
failed. Jenkins jobs (smoke, regression, ufounit) use this field for
|
||||
voting -1/0/+1. When voting -1, Jenkins posts the link to the URL which
|
||||
has the console output of the failed job.
|
||||
|
||||
### -1 Code-Review ("I would prefer that you didn't submit this")
|
||||
|
||||
This is an advisory vote based on the content of the patch. It typically
reflects issues in source code (both design and implementation), source code
comments, log messages, license headers etc. found by human inspection.
|
||||
The reviewer explains the specific issues by commenting against the most
|
||||
relevant lines of source code in the patch. On a resubmission, -1 votes
|
||||
are cleared automatically. It is the responsibility of the maintainers
to honor -1 Code-Review votes from reviewers (by not merging the
patches), and to check that -1 comments on previous submissions are
addressed in the new patchset. Generally this is the recommended
|
||||
"negative" vote.
|
||||
|
||||
### -2 Code-Review ("Do not submit")
|
||||
|
||||
This is a stronger vote which actually prevents Gerrit from merging the
|
||||
patch. The -2 vote persists even after resubmission and continues to
|
||||
prevent the patch from getting merged, until the voter revokes the -2
|
||||
vote (and then is further subjected to Submission Qualifiers). Typically
|
||||
one would vote -2 if they are \*against the goal\* of what the patch is
|
||||
trying to achieve (and not an issue with the patch, which can change on
|
||||
resubmission). A reviewer would also vote -2 on a patch even if there is
|
||||
agreement with the goal, but the issue in the code is of such a critical
|
||||
nature that the reviewer personally wants to inspect the next patchset
|
||||
and only then revoke the vote after finding the new patch satisfactory.
|
||||
This prevents the merge of the patch in the mean time. Every registered
|
||||
user has the right to exercise the -2 Code-Review vote, and it cannot be
overridden by the maintainers.
|
||||
@@ -1,35 +0,0 @@
|
||||
Fixing easy bugs is an excellent method to start contributing patches to
|
||||
Gluster.
|
||||
|
||||
- Bugs which are marked with the EasyFix flag can be found with the
Bugzilla query below.
|
||||
- [Bugzilla Query For EasyFix
|
||||
Bugs](https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&keywords=EasyFix&list_id=2626252&product=GlusterFS)
|
||||
- [RSS-feed for EasyFix Gluster Bugs](http://goo.gl/OpQwlv)
|
||||
- To fix EasyFix bugs,
|
||||
- When you pick an EasyFix you want to work on, assign it to
|
||||
yourself and move it to ASSIGNED
|
||||
- Check
|
||||
[Bug report life cycle](./Bug report Life Cycle.md) and
|
||||
follow it.
|
||||
- Check Developers page for details about development workflow,
|
||||
GlusterFS design documents etc.
|
||||
|
||||
Sometimes an *Easy Fix* bug has a patch attached. In those cases,
|
||||
the *Patch* keyword has been added to the bug. These bugs can be
|
||||
used by new contributors that would like to verify their workflow. [Bug
|
||||
1099645](https://bugzilla.redhat.com/1099645) is one example of those.
|
||||
|
||||
### Guidelines for newcomers
|
||||
|
||||
- While trying to write a patch, do not hesitate to ask questions.
|
||||
- If something in the documentation is unclear, we do need to know so
|
||||
that we can improve it.
|
||||
- There are no stupid questions, and it's more stupid to not ask
|
||||
questions that others can easily answer. Always assume that if you
|
||||
have a question, someone else would like to hear the answer too.
|
||||
|
||||
[Reach out](http://gluster.org/community/index.html) to the developers
|
||||
in \#gluster or \#gluster-dev on Freenode IRC, or on one of the mailing
|
||||
lists. Try to keep the discussions public so that anyone can learn from
them.
|
||||
@@ -1,66 +0,0 @@
|
||||
Static Code Analysis Tools
|
||||
--------------------------
|
||||
|
||||
Bug fixes for issues reported by *Static Code Analysis Tools* should
|
||||
follow [Development Work Flow](./Development Workflow.md)
|
||||
|
||||
### Coverity
|
||||
|
||||
GlusterFS is part of [Coverity's](https://scan.coverity.com/) scan
|
||||
program.
|
||||
|
||||
- To see Coverity issues you have to be a member of the GlusterFS
|
||||
project in Coverity scan website.
|
||||
- Here is the link to [Coverity scan
|
||||
website](https://scan.coverity.com/projects/987)
|
||||
- Go to the above link and subscribe to the GlusterFS project (as
contributor). This sends a request to the admins to include you in
the project.
|
||||
- Once admins for the GlusterFS Coverity scan approve your request,
|
||||
you will be able to see the defects raised by Coverity.
|
||||
- [BZ 789278](https://bugzilla.redhat.com/show_bug.cgi?id=789278)
|
||||
should be used as an umbrella bug for Coverity issues in master
|
||||
branch unless you are trying to fix a specific bug in Bugzilla.
|
||||
- While sending patches for fixing Coverity issues please use the
|
||||
same bug number.
|
||||
- For 3.6 branch the Coverity tracking bug is
|
||||
[1122834](https://bugzilla.redhat.com/show_bug.cgi?id=1122834)
|
||||
- When you decide to work on an issue, please assign it to yourself
on the same Coverity website so that we don't step on each other's
work.
- When marking a bug intentional on the Coverity scan website, please add
an explanation so that others can understand the reasoning behind it.
|
||||
|
||||
*If you have more questions, please send them to the
|
||||
[gluster-devel](http://www.gluster.org/interact/mailinglists) mailing
|
||||
list*
|
||||
|
||||
### CPP Check
|
||||
|
||||
Cppcheck is available in Fedora and EL's EPEL repo
|
||||
|
||||
- Install Cppcheck
|
||||
|
||||
yum install cppcheck
|
||||
|
||||
- Clone GlusterFS code
|
||||
|
||||
git clone https://github.com/gluster/glusterfs glusterfs
|
||||
|
||||
- Run Cppcheck
|
||||
|
||||
cppcheck glusterfs/ 2>cppcheck.log
|
||||
|
||||
- [BZ 1091677](https://bugzilla.redhat.com/show_bug.cgi?id=1091677)
|
||||
should be used for submitting patches to master branch for Cppcheck
|
||||
reported issues.
|
||||
|
||||
### Daily Runs
|
||||
|
||||
We now have daily runs of various static source code analysis tools on
|
||||
the glusterfs sources. There are daily analyses of the master,
|
||||
release-3.6, and release-3.5 branches.
|
||||
|
||||
Results are posted at
|
||||
<http://download.gluster.org/pub/gluster/glusterfs/static-analysis/>
|
||||
@@ -1,73 +0,0 @@
|
||||
Release Process for GlusterFS
|
||||
=============================
|
||||
|
||||
Create tarball
|
||||
--------------
|
||||
|
||||
1. Add the release-notes to the docs/release-notes/ directory in the
|
||||
sources
|
||||
2. after merging the release-notes, create a tag like v3.6.2
|
||||
3. push the tag to git.gluster.org
|
||||
4. create the tarball with the [release job in
|
||||
Jenkins](http://build.gluster.org/job/release/)
|
||||
|
||||
Notify packagers
|
||||
----------------
|
||||
|
||||
Notify the packagers that we need packages created. Provide the link to the
|
||||
source tarball from the Jenkins release job to the [packagers
|
||||
mailinglist](mailto:packaging@gluster.org). A list of the people involved in
|
||||
the package maintenance for the different distributions is in the `MAINTAINERS`
|
||||
file in the sources.
|
||||
|
||||
Create a new Tracker Bug for the next release
|
||||
---------------------------------------------
|
||||
|
||||
The tracker bugs are used as guidance for blocker bugs and should get created when a release is made. To create one:
|
||||
|
||||
- file a [new bug in Bugzilla](https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS)
|
||||
- base the contents on previous tracker bugs, like the one for [glusterfs-3.5.5](https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.5.5)
|
||||
- set the **Alias** (it is a text-field) of the bug to 'glusterfs-a.b.c' where a.b.c is the next minor version
|
||||
- save the new bug
|
||||
- you should now be able to use 'glusterfs-a.b.c' to access the bug; the alias can replace the BZ# in URLs or in **blocks** fields
|
||||
- bugs that were not fixed in this release, but were added to the tracker should be moved to the new tracker
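
Deriving the alias for the next tracker from the version just released is simple enough to script. A small sketch, assuming versions always have the plain a.b.c form; the function name `next_tracker_alias` is hypothetical.

```shell
# Hypothetical helper: print the tracker alias for the next minor version.
next_tracker_alias() {
    version=$1               # version just released, e.g. 3.5.5
    prefix=${version%.*}     # everything before the last dot, e.g. 3.5
    last=${version##*.}      # the last component, e.g. 5
    echo "glusterfs-${prefix}.$((last + 1))"
}

next_tracker_alias 3.5.5
```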
|
||||
|
||||
|
||||
Create Release Announcement
|
||||
---------------------------
|
||||
|
||||
Create the Release Announcement (this is often done while people are
|
||||
making the packages). The contents of the release announcement can be
|
||||
based on the release notes, or should at least have a pointer to them.
|
||||
|
||||
Examples:
|
||||
|
||||
- [blog](http://blog.gluster.org/2014/11/glusterfs-3-5-3beta2-is-now-available-for-testing/)
|
||||
- [release
|
||||
notes](https://github.com/gluster/glusterfs/blob/v3.5.3/doc/release-notes/3.5.3.md)
|
||||
|
||||
Send Release Announcement
|
||||
-------------------------
|
||||
|
||||
Once the Fedora/EL RPMs are ready (and any others that are ready by
|
||||
then), send the release announcement:
|
||||
|
||||
- Gluster Mailing lists
|
||||
- gluster-announce, gluster-devel, gluster-users
|
||||
- Gluster Blog
|
||||
- Gluster Twitter account
|
||||
- Gluster Facebook page
|
||||
- Gluster LinkedIn group - Justin has access
|
||||
- Gluster G+
|
||||
|
||||
Close Bugs
|
||||
----------
|
||||
|
||||
Close the bugs that have all their patches included in the release.
|
||||
Leave a note in the bug report with a pointer to the release
|
||||
announcement.
|
||||
|
||||
Other things to consider
|
||||
------------------------
|
||||
|
||||
- Translations? - Are there strings needing translation?
|
||||
@@ -1,70 +0,0 @@
|
||||
### Guidelines For Maintainers
|
||||
|
||||
GlusterFS has maintainers, sub-maintainers and release maintainers to
|
||||
manage the project's codebase. Sub-maintainers are the owners for
|
||||
specific areas/components of the source tree. Maintainers operate across
|
||||
all components in the source tree. Release maintainers are the owners for
|
||||
various release branches (release-x.y) present in the GlusterFS
|
||||
repository.
|
||||
|
||||
In the guidelines below, release maintainers and sub-maintainers are
|
||||
also implied when there is a reference to maintainers unless it is
|
||||
explicitly called out.
|
||||
|
||||
### Guidelines that Maintainers are expected to adhere to
|
||||
|
||||
1. Ensure qualitative and timely management of patches sent for review.
|
||||
|
||||
2. When merging patches into the repository, maintainers are expected
to:
|
||||
|
||||
a> Merge patches of owned components only.
|
||||
b> Seek approvals from all maintainers before merging a patchset spanning multiple components.
|
||||
c> Ensure that regression tests pass for all patches before merging.
|
||||
d> Ensure that regression tests accompany all patch submissions.
|
||||
e> Ensure that documentation is updated for a noticeable change in user perceivable behavior or design.
|
||||
f> Encourage code unit tests from patch submitters to improve the overall quality of the codebase.
|
||||
g> Not merge patches written by themselves until there is a +2 Code Review vote by other reviewers.
|
||||
|
||||
3. The responsibility for merging a patch into a release branch in
normal circumstances is the release maintainer's. Only in
exceptional situations will maintainers and sub-maintainers merge patches
into a release branch.
|
||||
|
||||
4. Release maintainers will ensure approval from appropriate
|
||||
maintainers before merging a patch into a release branch.
|
||||
|
||||
5. Maintainers have a responsibility to the community. Maintainers are
expected to:
|
||||
|
||||
a> Facilitate the community in all aspects.
|
||||
b> Be very active and visible in the community.
|
||||
c> Be objective and consider the larger interests of the community ahead of individual interests.
|
||||
d> Be receptive to user feedback.
|
||||
e> Address concerns & issues affecting users.
|
||||
f> Lead by example.
|
||||
|
||||
### Queries on Guidelines
|
||||
|
||||
Any questions or comments regarding these guidelines can be routed to
|
||||
gluster-devel at gluster dot org.
|
||||
|
||||
### Patches in Gerrit
|
||||
|
||||
Gerrit can be used to list patches that need reviews and/or can get
|
||||
merged. Some queries have been prepared for this; edit the search box in
|
||||
Gerrit to make your own variation:
|
||||
|
||||
- [3.5 open reviewed/verified (non
|
||||
rfc)](http://review.gluster.org/#/q/project:glusterfs+branch:release-3.5+status:open+%28label:Code-Review%253D%252B1+OR+label:Code-Review%253D%252B2+OR+label:Verified%253D%252B1%29+NOT+topic:rfc+NOT+label:Code-Review%253D-2,n,z)
|
||||
- [All open 3.5 patches (non
|
||||
rfc)](http://review.gluster.org/#/q/project:glusterfs+branch:release-3.5+status:open+NOT+topic:rfc,n,z)
|
||||
- [Open NFS (master
|
||||
branch)](http://review.gluster.org/#/q/project:glusterfs+branch:master+status:open+message:nfs,n,z)
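
The search expressions behind these links follow a regular pattern, so variations are easy to generate. The sketch below only assembles the query text to paste into the Gerrit search box; the function name `open_patches_query` is hypothetical, and the operators are the ones already used in the links above.

```shell
# Hypothetical helper: build a query for open, non-rfc patches on a branch.
open_patches_query() {
    branch=$1
    echo "project:glusterfs branch:$branch status:open NOT topic:rfc"
}

open_patches_query release-3.5
```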
|
||||
|
||||
Another option can be used in combination with the Gerrit queries, and
|
||||
has support for filename/directory matching (the queries above do not).
|
||||
Go to the [settings](http://review.gluster.org/#/settings/projects) in
|
||||
your Gerrit profile, and enter filters like these:
|
||||
|
||||

|
||||
@@ -1,127 +0,0 @@
|
||||
We're using Gerrit and [Jenkins](http://jenkins-ci.org) at the moment.
|
||||
Our Gerrit instance:
|
||||
|
||||
http://review.gluster.org
|
||||
|
||||
It's hosted on an ancient VM (badly needs upgrading) in some hosting
|
||||
place called iWeb. We're wanting to migrate this to a Rackspace VM in
|
||||
the very near future.
|
||||
|
||||
Our main Jenkins instance:
|
||||
|
||||
http://build.gluster.org
|
||||
|
||||
That's also a pretty-out-of-date version of Jenkins, on a badly
|
||||
outdated VM. That one's in Rackspace at least. We intend on migrating to
|
||||
a new VM (and new Jenkins) in the not-too-far-future. No ETA yet. ;)
|
||||
|
||||
As well as those two main pieces, we have a bunch of VM's in Rackspace
|
||||
with various OS's on them:
|
||||
|
||||
http://build.gluster.org/computer/
|
||||
|
||||
In that list we have:
|
||||
|
||||
- bulk\*.cloud.gluster.org\
|
||||
|
||||
- Temporary VM's used for running bulk regression tests on, for
|
||||
analysing our spurious regression failure problem
|
||||
- Setup and maintained by Justin Clift
|
||||
|
||||
- freebsd0.cloud.gluster.org\
|
||||
|
||||
- FreeBSD 10.0 VM in Rackspace. Used for automatic smoke testing
|
||||
on FreeBSD of all proposed patches (uses a Gerrit trigger).
|
||||
|
||||
- g4s-rackspace-\* (apart from gfs-rackspace-f20-1), and
|
||||
tiny-rackspace-f20-1\
|
||||
|
||||
- Various VM's in Rackspace with Fedora and EL6 on them, setup by
|
||||
Luis Pabon. From their description in Jenkins, they're nodes for
|
||||
"open-stack swift executing functional test suite against
|
||||
Gluster-for-Swift".
|
||||
|
||||
- gfs-rackspace-f20-1\
|
||||
|
||||
- A VM in Rackspace for automatically building RPMs on. Setup +
|
||||
maintained by Luis Pabon.
|
||||
|
||||
- netbsd0.cloud.gluster.org\
|
||||
|
||||
- NetBSD 6.1.4 VM in Rackspace. Used for automatic smoke testing
|
||||
on NetBSD 6.x of all proposed patches (uses a Gerrit trigger).
|
||||
- Setup and maintained by Manu Dreyfus
|
||||
|
||||
- netbsd7.cloud.gluster.org\
|
||||
|
||||
- NetBSD 7 (beta) VM in Rackspace. Used for automatic smoke
|
||||
testing on NetBSD 7 of all proposed patches (uses a Gerrit
|
||||
trigger).
|
||||
- Setup and maintained by Manu Dreyfus
|
||||
|
||||
- nbslave7\*.cloud.gluster.org\
|
||||
|
||||
- NetBSD 7 slave VMs for running our regression tests on
|
||||
- Setup and maintained by Manu Dreyfus
|
||||
|
||||
- slave20.cloud.gluster.org - slave49.cloud.gluster.org\
|
||||
|
||||
- CentOS 6.5 VM's in Rackspace. Used for automatic regression
|
||||
testing of all proposed patches (uses a Gerrit trigger).
|
||||
- Setup and maintained by Michael Scherer
|
||||
|
||||
Work is being done on the GlusterFS regression tests so they'll function
|
||||
on FreeBSD and NetBSD (instead of just Linux). When that's complete,
|
||||
we'll automatically run full regression testing on FreeBSD and NetBSD
|
||||
for all proposed patches too.
|
||||
|
||||
Non Jenkins VMs
|
||||
---------------
|
||||
|
||||
**backups.cloud.gluster.org**
|
||||
|
||||
Server holding our nightly backups. Setup and maintained by Michael
|
||||
Scherer.
|
||||
|
||||
**bareos-dev.cloud.gluster.org, bareos-data.cloud.gluster.org**
|
||||
|
||||
Shared VMs to debug Bareos and libgfapi integration. Maintained by
|
||||
Niels de Vos.
|
||||
|
||||
**bugs.cloud.gluster.org**
|
||||
|
||||
Hosting
|
||||
[gluster-bugs-webui](https://github.com/gluster/gluster-bugs-webui)
|
||||
for bug triage/checking. Maintained by Niels de Vos.
|
||||
|
||||
**docs.cloud.gluster.org**
|
||||
|
||||
Documentation server, running readTheDocs - managed by Soumya Deb.
|
||||
|
||||
**download.gluster.org**
|
||||
|
||||
Our primary download server - holds the Gluster binaries we
|
||||
generate, which people can download.
|
||||
|
||||
**gluster-sonar**
|
||||
|
||||
Hosts our Gluster
|
||||
[SonarQube](http://sonar.peircean.com/dashboard/index/com.peircean.glusterfs:glusterfs-java-filesystem)
|
||||
instance. Setup and maintained by Louis Zuckerman.
|
||||
|
||||
**salt-master.gluster.org**
|
||||
|
||||
Our Configuration Mgmt master VM. Maintained by Michael Scherer.
|
||||
|
||||
**munin.gluster.org**
|
||||
|
||||
Munin master. Maintained by Michael Scherer.
|
||||
|
||||
**webbuilder.gluster.org**
|
||||
|
||||
Our builder for the website. Maintained by Michael Scherer.
|
||||
|
||||
**www.gluster.org aka supercolony.gluster.org**
|
||||
|
||||
The main website server. Maintained by Michael Scherer, Justin
|
||||
Clift, Others ( add your name )
|
||||
@@ -1,146 +0,0 @@
Setting up Jenkins slaves on Rackspace for GlusterFS regression testing
=======================================================================

This is for RHEL/CentOS 6.x. The commands below should be run as root.

### Install additional required packages

    yum -y install cmockery2-devel dbench libacl-devel mock nfs-utils yajl perl-Test-Harness salt-minion

### Enable yum-cron for automatic rpm updates

    chkconfig yum-cron on

### Add the mock user

    useradd -g mock mock

### Disable eth1

GlusterFS can fail if more than one Ethernet interface is enabled.

    sed -i 's/ONBOOT=yes/ONBOOT=no/' /etc/sysconfig/network-scripts/ifcfg-eth1

### Disable IPv6

As per <https://access.redhat.com/site/node/8709>

    sed -i 's/IPV6INIT=yes/IPV6INIT=no/' /etc/sysconfig/network-scripts/ifcfg-eth0
    echo 'options ipv6 disable=1' > /etc/modprobe.d/ipv6.conf
    chkconfig ip6tables off
    sed -i 's/NETWORKING_IPV6=yes/NETWORKING_IPV6=no/' /etc/sysconfig/network
    echo ' ' >> /etc/sysctl.conf
    echo '# ipv6 support in the kernel, set to 0 by default' >> /etc/sysctl.conf
    echo 'net.ipv6.conf.all.disable_ipv6 = 1' >> /etc/sysctl.conf
    echo 'net.ipv6.conf.default.disable_ipv6 = 1' >> /etc/sysctl.conf
    sed -i 's/v inet6/- inet6/' /etc/netconfig

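The `echo ... >> /etc/sysctl.conf` commands in the IPv6 section append another copy of each line every time they are run. A minimal sketch of an idempotent variant is shown here. `append_once` is a hypothetical helper name made up for this illustration, and the demo writes to a scratch file instead of the real /etc/sysctl.conf.

```sh
#!/bin/sh
# append_once FILE LINE: add LINE to FILE unless an identical line is already
# present. (Hypothetical helper, not part of any Gluster tooling.)
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Demo against a scratch file; on a real slave FILE would be /etc/sysctl.conf.
conf=$(mktemp)
for i in 1 2; do
    append_once "$conf" 'net.ipv6.conf.all.disable_ipv6 = 1'
    append_once "$conf" 'net.ipv6.conf.default.disable_ipv6 = 1'
done
cat "$conf"    # each setting appears once, even though the loop ran twice
```

Because the helper compares whole lines, re-running the setup on an already prepared slave leaves the file unchanged.
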
### Update hostname

    vi /etc/sysconfig/network
    vi /etc/hosts

### Remove IPv6 and eth1 interface from /etc/hosts

    sed -i 's/^10\./#10\./' /etc/hosts
    sed -i 's/^2001/#2001/' /etc/hosts

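To preview what the two `sed` expressions do before touching the real /etc/hosts, they can be run against a copy. The sketch below uses a made-up sample file. The addresses and host names in it, and the function name, are illustrative assumptions.

```sh
#!/bin/sh
# Comment out the eth1 (10.x) and IPv6 (2001:...) entries in a hosts-style file,
# using the same two sed expressions as the commands above.
comment_out_eth1_and_ipv6() {
    sed -i -e 's/^10\./#10\./' -e 's/^2001/#2001/' "$1"
}

# Sample file with invented entries, standing in for /etc/hosts.
sample=$(mktemp)
cat > "$sample" <<'EOF'
127.0.0.1 localhost
203.0.113.10 slave0.example.org slave0
10.0.0.5 slave0.example.org slave0
2001:db8::5 slave0.example.org slave0
EOF

comment_out_eth1_and_ipv6 "$sample"
cat "$sample"    # the 10.x and 2001: lines now start with '#'
```

Only lines that begin with `10.` or `2001` are changed, so the public IPv4 entry stays active.
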
### Install ntp

    yum -y install ntp
    chkconfig ntpdate on
    service ntpdate start

### Install OpenJDK, needed for Jenkins slaves

    yum -y install java-1.7.0-openjdk

### Create the Jenkins user

    useradd -G wheel jenkins
    chmod 755 /home/jenkins

### Set the Jenkins password

    passwd jenkins

### Copy the Jenkins SSH key from build.gluster.org

    mkdir /home/jenkins/.ssh
    chmod 700 /home/jenkins/.ssh
    cp <somewhere> /home/jenkins/.ssh/id_rsa
    chown -R jenkins:jenkins /home/jenkins/.ssh
    chmod 600 /home/jenkins/.ssh/id_rsa

### Generate the SSH known hosts file for jenkins user

    su - jenkins
    mkdir ~/foo
    cd ~/foo
    git clone ssh://build@review.gluster.org/glusterfs.git
    # This will ask if the new host fingerprint should be added. Choose yes.
    cd ..
    rm -rf ~/foo
    exit

### Install git from RPMForge

    yum -y install http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm
    yum -y --enablerepo=rpmforge-extras update git

### Install the GlusterFS patch acceptance tests

    git clone git://forge.gluster.org/gluster-patch-acceptance-tests/gluster-patch-acceptance-tests.git /opt/qa

### Add the loopback mount point to /etc/fstab

For the 1GB Rackspace VMs, use this:

    echo '/backingstore /d xfs loop 0 2' >> /etc/fstab
    mount /d

For the 2GB and above Rackspace VMs, use this:

    echo '/dev/xvde /d xfs defaults 0 2' >> /etc/fstab
    mount /d

### Create the directories needed for the regression testing

    JDIRS="/var/log/glusterfs /var/lib/glusterd /var/run/gluster /d /d/archived_builds /d/backends /d/build /d/logs /home/jenkins/root"
    mkdir -p $JDIRS
    chown jenkins:jenkins $JDIRS
    chmod 755 $JDIRS
    ln -s /d/build /build

### Create the directories where regression logs are archived

    ADIRS="/archives/archived_builds /archives/logs"
    mkdir -p $ADIRS
    chown jenkins:jenkins $ADIRS
    chmod 755 $ADIRS

### Install Nginx

For making logs available over HTTP:

    yum -y install http://nginx.org/packages/centos/6/noarch/RPMS/nginx-release-centos-6-0.el6.ngx.noarch.rpm
    yum -y install nginx
    lokkit -s http

### Copy the Nginx config file into place

    cp -f /opt/qa/nginx/default.conf /etc/nginx/conf.d/default.conf

### Enable wheel group for sudo

    sed -i 's/# %wheel\tALL=(ALL)\tNOPASSWD/%wheel\tALL=(ALL)\tNOPASSWD/' /etc/sudoers

### Reboot (for networking changes to take effect)

    reboot

### Add forward and reverse DNS entries for the slave into Rackspace DNS

Rackspace recently added [API calls for its Cloud
DNS](https://developer.rackspace.com/docs/cloud-dns/getting-started/?lang=python)
service, so we should be able to fully automate this part as well now.

@@ -1,39 +0,0 @@
GlusterFS 3.4 introduced the libgfapi client API for C programs. This
page lists bindings to the libgfapi C library from other languages.

Go
--

- [gogfapi](https://forge.gluster.org/gogfapi) - Go language bindings
  for libgfapi, aiming to provide an API consistent with the default
  Go file APIs.

Java
----

- [libgfapi-jni](https://github.com/semiosis/libgfapi-jni/) - Low
  level JNI binding for libgfapi
- [glusterfs-java-filesystem](https://github.com/semiosis/glusterfs-java-filesystem) -
  High level NIO.2 FileSystem Provider implementation for the Java
  platform
- [libgfapi-java-io](https://github.com/gluster/libgfapi-java-io) -
  Java bindings for libgfapi, similar to java.io

Python
------

- [libgfapi-python](https://github.com/gluster/libgfapi-python) -
  Libgfapi bindings for Python

Ruby
----

- [libgfapi-ruby](https://github.com/spajus/libgfapi-ruby) - Libgfapi
  bindings for Ruby using FFI

Rust
----

- [gfapi-sys](https://github.com/cholcombe973/Gfapi-sys) - Libgfapi
  bindings for Rust using FFI

@@ -1,99 +0,0 @@
This page lists project ideas suitable for students (for GSoC,
internships, etc.).

Projects with mentors
---------------------

### gfsck - A GlusterFS filesystem check

- A tool to check filesystem integrity and repair it
- I'm currently working on it
- Owner: Xavier Hernandez (Datalab)

### Sub-directory mount support for native GlusterFS mounts

Allow clients to directly mount directories inside a GlusterFS volume,
like how NFS clients can mount directories inside an NFS export.

Mentor: Kaushal <kshlmster at gmail dot com>

### GlusterD services high availability

GlusterD should restart the processes it manages (bricks, NFS server,
self-heal daemon and quota daemon) whenever it detects they have died.

Mentor: Atin Mukherjee <atin.mukherjee83@gmail.com>

### Language bindings for libgfapi

- API/library for accessing gluster volumes

### oVirt GUI for stats

Have pretty graphs and tables in oVirt for the GlusterFS top and profile
commands.

### Monitoring integrations - Munin and others

The more monitoring support we have for GlusterFS the better.

### More compression algorithms for compression xlator

The on-wire compression translator should be extended to support more
compression algorithms. Ideally it should be pluggable.

### Cinder GlusterFS backup driver

Write a driver for Cinder, a part of OpenStack, to allow backup onto
GlusterFS volumes.

### rsockets - sockets for RDMA transport

Coding for RDMA using the familiar socket API should lead to a more
robust RDMA transport.

### Data import tool

Create a tool which will allow importing already existing data in the
brick directories into the gluster volume. This is most likely going to
be a special rebalance process.

### Rebalance improvements

Improve rebalance performance.

### Meta translator

The meta xlator provides a /proc like interface to GlusterFS xlators.
This could be improved upon and the meta xlator could be made a standard
part of the volume graph.

### Geo-replication using REST API

Might be suitable for geo-replication over WAN.

### Quota using underlying FS' quota

GlusterFS quota is currently maintained completely in GlusterFS's
namespace using xattrs. We could make use of the quota capabilities of
the underlying fs (XFS) for better performance.

### Snapshot pluggability

Snapshot should be able to make use of the snapshot support provided by
the underlying filesystem, for example btrfs.

### Compression at rest

Lessons learnt while implementing encryption at rest can be used with
compression at rest.

### File-level deduplication

GlusterFS works on files. So why not have dedup at the level of files as
well?

### Composition xlator for small files

Merge small files into a designated large file using our own custom
semantics. This can improve our small file performance.

@@ -1,238 +0,0 @@
Simplified development workflow for GlusterFS
=============================================

This page gives a simplified model of the development workflow used by
the GlusterFS project. It lists the steps required to get a patch
accepted into the GlusterFS source.

Visit [Development Work Flow](./Development Workflow.md) for a more
detailed description of the workflow.

Initial preparation
-------------------

The GlusterFS development workflow revolves around
[Git](http://git.gluster.org/?p=glusterfs.git;a=summary),
[Gerrit](http://review.gluster.org) and
[Jenkins](http://build.gluster.org).

Using these tools requires some initial preparation.

### Dev system setup

You should install and set up Git on your development system. Use your
distribution-specific package manager to install git. After installation,
configure git. At the minimum, set a git user name and email. To set
them, do,

    $ git config --global user.name "Name"
    $ git config --global user.email <email address>

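A quick way to confirm that both values are in place is to read them back. The sketch below assumes the settings live in the global git configuration, as in the commands above. The function name `git_identity_configured` is made up for this example.

```sh
#!/bin/sh
# Succeeds only if both user.name and user.email are set in the global git config.
git_identity_configured() {
    git config --global --get user.name >/dev/null &&
    git config --global --get user.email >/dev/null
}

if git_identity_configured; then
    echo "git identity: $(git config --global --get user.name) <$(git config --global --get user.email)>"
else
    echo "git identity incomplete: set user.name and user.email first" >&2
fi
```

The email reported here is the one that should also be set as the preferred email in gerrit.
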
You should also generate an ssh key pair if you haven't already done so.
To generate a key pair do,

    $ ssh-keygen

and follow the instructions.

Next, install the build requirements for GlusterFS. Refer to
[Building GlusterFS - Build Requirements](./Building GlusterFS.md#Build Requirements)
for the actual requirements.

### Gerrit setup

To contribute to GlusterFS, you should first register on
[gerrit](http://review.gluster.org).

After registration, you will need to select a username, set a preferred
email and upload the ssh public key in gerrit. You can do this from the
gerrit settings page. Make sure that you set the preferred email to the
email you configured for git.

### Get the source

Git clone the GlusterFS source using

    ssh://<username>@review.gluster.org/glusterfs.git

(replace `<username>` with your gerrit username).

    $ git clone ssh://<username>@review.gluster.org/glusterfs.git

This will clone the GlusterFS source into a subdirectory named glusterfs
with the master branch checked out.

It is essential that you use this link to clone, or else you will not be
able to submit patches to gerrit for review.

Actual development
------------------

The commands in this section are to be run inside the glusterfs source
directory.

### Create a development branch

It is recommended to use separate local development branches for each
change you want to contribute to GlusterFS. To create a development
branch, first checkout the upstream branch you want to work on and
update it. More details on the upstream branching model for GlusterFS
can be found at
[Development Work Flow - Branching\_policy](./Development Workflow.md#branching-policy).

For example, if you want to develop on the master branch,

    $ git checkout master
    $ git pull

Now, create a new branch from master and switch to the new branch. It is
recommended to have descriptive branch names. Do,

    $ git branch <descriptive-branch-name>
    $ git checkout <descriptive-branch-name>

or,

    $ git checkout -b <descriptive-branch-name>

to do both in one command.

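The branch commands can be tried out safely in a scratch repository first. This is a minimal sketch: the repository is a throwaway created with `mktemp`, and the branch name `bug-000000-fix-typo` and the helper `start_topic_branch` are made-up examples.

```sh
#!/bin/sh
# Create a topic branch from the currently checked out branch and switch to it.
start_topic_branch() {
    git checkout -q -b "$1"
}

# Demo in a scratch repository so nothing touches a real glusterfs checkout.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git -c user.name="Example Dev" -c user.email=dev@example.org \
    commit -q --allow-empty -m "initial commit"

start_topic_branch bug-000000-fix-typo
git rev-parse --abbrev-ref HEAD    # prints the name of the new branch
```

In a real checkout you would run `git checkout master` and `git pull` first, so that the new branch starts from an up-to-date upstream branch.
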
### Hack

Once you've switched to the development branch, you can perform the
actual code changes. [Build](./Building GlusterFS) and test to
see if your changes work.

#### Tests

Unless your changes are very minor and trivial, you should also add a
test for your change. Tests ensure that your changes are not
inadvertently broken by later changes. More details on tests can be
found at
[Development Workflow - Test cases](./Development Workflow.md#test-cases)
and
[Development Workflow - Regression tests and test cases](./Development Workflow.md#regression-tests-and-test-cases).

### Regression test

Once your change is working, run the regression test suite to make sure
you haven't broken anything. The regression test suite requires a
working GlusterFS installation and needs to be run as root. To run the
regression test suite, do

    # make install
    # ./run-tests.sh

### Commit your changes

If you haven't broken anything, you can now commit your changes. First
identify the files that you modified/added/deleted using git-status and
stage these files.

    $ git status
    $ git add <list of modified files>

Now, commit these changes using

    $ git commit -s

Provide a meaningful commit message. The commit message policy is
described at
[Development Work Flow - Commit policy](./Development Workflow.md#commit-policy).

It is essential that you commit with the '-s' option, which will
sign off the commit with your configured name and email, as gerrit is
configured to reject patches which are not signed off.

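Before submitting, it can help to confirm that the commit really carries the sign-off line. The following is a small sketch run in a scratch repository. The helper name `has_signoff`, the identity and the commit message are illustrative assumptions.

```sh
#!/bin/sh
# Succeeds if the given commit (default: HEAD) has a Signed-off-by line
# in its commit message.
has_signoff() {
    git log -1 --format=%B "${1:-HEAD}" | grep -q '^Signed-off-by: '
}

# Demo in a scratch repository with a made-up identity.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email dev@example.org

echo "hello" > README
git add README
git commit -q -s -m "doc: add README"

git log -1 --format=%B    # the message now ends with a Signed-off-by line
has_signoff && echo "HEAD is signed off"
```

If the check fails for an existing commit, `git commit --amend -s` adds the missing line without creating a new commit.
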
### Submit for review

To submit your change for review, run the rfc.sh script,

    $ ./rfc.sh

The script will ask you to enter a bugzilla bug id. Every change
submitted to GlusterFS needs a bugzilla entry to be accepted. If you do
not already have a bug id, file a new bug at [Red Hat
Bugzilla](https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS).
Once the patch is submitted for review, the rfc.sh script will return the
gerrit url for the review request.

More details on the rfc.sh script are available at
[Development Work Flow - rfc.sh](./Development Workflow.md#rfc.sh).

Review process
--------------

Your change will now be reviewed by the GlusterFS maintainers and
component owners on [gerrit](http://review.gluster.org). You can follow
and take part in the review process on the change at the review url. The
review process involves several steps.

To find the component owners, check the "MAINTAINERS" file in the root
of the glusterfs source directory.

### Automated verification

Every change submitted to gerrit triggers an initial automated
verification on [jenkins](http://build.gluster.org). The automated
verification ensures that your change doesn't break the build and has an
associated bug-id.

More details can be found at
[Development Work Flow - Auto verification](./Development Workflow.md#auto-verification).

### Formal review

Once the auto verification is successful, the component owners will
perform a formal review. If they are okay with your change, they will
give a positive review. If not, they will give a negative review and add
comments on the reasons.

More information regarding the review qualifiers and disqualifiers is
available at
[Development Work Flow - Submission Qualifiers](./Development Workflow.md#submission-qualifiers)
and
[Development Work Flow - Submission Disqualifiers](./Development Workflow.md#submission-disqualifiers).

If your change gets a negative review, you will need to address the
comments and resubmit your change.

#### Resubmission

Switch to your development branch and make new changes to address the
review comments. Build and test to see if the new changes are working.

Stage your changes and commit your new changes using,

    $ git commit --amend

'--amend' is required to ensure that you update your original commit and
do not create a new commit.

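The effect of `--amend` is easy to see in a scratch repository: the fix is folded into the existing commit, so the number of commits does not grow. This is a sketch with made-up file names and messages. It passes `--no-edit` only so that the demo does not open an editor, and `commit_count` is a helper invented for the example.

```sh
#!/bin/sh
# Number of commits reachable from HEAD.
commit_count() {
    git rev-list --count HEAD
}

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email dev@example.org

echo "first draft" > notes.txt
git add notes.txt
git commit -q -s -m "notes: first draft"
before=$(commit_count)

# Address the "review comments", then fold the fix into the same commit.
echo "second draft" > notes.txt
git add notes.txt
git commit -q --amend --no-edit
after=$(commit_count)

echo "commits before amend: $before, after amend: $after"
```

Both counts are 1 here. The amended commit gets a new commit id, but its message, including the sign-off line, is kept.
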
Now you can resubmit the updated commit for review using the rfc.sh
script.

The formal review process could take a long time. To increase chances
for a speedy review, you can add the component owners as reviewers on
the gerrit review page. This will ensure they notice the change. The
list of component owners can be found in the MAINTAINERS file present in
the GlusterFS source.

### Verification

After a component owner has given a positive review, a maintainer will
run the regression test suite on your change to verify that your change
works and hasn't broken anything. This verification is done with the
help of jenkins.

If the verification fails, you will need to make necessary changes and
resubmit an updated commit for review.

### Acceptance

After successful verification, a maintainer will merge/cherry-pick (as
necessary) your change into the upstream GlusterFS source. Your change
will now be available in the upstream git repo for everyone to use.

@@ -1,270 +0,0 @@
Description
-----------

The Gluster Test Framework is a suite of scripts used for regression
testing of Gluster.

It runs well on RHEL and CentOS (possibly Fedora too, presently being
tested), and is automatically run against every patch submitted to
Gluster [for review](http://review.gluster.org).

The Gluster Test Framework is part of the main Gluster code base, living
under the "tests" subdirectory:

    http://git.gluster.org/?p=glusterfs.git;a=summary

WARNING
-------

Running the Gluster Test Framework deletes “/var/lib/glusterd/\*”.

**DO NOT run it on a server with any data.**

Preparation steps for Ubuntu 14.04 LTS
--------------------------------------

1. \# apt-get install dbench git libacl1-dev mock nfs-common
   nfs-kernel-server libtest-harness-perl libyajl-dev xfsprogs psmisc attr
   acl lvm2 rpm

2. \# apt-get install python-webob python-paste python-sphinx

3. \# apt-get install autoconf automake bison dos2unix flex libfuse-dev
   libaio-dev libibverbs-dev librdmacm-dev libtool libxml2-dev
   libxml2-utils liblvm2-dev make libssl-dev pkg-config libpython-dev
   python-eventlet python-netifaces python-simplejson python-pyxattr
   libreadline-dev systemtap-sdt-dev tar

4. Install cmockery2 from GitHub (https://github.com/lpabon/cmockery2),
   then compile and `make install` as described in its README.

5. Create the mock group and user:

        sudo groupadd mock
        sudo useradd -g mock mock

6. \# mkdir /var/run/gluster

**Note**: The redhat-rpm-config package is not available on Ubuntu.

Preparation steps for CentOS 7 (only)
-------------------------------------

1. Install EPEL:

        $ sudo yum install -y http://epel.mirror.net.in/epel/7/x86_64/e/epel-release-7-1.noarch.rpm

2. Install the CentOS 7.x dependencies:

        $ sudo yum install -y --enablerepo=epel cmockery2-devel dbench git libacl-devel mock nfs-utils perl-Test-Harness yajl xfsprogs psmisc

        $ sudo yum install -y --enablerepo=epel python-webob1.0 python-paste-deploy1.5 python-sphinx10 redhat-rpm-config

    Note: yum reported the packages below as missing, but the setup still
    worked for me.

        No package python-webob1.0 available.
        No package python-paste-deploy1.5 available.
        No package python-sphinx10 available.

        $ sudo yum install -y --enablerepo=epel autoconf automake bison dos2unix flex fuse-devel libaio-devel libibverbs-devel \
            librdmacm-devel libtool libxml2-devel lvm2-devel make openssl-devel pkgconfig \
            python-devel python-eventlet python-netifaces python-paste-deploy \
            python-simplejson python-sphinx python-webob pyxattr readline-devel rpm-build \
            systemtap-sdt-devel tar

3. Create the mock user

        $ sudo useradd -g mock mock

Preparation steps for CentOS 6.3+ (only)
----------------------------------------

1. Install EPEL:

        $ sudo yum install -y http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

2. Install the CentOS 6.x dependencies:

        $ sudo yum install -y --enablerepo=epel cmockery2-devel dbench git libacl-devel mock nfs-utils perl-Test-Harness yajl xfsprogs
        $ sudo yum install -y --enablerepo=epel python-webob1.0 python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
        $ sudo yum install -y --enablerepo=epel autoconf automake bison dos2unix flex fuse-devel libaio-devel libibverbs-devel \
            librdmacm-devel libtool libxml2-devel lvm2-devel make openssl-devel pkgconfig \
            python-devel python-eventlet python-netifaces python-paste-deploy \
            python-simplejson python-sphinx python-webob pyxattr readline-devel rpm-build \
            systemtap-sdt-devel tar

3. Create the mock user

        $ sudo useradd -g mock mock

Preparation steps for RHEL 6.3+ (only)
--------------------------------------

1. Ensure you have the "Scalable Filesystem Support" group installed

    This provides the xfsprogs package, which is required by the test
    framework.

2. Install EPEL:

        $ sudo yum install -y http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

3. Install the RHEL 6.x dependencies:

        $ sudo yum install -y --enablerepo=epel cmockery2-devel dbench git libacl-devel mock nfs-utils yajl perl-Test-Harness
        $ sudo yum install -y --enablerepo=rhel-6-server-optional-rpms python-webob1.0 python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
        $ sudo yum install -y --disablerepo=rhs* --enablerepo=*optional-rpms autoconf \
            automake bison dos2unix flex fuse-devel libaio-devel libibverbs-devel \
            librdmacm-devel libtool libxml2-devel lvm2-devel make openssl-devel pkgconfig \
            python-devel python-eventlet python-netifaces python-paste-deploy \
            python-simplejson python-sphinx python-webob pyxattr readline-devel rpm-build \
            systemtap-sdt-devel tar

4. Create the mock user

        $ sudo useradd -g mock mock

Preparation steps for Fedora 16-19 (only)
-----------------------------------------

**Still in development**

1. Install the Fedora dependencies:

        $ sudo yum install -y attr cmockery2-devel dbench git mock nfs-utils perl-Test-Harness psmisc xfsprogs
        $ sudo yum install -y python-webob1.0 python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
        $ sudo yum install -y autoconf automake bison dos2unix flex fuse-devel libaio-devel libibverbs-devel \
            librdmacm-devel libtool libxml2-devel lvm2-devel make openssl-devel pkgconfig \
            python-devel python-eventlet python-netifaces python-paste-deploy \
            python-simplejson python-sphinx python-webob pyxattr readline-devel rpm-build \
            systemtap-sdt-devel tar

2. Create the mock user

        $ sudo useradd -g mock mock

Common steps
------------

1. Ensure DNS for your server is working

    The Gluster Test Framework fails miserably if the fully qualified
    domain name for your server doesn't resolve back to itself.

    If you don't have a working DNS infrastructure in place, adding an entry
    for your server to its /etc/hosts file will work.

2. Install the version of Gluster you are testing

    Either install an existing set of rpms:

        $ sudo yum install [your gluster rpms here]

    Or compile your own ones (fairly easy):

        http://www.gluster.org/community/documentation/index.php/CompilingRPMS

3. Clone the GlusterFS git repository

        $ git clone git://git.gluster.org/glusterfs
        $ cd glusterfs

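For step 1 of the common steps, a quick name-resolution check can save a failed test run. The sketch below assumes a Linux system with `getent` available. It only checks that the name resolves at all, not that the address belongs to this machine, and `name_resolves` is a helper name made up for the example.

```sh
#!/bin/sh
# Succeeds if NAME resolves through the system resolver (DNS or /etc/hosts).
name_resolves() {
    getent hosts "$1" >/dev/null
}

# `hostname -f` prints the fully qualified domain name on most Linux systems.
fqdn=$(hostname -f 2>/dev/null || hostname)
if name_resolves "$fqdn"; then
    echo "OK: $fqdn resolves to $(getent hosts "$fqdn" | awk '{print $1; exit}')"
else
    echo "WARNING: $fqdn does not resolve; add it to /etc/hosts or fix DNS" >&2
fi
```

If the warning appears, adding a line with the server's address and full name to /etc/hosts, as described in step 1, is the simplest fix.
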
Ensure mock can access the directory
------------------------------------

Some tests run as the user "mock". If the mock user can't access the
tests subdirectory, these tests fail (rpm.t is one such test).

This is a known gotcha when the git repo is cloned to your home
directory. Home directories generally don't have world readable
permissions. You can fix this by adjusting your home directory
permissions, or placing the git repo somewhere else (with access for the
mock user).

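One way to spot the problem is to walk from the repository up to `/` and list every directory that other users cannot traverse. This is a rough sketch under stated assumptions: it looks only at the plain "other" permission bits (group membership and ACLs are ignored), it relies on GNU-style `stat -c`, and `dirs_blocking_others` is a name invented for this example.

```sh
#!/bin/sh
# Print every directory from PATH up to / that lacks the "others may traverse"
# (o+x) permission bit. No output means a non-owner such as mock can reach PATH,
# as far as plain permission bits are concerned.
dirs_blocking_others() {
    dir=$(cd "$1" && pwd -P) || return 1
    while :; do
        case $(stat -c %A "$dir") in
            *[xt]) ;;                       # o+x is set (t = sticky bit plus o+x)
            *)     printf '%s\n' "$dir" ;;
        esac
        [ "$dir" = / ] && break
        dir=$(dirname "$dir")
    done
}

# Demo: mktemp -d creates a mode-0700 directory, much like a private home directory.
home=$(mktemp -d)
mkdir -p "$home/glusterfs/tests"
dirs_blocking_others "$home/glusterfs/tests"    # lists the mode-0700 directory

chmod o+x "$home"
dirs_blocking_others "$home/glusterfs/tests"    # that directory is no longer listed
```

Traversal permission alone is not always enough, since the mock user also needs to read the files under "tests", but a directory reported here is a sure sign that tests such as rpm.t will fail.
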
Running the tests
-----------------

The tests need to run as root, so they can mount volumes and manage
gluster processes as needed.

It's also best to run them directly as the root user, instead of through
sudo. Strange things sporadically happen (for me) when using the full test
framework through sudo, that haven't happened (yet) when running
directly as root. Hangs in dbench particularly, which is part of at
least one test.

    # ./run-tests.sh

The test framework takes just over 45 minutes to run in a VM here (4
CPUs assigned, 8GB RAM, SSD storage). It may take significantly more or
less time for you, depending on the hardware and software you're using.

Showing debug information
-------------------------

To display verbose information while the tests are running, set the
DEBUG environment variable to 1 prior to running the tests.

    # DEBUG=1 ./run-tests.sh

Log files
---------

Verbose output from the rpm.t test goes into "rpmbuild-mock.log",
located in the same directory the test is run from.

Reporting bugs
--------------

If you hit a bug when running the test framework, **please** create a
bug report for it on Bugzilla so it gets fixed:

    https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&component=tests

Creating your own tests
-----------------------

The test scripts are written in bash, with their filenames ending in .t
instead of .sh.

When creating your own test scripts, create them in an appropriate
subdirectory under "tests" (e.g. "bugs" or "features") and use descriptive
names like "bug-XXXXXXX-checking-feature-X.t".

Also include the "include.rc" file, which defines the test types and
host/brick/volume defaults:

    . $(dirname $0)/../include.rc

There are 5 test types available at present, but feel free to add more
if you need something that doesn't yet exist. The test types are
explained in more detail below.

Also essential is the "cleanup" command, which removes any existing
Gluster configuration (**without backing it up**), and also kills any
running gluster processes.

There is a basic test template you can copy, named bug-000000.t in the
bugs subdirectory:

    $ cp bugs/bug-000000.t somedir/descriptive-name.t

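To give a feel for how a `.t` script reads, here is a self-contained sketch. The real `TEST` and `EXPECT` are defined in include.rc and are more elaborate. The two functions below are simplified stand-ins written only for this illustration, and the checks operate on a scratch directory rather than on a Gluster volume.

```sh
#!/bin/sh
t=0    # running check number

# Stand-in for TEST: the check passes if the command exits with status 0.
TEST() {
    t=$((t + 1))
    if "$@" >/dev/null 2>&1; then echo "ok $t"; else echo "not ok $t"; fi
}

# Stand-in for EXPECT: the check passes if the last line of the command's
# output equals the first argument.
EXPECT() {
    t=$((t + 1))
    want=$1
    shift
    got=$("$@" 2>/dev/null | tail -n 1)
    if [ "$got" = "$want" ]; then echo "ok $t"; else echo "not ok $t"; fi
}

scratch=$(mktemp -d)
TEST mkdir "$scratch/brick"
TEST touch "$scratch/brick/file"
EXPECT "file" ls "$scratch/brick"
TEST rmdir "$scratch/brick"    # fails on purpose: the directory is not empty
```

Running the sketch prints `ok 1`, `ok 2`, `ok 3` and `not ok 4`. A real test script would source include.rc instead of defining these functions, call `cleanup` first, and pass gluster commands to `TEST` and `EXPECT`.
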
### TEST

- Example of usage in basic/volume.t

### TEST\_IN\_LOOP

- Example of usage in basic/rpm.t

### EXPECT

- Example of usage in basic/volume.t

### EXPECT\_WITHIN

- Example of usage in basic/volume-status.t

### EXPECT\_KEYWORD

- Defined in include.rc, but seems to be unused?

@@ -1,18 +0,0 @@
Adding a new FOP
================

Steps to be followed when adding a new FOP to GlusterFS:

1. Edit `glusterfs.h` and add a `GF_FOP_*` constant.
2. Edit `xlator.[ch]` and:
    * add the new prototype for fop and callback.
    * edit `xlator_fops` structure.
3. Edit `xlator.c` and add to `fill_defaults`.
4. Edit `protocol.h` and add struct necessary for the new FOP.
5. Edit `defaults.[ch]` and provide default implementation.
6. Edit `call-stub.[ch]` and provide stub implementation.
7. Edit `common-utils.c` and add to `gf_global_variable_init()`.
8. Edit client-protocol and add your FOP.
9. Edit server-protocol and add your FOP.
10. Implement your FOP in any translator for which the default implementation
    is not sufficient.

@@ -1,91 +0,0 @@
|
||||
History of locking in AFR
|
||||
--------------------------
|
||||
|
||||
GlusterFS has **locks** translator which provides the following internal locking operations called `inodelk`, `entrylk` which are used by afr to achieve synchronization of operations on files or directories that conflict with each other.
|
||||
|
||||
`Inodelk` gives the facility for translators in GlusterFS to obtain range (denoted by tuple with **offset**, **length**) locks in a given **domain** for an inode.
|
||||
Full file lock is denoted by the tuple (offset: `0`, length: `0`) i.e. length `0` is considered as infinity.
|
||||
|
||||
`Entrylk` enables translators of GlusterFS to obtain locks on `name` in a given **domain** for an inode, typically a directory.
|
||||
|
||||
The **locks** translator provides both *blocking* and *nonblocking* variants and of these locks.
|
||||
|
||||
|
||||
AFR makes use of locks xlator extensively:
|
||||
|
||||
1)For FOPS (from clients)
|
||||
-----------------------
|
||||
* Data transactions take inode locks on data domain, Let's refer to this domain name as DATA_DOMAIN.
|
||||
|
||||
So locking for writes would be something like this:`inodelk(offset,length, DATA_DOMAIN)`
|
||||
For truncating a file to zero, it would be `inodelk(0,0,DATA_DOMAIN)`
|
||||
|
||||
* Metadata transactions (chown/chmod) also take inode locks but on a special range on metadata domain,
|
||||
i.e.`(LLONG_MAX-1 , 0, METADATA_DOMAIN).`
|
||||
|
||||
* Entry transactions (create, mkdir, rmdir,unlink, symlink, link,rename) take entrylk on `(name, parent inode)`.
|
||||
|
||||
|
||||
2)For self heal:
|
||||
-------------
|
||||
* For Metadata self-heal, it is the same. i.e.`inodelk(LLONG_MAX-1 , 0, METADATA_DOMAIN)`.
|
||||
* For Entry self-heal, it is `entrylk(NULL name, parent inode)`. Specifying NULL for the name takes full lock on the directory referred to by the inode.
|
||||
* For data self-heal, there is a bit of history as to how locks evolved:
|
||||
|
||||
###Initial version (say version 1) :
|
||||
There was no concept of selfheal daemon (shd). Only client lookups triggered heals, so AFR always took `inodelk(0,0,DATA_DOMAIN)` for healing. The issue with this approach was that while a heal was in progress, I/O from clients was blocked.
|
||||
|
||||
###version 2:
|
||||
shd was introduced. We needed to allow I/O to go through while a heal was in progress, provided the ranges did not overlap. To that end, the following approach was adopted:
|
||||
|
||||
+ 1.shd takes (full inodelk in DATA_DOMAIN). Thus client FOPS are blocked and cannot modify changelog-xattrs
|
||||
+ 2.shd inspects xattrs to determine source/sink
|
||||
+ 3.shd takes a chunk inodelk(0-128kb) again in DATA_DOMAIN (locks xlator allows overlapping locks if lock owner is the same).
|
||||
+ 4.unlock full lock
|
||||
+ 5.heal
|
||||
+ 6.take next chunk lock(129-256kb)
|
||||
+ 7.unlock 1st chunk lock, heal the second chunk and so on.
|
||||
|
||||
|
||||
Thus after step 4, any client FOP could write to regions that were not currently under heal. The exception was truncate (to size 0), because it needs a full file lock and will always block: some chunk is always locked by the shd until the heal completes.
|
||||
|
||||
Another issue was that 2 shds could run in parallel. Say SHD1 and SHD2 compete for step 1. Let SHD1 win. It proceeds and completes step 4. Now SHD2 also succeeds in step 1 and continues through all the steps. Thus, at the end, both shds decrement the changelog, leading to negative values in it.
|
||||
|
||||
### version 3
|
||||
To prevent parallel self heals, another domain was introduced, let us call it SELF_HEAL_DOMAIN. With this domain, the following approach was adopted and is **the approach currently in use**:
|
||||
|
||||
+ 1.shd takes (full inodelk on SELF_HEAL_DOMAIN)
|
||||
+ 2.shd takes (full inodelk on DATA_DOMAIN)
|
||||
+ 3.shd inspects xattrs to determine source/sink
|
||||
+ 4.unlock full lock on DATA_DOMAIN
|
||||
+ 5.take chunk lock(0-128kb) on DATA_DOMAIN
|
||||
+ 6.heal
|
||||
+ 7.take next chunk lock(129-256kb) on DATA_DOMAIN
|
||||
+ 8.unlock 1st chunk lock, heal and so on.
|
||||
+ 9.Finally release full lock on SELF_HEAL_DOMAIN
|
||||
|
||||
Thus until one shd completes step 9, another shd cannot start step 1, solving the problem of simultaneous heals.
|
||||
Note that the issue of truncate (to zero) FOP hanging still remains.
|
||||
Also, there are multiple network calls involved in this scheme: lock, heal (i.e. read + write) and unlock for every chunk, i.e. 4 calls per chunk.
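The version 3 sequence can be sketched as a list of operations. This is an illustrative Python model only: the function name `data_self_heal_ops`, the tuple layout and the exact chunk boundaries are assumptions made for the example, and a `heal` entry stands for the read plus write of one chunk.

```python
CHUNK = 128 * 1024  # chunk size used in the steps above


def data_self_heal_ops(file_size, chunk=CHUNK):
    """Return the ordered lock/heal/unlock operations for one data self-heal."""
    ops = [
        ("lock", "SELF_HEAL_DOMAIN", 0, 0),   # step 1: keeps other shds out
        ("lock", "DATA_DOMAIN", 0, 0),        # step 2: full lock, blocks client FOPs
        ("inspect-xattrs",),                  # step 3: determine source/sink
        ("unlock", "DATA_DOMAIN", 0, 0),      # step 4
    ]
    prev = None
    for off in range(0, file_size, chunk):
        ops.append(("lock", "DATA_DOMAIN", off, chunk))         # steps 5 and 7
        if prev is not None:
            ops.append(("unlock", "DATA_DOMAIN", prev, chunk))  # step 8
        ops.append(("heal", off, chunk))                        # step 6
        prev = off
    if prev is not None:
        ops.append(("unlock", "DATA_DOMAIN", prev, chunk))
    ops.append(("unlock", "SELF_HEAL_DOMAIN", 0, 0))            # step 9
    return ops


for op in data_self_heal_ops(3 * CHUNK):
    print(op)
```

Because the next chunk is locked before the previous one is released, some chunk lock is held from the first chunk until the end of the heal. That is the reason a truncate to zero keeps waiting in this scheme.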
|
||||
|
||||
### version 4 (ToDo)
|
||||
Some improvements that need to be made in version 3:
|
||||
* Reduce network calls using piggy backing.
|
||||
* After taking chunk lock and healing, we need to unlock the lock before locking the next chunk. This gives a window for any pending truncate FOPs to succeed. If truncate succeeds, the heal of next chunk will fail (read returns zero)
|
||||
and heal is stopped. *BUT* there is **yet another** issue:
|
||||
|
||||
* shd does steps 1 to 4. Let's assume source is brick b1, sink is brick b2 . i.e xattrs are (0,1) and (0,0) on b1 and b2 respectively. Now before shd takes (0-128kb) lock, a client FOP takes it.
|
||||
It modifies data but the FOP succeeds only on brick 2. writev returns success, and the attrs now read (0,1) (1,0). SHD takes over and heals. It had observed (0,1),(0,0) earlier
|
||||
and thus goes ahead and copies stale 128Kb from brick 1 to brick2. Thus as far as application is concerned, `writev` returned success but bricks have stale data.
|
||||
What needs to be done is that `writev` must return success only if it succeeded on at least one source brick (brick b1 in this case). Otherwise, the heal still happens in the reverse direction, but as far as the application is concerned, it received an error.
|
||||
|
||||
###Note on lock **domains**
|
||||
We have used conceptual names in this document like DATA_DOMAIN/ METADATA_DOMAIN/ SELF_HEAL_DOMAIN. In the code, these are mapped to strings that are based on the AFR xlator name like so:
|
||||
|
||||
DATA_DOMAIN --->"vol_name-replicate-n"
|
||||
|
||||
METADATA_DOMAIN --->"vol_name-replicate-n:metadata"
|
||||
|
||||
SELF_HEAL_DOMAIN -->"vol_name-replicate-n:self-heal"
|
||||
|
||||
where vol_name is the name of the volume and 'n' is the replica subvolume index (starting from 0).
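As a small illustration of the mapping above, the following Python sketch builds the three domain strings. The helper name `lock_domains` is hypothetical; only the string patterns come from the table above.

```python
def lock_domains(vol_name: str, n: int) -> dict:
    """Map the conceptual domain names to the strings described above.

    n is the replica subvolume index, starting from 0.
    """
    base = f"{vol_name}-replicate-{n}"
    return {
        "DATA_DOMAIN": base,
        "METADATA_DOMAIN": base + ":metadata",
        "SELF_HEAL_DOMAIN": base + ":self-heal",
    }


# "testvol" is a made-up volume name used only for the example.
for name, domain in lock_domains("testvol", 0).items():
    print(name, "->", domain)
```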
|
||||
@@ -1,92 +0,0 @@
|
||||
Self-Heal Daemon
|
||||
================
|
||||
The self-heal daemon (shd) is a glusterfs process that is responsible for healing files in a replicate/ disperse gluster volume.
|
||||
Every server (brick) node of the volume runs one instance of the shd. So even if one node contains replicate/ disperse bricks of
multiple volumes, they are all healed by the same shd.
|
||||
|
||||
This document only describes how the shd works for replicate (AFR) volumes.
|
||||
|
||||
The shd is launched by glusterd when the volume starts (only if the volume includes a replicate configuration). The graph
|
||||
of the shd process in every node contains the following: The io-stats which is the top most xlator, its children being the
|
||||
replicate xlators (subvolumes) of *only* the bricks present in that particular node, and finally *all* the client xlators that are the children of the replicate xlators.
|
||||
|
||||
The shd does two types of self-heal crawls: Index heal and Full heal. For both these types of crawls, the basic idea is the same:
|
||||
For each file encountered while crawling, perform metadata, data and entry heals under appropriate locks.
|
||||
* An overview of how each of these heals is performed is detailed in the 'Self-healing' section of *doc/features/afr-v1.md*
|
||||
* The different file locks which the shd takes for each of these heals are detailed in *doc/developer-guide/afr-locks-evolution.md*
|
||||
|
||||
Metadata heal refers to healing extended attributes, mode and permissions of a file or directory.
|
||||
Data heal refers to healing the file contents.
|
||||
Entry self-heal refers to healing entries inside a directory.
|
||||
|
||||
Index heal
|
||||
==========
|
||||
The index heal is done:
|
||||
a) Every 600 seconds (can be changed via the `cluster.heal-timeout` volume option)
|
||||
b) When it is explicitly triggered via the `gluster vol heal <VOLNAME>` command
|
||||
c) Whenever a replica brick that was down comes back up.
|
||||
|
||||
Only one heal can be in progress at one time, irrespective of reason why it was triggered. If another heal is triggered before the first one completes, it will be queued.
|
||||
Only one heal can be queued while the first one is running. If an Index heal is queued, it can be overridden by queuing a Full heal and not vice-versa. Also, before processing
|
||||
each entry in index heal, a check is made if a full heal is queued. If it is, then the index heal is aborted so that the full heal can proceed.
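The queuing rules above can be modelled with a short Python sketch. The class and method names are hypothetical, and dropping a trigger when the single queue slot is already taken is an assumption made for the example; the text only states that one heal can be queued.

```python
class HealScheduler:
    """Toy model: one running heal, at most one queued heal."""

    def __init__(self):
        self.running = None  # "index", "full" or None
        self.queued = None

    def trigger(self, kind):
        if self.running is None:
            self.running = kind
        elif self.queued is None:
            self.queued = kind
        elif self.queued == "index" and kind == "full":
            # A queued index heal can be overridden by a full heal,
            # but not the other way round.
            self.queued = "full"
        # Otherwise the request is dropped (assumption, see lead-in).

    def should_abort_index(self):
        # Checked before processing each entry of an index heal.
        return self.running == "index" and self.queued == "full"

    def finish(self):
        # The queued heal, if any, becomes the running one.
        self.running, self.queued = self.queued, None


s = HealScheduler()
for kind in ("index", "index", "full"):
    s.trigger(kind)
print(s.running, s.queued, s.should_abort_index())
```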
|
||||
|
||||
In index heal, each shd reads the entries present inside .glusterfs/indices/xattrop/ folder and triggers heal on each entry with appropriate locks.
|
||||
The .glusterfs/indices/xattrop/ directory contains a base entry of the name "xattrop-<virtual-gfid-string>". All other entries are hardlinks to the base entry. The
|
||||
*names* of the hardlinks are the gfid strings of the files that may need heal.
|
||||
|
||||
When a client (mount) performs an operation on the file, the index xlator present in each brick process adds the hardlinks in the pre-op phase of the FOP's transaction
|
||||
and removes it in post-op phase if the operation is successful. Thus if an entry is present inside the .glusterfs/indices/xattrop/ directory when there is no I/O
|
||||
happening on the file, it means the file needs healing (or at least an examination, if the brick crashed after the post-op completed but just before the removal of the hardlink).
|
||||
|
||||
####Index heal steps:
|
||||
<pre><code>
|
||||
In shd process of *each node* {
|
||||
opendir +readdir (.glusterfs/indices/xattrop/)
|
||||
for each entry inside it {
|
||||
self_heal_entry() //Explained below.
|
||||
}
|
||||
}
|
||||
</code></pre>
|
||||
|
||||
<pre><code>
|
||||
self_heal_entry() {
|
||||
Call syncop_lookup(replicate subvolume) which eventually does {
|
||||
take appropriate locks
|
||||
determine source and sinks from AFR changelog xattrs
|
||||
perform whatever heal is needed (any of metadata, data and entry heal in that order)
|
||||
clear changelog xattrs and hardlink inside .glusterfs/indices/xattrop/
|
||||
}
|
||||
}
|
||||
</code></pre>
|
||||
|
||||
Note:
|
||||
* If the gfid hardlink is present in the .glusterfs/indices/xattrop/ of both replica bricks, then each shd will try to heal the file but only one of them will be able to proceed due to the self-heal domain lock.
|
||||
|
||||
* While processing entries inside .glusterfs/indices/xattrop/, if shd encounters an entry whose parent is yet to be healed, it will skip it and it will be picked up in the next crawl.
|
||||
|
||||
* If a file is in data/ metadata split-brain, it will not be healed.
|
||||
|
||||
* If a directory is in entry split-brain, a conservative merge will be performed, wherein after the merge, the entries of the directory will be a union of the entries in the replica pairs.
|
||||
|
||||
Full heal
|
||||
=========
|
||||
A full heal is triggered by running `gluster vol heal <VOLNAME> full`. This command is usually run in disk replacement scenarios where the entire data is to be copied from one of the healthy bricks of the replica to the brick that was just replaced.
|
||||
|
||||
Unlike the index heal which runs on the shd of every node in a replicate subvolume, the full heal is run only on the shd of one node per replicate subvolume: the node having the highest UUID.
|
||||
i.e. in a 2x2 volume made of 4 nodes N1, N2, N3 and N4, if the UUID of N1 > N2 and the UUID of N4 > N3, then the full crawl is carried out by the shds of N1 and N4. (The node UUID can be found in `/var/lib/glusterd/glusterd.info`.)
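The node selection can be sketched in Python as follows. The function name, the input layout and the UUID values are made up for the example, and comparing UUIDs by their raw bytes is an assumption about what "highest" means here.

```python
import uuid


def full_heal_nodes(replica_sets):
    """Pick, per replicate subvolume, the node with the highest UUID.

    replica_sets is a list of {node_name: uuid_string} dicts,
    one dict per replicate subvolume.
    """
    return [
        max(nodes, key=lambda name: uuid.UUID(nodes[name]).bytes)
        for nodes in replica_sets
    ]


# Hypothetical 2x2 volume: UUID of N1 > N2 and UUID of N4 > N3.
volume = [
    {"N1": "ffffffff-0000-0000-0000-000000000001",
     "N2": "00000000-0000-0000-0000-000000000002"},
    {"N3": "11111111-1111-1111-1111-111111111111",
     "N4": "22222222-2222-2222-2222-222222222222"},
]
print(full_heal_nodes(volume))
```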
|
||||
|
||||
The full heal steps are almost identical to the index heal, except the heal is performed on each replica starting from the root of the volume:
|
||||
<pre><code>
|
||||
In shd process of *highest UUID node per replica* {
|
||||
opendir +readdir ("/")
|
||||
for each entry inside it {
|
||||
self_heal_entry()
|
||||
if (entry == directory) {
|
||||
/* Recurse*/
|
||||
again opendir+readdir (directory) followed by self_heal_entry() of each entry.
|
||||
}
|
||||
|
||||
}
|
||||
}
|
||||
</code></pre>
|
||||
@@ -1,191 +0,0 @@
|
||||
cluster/afr translator
|
||||
======================
|
||||
|
||||
Locking
|
||||
-------
|
||||
|
||||
Before understanding replicate, one must understand two internal FOPs:
|
||||
|
||||
### `GF_FILE_LK`
|
||||
|
||||
This is exactly like `fcntl(2)` locking, except the locks are in a
|
||||
separate domain from locks held by applications.
|
||||
|
||||
### `GF_DIR_LK (loc_t *loc, char *basename)`
|
||||
|
||||
This allows one to lock a name under a directory. For example,
|
||||
to lock /mnt/glusterfs/foo, one would use the call:
|
||||
|
||||
```
|
||||
GF_DIR_LK ({loc_t for "/mnt/glusterfs"}, "foo")
|
||||
```
|
||||
|
||||
If one wishes to lock *all* the names under a particular directory,
|
||||
supply the basename argument as `NULL`.
|
||||
|
||||
The locks can either be read locks or write locks; consult the
|
||||
function prototype for more details.
|
||||
|
||||
Both these operations are implemented by the features/locks (earlier
|
||||
known as posix-locks) translator.
|
||||
|
||||
Basic design
|
||||
------------
|
||||
|
||||
All FOPs can be classified into four major groups:
|
||||
|
||||
### inode-read
|
||||
|
||||
Operations that read an inode's data (file contents) or metadata (perms, etc.).
|
||||
|
||||
access, getxattr, fstat, readlink, readv, stat.
|
||||
|
||||
### inode-write
|
||||
|
||||
Operations that modify an inode's data or metadata.
|
||||
|
||||
chmod, chown, truncate, writev, utimens.
|
||||
|
||||
### dir-read
|
||||
|
||||
Operations that read a directory's contents or metadata.
|
||||
|
||||
readdir, getdents, checksum.
|
||||
|
||||
### dir-write
|
||||
|
||||
Operations that modify a directory's contents or metadata.
|
||||
|
||||
create, link, mkdir, mknod, rename, rmdir, symlink, unlink.
|
||||
|
||||
Some of these make a subgroup in that they modify *two* different entries:
|
||||
link, rename, symlink.
|
||||
|
||||
### Others
|
||||
|
||||
Other operations.
|
||||
|
||||
flush, lookup, open, opendir, statfs.
|
||||
|
||||
Algorithms
|
||||
----------
|
||||
|
||||
Each of the four major groups has its own algorithm:
|
||||
|
||||
### inode-read, dir-read
|
||||
|
||||
1. Send a request to the first child that is up:
|
||||
* if it fails:
|
||||
* try the next available child
|
||||
* if we have exhausted all children:
|
||||
* return failure
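A minimal Python sketch of this read path is shown below. The child representation (a dict with an `up` flag) and the use of `OSError` for a failed request are assumptions made for the example, not the translator's data structures.

```python
def read_with_failover(children, fop):
    """Send a read-type FOP to the first child that is up; fall back on failure."""
    for child in children:
        if not child["up"]:
            continue
        try:
            return fop(child)
        except OSError:
            continue  # try the next available child
    # All children have been exhausted.
    raise OSError("read failed on all children")


def fake_stat(child):
    # Hypothetical FOP: fails on children marked as broken.
    if child.get("broken"):
        raise OSError("I/O error")
    return child["name"]


children = [
    {"name": "c1", "up": False},
    {"name": "c2", "up": True, "broken": True},
    {"name": "c3", "up": True},
]
print(read_with_failover(children, fake_stat))
```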
|
||||
|
||||
### inode-write
|
||||
|
||||
All operations are done in parallel unless specified otherwise.
|
||||
|
||||
1. Send a ``GF_FILE_LK`` request on all children for a write lock on the
|
||||
appropriate region
|
||||
(for metadata operations: entire file (0, 0); for writev:
(offset, offset+size of buffer))
|
||||
* If a lock request fails on a child:
|
||||
* unlock all children
|
||||
* try to acquire a blocking lock (`F_SETLKW`) on each child, serially.
|
||||
If this fails (due to `ENOTCONN` or `EINVAL`):
|
||||
Consider this child as dead for rest of transaction.
|
||||
2. Mark all children as "pending" on all (alive) children (see below for
|
||||
meaning of "pending").
|
||||
* If it fails on any child:
|
||||
* mark it as dead (in transaction local state).
|
||||
3. Perform operation on all (alive) children.
|
||||
* If it fails on any child:
|
||||
* mark it as dead (in transaction local state).
|
||||
4. Mark all successful children as not "pending" on all nodes.
|
||||
5. Unlock region on all (alive) children.
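The marking and unmarking steps can be traced with a small Python model. It leaves out locking (steps 1 and 5) and runs serially; the child dicts, the `pending` list with one counter per child, and the function name are assumptions made for the example.

```python
def write_transaction(children, op):
    """Run steps 2 to 4 of the algorithm above on a list of child dicts.

    Each child has an "up" flag and a "pending" list holding one counter
    per child of the replicate translator.
    """
    n = len(children)
    alive = {i for i, c in enumerate(children) if c["up"]}

    # Step 2: mark all children as pending on all alive children.
    for i in alive:
        for j in range(n):
            children[i]["pending"][j] += 1

    # Step 3: perform the operation; a failing child is dead for the
    # rest of the transaction.
    succeeded = set()
    for i in sorted(alive):
        try:
            op(children[i])
            succeeded.add(i)
        except OSError:
            pass

    # Step 4: clear the pending mark of every successful child.
    for i in succeeded:
        for j in succeeded:
            children[i]["pending"][j] -= 1
    return succeeded


kids = [{"up": True, "pending": [0, 0]}, {"up": False, "pending": [0, 0]}]
write_transaction(kids, lambda child: None)
print(kids[0]["pending"], kids[1]["pending"])
```

In this model the surviving child keeps a non-zero counter for the child that missed the write, which is the information self-heal later uses to pick a source.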
|
||||
|
||||
### dir-write
|
||||
|
||||
The algorithm for dir-write is the same as above, except that instead of holding
`GF_FILE_LK` locks we hold a `GF_DIR_LK` lock on the name being operated upon.
|
||||
In case of link-type calls, we hold locks on both the operand names.
|
||||
|
||||
"pending"
|
||||
---------
|
||||
|
||||
The "pending" number is like a journal entry. A pending entry is an
|
||||
array of 32-bit integers stored in network byte-order as the extended
|
||||
attribute of an inode (which can be a directory as well).
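The on-disk form described above (32-bit integers in network byte order) can be illustrated with Python's `struct` module. The helper names and the number of counters are assumptions; the text does not say how many integers the array holds.

```python
import struct


def encode_pending(counts):
    """Pack counters as unsigned 32-bit integers in network byte order."""
    return struct.pack(f"!{len(counts)}I", *counts)


def decode_pending(blob):
    """Inverse of encode_pending."""
    return list(struct.unpack(f"!{len(blob) // 4}I", blob))


blob = encode_pending([0, 1, 0])
print(blob.hex(), decode_pending(blob))
```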
|
||||
|
||||
There are three keys corresponding to three types of pending operations:
|
||||
|
||||
### `AFR_METADATA_PENDING`
|
||||
|
||||
There are some metadata operations pending on this inode (perms, ctime/mtime,
|
||||
xattr, etc.).
|
||||
|
||||
### `AFR_DATA_PENDING`
|
||||
|
||||
There is some data pending on this inode (writev).
|
||||
|
||||
### `AFR_ENTRY_PENDING`
|
||||
|
||||
There are some directory operations pending on this directory
|
||||
(create, unlink, etc.).
|
||||
|
||||
Self heal
|
||||
---------
|
||||
|
||||
* On lookup, gather extended attribute data:
|
||||
* If entry is a regular file:
|
||||
* If an entry is present on one child and not on others:
|
||||
* create entry on others.
|
||||
* If entries exist but have different metadata (perms, etc.):
|
||||
* consider the entry with the highest `AFR_METADATA_PENDING` number as
|
||||
definitive and replicate its attributes on children.
|
||||
* If entry is a directory:
|
||||
* Consider the entry with the highest `AFR_ENTRY_PENDING` number as
|
||||
definitive and replicate its contents on all children.
|
||||
* If any two entries have non-matching types (i.e., one is file and
|
||||
other is directory):
|
||||
* Announce to the user via log that a split-brain situation has been
|
||||
detected, and do nothing.
|
||||
* On open, gather extended attribute data:
|
||||
* Consider the file with the highest `AFR_DATA_PENDING` number as
|
||||
the definitive one and replicate its contents on all other
|
||||
children.
|
||||
|
||||
During all self heal operations, appropriate locks must be held on all
|
||||
regions/entries being affected.
|
||||
|
||||
Inode scaling
|
||||
-------------
|
||||
|
||||
Inode scaling is necessary because if a situation arises where an inode number
|
||||
is returned for a directory (by lookup) which was previously the inode number
|
||||
of a file (as per FUSE's table), then FUSE gets horribly confused (consult a
|
||||
FUSE expert for more details).
|
||||
|
||||
To avoid such a situation, we distribute the 64-bit inode space equally
|
||||
among all children of replicate.
|
||||
|
||||
To illustrate:
|
||||
|
||||
If c1, c2, c3 are children of replicate, they each get 1/3 of the available
|
||||
inode space:
|
||||
|
||||
------------- -- -- -- -- -- -- -- -- -- -- -- ---
|
||||
Child: c1 c2 c3 c1 c2 c3 c1 c2 c3 c1 c2 ...
|
||||
Inode number: 1 2 3 4 5 6 7 8 9 10 11 ...
|
||||
------------- -- -- -- -- -- -- -- -- -- -- -- ---
|
||||
|
||||
Thus, if lookup on c1 returns an inode number "2", it is scaled to "4"
|
||||
(which is the second inode number in c1's space).
|
||||
|
||||
This way we ensure that there is never a collision of inode numbers from
|
||||
two different children.
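The interleaving in the table can be written as a one-line formula. The Python sketch below is an illustration only: the function name is hypothetical and children are numbered from 0 (so c1 is index 0).

```python
def scale_inode(ino: int, child_index: int, child_count: int) -> int:
    """Map a child's inode number into that child's slice of the shared space.

    child_index is 0-based: c1 -> 0, c2 -> 1, c3 -> 2.
    """
    return (ino - 1) * child_count + child_index + 1


# Inode "2" returned by c1 (index 0) with 3 children, as in the text above.
print(scale_inode(2, 0, 3))

# First four scaled inode numbers of each child, matching the table.
for idx in range(3):
    print(f"c{idx + 1}:", [scale_inode(i, idx, 3) for i in range(1, 5)])
```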
|
||||
|
||||
This reduction of inode space doesn't really reduce the usability of
|
||||
replicate since even if we assume replicate has 1024 children (which would be a
|
||||
highly unusual scenario), each child still has a 54-bit inode space:
|
||||
$2^{54} \sim 1.8 \times 10^{16}$, which is much larger than any real
|
||||
world requirement.
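The arithmetic behind the 54-bit figure can be checked with a few lines of Python. The function name is hypothetical, and rounding the number of children up to a power of two is a simplification that gives a lower bound on the bits left per child.

```python
import math


def per_child_inode_bits(child_count: int, total_bits: int = 64) -> int:
    """Bits of inode space left per child when the space is split equally."""
    return total_bits - math.ceil(math.log2(child_count))


bits = per_child_inode_bits(1024)
print(bits, 2 ** bits)
```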
|
||||
@@ -1,469 +0,0 @@
|
||||
#Block device translator
|
||||
|
||||
Block device translator (BD xlator) is a translator added to GlusterFS which provides a block backend for GlusterFS. It replaces the existing bd_map translator in GlusterFS, which provided similar but very limited functionality. GlusterFS expects the underlying brick to be formatted with a POSIX compatible file system. BD xlator changes that and allows bricks that are raw block devices, such as LVM volumes, which needn’t have any file systems on them. Hence with BD xlator, it becomes possible to build a GlusterFS volume comprising bricks that are logical volumes (LV).
|
||||
|
||||
##bd
|
||||
|
||||
BD xlator maps underlying LVs to files and hence the LVs appear as files to GlusterFS clients. Though BD volume externally appears very similar to the usual Posix volume, not all operations are supported or possible for the files on a BD volume. Only those operations that make sense for a block device are supported and the exact semantics are described in subsequent sections.
|
||||
|
||||
While Posix volume takes a file system directory as brick, BD volume needs a volume group (VG) as brick. In the usual use case of BD volume, a file created on BD volume will result in an LV being created in the brick VG. In addition to a VG, BD volume also needs a file system directory that should be specified at the volume creation time. This directory is necessary for supporting the notion of directories and directory hierarchy for the BD volume. Metadata about LVs (size, mapping info) is stored in this directory.
|
||||
|
||||
BD xlator was mainly developed to use block devices directly as VM images when GlusterFS is used as storage for KVM virtualization. Some of the salient points of BD xlator are
|
||||
|
||||
* Since BD supports file level snapshots and clones by leveraging the snapshot and clone capabilities of LVM, it can be used to fully off-load snapshot and cloning operations from QEMU to the storage (GlusterFS) itself.
|
||||
|
||||
* BD understands dm-thin LVs and hence can support files that are backed by thinly provisioned LVs. This capability of BD xlator translates to having thinly provisioned raw VM images.
|
||||
|
||||
* BD enables thin LVs from a thin pool to be used from multiple nodes that have visibility to GlusterFS BD volume. Thus thin pool can be used as a VM image repository allowing access/visibility to it from multiple nodes.
|
||||
|
||||
* BD supports true zerofill by using BLKZEROOUT ioctl on underlying block devices. Thus BD allows SCSI WRITESAME to be used on underlying block device if the device supports it.
|
||||
|
||||
Though BD xlator is primarily intended to be used with block devices, it does provide full Posix xlator compatibility for files that are created on BD volume but are not backed by or mapped to a block device. Such files which don’t have a block device mapping exist on the Posix directory that is specified during BD volume creation. BD xlator is available from GlusterFS-3.5 release.
|
||||
|
||||
###Compiling BD translator
|
||||
|
||||
BD xlator needs the lvm2 development library. The `--enable-bd-xlator` option can be used with the `./configure` script to explicitly enable the BD translator. The following snippet from the output of the configure script shows that BD xlator is enabled for compilation.
|
||||
|
||||
|
||||
#####GlusterFS configure summary
|
||||
|
||||
…
|
||||
Block Device xlator : yes
|
||||
|
||||
|
||||
###Creating a BD volume
|
||||
|
||||
BD supports hosting of both linear LVs and thin LVs within the same volume. However, separate examples are provided below. As noted above, the prerequisite for a BD volume is a VG, which is created from a loop device here, but it can be any other device too.
|
||||
|
||||
|
||||
* Creating BD volume with linear LV backend
|
||||
|
||||
* Create a loop device
|
||||
|
||||
|
||||
[root@node ~]# dd if=/dev/zero of=bd-loop count=1024 bs=1M
|
||||
|
||||
[root@node ~]# losetup /dev/loop0 bd-loop
|
||||
|
||||
|
||||
* Prepare a brick by creating a VG
|
||||
|
||||
[root@node ~]# pvcreate /dev/loop0
|
||||
|
||||
[root@node ~]# vgcreate bd-vg /dev/loop0
|
||||
|
||||
|
||||
* Create the BD volume
|
||||
|
||||
* Create a POSIX directory first
|
||||
|
||||
|
||||
[root@node ~]# mkdir /bd-meta
|
||||
|
||||
It is recommended that this directory is created on an LV in the brick VG itself so that both data and metadata live together on the same device.
|
||||
|
||||
|
||||
* Create and mount the volume
|
||||
|
||||
[root@node ~]# gluster volume create bd node:/bd-meta?bd-vg force
|
||||
|
||||
|
||||
The general syntax for specifying the brick is `host:/posix-dir?volume-group-name` where “?” is the separator.
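To make the brick syntax concrete, here is a minimal Python sketch that splits such a specification into its parts. It is not how the gluster CLI parses bricks; the function name and the error handling are assumptions made for the example.

```python
def parse_bd_brick(spec: str):
    """Split 'host:/posix-dir?volume-group-name' into (host, posix_dir, vg)."""
    host, colon, rest = spec.partition(":")
    posix_dir, question, vg = rest.partition("?")
    if not (host and colon and posix_dir and question and vg):
        raise ValueError(f"not a BD brick specification: {spec!r}")
    return host, posix_dir, vg


print(parse_bd_brick("node:/bd-meta?bd-vg"))
```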
|
||||
|
||||
|
||||
|
||||
[root@node ~]# gluster volume start bd
|
||||
[root@node ~]# gluster volume info bd
|
||||
Volume Name: bd
|
||||
Type: Distribute
|
||||
Volume ID: cb042d2a-f435-4669-b886-55f5927a4d7f
|
||||
Status: Started
|
||||
Xlator 1: BD
|
||||
Capability 1: offload_copy
|
||||
Capability 2: offload_snapshot
|
||||
Number of Bricks: 1
|
||||
Transport-type: tcp
|
||||
Bricks:
|
||||
Brick1: node:/bd-meta
|
||||
Brick1 VG: bd-vg
|
||||
|
||||
|
||||
|
||||
[root@node ~]# mount -t glusterfs node:/bd /mnt
|
||||
|
||||
* Create a file that is backed by an LV
|
||||
|
||||
[root@node ~]# ls /mnt
|
||||
|
||||
[root@node ~]#
|
||||
|
||||
Since the volume is empty now, so is the underlying VG.
|
||||
|
||||
[root@node ~]# lvdisplay bd-vg
|
||||
[root@node ~]#
|
||||
|
||||
Creating a file that is mapped to an LV is a 2 step operation. First the file should be created on the mount point and a specific extended attribute should be set to map the file to LV.
|
||||
|
||||
[root@node ~]# touch /mnt/lv
|
||||
[root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /mnt/lv
|
||||
|
||||
Now an LV got created in the VG brick and the file /mnt/lv maps to this LV. Any read/write to this file ends up as read/write to the underlying LV.
|
||||
|
||||
[root@node ~]# lvdisplay bd-vg
|
||||
--- Logical volume ---
|
||||
LV Path /dev/bd-vg/6ff0f25f-2776-4d19-adfb-df1a3cab8287
|
||||
LV Name 6ff0f25f-2776-4d19-adfb-df1a3cab8287
|
||||
VG Name bd-vg
|
||||
LV UUID PjMPcc-RkD5-RADz-6ixG-UYsk-oclz-vL0nv6
|
||||
LV Write Access read/write
|
||||
LV Creation host, time node, 2013-11-26 16:15:45 +0530
|
||||
LV Status available
|
||||
# open 0
|
||||
LV Size 4.00 MiB
|
||||
Current LE 1
|
||||
Segments 1
|
||||
Allocation inherit
|
||||
Read ahead sectors 0
|
||||
Block device 253:6
|
||||
|
||||
The file gets created with default LV size which is 1 LE which is 4MB in this case.
|
||||
|
||||
[root@node ~]# ls -lh /mnt/lv
|
||||
-rw-r--r--. 1 root root 4.0M Nov 26 16:15 /mnt/lv
|
||||
|
||||
truncate can be used to set the required file size.
|
||||
|
||||
[root@node ~]# truncate /mnt/lv -s 256M
|
||||
[root@node ~]# lvdisplay bd-vg
|
||||
--- Logical volume ---
|
||||
LV Path /dev/bd-vg/6ff0f25f-2776-4d19-adfb-df1a3cab8287
|
||||
LV Name 6ff0f25f-2776-4d19-adfb-df1a3cab8287
|
||||
VG Name bd-vg
|
||||
LV UUID PjMPcc-RkD5-RADz-6ixG-UYsk-oclz-vL0nv6
|
||||
LV Write Access read/write
|
||||
LV Creation host, time node, 2013-11-26 16:15:45 +0530
|
||||
LV Status available
|
||||
# open 0
|
||||
LV Size 256.00 MiB
|
||||
Current LE 64
|
||||
Segments 1
|
||||
Allocation inherit
|
||||
Read ahead sectors 0
|
||||
Block device 253:6
|
||||
|
||||
|
||||
[root@node ~]# ls -lh /mnt/lv
|
||||
-rw-r--r--. 1 root root 256M Nov 26 16:15 /mnt/lv
|
||||
|
||||
The LV size has now been set to 256 MiB.
|
||||
|
||||
The size of the file/LV can be specified during creation/mapping time itself like this:
|
||||
|
||||
setfattr -n "user.glusterfs.bd" -v "lv:256MB" /mnt/lv
|
||||
|
||||
* Creating BD volume with thin LV backend
|
||||
|
||||
* Create a loop device
|
||||
|
||||
|
||||
[root@node ~]# dd if=/dev/zero of=bd-loop-thin count=1024 bs=1M
|
||||
|
||||
[root@node ~]# losetup /dev/loop0 bd-loop-thin
|
||||
|
||||
|
||||
* Prepare a brick by creating a VG and thin pool
|
||||
|
||||
|
||||
[root@node ~]# pvcreate /dev/loop0
|
||||
|
||||
[root@node ~]# vgcreate bd-vg-thin /dev/loop0
|
||||
|
||||
|
||||
* Create a thin pool
|
||||
|
||||
|
||||
[root@node ~]# lvcreate --thin bd-vg-thin -L 1000M
|
||||
|
||||
Rounding up size to full physical extent 4.00 MiB
|
||||
Logical volume "lvol0" created
|
||||
|
||||
lvdisplay shows the thin pool
|
||||
|
||||
[root@node ~]# lvdisplay bd-vg-thin
|
||||
--- Logical volume ---
|
||||
LV Name lvol0
|
||||
VG Name bd-vg-thin
|
||||
LV UUID HVa3EM-IVMS-QG2g-oqU6-1UxC-RgqS-g8zhVn
|
||||
LV Write Access read/write
|
||||
LV Creation host, time node, 2013-11-26 16:39:06 +0530
|
||||
LV Pool transaction ID 0
|
||||
LV Pool metadata lvol0_tmeta
|
||||
LV Pool data lvol0_tdata
|
||||
LV Pool chunk size 64.00 KiB
|
||||
LV Zero new blocks yes
|
||||
LV Status available
|
||||
# open 0
|
||||
LV Size 1000.00 MiB
|
||||
Allocated pool data 0.00%
|
||||
Allocated metadata 0.88%
|
||||
Current LE 250
|
||||
Segments 1
|
||||
Allocation inherit
|
||||
Read ahead sectors auto
|
||||
Block device 253:9
|
||||
|
||||
* Create the BD volume
|
||||
|
||||
* Create a POSIX directory first
|
||||
|
||||
|
||||
[root@node ~]# mkdir /bd-meta-thin
|
||||
|
||||
* Create and mount the volume
|
||||
|
||||
[root@node ~]# gluster volume create bd-thin node:/bd-meta-thin?bd-vg-thin force
|
||||
|
||||
[root@node ~]# gluster volume start bd-thin
|
||||
|
||||
|
||||
[root@node ~]# gluster volume info bd-thin
|
||||
Volume Name: bd-thin
|
||||
Type: Distribute
|
||||
Volume ID: 27aa7eb0-4ffa-497e-b639-7cbda0128793
|
||||
Status: Started
|
||||
Xlator 1: BD
|
||||
Capability 1: thin
|
||||
Capability 2: offload_copy
|
||||
Capability 3: offload_snapshot
|
||||
Number of Bricks: 1
|
||||
Transport-type: tcp
|
||||
Bricks:
|
||||
Brick1: node:/bd-meta-thin
|
||||
Brick1 VG: bd-vg-thin
|
||||
|
||||
|
||||
[root@node ~]# mount -t glusterfs node:/bd-thin /mnt
|
||||
|
||||
* Create a file that is backed by a thin LV
|
||||
|
||||
|
||||
[root@node ~]# ls /mnt
|
||||
|
||||
[root@node ~]#
|
||||
|
||||
Creating a file that is mapped to a thin LV is a 2 step operation. First the file should be created on the mount point and a specific extended attribute should be set to map the file to a thin LV.
|
||||
|
||||
[root@node ~]# touch /mnt/thin-lv
|
||||
|
||||
[root@node ~]# setfattr -n "user.glusterfs.bd" -v "thin:256MB" /mnt/thin-lv
|
||||
|
||||
Now /mnt/thin-lv is a thinly provisioned file that is backed by a thin LV, and its size has been set to 256 MB.
|
||||
|
||||
[root@node ~]# lvdisplay bd-vg-thin
|
||||
--- Logical volume ---
|
||||
LV Name lvol0
|
||||
VG Name bd-vg-thin
|
||||
LV UUID HVa3EM-IVMS-QG2g-oqU6-1UxC-RgqS-g8zhVn
|
||||
LV Write Access read/write
|
||||
LV Creation host, time node, 2013-11-26 16:39:06 +0530
|
||||
LV Pool transaction ID 1
|
||||
LV Pool metadata lvol0_tmeta
|
||||
LV Pool data lvol0_tdata
|
||||
LV Pool chunk size 64.00 KiB
|
||||
LV Zero new blocks yes
|
||||
LV Status available
|
||||
# open 0
|
||||
LV Size 1000.00 MiB
|
||||
Allocated pool data 0.00%
|
||||
Allocated metadata 0.98%
|
||||
Current LE 250
|
||||
Segments 1
|
||||
Allocation inherit
|
||||
Read ahead sectors auto
|
||||
Block device 253:9
|
||||
|
||||
|
||||
|
||||
|
||||
--- Logical volume ---
|
||||
LV Path /dev/bd-vg-thin/081b01d1-1436-4306-9baf-41c7bf5a2c73
|
||||
LV Name 081b01d1-1436-4306-9baf-41c7bf5a2c73
|
||||
VG Name bd-vg-thin
|
||||
LV UUID coxpTY-2UZl-9293-8H2X-eAZn-wSp6-csZIeB
|
||||
LV Write Access read/write
|
||||
LV Creation host, time node, 2013-11-26 16:43:19 +0530
|
||||
LV Pool name lvol0
|
||||
LV Status available
|
||||
# open 0
|
||||
LV Size 256.00 MiB
|
||||
Mapped size 0.00%
|
||||
Current LE 64
|
||||
Segments 1
|
||||
Allocation inherit
|
||||
Read ahead sectors auto
|
||||
Block device 253:10
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
As can be seen from above, creation of a file resulted in creation of a thin LV in the brick.
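The `user.glusterfs.bd` values used in the examples above (`lv`, `lv:256MB`, `thin:256MB`) share a simple `type[:size]` shape. The Python sketch below splits such a value; it is an illustration only, the function name is hypothetical, and the size is kept as an uninterpreted string because the accepted units are not listed here.

```python
def parse_bd_xattr(value: str):
    """Split a user.glusterfs.bd value into (lv_type, size or None)."""
    lv_type, sep, size = value.partition(":")
    if lv_type not in ("lv", "thin"):
        raise ValueError(f"unknown LV type in {value!r}")
    if sep and not size:
        raise ValueError(f"missing size in {value!r}")
    return lv_type, (size if sep else None)


for v in ("lv", "lv:256MB", "thin:256MB"):
    print(v, "->", parse_bd_xattr(v))
```

When no size is given, the examples above show the LV being created with the default size of one logical extent.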
|
||||
|
||||
|
||||
###Improvements to BD translator:
|
||||
|
||||
The first version of BD xlator (block backend) had a few limitations, such as
|
||||
|
||||
* Creation of directories not supported
|
||||
* Supports only single brick
|
||||
* Does not use extended attributes (and client gfid) like posix xlator
|
||||
* Creation of special files (symbolic links, device nodes etc) not
|
||||
supported
|
||||
|
||||
Basic limitation of not allowing directory creation was blocking
|
||||
oVirt/VDSM to consume BD xlator as part of Gluster domain since VDSM
|
||||
creates multi-level directories when GlusterFS is used as storage
|
||||
backend for storing VM images.
|
||||
|
||||
To overcome these limitations, a new BD xlator with the following
improvements was implemented.
|
||||
|
||||
* New hybrid BD xlator that handles both regular files and block device
|
||||
files
|
||||
* The volume will have both POSIX and BD bricks. Regular files are
|
||||
created on POSIX bricks, block devices are created on the BD brick (VG)
|
||||
* BD xlator leverages the existing POSIX xlator for most POSIX calls and
|
||||
hence sits above the POSIX xlator
|
||||
* Block device file is differentiated from regular file by an extended
|
||||
attribute
|
||||
* The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a
|
||||
posix file to Logical Volume (LV).
|
||||
* When a client sends a request to set BD_XATTR on a posix file, a new
|
||||
LV is created and mapped to posix file. So every block device will
|
||||
have a representative file in POSIX brick with 'user.glusterfs.bd'
|
||||
(BD_XATTR) set.
|
||||
* Hereafter, all operations on this file result in LV-related
operations.
|
||||
|
||||
For example, opening a file that has BD_XATTR set results in opening
|
||||
the LV block device, reading results in reading the corresponding LV
|
||||
block device.
|
||||
|
||||
When BD xlator gets request to set BD_XATTR via setxattr call, it
|
||||
creates a LV and information about this LV is placed in the xattr of the
|
||||
posix file. The xattr "user.glusterfs.bd" is used to identify that a posix file
is mapped to BD.
|
||||
|
||||
Usage:
|
||||
Server side:
|
||||
|
||||
[root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2
|
||||
|
||||
It creates a distributed gluster volume 'bdvol' with Volume Group vg1
|
||||
using posix brick /storage/vg1_info in host1 and Volume Group vg2 using
|
||||
/storage/vg2_info in host2.
|
||||
|
||||
|
||||
[root@host1 ~]# gluster volume start bdvol
|
||||
|
||||
Client side:
|
||||
|
||||
[root@node ~]# mount -t glusterfs host1:/bdvol /media
|
||||
[root@node ~]# touch /media/posix
|
||||
|
||||
It creates regular posix file 'posix' in either host1:/vg1 or host2:/vg2 brick
|
||||
|
||||
[root@node ~]# mkdir /media/image
|
||||
|
||||
[root@node ~]# touch /media/image/lv1
|
||||
|
||||
|
||||
It also creates regular posix file 'lv1' in either host1:/vg1 or
|
||||
host2:/vg2 brick
|
||||
|
||||
[root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1
|
||||
|
||||
[root@node ~]#
|
||||
|
||||
|
||||
Above setxattr results in creating a new LV in corresponding brick's VG
|
||||
and it sets 'user.glusterfs.bd' with value 'lv:<default-extent-size>'
|
||||
|
||||
|
||||
[root@node ~]# truncate -s5G /media/image/lv1
|
||||
|
||||
|
||||
It results in resizing LV 'lv1' to 5G
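
The `setfattr` and `truncate` commands above correspond to the `setxattr(2)` and `truncate(2)` system calls, so an application could perform the same conversion programmatically. The following is a minimal sketch, assuming a Linux client with the volume mounted; `bd_convert_to_lv` is a hypothetical helper name used only for illustration and is not part of GlusterFS.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>
#include <unistd.h>

/* xattr name and value taken from the setfattr example above */
#define BD_XATTR "user.glusterfs.bd"

/*
 * bd_convert_to_lv - hypothetical helper, not part of GlusterFS
 * @path: file on a BD backed mount that should become an LV
 * @size: new size of the LV in bytes
 *
 * @return: success: 0
 *          failure: -1 (errno is set by the failing system call)
 */
int
bd_convert_to_lv (const char *path, off_t size)
{
        int ret = -1;

        if (path == NULL)
                goto out;

        /* same effect as: setfattr -n "user.glusterfs.bd" -v "lv" <path> */
        ret = setxattr (path, BD_XATTR, "lv", 2, 0);
        if (ret == -1)
                goto out;

        /* same effect as: truncate -s <size> <path> */
        ret = truncate (path, size);
out:
        return ret;
}
```

For the example above, a call such as `bd_convert_to_lv ("/media/image/lv1", 5LL * 1024 * 1024 * 1024)` would be the equivalent of the two shell commands.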
|
||||
|
||||
New BD xlator code is placed in `xlators/storage/bd` directory.
|
||||
|
||||
The volume UUID is also added to the VG as a tag so that the same VG cannot be used for other
bricks/volumes. After deleting a gluster volume, one has to manually
remove the associated tag using
`vgchange <vg-name> --deltag <trusted.glusterfs.volume-id:<volume-id>>`
|
||||
|
||||
|
||||
#### Exposing volume capabilities
|
||||
|
||||
With multiple storage translators (posix and bd) being supported in GlusterFS, it becomes
|
||||
necessary to know the volume type so that user can issue appropriate calls that are relevant
|
||||
only to a given volume type. Hence there needs to be a way to expose the type of
|
||||
the storage translator of the volume to the user.
|
||||
|
||||
BD xlator is capable of providing server offloaded file copy, server/storage offloaded
|
||||
zeroing of a file etc. These capabilities should be visible to the client/user, so that these
|
||||
features can be exploited.
|
||||
|
||||
BD xlator exports capability information through gluster volume info (and --xml) output. For eg:
|
||||
|
||||
`snip of gluster volume info output for a BD based volume`
|
||||
|
||||
Xlator 1: BD
|
||||
Capability 1: thin
|
||||
|
||||
`snip of gluster volume info --xml output for a BD based volume`
|
||||
|
||||
<xlators>
|
||||
<xlator>
|
||||
<name>BD</name>
|
||||
<capabilities>
|
||||
<capability>thin</capability>
|
||||
</capabilities>
|
||||
</xlator>
|
||||
</xlators>
|
||||
|
||||
But this capability information should also be exposed through some other means so that a host
which is not a Gluster peer can also use these capabilities.
|
||||
|
||||
* Type
|
||||
|
||||
BD translator supports both regular files and block devices, i.e., one can create files on
|
||||
GlusterFS volume backed by BD translator and this file could end up as regular posix file or
|
||||
a logical volume (block device) based on the user's choice. User can do a setxattr on the
|
||||
created file to convert it to a logical volume.
|
||||
|
||||
A user of a BD backed volume, such as QEMU, would like to know that it is working with a BD type of volume
so that it can issue an additional setxattr call after creating a VM image on the GlusterFS backend.
This is necessary to ensure that the created VM image is backed by an LV instead of a file.
|
||||
|
||||
There are different ways to expose this information (BD type of volume) to user.
|
||||
One way is to export it via a `getxattr` call. That is, when a client issues getxattr("volume_type")
on the root gfid, the BD xlator returns 1, implying that it is the BD xlator, whereas the posix xlator returns ENODATA,
which client code can interpret as the posix xlator. The capability list can likewise be returned via
getxattr("caps") on the root gfid.
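
The client-side interpretation described above can be sketched as follows. This is only an illustration under stated assumptions: the key name "volume_type" is quoted from the text (a real deployment may use a namespaced key), the code assumes a Linux client, and `interpret_volume_type` and `probe_volume_type` are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

enum vol_type { VOL_POSIX, VOL_BD, VOL_UNKNOWN };

/*
 * interpret_volume_type - map a getxattr() result to a volume type
 * @ret: return value of getxattr()
 * @err: errno observed right after the call
 *
 * Success means the BD xlator answered, ENODATA means posix,
 * anything else is treated as unknown.
 */
enum vol_type
interpret_volume_type (ssize_t ret, int err)
{
        enum vol_type type = VOL_UNKNOWN;

        if (ret >= 0)
                type = VOL_BD;
        else if (err == ENODATA)
                type = VOL_POSIX;

        return type;
}

/*
 * probe_volume_type - hypothetical probe on the root of a mount
 * @mount_root: path of the mounted volume, e.g. "/media"
 */
enum vol_type
probe_volume_type (const char *mount_root)
{
        char          value[16] = {0};
        ssize_t       ret       = -1;
        enum vol_type type      = VOL_UNKNOWN;

        if (mount_root == NULL)
                goto out;

        /* key name taken from the text above; it is an assumption */
        ret = getxattr (mount_root, "volume_type", value, sizeof (value));
        type = interpret_volume_type (ret, errno);
out:
        return type;
}
```

Keeping the interpretation separate from the system call makes the decision logic easy to check without a mounted volume.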
|
||||
|
||||
* Capabilities
|
||||
|
||||
BD xlator supports new features such as server offloaded file copy, thin provisioned VM images etc.
|
||||
|
||||
There is no standard way of exploiting these features from client side (such as syscall
|
||||
to exploit server offloaded copy). So these features need to be exported to the client so that
|
||||
they can be used. The latest version of the BD xlator exports this capability information through
gluster volume info (and --xml) output. But if a client is not a GlusterFS peer,
it can't run the volume info command to get the list of capabilities of a given GlusterFS volume.
For example, the GlusterFS block driver in QEMU needs to get the capability list so that these features can be used.
|
||||
|
||||
|
||||
|
||||
Parts of this documentation were originally published here
|
||||
#http://raobharata.wordpress.com/2013/11/27/glusterfs-block-device-translator/
|
||||
@@ -1,402 +0,0 @@
|
||||
GlusterFS Coding Standards
|
||||
==========================
|
||||
|
||||
Structure definitions should have a comment per member
|
||||
------------------------------------------------------
|
||||
|
||||
Every member in a structure definition must have a comment about its
|
||||
purpose. The comment should be descriptive without being overly verbose.
|
||||
|
||||
*Bad:*
|
||||
|
||||
```
|
||||
gf_lock_t lock; /* lock */
|
||||
```
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
DBTYPE access_mode; /* access mode for accessing
|
||||
* the databases, can be
|
||||
* DB_HASH, DB_BTREE
|
||||
* (option access-mode <mode>)
|
||||
*/
|
||||
```
|
||||
|
||||
Declare all variables at the beginning of the function
|
||||
------------------------------------------------------
|
||||
|
||||
All local variables in a function must be declared immediately after the
|
||||
opening brace. This makes it easy to keep track of memory that needs to be freed
|
||||
during exit. It also helps debugging, since gdb cannot handle variables
|
||||
declared inside loops or other such blocks.
|
||||
|
||||
Always initialize local variables
|
||||
---------------------------------
|
||||
|
||||
Every local variable should be initialized to a sensible default value
|
||||
at the point of its declaration. All pointers should be initialized to NULL,
|
||||
and all integers should be zero or (if it makes sense) an error value.
|
||||
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
int ret = 0;
|
||||
char *databuf = NULL;
|
||||
int _fd = -1;
|
||||
```
|
||||
|
||||
Initialization should always be done with a constant value
|
||||
----------------------------------------------------------
|
||||
|
||||
Never use a non-constant expression as the initialization value for a variable.
|
||||
|
||||
|
||||
*Bad:*
|
||||
|
||||
```
|
||||
pid_t pid = frame->root->pid;
|
||||
char *databuf = malloc (1024);
|
||||
```
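
A possible *Good* counterpart to the example above is sketched here: every local starts from a constant, and the non-constant values are assigned later, where failures can be handled. The `demo_frame` and `demo_root` structures are hypothetical stand-ins for the real frame types, and `demo_init_locals` is an illustrative name.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* hypothetical stand-ins for the frame structures in the Bad example */
struct demo_root  { pid_t pid; };
struct demo_frame { struct demo_root *root; };

int
demo_init_locals (struct demo_frame *frame, pid_t *pid_out)
{
        pid_t  pid     = 0;     /* constant initializer */
        char  *databuf = NULL;  /* constant initializer */
        int    ret     = -1;

        if ((frame == NULL) || (frame->root == NULL) || (pid_out == NULL))
                goto out;

        /* non-constant values are assigned after validation */
        pid = frame->root->pid;

        databuf = malloc (1024);
        if (databuf == NULL)
                goto out;

        *pid_out = pid;
        ret = 0;
out:
        free (databuf);  /* free (NULL) is a no-op */
        return ret;
}
```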
|
||||
|
||||
Validate all arguments to a function
|
||||
------------------------------------
|
||||
|
||||
All pointer arguments to a function must be checked for `NULL`.
|
||||
A macro named `VALIDATE` (in `common-utils.h`)
|
||||
takes one argument, and if it is `NULL`, writes a log message and
|
||||
jumps to a label called `err` after setting op_ret and op_errno
|
||||
appropriately. It is recommended to use this template.
|
||||
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
VALIDATE(frame);
|
||||
VALIDATE(this);
|
||||
VALIDATE(inode);
|
||||
```
|
||||
|
||||
Never rely on precedence of operators
|
||||
-------------------------------------
|
||||
|
||||
Never write code that relies on the precedence of operators to execute
|
||||
correctly. Such code can be hard to read and someone else might not
|
||||
know the precedence of operators as accurately as you do.
|
||||
|
||||
*Bad:*
|
||||
|
||||
```
|
||||
if (op_ret == -1 && errno != ENOENT)
|
||||
```
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
if ((op_ret == -1) && (errno != ENOENT))
|
||||
```
|
||||
|
||||
Use exactly matching types
|
||||
--------------------------
|
||||
|
||||
Use a variable of the exact type declared in the manual to hold the
|
||||
return value of a function. Do not use an "equivalent" type.
|
||||
|
||||
|
||||
*Bad:*
|
||||
|
||||
```
|
||||
int len = strlen (path);
|
||||
```
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
size_t len = strlen (path);
|
||||
```
|
||||
|
||||
Never write code such as `foo->bar->baz`; check every pointer
|
||||
-------------------------------------------------------------
|
||||
|
||||
Do not write code that blindly follows a chain of pointer
|
||||
references. Any pointer in the chain may be `NULL` and thus
|
||||
cause a crash. Verify that each pointer is non-null before following
|
||||
it.
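
As an illustration of this rule, the sketch below walks a `foo->bar->baz` chain one pointer at a time. The `foo_t`, `bar_t` and `baz_t` structures and the function `foo_get_value` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical structures forming a foo->bar->baz chain */
struct baz_t { int value; };
struct bar_t { struct baz_t *baz; };
struct foo_t { struct bar_t *bar; };

/*
 * foo_get_value - read foo->bar->baz->value without blind dereferences
 *
 * @return: success: 0 (value stored in *value)
 *          failure: -1 (some pointer in the chain was NULL)
 */
int
foo_get_value (struct foo_t *foo, int *value)
{
        int ret = -1;

        if ((foo == NULL) || (value == NULL))
                goto out;

        if (foo->bar == NULL)
                goto out;

        if (foo->bar->baz == NULL)
                goto out;

        *value = foo->bar->baz->value;
        ret = 0;
out:
        return ret;
}
```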
|
||||
|
||||
Check return value of all functions and system calls
|
||||
----------------------------------------------------
|
||||
|
||||
The return value of all system calls and API functions must be checked
|
||||
for success or failure.
|
||||
|
||||
*Bad:*
|
||||
|
||||
```
|
||||
close (fd);
|
||||
```
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
op_ret = close (_fd);
if (op_ret == -1) {
        /* save errno before any call that might change it */
        op_errno = errno;
        gf_log (this->name, GF_LOG_ERROR,
                "close on file %s failed (%s)", real_path,
                strerror (op_errno));
        goto out;
}
|
||||
```
|
||||
|
||||
|
||||
Gracefully handle failure of malloc
|
||||
-----------------------------------
|
||||
|
||||
GlusterFS should never crash or exit due to lack of memory. If a
|
||||
memory allocation fails, the call should be unwound and an error
|
||||
returned to the user.
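
A minimal sketch of this rule follows, using plain `malloc` rather than the GlusterFS allocation wrappers. `demo_dup_path` is a hypothetical function: on allocation failure it reports `ENOMEM` to the caller instead of aborting, so the caller can unwind with an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * demo_dup_path - duplicate a path, reporting failures to the caller
 *
 * @return: success: 0 (*copy points to a new buffer)
 *          failure: -1 (*op_errno is EINVAL or ENOMEM)
 */
int
demo_dup_path (const char *path, char **copy, int *op_errno)
{
        char   *buf = NULL;
        size_t  len = 0;
        int     ret = -1;

        if (op_errno == NULL)
                goto out;

        if ((path == NULL) || (copy == NULL)) {
                *op_errno = EINVAL;
                goto out;
        }

        len = strlen (path) + 1;
        buf = malloc (len);
        if (buf == NULL) {
                /* do not crash or exit; let the caller unwind */
                *op_errno = ENOMEM;
                goto out;
        }

        memcpy (buf, path, len);
        *copy = buf;
        ret = 0;
out:
        return ret;
}
```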
|
||||
|
||||
*Use result args and reserve the return value to indicate success or failure:*
|
||||
|
||||
The return value of every function must indicate success or failure (unless
|
||||
it is impossible for the function to fail --- e.g., boolean functions). If
|
||||
the function needs to return additional data, it must be returned using a
|
||||
result (pointer) argument.
|
||||
|
||||
*Bad:*
|
||||
|
||||
```
|
||||
int32_t dict_get_int32 (dict_t *this, char *key);
|
||||
```
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
int dict_get_int32 (dict_t *this, char *key, int32_t *val);
|
||||
```
|
||||
|
||||
Always use the `n` versions of string functions
|
||||
-----------------------------------------------
|
||||
|
||||
Unless impossible, use the length-limited versions of the string functions.
|
||||
|
||||
*Bad:*
|
||||
|
||||
```
|
||||
strcpy (entry_path, real_path);
|
||||
```
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
strncpy (entry_path, real_path, entry_path_len);
|
||||
```
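
One detail worth keeping in mind is that `strncpy` does not write a terminating NUL when the source is at least as long as the limit. The sketch below shows one way to handle that, under the assumption that truncation should be reported as an error; `demo_copy_path` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * demo_copy_path - length-limited copy that always terminates dst
 *
 * @return: success: 0
 *          failure: -1 (bad arguments, or src did not fit and was truncated)
 */
int
demo_copy_path (char *dst, size_t dst_len, const char *src)
{
        int ret = -1;

        if ((dst == NULL) || (src == NULL) || (dst_len == 0))
                goto out;

        /* copy at most dst_len - 1 bytes and terminate explicitly */
        strncpy (dst, src, dst_len - 1);
        dst[dst_len - 1] = '\0';

        if (strlen (src) >= dst_len)
                goto out;  /* src was truncated */

        ret = 0;
out:
        return ret;
}
```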
|
||||
|
||||
No dead or commented code
|
||||
-------------------------
|
||||
|
||||
There must be no dead code (code to which control can never be passed) or
|
||||
commented out code in the codebase.
|
||||
|
||||
Only one unwind and return per function
|
||||
---------------------------------------
|
||||
|
||||
There must be only one exit out of a function. `UNWIND` and return
|
||||
should happen at only one point in the function.
|
||||
|
||||
Function length or Keep functions small
|
||||
---------------------------------------
|
||||
|
||||
We live in the UNIX-world where modules do one thing and do it well.
|
||||
This rule should apply to our functions also. If a function is very long, try splitting it
|
||||
into many little helper functions. The question is, in a coding
|
||||
spree, how do we know a function is long and unreadable. One rule of
|
||||
thumb given by Linus Torvalds is that, a function should be broken-up
|
||||
if you have 4 or more levels of indentation going on for more than 3-4
|
||||
lines.
|
||||
|
||||
*Example for a helper function:*
|
||||
```
|
||||
static int
|
||||
same_owner (posix_lock_t *l1, posix_lock_t *l2)
|
||||
{
|
||||
return ((l1->client_pid == l2->client_pid) &&
|
||||
(l1->transport == l2->transport));
|
||||
}
|
||||
```
|
||||
|
||||
Defining functions as static
|
||||
----------------------------
|
||||
|
||||
Define internal functions as static only if you are
very sure that no crash of any kind can originate in
that function. If there is even a remote possibility, perhaps due to
pointer dereferencing, declare the function as non-static. This
ensures that when a crash does happen, the function name shows up
in the backtrace generated by libc. However, doing so has the potential
to pollute the function namespace, so to avoid conflicts with other
components, ensure that the function names are
prefixed with an identifier of the component to which they
belong. For example, non-static functions in the io-threads translator start
with iot_.
|
||||
|
||||
Ensure function calls wrap around after 80-columns
|
||||
--------------------------------------------------
|
||||
|
||||
Place remaining arguments on the next line if needed.
|
||||
|
||||
Functions arguments and function definition
|
||||
-------------------------------------------
|
||||
|
||||
Place all the arguments of a function definition on the same line
|
||||
until the line goes beyond 80-cols. Arguments that extend beyond
|
||||
80-cols should be placed on the next line.
|
||||
|
||||
Style issues
|
||||
------------
|
||||
|
||||
### Brace placement
|
||||
|
||||
Use K&R/Linux style of brace placement for blocks.
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
int some_function (...)
|
||||
{
|
||||
if (...) {
|
||||
/* ... */
|
||||
} else if (...) {
|
||||
/* ... */
|
||||
} else {
|
||||
/* ... */
|
||||
}
|
||||
|
||||
do {
|
||||
/* ... */
|
||||
} while (cond);
|
||||
}
|
||||
```
|
||||
|
||||
### Indentation
|
||||
|
||||
Use *eight* spaces for indenting blocks. Ensure that your
|
||||
file contains only spaces and not tab characters. You can do this
|
||||
in Emacs by selecting the entire file (`C-x h`) and
|
||||
running `M-x untabify`.
|
||||
|
||||
To make Emacs indent lines automatically by eight spaces, add this
|
||||
line to your `.emacs`:
|
||||
|
||||
```
|
||||
(add-hook 'c-mode-hook (lambda () (c-set-style "linux")))
|
||||
```
|
||||
|
||||
### Comments
|
||||
|
||||
Write a comment before every function describing its purpose (one-line),
|
||||
its arguments, and its return value. Mention whether it is an internal
|
||||
function or an exported function.
|
||||
|
||||
Write a comment before every structure describing its purpose, and
|
||||
write comments about each of its members.
|
||||
|
||||
Follow the style shown below for comments, since such comments
|
||||
can then be automatically extracted by doxygen to generate
|
||||
documentation.
|
||||
|
||||
*Good:*
|
||||
|
||||
```
|
||||
/**
|
||||
* hash_name -hash function for filenames
|
||||
* @par: parent inode number
|
||||
* @name: basename of inode
|
||||
* @mod: number of buckets in the hashtable
|
||||
*
|
||||
* @return: success: bucket number
|
||||
* failure: -1
|
||||
*
|
||||
* Not for external use.
|
||||
*/
|
||||
```
|
||||
|
||||
### Indicating critical sections
|
||||
|
||||
To clearly show regions of code which execute with locks held, use
|
||||
the following format:
|
||||
|
||||
```
|
||||
pthread_mutex_lock (&mutex);
|
||||
{
|
||||
/* code */
|
||||
}
|
||||
pthread_mutex_unlock (&mutex);
|
||||
```
|
||||
|
||||
*A skeleton fop function:*
|
||||
|
||||
This is the recommended template for any fop. In the beginning come
|
||||
the initializations. After that, the "success" control flow should be
|
||||
linear. Any error conditions should cause a `goto` to a single
|
||||
point, `out`. At that point, the code should detect the error
|
||||
that has occurred and do appropriate cleanup.
|
||||
|
||||
```
|
||||
int32_t
|
||||
sample_fop (call_frame_t *frame, xlator_t *this, ...)
|
||||
{
|
||||
char * var1 = NULL;
|
||||
int32_t op_ret = -1;
|
||||
int32_t op_errno = 0;
|
||||
DIR * dir = NULL;
|
||||
struct posix_fd * pfd = NULL;
|
||||
|
||||
VALIDATE_OR_GOTO (frame, out);
|
||||
VALIDATE_OR_GOTO (this, out);
|
||||
|
||||
/* other validations */
|
||||
|
||||
dir = opendir (...);
|
||||
|
||||
if (dir == NULL) {
|
||||
op_errno = errno;
|
||||
gf_log (this->name, GF_LOG_ERROR,
|
||||
"opendir failed on %s (%s)", loc->path,
|
||||
strerror (op_errno));
|
||||
goto out;
|
||||
}
|
||||
|
||||
/* another system call */
|
||||
if (...) {
|
||||
op_errno = ENOMEM;
|
||||
gf_log (this->name, GF_LOG_ERROR,
|
||||
"out of memory :(");
|
||||
goto out;
|
||||
}
|
||||
|
||||
/* ... */
|
||||
|
||||
out:
|
||||
if (op_ret == -1) {
|
||||
|
||||
/* check for all the cleanup that needs to be
|
||||
done */
|
||||
|
||||
if (dir) {
|
||||
closedir (dir);
|
||||
dir = NULL;
|
||||
}
|
||||
|
||||
if (pfd) {
|
||||
FREE (pfd->path);
|
||||
FREE (pfd);
|
||||
pfd = NULL;
|
||||
}
|
||||
}
|
||||
|
||||
STACK_UNWIND (frame, op_ret, op_errno, fd);
|
||||
return 0;
|
||||
}
|
||||
```
|
||||
@@ -1,55 +0,0 @@
|
||||
This document explains how to analyze core-dumps obtained from regression
|
||||
machines, with examples.
|
||||
1) Download the core-tarball and extract it.
|
||||
2) 'cd' into directory where the tarball is extracted.
|
||||
~~~
|
||||
[root@atalur Downloads]# pwd
|
||||
/home/atalur/Downloads
|
||||
[root@atalur Downloads]# ls
|
||||
build build-install-20150625_05_42_39.tar.bz2 lib64 usr
|
||||
~~~
|
||||
3) Determine the core file you need to examine. There can be more than one core file.
|
||||
You can list them from './build/install/cores' directory.
|
||||
~~~
|
||||
[root@atalur Downloads]# ls build/install/cores/
|
||||
core.9341 liblist.txt liblist.txt.tmp
|
||||
~~~
|
||||
In case you are unsure which binary generated the core-file, executing 'file' command on it will help.
|
||||
~~~
|
||||
[root@atalur Downloads]# file ./build/install/cores/core.9341
|
||||
./build/install/cores/core.9341: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/build/install/sbin/glusterfsd -s slave26.cloud.gluster.org --volfile-id patchy'
|
||||
~~~
|
||||
As seen, the core file was generated by glusterfsd binary, and path to it is provided (/build/install/sbin/glusterfsd).
|
||||
4) Now, run the following command on the core:
|
||||
~~~
|
||||
gdb -ex 'set sysroot ./' -ex 'core-file ./build/install/cores/core.xxx' <target, say ./build/install/sbin/glusterd>
|
||||
In this case,
|
||||
gdb -ex 'set sysroot ./' -ex 'core-file ./build/install/cores/core.9341' ./build/install/sbin/glusterfsd
|
||||
~~~
|
||||
5) You can cross check if all shared libraries are available and loaded by using 'info sharedlibrary' command from
|
||||
inside gdb.
|
||||
6) Once verified, usual gdb commands based on requirement can be used to debug the core.
|
||||
'bt' or 'backtrace' from gdb of core used in examples:
|
||||
~~~
|
||||
Core was generated by `/build/install/sbin/glusterfsd -s slave26.cloud.gluster.org --volfile-id patchy'.
|
||||
Program terminated with signal SIGABRT, Aborted.
|
||||
#0 0x00007f512a54e625 in raise () from ./lib64/libc.so.6
|
||||
(gdb) bt
|
||||
#0 0x00007f512a54e625 in raise () from ./lib64/libc.so.6
|
||||
#1 0x00007f512a54fe05 in abort () from ./lib64/libc.so.6
|
||||
#2 0x00007f512a54774e in __assert_fail_base () from ./lib64/libc.so.6
|
||||
#3 0x00007f512a547810 in __assert_fail () from ./lib64/libc.so.6
|
||||
#4 0x00007f512b9fc434 in __gf_free (free_ptr=0x7f50f4000e50) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/mem-pool.c:304
|
||||
#5 0x00007f512b9b6657 in loc_wipe (loc=0x7f510c20d1a0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/xlator.c:685
|
||||
#6 0x00007f511cb8201d in mq_start_quota_txn_v2 (this=0x7f5118019b60, loc=0x7f510c20d2b8, ctx=0x7f50f4000bf0, contri=0x7f50f4000d60)
|
||||
at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/features/marker/src/marker-quota.c:2921
|
||||
#7 0x00007f511cb82c55 in mq_initiate_quota_task (opaque=0x7f510c20d2b0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/features/marker/src/marker-quota.c:3199
|
||||
#8 0x00007f511cb81820 in mq_synctask (this=0x7f5118019b60, task=0x7f511cb829fa <mq_initiate_quota_task>, spawn=_gf_false, loc=0x7f510c20d430, dict=0x0, buf=0x0, contri=0)
|
||||
at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/features/marker/src/marker-quota.c:2789
|
||||
#9 0x00007f511cb82f82 in mq_initiate_quota_blocking_txn (this=0x7f5118019b60, loc=0x7f510c20d430) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/features/marker/src/marker-quota.c:3230
|
||||
#10 0x00007f511cb82844 in mq_reduce_parent_size_task (opaque=0x7f510c000df0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/features/marker/src/marker-quota.c:3117
|
||||
#11 0x00007f512ba0f9dc in synctask_wrap (old_task=0x7f510c0053e0) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/syncop.c:370
|
||||
#12 0x00007f512a55f8f0 in ?? () from ./lib64/libc.so.6
|
||||
#13 0x0000000000000000 in ?? ()
|
||||
(gdb)
|
||||
~~~
|
||||
@@ -1,38 +0,0 @@
|
||||
|
||||
How to introduce new daemons using daemon management framework
|
||||
==============================================================
|
||||
Glusterd manages GlusterFS daemons providing services like NFS, Proactive
|
||||
self-heal, Quota, User serviceable snapshots etc. Following are some of the
|
||||
aspects that come under daemon management.
|
||||
|
||||
Data members & functions of different management objects
|
||||
|
||||
- **Connection Management**
|
||||
- unix domain sockets based channel for internal communication
|
||||
- rpc connection for the communication
|
||||
- frame timeout value for UDS
|
||||
- Methods - notify
|
||||
- init, connect, termination, disconnect APIs can be invoked using the
|
||||
connection management object
|
||||
|
||||
- **Process Management**
|
||||
- Name of the process
|
||||
- pidfile to detect if the daemon is running
|
||||
- logging directory, log file, volfile, volfileserver & volfileid
|
||||
- init, stop APIs can be invoked using the process management object
|
||||
|
||||
- **Service Management**
|
||||
- connection object
|
||||
- process object
|
||||
- online status
|
||||
- Methods - manager, start, stop which can be abstracted as common methods
|
||||
or specific to service requirements
|
||||
- init API can be invoked using the service management object
|
||||
|
||||
The above structures define the skeleton of the daemon management framework.
New daemons introduced in GlusterFS need to inherit these properties. Any
requirement specific to a daemon needs to be implemented in its own service
(e.g. snapd defines its own type glusterd_snapdsvc_t using glusterd_svc_t
and snapd-specific data). New daemons need to have their own service-specific
code written in glusterd-<feature>-svc.h{c} and need to reuse the existing
framework.
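
The layering described above (a service object that contains a connection object and a process object, and a daemon-specific type that embeds the service) can be sketched as follows. All `demo_*` names and fields are hypothetical simplifications for illustration; they are not the actual glusterd definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* connection management: channel used to talk to the daemon */
struct demo_conn {
        const char *sockpath;       /* unix domain socket path */
        int         frame_timeout;  /* frame timeout for the UDS */
};

/* process management: what is needed to find and stop the daemon */
struct demo_proc {
        const char *name;     /* name of the process */
        const char *pidfile;  /* used to detect a running daemon */
};

struct demo_svc;
typedef int (*demo_svc_op_t) (struct demo_svc *svc);

/* service management: ties connection and process together */
struct demo_svc {
        struct demo_conn conn;
        struct demo_proc proc;
        bool             online;  /* online status */
        demo_svc_op_t    start;   /* common or service-specific */
        demo_svc_op_t    stop;
};

/* a daemon-specific service embeds the common service object */
struct demo_snapdsvc {
        struct demo_svc svc;
        int             port;  /* example of daemon-specific data */
};

static int
demo_svc_common_start (struct demo_svc *svc)
{
        svc->online = true;
        return 0;
}

static int
demo_svc_common_stop (struct demo_svc *svc)
{
        svc->online = false;
        return 0;
}

/* demo_svc_init - fill a service object with the common methods */
int
demo_svc_init (struct demo_svc *svc, const char *name)
{
        int ret = -1;

        if ((svc == NULL) || (name == NULL))
                goto out;

        svc->conn.sockpath = NULL;
        svc->conn.frame_timeout = 0;
        svc->proc.name = name;
        svc->proc.pidfile = NULL;
        svc->online = false;
        svc->start = demo_svc_common_start;
        svc->stop = demo_svc_common_stop;
        ret = 0;
out:
        return ret;
}
```

A daemon that needs different behaviour would assign its own functions to `start` and `stop` after calling the common init.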
|
||||
@@ -1,226 +0,0 @@
|
||||
#Inode and dentry management in GlusterFS:
|
||||
|
||||
##Background
|
||||
Filesystems internally refer to files and directories via inodes. Inodes
|
||||
are unique identifiers of the entities stored in a filesystem. Whenever an
|
||||
application has to operate on a file/directory (read/modify), the filesystem
|
||||
maps that file/directory to the right inode and starts referring to that inode
|
||||
whenever an operation has to be performed on the file/directory.
|
||||
|
||||
In GlusterFS a new inode gets created whenever a new file/directory is created
|
||||
OR when a successful lookup is done on a file/directory for the first time.
|
||||
Inodes in GlusterFS are maintained by the inode table which gets initiated when
|
||||
the filesystem daemon is started (both for the brick process as well as the
|
||||
mount process). Below are some important data structures for inode management.
|
||||
|
||||
## Data-structure (inode-table)
|
||||
```
|
||||
struct _inode_table {
|
||||
pthread_mutex_t lock;
|
||||
size_t hashsize; /* bucket size of inode hash and dentry hash */
|
||||
char *name; /* name of the inode table, just for gf_log() */
|
||||
inode_t *root; /* root directory inode, with inode
|
||||
number and gfid 1 */
|
||||
xlator_t *xl; /* xlator to be called to do purge and
|
||||
the xlator which maintains the inode table*/
|
||||
uint32_t lru_limit; /* maximum LRU cache size */
|
||||
struct list_head *inode_hash; /* buckets for inode hash table */
|
||||
struct list_head *name_hash; /* buckets for dentry hash table */
|
||||
struct list_head active; /* list of inodes currently active (in an fop) */
|
||||
uint32_t active_size; /* count of inodes in active list */
|
||||
struct list_head lru; /* list of inodes recently used.
|
||||
lru.next most recent */
|
||||
uint32_t lru_size; /* count of inodes in lru list */
|
||||
struct list_head purge; /* list of inodes to be purged soon */
|
||||
uint32_t purge_size; /* count of inodes in purge list */
|
||||
|
||||
struct mem_pool *inode_pool; /* memory pool for inodes */
|
||||
struct mem_pool *dentry_pool; /* memory pool for dentrys */
|
||||
struct mem_pool *fd_mem_pool; /* memory pool for fd_t */
|
||||
int ctxcount; /* number of slots in inode->ctx */
|
||||
};
|
||||
```
|
||||
|
||||
#Life-cycle
|
||||
```
|
||||
|
||||
inode_table_new (size_t lru_limit, xlator_t *xl)
|
||||
|
||||
This is a function which allocates a new inode table. Usually the top xlators in
|
||||
the graph such as protocol/server (for bricks), fuse and nfs (for fuse and nfs
|
||||
mounts) and libgfapi do inode managements. Hence they are the ones which will
|
||||
allocate a new inode table by calling the above function.
|
||||
|
||||
Each xlator graph in glusterfs maintains an inode table. So in fuse clients,
|
||||
whenever there is a graph change due to add brick/remove brick or
|
||||
addition/removal of some other xlators, a new graph is created which creates a
|
||||
new inode table.
|
||||
|
||||
Thus an allocated inode table is destroyed only when the filesystem daemon is
|
||||
killed or unmounted.
|
||||
|
||||
```
|
||||
|
||||
#what it contains.
|
||||
```
|
||||
|
||||
Inode table in glusterfs mainly contains a hash table for maintaining inodes.
|
||||
In general a file/directory is considered to be existing if there is a
|
||||
corresponding inode present in the inode table. If an inode for a file/directory
cannot be found in the inode table, glusterfs tries to resolve it by sending a
lookup on the entry for which the inode is needed. If lookup is successful, then
a new inode corresponding to the entry is added to the hash table present in the
inode table. Thus an inode present in the hash table means it is an existing
file/directory within the filesystem. The inode table also contains the hash
size of the hash table (as of now it is hard coded to 14057; the hash value of
an inode is calculated using its gfid).
|
||||
|
||||
Apart from the hash table, inode table also maintains 3 important list of inodes
|
||||
1) Active list:
|
||||
Active list contains all the active inodes (i.e inodes which are currently part
|
||||
of some fop).
|
||||
2) Lru list:
|
||||
Least recently used inodes list. A limit can be set for the size of the lru
|
||||
list. For bricks it is 16384 and for clients it is infinity.
|
||||
3) Purge list:
|
||||
List of all the inodes which have to be purged (i.e inodes which have to be
|
||||
deleted from the inode table due to unlink/rmdir/forget).
|
||||
|
||||
And at last it also contains the mem-pool for allocating inodes, dentries so
|
||||
that frequent malloc/calloc and free of the data structures can be avoided.
|
||||
```
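
To illustrate how a 16-byte gfid can be mapped to one of the 14057 buckets mentioned above, here is a small sketch. The hash function used (a simple multiply-and-add over the bytes) is an assumption chosen for readability and is not claimed to be the function GlusterFS uses; `demo_gfid_bucket` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hash size quoted in the text above */
#define DEMO_HASHSIZE 14057

/*
 * demo_gfid_bucket - map a gfid to a bucket of the inode hash table
 * @gfid: 16-byte identifier of the inode
 * @hashsize: number of buckets in the table
 *
 * @return: bucket number in the range [0, hashsize), 0 on bad input
 */
size_t
demo_gfid_bucket (const unsigned char *gfid, size_t hashsize)
{
        uint32_t hash   = 0;
        size_t   bucket = 0;
        size_t   i      = 0;

        if ((gfid == NULL) || (hashsize == 0))
                goto out;

        /* illustrative hash only, not the real GlusterFS function */
        for (i = 0; i < 16; i++)
                hash = (hash * 31u) + gfid[i];

        bucket = hash % hashsize;
out:
        return bucket;
}
```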
|
||||
|
||||
#Data structure (inode)
|
||||
```
|
||||
struct _inode {
|
||||
inode_table_t *table; /* the table this inode belongs to */
|
||||
uuid_t gfid; /* unique identifier of the inode */
|
||||
gf_lock_t lock;
|
||||
uint64_t nlookup;
|
||||
uint32_t fd_count; /* Open fd count */
|
||||
uint32_t ref; /* reference count on this inode */
|
||||
ia_type_t ia_type; /* what kind of file */
|
||||
struct list_head fd_list; /* list of open files on this inode */
|
||||
struct list_head dentry_list; /* list of directory entries for this inode */
|
||||
struct list_head hash; /* hash table pointers */
|
||||
struct list_head list; /* active/lru/purge */
|
||||
|
||||
struct _inode_ctx *_ctx; /* place holder for keeping the
|
||||
information about the inode by different xlators */
|
||||
};
|
||||
|
||||
As said above, inodes are the internal way of identifying files/directories. An
inode uniquely represents a file/directory. A new inode is created whenever a
create/mkdir/symlink/mknod operation is performed. Apart from that, a new inode
is created upon the successful fresh lookup of a file/directory. Say the
filesystem contained some file "a" within root and the filesystem was
unmounted. Now when glusterfs is mounted and some operation is performed on "/a",
|
||||
glusterfs tries to get the inode for the entry "a" with parent inode as
|
||||
root. But, since glusterfs just came up, it will not be able to find the inode
|
||||
for "a" and will send a lookup on "/a". If the lookup operation succeeds (i.e.
|
||||
the root of glusterfs contains an entry called "a"), then a new inode for "/a"
|
||||
is created and added to the inode table.
|
||||
|
||||
Depending upon the situation, an inode can be in one of the 3 lists maintained
|
||||
by the inode table. If some fop is happening on the inode, then the inode will
|
||||
be present in the active inodes list maintained by the inode table. Active
|
||||
inodes are those inodes whose refcount is greater than zero. Whenever some
|
||||
operation comes on a file/directory, and the resolver tries to find the inode
|
||||
for it, it increments the refcount of the inode before returning the inode. The
|
||||
refcount of an inode can be incremented by calling the below function
|
||||
|
||||
inode_ref (inode_t *inode)
|
||||
|
||||
Any xlator which wants to operate on an inode as part of some fop (or wants the
|
||||
inode in the callback), should hold a ref on the inode.
|
||||
Once the fop is completed before sending the reply of the fop to the above
|
||||
layers, the inode has to be unrefed. When the refcount of an inode becomes
|
||||
zero, it is removed from the active inodes list and put into LRU list maintained
|
||||
by the inode table. Thus in short if some fop is happening on a file/directory,
|
||||
the corresponding inode will be in the active list or it will be in the LRU
|
||||
list.
|
||||
```
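
The movement of an inode between the active list and the LRU list as its refcount changes can be modelled with a toy example. This sketch is a deliberate simplification: it tracks only list membership and sizes, and it ignores locking, the nlookup count and the purge list. All `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_list { DEMO_LIST_LRU, DEMO_LIST_ACTIVE };

struct demo_table {
        uint32_t active_size;  /* count of inodes in active list */
        uint32_t lru_size;     /* count of inodes in lru list */
};

struct demo_inode {
        struct demo_table *table;  /* the table this inode belongs to */
        uint32_t           ref;    /* reference count on this inode */
        enum demo_list     list;   /* list the inode is currently on */
};

/* a new inode starts unreferenced, on the lru list */
void
demo_inode_init (struct demo_table *table, struct demo_inode *inode)
{
        inode->table = table;
        inode->ref = 0;
        inode->list = DEMO_LIST_LRU;
        table->lru_size++;
}

/* first ref moves the inode from lru to active */
struct demo_inode *
demo_inode_ref (struct demo_inode *inode)
{
        if (inode != NULL) {
                if (inode->ref == 0) {
                        inode->table->lru_size--;
                        inode->table->active_size++;
                        inode->list = DEMO_LIST_ACTIVE;
                }
                inode->ref++;
        }

        return inode;
}

/* dropping the last ref moves the inode from active back to lru */
void
demo_inode_unref (struct demo_inode *inode)
{
        if ((inode != NULL) && (inode->ref > 0)) {
                inode->ref--;
                if (inode->ref == 0) {
                        inode->table->active_size--;
                        inode->table->lru_size++;
                        inode->list = DEMO_LIST_LRU;
                }
        }
}
```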
|
||||
|
||||
#Life Cycle
|
||||
|
||||
A new inode is created whenever a new file/directory/symlink is created OR a
|
||||
successful lookup of an existing entry is done. The xlators which do inode
|
||||
management (as of now protocol/server, fuse, nfs, gfapi) will perform inode_link
|
||||
operation upon successful lookup or successful creation of a new entry.
|
||||
|
||||
inode_link (inode_t *inode, inode_t *parent, const char *name,
|
||||
struct iatt *buf);
|
||||
|
||||
inode_link actually adds the inode to the inode table (to be precise it adds
|
||||
the inode to the hash table maintained by the inode table. The hash value is
|
||||
calculated based on the gfid). Copies the gfid to the inode (the gfid is
|
||||
present in the iatt structure). Creates a dentry with the new name.
|
||||
|
||||
An inode is removed from the inode table and eventually destroyed when an unlink
or rmdir operation is performed on a file/directory, or when the LRU limit of
the inode table has been exceeded.
|
||||
|
||||
#Data structure (dentry)
|
||||
```
|
||||
|
||||
struct _dentry {
|
||||
struct list_head inode_list; /* list of dentries of inode */
|
||||
struct list_head hash; /* hash table pointers */
|
||||
inode_t *inode; /* inode of this directory entry */
|
||||
char *name; /* name of the directory entry */
|
||||
inode_t *parent; /* directory of the entry */
|
||||
};
|
||||
|
||||
A dentry is the presence of an entry for a file/directory within its parent
|
||||
directory. A dentry usually points to the inode to which it belongs. In
|
||||
glusterfs a dentry contains the following fields.
|
||||
1) a hook using which it can add itself to the list of
|
||||
the dentries maintained by the inode to which it points.
|
||||
2) A hash table pointer.
|
||||
3) Pointer to the inode to which it belongs.
|
||||
4) Name of the dentry
|
||||
5) Pointer to the inode of the parent directory in which the dentry is present
|
||||
|
||||
A new dentry is created when a new file/directory/symlink is created or a hard
|
||||
link to an existing file is created.
|
||||
|
||||
__dentry_create (inode_t *inode, inode_t *parent, const char *name);
|
||||
|
||||
A dentry holds a refcount on the parent
|
||||
directory so that the parent inode is never removed from the active inodes list
|
||||
and put to the lru list (If the lru limit of the lru list is exceeded, there is
|
||||
a chance of parent inode being destroyed. To avoid it, the dentries hold a
|
||||
reference to the parent inode). A dentry is removed whenever an unlink/rmdir
|
||||
is performed on a file/directory. Or when the lru limit has been exceeded, the
|
||||
oldest inodes are purged out of the inode table, during which all the dentries
|
||||
of the inode are removed.
|
||||
|
||||
Whenever an unlink/rmdir comes on a file/directory, the corresponding inode
|
||||
should be removed from the inode table. So upon unlink/rmdir, the inode will
|
||||
be moved to the purge list maintained by the inode table and from there it is
|
||||
destroyed. To be more specific, if an inode has to be destroyed, its refcount
|
||||
and nlookup count both should become 0. For refcount to become 0, the inode
|
||||
should not be part of any fop (there should not be any open fds). Or if the
|
||||
inode belongs to a directory, then there should not be any fop happening on the
|
||||
directory and it should not contain any dentries within it. For nlookup count to
|
||||
become zero, a forget has to be sent on the inode with nlookup count set to 0 as
|
||||
an argument. For fuse clients, forget is sent by the kernel itself whenever a
|
||||
unlink/rmdir is performed. But for brick processes, upon unlink/rmdir, the
|
||||
protocol/server itself has to do inode_forget. Whenever the inode has to be
|
||||
deleted due to file removal or lru limit being exceeded the inode is retired
|
||||
(i.e. all the dentries of the inode are deleted and the inode is moved to the
|
||||
purge list maintained by the inode table), the nlookup count is set to 0 via
|
||||
inode_forget API. The inode table then prunes all the inodes from the purge
|
||||
list by destroying the inode contexts maintained by each xlator.
|
||||
|
||||
Unlinking of the dentry is done via inode_unlink:
|
||||
|
||||
void
|
||||
inode_unlink (inode_t *inode, inode_t *parent, const char *name);
|
||||
|
||||
If the inode has multiple hard links, then the unlink operation performed by
|
||||
the application results just in the removal of the dentry with the name provided
|
||||
by the application. For the inode to be removed, all the dentries of the inode
|
||||
should be unlinked.
|
||||
```
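The hard-link and destruction rules described above can be sketched as a toy model. The names `toy_file`, `toy_unlink` and `toy_can_destroy` are hypothetical and the dentry list is reduced to a counter; this is an illustration under those assumptions, not the GlusterFS code.

```c
#include <assert.h>

/* Toy model only: hypothetical names, not the GlusterFS inode API. */
struct toy_file {
        int           ref;        /* fops (or open fds) currently using the inode */
        unsigned long nlookup;    /* lookups that have not been forgotten yet */
        int           dentries;   /* one per name, i.e. one per hard link */
        int           retired;    /* set once the last dentry is gone */
};

/* Remove one name. The inode is retired only when no dentry is left;
 * resetting nlookup here stands in for the forget described in the text. */
int
toy_unlink (struct toy_file *f)
{
        assert (f->dentries > 0);
        f->dentries--;
        if (f->dentries == 0) {
                f->retired = 1;
                f->nlookup = 0;
        }
        return f->dentries;
}

/* A retired inode can be destroyed once both counts have dropped to 0. */
int
toy_can_destroy (const struct toy_file *f)
{
        return f->retired && f->ref == 0 && f->nlookup == 0;
}
```

With two hard links, the first `toy_unlink` only removes a name; the second retires the inode, and destruction still waits for the last ref to be dropped.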
|
||||
|
||||
@@ -1,259 +0,0 @@
|
||||
#Iobuf-pool
|
||||
##Datastructures
|
||||
###iobuf
|
||||
Short for IO Buffer. It is one allocatable unit for the consumers of the IOBUF
|
||||
API; each unit hosts @page_size (defined in the arena structure) bytes of memory. As an
|
||||
initial step of processing a fop, the IO buffer passed onto GlusterFS by the
|
||||
other applications (FUSE VFS/ Applications using gfapi) is copied into GlusterFS
|
||||
space i.e. iobufs. Hence Iobufs are mostly allocated/deallocated in Fuse, gfapi,
|
||||
protocol xlators, and also in performance xlators to cache the IO buffers etc.
|
||||
```
|
||||
struct iobuf {
|
||||
union {
|
||||
struct list_head list;
|
||||
struct {
|
||||
struct iobuf *next;
|
||||
struct iobuf *prev;
|
||||
};
|
||||
};
|
||||
struct iobuf_arena *iobuf_arena;
|
||||
|
||||
gf_lock_t lock; /* for ->ptr and ->ref */
|
||||
int ref; /* 0 == passive, >0 == active */
|
||||
|
||||
void *ptr; /* usable memory region by the consumer */
|
||||
|
||||
void *free_ptr; /* in case of stdalloc, this is the
|
||||
one to be freed not the *ptr */
|
||||
};
|
||||
```
|
||||
|
||||
###iobref
|
||||
There may be a need for multiple iobufs for a single fop, as in vectored read/write.
Hence multiple iobufs (16 by default) are encapsulated under one iobref.
|
||||
```
|
||||
struct iobref {
|
||||
gf_lock_t lock;
|
||||
int ref;
|
||||
struct iobuf **iobrefs; /* list of iobufs */
|
||||
int alloced; /* 16 by default, grows as required */
|
||||
int used; /* number of iobufs added to this iobref */
|
||||
};
|
||||
```
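The "16 by default, grows as required" behaviour of the array can be sketched as follows. This is a minimal illustration with hypothetical names (`toy_iobref`, `toy_iobref_init`, `toy_iobref_add`, `toy_iobref_fini`); the doubling growth policy is an assumption of the sketch, not something the text above specifies.

```c
#include <stdlib.h>

#define TOY_IOBREF_DEFAULT 16   /* default capacity named in the text */

/* Toy model only: hypothetical names, not the GlusterFS iobref API. */
struct toy_iobref {
        void **slots;     /* stands in for the struct iobuf ** array */
        int    alloced;   /* current capacity */
        int    used;      /* slots in use */
};

int
toy_iobref_init (struct toy_iobref *r)
{
        r->slots = calloc (TOY_IOBREF_DEFAULT, sizeof (*r->slots));
        if (!r->slots)
                return -1;
        r->alloced = TOY_IOBREF_DEFAULT;
        r->used = 0;
        return 0;
}

/* Append one buffer pointer; assumption: the array doubles when it is full. */
int
toy_iobref_add (struct toy_iobref *r, void *buf)
{
        if (r->used == r->alloced) {
                size_t  bytes = 2 * (size_t) r->alloced * sizeof (*r->slots);
                void  **grown = realloc (r->slots, bytes);

                if (!grown)
                        return -1;
                r->slots = grown;
                r->alloced *= 2;
        }
        r->slots[r->used++] = buf;
        return 0;
}

void
toy_iobref_fini (struct toy_iobref *r)
{
        free (r->slots);
        r->slots = NULL;
        r->alloced = 0;
        r->used = 0;
}
```

Adding a seventeenth entry is the first operation that triggers a reallocation in this sketch.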
|
||||
###iobuf_arenas
|
||||
One region of memory MMAPed from the operating system. Each region MMAPs
|
||||
@arena_size bytes of memory, and hosts @arena_size / @page_size IOBUFs.
|
||||
The same sized iobufs are grouped into one arena, for sanity of access.
|
||||
|
||||
```
|
||||
struct iobuf_arena {
|
||||
union {
|
||||
struct list_head list;
|
||||
struct {
|
||||
struct iobuf_arena *next;
|
||||
struct iobuf_arena *prev;
|
||||
};
|
||||
};
|
||||
|
||||
size_t page_size; /* size of all iobufs in this arena */
|
||||
size_t arena_size; /* this is equal to
|
||||
(iobuf_pool->arena_size / page_size)
|
||||
* page_size */
|
||||
size_t page_count;
|
||||
|
||||
struct iobuf_pool *iobuf_pool;
|
||||
|
||||
void *mem_base;
|
||||
struct iobuf *iobufs; /* allocated iobufs list */
|
||||
|
||||
int active_cnt;
|
||||
struct iobuf active; /* head node iobuf
|
||||
(unused by itself) */
|
||||
int passive_cnt;
|
||||
struct iobuf passive; /* head node iobuf
|
||||
(unused by itself) */
|
||||
uint64_t alloc_cnt; /* total allocs in this pool */
|
||||
int max_active; /* max active buffers at a given time */
|
||||
};
|
||||
|
||||
```
|
||||
###iobuf_pool
|
||||
Pool of iobufs. As the filesystem may require many IO buffers, a pool of iobufs
is preallocated and kept; the standard malloc/free is called only when these
preallocated ones are exhausted, which improves performance. There is generally
one iobuf pool per process, allocated during glusterfs_ctx_t init
(glusterfs_ctx_defaults_init). Currently the preallocated iobuf pool memory is
freed on process exit. The iobuf pool is globally accessible across GlusterFS,
hence iobufs allocated by any xlator can be accessed by any other xlator
(provided the iobuf is passed to it).
|
||||
```
|
||||
struct iobuf_pool {
|
||||
pthread_mutex_t mutex;
|
||||
size_t arena_size; /* size of memory region in
|
||||
arena */
|
||||
size_t default_page_size; /* default size of iobuf */
|
||||
|
||||
int arena_cnt;
|
||||
struct list_head arenas[GF_VARIABLE_IOBUF_COUNT];
|
||||
/* array of arenas. Each element of the array is a list of arenas
|
||||
holding iobufs of particular page_size */
|
||||
|
||||
struct list_head filled[GF_VARIABLE_IOBUF_COUNT];
|
||||
/* array of arenas without free iobufs */
|
||||
|
||||
struct list_head purge[GF_VARIABLE_IOBUF_COUNT];
|
||||
/* array of arenas which can be purged */
|
||||
|
||||
uint64_t request_misses; /* mostly the requests for higher
|
||||
value of iobufs */
|
||||
};
|
||||
```
|
||||
~~~
|
||||
The default size of the iobuf_pool(as of yet):
|
||||
1024 iobufs of 128Bytes = 128KB
|
||||
512 iobufs of 512Bytes = 256KB
|
||||
512 iobufs of 2KB = 1MB
|
||||
128 iobufs of 8KB = 1MB
|
||||
64 iobufs of 32KB = 2MB
|
||||
32 iobufs of 128KB = 4MB
|
||||
8 iobufs of 256KB = 2MB
|
||||
2 iobufs of 1MB = 2MB
|
||||
Total ~12.4MB
|
||||
~~~
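As a cross-check of the table above, the rows can be summed programmatically. The names `toy_arena_spec`, `toy_default_arenas` and `toy_default_pool_bytes` are hypothetical; only the page sizes and counts come from the table.

```c
#include <stddef.h>

/* One row of the table above: page size in bytes and number of iobufs. */
struct toy_arena_spec {
        size_t page_size;
        size_t page_count;
};

static const struct toy_arena_spec toy_default_arenas[] = {
        { 128,         1024 },
        { 512,         512  },
        { 2 * 1024,    512  },
        { 8 * 1024,    128  },
        { 32 * 1024,   64   },
        { 128 * 1024,  32   },
        { 256 * 1024,  8    },
        { 1024 * 1024, 2    },
};

/* Sum page_size * page_count over all rows. */
size_t
toy_default_pool_bytes (void)
{
        size_t n = sizeof (toy_default_arenas) / sizeof (toy_default_arenas[0]);
        size_t total = 0;
        size_t i;

        for (i = 0; i < n; i++)
                total += toy_default_arenas[i].page_size *
                         toy_default_arenas[i].page_count;
        return total;
}
```

The rows add up to 12976128 bytes, i.e. 12.375MB of preallocated iobuf memory.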
|
||||
As seen in the data structure, iobuf_pool has 3 arena lists.
|
||||
|
||||
- arenas:
|
||||
The arenas allocated during iobuf_pool create, are part of this list. This list
|
||||
also contains arenas that are partially filled, i.e. contain a few active and a few
|
||||
passive iobufs (passive_cnt != 0, active_cnt != 0, except for initially allocated
|
||||
arenas). There will be by default 8 arenas of the sizes mentioned above.
|
||||
- filled:
|
||||
If all the iobufs in the arena are filled (passive_cnt = 0), the arena is moved
|
||||
to the filled list. If any of the iobufs from the filled arena is iobuf_put,
|
||||
then the arena moves back to the 'arenas' list.
|
||||
- purge:
|
||||
If there are no active iobufs in the arena (active_cnt = 0), the arena is moved
|
||||
to the purge list. iobuf_put() triggers destruction of the arenas in this list. The
|
||||
arenas in the purge list are destroyed only if there is at least one arena in
|
||||
the 'arenas' list; that way there won't be spurious mmap/munmap of buffers.
|
||||
(e.g: If there is an arena (page_size=128KB, count=32) in purge list, this arena
|
||||
is destroyed(munmap) only if there is an arena in 'arenas' list with page_size=128KB).
|
||||
|
||||
##APIs
|
||||
###iobuf_get
|
||||
|
||||
```
|
||||
struct iobuf *iobuf_get (struct iobuf_pool *iobuf_pool);
|
||||
```
|
||||
Creates a new iobuf of the default page size (128KB, hard coded as of yet).
|
||||
Also takes a reference(increments ref count), hence no need of doing it
|
||||
explicitly after getting iobuf.
|
||||
|
||||
###iobuf_get2
|
||||
|
||||
```
|
||||
struct iobuf * iobuf_get2 (struct iobuf_pool *iobuf_pool, size_t page_size);
|
||||
```
|
||||
Creates a new iobuf of the specified page size; if page_size is 0, the default
page size is used.
|
||||
```
|
||||
if (requested iobuf size > Max iobuf size in the pool(1MB as of yet))
|
||||
{
|
||||
Perform standard allocation(CALLOC) of the requested size and
|
||||
add it to the list iobuf_pool->arenas[IOBUF_ARENA_MAX_INDEX].
|
||||
}
|
||||
else
|
||||
{
|
||||
-Round the page size to match the standard sizes in iobuf pool.
|
||||
(eg: if 3KB is requested, it is rounded to 8KB).
|
||||
-Select the arena list corresponding to the rounded size
|
||||
(eg: select 8KB arena)
|
||||
If the selected arena has passive count > 0, then return the
|
||||
iobuf from this arena, set the counters(passive/active/etc.)
|
||||
appropriately.
|
||||
else the arena is full, allocate new arena with rounded size
|
||||
and standard page numbers and add to the arena list
|
||||
(eg: 128 iobufs of 8KB is allocated).
|
||||
}
|
||||
```
|
||||
Also takes a reference(increments ref count), hence no need of doing it
|
||||
explicitly after getting iobuf.
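The rounding step in the pseudocode above can be sketched like this. `toy_round_page_size`, `toy_standard_sizes` and `TOY_DEFAULT_PAGE_SIZE` are hypothetical names; the size classes are taken from the default pool table earlier in this document, and returning 0 to signal "too large, use standard allocation" is a convention of this sketch only.

```c
#include <stddef.h>

#define TOY_DEFAULT_PAGE_SIZE (128 * 1024)

/* Standard iobuf sizes, smallest first (from the default pool table). */
static const size_t toy_standard_sizes[] = {
        128, 512, 2 * 1024, 8 * 1024, 32 * 1024,
        128 * 1024, 256 * 1024, 1024 * 1024,
};

/* Return the smallest standard size that can hold `requested`.
 * A request of 0 means "default page size". A return value of 0 means
 * the request exceeds the largest class and needs a standard allocation. */
size_t
toy_round_page_size (size_t requested)
{
        size_t n = sizeof (toy_standard_sizes) / sizeof (toy_standard_sizes[0]);
        size_t i;

        if (requested == 0)
                requested = TOY_DEFAULT_PAGE_SIZE;

        for (i = 0; i < n; i++) {
                if (requested <= toy_standard_sizes[i])
                        return toy_standard_sizes[i];
        }
        return 0;
}
```

For example, a 3KB request lands in the 8KB class, matching the example in the pseudocode.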
|
||||
|
||||
###iobuf_ref
|
||||
|
||||
```
|
||||
struct iobuf *iobuf_ref (struct iobuf *iobuf);
|
||||
```
|
||||
Take a reference on the iobuf. If using an iobuf allocated by some other
|
||||
xlator/function, it is good practice to take a reference so that the iobuf is not
|
||||
deleted by the allocator.
|
||||
|
||||
###iobuf_unref
|
||||
```
|
||||
void iobuf_unref (struct iobuf *iobuf);
|
||||
```
|
||||
Unreference the iobuf; if the ref count drops to zero, the iobuf is considered free.
|
||||
|
||||
```
|
||||
-Delete the iobuf, if allocated from standard alloc and return.
|
||||
-set the active/passive count appropriately.
|
||||
-if passive count > 0 then add the arena to 'arena' list.
|
||||
-if active count = 0 then add the arena to 'purge' list.
|
||||
```
|
||||
Every iobuf_ref should have a corresponding iobuf_unref, and also every
|
||||
iobuf_get/2 should have a corresponding iobuf_unref.
|
||||
|
||||
###iobref_new
|
||||
```
|
||||
struct iobref *iobref_new ();
|
||||
```
|
||||
Creates a new iobref structure and returns its pointer.
|
||||
|
||||
###iobref_ref
|
||||
```
|
||||
struct iobref *iobref_ref (struct iobref *iobref);
|
||||
```
|
||||
Take a reference on the iobref.
|
||||
|
||||
###iobref_unref
|
||||
```
|
||||
void iobref_unref (struct iobref *iobref);
|
||||
```
|
||||
Decrements the reference count of the iobref. If the ref count is 0, then unref
|
||||
all the iobufs(iobuf_unref) in the iobref, and destroy the iobref.
|
||||
|
||||
###iobref_add
|
||||
```
|
||||
int iobref_add (struct iobref *iobref, struct iobuf *iobuf);
|
||||
```
|
||||
Adds the given iobuf into the iobref, it takes a ref on the iobuf before adding
|
||||
it, hence explicit iobuf_ref is not required if adding to the iobref.
|
||||
|
||||
###iobref_merge
|
||||
```
|
||||
int iobref_merge (struct iobref *to, struct iobref *from);
|
||||
```
|
||||
Adds all the iobufs in the 'from' iobref to the 'to' iobref. Merge will not
|
||||
cause the deletion of the 'from' iobref; it therefore results in another ref
|
||||
on all the iobufs added to the 'to' iobref. Hence iobref_unref should be
|
||||
performed both on 'from' and 'to' iobrefs (performing iobref_unref only on 'to'
|
||||
will not free the iobufs and may result in a leak).
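The reference counting behind this rule can be illustrated with a toy model. All names (`toy_buf`, `toy_bref`, `toy_bref_add`, `toy_bref_merge`, `toy_bref_unref`) are hypothetical, the array is fixed at 16 entries, and nothing is actually freed; the sketch only tracks counts.

```c
#include <assert.h>

#define TOY_MAX 16

/* Toy model only: hypothetical names, not the GlusterFS iobuf/iobref API. */
struct toy_buf {
        int ref;
};

struct toy_bref {
        int             ref;
        struct toy_buf *bufs[TOY_MAX];
        int             used;
};

void
toy_buf_unref (struct toy_buf *b)
{
        assert (b->ref > 0);
        b->ref--;
}

/* Adding a buffer takes one extra ref on it. */
int
toy_bref_add (struct toy_bref *r, struct toy_buf *b)
{
        if (r->used == TOY_MAX)
                return -1;
        b->ref++;
        r->bufs[r->used++] = b;
        return 0;
}

/* Merge adds every buffer of 'from' to 'to'; 'from' keeps its own refs. */
int
toy_bref_merge (struct toy_bref *to, const struct toy_bref *from)
{
        int i;

        for (i = 0; i < from->used; i++) {
                if (toy_bref_add (to, from->bufs[i]) != 0)
                        return -1;
        }
        return 0;
}

/* Dropping the last ref on the container unrefs every buffer it holds. */
void
toy_bref_unref (struct toy_bref *r)
{
        int i;

        assert (r->ref > 0);
        if (--r->ref > 0)
                return;
        for (i = 0; i < r->used; i++)
                toy_buf_unref (r->bufs[i]);
        r->used = 0;
}
```

After a merge the buffer carries one ref per container, so unreferencing only 'to' leaves the count at 1; the buffer becomes free only after 'from' is unreferenced as well.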
|
||||
|
||||
###iobref_clear
|
||||
```
|
||||
void iobref_clear (struct iobref *iobref);
|
||||
```
|
||||
Unreference all the iobufs in the iobref, and also unref the iobref.
|
||||
|
||||
##Iobuf Leaks
|
||||
If any iobuf_ref/iobuf_get does not have a corresponding iobuf_unref, then the
|
||||
iobufs are not freed and recurring execution of such code path may lead to huge
|
||||
memory leaks. The easiest way to identify if a memory leak is caused by iobufs
|
||||
is to take a statedump. If the statedump shows a lot of filled arenas then it is
|
||||
a sure sign of a leak. Refer to doc/debugging/statedump.md for more details.
|
||||
|
||||
If iobufs are leaking, the next step is to find where the iobuf_unref went
|
||||
missing. There is no standard/easy way of debugging this, code reading and logs
|
||||
are the only ways. If the memory leak can be reproduced at will,
then logs (gf_callinginfo) in iobuf_ref/unref might help.
|
||||
TODO: An easier way to debug iobuf leaks.
|
||||
@@ -1,124 +0,0 @@
|
||||
#Mem-pool
|
||||
##Background
|
||||
There was a time when every fop in glusterfs incurred the cost of allocations/de-allocations for every stack wind/unwind between xlators, because the stack/frame/*_local_t in every wind/unwind was allocated and de-allocated. All these calls in the fop path added a lot of latency, and the worst part is that most of the time the number of frames/stacks active at any time wouldn't cross a threshold. So it was decided that this threshold number of frames/stacks would be allocated only once, at the beginning of the process. One of them is taken from the pool of stacks/frames whenever `STACK_WIND` is performed and put back into the pool in `STACK_UNWIND`/`STACK_DESTROY`, without incurring any extra system calls. Data structures are allocated from the heap only when the threshold number of such items is already in active use, i.e. the pool is in complete use. The increase in performance once this was added to all the common data structures (inode/fd/dict etc.) in xlators throughout the stack was tremendous.
|
||||
|
||||
## Data structure
|
||||
```
|
||||
struct mem_pool {
        /* Each member in the mempool is an element padded with a
           doubly-linked-list + ptr of mempool + is-in-use info. This list is
           used to add the element to the list of free members in the mem-pool */
        struct list_head  list;
        /* number of mempool elements that are in active use */
        int               hot_count;
        /* number of mempool elements that are not in use. If a new allocation
           is required it will be served from here until all the elements in
           the pool are in use, i.e. cold-count becomes 0 */
        int               cold_count;
        /* synchronization mechanism */
        gf_lock_t         lock;
        /* Each mempool element is padded with a doubly-linked-list + ptr of
           mempool + is-in-use info to operate the pool of elements; this size
           is the element-size after padding */
        unsigned long     padded_sizeof_type;
        /* Starting address of pool */
        void             *pool;
        /* Ending address of pool. If an element address is in the range
           between pool and pool_end then it is alloced from the pool,
           otherwise it is 'calloced'. This is very useful for functions like
           'mem_put' */
        void             *pool_end;
        /* size of just the element without any padding */
        int               real_sizeof_type;
        /* Number of times this type of data is allocated throughout the life
           of this process. This may include calloced elements as well */
        uint64_t          alloc_count;
        /* Number of times the element had to be allocated from heap because
           all elements from the pool are in active use */
        uint64_t          pool_misses;
        /* Maximum number of elements from the pool in active use at any point
           in the life of the process. This does *not* include calloced
           elements */
        int               max_alloc;
        /* Number of elements that are allocated from heap at the moment
           because the pool is in complete use. It should be '0' when the pool
           is not in complete use */
        int               curr_stdalloc;
        /* Maximum number of allocations from heap, after the pool is
           completely used, that are in active use at any point in the life of
           the process */
        int               max_stdalloc;
        /* Contains xlator-name:data-type as a string */
        char             *name;
        /* This is used to insert it into the global_list of mempools
           maintained in 'glusterfs-ctx' */
        struct list_head  global_list;
};
|
||||
```
|
||||
|
||||
##Life-cycle
|
||||
```
|
||||
mem_pool_new (data_type, unsigned long count)
|
||||
|
||||
This is a macro which expands to mem_pool_new_fn (sizeof (data_type), count, string-rep-of-data_type)
|
||||
|
||||
struct mem_pool *
|
||||
mem_pool_new_fn (unsigned long sizeof_type, unsigned long count, char *name)
|
||||
|
||||
Padded-element:
|
||||
----------------------------------------
|
||||
|list-ptr|mem-pool-address|in-use|Element|
|
||||
----------------------------------------
|
||||
```
|
||||
|
||||
This function allocates the `mem-pool` structure and sets up the pool for use.
|
||||
`name` parameter above is the `string` containing type of the datatype. This `name` is appended to `xlator-name + ':'` so that it can be easily identified in things like statedump. `count` is the number of elements that need to be allocated. `sizeof_type` is the size of each element. Ideally `('sizeof_type'*'count')` should be the size of the total pool. But to manage the pool using `mem_get`/`mem_put` (will be explained after this section) each element needs to be padded in the front with a `('list', 'mem-pool-address', 'in_use')`. So the actual size of the pool it allocates will be `('padded_sizeof_type'*'count')`. Why these extra elements are needed will be evident after understanding how `mem_get` and `mem_put` are implemented. In this function it just initializes all the `list` structures in front of each element and adds them to the `mem_pool->list` which represent the list of `cold` elements which can be allocated whenever `mem_get` is called on this mem_pool. It remembers mem_pool's start and end addresses in `mem_pool->pool`, `mem_pool->pool_end` respectively. Initializes `mem_pool->cold_count` to `count` and `mem_pool->hot_count` to `0`. This mem-pool will be added to the list of `global_list` maintained in `glusterfs-ctx`
|
||||
|
||||
|
||||
```
|
||||
void* mem_get (struct mem_pool *mem_pool)
|
||||
|
||||
Initial-list before mem-get
|
||||
----------------
|
||||
| Pool |
|
||||
| ----------- | ---------------------------------------- ----------------------------------------
|
||||
| | pool-list | |<---> |list-ptr|mem-pool-address|in-use|Element|<--->|list-ptr|mem-pool-address|in-use|Element|
|
||||
| ----------- | ---------------------------------------- ----------------------------------------
|
||||
----------------
|
||||
|
||||
list after mem-get from the pool
|
||||
----------------
|
||||
| Pool |
|
||||
| ----------- | ----------------------------------------
|
||||
| | pool-list | |<--->|list-ptr|mem-pool-address|in-use|Element|
|
||||
| ----------- | ----------------------------------------
|
||||
----------------
|
||||
|
||||
List when the pool is full:
|
||||
----------------
|
||||
| Pool | extra element that is allocated
|
||||
| ----------- | ----------------------------------------
|
||||
| | pool-list | | |list-ptr|mem-pool-address|in-use|Element|
|
||||
| ----------- | ----------------------------------------
|
||||
----------------
|
||||
```
|
||||
|
||||
This function is similar to `malloc()` but it gives memory of type `element` of this pool. When this function is called it increments `mem_pool->alloc_count`, checks if there are any free elements in the pool that can be returned by inspecting `mem_pool->cold_count`. If `mem_pool->cold_count` is non-zero then it means there are elements in the pool which are not in active use. It deletes one element from the list of free elements and decrements `mem_pool->cold_count` and increments `mem_pool->hot_count` to indicate there is one more element in active use. Updates `mem_pool->max_alloc` accordingly. Sets `element->in_use` in the padded memory to `1`. Sets `element->mem_pool` address to this mem_pool also in the padded memory(It is useful for mem_put). Returns the address of the memory after the padded boundary to the caller of this function. In the cases where all the elements in the pool are in active use it `callocs` the element with padded size and sets mem_pool address in the padded memory. To indicate the pool-miss and give useful accounting information of the pool-usage it increments `mem_pool->pool_misses`, `mem_pool->curr_stdalloc`. Updates `mem_pool->max_stdalloc` accordingly.
|
||||
|
||||
```
|
||||
void* mem_get0 (struct mem_pool *mem_pool)
|
||||
```
|
||||
Just like `calloc` is to `malloc`, `mem_get0` is to `mem_get`. It memsets the memory to all '0' before returning the element.
|
||||
|
||||
|
||||
```
|
||||
void mem_put (void *ptr)
|
||||
|
||||
list before mem-put from the pool
|
||||
----------------
|
||||
| Pool |
|
||||
| ----------- | ----------------------------------------
|
||||
| | pool-list | |<--->|list-ptr|mem-pool-address|in-use|Element|
|
||||
| ----------- | ----------------------------------------
|
||||
----------------
|
||||
|
||||
list after mem-put to the pool
|
||||
----------------
|
||||
| Pool |
|
||||
| ----------- | ---------------------------------------- ----------------------------------------
|
||||
| | pool-list | |<---> |list-ptr|mem-pool-address|in-use|Element|<--->|list-ptr|mem-pool-address|in-use|Element|
|
||||
| ----------- | ---------------------------------------- ----------------------------------------
|
||||
----------------
|
||||
|
||||
If mem_put is putting an element not from pool then it is just freed so
|
||||
no change to the pool
|
||||
----------------
|
||||
| Pool |
|
||||
| ----------- |
|
||||
| | pool-list | |
|
||||
| ----------- |
|
||||
----------------
|
||||
```
|
||||
|
||||
This function is similar to `free()`. Remember that the ptr passed to this function is the address of the element, so this function first computes the pointer to the head of the padding in front of it. If this memory falls between `mem_pool->pool` and `mem_pool->pool_end` then the memory is part of the 'pool' memory that was allocated, so it does some sanity checks to see if the memory is indeed the head of the element by checking that `in_use` is set to `1`. It resets `in_use` to `0`. It gets the mem_pool address stored in the padded region and adds this element to the list of free elements. It decreases `mem_pool->hot_count` and increases `mem_pool->cold_count`. In the case where the padded-element address does not fall in the range of `mem_pool->pool`, `mem_pool->pool_end`, it just frees the element and decreases `mem_pool->curr_stdalloc`.
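The get/put flow described in this section can be sketched as a small toy pool. It is a simplified illustration, not the GlusterFS code: all names (`toy_pool`, `toy_hdr`, `toy_get`, `toy_put`, ...) are hypothetical, a singly linked free list replaces the doubly linked one, and there is no locking or statistics beyond three counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, fixed-size elements, no locking. */
struct toy_pool;

struct toy_hdr {                     /* padding placed in front of an element */
        struct toy_hdr  *next;       /* free-list link ("list-ptr") */
        struct toy_pool *pool;       /* owning pool ("mem-pool-address") */
        int              in_use;
};

struct toy_pool {
        struct toy_hdr *free_list;   /* cold elements */
        char           *pool;        /* start of the preallocated region */
        char           *pool_end;    /* end of the preallocated region */
        size_t          padded_size; /* header + element, after rounding */
        int             hot_count, cold_count, curr_stdalloc;
};

#define TOY_ALIGN    (_Alignof (max_align_t))
#define TOY_ROUND(n) (((n) + TOY_ALIGN - 1) / TOY_ALIGN * TOY_ALIGN)
#define TOY_HDR_SIZE TOY_ROUND (sizeof (struct toy_hdr))

/* Assumes count > 0. */
int
toy_pool_init (struct toy_pool *p, size_t elem_size, int count)
{
        int i;

        p->padded_size = TOY_HDR_SIZE + TOY_ROUND (elem_size);
        p->pool = calloc ((size_t) count, p->padded_size);
        if (!p->pool)
                return -1;
        p->pool_end = p->pool + (size_t) count * p->padded_size;
        p->free_list = NULL;
        for (i = 0; i < count; i++) {
                struct toy_hdr *h =
                        (struct toy_hdr *) (p->pool + (size_t) i * p->padded_size);
                h->pool = p;
                h->next = p->free_list;
                p->free_list = h;
        }
        p->hot_count = 0;
        p->cold_count = count;
        p->curr_stdalloc = 0;
        return 0;
}

void *
toy_get (struct toy_pool *p)
{
        struct toy_hdr *h;

        if (p->cold_count > 0) {             /* serve from the pool */
                h = p->free_list;
                p->free_list = h->next;
                p->cold_count--;
                p->hot_count++;
        } else {                             /* pool miss: use the heap */
                h = calloc (1, p->padded_size);
                if (!h)
                        return NULL;
                h->pool = p;
                p->curr_stdalloc++;
        }
        h->in_use = 1;
        return (char *) h + TOY_HDR_SIZE;    /* address after the padding */
}

void
toy_put (void *ptr)
{
        struct toy_hdr  *h = (struct toy_hdr *) ((char *) ptr - TOY_HDR_SIZE);
        struct toy_pool *p = h->pool;
        uintptr_t        a = (uintptr_t) h;

        assert (h->in_use == 1);
        if (a >= (uintptr_t) p->pool && a < (uintptr_t) p->pool_end) {
                h->in_use = 0;               /* back onto the free list */
                h->next = p->free_list;
                p->free_list = h;
                p->hot_count--;
                p->cold_count++;
        } else {                             /* came from the heap */
                free (h);
                p->curr_stdalloc--;
        }
}

void
toy_pool_fini (struct toy_pool *p)
{
        free (p->pool);
        p->pool = p->pool_end = NULL;
        p->free_list = NULL;
}
```

The header in front of each element is what lets `toy_put` take only the element pointer: it steps back over the padding, finds the owning pool, and uses the address range to decide between returning the element to the free list and freeing it.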
|
||||
|
||||
```
|
||||
void
|
||||
mem_pool_destroy (struct mem_pool *pool)
|
||||
```
|
||||
Deletes this pool from the `global_list` maintained by `glusterfs-ctx` and frees all the memory allocated in `mem_pool_new`.
|
||||
|
||||
|
||||
###How to pick pool-size
|
||||
This varies from work-load to work-load. Create the mem-pool with some initial size and run the work-load. Take the statedump after the work-load is complete. In the statedump, if `max_alloc` is always less than `cold_count`, then consider reducing the size of the pool closer to `max_alloc`. On the other hand, if there are lots of `pool-misses`, then increase the `pool_size` by `max_stdalloc` to achieve a better 'hit-rate' of the pool.
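The tuning rule above can be expressed as a small helper. This is a rough sketch with hypothetical names (`toy_pool_hit_rate`, `toy_suggest_pool_size`); it assumes the counters have been read from a statedump by hand, and it simplifies the shrink condition to comparing `max_alloc` against the configured pool size.

```c
#include <stdint.h>

/* Fraction of allocations served from the pool. alloc_count includes the
 * allocations that missed the pool, as described for struct mem_pool. */
double
toy_pool_hit_rate (uint64_t alloc_count, uint64_t pool_misses)
{
        if (alloc_count == 0)
                return 1.0;
        return (double) (alloc_count - pool_misses) / (double) alloc_count;
}

/* Suggest a new pool size from statedump counters:
 *  - misses seen: grow by max_stdalloc
 *  - no misses and the pool was never fully used: shrink towards max_alloc
 *  - otherwise: keep the current size */
int
toy_suggest_pool_size (int pool_size, int max_alloc, int max_stdalloc,
                       uint64_t pool_misses)
{
        if (pool_misses > 0)
                return pool_size + max_stdalloc;
        if (max_alloc < pool_size)
                return max_alloc;
        return pool_size;
}
```

A pool of 64 elements that saw 5 misses with at most 10 heap elements live at once would be grown to 74 under this rule.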
|
||||
@@ -1,270 +0,0 @@
|
||||
|
||||
## Symbol Versions and SO_NAMEs
|
||||
|
||||
In general, adding new APIs to a shared library does not require that
|
||||
symbol versions be used or that the SO_NAME be "bumped." These actions
|
||||
are usually reserved for when a major change is introduced, e.g. many
|
||||
APIs change or a significant change in the functionality occurs.
|
||||
|
||||
Over the normal lifetime of a library, when a new API is added, the library is
|
||||
recompiled, consumers of the new API are able to do so, and existing,
|
||||
legacy consumers of the original API continue as before. If by some
|
||||
chance an old copy of the library is installed on a system, it's unlikely
|
||||
that most applications will be affected. New applications that use the
|
||||
new API will incur a run-time error and terminate.
|
||||
|
||||
Bumping the SO_NAME, i.e. changing the shared lib's file name, e.g.
|
||||
from libfoo.so.0 to libfoo.so.1, which also changes the ELF SO_NAME
|
||||
attribute inside the file, works a little differently. libfoo.so.0
|
||||
contains only the old APIs. libfoo.so.1 contains both the old and new
|
||||
APIs. Legacy software that was linked with libfoo.so.0 continues to work
|
||||
as libfoo.so.0 is usually left installed on the system. New software that
|
||||
uses the new APIs is linked with libfoo.so.1, and works as long as
libfoo.so.1 is installed on the system. Accidentally (re)installing
|
||||
libfoo.so.0 doesn't break new software as long as reinstalling doesn't
|
||||
erase libfoo.so.1.
|
||||
|
||||
Using symbol versions is somewhere in the middle. The shared library
|
||||
file remains libfoo.so.0 forever. Legacy APIs may or may not have an
|
||||
associated symbol version. New APIs may or may not have an associated
|
||||
symbol version either. In general symbol versions are reserved for APIs
|
||||
that have changed. Either the function's signature has changed, i.e. the
|
||||
return type or the number of parameters, and/or the parameter types have
|
||||
changed. Another reason for using symbol versions on an API is when the
|
||||
behaviour or functionality of the API changes dramatically. As with a
|
||||
library that doesn't use versioned symbols, old and new applications
|
||||
either find or don't find the versioned symbols they need. If the versioned
|
||||
symbol doesn't exist in the installed library, the application incurs a
|
||||
run-time error and terminates.
|
||||
|
||||
GlusterFS wanted to keep tight control over the APIs in libgfapi.
|
||||
Originally bumping the SO_NAME was considered, and GlusterFS-3.6.0 was
|
||||
released with libgfapi.so.7. Not only was "7" a mistake (it should have
|
||||
been "6"), but it was quickly pointed out that many dependent packages
|
||||
that use libgfapi would be forced to be recompiled/relinked. Thus no
|
||||
packages of 3.6.0 were ever released and 3.6.1 was quickly released with
|
||||
libgfapi.so.0, but with symbol versions. There's no strong technical
|
||||
reason for either; the APIs have not changed, only new APIs have been
|
||||
added. It's merely being done in anticipation that some APIs might change
|
||||
sometime in the future.
|
||||
|
||||
Enough about that now, let's get into the nitty gritty.
|
||||
|
||||
## Adding new APIs
|
||||
|
||||
### Adding a public API.
|
||||
|
||||
This is the default, and the easiest thing to do. Public APIs have
|
||||
declarations in either glfs.h, glfs-handles.h, or, at your discretion,
|
||||
in a new header file intended for consumption by other developers.
|
||||
|
||||
Here's what you need to do to add a new public API:
|
||||
|
||||
+ Write the declaration, e.g. in glfs.h:
|
||||
|
||||
```C
|
||||
int glfs_dtrt (const char *volname, void *stuff) __THROW
|
||||
```
|
||||
|
||||
+ Write the definition, e.g. in glfs-dtrt.c:
|
||||
|
||||
```C
|
||||
int
|
||||
pub_glfs_dtrt (const char *volname, void *stuff)
|
||||
{
|
||||
...
|
||||
return 0;
|
||||
}
|
||||
```
|
||||
|
||||
+ Add the symbol version magic for ELF, gnu toolchain to the definition.
|
||||
|
||||
Following the definition of your new function in glfs-dtrt.c, add a
|
||||
line like this:
|
||||
|
||||
```C
|
||||
GFAPI_SYMVER_PUBLIC_DEFAULT(glfs_dtrt, 3.7.0)
|
||||
```
|
||||
|
||||
The whole thing should look like:
|
||||
|
||||
```C
|
||||
int
|
||||
pub_glfs_dtrt (const char *volname, void *stuff)
|
||||
{
|
||||
...
|
||||
}
|
||||
GFAPI_SYMVER_PUBLIC_DEFAULT(glfs_dtrt, 3.7.0);
|
||||
```
|
||||
|
||||
In this example, 3.7.0 refers to the version the symbol will first
|
||||
appear in. There's nothing magic about it, it's just a string token.
|
||||
The current versions we have are 3.4.0, 3.4.2, 3.5.0, 3.5.1, and 3.6.0.
|
||||
They are to be considered locked or closed. You can not, must not add
|
||||
any new APIs and use these versions. Most new APIs will use 3.7.0. If
|
||||
you add a new API appearing in 3.6.2 (and mainline) then you would use
|
||||
3.6.2.
|
||||
|
||||
+ Add the symbol version magic for OS X to the declaration.
|
||||
|
||||
Following the declaration in glfs.h, add a line like this:
|
||||
|
||||
```C
|
||||
GFAPI_PUBLIC(glfs_dtrt, 3.7.0)
|
||||
```
|
||||
|
||||
The whole thing should look like:
|
||||
|
||||
```C
|
||||
int glfs_dtrt (const char *volname, void *stuff) __THROW
|
||||
GFAPI_PUBLIC(glfs_dtrt, 3.7.0);
|
||||
```
|
||||
|
||||
The version here must match the version associated with the definition.
|
||||
|
||||
+ Add the new API to the ELF, gnu toolchain link map file, gfapi.map
|
||||
|
||||
Most new public APIs will probably be added to a new section that
|
||||
looks like this:
|
||||
|
||||
```
|
||||
GFAPI_3.7.0 {
|
||||
global:
|
||||
glfs_dtrt;
|
||||
} GFAPI_PRIVATE_3.7.0;
|
||||
```
|
||||
|
||||
if you're adding your new API to, e.g. 3.6.2, it'll look like this:
|
||||
|
||||
```
|
||||
GFAPI_3.6.2 {
|
||||
global:
|
||||
glfs_dtrt;
|
||||
} GFAPI_3.6.0;
|
||||
```
|
||||
|
||||
and you must change the
|
||||
```
|
||||
GFAPI_PRIVATE_3.7.0 { ...} GFAPI_3.6.0;
|
||||
```
|
||||
section to:
|
||||
```
|
||||
GFAPI_PRIVATE_3.7.0 { ...} GFAPI_3.6.2;
|
||||
```
|
||||
|
||||
+ Add the new API to the OS X alias list file, gfapi.aliases.
|
||||
|
||||
Most new APIs will use a line that looks like this:
|
||||
|
||||
```C
|
||||
_pub_glfs_dtrt _glfs_dtrt$GFAPI_3.7.0
|
||||
```
|
||||
|
||||
if you're adding your new API to, e.g. 3.6.2, it'll look like this:
|
||||
|
||||
```C
|
||||
_pub_glfs_dtrt _glfs_dtrt$GFAPI_3.6.2
|
||||
```
|
||||
|
||||
And that's it.
|
||||
|
||||
|
||||
### Adding a private API.
|
||||
|
||||
If you're thinking about adding a private API that isn't declared in
|
||||
one of the header files, then you should seriously rethink what you're
|
||||
doing and figure out how to put it in libglusterfs instead.
|
||||
|
||||
If that hasn't convinced you, follow the instructions above, but use the
|
||||
_PRIVATE versions of macros, symbol versions, and aliases. If you're 1337
|
||||
enough to ignore this advice, then you're 1337 enough to figure out how
|
||||
to do it.
|
||||
|
||||
|
||||
## Changing an API.
|
||||
|
||||
### Changing a public API.
|
||||
|
||||
There are two ways an API might change, 1) its signature has changed, or
|
||||
2) its new functionality or behavior is substantially different than the
|
||||
old. An API's signature consists of the function return type, and the number
|
||||
and/or type of its parameters. E.g. the original API:
|
||||
|
||||
```C
|
||||
int glfs_dtrt (const char *volname, void *stuff);
|
||||
```
|
||||
|
||||
and the changed API:
|
||||
|
||||
```C
|
||||
void *glfs_dtrt (const char *volname, glfs_t *ctx, void *stuff);
|
||||
```
|
||||
|
||||
One way to avoid a change like this, and which is preferable in many
|
||||
ways, is to leave the legacy glfs_dtrt() function alone, document it as
|
||||
deprecated, and simply add a new API, e.g. glfs_dtrt2(). Practically
|
||||
speaking, that's effectively what we'll be doing anyway, the difference
|
||||
is only that we'll use a versioned symbol to do it.
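The "leave the legacy function alone and add a new one" approach can be sketched in plain C as follows. This is only an illustration under stated assumptions: the `demo_` names, the stand-in `demo_glfs_t` type, and the placeholder behavior are all hypothetical, and turning the legacy function into a thin wrapper is one possible choice, not something gfapi is known to do.

```c
#include <stddef.h>

/* Stand-in for the real glfs_t, which is opaque to applications. */
typedef struct demo_glfs { int unused; } demo_glfs_t;

/* New API: the changed signature from the text, under a new name. */
void *
demo_glfs_dtrt2 (const char *volname, demo_glfs_t *ctx, void *stuff)
{
        (void) ctx;
        /* Placeholder behavior: hand back `stuff` if a volume was named. */
        return (volname != NULL) ? stuff : NULL;
}

/* Legacy API: signature untouched and documented as deprecated.
 * Here it simply forwards to the new function (an assumption). */
int
demo_glfs_dtrt (const char *volname, void *stuff)
{
        return (demo_glfs_dtrt2 (volname, NULL, stuff) != NULL) ? 0 : -1;
}
```

Existing callers of the legacy function keep compiling and linking unchanged, which is the same guarantee a versioned symbol gives without requiring a new name.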
|
||||
|
||||
If adding a new API is undesirable for some reason (perhaps the use of
|
||||
glfs_gnu() is just so pervasive that we really don't want to add
|
||||
glfs_gnu2()), change the existing API in place as follows:
|
||||
|
||||
+ change the declaration in glfs.h:
|
||||
|
||||
```C
|
||||
glfs_t *glfs_gnu (const char *volname, void *stuff) __THROW
|
||||
GFAPI_PUBLIC(glfs_gnu, 3.7.0);
|
||||
```
|
||||
|
||||
Note that there is only the single, new declaration.
|
||||
|
||||
+ change the old definition of glfs_gnu() in glfs.c:
|
||||
|
||||
```C
|
||||
struct glfs *
|
||||
pub_glfs_gnu340 (const char * volname)
|
||||
{
|
||||
...
|
||||
}
|
||||
GFAPI_SYMVER_PUBLIC(glfs_gnu340, glfs_gnu, 3.4.0);
|
||||
```
|
||||
|
||||
+ create the new definition of glfs_gnu in glfs.c:
|
||||
|
||||
```C
|
||||
struct glfs *
|
||||
pub_glfs_gnu (const char * volname, void *stuff)
|
||||
{
|
||||
...
|
||||
}
|
||||
GFAPI_SYMVER_PUBLIC_DEFAULT(glfs_gnu, 3.7.0);
|
||||
```
|
||||
|
||||
+ Add the new API to the ELF, gnu toolchain link map file, gfapi.map
|
||||
|
||||
```
|
||||
GFAPI_3.7.0 {
|
||||
global:
|
||||
glfs_gnu;
|
||||
} GFAPI_PRIVATE_3.7.0;
|
||||
```
|
||||
|
||||
+ Update the OS X alias list file, gfapi.aliases, for both versions:
|
||||
|
||||
Change the old line:
|
||||
```C
|
||||
_pub_glfs_gnu _glfs_gnu$GFAPI_3.4.0
|
||||
```
|
||||
to:
|
||||
```C
|
||||
_pub_glfs_gnu340 _glfs_gnu$GFAPI_3.4.0
|
||||
```
|
||||
|
||||
Add a new line:
|
||||
```C
|
||||
_pub_glfs_gnu _glfs_gnu$GFAPI_3.7.0
|
||||
```
|
||||
|
||||
+ Lastly, change all gfapi-internal calls to glfs_gnu to use the new API.
|
||||
|
||||
@@ -1,71 +0,0 @@
|
||||
# On-Wire Compression + Decompression
|
||||
|
||||
The 'compression translator' compresses and decompresses data in-flight
|
||||
between client and bricks.
|
||||
|
||||
### Working
|
||||
When a writev call occurs, the client compresses the data before sending it to the
|
||||
brick. On the brick, the compressed data is decompressed. Similarly, when a readv
|
||||
call occurs, the brick compresses the data before sending it to the client. On the
|
||||
client, the compressed data is decompressed. Thus, the amount of data sent over
|
||||
the wire is reduced. Compression and decompression use the zlib library.
|
||||
|
||||
During normal operation, this is the format of the data sent over the wire:
|
||||
|
||||
~~~
|
||||
<compressed-data> + trailer(8 bytes)
|
||||
~~~
|
||||
|
||||
The trailer contains the CRC32 checksum and the length of the original uncompressed
|
||||
data. This is used for validation.
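As a rough illustration of such a trailer, the C sketch below computes a CRC32 and packs it together with the original length into 8 bytes. The names `demo_crc32`, `demo_put_le32` and `demo_pack_trailer` are hypothetical, and the field order and little-endian byte order are assumptions made for the example; the translator's actual on-wire layout should be checked against its source.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). Written out here
 * only so that the sketch has no external dependencies. */
uint32_t
demo_crc32 (const unsigned char *buf, size_t len)
{
        uint32_t crc = 0xFFFFFFFFu;
        size_t   i;
        int      bit;

        for (i = 0; i < len; i++) {
                crc ^= buf[i];
                for (bit = 0; bit < 8; bit++)
                        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
        return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical helper: store a 32-bit value least-significant byte first. */
static void
demo_put_le32 (unsigned char *out, uint32_t v)
{
        out[0] = (unsigned char) (v & 0xFFu);
        out[1] = (unsigned char) ((v >> 8) & 0xFFu);
        out[2] = (unsigned char) ((v >> 16) & 0xFFu);
        out[3] = (unsigned char) ((v >> 24) & 0xFFu);
}

/* Assumed layout: CRC32 of the original data, then its length. */
void
demo_pack_trailer (const unsigned char *orig, size_t len,
                   unsigned char trailer[8])
{
        demo_put_le32 (trailer, demo_crc32 (orig, len));
        demo_put_le32 (trailer + 4, (uint32_t) len);
}
```

A receiver would recompute the checksum over the decompressed data and compare both fields against the trailer before accepting the payload.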
|
||||
|
||||
### Usage
|
||||
|
||||
Turning on compression xlator:
|
||||
|
||||
~~~
|
||||
gluster volume set <vol_name> network.compression on
|
||||
~~~
|
||||
|
||||
### Configurable parameters (optional)
|
||||
|
||||
**Compression level**
|
||||
~~~
|
||||
gluster volume set <vol_name> network.compression.compression-level 8
|
||||
~~~
|
||||
|
||||
~~~
|
||||
0 : no compression
|
||||
1 : best speed
|
||||
9 : best compression
|
||||
-1 : default compression
|
||||
~~~
|
||||
|
||||
**Minimum file size**
|
||||
|
||||
~~~
|
||||
gluster volume set <vol_name> network.compression.min-size 50
|
||||
~~~
|
||||
|
||||
Data is compressed only when its size exceeds the above value in bytes.
|
||||
|
||||
**Other parameters**
|
||||
|
||||
Other less frequently used parameters include `network.compression.mem-level`
|
||||
and `network.compression.window-size`. More details about these options
|
||||
can be found by running the `gluster volume set help` command.
|
||||
|
||||
### Known Issues and Limitations
|
||||
|
||||
* The compression translator does not work with striped volumes.
|
||||
* Mount point hangs when writing a file with write-behind xlator turned on. To
|
||||
overcome this, turn off `performance.write-behind` entirely OR
|
||||
set `performance.strict-write-ordering` to on.
|
||||
* For glusterfs versions <= 3.5, the compression translator can ONLY work with pure
|
||||
distribute volumes. This limitation is caused by AFR not being able to
|
||||
propagate xdata. This issue has been fixed in glusterfs versions > 3.5.
|
||||
|
||||
### TODO
|
||||
Although zlib offers a high compression ratio, it is very slow. We can make the
|
||||
translator pluggable to add support for other compression methods such as
|
||||
[lz4 compression](https://code.google.com/p/lz4/).
|
||||
@@ -1,59 +0,0 @@
|
||||
storage/posix translator
|
||||
========================
|
||||
|
||||
Notes
|
||||
-----
|
||||
|
||||
### `SET_FS_ID`
|
||||
|
||||
This is so that all filesystem checks are done with the user's
|
||||
uid/gid and not GlusterFS's uid/gid.
|
||||
|
||||
### `MAKE_REAL_PATH`
|
||||
|
||||
This macro concatenates the base directory of the posix volume
|
||||
('option directory') with the given path.
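The concatenation can be pictured with the small C sketch below. It is not the macro itself: `demo_make_real_path` is a hypothetical function, and it assumes the volume-relative path already starts with `/`.

```c
#include <stddef.h>
#include <stdio.h>

/* Join the export directory and a volume-relative path into `out`.
 * Returns 0 on success, -1 if the result would not fit in `outlen`. */
int
demo_make_real_path (const char *base, const char *path,
                     char *out, size_t outlen)
{
        int n = snprintf (out, outlen, "%s%s", base, path);

        return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```

With a base of `/srv/export` and a path of `/dir/file`, the result is `/srv/export/dir/file`.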
|
||||
|
||||
### `need_xattr` in lookup
|
||||
|
||||
If this flag is passed, lookup returns an xattr dictionary that contains
|
||||
the file's create time, the file's contents, and the version number
|
||||
of the file.
|
||||
|
||||
This is a hack to increase small file performance. If an application
|
||||
wants to read a small file, it can finish its job with just a lookup
|
||||
call instead of a lookup followed by read.
|
||||
|
||||
### `getdents`/`setdents`
|
||||
|
||||
These are used by unify to set and get directory entries.
|
||||
|
||||
### `ALIGN_BUF`
|
||||
|
||||
Macro to align an address to a page boundary (4K).
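The usual way to round an address up to a power-of-two boundary is a mask operation, sketched below. `demo_align_up` and `DEMO_PAGE_SIZE` are hypothetical names used for illustration; the real macro's exact definition may differ.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u

/* Round an address up to the next multiple of `bound`, which must be a
 * power of two. An already aligned address is returned unchanged. */
uintptr_t
demo_align_up (uintptr_t addr, uintptr_t bound)
{
        return (addr + bound - 1) & ~(bound - 1);
}
```

Aligned buffers of this kind are typically needed for direct I/O, where the kernel rejects unaligned addresses.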
|
||||
|
||||
### `priv->export_statfs`
|
||||
|
||||
In some cases, two exported volumes may reside on the same
|
||||
partition on the server. Sending statvfs info for both
|
||||
the volumes will lead to erroneous df output at the client,
|
||||
since free space on the partition will be counted twice.
|
||||
|
||||
In such cases, the user can disable exporting statvfs info
|
||||
on one of the volumes by setting this option.
|
||||
|
||||
### `xattrop`
|
||||
|
||||
This fop is used by replicate to set version numbers on files.
|
||||
|
||||
### `getxattr`/`setxattr` hack to read/write files
|
||||
|
||||
A key, `GLUSTERFS_FILE_CONTENT_STRING`, is handled in a special way by
|
||||
`getxattr`/`setxattr`. A `getxattr` with the key will return the entire
|
||||
content of the file as the value. A `setxattr` with the key will write
|
||||
the value as the entire content of the file.
|
||||
|
||||
### `posix_checksum`
|
||||
|
||||
This calculates a simple XOR checksum on all entry names in a
|
||||
directory that is used by unify to compare directory contents.
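A minimal sketch of an XOR checksum over entry names is shown below. The function name `demo_dir_checksum` and the single-byte accumulator are assumptions made for the example; the real `posix_checksum` may fold the names differently.

```c
#include <stddef.h>

/* XOR every byte of every entry name into a single accumulator.
 * XOR is commutative, so the listing order does not affect the result. */
unsigned char
demo_dir_checksum (const char *const *names, size_t count)
{
        unsigned char sum = 0;
        size_t        i;
        const char   *p;

        for (i = 0; i < count; i++)
                for (p = names[i]; *p != '\0'; p++)
                        sum ^= (unsigned char) *p;
        return sum;
}
```

Order independence matters here because two bricks may return the same directory entries in different orders. The trade-off is that such a checksum is cheap but weak: different sets of names can collide.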
|
||||
@@ -1,18 +0,0 @@
|
||||
This document serves as a basic coding standard/practice for further
|
||||
development after a proper protocol layer is implemented.
|
||||
|
||||
With this release we are introducing an abstraction that distinguishes xlator-driven
|
||||
operations from protocol-driven operations, i.e., all the client-side (fuse)
|
||||
operations are xlator-driven and will come with an 'op' value
|
||||
taken from 'libglusterfs/'.
|
||||
|
||||
All the server-side, protocol-driven operations are determined by whichever
|
||||
version of the protocol is used.
|
||||
|
||||
All the currently implemented fops will remain, and 'getspec', which is generated
|
||||
at the top level and passes through the translator graph, is treated as an 'fop'.
|
||||
|
||||
All new 'gluster' and 'glusterd' related calls will be _mgmt_ calls instead of
|
||||
fops. The release, releasedir and forget calls are treated as fops (but they won't
|
||||
come with a requirement to use STACK_WIND and STACK_UNWIND).
|
||||
|
||||
@@ -1,683 +0,0 @@
|
||||
Translator development
|
||||
======================
|
||||
|
||||
Setting the Stage
|
||||
-----------------
|
||||
|
||||
This is the first post in a series that will explain some of the details of
|
||||
writing a GlusterFS translator, using some actual code to illustrate.
|
||||
|
||||
Before we begin, a word about environments. GlusterFS is over 300K lines of
|
||||
code spread across a few hundred files. That's no Linux kernel or anything, but
|
||||
you're still going to be navigating through a lot of code in every
|
||||
code-editing session, so some kind of cross-referencing is *essential*. I use
|
||||
cscope with the vim bindings, and if I couldn't do Ctrl+G and such to jump
|
||||
between definitions all the time my productivity would be cut in half. You may
|
||||
prefer different tools, but as I go through these examples you'll need
|
||||
something functionally similar to follow along. OK, on with the show.
|
||||
|
||||
The first thing you need to know is that translators are not just bags of
|
||||
functions and variables. They need to have a very definite internal structure
|
||||
so that the translator-loading code can figure out where all the pieces are.
|
||||
The way it does this is to use dlsym to look for specific names within your
|
||||
shared-object file, as follows (from `xlator.c`):
|
||||
|
||||
```
|
||||
if (!(xl->fops = dlsym (handle, "fops"))) {
|
||||
gf_log ("xlator", GF_LOG_WARNING, "dlsym(fops) on %s",
|
||||
dlerror ());
|
||||
goto out;
|
||||
}
|
||||
|
||||
if (!(xl->cbks = dlsym (handle, "cbks"))) {
|
||||
gf_log ("xlator", GF_LOG_WARNING, "dlsym(cbks) on %s",
|
||||
dlerror ());
|
||||
goto out;
|
||||
}
|
||||
|
||||
if (!(xl->init = dlsym (handle, "init"))) {
|
||||
gf_log ("xlator", GF_LOG_WARNING, "dlsym(init) on %s",
|
||||
dlerror ());
|
||||
goto out;
|
||||
}
|
||||
|
||||
if (!(xl->fini = dlsym (handle, "fini"))) {
|
||||
gf_log ("xlator", GF_LOG_WARNING, "dlsym(fini) on %s",
|
||||
dlerror ());
|
||||
goto out;
|
||||
}
|
||||
```
|
||||
|
||||
In this example, `xl` is a pointer to the in-memory object for the translator
|
||||
we're loading. As you can see, it's looking up various symbols *by name* in the
|
||||
shared object it just loaded, and storing pointers to those symbols. Some of
|
||||
them (e.g. init) are functions, while others (e.g. fops) are dispatch tables
|
||||
containing pointers to many functions. Together, these make up the translator's
|
||||
public interface.
|
||||
|
||||
Most of this glue or boilerplate can easily be found at the bottom of one of
|
||||
the source files that make up each translator. We're going to use the `rot-13`
|
||||
translator just for fun, so in this case you'd look in `rot-13.c` to see this:
|
||||
|
||||
```
|
||||
struct xlator_fops fops = {
|
||||
.readv = rot13_readv,
|
||||
.writev = rot13_writev
|
||||
};
|
||||
|
||||
struct xlator_cbks cbks = {
|
||||
};
|
||||
|
||||
struct volume_options options[] = {
|
||||
{ .key = {"encrypt-write"},
|
||||
.type = GF_OPTION_TYPE_BOOL
|
||||
},
|
||||
{ .key = {"decrypt-read"},
|
||||
.type = GF_OPTION_TYPE_BOOL
|
||||
},
|
||||
{ .key = {NULL} },
|
||||
};
|
||||
```
|
||||
|
||||
The `fops` table, defined in `xlator.h`, is one of the most important pieces.
|
||||
This table contains a pointer to each of the filesystem functions that your
|
||||
translator might implement -- `open`, `read`, `stat`, `chmod`, and so on. There
|
||||
are 82 such functions in all, but don't worry; any that you don't specify here
|
||||
will be seen as null and filled with defaults from `defaults.c` when your
|
||||
translator is loaded. In this particular example, since `rot-13` is an
|
||||
exceptionally simple translator, we only fill in two entries for `readv` and
|
||||
`writev`.
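The "fill in the blanks with defaults" idea can be sketched as below. Everything here is a simplified, hypothetical model: `struct demo_fops` has three entries rather than the full table, all handlers share one made-up signature, and `demo_fill_defaults` only stands in for what the loader does with `defaults.c`.

```c
#include <stddef.h>

typedef int (*demo_fop_t) (const char *path);

/* A tiny dispatch table; the real xlator_fops has many more entries. */
struct demo_fops {
        demo_fop_t readv;
        demo_fop_t writev;
        demo_fop_t stat;
};

/* Default handlers standing in for the pass-through versions. */
int demo_default_readv  (const char *path) { (void) path; return 0; }
int demo_default_writev (const char *path) { (void) path; return 0; }
int demo_default_stat   (const char *path) { (void) path; return 0; }

/* A translator-specific handler, as rot-13 provides for readv. */
int demo_custom_readv (const char *path) { (void) path; return 13; }

/* Replace every entry the translator left NULL with its default. */
void
demo_fill_defaults (struct demo_fops *fops)
{
        if (fops->readv == NULL)
                fops->readv = demo_default_readv;
        if (fops->writev == NULL)
                fops->writev = demo_default_writev;
        if (fops->stat == NULL)
                fops->stat = demo_default_stat;
}
```

Because a designated initializer such as `{ .readv = demo_custom_readv }` zeroes the remaining members, a translator only has to name the entries it overrides.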
|
||||
|
||||
There are actually two other tables, also required to have predefined names,
|
||||
that are also used to find translator functions: `cbks` (which is empty in this
|
||||
snippet) and `dumpops` (which is missing entirely). The first of these specifies
|
||||
entry points for when inodes are forgotten or file descriptors are released.
|
||||
In other words, they're destructors for objects in which your translator might
|
||||
have an interest. Mostly you can ignore them, because the default behavior
|
||||
handles even the simpler cases of translator-specific inode/fd context
|
||||
automatically. However, if the context you attach is a complex structure
|
||||
requiring complex cleanup, you'll need to supply these functions. As for
|
||||
dumpops, that's just used if you want to provide functions to pretty-print
|
||||
various structures in logs. I've never used it myself, though I probably
|
||||
should. What's noteworthy here is that we don't even define dumpops. That's
|
||||
because all of the functions that might use these dispatch functions will check
|
||||
for `xl->dumpops` being `NULL` before calling through it. This is in sharp
|
||||
contrast to the behavior for `fops` and `cbks`, which *must* be present. If
|
||||
they're not, translator loading will fail because these pointers are not
|
||||
checked every time and if they're `NULL` then we'll segfault. That's why we
|
||||
provide an empty definition for cbks; it's OK for the individual function
|
||||
pointers to be NULL, but not for the whole table to be absent.
|
||||
|
||||
The last piece I'll cover today is options. As you can see, this is a table of
|
||||
translator-specific option names and some information about their types.
|
||||
GlusterFS actually provides a pretty rich set of types (`volume_option_type_t`
|
||||
in `options.h`), which includes paths, translator names, percentages, and times
|
||||
in addition to the obvious integers and strings. Also, the `volume_option_t`
|
||||
structure can include information about alternate names, min/max/default
|
||||
values, enumerated string values, and descriptions. We don't see any of these
|
||||
here, so let's take a quick look at some more complex examples from afr.c and
|
||||
then come back to `rot-13`.
|
||||
|
||||
```
|
||||
{ .key = {"data-self-heal-algorithm"},
|
||||
.type = GF_OPTION_TYPE_STR,
|
||||
.default_value = "",
|
||||
.description = "Select between \"full\", \"diff\". The "
|
||||
"\"full\" algorithm copies the entire file from "
|
||||
"source to sink. The \"diff\" algorithm copies to "
|
||||
"sink only those blocks whose checksums don't match "
|
||||
"with those of source.",
|
||||
.value = { "diff", "full", "" }
|
||||
},
|
||||
{ .key = {"data-self-heal-window-size"},
|
||||
.type = GF_OPTION_TYPE_INT,
|
||||
.min = 1,
|
||||
.max = 1024,
|
||||
.default_value = "1",
|
||||
.description = "Maximum number blocks per file for which "
|
||||
"self-heal process would be applied simultaneously."
|
||||
},
|
||||
```
|
||||
|
||||
When your translator is loaded, all of this information is used to parse the
|
||||
options actually provided in the volfile, and then the result is turned into a
|
||||
dictionary and stored as `xl->options`. This dictionary is then processed by
|
||||
your init function, which you can see being looked up in the first code
|
||||
fragment above. We're only going to look at a small part of the `rot-13`'s
|
||||
init for now.
|
||||
|
||||
```
|
||||
priv->decrypt_read = 1;
|
||||
priv->encrypt_write = 1;
|
||||
|
||||
data = dict_get (this->options, "encrypt-write");
|
||||
if (data) {
|
||||
if (gf_string2boolean (data->data, &priv->encrypt_write)
|
||||
== -1) {
|
||||
gf_log (this->name, GF_LOG_ERROR,
|
||||
"encrypt-write takes only boolean options");
|
||||
return -1;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
What we can see here is that we're setting some defaults in our priv structure,
|
||||
then looking to see if an `encrypt-write` option was actually provided. If so,
|
||||
we convert and store it. This is a pretty classic use of dict_get to fetch a
|
||||
field from a dictionary, and of using one of many conversion functions in
|
||||
`common-utils.c` to convert `data->data` into something we can use.
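To make the conversion step concrete, here is a self-contained sketch of a string-to-boolean helper in the spirit of `gf_string2boolean`. The name `demo_string2boolean` is hypothetical, and the set of accepted spellings is an assumption for illustration; consult `common-utils.c` for what GlusterFS really accepts.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive string equality, written out to stay within ISO C. */
static int
demo_equal_nocase (const char *a, const char *b)
{
        while (*a != '\0' && *b != '\0') {
                if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
                        return 0;
                a++;
                b++;
        }
        return *a == *b;   /* both strings must end together */
}

/* Convert a textual option value to 0 or 1.
 * Returns 0 on success, -1 if the string is not a recognized boolean;
 * on failure *out is left untouched. */
int
demo_string2boolean (const char *str, int *out)
{
        static const char *const truthy[] = { "1", "on", "yes", "true", "enable" };
        static const char *const falsy[]  = { "0", "off", "no", "false", "disable" };
        size_t i;

        if (str == NULL || out == NULL)
                return -1;

        for (i = 0; i < sizeof (truthy) / sizeof (truthy[0]); i++) {
                if (demo_equal_nocase (str, truthy[i])) {
                        *out = 1;
                        return 0;
                }
        }
        for (i = 0; i < sizeof (falsy) / sizeof (falsy[0]); i++) {
                if (demo_equal_nocase (str, falsy[i])) {
                        *out = 0;
                        return 0;
                }
        }
        return -1;
}
```

Leaving the output untouched on failure is what lets `init` set its defaults first and only overwrite them when a valid option was supplied.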
|
||||
|
||||
So far we've covered the basics of how a translator gets loaded, how we find its
|
||||
various parts, and how we process its options. In my next Translator 101 post,
|
||||
we'll go a little deeper into other things that init and its companion fini
|
||||
might do, and how some other fields in our `xlator_t` structure (commonly
|
||||
referred to as `this`) are typically used.
|
||||
|
||||
`init`, `fini`, and private context
|
||||
-----------------------------------
|
||||
|
||||
In the previous Translator 101 post, we looked at some of the dispatch tables
|
||||
and options processing in a translator. This time we're going to cover the rest
|
||||
of the "shell" of a translator -- i.e. the other global parts not specific to
|
||||
handling a particular request.
|
||||
|
||||
Let's start by looking at the relationship between a translator and its shared
|
||||
library. At a first approximation, this is the relationship between an object
|
||||
and a class in just about any object-oriented programming language. The class
|
||||
defines behaviors, but has to be instantiated as an object to have any kind of
|
||||
existence. In our case the object is an `xlator_t`. Several of these might be
|
||||
created within the same daemon, sharing all of the same code through init/fini
|
||||
and dispatch tables, but sharing *no data*. You could implement shared data (as
|
||||
static variables in your shared libraries) but that's strongly discouraged.
|
||||
Every function in your shared library will get an `xlator_t` as an argument,
|
||||
and should use it. This lack of class-level data is one of the points where
|
||||
the analogy to common OOP systems starts to break down. Another place is the
|
||||
complete lack of inheritance. Translators inherit behavior (code) from exactly
|
||||
one shared library -- looked up and loaded using the `type` field in a volfile
|
||||
`volume ... end-volume` block -- and that's it -- not even single inheritance,
|
||||
no subclasses or superclasses, no mixins or prototypes, just the relationship
|
||||
between an object and its class. With that in mind, let's turn to the init
|
||||
function that we just barely touched on last time.
|
||||
|
||||
```
|
||||
int32_t
|
||||
init (xlator_t *this)
|
||||
{
|
||||
data_t *data = NULL;
|
||||
rot_13_private_t *priv = NULL;
|
||||
|
||||
if (!this->children || this->children->next) {
|
||||
gf_log ("rot13", GF_LOG_ERROR,
|
||||
"FATAL: rot13 should have exactly one child");
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (!this->parents) {
|
||||
gf_log (this->name, GF_LOG_WARNING,
|
||||
"dangling volume. check volfile ");
|
||||
}
|
||||
|
||||
priv = GF_CALLOC (sizeof (rot_13_private_t), 1, 0);
|
||||
if (!priv)
|
||||
return -1;
|
||||
```
|
||||
|
||||
At the very top, we see the function signature -- we get a pointer to the
|
||||
`xlator_t` object that we're initializing, and we return an `int32_t` status.
|
||||
As with most functions in the translator API, this should be zero to indicate
|
||||
success. In this case it's safe to return -1 for failure, but watch out: in
|
||||
dispatch-table functions, the return value means the status of the *function
|
||||
call* rather than the *request*. A request error should be reflected as a
|
||||
callback with a non-zero `op_ret` value, but the dispatch function itself
|
||||
should still return zero. In fact, the handling of a non-zero return from a
|
||||
dispatch function is not all that robust (we recently had a bug report in
|
||||
HekaFS related to this) so it's something you should probably avoid
|
||||
altogether. This only underscores the difference between dispatch functions
|
||||
and `init`/`fini` functions, where non-zero returns *are* expected and handled
|
||||
logically by aborting the translator setup. We can see that down at the
|
||||
bottom, where we return -1 to indicate that we couldn't allocate our
|
||||
private-data area (more about that later).
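The distinction between the status of the call and the status of the request can be modeled with a plain callback, as in the sketch below. None of this is the GlusterFS API: `demo_writev`, `demo_cbk_t` and `struct demo_result` are hypothetical, and the real code uses `STACK_WIND`/`STACK_UNWIND` rather than a direct callback.

```c
#include <errno.h>
#include <stddef.h>

/* Callback type: receives the status of the *request*. */
typedef void (*demo_cbk_t) (void *cookie, int op_ret, int op_errno);

/* Hypothetical dispatch function. Its return value only says whether the
 * call was dispatched; the outcome of the request travels in op_ret. */
int
demo_writev (const void *buf, size_t len, demo_cbk_t cbk, void *cookie)
{
        if (buf == NULL || len == 0) {
                cbk (cookie, -1, EINVAL);   /* the request failed ...      */
                return 0;                   /* ... but the call was fine   */
        }
        cbk (cookie, (int) len, 0);         /* the request succeeded       */
        return 0;
}

/* Example callback that records what it was told. */
struct demo_result {
        int op_ret;
        int op_errno;
};

void
demo_record_cbk (void *cookie, int op_ret, int op_errno)
{
        struct demo_result *r = cookie;

        r->op_ret   = op_ret;
        r->op_errno = op_errno;
}
```

In both the failing and the succeeding case the dispatch function returns zero; only the values passed to the callback differ.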
|
||||
|
||||
The first thing this init function does is check that the translator is being
|
||||
set up in the right kind of environment. Translators are called by parents and
|
||||
in turn call children. Some translators are "initial" translators that inject
|
||||
requests into the system from elsewhere -- e.g. mount/fuse injecting requests
|
||||
from the kernel, protocol/server injecting requests from the network. Those
|
||||
translators don't need parents, but `rot-13` does and so we check for that.
|
||||
Similarly, some translators are "final" translators that (from the perspective
|
||||
of the current process) terminate requests instead of passing them on -- e.g.
|
||||
`protocol/client` passing them to another node, `storage/posix` passing them to
|
||||
a local filesystem. Other translators "multiplex" between multiple children --
|
||||
passing each parent request on to one (`cluster/dht`), some
|
||||
(`cluster/stripe`), or all (`cluster/afr`) of those children. `rot-13` fits
|
||||
into none of those categories either, so it checks that it has *exactly one*
|
||||
child. It might be more convenient or robust if translator shared libraries
|
||||
had standard variables describing these requirements, to be checked in a
|
||||
consistent way by the translator-loading infrastructure itself instead of by
|
||||
each separate init function, but this is the way translators work today.
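The "exactly one child" test is just a check on a singly linked list: the head must exist and must have no successor. The sketch below reproduces it with simplified stand-in types; `struct demo_xlator` and `struct demo_xlator_list` are hypothetical and carry only the fields the check needs.

```c
#include <stddef.h>

/* Simplified stand-ins for xlator_t and its child/parent lists. */
struct demo_xlator;

struct demo_xlator_list {
        struct demo_xlator      *xlator;
        struct demo_xlator_list *next;
};

struct demo_xlator {
        struct demo_xlator_list *children;
        struct demo_xlator_list *parents;
};

/* Returns 0 if `xl` has exactly one child, -1 otherwise. */
int
demo_check_one_child (const struct demo_xlator *xl)
{
        if (xl->children == NULL || xl->children->next != NULL)
                return -1;
        return 0;
}
```

A multiplexing translator would instead walk the whole list, while a final translator would require it to be empty.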
|
||||
|
||||
The last thing we see in this fragment is allocating our private data area.
|
||||
This can literally be anything we want; the infrastructure just provides the
|
||||
priv pointer as a convenience but takes no responsibility for how it's used. In
|
||||
this case we're using `GF_CALLOC` to allocate our own `rot_13_private_t`
|
||||
structure. This gets us all the benefits of GlusterFS's memory-leak detection
|
||||
infrastructure, but the way we're calling it is not quite ideal. For one thing,
|
||||
the first two arguments -- from `calloc(3)` -- are kind of reversed. For
|
||||
another, notice how the last argument is zero. That can actually be an
|
||||
enumerated value, to tell the GlusterFS allocator *what* type we're
|
||||
allocating. This can be very useful information for memory profiling and leak
|
||||
detection, so it's recommended that you follow the example of any
|
||||
`xxx-mem-types.h` file elsewhere in the source tree instead of just passing
|
||||
zero here (even though that works).
|
||||
|
||||
To finish our tour of standard initialization/termination, let's look at the
|
||||
end of `init` and the beginning of `fini`:
|
||||
|
||||
```
|
||||
this->private = priv;
|
||||
gf_log ("rot13", GF_LOG_DEBUG, "rot13 xlator loaded");
|
||||
return 0;
|
||||
}
|
||||
|
||||
void
|
||||
fini (xlator_t *this)
|
||||
{
|
||||
rot_13_private_t *priv = this->private;
|
||||
|
||||
if (!priv)
|
||||
return;
|
||||
this->private = NULL;
|
||||
GF_FREE (priv);
|
||||
```
|
||||
|
||||
At the end of init we're just storing our private-data pointer in the `private`
|
||||
field of our `xlator_t`, then returning zero to indicate that initialization
|
||||
succeeded. As is usually the case, our fini is even simpler. All it really has
|
||||
to do is `GF_FREE` our private-data pointer, which we do in a slightly
|
||||
roundabout way here. Notice how we don't even have a return value here, since
|
||||
there's nothing obvious and useful that the infrastructure could do if `fini`
|
||||
failed.
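Stripped of logging and option parsing, the whole private-data lifecycle looks roughly like the sketch below. It uses plain `calloc`/`free` instead of `GF_CALLOC`/`GF_FREE` so that it compiles on its own, and `demo_xlator_t`, `demo_private_t`, `demo_init` and `demo_fini` are hypothetical stand-ins for the real types and entry points.

```c
#include <stdlib.h>

typedef struct {
        int decrypt_read;
        int encrypt_write;
} demo_private_t;

/* Only the field this example needs from xlator_t. */
typedef struct {
        void *private;
} demo_xlator_t;

int
demo_init (demo_xlator_t *xl)
{
        /* calloc(3) takes the element count first, then the element size. */
        demo_private_t *priv = calloc (1, sizeof (*priv));

        if (priv == NULL)
                return -1;          /* non-zero aborts translator setup */

        priv->decrypt_read  = 1;    /* defaults, as in rot-13 */
        priv->encrypt_write = 1;
        xl->private = priv;
        return 0;
}

void
demo_fini (demo_xlator_t *xl)
{
        demo_private_t *priv = xl->private;

        if (priv == NULL)
                return;             /* nothing to clean up */
        xl->private = NULL;         /* clear before freeing */
        free (priv);
}
```

Clearing the pointer before freeing makes a second call to the cleanup function harmless, which is why the slightly roundabout order is worth keeping.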
|
||||
|
||||
That's practically everything we need to know to get our translator through
|
||||
loading, initialization, options processing, and termination. If we had defined
|
||||
no dispatch functions, we could actually configure a daemon to use our
|
||||
translator and it would work as a basic pass-through from its parent to a
|
||||
single child. In the next post I'll cover how to build the translator and
|
||||
configure a daemon to use it, so that we can actually step through it in a
|
||||
debugger and see how it all fits together before we actually start adding
|
||||
functionality.
|
||||
|
||||
This Time For Real
|
||||
------------------
|
||||
|
||||
In the first two parts of this series, we learned how to write a basic
|
||||
translator skeleton that can get through loading, initialization, and option
|
||||
processing. This time we'll cover how to build that translator, configure a
|
||||
volume to use it, and run the glusterfs daemon in debug mode.
|
||||
|
||||
Unfortunately, there's not much direct support for writing new translators. You
|
||||
can check out a GlusterFS tree and splice in your own translator directory, but
|
||||
that's a bit painful because you'll have to update multiple makefiles plus a
|
||||
bunch of autoconf garbage. As part of the HekaFS project, I basically reverse
|
||||
engineered the truly necessary parts of the translator-building process and
|
||||
then pestered one of the Fedora glusterfs package maintainers (thanks
|
||||
daMaestro!) to add a `glusterfs-devel` package with the required headers. Since
|
||||
then the complexity level in the HekaFS tree has crept back up a bit, but I
|
||||
still remember the simple method and still consider it the easiest way to get
|
||||
started on a new translator. For the sake of those not using Fedora, I'm going
|
||||
to describe a method that doesn't depend on that header package. What it does
|
||||
depend on is a GlusterFS source tree, much as you might have cloned from GitHub
|
||||
or the Gluster review site. This tree doesn't have to be fully built, but you
|
||||
do need to run `autogen.sh` and `configure` in it. Then you can take the
|
||||
following simple makefile and put it in a directory with your actual source.
|
||||
|
||||
```
|
||||
# Change these to match your source code.
|
||||
TARGET = rot-13.so
|
||||
OBJECTS = rot-13.o
|
||||
|
||||
# Change these to match your environment.
|
||||
GLFS_SRC = /srv/glusterfs
|
||||
GLFS_LIB = /usr/lib64
|
||||
HOST_OS = GF_LINUX_HOST_OS
|
||||
|
||||
# You shouldn't need to change anything below here.
|
||||
|
||||
CFLAGS = -fPIC -Wall -O0 -g \
|
||||
-DHAVE_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE \
|
||||
-D$(HOST_OS) -I$(GLFS_SRC) -I$(GLFS_SRC)/contrib/uuid \
|
||||
-I$(GLFS_SRC)/libglusterfs/src
|
||||
LDFLAGS = -shared -nostartfiles -L$(GLFS_LIB)
|
||||
LIBS = -lglusterfs -lpthread
|
||||
|
||||
$(TARGET): $(OBJECTS)
|
||||
	$(CC) $(LDFLAGS) -o $(TARGET) $(OBJECTS) $(LIBS)
|
||||
```
|
||||
|
||||
Yes, it's still Linux-specific. Mea culpa. As you can see, we're sticking with
|
||||
the `rot-13` example, so you can just copy the files from
|
||||
`xlators/encryption/rot-13/src` in your GlusterFS tree to follow along. Type
|
||||
`make` and you should be rewarded with a nice little `.so` file.
|
||||
|
||||
```
|
||||
xlator_example$ ls -l rot-13.so
|
||||
-rwxr-xr-x. 1 jeff jeff 40784 Nov 16 16:41 rot-13.so
|
||||
```
|
||||
|
||||
Notice that we've built with optimization level zero and debugging symbols
|
||||
included, which would not typically be the case for a packaged version of
|
||||
GlusterFS. Let's put our version of `rot-13.so` into a slightly different file
|
||||
on our system, so that it doesn't stomp on the installed version (not that
|
||||
you'd ever want to use that anyway).
|
||||
|
||||
```
|
||||
xlator_example# ls /usr/lib64/glusterfs/3git/xlator/encryption/
|
||||
crypt.so crypt.so.0 crypt.so.0.0.0 rot-13.so rot-13.so.0
|
||||
rot-13.so.0.0.0
|
||||
xlator_example# cp rot-13.so \
|
||||
/usr/lib64/glusterfs/3git/xlator/encryption/my-rot-13.so
|
||||
```
|
||||
|
||||
These paths represent the current Gluster filesystem layout, which is likely to
|
||||
be deprecated in favor of the Fedora layout; your paths may vary. At this point
|
||||
we're ready to configure a volume using our new translator. To do that, I'm
|
||||
going to suggest something that's strongly discouraged except during
|
||||
development (the Gluster guys are going to hate me for this): write our own
|
||||
volfile. Here's just about the simplest volfile you'll ever see.
|
||||
|
||||
```
|
||||
volume my-posix
|
||||
type storage/posix
|
||||
option directory /srv/export
|
||||
end-volume
|
||||
|
||||
volume my-rot13
|
||||
type encryption/my-rot-13
|
||||
subvolumes my-posix
|
||||
end-volume
|
||||
```
|
||||
|
||||
All we have here is a basic brick using `/srv/export` for its data, and then
|
||||
an instance of our translator layered on top -- no client or server is
|
||||
necessary for what we're doing, and the system will automatically push a
|
||||
mount/fuse translator on top if there's no server translator. To try this out,
|
||||
all we need is the following command (assuming the directories involved already
|
||||
exist).
|
||||
|
||||
```
|
||||
xlator_example$ glusterfs --debug -f my.vol /srv/import
|
||||
```
|
||||
|
||||
You should be rewarded with a whole lot of log output, including the text of
|
||||
the volfile (this is very useful for debugging problems in the field). If you
|
||||
go to another window on the same machine, you can see that you have a new
|
||||
filesystem mounted.
|
||||
|
||||
```
|
||||
~$ df /srv/import
|
||||
Filesystem 1K-blocks Used Available Use% Mounted on
|
||||
/srv/xlator_example/my.vol
|
||||
114506240 2706176 105983488 3% /srv/import
|
||||
```
|
||||
|
||||
Just for fun, write something into a file in `/srv/import`, then look at the
|
||||
corresponding file in `/srv/export` to see it all `rot-13`'ed for you.
|
||||
|
||||
```
|
||||
~$ echo hello > /srv/import/a_file
|
||||
~$ cat /srv/export/a_file
|
||||
uryyb
|
||||
```
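The transformation seen in the output above is simple enough to reproduce in a few lines of C. This is a standalone sketch, not the translator's code: `demo_rot13` is a hypothetical function operating on a flat buffer, whereas the translator works on I/O vectors.

```c
#include <stddef.h>

/* Rotate ASCII letters by 13 places in place; other bytes, such as the
 * trailing newline written by echo, are left untouched. */
void
demo_rot13 (char *buf, size_t len)
{
        size_t i;

        for (i = 0; i < len; i++) {
                char c = buf[i];

                if (c >= 'a' && c <= 'z')
                        buf[i] = (char) ('a' + (c - 'a' + 13) % 26);
                else if (c >= 'A' && c <= 'Z')
                        buf[i] = (char) ('A' + (c - 'A' + 13) % 26);
        }
}
```

Applying the function twice restores the original text, which is why the same routine can serve both the write path and the read path.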
|
||||
|
||||
There you have it -- functionality you control, implemented easily, layered on
|
||||
top of local storage. Now you could start adding functionality -- real
|
||||
encryption, perhaps -- and inevitably having to debug it. You could do that the
|
||||
old-school way, with `gf_log` (preferred) or even plain old `printf`, or you
|
||||
could run daemons under `gdb` instead. Alternatively, you could wait for the
|
||||
next Translator 101 post, where we'll be doing exactly that.
|
||||
|
||||
Debugging a Translator
|
||||
----------------------
|
||||
|
||||
Now that we've learned what a translator looks like and how to build one, it's
|
||||
time to run one and actually watch it work. The best way to do this is good
|
||||
old-fashioned `gdb`, as follows (using some of the examples from last time).
|
||||
|
||||
```
|
||||
xlator_example# gdb glusterfs
|
||||
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-50.el6)
|
||||
...
|
||||
(gdb) r --debug -f my.vol /srv/import
|
||||
Starting program: /usr/sbin/glusterfs --debug -f my.vol /srv/import
|
||||
...
|
||||
[2011-11-23 11:23:16.495516] I [fuse-bridge.c:2971:fuse_init]
|
||||
0-glusterfs-fuse: FUSE inited with protocol versions:
|
||||
glusterfs 7.13 kernel 7.13
|
||||
```
|
||||
|
||||
If you get to this point, your glusterfs client process is already running. You
|
||||
can go to another window to see the mountpoint, do file operations, etc.
|
||||
|
||||
```
|
||||
~# df /srv/import
|
||||
Filesystem 1K-blocks Used Available Use% Mounted on
|
||||
/root/xlator_example/my.vol
|
||||
114506240 2643968 106045568 3% /srv/import
|
||||
~# ls /srv/import
|
||||
a_file
|
||||
~# cat /srv/import/a_file
|
||||
hello
|
||||
```
|
||||
|
||||
Now let's interrupt the process and see where we are.
|
||||
|
||||
```
|
||||
^C
|
||||
Program received signal SIGINT, Interrupt.
|
||||
0x0000003a0060b3dc in pthread_cond_wait@@GLIBC_2.3.2 ()
|
||||
from /lib64/libpthread.so.0
|
||||
(gdb) info threads
|
||||
5 Thread 0x7fffeffff700 (LWP 27206) 0x0000003a002dd8c7
|
||||
in readv ()
|
||||
from /lib64/libc.so.6
|
||||
4 Thread 0x7ffff50e3700 (LWP 27205) 0x0000003a0060b75b
|
||||
in pthread_cond_timedwait@@GLIBC_2.3.2 ()
|
||||
from /lib64/libpthread.so.0
|
||||
3 Thread 0x7ffff5f02700 (LWP 27204) 0x0000003a0060b3dc
|
||||
in pthread_cond_wait@@GLIBC_2.3.2 ()
|
||||
from /lib64/libpthread.so.0
|
||||
2 Thread 0x7ffff6903700 (LWP 27203) 0x0000003a0060f245
|
||||
in sigwait ()
|
||||
from /lib64/libpthread.so.0
|
||||
* 1 Thread 0x7ffff7957700 (LWP 27196) 0x0000003a0060b3dc
|
||||
in pthread_cond_wait@@GLIBC_2.3.2 ()
|
||||
from /lib64/libpthread.so.0
|
||||
```
|
||||
|
||||
Like any non-toy server, this one has multiple threads. What are they all
|
||||
doing? Honestly, even I don't know. Thread 1 turns out to be in
|
||||
`event_dispatch_epoll`, which means it's the one handling all of our network
|
||||
I/O. Note that with the socket multi-threading patch this will change, with one
|
||||
thread in `socket_poller` per connection. Thread 2 is in `glusterfs_sigwaiter`
|
||||
which means signals will be isolated to that thread. Thread 3 is in
|
||||
`syncenv_task`, so it's a worker thread for synchronous requests such as
|
||||
those used by the rebalance and repair code. Thread 4 is in
|
||||
`janitor_get_next_fd`, so it's waiting for a chance to close no-longer-needed
|
||||
file descriptors on the local filesystem. (I admit I had to look that one up,
|
||||
BTW.) Lastly, thread 5 is in `fuse_thread_proc`, so it's the one fetching
|
||||
requests from our FUSE interface. You'll often see many more threads than
|
||||
this, but it's a pretty good basic set. Now, let's set a breakpoint so we can
|
||||
actually watch a request.
|
||||
|
||||
```
|
||||
(gdb) b rot13_writev
|
||||
Breakpoint 1 at 0x7ffff50e4f0b: file rot-13.c, line 119.
|
||||
(gdb) c
|
||||
Continuing.
|
||||
```
|
||||
|
||||
At this point we go into our other window and do something that will involve a write.
|
||||
|
||||
```
|
||||
~# echo goodbye > /srv/import/another_file
|
||||
(back to the first window)
|
||||
[Switching to Thread 0x7fffeffff700 (LWP 27206)]
|
||||
|
||||
Breakpoint 1, rot13_writev (frame=0x7ffff6e4402c, this=0x638440,
|
||||
fd=0x7ffff409802c, vector=0x7fffe8000cd8, count=1, offset=0,
|
||||
iobref=0x7fffe8001070) at rot-13.c:119
|
||||
119 rot_13_private_t *priv = (rot_13_private_t *)this->private;
|
||||
```
|
||||
|
||||
Remember how we built with debugging symbols enabled and no optimization? That
|
||||
will be pretty important for the next few steps. As you can see, we're in
|
||||
`rot13_writev`, with several parameters.
|
||||
|
||||
* `frame` is our always-present frame pointer for this request. Also,
|
||||
`frame->local` will point to any local data we created and attached to the
|
||||
request ourselves.
|
||||
* `this` is a pointer to our instance of the `rot-13` translator. You can examine
|
||||
it if you like to see the name, type, options, parent/children, inode table,
|
||||
and other stuff associated with it.
|
||||
* `fd` is a pointer to a file-descriptor *object* (`fd_t`, not just a
|
||||
file-descriptor index which is what most people use "fd" for). This in turn
|
||||
points to an inode object (`inode_t`) and we can associate our own
|
||||
`rot-13`-specific data with either of these.
|
||||
* `vector` and `count` together describe the data buffers for this write, which
|
||||
we'll get to in a moment.
|
||||
* `offset` is the offset into the file at which we're writing.
|
||||
* `iobref` is a buffer-reference object, which is used to track the life cycle
|
||||
of buffers containing read/write data. If you look closely, you'll notice that
|
||||
`vector[0].iov_base` points to the same address as `iobref->iobrefs[0].ptr`, which
|
||||
should give you some idea of the inter-relationships between vector and iobref.
|
||||
|
||||
OK, now what about that `vector`? We can use it to examine the data being
|
||||
written, like this.
|
||||
|
||||
```
|
||||
(gdb) p vector[0]
|
||||
$2 = {iov_base = 0x7ffff7936000, iov_len = 8}
|
||||
(gdb) x/s 0x7ffff7936000
|
||||
0x7ffff7936000: "goodbye\n"
|
||||
```
|
||||
|
||||
It's not always safe to view this data as a string, because it might just as
|
||||
well be binary data, but since we're generating the write this time it's safe
|
||||
and convenient. With that knowledge, let's step through things a bit.
|
||||
|
||||
```
|
||||
(gdb) s
|
||||
120 if (priv->encrypt_write)
|
||||
(gdb)
|
||||
121 rot13_iovec (vector, count);
|
||||
(gdb)
|
||||
rot13_iovec (vector=0x7fffe8000cd8, count=1) at rot-13.c:57
|
||||
57 for (i = 0; i < count; i++) {
|
||||
(gdb)
|
||||
58 rot13 (vector[i].iov_base, vector[i].iov_len);
|
||||
(gdb)
|
||||
rot13 (buf=0x7ffff7936000 "goodbye\n", len=8) at rot-13.c:45
|
||||
45 for (i = 0; i < len; i++) {
|
||||
(gdb)
|
||||
46 if (buf[i] >= 'a' && buf[i] <= 'z')
|
||||
(gdb)
|
||||
47 buf[i] = 'a' + ((buf[i] - 'a' + 13) % 26);
|
||||
```
|
||||
|
||||
Here we've stepped into `rot13_iovec`, which iterates through our vector
|
||||
calling `rot13`, which in turn iterates through the characters in that chunk
|
||||
doing the `rot-13` operation if/as appropriate. This is pretty straightforward
|
||||
stuff, so let's skip to the next interesting bit.
|
||||
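As a rough illustration of what the stepping above shows, here is a minimal standalone sketch of the vector walk. It is not the actual `rot-13.c` source: the names `rot13_buf_sketch` and `rot13_iovec_sketch` are made up for this example, and the upper-case branch is an assumption, since only the lower-case test appears in the session above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical helper: rotate ASCII letters in one buffer by 13 places. */
static void rot13_buf_sketch(char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] >= 'a' && buf[i] <= 'z')
            buf[i] = 'a' + ((buf[i] - 'a' + 13) % 26);
        else if (buf[i] >= 'A' && buf[i] <= 'Z')   /* assumed, not shown above */
            buf[i] = 'A' + ((buf[i] - 'A' + 13) % 26);
    }
}

/* Hypothetical helper: apply the rotation to every chunk in the vector. */
static void rot13_iovec_sketch(struct iovec *vector, int count)
{
    int i;

    for (i = 0; i < count; i++)
        rot13_buf_sketch(vector[i].iov_base, vector[i].iov_len);
}
```

Feeding it a one-element vector that holds `"goodbye\n"` leaves `"tbbqolr\n"` in the buffer, and a second pass restores the original, which is why the same routine can serve both the write and the read path.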
|
||||
```
|
||||
(gdb) fin
|
||||
Run till exit from #0 rot13 (buf=0x7ffff7936000 "goodbye\n",
|
||||
len=8) at rot-13.c:47
|
||||
rot13_iovec (vector=0x7fffe8000cd8, count=1) at rot-13.c:57
|
||||
57 for (i = 0; i < count; i++) {
|
||||
(gdb) fin
|
||||
Run till exit from #0 rot13_iovec (vector=0x7fffe8000cd8,
|
||||
count=1) at rot-13.c:57
|
||||
rot13_writev (frame=0x7ffff6e4402c, this=0x638440,
|
||||
fd=0x7ffff409802c, vector=0x7fffe8000cd8, count=1,
|
||||
offset=0, iobref=0x7fffe8001070) at rot-13.c:123
|
||||
123 STACK_WIND (frame,
|
||||
(gdb) b 129
|
||||
Breakpoint 2 at 0x7ffff50e4f35: file rot-13.c, line 129.
|
||||
(gdb) b rot13_writev_cbk
|
||||
Breakpoint 3 at 0x7ffff50e4db3: file rot-13.c, line 106.
|
||||
(gdb) c
|
||||
```
|
||||
|
||||
So we've set breakpoints on both the callback and the statement following the
|
||||
`STACK_WIND`. Which one will we hit first?
|
||||
|
||||
```
|
||||
Breakpoint 3, rot13_writev_cbk (frame=0x7ffff6e4402c,
|
||||
cookie=0x7ffff6e440d8, this=0x638440, op_ret=8, op_errno=0,
|
||||
prebuf=0x7fffefffeca0, postbuf=0x7fffefffec30)
|
||||
at rot-13.c:106
|
||||
106 STACK_UNWIND_STRICT (writev, frame, op_ret, op_errno,
|
||||
prebuf, postbuf);
|
||||
(gdb) bt
|
||||
#0 rot13_writev_cbk (frame=0x7ffff6e4402c,
|
||||
cookie=0x7ffff6e440d8, this=0x638440, op_ret=8, op_errno=0,
|
||||
prebuf=0x7fffefffeca0, postbuf=0x7fffefffec30)
|
||||
at rot-13.c:106
|
||||
#1 0x00007ffff52f1b37 in posix_writev (frame=0x7ffff6e440d8,
|
||||
this=<value optimized out>, fd=<value optimized out>,
|
||||
vector=<value optimized out>, count=1,
|
||||
offset=<value optimized out>, iobref=0x7fffe8001070)
|
||||
at posix.c:2217
|
||||
#2 0x00007ffff50e513e in rot13_writev (frame=0x7ffff6e4402c,
|
||||
this=0x638440, fd=0x7ffff409802c, vector=0x7fffe8000cd8,
|
||||
count=1, offset=0, iobref=0x7fffe8001070) at rot-13.c:123
|
||||
```
|
||||
|
||||
Surprise! We're in `rot13_writev_cbk` now, called (indirectly) while we're
|
||||
still in `rot13_writev` before `STACK_WIND` returns (still at rot-13.c:123). If
|
||||
you did any request cleanup here, then you need to be careful about what you
|
||||
do in the remainder of `rot13_writev` because data may have been freed etc.
|
||||
It's tempting to say you should just do the cleanup in `rot13_writev` after
|
||||
the `STACK_WIND`, but that's not valid because it's also possible that some
|
||||
other translator returned without calling `STACK_UNWIND` -- i.e. before
|
||||
`rot13_writev_cbk` is called, so then the callback would be the one getting null-pointer
|
||||
errors instead. To put it another way, the callback and the return from
|
||||
`STACK_WIND` can occur in either order or even simultaneously on different
|
||||
threads. Even if you were to use reference counts, you'd have to make sure to
|
||||
use locking or atomic operations to avoid races, and it's not worth it. Unless
|
||||
you *really* understand the possible flows of control and know what you're
|
||||
doing, it's better to do cleanup in the callback and nothing after
|
||||
`STACK_WIND`.
|
||||
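To make the ordering hazard concrete, here's a toy model in plain C. Nothing in it is GlusterFS API: `demo_frame_t`, `demo_child_writev` and friends are hypothetical stand-ins, and the child completes synchronously only because that's what the backtrace above showed `posix_writev` doing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request frame; not a GlusterFS type. */
typedef struct demo_frame {
    char *local;      /* per-request data, playing the role of frame->local */
    int   completed;  /* set by the callback once the request is done */
} demo_frame_t;

typedef void (*demo_cbk_t)(demo_frame_t *frame, int op_ret);

/* Callback: the only place that releases per-request data. */
static void demo_writev_cbk(demo_frame_t *frame, int op_ret)
{
    assert(frame != NULL);
    free(frame->local);
    frame->local = NULL;
    frame->completed = (op_ret >= 0);
}

/* A child that completes before returning to its caller. */
static void demo_child_writev(demo_frame_t *frame, demo_cbk_t cbk, int len)
{
    cbk(frame, len);
}

/* Parent operation: hand the request off, then touch nothing the
 * callback may already have freed. */
static void demo_writev(demo_frame_t *frame, int len)
{
    assert(frame != NULL);
    frame->local = malloc(16);
    demo_child_writev(frame, demo_writev_cbk, len);
    /* Deliberately empty: frame->local may be gone by now. */
}
```

By the time `demo_writev` gets control back, the callback has already run and released `local`. A child that deferred the callback to another thread would produce the opposite order, so the only arrangement that is safe in both cases is the one shown: cleanup in the callback, nothing after the hand-off.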
|
||||
At this point all that's left is a `STACK_UNWIND` and a return. The
|
||||
`STACK_UNWIND` invokes our parent's completion callback, and in this case our
|
||||
parent is FUSE so at that point the VFS layer is notified of the write being
|
||||
complete. Finally, we return through several levels of normal function calls
|
||||
until we come back to `fuse_thread_proc`, which waits for the next request.
|
||||
|
||||
So that's it. For extra fun, you might want to repeat this exercise by stepping
|
||||
through some other call -- stat or setxattr might be good choices -- but you'll
|
||||
have to use a translator that actually implements those calls to see much
|
||||
that's interesting. Then you'll pretty much know everything I knew when I
|
||||
started writing my first for-real translators, and probably even a bit more. I
|
||||
hope you've enjoyed this series, or at least found it useful, and if you have
|
||||
any suggestions for other topics I should cover please let me know (via
|
||||
comments or email, IRC or Twitter).
|
||||
|
||||
Other versions
|
||||
--------------
|
||||
|
||||
Original author's site:
|
||||
|
||||
* [Translator 101 - Setting the Stage](http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-class-1-setting-the-stage/)
|
||||
|
||||
* [Translator 101 - Init, Fini and Private Context](http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-lesson-2-init-fini-and-private-context/)
|
||||
|
||||
* [Translator 101 - This Time for Real](http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-lesson-3-this-time-for-real/)
|
||||
|
||||
* [Translator 101 - Debugging a Translator](http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-lesson-4-debugging-a-translator/)
|
||||
|
||||
Gluster community site:
|
||||
|
||||
* [Translators](http://www.gluster.org/community/documentation/index.php/Translators)
|
||||
@@ -1,228 +0,0 @@
|
||||
# Unit Tests in GlusterFS
|
||||
|
||||
## Overview
|
||||
[Art-of-unittesting][definitionofunittest] provides a good definition for unit tests. A good unit test is:
|
||||
|
||||
* Able to be fully automated
|
||||
* Has full control over all the pieces running (Use mocks or stubs to achieve this isolation when needed)
|
||||
* Can be run in any order if part of many other tests
|
||||
* Runs in memory (no DB or File access, for example)
|
||||
* Consistently returns the same result (you always run the same test, so no random numbers, for example; save those for integration or range tests)
|
||||
* Runs fast
|
||||
* Tests a single logical concept in the system
|
||||
* Readable
|
||||
* Maintainable
|
||||
* Trustworthy (when you see its result, you don’t need to debug the code just to be sure)
|
||||
|
||||
## cmocka
|
||||
The GlusterFS unit test framework is based on [cmocka][]. cmocka provides
|
||||
developers with methods to isolate and test modules written in C. It
|
||||
also provides integration with Jenkins by providing JUnit XML compliant unit
|
||||
test results.
|
||||
|
||||
|
||||
|
||||
## Running Unit Tests
|
||||
To execute the unit tests, all you need is to type `make check`. Here is a step-by-step example assuming you just cloned a GlusterFS tree:
|
||||
|
||||
```
|
||||
$ ./autogen.sh
|
||||
$ ./configure --enable-debug
|
||||
$ make check
|
||||
```
|
||||
|
||||
Sample output:
|
||||
|
||||
```
|
||||
PASS: mem_pool_unittest
|
||||
============================================================================
|
||||
Testsuite summary for glusterfs 3git
|
||||
============================================================================
|
||||
# TOTAL: 1
|
||||
# PASS: 1
|
||||
# SKIP: 0
|
||||
# XFAIL: 0
|
||||
# FAIL: 0
|
||||
# XPASS: 0
|
||||
# ERROR: 0
|
||||
============================================================================
|
||||
```
|
||||
|
||||
In this example, `mem_pool_unittest` has multiple tests inside, but `make check` assumes that the program itself is the test, and that is why it only shows one test. Here is the output when we run `mem_pool_unittest` directly:
|
||||
|
||||
```
|
||||
$ ./libglusterfs/src/mem_pool_unittest
|
||||
[==========] Running 10 test(s).
|
||||
[ RUN ] test_gf_mem_acct_enable_set
|
||||
Expected assertion data != ((void *)0) occurred
|
||||
[ OK ] test_gf_mem_acct_enable_set
|
||||
[ RUN ] test_gf_mem_set_acct_info_asserts
|
||||
Expected assertion xl != ((void *)0) occurred
|
||||
Expected assertion size > ((4 + sizeof (size_t) + sizeof (xlator_t *) + 4 + 8) + 8) occurred
|
||||
Expected assertion type <= xl->mem_acct.num_types occurred
|
||||
[ OK ] test_gf_mem_set_acct_info_asserts
|
||||
[ RUN ] test_gf_mem_set_acct_info_memory
|
||||
[ OK ] test_gf_mem_set_acct_info_memory
|
||||
[ RUN ] test_gf_calloc_default_calloc
|
||||
[ OK ] test_gf_calloc_default_calloc
|
||||
[ RUN ] test_gf_calloc_mem_acct_enabled
|
||||
[ OK ] test_gf_calloc_mem_acct_enabled
|
||||
[ RUN ] test_gf_malloc_default_malloc
|
||||
[ OK ] test_gf_malloc_default_malloc
|
||||
[ RUN ] test_gf_malloc_mem_acct_enabled
|
||||
[ OK ] test_gf_malloc_mem_acct_enabled
|
||||
[ RUN ] test_gf_realloc_default_realloc
|
||||
[ OK ] test_gf_realloc_default_realloc
|
||||
[ RUN ] test_gf_realloc_mem_acct_enabled
|
||||
[ OK ] test_gf_realloc_mem_acct_enabled
|
||||
[ RUN ] test_gf_realloc_ptr
|
||||
Expected assertion ((void *)0) != ptr occurred
|
||||
[ OK ] test_gf_realloc_ptr
|
||||
[==========] 10 test(s) run.
|
||||
[ PASSED ] 10 test(s).
|
||||
[ FAILED ] 0 test(s).
|
||||
[ REPORT ] Created libglusterfs_mem_pool_xunit.xml report
|
||||
```
|
||||
|
||||
|
||||
## Writing Unit Tests
|
||||
|
||||
### Enhancing your C functions
|
||||
|
||||
#### Programming by Contract
|
||||
Add the following to your C file:
|
||||
|
||||
```c
|
||||
#include <cmocka_pbc.h>
|
||||
```
|
||||
|
||||
```c
|
||||
/*
|
||||
* Programming by Contract is a programming methodology
|
||||
* which binds the caller and the function called to a
|
||||
* contract. The contract is represented using Hoare Triple:
|
||||
* {P} C {Q}
|
||||
* where {P} is the precondition before executing command C,
|
||||
* and {Q} is the postcondition.
|
||||
*
|
||||
* See also:
|
||||
* http://en.wikipedia.org/wiki/Design_by_contract
|
||||
* http://en.wikipedia.org/wiki/Hoare_logic
|
||||
* http://dlang.org/dbc.html
|
||||
*/
|
||||
#ifndef CMOCKERY_PBC_H_
|
||||
#define CMOCKERY_PBC_H_
|
||||
|
||||
#if defined(UNIT_TESTING) || defined (DEBUG)
|
||||
|
||||
#include <assert.h>
|
||||
|
||||
/*
|
||||
* Checks caller responsibility against contract
|
||||
*/
|
||||
#define REQUIRE(cond) assert(cond)
|
||||
|
||||
/*
|
||||
* Checks function responsibility against contract.
|
||||
*/
|
||||
#define ENSURE(cond) assert(cond)
|
||||
|
||||
/*
|
||||
* While REQUIRE and ENSURE apply to functions, INVARIANT
|
||||
* applies to classes/structs. It ensures that instances
|
||||
* of the class/struct are consistent. In other words,
|
||||
* that the instance has not been corrupted.
|
||||
*/
|
||||
#define INVARIANT(invariant_fnc) do{ (invariant_fnc) } while (0);
|
||||
|
||||
#else
|
||||
#define REQUIRE(cond) do { } while (0);
|
||||
#define ENSURE(cond) do { } while (0);
|
||||
#define INVARIANT(invariant_fnc) do{ } while (0);
|
||||
|
||||
#endif /* defined(UNIT_TESTING) || defined (DEBUG) */
|
||||
#endif /* CMOCKERY_PBC_H_ */
|
||||
```
|
||||
|
||||
##### Example
|
||||
This is an _extremely_ simple example:
|
||||
|
||||
```c
|
||||
int divide (int n, int d)
|
||||
{
|
||||
int ans;
|
||||
|
||||
REQUIRE(d != 0);
|
||||
|
||||
ans = n / d;
|
||||
|
||||
// As code is added to this function throughout its lifetime,
|
||||
// ENSURE will assert that data will be returned
|
||||
// according to the contract. Again this is an
|
||||
// extremely simple example. :-D
|
||||
ENSURE( ans == (n / d) );
|
||||
|
||||
return ans;
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
##### Important Note
|
||||
`REQUIRE`, `ENSURE`, and `INVARIANT` are only available when `DEBUG` or `UNIT_TESTING` is set in the CFLAGS. You must pass `--enable-debug` to `./configure` to enable PBC on your non-unittest builds.
|
||||
|
||||
#### Overriding functions
|
||||
cmocka provides its own memory allocation functions which check for buffer overruns and memory leaks. The following header file must be included **last** to be able to override any of the memory allocation functions:
|
||||
|
||||
```c
|
||||
#include <cmocka.h>
|
||||
```
|
||||
|
||||
This file will only take effect when the `UNIT_TESTING` CFLAG is set.
|
||||
|
||||
### Creating a unit test
|
||||
Once you identify the C file you would like to test, first create a `unittest` directory under the directory where the C file is located. This will isolate the unittests to a different directory.
|
||||
|
||||
Next, you need to edit the `Makefile.am` file in the directory where your C file is located. Initialize the
|
||||
`Makefile.am` if it does not already have the following sections:
|
||||
|
||||
```
|
||||
#### UNIT TESTS #####
|
||||
CLEANFILES += *.gcda *.gcno *_xunit.xml
|
||||
noinst_PROGRAMS =
|
||||
TESTS =
|
||||
```
|
||||
|
||||
Now you can add the following for each of the unit tests that you would like to build:
|
||||
|
||||
```
|
||||
### UNIT TEST xxx_unittest ###
|
||||
xxx_unittest_CPPFLAGS = $(xxx_CPPFLAGS)
|
||||
xxx_unittest_SOURCES = xxx.c \
|
||||
unittest/xxx_unittest.c
|
||||
xxx_unittest_CFLAGS = $(UNITTEST_CFLAGS)
|
||||
xxx_unittest_LDFLAGS = $(UNITTEST_LDFLAGS)
|
||||
noinst_PROGRAMS += xxx_unittest
|
||||
TESTS += xxx_unittest
|
||||
```
|
||||
|
||||
Where `xxx` is the name of your C file. For example, look at `libglusterfs/src/Makefile.am`.
|
||||
|
||||
Copy the simple unit test from the [cmocka API][cmockaapi] to `unittest/xxx_unittest.c`. If you would like to see an example of a unit test, please refer to `libglusterfs/src/unittest/mem_pool_unittest.c`.
|
||||
|
||||
#### Mocking
|
||||
You may see that the linker will complain about missing functions needed by the C file you would like to test. Identify the required functions, then place their stubs in a file called `unittest/xxx_mock.c`, then include this file in `Makefile.am` in `xxx_unittest_SOURCES`. This will allow you to use cmocka's mocking functions.
|
||||
|
||||
#### Running the unit test
|
||||
You can type `make` in the directory where the C file is located. Once you have built it and there are no errors, you can execute the test either by directly executing the program (in our example above it is called `xxx_unittest`), or by running `make check`.
|
||||
|
||||
#### Debugging
|
||||
Sometimes you may need to debug your unit test. To do that, you will have to point `gdb` to the binary which is located in the same directory as the source. For example, you can do the following from the root of the source tree to debug `mem_pool_unittest`:
|
||||
|
||||
```
|
||||
$ gdb libglusterfs/src/mem_pool_unittest
|
||||
```
|
||||
|
||||
|
||||
[cmocka]: https://cmocka.org
|
||||
[definitionofunittest]: http://artofunittesting.com/definition-of-a-unit-test/
|
||||
[cmockaapi]: https://api.cmocka.org
|
||||
@@ -1,44 +0,0 @@
|
||||
Versioning
|
||||
==========
|
||||
|
||||
### current
|
||||
|
||||
The number of the current interface exported by the library. A current value
|
||||
of `1` means that you are calling the interface that this library exports as
|
||||
interface 1.
|
||||
|
||||
### revision
|
||||
|
||||
The implementation number of the most recent interface exported by this library.
|
||||
In this case, a revision value of `0` means that this is the first
|
||||
implementation of the interface.
|
||||
|
||||
If the next release of this library exports the same interface, but has a
|
||||
different implementation (perhaps some bugs have been fixed), the revision
|
||||
number will be higher, but the current number will be the same. In that case, when
|
||||
given a choice, the library with the highest revision will always be used by
|
||||
the runtime loader.
|
||||
|
||||
### age
|
||||
|
||||
The number of previous additional interfaces supported by this library. If age
|
||||
were '2', then this library can be linked into executables which were built with
|
||||
a release of this library that exported the current interface number, current,
|
||||
or any of the previous two interfaces. By definition age must be less than or
|
||||
equal to current. At the outset, only the first ever interface is implemented,
|
||||
so age can only be `0`.
|
||||
|
||||
For every release of the library, the `-version-info` argument needs to be set
|
||||
correctly depending on any interface changes you have made.
|
||||
|
||||
This is quite straightforward when you understand what the three numbers mean:
|
||||
|
||||
If you have changed any of the sources for this library, the revision number
|
||||
must be incremented. This is a new revision of the current interface. If the
|
||||
interface has changed, then current must be incremented, and revision reset
|
||||
to '0'.
|
||||
|
||||
This is the first revision of a new interface. If the new interface is a
|
||||
superset of the previous interface (that is, if the previous interface has not
|
||||
been broken by the changes in this new release), then age must be incremented.
|
||||
This release is backwards compatible with the previous release.
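
The update rule can be sketched as a small C function. The struct and function names are hypothetical, and the step that resets age to 0 on an incompatible change is not stated above; it is taken from libtool's documented rules for `-version-info`.

```c
#include <assert.h>

/* Hypothetical container for the three -version-info numbers. */
typedef struct {
    unsigned current;
    unsigned revision;
    unsigned age;
} version_info_t;

/*
 * Compute the numbers for the next release.
 *   source_changed      - any source of the library changed
 *   interface_changed   - interfaces were added, removed, or changed
 *   backward_compatible - the new interface is a superset of the old one
 */
static version_info_t next_version_info(version_info_t v,
                                        int source_changed,
                                        int interface_changed,
                                        int backward_compatible)
{
    if (interface_changed) {
        v.current++;
        v.revision = 0;
        /* Assumption from libtool's rules: an incompatible change resets age. */
        v.age = backward_compatible ? v.age + 1 : 0;
    } else if (source_changed) {
        v.revision++;
    }

    /* By definition age must be less than or equal to current. */
    assert(v.age <= v.current);
    return v;
}
```

Starting from 1:0:0, a bug-fix-only release gives 1:1:0, a release that then adds an interface gives 2:0:1, and a later release that breaks an interface gives 3:0:0.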
|
||||
@@ -1,56 +0,0 @@
|
||||
performance/write-behind translator
|
||||
===================================
|
||||
|
||||
Basic working
|
||||
--------------
|
||||
|
||||
Write-behind is basically a translator that tells the application that the
|
||||
write requests are finished, even before they actually are.
|
||||
|
||||
On a regular translator tree without write-behind, control flow is like this:
|
||||
|
||||
1. application makes a `write()` system call.
|
||||
2. VFS ==> FUSE ==> `/dev/fuse`.
|
||||
3. fuse-bridge initiates a glusterfs `writev()` call.
|
||||
4. `writev()` is `STACK_WIND()`ed up to client-protocol or storage translator.
|
||||
5. client-protocol, on receiving reply from server, starts `STACK_UNWIND()` towards the fuse-bridge.
|
||||
|
||||
On a translator tree with write-behind, control flow is like this:
|
||||
|
||||
1. application makes a `write()` system call.
|
||||
2. VFS ==> FUSE ==> `/dev/fuse`.
|
||||
3. fuse-bridge initiates a glusterfs `writev()` call.
|
||||
4. `writev()` is `STACK_WIND()`ed up to write-behind translator.
|
||||
5. write-behind adds the write buffer to its internal queue and does a `STACK_UNWIND()` towards the fuse-bridge.
|
||||
|
||||
The write call is complete from the application's perspective. After
|
||||
`STACK_UNWIND()`ing towards the fuse-bridge, write-behind initiates a fresh
|
||||
`writev()` call to its child translator, whose replies will be consumed by
|
||||
write-behind itself. Write-behind _doesn't_ cache the write buffer, unless
|
||||
`option flush-behind on` is specified in volume specification file.
|
||||
|
||||
Windowing
|
||||
---------
|
||||
|
||||
With respect to write-behind, each write-buffer has three flags: `stack_wound`, `write_behind` and `got_reply`.
|
||||
|
||||
* `stack_wound`: if set, indicates that write-behind has initiated `STACK_WIND()` towards child translator.
|
||||
* `write_behind`: if set, indicates that write-behind has done `STACK_UNWIND()` towards fuse-bridge.
|
||||
* `got_reply`: if set, indicates that write-behind has received reply from child translator for a `writev()` `STACK_WIND()`. A request will be destroyed by write-behind only if this flag is set.
|
||||
|
||||
Currently pending write requests = aggregate size of requests with `write_behind = 1` and `got_reply = 0`.
|
||||
|
||||
Window size limits the aggregate size of currently pending write requests. Once
|
||||
the pending requests' size has reached the window size, write-behind blocks
|
||||
`writev()` calls from fuse-bridge. Blocking is only from the application's
|
||||
perspective. Write-behind does `STACK_WIND()` to child translator
|
||||
straight away, but holds back the `STACK_UNWIND()` towards fuse-bridge.
|
||||
`STACK_UNWIND()` is done only once write-behind gets enough replies to
|
||||
accommodate the currently blocked request.
|
||||
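The window accounting can be sketched in a few lines of C. This is only an illustration of the rule as described here, not the translator's code: `wb_request_sketch_t` and both function names are hypothetical, and the exact comparison against the window size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-request record carrying the three flags described above. */
typedef struct {
    size_t size;          /* bytes in this write buffer */
    int    stack_wound;   /* STACK_WIND() sent to the child */
    int    write_behind;  /* STACK_UNWIND() already sent to fuse-bridge */
    int    got_reply;     /* child has replied */
} wb_request_sketch_t;

/* Aggregate size of requests with write_behind = 1 and got_reply = 0. */
static size_t wb_pending_size(const wb_request_sketch_t *reqs, size_t n)
{
    size_t i, total = 0;

    assert(reqs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (reqs[i].write_behind && !reqs[i].got_reply)
            total += reqs[i].size;
    }
    return total;
}

/* Nonzero if a new write of new_size bytes still fits in the window,
 * meaning it could be unwound to the application right away. */
static int wb_can_unwind_early(const wb_request_sketch_t *reqs, size_t n,
                               size_t new_size, size_t window_size)
{
    return wb_pending_size(reqs, n) + new_size <= window_size;
}
```

A request that has been acknowledged to the application but not yet answered by the child counts against the window; one whose reply has arrived, or whose unwind is still being held back, does not.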
|
||||
Flush behind
|
||||
------------
|
||||
|
||||
If `option flush-behind on` is specified in volume specification file, then
|
||||
write-behind sends aggregate write requests to child translator, instead of
|
||||
regular per request `STACK_WIND()`s.
|
||||
39
mkdocs.yml
@@ -60,45 +60,6 @@ pages:
|
||||
- Network Configurations Techniques: Administrator Guide/Network Configurations Techniques.md
|
||||
- Performance Testing: Administrator Guide/Performance Testing.md
|
||||
- Export and Netgroup Authentication: Administrator Guide/Export And Netgroup Authentication.md
|
||||
- Developers Guide:
|
||||
- Developers Home: Developer-guide/Developers Index.md
|
||||
- Simplified Development Workflow: Developer-guide/Simplified Development Workflow.md
|
||||
- Development Workflow: Developer-guide/Development Workflow.md
|
||||
- Coding Standards: Developer-guide/coding-standard.md
|
||||
- Compiling RPMS: Developer-guide/Compiling RPMS.md
|
||||
- Building GlusterFS: Developer-guide/Building GlusterFS.md
|
||||
- Projects: Developer-guide/Projects.md
|
||||
- Language Bindings: Developer-guide/Language Bindings.md
|
||||
- Easy Fix Bugs: Developer-guide/Easy Fix Bugs.md
|
||||
- Fixing issues for static code analysis: Developer-guide/Fixing issues reported by tools for static code analysis.md
|
||||
- Backport Wishlist: Developer-guide/Backport Wishlist.md
|
||||
- Backport Guidelines: Developer-guide/Backport Guidelines.md
|
||||
- Adding File Operations: Developer-guide/adding-fops.md
|
||||
- Automatic File Replication: Developer-guide/afr.md
|
||||
- History of Locking in AFR: Developer-guide/afr-locks-evolution.md
|
||||
- Self heal Daemon: Developer-guide/afr-self-heal-daemon.md
|
||||
- inode datastructure: Developer-guide/datastructure-inode.md
|
||||
- iobuf datastructure: Developer-guide/datastructure-iobuf.md
|
||||
- mem-pool datastructure: Developer-guide/datastructure-mem-pool.md
|
||||
- gfapi Symbol Versions: Developer-guide/gfapi-symbol-versions.md
|
||||
- Daemon Management Framework: Developer-guide/daemon-management-framework.md
|
||||
- Block Device Translator: Developer-guide/bd-xlator.md
|
||||
- Write Behind Translator: Developer-guide/write-behind.md
|
||||
- Translator Development: Developer-guide/translator-development.md
|
||||
- Storage/posix Translator: Developer-guide/posix.md
|
||||
- Compression Translator: Developer-guide/network_compression.md
|
||||
- Unit Tests in GlusterFS: Developer-guide/unittest.md
|
||||
- Using Gluster Test Framework: Developer-guide/Using Gluster Test Framework.md
|
||||
- Jenkins Infrastructure: Developer-guide/Jenkins Infrastructure.md
|
||||
- Jenkins Manual Setup: Developer-guide/Jenkins Manual Setup.md
|
||||
- Coredump Analysis: Developer-guide/coredump-analysis.md
|
||||
- Bug Reporting Guidelines: Developer-guide/Bug Reporting Guidelines.md
|
||||
- Bug Triage Guidelines: Developer-guide/Bug Triage.md
|
||||
- Bug Report Life Cycle: Developer-guide/Bug report Life Cycle.md
|
||||
- Guidelines for Maintainers: Developer-guide/Guidelines For Maintainers.md
|
||||
- Versioning: Developer-guide/versioning.md
|
||||
- GlusterFS Release Process: Developer-guide/GlusterFS Release process.md
|
||||
- Bug Reporting Template: Developer-guide/Bug reporting template.md
|
||||
- Upgrade-Guide:
|
||||
- Upgrade-Guide Index: Upgrade-Guide/README.md
|
||||
- Upgrade to 3.5: Upgrade-Guide/upgrade_to_3.5.md
|
||||
|
||||