As Issue #1310 pointed out, for fops that have interrupt handlers the fop handler needs to pay attention to give the proper FUSE response when it's interrupted. This change extends the interrupt documentation with guidelines regarding the FUSE response.

Also:
- improve wording
- add an 'Overview' section to explain the code flow before going in to the technical details

Change-Id: I852bfb717b1bde73f220878d6376429564413820
updates: #1374
Signed-off-by: Csaba Henk <csaba@redhat.com>
Fuse interrupt handling
Conventions followed
- FUSE refers to the "wire protocol" between kernel and userspace and related specifications.
- fuse refers to the kernel subsystem and also to the GlusterFS translator.
FUSE interrupt handling spec
The Linux kernel FUSE documentation describes how interrupt handling happens in fuse.
Interrupt handling in the fuse translator
Declarations
This document describes the internal API in the fuse translator with which interrupts can be handled.
The API is internal: it is to be used only in fuse-bridge.c, and the functions are not exported to a header file.
enum fuse_interrupt_state {
    /* ... */
    INTERRUPT_SQUELCHED,
    INTERRUPT_HANDLED,
    /* ... */
};
typedef enum fuse_interrupt_state fuse_interrupt_state_t;
struct fuse_interrupt_record;
typedef struct fuse_interrupt_record fuse_interrupt_record_t;
typedef void (*fuse_interrupt_handler_t)(xlator_t *this,
                                         fuse_interrupt_record_t *);
struct fuse_interrupt_record {
    fuse_in_header_t fuse_in_header;
    void *data;
    /*
       ...
     */
};

fuse_interrupt_record_t *
fuse_interrupt_record_new(fuse_in_header_t *finh,
                          fuse_interrupt_handler_t handler);

void
fuse_interrupt_record_insert(xlator_t *this, fuse_interrupt_record_t *fir);

gf_boolean_t
fuse_interrupt_finish_fop(call_frame_t *frame, xlator_t *this,
                          gf_boolean_t sync, void **datap);

void
fuse_interrupt_finish_interrupt(xlator_t *this, fuse_interrupt_record_t *fir,
                                fuse_interrupt_state_t intstat,
                                gf_boolean_t sync, void **datap);
The code demonstrates the usage of the API through fuse_flush(). (It's a
dummy implementation only for demonstration purposes.) Flush is chosen
because a FLUSH interrupt is easy to trigger (see
tests/features/interrupt.t). Interrupt handling for flush is switched on
by --fuse-flush-handle-interrupt (a hidden glusterfs command line flag).
The implementation of flush interrupt handling is contained in the
fuse_flush_interrupt_handler() function and in the blocks guarded by the
    if (priv->flush_handle_interrupt) { ...
conditional (where priv is a fuse_private_t *).
Overview
"Regular" fuse fops and interrupt handlers interact via a list containing interrupt records.
If a fop wishes to have its interrupts handled, it needs to set up an interrupt record and insert it into the list; when it is about to finish (i.e. in its "cbk" stage), it needs to delete the record from the list.
If no interrupt happens, that's basically all there is to it: a list insertion and a deletion.
However, if an interrupt comes for the fop, the interrupt FUSE request
will carry the data identifying an ongoing fop (that is, its unique),
and based on that, the interrupt record will be looked up in the list, and
the specific interrupt handler (a member of the interrupt record) will be
called.
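The lookup described above can be sketched as follows. This is a minimal, single-threaded model and not the fuse-bridge.c code: `struct irec`, `irec_insert()`, `irec_remove()`, `irec_dispatch_interrupt()` and `count_handler()` are hypothetical names made up for the illustration, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an interrupt record. */
struct irec;
typedef void (*irec_handler_t)(struct irec *rec);

struct irec {
    uint64_t unique;        /* id of the ongoing FUSE request */
    irec_handler_t handler; /* fop-specific interrupt handler */
    void *data;             /* data shared between fop and handler */
    struct irec *next;
};

/* List of the fops that can be interrupted at the moment. */
static struct irec *irec_list = NULL;

/* Fop start: register the record. */
void
irec_insert(struct irec *rec)
{
    rec->next = irec_list;
    irec_list = rec;
}

/* Fop completion ("cbk" stage): unregister the record. */
void
irec_remove(struct irec *rec)
{
    struct irec **pp;

    for (pp = &irec_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == rec) {
            *pp = rec->next;
            rec->next = NULL;
            return;
        }
    }
}

/*
 * An INTERRUPT request arrived for `unique`: look the record up and call
 * its handler. Returns 1 if a handler was called, 0 if no such fop is
 * (or is any longer) in the list.
 */
int
irec_dispatch_interrupt(uint64_t unique)
{
    struct irec *rec;

    for (rec = irec_list; rec != NULL; rec = rec->next) {
        if (rec->unique == unique) {
            rec->handler(rec);
            return 1;
        }
    }
    return 0;
}

/* Example handler: counts its invocations through the shared data. */
void
count_handler(struct irec *rec)
{
    (*(int *)rec->data)++;
}
```

An interrupt whose unique matches no record is simply dropped in this model, which is also what happens once the fop has removed its record.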
Usually the fop needs to share some data with the interrupt handler to enable it to perform its task (also shared via the interrupt record). The interrupt API offers two approaches to manage shared data:
- Async or reference-counting strategy: from the point when the interrupt record is inserted into the list, it is owned jointly by the regular fop and the prospective interrupt handler. Before returning, each of them needs to check whether the other is still holding a reference; if not, it is responsible for reclaiming the shared data.
- Sync or borrow strategy: the interrupt handler is considered a borrower of the shared data. The interrupt handler should not reclaim the shared data. The fop will wait for the interrupt handler to finish (i.e. for the borrow to be returned), then it has to reclaim the shared data.
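The difference between the two strategies can be illustrated with a small ownership model. It is a sketch under simplifying assumptions: `struct shared_rec` and the `finish_*()` helpers are hypothetical, the code is single-threaded (real code would have to protect the counter with a lock), and where the sync variant returns false a real fop would block until the handler is done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical record shared by a fop and its interrupt handler. */
struct shared_rec {
    int holders; /* parties still using the record */
    void *data;  /* shared data, to be reclaimed by exactly one party */
};

struct shared_rec *
shared_rec_new(void *data)
{
    struct shared_rec *rec = malloc(sizeof(*rec));

    if (rec == NULL)
        return NULL;
    rec->holders = 2; /* the fop and the interrupt handler */
    rec->data = data;
    return rec;
}

/*
 * Async (reference-counting) strategy: both parties call this when done.
 * Whoever leaves last gets the data back and must reclaim it; the one
 * leaving first gets NULL and must not touch the data any more.
 */
void *
finish_async(struct shared_rec *rec)
{
    void *data = NULL;

    if (--rec->holders == 0) {
        data = rec->data;
        free(rec);
    }
    return data;
}

/* Sync (borrow) strategy, handler side: return the borrow, reclaim nothing. */
void
finish_sync_handler(struct shared_rec *rec)
{
    rec->holders--;
}

/*
 * Sync (borrow) strategy, fop side: reclaiming is only allowed after the
 * borrow was returned. Returns false while the handler is still borrowing
 * (a real implementation would wait here instead of returning).
 */
bool
finish_sync_fop(struct shared_rec *rec, void **datap)
{
    if (rec->holders > 1)
        return false;
    *datap = rec->data;
    free(rec);
    return true;
}
```

In the async case the cleanup code has to exist on both sides, since either party can be the last one out; in the sync case it lives only in the fop.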
The user of the interrupt API needs to call the following functions to instrument this control flow:
- fuse_interrupt_record_insert() in the fop to insert the interrupt record into the list;
- fuse_interrupt_finish_fop() in the fop (cbk) and
- fuse_interrupt_finish_interrupt() in the interrupt handler

to perform the needed synchronization at the end of their tenure. The data management
strategies are implemented by the fuse_interrupt_finish_*() functions (which
have an argument to specify which strategy to use); these routines take care
of freeing the interrupt record itself, while the reclamation of the shared data
is left to the API user.
Usage
A given FUSE fop can be enabled to handle interrupts via the following steps:
- Define a handler function (of type fuse_interrupt_handler_t). It should implement the interrupt handling logic and in the end call (directly or as async callback) fuse_interrupt_finish_interrupt(). The intstat argument to fuse_interrupt_finish_interrupt() should be either INTERRUPT_SQUELCHED or INTERRUPT_HANDLED.
    - INTERRUPT_SQUELCHED means that the interrupt could not be delivered and the fop is going on uninterrupted.
    - INTERRUPT_HANDLED means that the interrupt was actually handled. In this case the fop will be answered from interrupt context with errno EINTR (that is, the fop should not send a response to the kernel).

  (The enum fuse_interrupt_state includes further members, which are reserved for internal use.)

  We return to the sync and datap arguments later.
- In the fuse_<FOP> function create an interrupt record using fuse_interrupt_record_new(), passing the incoming fuse_in_header and the above handler function to it.
    - Arbitrary further data that is to be passed on from fop context to interrupt context can be referred to via the data member of the interrupt record.
- When it's set up, pass the interrupt record to fuse_interrupt_record_insert().
- In fuse_<FOP>_cbk call fuse_interrupt_finish_fop().
    - fuse_interrupt_finish_fop() returns a Boolean according to whether the interrupt was handled. If it was, then the FUSE request is already answered and the stack gets destroyed in fuse_interrupt_finish_fop(), so fuse_<FOP>_cbk() can just return (zero). Otherwise follow the standard cbk logic (answer the FUSE request and destroy the stack; these are typically accomplished by fuse_err_cbk()).
- The last two arguments of fuse_interrupt_finish_fop() and fuse_interrupt_finish_interrupt() are gf_boolean_t sync and void **datap.
    - sync represents the strategy for freeing the interrupt record. The interrupt handler and the fop handler are in a race to get at the interrupt record first (the interrupt handler for purposes of doing the interrupt handling, the fop handler for purposes of deactivating the interrupt record upon completion of the fop handling).
        - If sync is true, then the fop handler will wait for the interrupt handler to finish and it takes care of freeing.
        - If sync is false, the loser of the above race will perform freeing.

      Freeing is done within the respective interrupt finish routines, except for the data field of the interrupt record; with respect to that, see the discussion of the datap parameter below. The strategy has to be consensual, that is, fuse_interrupt_finish_fop() and fuse_interrupt_finish_interrupt() must pass the same value for sync. If dismantling the resources associated with the interrupt record is simple, sync = _gf_false is the suggested choice; sync = _gf_true can be useful in the opposite case, when dismantling those resources would be inconvenient to implement in two places or to enact in non-fop context.
    - If datap is NULL, the data member of the interrupt record will be freed within the interrupt finish routine. If it points to a valid void * pointer, and if the caller is doing the cleanup (see sync above), then that pointer will be directed to the data member of the interrupt record and it's up to the caller what to do with it.
        - If sync is true, the interrupt handler can use datap = NULL, and the fop handler will have datap point to a valid pointer.
        - If sync is false, and the handlers pass a pointer to a pointer for datap, they should check whether the pointed-to pointer is NULL before attempting to deal with the data.
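The steps above can be walked through on a toy model. Everything in this sketch is an assumption made for illustration: the `mock_*()` functions only mirror the shape of the corresponding `fuse_interrupt_*()` calls (they take no `xlator_t` or `call_frame_t`, keep a one-slot "list" and run single-threaded), and the fop `foo` with its `struct foo_ctx` is hypothetical. The mock behaves like the `sync = true` case: the handler never frees anything and the fop side reclaims the data through `datap`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* --- Mock API: mirrors the shape of fuse_interrupt_*(), nothing more. --- */
typedef enum { MOCK_SQUELCHED, MOCK_HANDLED } mock_state_t;
typedef struct mock_record mock_record_t;
typedef void (*mock_handler_t)(mock_record_t *);

struct mock_record {
    uint64_t unique; /* taken from the incoming FUSE header */
    mock_handler_t handler;
    void *data;   /* payload passed from fop to interrupt context */
    bool handled; /* handler finished with MOCK_HANDLED */
};

static mock_record_t *mock_list; /* one-slot "list", enough for the demo */

mock_record_t *
mock_record_new(uint64_t unique, mock_handler_t handler)
{
    mock_record_t *rec = calloc(1, sizeof(*rec));

    if (rec != NULL) {
        rec->unique = unique;
        rec->handler = handler;
    }
    return rec;
}

void
mock_record_insert(mock_record_t *rec)
{
    mock_list = rec;
}

/* Handler side: only records the verdict, frees nothing. */
void
mock_finish_interrupt(mock_record_t *rec, mock_state_t st)
{
    rec->handled = (st == MOCK_HANDLED);
}

/* Fop side: true means "interrupt handled, request already answered". */
bool
mock_finish_fop(mock_record_t *rec, void **datap)
{
    bool handled = rec->handled;

    if (mock_list == rec)
        mock_list = NULL;
    *datap = rec->data; /* reclaiming the payload is the caller's job */
    free(rec);
    return handled;
}

/* Deliver an INTERRUPT request for `unique`. */
void
mock_interrupt(uint64_t unique)
{
    if (mock_list != NULL && mock_list->unique == unique)
        mock_list->handler(mock_list);
}

/* --- Hypothetical fop "foo", wired up following the steps. --- */
struct foo_ctx {
    bool cancel_requested;
};

/* Step 1: the handler does its fop-specific work, then finishes. */
void
foo_interrupt_handler(mock_record_t *rec)
{
    struct foo_ctx *ctx = rec->data;

    ctx->cancel_requested = true;
    mock_finish_interrupt(rec, MOCK_HANDLED);
}

/* Steps 2 and 3: create the record, attach the data, insert it. */
mock_record_t *
foo_start(uint64_t unique)
{
    mock_record_t *rec = mock_record_new(unique, foo_interrupt_handler);

    if (rec == NULL)
        return NULL;
    rec->data = calloc(1, sizeof(struct foo_ctx));
    if (rec->data == NULL) {
        free(rec);
        return NULL;
    }
    mock_record_insert(rec);
    return rec;
}

/* Step 4: the cbk. Returns true if it still has to send the reply itself. */
bool
foo_cbk(mock_record_t *rec)
{
    void *data = NULL;
    bool handled = mock_finish_fop(rec, &data);

    free(data);
    return !handled;
}
```

Without an interrupt, `foo_cbk()` reports that the reply is still due; after `mock_interrupt()` hit the record, it reports that nothing is left to send.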
FUSE answer for the interrupted fop
The kernel acknowledges a successful interruption for a given FUSE request if the filesystem daemon answers it with errno EINTR; upon that, the syscall which induced the request will be abruptly terminated with an interrupt, rather than returning a value.
In glusterfs, this can be arranged in two ways.
- If the interrupt handler wins the race for the interrupt record, i.e. fuse_interrupt_finish_fop() returns true to fuse_<FOP>_cbk(), then, as said above, fuse_<FOP>_cbk() does not need to answer the FUSE request. That's because in this case the interrupt handler will take care of answering it (with errno EINTR).
- If fuse_interrupt_finish_fop() returns false to fuse_<FOP>_cbk(), then this return value does not inform the fop handler whether there was an interrupt or not. This return value occurs both when the fop handler won the race for the interrupt record against the interrupt handler, and when there was no interrupt at all.

  However, the internal logic of the fop handler might detect from other circumstances that an interrupt was delivered. For example, the fop handler might be sleeping, waiting for some data to arrive, so that a premature wakeup (with no data present) occurs if the interrupt handler intervenes. In such cases it's the responsibility of the fop handler to reply to the FUSE request with errno EINTR.
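The two cases boil down to a three-way decision in the cbk, sketched below. `choose_reply()`, `enum reply_action` and the `woke_without_data` flag are hypothetical: the flag stands for whatever fop-specific evidence (such as the premature wakeup mentioned above) tells the fop handler that an interrupt was delivered.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* What the cbk of the fop has to do about the FUSE reply. */
enum reply_action {
    REPLY_NONE,  /* already answered with EINTR from interrupt context */
    REPLY_EINTR, /* the cbk itself must answer with EINTR */
    REPLY_RESULT /* ordinary reply carrying the fop's own result */
};

/*
 * finish_fop_ret:    the value returned by fuse_interrupt_finish_fop()
 * woke_without_data: hypothetical fop-specific hint that an interrupt
 *                    was delivered although finish_fop returned false
 * op_errno:          the errno the fop would report otherwise
 * reply_errno:       set to the errno to send, if a reply is due
 */
enum reply_action
choose_reply(bool finish_fop_ret, bool woke_without_data, int op_errno,
             int *reply_errno)
{
    if (finish_fop_ret)
        return REPLY_NONE;

    if (woke_without_data) {
        *reply_errno = EINTR;
        return REPLY_EINTR;
    }

    *reply_errno = op_errno;
    return REPLY_RESULT;
}
```

Note that a false return from `fuse_interrupt_finish_fop()` alone never selects `REPLY_EINTR`; only the fop's own evidence does.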