- Explicit volatile write back cache control
- ==========================================
- Introduction
- ------------
- Many storage devices, especially in the consumer market, come with volatile
- write back caches. That means the devices signal I/O completion to the
- operating system before the data has actually reached non-volatile storage.
- This behavior speeds up many workloads, but it means the operating
- system needs to force data out to non-volatile storage when it performs
- a data integrity operation such as fsync, sync or an unmount.
- The Linux block layer provides two simple mechanisms that let filesystems
- control the caching behavior of the storage device. These mechanisms are
- a forced cache flush and the Force Unit Access (FUA) flag for requests.
- Explicit cache flushes
- ----------------------
- The REQ_FLUSH flag can be ORed into the r/w flags of a bio submitted from
- the filesystem and will make sure the volatile cache of the storage device
- has been flushed before the actual I/O operation is started. This explicitly
- guarantees that previously completed write requests are on non-volatile
- storage before the flagged bio starts. In addition the REQ_FLUSH flag can be
- set on an otherwise empty bio structure, which causes only an explicit cache
- flush without any dependent I/O. It is recommended to use
- the blkdev_issue_flush() helper for a pure cache flush.
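- The ordering guarantee above can be illustrated with a small userspace
- model. This is only a sketch and not kernel code: the model_* functions,
- struct model_dev and the MODEL_REQ_FLUSH value are hypothetical names
- invented for the illustration, and the device is assumed to process
- requests synchronously.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value and size; these are NOT the kernel's definitions. */
#define MODEL_REQ_FLUSH	0x1u
#define MODEL_NBLOCKS	8

struct model_dev {
	int  stable[MODEL_NBLOCKS];	/* contents of non-volatile storage */
	int  cached[MODEL_NBLOCKS];	/* contents of the volatile cache */
	bool dirty[MODEL_NBLOCKS];	/* cached block not yet on stable storage */
};

/* Move every dirty cached block to non-volatile storage. */
static void model_flush_cache(struct model_dev *dev)
{
	for (int i = 0; i < MODEL_NBLOCKS; i++) {
		if (dev->dirty[i]) {
			dev->stable[i] = dev->cached[i];
			dev->dirty[i] = false;
		}
	}
}

/*
 * Submit a write. With MODEL_REQ_FLUSH the cache is flushed *before* the
 * write starts, so previously completed writes become durable, while the
 * flagged write itself may still sit in the cache afterwards.
 * has_data == false models an empty bio, i.e. a pure cache flush.
 */
static void model_submit_write(struct model_dev *dev, unsigned int flags,
			       bool has_data, int block, int value)
{
	if (flags & MODEL_REQ_FLUSH)
		model_flush_cache(dev);
	if (!has_data)
		return;
	assert(block >= 0 && block < MODEL_NBLOCKS);
	dev->cached[block] = value;
	dev->dirty[block] = true;
}

/* What would survive a power loss at this instant. */
static int model_stable_value(const struct model_dev *dev, int block)
{
	return dev->stable[block];
}
```

- In this model a plain write followed by a flagged write leaves the first
- write on stable storage, while the flagged write only becomes durable
- after a later flush.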
- Forced Unit Access
- ------------------
- The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
- filesystem and will make sure that I/O completion for this request is only
- signaled after the data has been committed to non-volatile storage.
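- A minimal userspace sketch of this per-request guarantee follows. It is
- not kernel code: struct model_dev, the model_* functions and the
- MODEL_REQ_FUA value are hypothetical, and the sketch assumes that FUA
- affects only the flagged request and leaves other cached data alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value and size; these are NOT the kernel's definitions. */
#define MODEL_REQ_FUA	0x2u
#define MODEL_NBLOCKS	8

struct model_dev {
	int  stable[MODEL_NBLOCKS];	/* contents of non-volatile storage */
	int  cached[MODEL_NBLOCKS];	/* contents of the volatile cache */
	bool dirty[MODEL_NBLOCKS];	/* cached block not yet on stable storage */
};

/*
 * Complete a write. A MODEL_REQ_FUA write is committed to stable storage
 * before it completes; any other write only lands in the volatile cache.
 */
static void model_write(struct model_dev *dev, unsigned int flags,
			int block, int value)
{
	assert(block >= 0 && block < MODEL_NBLOCKS);
	dev->cached[block] = value;
	if (flags & MODEL_REQ_FUA) {
		dev->stable[block] = value;
		dev->dirty[block] = false;
	} else {
		dev->dirty[block] = true;
	}
}

/* True if 'value' for 'block' would survive a power loss right now. */
static bool model_is_durable(const struct model_dev *dev, int block, int value)
{
	return !dev->dirty[block] && dev->stable[block] == value;
}
```

- Note that in this model an earlier unflagged write to a different block
- stays volatile even after a later FUA write completes.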
- Implementation details for filesystems
- --------------------------------------
- Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to
- worry about whether the underlying devices need explicit cache flushing or
- how Forced Unit Access is implemented. The REQ_FLUSH and REQ_FUA flags
- may both be set on a single bio.
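- One way a filesystem might combine both flags is sketched below as a
- userspace model: a batch of ordinary writes followed by one final write
- that carries both flags. The helper model_write_batch_then_commit(), the
- other model_* names and the MODEL_REQ_* values are hypothetical and only
- serve the illustration; real filesystems submit bios instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values and size; these are NOT the kernel's definitions. */
#define MODEL_REQ_FLUSH	0x1u
#define MODEL_REQ_FUA	0x2u
#define MODEL_NBLOCKS	8

struct model_dev {
	int  stable[MODEL_NBLOCKS];	/* contents of non-volatile storage */
	int  cached[MODEL_NBLOCKS];	/* contents of the volatile cache */
	bool dirty[MODEL_NBLOCKS];	/* cached block not yet on stable storage */
};

/* Apply one write: optional flush first, then the data, optionally FUA. */
static void model_submit(struct model_dev *dev, unsigned int flags,
			 int block, int value)
{
	assert(block >= 0 && block < MODEL_NBLOCKS);

	if (flags & MODEL_REQ_FLUSH) {
		for (int i = 0; i < MODEL_NBLOCKS; i++) {
			if (dev->dirty[i]) {
				dev->stable[i] = dev->cached[i];
				dev->dirty[i] = false;
			}
		}
	}

	dev->cached[block] = value;
	if (flags & MODEL_REQ_FUA) {
		dev->stable[block] = value;
		dev->dirty[block] = false;
	} else {
		dev->dirty[block] = true;
	}
}

/* Number of blocks that would be lost on a power failure right now. */
static int model_count_dirty(const struct model_dev *dev)
{
	int n = 0;

	for (int i = 0; i < MODEL_NBLOCKS; i++)
		n += dev->dirty[i] ? 1 : 0;
	return n;
}

/*
 * Write n payload blocks (blocks 0..n-1) without flags, then one final
 * block with both flags: the flush makes the payload durable, and FUA
 * makes the final block itself durable before it completes.
 */
static void model_write_batch_then_commit(struct model_dev *dev,
					  const int *values, int n,
					  int commit_block, int commit_value)
{
	assert(n >= 0 && n <= commit_block);
	for (int i = 0; i < n; i++)
		model_submit(dev, 0, i, values[i]);
	model_submit(dev, MODEL_REQ_FLUSH | MODEL_REQ_FUA,
		     commit_block, commit_value);
}
```

- After the final write completes in this model, no block is left dirty,
- so a single flagged bio is enough to make the whole batch durable.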
- Implementation details for make_request_fn based block drivers
- --------------------------------------------------------------
- These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit
- directly below the submit_bio interface. For remapping drivers the REQ_FUA
- bit needs to be propagated to the underlying devices, and a global flush
- needs to be implemented for bios with the REQ_FLUSH bit set. For real device
- drivers whose devices do not have a volatile cache the REQ_FLUSH and REQ_FUA
- bits on non-empty bios can simply be ignored, and REQ_FLUSH requests without
- data can be completed successfully without doing any work. Drivers for
- devices with volatile caches need to implement the support for these
- flags themselves without any help from the block layer.
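- The remapping case can be sketched with a userspace model of a driver
- that stripes sectors over two underlying devices. Everything here is
- hypothetical (struct model_remapper, the model_* functions, the
- MODEL_REQ_* values and the modulo striping rule), and requests are
- assumed to complete synchronously, which a real driver cannot rely on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values and layout; these are NOT kernel definitions. */
#define MODEL_REQ_FLUSH	0x1u
#define MODEL_REQ_FUA	0x2u
#define MODEL_NCHILDREN	2

/* Counters recording what each underlying device was asked to do. */
struct model_child {
	int flushes_seen;
	int fua_writes_seen;
	int plain_writes_seen;
};

struct model_remapper {
	struct model_child child[MODEL_NCHILDREN];
};

/* Deliver one request to an underlying device. */
static void model_child_submit(struct model_child *c, unsigned int flags,
			       bool has_data)
{
	if (flags & MODEL_REQ_FLUSH)
		c->flushes_seen++;
	if (!has_data)
		return;
	if (flags & MODEL_REQ_FUA)
		c->fua_writes_seen++;
	else
		c->plain_writes_seen++;
}

/*
 * Remap one request. A flush is global: every underlying device gets an
 * empty flush, because earlier writes may have gone to any of them. The
 * data, if any, goes to the device owning the sector with FUA preserved.
 */
static void model_remap_submit(struct model_remapper *r, unsigned int flags,
			       bool has_data, int sector)
{
	if (flags & MODEL_REQ_FLUSH) {
		for (int i = 0; i < MODEL_NCHILDREN; i++)
			model_child_submit(&r->child[i], MODEL_REQ_FLUSH, false);
	}
	if (has_data) {
		assert(sector >= 0);
		model_child_submit(&r->child[sector % MODEL_NCHILDREN],
				   flags & MODEL_REQ_FUA, true);
	}
}
```

- The point of the sketch is the asymmetry: REQ_FUA follows the data to
- one device, while REQ_FLUSH has to reach all of them.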
- Implementation details for request_fn based block drivers
- ---------------------------------------------------------
- For devices that do not support volatile write caches no driver support
- is required: the block layer completes empty REQ_FLUSH requests before
- entering the driver and strips the REQ_FLUSH and REQ_FUA bits from
- requests that have a payload. For devices with volatile write caches the
- driver needs to tell the block layer that it supports flushing caches by
- doing:
- 	blk_queue_flush(sdkp->disk->queue, REQ_FLUSH);
- and handle empty REQ_FLUSH requests in its prep_fn/request_fn. Note that
- REQ_FLUSH requests with a payload are automatically turned into a sequence
- of an empty REQ_FLUSH request followed by the actual write by the block
- layer. For devices that also support the FUA bit the block layer needs
- to be told to pass through the REQ_FUA bit using:
- 	blk_queue_flush(sdkp->disk->queue, REQ_FLUSH | REQ_FUA);
- and the driver must handle write requests that have the REQ_FUA bit set
- in prep_fn/request_fn. If the FUA bit is not natively supported, the block
- layer turns it into an empty REQ_FLUSH request after the actual write.
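- The block layer behavior described in this section can be summarized as
- a small decision function. The following is a userspace sketch, not the
- block layer's actual code: model_sequence(), enum model_step and the
- MODEL_REQ_* values are hypothetical names, and queue_caps stands for
- whatever the driver passed to blk_queue_flush().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; these are NOT the kernel's definitions. */
#define MODEL_REQ_FLUSH	0x1u
#define MODEL_REQ_FUA	0x2u
#define MODEL_MAX_STEPS	3

/* Requests the driver may see for one incoming request. */
enum model_step {
	MODEL_STEP_PREFLUSH,	/* empty flush before the write */
	MODEL_STEP_WRITE,	/* write without FUA */
	MODEL_STEP_WRITE_FUA,	/* write with the FUA bit passed through */
	MODEL_STEP_POSTFLUSH,	/* empty flush emulating FUA */
};

/*
 * Fill 'steps' with what reaches the driver and return the step count.
 * queue_caps: 0, MODEL_REQ_FLUSH, or MODEL_REQ_FLUSH | MODEL_REQ_FUA.
 */
static int model_sequence(unsigned int queue_caps, unsigned int req_flags,
			  bool has_data, enum model_step steps[MODEL_MAX_STEPS])
{
	bool want_fua = (req_flags & MODEL_REQ_FUA) != 0;
	int n = 0;

	/* This model assumes FUA is only advertised together with FLUSH. */
	assert(!(queue_caps & MODEL_REQ_FUA) || (queue_caps & MODEL_REQ_FLUSH));

	if (!(queue_caps & MODEL_REQ_FLUSH)) {
		/* No volatile cache: flags are stripped, empty flushes vanish. */
		if (has_data)
			steps[n++] = MODEL_STEP_WRITE;
		return n;
	}

	if (req_flags & MODEL_REQ_FLUSH)
		steps[n++] = MODEL_STEP_PREFLUSH;

	if (has_data) {
		if (want_fua && (queue_caps & MODEL_REQ_FUA)) {
			steps[n++] = MODEL_STEP_WRITE_FUA;
		} else {
			steps[n++] = MODEL_STEP_WRITE;
			if (want_fua)
				steps[n++] = MODEL_STEP_POSTFLUSH;
		}
	}
	return n;
}
```

- For example, with only MODEL_REQ_FLUSH advertised, a write carrying both
- flags becomes three driver-visible steps in this model: an empty flush,
- the write, and a second empty flush.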