1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283 |
- Overview:
- Zswap is a lightweight compressed cache for swap pages. It takes
- pages that are in the process of being swapped out and attempts to
- compress them into a dynamically allocated RAM-based memory pool.
- If this process is successful, the writeback to the swap device is
- deferred and, in many cases, avoided completely. This results in
- a significant I/O reduction and performance gains for systems that
- are swapping.
- Zswap provides compressed swap caching that basically trades CPU cycles
- for reduced swap I/O. This trade-off can result in a significant
- performance improvement as reads to/writes from to the compressed
- cache almost always faster that reading from a swap device
- which incurs the latency of an asynchronous block I/O read.
- Some potential benefits:
- * Desktop/laptop users with limited RAM capacities can mitigate the
- performance impact of swapping.
- * Overcommitted guests that share a common I/O resource can
- dramatically reduce their swap I/O pressure, avoiding heavy
- handed I/O throttling by the hypervisor. This allows more work
- to get done with less impact to the guest workload and guests
- sharing the I/O subsystem
- * Users with SSDs as swap devices can extend the life of the device by
- drastically reducing life-shortening writes.
- Zswap evicts pages from compressed cache on an LRU basis to the backing
- swap device when the compress pool reaches it size limit or the pool is
- unable to obtain additional pages from the buddy allocator. This
- requirement had been identified in prior community discussions.
- To enabled zswap, the "enabled" attribute must be set to 1 at boot time.
- e.g. zswap.enabled=1
- Design:
- Zswap receives pages for compression through the Frontswap API and
- is able to evict pages from its own compressed pool on an LRU basis
- and write them back to the backing swap device in the case that the
- compressed pool is full or unable to secure additional pages from
- the buddy allocator.
- Zswap makes use of zsmalloc for the managing the compressed memory
- pool. This is because zsmalloc is specifically designed to minimize
- fragmentation on large (> PAGE_SIZE/2) allocation sizes. Each
- allocation in zsmalloc is not directly accessible by address.
- Rather, a handle is return by the allocation routine and that handle
- must be mapped before being accessed. The compressed memory pool grows
- on demand and shrinks as compressed pages are freed. The pool is
- not preallocated.
- When a swap page is passed from frontswap to zswap, zswap maintains
- a mapping of the swap entry, a combination of the swap type and swap
- offset, to the zsmalloc handle that references that compressed swap
- page. This mapping is achieved with a red-black tree per swap type.
- The swap offset is the search key for the tree nodes.
- During a page fault on a PTE that is a swap entry, frontswap calls
- the zswap load function to decompress the page into the page
- allocated by the page fault handler.
- Once there are no PTEs referencing a swap page stored in zswap
- (i.e. the count in the swap_map goes to 0) the swap code calls
- the zswap invalidate function, via frontswap, to free the compressed
- entry.
- Zswap seeks to be simple in its policies. Sysfs attributes allow for
- two user controlled policies:
- * max_compression_ratio - Maximum compression ratio, as as percentage,
- for an acceptable compressed page. Any page that does not compress
- by at least this ratio will be rejected.
- * max_pool_percent - The maximum percentage of memory that the compressed
- pool can occupy.
- Zswap allows the compressor to be selected at kernel boot time by
- setting the “compressor” attribute. The default compressor is lzo.
- e.g. zswap.compressor=deflate
- A debugfs interface is provided for various statistic about pool size,
- number of pages stored, and various counters for the reasons pages
- are rejected.
|