History log of /drivers/block/zram/zram_drv.h
Revision Date Author Comments
461a8eee6af3b55745be64bea403ed0b743563cf 10-Oct-2014 Minchan Kim <minchan@kernel.org> zram: report maximum used memory

Normally, zram user could get maximum memory usage zram consumed via
polling mem_used_total with sysfs in userspace.

But it has a critical problem because user can miss peak memory usage
during update inverval of polling. For avoiding that, user should poll it
with shorter interval(ie, 0.0000000001s) with mlocking to avoid page fault
delay when memory pressure is heavy. It would be troublesome.

This patch adds new knob "mem_used_max" so user could see the maximum
memory usage easily via reading the knob and reset it via "echo 0 >
/sys/block/zram0/mem_used_max".

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Dan Streetman <ddstreet@ieee.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: <juno.choi@lge.com>
Cc: <seungho1.park@lge.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Reviewed-by: David Horner <ds2horner@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9ada9da9573f3460b156b7755c093e30b258eacb 10-Oct-2014 Minchan Kim <minchan@kernel.org> zram: zram memory size limitation

Since zram has no control feature to limit memory usage, it makes hard to
manage system memrory.

This patch adds new knob "mem_limit" via sysfs to set up the a limit so
that zram could fail allocation once it reaches the limit.

In addition, user could change the limit in runtime so that he could
manage the memory more dynamically.

Initial state is no limit so it doesn't break old behavior.

[akpm@linux-foundation.org: fix typo, per Sergey]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: <juno.choi@lge.com>
Cc: <seungho1.park@lge.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: David Horner <ds2horner@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0cf1e9d6c34d4c82ac3af8015594849814843d36 30-Aug-2014 Chao Yu <chao2.yu@samsung.com> zram: fix incorrect stat with failed_reads

Since we allocate a temporary buffer in zram_bvec_read to handle partial
page operations in commit 924bd88d703e ("Staging: zram: allow partial
page operations"), our ->failed_reads value may be incorrect as we do
not increase its value when failing to allocate the temporary buffer.

Let's fix this issue and correct the annotation of failed_reads.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
d2d5e762c8990c4031890e03565983a05febd64a 07-Aug-2014 Weijie Yang <weijie.yang@samsung.com> zram: replace global tb_lock with fine grain lock

Currently, we use a rwlock tb_lock to protect concurrent access to the
whole zram meta table. However, according to the actual access model,
there is only a small chance for upper user to access the same
table[index], so the current lock granularity is too big.

The idea of optimization is to change the lock granularity from whole
meta table to per table entry (table -> table[index]), so that we can
protect concurrent access to the same table[index], meanwhile allow the
maximum concurrency.

With this in mind, several kinds of locks which could be used as a
per-entry lock were tested and compared:

Test environment:
x86-64 Intel Core2 Q8400, system memory 4GB, Ubuntu 12.04,
kernel v3.15.0-rc3 as base, zram with 4 max_comp_streams LZO.

iozone test:
iozone -t 4 -R -r 16K -s 200M -I +Z
(1GB zram with ext4 filesystem, take the average of 10 tests, KB/s)

Test base CAS spinlock rwlock bit_spinlock
-------------------------------------------------------------------
Initial write 1381094 1425435 1422860 1423075 1421521
Rewrite 1529479 1641199 1668762 1672855 1654910
Read 8468009 11324979 11305569 11117273 10997202
Re-read 8467476 11260914 11248059 11145336 10906486
Reverse Read 6821393 8106334 8282174 8279195 8109186
Stride read 7191093 8994306 9153982 8961224 9004434
Random read 7156353 8957932 9167098 8980465 8940476
Mixed workload 4172747 5680814 5927825 5489578 5972253
Random write 1483044 1605588 1594329 1600453 1596010
Pwrite 1276644 1303108 1311612 1314228 1300960
Pread 4324337 4632869 4618386 4457870 4500166

To enhance the possibility of access the same table[index] concurrently,
set zram a small disksize(10MB) and let threads run with large loop
count.

fio test:
fio --bs=32k --randrepeat=1 --randseed=100 --refill_buffers
--scramble_buffers=1 --direct=1 --loops=3000 --numjobs=4
--filename=/dev/zram0 --name=seq-write --rw=write --stonewall
--name=seq-read --rw=read --stonewall --name=seq-readwrite
--rw=rw --stonewall --name=rand-readwrite --rw=randrw --stonewall
(10MB zram raw block device, take the average of 10 tests, KB/s)

Test base CAS spinlock rwlock bit_spinlock
-------------------------------------------------------------
seq-write 933789 999357 1003298 995961 1001958
seq-read 5634130 6577930 6380861 6243912 6230006
seq-rw 1405687 1638117 1640256 1633903 1634459
rand-rw 1386119 1614664 1617211 1609267 1612471

All the optimization methods show a higher performance than the base,
however, it is hard to say which method is the most appropriate.

On the other hand, zram is mostly used on small embedded system, so we
don't want to increase any memory footprint.

This patch pick the bit_spinlock method, pack object size and page_flag
into an unsigned long table.value, so as to not increase any memory
overhead on both 32-bit and 64-bit system.

On the third hand, even though different kinds of locks have different
performances, we can ignore this difference, because: if zram is used as
zram swapfile, the swap subsystem can prevent concurrent access to the
same swapslot; if zram is used as zram-blk for set up filesystem on it,
the upper filesystem and the page cache also prevent concurrent access
of the same block mostly. So we can ignore the different performances
among locks.

Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
a830eff749eb2bf906783f6bf74a74dad3de3aea 07-Aug-2014 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: remove unused SECTOR_SIZE define

Drop SECTOR_SIZE define, because it's not used.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Weijie Yang <weijie.yang@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cb8f2eec3c5c87e31219c5e58625b8e890004e48 07-Aug-2014 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: rename struct `table' to `zram_table_entry'

Andrew Morton has recently noted that `struct table' actually represents
table entry and, thus, should be renamed. Rename to `zram_table_entry'.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Weijie Yang <weijie.yang@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e46b8a030d76d3c94156c545c3f4c3676d813435 08-Apr-2014 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: make compression algorithm selection possible

Add and document `comp_algorithm' device attribute. This attribute allows
to show supported compression and currently selected compression
algorithms:

cat /sys/block/zram0/comp_algorithm
[lzo] lz4

and change selected compression algorithm:
echo lzo > /sys/block/zram0/comp_algorithm

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
beca3ec71fe5490ee9237dc42400f50402baf83e 08-Apr-2014 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: add multi stream functionality

Existing zram (zcomp) implementation has only one compression stream
(buffer and algorithm private part), so in order to prevent data
corruption only one write (compress operation) can use this compression
stream, forcing all concurrent write operations to wait for stream lock
to be released. This patch changes zcomp to keep a compression streams
list of user-defined size (via sysfs device attr). Each write operation
still exclusively holds compression stream, the difference is that we
can have N write operations (depending on size of streams list)
executing in parallel. See TEST section later in commit message for
performance data.

Introduce struct zcomp_strm_multi and a set of functions to manage
zcomp_strm stream access. zcomp_strm_multi has a list of idle
zcomp_strm structs, spinlock to protect idle list and wait queue, making
it possible to perform parallel compressions.

The following set of functions added:
- zcomp_strm_multi_find()/zcomp_strm_multi_release()
find and release a compression stream, implement required locking
- zcomp_strm_multi_create()/zcomp_strm_multi_destroy()
create and destroy zcomp_strm_multi

zcomp ->strm_find() and ->strm_release() callbacks are set during
initialisation to zcomp_strm_multi_find()/zcomp_strm_multi_release()
correspondingly.

Each time zcomp issues a zcomp_strm_multi_find() call, the following set
of operations performed:

- spin lock strm_lock
- if idle list is not empty, remove zcomp_strm from idle list, spin
unlock and return zcomp stream pointer to caller
- if idle list is empty, current adds itself to wait queue. it will be
awaken by zcomp_strm_multi_release() caller.

zcomp_strm_multi_release():
- spin lock strm_lock
- add zcomp stream to idle list
- spin unlock, wake up sleeper

Minchan Kim reported that spinlock-based locking scheme has demonstrated
a severe perfomance regression for single compression stream case,
comparing to mutex-based (see https://lkml.org/lkml/2014/2/18/16)

base spinlock mutex

==Initial write ==Initial write ==Initial write
records: 5 records: 5 records: 5
avg: 1642424.35 avg: 699610.40 avg: 1655583.71
std: 39890.95(2.43%) std: 232014.19(33.16%) std: 52293.96
max: 1690170.94 max: 1163473.45 max: 1697164.75
min: 1568669.52 min: 573429.88 min: 1553410.23
==Rewrite ==Rewrite ==Rewrite
records: 5 records: 5 records: 5
avg: 1611775.39 avg: 501406.64 avg: 1684419.11
std: 17144.58(1.06%) std: 15354.41(3.06%) std: 18367.42
max: 1641800.95 max: 531356.78 max: 1706445.84
min: 1593515.27 min: 488817.78 min: 1655335.73

When only one compression stream available, mutex with spin on owner
tends to perform much better than frequent wait_event()/wake_up(). This
is why single stream implemented as a special case with mutex locking.

Introduce and document zram device attribute max_comp_streams. This
attr shows and stores current zcomp's max number of zcomp streams
(max_strm). Extend zcomp's zcomp_create() with `max_strm' parameter.
`max_strm' limits the number of zcomp_strm structs in compression
backend's idle list (max_comp_streams).

max_comp_streams used during initialisation as follows:
-- passing to zcomp_create() max_strm equals to 1 will initialise zcomp
using single compression stream zcomp_strm_single (mutex-based locking).
-- passing to zcomp_create() max_strm greater than 1 will initialise zcomp
using multi compression stream zcomp_strm_multi (spinlock-based locking).

default max_comp_streams value is 1, meaning that zram with single stream
will be initialised.

Later patch will introduce configuration knob to change max_comp_streams
on already initialised and used zcomp.

TEST
iozone -t 3 -R -r 16K -s 60M -I +Z

test base 1 strm (mutex) 3 strm (spinlock)
-----------------------------------------------------------------------
Initial write 589286.78 583518.39 718011.05
Rewrite 604837.97 596776.38 1515125.72
Random write 584120.11 595714.58 1388850.25
Pwrite 535731.17 541117.38 739295.27
Fwrite 1418083.88 1478612.72 1484927.06

Usage example:
set max_comp_streams to 4
echo 4 > /sys/block/zram0/max_comp_streams

show current max_comp_streams (default value is 1).
cat /sys/block/zram0/max_comp_streams

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b7ca232ee7e85ed3b18e39eb20a7f458ee1d6047 08-Apr-2014 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: use zcomp compressing backends

Do not perform direct LZO compress/decompress calls, initialise
and use zcomp LZO backend (single compression stream) instead.

[akpm@linux-foundation.org: resolve conflicts with zram-delete-zram_init_device-fix.patch]
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
59fc86a4922f1a1c0f69eac758a7e2b2b138aab4 08-Apr-2014 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: drop not used table `count' member

struct table `count' member is not used.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
90a7806ea9b9f7cb4751859cc2506e2d80e36ef1 08-Apr-2014 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: use atomic64_t for all zram stats

This is a preparation patch for stats code duplication removal.

1) use atomic64_t for `pages_zero' and `pages_stored' zram stats.

2) `compr_size' and `pages_zero' struct zram_stats members did not
follow the existing device attr naming scheme: zram_stats.ATTR has
ATTR_show() function. rename them:

-- compr_size -> compr_data_size
-- pages_zero -> zero_pages

Minchan Kim's note:
If we really have trouble with atomic stat operation, we could
change it with percpu_counter so that it could solve atomic overhead and
unnecessary memory space by introducing unsigned long instead of 64bit
atomic_t.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b7cccf8b4009bf74df61f3c9d86b95fabd807c11 08-Apr-2014 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: remove good and bad compress stats

Remove `good' and `bad' compressed sub-requests stats. RW request may
cause a number of RW sub-requests. zram used to account `good' compressed
sub-queries (with compressed size less than 50% of original size), `bad'
compressed sub-queries (with compressed size greater that 75% of original
size), leaving sub-requests with compression size between 50% and 75% of
original size not accounted and not reported. zram already accounts each
sub-request's compression size so we can calculate real device compression
ratio.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
be2d1d56c82d8cf20e6c77515eb499f8e86eb5be 08-Apr-2014 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: drop `init_done' struct zram member

Introduce init_done() helper function which allows us to drop `init_done'
struct zram member. init_done() uses the fact that ->init_done == 1
equals to ->meta != NULL.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e46e33152eb82b8e2db7ffb3790a2a2653c34513 31-Jan-2014 Minchan Kim <minchan@kernel.org> zram: remove zram->lock in read path and change it with mutex

Finally, we separated zram->lock dependency from 32bit stat/ table
handling so there is no reason to use rw_semaphore between read and
write path so this patch removes the lock from read path totally and
changes rw_semaphore with mutex. So, we could do

old:

read-read: OK
read-write: NO
write-write: NO

Now:

read-read: OK
read-write: OK
write-write: NO

The below data proves mixed workload performs well 11 times and there is
also enhance on write-write path because current rw-semaphore doesn't
support SPIN_ON_OWNER. It's side effect but anyway good thing for us.

Write-related tests perform better (from 61% to 1058%) but read path has
good/bad(from -2.22% to 1.45%) but they are all marginal within stddev.

CPU 12
iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0

==Initial write ==Initial write
records: 10 records: 10
avg: 516189.16 avg: 839907.96
std: 22486.53 (4.36%) std: 47902.17 (5.70%)
max: 546970.60 max: 909910.35
min: 481131.54 min: 751148.38
==Rewrite ==Rewrite
records: 10 records: 10
avg: 509527.98 avg: 1050156.37
std: 45799.94 (8.99%) std: 40695.44 (3.88%)
max: 611574.27 max: 1111929.26
min: 443679.95 min: 980409.62
==Read ==Read
records: 10 records: 10
avg: 4408624.17 avg: 4472546.76
std: 281152.61 (6.38%) std: 163662.78 (3.66%)
max: 4867888.66 max: 4727351.03
min: 4058347.69 min: 4126520.88
==Re-read ==Re-read
records: 10 records: 10
avg: 4462147.53 avg: 4363257.75
std: 283546.11 (6.35%) std: 247292.63 (5.67%)
max: 4912894.44 max: 4677241.75
min: 4131386.50 min: 4035235.84
==Reverse Read ==Reverse Read
records: 10 records: 10
avg: 4565865.97 avg: 4485818.08
std: 313395.63 (6.86%) std: 248470.10 (5.54%)
max: 5232749.16 max: 4789749.94
min: 4185809.62 min: 3963081.34
==Stride read ==Stride read
records: 10 records: 10
avg: 4515981.80 avg: 4418806.01
std: 211192.32 (4.68%) std: 212837.97 (4.82%)
max: 4889287.28 max: 4686967.22
min: 4210362.00 min: 4083041.84
==Random read ==Random read
records: 10 records: 10
avg: 4410525.23 avg: 4387093.18
std: 236693.22 (5.37%) std: 235285.23 (5.36%)
max: 4713698.47 max: 4669760.62
min: 4057163.62 min: 3952002.16
==Mixed workload ==Mixed workload
records: 10 records: 10
avg: 243234.25 avg: 2818677.27
std: 28505.07 (11.72%) std: 195569.70 (6.94%)
max: 288905.23 max: 3126478.11
min: 212473.16 min: 2484150.69
==Random write ==Random write
records: 10 records: 10
avg: 555887.07 avg: 1053057.79
std: 70841.98 (12.74%) std: 35195.36 (3.34%)
max: 683188.28 max: 1096125.73
min: 437299.57 min: 992481.93
==Pwrite ==Pwrite
records: 10 records: 10
avg: 501745.93 avg: 810363.09
std: 16373.54 (3.26%) std: 19245.01 (2.37%)
max: 518724.52 max: 833359.70
min: 464208.73 min: 765501.87
==Pread ==Pread
records: 10 records: 10
avg: 4539894.60 avg: 4457680.58
std: 197094.66 (4.34%) std: 188965.60 (4.24%)
max: 4877170.38 max: 4689905.53
min: 4226326.03 min: 4095739.72

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
f614a9f48dedd2b80d1dc8bae8094842fcdb39dd 31-Jan-2014 Minchan Kim <minchan@kernel.org> zram: remove workqueue for freeing removed pending slot

Commit a0c516cbfc74 ("zram: don't grab mutex in zram_slot_free_noity")
introduced free request pending code to avoid scheduling by mutex under
spinlock and it was a mess which made code lenghty and increased
overhead.

Now, we don't need zram->lock any more to free slot so this patch
reverts it and then, tb_lock should protect it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
92967471b67163bb1654e9b7fe99449ab70a4aaa 31-Jan-2014 Minchan Kim <minchan@kernel.org> zram: introduce zram->tb_lock

Currently, the zram table is protected by zram->lock but it's rather
coarse-grained lock and it makes hard for scalibility.

Let's use own rwlock instead of depending on zram->lock. This patch
adds new locking so obviously, it would make slow but this patch is just
prepartion for removing coarse-grained rw_semaphore(ie, zram->lock)
which is hurdle about zram scalability.

Final patch in this patchset series will remove the lock from read-path
and change rw_semaphore with mutex in write path. With bonus, we could
drop pending slot free mess in next patch.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
deb0bdeb2f3d6b81d37fc778316dae46b6daab56 31-Jan-2014 Minchan Kim <minchan@kernel.org> zram: use atomic operation for stat

Some of fields in zram->stats are protected by zram->lock which is
rather coarse-grained so let's use atomic operation without explict
locking.

This patch is ready for removing dependency of zram->lock in read path
which is very coarse-grained rw_semaphore. Of course, this patch adds
new atomic operation so it might make slow but my 12CPU test couldn't
spot any regression. All gain/lose is marginal within stddev.

iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0

==Initial write ==Initial write
records: 50 records: 50
avg: 412875.17 avg: 415638.23
std: 38543.12 (9.34%) std: 36601.11 (8.81%)
max: 521262.03 max: 502976.72
min: 343263.13 min: 351389.12
==Rewrite ==Rewrite
records: 50 records: 50
avg: 416640.34 avg: 397914.33
std: 60798.92 (14.59%) std: 46150.42 (11.60%)
max: 543057.07 max: 522669.17
min: 304071.67 min: 316588.77
==Read ==Read
records: 50 records: 50
avg: 4147338.63 avg: 4070736.51
std: 179333.25 (4.32%) std: 223499.89 (5.49%)
max: 4459295.28 max: 4539514.44
min: 3753057.53 min: 3444686.31
==Re-read ==Re-read
records: 50 records: 50
avg: 4096706.71 avg: 4117218.57
std: 229735.04 (5.61%) std: 171676.25 (4.17%)
max: 4430012.09 max: 4459263.94
min: 2987217.80 min: 3666904.28
==Reverse Read ==Reverse Read
records: 50 records: 50
avg: 4062763.83 avg: 4078508.32
std: 186208.46 (4.58%) std: 172684.34 (4.23%)
max: 4401358.78 max: 4424757.22
min: 3381625.00 min: 3679359.94
==Stride read ==Stride read
records: 50 records: 50
avg: 4094933.49 avg: 4082170.22
std: 185710.52 (4.54%) std: 196346.68 (4.81%)
max: 4478241.25 max: 4460060.97
min: 3732593.23 min: 3584125.78
==Random read ==Random read
records: 50 records: 50
avg: 4031070.04 avg: 4074847.49
std: 192065.51 (4.76%) std: 206911.33 (5.08%)
max: 4356931.16 max: 4399442.56
min: 3481619.62 min: 3548372.44
==Mixed workload ==Mixed workload
records: 50 records: 50
avg: 149925.73 avg: 149675.54
std: 7701.26 (5.14%) std: 6902.09 (4.61%)
max: 191301.56 max: 175162.05
min: 133566.28 min: 137762.87
==Random write ==Random write
records: 50 records: 50
avg: 404050.11 avg: 393021.47
std: 58887.57 (14.57%) std: 42813.70 (10.89%)
max: 601798.09 max: 524533.43
min: 325176.99 min: 313255.34
==Pwrite ==Pwrite
records: 50 records: 50
avg: 411217.70 avg: 411237.96
std: 43114.99 (10.48%) std: 33136.29 (8.06%)
max: 530766.79 max: 471899.76
min: 320786.84 min: 317906.94
==Pread ==Pread
records: 50 records: 50
avg: 4154908.65 avg: 4087121.92
std: 151272.08 (3.64%) std: 219505.04 (5.37%)
max: 4459478.12 max: 4435857.38
min: 3730512.41 min: 3101101.67

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7bfb3de8a1b3bebc2dc68d381efe27448c0584c5 31-Jan-2014 Minchan Kim <minchan@kernel.org> zram: add copyright

Add my copyright to the zram source code which I maintain.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
49061236a9c2e18b31617cef10d27ba136068bac 31-Jan-2014 Minchan Kim <minchan@kernel.org> zram: remove old private project comment

Remove the old private compcache project address so upcoming patches
should be sent to LKML because we Linux kernel community will take care.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cd67e10ac6997c6d1e1504e3c111b693bfdbc148 31-Jan-2014 Minchan Kim <minchan@kernel.org> zram: promote zram from staging

Zram has lived in staging for a LONG LONG time and have been
fixed/improved by many contributors so code is clean and stable now. Of
course, there are lots of product using zram in real practice.

The major TV companys have used zram as swap since two years ago and
recently our production team released android smart phone with zram
which is used as swap, too and recently Android Kitkat start to use zram
for small memory smart phone. And there was a report Google released
their ChromeOS with zram, too and cyanogenmod have been used zram long
time ago. And I heard some disto have used zram block device for tmpfs.
In addition, I saw many report from many other peoples. For example,
Lubuntu start to use it.

The benefit of zram is very clear. With my experience, one of the
benefit was to remove jitter of video application with backgroud memory
pressure. It would be effect of efficient memory usage by compression
but more issue is whether swap is there or not in the system. Recent
mobile platforms have used JAVA so there are many anonymous pages. But
embedded system normally are reluctant to use eMMC or SDCard as swap
because there is wear-leveling and latency issues so if we do not use
swap, it means we can't reclaim anoymous pages and at last, we could
encounter OOM kill. :(

Although we have real storage as swap, it was a problem, too. Because
it sometime ends up making system very unresponsible caused by slow swap
storage performance.

Quote from Luigi on Google
"Since Chrome OS was mentioned: the main reason why we don't use swap
to a disk (rotating or SSD) is because it doesn't degrade gracefully
and leads to a bad interactive experience. Generally we prefer to
manage RAM at a higher level, by transparently killing and restarting
processes. But we noticed that zram is fast enough to be competitive
with the latter, and it lets us make more efficient use of the
available RAM. " and he announced.
http://www.spinics.net/lists/linux-mm/msg57717.html

Other uses case is to use zram for block device. Zram is block device
so anyone can format the block device and mount on it so some guys on
the internet start zram as /var/tmp.
http://forums.gentoo.org/viewtopic-t-838198-start-0.html

Let's promote zram and enhance/maintain it instead of removing.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>