History log of /arch/x86/include/asm/percpu.h
Revision Date Author Comments
6fbc07bbe2b5a898532f970c5a397f8789ace0d5 18-Jun-2014 Tejun Heo <tj@kernel.org> percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations

__verify_pcpu_ptr() is used by percpu accessor and operation
implementations to verify that a specified parameter is actually a
percpu pointer. Currently, where it's called isn't clearly defined;
we just ensure that it's invoked at least once for all accessors
and operations.

The lack of clarity on when it should be called isn't nice and given
that this is a completely generic issue, there's no reason to make
archs worry about it.

This patch updates __verify_pcpu_ptr() invocations such that it's
always invoked from the final generic wrapper once per access or
operation. As this is already the case for {raw|this}_cpu_*()
definitions through __pcpu_size_*(), only the {raw|per|this}_cpu_ptr()
accessors need to be updated.

This change makes it unnecessary for archs to worry about
__verify_pcpu_ptr(). x86's arch_raw_cpu_ptr() is updated accordingly.
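
A sketch of the resulting generic accessor (shape per this description,
not verbatim kernel code):

/*
 * raw_cpu_ptr() after this change: the pointer is verified exactly
 * once in the generic wrapper, and the arch hook stays check-free.
 */
#define raw_cpu_ptr(ptr)						\
({									\
	__verify_pcpu_ptr(ptr);						\
	arch_raw_cpu_ptr(ptr);						\
})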

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
bbc344e1e3aef3034a0edc79f7f64a912517926b 18-Jun-2014 Tejun Heo <tj@kernel.org> percpu: introduce arch_raw_cpu_ptr()

Currently, archs can override raw_cpu_ptr() directly; however, we
want to build a layer of indirection in the generic part of percpu so
that we can implement generic features there without affecting archs.

Introduce arch_raw_cpu_ptr() which is used to define raw_cpu_ptr() by
generic percpu code. The two are identical for now. x86 is currently
the only arch which overrides raw_cpu_ptr() and is converted to
define arch_raw_cpu_ptr() instead.

This doesn't introduce any functional difference.
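
A sketch of the new indirection (assuming the pre-existing
SHIFT_PERCPU_PTR()/__my_cpu_offset generic helpers):

/*
 * Archs provide arch_raw_cpu_ptr() if they have something better;
 * otherwise the generic offset-add fallback is used. raw_cpu_ptr()
 * itself is now defined only in generic code.
 */
#ifndef arch_raw_cpu_ptr
#define arch_raw_cpu_ptr(ptr)	SHIFT_PERCPU_PTR(ptr, __my_cpu_offset)
#endif

#define raw_cpu_ptr(ptr)	arch_raw_cpu_ptr(ptr)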

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
b3ca1c10d7b32fdfdfaf5484eda486323f52d9be 08-Apr-2014 Christoph Lameter <cl@linux.com> percpu: add raw_cpu_ops

The kernel has never been audited to ensure that this_cpu operations are
consistently used throughout the kernel. The code generated in many
places can be improved through the use of this_cpu operations (which
use a segment register for relocation of per cpu offsets instead of
performing address calculations).

The patch set also addresses various consistency issues in general with
the per cpu macros.

A. The semantics of __this_cpu_ptr() differ from this_cpu_ptr() only
in that checks are skipped. Elsewhere this is typically indicated by
a raw_ prefix, so this patch set changes the places where
__this_cpu_ptr() is used to raw_cpu_ptr().

B. There has long been a wish by some that __this_cpu operations
would check for preemption. However, there are cases where preemption
checks need to be skipped. This patch set adds raw_cpu operations that
do not check for preemption and then adds preemption checks to the
__this_cpu operations.

C. The use of __get_cpu_var is always a reference to a percpu variable
that can also be handled via a this_cpu operation. This patch set
replaces all uses of __get_cpu_var with this_cpu operations.

D. We can then use this_cpu RMW operations in various places replacing
sequences of instructions by a single one.

E. The use of this_cpu operations throughout will allow arches other
than x86 to implement optimized references and RMW operations to work
with per cpu local data.

F. The use of this_cpu operations opens up the possibility to
further optimize code that relies on synchronization through
per cpu data.

The patch set works in a couple of stages:

I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
It also converts the existing __this_cpu_xx_# primitives in the x86
code to raw_cpu_xx_#.

II. Patches 2-4 use the raw_cpu operations in places that would give
us false positives once they are enabled.

III. Patch 5 adds preemption checks to __this_cpu operations to allow
checking if preemption is properly disabled when these functions
are used.

IV. Patches 6-20 simply replace uses of __get_cpu_var
with this_cpu_ptr. They do not depend on any changes to the percpu
code. No preemption tests are skipped if they are applied.

V. Patches 21-46 are conversion patches that use this_cpu operations
in various kernel subsystems/drivers or arch code.

VI. Patches 47/48 (not included in this series) remove no longer used
functions (__this_cpu_ptr and __get_cpu_var). These should only be
applied after all the conversion patches have made it and after we
have done additional passes through the kernel to ensure that none of
the uses of these functions remain.

This patch (of 46):

The patches following this one will add preemption checks to __this_cpu
ops so we need to have an alternative way to use this_cpu operations
without preemption checks.

raw_cpu_ops will be the basis for all other ops since these will be the
operations that do not implement any checks.

This patch renames the primitive operations from __this_cpu_xxx to
raw_cpu_xxx.

Also change the uses of the x86 percpu primitives in preempt.h.
These depend directly on asm/percpu.h (header #include nesting issue).
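
A sketch of the resulting layering, using the preemption-check hook
added in stage III (shape assumed, not verbatim):

/*
 * raw_cpu_*() performs no checks; __this_cpu_*() is the same
 * operation plus a preemption-state check.
 */
#define __this_cpu_read(pcp)						\
({									\
	__this_cpu_preempt_check("read");				\
	raw_cpu_read(pcp);						\
})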

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Alex Shi <alex.shi@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bryan Wu <cooloney@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: David Daney <david.daney@cavium.com>
Cc: David Miller <davem@davemloft.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Mike Travis <travis@sgi.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Robert Richter <rric@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bd09d9a35111b6ffc0c7585d3853d0ec7f9f1eb4 30-Oct-2013 Greg Thelen <gthelen@google.com> percpu: fix this_cpu_sub() subtrahend casting for unsigneds

this_cpu_sub() is implemented as negation and addition.

This patch casts the adjustment to the counter type before negation to
sign extend the adjustment. This helps in cases where the counter type
is wider than an unsigned adjustment. An alternative to this patch is
to declare such operations unsupported, but it seemed useful to avoid
surprises.

This patch specifically helps the following example:
unsigned int delta = 1;
preempt_disable();
this_cpu_write(long_counter, 0);
this_cpu_sub(long_counter, delta);
preempt_enable();

Before this change long_counter on a 64 bit machine ends with value
0xffffffff, rather than 0xffffffffffffffff. This is because
this_cpu_sub(pcp, delta) boils down to this_cpu_add(pcp, -delta),
which is basically:
long_counter = 0 + 0xffffffff
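
The fix itself is small; roughly (a sketch of the macro after this
patch):

/*
 * Cast val to the counter's type before negation so that an unsigned
 * 32-bit delta is sign-extended when the counter is wider.
 */
#define this_cpu_sub(pcp, val)	this_cpu_add((pcp), -(typeof(pcp))(val))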

Also apply the same cast to:
__this_cpu_sub()
__this_cpu_sub_return()
this_cpu_sub_return()

All percpu_test.ko tests pass, especially the following cases which
previously failed:

l -= ui_one;
__this_cpu_sub(long_counter, ui_one);
CHECK(l, long_counter, -1);

l -= ui_one;
this_cpu_sub(long_counter, ui_one);
CHECK(l, long_counter, -1);
CHECK(l, long_counter, 0xffffffffffffffff);

ul -= ui_one;
__this_cpu_sub(ulong_counter, ui_one);
CHECK(ul, ulong_counter, -1);
CHECK(ul, ulong_counter, 0xffffffffffffffff);

ul = this_cpu_sub_return(ulong_counter, ui_one);
CHECK(ul, ulong_counter, 2);

ul = __this_cpu_sub_return(ulong_counter, ui_one);
CHECK(ul, ulong_counter, 1);

Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
90f2492cf9c84fd414ecfd2f40685fb5291a484e 21-Oct-2013 Heiko Carstens <heiko.carstens@de.ibm.com> x86: remove this_cpu_xor() implementation

Remove the unused x86 implementation of this_cpu_xor().

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
d55c5a93db2d5fa95f233ab153f594365d95b777 28-Nov-2012 H. Peter Anvin <hpa@linux.intel.com> x86, 386 removal: Remove CONFIG_CMPXCHG

All 486+ CPUs support CMPXCHG, so remove the fallback 386 support
code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-3-git-send-email-hpa@linux.intel.com
c35f77417ebfc7c21c02aa9c8c30aa4cecf331d6 10-Jun-2012 Ido Yariv <ido@wizery.com> x86: Define early read-mostly per-cpu macros

Some read-mostly per-cpu data may need to be declared or defined
early, so it can be initialized and accessed before per_cpu
areas are allocated.

Only the data that resides in the per_cpu areas should be
read-mostly, as there is little benefit in optimizing cache
lines on initialization.
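
The declaration side presumably mirrors the existing early_per_cpu
machinery; a sketch (macro body assumed from the non-READ_MOSTLY
variant):

/*
 * A read-mostly per_cpu variable plus an early map that is usable
 * before the per_cpu areas are allocated.
 */
#define DECLARE_EARLY_PER_CPU_READ_MOSTLY(_type, _name)			\
	DECLARE_PER_CPU_READ_MOSTLY(_type, _name);			\
	extern __typeof__(_type) *_name##_early_ptr;			\
	extern __typeof__(_type)  _name##_early_map[]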

Signed-off-by: Ido Yariv <ido@wizery.com>
[ Added the missing declarations in !SMP code. ]
Signed-off-by: Vlad Zolotarov <vlad@scalemp.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/46188571.ddB8aVQYWo@vlad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
641b695c2f11397bd307ea689d4d3f128360ce49 14-May-2012 Alex Shi <alex.shi@intel.com> percpu: remove percpu_xxx() functions

Remove the percpu_xxx() family of functions; all of them have been
replaced by this_cpu_xxx() or __this_cpu_xxx() equivalents.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Christoph Lameter <cl@gentwo.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
c6ae41e7d469f00d9c92a2b2887c7235d121c009 11-May-2012 Alex Shi <alex.shi@intel.com> x86: replace percpu_xxx funcs with this_cpu_xxx

The percpu_xxx() family of functions duplicates this_cpu_xxx(). Remove
the percpu_xxx() definitions and replace their uses in code with
this_cpu_xxx(). There is no functional change in this patch, just
preparation for removing the percpu_xxx() functions later.

On x86 the this_cpu_xxx() functions are the same as __this_cpu_xxx(),
with no unnecessary preempt enable/disable.
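
An illustrative call-site conversion (cpu_number chosen as an example
percpu variable):

/* before: x86-only accessor */
cpu = percpu_read(cpu_number);

/* after: generic accessor, same generated code on x86 */
cpu = this_cpu_read(cpu_number);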

Thanks to Stephen Rothwell, who found and fixed an i386 build error in
the patch.

Also thanks to Andrew Morton, who kept updating the patchset in Linus'
tree.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Christoph Lameter <cl@gentwo.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
933393f58fef9963eac61db8093689544e29a600 22-Dec-2011 Christoph Lameter <cl@linux.com> percpu: Remove irqsafe_cpu_xxx variants

We simply say that regular this_cpu use must be safe regardless of
preemption and interrupt state. That has no material change for x86
and s390 implementations of this_cpu operations. However, arches that
do not provide their own implementation for this_cpu operations will
now get code generated that disables interrupts instead of preemption.

-tj: This is part of on-going percpu API cleanup. For detailed
discussion of the subject, please refer to the following thread.

http://thread.gmane.org/gmane.linux.kernel/1222078
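
For arches without their own implementation, the generic fallback now
takes roughly this shape (a sketch assuming the then-current
__this_cpu_ptr() helper, not verbatim kernel code):

/*
 * Interrupts are disabled around the op, so plain this_cpu_*() is
 * irq-safe even where the RMW cannot be done in one instruction.
 */
#define _this_cpu_generic_to_op(pcp, val, op)				\
do {									\
	unsigned long flags;						\
	local_irq_save(flags);						\
	*__this_cpu_ptr(&(pcp)) op val;					\
	local_irq_restore(flags);					\
} while (0)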

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <alpine.DEB.2.00.1112221154380.11787@router.home>
cebef5beed3de3037de85a521495897256b2c5da 14-Dec-2011 Jan Beulich <JBeulich@suse.com> x86: Fix and improve percpu_cmpxchg{8,16}b_double()

They had several problems/shortcomings:

Only the first memory operand was mentioned in the 2x32-bit asm()
operands, and the 2x64-bit version had a memory clobber. The first
allowed the compiler to not recognize the need to re-load the
data in case it had it cached in some register, and the second
was overly destructive.

The memory operand in the 2x32-bit asm() was declared to only be
an output.

The types of the local copies of the old and new values were
incorrect (as in other per-CPU ops, the types of the per-CPU
variables accessed should be used here, to make sure the
respective types are compatible).

The __dummy variable was pointless (and needlessly initialized
in the 2x32-bit case), given that local copies of the inputs
already exist.

The 2x64-bit variant forced the address of the first object into
%rsi, even though this is needed only for the call to the
emulation function. The real cmpxchg16b can operate on any
memory operand.

While at it, also change the return value type to what it really is:
'bool'.
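
Put together, the corrected 2x32-bit variant comes out roughly as
(a sketch close to, but not necessarily identical with, the patched
code):

/*
 * Both memory words are "+m" in/out operands (no output-only operand,
 * no blanket memory clobber), the local copies take the types of the
 * per-CPU variables, and the result is a bool produced by setz.
 */
#define percpu_cmpxchg8b_double(pcp1, pcp2, o1, o2, n1, n2)		\
({									\
	bool __ret;							\
	typeof(pcp1) __o1 = (o1), __n1 = (n1);				\
	typeof(pcp2) __o2 = (o2), __n2 = (n2);				\
	asm volatile("cmpxchg8b "__percpu_arg(1)"\n\tsetz %0\n\t"	\
		     : "=a" (__ret), "+m" (pcp1), "+m" (pcp2), "+d" (__o2) \
		     : "b" (__n1), "c" (__n2), "a" (__o1));		\
	__ret;								\
})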

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4EE86D6502000078000679FE@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
688d3be815b1563b1484ce67702249a4a7a6314e 12-Jul-2011 Christoph Lameter <cl@linux.com> percpu: Fixup __this_cpu_xchg* operations

Somehow we got into a situation where the __this_cpu_xchg() operations were
not defined in the same way as this_cpu_xchg() and friends. I had some build
failures under 32 bit that were addressed by these fixes.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
b1e7734f024c9ce4393016a97c8d821e1f18d9b4 19-Apr-2011 H. Peter Anvin <hpa@linux.intel.com> x86, percpu: Use ASM_NOP4 instead of hardcoding P6_NOP4

For use in assembly constants, use the ASM_NOP* defines.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1303166160-10315-2-git-send-email-hpa@linux.intel.com
349c004e3d31fda23ad225b61861be38047fff16 12-Mar-2011 Christoph Lameter <cl@linux.com> x86: A fast way to check capabilities of the current cpu

Add this_cpu_has() which determines if the current cpu has a certain
ability using a segment prefix and a bit test operation.

For that we need to add bit operations to x86's percpu.h.

Many uses of cpu_has use a pointer passed to a function to determine
the current flags. That is no longer necessary after this patch.

However, this patch only converts the straightforward cases where
cpu_has is used with this_cpu_ptr. The rest is work for later.

-tj: Rolled up patch to add x86_ prefix and use percpu_read() instead
of percpu_read_stable().
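
A sketch of the resulting test (shape assumed from the description and
the -tj note):

/*
 * Constant required-feature bits fold away at compile time; anything
 * else becomes a single segment-prefixed bit test against this cpu's
 * capability words.
 */
#define this_cpu_has(bit)						\
	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
	 x86_this_cpu_test_bit(bit,					\
			       (unsigned long *)&cpu_info.x86_capability))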

Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
5f55924deaa62d6df687c131fb92aebe071ec787 28-Mar-2011 Eric Dumazet <eric.dumazet@gmail.com> percpu: Avoid extra NOP in percpu_cmpxchg16b_double

percpu_cmpxchg16b_double() uses alternative_io() and looks like:

e8 .. .. .. .. call this_cpu_cmpxchg16b_emu
X bytes NOPX

or, once patched (if the cpu supports the native instruction) on an SMP build:

65 48 0f c7 0e cmpxchg16b %gs:(%rsi)
0f 94 c0 sete %al

and on a !SMP build:

48 0f c7 0e cmpxchg16b (%rsi)
0f 94 c0 sete %al

Therefore, NOPX should be:

P6_NOP3 on SMP
P6_NOP2 on !SMP

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
d7c3f8cee81f4548de0513403b74131aee655576 27-Mar-2011 Christoph Lameter <cl@linux.com> percpu: Omit segment prefix in the UP case for cmpxchg_double

Omit the segment prefix in the UP case. GS is not used then
and we will generate segfaults if cmpxchg16b is used otherwise.
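
A sketch of the prefix selection after this fix (close to the patched
code):

/*
 * On SMP, per-cpu accesses go through the %gs/%fs segment; on UP no
 * segment base is set up, so the prefix must be empty or cmpxchg16b
 * would fault.
 */
#ifdef CONFIG_SMP
#define __percpu_prefix		"%%"__stringify(__percpu_seg)":"
#else
#define __percpu_prefix		""
#endif

#define __percpu_arg(x)		__percpu_prefix "%P" #x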

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b9ec40af0e18fb7d02106be148036c2ea490fdf9 28-Feb-2011 Christoph Lameter <cl@linux.com> percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support

Support this_cpu_cmpxchg_double() using the cmpxchg16b and cmpxchg8b
instructions.

-tj: s/percpu_cmpxchg16b/percpu_cmpxchg16b_double/ for consistency and
other cosmetic changes.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
889a7a6a5d5e64063effd40056bdc7b8fb336bd1 25-Jan-2011 Eric Dumazet <eric.dumazet@gmail.com> percpu, x86: Fix percpu_xchg_op()

These recent percpu commits:

2485b6464cf8: x86,percpu: Move out of place 64 bit ops into X86_64 section
8270137a0d50: cpuops: Use cmpxchg for xchg to avoid lock semantics

Caused this 'perf top' crash:

Kernel panic - not syncing: Fatal exception in interrupt
Pid: 0, comm: swapper Tainted: G D
2.6.38-rc2-00181-gef71723 #413 Call Trace: <IRQ> [<ffffffff810465b5>]
? panic
? kmsg_dump
? kmsg_dump
? oops_end
? no_context
? __bad_area_nosemaphore
? perf_output_begin
? bad_area_nosemaphore
? do_page_fault
? __task_pid_nr_ns
? perf_event_tid
? __perf_event_header__init_id
? validate_chain
? perf_output_sample
? trace_hardirqs_off
? page_fault
? irq_work_run
? update_process_times
? tick_sched_timer
? tick_sched_timer
? __run_hrtimer
? hrtimer_interrupt
? account_system_vtime
? smp_apic_timer_interrupt
? apic_timer_interrupt
...

Looking at assembly code, I found:

list = this_cpu_xchg(irq_work_list, NULL);

gives this wrong code (gcc-4.1.2 cross compiler):

ffffffff810bc45e:
mov %gs:0xead0,%rax
cmpxchg %rax,%gs:0xead0
jne ffffffff810bc45e <irq_work_run+0x3e>
test %rax,%rax
je ffffffff810bc4aa <irq_work_run+0x8a>

Tell gcc we dirty the eax/rax register in percpu_xchg_op(), so the
compiler must use another register to store pxo_new__.

We also don't need to reload the percpu value after a jump, since a
'failed' cmpxchg already updated eax/rax.

Wrong generated code was:
xor %rax,%rax /* load 0 into %rax */
1: mov %gs:0xead0,%rax
cmpxchg %rax,%gs:0xead0
jne 1b
test %rax,%rax

After patch:

xor %rdx,%rdx /* load 0 into %rdx */
mov %gs:0xead0,%rax
1: cmpxchg %rdx,%gs:0xead0
jne 1b
test %rax,%rax
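
As a standalone macro, the fixed 8-byte path looks roughly like this
(name hypothetical; in the kernel it is one case inside
percpu_xchg_op()):

/*
 * "=&a" marks rax as an earlyclobber output, forcing gcc to keep
 * pxo_new__ in a different register; the initial load sits outside
 * the loop because a failed cmpxchg has already refreshed rax.
 */
#define percpu_xchg_op_8(var, nval)					\
({									\
	typeof(var) pxo_ret__;						\
	typeof(var) pxo_new__ = (nval);					\
	asm("\n\tmov "__percpu_arg(1)",%%rax"				\
	    "\n1:\tcmpxchgq %2, "__percpu_arg(1)			\
	    "\n\tjnz 1b"						\
	    : "=&a" (pxo_ret__), "+m" (var)				\
	    : "r" (pxo_new__)						\
	    : "memory");						\
	pxo_ret__;							\
})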

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>
LKML-Reference: <1295973114.3588.312.camel@edumazet-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2485b6464cf86a5bc361666838f2439c99c00567 11-Jan-2011 Christoph Lameter <cl@linux.com> x86,percpu: Move out of place 64 bit ops into X86_64 section

Some operations that operate on 64 bit operands are defined for 32 bit.
Move them into the correct section.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
8270137a0d50507a5b40f880db636527045b8466 14-Dec-2010 Christoph Lameter <cl@linux.com> cpuops: Use cmpxchg for xchg to avoid lock semantics

Use cmpxchg instead of xchg to realize this_cpu_xchg.

xchg will cause LOCK overhead since LOCK is always implied but cmpxchg
will not.

Baselines:

xchg() = 18 cycles (no segment prefix, LOCK semantics)
__this_cpu_xchg = 1 cycle

(simulated using this_cpu_read/write, two prefixes. Looks like the
cpu can use loop optimization to get rid of most of the overhead)

Cycles before:

this_cpu_xchg = 37 cycles (segment prefix and LOCK (implied by xchg))

After:

this_cpu_xchg = 11 cycle (using cmpxchg without lock semantics)

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
7296e08abac0a22a2534a4f6e493c764f2c77583 14-Dec-2010 Christoph Lameter <cl@linux.com> x86: this_cpu_cmpxchg and this_cpu_xchg operations

Provide support as far as the hardware capabilities of the x86 cpus
allow.

Define CONFIG_CMPXCHG_LOCAL in Kconfig.cpu to allow core code to test for
fast cpuops implementations.

V1->V2:
- Take out the definition for this_cpu_cmpxchg_8 and move it into
a separate patch.

tj: - Reordered ops to better follow this_cpu_* organization.
- Renamed macro temp variables similar to their existing
neighbours.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
403047754cf690b012369b8fb563b738b88086e6 17-Dec-2010 Tejun Heo <tj@kernel.org> percpu,x86: relocate this_cpu_add_return() and friends

- include/linux/percpu.h: this_cpu_add_return() and friends were
located next to __this_cpu_add_return(). However, the overall
organization is to first group by preemption safeness. Relocate
this_cpu_add_return() and friends to preemption-safe area.

- arch/x86/include/asm/percpu.h: Relocate percpu_add_return_op() after
other more basic operations. Relocate [__]this_cpu_add_return_8()
so that they're first grouped by preemption safeness.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
8f1d97c79eb65de1d05799d6b81d79cd94169114 06-Dec-2010 Christoph Lameter <cl@linux.com> x86: Support for this_cpu_add, sub, dec, inc_return

Supply an implementation for x86 in order to generate more efficient code.

V2->V3:
- Cleanup
- Remove strange type checking from percpu_add_return_op.

tj: - Dropped unused typedef from percpu_add_return_op().
- Renamed ret__ to paro_ret__ in percpu_add_return_op().
- Minor indentation adjustments.
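
A sketch of the 4-byte case (standalone name hypothetical; in the
kernel it is one case of percpu_add_return_op()):

/*
 * xadd leaves the old value in the register operand, so adding val
 * back afterwards yields the post-add value.
 */
#define percpu_add_return_op_4(var, val)				\
({									\
	typeof(var) paro_ret__ = (val);					\
	asm("xaddl %0, "__percpu_arg(1)					\
	    : "+r" (paro_ret__), "+m" (var)				\
	    : : "memory");						\
	paro_ret__ += (val);						\
	paro_ret__;							\
})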

Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
db7829c6cc32f3c0c9a324118d743acb1abff081 09-Sep-2010 Brian Gerst <brgerst@gmail.com> x86, percpu: Optimize this_cpu_ptr

Allow arches to implement __this_cpu_ptr, and provide an x86 version.

Before:
movq $foo, %rax
movq %gs:this_cpu_off, %rdx
addq %rdx, %rax

After:
movq $foo, %rax
addq %gs:this_cpu_off, %rax

The benefit is doing it in one less instruction and not clobbering
a temporary register.

tj: * Beefed up the comment a bit and renamed in-macro temp variable
to match neighboring macros.

* Folded fix for const pointer case found in linux-next.

* Fixed sparse notation.
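
A sketch close to the x86 override described above:

/*
 * Fold the this_cpu_off addition into a single add with a segment-
 * relative memory operand; the result stays in the register the
 * pointer arrived in ("0" ties input to output).
 */
#define __this_cpu_ptr(ptr)						\
({									\
	unsigned long tcp_ptr__;					\
	__verify_pcpu_ptr(ptr);						\
	asm volatile("add " __percpu_arg(1) ", %0"			\
		     : "=r" (tcp_ptr__)					\
		     : "m" (this_cpu_off), "0" (ptr));			\
	(typeof(*(ptr)) __kernel __force *)tcp_ptr__;			\
})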

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
23b764d056bfec0a212a67229074ac281e86e021 10-Jun-2010 Andi Kleen <andi@firstfloor.org> percpu, x86: Avoid warnings of unused variables in per cpu

Avoid hundreds of warnings with a gcc 4.6 -Wall build.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
402af0d7c692ddcfa2333e93d3f275ebd0487926 21-Apr-2010 Jan Beulich <JBeulich@novell.com> x86, asm: Introduce and use percpu_inc()

... generating slightly smaller code.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4BCF261F020000780003B33C@vpn.id2.novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
40f0a5d0a16e68a68ab3d230f1ddd96c81cf5340 19-Apr-2010 Justin P. Mattock <justinmattock@gmail.com> Fix comment typo in percpu.h

Fix a typo in arch/x86/include/asm/percpu.h

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5917dae83cb02dfe74c9167b79e86e6d65183fa3 05-Jan-2010 Christoph Lameter <cl@linux-foundation.org> percpu, x86: Generic inc / dec percpu instructions

Optimize the code generated for percpu access by checking for
increments and decrements.

tj: fix incorrect usage of __builtin_constant_p() and restructure
percpu_add_op() macro.
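
A sketch of the dispatch inside percpu_add_op() (4-byte case only;
a fragment, not a complete macro):

/* Constant +1/-1 become inc/dec; everything else stays an add. */
if (__builtin_constant_p(val) && (val) == 1)
	asm volatile("incl "__percpu_arg(0) : "+m" (var));
else if (__builtin_constant_p(val) && (val) == -1)
	asm volatile("decl "__percpu_arg(0) : "+m" (var));
else
	asm volatile("addl %1, "__percpu_arg(0)
		     : "+m" (var)
		     : "ri" ((int)(val)));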

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
dd17c8f72993f9461e9c19250e3f155d6d99df22 29-Oct-2009 Rusty Russell <rusty@rustcorp.com.au> percpu: remove per_cpu__ prefix.

Now that the return from alloc_percpu is compatible with the address
of per-cpu vars, it makes sense to hand around the address of per-cpu
variables. To make this sane, we remove the per_cpu__ prefix we
created to stop people accidentally using these vars directly.

Now that we have sparse, we can use that instead (next patch).

tj: * Updated to convert stuff which were missed by or added after the
original patch.

* Kill per_cpu_var() macro.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
0f5e4816dbf38ce9488e611ca2296925c1e90d5e 29-Oct-2009 Tejun Heo <tj@kernel.org> percpu: remove some sparse warnings

Make the following changes to remove some sparse warnings.

* Make DEFINE_PER_CPU_SECTION() declare __pcpu_unique_* before
defining it.

* Annotate pcpu_extend_area_map() that it is entered with pcpu_lock
held, releases it and then reacquires it.

* Make percpu related macros use unique nested variable names.

* While at it, add pcpu prefix to __size_call[_return]() macros as
to-be-implemented sparse annotations will add percpu specific stuff
to these macros.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
30ed1a79f5bf271d33e782afee3323582dcc621e 03-Oct-2009 Christoph Lameter <cl@linux-foundation.org> this_cpu: Implement X86 optimized this_cpu operations

Basically the existing percpu ops can be used for the this_cpu variants,
which also allow operations on dynamically allocated percpu data. However,
we do not pass a reference to a percpu variable in; instead, a dynamically
or statically allocated percpu variable is provided.

The preempt, non-preempt and irqsafe operations generate the same code.
On x86 it will always be possible to achieve the required per cpu
atomicity in a single RMW instruction with a segment override.

64 bit this_cpu operations are not supported on 32 bit.
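
A sketch of why a single instruction suffices (standalone macro name
hypothetical; the kernel dispatches on sizeof inside percpu_add_op()
and friends):

/*
 * The segment override makes the CPU apply this cpu's offset itself,
 * so the whole read-modify-write is one instruction and is atomic
 * with respect to preemption and interrupts on this cpu.
 */
#define this_cpu_add_4(pcp, val)					\
	asm volatile("addl %1, "__percpu_arg(0)				\
		     : "+m" (pcp)					\
		     : "ri" ((int)(val)))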

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
ed8d9adf357ec331603fa1049510399812cea7e5 03-Aug-2009 Linus Torvalds <torvalds@linux-foundation.org> x86, percpu: Add 'percpu_read_stable()' interface for cacheable accesses

This is very useful for some common things like 'get_current()' and
'get_thread_info()', which can be used multiple times in a function, and
where the result is cacheable.

tj: Added the magical undocumented "P" modifier to UP __percpu_arg()
to force gcc to dereference the pointer value passed in via the
"p" input constraint. Without this, percpu_read_stable() returns
the address of the percpu variable. Also added comment explaining
the difference between percpu_read() and percpu_read_stable().
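
A sketch of the two variants (ignoring the per_cpu__ name prefix still
in use at the time):

/*
 * "m" makes every percpu_read() a fresh load; "p" plus the "P"
 * modifier lets gcc treat the value as function-invariant and CSE it.
 */
#define percpu_read(var)	percpu_from_op("mov", var, "m" (var))
#define percpu_read_stable(var)	percpu_from_op("mov", var, "p" (&(var)))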

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
8c4bfc6e8801616ab2e01c38140b2159b388d2ff 04-Jul-2009 Tejun Heo <tj@kernel.org> x86,percpu: generalize lpage first chunk allocator

Generalize and move x86 setup_pcpu_lpage() into
pcpu_lpage_first_chunk(). setup_pcpu_lpage() is now a simple wrapper
around the generalized version. Other than taking size parameters and
using arch supplied callbacks to allocate/free/map memory,
pcpu_lpage_first_chunk() is identical to the original implementation.

This simplifies arch code and will help converting more archs to
dynamic percpu allocator.

While at it, factor out pcpu_calc_fc_sizes() which is common to
pcpu_embed_first_chunk() and pcpu_lpage_first_chunk().

[ Impact: code reorganization and generalization ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
e59a1bb2fdfb745c685f5b40ffbed126331d3223 22-Jun-2009 Tejun Heo <tj@kernel.org> x86: fix pageattr handling for lpage percpu allocator and re-enable it

The lpage allocator aliases a PMD page for each cpu and returns whatever
is unused to the page allocator. When the pageattr of the recycled
pages is changed, the two aliases end up pointing to overlapping
regions with different attributes, which isn't allowed and is known to
cause subtle data corruption in certain cases.

This can be handled in a similar manner to the x86_64 highmap alias:
pageattr code should detect whether the target pages have a PMD alias,
split the PMD alias and synchronize the attributes.

The pcpur allocator is updated to keep the allocated PMD page map sorted
in ascending address order and to provide a pcpu_lpage_remapped() function
which binary searches the array to determine whether the given address
is aliased and if so to which address. pageattr is updated to use
pcpu_lpage_remapped() to detect the PMD alias and split it up as
necessary from cpa_process_alias().

Jan Beulich spotted the original problem and incorrect usage of vaddr
instead of laddr for lookup.

With this, lpage percpu allocator should work correctly. Re-enable
it.

[ Impact: fix subtle lpage pageattr bug and re-enable lpage ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jan Beulich <JBeulich@novell.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
3c598766a2bae1b208470e7cc934ac462561e3cb 11-May-2009 Jan Beulich <JBeulich@novell.com> x86: fix percpu_{to,from}_op()

- the byte operand constraints were wrong for 32-bit
- the to-op's input operands weren't properly parenthesized

[ Impact: fix possible miscompilation or build failure ]
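
A sketch of the byte case after the fix (a fragment of percpu_to_op();
"movb" concretized for illustration):

/*
 * "q" restricts the register alternative to a/b/c/d, the only
 * registers with byte subregisters on 32-bit; the value expression
 * is fully parenthesized before the cast.
 */
case 1:
	asm("movb %1, "__percpu_arg(0)
	    : "+m" (var)
	    : "qi" ((pto_T__)(val)));
	break;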

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
e01009833e22dc87075d770554b34d797843ed23 10-Mar-2009 Tejun Heo <tj@kernel.org> percpu: make x86 addr <-> pcpu ptr conversion macros generic

Impact: generic addr <-> pcpu ptr conversion macros

There's nothing arch specific about x86 __addr_to_pcpu_ptr() and
__pcpu_ptr_to_addr(). With proper __per_cpu_load and __per_cpu_start
defined, they'll do the right thing regardless of actual layout.

Move these macros from arch/x86/include/asm/percpu.h to mm/percpu.c
and allow archs to override them as necessary.
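
A sketch of the now-generic, overridable default in mm/percpu.c:

/*
 * Translate between a chunk address and a percpu pointer by rebasing
 * against __per_cpu_start; archs may #define their own versions.
 */
#ifndef __addr_to_pcpu_ptr
#define __addr_to_pcpu_ptr(addr)					\
	(void *)((unsigned long)(addr) - (unsigned long)pcpu_base_addr	\
		 + (unsigned long)__per_cpu_start)
#define __pcpu_ptr_to_addr(ptr)						\
	(void *)((unsigned long)(ptr) + (unsigned long)pcpu_base_addr	\
		 - (unsigned long)__per_cpu_start)
#endif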

Signed-off-by: Tejun Heo <tj@kernel.org>
11124411aa95827404d6bfdfc14c908e1b54513c 20-Feb-2009 Tejun Heo <tj@kernel.org> x86: convert to the new dynamic percpu allocator

Impact: use new dynamic allocator, unified access to static/dynamic
percpu memory

Convert to the new dynamic percpu allocator.

* implement populate_extra_pte() for both 32 and 64
* update setup_per_cpu_areas() to use pcpu_setup_static()
* define __addr_to_pcpu_ptr() and __pcpu_ptr_to_addr()
* define config HAVE_DYNAMIC_PER_CPU_AREA

Signed-off-by: Tejun Heo <tj@kernel.org>
2add8e235cbe0dcd672c33fc322754e15500238c 08-Feb-2009 Brian Gerst <brgerst@gmail.com> x86: use linker to offset symbols by __per_cpu_load

Impact: cleanup and bug fix

Use the linker to create symbols for certain per-cpu variables
that are offset by __per_cpu_load. This allows the removal of
the runtime fixup of the GDT pointer, which fixes a bug with
resume reported by Jiri Slaby.

Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
299e26992a737804e13e74fdb97cdab470ed19ac 21-Jan-2009 Brian Gerst <brgerst@gmail.com> x86: fix percpu_write with 64-bit constants

Impact: slightly better code generation for percpu_to_op()

The processor will sign-extend 32-bit immediate values in 64-bit
operations. Use the 'e' constraint ("32-bit signed integer constant,
or a symbolic reference known to fit that range") for 64-bit constants.
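
A sketch of the 8-byte case after the fix (a fragment of
percpu_to_op(); "movq" concretized for illustration):

/*
 * "e" accepts a 32-bit signed immediate, which movq sign-extends,
 * so 64-bit constants no longer force a scratch register; "r" stays
 * as the register alternative.
 */
case 8:
	asm("movq %1, "__percpu_arg(0)
	    : "+m" (var)
	    : "re" ((pto_T__)(val)));
	break;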

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
947e76cdc34c782fc947313d4331380686eebbad 18-Jan-2009 Brian Gerst <brgerst@gmail.com> x86: move stack_canary into irq_stack

Impact: x86_64 percpu area layout change, irq_stack now at the beginning

Now that the PDA is empty except for the stack canary, it can be removed.
The irqstack is moved to the start of the per-cpu section. If the stack
protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack.

tj: * updated subject
* dropped asm relocation of irq_stack_ptr
* updated comments a bit
* rebased on top of stack canary changes

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
87b264065880fa696c121dad8498a60524e0f6de 18-Jan-2009 Brian Gerst <brgerst@gmail.com> x86-64: Use absolute displacements for per-cpu accesses.

Accessing memory through %gs should not use rip-relative addressing.
Adding a P prefix for the argument tells gcc to not add (%rip) to
the memory references.
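
A sketch of the operand wrapper after this change (close to the
patched code):

/*
 * The "P" modifier prints a bare symbol/offset, suppressing the
 * default (%rip)-relative form that would be wrong under a %gs
 * segment override.
 */
#define __percpu_arg(x)		"%%"__stringify(__percpu_seg)":%P" #x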

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
6dbde3530850d4d8bfc1b6bd4006d92786a2787f 15-Jan-2009 Ingo Molnar <mingo@elte.hu> percpu: add optimized generic percpu accessors

It is an optimization and a cleanup, and adds the following new
generic percpu methods:

percpu_read()
percpu_write()
percpu_add()
percpu_sub()
percpu_and()
percpu_or()
percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

return __get_cpu_var(var);

ffffffff8102ca2b: 48 8b 14 fd 80 09 74 mov -0x7e8bf680(,%rdi,8),%rdx
ffffffff8102ca32: 81
ffffffff8102ca33: 48 c7 c0 d8 59 00 00 mov $0x59d8,%rax
ffffffff8102ca3a: 48 8b 04 10 mov (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

return percpu_read(var);

ffffffff8102ca3f: 65 48 8b 05 91 8f fd mov %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
* added percpu_and() for completeness's sake
* made generic percpu ops atomic against preemption
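
A sketch of the generic fallback for one of the new accessors (per the
last tj note, atomic against preemption; shape assumed):

/*
 * get_cpu_var() disables preemption around the access; x86 overrides
 * this with a single segment-prefixed mov.
 */
#ifndef percpu_read
#define percpu_read(var)						\
({									\
	typeof(per_cpu_var(var)) __tmp_var__;				\
	__tmp_var__ = get_cpu_var(var);					\
	put_cpu_var(var);						\
	__tmp_var__;							\
})
#endif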

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Tejun Heo <tj@kernel.org>
49357d19e4fb31e28796eaff83499e7584c26878 13-Jan-2009 Tejun Heo <tj@kernel.org> x86: convert pda ops to wrappers around x86 percpu accessors

pda is now a percpu variable and there's no reason it can't use plain
x86 percpu accessors. Add x86_test_and_clear_bit_percpu() and replace
pda op implementations with wrappers around x86 percpu accessors.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
9939ddaff52787b2a7c1adf1b2afc95421aa0884 13-Jan-2009 Tejun Heo <tj@kernel.org> x86: merge 64 and 32 SMP percpu handling

Now that pda is allocated as part of percpu, percpu doesn't need to be
accessed through pda. Unify x86_64 SMP percpu access with the x86_32
one. Other than the segment register, operand size and the base of
percpu symbols, they behave identically now.

This patch replaces the now unnecessary pda->data_offset with a dummy
field which is necessary to keep stack_canary at its place. This
patch also moves per_cpu_offset initialization out of init_gdt() into
setup_per_cpu_areas(). Note that this change also necessitates
explicit per_cpu_offset initializations in voyager_smp.c.

With this change, x86_OP_percpu()'s are as efficient on x86_64 as on
x86_32, and x86_64 can also use the assembly PER_CPU macros.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
1a51e3a0aed18767cf2762e95456ecfeb0bca5e6 13-Jan-2009 Tejun Heo <tj@kernel.org> x86: fold pda into percpu area on SMP

[ Based on original patch from Christoph Lameter and Mike Travis. ]

Currently pdas and percpu areas are allocated separately. %gs points
to local pda and percpu area can be reached using pda->data_offset.
This patch folds pda into percpu area.

Due to a strange gcc requirement, pda needs to be at the beginning of
the percpu area so that pda->stack_canary is at %gs:40. To achieve
this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is
added and used to reserve pda sized chunk at the start of the percpu
area.

After this change, for boot cpu, %gs first points to pda in the
data.init area and later during setup_per_cpu_areas() gets updated to
point to the actual pda. This means that setup_per_cpu_areas() needs
to reload %gs for CPU0 while clearing the pda area for other cpus, as
cpu0 has already modified it when control reaches setup_per_cpu_areas().

This patch also removes the now unnecessary get_local_pda() and its call
sites.

A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
f10fcd47120e80f66665567dbe17f5071c7aef52 13-Jan-2009 Tejun Heo <tj@kernel.org> x86: make early_per_cpu() a lvalue and use it

Make early_per_cpu() an lvalue, as per_cpu() is, and use it where
applicable.
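
A sketch of the lvalue form (close to the x86 definition):

/*
 * Dereference whichever storage currently backs the variable, so
 * early_per_cpu() can appear on either side of an assignment.
 */
#define early_per_cpu(_name, _cpu)					\
	*(early_per_cpu_ptr(_name) ?					\
	  &early_per_cpu_ptr(_name)[_cpu] :				\
	  &per_cpu(_name, _cpu))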

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
1965aae3c98397aad957412413c07e97b1bd4e64 23-Oct-2008 H. Peter Anvin <hpa@zytor.com> x86: Fix ASM_X86__ header guards

Change header guards named "ASM_X86__*" to "_ASM_X86_*" since:

a. the double underscore is ugly and pointless.
b. no leading underscore violates namespace constraints.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
bb8985586b7a906e116db835c64773b7a7d51663 18-Aug-2008 Al Viro <viro@zeniv.linux.org.uk> x86, um: ... and asm-x86 move

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>