62b94a08da1bae9d187d49dfcd6665af393750f8 |
|
20-Nov-2013 |
Peter Zijlstra <peterz@infradead.org> |
sched/preempt: Take away preempt_enable_no_resched() from modules Discourage drivers/modules to be creative with preemption. Sadly all is implemented in macros and inline so if they want to do evil they still can, but at least try and discourage some. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: rui.zhang@intel.com Cc: jacob.jun.pan@linux.intel.com Cc: Mike Galbraith <bitbucket@online.de> Cc: hpa@zytor.com Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: lenb@kernel.org Cc: rjw@rjwysocki.net Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-fn7h6vu8wtgxk0ih402qcijx@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
bdb43806589096ac4272fe1307e789846ac08d7c |
|
10-Sep-2013 |
Peter Zijlstra <peterz@infradead.org> |
sched: Extract the basic add/sub preempt_count modifiers Rewrite the preempt_count macros in order to extract the 3 basic preempt_count value modifiers: __preempt_count_add() __preempt_count_sub() and the new: __preempt_count_dec_and_test() And since we're at it anyway, replace the unconventional $op_preempt_count names with the more conventional preempt_count_$op. Since these basic operators are equivalent to the previous _notrace() variants, do away with the _notrace() versions. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-ewbpdbupy9xpsjhg960zwbv8@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
f29c50419c8d1998edd759f1990c4243a248f469 |
|
19-May-2011 |
Steven Rostedt <srostedt@redhat.com> |
maccess,probe_kernel: Make write/read src const void * The functions probe_kernel_write() and probe_kernel_read() do not modify the src pointer. Allow const pointers to be passed in without the need of a typecast. Acked-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1305824936.1465.4.camel@gandalf.stny.rr.com
|
6144a85a0e018c19bc4b24f7eb6c1f3f7431813d |
|
07-Jan-2010 |
Jason Wessel <jason.wessel@windriver.com> |
maccess,probe_kernel: Allow arch specific override probe_kernel_(read|write) Some archs such as blackfin, would like to have an arch specific probe_kernel_read() and probe_kernel_write() implementation which can fall back to the generic implementation if no special operations are needed. CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org>
|
f180053694b43d5714bf56cb95499a3c32ff155c |
|
02-Mar-2009 |
Ingo Molnar <mingo@elte.hu> |
x86, mm: dont use non-temporal stores in pagecache accesses Impact: standardize IO on cached ops On modern CPUs it is almost always a bad idea to use non-temporal stores, as the regression in this commit has shown it: 30d697f: x86: fix performance regression in write() syscall The kernel simply has no good information about whether using non-temporal stores is a good idea or not - and trying to add heuristics only increases complexity and inserts fragility. The regression on cached write()s took very long to be found - over two years. So dont take any chances and let the hardware decide how it makes use of its caches. The only exception is drivers/gpu/drm/i915/i915_gem.c: there were we are absolutely sure that another entity (the GPU) will pick up the dirty data immediately and that the CPU will not touch that data before the GPU will. Also, keep the _nocache() primitives to make it easier for people to experiment with these details. There may be more clear-cut cases where non-cached copies can be used, outside of filemap.c. Cc: Salman Qazi <sqazi@google.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
3255aa2eb636a508fc82a73fabbb8aaf2ff23c0f |
|
25-Feb-2009 |
Ingo Molnar <mingo@elte.hu> |
x86, mm: pass in 'total' to __copy_from_user_*nocache() Impact: cleanup, enable future change Add a 'total bytes copied' parameter to __copy_from_user_*nocache(), and update all the callsites. The parameter is not used yet - architecture code can use it to more intelligently decide whether the copy should be cached or non-temporal. Cc: Salman Qazi <sqazi@google.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
fb71e45338453698bd7460f7e8f171ea0304d218 |
|
16-Sep-2008 |
Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> |
uaccess: fix parameters inversion for __copy_from_user_inatomic() The following patch changes to use __copy_from_user_inatomic(), but the passing parameters incorrect: x86: some lock annotations for user copy paths, v3 This fixes the netfilter crash reported by Steven Noonan. Reported-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Tested-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
1d18ef489509314506328b9e464dd47c24c1d68f |
|
11-Sep-2008 |
Ingo Molnar <mingo@elte.hu> |
x86: some lock annotations for user copy paths, v3 - add annotation back to clear_user() - change probe_kernel_address() to _inatomic*() method Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
c33fa9f5609e918824446ef9a75319d4a802f1f4 |
|
17-Apr-2008 |
Ingo Molnar <mingo@elte.hu> |
uaccess: add probe_kernel_write() add probe_kernel_read() and probe_kernel_write(). Uninlined and restricted to kernel range memory only, as suggested by Linus. Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
|
20aa7b21b1cbd1aa3fbf5fc14da5f7484a61a824 |
|
07-Dec-2006 |
Andrew Morton <akpm@osdl.org> |
[PATCH] probe_kernel_address() needs to do set_fs() probe_kernel_address() purports to be generic, only it forgot to select KERNEL_DS, so it presently won't work right on all architectures. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
a866374aecc90c7d90619727ccd851ac096b2fc7 |
|
07-Dec-2006 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
[PATCH] mm: pagefault_{disable,enable}() Introduce pagefault_{disable,enable}() and use these where previously we did manual preempt increments/decrements to make the pagefault handler do the atomic thing. Currently they still rely on the increased preempt count, but do not rely on the disabled preemption, this might go away in the future. (NOTE: the extra barrier() in pagefault_disable might fix some holes on machines which have too many registers for their own good) [heiko.carstens@de.ibm.com: s390 fix] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Nick Piggin <npiggin@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
2fff0a48416af891dce38fd425246e337831e0bb |
|
07-Dec-2006 |
Andi Kleen <ak@suse.de> |
[PATCH] Generic: Move __user cast into probe_kernel_address Caller of probe_kernel_address shouldn't need to know that pka is internally implemented with __get_user. So move the __user cast into pka. Signed-off-by: Andi Kleen <ak@suse.de>
|
1b79e5513d52e8533a08af35a3595dad80c74d1f |
|
27-Sep-2006 |
Andrew Morton <akpm@osdl.org> |
[PATCH] add probe_kernel_address() Add a version of __get_user() which is safe to call inside mmap_sem. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
c22ce143d15eb288543fe9873e1c5ac1c01b69a1 |
|
23-Jun-2006 |
Hiro Yoshioka <hyoshiok@miraclelinux.com> |
[PATCH] x86: cache pollution aware __copy_from_user_ll() Use the x86 cache-bypassing copy instructions for copy_from_user(). Some performance data are Total of GLOBAL_POWER_EVENTS (CPU cycle samples) 2.6.12.4.orig 1921587 2.6.12.4.nt 1599424 1599424/1921587=83.23% (16.77% reduction) BSQ_CACHE_REFERENCE (L3 cache miss) 2.6.12.4.orig 57427 2.6.12.4.nt 20858 20858/57427=36.32% (63.7% reduction) L3 cache miss reduction of __copy_from_user_ll samples % 37408 65.1412 vmlinux __copy_from_user_ll 23 0.1103 vmlinux __copy_user_zeroing_intel_nocache 23/37408=0.061% (99.94% reduction) Top 5 of 2.6.12.4.nt Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 samples % app name symbol name 128392 8.0274 vmlinux __copy_user_zeroing_intel_nocache 64206 4.0143 vmlinux journal_add_journal_head 59746 3.7355 vmlinux do_get_write_access 47674 2.9807 vmlinux journal_put_journal_head 46021 2.8774 vmlinux journal_dirty_metadata pattern9-0-cpu4-0-09011728/summary.out Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x3f (multiple flags) count 3000 samples % app name symbol name 69755 4.2861 vmlinux __copy_user_zeroing_intel_nocache 55685 3.4215 vmlinux journal_add_journal_head 52371 3.2179 vmlinux __find_get_block 45504 2.7960 vmlinux journal_put_journal_head 36005 2.2123 vmlinux journal_stop pattern9-0-cpu4-0-09011744/summary.out Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x200 (read 3rd level cache miss) count 3000 samples % app name symbol name 1147 5.4994 vmlinux journal_add_journal_head 881 4.2240 vmlinux journal_dirty_data 872 4.1809 vmlinux blk_rq_map_sg 734 3.5192 vmlinux journal_commit_transaction 617 2.9582 vmlinux radix_tree_delete pattern9-0-cpu4-0-09011731/summary.out iozone results are original 2.6.12.4 CPU time = 207.768 sec cache aware CPU time = 184.783 sec (three times run) 184.783/207.768=88.94% (11.06% reduction) original: pattern9-0-cpu4-0-08191720/iozone.out: CPU Utilization: Wall time 45.997 CPU time 64.527 CPU utilization 140.28 % pattern9-0-cpu4-0-08191741/iozone.out: CPU Utilization: Wall time 46.878 CPU time 71.933 CPU utilization 153.45 % pattern9-0-cpu4-0-08191743/iozone.out: CPU Utilization: Wall time 45.152 CPU time 71.308 CPU utilization 157.93 % cache awre: pattern9-0-cpu4-0-09011728/iozone.out: CPU Utilization: Wall time 44.842 CPU time 62.465 CPU utilization 139.30 % pattern9-0-cpu4-0-09011731/iozone.out: CPU Utilization: Wall time 44.718 CPU time 59.273 CPU utilization 132.55 % pattern9-0-cpu4-0-09011744/iozone.out: CPU Utilization: Wall time 44.367 CPU time 63.045 CPU utilization 142.10 % Signed-off-by: Hiro Yoshioka <hyoshiok@miraclelinux.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|