60f583d56aa515b896a9d94f860f52640c1e8a75 |
|
07-Mar-2013 |
Dave Hansen <dave@sr71.net> |
x86: Do not try to sync identity map for non-mapped pages kernel_map_sync_memtype() is called from a variety of contexts. The pat.c code that calls it seems to ensure that it is not called for non-ram areas by checking via pat_pagerange_is_ram(). It is important that it only be called on the actual identity map because there *IS* no map to sync for highmem pages, or for memory holes. The ioremap.c uses are not as careful as those from pat.c, and call kernel_map_sync_memtype() on PCI space which is in the middle of the kernel identity map _range_, but is not actually mapped. This patch adds a check to kernel_map_sync_memtype() which probably duplicates some of the checks already in pat.c. But, it is necessary for the ioremap.c uses and shouldn't hurt other callers. I have reproduced this bug and this patch fixes it for me and the original bug reporter: https://lkml.org/lkml/2013/2/5/396 Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130307163151.D9B58C4E@kernel.stglabs.ibm.com Signed-off-by: Dave Hansen <dave@sr71.net> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
a25b9316841c5afa226f8f70a457861b35276a92 |
|
22-Jan-2013 |
Dave Hansen <dave@linux.vnet.ibm.com> |
x86, mm: Make DEBUG_VIRTUAL work earlier in boot The KVM code has some repeated bugs in it around use of __pa() on per-cpu data. Those data are not in an area on which using __pa() is valid. However, they are also called early enough in boot that __vmalloc_start_set is not set, and thus the CONFIG_DEBUG_VIRTUAL debugging does not catch them. This adds a check to also verify __pa() calls against max_low_pfn, which we can use earler in boot than is_vmalloc_addr(). However, if we are super-early in boot, max_low_pfn=0 and this will trip on every call, so also make sure that max_low_pfn is set before we try to use it. With this patch applied, CONFIG_DEBUG_VIRTUAL will actually catch the bug I was chasing (and fix later in this series). I'd love to find a generic way so that any __pa() call on percpu areas could do a BUG_ON(), but there don't appear to be any nice and easy ways to check if an address is a percpu one. Anybody have ideas on a way to do this? Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130122212430.F46F8159@kernel.stglabs.ibm.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
b3b9c2932c32e0692018ed5f12f3fd8c70eea8ce |
|
09-Oct-2012 |
Konstantin Khlebnikov <khlebnikov@openvz.org> |
mm, x86, pat: rework linear pfn-mmap tracking Replace the generic vma-flag VM_PFN_AT_MMAP with x86-only VM_PAT. We can toss mapping address from remap_pfn_range() into track_pfn_vma_new(), and collect all PAT-related logic together in arch/x86/. This patch also restores orignal frustration-free is_cow_mapping() check in remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73 ("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3") is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c, because it already handled by VM_PFNMAP in VM_NO_THP bit-mask. [suresh.b.siddha@intel.com: Reset the VM_PAT flag as part of untrack_pfn_vma()] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Venkatesh Pallipadi <venki@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
5180da410db6369d1f95c9014da1c9bc33fb043e |
|
09-Oct-2012 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86, pat: separate the pfn attribute tracking for remap_pfn_range and vm_insert_pfn With PAT enabled, vm_insert_pfn() looks up the existing pfn memory attribute and uses it. Expectation is that the driver reserves the memory attributes for the pfn before calling vm_insert_pfn(). remap_pfn_range() (when called for the whole vma) will setup a new attribute (based on the prot argument) for the specified pfn range. This addresses the legacy usage which typically calls remap_pfn_range() with a desired memory attribute. For ranges smaller than the vma size (which is typically not the case), remap_pfn_range() will use the existing memory attribute for the pfn range. Expose two different API's for these different behaviors. track_pfn_insert() for tracking the pfn attribute set by vm_insert_pfn() and track_pfn_remap() for the remap_pfn_range(). This cleanup also prepares the ground for the track/untrack pfn vma routines to take over the ownership of setting PAT specific vm_flag in the 'vma'. [khlebnikov@openvz.org: Clear checks in track_pfn_remap()] [akpm@linux-foundation.org: tweak a few comments] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Venkatesh Pallipadi <venki@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
b1a86e15dc0304366f50ba1720834bc419c801b1 |
|
09-Oct-2012 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines 'pfn' argument for track_pfn_vma_new() can be used for reserving the attribute for the pfn range. No need to depend on 'vm_pgoff' Similarly, untrack_pfn_vma() can depend on the 'pfn' argument if it is non-zero or can use follow_phys() to get the starting value of the pfn range. Also the non zero 'size' argument can be used instead of recomputing it from vma. This cleanup also prepares the ground for the track/untrack pfn vma routines to take over the ownership of setting PAT specific vm_flag in the 'vma'. [khlebnikov@openvz.org: Clear pfn to paddr conversion] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Venkatesh Pallipadi <venki@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
fa83523f45fbb403eba4ebc5704bf98aa4da0163 |
|
25-May-2012 |
John Dykstra <jdykstra@cray.com> |
x86/mm/pat: Improve scaling of pat_pagerange_is_ram() Function pat_pagerange_is_ram() scales poorly to large address ranges, because it probes the resource tree for each page. On a 2.6 GHz Opteron, this function consumes 34 ms for a 1 GB range. It is called twice during untrack_pfn_vma(), slowing process cleanup and handicapping the OOM killer. This replacement consumes less than 1ms, under the same conditions. Signed-off-by: John Dykstra <jdykstra@cray.com> on behalf of Cray Inc. Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337980366.1979.6.camel@redwood [ Small stylistic cleanups and renames ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
365811d6f9bd98543bedc02b72d94f0f0faf3670 |
|
30-May-2012 |
Bjorn Helgaas <bhelgaas@google.com> |
x86: print physical addresses consistently with other parts of kernel Print physical address info in a style consistent with the %pR style used elsewhere in the kernel. For example: -found SMP MP-table at [ffff8800000fce90] fce90 +found SMP MP-table at [mem 0x000fce90-0x000fce9f] mapped at [ffff8800000fce90] -initial memory mapped : 0 - 20000000 +initial memory mapped: [mem 0x00000000-0x1fffffff] -Base memory trampoline at [ffff88000009c000] 9c000 size 8192 +Base memory trampoline [mem 0x0009c000-0x0009dfff] mapped at [ffff88000009c000] -SRAT: Node 0 PXM 0 0-80000000 +SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
3709c857350976408953831f0cf89d19951394a1 |
|
22-Jul-2010 |
Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> |
x86: Ioremap: fix wrong physical address handling in PAT code The following two commits fixed a problem that x86 ioremap() doesn't handle physical address higher than 32-bit properly in X86_32 PAE mode. ffa71f33a820d1ab3f2fc5723819ac60fb76080b (x86, ioremap: Fix incorrect physical address handling in PAE mode) 35be1b716a475717611b2dc04185e9d80b9cb693 (x86, ioremap: Fix normal ram range check) But these fixes are not enough, since pat_pagerange_is_ram() in PAT code also has a same problem. This patch fixes it. Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> LKML-Reference: <4C47DDCF.80300@jp.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
6a4f3b523779b67e7d560ed42652f8a59f2f9782 |
|
11-Jun-2010 |
Venkatesh Pallipadi <venki@google.com> |
x86, pat: Proper init of memtype subtree_max_end subtree_max_end that was recently added to struct memtype was not getting properly initialized resulting in WARNING: kmemcheck: Caught 64-bit read from uninitialized memory in memtype_rb_augment_cb() reported here https://bugzilla.kernel.org/show_bug.cgi?id=16092 This change fixes the problem. Reported-by: Christian Casteyde <casteyde.christian@free.fr> Tested-by: Christian Casteyde <casteyde.christian@free.fr> Signed-off-by: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <1276217101-11515-1-git-send-email-venki@google.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
|
20413f27163fb1b8b806c0c219dc95eae67c633a |
|
26-May-2010 |
Xiaotian Feng <dfeng@redhat.com> |
x86, pat: Fix memory leak in free_memtype Reserve_memtype will allocate memory for new memtype, but in free_memtype, after the memtype erased from rbtree, the memory is not freed. Changes since V1: make rbt_memtype_erase return erased memtype so that it can be freed in free_memtype. [ hpa: not for -stable: 2.6.34 and earlier not affected ] Signed-off-by: Xiaotian Feng <dfeng@redhat.com> LKML-Reference: <1274838670-8731-1-git-send-email-dfeng@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Jack Steiner <steiner@sgi.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
1f9cc3cb6a27521edfe0a21abf97d2bb11c4d237 |
|
23-Apr-2010 |
Robin Holt <holt@sgi.com> |
x86, pat: Update the page flags for memtype atomically instead of using memtype_lock While testing an application using the xpmem (out of kernel) driver, we noticed a significant page fault rate reduction of x86_64 with respect to ia64. For one test running with 32 cpus, one thread per cpu, it took 01:08 for each of the threads to vm_insert_pfn 2GB worth of pages. For the same test running on 256 cpus, one thread per cpu, it took 14:48 to vm_insert_pfn 2 GB worth of pages. The slowdown was tracked to lookup_memtype which acquires the spinlock memtype_lock. This heavily contended lock was slowing down vm_insert_pfn(). With the cmpxchg on page->flags method, both the 32 cpu and 256 cpu cases take approx 00:01.3 seconds to complete. Signed-off-by: Robin Holt <holt@sgi.com> LKML-Reference: <20100423153627.751194346@gulag1.americas.sgi.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com> Cc: Rafael Wysocki <rjw@novell.com> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
5a0e3ad6af8660be21ca98a971cd00f331318c05 |
|
24-Mar-2010 |
Tejun Heo <tj@kernel.org> |
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
9e41a49aab88a5a6c8f4875bf10a5543bc321f2d |
|
11-Feb-2010 |
Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com> |
x86, pat: Migrate to rbtree only backend for pat memtype management Move pat backend to fully rbtree based implementation from the existing rbtree and linked list hybrid. New rbtree based solution uses interval trees (augmented rbtrees) in order to store the PAT ranges. The new code seprates out the pat backend to pat_rbtree.c file, making is cleaner. The change also makes the PAT lookup, reserve and free operations more optimal, as we don't have to traverse linear linked list of few tens of entries in normal case. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20100210232607.GB11465@linux-os.sc.intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
be5a0c126ad1dea2128dc5aef12c87083518d1ab |
|
10-Feb-2010 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86, pat: Preparatory changes in pat.c for bigger rbtree change Minor changes in pat.c to cleanup code and make it smoother to introduce bigger rbtree only change in the following patch. The changes are cleaup only and should not have any functional impact. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20100210195909.792781000@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
6b2f3d1f769be5779b479c37800229d9a4809fc3 |
|
27-Oct-2009 |
Christoph Hellwig <hch@lst.de> |
vfs: Implement proper O_SYNC semantics While Linux provided an O_SYNC flag basically since day 1, it took until Linux 2.4.0-test12pre2 to actually get it implemented for filesystems, since that day we had generic_osync_around with only minor changes and the great "For now, when the user asks for O_SYNC, we'll actually give O_DSYNC" comment. This patch intends to actually give us real O_SYNC semantics in addition to the O_DSYNC semantics. After Jan's O_SYNC patches which are required before this patch it's actually surprisingly simple, we just need to figure out when to set the datasync flag to vfs_fsync_range and when not. This patch renames the existing O_SYNC flag to O_DSYNC while keeping it's numerical value to keep binary compatibility, and adds a new real O_SYNC flag. To guarantee backwards compatiblity it is defined as expanding to both the O_DSYNC and the new additional binary flag (__O_SYNC) to make sure we are backwards-compatible when compiled against the new headers. This also means that all places that don't care about the differences can just check O_DSYNC and get the right behaviour for O_SYNC, too - only places that actuall care need to check __O_SYNC in addition. Drivers and network filesystems have been updated in a fail safe way to always do the full sync magic if O_DSYNC is set. The few places setting O_SYNC for lower layers are kept that way for now to stay failsafe. We enforce that O_DSYNC is set when __O_SYNC is set early in the open path to make sure we always get these sane options. Note that parisc really screwed up their headers as they already define a O_DSYNC that has always been a no-op. We try to repair it by using it for the new O_DSYNC and redefinining O_SYNC to send both the traditional O_SYNC numerical value _and_ the O_DSYNC one. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger@sun.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jan Kara <jack@suse.cz>
|
dd4377b02d9f028006beed1b7b1695ee5d1498b6 |
|
26-Nov-2009 |
Xiaotian Feng <dfeng@redhat.com> |
x86/pat: Trivial: don't create debugfs for memtype if pat is disabled If pat is disabled (boot with nopat), there's no need to create debugfs for it, it's empty all the time. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1259236428-16329-1-git-send-email-dfeng@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
eb41c8be89dbe079f49202774e04a79ccac48a09 |
|
23-Nov-2009 |
H. Peter Anvin <hpa@zytor.com> |
x86, platform: Change is_untracked_pat_range() to bool; cleanup init - Change is_untracked_pat_range() to return bool. - Clean up the initialization of is_untracked_pat_range() -- by default, we simply point it at is_ISA_range() directly. - Move is_untracked_pat_range to the end of struct x86_platform, since it is the newest field. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jack Steiner <steiner@sgi.com> LKML-Reference: <20091119202341.GA4420@sgi.com>
|
8a27138924f64d2f30c1022f909f74480046bc3f |
|
23-Nov-2009 |
H. Peter Anvin <hpa@zytor.com> |
x86, mm: is_untracked_pat_range() takes a normal semiclosed range is_untracked_pat_range() -- like its components, is_ISA_range() and is_GRU_range(), takes a normal semiclosed interval (>=, <) whereas the PAT code called it as if it took a closed range (>=, <=). Fix. Although this is a bug, I believe it is non-manifest, simply because none of the callers will call this with non-page-aligned addresses. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091119202341.GA4420@sgi.com>
|
fd12a0d69aee6d90fa9b9890db24368a897f8423 |
|
19-Nov-2009 |
Jack Steiner <steiner@sgi.com> |
x86: UV SGI: Don't track GRU space in PAT GRU space is always mapped as WB in the page table. There is no need to track the mappings in the PAT. This also eliminates the "freeing invalid memtype" messages when the GRU space is unmapped. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20091119202341.GA4420@sgi.com> [ v2: fix build failure ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
83ea05ea69290b2e30da795527dbe304db1e2331 |
|
10-Nov-2009 |
Xiaotian Feng <dfeng@redhat.com> |
x86: pat: Clean up req_type special case for reserve_memtype() Commit: b6ff32d: x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype() consolidated code in pat_x_mtrr_type() and reserve_memtype(), which removed the special case (req_type is -1) for the PAT-enabled part. We should also change comments and the PAT-disabled part. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <1257844987-7906-1-git-send-email-dfeng@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
e23a8b6a8f319c0f08b6ccef2dccbb37e7603dc2 |
|
24-Sep-2009 |
Roland Dreier <rdreier@cisco.com> |
x86: Reduce verbosity of "PAT enabled" kernel message On modern systems, the kernel prints the message x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106 once for every CPU. This gets kind of ridiculous on huge systems; for example, on a 64-thread system I was lucky enough to get: dmesg| grep 'PAT enabled' | wc 64 704 5174 There is already a BUG() if non-boot CPUs have PAT capabilities that don't match the boot CPU, so just print the message on the boot CPU. (I kept the print after the wrmsrl() that enables PAT, so that the log output continues to mean that the system survived enabling PAT on the boot CPU) Signed-off-by: Roland Dreier <rolandd@cisco.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <adavdj92sso.fsf@cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
dcb73bf402e0d5b28ce925dbbe4dab3b00b21eee |
|
16-Sep-2009 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86, pat: don't use rb-tree based lookup in reserve_memtype() Recent enhancement of rb-tree based lookup exposed a bug with the lookup mechanism in the reserve_memtype() which ensures that there are no conflicting memtype requests for the memory range. memtype_rb_search() returns an entry which has a start address <= new start address. And from here we traverse the linear linked list to check if there any conflicts with the existing mappings. As the rbtree is based on the start address of the memory range, it is quite possible that we have several overlapped mappings whose start address is much less than new requested start but the end is >= new requested end. This results in conflicting memtype mappings. Same bug exists with the old code which uses cached_entry from where we traverse the linear linked list. But the new rb-tree code exposes this bug fairly easily. For now, don't use the memtype_rb_search() and always start the search from the head of linear linked list in reserve_memtype(). Linear linked list for most of the systems grow's to few 10's of entries(as we track memory type of RAM pages using struct page). So we should be ok for now. We still retain the rbtree and use it to speed up free_memtype() which doesn't have the same bug(as we know what exactly we are searching for in free_memtype). Also use list_for_each_entry_from() in free_memtype() so that we start the search from rb-tree lookup result. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <1253136483.4119.12.camel@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
d535e4319a2e598ba25dc966ada4e52ea774e33f |
|
04-Sep-2009 |
Tobias Klauser <tklauser@distanz.ch> |
x86: Make memtype_seq_ops const Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
d886c73cd4cf02a71e1650cbcb6176799d78aac1 |
|
10-Jul-2009 |
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> |
x86, pat: Sanity check remap_pfn_range for RAM region Add sanity check for remap_pfn_range of RAM regions using lookup_memtype(). Previously, we did not have anyway to get the type of RAM memory regions as they were tracked using a single bit in page_struct (WB, nonWB). Now we can get the actual type from page struct (WB, WC, UC_MINUS) and make sure the requester gets that type. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
1087637616dd5e96d834164ea462aed6159d039b |
|
10-Jul-2009 |
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> |
x86, pat: Lookup the protection from memtype list on vm_insert_pfn() Lookup the reserved memtype during vm_insert_pfn and use that memtype for the new mapping. This takes care or handling of vm_insert_pfn() interface in track_pfn_vma*/untrack_pfn_vma. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
637b86e75f4c255a4446bc0b67ce9d914b9d2d42 |
|
10-Jul-2009 |
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> |
x86, pat: Add lookup_memtype to get the current memtype of a paddr Add a new routine lookup_memtype() to get the current memtype based on the PAT reserves and frees. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
f58417409603d62f2eb23db4d2cf6853d84a1698 |
|
10-Jul-2009 |
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> |
x86, pat: Use page flags to track memtypes of RAM pages Change reserve_ram_pages_type and free_ram_pages_type to use 2 page flags to track UC_MINUS, WC, WB and default types. Previous RAM tracking just tracked WB or NonWB, which was not complete and did not allow tracking of RAM fully and there was no way to get the actual type reserved by looking at the page flags. We use the memtype_lock spinlock for atomicity in dealing with memtype tracking in struct page. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
335ef896d4c6639849d79367f0fef9abc06d121b |
|
10-Jul-2009 |
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> |
x86, pat: Add rbtree to do quick lookup in memtype tracking PAT memtype tracking uses a linear link list to keep track of IO (non-RAM) regions and their memtypes. The code used a last_accessed pointer as a cache to speedup the lookup. As per discussions with H. Peter Anvin a while back, having a rbtree here will avoid bad performances in pathological cases where we may end up with huge linked list. This may not add any noticable performance speedup in normal case as the number of entires in PAT memtype list tend to be ~20-30 range. The patch removes the "cached_entry" logic as with rbtree we have more generic way of speeding up the lookup. With this patch, we use rbtree to do the quick lookup. We still use linked list as the memtype range tracked can be of different sizes and can overlap in different ways. We also keep track of usage counts with linked list. Example: Multiple ioremaps with different sizes uncached-minus @ 0xfffff00000-0xfffff04000 uncached-minus @ 0xfffff02000-0xfffff03000 And one userlevel mmap and the thread forks a new process uncached-minus @ 0xbf453000-0xbf454000 uncached-minus @ 0xbf453000-0xbf454000 Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
9fd126bc742f74a95d2ba610247712ff05da02fe |
|
10-Jul-2009 |
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> |
x86, pat: New i/f for driver to request memtype for IO regions Add new routines to request memtype for IO regions. This will currently be a backend for io_mapping_* routines. But, it can also be made available to drivers directly in future, in case it is needed. reserve interface reserves the memory, makes sure we have a compatible memory type available and keeps the identity map in sync when needed. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
5fc517466dd3d0fc6d2a5180ca6792e60344d8be |
|
10-Jul-2009 |
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> |
x86, pat: Keep identity maps consistent with mmaps even when pat_disabled Make reserve_memtype internally take care of pat disabled case and fallback to default return values. Remove the specific pat_disabled checks in track_* routines. Change kernel_map_sync_memtype to sync identity map even when pat_disabled. This change ensures that, even for pat_disabled case, we take care of keeping identity map in sync. Before this patch, in pat disabled case, ioremap() keeps the identity maps in sync and other APIs like pci and /dev/mem mmap don't, which is not a very consistent behavior. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
1adcaafe7414c5731f758b158aa0525057225deb |
|
17-Aug-2009 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86, pat: Allow ISA memory range uncacheable mapping requests Max Vozeler reported: > Bug 13877 - bogl-term broken with CONFIG_X86_PAT=y, works with =n > > strace of bogl-term: > 814 mmap2(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) > = -1 EAGAIN (Resource temporarily unavailable) > 814 write(2, "bogl: mmaping /dev/fb0: Resource temporarily unavailable\n", > 57) = 57 PAT code maps the ISA memory range as WB in the PAT attribute, so that fixed range MTRR registers define the actual memory type (UC/WC/WT etc). But the upper level is_new_memtype_allowed() API checks are failing, as the request here is for UC and the return tracked type is WB (Tracked type is WB as MTRR type for this legacy range potentially will be different for each 4k page). Fix is_new_memtype_allowed() by always succeeding the ISA address range checks, as the null PAT (WB) and def MTRR fixed range register settings satisfy the memory type needs of the applications that map the ISA address range. Reported-and-Tested-by: Max Vozeler <xam@debian.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
4b065046273afa01ec8e3de7da407e8d3599251d |
|
09-Apr-2009 |
Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com> |
x86, PAT: Remove page granularity tracking for vm_insert_pfn maps This change resolves the problem of too many single page entries in pat_memtype_list and "freeing invalid memtype" errors with i915, reported here: http://marc.info/?l=linux-kernel&m=123845244713183&w=2 Remove page level granularity track and untrack of vm_insert_pfn. memtype tracking at page granularity does not scale and cleaner approach would be for the driver to request a type for a bigger IO address range or PCI io memory range for that device, either at mmap time or driver init time and just use that type during vm_insert_pfn. This patch just removes the track/untrack of vm_insert_pfn. That means we will be in same state as 2.6.28, with respect to these APIs. Newer APIs for the drivers to request a memtype for a bigger region is coming soon. [ Impact: fix Xorg startup warnings and hangs ] Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Tested-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> LKML-Reference: <20090408223716.GC3493@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
1ee4bd92a7aa49eb66c8d5672e837090d3e7b7ff |
|
10-Apr-2009 |
Marcin Slusarz <marcin.slusarz@gmail.com> |
x86: fix wrong section of pat_disable & make it static pat_disable cannot be __cpuinit anymore because it's called from pat_init and the callchain looks like this: pat_disable [cpuinit] <- pat_init <- generic_set_all <- ipi_handler <- set_mtrr <- (other non init/cpuinit functions) WARNING: arch/x86/mm/built-in.o(.text+0x449e): Section mismatch in reference from the function pat_init() to the function .cpuinit.text:pat_disable() The function pat_init() references the function __cpuinit pat_disable(). This is often because pat_init lacks a __cpuinit annotation or the annotation of pat_disable is wrong. Non CONFIG_X86_PAT version of pat_disable is static inline, so this version can be static too (and there are no callers outside of this file). Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> LKML-Reference: <49DFB055.6070405@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
0c3c8a18361a636069f5a5d9d0d0f9c2124e6b94 |
|
09-Apr-2009 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86, PAT: Remove duplicate memtype reserve in devmem mmap /dev/mem mmap code was doing memtype reserve/free for a while now. Recently we added memtype tracking in remap_pfn_range, and /dev/mem mmap uses it indirectly. So, we don't need seperate tracking in /dev/mem code any more. That means another ~100 lines of code removed :-). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20090409212709.085210000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
b6ff32d9aaeeeecf98f9a852d715569183585312 |
|
09-Apr-2009 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype() Fix pat_x_mtrr_type() to use UC_MINUS when the mtrr type return UC. This is to be consistent with ioremap() and ioremap_nocache() which uses UC_MINUS. Consolidate the code such that reserve_memtype() also uses pat_x_mtrr_type() when the caller doesn't specify any special attribute (non WB attribute). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20090409212708.939936000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
4bb9c5c02153dfc89a6c73a6f32091413805ad7d |
|
13-Mar-2009 |
Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com> |
VM, x86, PAT: Change is_linear_pfn_mapping to not use vm_pgoff Impact: fix false positive PAT warnings - also fix VirtalBox hang Use of vma->vm_pgoff to identify the pfnmaps that are fully mapped at mmap time is broken. vm_pgoff is set by generic mmap code even for cases where drivers are setting up the mappings at the fault time. The problem was originally reported here: http://marc.info/?l=linux-kernel&m=123383810628583&w=2 Change is_linear_pfn_mapping logic to overload VM_INSERTPAGE flag along with VM_PFNMAP to mean full PFNMAP setup at mmap time. Problem also tracked at: http://bugzilla.kernel.org/show_bug.cgi?id=12800 Reported-by: Thomas Hellstrom <thellstrom@vmware.com> Tested-by: Frans Pop <elendil@planet.nl> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha>@intel.com> Cc: Nick Piggin <npiggin@suse.de> Cc: "ebiederm@xmission.com" <ebiederm@xmission.com> Cc: <stable@kernel.org> # only for 2.6.29.1, not .28 LKML-Reference: <20090313004527.GA7176@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
92b9af9e4f144535c65aee673cfad309f25fa465 |
|
28-Feb-2009 |
Ingo Molnar <mingo@elte.hu> |
x86: i915 needs pgprot_writecombine() and is_io_mapping_possible() Impact: build fix Theodore Ts reported that the i915 driver needs these symbols: ERROR: "pgprot_writecombine" [drivers/gpu/drm/i915/i915.ko] undefined! ERROR: "is_io_mapping_possible" [drivers/gpu/drm/i915/i915.ko] undefined! Reported-by: Theodore Ts'o <tytso@mit.edu> wrote: Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
7880f7464546842ee14179bef16a6e14381ea638 |
|
25-Feb-2009 |
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> |
gpu/drm, x86, PAT: routine to keep identity map in sync Add a function to check and keep identity maps in sync, when changing any memory type. One of the follow on patches will also use this routine. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Eric Anholt <eric@anholt.net> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
be03d9e8022030c16abf534e33e185bfc3d40eef |
|
11-Feb-2009 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem Jeff Mahoney reported: > With Suse's hwinfo tool, on -tip: > WARNING: at arch/x86/mm/pat.c:637 reserve_pfn_range+0x5b/0x26d() reserve_pfn_range() is not tracking the memory range below 1MB as non-RAM and as such is inconsistent with similar checks in reserve_memtype() and free_memtype() Rename the pagerange_is_ram() to pat_pagerange_is_ram() and add the "track legacy 1MB region as non RAM" condition. And also, fix reserve_pfn_range() to return -EINVAL, when the pfn range is RAM. This is to be consistent with this API design. Reported-and-tested-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
75a048119e76540d73132cfc8e0fa0c0a8bb6c83 |
|
23-Jan-2009 |
H. Peter Anvin <hpa@linux.intel.com> |
x86: handle PAT more like other CPU features Impact: Cleanup When PAT was originally introduced, it was handled specially for a few reasons: - PAT bugs are hard to track down, so we wanted to maintain a whitelist of CPUs. - The i386 and x86-64 CPUID code was not yet unified. Both of these are now obsolete, so handle PAT like any other features, including ordinary feature blacklisting due to known bugs. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
9597134218300c045cf219be3664615e97cb239c |
|
13-Jan-2009 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86: fix PTE corruption issue while mapping RAM using /dev/mem Beschorner Daniel reported: > hwinfo problem since 2.6.28, showing this in the oops: > Corrupted page table at address 7fd04de3ec00 Also, PaX Team reported a regression with this commit: > commit 9542ada803198e6eba29d3289abb39ea82047b92 > Author: Suresh Siddha <suresh.b.siddha@intel.com> > Date: Wed Sep 24 08:53:33 2008 -0700 > > x86: track memtype for RAM in page struct This commit breaks mapping any RAM page through /dev/mem, as the reserve_memtype() was not initializing the return attribute type and as such corrupting the PTE entry that was setup with the return attribute type. Because of this bug, application mapping this RAM page through /dev/mem will die with "Corrupted page table at address xxxx" message in the kernel log and also the kernel identity mapping which maps the underlying RAM page gets converted to UC. Fix this by initializing the return attribute type before calling reserve_ram_pages_type() Reported-by: PaX Team <pageexec@freemail.hu> Reported-and-tested-by: Beschorner Daniel <Daniel.Beschorner@facton.com> Tested-and-Acked-by: PaX Team <pageexec@freemail.hu> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
b5db0e38653bfada34a92f360b4111566ede3842 |
|
16-Jan-2009 |
Linus Torvalds <torvalds@linux-foundation.org> |
Revert "x86 PAT: remove CPA WARN_ON for zero pte" This reverts commit 58dab916dfb57328d50deb0aa9b3fc92efa248ff, which makes my Nehalem come to a nasty crawling almost-halt. It looks like it turns off caching of regular kernel RAM, with the understandable slowdown of a few orders of magnitude as a result. Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
5cca0cf15a94417f49625ce52e23589eed0a1675 |
|
09-Jan-2009 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86, pat: fix reserve_memtype() for legacy 1MB range Thierry Vignaud reported: > http://bugzilla.kernel.org/show_bug.cgi?id=12372 > > On P4 with an SiS motherboard (video card is a SiS 651) > X server fails to start with error: > xf86MapVidMem: Could not mmap framebuffer (0x00000000,0x2000) (Invalid > argument) Here X is trying to map first 8KB of memory using /dev/mem. Existing code treats first 0-4KB of memory as non-RAM and 4KB-8KB as RAM. Recent code changes don't allow to map memory with different attributes at the same time. Fix this by treating the first 1MB legacy region as special and always track the attribute requests with in this region using linear linked list (and don't bother if the range is RAM or non-RAM or mixed) Reported-and-tested-by: Thierry Vignaud <tvignaud@mandriva.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
58dab916dfb57328d50deb0aa9b3fc92efa248ff |
|
10-Jan-2009 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86 PAT: remove CPA WARN_ON for zero pte Impact: reduce scope of debug check - avoid warnings The logic to find whether identity map exists or not using high_memory or max_low_pfn_mapped/max_pfn_mapped are not complete as the memory withing the range may not be mapped if there is a unusable hole in e820. Specifically, on my test system I started seeing these warnings with tools like hwinfo, acpidump trying to map ACPI region. [ 27.400018] ------------[ cut here ]------------ [ 27.400344] WARNING: at /home/venkip/src/linus/linux-2.6/arch/x86/mm/pageattr.c:560 __change_page_attr_set_clr+0xf3/0x8b8() [ 27.400821] Hardware name: X7DB8 [ 27.401070] CPA: called for zero pte. vaddr = ffff8800cff6a000 cpa->vaddr = ffff8800cff6a000 [ 27.401569] Modules linked in: [ 27.401882] Pid: 4913, comm: dmidecode Not tainted 2.6.28-05716-gfe0bdec #586 [ 27.402141] Call Trace: [ 27.402488] [<ffffffff80237c21>] warn_slowpath+0xd3/0x10f [ 27.402749] [<ffffffff80274ade>] ? find_get_page+0xb3/0xc9 [ 27.403028] [<ffffffff80274a2b>] ? find_get_page+0x0/0xc9 [ 27.403333] [<ffffffff80226425>] __change_page_attr_set_clr+0xf3/0x8b8 [ 27.403628] [<ffffffff8028ec99>] ? __purge_vmap_area_lazy+0x192/0x1a1 [ 27.403883] [<ffffffff8028eb52>] ? __purge_vmap_area_lazy+0x4b/0x1a1 [ 27.404172] [<ffffffff80290268>] ? vm_unmap_aliases+0x1ab/0x1bb [ 27.404512] [<ffffffff80290105>] ? vm_unmap_aliases+0x48/0x1bb [ 27.404766] [<ffffffff80226d28>] change_page_attr_set_clr+0x13e/0x2e6 [ 27.405026] [<ffffffff80698fa7>] ? _spin_unlock+0x26/0x2a [ 27.405292] [<ffffffff80227e6a>] ? reserve_memtype+0x19b/0x4e3 [ 27.405590] [<ffffffff80226ffd>] _set_memory_wb+0x22/0x24 [ 27.405844] [<ffffffff80225d28>] ioremap_change_attr+0x26/0x28 [ 27.406097] [<ffffffff80228355>] reserve_pfn_range+0x1a3/0x235 [ 27.406427] [<ffffffff80228430>] track_pfn_vma_new+0x49/0xb3 [ 27.406686] [<ffffffff80286c46>] remap_pfn_range+0x94/0x32c [ 27.406940] [<ffffffff8022878d>] ? phys_mem_access_prot_allowed+0xb5/0x1a8 [ 27.407209] [<ffffffff803e9bf4>] mmap_mem+0x75/0x9d [ 27.407523] [<ffffffff8028b3b4>] mmap_region+0x2cf/0x53e [ 27.407776] [<ffffffff8028b8cc>] do_mmap_pgoff+0x2a9/0x30d [ 27.408034] [<ffffffff8020f4a4>] sys_mmap+0x92/0xce [ 27.408339] [<ffffffff8020b65b>] system_call_fastpath+0x16/0x1b [ 27.408614] ---[ end trace 4b16ad70c09a602d ]--- [ 27.408871] dmidecode:4913 reserve_pfn_range ioremap_change_attr failed write-back for cff6a000-cff6b000 This is wih track_pfn_vma_new trying to keep identity map in sync. The address cff6a000 is the ACPI region according to e820. [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009c000 (usable) [ 0.000000] BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved) [ 0.000000] BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved) [ 0.000000] BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved) [ 0.000000] BIOS-e820: 0000000000100000 - 00000000cff60000 (usable) [ 0.000000] BIOS-e820: 00000000cff60000 - 00000000cff69000 (ACPI data) [ 0.000000] BIOS-e820: 00000000cff69000 - 00000000cff80000 (ACPI NVS) [ 0.000000] BIOS-e820: 00000000cff80000 - 00000000d0000000 (reserved) [ 0.000000] BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved) [ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved) [ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) [ 0.000000] BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved) [ 0.000000] BIOS-e820: 0000000100000000 - 0000000230000000 (usable) And is not mapped as per init_memory_mapping. [ 0.000000] init_memory_mapping: 0000000000000000-00000000cff60000 [ 0.000000] init_memory_mapping: 0000000100000000-0000000230000000 We can add logic to check for this. But, there can also be other holes in identity map when we have 1GB of aligned reserved space in e820. This patch handles it by removing the WARN_ON and returning a specific error value (EFAULT) to indicate that the address does not have any identity mapping. The code that tries to keep identity map in sync can ignore this error, with other callers of cpa still getting error here. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
cdecff6864a1cd352a41d44a65e7451b8ef5cee2 |
|
10-Jan-2009 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86 PAT: return compatible mapping to remap_pfn_range callers Impact: avoid warning message, potentially solve 3D performance regression Change x86 PAT code to return compatible memtype if the exact memtype that was requested in remap_pfn_rage and friends is not available due to some conflict. This is done by returning the compatible type in pgprot parameter of track_pfn_vma_new(), and the caller uses that memtype for page table. Note that track_pfn_vma_copy() which is basically called during fork gets the prot from existing page table and should not have any conflict. Hence we use strict memtype check there and do not allow compatible memtypes. This patch fixes the bug reported here: http://marc.info/?l=linux-kernel&m=123108883716357&w=2 Specifically the error message: X:5010 map pfn expected mapping type write-back for d0000000-d0101000, got write-combining Should go away. Reported-and-bisected-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
e4b866ed197cef9989348e0479fed8d864ea465b |
|
10-Jan-2009 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86 PAT: change track_pfn_vma_new to take pgprot_t pointer param Impact: cleanup Change the protection parameter for track_pfn_vma_new() into a pgprot_t pointer. Subsequent patch changes the x86 PAT handling to return a compatible memtype in pgprot_t, if what was requested cannot be allowed due to conflicts. No fuctionality change in this patch. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
c1c15b65ec30275575dac9322aae607075769fbc |
|
23-Dec-2008 |
H. Peter Anvin <hpa@zytor.com> |
x86: PAT: fix address types in track_pfn_vma_new() Impact: cleanup, fix warning This warning: arch/x86/mm/pat.c: In function track_pfn_vma_copy: arch/x86/mm/pat.c:701: warning: passing argument 5 of follow_phys from incompatible pointer type Triggers because physical addresses are resource_size_t, not u64. This really matters when calling an interface like follow_phys() which takes a pointer to a physical address -- although on x86, being littleendian, it would generally work anyway as long as the memory region wasn't completely uninitialized. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
982d789ab76c8a11426852fec2fdf2f412e21c0c |
|
19-Dec-2008 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86: PAT: remove follow_pfnmap_pte in favor of follow_phys Impact: Cleanup - removes a new function in favor of a recently modified older one. Replace follow_pfnmap_pte in pat code with follow_phys. follow_phys lso returns protection eliminating the need of pte_pgprot call. Using follow_phys also eliminates the need for pte_pa. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
2520bd3123c00272f818a176c92d03c7d0a113d6 |
|
18-Dec-2008 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86: PAT: add pgprot_writecombine() interface for drivers - v3 Impact: New mm functionality. Add pgprot_writecombine. pgprot_writecombine will be aliased to pgprot_noncached when not supported by the architecture. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
5899329b19100c0b82dc78e9b21ed8b920c9ffb3 |
|
18-Dec-2008 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86: PAT: implement track/untrack of pfnmap regions for x86 - v3 Impact: New mm functionality. Hookup remap_pfn_range and vm_insert_pfn and corresponding copy and free routines with reserve and free tracking. reserve and free here only takes care of non RAM region mapping. For RAM region, driver should use set_memory_[uc|wc|wb] to set the cache type and then setup the mapping for user pte. We can bypass below reserve/free in that case. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
9e41bff2708e420e61e6b89a54c15232857069b1 |
|
30-Oct-2008 |
Ravikiran G Thirumalai <kiran@scalex86.org> |
x86: fix /dev/mem mmap breakage when PAT is disabled Impact: allow /dev/mem mmaps on non-PAT CPUs/platforms Fix mmap to /dev/mem when CONFIG_X86_PAT is off and CONFIG_STRICT_DEVMEM is off mmap to /dev/mem on kernel memory has been failing since the introduction of PAT (CONFIG_STRICT_DEVMEM=n case). Seems like the check to avoid cache aliasing with PAT is kicking in even when PAT is disabled. The bug seems to have crept in 2.6.26. This patch makes sure that the mmap to regular kernel memory succeeds if CONFIG_STRICT_DEVMEM=n and PAT is disabled, and the checks to avoid cache aliasing still happens if PAT is enabled. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Tested-by: Tim Sirianni <tim@scalemp.com> Cc: <stable@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
ad2cde16a21985cdc4302e4a4b0fc373d666fdf7 |
|
30-Sep-2008 |
Ingo Molnar <mingo@elte.hu> |
x86, pat: cleanups clean up recently added code to be more consistent with other x86 code. Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
9542ada803198e6eba29d3289abb39ea82047b92 |
|
24-Sep-2008 |
Suresh Siddha <suresh.b.siddha@intel.com> |
x86: track memtype for RAM in page struct Track the memtype for RAM pages in page struct instead of using the memtype list. This avoids the explosion in the number of entries in memtype list (of the order of 20,000 with AGP) and makes the PAT tracking simpler. We are using PG_arch_1 bit in page->flags. We still use the memtype list for non RAM pages. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
28df82ebab79c6a2b4295dd94fd8de88196a49df |
|
21-Aug-2008 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
devmem, x86: PAT Change /dev/mem mmap with O_SYNC to use UC_MINUS All kernel mappings like ioremap(), etc uses UC_MINUS as the type. /dev/mem mappings with /dev/mem being opened with O_SYNC however was using UC, resulting in a conflict with /dev/mem mmap failing. This seems to be affecting some apps (one being flashrom) which are using O_SYNC and which were working before. Switch /dev/mem with O_SYNC also to UC_MINUS. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
80c5e73d6028e0f03ab0c70e7c4cbf98ac2e0c43 |
|
20-Aug-2008 |
Venki Pallipadi <venkatesh.pallipadi@intel.com> |
x86: fix Xorg startup/shutdown slowdown with PAT Rene Herman reported significant Xorg startup/shutdown slowdown due to PAT. It turns out that the memtype list has thousands of entries. Add cached_entry to list add routine, in order to speed up the lookup for sequential reserve_memtype calls. Reported-by: Rene Herman <rene.herman@keyaccess.nl> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
012f09e7942716d5f4339f1fd9a831a485bb1d4a |
|
06-Aug-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: compile pat debugfs interface only if CONFIG_X86_PAT is set Recently I've run a kernel w/o PAT support and wondered why there was a file "x86/pat_memtype_list" in debugfs. Of course it's empty if PAT is disabled ... so just get rid of this interface if PAT is disabled. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
d092633bff3b19faffc480fe9810805e7792a029 |
|
18-Jul-2008 |
Ingo Molnar <mingo@elte.hu> |
Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM From: Arjan van de Ven <arjan@infradead.org> Date: Sat, 19 Jul 2008 15:47:17 -0700 CONFIG_NONPROMISC_DEVMEM was a rather confusing name - but renaming it to CONFIG_PROMISC_DEVMEM causes problems on architectures that do not support this feature; this patch renames it to CONFIG_STRICT_DEVMEM, so that architectures can opt-in into it. ( the polarity of the option is still the same as it was originally; it needs to be for now to not break architectures that don't have the infastructure yet to support this feature) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: "V.Radhakrishnan" <rk@atr-labs.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> ---
|
fec0962e0bed407927b9ff54bb0596a3ab7e4b61 |
|
19-Jul-2008 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86: Add a debugfs interface to dump PAT memtype Add a debugfs interface to list out all the PAT memtype reservations. Appears at debugfs x86/pat_memtype_list and output format is type @ <start addr>-<end addr> We do not hold the lock while printing the entire list. So, the list may not be a consistent copy in case where regions are getting added or deleted at the same time. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
64d206d896ff70b828138577d5ff39deda5f1c4d |
|
18-Jul-2008 |
Ingo Molnar <mingo@elte.hu> |
x86: rename CONFIG_NONPROMISC_DEVMEM to CONFIG_PROMISC_DEVMEM Linus observed: > The real bug is that we shouldn't have "double negatives", and > certainly not negative config options. Making that "promiscuous > /dev/mem" option a negated thing as a config option was bad. right ... lets rename this option. There should never be a negation in config options. [ that reminds me of CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER, but that is for another commit ;-) ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
965194c15dc9e4f3bc44432b39c441c86af7f11d |
|
12-Jul-2008 |
Yinghai Lu <yhlu.kernel@gmail.com> |
x86: max_low_pfn_mapped fix, #2 tighten the boundary checks around max_low_pfn_mapped - dont overmap nor undermap into holes. also print out tseg for AMD cpus, for diagnostic purposes. (this is an SMM area, and we split up any big mappings around that area) Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
f361a450bf1ad14e2b003217dbf3958638631265 |
|
11-Jul-2008 |
Yinghai Lu <yhlu.kernel@gmail.com> |
x86: introduce max_low_pfn_mapped for 64-bit when more than 4g memory is installed, don't map the big hole below 4g. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
33af9039cbf629041da2bfa0cf451208391a1ec3 |
|
20-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: pat.c final cleanup of loop body in reserve_memtype Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
64fe44c38bbdfab4fe052029058ce5fe9804de68 |
|
20-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: pat.c introduce function to check for conflicts with existing memtypes ... to strip down loop body in reserve_memtype. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
f6887264deba4cd991f0ca006918dcff4c939021 |
|
20-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: pat.c consolidate list_add handling in reserve_memtype Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
3e9c83b309fd7cbf1d9b801d0d5877c040e30420 |
|
20-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: pat.c consolidate error/debug messages in reserve_memtype ... and move last debug message out of locked section. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
69e26be9b1d0c83d3581475095ce2a1ccc578215 |
|
20-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: pat.c more trivial changes - add BUG statement to catch invalid start and end parameters - No need to track the actual type in both req_type and actual_type -- keep req_type unchanged. - removed (IMHO) superfluous comments Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
ac97991ec9e0a05c8860f4a04f8477227b1c03f2 |
|
20-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: pat.c choose more crisp variable names - parse/ml => entry (within list_for_each and friends) - new_entry => new - ret_type => new_type (to avoid confusion with req_type) (... to make it more readable...) Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
bcc643dc287cb732e96a1685ac130c3ae8b1c960 |
|
20-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: introduce macro to check whether an address range is in the ISA range Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
dd0c7c4903c29da9aa3bf33deecf064d190a0d81 |
|
18-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: shrink pat_x_mtrr_type to its essentials Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
6cf514fce18589ea1e0521c5f2d7c2bb280fefc7 |
|
16-Jun-2008 |
Hugh Dickins <hugh@veritas.com> |
x86: PAT: make pat_x_mtrr_type() more readable Clean up over-complications in pat_x_mtrr_type(). And if reserve_memtype() ignores stray req_type bits when pat_enabled, it's better to mask them off when not also. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
499f8f84b8324ba27d756e03f373fa16eeed9ccc |
|
10-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: rename pat_wc_enabled to pat_enabled BTW, what does pat_wc_enabled stand for? Does it mean "write-combining"? Currently it is used to globally switch on or off PAT support. Thus I renamed it to pat_enabled. I think this increases readability (and hope that I didn't miss something). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
cd7a4e936d345ab4cb49d68192d90bd4e4c58458 |
|
10-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: PAT: fixed checkpatch errors (and whitespaces) x86: PAT: fixed checkpatch errors (and whitespaces) Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
97cfab6ac4ddfda0d722393bbf46cc40bc332107 |
|
10-Jun-2008 |
Andreas Herrmann <andreas.herrmann3@amd.com> |
x86: PAT: fix ambiguous paranoia check in pat_init() Starting with commit 8d4a4300854f3971502e81dacd930704cb88f606 (x86: cleanup PAT cpu validation) the PAT CPU feature flag is not cleared anymore. Now the error message "PAT enabled, but CPU feature cleared" in pat_init() is misleading. Furthermore the current code does not check for existence of the PAT CPU feature flag if a CPU is whitelisted in validate_pat_support. This patch clears pat_wc_enabled if boot CPU has no PAT feature flag and adapts the paranoia check. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
c26421d01986e1521043c8feb47256833df3bf31 |
|
29-May-2008 |
Venki Pallipadi <venkatesh.pallipadi@intel.com> |
x86: fix Xorg crash with xf86MapVidMem error Clarify the usage of mtrr_lookup() in PAT code, and to make PAT code resilient to mtrr lookup problems. Specifically, pat_x_mtrr_type() is restructured to highlight, under what conditions we look for mtrr hint. pat_x_mtrr_type() uses a default type when there are any errors in mtrr lookup (still maintaining the pat consistency). And, reserve_memtype() highlights its usage ot mtrr_lookup for request type of '-1' and also defaults in a sane way on any mtrr lookup failure. pat.c looks at mtrr type of a range to get a hint on what mapping type to request when user/API: (1) hasn't specified any type (/dev/mem mapping) and we do not want to take performance hit by always mapping UC_MINUS. This will be the case for /dev/mem mappings used to map BIOS area or ACPI region which are WB'able. In this case, as long as MTRR is not WB, PAT will request UC_MINUS for such mappings. (2) user/API requests WB mapping while in reality MTRR may have UC or WC. In this case, PAT can map as WB (without checking MTRR) and still effective type will be UC or WC. But, a subsequent request to map same region as UC or WC may fail, as the region will get trackked as WB in PAT list. Looking at MTRR hint helps us to track based on effective type rather than what user requested. Again, here mtrr_lookup is only used as hint and we fallback to WB mapping (as requested by user) as default. In both cases, after using the mtrr hint, we still go through the memtype list to make sure there are no inconsistencies among multiple users. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Rufus & Azrael <rufus-azrael@numericable.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
be524fb96081e9e511d993ebf39b05a32b19476e |
|
29-May-2008 |
Andrew Morton <akpm@linux-foundation.org> |
x86: section mismatch fix Fix this: WARNING: vmlinux.o(.text+0x114bb): Section mismatch in reference from the function nopat() to the function .cpuinit.text:pat_disable() The function nopat() references the function __cpuinit pat_disable(). This is often because nopat lacks a __cpuinit annotation or the annotation of pat_disable is wrong. Reported-by: "Fabio Comolli" <fabio.comolli@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
282c454cd3a7041f59a37112bb2f82263bc38f6c |
|
29-May-2008 |
Venki Pallipadi <venkatesh.pallipadi@intel.com> |
x86: fix Xorg crash with xf86MapVidMem error Clarify the usage of mtrr_lookup() in PAT code, and to make PAT code resilient to mtrr lookup problems. Specifically, pat_x_mtrr_type() is restructured to highlight, under what conditions we look for mtrr hint. pat_x_mtrr_type() uses a default type when there are any errors in mtrr lookup (still maintaining the pat consistency). And, reserve_memtype() highlights its usage ot mtrr_lookup for request type of '-1' and also defaults in a sane way on any mtrr lookup failure. pat.c looks at mtrr type of a range to get a hint on what mapping type to request when user/API: (1) hasn't specified any type (/dev/mem mapping) and we do not want to take performance hit by always mapping UC_MINUS. This will be the case for /dev/mem mappings used to map BIOS area or ACPI region which are WB'able. In this case, as long as MTRR is not WB, PAT will request UC_MINUS for such mappings. (2) user/API requests WB mapping while in reality MTRR may have UC or WC. In this case, PAT can map as WB (without checking MTRR) and still effective type will be UC or WC. But, a subsequent request to map same region as UC or WC may fail, as the region will get trackked as WB in PAT list. Looking at MTRR hint helps us to track based on effective type rather than what user requested. Again, here mtrr_lookup is only used as hint and we fallback to WB mapping (as requested by user) as default. In both cases, after using the mtrr hint, we still go through the memtype list to make sure there are no inconsistencies among multiple users. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Rufus & Azrael <rufus-azrael@numericable.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
46dd98a5c0b907f94742579ab605be1933a37abd |
|
15-May-2008 |
Andrew Morton <akpm@linux-foundation.org> |
arch/x86/mm/pat.c: use boot_cpu_has() arch/x86/mm/pat.c: In function 'phys_mem_access_prot_allowed': arch/x86/mm/pat.c:526: warning: passing argument 2 of 'constant_test_bit' from incompatible pointer type arch/x86/mm/pat.c:526: warning: passing argument 2 of 'variable_test_bit' from incompatible pointer type arch/x86/mm/pat.c:527: warning: passing argument 2 of 'constant_test_bit' from incompatible pointer type arch/x86/mm/pat.c:527: warning: passing argument 2 of 'variable_test_bit' from incompatible pointer type arch/x86/mm/pat.c:528: warning: passing argument 2 of 'constant_test_bit' from incompatible pointer type arch/x86/mm/pat.c:528: warning: passing argument 2 of 'variable_test_bit' from incompatible pointer type arch/x86/mm/pat.c:529: warning: passing argument 2 of 'constant_test_bit' from incompatible pointer type arch/x86/mm/pat.c:529: warning: passing argument 2 of 'variable_test_bit' from incompatible pointer type Don't open-code test_bit() on a __u32 Cc: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
31f4d870b02e1590260ab7f2a9ff74306bd27e88 |
|
13-May-2008 |
Avi Kivity <avi@qumranet.com> |
x86: fix crash on cpu hotplug on pat-incapable machines pat_disable() is __init, which means it goes away after booting is complete. Unfortunately it is used by the hotplug code if the machine is not pat-capable, causing a crash. Fix by marking pat_disable() as __cpuinit. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
afc85343807bc2c488b7372cd7547875dfe03fe5 |
|
12-May-2008 |
Pranith Kumar <bobby.prani@gmail.com> |
x86: arch/x86/mm/pat.c - fix warning fix this warning: arch/x86/mm/pat.c: In function `phys_mem_access_prot_allowed': arch/x86/mm/pat.c:558: warning: long long unsigned int format, long unsigned int arg (arg 6) arch/x86/mm/pat.c: In function `map_devmem': arch/x86/mm/pat.c:580: warning: long long unsigned int format, long unsigned int arg (arg 6) Signed-off-by: D Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
77b52b4c5c66175553ee59eb43f74366f1e54bde |
|
06-May-2008 |
Venki Pallipadi <venkatesh.pallipadi@intel.com> |
x86: add "debugpat" boot option enable debug messages by a boot option "debugpat". Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
8d4a4300854f3971502e81dacd930704cb88f606 |
|
08-May-2008 |
Thomas Gleixner <tglx@linutronix.de> |
x86: cleanup PAT cpu validation Move the scattered checks for PAT support to a single function. Its moved to addon_cpuid_features.c as this file is shared between 32 and 64 bit. Remove the manipulation of the PAT feature bit and just disable PAT in the PAT layer, based on the PAT bit provided by the CPU and the current CPU version/model white list. Change the boot CPU check so it works on Voyager somewhere in the future as well :) Also panic, when a secondary has PAT disabled but the primary one has alrady switched to PAT. We have no way to undo that. The white list is kept for now to ensure that we can rely on known to work CPU types and concentrate on the software induced problems instead of fighthing CPU erratas and subtle wreckage caused by not yet verified CPUs. Once the PAT code has stabilized enough, we can remove the white list and open the can of worms. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
f022bfd58253099102218db5249220a7f4787114 |
|
21-Mar-2008 |
Ingo Molnar <mingo@elte.hu> |
x86: PAT fix Adrian Bunk noticed the following Coverity report: > Commit e7f260a276f2c9184fe753732d834b1f6fbe9f17 > (x86: PAT use reserve free memtype in mmap of /dev/mem) > added the following gem to arch/x86/mm/pat.c: > > <-- snip --> > > ... > int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn, > unsigned long size, pgprot_t *vma_prot) > { > u64 offset = ((u64) pfn) << PAGE_SHIFT; > unsigned long flags = _PAGE_CACHE_UC_MINUS; > unsigned long ret_flags; > ... > ... (nothing that touches ret_flags) > ... > if (flags != _PAGE_CACHE_UC_MINUS) { > retval = reserve_memtype(offset, offset + size, flags, NULL); > } else { > retval = reserve_memtype(offset, offset + size, -1, &ret_flags); > } > > if (retval < 0) > return 0; > > flags = ret_flags; > > if (pfn <= max_pfn_mapped && > ioremap_change_attr((unsigned long)__va(offset), size, flags) < 0) { > free_memtype(offset, offset + size); > printk(KERN_INFO > "%s:%d /dev/mem ioremap_change_attr failed %s for %Lx-%Lx\n", > current->comm, current->pid, > cattr_name(flags), > offset, offset + size); > return 0; > } > > *vma_prot = __pgprot((pgprot_val(*vma_prot) & ~_PAGE_CACHE_MASK) | > flags); > return 1; > } > > <-- snip --> > > If (flags != _PAGE_CACHE_UC_MINUS) we pass garbage from the stack to > ioremap_change_attr() and/or __pgprot(). > > Spotted by the Coverity checker. the fix simplifies the code as we get rid of the 'ret_flags' complication. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
86cf02f8eaea1b09e102e0f432fc137dc5cf4407 |
|
27-Apr-2008 |
Linus Torvalds <torvalds@linux-foundation.org> |
x86 PAT: tone down debugging messages some more Ingo already fixed one of these at my request (in "x86 PAT: tone down debugging messages", commit 1ebcc654f010d4a63f3ebf8ddd2cab5a709b1824), but there was another one he missed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
0124cecfc85a6664b1ad5f1d28cf0ab8df66fc42 |
|
26-Apr-2008 |
Venki Pallipadi <venkatesh.pallipadi@intel.com> |
x86, PAT: disable /dev/mem mmap RAM with PAT disable /dev/mem mmap of RAM with PAT. It makes things safer and eliminates aliasing. A future improvement would be to avoid the range_is_allowed duplication. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
1ebcc654f010d4a63f3ebf8ddd2cab5a709b1824 |
|
26-Apr-2008 |
Ingo Molnar <mingo@elte.hu> |
x86 PAT: tone down debugging messages Linus reported these excessive debug printouts: > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0380000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 > Overlap at 0xe0300000-0xe0400000 turn that into a pr_debug(). Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
28eb559b5b0b9b51b9165a9b8faa75b0bb91ca8d |
|
03-Apr-2008 |
Ingo Molnar <mingo@elte.hu> |
pat: cleanups Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
e7f260a276f2c9184fe753732d834b1f6fbe9f17 |
|
19-Mar-2008 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86: PAT use reserve free memtype in mmap of /dev/mem Use reserve_memtype and free_memtype wrappers for /dev/mem mmaps. The memtype is slightly complicated here, given that we have to support existing X mappings. We fallback on UC_MINUS for that. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
f0970c13b6a5b01189aeb196ebb573cf87d95839 |
|
19-Mar-2008 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86: PAT phys_mem_access_prot_allowed for dev/mem mmap Introduce phys_mem_access_prot_allowed(), which checks whether the mapping is possible, without any conflicts and returns success or failure based on that. phys_mem_access_prot() by itself does not allow failure case. This ability to return error is needed for PAT where we may have aliasing conflicts. x86 setup __HAVE_PHYS_MEM_ACCESS_PROT and move x86 specific code out of /dev/mem into arch specific area. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
9307cacad0dfe3749f00303125c6f7f0523e5616 |
|
25-Mar-2008 |
Yinghai Lu <yhlu.kernel.send@gmail.com> |
x86: pat cpu feature bit setting for known cpus Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
35605a1027ac630f85a1b95684f7e86b82498cd6 |
|
25-Mar-2008 |
Yinghai Lu <yhlu.kernel.send@gmail.com> |
x86: enable PAT for amd k8 and fam10h make known_pat_cpu to think amd k8 and fam10h is ok too. also make tom2 below to be WRBACK Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
6997ab4982a29925e79f72c3a59823cf944c3529 |
|
19-Mar-2008 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86: add PAT related debug prints Adds debug prints at critical code. Adds enough info in dmesg to allow us to do effective first round of analysis of any issues that may result due to PAT patch series. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6 |
|
19-Mar-2008 |
venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> |
x86: PAT infrastructure patch Sets up pat_init() infrastructure. PAT MSR has following setting. PAT |PCD ||PWT ||| 000 WB _PAGE_CACHE_WB 001 WC _PAGE_CACHE_WC 010 UC- _PAGE_CACHE_UC_MINUS 011 UC _PAGE_CACHE_UC We are effectively changing WT from boot time setting to WC. UC_MINUS is used to provide backward compatibility to existing /dev/mem users(X). reserve_memtype and free_memtype are new interfaces for maintaining alias-free mapping. It is currently implemented in a simple way with a linked list and not optimized. reserve and free tracks the effective memory type, as a result of PAT and MTRR setting rather than what is actually requested in PAT. pat_init piggy backs on mtrr_init as the rules for setting both pat and mtrr are same. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
|