History log of /drivers/lguest/page_tables.c
Revision Date Author Comments
9f54288def3f92b7805eb6d4b1ddcd73ecf6e889 22-Jul-2011 Rusty Russell <rusty@rustcorp.com.au> lguest: update comments

Also removes a long-unused #define and an extraneous semicolon.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
5dea1c88ed11a1221581c4b202f053c4fc138704 22-Jul-2011 Rusty Russell <rusty@rustcorp.com.au> lguest: use a special 1:1 linear pagetable mode until first switch.

The Host used to create some page tables for the Guest to use at the
top of Guest memory; it would then tell the Guest where this was. In
particular, it created linear mappings for 0 and 0xC0000000 addresses
because lguest used to switch to its real page tables quite late in
boot.

However, since d50d8fe19 Linux initialized boot page tables in
head_32.S even before the "are we lguest?" boot jump. So, now we can
simplify things: the Host pagetable code assumes 1:1 linear mapping
until it first calls the LHCALL_NEW_PGTABLE hypercall, which we now do
before we reach C code.

This also means that the Host doesn't need to know anything about the
Guest's PAGE_OFFSET. (Non-Linux guests might not even have such a
thing).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
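
The 1:1 mode described in 5dea1c88 can be pictured with a small standalone sketch. It is not the lguest code: the struct, the field name linear_pages, the sketch_* helpers and the example 0xC0000000 mapping are all hypothetical, chosen only to show "virtual == physical until the first page table switch".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vcpu state; not the real lguest structures. */
struct sketch_cpu {
    bool linear_pages;          /* true until the first page table switch */
    uint32_t (*walk)(uint32_t); /* stand-in for a real page table walk */
};

/* Translate a Guest virtual address to a Guest physical address. */
static uint32_t sketch_guest_pa(const struct sketch_cpu *cpu, uint32_t vaddr)
{
    if (cpu->linear_pages)
        return vaddr;           /* 1:1 mode: virtual == physical */
    return cpu->walk(vaddr);
}

/* Models the Guest's first "new page table" hypercall. */
static void sketch_new_pgtable(struct sketch_cpu *cpu,
                               uint32_t (*walk)(uint32_t))
{
    cpu->linear_pages = false;
    cpu->walk = walk;
}

/* Example walk (assumption): the Guest maps 0xC0000000 + x to x. */
static uint32_t sketch_high_half_walk(uint32_t vaddr)
{
    return vaddr - 0xC0000000u;
}
```

In this model the Host never needs a PAGE_OFFSET value of its own: before the switch it uses the identity mapping, and afterwards it only follows whatever tables the Guest supplied.
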
c9f2954964df1490373065558f3156379c7a2454 30-Nov-2010 Christoph Lameter <cl@linux.com> lguest: Use this_cpu_ops

Use this_cpu_ops in a couple of places in lguest.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
fb100d78c04ff6053047625d0368d0d4b1d9912a 24-Sep-2009 Rusty Russell <rusty@rustcorp.com.au> lguest: use PGDIR_SHIFT for PAE code to allow different PAGE_OFFSET

We still assume the Guest and Host have the same PAGE_OFFSET settings,
but now we don't assume 0xC0000000.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Matias Zabaljauregui <zabaljauregui@gmail.com>
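
As a rough illustration of fb100d78, the top-level index for an address can be derived from a shift instead of hard-coding the slot that 0xC0000000 happens to land in. This is a sketch only: the shift values are the usual 32-bit x86 ones (22 for two-level paging, 30 for PAE), and the sketch_* names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_PGDIR_SHIFT_2LEVEL 22  /* 32-bit x86 without PAE */
#define SKETCH_PGDIR_SHIFT_PAE    30  /* 32-bit x86 with PAE */

/* Index of the top-level entry that covers addr. */
static unsigned sketch_pgd_index(uint32_t addr, unsigned pgdir_shift)
{
    return addr >> pgdir_shift;
}
```

With PAE, a PAGE_OFFSET of 0xC0000000 lands in top-level entry 3 and 0x80000000 in entry 2, so code written in terms of the shift keeps working when PAGE_OFFSET changes.
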
4c1ea3dd718a1d93a726cb3e66665ac4170dcccd 24-Sep-2009 Rusty Russell <rusty@rustcorp.com.au> lguest: use set_pte/set_pmd uniformly for real page table entries

If we're building a pte, we can use simple assignment; only use set_pte
etc. when we're actually going to use that destination as a PTE. I
don't know that we'll ever run under Xen, but it's neater.

And use set_pte/set_pmd rather than assuming native_ versions, even
though that's probably true for most people.

(Includes compile fix by Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Matias Zabaljauregui <zabaljauregui@gmail.com>
Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
fd589a8f0a13f53a2dd580b1fe170633cf6b095f 16-Jul-2009 Anand Gadiyar <gadiyar@ti.com> trivial: fix typo "to to" in multiple files

Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
a91d74a3c4de8115295ee87350c13a329164aaaf 31-Jul-2009 Rusty Russell <rusty@rustcorp.com.au> lguest: update commentary

Every so often, after code shuffles, I need to go through and unbitrot
the Lguest Journey (see drivers/lguest/README). Since we now use RCU in
a simple form in one place I took the opportunity to expand that explanation.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
2e04ef76916d1e29a077ea9d0f2003c8fd86724d 31-Jul-2009 Rusty Russell <rusty@rustcorp.com.au> lguest: fix comment style

I don't really notice it (except to begrudge the extra vertical
space), but Ingo does. And he pointed out that one excuse of lguest
is as a teaching tool, it should set a good example.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
92b4d8df8436cdd74d22a2a5b6b23b9abc737a3e 13-Jun-2009 Rusty Russell <rusty@rustcorp.com.au> lguest: PAE fixes

1) j wasn't initialized in setup_pagetables, so they weren't set up for me
causing immediate guest crashes.

2) gpte_addr should not re-read the pmd from the Guest. Especially
not BUG_ON() based on the value. If we ever supported SMP guests,
they could trigger that. And the Launcher could also trigger it
(tho currently root-only).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
acdd0b6292b282c4511897ac2691a47befbf1c6a 13-Jun-2009 Matias Zabaljauregui <zabaljauregui@gmail.com> lguest: PAE support

This version requires that host and guest have the same PAE status.
NX cap is not offered to the guest, yet.

Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
ebe0ba84f55950a89cb7af94c7ffc35ee3992f9e 30-May-2009 Matias Zabaljauregui <zabaljauregui@gmail.com> lguest: replace hypercall name LHCALL_SET_PMD with LHCALL_SET_PGD

replace LHCALL_SET_PMD with LHCALL_SET_PGD hypercall name
(That's really what it is, and the confusion gets worse with PAE support)

Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
90603d15fa95605d1d08235b73e220d766f04bb0 13-Jun-2009 Matias Zabaljauregui <zabaljauregui@gmail.com> lguest: use native_set_* macros, which properly handle 64-bit entries when PAE is activated

Some cleanups and replace direct assignment with native_set_* macros which properly handle 64-bit entries when PAE is activated

Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
ed1dc77810159a733240ba6751c1b31023bf8dd7 30-May-2009 Matias Zabaljauregui <zabaljauregui@gmail.com> lguest: map switcher with executable page table entries

Map switcher with executable page table entries.
(This bug didn't matter before PAE and hence NX support -- RR)

Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
df1693abc42e34bbc4351e179dbe66c28a94efb8 18-Mar-2009 Matias Zabaljauregui <zabaljauregui@gmail.com> lguest: use bool instead of int

Impact: clean up

Rusty told me, some time ago, that he had become a fan of "bool".
So, here are some replacements.

Signed-off-by: Matias Zabaljauregui <zabaljauregui at gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
6afbdd059c27330eccbd85943354f94c2b83a7fe 31-Mar-2009 Rusty Russell <rusty@rustcorp.com.au> lguest: fix spurious BUG_ON() on invalid guest stack.

Impact: fix crash on misbehaving guest

gpte_addr() contains a BUG_ON(), insisting that the present flag is
set. We need to return before we call it if that isn't the case.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
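
The fix in 6afbdd05 amounts to "test the present bit, then compute the address". A minimal standalone sketch of that ordering follows. The helper name and macros are hypothetical; only the x86 present bit (bit 0) and the 4 KiB frame mask are standard.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_PRESENT 0x001u       /* x86 present bit */
#define SKETCH_FRAME_MASK   0xFFFFF000u  /* 4 KiB frame address bits */

/*
 * Store the address of the PTE page named by a top-level entry.
 * Returns false, leaving *addr untouched, if the entry is not present,
 * so the caller can return early instead of hitting an assertion.
 */
static bool sketch_pte_page_addr(uint32_t gpgd, uint32_t *addr)
{
    if (!(gpgd & SKETCH_PAGE_PRESENT))
        return false;
    *addr = gpgd & SKETCH_FRAME_MASK;
    return true;
}
```
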
58a24566449892dda409b9ad92c2e56c76c5670c 29-Sep-2008 Matias Zabaljauregui <zabaljauregui@gmail.com> lguest: move the initial guest page table creation code to the host

This patch moves the initial guest page table creation code to the host,
so the launcher keeps working with PAE enabled configs.

Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
71a3f4edc11b9dd7af28d003acbbd33496003da1 13-Aug-2008 Rusty Russell <rusty@rustcorp.com.au> lguest: use get_user_pages_fast() instead of get_user_pages()

Using a simple page table thrashing program I measure a slight
improvement. The program creates five processes. Each touches 1000
pages then schedules the next process. We repeat this 1000 times. As
lguest only caches 4 cr3 values, this rebuilds a lot of shadow page
tables requiring virt->phys mappings.

Before: 5.93 seconds
After: 5.40 seconds

(Counts of slow vs fastpath in this usage are 6092 and 2852462 respectively.)

And more importantly for lguest, the code is simpler.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
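
Why five processes thrash a four-entry cache in 71a3f4ed can be seen in a toy model. The sketch below assumes FIFO replacement and plain round-robin scheduling. Both are assumptions made for illustration, since the log does not say which replacement policy lguest uses, and the function name is hypothetical.

```c
#include <stdint.h>

#define SKETCH_SLOTS 4  /* the log says lguest caches 4 cr3 values */

/*
 * Count cache misses over `switches` round-robin context switches
 * among `nproc` address spaces (nproc must be non-zero).
 */
static unsigned sketch_count_misses(unsigned nproc, unsigned switches)
{
    uint32_t slot[SKETCH_SLOTS] = { 0 };  /* 0 = empty; ids start at 1 */
    unsigned next = 0, misses = 0;

    for (unsigned s = 0; s < switches; s++) {
        uint32_t id = (s % nproc) + 1;
        int hit = 0;

        for (unsigned i = 0; i < SKETCH_SLOTS; i++)
            if (slot[i] == id)
                hit = 1;
        if (!hit) {
            slot[next] = id;              /* FIFO replacement (assumed) */
            next = (next + 1) % SKETCH_SLOTS;
            misses++;
        }
    }
    return misses;
}
```

In this model four processes miss only on their first run, while five processes miss on every switch, which is the case that forces shadow page tables to be rebuilt over and over.
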
a6bd8e13034dd7d60b6f14217096efa192d0adc1 28-Mar-2008 Rusty Russell <rusty@rustcorp.com.au> lguest: comment documentation update.

Took some cycles to re-read the Lguest Journey end-to-end, fix some
rot and tighten some phrases.

Only comments change. No new jokes, but a couple of recycled old jokes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
4357bd9453b81e0a41db1dec16e06d74256b7560 11-Mar-2008 Rusty Russell <rusty@rustcorp.com.au> lguest: Revert 1ce70c4fac3c3954bd48c035f448793867592bc0, fix real problem.

Ahmed managed to crash the Host in release_pgd(), which cannot be a Guest
bug, and indeed it wasn't.

The bug was that handing a 0 as the address of the toplevel page table
being manipulated can cause the lookup code in find_pgdir() to return
an uninitialized cache entry (we shadow up to 4 top level page tables
for each Guest).

Commit 37cc8d7f963ba2deec29c9b68716944516a3244f introduced this
behaviour in the Guest, uncovering the bug.

The patch which he submitted (which removed the /4 from the index
calculation) simply ensured that these high-indexed entries hit the
early exit path of guest_set_pmd(). But you get lots of segfaults in
guest userspace as the PMDs aren't being updated.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
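
The failure mode in 4357bd94 can be reproduced in miniature: if unused cache slots record a Guest address of 0, a lookup for address 0 "finds" one of them. The sketch below is not the lguest code. The struct layout, the function names and the guard shown are hypothetical, and the guard is only one possible way to rule out unused slots.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NPGDIRS 4  /* up to 4 top-level tables are shadowed */

struct sketch_pgdir {
    uint32_t gpgdir;  /* Guest address of the top-level table; 0 if unused */
    void *shadow;     /* shadow table; NULL if the slot is unused */
};

/* Naive lookup: an unused slot matches a request for address 0. */
static int sketch_find_naive(const struct sketch_pgdir *c, uint32_t gpgdir)
{
    for (int i = 0; i < SKETCH_NPGDIRS; i++)
        if (c[i].gpgdir == gpgdir)
            return i;
    return -1;
}

/* Guarded lookup: a slot only counts if it actually holds a shadow. */
static int sketch_find_guarded(const struct sketch_pgdir *c, uint32_t gpgdir)
{
    for (int i = 0; i < SKETCH_NPGDIRS; i++)
        if (c[i].shadow != NULL && c[i].gpgdir == gpgdir)
            return i;
    return -1;
}
```
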
31f4b46ec6f889533c06537dea96bb0d20fa625b 09-Feb-2008 Ahmed S. Darwish <darwish.07@gmail.com> lguest: accept guest _PAGE_PWT page table entries

Beginning from commit 4138cc3418f5, ioremap_nocache() sets the _PAGE_PWT
flag.

Lguest doesn't accept a guest pte with a _PWT flag and reports a "bad
page table entry" in that case.

Accept guest _PAGE_PWT page table entries.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
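
The change in 31f4b46e can be pictured as removing one bit from a set of rejected flags. The sketch below assumes the check is a simple "reject if any of these bits is set" test. The log does not show the actual code, and the names are hypothetical; only the x86 bit values for PWT (0x008) and PSE (0x080) are standard.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_PWT 0x008u  /* x86 page write-through bit */
#define SKETCH_PAGE_PSE 0x080u  /* x86 large-page bit */

/* A Guest pte is acceptable if it carries none of the rejected flags. */
static bool sketch_gpte_ok(uint32_t flags, uint32_t rejected)
{
    return (flags & rejected) == 0;
}
```

Passing SKETCH_PAGE_PWT | SKETCH_PAGE_PSE as the rejected set models the old behaviour, and passing SKETCH_PAGE_PSE alone models accepting write-through entries such as those created by ioremap_nocache().
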
84f12e39c856a8b1ab407f8216ecebaf4204b94d 19-Jan-2008 Glauber de Oliveira Costa <gcosta@redhat.com> lguest: use __PAGE_KERNEL instead of _PAGE_KERNEL

x86_64 doesn't expose the intermediate representation with one underline,
_PAGE_KERNEL, just the double-underlined one.

Use it, to get a common ground between 32 and 64-bit.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
382ac6b3fbc0ea6a5697fc6caaf7e7de12fa8b96 17-Jan-2008 Glauber de Oliveira Costa <gcosta@redhat.com> lguest: get rid of lg variable assignments

We can save some lines of code by getting rid of
*lg = cpu... lines of code spread everywhere by now.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
934faab464c6a26ed1a226b6cf7111b35405dde1 17-Jan-2008 Glauber de Oliveira Costa <gcosta@redhat.com> lguest: change gpte_addr header

gpte_addr() does not depend on any guest information. So we wipe out
the lg parameter from it completely.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2092aa277b0adfb8f4f47ab8a9ee00aff0ca7ed6 17-Jan-2008 Glauber de Oliveira Costa <gcosta@redhat.com> lguest: change spte_addr header

spte_addr does not depend on any guest information, so we
wipe out the lg parameter completely.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1713608f280002d9ffc6de89d7de5cf367072d63 07-Jan-2008 Glauber de Oliveira Costa <gcosta@redhat.com> lguest: per-vcpu lguest pgdir management

This patch makes the pgdir management per-vcpu. The pgdirs pool
is still guest-wide (although it'll probably need to grow when we
are really executing more vcpus), but the pgdidx index is gone,
since it makes no sense anymore. Instead, we use a per-vcpu
index.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
4665ac8e28c30c2a015c617c55783c0bf3a49c05 07-Jan-2008 Glauber de Oliveira Costa <gcosta@redhat.com> lguest: makes special fields be per-vcpu

The lguest struct has room for some fields, namely cr2, ts, esp1
and ss1, that are not really guest-wide but rather vcpu-wide.

This patch puts them in the vcpu struct.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
a53a35a8b485b9c16b73e5177bddaa4321971199 07-Jan-2008 Glauber de Oliveira Costa <gcosta@redhat.com> lguest: make registers per-vcpu

This is the most obvious per-vcpu field: registers.

So this patch moves them from struct lguest to struct vcpu,
and patches the places in which they are used accordingly.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
0c78441cf4dd66f66e23dc085f0cc1e3e8669b96 07-Jan-2008 Glauber de Oliveira Costa <gcosta@redhat.com> lguest: map_switcher_in_guest() per-vcpu

The switcher needs to be mapped per-vcpu, because different vcpus
will potentially have different page tables (they don't have to,
because threads will share the same).

So our first step is to make the function receive a vcpu struct.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
e1e72965ec2c02db99b415cd06c17ea90767e3a4 25-Oct-2007 Rusty Russell <rusty@rustcorp.com.au> lguest: documentation update

Went through the documentation doing typo and content fixes. This
patch contains only comment and whitespace changes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2d37f94a28170ca656438758fca577acb49a7932 22-Oct-2007 Rusty Russell <rusty@rustcorp.com.au> generalize lgread_u32/lgwrite_u32.

Jes complains that page table code still uses lgread_u32 even though
it now uses general kernel pte types. The best thing to do is to
generalize lgread_u32 and lgwrite_u32.

This means we lose the efficiency of getuser(). We could potentially
regain it if we used __copy_from_user instead of copy_from_user, but
I'm not certain that our range check is equivalent to access_ok() on
all platforms.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jes Sorensen <jes@sgi.com>
47436aa4ad054c1c7c8231618e86ebd9305308dc 22-Oct-2007 Rusty Russell <rusty@rustcorp.com.au> Boot with virtual == physical to get closer to native Linux.

1) This allows us to get a lot closer to booting bzImages.

2) It means we don't have to know page_offset.

3) The Guest needs to modify the boot pagetables to create the
PAGE_OFFSET mapping before jumping to C code.

4) guest_pa() walks the page tables rather than using page_offset.

5) We don't use page_offset to figure out whether to emulate: it was
always kinda questionable, and won't work for instructions done
before remapping (bzImage unpacking in particular).

6) We still want the kernel address for tlb flushing: have the initial
hypercall give us that, too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
ee3db0f2b6053b65f3b70253f5f810d9a3d67b28 22-Oct-2007 Rusty Russell <rusty@rustcorp.com.au> Rename "cr3" to "gpgdir" to avoid x86-specific naming.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
df29f43e650df29456804dabdb2611de914e7c0f 22-Oct-2007 Matias Zabaljauregui <matias.zabaljauregui@cern.ch> Pagetables to use normal kernel types

This is my first step in the migration of page_tables.c to the kernel
types and functions/macros (2.6.23-rc3). Seems to be working OK.

Signed-off-by: Matias Zabaljauregui <matias.zabaljauregui@cern.ch>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
3c6b5bfa3cf3b4057788e08482a468cc3bc00780 22-Oct-2007 Rusty Russell <rusty@rustcorp.com.au> Introduce guest mem offset, static link example launcher

In order to avoid problematic special linking of the Launcher, we give
the Host an offset: this means we can use any memory region in the
Launcher as Guest memory rather than insisting on mmap() at 0.

The result is quite pleasing: a number of casts are replaced with
simple additions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
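
The "casts are replaced with simple additions" remark in 3c6b5bfa can be sketched as follows. This is an illustration only: the struct, its field names and the bounds check are hypothetical and not taken from the lguest source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of where Guest memory lives in the Launcher. */
struct sketch_guest {
    unsigned char *mem_base;  /* start of Guest memory in the Launcher */
    size_t mem_size;          /* size of Guest memory in bytes */
};

/*
 * Turn a Guest physical address into a Launcher pointer by adding the
 * offset; returns NULL if the address lies outside Guest memory.
 */
static void *sketch_guest_to_host(const struct sketch_guest *g, uint32_t gpa)
{
    if (gpa >= g->mem_size)
        return NULL;
    return g->mem_base + gpa;
}
```

Because the base is just a pointer, any region the Launcher has allocated can serve as Guest memory, which is what removes the need to mmap() at address 0.
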
f56a384e98aa81065038c4e16f39ed989ccae687 26-Jul-2007 Rusty Russell <rusty@rustcorp.com.au> lguest: documentation VII: FIXMEs

Documentation: The FIXMEs

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bff672e630a015d5b54c8bfb16160b7edc39a57c 26-Jul-2007 Rusty Russell <rusty@rustcorp.com.au> lguest: documentation V: Host

Documentation: The Host

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
f938d2c892db0d80d144253d4a7b7083efdbedeb 26-Jul-2007 Rusty Russell <rusty@rustcorp.com.au> lguest: documentation I: Preparation

The netfilter code had very good documentation: the Netfilter Hacking HOWTO.
No one ever read it.

So this time I'm trying something different, using a bit of Knuthiness.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
d7e28ffe6c74416b54345d6004fd0964c115b12c 19-Jul-2007 Rusty Russell <rusty@rustcorp.com.au> lguest: the host code

This is the code for the "lg.ko" module, which allows lguest guests to
be launched.

[akpm@linux-foundation.org: update for futex-new-private-futexes]
[akpm@linux-foundation.org: build fix]
[jmorris@namei.org: lguest: use hrtimers]
[akpm@linux-foundation.org: x86_64 build fix]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>