History log of /arch/powerpc/platforms/cell/spufs/sched.c
Revision Date Author Comments
f2dec1eae8029bb056a3c1b2d3373681fa8e3e7a 16-Jul-2014 Thomas Gleixner <tglx@linutronix.de> powerpc: cell: Use ktime_get_ns()

Replace the ever recurring:
ktime_get_ts(&ts);
ns = timespec_to_ns(&ts);
with
ns = ktime_get_ns();

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
74b8af7837fa55c020e2ad1b34a6b10dfe25a9b1 11-Feb-2014 Jeremy Kerr <jk@ozlabs.org> powerpc/spufs: Remove MAX_USER_PRIO define

Current ppc64_defconfig fails with:

arch/powerpc/platforms/cell/spufs/sched.c:86:0: error: "MAX_USER_PRIO" redefined [-Werror]
cc1: all warnings being treated as errors

Commit 6b6350f155af ("sched: Expose some macros related to priority")
introduced a generic MAX_USER_PRIO macro to sched/prio.h, which is
causing the conflict. Use that one instead of our own.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Cc: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1392098717.689604.970589769393.1.gpush@pablo
Signed-off-by: Ingo Molnar <mingo@kernel.org>
993db4b45fd99949d8f6e004a7744b523dca473a 11-Feb-2013 Ingo Molnar <mingo@kernel.org> sched, powerpc: Fix sched.h split-up build failure

Fix PowerPC/Cell build fallout from:

8bd75c77b7c6 sched/rt: Move rt specific bits into new header file

Reported-by: Michael Ellerman <michael@ellerman.id.au>
Cc: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
17cf22c33e1f1b5e435469c84e43872579497653 02-Mar-2010 Eric W. Biederman <ebiederm@xmission.com> pidns: Use task_active_pid_ns where appropriate

The expressions tsk->nsproxy->pid_ns and task_active_pid_ns
aka ns_of_pid(task_pid(tsk)) should have the same number of
cache line misses with the practical difference that
ns_of_pid(task_pid(tsk)) is released later in a process's life.

Furthermore by using task_active_pid_ns it becomes trivial
to write an unshare implementation for the pid namespace.

So I have used task_active_pid_ns everywhere I can.

In fork since the pid has not yet been attached to the
process I use ns_of_pid, to achieve the same effect.
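
As a sketch of the conversion (tsk being any task_struct pointer):

	/* before: goes through nsproxy, which is detached earlier
	   in a process's exit than the pid itself */
	ns = tsk->nsproxy->pid_ns;

	/* after: equivalent to ns_of_pid(task_pid(tsk)) */
	ns = task_active_pid_ns(tsk);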

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
ead53f22dc646d91a1b6201b9f44dd47d7d88c34 22-Jul-2011 Paul Gortmaker <paul.gortmaker@windriver.com> powerpc: remove non-required uses of include <linux/module.h>

None of the files touched here are modules, and they are not
exporting any symbols either -- so there is no need to include
module.h. Builds of all the files remain successful.

Even kernel/module.c does not need to include it, since it includes
linux/moduleloader.h instead.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
104699c0ab473535793b5fea156adaf309afd29b 28-Apr-2011 KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> powerpc: Convert old cpumask API into new one

Adopt the new API.

Almost every change is trivial. The most important change is the
line below, because we plan to change the task->cpus_allowed
implementation.

- ctx->cpus_allowed = current->cpus_allowed;
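
The replacement, as a hedged sketch (assuming ctx keeps a full
cpumask of its own):

	/* copy through the accessor instead of struct assignment,
	   independent of how cpus_allowed is represented */
	cpumask_copy(&ctx->cpus_allowed, &current->cpus_allowed);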

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
25985edcedea6396277003854657b5f3cb31a628 31-Mar-2011 Lucas De Marchi <lucas.demarchi@profusion.mobi> Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include
those headers directly instead of assuming availability. As this
conversion needs to touch a large number of source files, the
following script is used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following:

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there, i.e. if only gfp is used,
gfp.h; if slab is used, slab.h (see the example after this list).

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have a fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
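
For example (an illustration, not a line from the patch), a .c file
that calls kmalloc()/kfree() but relied on the indirect inclusion now
carries:

	#include <linux/slab.h>

while a file that only uses gfp flags gets linux/gfp.h instead.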

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from the tests in
step 7, I'm fairly confident about the coverage of this conversion
patch. If there is a breakage, it's likely to be something in one of
the arch headers, which should be easily discoverable on most builds
of the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
fc5377668c3d808e1d53c4aee152c836f55c3490 17-Sep-2009 Christoph Hellwig <hch@lst.de> tracing: Remove markers

Now that the last users of markers have migrated to the event
tracer we can kill off the (now orphan) support code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090917173527.GA1699@lst.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
ae142e0c52b38e44d28b12f77c6e7faa96f7a069 12-Jun-2009 Christoph Hellwig <hch@lst.de> powerpc/sputrace: Use the generic event tracer

I wrote sputrace before the generic tracing infrastructure was available.
Now that we have the generic event tracer we can convert it over and
remove a lot of code:

8 files changed, 45 insertions(+), 285 deletions(-)

To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
the spufs trace channel by

echo 1 > /sys/kernel/debug/tracing/events/spufs/spufs_context/enable

and then read the trace records using e.g.

cat /sys/kernel/debug/tracing/trace

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
74019224ac34b044b44a31dd89a54e3477db4896 18-Feb-2009 Ingo Molnar <mingo@elte.hu> timers: add mod_timer_pending()

Impact: new timer API

Based on an idea from Martin Josefsson with the help of
Patrick McHardy and Stephen Hemminger:

introduce the mod_timer_pending() API which is a mod_timer()
offspring that is an invariant on already removed timers.

(regular mod_timer() re-activates non-pending timers.)

This is useful for the networking code in that it can
allow unserialized mod_timer_pending() timer-forwarding
calls, but a single del_timer*() will stop the timer
from being reactivated again.

Also while at it:

- optimize the regular mod_timer() path some more, the
timer-stat and a debug check was needlessly duplicated
in __mod_timer().

- make the exports come straight after the function, as
most other exports in timer.c already did.

- eliminate __mod_timer() as an external API, change the
users to mod_timer().

The regular mod_timer() code path is not impacted
significantly, due to inlining optimizations and due to
the simplifications.
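
As a hedged usage sketch (the timer shown is illustrative):

	/* forwards the timer only while it is still pending; once
	   del_timer() has removed it, this call is a no-op */
	mod_timer_pending(&sk->sk_timer, jiffies + timeout);

	/* mod_timer(), by contrast, would re-activate a timer that
	   has already been removed */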

Based-on-patch-from: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
86c6f274f52c3e991d429869780945c0790e7b65 26-Dec-2008 Rusty Russell <rusty@rustcorp.com.au> cpumask: powerpc: Introduce cpumask_of_{node,pcibus} to replace {node,pcibus}_to_cpumask

Impact: New APIs

The old node_to_cpumask/node_to_pcibus returned a cpumask_t: these
return a pointer to a struct cpumask. Part of removing cpumasks from
the stack.

(Also replaces powerpc internal uses of node_to_cpumask).
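
A sketch of the change in calling convention:

	/* old: copies a whole cpumask_t onto the stack */
	cpumask_t mask = node_to_cpumask(node);

	/* new: hands back a pointer to a shared, read-only mask */
	const struct cpumask *mask = cpumask_of_node(node);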

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
34318c253b861f82bd4a2956e6c8ae8ee2c3aae7 21-Oct-2008 Andre Detsch <adetsch@br.ibm.com> powerpc/spufs: Explain conditional decrement of aff_sched_count

This patch adds a comment to clarify why atomic_dec_if_positive is being used
to decrement gang's aff_sched_count on SPU context unbind.
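
In essence (a sketch of the behaviour the comment documents):

	/* decrements only if the result stays non-negative, so an
	   aff_sched_count that already reached 0 is left untouched */
	atomic_dec_if_positive(&gang->aff_sched_count);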

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
10baa26c8ccba6f100397c9ba6534be175762e8d 21-Oct-2008 Andre Detsch <adetsch@br.ibm.com> powerpc/spufs: Improve search of node for contexts with SPU affinity

This patch improves the readability of the code responsible for trying
to find a node with enough SPUs not committed to other affinity gangs.

An additional check is also added, to avoid taking into account gangs that
have no SPU affinity.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
b2e601d14deb2083e2a537b47869ab3895d23a28 04-Sep-2008 Andre Detsch <adetsch@br.ibm.com> powerpc/spufs: Fix possible scheduling of a context to multiple SPEs

We currently have a race when scheduling a context to a SPE -
after we have found a runnable context in spusched_tick, the same
context may have been scheduled by spu_activate().

This may result in a panic if we try to unschedule a context that has
been freed in the meantime.

This change exits spu_schedule() if the context has already been
scheduled, so we don't end up scheduling it twice.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
b65fe0356b5b732d7e1e0224c6a1cf2eb5255984 04-Sep-2008 Jeremy Kerr <jk@ozlabs.org> powerpc/spufs: Fix race for a free SPU

We currently have a race for a free SPE. With one thread doing a
spu_yield(), and another doing a spu_activate():

thread 1                          thread 2
spu_yield(oldctx)                 spu_activate(ctx)
  __spu_deactivate(oldctx)
  spu_unschedule(oldctx, spu)
  spu->alloc_state = SPU_FREE
                                  spu = spu_get_idle(ctx)
                                    - searches for a SPE in
                                      state SPU_FREE, gets
                                      the context just
                                      freed by thread 1
                                  spu_schedule(ctx, spu)
                                    spu->alloc_state = SPU_USED
spu_schedule(newctx, spu)
  - assumes spu is still free
  - tries to schedule context on
    already-used spu

This change introduces a 'free_spu' flag to spu_unschedule, to indicate
whether or not the function should free the spu after descheduling the
context. We only set this flag if we're not going to re-schedule
another context on this SPU.

Add a comment to document this behaviour.
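
A hedged sketch of the resulting helper (the free_spu parameter name
follows the description above; the descheduling body is elided):

	static void spu_unschedule(struct spu *spu, struct spu_context *ctx,
				   int free_spu)
	{
		/* ... deschedule ctx from spu ... */
		if (free_spu)
			spu->alloc_state = SPU_FREE;
	}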

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
9f43e3914dceb0f8191875b3cdf4325b48d0d70a 02-Sep-2008 Jeremy Kerr <jk@ozlabs.org> powerpc/spufs: Fix multiple get_spu_context()

Commit 8d5636fbca202f61fdb808fc9e20c0142291d802 introduced a reference
count on SPU contexts during find_victim, but this may cause a leak in
the reference count if we later find a better contender for a context to
unschedule.

Change the reference to after we've found our victim context, so we
don't do the extra get_spu_context().

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
cb9808d3d0cb0ed97197decadcf0431140b9e231 19-Aug-2008 Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> powerpc/spufs: Remove invalid semicolon after if statement

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
8d5636fbca202f61fdb808fc9e20c0142291d802 14-Aug-2008 Jeremy Kerr <jk@ozlabs.org> powerpc/spufs: reference context while dropping state mutex in scheduler

Based on an original patch from Christoph Hellwig <hch@lst.de>.

Currently, there is a possible reference-after-free in the spusched
code - contexts may be freed after we have released their state_mutex
in spusched_tick and find_victim.

This change takes a reference to the context before releasing the
mutex, so that the context doesn't get destroyed.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
ad1ede127760d6ca4903f44dfe1a8a38b3bfb36c 24-Jul-2008 Andre Detsch <adetsch@br.ibm.com> powerpc/spufs: better placement of spu affinity reference context

This patch adjusts the placement of a reference context from
a spu affinity chain. The reference context can now be placed
only on nodes that have enough spus not intended to be used by
another gang (already running on the node).

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
0855b543222e79cbbd9d66dd56cb54740e7d524f 24-Jul-2008 Andre Detsch <adetsch@br.ibm.com> powerpc/spufs: fix aff_mutex and cbe_spu_info[n].list_mutex deadlock

Currently, it is possible to lock aff_mutex and
cbe_spu_info[n].list_mutex in different orders, allowing a deadlock to
occur. With this change, aff_mutex is not taken within a list_mutex
critical section anymore.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
fabb657005edbbcb0d13ee49a40f1f4b042a1d19 04-Jul-2008 Maxim Shchetynin <maxim@de.ibm.com> powerpc/spufs: add atomic busy_spus counter to struct cbe_spu_info

As nr_active counter includes also spus waiting for syscalls to return
we need a seperate counter that only counts spus that are currently running
on spu side. This counter shall be used by a cpufreq governor that targets
a frequency dependent from the number of running spus.

Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2442a8ba5abe2c27c572bc522da1c33df98c6ec7 06-Jun-2008 Luke Browning <lukebrowning@us.ibm.com> powerpc/spufs: don't extend time slice if context is not in spu_run

An spu context shouldn't get an extra tick if the time slice code
couldn't find something else to run. This means contexts that are not
within spu_run (ie, SPU_SCHED_SPU_RUN is cleared) will not receive
extra ticks while we have no other contexts waiting.

Signed-off-by: Luke Browning <lukebrowning@us.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
028fda0a6c80c26f1d9f403b4490b9ddc74ffa3b 16-Jun-2008 Luke Browning <lukebrowning@us.ibm.com> powerpc/spufs: fix missed stop-and-signal event

There is a delay in the transition to the stopped state for class 2
interrupts. In some cases, the controlling thread detects the state of
the spu as running, and goes back to sleep resulting in a hung
application as the event is missed.

This change detects the stop condition and re-generates the wakeup event
after a context save.

Signed-off-by: Luke Browning <lukebrowning@us.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
2c911a14b74fa9cf815a936f310e4fa85bee77ce 13-Jun-2008 Luke Browning <lukebrowning@us.ibm.com> powerpc/spufs: synchronize interaction between spu exception handling and time slicing

Time slicing can occur at the same time as spu exception handling
resulting in the wakeup of the wrong thread.

This change uses the spu's register_lock to enforce synchronization
between bind/unbind and spu exception handling so that they are
mutually exclusive.

Signed-off-by: Luke Browning <lukebrowning@us.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
08fcf1d61193d7b7779aa6d7388535e26e064a0b 12-May-2008 Luke Browning <lukebr@linux.vnet.ibm.com> [POWERPC] spufs: Fix pointer reference in find_victim

If victim (not ctx) is in spu_run, add victim to rq.

Signed-off-by: Luke Browning <lukebrowning@us.ibm.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
7a28a1549f9514f3b0dd3dde5c7337ba5d44fba3 08-May-2008 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: don't requeue victim context in find_victim if it's not in spu_run

We should not requeue the victim context in find_victim if the owner is
not in spu_run. First, it's not needed, because leaving the context on
the spu is only an optimization; second, it's harmful, because it means
the owner could re-enter spu_run while the context is on the runqueue
and trip the BUG_ON in __spu_update_sched_info.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
7a2142002f29a7b398c49da9bdec712dc57087c7 28-Apr-2008 Luke Browning <lukebr@linux.vnet.ibm.com> [POWERPC] spufs: try to route SPU interrupts to local node

Currently, we re-route SPU interrupts to the current cpu, which may be
on a remote node. In the case of time slicing, all spu interrupts will
end up routed to the same cpu, where the spusched_tick occurs.

This change routes mfc interrupts to the cpu where the controlling
thread last ran, provided that cpu is on the same node as the spu
(otherwise don't reroute interrupts).

This should improve performance and provide a more predictable
environment for processing spu exceptions. In the past we have seen
concurrent delivery of spu exceptions to two cpus. This eliminates that
concern.

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
8a476d49555cb7e8d4222782f695048b46692731 30-Apr-2008 Julio M. Merino Vidal <jmerino@ac.upc.edu> [POWERPC] spufs: fix marker name for find_victim

Fix a typo in the marker for the find_victim function, which prevented
it from being traced. It previously read find_vitim.

Signed-off-by: Julio M. Merino Vidal <jmerino@ac.upc.edu>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
5158e9b5218bd3799c9fa8c401ad24d7f0c0a0a1 29-Apr-2008 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: add context switch notification log

There are userspace instrumentation tools that need to monitor spu
context switches. This patch adds a new file called 'switch_log' to
each spufs context directory that can be used to monitor the context
switches.

Context switch in/out events and exits from spu_run are logged once
the file has been opened, and can be read from it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
667471386d4068e75a6a55b615701ced61eb6333 29-Apr-2008 Denis V. Lunev <den@openvz.org> powerpc: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and
->data are set up before gluing the PDE to the main tree.

Add correct ->owner to proc_fops to fix reading/module unloading race.
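
A hedged sketch of the non-racy pattern for the spu_loadavg entry this
file provides (the fops name is assumed):

	/* ->proc_fops is wired up before the entry becomes visible */
	entry = proc_create("spu_loadavg", 0, NULL, &spu_loadavg_fops);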

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ce7c191bca88aa2f942f70a6d6c6315739a81a32 04-Mar-2008 Jeremy Kerr <jk@ozlabs.org> [POWERPC] spufs: don't (ab)use SCHED_IDLE

commit 4ef11014 introduced a usage of SCHED_IDLE to detect when
a context is within spu_run.

Instead of SCHED_IDLE (which has another meaning), add a flag to
sched_flags to tell if a context should be running.
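
A sketch of the flag usage (SPU_SCHED_SPU_RUN is the name later
commits in this log use):

	/* entering spu_run: the context should be running */
	set_bit(SPU_SCHED_SPU_RUN, &ctx->sched_flags);

	/* leaving spu_run */
	clear_bit(SPU_SCHED_SPU_RUN, &ctx->sched_flags);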

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
2a58aa33daef37134c8a43dca0b7578c3fa7f993 25-Feb-2008 Andre Detsch <adetsch@br.ibm.com> [POWERPC] spufs: fix use time accounting on SPE-overcommit

The spu_runcntl_RW register is restored within spu_restore function.
So, at the end of spu_bind_context, the SPU context is not just loaded,
but running.

This change corrects the state switch to account the time as USER.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
4ef110141b3e0758fe30d686417b5686b87eb25b 19-Feb-2008 Jeremy Kerr <jk@ozlabs.org> [POWERPC] spufs: fix scheduler starvation by idle contexts

2.6.25 has a regression where we can starve the scheduler by creating
(N_SPES+1) contexts, then running them one at a time.

The final context will never be run, as the other contexts are loaded on
the SPEs, none of which are reported as free (ie, spu->alloc_state !=
SPU_FREE), so spu_get_idle() doesn't give us a spu to run on. Because
all of the contexts are stopped, none are descheduled by the scheduler
tick, as spusched_tick returns if spu_stopped(ctx).

This change replaces the spu_stopped() check with checking for SCHED_IDLE
in ctx->policy. We set a context's policy to SCHED_IDLE when we're not
in spu_run(). We also favour SCHED_IDLE contexts when looking for contexts
to unbind, but leave their timeslice intact for later resumption.

This patch fixes the following test in the spufs-testsuite:
tests/20-scheduler/02-yield-starvation

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
038200cfdc6467fa8100c5b9c3b81730f0158370 11-Jan-2008 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: Add marker-based tracing facility

This adds markers to two important points in the spufs code, and a new
module (sputrace.ko) that allows reading these out through a proc file.

Long-term I'd rather see something like lttng extended to use the spufs
instrumentation, but for now I think this is a good enough quick
solution. We'll probably want to add various additional events in
addition to the ones I have already.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
a5a971129cc6155e26315fd28a450505ccc35fd8 02-Jan-2008 Paul Mackerras <paulus@samba.org> [POWERPC] Fix build failure on Cell when CONFIG_SPU_FS=y

Commit aed3a8c9bb1a8623a618232087c5ff62718e3b9a introduced a
definition of notify_spus_active in .../cell/spu_syscalls.c, and
another definition under #ifndef MODULE in .../cell/spufs/sched.c.
The latter is not necessary and causes the build to fail when
CONFIG_SPU_FS=y, so this removes it. It also removes the export
of do_notify_spus_active, which is unnecessary.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
aed3a8c9bb1a8623a618232087c5ff62718e3b9a 14-Dec-2007 Bob Nelson <rrnelson@linux.vnet.ibm.com> [POWERPC] Oprofile: Remove dependency on spufs module

This removes an OProfile dependency on the spufs module. This
dependency was causing a problem for multiplatform systems that are
built with support for Oprofile on Cell but try to load the oprofile
module on a non-Cell system.

Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
90608a2928a86b9464a8698275a1138e82e3a010 20-Dec-2007 Aegis Lin <aegislin@gmail.com> [POWERPC] spufs: Use separate timer for /proc/spu_loadavg calculation

The original spusched_timer was designed to take effect only when
a context is waiting in the runqueue.

This change adds an additional lower-frequency timer to handle the
spu_load updates. The new timer is triggered every LOAD_FREQ ticks.
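
A hedged sketch of such a self-rearming handler (the function and
variable names are illustrative; LOAD_FREQ is the kernel's standard
loadavg sampling interval):

	static void spuloadavg_wake(unsigned long data)
	{
		mod_timer(&spuloadavg_timer, jiffies + LOAD_FREQ);
		/* kick the scheduler thread to fold the load numbers */
		wake_up_process(spusched_task);
	}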

Signed-off-by: Aegis Lin <aegislin@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
c9101bdb1b0c56a75a4618542d368fe5013946b9 20-Dec-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: make state_mutex interruptible

Make most places that use spu_acquire/spu_acquire_saved interruptible;
this allows getting out of the spufs code when e.g. pressing ctrl+c.
There are a few places where we get called e.g. from spufs teardown
routines where we can't simply err out, so these are left with a
comment. For now I've also not touched the poll routines because it's
open what libspe would expect in terms of interrupted system calls.
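
The core of the change, as a hedged sketch (callers must now handle
the error return):

	static inline int spu_acquire(struct spu_context *ctx)
	{
		/* returns -EINTR if a signal arrives while waiting */
		return mutex_lock_interruptible(&ctx->state_mutex);
	}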

Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
e65c2f6fcebb9af0c3f53c796aff730dd657f5e7 20-Dec-2007 Luke Browning <lukebr@linux.vnet.ibm.com> [POWERPC] spufs: decouple spu scheduler from spufs_spu_run (asynchronous scheduling)

Change spufs_spu_run so that the context is queued directly to the
scheduler and the controlling thread advances directly to spufs_wait()
for spe errors and exceptions.

nosched contexts are treated the same as before.

Fixes from Christoph Hellwig <hch@lst.de>

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
b192541b39ed29ff82f9f2d5427f451e89617f1c 20-Dec-2007 Luke Browning <lukebr@linux.vnet.ibm.com> [POWERPC] spufs: spu_find_victim may choose wrong victim

Need to re-check priority after dropping lock. Otherwise, a
more favored context may be preempted.

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
91569531d1297db42d68136ac0c85cd85223d0b9 20-Dec-2007 Luke Browning <lukebr@linux.vnet.ibm.com> [POWERPC] spufs: reorganize spu_run_init

This cleans up spu_run_init so that it does all of the spu
initialization for spufs_run_spu. It initializes the spu context as
much as possible before it activates the spu and writes the runcntl
register.

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
d6ad39bc53521275d14fde86bfb94d9b2ddb7a08 20-Dec-2007 Jeremy Kerr <jk@ozlabs.org> [POWERPC] spufs: rework class 0 and 1 interrupt handling

Based on original patches from
Arnd Bergmann <arnd.bergman@de.ibm.com>; and
Luke Browning <lukebr@linux.vnet.ibm.com>

Currently, spu contexts need to be loaded to the SPU in order to take
class 0 and class 1 exceptions.

This change makes the actual interrupt-handlers much simpler (ie,
set the exception information in the context save area), and defers the
handling code to the spufs_handle_class[01] functions, called from
spufs_run_spu.

This should improve the concurrency of the spu scheduling leading to
greater SPU utilization when SPUs are overcommitted.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
33bfd7a73861c3727482c6c1c1c2ef40054060b7 20-Dec-2007 Arnd Bergmann <arnd.bergmann@de.ibm.com> [POWERPC] spufs: block fault handlers in spu_acquire_runnable

This change disables the logic that faults-in spu contexts under the
covers from the page fault handler. When a fault requires a runnable
context, the handler will block until the context is scheduled by
other means.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
7cd58e43810852eeb7af5a0c803f3890bd08b581 20-Dec-2007 Jeremy Kerr <jk@ozlabs.org> [POWERPC] spufs: move fault, lscsa_alloc and switch code to spufs module

Currently, part of the spufs code (switch.o, lscsa_alloc.o and fault.o)
is compiled directly into the kernel.

This change moves these components into the spufs module.

The lscsa and switch objects are fairly straightforward to move in.

For the fault.o module, we split the fault-handling code into two
parts: a/p/p/c/spu_fault.c and a/p/p/c/spufs/fault.c. The former is for
the in-kernel spu_handle_mm_fault function, and we move the rest of the
fault-handling code into spufs.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
9b1d21f858e8bad750ab19cac23dcbf79d099be3 20-Dec-2007 Julio M. Merino Vidal <jmerino@ac.upc.edu> [POWERPC] spufs: fix typos in sched.c comments

Fix a few typos in the spufs scheduler comments

Signed-off-by: Julio M. Merino Vidal <jmerino@ac.upc.edu>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
c0e7b4aa1c09ea992808ea8c079141bc8dd0f5bc 19-Sep-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spusched: Fix null pointer dereference in find_victim

find_victim can dereference a NULL pointer when iterating over the list
of victim spus because list_mutex only guarantees spu->ctx to be stable,
but of course not to be non-NULL.

Also fix find_victim to not call spu_unbind_context without list_mutex
because that violates the above guarantee.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
36ddbb1380f282b4280c57efdb646dd8647a789f 19-Sep-2007 Andre Detsch <adetsch@br.ibm.com> [POWERPC] spufs: Fix race condition on gang->aff_ref_spu

Affinity reference point location (gang->aff_ref_spu) is reset
when the whole gang is descheduled. However, the last member of
a gang can be descheduled while we are trying to schedule another
member of the gang. This was leading to a race condition, and
the code was using gang->aff_ref_spu in an unsafe manner.

By holding the gang->aff_mutex a little bit longer, and incrementing
gang->aff_sched_count (which controls when gang->aff_ref_spu
should be reset) a little bit earlier, the problem is fixed.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
683e3ab2b54a7190ec8517705880171cc8ac1d92 31-Jul-2007 Andre Detsch <adetsch@br.ibm.com> [POWERPC] spufs: Fix affinity after introduction of node_allowed() calls

This patch fixes affinity reference point placement, which was not being
done in some situations, after the introduction of node_allowed() calls.

The previously used parameter, 'ctx', is just the iterator of the
previous list_for_each_entry_reverse loop, and its value might be
invalid at the end of the loop. Also, the right context to seek
for information when defining the reference ctx location
_is_ the reference ctx.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
6f6a6dc0c8ebdb6514ab6bb58ba4b8739957b342 25-Jul-2007 Masato Noguchi <Masato.Noguchi@jp.sony.com> [POWERPC] spufs: Fix incorrect initialization of cbe_spu_info.spus

We currently initialize cbe_spu_info[].spus in both init_spu_base and
spu_sched_init. The initialisation in spu_sched_init clears the SPU list,
so we end up with no physical SPUs. Because of this, the spu_run
syscall will block forever.

This change removes the unnecessary initialization in spu_sched_init.

Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
486acd4850dde6d2f8c7f431432f3914c4bfb5f5 20-Jul-2007 Christoph Hellwig <hch@lst.de> [CELL] spufs: rework list management and associated locking

This sorts out the various lists and related locks in the spu code.

In detail:

- the per-node free_spus and active_list are gone. Instead struct spu
gained an alloc_state member telling whether the spu is free or not
- the per-node spus array is now locked by a per-node mutex, which
takes over from the global spu_lock and the per-node active_mutex
- the spu_alloc* and spu_free functions are gone as the state change is
now done inline in the spufs code. This allows some more sharing of
code for the affinity vs normal case and more efficient locking
- some little refactoring in the affinity code for this locking scheme

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
1474855d0878cced6f39f51f3c2bd7428b44cb1e 20-Jul-2007 Bob Nelson <rrnelson@linux.vnet.ibm.com> [CELL] oprofile: add support to OProfile for profiling CELL BE SPUs

From: Maynard Johnson <mpjohn@us.ibm.com>

This patch updates the existing arch/powerpc/oprofile/op_model_cell.c
to add in the SPU profiling capabilities. In addition, a 'cell' subdirectory
was added to arch/powerpc/oprofile to hold Cell-specific SPU profiling code.
Exports spu_set_profile_private_kref and spu_get_profile_private_kref which
are used by OProfile to store private profile information in spufs data
structures.

Also incorporated several fixes from other patches (rrn). Check pointer
returned from kzalloc. Eliminated unnecessary cast. Better error
handling and cleanup in the related area. 64-bit unsigned long parameter
was being demoted to 32-bit unsigned int and eventually promoted back to
unsigned long.

Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
36aaccc1e96481e8310b1d13600096da0f24ff43 20-Jul-2007 Bob Nelson <rrnelson@linux.vnet.ibm.com> [CELL] oprofile: enable SPU switch notification to detect currently active SPU tasks

From: Maynard Johnson <mpjohn@us.ibm.com>

This patch adds to the capability of spu_switch_event_register so that
the caller is also notified of currently active SPU tasks.
Exports spu_switch_event_register and spu_switch_event_unregister so
that OProfile can get access to the notifications provided.

Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
cbc23d3e7cb3c9fd3c9fce0bc3f44f687a9517c0 20-Jul-2007 Arnd Bergmann <arnd@arndb.de> [CELL] spufs: integration of SPE affinity with the scheduler

This patch makes the scheduler honor affinity information for each
context being scheduled. If the context has no affinity information,
behaviour is unchanged. If there is affinity information, the context
is scheduled to run on the exact spu recommended by the affinity
placement algorithm.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
c5fc8d2a92461fcabd00dfd678204cba36b93119 20-Jul-2007 Arnd Bergmann <arnd@arndb.de> [CELL] cell: add placement computation for scheduling of affinity contexts

This patch provides the spu affinity placement logic for the spufs scheduler.
Each time a gang is going to be scheduled, the placement of a reference
context is defined. The placement of all other contexts with affinity from
the gang is defined based on this reference context location and on a
precomputed displacement offset.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
aa6d5b20254a21b69092dd839b70ee148303ef25 20-Jul-2007 Arnd Bergmann <arnd@arndb.de> [CELL] cell: add per BE structure with info about its SPUs

Addition of a spufs-global "cbe_info" array. Each entry contains information
about one Cell/B.E. node, namely:
* list of spus (both free and busy spus are in this list);
* list of free spus (replacing the static spu_list from spu_base.c)
* number of spus;
* number of reserved (non-schedulable) spus.

SPE affinity implementation actually requires only access to one spu per
BE node (since it implements its own pointer to walk through the other spus
of the ring) and the number of schedulable spus (n_spus - non_sched_spus)
However having this more general structure can be useful for other
functionalities, concentrating per-cbe statistics / data.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
7e90b74967ea54dbd6eb539e1cb151ec37f63d7f 20-Jul-2007 Masato Noguchi <Masato.Noguchi@jp.sony.com> [CELL] spufs: use find_first_bit() instead of sched_find_first_bit()

spu_sched->bitmap has MAX_PRIO (=140) width in bits. However, since
ff80a77f20f811c0cc5b251d0f657cbc6f788385, sched_find_first_bit()
only supports 100-bit bitmaps.

Thus, spu_sched->bitmap should be handled by the generic find_first_bit().
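
A sketch of the fixed lookup:

	/* scans all MAX_PRIO (140) bits, not just the first 100 */
	best = find_first_bit(spu_sched->bitmap, MAX_PRIO);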

Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
27ec41d3a1d4df2b7cd190e93aad22ab86a72aa1 20-Jul-2007 Andre Detsch <adetsch@br.ibm.com> [CELL] spufs: add spu stats in sysfs and ctx stat file in spufs

This patch exports per-context statistics in spufs as well as spu
statistics in sysfs.

It was formed by merging:
"spufs: add spu stats in sysfs" From: Christoph Hellwig
"spufs: add stat file to spufs" From: Christoph Hellwig
"spufs: fix libassist accounting" From: Jeremy Kerr
"spusched: fix spu utilization statistics" From: Luke Browning
And some adjustments by myself, after suggestions on cbe-oss-dev.

Having separate patches was making the review process harder than it
should be, as we ended up integrating the spu and ctx statistics
accounting much more than in the first implementation.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
e840cfe6814d6f13ecb86cff7097ad7259df502e 20-Jul-2007 Jeremy Kerr <jk@ozlabs.org> [CELL] spufs: Remove spurious WARN_ON for spu_deactivate for NOSCHED contexts

In 46cbf93960e64f313f6e247cbca7afaa50e3ee2c we added a WARN_ON for
calling spu_deactivate on contexts created with the SPU_CREATE_NOSCHED
flag. However, all NOSCHED contexts will need to be deactivated when
the context is destroyed, so this gives a spurious warning when any
NOSCHED context is closed.

This change removes the WARN_ON.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
d1450317554d52e0e4a454806c4d05bb2a834f00 20-Jul-2007 Sebastian Siewior <sebastian@breakpoint.cc> [CELL] spufs: remove section mismatch warning

WARNING: arch/powerpc/platforms/cell/spufs/spufs.o(.init.text+0x158): Section
mismatch: reference to .exit.text:.spu_sched_exit (between '.init_module' and
'.spu_sched_init')

was introduced by c99c1994a2bb9493b4ac372b2b6ee2606d291171
This patch removes the warning.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
fe2f896d67b89a409c366c9a69e30291ab124467 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: Add spu stats in sysfs

Export spu statistics in sysfs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
27449971e6907ff38bde7bbc4647e55bd7309fc3 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spusched: Fix runqueue corruption

spu_activate can be called from multiple threads at the same time on
behalf of the same spu context. We need to make sure to only add it
once to avoid runqueue corruption.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
c77239b8be74f775142d9dd01041e2ce864ba20d 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spusched: Disable tick when not needed

Only enable the scheduler tick if we have any context waiting to be
scheduled.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
e9f8a0b65ac716fd7974159240ce34bddea780b3 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: Add stat file to spufs

Export per-context statistics in spufs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
65de66f0b8bcb7431d9df82cf32b002062b3a611 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: Implement /proc/spu_loadavg

Provide load average information for spu contexts. The format
is identical to /proc/loadavg, which is also where a lot of the code
and concepts are borrowed from.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
476273adc7277333aed9963bc4dc9b39066d3038 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: Add tid file

The new tid file contains the ID of the thread currently running the
context, if any. This is used so that the new spu-top and spu-ps
tools can find the thread in /proc.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
7022543ee404880aab5c641e4983e237815edc35 29-Jun-2007 Jeremy Kerr <jk@ozlabs.org> [POWERPC] spufs: Trivial whitespace fixes

Remove redundant whitespace in arch/powerpc/platforms/cell/spufs/

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
df09cf3e2cd597d373f3a6046df0e0a50881ea44 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spusched: No preemption for nosched contexts

And last but not least we need to make sure the scheduler tick never
preempts a nosched context.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
46cbf93960e64f313f6e247cbca7afaa50e3ee2c 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spusched: Catch nosched contexts in spu_deactivate

spu_deactivate should never be called for nosched contexts. Put in
a check so we can print a stacktrace and exit early in case it
happens erroneously.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
ea1ae5949d7fcd2e622226ba71741a0f43b6ef0a 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spusched: fix cpu/node binding

Add a cpus_allowed field to struct spu_context so that we always
use the cpu mask of the owning thread instead of the one happening to
call into the scheduler. Also use this information in
grab_runnable_context to avoid spurious wakeups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2cf2b3b49f10d2f4a0703070fc54ce1cd84a6cda 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spusched: Update scheduling parameters on every spu_run

Update scheduling information on every spu_run to allow for setting
threads to realtime priority just before running them. This requires
some slightly ugly code in spufs_run_spu because we can just update
the information unlocked if the spu is not runnable, but we need to
acquire the active_mutex when it is runnable to protect against
find_victim. This locking scheme requires opencoding
spu_acquire_runnable in spufs_run_spu which actually is a nice cleanup
all by itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
f3f59bec0c7ad083e9c95a550bcb1e9ca27e25f4 29-Jun-2007 Jeremy Kerr <jk@ozlabs.org> [POWERPC] spusched: Print out scheduling tunables with DEBUG

Print out a few scheduler tuning parameters when we've compiled
with DEBUG defined.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
60e242393346c1a9a64e7b14dfb7f613a737324f 29-Jun-2007 Jeremy Kerr <jk@ozlabs.org> [POWERPC] spusched: Fix timeslice calculations

The current timeslice code mixes 'jiffies' up with 'spesched ticks'. This
change correctly defines the number of time slices each SPE context is
given, and clarifies the comment.

This brings the default timeslice for SPE contexts into a reasonable
range.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
fe443ef2ac421c9c652e251e8733e2479d8e411a 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spusched: Dynamic timeslicing for SCHED_OTHER

Enable preemptive scheduling for non-RT contexts.

We use the same algorithms as the CPU scheduler to calculate the time
slice length, and for now we also use the same timeslice length as the
CPU scheduler. This might not be enough for good performance and can be
changed after some benchmarking.

Note that currently we do not boost the priority for contexts waiting
on the runqueue for a long time, so contexts with a higher nice value
could starve ones with less priority. This could easily be fixed once
the rework of the spu lists that Luke and I discussed is done.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
379018022071489a7dffee74db2a267465dab561 29-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spusched: Switch from workqueues to kthread + timer tick

Replace the scheduler workqueues, which complicated things a lot, with
a dedicated spu scheduler thread that gets woken by a traditional
scheduler tick. By default this scheduler tick runs at HZ / 10, aka
one spu scheduler tick for every 10 cpu ticks.

Currently the tick is not disabled when we have fewer contexts than
available spus, but I will implement this later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
e5c0b9ec538a86433ddd725f675e0a5a2117b9ed 05-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: Don't yield nosched context

Nosched contexts should never be scheduled out, thus we must never
deactivate them in spu_yield.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
bb5db29aa0379f0f3ef857a3a3715f17261c611b 04-Jun-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs scheduler: Fix wakeup races

Fix the race between checking for contexts on the runqueue and actually
waking them in spu_deactive and spu_yield.

The guts of spu_reschedule are split into a new helper called
grab_runnable_context, which checks if there is a runnable thread below
a specified priority and, if yes, removes it from the runqueue and uses
it. This function is used by the new __spu_deactivate helper shared
by preemption and spu_yield to grab a new context before deactivating
the old one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
e63340ae6b6205fef26b40a75673d1c9c0c8bb90 08-May-2007 Randy Dunlap <randy.dunlap@oracle.com> header cleaning: don't include smp_lock.h when not used

Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4e0f4ed0df71013290cd2a01f7b84264f7b99678 23-Apr-2007 Luke Browning <lukebrowning@us.ibm.com> [POWERPC] spu sched: make addition to stop_wq and runqueue atomic vs wakeup

Addition to stop_wq needs to happen before adding to the runqueue and
under the same lock so that we don't have a race window for a lost
wake up in the spu scheduler.

Signed-off-by: Luke Browning <lukebrowning@us.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
a475c2f43520cb095452201da57395000cfeb94c 23-Apr-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: remove woken threads from the runqueue early

A single context should only be woken once, and we should not have
more wakeups for a given priority than the number of contexts on
that runqueue position.

Also add some asserts to trap future problems in this area more
easily.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
390c53430498c9973e015432806edd53b2efe6c6 23-Apr-2007 Arnd Bergmann <arnd.bergmann@de.ibm.com> [POWERPC] spufs: add memory barriers after set_bit

set_bit does not guarantee ordering on powerpc, so using it
for communication between threads requires explicit
mb() calls.
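
A sketch of the pattern (the flag name is hypothetical; stop_wq is the
context's wakeup queue mentioned later in this log):

	set_bit(SPU_SCHED_WAKE, &ctx->sched_flags);
	/* set_bit() does not imply a barrier on powerpc; order the
	   flag store before the wakeup */
	mb();
	wake_up_all(&ctx->stop_wq);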

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
e097b513285e616215b23af234d127298bb8d89a 23-Apr-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spu sched: ensure preempted threads are put back on the runqueue, part2

To not lose a spu thread we need to make sure it always gets put back
on the runqueue. In find_victim as well as in the scheduler tick, as done
in the previous patch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
b3e76cc3244ac139fc75750c5af9edbb9f191a10 23-Apr-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spu sched: ensure preempted threads are put back on the runqueue

To not lose a spu thread we need to make sure it always gets put back
on the runqueue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
0887309589824fb1c3744c69a330c99c369124a0 23-Apr-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: use cancel_rearming_delayed_workqueue when stopping spu contexts

The scheduler workqueue may rearm itself and deadlock when we try to stop
it. Put a flag in place to skip the work if we're tearing down
the context.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
dbf8eefa2b814d6922492120bfa46d4bc42ceb20 23-Mar-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: don't yield CPU in spu_yield

There is no reason to yield the CPU in spu_yield - if the backing
thread reenters spu_run it gets added to the end of the runqueue for
it's priority. So the yield is just a slowdown for the case where
we have higher priority contexts waiting.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
94b2a4393c500a620de90c3266d595926302e26b 10-Mar-2007 Benjamin Herrenschmidt <benh@kernel.crashing.org> [POWERPC] Fix spu SLB invalidations

The SPU code doesn't properly invalidate SPUs SLBs when necessary,
for example when changing a segment size from the hugetlbfs code. In
addition, it saves and restores the SLB content on context switches
which makes it harder to properly handle those invalidations.

This patch removes the saving & restoring for now, something more
efficient might be found later on. It also adds a spu_flush_all_slbs(mm)
that can be used by the core mm code to flush the SLBs of all SPEs that
are running a given mm at the time of the flush.

In order to do that, it adds a spinlock to the list of all SPEs and
moves some bits & pieces from spufs to spu_base.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
50b520d4efbce45281f58112789470ec7965fd33 10-Mar-2007 Christoph Hellwig <hch@lst.de> [POWERPC] avoid SPU_ACTIVATE_NOWAKE optimization

This optimization was added recently but is still buggy,
so back it out for now.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2eb1b12049844a8ebc670e0e4fc908bc3f8933d3 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spu sched: static timeslicing for SCHED_RR contexts

For SCHED_RR tasks we can do some really trivial timeslicing. Basically
we fire up a timer for every scheduler tick that searches for a higher
or same priority thread that is on the runqueue, and if there is one,
context switches to it. Because we can't lock spus from timer context
we actually run this from a delayed workqueue instead of a timer.

A nice optimization would be to skip the actual priority bitmap search
when there are less contexts than physical spus available. To implement
this I need a so far unpublished patch from Andre, and it will be added
after we have that patch in.

Note that right now we only do the time slicing for SCHED_RR tasks.
The code would work for SCHED_OTHER tasks as well, but their prio
value is derived from the one the PPU thread has at the time of
spu_run, and using this for spu scheduling decisions would make the
code very unfair. SCHED_OTHER support will be enabled once the spu
scheduler knows how to calculate spu_context.prio (very soon)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
72cb360839f88c02ccf38f1df214316e05886ff3 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spu sched: use DECLARE_BITMAP

Use DECLARE_BITMAP in the spu scheduler instead of reimplementing it.
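
A sketch of the result (runqueue layout as described elsewhere in
this log):

	struct spu_prio_array {
		DECLARE_BITMAP(bitmap, MAX_PRIO);
		struct list_head runq[MAX_PRIO];
	};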

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
52f04fcf66a5d5d90790d6cfde52e391ecf2b882 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spu sched: forced preemption at execution

If we start a spu context with realtime priority we want it to run
immediately and not wait until some other lower priority thread has
finished. Try to find a suitable victim and use its spu in this
case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
ae7b4c5284d11d49ed9432c16505fcbeb8d3b8cf 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spu sched: update some comments

Give spu_yield a kerneldoc comment and remove the old comment
documenting spu_activate, spu_deactivate and spu_yield, as all of them
now have descriptive kerneldoc comments of their own.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
678b2ff1e65ecccdb15cbfe97081572fc35944b7 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spu sched: simplify spu_remove_from_active_list

If we call spu_remove_from_active_list, the spu is always guaranteed
to be on the active list and in runnable state, so we can simply
do a list_del to remove it and unconditionally take the was_active
codepath.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
26bec67386dbf6ef887254e815398842e182cdcd 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: optimize spu_run

There is no need to directly wake up contexts in spu_activate when
called from spu_run, so add a flag to suppress this wakeup.
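In sketch form (the flag name matches the SPU_ACTIVATE_NOWAKE
optimization that the 10-Mar-2007 entry above backs out; the exact
signature is illustrative):

    /* Sketch: spu_run is itself the woken thread, so it asks
     * spu_activate to skip the redundant wakeup. */
    int spu_activate(struct spu_context *ctx, u64 flags);

    /* in spu_run: */
    ret = spu_activate(ctx, SPU_ACTIVATE_NOWAKE);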

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
079cdb61614c466c939ebf74c7ef6745667bc61e 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: runqueue simplification

This is the biggest patch in this series, and it reworks the guts of
the spu scheduler runqueue mechanism:

- instead of embedding a waitqueue in the runqueue there is now a
simple doubly-linked list; the actual wakeups happen by reusing
the stop_wq in the spu context (maybe we should rename it one day),
as sketched below
- spu_free and spu_prio_wakeup are merged into a single spu_reschedule
function
- various functionality is split out into small helpers, and kerneldoc
comments are added in various places to document what's going on.
- spu_activate is rewritten into a tight loop by removing tests for
various impossible conditions and using the infrastructure in this
patch.
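The runqueue part, sketched (the rq list field and the spu_prio
global are illustrative names; stop_wq is the existing context
waitqueue mentioned above):

    /* Sketch: enqueue a waiting context on its priority list ... */
    list_add_tail(&ctx->rq, &spu_prio->runq[ctx->prio]);
    set_bit(ctx->prio, spu_prio->bitmap);

    /* ... and wake the highest-priority waiter on reschedule. */
    wake_up(&best->stop_wq);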

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
8389998ae9ea2888c86c446f7911ddced50052a1 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: move prio to spu_context

It doesn't make any sense to have a priority field in the physical spu
structure. Move it into the spu context instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
650f8b0291ecd0abdeadbd0ff3d70c3538e55405 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: simplify state_mutex

The r/w semaphore to lock the spus was overkill and can be replaced
with a mutex to make it faster, simpler and easier to debug. It also
helps to allow making most spufs operations interruptible in future
patches.
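In diff form, the heart of the change (illustrative):

    -    struct rw_semaphore state_sema;
    +    struct mutex state_mutex;

    -    down_write(&ctx->state_sema);
    +    mutex_lock(&ctx->state_mutex);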

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
202557d29eae528f464652e92085f3b19b05a0a7 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: sched.c cleanups

Various cleanups to sched.c that don't change the global control flow:

- add kerneldoc comments to various functions
- add spu_ prefixes to various functions
- add/remove context from the runqueue in bind/unbind_context as
it's part of the logical operation
- add a call to put_active_spu to spu_unbind_context as it's logically
part of the unbind operation

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
81998bafe299b8b675157f0a4dfe8dad43215da9 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: bind_context sets SPU_STATE_RUNNABLE

Only bind_context/unbind_context change the spu context state. Thus
we can move all assignments of SPU_STATE_RUNNABLE into bind_context,
which parallels the unbind side as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
aa56c16807ba7b8e801216cab012d2f498755ba5 13-Feb-2007 Christoph Hellwig <hch@lst.de> [POWERPC] spufs: remove superfluous SPU_STATE_SAVED assignments

unbind_context already sets the context state to SPU_STATE_SAVED, thus
the spu_deactivate callers don't need to do it again.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
867672777964b9309e4e914fe097648c938b67b2 04-Oct-2006 Arnd Bergmann <arnd@arndb.de> [POWERPC] spufs: add infrastructure for finding elf objects

This adds an 'object-id' file that the spe library can
use to store a pointer to its ELF object. This was
originally meant for use by oprofile, but is now
also used by the GNU debugger, if available.

In order for oprofile to find the location in an spu-elf
binary where an event counter triggered, we need a way
to identify the binary in the first place.

Unfortunately, that binary itself can be embedded in a
powerpc ELF binary. Since we can assume it is mapped into
the effective address space of the running process,
have that one write the pointer value into a new spufs
file.
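From the library side that amounts to something like this (a sketch;
the path and the formatting are illustrative):

    /* Sketch: publish the mapped spu-elf address to the kernel. */
    char buf[32];
    int fd = open("/spu/myctx/object-id", O_WRONLY);

    snprintf(buf, sizeof(buf), "0x%lx", (unsigned long)elf_image);
    write(fd, buf, strlen(buf));
    close(fd);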

When a context switch occurs, pass the user value to
the profiler so that it can look at the mapped file (with
some care).

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
9add11daeee2f6d69f6b86237f197824332a4a3b 04-Oct-2006 Arnd Bergmann <arnd@arndb.de> [POWERPC] spufs: implement error event delivery to user space

This tries to fix spufs so we have an interface closer to what is
specified in the man page for events returned in the third argument of
spu_run.

Fortunately, libspe has never used the returned contents of that
register, as they were the same as the return code of spu_run (duh!).

Unlike the specification that we never implemented correctly, we now
require a SPU_CREATE_EVENTS_ENABLED flag passed to spu_create, in
order to get the new behavior. When this flag is not passed, spu_run
will simply ignore the third argument now.
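Sketched user-space usage (there are no libc wrappers for these
syscalls, so raw syscall() is assumed):

    /* Sketch: only contexts created with the flag get error events
     * delivered through spu_run's third argument. */
    int fd = syscall(__NR_spu_create, "/spu/myctx",
                     SPU_CREATE_EVENTS_ENABLED, 0755);
    uint32_t npc = 0, event = 0;

    syscall(__NR_spu_run, fd, &npc, &event);  /* event is filled in */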

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
a68cf983f635930ea35f9e96b27d96598550dea0 04-Oct-2006 Mark Nutter <mnutter@us.ibm.com> [POWERPC] spufs: scheduler support for NUMA.

This patch adds NUMA support to the spufs scheduler.

The new arch/powerpc/platforms/cell/spufs/sched.c is greatly
simplified, in an attempt to reduce complexity while adding
support for NUMA scheduler domains. SPUs are allocated starting
from the calling thread's node, moving to others as supported by
current->cpus_allowed. Preemption is gone as it was buggy, but
should be re-enabled in another patch when stable.

The new arch/powerpc/platforms/cell/spu_base.c maintains idle
lists on a per-node basis, and allows caller to specify which
node(s) an SPU should be allocated from, while passing -1 tells
spu_alloc() that any node is allowed.
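In sketch form (error handling omitted; spu_alloc() taking a node
argument is as described above):

    /* Sketch: prefer the calling thread's node, then fall back to
     * any node allowed by current->cpus_allowed (-1 = any node). */
    int node = cpu_to_node(raw_smp_processor_id());
    struct spu *spu = spu_alloc(node);

    if (!spu)
        spu = spu_alloc(-1);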

Since the patch removes the currently implemented preemptive
scheduling, it is technically a regression, but practically
all users have since migrated to this version, as it is
part of the IBM SDK and the yellowdog distribution, so there
is not much point holding it back while the new preemptive
scheduling patch gets delayed further.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
6ab3d5624e172c553004ecc862bfeac16d9d68b7 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de> Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
a91942ae7ebd518006dcbeb2a1d7b147253c080e 19-Jun-2006 Geoff Levand <geoffrey.levand@am.sony.com> [POWERPC] spufs: fix spu irq affinity setting

This changes the hypervisor abstraction of setting cpu affinity to a
higher level to avoid platform dependent interrupt controller
routines. I replaced spu_priv1_ops:spu_int_route_set() with a
new routine spu_priv1_ops:spu_cpu_affinity_set().

As a by-product, this change eliminated what looked like an
existing bug in the set affinity code where spu_int_route_set()
mistakenly called int_stat_get().

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
a33a7d7309d79656bc19a0e96fc4547a1633283e 23-Mar-2006 Arnd Bergmann <abergman@de.ibm.com> [PATCH] spufs: implement mfc access for PPE-side DMA

This patch adds a new file called 'mfc' to each spufs directory.
The file accepts DMA commands that are a subset of what would
be legal DMA commands for problem state register access. Upon
reading the file, a bitmask is returned with the completed
tag groups set.
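Sketched PPE-side usage (the struct and command names follow the
spufs MFC interface but should be treated as illustrative here):

    /* Sketch: queue a DMA putting len bytes from local store out to
     * the process's buffer, then read back the completed tag mask. */
    struct mfc_dma_command cmd = {
        .lsa  = ls_offset,                /* local store address */
        .ea   = (uint64_t)(unsigned long)buf,
        .size = len,
        .tag  = 1,
        .cmd  = MFC_PUT_CMD,
    };
    uint32_t done_tags;

    write(mfc_fd, &cmd, sizeof(cmd));
    read(mfc_fd, &done_tags, sizeof(done_tags));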

The file is meant to be used from an abstraction in libspe
that is added by a different patch.

From the kernel perspective, this means a process can now
offload a memory copy from or into an SPE local store
without having to run code on the SPE itself.

The transfer will only be performed while the SPE is owned
by one thread that is waiting in the spu_run system call
and the data will be transferred into that thread's
address space, independent of which thread started the
transfer.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2fb9d2063626374dd8a2514b3a730facac8235d8 05-Jan-2006 Arnd Bergmann <arnd@arndb.de> [PATCH] spufs: set irq affinity for running threads

So far, all SPU triggered interrupts always end up on
the first SMT thread, which is a bad solution.

This patch implements setting the affinity to the
CPU that was running last when entering execution on
an SPU. This should result in a significant reduction
in IPI calls and better cache locality for SPE thread
specific data.
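Conceptually, a one-line sketch (using the spu_cpu_affinity_set()
abstraction introduced by the 19-Jun-2006 entry above):

    /* Sketch: on entering spu_run, steer this spu's interrupts to
     * the cpu the controlling thread is running on. */
    spu_cpu_affinity_set(spu, raw_smp_processor_id());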

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
8837d9216f99048636fbb2c11347358e99e06181 04-Jan-2006 Arnd Bergmann <arnd@arndb.de> [PATCH] spufs: clean up use of bitops

Checking bits manually might not be synchronized with
the use of set_bit/clear_bit. Make sure we always use
the correct bitops by removing the unnecessary
identifiers.
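That is, changes of this shape (the flag name is illustrative):

    -    if (spu->flags & (1 << SPU_CONTEXT_SWITCH_PENDING))
    +    if (test_bit(SPU_CONTEXT_SWITCH_PENDING, &spu->flags))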

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
7945a4a27d5d914918b7637b055e01abfe05906e 09-Dec-2005 Arnd Bergmann <arnd@arndb.de> [PATCH] spufs: trivial compile fix

One of my last patches contained a broken line
from splitting out some other changes; this
restores a working version.

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2a911f0bb73e67826062b7d073dd7367ca449724 06-Dec-2005 Arnd Bergmann <arnd@arndb.de> [PATCH] spufs: Improved SPU preemptability [part 2].

This patch reduces lock complexity of SPU scheduler, particularly
for involuntary preemptive switches. As a result the new code
does a better job of mapping the highest priority tasks to SPUs.

Lock complexity is reduced by using the system default workqueue
to perform involuntary saves. In this way we avoid nasty lock
ordering problems that the previous code had. A "minimum timeslice"
for SPU contexts is also introduced. The intent here is to avoid
thrashing.
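In sketch form (preempt_work and its handler are illustrative names):

    /* Sketch: defer the involuntary save to the default workqueue
     * instead of doing it under the scheduler locks. */
    schedule_work(&ctx->preempt_work);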

While the new scheduler does a better job at prioritization it
still does nothing for fairness.

From: Mark Nutter <mnutter@us.ibm.com>
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
5110459f181ef1f11200bb3dec61953f08cc49e7 06-Dec-2005 Arnd Bergmann <arnd@arndb.de> [PATCH] spufs: Improved SPU preemptability.

This patch makes it easier to preempt an SPU context by
having the scheduler hold ctx->state_sema for much shorter
periods of time.

As part of this restructuring, the control logic for the "run"
operation is moved from arch/ppc64/kernel/spu_base.c to
fs/spufs/file.c. Of course the base retains "bottom half"
handlers for class{0,1} irqs. The new run loop will re-acquire
an SPU if preempted.

From: Mark Nutter <mnutter@us.ibm.com>
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
3b3d22cb84a0bb12f6bbb2b1158972894bec3f21 06-Dec-2005 Arnd Bergmann <arnd@arndb.de> [PATCH] spufs: Turn off debugging output

spufs is rather noisy when debugging is enabled; this
turns off the messages for production use.

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
8b3d6663c6217e4f50cc3720935a96da9b984117 15-Nov-2005 Arnd Bergmann <arnd@arndb.de> [PATCH] spufs: cooperative scheduler support

This adds a scheduler for SPUs to make it possible to use
more logical SPUs than are physically present in the
system.

Currently, there is no support for preempting a running
SPU thread; it has to leave the SPU by either triggering
an event on the SPU that causes it to return to the
owning thread or by sending a signal to it.

This patch also adds operations that enable accessing an SPU
in either runnable or saved state. We use an RW semaphore
to protect the state of the SPU from changing underneath
us, while we are holding it readable. In order to change
the state, it is acquired writeable and a context save
or restore is executed before downgrading the semaphore
to read-only.
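In outline (a sketch; spu_save() and the csa field follow the naming
of the spufs context-switch code, but treat them as illustrative
here):

    /* Sketch: a state change upgrades to the write side, saves the
     * context, then downgrades back so readers can continue. */
    down_write(&ctx->state_sema);
    spu_save(&ctx->csa, ctx->spu);      /* full context save */
    ctx->state = SPU_STATE_SAVED;
    downgrade_write(&ctx->state_sema);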

From: Mark Nutter <mnutter@us.ibm.com>,
Uli Weigand <Ulrich.Weigand@de.ibm.com>
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>