History log of /kernel/acct.c
Revision Date Author Comments
067b722faf98adbe1e94581f39c06a7c82b58676 10-Oct-2014 Ying Xue <ying.xue@windriver.com> acct: eliminate compile warning

If ACCT_VERSION is not defined to 3, below warning appears:
CC kernel/acct.o
kernel/acct.c: In function `do_acct_process':
kernel/acct.c:475:24: warning: unused variable `ns' [-Wunused-variable]

[akpm@linux-foundation.org: retain the local for code size improvements
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
934fc295b30ea8ce5d5e0ab9024a10fab9b6f200 08-Aug-2014 Ionut Alexa <ionut.m.alexa@gmail.com> kernel/acct.c: fix coding style warnings and errors

Signed-off-by: Ionut Alexa <ionut.m.alexa@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2577d92ebd28dd9b3dacdfad6dcd81be0d21bbdf 31-Jul-2014 Ionut Alexa <ionut.m.alexa@gmail.com> kernel/acct.c: fix coding style warnings and errors

Signed-off-by: Ionut Alexa <ionut.m.alexa@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3064c3563ba4c23e2c7a47254ec056ed9ba0098a 07-Aug-2014 Al Viro <viro@zeniv.linux.org.uk> death to mnt_pinned

Rather than playing silly buggers with vfsmount refcounts, just have
acct_on() ask fs/namespace.c for internal clone of file->f_path.mnt
and replace it with said clone. Then attach the pin to original
vfsmount. Voila - the clone will be alive until the file gets closed,
making sure that underlying superblock remains active, etc., and
we can drop the original vfsmount, so that it's not kept busy.
If the file lives until the final mntput of the original vfsmount,
we'll notice that there's an fs_pin (one in bsd_acct_struct that
holds that file) and mnt_pin_kill() will take it out. Since
->kill() is synchronous, we won't proceed past that point until
these files are closed (and private clones of our vfsmount are
gone), so we get the same ordering warranties we used to get.

mnt_pin()/mnt_unpin()/->mnt_pinned is gone now, and good riddance -
it never became usable outside of kernel/acct.c (and racy wrt
umount even there).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
efb170c22867cdc6f770de441bdefecec6712199 07-Aug-2014 Al Viro <viro@zeniv.linux.org.uk> take fs_pin stuff to fs/*

Add a new field to fs_pin - kill(pin). That's what umount and r/o remount
will be calling for all pins attached to vfsmount and superblock resp.
Called after bumping the refcount, so it won't go away under us. Dropping
the refcount is responsibility of the instance. All generic stuff moved to
fs/fs_pin.c; the next step will rip all the knowledge of kernel/acct.c from
fs/super.c and fs/namespace.c. After that - death to mnt_pin(); it was
intended to be usable as generic mechanism for code that wants to attach
objects to vfsmount, so that they would not make the sucker busy and
would get killed on umount. Never got it right; it remained acct.c-specific
all along. Now it's very close to being killable.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1629d0eb3ead0e0c49e4402049ec7b5b31b81cd7 07-Aug-2014 Al Viro <viro@zeniv.linux.org.uk> start carving bsd_acct_struct up

pull generic parts into struct fs_pin. Eventually we want those
to replace mnt_pin()/mnt_unpin() mess; that stuff will move to
fs/*.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
215748e67d893169de9e62c3416e9e035e9e9c5f 07-Aug-2014 Al Viro <viro@zeniv.linux.org.uk> acct: move mnt_pin() upwards.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
17c0a5aaffa63da6b5c73a31e36616bdcd12d143 07-Aug-2014 Al Viro <viro@zeniv.linux.org.uk> make acct_kill() wait for file closing.

Do actual closing of file via schedule_work(). And use
__fput_sync() there.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2798d4ce61601808b965253d60624bbf201b51b0 07-Aug-2014 Al Viro <viro@zeniv.linux.org.uk> acct: get rid of acct_lock for acct->count

* make acct->count atomic and acct freeing - rcu-delayed.
* instead of grabbing acct_lock around the places where we take a reference,
do that under rcu_read_lock() with atomic_long_inc_not_zero().
* have the new acct locked before making ns->bacct point to it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
215752fce31c80f3b3a1530bc7cddb3ba6a69b3a 07-Aug-2014 Al Viro <viro@zeniv.linux.org.uk> acct: get rid of acct_list

Put these suckers on per-vfsmount and per-superblock lists instead.
Note: right now it's still acct_lock for everything, but that's
going to change.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
54a4d58a6459a93fc6ee898354b3d2ffb80dd03a 19-Apr-2014 Al Viro <viro@zeniv.linux.org.uk> acct: simplify check_free_space()

a) file can't be NULL
b) file can't be changed under us
c) all writes are serialized by acct->lock; no need to mess with
spinlock there.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
b8f00e6be46f4c9a112e05fd692712873c4c4048 07-Aug-2014 Al Viro <viro@zeniv.linux.org.uk> acct: new lifetime rules

Do not reuse bsd_acct_struct after closing the damn thing.
Structure lifetime is controlled by refcount now. We also
have a mutex in there, held over closing and writing (the
file is O_APPEND, so we are not losing any concurrency).

As the result, we do not need to bother with get_file()/fput()
on log write anymore. Moreover, do_acct_process() only needs
acct itself; file and pidns are picked from it.

Killed instances are distinguished by having NULL ->ns.
Refcount is protected by acct_lock; anybody taking the
mutex needs to grab a reference first.

The things will get a lot simpler in the next commits - this
is just the minimal chunk switching to the new lifetime rules.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9df7fa16ee956bf0cdf4a711eac827be92d584bc 15-May-2014 Al Viro <viro@zeniv.linux.org.uk> acct: serialize acct_on()

brute-force - on a global mutex that isn't nested into anything.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
795a2f22a8eaf749e20a11271a8821bf04ac6d90 07-May-2014 Al Viro <viro@zeniv.linux.org.uk> acct() should honour the limits from the very beginning

We need to check free space on the first write to freshly opened log.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
e25ff11ff16aba000dfe9e568d867e5142c31f16 07-May-2014 Al Viro <viro@zeniv.linux.org.uk> split the slow path in acct_process() off

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
cdd37e23092c3c6fbbb2e611f8c3d18e676bf28f 27-Apr-2014 Al Viro <viro@zeniv.linux.org.uk> separate namespace-independent parts of filling acct_t

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ed44724b79d8e03a40665436019cf22baba80d30 19-Apr-2014 Al Viro <viro@zeniv.linux.org.uk> acct: switch to __kernel_write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ecfdb33d1fbc7e6e095ba24dac2930208494e734 19-Apr-2014 Al Viro <viro@zeniv.linux.org.uk> acct: encode_comp_t(0) is 0, fortunately...

There was an amusing bogosity in ac_rw calculation - it tried to
do encode_comp_t(encode_comp_t(0) / 1024). Seeing that comp_t is
a 3-bit exponent + 13-bit mantissa... it's a good thing that 0 is
represented by all-bits-clear.

The history of that one is interesting - it was introduced in
2.1.68pre1, when acct.c had been reworked and moved to separate
file. Two months later (2.1.86) somebody has noticed that the
sucker won't compile - there was no task_struct::io_usage.
At which point the ac_io calculation had changed from
encode_comp_t(current->io_usage) to encode_comp_t(0) and the
bug in the next line (absolutely real back then, had it ever
managed to compile) become a harmless bogosity. Looks like
nobody has ever noticed until now.

Anyway, let's bury that idiocy now that it got noticed. 17 years
is long enough...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ccbf62d8a284cf181ac28c8e8407dd077d90dd4b 16-Jul-2014 Thomas Gleixner <tglx@linutronix.de> sched: Make task->start_time nanoseconds based

Simplify the timespec to nsec/usec conversions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
22001821d9cb6ddb83ee4e1f81e6b905de623165 12-Jun-2014 Thomas Gleixner <tglx@linutronix.de> acct: Use ktime_get_ts()

do_posix_clock_monotonic_gettime() is a leftover from the initial
posix timer implementation which maps to ktime_get_ts()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140611234606.764810535@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
46c0a8ca3e841b14a1d981e2116eaf2d1c7f2235 06-Jun-2014 Paul McQuade <paulmcquad@gmail.com> ipc, kernel: clear whitespace

trailing whitespace

Signed-off-by: Paul McQuade <paulmcquad@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7153e402731c3e72331633d1ac15a654768aecac 06-Jun-2014 Paul McQuade <paulmcquad@gmail.com> ipc, kernel: use Linux headers

Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
Use #include <linux/types.h> instead of <asm/types.h>

Signed-off-by: Paul McQuade <paulmcquad@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5ae98f1589e076e4b314fc54ae2beac58842ddc2 04-May-2013 Jan Kara <jack@suse.cz> fs: Fix hang with BSD accounting on frozen filesystem

When BSD process accounting is enabled and logs information to a
filesystem which gets frozen, system easily becomes unusable because
each attempt to account process information blocks. Thus e.g. every task
gets blocked in exit.

It seems better to drop accounting information (which can already happen
when filesystem is running out of space) instead of locking system up.
So we just skip the write if the filesystem is frozen.

Reported-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
03d95eb2f2578083a3f6286262e1cb5d88a00c02 20-Mar-2013 Al Viro <viro@zeniv.linux.org.uk> lift sb_start_write() out of ->write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
496ad9aa8ef448058e36ca7a787c61f2e63f0f54 23-Jan-2013 Al Viro <viro@zeniv.linux.org.uk> new helper: file_inode(file)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
6fac4829ce0ef9b7f24369086ce5f0e9f38d37bc 13-Nov-2012 Frederic Weisbecker <fweisbec@gmail.com> cputime: Use accessors to read task cputime stats

This is in preparation for the full dynticks feature. While
remotely reading the cputime of a task running in a full
dynticks CPU, we'll need to do some extra-computation. This
way we can account the time it spent tickless in userspace
since its last cputime snapshot.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
669abf4e5539c8aa48bf28c965be05c0a7b58a27 10-Oct-2012 Jeff Layton <jlayton@redhat.com> vfs: make path_openat take a struct filename pointer

...and fix up the callers. For do_file_open_root, just declare a
struct filename on the stack and fill out the .name field. For
do_filp_open, make it also take a struct filename pointer, and fix up its
callers to call it appropriately.

For filp_open, add a variant that takes a struct filename pointer and turn
filp_open into a wrapper around it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
91a27b2a756784714e924e5e854b919273082d26 10-Oct-2012 Jeff Layton <jlayton@redhat.com> vfs: define struct filename and have getname() return it

getname() is intended to copy pathname strings from userspace into a
kernel buffer. The result is just a string in kernel space. It would
however be quite helpful to be able to attach some ancillary info to
the string.

For instance, we could attach some audit-related info to reduce the
amount of audit-related processing needed. When auditing is enabled,
we could also call getname() on the string more than once and not
need to recopy it from userspace.

This patchset converts the getname()/putname() interfaces to return
a struct instead of a string. For now, the struct just tracks the
string in kernel space and the original userland pointer for it.

Later, we'll add other information to the struct as it becomes
convenient.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
cfd4da175599938f21a81cdd80df02fa4151dcba 10-Oct-2012 Jeff Layton <jlayton@redhat.com> acct: constify the name arg to acct_on

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
f8f3d4de2d04e1a5b4293b67faee8ebabc64e9fa 08-Feb-2012 Eric W. Biederman <ebiederm@xmission.com> userns: Convert bsd process accounting to use kuid and kgid where appropriate

BSD process accounting conveniently passes the file the accounting
records will be written into to do_acct_process. The file credentials
captured the user namespace of the opener of the file. Use the file
credentials to format the uid and the gid of the current process into
the user namespace of the user that started the bsd process
accounting.

Cc: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
d8c9584ea2a92879f471fd3a2be3af6c534fb035 08-Dec-2011 Al Viro <viro@zeniv.linux.org.uk> vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
32dc730860155b235f13e0cd3fe58b263279baf9 09-Dec-2011 Al Viro <viro@zeniv.linux.org.uk> get rid of timer in kern/acct.c

... and clean it up a bit, while we are at it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
648616343cdbe904c585a6c12e323d3b3c72e46f 15-Dec-2011 Martin Schwidefsky <schwidefsky@de.ibm.com> [S390] cputime: add sparse checking and cleanup

Make cputime_t and cputime64_t nocast to enable sparse checking to
detect incorrect use of cputime. Drop the cputime macros for simple
scalar operations. The conversion macros are still needed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
ebabe9a9001af0af56c0c2780ca1576246e7a74b 07-Jul-2010 Christoph Hellwig <hch@lst.de> pass a struct path to vfs_statfs

We'll need the path to implement the flags field for statvfs support.
We do have it available in all callers except:

- ecryptfs_statfs. This one doesn't actually need vfs_statfs but just
needs to do a caller to the lower filesystem statfs method.
- sys_ustat. Add a non-exported statfs_by_dentry helper for it which
doesn't won't be able to fill out the flags field later on.

In addition rename the helpers for statfs vs fstatfs to do_*statfs instead
of the misleading vfs prefix.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11cad320a4f4bc53d3585c85600c782faa12b99e 11-May-2010 Vitaliy Gusev <vgusev@openvz.org> bsdacct: use del_timer_sync() in acct_exit_ns()

acct_exit_ns --> acct_file_reopen deletes timer without check timer
execution on other CPUs. So acct_timeout() can change an unmapped memory.

Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
05b90496f2f366b9d3eea468351888ddf010782a 07-Apr-2010 Eric Paris <eparis@redhat.com> security: remove dead hook acct

Unused hook. Remove.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
4dd66e69d472f0ba5355a2529364d0db9a18a02b 11-Mar-2010 Veaceslav Falico <vfalico@redhat.com> copy_signal() cleanup: kill taskstats_tgid_init() and acct_init_pacct()

Kill unused functions taskstats_tgid_init() and acct_init_pacct() because
we don't use them anywhere after using kmem_cache_zalloc() in
copy_signal().

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4b731d50ff3df6b9141a6c12b088e8eb0109e83c 15-Dec-2009 Alexey Dobriyan <adobriyan@gmail.com> bsdacct: fix uid/gid misreporting

commit d8e180dcd5bbbab9cd3ff2e779efcf70692ef541 "bsdacct: switch
credentials for writing to the accounting file" introduced credential
switching during final acct data collecting. However, uid/gid pair
continued to be collected from current which became credentials of who
created acct file, not who exits.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14676

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: Juho K. Juopperi <jkj@kapsi.fi>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
d8e180dcd5bbbab9cd3ff2e779efcf70692ef541 20-Aug-2009 Michal Schmidt <mschmidt@redhat.com> bsdacct: switch credentials for writing to the accounting file

When process accounting is enabled, every exiting process writes a log to
the account file. In addition, every once in a while one of the exiting
processes checks whether there's enough free space for the log.

SELinux policy may or may not allow the exiting process to stat the fs.
So unsuspecting processes start generating AVC denials just because
someone enabled process accounting.

For these filesystem operations, the exiting process's credentials should
be temporarily switched to that of the process which enabled accounting,
because it's really that process which wanted to have the accounting
information logged.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
df279ca8966c3de83105428e3391ab17690802a9 30-Jun-2009 Renaud Lottiaux <renaud.lottiaux@kerlabs.com> bsdacct: fix access to invalid filp in acct_on()

The file opened in acct_on and freshly stored in the ns->bacct struct can
be closed in acct_file_reopen by a concurrent call after we release
acct_lock and before we call mntput(file->f_path.mnt).

Record file->f_path.mnt in a local variable and use this variable only.

Signed-off-by: Renaud Lottiaux <renaud.lottiaux@kerlabs.com>
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b290ebe2c46d01b742b948ce03f09e8a3efb9a92 14-Jan-2009 Heiko Carstens <heiko.carstens@de.ibm.com> [CVE-2009-0029] System call wrappers part 04

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
76aac0e9a17742e60d408be1a706e9aaad370891 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Wrap task credential accesses in the core kernel

Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-audit@redhat.com
Cc: containers@lists.linux-foundation.org
Cc: linux-mm@kvack.org
Signed-off-by: James Morris <jmorris@namei.org>
dbda4c0b97b18fd59b3964548361b4f92357f730 13-Oct-2008 Alan Cox <alan@redhat.com> tty: Fix abusers of current->sighand->tty

Various people outside the tty layer still stick their noses in behind the
scenes. We need to make sure they also obey the locking and referencing rules.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0c18d7a5df82524e634637c3aec24d4cba096442 25-Jul-2008 Pavel Emelyanov <xemul@openvz.org> bsdacct: fix and add comments around acct_process()

Fix the one describing what this function is and add one more - about
locking absence around pid namespaces loop.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7d1e13505be8c2bd2207894f4e0f069e1f9b51c9 25-Jul-2008 Pavel Emelyanov <xemul@openvz.org> bsdacct: account dying tasks in all relevant namespaces

This just makes the acct_proces walk the pid namespaces from current up to
the top and account a task in each with the accounting turned on.

ns->parent access if safe lockless, since current it still alive and holds
its namespace, which in turn holds its parent.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b5a7174875ea570cc675f2c503e800db8efdd6a7 25-Jul-2008 Pavel Emelyanov <xemul@openvz.org> bsdacct: turn acct off for all pidns-s on umount time

All the bsd_acct_strcts with opened accounting are linked into a global
list. So, the acct_auto_close(_mnt) walks one and drops the accounting
for each.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0b6b030fc30d169bb406b34b4fc60d99dde4a9c6 25-Jul-2008 Pavel Emelyanov <xemul@openvz.org> bsdacct: switch from global bsd_acct_struct instance to per-pidns one

Allocate the structure on the first call to sys_acct(). After this each
namespace, that ordered the accounting, will live with this structure till
its own death.

Two notes
- routines, that close the accounting on fs umount time use
the init_pid_ns's acct by now;
- accounting routine accounts to dying task's namespace
(also by now).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6248b1b342005a428b1247b4e89249da1528d88d 25-Jul-2008 Pavel Emelyanov <xemul@openvz.org> bsdacct: make internal code work with passed bsd_acct_struct, not global

This adds the appropriate pointer to all the internal (i.e. static)
functions that work with global acct instance. API calls pass a global
instance to them (while we still have such).

Mostly this is a s/acct_globals./acct->/ over the file.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
a75d97976517dcda69150fd81d6be86ae63324a1 25-Jul-2008 Pavel Emelyanov <xemul@openvz.org> bsdacct: turn the acct_lock from on-the-struct to global

Don't use per-bsd-acct-struct lock, but work with a global one.

This lock is taken for short periods, so it doesn't seem it'll become a
bottleneck, but it will allow us to easily avoid many locking difficulties
in the future.

So this is a mostly s/acct_globals.lock/acct_lock/ over the file.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e59a04a7aa5ce2483470aee4f2eb79ba6b9afe8b 25-Jul-2008 Pavel Emelyanov <xemul@openvz.org> bsdacct: make check timer accept a bsd_acct_struct argument

We're going to have many bsd_acct_struct instances, not just one, so the
timer (currently working with a global one) has to know which one to work
with.

Use a handy setup_timer macro for it (thanks to Oleg for one).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1c552858ac2b1732a99d234d46b98098baef41ff 25-Jul-2008 Pavel Emelyanov <xemul@openvz.org> bsdacct: "truthify" a comment near acct_process

The acct_process does not accept any arguments actually.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
081e4c8a75692c21f3a119a81ca3270081879d0e 25-Jul-2008 Pavel Emelyanov <xemul@openvz.org> bsdacct: rename acct_gbls to bsd_acct_struct

After I fixed access to task->tgid in kernel/acct.c, Oleg pointed out some
bad side effects with this accounting vs pid namespaces interaction. I.e.
when some task in pid namespace sets this accounting up, this blocks all
the others from doing the same. Restricting this to init namespace only
could help, but didn't look a graceful solution.

So here is the approach to make this accounting work with pid namespaces
properly.

The idea is simple - when a task dies it accounts itself in each namespace
it is visible from and which set the accounting up.

For example here are the commands run and the output of lastcomm from init
and sub namespaces:

init_ns# accton pacct
sub_ns# accton pacct (this is a different file - sub ns is run in
a chroot-ed environment)
init_ns# cat /dev/null
sub_ns# ls /dev/null
init_ns# accton
sub_ns# accton

sub_ns# lastcomm -f pacct
ls 0 [136,0] 0.00 secs Thu May 15 10:30
accton 0 [136,0] 0.00 secs Thu May 15 10:30

init_ns# lastcomm -f pacct
accton root pts/0 0.00 secs Thu May 15 14:30 << got from sub
cat root pts/1 0.00 secs Thu May 15 14:30
ls root pts/0 0.00 secs Thu May 15 14:30 << got from sub
accton root pts/1 0.00 secs Thu May 15 14:30

That was the summary, the details are in patches.

This patch:

It will be visible in pid_namespace.h file, so fix its name to look better
outside the acct.c file.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5f7b703fe2be40db5a2bf136ac9e44cf5db267cc 24-Mar-2008 Pavel Emelyanov <xemul@openvz.org> bsd_acct: using task_struct->tgid is not right in pid-namespaces

In case we're accounting from a sub-namespace, the tgids reported will not
refer to the right namespace.

Save the pid_namespace we're accounting in on the acct_glbs and use it in
do_acct_process.

Two less :) places using the task_struct.tgid member.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
a846a1954b6397e844fe1e258af7598897ec6159 24-Mar-2008 Pavel Emelyanov <xemul@openvz.org> bsd_acct: plain current->real_parent access is not always safe

This is minor, but dereferencing even current real_parent is not safe on debug
kernels, since the memory, this points to, can be unmapped - RCU protection is
required.

Besides, the tgid field is deprecated and is to be replaced with task_tgid_xxx
call (the 2nd patch), so RCU will be required anyway.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b59f8197c5ddd0d5d74b663650be5449dacd34aa 07-Jan-2008 Roland McGrath <roland@redhat.com> acct: real_parent ppid

The ac_ppid field reported in process accounting records
should match what getppid() would have returned to that
process, regardless of whether a debugger is attached.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bcbe4a076609e15ea84cbebd9cd8f317ed70ce92 26-Nov-2007 Ingo Molnar <mingo@elte.hu> sched: fix kernel/acct.c comment

fix kernel/acct.c comment.

noticed by Lin Tan. Comment suggested by Olaf Kirch.

also see:

http://bugzilla.kernel.org/show_bug.cgi?id=8220

Reported-by: tammy000@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
6ae965cd647d84ab787225fe908cb5bbbc499605 18-Oct-2007 Daniel Walker <dwalker@mvista.com> whitespace fixes: process accounting

Lots of converting spaces to tabs.

Signed-off-by: Daniel Walker <dwalker@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2c6b47de17c75d553de3e2fb426d8298d2074585 25-Jul-2007 john stultz <johnstul@us.ibm.com> Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time().

This avoids use of the kernel-internal "xtime" variable directly outside
of the actual time-related functions. Instead, use the helper functions
that we already have available to us.

This doesn't actually change any behaviour, but this will allow us to
fix the fact that "xtime" isn't updated very often with CONFIG_NO_HZ
(because much of the realtime information is maintained as separate
offsets to 'xtime'), which has caused interfaces that use xtime directly
to get a time that is out of sync with the real-time clock by up to a
third of a second or so.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
f3a43f3f64bff8e205c3702f6b4804d66e306848 08-Dec-2006 Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> [PATCH] kernel: change uses of f_{dentry, vfsmnt} to use f_path

Change all the uses of f_{dentry,vfsmnt} to f_path.{dentry,mnt} in
linux/kernel/.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
7bcfa95e561f11a17720162935e4f704c5d6fda3 08-Dec-2006 Oleg Nesterov <oleg@tv-sign.ru> [PATCH] do_acct_process(): don't take tty_mutex

No need to take the global tty_mutex, signal->tty->driver can't go away while
we are holding ->siglock.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
24ec839c431eb79bb8f6abc00c4e1eb3b8c4d517 08-Dec-2006 Peter Zijlstra <a.p.zijlstra@chello.nl> [PATCH] tty: ->signal->tty locking

Fix the locking of signal->tty.

Use ->sighand->siglock to protect ->signal->tty; this lock is already used
by most other members of ->signal/->sighand. And unless we are 'current'
or the tasklist_lock is held we need ->siglock to access ->signal anyway.

(NOTE: sys_unshare() is broken wrt ->sighand locking rules)

Note that tty_mutex is held over tty destruction, so while holding
tty_mutex any tty pointer remains valid. Otherwise the lifetime of ttys
are governed by their open file handles. This leaves some holes for tty
access from signal->tty (or any other non file related tty access).

It solves the tty SLAB scribbles we were seeing.

(NOTE: the change from group_send_sig_info to __group_send_sig_info needs to
be examined by someone familiar with the security framework, I think
it is safe given the SEND_SIG_PRIV from other __group_send_sig_info
invocations)

[schwidefsky@de.ibm.com: 3270 fix]
[akpm@osdl.org: various post-viro fixes]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Alan Cox <alan@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jan Kara <jack@ucw.cz>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
6cfd76a26d9fe2ba54b9d496a48c1d9285e5c5ed 07-Dec-2006 Peter Zijlstra <a.p.zijlstra@chello.nl> [PATCH] lockdep: name some old style locks

Name some of the remaning 'old_style_spin_init' locks

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
8f0ab5147951267134612570604cf8341901a80c 01-Oct-2006 Jay Lan <jlan@engr.sgi.com> [PATCH] csa: convert CONFIG tag for extended accounting routines

There were a few accounting data/macros that are used in CSA but are #ifdef'ed
inside CONFIG_BSD_PROCESS_ACCT. This patch is to change those ifdef's from
CONFIG_BSD_PROCESS_ACCT to CONFIG_TASK_XACCT. A few defines are moved from
kernel/acct.c and include/linux/acct.h to kernel/tsacct.c and
include/linux/tsacct_kern.h.

Signed-off-by: Jay Lan <jlan@sgi.com>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Chris Sturtivant <csturtiv@sgi.com>
Cc: Tony Ernst <tee@sgi.com>
Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
eb84a20e9e6b98dcb33023ad22241d79107a08a7 29-Sep-2006 Alan Cox <alan@lxorguk.ukuu.org.uk> [PATCH] audit/accounting: tty locking

Add tty locking around the audit and accounting code.

The whole current->signal-> locking is all deeply strange but it's for
someone else to sort out. Add rather than replace the lock for acct.c

Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
a4afee02a5dd4f20c08fca26e9b610e72d0bcbf0 14-Jul-2006 OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> [PATCH] Fix sighand->siglock usage in kernel/acct.c

IRQs must be disabled before taking ->siglock.

Noticed by lockdep.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
6ab3d5624e172c553004ecc862bfeac16d9d68b7 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de> Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
1dbe83c3445a1604546620a60888cf26b63f8782 27-Jun-2006 Randy Dunlap <rdunlap@xenotime.net> [PATCH] fix kernel-doc in kernel/ dir

Fix kernel-doc parameters in kernel/

Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1376): No description found for parameter 'u_abs_timeout'
Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1420): No description found for parameter 'u_msg_prio'
Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1420): No description found for parameter 'u_abs_timeout'
Warning(/var/linsrc/linux-2617-g9//kernel/acct.c:526): No description found for parameter 'pacct'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
7f32a25f63358aa993a3403c047f3ecfa6d96367 27-Jun-2006 Randy Dunlap <rdunlap@xenotime.net> [PATCH] kernel/acct: fix function definition

kernel/acct.c:579:19: warning: non-ANSI function declaration of function 'acct_process'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
77787bfb44da6e6166af088226707aeccee27968 25-Jun-2006 KaiGai Kohei <kaigai@ak.jp.nec.com> [PATCH] pacct: none-delayed process accounting accumulation

In current 2.6.17 implementation, signal_struct refered from task_struct is
used for per-process data structure. The pacct facility also uses it as a
per-process data structure to store stime, utime, minflt, majflt. But those
members are saved in __exit_signal(). It's too late.

For example, if some threads exits at same time, pacct facility has a
possibility to drop accountings for a part of those threads. (see, the
following 'The results of original 2.6.17 kernel') I think accounting
information should be completely collected into the per-process data structure
before writing out an accounting record.

This patch fixes this matter. Accumulation of stime, utime, minflt and majflt
are done before generating accounting record.

[mingo@elte.hu: fix acct_collect() siglock bug found by lockdep]
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
f6ec29a42d7ac3b309a9cef179b686d23986ab98 25-Jun-2006 KaiGai Kohei <kaigai@ak.jp.nec.com> [PATCH] pacct: avoidance to refer the last thread as a representation of the process

When pacct facility generate an 'ac_flag' field in accounting record, it
refers a task_struct of the thread which died last in the process. But any
other task_structs are ignored.

Therefore, pacct facility drops ASU flag even if root-privilege operations are
used by any other threads except the last one. In addition, AFORK flag is
always set when the thread of group-leader didn't die last, although this
process has called execve() after fork().

We have a same matter in ac_exitcode. The recorded ac_exitcode is an exit
code of the last thread in the process. There is a possibility this exitcode
is not the group leader's one.
0e4648141af02331f21aabcd34940c70f09a2d04 25-Jun-2006 KaiGai Kohei <kaigai@ak.jp.nec.com> [PATCH] pacct: add pacct_struct to fix some pacct bugs.

The pacct facility need an i/o operation when an accounting record is
generated. There is a possibility to wake OOM killer up. If OOM killer is
activated, it kills some processes to make them release process memory
regions.

But acct_process() is called in the killed processes context before calling
exit_mm(), so those processes cannot release own memory. In the results, any
processes stop in this point and it finally cause a system stall.
11e64757f9fb32f13f51596bbf01988f42fca764 25-Jun-2006 Matt Helsley <matthltc@us.ibm.com> [PATCH] Remove unecessary NULL check in kernel/acct.c

copy_process() appears to be the only caller of acct_clear_integrals() and
does not pass in NULL task pointers. Remove the unecessary check.

Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
726c334223180e3c0197cc980a432681370d4baf 23-Jun-2006 David Howells <dhowells@redhat.com> [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry

Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch. That reduced the significance of
sb->s_root, allowing NFS to place a fake root there. However, NFS does
require a dentry to use as a target for the statfs operation. This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
bb231fe3a53b2d34c1aef119613816fcb18864b1 31-Mar-2006 KaiGai Kohei <kaigai@ak.jp.nec.com> [PATCH] Fix pacct bug in multithreading case.

I noticed a bug on the process accounting facility. In multi-threading
process, some data would be recorded incorrectly when the group_leader dies
earlier than one or more threads. The attached patch fixes this problem.

See below. 'bugacct' is a test program that create a worker thread after 4
seconds sleeping, then the group_leader dies soon. The worker thread
consume CPU/Memory for 6 seconds, then exit. We can estimate 10 seconds as
etime and 6 seconds as stime + utime. This is a sample program which the
group_leader dies earlier than other threads.

The results of same binary execution on different kernel are below.
-- accounted records --------------------
| btime | utime | stime | etime | minflt | majflt | comm |
original | 13:16:40 | 0.00 | 0.00 | 6.10 | 171 | 0 | bugacct |
patched | 13:20:21 | 5.83 | 0.18 | 10.03 | 32776 | 0 | bugacct |
(*) bugacct allocates 128MB memory, thus 128MB / 4KB = 32768 of minflt is
appropriate.

-- Test results in original kernel ------
$ date; time -p ./bugacct
Tue Mar 28 13:16:36 JST 2006 <- But pacct said btime is 13:16:40
real 10.11 <- But pacct said etime is 6.10
user 5.96 <- But pacct said utime is 0.00
sys 0.14 <- But pacct said stime is 0.00
$
-- Test results in patched kernel -------
$ date; time -p ./bugacct
Tue Mar 28 13:20:21 JST 2006
real 10.04
user 5.83
sys 0.19
$

In the original 2.6.16 kernel, pacct records btime, utime, stime, etime and
minflt incorrectly. In my opinion, this problem is caused by an assumption
that group_leader dies last.

The following section calculates process running time for etime and btime.
But it means running time of the thread that dies last, not process. The
start_time of the first thread in the process (group_leader) should be
reduced from uptime to calculate etime and btime correctly.

---- do_acct_process() in kernel/acct.c:
/* calculate run_time in nsec*/
do_posix_clock_monotonic_gettime(&uptime);
run_time = (u64)uptime.tv_sec*NSEC_PER_SEC + uptime.tv_nsec;
run_time -= (u64)current->start_time.tv_sec*NSEC_PER_SEC
+ current->start_time.tv_nsec;
----

The following section calculates stime and utime of the process.
But it might count the utime and stime of the group_leader duplicatly
and ignore the utime and stime of the thread dies last, when one or
more threads remain after group_leader dead.
The ac_utime should be calculated as the sum of the signal->utime
and utime of the thread dies last. The ac_stime should be done also.

---- do_acct_process() in kernel/acct.c:
jiffies = cputime_to_jiffies(cputime_add(current->group_leader->utime,
current->signal->utime));
ac.ac_utime = encode_comp_t(jiffies_to_AHZ(jiffies));
jiffies = cputime_to_jiffies(cputime_add(current->group_leader->stime,
current->signal->stime));
ac.ac_stime = encode_comp_t(jiffies_to_AHZ(jiffies));
----

The part of the minflt/majflt calculation has same problem.
This patch solves those problems, I think.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
c59ede7b78db329949d9cdcd7064e22d357560ef 11-Jan-2006 Randy.Dunlap <rdunlap@xenotime.net> [PATCH] move capable() to capability.h

- Move capable() from sched.h to capability.h;

- Use <linux/capability.h> where capable() is used
(in include/, block/, ipc/, kernel/, a few drivers/,
mm/, security/, & sound/;
many more drivers/ to go)

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
089545f0c71bab6511395c2a060d7f81a99bad58 06-Jan-2006 Martin Schwidefsky <schwidefsky@de.ibm.com> [PATCH] s390: cputime_t fixes

There are some more places where the use of cputime_t instead of an integer
type and the associated macros is necessary for the virtual cputime accounting
on s390. Affected are the s390 specific appldata code and BSD process
accounting.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
7b7b1ace2d9d06d76bce7481a045c22ed75e35dd 07-Nov-2005 Al Viro <viro@zeniv.linux.org.uk> [PATCH] saner handling of auto_acct_off() and DQUOT_OFF() in umount

The way we currently deal with quota and process accounting that might
keep vfsmount busy at umount time is inherently broken; we try to turn
them off just in case (not quite correctly, at that) and

a) pray umount doesn't fail (otherwise they'll stay turned off)
b) pray nobody doesn anything funny just as we turn quota off

Moreover, LSM provides hooks for doing the same sort of broken logics.

The proper way to deal with that is to introduce the second kind of
reference to vfsmount. Semantics:

- when the last normal reference is dropped, all special ones are
converted to normal ones and if there had been any, cleanup is done.
- normal reference can be cloned into a special one
- special reference can be converted to normal one; that's a no-op if
we'd already passed the point of no return (i.e. mntput() had
converted special references to normal and started cleanup).

The way it works: e.g. starting process accounting converts the vfsmount
reference pinned by the opened file into special one and turns it back
to normal when it gets shut down; acct_auto_close() is done when no
normal references are left. That way it does *not* obstruct umount(2)
and it silently gets turned off when the last normal reference to
vfsmount is gone. Which is exactly what we want...

The same should be done by LSM module that holds some internal
references to vfsmount and wants to shut them down on umount - it should
make them special and security_sb_umount_close() will be called exactly
when the last normal reference to vfsmount is gone.

quota handling is even simpler - we don't use normal file IO anymore, so
there's no need to hold vfsmounts at all. DQUOT_OFF() is done from
deactivate_super(), where it really belongs.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4294621f41a85497019fae64341aa5351a1921b7 30-Oct-2005 Hugh Dickins <hugh@veritas.com> [PATCH] mm: rss = file_rss + anon_rss

I was lazy when we added anon_rss, and chose to change as few places as
possible. So currently each anonymous page has to be counted twice, in rss
and in anon_rss. Which won't be so good if those are atomic counts in some
configurations.

Change that around: keep file_rss and anon_rss separately, and add them
together (with get_mm_rss macro) when the total is needed - reading two
atomics is much cheaper than updating two atomics. And update anon_rss
upfront, typically in memory.c, not tucked away in page_add_anon_rmap.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
417ef531415c070926b071b75fd1c1ac4b6e2f7e 10-Sep-2005 Randy Dunlap <rdunlap@xenotime.net> [PATCH] kernel/acct: add kerneldoc

for kernel/acct.c:
- fix typos
- add kerneldoc for non-static functions

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
6c9c0b52b8c6b68b05bb06efd7079a8fc5e9ba60 07-Sep-2005 Peter Staubach <staubach@redhat.com> [PATCH] largefile support for accounting

There is a problem in the accounting subsystem in the kernel can not
correctly handle files larger than 2GB. The output file containing the
process accounting data can grow very large if the system is large enough
and active enough. If the 2GB limit is reached, then the system simply
stops storing process accounting data.

Another annoying problem is that once the system reaches this 2GB limit,
then every process which exits will receive a signal, SIGXFSZ. This signal
is generated because an attempt was made to write beyond the limit for the
file descriptor. This signal makes it look like every process has exited
due to a signal, when in fact, they have not.

The solution is to add the O_LARGEFILE flag to the list of flags used to
open the accounting file. The rest of the accounting support is already
largefile safe.

The changes were tested by constructing a large file (just short of 2GB),
enabling accounting, and then running enough commands to cause the
accounting data generated to increase the size of the file to 2GB. Without
the changes, the file grows to 2GB and the last command run in the test
script appears to exit due a signal when it has not. With the changes,
things work as expected and quietly.

There are some user level changes required so that it can deal with
largefiles, but those are being handled separately.

Signed-off-by: Peter Staubach <staubach@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!