History log of /kernel/audit_tree.c
Revision Date Author Comments
799b601451b21ebe7af0e6e8f6e2ccd4683c5064 04-Nov-2014 Miklos Szeredi <mszeredi@suse.cz> audit: keep inode pinned

Audit rules disappear when an inode they watch is evicted from the cache.
This is likely not what we want.

The guilty commit is "fsnotify: allow marks to not pin inodes in core",
which didn't take into account that audit_tree adds watches with a zero
mask.

Adding any mask should fix this.

Fixes: 90b1e7a57880 ("fsnotify: allow marks to not pin inodes in core")
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: stable@vger.kernel.org # 2.6.36+
Signed-off-by: Paul Moore <pmoore@redhat.com>
2991dd2b0117e864f394c826af6df144206ce0db 03-Oct-2014 Richard Guy Briggs <rgb@redhat.com> audit: rename audit_log_remove_rule to disambiguate for trees

Rename audit_log_remove_rule() to audit_tree_log_remove_rule() to avoid
confusion with watch and mark rule removal/changes.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
e7df61f4d1ddb7fdd654dde6cd40f7cc398c3932 04-Apr-2014 Burn Alting <burn@swtf.dyndns.org> audit: invalid op= values for rules

Various audit events dealing with adding, removing and updating rules result in
invalid values set for the op keys which result in embedded spaces in op=
values.

The invalid values are
op="add rule" set in kernel/auditfilter.c
op="remove rule" set in kernel/auditfilter.c
op="remove rule" set in kernel/audit_tree.c
op="updated rules" set in kernel/audit_watch.c
op="remove rule" set in kernel/audit_watch.c

Replace the space in the above values with an underscore character ('_').

Coded-by: Burn Alting <burn@swtf.dyndns.org>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
45a22f4c11fef4ecd5c61c0a299cd3f23d77be8e 17-Feb-2014 Jan Kara <jack@suse.cz> inotify: Fix reporting of cookies for inotify events

My rework of handling of notification events (namely commit 7053aee26a35
"fsnotify: do not share events between notification groups") broke
sending of cookies with inotify events. We didn't propagate the value
passed to fsnotify() properly and passed 4 uninitialized bytes to
userspace instead (so it is also an information leak). Sadly I didn't
notice this during my testing because inotify cookies aren't used very
much and LTP inotify tests ignore them.

Fix the problem by passing the cookie value properly.

Fixes: 7053aee26a3548ebaba046ae2e52396ccf56ac6c
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
56b27cf6030dd36c56a5542ab8bfa406d337f083 22-Jan-2014 Jan Kara <jack@suse.cz> fsnotify: remove pointless NULL initializers

We usually rely on the fact that struct members not specified in the
initializer are set to NULL. So do that with fsnotify function pointers
as well.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
83c4c4b0a3aadc1ce7b5b2870ce1fc1f65498da0 22-Jan-2014 Jan Kara <jack@suse.cz> fsnotify: remove .should_send_event callback

After removing event structure creation from the generic layer there is
no reason for separate .should_send_event and .handle_event callbacks.
So just remove the first one.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7053aee26a3548ebaba046ae2e52396ccf56ac6c 22-Jan-2014 Jan Kara <jack@suse.cz> fsnotify: do not share events between notification groups

Currently fsnotify framework creates one event structure for each
notification event and links this event into all interested notification
groups. This is done so that we save memory when several notification
groups are interested in the event. However the need for event
structure shared between inotify & fanotify bloats the event structure
so the result is often higher memory consumption.

Another problem is that fsnotify framework keeps path references with
outstanding events so that fanotify can return open file descriptors
with its events. This has the undesirable effect that filesystem cannot
be unmounted while there are outstanding events - a regression for
inotify compared to a situation before it was converted to fsnotify
framework. For fanotify this problem is hard to avoid and users of
fanotify should kind of expect this behavior when they ask for file
descriptors from notified files.

This patch changes fsnotify and its users to create separate event
structure for each group. This allows for much simpler code (~400 lines
removed by this patch) and also smaller event structures. For example
on 64-bit system original struct fsnotify_event consumes 120 bytes, plus
additional space for file name, additional 24 bytes for second and each
subsequent group linking the event, and additional 32 bytes for each
inotify group for private data. After the conversion inotify event
consumes 48 bytes plus space for file name which is considerably less
memory unless file names are long and there are several groups
interested in the events (both of which are uncommon). Fanotify event
fits in 56 bytes after the conversion (fanotify doesn't care about file
names so its events don't have to have it allocated). A win unless
there are four or more fanotify groups interested in the event.

The conversion also solves the problem with unmount when only inotify is
used as we don't have to grab path references for inotify events.

[hughd@google.com: fanotify: fix corruption preventing startup]
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
736f3203a06eafd0944103775a98584082744c6b 12-Jun-2013 Chen Gang <gang.chen@asianux.com> kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()

audit_add_tree_rule() must set 'rule->tree = NULL;' firstly, to protect
the rule itself freed in kill_rules().

The reason is when it is killed, the 'rule' itself may have already
released, we should not access it. one example: we add a rule to an
inode, just at the same time the other task is deleting this inode.

The work flow for adding a rule:

audit_receive() -> (need audit_cmd_mutex lock)
audit_receive_skb() ->
audit_receive_msg() ->
audit_receive_filter() ->
audit_add_rule() ->
audit_add_tree_rule() -> (need audit_filter_mutex lock)
...
unlock audit_filter_mutex
get_tree()
...
iterate_mounts() -> (iterate all related inodes)
tag_mount() ->
tag_trunk() ->
create_trunk() -> (assume it is 1st rule)
fsnotify_add_mark() ->
fsnotify_add_inode_mark() -> (add mark to inode->i_fsnotify_marks)
...
get_tree(); (each inode will get one)
...
lock audit_filter_mutex

The work flow for deleting an inode:

__destroy_inode() ->
fsnotify_inode_delete() ->
__fsnotify_inode_delete() ->
fsnotify_clear_marks_by_inode() -> (get mark from inode->i_fsnotify_marks)
fsnotify_destroy_mark() ->
fsnotify_destroy_mark_locked() ->
audit_tree_freeing_mark() ->
evict_chunk() ->
...
tree->goner = 1
...
kill_rules() -> (assume current->audit_context == NULL)
call_rcu() -> (rule->tree != NULL)
audit_free_rule_rcu() ->
audit_free_rule()
...
audit_schedule_prune() -> (assume current->audit_context == NULL)
kthread_run() -> (need audit_cmd_mutex and audit_filter_mutex lock)
prune_one() -> (delete it from prue_list)
put_tree(); (match the original get_tree above)

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12b2f117f3bf738c1a00a6f64393f1953a740bd4 30-Apr-2013 Chen Gang <gang.chen@asianux.com> kernel/audit_tree.c: tree will leak memory when failure occurs in audit_trim_trees()

audit_trim_trees() calls get_tree(). If a failure occurs we must call
put_tree().

[akpm@linux-foundation.org: run put_tree() before mutex_lock() for small scalability improvement]
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0644ec0cc8a33fb654e348897ad7684e22a4b5d8 11-Jan-2013 Kees Cook <keescook@chromium.org> audit: catch possible NULL audit buffers

It's possible for audit_log_start() to return NULL. Handle it in the
various callers.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Julien Tinnes <jln@google.com>
Cc: Will Drewry <wad@google.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e2a29943e9a2ee2aa737a77f550f46ba72269db4 14-Jun-2011 Lino Sanfilippo <LinoSanfilippo@gmx.de> fsnotify: pass group to fsnotify_destroy_mark()

In fsnotify_destroy_mark() dont get the group from the passed mark anymore,
but pass the group itself as an additional parameter to the function.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
b3e8692b4dde5cf5fc60e4b95385229a72623182 15-Aug-2012 Miklos Szeredi <mszeredi@suse.cz> audit: clean up refcounting in audit-tree

Drop the initial reference by fsnotify_init_mark early instead of
audit_tree_freeing_mark() at destroy time.

In the cases we destroy the mark before we drop the initial reference we need to
get rid of the get_mark that balances the put_mark in audit_tree_freeing_mark().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
a2140fc0cb0325bb6384e788edd27b9a568714e2 15-Aug-2012 Miklos Szeredi <mszeredi@suse.cz> audit: fix refcounting in audit-tree

Refcounting of fsnotify_mark in audit tree is broken. E.g:

refcount
create_chunk
alloc_chunk 1
fsnotify_add_mark 2

untag_chunk
fsnotify_get_mark 3
fsnotify_destroy_mark
audit_tree_freeing_mark 2
fsnotify_put_mark 1
fsnotify_put_mark 0
via destroy_list
fsnotify_mark_destroy -1

This was reported by various people as triggering Oops when stopping auditd.

We could just remove the put_mark from audit_tree_freeing_mark() but that would
break freeing via inode destruction. So this patch simply omits a put_mark
after calling destroy_mark or adds a get_mark before.

The additional get_mark is necessary where there's no other put_mark after
fsnotify_destroy_mark() since it assumes that the caller is holding a reference
(or the inode is keeping the mark pinned, not the case here AFAICS).

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reported-by: Valentin Avram <aval13@gmail.com>
Reported-by: Peter Moody <pmoody@google.com>
Acked-by: Eric Paris <eparis@redhat.com>
CC: stable@vger.kernel.org
0fe33aae0e94b4097dd433c9399e16e17d638cd8 15-Aug-2012 Miklos Szeredi <mszeredi@suse.cz> audit: don't free_chunk() after fsnotify_add_mark()

Don't do free_chunk() after fsnotify_add_mark(). That one does a delayed unref
via the destroy list and this results in use-after-free.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Eric Paris <eparis@redhat.com>
CC: stable@vger.kernel.org
be34d1a3bc4b6f357a49acb55ae870c81337e4f0 25-Jun-2012 David Howells <dhowells@redhat.com> VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errors

copy_tree() can theoretically fail in a case other than ENOMEM, but always
returns NULL which is interpreted by callers as -ENOMEM. Change it to return
an explicit error.

Also change clone_mnt() for consistency and because union mounts will add new
error cases.

Thanks to Andreas Gruenbacher <agruen@suse.de> for a bug fix.
[AV: folded braino fix by Dan Carpenter]

Original-author: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Valerie Aurora <valerie.aurora@gmail.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3b097c46964b07479855b01056c61540b8cadd50 15-Mar-2011 Lai Jiangshan <laijs@cn.fujitsu.com> audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu()

The rcu callback __put_tree() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(__put_tree).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
25985edcedea6396277003854657b5f3cb31a628 31-Mar-2011 Lucas De Marchi <lucas.demarchi@profusion.mobi> Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
f7a998a9491f2da1d3e44d150aa611d10093da4f 30-Oct-2010 Al Viro <viro@zeniv.linux.org.uk> in untag_chunk() we need to do alloc_chunk() a bit earlier

... while we are not holding spinlocks.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1968f5eed54ce47bde488fd9a450912e4a2d7138 28-Jul-2010 Eric Paris <eparis@redhat.com> fanotify: use both marks when possible

fanotify currently, when given a vfsmount_mark will look up (if it exists)
the corresponding inode mark. This patch drops that lookup and uses the
mark provided.

Signed-off-by: Eric Paris <eparis@redhat.com>
ce8f76fb7320297ccbe7c950fd9a2d727dd6a5a0 28-Jul-2010 Eric Paris <eparis@redhat.com> fsnotify: pass both the vfsmount mark and inode mark

should_send_event() and handle_event() will both need to look up the inode
event if they get a vfsmount event. Lets just pass both at the same time
since we have them both after walking the lists in lockstep.

Signed-off-by: Eric Paris <eparis@redhat.com>
2612abb51b11ffd2d75c472b11178115f5808909 28-Jul-2010 Eric Paris <eparis@redhat.com> fsnotify: cleanup should_send_event

The change to use srcu and walk the object list rather than the global
fsnotify_group list means that should_send_event is no longer needed for a
number of groups and can be simplified for others. Do that.

Signed-off-by: Eric Paris <eparis@redhat.com>
3a9b16b407f10b2a771bcae13fb5791e527d6bcf 28-Jul-2010 Eric Paris <eparis@redhat.com> fsnotify: send fsnotify_mark to groups in event handling functions

With the change of fsnotify to use srcu walking the marks list instead of
walking the global groups list we now know the mark in question. The code can
send the mark to the group's handling functions and the groups won't have to
find those marks themselves.

Signed-off-by: Eric Paris <eparis@redhat.com>
5444e2981c31d0ed7465475e451b8437084337e5 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: split generic and inode specific mark code

currently all marking is done by functions in inode-mark.c. Some of this
is pretty generic and should be instead done in a generic function and we
should only put the inode specific code in inode-mark.c

Signed-off-by: Eric Paris <eparis@redhat.com>
35566087099c3ff8901d65ee98af56347ee66e5a 18-Dec-2009 Andreas Gruenbacher <agruen@suse.de> fsnotify: take inode->i_lock inside fsnotify_find_mark_entry()

All callers to fsnotify_find_mark_entry() except one take and
release inode->i_lock around the call. Take the lock inside
fsnotify_find_mark_entry() instead.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
d07754412f9cdc2f4a99318d5ee81ace6715ea99 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: rename fsnotify_find_mark_entry to fsnotify_find_mark

the _entry portion of fsnotify functions is useless. Drop it.

Signed-off-by: Eric Paris <eparis@redhat.com>
e61ce86737b4d60521e4e71f9892fe4bdcfb688b 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: rename fsnotify_mark_entry to just fsnotify_mark

The name is long and it serves no real purpose. So rename
fsnotify_mark_entry to just fsnotify_mark.

Signed-off-by: Eric Paris <eparis@redhat.com>
2823e04de4f1a49087b58ff2bb8f61361ffd9321 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: put inode specific fields in an fsnotify_mark in a union

The addition of marks on vfs mounts will be simplified if the inode
specific parts of a mark and the vfsmnt specific parts of a mark are
actually in a union so naming can be easy. This patch just implements the
inode struct and the union.

Signed-off-by: Eric Paris <eparis@redhat.com>
3a9fb89f4cd04c23e16397befba92efb5d989b74 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: include vfsmount in should_send_event when appropriate

To ensure that a group will not duplicate events when it receives it based
on the vfsmount and the inode should_send_event test we should distinguish
those two cases. We pass a vfsmount to this function so groups can make
their own determinations.

Signed-off-by: Eric Paris <eparis@redhat.com>
0d2e2a1d00d7d23e5bd9bb0935cde7c3d5835c56 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: drop mask argument from fsnotify_alloc_group

Nothing uses the mask argument to fsnotify_alloc_group. This patch drops
that argument.

Signed-off-by: Eric Paris <eparis@redhat.com>
ffab83402f01555a5fa32efb48a4dd0ce8d12ef5 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: fsnotify_obtain_group should be fsnotify_alloc_group

fsnotify_obtain_group was intended to be able to find an already existing
group. Nothing uses that functionality. This just renames it to
fsnotify_alloc_group so it is clear what it is doing.

Signed-off-by: Eric Paris <eparis@redhat.com>
74be0cc82835aecad332a29896b0f212ba893403 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: remove group_num altogether

The original fsnotify interface has a group-num which was intended to be
able to find a group after it was added. I no longer think this is a
necessary thing to do and so we remove the group_num.

Signed-off-by: Eric Paris <eparis@redhat.com>
8112e2d6a7356e8c3ff1f7f3c86f375ed0305705 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: include data in should_send calls

fanotify is going to need to look at file->private_data to know if an event
should be sent or not. This passes the data (which might be a file,
dentry, inode, or none) to the should_send function calls so fanotify can
get that information when available

Signed-off-by: Eric Paris <eparis@redhat.com>
7b0a04fbfb35650941af87728d4891515b4fc179 18-Dec-2009 Eric Paris <eparis@redhat.com> fsnotify: provide the data type to should_send_event

fanotify is only interested in event types which contain enough information
to open the original file in the context of the fanotify listener. Since
fanotify may not want to send events if that data isn't present we pass
the data type to the should_send_event function call so fanotify can express
its lack of interest.

Signed-off-by: Eric Paris <eparis@redhat.com>
28a3a7eb3b1f3e7d834e19f06e794e429058a4dd 18-Dec-2009 Eric Paris <eparis@redhat.com> audit: reimplement audit_trees using fsnotify rather than inotify

Simply switch audit_trees from using inotify to using fsnotify for it's
inode pinning and disappearing act information.

Signed-off-by: Eric Paris <eparis@redhat.com>
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
1f707137b55764740981d022d29c622832a61880 31-Jan-2010 Al Viro <viro@zeniv.linux.org.uk> new helper: iterate_mounts()

apply function to vfsmounts in set returned by collect_mounts(),
stop if it returns non-zero.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2096f759abcb42200a81d776f597362fd9265024 30-Jan-2010 Al Viro <viro@zeniv.linux.org.uk> New helper: path_is_under(path1, path2)

Analog of is_subdir for vfsmount,dentry pairs, moved from audit_tree.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
b4c30aad39805902cf5b855aa8a8b22d728ad057 19-Dec-2009 Al Viro <viro@ZenIV.linux.org.uk> fix more leaks in audit_tree.c tag_chunk()

Several leaks in audit_tree didn't get caught by commit
318b6d3d7ddbcad3d6867e630711b8a705d873d7, including the leak on normal
exit in case of multiple rules refering to the same chunk.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6f5d51148921c242680a7a1d9913384a30ab3cbe 19-Dec-2009 Al Viro <viro@ZenIV.linux.org.uk> fix braindamage in audit_tree.c untag_chunk()

... aka "Al had badly fscked up when writing that thing and nobody
noticed until Eric had fixed leaks that used to mask the breakage".

The function essentially creates a copy of old array sans one element
and replaces the references to elements of original (they are on cyclic
lists) with those to corresponding elements of new one. After that the
old one is fair game for freeing.

First of all, there's a dumb braino: when we get to list_replace_init we
use indices for wrong arrays - position in new one with the old array
and vice versa.

Another bug is more subtle - termination condition is wrong if the
element to be excluded happens to be the last one. We shouldn't go
until we fill the new array, we should go until we'd finished the old
one. Otherwise the element we are trying to kill will remain on the
cyclic lists...

That crap used to be masked by several leaks, so it was not quite
trivial to hit. Eric had fixed some of those leaks a while ago and the
shit had hit the fan...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
916d75761c971b6e630a26bd4ba472e90ac9a4b9 24-Jun-2009 Al Viro <viro@zeniv.linux.org.uk> Fix rule eviction order for AUDIT_DIR

If syscall removes the root of subtree being watched, we
definitely do not want the rules refering that subtree
to be destroyed without the syscall in question having
a chance to match them.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9d9609851003ebed15957f0f2ce18492739ee124 11-Jun-2009 Eric Paris <eparis@redhat.com> Audit: clean up all op= output to include string quoting

A number of places in the audit system we send an op= followed by a string
that includes spaces. Somehow this works but it's just wrong. This patch
moves all of those that I could find to be quoted.

Example:

Change From: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1
subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op=remove rule
key="number2" list=4 res=0

Change To: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1
subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op="remove rule"
key="number2" list=4 res=0

Signed-off-by: Eric Paris <eparis@redhat.com>
589ff870ed60a9ebdd5ec99ec3f5afe1282fe151 18-Apr-2009 Al Viro <viro@zeniv.linux.org.uk> Switch collect_mounts() to struct path

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
24b6f16ecf37f918a1934d590e9e71c100d6388f 18-Apr-2009 Al Viro <viro@zeniv.linux.org.uk> No need for crossing to mountpoint in audit_tag_tree()

is_under() will DTRT anyway. And yes, is_subdir() behaviour
is intentional.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
318b6d3d7ddbcad3d6867e630711b8a705d873d7 13-Jan-2009 Eric Paris <eparis@redhat.com> audit: incorrect ref counting in audit tree tag_chunk

tag_chunk has bad exit paths in which the inotify ref counting is wrong.
At the top of the function we found &old_watch using inotify_find_watch().
inotify_find_watch takes a reference to the watch. This is never dropped
on an error path.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
5af75d8d58d0f9f7b7c0515b35786b22892d5f12 16-Dec-2008 Al Viro <viro@zeniv.linux.org.uk> audit: validate comparison operations, store them in sane form

Don't store the field->op in the messy (and very inconvenient for e.g.
audit_comparator()) form; translate to dense set of values and do full
validation of userland-submitted value while we are at it.

->audit_init_rule() and ->audit_match_rule() get new values now; in-tree
instances updated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
e45aa212ea81d39b38ba158df344dc3a500153e5 15-Dec-2008 Al Viro <viro@zeniv.linux.org.uk> audit rules ordering, part 2

Fix the actual rule listing; add per-type lists _not_ used for matching,
with all exit,... sitting on one such list. Simplifies "do something
for all rules" logics, while we are at it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8f7b0ba1c853919b85b54774775f567f30006107 15-Nov-2008 Al Viro <viro@ZenIV.linux.org.uk> Fix inotify watch removal/umount races

Inotify watch removals suck violently.

To kick the watch out we need (in this order) inode->inotify_mutex and
ih->mutex. That's fine if we have a hold on inode; however, for all
other cases we need to make damn sure we don't race with umount. We can
*NOT* just grab a reference to a watch - inotify_unmount_inodes() will
happily sail past it and we'll end with reference to inode potentially
outliving its superblock.

Ideally we just want to grab an active reference to superblock if we
can; that will make sure we won't go into inotify_umount_inodes() until
we are done. Cleanup is just deactivate_super().

However, that leaves a messy case - what if we *are* racing with
umount() and active references to superblock can't be acquired anymore?
We can bump ->s_count, grab ->s_umount, which will almost certainly wait
until the superblock is shut down and the watch in question is pining
for fjords. That's fine, but there is a problem - we might have hit the
window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e.
the moment when superblock is past the point of no return and is heading
for shutdown) and the moment when deactivate_super() acquires
->s_umount.

We could just do drop_super() yield() and retry, but that's rather
antisocial and this stuff is luser-triggerable. OTOH, having grabbed
->s_umount and having found that we'd got there first (i.e. that
->s_root is non-NULL) we know that we won't race with
inotify_umount_inodes().

So we could grab a reference to watch and do the rest as above, just
with drop_super() instead of deactivate_super(), right? Wrong. We had
to drop ih->mutex before we could grab ->s_umount. So the watch
could've been gone already.

That still can be dealt with - we need to save watch->wd, do idr_find()
and compare its result with our pointer. If they match, we either have
the damn thing still alive or we'd lost not one but two races at once,
the watch had been killed and a new one got created with the same ->wd
at the same address. That couldn't have happened in inotify_destroy(),
but inotify_rm_wd() could run into that. Still, "new one got created"
is not a problem - we have every right to kill it or leave it alone,
whatever's more convenient.

So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
"grab it and kill it" check. If it's been our original watch, we are
fine, if it's a newcomer - nevermind, just pretend that we'd won the
race and kill the fscker anyway; we are safe since we know that its
superblock won't be going away.

And yes, this is far beyond mere "not very pretty"; so's the entire
concept of inotify to start with.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
98bc993f99e51467057ef699e47fec020f24d233 02-Aug-2008 Al Viro <viro@zeniv.linux.org.uk> [PATCH] get rid of nameidata in audit_tree

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
6793a051fb9311f0f1ab7eafc5a9e69b8a1bd8d4 15-May-2008 Paul E. McKenney <paulmck@linux.vnet.ibm.com> [PATCH] list_for_each_rcu must die: audit

All uses of list_for_each_rcu() can be profitably replaced by the
easier-to-use list_for_each_entry_rcu(). This patch makes this change
for the Audit system, in preparation for removing the list_for_each_rcu()
API entirely. This time with well-formed SOB.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1d957f9bf87da74f420424d16ece005202bbebd3 15-Feb-2008 Jan Blunck <jblunck@suse.de> Introduce path_put()

* Add path_put() functions for releasing a reference to the dentry and
vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()

[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4ac9137858e08a19f29feac4e1f4df7c268b0ba5 15-Feb-2008 Jan Blunck <jblunck@suse.de> Embed a struct path into struct nameidata instead of nd->{dentry,mnt}

This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
<dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
text data bss dec hex filename
5321639 858418 715768 6895825 6938d1 vmlinux

with patch series:
text data bss dec hex filename
5320026 858418 715768 6894212 693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
74c3cbe33bc077ac1159cadfea608b501e100344 22-Jul-2007 Al Viro <viro@zeniv.linux.org.uk> [PATCH] audit: watching subtrees

New kind of audit rule predicates: "object is visible in given subtree".
The part that can be sanely implemented, that is. Limitations:
* if you have hardlink from outside of tree, you'd better watch
it too (or just watch the object itself, obviously)
* if you mount something under a watched tree, tell audit
that new chunk should be added to watched subtrees
* if you umount something in a watched tree and it's still mounted
elsewhere, you will get matches on events happening there. New command
tells audit to recalculate the trees, trimming such sources of false
positives.

Note that it's _not_ about path - if something mounted in several places
(multiple mount, bindings, different namespaces, etc.), the match does
_not_ depend on which one we are using for access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>