History log of /security/selinux/hooks.c
Revision Date Author Comments
e9e500827b871459306974c32a0b6398375ce7d5 18-Sep-2015 Jeff Vander Stoep <jeffv@google.com> selinux: do not check open perm on ftruncate call

Use the ATTR_FILE attribute to distinguish between truncate()
and ftruncate() system calls. The two other cases where
do_truncate is called with a filp (and therefore ATTR_FILE is set)
are for coredump files and for open(O_TRUNC). In both of those cases
the open permission has already been checked during file open and
therefore does not need to be repeated.

Commit 95dbf739313f ("SELinux: check OPEN on truncate calls")
fixed a major issue where domains were allowed to truncate files
without the open permission. However, it introduced a new bug where
a domain with the write permission can no longer ftruncate files
without the open permission, even when they receive an already open
file.

(cherry picked from commit b21800f304392ee5d20f411c37470183cc779f11)

Bug: 22567870
Change-Id: I2525a0e244c8d635b2d0c1f966071edbb365a43a

Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
fe92a28f0f5a50b96e615c904c27469b880dcc76 20-May-2015 Stephen Smalley <sds@tycho.nsa.gov> selinux: enable genfscon labeling for sysfs and pstore files

Support per-file labeling of sysfs and pstore files based on
genfscon policy entries. This is safe because the sysfs
and pstore directory tree cannot be manipulated by userspace,
except to unlink pstore entries.
This provides an alternative method of assigning per-file labeling
to sysfs or pstore files without needing to set the labels from
userspace on each boot. The advantages of this approach are that
the labels are assigned as soon as the dentry is first instantiated
and userspace does not need to walk the sysfs or pstore tree and
set the labels on each boot. The limitations of this approach are
that the labels can only be assigned based on pathname prefix matching.
You can initially assign labels using this mechanism and then change
them at runtime via setxattr if allowed to do so by policy.

Change-Id: If5999785fdc1d24d869b23ae35cd302311e94562
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Suggested-by: Dominick Grift <dac.override@gmail.com>
0374990a32aee718d74431a108559ae67fa217f5 19-May-2015 Stephen Smalley <sds@tycho.nsa.gov> selinux: enable per-file labeling for debugfs files.

upstream commit 6f29997f4a3117169eeabd41dbea4c1bd94a739c

Add support for per-file labeling of debugfs files so that
we can distinguish them in policy. This is particularly
important in Android where certain debugfs files have to be writable
by apps and therefore the debugfs directory tree can be read and
searched by all.

Since debugfs is entirely kernel-generated, the directory tree is
immutable by userspace, and the inodes are pinned in memory, we can
simply use the same approach as with proc and label the inodes from
policy based on pathname from the root of the debugfs filesystem.
Generalize the existing labeling support used for proc and reuse it
for debugfs too.

Change-Id: I6460fbed6bb6bd36eb8554ac8c4fdd574edf3b07
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
ba733f9857b966459316d0cd33b8da2e22f62d7d 08-Apr-2015 Jeff Vander Stoep <jeffv@google.com> SELinux: per-command whitelisting of ioctls

Extend the generic ioctl permission check with support for per-command
filtering. Source/target/class sets including the ioctl permission may
additionally include a set of commands. Example:

allow <source> <target>:<class> { 0x8910-0x8926 0x892A-0x8935 }
auditallow <source> <target>:<class> 0x892A

When ioctl commands are omitted only the permissions are checked. This
feature is intended to provide finer granularity for the ioctl
permission which may be too imprecise in some circumstances. For
example, the same driver may use ioctls to provide important and
benign functionality such as driver version or socket type as well as
dangerous capabilities such as debugging features, read/write/execute
to physical memory or access to sensitive data. Per-command filtering
provides a mechanism to reduce the attack surface of the kernel, and
limit applications to the subset of commands required.

The format of the policy binary has been modified to include ioctl
commands, and the policy version number has been incremented to
POLICYDB_VERSION_IOCTL_OPERATIONS=30 to account for the format change.

Bug: 18087110
Change-Id: Ibf0e36728f6f3f0d5af56ccdeddee40800af689d
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
60ad76122e118695168472edf5b314702c782e94 21-Jan-2015 Stephen Smalley <sds@tycho.nsa.gov> selinux: Remove obsolete selinux_audit_data initialization.

Commit 899838b25f063a94594b1df6e0100aea1ec57fac eliminated the need
to initialize selinux_audit_data except in the slow path, when it is
handled by slow_avc_audit(). That commit removed all other initializations
of selinux_audit_data but this one remained since the binder security
hooks are not yet upstream (posted them to linux-kernel today).

Change-Id: I735e4500cde23275686cb3208068cbf8dd7bccd7
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
96f8bcf4a3984181282d099d2f477a0139157270 09-Jan-2015 Mark Salyzyn <salyzyn@google.com> pstore: selinux: add security in-core xattr support for pstore and debugfs

- add "pstore" and "debugfs" to list of in-core exceptions
- change fstype checks to boolean equation
- change from strncmp to strcmp for checking

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Bug: 18917345
Bug: 18935184
Change-Id: Ib648f30ce4b5d6c96f11465836d6fee89bec1c72
1eb355df1565be4dfe5200d255ab063bfbbb0777 23-Jul-2013 Stephen Smalley <sds@tycho.nsa.gov> SELinux: Enable setting security contexts on rootfs inodes.

rootfs (ramfs) can support setting of security contexts
by userspace due to the vfs fallback behavior of calling
the security module to set the in-core inode state
for security.* attributes when the filesystem does not
provide an xattr handler. No xattr handler required
as the inodes are pinned in memory and have no backing
store.

This is useful in allowing early userspace to label individual
files within a rootfs while still providing a policy-defined
default via genfs.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
111bba77bee2fc36120766664c28d7c3e04d21bc 14-Mar-2013 John Stultz <john.stultz@linaro.org> selinux: binder: Fix COMMON_AUDIT_DATA_INIT compile issue

The COMMON_AUDIT_DATA_INIT macros have been removed, and are
now replaced with open coded ad.type initialization.

Thus, this patch updates the selinux_binder_transfer_file function
so it builds.

Change-Id: Ide41069a87638e294899768d09302f4013794e4c
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
1058a5ce897324d57052c132127bf4b0ee2419ce 05-Nov-2012 Stephen Smalley <sds@tycho.nsa.gov> Add security hooks to binder and implement the hooks for SELinux.

Add security hooks to the binder and implement the hooks for SELinux.
The security hooks enable security modules such as SELinux to implement
controls over binder IPC. The security hooks include support for
controlling what process can become the binder context manager
(binder_set_context_mgr), controlling the ability of a process
to invoke a binder transaction/IPC to another process (binder_transaction),
controlling the ability a process to transfer a binder reference to
another process (binder_transfer_binder), and controlling the ability
of a process to transfer an open file to another process (binder_transfer_file).

This support is used by SE Android, http://selinuxproject.org/page/SEAndroid.

Change-Id: I9a64a87825df2e60b9c51400377af4a9cd1c4049
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
d950f84c1c6658faec2ecbf5b09f7e7191953394 12-Nov-2014 Richard Guy Briggs <rgb@redhat.com> selinux: convert WARN_ONCE() to printk() in selinux_nlmsg_perm()

Convert WARN_ONCE() to printk() in selinux_nlmsg_perm().

After conversion from audit_log() in commit e173fb26, WARN_ONCE() was
deemed too alarmist, so switch it to printk().

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: Changed to printk(WARNING) so we catch all of the different
invalid netlink messages. In Richard's defense, he brought this
point up earlier, but I didn't understand his point at the time.]
Signed-off-by: Paul Moore <pmoore@redhat.com>
923190d32de4428afbea5e5773be86bea60a9925 06-Oct-2014 Stephen Smalley <sds@tycho.nsa.gov> selinux: fix inode security list corruption

sb_finish_set_opts() can race with inode_free_security()
when initializing inode security structures for inodes
created prior to initial policy load or by the filesystem
during ->mount(). This appears to have always been
a possible race, but commit 3dc91d4 ("SELinux: Fix possible
NULL pointer dereference in selinux_inode_permission()")
made it more evident by immediately reusing the unioned
list/rcu element of the inode security structure for call_rcu()
upon an inode_free_security(). But the underlying issue
was already present before that commit as a possible use-after-free
of isec.

Shivnandan Kumar reported the list corruption and proposed
a patch to split the list and rcu elements out of the union
as separate fields of the inode_security_struct so that setting
the rcu element would not affect the list element. However,
this would merely hide the issue and not truly fix the code.

This patch instead moves up the deletion of the list entry
prior to dropping the sbsec->isec_lock initially. Then,
if the inode is dropped subsequently, there will be no further
references to the isec.

Reported-by: Shivnandan Kumar <shivnandan.k@samsung.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
e173fb2646a832b424c80904c306b816760ce477 19-Sep-2014 Richard Guy Briggs <rgb@redhat.com> selinux: cleanup error reporting in selinux_nlmsg_perm()

Convert audit_log() call to WARN_ONCE().

Rename "type=" to nlmsg_type=" to avoid confusion with the audit record
type.

Added "protocol=" to help track down which protocol (NETLINK_AUDIT?) was used
within the netlink protocol family.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[Rewrote the patch subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
cbe0d6e8794f1da6cac1ea3864d2cfaf0bf87c8e 10-Sep-2014 Paul Moore <pmoore@redhat.com> selinux: make the netif cache namespace aware

While SELinux largely ignores namespaces, for good reason, there are
some places where it needs to at least be aware of namespaces in order
to function correctly. Network namespaces are one example. Basic
awareness of network namespaces are necessary in order to match a
network interface's index number to an actual network device.

This patch corrects a problem with network interfaces added to a
non-init namespace, and can be reproduced with the following commands:

[NOTE: the NetLabel configuration is here only to active the dynamic
networking controls ]

# netlabelctl unlbl add default address:0.0.0.0/0 \
label:system_u:object_r:unlabeled_t:s0
# netlabelctl unlbl add default address:::/0 \
label:system_u:object_r:unlabeled_t:s0
# netlabelctl cipsov4 add pass doi:100 tags:1
# netlabelctl map add domain:lspp_test_netlabel_t \
protocol:cipsov4,100

# ip link add type veth
# ip netns add myns
# ip link set veth1 netns myns
# ip a add dev veth0 10.250.13.100/24
# ip netns exec myns ip a add dev veth1 10.250.13.101/24
# ip l set veth0 up
# ip netns exec myns ip l set veth1 up

# ping -c 1 10.250.13.101
# ip netns exec myns ping -c 1 10.250.13.100

Reported-by: Jiri Jaburek <jjaburek@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
e0b93eddfe17dcb7d644eb5d6ad02a86fc41a977 22-Aug-2014 Jeff Layton <jlayton@primarydata.com> security: make security_file_set_fowner, f_setown and __f_setown void return

security_file_set_fowner always returns 0, so make it f_setown and
__f_setown void return functions and fix up the error handling in the
callers.

Cc: linux-security-module@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
25db6bea1ff5a78ef493eefdcbb9c1d27134e560 03-Sep-2014 Jiri Pirko <jiri@resnulli.us> selinux: register nf hooks with single nf_register_hooks call

Push ipv4 and ipv6 nf hooks into single array and register/unregister
them via single call.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Paul Moore <pmoore@redhat.com>
a7a91a1928fe69cc98814cb746d5171ae14d757e 03-Sep-2014 Paul Moore <pmoore@redhat.com> selinux: fix a problem with IPv6 traffic denials in selinux_ip_postroute()

A previous commit c0828e50485932b7e019df377a6b0a8d1ebd3080 ("selinux:
process labeled IPsec TCP SYN-ACK packets properly in
selinux_ip_postroute()") mistakenly left out a 'break' from a switch
statement which caused problems with IPv6 traffic.

Thanks to Florian Westphal for reporting and debugging the issue.

Reported-by: Florian Westphal <fwestpha@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
7b0d0b40cd78cadb525df760ee4cac151533c2b5 04-Aug-2014 Stephen Smalley <sds@tycho.nsa.gov> selinux: Permit bounded transitions under NO_NEW_PRIVS or NOSUID.

If the callee SID is bounded by the caller SID, then allowing
the transition to occur poses no risk of privilege escalation and we can
therefore safely allow the transition to occur. Add this exemption
for both the case where a transition was explicitly requested by the
application and the case where an automatic transition is defined in
policy.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2873ead7e46694910ac49c3a8ee0f54956f96e0c 28-Jul-2014 Paul Moore <pmoore@redhat.com> Revert "selinux: fix the default socket labeling in sock_graft()"

This reverts commit 4da6daf4d3df5a977e4623963f141a627fd2efce.

Unfortunately, the commit in question caused problems with Bluetooth
devices, specifically it caused them to get caught in the newly
created BUG_ON() check. The AF_ALG problem still exists, but will be
addressed in a future patch.

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
4da6daf4d3df5a977e4623963f141a627fd2efce 10-Jul-2014 Paul Moore <pmoore@redhat.com> selinux: fix the default socket labeling in sock_graft()

The sock_graft() hook has special handling for AF_INET, AF_INET, and
AF_UNIX sockets as those address families have special hooks which
label the sock before it is attached its associated socket.
Unfortunately, the sock_graft() hook was missing a default approach
to labeling sockets which meant that any other address family which
made use of connections or the accept() syscall would find the
returned socket to be in an "unlabeled" state. This was recently
demonstrated by the kcrypto/AF_ALG subsystem and the newly released
cryptsetup package (cryptsetup v1.6.5 and later).

This patch preserves the special handling in selinux_sock_graft(),
but adds a default behavior - setting the sock's label equal to the
associated socket - which resolves the problem with AF_ALG and
presumably any other address family which makes use of accept().

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Milan Broz <gmazyland@gmail.com>
615e51fdda6f274e94b1e905fcaf6111e0d9aa20 26-Jun-2014 Paul Moore <pmoore@redhat.com> selinux: reduce the number of calls to synchronize_net() when flushing caches

When flushing the AVC, such as during a policy load, the various
network caches are also flushed, with each making a call to
synchronize_net() which has shown to be expensive in some cases.
This patch consolidates the network cache flushes into a single AVC
callback which only calls synchronize_net() once for each AVC cache
flush.

Reported-by: Jaejyn Shin <flagon22bass@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
5b589d44fad18228f18749360d008d5c8ff3aaf8 15-May-2014 Paul Moore <pmoore@redhat.com> selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES

We presently prevent processes from using setexecon() to set the
security label of exec()'d processes when NO_NEW_PRIVS is enabled by
returning an error; however, we silently ignore setexeccon() when
exec()'ing from a nosuid mounted filesystem. This patch makes things
a bit more consistent by returning an error in the setexeccon()/nosuid
case.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
ca7786a2f916540931d7114d441efa141c99c898 29-Apr-2014 Stephen Smalley <sds@tycho.nsa.gov> selinux: Report permissive mode in avc: denied messages.

We cannot presently tell from an avc: denied message whether access was in
fact denied or was allowed due to global or per-domain permissive mode.
Add a permissive= field to the avc message to reflect this information.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
4f189988a0a5890db597ec48fc0e8b09922f290a 15-May-2014 Paul Moore <pmoore@redhat.com> selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES

We presently prevent processes from using setexecon() to set the
security label of exec()'d processes when NO_NEW_PRIVS is enabled by
returning an error; however, we silently ignore setexeccon() when
exec()'ing from a nosuid mounted filesystem. This patch makes things
a bit more consistent by returning an error in the setexeccon()/nosuid
case.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
626b9740fa73cad043e136bfb3b6fca68a4f8a7c 29-Apr-2014 Stephen Smalley <sds@tycho.nsa.gov> selinux: Report permissive mode in avc: denied messages.

We cannot presently tell from an avc: denied message whether access was in
fact denied or was allowed due to global or per-domain permissive mode.
Add a permissive= field to the avc message to reflect this information.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
0d3f7a2dd2f5cf9642982515e020c1aee2cf7af6 22-Apr-2014 Jeff Layton <jlayton@redhat.com> locks: rename file-private locks to "open file description locks"

File-private locks have been merged into Linux for v3.15, and *now*
people are commenting that the name and macro definitions for the new
file-private locks suck.

...and I can't even disagree. The names and command macros do suck.

We're going to have to live with these for a long time, so it's
important that we be happy with the names before we're stuck with them.
The consensus on the lists so far is that they should be rechristened as
"open file description locks".

The name isn't a big deal for the kernel, but the command macros are not
visually distinct enough from the traditional POSIX lock macros. The
glibc and documentation folks are recommending that we change them to
look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a
programmer will typo one of the commands wrong, and also makes it easier
to spot this difference when reading code.

This patch makes the following changes that I think are necessary before
v3.15 ships:

1) rename the command macros to their new names. These end up in the uapi
headers and so are part of the external-facing API. It turns out that
glibc doesn't actually use the fcntl.h uapi header, but it's hard to
be sure that something else won't. Changing it now is safest.

2) make the the /proc/locks output display these as type "OFDLCK"

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Carlos O'Donell <carlos@redhat.com>
Cc: Stefan Metzmacher <metze@samba.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Frank Filz <ffilzlnx@mindspring.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
5d50ffd7c31dab47c6b828841ca1ec70a1b40169 03-Feb-2014 Jeff Layton <jlayton@redhat.com> locks: add new fcntl cmd values for handling file private locks

Due to some unfortunate history, POSIX locks have very strange and
unhelpful semantics. The thing that usually catches people by surprise
is that they are dropped whenever the process closes any file descriptor
associated with the inode.

This is extremely problematic for people developing file servers that
need to implement byte-range locks. Developers often need a "lock
management" facility to ensure that file descriptors are not closed
until all of the locks associated with the inode are finished.

Additionally, "classic" POSIX locks are owned by the process. Locks
taken between threads within the same process won't conflict with one
another, which renders them useless for synchronization between threads.

This patchset adds a new type of lock that attempts to address these
issues. These locks conflict with classic POSIX read/write locks, but
have semantics that are more like BSD locks with respect to inheritance
and behavior on close.

This is implemented primarily by changing how fl_owner field is set for
these locks. Instead of having them owned by the files_struct of the
process, they are instead owned by the filp on which they were acquired.
Thus, they are inherited across fork() and are only released when the
last reference to a filp is put.

These new semantics prevent them from being merged with classic POSIX
locks, even if they are acquired by the same process. These locks will
also conflict with classic POSIX locks even if they are acquired by
the same process or on the same file descriptor.

The new locks are managed using a new set of cmd values to the fcntl()
syscall. The initial implementation of this converts these values to
"classic" cmd values at a fairly high level, and the details are not
exposed to the underlying filesystem. We may eventually want to push
this handing out to the lower filesystem code but for now I don't
see any need for it.

Also, note that with this implementation the new cmd values are only
available via fcntl64() on 32-bit arches. There's little need to
add support for legacy apps on a new interface like this.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
f64410ec665479d7b4b77b7519e814253ed0f686 19-Mar-2014 Paul Moore <pmoore@redhat.com> selinux: correctly label /proc inodes in use before the policy is loaded

This patch is based on an earlier patch by Eric Paris, he describes
the problem below:

"If an inode is accessed before policy load it will get placed on a
list of inodes to be initialized after policy load. After policy
load we call inode_doinit() which calls inode_doinit_with_dentry()
on all inodes accessed before policy load. In the case of inodes
in procfs that means we'll end up at the bottom where it does:

/* Default to the fs superblock SID. */
isec->sid = sbsec->sid;

if ((sbsec->flags & SE_SBPROC) && !S_ISLNK(inode->i_mode)) {
if (opt_dentry) {
isec->sclass = inode_mode_to_security_class(...)
rc = selinux_proc_get_sid(opt_dentry,
isec->sclass,
&sid);
if (rc)
goto out_unlock;
isec->sid = sid;
}
}

Since opt_dentry is null, we'll never call selinux_proc_get_sid()
and will leave the inode labeled with the label on the superblock.
I believe a fix would be to mimic the behavior of xattrs. Look
for an alias of the inode. If it can't be found, just leave the
inode uninitialized (and pick it up later) if it can be found, we
should be able to call selinux_proc_get_sid() ..."

On a system exhibiting this problem, you will notice a lot of files in
/proc with the generic "proc_t" type (at least the ones that were
accessed early in the boot), for example:

# ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
system_u:object_r:proc_t:s0 /proc/sys/kernel/shmmax

However, with this patch in place we see the expected result:

# ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
system_u:object_r:sysctl_kernel_t:s0 /proc/sys/kernel/shmmax

Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
98883bfd9d603a2760f6d53eccfaa3ae2c053e72 19-Mar-2014 Paul Moore <pmoore@redhat.com> selinux: put the mmap() DAC controls before the MAC controls

It turns out that doing the SELinux MAC checks for mmap() before the
DAC checks was causing users and the SELinux policy folks headaches
as users were seeing a lot of SELinux AVC denials for the
memprotect:mmap_zero permission that would have also been denied by
the normal DAC capability checks (CAP_SYS_RAWIO).

Example:

# cat mmap_test.c
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
int rc;
void *mem;

mem = mmap(0x0, 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
if (mem == MAP_FAILED)
return errno;
printf("mem = %p\n", mem);
munmap(mem, 4096);

return 0;
}
# gcc -g -O0 -o mmap_test mmap_test.c
# ./mmap_test
mem = (nil)
# ausearch -m AVC | grep mmap_zero
type=AVC msg=audit(...): avc: denied { mmap_zero }
for pid=1025 comm="mmap_test"
scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
tclass=memprotect

This patch corrects things so that when the above example is run by a
user without CAP_SYS_RAWIO the SELinux AVC is no longer generated as
the DAC capability check fails before the SELinux permission check.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
f5895943d91b41b0368830cdb6eaffb8eda0f4c8 14-Mar-2014 David Howells <dhowells@redhat.com> KEYS: Move the flags representing required permission to linux/key.h

Move the flags representing required permission to linux/key.h as the perm
parameter of security_key_permission() is in terms of them - and not the
permissions mask flags used in key->perm.

Whilst we're at it:

(1) Rename them to be KEY_NEED_xxx rather than KEY_xxx to avoid collisions
with symbols in uapi/linux/input.h.

(2) Don't use key_perm_t for a mask of required permissions, but rather limit
it to the permissions mask attached to the key and arguments related
directly to that.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
52a4c6404f91f2d2c5592ee6365a8418c4565f53 07-Mar-2014 Nikolay Aleksandrov <nikolay@redhat.com> selinux: add gfp argument to security_xfrm_policy_alloc and fix callers

security_xfrm_policy_alloc can be called in atomic context so the
allocation should be done with GFP_ATOMIC. Add an argument to let the
callers choose the appropriate way. In order to do so a gfp argument
needs to be added to the method xfrm_policy_alloc_security in struct
security_operations and to the internal function
selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic
callers and leave GFP_KERNEL as before for the rest.
The path that needed the gfp argument addition is:
security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security ->
all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) ->
selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only)

Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also
add it to security_context_to_sid which is used inside and prior to this
patch did only GFP_KERNEL allocation. So add gfp argument to
security_context_to_sid and adjust all of its callers as well.

CC: Paul Moore <paul@paul-moore.com>
CC: Dave Jones <davej@redhat.com>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Fan Du <fan.du@windriver.com>
CC: David S. Miller <davem@davemloft.net>
CC: LSM list <linux-security-module@vger.kernel.org>
CC: SELinux list <selinux@tycho.nsa.gov>

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
eee3094683fbc7fe6bcdaef58c1ef31f8460cdca 05-Mar-2014 Paul Moore <pmoore@redhat.com> selinux: correctly label /proc inodes in use before the policy is loaded

This patch is based on an earlier patch by Eric Paris, he describes
the problem below:

"If an inode is accessed before policy load it will get placed on a
list of inodes to be initialized after policy load. After policy
load we call inode_doinit() which calls inode_doinit_with_dentry()
on all inodes accessed before policy load. In the case of inodes
in procfs that means we'll end up at the bottom where it does:

/* Default to the fs superblock SID. */
isec->sid = sbsec->sid;

if ((sbsec->flags & SE_SBPROC) && !S_ISLNK(inode->i_mode)) {
if (opt_dentry) {
isec->sclass = inode_mode_to_security_class(...)
rc = selinux_proc_get_sid(opt_dentry,
isec->sclass,
&sid);
if (rc)
goto out_unlock;
isec->sid = sid;
}
}

Since opt_dentry is null, we'll never call selinux_proc_get_sid()
and will leave the inode labeled with the label on the superblock.
I believe a fix would be to mimic the behavior of xattrs. Look
for an alias of the inode. If it can't be found, just leave the
inode uninitialized (and pick it up later) if it can be found, we
should be able to call selinux_proc_get_sid() ..."

On a system exhibiting this problem, you will notice a lot of files in
/proc with the generic "proc_t" type (at least the ones that were
accessed early in the boot), for example:

# ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
system_u:object_r:proc_t:s0 /proc/sys/kernel/shmmax

However, with this patch in place we see the expected result:

# ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }'
system_u:object_r:sysctl_kernel_t:s0 /proc/sys/kernel/shmmax

Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
0909c0ae999c325b9d34c6f4710f40730ae3bc24 28-Feb-2014 Paul Moore <pmoore@redhat.com> selinux: put the mmap() DAC controls before the MAC controls

It turns out that doing the SELinux MAC checks for mmap() before the
DAC checks was causing users and the SELinux policy folks headaches
as users were seeing a lot of SELinux AVC denials for the
memprotect:mmap_zero permission that would have also been denied by
the normal DAC capability checks (CAP_SYS_RAWIO).

Example:

# cat mmap_test.c
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
int rc;
void *mem;

mem = mmap(0x0, 4096,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
if (mem == MAP_FAILED)
return errno;
printf("mem = %p\n", mem);
munmap(mem, 4096);

return 0;
}
# gcc -g -O0 -o mmap_test mmap_test.c
# ./mmap_test
mem = (nil)
# ausearch -m AVC | grep mmap_zero
type=AVC msg=audit(...): avc: denied { mmap_zero }
for pid=1025 comm="mmap_test"
scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
tclass=memprotect

This patch corrects things so that when the above example is run by a
user without CAP_SYS_RAWIO the SELinux AVC is no longer generated as
the DAC capability check fails before the SELinux permission check.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
29707b206c5171ac6583a4d1e9ec3af937e8c2e4 05-Feb-2014 Jingoo Han <jg1.han@samsung.com> security: replace strict_strto*() with kstrto*()

The usage of strict_strto*() is not preferred, because
strict_strto*() is obsolete. Thus, kstrto*() should be
used.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
3dc91d4338d698ce77832985f9cb183d8eeaf6be 10-Jan-2014 Steven Rostedt <rostedt@goodmis.org> SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

While running stress tests on adding and deleting ftrace instances I hit
this bug:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: selinux_inode_permission+0x85/0x160
PGD 63681067 PUD 7ddbe067 PMD 0
Oops: 0000 [#1] PREEMPT
CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
Call Trace:
security_inode_permission+0x1c/0x30
__inode_permission+0x41/0xa0
inode_permission+0x18/0x50
link_path_walk+0x66/0x920
path_openat+0xa6/0x6c0
do_filp_open+0x43/0xa0
do_sys_open+0x146/0x240
SyS_open+0x1e/0x20
system_call_fastpath+0x16/0x1b
Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
RIP selinux_inode_permission+0x85/0x160
CR2: 0000000000000020

Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.

in selinux_inode_permission():

isec = inode->i_security;

rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);

Note, the crash came from stressing the deletion and reading of debugfs
files. I was not able to recreate this via normal files. But I'm not
sure they are safe. It may just be that the race window is much harder
to hit.

What seems to have happened (and what I have traced), is the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor is
the inodes ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().

The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct. Now if the freeing of
the i_security happens with a call_rcu(), and the i_security field of
the inode is not changed (it gets freed as the inode gets freed) then
there will be no issue here. (Linus Torvalds suggested not setting the
field to NULL such that we do not need to check if it is NULL in the
permission check).

Note, this is a hack, but it fixes the problem at hand. A real fix is
to restructure the destroy_inode() to call all the destructor handlers
from the RCU callback. But that is a major job to do, and requires a
lot of work. For now, we just band-aid this bug with this fix (it
works), and work on a more maintainable solution in the future.

Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
c0c1439541f5305b57a83d599af32b74182933fe 23-Dec-2013 Oleg Nesterov <oleg@redhat.com> selinux: selinux_setprocattr()->ptrace_parent() needs rcu_read_lock()

selinux_setprocattr() does ptrace_parent(p) under task_lock(p),
but task_struct->alloc_lock doesn't pin ->parent or ->ptrace,
this looks confusing and triggers the "suspicious RCU usage"
warning because ptrace_parent() does rcu_dereference_check().

And in theory this is wrong, spin_lock()->preempt_disable()
doesn't necessarily imply rcu_read_lock() we need to access
the ->parent.

Reported-by: Evan McNabb <emcnabb@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
46d01d63221c3508421dd72ff9c879f61053cffc 23-Dec-2013 Chad Hanson <chanson@trustedcs.com> selinux: fix broken peer recv check

Fix a broken networking check. Return an error if peer recv fails. If
secmark is active and the packet recv succeeds the peer recv error is
ignored.

Signed-off-by: Chad Hanson <chanson@trustedcs.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
465954cd649a7d8cd331695bd24a16bcb5c4c716 14-Dec-2013 Oleg Nesterov <oleg@redhat.com> selinux: selinux_setprocattr()->ptrace_parent() needs rcu_read_lock()

selinux_setprocattr() does ptrace_parent(p) under task_lock(p),
but task_struct->alloc_lock doesn't pin ->parent or ->ptrace,
this looks confusing and triggers the "suspicious RCU usage"
warning because ptrace_parent() does rcu_dereference_check().

And in theory this is wrong, spin_lock()->preempt_disable()
doesn't necessarily imply rcu_read_lock() we need to access
the ->parent.

Reported-by: Evan McNabb <emcnabb@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
a5e333d34037c64c5f667dee3c418b66874ba0b0 16-Dec-2013 Wei Yongjun <yongjun_wei@trendmicro.com.cn> SELinux: remove duplicated include from hooks.c

Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Paul Moore <pmoore@redhat.com>
29b1deb2a48a9dd02b93597aa4c055a24c0e989f 15-Dec-2013 Linus Torvalds <torvalds@linux-foundation.org> Revert "selinux: consider filesystem subtype in policies"

This reverts commit 102aefdda4d8275ce7d7100bc16c88c74272b260.

Tom London reports that it causes sync() to hang on Fedora rawhide:

https://bugzilla.redhat.com/show_bug.cgi?id=1033965

and Josh Boyer bisected it down to this commit. Reverting the commit in
the rawhide kernel fixes the problem.

Eric Paris root-caused it to incorrect subtype matching in that commit
breaking fuse, and has a tentative patch, but by now we're better off
retrying this in 3.14 rather than playing with it any more.

Reported-by: Tom London <selinux@gmail.com>
Bisected-by: Josh Boyer <jwboyer@fedoraproject.org>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Anand Avati <avati@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4d546f81717d253ab67643bf072c6d8821a9249c 13-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: revert 102aefdda4d8275ce7d7100bc16c88c74272b260

Revert "selinux: consider filesystem subtype in policies"

This reverts commit 102aefdda4d8275ce7d7100bc16c88c74272b260.

Explanation from Eric Paris:

SELinux policy can specify if it should use a filesystem's
xattrs or not. In current policy we have a specification that
fuse should not use xattrs but fuse.glusterfs should use
xattrs. This patch has a bug in which non-glusterfs
filesystems would match the rule saying fuse.glusterfs should
use xattrs. If both fuse and the particular filesystem in
question are not written to handle xattr calls during the mount
command, they will deadlock.

I have fixed the bug to do proper matching, however I believe a
revert is still the correct solution. The reason I believe
that is because the code still does not work. The s_subtype is
not set until after the SELinux hook which attempts to match on
the ".gluster" portion of the rule. So we cannot match on the
rule in question. The code is useless.

Signed-off-by: Paul Moore <pmoore@redhat.com>
c0828e50485932b7e019df377a6b0a8d1ebd3080 10-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()

Due to difficulty in arriving at the proper security label for
TCP SYN-ACK packets in selinux_ip_postroute(), we need to check packets
while/before they are undergoing XFRM transforms instead of waiting
until afterwards so that we can determine the correct security label.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
817eff718dca4e54d5721211ddde0914428fbb7c 10-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: look for IPsec labels on both inbound and outbound packets

Previously selinux_skb_peerlbl_sid() would only check for labeled
IPsec security labels on inbound packets, this patch enables it to
check both inbound and outbound traffic for labeled IPsec security
labels.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
446b802437f285de68ffb8d6fac3c44c3cab5b04 04-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute()

In selinux_ip_postroute() we perform access checks based on the
packet's security label. For locally generated traffic we get the
packet's security label from the associated socket; this works in all
cases except for TCP SYN-ACK packets. In the case of SYN-ACK packet's
the correct security label is stored in the connection's request_sock,
not the server's socket. Unfortunately, at the point in time when
selinux_ip_postroute() is called we can't query the request_sock
directly, we need to recreate the label using the same logic that
originally labeled the associated request_sock.

See the inline comments for more explanation.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
47180068276a04ed31d24fe04c673138208b07a9 04-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output()

In selinux_ip_output() we always label packets based on the parent
socket. While this approach works in almost all cases, it doesn't
work in the case of TCP SYN-ACK packets when the correct label is not
the label of the parent socket, but rather the label of the larval
socket represented by the request_sock struct.

Unfortunately, since the request_sock isn't queued on the parent
socket until *after* the SYN-ACK packet is sent, we can't lookup the
request_sock to determine the correct label for the packet; at this
point in time the best we can do is simply pass/NF_ACCEPT the packet.
It must be said that simply passing the packet without any explicit
labeling action, while far from ideal, is not terrible as the SYN-ACK
packet will inherit any IP option based labeling from the initial
connection request so the label *should* be correct and all our
access controls remain in place so we shouldn't have to worry about
information leaks.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
598cdbcf861825692fe7905e0fd662c7d06bae58 11-Dec-2013 Chad Hanson <chanson@trustedcs.com> selinux: fix broken peer recv check

Fix a broken networking check. Return an error if peer recv fails. If
secmark is active and the packet recv succeeds the peer recv error is
ignored.

Signed-off-by: Chad Hanson <chanson@trustedcs.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
5c6c26813a209e7075baf908e3ad81c1a9d389e8 09-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()

Due to difficulty in arriving at the proper security label for
TCP SYN-ACK packets in selinux_ip_postroute(), we need to check packets
while/before they are undergoing XFRM transforms instead of waiting
until afterwards so that we can determine the correct security label.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
5b67c493248059909d7e0ff646d8475306669b27 09-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: look for IPsec labels on both inbound and outbound packets

Previously selinux_skb_peerlbl_sid() would only check for labeled
IPsec security labels on inbound packets, this patch enables it to
check both inbound and outbound traffic for labeled IPsec security
labels.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
0b1f24e6db9a60c1f68117ad158ea29faa7c3a7f 03-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: pull address family directly from the request_sock struct

We don't need to inspect the packet to determine if the packet is an
IPv4 packet arriving on an IPv6 socket when we can query the
request_sock directly.

Signed-off-by: Paul Moore <pmoore@redhat.com>
7f721643db3b2da53e1b91aaa4e8cb7706bfdd10 03-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute()

In selinux_ip_postroute() we perform access checks based on the
packet's security label. For locally generated traffic we get the
packet's security label from the associated socket; this works in all
cases except for TCP SYN-ACK packets. In the case of SYN-ACK packet's
the correct security label is stored in the connection's request_sock,
not the server's socket. Unfortunately, at the point in time when
selinux_ip_postroute() is called we can't query the request_sock
directly, we need to recreate the label using the same logic that
originally labeled the associated request_sock.

See the inline comments for more explanation.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
da2ea0d05671f878196cc949906aa89d15c567db 03-Dec-2013 Paul Moore <pmoore@redhat.com> selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output()

In selinux_ip_output() we always label packets based on the parent
socket. While this approach works in almost all cases, it doesn't
work in the case of TCP SYN-ACK packets when the correct label is not
the label of the parent socket, but rather the label of the larval
socket represented by the request_sock struct.

Unfortunately, since the request_sock isn't queued on the parent
socket until *after* the SYN-ACK packet is sent, we can't lookup the
request_sock to determine the correct label for the packet; at this
point in time the best we can do is simply pass/NF_ACCEPT the packet.
It must be said that simply passing the packet without any explicit
labeling action, while far from ideal, is not terrible as the SYN-ACK
packet will inherit any IP option based labeling from the initial
connection request so the label *should* be correct and all our
access controls remain in place so we shouldn't have to worry about
information leaks.

Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
795aa6ef6a1aba99050735eadd0c2341b789b53b 10-Oct-2013 Patrick McHardy <kaber@trash.net> netfilter: pass hook ops to hookfn

Pass the hook ops to the hookfn to allow for generic hook
functions. This change is required by nf_tables.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ab3540626435c01e08fe58ce544311a78430f112 04-Oct-2013 Linus Torvalds <torvalds@linux-foundation.org> selinux: remove 'flags' parameter from avc_audit()

Now avc_audit() has no more users with that parameter. Remove it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
19e49834d22c2271ed1f4a03aaa4b74986447fb4 04-Oct-2013 Linus Torvalds <torvalds@linux-foundation.org> selinux: remove 'flags' parameter from inode_has_perm

Every single user passes in '0'. I think we had non-zero users back in
some stone age when selinux_inode_permission() was implemented in terms
of inode_has_perm(), but that complicated case got split up into a
totally separate code-path so that we could optimize the much simpler
special cases.

See commit 2e33405785d3 ("SELinux: delay initialization of audit data in
selinux_inode_permission") for example.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0bbf87d852d243680ed7074110ccc1dea003b61a 28-Sep-2013 Eric W. Biederman <ebiederm@xmission.com> net ipv4: Convert ipv4.ip_local_port_range to be per netns v3

- Move sysctl_local_ports from a global variable into struct netns_ipv4.
- Modify inet_get_local_port_range to take a struct net, and update all
of the callers.
- Move the initialization of sysctl_local_ports into
sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c

v2:
- Ensure indentation used tabs
- Fixed ip.h so it applies cleanly to todays net-next

v3:
- Compile fixes of strange callers of inet_get_local_port_range.
This patch now successfully passes an allmodconfig build.
Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range

Originally-by: Samya <samya@twitter.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0b4bdb3573a86a88c829b9e4ad702859eb923e7e 28-Aug-2013 Eric Paris <eparis@redhat.com> Revert "SELinux: do not handle seclabel as a special flag"

This reverts commit 308ab70c465d97cf7e3168961dfd365535de21a6.

It breaks my FC6 test box. /dev/pts is not mounted. dmesg says

SELinux: mount invalid. Same superblock, different security settings
for (dev devpts, type devpts)

Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
102aefdda4d8275ce7d7100bc16c88c74272b260 17-Apr-2013 Anand Avati <avati@redhat.com> selinux: consider filesystem subtype in policies

Not considering sub filesystem has the following limitation. Support
for SELinux in FUSE is dependent on the particular userspace
filesystem, which is identified by the subtype. For e.g, GlusterFS,
a FUSE based filesystem supports SELinux (by mounting and processing
FUSE requests in different threads, avoiding the mount time
deadlock), whereas other FUSE based filesystems (identified by a
different subtype) have the mount time deadlock.

By considering the subtype of the filesytem in the SELinux policies,
allows us to specify a filesystem subtype, in the following way:

fs_use_xattr fuse.glusterfs gen_context(system_u:object_r:fs_t,s0);

This way not all FUSE filesystems are put in the same bucket and
subjected to the limitations of the other subtypes.

Signed-off-by: Anand Avati <avati@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2be4d74f2fd45460d70d4fe65cc1972ef45bf849 03-May-2013 Chris PeBenito <cpebenito@tresys.com> Add SELinux policy capability for always checking packet and peer classes.

Currently the packet class in SELinux is not checked if there are no
SECMARK rules in the security or mangle netfilter tables. Some systems
prefer that packets are always checked, for example, to protect the system
should the netfilter rules fail to load or if the nefilter rules
were maliciously flushed.

Add the always_check_network policy capability which, when enabled, treats
SECMARK as enabled, even if there are no netfilter SECMARK rules and
treats peer labeling as enabled, even if there is no Netlabel or
labeled IPSEC configuration.

Includes definition of "redhat1" SELinux policy capability, which
exists in the SELinux userpace library, to keep ordering correct.

The SELinux userpace portion of this was merged last year, but this kernel
change fell on the floor.

Signed-off-by: Chris PeBenito <cpebenito@tresys.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
b43e725d8d386bf2092473953b525aaae71b6c28 10-Oct-2012 Eric Paris <eparis@redhat.com> SELinux: use a helper function to determine seclabel

Use a helper to determine if a superblock should have the seclabel flag
rather than doing it in the function. I'm going to use this in the
security server as well.

Signed-off-by: Eric Paris <eparis@redhat.com>
a64c54cf0811b8032fdab8c9d52576f0370837fa 24-Aug-2012 Eric Paris <eparis@redhat.com> SELinux: pass a superblock to security_fs_use

Rather than passing pointers to memory locations, strings, and other
stuff just give up on the separation and give security_fs_use the
superblock. It just makes the code easier to read (even if not easier to
reuse on some other OS)

Signed-off-by: Eric Paris <eparis@redhat.com>
308ab70c465d97cf7e3168961dfd365535de21a6 24-Aug-2012 Eric Paris <eparis@redhat.com> SELinux: do not handle seclabel as a special flag

Instead of having special code around the 'non-mount' seclabel mount option
just handle it like the mount options.

Signed-off-by: Eric Paris <eparis@redhat.com>
eadcabc697e904e0d93d10070a324d8855740b91 24-Aug-2012 Eric Paris <eparis@redhat.com> SELinux: do all flags twiddling in one place

Currently we set the initialize and seclabel flag in one place. Do some
unrelated printk then we unset the seclabel flag. Eww. Instead do the flag
twiddling in one place in the code not seperated by unrelated printk. Also
don't set and unset the seclabel flag. Only set it if we need to.

Signed-off-by: Eric Paris <eparis@redhat.com>
12f348b9dcf6d9616c86a049c3c8700f9dc0af55 09-Oct-2012 Eric Paris <eparis@redhat.com> SELinux: rename SE_SBLABELSUPP to SBLABEL_MNT

Just a flag rename as we prepare to make it not so special.

Signed-off-by: Eric Paris <eparis@redhat.com>
af8e50cc7d546c508e9091bbbdf3cf8b243bd8cd 24-Aug-2012 Eric Paris <eparis@redhat.com> SELinux: use define for number of bits in the mnt flags mask

We had this random hard coded value of '8' in the code (I put it there)
for the number of bits to check for mount options. This is stupid. Instead
use the #define we already have which tells us the number of mount
options.

Signed-off-by: Eric Paris <eparis@redhat.com>
d355987f47bbe24e7450b509a3f8aac0db88b65a 24-Aug-2012 Eric Paris <eparis@redhat.com> SELinux: make it harder to get the number of mnt opts wrong

Instead of just hard coding a value, use the enum to out benefit.

Signed-off-by: Eric Paris <eparis@redhat.com>
40d3d0b85fa22084fc3b7eeb943aca365097cea3 24-Aug-2012 Eric Paris <eparis@redhat.com> SELinux: remove crazy contortions around proc

We check if the fsname is proc and if so set the proc superblock security
struct flag. We then check if the flag is set and use the string 'proc'
for the fsname instead of just using the fsname. What's the point? It's
always proc... Get rid of the useless conditional.

Signed-off-by: Eric Paris <eparis@redhat.com>
5c73fceb8c70466c5876ad94c356922ae75a0820 23-Jul-2013 Stephen Smalley <sds@tycho.nsa.gov> SELinux: Enable setting security contexts on rootfs inodes.

rootfs (ramfs) can support setting of security contexts
by userspace due to the vfs fallback behavior of calling
the security module to set the in-core inode state
for security.* attributes when the filesystem does not
provide an xattr handler. No xattr handler required
as the inodes are pinned in memory and have no backing
store.

This is useful in allowing early userspace to label individual
files within a rootfs while still providing a policy-defined
default via genfs.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
bed4d7efb31fd81b3a3c83dc8540197cd0fe81c0 23-Jul-2013 Paul Moore <pmoore@redhat.com> selinux: remove the BUG_ON() from selinux_skb_xfrm_sid()

Remove the BUG_ON() from selinux_skb_xfrm_sid() and propogate the
error code up to the caller. Also check the return values in the
only caller function, selinux_skb_peerlbl_sid().

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2e5aa86609ec1cf37bcc204fd7ba6c24c2f49fec 23-Jul-2013 Paul Moore <pmoore@redhat.com> lsm: split the xfrm_state_alloc_security() hook implementation

The xfrm_state_alloc_security() LSM hook implementation is really a
multiplexed hook with two different behaviors depending on the
arguments passed to it by the caller. This patch splits the LSM hook
implementation into two new hook implementations, which match the
LSM hooks in the rest of the kernel:

* xfrm_state_alloc
* xfrm_state_alloc_acquire

Also included in this patch are the necessary changes to the SELinux
code; no other LSMs are affected.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
9548906b2bb7ff09e12c013a55d669bef2c8e121 24-Jul-2013 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> xattr: Constify ->name member of "struct xattr".

Since everybody sets kstrdup()ed constant string to "struct xattr"->name but
nobody modifies "struct xattr"->name , we can omit kstrdup() and its failure
checking by constifying ->name member of "struct xattr".

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Joel Becker <jlbec@evilplan.org> [ocfs2]
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Tested-by: Paul Moore <paul@paul-moore.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
13f8e9810bff12d01807b6f92329111f45218235 14-Jun-2013 David Howells <dhowells@redhat.com> SELinux: Institute file_path_has_perm()

Create a file_path_has_perm() function that is like path_has_perm() but
instead takes a file struct that is the source of both the path and the
inode (rather than getting the inode from the dentry in the path). This
is then used where appropriate.

This will be useful for situations like unionmount where it will be
possible to have an apparently-negative dentry (eg. a fallthrough) that is
open with the file struct pointing to an inode on the lower fs.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
aa9c2669626ca7e5e5bab28e6caeb583fd40099b 22-May-2013 David Quigley <dpquigl@davequigley.com> NFS: Client implementation of Labeled-NFS

This patch implements the client transport and handling support for labeled
NFS. The patch adds two functions to encode and decode the security label
recommended attribute which makes use of the LSM hooks added earlier. It also
adds code to grab the label from the file attribute structures and encode the
label to be sent back to the server.

Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
eb9ae686507bc5a5ca78e6b3fbe629cd5cc67864 22-May-2013 David Quigley <dpquigl@davequigley.com> SELinux: Add new labeling type native labels

There currently doesn't exist a labeling type that is adequate for use with
labeled NFS. Since NFS doesn't really support xattrs we can't use the use xattr
labeling behavior. For this we developed a new labeling type. The native
labeling type is used solely by NFS to ensure NFS inodes are labeled at runtime
by the NFS code instead of relying on the SELinux security server on the client
end.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
649f6e7718891fe7691e5084ce3fa623acba3129 22-May-2013 David Quigley <dpquigl@davequigley.com> LSM: Add flags field to security_sb_set_mnt_opts for in kernel mount data.

There is no way to differentiate if a text mount option is passed from user
space or the kernel. A flags field is being added to the
security_sb_set_mnt_opts hook to allow for in kernel security flags to be sent
to the LSM for processing in addition to the text options received from mount.
This patch also updated existing code to fix compilation errors.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
746df9b59c8a5f162c907796c7295d3c4c0d8995 22-May-2013 David Quigley <dpquigl@davequigley.com> Security: Add Hook to test if the particular xattr is part of a MAC model.

The interface to request security labels from user space is the xattr
interface. When requesting the security label from an NFS server it is
important to make sure the requested xattr actually is a MAC label. This allows
us to make sure that we get the desired semantics from the attribute instead of
something else such as capabilities or a time based LSM.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
d47be3dfecaf20255af89a57460285c82d5271ad 22-May-2013 David Quigley <dpquigl@davequigley.com> Security: Add hook to calculate context based on a negative dentry.

There is a time where we need to calculate a context without the
inode having been created yet. To do this we take the negative dentry and
calculate a context based on the process and the parent directory contexts.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
ca10b9e9a8ca7342ee07065289cbe74ac128c169 08-Apr-2013 Eric Dumazet <edumazet@google.com> selinux: add a skb_owned_by() hook

Commit 90ba9b1986b5ac (tcp: tcp_make_synack() can use alloc_skb())
broke certain SELinux/NetLabel configurations by no longer correctly
assigning the sock to the outgoing SYNACK packet.

Cost of atomic operations on the LISTEN socket is quite big,
and we would like it to happen only if really needed.

This patch introduces a new security_ops->skb_owned_by() method,
that is a void operation unless selinux is active.

Reported-by: Miroslav Vadkerti <mvadkert@redhat.com>
Diagnosed-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-security-module@vger.kernel.org
Acked-by: James Morris <james.l.morris@oracle.com>
Tested-by: Paul Moore <pmoore@redhat.com>
Acked-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
094f7b69ea738d7d619cba449d2af97159949459 01-Apr-2013 Jeff Layton <jlayton@redhat.com> selinux: make security_sb_clone_mnt_opts return an error on context mismatch

I had the following problem reported a while back. If you mount the
same filesystem twice using NFSv4 with different contexts, then the
second context= option is ignored. For instance:

# mount server:/export /mnt/test1
# mount server:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
# ls -dZ /mnt/test1
drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test1
# ls -dZ /mnt/test2
drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test2

When we call into SELinux to set the context of a "cloned" superblock,
it will currently just bail out when it notices that we're reusing an
existing superblock. Since the existing superblock is already set up and
presumably in use, we can't go overwriting its context with the one from
the "original" sb. Because of this, the second context= option in this
case cannot take effect.

This patch fixes this by turning security_sb_clone_mnt_opts into an int
return operation. When it finds that the "new" superblock that it has
been handed is already set up, it checks to see whether the contexts on
the old superblock match it. If it does, then it will just return
success, otherwise it'll return -EBUSY and emit a printk to tell the
admin why the second mount failed.

Note that this patch may cause casualties. The NFSv4 code relies on
being able to walk down to an export from the pseudoroot. If you mount
filesystems that are nested within one another with different contexts,
then this patch will make those mounts fail in new and "exciting" ways.

For instance, suppose that /export is a separate filesystem on the
server:

# mount server:/ /mnt/test1
# mount salusa:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
mount.nfs: an incorrect mount option was specified

...with the printk in the ring buffer. Because we *might* eventually
walk down to /mnt/test1/export, the mount is denied due to this patch.
The second mount needs the pseudoroot superblock, but that's already
present with the wrong context.

OTOH, if we mount these in the reverse order, then both mounts work,
because the pseudoroot superblock created when mounting /export is
discarded once that mount is done. If we then however try to walk into
that directory, the automount fails for the similar reasons:

# cd /mnt/test1/scratch/
-bash: cd: /mnt/test1/scratch: Device or resource busy

The story I've gotten from the SELinux folks that I've talked to is that
this is desirable behavior. In SELinux-land, mounting the same data
under different contexts is wrong -- there can be only one.

Cc: Steve Dickson <steved@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
77954983ad76e4f10cee8d2e06fcd085fae11780 27-Mar-2013 Hong zhi guo <honkiko@gmail.com> selinux: replace obsolete NLMSG_* with type safe nlmsg_*

Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
45e09bd51b2be1fbb86c2e3d5bb00d32744f1ecb 23-Jan-2013 Al Viro <viro@zeniv.linux.org.uk> selinux: opened file can't have NULL or negative ->f_path.dentry

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
496ad9aa8ef448058e36ca7a787c61f2e63f0f54 23-Jan-2013 Al Viro <viro@zeniv.linux.org.uk> new helper: file_inode(file)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
5dbbaf2de89613d19a9286d4db0a535ca2735d26 14-Jan-2013 Paul Moore <pmoore@redhat.com> tun: fix LSM/SELinux labeling of tun/tap devices

This patch corrects some problems with LSM/SELinux that were introduced
with the multiqueue patchset. The problem stems from the fact that the
multiqueue work changed the relationship between the tun device and its
associated socket; before the socket persisted for the life of the
device, however after the multiqueue changes the socket only persisted
for the life of the userspace connection (fd open). For non-persistent
devices this is not an issue, but for persistent devices this can cause
the tun device to lose its SELinux label.

We correct this problem by adding an opaque LSM security blob to the
tun device struct which allows us to have the LSM security state, e.g.
SELinux labeling information, persist for the lifetime of the tun
device. In the process we tweak the LSM hooks to work with this new
approach to TUN device/socket labeling and introduce a new LSM hook,
security_tun_dev_attach_queue(), to approve requests to attach to a
TUN queue via TUNSETQUEUE.

The SELinux code has been adjusted to match the new LSM hooks, the
other LSMs do not make use of the LSM TUN controls. This patch makes
use of the recently added "tun_socket:attach_queue" permission to
restrict access to the TUNSETQUEUE operation. On older SELinux
policies which do not define the "tun_socket:attach_queue" permission
the access control decision for TUNSETQUEUE will be handled according
to the SELinux policy's unknown permission setting.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Tested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
45525b26a46cd593cb72070304c4cd7c8391bd37 16-Oct-2012 Al Viro <viro@zeniv.linux.org.uk> fix a leak in replace_fd() users

replace_fd() began with "eats a reference, tries to insert into
descriptor table" semantics; at some point I'd switched it to
much saner current behaviour ("try to insert into descriptor
table, grabbing a new reference if inserted; caller should do
fput() in any case"), but forgot to update the callers.
Mea culpa...

[Spotted by Pavel Roskin, who has really weird system with pipe-fed
coredumps as part of what he considers a normal boot ;-)]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
808d4e3cfdcc52b19276175464f6dbca4df13b09 11-Oct-2012 Al Viro <viro@zeniv.linux.org.uk> consitify do_mount() arguments

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
cb0942b81249798e15c3f04eee2946ef543e8115 27-Aug-2012 Al Viro <viro@zeniv.linux.org.uk> make get_file() return its argument

simplifies a bunch of callers...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
c3c073f808b22dfae15ef8412b6f7b998644139a 22-Aug-2012 Al Viro <viro@zeniv.linux.org.uk> new helper: iterate_fd()

iterates through the opened files in given descriptor table,
calling a supplied function; we stop once non-zero is returned.
Callback gets struct file *, descriptor number and const void *
argument passed to iterator. It is called with files->file_lock
held, so it is not allowed to block.

tty_io, netprio_cgroup and selinux flush_unauthorized_files()
converted to its use.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ee97cd872d08b8623076f2a63ffb872d0884411a 21-Aug-2012 Al Viro <viro@zeniv.linux.org.uk> switch flush_unauthorized_files() to replace_fd()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1d151c337d79fa3de88654d2514f58fbd916a8e0 30-Jul-2012 Cyrill Gorcunov <gorcunov@openvz.org> c/r: fcntl: add F_GETOWNER_UIDS option

When we restore file descriptors we would like them to look exactly as
they were at dumping time.

With help of fcntl it's almost possible, the missing snippet is file
owners UIDs.

To be able to read their values the F_GETOWNER_UIDS is introduced.

This option is valid iif CONFIG_CHECKPOINT_RESTORE is turned on, otherwise
returning -EINVAL.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e3fea3f70fd68af0574a5f24246cdb4ed07f2b74 09-Jun-2012 Al Viro <viro@ZenIV.linux.org.uk> selinux: fix selinux_inode_setxattr oops

OK, what we have so far is e.g.
setxattr(path, name, whatever, 0, XATTR_REPLACE)
with name being good enough to get through xattr_permission().
Then we reach security_inode_setxattr() with the desired value and size.
Aha. name should begin with "security.selinux", or we won't get that
far in selinux_inode_setxattr(). Suppose we got there and have enough
permissions to relabel that sucker. We call security_context_to_sid()
with value == NULL, size == 0. OK, we want ss_initialized to be non-zero.
I.e. after everything had been set up and running. No problem...

We do 1-byte kmalloc(), zero-length memcpy() (which doesn't oops, even
thought the source is NULL) and put a NUL there. I.e. form an empty
string. string_to_context_struct() is called and looks for the first
':' in there. Not found, -EINVAL we get. OK, security_context_to_sid_core()
has rc == -EINVAL, force == 0, so it silently returns -EINVAL.
All it takes now is not having CAP_MAC_ADMIN and we are fucked.

All right, it might be a different bug (modulo strange code quoted in the
report), but it's real. Easily fixed, AFAICS:

Deal with size == 0, value == NULL case in selinux_inode_setxattr()

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Dave Jones <davej@redhat.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
8ded2bbc1845e19c771eb55209aab166ef011243 25-Jul-2012 Josh Boyer <jwboyer@redhat.com> posix_types.h: Cleanup stale __NFDBITS and related definitions

Recently, glibc made a change to suppress sign-conversion warnings in
FD_SET (glibc commit ceb9e56b3d1). This uncovered an issue with the
kernel's definition of __NFDBITS if applications #include
<linux/types.h> after including <sys/select.h>. A build failure would
be seen when passing the -Werror=sign-compare and -D_FORTIFY_SOURCE=2
flags to gcc.

It was suggested that the kernel should either match the glibc
definition of __NFDBITS or remove that entirely. The current in-kernel
uses of __NFDBITS can be replaced with BITS_PER_LONG, and there are no
uses of the related __FDELT and __FDMASK defines. Given that, we'll
continue the cleanup that was started with commit 8b3d1cda4f5f
("posix_types: Remove fd_set macros") and drop the remaining unused
macros.

Additionally, linux/time.h has similar macros defined that expand to
nothing so we'll remove those at the same time.

Reported-by: Jeff Law <law@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
CC: <stable@vger.kernel.org>
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
[ .. and fix up whitespace as per akpm ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
765927b2d508712d320c8934db963bbe14c3fcec 26-Jun-2012 Al Viro <viro@zeniv.linux.org.uk> switch dentry_open() to struct path, make it grab references itself

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3d2195c3324b27e65ba53d9626a6bd91a2515797 06-Jul-2012 Eric Paris <eparis@redhat.com> SELinux: do not check open perms if they are not known to policy

When I introduced open perms policy didn't understand them and I
implemented them as a policycap. When I added the checking of open perm
to truncate I forgot to conditionalize it on the userspace defined
policy capability. Running an old policy with a new kernel will not
check open on open(2) but will check it on truncate. Conditionalize the
truncate check the same as the open check.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: stable@vger.kernel.org # 3.4.x
Signed-off-by: James Morris <james.l.morris@oracle.com>
2597a8344ce051d0afe331706bcb4660bbdb9861 14-May-2012 Alban Crequy <alban.crequy@collabora.co.uk> netfilter: selinux: switch hook PFs to nfproto

This patch is a cleanup. Use NFPROTO_* for consistency with other
netfilter code.

Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk>
Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Reviewed-by: Vincent Sanders <vincent.sanders@collabora.co.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
e5467859f7f79b69fc49004403009dfdba3bec53 30-May-2012 Al Viro <viro@zeniv.linux.org.uk> split ->file_mmap() into ->mmap_addr()/->mmap_file()

... i.e. file-dependent and address-dependent checks.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
d007794a182bc072a7b7479909dbd0d67ba341be 30-May-2012 Al Viro <viro@zeniv.linux.org.uk> split cap_mmap_addr() out of cap_file_mmap()

... switch callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
259e5e6c75a910f3b5e656151dc602f53f9d7548 12-Apr-2012 Andy Lutomirski <luto@amacapital.net> Add PR_{GET,SET}_NO_NEW_PRIVS to prevent execve from granting privs

With this change, calling
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)
disables privilege granting operations at execve-time. For example, a
process will not be able to execute a setuid binary to change their uid
or gid if this bit is set. The same is true for file capabilities.

Additionally, LSM_UNSAFE_NO_NEW_PRIVS is defined to ensure that
LSMs respect the requested behavior.

To determine if the NO_NEW_PRIVS bit is set, a task may call
prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
It returns 1 if set and 0 if it is not set. If any of the arguments are
non-zero, it will return -1 and set errno to -EINVAL.
(PR_SET_NO_NEW_PRIVS behaves similarly.)

This functionality is desired for the proposed seccomp filter patch
series. By using PR_SET_NO_NEW_PRIVS, it allows a task to modify the
system call behavior for itself and its child tasks without being
able to impact the behavior of a more privileged task.

Another potential use is making certain privileged operations
unprivileged. For example, chroot may be considered "safe" if it cannot
affect privileged tasks.

Note, this patch causes execve to fail when PR_SET_NO_NEW_PRIVS is
set and AppArmor is in use. It is fixed in a subsequent patch.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Will Drewry <wad@chromium.org>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>

v18: updated change desc
v17: using new define values as per 3.4
Signed-off-by: James Morris <james.l.morris@oracle.com>
c737f8284cac91428f8fcc8281e69117fa16e887 05-Apr-2012 Eric Paris <eparis@redhat.com> SELinux: remove unused common_audit_data in flush_unauthorized_files

We don't need this variable and it just eats stack space. Remove it.

Signed-off-by: Eric Paris <eparis@redhat.com>
899838b25f063a94594b1df6e0100aea1ec57fac 04-Apr-2012 Eric Paris <eparis@redhat.com> SELinux: unify the selinux_audit_data and selinux_late_audit_data

We no longer need the distinction. We only need data after we decide to do an
audit. So turn the "late" audit data into just "data" and remove what we
currently have as "data".

Signed-off-by: Eric Paris <eparis@redhat.com>
50c205f5e5c2e2af002fd4ef537ded79b90b1b56 04-Apr-2012 Eric Paris <eparis@redhat.com> LSM: do not initialize common_audit_data to 0

It isn't needed. If you don't set the type of the data associated with
that type it is a pretty obvious programming bug. So why waste the cycles?

Signed-off-by: Eric Paris <eparis@redhat.com>
b466066f9b648ccb6aa1e174f0389b7433e460fd 04-Apr-2012 Eric Paris <eparis@redhat.com> LSM: remove the task field from common_audit_data

There are no legitimate users. Always use current and get back some stack
space for the common_audit_data.

Signed-off-by: Eric Paris <eparis@redhat.com>
bd5e50f9c1c71daac273fa586424f07205f6b13b 04-Apr-2012 Eric Paris <eparis@redhat.com> LSM: remove the COMMON_AUDIT_DATA_INIT type expansion

Just open code it so grep on the source code works better.

Signed-off-by: Eric Paris <eparis@redhat.com>
d4cf970d0732628d514405c5a975024b9e205b0b 04-Apr-2012 Eric Paris <eparis@redhat.com> SELinux: move common_audit_data to a noinline slow path function

selinux_inode_has_perm is a hot path. Instead of declaring the
common_audit_data on the stack move it to a noinline function only used in
the rare case we need to send an audit message.

Signed-off-by: Eric Paris <eparis@redhat.com>
602a8dd6ea6abd463bc26310c4a1b44919f88e68 04-Apr-2012 Eric Paris <eparis@redhat.com> SELinux: remove inode_has_perm_noadp

Both callers could better be using file_has_perm() to get better audit
results.

Signed-off-by: Eric Paris <eparis@redhat.com>
2e33405785d3eaec303c54b4a10afdebf3729da7 04-Apr-2012 Eric Paris <eparis@redhat.com> SELinux: delay initialization of audit data in selinux_inode_permission

We pay a rather large overhead initializing the common_audit_data.
Since we only need this information if we actually emit an audit
message there is little need to set it up in the hot path. This patch
splits the functionality of avc_has_perm() into avc_has_perm_noaudit(),
avc_audit_required() and slow_avc_audit(). But we take care of setting
up to audit between required() and the actual audit call. Thus saving
measurable time in a hot path.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
d6ea83ec6864e9297fa8b00ec3dae183413a90e3 04-Apr-2012 Eric Paris <eparis@redhat.com> SELinux: audit failed attempts to set invalid labels

We know that some yum operation is causing CAP_MAC_ADMIN failures. This
implies that an RPM is laying down (or attempting to lay down) a file with
an invalid label. The problem is that we don't have any information to
track down the cause. This patch will cause such a failure to report the
failed label in an SELINUX_ERR audit message. This is similar to the
SELINUX_ERR reports on invalid transitions and things like that. It should
help run down problems on what is trying to set invalid labels in the
future.

Resulting records look something like:
type=AVC msg=audit(1319659241.138:71): avc: denied { mac_admin } for pid=2594 comm="chcon" capability=33 scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=capability2
type=SELINUX_ERR msg=audit(1319659241.138:71): op=setxattr invalid_context=unconfined_u:object_r:hello:s0
type=SYSCALL msg=audit(1319659241.138:71): arch=c000003e syscall=188 success=no exit=-22 a0=a2c0e0 a1=390341b79b a2=a2d620 a3=1f items=1 ppid=2519 pid=2594 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="chcon" exe="/usr/bin/chcon" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=CWD msg=audit(1319659241.138:71): cwd="/root" type=PATH msg=audit(1319659241.138:71): item=0 name="test" inode=785879 dev=fc:03 mode=0100644 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:admin_home_t:s0

Signed-off-by: Eric Paris <eparis@redhat.com>
83d498569e9a7a4b92c4c5d3566f2d6a604f28c9 04-Apr-2012 Eric Paris <eparis@redhat.com> SELinux: rename dentry_open to file_open

dentry_open takes a file, rename it to file_open

Signed-off-by: Eric Paris <eparis@redhat.com>
95dbf739313f09c8d859bde1373bc264ef979337 04-Apr-2012 Eric Paris <eparis@redhat.com> SELinux: check OPEN on truncate calls

In RH BZ 578841 we realized that the SELinux sandbox program was allowed to
truncate files outside of the sandbox. The reason is because sandbox
confinement is determined almost entirely by the 'open' permission. The idea
was that if the sandbox was unable to open() files it would be unable to do
harm to those files. This turns out to be false in light of syscalls like
truncate() and chmod() which don't require a previous open() call. I looked
at the syscalls that did not have an associated 'open' check and found that
truncate(), did not have a seperate permission and even if it did have a
separate permission such a permission owuld be inadequate for use by
sandbox (since it owuld have to be granted so liberally as to be useless).
This patch checks the OPEN permission on truncate. I think a better solution
for sandbox is a whole new permission, but at least this fixes what we have
today.

Signed-off-by: Eric Paris <eparis@redhat.com>
48c62af68a403ef1655546bd3e021070c8508573 02-Apr-2012 Eric Paris <eparis@redhat.com> LSM: shrink the common_audit_data data union

After shrinking the common_audit_data stack usage for private LSM data I'm
not going to shrink the data union. To do this I'm going to move anything
larger than 2 void * ptrs to it's own structure and require it to be declared
separately on the calling stack. Thus hot paths which don't need more than
a couple pointer don't have to declare space to hold large unneeded
structures. I could get this down to one void * by dealing with the key
struct and the struct path. We'll see if that is helpful after taking care of
networking.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3b3b0e4fc15efa507b902d90cea39e496a523c3b 03-Apr-2012 Eric Paris <eparis@redhat.com> LSM: shrink sizeof LSM specific portion of common_audit_data

Linus found that the gigantic size of the common audit data caused a big
perf hit on something as simple as running stat() in a loop. This patch
requires LSMs to declare the LSM specific portion separately rather than
doing it in a union. Thus each LSM can be responsible for shrinking their
portion and don't have to pay a penalty just because other LSMs have a
bigger space requirement.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2f99c36986ff27a86f06f27212c5f5fa8c7164a3 23-Mar-2012 Al Viro <viro@zeniv.linux.org.uk> get rid of pointless includes of ext2_fs.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1fd36adcd98c14d2fd97f545293c488775cb2823 16-Feb-2012 David Howells <dhowells@redhat.com> Replace the fd_sets in struct fdtable with an array of unsigned longs

Replace the fd_sets in struct fdtable with an array of unsigned longs and then
use the standard non-atomic bit operations rather than the FD_* macros.

This:

(1) Removes the abuses of struct fd_set:

(a) Since we don't want to allocate a full fd_set the vast majority of the
time, we actually, in effect, just allocate a just-big-enough array of
unsigned longs and cast it to an fd_set type - so why bother with the
fd_set at all?

(b) Some places outside of the core fdtable handling code (such as
SELinux) want to look inside the array of unsigned longs hidden inside
the fd_set struct for more efficient iteration over the entire set.

(2) Eliminates the use of FD_*() macros in the kernel completely.

(3) Permits the __FD_*() macros to be deleted entirely where not exposed to
userspace.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/20120216174954.23314.48147.stgit@warthog.procyon.org.uk
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
4040153087478993cbf0809f444400a3c808074c 13-Feb-2012 Al Viro <viro@ftp.linux.org.uk> security: trim security.h

Trim security.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
d8c9584ea2a92879f471fd3a2be3af6c534fb035 08-Dec-2011 Al Viro <viro@zeniv.linux.org.uk> vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
fd778461524849afd035679030ae8e8873c72b81 03-Jan-2012 Eric Paris <eparis@redhat.com> security: remove the security_netlink_recv hook as it is equivalent to capable()

Once upon a time netlink was not sync and we had to get the effective
capabilities from the skb that was being received. Today we instead get
the capabilities from the current task. This has rendered the entire
purpose of the hook moot as it is now functionally equivalent to the
capable() call.

Signed-off-by: Eric Paris <eparis@redhat.com>
69f594a38967f4540ce7a29b3fd214e68a8330bd 03-Jan-2012 Eric Paris <eparis@redhat.com> ptrace: do not audit capability check when outputing /proc/pid/stat

Reading /proc/pid/stat of another process checks if one has ptrace permissions
on that process. If one does have permissions it outputs some data about the
process which might have security and attack implications. If the current
task does not have ptrace permissions the read still works, but those fields
are filled with inocuous (0) values. Since this check and a subsequent denial
is not a violation of the security policy we should not audit such denials.

This can be quite useful to removing ptrace broadly across a system without
flooding the logs when ps is run or something which harmlessly walks proc.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
6a9de49115d5ff9871d953af1a5c8249e1585731 03-Jan-2012 Eric Paris <eparis@redhat.com> capabilities: remove the task from capable LSM hook entirely

The capabilities framework is based around credentials, not necessarily the
current task. Yet we still passed the current task down into LSMs from the
security_capable() LSM hook as if it was a meaningful portion of the security
decision. This patch removes the 'generic' passing of current and instead
forces individual LSMs to use current explicitly if they think it is
appropriate. In our case those LSMs are SELinux and AppArmor.

I believe the AppArmor use of current is incorrect, but that is wholely
unrelated to this patch. This patch does not change what AppArmor does, it
just makes it clear in the AppArmor code that it is doing it.

The SELinux code still uses current in it's audit message, which may also be
wrong and needs further investigation. Again this is NOT a change, it may
have always been wrong, this patch just makes it clear what is happening.

Signed-off-by: Eric Paris <eparis@redhat.com>
2653812e14f4e16688ec8247d7fd290bdbbc4747 30-Aug-2011 James Morris <jmorris@namei.org> selinux: sparse fix: fix several warnings in the security server cod

Fix several sparse warnings in the SELinux security server code.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
02f5daa563456c1ff3c3422aa3ec00e67460f762 30-Aug-2011 James Morris <jmorris@namei.org> selinux: sparse fix: fix warnings in netlink code

Fix sparse warnings in SELinux Netlink code.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
e8a65a3f67f8a85802c0a0250e48c4c4652d0da0 30-Aug-2011 James Morris <jmorris@namei.org> selinux: sparse fix: eliminate warnings for selinuxfs

Fixes several sparse warnings for selinuxfs.c

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
b46610caba4bd9263afd07c7ef7a79974550554a 30-Aug-2011 James Morris <jmorris@namei.org> selinux: sparse fix: make selinux_secmark_refcount static

Sparse fix: make selinux_secmark_refcount static.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
dba19c6064766730dd64757a010ec3aec503ecdb 26-Jul-2011 Al Viro <viro@zeniv.linux.org.uk> get rid of open-coded S_ISREG(), etc.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1a67aafb5f72a436ca044293309fa7e6351d6a35 26-Jul-2011 Al Viro <viro@zeniv.linux.org.uk> switch ->mknod() to umode_t

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
4acdaf27ebe2034c342f3be57ef49aed1ad885ef 26-Jul-2011 Al Viro <viro@zeniv.linux.org.uk> switch ->create() to umode_t

vfs_create() ignores everything outside of 16bit subset of its
mode argument; switching it to umode_t is obviously equivalent
and it's the only caller of the method

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
18bb1db3e7607e4a997d50991a6f9fa5b0f8722c 26-Jul-2011 Al Viro <viro@zeniv.linux.org.uk> switch vfs_mkdir() and ->mkdir() to umode_t

vfs_mkdir() gets int, but immediately drops everything that might not
fit into umode_t and that's the only caller of ->mkdir()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7f1fb60c4fc9fb29fbb406ac8c4cfb4e59e168d6 06-Dec-2011 Pavel Emelyanov <xemul@parallels.com> inet_diag: Partly rename inet_ to sock_

The ultimate goal is to get the sock_diag module, that works in
family+protocol terms. Currently this is suitable to do on the
inet_diag basis, so rename parts of the code. It will be moved
to sock_diag.c later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
75f2811c6460ccc59d83c66059943ce9c9f81a18 01-Dec-2011 Jesse Gross <jesse@nicira.com> ipv6: Add fragment reporting to ipv6_skip_exthdr().

While parsing through IPv6 extension headers, fragment headers are
skipped making them invisible to the caller. This reports the
fragment offset of the last header in order to make it possible to
determine whether the packet is fragmented and, if so whether it is
a first or last fragment.

Signed-off-by: Jesse Gross <jesse@nicira.com>
4e3fd7a06dc20b2d8ec6892233ad2012968fe7b6 21-Nov-2011 Alexey Dobriyan <adobriyan@gmail.com> net: remove ipv6_addr_copy()

C assignment can handle struct in6_addr copying.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
44fc7ea0bfe9143551649a42eb35f1460566c3c5 27-May-2011 Paul Gortmaker <paul.gortmaker@windriver.com> selinux: Add export.h to files using EXPORT_SYMBOL/THIS_MODULE

The pervasive, but implicit presence of <linux/module.h> meant
that things like this file would happily compile as-is. But
with the desire to phase out the module.h being included everywhere,
point this file at export.h which will give it THIS_MODULE and
the EXPORT_SYMBOL variants.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
7b98a5857c3fa86cb0a7e5f893643491a8b5b425 29-Aug-2011 James Morris <jmorris@namei.org> selinux: sparse fix: fix several warnings in the security server code

Fix several sparse warnings in the SELinux security server code.

Signed-off-by: James Morris <jmorris@namei.org>
6a3fbe81179c85eb53054a0f4c8423ffec0276a7 29-Aug-2011 James Morris <jmorris@namei.org> selinux: sparse fix: fix warnings in netlink code

Fix sparse warnings in SELinux Netlink code.

Signed-off-by: James Morris <jmorris@namei.org>
ad3fa08c4ff84ed87649d72e8497735c85561a3d 30-Aug-2011 James Morris <jmorris@namei.org> selinux: sparse fix: eliminate warnings for selinuxfs

Fixes several sparse warnings for selinuxfs.c

Signed-off-by: James Morris <jmorris@namei.org>
56a4ca996181b94b30e6b46509dc28e4ca3cc3f8 17-Aug-2011 James Morris <jmorris@namei.org> selinux: sparse fix: make selinux_secmark_refcount static

Sparse fix: make selinux_secmark_refcount static.

Signed-off-by: James Morris <jmorris@namei.org>
8959deef0fafeb6e5ede7efd237f74a5a6c9b472 01-Aug-2011 Paul Moore <paul.moore@hp.com> doc: Update the email address for Paul Moore in various source files

My @hp.com will no longer be valid starting August 5, 2011 so an update is
necessary. My new email address is employer independent so we don't have
to worry about doing this again any time soon.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <jmorris@namei.org>
82c21bfab41a77bc01affe21bea9727d776774a7 01-Aug-2011 Paul Moore <paul.moore@hp.com> doc: Update the email address for Paul Moore in various source files

My @hp.com will no longer be valid starting August 5, 2011 so an update is
necessary. My new email address is employer independent so we don't have
to worry about doing this again any time soon.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
60063497a95e716c9a689af3be2687d261f115b4 27-Jul-2011 Arun Sharma <asharma@fb.com> atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cf1dd1dae851ce5765cda5de16aa965eef7c2dbf 21-Jun-2011 Al Viro <viro@zeniv.linux.org.uk> selinux: don't transliterate MAY_NOT_BLOCK to IPERM_FLAG_RCU

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
e74f71eb78a4a8b9eaf1bc65f20f761648e85f76 21-Jun-2011 Al Viro <viro@zeniv.linux.org.uk> ->permission() sanitizing: don't pass flags to ->inode_permission()

pass that via mask instead.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
06d984737bac0545fe20bb5447ee488b95adb531 17-Jun-2011 Tejun Heo <tj@kernel.org> ptrace: s/tracehook_tracer_task()/ptrace_parent()/

tracehook.h is on the way out. Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
95f4efb2d78661065aaf0be57f5bf00e4d2aea1d 09-Jun-2011 Linus Torvalds <torvalds@linux-foundation.org> selinux: simplify and clean up inode_has_perm()

This is a rather hot function that is called with a potentially NULL
"struct common_audit_data" pointer argument. And in that case it has to
provide and initialize its own dummy common_audit_data structure.

However, all the _common_ cases already pass it a real audit-data
structure, so that uncommon NULL case not only creates a silly run-time
test, more importantly it causes that function to have a big stack frame
for the dummy variable that isn't even used in the common case!

So get rid of that stupid run-time behavior, and make the (few)
functions that currently call with a NULL pointer just call a new helper
function instead (naturally called inode_has_perm_noapd(), since it has
no adp argument).

This makes the run-time test be a static code generation issue instead,
and allows for a much denser stack since none of the common callers need
the dummy structure. And a denser stack not only means less stack space
usage, it means better cache behavior. So we have a win-win-win from
this simplification: less code executed, smaller stack footprint, and
better cache behavior.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cb1e922fa104bb0bb3aa5fc6ca7f7e070f3b55e9 28-Apr-2011 Eric Paris <eparis@redhat.com> SELinux: pass last path component in may_create

New inodes are created in a two stage process. We first will compute the
label on a new inode in security_inode_create() and check if the
operation is allowed. We will then actually re-compute that same label and
apply it in security_inode_init_security(). The change to do new label
calculations based in part on the last component of the path name only
passed the path component information all the way down the
security_inode_init_security hook. Down the security_inode_create hook the
path information did not make it past may_create. Thus the two calculations
came up differently and the permissions check might not actually be against
the label that is created. Pass and use the same information in both places
to harmonize the calculations and checks.

Reported-by: Dominick Grift <domg472@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2875fa00830be62431f5ac22d8f85d57f9fa3033 28-Apr-2011 Eric Paris <eparis@redhat.com> SELinux: introduce path_has_perm

We currently have inode_has_perm and dentry_has_perm. dentry_has_perm just
calls inode_has_perm with additional audit data. But dentry_has_perm can
take either a dentry or a path. Split those to make the code obvious and
to fix the previous problem where I thought dentry_has_perm always had a
valid dentry and mnt.

Signed-off-by: Eric Paris <eparis@redhat.com>
562abf624175e3f8487b7f064e516805e437e597 28-Apr-2011 Eric Paris <eparis@redhat.com> SELinux: pass last path component in may_create

New inodes are created in a two stage process. We first will compute the
label on a new inode in security_inode_create() and check if the
operation is allowed. We will then actually re-compute that same label and
apply it in security_inode_init_security(). The change to do new label
calculations based in part on the last component of the path name only
passed the path component information all the way down the
security_inode_init_security hook. Down the security_inode_create hook the
path information did not make it past may_create. Thus the two calculations
came up differently and the permissions check might not actually be against
the label that is created. Pass and use the same information in both places
to harmonize the calculations and checks.

Reported-by: Dominick Grift <domg472@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
9ade0cf440a1e5800dc68eef2e77b8d9d83a6dff 25-Apr-2011 Eric Paris <eparis@redhat.com> SELINUX: Make selinux cache VFS RCU walks safe

Now that the security modules can decide whether they support the
dcache RCU walk or not it's possible to make selinux a bit more
RCU friendly. The SELinux AVC and security server access decision
code is RCU safe. A specific piece of the LSM audit code may not
be RCU safe.

This patch makes the VFS RCU walk retry if it would hit the non RCU
safe chunk of code. It will normally just work under RCU. This is
done simply by passing the VFS RCU state as a flag down into the
avc_audit() code and returning ECHILD there if it would have an issue.

Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
a269434d2fb48a4d66c1d7bf821b7874b59c5b41 25-Apr-2011 Eric Paris <eparis@redhat.com> LSM: separate LSM_AUDIT_DATA_DENTRY from LSM_AUDIT_DATA_PATH

This patch separates and audit message that only contains a dentry from
one that contains a full path. This allows us to make it harder to
misuse the interfaces or for the interfaces to be implemented wrong.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
f48b7399840b453e7282b523f535561fe9638a2d 25-Apr-2011 Eric Paris <eparis@redhat.com> LSM: split LSM_AUDIT_DATA_FS into _PATH and _INODE

The lsm common audit code has wacky contortions making sure which pieces
of information are set based on if it was given a path, dentry, or
inode. Split this into path and inode to get rid of some of the code
complexity.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
0dc1ba24f7fff659725eecbba2c9ad679a0954cd 22-Apr-2011 Eric Paris <eparis@redhat.com> SELINUX: Make selinux cache VFS RCU walks safe

Now that the security modules can decide whether they support the
dcache RCU walk or not it's possible to make selinux a bit more
RCU friendly. The SELinux AVC and security server access decision
code is RCU safe. A specific piece of the LSM audit code may not
be RCU safe.

This patch makes the VFS RCU walk retry if it would hit the non RCU
safe chunk of code. It will normally just work under RCU. This is
done simply by passing the VFS RCU state as a flag down into the
avc_audit() code and returning ECHILD there if it would have an issue.

Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
1c9904297451f558191e211a48d8838b4bf792b0 22-Apr-2011 Andi Kleen <ak@linux.intel.com> SECURITY: Move exec_permission RCU checks into security modules

Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.

Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
a35c6c8368d88deae6890205e73ed330b6df1db7 20-Apr-2011 Eric Paris <eparis@redhat.com> SELinux: silence build warning when !CONFIG_BUG

If one builds a kernel without CONFIG_BUG there are a number of 'may be
used uninitialized' warnings. Silence these by returning after the BUG().

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
8c9e80ed276fc4b9c9fadf29d8bf6b3576112f1a 22-Apr-2011 Andi Kleen <ak@linux.intel.com> SECURITY: Move exec_permission RCU checks into security modules

Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.

Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2e1496707560ecf98e9b0604622c0990f94861d3 24-Mar-2011 Serge E. Hallyn <serge@hallyn.com> userns: rename is_owner_or_cap to inode_owner_or_capable

And give it a kernel-doc comment.

[akpm@linux-foundation.org: btrfs changed in linux-next]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3486740a4f32a6a466f5ac931654d154790ba648 24-Mar-2011 Serge E. Hallyn <serge@hallyn.com> userns: security: make capabilities relative to the user namespace

- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.

The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.

I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.

Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.

(Original written and signed off by Eric; latest, modified version
acked by him)

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1d28f42c1bd4bb2363d88df74d0128b4da135b4a 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Put flowi_* prefix on AF independent members of struct flowi

I intend to turn struct flowi into a union of AF specific flowi
structs. There will be a common structure that each variant includes
first, much like struct sock_common.

This is the first step to move in that direction.

Signed-off-by: David S. Miller <davem@davemloft.net>
026eb167ae77244458fa4b4b9fc171209c079ba7 03-Mar-2011 Eric Paris <eparis@redhat.com> SELinux: implement the new sb_remount LSM hook

For SELinux we do not allow security information to change during a remount
operation. Thus this hook simply strips the security module options from
the data and verifies that those are the same options as exist on the
current superblock.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
2ad18bdf3b8f84c85c7da7e4de365f7c5701fb3f 02-Mar-2011 Harry Ciao <qingtao.cao@windriver.com> SELinux: Compute SID for the newly created socket

The security context for the newly created socket shares the same
user, role and MLS attribute as its creator but may have a different
type, which could be specified by a type_transition rule in the relevant
policy package.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
[fix call to security_transition_sid to include qstr, Eric Paris]
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
c53fa1ed92cd671a1dfb1e7569e9ab672612ddc6 03-Mar-2011 Patrick McHardy <kaber@trash.net> netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parms

Netlink message processing in the kernel is synchronous these days, the
session information can be collected when needed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
0b24dcb7f2f7a0ce9b762eef0362c21c88f47b32 25-Feb-2011 Eric Paris <eparis@redhat.com> Revert "selinux: simplify ioctl checking"

This reverts commit 242631c49d4cf39642741d6627750151b058233b.

Conflicts:

security/selinux/hooks.c

SELinux used to recognize certain individual ioctls and check
permissions based on the knowledge of the individual ioctl. In commit
242631c49d4cf396 the SELinux code stopped trying to understand
individual ioctls and to instead looked at the ioctl access bits to
determine in we should check read or write for that operation. This
same suggestion was made to SMACK (and I believe copied into TOMOYO).
But this suggestion is total rubbish. The ioctl access bits are
actually the access requirements for the structure being passed into the
ioctl, and are completely unrelated to the operation of the ioctl or the
object the ioctl is being performed upon.

Take FS_IOC_FIEMAP as an example. FS_IOC_FIEMAP is defined as:

FS_IOC_FIEMAP _IOWR('f', 11, struct fiemap)

So it has access bits R and W. What this really means is that the
kernel is going to both read and write to the struct fiemap. It has
nothing at all to do with the operations that this ioctl might perform
on the file itself!

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
4a7ab3dcad0b66a486c468ccf0d6197c5dbe3326 23-Feb-2011 Steffen Klassert <steffen.klassert@secunet.com> selinux: Fix packet forwarding checks on postrouting

The IPSKB_FORWARDED and IP6SKB_FORWARDED flags are used only in the
multicast forwarding case to indicate that a packet looped back after
forward. So these flags are not a good indicator for packet forwarding.
A better indicator is the incoming interface. If we have no socket context,
but an incoming interface and we see the packet in the ip postroute hook,
the packet is going to be forwarded.

With this patch we use the incoming interface as an indicator on packet
forwarding.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
b9679a76187694138099e09d7f5091b73086e6d7 23-Feb-2011 Steffen Klassert <steffen.klassert@secunet.com> selinux: Fix wrong checks for selinux_policycap_netpeer

selinux_sock_rcv_skb_compat and selinux_ip_postroute_compat are just
called if selinux_policycap_netpeer is not set. However in these
functions we check if selinux_policycap_netpeer is set. This leads
to some dead code and to the fact that selinux_xfrm_postroute_last
is never executed. This patch removes the dead code and the checks
for selinux_policycap_netpeer in the compatibility functions.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2edeaa34a6e3f2c43b667f6c4f7b27944b811695 07-Feb-2011 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> CRED: Fix BUG() upon security_cred_alloc_blank() failure

In cred_alloc_blank() since 2.6.32, abort_creds(new) is called with
new->security == NULL and new->magic == 0 when security_cred_alloc_blank()
returns an error. As a result, BUG() will be triggered if SELinux is enabled
or CONFIG_DEBUG_CREDENTIALS=y.

If CONFIG_DEBUG_CREDENTIALS=y, BUG() is called from __invalid_creds() because
cred->magic == 0. Failing that, BUG() is called from selinux_cred_free()
because selinux_cred_free() is not expecting cred->security == NULL. This does
not affect smack_cred_free(), tomoyo_cred_free() or apparmor_cred_free().

Fix these bugs by

(1) Set new->magic before calling security_cred_alloc_blank().

(2) Handle null cred->security in creds_are_invalid() and selinux_cred_free().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8e6c96935fcc1ed3dbebc96fddfef3f2f2395afc 01-Feb-2011 Lucian Adrian Grijincu <lucian.grijincu@gmail.com> security/selinux: fix /proc/sys/ labeling

This fixes an old (2007) selinux regression: filesystem labeling for
/proc/sys returned
-r--r--r-- unknown /proc/sys/fs/file-nr
instead of
-r--r--r-- system_u:object_r:sysctl_fs_t:s0 /proc/sys/fs/file-nr

Events that lead to breaking of /proc/sys/ selinux labeling:

1) sysctl was reimplemented to route all calls through /proc/sys/

commit 77b14db502cb85a031fe8fde6c85d52f3e0acb63
[PATCH] sysctl: reimplement the sysctl proc support

2) proc_dir_entry was removed from ctl_table:

commit 3fbfa98112fc3962c416452a0baf2214381030e6
[PATCH] sysctl: remove the proc_dir_entry member for the sysctl tables

3) selinux still walked the proc_dir_entry tree to apply
labeling. Because ctl_tables don't have a proc_dir_entry, we did
not label /proc/sys/ inodes any more. To achieve this the /proc/sys/
inodes were marked private and private inodes were ignored by
selinux.

commit bbaca6c2e7ef0f663bc31be4dad7cf530f6c4962
[PATCH] selinux: enhance selinux to always ignore private inodes

commit 86a71dbd3e81e8870d0f0e56b87875f57e58222b
[PATCH] sysctl: hide the sysctl proc inodes from selinux

Access control checks have been done by means of a special sysctl hook
that was called for read/write accesses to any /proc/sys/ entry.

We don't have to do this because, instead of walking the
proc_dir_entry tree we can walk the dentry tree (as done in this
patch). With this patch:
* we don't mark /proc/sys/ inodes as private
* we don't need the sysclt security hook
* we walk the dentry tree to find the path to the inode.

We have to strip the PID in /proc/PID/ entries that have a
proc_dir_entry because selinux does not know how to label paths like
'/1/net/rpc/nfsd.fh' (and defaults to 'proc_t' labeling). Selinux does
know of '/net/rpc/nfsd.fh' (and applies the 'sysctl_rpc_t' label).

PID stripping from the path was done implicitly in the previous code
because the proc_dir_entry tree had the root in '/net' in the example
from above. The dentry tree has the root in '/1'.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
652bb9b0d6ce007f37c098947b2cc0c45efa3f66 01-Feb-2011 Eric Paris <eparis@redhat.com> SELinux: Use dentry name in new object labeling

Currently SELinux has rules which label new objects according to 3 criteria.
The label of the process creating the object, the label of the parent
directory, and the type of object (reg, dir, char, block, etc.) This patch
adds a 4th criteria, the dentry name, thus we can distinguish between
creating a file in an etc_t directory called shadow and one called motd.

There is no file globbing, regex parsing, or anything mystical. Either the
policy exactly (strcmp) matches the dentry name of the object or it doesn't.
This patch has no changes from today if policy does not implement the new
rules.

Signed-off-by: Eric Paris <eparis@redhat.com>
2a7dba391e5628ad665ce84ef9a6648da541ebab 01-Feb-2011 Eric Paris <eparis@redhat.com> fs/vfs/security: pass last path component to LSM on inode creation

SELinux would like to implement a new labeling behavior of newly created
inodes. We currently label new inodes based on the parent and the creating
process. This new behavior would also take into account the name of the
new object when deciding the new label. This is not the (supposed) full path,
just the last component of the path.

This is very useful because creating /etc/shadow is different than creating
/etc/passwd but the kernel hooks are unable to differentiate these
operations. We currently require that userspace realize it is doing some
difficult operation like that and than userspace jumps through SELinux hoops
to get things set up correctly. This patch does not implement new
behavior, that is obviously contained in a seperate SELinux patch, but it
does pass the needed name down to the correct LSM hook. If no such name
exists it is fine to pass NULL.

Signed-off-by: Eric Paris <eparis@redhat.com>
3610cda53f247e176bcbb7a7cca64bc53b12acdb 06-Jan-2011 David S. Miller <davem@davemloft.net> af_unix: Avoid socket->sk NULL OOPS in stream connect security hooks.

unix_release() can asynchornously set socket->sk to NULL, and
it does so without holding the unix_state_lock() on "other"
during stream connects.

However, the reverse mapping, sk->sk_socket, is only transitioned
to NULL under the unix_state_lock().

Therefore make the security hooks follow the reverse mapping instead
of the forward mapping.

Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
415103f9932d45f7927f4b17e3a9a13834cdb9a1 02-Dec-2010 Eric Paris <eparis@redhat.com> SELinux: do not compute transition labels on mountpoint labeled filesystems

selinux_inode_init_security computes transitions sids even for filesystems
that use mount point labeling. It shouldn't do that. It should just use
the mount point label always and no matter what.

This causes 2 problems. 1) it makes file creation slower than it needs to be
since we calculate the transition sid and 2) it allows files to be created
with a different label than the mount point!

# id -Z
staff_u:sysadm_r:sysadm_t:s0-s0:c0.c1023
# sesearch --type --class file --source sysadm_t --target tmp_t
Found 1 semantic te rules:
type_transition sysadm_t tmp_t : file user_tmp_t;

# mount -o loop,context="system_u:object_r:tmp_t:s0" /tmp/fs /mnt/tmp

# ls -lZ /mnt/tmp
drwx------. root root system_u:object_r:tmp_t:s0 lost+found
# touch /mnt/tmp/file1
# ls -lZ /mnt/tmp
-rw-r--r--. root root staff_u:object_r:user_tmp_t:s0 file1
drwx------. root root system_u:object_r:tmp_t:s0 lost+found

Whoops, we have a mount point labeled filesystem tmp_t with a user_tmp_t
labeled file!

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Reviewed-by: James Morris <jmorris@namei.org>
2fe66ec242d3f76e3b0101f36419e7e5405bcff3 23-Nov-2010 Eric Paris <eparis@redhat.com> SELinux: indicate fatal error in compat netfilter code

The SELinux ip postroute code indicates when policy rejected a packet and
passes the error back up the stack. The compat code does not. This patch
sends the same kind of error back up the stack in the compat code.

Based-on-patch-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
04f6d70f6e64900a5d70a5fc199dd9d5fa787738 23-Nov-2010 Eric Paris <eparis@redhat.com> SELinux: Only return netlink error when we know the return is fatal

Some of the SELinux netlink code returns a fatal error when the error might
actually be transient. This patch just silently drops packets on
potentially transient errors but continues to return a permanant error
indicator when the denial was because of policy.

Based-on-comments-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1f1aaf82825865a50cef0b4722607abb12aeee52 16-Nov-2010 Eric Paris <eparis@redhat.com> SELinux: return -ECONNREFUSED from ip_postroute to signal fatal error

The SELinux netfilter hooks just return NF_DROP if they drop a packet. We
want to signal that a drop in this hook is a permanant fatal error and is not
transient. If we do this the error will be passed back up the stack in some
places and applications will get a faster interaction that something went
wrong.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12b3052c3ee8f508b2c7ee4ddd63ed03423409d8 16-Nov-2010 Eric Paris <eparis@redhat.com> capabilities/syslog: open code cap_syslog logic to fix build failure

The addition of CONFIG_SECURITY_DMESG_RESTRICT resulted in a build
failure when CONFIG_PRINTK=n. This is because the capabilities code
which used the new option was built even though the variable in question
didn't exist.

The patch here fixes this by moving the capabilities checks out of the
LSM and into the caller. All (known) LSMs should have been calling the
capabilities hook already so it actually makes the code organization
better to eliminate the hook altogether.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2606fd1fa5710205b23ee859563502aa18362447 13-Oct-2010 Eric Paris <eparis@redhat.com> secmark: make secmark object handling generic

Right now secmark has lots of direct selinux calls. Use all LSM calls and
remove all SELinux specific knowledge. The only SELinux specific knowledge
we leave is the mode. The only point is to make sure that other LSMs at
least test this generic code before they assume it works. (They may also
have to make changes if they do not represent labels as strings)

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
b0ae19811375031ae3b3fecc65b702a9c6e5cc28 14-Oct-2010 KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> security: remove unused parameter from security_task_setscheduler()

All security modules shouldn't change sched_param parameter of
security_task_setscheduler(). This is not only meaningless, but also
make a harmful result if caller pass a static variable.

This patch remove policy and sched_param parameter from
security_task_setscheduler() becuase none of security module is
using it.

Cc: James Morris <jmorris@namei.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: James Morris <jmorris@namei.org>
d996b62a8df1d935b01319bf8defb95b5709f7b8 17-Aug-2010 Nick Piggin <npiggin@kernel.dk> tty: fix fu_list abuse

tty: fix fu_list abuse

tty code abuses fu_list, which causes a bug in remount,ro handling.

If a tty device node is opened on a filesystem, then the last link to the inode
removed, the filesystem will be allowed to be remounted readonly. This is
because fs_may_remount_ro does not find the 0 link tty inode on the file sb
list (because the tty code incorrectly removed it to use for its own purpose).
This can result in a filesystem with errors after it is marked "clean".

Taking idea from Christoph's initial patch, allocate a tty private struct
at file->private_data and put our required list fields in there, linking
file and tty. This makes tty nodes behave the same way as other device nodes
and avoid meddling with the vfs, and avoids this bug.

The error handling is not trivial in the tty code, so for this bugfix, I take
the simple approach of using __GFP_NOFAIL and don't worry about memory errors.
This is not a problem because our allocator doesn't fail small allocs as a rule
anyway. So proper error handling is left as an exercise for tty hackers.

[ Arguably filesystem's device inode would ideally be divorced from the
driver's pseudo inode when it is opened, but in practice it's not clear whether
that will ever be worth implementing. ]

Cc: linux-kernel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ee2ffa0dfdd2db19705f2ba1c6a4c0bfe8122dd8 17-Aug-2010 Nick Piggin <npiggin@kernel.dk> fs: cleanup files_lock locking

fs: cleanup files_lock locking

Lock tty_files with a new spinlock, tty_files_lock; provide helpers to
manipulate the per-sb files list; unexport the files_lock spinlock.

Cc: linux-kernel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
49b7b8de46d293113a0a0bb026ff7bd833c73367 23-Jul-2010 Eric Paris <eparis@redhat.com> selinux: place open in the common file perms

kernel can dynamically remap perms. Drop the open lookup table and put open
in the common file perms.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
b782e0a68d17894d9a618ffea55b33639faa6bb4 23-Jul-2010 Eric Paris <eparis@redhat.com> SELinux: special dontaudit for access checks

Currently there are a number of applications (nautilus being the main one) which
calls access() on files in order to determine how they should be displayed. It
is normal and expected that nautilus will want to see if files are executable
or if they are really read/write-able. access() should return the real
permission. SELinux policy checks are done in access() and can result in lots
of AVC denials as policy denies RWX on files which DAC allows. Currently
SELinux must dontaudit actual attempts to read/write/execute a file in
order to silence these messages (and not flood the logs.) But dontaudit rules
like that can hide real attacks. This patch addes a new common file
permission audit_access. This permission is special in that it is meaningless
and should never show up in an allow rule. Instead the only place this
permission has meaning is in a dontaudit rule like so:

dontaudit nautilus_t sbin_t:file audit_access

With such a rule if nautilus just checks access() we will still get denied and
thus userspace will still get the correct answer but we will not log the denial.
If nautilus attempted to actually perform one of the forbidden actions
(rather than just querying access(2) about it) we would still log a denial.
This type of dontaudit rule should be used sparingly, as it could be a
method for an attacker to probe the system permissions without detection.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
d09ca73979460b96d5d4684d588b188be9a1f57d 23-Jul-2010 Eric Paris <eparis@redhat.com> security: make LSMs explicitly mask off permissions

SELinux needs to pass the MAY_ACCESS flag so it can handle auditting
correctly. Presently the masking of MAY_* flags is done in the VFS. In
order to allow LSMs to decide what flags they care about and what flags
they don't just pass them all and the each LSM mask off what they don't
need. This patch should contain no functional changes to either the VFS or
any LSM.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
af4f136056c984b0aa67feed7d3170b958370b2f 01-Jul-2010 Mimi Zohar <zohar@linux.vnet.ibm.com> security: move LSM xattrnames to xattr.h

Make the security extended attributes names global. Updated to move
the remaining Smack xattrs.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
5fb49870e6d48d81d8ca0e1ef979073dc9a820f7 22-Apr-2010 Paul Moore <paul.moore@hp.com> selinux: Use current_security() when possible

There were a number of places using the following code pattern:

struct cred *cred = current_cred();
struct task_security_struct *tsec = cred->security;

... which were simplified to the following:

struct task_security_struct *tsec = current_security();

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
253bfae6e0ad97554799affa0266052968a45808 22-Apr-2010 Paul Moore <paul.moore@hp.com> selinux: Convert socket related access controls to use socket labels

At present, the socket related access controls use a mix of inode and
socket labels; while there should be no practical difference (they
_should_ always be the same), it makes the code more confusing. This
patch attempts to convert all of the socket related access control
points (with the exception of some of the inode/fd based controls) to
use the socket's own label. In the process, I also converted the
socket_has_perm() function to take a 'sock' argument instead of a
'socket' since that was adding a bit more overhead in some cases.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
84914b7ed1c5e0f3199a5a6997022758a70fcaff 22-Apr-2010 Paul Moore <paul.moore@hp.com> selinux: Shuffle the sk_security_struct alloc and free routines

The sk_alloc_security() and sk_free_security() functions were only being
called by the selinux_sk_alloc_security() and selinux_sk_free_security()
functions so we just move the guts of the alloc/free routines to the
callers and eliminate a layer of indirection.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
d4f2d97841827cb876da8b607df05a3dab812416 22-Apr-2010 Paul Moore <paul.moore@hp.com> selinux: Consolidate sockcreate_sid logic

Consolidate the basic sockcreate_sid logic into a single helper function
which allows us to do some cleanups in the related code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
4d1e24514d80cb266231d0c1b6c02161970ad019 22-Apr-2010 Paul Moore <paul.moore@hp.com> selinux: Set the peer label correctly on connected UNIX domain sockets

Correct a problem where we weren't setting the peer label correctly on
the client end of a pair of connected UNIX sockets.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
eb2d55a32b9a91bca0dea299eedb560bafa8b14e 23-Jun-2010 Oleg Nesterov <oleg@redhat.com> rlimits: selinux, do rlimits changes under task_lock

When doing an exec, selinux updates rlimits in its code of current
process depending on current max. Make sure max or cur doesn't change
in the meantime by grabbing task_lock which do_prlimit needs for
changing limits too.

While at it, use rlimit helper for accessing CPU rlimit a line below.
To have a volatile access too.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Oleg Nesterov <oleg@redhat.com>
5ab46b345e418747b3a52f0892680c0745c4223c 28-Aug-2009 Jiri Slaby <jirislaby@gmail.com> rlimits: add task_struct to update_rlimit_cpu

Add task_struct as a parameter to update_rlimit_cpu to be able to set
rlimit_cpu of different task than current.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
8fd00b4d7014b00448eb33cf0590815304769798 26-Aug-2009 Jiri Slaby <jirislaby@gmail.com> rlimits: security, add task_struct to setrlimit

Add task_struct to task_setrlimit of security_operations to be able to set
rlimit of task other than current.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
e8c26255992474a2161c63ce9d385827302e4530 23-Mar-2010 Al Viro <viro@zeniv.linux.org.uk> switch selinux delayed superblock handling to iterate_supers()

... kill their private list, while we are at it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
fcaaade1db63bb2d6f7611d7824eb50d2f07a546 28-Apr-2010 Stephen Smalley <sds@tycho.nsa.gov> selinux: generalize disabling of execmem for plt-in-heap archs

On Tue, 2010-04-27 at 11:47 -0700, David Miller wrote:
> From: "Tom \"spot\" Callaway" <tcallawa@redhat.com>
> Date: Tue, 27 Apr 2010 14:20:21 -0400
>
> > [root@apollo ~]$ cat /proc/2174/maps
> > 00010000-00014000 r-xp 00000000 fd:00 15466577
> > /sbin/mingetty
> > 00022000-00024000 rwxp 00002000 fd:00 15466577
> > /sbin/mingetty
> > 00024000-00046000 rwxp 00000000 00:00 0
> > [heap]
>
> SELINUX probably barfs on the executable heap, the PLT is in the HEAP
> just like powerpc32 and that's why VM_DATA_DEFAULT_FLAGS has to set
> both executable and writable.
>
> You also can't remove the CONFIG_PPC32 ifdefs in selinux, since
> because of the VM_DATA_DEFAULT_FLAGS setting used still in that arch,
> the heap will always have executable permission, just like sparc does.
> You have to support those binaries forever, whether you like it or not.
>
> Let's just replace the CONFIG_PPC32 ifdef in SELINUX with CONFIG_PPC32
> || CONFIG_SPARC as in Tom's original patch and let's be done with
> this.
>
> In fact I would go through all the arch/ header files and check the
> VM_DATA_DEFAULT_FLAGS settings and add the necessary new ifdefs to the
> SELINUX code so that other platforms don't have the pain of having to
> go through this process too.

To avoid maintaining per-arch ifdefs, it seems that we could just
directly use (VM_DATA_DEFAULT_FLAGS & VM_EXEC) as the basis for deciding
whether to enable or disable these checks. VM_DATA_DEFAULT_FLAGS isn't
constant on some architectures but instead depends on
current->personality, but we want this applied uniformly. So we'll just
use the initial task state to determine whether or not to enable these
checks.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: James Morris <jmorris@namei.org>
dd3e7836bfe093fc611f715c323cf53be9252b27 07-Apr-2010 Eric Paris <eparis@redhat.com> selinux: always call sk_security_struct sksec

trying to grep everything that messes with a sk_security_struct isn't easy
since we don't always call it sksec. Just rename everything sksec.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
634a539e16bd7a1ba31c3f832baa725565cc9f96 05-Mar-2010 Stephen Hemminger <shemminger@vyatta.com> selinux: const strings in tables

Several places strings tables are used that should be declared
const.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: James Morris <jmorris@namei.org>
ef57471a73b67a7b65fd8708fd55c77cb7c619af 26-Feb-2010 David Howells <dhowells@redhat.com> SELinux: Make selinux_kernel_create_files_as() shouldn't just always return 0

Make selinux_kernel_create_files_as() return an error when it gets one, rather
than unconditionally returning 0.

Without this, cachefiles doesn't return an error if the SELinux policy doesn't
let it create files with the label of the directory at the base of the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
189b3b1c89761054fee3438f063d7f257306e2d8 23-Feb-2010 wzt.wzt@gmail.com <wzt.wzt@gmail.com> Security: add static to security_ops and default_security_ops variable

Enhance the security framework to support resetting the active security
module. This eliminates the need for direct use of the security_ops and
default_security_ops variables outside of security.c, so make security_ops
and default_security_ops static. Also remove the secondary_ops variable as
a cleanup since there is no use for that. secondary_ops was originally used by
SELinux to call the "secondary" security module (capability or dummy),
but that was replaced by direct calls to capability and the only
remaining use is to save and restore the original security ops pointer
value if SELinux is disabled by early userspace based on /etc/selinux/config.
Further, if we support this directly in the security framework, then we can
just use &default_security_ops for this purpose since that is now available.

Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
d78ca3cd733d8a2c3dcd88471beb1a15d973eed8 04-Feb-2010 Kees Cook <kees.cook@canonical.com> syslog: use defined constants instead of raw numbers

Right now the syslog "type" action are just raw numbers which makes
the source difficult to follow. This patch replaces the raw numbers
with defined constants for some level of sanity.

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
002345925e6c45861f60db6f4fc6236713fd8847 04-Feb-2010 Kees Cook <kees.cook@canonical.com> syslog: distinguish between /proc/kmsg and syscalls

This allows the LSM to distinguish between syslog functions originating
from /proc/kmsg access and direct syscalls. By default, the commoncaps
will now no longer require CAP_SYS_ADMIN to read an opened /proc/kmsg
file descriptor. For example the kernel syslog reader can now drop
privileges after opening /proc/kmsg, instead of staying privileged with
CAP_SYS_ADMIN. MAC systems that implement security_syslog have unchanged
behavior.

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
17740d89785aeb4143770923d67c293849414710 28-Aug-2009 Jiri Slaby <jirislaby@gmail.com> SECURITY: selinux, fix update_rlimit_cpu parameter

Don't pass current RLIMIT_RTTIME to update_rlimit_cpu() in
selinux_bprm_committing_creds, since update_rlimit_cpu expects
RLIMIT_CPU limit.

Use proper rlim[RLIMIT_CPU].rlim_cur instead to fix that.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: David Howells <dhowells@redhat.com>
8964be4a9a5ca8cab1219bb046db2f6d1936227c 21-Nov-2009 Eric Dumazet <eric.dumazet@gmail.com> net: rename skb->iif to skb->skb_iif

To help grep games, rename iif to skb_iif

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dd8dbf2e6880e30c00b18600c962d0cb5a03c555 03-Nov-2009 Eric Paris <eparis@redhat.com> security: report the module name to security_module_request

For SELinux to do better filtering in userspace we send the name of the
module along with the AVC denial when a program is denied module_request.

Example output:

type=SYSCALL msg=audit(11/03/2009 10:59:43.510:9) : arch=x86_64 syscall=write success=yes exit=2 a0=3 a1=7fc28c0d56c0 a2=2 a3=7fffca0d7440 items=0 ppid=1727 pid=1729 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rpc.nfsd exe=/usr/sbin/rpc.nfsd subj=system_u:system_r:nfsd_t:s0 key=(null)
type=AVC msg=audit(11/03/2009 10:59:43.510:9) : avc: denied { module_request } for pid=1729 comm=rpc.nfsd kmod="net-pf-10" scontext=system_u:system_r:nfsd_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=system

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
941fc5b2bf8f7dd1d0a9c502e152fa719ff6578e 01-Oct-2009 Stephen Smalley <sds@tycho.nsa.gov> selinux: drop remapping of netlink classes

Drop remapping of netlink classes and bypass of permission checking
based on netlink message type for policy version < 18. This removes
compatibility code introduced when the original single netlink
security class used for all netlink sockets was split into
finer-grained netlink classes based on netlink protocol and when
permission checking was added based on netlink message type in Linux
2.6.8. The only known distribution that shipped with SELinux and
policy < 18 was Fedora Core 2, which was EOL'd on 2005-04-11.

Given that the remapping code was never updated to address the
addition of newer netlink classes, that the corresponding userland
support was dropped in 2005, and that the assumptions made by the
remapping code about the fixed ordering among netlink classes in the
policy may be violated in the future due to the dynamic class/perm
discovery support, we should drop this compatibility code now.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
af8ff04917169805b151280155bf772d3ca9bec0 21-Sep-2009 Eric Paris <eparis@redhat.com> SELinux: reset the security_ops before flushing the avc cache

This patch resets the security_ops to the secondary_ops before it flushes
the avc. It's still possible that a task on another processor could have
already passed the security_ops dereference and be executing an selinux hook
function which would add a new avc entry. That entry would still not be
freed. This should however help to reduce the number of needless avcs the
kernel has when selinux is disabled at run time. There is no wasted
memory if selinux is disabled on the command line or not compiled.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
0b7570e77f7c3abd43107dabc47ea89daf9a1cba 24-Sep-2009 Oleg Nesterov <oleg@redhat.com> do_wait() wakeup optimization: change __wake_up_parent() to use filtered wakeup

Ratan Nalumasu reported that in a process with many threads doing
unnecessary wakeups. Every waiting thread in the process wakes up to loop
through the children and see that the only ones it cares about are still
not ready.

Now that we have struct wait_opts we can change do_wait/__wake_up_parent
to use filtered wakeups.

We can make child_wait_callback() more clever later, right now it only
checks eligible_child().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ratan Nalumasu <rnalumasu@gmail.com>
Cc: Vitaly Mayatskikh <vmayatsk@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ddd29ec6597125c830f7badb608a86c98b936b64 09-Sep-2009 David P. Quigley <dpquigl@tycho.nsa.gov> sysfs: Add labeling support for sysfs

This patch adds a setxattr handler to the file, directory, and symlink
inode_operations structures for sysfs. The patch uses hooks introduced in the
previous patch to handle the getting and setting of security information for
the sysfs inodes. As was suggested by Eric Biederman the struct iattr in the
sysfs_dirent structure has been replaced by a structure which contains the
iattr, secdata and secdata length to allow the changes to persist in the event
that the inode representing the sysfs_dirent is evicted. Because sysfs only
stores this information when a change is made all the optional data is moved
into one dynamically allocated field.

This patch addresses an issue where SELinux was denying virtd access to the PCI
configuration entries in sysfs. The lack of setxattr handlers for sysfs
required that a single label be assigned to all entries in sysfs. Granting virtd
access to every entry in sysfs is not an acceptable solution so fine grained
labeling of sysfs is required such that individual entries can be labeled
appropriately.

[sds: Fixed compile-time warnings, coding style, and setting of inode security init flags.]

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
1ee65e37e904b959c24404139f5752edc66319d5 03-Sep-2009 David P. Quigley <dpquigl@tycho.nsa.gov> LSM/SELinux: inode_{get,set,notify}secctx hooks to access LSM security context information.

This patch introduces three new hooks. The inode_getsecctx hook is used to get
all relevant information from an LSM about an inode. The inode_setsecctx is
used to set both the in-core and on-disk state for the inode based on a context
derived from inode_getsecctx.The final hook inode_notifysecctx will notify the
LSM of a change for the in-core state of the inode in question. These hooks are
for use in the labeled NFS code and addresses concerns of how to set security
on an inode in a multi-xattr LSM. For historical reasons Stephen Smalley's
explanation of the reason for these hooks is pasted below.

Quote Stephen Smalley

inode_setsecctx: Change the security context of an inode. Updates the
in core security context managed by the security module and invokes the
fs code as needed (via __vfs_setxattr_noperm) to update any backing
xattrs that represent the context. Example usage: NFS server invokes
this hook to change the security context in its incore inode and on the
backing file system to a value provided by the client on a SETATTR
operation.

inode_notifysecctx: Notify the security module of what the security
context of an inode should be. Initializes the incore security context
managed by the security module for this inode. Example usage: NFS
client invokes this hook to initialize the security context in its
incore inode to the value provided by the server for the file when the
server returned the file's attributes to the client.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
ee18d64c1f632043a02e6f5ba5e045bb26a5465f 02-Sep-2009 David Howells <dhowells@redhat.com> KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]

Add a keyctl to install a process's session keyring onto its parent. This
replaces the parent's session keyring. Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again. Normally this
will be after a wait*() syscall.

To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.

The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.

Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME. This allows the
replacement to be performed at the point the parent process resumes userspace
execution.

This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership. However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.

This can be tested with the following program:

#include <stdio.h>
#include <stdlib.h>
#include <keyutils.h>

#define KEYCTL_SESSION_TO_PARENT 18

#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)

int main(int argc, char **argv)
{
key_serial_t keyring, key;
long ret;

keyring = keyctl_join_session_keyring(argv[1]);
OSERROR(keyring, "keyctl_join_session_keyring");

key = add_key("user", "a", "b", 1, keyring);
OSERROR(key, "add_key");

ret = keyctl(KEYCTL_SESSION_TO_PARENT);
OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");

return 0;
}

Compiled and linked with -lkeyutils, you should see something like:

[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
355907932 --alswrv 4043 -1 \_ keyring: _uid.4043
[dhowells@andromeda ~]$ /tmp/newpag
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
1055658746 --alswrv 4043 4043 \_ user: a
[dhowells@andromeda ~]$ /tmp/newpag hello
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: hello
340417692 --alswrv 4043 4043 \_ user: a

Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
e0e817392b9acf2c98d3be80c233dddb1b52003d 02-Sep-2009 David Howells <dhowells@redhat.com> CRED: Add some configurable debugging [try #6]

Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
for credential management. The additional code keeps track of the number of
pointers from task_structs to any given cred struct, and checks to see that
this number never exceeds the usage count of the cred struct (which includes
all references, not just those from task_structs).

Furthermore, if SELinux is enabled, the code also checks that the security
pointer in the cred struct is never seen to be invalid.

This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
kernel thread on seeing cred->security be a NULL pointer (it appears that the
credential struct has been previously released):

http://www.kerneloops.org/oops.php?number=252883

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
ed6d76e4c32de0c2ad5f1d572b948ef49e465176 29-Aug-2009 Paul Moore <paul.moore@hp.com> selinux: Support for the new TUN LSM hooks

Add support for the new TUN LSM hooks: security_tun_dev_create(),
security_tun_dev_post_create() and security_tun_dev_attach(). This includes
the addition of a new object class, tun_socket, which represents the socks
associated with TUN devices. The _tun_dev_create() and _tun_dev_post_create()
hooks are fairly similar to the standard socket functions but _tun_dev_attach()
is a bit special. The _tun_dev_attach() is unique because it involves a
domain attaching to an existing TUN device and its associated tun_socket
object, an operation which does not exist with standard sockets and most
closely resembles a relabel operation.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@namei.org>
bc6a6008e5e3c7a30191a7f19ab19e85b14b1705 21-Aug-2009 Amerigo Wang <amwang@redhat.com> selinux: adjust rules for ATTR_FORCE

As suggested by OGAWA Hirofumi in thread:
http://lkml.org/lkml/2009/8/7/132, we should let selinux_inode_setattr()
to match our ATTR_* rules. ATTR_FORCE should not force things like
ATTR_SIZE.

[hirofumi@mail.parknet.co.jp: tweaks]
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
788084aba2ab7348257597496befcbccabdc98a3 31-Jul-2009 Eric Paris <eparis@redhat.com> Security/SELinux: seperate lsm specific mmap_min_addr

Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable. This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
8cf948e744e0218af604c32edecde10006dc8e9e 31-Jul-2009 Eric Paris <eparis@redhat.com> SELinux: call cap_file_mmap in selinux_file_mmap

Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook. This
means there is no DAC check on the ability to mmap low addresses in the
memory space. This function adds the DAC check for CAP_SYS_RAWIO while
maintaining the selinux check on mmap_zero. This means that processes
which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
NOT need the SELinux sys_rawio capability.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2bf49690325b62480a42f7afed5e9f164173c570 14-Jul-2009 Thomas Liu <tliu@redhat.com> SELinux: Convert avc_audit to use lsm_audit.h

Convert avc_audit in security/selinux/avc.c to use lsm_audit.h,
for better maintainability.

- changed selinux to use common_audit_data instead of
avc_audit_data
- eliminated code in avc.c and used code from lsm_audit.h instead.

Had to add a LSM_AUDIT_NO_AUDIT to lsm_audit.h so that avc_audit
can call common_lsm_audit and do the pre and post callbacks without
doing the actual dump. This makes it so that the patched version
behaves the same way as the unpatched version.

Also added a denied field to the selinux_audit_data private space,
once again to make it so that the patched version behaves like the
unpatched.

I've tested and confirmed that AVCs look the same before and after
this patch.

Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
25354c4fee169710fd9da15f3bb2abaa24dcf933 13-Aug-2009 Eric Paris <eparis@redhat.com> SELinux: add selinux_kernel_module_request

This patch adds a new selinux hook so SELinux can arbitrate if a given
process should be allowed to trigger a request for the kernel to try to
load a module. This is a different operation than a process trying to load
a module itself, which is already protected by CAP_SYS_MODULE.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
314dabb83a547ec4da819e8cbc78fac9cec605cd 10-Aug-2009 James Morris <jmorris@namei.org> SELinux: fix memory leakage in /security/selinux/hooks.c

Fix memory leakage in /security/selinux/hooks.c

The buffer always needs to be freed here; we either error
out or allocate more memory.

Reported-by: iceberg <strakh@ispras.ru>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
a2551df7ec568d87793d2eea4ca744e86318f205 31-Jul-2009 Eric Paris <eparis@redhat.com> Security/SELinux: seperate lsm specific mmap_min_addr

Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable. This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
84336d1a77ccd2c06a730ddd38e695c2324a7386 31-Jul-2009 Eric Paris <eparis@redhat.com> SELinux: call cap_file_mmap in selinux_file_mmap

Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook. This
means there is no DAC check on the ability to mmap low addresses in the
memory space. This function adds the DAC check for CAP_SYS_RAWIO while
maintaining the selinux check on mmap_zero. This means that processes
which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
NOT need the SELinux sys_rawio capability.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
5bb459bb45d1ad3c177485dcf0af01580aa31125 10-Jul-2009 Oleg Nesterov <oleg@redhat.com> kernel: rename is_single_threaded(task) to current_is_single_threaded(void)

- is_single_threaded(task) is not safe unless task == current,
we can't use task->signal or task->mm.

- it doesn't make sense unless task == current, the task can
fork right after the check.

Rename it to current_is_single_threaded() and kill the argument.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
be940d6279c30a2d7c4e8d1d5435f957f594d66d 13-Jul-2009 James Morris <jmorris@namei.org> Revert "SELinux: Convert avc_audit to use lsm_audit.h"

This reverts commit 8113a8d80f4c6a3dc3724b39b470f3fee9c426b6.

The patch causes a stack overflow on my system during boot.

Signed-off-by: James Morris <jmorris@namei.org>
8113a8d80f4c6a3dc3724b39b470f3fee9c426b6 10-Jul-2009 Thomas Liu <tliu@redhat.com> SELinux: Convert avc_audit to use lsm_audit.h

Convert avc_audit in security/selinux/avc.c to use lsm_audit.h,
for better maintainability and for less code duplication.

- changed selinux to use common_audit_data instead of
avc_audit_data
- eliminated code in avc.c and used code from lsm_audit.h instead.

I have tested to make sure that the avcs look the same before and
after this patch.

Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
89c86576ecde504da1eeb4f4882b2189ac2f9c4a 24-Jun-2009 Thomas Liu <tliu@redhat.com> selinux: clean up avc node cache when disabling selinux

Added a call to free the avc_node_cache when inside selinux_disable because
it should not waste resources allocated during avc_init if SELinux is disabled
and the cache will never be used.

Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
9e48858f7d36a6a3849f1d1b40c3bf5624b4ee7c 07-May-2009 Ingo Molnar <mingo@elte.hu> security: rename ptrace_may_access => ptrace_access_check

The ->ptrace_may_access() methods are named confusingly - the real
ptrace_may_access() returns a bool, while these security checks have
a retval convention.

Rename it to ptrace_access_check, to reduce the confusion factor.

[ Impact: cleanup, no code changed ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: James Morris <jmorris@namei.org>
20dda18be9035c487c2e9534e4d18d2a1e1deade 22-Jun-2009 Stephen Smalley <sds@tycho.nsa.gov> selinux: restore optimization to selinux_file_permission

Restore the optimization to skip revalidation in selinux_file_permission
if nothing has changed since the dentry_open checks, accidentally removed by
389fb800. Also remove redundant test from selinux_revalidate_file_permission.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
adf30907d63893e4208dfe3f5c88ae12bc2f25d5 02-Jun-2009 Eric Dumazet <eric.dumazet@gmail.com> net: skb->dst accessors

Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
65c90bca0dba56f60dc4ce2a529140c3cc440f22 04-May-2009 Stephen Smalley <sds@tycho.nsa.gov> selinux: Fix send_sigiotask hook

The CRED patch incorrectly converted the SELinux send_sigiotask hook to
use the current task SID rather than the target task SID in its
permission check, yielding the wrong permission check. This fixes the
hook function. Detected by the ltp selinux testsuite and confirmed to
correct the test failure.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
ecd6de3c88e8cbcad175b2eab48ba05c2014f7b6 29-Apr-2009 Oleg Nesterov <oleg@redhat.com> selinux: selinux_bprm_committed_creds() should wake up ->real_parent, not ->parent.

We shouldn't worry about the tracer if current is ptraced, exec() must not
succeed if the tracer has no rights to trace this task after cred changing.
But we should notify ->real_parent which is, well, real parent.

Also, we don't need _irq to take tasklist, and we don't need parent's
->siglock to wake_up_interruptible(real_parent->signal->wait_chldexit).
Since we hold tasklist, real_parent->signal must be stable. Otherwise
spin_lock(siglock) is not safe too and can't help anyway.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
3bcac0263f0b45e67a64034ebcb69eb9abb742f4 29-Apr-2009 David Howells <dhowells@redhat.com> SELinux: Don't flush inherited SIGKILL during execve()

Don't flush inherited SIGKILL during execve() in SELinux's post cred commit
hook. This isn't really a security problem: if the SIGKILL came before the
credentials were changed, then we were right to receive it at the time, and
should honour it; if it came after the creds were changed, then we definitely
should honour it; and in any case, all that will happen is that the process
will be scrapped before it ever returns to userspace.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
88c48db9788862d0290831d081bc3c64e13b592f 29-Apr-2009 Eric Paris <eparis@redhat.com> SELinux: drop secondary_ops->sysctl

We are still calling secondary_ops->sysctl even though the capabilities
module does not define a sysctl operation.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
58bfbb51ff2b0fdc6c732ff3d72f50aa632b67a2 27-Mar-2009 Paul Moore <paul.moore@hp.com> selinux: Remove the "compat_net" compatibility code

The SELinux "compat_net" is marked as deprecated, the time has come to
finally remove it from the kernel. Further code simplifications are
likely in the future, but this patch was intended to be a simple,
straight-up removal of the compat_net code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
389fb800ac8be2832efedd19978a2b8ced37eb61 27-Mar-2009 Paul Moore <paul.moore@hp.com> netlabel: Label incoming TCP connections correctly in SELinux

The current NetLabel/SELinux behavior for incoming TCP connections works but
only through a series of happy coincidences that rely on the limited nature of
standard CIPSO (only able to convey MLS attributes) and the write equality
imposed by the SELinux MLS constraints. The problem is that network sockets
created as the result of an incoming TCP connection were not on-the-wire
labeled based on the security attributes of the parent socket but rather based
on the wire label of the remote peer. The issue had to do with how IP options
were managed as part of the network stack and where the LSM hooks were in
relation to the code which set the IP options on these newly created child
sockets. While NetLabel/SELinux did correctly set the socket's on-the-wire
label it was promptly cleared by the network stack and reset based on the IP
options of the remote peer.

This patch, in conjunction with a prior patch that adjusted the LSM hook
locations, works to set the correct on-the-wire label format for new incoming
connections through the security_inet_conn_request() hook. Besides the
correct behavior there are many advantages to this change, the most significant
is that all of the NetLabel socket labeling code in SELinux now lives in hooks
which can return error codes to the core stack which allows us to finally get
ride of the selinux_netlbl_inode_permission() logic which greatly simplfies
the NetLabel/SELinux glue code. In the process of developing this patch I
also ran into a small handful of AF_INET6 cleanliness issues that have been
fixed which should make the code safer and easier to extend in the future.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
df7f54c012b92ec93d56b68547351dcdf8a163d3 09-Mar-2009 Eric Paris <eparis@redhat.com> SELinux: inode_doinit_with_dentry drop no dentry printk

Drop the printk message when an inode is found without an associated
dentry. This should only happen when userspace can't be accessing those
inodes and those labels will get set correctly on the next d_instantiate.
Thus there is no reason to send this message.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
6a25b27d602aac24f3c642722377ba5d778417ec 05-Mar-2009 Eric Paris <eparis@redhat.com> SELinux: open perm for sock files

When I did open permissions I didn't think any sockets would have an open.
Turns out AF_UNIX sockets can have an open when they are bound to the
filesystem namespace. This patch adds a new SOCK_FILE__OPEN permission.
It's safe to add this as the open perms are already predicated on
capabilities and capabilities means we have unknown perm handling so
systems should be as backwards compatible as the policy wants them to
be.

https://bugzilla.redhat.com/show_bug.cgi?id=475224

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
4cb912f1d1447077160ace9ce3b3a10696dd74e5 12-Feb-2009 Eric Paris <eparis@redhat.com> SELinux: NULL terminate al contexts from disk

When a context is pulled in from disk we don't know that it is null
terminated. This patch forecebly null terminates contexts when we pull
them from disk.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
4ba0a8ad63e12a03ae01c039482967cc496b9174 12-Feb-2009 Eric Paris <eparis@redhat.com> SELinux: better printk when file with invalid label found

Currently when an inode is read into the kernel with an invalid label
string (can often happen with removable media) we output a string like:

SELinux: inode_doinit_with_dentry: context_to_sid([SOME INVALID LABEL])
returned -22 dor dev=[blah] ino=[blah]

Which is all but incomprehensible to all but a couple of us. Instead, on
EINVAL only, I plan to output a much more user friendly string and I plan to
ratelimit the printk since many of these could be generated very rapidly.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
200ac532a4bc3134147ca06686c56a6420e66c46 12-Feb-2009 Eric Paris <eparis@redhat.com> SELinux: call capabilities code directory

For cleanliness and efficiency remove all calls to secondary-> and instead
call capabilities code directly. capabilities are the only module that
selinux stacks with and so the code should not indicate that other stacking
might be possible.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
5626d3e86141390c8efc7bcb929b6a4b58b00480 30-Jan-2009 James Morris <jmorris@namei.org> selinux: remove hooks which simply defer to capabilities

Remove SELinux hooks which do nothing except defer to the capabilites
hooks (or in one case, replicates the function).

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
95c14904b6f6f8a35365f0c58d530c85b4fb96b4 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to shm_shmat

Remove secondary ops call to shm_shmat, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
5c4054ccfafb6a446e9b65c524af1741656c6c60 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to unix_stream_connect

Remove secondary ops call to unix_stream_connect, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2cbbd19812b7636c1c37bcf50c403e7af5278d73 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to task_kill

Remove secondary ops call to task_kill, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
ef76e748faa823a738d632ee4c8ed9adaabc8a40 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to task_setrlimit

Remove secondary ops call to task_setrlimit, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
ca5143d3ff3c7a4e1c2c8bdcf0f53aea227a7722 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove unused cred_commit hook

Remove unused cred_commit hook from SELinux. This
currently calls into the capabilities hook, which is a noop.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
af294e41d0c95a291cc821a1b43ec2cd13976a8b 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to task_create

Remove secondary ops call to task_create, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
d541bbee6902d5ffb8a03d63ac8f4b1364c2ff93 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to file_mprotect

Remove secondary ops call to file_mprotect, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
438add6b32d9295db6e3ecd4d9e137086ec5b5d9 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to inode_setattr

Remove secondary ops call to inode_setattr, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
188fbcca9dd02f15dcf45cfc51ce0dd6c13993f6 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to inode_permission

Remove secondary ops call to inode_permission, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
f51115b9ab5b9cfd0b7be1cce75afbf3ffbcdd87 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to inode_follow_link

Remove secondary ops call to inode_follow_link, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
dd4907a6d4e038dc65839fcd4030ebefe2f5f439 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to inode_mknod

Remove secondary ops call to inode_mknod, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
e4737250b751b4e0e802adae9a4d3ae0227b580b 28-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to inode_unlink

Remove secondary ops call to inode_unlink, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
efdfac437607e4acfed66c383091a376525eaec4 29-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to inode_link

Remove secondary ops call to inode_link, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
97422ab9ef45118cb7418d799dc69040f17108ce 29-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to sb_umount

Remove secondary ops call to sb_umount, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
ef935b9136eeaa203f75bf0b4d6e398c29f44d27 29-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to sb_mount

Remove secondary ops call to sb_mount, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
5565b0b865f672e3d7e31936ad1d40710ab7bfc4 29-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to bprm_committed_creds

Remove secondary ops call to bprm_committed_creds, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2ec5dbe23d68bddc043a85d1226bfc499a724b1c 29-Jan-2009 James Morris <jmorris@namei.org> selinux: remove secondary ops call to bprm_committing_creds

Remove secondary ops call to bprm_committing_creds, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
bc05595845f58c065adc0763a678187647ec040f 29-Jan-2009 James Morris <jmorris@namei.org> selinux: remove unused bprm_check_security hook

Remove unused bprm_check_security hook from SELinux. This
currently calls into the capabilities hook, which is a noop.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
cd89596f0ccfa3ccb8a81ce47782231cf7ea7296 16-Jan-2009 David P. Quigley <dpquigl@tycho.nsa.gov> SELinux: Unify context mount and genfs behavior

Context mounts and genfs labeled file systems behave differently with respect to
setting file system labels. This patch brings genfs labeled file systems in line
with context mounts in that setxattr calls to them should return EOPNOTSUPP and
fscreate calls will be ignored.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@macbook.localdomain>
11689d47f0957121920c9ec646eb5d838755853a 16-Jan-2009 David P. Quigley <dpquigl@tycho.nsa.gov> SELinux: Add new security mount option to indicate security label support.

There is no easy way to tell if a file system supports SELinux security labeling.
Because of this a new flag is being added to the super block security structure
to indicate that the particular super block supports labeling. This flag is set
for file systems using the xattr, task, and transition labeling methods unless
that behavior is overridden by context mounts.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@macbook.localdomain>
0d90a7ec48c704025307b129413bc62451b20ab3 16-Jan-2009 David P. Quigley <dpquigl@tycho.nsa.gov> SELinux: Condense super block security structure flags and cleanup necessary code.

The super block security structure currently has three fields for what are
essentially flags. The flags field is used for mount options while two other
char fields are used for initialization and proc flags. These latter two fields are
essentially bit fields since the only used values are 0 and 1. These fields
have been collapsed into the flags field and new bit masks have been added for
them. The code is also fixed to work with these new flags.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@macbook.localdomain>
3699c53c485bf0168e6500d0ed18bf931584dd7c 06-Jan-2009 David Howells <dhowells@redhat.com> CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3]

Fix a regression in cap_capable() due to:

commit 3b11a1decef07c19443d24ae926982bc8ec9f4c0
Author: David Howells <dhowells@redhat.com>
Date: Fri Nov 14 10:39:26 2008 +1100

CRED: Differentiate objective and effective subjective credentials on a task

The problem is that the above patch allows a process to have two sets of
credentials, and for the most part uses the subjective credentials when
accessing current's creds.

There is, however, one exception: cap_capable(), and thus capable(), uses the
real/objective credentials of the target task, whether or not it is the current
task.

Ordinarily this doesn't matter, since usually the two cred pointers in current
point to the same set of creds. However, sys_faccessat() makes use of this
facility to override the credentials of the calling process to make its test,
without affecting the creds as seen from other processes.

One of the things sys_faccessat() does is to make an adjustment to the
effective capabilities mask, which cap_capable(), as it stands, then ignores.

The affected capability check is in generic_permission():

if (!(mask & MAY_EXEC) || execute_ok(inode))
if (capable(CAP_DAC_OVERRIDE))
return 0;

This change passes the set of credentials to be tested down into the commoncap
and SELinux code. The security functions called by capable() and
has_capability() select the appropriate set of credentials from the process
being checked.

This can be tested by compiling the following program from the XFS testsuite:

/*
* t_access_root.c - trivial test program to show permission bug.
*
* Written by Michael Kerrisk - copyright ownership not pursued.
* Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
*/
#include <limits.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>

#define UID 500
#define GID 100
#define PERM 0
#define TESTPATH "/tmp/t_access"

static void
errExit(char *msg)
{
perror(msg);
exit(EXIT_FAILURE);
} /* errExit */

static void
accessTest(char *file, int mask, char *mstr)
{
printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
} /* accessTest */

int
main(int argc, char *argv[])
{
int fd, perm, uid, gid;
char *testpath;
char cmd[PATH_MAX + 20];

testpath = (argc > 1) ? argv[1] : TESTPATH;
perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
uid = (argc > 3) ? atoi(argv[3]) : UID;
gid = (argc > 4) ? atoi(argv[4]) : GID;

unlink(testpath);

fd = open(testpath, O_RDWR | O_CREAT, 0);
if (fd == -1) errExit("open");

if (fchown(fd, uid, gid) == -1) errExit("fchown");
if (fchmod(fd, perm) == -1) errExit("fchmod");
close(fd);

snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
system(cmd);

if (seteuid(uid) == -1) errExit("seteuid");

accessTest(testpath, 0, "0");
accessTest(testpath, R_OK, "R_OK");
accessTest(testpath, W_OK, "W_OK");
accessTest(testpath, X_OK, "X_OK");
accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");

exit(EXIT_SUCCESS);
} /* main */

This can be run against an Ext3 filesystem as well as against an XFS
filesystem. If successful, it will show:

[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
access(/tmp/xxx, 0) returns 0
access(/tmp/xxx, R_OK) returns 0
access(/tmp/xxx, W_OK) returns 0
access(/tmp/xxx, X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK) returns 0
access(/tmp/xxx, R_OK | X_OK) returns -1
access(/tmp/xxx, W_OK | X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

If unsuccessful, it will show:

[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
access(/tmp/xxx, 0) returns 0
access(/tmp/xxx, R_OK) returns -1
access(/tmp/xxx, W_OK) returns -1
access(/tmp/xxx, X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK) returns -1
access(/tmp/xxx, R_OK | X_OK) returns -1
access(/tmp/xxx, W_OK | X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

I've also tested the fix with the SELinux and syscalls LTP testsuites.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
29881c4502ba05f46bc12ae8053d4e08d7e2615c 06-Jan-2009 James Morris <jmorris@namei.org> Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]"

This reverts commit 14eaddc967b16017d4a1a24d2be6c28ecbe06ed8.

David has a better version to come.
14eaddc967b16017d4a1a24d2be6c28ecbe06ed8 31-Dec-2008 David Howells <dhowells@redhat.com> CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]

Fix a regression in cap_capable() due to:

commit 5ff7711e635b32f0a1e558227d030c7e45b4a465
Author: David Howells <dhowells@redhat.com>
Date: Wed Dec 31 02:52:28 2008 +0000

CRED: Differentiate objective and effective subjective credentials on a task

The problem is that the above patch allows a process to have two sets of
credentials, and for the most part uses the subjective credentials when
accessing current's creds.

There is, however, one exception: cap_capable(), and thus capable(), uses the
real/objective credentials of the target task, whether or not it is the current
task.

Ordinarily this doesn't matter, since usually the two cred pointers in current
point to the same set of creds. However, sys_faccessat() makes use of this
facility to override the credentials of the calling process to make its test,
without affecting the creds as seen from other processes.

One of the things sys_faccessat() does is to make an adjustment to the
effective capabilities mask, which cap_capable(), as it stands, then ignores.

The affected capability check is in generic_permission():

if (!(mask & MAY_EXEC) || execute_ok(inode))
if (capable(CAP_DAC_OVERRIDE))
return 0;

This change splits capable() from has_capability() down into the commoncap and
SELinux code. The capable() security op now only deals with the current
process, and uses the current process's subjective creds. A new security op -
task_capable() - is introduced that can check any task's objective creds.

strictly the capable() security op is superfluous with the presence of the
task_capable() op, however it should be faster to call the capable() op since
two fewer arguments need be passed down through the various layers.

This can be tested by compiling the following program from the XFS testsuite:

/*
* t_access_root.c - trivial test program to show permission bug.
*
* Written by Michael Kerrisk - copyright ownership not pursued.
* Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
*/
#include <limits.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>

#define UID 500
#define GID 100
#define PERM 0
#define TESTPATH "/tmp/t_access"

static void
errExit(char *msg)
{
perror(msg);
exit(EXIT_FAILURE);
} /* errExit */

static void
accessTest(char *file, int mask, char *mstr)
{
printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
} /* accessTest */

int
main(int argc, char *argv[])
{
int fd, perm, uid, gid;
char *testpath;
char cmd[PATH_MAX + 20];

testpath = (argc > 1) ? argv[1] : TESTPATH;
perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
uid = (argc > 3) ? atoi(argv[3]) : UID;
gid = (argc > 4) ? atoi(argv[4]) : GID;

unlink(testpath);

fd = open(testpath, O_RDWR | O_CREAT, 0);
if (fd == -1) errExit("open");

if (fchown(fd, uid, gid) == -1) errExit("fchown");
if (fchmod(fd, perm) == -1) errExit("fchmod");
close(fd);

snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
system(cmd);

if (seteuid(uid) == -1) errExit("seteuid");

accessTest(testpath, 0, "0");
accessTest(testpath, R_OK, "R_OK");
accessTest(testpath, W_OK, "W_OK");
accessTest(testpath, X_OK, "X_OK");
accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");

exit(EXIT_SUCCESS);
} /* main */

This can be run against an Ext3 filesystem as well as against an XFS
filesystem. If successful, it will show:

[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
access(/tmp/xxx, 0) returns 0
access(/tmp/xxx, R_OK) returns 0
access(/tmp/xxx, W_OK) returns 0
access(/tmp/xxx, X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK) returns 0
access(/tmp/xxx, R_OK | X_OK) returns -1
access(/tmp/xxx, W_OK | X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

If unsuccessful, it will show:

[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
access(/tmp/xxx, 0) returns 0
access(/tmp/xxx, R_OK) returns -1
access(/tmp/xxx, W_OK) returns -1
access(/tmp/xxx, X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK) returns -1
access(/tmp/xxx, R_OK | X_OK) returns -1
access(/tmp/xxx, W_OK | X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

I've also tested the fix with the SELinux and syscalls LTP testsuites.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
277d342fc423fca5e66e677fe629d1b2f8f1b9e2 31-Dec-2008 Paul Moore <paul.moore@hp.com> selinux: Deprecate and schedule the removal of the the compat_net functionality

This patch is the first step towards removing the old "compat_net" code from
the kernel. Secmark, the "compat_net" replacement was first introduced in
2.6.18 (September 2006) and the major Linux distributions with SELinux support
have transitioned to Secmark so it is time to start deprecating the "compat_net"
mechanism. Testing a patched version of 2.6.28-rc6 with the initial release of
Fedora Core 5 did not show any problems when running in enforcing mode.

This patch adds an entry to the feature-removal-schedule.txt file and removes
the SECURITY_SELINUX_ENABLE_SECMARK_DEFAULT configuration option, forcing
Secmark on by default although it can still be disabled at runtime. The patch
also makes the Secmark permission checks "dynamic" in the sense that they are
only executed when Secmark is configured; this should help prevent problems
with older distributions that have not yet migrated to Secmark.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
74192246910ff4fb95309ba1a683215644beeb62 19-Dec-2008 James Morris <jmorris@namei.org> SELinux: don't check permissions for kernel mounts

Don't bother checking permissions when the kernel performs an
internal mount, as this should always be allowed.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
12204e24b1330428c3062faee10a0d80b8a5cb61 19-Dec-2008 James Morris <jmorris@namei.org> security: pass mount flags to security_sb_kern_mount()

Pass mount flags to security_sb_kern_mount(), so security modules
can determine if a mount operation is being performed by the kernel.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
459c19f524a9d89c65717a7d061d5f11ecf6bcb8 05-Dec-2008 Stephen Smalley <sds@tycho.nsa.gov> SELinux: correctly detect proc filesystems of the form "proc/foo"

Map all of these proc/ filesystem types to "proc" for the policy lookup at
filesystem mount time.

Signed-off-by: James Morris <jmorris@namei.org>
3a3b7ce9336952ea7b9564d976d068a238976c9d 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Allow kernel services to override LSM settings for task actions

Allow kernel services to override LSM settings appropriate to the actions
performed by a task by duplicating a set of credentials, modifying it and then
using task_struct::cred to point to it when performing operations on behalf of
a task.

This is used, for example, by CacheFiles which has to transparently access the
cache on behalf of a process that thinks it is doing, say, NFS accesses with a
potentially inappropriate (with respect to accessing the cache) set of
credentials.

This patch provides two LSM hooks for modifying a task security record:

(*) security_kernel_act_as() which allows modification of the security datum
with which a task acts on other objects (most notably files).

(*) security_kernel_create_files_as() which allows modification of the
security datum that is used to initialise the security data on a file that
a task creates.

The patch also provides four new credentials handling functions, which wrap the
LSM functions:

(1) prepare_kernel_cred()

Prepare a set of credentials for a kernel service to use, based either on
a daemon's credentials or on init_cred. All the keyrings are cleared.

(2) set_security_override()

Set the LSM security ID in a set of credentials to a specific security
context, assuming permission from the LSM policy.

(3) set_security_override_from_ctx()

As (2), but takes the security context as a string.

(4) set_create_files_as()

Set the file creation LSM security ID in a set of credentials to be the
same as that on a particular inode.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> [Smack changes]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
3b11a1decef07c19443d24ae926982bc8ec9f4c0 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Differentiate objective and effective subjective credentials on a task

Differentiate the objective and real subjective credentials from the effective
subjective credentials on a task by introducing a second credentials pointer
into the task_struct.

task_struct::real_cred then refers to the objective and apparent real
subjective credentials of a task, as perceived by the other tasks in the
system.

task_struct::cred then refers to the effective subjective credentials of a
task, as used by that task when it's actually running. These are not visible
to the other tasks in the system.

__task_cred(task) then refers to the objective/real credentials of the task in
question.

current_cred() refers to the effective subjective credentials of the current
task.

prepare_creds() uses the objective creds as a base and commit_creds() changes
both pointers in the task_struct (indeed commit_creds() requires them to be the
same).

override_creds() and revert_creds() change the subjective creds pointer only,
and the former returns the old subjective creds. These are used by NFSD,
faccessat() and do_coredump(), and will by used by CacheFiles.

In SELinux, current_has_perm() is provided as an alternative to
task_has_perm(). This uses the effective subjective context of current,
whereas task_has_perm() uses the objective/real context of the subject.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Make execve() take advantage of copy-on-write credentials

Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

(1) execve().

The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.

I would like to replace bprm->cap_effective with:

cap_isclear(bprm->cap_effective)

but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).

The following sequence of events now happens:

(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.

(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.

This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.

(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.

(c) prepare_binprm() is called, possibly multiple times.

(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.

(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.

This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.

(iii) bprm->cred_prepared is set to 1.

bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.

(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:

(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().

(ii) Clear any bits in current->personality that were deferred from
(c.i).

(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:

(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.

This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).

(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.

(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.

(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.

(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.

(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.

(2) LSM interface.

A number of functions have been changed, added or removed:

(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()

Removed in favour of preparing new credentials and modifying those.

(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()

Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().

(*) security_bprm_set(), ->bprm_set_security()

Removed; folded into security_bprm_set_creds().

(*) security_bprm_set_creds(), ->bprm_set_creds()

New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.

(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()

New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.

The former may access bprm->cred, the latter may not.

(3) SELinux.

SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:

(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.

(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
d84f4f992cbd76e8f39c488cf0c5d123843923b1 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Inaugurate COW credentials

Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.

A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().

With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:

struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);

There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.

To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:

(1) Its reference count may incremented and decremented.

(2) The keyrings to which it points may be modified, but not replaced.

The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

(1) execve().

This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.

(2) Temporary credential overrides.

do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.

This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.

(3) LSM interface.

A number of functions have been changed, added or removed:

(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()

Removed in favour of security_capset().

(*) security_capset(), ->capset()

New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.

(*) security_bprm_apply_creds(), ->bprm_apply_creds()

Changed; now returns a value, which will cause the process to be
killed if it's an error.

(*) security_task_alloc(), ->task_alloc_security()

Removed in favour of security_prepare_creds().

(*) security_cred_free(), ->cred_free()

New. Free security data attached to cred->security.

(*) security_prepare_creds(), ->cred_prepare()

New. Duplicate any security data attached to cred->security.

(*) security_commit_creds(), ->cred_commit()

New. Apply any security effects for the upcoming installation of new
security by commit_creds().

(*) security_task_post_setuid(), ->task_post_setuid()

Removed in favour of security_task_fix_setuid().

(*) security_task_fix_setuid(), ->task_fix_setuid()

Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().

(*) security_task_reparent_to_init(), ->task_reparent_to_init()

Removed. Instead the task being reparented to init is referred
directly to init's credentials.

NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.

(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()

Changed. These now take cred pointers rather than task pointers to
refer to the security context.

(4) sys_capset().

This has been simplified and uses less locking. The LSM functions it
calls have been merged.

(5) reparent_to_kthreadd().

This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.

(6) __sigqueue_alloc() and switch_uid()

__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.

switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().

(7) [sg]et[ug]id() and co and [sg]et_current_groups.

The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.

security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.

The calling of set_dumpable() has been moved into commit_creds().

Much of the functionality of set_user() has been moved into
commit_creds().

The get functions all simply access the data directly.

(8) security_task_prctl() and cap_task_prctl().

security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.

Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.

(9) Keyrings.

A number of changes have been made to the keyrings code:

(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.

(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.

(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.

(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.

(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).

(10) Usermode helper.

The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.

call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.

call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.

(11) SELinux.

SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:

(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.

(12) is_single_threaded().

This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.

The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).

(13) nfsd.

The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
745ca2475a6ac596e3d8d37c2759c0fbe2586227 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Pass credentials through dentry_open()

Pass credentials through dentry_open() so that the COW creds patch can have
SELinux's flush_unauthorized_files() pass the appropriate creds back to itself
when it opens its null chardev.

The security_dentry_open() call also now takes a creds pointer, as does the
dentry_open hook in struct security_operations.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
88e67f3b8898c5ea81d2916dd5b8bc9c0c35ba13 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Make inode_has_perm() and file_has_perm() take a cred pointer

Make inode_has_perm() and file_has_perm() take a cred pointer rather than a
task pointer.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
275bb41e9d058fbb327e7642f077e1beaeac162e 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Wrap access to SELinux's task SID

Wrap access to SELinux's task SID, using task_sid() and current_sid() as
appropriate.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
f1752eec6145c97163dbce62d17cf5d928e28a27 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Detach the credentials from task_struct

Detach the credentials from task_struct, duplicating them in copy_process()
and releasing them in __put_task_struct().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
b6dff3ec5e116e3af6f537d4caedcad6b9e5082a 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Separate task security context from task_struct

Separate the task security context from task_struct. At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
15a2460ed0af7538ca8e6c610fe607a2cd9da142 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Constify the kernel_cap_t arguments to the capset LSM hooks

Constify the kernel_cap_t arguments to the capset LSM hooks.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
1cdcbec1a3372c0c49c59d292e708fd07b509f18 14-Nov-2008 David Howells <dhowells@redhat.com> CRED: Neuter sys_capset()

Take away the ability for sys_capset() to affect processes other than current.

This means that current will not need to lock its own credentials when reading
them against interference by other processes.

This has effectively been the case for a while anyway, since:

(1) Without LSM enabled, sys_capset() is disallowed.

(2) With file-based capabilities, sys_capset() is neutered.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
066746796bd2f0a1ba210c0dded3b6ee4032692a 11-Nov-2008 Eric Paris <eparis@redhat.com> Currently SELinux jumps through some ugly hoops to not audit a capbility
check when determining if a process has additional powers to override
memory limits or when trying to read/write illegal file labels. Use
the new noaudit call instead.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
06112163f5fd9e491a7f810443d81efa9d88e247 11-Nov-2008 Eric Paris <eparis@redhat.com> Add a new capable interface that will be used by systems that use audit to
make an A or B type decision instead of a security decision. Currently
this is the case at least for filesystems when deciding if a process can use
the reserved 'root' blocks and for the case of things like the oom
algorithm determining if processes are root processes and should be less
likely to be killed. These types of security system requests should not be
audited or logged since they are not really security decisions. It would be
possible to solve this problem like the vm_enough_memory security check did
by creating a new LSM interface and moving all of the policy into that
interface but proves the needlessly bloat the LSM and provide complex
indirection.

This merely allows those decisions to be made where they belong and to not
flood logs or printk with denials for thing that are not security decisions.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
39c9aede2b4a252bd296c0a86be832c3d3d0a273 05-Nov-2008 Eric Paris <eparis@redhat.com> SELinux: Use unknown perm handling to handle unknown netlink msg types

Currently when SELinux has not been updated to handle a netlink message
type the operation is denied with EINVAL. This patch will leave the
audit/warning message so things get fixed but if policy chose to allow
unknowns this will allow the netlink operation.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
41d9f9c524a53477467b7e0111ff3d644198f191 04-Nov-2008 Eric Paris <eparis@redhat.com> SELinux: hold tasklist_lock and siglock while waking wait_chldexit

SELinux has long been calling wake_up_interruptible() on
current->parent->signal->wait_chldexit without holding any locks. It
appears that this operation should hold the tasklist_lock to dereference
current->parent and we should hold the siglock when waking up the
signal->wait_chldexit.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
37dd0bd04a3240d2922786d501e2f12cec858fbf 31-Oct-2008 Eric Paris <eparis@redhat.com> SELinux: properly handle empty tty_files list

SELinux has wrongly (since 2004) had an incorrect test for an empty
tty->tty_files list. With an empty list selinux would be pointing to part
of the tty struct itself and would then proceed to dereference that value
and again dereference that result. An F10 change to plymouth on a ppc64
system is actually currently triggering this bug. This patch uses
list_empty() to handle empty lists rather than looking at a meaningless
location.

[note, this fixes the oops reported in
https://bugzilla.redhat.com/show_bug.cgi?id=469079]

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
8b6a5a37f87a414ef8636e36ec75accb27bb7508 29-Oct-2008 Eric Paris <eparis@redhat.com> SELinux: check open perms in dentry_open not inode_permission

Some operations, like searching a directory path or connecting a unix domain
socket, make explicit calls into inode_permission. Our choices are to
either try to come up with a signature for all of the explicit calls to
inode_permission and do not check open on those, or to move the open checks to
dentry_open where we know this is always an open operation. This patch moves
the checks to dentry_open.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
def8b4faff5ca349beafbbfeb2c51f3602a6ef3a 28-Oct-2008 Alexey Dobriyan <adobriyan@gmail.com> net: reduce structures when XFRM=n

ifdef out
* struct sk_buff::sp (pointer)
* struct dst_entry::xfrm (pointer)
* struct sock::sk_policy (2 pointers)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a447c0932445f92ce6f4c1bd020f62c5097a7842 13-Oct-2008 Steven Whitehouse <swhiteho@redhat.com> vfs: Use const for kernel parser table

This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe its been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
934e6ebf96e8c1a0f299e64129fdaebc1132a427 13-Oct-2008 Alan Cox <alan@redhat.com> tty: Redo current tty locking

Currently it is sometimes locked by the tty mutex and sometimes by the
sighand lock. The latter is in fact correct and now we can hand back referenced
objects we can fix this up without problems around sleeping functions.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
452a00d2ee288f2cbc36f676edd06cb14d2878c1 13-Oct-2008 Alan Cox <alan@redhat.com> tty: Make get_current_tty use a kref

We now return a kref covered tty reference. That ensures the tty structure
doesn't go away when you have a return from get_current_tty. This is not
enough to protect you from most of the resources being freed behind your
back - yet.

[Updated to include fixes for SELinux problems found by Andrew Morton and
an s390 leak found while debugging the former]

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6c5b3fc0147f79d714d2fe748b5869d7892ef2e7 10-Oct-2008 Paul Moore <paul.moore@hp.com> selinux: Cache NetLabel secattrs in the socket's security struct

Previous work enabled the use of address based NetLabel selectors, which
while highly useful, brought the potential for additional per-packet overhead
when used. This patch attempts to mitigate some of that overhead by caching
the NetLabel security attribute struct within the SELinux socket security
structure. This should help eliminate the need to recreate the NetLabel
secattr structure for each packet resulting in less overhead.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
014ab19a69c325f52d7bae54ceeda73d6307ae0c 10-Oct-2008 Paul Moore <paul.moore@hp.com> selinux: Set socket NetLabel based on connection endpoint

Previous work enabled the use of address based NetLabel selectors, which while
highly useful, brought the potential for additional per-packet overhead when
used. This patch attempts to solve that by applying NetLabel socket labels
when sockets are connect()'d. This should alleviate the per-packet NetLabel
labeling for all connected sockets (yes, it even works for connected DGRAM
sockets).

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
948bf85c1bc9a84754786a9d5dd99b7ecc46451e 10-Oct-2008 Paul Moore <paul.moore@hp.com> netlabel: Add functionality to set the security attributes of a packet

This patch builds upon the new NetLabel address selector functionality by
providing the NetLabel KAPI and CIPSO engine support needed to enable the
new packet-based labeling. The only new addition to the NetLabel KAPI at
this point is shown below:

* int netlbl_skbuff_setattr(skb, family, secattr)

... and is designed to be called from a Netfilter hook after the packet's
IP header has been populated such as in the FORWARD or LOCAL_OUT hooks.

This patch also provides the necessary SELinux hooks to support this new
functionality. Smack support is not currently included due to uncertainty
regarding the permissions needed to expand the Smack network access controls.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
dfaebe9825ff34983778f287101bc5f3bce00640 10-Oct-2008 Paul Moore <paul.moore@hp.com> selinux: Fix missing calls to netlbl_skbuff_err()

At some point I think I messed up and dropped the calls to netlbl_skbuff_err()
which are necessary for CIPSO to send error notifications to remote systems.
This patch re-introduces the error handling calls into the SELinux code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
d8395c876bb8a560c8a032887e191b95499a25d6 10-Oct-2008 Paul Moore <paul.moore@hp.com> selinux: Better local/forward check in selinux_ip_postroute()

It turns out that checking to see if skb->sk is NULL is not a very good
indicator of a forwarded packet as some locally generated packets also have
skb->sk set to NULL. Fix this by not only checking the skb->sk field but also
the IP[6]CB(skb)->flags field for the IP[6]SKB_FORWARDED flag. While we are
at it, we are calling selinux_parse_skb() much earlier than we really should
resulting in potentially wasted cycles parsing packets for information we
might no use; so shuffle the code around a bit to fix this.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
aa86290089a1e57b4bdbbb4720072233f66bd5b2 10-Oct-2008 Paul Moore <paul.moore@hp.com> selinux: Correctly handle IPv4 packets on IPv6 sockets in all cases

We did the right thing in a few cases but there were several areas where we
determined a packet's address family based on the socket's address family which
is not the right thing to do since we can get IPv4 packets on IPv6 sockets.
This patch fixes these problems by either taking the address family directly
from the packet.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
ea6b184f7d521a503ecab71feca6e4057562252b 22-Sep-2008 Stephen Smalley <sds@tycho.nsa.gov> selinux: use default proc sid on symlinks

As we are not concerned with fine-grained control over reading of
symlinks in proc, always use the default proc SID for all proc symlinks.
This should help avoid permission issues upon changes to the proc tree
as in the /proc/net -> /proc/self/net example.
This does not alter labeling of symlinks within /proc/pid directories.
ls -Zd /proc/net output before and after the patch should show the difference.

Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
f06febc96ba8e0af80bcc3eaec0a109e88275fac 12-Sep-2008 Frank Mayhar <fmayhar@google.com> timers: fix itimer/many thread hang

Overview

This patch reworks the handling of POSIX CPU timers, including the
ITIMER_PROF, ITIMER_VIRT timers and rlimit handling. It was put together
with the help of Roland McGrath, the owner and original writer of this code.

The problem we ran into, and the reason for this rework, has to do with using
a profiling timer in a process with a large number of threads. It appears
that the performance of the old implementation of run_posix_cpu_timers() was
at least O(n*3) (where "n" is the number of threads in a process) or worse.
Everything is fine with an increasing number of threads until the time taken
for that routine to run becomes the same as or greater than the tick time, at
which point things degrade rather quickly.

This patch fixes bug 9906, "Weird hang with NPTL and SIGPROF."

Code Changes

This rework corrects the implementation of run_posix_cpu_timers() to make it
run in constant time for a particular machine. (Performance may vary between
one machine and another depending upon whether the kernel is built as single-
or multiprocessor and, in the latter case, depending upon the number of
running processors.) To do this, at each tick we now update fields in
signal_struct as well as task_struct. The run_posix_cpu_timers() function
uses those fields to make its decisions.

We define a new structure, "task_cputime," to contain user, system and
scheduler times and use these in appropriate places:

struct task_cputime {
cputime_t utime;
cputime_t stime;
unsigned long long sum_exec_runtime;
};

This is included in the structure "thread_group_cputime," which is a new
substructure of signal_struct and which varies for uniprocessor versus
multiprocessor kernels. For uniprocessor kernels, it uses "task_cputime" as
a simple substructure, while for multiprocessor kernels it is a pointer:

struct thread_group_cputime {
struct task_cputime totals;
};

struct thread_group_cputime {
struct task_cputime *totals;
};

We also add a new task_cputime substructure directly to signal_struct, to
cache the earliest expiration of process-wide timers, and task_cputime also
replaces the it_*_expires fields of task_struct (used for earliest expiration
of thread timers). The "thread_group_cputime" structure contains process-wide
timers that are updated via account_user_time() and friends. In the non-SMP
case the structure is a simple aggregator; unfortunately in the SMP case that
simplicity was not achievable due to cache-line contention between CPUs (in
one measured case performance was actually _worse_ on a 16-cpu system than
the same test on a 4-cpu system, due to this contention). For SMP, the
thread_group_cputime counters are maintained as a per-cpu structure allocated
using alloc_percpu(). The timer functions update only the timer field in
the structure corresponding to the running CPU, obtained using per_cpu_ptr().

We define a set of inline functions in sched.h that we use to maintain the
thread_group_cputime structure and hide the differences between UP and SMP
implementations from the rest of the kernel. The thread_group_cputime_init()
function initializes the thread_group_cputime structure for the given task.
The thread_group_cputime_alloc() is a no-op for UP; for SMP it calls the
out-of-line function thread_group_cputime_alloc_smp() to allocate and fill
in the per-cpu structures and fields. The thread_group_cputime_free()
function, also a no-op for UP, in SMP frees the per-cpu structures. The
thread_group_cputime_clone_thread() function (also a UP no-op) for SMP calls
thread_group_cputime_alloc() if the per-cpu structures haven't yet been
allocated. The thread_group_cputime() function fills the task_cputime
structure it is passed with the contents of the thread_group_cputime fields;
in UP it's that simple but in SMP it must also safely check that tsk->signal
is non-NULL (if it is it just uses the appropriate fields of task_struct) and,
if so, sums the per-cpu values for each online CPU. Finally, the three
functions account_group_user_time(), account_group_system_time() and
account_group_exec_runtime() are used by timer functions to update the
respective fields of the thread_group_cputime structure.

Non-SMP operation is trivial and will not be mentioned further.

The per-cpu structure is always allocated when a task creates its first new
thread, via a call to thread_group_cputime_clone_thread() from copy_signal().
It is freed at process exit via a call to thread_group_cputime_free() from
cleanup_signal().

All functions that formerly summed utime/stime/sum_sched_runtime values from
from all threads in the thread group now use thread_group_cputime() to
snapshot the values in the thread_group_cputime structure or the values in
the task structure itself if the per-cpu structure hasn't been allocated.

Finally, the code in kernel/posix-cpu-timers.c has changed quite a bit.
The run_posix_cpu_timers() function has been split into a fast path and a
slow path; the former safely checks whether there are any expired thread
timers and, if not, just returns, while the slow path does the heavy lifting.
With the dedicated thread group fields, timers are no longer "rebalanced" and
the process_timer_rebalance() function and related code has gone away. All
summing loops are gone and all code that used them now uses the
thread_group_cputime() inline. When process-wide timers are set, the new
task_cputime structure in signal_struct is used to cache the earliest
expiration; this is checked in the fast path.

Performance

The fix appears not to add significant overhead to existing operations. It
generally performs the same as the current code except in two cases, one in
which it performs slightly worse (Case 5 below) and one in which it performs
very significantly better (Case 2 below). Overall it's a wash except in those
two cases.

I've since done somewhat more involved testing on a dual-core Opteron system.

Case 1: With no itimer running, for a test with 100,000 threads, the fixed
kernel took 1428.5 seconds, 513 seconds more than the unfixed system,
all of which was spent in the system. There were twice as many
voluntary context switches with the fix as without it.

Case 2: With an itimer running at .01 second ticks and 4000 threads (the most
an unmodified kernel can handle), the fixed kernel ran the test in
eight percent of the time (5.8 seconds as opposed to 70 seconds) and
had better tick accuracy (.012 seconds per tick as opposed to .023
seconds per tick).

Case 3: A 4000-thread test with an initial timer tick of .01 second and an
interval of 10,000 seconds (i.e. a timer that ticks only once) had
very nearly the same performance in both cases: 6.3 seconds elapsed
for the fixed kernel versus 5.5 seconds for the unfixed kernel.

With fewer threads (eight in these tests), the Case 1 test ran in essentially
the same time on both the modified and unmodified kernels (5.2 seconds versus
5.8 seconds). The Case 2 test ran in about the same time as well, 5.9 seconds
versus 5.4 seconds but again with much better tick accuracy, .013 seconds per
tick versus .025 seconds per tick for the unmodified kernel.

Since the fix affected the rlimit code, I also tested soft and hard CPU limits.

Case 4: With a hard CPU limit of 20 seconds and eight threads (and an itimer
running), the modified kernel was very slightly favored in that while
it killed the process in 19.997 seconds of CPU time (5.002 seconds of
wall time), only .003 seconds of that was system time, the rest was
user time. The unmodified kernel killed the process in 20.001 seconds
of CPU (5.014 seconds of wall time) of which .016 seconds was system
time. Really, though, the results were too close to call. The results
were essentially the same with no itimer running.

Case 5: With a soft limit of 20 seconds and a hard limit of 2000 seconds
(where the hard limit would never be reached) and an itimer running,
the modified kernel exhibited worse tick accuracy than the unmodified
kernel: .050 seconds/tick versus .028 seconds/tick. Otherwise,
performance was almost indistinguishable. With no itimer running this
test exhibited virtually identical behavior and times in both cases.

In times past I did some limited performance testing. those results are below.

On a four-cpu Opteron system without this fix, a sixteen-thread test executed
in 3569.991 seconds, of which user was 3568.435s and system was 1.556s. On
the same system with the fix, user and elapsed time were about the same, but
system time dropped to 0.007 seconds. Performance with eight, four and one
thread were comparable. Interestingly, the timer ticks with the fix seemed
more accurate: The sixteen-thread test with the fix received 149543 ticks
for 0.024 seconds per tick, while the same test without the fix received 58720
for 0.061 seconds per tick. Both cases were configured for an interval of
0.01 seconds. Again, the other tests were comparable. Each thread in this
test computed the primes up to 25,000,000.

I also did a test with a large number of threads, 100,000 threads, which is
impossible without the fix. In this case each thread computed the primes only
up to 10,000 (to make the runtime manageable). System time dominated, at
1546.968 seconds out of a total 2176.906 seconds (giving a user time of
629.938s). It received 147651 ticks for 0.015 seconds per tick, still quite
accurate. There is obviously no comparable test without the fix.

Signed-off-by: Frank Mayhar <fmayhar@google.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
d9250dea3f89fe808a525f08888016b495240ed4 28-Aug-2008 KaiGai Kohei <kaigai@ak.jp.nec.com> SELinux: add boundary support and thread context assignment

The purpose of this patch is to assign per-thread security context
under a constraint. It enables multi-threaded server application
to kick a request handler with its fair security context, and
helps some of userspace object managers to handle user's request.

When we assign a per-thread security context, it must not have wider
permissions than the original one. Because a multi-threaded process
shares a single local memory, an arbitary per-thread security context
also means another thread can easily refer violated information.

The constraint on a per-thread security context requires a new domain
has to be equal or weaker than its original one, when it tries to assign
a per-thread security context.

Bounds relationship between two types is a way to ensure a domain can
never have wider permission than its bounds. We can define it in two
explicit or implicit ways.

The first way is using new TYPEBOUNDS statement. It enables to define
a boundary of types explicitly. The other one expand the concept of
existing named based hierarchy. If we defines a type with "." separated
name like "httpd_t.php", toolchain implicitly set its bounds on "httpd_t".

This feature requires a new policy version.
The 24th version (POLICYDB_VERSION_BOUNDARY) enables to ship them into
kernel space, and the following patch enables to handle it.

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
5cd9c58fbe9ec92b45b27e131719af4f2bd9eb40 14-Aug-2008 David Howells <dhowells@redhat.com> security: Fix setting of PF_SUPERPRIV by __capable()

Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags
the target process if that is not the current process and it is trying to
change its own flags in a different way at the same time.

__capable() is using neither atomic ops nor locking to protect t->flags. This
patch removes __capable() and introduces has_capability() that doesn't set
PF_SUPERPRIV on the process being queried.

This patch further splits security_ptrace() in two:

(1) security_ptrace_may_access(). This passes judgement on whether one
process may access another only (PTRACE_MODE_ATTACH for ptrace() and
PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
current is the parent.

(2) security_ptrace_traceme(). This passes judgement on PTRACE_TRACEME only,
and takes only a pointer to the parent process. current is the child.

In Smack and commoncap, this uses has_capability() to determine whether
the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
This does not set PF_SUPERPRIV.

Two of the instances of __capable() actually only act on current, and so have
been changed to calls to capable().

Of the places that were using __capable():

(1) The OOM killer calls __capable() thrice when weighing the killability of a
process. All of these now use has_capability().

(2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
whether the parent was allowed to trace any process. As mentioned above,
these have been split. For PTRACE_ATTACH and /proc, capable() is now
used, and for PTRACE_TRACEME, has_capability() is used.

(3) cap_safe_nice() only ever saw current, so now uses capable().

(4) smack_setprocattr() rejected accesses to tasks other than current just
after calling __capable(), so the order of these two tests have been
switched and capable() is used instead.

(5) In smack_file_send_sigiotask(), we need to allow privileged processes to
receive SIGIO on files they're manipulating.

(6) In smack_task_wait(), we let a process wait for a privileged process,
whether or not the process doing the waiting is privileged.

I've tested this with the LTP SELinux and syscalls testscripts.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
cf9481e289247fe9cf40f2e2481220d899132049 27-Jul-2008 David Howells <dhowells@redhat.com> SELinux: Fix a potentially uninitialised variable in SELinux hooks

Fix a potentially uninitialised variable in SELinux hooks that's given a
pointer to the network address by selinux_parse_skb() passing a pointer back
through its argument list. By restructuring selinux_parse_skb(), the compiler
can see that the error case need not set it as the caller will return
immediately.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
3583a71183a02c51ca71cd180e9189cfb0411cc1 22-Jul-2008 Adrian Bunk <bunk@kernel.org> make selinux_write_opts() static

This patch makes the needlessly global selinux_write_opts() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
383795c206946777d87ed5f6d61d6659110f9344 29-Jul-2008 Eric Paris <eparis@redhat.com> SELinux: /proc/mounts should show what it can

Given a hosed SELinux config in which a system never loads policy or
disables SELinux we currently just return -EINVAL for anyone trying to
read /proc/mounts. This is a configuration problem but we can certainly
be more graceful. This patch just ignores -EINVAL when displaying LSM
options and causes /proc/mounts display everything else it can. If
policy isn't loaded the obviously there are no options, so we aren't
really loosing any information here.

This is safe as the only other return of EINVAL comes from
security_sid_to_context_core() in the case of an invalid sid. Even if a
FS was mounted with a now invalidated context that sid should have been
remapped to unlabeled and so we won't hit the EINVAL and will work like
we should. (yes, I tested to make sure it worked like I thought)

Signed-off-by: Eric Paris <eparis@redhat.com>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: James Morris <jmorris@namei.org>
b77b0646ef4efe31a7449bb3d9360fd00f95433d 17-Jul-2008 Al Viro <viro@zeniv.linux.org.uk> [PATCH] pass MAY_OPEN to vfs_permission() explicitly

... and get rid of the last "let's deduce mask from nameidata->flags"
bit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
6c5a9d2e1599a099b0e47235a1c1502162b14310 27-Jul-2008 Alexey Dobriyan <adobriyan@gmail.com> selinux: use nf_register_hooks()

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
0d094efeb1e98010c6b99923f1eb7e17bf1e3a74 26-Jul-2008 Roland McGrath <roland@redhat.com> tracehook: tracehook_tracer_task

This adds the tracehook_tracer_task() hook to consolidate all forms of
"Who is using ptrace on me?" logic. This is used for "TracerPid:" in
/proc and for permission checks. We also clean up the selinux code the
called an identical accessor.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
089be43e403a78cd6889cde2fba164fefe9dfd89 15-Jul-2008 James Morris <jmorris@namei.org> Revert "SELinux: allow fstype unknown to policy to use xattrs if present"

This reverts commit 811f3799279e567aa354c649ce22688d949ac7a9.

From Eric Paris:

"Please drop this patch for now. It deadlocks on ntfs-3g. I need to
rework it to handle fuse filesystems better. (casey was right)"
6f0f0fd496333777d53daff21a4e3b28c4d03a6d 10-Jul-2008 James Morris <jmorris@namei.org> security: remove register_security hook

The register security hook is no longer required, as the capability
module is always registered. LSMs wishing to stack capability as
a secondary module should do so explicitly.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
b478a9f9889c81e88077d1495daadee64c0af541 03-Jul-2008 Miklos Szeredi <mszeredi@suse.cz> security: remove unused sb_get_mnt_opts hook

The sb_get_mnt_opts() hook is unused, and is superseded by the
sb_show_options() hook.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: James Morris <jmorris@namei.org>
2069f457848f846cb31149c9aa29b330a6b66d1b 04-Jul-2008 Eric Paris <eparis@redhat.com> LSM/SELinux: show LSM mount options in /proc/mounts

This patch causes SELinux mount options to show up in /proc/mounts. As
with other code in the area seq_put errors are ignored. Other LSM's
will not have their mount options displayed until they fill in their own
security_sb_show_options() function.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: James Morris <jmorris@namei.org>
811f3799279e567aa354c649ce22688d949ac7a9 18-Jun-2008 Eric Paris <eparis@redhat.com> SELinux: allow fstype unknown to policy to use xattrs if present

Currently if a FS is mounted for which SELinux policy does not define an
fs_use_* that FS will either be genfs labeled or not labeled at all.
This decision is based on the existence of a genfscon rule in policy and
is irrespective of the capabilities of the filesystem itself. This
patch allows the kernel to check if the filesystem supports security
xattrs and if so will use those if there is no fs_use_* rule in policy.
An fstype with a no fs_use_* rule but with a genfs rule will use xattrs
if available and will follow the genfs rule.

This can be particularly interesting for things like ecryptfs which
actually overlays a real underlying FS. If we define excryptfs in
policy to use xattrs we will likely get this wrong at times, so with
this path we just don't need to define it!

Overlay ecryptfs on top of NFS with no xattr support:
SELinux: initialized (dev ecryptfs, type ecryptfs), uses genfs_contexts
Overlay ecryptfs on top of ext4 with xattr support:
SELinux: initialized (dev ecryptfs, type ecryptfs), uses xattr

It is also useful as the kernel adds new FS we don't need to add them in
policy if they support xattrs and that is how we want to handle them.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2baf06df85b27c1d64867883a0692519594f1ef2 11-Jun-2008 James Morris <jmorris@namei.org> SELinux: use do_each_thread as a proper do/while block

Use do_each_thread as a proper do/while block. Sparse complained.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
e399f98224a03d2e85fb45eacba367c47173f6f9 11-Jun-2008 James Morris <jmorris@namei.org> SELinux: remove unused and shadowed addrlen variable

Remove unused and shadowed addrlen variable. Picked up by sparse.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Paul Moore <paul.moore@hp.com>
242631c49d4cf39642741d6627750151b058233b 05-Jun-2008 Stephen Smalley <sds@tycho.nsa.gov> selinux: simplify ioctl checking

Simplify and improve the robustness of the SELinux ioctl checking by
using the "access mode" bits of the ioctl command to determine the
permission check rather than dealing with individual command values.
This removes any knowledge of specific ioctl commands from SELinux
and follows the same guidance we gave to Smack earlier.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
abc69bb633931bf54c6db798bcdc6fd1e0284742 21-May-2008 Stephen Smalley <sds@tycho.nsa.gov> SELinux: enable processes with mac_admin to get the raw inode contexts

Enable processes with CAP_MAC_ADMIN + mac_admin permission in policy
to get undefined contexts on inodes. This extends the support for
deferred mapping of security contexts in order to permit restorecon
and similar programs to see the raw file contexts unknown to the
system policy in order to check them.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
006ebb40d3d65338bd74abb03b945f8d60e362bd 19-May-2008 Stephen Smalley <sds@tycho.nsa.gov> Security: split proc ptrace checking into read vs. attach

Enable security modules to distinguish reading of process state via
proc from full ptrace access by renaming ptrace_may_attach to
ptrace_may_access and adding a mode argument indicating whether only
read access or full attach access is requested. This allows security
modules to permit access to reading process state without granting
full ptrace access. The base DAC/capability checking remains unchanged.

Read access to /proc/pid/mem continues to apply a full ptrace attach
check since check_mem_permission() already requires the current task
to already be ptracing the target. The other ptrace checks within
proc for elements like environ, maps, and fds are changed to pass the
read mode instead of attach.

In the SELinux case, we model such reading of process state as a
reading of a proc file labeled with the target process' label. This
enables SELinux policy to permit such reading of process state without
permitting control or manipulation of the target process, as there are
a number of cases where programs probe for such information via proc
but do not need to be able to control the target (e.g. procps,
lsof, PolicyKit, ConsoleKit). At present we have to choose between
allowing full ptrace in policy (more permissive than required/desired)
or breaking functionality (or in some cases just silencing the denials
via dontaudit rules but this can hide genuine attacks).

This version of the patch incorporates comments from Casey Schaufler
(change/replace existing ptrace_may_attach interface, pass access
mode), and Chris Wright (provide greater consistency in the checking).

Note that like their predecessors __ptrace_may_attach and
ptrace_may_attach, the __ptrace_may_access and ptrace_may_access
interfaces use different return value conventions from each other (0
or -errno vs. 1 or 0). I retained this difference to avoid any
changes to the caller logic but made the difference clearer by
changing the latter interface to return a bool rather than an int and
by adding a comment about it to ptrace.h for any future callers.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: James Morris <jmorris@namei.org>
f5269710789f666a65cf1132c4f1d14fbc8d3c29 14-May-2008 Eric Paris <eparis@redhat.com> SELinux: keep the code clean formating and syntax

Formatting and syntax changes

whitespace, tabs to spaces, trailing space
put open { on same line as struct def
remove unneeded {} after if statements
change printk("Lu") to printk("llu")
convert asm/uaccess.h to linux/uaacess.h includes
remove unnecessary asm/bug.h includes
convert all users of simple_strtol to strict_strtol

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
12b29f34558b9b45a2c6eabd4f3c6be939a3980f 07-May-2008 Stephen Smalley <sds@tycho.nsa.gov> selinux: support deferred mapping of contexts

Introduce SELinux support for deferred mapping of security contexts in
the SID table upon policy reload, and use this support for inode
security contexts when the context is not yet valid under the current
policy. Only processes with CAP_MAC_ADMIN + mac_admin permission in
policy can set undefined security contexts on inodes. Inodes with
such undefined contexts are treated as having the unlabeled context
until the context becomes valid upon a policy reload that defines the
context. Context invalidation upon policy reload also uses this
support to save the context information in the SID table and later
recover it upon a subsequent policy reload that defines the context
again.

This support is to enable package managers and similar programs to set
down file contexts unknown to the system policy at the time the file
is created in order to better support placing loadable policy modules
in packages and to support build systems that need to create images of
different distro releases with different policies w/o requiring all of
the contexts to be defined or legal in the build host policy.

With this patch applied, the following sequence is possible, although
in practice it is recommended that this permission only be allowed to
specific program domains such as the package manager.

# rmdir baz
# rm bar
# touch bar
# chcon -t foo_exec_t bar # foo_exec_t is not yet defined
chcon: failed to change context of `bar' to `system_u:object_r:foo_exec_t': Invalid argument
# mkdir -Z system_u:object_r:foo_exec_t baz
mkdir: failed to set default file creation context to `system_u:object_r:foo_exec_t': Invalid argument
# cat setundefined.te
policy_module(setundefined, 1.0)
require {
type unconfined_t;
type unlabeled_t;
}
files_type(unlabeled_t)
allow unconfined_t self:capability2 mac_admin;
# make -f /usr/share/selinux/devel/Makefile setundefined.pp
# semodule -i setundefined.pp
# chcon -t foo_exec_t bar # foo_exec_t is not yet defined
# mkdir -Z system_u:object_r:foo_exec_t baz
# ls -Zd bar baz
-rw-r--r-- root root system_u:object_r:unlabeled_t bar
drwxr-xr-x root root system_u:object_r:unlabeled_t baz
# cat foo.te
policy_module(foo, 1.0)
type foo_exec_t;
files_type(foo_exec_t)
# make -f /usr/share/selinux/devel/Makefile foo.pp
# semodule -i foo.pp # defines foo_exec_t
# ls -Zd bar baz
-rw-r--r-- root root user_u:object_r:foo_exec_t bar
drwxr-xr-x root root system_u:object_r:foo_exec_t baz
# semodule -r foo
# ls -Zd bar baz
-rw-r--r-- root root system_u:object_r:unlabeled_t bar
drwxr-xr-x root root system_u:object_r:unlabeled_t baz
# semodule -i foo.pp
# ls -Zd bar baz
-rw-r--r-- root root user_u:object_r:foo_exec_t bar
drwxr-xr-x root root system_u:object_r:foo_exec_t baz
# semodule -r setundefined foo
# chcon -t foo_exec_t bar # no longer defined and not allowed
chcon: failed to change context of `bar' to `system_u:object_r:foo_exec_t': Invalid argument
# rmdir baz
# mkdir -Z system_u:object_r:foo_exec_t baz
mkdir: failed to set default file creation context to `system_u:object_r:foo_exec_t': Invalid argument

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
9f3acc3140444a900ab280de942291959f0f615d 24-Apr-2008 Al Viro <viro@zeniv.linux.org.uk> [PATCH] split linux/file.h

Initial splitoff of the low-level stuff; taken to fdtable.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3b5e9e53c6f31b5a5a0f5c43707503c62bdefa46 30-Apr-2008 Oleg Nesterov <oleg@tv-sign.ru> signals: cleanup security_task_kill() usage/implementation

Every implementation of ->task_kill() does nothing when the signal comes from
the kernel. This is correct, but means that check_kill_permission() should
call security_task_kill() only for SI_FROMUSER() case, and we can remove the
same check from ->task_kill() implementations.

(sadly, check_kill_permission() is the last user of signal->session/__session
but we can't s/task_session_nr/task_session/ here).

NOTE: Eric W. Biederman pointed out cap_task_kill() should die, and I think
he is very right.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: David Quigley <dpquigl@tycho.nsa.gov>
Cc: Eric Paris <eparis@redhat.com>
Cc: Harald Welte <laforge@gnumonks.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e52c1764f18a62776a0f2bc6752fb76b6e345827 29-Apr-2008 David Howells <dhowells@redhat.com> Security: Make secctx_to_secid() take const secdata

Make secctx_to_secid() take constant secdata.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
7bf570dc8dcf76df2a9f583bef2da96d4289ed0d 29-Apr-2008 David Howells <dhowells@redhat.com> Security: Make secctx_to_secid() take const secdata

Make secctx_to_secid() take constant secdata.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
69664cf16af4f31cd54d77948a4baf9c7e0ca7b9 29-Apr-2008 David Howells <dhowells@redhat.com> keys: don't generate user and user session keyrings unless they're accessed

Don't generate the per-UID user and user session keyrings unless they're
explicitly accessed. This solves a problem during a login process whereby
set*uid() is called before the SELinux PAM module, resulting in the per-UID
keyrings having the wrong security labels.

This also cures the problem of multiple per-UID keyrings sometimes appearing
due to PAM modules (including pam_keyinit) setuiding and causing user_structs
to come into and go out of existence whilst the session keyring pins the user
keyring. This is achieved by first searching for extant per-UID keyrings
before inventing new ones.

The serial bound argument is also dropped from find_keyring_by_name() as it's
not currently made use of (setting it to 0 disables the feature).

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <kwc@citi.umich.edu>
Cc: <arunsr@cse.iitk.ac.in>
Cc: <dwalsh@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
70a5bb72b55e82fbfbf1e22cae6975fac58a1e2d 29-Apr-2008 David Howells <dhowells@redhat.com> keys: add keyctl function to get a security label

Add a keyctl() function to get the security label of a key.

The following is added to Documentation/keys.txt:

(*) Get the LSM security context attached to a key.

long keyctl(KEYCTL_GET_SECURITY, key_serial_t key, char *buffer,
size_t buflen)

This function returns a string that represents the LSM security context
attached to a key in the buffer provided.

Unless there's an error, it always returns the amount of data it could
produce, even if that's too big for the buffer, but it won't copy more
than requested to userspace. If the buffer pointer is NULL then no copy
will take place.

A NUL character is included at the end of the string if the buffer is
sufficiently big. This is included in the returned count. If no LSM is
in force then an empty string will be returned.

A process must have view permission on the key for this function to be
successful.

[akpm@linux-foundation.org: declare keyctl_get_security()]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Paul Moore <paul.moore@hp.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8f0cfa52a1d4ffacd8e7de906d19662f5da58d58 29-Apr-2008 David Howells <dhowells@redhat.com> xattr: add missing consts to function arguments

Add missing consts to xattr function arguments.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3898b1b4ebff8dcfbcf1807e0661585e06c9a91c 28-Apr-2008 Andrew G. Morgan <morgan@kernel.org> capabilities: implement per-process securebits

Filesystem capability support makes it possible to do away with (set)uid-0
based privilege and use capabilities instead. That is, with filesystem
support for capabilities but without this present patch, it is (conceptually)
possible to manage a system with capabilities alone and never need to obtain
privilege via (set)uid-0.

Of course, conceptually isn't quite the same as currently possible since few
user applications, certainly not enough to run a viable system, are currently
prepared to leverage capabilities to exercise privilege. Further, many
applications exist that may never get upgraded in this way, and the kernel
will continue to want to support their setuid-0 base privilege needs.

Where pure-capability applications evolve and replace setuid-0 binaries, it is
desirable that there be a mechanisms by which they can contain their
privilege. In addition to leveraging the per-process bounding and inheritable
sets, this should include suppressing the privilege of the uid-0 superuser
from the process' tree of children.

The feature added by this patch can be leveraged to suppress the privilege
associated with (set)uid-0. This suppression requires CAP_SETPCAP to
initiate, and only immediately affects the 'current' process (it is inherited
through fork()/exec()). This reimplementation differs significantly from the
historical support for securebits which was system-wide, unwieldy and which
has ultimately withered to a dead relic in the source of the modern kernel.

With this patch applied a process, that is capable(CAP_SETPCAP), can now drop
all legacy privilege (through uid=0) for itself and all subsequently
fork()'d/exec()'d children with:

prctl(PR_SET_SECUREBITS, 0x2f);

This patch represents a no-op unless CONFIG_SECURITY_FILE_CAPABILITIES is
enabled at configure time.

[akpm@linux-foundation.org: fix uninitialised var warning]
[serue@us.ibm.com: capabilities: use cap_task_prctl when !CONFIG_SECURITY]
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Paul Moore <paul.moore@hp.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b5266eb4c8d1a2887a19aaec8144ee4ad1b054c3 22-Mar-2008 Al Viro <viro@zeniv.linux.org.uk> [PATCH] switch a bunch of LSM hooks from nameidata to path

Namely, ones from namespace.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
0f5e64200f20fc8f5b759c4010082f577ab0af3f 21-Apr-2008 Eric Paris <eparis@redhat.com> SELinux: no BUG_ON(!ss_initialized) in selinux_clone_mnt_opts

The Fedora installer actually makes multiple NFS mounts before it loads
selinux policy. The code in selinux_clone_mnt_opts() assumed that the
init process would always be loading policy before NFS was up and
running. It might be possible to hit this in a diskless environment as
well, I'm not sure. There is no need to BUG_ON() in this situation
since we can safely continue given the circumstances.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
828dfe1da54fce81f80f97275353ba33be09a76e 17-Apr-2008 Eric Paris <eparis@redhat.com> SELinux: whitespace and formating fixes for hooks.c

All whitespace and formatting. Nothing interesting to see here. About
the only thing to remember is that we aren't supposed to initialize
static variables to 0/NULL. It is done for us and doing it ourselves
puts them in a different section.

With this patch running checkpatch.pl against hooks.c only gives us
complaints about busting the 80 character limit and declaring extern's
in .c files. Apparently they don't like it, but I don't feel like going
to the trouble of moving those to .h files...

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
744ba35e455b0d5cf4f85208a8ca0edcc9976b95 17-Apr-2008 Eric Paris <eparis@redhat.com> SELinux: clean up printks

Make sure all printk start with KERN_*
Make sure all printk end with \n
Make sure all printk have the word 'selinux' in them
Change "function name" to "%s", __func__ (found 2 wrong)

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
076c54c5bcaed2081c0cba94a6f77c4d470236ad 06-Mar-2008 Ahmed S. Darwish <darwish.07@gmail.com> Security: Introduce security= boot parameter

Add the security= boot parameter. This is done to avoid LSM
registration clashes in case of more than one bult-in module.

User can choose a security module to enable at boot. If no
security= boot parameter is specified, only the first LSM
asking for registration will be loaded. An invalid security
module name will be treated as if no module has been chosen.

LSM modules must check now if they are allowed to register
by calling security_module_enable(ops) first. Modify SELinux
and SMACK to do so.

Do not let SMACK register smackfs if it was not chosen on
boot. Smackfs assumes that smack hooks are registered and
the initial task security setup (swapper->security) is done.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
9d57a7f9e23dc30783d245280fc9907cf2c87837 01-Mar-2008 Ahmed S. Darwish <darwish.07@gmail.com> SELinux: use new audit hooks, remove redundant exports

Setup the new Audit LSM hooks for SELinux.
Remove the now redundant exported SELinux Audit interface.

Audit: Export 'audit_krule' and 'audit_field' to the public
since their internals are needed by the implementation of the
new LSM hook 'audit_rule_known'.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
713a04aeaba35bb95d442cdeb52055498519be25 01-Mar-2008 Ahmed S. Darwish <darwish.07@gmail.com> SELinux: setup new inode/ipc getsecid hooks

Setup the new inode_getsecid and ipc_getsecid() LSM hooks
for SELinux.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Reviewed-by: Paul Moore <paul.moore@hp.com>
3e11217263d0521e212cb8a017fbc2a1514db78f 10-Apr-2008 Paul Moore <paul.moore@hp.com> SELinux: Add network port SID cache

Much like we added a network node cache, this patch adds a network port
cache. The design is taken almost completely from the network node cache
which in turn was taken from the network interface cache. The basic idea is
to cache entries in a hash table based on protocol/port information. The
hash function only takes the port number into account since the number of
different protocols in use at any one time is expected to be relatively
small.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
832cbd9aa1293cba57d06571f5fc8f0917c672af 01-Apr-2008 Eric Paris <eparis@redhat.com> SELinux: turn mount options strings into defines

Convert the strings used for mount options into #defines rather than
retyping the string throughout the SELinux code.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
0356357c5158c71d4cbf20196b2f784435dd916c 26-Mar-2008 Roland McGrath <roland@redhat.com> selinux: remove ptrace_sid

This changes checks related to ptrace to get rid of the ptrace_sid tracking.
It's good to disentangle the security model from the ptrace implementation
internals. It's sufficient to check against the SID of the ptracer at the
time a tracee attempts a transition.

Signed-off-by: Roland McGrath <roland@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
f0115e6c8980ea9125a17858291c90ecd990bc1c 06-Mar-2008 Andrew Morton <akpm@linux-foundation.org> security: code cleanup

ERROR: "(foo*)" should be "(foo *)"
#168: FILE: security/selinux/hooks.c:2656:
+ "%s, rc=%d\n", __func__, (char*)value, -rc);

total: 1 errors, 0 warnings, 195 lines checked

./patches/security-replace-remaining-__function__-occurences.patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
dd6f953adb5c4deb9cd7b6a5054e7d5eafe4ed71 06-Mar-2008 Harvey Harrison <harvey.harrison@gmail.com> security: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
b0c636b99997c8594da6a46e166ce4fcf6956fda 28-Feb-2008 Eric Paris <eparis@redhat.com> SELinux: create new open permission

Adds a new open permission inside SELinux when 'opening' a file. The idea
is that opening a file and reading/writing to that file are not the same
thing. Its different if a program had its stdout redirected to /tmp/output
than if the program tried to directly open /tmp/output. This should allow
policy writers to more liberally give read/write permissions across the
policy while still blocking many design and programing flaws SELinux is so
good at catching today.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
98e9894650455426f67c2157db4f39bd14fac2f6 25-Feb-2008 James Morris <jmorris@namei.org> SELinux: remove unused backpointers from security objects

Remove unused backpoiters from security objects.

Signed-off-by: James Morris <jmorris@namei.org>
f74af6e816c940c678c235d49486fe40d7e49ce9 25-Feb-2008 Paul Moore <paul.moore@hp.com> SELinux: Correct the NetLabel locking for the sk_security_struct

The RCU/spinlock locking approach for the nlbl_state in the sk_security_struct
was almost certainly overkill. This patch removes both the RCU and spinlock
locking, relying on the existing socket locks to handle the case of multiple
writers. This change also makes several code reductions possible.

Less locking, less code - it's a Good Thing.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
5a55261716e838f188598ab3d7a0abf9cf1338f8 09-Apr-2008 Eric Paris <eparis@redhat.com> SELinux: don't BUG if fs reuses a superblock

I (wrongly) assumed that nfs_xdev_get_sb() would not ever share a superblock
and so cloning mount options would always be correct. Turns out that isn't
the case and we could fall over a BUG_ON() that wasn't a BUG at all. Since
there is little we can do to reconcile different mount options this patch
just leaves the sb alone and the first set of options wins.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: James Morris <jmorris@namei.org>
869ab5147e1eead890245cfd4f652ba282b6ac26 04-Apr-2008 Stephen Smalley <sds@tycho.nsa.gov> SELinux: more GFP_NOFS fixups to prevent selinux from re-entering the fs code

More cases where SELinux must not re-enter the fs code. Called from the
d_instantiate security hook.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
a02fe13297af26c13d004b1d44f391c077094ea0 04-Apr-2008 Josef Bacik <jbacik@redhat.com> selinux: prevent rentry into the FS

BUG fix. Keep us from re-entering the fs when we aren't supposed to.

See discussion at
http://marc.info/?t=120716967100004&r=1&w=2

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
0794c66d49885a2f615618ce4940434b5b067d84 17-Mar-2008 Stephen Smalley <sds@tycho.nsa.gov> selinux: handle files opened with flags 3 by checking ioctl permission

Handle files opened with flags 3 by checking ioctl permission.

Default to returning FILE__IOCTL from file_to_av() if the f_mode has neither
FMODE_READ nor FMODE_WRITE, and thus check ioctl permission on exec or
transfer, thereby validating such descriptors early as with normal r/w
descriptors and catching leaks of them prior to attempted usage.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2e1479d95d02b43660fe03ab2c595ec9751a6f97 17-Mar-2008 Adrian Bunk <bunk@kernel.org> make selinux_parse_opts_str() static

This patch makes the needlessly global selinux_parse_opts_str() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
e0007529893c1c064be90bd21422ca0da4a0198e 05-Mar-2008 Eric Paris <eparis@redhat.com> LSM/SELinux: Interfaces to allow FS to control mount options

Introduce new LSM interfaces to allow an FS to deal with their own mount
options. This includes a new string parsing function exported from the
LSM that an FS can use to get a security data blob and a new security
data blob. This is particularly useful for an FS which uses binary
mount data, like NFS, which does not pass strings into the vfs to be
handled by the loaded LSM. Also fix a BUG() in both SELinux and SMACK
when dealing with binary mount data. If the binary mount data is less
than one page the copy_page() in security_sb_copy_data() can cause an
illegal page fault and boom. Remove all NFSisms from the SELinux code
since they were broken by past NFS changes.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
44707fdf5938ad269ea5d6c5744d82f6a7328746 15-Feb-2008 Jan Blunck <jblunck@suse.de> d_path: Use struct path in struct avc_audit_data

audit_log_d_path() is a d_path() wrapper that is used by the audit code. To
use a struct path in audit_log_d_path() I need to embed it into struct
avc_audit_data.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4ac9137858e08a19f29feac4e1f4df7c268b0ba5 15-Feb-2008 Jan Blunck <jblunck@suse.de> Embed a struct path into struct nameidata instead of nd->{dentry,mnt}

This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
<dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
text data bss dec hex filename
5321639 858418 715768 6895825 6938d1 vmlinux

with patch series:
text data bss dec hex filename
5320026 858418 715768 6894212 693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b68e418c445e8a468634d0a7ca2fb63bbaa74028 07-Feb-2008 Stephen Smalley <sds@tycho.nsa.gov> selinux: support 64-bit capabilities

Fix SELinux to handle 64-bit capabilities correctly, and to catch
future extensions of capabilities beyond 64 bits to ensure that SELinux
is properly updated.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
42492594043d621a7910ff5877c3eb9202870b45 05-Feb-2008 David P. Quigley <dpquigl@tycho.nsa.gov> VFS/Security: Rework inode_getsecurity and callers to return resulting buffer

This patch modifies the interface to inode_getsecurity to have the function
return a buffer containing the security blob and its length via parameters
instead of relying on the calling function to give it an appropriately sized
buffer.

Security blobs obtained with this function should be freed using the
release_secctx LSM hook. This alleviates the problem of the caller having to
guess a length and preallocate a buffer for this function allowing it to be
used elsewhere for Labeled NFS.

The patch also removed the unused err parameter. The conversion is similar to
the one performed by Al Viro for the security_getprocattr hook.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
374ea019cacfa8b69ae49eea993b74cb5968970b 28-Jan-2008 Adrian Bunk <bunk@kernel.org> selinux: make selinux_set_mnt_opts() static

selinux_set_mnt_opts() can become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
71f1cb05f773661b6fa98c7a635d7a395cd9c55d 29-Jan-2008 Paul Moore <paul.moore@hp.com> SELinux: Add warning messages on network denial due to error

Currently network traffic can be sliently dropped due to non-avc errors which
can lead to much confusion when trying to debug the problem. This patch adds
warning messages so that when these events occur there is a user visible
notification.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
effad8df44261031a882e1a895415f7186a5098e 29-Jan-2008 Paul Moore <paul.moore@hp.com> SELinux: Add network ingress and egress control permission checks

This patch implements packet ingress/egress controls for SELinux which allow
SELinux security policy to control the flow of all IPv4 and IPv6 packets into
and out of the system. Currently SELinux does not have proper control over
forwarded packets and this patch corrects this problem.

Special thanks to Venkat Yekkirala <vyekkirala@trustedcs.com> whose earlier
work on this topic eventually led to this patch.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
5dbe1eb0cfc144a2b0cb1466e22bcb6fc34229a8 29-Jan-2008 Paul Moore <paul.moore@hp.com> SELinux: Allow NetLabel to directly cache SIDs

Now that the SELinux NetLabel "base SID" is always the netmsg initial SID we
can do a big optimization - caching the SID and not just the MLS attributes.
This not only saves a lot of per-packet memory allocations and copies but it
has a nice side effect of removing a chunk of code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
d621d35e576aa20a0ddae8022c3810f38357c8ff 29-Jan-2008 Paul Moore <paul.moore@hp.com> SELinux: Enable dynamic enable/disable of the network access checks

This patch introduces a mechanism for checking when labeled IPsec or SECMARK
are in use by keeping introducing a configuration reference counter for each
subsystem. In the case of labeled IPsec, whenever a labeled SA or SPD entry
is created the labeled IPsec/XFRM reference count is increased and when the
entry is removed it is decreased. In the case of SECMARK, when a SECMARK
target is created the reference count is increased and later decreased when the
target is removed. These reference counters allow SELinux to quickly determine
if either of these subsystems are enabled.

NetLabel already has a similar mechanism which provides the netlbl_enabled()
function.

This patch also renames the selinux_relabel_packet_permission() function to
selinux_secmark_relabel_packet_permission() as the original name and
description were misleading in that they referenced a single packet label which
is not the case.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
220deb966ea51e0dedb6a187c0763120809f3e64 29-Jan-2008 Paul Moore <paul.moore@hp.com> SELinux: Better integration between peer labeling subsystems

Rework the handling of network peer labels so that the different peer labeling
subsystems work better together. This includes moving both subsystems to a
single "peer" object class which involves not only changes to the permission
checks but an improved method of consolidating multiple packet peer labels.
As part of this work the inbound packet permission check code has been heavily
modified to handle both the old and new behavior in as sane a fashion as
possible.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
224dfbd81e1ff672eb46e7695469c395bd531083 29-Jan-2008 Paul Moore <paul.moore@hp.com> SELinux: Add a network node caching mechanism similar to the sel_netif_*() functions

This patch adds a SELinux IP address/node SID caching mechanism similar to the
sel_netif_*() functions. The node SID queries in the SELinux hooks files are
also modified to take advantage of this new functionality. In addition, remove
the address length information from the sk_buff parsing routines as it is
redundant since we already have the address family.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
da5645a28a15aed2e541a814ecf9f7ffcd4c4673 29-Jan-2008 Paul Moore <paul.moore@hp.com> SELinux: Only store the network interface's ifindex

Instead of storing the packet's network interface name store the ifindex. This
allows us to defer the need to lookup the net_device structure until the audit
record is generated meaning that in the majority of cases we never need to
bother with this at all.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
e8bfdb9d0dfc1231a6a71e849dfbd4447acdfff6 29-Jan-2008 Paul Moore <paul.moore@hp.com> SELinux: Convert the netif code to use ifindex values

The current SELinux netif code requires the caller have a valid net_device
struct pointer to lookup network interface information. However, we don't
always have a valid net_device pointer so convert the netif code to use
the ifindex values we always have as part of the sk_buff. This patch also
removes the default message SID from the network interface record, it is
not being used and therefore is "dead code".

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
75e22910cf0c26802b09dac2e34c13e648d3ed02 29-Jan-2008 Paul Moore <paul.moore@hp.com> NetLabel: Add IP address family information to the netlbl_skbuff_getattr() function

In order to do any sort of IP header inspection of incoming packets we need to
know which address family, AF_INET/AF_INET6/etc., it belongs to and since the
sk_buff structure does not store this information we need to pass along the
address family separate from the packet itself.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
6e23ae2a48750bda407a4a58f52a4865d7308bf5 20-Nov-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: Introduce NF_INET_ hook values

The IPv4 and IPv6 hook values are identical, yet some code tries to figure
out the "correct" value by looking at the address family. Introduce NF_INET_*
values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__
section for userspace compatibility.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
63cb34492351078479b2d4bae6a881806a396286 16-Jan-2008 David Howells <dhowells@redhat.com> security: add a secctx_to_secid() hook

Add a secctx_to_secid() LSM hook to go along with the existing
secid_to_secctx() LSM hook. This patch also includes the SELinux
implementation for this hook.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
c9180a57a9ab2d5525faf8815a332364ee9e89b7 30-Nov-2007 Eric Paris <eparis@redhat.com> Security: add get, set, and cloning of superblock security information

Adds security_get_sb_mnt_opts, security_set_sb_mnt_opts, and
security_clont_sb_mnt_opts to the LSM and to SELinux. This will allow
filesystems to directly own and control all of their mount options if they
so choose. This interface deals only with option identifiers and strings so
it should generic enough for any LSM which may come in the future.

Filesystems which pass text mount data around in the kernel (almost all of
them) need not currently make use of this interface when dealing with
SELinux since it will still parse those strings as it always has. I assume
future LSM's would do the same. NFS is the primary FS which does not use
text mount data and thus must make use of this interface.

An LSM would need to implement these functions only if they had mount time
options, such as selinux has context= or fscontext=. If the LSM has no
mount time options they could simply not implement and let the dummy ops
take care of things.

An LSM other than SELinux would need to define new option numbers in
security.h and any FS which decides to own there own security options would
need to be patched to use this new interface for every possible LSM. This
is because it was stated to me very clearly that LSM's should not attempt to
understand FS mount data and the burdon to understand security should be in
the FS which owns the options.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
8a53514043e380aa573baa805298a7727c993985 22-Oct-2007 Eric Paris <eparis@redhat.com> SELinux: always check SIGCHLD in selinux_task_wait

When checking if we can wait on a child we were looking at
p->exit_signal and trying to make the decision based on if the signal
would eventually be allowed. One big flaw is that p->exit_signal is -1
for NPTL threads and so aignal_to_av was not actually checking SIGCHLD
which is what would have been sent. Even is exit_signal was set to
something strange it wouldn't change the fact that the child was there
and needed to be waited on. This patch just assumes wait is based on
SIGCHLD. Specific permission checks are made when the child actually
attempts to send a signal.

This resolves the problem of things like using GDB on confined domains
such as in RH BZ 232371. The confined domain did not have permission to
send a generic signal (exit_signal == -1) back to the unconfined GDB.
With this patch the GDB wait works and since the actual signal sent is
allowed everything functions as it should.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
cbfee34520666862f8ff539e580c48958fbb7706 17-Oct-2007 Adrian Bunk <bunk@kernel.org> security/ cleanups

This patch contains the following cleanups that are now possible:
- remove the unused security_operations->inode_xattr_getsuffix
- remove the no longer used security_operations->unregister_security
- remove some no longer required exit code
- remove a bunch of no longer used exports

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b53767719b6cd8789392ea3e7e2eb7b8906898f0 17-Oct-2007 Serge E. Hallyn <serue@us.ibm.com> Implement file posix capabilities

Implement file posix capabilities. This allows programs to be given a
subset of root's powers regardless of who runs them, without having to use
setuid and giving the binary all of root's powers.

This version works with Kaigai Kohei's userspace tools, found at
http://www.kaigai.gr.jp/index.php. For more information on how to use this
patch, Chris Friedhoff has posted a nice page at
http://www.friedhoff.org/fscaps.html.

Changelog:
Nov 27:
Incorporate fixes from Andrew Morton
(security-introduce-file-caps-tweaks and
security-introduce-file-caps-warning-fix)
Fix Kconfig dependency.
Fix change signaling behavior when file caps are not compiled in.

Nov 13:
Integrate comments from Alexey: Remove CONFIG_ ifdef from
capability.h, and use %zd for printing a size_t.

Nov 13:
Fix endianness warnings by sparse as suggested by Alexey
Dobriyan.

Nov 09:
Address warnings of unused variables at cap_bprm_set_security
when file capabilities are disabled, and simultaneously clean
up the code a little, by pulling the new code into a helper
function.

Nov 08:
For pointers to required userspace tools and how to use
them, see http://www.friedhoff.org/fscaps.html.

Nov 07:
Fix the calculation of the highest bit checked in
check_cap_sanity().

Nov 07:
Allow file caps to be enabled without CONFIG_SECURITY, since
capabilities are the default.
Hook cap_task_setscheduler when !CONFIG_SECURITY.
Move capable(TASK_KILL) to end of cap_task_kill to reduce
audit messages.

Nov 05:
Add secondary calls in selinux/hooks.c to task_setioprio and
task_setscheduler so that selinux and capabilities with file
cap support can be stacked.

Sep 05:
As Seth Arnold points out, uid checks are out of place
for capability code.

Sep 01:
Define task_setscheduler, task_setioprio, cap_task_kill, and
task_setnice to make sure a user cannot affect a process in which
they called a program with some fscaps.

One remaining question is the note under task_setscheduler: are we
ok with CAP_SYS_NICE being sufficient to confine a process to a
cpuset?

It is a semantic change, as without fsccaps, attach_task doesn't
allow CAP_SYS_NICE to override the uid equivalence check. But since
it uses security_task_setscheduler, which elsewhere is used where
CAP_SYS_NICE can be used to override the uid equivalence check,
fixing it might be tough.

task_setscheduler
note: this also controls cpuset:attach_task. Are we ok with
CAP_SYS_NICE being used to confine to a cpuset?
task_setioprio
task_setnice
sys_setpriority uses this (through set_one_prio) for another
process. Need same checks as setrlimit

Aug 21:
Updated secureexec implementation to reflect the fact that
euid and uid might be the same and nonzero, but the process
might still have elevated caps.

Aug 15:
Handle endianness of xattrs.
Enforce capability version match between kernel and disk.
Enforce that no bits beyond the known max capability are
set, else return -EPERM.
With this extra processing, it may be worth reconsidering
doing all the work at bprm_set_security rather than
d_instantiate.

Aug 10:
Always call getxattr at bprm_set_security, rather than
caching it at d_instantiate.

[morgan@kernel.org: file-caps clean up for linux/capability.h]
[bunk@kernel.org: unexport cap_inode_killpriv]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
20510f2f4e2dabb0ff6c13901807627ec9452f98 17-Oct-2007 James Morris <jmorris@namei.org> security: Convert LSM into a static interface

Convert LSM into a static interface, as the ability to unload a security
module is not required by in-tree users and potentially complicates the
overall security architecture.

Needlessly exported LSM symbols have been unexported, to help reduce API
abuse.

Parameters for the capability and root_plug modules are now specified
at boot.

The SECURITY_FRAMEWORK_VERSION macro has also been removed.

In a nutshell, there is no safe way to unload an LSM. The modular interface
is thus unecessary and broken infrastructure. It is used only by out-of-tree
modules, which are often binary-only, illegal, abusive of the API and
dangerous, e.g. silently re-vectoring SELinux.

[akpm@linux-foundation.org: cleanups]
[akpm@linux-foundation.org: USB Kconfig fix]
[randy.dunlap@oracle.com: fix LSM kernel-doc]
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Acked-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
788e7dd4c22e6f41b3a118fd8c291f831f6fddbb 14-Sep-2007 Yuichi Nakamura <ynakam@hitachisoft.jp> SELinux: Improve read/write performance

It reduces the selinux overhead on read/write by only revalidating
permissions in selinux_file_permission if the task or inode labels have
changed or the policy has changed since the open-time check. A new LSM
hook, security_dentry_open, is added to capture the necessary state at open
time to allow this optimization.

(see http://marc.info/?l=selinux&m=118972995207740&w=2)

Signed-off-by: Yuichi Nakamura<ynakam@hitachisoft.jp>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
a224be766bf593f7bcd534ca0c48dbd3eaf7bfce 15-Oct-2007 David S. Miller <davem@sunset.davemloft.net> [SELINUX]: Update for netfilter ->hook() arg changes.

They take a "struct sk_buff *" instead of a "struct sk_buff **" now.

Signed-off-by: David S. Miller <davem@davemloft.net>
227b60f5102cda4e4ab792b526a59c8cb20cd9f8 11-Oct-2007 Stephen Hemminger <shemminger@linux-foundation.org> [INET]: local port range robustness

Expansion of original idea from Denis V. Lunev <den@openvz.org>

Add robustness and locking to the local_port_range sysctl.
1. Enforce that low < high when setting.
2. Use seqlock to ensure atomic update.

The locking might seem like overkill, but there are
cases where sysadmin might want to change value in the
middle of a DoS attack.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
31e879309474d1666d645b96de99d0b682fa055f 19-Sep-2007 Eric Paris <eparis@redhat.com> SELinux: fix array out of bounds when mounting with selinux options

Given an illegal selinux option it was possible for match_token to work in
random memory at the end of the match_table_t array.

Note that privilege is required to perform a context mount, so this issue is
effectively limited to root only.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
4ac212ad4e8fafc22fa147fc255ff5fa5435cf33 29-Aug-2007 Stephen Smalley <sds@tycho.nsa.gov> SELinux: clear parent death signal on SID transitions

Clear parent death signal on SID transitions to prevent unauthorized
signaling between SIDs.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@localhost.localdomain>
34b4e4aa3c470ce8fa2bd78abb1741b4b58baad7 22-Aug-2007 Alan Cox <alan@lxorguk.ukuu.org.uk> fix NULL pointer dereference in __vm_enough_memory()

The new exec code inserts an accounted vma into an mm struct which is not
current->mm. The existing memory check code has a hard coded assumption
that this does not happen as does the security code.

As the correct mm is known we pass the mm to the security method and the
helper function. A new security test is added for the case where we need
to pass the mm and the existing one is modified to pass current->mm to
avoid the need to change large amounts of code.

(Thanks to Tobias for fixing rejects and testing)

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: WU Fengguang <wfg@mail.ustc.edu.cn>
Cc: James Morris <jmorris@redhat.com>
Cc: Tobias Diedrich <ranma+kernel@tdiedrich.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
088999e98b8caecd31adc3b62223a228555c5ab7 01-Aug-2007 Paul Moore <paul.moore@hp.com> SELinux: remove redundant pointer checks before calling kfree()

We don't need to check for NULL pointers before calling kfree().

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
20c2df83d25c6a95affe6157a4c9cac4cf5ffaac 20-Jul-2007 Paul Mundt <lethal@linux-sh.org> mm: Remove slab destructors from kmem_cache_create().

Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
f36158c410651fe66f438c17b2ab3ae813f8c060 18-Jul-2007 Paul Moore <paul.moore@hp.com> SELinux: use SECINITSID_NETMSG instead of SECINITSID_UNLABELED for NetLabel

These changes will make NetLabel behave like labeled IPsec where there is an
access check for both labeled and unlabeled packets as well as providing the
ability to restrict domains to receiving only labeled packets when NetLabel is
in use. The changes to the policy are straight forward with the following
necessary to receive labeled traffic (with SECINITSID_NETMSG defined as
"netlabel_peer_t"):

allow mydom_t netlabel_peer_t:{ tcp_socket udp_socket rawip_socket } recvfrom;

The policy for unlabeled traffic would be:

allow mydom_t unlabeled_t:{ tcp_socket udp_socket rawip_socket } recvfrom;

These policy changes, as well as more general NetLabel support, are included in
the latest SELinux Reference Policy release 20070629 or later. Users who make
use of NetLabel are strongly encouraged to upgrade their policy to avoid
network problems. Users who do not make use of NetLabel will not notice any
difference.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
3bd858ab1c451725c07a805dcb315215dc85b86e 17-Jul-2007 Satyam Sharma <ssatyam@cse.iitk.ac.in> Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check

Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
users to it. This is done because we want to avoid bugs in the future
where we check for only effective fsuid of the current task against a
file's owning uid, without simultaneously checking for CAP_FOWNER as
well, thus violating its semantics.
[ XFS uses special macros and structures, and in general looked ...
untouchable, so we leave it alone -- but it has been looked over. ]

The (current->fsuid != inode->i_uid) check in generic_permission() and
exec_permission_lite() is left alone, because those operations are
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.

Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Cc: Al Viro <viro@ftp.linux.org.uk>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8d9107e8c50e1c4ff43c91c8841805833f3ecfb9 14-Jul-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Revert "SELinux: use SECINITSID_NETMSG instead of SECINITSID_UNLABELED for NetLabel"

This reverts commit 9faf65fb6ee2b4e08325ba2d69e5ccf0c46453d0.

It bit people like Michal Piotrowski:

"My system is too secure, I can not login :)"

because it changed how CONFIG_NETLABEL worked, and broke older SElinux
policies.

As a result, quoth James Morris:

"Can you please revert this patch?

We thought it only affected people running MLS, but it will affect others.

Sorry for the hassle."

Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Cc: Paul Moore <paul.moore@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9faf65fb6ee2b4e08325ba2d69e5ccf0c46453d0 29-Jun-2007 Paul Moore <paul.moore@hp.com> SELinux: use SECINITSID_NETMSG instead of SECINITSID_UNLABELED for NetLabel

These changes will make NetLabel behave like labeled IPsec where there is an
access check for both labeled and unlabeled packets as well as providing the
ability to restrict domains to receiving only labeled packets when NetLabel
is in use. The changes to the policy are straight forward with the
following necessary to receive labeled traffic (with SECINITSID_NETMSG
defined as "netlabel_peer_t"):

allow mydom_t netlabel_peer_t:{ tcp_socket udp_socket rawip_socket } recvfrom;

The policy for unlabeled traffic would be:

allow mydom_t unlabeled_t:{ tcp_socket udp_socket rawip_socket } recvfrom;

These policy changes, as well as more general NetLabel support, are included
in the SELinux Reference Policy SVN tree, r2352 or later. Users who enable
NetLabel support in the kernel are strongly encouraged to upgrade their
policy to avoid network problems.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
ed0321895182ffb6ecf210e066d87911b270d587 28-Jun-2007 Eric Paris <eparis@redhat.com> security: Protection for exploiting null dereference using mmap

Add a new security check on mmap operations to see if the user is attempting
to mmap to low area of the address space. The amount of space protected is
indicated by the new proc tunable /proc/sys/vm/mmap_min_addr and defaults to
0, preserving existing behavior.

This patch uses a new SELinux security class "memprotect." Policy already
contains a number of allow rules like a_t self:process * (unconfined_t being
one of them) which mean that putting this check in the process class (its
best current fit) would make it useless as all user processes, which we also
want to protect against, would be allowed. By taking the memprotect name of
the new class it will also make it possible for us to move some of the other
memory protect permissions out of 'process' and into the new class next time
we bump the policy version number (which I also think is a good future idea)

Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2c3c05dbcbc7b9d71549fe0e2b249f10f5a66518 07-Jun-2007 Stephen Smalley <sds@tycho.nsa.gov> SELinux: allow preemption between transition permission checks

In security_get_user_sids, move the transition permission checks
outside of the section holding the policy rdlock, and use the AVC to
perform the checks, calling cond_resched after each one. These
changes should allow preemption between the individual checks and
enable caching of the results. It may however increase the overall
time spent in the function in some cases, particularly in the cache
miss case.

The long term fix will be to take much of this logic to userspace by
exporting additional state via selinuxfs, and ultimately deprecating
and eliminating this interface from the kernel.

Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
e63340ae6b6205fef26b40a75673d1c9c0c8bb90 08-May-2007 Randy Dunlap <randy.dunlap@oracle.com> header cleaning: don't include smp_lock.h when not used

Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
98a27ba485c7508ef9d9527fe06e4686f3a163dc 08-May-2007 Eric W. Biederman <ebiederm@xmission.com> tty: introduce no_tty and use it in selinux

While researching the tty layer pid leaks I found a weird case in selinux when
we drop a controlling tty because of inadequate permissions we don't do the
normal hangup processing. Which is a problem if it happens the session leader
has exec'd something that can no longer access the tty.

We already have code in the kernel to handle this case in the form of the
TIOCNOTTY ioctl. So this patch factors out a helper function that is the
essence of that ioctl and calls it from the selinux code.

This removes the inconsistency in handling dropping of a controlling tty and
who knows it might even make some part of user space happy because it received
a SIGHUP it was expecting.

In addition since this removes the last user of proc_set_tty outside of
tty_io.c proc_set_tty is made static and removed from tty.h

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4f6a993f96a256e83b9be7612f958c7bc4ca9f00 01-Mar-2007 Paul Moore <paul.moore@hp.com> SELinux: move security_skb_extlbl_sid() out of the security server

As suggested, move the security_skb_extlbl_sid() function out of the security
server and into the SELinux hooks file.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
c60475bf35fc5fa10198df89187ab148527e72f7 28-Feb-2007 Paul Moore <paul.moore@hp.com> SELinux: rename selinux_netlabel.h to netlabel.h

In the beginning I named the file selinux_netlabel.h to avoid potential
namespace colisions. However, over time I have realized that there are several
other similar cases of multiple header files with the same name so I'm changing
the name to something which better fits with existing naming conventions.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
b529ccf2799c14346d1518e9bdf1f88f03643e99 26-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [NETLINK]: Introduce nlmsg_hdr() helper

For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the
number of direct accesses to skb->data and for consistency with all the other
cast skb member helpers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bbe735e4247dba32568a305553b010081c8dea99 11-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_network_offset()

For the quite common 'skb->nh.raw - skb->data' sequence.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
04ff97086b1a3237bbd1fe6390fa80fe75207e23 12-Mar-2007 Al Viro <viro@ftp.linux.org.uk> [PATCH] sanitize security_getprocattr() API

have it return the buffer it had allocated

Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fadcdb451632d32d7c0d4c71df9ac2d3b7ae2348 23-Feb-2007 Eric Paris <eparis@parisplace.org> Reassign printk levels in selinux kernel code

Below is a patch which demotes many printk lines to KERN_DEBUG from
KERN_INFO. It should help stop the spamming of logs with messages in
which users are not interested nor is there any action that users should
take. It also promotes some KERN_INFO to KERN_ERR such as when there
are improper attempts to register/unregister security modules.

A similar patch was discussed a while back on list:
http://marc.theaimsgroup.com/?t=116656343500003&r=1&w=2
This patch addresses almost all of the issues raised. I believe the
only advice not taken was in the demoting of messages related to
undefined permissions and classes.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>

security/selinux/hooks.c | 20 ++++++++++----------
security/selinux/ss/avtab.c | 2 +-
security/selinux/ss/policydb.c | 6 +++---
security/selinux/ss/sidtab.c | 2 +-
4 files changed, 15 insertions(+), 15 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
bbaca6c2e7ef0f663bc31be4dad7cf530f6c4962 14-Feb-2007 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: enhance selinux to always ignore private inodes

Hmmm...turns out to not be quite enough, as the /proc/sys inodes aren't truly
private to the fs, so we can run into them in a variety of security hooks
beyond just the inode hooks, such as security_file_permission (when reading
and writing them via the vfs helpers), security_sb_mount (when mounting other
filesystems on directories in proc like binfmt_misc), and deeper within the
security module itself (as in flush_unauthorized_files upon inheritance across
execve). So I think we have to add an IS_PRIVATE() guard within SELinux, as
below. Note however that the use of the private flag here could be confusing,
as these inodes are _not_ private to the fs, are exposed to userspace, and
security modules must implement the sysctl hook to get any access control over
them.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b599fdfdb4bb4941e9076308efcf3bb89e577db5 14-Feb-2007 Eric W. Biederman <ebiederm@xmission.com> [PATCH] sysctl: fix the selinux_sysctl_get_sid

I goofed and when reenabling the fine grained selinux labels for
sysctls and forgot to add the "/sys" prefix before consulting
the policy database. When computing the same path using
proc_dir_entries we got the "/sys" for free as it was part
of the tree, but it isn't true for clt_table trees.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3fbfa98112fc3962c416452a0baf2214381030e6 14-Feb-2007 Eric W. Biederman <ebiederm@xmission.com> [PATCH] sysctl: remove the proc_dir_entry member for the sysctl tables

It isn't needed anymore, all of the users are gone, and all of the ctl_table
initializers have been converted to use explicit names of the fields they are
initializing.

[akpm@osdl.org: NTFS fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
b385a144ee790f00e8559bcb8024d042863f9be1 10-Feb-2007 Robert P. J. Day <rpjday@mindspring.com> [PATCH] Replace regular code with appropriate calls to container_of()

Replace a small number of expressions with a call to the "container_of()"
macro.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
c376222960ae91d5ffb9197ee36771aaed1d9f90 10-Feb-2007 Robert P. J. Day <rpjday@mindspring.com> [PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc().

Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
corresponding "kmem_cache_zalloc()" call.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Roland McGrath <roland@redhat.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bbea9f69668a3d0cf9feba15a724cd02896f8675 10-Dec-2006 Vadim Lobanov <vlobanov@speakeasy.net> [PATCH] fdtable: Make fdarray and fdsets equal in size

Currently, each fdtable supports three dynamically-sized arrays of data: the
fdarray and two fdsets. The code allows the number of fds supported by the
fdarray (fdtable->max_fds) to differ from the number of fds supported by each
of the fdsets (fdtable->max_fdset).

In practice, it is wasteful for these two sizes to differ: whenever we hit a
limit on the smaller-capacity structure, we will reallocate the entire fdtable
and all the dynamic arrays within it, so any delta in the memory used by the
larger-capacity structure will never be touched at all.

Rather than hogging this excess, we shouldn't even allocate it in the first
place, and keep the capacities of the fdarray and the fdsets equal. This
patch removes fdtable->max_fdset. As an added bonus, most of the supporting
code becomes simpler.

Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
3d5ff529ea222461a5fa3c4df05cbdc5eb56864d 08-Dec-2006 Josef Sipek <jsipek@fsl.cs.sunysb.edu> [PATCH] struct path: convert selinux

Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
24ec839c431eb79bb8f6abc00c4e1eb3b8c4d517 08-Dec-2006 Peter Zijlstra <a.p.zijlstra@chello.nl> [PATCH] tty: ->signal->tty locking

Fix the locking of signal->tty.

Use ->sighand->siglock to protect ->signal->tty; this lock is already used
by most other members of ->signal/->sighand. And unless we are 'current'
or the tasklist_lock is held we need ->siglock to access ->signal anyway.

(NOTE: sys_unshare() is broken wrt ->sighand locking rules)

Note that tty_mutex is held over tty destruction, so while holding
tty_mutex any tty pointer remains valid. Otherwise the lifetime of ttys
are governed by their open file handles. This leaves some holes for tty
access from signal->tty (or any other non file related tty access).

It solves the tty SLAB scribbles we were seeing.

(NOTE: the change from group_send_sig_info to __group_send_sig_info needs to
be examined by someone familiar with the security framework, I think
it is safe given the SEND_SIG_PRIV from other __group_send_sig_info
invocations)

[schwidefsky@de.ibm.com: 3270 fix]
[akpm@osdl.org: various post-viro fixes]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Alan Cox <alan@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jan Kara <jack@ucw.cz>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
e18b890bb0881bbab6f4f1a6cd20d9c60d66b003 07-Dec-2006 Christoph Lameter <clameter@sgi.com> [PATCH] slab: remove kmem_cache_t

Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#

set -e

for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done

The script was run like this

sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
e94b1766097d53e6f3ccfb36c8baa562ffeda3fc 07-Dec-2006 Christoph Lameter <clameter@sgi.com> [PATCH] slab: remove SLAB_KERNEL

SLAB_KERNEL is an alias of GFP_KERNEL.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
87fcd70d983d30eca4b933fff2e97d9a31743d0a 04-Dec-2006 Al Viro <viro@hera.kernel.org> [PATCH] selinux endianness annotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
3de4bab5b9f8848a0c16a4b1ffe0452f0d670237 17-Nov-2006 Paul Moore <paul.moore@hp.com> SELinux: peer secid consolidation for external network labeling

Now that labeled IPsec makes use of the peer_sid field in the
sk_security_struct we can remove a lot of the special cases between labeled
IPsec and NetLabel. In addition, create a new function,
security_skb_extlbl_sid(), which we can use in several places to get the
security context of the packet's external label which allows us to further
simplify the code in a few places.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
9f2ad66509b182b399a5b03de487f45bde623524 17-Nov-2006 Paul Moore <paul.moore@hp.com> NetLabel: SELinux cleanups

This patch does a lot of cleanup in the SELinux NetLabel support code. A
summary of the changes include:

* Use RCU locking for the NetLabel state variable in the skk_security_struct
instead of using the inode_security_struct mutex.
* Remove unnecessary parameters in selinux_netlbl_socket_post_create().
* Rename selinux_netlbl_sk_clone_security() to
selinux_netlbl_sk_security_clone() to better fit the other NetLabel
sk_security functions.
* Improvements to selinux_netlbl_inode_permission() to help reduce the cost of
the common case.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2ee92d46c6cabedd50edf6f273fa8cf84f707618 14-Nov-2006 James Morris <jmorris@namei.org> [SELinux]: Add support for DCCP

This patch implements SELinux kernel support for DCCP
(http://linux-net.osdl.org/index.php/DCCP), which is similar in
operation to TCP in terms of connected state between peers.

The SELinux support for DCCP is thus modeled on existing handling of
TCP.

A new DCCP socket class is introduced, to allow protocol
differentation. The permissions for this class inherit all of the
socket permissions, as well as the current TCP permissions (node_bind,
name_bind etc). IPv4 and IPv6 are supported, although labeled
networking is not, at this stage.

Patches for SELinux userspace are at:
http://people.redhat.com/jmorris/selinux/dccp/user/

I've performed some basic testing, and it seems to be working as
expected. Adding policy support is similar to TCP, the only real
difference being that it's a different protocol.

Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
67f83cbf081a70426ff667e8d14f94e13ed3bdca 09-Nov-2006 Venkat Yekkirala <vyekkirala@trustedcs.com> SELinux: Fix SA selection semantics

Fix the selection of an SA for an outgoing packet to be at the same
context as the originating socket/flow. This eliminates the SELinux
policy's ability to use/sendto SAs with contexts other than the socket's.

With this patch applied, the SELinux policy will require one or more of the
following for a socket to be able to communicate with/without SAs:

1. To enable a socket to communicate without using labeled-IPSec SAs:

allow socket_t unlabeled_t:association { sendto recvfrom }

2. To enable a socket to communicate with labeled-IPSec SAs:

allow socket_t self:association { sendto };
allow socket_t peer_sa_t:association { recvfrom };

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
6b877699c6f1efede4545bcecc367786a472eedb 09-Nov-2006 Venkat Yekkirala <vyekkirala@trustedcs.com> SELinux: Return correct context for SO_PEERSEC

Fix SO_PEERSEC for tcp sockets to return the security context of
the peer (as represented by the SA from the peer) as opposed to the
SA used by the local/source socket.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
fc5d81e69d15c65ca20d9e5b4e242690e3e9c27d 27-Nov-2006 Akinobu Mita <akinobu.mita@gmail.com> selinux: fix dentry_open() error check

The return value of dentry_open() shoud be checked by IS_ERR().

Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: James Morris <jmorris@namei.org>
f8687afefcc821fc47c75775eec87731fe3de360 31-Oct-2006 Paul Moore <paul.moore@hp.com> [NetLabel]: protect the CIPSOv4 socket option from setsockopt()

This patch makes two changes to protect applications from either removing or
tampering with the CIPSOv4 IP option on a socket. The first is the requirement
that applications have the CAP_NET_RAW capability to set an IPOPT_CIPSO option
on a socket; this prevents untrusted applications from setting their own
CIPSOv4 security attributes on the packets they send. The second change is to
SELinux and it prevents applications from setting any IPv4 options when there
is an IPOPT_CIPSO option already present on the socket; this prevents
applications from removing CIPSOv4 security attributes from the packets they
send.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2148ccc437a9eac9f0d4b3c27cb1e41f6a48194c 30-Sep-2006 David Woodhouse <dwmw2@infradead.org> [PATCH] MLSXFRM: fix mis-labelling of child sockets

Accepted connections of types other than AF_INET, AF_INET6, AF_UNIX won't
have an appropriate label derived from the peer, so don't use it.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
3528a95322b5c1ce882ab723f175a1845430cd89 29-Sep-2006 Cory Olmo <colmo@TrustedCS.com> [PATCH] SELinux: support mls categories for context mounts

Allows commas to be embedded into context mount options (i.e. "-o
context=some_selinux_context_t"), to better support multiple categories,
which are separated by commas and confuse mount.

For example, with the current code:

mount -t iso9660 /dev/cdrom /media/cdrom -o \
ro,context=system_u:object_r:iso9660_t:s0:c1,c3,c4,exec

The context option that will be interpreted by SELinux is
context=system_u:object_r:iso9660_t:s0:c1

instead of
context=system_u:object_r:iso9660_t:s0:c1,c3,c4

The options that will be passed on to the file system will be
ro,c3,c4,exec.

The proposed solution is to allow/require the SELinux context option
specified to mount to use quotes when the context contains a comma.

This patch modifies the option parsing in parse_opts(), contained in
mount.c, to take options after finding a comma only if it hasn't seen a
quote or if the quotes are matched. It also introduces a new function that
will strip the quotes from the context option prior to translation. The
quotes are replaced after the translation is completed to insure that in
the event the raw context contains commas the kernel will be able to
interpret the correct context.

Signed-off-by: Cory Olmo <colmo@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
b20c8122a3204496fca8b5343c93b60fe11dad04 26-Sep-2006 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: fix tty locking

Take tty_mutex when accessing ->signal->tty in selinux code. Noted by Alan
Cox. Longer term, we are looking at refactoring the code to provide better
encapsulation of the tty layer, but this is a simple fix that addresses the
immediate bug.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Alan Cox <alan@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
bc7e982b84aceef0a040c88ff659eb5c83818f72 26-Sep-2006 Eric Paris <eparis@redhat.com> [PATCH] SELinux: convert sbsec semaphore to a mutex

This patch converts the semaphore in the superblock security struct to a
mutex. No locking changes or other code changes are done.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
23970741720360de9dd0a4e87fbeb1d5927aa474 26-Sep-2006 Eric Paris <eparis@redhat.com> [PATCH] SELinux: change isec semaphore to a mutex

This patch converts the remaining isec->sem into a mutex. Very similar
locking is provided as before only in the faster smaller mutex rather than a
semaphore. An out_unlock path is introduced rather than the conditional
unlocking found in the original code.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
296fddf7513c155adbd3a443d12add1f62b5cddb 26-Sep-2006 Eric Paris <eparis@redhat.com> [PATCH] SELinux: eliminate inode_security_set_security

inode_security_set_sid is only called by security_inode_init_security, which
is called when a new file is being created and needs to have its incore
security state initialized and its security xattr set. This helper used to be
called in other places in the past, but now only has the one. So this patch
rolls inode_security_set_sid directly back into security_inode_init_security.
There also is no need to hold the isec->sem while doing this, as the inode is
not available to other threads at this point in time.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
99f59ed073d3c1b890690064ab285a201dea2e35 30-Aug-2006 Paul Moore <paul.moore@hp.com> [NetLabel]: Correctly initialize the NetLabel fields.

Fix a problem where the NetLabel specific fields of the sk_security_struct
structure were not being initialized early enough in some cases.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9a673e563e543a5c8a6f9824562e55e807b8a56c 15-Aug-2006 Adrian Bunk <bunk@stusta.de> [SELINUX]: security/selinux/hooks.c: Make 4 functions static.

This patch makes four needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7420ed23a4f77480b5b7b3245e5da30dd24b7575 05-Aug-2006 Venkat Yekkirala <vyekkirala@TrustedCS.com> [NetLabel]: SELinux support

Add NetLabel support to the SELinux LSM and modify the
socket_post_create() LSM hook to return an error code. The most
significant part of this patch is the addition of NetLabel hooks into
the following SELinux LSM hooks:

* selinux_file_permission()
* selinux_socket_sendmsg()
* selinux_socket_post_create()
* selinux_socket_sock_rcv_skb()
* selinux_socket_getpeersec_stream()
* selinux_socket_getpeersec_dgram()
* selinux_sock_graft()
* selinux_inet_conn_request()

The basic reasoning behind this patch is that outgoing packets are
"NetLabel'd" by labeling their socket and the NetLabel security
attributes are checked via the additional hook in
selinux_socket_sock_rcv_skb(). NetLabel itself is only a labeling
mechanism, similar to filesystem extended attributes, it is up to the
SELinux enforcement mechanism to perform the actual access checks.

In addition to the changes outlined above this patch also includes
some changes to the extended bitmap (ebitmap) and multi-level security
(mls) code to import and export SELinux TE/MLS attributes into and out
of NetLabel.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a51c64f1e5c2876eab2a32955acd9e8015c91c15 28-Jul-2006 Venkat Yekkirala <vyekkirala@TrustedCS.com> [MLSXFRM]: Fix build with SECURITY_NETWORK_XFRM disabled.

The following patch will fix the build problem (encountered by Andrew
Morton) when SECURITY_NETWORK_XFRM is not enabled.

As compared to git-net-selinux_xfrm_decode_session-build-fix.patch in
-mm, this patch sets the return parameter sid to SECSID_NULL in
selinux_xfrm_decode_session() and handles this value in the caller
selinux_inet_conn_request() appropriately.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4237c75c0a35535d7f9f2bfeeb4b4df1e068a0bf 25-Jul-2006 Venkat Yekkirala <vyekkirala@TrustedCS.com> [MLSXFRM]: Auto-labeling of child sockets

This automatically labels the TCP, Unix stream, and dccp child sockets
as well as openreqs to be at the same MLS level as the peer. This will
result in the selection of appropriately labeled IPSec Security
Associations.

This also uses the sock's sid (as opposed to the isec sid) in SELinux
enforcement of secmark in rcv_skb and postroute_last hooks.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
beb8d13bed80f8388f1a9a107d07ddd342e627e8 05-Aug-2006 Venkat Yekkirala <vyekkirala@TrustedCS.com> [MLSXFRM]: Add flow labeling

This labels the flows that could utilize IPSec xfrms at the points the
flows are defined so that IPSec policy and SAs at the right label can
be used.

The following protos are currently not handled, but they should
continue to be able to use single-labeled IPSec like they currently
do.

ipmr
ip_gre
ipip
igmp
sit
sctp
ip6_tunnel (IPv6 over IPv6 tunnel device)
decnet

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e0d1caa7b0d5f02e4f34aa09c695d04251310c6c 25-Jul-2006 Venkat Yekkirala <vyekkirala@TrustedCS.com> [MLSXFRM]: Flow based matching of xfrm policy and state

This implements a seemless mechanism for xfrm policy selection and
state matching based on the flow sid. This also includes the necessary
SELinux enforcement pieces.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
892c141e62982272b9c738b5520ad0e5e1ad7b42 05-Aug-2006 Venkat Yekkirala <vyekkirala@TrustedCS.com> [MLSXFRM]: Add security sid to sock

This adds security for IP sockets at the sock level. Security at the
sock level is needed to enforce the SELinux security policy for
security associations even when a sock is orphaned (such as in the TCP
LAST_ACK state).

This will also be used to enforce SELinux controls over data arriving
at or leaving a child socket while it's still waiting to be accepted.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dc49c1f94e3469d94b952e8f5160dd4ccd791d79 02-Aug-2006 Catherine Zhang <cxzhang@watson.ibm.com> [AF_UNIX]: Kernel memory leak fix for af_unix datagram getpeersec patch

From: Catherine Zhang <cxzhang@watson.ibm.com>

This patch implements a cleaner fix for the memory leak problem of the
original unix datagram getpeersec patch. Instead of creating a
security context each time a unix datagram is sent, we only create the
security context when the receiver requests it.

This new design requires modification of the current
unix_getsecpeer_dgram LSM hook and addition of two new hooks, namely,
secid_to_secctx and release_secctx. The former retrieves the security
context and the latter releases it. A hook is required for releasing
the security context because it is up to the security module to decide
how that's done. In the case of Selinux, it's a simple kfree
operation.

Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: David S. Miller <davem@davemloft.net>
b04ea3cebf79d6808632808072f276dbc98aaf01 14-Jul-2006 Eric Paris <eparis@parisplace.org> [PATCH] Fix security check for joint context= and fscontext= mount options

After some discussion on the actual meaning of the filesystem class
security check in try context mount it was determined that the checks for
the context= mount options were not correct if fscontext mount option had
already been used.

When labeling the superblock we should be checking relabel_from and
relabel_to. But if the superblock has already been labeled (with
fscontext) then context= is actually labeling the inodes, and so we should
be checking relabel_from and associate. This patch fixes which checks are
called depending on the mount options.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
0808925ea5684a0ce25483b30e94d4f398804978 10-Jul-2006 Eric Paris <eparis@parisplace.org> [PATCH] SELinux: add rootcontext= option to label root inode when mounting

Introduce a new rootcontext= option to FS mounting. This option will allow
you to explicitly label the root inode of an FS being mounted before that
FS or inode because visible to userspace. This was found to be useful for
things like stateless linux, see
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=190001

Signed-off-by: Eric Paris <eparis@parisplace.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
c312feb2931ded0582378712727b7ea017a951bd 10-Jul-2006 Eric Paris <eparis@parisplace.org> [PATCH] SELinux: decouple fscontext/context mount options

Remove the conflict between fscontext and context mount options. If
context= is specified without fscontext it will operate just as before, if
both are specified we will use mount point labeling and all inodes will get
the label specified by context=. The superblock will be labeled with the
label of fscontext=, thus affecting operations which check the superblock
security context, such as associate permissions.

Signed-off-by: Eric Paris <eparis@parisplace.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
a1836a42daf5ddfe9a891973734bd9a7d62eb504 30-Jun-2006 David Quigley <dpquigl@tycho.nsa.gov> [PATCH] SELinux: Add security hook definition for getioprio and insert hooks

Add a new security hook definition for the sys_ioprio_get operation. At
present, the SELinux hook function implementation for this hook is
identical to the getscheduler implementation but a separate hook is
introduced to allow this check to be specialized in the future if
necessary.

This patch also creates a helper function get_task_ioprio which handles the
access check in addition to retrieving the ioprio value for the task.

Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
f9008e4c5c525941967b67777945aa6266ab6326 30-Jun-2006 David Quigley <dpquigl@tycho.nsa.gov> [PATCH] SELinux: extend task_kill hook to handle signals sent by AIO completion

This patch extends the security_task_kill hook to handle signals sent by AIO
completion. In this case, the secid of the task responsible for the signal
needs to be obtained and saved earlier, so a security_task_getsecid() hook is
added, and then this saved value is passed subsequently to the extended
task_kill hook for use in checking.

Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
6ab3d5624e172c553004ecc862bfeac16d9d68b7 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de> Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
877ce7c1b3afd69a9b1caeb1b9964c992641f52a 29-Jun-2006 Catherine Zhang <cxzhang@watson.ibm.com> [AF_UNIX]: Datagram getpeersec

This patch implements an API whereby an application can determine the
label of its peer's Unix datagram sockets via the auxiliary data mechanism of
recvmsg.

Patch purpose:

This patch enables a security-aware application to retrieve the
security context of the peer of a Unix datagram socket. The application
can then use this security context to determine the security context for
processing on behalf of the peer who sent the packet.

Patch design and implementation:

The design and implementation is very similar to the UDP case for INET
sockets. Basically we build upon the existing Unix domain socket API for
retrieving user credentials. Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message). To retrieve the security
context, the application first indicates to the kernel such desire by
setting the SO_PASSSEC option via getsockopt. Then the application
retrieves the security context using the auxiliary data mechanism.

An example server application for Unix datagram socket should look like this:

toggle = 1;
toggle_len = sizeof(toggle);

setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
cmsg_hdr->cmsg_level == SOL_SOCKET &&
cmsg_hdr->cmsg_type == SCM_SECURITY) {
memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
}
}

sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow
a server socket to receive security context of the peer.

Testing:

We have tested the patch by setting up Unix datagram client and server
applications. We verified that the server can retrieve the security context
using the auxiliary data mechanism of recvmsg.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c7bdb545d23026b18be53289fd866d1ac07f5f8c 27-Jun-2006 Darrel Goeddel <dgoeddel@trustedcs.com> [NETLINK]: Encapsulate eff_cap usage within security framework.

This patch encapsulates the usage of eff_cap (in netlink_skb_params) within
the security framework by extending security_netlink_recv to include a required
capability parameter and converting all direct usage of eff_caps outside
of the lsm modules to use the interface. It also updates the SELinux
implementation of the security_netlink_send and security_netlink_recv
hooks to take advantage of the sid in the netlink_skb_params struct.
This also enables SELinux to perform auditing of netlink capability checks.
Please apply, for 2.6.18 if possible.

Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
28eba5bf9d4bf3ba4d58d985abf3a2903b7f2125 27-Jun-2006 Michael LeMay <mdlemay@epoch.ncsc.mil> [PATCH] selinux: inherit /proc/self/attr/keycreate across fork

Update SELinux to cause the keycreate process attribute held in
/proc/self/attr/keycreate to be inherited across a fork and reset upon
execve. This is consistent with the handling of the other process
attributes provided by SELinux and also makes it simpler to adapt logon
programs to properly handle the keycreate attribute.

Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
42c3e03ef6b298813557cdb997bd6db619cd65a2 26-Jun-2006 Eric Paris <eparis@redhat.com> [PATCH] SELinux: Add sockcreate node to procattr API

Below is a patch to add a new /proc/self/attr/sockcreate A process may write a
context into this interface and all subsequent sockets created will be labeled
with that context. This is the same idea as the fscreate interface where a
process can specify the label of a file about to be created. At this time one
envisioned user of this will be xinetd. It will be able to better label
sockets for the actual services. At this time all sockets take the label of
the creating process, so all xinitd sockets would just be labeled the same.

I tested this by creating a tcp sender and listener. The sender was able to
write to this new proc file and then create sockets with the specified label.
I am able to be sure the new label was used since the avc denial messages
kicked out by the kernel included both the new security permission
setsockcreate and all the socket denials were for the new label, not the label
of the running process.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4eb582cf1fbd7b9e5f466e3718a59c957e75254e 26-Jun-2006 Michael LeMay <mdlemay@epoch.ncsc.mil> [PATCH] keys: add a way to store the appropriate context for newly-created keys

Add a /proc/<pid>/attr/keycreate entry that stores the appropriate context for
newly-created keys. Modify the selinux_key_alloc hook to make use of the new
entry. Update the flask headers to include a new "setkeycreate" permission
for processes. Update the flask headers to include a new "create" permission
for keys. Use the create permission to restrict which SIDs each task can
assign to newly-created keys. Add a new parameter to the security hook
"security_key_alloc" to indicate whether it is being invoked by the kernel, or
from userspace. If it is being invoked by the kernel, the security hook
should never fail. Update the documentation to reflect these changes.

Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
7e047ef5fe2d52e83020e856b1bf2556a6a2ce98 26-Jun-2006 David Howells <dhowells@redhat.com> [PATCH] keys: sort out key quota system

Add the ability for key creation to overrun the user's quota in some
circumstances - notably when a session keyring is created and assigned to a
process that didn't previously have one.

This means it's still possible to log in, should PAM require the creation of a
new session keyring, and fix an overburdened key quota.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
35601547baf92d984b6e59cf3583649da04baea5 23-Jun-2006 David Quigley <dpquigl@tycho.nsa.gov> [PATCH] SELinux: add task_movememory hook

This patch adds new security hook, task_movememory, to be called when memory
owened by a task is to be moved (e.g. when migrating pages to a this hook is
identical to the setscheduler implementation, but a separate hook introduced
to allow this check to be specialized in the future if necessary.

Since the last posting, the hook has been renamed following feedback from
Christoph Lameter.

Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
03e68060636e05989ea94bcb671ab633948f328c 23-Jun-2006 James Morris <jmorris@namei.org> [PATCH] lsm: add task_setioprio hook

Implement an LSM hook for setting a task's IO priority, similar to the hook
for setting a tasks's nice value.

A previous version of this LSM hook was included in an older version of
multiadm by Jan Engelhardt, although I don't recall it being submitted
upstream.

Also included is the corresponding SELinux hook, which re-uses the setsched
permission in the proccess class.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
726c334223180e3c0197cc980a432681370d4baf 23-Jun-2006 David Howells <dhowells@redhat.com> [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry

Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch. That reduced the significance of
sb->s_root, allowing NFS to place a fake root there. However, NFS does
require a dentry to use as a target for the statfs operation. This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
d720024e94de4e8b7f10ee83c532926f3ad5d708 22-Jun-2006 Michael LeMay <mdlemay@epoch.ncsc.mil> [PATCH] selinux: add hooks for key subsystem

Introduce SELinux hooks to support the access key retention subsystem
within the kernel. Incorporate new flask headers from a modified version
of the SELinux reference policy, with support for the new security class
representing retained keys. Extend the "key_alloc" security hook with a
task parameter representing the intended ownership context for the key
being allocated. Attach security information to root's default keyrings
within the SELinux initialization routine.

Has passed David's testsuite.

Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4e5ab4cb85683cf77b507ba0c4d48871e1562305 09-Jun-2006 James Morris <jmorris@namei.org> [SECMARK]: Add new packet controls to SELinux

Add new per-packet access controls to SELinux, replacing the old
packet controls.

Packets are labeled with the iptables SECMARK and CONNSECMARK targets,
then security policy for the packets is enforced with these controls.

To allow for a smooth transition to the new controls, the old code is
still present, but not active by default. To restore previous
behavior, the old controls may be activated at runtime by writing a
'1' to /selinux/compat_net, and also via the kernel boot parameter
selinux_compat_net. Switching between the network control models
requires the security load_policy permission. The old controls will
probably eventually be removed and any continued use is discouraged.

With this patch, the new secmark controls for SElinux are disabled by
default, so existing behavior is entirely preserved, and the user is
not affected at all.

It also provides a config option to enable the secmark controls by
default (which can always be overridden at boot and runtime). It is
also noted in the kconfig help that the user will need updated
userspace if enabling secmark controls for SELinux and that they'll
probably need the SECMARK and CONNMARK targets, and conntrack protocol
helpers, although such decisions are beyond the scope of kernel
configuration.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3e3ff15e6d8ba931fa9a6c7f9fe711edc77e96e5 09-Jun-2006 Christopher J. PeBenito <cpebenito@tresys.com> [SELINUX]: add security class for appletalk sockets

Add a security class for appletalk sockets so that they can be
distinguished in SELinux policy. Please apply.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c8c05a8eec6f1258f6d5cb71a44ee5dc1e989b63 09-Jun-2006 Catherine Zhang <cxzhang@watson.ibm.com> [LSM-IPsec]: SELinux Authorize

This patch contains a fix for the previous patch that adds security
contexts to IPsec policies and security associations. In the previous
patch, no authorization (besides the check for write permissions to
SAD and SPD) is required to delete IPsec policies and security
assocations with security contexts. Thus a user authorized to change
SAD and SPD can bypass the IPsec policy authorization by simply
deleteing policies with security contexts. To fix this security hole,
an additional authorization check is added for removing security
policies and security associations with security contexts.

Note that if no security context is supplied on add or present on
policy to be deleted, the SELinux module allows the change
unconditionally. The hook is called on deletion when no context is
present, which we may want to change. At present, I left it up to the
module.

LSM changes:

The patch adds two new LSM hooks: xfrm_policy_delete and
xfrm_state_delete. The new hooks are necessary to authorize deletion
of IPsec policies that have security contexts. The existing hooks
xfrm_policy_free and xfrm_state_free lack the context to do the
authorization, so I decided to split authorization of deletion and
memory management of security data, as is typical in the LSM
interface.

Use:

The new delete hooks are checked when xfrm_policy or xfrm_state are
deleted by either the xfrm_user interface (xfrm_get_policy,
xfrm_del_sa) or the pfkey interface (pfkey_spddelete, pfkey_delete).

SELinux changes:

The new policy_delete and state_delete functions are added.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ba0c19ed6a61a96d4b42b81cb19d4bc81b5f728c 04-Jun-2006 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: fix sb_lock/sb_security_lock nesting

From: Stephen Smalley <sds@tycho.nsa.gov>

Fix unsafe nesting of sb_lock inside sb_security_lock in
selinux_complete_init. Detected by the kernel locking validator.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
d4e9dc63dca91cd89086b5a686d7f7635c8319e5 21-May-2006 Alexey Dobriyan <adobriyan@gmail.com> [PATCH] selinux: endian fix

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
30d55280b867aa0cae99f836ad0181bb0bf8f9cb 03-May-2006 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: Clear selinux_enabled flag upon runtime disable.

Clear selinux_enabled flag upon runtime disable of SELinux by userspace,
and make sure it is defined even if selinux= boot parameter support is
not enabled in configuration.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Tested-by: Jon Smirl <jonsmirl@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
9c7aa6aa74fa8a5cda36e54cbbe4fffe0214497d 31-Mar-2006 Steve Grubb <sgrubb@redhat.com> [PATCH] change lspp ipc auditing

Hi,

The patch below converts IPC auditing to collect sid's and convert to context
string only if it needs to output an audit record. This patch depends on the
inode audit change patch already being applied.

Signed-off-by: Steve Grubb <sgrubb@redhat.com>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7cae7e26f245151b9ccad868bf2edf8c8048d307 22-Mar-2006 James Morris <jmorris@namei.org> [PATCH] SELinux: add slab cache for inode security struct

Add a slab cache for the SELinux inode security struct, one of which is
allocated for every inode instantiated by the system.

The memory savings are considerable.

On 64-bit, instead of the size-128 cache, we have a slab object of 96
bytes, saving 32 bytes per object. After booting, I see about 4000 of
these and then about 17,000 after a kernel compile. With this patch, we
save around 530KB of kernel memory in the latter case. On 32-bit, the
savings are about half of this.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cf01efd098597f7ee88a61e645afacba987c4531 22-Mar-2006 James Morris <jmorris@namei.org> [PATCH] SELinux: cleanup stray variable in selinux_inode_init_security()

Remove an unneded pointer variable in selinux_inode_init_security().

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
8aad38752e81d1d4de67e3d8e2524618ce7c9276 22-Mar-2006 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: Disable automatic labeling of new inodes when no policy is loaded

This patch disables the automatic labeling of new inodes on disk
when no policy is loaded.

Discussion is here:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=180296

In short, we're changing the behavior so that when no policy is loaded,
SELinux does not label files at all. Currently it does add an 'unlabeled'
label in this case, which we've found causes problems later.

SELinux always maintains a safe internal label if there is none, so with this
patch, we just stick with that and wait until a policy is loaded before adding
a persistent label on disk.

The effect is simply that if you boot with SELinux enabled but no policy
loaded and create a file in that state, SELinux won't try to set a security
extended attribute on the new inode on the disk. This is the only sane
behavior for SELinux in that state, as it cannot determine the right label to
assign in the absence of a policy. That state usually doesn't occur, but the
rawhide installer seemed to be misbehaving temporarily so it happened to show
up on a test install.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2c7946a7bf45ae86736ab3b43d0085e43947945c 21-Mar-2006 Catherine Zhang <cxzhang@watson.ibm.com> [SECURITY]: TCP/UDP getpeersec

This patch implements an application of the LSM-IPSec networking
controls whereby an application can determine the label of the
security association its TCP or UDP sockets are currently connected to
via getsockopt and the auxiliary data mechanism of recvmsg.

Patch purpose:

This patch enables a security-aware application to retrieve the
security context of an IPSec security association a particular TCP or
UDP socket is using. The application can then use this security
context to determine the security context for processing on behalf of
the peer at the other end of this connection. In the case of UDP, the
security context is for each individual packet. An example
application is the inetd daemon, which could be modified to start
daemons running at security contexts dependent on the remote client.

Patch design approach:

- Design for TCP
The patch enables the SELinux LSM to set the peer security context for
a socket based on the security context of the IPSec security
association. The application may retrieve this context using
getsockopt. When called, the kernel determines if the socket is a
connected (TCP_ESTABLISHED) TCP socket and, if so, uses the dst_entry
cache on the socket to retrieve the security associations. If a
security association has a security context, the context string is
returned, as for UNIX domain sockets.

- Design for UDP
Unlike TCP, UDP is connectionless. This requires a somewhat different
API to retrieve the peer security context. With TCP, the peer
security context stays the same throughout the connection, thus it can
be retrieved at any time between when the connection is established
and when it is torn down. With UDP, each read/write can have
different peer and thus the security context might change every time.
As a result the security context retrieval must be done TOGETHER with
the packet retrieval.

The solution is to build upon the existing Unix domain socket API for
retrieving user credentials. Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message).

Patch implementation details:

- Implementation for TCP
The security context can be retrieved by applications using getsockopt
with the existing SO_PEERSEC flag. As an example (ignoring error
checking):

getsockopt(sockfd, SOL_SOCKET, SO_PEERSEC, optbuf, &optlen);
printf("Socket peer context is: %s\n", optbuf);

The SELinux function, selinux_socket_getpeersec, is extended to check
for labeled security associations for connected (TCP_ESTABLISHED ==
sk->sk_state) TCP sockets only. If so, the socket has a dst_cache of
struct dst_entry values that may refer to security associations. If
these have security associations with security contexts, the security
context is returned.

getsockopt returns a buffer that contains a security context string or
the buffer is unmodified.

- Implementation for UDP
To retrieve the security context, the application first indicates to
the kernel such desire by setting the IP_PASSSEC option via
getsockopt. Then the application retrieves the security context using
the auxiliary data mechanism.

An example server application for UDP should look like this:

toggle = 1;
toggle_len = sizeof(toggle);

setsockopt(sockfd, SOL_IP, IP_PASSSEC, &toggle, &toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
cmsg_hdr->cmsg_level == SOL_IP &&
cmsg_hdr->cmsg_type == SCM_SECURITY) {
memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
}
}

ip_setsockopt is enhanced with a new socket option IP_PASSSEC to allow
a server socket to receive security context of the peer. A new
ancillary message type SCM_SECURITY.

When the packet is received we get the security context from the
sec_path pointer which is contained in the sk_buff, and copy it to the
ancillary message space. An additional LSM hook,
selinux_socket_getpeersec_udp, is defined to retrieve the security
context from the SELinux space. The existing function,
selinux_socket_getpeersec does not suit our purpose, because the
security context is copied directly to user space, rather than to
kernel space.

Testing:

We have tested the patch by setting up TCP and UDP connections between
applications on two machines using the IPSec policies that result in
labeled security associations being built. For TCP, we can then
extract the peer security context using getsockopt on either end. For
UDP, the receiving end can retrieve the security context using the
auxiliary data mechanism of recvmsg.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
7306a0b9b3e2056a616c84841288ca2431a05627 16-Nov-2005 Dustin Kirkland <dustin.kirkland@us.ibm.com> [PATCH] Miscellaneous bug and warning fixes

This patch fixes a couple of bugs revealed in new features recently
added to -mm1:
* fixes warnings due to inconsistent use of const struct inode *inode
* fixes bug that prevent a kernel from booting with audit on, and SELinux off
due to a missing function in security/dummy.c
* fixes a bug that throws spurious audit_panic() messages due to a missing
return just before an error_path label
* some reasonable house cleaning in audit_ipc_context(),
audit_inode_context(), and audit_log_task_context()

Signed-off-by: Dustin Kirkland <dustin.kirkland@us.ibm.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
8c8570fb8feef2bc166bee75a85748b25cda22d9 03-Nov-2005 Dustin Kirkland <dustin.kirkland@us.ibm.com> [PATCH] Capture selinux subject/object context information.

This patch extends existing audit records with subject/object context
information. Audit records associated with filesystem inodes, ipc, and
tasks now contain SELinux label information in the field "subj" if the
item is performing the action, or in "obj" if the item is the receiver
of an action.

These labels are collected via hooks in SELinux and appended to the
appropriate record in the audit code.

This additional information is required for Common Criteria Labeled
Security Protection Profile (LSPP).

[AV: fixed kmalloc flags use]
[folded leak fixes]
[folded cleanup from akpm (kfree(NULL)]
[folded audit_inode_context() leak fix]
[folded akpm's fix for audit_ipc_perm() definition in case of !CONFIG_AUDIT]

Signed-off-by: Dustin Kirkland <dustin.kirkland@us.ibm.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
341c2d806b71cc3596afeb2d9bd26cd718e75202 11-Mar-2006 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: tracer SID fix

Fix SELinux to not reset the tracer SID when the child is already being
traced, since selinux_ptrace is also called by proc for access checking
outside of the context of a ptrace attach.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
c2b507fda390b8ae90deba9b8cdc3fe727482193 05-Feb-2006 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: require SECURITY_NETWORK

Make SELinux depend on SECURITY_NETWORK (which depends on SECURITY), as it
requires the socket hooks for proper operation even in the local case.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
9ac49d22138348198f729f07371ffb11991368e6 01-Feb-2006 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: remove security struct magic number fields and tests

Remove the SELinux security structure magic number fields and tests, along
with some unnecessary tests for NULL security pointers. These fields and
tests are leftovers from the early attempts to support SELinux as a
loadable module during LSM development.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
26d2a4be6a56eec575dac651f6606756a971f0fb 01-Feb-2006 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: change file_alloc_security to use GFP_KERNEL

This patch changes the SELinux file_alloc_security function to use
GFP_KERNEL rather than GFP_ATOMIC; the use of GFP_ATOMIC appears to be a
remnant of when this function was being called with the files_lock spinlock
held, and is no longer necessary. Please apply.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
db4c9641def55d36a6f9df79deb8a949292313ca 01-Feb-2006 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: fix and cleanup mprotect checks

Fix the SELinux mprotect checks on executable mappings so that they are not
re-applied when the mapping is already executable as well as cleaning up
the code. This avoids a situation where e.g. an application is prevented
from removing PROT_WRITE on an already executable mapping previously
authorized via execmem permission due to an execmod denial.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
858119e159384308a5dde67776691a2ebf70df0f 14-Jan-2006 Arjan van de Ven <arjan@infradead.org> [PATCH] Unlinline a bunch of other functions

Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
095975da26dba21698582e91e96be10f7417333f 08-Jan-2006 Nick Piggin <nickpiggin@yahoo.com.au> [PATCH] rcu file: use atomic primitives

Use atomic_inc_not_zero for rcu files instead of special case rcuref.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
d28d1e080132f28ab773291f10ad6acca4c8bba2 14-Dec-2005 Trent Jaeger <tjaeger@cse.psu.edu> [LSM-IPSec]: Per-packet access control.

This patch series implements per packet access control via the
extension of the Linux Security Modules (LSM) interface by hooks in
the XFRM and pfkey subsystems that leverage IPSec security
associations to label packets. Extensions to the SELinux LSM are
included that leverage the patch for this purpose.

This patch implements the changes necessary to the SELinux LSM to
create, deallocate, and use security contexts for policies
(xfrm_policy) and security associations (xfrm_state) that enable
control of a socket's ability to send and receive packets.

Patch purpose:

The patch is designed to enable the SELinux LSM to implement access
control on individual packets based on the strongly authenticated
IPSec security association. Such access controls augment the existing
ones in SELinux based on network interface and IP address. The former
are very coarse-grained, and the latter can be spoofed. By using
IPSec, the SELinux can control access to remote hosts based on
cryptographic keys generated using the IPSec mechanism. This enables
access control on a per-machine basis or per-application if the remote
machine is running the same mechanism and trusted to enforce the
access control policy.

Patch design approach:

The patch's main function is to authorize a socket's access to a IPSec
policy based on their security contexts. Since the communication is
implemented by a security association, the patch ensures that the
security association's negotiated and used have the same security
context. The patch enables allocation and deallocation of such
security contexts for policies and security associations. It also
enables copying of the security context when policies are cloned.
Lastly, the patch ensures that packets that are sent without using a
IPSec security assocation with a security context are allowed to be
sent in that manner.

A presentation available at
www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
from the SELinux symposium describes the overall approach.

Patch implementation details:

The function which authorizes a socket to perform a requested
operation (send/receive) on a IPSec policy (xfrm_policy) is
selinux_xfrm_policy_lookup. The Netfilter and rcv_skb hooks ensure
that if a IPSec SA with a securit y association has not been used,
then the socket is allowed to send or receive the packet,
respectively.

The patch implements SELinux function for allocating security contexts
when policies (xfrm_policy) are created via the pfkey or xfrm_user
interfaces via selinux_xfrm_policy_alloc. When a security association
is built, SELinux allocates the security context designated by the
XFRM subsystem which is based on that of the authorized policy via
selinux_xfrm_state_alloc.

When a xfrm_policy is cloned, the security context of that policy, if
any, is copied to the clone via selinux_xfrm_policy_clone.

When a xfrm_policy or xfrm_state is freed, its security context, if
any is also freed at selinux_xfrm_policy_free or
selinux_xfrm_state_free.

Testing:

The SELinux authorization function is tested using ipsec-tools. We
created policies and security associations with particular security
contexts and added SELinux access control policy entries to verify the
authorization decision. We also made sure that packets for which no
security context was supplied (which either did or did not use
security associations) were authorized using an unlabelled context.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
25a74f3ba8efb394e9a30d6de37566bf03fd3de8 09-Nov-2005 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: disable setxattr on mountpoint labeled filesystems

This patch disables the setting of SELinux xattrs on files created in
filesystems labeled via mountpoint labeling (mounted with the context=
option). selinux_inode_setxattr already prevents explicit setxattr from
userspace on such filesystems, so this provides consistent behavior for
file creation.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
621d31219d9a788bda924a0613048053f3f5f211 31-Oct-2005 Oleg Nesterov <oleg@tv-sign.ru> [PATCH] cleanup the usage of SEND_SIG_xxx constants

This patch simplifies some checks for magic siginfo values. It should not
change the behaviour in any way.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
b67a1b9e4bf878aa5d4b6b44cb5a251a2f425f0d 31-Oct-2005 Oleg Nesterov <oleg@tv-sign.ru> [PATCH] remove hardcoded SEND_SIG_xxx constants

This patch replaces hardcoded SEND_SIG_xxx constants with
their symbolic names.

No changes in affected .o files.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2f51201662b28dbf8c15fb7eb972bc51c6cc3fa5 31-Oct-2005 Eric Dumazet <dada1@cosmosbay.com> [PATCH] reduce sizeof(struct file)

Now that RCU applied on 'struct file' seems stable, we can place f_rcuhead
in a memory location that is not anymore used at call_rcu(&f->f_rcuhead,
file_free_rcu) time, to reduce the size of this critical kernel object.

The trick I used is to move f_rcuhead and f_list in an union called f_u

The callers are changed so that f_rcuhead becomes f_u.fu_rcuhead and f_list
becomes f_u.f_list

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ce4c2bd1a9dfebaefadc2d34b17c6f12101751be 30-Oct-2005 Andrew Morton <akpm@osdl.org> [PATCH] selinux-canonicalize-getxattr-fix

security/selinux/hooks.c: In function `selinux_inode_getxattr':
security/selinux/hooks.c:2193: warning: unused variable `sbsec'

Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
d381d8a9a08cac9824096213069159be17fd2e2f 30-Oct-2005 James Morris <jmorris@namei.org> [PATCH] SELinux: canonicalize getxattr()

This patch allows SELinux to canonicalize the value returned from
getxattr() via the security_inode_getsecurity() hook, which is called after
the fs level getxattr() function.

The purpose of this is to allow the in-core security context for an inode
to override the on-disk value. This could happen in cases such as
upgrading a system to a different labeling form (e.g. standard SELinux to
MLS) without needing to do a full relabel of the filesystem.

In such cases, we want getxattr() to return the canonical security context
that the kernel is using rather than what is stored on disk.

The implementation hooks into the inode_getsecurity(), adding another
parameter to indicate the result of the preceding fs-level getxattr() call,
so that SELinux knows whether to compare a value obtained from disk with
the kernel value.

We also now allow getxattr() to work for mountpoint labeled filesystems
(i.e. mount with option context=foo_t), as we are able to return the
kernel value to the user.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
89d155ef62e5e0c10e4b37aaa5056f0beafe10e6 30-Oct-2005 James Morris <jmorris@namei.org> [PATCH] SELinux: convert to kzalloc

This patch converts SELinux code from kmalloc/memset to the new kazalloc
unction. On i386, this results in a text saving of over 1K.

Before:
text data bss dec hex filename
86319 4642 15236 106197 19ed5 security/selinux/built-in.o

After:
text data bss dec hex filename
85278 4642 15236 105156 19ac4 security/selinux/built-in.o

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
7d877f3bda870ab5f001bd92528654471d5966b3 21-Oct-2005 Al Viro <viro@zeniv.linux.org.uk> [PATCH] gfp_t: net/*

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
13402580021a52e49c6d1068ff28ade4d5a175f1 30-Sep-2005 James Morris <jmorris@namei.org> [PATCH] SELinux - fix SCTP socket bug and general IP protocol handling

The following patch updates the way SELinux classifies and handles IP
based protocols.

Currently, IP sockets are classified by SELinux as being either TCP, UDP
or 'Raw', the latter being a default for IP socket that is not TCP or UDP.

The classification code is out of date and uses only the socket type
parameter to socket(2) to determine the class of IP socket. So, any
socket created with SOCK_STREAM will be classified by SELinux as TCP, and
SOCK_DGRAM as UDP. Also, other socket types such as SOCK_SEQPACKET and
SOCK_DCCP are currently ignored by SELinux, which classifies them as
generic sockets, which means they don't even get basic IP level checking.

This patch changes the SELinux IP socket classification logic, so that
only an IPPROTO_IP protocol value passed to socket(2) classify the socket
as TCP or UDP. The patch also drops the check for SOCK_RAW and converts
it into a default, so that socket types like SOCK_DCCP and SOCK_SEQPACKET
are classified as SECCLASS_RAWIP_SOCKET (instead of generic sockets).

Note that protocol-specific support for SCTP, DCCP etc. is not addressed
here, we're just getting these protocols checked at the IP layer.

This fixes a reported problem where SCTP sockets were being recognized as
generic SELinux sockets yet still being passed in one case to an IP level
check, which then fails for generic sockets.

It will also fix bugs where any SOCK_STREAM socket is classified as TCP or
any SOCK_DGRAM socket is classified as UDP.

This patch also unifies the way IP sockets classes are determined in
selinux_socket_bind(), so we use the already calculated value instead of
trying to recalculate it.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
b835996f628eadb55c5fb222ba46fe9395bf73c7 09-Sep-2005 Dipankar Sarma <dipankar@in.ibm.com> [PATCH] files: lock-free fd look-up

With the use of RCU in files structure, the look-up of files using fds can now
be lock-free. The lookup is protected by rcu_read_lock()/rcu_read_unlock().
This patch changes the readers to use lock-free lookup.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Ravikiran Thirumalai <kiran_th@gmail.com>
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
badf16621c1f9d1ac753be056fce11b43d6e0be5 09-Sep-2005 Dipankar Sarma <dipankar@in.ibm.com> [PATCH] files: break up files struct

In order for the RCU to work, the file table array, sets and their sizes must
be updated atomically. Instead of ensuring this through too many memory
barriers, we put the arrays and their sizes in a separate structure. This
patch takes the first step of putting the file table elements in a separate
structure fdtable that is embedded withing files_struct. It also changes all
the users to refer to the file table using files_fdtable() macro. Subsequent
applciation of RCU becomes easier after this.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
e31e14ec356f36b131576be5bc31d8fef7e95483 09-Sep-2005 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] remove the inode_post_link and inode_post_rename LSM hooks

This patch removes the inode_post_link and inode_post_rename LSM hooks as
they are unused (and likely useless).

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
a74574aafea3a63add3251047601611111f44562 09-Sep-2005 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] Remove security_inode_post_create/mkdir/symlink/mknod hooks

This patch removes the inode_post_create/mkdir/mknod/symlink LSM hooks as
they are obsoleted by the new inode_init_security hook that enables atomic
inode security labeling.

If anyone sees any reason to retain these hooks, please speak now. Also,
is anyone using the post_rename/link hooks; if not, those could also be
removed.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
570bc1c2e5ccdb408081e77507a385dc7ebed7fa 09-Sep-2005 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] tmpfs: Enable atomic inode security labeling

This patch modifies tmpfs to call the inode_init_security LSM hook to set
up the incore inode security state for new inodes before the inode becomes
accessible via the dcache.

As there is no underlying storage of security xattrs in this case, it is
not necessary for the hook to return the (name, value, len) triple to the
tmpfs code, so this patch also modifies the SELinux hook function to
correctly handle the case where the (name, value, len) pointers are NULL.

The hook call is needed in tmpfs in order to support proper security
labeling of tmpfs inodes (e.g. for udev with tmpfs /dev in Fedora). With
this change in place, we should then be able to remove the
security_inode_post_create/mkdir/... hooks safely.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
5e41ff9e0650f327a6c819841fa412da95d57319 09-Sep-2005 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] security: enable atomic inode security labeling

The following patch set enables atomic security labeling of newly created
inodes by altering the fs code to invoke a new LSM hook to obtain the security
attribute to apply to a newly created inode and to set up the incore inode
security state during the inode creation transaction. This parallels the
existing processing for setting ACLs on newly created inodes. Otherwise, it
is possible for new inodes to be accessed by another thread via the dcache
prior to complete security setup (presently handled by the
post_create/mkdir/... LSM hooks in the VFS) and a newly created inode may be
left unlabeled on the disk in the event of a crash. SELinux presently works
around the issue by ensuring that the incore inode security label is
initialized to a special SID that is inaccessible to unprivileged processes
(in accordance with policy), thereby preventing inappropriate access but
potentially causing false denials on legitimate accesses. A simple test
program demonstrates such false denials on SELinux, and the patch solves the
problem. Similar such false denials have been encountered in real
applications.

This patch defines a new inode_init_security LSM hook to obtain the security
attribute to apply to a newly created inode and to set up the incore inode
security state for it, and adds a corresponding hook function implementation
to SELinux.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
216efaaaa006d2f3ecbb5bbc2b6673423813254e 16-Aug-2005 James Morris <jmorris@namei.org> [SELINUX]: Update for tcp_diag rename to inet_diag.

Also, support dccp sockets.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
911656f8a630e36b22c7e2bba3317dec9174209c 29-Jul-2005 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] selinux: Fix address length checks in connect hook

This patch fixes the address length checks in the selinux_socket_connect
hook to be no more restrictive than the underlying ipv4 and ipv6 code;
otherwise, this hook can reject valid connect calls. This patch is in
response to a bug report where an application was calling connect on an
INET6 socket with an address that didn't include the optional scope id and
failing due to these checks.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
f5c1d5b2aaf9a98f15a6dcdfbba1f494d0aaae52 28-Jul-2005 James Morris <jmorris@redhat.com> [PATCH] SELinux: default labeling of MLS field

Implement kernel labeling of the MLS (multilevel security) field of
security contexts for files which have no existing MLS field. This is to
enable upgrades of a system from non-MLS to MLS without performing a full
filesystem relabel including all of the mountpoints, which would be quite
painful for users.

With this patch, with MLS enabled, if a file has no MLS field, the kernel
internally adds an MLS field to the in-core inode (but not to the on-disk
file). This MLS field added is the default for the superblock, allowing
per-mountpoint control over the values via fixed policy or mount options.

This patch has been tested by enabling MLS without relabeling its
filesystem, and seems to be working correctly.

Signed-off-by: James Morris <jmorris@redhat.com>
Signed-off-by: Stephen Smalley <sds@epoch.ncsc.mil>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
6931dfc9f3f81d148b7ed0ab3fd796f8b986a995 30-Jun-2005 Eric Paris <eparis@parisplace.org> [PATCH] selinux_sb_copy_data() should not require a whole page

Currently selinux_sb_copy_data requires an entire page be allocated to
*orig when the function is called. This "requirement" is based on the fact
that we call copy_page(in_save, nosec_save) and in_save = orig when the
data is not FS_BINARY_MOUNTDATA. This means that if a caller were to call
do_kern_mount with only about 10 bytes of options, they would get passed
here and then we would corrupt PAGE_SIZE - 10 bytes of memory (with all
zeros.)

Currently it appears all in kernel FS's use one page of data so this has
not been a problem. An out of kernel FS did just what is described above
and it would almost always panic shortly after they tried to mount. From
looking else where in the kernel it is obvious that this string of data
must always be null terminated. (See example in do_mount where it always
zeros the last byte.) Thus I suggest we use strcpy in place of copy_page.
In this way we make sure the amount we copy is always less than or equal to
the amount we received and since do_mount is zeroing the last byte this
should be safe for all.

Signed-off-by: Eric Paris <eparis@parisplace.org>
Cc: Stephen Smalley <sds@epoch.ncsc.mil>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
9a5f04bf798254390f89445ecf0b6f4c70ddc1f8 25-Jun-2005 Jesper Juhl <juhl-lkml@dif.dk> [PATCH] selinux: kfree cleanup

kfree(NULL) is legal.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
09ffd94fb15d85fbf9eebb8180f50264b264d6fe 25-Jun-2005 Lorenzo Hern�ndez Garc�a-Hierro <lorenzo@gnu.org> [PATCH] selinux: add executable heap check

This patch,based on sample code by Roland McGrath, adds an execheap
permission check that controls the ability to make the heap executable so
that this can be prevented in almost all cases (the X server is presently
an exception, but this will hopefully be resolved in the future) so that
even programs with execmem permission will need to have the anonymous
memory mapped in order to make it executable.

The only reason that we use a permission check for such restriction (vs.
making it unconditional) is that the X module loader presently needs it; it
could possibly be made unconditional in the future when X is changed.

The policy patch for the execheap permission is available at:
http://pearls.tuxedo-es.org/patches/selinux/policy-execheap.patch

Signed-off-by: Lorenzo Hernandez Garcia-Hierro <lorenzo@gnu.org>
Acked-by: James Morris <jmorris@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
6b9921976f0861e04828b3aff66696c1f3fd900d 25-Jun-2005 Lorenzo Hernandez Garc�a-Hierro <lorenzo@gnu.org> [PATCH] selinux: add executable stack check

This patch adds an execstack permission check that controls the ability to
make the main process stack executable so that attempts to make the stack
executable can still be prevented even if the process is allowed the
existing execmem permission in order to e.g. perform runtime code
generation. Note that this does not yet address thread stacks. Note also
that unlike the execmem check, the execstack check is only applied on
mprotect calls, not mmap calls, as the current security_file_mmap hook is
not passed the necessary information presently.

The original author of the code that makes the distinction of the stack
region, is Ingo Molnar, who wrote it within his patch for
/proc/<pid>/maps markers.
(http://marc.theaimsgroup.com/?l=linux-kernel&m=110719881508591&w=2)

The patches also can be found at:
http://pearls.tuxedo-es.org/patches/selinux/policy-execstack.patch
http://pearls.tuxedo-es.org/patches/selinux/kernel-execstack.patch

policy-execstack.patch is the patch that needs to be applied to the policy in
order to support the execstack permission and exclude it
from general_domain_access within macros/core_macros.te.

kernel-execstack.patch adds such permission to the SELinux code within
the kernel and adds the proper permission check to the selinux_file_mprotect() hook.

Signed-off-by: Lorenzo Hernandez Garcia-Hierro <lorenzo@gnu.org>
Acked-by: James Morris <jmorris@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
9ad9ad385be27fcc7c16d290d972c6173e780a61 22-Jun-2005 David Woodhouse <dwmw2@shinybook.infradead.org> AUDIT: Wait for backlog to clear when generating messages.

Add a gfp_mask to audit_log_start() and audit_log(), to reduce the
amount of GFP_ATOMIC allocation -- most of it doesn't need to be
GFP_ATOMIC. Also if the mask includes __GFP_WAIT, then wait up to
60 seconds for the auditd backlog to clear instead of immediately
abandoning the message.

The timeout should probably be made configurable, but for now it'll
suffice that it only happens if auditd is actually running.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
da3caa204ca40c32dcb751ebead2a6835b83e8d1 22-Jun-2005 Gerald Schaefer <geraldsc@de.ibm.com> [PATCH] SELinux: memory leak in selinux_sb_copy_data()

There is a memory leak during mount when SELinux is active and mount
options are specified.

Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Acked-by: Stephen Smalley <sds@epoch.ncsc.mil>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
c04049939f88b29e235d2da217bce6e8ead44f32 13-May-2005 Steve Grubb <sgrubb@redhat.com> AUDIT: Add message types to audit records

This patch adds more messages types to the audit subsystem so that audit
analysis is quicker, intuitive, and more useful.

Signed-off-by: Steve Grubb <sgrubb@redhat.com>
---
I forgot one type in the big patch. I need to add one for user space
originating SE Linux avc messages. This is used by dbus and nscd.

-Steve
---
Updated to 2.6.12-rc4-mm1.
-dwmw2

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
6af963f1d6789ef20abca5696cd52a758b396e52 01-May-2005 Stephen Smalley <sds@tycho.nsa.gov> [PATCH] SELinux: cleanup ipc_has_perm

This patch removes the sclass argument from ipc_has_perm in the SELinux
module, as it can be obtained from the ipc security structure. The use of
a separate argument was a legacy of the older precondition function
handling in SELinux and is obsolete. Please apply.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
0d3d077cd4f1154e63a9858e47fe3fb1ad0c03e5 25-Apr-2005 Herbert Xu <herbert@gondor.apana.org.au> [SELINUX]: Fix ipv6_skip_exthdr() invocation causing OOPS.

The SELinux hooks invoke ipv6_skip_exthdr() with an incorrect
length final argument. However, the length argument turns out
to be superfluous.

I was just reading ipv6_skip_exthdr and it occured to me that we can
get rid of len altogether. The only place where len is used is to
check whether the skb has two bytes for ipv6_opt_hdr. This check
is done by skb_header_pointer/skb_copy_bits anyway.

Now it might appear that we've made the code slower by deferring
the check to skb_copy_bits. However, this check should not trigger
in the common case so this is OK.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
0c9b79429c83a404a04908be65baa9d97836bbb6 17-Apr-2005 James Morris <jmorris@redhat.com> [PATCH] SELinux: add support for NETLINK_KOBJECT_UEVENT

This patch adds SELinux support for the KOBJECT_UEVENT Netlink family, so
that SELinux can apply finer grained controls to it. For example, security
policy for hald can be locked down to the KOBJECT_UEVENT Netlink family
only. Currently, this family simply defaults to the default Netlink socket
class.

Note that some new permission definitions are added to sync with changes in
the core userspace policy package, which auto-generates header files.

Signed-off-by: James Morris <jmorris@redhat.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!