Cross Reference: /fs/file.c

History log of /fs/file.c
Revision	Date	Author	Comments
e983094d6dce524f3890edfec44b7ca6dbfa1183	31-Aug-2014	Al Viro <viro@zeniv.linux.org.uk>	missing annotation in fs/file.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
bde6c3aa993066acb0d6ce32ecabe03b9d5df92d	01-Jul-2014	Paul E. McKenney <paulmck@linux.vnet.ibm.com>	rcu: Provide cond_resched_rcu_qs() to force quiescent states in long loops RCU-tasks requires the occasional voluntary context switch from CPU-bound in-kernel tasks. In some cases, this requires instrumenting cond_resched(). However, there is some reluctance to countenance unconditionally instrumenting cond_resched() (see http://lwn.net/Articles/603252/), so this commit creates a separate cond_resched_rcu_qs() that may be used in place of cond_resched() in locations prone to long-duration in-kernel looping. This commit currently instruments only RCU-tasks. Future possibilities include also instrumenting RCU, RCU-bh, and RCU-sched in order to reduce IPI usage. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
f6c0a1920e0180175bd5e8e4aff8ea5556f1895d	23-Apr-2014	Al Viro <viro@zeniv.linux.org.uk>	fs/file.c: don't open-code kvfree() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7f4b36f9bb930b3b2105a9a2cb0121fa7028c432	14-Mar-2014	Al Viro <viro@zeniv.linux.org.uk>	get rid of files_defer_init() the only thing it's doing these days is calculation of upper limit for fs.nr_open sysctl and that can be done statically Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
99aea68134f3c2a27b4d463c91cfa298c3efaccf	16-Mar-2014	Eric Biggers <ebiggers3@gmail.com>	vfs: Don't let __fdget_pos() get FMODE_PATH files Commit bd2a31d522344 ("get rid of fget_light()") introduced the __fdget_pos() function, which returns the resulting file pointer and fdput flags combined in an 'unsigned long'. However, it also changed the behavior to return files with FMODE_PATH set, which shouldn't happen because read(), write(), lseek(), etc. aren't allowed on such files. This commit restores the old behavior. This regression actually had no effect on read() and write() since FMODE_READ and FMODE_WRITE are not set on file descriptors opened with O_PATH, but it did cause lseek() on a file descriptor opened with O_PATH to fail with ESPIPE rather than EBADF. Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
bd2a31d522344b3ac2fb680bd2366e77a9bd8209	04-Mar-2014	Al Viro <viro@zeniv.linux.org.uk>	get rid of fget_light() instead of returning the flags by reference, we can just have the low-level primitive return those in lower bits of unsigned long, with struct file * derived from the rest. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
add1f0995454374d90c9d6b2c420d2fba3d0a4e3	12-Feb-2014	Paul E. McKenney <paulmck@linux.vnet.ibm.com>	fs: Substitute rcu_access_pointer() for rcu_dereference_raw() (Trivial patch.) If the code is looking at the RCU-protected pointer itself, but not dereferencing it, the rcu_dereference() functions can be downgraded to rcu_access_pointer(). This commit makes this downgrade in __alloc_fd(), which simply compares the RCU-protected pointer against NULL with no dereferencing. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Josh Triplett <josh@joshtriplett.org>
96c7a2ff21501691587e1ae969b83cbec8b78e08	10-Feb-2014	Eric W. Biederman <ebiederm@xmission.com>	fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem Recently due to a spike in connections per second memcached on 3 separate boxes triggered the OOM killer from accept. At the time the OOM killer was triggered there was 4GB out of 36GB free in zone 1. The problem was that alloc_fdtable was allocating an order 3 page (32KiB) to hold a bitmap, and there was sufficient fragmentation that the largest page available was 8KiB. I find the logic that PAGE_ALLOC_COSTLY_ORDER can't fail pretty dubious but I do agree that order 3 allocations are very likely to succeed. There are always pathologies where order > 0 allocations can fail when there are copious amounts of free memory available. Using the pigeon hole principle it is easy to show that it requires 1 page more than 50% of the pages being free to guarantee an order 1 (8KiB) allocation will succeed, 1 page more than 75% of the pages being free to guarantee an order 2 (16KiB) allocation will succeed and 1 page more than 87.5% of the pages being free to guarantee an order 3 allocate will succeed. A server churning memory with a lot of small requests and replies like memcached is a common case that if anything can will skew the odds against large pages being available. Therefore let's not give external applications a practical way to kill linux server applications, and specify __GFP_NORETRY to the kmalloc in alloc_fdmem. Unless I am misreading the code and by the time the code reaches should_alloc_retry in __alloc_pages_slowpath (where __GFP_NORETRY becomes signification). We have already tried everything reasonable to allocate a page and the only thing left to do is wait. So not waiting and falling back to vmalloc immediately seems like the reasonable thing to do even if there wasn't a chance of triggering the OOM killer. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Cong Wang <cwang@twopensource.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e6ff9a9fa4e05c1c03dec63cdc6a87d6dea02755	13-Jan-2014	Oleg Nesterov <oleg@redhat.com>	fs: __fget_light() can use __fget() in slow path The slow path in __fget_light() can use __fget() to avoid the code duplication. Saves 232 bytes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ad46183445043b562856c60b74db664668fb364b	13-Jan-2014	Oleg Nesterov <oleg@redhat.com>	fs: factor out common code in fget_light() and fget_raw_light() Apart from FMODE_PATH check fget_light() and fget_raw_light() are identical, shift the code into the new helper, __fget_light(fd, mask). Saves 208 bytes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1deb46e2562561255c34075825fd00f22a858bb3	13-Jan-2014	Oleg Nesterov <oleg@redhat.com>	fs: factor out common code in fget() and fget_raw() Apart from FMODE_PATH check fget() and fget_raw() are identical, shift the code into the new simple helper, __fget(fd, mask). Saves 160 bytes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ce08b62d18b3f97cd4e5a39bd5898872b9201875	11-Jan-2014	Oleg Nesterov <oleg@redhat.com>	change close_files() to use rcu_dereference_raw(files->fdt) put_files_struct() and close_files() do rcu_read_lock() to make rcu_dereference_check_fdtable() happy. This looks a bit ugly, files_fdtable() just reads the pointer, we can simply use rcu_dereference_raw() to avoid the warning. The patch also changes close_files() to return fdt, this avoids another rcu_read_lock()/files_fdtable() in put_files_struct(). I think close_files() needs more cleanups: - we do not need xchg() exactly because we are the last user of this files_struct - "if (file)" should be turned into WARN_ON(!file) Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a8d4b8345e0ee48b732126d980efaf0dc373e2b0	11-Jan-2014	Oleg Nesterov <oleg@redhat.com>	introduce __fcheck_files() to fix rcu_dereference_check_fdtable(), kill rcu_my_thread_group_empty() rcu_dereference_check_fdtable() looks very wrong, 1. rcu_my_thread_group_empty() was added by 844b9a8707f1 "vfs: fix RCU-lockdep false positive due to /proc" but it doesn't really fix the problem. A CLONE_THREAD (without CLONE_FILES) task can hit the same race with get_files_struct(). And otoh rcu_my_thread_group_empty() can suppress the correct warning if the caller is the CLONE_FILES (without CLONE_THREAD) task. 2. files->count == 1 check is not really right too. Even if this files_struct is not shared it is not safe to access it lockless unless the caller is the owner. Otoh, this check is sub-optimal. files->count == 0 always means it is safe to use it lockless even if files != current->files, but put_files_struct() has to take rcu_read_lock(). See the next patch. This patch removes the buggy checks and turns fcheck_files() into __fcheck_files() which uses rcu_dereference_raw(), the "unshared" callers, fget_light() and fget_raw_light(), can use it to avoid the warning from RCU-lockdep. fcheck_files() is trivially reimplemented as rcu_lockdep_assert() plus __fcheck_files(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ac3e3c5b1164397656df81b9e9ab4991184d3236	29-Apr-2013	Al Viro <viro@zeniv.linux.org.uk>	don't bother with deferred freeing of fdtables Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
eece09ec213e93333010bf4c6bb9175b32229c54	17-Jul-2011	Thomas Gleixner <tglx@linutronix.de>	locking: Various static lock initializer fixes The static lock initializers want to be fed the proper name of the lock and not some random string. In mainline random strings are obfuscating the readability of debug output, but for RT they prevent the spinlock substitution. Fix it up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
6ae141718e3f9c7e2c620e999c86612a7f415bb1	22-Dec-2012	Greg Kroah-Hartman <gregkh@linuxfoundation.org>	misc: remove __dev* attributes. CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the last of the __dev* markings from the kernel from a variety of different, tiny, places. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
a77cfcb429ed98845a4e4df72473b8f37acd890b	30-Nov-2012	Al Viro <viro@zeniv.linux.org.uk>	fix off-by-one in argument passed by iterate_fd() to callbacks Noticed by Pavel Roskin; the thing in his patch I disagree with was compensating for that shite in callbacks instead of fixing it once in the iterator itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
c4144670fd9b34d6eae22c9f83751745898e8243	02-Oct-2012	Al Viro <viro@zeniv.linux.org.uk>	kill daemonize() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
5a8477660d9ddc090203736d7271137265cb25bb	12-Nov-2012	Al Viro <viro@zeniv.linux.org.uk>	kill bogus BUG_ON() in do_close_on_exec() It can be legitimately triggered via procfs access. Now, at least 2 of 3 of get_files_struct() callers in procfs are useless, but when and if we get rid of those we can always add WARN_ON() here. BUG_ON() at that spot is simply wrong. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
08f05c49749ee655bef921d12160960a273aad47	31-Oct-2012	Al Viro <viro@ZenIV.linux.org.uk>	Return the right error value when dup[23]() newfd argument is too large Jack Lin reports that the error return from dup3() for the RLIMIT_NOFILE case changed incorrectly after 3.6. The culprit is commit f33ff9927f42 ("take rlimit check to callers of expand_files()") which when it moved the "return -EMFILE" out to the caller, didn't notice that the dup3() had special code to turn the EMFILE return into EBADF. The replace_fd() helper that got added later then inherited the bug too. Reported-by: Jack Lin <linliangjie@huawei.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [ Noted more bugs, wrote proper changelog, fixed up typos - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
aed976475bff939672b0e21595839c445dcec0fa	09-Oct-2012	Richard W.M. Jones <rjones@redhat.com>	dup3: Return an error when oldfd == newfd. I have tested the attached patch to fix the dup3 regression. Rich. From 0944e30e12dec6544b3602626b60ff412375c78f Mon Sep 17 00:00:00 2001 From: "Richard W.M. Jones" <rjones@redhat.com> Date: Tue, 9 Oct 2012 14:42:45 +0100 Subject: [PATCH] dup3: Return an error when oldfd == newfd. The following commit: commit fe17f22d7fd0e344ef6447238f799bb49f670c6f Author: Al Viro <viro@zeniv.linux.org.uk> Date: Tue Aug 21 11:48:11 2012 -0400 take purely descriptor-related stuff from fcntl.c to file.c was supposed to be just code motion, but it dropped the following two lines: if (unlikely(oldfd == newfd)) return -EINVAL; from the dup3 system call. dup3 is not specified by POSIX, so Linux can do what it likes. However the POSIX proposal for dup3 [1] states that it should return an error if oldfd == newfd. [1] http://austingroupbugs.net/view.php?id=411 Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
4557c669ef9801d96cf663331cdd1dcb8fa9c2f1	28-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	export fget_light Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
864bdb3b6cbd9911222543fef1cfe36f88183f44	23-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	new helper: daemonize_descriptors() descriptor-related parts of daemonize, done right. As the result we simplify the locking rules for ->files - we hold task_lock in all cases when we modify ->files. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
c3c073f808b22dfae15ef8412b6f7b998644139a	22-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	new helper: iterate_fd() iterates through the opened files in given descriptor table, calling a supplied function; we stop once non-zero is returned. Callback gets struct file *, descriptor number and const void * argument passed to iterator. It is called with files->file_lock held, so it is not allowed to block. tty_io, netprio_cgroup and selinux flush_unauthorized_files() converted to its use. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ad47bd7252bf402fe7dba92f5240b5ed16832ae7	22-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	make expand_files() and alloc_fd() static no callers outside of fs/file.c left Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
b8318b01a8f7f760ae3ecae052ccc7fc123d9508	22-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	take __{set,clear}_{open_fd,close_on_exec}() into fs/file.c nobody uses those outside anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8280d16172243702ed43432f826ca6130edb4086	21-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	new helper: replace_fd() analog of dup2(), except that it takes struct file * as source. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
fe17f22d7fd0e344ef6447238f799bb49f670c6f	21-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	take purely descriptor-related stuff from fcntl.c to file.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
6a6d27de340c89c5323565b49f7851362619925d	21-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	take close-on-exec logics to fs/file.c, clean it up a bit ... and add cond_resched() there, while we are at it. We can get large latencies as is... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
483ce1d4b8c3b82bc9c9a1dd9dbc44f50b3aaf5a	19-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	take descriptor-related part of close() to file.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
0ee8cdfe6af052deb56dccd54838a1eb32fb4ca2	16-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	take fget() and friends to fs/file.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
f869e8a7f753e3fd43d6483e796774776f645edb	16-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	expose a low-level variant of fd_install() for binder Similar situation to that of __alloc_fd(); do not use unless you really have to. You should not touch any descriptor table other than your own; it's a sure sign of a really bad API design. As with __alloc_fd(), you must use a first-class reference to struct files_struct; something obtained by get_files_struct(some task) (let alone direct task->files) will not do. It must be either current->files, or obtained by get_files_struct(current) by the owner of that sucker and given to you. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
56007cae94f349387c088e738c7dcb6bc513063b	16-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	move put_unused_fd() and fd_install() to fs/file.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1983e781da2f7f77906f4ccc2c3dc279cd61d1ff	16-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	trim free_fdtable_rcu() embedded case isn't hit anymore Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
b9e02af0ae0783894abb576fbab45ec29aa8e7fc	16-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	don't bother with call_rcu() in put_files_struct() At that point nobody can see us anyway; everything that looks at files_fdtable(files) is separated from the guts of put_files_struct(files) - either since files is current->files or because we fetched it under task_lock() and hadn't dropped that yet, or because we'd bumped files->count while holding task_lock()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7cf4dc3c8dbfdfde163d4636f621cf99a1f63bfb	16-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	move files_struct-related bits from kernel/exit.c to fs/file.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
dcfadfa4ec5a12404a99ad6426871a6b03a62b37	12-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	new helper: __alloc_fd() Essentially, alloc_fd() in a files_struct we own a reference to. Most of the time wanting to use it is a sign of lousy API design (such as android/binder). It's not a general-purpose interface; better that than open-coding its guts, but again, playing with other process' descriptor table is a sign of bad design. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
f33ff9927f42045116d738ee47ff7bc59f739bd7	12-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	take rlimit check to callers of expand_files() ... except for one in android, where the check is different and already done in caller. No need to recalculate rlimit many times in alloc_fd() either. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1a7bd2265fc57f29400d57f66275cc5918e30aa6	12-Aug-2012	Al Viro <viro@zeniv.linux.org.uk>	make get_unused_fd_flags() a function ... and get_unused_fd() a macro around it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
630d9c47274aa89bfa77fe6556d7818bdcb12992	17-Nov-2011	Paul Gortmaker <paul.gortmaker@windriver.com>	fs: reduce the use of module.h wherever possible For files only using THIS_MODULE and/or EXPORT_SYMBOL, map them onto including export.h -- or if the file isn't even using those, then just delete the include. Fix up any implicit include dependencies that were being masked by module.h along the way. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
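
Illustration for revision bd2a31d522344 ("get rid of fget_light()"): that entry describes returning the fdput flags in the lower bits of an unsigned long, with the struct file * derived from the rest. The userspace sketch below shows only the general packing technique, not the kernel code. Every name in it (demo_file, demo_pack, the DEMO_FLAG_* bits) is hypothetical. It assumes that unsigned long is as wide as a pointer, as on Linux, and that the pointed-to object is aligned to at least 4 bytes, so the two low bits of its address are zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct file; illustration only. */
struct demo_file {
	int id;
};

/* Hypothetical flag bits carried in the two low bits of the packed word. */
#define DEMO_FLAG_PUT   1UL	/* caller must drop a reference afterwards */
#define DEMO_FLAG_LOCK  2UL	/* caller must release a lock afterwards */
#define DEMO_FLAG_MASK  3UL

/* Combine an aligned pointer and flag bits into a single word. */
static unsigned long demo_pack(struct demo_file *f, unsigned long flags)
{
	uintptr_t p = (uintptr_t)f;

	/* The technique only works if the pointer's low bits are free. */
	assert((p & DEMO_FLAG_MASK) == 0);
	/* Only the reserved low bits may be used for flags. */
	assert((flags & ~DEMO_FLAG_MASK) == 0);
	return (unsigned long)p | flags;
}

/* Recover the pointer by masking the flag bits off. */
static struct demo_file *demo_file_of(unsigned long v)
{
	return (struct demo_file *)(uintptr_t)(v & ~DEMO_FLAG_MASK);
}

/* Recover the flags by keeping only the low bits. */
static unsigned long demo_flags_of(unsigned long v)
{
	return v & DEMO_FLAG_MASK;
}
```

A caller would pack once with demo_pack(&f, DEMO_FLAG_PUT) and then split the word with demo_file_of() and demo_flags_of(). Compared with returning the flags by reference, a single return value needs no out-parameter and can stay in a register.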