4dcc03fc4553943c8bae086aa826bb75171e82a1 |
|
08-Aug-2014 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: make proc_subdir_lock static Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
dbcdb504417ae108a20454ef89776a614b948571 |
|
08-Aug-2014 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: add and remove /proc entry create checks * remove proc_create(NULL, ...) check, let it oops * warn about proc_create("", ...) and proc_create("very very long name", ...) proc code keeps length as u8, no 256+ name length possible * warn about proc_create("123", ...) /proc/$PID and /proc/misc namespaces are separate things, but dumb module might create funky a-la $PID entry. * remove post mortem strchr('/') check Triggering it implies either strchr() is buggy or memory corruption. It should be VFS check anyway. In reality, none of these checks will ever trigger, it is preparation for the next patch. Based on patch from Al Viro. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
cdf7e8dded6212cb29f758017a613e4eefc4ce9e |
|
24-Jan-2014 |
Rui Xiang <rui.xiang@huawei.com> |
proc: set attributes of pde using accessor functions Use existing accessors proc_set_user() and proc_set_size() to set attributes. Just a cleanup. Signed-off-by: Rui Xiang <rui.xiang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
b26d4cd385fc51e8844e2cdf9ba2051f5bba11a5 |
|
26-Oct-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
consolidate simple ->d_delete() instances Rename simple_delete_dentry() to always_delete_dentry() and export it. Export simple_dentry_operations, while we are at it, and get rid of their duplicates Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
fd3930f70c8d14008f3377d51ce039806dfc542e |
|
20-Aug-2013 |
Linus Torvalds <torvalds@linux-foundation.org> |
proc: more readdir conversion bug-fixes In the previous commit, Richard Genoud fixed proc_root_readdir(), which had lost the check for whether all of the non-process /proc entries had been returned or not. But that in turn exposed _another_ bug, namely that the original readdir conversion patch had yet another problem: it had lost the return value of proc_readdir_de(), so now checking whether it had completed successfully or not didn't actually work right anyway. This reinstates the non-zero return for the "end of base entries" that had also gotten lost in commit f0c3b5093add ("[readdir] convert procfs"). So now you get all the base entries *and* you get all the process entries, regardless of getdents buffer size. (Side note: the Linux "getdents" manual page actually has a nice example application for testing getdents, which can be easily modified to use different buffers. Who knew? Man-pages can be useful) Reported-by: Emmanuel Benisty <benisty.e@gmail.com> Reported-by: Marc Dionne <marc.c.dionne@gmail.com> Cc: Richard Genoud <richard.genoud@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
f0c3b5093addc8bfe9fe3a5b01acb7ec7969eafa |
|
16-May-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
[readdir] convert procfs Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
c30480b92cf497aa3b463367a82f1c2fdc5c46e9 |
|
12-Apr-2013 |
David Howells <dhowells@redhat.com> |
proc: Make the PROC_I() and PDE() macros internal to procfs Make the PROC_I() and PDE() macros internal to procfs. This means making PDE_DATA() out of line. This could be made more optimal by storing PDE()->data into inode->i_private. Also provide a __PDE_DATA() that is inline and internal to procfs. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
a8ca16ea7b0abb0a7e49492d1123b715f0ec62e8 |
|
12-Apr-2013 |
David Howells <dhowells@redhat.com> |
proc: Supply a function to remove a proc entry by PDE Supply a function (proc_remove()) to remove a proc entry (and any subtree rooted there) by proc_dir_entry pointer rather than by name and (optionally) root dir entry pointer. This allows us to eliminate all remaining pde->name accesses outside of procfs. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Grant Likely <grant.likely@linaro.or> cc: linux-acpi@vger.kernel.org cc: openipmi-developer@lists.sourceforge.net cc: devicetree-discuss@lists.ozlabs.org cc: linux-pci@vger.kernel.org cc: netdev@vger.kernel.org cc: netfilter-devel@vger.kernel.org cc: alsa-devel@alsa-project.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
4a520d2769beb736ba2bd084b8293ce148a1a7ae |
|
12-Apr-2013 |
David Howells <dhowells@redhat.com> |
proc: Supply an accessor for getting the data from a PDE's parent Supply an accessor function for getting the private data from the parent proc_dir_entry struct of the proc_dir_entry struct associated with an inode. ReiserFS, for instance, stores the super_block pointer in the proc directory it makes for that super_block, and a pointer to the respective seq_file show function in each of the proc files in that directory. This allows a reduction in the number of file_operations structs, open functions and seq_operations structs required. The problem otherwise is that each show function requires two pieces of data but only has storage for one per PDE (and this has no release function). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> cc: Jerry Chuang <jerry-chuang@realtek.com> cc: Maxim Mikityanskiy <maxtram95@gmail.com> cc: YAMANE Toshiaki <yamanetoshi@gmail.com> cc: linux-wireless@vger.kernel.org cc: linux-scsi@vger.kernel.org cc: devel@driverdev.osuosl.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
270b5ac2151707c25d3327722c5badfbd95945bc |
|
12-Apr-2013 |
David Howells <dhowells@redhat.com> |
proc: Add proc_mkdir_data() Add proc_mkdir_data() to allow procfs directories to be created that are annotated at the time of creation with private data rather than doing this post-creation. This means no access is then required to the proc_dir_entry struct to set this. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> cc: Neela Syam Kolli <megaraidlinux@lsi.com> cc: Jerry Chuang <jerry-chuang@realtek.com> cc: linux-scsi@vger.kernel.org cc: devel@driverdev.osuosl.org cc: linux-wireless@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
271a15eabe094538d958dc68ccfc9c36b699247a |
|
12-Apr-2013 |
David Howells <dhowells@redhat.com> |
proc: Supply PDE attribute setting accessor functions Supply accessor functions to set attributes in proc_dir_entry structs. The following are supplied: proc_set_size() and proc_set_user(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> cc: linuxppc-dev@lists.ozlabs.org cc: linux-media@vger.kernel.org cc: netdev@vger.kernel.org cc: linux-wireless@vger.kernel.org cc: linux-pci@vger.kernel.org cc: netfilter-devel@vger.kernel.org cc: alsa-devel@alsa-project.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
3cb5bf1bf947d325fcf6e9458952b51cfd7e6677 |
|
11-Apr-2013 |
David Howells <dhowells@redhat.com> |
proc: Delete create_proc_read_entry() Delete create_proc_read_entry() as it no longer has any users. Also delete read_proc_t, write_proc_t, the read_proc member of the proc_dir_entry struct and the support functions that use them. This saves a pointer for every PDE allocated. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
866ad9a747bbf5461739fcae6d0a41c8971bbe1d |
|
04-Apr-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
procfs: preparations for remove_proc_entry() race fixes * leave ->proc_fops alone; make ->pde_users negative instead * trim pde_opener * move relevant code in fs/proc/inode.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
ad147d011f4e9d4e4309f7974fd19c7f875ccb14 |
|
04-Apr-2013 |
David Howells <dhowells@redhat.com> |
procfs: Clean up huge if-statement in __proc_file_read() Switch huge if-statement in __proc_file_read() around. This then puts the single line loop break immediately after the if-statement and allows us to de-indent the huge comment and make it take fewer lines. The code following the if-statement then follows naturally from the call to dp->read_proc(). Signed-off-by: David Howells <dhowells@redhat.com>
|
80e928f7ebb958f4d79d4099d1c5c0a015a23b93 |
|
04-Apr-2013 |
David Howells <dhowells@redhat.com> |
proc: Kill create_proc_entry() Kill create_proc_entry() in favour of create_proc_read_entry(), proc_create() and proc_create_data(). Signed-off-by: David Howells <dhowells@redhat.com>
|
d9dda78bad879595d8c4220a067fc029d6484a16 |
|
01-Apr-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
procfs: new helper - PDE_DATA(inode) The only part of proc_dir_entry the code outside of fs/proc really cares about is PDE(inode)->data. Provide a helper for that; static inline for now, eventually will be moved to fs/proc, along with the knowledge of struct proc_dir_entry layout. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
ee21ed0afc2f47007fbd8b22928ecb17316e13e2 |
|
31-Mar-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
procfs: kill ->write_proc() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
b6cdc7310338e204224f865918f774eb6db0b75d |
|
31-Mar-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
procfs: don't allow to use proc_create, create_proc_entry, etc. for directories Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
8ce584c7416d8a85a6f3edc17d1cddefe331e87e |
|
31-Mar-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
procfs: add proc_remove_subtree() just what it sounds like; do that only to procfs subtrees you've created - doing that to something shared with another driver is not only antisocial, but might cause interesting races with proc_create() and its ilk. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
87ebdc00eeb474615496d5f10eed46709e25c707 |
|
28-Feb-2013 |
Andrew Morton <akpm@linux-foundation.org> |
fs/proc: clean up printks - use pr_foo() throughout - remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...") - nuke a few warnings which I've never seen happen, ever. Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
d3d009cb965eae7e002ea5badf603ea8f4c34915 |
|
26-Jan-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
saner proc_get_inode() calling conventions Make it drop the pde in *all* cases when no new reference to it is put into an inode - both when an inode had already been set up (as we were already doing) and when inode allocation has failed. Makes for simpler logics in callers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
496ad9aa8ef448058e36ca7a787c61f2e63f0f54 |
|
23-Jan-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
dfb2ea45becb198beeb75350d0b7b7ad9076a38f |
|
22-Dec-2012 |
Eric W. Biederman <ebiederm@xmission.com> |
proc: Allow proc_free_inum to be called from any context While testing the pid namespace code I hit this nasty warning. [ 176.262617] ------------[ cut here ]------------ [ 176.263388] WARNING: at /home/eric/projects/linux/linux-userns-devel/kernel/softirq.c:160 local_bh_enable_ip+0x7a/0xa0() [ 176.265145] Hardware name: Bochs [ 176.265677] Modules linked in: [ 176.266341] Pid: 742, comm: bash Not tainted 3.7.0userns+ #18 [ 176.266564] Call Trace: [ 176.266564] [<ffffffff810a539f>] warn_slowpath_common+0x7f/0xc0 [ 176.266564] [<ffffffff810a53fa>] warn_slowpath_null+0x1a/0x20 [ 176.266564] [<ffffffff810ad9ea>] local_bh_enable_ip+0x7a/0xa0 [ 176.266564] [<ffffffff819308c9>] _raw_spin_unlock_bh+0x19/0x20 [ 176.266564] [<ffffffff8123dbda>] proc_free_inum+0x3a/0x50 [ 176.266564] [<ffffffff8111d0dc>] free_pid_ns+0x1c/0x80 [ 176.266564] [<ffffffff8111d195>] put_pid_ns+0x35/0x50 [ 176.266564] [<ffffffff810c608a>] put_pid+0x4a/0x60 [ 176.266564] [<ffffffff8146b177>] tty_ioctl+0x717/0xc10 [ 176.266564] [<ffffffff810aa4d5>] ? wait_consider_task+0x855/0xb90 [ 176.266564] [<ffffffff81086bf9>] ? default_spin_lock_flags+0x9/0x10 [ 176.266564] [<ffffffff810cab0a>] ? remove_wait_queue+0x5a/0x70 [ 176.266564] [<ffffffff811e37e8>] do_vfs_ioctl+0x98/0x550 [ 176.266564] [<ffffffff810b8a0f>] ? recalc_sigpending+0x1f/0x60 [ 176.266564] [<ffffffff810b9127>] ? __set_task_blocked+0x37/0x80 [ 176.266564] [<ffffffff810ab95b>] ? sys_wait4+0xab/0xf0 [ 176.266564] [<ffffffff811e3d31>] sys_ioctl+0x91/0xb0 [ 176.266564] [<ffffffff810a95f0>] ? task_stopped_code+0x50/0x50 [ 176.266564] [<ffffffff81939199>] system_call_fastpath+0x16/0x1b [ 176.266564] ---[ end trace 387af88219ad6143 ]--- It turns out that spin_unlock_bh(proc_inum_lock) is not safe when put_pid is called with another spinlock held and irqs disabled. For now take the easy path and use spin_lock_irqsave(proc_inum_lock) in proc_free_inum and spin_loc_irq in proc_alloc_inum(proc_inum_lock). Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
ee297209bf0a25c6717b7c063e76795142d32f37 |
|
21-Dec-2012 |
Xiaotian Feng <xtfeng@gmail.com> |
proc: fix inconsistent lock state Lockdep found an inconsistent lock state when rcu is processing delayed work in softirq. Currently, kernel is using spin_lock/spin_unlock to protect proc_inum_ida, but proc_free_inum is called by rcu in softirq context. Use spin_lock_bh/spin_unlock_bh fix following lockdep warning. ================================= [ INFO: inconsistent lock state ] 3.7.0 #36 Not tainted --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes: (proc_inum_lock){+.?...}, at: proc_free_inum+0x1c/0x50 {SOFTIRQ-ON-W} state was registered at: __lock_acquire+0x8ae/0xca0 lock_acquire+0x199/0x200 _raw_spin_lock+0x41/0x50 proc_alloc_inum+0x4c/0xd0 alloc_mnt_ns+0x49/0xc0 create_mnt_ns+0x25/0x70 mnt_init+0x161/0x1c7 vfs_caches_init+0x107/0x11a start_kernel+0x348/0x38c x86_64_start_reservations+0x131/0x136 x86_64_start_kernel+0x103/0x112 irq event stamp: 2993422 hardirqs last enabled at (2993422): _raw_spin_unlock_irqrestore+0x55/0x80 hardirqs last disabled at (2993421): _raw_spin_lock_irqsave+0x29/0x70 softirqs last enabled at (2993394): _local_bh_enable+0x13/0x20 softirqs last disabled at (2993395): call_softirq+0x1c/0x30 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(proc_inum_lock); <Interrupt> lock(proc_inum_lock); *** DEADLOCK *** no locks held by swapper/1/0. stack backtrace: Pid: 0, comm: swapper/1 Not tainted 3.7.0 #36 Call Trace: <IRQ> [<ffffffff810a40f1>] ? vprintk_emit+0x471/0x510 print_usage_bug+0x2a5/0x2c0 mark_lock+0x33b/0x5e0 __lock_acquire+0x813/0xca0 lock_acquire+0x199/0x200 _raw_spin_lock+0x41/0x50 proc_free_inum+0x1c/0x50 free_pid_ns+0x1c/0x50 put_pid_ns+0x2e/0x50 put_pid+0x4a/0x60 delayed_put_pid+0x12/0x20 rcu_process_callbacks+0x462/0x790 __do_softirq+0x1b4/0x3b0 call_softirq+0x1c/0x30 do_softirq+0x59/0xd0 irq_exit+0x54/0xd0 smp_apic_timer_interrupt+0x95/0xa3 apic_timer_interrupt+0x72/0x80 cpuidle_enter_tk+0x10/0x20 cpuidle_enter_state+0x17/0x50 cpuidle_idle_call+0x287/0x520 cpu_idle+0xba/0x130 start_secondary+0x2b3/0x2bc Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
46f69557103e11fb963ae5c98b7777e90493241b |
|
15-Dec-2012 |
Marco Stornelli <marco.stornelli@gmail.com> |
procfs: drop vmtruncate Removed vmtruncate Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
33d6dce607573b5fd7a43168e0d91221b3ca532b |
|
17-Jun-2011 |
Eric W. Biederman <ebiederm@xmission.com> |
proc: Generalize proc inode allocation Generalize the proc inode allocation so that it can be used without having to having to create a proc_dir_entry. This will allow namespace file descriptors to remain light weight entitities but still have the same inode number when the backing namespace is the same. Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
17baa2a2c40f2f0330d25819b589a515d36b2d40 |
|
05-Oct-2012 |
yan <clouds.yan@gmail.com> |
proc: use kzalloc instead of kmalloc and memset Part of the memory will be written twice after this change, but that should be negligible. [akpm@linux-foundation.org: fix __proc_create() coding-style issues, remove unneeded zero-initialisations] Signed-off-by: yan <clouds.yan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
620727506dc6da0562fa4f6950dedb8a51bd8237 |
|
05-Oct-2012 |
yan <clouds.yan@gmail.com> |
proc: return -ENOMEM when inode allocation failed If proc_get_inode() returns NULL then presumably it encountered memory exhaustion. proc_lookup_de() should return -ENOMEM in this case, not -EINVAL. Signed-off-by: yan <clouds.yan@gmail.com> Cc: Ryan Mallon <rmallon@gmail.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
00cd8dd3bf95f2cc8435b4cac01d9995635c6d0b |
|
10-Jun-2012 |
Al Viro <viro@zeniv.linux.org.uk> |
stop passing nameidata to ->lookup() Just the flags; only NFS cares even about that, but there are legitimate uses for such argument. And getting rid of that completely would require splitting ->lookup() into a couple of methods (at least), so let's leave that alone for now... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
d161a13f974c72fd7ff0069d39a3ae57cb5694ff |
|
24-Jul-2011 |
Al Viro <viro@zeniv.linux.org.uk> |
switch procfs to umode_t use both proc_dir_entry ->mode and populating functions Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
bfe8684869601dacfcb2cd69ef8cfd9045f62170 |
|
28-Oct-2011 |
Miklos Szeredi <mszeredi@suse.cz> |
filesystems: add set_nlink() Replace remaining direct i_nlink updates with a new set_nlink() updater function. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
09570f914914d2beb0db29c5a9c7344934f2fa8c |
|
27-Jul-2011 |
David Howells <dhowells@redhat.com> |
proc: make struct proc_dir_entry::name a terminal array rather than a pointer Since __proc_create() appends the name it is given to the end of the PDE structure that it allocates, there isn't a need to store a name pointer. Instead we can just replace the name pointer with a terminal char array of _unspecified_ length. The compiler will simply append the string to statically defined variables of PDE type overlapping any hole at the end of the structure and, unlike specifying an explicitly _zero_ length array, won't give a warning if you try to statically initialise it with a string of more than zero length. Also, whilst we're at it: (1) Move namelen to end just prior to name and reduce it to a single byte (name shouldn't be longer than NAME_MAX). (2) Move pde_unload_lock two places further on so that if it's four bytes in size on a 64-bit machine, it won't cause an unused hole in the PDE struct. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
011159a0a746e03ae42d559ce5c2a70138da3129 |
|
13-May-2011 |
Alexey Dobriyan <adobriyan@gmail.com> |
airo: correct proc entry creation interfaces * use proc_mkdir_mode() instead of create_proc_entry(S_IFDIR|...), export proc_mkdir_mode() for that, oh well. * don't supply S_IFREG to proc_create_data(), it's unnecessary Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
312ec7e50c4d3f40b3762af651d1aa79a67f556a |
|
24-Mar-2011 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: make struct proc_dir_entry::namelen unsigned int 1. namelen is declared "unsigned short" which hints for "maybe space savings". Indeed in 2.4 struct proc_dir_entry looked like: struct proc_dir_entry { unsigned short low_ino; unsigned short namelen; Now, low_ino is "unsigned int", all savings were gone for a long time. "struct proc_dir_entry" is not that countless to worry about it's size, anyway. 2. converting from unsigned short to int/unsigned int can only create problems, we better play it safe. Space is not really conserved, because of natural alignment for the next field. sizeof(struct proc_dir_entry) remains the same. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
3740a20c4fe8697cb604a7d51395d23472b1768f |
|
13-Jan-2011 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: less LOCK/UNLOCK in remove_proc_entry() For the common case where a proc entry is being removed and nobody is in the process of using it, save a LOCK/UNLOCK pair. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
6d1b6e4eff89475785f60fa00f65da780f869f36 |
|
13-Jan-2011 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: ->low_ino cleanup - ->low_ino is write-once field -- reading it under locks is unnecessary. - /proc/$PID stuff never reaches pde_put()/free_proc_entry() -- PROC_DYNAMIC_FIRST check never triggers. - in proc_get_inode(), inode number always matches proc dir entry, so save one parameter. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
fb045adb99d9b7c562dc7fef834857f78249daa1 |
|
07-Jan-2011 |
Nick Piggin <npiggin@kernel.dk> |
fs: dcache reduce branches in lookup path Reduce some branches and memory accesses in dcache lookup by adding dentry flags to indicate common d_ops are set, rather than having to check them. This saves a pointer memory access (dentry->d_op) in common path lookup situations, and saves another pointer load and branch in cases where we have d_op but not the particular operation. Patched with: git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i Signed-off-by: Nick Piggin <npiggin@kernel.dk>
|
fe15ce446beb3a33583af81ffe6c9d01a75314ed |
|
07-Jan-2011 |
Nick Piggin <npiggin@kernel.dk> |
fs: change d_delete semantics Change d_delete from a dentry deletion notification to a dentry caching advise, more like ->drop_inode. Require it to be constant and idempotent, and not take d_lock. This is how all existing filesystems use the callback anyway. This makes fine grained dentry locking of dput and dentry lru scanning much simpler. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
|
1025774ce411f2bd4b059ad7b53f0003569b74fa |
|
04-Jun-2010 |
Christoph Hellwig <hch@lst.de> |
remove inode_setattr Replace inode_setattr with opencoded variants of it in all callers. This moves the remaining call to vmtruncate into the filesystem methods where it can be replaced with the proper truncate sequence. In a few cases it was obvious that we would never end up calling vmtruncate so it was left out in the opencoded variant: spufs: explicitly checks for ATTR_SIZE earlier btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier ufs: contains an opencoded simple_seattr + truncate that sets the filesize just above In addition to that ncpfs called inode_setattr with handcrafted iattrs, which allowed to trim down the opencoded variant. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
57f87869f073929f8e8b3c73748aabb0cece19aa |
|
26-May-2010 |
Amerigo Wang <amwang@redhat.com> |
proc: remove obsolete comments A quick test shows these comments are obsolete, so just remove them. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
5a0e3ad6af8660be21ca98a971cd00f331318c05 |
|
24-Mar-2010 |
Tejun Heo <tj@kernel.org> |
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
12bac0d9f4dbf3445a0319beee848d15fa32775e |
|
05-Mar-2010 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: warn on non-existing proc entries * warn if creation goes on to non-existent directory * warn if removal goes on from non-existing directory * warn if non-existing proc entry is removed Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
e17a5765f20d1219c3f05eb17aab11671978e0ec |
|
05-Mar-2010 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: do translation + unlink atomically at remove_proc_entry() remove_proc_entry() does lock lookup parent unlock lock unlink proc entry from lists unlock which can be made bit more correct by doing parent translation + unlink without dropping lock. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
587d4a17d837ac0f17edb26f1b6c80c0abda6343 |
|
30-Dec-2009 |
Helight.Xu <helight.xu@gmail.com> |
some clean up in fs/proc EXPORT_SYMBOL(proc_symlink); EXPORT_SYMBOL(proc_mkdir); EXPORT_SYMBOL(create_proc_entry); EXPORT_SYMBOL(proc_create_data); EXPORT_SYMBOL(remove_proc_entry); Those EXPORT_SYMBOL shouldn't be in fs/proc/root.c, should be in fs/proc/generic.c. Signed-off-by: Helight.Xu <helight.xu@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
135d5655dc58a24eda64e3f6c192d7d605e10050 |
|
16-Dec-2009 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: rename de_get() to pde_get() and inline it * de_get() is trivial -- make inline, save a few bits of code, drop "refcount is 0" check -- it should be done in some generic refcount code, don't recall it's was helpful * rename GET and PUT functions to pde_get(), pde_put() for cool prefix! * remove obvious and incorrent comments * in remove_proc_entry() use pde_put(), when I fixed PDE refcounting to be normal one, remove_proc_entry() was supposed to do "-1" and code now reflects that. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
3dec7f59c370c7b58184d63293c3dc984d475840 |
|
20-Feb-2009 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc 1/2: do PDE usecounting even for ->read_proc, ->write_proc struct proc_dir_entry::owner is going to be removed. Now it's only necessary to protect PDEs which are using ->read_proc, ->write_proc hooks. However, ->owner assignments are racy and make it very easy for someone to switch ->owner on live PDE (as some subsystems do) without fixing refcounts and so on. http://bugzilla.kernel.org/show_bug.cgi?id=12454 So, ->owner is on death row. Proxy file operations exist already (proc_file_operations), just bump usecount when necessary. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
|
1681bc30f272dd2fe347b90468791b05c7044f03 |
|
13-Jan-2009 |
Randy Dunlap <randy.dunlap@oracle.com> |
proc: move fs/proc/inode-alloc.txt comment into a source file so that people will realize that it exists and can update it as needed. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
|
d72f71eb0edd629c95715aa7305b0259d3581e34 |
|
20-Feb-2009 |
Al Viro <viro@zeniv.linux.org.uk> |
constify dentry_operations: procfs Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
b4df2b92d8461444fac429c75ba6e125c63056bc |
|
27-Oct-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: stop using BKL There are four BKL users in proc: de_put(), proc_lookup_de(), proc_readdir_de(), proc_root_readdir(), 1) de_put() ----------- de_put() is classic atomic_dec_and_test() refcount wrapper -- no BKL needed. BKL doesn't matter to possible refcount leak as well. 2) proc_lookup_de() ------------------- Walking PDE list is protected by proc_subdir_lock(), proc_get_inode() is potentially blocking, all callers of proc_lookup_de() eventually end up from ->lookup hooks which is protected by directory's ->i_mutex -- BKL doesn't protect anything. 3) proc_readdir_de() -------------------- "." and ".." part doesn't need BKL, walking PDE list is under proc_subdir_lock, calling filldir callback is potentially blocking because it writes to luserspace. All proc_readdir_de() callers eventually come from ->readdir hook which is under directory's ->i_mutex -- BKL doesn't protect anything. 4) proc_root_readdir_de() ------------------------- proc_root_readdir_de is ->readdir hook, see (3). Since readdir hooks doesn't use BKL anymore, switch to generic_file_llseek, since it also takes directory's i_mutex. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
|
6c2f91e077f1b60e7f83b7ee044f965f469cfdb3 |
|
14-Sep-2008 |
Arjan van de Ven <arjan@linux.intel.com> |
proc: use WARN() rather than printk+backtrace Use WARN() rather than a printk() + backtrace(); this gives a more standard format message as well as complete information (including line numbers etc) that will be collected by kerneloops.org Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
|
665020c35e89a9e0643e21561e4f8f967f4f2c4b |
|
13-Sep-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: more debugging for "already registered" case Print parent directory name as well. The aim is to catch non-creation of parent directory when proc_mkdir will return NULL and all subsequent registrations go directly in /proc instead of intended directory. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Fixed insane printk string while at it. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
cc996099174dc05b35b7a29301026987990e7f8c |
|
02-Aug-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
[PATCH] proc: inode number fixlet Ouch, if number taken from IDA is too big, the intent was to signal an error, not check for overflow and still do overflowing addition. One still needs 2^28 proc entries to notice this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
9a18540915faaaadd7f71c16fa877a0c19675923 |
|
26-Jul-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
[PATCH 2/2] proc: switch inode number allocation to IDA proc doesn't use "associate pointer with id" feature of IDR, so switch to IDA. NOTE, NOTE, NOTE: Do not apply if release_inode_number() still mantions MAX_ID_MASK! Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
67935df49dae836fa86621861979fafdfd37ae59 |
|
26-Jul-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
[PATCH 1/2] proc: fix inode number bogorithmetic Id which proc gets from IDR for inode number and id which proc removes from IDR do not match. E.g. 0x11a transforms into 0x8000011a. Which stayed unnoticed for a long time because, surprise, idr_remove() masks out that high bit before doing anything. All of this due to "| ~MAX_ID_MASK" in release_inode_number(). I still don't understand how it's supposed to work, because "| ~MASK" is not an inversion for "& MAX" operation. So, use just one nice, working addition. Make start offset unsigned int, while I'm at it. It's longness is not used anywhere. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
267e2a9c71b8e088ac307f9549f71468e86e26c1 |
|
26-Jul-2008 |
Arjan van de Ven <arjan@linux.intel.com> |
Use WARN() in fs/proc/ Use WARN() instead of a printk+WARN_ON() pair; this way the message becomes part of the warning section for better reporting/collection. This way, the entire if() {} section can collapse into the WARN() as well. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
881adb85358309ea9c6f707394002719982ec607 |
|
25-Jul-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: always do ->release Current two-stage scheme of removing PDE emphasizes one bug in proc: open rmmod remove_proc_entry close ->release won't be called because ->proc_fops were cleared. In simple cases it's small memory leak. For every ->open, ->release has to be done. List of openers is introduced which is traversed at remove_proc_entry() if neeeded. Discussions with Al long ago (sigh). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
78e92b99ec4eb73755abd4e357b0b211eadafd88 |
|
02-May-2008 |
Denis V. Lunev <den@openvz.org> |
netns: assign PDE->data before gluing entry into /proc tree In this unfortunate case, proc_mkdir_mode wrapper can't be used anymore and this is no way to reuse proc_create_data due to nlinks assignment. So, copy the code from proc_mkdir and assign PDE->data at the appropriate moment. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
59b7435149eab2dd06dd678742faff6049cb655f |
|
29-Apr-2008 |
Denis V. Lunev <den@openvz.org> |
proc: introduce proc_create_data to setup de->data This set of patches fixes an proc ->open'less usage due to ->proc_fops flip in the most part of the kernel code. The original OOPS is described in the commit 2d3a4e3666325a9709cc8ea2e88151394e8f20fc: Typical PDE creation code looks like: pde = create_proc_entry("foo", 0, NULL); if (pde) pde->proc_fops = &foo_proc_fops; Notice that PDE is first created, only then ->proc_fops is set up to final value. This is a problem because right after creation a) PDE is fully visible in /proc , and b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's possible to ->read without ->open (see one class of oopses below). The fix is new API called proc_create() which makes sure ->proc_fops are set up before gluing PDE to main tree. Typical new code looks like: pde = proc_create("foo", 0, NULL, &foo_proc_fops); if (!pde) return -ENOMEM; Fix most networking users for a start. In the long run, create_proc_entry() for regular files will go. In addition to this, proc_create_data is introduced to fix reading from proc without PDE->data. The race is basically the same as above. create_proc_entries is replaced in the entire kernel code as new method is also simply better. This patch: The problem is the same as for de->proc_fops. Right now PDE becomes visible without data set. So, the entry could be looked up without data. This, in most cases, will simply OOPS. proc_create_data call is created to address this issue. proc_create now becomes a wrapper around it. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Chris Mason <chris.mason@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jaroslav Kysela <perex@suse.cz> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Karsten Keil <kkeil@suse.de> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Len Brown <lenb@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Neil Brown <neilb@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Osterlund <petero2@telia.com> Cc: Pierre Peiffer <peifferp@gmail.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
8731f14d37825b54ad0c4c309cba2bc8fdf13a86 |
|
29-Apr-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
proc: remove ->get_info infrastructure Now that last dozen or so users of ->get_info were removed, ditch it too. Everyone sane shouldd have switched to seq_file interface long ago. P.S.: Co-existing 3 interfaces (->get_info/->read_proc/->proc_fops) for proc is long-standing crap, BTW, thus a) put ->read_proc/->write_proc/read_proc_entry() users on death row, b) new such users should be rejected, c) everyone is encouraged to convert his favourite ->read_proc user or I'll do it, lazy bastards. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
5e971dce0b2f6896e02372512df0d1fb0bfe2d55 |
|
29-Apr-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: drop several "PDE valid/invalid" checks proc-misc code is noticeably full of "if (de)" checks when PDE passed is always valid. Remove them. Addition of such check in proc_lookup_de() is for failed lookup case. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
7cee4e00e0f8aa7290266382ea903a5a1b92c9a1 |
|
29-Apr-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: less special case in xlate code If valid "parent" is passed to proc_create/remove_proc_entry(), then name of PDE should consist of only one path component, otherwise creation or or removal will fail. However, if NULL is passed as parent then create/remove accept full path as a argument. This is arbitrary restriction -- all infrastructure is in place. So, patch allows the following to succeed: create_proc_entry("foo/bar", 0, pde_baz); remove_proc_entry("baz/foo/bar", &proc_root); Also makes the following to behave identically: create_proc_entry("foo/bar", 0, NULL); create_proc_entry("foo/bar", 0, &proc_root); Discrepancy noticed by Den Lunev (IIRC). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
f649d6d32605c7573884613289fb3b9fbd4f99a1 |
|
29-Apr-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
proc: simplify locking in remove_proc_entry() proc_subdir_lock protects only modifying and walking through PDE lists, so after we've found PDE to remove and actually removed it from lists, there is no need to hold proc_subdir_lock for the rest of operation. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
e93b4ea20adb20f1f1f07f10ba5d7dd739d2843e |
|
29-Apr-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
proc: print more information when removing non-empty directories This usually saves one recompile to insert similar printk like below. :) Sample nastygram: remove_proc_entry: removing non-empty directory '/proc/foo', leaking at least 'bar' ------------[ cut here ]------------ WARNING: at fs/proc/generic.c:776 remove_proc_entry+0x18a/0x200() Modules linked in: foo(-) container fan battery dock sbs ac sbshc backlight ipv6 loop af_packet amd_rng sr_mod i2c_amd8111 i2c_amd756 cdrom i2c_core button thermal processor Pid: 3034, comm: rmmod Tainted: G M 2.6.25-rc1 #5 Call Trace: [<ffffffff80231974>] warn_on_slowpath+0x64/0x90 [<ffffffff80232a6e>] printk+0x4e/0x60 [<ffffffff802d6c8a>] remove_proc_entry+0x18a/0x200 [<ffffffff8045cd88>] mutex_lock_nested+0x1c8/0x2d0 [<ffffffff8025f0f0>] __try_stop_module+0x0/0x40 [<ffffffff8025effd>] sys_delete_module+0x14d/0x200 [<ffffffff8045df3d>] lockdep_sys_exit_thunk+0x35/0x67 [<ffffffff8031c307>] __up_read+0x27/0xa0 [<ffffffff8045decc>] trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8020b6ab>] system_call_after_swapgs+0x7b/0x80 ---[ end trace 10ef850597e89c54 ]--- Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
e9720acd728a46cb40daa52c99a979f7c4ff195c |
|
07-Mar-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[NET]: Make /proc/net a symlink on /proc/self/net (v3) Current /proc/net is done with so called "shadows", but current implementation is broken and has little chances to get fixed. The problem is that dentries subtree of /proc/net directory has fancy revalidation rules to make processes living in different net namespaces see different entries in /proc/net subtree, but currently, tasks see in the /proc/net subdir the contents of any other namespace, depending on who opened the file first. The proposed fix is to turn /proc/net into a symlink, which points to /proc/self/net, which in turn shows what previously was in /proc/net - the network-related info, from the net namespace the appropriate task lives in. # ls -l /proc/net lrwxrwxrwx 1 root root 8 Mar 5 15:17 /proc/net -> self/net In other words - this behaves like /proc/mounts, but unlike "mounts", "net" is not a file, but a directory. Changes from v2: * Fixed discrepancy of /proc/net nlink count and selinux labeling screwup pointed out by Stephen. To get the correct nlink count the ->getattr callback for /proc/net is overridden to read one from the net->proc_net entry. To make selinux still work the net->proc_net entry is initialized properly, i.e. with the "net" name and the proc_net parent. Selinux fixes are Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Changes from v1: * Fixed a task_struct leak in get_proc_task_net, pointed out by Paul. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
2d3a4e3666325a9709cc8ea2e88151394e8f20fc |
|
08-Feb-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
proc: fix ->open'less usage due to ->proc_fops flip Typical PDE creation code looks like: pde = create_proc_entry("foo", 0, NULL); if (pde) pde->proc_fops = &foo_proc_fops; Notice that PDE is first created, only then ->proc_fops is set up to final value. This is a problem because right after creation a) PDE is fully visible in /proc , and b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's possible to ->read without ->open (see one class of oopses below). The fix is new API called proc_create() which makes sure ->proc_fops are set up before gluing PDE to main tree. Typical new code looks like: pde = proc_create("foo", 0, NULL, &foo_proc_fops); if (!pde) return -ENOMEM; Fix most networking users for a start. In the long run, create_proc_entry() for regular files will go. BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024 printing eip: c1188c1b *pdpt = 000000002929e001 *pde = 0000000000000000 Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC last sysfs file: /sys/block/sda/sda1/dev Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom Pid: 24679, comm: cat Not tainted (2.6.24-rc3-mm1 #2) EIP: 0060:[<c1188c1b>] EFLAGS: 00210002 CPU: 0 EIP is at mutex_lock_nested+0x75/0x25d EAX: 000006fe EBX: fffffffb ECX: 00001000 EDX: e9340570 ESI: 00000020 EDI: 00200246 EBP: e9340570 ESP: e8ea1ef8 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process cat (pid: 24679, ti=E8EA1000 task=E9340570 task.ti=E8EA1000) Stack: 00000000 c106f7ce e8ee05b4 00000000 00000001 458003d0 f6fb6f20 fffffffb 00000000 c106f7aa 00001000 c106f7ce 08ae9000 f6db53f0 00000020 00200246 00000000 00000002 00000000 00200246 00200246 e8ee05a0 fffffffb e8ee0550 Call Trace: [<c106f7ce>] seq_read+0x24/0x28a [<c106f7aa>] seq_read+0x0/0x28a [<c106f7ce>] seq_read+0x24/0x28a [<c106f7aa>] seq_read+0x0/0x28a [<c10818b8>] proc_reg_read+0x60/0x73 [<c1081858>] proc_reg_read+0x0/0x73 [<c105a34f>] vfs_read+0x6c/0x8b [<c105a6f3>] sys_read+0x3c/0x63 [<c10025f2>] sysenter_past_esp+0x5f/0xa5 [<c10697a7>] destroy_inode+0x24/0x33 ======================= INFO: lockdep is turned off. Code: 75 21 68 e1 1a 19 c1 68 87 00 00 00 68 b8 e8 1f c1 68 25 73 1f c1 e8 84 06 e9 ff e8 52 b8 e7 ff 83 c4 10 9c 5f fa e8 28 89 ea ff <f0> fe 4e 04 79 0a f3 90 80 7e 04 00 7e f8 eb f0 39 76 34 74 33 EIP: [<c1188c1b>] mutex_lock_nested+0x75/0x25d SS:ESP 0068:e8ea1ef8 [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
94413d8807a3c511a3675be4ce27a4d16d6408ee |
|
08-Feb-2008 |
Zhang Rui <rui.zhang@intel.com> |
proc: detect duplicate names on registration Print a warning if PDE is registered with a name which already exists in target directory. Bug report and a simple fix can be found here: http://bugzilla.kernel.org/show_bug.cgi?id=8798 [\n fixlet and no undescriptive variable usage --adobriyan] [akpm@linux-foundation.org: make printk comprehensible] Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
fd2cbe48883a01f710c2a639877e3b3e4eba6e59 |
|
08-Feb-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
proc: remove useless check on symlink removal proc symlinks always have valid ->data containing destination of symlink. No need to check it on removal -- proc_symlink() already done it. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
76df0c25d0c34eba9fbb8a44106ed096553ba0e8 |
|
08-Feb-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
proc: simplify function prototypes Move code around so as to reduce the number of forward-declarations. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
4237e0d3de38da640d7c977d68f5f7f1d207a631 |
|
08-Feb-2008 |
Alexey Dobriyan <adobriyan@sw.ru> |
proc: less LOCK operations during lookup Pseudo-code for lookup effectively is: LOCK kernel LOCK proc_subdir_lock find PDE UNLOCK proc_subdir_lock get inode LOCK proc_subdir_lock goto unlock UNLOCK proc_subdir_lock UNLOCK kernel We can get rid of LOCK/UNLOCK pair after getting inode simply by jumping to unlock_kernel() directly. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
3790ee4bd86396558eedd86faac1052cb782e4e1 |
|
11-Dec-2007 |
Eric W. Biederman <ebiederm@xmission.com> |
proc: remove/Fix proc generic d_revalidate Ultimately to implement /proc perfectly we need an implementation of d_revalidate because files and directories can be removed behind the back of the VFS, and d_revalidate is the only way we can let the VFS know that this has happened. Unfortunately the linux VFS can not cope with anything in the path to a mount point going away. So a proper d_revalidate method that calls d_drop also needs to call have_submounts which is moderately expensive, so you really don't want a d_revalidate method that unconditionally calls it, but instead only calls it when the backing object has really gone away. proc generic entries only disappear on module_unload (when not counting the fledgling network namespace) so it is quite rare that we actually encounter that case and has not actually caused us real world trouble yet. So until we get a proper test for keeping dentries in the dcache fix the current d_revalidate method by completely removing it. This returns us to the current status quo. So with CONFIG_NETNS=n things should look as they have always looked. For CONFIG_NETNS=y things work most of the time but there are a few rare corner cases that don't behave properly. As the network namespace is barely present in 2.6.24 this should not be a problem. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "Denis V. Lunev" <den@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
5a622f2d0f86b316b07b55a4866ecb5518dd1cf7 |
|
05-Dec-2007 |
Alexey Dobriyan <adobriyan@sw.ru> |
proc: fix proc_dir_entry refcounting Creating PDEs with refcount 0 and "deleted" flag has problems (see below). Switch to usual scheme: * PDE is created with refcount 1 * every de_get does +1 * every de_put() and remove_proc_entry() do -1 * once refcount reaches 0, PDE is freed. This elegantly fixes at least two following races (both observed) without introducing new locks, without abusing old locks, without spreading lock_kernel(): 1) PDE leak remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_read(&de->count) == 0) if (atomic_dec_and_test(&de->count)) if (de->deleted) /* also not taken! */ free_proc_entry(de); else de->deleted = 1; [refcount=0, deleted=1] 2) use after free remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_dec_and_test(&de->count)) if (atomic_read(&de->count) == 0) free_proc_entry(de); /* boom! */ if (de->deleted) free_proc_entry(de); BUG: unable to handle kernel paging request at virtual address 6b6b6b6b printing eip: c10acdda *pdpt = 00000000338f8001 *pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom Pid: 23161, comm: cat Not tainted (2.6.24-rc2-8c0863403f109a43d7000b4646da4818220d501f #4) EIP: 0060:[<c10acdda>] EFLAGS: 00210097 CPU: 1 EIP is at strnlen+0x6/0x18 EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: fffffffe ESI: c128fa3b EDI: f380bf34 EBP: ffffffff ESP: f380be44 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process cat (pid: 23161, ti=f380b000 task=f38f2570 task.ti=f380b000) Stack: c10ac4f0 00000278 c12ce000 f43cd2a8 00000163 00000000 7da86067 00000400 c128fa20 00896b18 f38325a8 c128fe20 ffffffff 00000000 c11f291e 00000400 f75be300 c128fa20 f769c9a0 c10ac779 f380bf34 f7bfee70 c1018e6b f380bf34 Call Trace: [<c10ac4f0>] vsnprintf+0x2ad/0x49b [<c10ac779>] vscnprintf+0x14/0x1f [<c1018e6b>] vprintk+0xc5/0x2f9 [<c10379f1>] handle_fasteoi_irq+0x0/0xab [<c1004f44>] do_IRQ+0x9f/0xb7 [<c117db3b>] preempt_schedule_irq+0x3f/0x5b [<c100264e>] need_resched+0x1f/0x21 [<c10190ba>] printk+0x1b/0x1f [<c107c8ad>] de_put+0x3d/0x50 [<c107c8f8>] proc_delete_inode+0x38/0x41 [<c107c8c0>] proc_delete_inode+0x0/0x41 [<c1066298>] generic_delete_inode+0x5e/0xc6 [<c1065aa9>] iput+0x60/0x62 [<c1063c8e>] d_kill+0x2d/0x46 [<c1063fa9>] dput+0xdc/0xe4 [<c10571a1>] __fput+0xb0/0xcd [<c1054e49>] filp_close+0x48/0x4f [<c1055ee9>] sys_close+0x67/0xa5 [<c10026b6>] sysenter_past_esp+0x5f/0x85 ======================= Code: c9 74 0c f2 ae 74 05 bf 01 00 00 00 4f 89 fa 5f 89 d0 c3 85 c9 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f c3 89 c1 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 c3 90 90 90 57 83 c9 EIP: [<c10acdda>] strnlen+0x6/0x18 SS:ESP 0068:f380be44 Also, remove broken usage of ->deleted from reiserfs: if sget() succeeds, module is already pinned and remove_proc_entry() can't happen => nobody can mark PDE deleted. Dummy proc root in netns code is not marked with refcount 1. AFAICS, we never get it, it's just for proper /proc/net removal. I double checked CLONE_NETNS continues to work. Patch survives many hours of modprobe/rmmod/cat loops without new bugs which can be attributed to refcounting. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
2b1e300a9dfc3196ccddf6f1d74b91b7af55e416 |
|
01-Dec-2007 |
Eric W. Biederman <ebiederm@xmission.com> |
[NETNS]: Fix /proc/net breakage Well I clearly goofed when I added the initial network namespace support for /proc/net. Currently things work but there are odd details visible to user space, even when we have a single network namespace. Since we do not cache proc_dir_entry dentries at the moment we can just modify ->lookup to return a different directory inode depending on the network namespace of the process looking at /proc/net, replacing the current technique of using a magic and fragile follow_link method. To accomplish that this patch: - introduces a shadow_proc method to allow different dentries to be returned from proc_lookup. - Removes the old /proc/net follow_link magic - Fixes a weakness in our not caching of proc generic dentries. As shadow_proc uses a task struct to decided which dentry to return we can go back later and fix the proc generic caching without modifying any code that uses the shadow_proc method. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
c2319540cd7330fa9066e5b9b84d357a2c8631a2 |
|
29-Nov-2007 |
Alexey Dobriyan <adobriyan@sw.ru> |
proc: fix NULL ->i_fop oops proc_kill_inodes() can clear ->i_fop in the middle of vfs_readdir resulting in NULL dereference during "file->f_op->readdir(file, buf, filler)". The solution is to remove proc_kill_inodes() completely: a) we don't have tricky modules implementing their tricky readdir hooks which could keeping this revoke from hell. b) In a situation when module is gone but PDE still alive, standard readdir will return only "." and "..", because pde->next was cleared by remove_proc_entry(). c) the race proc_kill_inode() destined to prevent is not completely fixed, just race window made smaller, because vfs_readdir() is run without sb_lock held and without file_list_lock held. Effectively, ->i_fop is cleared at random moment, which can't fix properly anything. BUG: unable to handle kernel NULL pointer dereference at virtual address 00000018 printing eip: c1061205 *pdpt = 0000000005b22001 *pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw sr_mod k8temp cdrom hwmon amd_rng Pid: 2033, comm: find Not tainted (2.6.24-rc1-b1d08ac064268d0ae2281e98bf5e82627e0f0c56 #2) EIP: 0060:[<c1061205>] EFLAGS: 00010246 CPU: 0 EIP is at vfs_readdir+0x47/0x74 EAX: c6b6a780 EBX: 00000000 ECX: c1061040 EDX: c5decf94 ESI: c6b6a780 EDI: fffffffe EBP: c9797c54 ESP: c5decf78 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process find (pid: 2033, ti=c5dec000 task=c64bba90 task.ti=c5dec000) Stack: c5decf94 c1061040 fffffff7 0805ffbc 00000000 c6b6a780 c1061295 0805ffbc 00000000 00000400 00000000 00000004 0805ffbc 4588eff4 c5dec000 c10026ba 00000004 0805ffbc 00000400 0805ffbc 4588eff4 bfdc6c70 000000dc 0000007b Call Trace: [<c1061040>] filldir64+0x0/0xc5 [<c1061295>] sys_getdents64+0x63/0xa5 [<c10026ba>] sysenter_past_esp+0x5f/0x85 ======================= Code: 49 83 78 18 00 74 43 8d 6b 74 bf fe ff ff ff 89 e8 e8 b8 c0 12 00 f6 83 2c 01 00 00 10 75 22 8b 5e 10 8b 4c 24 04 89 f0 8b 14 24 <ff> 53 18 f6 46 1a 04 89 c7 75 0b 8b 56 0c 8b 46 08 e8 c8 66 00 EIP: [<c1061205>] vfs_readdir+0x47/0x74 SS:ESP 0068:c5decf78 hch: "Nice, getting rid of this is a very good step formwards. Unfortunately we have another copy of this junk in security/selinux/selinuxfs.c:sel_remove_entries() which would need the same treatment." Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
e1a1c997afe907e6ec4799e4be0f38cffd8b418c |
|
15-Nov-2007 |
Eric W. Biederman <ebiederm@xmission.com> |
proc: fix proc_kill_inodes to kill dentries on all proc superblocks It appears we overlooked support for removing generic proc files when we added support for multiple proc super blocks. Handle that now. [akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Cc: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
e12ba74d8ff3e2f73a583500d7095e406df4d093 |
|
16-Oct-2007 |
Mel Gorman <mel@csn.ul.ie> |
Group short-lived and reclaimable kernel allocations This patch marks a number of allocations that are either short-lived such as network buffers or are reclaimable such as inode allocations. When something like updatedb is called, long-lived and unmovable kernel allocations tend to be spread throughout the address space which increases fragmentation. This patch groups these allocations together as much as possible by adding a new MIGRATE_TYPE. The MIGRATE_RECLAIMABLE type is for allocations that can be reclaimed on demand, but not moved. i.e. they can be migrated by deleting them and re-reading the information from elsewhere. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
99fc06df72fe1c9ad3ec274720dcb5658c40bfd2 |
|
16-Jul-2007 |
Changli Gao <xiaosuo@gmail.com> |
procfs directory entry cleanup Function proc_register() will assign proc_dir_operations and proc_dir_inode_operations to ent's members proc_fops and proc_iops correctly if ent is a directory. So the early assignment isn't necessary. Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
786d7e1612f0b0adb6046f19b906609e4fe8b1ba |
|
16-Jul-2007 |
Alexey Dobriyan <adobriyan@sw.ru> |
Fix rmmod/read/write races in /proc entries Fix following races: =========================================== 1. Write via ->write_proc sleeps in copy_from_user(). Module disappears meanwhile. Or, more generically, system call done on /proc file, method supplied by module is called, module dissapeares meanwhile. pde = create_proc_entry() if (!pde) return -ENOMEM; pde->write_proc = ... open write copy_from_user pde = create_proc_entry(); if (!pde) { remove_proc_entry(); return -ENOMEM; /* module unloaded */ } *boom* ========================================== 2. bogo-revoke aka proc_kill_inodes() remove_proc_entry vfs_read proc_kill_inodes [check ->f_op validness] [check ->f_op->read validness] [verify_area, security permissions checks] ->f_op = NULL; if (file->f_op->read) /* ->f_op dereference, boom */ NOTE, NOTE, NOTE: file_operations are proxied for regular files only. Let's see how this scheme behaves, then extend if needed for directories. Directories creators in /proc only set ->owner for them, so proxying for directories may be unneeded. NOTE, NOTE, NOTE: methods being proxied are ->llseek, ->read, ->write, ->poll, ->unlocked_ioctl, ->ioctl, ->compat_ioctl, ->open, ->release. If your in-tree module uses something else, yell on me. Full audit pending. [akpm@linux-foundation.org: build fix] Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
59cd0cbc75367b82f704f63b104117462275060d |
|
08-May-2007 |
Darrick J. Wong <djwong@us.ibm.com> |
Fix race between proc_readdir and remove_proc_entry Fix the following race: proc_readdir remove_proc_entry ============ ================= spin_lock(&proc_subdir_lock); [choose PDE to start filldir from] spin_unlock(&proc_subdir_lock); spin_lock(&proc_subdir_lock); [find PDE] [free PDE, refcount is 0] spin_unlock(&proc_subdir_lock); /* boom */ if (filldir(dirent, de->name, ... [de_put on error path --adobriyan] Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
7695650a924a6859910c8c19dfa43b4d08224d66 |
|
08-May-2007 |
Alexey Dobriyan <adobriyan@openvz.org> |
Fix race between proc_get_inode() and remove_proc_entry() proc_lookup remove_proc_entry =========== ================= lock_kernel(); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] spin_unlock(&proc_subdir_lock); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] [check refcount and free PDE] spin_unlock(&proc_subdir_lock); proc_get_inode: de_get(de); /* boom */ Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
77b14db502cb85a031fe8fde6c85d52f3e0acb63 |
|
14-Feb-2007 |
Eric W. Biederman <ebiederm@xmission.com> |
[PATCH] sysctl: reimplement the sysctl proc support With this change the sysctl inodes can be cached and nothing needs to be done when removing a sysctl table. For a cost of 2K code we will save about 4K of static tables (when we remove de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or about half that on a 32bit arch. The speed feels about the same, even though we can now cache the sysctl dentries :( We get the core advantage that we don't need to have a 1 to 1 mapping between ctl table entries and proc files. Making it possible to have /proc/sys vary depending on the namespace you are in. The currently merged namespaces don't have an issue here but the network namespace under /proc/sys/net needs to have different directories depending on which network adapters are visible. By simply being a cache different directories being visible depending on who you are is trivial to implement. [akpm@osdl.org: fix uninitialised var] [akpm@osdl.org: fix ARM build] [bunk@stusta.de: make things static] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
c5ef1c42c51b1b5b4a401a6517bdda30933ddbaf |
|
12-Feb-2007 |
Arjan van de Ven <arjan@linux.intel.com> |
[PATCH] mark struct inode_operations const 3 Many struct inode_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
00977a59b951207d38380c75f03a36829950265c |
|
12-Feb-2007 |
Arjan van de Ven <arjan@linux.intel.com> |
[PATCH] mark struct file_operations const 6 Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
2fddfeefeed703b7638af97aa3048f82a2d53b03 |
|
08-Dec-2006 |
Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> |
[PATCH] proc: change uses of f_{dentry, vfsmnt} to use f_path Change all the uses of f_{dentry,vfsmnt} to f_path.{dentry,mnt} in the proc filesystem code. Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
99ac48f54a91d02140c497edc31dc57d4bc5c85d |
|
28-Mar-2006 |
Arjan van de Ven <arjan@infradead.org> |
[PATCH] mark f_ops const in the inode Mark the f_ops members of inodes as const, as well as fix the ripple-through this causes by places that copy this f_ops and then "do stuff" with it. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
64a07bd82ed526d813b64b0957543eef55bdf9c0 |
|
26-Mar-2006 |
Steven Rostedt <rostedt@goodmis.org> |
[PATCH] protect remove_proc_entry It has been discovered that the remove_proc_entry has a race in the removing of entries in the proc file system that are siblings. There's no protection around the traversing and removing of elements that belong in the same subdirectory. This subdirectory list is protected in other areas by the BKL. So the BKL was at first used to protect this area too, but unfortunately, remove_proc_entry may be called with spinlocks held. The BKL may schedule, so this was not a solution. The final solution was to add a new global spin lock to protect this list, called proc_subdir_lock. This lock now protects the list in remove_proc_entry, and I also went around looking for other areas that this list is modified and added this protection there too. Care must be taken since these locations call several functions that may also schedule. Since I don't see any location that these functions that modify the subdirectory list are called by interrupts, the irqsave/restore versions of the spin lock was _not_ used. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
fee781e6c25772db862d3322b4745a896022a4f1 |
|
08-Jan-2006 |
Adrian Bunk <bunk@stusta.de> |
[PATCH] fs/proc/: function prototypes belong in header files Function prototypes belong into header files. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
8b90db0df7187a01fb7177f1f812123138f562cf |
|
30-Dec-2005 |
Linus Torvalds <torvalds@g5.osdl.org> |
Insanity avoidance in /proc The old /proc interfaces were never updated to use loff_t, and are just generally broken. Now, we should be using the seq_file interface for all of the proc files, but converting the legacy functions is more work than most people care for and has little upside.. But at least we can make the non-LFS rules explicit, rather than just insanely wrapping the offset or something. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
2f51201662b28dbf8c15fb7eb972bc51c6cc3fa5 |
|
31-Oct-2005 |
Eric Dumazet <dada1@cosmosbay.com> |
[PATCH] reduce sizeof(struct file) Now that RCU applied on 'struct file' seems stable, we can place f_rcuhead in a memory location that is not anymore used at call_rcu(&f->f_rcuhead, file_free_rcu) time, to reduce the size of this critical kernel object. The trick I used is to move f_rcuhead and f_list in an union called f_u The callers are changed so that f_rcuhead becomes f_u.fu_rcuhead and f_list becomes f_u.f_list Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
2b579beec255d6589fabe51b60933d723630bcd4 |
|
07-Sep-2005 |
Miklos Szeredi <miklos@szeredi.hu> |
[PATCH] proc: link count fix This patch fixes bug titled "sunrpc as module and bad proc/sys link count" reported by Jiri Slaby. The problem was, that only proc_dir_entry->nlink was updated and the corresponding inode->i_nlink was not. The fix is to implement the inode->getattr() method, and update i_nlink (if necessary). A quick audit of proc code shows that no other attribute changes after creation. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
008b150a3c4d971cd65d02d107b8fcc860bc959c |
|
20-Aug-2005 |
Al Viro <viro@parcelfarce.linux.theplanet.co.uk> |
[PATCH] Fix up symlink function pointers This fixes up the symlink functions for the calling convention change: * afs, autofs4, befs, devfs, freevxfs, jffs2, jfs, ncpfs, procfs, smbfs, sysvfs, ufs, xfs - prototype change for ->follow_link() * befs, smbfs, xfs - same for ->put_link() Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 |
|
17-Apr-2005 |
Linus Torvalds <torvalds@ppc970.osdl.org> |
Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
|