4a32fea9d78f2d2315c0072757b197d5a304dc8b |
|
17-Aug-2014 |
Christoph Lameter <cl@linux.com> |
scheduler: Replace __get_cpu_var with this_cpu_ptr Convert all uses of __get_cpu_var for address calculation to use this_cpu_ptr instead. [Uses of __get_cpu_var with cpumask_var_t are no longer handled by this patch] Cc: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
|
c53ed7423619b4e8108914a9f31b426dd58ad591 |
|
19-Nov-2013 |
Johannes Berg <johannes.berg@intel.com> |
genetlink: only pass array to genl_register_family_with_ops() As suggested by David Miller, make genl_register_family_with_ops() a macro and pass only the array, evaluating ARRAY_SIZE() in the macro, this is a little safer. The openvswitch has some indirection, assing ops/n_ops directly in that code. This might ultimately just assign the pointers in the family initializations, saving the struct genl_family_and_ops and code (once mcast groups are handled differently.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
4534de8305b3f1460a527a0cda0e3dc2224c6f0c |
|
14-Nov-2013 |
Johannes Berg <johannes.berg@intel.com> |
genetlink: make all genl_ops users const Now that genl_ops are no longer modified in place when registering, they can be made const. This patch was done mostly with spatch: @@ identifier ops; @@ +const struct genl_ops ops[] = { ... }; (except the struct thing in net/openvswitch/datapath.c) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
88d36a9949513419de3a506e7fca8b82d1dc972a |
|
14-Nov-2013 |
Johannes Berg <johannes.berg@intel.com> |
taskstats: use genl_register_family_with_ops() This simplifies the code since there's no longer a need to have error handling in the registration. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
0d20633b041041ecda39ae562e62087acf0092f1 |
|
13-Nov-2013 |
Chen Gang <gang.chen@asianux.com> |
kernel/taskstats.c: return -ENOMEM when alloc memory fails in add_del_listener() For registering in add_del_listener(), when kmalloc_node() fails, need return -ENOMEM instead of success code, and cmd_attr_register_cpumask() wants to know about it. After modification, give a simple common test "build -> boot up -> kernel/controllers/cgroup/getdelays by LTP tools". Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
3fa582663129330d57d15b97ae534dc1203fc3aa |
|
13-Nov-2013 |
Chen Gang <gang.chen@asianux.com> |
kernel/taskstats.c: add nla_nest_cancel() for failure processing between nla_nest_start() and nla_nest_end() When failure occurs between nla_nest_start() and nla_nest_end(), we should call nla_nest_cancel() to clean up related things. Signed-off-by: Chen Gang <gang.chen@asianux.com> Cc: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
0324b5a450f8a58304e93c5d886add24ca3527bc |
|
05-Oct-2012 |
Jesper Juhl <jj@chaosbits.net> |
taskstats: cgroupstats_user_cmd() may leak on error If prepare_reply() succeeds we have allocated memory for 'rep_skb'. If nla_reserve() then subsequently fails and returns NULL we fail to release the memory we allocated, thus causing a leak. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
2903ff019b346ab8d36ebbf54853c3aaf6590608 |
|
28-Aug-2012 |
Al Viro <viro@zeniv.linux.org.uk> |
switch simple cases of fget_light to fdget Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
4bd6e32acec66c55c6c1af4672f3216b2ac88e35 |
|
08-Feb-2012 |
Eric W. Biederman <ebiederm@xmission.com> |
userns: Convert taskstats to handle the user and pid namespaces. - Explicitly limit exit task stat broadcast to the initial user and pid namespaces, as it is already limited to the initial network namespace. - For broadcast task stats explicitly generate all of the idenitiers in terms of the initial user namespace and the initial pid namespace. - For request stats report them in terms of the current user namespace and the current pid namespace. Netlink messages are delivered syncrhonously to the kernel allowing us to get the user namespace and the pid namespace from the current task. - Pass the namespaces for representing pids and uids and gids into bacct_add_task. Cc: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
15e473046cb6e5d18a4d0057e61d76315230382b |
|
07-Sep-2012 |
Eric W. Biederman <ebiederm@xmission.com> |
netlink: Rename pid to portid to avoid confusion It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
25353b3377d5a75d4b830477bb90a3691155de72 |
|
30-Jul-2012 |
Alan Cox <alan@linux.intel.com> |
taskstats: check nla_reserve() return Addresses https://bugzilla.kernel.org/show_bug.cgi?id=44621 Reported-by: <rucsoftsec@gmail.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
1a51410abe7d0ee4b1d112780f46df87d3621043 |
|
20-Sep-2011 |
Linus Torvalds <torvalds@linux-foundation.org> |
Make TASKSTATS require root access Ok, this isn't optimal, since it means that 'iotop' needs admin capabilities, and we may have to work on this some more. But at the same time it is very much not acceptable to let anybody just read anybody elses IO statistics quite at this level. Use of the GENL_ADMIN_PERM suggested by Johannes Berg as an alternative to checking the capabilities by hand. Reported-by: Vasiliy Kulikov <segoon@openwall.com> Cc: Johannes Berg <johannes.berg@intel.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
a7295898a1d2e501427f557111c2b4bdfc90b1ed |
|
04-Aug-2011 |
Oleg Nesterov <oleg@redhat.com> |
taskstats: add_del_listener() should ignore !valid listeners When send_cpu_listeners() finds the orphaned listener it marks it as !valid and drops listeners->sem. Before it takes this sem for writing, s->pid can be reused and add_del_listener() can wrongly try to re-use this entry. Change add_del_listener() to check ->valid = T. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
dfc428b656c4693a2334a8d9865b430beddb562a |
|
04-Aug-2011 |
Oleg Nesterov <oleg@redhat.com> |
taskstats: add_del_listener() shouldn't use the wrong node 1. Commit 26c4caea9d69 "don't allow duplicate entries in listener mode" changed add_del_listener(REGISTER) so that "next_cpu:" can reuse the listener allocated for the previous cpu, this doesn't look exactly right even if minor. Change the code to kfree() in the already-registered case, this case is unlikely anyway so the extra kmalloc_node() shouldn't hurt but looke more correct and clean. 2. use the plain list_for_each_entry() instead of _safe() to scan listeners->list. 3. Remove the unneeded INIT_LIST_HEAD(&s->list), we are going to list_add(&s->list). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Vasiliy Kulikov <segoon@openwall.com> Cc: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
60063497a95e716c9a689af3be2687d261f115b4 |
|
27-Jul-2011 |
Arun Sharma <asharma@fb.com> |
atomic: use <linux/atomic.h> This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
26c4caea9d697043cc5a458b96411b86d7f6babd |
|
28-Jun-2011 |
Vasiliy Kulikov <segoon@openwall.com> |
taskstats: don't allow duplicate entries in listener mode Currently a single process may register exit handlers unlimited times. It may lead to a bloated listeners chain and very slow process terminations. Eg after 10KK sent TASKSTATS_CMD_ATTR_REGISTER_CPUMASKs ~300 Mb of kernel memory is stolen for the handlers chain and "time id" shows 2-7 seconds instead of normal 0.003. It makes it possible to exhaust all kernel memory and to eat much of CPU time by triggerring numerous exits on a single CPU. The patch limits the number of times a single process may register itself on a single CPU to one. One little issue is kept unfixed - as taskstats_exit() is called before exit_files() in do_exit(), the orphaned listener entry (if it was not explicitly deregistered) is kept until the next someone's exit() and implicit deregistration in send_cpu_listeners(). So, if a process registered itself as a listener exits and the next spawned process gets the same pid, it would inherit taskstats attributes. Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
f9b182e24ecb2b3bb33340f053ba31c8c4e1d895 |
|
24-Mar-2011 |
Mandeep Singh Baines <msb@chromium.org> |
taskstats: use appropriate printk priority level printk()s without a priority level default to KERN_WARNING. To reduce noise at KERN_WARNING, this patch set the priority level appriopriately for unleveled printks()s. This should be useful to folks that look at dmesg warnings closely. Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
9ab020cf07e457a8b425bf5af17e704f90f86d8b |
|
13-Jan-2011 |
Jeff Mahoney <jeffm@suse.com> |
taskstats: use better ifdef for alignment Commit 4be2c95d ("taskstats: pad taskstats netlink response for aligment issues on ia64") added a null field to align the taskstats structure but the discussion centered around ia64. The issue exists on other platforms with inefficient unaligned access and adding them piecemeal would be an unmaintainable mess. This patch uses Dave Miller's suggestion of using a combination of CONFIG_64BIT && !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to determine whether alignment is needed. Note that this will cause breakage on those platforms with applications like iotop which had hard-coded offsets into the packet to access the taskstats structure. The message seen on systems without the alignment fixes looks like: kernel unaligned access to 0xe000023879dca9bc, ip=0xa000000100133d10 The addresses may vary but resolve to locations inside __delayacct_add_tsk. iotop makes what I'd call unreasonable assumptions about the contents of a netlink genetlink packet containing generic attributes. They're typed and have headers that specify value lengths, so the client can (should) identify and skip the ones the client doesn't understand. The kernel, as of version 2.6.36, presented a packet like so: +--------------------------------+ | genlmsghdr - 4 bytes | +--------------------------------+ | NLA header - 4 bytes | /* Aggregate header */ +-+------------------------------+ | | NLA header - 4 bytes | /* PID header */ | +------------------------------+ | | pid/tgid - 4 bytes | | +------------------------------+ | | NLA header - 4 bytes | /* stats header */ | + -----------------------------+ <- oops. aligned on 4 byte boundary | | struct taskstats - 328 bytes | +-+------------------------------+ The iotop code expects that the kernel will behave as it did then, assuming that the packet format is set in stone. The format is set in stone, but the packet offsets are not. There's nothing in the packet format that guarantees that the packet will be sent in exactly the same way. The attribute contents are set (or versioned) and the aggregate contents are set but they can be anywhere in the packet. The issue here isn't that an unaligned structure gets passed to userspace, it's that the NLA infrastructure has something of a weakness: The 4 byte attribute header may force the payload to be unaligned. The taskstats structure is created at an unaligned location and then 64-bit values are operated on inside the kernel, so the unaligned access warnings gets spewed everywhere. It's possible to use the unaligned access API to operate on the structure in the kernel but it seems like a wasted effort to work around userspace code that isn't following the packet format. Any new additions would also need the be worked around. It's a maintenance nightmare. The conclusion of the earlier discussion seemed to be "ok fine, if we have to break it, don't break it on arches that don't have the problem." Dave pointed out that the unaligned access problem doesn't only exist on ia64, but also on other 64-bit arches that don't have efficient unaligned access and it should be fixed there as well. The committed version of the patch and this addition keep with the conclusion of that discussion not to break it unnecessarily, which the pid padding and the packet padding fixes did do. x86_64 and powerpc don't suffer this problem so they shouldn't suffer the solution. Other 64-bit architectures do and will, though. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reported-by: David S. Miller <davem@davemloft.net> Acked-by: David S. Miller <davem@davemloft.net> Cc: Dan Carpenter <error27@gmail.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Florian Mickler <florian@mickler.org> Cc: Guillaume Chazarain <guichaz@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
4be2c95d1f7706ca0e74499f2bd118e1cee19669 |
|
22-Dec-2010 |
Jeff Mahoney <jeffm@suse.com> |
taskstats: pad taskstats netlink response for aligment issues on ia64 The taskstats structure is internally aligned on 8 byte boundaries but the layout of the aggregrate reply, with two NLA headers and the pid (each 4 bytes), actually force the entire structure to be unaligned. This causes the kernel to issue unaligned access warnings on some architectures like ia64. Unfortunately, some software out there doesn't properly unroll the NLA packet and assumes that the start of the taskstats structure will always be 20 bytes from the start of the netlink payload. Aligning the start of the taskstats structure breaks this software, which we don't want. So, for now the alignment only happens on architectures that require it and those users will have to update to fixed versions of those packages. Space is reserved in the packet only when needed. This ifdef should be removed in several years e.g. 2012 once we can be confident that fixed versions are installed on most systems. We add the padding before the aggregate since the aggregate is already a defined type. Commit 85893120 ("delayacct: align to 8 byte boundary on 64-bit systems") previously addressed the alignment issues by padding out the pid field. This was supposed to be a compatible change but the circumstances described above mean that it wasn't. This patch backs out that change, since it was a hack, and introduces a new NULL attribute type to provide the padding. Padding the response with 4 bytes avoids allocating an aligned taskstats structure and copying it back. Since the structure weighs in at 328 bytes, it's too big to do it on the stack. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reported-by: Brian Rogers <brian@xyzw.org> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Guillaume Chazarain <guichaz@gmail.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
cd85fc58cd71bf6b89612efafb9a84e655ed7d66 |
|
08-Dec-2010 |
Christoph Lameter <cl@linux.com> |
taskstats: Use this_cpu_ops Use this_cpu_inc_return in one place and avoid ugly __raw_get_cpu in another. V3->V4: - Fix off by one. V4-V4f: - Use &listener_array Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
|
3d9e0cf1fe007b88db55d43dfdb6839e1a029ca5 |
|
28-Oct-2010 |
Michael Holzheu <holzheu@linux.vnet.ibm.com> |
taskstats: split fill_pid function Separate the finding of a task_struct by pid or tgid from filling the taskstats data. This makes the code more readable. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
9323312592cca636d7c2580dc85fa4846efa86a2 |
|
28-Oct-2010 |
Michael Holzheu <holzheu@linux.vnet.ibm.com> |
taskstats: separate taskstats commands Move each taskstats command into a single function. This makes the code more readable and makes it easier to add new commands. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
85893120699f8bae8caa12a8ee18ab5fceac978e |
|
28-Oct-2010 |
Jeff Mahoney <jeffm@suse.com> |
delayacct: align to 8 byte boundary on 64-bit systems prepare_reply() sets up an skb for the response. The payload contains: +--------------------------------+ | genlmsghdr - 4 bytes | +--------------------------------+ | NLA header - 4 bytes | /* Aggregate header */ +-+------------------------------+ | | NLA header - 4 bytes | /* PID header */ | +------------------------------+ | | pid/tgid - 4 bytes | | +------------------------------+ | | NLA header - 4 bytes | /* stats header */ | + -----------------------------+ <- oops. aligned on 4 byte boundary | | struct taskstats - 328 bytes | +-+------------------------------+ The start of the taskstats struct must be 8 byte aligned on IA64 (and other systems with 8 byte alignment rules for 64-bit types) or runtime alignment warnings will be issued. This patch pads the pid/tgid field out to sizeof(long), which forces the alignment of taskstats. The getdelays userspace code is ok with this since it assumes 32-bit pid/tgid and then honors that header's length field. An array is used to avoid exposing kernel memory contents to userspace in the response. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
5a0e3ad6af8660be21ca98a971cd00f331318c05 |
|
24-Mar-2010 |
Tejun Heo <tj@kernel.org> |
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
b54452b07a7b1b8cc1385edba3ef2ef6d4679d5a |
|
18-Feb-2010 |
Alexey Dobriyan <adobriyan@gmail.com> |
const: struct nla_policy Make remaining netlink policies as const. Fixup coding style where needed. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
134e63756d5f3d0f7604dfcca847b09d1b14fd66 |
|
10-Jul-2009 |
Johannes Berg <johannes@sipsolutions.net> |
genetlink: make netns aware This makes generic netlink network namespace aware. No generic netlink families except for the controller family are made namespace aware, they need to be checked one by one and then set the family->netnsok member to true. A new function genlmsg_multicast_netns() is introduced to allow sending a multicast message in a given namespace, for example when it applies to an object that lives in that namespace, a new function genlmsg_multicast_allns() to send a message to all network namespaces (for objects that do not have an associated netns). The function genlmsg_multicast() is changed to multicast the message in just init_net, which is currently correct for all generic netlink families since they only work in init_net right now. Some will later want to work in all net namespaces because they do not care about the netns at all -- those will have to be converted to use one of the new functions genlmsg_multicast_allns() or genlmsg_multicast_netns() whenever they are made netns aware in some way. After this patch families can easily decide whether or not they should be available in all net namespaces. Many genl families us it for objects not related to networking and should therefore be available in all namespaces, but that will have to be done on a per family basis. Note that this doesn't touch on the checkpoint/restart problem where network namespaces could be used, genl families and multicast groups are numbered globally and I see no easy way of changing that, especially since it must be possible to multicast to all network namespaces for those families that do not care about netns. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
41c7bb9588904eb060a95bcad47bd3804a1ece25 |
|
01-Jan-2009 |
Rusty Russell <rusty@rustcorp.com.au> |
cpumask: convert rest of files in kernel/ Impact: Reduce stack usage, use new cpumask API. Mainly changing cpumask_t to 'struct cpumask' and similar simple API conversion. Two conversions worth mentioning: 1) we use cpumask_any_but to avoid a temporary in kernel/softlockup.c, 2) Use cpumask_var_t in taskstats_user_cmd(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com>
|
29c0177e6a4ac094302bed54a1d4bbb6b740a9ef |
|
13-Dec-2008 |
Rusty Russell <rusty@rustcorp.com.au> |
cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers. Impact: change calling convention of existing cpumask APIs Most cpumask functions started with cpus_: these have been replaced by cpumask_ ones which take struct cpumask pointers as expected. These four functions don't have good replacement names; fortunately they're rarely used, so we just change them over. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: paulus@samba.org Cc: mingo@redhat.com Cc: tony.luck@intel.com Cc: ralf@linux-mips.org Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: cl@linux-foundation.org Cc: srostedt@redhat.com
|
b81f3ea92ba1fa676775677679889dc2a7f03c8b |
|
25-Jul-2008 |
Vegard Nossum <vegard.nossum@gmail.com> |
taskstats: remove initialization of static per-cpu variable Cc: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
363ab6f1424cdea63e5d182312d60e19077b892a |
|
12-May-2008 |
Mike Travis <travis@sgi.com> |
core: use performance variant for_each_cpu_mask_nr Change references from for_each_cpu_mask to for_each_cpu_mask_nr where appropriate Reviewed-by: Paul Jackson <pj@sgi.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
cb41d6d068716b2b3666925da34d3d7e658bf4f3 |
|
30-Apr-2008 |
Pavel Emelyanov <xemul@openvz.org> |
Use find_task_by_vpid in taskstats The pid to lookup a task by is passed inside taskstats code via genetlink message. Since netlink packets are now processed in the context of the sending task, this is correct to lookup the task with find_task_by_vpid() here. Besides, I fix the call to fill_pid() from taskstats_exit(), since the tsk->pid is not required in fill_pid() in this case, and the pid field on task_struct is going to be deprecated as well. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Jonathan Lim <jlim@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
f96159840bc5f605aca5113ab2d24308d3dc2eff |
|
15-Nov-2007 |
Adrian Bunk <bunk@kernel.org> |
kernel/taskstats.c: fix bogus nlmsg_free() We'd better not nlmsg_free on a pointer containing an undefined value (and without having anything allocated). Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Balbir Singh <balbir@linux.vnet.ibm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
3a4fa0a25da81600ea0bcd75692ae8ca6050d165 |
|
19-Oct-2007 |
Robert P. J. Day <rpjday@mindspring.com> |
Fix misspellings of "system", "controller", "interrupt" and "necessary". Fix the various misspellings of "system", controller", "interrupt" and "[un]necessary". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>
|
846c7bb055747989891f5cd2bb6e8d56243ba1e7 |
|
19-Oct-2007 |
Balbir Singh <balbir@linux.vnet.ibm.com> |
Add cgroupstats This patch is inspired by the discussion at http://lkml.org/lkml/2007/4/11/187 and implements per cgroup statistics as suggested by Andrew Morton in http://lkml.org/lkml/2007/4/11/263. The patch is on top of 2.6.21-mm1 with Paul's cgroups v9 patches (forward ported) This patch implements per cgroup statistics infrastructure and re-uses code from the taskstats interface. A new set of cgroup operations are registered with commands and attributes. It should be very easy to *extend* per cgroup statistics, by adding members to the cgroupstats structure. The current model for cgroupstats is a pull, a push model (to post statistics on interesting events), should be very easy to add. Currently user space requests for statistics by passing the cgroup file descriptor. Statistics about the state of all the tasks in the cgroup is returned to user space. TODO's/NOTE: This patch provides an infrastructure for implementing cgroup statistics. Based on the needs of each controller, we can incrementally add more statistics, event based support for notification of statistics, accumulation of taskstats into cgroup statistics in the future. Sample output # ./cgroupstats -C /cgroup/a sleeping 2, blocked 0, running 1, stopped 0, uninterruptible 0 # ./cgroupstats -C /cgroup/ sleeping 154, blocked 0, running 0, stopped 0, uninterruptible 0 If the approach looks good, I'll enhance and post the user space utility for the same Feedback, comments, test results are always welcome! [akpm@linux-foundation.org: build fix] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
a9022e9cb9e919e31d5bc15fcef5c7186740645e |
|
17-Oct-2007 |
Jesper Juhl <jesper.juhl@gmail.com> |
Clean up duplicate includes in kernel/ This patch cleans up duplicate includes in kernel/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
b663a79c191508f27cd885224b592a878c0ba0f6 |
|
16-Jul-2007 |
Maxim Uvarov <muvarov@ru.mvista.com> |
taskstats: add context-switch counters Make available to the user the following task and process performance statistics: * Involuntary Context Switches (task_struct->nivcsw) * Voluntary Context Switches (task_struct->nvcsw) Statistics information is available from: 1. taskstats interface (Documentation/accounting/) 2. /proc/PID/status (task only). This data is useful for detecting hyperactivity patterns between processes. [akpm@linux-foundation.org: cleanup] Signed-off-by: Maxim Uvarov <muvarov@ru.mvista.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Jonathan Lim <jlim@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
0a31bd5f2bbb6473ef9d24f0063ca91cfa678b64 |
|
06-May-2007 |
Christoph Lameter <clameter@sgi.com> |
KMEM_CACHE(): simplify slab cache creation This patch provides a new macro KMEM_CACHE(<struct>, <flags>) to simplify slab creation. KMEM_CACHE creates a slab with the name of the struct, with the size of the struct and with the alignment of the struct. Additional slab flags may be specified if necessary. Example struct test_slab { int a,b,c; struct list_head; } __cacheline_aligned_in_smp; test_slab_cache = KMEM_CACHE(test_slab, SLAB_PANIC) will create a new slab named "test_slab" of the size sizeof(struct test_slab) and aligned to the alignment of test slab. If it fails then we panic. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
b529ccf2799c14346d1518e9bdf1f88f03643e99 |
|
26-Apr-2007 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
[NETLINK]: Introduce nlmsg_hdr() helper For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the number of direct accesses to skb->data and for consistency with all the other cast skb member helpers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
37167485302c8876cb0303af113696e88c2945aa |
|
07-Dec-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: cleanup reply assembling Thomas Graf wrote: > > nla_nest_start() may return NULL, either rely on prepare_reply() to be > correct and BUG() on failure or do proper error handling for all > functions. nla_put() in taskstat.c can fail only if the 'size' argument of alloc_skb() was not right. This is a kernel bug, we should not hide it. So add 'BUG()' on error path and check for 'na == NULL'. > genlmsg_cancel() is only required in error paths for dumping > procedures. So we can remove 'genlmsg_cancel()' calls and 'void *reply' (saves 227 bytes). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Thomas Graf <tgraf@suug.ch> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
51de4d90852ba4cfa5743594ec4a7f158b52dc43 |
|
07-Dec-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: use nla_reserve() for reply assembling Currently taskstats_user_cmd()/taskstats_exit() do: 1) allocate stats 2) fill stats 3) make a temporary copy on stack (236 bytes) 4) copy that copy to skb 5) free stats With the help of nla_reserve() we can operate on skb->data directly, thus avoiding all these steps except 2). So, before this patch: // copy *stats to skb->data int mk_reply(skb, ..., struct taskstats *stats); fill_pid(stats); mk_reply(skb, ..., stats); After: // return a pointer to skb->data struct taskstats *mk_reply(skb, ...); stat = mk_reply(skb, ...); fill_pid(stats); Shrinks taskatsks.o by 162 bytes. A stupid benchmark (send one million TASKSTATS_CMD_ATTR_PID) shows the real user sys before: 4.02 0.06 3.96 4.02 0.04 3.98 4.02 0.04 3.97 after: 3.86 0.08 3.78 3.88 0.10 3.77 3.89 0.09 3.80 but this looks suspiciously good. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
68062b86fc0f480b806d270a8278709a5a41bb67 |
|
07-Dec-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: factor out reply assembling Introduce mk_reply() helper which does all nla_put()s on reply. Saves 453 bytes and a preparation for the next patch. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
34ec12349c8a9505adc59d72f92b4595bc2483ff |
|
07-Dec-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: cleanup ->signal->stats allocation Allocate ->signal->stats on demand in taskstats_exit(), this allows us to remove taskstats_tgid_alloc() (the last non-trivial inline) from taskstat's public interface. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
115085ea0794c0f339be8f9d25505c7f9861d824 |
|
07-Dec-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: cleanup do_exit() path do_exit: taskstats_exit_alloc() ... taskstats_exit_send() taskstats_exit_free() I think this is not good, let it be a single function exported to the core kernel, taskstats_exit(), which does alloc + send + free itself. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
128fb95650b3273a8dc9ba5514b6fe7db8ea30bf |
|
07-Dec-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats_exit_alloc: optimize/simplify If there are no listeners, every task does unneeded kmem_cache alloc/free on exit. We don't need listeners->sem for 'if (!list_empty())' check. Yes, we may have a false positive, but this doesn't differ from the case when the listener is unregistered after we drop the semaphore. So we don't need to do allocation beforehand. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Balbir Singh <balbir@in.ibm.com> Acked-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
e18b890bb0881bbab6f4f1a6cd20d9c60d66b003 |
|
07-Dec-2006 |
Christoph Lameter <clameter@sgi.com> |
[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
e94b1766097d53e6f3ccfb36c8baa562ffeda3fc |
|
07-Dec-2006 |
Christoph Lameter <clameter@sgi.com> |
[PATCH] slab: remove SLAB_KERNEL SLAB_KERNEL is an alias of GFP_KERNEL. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
17c157c889f4b07258af6bfec9e4e9dcf3c00178 |
|
15-Nov-2006 |
Thomas Graf <tgraf@suug.ch> |
[GENL]: Add genlmsg_put_reply() to simplify building reply headers By modyfing genlmsg_put() to take a genl_family and by adding genlmsg_put_reply() the process of constructing the netlink and generic netlink headers is simplified. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
3dabc7157859e706770c825aa229f8943db4e0e1 |
|
15-Nov-2006 |
Thomas Graf <tgraf@suug.ch> |
[GENL]: Add genlmsg_new() to allocate generic netlink messages Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
339bf98ffc6a8d8eb16fc532ac57ffbced2f8a68 |
|
10-Nov-2006 |
Thomas Graf <tgraf@suug.ch> |
[NETLINK]: Do precise netlink message allocations where possible Account for the netlink message header size directly in nlmsg_new() instead of relying on the caller calculate it correctly. Replaces error handling of message construction functions when constructing notifications with bug traps since a failure implies a bug in calculating the size of the skb. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
4a279ff1ea1cf325775ada983035123fcdc8e986 |
|
31-Oct-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: fix sub-threads accounting If there are no listeners, taskstats_exit_send() just returns because taskstats_exit_alloc() didn't allocate *tidstats. This is wrong, each sub-thread should do fill_tgid_exit() on exit, otherwise its ->delays is not recorded in ->signal->stats and lost. Q: We don't send TASKSTATS_TYPE_AGGR_TGID when single-threaded process exits. Is it good? How can the listener figure out that it was actually a process exit, not sub-thread? Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Balbir Singh <balbir@in.ibm.com> Acked-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
3d8334def5cf831d2ed438aae021696a2faa4ddd |
|
29-Oct-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: fix sk_buff size calculation prepare_reply() adds GENL_HDRLEN to the payload (genlmsg_total_size()), but then it does genlmsg_put()->nlmsg_put(). This means we forget to reserve a room for 'struct nlmsghdr'. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Thomas Graf <tgraf@suug.ch> Cc: Andrew Morton <akpm@osdl.org> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
d46a3d0d07ba539aea5b0e1ad30e568f0cb03576 |
|
29-Oct-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: fix sk_buff leak 'return genlmsg_cancel()' in taskstats_user_cmd/taskstats_exit_send potentially leaks a skb. Unless we pass 'rep_skb' to the netlink layer we own sk_buff. This means we should always do kfree_skb() on failure. [ Thomas acked and pointed out missing return value in original version ] Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Thomas Graf <tgraf@suug.ch> Cc: Andrew Morton <akpm@osdl.org> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
d7c3f5f231c60d7e6ada5770b536df2b3ec1bd08 |
|
28-Oct-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] fill_tgid: cleanup delays accounting fill_tgid() should skip not only an already exited group leader. If the task has ->exit_state != 0 it already did exit_notify(), so it also did fill_tgid_exit()->delayacct_add_tsk(->signal->stats) and we should skip it to avoid a double accounting. This patch doesn't close the race completely, but it cleanups the code. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
a98b6094261c0112e9c455c96995972181bff049 |
|
28-Oct-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: don't use tasklist_lock Remove tasklist_lock from taskstats.c. find_task_by_pid() is rcu-safe. ->siglock allows us to traverse subthread without tasklist. Q: delay accounting looks wrong to me. If sub-thread has already called taskstats_exit_send() but didn't call release_task(self) yet it will be accounted twice. The window is big. No? Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
b8534d7bd89df0cd41cd47bcd6733a05ea9a691a |
|
28-Oct-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] taskstats: kill ->taskstats_lock in favor of ->siglock signal_struct is (mostly) protected by ->sighand->siglock, I think we don't need ->taskstats_lock to protect ->stats. This also allows us to simplify the locking in fill_tgid(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
fca178c0c6e8d52a1875be36b070f30884ebfae9 |
|
28-Oct-2006 |
Oleg Nesterov <oleg@tv-sign.ru> |
[PATCH] fill_tgid: fix task_struct leak and possible oops 1. fill_tgid() forgets to do put_task_struct(first). 2. release_task(first) can happen after fill_tgid() drops tasklist_lock, it is unsafe to dereference first->signal. This is a temporary fix, imho the locking should be reworked. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
9acc1853519a0473620d424105f9d49ea5b4e62e |
|
01-Oct-2006 |
Jay Lan <jlan@engr.sgi.com> |
[PATCH] csa: Extended system accounting over taskstats Add extended system accounting handling over taskstats interface. A CONFIG_TASK_XACCT flag is created to enable the extended accounting code. Signed-off-by: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
f3cef7a99469afc159fec3a61b42dc7ca5b6824f |
|
01-Oct-2006 |
Jay Lan <jlan@engr.sgi.com> |
[PATCH] csa: basic accounting over taskstats Add some basic accounting fields to the taskstats struct, add a new kernel/tsacct.c to handle basic accounting data handling upon exit. A handle is added to taskstats.c to invoke the basic accounting data handling. Signed-off-by: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
0ae646845b603e9df5711084436d389f8371ffb3 |
|
01-Oct-2006 |
Balbir Singh <balbir@in.ibm.com> |
[PATCH] Fix taskstats size calculation (use the new genetlink utility functions) The addition of the CSA patch pushed the size of struct taskstats to 256 bytes. This exposed a problem with prepare_reply(), we were not allocating space for the netlink and genetlink header. It worked earlier because alloc_skb() would align the skb to SMP_CACHE_BYTES, which added some additonal bytes. Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jamal Hadi <hadi@cyberus.ca> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Thomas Graf <tgraf@suug.ch> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
fe4944e59c357f945f81bc67edb7ed1392e875ad |
|
05-Aug-2006 |
Thomas Graf <tgraf@suug.ch> |
[NETLINK]: Extend netlink messaging interface Adds: nlmsg_get_pos() return current position in message nlmsg_trim() trim part of message nla_reserve_nohdr(skb, len) reserve room for an attribute w/o hdr nla_put_nohdr(skb, len, data) add attribute w/o hdr nla_find_nested() find attribute in nested attributes Fixes nlmsg_new() to take allocation flags and consider size. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
|
d94a041519f3ab1ac023bf917619cd8c4a7d3c01 |
|
30-Jul-2006 |
Shailabh Nagar <nagar@watson.ibm.com> |
[PATCH] taskstats: free skb, avoid returns in send_cpu_listeners Add a missing freeing of skb in the case there are no listeners at all. Also remove the returning of error values by the function as it is unused by the sole caller. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
7d94dddd438bcba97db44f120da39bb001b5249f |
|
30-Jul-2006 |
Shailabh Nagar <nagar@watson.ibm.com> |
[PATCH] make taskstats sending completely independent of delay accounting on/off status Complete the separation of delay accounting and taskstats by ignoring the return value of delay accounting functions that fill in parts of taskstats before it is sent out (either in response to a command or as part of a task exit). Also make delayacct_add_tsk return silently when delay accounting is turned off rather than treat it as an error. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
bb129994c3bff9c5e8df91f05d7e9b6402fbd83f |
|
14-Jul-2006 |
Shailabh Nagar <nagar@watson.ibm.com> |
[PATCH] Remove down_write() from taskstats code invoked on the exit() path In send_cpu_listeners(), which is called on the exit path, a down_write() was protecting operations like skb_clone() and genlmsg_unicast() that do GFP_KERNEL allocations. If the oom-killer decides to kill tasks to satisfy the allocations,the exit of those tasks could block on the same semphore. The down_write() was only needed to allow removal of invalid listeners from the listener list. The patch converts the down_write to a down_read and defers the removal to a separate critical region. This ensures that even if the oom-killer is called, no other task's exit is blocked as it can still acquire another down_read. Thanks to Andrew Morton & Herbert Xu for pointing out the oom related pitfalls, and to Chandra Seetharaman for suggesting this fix instead of using something more complex like RCU. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
f9fd8914c1acca0d98b69d831b128d5b52f03c51 |
|
14-Jul-2006 |
Shailabh Nagar <nagar@watson.ibm.com> |
[PATCH] per-task delay accounting taskstats interface: control exit data through cpumasks On systems with a large number of cpus, with even a modest rate of tasks exiting per cpu, the volume of taskstats data sent on thread exit can overflow a userspace listener's buffers. One approach to avoiding overflow is to allow listeners to get data for a limited and specific set of cpus. By scaling the number of listeners and/or the cpus they monitor, userspace can handle the statistical data overload more gracefully. In this patch, each listener registers to listen to a specific set of cpus by specifying a cpumask. The interest is recorded per-cpu. When a task exits on a cpu, its taskstats data is unicast to each listener interested in that cpu. Thanks to Andrew Morton for pointing out the various scalability and general concerns of previous attempts and for suggesting this design. [akpm@osdl.org: build fix] Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
ad4ecbcba72855a2b5319b96e2a3a65ed1ca3bfd |
|
14-Jul-2006 |
Shailabh Nagar <nagar@watson.ibm.com> |
[PATCH] delay accounting taskstats interface send tgid once Send per-tgid data only once during exit of a thread group instead of once with each member thread exit. Currently, when a thread exits, besides its per-tid data, the per-tgid data of its thread group is also sent out, if its thread group is non-empty. The per-tgid data sent consists of the sum of per-tid stats for all *remaining* threads of the thread group. This patch modifies this sending in two ways: - the per-tgid data is sent only when the last thread of a thread group exits. This cuts down heavily on the overhead of sending/receiving per-tgid data, especially when other exploiters of the taskstats interface aren't interested in per-tgid stats - the semantics of the per-tgid data sent are changed. Instead of being the sum of per-tid data for remaining threads, the value now sent is the true total accumalated statistics for all threads that are/were part of the thread group. The patch also addresses a minor issue where failure of one accounting subsystem to fill in the taskstats structure was causing the send of taskstats to not be sent at all. The patch has been tested for stability and run cerberus for over 4 hours on an SMP. [akpm@osdl.org: bugfixes] Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
6f44993fe1d7b2b097f6ac60cd5835c6f5ca0874 |
|
14-Jul-2006 |
Shailabh Nagar <nagar@watson.ibm.com> |
[PATCH] per-task-delay-accounting: delay accounting usage of taskstats interface Usage of taskstats interface by delay accounting. Signed-off-by: Shailabh Nagar <nagar@us.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
c757249af152c59fd74b85e52e8c090acb33d9c0 |
|
14-Jul-2006 |
Shailabh Nagar <nagar@watson.ibm.com> |
[PATCH] per-task-delay-accounting: taskstats interface Create a "taskstats" interface based on generic netlink (NETLINK_GENERIC family), for getting statistics of tasks and thread groups during their lifetime and when they exit. The interface is intended for use by multiple accounting packages though it is being created in the context of delay accounting. This patch creates the interface without populating the fields of the data that is sent to the user in response to a command or upon the exit of a task. Each accounting package interested in using taskstats has to provide an additional patch to add its stats to the common structure. [akpm@osdl.org: cleanups, Kconfig fix] Signed-off-by: Shailabh Nagar <nagar@us.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|