f4ce6eca71d15b8e12a33ac8e1ef733a83944d2e |
|
13-Aug-2014 |
Borislav Petkov <bp@suse.de> |
EDAC: Fix mem_types strings type This one got forgotten during an earlier cleanup. Signed-off-by: Borislav Petkov <bp@suse.de>
|
76ac8275f296b49c58f684825543bf4eb85d43d0 |
|
11-Jun-2014 |
Chen, Gong <gong.chen@linux.intel.com> |
trace, RAS: Add basic RAS trace event To avoid confuision and conflict of usage for RAS related trace event, add an unified RAS trace event stub. Start a RAS subsystem menu which will be fleshed out in time, when more features get added to it. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1402475691-30045-2-git-send-email-gong.chen@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
|
aa2064d7dd35ac5812645780d2f22a7899e7c6e1 |
|
09-May-2014 |
Loc Ho <lho@apm.com> |
EDAC: Fix MC scrub mode comparsion bug for correctable errors The MC structure field scrub_mode is of integer type - not bit field. Use it accordingly. Signed-off-by: Loc Ho <lho@apm.com> Link: http://lkml.kernel.org/r/1399590199-12256-2-git-send-email-lho@apm.com Signed-off-by: Borislav Petkov <bp@suse.de>
|
cb6ef42e516cb8948f15e4b70dc03af8020050a2 |
|
12-Feb-2014 |
Borislav Petkov <bp@suse.de> |
EDAC: Correct workqueue setup path We're using edac_mc_workq_setup() both on the init path, when we load an edac driver and when we change the polling period (edac_mc_reset_delay_period) through /sys/.../edac_mc_poll_msec. On that second path we don't need to init the workqueue which has been initialized already. Thanks to Tejun for workqueue insights. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1391457913-881-1-git-send-email-prarit@redhat.com Cc: <stable@vger.kernel.org>
|
9da21b1509d8aa7ab4846722817d16c72d656c91 |
|
03-Feb-2014 |
Borislav Petkov <bp@suse.de> |
EDAC: Poll timeout cannot be zero, p2 Sanitize code even more to accept unsigned longs only and to not allow polling intervals below 1 second as this is unnecessary and doesn't make much sense anyway for polling errors. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1391457913-881-1-git-send-email-prarit@redhat.com Cc: Doug Thompson <dougthompson@xmission.com> Cc: <stable@vger.kernel.org>
|
7270a6085a20a9c6aecb0be8c70510702118dc71 |
|
10-Oct-2013 |
Robert Richter <robert.richter@linaro.org> |
edac: Unify reporting of device info for device, mc and pci Log messages slightly differ between edac subsystems. Unifying it. Signed-off-by: Robert Richter <robert.richter@linaro.org> Acked-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Robert Richter <rric@kernel.org>
|
88d84ac97378c2f1d5fec9af1e8b7d9a662d6b00 |
|
19-Jul-2013 |
Borislav Petkov <bp@suse.de> |
EDAC: Fix lockdep splat Fix the following: BUG: key ffff88043bdd0330 not in .data! ------------[ cut here ]------------ WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0() DEBUG_LOCKS_WARN_ON(1) Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor microcode CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 #1 Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013 0000000000000009 ffff880439a1d920 ffffffff8160a9a9 ffff880439a1d958 ffffffff8103d9e0 ffff88043af4a510 ffffffff81a16e11 0000000000000000 ffff88043bdd0330 0000000000000000 ffff880439a1d9b8 ffffffff8103dacc Call Trace: dump_stack warn_slowpath_common warn_slowpath_fmt lockdep_init_map ? trace_hardirqs_on_caller ? trace_hardirqs_on debug_mutex_init __mutex_init bus_register edac_create_sysfs_mci_device edac_mc_add_mc sbridge_probe pci_device_probe driver_probe_device __driver_attach ? driver_probe_device bus_for_each_dev driver_attach bus_add_driver driver_register __pci_register_driver ? 0xffffffffa0010fff sbridge_init ? 0xffffffffa0010fff do_one_initcall load_module ? unset_module_init_ro_nx SyS_init_module tracesys ---[ end trace d24a70b0d3ddf733 ]--- EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 0000:3f:0e.0 EDAC sbridge: Driver loaded. What happens is that bus_register needs a statically allocated lock_key because the last is handed in to lockdep. However, struct mem_ctl_info embeds struct bus_type (the whole struct, not a pointer to it) and the whole thing gets dynamically allocated. Fix this by using a statically allocated struct bus_type for the MC bus. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: stable@kernel.org # v3.10 Signed-off-by: Tony Luck <tony.luck@intel.com>
|
9713faecff3d071de1208b081d4943b002e9cb1c |
|
11-Mar-2013 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
EDAC: Merge mci.mem_is_per_rank with mci.csbased Both mci.mem_is_per_rank and mci.csbased denote the same thing: the memory controller is csrows based. Merge both fields into one. There's no need for the driver to actually fill it, as the core detects it by checking if one of the layers has the csrows type as part of the memory hierarchy: if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT) per_rank = true; Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de>
|
e7e248304c8ccf02b89e04c3b3b66006b993b5a7 |
|
31-Oct-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: add support for raw error reports That allows APEI GHES driver to report errors directly, using the EDAC error report API. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
c7ef7645544131b0750478d1cf94cdfa945c809d |
|
21-Feb-2013 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: reduce stack pressure by using a pre-allocated buffer The number of variables at the stack is too big. Reduces the stack usage by using a pre-allocated error buffer. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
80cc7d87d5eb34375f916d282450a0906a8ead60 |
|
31-Oct-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: lock module owner to avoid error report conflicts APEI GHES and i7core_edac/sb_edac currently can be loaded at the same time, but those are Highlander modules: "There can be only one". There are two reasons for that: 1) Each driver assumes that it is the only one registering at the EDAC core, as it is driver's responsibility to number the memory controllers, and all of them start from 0; 2) If BIOS is handling the memory errors, the OS can't also be doing it, as one will mangle with the other. So, we need to add an module owner's lock at the EDAC core, in order to avoid having two different modules handling memory errors at the same time. The best way for doing this lock seems to use the driver's name, as this is unique, and won't require changes on every driver. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
c66b5a79a9348ccd6d1cd81416027d0e12da965d |
|
15-Feb-2013 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: add a new memory layer type There are some cases where the memory controller layout is completely hidden. This is the case of firmware-driven error code, like the one provided by GHES. Add a new layer to be used on such memory error report mechanisms. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
d3d09e18203dba16a9dbdb2b4cc673d90748cdd1 |
|
26-Jan-2013 |
Joe Perches <joe@perches.com> |
EDAC: Fix kcalloc argument order First number, then size. Signed-off-by: Joe Perches <joe@perches.com> Cc: <stable@vger.kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>
|
80f5ab097b87c86581cb9736a8e55c5a3047d4bb |
|
19-Aug-2012 |
Shaun Ruffell <sruffell@digium.com> |
edac: edac_mc no longer deals with kobjects directly There are no more embedded kobjects in struct mem_ctl_info. Remove a header and a comment that does not reflect the code anymore. Signed-off-by: Shaun Ruffell <sruffell@digium.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
f430d5707aa47af8669bbc0083a79e7d780908b2 |
|
10-Sep-2012 |
Borislav Petkov <borislav.petkov@amd.com> |
EDAC: Handle empty msg strings when reporting errors A reported error could look like this [ 226.178315] EDAC MC0: 1 CE on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x427c0d offset:0xde0 grain:0 syndrome:0x1c6) with two spaces back-to-back due to the msg argument of edac_mc_handle_error being passed on empty by the specific drivers. Handle that. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
|
4da1b7bfe7699881c761d71b5e299a65bce48ab2 |
|
10-Sep-2012 |
Borislav Petkov <borislav.petkov@amd.com> |
EDAC: Remove useless assignment of error type The tracepoint decodes the error type later anyway so remove a useless assignment to the temporary p which gets overwritten later anyway. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
|
24bef66e74d647aebd34e0bef7693512b7912029 |
|
24-Oct-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: Fix the dimm filling for csrows-based layouts The driver is currently filling data in a wrong way, on drivers for csrows-based memory controller, when the first layer is a csrow. This is not easily to notice, as, in general, memories are filed in dual, interleaved, symetric mode, as very few memory controllers support asymetric modes. While digging into a bug for i82795_edac driver, the asymetric mode there is now working, allowing us to fill the machine with 4x1GB ranks at channel 0, and 2x512GB at channel 1: Channel 0 ranks: EDAC DEBUG: i82975x_init_csrows: DIMM A0: from page 0x00000000 to 0x0003ffff (size: 0x00040000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM A1: from page 0x00040000 to 0x0007ffff (size: 0x00040000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM A2: from page 0x00080000 to 0x000bffff (size: 0x00040000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM A3: from page 0x000c0000 to 0x000fffff (size: 0x00040000 pages) Channel 1 ranks: EDAC DEBUG: i82975x_init_csrows: DIMM B0: from page 0x00100000 to 0x0011ffff (size: 0x00020000 pages) EDAC DEBUG: i82975x_init_csrows: DIMM B1: from page 0x00120000 to 0x0013ffff (size: 0x00020000 pages) Instead of properly showing the memories as such, before this patch, it shows the memory layout as: +-----------------------------------+ | mc0 | | csrow0 | csrow1 | csrow2 | ----------+-----------------------------------+ channel1: | 1024 MB | 1024 MB | 512 MB | channel0: | 1024 MB | 1024 MB | 512 MB | ----------+-----------------------------------+ as if both channels were symetric, grouping the DIMMs on a wrong layout. After this patch, the memory is correctly represented. So, for csrows at layers[0], it shows: +-----------------------------------------------+ | mc0 | | csrow0 | csrow1 | csrow2 | csrow3 | ----------+-----------------------------------------------+ channel1: | 512 MB | 512 MB | 0 MB | 0 MB | channel0: | 1024 MB | 1024 MB | 1024 MB | 1024 MB | ----------+-----------------------------------------------+ For csrows at layers[1], it shows: +-----------------------+ | mc0 | | channel0 | channel1 | --------+-----------------------+ csrow3: | 1024 MB | 0 MB | csrow2: | 1024 MB | 0 MB | --------+-----------------------+ csrow1: | 1024 MB | 512 MB | csrow0: | 1024 MB | 512 MB | --------+-----------------------+ So, no matter of what comes first, the information between channel and csrow will be properly represented. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
faa2ad09c01c48012fe4c117d3256e354e0f9238 |
|
23-Sep-2012 |
Shaun Ruffell <sruffell@digium.com> |
edac_mc: edac_mc_free() cannot assume mem_ctl_info is registered in sysfs. Fix potential NULL pointer dereference in edac_unregister_sysfs() on system boot introduced in 3.6-rc1. Since commit 7a623c039 ("edac: rewrite the sysfs code to use struct device") edac_mc_alloc() no longer initializes embedded kobjects in struct mem_ctl_info. Therefore edac_mc_free() can no longer simply decrement a kobject reference count to free the allocated memory unless the memory controller driver module had also called edac_mc_add_mc(). Now edac_mc_free() will check if the newly embedded struct device has been registered with sysfs before using either the standard device release functions or freeing the data structures itself with logic pulled out of the error path of edac_mc_alloc(). The BUG this patch resolves for me: BUG: unable to handle kernel NULL pointer dereference at (null) EIP is at __wake_up_common+0x1a/0x6a Process modprobe (pid: 933, ti=f3dc6000 task=f3db9520 task.ti=f3dc6000) Call Trace: complete_all+0x3f/0x50 device_pm_remove+0x23/0xa2 device_del+0x34/0x142 edac_unregister_sysfs+0x3b/0x5c [edac_core] edac_mc_free+0x29/0x2f [edac_core] e7xxx_probe1+0x268/0x311 [e7xxx_edac] e7xxx_init_one+0x56/0x61 [e7xxx_edac] local_pci_probe+0x13/0x15 ... Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: Shaun Ruffell <sruffell@digium.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
ef6e7816b4546475d04b4ea22d58c48472157c70 |
|
23-Sep-2012 |
Fengguang Wu <fengguang.wu@intel.com> |
edac_mc: fix messy kfree calls in the error path coccinelle warns about: + drivers/edac/edac_mc.c:429:9-23: ERROR: reference preceded by free on line 429 421 if (mci->csrows) { > 422 for (chn = 0; chn < tot_channels; chn++) { 423 csr = mci->csrows[chn]; 424 if (csr) { > 425 for (chn = 0; chn < tot_channels; chn++) 426 kfree(csr->channels[chn]); 427 kfree(csr); 428 } > 429 kfree(mci->csrows[i]); 430 } 431 kfree(mci->csrows); 432 } and that code block seem to mess things up in several ways (double free, memory leak, out-of-bound reads etc.): L422: The iterator "chn" and bound "tot_channels" are totally wrong. Should be "row" and "tot_csrows" respectively. Which means either memory leak, or out-of-bound reads (which if does not trigger an immediate page fault error, will further lead to kfree() on random addresses). L425: The inner loop is reusing the same iterator "chn" as the outer loop, which could lead to premature end of the outer loop, and hence memory leak. L429: The array index 'i' in mci->csrows[i] is a temporary value used in previous loops, and won't change at all in the current loop. Which means either out-of-bound read and possibly kfree(random number), or the same mci->csrows[i] get freed once and again, and possibly double free for the kfree(csr) in L427. L426/L427: a kfree(csr->channels) is needed in between to avoid leaking the memory. The buggy code was introduced by commit de3910eb ("edac: change the mem allocation scheme to make Documentation/kobject.txt happy") in the 3.6-rc1 merge window. Fix it by freeing up resources in this order: free csrows[i]->channels[j] free csrows[i]->channels free csrows[i] free csrows CC: Mauro Carvalho Chehab <mchehab@redhat.com> CC: Shaun Ruffell <sruffell@digium.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
41f63c5359d14ca995172b8f6eaffd93f60fec54 |
|
03-Aug-2012 |
Tejun Heo <tj@kernel.org> |
workqueue: use mod_delayed_work() instead of cancel + queue Convert delayed_work users doing cancel_delayed_work() followed by queue_delayed_work() to mod_delayed_work(). Most conversions are straight-forward. Ones worth mentioning are, * drivers/edac/edac_mc.c: edac_mc_workq_setup() converted to always use mod_delayed_work() and cancel loop in edac_mc_reset_delay_period() is dropped. * drivers/platform/x86/thinkpad_acpi.c: No need to remember whether watchdog is active or not. @fan_watchdog_active and related code dropped. * drivers/power/charger-manager.c: Seemingly a lot of delayed_work_pending() abuse going on here. [delayed_]work_pending() are unsynchronized and racy when used like this. I converted one instance in fullbatt_handler(). Please conver the rest so that it invokes workqueue APIs for the intended target state rather than trying to game work item pending state transitions. e.g. if timer should be modified - call mod_delayed_work(), canceled - call cancel_delayed_work[_sync](). * drivers/thermal/thermal_sys.c: thermal_zone_device_set_polling() simplified. Note that round_jiffies() calls in this function are meaningless. round_jiffies() work on absolute jiffies not delta delay used by delayed_work. v2: Tomi pointed out that __cancel_delayed_work() users can't be safely converted to mod_delayed_work(). They could be calling it from irq context and if that happens while delayed_work_timer_fn() is running, it could deadlock. __cancel_delayed_work() users are dropped. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Doug Thompson <dougthompson@xmission.com> Cc: David Airlie <airlied@linux.ie> Cc: Roland Dreier <roland@kernel.org> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Johannes Berg <johannes@sipsolutions.net>
|
9eb07a7fb8a90ee39fa9d5489afc0330cfcfbea7 |
|
04-Jun-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: edac_mc_handle_error(): add an error_count parameter In order to avoid loosing error events, it is desirable to group error events together and generate a single trace for several identical errors. The trace API already allows reporting multiple errors. Change the handle_error function to also allow that. The changes at the drivers were made by this small script: $file .=$_ while (<>); $file =~ s/(edac_mc_handle_error)\s*\(([^\,]+)\,([^\,]+)\,/$1($2,$3, 1,/g; print $file; Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
03f7eae80f4b913929be84e0c883ee98196fd6ff |
|
04-Jun-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: remove arch-specific parameter for the error handler Remove the arch-dependent parameter, as it were not used, as the MCE tracepoint weren't implemented. It probably doesn't make sense to have an MCE-specific tracepoint, as this will cost more bytes at the tracepoint, and tracepoint is not free. The changes at the EDAC drivers were done by this small perl script: $file .=$_ while (<>); $file =~ s/(edac_mc_handle_error)\s*\(([^\;]+)\,([^\,\)]+)\s*\)/$1($2)/g; print $file; Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
08a4a136909602eae0e71e147153461df077a46f |
|
18-May-2012 |
Dan Carpenter <dan.carpenter@oracle.com> |
edac_mc: check for allocation failure in edac_mc_alloc() Add a check here for if kzalloc() failed. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
6e84d359b2bea5ce659b3c3e5d3003fb11bd91d5 |
|
30-Apr-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac_mc: Cleanup per-dimm_info debug messages The edac_mc_alloc() routine allocates one dimm_info device for all possible memories, including the non-filled ones. The debug messages there are somewhat confusing. So, cleans them, by moving the code that prints the memory location to edac_mc, and using it on both edac_mc_sysfs and edac_mc. Also, only dumps information when DIMM/ranks are actually filled. After this patch, a dimm-based memory controller will print the debug info as: [ 1011.380027] EDAC DEBUG: edac_mc_dump_csrow: csrow->csrow_idx = 0 [ 1011.380029] EDAC DEBUG: edac_mc_dump_csrow: csrow = ffff8801169be000 [ 1011.380031] EDAC DEBUG: edac_mc_dump_csrow: csrow->first_page = 0x0 [ 1011.380032] EDAC DEBUG: edac_mc_dump_csrow: csrow->last_page = 0x0 [ 1011.380034] EDAC DEBUG: edac_mc_dump_csrow: csrow->page_mask = 0x0 [ 1011.380035] EDAC DEBUG: edac_mc_dump_csrow: csrow->nr_channels = 3 [ 1011.380037] EDAC DEBUG: edac_mc_dump_csrow: csrow->channels = ffff8801149c2840 [ 1011.380039] EDAC DEBUG: edac_mc_dump_csrow: csrow->mci = ffff880117426000 [ 1011.380041] EDAC DEBUG: edac_mc_dump_channel: channel->chan_idx = 0 [ 1011.380042] EDAC DEBUG: edac_mc_dump_channel: channel = ffff8801149c2860 [ 1011.380044] EDAC DEBUG: edac_mc_dump_channel: channel->csrow = ffff8801169be000 [ 1011.380046] EDAC DEBUG: edac_mc_dump_channel: channel->dimm = ffff88010fe90400 ... [ 1011.380095] EDAC DEBUG: edac_mc_dump_dimm: dimm0: channel 0 slot 0 mapped as virtual row 0, chan 0 [ 1011.380097] EDAC DEBUG: edac_mc_dump_dimm: dimm = ffff88010fe90400 [ 1011.380099] EDAC DEBUG: edac_mc_dump_dimm: dimm->label = 'CPU#0Channel#0_DIMM#0' [ 1011.380101] EDAC DEBUG: edac_mc_dump_dimm: dimm->nr_pages = 0x40000 [ 1011.380103] EDAC DEBUG: edac_mc_dump_dimm: dimm->grain = 8 [ 1011.380104] EDAC DEBUG: edac_mc_dump_dimm: dimm->nr_pages = 0x40000 ... (a rank-based memory controller would print, instead of "dimm?", "rank?" on the above debug info) Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
956b9ba156dbfdb9cede2b2927ddf8be2233b3a7 |
|
29-Apr-2012 |
Joe Perches <joe@perches.com> |
edac: Convert debugfX to edac_dbg(X, Use a more common debugging style. Remove __FILE__ uses, add missing newlines, coalesce formats and align arguments. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
dd23cd6eb1f59ba722a6e6aa228adff7c01404de |
|
29-Apr-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs The debug macro already adds that. Most of the work here was made by this small script: $f .=$_ while (<>); $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g; $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g; $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g; $f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g; $f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g; $f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g; $f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g; $f =~ s/\"MC\: \\n\"/"MC:\\n"/g; print $f; After running the script, manual cleanups were done to fix it the remaining places. While here, removed the __LINE__ on most places, as it doesn't actually give useful info on most places. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
de3910eb79ac8c0f29a11224661c0ebaaf813039 |
|
24-Apr-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: change the mem allocation scheme to make Documentation/kobject.txt happy Kernel kobjects have rigid rules: each container object should be dynamically allocated, and can't be allocated into a single kmalloc. EDAC never obeyed this rule: it has a single malloc function that allocates all needed data into a single kzalloc. As this is not accepted anymore, change the allocation schema of the EDAC *_info structs to enforce this Kernel standard. Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Greg K H <gregkh@linuxfoundation.org> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Olof Johansson <olof@lixom.net> Cc: Egor Martovetsky <egor@pasemi.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
d90c008963ef638cb7ab7d5eb76362b3c2d379bc |
|
21-Mar-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: Get rid of the old kobj's from the edac mc code Now that al users for the old kobj raw access are gone, we can get rid of the legacy kobj-based structures and data. Reviewed-by: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
7a623c039075e4ea21648d88133fafa6dcfd113d |
|
16-Apr-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: rewrite the sysfs code to use struct device The EDAC subsystem uses the old struct sysdev approach, creating all nodes using the raw sysfs API. This is bad, as the API is deprecated. As we'll be changing the EDAC API, let's first port the existing code to struct device. There's one drawback on this patch: driver-specific sysfs nodes, used by mpc85xx_edac, amd64_edac and i7core_edac won't be created anymore. While it would be possible to also port the device-specific code, that would mix kobj with struct device, with is not recommended. Also, it is easier and nicer to move the code to the drivers, instead, as the core can get rid of some complex logic that just emulates what the device_add() and device_create_file() already does. The next patches will convert the driver-specific code to use the device-specific calls. Then, the remaining bits of the old sysfs API will be removed. NOTE: a per-MC bus is required, otherwise devices with more than one memory controller will hit a bug like the one below: [ 819.094946] EDAC DEBUG: find_mci_by_dev: find_mci_by_dev() [ 819.094948] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device() idx=1 [ 819.094952] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device(): creating device mc1 [ 819.094967] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device creating dimm0, located at channel 0 slot 0 [ 819.094984] ------------[ cut here ]------------ [ 819.100142] WARNING: at fs/sysfs/dir.c:481 sysfs_add_one+0xc1/0xf0() [ 819.107282] Hardware name: S2600CP [ 819.111078] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0' [ 819.119062] Modules linked in: sb_edac(+) edac_core ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc sunrpc binfmt_misc dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun kvm microcode pcspkr iTCO_wdt iTCO_vendor_support igb i2c_i801 i2c_core sg ioatdma dca sr_mod cdrom sd_mod crc_t10dif ahci libahci isci libsas libata scsi_transport_sas scsi_mod wmi dm_mod [last unloaded: scsi_wait_scan] [ 819.175748] Pid: 10902, comm: modprobe Not tainted 3.3.0-0.11.el7.v12.2.x86_64 #1 [ 819.184113] Call Trace: [ 819.186868] [<ffffffff8105adaf>] warn_slowpath_common+0x7f/0xc0 [ 819.193573] [<ffffffff8105aea6>] warn_slowpath_fmt+0x46/0x50 [ 819.200000] [<ffffffff811f53d1>] sysfs_add_one+0xc1/0xf0 [ 819.206025] [<ffffffff811f5cf5>] sysfs_do_create_link+0x135/0x220 [ 819.212944] [<ffffffff811f7023>] ? sysfs_create_group+0x13/0x20 [ 819.219656] [<ffffffff811f5df3>] sysfs_create_link+0x13/0x20 [ 819.226109] [<ffffffff813b04f6>] bus_add_device+0xe6/0x1b0 [ 819.232350] [<ffffffff813ae7cb>] device_add+0x2db/0x460 [ 819.238300] [<ffffffffa0325634>] edac_create_dimm_object+0x84/0xf0 [edac_core] [ 819.246460] [<ffffffffa0325e18>] edac_create_sysfs_mci_device+0xe8/0x290 [edac_core] [ 819.255215] [<ffffffffa0322e2a>] edac_mc_add_mc+0x5a/0x2c0 [edac_core] [ 819.262611] [<ffffffffa03412df>] sbridge_register_mci+0x1bc/0x279 [sb_edac] [ 819.270493] [<ffffffffa03417a3>] sbridge_probe+0xef/0x175 [sb_edac] [ 819.277630] [<ffffffff813ba4e8>] ? pm_runtime_enable+0x58/0x90 [ 819.284268] [<ffffffff812f430c>] local_pci_probe+0x5c/0xd0 [ 819.290508] [<ffffffff812f5ba1>] __pci_device_probe+0xf1/0x100 [ 819.297117] [<ffffffff812f5bea>] pci_device_probe+0x3a/0x60 [ 819.303457] [<ffffffff813b1003>] really_probe+0x73/0x270 [ 819.309496] [<ffffffff813b138e>] driver_probe_device+0x4e/0xb0 [ 819.316104] [<ffffffff813b149b>] __driver_attach+0xab/0xb0 [ 819.322337] [<ffffffff813b13f0>] ? driver_probe_device+0xb0/0xb0 [ 819.329151] [<ffffffff813af5d6>] bus_for_each_dev+0x56/0x90 [ 819.335489] [<ffffffff813b0d7e>] driver_attach+0x1e/0x20 [ 819.341534] [<ffffffff813b0980>] bus_add_driver+0x1b0/0x2a0 [ 819.347884] [<ffffffffa0347000>] ? 0xffffffffa0346fff [ 819.353641] [<ffffffff813b19f6>] driver_register+0x76/0x140 [ 819.359980] [<ffffffff8159f18b>] ? printk+0x51/0x53 [ 819.365524] [<ffffffffa0347000>] ? 0xffffffffa0346fff [ 819.371291] [<ffffffff812f5896>] __pci_register_driver+0x56/0xd0 [ 819.378096] [<ffffffffa0347054>] sbridge_init+0x54/0x1000 [sb_edac] [ 819.385231] [<ffffffff8100203f>] do_one_initcall+0x3f/0x170 [ 819.391577] [<ffffffff810bcd2e>] sys_init_module+0xbe/0x230 [ 819.397926] [<ffffffff815bb529>] system_call_fastpath+0x16/0x1b [ 819.404633] ---[ end trace 1654fdd39556689f ]--- This happens because the bus is not being properly initialized. Instead of putting the memory sub-devices inside the memory controller, it is putting everything under the same directory: $ tree /sys/bus/edac/ /sys/bus/edac/ ├── devices │ ├── all_channel_counts -> ../../../devices/system/edac/mc/mc0/all_channel_counts │ ├── csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0 │ ├── csrow1 -> ../../../devices/system/edac/mc/mc0/csrow1 │ ├── csrow2 -> ../../../devices/system/edac/mc/mc0/csrow2 │ ├── dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0 │ ├── dimm1 -> ../../../devices/system/edac/mc/mc0/dimm1 │ ├── dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3 │ ├── dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6 │ ├── inject_addrmatch -> ../../../devices/system/edac/mc/mc0/inject_addrmatch │ ├── mc -> ../../../devices/system/edac/mc │ └── mc0 -> ../../../devices/system/edac/mc/mc0 ├── drivers ├── drivers_autoprobe ├── drivers_probe └── uevent On a multi-memory controller system, the names "csrow%d" and "dimm%d" should be under "mc%d", and not at the main hierarchy level. So, we need to create a per-MC bus, in order to have its own namespace. Reviewed-by: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Greg K H <gregkh@linuxfoundation.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
8447c4d15e357a458c9051ddc84aa6c8b9c27000 |
|
06-Jun-2012 |
Chris Metcalf <cmetcalf@tilera.com> |
edac: Do alignment logic properly in edac_align_ptr() The logic was checking the sizeof the structure being allocated to determine whether an alignment fixup was required. This isn't right; what we actually care about is the alignment of the actual pointer that's about to be returned. This became an issue recently because struct edac_mc_layer has a size that is not zero modulo eight, so we were taking the correctly-aligned pointer and forcing it to be misaligned. On Tile this caused an alignment exception. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
fd687502dc8037aa5a4b84c570ada971106574ee |
|
16-Mar-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: Rename the parent dev to pdev As EDAC doesn't use struct device itself, it created a parent dev pointer called as "pdev". Now that we'll be converting it to use struct device, instead of struct devsys, this needs to be fixed. No functional changes. Reviewed-by: Aristeu Rozanski <arozansk@redhat.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Olof Johansson <olof@lixom.net> Cc: Egor Martovetsky <egor@pasemi.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
53f2d02898755d1b24bde1975e202815d29fdb81 |
|
23-Feb-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
RAS: Add a tracepoint for reporting memory controller events Add a new tracepoint-based hardware events report method for reporting Memory Controller events. Part of the description bellow is shamelessly copied from Tony Luck's notes about the Hardware Error BoF during LPC 2010 [1]. Tony, thanks for your notes and discussions to generate the h/w error reporting requirements. [1] http://lwn.net/Articles/416669/ We have several subsystems & methods for reporting hardware errors: 1) EDAC ("Error Detection and Correction"). In its original form this consisted of a platform specific driver that read topology information and error counts from chipset registers and reported the results via a sysfs interface. 2) mcelog - x86 specific decoding of machine check bank registers reporting in binary form via /dev/mcelog. Recent additions make use of the APEI extensions that were documented in version 4.0a of the ACPI specification to acquire more information about errors without having to rely reading chipset registers directly. A user level programs decodes into somewhat human readable format. 3) drivers/edac/mce_amd.c - this driver hooks into the mcelog path and decodes errors reported via machine check bank registers in AMD processors to the console log using printk(); Each of these mechanisms has a band of followers ... and none of them appear to meet all the needs of all users. As part of a RAS subsystem, let's encapsulate the memory error hardware events into a trace facility. The tracepoint printk will be displayed like: mc_event: [quant] (Corrected|Uncorrected|Fatal) error:[error msg] on [label] ([location] [edac_mc detail] [driver_detail] Where: [quant] is the quantity of errors [error msg] is the driver-specific error message (e. g. "memory read", "bus error", ...); [location] is the location in terms of memory controller and branch/channel/slot, channel/slot or csrow/channel; [label] is the memory stick label; [edac_mc detail] describes the address location of the error and the syndrome; [driver detail] is driver-specifig error message details, when needed/provided (e. g. "area:DMA", ...) For example: mc_event: 1 Corrected error:memory read on memory stick DIMM_1A (mc:0 location:0:0:0 page:0x586b6e offset:0xa66 grain:32 syndrome:0x0 area:DMA) Of course, any userspace tools meant to handle errors should not parse the above data. They should, instead, use the binary fields provided by the tracepoint, mapping them directly into their Management Information Base. NOTE: The original patch was providing an additional mechanism for MCA-based trace events that also contained MCA error register data. However, as no agreement was reached so far for the MCA-based trace events, for now, let's add events only for memory errors. A latter patch is planned to change the tracepoint, for those types of event. Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
5926ff502f6b93ca0c1654f8a5c5317ea236dbdb |
|
09-Feb-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: Initialize the dimm label with the known information While userspace doesn't fill the dimm labels, add there the dimm location, as described by the used memory model. This could eventually match what is described at the dmidecode, making easier for people to identify the memory. For example, on an Intel motherboard where the DMI table is reliable, the first memory stick is described as: Memory Device Array Handle: 0x0029 Error Information Handle: Not Provided Total Width: 64 bits Data Width: 64 bits Size: 2048 MB Form Factor: DIMM Set: 1 Locator: A1_DIMM0 Bank Locator: A1_Node0_Channel0_Dimm0 Type: <OUT OF SPEC> Type Detail: Synchronous Speed: 800 MHz Manufacturer: A1_Manufacturer0 Serial Number: A1_SerNum0 Asset Tag: A1_AssetTagNum0 Part Number: A1_PartNum0 The memory named as "A1_DIMM0" is physically located at the first memory controller (node 0), at channel 0, dimm slot 0. After this patch, the memory label will be filled with: /sys/devices/system/edac/mc/csrow0/ch0_dimm_label:mc#0channel#0slot#0 And (after the new EDAC API patches) as: /sys/devices/system/edac/mc/mc0/dimm0/dimm_label:mc#0channel#0slot#0 So, even if the memory label is not initialized on userspace, an useful information with the error location is filled there, expecially since several systems/motherboards are provided with enough info to map from channel/slot (or branch/channel/slot) into the DIMM label. So, letting the EDAC core fill it by default is a good thing. It should noticed that, as the label filling happens at the edac_mc_alloc(), drivers can override it to better describe the memories (and some actually do it). Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
ca0907b9e413bb1d1f3ea123b663535b74928846 |
|
02-May-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: Remove the legacy EDAC ABI Now that all drivers got converted to use the new ABI, we can drop the old one. Acked-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
4275be63559719c3149b19751029f1b0f1b26775 |
|
18-Apr-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: Change internal representation to work with layers Change the EDAC internal representation to work with non-csrow based memory controllers. There are lots of those memory controllers nowadays, and more are coming. So, the EDAC internal representation needs to be changed, in order to work with those memory controllers, while preserving backward compatibility with the old ones. The edac core was written with the idea that memory controllers are able to directly access csrows. This is not true for FB-DIMM and RAMBUS memory controllers. Also, some recent advanced memory controllers don't present a per-csrows view. Instead, they view memories as DIMMs, instead of ranks. So, change the allocation and error report routines to allow them to work with all types of architectures. This will allow the removal of several hacks with FB-DIMM and RAMBUS memory controllers. Also, several tests were done on different platforms using different x86 drivers. TODO: a multi-rank DIMMs are currently represented by multiple DIMM entries in struct dimm_info. That means that changing a label for one rank won't change the same label for the other ranks at the same DIMM. This bug is present since the beginning of the EDAC, so it is not a big deal. However, on several drivers, it is possible to fix this issue, but it should be a per-driver fix, as the csrow => DIMM arrangement may not be equal for all. So, don't try to fix it here yet. I tried to make this patch as short as possible, preceding it with several other patches that simplified the logic here. Yet, as the internal API changes, all drivers need changes. The changes are generally bigger in the drivers for FB-DIMMs. Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Olof Johansson <olof@lixom.net> Cc: Egor Martovetsky <egor@pasemi.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
93e4fe64ece4eccf0ff4ac69bceb389290b8ab7c |
|
16-Apr-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: rewrite edac_align_ptr() The edac_align_ptr() function is used to prepare data for a single memory allocation kzalloc() call. It counts how many bytes are needed by some data structure. Using it as-is is not that trivial, as the quantity of memory elements reserved is not there, but, instead, it is on a next call. In order to avoid mistakes when using it, move the number of allocated elements into it, making easier to use it. Reviewed-by: Borislav Petkov <bp@amd64.org> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
a895bf8b1e1ea4c032a8fa8a09475a2ce09fe77a |
|
28-Jan-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: move nr_pages to dimm struct The number of pages is a dimm property. Move it to the dimm struct. After this change, it is possible to add sysfs nodes for the DIMM's that will properly represent the DIMM stick properties, including its size. A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when the memory controller represents the memory via chip select rows. Reviewed-by: Aristeu Rozanski <arozansk@redhat.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Olof Johansson <olof@lixom.net> Cc: Egor Martovetsky <egor@pasemi.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
084a4fccef39ac7abb039511f32380f28d0b67e6 |
|
27-Jan-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: move dimm properties to struct dimm_info On systems based on chip select rows, all channels need to use memories with the same properties, otherwise the memories on channels A and B won't be recognized. However, such assumption is not true for all types of memory controllers. Controllers for FB-DIMM's don't have such requirements. Also, modern Intel controllers seem to be capable of handling such differences. So, we need to get rid of storing the DIMM information into a per-csrow data, storing it, instead at the right place. The first step is to move grain, mtype, dtype and edac_mode to the per-dimm struct. Reviewed-by: Aristeu Rozanski <arozansk@redhat.com> Reviewed-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Olof Johansson <olof@lixom.net> Cc: Egor Martovetsky <egor@pasemi.com> Cc: Michal Marek <mmarek@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: James Bottomley <James.Bottomley@parallels.com> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Cc: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Mike Williams <mike@mikebwilliams.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
a7d7d2e1a07e3811dc49af2962c940fd8bbb6c8f |
|
27-Jan-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: Create a dimm struct and move the labels into it The way a DIMM is currently represented implies that they're linked into a per-csrow struct. However, some drivers don't see csrows, as they're ridden behind some chip like the AMB's on FBDIMM's, for example. This forced drivers to fake^Wvirtualize a csrow struct, and to create a mess under csrow/channel original's concept. Move the DIMM labels into a per-DIMM struct, and add there the real location of the socket, in terms of csrow/channel. Latter patches will modify the location to properly represent the memory architecture. All other drivers will use a per-csrow type of location. Some of those drivers will require a latter conversion, as they also fake the csrows internally. TODO: While this patch doesn't change the existing behavior, on csrows-based memory controllers, a csrow/channel pair points to a memory rank. There's a known bug at the EDAC core that allows having different labels for the same DIMM, if it has more than one rank. A latter patch is need to merge the several ranks for a DIMM into the same dimm_info struct, in order to avoid having different labels for the same DIMM. The edac_mc_alloc() will now contain a per-dimm initialization loop that will be changed by latter patches in order to match other types of memory architectures. Reviewed-by: Aristeu Rozanski <arozansk@redhat.com> Reviewed-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
a4b4be3fd7a76021f67380b03d8bccebf067db72 |
|
27-Jan-2012 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac: rename channel_info to rank_info What it is pointed by a csrow/channel vector is a rank information, and not a channel information. On a traditional architecture, the memory controller directly access the memory ranks, via chip select rows. Different ranks at the same DIMM is selected via different chip select rows. So, typically, one csrow/channel pair means one different DIMM. On FB-DIMMs, there's a microcontroller chip at the DIMM, called Advanced Memory Buffer (AMB) that serves as the interface between the memory controller and the memory chips. The AMB selection is via the DIMM slot, and not via a csrow. It is up to the AMB to talk with the csrows of the DRAM chips. So, the FB-DIMM memory controllers see the DIMM slot, and not the DIMM rank. RAMBUS is similar. Newer memory controllers, like the ones found on Intel Sandy Bridge and Nehalem, even working with normal DDR3 DIMM's, don't use the usual channel A/channel B interleaving schema to provide 128 bits data access. Instead, they have more channels (3 or 4 channels), and they can use several interleaving schemas. Such memory controllers see the DIMMs directly on their registers, instead of the ranks, which is better for the driver, as its main usageis to point to a broken DIMM stick (the Field Repleceable Unit), and not to point to a broken DRAM chip. The drivers that support such such newer memory architecture models currently need to fake information and to abuse on EDAC structures, as the subsystem was conceived with the idea that the csrow would always be visible by the CPU. To make things a little worse, those drivers don't currently fake csrows/channels on a consistent way, as the concepts there don't apply to the memory controllers they're talking with. So, each driver author interpreted the concepts using a different logic. In order to fix it, let's rename the data structure that points into a DIMM rank to "rank_info", in order to be clearer about what's stored there. Latter patches will provide a better way to represent the memory hierarchy for the other types of memory controller. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
4e5df7ca3091a846b65f2a940a68506790a62d6a |
|
25-Nov-2011 |
Cong Wang <amwang@redhat.com> |
edac: remove the second argument of k[un]map_atomic() Signed-off-by: Cong Wang <amwang@redhat.com>
|
fe5ff8b84c8b03348a2f64ea9d884348faec2217 |
|
15-Dec-2011 |
Kay Sievers <kay.sievers@vrfy.org> |
edac: convert sysdev_class to a regular subsystem After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Cc: Doug Thompson <dougthompson@xmission.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
e2e77098764636456ba7092a8b3b3b34b2a8e8d8 |
|
27-May-2011 |
Lai Jiangshan <laijs@cn.fujitsu.com> |
edac,rcu: use synchronize_rcu() instead of call_rcu()+rcu_barrier() synchronize_rcu() does the stuff as needed. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
25985edcedea6396277003854657b5f3cb31a628 |
|
31-Mar-2011 |
Lucas De Marchi <lucas.demarchi@profusion.mobi> |
Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
|
24f9a7fe3f19f3fd310f556364d01a22911724b3 |
|
07-Oct-2010 |
Borislav Petkov <borislav.petkov@amd.com> |
amd64_edac: Rework printk macros Add a macro per printk level, shorten up error messages. Add relevant information to KERN_INFO level. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
|
bb31b3122c0dd07d2d958da17a50ad771ce79e2b |
|
02-Dec-2010 |
Borislav Petkov <borislav.petkov@amd.com> |
EDAC: Fix workqueue-related crashes 00740c58541b6087d78418cebca1fcb86dc6077d changed edac_core to un-/register a workqueue item only if a lowlevel driver supplies a polling routine. Normally, when we remove a polling low-level driver, we go and cancel all the queued work. However, the workqueue unreg happens based on the ->op_state setting, and edac_mc_del_mc() sets this to OP_OFFLINE _before_ we cancel the work item, leading to NULL ptr oops on the workqueue list. Fix it by putting the unreg stuff in proper order. Cc: <stable@kernel.org> #36.x Reported-and-tested-by: Tobias Karnat <tobias.karnat@googlemail.com> LKML-Reference: <1291201307.3029.21.camel@Tobias-Karnat> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
|
accf74fff36315a31dc7319dae2927af06e9296f |
|
16-Aug-2010 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
i7core_edac: don't use a freed mci struct This is a nasty bug. Since kobject count will be reduced by zero by edac_mc_del_mc(), and this triggers the kobj release method, the mci memory will be freed automatically. So, all we have left is ctl_name, as shown by enabling debug: [ 80.822186] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1020: edac_remove_sysfs_mci_device() remove_link [ 80.832590] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1024: edac_remove_sysfs_mci_device() remove_mci_instance [ 80.843776] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 640: edac_mci_control_release() mci instance idx=0 releasing [ 80.855163] EDAC MC: Removed device 0 for i7core_edac.c i7 core #0: DEV 0000:3f:03.0 [ 80.862936] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 2089: (null): free structs [ 80.871134] EDAC DEBUG: in drivers/edac/edac_mc.c, line at 238: edac_mc_free() [ 80.878379] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 726: edac_mc_unregister_sysfs_main_kobj() [ 80.888043] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 1232: drivers/edac/i7core_edac.c: i7core_put_devices() Also, kfree(mci) shouldn't happen at the kobj.release, as it happens when edac_remove_sysfs_mci_device() is called, but the logic is: edac_remove_sysfs_mci_device(mci); edac_printk(KERN_INFO, EDAC_MC, "Removed device %d for %s %s: DEV %s\n", mci->mc_idx, mci->mod_name, mci->ctl_name, edac_dev_name(mci)); So, as the edac_printk() needs the mci struct, this generates an OOPS. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
bbc560ae677c0f4d7ff8404a21409c99f35b297b |
|
16-Aug-2010 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac_core: Print debug messages at release calls This is important to track a nasty bug at the free logic. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
6fe1108f14f4f9581af97cab752f37dc8fa9fdec |
|
12-Aug-2010 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
edac_core: Do a better job with node removal Make sure we remove groups at the right order Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
939747bd680eb09bb98792b17a5bfd2f525afe9d |
|
10-Aug-2010 |
Mauro Carvalho Chehab <mchehab@redhat.com> |
i7core_edac: Be sure that the edac pci handler will be properly released With multi-sockets, more than one edac pci handler is enabled. Be sure to un-register all instances. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
00740c58541b6087d78418cebca1fcb86dc6077d |
|
26-Sep-2010 |
Borislav Petkov <borislav.petkov@amd.com> |
amd64_edac: Fix driver module removal f4347553b30ec66530bfe63c84530afea3803396 removed the edac polling mechanism in favor of using a notifier chain for conveying MCE information to edac. However, the module removal path didn't test whether the driver had setup the polling function workqueue at all and the rmmod process was hanging in the kernel at try_to_del_timer_sync() in the cancel_delayed_work() path, trying to cancel an uninitialized work struct. Fix that by adding a balancing check to the workqueue removal path. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
|
239642fe19adc19ba0a69e96f3b1904dfd6a3b9f |
|
12-Nov-2009 |
Borislav Petkov <borislav.petkov@amd.com> |
edac: add memory types strings for debugging Instead of using deeply-nested conditionals for dumping the DIMM type in debug mode, add a strings array of the supported DIMM types. This is useful in cases where an edac driver supports multiple DRAM types and is only defined in debug builds. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
|
458e5ff13e1bed050990d97e9aa55bcdafc951a7 |
|
24-Sep-2009 |
Jesper Dangaard Brouer <hawk@comx.dk> |
edac: core: remove completion-wait for complete with rcu_barrier Module edac_core.ko uses call_rcu() callbacks in edac_device.c, edac_mc.c and edac_pci.c. They all use a wait_for_completion() scheme, but this scheme it not 100% safe on multiple CPUs. See the _rcu_barrier() implementation which explains why extra precausion is needed. The patch adds a comment about rcu_barrier() and as a precausion calls rcu_barrier(). A maintainer needs to look at removing the wait_for_completion code. [dougthompson@xmission.com: remove the wait_for_completion code] Signed-off-by Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
fbeb4384748abb78531bbe1e80d627412a0abcfa |
|
13-Apr-2009 |
Jean Delvare <khali@linux-fr.org> |
edac: use to_delayed_work() The edac-core driver includes code which assumes that the work_struct which is included in every delayed_work is the first member of that structure. This is currently the case but might change in the future, so use to_delayed_work() instead, which doesn't make such an assumption. linux-2.6.30-rc1 has the to_delayed_work() function that will allow this patch to work Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
281efb17d88a91dc3b879bb1d49e3a66daf48797 |
|
06-Jan-2009 |
Kay Sievers <kay.sievers@vrfy.org> |
edac: struct device: replace bus_id with dev_name(), dev_set_name() This patch is part of a larger patch series which will remove the "char bus_id[20]" name string from struct device. The device name is managed in the kobject anyway, and without any size limitation, and just needlessly copied into "struct device". [akpm@linux-foundation.org: coding-style fixes] Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
17aa7e034416e3080bc57a786d09ba0a4a044561 |
|
05-May-2008 |
Stephen Rothwell <sfr@canb.auug.org.au> |
dev_name introduction fall out fix Commit 06916639e2fed9ee475efef2747a1b7429f8fe76 ("driver-core: add dev_name() to help transition away from using bus_id") added a static inline dev_name() and used it in dev_printk. Unfortunately, drivers/edac/edac_core.h defines a macro called dev_name(). Rename the latter. Diagnosis by Tony Breeds and Michael Ellerman. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
1a45027d1afd7e85254b5ef8535e93ce3d588cf4 |
|
29-Apr-2008 |
Adrian Bunk <bunk@kernel.org> |
edac: remove unneeded functions and add static accessor Collection of patches, merged into one, from Adrian that do the following: 1) This patch makes the following needlessly global functions static: - edac_pci_get_log_pe() - edac_pci_get_log_npe() - edac_pci_get_panic_on_pe() - edac_pci_unregister_sysfs_instance_kobj() - edac_pci_main_kobj_setup() 2) Remove unneeded function edac_device_find() 3) Added #if 0 around function edac_pci_find() 4) make the needlessly global edac_pci_generic_check() static 5) Removed function edac_check_mc_devices() Doug Thompson modified Adrian's patches, to bettern represent the direction of EDAC, and make them one patch. Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
ff6ac2a616c85d1215899ffda815e29b699cbd3a |
|
29-Apr-2008 |
Robert P. J. Day <rpjday@crashcourse.ca> |
edac: use the shorter LIST_HEAD for brevity Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Acked-by: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
bce19683c17485b584b62b984d6dcf5332181588 |
|
26-Jul-2007 |
Doug Thompson <dougthompson@xmission.com> |
drivers/edac: fix reset edac_mc pollmsec This fixes a deadlock that could occur on a 'setup' and 'teardown' sequence of the workq for a edac_mc control structure instance. A similiar fix was previously implemented for the edac_device code. In addition, the edac_mc device code there was missing code to allow the workq period valu to be altered via sysfs control. This patch adds that fix on the code, and allows for the changing of the period value as well. Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
bf52fa4a26567bfbf5b1d30f84cf0226e61d26cd |
|
19-Jul-2007 |
Doug Thompson <dougthompson@xmission.com> |
drivers/edac: fix workq reset deadlock Fix mutex locking deadlock on the device controller linked list. Was calling a lock then a function that could call the same lock. Moved the cancel workq function to outside the lock Added some short circuit logic in the workq code Added comments of description Code tidying Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Greg KH <greg@kroah.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
8096cfafbb7ad3cb1a286ae7e8086167f4ebb4b6 |
|
19-Jul-2007 |
Doug Thompson <dougthompson@xmission.com> |
drivers/edac: fix edac_mc sysfs completion code This patch refactors the 'releasing' of kobjects for the edac_mc type of device. The correct pattern of kobject release is followed. As internal kobjs are allocated they bump a ref count on the top level kobj. It in turn has a module ref count on the edac_core module. When internal kobjects are released, they dec the ref count on the top level kobj. When the top level kobj reaches zero, it decrements the ref count on the edac_core object, allow it to be unloaded, as all resources have all now been released. Cc: Alan Cox alan@lxorguk.ukuu.org.uk Signed-off-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
b8f6f9755248026f21282e25cac49a1af698056c |
|
19-Jul-2007 |
Doug Thompson <dougthompson@xmission.com> |
drivers/edac: fix edac_mc init apis Refactoring of sysfs code necessitated the refactoring of the edac_mc_alloc() and edac_mc_add_mc() apis, of moving the index value to the alloc() function. This patch alters the in tree drivers to utilize this new api signature. Having the index value performed later created a chicken-and-the-egg issue. Moving it to the alloc() function allows for creating the necessary sysfs entries with the proper index number Cc: Alan Cox alan@lxorguk.ukuu.org.uk Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
7391c6dcab3094610cb99bbd559beaa282582eac |
|
19-Jul-2007 |
Douglas Thompson <dougthompson@xmission.com> |
drivers/edac: mod edac_align_ptr function Refactor the edac_align_ptr() function to reduce the noise of casting the aligned pointer to the various types of data objects and modified its callers to its new signature Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
052dfb45ccb5ea354a426b52556bcfee75b9d2f5 |
|
19-Jul-2007 |
Douglas Thompson <dougthompson@xmission.com> |
drivers/edac: cleanup spaces-gotos after Lindent messup This patch fixes some remnant spaces inserted by the use of Lindent. Seems Lindent adds some spaces when it shoulded. These have been fixed. In addition, goto targets have issues, these have been fixed in this patch. Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
86aa8cb7bc47fe786df073246055d69d98e6330a |
|
19-Jul-2007 |
Douglas Thompson <dougthompson@xmission.com> |
drivers/edac: cleanup workq ifdefs The origin of this code comes from patches at sourceforge, that allow EDAC to be updated to various kernels. With kernel version 2.6.20 a new workq system was installed, thus the patches needed to be modified based on the kernel version. For submitting to the latest kernel.org those #ifdefs are removed Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
079708b9173595bf74b31b14c36e946359ae6c7e |
|
19-Jul-2007 |
Douglas Thompson <dougthompson@xmission.com> |
drivers/edac: core Lindent cleanup Run the EDAC CORE files through Lindent for cleanup Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
4de78c6877ec21142582ac19453c2d453d1ea298 |
|
19-Jul-2007 |
Dave Jiang <djiang@mvista.com> |
drivers/edac: mod PCI poll names Fixup poll values for MC and PCI. Also make mc function names unique to mc. Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Douglas Thompson <dougthompson@xmissin.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
66ee2f940ac8ab25f0c43a1e717d25dc46bfe74d |
|
19-Jul-2007 |
Dave Jiang <djiang@mvista.com> |
drivers/edac: mod assert_error check Change error check and clear variable from an atomic to an int Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Douglas Thompson <dougthompson@xmission.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
81d87cb13e367bb804bf44889ae0de7369705d6c |
|
19-Jul-2007 |
Dave Jiang <djiang@mvista.com> |
drivers/edac: mod MC to use workq instead of kthread Move the memory controller object to work queue based implementation from the kernel thread based. Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
c4192705fec85219086231a1c0fa61e8776e2c3b |
|
19-Jul-2007 |
Dave Jiang <djiang@mvista.com> |
drivers/edac: add dev_name getter function Move dev_name() macro to a more generic interface since it's not possible to determine whether a device is pci, platform, or of_device easily. Now each low level driver sets the name into the control structure, and the EDAC core references the control structure for the information. Better abstraction. Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
20bcb7a81dee21bfa3408f03f46b2891c9b5c84b |
|
19-Jul-2007 |
Douglas Thompson <dougthompson@xmission.com> |
drivers/edac: mod use edac_core.h In the refactoring of edac_mc.c into several subsystem files, the header file edac_mc.h became meaningless. A new header file edac_core.h was created. All the files that previously included "edac_mc.h" are changed to include "edac_core.h". Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
c0d121720220584bba2876b032e58a076b843fa1 |
|
19-Jul-2007 |
Dave Jiang <djiang@mvista.com> |
drivers/edac: add new nmi rescan Provides a way for NMI reported errors on x86 to notify the EDAC subsystem pending ECC errors by writing to a software state variable. Here's the reworked patch. I added an EDAC stub to the kernel so we can have variables that are in the kernel even if EDAC is a module. I also implemented the idea of using the chip driver to select error detection mode via module parameter and eliminate the kernel compile option. Please review/test. Thx! Also, I only made changes to some of the chipset drivers since I am unfamiliar with the other ones. We can add similar changes as we go. Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
63b7df9101895d1f0a259c567b3bab949a23075f |
|
19-Jul-2007 |
Matthias Kaehlcke <matthias.kaehlcke@gmail.com> |
drivers/edac: change from semaphore to mutex operation The EDAC core code uses a semaphore as mutex. use the mutex API instead of the (binary) semaphore. Matthaias wrote this, but since I had some patches ahead of it, I need to modify it to follow my patches. Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
e27e3dac651771fe3250f6305dee277bce29fc5d |
|
19-Jul-2007 |
Douglas Thompson <dougthompson@xmission.com> |
drivers/edac: add edac_device class This patch adds the new 'class' of object to be managed, named: 'edac_device'. As a peer of the 'edac_mc' class of object, it provides a non-memory centric view of an ERROR DETECTING device in hardware. It provides a sysfs interface and an abstraction for varioius EDAC type devices. Multiple 'instances' within the class are possible, with each 'instance' able to have multiple 'blocks', and each 'block' having 'attributes'. At the 'block' level there are the 'ce_count' and 'ue_count' fields which the device driver can update and/or call edac_device_handle_XX() functions. At each higher level are additional 'total' count fields, which are a summation of counts below that level. This 'edac_device' has been used to capture and present ECC errors which are found in a a L1 and L2 system on a per CORE/CPU basis. Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
7c9281d76c1c0b130f79d5fc021084e9749959d4 |
|
19-Jul-2007 |
Douglas Thompson <dougthompson@xmission.com> |
drivers/edac: split out functions to unique files This is a large patch to refactor the original EDAC module in the kernel and to break it up into better file granularity, such that each source file contains a given subsystem of the EDAC CORE. Originally, the EDAC 'core' was contained in one source file: edac_mc.c with it corresponding edac_mc.h file. Now, there are the following files: edac_module.c The main module init/exit function and other overhead edac_mc.c Code handling the edac_mc class of object edac_mc_sysfs.c Code handling for sysfs presentation edac_pci_sysfs.c Code handling for PCI sysfs presentation edac_core.h CORE .h include file for 'edac_mc' and 'edac_device' drivers edac_module.h Internal CORE .h include file This forms a foundation upon which a later patch can create the 'edac_device' class of object code in a new file 'edac_device.c'. Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
2da1c119fd999cb834b4fe0c1a5a8c36195df1cb |
|
19-Jul-2007 |
Adrian Bunk <bunk@stusta.de> |
drivers/edac: core: make functions static This patch makes needlessly global code static, in the edac core Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Doug Thompson <norsk5@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
5da0831c598f94582bce6bb0a55b8de2f9897cb1 |
|
19-Jul-2007 |
Douglas Thompson <dougthompson@xmission.com> |
drivers/edac: add edac_mc_find API This simple patch adds an important CORE API for EDAC that EDAC drivers can use to find their edac_mc control structure by passing a mem_ctl_info 'instance' value Needed for subsequent patches Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
831441862956fffa17b9801db37e6ea1650b0f69 |
|
17-Jul-2007 |
Rafael J. Wysocki <rjw@sisk.pl> |
Freezer: make kernel threads nonfreezable by default Currently, the freezer treats all tasks as freezable, except for the kernel threads that explicitly set the PF_NOFREEZE flag for themselves. This approach is problematic, since it requires every kernel thread to either set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't care for the freezing of tasks at all. It seems better to only require the kernel threads that want to or need to be frozen to use some freezer-related code and to remove any freezer-related code from the other (nonfreezable) kernel threads, which is done in this patch. The patch causes all kernel threads to be nonfreezable by default (ie. to have PF_NOFREEZE set by default) and introduces the set_freezable() function that should be called by the freezable kernel threads in order to unset PF_NOFREEZE. It also makes all of the currently freezable kernel threads call set_freezable(), so it shouldn't cause any (intentional) change of behaviour to appear. Additionally, it updates documentation to describe the freezing of tasks more accurately. [akpm@linux-foundation.org: build fixes] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
9794f33ddedd878dd92fcf8b4834391840366919 |
|
12-Feb-2007 |
eric wollesen <ericw@xmtp.net> |
[PATCH] EDAC: Add Fully-Buffered DIMM APIs to core Eric Wollesen ported the Bluesmoke Memory Controller driver for the Intel 5000X/V/P (Blackford/Greencreek) chipset to the in kernel EDAC model. This patch incorporates those required changes to the edac_mc.c and edac_mc.h core files by added new Fully Buffered DIMM interface to the EDAC Core module. Signed-off-by: eric wollesen <ericw@xmtp.net> Signed-off-by: doug thompson <norsk5@xmission.com> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
4f423ddf56e5ecb1fb2eac83b8e228e3d0aae0f6 |
|
12-Feb-2007 |
Frithiof Jensen <frithiof.jensen@ericson.com> |
[PATCH] EDAC: Add memory scrubbing controls API to core This is an attempt of providing an interface for memory scrubbing control in EDAC. This patch modifies the EDAC Core to provide the Interface for memory controller modules to implment. The following things are still outstanding: - K8 is the first implemenation, The patch provide a method of configuring the K8 hardware memory scrubber via the 'mcX' sysfs directory. There should be some fallback to a generic scrubber implemented in software if the hardware does not support scrubbing. Or .. the scrubbing sysfs entry should not be visible at all. - Only works with SDRAM, not cache, The K8 can scrub cache and l2cache also - but I think this is not so useful as the cache is busy all the time (one hopes). One would also expect that cache scrubbing requires hardware support. - Error Handling, I would like that errors are returned to the user in "terms of file system". - Presentation, I chose Bandwidth in Bytes/Second as a representation of the scrubbing rate for the following reasons: I like that the sysfs entries are sort-of textual, related to something that makes sense instead of magical values that must be looked up. "My People" wants "% main memory scrubbed per hour" others prefer "% memory bandwidth used" as representation, "bandwith used" makes it easy to calculate both versions in one-liner scripts. If one later wants to scrub cache, the scaling becomes wierd for K8 changing from "blocks of 64 byte memory" to "blocks of 64 cache lines" to "blocks of 64 bit". Using "bandwidth used" makes sense in all three cases, (I.M.O. anyway ;-). - Discovery, There is no way to discover the possible settings and what they do without reading the code and the documentation. *I* do not know how to make that work in a practical way. - Bugs(??), other tools can set invalid values in the memory scrub control register, those will read back as '-1', requiring the user to reset the scrub rate. This is how *I* think it should be. - Afflicting other areas of code, I made changes to edac_mc.c and edac_mc.h which will show up globally - this is not nice, it would be better that the memory scrubbing fuctionality and interface could be entirely contained within the memory controller it applies to. Frithiof Jensen edac_mc.c and its .h file is a CORE helper module for EDAC driver modules. This provides the abstraction for device specific drivers. It is fine to modify this CORE to provide help for new features of the the drivers doug thompson Signed-off-by: Frithiof Jensen <frithiof.jensen@ericson.com> Signed-off-by: doug thompson <norsk5@xmission.com> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
7dfb71030f7636a0d65200158113c37764552f93 |
|
07-Dec-2006 |
Nigel Cunningham <ncunningham@linuxmail.org> |
[PATCH] Add include/linux/freezer.h and move definitions from sched.h Move process freezing functions from include/linux/sched.h to freezer.h, so that modifications to the freezer or the kernel configuration don't require recompiling just about everything. [akpm@osdl.org: fix ueagle driver] Signed-off-by: Nigel Cunningham <nigel@suspend2.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
77d6e1397a004c9376fed855e4164ca2b1dba2ed |
|
03-Nov-2006 |
Akinobu Mita <akinobu.mita@gmail.com> |
[PATCH] edac_mc: fix error handling Call sysdev_class_unregister() on failure in edac_sysfs_memctrl_setup() and decrease identation level for clear logic. Acked-by: Doug Thompson <norsk5@xmission.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
49c0dab7e6000888b616bedcbbc8cd4710331610 |
|
10-Jul-2006 |
Doug Thompson <norsk5@xmission.com> |
[PATCH] Fix and enable EDAC sysfs operation When EDAC was first introduced into the kernel it had a sysfs interface, but due to some problems it was disabled in 2.6.16 and remained disabled in 2.6.17. With feedback, several of the control and attribute files of that interface had some good constructive feedback. PCI Blacklist/Whitelist was a major set which has design issues and it has been removed in this patch. Instead of storing PCI broken parity status in EDAC, it has been moved to the pci_dev structure itself by a previous PCI patch. A future patch will enable that feature in EDAC by utilizing the pci_dev info. The sysfs is now enabled in this patch, with a minimal set of control and attribute files for examining EDAC state and for enabling/disabling the memory and PCI operations. The Documentation for EDAC has also been updated to reflect the new state of EDAC operation. Signed-off-by:Doug Thompson <norsk5@xmisson.com> Cc: Greg KH <greg@kroah.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
2d7bbb91c8df26c60d223205a087507430024177 |
|
30-Jun-2006 |
Doug Thompson <norsk5@xmission.com> |
[PATCH] EDAC: mc numbers refactor 1-of-2 Remove add_mc_to_global_list(). In next patch, this function will be reimplemented with different semantics. 1 Reimplement add_mc_to_global_list() with semantics that allow the caller to determine the ID number for a mem_ctl_info structure. Then modify edac_mc_add_mc() so that the caller specifies the ID number for the new mem_ctl_info structure. Platform-specific code should be able to assign the ID numbers in a platform-specific manner. For instance, on Opteron it makes sense to have the ID of the mem_ctl_info structure match the ID of the node that the memory controller belongs to. 2 Modify callers of edac_mc_add_mc() so they use the new semantics. Signed-off-by: Doug Thompson <norsk5@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
37f04581abac20444e5b7106c1e1f28bec5b989c |
|
30-Jun-2006 |
Doug Thompson <norsk5@xmission.com> |
[PATCH] EDAC: PCI device to DEVICE cleanup Change MC drivers from using CVS revision strings for their version number, Now each driver has its own local string. Remove some PCI dependencies from the core EDAC module. Made the code 'struct device' centric instead of 'struct pci_dev' Most of the code changes here are from a patch by Dave Jiang. It may be best to eventually move the PCI-specific code into a separate source file. Signed-off-by: Doug Thompson <norsk5@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
6ab3d5624e172c553004ecc862bfeac16d9d68b7 |
|
30-Jun-2006 |
Jörn Engel <joern@wohnheim.fh-wedel.de> |
Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
7f927fcc2fd1575d01efb4b76665975007945690 |
|
28-Mar-2006 |
Alexey Dobriyan <adobriyan@gmail.com> |
[PATCH] Typo fixes Fix a lot of typos. Eyeballed by jmc@ in OpenBSD. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
9110540f7f2bbcc3577d2580a696fbb7af68c892 |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: use EXPORT_SYMBOL_GPL Change all instances of EXPORT_SYMBOL() in the core EDAC module to EXPORT_SYMBOL_GPL(). Signed-off-by: David S. Peterson <dsp@llnl.gov> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
e7ecd8910293564d357dbaf18eb179e06fa35fd0 |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: formatting cleanup Cosmetic indentation/formatting cleanup for EDAC code. Make sure we are using tabs rather than spaces to indent, etc. Signed-off-by: David S. Peterson <dsp@llnl.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
54933dddc3e8ccd9db48966d8ada11951cb8a558 |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: reorder EXPORT_SYMBOL macros Fix EDAC code so EXPORT_SYMBOL comes after the function that is being exported. This is to maintain consistency with the rest of the kernel. Signed-off-by: David S. Peterson <dsp@llnl.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
18dbc337af5d6efd30cb9291e74722c8ad134fd3 |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: protect memory controller list - Fix code so we always hold mem_ctls_mutex while we are stepping through the list of mem_ctl_info structures. Otherwise bad things may happen if one task is stepping through the list while another task is modifying it. We may eventually want to use reference counting to manage the mem_ctl_info structures. In the meantime we may as well fix this bug. - Don't disable interrupts while we are walking the list of mem_ctl_info structures in check_mc_devices(). This is unnecessary. Signed-off-by: David S. Peterson <dsp@llnl.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
472678ebd30d87cbe8d97562dcc0e46d1076040f |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: kobject/sysfs fixes - After we unregister a kobject, wait for our kobject release method to call complete(). This causes us to wait until the kobject reference count reaches 0. Otherwise, a task accessing the EDAC sysfs interface can hold the reference count above 0 until after the EDAC module has been unloaded. When the reference count finally drops to 0, this will result in an attempt to call our release method inside the EDAC module after the module has already been unloaded. This isn't the best fix, since a process can get stuck sleeping forever uninterruptibly if the user does the following: rmmod my_module < /sys/my_sysfs/file I'll go back and implement a better fix later. However this should be ok for now. - Call edac_remove_sysfs_mci_device() from edac_mc_del_mc() rather than from edac_mc_free(). Since edac_mc_add_mc() calls edac_create_sysfs_mci_device(), edac_mc_del_mc() should call edac_remove_sysfs_mci_device(). Signed-off-by: David S. Peterson <dsp@llnl.gov> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
6e5a8748507dea83386d1d76c58aeaed1ff5a1ec |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: kobject_init/kobject_put fixes - Remove calls to kobject_init(). These are unnecessary because kobject_register() calls kobject_init(). - Remove extra calls to kobject_put(). When we call kobject_unregister(), this releases our reference to the kobject. The extra calls to kobject_put() may cause the reference count to drop to 0 while a kobject is still in use. Signed-off-by: David S. Peterson <dsp@llnl.gov> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
028a7b6d3d9fa2cc41d76d45575345cca8d00a4c |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: edac_mc_add_mc fix [2/2] This is part 2 of a 2-part patch set. Fix edac_mc_add_mc() so it cleans up properly if call to edac_create_sysfs_mci_device() fails. Signed-off-by: David S. Peterson <dsp@llnl.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
a1d03fcc1399b1e23922bcc3af1772b128aa6e93 |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: edac_mc_add_mc fix [1/2] This is part 1 of a 2-part patch set. The code changes are split into two parts to make the patches more readable. Move complete_mc_list_del() and del_mc_from_global_list() so we can call del_mc_from_global_list() from edac_mc_add_mc() without forward declarations. Perhaps using forward declarations would be better? I'm doing things this way because the rest of the code is missing them. Signed-off-by: David S. Peterson <dsp@llnl.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
749ede57443b2a7ede2db105145f21047efcea6a |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: cleanup code for clearing initial errors Fix xxx_probe1() functions so they call xxx_get_error_info() functions to clear initial errors. This is simpler and cleaner than duplicating the low-level code for accessing PCI config space. Signed-off-by: David S. Peterson <dsp@llnl.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
537fba28928c01b7db1580627450691a4bb0b9b3 |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: printk cleanup This implements the following idea: On Monday 30 January 2006 19:22, Eric W. Biederman wrote: > One piece missing from this conversation is the issue that we need errors > in a uniform format. That is why edac_mc has helper functions. > > However there will always be errors that don't fit any particular model. > Could we add a edac_printk(dev, ); That is similar to dev_printk but > prints out an EDAC header and the device on which the error was found? > Letting the rest of the string be user specified. > > For actual control that interface may be to blunt, but at least for people > looking in the logs it allows all of the errors to be detected and > harvested. Signed-off-by: David S. Peterson <dsp@llnl.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
f2fe42abbf0d99a8c4b96f1cc55db10ac35d2fb9 |
|
26-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: switch to kthread_ API This patch was originally posted by Christoph Hellwig (see http://lkml.org/lkml/2006/2/14/331): "Christoph Hellwig" <hch@lst.de> wrote: > Use the kthread_ API instead of opencoding lots of hairy code for kernel > thread creation and teardown, including tasklist_lock abuse. > Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Peterson <dsp@llnl.gov> Cc: <dave_peterson@pobox.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
ceb2ca9cb0bfd885127fa9a2c27127b3fe1c8f28 |
|
14-Mar-2006 |
Dave Peterson <dsp@llnl.gov> |
[PATCH] EDAC: disable sysfs interface - Disable the EDAC sysfs code. The sysfs interface that EDAC presents to user space needs more thought, and is likely to change substantially. Therefore disable it for now so users don't start depending on it in its current form. - Disable the default behavior of calling panic() when an uncorrectible error is detected (since for now, there is no sysfs interface that allows the user to configure this behavior). Signed-off-by: David S. Peterson <dsp@llnl.gov> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
4136cabff33d6d73b8daf2f2612670cc0296f844 |
|
11-Mar-2006 |
Arjan van de Ven <arjan@linux.intel.com> |
[PATCH] edac: disable a few sysfs files to avoid them becoming an ABI Disable (via ugly #if 0's) the 3 sysfs files that I think by now we all agree are very much wrong. These files shouldn't become part of the ABI by the 2.6.16 release, so I rather have this minimal patch merged to disable them for now, the real fix can then come during the 2.6.17 devel window. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
353368dffb56b066cbe00264581a56caf0241b29 |
|
03-Feb-2006 |
Eric W. Biederman <ebiederm@xmission.com> |
[PATCH] edac_mc: Remove include of version.h By including version.h edac_mc was rebuilding on every incremental build. Which defeats the point of incremental builds. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
da9bb1d27b21cb24cbb6a2efb5d3c464d357a01e |
|
19-Jan-2006 |
Alan Cox <alan@lxorguk.ukuu.org.uk> |
[PATCH] EDAC: core EDAC support code This is a subset of the bluesmoke project core code, stripped of the NMI work which isn't ready to merge and some of the "interesting" proc functionality that needs reworking or just has no place in kernel. It requires no core kernel changes except the added scrub functions already posted. The goal is to merge further functionality only after the core code is accepted and proven in the base kernel, and only at the point the upstream extras are really ready to merge. From: doug thompson <norsk5@xmission.com> This converts EDAC to sysfs and is the final chunk neccessary before EDAC has a stable user space API and can be considered for submission into the base kernel. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: doug thompson <norsk5@xmission.com> Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|