f77cecf08cf9fba5e8f62e8ac1731c1916a97618 |
|
30-Mar-2017 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Always provide a default LOD of 0 for TXS and TXL We already provide a default LOD for textureQueryLevels and texture() on non-fragment stages. However, there are more cases where one is needed such as textureSize(gsampler2DMS*) in SPIR-V. Instead of trying to list out all of the cases one at a time, just provide the default for all TXS and TXL operations. This fixes a shader validation error in the new Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 3503b2714b98684a2ceba5f4fd9a5bfbfbcaad38)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e1e27b0917249448a481b6681aac375505f728c3 |
|
16-Feb-2017 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles When generating the MOV INDIRECT instruction, the source type is ignored and it is set to destination's type. However, this is going to change in a later patch, so we need to explicitly set the proper source type. brw_vec8_grf() creates an float type's fs_reg by default, when the ICP handle is actually unsigned. This patch fixes these cases before applying the aforementioned patch. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> (cherry picked from commit d8122128bc6bd291ff0abcb7f2e52d9cdc631527)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
59e6c0d8aee718cf58198d5a5b2adce3e01391a6 |
|
13-Feb-2017 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/fs: fix indirect load DF uniforms on BSW/BXT The lowered BSW/BXT indirect move instructions had incorrect source types, which luckily wasn't causing incorrect assembly to be generated due to the bug fixed in the next patch, but would have confused the remaining back-end IR infrastructure due to the mismatch between the IR source types and the emitted machine code. v2: - Improve commit log (Curro) - Fix read_size (Curro) - Fix DF uniform array detection in assign_constant_locations() when it is acceded with 32-bit MOV_INDIRECTs in BSW/BXT. v3: - Move changes in assign_constant_locations() to other patch. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> (cherry picked from commit 56266df7ed9dbdf63acfd58944442893b4cd0c0b)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a594bd19dc2344260904c51ea7b22bdc71428d64 |
|
15-Feb-2017 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Fix the inline nir_op_pack_double optimization We can only do the optimization if the source *is* SSA. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit a4393bd97fe62e8299273bae769201c5c9c816ea) Squashed with commit: i965/fs: Remove the inline pack_double_2x32 optimization It's broken in a number of ways. In particular, a bunch of the conditions are backwards so it doesn't actually detect what it's supposed to detect. Since it's been broken, it hasn't actually been helping anything so just deleting it isn't a regression. This (and removing another optimization) were done on master in commit b07381161777ba5d5f4a1d713f7655bcaede4139. Cc: "Kenneth Grunke" <kenneth@whitecape.org> Cc: "Mark Janes" <mark.a.janes@intel.com> [Emil Velikov: patch is a backport of the below "cherry pick"] Fixes: a4393bd97fe ("i965/fs: Fix the inline nir_op_pack_double optimization") (cherry picked from commit b07381161777ba5d5f4a1d713f7655bcaede4139)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e6ae19944d977dc91bc45adff679337182c20683 |
|
24-Nov-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rework gl_TessLevel*[] handling to use NIR compact arrays. Treating everything as scalar arrays allows us to drop a bunch of special case input/output munging all throughout the backend. Instead, we just need to remap the TessLevel components to the appropriate patch URB header locations in remap_patch_urb_offsets(). We also switch to treating the TES input versions of these as ordinary shader inputs rather than system values, as remap_patch_urb_offsets() just makes everything work out without special handling. This regresses one Piglit test: arb_tessellation_shader-large-uniforms/GL_TESS_CONTROL_SHADER-array-at-limit The compiler starts promoting the constant arrays assigned to gl_TessLevel* to uniform arrays. Since the shader also has a uniform array that uses the maximum number of uniform components, this puts it over the uniform component limit enforced by the linker. This is arguably a bug in the constant array promotion code (it should avoid pushing us over limits), but is unlikely to penalize any real application. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c5ae6e78fc3bed83c6e18be6dbc8eb86a8db0898 |
|
23-Dec-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/fs: fix exec_size when emitting DIM instruction Otherwise, DIM instructions will be emitted with the default exec size which could be 16 in some cases, that is not legal. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b56fa830c6095f8226456b2aeb62f2dfad804be5 |
|
09-Dec-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Fetch one cacheline of pull constants at a time. Asking the DC for less than one cacheline (4 owords) of data for uniform pull constants is suboptimal because the DC cannot request less than that from L3, resulting in wasted bandwidth and unnecessary message dispatch overhead, and exacerbating the IVB L3 serialization bug. The following table summarizes the overall framerate improvement (with statistical significance of 5% and sample size ~10) from the whole series up to this patch for several benchmarks and hardware generations: | SKL | BDW | HSW SynMark2 OglShMapPcf | 24.63% ±0.45% | 4.01% ±0.70% | 10.31% ±0.38% GfxBench4 gl_manhattan31 | 5.93% ±0.35% | 3.92% ±0.31% | 6.62% ±0.22% GfxBench4 gl_4 | 2.52% ±0.44% | 1.23% ±0.10% | N/A Unigine Valley | 0.83% ±0.17% | 0.23% ±0.05% | 0.74% ±0.45% Note that there are two versions of the Manhattan demo shipped with GfxBench4, one of them is the original gl_manhattan demo which doesn't use UBOs, so this patch will have no effect on it, and another one is the gl_manhattan31 demo based on GL 4.3/GLES 3.1, which this patch benefits as shown above. I haven't observed any statistically significant regressions in the benchmarks I have at hand. Note that the comparatively huge improvement on SKL in the OglShMapPcf test case is due to the combined effect of this patch and the register pressure benefit on SKL+ of "i965/fs: Switch to the constant cache for uniform pull constants.", part of the same series. Going up to 8 oword blocks would improve performance of pull constants even more, but at the cost of some additional bandwidth and register pressure, so it would have to be done on-demand based on the number of constants actually used by the shader. v2: Fix for Gen4 and 5. v3: Non-trivial rebase. Rework to allow the visitor specifiy arbitrary pull constant block sizes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
9b22a0d295316b7547667ebbfe1e1b6182439186 |
|
09-Dec-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Expose arbitrary pull constant load sizes to the IR. Change the FS generator to ask the dataport for enough owords worth of constants to fill the execution size of the instruction -- Which means that the visitor now needs to set the execution size correctly for uniform pull constant load instructions, which we were kind of neglecting until now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
fd249c803e3ae2acb83f5e3b7152728e73228b7b |
|
12-Dec-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
treewide: s/comparitor/comparator/ git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g' Just happened to notice this in a patch that was sent and included one of the tokens in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
4f2d1d6ea713df8f8d816b48b9e99c7117cf36d7 |
|
28-Nov-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
i965: support constant gather offsets larger than 4 bits Offsets that don't fit into 4 bits need to force gather_po to be selected. Adjust the logic so that this happens. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
faf20df143a63e58aa729446f21c38ae39a438f2 |
|
29-Nov-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Refactor handling of constant tg4 offsets Previously, we had an OFFSET_VALUE source for logical texture instructions that was intended to mean exactly what it says, "offset". In reality, we only fully used it for tg4 offsets. We used offset_value.file == IMM to mean, "you have a constant offset, go look in instr->offset" and didn't actually use the contents of the register at all in that case except for in nir_emit_texture where we used it as a temporary before we copy it into instr->offset. This commit renames OFFSET_VALUE to TG4_OFFSET and restricts its usage to indirect tg4 offsets only. The nir_emit_texture code is refactored so that we explicitly build a header_bits value which is placed in instr->offset and the constant offset values (both for tg4 and regular texture operations) are used to construct header_bits and don't go through the offset source at all. Finally, we stop passing offset_value in to lower_sampler_logical_send_gen5 because we can't do indirect offsets until gen7 anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2e311e421122e0232987fdca3645c6bd39fe2470 |
|
16-Nov-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Implement load_layer_id for fragment shaders Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b63f7671a3eafa4ab293a13f45f58837bd840a46 |
|
04-Oct-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Handle compact outputs. We need to calculate the number of vec4 slots correctly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c4be6e0b8d91746eccf334b9e20861af4036d06a |
|
15-Nov-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix GS push inputs with enhanced layouts. We weren't taking first_component into account when handling GS push inputs. We hardly ever push GS inputs, so this was not caught by existing tests. When I started using component qualifiers for the gl_ClipDistance arrays, glsl-1.50-transform-feedback-type-and-size started catching this. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b |
|
13-Oct-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
59864e8e02057cc6fa0448a8af067a3cf53389da |
|
13-Oct-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Don't use nir_assign_var_locations for VS/TES/GS outputs. Fixes spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3. v2: Remove nir_outputs field from fs_visitor (caught by Tim and Iago). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3728ee000aecb19793dec56d45aff9d6cfce3e5b |
|
13-Oct-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Drop unnecessary switch statement in nir_setup_outputs() TCS and FS are skipped above. CS has no output variables. All remaining cases take the same path. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e51e055fcdf8107aafaba358fa65b00f963e1728 |
|
09-Sep-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Introduce downcast helpers for prog_data structures. Similar to brw_context(...), intel_texture_object(...), and so on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
40dd45d0c6aa4a9d727c09225967e9c3b1f45854 |
|
30-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Enable ARB_shader_atomic_counter_ops Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3d2011cb33317b0fe9b8fe989916efc1841c6ce0 |
|
30-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Refactor emission of atomic counter operations This will make it easier to add more operations. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
950af5ed40895ba7eb664a64e869cf4ae1104fc7 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Misc simplification. Get rid of some leftover redundant arithmetic introduced during the conversion to byte offsets and sizes that can be simplified easily. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
80e1d670b4b4c080ce2092a3b52d2415bc4c6a42 |
|
01-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Get rid of fs_inst::set_smear(). component() was generally a better alternative because of several issues set_smear() had: - It wouldn't take the original stride and offset of the register into account, which means that set_smear() on the result of e.g. another set_smear() call or an offset() call would give a bogus region as result. - It was an inherently destructive operation. See the 'nir_intrinsic_shader_clock' hunk below for how this could lead to subtle bugs in cases where set_smear() was called multiple times on the same register like 'r.set_smear(0), r.set_smear(1)' with the expectation that each call would return a separate value instead of a reference to the same subsequently mutated object. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2d7d4a791083ff63f37ac1e40bfe8b448e7f8045 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Simplify a bunch of fs_inst::size_written calculations by using component_size(). Using component_size() is easier and generally more correct because it takes into account the register type and stride for you. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
62aaef6c83e4eb354bd7f15803db01e90d22fc34 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Simplify and fix buggy stride/offset calculations using subscript(). These were bashing the 'offset' and 'stride' values of several registers without taking the previous value into account, which probably didn't matter in practice for optimize_frontfacing_ternary() because the 'tmp' register already had a known region, but it would have given the wrong region as result in the other cases in lower_integer_multiplication(). subscript(..., i) is a more straightforward way to take the i-th field of a given type from each channel of a register which should give the right answer as result regardless of the original 'offset' and 'stride' parameters of the register region. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c057278c065747c1f53579504bf109cafb7cb390 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Stop using fs_reg::in_range() in favor of regions_overlap(). Its only use left in the FS back-end should be using regions_overlap() instead to avoid getting a false negative result in cases where source and destination overlap but the former starts before the latter in the VGRF file. v2: Put back lost components factor (Iago). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
69570bbad876bb9da609c3b651aacda28cecc542 |
|
07-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Replace fs_inst::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
be095e11e41158f91bcb3f6fcbc2e2a91a5d9124 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Replace fs_reg::subreg_offset with fs_reg::offset expressed in bytes. The fs_reg::subreg_offset and ::offset fields are now redundant, the sub-GRF offset can just be added to the single ::offset field expressed in byte units. The current subreg_offset value can be recovered by applying the following rule: Replace each rvalue reference of subreg_offset like 'x = r.subreg_offset' with 'x = r.offset % reg_unit', and each lvalue reference like 'r.subreg_offset = x' with 'r.offset = ROUND_DOWN_TO(r.offset, reg_unit) + x'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
86944e063ad40cac0860bfd85a3cc4e9a9805aa3 |
|
01-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Replace fs_reg::reg_offset with fs_reg::offset expressed in bytes. The fs_reg::offset field in byte units introduced in this patch is a more straightforward alternative to the current register offset representation split between fs_reg::reg_offset and ::subreg_offset. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
979d0aca6277975986f5f278cad0f37616c9d91f |
|
26-Aug-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
intel: Rename brw_get_device_name/info to gen_get_device_name/info Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
527f37199929932300acc1688d8160e1f3b1d753 |
|
23-Aug-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
intel: s/brw_device_info/gen_device_info/ Generated by: sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
4135fc22ff735a40c36fcf051c1735fe23d154f2 |
|
19-Aug-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Hook up coherent framebuffer reads to the NIR front-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f24e393bd5caee85994b00b93f141e6c4b99e273 |
|
22-Jul-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Translate nir_intrinsic_load_output on a fragment output. This gets the non-coherent framebuffer fetch path hooked up to the NIR front-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b00a236d6a6212323f77248ba923c65eeb02592b |
|
22-Jul-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Allocate fragment output temporaries on demand. This gets rid of the duplication of logic between nir_setup_outputs() and get_frag_output() by allocating fragment output temporaries lazily whenever get_frag_output() is called. This makes nir_setup_outputs() a no-op for the fragment shader stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7dac8820730777756c00d7024330517848dc3b9f |
|
22-Jul-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Rework representation of fragment output locations in NIR. The problem with the current approach is that driver output locations are represented as a linear offset within the nir_outputs array, which makes it rather difficult for the back-end to figure out what color output and index some nir_intrinsic_load/store_output was meant for, because the offset of a given output within the nir_output array is dependent on the type and size of all previously allocated outputs. Instead this defines the driver location of an output to be the pair formed by its GLSL-assigned location and index (I've borrowed the bitfield macros from brw_defines.h in order to represent the pair of integers as a single scalar value that can be assigned to nir_variable_data::driver_location). nir_assign_var_locations is no longer useful for fragment outputs. Because fragment outputs are now allocated independently rather than within the nir_outputs array, the get_frag_output() helper becomes necessary in order to obtain the right temporary register for a given location-index pair. The type_size helper passed to nir_lower_io is now type_size_dvec4 rather than type_size_vec4_times_4 so that output array offsets are provided in terms of whole array elements rather than in terms of scalar components (dvec4 is the largest vector type supported by the GLSL so this will cause all individual fragment outputs to have a size of one regardless of the type). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f3cb2c34f29d35088879a6b8101c3ac648e0febf |
|
22-Jul-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Special-case nir_intrinsic_store_output for the fragment shader. I'm about to change how fragment shader output locations are represented, so the generic nir_intrinsic_store_output implementation that assumes that outputs are just contiguous elements in the big nir_outputs array won't work anymore. This somewhat simplified implementation of nir_intrinsic_store_output for fragment shaders should be functionally equivalent to the current fall-back one. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
af0cc743e607293146861518bb6ef96f411aeca9 |
|
22-Jul-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Implement non-coherent framebuffer fetch using the sampler unit. v2: Memoize sample ID, misc codestyle changes. (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
4a87e4ade778e56d43333c65a58752b15a00ce69 |
|
21-Jul-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Get rid of fs_visitor::do_dual_src. This boolean flag was being used for two different things: - To set the brw_wm_prog_data::dual_src_blend flag. Instead we can just set it based on whether the dual_src_output register is valid, which will be the case if the shader writes the secondary blending color. - To decide whether to call emit_single_fb_write() once, or in a loop that would iterate only once, which seems pretty useless. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
d14dd727f4aded5bd34a78dc2c81374a78114440 |
|
17-Aug-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix barrier count shift in scalar TCS backend. The "Barrier Count" field goes in 14:9 of m0.2. The vec4 backend correctly shifts by 9, but the scalar backend only shifted by 8. It's not like this changed - I think I just made a typo when writing the original scalar TCS backend code. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
159f0377556c45630cdc0721b193f34217a329b0 |
|
17-Aug-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix execution size of scalar TCS barrier setup code. Previously, the scalar TCS backend was generating: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(8) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all 1Q }; shl(8) g17.2<1>UD g17.2<8,8,1>UD 0x0000000bUD { align1 WE_all 1Q }; or(8) g17.2<1>UD g17.2<8,8,1>UD 0x00008200UD { align1 WE_all 1Q }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; This is rubbish - g17.2<8,8,1>UD spans two registers, and is an illegal region. Not to mention it clobbers 8 channels of data when we only wanted to touch m0.2. Instead, we want: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(1) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all }; shl(1) g17.2<1>UD g17.2<0,1,0>UD 0x0000000bUD { align1 WE_all }; or(1) g17.2<1>UD g17.2<0,1,0>UD 0x00008200UD { align1 WE_all }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; Using component() accomplishes this. Fixes GL44-CTS.tessellation_shader.tessellation_shader_tc_barriers. barrier_guarded_read_write_calls on Skylake. Probably fixes other barrier issues on Gen8+. v2: Use a group(1, 0) builder so inst->exec_size is set correctly (thanks to Francisco Jerez for catching that it was incorrect). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1] Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0c754d1c4203d87dbb9d2dd882ef42686e6d01ec |
|
12-Aug-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Lower TEX to TXL during NIR translation. This simplifies the code slightly and will allow the SIMD lowering pass to find out easily what the actual texturing opcode is in order to determine the maximum execution size of texturing instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
027cbf00f248bda325521db8f56a3718898da46b |
|
02-Aug-2016 |
Mathias Fröhlich <mathias.froehlich@web.de> |
util: Move _mesa_fsl/util_last_bit into util/bitscan.h As requested with the initial creation of util/bitscan.h now move other bitscan related functions into util. v2: Split into two patches. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
875341c69b99dea7942a68c9060aa31a459e93fc |
|
02-Aug-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rework the unlit centroid workaround. Previously, for every input, we moved the dispatch mask to the flag register, then emitted two predicated PLN instructions, one with centroid barycentric coordinates (for normal pixels), and one with pixel barycentric coordinates (for unlit helper pixels). Instead, we can simply emit a set of predicated MOVs at the top of the program which copy the pixel barycentric coordinates over the centroid ones for unlit helper pixel channels. Then, we can just use normal PLNs. On Sandybridge: total instructions in shared programs: 7538470 -> 7534500 (-0.05%) instructions in affected programs: 101268 -> 97298 (-3.92%) helped: 705 HURT: 9 (all of which are SIMD16 programs) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
12a912586f11ccbc4612532d5ceaf1bdd0cdb45a |
|
29-Jul-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use a separate register for every access to an SSA undef. Previously, we allocated a new VGRF for every undefined definition. Instead, this patch makes us allocate a new VGRF for every use of an undefined definition. This makes sure that undefined values are fully independent of one another, and have live ranges limited to their single use. This allows register coalescing to combine the source and destination of MOVs from undefined sources, eliminating the MOV altogether. On Broadwell: total instructions in shared programs: 11641187 -> 11640214 (-0.01%) instructions in affected programs: 70199 -> 69226 (-1.39%) helped: 213 HURT: 1 v2: Add a comment (based on Iago's suggested one). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a2b3c146d2017a626be66dcf43753d545e902c52 |
|
22-Jul-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: fix varying output setup Since 7f53fead5c we treat every location as using all four components so we only need special handling for doubles when they cross multiple locations. This fixes a crash in GL45-CTS.enhanced_layouts.varying_locations where the outputs array would overflow when a dmat2 was stored at the max varying location i.e 30. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
be1c53d2cf2b12655ff69caac49cca75a55e63e0 |
|
22-Jul-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix "operation operation" in comment. From the redundant redundant department. Reported-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
76e161056a424e5b9c35b02a9f4e520c8c44cf2b |
|
18-Jul-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix shared atomic intrinsics to pay attention to base. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ad5dd39984467b29d20e03ec8bd26f6f1d2e97ad |
|
14-Jun-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: add component packing support for load_output intrinsics Here we use the component qualifier (which is the first component) as an offset when loading output varyings. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7f53fead5cf9a85c74a94d359dd5fccfbb87856c |
|
23-May-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: enable component packing for vs and fs Rather than trying to work out the total number of components used at a location we simply treat all outputs as vec4s. This removes the need for complex code looping over varyings to match packed locations and the need for storing the total number of components used at each location. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3dba8516d6468866f2534f517358a6243eb0995e |
|
20-Jul-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move VS load_input handling to nir_emit_vs_intrinsic(). TCS/TES/GS and now FS all handle these in stage-specific functions. CS don't have inputs, so VS was the only one left using this code. Move it to the VS-specific function for clarity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1eef0b73aa323d94d5a080cd1efa81ccacdbd0d2 |
|
12-Jul-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rewrite FS input handling to use the new NIR intrinsics. This eliminates the need to walk the list of input variables, recurse into their types (via logic largely redundant with nir_lower_io), and interpolate all possible inputs up front. The backend no longer has to care about variables at all, which eliminates complications from trying to pack multiple variables into the same location. Instead, each intrinsic specifies exactly what's needed. This should unblock Timothy's work on GL_ARB_enhanced_layouts. Each load_interpolated_input intrinsic corresponds to PLN instructions, while load_barycentric_at_* intrinsics correspond to pixel interpolator messages. The pixel/centroid/sample barycentric intrinsics simply refer to payload fields (delta_xy[]), and don't actually generate any code. Because we use a single intrinsic for both centroid-qualified variables and interpolateAtCentroid(), they become indistinguishable. We stop sending pixel interpolator messages for those, and instead use the payload provided data, which should be considerably faster. On Broadwell: total instructions in shared programs: 9067751 -> 9067570 (-0.00%) instructions in affected programs: 145902 -> 145721 (-0.12%) helped: 422 HURT: 209 total spills in shared programs: 2849 -> 2899 (1.76%) spills in affected programs: 760 -> 810 (6.58%) helped: 0 HURT: 10 total fills in shared programs: 3910 -> 3950 (1.02%) fills in affected programs: 617 -> 657 (6.48%) helped: 0 HURT: 10 LOST: 3 GAINED: 3 The differences mostly appear to be slight changes in MOVs. v2: Use nir_shader_compiler_options::use_interpolated_input_intrinsics flag rather than passing it directly to nir_lower_io. Use the unreachable() macro rather than assert in one place. (Review feedback from Chris Forbes.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
96dfed49e47eac7afc100e5b8d3b316dd6652fb6 |
|
19-Jul-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Stop muging cube array lengths by 6 From the Sky Lake PRM: "For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port Surfaces, the range of this field is [0,340], indicating the number of cube array elements (equal to the number of underlying 2D array elements divided by 6). For other surfaces, this field must be zero." In other words, the depth field for cube maps is in number of cubes not number of 2-D slices so we need to divide by 6. ISL will do this correctly for us assuming that we provide it with the correct array bounds which it expects to be in 2-D slices. It appears as if we've been doing this wrong ever since we first added cube map arrays for Sandy Bridge and the change to ISL made things slightly worse. While we're at it, we now need to remoe the shader hacks we've always done since they were only needed because we were setting the depth field six times too large. v2: Fix the vec4 backend as well (not sure how I missed this). Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3e7cebc8da5c9f16fa1b9a25ea72b8d31c86a440 |
|
22-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Use LZD to implement nir_op_find_lsb on Gen < 7 v2: Rebase on changes to previous two patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c2019c6c261d5c46a4e5d3edc88836bcedf75f30 |
|
22-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Use LZD to implement nir_op_ifind_msb on Gen < 7 v2: Retype LZD source as UD to avoid potential problems with 0x80000000. Suggested by Matt. Also update comment about problem values with LZD(abs(x)). Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
de20086eed47e6bfe7c25835d72383114f99c7a9 |
|
22-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Use LZD to implement nir_op_ufind_msb This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0f2516d88f6607b2816445c2dc18607cdaf1beff |
|
15-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/tes/scalar: fix 64-bit indirect input loads We totally ignored this before because there were no piglit tests for indirect loads in tessellation stages with doubles. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1737e75bfb85eb22a30e4f1c69a825b3abd946f6 |
|
15-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/tcs/scalar: only update imm_offset for second message in 64bit input loads Our indirect URB read messages take both a direct and an indirect offset so when we emit the second message for a 64-bit input load we can just always incremement the immediate offset, even for the indirect case. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
18f67c8a69fcde5d3f585effeef670d0861b0730 |
|
14-Jul-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move pulls_bary setting to emit_pixel_interpolator_send(). pulls_bary should be set when the shader uses a pixel interpolator message. So, setting it from the function that emits pixel interpolator messages makes a lot of sense. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7ef7738a61ded5632105b8de6f8141307592e20a |
|
15-Jul-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Write gl_FragCoord directly to the destination. This patch makes emit_general_interpolation take a destination register as an argument, and write directly to that. This is simpler than the old approach of ralloc'ing a register, writing to that temporary, and then making the caller emit per-component MOVs to copy it to the actual destination. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ac1181ffbef5250cb3b651e047cce5116727c34c |
|
07-Jul-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
compiler: Rename INTERP_QUALIFIER_* to INTERP_MODE_*. Likewise, rename the enum type to glsl_interp_mode. Beyond the GLSL front-end, talking about "interpolation modes" seems more natural than "interpolation qualifiers" - in the IR, we're removed from how exactly the source language specifies how to interpolate an input. Also, SPIR-V calls these "decorations" rather than "qualifiers". Generated by: $ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \ -e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \ -e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \; Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
94135e8736f2741684e978afac9d34c368f7bcb1 |
|
07-Jul-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/fs: emit DIM instruction to load 64-bit immediates in HSW v2 (Matt): - Use brw_imm_df() as source argument of DIM instruction. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
87a13f598b1ecd50bc209088cf1dc60fd90df015 |
|
11-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: use the new helper function to create double immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
9e196e907ee87bff2b8c215df5e31a0cd1d1a322 |
|
09-Mar-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: add a helper function to create double immediates Gen7 hardware does not support double immediates so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation. v2: - Define setup_imm_df() as an independent function (Curro) - Create a specific builder to get rid of some instruction field assignments (Curro). v3: - Get devinfo from builder (Kenneth) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
27e28197e8e82e8c47fda5d6e912c5cb62c03f4a |
|
10-Jun-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: add double packing support to tess stages Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8b80e9c31db62ccf54ab593b47016ea514dec81c |
|
10-Jun-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: add double support packing support to gs inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
9d9b0b54cdc212c372ac67cc14d7ba1a16cc69ef |
|
22-May-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: add indirect packing support to gs load inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2477e6cfada55563631c654fce9250e4fe276f0e |
|
23-May-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: add indirect packing support for tcs and tes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2bda4b062f62edac1011bf65f410eeca176b5e23 |
|
20-May-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: add component packing support for tcs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
cfff71a47a655e8cf930e858d408dc4db942ec7c |
|
19-May-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: add component packing support for tes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a102ef2d4fd01a946f949a45115d65abb6714a5b |
|
19-May-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: add component packing support for gs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
255cff76d961e56199acab2ab523140e43ea2de2 |
|
23-Jun-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Drop unnecessary inst->base_mrf = -1 assignments. These are now unnecessary, as base_mrf is -1 by default. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
bdab572a86f27b92ba10124f85d278e9c8861fff |
|
13-Jun-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXT From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions, page 844: "When source or destination datatype is 64b or operation is integer DWord multiply, indirect addressing must not be used." v2: - Fix it for Broxton too. v3: - Simplify code by using subscript() and not creating a new num_components variable (Kenneth). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0177dbb6c2fe876a9761a4a97eec44accfa4c007 |
|
13-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXT From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions: "When source or destination is 64b (...), regioning in Align1 must follow these rules: 1. Source and destination horizontal stride must be aligned to the same qword. (...)" v2: - Fix it for Broxton too. v3: - Remove inst->regs_written change as it is not necessary (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a8a9d1bf41c00123cefb6e757f3509c62e880a15 |
|
14-Jun-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: remove type_size_vec4_times_4() type_size_vec4_times_4() was introduced as a fix in 8dcf807cb43383 however since 3810c1561 we can just use type_size_scalar() and get the actual number of outputs we need. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2b648ec17c2934802dd56452d11d78ec2d525a06 |
|
27-May-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/gs/scalar: Fix load input for doubles Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2d6f82a294ad1ab1eab0020cf65df5ecc9591272 |
|
26-May-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/fs: fix offset when loading double vector input varyings When we are not packing a double input varying, we might need to read its data in a non-aligned to 64-bit offset, so we read the wrong data. This is happening when using explicit locations in varyings because Mesa disables packing varying for that case. const_index is in 32-bit size units but offset() is multiplying it by destination type size units. When operating with double input varyings, const_index value could be not aligned to 64 bits. To fix it, we load the double vector as if it was a float based vector with twice the number of components. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3fb289f957a8a27349a6f7df03983f92d9b6cf64 |
|
02-Jun-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs Add a wm_prog_data bit for has_side_effects This is more accurate than calling _mesa_active_fragment_shader_has_side_effects because it looks at whether or not the SSBOs, images, or atomic buffers are actually written rather than just existing in the program. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0a3acff5b53d409181dcd2f31a4a50af06f73a57 |
|
23-May-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
i965: Remove old CS local ID handling The old method pushed data for each channels uvec3 data of gl_LocalInvocationID. The new method pushes 1 dword of data that is a 'thread local ID' value. Based on that value, we can generate gl_LocalInvocationIndex and gl_LocalInvocationID with some calculations. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8f48d23e0fcc0809f6397a67c26751a45a95e076 |
|
23-May-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
i965: Add nir channel_num system value v2: * simd16/32 fixes (curro) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
25e1b8d366a6131bc9d46fe27f6bc476f05a7a58 |
|
01-Jun-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix isoline reads in scalar TES. Isolines aren't reversed. commit 5b2d8c2273c6f fixed this for the vec4 TES backend, but not the scalar one. Found while debugging GL45-CTS.tessellation_shader. tessellation_control_to_tessellation_evaluation.gl_tessLevel. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b27dfa5403ed1884999524417c08d2bc50365965 |
|
24-May-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: If control_data_header_size_bits is zero, don't do EndPrimitive This can occur when max_vertices=0 is explicitly specified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
796238d9e6eee0b942d34c57bd8bdf0f9c98b6c3 |
|
18-May-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Use SIMD8 SSBO GET_BUFFER_SIZE message regardless of the dispatch width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
29e471725115edf941458c5be0bb7e93218ddd0f |
|
18-May-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Don't emit duplicated SSBO GET_BUFFER_SIZE instruction unnecessarily. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a55452530f7525e9cf5d2619bef66a61b488b4af |
|
26-Apr-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Emit fixed width memory fence opcode regardless of the dispatch width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
29ce110be6d0d4e4df51be635810f528f7dd7f40 |
|
19-May-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Remove extract virtual opcodes. These can be easily represented in the IR as a MOV instruction with strided source so they seem rather redundant. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a8e7b4f1d9ec50d2214e7694da26af6a108e506f |
|
20-May-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Handle SAMPLEINFO consistently like other texturing instructions. Seems like this texturing opcode was missing its logical counterpart which would prevent it from taking advantage of the SIMD lowering infrastructure, define it and plumb it through the back-end. At some point we'll likely want to emit a single SAMPLEINFO message shared among all channels irrespective of this change, but for the moment this should be enough to get the intrinsic working in SIMD32 mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
47e2a57fe955c04763c979ff4ca61c6867fa05bb |
|
18-May-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/compute: Fix uniform init issue when SIMD8 is skipped In d8347f12ead89c5a58f69ce9283a54ac8487159c, we added support for skipping SIMD8 generation when the program local size is too large for SIMD8 to be usable. This change was missed in that commit. This bug would impact gen7 platforms when the compute shader local size is greater than 512, and gen8 platforms when the local size is greater than 448. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e58fabc93a25ccc910369f3638b302d46de12271 |
|
24-May-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/gen7: Fix gl_HelperInvocation It appears that UV immediates aren't working on Ivy Bridge. In this case, a signed version will work, and this fixes the piglit tests/spec/glsl-4.50/execution/helper-invocation.shader_test test. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
44997fc0c1cc7f24216e3b1c5d954919df946ee5 |
|
02-May-2016 |
Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> |
i965: Support textures with multiple planes Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
015035027beb38fb9a3b06f8cd94aadc96a8f728 |
|
23-May-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Mark UBO uniform pull constant loads as force_writemask_all. This lets the rest of the backend know that the uniform pull constant load opcodes don't respect channel enables -- Without this the register allocator has no way to know that the return payload of a pull constant load is not per-channel and spills of the destination will be broken under non-uniform control flow. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b46867cd378e5fb135fd060d50c8028d3dac622a |
|
19-May-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: do not depend on std140 alignment rules for UBO loads The previous implementation relied on the std140 alignment rules to avoid handling misalignment in the case where we are loading more than 2 double components from a vector, which requires to emit a second load message. This alternative implementation deals with misalignment and is more flexible going forward. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
dac10e8a1390711f1f36f224644c4a33586cebe3 |
|
17-May-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965, anv: Use NIR FragCoord re-center and y-transform passes. This handles gl_FragCoord transformations and other window system vs. user FBO coordinate system flipping by multiplying/adding uniform values, rather than recompiles. This is much better because we have no decent way to guess whether the application is going to use a shader with the window system FBO or a user FBO, much less the drawable height. This led to a lot of recompiles in many applications. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
fb5dcb81cc121e4355b7eef014474a5c42a2f6db |
|
19-May-2016 |
Matt Turner <mattst88@gmail.com> |
i965: Pass nir_src/nir_dest by reference. Cuts 6K of .text. text data bss dec hex filename 5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so before 5766074 264648 29320 6060042 5c780a lib/i965_dri.so after Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
07353599e07529e98494057f556b9d96c1df5cfd |
|
05-May-2016 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Add and use get_nir_src_imm(). The next patch wants to inspect the LOD argument and do something different if it's 0.0f. But at that point we've emitted a MOV for it and we just have a register to look at. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
cbb0e3a7e8fffa4d5c5af8660d99cd3da8af97ec |
|
17-May-2016 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Assert that nir_op_extract_*'s src1 is a constant.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ccfe25f7583dd8d0ff0609de3728c8b15fb0f8fb |
|
31-Mar-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/fs: shuffle 32bits into 64bits for doubles VS Thread Payload handles attributes in URB as vec4, no matter if they are actually single or double precision. So with double-precision types, value ends up in the registers split in 32bits chunks, in different positions. We need to shuffle the chunks to get the doubles correctly. v2: * Extra blank line. Add { } on if body (Ian Romanick) * Use dest directly (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
58f304defe804a6f01b0b961997ecfe61fe00d34 |
|
09-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/tes/scalar: Fix load input for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
61197b8d5dd963bd9288385308feb3f0dcaf6742 |
|
09-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/tcs/scalar: fix store output for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
cda3435ea85904a17c5c23a7c044e59ba0181b96 |
|
09-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/tcs/scalar: fix load input for doubles v2: do not write to the original indirect_offset since that is an expression that could be used somewhere else (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
66192b3c16b09fa7ba97574103fc3d883b3cbfdb |
|
09-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: fix nir_intrinsic_store_output for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3cce67aff09a4c248e9a69a8b05a63ac6b3e4878 |
|
09-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: fix number of output components for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8c6d147373cbdefef5945b00626bb62bb03198ca |
|
26-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: support doubles with shared variable stores This is pretty much the same we do with SSBOs. v2: do not shuffle in-place, it is not safe since the original 64-bit data could be used after the write, instead use a temporary like we do for SSBO stores (Iago) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
943f9442bf7943a992730e642e91ed874d50790c |
|
25-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: support doubles with ssbo stores Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b9aa66aa516c100d5476ee966f428aaf743d786c |
|
25-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: add shuffle_64bit_data_for_32bit_write helper This does the inverse operation of shuffle_32bit_load_result_to_64bit_data and we will use it when we need to write 64-bit data in the layout expected by untyped write messages. v2 (curro): - Use subscript() instead of stride() - Assert on the input types rather than silently retyping. - Use offset() instead of horiz_offset(), drop the multiplier definition. - Drop the temporary vgrf and force_writemask_all. - Make component_i const. - Move to brw_fs_nir.cpp v3 (curro): - Pass dst and src by reference. - Simplify allocation of tmp register. - Move to brw_fs_nir.cpp. - Get rid of the temporary. v3 (Iago): - Check that the src and dst regions do not overlap, since that would typically be a bug in the caller. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
33f7ec18ac399719df06ab7031cb43965e6793be |
|
25-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: support doubles with SSBO loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8aa01ac596fc0722058e10808c8141533c3fd1fe |
|
05-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: support doubles with shared variable loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
6eab06b866916d4fd52adf7b8bb6113948a3811a |
|
05-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: Add do_untyped_vector_read helper We are going to need the same logic for anything that reads doubles via untyped messages (CS shared variables and SSBOs). Add a helper function with that logic so that we can reuse it. v2: - Make this a static function instead of a method of fs_visitor (Iago) - We only support types with a size of 4 or 8 (Curro) - Avoid retypes by using a separate vgrf for the packed result (Curro) - Put dst parameter before source parameters (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b86d4780ed203b2a22afba5f95c73b15165a7259 |
|
13-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: support doubles with UBO loads UBO loads with constant offset use the UNIFORM_PULL_CONSTANT_LOAD instruction, which reads 16 bytes (a vec4) of data from memory. For dvec types this only provides components x and y. Thus, if we are reading more than 2 components we need to issue a second load at offset+16 to read the next 16-byte chunk with components w and z. UBO loads with non-constant offset emit a load for each component in the vector (and rely in CSE to fix redundant loads), so we only need to consider the size of the data type when computing the offset of each element in a vector. v2 (Sam): - Adapt the code to use component() (Curro). v3 (Sam): - Use type_sz(dest.type) in VARYING_PULL_CONSTANT_LOAD() call (Curro). - Add asserts to ensure std140 vector alignment rules are followed (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
58f1804c4f38b76c20872d6887b7b5e6029e0454 |
|
18-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: fix pull constant load component selection for doubles UNIFORM_PULL_CONSTANT_LOAD is used to load a contiguous vec4 starting at a constant offset that is 16-byte aligned. If we need to access an unaligned offset we emit a load with an aligned offset and use the remaining constant offset to select the component into the vec4 result that we are interested in. This component must be computed in units of the type size, since that is what fs_reg::set_smear expects. This patch does this change in the two places where we use this message: In demote_pull_constants when we lower uniform access with constant offset into the pull constant buffer and in UBO loads with constant offset. v2 (Sam): - Fix set_smear() in fs_visitor::lower_constant_loads(), take into account source type instead and remove MAX2 (Curro). - Improve changes to nir_intrinsic_load_ubo case in nir_emit_intrinsic() (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
50b7676dc46bae39c5e9b779828ef4fb2e1fbefc |
|
22-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: add shuffle_32bit_load_result_to_64bit_data helper There will be a few places where we need to shuffle the result of a 32-bit load into valid 64-bit data, so extract this logic into a separate helper that we can reuse. v2 (Curro): - Use subscript() instead of stride() - Assert on the input types rather than retyping. - Use offset() instead of horiz_offset(), drop the multiplier definition. - Don't use force_writemask_all. - Mark component_i as const. - Make the function name lower case. v3 (Curro): - Pass src and dst by reference. - Move to brw_fs_nir.cpp Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c907ca6c8d256f4b8c271bcf0901661ef943ae08 |
|
13-May-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Flip interpolateAtOffset's y offset when necessary. Fixes 4 dEQP-GLES31.functional.shaders.multisample_interpolation tests: - interpolate_at_offset.no_qualifiers.default_framebuffer - interpolate_at_offset.centroid_qualifier.default_framebuffer - interpolate_at_offset.sample_qualifier.default_framebuffer - interpolate_at_offset.array_element.default_framebuffer Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
50e5e1f747ad820eb491e093600a4bde9c13efba |
|
03-May-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Implement the new NIR MCS texturing Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3aa542c65760c7e9b92a41d850677a44879cc5c7 |
|
09-May-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Delete bogus assertion in emit_gs_input_load(). This looks like leftover cruft from an earlier attempt at writing point size hacks. Each vertex has its own copy of gl_PointSize, so accessing any vertex other than 0 would cause this to fail. The tests seem to work fine without it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1c41cb58def637c9e033cb7bf108f1096c9ae63c |
|
08-May-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Support instanced GS inputs in the scalar backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
5fc37726501bc65f3bbaef2573ac89e980f1a412 |
|
08-May-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use an early return for the push case in emit_gs_input_load(). Just trying to keep things from getting too ugly in the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
75ada43a3af88835de6a83ed453d4ed512df0412 |
|
19-Apr-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT v2: - Fix assert's line width (Topi). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
efaf62a40a95b240cab7b0f371c7178aa19b7f3a |
|
12-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: implement i2d and u2d Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c63a6f21494685d41d51887901298639c4d32c22 |
|
18-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: implement d2i and d2u These need the same treatment as d2f, so generalize our d2f lowering to cover these too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e0c45182e3d865d7f187dc35e70832f1fa7c9fad |
|
18-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: implement d2b v2: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
80f60a4302c8bd805882baaf60db72cf785593e3 |
|
07-Jan-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: implement fsign() for doubles v2 (Sam): - Fix indentation (Kenneth) - Simplify code (Kenneth) v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e8a8fc956358fb5e0f776b39fdbce9247bb5538a |
|
10-Nov-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: We only support 32-bit integer ALU operations for now Add asserts so we remember to address this when we enable 64-bit integer support, as suggested by Connor and Jason. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a644b0939dd8284bca25042bccd2439c173dd7d7 |
|
30-Jul-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965/fs: add support for f2d and d2f Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e83f51d54e9c3db11526b66a741352135eae6f52 |
|
04-Aug-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965/fs: fix compares for doubles The destination has to have the same source as the type, or else the simulator will complain. As a result, we need to emit a CMP that outputs a 64-bit wide result and then do a strided MOV to pick out the low 32 bits of each channel. v2: Use subscript() instead of stride() (Curro) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
935e0e305dd7a4f67557e969513a30357d308efb |
|
19-Apr-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: optimize unpack double When we are actually unpacking from a double that we have previously packed from its 32-bit components we can bypass the pack operation and source from its arguments directly. v2 (Sam): - Fix line overflow (Topi) - Bail if the parent instruction's source is not SSA (Connor) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ba1907f040e9d61be932a8e098061d94d4ba30cb |
|
19-Apr-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: optimize pack double When we are actually creating a double using values obtained from a previous unpack operation we can bypass the unpack and source from the original double value directly. v2: - Style changes (Topi) - Bail is parent instruction's src is not SSA (Connor) v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7782f39e759798975ace6f3272dd3f263ddc8702 |
|
14-Aug-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965/fs/nir: translate double pack/unpack v2 (Sam): - Fix line overflow (Topi). v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
d17cdacba37cff8ee172322c9ba2c4a58bf57d8b |
|
29-Jul-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965/fs: always pass the bitsize to brw_type_for_nir_type() v2 (Sam): - Add bitsize to brw_type_for_nir_type() in optimize_extract_to_float() v3 (Sam): - Fix line width (Topi). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0f1690fd9514f7a282141a7ad57a06b334b6c1a4 |
|
29-Jul-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965/fs: use the NIR bit size when creating registers v2 (Iago): - Squashed bits from 'support double precission constant operands for the implementation of 64-bit emit_load_const'. - Do not use BRW_REGISTER_TYPE_D for all 32-bit registers since that breaks asserts and functionality for some piglit tests. Just keep 32-bit types untouched and add 64-bit support. - Use DF instead of Q for 64-bit registers. Otherwise the code we generate will use Q sometimes and DF others and we hit unwanted DF/Q conversions, so always use DF. v3 (Sam): - Mark 'reg_type' occurrences as const (Topi). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Palli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7bc987abe0dc863b091bf77f5b02138ebe79e559 |
|
03-May-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Move handling of samples_identical into the switch statement This is where we handle texop_texture_samples so it makes things more consistent.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3ba228f9978cbabc2b4731327454dd91a208c317 |
|
03-May-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Simplify texture destination fixups There are a few different fixups that we have to do for texture destinations that re-arrange channels, fix hardware vs. API mismatches, or just shrink the result to fit in the NIR destination. These were all being done in a somewhat haphazard manner. This commit replaces all of the shuffling with a single LOAD_PAYLOAD operation at the end and makes it much easier to insert fixups between the texture instruction itself and the LOAD_PAYLOAD. Shader-db results on Haswell: total instructions in shared programs: 6227035 -> 6226669 (-0.01%) instructions in affected programs: 19119 -> 18753 (-1.91%) helped: 85 HURT: 0 total cycles in shared programs: 56491626 -> 56476126 (-0.03%) cycles in affected programs: 672420 -> 656920 (-2.31%) helped: 92 HURT: 42
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a815499294afb485fe6773fba9ba12fa6773c654 |
|
03-May-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Merge nir_emit_texture and emit_texture The fs_visitor::emit_texture helper originated when we still had both NIR and IR visitors for the FS backend. Since the old visitor was removed, emit_texture serves no real purpose beyond arbitrarily splitting heavily-linked code across two functions.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a808ba59657b3e5c6399e51fa1f4ebe9cad201a9 |
|
03-May-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rework passthrough TCS checks. According to Timothy, using program_string_id == 0 to identify the passthrough TCS is going to be problematic for his shader cache work. So, change it to strcmp() the name at visitor creation time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7d9143ad885752184156b3a0d3e492aef09af3b0 |
|
15-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Write a scalar TCS backend that runs in SINGLE_PATCH mode. Unlike most shader stages, the Hull Shader hardware makes us explicitly tell it how many threads to dispatch and manually configure the channel mask. One perk of this is that we have a lot of flexibility - we can run it in either SIMD4x2 or SIMD8 mode. Treating it as SIMD8 means that shaders with 8 or fewer output vertices (which is overwhemingly the common case) can be handled by a single thread. This has several intriguing properties: - Accessing input arrays with gl_InvocationID as the index is a simple SIMD8 URB read with g1 as the header. No indirect addressing required. - Barriers are no-ops. - We could potentially do output shadowing to combine writes, as the concurrency concerns are gone. (We don't do this yet, though.) v2: Drop first_non_payload_grf change, as it was always adding 0 (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
9464d8c49813aba77285e7465b96e92a91ed327c |
|
27-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Switch the arguments to nir_foreach_function This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
707e72f13bb78869ee95d3286980bf1709cba6cf |
|
27-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Switch the arguments to nir_foreach_instr This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/ and similar expressions for nir_foreach_instr_safe etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7efff10585122d484dc3adab14af9380b9b8f309 |
|
13-Apr-2016 |
Connor Abbott <cwabbott0@gmail.com> |
i965/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
13195f7ef85e0923a7b7d5b8a35eb6b6c257db1c |
|
23-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Reduce the response length of sampler messages on Skylake. Often, we don't need a full 4 channels worth of data from the sampler. For example, depth comparisons and red textures only return one value. To handle this, the sampler message header contains a mask which can be used to disable channels, and reduce the message length (in SIMD16 mode on all hardware, and SIMD8 mode on Broadwell and later). We've never used it before, since it required setting up a message header. This meant trading a smaller response length for a larger message length and additional MOVs to set it up. However, Skylake introduces a terrific new feature: for headerless messages, you can simply reduce the response length, and it makes the implicit header contain an appropriate mask. So to read only RG, you would simply set the message length to 2 or 4 (SIMD8/16). This means we can finally take advantage of this at no cost. total instructions in shared programs: 9091831 -> 9073067 (-0.21%) instructions in affected programs: 191370 -> 172606 (-9.81%) helped: 2609 HURT: 0 total cycles in shared programs: 70868114 -> 68454752 (-3.41%) cycles in affected programs: 35841154 -> 33427792 (-6.73%) helped: 16357 HURT: 8188 total spills in shared programs: 3492 -> 1707 (-51.12%) spills in affected programs: 2749 -> 964 (-64.93%) helped: 74 HURT: 0 total fills in shared programs: 4266 -> 2647 (-37.95%) fills in affected programs: 3029 -> 1410 (-53.45%) helped: 74 HURT: 0 LOST: 1 GAINED: 143 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c7a09c057162ed0b7e9e039470c76bb79518876c |
|
10-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Properly report regs_written from SAMPLEINFO The previous behavior would only allocate one register and then write four thus potentially stomping three innocent bystanders. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0bd956b34b376bdc1eaf91a2a8463d13dd59e641 |
|
24-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Don't force a header for texture offsets of 0. Calling textureOffset() with an offset of <0, 0, 0> is equivalent to calliing texture(). We don't actually need to set up an offset, which causes a message header to be created. A fairly common pattern is to sample at a point with a bunch of offsets, and average them. It's natural to write all the lookups as textureOffset, but use <0, 0> for the center sample. shader-db results on Skylake: total instructions in shared programs: 9092095 -> 9092087 (-0.00%) instructions in affected programs: 2826 -> 2818 (-0.28%) helped: 12 HURT: 2 total cycles in shared programs: 70870166 -> 70870144 (-0.00%) cycles in affected programs: 15924 -> 15902 (-0.14%) helped: 2 HURT: 0 This also helps prevent code quality regressions in a future patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f310c02b94fba0a0a5ea7f5573f906de823cc5fe |
|
16-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_surface_builder: Take a GL format enum instead of mesa_format Reviewed-by: Chad Versace <chad.versace@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
d8c8f4203f8bb18152af0d0c120f3582a93c07c2 |
|
06-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix interpolateAtSample() on single sampled buffers. Fixes dEQP-GLES31.functional.shaders.multisample_interpolation tests: - interpolate_at_sample.non_multisample_buffer.sample_n_default_framebuffer - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_rbo - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_texture Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
447d3eec6a869200612e5010f47335cb26789a3a |
|
06-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix gl_SampleMaskIn[] in per-sample shading mode. The coverage mask is not sufficient - in per-sample mode, we also need to AND with a mask representing the samples being processed by the current fragment shader invocation. Fixes 18 dEQP-GLES31.functional.shaders.sample_variables tests: sample_mask_in.bit_count_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bit_count_per_two_samples.multisample_{rbo,texture}_{4,8} sample_mask_in.bits_unique_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bits_unique_per_two_samples.multisample_{rbo,texture}_{4,8} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b6dc940ec273252678d40707d300851fa1c85ea5 |
|
13-Apr-2016 |
Connor Abbott <cwabbott0@gmail.com> |
nir: rename nir_foreach_block*() to nir_foreach_block*_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f30f6e26252ed09eca1922f7c8633c7c7b6e50fe |
|
15-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Don't allow OOB array access of images We have had a guard against OOB array access of images on IVB for a long time, but it can actually cause hangs on any GPU generation. This can happen due to getting an untyped SURFACE_STATE for a typed message. We didn't used to hit this with the piglit test on anything other than IVB because the OOB in the test would cause us to go past the top of the pull constant UBO and we would get a surface index of 0 which is was always a valid surface. Now that we're pushing small arrays, we can end up grabbing garbage from the GRF and going to some random index which causes a hang. The solution is to just do the bounds check on all hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94944 Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Mark Janes <mark.a.janes@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
479e38ad63ab1421afe4f25d36f434ac2e12e817 |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Get rid of the param_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3c93cdfaf598bc3c28e3dc288da35675c666602b |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Use MOV_INDIRECT for all indirect uniform loads Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
240d16ea94834eb2472e91fd4856381951a07007 |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD Reveiewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
765dd6534937e125b95c7998862b1a4ec76a22d8 |
|
25-Mar-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Implement the new imod and irem opcodes Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
bfd17c76c1267756ea16051cbe174cb23ff49f44 |
|
08-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Port INTEL_PRECISE_TRIG=1 to NIR. This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos are multiplied by an constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
5ea3647f89abccea5496824815b5b729f38f7a23 |
|
25-Mar-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Move the code for load/store_shared to emit_cs_intrinsic They are compute-shader only and that's where the code for doing atomics on shared variables lives so it seemes to make sense. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
80c72a8ea7b1018661da0e6509a7f88ca1f5086f |
|
25-Mar-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Provide a default LOD for buffer textures Our hardware requires an LOD for all texelFetch commands even if they are on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD is really rather meaningless. This commit allows other NIR producers to be more lazy and not provide one at all. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
65fbc43d54403905e3eaea02372b5a364dc1d773 |
|
27-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range. The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.99997, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.00006 to about 0.00003. When enabled, fixes 16 dEQP precision tests dEQP-GLES31.functional.shaders.builtin_functions.precision. {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. at the cost of making every sin and cos call more expensive (about twice the number of cycles on recent hardware). Enabling this option has been shown to reduce GPUTest Volplosion performance by about 10%. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
14c46954c910efb1db94a068a866c7259deaa9d9 |
|
25-Mar-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Add an implemnetation of nir_op_fquantize2f16 Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
084b24f5582567ebf5aa94b7f40ae3bdcb71316b |
|
16-Mar-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ef76ea4ba97d0ac122491fd3f1b2bbb8e4163150 |
|
04-Mar-2016 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/fs/nir: "surface_access::" prefix not needed "using namespace brw::surface_access" is already present at the top of the source file. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1f862e923cba1d5cd54a707f70f0be113635e855 |
|
21-Jan-2016 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Optimize float conversions of byte/word extract. instructions in affected programs: 31535 -> 29966 (-4.98%) helped: 23 cycles in affected programs: 272648 -> 266022 (-2.43%) helped: 14 HURT: 1 The patch decreases the number of instructions in the two Unigine programs by: #1721: 4374 -> 4155 instructions (-5.01%) #1706: 3582 -> 3363 instructions (-6.11%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0e9dc59a58e632979b3bdebb19d184bd22a0c182 |
|
11-Feb-2016 |
Matt Turner <mattst88@gmail.com> |
i965: Make emit_minmax return an instruction*. And use it in brw_fs_nir.cpp.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2f2c00c7279e7c43e520e21de1781f8cec263e92 |
|
11-Feb-2016 |
Matt Turner <mattst88@gmail.com> |
i965: Lower min/max after optimization on Gen4/5. Gen4/5's SEL instruction cannot use conditional modifiers, so min/max are implemented as CMP + SEL. Handling that after optimization lets us CSE more. On Ironlake: total instructions in shared programs: 6426035 -> 6422753 (-0.05%) instructions in affected programs: 326604 -> 323322 (-1.00%) helped: 1411 total cycles in shared programs: 129184700 -> 129101586 (-0.06%) cycles in affected programs: 18950290 -> 18867176 (-0.44%) helped: 2419 HURT: 328 Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ac089126b9b647f930ee2657aa16ea8e8f6a5dd7 |
|
09-Feb-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
glsl/types: Rename sampler_type to sampled_type It's a bit more descriptive since it is the base type that you get when you sample from it. Also, the next commit adds a bare "sampler" type and we need glsl_type::sampler_type available for a public static member. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8750299a420af76cebd3067f6f603eacde06ae06 |
|
09-Feb-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Remove the const_offset from nir_tex_instr When NIR was originally drafted, there was no easy way to determine if something was constant or not. The result was that we had lots of special-casing for constant values such as this. Now that load_const instructions are SSA-only, it's really easy to find constants and this isn't really needed anymore. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b8ab9c8c8674d67e09c1134ca44b37e0a611f5b5 |
|
06-Feb-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
5ec456375e4fdd0b6c7d797f99191044e19ead74 |
|
03-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the texture deref and leaves the sampler deref alone as it did before and nir_lower_samplers assumes this. Backends can still assume that they are combined and only look at only at the texture index. Or, if they wish, they can assume that they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir all set both texture and sampler index whenever a sampler is required (the two indices are the same in this case). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ee85014b90af1d94d637ec763a803479e9bac5dc |
|
06-Feb-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/tex_instr: Rename sampler to texture We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1dc312e295c66ab8674d2f47f859e310f607b2ed |
|
21-Jan-2016 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Implement support for extract_word. The vec4 backend will lower it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
eb63640c1d38a200a7b1540405051d3ff79d0d8a |
|
17-Jan-2016 |
Emil Velikov <emil.velikov@collabora.com> |
glsl: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b3340cd32acf5935891f19833de0cfc500a93e0b |
|
21-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Implement a drirc workaround for broken dual color blending. OpenGL's dual color blending feature was specified so that an implementation could support both multiple render targets (MRT) and dual source blending. Fragment shader outputs specify both "location" (the render target number) and "index" (either color 0 or 1). I believe DirectX only has the notion of "location" - if using dual color blending, location 0 or 1 will specify the operands. If not, then location means the render target index. The two features can't be used together. As such, some applications mistakenly try to use <loc = 0, index = 0> and <loc = 1, index = 0> in a shader used for dual color blending with a single render target, rather than the correct <loc = 0, index = 0> and <loc = 0, index = 1>. In particular, Unigine Heaven 4.0 and Valley 1.0 suffer from this bug. Unigine is aware of the problem, and quickly developed a fix, but has not bothered to change the download link on their website to a working copy in over a year. People were still using the broken version and complaining. We tried working around this by disabling dual color blending, but that apparently hurts performance, and people were once again unhappy. On i965, dual source blending is achieved by using different framebuffer write messages than normal rendering. So, we have to compile different code for the two cases. We're not being pedantic: we actually have to know in order to function. Normally, dual source blending is detectable in the shader: if a shader has an output with index = 1, then it's meant for blending, not MRT. With the broken inputs, they're indistinguishable, so we can only tell by looking at the current GL state. This patch implements a new drirc workaround: export dual_color_blend_by_location=true which makes the i965 driver detect when OpenGL state is configured for dual source blending, and recompile the fragment shader to use the right messages. In that case, we allow either location = 1 or index = 1 to specify the second source for the blending equations. It also re-enables GL_ARB_blend_func_extended for Unigine. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b82e26a6a4d6baf121f44c61c862bfa79ba0d172 |
|
13-Jan-2016 |
Matt Turner <mattst88@gmail.com> |
nir: Lower bitfield_extract. The OpenGL specifications for bitfieldExtract() says: The result will be undefined if <offset> or <bits> is negative, or if the sum of <offset> and <bits> is greater than the number of bits used to store the operand. Therefore passing bits=32, offset=0 is legal and defined in GLSL. But the earlier SM5 ubfe/ibfe opcodes are specified to accept a bitfield width ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits of the width operand, making them not able to implement the GLSL-specified behavior directly. This commit adds ubfe/ibfe operations from SM5 and a lowering pass for bitfield_extract to to handle the trivial case of <bits> = 32 as bitfieldExtract: bits > 31 ? value : bfe(value, offset, bits) Fixes: ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b85a229e1f542426b1c8000569d89cd4768b9339 |
|
08-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes. TGSI doesn't use these - it just translates ir_quadop_bitfield_insert directly. NIR can handle ir_quadop_bitfield_insert as well. These opcodes were only used for i965, and with Jason's recent patches, we can do this lowering in NIR (which also gains us SPIR-V handling). So there's not much point to retaining this GLSL IR lowering code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
4a1c8a3037cd29938b2a6e2c680c341e9903cfbe |
|
28-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Push most TES inputs in SIMD8 mode. Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 32 vec4 slots (16 registers) is more than sufficient to ensure that 100% of TES inputs are pushed for Shadow of Mordor, Unigine Heaven, GPUTest/TessMark, and SynMark. Note that unlike most SIMD8 stages, this actually reads packed vec4 data, since that is what our vec4 TCS programs write. Improves performance in GPUTest's tessmark_x64 microbenchmark by 93.4426% +/- 5.35541% (n = 25) on my Lenovo X250 at 1024x768. Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 22.74% +/- 0.309394% (n = 5). Improves performance in Shadow of Mordor at low settings with tessellation enabled at 1280x720 by 2.12197% +/- 0.478553% (n = 4). shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 184358 -> 181181 (-1.72%) instructions in affected programs: 27971 -> 24794 (-11.36%) helped: 226 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b022150d70a1cfdda2007fa16b04c601eef45d6f |
|
28-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use LOAD_PAYLOAD for SIMD8 TES input loads, not MOV. We need a MOV to replicate g0.0<0,1,0> to all 8 channels. Since the message payload is a single register, MOV seemed more sensible than LOAD_PAYLOAD. However, MOV cannot be CSE'd, while LOAD_PAYLOAD can. All input loads can use the same header - we don't need to re-expand g0 every time. CSE accomplishes this, saving instructions. shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 186923 -> 184358 (-1.37%) instructions in affected programs: 30536 -> 27971 (-8.40%) helped: 226 HURT: 0 total cycles in shared programs: 1009850 -> 1005356 (-0.45%) cycles in affected programs: 168206 -> 163712 (-2.67%) helped: 226 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
cddfc2cefa93b884c40329dcb193fe4fb22143ab |
|
10-Dec-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965: Add support for gl_DrawIDARB and enable extension We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
17ebb55a14b5a9aa639845fbda9330ef9421834a |
|
10-Dec-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
237f2f2d8b45d9d956102eec6f9be63193e5269b |
|
26-Dec-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Get rid of function overloads When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can hav a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> ir3 bits are Reviewed-by: Rob Clark <robclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a5038427c3624e559f954124d77304f9ae9b884c |
|
10-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add tessellation evaluation shaders The TES is essentially a post-tessellator VS, which has access to the entire TCS output patch, and a special gl_TessCoord input. Otherwise, they're very straightforward. This patch implements SIMD8 tessellation evaluation shaders for Gen8+. The tessellator can generate a lot of geometry, so operating in SIMD8 mode (8 vertices per thread) is more efficient than SIMD4x2 mode (only 2 vertices per thread). I have another patch which implements SIMD4x2 mode for older hardware (or via an environment variable override). We currently handle all inputs via the pull model. v2: Improve comments (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c8a74e3a4ea6ac5dfa35adac06af14a8fa4ff773 |
|
30-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
nir: Delete bany, ball, fany, fall. As in the previous patches, these can be implemented as any(v) -> any_nequal(v, false) all(v) -> all_equal(v, true) and their removal simplifies the code in the next patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b8425bb1e845bef19dac8d8a9fd672e958018802 |
|
11-Dec-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Use the correct source for local memory load offsets The offset for loads is in src[0]. This was a copy+paste error in the nir_intrinsic_load/store refactoring. This commit fixes a segfault in ES31-CTS.compute_shader.work-group-size. I have no idea how piglit failed to catch this... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93348 Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
78b81be627734ea7fa50ea246c07b0d4a3a1638a |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Get rid of *_indirect variants of input/output load/store intrinsics There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the mean time, having the *_indirect variants adds special cases a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convdert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the *_indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of *_indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <eric@anholt.net> ir3 changes are Reviewed-by: Rob Clark <robdclark@gmail.com> NIR changes are Acked-by: Rob Clark <robdclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f3970fad9e5b04e04de366a65fed5a30da618f9d |
|
08-Dec-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Refactor store_output, load_input, and load_uniform There was way too much incrementing of things going on. Instead, let's just start everything off at the right base location, and then increment in the loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e288b4a133f1ea8208cd219545a72805ed5a91c6 |
|
10-Oct-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/nir: Implement shared variable atomic operations v3: * Update based on latest SSBO code (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
faddb301ff72bd7ac8d4274e0d895ca37a4d3bce |
|
29-Jul-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/fs: Handle nir shared variable store intrinsic v4: * Apply similar optimization for shared variable stores as 0cb7d7b4b7c32246d4c4225a1d17d7ff79a7526d. This was causing a OpenGLES 3.1 CTS failure, but 867c436ca841b4196b4dde4786f5086c76b20dd7 fixes that. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8613206bd3dd80dc916b6ce7c47bf59cd4d114c8 |
|
29-Jul-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/fs: Handle nir shared variable load intrinsic v3: * Remove extra #includes (Iago) * Use recently added GEN7_BTI_SLM instead of BRW_SLM_SURFACE_INDEX (curro) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
18069dce4a4c3d71e6afc6b10bfa7bee0560ba9c |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Make uniform offsets be in terms of bytes This commit pushes makes uniform offsets be terms of bytes starting with nir_lower_io. They get converted to be in terms of vec4s or floats when we cram them in the UNIFORM register file but reladdr remains in terms of bytes all the way down to the point where we lower it to a pull constant load. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
22c273de2b97743587310f7bbf66767191bde866 |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Remove unused indirect handling The one and only place where the FS backend allows reladdr is on uniforms. For locals, inputs, and outputs, we lower it away before the backend ever sees it. This commit gets rid of the dead indirect handling code. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
13ad8d03f201a4d09bf7ab9078b00807d61dfada |
|
01-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Use a stride of 1 and byte offsets for UBOs Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3810c1561401aba336765d64d1a5a3e44eb58eb3 |
|
25-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix scalar vertex shader struct outputs. While we correctly set output[] for composite varyings, we set completely bogus values for output_components[], making emit_urb_writes() output zeros instead of the actual values. Unfortunately, our simple approach goes out the window, and we need to recurse into structs to get the proper value of vector_elements for each field. Together with the previous patch, this fixes rendering in an upcoming game from Feral Interactive. v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt). Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3e9003e9cf55265ab1fb6522dc5cbb2f455ea1f9 |
|
20-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix fragment shader struct inputs. Apparently we have literally no support for FS varying struct inputs. This is somewhat surprising, given that we've had tests for that very feature that have been passing for a long time. Normally, varying packing splits up structures for us, so we don't see them in the backend. However, with SSO, varying packing isn't around to save us, and we get actual structs that we have to handle. This patch changes fs_visitor::emit_general_interpolation() to work recursively, properly handling nested structs/arrays/and so on. (It's easier to read with diff -b, as indentation changes.) When using the vec4 VS backend, this fixes rendering in an upcoming game from Feral Interactive. (The scalar VS backend requires additional bug fixes in the next patch.) v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt). Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f36993b46962eab4446bc1964eb47149751aee26 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ecac1aab538d65f0867fd93e23d0d020c1a5d0f1 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Push down inclusion of brw_program.h. We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
6fe9ea78fa413ca3f0359f62881876f6b7a12f03 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Remove duplicate #includes. Added in commits 36fd65381 and 337dad8ce even though the existing include was in view. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
6c8ba59cff14a1a86273f4008ff2a8e68335ab25 |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use nir_lower_tex for texture coordinate lowering Previously, we had a rescale_texcoords helper in the FS backend for handling rescaling of texture coordinates. Now that we can do variants in NIR, we can use nir_lower_tex to do the rescaling for us. This allows us to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and GL_CLAMP handling in vertex and geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c875e3cdd21811ad6669160d59fa39a4526ef872 |
|
14-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Add support for gl_HelperInvocation system value. In most cases (when the negate is copy propagated and the MOV removed), this is two instructions on Gen >= 8 and only two instructions on earlier platforms -- and it doesn't use the flag register. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
99840eb983f74cd447546f7205c8c9f505ef82c8 |
|
18-Nov-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Enable EXT_shader_samples_identical On the vec4 backend, textureSamplesIdentical() will always return false. There are currently no test cases for the vec4 backend, so we don't have much confidence in any implementation. We also don't think anyone is likely to miss it. v2: Handle immediate value for MCS smarter. Rebase on changes to nir_texop_sampels_identical (missing second parameter). Suggested by Jason. v3: Add Neil's code to handle 16x MSAA in the FS. Also rebase on top of f9a9ba5e. Stub out the vec4 implementation. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v2] Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
457bb290efc162ea3c7c51a820ab7cf88a4efb8d |
|
18-Nov-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
nir: Add nir_texop_samples_identical opcode This is the NIR analog to GLSL IR ir_samples_identical. v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by Ken and Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
9b978046eb1d1657060365e8dcde4aad41b50af9 |
|
02-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Use brw_imm_uw(). W/UW immediates are 16-bits, but those 16-bits must be replicated in the high 16-bits of the 32-bit field. Remove the useless W/UW immediate saturating code, since we'll now be using the appropriate immediate (and W/UW immediates in the IR can now no longer be larger than 16-bits). Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3ccc41ecfc5e9345a1c291748d8840984f7413ae |
|
02-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Replace fs_reg(imm) constructors with brw_imm_*(). Cuts 10k of .text, of which only 776 bytes are the fs_reg constructor implementations themselves. text data bss dec hex filename 5204535 214112 27784 5446431 531b1f i965_dri.so before 5193977 214112 27784 5435873 52f1e1 i965_dri.so after Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
fc19a0d2e422ea8e45bc5440a91f858f5f345884 |
|
08-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Allow indirect GS input indexing in the scalar backend. This allows arbitrary non-constant indices on GS input arrays, both for the vertex index, and any array offsets beyond that. All indirects are handled via the pull model. We could potentially handle indirect addressing of pushed data as well, but it would add additional code complexity, and we usually have to pull inputs anyway due to the sheer volume of input data. Plus, marking pushed inputs as live due to indirect addressing could exacerbate register pressure problems pretty badly. We'd need to be careful. v2: Use updated MOV_INDIRECT opcode. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b163aa01487ab5f9b22c48b7badc5d65999c4985 |
|
27-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Rename GRF to VGRF. The 2-bit hardware register file field is ARF, GRF, MRF, IMM. Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to mean an assigned general purpose register. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
dba309fc14d1ca99251c8f8115d2a26ac86f14f6 |
|
30-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Initialize registers. The test (file == BAD_FILE) works on registers for which the constructor has not run because BAD_FILE is zero. The next commit will move BAD_FILE in the enum so that it's no longer zero. In the case of this->outputs, the constructor was being run implicitly, and we were unnecessarily memsetting is to zero. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
d4fdb84f80dd3dbad2b71ea6e877f24dc625aa2a |
|
10-Nov-2015 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZE FS_OPCODE_GET_BUFFER_SIZE is calculated with a resinfo's sampler message. This patch adjusts the number of registers written by the opcode following what the PRM spec says about the number of registers written by the SIMD8 and SIMD16's writeback messages for sampler messages. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
918bda23dda36004c95f6441328ecc892e068886 |
|
05-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Split nir_emit_intrinsic by stage with a general fallback. Many intrinsics only apply to a particular stage (such as discard). In other cases, we may want to interpret them differently based on the stage (such as load_primitive_id or load_input). The current method isn't that pretty - we handle all intrinsics in one giant function. Sometimes we assert on stage, sometimes we forget. Different behaviors are handled via if-ladders based on stage. This commit introduces new nir_emit_<stage>_intrinsic() functions, and makes nir_emit_instr() call those. In turn, those fall back to the generic nir_emit_intrinsic() function for cases they don't want to handle specially. This makes it clear which intrinsics only exist in one stage, and makes it easy to handle inputs/outputs differently for various stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
51694072218b5ae84b5d8f98ee2172d7c5d61b31 |
|
06-Nov-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/nir/fs: Add comment for no-op memory barrier functions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
faa119307035787f5e421dd6a9eb4d0101de963b |
|
10-Oct-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/nir/fs: Implement new barrier functions for compute shaders For these nir intrinsics, we emit the same code as nir_intrinsic_memory_barrier: * nir_intrinsic_memory_barrier_atomic_counter * nir_intrinsic_memory_barrier_buffer * nir_intrinsic_memory_barrier_image We treat these nir intrinsics as no-ops: * nir_intrinsic_group_memory_barrier * nir_intrinsic_memory_barrier_shared v3: * Add comment for no-op cases (curro) v4: * Moving comment to a separate patch authored by curro Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8dcf807cb43383590ba193c7ff20b8a98e4a9f65 |
|
14-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix scalar VS float[] and vec2[] output arrays. The scalar VS backend has never handled float[] and vec2[] outputs correctly (my original code was broken). Outputs need to be padded out to vec4 slots. In fs_visitor::nir_setup_outputs(), we tried to process each vec4 slot by looping from 0 to ALIGN(type_size_scalar(type), 4) / 4. However, this is wrong: type_size_scalar() for a float[2] would return 2, or for vec2[2] it would return 4. This looked like a single slot, even though in reality each array element would be stored in separate vec4 slots. Because of this bug, outputs[] and output_components[] would not get initialized for the second element's VARYING_SLOT, which meant emit_urb_writes() would skip writing them. Nothing used those values, and dead code elimination threw a party. To fix this, we introduce a new type_size_vec4_times_4() function which pads array elements correctly, but still counts in scalar components, generating correct indices in store_output intrinsics. Normally, varying packing avoids this problem by turning varyings into vec4s. So this doesn't actually fix any Piglit or dEQP tests today. However, if varying packing is disabled, things would be broken. Tessellation shaders can't use varying packing, so this fixes various tcs-input Piglit tests on a branch of mine. v2: Shorten the implementation of type_size_4x to a single line (caught by Connor Abbott), and rename it to type_size_vec4_times_4() (renaming suggested by Jason Ekstrand). Use type_size_vec4 rather than using type_size_vec4_times_4 and then dividing by 4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
eea3c907cc480a105224b21be51d62bc64ea1057 |
|
30-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: Do not mark used surfaces in FS_OPCODE_GET_BUFFER_SIZE Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
027b64a55afc0fe8efcf9f6217192807e285c830 |
|
30-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: Do not mark direct used surfaces in VARYING_PULL_CONSTANT_LOAD Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const and remove useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
fb77da89f51fd82d5cee95400acb20ad74d9e7bc |
|
31-Oct-2015 |
Timothy Arceri <t_arceri@yahoo.com.au> |
i965: add support for image AoA V3: clamp array index to the correct size (the size of the current array rather than the inner array) Francisco Jerez. V2: avoid useless zero-initialization and addition for the first AoA level, avoid redundant temporary, make use of type_size_scalar(), rename aoa_size to element_size, assign the indirect indexing temporary directly to image.reladdr, and replace while loop with a for loop. All suggested by Francisco Jerez. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
36fd65381756ed1b8f774f7fcdd555941a3d39e1 |
|
12-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add scalar geometry shader support. This is hidden behind INTEL_SCALAR_GS=1 for now, as we don't yet support instanced geometry shaders, and Orbital Explorer's shader spills like crazy. But the infrastructure is in place, and it's largely working. v2: Lots of rebasing. v3: (feedback from Kristian Høgsberg) - Handle stride and subreg_offset correctly for ATTRs; use a helper. - Fix missing emit_shader_time_end() call. - Delete dead code after early EOT in static vertex case to avoid tripping asserts in emit_shader_time_end(). - Use proper D/UD type in intexp2(). - Fix "EndPrimitve" and "to that" typos. - Assert that invocations == 1 so we know this is missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0b19f651958c3888588190c8c8a9e701173a2aa2 |
|
26-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Clean up FBH code. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
4379ca22f18f5731248ee794ab651db721ba38b2 |
|
07-Oct-2015 |
Emil Velikov <emil.velikov@collabora.com> |
i965: Implement nir_intrinsic_shader_clock v2: - Add a few const qualifiers for good measure. - Drop unneeded retype()s (Matt) - Convert timestamp to SIMD8/16, as fs_visitor::get_timestamp() returns SIMD4 (Connor) v3: - Remove unneeded temporary + MOV (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8c902a580a490181e7cde29073b11181db4614f8 |
|
17-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Implement ARB_fragment_layer_viewport. Normally, we could read gl_Layer from bits 26:16 of R0.0. However, the specification requires that bogus out-of-range 32-bit values written by previous stages need to appear in the fragment shader as-written. Instead, we pass in the full 32-bit value from the VUE header as an extra flat-shaded varying. We have the SF override the value to 0 when the previous stage didn't actually write a value (it's actually defined to return 0). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0cb7d7b4b7c32246d4c4225a1d17d7ff79a7526d |
|
22-Oct-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965/fs: Optimize ssbo stores Reviewed-by: Francisco Jerez <currojerez@riseup.net> Write groups of enabled components together. Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
feff21d1a6ba49a0d6f7526e1ff473a0b574c92e |
|
22-Oct-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965/fs: Drop offset_reg temporary in ssbo load Now that we don't read each component one-by-one, we don't need the temoprary vgrf for the offset. More importantly, this register was type UD while the nir source was type D. This broke copy propagation and left a redundant MOV in the generated code. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a19bf6d3ccbab6170ccfb7e04316a58f3e19396c |
|
21-Oct-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965/fs: Don't uniformize surface index twice The emit_untyped_read and emit_untyped_write helpers already uniformize the surface index argument. No need to do it before calling them. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
24a3a697e5e029767c2d210a94d47c52c5e5e299 |
|
17-Oct-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965/fs: Read all components of a SSBO field with one send Instead of looping through single-component reads, read all components in one go. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1db44252d01bf7539452ccc2b5210c74b8dcd573 |
|
20-Oct-2015 |
Ben Widawsky <benjamin.widawsky@intel.com> |
i965: Implement ARB_shader_stencil_export (gen9+) v2: remove useless source_stencil_to_render_target (Ken) Squash in the actual packing function, which also got to v2: Move the definition of the OPCODE outside of FB_WRITE opcodes (Matt) Reorder the regioning to be in VWH order (Matt) Don't retype src in the backend, just assert instead (Matt) Rename the debug prints to something better (Matt) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
48c76eae8e52fba2fe22d2cfa7f3c94a5420feb2 |
|
10-Jul-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Implement gl_InvocationID. It's stored in bits 31:27 of g1 (along with the URB handles). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c5ae34f38f239d346090212a9f33a947a3b7642e |
|
24-Sep-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Implement nir_intrinsic_load_primitive. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
6f9ca3026693e061ee55fa6d5f16d9ec0e744b59 |
|
15-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/fs: use the right number of UBOs Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
176e6930e6c24dfce7cc730faa2612d27689a4df |
|
18-Jul-2015 |
Timothy Arceri <t_arceri@yahoo.com.au> |
i965: add arrays of arrays support for varyings V2: get the correct vector elements value for outputs Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
d3f45888045c84b2bc382a34d169a0ede4774a24 |
|
09-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Adapt SSBOs to work with their own separate index space Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
baee16bf02eedc6a32381d79da6c7ac942f782ae |
|
28-Sep-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
nir: split SSBO min/max atomic instrinsics into signed/unsigned versions NIR is typeless so this is the only way to keep track of the type to select the proper atomic to use. v2: - Use imin,imax,umin,umax for the intrinsic names (Connor Abbott) - Change message for unreachable paths (Michael Schellenberger) Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2953c3d76178d7589947e6ea1dbd902b7b02b3d4 |
|
15-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Map scalar VS input locations properly; avoid tons of MOVs. Previously, we used nir_lower_io with the scalar type_size function, which mapped VERT_ATTRIB_* locations to...some numbers. Then, in fs_visitor::nir_setup_inputs(), we created temporaries indexed by those numbers, and emitted MOVs from the actual ATTR registers to those temporaries. Virtually all of these were copy propagated away, but it's still ugly. This patch reworks our input lowering to produce NIR lower_input intrinsics that properly index into the ATTR file, so we can access it directly. No changes in shader-db. v2: Fix unreachable() message (Ken), update commit message (Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
da361acd1c899d533caec6cae5a336f6ab35e076 |
|
17-Jul-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/fs: Handle non-const sample number in interpolateAtSample If a non-const sample number is given to interpolateAtSample it will now generate an indirect send message with the sample ID similar to how non-const sampler array indexing works. Previously non-const values were ignored and instead it ended up using a constant 0 value. The generator will try to determine if the sample ID is dynamically uniform via nir_src_is_dynamically_uniform. If not it will query the pixel interpolator in a loop, once for each different live sample number. The next live sample number is found using emit_uniformize. If multiple live channels have the same sample number then they will be handled in a single iteration of the loop. The loop is necessary because the indirect send message doesn't seem to have a way to specify a different value for each fragment. This fixes the following two Piglit tests: arb_gpu_shader5-interpolateAtSample-nonconst arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform v2: Handle dynamically non-uniform sample ids. v3: Remove the BREAK instruction and predicate the WHILE directly. Make the tokens arrays const. (Matt Turner) v4: Iterate over the live channels instead of each possible sample number. v5: Don't special case immediate values in brw_pixel_interpolator_query. Make a better wrapper for the function to set up the PI send instruction. Ensure that the SHL instructions are scalar. (Francisco Jerez). Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1e3c1b107e075b210998998423901092b8fcd79b |
|
03-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use nir_foreach_variable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
bf7b6fd3fd6d98305d64ee6224ca9f9e7ba48444 |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/shader: Get rid of the shader, prog, and shader_prog fields Unfortunately, we can't get rid of them entirely. The FS backend still needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still needs gl_shader_program for handling transfom feedback. However, the VS needs neither and we can substantially reduce the amount they are used. One day we will be free from their tyranny. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
756613ed35d6fd2216b5138731c0c38886b8e14a |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Use the nir info instead of pulling things out of [shader_]prog Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b62e36d18fac4a9c9977ddfa4bc2c2dbbcdad1b4 |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Move sampler unit lookup into rescale_texcoord The texunit variable we create and assign in nir_emit_texture gets passed through two more layers of function calls before it gets to its sole use in rescale_texcoord. The best part is that we already pass the sampler into rescale_texcoord so we can just look it up there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7926c3ea7d8f455cbee390d20c78dadf5432b9bc |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/backend_shader: Add a field to store the NIR shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
30c63571133ed50907ec14172c2f3ef82ee8a34e |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Move prog_data uniform setup to the codegen level As of now, uniform setup is more-or-less unified between vec4 and fs and no longer requires the fs_visitor. This makes uniform setup more of a language/API thing than a backend compiler thing. This commit moves setting up the stage_prog_data.params arrays to the same place as we set up the rest of stage_prog_data. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
cdf314cb21377ee7caca05bd1abab6a2b921d213 |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Simplify uniform setup Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7fee8b6f055831bc070bb36d02a8b1c4d601652a |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Pull GLSL uniform handling into a common function The way we deal with GLSL uniforms and builtins is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
03c4171b577b06b1d8dde50b6eb9507d8ef4c1ce |
|
29-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Pull common ARB program uniform handling into a common function The way we deal with ARB program uniforms is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
58cea0c2b63db236e6efcae930c5fb936181c2a9 |
|
30-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/shader: Pull setup_image_uniform_values out of backend_shader I tried to do this once before but Curro pointed out that having it in backend_shader meant it could use the setup_vec4_uniform_values helper which did different things in vec4 and fs. Now the setup_uniform_values function differs only by an assert in the two backends so there's no real good reason to be using it anymore. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
681b4badaedec5c9503887c4afb32485ce22c30e |
|
24-Sep-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/cs: Generate code to load gl_NumWorkGroups This code also sets cs_prog_data->uses_num_work_groups which is later used by state setup to indicate that the gl_NumWorkGroups surface needs to be setup. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
6668eb5a451c43ac78a784711cf239fdf7ca75ef |
|
11-Sep-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
mesa: rename gl_shader_program's NumUniformBlocks to NumBufferInterfaceBlocks Because it counts shader storage blocks too. v2: - Use NumBufferInterfaceBlocks instead (Jordan). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
14af6f4698a9f60c080b9adda4d3b4c45b157bd7 |
|
01-Jun-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/fs: Implement nir_intrinsic_ssbo_atomic_* Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
5b186aafe7a8d3f96a99ad2fddd2bff99d99e923 |
|
01-Jun-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/fs: Implement nir_intrinsic_load_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
337dad8ceeb4f313a47b4ddb31805f355c3fc3a5 |
|
01-Jun-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/fs: Implement nir_intrinsic_store_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f5dd2c182275a9de57e5186491012c402a6248e0 |
|
01-Jun-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/fs/nir: implement nir_intrinsic_get_buffer_size v2: - Remove inst->regs_written assignment as the instruction only writes to one register. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c5743a5d7fa62a339222ceb96d568a525d77fe0c |
|
13-Mar-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/nir: Support gl_WorkGroupID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
49f999b9cb6ecb32cb27d10b47d234a176ae4c77 |
|
13-Mar-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/nir: Support gl_LocalInvocationID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
34cff76fc2da1ce9abad6e2b1856fec6a950d19c |
|
05-Nov-2014 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/cs: Enable barrier in MEDIA_INTERFACE_DESCRIPTOR Enable barrier in MEDIA_INTERFACE_DESCRIPTOR if the program uses the barrier() GLSL function. On Ivy Bridge and Haswell, this allows the piglit test tests/spec/arb_compute_shader/execution/simple-barrier-atomics.shader_test to pass. On gen8, this enables a similar test with a local group size of 896 to pass. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
55ebaa6d003b69c0a159a00d82a1e96f685062d6 |
|
28-Aug-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
i965: add handling for imageSamples Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0b91bcea98c0fe201bba89abe1ca3aee4d04c56c |
|
12-Aug-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
i965: add support for textureSamples function Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [v2: kayden-supplied code in fs_nir replacing need for logical opcode] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0cc331dddd1a99c7af3619c92c48b5c32e17f6b3 |
|
04-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Use nir_system_value_from_intrinsic to reduce duplication. This code is all pretty much identical. We just needed the translation from one enum value to the other. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c676c432f30158190c260e7f3731ee6667ad4103 |
|
17-Aug-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Remove fs_visitor::try_replace_with_sel(). No shader-db changes on g4x, snb, hsw, or bdw. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2581fe931a48478123d8054ce7a291cffa851de9 |
|
28-Aug-2015 |
Marta Lofstedt <marta.lofstedt@intel.com> |
i965/fs: Do not set the size for zero-size uniforms Zero sized uniforms can exist in the list, but they don't get get any space allocated in prog_data->params or in the param_size array, so the size should not be set for them. This was previously fixed in: commit: 781dc7c0e1f41502f18e07c0940af949a78d2792. However, commit: 259f7291de2387aa3ac5f856b39b7b934a1d8e7d removed the fix. Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
97f4efd573aed7ffc0ea9395f4e69ccdeb5041f6 |
|
27-May-2015 |
Nanley Chery <nanley.g.chery@intel.com> |
mesa/macros: add power-of-two assertions for alignment macros ALIGN and ROUND_DOWN_TO both require that the alignment value passed into the macro be a power of two in the comments. Using software assertions verifies this to be the case. v2: use static inline functions instead of gcc-specific statement expressions (Brian). v3: fix indendation (Brian). v4: add greater than zero requirement (Anuj). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
259f7291de2387aa3ac5f856b39b7b934a1d8e7d |
|
18-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Rework uniform handling Previously, we treated the entire UNIFORM file as if it had two elements: One for direct things and one for indirect. This is substantially different from how the old visitor code handled it where each element was effectively its own uniform. This commit makes the NIR path more like the old ir_visitor path where each uniform is separate. This should allow us to more easily make decisions about what to push. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0db8e87b4a16b123f7c0b44d54f23b535a136ee6 |
|
18-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/intrinsics: Add a second const index to load_uniform In the i965 backend, we want to be able to "pull apart" the uniforms and push some of them into the shader through a different path. In order to do this effectively, we need to know which variable is actually being referred to by a given uniform load. Previously, it was completely flattened by nir_lower_io which made things difficult. This adds more information to the intrinsic to make this easier for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
6c33d6bbf9b54784e4498a81c73b712dca5dd737 |
|
12-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Pass a type_size() function pointer into nir_lower_io(). Previously, there were four type_size() functions in play - the i965 compiler backend defined scalar and vec4 type_size() functions, and nir_lower_io contained its own similar functions. In fact, the i965 driver used nir_lower_io() and then looped over the components using its own type_size - meaning both were in play. The two are /basically/ the same, but not exactly in obscure cases like subroutines and images. This patch removes nir_lower_io's functions, and instead makes the driver supply a function pointer. This gives the driver ultimate flexibility in deciding how it wants to count things, reduces code duplication, and improves consistency. v2 (Jason Ekstrand): - One side-effect of passing in a function pointer is that nir_lower_io is now aware of and properly allocates space for image uniforms, allowing us to drop hacks in the backend Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> v2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
640c472fd075814972b1276c5b0ed3a769aacda5 |
|
12-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move type_size() methods out of visitor classes. I want to use C function pointers to these, and they don't use anything in the visitor classes anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c56899f41a904762225267cb9c543a0abd901ad5 |
|
19-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Make setup_vec4_uniform_value and _image_uniform_values take an offset This way they don't implicitly increment the uniforms variable and don't have to be called in-sequence during uniform setup. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
56ebd3314bfc5895fab47586fc8cda024aac4fd8 |
|
20-Aug-2015 |
Martin Peres <martin.peres@linux.intel.com> |
i965: Fix "handle nir_intrinsic_image_size" I pushed a half-baked version of "i965: handle nir_intrinsic_image_size" by accident. Not having the Reviewed-by: tags on the last two commits should have been a red flag but I somehow missed it after the QA check. This patch should fix image-size for non-int images. I will add support to the piglit test for all the other image types. Sorry for the noise. Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
50db9c1db645c1a4d5777d2cacfd7ac74ebbe544 |
|
28-Apr-2015 |
Martin Peres <martin.peres@linux.intel.com> |
i965: handle nir_intrinsic_image_size v2, Review from Francisco Jerez: - avoid the camelCase for the booleans - init the booleans using the sampler type - force the initialization of all the components of the output register v3: - Rename a variable from CubeMapArray to CubeArray to re-use GLSL's name (Ilia) - Fix some indentation and drop parenthesis (Topi) - Fix a signed/unsigned comparaison warning Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
13a04abc277089275217dce119e18acf4d4ce52d |
|
27-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Clamp image array indices to the array bounds on IVB. This fixes the spec@arb_shader_image_load_store@invalid index bounds piglit tests on IVB, which were causing a GPU hang and then a crash due to the invalid binding table index result of the array index calculation. Other generations seem to behave sensibly when an invalid surface is provided so it doesn't look like we need to care. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a47ae8de2cf30fbe45318a18a2ea032f30ab7d10 |
|
27-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Translate image load, store and atomic NIR intrinsics. v2: Move array coordinate workaround into the surface builder. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
912ef52c29fdc373889594b963cc93c89fa9e3f7 |
|
28-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Handle image uniforms in NIR programs. v2: Move the image_params array back to brw_stage_prog_data. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8a688bee83ced46eb4bff741f05d2da033c07ade |
|
10-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Make resolve_source_modifiers consistent with the vec4 version Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e77a4a9b1f66de383043df95aada40fd5a004913 |
|
04-Aug-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Implement nir_op_imul/umul_high in terms of MULH. And get rid of another no16() call. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
db8a6de571bb72ef43209a415e5492001a87b1d8 |
|
17-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir: Add new utility method brw_glsl_base_type_for_nir_type() This method returns the glsl_base_type corresponding to a nir_alu_type. It will factorize code currently present in fs_nir, that can be reused in vec4_nir on its upcoming emit_texture support. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
97e205fd35bf77fd761caf24c611ff72cc0d85e2 |
|
17-Apr-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir: Move brw_type_for_nir_type() to brw_nir to allow reuse Upcoming NIR->vec4 pass can benefit from this method, so lets move it up. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
781dc7c0e1f41502f18e07c0940af949a78d2792 |
|
30-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Fix regression with SIMD8 VS since b5f1a48e234d47b24df38cb562cffb8941d43795. With num_direct_uniforms == 0 there's no space allocated in the param_size array for the one block of direct uniforms -- On the FS stage this would be a harmless no-op because it would simply re-set one of the param_size entries allocated for the sampler units to zero, but on the VS stage it has been reported to cause memory corruption followed by a crash -- Surprising how a full piglit run on Gen8 didn't catch it. Reported-and-reviewed-by: "Lofstedt, Marta" <marta.lofstedt@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7cb60d770fc24bf00b6f7e5898cca1426e55c026 |
|
27-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Translate memory barrier NIR intrinsics. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b5f1a48e234d47b24df38cb562cffb8941d43795 |
|
28-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Execute nir_setup_uniforms, _inputs and _outputs unconditionally. Images take up zero uniform slots in the nir_shader::num_uniforms calculation, but nir_setup_uniforms needs to be executed even if the program has no non-image uniforms so the driver-specific image parameters are uploaded. nir_setup_uniforms is a no-op if there are really no uniforms, so checking the num_uniform count is useless in any case. The nir_setup_inputs and _outputs changes shouldn't lead to any functional change, they are just meant to preserve the symmetry between them and nir_setup_uniforms. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3e5a90792d14aeb599dd236f830e6e344b35c905 |
|
05-May-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Don't overwrite fs_visitor::uniforms and ::param_size during the SIMD16 run. Image variables need to allocate additional uniform slots over nir_shader::num_uniforms. nir_setup_uniforms() overwrites the values imported from the SIMD8 visitor and then exits early before entering the nir_shader::uniforms loop, so image uniforms are never re-created. Instead leave the imported values alone, they *must* be the same for the uniform layout of both runs to be compatible. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ea0ac53f059c418d5797c495b87020f2ca2ec842 |
|
29-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Drop unused untyped surface read and atomic emit methods. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
854c4d8b37416d3e5593099a8e5441f3cf861173 |
|
05-May-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Revisit NIR atomic counter intrinsic translation. Rewrite the NIR atomic counter intrinsics translation code making use of the recently introduced surface builder. This will allow the removal of some of the functionality duplicated between the visitor and surface builder. v2: Drop VEC4 suport. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3af2623da5167aa686bcb2cff01d27058a507026 |
|
20-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Lift the constness restriction on surface indices passed to untyped ops. v2: Update NIR atomic intrinsic handling too (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b406c34a65677cac2517336d93ab279c3d35fce6 |
|
23-Jul-2015 |
Dave Airlie <airlied@redhat.com> |
i965: fix warning since tess merge. Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
fadf34773527779eef4622b2586d87ec00476c0f |
|
13-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Fix stride field for the result of emit_uniformize(). This is essentially the same problem fixed in an earlier patch for immediates. Setting the stride to zero will be particularly useful for my future SIMD lowering pass, because we will be able to just check whether the stride of a source register is zero and skip emitting the copies required to unzip it in that case. Instead of setting stride to zero in every caller of emit_uniformize() I've changed the function to return the result as its return value (previously it was being written into a caller-provided destination register), because this way we can enforce that the result is used with the correct regioning from the function itself. The changes to the prototype of its VEC4 counterpart are mainly for the sake of symmetry, VEC4 registers don't have stride. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8ba1982b1e37aa69680e243fe391254211ae273a |
|
17-Jul-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/nir/fs: removed unneeded support for global variables As functions are inlined, and nir_lower_global_vars_to_local gets run, all global variables are lowered to local variables. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b00cd6e4a0f9a84d514f428428be348900236e2e |
|
09-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Implement nir_op_uadd_carry and _usub_borrow without accumulator. This gets rid of two no16() fall-backs and should allow better scheduling of the generated IR. There are no uses of usubBorrow() or uaddCarry() in shader-db so no changes are expected. However the "arb_gpu_shader5/execution/built-in-functions/fs-usubBorrow" and "arb_gpu_shader5/execution/built-in-functions/fs-uaddCarry" piglit tests go from 40 to 28 instructions. The reason is that the plain ADD instruction can easily be CSE'ed with the original addition, and the b2i negation can easily be propagated into the source modifier of another instruction, so effectively both operations are performed with just one instruction. v2: Rely on carry_to_arith() and borrow_to_arith() to lower these (Ilia Mirkin). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3ee2daf23dc91b8dfc017b5c89c10ab1376ba4df |
|
10-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Implement b2f and b2i using negation. Booleans are represented as 0/-1 on modern hardware which means we can just negate them to convert them into a numeric type. Negation has the benefit that it can be implemented using a source modifier which can easily be propagated into some other instruction. shader-db results on HSW: total instructions in shared programs: 6349082 -> 6346693 (-0.04%) instructions in affected programs: 40948 -> 38559 (-5.83%) helped: 123 HURT: 1 GAINED: 1 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
308c0bf74307af0f3385cdcbb00aa0534ec3e5da |
|
12-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Switch on shader stage in nir_setup_outputs(). Adding new shader stages to a switch statement is less confusing than an if-else-if ladder where all but the first case are fragment shader specific (but don't claim to be). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
73d0e7f3451eaeb62ac039d2dcee1e1c6787e3db |
|
02-Jul-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Fix matNxM vertex attributes where M != 4. Matrix vertex attributes have their columns padded out to vec4s, which I was failing to account for. Scalar NIR expects them to be packed, however. Fixes 1256 dEQP tests on Broadwell. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7009e2683ebb917393d87639f549588f22c03a32 |
|
06-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/gen4-5: Enable 16-wide dispatch on shaders with control flow. This was probably disabled due to a combination of several bugs in the generator code (fixed earlier in this series) and a misunderstanding of the hardware spec. The documentation for most control flow instructions mentions among other restrictions: "Instruction compression is not allowed." This however doesn't have any implications on 16 wide not being supported, because none of the control flow instructions have multi-register operands (control flow instructions are not compressed on more recent hardware either, except maybe SNB's IF with inline compare). In fact Gen4-5 had 16-wide control flow masks and stacks, and the spec mentions in several places that control flow instructions push and pop 16 channels worth of data -- Otherwise there doesn't seem to be any indication that it shouldn't work. Causes no piglit regressions, and gives the following shader-db results on ILK: total instructions in shared programs: 4711384 -> 4711384 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 GAINED: 1215 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
493af150fb3b1c007d791b24dcd5ea8a92ad763c |
|
03-Jul-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/skl: Set the pulls bary bit in 3DSTATE_PS_EXTRA On Gen9+ there is a new bit in 3DSTATE_PS_EXTRA that must be set if the shader sends a message to the pixel interpolator. This fixes the interpolateAt* tests on SKL, apart from interpolateatsample-nonconst but that is not implemented anywhere so it's not a regression. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.6 10.5" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7abc1e3286bc4729e144d3a247c2a275e46aaf53 |
|
02-Jul-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/fs: Don't disable SIMD16 when using the pixel interpolator There was a comment saying that in SIMD16 mode the pixel interpolator returns coords interleaved 8 channels at a time and that this requires extra work to support. However, this interleaved format is exactly what the PLN instruction requires so I don't think anything needs to be done to support it apart from removing the line to disable it and to ensure that the message lengths for the send message are correct. I am more convinced that this is correct because as it says in the comment this interleaved output is identical to what is given in the thread payload. The code generated to apply the plane equation to these coordinates is identical on SIMD16 and SIMD8 except that the dispatch width is larger which implies no special unmangling is needed. Perhaps the confusion stems from the fact that the description of the PLN instruction in the IVB PRM seems to imply that the src1 inputs are not interleaved so it wouldn't work. However, in the HSW and BDW PRMs, the pseudo-code is different and looks like it expects the interleaved format. Mesa doesn't seem to generate different code on IVB to uninterleave the payload registers and everything is working so I can only assume that the PRM is wrong. I tested the interpolateAt tests on HSW and did a full Piglit run on IVB on there were no regressions. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
89bc4c78c394e50ddb16cc089bd3ec90681342d7 |
|
18-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Remove fs_inst constructors that don't take an explicit exec_size Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f7dcc1160331462a071c54ca1067f9e2f57b55be |
|
18-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Add a builder argument to offset() Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0ecdf04060518149e99a098caf4f6025fd6482a4 |
|
26-Jun-2015 |
Connor Abbott <cwabbott0@gmail.com> |
i965/fs: emit constants only once Before, we would lazily emit a MOV whenever we encountered a use of a constant. Now that we have a dedicated file for SSA values, we can instead only emit the MOV's once, which is more consistent and prevents us from relying on CSE to re-combine the constants when they aren't absorbed into the instruction. total instructions in shared programs: 6078991 -> 6073118 (-0.10%) instructions in affected programs: 402221 -> 396348 (-1.46%) helped: 1527 HURT: 0 GAINED: 8 LOST: 2 v2: split this out from the previous commit (Jason) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
864907e2f14523c130e6ff24c081789bb079bae1 |
|
24-Jun-2015 |
Connor Abbott <cwabbott0@gmail.com> |
i965/fs: use SSA values directly Before, we would use registers, but set a magical "parent_instr" field to indicate that it was actually purely an SSA value (i.e., it wasn't involved in any phi nodes). Instead, just use SSA values directly, which lets us get rid of the hack and reduces memory usage since we're not allocating a nir_register for every value. It also makes our handling of load_const more consistent compared to the other instructions. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f0e772392f1c61df6e3f253dc236eb9737fb6146 |
|
13-Mar-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/nir: Support barrier intrinsic function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
cfc175b40995ca4e590cd30897f6bb017e1376a3 |
|
10-Jun-2015 |
Chad Versace <chad.versace@intel.com> |
i965/fs: Fix unused variable warning Annotate offset_components with attribute 'unused'. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
44928b799adbbf2671c482431b3b7a390118725c |
|
08-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Remove dead IR construction code from the visitor. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
bf83a1a219af8bf82c3c721888bbe0dfc3eced34 |
|
03-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Migrate translation of NIR texturing instructions to the IR builder. v2: Don't remove assignments of base_ir just yet. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
979fe2ffee3956186017fe6c115aed53fc87ad3d |
|
03-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Migrate translation of NIR intrinsics to the IR builder. v2: Use fs_builder::SEL instead of ::emit. Use set_condmod(). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
fe88c7ae38c72ea09ced69fb12ff00f58bdf1d6e |
|
03-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Migrate translation of NIR ALU instructions to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3632c28bde071950dc57e42eb62a65fb838c8bdc |
|
03-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Migrate translation of NIR control flow to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
9976731485abb68eb3b5ae6f11a7838977b95b5b |
|
03-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Migrate NIR variable handling to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
09733f220ac9921ce7d8c3524bc5327d8203c446 |
|
03-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Migrate NIR emit_percomp() to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
546839ef639bf871feaa62ab7d811f2fc783bdcd |
|
03-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Migrate pull constant loads to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
87a4bc511811327a00f9bbc1b6870b7fa46675f7 |
|
21-May-2015 |
Martin Peres <martin.peres@linux.intel.com> |
mesa: reference built-in uniforms into gl_uniform_storage This change introduces a new field in gl_uniform_storage to explicitely say that a uniform is built-in. In the case where it is, no storage is defined to make it clear that it is read-only from the mesa side. I fixed all the places in the code that made use of the structure that I changed. Any place making a wrong assumption and using the storage straight away will just crash. This patch seems to implement the path of least resistance towards listing built-in uniforms in GL_ACTIVE_UNIFORM (and other APIs). Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2126c68e5cba79709e228f12eb3062a9be634a0e |
|
20-May-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Get rid of the array elements parameter on load/store intrinsics Previously, we used intrinsic->const_index[1] to represent "the number of array elements to load" for load/store intrinsics. However, this set to 1 by every pass that ever creates a load/store intrinsic. Also, while it might make some sense for registers, it makes no sense whatsoever in SSA. On top of that, the i965 backend was the only backend to ever support it; freedreno and vc4 just assert that it's always 1. Let's just delete it. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1e4e17fbd9296cc5064aabdb351a894d10190cb6 |
|
11-May-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Lower integer multiplication after optimizations. 32-bit x 32-bit integer multiplication requires multiple instructions until Broadwell. This patch just lets us treat the MUL instruction in the FS backend like it operates on Broadwell, and after optimizations we lower it into a sequence of instructions on older platforms. Doing this will allow us to some extra optimization on integer multiplies. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3bdbc1e436828606d0b549b9480e7cc28b42d159 |
|
07-May-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
nir: Delete all traces of nir_op_flog Nothing produces it, and nothing can consume it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e0a17f6e31a8cefc173ced5f53cb2d28a842fbb6 |
|
07-May-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
nir: Delete all traces of nir_op_fexp Nothing produces it, and nothing can consume it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e1ae0c3bc37be7b1de21ee248d674671d01da8e6 |
|
19-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Fix variable indexing of sampler arrays under non-uniform control flow. ARB_gpu_shader5 requires sampler array indexing expressions to be dynamically uniform, this however doesn't have any implications on the control flow that leads to the evaluation of that expression being uniform. Use emit_uniformize() to obtain an arbitrary live value from the binding table index calculation instead of assuming that the first channel is always live. Fixes the following Piglit test cases: arb_gpu_shader5/execution/sampler_array_indexing/fs-nonuniform-control-flow.shader_test arb_gpu_shader5/execution/sampler_array_indexing/vs-nonuniform-control-flow.shader_test part of the series: http://lists.freedesktop.org/archives/piglit/2015-February/014615.html Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b234537cc3e513ded9b5385d876e4c531f72af94 |
|
19-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Fix variable indexing of UBO arrays under non-uniform control flow. ARB_gpu_shader5 requires UBO array indexing expressions to be dynamically uniform, this however doesn't have any implications on the control flow that leads to the evaluation of that expression being uniform. Use emit_uniformize() to obtain an arbitrary live value from the binding table index calculation instead of assuming that the first channel is always live. Fixes the following Piglit tests: arb_gpu_shader5/execution/ubo_array_indexing/fs-nonuniform-control-flow.shader_test arb_gpu_shader5/execution/ubo_array_indexing/vs-nonuniform-control-flow.shader_test part of the series: http://lists.freedesktop.org/archives/piglit/2015-February/014616.html Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0c06d019bcf626b289ae94ca791dc25c216c1e5c |
|
24-Apr-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Fix code emission for imul_high in NIR. Copy over from brw_fs_visitor.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c68364ac341d5fbbc5b6dcf74812a776359c0168 |
|
10-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Use the correct offsets when handling register indirects Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
28e9601d0e681411b60a7de8be9f401b0df77d29 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Add a devinfo field to backend_visitor and use it for gen checks Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ceb6e5eebe13b85f57cf5a7a22371c10170943a3 |
|
14-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Remove the context parameter from brw_texture_offset It wasn't really being used anyway. We used it to assert that gpu_shader5 is supported in the back-end but that should be caught by the front-end. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
5af0604d528733af9113a6f8711c39796ce0ae40 |
|
07-Apr-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Calculate delta_x and delta_y together. This lets SIMD16 programs on G45 and Gen5 use the PLN instruction. On Ironlake: total instructions in shared programs: 5634757 -> 5518055 (-2.07%) instructions in affected programs: 1745837 -> 1629135 (-6.68%) helped: 11439 HURT: 4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b6354d9bb077815d2e388dc5d0e7411ea6d89748 |
|
24-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Make INTEL_DEBUG=ann work with NIR. Now that we store a copy of the NIR shader, and don't immediately free it, we can use it in annotations as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
89c1feb78d010bc457f5d02be84c955eebf3549f |
|
08-Apr-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Create NIR during LinkShader() and ProgramStringNotify(). Previously, we translated into NIR and did all the optimizations and lowering as part of running fs_visitor. This meant that we did all of that work twice for fragment shaders - once for SIMD8, and again for SIMD16. We also had to redo it every time we hit a state based recompile. We now generate NIR once at link time. ARB programs don't have linking, so we instead generate it at ProgramStringNotify time. Mesa's fixed function vertex program handling doesn't bother to inform the driver about new programs at all (which is rather mean), so we generate NIR at the last minute, if it hasn't happened already. shader-db runs ~9.4% faster on my i7-5600U, with a release build. v2: Check NirOptions != NULL in ProgramStringNotify(). Don't bother using _mesa_program_enum_to_shader_stage as we already know it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b3e286c4575bf6af343c1a03471fd876cdfb5c43 |
|
08-Apr-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Store num_direct_uniforms in the nir_shader. Storing this here is pretty sketchy - I don't know if any driver other than i965 will want to use it. But this will make it a lot easier to generate NIR code at link time. We'll probably rework it anyway. (Ian suggested making nir_assign_var_locations_scalar_direct_first simply modify the nir_shader's fields, rather than passing pointers to them. If this stays long term, we should do that. But Jason and I suspect we'll be reworking this area again in the near future.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f41f07f685e7f585e433b5fd1fadf602e74f0f1e |
|
08-Apr-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move lower_output_reads to brw_link_shader(). This makes it so emit_nir_code() doesn't modify the GLSL IR. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
024ecc783b763712d2896fd315d8b5222c27b1ec |
|
11-Apr-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs/nir: Mark fallthrough.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
99264b7f37dc92bcb3a9ae226e00c9300414431c |
|
08-Apr-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Make nir_lower_samplers take a gl_shader_stage, not a gl_program *. We don't actually need a gl_program struct. We only used it to translate prog->Target (i.e. GL_VERTEX_PROGRAM) to the gl_shader_stage (i.e. MESA_SHADER_VERTEX). We may as well just pass that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
d131630c0825f199768965c504b6fa1e593d03d5 |
|
02-Apr-2015 |
Matt Turner <mattst88@gmail.com> |
nir: Remove fsin_reduced/fcos_reduced. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1bd1fc248ce5ecc6882309ab64ec61835fea1eda |
|
03-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use brw_nir_cubemap_normalize for NIR shaders Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
cb966fb2bea77b1d7b1bdb6597b7b85d810f2d0a |
|
01-Apr-2015 |
Eric Anholt <eric@anholt.net> |
i965: Use the tex projector lowering pass instead of hand-rolling it. This only impacts the ARB_fp path. We can't quite disable the GLSL-level lowering pass, because it needs to apply before brw_do_lower_unnormalized_offset(). total instructions in shared programs: 5667857 -> 5667847 (-0.00%) instructions in affected programs: 1114 -> 1104 (-0.90%) helped: 16 HURT: 6 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b9d7454571029ab330f28164fe6869f5e455ca90 |
|
01-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Run DCE again before going out of SSA We run lowering and optimization passes that might leave garbage lying around. This keeps the FS cse from having to clean it up. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
37703040a142da6bc7c458479a70e35118e10e6b |
|
23-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Run the ffma peephole after the rest of the optimizations The idea here is that fusing multiply-add combinations too early can reduce our ability to perform CSE and value-numbering. Instead, we split ffma opcodes up-front, hope CSE cleans up, and then fuse after-the-fact. Unless an algebraic pass does something silly where it inserts something between the multiply and the add, splitting and re-fusing should never cause a problem. We run the late algebraic optimizations after this so that things like compare-with-zero don't hurt our ability to fuse things. shader-db results for fragment shaders on Haswell: total instructions in shared programs: 4390538 -> 4379236 (-0.26%) instructions in affected programs: 989359 -> 978057 (-1.14%) helped: 5308 HURT: 97 GAINED: 78 LOST: 5 This does, unfortunately, cause some substantial hurt to a shader in Kerbal Space Program. However, the damage is caused by changing a single instruction from a ffma to an add. This, in turn, *decreases* register pressure in one part of the program causing it to fail to register allocate and spill. Given the overwhelmingly positive results in other shaders and the fact that the NIR for the Kerbal shaders is actually better, this should be considered a positive. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a8c8b3b8720bb7ce8ac1cb94815ed36d8c881f66 |
|
21-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Add a dedicated ffma peephole optimization i965/nir: Use the dedicated ffma peephole total instructions in shared programs: 4418748 -> 4394618 (-0.55%) instructions in affected programs: 1292790 -> 1268660 (-1.87%) helped: 5999 HURT: 457 GAINED: 4 LOST: 9 Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
da294f9b2f666f487001b2a25627c867c40eb3d9 |
|
24-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/algebraic: Add a seperate section for "late" optimizations i965/nir: Use the late optimizations Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
826d3afb8f421a62020308813397e541e672381e |
|
30-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Add ARB_fragment_program support to the NIR backend. Use prog_to_nir where we would normally call glsl_to_nir, handle program parameter lists, and skip a few things that don't exist. Using NIR generates much better shader code than Mesa IR, since we get real optimizations, as opposed to prog_optimize: total instructions in shared programs: 314007 -> 279892 (-10.86%) instructions in affected programs: 285173 -> 251058 (-11.96%) helped: 2001 HURT: 67 GAINED: 4 LOST: 7 v2: Change early return in nir_setup_uniforms to if/else (Jordan). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
649173b473ded2d7b1aded91cd4aab42eaeb5766 |
|
01-Feb-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement texture projection support. Our fragment program backend implements support for TXP directly, and there's no NIR lowering pass to remove the projection. When we switch fragment program support over to NIR, we need to support it somehow. It's easy enough to support directly. v2: Split out offset/tex_offset rename (requested by Jordan). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
0a9bcf9e39409ea5acfdfbcf0c388e41e0f9ea45 |
|
25-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Rename offset to tex_offset to avoid shadowing offset(). fs_visitor::nir_emit_texture() created an fs_reg variable called offset, which shadowed the offset() helper function in brw_ir_fs.h. Rename the variable to tex_offset so we can still call offset(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a6d4a108d27f2b635748c583fe0507f04b3b493e |
|
18-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Use signed integer type for booleans FS instructions with NIR on i965: total instructions in shared programs: 2663561 -> 2619051 (-1.67%) instructions in affected programs: 1612965 -> 1568455 (-2.76%) helped: 5455 HURT: 12 FS instructions with NIR on g4x: total instructions in shared programs: 2352633 -> 2307908 (-1.90%) instructions in affected programs: 1441842 -> 1397117 (-3.10%) helped: 5463 HURT: 11 FS instructions with NIR on ilk: total instructions in shared programs: 3997305 -> 3934278 (-1.58%) instructions in affected programs: 2189409 -> 2126382 (-2.88%) helped: 8969 HURT: 22 FS instructions with NIR on hsw (snb and ivb were similar): total instructions in shared programs: 4109389 -> 4109242 (-0.00%) instructions in affected programs: 109869 -> 109722 (-0.13%) helped: 339 HURT: 190 No SIMD16 programs were gained or lost on any platform Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
41d64fa184671d372f6630deaf2401b00d4e984a |
|
17-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Do boolean resolves on GEN <= 5 v2: A couple comment clean-ups from Matt Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2612e569e04e29500f81ed233bd86b45ef583495 |
|
17-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Properly set the predicate on the SEL used in min/max Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
235c728020af352ee0f4b7d598c951f4a4e83232 |
|
17-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Use emit_lrp for emitting flrp Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8a0946f3b1522e5f91afe14c8c3b22ba6009ed04 |
|
06-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Make an emit_discard_jump() function to reduce duplication. This is already copied in two places, and I want to copy it to a third place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
46c35c61e9c5c1b56fdd9fcd4eb45591dd16d21d |
|
18-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Sort uniforms direct-first and use two different uniform registers Previously, we put all the uniforms into one big array. The problem with this approach is that, as soon as there was one indirect array acces, the backend would decide that the entire large array should be pull constants. This commit splits the array in half: first direct-only uniforms and then potentially-indirect uniforms. This may not be optimal, but it does let the backend promote things to push constants. Shader-db results on HSW: total instructions in shared programs: 4114840 -> 4112172 (-0.06%) instructions in affected programs: 43316 -> 40648 (-6.16%) helped: 116 HURT: 0 v2: Set param_size[num_direct_uniforms] only if we have indirect uniforms. This caused a bug that, strangely enough, only showed up on Broadwell vertex shaders. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
25db44a84597960a6aea6b252bcf2c3d7e17fc74 |
|
18-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/lower_io: Make variable location assignment a manual operation Previously, we just assigned variable locations in nir_lower_io. Now, we force the user to assign variable locations for us. This gives the backend a bit more control over where variables are placed. v2: Rename from _packed to _scalar Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
639115123efe7f71d432e24b1719adda7d23e97e |
|
18-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Use a list instead of a hash_table for inputs, outputs, and uniforms We never did a single hash table lookup in the entire NIR code base that I found so there was no real benifit to doing it that way. I suppose that for linking, we'll probably want to be able to lookup by name but we can leave building that hash table to the linker. In the mean time this was causing problems with GLSL IR -> NIR because GLSL IR doesn't guarantee us unique names of uniforms, etc. This was causing massive rendering isues in the unreal4 Sun Temple demo. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7ef0b6b367f73e24e6dd47a15d439775d3dd1297 |
|
09-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Add VS output support to nir_setup_outputs(). Adapted from fs_visitor::visit(ir_variable *). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
eb137117b7db6c78d6a1662730524d622301c708 |
|
09-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Handle VS inputs in the NIR backend. (Jason noted that this is not a good long term solution, and we should instead improve nir_lower_io so that this extra set of MOVs is unnecessary. I tend to agree, but decided we could do that as a follow-up improvement.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a5c4e7fcf52c048c02e4ee14413a574b4ff3695e |
|
09-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Refactor fs_visitor::nir_setup_inputs(). No functional change. In preparation for supporting vertex shaders, this adds a switch statement on shader stage (since vertex attributes and fragment shader varyings will need different handling). It also renames "varying" to "input", to be more general. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
34628a838aa96643be02cd23eb55af50025dd422 |
|
09-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Implement NIR intrinsics for loading VS system values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b9dea9bc45299f19c445170a4cac27810547de00 |
|
09-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Lower to registers a bit later. We can't safely call nir_optimize() with register present, since several passes called in the loop can't handle registers, and will fail asserts. Notably, nir_lower_vec_alus() and nir_opt_algebraic() really don't want registers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1f0067811c059fb3b284a2169e94fbdec7a4b909 |
|
09-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Optimize after nir_lower_var_copies(). Array variable copy splitting generates a bunch of stuff we want to clean up before proceeding. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
1d8ef6ba606a88239de633e5abcc19471c9d3cf4 |
|
09-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Store a pointer to brw_sampler_prog_key_data in the visitor. The NIR backend hardcodes brw_wm_prog_key at the moment, which won't work when we support scalar VS. We could use get_tex(), but it's a static method. I was going to promote it to fs_visitor, but then realized that both parameters (stage and key) are already members. It then occured to me that we could just set up a pointer in the constructor, and skip having a function altogether. This patch also converts all existing users to use key_tex. v2: Make key_tex a "const brw_sampler_prog_key_data *" instead of non-const; word-wrap some lines. (Review comments from Topi.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e4f26acc08a3d852e60a27d0f0da7001944cb607 |
|
28-Feb-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
i965/fs: Silence unused parameter warning brw_fs_visitor.cpp:2162:56: warning: unused parameter 'offset_components' [-Wunused-parameter] fs_reg offset_value, unsigned offset_components, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c6f2abe67e38c52361a1d342dca6ec5ed7747913 |
|
06-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Plumb the shader stage into glsl_to_nir(). The next commit needs to know the shader stage in glsl_to_nir(). To facilitate that, we pass the gl_shader rather than the raw exec_list of instructions. This has both the exec_list and the stage. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b200cbb0a41aaebb007668f870a483f0b9ecd898 |
|
06-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Add native_integers to nir_shader_compiler_options. glsl_to_nir, tgsi_to_nir, and prog_to_nir all want to know whether the driver supports native integers. Presumably other passes may as well. Adding this to nir_shader_compiler_options is an easy way to provide that information, as it's accessible via nir_shader::options. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a55da73be46b4576015417b2dff71a719bc8b797 |
|
06-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Try to make sense of the nir_shader_compiler_options code. The code in glsl_to_nir is entirely dead, as we translate from GLSL to NIR at link time, when there isn't a _mesa_glsl_parse_state to pass, so every caller passes NULL. glsl_to_nir seems like the wrong place to try and create the shader compiler options structure anyway - tgsi_to_nir, prog_to_nir, and other translators all would have to duplicate that code. The driver should set this up once with whatever settings it wants, and pass it in. Eric also added a NirOptions field to ctx->Const.ShaderCompilerOptions[] and left a comment saying: "The memory for the options is expected to be kept in a single static copy by the driver." This suggests the plan was to do exactly that. That pointer was not marked const, however, and the dead code used a mix of static structures and ralloced ones. This patch deletes the dead code in glsl_to_nir, instead making it take the shader compiler options as a mandatory argument. It creates an (empty) options struct in the i965 driver, and makes NirOptions point to that. It marks the pointer const so that we can actually do so without generating "discards const qualifier" compiler warnings. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a84f66a9b6cf46bb19ca71faca5b1d6d81209caf |
|
06-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Resolve source modifiers on Gen8+ logic operations. On Gen8+, AND/OR/XOR/NOT don't support the abs() source modifier, and negate changes meaning to bitwise-not (~, not -). This isn't what NIR expects, so we should resolve the source modifers via a MOV. +30 Piglits (fs-op-bit{and,or,xor}-not-abs-*). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
5666d9266fd43d552c76ce7b472abc0afde6c32b |
|
28-Feb-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs/nir: Mark fallthrough.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
54cd2f7c9655ccbb00209b1f49692196df2a33a1 |
|
28-Feb-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs/nir: Mark fallthrough.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b8a1637119249c1d5e76c27d0053360bbb7f4e77 |
|
27-Feb-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
i965/fs/nir: Use emit_math for nir_op_fpow It appears that all the other instructions that need it already use it. This one just got missed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8eb6c109994de2827b0a1340a2dc8d933edaf5e0 |
|
20-Aug-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Handle conditional discards. The discard condition tells us which channels we want killed. We want to invert that condition to get the channels that should survive (remain live) in f0.1. Emit a CMP to negate it. Nothing generates these today, but that will change shortly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b8582d18e6b0737c4a34777837c10898ed177e30 |
|
15-Feb-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs/nir: Optimize integer multiply by a 16-bit constant. Gen8+ support was just broken, since MUL now consumes 32-bits from both sources. Fixes 986 piglit tests on my BDW. total instructions in shared programs: 7753873 -> 7753522 (-0.00%) instructions in affected programs: 28164 -> 27813 (-1.25%) helped: 77 GAINED: 47 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7a997a386375a98b70ae5e1d880c8d47f236de8d |
|
15-Feb-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0. total instructions in shared programs: 7756214 -> 7753873 (-0.03%) instructions in affected programs: 455452 -> 453111 (-0.51%) helped: 2333 Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2bd139e18c941e7ea0870ba43314a5c10fd5bb12 |
|
19-Feb-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Un-hardcode DEBUG_WM, "FS", and "fragment". These code paths can (or will) be used for other shader stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
231267bf011e1fa6edb52ffad27fcbca8e0e28e1 |
|
31-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Use VARYING_SLOT checks rather than strcmp(). Comparing the location field is equivalent and more efficient. We'll also need this when we start using NIR for ARB programs, as our NIR converter will set the location field correctly, but probably won't use the GLSL names for these concepts. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3c57a595276d0614940d70315e78de0d83bf74ac |
|
14-Feb-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Don't support gl_FrontFacing as an input variable Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
785b22caee28892d9d995a743de1dee5434c9ce1 |
|
14-Feb-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Add support for nir_intrinsic_load_front_face Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
6489cb1ae6f3cb999b1a9c60d941ef4c388febd1 |
|
11-Feb-2015 |
Eric Anholt <eric@anholt.net> |
i965: Shut up a compiler warning about uninitialized var. We always pass this argument, even if it won't be used by the particular texture op. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ccbe15f3325d7a6d04d0ea18227a08f53decec16 |
|
03-Feb-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Fix saturate on MAD and LRP with the NIR backend. Fixes misrendering in "Witcher 2" with INTEL_USE_NIR=1, and probably many other programs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ab24e1270674192d2aeb4ba0cc39497edb3342f8 |
|
03-Feb-2015 |
Connor Abbott <cwabbott0@gmail.com> |
i965/nir: use redundant phi optimization Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
8776b1b14b229d110f283f5da8c3c36261068ede |
|
22-Jan-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Get rid of get_alu_src Originally, get_alu_src was supposed to handle resolving swizzles and things like that. However, now that basically every instruction we have only takes scalar sources, we don't really need it anymore. The only case where it's still marginally useful is for the mov and vecN operations that are left over from SSA form. We can handle those cases as a special case easily enough. As a side-effect, we don't need the vec_to_movs pass anymore. v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Rework the way we detect if we need an extra copy for swizzling. The old code involved a pile of confusing switch fall-throughs; we now use a loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
112d738b91aac44c2509aafe68bdbf9ab74bb3c1 |
|
23-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Use NIR's scalarizing abilities and stop handling vectors Now that we can scalarize with NIR, there's no need for all this code anymore. Let's get rid of it and just do scalar operations. v2: run copy prop before lowering phi nodes v3: Get rid of the "emit(...)->saturate = foo" pattern v4: Run alu_to_scalar as an optimization pass total instructions in shared programs: 5998321 -> 5974070 (-0.40%) instructions in affected programs: 732075 -> 707824 (-3.31%) helped: 3137 HURT: 191 GAINED: 18 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f02f1af9f7582bc9ca685ef240751aa57ce42638 |
|
23-Jan-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
i965/fs: Allow SIMD16 on pre-SNB when try_replace_with_sel is successful If try_replace_with_sel is able to replace the flow control with a SEL instruction, then there is no flow control... failing SIMD16 because of nonexistent flow control is wrong. No piglit regressions on any i965 platform in Jenkins. total instructions in shared programs: 4382707 -> 4382707 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 GAINED: 2089 LOST: 0 No other platforms affected in shader-db. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
d7743bb1c2d5cfe44a018251d21def18eb6d4b97 |
|
21-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Report NIR instruction counts (in SSA form) via KHR_debug. This allows us to count NIR instructions via shader-db. Use "run" as normal. The results file will contain both NIR and assembly. Then, to generate a NIR report: ./report.py <(grep NIR results/foo) <(grep NIR results/bar) Or, to generate an i965 report: ./report.py <(grep -v NIR results/foo) <(grep -v NIR results/bar) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f3e06fcc6add67ed3eeecbce600994ef3220ec1c |
|
20-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Print NIR on INTEL_DEBUG=fs. This is useful for debugging and looking for optimization opportunities. It will need to be expanded when we add support for other scalar stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
faa38e16aadd9f2a2416fcb5087d7f8fc8178bf2 |
|
20-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Do optimizations again just before lowering source mods. We want to run CSE and algebraic optimizations again after lowering IO. Some of the passes in the optimization loop don't handle saturates and other modifiers, so run it before lowering to source modifiers. total instructions in shared programs: 6046190 -> 6045768 (-0.01%) instructions in affected programs: 22406 -> 21984 (-1.88%) helped: 47 HURT: 0 GAINED: 0 LOST: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
45123ee8186cff6bb819b9c9e44d6d5a1bb41923 |
|
16-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Use offset() instead of altering reg_offset directly. offset() properly handles reg_width, so it'll work for SIMD16. While we're in the area, simplify a few cases, and use retype() to cut a few more lines of code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3f263ffbb37d77f97a86686e1d2d5eeabf4ecae6 |
|
16-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Replace fs_reg(GRF, virtual_grf_alloc(...)) with vgrf(...). brw_fs_nir.cpp creates almost all of its registers via: fs_reg reg = fs_reg(GRF, virtual_grf_alloc(num_components)); When we add SIMD16 support, we'll need to set reg->width = 16 and double the VGRF size...on pretty much every VGRF it allocates. This patch replaces that pattern with a new "vgrf" helper method: fs_reg reg = vgrf(num_components); The new function correctly takes reg_width into account. For now, reg_width is always 1, so this should have no functional change. v2: Just make vgrf() account for reg_width right away, rather than changing the behavior in the next patch. v3: Replace one last virtual_grf_alloc I missed. It's used in code that only runs for dispatch_width == 8, so it doesn't matter, but consistency is nice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
d1533d87cc7e2c39e7ce9dc838b45a2c39c96e33 |
|
16-May-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Replace fs_reg(fs_visitor, type) with fs_visitor::vgrf(type). I dislike how fs_reg has a constructor that knows about fs_visitor. Apart from that, it stands alone, with no need to interact with the rest of the compiler. Which is sensible - a class that represents a register should do just that. Allocating virtual register numbers should be left up to the compiler (fs_visitor). This patch replaces the constructor with a new fs_visitor::vgrf method, eliminating fs_reg's dependency on fs_visitor. It ends up being no more code. v2: Rebase from May 2014 -> January 2015. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c56adc68e2e75276785fd933b47621c87f9fd3ee |
|
15-Jan-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Do a final copy lowering pass before lowering locals to regs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
55b5058e69859ba28c2f32de6edf5f0df3c6c28c |
|
14-Jan-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Rename lower_variables to lower_vars_to_ssa The original name wasn't particularly descriptive. This one indicates that it actually gives you SSA values as opposed to the old pass which lowered variables to registers. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
4aa6162f6ecf96c7400c17c310eba0cfd0f5e083 |
|
10-Jan-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array This solves a number of problems. First is the ability to change the number of sources that a texture instruction has. Second, it solves the delema that may occur if a texture instruction has more than 4 sources. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
cb53aacaa1555b98fa77146492e96a7e3d7631ba |
|
17-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Handle sample ID, position, and mask better Before, we were emitting the full pile of setup instructions for sample_id and sample_pos every time they were used. With this commit, we emit them in their own pass once at the beginning of the shader and simply emit uses later on. When it comes time for setting up VS, we can put setup for its special values in the same pass. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2c7da78805175f36879111306ac37c12d33bf65b |
|
16-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Make load_const SSA-only As it was, we weren't ever using load_const in a non-SSA way. This allows us to substantially simplify the load_const instruction. If we ever need a non-SSA constant load, we can do a load_const and an imov. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
951a7f23a076c1570f68b50fc7d03a33eb5145e7 |
|
16-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Move the other lowering passes to before out-of-SSA Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
821e75a16038aba23aa0d46c081c99f07ee44ecd |
|
16-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/lower_atomics: Use/support SSA Previously, lower_atomics was non-SSA only. We assert-failed if the destination of an atomic operation intrinsic was an SSA def and we used temporary registers for computing offsets. This commit changes both of these behaviors. We now use SSA values for computing offsets (so we can optimize them) and we handle SSA destinations. We also move the pass to run before we go out of SSA on i965 as it now generates SSA values. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
dfb3abbaecfbe30b8858a5428c604f9d90f65505 |
|
13-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Remove predication We stopped generating predicates in glsl_to_nir some time ago. Right now, it's all dead untested code that I'm not convinced always worked in the first place. If we decide we want them back, we can revert this patch. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b3fd098e7daa491637d66d03366b67c989937a1f |
|
13-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Make bcsel a fully vector operation Previously, the condition was a scalar that applied to all components simultaneously. As of this commit, the condition is a vector and each component is switched seperately. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
3c2c0a164c2308a5777d7a59b6da4b44a57ba6e2 |
|
06-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Add support for indirect texture arrays v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Use the nir_tex_src_sampler_offset source type instead of the sampler_indirect thing that I cooked up before. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
62ac0ee804027d1a1fa9864e03428ced7bd8510a |
|
05-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/tex_instr: Rename the indirect source type and add an array size In particular, we rename nir_tex_src_sampler_index to _sampler_offset and add a sampler_array_size field to nir_tex_instr. This way we can pass the size of sampler arrays through to backends even after removing the variable information and, with it, the type. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
534d145e5ea039d57833395a36eed90721f6b272 |
|
09-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Use a source for uniform buffer indices instead of an index In GLSL-to-NIR we were just setting the base index to 0 whenever there was an indirect so having it expressed as a sum makes no sense. Also, while a base offset may make sense for the memory location (first element in the array, etc.) it makes less sense for the actual uniform buffer index. This may change later, but it seems to make more sense for now. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
cd4b995254fe29bae9ab5a9563cc615274d361ed |
|
05-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Make texture instruction names more consistent This commit renames nir_instr_as_texture to nir_instr_as_tex and renames nir_instr_type_texture to nir_instr_type_tex to be consistent with nir_tex_instr. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
f77f4c00ce4834ca14dd27bed28949dc012e7daf |
|
15-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Add a basic constant folding pass Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
d5410bd8f65b8d0f845dc8beccd498b6fa098660 |
|
12-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Add an algebraic optimization pass This pass uses the previously built algebraic transformations framework and should act as an example for anyone else wanting to make an algebraic transformation pass for NIR. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
919426631b7bd32f012eb9b6ffd8a9aff74788e1 |
|
13-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Add a lowering pass for adding source modifiers where possible Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
a1c259d6668bf934a79e7815dff3636783adea9f |
|
05-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e257a5112476c47928b2fa2a2f2ea3108d13264b |
|
04-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Add a has_indirect flag and clean up some of the input/output code Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
27663dbe8edfb7583d9d8fc3704a04a5c837fe05 |
|
04-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Vectorize intrinsics We used to have the number of components built into the intrinsic. This meant that all of our load/store intrinsics had vec1, vec2, vec3, and vec4 variants. This lead to piles of switch statements to generate the correct intrinsic names, and introspection to figure out the number of components. We can make things much nicer by allowing "vectorized" intrinsics. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
26865f858d48dd473fc294f7fe14c964715cd55e |
|
27-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Use the new variable lowering code This commit switches us over to the new variable lowering code which is capable of properly handling lowering indirects as we go. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7c5284d0e52add862821ab13be61228e53867e62 |
|
02-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Don't dump the shader. This is killing piglit. I'll leave the logging local Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
04fb073344b03a02d56291dd273bdef96147e857 |
|
14-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Properly saturate multiplies Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c2abfc0b86628bb1b756e4ef125c97cb4386aea2 |
|
13-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Handle SSA constants Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
e0aa4c6272851ed418dfa18ee6014f40b0e266c2 |
|
12-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Use an array rather than a hash table for register lookup Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
20adc516e27e390b1558703720a2a2129c9e8ad5 |
|
12-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Add the CSE pass and actually run in a loop Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
20a581260633cb6d0d8ca571e7f3e886298a5733 |
|
11-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Add a fused multiply-add peephole
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c937bdb3c2c41c5bf914ae7ead9223b8b87e9fe2 |
|
08-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Turn on the peephole select optimization Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2bd5a24a5e440ba0072528fdb32892cf8c935a8e |
|
07-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Validate optimization passes Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
10adf8fc858c21cd95b3e02a8d6abee563ca1046 |
|
07-Nov-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Differentiate between signed and unsigned versions of find_msb We also make the return types match GLSL. The GLSL spec specifies that findMSB and findLSB return a signed integer. Previously, nir had them return unsigned. This updates nir's behavior to match what GLSL expects. We also update the nir-to-fs generator to take the new instructions. While we're at it, we fix the case where the input to findMSB is zero. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
4285aaecdceac55005e1ea2e75e17c6490d158a9 |
|
12-Dec-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Do retyping for ALU srouces in get_nir_alu_src Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
63eb32950e64715a7a686ae9da82b55954db9ab8 |
|
22-Oct-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Convert the shader to/from SSA Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
ff0a9fcf332ce319fae1eb53f3e5d863d0289cbf |
|
21-Oct-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Don't duplicate emit_general_interpolation Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
744b4e9348db1767a772fda2a5cbe33abbba7db1 |
|
16-Oct-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Add atomic counters support Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
95fbd6e1eed58f1f87aaa425bb5312a92db29d21 |
|
15-Oct-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Handle coarse/fine derivatives Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
4582341ea74a076c981c962f1a01311bfa3bf991 |
|
16-Oct-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Add support for sample_pos and sample_id
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
7cd1537aae28b9189b1251688ac1a5dc9d36cc80 |
|
16-Oct-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
Fix up varying pull constants Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
b092bc9805f0f28209fc70fb367e0dc26e294317 |
|
16-Oct-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Use the correct texture offset immediate Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c181ff268e4787056fdee417d30d52b1098fe211 |
|
15-Oct-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Use the correct types for texture inputs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
c2ded36bb60d3dfad0036dac7adbf7718968ccf2 |
|
15-Oct-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs_nir: Make the sampler register always unsigned Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|
2faf7f87d6a1c00b3f3d3907178a2eeeefa5d2a9 |
|
15-Aug-2014 |
Connor Abbott <connor.abbott@intel.com> |
i965/fs: add a NIR frontend This is similar to the GLSL IR frontend, except consuming NIR. This lets us test NIR as part of an actual compiler. v2: Jason Ekstrand <jason.ekstrand@intel.com>: Make brw_fs_nir build again Only use NIR of INTEL_USE_NIR is set whitespace fixes
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
|