History log of /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
Revision	Date	Author	Comments
f77cecf08cf9fba5e8f62e8ac1731c1916a97618	30-Mar-2017	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Always provide a default LOD of 0 for TXS and TXL We already provide a default LOD for textureQueryLevels and texture() on non-fragment stages. However, there are more cases where one is needed such as textureSize(gsampler2DMS*) in SPIR-V. Instead of trying to list out all of the cases one at a time, just provide the default for all TXS and TXL operations. This fixes a shader validation error in the new Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 3503b2714b98684a2ceba5f4fd9a5bfbfbcaad38) /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e1e27b0917249448a481b6681aac375505f728c3	16-Feb-2017	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles When generating the MOV INDIRECT instruction, the source type is ignored and it is set to destination's type. However, this is going to change in a later patch, so we need to explicitly set the proper source type. brw_vec8_grf() creates an float type's fs_reg by default, when the ICP handle is actually unsigned. This patch fixes these cases before applying the aforementioned patch. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> (cherry picked from commit d8122128bc6bd291ff0abcb7f2e52d9cdc631527) /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
59e6c0d8aee718cf58198d5a5b2adce3e01391a6	13-Feb-2017	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/fs: fix indirect load DF uniforms on BSW/BXT The lowered BSW/BXT indirect move instructions had incorrect source types, which luckily wasn't causing incorrect assembly to be generated due to the bug fixed in the next patch, but would have confused the remaining back-end IR infrastructure due to the mismatch between the IR source types and the emitted machine code. v2: - Improve commit log (Curro) - Fix read_size (Curro) - Fix DF uniform array detection in assign_constant_locations() when it is accessed with 32-bit MOV_INDIRECTs in BSW/BXT. v3: - Move changes in assign_constant_locations() to other patch. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> (cherry picked from commit 56266df7ed9dbdf63acfd58944442893b4cd0c0b) /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a594bd19dc2344260904c51ea7b22bdc71428d64	15-Feb-2017	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Fix the inline nir_op_pack_double optimization We can only do the optimization if the source is SSA. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit a4393bd97fe62e8299273bae769201c5c9c816ea) Squashed with commit: i965/fs: Remove the inline pack_double_2x32 optimization It's broken in a number of ways. In particular, a bunch of the conditions are backwards so it doesn't actually detect what it's supposed to detect. Since it's been broken, it hasn't actually been helping anything so just deleting it isn't a regression. This (and removing another optimization) were done on master in commit b07381161777ba5d5f4a1d713f7655bcaede4139. Cc: "Kenneth Grunke" <kenneth@whitecape.org> Cc: "Mark Janes" <mark.a.janes@intel.com> [Emil Velikov: patch is a backport of the below "cherry pick"] Fixes: a4393bd97fe ("i965/fs: Fix the inline nir_op_pack_double optimization") (cherry picked from commit b07381161777ba5d5f4a1d713f7655bcaede4139) /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e6ae19944d977dc91bc45adff679337182c20683	24-Nov-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Rework gl_TessLevel[] handling to use NIR compact arrays. Treating everything as scalar arrays allows us to drop a bunch of special case input/output munging all throughout the backend. Instead, we just need to remap the TessLevel components to the appropriate patch URB header locations in remap_patch_urb_offsets(). We also switch to treating the TES input versions of these as ordinary shader inputs rather than system values, as remap_patch_urb_offsets() just makes everything work out without special handling. This regresses one Piglit test: arb_tessellation_shader-large-uniforms/GL_TESS_CONTROL_SHADER-array-at-limit The compiler starts promoting the constant arrays assigned to gl_TessLevel to uniform arrays. Since the shader also has a uniform array that uses the maximum number of uniform components, this puts it over the uniform component limit enforced by the linker. This is arguably a bug in the constant array promotion code (it should avoid pushing us over limits), but is unlikely to penalize any real application. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c5ae6e78fc3bed83c6e18be6dbc8eb86a8db0898	23-Dec-2016	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/fs: fix exec_size when emitting DIM instruction Otherwise, DIM instructions will be emitted with the default exec size which could be 16 in some cases, that is not legal. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b56fa830c6095f8226456b2aeb62f2dfad804be5	09-Dec-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Fetch one cacheline of pull constants at a time. Asking the DC for less than one cacheline (4 owords) of data for uniform pull constants is suboptimal because the DC cannot request less than that from L3, resulting in wasted bandwidth and unnecessary message dispatch overhead, and exacerbating the IVB L3 serialization bug. The following table summarizes the overall framerate improvement (with statistical significance of 5% and sample size ~10) from the whole series up to this patch for several benchmarks and hardware generations: \| SKL \| BDW \| HSW SynMark2 OglShMapPcf \| 24.63% ±0.45% \| 4.01% ±0.70% \| 10.31% ±0.38% GfxBench4 gl_manhattan31 \| 5.93% ±0.35% \| 3.92% ±0.31% \| 6.62% ±0.22% GfxBench4 gl_4 \| 2.52% ±0.44% \| 1.23% ±0.10% \| N/A Unigine Valley \| 0.83% ±0.17% \| 0.23% ±0.05% \| 0.74% ±0.45% Note that there are two versions of the Manhattan demo shipped with GfxBench4, one of them is the original gl_manhattan demo which doesn't use UBOs, so this patch will have no effect on it, and another one is the gl_manhattan31 demo based on GL 4.3/GLES 3.1, which this patch benefits as shown above. I haven't observed any statistically significant regressions in the benchmarks I have at hand. Note that the comparatively huge improvement on SKL in the OglShMapPcf test case is due to the combined effect of this patch and the register pressure benefit on SKL+ of "i965/fs: Switch to the constant cache for uniform pull constants.", part of the same series. Going up to 8 oword blocks would improve performance of pull constants even more, but at the cost of some additional bandwidth and register pressure, so it would have to be done on-demand based on the number of constants actually used by the shader. v2: Fix for Gen4 and 5. v3: Non-trivial rebase. Rework to allow the visitor to specify arbitrary pull constant block sizes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
9b22a0d295316b7547667ebbfe1e1b6182439186	09-Dec-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Expose arbitrary pull constant load sizes to the IR. Change the FS generator to ask the dataport for enough owords worth of constants to fill the execution size of the instruction -- Which means that the visitor now needs to set the execution size correctly for uniform pull constant load instructions, which we were kind of neglecting until now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
fd249c803e3ae2acb83f5e3b7152728e73228b7b	12-Dec-2016	Ilia Mirkin <imirkin@alum.mit.edu>	treewide: s/comparitor/comparator/ git grep -l comparitor \| xargs sed -i 's/comparitor/comparator/g' Just happened to notice this in a patch that was sent and included one of the tokens in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
4f2d1d6ea713df8f8d816b48b9e99c7117cf36d7	28-Nov-2016	Ilia Mirkin <imirkin@alum.mit.edu>	i965: support constant gather offsets larger than 4 bits Offsets that don't fit into 4 bits need to force gather_po to be selected. Adjust the logic so that this happens. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
faf20df143a63e58aa729446f21c38ae39a438f2	29-Nov-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Refactor handling of constant tg4 offsets Previously, we had an OFFSET_VALUE source for logical texture instructions that was intended to mean exactly what it says, "offset". In reality, we only fully used it for tg4 offsets. We used offset_value.file == IMM to mean, "you have a constant offset, go look in instr->offset" and didn't actually use the contents of the register at all in that case except for in nir_emit_texture where we used it as a temporary before we copy it into instr->offset. This commit renames OFFSET_VALUE to TG4_OFFSET and restricts its usage to indirect tg4 offsets only. The nir_emit_texture code is refactored so that we explicitly build a header_bits value which is placed in instr->offset and the constant offset values (both for tg4 and regular texture operations) are used to construct header_bits and don't go through the offset source at all. Finally, we stop passing offset_value in to lower_sampler_logical_send_gen5 because we can't do indirect offsets until gen7 anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2e311e421122e0232987fdca3645c6bd39fe2470	16-Nov-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Implement load_layer_id for fragment shaders Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b63f7671a3eafa4ab293a13f45f58837bd840a46	04-Oct-2016	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Handle compact outputs. We need to calculate the number of vec4 slots correctly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c4be6e0b8d91746eccf334b9e20861af4036d06a	15-Nov-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix GS push inputs with enhanced layouts. We weren't taking first_component into account when handling GS push inputs. We hardly ever push GS inputs, so this was not caught by existing tests. When I started using component qualifiers for the gl_ClipDistance arrays, glsl-1.50-transform-feedback-type-and-size started catching this. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b	13-Oct-2016	Timothy Arceri <timothy.arceri@collabora.com>	nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
59864e8e02057cc6fa0448a8af067a3cf53389da	13-Oct-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Don't use nir_assign_var_locations for VS/TES/GS outputs. Fixes spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3. v2: Remove nir_outputs field from fs_visitor (caught by Tim and Iago). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3728ee000aecb19793dec56d45aff9d6cfce3e5b	13-Oct-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Drop unnecessary switch statement in nir_setup_outputs() TCS and FS are skipped above. CS has no output variables. All remaining cases take the same path. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e51e055fcdf8107aafaba358fa65b00f963e1728	09-Sep-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Introduce downcast helpers for prog_data structures. Similar to brw_context(...), intel_texture_object(...), and so on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
40dd45d0c6aa4a9d727c09225967e9c3b1f45854	30-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Enable ARB_shader_atomic_counter_ops Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3d2011cb33317b0fe9b8fe989916efc1841c6ce0	30-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Refactor emission of atomic counter operations This will make it easier to add more operations. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
950af5ed40895ba7eb664a64e869cf4ae1104fc7	02-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Misc simplification. Get rid of some leftover redundant arithmetic introduced during the conversion to byte offsets and sizes that can be simplified easily. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
80e1d670b4b4c080ce2092a3b52d2415bc4c6a42	01-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Get rid of fs_inst::set_smear(). component() was generally a better alternative because of several issues set_smear() had: - It wouldn't take the original stride and offset of the register into account, which means that set_smear() on the result of e.g. another set_smear() call or an offset() call would give a bogus region as result. - It was an inherently destructive operation. See the 'nir_intrinsic_shader_clock' hunk below for how this could lead to subtle bugs in cases where set_smear() was called multiple times on the same register like 'r.set_smear(0), r.set_smear(1)' with the expectation that each call would return a separate value instead of a reference to the same subsequently mutated object. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2d7d4a791083ff63f37ac1e40bfe8b448e7f8045	02-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Simplify a bunch of fs_inst::size_written calculations by using component_size(). Using component_size() is easier and generally more correct because it takes into account the register type and stride for you. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
62aaef6c83e4eb354bd7f15803db01e90d22fc34	02-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Simplify and fix buggy stride/offset calculations using subscript(). These were bashing the 'offset' and 'stride' values of several registers without taking the previous value into account, which probably didn't matter in practice for optimize_frontfacing_ternary() because the 'tmp' register already had a known region, but it would have given the wrong region as result in the other cases in lower_integer_multiplication(). subscript(..., i) is a more straightforward way to take the i-th field of a given type from each channel of a register which should give the right answer as result regardless of the original 'offset' and 'stride' parameters of the register region. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c057278c065747c1f53579504bf109cafb7cb390	02-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Stop using fs_reg::in_range() in favor of regions_overlap(). Its only use left in the FS back-end should be using regions_overlap() instead to avoid getting a false negative result in cases where source and destination overlap but the former starts before the latter in the VGRF file. v2: Put back lost components factor (Iago). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
69570bbad876bb9da609c3b651aacda28cecc542	07-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Replace fs_inst::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
be095e11e41158f91bcb3f6fcbc2e2a91a5d9124	02-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Replace fs_reg::subreg_offset with fs_reg::offset expressed in bytes. The fs_reg::subreg_offset and ::offset fields are now redundant, the sub-GRF offset can just be added to the single ::offset field expressed in byte units. The current subreg_offset value can be recovered by applying the following rule: Replace each rvalue reference of subreg_offset like 'x = r.subreg_offset' with 'x = r.offset % reg_unit', and each lvalue reference like 'r.subreg_offset = x' with 'r.offset = ROUND_DOWN_TO(r.offset, reg_unit) + x'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
86944e063ad40cac0860bfd85a3cc4e9a9805aa3	01-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Replace fs_reg::reg_offset with fs_reg::offset expressed in bytes. The fs_reg::offset field in byte units introduced in this patch is a more straightforward alternative to the current register offset representation split between fs_reg::reg_offset and ::subreg_offset. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
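Note: the rewrite rules quoted in the 01-Sep-2016 to 07-Sep-2016 entries above (reg_offset, subreg_offset, regs_written) can be sketched as plain arithmetic. This is only an illustration, not code from brw_fs_nir.cpp: reg_sketch, REG_UNIT and the helper names are hypothetical, and the fixed 32-byte unit is an assumption (the log itself says the unit differed between uniforms and other register files).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for fs_reg; only the byte offset is modelled.
struct reg_sketch {
    uint32_t offset = 0; // single offset field, expressed in bytes
};

// Assumed register unit of 32 bytes (the log notes the real unit varied).
static const uint32_t REG_UNIT = 32;

inline uint32_t div_round_up(uint32_t a, uint32_t b) { return (a + b - 1) / b; }
inline uint32_t round_down_to(uint32_t a, uint32_t b) { return a - a % b; }

// 'x = r.reg_offset'  ->  'x = r.offset / reg_unit'
inline uint32_t get_reg_offset(const reg_sketch &r) { return r.offset / REG_UNIT; }

// 'x = r.subreg_offset'  ->  'x = r.offset % reg_unit'
inline uint32_t get_subreg_offset(const reg_sketch &r) { return r.offset % REG_UNIT; }

// 'r.reg_offset = x'  ->  'r.offset = r.offset % reg_unit + x * reg_unit'
inline void set_reg_offset(reg_sketch &r, uint32_t x) {
    r.offset = r.offset % REG_UNIT + x * REG_UNIT;
}

// 'r.subreg_offset = x'  ->  'r.offset = ROUND_DOWN_TO(r.offset, reg_unit) + x'
inline void set_subreg_offset(reg_sketch &r, uint32_t x) {
    r.offset = round_down_to(r.offset, REG_UNIT) + x;
}

// 'x = i.regs_written'  ->  'x = DIV_ROUND_UP(i.size_written, reg_unit)'
inline uint32_t regs_written(uint32_t size_written) {
    return div_round_up(size_written, REG_UNIT);
}

// Usage: each setter changes one half of the old split representation
// and leaves the other half intact.
inline void round_trip_example() {
    reg_sketch r;
    set_reg_offset(r, 3);     // whole-register part
    set_subreg_offset(r, 8);  // sub-register part
    assert(get_reg_offset(r) == 3);
    assert(get_subreg_offset(r) == 8);
}
```

The point of the sketch is that both old fields are recoverable from the single byte offset, which is why the entries above can describe the conversion as a mechanical rewrite.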
979d0aca6277975986f5f278cad0f37616c9d91f	26-Aug-2016	Jason Ekstrand <jason.ekstrand@intel.com>	intel: Rename brw_get_device_name/info to gen_get_device_name/info Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
527f37199929932300acc1688d8160e1f3b1d753	23-Aug-2016	Jason Ekstrand <jason.ekstrand@intel.com>	intel: s/brw_device_info/gen_device_info/ Generated by: sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.c sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.h sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.c sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.cpp sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
4135fc22ff735a40c36fcf051c1735fe23d154f2	19-Aug-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Hook up coherent framebuffer reads to the NIR front-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f24e393bd5caee85994b00b93f141e6c4b99e273	22-Jul-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Translate nir_intrinsic_load_output on a fragment output. This gets the non-coherent framebuffer fetch path hooked up to the NIR front-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b00a236d6a6212323f77248ba923c65eeb02592b	22-Jul-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Allocate fragment output temporaries on demand. This gets rid of the duplication of logic between nir_setup_outputs() and get_frag_output() by allocating fragment output temporaries lazily whenever get_frag_output() is called. This makes nir_setup_outputs() a no-op for the fragment shader stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7dac8820730777756c00d7024330517848dc3b9f	22-Jul-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Rework representation of fragment output locations in NIR. The problem with the current approach is that driver output locations are represented as a linear offset within the nir_outputs array, which makes it rather difficult for the back-end to figure out what color output and index some nir_intrinsic_load/store_output was meant for, because the offset of a given output within the nir_output array is dependent on the type and size of all previously allocated outputs. Instead this defines the driver location of an output to be the pair formed by its GLSL-assigned location and index (I've borrowed the bitfield macros from brw_defines.h in order to represent the pair of integers as a single scalar value that can be assigned to nir_variable_data::driver_location). nir_assign_var_locations is no longer useful for fragment outputs. Because fragment outputs are now allocated independently rather than within the nir_outputs array, the get_frag_output() helper becomes necessary in order to obtain the right temporary register for a given location-index pair. The type_size helper passed to nir_lower_io is now type_size_dvec4 rather than type_size_vec4_times_4 so that output array offsets are provided in terms of whole array elements rather than in terms of scalar components (dvec4 is the largest vector type supported by the GLSL so this will cause all individual fragment outputs to have a size of one regardless of the type). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f3cb2c34f29d35088879a6b8101c3ac648e0febf	22-Jul-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Special-case nir_intrinsic_store_output for the fragment shader. I'm about to change how fragment shader output locations are represented, so the generic nir_intrinsic_store_output implementation that assumes that outputs are just contiguous elements in the big nir_outputs array won't work anymore. This somewhat simplified implementation of nir_intrinsic_store_output for fragment shaders should be functionally equivalent to the current fall-back one. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
af0cc743e607293146861518bb6ef96f411aeca9	22-Jul-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Implement non-coherent framebuffer fetch using the sampler unit. v2: Memoize sample ID, misc codestyle changes. (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
4a87e4ade778e56d43333c65a58752b15a00ce69	21-Jul-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Get rid of fs_visitor::do_dual_src. This boolean flag was being used for two different things: - To set the brw_wm_prog_data::dual_src_blend flag. Instead we can just set it based on whether the dual_src_output register is valid, which will be the case if the shader writes the secondary blending color. - To decide whether to call emit_single_fb_write() once, or in a loop that would iterate only once, which seems pretty useless. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
d14dd727f4aded5bd34a78dc2c81374a78114440	17-Aug-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix barrier count shift in scalar TCS backend. The "Barrier Count" field goes in 14:9 of m0.2. The vec4 backend correctly shifts by 9, but the scalar backend only shifted by 8. It's not like this changed - I think I just made a typo when writing the original scalar TCS backend code. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
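Note: the off-by-one shift described in the entry above can be illustrated with a few lines of bit arithmetic. This is a sketch only; the function and constant names are hypothetical and nothing here is taken from the driver beyond the stated field position (bits 14:9) and the two shift amounts (9 and 8).

```cpp
#include <cassert>
#include <cstdint>

// Six-bit field occupying bits 14:9 of a dword, per the entry above.
static const uint32_t BARRIER_COUNT_SHIFT = 9;
static const uint32_t BARRIER_COUNT_MASK = 0x3fu << BARRIER_COUNT_SHIFT; // 0x7e00

// Correct placement: shift the count by 9.
inline uint32_t barrier_count_bits(uint32_t count) {
    return (count & 0x3fu) << BARRIER_COUNT_SHIFT;
}

// The mistake the entry describes: shifting by 8 instead of 9.
inline uint32_t barrier_count_bits_shift8(uint32_t count) {
    return (count & 0x3fu) << 8;
}

// Read the field back the way a consumer of bits 14:9 would see it.
inline uint32_t decode_barrier_count(uint32_t dword) {
    return (dword & BARRIER_COUNT_MASK) >> BARRIER_COUNT_SHIFT;
}

// Usage: with the shift of 8, a count of 1 falls outside the field
// entirely and a count of 2 is read back as 1.
inline void barrier_shift_example() {
    assert(decode_barrier_count(barrier_count_bits(2)) == 2);
    assert(decode_barrier_count(barrier_count_bits_shift8(1)) == 0);
    assert(decode_barrier_count(barrier_count_bits_shift8(2)) == 1);
}
```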
159f0377556c45630cdc0721b193f34217a329b0	17-Aug-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix execution size of scalar TCS barrier setup code. Previously, the scalar TCS backend was generating: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(8) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all 1Q }; shl(8) g17.2<1>UD g17.2<8,8,1>UD 0x0000000bUD { align1 WE_all 1Q }; or(8) g17.2<1>UD g17.2<8,8,1>UD 0x00008200UD { align1 WE_all 1Q }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; This is rubbish - g17.2<8,8,1>UD spans two registers, and is an illegal region. Not to mention it clobbers 8 channels of data when we only wanted to touch m0.2. Instead, we want: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(1) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all }; shl(1) g17.2<1>UD g17.2<0,1,0>UD 0x0000000bUD { align1 WE_all }; or(1) g17.2<1>UD g17.2<0,1,0>UD 0x00008200UD { align1 WE_all }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; Using component() accomplishes this. Fixes GL44-CTS.tessellation_shader.tessellation_shader_tc_barriers.barrier_guarded_read_write_calls on Skylake. Probably fixes other barrier issues on Gen8+. v2: Use a group(1, 0) builder so inst->exec_size is set correctly (thanks to Francisco Jerez for catching that it was incorrect). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1] Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0c754d1c4203d87dbb9d2dd882ef42686e6d01ec	12-Aug-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Lower TEX to TXL during NIR translation. This simplifies the code slightly and will allow the SIMD lowering pass to find out easily what the actual texturing opcode is in order to determine the maximum execution size of texturing instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
027cbf00f248bda325521db8f56a3718898da46b	02-Aug-2016	Mathias Fröhlich <mathias.froehlich@web.de>	util: Move _mesa_fsl/util_last_bit into util/bitscan.h As requested with the initial creation of util/bitscan.h now move other bitscan related functions into util. v2: Split into two patches. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
875341c69b99dea7942a68c9060aa31a459e93fc	02-Aug-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Rework the unlit centroid workaround. Previously, for every input, we moved the dispatch mask to the flag register, then emitted two predicated PLN instructions, one with centroid barycentric coordinates (for normal pixels), and one with pixel barycentric coordinates (for unlit helper pixels). Instead, we can simply emit a set of predicated MOVs at the top of the program which copy the pixel barycentric coordinates over the centroid ones for unlit helper pixel channels. Then, we can just use normal PLNs. On Sandybridge: total instructions in shared programs: 7538470 -> 7534500 (-0.05%) instructions in affected programs: 101268 -> 97298 (-3.92%) helped: 705 HURT: 9 (all of which are SIMD16 programs) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
12a912586f11ccbc4612532d5ceaf1bdd0cdb45a	29-Jul-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Use a separate register for every access to an SSA undef. Previously, we allocated a new VGRF for every undefined definition. Instead, this patch makes us allocate a new VGRF for every use of an undefined definition. This makes sure that undefined values are fully independent of one another, and have live ranges limited to their single use. This allows register coalescing to combine the source and destination of MOVs from undefined sources, eliminating the MOV altogether. On Broadwell: total instructions in shared programs: 11641187 -> 11640214 (-0.01%) instructions in affected programs: 70199 -> 69226 (-1.39%) helped: 213 HURT: 1 v2: Add a comment (based on Iago's suggested one). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a2b3c146d2017a626be66dcf43753d545e902c52	22-Jul-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: fix varying output setup Since 7f53fead5c we treat every location as using all four components so we only need special handling for doubles when they cross multiple locations. This fixes a crash in GL45-CTS.enhanced_layouts.varying_locations where the outputs array would overflow when a dmat2 was stored at the max varying location i.e 30. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
be1c53d2cf2b12655ff69caac49cca75a55e63e0	22-Jul-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix "operation operation" in comment. From the redundant redundant department. Reported-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
76e161056a424e5b9c35b02a9f4e520c8c44cf2b	18-Jul-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix shared atomic intrinsics to pay attention to base. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ad5dd39984467b29d20e03ec8bd26f6f1d2e97ad	14-Jun-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: add component packing support for load_output intrinsics Here we use the component qualifier (which is the first component) as an offset when loading output varyings. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7f53fead5cf9a85c74a94d359dd5fccfbb87856c	23-May-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: enable component packing for vs and fs Rather than trying to work out the total number of components used at a location we simply treat all outputs as vec4s. This removes the need for complex code looping over varyings to match packed locations and the need for storing the total number of components used at each location. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3dba8516d6468866f2534f517358a6243eb0995e	20-Jul-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Move VS load_input handling to nir_emit_vs_intrinsic(). TCS/TES/GS and now FS all handle these in stage-specific functions. CS don't have inputs, so VS was the only one left using this code. Move it to the VS-specific function for clarity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1eef0b73aa323d94d5a080cd1efa81ccacdbd0d2	12-Jul-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Rewrite FS input handling to use the new NIR intrinsics. This eliminates the need to walk the list of input variables, recurse into their types (via logic largely redundant with nir_lower_io), and interpolate all possible inputs up front. The backend no longer has to care about variables at all, which eliminates complications from trying to pack multiple variables into the same location. Instead, each intrinsic specifies exactly what's needed. This should unblock Timothy's work on GL_ARB_enhanced_layouts. Each load_interpolated_input intrinsic corresponds to PLN instructions, while load_barycentric_at_* intrinsics correspond to pixel interpolator messages. The pixel/centroid/sample barycentric intrinsics simply refer to payload fields (delta_xy[]), and don't actually generate any code. Because we use a single intrinsic for both centroid-qualified variables and interpolateAtCentroid(), they become indistinguishable. We stop sending pixel interpolator messages for those, and instead use the payload provided data, which should be considerably faster. On Broadwell: total instructions in shared programs: 9067751 -> 9067570 (-0.00%) instructions in affected programs: 145902 -> 145721 (-0.12%) helped: 422 HURT: 209 total spills in shared programs: 2849 -> 2899 (1.76%) spills in affected programs: 760 -> 810 (6.58%) helped: 0 HURT: 10 total fills in shared programs: 3910 -> 3950 (1.02%) fills in affected programs: 617 -> 657 (6.48%) helped: 0 HURT: 10 LOST: 3 GAINED: 3 The differences mostly appear to be slight changes in MOVs. v2: Use nir_shader_compiler_options::use_interpolated_input_intrinsics flag rather than passing it directly to nir_lower_io. Use the unreachable() macro rather than assert in one place. (Review feedback from Chris Forbes.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
96dfed49e47eac7afc100e5b8d3b316dd6652fb6	19-Jul-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Stop muging cube array lengths by 6 From the Sky Lake PRM: "For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port Surfaces, the range of this field is [0,340], indicating the number of cube array elements (equal to the number of underlying 2D array elements divided by 6). For other surfaces, this field must be zero." In other words, the depth field for cube maps is in number of cubes not number of 2-D slices so we need to divide by 6. ISL will do this correctly for us assuming that we provide it with the correct array bounds which it expects to be in 2-D slices. It appears as if we've been doing this wrong ever since we first added cube map arrays for Sandy Bridge and the change to ISL made things slightly worse. While we're at it, we now need to remove the shader hacks we've always done since they were only needed because we were setting the depth field six times too large. v2: Fix the vec4 backend as well (not sure how I missed this). Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3e7cebc8da5c9f16fa1b9a25ea72b8d31c86a440	22-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Use LZD to implement nir_op_find_lsb on Gen < 7 v2: Rebase on changes to previous two patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c2019c6c261d5c46a4e5d3edc88836bcedf75f30	22-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Use LZD to implement nir_op_ifind_msb on Gen < 7 v2: Retype LZD source as UD to avoid potential problems with 0x80000000. Suggested by Matt. Also update comment about problem values with LZD(abs(x)). Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
de20086eed47e6bfe7c25835d72383114f99c7a9	22-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Use LZD to implement nir_op_ufind_msb This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0f2516d88f6607b2816445c2dc18607cdaf1beff	15-Jul-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/tes/scalar: fix 64-bit indirect input loads We totally ignored this before because there were no piglit tests for indirect loads in tessellation stages with doubles. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1737e75bfb85eb22a30e4f1c69a825b3abd946f6	15-Jul-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/tcs/scalar: only update imm_offset for second message in 64bit input loads Our indirect URB read messages take both a direct and an indirect offset so when we emit the second message for a 64-bit input load we can just always increment the immediate offset, even for the indirect case. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
18f67c8a69fcde5d3f585effeef670d0861b0730	14-Jul-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Move pulls_bary setting to emit_pixel_interpolator_send(). pulls_bary should be set when the shader uses a pixel interpolator message. So, setting it from the function that emits pixel interpolator messages makes a lot of sense. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7ef7738a61ded5632105b8de6f8141307592e20a	15-Jul-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Write gl_FragCoord directly to the destination. This patch makes emit_general_interpolation take a destination register as an argument, and write directly to that. This is simpler than the old approach of ralloc'ing a register, writing to that temporary, and then making the caller emit per-component MOVs to copy it to the actual destination. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ac1181ffbef5250cb3b651e047cce5116727c34c	07-Jul-2016	Kenneth Graunke <kenneth@whitecape.org>	compiler: Rename INTERP_QUALIFIER_* to INTERP_MODE_*. Likewise, rename the enum type to glsl_interp_mode. Beyond the GLSL front-end, talking about "interpolation modes" seems more natural than "interpolation qualifiers" - in the IR, we're removed from how exactly the source language specifies how to interpolate an input. Also, SPIR-V calls these "decorations" rather than "qualifiers". Generated by: $ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \ -e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \ -e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \; Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Dave Airlie <airlied@redhat.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
94135e8736f2741684e978afac9d34c368f7bcb1	07-Jul-2016	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/fs: emit DIM instruction to load 64-bit immediates in HSW v2 (Matt): - Use brw_imm_df() as source argument of DIM instruction. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
87a13f598b1ecd50bc209088cf1dc60fd90df015	11-Jul-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: use the new helper function to create double immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
9e196e907ee87bff2b8c215df5e31a0cd1d1a322	09-Mar-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: add a helper function to create double immediates Gen7 hardware does not support double immediates so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation. v2: - Define setup_imm_df() as an independent function (Curro) - Create a specific builder to get rid of some instruction field assignments (Curro). v3: - Get devinfo from builder (Kenneth) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
27e28197e8e82e8c47fda5d6e912c5cb62c03f4a	10-Jun-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: add double packing support to tess stages Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8b80e9c31db62ccf54ab593b47016ea514dec81c	10-Jun-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: add double support packing support to gs inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
9d9b0b54cdc212c372ac67cc14d7ba1a16cc69ef	22-May-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: add indirect packing support to gs load inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2477e6cfada55563631c654fce9250e4fe276f0e	23-May-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: add indirect packing support for tcs and tes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2bda4b062f62edac1011bf65f410eeca176b5e23	20-May-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: add component packing support for tcs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
cfff71a47a655e8cf930e858d408dc4db942ec7c	19-May-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: add component packing support for tes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a102ef2d4fd01a946f949a45115d65abb6714a5b	19-May-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: add component packing support for gs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
255cff76d961e56199acab2ab523140e43ea2de2	23-Jun-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Drop unnecessary inst->base_mrf = -1 assignments. These are now unnecessary, as base_mrf is -1 by default. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
bdab572a86f27b92ba10124f85d278e9c8861fff	13-Jun-2016	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXT From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions, page 844: "When source or destination datatype is 64b or operation is integer DWord multiply, indirect addressing must not be used." v2: - Fix it for Broxton too. v3: - Simplify code by using subscript() and not creating a new num_components variable (Kenneth). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0177dbb6c2fe876a9761a4a97eec44accfa4c007	13-Jun-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXT From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions: "When source or destination is 64b (...), regioning in Align1 must follow these rules: 1. Source and destination horizontal stride must be aligned to the same qword. (...)" v2: - Fix it for Broxton too. v3: - Remove inst->regs_written change as it is not necessary (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a8a9d1bf41c00123cefb6e757f3509c62e880a15	14-Jun-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: remove type_size_vec4_times_4() type_size_vec4_times_4() was introduced as a fix in 8dcf807cb43383 however since 3810c1561 we can just use type_size_scalar() and get the actual number of outputs we need. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2b648ec17c2934802dd56452d11d78ec2d525a06	27-May-2016	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/gs/scalar: Fix load input for doubles Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2d6f82a294ad1ab1eab0020cf65df5ecc9591272	26-May-2016	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/fs: fix offset when loading double vector input varyings When we are not packing a double input varying, we might need to read its data in a non-aligned to 64-bit offset, so we read the wrong data. This is happening when using explicit locations in varyings because Mesa disables packing varying for that case. const_index is in 32-bit size units but offset() is multiplying it by destination type size units. When operating with double input varyings, const_index value could be not aligned to 64 bits. To fix it, we load the double vector as if it was a float based vector with twice the number of components. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3fb289f957a8a27349a6f7df03983f92d9b6cf64	02-Jun-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs Add a wm_prog_data bit for has_side_effects This is more accurate than calling _mesa_active_fragment_shader_has_side_effects because it looks at whether or not the SSBOs, images, or atomic buffers are actually written rather than just existing in the program. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0a3acff5b53d409181dcd2f31a4a50af06f73a57	23-May-2016	Jordan Justen <jordan.l.justen@intel.com>	i965: Remove old CS local ID handling The old method pushed data for each channel's uvec3 data of gl_LocalInvocationID. The new method pushes 1 dword of data that is a 'thread local ID' value. Based on that value, we can generate gl_LocalInvocationIndex and gl_LocalInvocationID with some calculations. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8f48d23e0fcc0809f6397a67c26751a45a95e076	23-May-2016	Jordan Justen <jordan.l.justen@intel.com>	i965: Add nir channel_num system value v2: * simd16/32 fixes (curro) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
25e1b8d366a6131bc9d46fe27f6bc476f05a7a58	01-Jun-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix isoline reads in scalar TES. Isolines aren't reversed. commit 5b2d8c2273c6f fixed this for the vec4 TES backend, but not the scalar one. Found while debugging GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_tessLevel. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b27dfa5403ed1884999524417c08d2bc50365965	24-May-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: If control_data_header_size_bits is zero, don't do EndPrimitive This can occur when max_vertices=0 is explicitly specified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
796238d9e6eee0b942d34c57bd8bdf0f9c98b6c3	18-May-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Use SIMD8 SSBO GET_BUFFER_SIZE message regardless of the dispatch width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
29e471725115edf941458c5be0bb7e93218ddd0f	18-May-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Don't emit duplicated SSBO GET_BUFFER_SIZE instruction unnecessarily. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a55452530f7525e9cf5d2619bef66a61b488b4af	26-Apr-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Emit fixed width memory fence opcode regardless of the dispatch width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
29ce110be6d0d4e4df51be635810f528f7dd7f40	19-May-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Remove extract virtual opcodes. These can be easily represented in the IR as a MOV instruction with strided source so they seem rather redundant. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a8e7b4f1d9ec50d2214e7694da26af6a108e506f	20-May-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Handle SAMPLEINFO consistently like other texturing instructions. Seems like this texturing opcode was missing its logical counterpart which would prevent it from taking advantage of the SIMD lowering infrastructure, define it and plumb it through the back-end. At some point we'll likely want to emit a single SAMPLEINFO message shared among all channels irrespective of this change, but for the moment this should be enough to get the intrinsic working in SIMD32 mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
47e2a57fe955c04763c979ff4ca61c6867fa05bb	18-May-2016	Jordan Justen <jordan.l.justen@intel.com>	i965/compute: Fix uniform init issue when SIMD8 is skipped In d8347f12ead89c5a58f69ce9283a54ac8487159c, we added support for skipping SIMD8 generation when the program local size is too large for SIMD8 to be usable. This change was missed in that commit. This bug would impact gen7 platforms when the compute shader local size is greater than 512, and gen8 platforms when the local size is greater than 448. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e58fabc93a25ccc910369f3638b302d46de12271	24-May-2016	Jordan Justen <jordan.l.justen@intel.com>	i965/gen7: Fix gl_HelperInvocation It appears that UV immediates aren't working on Ivy Bridge. In this case, a signed version will work, and this fixes the piglit tests/spec/glsl-4.50/execution/helper-invocation.shader_test test. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
44997fc0c1cc7f24216e3b1c5d954919df946ee5	02-May-2016	Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	i965: Support textures with multiple planes Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
015035027beb38fb9a3b06f8cd94aadc96a8f728	23-May-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Mark UBO uniform pull constant loads as force_writemask_all. This lets the rest of the backend know that the uniform pull constant load opcodes don't respect channel enables -- Without this the register allocator has no way to know that the return payload of a pull constant load is not per-channel and spills of the destination will be broken under non-uniform control flow. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b46867cd378e5fb135fd060d50c8028d3dac622a	19-May-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: do not depend on std140 alignment rules for UBO loads The previous implementation relied on the std140 alignment rules to avoid handling misalignment in the case where we are loading more than 2 double components from a vector, which requires to emit a second load message. This alternative implementation deals with misalignment and is more flexible going forward. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
dac10e8a1390711f1f36f224644c4a33586cebe3	17-May-2016	Kenneth Graunke <kenneth@whitecape.org>	i965, anv: Use NIR FragCoord re-center and y-transform passes. This handles gl_FragCoord transformations and other window system vs. user FBO coordinate system flipping by multiplying/adding uniform values, rather than recompiles. This is much better because we have no decent way to guess whether the application is going to use a shader with the window system FBO or a user FBO, much less the drawable height. This led to a lot of recompiles in many applications. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
fb5dcb81cc121e4355b7eef014474a5c42a2f6db	19-May-2016	Matt Turner <mattst88@gmail.com>	i965: Pass nir_src/nir_dest by reference. Cuts 6K of .text. text data bss dec hex filename 5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so before 5766074 264648 29320 6060042 5c780a lib/i965_dri.so after Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
07353599e07529e98494057f556b9d96c1df5cfd	05-May-2016	Matt Turner <mattst88@gmail.com>	i965/fs: Add and use get_nir_src_imm(). The next patch wants to inspect the LOD argument and do something different if it's 0.0f. But at that point we've emitted a MOV for it and we just have a register to look at. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
cbb0e3a7e8fffa4d5c5af8660d99cd3da8af97ec	17-May-2016	Matt Turner <mattst88@gmail.com>	i965/fs: Assert that nir_op_extract_*'s src1 is a constant. /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ccfe25f7583dd8d0ff0609de3728c8b15fb0f8fb	31-Mar-2016	Juan A. Suarez Romero <jasuarez@igalia.com>	i965/fs: shuffle 32bits into 64bits for doubles VS Thread Payload handles attributes in URB as vec4, no matter if they are actually single or double precision. So with double-precision types, value ends up in the registers split in 32bits chunks, in different positions. We need to shuffle the chunks to get the doubles correctly. v2: * Extra blank line. Add { } on if body (Ian Romanick) * Use dest directly (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
58f304defe804a6f01b0b961997ecfe61fe00d34	09-May-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/tes/scalar: Fix load input for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
61197b8d5dd963bd9288385308feb3f0dcaf6742	09-May-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/tcs/scalar: fix store output for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
cda3435ea85904a17c5c23a7c044e59ba0181b96	09-May-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/tcs/scalar: fix load input for doubles v2: do not write to the original indirect_offset since that is an expression that could be used somewhere else (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
66192b3c16b09fa7ba97574103fc3d883b3cbfdb	09-May-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: fix nir_intrinsic_store_output for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3cce67aff09a4c248e9a69a8b05a63ac6b3e4878	09-May-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: fix number of output components for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8c6d147373cbdefef5945b00626bb62bb03198ca	26-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: support doubles with shared variable stores This is pretty much the same we do with SSBOs. v2: do not shuffle in-place, it is not safe since the original 64-bit data could be used after the write, instead use a temporary like we do for SSBO stores (Iago) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
943f9442bf7943a992730e642e91ed874d50790c	25-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: support doubles with ssbo stores Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b9aa66aa516c100d5476ee966f428aaf743d786c	25-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: add shuffle_64bit_data_for_32bit_write helper This does the inverse operation of shuffle_32bit_load_result_to_64bit_data and we will use it when we need to write 64-bit data in the layout expected by untyped write messages. v2 (curro): - Use subscript() instead of stride() - Assert on the input types rather than silently retyping. - Use offset() instead of horiz_offset(), drop the multiplier definition. - Drop the temporary vgrf and force_writemask_all. - Make component_i const. - Move to brw_fs_nir.cpp v3 (curro): - Pass dst and src by reference. - Simplify allocation of tmp register. - Move to brw_fs_nir.cpp. - Get rid of the temporary. v3 (Iago): - Check that the src and dst regions do not overlap, since that would typically be a bug in the caller. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
33f7ec18ac399719df06ab7031cb43965e6793be	25-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: support doubles with SSBO loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8aa01ac596fc0722058e10808c8141533c3fd1fe	05-May-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: support doubles with shared variable loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
6eab06b866916d4fd52adf7b8bb6113948a3811a	05-May-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: Add do_untyped_vector_read helper We are going to need the same logic for anything that reads doubles via untyped messages (CS shared variables and SSBOs). Add a helper function with that logic so that we can reuse it. v2: - Make this a static function instead of a method of fs_visitor (Iago) - We only support types with a size of 4 or 8 (Curro) - Avoid retypes by using a separate vgrf for the packed result (Curro) - Put dst parameter before source parameters (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b86d4780ed203b2a22afba5f95c73b15165a7259	13-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: support doubles with UBO loads UBO loads with constant offset use the UNIFORM_PULL_CONSTANT_LOAD instruction, which reads 16 bytes (a vec4) of data from memory. For dvec types this only provides components x and y. Thus, if we are reading more than 2 components we need to issue a second load at offset+16 to read the next 16-byte chunk with components z and w. UBO loads with non-constant offset emit a load for each component in the vector (and rely on CSE to fix redundant loads), so we only need to consider the size of the data type when computing the offset of each element in a vector. v2 (Sam): - Adapt the code to use component() (Curro). v3 (Sam): - Use type_sz(dest.type) in VARYING_PULL_CONSTANT_LOAD() call (Curro). - Add asserts to ensure std140 vector alignment rules are followed (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
58f1804c4f38b76c20872d6887b7b5e6029e0454	18-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: fix pull constant load component selection for doubles UNIFORM_PULL_CONSTANT_LOAD is used to load a contiguous vec4 starting at a constant offset that is 16-byte aligned. If we need to access an unaligned offset we emit a load with an aligned offset and use the remaining constant offset to select the component into the vec4 result that we are interested in. This component must be computed in units of the type size, since that is what fs_reg::set_smear expects. This patch does this change in the two places where we use this message: In demote_pull_constants when we lower uniform access with constant offset into the pull constant buffer and in UBO loads with constant offset. v2 (Sam): - Fix set_smear() in fs_visitor::lower_constant_loads(), take into account source type instead and remove MAX2 (Curro). - Improve changes to nir_intrinsic_load_ubo case in nir_emit_intrinsic() (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
50b7676dc46bae39c5e9b779828ef4fb2e1fbefc	22-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: add shuffle_32bit_load_result_to_64bit_data helper There will be a few places where we need to shuffle the result of a 32-bit load into valid 64-bit data, so extract this logic into a separate helper that we can reuse. v2 (Curro): - Use subscript() instead of stride() - Assert on the input types rather than retyping. - Use offset() instead of horiz_offset(), drop the multiplier definition. - Don't use force_writemask_all. - Mark component_i as const. - Make the function name lower case. v3 (Curro): - Pass src and dst by reference. - Move to brw_fs_nir.cpp Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c907ca6c8d256f4b8c271bcf0901661ef943ae08	13-May-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Flip interpolateAtOffset's y offset when necessary. Fixes 4 dEQP-GLES31.functional.shaders.multisample_interpolation tests: - interpolate_at_offset.no_qualifiers.default_framebuffer - interpolate_at_offset.centroid_qualifier.default_framebuffer - interpolate_at_offset.sample_qualifier.default_framebuffer - interpolate_at_offset.array_element.default_framebuffer Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
50e5e1f747ad820eb491e093600a4bde9c13efba	03-May-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Implement the new NIR MCS texturing Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3aa542c65760c7e9b92a41d850677a44879cc5c7	09-May-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Delete bogus assertion in emit_gs_input_load(). This looks like leftover cruft from an earlier attempt at writing point size hacks. Each vertex has its own copy of gl_PointSize, so accessing any vertex other than 0 would cause this to fail. The tests seem to work fine without it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1c41cb58def637c9e033cb7bf108f1096c9ae63c	08-May-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Support instanced GS inputs in the scalar backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
5fc37726501bc65f3bbaef2573ac89e980f1a412	08-May-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Use an early return for the push case in emit_gs_input_load(). Just trying to keep things from getting too ugly in the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
75ada43a3af88835de6a83ed453d4ed512df0412	19-Apr-2016	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT v2: - Fix assert's line width (Topi). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
efaf62a40a95b240cab7b0f371c7178aa19b7f3a	12-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: implement i2d and u2d Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c63a6f21494685d41d51887901298639c4d32c22	18-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: implement d2i and d2u These need the same treatment as d2f, so generalize our d2f lowering to cover these too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e0c45182e3d865d7f187dc35e70832f1fa7c9fad	18-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: implement d2b v2: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
80f60a4302c8bd805882baaf60db72cf785593e3	07-Jan-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: implement fsign() for doubles v2 (Sam): - Fix indentation (Kenneth) - Simplify code (Kenneth) v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e8a8fc956358fb5e0f776b39fdbce9247bb5538a	10-Nov-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: We only support 32-bit integer ALU operations for now Add asserts so we remember to address this when we enable 64-bit integer support, as suggested by Connor and Jason. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a644b0939dd8284bca25042bccd2439c173dd7d7	30-Jul-2015	Connor Abbott <connor.w.abbott@intel.com>	i965/fs: add support for f2d and d2f Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e83f51d54e9c3db11526b66a741352135eae6f52	04-Aug-2015	Connor Abbott <connor.w.abbott@intel.com>	i965/fs: fix compares for doubles The destination has to have the same type as the source, or else the simulator will complain. As a result, we need to emit a CMP that outputs a 64-bit wide result and then do a strided MOV to pick out the low 32 bits of each channel. v2: Use subscript() instead of stride() (Curro) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
935e0e305dd7a4f67557e969513a30357d308efb	19-Apr-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: optimize unpack double When we are actually unpacking from a double that we have previously packed from its 32-bit components we can bypass the pack operation and source from its arguments directly. v2 (Sam): - Fix line overflow (Topi) - Bail if the parent instruction's source is not SSA (Connor) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ba1907f040e9d61be932a8e098061d94d4ba30cb	19-Apr-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: optimize pack double When we are actually creating a double using values obtained from a previous unpack operation we can bypass the unpack and source from the original double value directly. v2: - Style changes (Topi) - Bail if parent instruction's src is not SSA (Connor) v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7782f39e759798975ace6f3272dd3f263ddc8702	14-Aug-2015	Connor Abbott <connor.w.abbott@intel.com>	i965/fs/nir: translate double pack/unpack v2 (Sam): - Fix line overflow (Topi). v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
d17cdacba37cff8ee172322c9ba2c4a58bf57d8b	29-Jul-2015	Connor Abbott <connor.w.abbott@intel.com>	i965/fs: always pass the bitsize to brw_type_for_nir_type() v2 (Sam): - Add bitsize to brw_type_for_nir_type() in optimize_extract_to_float() v3 (Sam): - Fix line width (Topi). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0f1690fd9514f7a282141a7ad57a06b334b6c1a4	29-Jul-2015	Connor Abbott <connor.w.abbott@intel.com>	i965/fs: use the NIR bit size when creating registers v2 (Iago): - Squashed bits from 'support double precission constant operands for the implementation of 64-bit emit_load_const'. - Do not use BRW_REGISTER_TYPE_D for all 32-bit registers since that breaks asserts and functionality for some piglit tests. Just keep 32-bit types untouched and add 64-bit support. - Use DF instead of Q for 64-bit registers. Otherwise the code we generate will use Q sometimes and DF others and we hit unwanted DF/Q conversions, so always use DF. v3 (Sam): - Mark 'reg_type' occurrences as const (Topi). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Palli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7bc987abe0dc863b091bf77f5b02138ebe79e559	03-May-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Move handling of samples_identical into the switch statement This is where we handle texop_texture_samples so it makes things more consistent. /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3ba228f9978cbabc2b4731327454dd91a208c317	03-May-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Simplify texture destination fixups There are a few different fixups that we have to do for texture destinations that re-arrange channels, fix hardware vs. API mismatches, or just shrink the result to fit in the NIR destination. These were all being done in a somewhat haphazard manner. This commit replaces all of the shuffling with a single LOAD_PAYLOAD operation at the end and makes it much easier to insert fixups between the texture instruction itself and the LOAD_PAYLOAD. Shader-db results on Haswell: total instructions in shared programs: 6227035 -> 6226669 (-0.01%) instructions in affected programs: 19119 -> 18753 (-1.91%) helped: 85 HURT: 0 total cycles in shared programs: 56491626 -> 56476126 (-0.03%) cycles in affected programs: 672420 -> 656920 (-2.31%) helped: 92 HURT: 42 /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a815499294afb485fe6773fba9ba12fa6773c654	03-May-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Merge nir_emit_texture and emit_texture The fs_visitor::emit_texture helper originated when we still had both NIR and IR visitors for the FS backend. Since the old visitor was removed, emit_texture serves no real purpose beyond arbitrarily splitting heavily-linked code across two functions. /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a808ba59657b3e5c6399e51fa1f4ebe9cad201a9	03-May-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Rework passthrough TCS checks. According to Timothy, using program_string_id == 0 to identify the passthrough TCS is going to be problematic for his shader cache work. So, change it to strcmp() the name at visitor creation time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7d9143ad885752184156b3a0d3e492aef09af3b0	15-Nov-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Write a scalar TCS backend that runs in SINGLE_PATCH mode. Unlike most shader stages, the Hull Shader hardware makes us explicitly tell it how many threads to dispatch and manually configure the channel mask. One perk of this is that we have a lot of flexibility - we can run it in either SIMD4x2 or SIMD8 mode. Treating it as SIMD8 means that shaders with 8 or fewer output vertices (which is overwhelmingly the common case) can be handled by a single thread. This has several intriguing properties: - Accessing input arrays with gl_InvocationID as the index is a simple SIMD8 URB read with g1 as the header. No indirect addressing required. - Barriers are no-ops. - We could potentially do output shadowing to combine writes, as the concurrency concerns are gone. (We don't do this yet, though.) v2: Drop first_non_payload_grf change, as it was always adding 0 (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
9464d8c49813aba77285e7465b96e92a91ed327c	27-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Switch the arguments to nir_foreach_function This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
707e72f13bb78869ee95d3286980bf1709cba6cf	27-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Switch the arguments to nir_foreach_instr This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/ and similar expressions for nir_foreach_instr_safe etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7efff10585122d484dc3adab14af9380b9b8f309	13-Apr-2016	Connor Abbott <cwabbott0@gmail.com>	i965/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
13195f7ef85e0923a7b7d5b8a35eb6b6c257db1c	23-Apr-2016	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Reduce the response length of sampler messages on Skylake. Often, we don't need a full 4 channels worth of data from the sampler. For example, depth comparisons and red textures only return one value. To handle this, the sampler message header contains a mask which can be used to disable channels, and reduce the message length (in SIMD16 mode on all hardware, and SIMD8 mode on Broadwell and later). We've never used it before, since it required setting up a message header. This meant trading a smaller response length for a larger message length and additional MOVs to set it up. However, Skylake introduces a terrific new feature: for headerless messages, you can simply reduce the response length, and it makes the implicit header contain an appropriate mask. So to read only RG, you would simply set the message length to 2 or 4 (SIMD8/16). This means we can finally take advantage of this at no cost. total instructions in shared programs: 9091831 -> 9073067 (-0.21%) instructions in affected programs: 191370 -> 172606 (-9.81%) helped: 2609 HURT: 0 total cycles in shared programs: 70868114 -> 68454752 (-3.41%) cycles in affected programs: 35841154 -> 33427792 (-6.73%) helped: 16357 HURT: 8188 total spills in shared programs: 3492 -> 1707 (-51.12%) spills in affected programs: 2749 -> 964 (-64.93%) helped: 74 HURT: 0 total fills in shared programs: 4266 -> 2647 (-37.95%) fills in affected programs: 3029 -> 1410 (-53.45%) helped: 74 HURT: 0 LOST: 1 GAINED: 143 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c7a09c057162ed0b7e9e039470c76bb79518876c	10-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Properly report regs_written from SAMPLEINFO The previous behavior would only allocate one register and then write four thus potentially stomping three innocent bystanders. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0bd956b34b376bdc1eaf91a2a8463d13dd59e641	24-Apr-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Don't force a header for texture offsets of 0. Calling textureOffset() with an offset of <0, 0, 0> is equivalent to calling texture(). We don't actually need to set up an offset, which causes a message header to be created. A fairly common pattern is to sample at a point with a bunch of offsets, and average them. It's natural to write all the lookups as textureOffset, but use <0, 0> for the center sample. shader-db results on Skylake: total instructions in shared programs: 9092095 -> 9092087 (-0.00%) instructions in affected programs: 2826 -> 2818 (-0.28%) helped: 12 HURT: 2 total cycles in shared programs: 70870166 -> 70870144 (-0.00%) cycles in affected programs: 15924 -> 15902 (-0.14%) helped: 2 HURT: 0 This also helps prevent code quality regressions in a future patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f310c02b94fba0a0a5ea7f5573f906de823cc5fe	16-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_surface_builder: Take a GL format enum instead of mesa_format Reviewed-by: Chad Versace <chad.versace@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
d8c8f4203f8bb18152af0d0c120f3582a93c07c2	06-Apr-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix interpolateAtSample() on single sampled buffers. Fixes dEQP-GLES31.functional.shaders.multisample_interpolation tests: - interpolate_at_sample.non_multisample_buffer.sample_n_default_framebuffer - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_rbo - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_texture Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
447d3eec6a869200612e5010f47335cb26789a3a	06-Apr-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix gl_SampleMaskIn[] in per-sample shading mode. The coverage mask is not sufficient - in per-sample mode, we also need to AND with a mask representing the samples being processed by the current fragment shader invocation. Fixes 18 dEQP-GLES31.functional.shaders.sample_variables tests: sample_mask_in.bit_count_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bit_count_per_two_samples.multisample_{rbo,texture}_{4,8} sample_mask_in.bits_unique_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bits_unique_per_two_samples.multisample_{rbo,texture}_{4,8} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b6dc940ec273252678d40707d300851fa1c85ea5	13-Apr-2016	Connor Abbott <cwabbott0@gmail.com>	nir: rename nir_foreach_block() to nir_foreach_block_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f30f6e26252ed09eca1922f7c8633c7c7b6e50fe	15-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Don't allow OOB array access of images We have had a guard against OOB array access of images on IVB for a long time, but it can actually cause hangs on any GPU generation. This can happen due to getting an untyped SURFACE_STATE for a typed message. We didn't used to hit this with the piglit test on anything other than IVB because the OOB in the test would cause us to go past the top of the pull constant UBO and we would get a surface index of 0 which was always a valid surface. Now that we're pushing small arrays, we can end up grabbing garbage from the GRF and going to some random index which causes a hang. The solution is to just do the bounds check on all hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94944 Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Mark Janes <mark.a.janes@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
479e38ad63ab1421afe4f25d36f434ac2e12e817	25-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Get rid of the param_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3c93cdfaf598bc3c28e3dc288da35675c666602b	25-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Use MOV_INDIRECT for all indirect uniform loads Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
240d16ea94834eb2472e91fd4856381951a07007	25-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
765dd6534937e125b95c7998862b1a4ec76a22d8	25-Mar-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Implement the new imod and irem opcodes Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
bfd17c76c1267756ea16051cbe174cb23ff49f44	08-Apr-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Port INTEL_PRECISE_TRIG=1 to NIR. This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos is multiplied by a constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it in one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
5ea3647f89abccea5496824815b5b729f38f7a23	25-Mar-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Move the code for load/store_shared to emit_cs_intrinsic They are compute-shader only and that's where the code for doing atomics on shared variables lives so it seems to make sense. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
80c72a8ea7b1018661da0e6509a7f88ca1f5086f	25-Mar-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Provide a default LOD for buffer textures Our hardware requires an LOD for all texelFetch commands even if they are on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD is really rather meaningless. This commit allows other NIR producers to be more lazy and not provide one at all. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
65fbc43d54403905e3eaea02372b5a364dc1d773	27-Jan-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range. The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.99997, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.00006 to about 0.00003. When enabled, fixes 16 dEQP precision tests dEQP-GLES31.functional.shaders.builtin_functions.precision.{cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec3,vec4} at the cost of making every sin and cos call more expensive (about twice the number of cycles on recent hardware). Enabling this option has been shown to reduce GPUTest Volplosion performance by about 10%. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
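The scaling workaround in the entry above amounts to a single multiply. A minimal sketch, assuming the raw hardware result is available as a float; `scale_trig_result` is a hypothetical name, and the constants are the ones quoted in the commit message:

```cpp
// Hypothetical illustration (not Mesa code): scale a raw SIN/COS result by
// 0.99997 so that a worst-case hardware value of about 1.000027 is pulled
// back inside the [-1.0, 1.0] range.
inline float scale_trig_result(float hw_result)
{
    return hw_result * 0.99997f;
}
```

Because the fix is an ordinary multiply by a constant, an optimizer can merge it with any other constant multiply applied to the same value, which is how the later NIR port of this option can make the workaround free in some shaders.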
14c46954c910efb1db94a068a866c7259deaa9d9	25-Mar-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Add an implementation of nir_op_fquantize2f16 Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
084b24f5582567ebf5aa94b7f40ae3bdcb71316b	16-Mar-2016	Iago Toral Quiroga <itoral@igalia.com>	nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ef76ea4ba97d0ac122491fd3f1b2bbb8e4163150	04-Mar-2016	Alejandro Piñeiro <apinheiro@igalia.com>	i965/fs/nir: "surface_access::" prefix not needed "using namespace brw::surface_access" is already present at the top of the source file. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1f862e923cba1d5cd54a707f70f0be113635e855	21-Jan-2016	Matt Turner <mattst88@gmail.com>	i965/fs: Optimize float conversions of byte/word extract. instructions in affected programs: 31535 -> 29966 (-4.98%) helped: 23 cycles in affected programs: 272648 -> 266022 (-2.43%) helped: 14 HURT: 1 The patch decreases the number of instructions in the two Unigine programs by: #1721: 4374 -> 4155 instructions (-5.01%) #1706: 3582 -> 3363 instructions (-6.11%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0e9dc59a58e632979b3bdebb19d184bd22a0c182	11-Feb-2016	Matt Turner <mattst88@gmail.com>	i965: Make emit_minmax return an instruction*. And use it in brw_fs_nir.cpp. /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2f2c00c7279e7c43e520e21de1781f8cec263e92	11-Feb-2016	Matt Turner <mattst88@gmail.com>	i965: Lower min/max after optimization on Gen4/5. Gen4/5's SEL instruction cannot use conditional modifiers, so min/max are implemented as CMP + SEL. Handling that after optimization lets us CSE more. On Ironlake: total instructions in shared programs: 6426035 -> 6422753 (-0.05%) instructions in affected programs: 326604 -> 323322 (-1.00%) helped: 1411 total cycles in shared programs: 129184700 -> 129101586 (-0.06%) cycles in affected programs: 18950290 -> 18867176 (-0.44%) helped: 2419 HURT: 328 Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ac089126b9b647f930ee2657aa16ea8e8f6a5dd7	09-Feb-2016	Jason Ekstrand <jason.ekstrand@intel.com>	glsl/types: Rename sampler_type to sampled_type It's a bit more descriptive since it is the base type that you get when you sample from it. Also, the next commit adds a bare "sampler" type and we need glsl_type::sampler_type available for a public static member. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8750299a420af76cebd3067f6f603eacde06ae06	09-Feb-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Remove the const_offset from nir_tex_instr When NIR was originally drafted, there was no easy way to determine if something was constant or not. The result was that we had lots of special-casing for constant values such as this. Now that load_const instructions are SSA-only, it's really easy to find constants and this isn't really needed anymore. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b8ab9c8c8674d67e09c1134ca44b37e0a611f5b5	06-Feb-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
5ec456375e4fdd0b6c7d797f99191044e19ead74	03-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the texture deref and leaves the sampler deref alone as it did before and nir_lower_samplers assumes this. Backends can still assume that they are combined and only look at the texture index. Or, if they wish, they can assume that they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir all set both texture and sampler index whenever a sampler is required (the two indices are the same in this case). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ee85014b90af1d94d637ec763a803479e9bac5dc	06-Feb-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir/tex_instr: Rename sampler to texture We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1dc312e295c66ab8674d2f47f859e310f607b2ed	21-Jan-2016	Matt Turner <mattst88@gmail.com>	i965/fs: Implement support for extract_word. The vec4 backend will lower it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
eb63640c1d38a200a7b1540405051d3ff79d0d8a	17-Jan-2016	Emil Velikov <emil.velikov@collabora.com>	glsl: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b3340cd32acf5935891f19833de0cfc500a93e0b	21-Jan-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Implement a drirc workaround for broken dual color blending. OpenGL's dual color blending feature was specified so that an implementation could support both multiple render targets (MRT) and dual source blending. Fragment shader outputs specify both "location" (the render target number) and "index" (either color 0 or 1). I believe DirectX only has the notion of "location" - if using dual color blending, location 0 or 1 will specify the operands. If not, then location means the render target index. The two features can't be used together. As such, some applications mistakenly try to use <loc = 0, index = 0> and <loc = 1, index = 0> in a shader used for dual color blending with a single render target, rather than the correct <loc = 0, index = 0> and <loc = 0, index = 1>. In particular, Unigine Heaven 4.0 and Valley 1.0 suffer from this bug. Unigine is aware of the problem, and quickly developed a fix, but has not bothered to change the download link on their website to a working copy in over a year. People were still using the broken version and complaining. We tried working around this by disabling dual color blending, but that apparently hurts performance, and people were once again unhappy. On i965, dual source blending is achieved by using different framebuffer write messages than normal rendering. So, we have to compile different code for the two cases. We're not being pedantic: we actually have to know in order to function. Normally, dual source blending is detectable in the shader: if a shader has an output with index = 1, then it's meant for blending, not MRT. With the broken inputs, they're indistinguishable, so we can only tell by looking at the current GL state. 
This patch implements a new drirc workaround: export dual_color_blend_by_location=true which makes the i965 driver detect when OpenGL state is configured for dual source blending, and recompile the fragment shader to use the right messages. In that case, we allow either location = 1 or index = 1 to specify the second source for the blending equations. It also re-enables GL_ARB_blend_func_extended for Unigine. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b82e26a6a4d6baf121f44c61c862bfa79ba0d172	13-Jan-2016	Matt Turner <mattst88@gmail.com>	nir: Lower bitfield_extract. The OpenGL specification for bitfieldExtract() says: The result will be undefined if <offset> or <bits> is negative, or if the sum of <offset> and <bits> is greater than the number of bits used to store the operand. Therefore passing bits=32, offset=0 is legal and defined in GLSL. But the earlier SM5 ubfe/ibfe opcodes are specified to accept a bitfield width ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits of the width operand, making them not able to implement the GLSL-specified behavior directly. This commit adds ubfe/ibfe operations from SM5 and a lowering pass for bitfield_extract to handle the trivial case of <bits> = 32 as bitfieldExtract: bits > 31 ? value : bfe(value, offset, bits) Fixes: ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marta Lofstedt <marta.lofstedt@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
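The lowering quoted in this commit message can be modeled on the CPU as below. This is a minimal sketch under stated assumptions: `ubfe_hw` is a hypothetical software model of an SM5-style unsigned extract that reads only the low 5 bits of its offset and width operands, and `bitfield_extract_lowered` is an illustrative name for the lowered expression. Neither is the NIR pass itself.

```cpp
#include <cstdint>

// Hypothetical model of an SM5-style ubfe: only the low 5 bits of the
// offset and width operands are read, so a width of 32 is seen as 0.
uint32_t ubfe_hw(uint32_t value, uint32_t offset, uint32_t bits)
{
    const uint32_t width = bits & 31u;
    const uint32_t off = offset & 31u;
    if (width == 0)
        return 0;
    // width is in 1..31 here, so the shift below is well defined.
    const uint32_t mask = (1u << width) - 1u;
    return (value >> off) & mask;
}

// The lowering from the commit message:
//   bits > 31 ? value : bfe(value, offset, bits)
uint32_t bitfield_extract_lowered(uint32_t value, uint32_t offset,
                                  uint32_t bits)
{
    return bits > 31u ? value : ubfe_hw(value, offset, bits);
}
```

With this model, `ubfe_hw(v, 0, 32)` yields 0 because the width wraps to 0, while the lowered form returns `v` unchanged, which is the GLSL-defined result for `bits = 32, offset = 0`.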
b85a229e1f542426b1c8000569d89cd4768b9339	08-Jan-2016	Kenneth Graunke <kenneth@whitecape.org>	glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes. TGSI doesn't use these - it just translates ir_quadop_bitfield_insert directly. NIR can handle ir_quadop_bitfield_insert as well. These opcodes were only used for i965, and with Jason's recent patches, we can do this lowering in NIR (which also gains us SPIR-V handling). So there's not much point to retaining this GLSL IR lowering code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
4a1c8a3037cd29938b2a6e2c680c341e9903cfbe	28-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Push most TES inputs in SIMD8 mode. Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 32 vec4 slots (16 registers) is more than sufficient to ensure that 100% of TES inputs are pushed for Shadow of Mordor, Unigine Heaven, GPUTest/TessMark, and SynMark. Note that unlike most SIMD8 stages, this actually reads packed vec4 data, since that is what our vec4 TCS programs write. Improves performance in GPUTest's tessmark_x64 microbenchmark by 93.4426% +/- 5.35541% (n = 25) on my Lenovo X250 at 1024x768. Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 22.74% +/- 0.309394% (n = 5). Improves performance in Shadow of Mordor at low settings with tessellation enabled at 1280x720 by 2.12197% +/- 0.478553% (n = 4). shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 184358 -> 181181 (-1.72%) instructions in affected programs: 27971 -> 24794 (-11.36%) helped: 226 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b022150d70a1cfdda2007fa16b04c601eef45d6f	28-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Use LOAD_PAYLOAD for SIMD8 TES input loads, not MOV. We need a MOV to replicate g0.0<0,1,0> to all 8 channels. Since the message payload is a single register, MOV seemed more sensible than LOAD_PAYLOAD. However, MOV cannot be CSE'd, while LOAD_PAYLOAD can. All input loads can use the same header - we don't need to re-expand g0 every time. CSE accomplishes this, saving instructions. shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 186923 -> 184358 (-1.37%) instructions in affected programs: 30536 -> 27971 (-8.40%) helped: 226 HURT: 0 total cycles in shared programs: 1009850 -> 1005356 (-0.45%) cycles in affected programs: 168206 -> 163712 (-2.67%) helped: 226 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
cddfc2cefa93b884c40329dcb193fe4fb22143ab	10-Dec-2015	Kristian Høgsberg Kristensen <krh@bitplanet.net>	i965: Add support for gl_DrawIDARB and enable extension We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
17ebb55a14b5a9aa639845fbda9330ef9421834a	10-Dec-2015	Kristian Høgsberg Kristensen <krh@bitplanet.net>	i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
237f2f2d8b45d9d956102eec6f9be63193e5269b	26-Dec-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Get rid of function overloads When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can have a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> ir3 bits are Reviewed-by: Rob Clark <robclark@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a5038427c3624e559f954124d77304f9ae9b884c	10-Nov-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Add tessellation evaluation shaders The TES is essentially a post-tessellator VS, which has access to the entire TCS output patch, and a special gl_TessCoord input. Otherwise, they're very straightforward. This patch implements SIMD8 tessellation evaluation shaders for Gen8+. The tessellator can generate a lot of geometry, so operating in SIMD8 mode (8 vertices per thread) is more efficient than SIMD4x2 mode (only 2 vertices per thread). I have another patch which implements SIMD4x2 mode for older hardware (or via an environment variable override). We currently handle all inputs via the pull model. v2: Improve comments (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c8a74e3a4ea6ac5dfa35adac06af14a8fa4ff773	30-Nov-2015	Matt Turner <mattst88@gmail.com>	nir: Delete bany, ball, fany, fall. As in the previous patches, these can be implemented as any(v) -> any_nequal(v, false) all(v) -> all_equal(v, true) and their removal simplifies the code in the next patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
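The two rewrites named in this commit, any(v) -> any_nequal(v, false) and all(v) -> all_equal(v, true), can be checked with a small model. The sketch below uses `std::array<bool, N>` as a stand-in for a boolean vector; `any_v` and `all_v` are hypothetical wrapper names, and none of this is NIR code.

```cpp
#include <array>
#include <cstddef>

// True if any component of a differs from the matching component of b.
template <std::size_t N>
bool any_nequal(const std::array<bool, N> &a, const std::array<bool, N> &b)
{
    for (std::size_t i = 0; i < N; ++i)
        if (a[i] != b[i])
            return true;
    return false;
}

// True if every component of a equals the matching component of b.
template <std::size_t N>
bool all_equal(const std::array<bool, N> &a, const std::array<bool, N> &b)
{
    for (std::size_t i = 0; i < N; ++i)
        if (a[i] != b[i])
            return false;
    return true;
}

// any(v) expressed as any_nequal(v, false).
template <std::size_t N>
bool any_v(const std::array<bool, N> &v)
{
    std::array<bool, N> all_false{};  // value-initialized to false
    return any_nequal(v, all_false);
}

// all(v) expressed as all_equal(v, true).
template <std::size_t N>
bool all_v(const std::array<bool, N> &v)
{
    std::array<bool, N> all_true;
    all_true.fill(true);
    return all_equal(v, all_true);
}
```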
b8425bb1e845bef19dac8d8a9fd672e958018802	11-Dec-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Use the correct source for local memory load offsets The offset for loads is in src[0]. This was a copy+paste error in the nir_intrinsic_load/store refactoring. This commit fixes a segfault in ES31-CTS.compute_shader.work-group-size. I have no idea how piglit failed to catch this... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93348 Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
78b81be627734ea7fa50ea246c07b0d4a3a1638a	25-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Get rid of _indirect variants of input/output load/store intrinsics There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the meantime, having the _indirect variants adds special cases in a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the _indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of _indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <eric@anholt.net> ir3 changes are Reviewed-by: Rob Clark <robdclark@gmail.com> NIR changes are Acked-by: Rob Clark <robdclark@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f3970fad9e5b04e04de366a65fed5a30da618f9d	08-Dec-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Refactor store_output, load_input, and load_uniform There was way too much incrementing of things going on. Instead, let's just start everything off at the right base location, and then increment in the loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e288b4a133f1ea8208cd219545a72805ed5a91c6	10-Oct-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/nir: Implement shared variable atomic operations v3: * Update based on latest SSBO code (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
faddb301ff72bd7ac8d4274e0d895ca37a4d3bce	29-Jul-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/fs: Handle nir shared variable store intrinsic v4: * Apply similar optimization for shared variable stores as 0cb7d7b4b7c32246d4c4225a1d17d7ff79a7526d. This was causing a OpenGLES 3.1 CTS failure, but 867c436ca841b4196b4dde4786f5086c76b20dd7 fixes that. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8613206bd3dd80dc916b6ce7c47bf59cd4d114c8	29-Jul-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/fs: Handle nir shared variable load intrinsic v3: * Remove extra #includes (Iago) * Use recently added GEN7_BTI_SLM instead of BRW_SLM_SURFACE_INDEX (curro) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
18069dce4a4c3d71e6afc6b10bfa7bee0560ba9c	11-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Make uniform offsets be in terms of bytes This commit makes uniform offsets be in terms of bytes starting with nir_lower_io. They get converted to be in terms of vec4s or floats when we cram them in the UNIFORM register file but reladdr remains in terms of bytes all the way down to the point where we lower it to a pull constant load. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
22c273de2b97743587310f7bbf66767191bde866	11-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Remove unused indirect handling The one and only place where the FS backend allows reladdr is on uniforms. For locals, inputs, and outputs, we lower it away before the backend ever sees it. This commit gets rid of the dead indirect handling code. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
13ad8d03f201a4d09bf7ab9078b00807d61dfada	01-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Use a stride of 1 and byte offsets for UBOs Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3810c1561401aba336765d64d1a5a3e44eb58eb3	25-Nov-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix scalar vertex shader struct outputs. While we correctly set output[] for composite varyings, we set completely bogus values for output_components[], making emit_urb_writes() output zeros instead of the actual values. Unfortunately, our simple approach goes out the window, and we need to recurse into structs to get the proper value of vector_elements for each field. Together with the previous patch, this fixes rendering in an upcoming game from Feral Interactive. v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt). Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3e9003e9cf55265ab1fb6522dc5cbb2f455ea1f9	20-Nov-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix fragment shader struct inputs. Apparently we have literally no support for FS varying struct inputs. This is somewhat surprising, given that we've had tests for that very feature that have been passing for a long time. Normally, varying packing splits up structures for us, so we don't see them in the backend. However, with SSO, varying packing isn't around to save us, and we get actual structs that we have to handle. This patch changes fs_visitor::emit_general_interpolation() to work recursively, properly handling nested structs/arrays/and so on. (It's easier to read with diff -b, as indentation changes.) When using the vec4 VS backend, this fixes rendering in an upcoming game from Feral Interactive. (The scalar VS backend requires additional bug fixes in the next patch.) v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt). Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f36993b46962eab4446bc1964eb47149751aee26	23-Nov-2015	Matt Turner <mattst88@gmail.com>	i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ecac1aab538d65f0867fd93e23d0d020c1a5d0f1	23-Nov-2015	Matt Turner <mattst88@gmail.com>	i965: Push down inclusion of brw_program.h. We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
6fe9ea78fa413ca3f0359f62881876f6b7a12f03	23-Nov-2015	Matt Turner <mattst88@gmail.com>	i965: Remove duplicate #includes. Added in commits 36fd65381 and 337dad8ce even though the existing include was in view. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
6c8ba59cff14a1a86273f4008ff2a8e68335ab25	11-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Use nir_lower_tex for texture coordinate lowering Previously, we had a rescale_texcoords helper in the FS backend for handling rescaling of texture coordinates. Now that we can do variants in NIR, we can use nir_lower_tex to do the rescaling for us. This allows us to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and GL_CLAMP handling in vertex and geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c875e3cdd21811ad6669160d59fa39a4526ef872	14-Nov-2015	Matt Turner <mattst88@gmail.com>	i965/fs: Add support for gl_HelperInvocation system value. In most cases (when the negate is copy propagated and the MOV removed), this is two instructions on Gen >= 8 and only two instructions on earlier platforms -- and it doesn't use the flag register. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
99840eb983f74cd447546f7205c8c9f505ef82c8	18-Nov-2015	Ian Romanick <ian.d.romanick@intel.com>	i965: Enable EXT_shader_samples_identical On the vec4 backend, textureSamplesIdentical() will always return false. There are currently no test cases for the vec4 backend, so we don't have much confidence in any implementation. We also don't think anyone is likely to miss it. v2: Handle immediate value for MCS smarter. Rebase on changes to nir_texop_samples_identical (missing second parameter). Suggested by Jason. v3: Add Neil's code to handle 16x MSAA in the FS. Also rebase on top of f9a9ba5e. Stub out the vec4 implementation. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v2] Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2] /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
457bb290efc162ea3c7c51a820ab7cf88a4efb8d	18-Nov-2015	Ian Romanick <ian.d.romanick@intel.com>	nir: Add nir_texop_samples_identical opcode This is the NIR analog to GLSL IR ir_samples_identical. v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by Ken and Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
9b978046eb1d1657060365e8dcde4aad41b50af9	02-Nov-2015	Matt Turner <mattst88@gmail.com>	i965/fs: Use brw_imm_uw(). W/UW immediates are 16-bits, but those 16-bits must be replicated in the high 16-bits of the 32-bit field. Remove the useless W/UW immediate saturating code, since we'll now be using the appropriate immediate (and W/UW immediates in the IR can now no longer be larger than 16-bits). Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
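The replication rule stated here (a 16-bit W/UW immediate must appear in both halves of the 32-bit immediate field) can be sketched as below. `encode_uw_immediate` and `encode_w_immediate` are hypothetical helper names chosen for illustration; they only model the bit layout described in the commit message and are not the driver's `brw_imm_uw()`.

```cpp
#include <cstdint>

// Replicate an unsigned 16-bit immediate into both halves of a
// 32-bit immediate field.
uint32_t encode_uw_immediate(uint16_t uw)
{
    const uint32_t low = static_cast<uint32_t>(uw);
    return low | (low << 16);
}

// Signed variant: reinterpret the 16 bits as unsigned first so that
// sign extension does not leak into the upper half.
uint32_t encode_w_immediate(int16_t w)
{
    return encode_uw_immediate(static_cast<uint16_t>(w));
}
```

Because the input type is already 16 bits wide, there is no larger value left to saturate, which is in line with the commit's removal of the saturating code.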
3ccc41ecfc5e9345a1c291748d8840984f7413ae	02-Nov-2015	Matt Turner <mattst88@gmail.com>	i965/fs: Replace fs_reg(imm) constructors with brw_imm_*(). Cuts 10k of .text, of which only 776 bytes are the fs_reg constructor implementations themselves. text data bss dec hex filename 5204535 214112 27784 5446431 531b1f i965_dri.so before 5193977 214112 27784 5435873 52f1e1 i965_dri.so after Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
fc19a0d2e422ea8e45bc5440a91f858f5f345884	08-Nov-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Allow indirect GS input indexing in the scalar backend. This allows arbitrary non-constant indices on GS input arrays, both for the vertex index, and any array offsets beyond that. All indirects are handled via the pull model. We could potentially handle indirect addressing of pushed data as well, but it would add additional code complexity, and we usually have to pull inputs anyway due to the sheer volume of input data. Plus, marking pushed inputs as live due to indirect addressing could exacerbate register pressure problems pretty badly. We'd need to be careful. v2: Use updated MOV_INDIRECT opcode. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b163aa01487ab5f9b22c48b7badc5d65999c4985	27-Oct-2015	Matt Turner <mattst88@gmail.com>	i965: Rename GRF to VGRF. The 2-bit hardware register file field is ARF, GRF, MRF, IMM. Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to mean an assigned general purpose register. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
dba309fc14d1ca99251c8f8115d2a26ac86f14f6	30-Oct-2015	Matt Turner <mattst88@gmail.com>	i965: Initialize registers. The test (file == BAD_FILE) works on registers for which the constructor has not run because BAD_FILE is zero. The next commit will move BAD_FILE in the enum so that it's no longer zero. In the case of this->outputs, the constructor was being run implicitly, and we were unnecessarily memsetting it to zero. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
d4fdb84f80dd3dbad2b71ea6e877f24dc625aa2a	10-Nov-2015	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZE FS_OPCODE_GET_BUFFER_SIZE is calculated with a resinfo's sampler message. This patch adjusts the number of registers written by the opcode following what the PRM spec says about the number of registers written by the SIMD8 and SIMD16's writeback messages for sampler messages. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
918bda23dda36004c95f6441328ecc892e068886	05-Nov-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Split nir_emit_intrinsic by stage with a general fallback. Many intrinsics only apply to a particular stage (such as discard). In other cases, we may want to interpret them differently based on the stage (such as load_primitive_id or load_input). The current method isn't that pretty - we handle all intrinsics in one giant function. Sometimes we assert on stage, sometimes we forget. Different behaviors are handled via if-ladders based on stage. This commit introduces new nir_emit_<stage>_intrinsic() functions, and makes nir_emit_instr() call those. In turn, those fall back to the generic nir_emit_intrinsic() function for cases they don't want to handle specially. This makes it clear which intrinsics only exist in one stage, and makes it easy to handle inputs/outputs differently for various stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
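The dispatch structure this commit describes (per-stage handlers that fall back to a generic handler) might look roughly like the sketch below. All names here (`Stage`, `Intrinsic`, `emit_fs_intrinsic`, and so on) are hypothetical, and the handlers return strings purely so the routing can be observed; the real functions emit backend IR.

```cpp
#include <string>

enum class Stage { Vertex, Fragment, Compute };
enum class Intrinsic { Discard, LoadInput, Barrier, ShaderClock };

// Generic fallback for intrinsics that no stage treats specially.
std::string emit_generic_intrinsic(Intrinsic)
{
    return "generic";
}

// Fragment-only or fragment-specific intrinsics.
std::string emit_fs_intrinsic(Intrinsic i)
{
    switch (i) {
    case Intrinsic::Discard:   return "fs:discard";
    case Intrinsic::LoadInput: return "fs:load_input";
    default:                   return emit_generic_intrinsic(i);
    }
}

// Vertex shaders interpret load_input differently from fragment shaders.
std::string emit_vs_intrinsic(Intrinsic i)
{
    switch (i) {
    case Intrinsic::LoadInput: return "vs:load_input";
    default:                   return emit_generic_intrinsic(i);
    }
}

// Compute-only intrinsics.
std::string emit_cs_intrinsic(Intrinsic i)
{
    switch (i) {
    case Intrinsic::Barrier: return "cs:barrier";
    default:                 return emit_generic_intrinsic(i);
    }
}

// Top-level dispatch: choose the per-stage handler first.
std::string emit_intrinsic(Stage s, Intrinsic i)
{
    switch (s) {
    case Stage::Vertex:   return emit_vs_intrinsic(i);
    case Stage::Fragment: return emit_fs_intrinsic(i);
    case Stage::Compute:  return emit_cs_intrinsic(i);
    }
    return emit_generic_intrinsic(i);
}
```

The point of the layout is that a stage-specific case lives in exactly one function, so there is no stage if-ladder inside the generic handler.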
51694072218b5ae84b5d8f98ee2172d7c5d61b31	06-Nov-2015	Francisco Jerez <currojerez@riseup.net>	i965/nir/fs: Add comment for no-op memory barrier functions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
faa119307035787f5e421dd6a9eb4d0101de963b	10-Oct-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/nir/fs: Implement new barrier functions for compute shaders For these nir intrinsics, we emit the same code as nir_intrinsic_memory_barrier: * nir_intrinsic_memory_barrier_atomic_counter * nir_intrinsic_memory_barrier_buffer * nir_intrinsic_memory_barrier_image We treat these nir intrinsics as no-ops: * nir_intrinsic_group_memory_barrier * nir_intrinsic_memory_barrier_shared v3: * Add comment for no-op cases (curro) v4: * Moving comment to a separate patch authored by curro Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8dcf807cb43383590ba193c7ff20b8a98e4a9f65	14-Oct-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix scalar VS float[] and vec2[] output arrays. The scalar VS backend has never handled float[] and vec2[] outputs correctly (my original code was broken). Outputs need to be padded out to vec4 slots. In fs_visitor::nir_setup_outputs(), we tried to process each vec4 slot by looping from 0 to ALIGN(type_size_scalar(type), 4) / 4. However, this is wrong: type_size_scalar() for a float[2] would return 2, or for vec2[2] it would return 4. This looked like a single slot, even though in reality each array element would be stored in separate vec4 slots. Because of this bug, outputs[] and output_components[] would not get initialized for the second element's VARYING_SLOT, which meant emit_urb_writes() would skip writing them. Nothing used those values, and dead code elimination threw a party. To fix this, we introduce a new type_size_vec4_times_4() function which pads array elements correctly, but still counts in scalar components, generating correct indices in store_output intrinsics. Normally, varying packing avoids this problem by turning varyings into vec4s. So this doesn't actually fix any Piglit or dEQP tests today. However, if varying packing is disabled, things would be broken. Tessellation shaders can't use varying packing, so this fixes various tcs-input Piglit tests on a branch of mine. v2: Shorten the implementation of type_size_4x to a single line (caught by Connor Abbott), and rename it to type_size_vec4_times_4() (renaming suggested by Jason Ekstrand). Use type_size_vec4 rather than using type_size_vec4_times_4 and then dividing by 4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
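The slot-counting bug described above can be reproduced with a toy type model. In this sketch `SimpleType` is a hypothetical stand-in for a GLSL type (a scalar or vector of up to four components, optionally in a one-dimensional array), and the three `type_size_*` functions are simplified re-implementations written for illustration. Only their names come from the commit message.

```cpp
// Hypothetical, simplified type: 1..4 components, array_length 0 means
// "not an array".
struct SimpleType {
    unsigned components;
    unsigned array_length;
};

static unsigned element_count(const SimpleType &t)
{
    return t.array_length == 0 ? 1u : t.array_length;
}

// Size counted in tightly packed scalar components.
unsigned type_size_scalar(const SimpleType &t)
{
    return t.components * element_count(t);
}

// Size counted in vec4 slots: each array element gets its own slot.
unsigned type_size_vec4(const SimpleType &t)
{
    return element_count(t);
}

// Size in scalar components with every element padded to a full vec4.
unsigned type_size_vec4_times_4(const SimpleType &t)
{
    return 4u * type_size_vec4(t);
}

// The slot count the old loop bound effectively used:
// ALIGN(type_size_scalar(type), 4) / 4.
unsigned buggy_slot_count(const SimpleType &t)
{
    return (type_size_scalar(t) + 3u) / 4u;
}
```

For `float[2]` (`SimpleType{1, 2}`) and `vec2[2]` (`SimpleType{2, 2}`), `buggy_slot_count` returns 1 in both cases, while `type_size_vec4` returns 2, matching the commit's description of the second element's slot never being initialized.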
eea3c907cc480a105224b21be51d62bc64ea1057	30-Oct-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: Do not mark used surfaces in FS_OPCODE_GET_BUFFER_SIZE Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
027b64a55afc0fe8efcf9f6217192807e285c830	30-Oct-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: Do not mark direct used surfaces in VARYING_PULL_CONSTANT_LOAD Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const and remove useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
fb77da89f51fd82d5cee95400acb20ad74d9e7bc	31-Oct-2015	Timothy Arceri <t_arceri@yahoo.com.au>	i965: add support for image AoA V3: clamp array index to the correct size (the size of the current array rather than the inner array) Francisco Jerez. V2: avoid useless zero-initialization and addition for the first AoA level, avoid redundant temporary, make use of type_size_scalar(), rename aoa_size to element_size, assign the indirect indexing temporary directly to image.reladdr, and replace while loop with a for loop. All suggested by Francisco Jerez. Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
36fd65381756ed1b8f774f7fcdd555941a3d39e1	12-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Add scalar geometry shader support. This is hidden behind INTEL_SCALAR_GS=1 for now, as we don't yet support instanced geometry shaders, and Orbital Explorer's shader spills like crazy. But the infrastructure is in place, and it's largely working. v2: Lots of rebasing. v3: (feedback from Kristian Høgsberg) - Handle stride and subreg_offset correctly for ATTRs; use a helper. - Fix missing emit_shader_time_end() call. - Delete dead code after early EOT in static vertex case to avoid tripping asserts in emit_shader_time_end(). - Use proper D/UD type in intexp2(). - Fix "EndPrimitve" and "to that" typos. - Assert that invocations == 1 so we know this is missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0b19f651958c3888588190c8c8a9e701173a2aa2	26-Oct-2015	Matt Turner <mattst88@gmail.com>	i965/fs: Clean up FBH code. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
4379ca22f18f5731248ee794ab651db721ba38b2	07-Oct-2015	Emil Velikov <emil.velikov@collabora.com>	i965: Implement nir_intrinsic_shader_clock v2: - Add a few const qualifiers for good measure. - Drop unneeded retype()s (Matt) - Convert timestamp to SIMD8/16, as fs_visitor::get_timestamp() returns SIMD4 (Connor) v3: - Remove unneeded temporary + MOV (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8c902a580a490181e7cde29073b11181db4614f8	17-Jun-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Implement ARB_fragment_layer_viewport. Normally, we could read gl_Layer from bits 26:16 of R0.0. However, the specification requires that bogus out-of-range 32-bit values written by previous stages need to appear in the fragment shader as-written. Instead, we pass in the full 32-bit value from the VUE header as an extra flat-shaded varying. We have the SF override the value to 0 when the previous stage didn't actually write a value (it's actually defined to return 0). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0cb7d7b4b7c32246d4c4225a1d17d7ff79a7526d	22-Oct-2015	Kristian Høgsberg Kristensen <krh@bitplanet.net>	i965/fs: Optimize ssbo stores Reviewed-by: Francisco Jerez <currojerez@riseup.net> Write groups of enabled components together. Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
feff21d1a6ba49a0d6f7526e1ff473a0b574c92e	22-Oct-2015	Kristian Høgsberg Kristensen <krh@bitplanet.net>	i965/fs: Drop offset_reg temporary in ssbo load Now that we don't read each component one-by-one, we don't need the temporary vgrf for the offset. More importantly, this register was type UD while the nir source was type D. This broke copy propagation and left a redundant MOV in the generated code. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a19bf6d3ccbab6170ccfb7e04316a58f3e19396c	21-Oct-2015	Kristian Høgsberg Kristensen <krh@bitplanet.net>	i965/fs: Don't uniformize surface index twice The emit_untyped_read and emit_untyped_write helpers already uniformize the surface index argument. No need to do it before calling them. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
24a3a697e5e029767c2d210a94d47c52c5e5e299	17-Oct-2015	Kristian Høgsberg Kristensen <krh@bitplanet.net>	i965/fs: Read all components of a SSBO field with one send Instead of looping through single-component reads, read all components in one go. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1db44252d01bf7539452ccc2b5210c74b8dcd573	20-Oct-2015	Ben Widawsky <benjamin.widawsky@intel.com>	i965: Implement ARB_shader_stencil_export (gen9+) v2: remove useless source_stencil_to_render_target (Ken) Squash in the actual packing function, which also got to v2: Move the definition of the OPCODE outside of FB_WRITE opcodes (Matt) Reorder the regioning to be in VWH order (Matt) Don't retype src in the backend, just assert instead (Matt) Rename the debug prints to something better (Matt) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
48c76eae8e52fba2fe22d2cfa7f3c94a5420feb2	10-Jul-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Implement gl_InvocationID. It's stored in bits 31:27 of g1 (along with the URB handles). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c5ae34f38f239d346090212a9f33a947a3b7642e	24-Sep-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Implement nir_intrinsic_load_primitive. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
6f9ca3026693e061ee55fa6d5f16d9ec0e744b59	15-Oct-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/fs: use the right number of UBOs Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
176e6930e6c24dfce7cc730faa2612d27689a4df	18-Jul-2015	Timothy Arceri <t_arceri@yahoo.com.au>	i965: add arrays of arrays support for varyings V2: get the correct vector elements value for outputs Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
d3f45888045c84b2bc382a34d169a0ede4774a24	09-Oct-2015	Iago Toral Quiroga <itoral@igalia.com>	i965: Adapt SSBOs to work with their own separate index space Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
baee16bf02eedc6a32381d79da6c7ac942f782ae	28-Sep-2015	Iago Toral Quiroga <itoral@igalia.com>	nir: split SSBO min/max atomic intrinsics into signed/unsigned versions NIR is typeless so this is the only way to keep track of the type to select the proper atomic to use. v2: - Use imin,imax,umin,umax for the intrinsic names (Connor Abbott) - Change message for unreachable paths (Michael Schellenberger) Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2953c3d76178d7589947e6ea1dbd902b7b02b3d4	15-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/vs: Map scalar VS input locations properly; avoid tons of MOVs. Previously, we used nir_lower_io with the scalar type_size function, which mapped VERT_ATTRIB_* locations to...some numbers. Then, in fs_visitor::nir_setup_inputs(), we created temporaries indexed by those numbers, and emitted MOVs from the actual ATTR registers to those temporaries. Virtually all of these were copy propagated away, but it's still ugly. This patch reworks our input lowering to produce NIR lower_input intrinsics that properly index into the ATTR file, so we can access it directly. No changes in shader-db. v2: Fix unreachable() message (Ken), update commit message (Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
da361acd1c899d533caec6cae5a336f6ab35e076	17-Jul-2015	Neil Roberts <neil@linux.intel.com>	i965/fs: Handle non-const sample number in interpolateAtSample If a non-const sample number is given to interpolateAtSample it will now generate an indirect send message with the sample ID similar to how non-const sampler array indexing works. Previously non-const values were ignored and instead it ended up using a constant 0 value. The generator will try to determine if the sample ID is dynamically uniform via nir_src_is_dynamically_uniform. If not it will query the pixel interpolator in a loop, once for each different live sample number. The next live sample number is found using emit_uniformize. If multiple live channels have the same sample number then they will be handled in a single iteration of the loop. The loop is necessary because the indirect send message doesn't seem to have a way to specify a different value for each fragment. This fixes the following two Piglit tests: arb_gpu_shader5-interpolateAtSample-nonconst arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform v2: Handle dynamically non-uniform sample ids. v3: Remove the BREAK instruction and predicate the WHILE directly. Make the tokens arrays const. (Matt Turner) v4: Iterate over the live channels instead of each possible sample number. v5: Don't special case immediate values in brw_pixel_interpolator_query. Make a better wrapper for the function to set up the PI send instruction. Ensure that the SHL instructions are scalar. (Francisco Jerez). Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1e3c1b107e075b210998998423901092b8fcd79b	03-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Use nir_foreach_variable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
bf7b6fd3fd6d98305d64ee6224ca9f9e7ba48444	02-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/shader: Get rid of the shader, prog, and shader_prog fields Unfortunately, we can't get rid of them entirely. The FS backend still needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still needs gl_shader_program for handling transform feedback. However, the VS needs neither and we can substantially reduce the amount they are used. One day we will be free from their tyranny. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
756613ed35d6fd2216b5138731c0c38886b8e14a	02-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Use the nir info instead of pulling things out of [shader_]prog Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b62e36d18fac4a9c9977ddfa4bc2c2dbbcdad1b4	02-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Move sampler unit lookup into rescale_texcoord The texunit variable we create and assign in nir_emit_texture gets passed through two more layers of function calls before it gets to its sole use in rescale_texcoord. The best part is that we already pass the sampler into rescale_texcoord so we can just look it up there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7926c3ea7d8f455cbee390d20c78dadf5432b9bc	01-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/backend_shader: Add a field to store the NIR shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
30c63571133ed50907ec14172c2f3ef82ee8a34e	01-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Move prog_data uniform setup to the codegen level As of now, uniform setup is more-or-less unified between vec4 and fs and no longer requires the fs_visitor. This makes uniform setup more of a language/API thing than a backend compiler thing. This commit moves setting up the stage_prog_data.params arrays to the same place as we set up the rest of stage_prog_data. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
cdf314cb21377ee7caca05bd1abab6a2b921d213	01-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Simplify uniform setup Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7fee8b6f055831bc070bb36d02a8b1c4d601652a	02-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Pull GLSL uniform handling into a common function The way we deal with GLSL uniforms and builtins is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
03c4171b577b06b1d8dde50b6eb9507d8ef4c1ce	29-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Pull common ARB program uniform handling into a common function The way we deal with ARB program uniforms is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
58cea0c2b63db236e6efcae930c5fb936181c2a9	30-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/shader: Pull setup_image_uniform_values out of backend_shader I tried to do this once before but Curro pointed out that having it in backend_shader meant it could use the setup_vec4_uniform_values helper which did different things in vec4 and fs. Now the setup_uniform_values function differs only by an assert in the two backends so there's no real good reason to be using it anymore. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
681b4badaedec5c9503887c4afb32485ce22c30e	24-Sep-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/cs: Generate code to load gl_NumWorkGroups This code also sets cs_prog_data->uses_num_work_groups which is later used by state setup to indicate that the gl_NumWorkGroups surface needs to be setup. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
6668eb5a451c43ac78a784711cf239fdf7ca75ef	11-Sep-2015	Samuel Iglesias Gonsalvez <siglesias@igalia.com>	mesa: rename gl_shader_program's NumUniformBlocks to NumBufferInterfaceBlocks Because it counts shader storage blocks too. v2: - Use NumBufferInterfaceBlocks instead (Jordan). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
14af6f4698a9f60c080b9adda4d3b4c45b157bd7	01-Jun-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir/fs: Implement nir_intrinsic_ssbo_atomic_* Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
5b186aafe7a8d3f96a99ad2fddd2bff99d99e923	01-Jun-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir/fs: Implement nir_intrinsic_load_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
337dad8ceeb4f313a47b4ddb31805f355c3fc3a5	01-Jun-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir/fs: Implement nir_intrinsic_store_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f5dd2c182275a9de57e5186491012c402a6248e0	01-Jun-2015	Samuel Iglesias Gonsalvez <siglesias@igalia.com>	i965/fs/nir: implement nir_intrinsic_get_buffer_size v2: - Remove inst->regs_written assignment as the instruction only writes to one register. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c5743a5d7fa62a339222ceb96d568a525d77fe0c	13-Mar-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/nir: Support gl_WorkGroupID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
49f999b9cb6ecb32cb27d10b47d234a176ae4c77	13-Mar-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/nir: Support gl_LocalInvocationID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
34cff76fc2da1ce9abad6e2b1856fec6a950d19c	05-Nov-2014	Jordan Justen <jordan.l.justen@intel.com>	i965/cs: Enable barrier in MEDIA_INTERFACE_DESCRIPTOR Enable barrier in MEDIA_INTERFACE_DESCRIPTOR if the program uses the barrier() GLSL function. On Ivy Bridge and Haswell, this allows the piglit test tests/spec/arb_compute_shader/execution/simple-barrier-atomics.shader_test to pass. On gen8, this enables a similar test with a local group size of 896 to pass. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
55ebaa6d003b69c0a159a00d82a1e96f685062d6	28-Aug-2015	Ilia Mirkin <imirkin@alum.mit.edu>	i965: add handling for imageSamples Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0b91bcea98c0fe201bba89abe1ca3aee4d04c56c	12-Aug-2015	Ilia Mirkin <imirkin@alum.mit.edu>	i965: add support for textureSamples function Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [v2: kayden-supplied code in fs_nir replacing need for logical opcode] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0cc331dddd1a99c7af3619c92c48b5c32e17f6b3	04-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Use nir_system_value_from_intrinsic to reduce duplication. This code is all pretty much identical. We just needed the translation from one enum value to the other. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c676c432f30158190c260e7f3731ee6667ad4103	17-Aug-2015	Matt Turner <mattst88@gmail.com>	i965/fs: Remove fs_visitor::try_replace_with_sel(). No shader-db changes on g4x, snb, hsw, or bdw. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2581fe931a48478123d8054ce7a291cffa851de9	28-Aug-2015	Marta Lofstedt <marta.lofstedt@intel.com>	i965/fs: Do not set the size for zero-size uniforms Zero sized uniforms can exist in the list, but they don't get any space allocated in prog_data->params or in the param_size array, so the size should not be set for them. This was previously fixed in: commit: 781dc7c0e1f41502f18e07c0940af949a78d2792. However, commit: 259f7291de2387aa3ac5f856b39b7b934a1d8e7d removed the fix. Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
97f4efd573aed7ffc0ea9395f4e69ccdeb5041f6	27-May-2015	Nanley Chery <nanley.g.chery@intel.com>	mesa/macros: add power-of-two assertions for alignment macros ALIGN and ROUND_DOWN_TO both require that the alignment value passed into the macro be a power of two in the comments. Using software assertions verifies this to be the case. v2: use static inline functions instead of gcc-specific statement expressions (Brian). v3: fix indentation (Brian). v4: add greater than zero requirement (Anuj). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
259f7291de2387aa3ac5f856b39b7b934a1d8e7d	18-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Rework uniform handling Previously, we treated the entire UNIFORM file as if it had two elements: One for direct things and one for indirect. This is substantially different from how the old visitor code handled it where each element was effectively its own uniform. This commit makes the NIR path more like the old ir_visitor path where each uniform is separate. This should allow us to more easily make decisions about what to push. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0db8e87b4a16b123f7c0b44d54f23b535a136ee6	18-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir/intrinsics: Add a second const index to load_uniform In the i965 backend, we want to be able to "pull apart" the uniforms and push some of them into the shader through a different path. In order to do this effectively, we need to know which variable is actually being referred to by a given uniform load. Previously, it was completely flattened by nir_lower_io which made things difficult. This adds more information to the intrinsic to make this easier for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
6c33d6bbf9b54784e4498a81c73b712dca5dd737	12-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Pass a type_size() function pointer into nir_lower_io(). Previously, there were four type_size() functions in play - the i965 compiler backend defined scalar and vec4 type_size() functions, and nir_lower_io contained its own similar functions. In fact, the i965 driver used nir_lower_io() and then looped over the components using its own type_size - meaning both were in play. The two are /basically/ the same, but not exactly in obscure cases like subroutines and images. This patch removes nir_lower_io's functions, and instead makes the driver supply a function pointer. This gives the driver ultimate flexibility in deciding how it wants to count things, reduces code duplication, and improves consistency. v2 (Jason Ekstrand): - One side-effect of passing in a function pointer is that nir_lower_io is now aware of and properly allocates space for image uniforms, allowing us to drop hacks in the backend Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> v2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
640c472fd075814972b1276c5b0ed3a769aacda5	12-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Move type_size() methods out of visitor classes. I want to use C function pointers to these, and they don't use anything in the visitor classes anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c56899f41a904762225267cb9c543a0abd901ad5	19-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Make setup_vec4_uniform_value and _image_uniform_values take an offset This way they don't implicitly increment the uniforms variable and don't have to be called in-sequence during uniform setup. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
56ebd3314bfc5895fab47586fc8cda024aac4fd8	20-Aug-2015	Martin Peres <martin.peres@linux.intel.com>	i965: Fix "handle nir_intrinsic_image_size" I pushed a half-baked version of "i965: handle nir_intrinsic_image_size" by accident. Not having the Reviewed-by: tags on the last two commits should have been a red flag but I somehow missed it after the QA check. This patch should fix image-size for non-int images. I will add support to the piglit test for all the other image types. Sorry for the noise. Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
50db9c1db645c1a4d5777d2cacfd7ac74ebbe544	28-Apr-2015	Martin Peres <martin.peres@linux.intel.com>	i965: handle nir_intrinsic_image_size v2, Review from Francisco Jerez: - avoid the camelCase for the booleans - init the booleans using the sampler type - force the initialization of all the components of the output register v3: - Rename a variable from CubeMapArray to CubeArray to re-use GLSL's name (Ilia) - Fix some indentation and drop parenthesis (Topi) - Fix a signed/unsigned comparison warning Signed-off-by: Martin Peres <martin.peres@linux.intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
13a04abc277089275217dce119e18acf4d4ce52d	27-Jul-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Clamp image array indices to the array bounds on IVB. This fixes the spec@arb_shader_image_load_store@invalid index bounds piglit tests on IVB, which were causing a GPU hang and then a crash due to the invalid binding table index result of the array index calculation. Other generations seem to behave sensibly when an invalid surface is provided so it doesn't look like we need to care. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a47ae8de2cf30fbe45318a18a2ea032f30ab7d10	27-Jul-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Translate image load, store and atomic NIR intrinsics. v2: Move array coordinate workaround into the surface builder. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
912ef52c29fdc373889594b963cc93c89fa9e3f7	28-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Handle image uniforms in NIR programs. v2: Move the image_params array back to brw_stage_prog_data. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8a688bee83ced46eb4bff741f05d2da033c07ade	10-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Make resolve_source_modifiers consistent with the vec4 version Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e77a4a9b1f66de383043df95aada40fd5a004913	04-Aug-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Implement nir_op_imul/umul_high in terms of MULH. And get rid of another no16() call. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
db8a6de571bb72ef43209a415e5492001a87b1d8	17-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir: Add new utility method brw_glsl_base_type_for_nir_type() This method returns the glsl_base_type corresponding to a nir_alu_type. It will factorize code currently present in fs_nir, that can be reused in vec4_nir on its upcoming emit_texture support. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
97e205fd35bf77fd761caf24c611ff72cc0d85e2	17-Apr-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir: Move brw_type_for_nir_type() to brw_nir to allow reuse Upcoming NIR->vec4 pass can benefit from this method, so lets move it up. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
781dc7c0e1f41502f18e07c0940af949a78d2792	30-Jul-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Fix regression with SIMD8 VS since b5f1a48e234d47b24df38cb562cffb8941d43795. With num_direct_uniforms == 0 there's no space allocated in the param_size array for the one block of direct uniforms -- On the FS stage this would be a harmless no-op because it would simply re-set one of the param_size entries allocated for the sampler units to zero, but on the VS stage it has been reported to cause memory corruption followed by a crash -- Surprising how a full piglit run on Gen8 didn't catch it. Reported-and-reviewed-by: "Lofstedt, Marta" <marta.lofstedt@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7cb60d770fc24bf00b6f7e5898cca1426e55c026	27-Jul-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Translate memory barrier NIR intrinsics. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b5f1a48e234d47b24df38cb562cffb8941d43795	28-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Execute nir_setup_uniforms, _inputs and _outputs unconditionally. Images take up zero uniform slots in the nir_shader::num_uniforms calculation, but nir_setup_uniforms needs to be executed even if the program has no non-image uniforms so the driver-specific image parameters are uploaded. nir_setup_uniforms is a no-op if there are really no uniforms, so checking the num_uniform count is useless in any case. The nir_setup_inputs and _outputs changes shouldn't lead to any functional change, they are just meant to preserve the symmetry between them and nir_setup_uniforms. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3e5a90792d14aeb599dd236f830e6e344b35c905	05-May-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Don't overwrite fs_visitor::uniforms and ::param_size during the SIMD16 run. Image variables need to allocate additional uniform slots over nir_shader::num_uniforms. nir_setup_uniforms() overwrites the values imported from the SIMD8 visitor and then exits early before entering the nir_shader::uniforms loop, so image uniforms are never re-created. Instead leave the imported values alone, they must be the same for the uniform layout of both runs to be compatible. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ea0ac53f059c418d5797c495b87020f2ca2ec842	29-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Drop unused untyped surface read and atomic emit methods. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
854c4d8b37416d3e5593099a8e5441f3cf861173	05-May-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Revisit NIR atomic counter intrinsic translation. Rewrite the NIR atomic counter intrinsics translation code making use of the recently introduced surface builder. This will allow the removal of some of the functionality duplicated between the visitor and surface builder. v2: Drop VEC4 support. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3af2623da5167aa686bcb2cff01d27058a507026	20-Jul-2015	Francisco Jerez <currojerez@riseup.net>	i965: Lift the constness restriction on surface indices passed to untyped ops. v2: Update NIR atomic intrinsic handling too (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b406c34a65677cac2517336d93ab279c3d35fce6	23-Jul-2015	Dave Airlie <airlied@redhat.com>	i965: fix warning since tess merge. Signed-off-by: Dave Airlie <airlied@redhat.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
fadf34773527779eef4622b2586d87ec00476c0f	13-Jul-2015	Francisco Jerez <currojerez@riseup.net>	i965: Fix stride field for the result of emit_uniformize(). This is essentially the same problem fixed in an earlier patch for immediates. Setting the stride to zero will be particularly useful for my future SIMD lowering pass, because we will be able to just check whether the stride of a source register is zero and skip emitting the copies required to unzip it in that case. Instead of setting stride to zero in every caller of emit_uniformize() I've changed the function to return the result as its return value (previously it was being written into a caller-provided destination register), because this way we can enforce that the result is used with the correct regioning from the function itself. The changes to the prototype of its VEC4 counterpart are mainly for the sake of symmetry, VEC4 registers don't have stride. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8ba1982b1e37aa69680e243fe391254211ae273a	17-Jul-2015	Alejandro Piñeiro <apinheiro@igalia.com>	i965/nir/fs: removed unneeded support for global variables As functions are inlined, and nir_lower_global_vars_to_local gets run, all global variables are lowered to local variables. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b00cd6e4a0f9a84d514f428428be348900236e2e	09-Jul-2015	Francisco Jerez <currojerez@riseup.net>	i965: Implement nir_op_uadd_carry and _usub_borrow without accumulator. This gets rid of two no16() fall-backs and should allow better scheduling of the generated IR. There are no uses of usubBorrow() or uaddCarry() in shader-db so no changes are expected. However the "arb_gpu_shader5/execution/built-in-functions/fs-usubBorrow" and "arb_gpu_shader5/execution/built-in-functions/fs-uaddCarry" piglit tests go from 40 to 28 instructions. The reason is that the plain ADD instruction can easily be CSE'ed with the original addition, and the b2i negation can easily be propagated into the source modifier of another instruction, so effectively both operations are performed with just one instruction. v2: Rely on carry_to_arith() and borrow_to_arith() to lower these (Ilia Mirkin). Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3ee2daf23dc91b8dfc017b5c89c10ab1376ba4df	10-Jul-2015	Francisco Jerez <currojerez@riseup.net>	i965: Implement b2f and b2i using negation. Booleans are represented as 0/-1 on modern hardware which means we can just negate them to convert them into a numeric type. Negation has the benefit that it can be implemented using a source modifier which can easily be propagated into some other instruction. shader-db results on HSW: total instructions in shared programs: 6349082 -> 6346693 (-0.04%) instructions in affected programs: 40948 -> 38559 (-5.83%) helped: 123 HURT: 1 GAINED: 1 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
308c0bf74307af0f3385cdcbb00aa0534ec3e5da	12-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Switch on shader stage in nir_setup_outputs(). Adding new shader stages to a switch statement is less confusing than an if-else-if ladder where all but the first case are fragment shader specific (but don't claim to be). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
73d0e7f3451eaeb62ac039d2dcee1e1c6787e3db	02-Jul-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/vs: Fix matNxM vertex attributes where M != 4. Matrix vertex attributes have their columns padded out to vec4s, which I was failing to account for. Scalar NIR expects them to be packed, however. Fixes 1256 dEQP tests on Broadwell. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7009e2683ebb917393d87639f549588f22c03a32	06-Jul-2015	Francisco Jerez <currojerez@riseup.net>	i965/gen4-5: Enable 16-wide dispatch on shaders with control flow. This was probably disabled due to a combination of several bugs in the generator code (fixed earlier in this series) and a misunderstanding of the hardware spec. The documentation for most control flow instructions mentions among other restrictions: "Instruction compression is not allowed." This however doesn't have any implications on 16 wide not being supported, because none of the control flow instructions have multi-register operands (control flow instructions are not compressed on more recent hardware either, except maybe SNB's IF with inline compare). In fact Gen4-5 had 16-wide control flow masks and stacks, and the spec mentions in several places that control flow instructions push and pop 16 channels worth of data -- Otherwise there doesn't seem to be any indication that it shouldn't work. Causes no piglit regressions, and gives the following shader-db results on ILK: total instructions in shared programs: 4711384 -> 4711384 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 GAINED: 1215 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
493af150fb3b1c007d791b24dcd5ea8a92ad763c	03-Jul-2015	Neil Roberts <neil@linux.intel.com>	i965/skl: Set the pulls bary bit in 3DSTATE_PS_EXTRA On Gen9+ there is a new bit in 3DSTATE_PS_EXTRA that must be set if the shader sends a message to the pixel interpolator. This fixes the interpolateAt* tests on SKL, apart from interpolateatsample-nonconst but that is not implemented anywhere so it's not a regression. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.6 10.5" <mesa-stable@lists.freedesktop.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7abc1e3286bc4729e144d3a247c2a275e46aaf53	02-Jul-2015	Neil Roberts <neil@linux.intel.com>	i965/fs: Don't disable SIMD16 when using the pixel interpolator There was a comment saying that in SIMD16 mode the pixel interpolator returns coords interleaved 8 channels at a time and that this requires extra work to support. However, this interleaved format is exactly what the PLN instruction requires so I don't think anything needs to be done to support it apart from removing the line to disable it and to ensure that the message lengths for the send message are correct. I am more convinced that this is correct because as it says in the comment this interleaved output is identical to what is given in the thread payload. The code generated to apply the plane equation to these coordinates is identical on SIMD16 and SIMD8 except that the dispatch width is larger which implies no special unmangling is needed. Perhaps the confusion stems from the fact that the description of the PLN instruction in the IVB PRM seems to imply that the src1 inputs are not interleaved so it wouldn't work. However, in the HSW and BDW PRMs, the pseudo-code is different and looks like it expects the interleaved format. Mesa doesn't seem to generate different code on IVB to uninterleave the payload registers and everything is working so I can only assume that the PRM is wrong. I tested the interpolateAt tests on HSW and did a full Piglit run on IVB and there were no regressions. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
89bc4c78c394e50ddb16cc089bd3ec90681342d7	18-Jun-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Remove fs_inst constructors that don't take an explicit exec_size Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f7dcc1160331462a071c54ca1067f9e2f57b55be	18-Jun-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Add a builder argument to offset() Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0ecdf04060518149e99a098caf4f6025fd6482a4	26-Jun-2015	Connor Abbott <cwabbott0@gmail.com>	i965/fs: emit constants only once Before, we would lazily emit a MOV whenever we encountered a use of a constant. Now that we have a dedicated file for SSA values, we can instead only emit the MOV's once, which is more consistent and prevents us from relying on CSE to re-combine the constants when they aren't absorbed into the instruction. total instructions in shared programs: 6078991 -> 6073118 (-0.10%) instructions in affected programs: 402221 -> 396348 (-1.46%) helped: 1527 HURT: 0 GAINED: 8 LOST: 2 v2: split this out from the previous commit (Jason) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
864907e2f14523c130e6ff24c081789bb079bae1	24-Jun-2015	Connor Abbott <cwabbott0@gmail.com>	i965/fs: use SSA values directly Before, we would use registers, but set a magical "parent_instr" field to indicate that it was actually purely an SSA value (i.e., it wasn't involved in any phi nodes). Instead, just use SSA values directly, which lets us get rid of the hack and reduces memory usage since we're not allocating a nir_register for every value. It also makes our handling of load_const more consistent compared to the other instructions. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f0e772392f1c61df6e3f253dc236eb9737fb6146	13-Mar-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/nir: Support barrier intrinsic function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
cfc175b40995ca4e590cd30897f6bb017e1376a3	10-Jun-2015	Chad Versace <chad.versace@intel.com>	i965/fs: Fix unused variable warning Annotate offset_components with attribute 'unused'. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
44928b799adbbf2671c482431b3b7a390118725c	08-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Remove dead IR construction code from the visitor. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
bf83a1a219af8bf82c3c721888bbe0dfc3eced34	03-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Migrate translation of NIR texturing instructions to the IR builder. v2: Don't remove assignments of base_ir just yet. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
979fe2ffee3956186017fe6c115aed53fc87ad3d	03-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Migrate translation of NIR intrinsics to the IR builder. v2: Use fs_builder::SEL instead of ::emit. Use set_condmod(). Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
fe88c7ae38c72ea09ced69fb12ff00f58bdf1d6e	03-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Migrate translation of NIR ALU instructions to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3632c28bde071950dc57e42eb62a65fb838c8bdc	03-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Migrate translation of NIR control flow to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
9976731485abb68eb3b5ae6f11a7838977b95b5b	03-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Migrate NIR variable handling to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
09733f220ac9921ce7d8c3524bc5327d8203c446	03-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Migrate NIR emit_percomp() to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
546839ef639bf871feaa62ab7d811f2fc783bdcd	03-Jun-2015	Francisco Jerez <currojerez@riseup.net>	i965/fs: Migrate pull constant loads to the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
87a4bc511811327a00f9bbc1b6870b7fa46675f7	21-May-2015	Martin Peres <martin.peres@linux.intel.com>	mesa: reference built-in uniforms into gl_uniform_storage This change introduces a new field in gl_uniform_storage to explicitly say that a uniform is built-in. In the case where it is, no storage is defined to make it clear that it is read-only from the mesa side. I fixed all the places in the code that made use of the structure that I changed. Any place making a wrong assumption and using the storage straight away will just crash. This patch seems to implement the path of least resistance towards listing built-in uniforms in GL_ACTIVE_UNIFORM (and other APIs). Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2126c68e5cba79709e228f12eb3062a9be634a0e	20-May-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Get rid of the array elements parameter on load/store intrinsics Previously, we used intrinsic->const_index[1] to represent "the number of array elements to load" for load/store intrinsics. However, this is set to 1 by every pass that ever creates a load/store intrinsic. Also, while it might make some sense for registers, it makes no sense whatsoever in SSA. On top of that, the i965 backend was the only backend to ever support it; freedreno and vc4 just assert that it's always 1. Let's just delete it. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@freedesktop.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1e4e17fbd9296cc5064aabdb351a894d10190cb6	11-May-2015	Matt Turner <mattst88@gmail.com>	i965/fs: Lower integer multiplication after optimizations. 32-bit x 32-bit integer multiplication requires multiple instructions until Broadwell. This patch just lets us treat the MUL instruction in the FS backend like it operates on Broadwell, and after optimizations we lower it into a sequence of instructions on older platforms. Doing this will allow us to do some extra optimization on integer multiplies. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3bdbc1e436828606d0b549b9480e7cc28b42d159	07-May-2015	Ian Romanick <ian.d.romanick@intel.com>	nir: Delete all traces of nir_op_flog Nothing produces it, and nothing can consume it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e0a17f6e31a8cefc173ced5f53cb2d28a842fbb6	07-May-2015	Ian Romanick <ian.d.romanick@intel.com>	nir: Delete all traces of nir_op_fexp Nothing produces it, and nothing can consume it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e1ae0c3bc37be7b1de21ee248d674671d01da8e6	19-Feb-2015	Francisco Jerez <currojerez@riseup.net>	i965: Fix variable indexing of sampler arrays under non-uniform control flow. ARB_gpu_shader5 requires sampler array indexing expressions to be dynamically uniform, this however doesn't have any implications on the control flow that leads to the evaluation of that expression being uniform. Use emit_uniformize() to obtain an arbitrary live value from the binding table index calculation instead of assuming that the first channel is always live. Fixes the following Piglit test cases: arb_gpu_shader5/execution/sampler_array_indexing/fs-nonuniform-control-flow.shader_test arb_gpu_shader5/execution/sampler_array_indexing/vs-nonuniform-control-flow.shader_test part of the series: http://lists.freedesktop.org/archives/piglit/2015-February/014615.html Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b234537cc3e513ded9b5385d876e4c531f72af94	19-Feb-2015	Francisco Jerez <currojerez@riseup.net>	i965: Fix variable indexing of UBO arrays under non-uniform control flow. ARB_gpu_shader5 requires UBO array indexing expressions to be dynamically uniform, this however doesn't have any implications on the control flow that leads to the evaluation of that expression being uniform. Use emit_uniformize() to obtain an arbitrary live value from the binding table index calculation instead of assuming that the first channel is always live. Fixes the following Piglit tests: arb_gpu_shader5/execution/ubo_array_indexing/fs-nonuniform-control-flow.shader_test arb_gpu_shader5/execution/ubo_array_indexing/vs-nonuniform-control-flow.shader_test part of the series: http://lists.freedesktop.org/archives/piglit/2015-February/014616.html Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0c06d019bcf626b289ae94ca791dc25c216c1e5c	24-Apr-2015	Matt Turner <mattst88@gmail.com>	i965/fs: Fix code emission for imul_high in NIR. Copy over from brw_fs_visitor.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c68364ac341d5fbbc5b6dcf74812a776359c0168	10-Apr-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Use the correct offsets when handling register indirects Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
28e9601d0e681411b60a7de8be9f401b0df77d29	16-Apr-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Add a devinfo field to backend_visitor and use it for gen checks Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ceb6e5eebe13b85f57cf5a7a22371c10170943a3	14-Apr-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Remove the context parameter from brw_texture_offset It wasn't really being used anyway. We used it to assert that gpu_shader5 is supported in the back-end but that should be caught by the front-end. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
5af0604d528733af9113a6f8711c39796ce0ae40	07-Apr-2015	Matt Turner <mattst88@gmail.com>	i965/fs: Calculate delta_x and delta_y together. This lets SIMD16 programs on G45 and Gen5 use the PLN instruction. On Ironlake: total instructions in shared programs: 5634757 -> 5518055 (-2.07%) instructions in affected programs: 1745837 -> 1629135 (-6.68%) helped: 11439 HURT: 4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b6354d9bb077815d2e388dc5d0e7411ea6d89748	24-Jan-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Make INTEL_DEBUG=ann work with NIR. Now that we store a copy of the NIR shader, and don't immediately free it, we can use it in annotations as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
89c1feb78d010bc457f5d02be84c955eebf3549f	08-Apr-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Create NIR during LinkShader() and ProgramStringNotify(). Previously, we translated into NIR and did all the optimizations and lowering as part of running fs_visitor. This meant that we did all of that work twice for fragment shaders - once for SIMD8, and again for SIMD16. We also had to redo it every time we hit a state based recompile. We now generate NIR once at link time. ARB programs don't have linking, so we instead generate it at ProgramStringNotify time. Mesa's fixed function vertex program handling doesn't bother to inform the driver about new programs at all (which is rather mean), so we generate NIR at the last minute, if it hasn't happened already. shader-db runs ~9.4% faster on my i7-5600U, with a release build. v2: Check NirOptions != NULL in ProgramStringNotify(). Don't bother using _mesa_program_enum_to_shader_stage as we already know it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b3e286c4575bf6af343c1a03471fd876cdfb5c43	08-Apr-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Store num_direct_uniforms in the nir_shader. Storing this here is pretty sketchy - I don't know if any driver other than i965 will want to use it. But this will make it a lot easier to generate NIR code at link time. We'll probably rework it anyway. (Ian suggested making nir_assign_var_locations_scalar_direct_first simply modify the nir_shader's fields, rather than passing pointers to them. If this stays long term, we should do that. But Jason and I suspect we'll be reworking this area again in the near future.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f41f07f685e7f585e433b5fd1fadf602e74f0f1e	08-Apr-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Move lower_output_reads to brw_link_shader(). This makes it so emit_nir_code() doesn't modify the GLSL IR. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
024ecc783b763712d2896fd315d8b5222c27b1ec	11-Apr-2015	Matt Turner <mattst88@gmail.com>	i965/fs/nir: Mark fallthrough. /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
99264b7f37dc92bcb3a9ae226e00c9300414431c	08-Apr-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Make nir_lower_samplers take a gl_shader_stage, not a gl_program *. We don't actually need a gl_program struct. We only used it to translate prog->Target (i.e. GL_VERTEX_PROGRAM) to the gl_shader_stage (i.e. MESA_SHADER_VERTEX). We may as well just pass that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
d131630c0825f199768965c504b6fa1e593d03d5	02-Apr-2015	Matt Turner <mattst88@gmail.com>	nir: Remove fsin_reduced/fcos_reduced. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1bd1fc248ce5ecc6882309ab64ec61835fea1eda	03-Apr-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Use brw_nir_cubemap_normalize for NIR shaders Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
cb966fb2bea77b1d7b1bdb6597b7b85d810f2d0a	01-Apr-2015	Eric Anholt <eric@anholt.net>	i965: Use the tex projector lowering pass instead of hand-rolling it. This only impacts the ARB_fp path. We can't quite disable the GLSL-level lowering pass, because it needs to apply before brw_do_lower_unnormalized_offset(). total instructions in shared programs: 5667857 -> 5667847 (-0.00%) instructions in affected programs: 1114 -> 1104 (-0.90%) helped: 16 HURT: 6 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b9d7454571029ab330f28164fe6869f5e455ca90	01-Apr-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Run DCE again before going out of SSA We run lowering and optimization passes that might leave garbage lying around. This keeps the FS cse from having to clean it up. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
37703040a142da6bc7c458479a70e35118e10e6b	23-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Run the ffma peephole after the rest of the optimizations The idea here is that fusing multiply-add combinations too early can reduce our ability to perform CSE and value-numbering. Instead, we split ffma opcodes up-front, hope CSE cleans up, and then fuse after-the-fact. Unless an algebraic pass does something silly where it inserts something between the multiply and the add, splitting and re-fusing should never cause a problem. We run the late algebraic optimizations after this so that things like compare-with-zero don't hurt our ability to fuse things. shader-db results for fragment shaders on Haswell: total instructions in shared programs: 4390538 -> 4379236 (-0.26%) instructions in affected programs: 989359 -> 978057 (-1.14%) helped: 5308 HURT: 97 GAINED: 78 LOST: 5 This does, unfortunately, cause some substantial hurt to a shader in Kerbal Space Program. However, the damage is caused by changing a single instruction from a ffma to an add. This, in turn, decreases register pressure in one part of the program causing it to fail to register allocate and spill. Given the overwhelmingly positive results in other shaders and the fact that the NIR for the Kerbal shaders is actually better, this should be considered a positive. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a8c8b3b8720bb7ce8ac1cb94815ed36d8c881f66	21-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Add a dedicated ffma peephole optimization i965/nir: Use the dedicated ffma peephole total instructions in shared programs: 4418748 -> 4394618 (-0.55%) instructions in affected programs: 1292790 -> 1268660 (-1.87%) helped: 5999 HURT: 457 GAINED: 4 LOST: 9 Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
da294f9b2f666f487001b2a25627c867c40eb3d9	24-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir/algebraic: Add a separate section for "late" optimizations i965/nir: Use the late optimizations Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
826d3afb8f421a62020308813397e541e672381e	30-Jan-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Add ARB_fragment_program support to the NIR backend. Use prog_to_nir where we would normally call glsl_to_nir, handle program parameter lists, and skip a few things that don't exist. Using NIR generates much better shader code than Mesa IR, since we get real optimizations, as opposed to prog_optimize: total instructions in shared programs: 314007 -> 279892 (-10.86%) instructions in affected programs: 285173 -> 251058 (-11.96%) helped: 2001 HURT: 67 GAINED: 4 LOST: 7 v2: Change early return in nir_setup_uniforms to if/else (Jordan). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
649173b473ded2d7b1aded91cd4aab42eaeb5766	01-Feb-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Implement texture projection support. Our fragment program backend implements support for TXP directly, and there's no NIR lowering pass to remove the projection. When we switch fragment program support over to NIR, we need to support it somehow. It's easy enough to support directly. v2: Split out offset/tex_offset rename (requested by Jordan). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
0a9bcf9e39409ea5acfdfbcf0c388e41e0f9ea45	25-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Rename offset to tex_offset to avoid shadowing offset(). fs_visitor::nir_emit_texture() created an fs_reg variable called offset, which shadowed the offset() helper function in brw_ir_fs.h. Rename the variable to tex_offset so we can still call offset(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a6d4a108d27f2b635748c583fe0507f04b3b493e	18-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Use signed integer type for booleans FS instructions with NIR on i965: total instructions in shared programs: 2663561 -> 2619051 (-1.67%) instructions in affected programs: 1612965 -> 1568455 (-2.76%) helped: 5455 HURT: 12 FS instructions with NIR on g4x: total instructions in shared programs: 2352633 -> 2307908 (-1.90%) instructions in affected programs: 1441842 -> 1397117 (-3.10%) helped: 5463 HURT: 11 FS instructions with NIR on ilk: total instructions in shared programs: 3997305 -> 3934278 (-1.58%) instructions in affected programs: 2189409 -> 2126382 (-2.88%) helped: 8969 HURT: 22 FS instructions with NIR on hsw (snb and ivb were similar): total instructions in shared programs: 4109389 -> 4109242 (-0.00%) instructions in affected programs: 109869 -> 109722 (-0.13%) helped: 339 HURT: 190 No SIMD16 programs were gained or lost on any platform Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
41d64fa184671d372f6630deaf2401b00d4e984a	17-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Do boolean resolves on GEN <= 5 v2: A couple comment clean-ups from Matt Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2612e569e04e29500f81ed233bd86b45ef583495	17-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Properly set the predicate on the SEL used in min/max Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
235c728020af352ee0f4b7d598c951f4a4e83232	17-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Use emit_lrp for emitting flrp Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8a0946f3b1522e5f91afe14c8c3b22ba6009ed04	06-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Make an emit_discard_jump() function to reduce duplication. This is already copied in two places, and I want to copy it to a third place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
46c35c61e9c5c1b56fdd9fcd4eb45591dd16d21d	18-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Sort uniforms direct-first and use two different uniform registers Previously, we put all the uniforms into one big array. The problem with this approach is that, as soon as there was one indirect array access, the backend would decide that the entire large array should be pull constants. This commit splits the array in half: first direct-only uniforms and then potentially-indirect uniforms. This may not be optimal, but it does let the backend promote things to push constants. Shader-db results on HSW: total instructions in shared programs: 4114840 -> 4112172 (-0.06%) instructions in affected programs: 43316 -> 40648 (-6.16%) helped: 116 HURT: 0 v2: Set param_size[num_direct_uniforms] only if we have indirect uniforms. This caused a bug that, strangely enough, only showed up on Broadwell vertex shaders. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
25db44a84597960a6aea6b252bcf2c3d7e17fc74	18-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir/lower_io: Make variable location assignment a manual operation Previously, we just assigned variable locations in nir_lower_io. Now, we force the user to assign variable locations for us. This gives the backend a bit more control over where variables are placed. v2: Rename from _packed to _scalar Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
639115123efe7f71d432e24b1719adda7d23e97e	18-Mar-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Use a list instead of a hash_table for inputs, outputs, and uniforms We never did a single hash table lookup in the entire NIR code base that I found so there was no real benefit to doing it that way. I suppose that for linking, we'll probably want to be able to lookup by name but we can leave building that hash table to the linker. In the mean time this was causing problems with GLSL IR -> NIR because GLSL IR doesn't guarantee us unique names of uniforms, etc. This was causing massive rendering issues in the unreal4 Sun Temple demo. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7ef0b6b367f73e24e6dd47a15d439775d3dd1297	09-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Add VS output support to nir_setup_outputs(). Adapted from fs_visitor::visit(ir_variable *). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
eb137117b7db6c78d6a1662730524d622301c708	09-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Handle VS inputs in the NIR backend. (Jason noted that this is not a good long term solution, and we should instead improve nir_lower_io so that this extra set of MOVs is unnecessary. I tend to agree, but decided we could do that as a follow-up improvement.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a5c4e7fcf52c048c02e4ee14413a574b4ff3695e	09-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Refactor fs_visitor::nir_setup_inputs(). No functional change. In preparation for supporting vertex shaders, this adds a switch statement on shader stage (since vertex attributes and fragment shader varyings will need different handling). It also renames "varying" to "input", to be more general. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
34628a838aa96643be02cd23eb55af50025dd422	09-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Implement NIR intrinsics for loading VS system values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b9dea9bc45299f19c445170a4cac27810547de00	09-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Lower to registers a bit later. We can't safely call nir_optimize() with registers present, since several passes called in the loop can't handle registers, and will fail asserts. Notably, nir_lower_vec_alus() and nir_opt_algebraic() really don't want registers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1f0067811c059fb3b284a2169e94fbdec7a4b909	09-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Optimize after nir_lower_var_copies(). Array variable copy splitting generates a bunch of stuff we want to clean up before proceeding. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
1d8ef6ba606a88239de633e5abcc19471c9d3cf4	09-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Store a pointer to brw_sampler_prog_key_data in the visitor. The NIR backend hardcodes brw_wm_prog_key at the moment, which won't work when we support scalar VS. We could use get_tex(), but it's a static method. I was going to promote it to fs_visitor, but then realized that both parameters (stage and key) are already members. It then occurred to me that we could just set up a pointer in the constructor, and skip having a function altogether. This patch also converts all existing users to use key_tex. v2: Make key_tex a "const brw_sampler_prog_key_data *" instead of non-const; word-wrap some lines. (Review comments from Topi.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e4f26acc08a3d852e60a27d0f0da7001944cb607	28-Feb-2015	Ian Romanick <ian.d.romanick@intel.com>	i965/fs: Silence unused parameter warning brw_fs_visitor.cpp:2162:56: warning: unused parameter 'offset_components' [-Wunused-parameter] fs_reg offset_value, unsigned offset_components, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c6f2abe67e38c52361a1d342dca6ec5ed7747913	06-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Plumb the shader stage into glsl_to_nir(). The next commit needs to know the shader stage in glsl_to_nir(). To facilitate that, we pass the gl_shader rather than the raw exec_list of instructions. This has both the exec_list and the stage. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b200cbb0a41aaebb007668f870a483f0b9ecd898	06-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Add native_integers to nir_shader_compiler_options. glsl_to_nir, tgsi_to_nir, and prog_to_nir all want to know whether the driver supports native integers. Presumably other passes may as well. Adding this to nir_shader_compiler_options is an easy way to provide that information, as it's accessible via nir_shader::options. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a55da73be46b4576015417b2dff71a719bc8b797	06-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Try to make sense of the nir_shader_compiler_options code. The code in glsl_to_nir is entirely dead, as we translate from GLSL to NIR at link time, when there isn't a _mesa_glsl_parse_state to pass, so every caller passes NULL. glsl_to_nir seems like the wrong place to try and create the shader compiler options structure anyway - tgsi_to_nir, prog_to_nir, and other translators all would have to duplicate that code. The driver should set this up once with whatever settings it wants, and pass it in. Eric also added a NirOptions field to ctx->Const.ShaderCompilerOptions[] and left a comment saying: "The memory for the options is expected to be kept in a single static copy by the driver." This suggests the plan was to do exactly that. That pointer was not marked const, however, and the dead code used a mix of static structures and ralloced ones. This patch deletes the dead code in glsl_to_nir, instead making it take the shader compiler options as a mandatory argument. It creates an (empty) options struct in the i965 driver, and makes NirOptions point to that. It marks the pointer const so that we can actually do so without generating "discards const qualifier" compiler warnings. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a84f66a9b6cf46bb19ca71faca5b1d6d81209caf	06-Mar-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Resolve source modifiers on Gen8+ logic operations. On Gen8+, AND/OR/XOR/NOT don't support the abs() source modifier, and negate changes meaning to bitwise-not (~, not -). This isn't what NIR expects, so we should resolve the source modifiers via a MOV. +30 Piglits (fs-op-bit{and,or,xor}-not-abs-*). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
5666d9266fd43d552c76ce7b472abc0afde6c32b	28-Feb-2015	Matt Turner <mattst88@gmail.com>	i965/fs/nir: Mark fallthrough. /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
54cd2f7c9655ccbb00209b1f49692196df2a33a1	28-Feb-2015	Matt Turner <mattst88@gmail.com>	i965/fs/nir: Mark fallthrough. /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b8a1637119249c1d5e76c27d0053360bbb7f4e77	27-Feb-2015	Ian Romanick <ian.d.romanick@intel.com>	i965/fs/nir: Use emit_math for nir_op_fpow It appears that all the other instructions that need it already use it. This one just got missed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8eb6c109994de2827b0a1340a2dc8d933edaf5e0	20-Aug-2014	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Handle conditional discards. The discard condition tells us which channels we want killed. We want to invert that condition to get the channels that should survive (remain live) in f0.1. Emit a CMP to negate it. Nothing generates these today, but that will change shortly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b8582d18e6b0737c4a34777837c10898ed177e30	15-Feb-2015	Matt Turner <mattst88@gmail.com>	i965/fs/nir: Optimize integer multiply by a 16-bit constant. Gen8+ support was just broken, since MUL now consumes 32-bits from both sources. Fixes 986 piglit tests on my BDW. total instructions in shared programs: 7753873 -> 7753522 (-0.00%) instructions in affected programs: 28164 -> 27813 (-1.25%) helped: 77 GAINED: 47 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7a997a386375a98b70ae5e1d880c8d47f236de8d	15-Feb-2015	Matt Turner <mattst88@gmail.com>	i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0. total instructions in shared programs: 7756214 -> 7753873 (-0.03%) instructions in affected programs: 455452 -> 453111 (-0.51%) helped: 2333 Reviewed-by: Eric Anholt <eric@anholt.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2bd139e18c941e7ea0870ba43314a5c10fd5bb12	19-Feb-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Un-hardcode DEBUG_WM, "FS", and "fragment". These code paths can (or will) be used for other shader stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
231267bf011e1fa6edb52ffad27fcbca8e0e28e1	31-Jan-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Use VARYING_SLOT checks rather than strcmp(). Comparing the location field is equivalent and more efficient. We'll also need this when we start using NIR for ARB programs, as our NIR converter will set the location field correctly, but probably won't use the GLSL names for these concepts. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3c57a595276d0614940d70315e78de0d83bf74ac	14-Feb-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Don't support gl_FrontFacing as an input variable Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
785b22caee28892d9d995a743de1dee5434c9ce1	14-Feb-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Add support for nir_intrinsic_load_front_face Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
6489cb1ae6f3cb999b1a9c60d941ef4c388febd1	11-Feb-2015	Eric Anholt <eric@anholt.net>	i965: Shut up a compiler warning about uninitialized var. We always pass this argument, even if it won't be used by the particular texture op. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ccbe15f3325d7a6d04d0ea18227a08f53decec16	03-Feb-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/fs: Fix saturate on MAD and LRP with the NIR backend. Fixes misrendering in "Witcher 2" with INTEL_USE_NIR=1, and probably many other programs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ab24e1270674192d2aeb4ba0cc39497edb3342f8	03-Feb-2015	Connor Abbott <cwabbott0@gmail.com>	i965/nir: use redundant phi optimization Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
8776b1b14b229d110f283f5da8c3c36261068ede	22-Jan-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Get rid of get_alu_src Originally, get_alu_src was supposed to handle resolving swizzles and things like that. However, now that basically every instruction we have only takes scalar sources, we don't really need it anymore. The only case where it's still marginally useful is for the mov and vecN operations that are left over from SSA form. We can handle those cases as a special case easily enough. As a side-effect, we don't need the vec_to_movs pass anymore. v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Rework the way we detect if we need an extra copy for swizzling. The old code involved a pile of confusing switch fall-throughs; we now use a loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
112d738b91aac44c2509aafe68bdbf9ab74bb3c1	23-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Use NIR's scalarizing abilities and stop handling vectors Now that we can scalarize with NIR, there's no need for all this code anymore. Let's get rid of it and just do scalar operations. v2: run copy prop before lowering phi nodes v3: Get rid of the "emit(...)->saturate = foo" pattern v4: Run alu_to_scalar as an optimization pass total instructions in shared programs: 5998321 -> 5974070 (-0.40%) instructions in affected programs: 732075 -> 707824 (-3.31%) helped: 3137 HURT: 191 GAINED: 18 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f02f1af9f7582bc9ca685ef240751aa57ce42638	23-Jan-2015	Ian Romanick <ian.d.romanick@intel.com>	i965/fs: Allow SIMD16 on pre-SNB when try_replace_with_sel is successful If try_replace_with_sel is able to replace the flow control with a SEL instruction, then there is no flow control... failing SIMD16 because of nonexistent flow control is wrong. No piglit regressions on any i965 platform in Jenkins. total instructions in shared programs: 4382707 -> 4382707 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 GAINED: 2089 LOST: 0 No other platforms affected in shader-db. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
d7743bb1c2d5cfe44a018251d21def18eb6d4b97	21-Jan-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Report NIR instruction counts (in SSA form) via KHR_debug. This allows us to count NIR instructions via shader-db. Use "run" as normal. The results file will contain both NIR and assembly. Then, to generate a NIR report: ./report.py <(grep NIR results/foo) <(grep NIR results/bar) Or, to generate an i965 report: ./report.py <(grep -v NIR results/foo) <(grep -v NIR results/bar) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f3e06fcc6add67ed3eeecbce600994ef3220ec1c	20-Jan-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Print NIR on INTEL_DEBUG=fs. This is useful for debugging and looking for optimization opportunities. It will need to be expanded when we add support for other scalar stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
faa38e16aadd9f2a2416fcb5087d7f8fc8178bf2	20-Jan-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Do optimizations again just before lowering source mods. We want to run CSE and algebraic optimizations again after lowering IO. Some of the passes in the optimization loop don't handle saturates and other modifiers, so run it before lowering to source modifiers. total instructions in shared programs: 6046190 -> 6045768 (-0.01%) instructions in affected programs: 22406 -> 21984 (-1.88%) helped: 47 HURT: 0 GAINED: 0 LOST: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
45123ee8186cff6bb819b9c9e44d6d5a1bb41923	16-Jan-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Use offset() instead of altering reg_offset directly. offset() properly handles reg_width, so it'll work for SIMD16. While we're in the area, simplify a few cases, and use retype() to cut a few more lines of code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3f263ffbb37d77f97a86686e1d2d5eeabf4ecae6	16-Jan-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Replace fs_reg(GRF, virtual_grf_alloc(...)) with vgrf(...). brw_fs_nir.cpp creates almost all of its registers via: fs_reg reg = fs_reg(GRF, virtual_grf_alloc(num_components)); When we add SIMD16 support, we'll need to set reg->width = 16 and double the VGRF size...on pretty much every VGRF it allocates. This patch replaces that pattern with a new "vgrf" helper method: fs_reg reg = vgrf(num_components); The new function correctly takes reg_width into account. For now, reg_width is always 1, so this should have no functional change. v2: Just make vgrf() account for reg_width right away, rather than changing the behavior in the next patch. v3: Replace one last virtual_grf_alloc I missed. It's used in code that only runs for dispatch_width == 8, so it doesn't matter, but consistency is nice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
d1533d87cc7e2c39e7ce9dc838b45a2c39c96e33	16-May-2014	Kenneth Graunke <kenneth@whitecape.org>	i965: Replace fs_reg(fs_visitor, type) with fs_visitor::vgrf(type). I dislike how fs_reg has a constructor that knows about fs_visitor. Apart from that, it stands alone, with no need to interact with the rest of the compiler. Which is sensible - a class that represents a register should do just that. Allocating virtual register numbers should be left up to the compiler (fs_visitor). This patch replaces the constructor with a new fs_visitor::vgrf method, eliminating fs_reg's dependency on fs_visitor. It ends up being no more code. v2: Rebase from May 2014 -> January 2015. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c56adc68e2e75276785fd933b47621c87f9fd3ee	15-Jan-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Do a final copy lowering pass before lowering locals to regs Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
55b5058e69859ba28c2f32de6edf5f0df3c6c28c	14-Jan-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Rename lower_variables to lower_vars_to_ssa The original name wasn't particularly descriptive. This one indicates that it actually gives you SSA values as opposed to the old pass which lowered variables to registers. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
4aa6162f6ecf96c7400c17c310eba0cfd0f5e083	10-Jan-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array This solves a number of problems. First is the ability to change the number of sources that a texture instruction has. Second, it solves the dilemma that may occur if a texture instruction has more than 4 sources. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
cb53aacaa1555b98fa77146492e96a7e3d7631ba	17-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Handle sample ID, position, and mask better Before, we were emitting the full pile of setup instructions for sample_id and sample_pos every time they were used. With this commit, we emit them in their own pass once at the beginning of the shader and simply emit uses later on. When it comes time for setting up VS, we can put setup for its special values in the same pass. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2c7da78805175f36879111306ac37c12d33bf65b	16-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Make load_const SSA-only As it was, we weren't ever using load_const in a non-SSA way. This allows us to substantially simplify the load_const instruction. If we ever need a non-SSA constant load, we can do a load_const and an imov. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
951a7f23a076c1570f68b50fc7d03a33eb5145e7	16-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Move the other lowering passes to before out-of-SSA Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
821e75a16038aba23aa0d46c081c99f07ee44ecd	16-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir/lower_atomics: Use/support SSA Previously, lower_atomics was non-SSA only. We assert-failed if the destination of an atomic operation intrinsic was an SSA def and we used temporary registers for computing offsets. This commit changes both of these behaviors. We now use SSA values for computing offsets (so we can optimize them) and we handle SSA destinations. We also move the pass to run before we go out of SSA on i965 as it now generates SSA values. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
dfb3abbaecfbe30b8858a5428c604f9d90f65505	13-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Remove predication We stopped generating predicates in glsl_to_nir some time ago. Right now, it's all dead untested code that I'm not convinced always worked in the first place. If we decide we want them back, we can revert this patch. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b3fd098e7daa491637d66d03366b67c989937a1f	13-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Make bcsel a fully vector operation Previously, the condition was a scalar that applied to all components simultaneously. As of this commit, the condition is a vector and each component is switched separately. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
3c2c0a164c2308a5777d7a59b6da4b44a57ba6e2	06-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Add support for indirect texture arrays v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Use the nir_tex_src_sampler_offset source type instead of the sampler_indirect thing that I cooked up before. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
62ac0ee804027d1a1fa9864e03428ced7bd8510a	05-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir/tex_instr: Rename the indirect source type and add an array size In particular, we rename nir_tex_src_sampler_index to _sampler_offset and add a sampler_array_size field to nir_tex_instr. This way we can pass the size of sampler arrays through to backends even after removing the variable information and, with it, the type. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
534d145e5ea039d57833395a36eed90721f6b272	09-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Use a source for uniform buffer indices instead of an index In GLSL-to-NIR we were just setting the base index to 0 whenever there was an indirect so having it expressed as a sum makes no sense. Also, while a base offset may make sense for the memory location (first element in the array, etc.) it makes less sense for the actual uniform buffer index. This may change later, but it seems to make more sense for now. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
cd4b995254fe29bae9ab5a9563cc615274d361ed	05-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Make texture instruction names more consistent This commit renames nir_instr_as_texture to nir_instr_as_tex and renames nir_instr_type_texture to nir_instr_type_tex to be consistent with nir_tex_instr. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
f77f4c00ce4834ca14dd27bed28949dc012e7daf	15-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Add a basic constant folding pass Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
d5410bd8f65b8d0f845dc8beccd498b6fa098660	12-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Add an algebraic optimization pass This pass uses the previously built algebraic transformations framework and should act as an example for anyone else wanting to make an algebraic transformation pass for NIR. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
919426631b7bd32f012eb9b6ffd8a9aff74788e1	13-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Add a lowering pass for adding source modifiers where possible Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
a1c259d6668bf934a79e7815dff3636783adea9f	05-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e257a5112476c47928b2fa2a2f2ea3108d13264b	04-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Add a has_indirect flag and clean up some of the input/output code Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
27663dbe8edfb7583d9d8fc3704a04a5c837fe05	04-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Vectorize intrinsics We used to have the number of components built into the intrinsic. This meant that all of our load/store intrinsics had vec1, vec2, vec3, and vec4 variants. This led to piles of switch statements to generate the correct intrinsic names, and introspection to figure out the number of components. We can make things much nicer by allowing "vectorized" intrinsics. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
26865f858d48dd473fc294f7fe14c964715cd55e	27-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Use the new variable lowering code This commit switches us over to the new variable lowering code which is capable of properly handling lowering indirects as we go. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7c5284d0e52add862821ab13be61228e53867e62	02-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Don't dump the shader. This is killing piglit. I'll leave the logging local Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
04fb073344b03a02d56291dd273bdef96147e857	14-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Properly saturate multiplies Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c2abfc0b86628bb1b756e4ef125c97cb4386aea2	13-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Handle SSA constants Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
e0aa4c6272851ed418dfa18ee6014f40b0e266c2	12-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Use an array rather than a hash table for register lookup Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
20adc516e27e390b1558703720a2a2129c9e8ad5	12-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Add the CSE pass and actually run in a loop Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
20a581260633cb6d0d8ca571e7f3e886298a5733	11-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Add a fused multiply-add peephole /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c937bdb3c2c41c5bf914ae7ead9223b8b87e9fe2	08-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Turn on the peephole select optimization Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2bd5a24a5e440ba0072528fdb32892cf8c935a8e	07-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Validate optimization passes Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
10adf8fc858c21cd95b3e02a8d6abee563ca1046	07-Nov-2014	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Differentiate between signed and unsigned versions of find_msb We also make the return types match GLSL. The GLSL spec specifies that findMSB and findLSB return a signed integer. Previously, nir had them return unsigned. This updates nir's behavior to match what GLSL expects. We also update the nir-to-fs generator to take the new instructions. While we're at it, we fix the case where the input to findMSB is zero. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
4285aaecdceac55005e1ea2e75e17c6490d158a9	12-Dec-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Do retyping for ALU sources in get_nir_alu_src Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
63eb32950e64715a7a686ae9da82b55954db9ab8	22-Oct-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Convert the shader to/from SSA Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
ff0a9fcf332ce319fae1eb53f3e5d863d0289cbf	21-Oct-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Don't duplicate emit_general_interpolation Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
744b4e9348db1767a772fda2a5cbe33abbba7db1	16-Oct-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Add atomic counters support Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
95fbd6e1eed58f1f87aaa425bb5312a92db29d21	15-Oct-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Handle coarse/fine derivatives Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
4582341ea74a076c981c962f1a01311bfa3bf991	16-Oct-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Add support for sample_pos and sample_id /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
7cd1537aae28b9189b1251688ac1a5dc9d36cc80	16-Oct-2014	Jason Ekstrand <jason.ekstrand@intel.com>	Fix up varying pull constants Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b092bc9805f0f28209fc70fb367e0dc26e294317	16-Oct-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Use the correct texture offset immediate Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c181ff268e4787056fdee417d30d52b1098fe211	15-Oct-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Use the correct types for texture inputs Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
c2ded36bb60d3dfad0036dac7adbf7718968ccf2	15-Oct-2014	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs_nir: Make the sampler register always unsigned Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
2faf7f87d6a1c00b3f3d3907178a2eeeefa5d2a9	15-Aug-2014	Connor Abbott <connor.abbott@intel.com>	i965/fs: add a NIR frontend This is similar to the GLSL IR frontend, except consuming NIR. This lets us test NIR as part of an actual compiler. v2: Jason Ekstrand <jason.ekstrand@intel.com>: - Make brw_fs_nir build again - Only use NIR if INTEL_USE_NIR is set - Whitespace fixes /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_nir.cpp