History log of /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
Revision	Date	Author	Comments
0d5071db5e50629a63490639a3c86dfc65bf27ab	13-Jan-2017	Kenneth Graunke <kenneth@whitecape.org>	i965: Move Gen4-5 interpolation stuff to brw_wm_prog_data. This fixes glxgears rendering, which had surprisingly been broken since late October! Specifically, commit 91d61fbf7cb61a44adcaae51ee08ad0dd6b. glxgears uses glShadeModel(GL_FLAT) when drawing the main portion of the gears, then uses glShadeModel(GL_SMOOTH) for drawing the Gouraud-shaded inner portion of the gears. This results in the same fragment program having two different state-dependent interpolation maps: one where gl_Color is flat, and another where it's smooth. The problem is that there's only one gen4_fragment_program, so it can't store both. Each FS compile would trash the last one. But, the FS compiles are cached, so the first one would store FLAT, and the second would see a matching program in the cache and never bother to compile one with SMOOTH. (Clearing the program cache on every draw made it render correctly.) Instead, move it to brw_wm_prog_data, where we can keep a copy for every specialization of the program. The only downside is bloating the structure a bit, but we can tighten that up a bit if we need to. This also lets us kill gen4_fragment_program entirely! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
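The failure mode described in this entry can be sketched in a few lines of C. This is a simplified model, not the driver's code: `struct program`, `get_variant`, `shared_interp`, and the fixed-size cache are hypothetical names invented for illustration, and the cache key is reduced to a single interpolation mode. The point it tries to show is that state kept in one per-program slot goes stale on a cache hit, while state kept with each cached variant (the role `brw_wm_prog_data` plays after this commit) stays correct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum interp_mode { INTERP_FLAT, INTERP_SMOOTH };

/* Hypothetical per-variant output, standing in for brw_wm_prog_data. */
struct prog_data {
   enum interp_mode color_interp;
};

struct cache_entry {
   bool valid;
   enum interp_mode key;     /* state the variant was specialized for */
   struct prog_data data;    /* one copy per specialization */
};

#define CACHE_SIZE 4

/* Hypothetical program object holding both storage strategies. */
struct program {
   enum interp_mode shared_interp;   /* single per-program slot (fragile) */
   struct cache_entry cache[CACHE_SIZE];
   int compiles;
};

/* Return the variant for `key`, compiling only on a cache miss. */
const struct prog_data *get_variant(struct program *p, enum interp_mode key)
{
   struct cache_entry *free_slot = NULL;

   for (size_t i = 0; i < CACHE_SIZE; i++) {
      if (p->cache[i].valid && p->cache[i].key == key)
         return &p->cache[i].data;   /* hit: no compile, shared slot untouched */
      if (!p->cache[i].valid && free_slot == NULL)
         free_slot = &p->cache[i];
   }

   assert(free_slot != NULL);        /* sketch only: no eviction policy */

   /* "Compile": every compile overwrites the single shared slot. */
   p->shared_interp = key;
   p->compiles++;

   free_slot->valid = true;
   free_slot->key = key;
   free_slot->data.color_interp = key;
   return &free_slot->data;
}
```

Requesting FLAT, then SMOOTH, then FLAT again compiles twice; the third call is a cache hit, so `shared_interp` still says SMOOTH while the returned `prog_data` correctly says FLAT.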
56ee2df4bf9b1e8c26cf8689f5ef20237c95466b	13-Jan-2017	Juan A. Suarez Romero <jasuarez@igalia.com>	i965/vec4: Fix mapping attributes This patch reverts 57bab6708f2bbc1ab8a3d202e9a467963596d462, which was causing issues with ILK and earlier VS programs. 1. brw_nir.c: Revert "i965/vec4/nir: vec4 also needs to remap vs attributes" Do not perform a remap in vec4 backend. Rather, do it later when setup attributes 2. brw_vec4.cpp: This fixes mapping ATTRx to proper GRFn. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99391 [jordan.l.justen@intel.com: merge Juan's two patches from bugzilla] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
57bab6708f2bbc1ab8a3d202e9a467963596d462	22-Apr-2016	Alejandro Piñeiro <apinheiro@igalia.com>	i965/vec4/nir: vec4 also needs to remap vs attributes Doubles need extra space, so we would need to do a remapping for vec4 too in order to take that into account. We reuse the already existing remap_vs_attrs, but passing is_scalar, so they could remap accordingly. v2: code-format remap_vs_attrs_params initialization (Matt) Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
b4c44ff08c1154ff3790de89f29520d178e9e0ef	08-Aug-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Use the nir_move_comparisons pass. While the stats below are encouraging, this pass will also become very useful for avoiding regressions once brw_do_channel_expressions() and brw_do_vector_splitting() are disabled. On Broadwell: total instructions in shared programs: 13078787 -> 13060898 (-0.14%) instructions in affected programs: 1809827 -> 1791938 (-0.99%) helped: 4527 HURT: 157 total cycles in shared programs: 256562762 -> 256590424 (0.01%) cycles in affected programs: 159749392 -> 159777054 (0.02%) helped: 5583 HURT: 2289 total spills in shared programs: 14929 -> 14923 (-0.04%) spills in affected programs: 62 -> 56 (-9.68%) helped: 1 HURT: 0 total fills in shared programs: 20144 -> 20141 (-0.01%) fills in affected programs: 253 -> 250 (-1.19%) helped: 1 HURT: 3 LOST: 0 GAINED: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
b5e682a1efa0a929574e739807ac2b046e004561	10-Aug-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Move nir_lower_locals_to_regs a bit later. I'm going to add a boolean scheduling pass that I want run late, but after copy propagation and dead code elimination. Yet, I don't want to have to think about registers. So, move the register conversion a little later. No impact on shader-db. Suggested by Jason Ekstrand. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
5edc3381628d1db4468f31b1c66bb518146e35b5	09-Jan-2017	Kenneth Graunke <kenneth@whitecape.org>	compiler: Merge shader_info's tcs and tes structs. Annoyingly, SPIR-V lets you specify all of these fields in either the TCS or TES, which means that we need to be able to store all of them for either shader stage. Putting them in a union won't work. Combining both is an easy solution, and given that the TCS struct only had a single field, it's pretty inexpensive. This patch renames the combined struct to "tess" to indicate that it's for tessellation in general, not one of the two stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
c2acf97fcc9b32eaa9778771282758e5652a8ad4	16-Dec-2016	Juan A. Suarez Romero <jasuarez@igalia.com>	nir/i965: use two slots from inputs_read for dvec3/dvec4 vertex input attributes So far, inputs_read was a bitmap tracking which vertex input locations were being used. In OpenGL, an attribute bigger than a vec4 (like a dvec3 or dvec4) consumes just one location, like any smaller attribute. So we mark the proper bit in inputs_read, and also the same bit in double_inputs_read if the attribute is a dvec3/dvec4. But in Vulkan, this is slightly different: a dvec3/dvec4 attribute consumes two locations, not just one. And hence two bits would be marked in inputs_read for the same vertex input attribute. To avoid handling two different situations in NIR, we just choose the latter one: in OpenGL, when creating NIR from GLSL/IR, any dvec3/dvec4 vertex input attribute is marked with two bits in the inputs_read bitmap (and also in the double_inputs_read), and following attributes are adjusted accordingly. As an example, if in our GLSL/IR shader we have three attributes: layout(location = 0) vec3 attr0; layout(location = 1) dvec4 attr1; layout(location = 2) dvec3 attr2; then in our NIR shader we put attr0 in location 0, attr1 in locations 1 and 2, and attr2 in locations 3 and 4. Checking carefully, basically we are using slots rather than locations in NIR. When emitting the vertices, we do an inverse map to know the corresponding location for each slot. v2 (Jason): - use two slots from inputs_read for dvec3/dvec4 NIR from GLSL/IR. v3 (Jason): - Fix commit log error. - Use ladder ifs and fix braces. - elements_double is divisible by 2, don't need DIV_ROUND_UP(). - Use if ladder instead of a switch. - Add comment about hardware restriction in 64bit vertex attributes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
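The location-to-slot adjustment in this entry can be modelled with a small C sketch. It is not the Mesa implementation: `struct vs_attr`, `assign_attr_slots`, and `slot_to_location` are hypothetical names, and the sketch assumes the attributes occupy consecutive GLSL locations starting at 0, with the array index standing in for the location.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute record; the array index is the GLSL location. */
struct vs_attr {
   bool is_dvec3_or_dvec4;  /* input: does the attribute need two slots? */
   unsigned first_slot;     /* output: first slot assigned to it */
   unsigned num_slots;      /* output: 1, or 2 for dvec3/dvec4 */
};

/* Walk the attributes in location order and hand out slots.
 * Returns the total number of slots consumed. */
unsigned assign_attr_slots(struct vs_attr *attrs, size_t count)
{
   unsigned next_slot = 0;

   assert(attrs != NULL || count == 0);

   for (size_t loc = 0; loc < count; loc++) {
      attrs[loc].first_slot = next_slot;
      attrs[loc].num_slots = attrs[loc].is_dvec3_or_dvec4 ? 2 : 1;
      next_slot += attrs[loc].num_slots;
   }
   return next_slot;
}

/* Inverse map: which location owns a given slot? Returns -1 if none. */
int slot_to_location(const struct vs_attr *attrs, size_t count, unsigned slot)
{
   for (size_t loc = 0; loc < count; loc++) {
      if (slot >= attrs[loc].first_slot &&
          slot < attrs[loc].first_slot + attrs[loc].num_slots)
         return (int)loc;
   }
   return -1;
}
```

For the commit's own example (vec3, dvec4, dvec3 at locations 0, 1, 2), the sketch assigns slot 0 to attr0, slots 1 and 2 to attr1, and slots 3 and 4 to attr2, and the inverse map sends slot 4 back to location 2.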
45912fb908f7a1d2efbce0f1dbe81e5bc975fbe1	10-Dec-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/compiler: Use the new nir_opt_copy_prop_vars pass We run this after nir_lower_vars_to_ssa so that as many load/store_var intrinsics as possible before copy_prop_vars executes. This is because the pass isn't particularly efficient (it does a lot of linear walks of a linked list) so we'd like as much of the work as possible to be done before copy_prop_vars runs. Shader DB results on Sky Lake: total instructions in shared programs: 12020290 -> 12013627 (-0.06%) instructions in affected programs: 26033 -> 19370 (-25.59%) helped: 16 HURT: 13 total cycles in shared programs: 137772848 -> 137549012 (-0.16%) cycles in affected programs: 6955660 -> 6731824 (-3.22%) helped: 217 HURT: 237 total loops in shared programs: 3208 -> 3208 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 4112 -> 4057 (-1.34%) spills in affected programs: 483 -> 428 (-11.39%) helped: 2 HURT: 0 total fills in shared programs: 5519 -> 5102 (-7.56%) fills in affected programs: 993 -> 576 (-41.99%) helped: 2 HURT: 0 LOST: 0 GAINED: 0 Broadwell had similar results. On older hardware, the impact isn't as large because they don't advertise GL 4.5. Of the hurt programs, all but one are hurt by a single instruction and the one is hurt by 3 instructions. All of the helped programs, on the other hand, are helped by at least 3 instructions and one kerbal space program shader is helped by 44.59%. The real star of the show, however, is the Gl43CSDof synmark2 benchmark which has two shaders which are cut by 28% and 40% and the over-all runtime performance of the benchmark on my Sky Lake laptop is improved by around 25-30% (it's a bit hard to be exact due to thermal throttling). Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
e6ae19944d977dc91bc45adff679337182c20683	24-Nov-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Rework gl_TessLevel[] handling to use NIR compact arrays. Treating everything as scalar arrays allows us to drop a bunch of special case input/output munging all throughout the backend. Instead, we just need to remap the TessLevel components to the appropriate patch URB header locations in remap_patch_urb_offsets(). We also switch to treating the TES input versions of these as ordinary shader inputs rather than system values, as remap_patch_urb_offsets() just makes everything work out without special handling. This regresses one Piglit test: arb_tessellation_shader-large-uniforms/GL_TESS_CONTROL_SHADER-array-at-limit The compiler starts promoting the constant arrays assigned to gl_TessLevel to uniform arrays. Since the shader also has a uniform array that uses the maximum number of uniform components, this puts it over the uniform component limit enforced by the linker. This is arguably a bug in the constant array promotion code (it should avoid pushing us over limits), but is unlikely to penalize any real application. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
8962cc96ec2bc1eb561a438512adc5042e2c8d34	17-Dec-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Use nir_opt_trivial_continues and nir_opt_if Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
715f0d06d19e7c33d98f99c764c5c3249d13b1c0	13-Dec-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: use nir loop unrolling pass shader-db results for BDW: total instructions in shared programs: 12589614 -> 12590119 (0.00%) instructions in affected programs: 50525 -> 51030 (1.00%) helped: 7 HURT: 145 total cycles in shared programs: 241524604 -> 241490502 (-0.01%) cycles in affected programs: 1941404 -> 1907302 (-1.76%) helped: 302 HURT: 449 total loops in shared programs: 4245 -> 2947 (-30.58%) loops in affected programs: 1535 -> 237 (-84.56%) helped: 1142 HURT: 0 total spills in shared programs: 14453 -> 14453 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 18984 -> 18984 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 26 GAINED: 15 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
e729504fb1799c3ae31cea76d73946530ef9806f	14-Sep-2016	Timothy Arceri <timothy.arceri@collabora.com>	nir: pass compiler rather than devinfo to functions that call nir_optimize Later we will pass compiler to nir_optimise to be used by the loop unroll pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
eda3ec7957ec9324641ee75847b892885e77335f	05-Dec-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: use nir_lower_indirect_derefs() for GLSL This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so that it is called by both OpenGL and Vulkan and removes the call to the old GLSL IR pass lower_variable_index_to_cond_assign(). We want to do this pass in nir to be able to move loop unrolling to nir. There is an increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space Program shaders that increase by 32 instructions. The changes seem to be caused by the difference in the GLSL IR vs NIR variable index lowering passes. The GLSL IR pass creates a simple if ladder for arrays of size 4 or less, while the NIR pass implements a binary search for all arrays regardless of size. Shader-db results BDW: total instructions in shared programs: 13021176 -> 13021819 (0.00%) instructions in affected programs: 57693 -> 58336 (1.11%) helped: 20 HURT: 190 total cycles in shared programs: 299805580 -> 299750826 (-0.02%) cycles in affected programs: 2290024 -> 2235270 (-2.39%) helped: 337 HURT: 442 total fills in shared programs: 19984 -> 19984 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 4 GAINED: 0 V2: remove the do_copy_propagation() call from the i965 GLSL IR linking code. This call was added in f7741c52111 but since we are moving the variable index lowering to NIR we no longer need it and can just rely on the nir copy propagation pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
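The two lowering shapes this entry contrasts, an if ladder versus a binary search over the index, can be illustrated in plain C. This is a conceptual sketch only: the real passes rewrite IR rather than emit C, and `select_ladder` and `select_bsearch` are hypothetical names. In the lowered form every leaf is an access with a fixed index; the recursion below only models how the comparisons are arranged.

```c
#include <assert.h>

/* If-ladder shape: one comparison per element, tested in order. */
float select_ladder(const float a[4], unsigned idx)
{
   assert(idx < 4);
   if (idx == 0)
      return a[0];
   else if (idx == 1)
      return a[1];
   else if (idx == 2)
      return a[2];
   else
      return a[3];
}

/* Binary-search shape over the half-open index range [lo, hi):
 * each comparison halves the range, so about log2(n) tests are needed. */
float select_bsearch(const float *a, unsigned lo, unsigned hi, unsigned idx)
{
   assert(lo < hi);
   if (hi - lo == 1)
      return a[lo];               /* leaf: a single, fixed element */

   unsigned mid = lo + (hi - lo) / 2;
   if (idx < mid)
      return select_bsearch(a, lo, mid, idx);
   else
      return select_bsearch(a, mid, hi, idx);
}
```

For a four-element array the ladder needs up to three comparisons and the binary search always needs two, but the search nests its branches, which may be one reason small shaders can grow by a few instructions.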
77f65b3b643d135a5eab3799a600302177dccb26	13-Dec-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/nir: enable lowering of texture gradient for shadow samplers This gets the lowering on the Vulkan driver too, which is required for hardware that does not have the sample_l_d message (up to IvyBridge). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
f90da64fc65d8da99a5a5a140be7b64c8cf5ee6a	30-Nov-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/nir: enable lowering of texture gradient for cube maps This gets the lowering on the Vulkan driver too. Fixes Vulkan CTS cube map texture gradient tests in: dEQP-VK.glsl.texture_functions.texturegrad.* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
0291bf4db2affbfb6a7daeb562c0cf56f6a85829	06-Dec-2016	Jason Ekstrand <jason.ekstrand@intel.com>	Revert "i965: use nir_lower_indirect_derefs() for GLSL" This reverts commit 9404439a754e5640ccd98df40fa694835c0d8759. I didn't intend to push it and it breaks clip and cull distance. /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
9404439a754e5640ccd98df40fa694835c0d8759	15-Aug-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: use nir_lower_indirect_derefs() for GLSL This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so that it is called by both OpenGL and Vulkan and removes the call to the old GLSL IR pass lower_variable_index_to_cond_assign(). We want to do this pass in nir to be able to move loop unrolling to nir. There is an increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space Program shaders that increase by 32 instructions. Shader-db results BDW: total instructions in shared programs: 8705873 -> 8706194 (0.00%) instructions in affected programs: 32515 -> 32836 (0.99%) helped: 3 HURT: 79 total cycles in shared programs: 74618120 -> 74583476 (-0.05%) cycles in affected programs: 528104 -> 493460 (-6.56%) helped: 47 HURT: 37 LOST: 2 GAINED: 0 /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
a8ef92b031729fca86f8615d0ca113ddf0965af9	08-Nov-2016	Jason Ekstrand <jason@jlekstrand.net>	i965/compiler: Disable trig workarounds on KBL+ The precision of our trig instructions appears to have been fixed on Kaby Lake. Neither Ben nor I can find any documentation for this. However, the dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where they fail on Sky Lake. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
38a8507f79b8da71b309654ce56854bbea1bcf94	17-Oct-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Use NIR-based clip/cull lowering for OpenGL as well. The old approach works fine, and this approach isn't necessarily better. But it at least has the advantage that Vulkan and GL use the same approach. I originally wrote it to gain additional testing for the new paths. shader-db statistics show 0 instruction count changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
2e423ca1477bd212c01676c5e4828ebdb83310d8	25-Oct-2016	Timothy Arceri <timothy.arceri@collabora.com>	nir: stop adjusting driver location for varying packing As of 59864e8e020 we just use the location assigned by the front-end and no longer need this for i965. Since there were some issues in the logic with assigning arrays the same driver location if they didn't start at the same location just remove it and let other drivers implement a solution if needed when they add ARB_enhanced_layouts support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
91d61fbf7cb61a44adcaae51ee08ad0dd6b2a03b	20-Oct-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: rewrite brw_setup_vue_interpolation() Here brw_setup_vue_interpolation() is rewritten not to use the InterpQualifier array in gl_fragment_program which will allow us to remove it. This change also makes the code which is only used by gen4/5 more self contained as it now has its own gen5_fragment_program struct rather than storing the map in brw_context. This means the interpolation map will only get processed once and will get stored in the in memory cache rather than being processed everytime the fs changes. Also by calling this from the fs compile code rather than from the upload code and using the interpolation assigned there we can get rid of the BRW_NEW_INTERPOLATION_MAP flag. It might not seem ideal to add a gen5_fragment_program struct however by the end of this series we will have gotten rid of all the brw_{shader_stage}_program structs and replaced them with a generic brw_program struct so there will only be two program structs which is better than what we have now. V2: Don't remove BRW_NEW_INTERPOLATION_MAP from dirty_bit_map until the following patch to fix build error. V3 - Suggestions by Jason: - name struct gen4_fragment_program rather than gen5_fragment_program - don't use enum with memset() - create interp mode set helper and simplify logic to call it - add assert when calling function to show prog will never be NULL for gen4/5 i.e. no Vulkan Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b	13-Oct-2016	Timothy Arceri <timothy.arceri@collabora.com>	nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
094fe3a9591ce200162d955635eee577c13f9324	13-Oct-2016	Timothy Arceri <timothy.arceri@collabora.com>	nir: move nir_shader_info to a common compiler header This will allow us to stop copying values between structs and will also simplify handling these values in the shader cache. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
59864e8e02057cc6fa0448a8af067a3cf53389da	13-Oct-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Don't use nir_assign_var_locations for VS/TES/GS outputs. Fixes spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3. v2: Remove nir_outputs field from fs_visitor (caught by Tim and Iago). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
89e1436e2d4ff0c15202708979eb36761cae4167	11-Oct-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Silence unused parameter warnings brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter] gl_shader_stage shader_type, ^ brw_nir.c: In function ‘brw_nir_lower_vs_inputs’: brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo, ^ brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter] uint32_t sampler, ^ brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter] vec4_visitor::gs_emit_vertex(int stream_id) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
36f0f0318275f65f8744ec6f9471702e2f58e6d5	07-Sep-2016	Eric Anholt <eric@anholt.net>	nir: Allow opt_peephole_sel to be more aggressive in flattening IFs. VC4 was running into a major performance regression from enabling control flow in the glmark2 conditionals test, because of short if statements containing an ffract. This pass seems to have been trying to ensure that we only flattened IFs that should be entirely a win by guaranteeing that there would be fewer bcsels than there were MOVs otherwise. However, if the number of ALU ops is small, we can avoid the overhead of branching (which itself costs cycles) and still get a win, even if it means moving real instructions out of the THEN/ELSE blocks. For now, just turn on aggressive flattening on vc4. i965 will need some tuning to avoid regressions. It does look like this may be useful to replace freedreno code. Improves glmark2 -b conditionals:fragment-steps=5:vertex-steps=0 from 47 fps to 95 fps on vc4. vc4 shader-db: total instructions in shared programs: 101282 -> 99543 (-1.72%) instructions in affected programs: 17365 -> 15626 (-10.01%) total uniforms in shared programs: 31295 -> 31172 (-0.39%) uniforms in affected programs: 3580 -> 3457 (-3.44%) total estimated cycles in shared programs: 225182 -> 223746 (-0.64%) estimated cycles in affected programs: 26085 -> 24649 (-5.51%) v2: Update shader-db output. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
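The flattening this entry describes, hoisting a short THEN block out of the branch and picking the result with a select, looks roughly like this when written as C. The functions `branchy` and `flattened` are hypothetical illustrations, the arithmetic stands in for the `ffract` mentioned above, and a C compiler is of course free to lower the ternary either way; the sketch only shows the source-level shape of the transformation.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: the ALU work sits inside control flow. */
int branchy(bool cond, int x)
{
   if (cond)
      return x * 2 + 1;
   return x;
}

/* After: the ALU work runs unconditionally, and a select (the role
 * bcsel plays in NIR) chooses between the two candidate values. */
int flattened(bool cond, int x)
{
   int then_val = x * 2 + 1;      /* real instruction moved out of the IF */
   int result = cond ? then_val : x;

   assert(result == branchy(cond, x));
   return result;
}
```

The trade-off is the one the commit states: the flattened form always pays for the hoisted instructions but never pays for a branch, so it tends to win only when the hoisted block is small.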
111f6b250d01fa1937103f24b5cb54b15dd77fbf	14-Sep-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Roll set_default_interpolation into lower_fs_inputs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
246db0063eb6e01aad961b1c73d32fca911ae1df	14-Sep-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Use NIR for handling forced per-sample interpolation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
ed65e6ef49e17e9cae93a8f98e2968346de2bc6e	14-Sep-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Add a flag to lower_io to force "sample" interpolation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
2d8a3fa7ea994ad02a40ff497109f966e3fcbeec	14-Sep-2016	Kenneth Graunke <kenneth@whitecape.org>	nir: Report progress from nir_lower_phis_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
32630e211e60a2b41388d403cfbd4f43344d8590	14-Sep-2016	Kenneth Graunke <kenneth@whitecape.org>	nir: Report progress from nir_lower_alu_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
527f37199929932300acc1688d8160e1f3b1d753	23-Aug-2016	Jason Ekstrand <jason.ekstrand@intel.com>	intel: s/brw_device_info/gen_device_info/ Generated by: sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.c sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.h sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.c sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.cpp sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
7dac8820730777756c00d7024330517848dc3b9f	22-Jul-2016	Francisco Jerez <currojerez@riseup.net>	i965/fs: Rework representation of fragment output locations in NIR. The problem with the current approach is that driver output locations are represented as a linear offset within the nir_outputs array, which makes it rather difficult for the back-end to figure out what color output and index some nir_intrinsic_load/store_output was meant for, because the offset of a given output within the nir_output array is dependent on the type and size of all previously allocated outputs. Instead this defines the driver location of an output to be the pair formed by its GLSL-assigned location and index (I've borrowed the bitfield macros from brw_defines.h in order to represent the pair of integers as a single scalar value that can be assigned to nir_variable_data::driver_location). nir_assign_var_locations is no longer useful for fragment outputs. Because fragment outputs are now allocated independently rather than within the nir_outputs array, the get_frag_output() helper becomes necessary in order to obtain the right temporary register for a given location-index pair. The type_size helper passed to nir_lower_io is now type_size_dvec4 rather than type_size_vec4_times_4 so that output array offsets are provided in terms of whole array elements rather than in terms of scalar components (dvec4 is the largest vector type supported by the GLSL so this will cause all individual fragment outputs to have a size of one regardless of the type). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
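Packing a (location, index) pair into one scalar driver location, as this entry describes, can be sketched with shift-and-mask macros. Everything named here is an assumption made for illustration: the `FRAG_OUT_*` field names, the 16-bit/1-bit field widths, and the `PACK_FIELD`/`UNPACK_FIELD` helpers are hypothetical and are not the macros the commit borrowed from brw_defines.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout: bits 0-15 hold the location, bit 16 the index. */
#define FRAG_OUT_LOCATION_SHIFT 0
#define FRAG_OUT_LOCATION_MASK  0x0000ffffu
#define FRAG_OUT_INDEX_SHIFT    16
#define FRAG_OUT_INDEX_MASK     0x00010000u

/* Generic shift-and-mask helpers keyed on the field name prefix. */
#define PACK_FIELD(value, field) \
   (((uint32_t)(value) << field##_SHIFT) & field##_MASK)
#define UNPACK_FIELD(word, field) \
   (((uint32_t)(word) & field##_MASK) >> field##_SHIFT)

/* Encode the pair as a single scalar suitable for a driver_location field. */
uint32_t pack_frag_output(unsigned location, unsigned index)
{
   assert(location <= (FRAG_OUT_LOCATION_MASK >> FRAG_OUT_LOCATION_SHIFT));
   assert(index <= (FRAG_OUT_INDEX_MASK >> FRAG_OUT_INDEX_SHIFT));
   return PACK_FIELD(location, FRAG_OUT_LOCATION) |
          PACK_FIELD(index, FRAG_OUT_INDEX);
}

unsigned frag_output_location(uint32_t driver_location)
{
   return UNPACK_FIELD(driver_location, FRAG_OUT_LOCATION);
}

unsigned frag_output_index(uint32_t driver_location)
{
   return UNPACK_FIELD(driver_location, FRAG_OUT_INDEX);
}
```

Unlike a linear offset into an output array, the packed value depends only on the output's own location and index, so the back-end can decode it without knowing the types and sizes of earlier outputs.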
9f32721f8695f3e55849dce015da3b53d1af5d57	21-Jul-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Enable NIR lowering of txf and rect offsets This fixes the following piglit tests on gen6+: tex-miplevel-selection textureProjGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRectShadow tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4 tex-miplevel-selection textureProjGradOffset 2DRectShadow Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
7f53fead5cf9a85c74a94d359dd5fccfbb87856c	23-May-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: enable component packing for vs and fs Rather than trying to work out the total number of components used at a location we simply treat all outputs as vec4s. This removes the need for complex code looping over varyings to match packed locations and the need for storing the total number of components used at each location. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
1eef0b73aa323d94d5a080cd1efa81ccacdbd0d2	12-Jul-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Rewrite FS input handling to use the new NIR intrinsics. This eliminates the need to walk the list of input variables, recurse into their types (via logic largely redundant with nir_lower_io), and interpolate all possible inputs up front. The backend no longer has to care about variables at all, which eliminates complications from trying to pack multiple variables into the same location. Instead, each intrinsic specifies exactly what's needed. This should unblock Timothy's work on GL_ARB_enhanced_layouts. Each load_interpolated_input intrinsic corresponds to PLN instructions, while load_barycentric_at_* intrinsics correspond to pixel interpolator messages. The pixel/centroid/sample barycentric intrinsics simply refer to payload fields (delta_xy[]), and don't actually generate any code. Because we use a single intrinsic for both centroid-qualified variables and interpolateAtCentroid(), they become indistinguishable. We stop sending pixel interpolator messages for those, and instead use the payload provided data, which should be considerably faster. On Broadwell: total instructions in shared programs: 9067751 -> 9067570 (-0.00%) instructions in affected programs: 145902 -> 145721 (-0.12%) helped: 422 HURT: 209 total spills in shared programs: 2849 -> 2899 (1.76%) spills in affected programs: 760 -> 810 (6.58%) helped: 0 HURT: 10 total fills in shared programs: 3910 -> 3950 (1.02%) fills in affected programs: 617 -> 657 (6.48%) helped: 0 HURT: 10 LOST: 3 GAINED: 3 The differences mostly appear to be slight changes in MOVs. v2: Use nir_shader_compiler_options::use_interpolated_input_intrinsics flag rather than passing it directly to nir_lower_io. Use the unreachable() macro rather than assert in one place. (Review feedback from Chris Forbes.) 
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
448adfbc67f4f6d0268a2f94dac311a26dc19864	18-May-2016	Timothy Arceri <timothy.arceri@collabora.com>	nir: use the same driver location for packed varyings Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
a8a9d1bf41c00123cefb6e757f3509c62e880a15	14-Jun-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965: remove type_size_vec4_times_4() type_size_vec4_times_4() was introduced as a fix in 8dcf807cb43383 however since 3810c1561 we can just use type_size_scalar() and get the actual number of outputs we need. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
5e43ba7e9e9bfce451f9caa3845136f8a5b6eda0	26-May-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Move brw_create_nir to brw_program.c This way it's no longer part of libi965_compiler.la since it depends on GLSL and ARB program stuff. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
86a2447eec7e87e46e842ca7a3ad5cd9fadb1ca5	26-May-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Move the type_size_*_bytes functions to brw_nir.h Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
32210dea8e474f8e93f5df681fb6a8265a0cda4b	26-May-2016	Jason Ekstrand <jason.ekstrand@intel.com>	compiler: Move glsl_to_nir to libglsl.la Right now libglsl.la depends on libnir.la so putting it in libnir.la adds a dependency on libglsl.la that goes the wrong direction. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
654e950cba55dabd2d9accb60db8e5f4c1495716	02-May-2016	Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	i965: Invoke lowering pass for YUV textures Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
0970c563d6e4a30ab0852ef664dc97a995997f88	20-May-2016	Dave Airlie <airlied@redhat.com>	nir: remove dead glsl variables before lowering io. For cull distance GLSL will let unsized unused arrays get into the backend, we should nuke those straight away, to save caring about them later. This fixes: arb_separate_shader_objects/linker/large-number-of-unused-varyings as a side effect (even without culling changes). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
dac10e8a1390711f1f36f224644c4a33586cebe3	17-May-2016	Kenneth Graunke <kenneth@whitecape.org>	i965, anv: Use NIR FragCoord re-center and y-transform passes. This handles gl_FragCoord transformations and other window system vs. user FBO coordinate system flipping by multiplying/adding uniform values, rather than recompiles. This is much better because we have no decent way to guess whether the application is going to use a shader with the window system FBO or a user FBO, much less the drawable height. This led to a lot of recompiles in many applications. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
d6281a9d955ad97f993927bc214e4b641cfbe359	15-Apr-2016	Juan A. Suarez Romero <jasuarez@igalia.com>	i965: take care of doubles when lowering VS inputs Input attributes can require 2 vec4 or 1 vec4 depending on whether they are double-precision or not. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
b0fb08e179d784ca319c3c547a874fd24ce93c3f	01-Apr-2016	Juan A. Suarez Romero <jasuarez@igalia.com>	i965: take care of doubles when remapping VS attributes Double-precision types require 1 slot in VUE for double and dvec2, and 2 slots for anything else. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
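The slot rule stated in the entry above (one slot for double and dvec2, two for anything larger) can be sketched as a small helper. This is only an illustration: `toy_dvec_attribute_slots` is a hypothetical name and not a function in brw_nir.c.

```c
#include <assert.h>

/* Hypothetical helper, not part of Mesa: number of vec4-sized slots a
 * double-precision vertex attribute with `components` components needs,
 * following the rule quoted in the commit message above. */
static unsigned toy_dvec_attribute_slots(unsigned components)
{
    assert(components >= 1 && components <= 4);
    /* double and dvec2 fit in one slot; dvec3 and dvec4 need two. */
    return components <= 2 ? 1 : 2;
}
```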
dfbabc6bad775e1575ff4a97a3c871341cd57f77	25-Mar-2016	Rob Clark <robclark@freedesktop.org>	nir/lower-io: add support for lowering inputs Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
b085016f94721a6c18f7076fc37c450a98e6bdbc	25-Mar-2016	Rob Clark <robclark@freedesktop.org>	nir: rename lower_outputs_to_temporaries -> lower_io_to_temporaries Since it will gain support to lower inputs, give it a more generic name. Signed-off-by: Rob Clark <robclark@freedesktop.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
47fcef9a209e533103a7ecf4d69440a67aa463ed	25-Mar-2016	Rob Clark <robclark@freedesktop.org>	nir: move callsite of lower_outputs_to_temporaries Going to convert this pass to parameterized lower_io_to_temporaries, and we want the user to be able to specify whether to lower outputs or inputs or both. The restriction of running this pass before validate to avoid output reads no longer applies. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
30424fd25a2f6554c35272d8edeacab0299ad8cc	07-Aug-2015	Connor Abbott <connor.w.abbott@intel.com>	i965: use pack/unpackDouble lowering Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
bea2f8beb53450bd07e5a33d48f00a9e9520645d	04-Aug-2015	Connor Abbott <connor.w.abbott@intel.com>	i965: use double lowering pass v2: also lower trunc, ceil, floor, fract and roundEven (Iago) v3: also lower mod for doubles (Sam) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
9464d8c49813aba77285e7465b96e92a91ed327c	27-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Switch the arguments to nir_foreach_function This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_function(\([^,]\),\s\([^,]*\))/nir_foreach_function(\2, \1)/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
707e72f13bb78869ee95d3286980bf1709cba6cf	27-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Switch the arguments to nir_foreach_instr This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_instr(\([^,]\),\s\([^,]*\))/nir_foreach_instr(\2, \1)/ and similar expressions for nir_foreach_instr_safe etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
7efff10585122d484dc3adab14af9380b9b8f309	13-Apr-2016	Connor Abbott <cwabbott0@gmail.com>	i965/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
b6dc940ec273252678d40707d300851fa1c85ea5	13-Apr-2016	Connor Abbott <cwabbott0@gmail.com>	nir: rename nir_foreach_block() to nir_foreach_block_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
b63a98b1211d22f759ae9c80b2270fe2d3b2639e	25-Mar-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir/dead_variables: Configurably work with any variable mode The old version of the pass only worked on globals and locals and always left inputs, outputs, uniforms, etc. alone. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
bfd17c76c1267756ea16051cbe174cb23ff49f44	08-Apr-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Port INTEL_PRECISE_TRIG=1 to NIR. This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos is multiplied by a constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it in one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
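A rough sketch of why the extra multiply can disappear once the optimizer sees it: if the trig result is scaled by the workaround factor and then by a shader constant, constant reassociation can merge the two constants into one at compile time. The factor value `TOY_TRIG_SCALE` and all function names below are assumptions for illustration; the commit message does not state the constant.

```c
#include <assert.h>

/* Assumed workaround factor, for illustration only. */
#define TOY_TRIG_SCALE 0.99997f

/* Unoptimized form: two multiplies at run time. */
static float toy_unfolded(float trig_result, float k)
{
    return (trig_result * TOY_TRIG_SCALE) * k;
}

/* What constant folding would compute once, at compile time. */
static float toy_folded_constant(float k)
{
    return TOY_TRIG_SCALE * k;
}

/* Reassociated form: a single multiply at run time. */
static float toy_folded(float trig_result, float k)
{
    return trig_result * toy_folded_constant(k);
}

/* Self-check: both forms agree up to floating-point rounding. */
static void toy_check_fold(float trig_result, float k)
{
    float diff = toy_unfolded(trig_result, k) - toy_folded(trig_result, k);
    assert(diff * diff < 1e-10f);
}
```

Reassociation changes rounding slightly, which is why the check compares within a tolerance rather than for exact equality.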
b0dffdc616801a1fd8534502e11ac840369041ab	08-Apr-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Pass brw_compiler into brw_preprocess_nir() instead of is_scalar. I want to be able to read other fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
084b24f5582567ebf5aa94b7f40ae3bdcb71316b	16-Mar-2016	Iago Toral Quiroga <itoral@igalia.com>	nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
9076c4e289de0debf1fb2a7237bdeb9c11002347	14-Aug-2015	Connor Abbott <connor.w.abbott@intel.com>	nir: update opcode definitions for different bit sizes Some opcodes need explicit bitsizes, and sometimes we need to use the double version when constant folding. v2: fix output type for u2f (Iago) v3: do not change vecN opcodes to be float. The next commit will add infrastructure to enable 64-bit integer constant folding so this isn't really necessary. Also, that created problems with source modifiers in some cases (Iago) v4 (Jason): - do not change bcsel to work in terms of floats - leave ldexp generic Squashed changes to handle different bit sizes when constant folding since otherwise we would break the build. v2: - Use the bit-size information from the opcode information if defined (Iago) - Use helpers to get type size and base type of nir_alu_type enum (Sam) - Do not fall back to sized types to guess bit-size information. (Jason) Squashed changes in i965 and gallium/nir drivers to support sized types. These functions should only see sized types, but we can't make that change until we make sure that nir uses the sized versions in all the relevant places. A later commit will address this. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
7d021cb15e6d67ecef8b020fd36c4a680bcc9c39	18-Jan-2016	Jordan Justen <jordan.l.justen@intel.com>	i965/nir: Lower nir compute shader shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
a0294c2cf3d23943c7f365abf84765afa0f383a2	25-Feb-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Simplify brw_nir_lower_vue_inputs() slightly. The same code appeared in both branches; pull it above the if statement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
8151003ade952c3e9d8284fada9237e1311cf173	25-Feb-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Avoid recalculating the normal VUE map for IO lowering. The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
15b3639bf1b0676e74b107d74653185eedbc6688	25-Feb-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Avoid recalculating the tessellation VUE map for IO lowering. The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
cfbd9831f89ef165e7998d0b8524a1aefedec404	25-Feb-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Eliminate brw_nir_lower_{inputs,outputs,io} functions. Now that each stage is directly calling brw_nir_lower_io(), and we have per-stage helper functions, it makes sense to just call the relevant one directly, rather than going through multiple switch statements. This also eliminates stupid function parameters, such as the two that only apply to vertex attributes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
b96ddd2e52e7205e5820714f2ad1028b666426c6	25-Feb-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Split brw_nir_lower_inputs/outputs into per-stage functions. These functions are both giant switch statements where most cases don't overlap at all. Let's put the bulk of the work in per-stage helpers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
d33c478bedc6ac0b2bd2646e443e690cdfb2b640	25-Feb-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Remove catch-all nir_lower_io call with specific cases. Most cases already call nir_lower_io explicitly for input and output lowering. This catch all isn't very useful anymore - we can just add it to the remaining cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
51f87979934100af8c40cfa17f670bc38417ddc0	25-Feb-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Move optimizations from brw_nir_lower_io to brw_postprocess_nir. This simplifies things. Every caller of brw_nir_lower_io() immediately calls brw_postprocess_nir(). The only real change this will have is that we get an extra brw_nir_optimize() call when compiling compute shaders, but that seems fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
dcd4a841e9096b988ea3ca2779e7c8b1ca5e5747	25-Feb-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Always do NIR IO lowering at specialization time. We've now hit literally every case other than geometry shaders (and compute shaders, but those are a no-op). So, let's just move geometry shaders over too and be done with it. The only advantage to doing this at link time was to save the expense of running the pass on recompiles. But we're already running a lot of passes, and the extra code complexity isn't worth it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
b3cb6e78aa219ad73c145a25ee1bb48fd8b025d0	17-Feb-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Do lower_io late for fragment shaders The Vulkan driver wants to be able to delete fragment outputs that are beyond key.nr_color_regions; this is a lot easier if we lower outputs at specialization time rather than link time. (Rationale added to commit message by Ken) Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
d56ae2d1605fc1b5a3fdf5aba9aefc3c7692a4ba	14-Jan-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Apply VS attribute workarounds in NIR. This patch re-implements the pre-Haswell VS attribute workarounds. Instead of emitting shader code in the vec4 backend, we now simply call a NIR pass to emit the necessary code. This simplifies the vec4 backend. Beyond deleting code, it removes the primary use of ATTR as a destination. It also eliminates the requirement that the vec4 VS backend express the ATTR file in terms of VERT_ATTRIB_* locations, giving us a bit more flexibility. This approach is a little different: rather than munging the attributes at the top, we emit code to fix them up when they're accessed. However, we run the optimizer afterwards, so CSE should eliminate the redundant math. It may even be able to fuse it with other calculations based on the input value. shader-db does not handle non-default NOS settings, so I have no statistics about this patch. Note that the scalar backend does not implement VS attribute workarounds, as they are unnecessary on hardware which allows SIMD8 VS. v2: Do one multiply for FIXED rescaling and select components from either the original or scaled copy, rather than multiplying each component separately (suggested by Matt Turner). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
74f956c416d5b0b37b4c2d6b957167bb203502c3	22-Jan-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Use nir_lower_load_const_to_scalar(). I don't know why, but we never hooked up this pass Eric wrote. Otherwise, you can end up with stupid scalarized code such as: vec4 ssa_7 = load_const (0.0, 0.0, 0.0, 0.0) vec4 ssa_8 = ... vec1 ssa_9 = feq ssa_8, ssa_7 vec1 ssa_10 = feq ssa_8.y, ssa_7.y vec1 ssa_11 = feq ssa_8, ssa_7.z vec1 ssa_12 = feq ssa_8.y, ssa_7.w ssa_8.xyxy == <0, 0, 0, 0> should only take two feq instructions. shader-db on Skylake: total instructions in shared programs: 9121153 -> 9120749 (-0.00%) instructions in affected programs: 32421 -> 32017 (-1.25%) helped: 277 HURT: 69 total cycles in shared programs: 69003364 -> 69000912 (-0.00%) cycles in affected programs: 899186 -> 896734 (-0.27%) helped: 313 HURT: 403 This also prevents regressions when disabling channel expressions. v2: Don't call opt_cse afterwards (requested by Matt). It should happen in the optimization loop below anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
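One way to read the feq example in the entry above: when the constant stays a vec4, each scalarized comparison refers to a different component of it, so none of them look identical; once the constant is split into scalars, equal values can share one definition and duplicate comparisons can be merged. The sketch below counts distinct comparisons under both models. It is a toy model with hypothetical names, not NIR code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not NIR: count the distinct scalar compares needed for
 * value.swizzle == constant, component by component.
 * With scalar_constants == false, every constant component is treated as a
 * distinct operand (as when selecting .x/.y/.z/.w from one vec4 load_const).
 * With scalar_constants == true, components holding equal values are
 * treated as the same operand. */
static int toy_count_unique_feq(const int swizzle[4], const float constant[4],
                                bool scalar_constants)
{
    int unique = 0;
    for (int i = 0; i < 4; i++) {
        assert(swizzle[i] >= 0 && swizzle[i] < 4);
        bool seen = false;
        for (int j = 0; j < i; j++) {
            bool same_src = swizzle[i] == swizzle[j];
            bool same_const = scalar_constants && constant[i] == constant[j];
            if (same_src && same_const) {
                seen = true;
                break;
            }
        }
        if (!seen)
            unique++;
    }
    return unique;
}
```

For the `ssa_8.xyxy == <0, 0, 0, 0>` case from the message, this model needs four compares with a vector constant and two with scalar constants.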
a39a8fbbaa129f4e52f2a3ad2747182e9a74d910	17-Jan-2016	Emil Velikov <emil.velikov@collabora.com>	nir: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
3657cbf24f3b0baf7e3382e572d97a36b0ed4103	14-Jan-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Apply add_const_offset_to_base for vec4 VS inputs too. This shouldn't hurt anything, and I'm about to introduce a pass that will want it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
a3500f943e2c61c0aed043108132f35b79d16676	14-Jan-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Make add_const_offset_to_base() work at the shader level. This makes it a pass, hiding the parameter structs and block callbacks so it's simpler to work with. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
317628dbb35d03d1e855332c892594ae491c5d24	18-Nov-2015	Rob Clark <robclark@freedesktop.org>	nir: extract out helper macros for running passes Note these are a bit uglier, due to avoidance of GNU C extensions. But drivers which do not need to be built with compilers that don't support the extension can wrap these macros with their own. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
237f2f2d8b45d9d956102eec6f9be63193e5269b	26-Dec-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Get rid of function overloads When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can have a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> ir3 bits are Reviewed-by: Rob Clark <robclark@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
794eb9d7270456ab3d2cadbaf302192eca7f4dbc	08-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Handle mix-and-match TCS/TES with separate shader objects. GL_ARB_separate_shader_objects allows the application to mix-and-match TCS and TES programs separately. This means that the interface between the two stages isn't known until the final SSO pipeline is in place. This isn't a great match for our hardware: the TCS and TES have to agree on the Patch URB entry layout. Since we store data as per-patch slots followed by per-vertex slots, changing the number of per-patch slots can significantly alter the layout. This can easily happen with SSO. To handle this, we store the [Patch]OutputsWritten and [Patch]InputsRead bitfields in the TCS/TES program keys, introducing program recompiles. brw_upload_programs() decides the layout for both TCS and TES, and passes it to brw_upload_tcs/tes(), which store it in the key. When creating the NIR for a shader specialization, we override nir->info.inputs_read (and friends) to the program key's values. Since everything uses those, no further compiler changes are needed. This also replaces the hack in brw_create_nir(). To avoid recompiles, brw_precompile_tes() looks to see if there's a TCS in the linked shader. If so, it accounts for the TCS outputs, just as brw_upload_programs() would. This eliminates all recompiles in the non-SSO case. In the SSO case, there should only be recompiles when using a TCS and TES that have different input/output interfaces. Fixes Piglit's mix-and-match-tcs-tes test. v2: Pull the brw_upload_programs code into a brw_upload_tess_programs() helper function (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
01b1b44d31adde3954d1f1404ca66f90d87d4ae5	08-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Defer input lowering for tessellation stages until specialization. With tessellation shaders and SSO, we won't be able to always decide on VUE map layouts at LinkProgram time. Unfortunately, we have to delay it until shader specialization time. However, uniform lowering cannot be deferred - brw_codegen_*_prog() reads nir->num_uniforms. Fortunately, we don't need to defer it - uniform, system value, atomic, and sampler lowering can safely stay where it is. This patch moves those to brw_lower_nir()'s only caller, renames brw_lower_nir() to brw_nir_lower_io(), and introduces calls to that. For non-tessellation stages, I chose to call brw_nir_lower_io() from brw_create_nir(), so it's still done at the same time. There's no need to defer it, and doing it at LinkProgram time is nice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
9f0944d15b9d2cd85f501f80eea7e6b6fc7f3487	10-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Make TES inputs match TCS outputs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
4fac9500100273424450b5687c4e04dfd066d08e	10-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Force VS -> TCS varyings to use the SSO VUE map layout. The compact VUE map only works when varying packing is in use. Unfortunately, varying packing is disabled for TCS inputs. This is needed to fix Piglit's tcs-input-read-array-interface test. v2: Make lines fit in 80 columns (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
bee42cc1f78c480d13da879682268ee14b0d6fe7	10-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Handle TCS outputs and TES inputs. TCS outputs and TES inputs both refer to a common "patch URB entry" shared across all invocations. First, there are some number of per-patch entries. Then, there are per-vertex entries accessed via an offset for the variable and a stride times the vertex index. Because these calculations need to be done in both the vec4 and scalar backends, it's simpler to just compute the offset calculations in NIR. It doesn't necessarily make much sense to use per-vertex intrinsics afterwards, but that at least means we don't lose the per-patch vs. per-vertex information. v2: Use is_input/is_output helpers (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
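The patch URB entry addressing described in the entry above (per-patch entries first, then per-vertex entries reached via a variable offset plus a stride times the vertex index) might be sketched like this. The struct, its field names, and the assumption that the variable slot is relative to the start of each vertex's block are illustrative only; the real lowering in brw_nir.c emits NIR arithmetic rather than calling a helper like this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout description, not a Mesa struct. */
struct toy_patch_layout {
    unsigned num_per_patch_slots;   /* slots shared by the whole patch */
    unsigned num_per_vertex_slots;  /* stride between consecutive vertices */
};

/* Slot of a variable inside the patch URB entry.
 * Assumption: var_slot is relative to the per-patch region for per-patch
 * variables, and relative to one vertex's block for per-vertex variables. */
static unsigned toy_patch_urb_slot(const struct toy_patch_layout *layout,
                                   bool per_vertex, unsigned var_slot,
                                   unsigned vertex_index)
{
    if (!per_vertex) {
        assert(var_slot < layout->num_per_patch_slots);
        return var_slot;
    }
    assert(var_slot < layout->num_per_vertex_slots);
    return layout->num_per_patch_slots +
           vertex_index * layout->num_per_vertex_slots + var_slot;
}
```

Changing `num_per_patch_slots` shifts every per-vertex slot, which matches the layout sensitivity the tessellation entries in this log describe.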
31140d097a7939e0f917aa76bd37b5c682898e63	10-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Handle TCS inputs and TES outputs. TES outputs work exactly like VS outputs, so we can simply add a case statement for those. TCS inputs are very similar to geometry shaders - they're arrays of per-vertex data. We use the same method I used for the scalar GS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
9f3917bf372aa19f85875dbe30ca12adc9b67b90	10-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix partial variable access for geometry shaders in SSO mode. Without varying packing, if a VS writes a compound variable, and the GS only reads part of it, the base location of the variable may not actually be in the VUE map. To cope with this, we do lowering in terms of varying slots, add any constant offsets to the base, and then do the VUE map remapping. This ensures we only look up VUE map entries for slots which actually exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
8c4deb10dfbef683a2052a7bd62450aa76ad8fde	09-Dec-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Separate base offset/constant offset combining from remapping. My tessellation branch has two additional remap functions. I don't want to replicate this logic there. v2: Handle inputs/outputs separately (suggested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
78b81be627734ea7fa50ea246c07b0d4a3a1638a	25-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Get rid of _indirect variants of input/output load/store intrinsics There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the meantime, having the _indirect variants adds special cases in a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the _indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of _indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <eric@anholt.net> ir3 changes are Reviewed-by: Rob Clark <robdclark@gmail.com> NIR changes are Acked-by: Rob Clark <robdclark@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
18069dce4a4c3d71e6afc6b10bfa7bee0560ba9c	11-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Make uniform offsets be in terms of bytes This commit makes uniform offsets be in terms of bytes starting with nir_lower_io. They get converted to be in terms of vec4s or floats when we cram them in the UNIFORM register file but reladdr remains in terms of bytes all the way down to the point where we lower it to a pull constant load. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
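The byte-to-register-file conversion mentioned in the entry above amounts to a division by the element size. A minimal sketch, assuming 4-byte floats and 16-byte vec4s; the helper names are hypothetical and do not appear in the driver.

```c
#include <assert.h>

/* Hypothetical helpers, not Mesa functions. Sizes are assumptions:
 * a float is 4 bytes and a vec4 is 16 bytes. */
static unsigned toy_uniform_bytes_to_vec4(unsigned byte_offset)
{
    assert(byte_offset % 16 == 0);  /* must start on a vec4 boundary */
    return byte_offset / 16;
}

static unsigned toy_uniform_bytes_to_float(unsigned byte_offset)
{
    assert(byte_offset % 4 == 0);   /* must start on a float boundary */
    return byte_offset / 4;
}
```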
aa35b0c2c71f054f72df5a85779d0862fa7d6e4a	25-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Get rid of the nir_inputs array It's not really buying us anything at this point. It's just a way of remapping one offset namespace onto another. We can just use the location namespace the whole way through. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
f36993b46962eab4446bc1964eb47149751aee26	23-Nov-2015	Matt Turner <mattst88@gmail.com>	i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
d278e31459374feb18edd97d5adaacccc08f978a	18-Nov-2015	Rob Clark <robclark@freedesktop.org>	util: move brw_env_var_as_boolean() to util Kind of a handy function. And I'll want it available outside of i965 for common nir-pass helpers. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nhaehnle@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
d9b8fde963a53d4e06570d8bece97f806714507a	12-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Use NIR for lowering texture swizzle Now that nir_lower_tex can do texture swizzle lowering, we can use that instead of repeating more-or-less the same code in both backends. This both allows us to share code and means that things like the tg4 work-arounds are somewhat simpler because they don't have to take the swizzle into account. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
6c8ba59cff14a1a86273f4008ff2a8e68335ab25	11-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Use nir_lower_tex for texture coordinate lowering Previously, we had a rescale_texcoords helper in the FS backend for handling rescaling of texture coordinates. Now that we can do variants in NIR, we can use nir_lower_tex to do the rescaling for us. This allows us to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and GL_CLAMP handling in vertex and geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
1417f6a216b46dbbaa1bfe0cef97e2b4a48224c0	11-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir/lower_tex: Report progress Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
ce767bbdfff7c2a7829b652c111a11eb9ddba026	11-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Move postprocess_nir to codegen time This allows us to insert NIR passes between initial NIR compilation and optimization (link time) and actual backend code-gen. In particular, it will allow us to do shader variants in NIR and share some of that shader variant code between backends. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
9cf108193b61c342c94c4cd980c4b403638e1051	11-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Split shader optimization and lowering into three stages At the moment, brw_create_nir just calls the three stages in sequence so there's not much difference. Soon, however, we will want to start doing variants in NIR at which point the postprocessing step will have to move from shader create time to codegen time. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
f58813842bcece3498f55ec5d582466ccff92a5e	15-May-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: s/nir_type_unsigned/nir_type_uint v2: do the same in tgsi_to_nir (Samuel) v3: added missing cases after rebase (Iago) v4: Add a blank space after '#' in one of the comments (Matt) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
006e4f070f08ff1e1731863940bc51de9e97b865	19-Oct-2015	Rob Clark <robdclark@gmail.com>	nir: add nir_var_all enum Otherwise, passing -1 gets you: error: invalid conversion from 'int' to 'nir_variable_mode' [-fpermissive] Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
0bee3acc2a303b4cbbac0f6f54ffc8be79bc7470	16-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Add hooks for testing nir_shader_clone This commit adds code for testing nir_shader_clone by running it after each and every optimization pass and throwing away the old shader. Testing nir_shader_clone is hidden behind a new INTEL_CLONE_NIR environment variable. Reviewed-by: Rob Clark <robclark@freedesktop.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
9ff71b649b4b3808a9e17ce69743c6037fd6603c	03-Nov-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Validate that NIR passes call nir_metadata_preserve(). Failing to call nir_metadata_preserve() can have nasty consequences: some pass breaks dominance information, but leaves it marked as valid, causing some subsequent pass to go haywire and probably crash. This pass adds a simple validation mechanism to ensure passes handle this properly. We add a new bogus metadata flag that isn't used for anything in particular, set it before each pass, and ensure it isn't still set after the pass. nir_metadata_preserve will reset the flag, so correct passes will work, and bad passes will assert fail. (I would have made these functions static inline, but nir.h is included in C++, so we can't bit-or enums without lots of casting...) Thanks to Dylan Baker for the idea. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
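The validation scheme described in the commit above can be illustrated with a small stand-alone sketch. It is not the Mesa implementation: every `toy_` name, the flag values, and the two example passes are hypothetical, and the only thing taken from the commit message is the idea of setting a bogus flag before a pass and checking that the pass cleared it by calling its metadata-preserve function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical metadata bits; the last one is the "bogus" flag that no
 * pass ever asks to preserve. */
enum toy_metadata {
   TOY_METADATA_NONE      = 0x0,
   TOY_METADATA_DOMINANCE = 0x1,
   TOY_METADATA_BOGUS     = 0x8
};

/* Hypothetical stand-in for a function implementation. */
struct toy_impl {
   unsigned valid_metadata;
};

/* Keep only the metadata the pass says is still valid. Because passes
 * never list the bogus flag, any call to this function clears it. */
static void toy_metadata_preserve(struct toy_impl *impl, unsigned preserved)
{
   impl->valid_metadata &= preserved;
}

/* A well-behaved pass: it changed the program, so it preserves nothing. */
static void toy_good_pass(struct toy_impl *impl)
{
   toy_metadata_preserve(impl, TOY_METADATA_NONE);
}

/* A buggy pass: it forgets to call toy_metadata_preserve(), leaving
 * stale dominance information marked as valid. */
static void toy_bad_pass(struct toy_impl *impl)
{
   (void)impl;
}

/* Set the bogus flag, run the pass, and report whether the pass reset
 * the flag. A real driver would assert on the result. */
static bool toy_run_checked(struct toy_impl *impl,
                            void (*pass)(struct toy_impl *))
{
   impl->valid_metadata |= TOY_METADATA_BOGUS;
   pass(impl);
   return (impl->valid_metadata & TOY_METADATA_BOGUS) == 0;
}
```

In this sketch `toy_run_checked()` returns true for `toy_good_pass` and false for `toy_bad_pass`, which is the condition the commit turns into an assertion failure.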
7bc097899924f40140981567c7bb52297dd801f2	03-Nov-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes. OPT() is the normal macro for passes that return booleans, while OPT_V() is a variant that works for passes that don't properly report progress. (Such passes should be fixed to return a boolean, eventually.) These macros take care of calling nir_validate_shader() and setting progress appropriately. In the future, it would be easy to add shader dumping similar to INTEL_DEBUG=optimizer by extending the macro. v2 (Jason Ekstrand): - Fix an unused variable warning Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
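A minimal sketch of what such macros might look like follows. It is an illustration under stated assumptions, not the code from this commit: `struct toy_shader`, `toy_validate()` (standing in for nir_validate_shader()), and the two toy passes are hypothetical, and the macros here take the pass arguments explicitly to stay within standard C.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shader; not the real nir_shader. */
struct toy_shader {
   int value;
   int validations;
};

/* Stand-in for nir_validate_shader(): it only counts how often it ran. */
static void toy_validate(struct toy_shader *s)
{
   s->validations++;
}

/* Toy pass that reports progress: halves a non-zero even value. */
static bool toy_pass_halve(struct toy_shader *s)
{
   if (s->value == 0 || s->value % 2 != 0)
      return false;
   s->value /= 2;
   return true;
}

/* Toy pass that does not report progress (returns void). */
static void toy_pass_add(struct toy_shader *s, int amount)
{
   s->value += amount;
}

/* Both macros expect local variables named `shader` and `progress`. */
#define OPT(pass, ...) do {      \
   if (pass(__VA_ARGS__))        \
      progress = true;           \
   toy_validate(shader);         \
} while (0)

#define OPT_V(pass, ...) do {    \
   pass(__VA_ARGS__);            \
   toy_validate(shader);         \
} while (0)

/* Run the void pass once, then repeat the boolean pass until it stops
 * making progress. Returns the number of loop iterations. */
static int toy_optimize(struct toy_shader *shader)
{
   bool progress = false;
   int iterations = 0;

   OPT_V(toy_pass_add, shader, 4);
   do {
      progress = false;
      OPT(toy_pass_halve, shader);
      iterations++;
   } while (progress);

   return iterations;
}
```

Wrapping each call this way keeps validation and progress tracking in one place, so a hook such as shader dumping could be added by changing only the macro bodies.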
94ff35204dba0ddbd7f5c4342206c8acba22d32f	22-Oct-2015	Eduardo Lima Mitev <elima@igalia.com>	nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver Because the next patch will add an optimization that is specific to i965, we want to move this lowering pass to that driver altogether. This is safe because i965 is the only consumer. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
8dcf807cb43383590ba193c7ff20b8a98e4a9f65	14-Oct-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Fix scalar VS float[] and vec2[] output arrays. The scalar VS backend has never handled float[] and vec2[] outputs correctly (my original code was broken). Outputs need to be padded out to vec4 slots. In fs_visitor::nir_setup_outputs(), we tried to process each vec4 slot by looping from 0 to ALIGN(type_size_scalar(type), 4) / 4. However, this is wrong: type_size_scalar() for a float[2] would return 2, or for vec2[2] it would return 4. This looked like a single slot, even though in reality each array element would be stored in separate vec4 slots. Because of this bug, outputs[] and output_components[] would not get initialized for the second element's VARYING_SLOT, which meant emit_urb_writes() would skip writing them. Nothing used those values, and dead code elimination threw a party. To fix this, we introduce a new type_size_vec4_times_4() function which pads array elements correctly, but still counts in scalar components, generating correct indices in store_output intrinsics. Normally, varying packing avoids this problem by turning varyings into vec4s. So this doesn't actually fix any Piglit or dEQP tests today. However, if varying packing is disabled, things would be broken. Tessellation shaders can't use varying packing, so this fixes various tcs-input Piglit tests on a branch of mine. v2: Shorten the implementation of type_size_4x to a single line (caught by Connor Abbott), and rename it to type_size_vec4_times_4() (renaming suggested by Jason Ekstrand). Use type_size_vec4 rather than using type_size_vec4_times_4 and then dividing by 4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
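The slot-counting arithmetic in the commit above can be checked with a small sketch. The `toy_type` struct and all `toy_` functions are hypothetical simplifications (real GLSL types are far richer); only the numbers for float[2] and vec2[2] and the name of the new helper come from the commit message.

```c
#include <assert.h>

/* Hypothetical, heavily simplified type description. */
struct toy_type {
   unsigned components;   /* 1 for float, 2 for vec2, ... up to 4 */
   unsigned array_length; /* 0 means "not an array" */
};

static unsigned toy_elements(const struct toy_type *t)
{
   return t->array_length ? t->array_length : 1;
}

/* Size in scalar components, with no padding between elements. */
static unsigned toy_type_size_scalar(const struct toy_type *t)
{
   return t->components * toy_elements(t);
}

/* Size in vec4 slots: every array element is padded out to a full slot. */
static unsigned toy_type_size_vec4(const struct toy_type *t)
{
   return toy_elements(t);
}

/* Padded size, but still counted in scalar components. */
static unsigned toy_type_size_vec4_times_4(const struct toy_type *t)
{
   return 4 * toy_type_size_vec4(t);
}

/* Round value up to a multiple of alignment. */
static unsigned toy_align(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

/* The broken slot count: ALIGN(type_size_scalar(type), 4) / 4. */
static unsigned toy_broken_slot_count(const struct toy_type *t)
{
   return toy_align(toy_type_size_scalar(t), 4) / 4;
}
```

For both float[2] and vec2[2], the broken computation yields one slot while the padded one yields two, which is why the second array element's slot was never initialized.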
c9541a74e4d179ad844bdf8af1e3de541c5b14c2	24-Sep-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Add scalar GS input lowering code. We really ought to compute the VUE map at link time and stash it, rather than recomputing it here, but with the mess of program structures I wasn't sure where to put it. We can improve that later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
a3d0359aff7a9be90149c416844f330b4f9a15ed	26-Oct-2015	Timothy Arceri <timothy.arceri@collabora.com>	glsl: keep track of intra-stage indices for atomics This is more optimal as it means we no longer have to upload the same set of ABO surfaces to all stages in the program. This also fixes a bug where since commit c0cd5b var->data.binding was being used as a replacement for atomic buffer index, but they don't have to be the same value they just happened to end up the same when binding is 0. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Alejandro Piñeiro <apinheiro@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175 /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
dbac0a6352053bd6106feff88d95b0fd38b82afe	16-Oct-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Switch on shader stage in nir_lower_outputs(). VS, GS, and FS continue doing the same thing they did before. We can simplify the FS code a bit because it is always scalar. Compute shaders now assert that there are no outputs instead of doing a loop over 0 outputs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
0d1eef536bc744f5c4dcdf854ad6adfdfe4f4dcb	14-Oct-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/fs: Ignore compute shaders in brw_nir_lower_inputs The commit shown below caused compute shaders to hit the unreachable in the default of the switch block. Since compute shaders don't have any inputs, we can make brw_nir_lower_inputs a no-op for CS. commit 2953c3d76178d7589947e6ea1dbd902b7b02b3d4 Author: Kenneth Graunke <kenneth@whitecape.org> Date: Fri Aug 14 15:15:11 2015 -0700 i965/vs: Map scalar VS input locations properly; avoid tons of MOVs. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
63728dac57c18df0f45bb2482f60188fac2d1efe	13-Oct-2015	Jordan Justen <jordan.l.justen@intel.com>	i965/fs: Simplify FS in brw_nir_lower_inputs to only support scalar mode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
bd198b9f0a292a9ff4ffffec3a29bad23d62caba	15-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/vs: Simplify fs_visitor's ATTR file. Previously, ATTR was indexed by VERT_ATTRIB_* slots; at the end of compilation, assign_vs_urb_setup() translated those into GRF units, and converted ATTR to HW_REGs. This patch moves the translation earlier, making ATTR work in terms of GRF units from the beginning. assign_vs_urb_setup() simply has to add the number of payload registers and push constants to obtain the final hardware GRF number. (We can't do this earlier as those values aren't known.) ATTR still supports reg_offset; however, it's simply added to reg. It's not clear whether this is valuable or not. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
2953c3d76178d7589947e6ea1dbd902b7b02b3d4	15-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/vs: Map scalar VS input locations properly; avoid tons of MOVs. Previously, we used nir_lower_io with the scalar type_size function, which mapped VERT_ATTRIB_* locations to...some numbers. Then, in fs_visitor::nir_setup_inputs(), we created temporaries indexed by those numbers, and emitted MOVs from the actual ATTR registers to those temporaries. Virtually all of these were copy propagated away, but it's still ugly. This patch reworks our input lowering to produce NIR load_input intrinsics that properly index into the ATTR file, so we can access it directly. No changes in shader-db. v2: Fix unreachable() message (Ken), update commit message (Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
5d7f8cb5a511977e256e773716fac3415d01443e	01-Oct-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics. Geometry and tessellation shaders process multiple vertices; their inputs are arrays indexed by the vertex number. While GLSL makes this look like a normal array, it can be very different behind the scenes. On Intel hardware, all inputs for a particular vertex are stored together - as if they were grouped into a single struct. This means that consecutive elements of these top-level arrays are not contiguous. In fact, they may sometimes be in completely disjoint memory segments. NIR's existing load_input intrinsics are awkward for this case, as they distill everything down to a single offset. We'd much rather keep the vertex ID separate, but build up an offset as normal beyond that. This patch introduces new nir_intrinsic_load_per_vertex_input intrinsics to handle this case. They work like ordinary load_input intrinsics, but have an extra source (src[0]) which represents the outermost array index. v2: Rebase on earlier refactors. v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
1e3c1b107e075b210998998423901092b8fcd79b	03-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Use nir_foreach_variable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
ca941799ce76eac8afe2503fbacffee057e949d3	03-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Remove the prog parameter from brw_nir_lower_inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
cd1ae6ebfac22f76d26a5b8659423969b2aeddce	06-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
193d29516ddb76f469fea17119493e2b685bc6b7	26-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Refactor input/output lowering setup into helpers. The code for input lowering is going to get significantly more complicated shortly, so I wanted to pull it out. Vertex shader inputs are handled nearly identically regardless of vec4/scalar mode, so I opted to not split that. I thought about having each function actually do the lowering, but one pass through nir_lower_io that handles all types (which weren't handled earlier) is probably more efficient. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
39a1d36a67974dd9fc3c0d834d6a117cdfed8f33	13-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Allow nir_lower_io() to only lower one type of variable. We may want to use different type_size functions for (e.g.) inputs vs. uniforms. Passing in -1 for mode ignores this, handling all modes as before. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
df31c1850d14729e27513ae733110a668f6b6e95	05-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/gs: Use new NIR intrinsics. By performing the vertex counting in NIR, we're able to elide a ton of useless safety checks around every EmitVertex() call: total instructions in shared programs: 3952 -> 3720 (-5.87%) instructions in affected programs: 3491 -> 3259 (-6.65%) helped: 11 HURT: 0 Improves performance in Gl32GSCloth by 0.671742% +/- 0.142202% (n=621) on Haswell GT3e at 1024x768. This should also make it easier to implement Broadwell's "Static Vertex Count" feature someday. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
faf5f174ddbc7680f6947ceababb94fdb552bcdb	16-Sep-2015	Rob Clark <robclark@freedesktop.org>	nir/lower_tex: support projector lowering per sampler type Some hardware, such as adreno a3xx, supports txp on some but not all sampler types. In this case we want more fine grained control over which texture projectors get lowered. v2: split out nir_lower_tex_options struct to make it easier to add the additional parameters coming in the following patches Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
d9b9ff76f17ee36b87b2722fa2a19e1d9f036c26	17-Sep-2015	Rob Clark <robclark@freedesktop.org>	nir: rename nir_lower_tex_projector Since the following patches will add additional tex-lowering related functionality, which doesn't make sense to split out into a separate pass (as they would require duplication of the projector lowering logic), let's give this pass a more generic name. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
fc11dbe13f3470ff2a4cb91c6b063db2456664da	09-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use nir_move_vec_src_uses_to_dest The idea here is that it gives register coalescing a little bit of a helping hand. It doesn't actually fix the coalescing problems, but it seems to help a good bit. Shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1746280 -> 1683959 (-3.57%) instructions in affected programs: 1259166 -> 1196845 (-4.95%) helped: 11363 HURT: 148 v2 (Jason Ekstrand): - Run nir_move_vec_src_uses_to_dest after going out of SSA - New shader-db numbers Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
c951bb83056724df02ba7e6fe2dfa720c0f45c1f	09-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4_nir: Use partial SSA form rather than full non-SSA We made this switch in the FS backend some time ago and it seems to make a number of things a bit easier. In particular, supporting SSA values takes very little work in the backend and allows us to take advantage of the majority of the SSA information even after we've gotten rid of Phi nodes. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
1484d8c9aa2e7e78462ffb5c207394bef77af89b	01-May-2015	Connor Abbott <cwabbott0@gmail.com>	i965/nir: enable the dead control flow optimization total instructions in shared programs: 7541551 -> 7541381 (-0.00%) instructions in affected programs: 3054 -> 2884 (-5.57%) helped: 29 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
86c57ebe0ed1acc98545746058862db7429412da	21-Aug-2015	Boyan Ding <boyan.j.ding@gmail.com>	i965/nir: Make use of nir_opt_undef Shader-db result on Ivy Bridge: total instructions in shared programs: 145484 -> 145445 (-0.03%) instructions in affected programs: 225 -> 186 (-17.33%) helped: 5 HURT: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
5f14c417c86ced1847746c64d4db54c7e5ddc187	18-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Use nir_shader::stage rather than passing it around. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
259f7291de2387aa3ac5f856b39b7b934a1d8e7d	18-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/fs: Rework uniform handling Previously, we treated the entire UNIFORM file as if it had two elements: One for direct things and one for indirect. This is substantially different from how the old visitor code handled it where each element was effectively its own uniform. This commit makes the NIR path more like the old ir_visitor path where each uniform is separate. This should allow us to more easily make decisions about what to push. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
6c33d6bbf9b54784e4498a81c73b712dca5dd737	12-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	nir: Pass a type_size() function pointer into nir_lower_io(). Previously, there were four type_size() functions in play - the i965 compiler backend defined scalar and vec4 type_size() functions, and nir_lower_io contained its own similar functions. In fact, the i965 driver used nir_lower_io() and then looped over the components using its own type_size - meaning both were in play. The two are /basically/ the same, but not exactly in obscure cases like subroutines and images. This patch removes nir_lower_io's functions, and instead makes the driver supply a function pointer. This gives the driver ultimate flexibility in deciding how it wants to count things, reduces code duplication, and improves consistency. v2 (Jason Ekstrand): - One side-effect of passing in a function pointer is that nir_lower_io is now aware of and properly allocates space for image uniforms, allowing us to drop hacks in the backend Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> v2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
da1b1bf85cdc691ec27f379de84dec495cdd51e0	15-Jul-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir: Do not scalarize phis in non-scalar setups Significantly reduces register pressure in some piglit tests. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
db8a6de571bb72ef43209a415e5492001a87b1d8	17-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir: Add new utility method brw_glsl_base_type_for_nir_type() This method returns the glsl_base_type corresponding to a nir_alu_type. It will factorize code currently present in fs_nir, that can be reused in vec4_nir on its upcoming emit_texture support. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
e4f02f47e70d384531ac68e6d33a62fdcdbd1f28	16-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Lower "vecN" instructions and mark them unreachable This enables NIR pass "lower_vec_to_movs" on shaders that work on vec4. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
9e5d827f455f3c72af6cb8d60b97890bab8d5ad0	25-Jun-2015	Alejandro Piñeiro <apinheiro@igalia.com>	i965/nir: Disable alu_to_scalar pass on non-scalar shaders Disables nir_lower_alu_to_scalar when the shader stage being processed works on vec4 vectors, like the upcoming NIR->vec4 backend. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
662c4c99065381b8e265310d176cfdef6698ca57	16-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Implement store_output intrinsic This implementation is based on the current URB setup in vec4_visitor, which requires the output register to be stored in the output_reg array at variable's original shader location index. But since nir_lower_io() pass uses the value in var->data.driver_location, we need to put there var->data.location instead, prior to calling nir_lower_io(), so that we end up with the correct index in const_index[0]. The driver_location is not used at all, so this patch also disables the nir_assign_var_locations pass on non-scalar shaders. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
97e205fd35bf77fd761caf24c611ff72cc0d85e2	17-Apr-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir: Move brw_type_for_nir_type() to brw_nir to allow reuse Upcoming NIR->vec4 pass can benefit from this method, so let's move it up. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
f7152525374015594e037fa11bb64e1c7174829b	01-Jul-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Implement load_const intrinsic Similar to fs_nir backend, a nir_local_values map will be filled with newly allocated registers as the load_const intrinsic instructions are processed. Later, get_nir_src() will fetch the registers from this map for sources that are ssa. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
6e58fc56a5a396020cd299db11895120ec3da520	03-Jul-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir: Do not assign direct uniform locations first for vec4-based shaders In the vec4 backend we want uniform locations to be assigned consecutively since that way the offsets produced by nir_lower_io are exactly what we need to implement nir_intrinsic_load_uniform. Otherwise we would need a mapping to match the output of nir_lower_io to the actual uniform registers we need to use. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
01f6235020f9f0c2bc1a6e6ea9bd15c22fb2bcf5	18-Jun-2015	Iago Toral Quiroga <itoral@igalia.com>	nir/nir_lower_io: Add vec4 support The current implementation operates in scalar mode only, so add a vec4 mode where types are padded to vec4 sizes. This will be useful in the i965 driver for its vec4 nir backend (and possibly other drivers that have vec4-based shaders). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
5e839727ed2378a01d3b657bad83abd4728e8da6	22-Jul-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir: Pass a is_scalar boolean to brw_create_nir() The upcoming introduction of NIR->vec4 pass will require that some NIR lowering passes are enabled/disabled depending on the type of shader (scalar vs. vector). With this patch we pass a 'is_scalar' variable to the process of constructing the NIR, to let an external context decide how the shader should be handled. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
864907e2f14523c130e6ff24c081789bb079bae1	24-Jun-2015	Connor Abbott <cwabbott0@gmail.com>	i965/fs: use SSA values directly Before, we would use registers, but set a magical "parent_instr" field to indicate that it was actually purely an SSA value (i.e., it wasn't involved in any phi nodes). Instead, just use SSA values directly, which lets us get rid of the hack and reduces memory usage since we're not allocating a nir_register for every value. It also makes our handling of load_const more consistent compared to the other instructions. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
2b1a1d8b1294f91b7ac563da1f395deba4384765	24-Jun-2015	Connor Abbott <cwabbott0@gmail.com>	nir/from_ssa: add a flag to not convert everything from SSA We already don't convert constants out of SSA, and in our backend we'd like to have only one way of saying something is still in SSA. The one tricky part about this is that we may now leave some undef instructions around if they aren't part of a phi-web, so we have to be more careful about deleting them. v2: rename and flip meaning of flag (Jason) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
c8b8e8b29b755cd3d80fc5e470f441cb3716152a	22-Jun-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Don't count NIR instructions for shader-db. Matt, Jason, and I haven't found this useful in a long time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
f4310cdbd08f20276237fbefa3eba406aa109636	10-Jun-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Re-index SSA definitions before printing NIR code. This makes the SSA definitions use sequential numbers (0, 1, 2, ...) instead of seemingly random ones. There's not much point normally, but it makes debug output much easier to read. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
58aed1031d40e62c9f41f7c512b3165dd5913d1e	20-May-2015	Jason Ekstrand <jason.ekstrand@intel.com>	prog_to_nir: Use a variable for uniform data Previously, the prog_to_nir pass was directly generating uniform load/store intrinsics. This converts it to use a single giant "parameters" variable and we now depend on lowering to get the uniform load/store intrinsics. One advantage of this is that we now have one code-path after we do the initial conversion into NIR. No shader-db changes. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
89c1feb78d010bc457f5d02be84c955eebf3549f	08-Apr-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Create NIR during LinkShader() and ProgramStringNotify(). Previously, we translated into NIR and did all the optimizations and lowering as part of running fs_visitor. This meant that we did all of that work twice for fragment shaders - once for SIMD8, and again for SIMD16. We also had to redo it every time we hit a state based recompile. We now generate NIR once at link time. ARB programs don't have linking, so we instead generate it at ProgramStringNotify time. Mesa's fixed function vertex program handling doesn't bother to inform the driver about new programs at all (which is rather mean), so we generate NIR at the last minute, if it hasn't happened already. shader-db runs ~9.4% faster on my i7-5600U, with a release build. v2: Check NirOptions != NULL in ProgramStringNotify(). Don't bother using _mesa_program_enum_to_shader_stage as we already know it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c