0d5071db5e50629a63490639a3c86dfc65bf27ab |
|
13-Jan-2017 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move Gen4-5 interpolation stuff to brw_wm_prog_data. This fixes glxgears rendering, which had surprisingly been broken since late October! Specifically, commit 91d61fbf7cb61a44adcaae51ee08ad0dd6b. glxgears uses glShadeModel(GL_FLAT) when drawing the main portion of the gears, then uses glShadeModel(GL_SMOOTH) for drawing the Gouraud-shaded inner portion of the gears. This results in the same fragment program having two different state-dependent interpolation maps: one where gl_Color is flat, and another where it's smooth. The problem is that there's only one gen4_fragment_program, so it can't store both. Each FS compile would trash the last one. But, the FS compiles are cached, so the first one would store FLAT, and the second would see a matching program in the cache and never bother to compile one with SMOOTH. (Clearing the program cache on every draw made it render correctly.) Instead, move it to brw_wm_prog_data, where we can keep a copy for every specialization of the program. The only downside is bloating the structure a bit, but we can tighten that up a bit if we need to. This also lets us kill gen4_fragment_program entirely! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
56ee2df4bf9b1e8c26cf8689f5ef20237c95466b |
|
13-Jan-2017 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/vec4: Fix mapping attributes This patch reverts 57bab6708f2bbc1ab8a3d202e9a467963596d462, which was causing issues with ILK and earlier VS programs. 1. brw_nir.c: Revert "i965/vec4/nir: vec4 also needs to remap vs attributes" Do not perform a remap in vec4 backend. Rather, do it later when setup attributes 2. brw_vec4.cpp: This fixes mapping ATTRx to proper GRFn. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99391 [jordan.l.justen@intel.com: merge Juan's two patches from bugzilla] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
57bab6708f2bbc1ab8a3d202e9a467963596d462 |
|
22-Apr-2016 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4/nir: vec4 also needs to remap vs attributes Doubles need extra space, so we would need to do a remapping for vec4 too in order to take that into account. We reuse the already existing remap_vs_attrs, but passing is_scalar, so they could remap accordingly. v2: code-format remap_vs_attrs_params initialization (Matt) Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
b4c44ff08c1154ff3790de89f29520d178e9e0ef |
|
08-Aug-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use the nir_move_comparisons pass. While the below stats are encouraging this pass will also become very usefull for avoiding regression once brw_do_channel_expressions() and brw_do_vector_splitting() are disabled. On Broadwell: total instructions in shared programs: 13078787 -> 13060898 (-0.14%) instructions in affected programs: 1809827 -> 1791938 (-0.99%) helped: 4527 HURT: 157 total cycles in shared programs: 256562762 -> 256590424 (0.01%) cycles in affected programs: 159749392 -> 159777054 (0.02%) helped: 5583 HURT: 2289 total spills in shared programs: 14929 -> 14923 (-0.04%) spills in affected programs: 62 -> 56 (-9.68%) helped: 1 HURT: 0 total fills in shared programs: 20144 -> 20141 (-0.01%) fills in affected programs: 253 -> 250 (-1.19%) helped: 1 HURT: 3 LOST: 0 GAINED: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
b5e682a1efa0a929574e739807ac2b046e004561 |
|
10-Aug-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move nir_lower_locals_to_regs a bit later. I'm going to add a boolean scheduling pass that I want run late, but after copy propagation and dead code elimination. Yet, I don't want to have to think about registers. So, move the register conversion a little later. No impact on shader-db. Suggested by Jason Ekstrand. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
5edc3381628d1db4468f31b1c66bb518146e35b5 |
|
09-Jan-2017 |
Kenneth Graunke <kenneth@whitecape.org> |
compiler: Merge shader_info's tcs and tes structs. Annoyingly, SPIR-V lets you specify all of these fields in either the TCS or TES, which means that we need to be able to store all of them for either shader stage. Putting them in a union won't work. Combining both is an easy solution, and given that the TCS struct only had a single field, it's pretty inexpensive. This patch renames the combined struct to "tess" to indicate that it's for tessellation in general, not one of the two stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
c2acf97fcc9b32eaa9778771282758e5652a8ad4 |
|
16-Dec-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
nir/i965: use two slots from inputs_read for dvec3/dvec4 vertex input attributes So far, input_reads was a bitmap tracking which vertex input locations were being used. In OpenGL, an attribute bigger than a vec4 (like a dvec3 or dvec4) consumes just one location, any other small attribute. So we mark the proper bit in inputs_read, and also the same bit in double_inputs_read if the attribute is a dvec3/dvec4. But in Vulkan, this is slightly different: a dvec3/dvec4 attribute consumes two locations, not just one. And hence two bits would be marked in inputs_read for the same vertex input attribute. To avoid handling two different situations in NIR, we just choose the latest one: in OpenGL, when creating NIR from GLSL/IR, any dvec3/dvec4 vertex input attribute is marked with two bits in the inputs_read bitmap (and also in the double_inputs_read), and following attributes are adjusted accordingly. As example, if in our GLSL/IR shader we have three attributes: layout(location = 0) vec3 attr0; layout(location = 1) dvec4 attr1; layout(location = 2) dvec3 attr2; then in our NIR shader we put attr0 in location 0, attr1 in locations 1 and 2, and attr2 in location 3 and 4. Checking carefully, basically we are using slots rather than locations in NIR. When emitting the vertices, we do a inverse map to know the corresponding location for each slot. v2 (Jason): - use two slots from inputs_read for dvec3/dvec4 NIR from GLSL/IR. v3 (Jason): - Fix commit log error. - Use ladder ifs and fix braces. - elements_double is divisible by 2, don't need DIV_ROUND_UP(). - Use if ladder instead of a switch. - Add comment about hardware restriction in 64bit vertex attributes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
45912fb908f7a1d2efbce0f1dbe81e5bc975fbe1 |
|
10-Dec-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/compiler: Use the new nir_opt_copy_prop_vars pass We run this after nir_lower_vars_to_ssa so that as many load/store_var intrinsics as possible before copy_prop_vars executes. This is because the pass isn't particularly efficient (it does a lot of linear walks of a linked list) so we'd like as much of the work as possible to be done before copy_prop_vars runs. Shader DB results on Sky Lake: total instructions in shared programs: 12020290 -> 12013627 (-0.06%) instructions in affected programs: 26033 -> 19370 (-25.59%) helped: 16 HURT: 13 total cycles in shared programs: 137772848 -> 137549012 (-0.16%) cycles in affected programs: 6955660 -> 6731824 (-3.22%) helped: 217 HURT: 237 total loops in shared programs: 3208 -> 3208 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 4112 -> 4057 (-1.34%) spills in affected programs: 483 -> 428 (-11.39%) helped: 2 HURT: 0 total fills in shared programs: 5519 -> 5102 (-7.56%) fills in affected programs: 993 -> 576 (-41.99%) helped: 2 HURT: 0 LOST: 0 GAINED: 0 Broadwell had similar results. On older hardware, the impact isn't as large because they don't advertise GL 4.5. Of the hurt programs, all but one are hurt by a single instruction and the one is hurt by 3 instructions. All of the helped programs, on the other hand, are helped by at least 3 instructions and one kerbal space program shader is helped by 44.59%. The real star of the show, however, is the Gl43CSDof synmark2 benchmark which has two shaders which are cut by 28% and 40% and the over-all runtime performance of the benchmark on my Sky Lake laptop is improved by around 25-30% (it's a bit hard to be exact due to thermal throttling). Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
e6ae19944d977dc91bc45adff679337182c20683 |
|
24-Nov-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rework gl_TessLevel*[] handling to use NIR compact arrays. Treating everything as scalar arrays allows us to drop a bunch of special case input/output munging all throughout the backend. Instead, we just need to remap the TessLevel components to the appropriate patch URB header locations in remap_patch_urb_offsets(). We also switch to treating the TES input versions of these as ordinary shader inputs rather than system values, as remap_patch_urb_offsets() just makes everything work out without special handling. This regresses one Piglit test: arb_tessellation_shader-large-uniforms/GL_TESS_CONTROL_SHADER-array-at-limit The compiler starts promoting the constant arrays assigned to gl_TessLevel* to uniform arrays. Since the shader also has a uniform array that uses the maximum number of uniform components, this puts it over the uniform component limit enforced by the linker. This is arguably a bug in the constant array promotion code (it should avoid pushing us over limits), but is unlikely to penalize any real application. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
8962cc96ec2bc1eb561a438512adc5042e2c8d34 |
|
17-Dec-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use nir_opt_trivial_continues and nir_opt_if Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
715f0d06d19e7c33d98f99c764c5c3249d13b1c0 |
|
13-Dec-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: use nir loop unrolling pass shader-db results for BDW: total instructions in shared programs: 12589614 -> 12590119 (0.00%) instructions in affected programs: 50525 -> 51030 (1.00%) helped: 7 HURT: 145 total cycles in shared programs: 241524604 -> 241490502 (-0.01%) cycles in affected programs: 1941404 -> 1907302 (-1.76%) helped: 302 HURT: 449 total loops in shared programs: 4245 -> 2947 (-30.58%) loops in affected programs: 1535 -> 237 (-84.56%) helped: 1142 HURT: 0 total spills in shared programs: 14453 -> 14453 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 18984 -> 18984 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 26 GAINED: 15 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
e729504fb1799c3ae31cea76d73946530ef9806f |
|
14-Sep-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir: pass compiler rather than devinfo to functions that call nir_optimize Later we will pass compiler to nir_optimise to be used by the loop unroll pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
eda3ec7957ec9324641ee75847b892885e77335f |
|
05-Dec-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: use nir_lower_indirect_derefs() for GLSL This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so thats is called by both OpenGL and Vulkan and removes that call to the old GLSL IR pass lower_variable_index_to_cond_assign() We want to do this pass in nir to be able to move loop unrolling to nir. There is a increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space program shaders that increase by 32 instructions. The changes seem to be caused be the difference in the GLSL IR vs NIR variable index lowering passes. The GLSL IR pass creates a simple if ladder for arrays of size 4 or less, while the NIR pass implements a binary search for all arrays regardless of size. Shader-db results BDW: total instructions in shared programs: 13021176 -> 13021819 (0.00%) instructions in affected programs: 57693 -> 58336 (1.11%) helped: 20 HURT: 190 total cycles in shared programs: 299805580 -> 299750826 (-0.02%) cycles in affected programs: 2290024 -> 2235270 (-2.39%) helped: 337 HURT: 442 total fills in shared programs: 19984 -> 19984 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 4 GAINED: 0 V2: remove the do_copy_propagation() call from the i965 GLSL IR linking code. This call was added in f7741c52111 but since we are moving the variable index lowering to NIR we no longer need it and can just rely on the nir copy propagation pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
77f65b3b643d135a5eab3799a600302177dccb26 |
|
13-Dec-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir: enable lowering of texture gradient for shadow samplers This gets the lowering on the Vulkan driver too, which is required for hardware that does not have the sample_l_d message (up to IvyBridge). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
f90da64fc65d8da99a5a5a140be7b64c8cf5ee6a |
|
30-Nov-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir: enable lowering of texture gradient for cube maps This gets the lowering on the Vulkan driver too. Fixes Vulkan CTS cube map texture gradient tests in: dEQP-VK.glsl.texture_functions.texturegrad.* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
0291bf4db2affbfb6a7daeb562c0cf56f6a85829 |
|
06-Dec-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
Revert "i965: use nir_lower_indirect_derefs() for GLSL" This reverts commit 9404439a754e5640ccd98df40fa694835c0d8759. I didn't intend to push it and it breaks clip and cull distance.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
9404439a754e5640ccd98df40fa694835c0d8759 |
|
15-Aug-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: use nir_lower_indirect_derefs() for GLSL This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so thats is called by both OpenGL and Vulkan and removes that call to the old GLSL IR pass lower_variable_index_to_cond_assign() We want to do this pass in nir to be able to move loop unrolling to nir. There is a increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space program shaders that increase by 32 instructions. Shader-db results BDW: total instructions in shared programs: 8705873 -> 8706194 (0.00%) instructions in affected programs: 32515 -> 32836 (0.99%) helped: 3 HURT: 79 total cycles in shared programs: 74618120 -> 74583476 (-0.05%) cycles in affected programs: 528104 -> 493460 (-6.56%) helped: 47 HURT: 37 LOST: 2 GAINED: 0
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
a8ef92b031729fca86f8615d0ca113ddf0965af9 |
|
08-Nov-2016 |
Jason Ekstrand <jason@jlekstrand.net> |
i965/compiler: Disable trig workarounds on KBL+ The precision of our trig instructions appears to have been fixed on Kaby Lake. Neither Ben nor I can find any documentation for this. However, the dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where they fail on Sky Lake. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
38a8507f79b8da71b309654ce56854bbea1bcf94 |
|
17-Oct-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use NIR-based clip/cull lowering for OpenGL as well. The old approach works fine, and this approach isn't necessarily better. But it at least has the advantage that Vulkan and GL use the same approach. I originally wrote it to gain additional testing for the new paths. shader-db statistics show 0 instruction count changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
2e423ca1477bd212c01676c5e4828ebdb83310d8 |
|
25-Oct-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir: stop adjusting driver location for varying packing As of 59864e8e020 we just use the location assigned by the front-end and no longer need this for i965. Since there were some issues in the logic with assigning arrays the same driver location if they didn't start at the same location just remove it and let other drivers implement a solution if needed when they add ARB_enhanced_layouts support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
91d61fbf7cb61a44adcaae51ee08ad0dd6b2a03b |
|
20-Oct-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: rewrite brw_setup_vue_interpolation() Here brw_setup_vue_interpolation() is rewritten not to use the InterpQualifier array in gl_fragment_program which will allow us to remove it. This change also makes the code which is only used by gen4/5 more self contained as it now has its own gen5_fragment_program struct rather than storing the map in brw_context. This means the interpolation map will only get processed once and will get stored in the in memory cache rather than being processed everytime the fs changes. Also by calling this from the fs compile code rather than from the upload code and using the interpolation assigned there we can get rid of the BRW_NEW_INTERPOLATION_MAP flag. It might not seem ideal to add a gen5_fragment_program struct however by the end of this series we will have gotten rid of all the brw_{shader_stage}_program structs and replaced them with a generic brw_program struct so there will only be two program structs which is better than what we have now. V2: Don't remove BRW_NEW_INTERPOLATION_MAP from dirty_bit_map until the following patch to fix build error. V3 - Suggestions by Jason: - name struct gen4_fragment_program rather than gen5_fragment_program - don't use enum with memset() - create interp mode set helper and simplify logic to call it - add assert when calling function to show prog will never be NULL for gen4/5 i.e. no Vulkan Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b |
|
13-Oct-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
094fe3a9591ce200162d955635eee577c13f9324 |
|
13-Oct-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir: move nir_shader_info to a common compiler header This will allow use to stop copying values between structs and will also simplify handling handling these values in the shader cache. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
59864e8e02057cc6fa0448a8af067a3cf53389da |
|
13-Oct-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Don't use nir_assign_var_locations for VS/TES/GS outputs. Fixes spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3. v2: Remove nir_outputs field from fs_visitor (caught by Tim and Iago). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
89e1436e2d4ff0c15202708979eb36761cae4167 |
|
11-Oct-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Silence unused parameter warnings brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter] gl_shader_stage shader_type, ^ brw_nir.c: In function ‘brw_nir_lower_vs_inputs’: brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo, ^ brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter] uint32_t sampler, ^ brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter] vec4_visitor::gs_emit_vertex(int stream_id) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
36f0f0318275f65f8744ec6f9471702e2f58e6d5 |
|
07-Sep-2016 |
Eric Anholt <eric@anholt.net> |
nir: Allow opt_peephole_sel to be more aggressive in flattening IFs. VC4 was running into a major performance regression from enabling control flow in the glmark2 conditionals test, because of short if statements containing an ffract. This pass seems like it was was trying to ensure that we only flattened IFs that should be entirely a win by guaranteeing that there would be fewer bcsels than there were MOVs otherwise. However, if the number of ALU ops is small, we can avoid the overhead of branching (which itself costs cycles) and still get a win, even if it means moving real instructions out of the THEN/ELSE blocks. For now, just turn on aggressive flattening on vc4. i965 will need some tuning to avoid regressions. It does looks like this may be useful to replace freedreno code. Improves glmark2 -b conditionals:fragment-steps=5:vertex-steps=0 from 47 fps to 95 fps on vc4. vc4 shader-db: total instructions in shared programs: 101282 -> 99543 (-1.72%) instructions in affected programs: 17365 -> 15626 (-10.01%) total uniforms in shared programs: 31295 -> 31172 (-0.39%) uniforms in affected programs: 3580 -> 3457 (-3.44%) total estimated cycles in shared programs: 225182 -> 223746 (-0.64%) estimated cycles in affected programs: 26085 -> 24649 (-5.51%) v2: Update shader-db output. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
111f6b250d01fa1937103f24b5cb54b15dd77fbf |
|
14-Sep-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Roll set_default_interpolation into lower_fs_inputs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
246db0063eb6e01aad961b1c73d32fca911ae1df |
|
14-Sep-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Use NIR for handling forced per-sample interpolation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
ed65e6ef49e17e9cae93a8f98e2968346de2bc6e |
|
14-Sep-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Add a flag to lower_io to force "sample" interpolation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
2d8a3fa7ea994ad02a40ff497109f966e3fcbeec |
|
14-Sep-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Report progress from nir_lower_phis_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
32630e211e60a2b41388d403cfbd4f43344d8590 |
|
14-Sep-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Report progress from nir_lower_alu_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
527f37199929932300acc1688d8160e1f3b1d753 |
|
23-Aug-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
intel: s/brw_device_info/gen_device_info/ Generated by: sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
7dac8820730777756c00d7024330517848dc3b9f |
|
22-Jul-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/fs: Rework representation of fragment output locations in NIR. The problem with the current approach is that driver output locations are represented as a linear offset within the nir_outputs array, which makes it rather difficult for the back-end to figure out what color output and index some nir_intrinsic_load/store_output was meant for, because the offset of a given output within the nir_output array is dependent on the type and size of all previously allocated outputs. Instead this defines the driver location of an output to be the pair formed by its GLSL-assigned location and index (I've borrowed the bitfield macros from brw_defines.h in order to represent the pair of integers as a single scalar value that can be assigned to nir_variable_data::driver_location). nir_assign_var_locations is no longer useful for fragment outputs. Because fragment outputs are now allocated independently rather than within the nir_outputs array, the get_frag_output() helper becomes necessary in order to obtain the right temporary register for a given location-index pair. The type_size helper passed to nir_lower_io is now type_size_dvec4 rather than type_size_vec4_times_4 so that output array offsets are provided in terms of whole array elements rather than in terms of scalar components (dvec4 is the largest vector type supported by the GLSL so this will cause all individual fragment outputs to have a size of one regardless of the type). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
9f32721f8695f3e55849dce015da3b53d1af5d57 |
|
21-Jul-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Enable NIR lowering of txf and rect offsets This fixes the following piglit tests on gen6+: tex-miplevel-selection textureProjGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRectShadow tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4 tex-miplevel-selection textureProjGradOffset 2DRectShadow Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
7f53fead5cf9a85c74a94d359dd5fccfbb87856c |
|
23-May-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: enable component packing for vs and fs Rather than trying to work out the total number of components used at a location we simply treat all outputs as vec4s. This removes the need for complex code looping over varyings to match packed locations and the need for storing the total number of components used at each location. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
1eef0b73aa323d94d5a080cd1efa81ccacdbd0d2 |
|
12-Jul-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rewrite FS input handling to use the new NIR intrinsics. This eliminates the need to walk the list of input variables, recurse into their types (via logic largely redundant with nir_lower_io), and interpolate all possible inputs up front. The backend no longer has to care about variables at all, which eliminates complications from trying to pack multiple variables into the same location. Instead, each intrinsic specifies exactly what's needed. This should unblock Timothy's work on GL_ARB_enhanced_layouts. Each load_interpolated_input intrinsic corresponds to PLN instructions, while load_barycentric_at_* intrinsics correspond to pixel interpolator messages. The pixel/centroid/sample barycentric intrinsics simply refer to payload fields (delta_xy[]), and don't actually generate any code. Because we use a single intrinsic for both centroid-qualified variables and interpolateAtCentroid(), they become indistinguishable. We stop sending pixel interpolator messages for those, and instead use the payload provided data, which should be considerably faster. On Broadwell: total instructions in shared programs: 9067751 -> 9067570 (-0.00%) instructions in affected programs: 145902 -> 145721 (-0.12%) helped: 422 HURT: 209 total spills in shared programs: 2849 -> 2899 (1.76%) spills in affected programs: 760 -> 810 (6.58%) helped: 0 HURT: 10 total fills in shared programs: 3910 -> 3950 (1.02%) fills in affected programs: 617 -> 657 (6.48%) helped: 0 HURT: 10 LOST: 3 GAINED: 3 The differences mostly appear to be slight changes in MOVs. v2: Use nir_shader_compiler_options::use_interpolated_input_intrinsics flag rather than passing it directly to nir_lower_io. Use the unreachable() macro rather than assert in one place. (Review feedback from Chris Forbes.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
448adfbc67f4f6d0268a2f94dac311a26dc19864 |
|
18-May-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir: use the same driver location for packed varyings Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
a8a9d1bf41c00123cefb6e757f3509c62e880a15 |
|
14-Jun-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965: remove type_size_vec4_times_4() type_size_vec4_times_4() was introduced as a fix in 8dcf807cb43383 however since 3810c1561 we can just use type_size_scalar() and get the actual number of outputs we need. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
5e43ba7e9e9bfce451f9caa3845136f8a5b6eda0 |
|
26-May-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Move brw_create_nir to brw_program.c This way it's no longer part of libi965_compiler.la since it depends on GLSL and ARB program stuff. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
86a2447eec7e87e46e842ca7a3ad5cd9fadb1ca5 |
|
26-May-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Move the type_size_*_bytes functions to brw_nir.h Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
32210dea8e474f8e93f5df681fb6a8265a0cda4b |
|
26-May-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
compiler: Move glsl_to_nir to libglsl.la Right now libglsl.la depends on libnir.la so putting it in libnir.la adds a dependency on libglsl.la that goes the wrong direction. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
654e950cba55dabd2d9accb60db8e5f4c1495716 |
|
02-May-2016 |
Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> |
i965: Invoke lowering pass for YUV textures Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
0970c563d6e4a30ab0852ef664dc97a995997f88 |
|
20-May-2016 |
Dave Airlie <airlied@redhat.com> |
nir: remove dead glsl variables before lowering io. For cull distance GLSL will let unsized unused arrays get into the backend, we should nuke those straight away, to save caring about them later. This fixes: arb_separate_shader_objects/linker/large-number-of-unused-varyings as a side effect (even without culling changes). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
dac10e8a1390711f1f36f224644c4a33586cebe3 |
|
17-May-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965, anv: Use NIR FragCoord re-center and y-transform passes. This handles gl_FragCoord transformations and other window system vs. user FBO coordinate system flipping by multiplying/adding uniform values, rather than recompiles. This is much better because we have no decent way to guess whether the application is going to use a shader with the window system FBO or a user FBO, much less the drawable height. This led to a lot of recompiles in many applications. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
d6281a9d955ad97f993927bc214e4b641cfbe359 |
|
15-Apr-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965: take care of doubles when lowering VS inputs Input attributes can require 2 vec4 or 1 vec4 depending on whether they are double-precision or not. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
b0fb08e179d784ca319c3c547a874fd24ce93c3f |
|
01-Apr-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965: take care of doubles when remapping VS attributes Double-precision types require 1 slot in VUE for double and dvec2, and 2 slots for anything else. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
dfbabc6bad775e1575ff4a97a3c871341cd57f77 |
|
25-Mar-2016 |
Rob Clark <robclark@freedesktop.org> |
nir/lower-io: add support for lowering inputs Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
b085016f94721a6c18f7076fc37c450a98e6bdbc |
|
25-Mar-2016 |
Rob Clark <robclark@freedesktop.org> |
nir: rename lower_outputs_to_temporaries -> lower_io_to_temporaries Since it will gain support to lower inputs, give it a more generic name. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
47fcef9a209e533103a7ecf4d69440a67aa463ed |
|
25-Mar-2016 |
Rob Clark <robclark@freedesktop.org> |
nir: move callsite of lower_outputs_to_temporaries Going to convert this pass to parameterized lower_io_to_temporaries, and we want the user to be able to specify whether to lower outputs or inputs or both. The restriction of running this pass before validate to avoid output reads no longer applies. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
30424fd25a2f6554c35272d8edeacab0299ad8cc |
|
07-Aug-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965: use pack/unpackDouble lowering Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
bea2f8beb53450bd07e5a33d48f00a9e9520645d |
|
04-Aug-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965: use double lowering pass v2: also lower trunc, ceil, floor, fract and roundEven (Iago) v3: also lower mod for doubles (Sam) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
9464d8c49813aba77285e7465b96e92a91ed327c |
|
27-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Switch the arguments to nir_foreach_function This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
707e72f13bb78869ee95d3286980bf1709cba6cf |
|
27-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Switch the arguments to nir_foreach_instr This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/ and similar expressions for nir_foreach_instr_safe etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
7efff10585122d484dc3adab14af9380b9b8f309 |
|
13-Apr-2016 |
Connor Abbott <cwabbott0@gmail.com> |
i965/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
b6dc940ec273252678d40707d300851fa1c85ea5 |
|
13-Apr-2016 |
Connor Abbott <cwabbott0@gmail.com> |
nir: rename nir_foreach_block*() to nir_foreach_block*_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
b63a98b1211d22f759ae9c80b2270fe2d3b2639e |
|
25-Mar-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/dead_variables: Configurably work with any variable mode The old version of the pass only worked on globals and locals and always left inputs, outputs, uniforms, etc. alone. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
bfd17c76c1267756ea16051cbe174cb23ff49f44 |
|
08-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Port INTEL_PRECISE_TRIG=1 to NIR. This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos are multiplied by an constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
b0dffdc616801a1fd8534502e11ac840369041ab |
|
08-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Pass brw_compiler into brw_preprocess_nir() instead of is_scalar. I want to be able to read other fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
084b24f5582567ebf5aa94b7f40ae3bdcb71316b |
|
16-Mar-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
9076c4e289de0debf1fb2a7237bdeb9c11002347 |
|
14-Aug-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
nir: update opcode definitions for different bit sizes Some opcodes need explicit bitsizes, and sometimes we need to use the double version when constant folding. v2: fix output type for u2f (Iago) v3: do not change vecN opcodes to be float. The next commit will add infrastructure to enable 64-bit integer constant folding so this is isn't really necessary. Also, that created problems with source modifiers in some cases (Iago) v4 (Jason): - do not change bcsel to work in terms of floats - leave ldexp generic Squashed changes to handle different bit sizes when constant folding since otherwise we would break the build. v2: - Use the bit-size information from the opcode information if defined (Iago) - Use helpers to get type size and base type of nir_alu_type enum (Sam) - Do not fallback to sized types to guess bit-size information. (Jason) Squashed changes in i965 and gallium/nir drivers to support sized types. These functions should only see sized types, but we can't make that change until we make sure that nir uses the sized versions in all the relevant places. A later commit will address this. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
7d021cb15e6d67ecef8b020fd36c4a680bcc9c39 |
|
18-Jan-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/nir: Lower nir compute shader shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
a0294c2cf3d23943c7f365abf84765afa0f383a2 |
|
25-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Simplify brw_nir_lower_vue_inputs() slightly. The same code appeared in both branches; pull it above the if statement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
8151003ade952c3e9d8284fada9237e1311cf173 |
|
25-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Avoid recalculating the normal VUE map for IO lowering. The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
15b3639bf1b0676e74b107d74653185eedbc6688 |
|
25-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Avoid recalculating the tessellation VUE map for IO lowering. The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
cfbd9831f89ef165e7998d0b8524a1aefedec404 |
|
25-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Eliminate brw_nir_lower_{inputs,outputs,io} functions. Now that each stage is directly calling brw_nir_lower_io(), and we have per-stage helper functions, it makes sense to just call the relevant one directly, rather than going through multiple switch statements. This also eliminates stupid function parameters, such as the two that only apply to vertex attributes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
b96ddd2e52e7205e5820714f2ad1028b666426c6 |
|
25-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Split brw_nir_lower_inputs/outputs into per-stage functions. These functions are both giant switch statements where most cases don't overlap at all. Let's put the bulk of the work in per-stage helpers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
d33c478bedc6ac0b2bd2646e443e690cdfb2b640 |
|
25-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Remove catch-all nir_lower_io call with specific cases. Most cases already call nir_lower_io explicitly for input and output lowering. This catch all isn't very useful anymore - we can just add it to the remaining cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
51f87979934100af8c40cfa17f670bc38417ddc0 |
|
25-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move optimizations from brw_nir_lower_io to brw_postprocess_nir. This simplifies things. Every caller of brw_nir_lower_io() immediately calls brw_postprocess_nir(). The only real change this will have is that we get an extra brw_nir_optimize() call when compiling compute shaders, but that seems fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
dcd4a841e9096b988ea3ca2779e7c8b1ca5e5747 |
|
25-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Always do NIR IO lowering at specialization time. We've now hit literally every case other than geometry shaders (and compute shaders, but those are a no-op). So, let's just move geometry shaders over too and be done with it. The only advantage to doing this at link time was to save the expense of running the pass on recompiles. But we're already running a lot of passes, and the extra code complexity isn't worth it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
b3cb6e78aa219ad73c145a25ee1bb48fd8b025d0 |
|
17-Feb-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Do lower_io late for fragment shaders The Vulkan driver wants to be able to delete fragment outputs that are beyond key.nr_color_regions; this is a lot easier if we lower outputs at specialization time rather than link time. (Rationale added to commit message by Ken) Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
d56ae2d1605fc1b5a3fdf5aba9aefc3c7692a4ba |
|
14-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Apply VS attribute workarounds in NIR. This patch re-implements the pre-Haswell VS attribute workarounds. Instead of emitting shader code in the vec4 backend, we now simply call a NIR pass to emit the necessary code. This simplifies the vec4 backend. Beyond deleting code, it removes the primary use of ATTR as a destination. It also eliminates the requirement that the vec4 VS backend express the ATTR file in terms of VERT_ATTRIB_* locations, giving us a bit more flexibility. This approach is a little different: rather than munging the attributes at the top, we emit code to fix them up when they're accessed. However, we run the optimizer afterwards, so CSE should eliminate the redundant math. It may even be able to fuse it with other calculations based on the input value. shader-db does not handle non-default NOS settings, so I have no statistics about this patch. Note that the scalar backend does not implement VS attribute workarounds, as they are unnecessary on hardware which allows SIMD8 VS. v2: Do one multiply for FIXED rescaling and select components from either the original or scaled copy, rather than multiplying each component separately (suggested by Matt Turner). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
74f956c416d5b0b37b4c2d6b957167bb203502c3 |
|
22-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use nir_lower_load_const_to_scalar(). I don't know why, but we never hooked up this pass Eric wrote. Otherwise, you can end up with stupid scalarized code such as: vec4 ssa_7 = load_const (0.0, 0.0, 0.0, 0.0) vec4 ssa_8 = ... vec1 ssa_9 = feq ssa_8, ssa_7 vec1 ssa_10 = feq ssa_8.y, ssa_7.y vec1 ssa_11 = feq ssa_8, ssa_7.z vec1 ssa_12 = feq ssa_8.y, ssa_7.w ssa_8.xyxy == <0, 0, 0, 0> should only take two feq instructions. shader-db on Skylake: total instructions in shared programs: 9121153 -> 9120749 (-0.00%) instructions in affected programs: 32421 -> 32017 (-1.25%) helped: 277 HURT: 69 total cycles in shared programs: 69003364 -> 69000912 (-0.00%) cycles in affected programs: 899186 -> 896734 (-0.27%) helped: 313 HURT: 403 This also prevents regressions when disabling channel expressions. v2: Don't call opt_cse afterwards (requested by Matt). It should happen in the optimization loop below anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
a39a8fbbaa129f4e52f2a3ad2747182e9a74d910 |
|
17-Jan-2016 |
Emil Velikov <emil.velikov@collabora.com> |
nir: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
3657cbf24f3b0baf7e3382e572d97a36b0ed4103 |
|
14-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Apply add_const_offset_to_base for vec4 VS inputs too. This shouldn't hurt anything, and I'm about to introduce a pass that will want it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
a3500f943e2c61c0aed043108132f35b79d16676 |
|
14-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Make add_const_offset_to_base() work at the shader level. This makes it a pass, hiding the parameter structs and block callbacks so it's simpler to work with. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
317628dbb35d03d1e855332c892594ae491c5d24 |
|
18-Nov-2015 |
Rob Clark <robclark@freedesktop.org> |
nir: extract out helper macros for running passes Note these are a bit uglier, due to avoidance of GNU C extensions. But drivers which do not need to be built with compilers that don't support the extension can wrap these macros with their own. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
237f2f2d8b45d9d956102eec6f9be63193e5269b |
|
26-Dec-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Get rid of function overloads When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can hav a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> ir3 bits are Reviewed-by: Rob Clark <robclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
794eb9d7270456ab3d2cadbaf302192eca7f4dbc |
|
08-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Handle mix-and-match TCS/TES with separate shader objects. GL_ARB_separate_shader_objects allows the application to mix-and-match TCS and TES programs separately. This means that the interface between the two stages isn't known until the final SSO pipeline is in place. This isn't a great match for our hardware: the TCS and TES have to agree on the Patch URB entry layout. Since we store data as per-patch slots followed by per-vertex slots, changing the number of per-patch slots can significantly alter the layout. This can easily happen with SSO. To handle this, we store the [Patch]OutputsWritten and [Patch]InputsRead bitfields in the TCS/TES program keys, introducing program recompiles. brw_upload_programs() decides the layout for both TCS and TES, and passes it to brw_upload_tcs/tes(), which store it in the key. When creating the NIR for a shader specialization, we override nir->info.inputs_read (and friends) to the program key's values. Since everything uses those, no further compiler changes are needed. This also replaces the hack in brw_create_nir(). To avoid recompiles, brw_precompile_tes() looks to see if there's a TCS in the linked shader. If so, it accounts for the TCS outputs, just as brw_upload_programs() would. This eliminates all recompiles in the non-SSO case. In the SSO case, there should only be recompiles when using a TCS and TES that have different input/output interfaces. Fixes Piglit's mix-and-match-tcs-tes test. v2: Pull the brw_upload_programs code into a brw_upload_tess_programs() helper function (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
01b1b44d31adde3954d1f1404ca66f90d87d4ae5 |
|
08-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Defer input lowering for tessellation stages until specialization. With tessellation shaders and SSO, we won't be able to always decide on VUE map layouts at LinkProgram time. Unfortunately, we have to delay it until shader specialization time. However, uniform lowering cannot be deferred - brw_codegen_*_prog() reads nir->num_uniforms. Fortunately, we don't need to defer it - uniform, system value, atomic, and sampler lowering can safely stay where it is. This patch moves those to brw_lower_nir()'s only caller, renames brw_lower_nir() to brw_nir_lower_io(), and introduces calls to that. For non-tessellation stages, I chose to call brw_nir_lower_io() from brw_create_nir(), so it's still done at the same time. There's no need to defer it, and doing it at LinkProgram time is nice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
9f0944d15b9d2cd85f501f80eea7e6b6fc7f3487 |
|
10-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Make TES inputs match TCS outputs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
4fac9500100273424450b5687c4e04dfd066d08e |
|
10-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Force VS -> TCS varyings to use the SSO VUE map layout. The compact VUE map only works when varying packing is in use. Unfortunately, varying packing is disabled for TCS inputs. This is needed to fix Piglit's tcs-input-read-array-interface test. v2: Make lines fit in 80 columns (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
bee42cc1f78c480d13da879682268ee14b0d6fe7 |
|
10-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Handle TCS outputs and TES inputs. TCS outputs and TES inputs both refer to a common "patch URB entry" shared across all invocations. First, there are some number of per-patch entries. Then, there are per-vertex entries accessed via an offset for the variable and a stride times the vertex index. Because these calculations need to be done in both the vec4 and scalar backends, it's simpler to just compute the offset calculations in NIR. It doesn't necessarily make much sense to use per-vertex intrinsics afterwards, but that at least means we don't lose the per-patch vs. per-vertex information. v2: Use is_input/is_output helpers (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
31140d097a7939e0f917aa76bd37b5c682898e63 |
|
10-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Handle TCS inputs and TES outputs. TES outputs work exactly like VS outputs, so we can simply add a case statement for those. TCS inputs are very similar to geometry shaders - they're arrays of per-vertex data. We use the same method I used for the scalar GS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
9f3917bf372aa19f85875dbe30ca12adc9b67b90 |
|
10-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix partial variable access for geometry shaders in SSO mode. Without varying packing, if a VS writes a compound variable, and the GS only reads part of it, the base location of the variable may not actually be in the VUE map. To cope with this, we do lowering in terms of varying slots, add any constant offsets to the base, and then do the VUE map remapping. This ensures we only look up VUE map entries for slots which actually exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
8c4deb10dfbef683a2052a7bd62450aa76ad8fde |
|
09-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Separate base offset/constant offset combining from remapping. My tessellation branch has two additional remap functions. I don't want to replicate this logic there. v2: Handle inputs/outputs separately (suggested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
78b81be627734ea7fa50ea246c07b0d4a3a1638a |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Get rid of *_indirect variants of input/output load/store intrinsics There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the mean time, having the *_indirect variants adds special cases a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convdert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the *_indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of *_indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <eric@anholt.net> ir3 changes are Reviewed-by: Rob Clark <robdclark@gmail.com> NIR changes are Acked-by: Rob Clark <robdclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
18069dce4a4c3d71e6afc6b10bfa7bee0560ba9c |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Make uniform offsets be in terms of bytes This commit pushes makes uniform offsets be terms of bytes starting with nir_lower_io. They get converted to be in terms of vec4s or floats when we cram them in the UNIFORM register file but reladdr remains in terms of bytes all the way down to the point where we lower it to a pull constant load. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
aa35b0c2c71f054f72df5a85779d0862fa7d6e4a |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Get rid of the nir_inputs array It's not really buying us anything at this point. It's just a way of remapping one offset namespace onto another. We can just use the location namespace the whole way through. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
f36993b46962eab4446bc1964eb47149751aee26 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
d278e31459374feb18edd97d5adaacccc08f978a |
|
18-Nov-2015 |
Rob Clark <robclark@freedesktop.org> |
util: move brw_env_var_as_boolean() to util Kind of a handy function. And I'll want it available outside of i965 for common nir-pass helpers. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nhaehnle@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
d9b8fde963a53d4e06570d8bece97f806714507a |
|
12-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use NIR for lowering texture swizzle Now that nir_lower_tex can do texture swizzle lowering, we can use that instead of repeating more-or-less the same code in both backends. This both allows us to share code and means that things like the tg4 work-arounds are somewhat simpler because they don't have to take the swizzle into account. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
6c8ba59cff14a1a86273f4008ff2a8e68335ab25 |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use nir_lower_tex for texture coordinate lowering Previously, we had a rescale_texcoords helper in the FS backend for handling rescaling of texture coordinates. Now that we can do variants in NIR, we can use nir_lower_tex to do the rescaling for us. This allows us to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and GL_CLAMP handling in vertex and geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
1417f6a216b46dbbaa1bfe0cef97e2b4a48224c0 |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/lower_tex: Report progress Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
ce767bbdfff7c2a7829b652c111a11eb9ddba026 |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Move postprocess_nir to codegen time This allows us to insert NIR passes between initial NIR compilation and optimization (link time) and actual backend code-gen. In particular, it will allow us to do shader variants in NIR and share some of that shader variant code between backends. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
9cf108193b61c342c94c4cd980c4b403638e1051 |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Split shader optimization and lowering into three stages At the moment, brw_create_nir just calls the three stages in sequence so there's not much difference. Soon, however, we will want to start doing variants in NIR at which point the postprocessing step will have to move from shader create time to codegen time. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
f58813842bcece3498f55ec5d582466ccff92a5e |
|
15-May-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: s/nir_type_unsigned/nir_type_uint v2: do the same in tgsi_to_nir (Samuel) v3: added missing cases after rebase (Iago) v4: Add a blank space after '#' in one of the comments (Matt) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
006e4f070f08ff1e1731863940bc51de9e97b865 |
|
19-Oct-2015 |
Rob Clark <robdclark@gmail.com> |
nir: add nir_var_all enum Otherwise, passing -1 gets you: error: invalid conversion from 'int' to 'nir_variable_mode' [-fpermissive] Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
0bee3acc2a303b4cbbac0f6f54ffc8be79bc7470 |
|
16-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Add hooks for testing nir_shader_clone This commit adds code for testing nir_shader_clone by running it after each and every optimization pass and throwing away the old shader. Testing nir_shader_clone is hidden behind a new INTEL_CLONE_NIR environment variable. Reviewed-by: Rob Clark <robclark@freedesktop.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
9ff71b649b4b3808a9e17ce69743c6037fd6603c |
|
03-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Validate that NIR passes call nir_metadata_preserve(). Failing to call nir_metadata_preserve() can have nasty consequences: some pass breaks dominance information, but leaves it marked as valid, causing some subsequent pass to go haywire and probably crash. This pass adds a simple validation mechanism to ensure passes handle this properly. We add a new bogus metadata flag that isn't used for anything in particular, set it before each pass, and ensure it *isn't* still set after the pass. nir_metadata_preserve will reset the flag, so correct passes will work, and bad passes will assert fail. (I would have made these functions static inline, but nir.h is included in C++, so we can't bit-or enums without lots of casting...) Thanks to Dylan Baker for the idea. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
7bc097899924f40140981567c7bb52297dd801f2 |
|
03-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes. OPT() is the normal macro for passes that return booleans, while OPT_V() is a variant that works for passes that don't properly report progress. (Such passes should be fixed to return a boolean, eventually.) These macros take care of calling nir_validate_shader() and setting progress appropriately. In the future, it would be easy to add shader dumping similar to INTEL_DEBUG=optimizer by extending the macro. v2 (Jason Ekstrand): - Fix an unused variable warning Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
94ff35204dba0ddbd7f5c4342206c8acba22d32f |
|
22-Oct-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver Because the next patch will add an optimization that is specific to i965, we want to move this loweing pass to that driver altogether. This is safe because i965 is the only consumer. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
8dcf807cb43383590ba193c7ff20b8a98e4a9f65 |
|
14-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix scalar VS float[] and vec2[] output arrays. The scalar VS backend has never handled float[] and vec2[] outputs correctly (my original code was broken). Outputs need to be padded out to vec4 slots. In fs_visitor::nir_setup_outputs(), we tried to process each vec4 slot by looping from 0 to ALIGN(type_size_scalar(type), 4) / 4. However, this is wrong: type_size_scalar() for a float[2] would return 2, or for vec2[2] it would return 4. This looked like a single slot, even though in reality each array element would be stored in separate vec4 slots. Because of this bug, outputs[] and output_components[] would not get initialized for the second element's VARYING_SLOT, which meant emit_urb_writes() would skip writing them. Nothing used those values, and dead code elimination threw a party. To fix this, we introduce a new type_size_vec4_times_4() function which pads array elements correctly, but still counts in scalar components, generating correct indices in store_output intrinsics. Normally, varying packing avoids this problem by turning varyings into vec4s. So this doesn't actually fix any Piglit or dEQP tests today. However, if varying packing is disabled, things would be broken. Tessellation shaders can't use varying packing, so this fixes various tcs-input Piglit tests on a branch of mine. v2: Shorten the implementation of type_size_4x to a single line (caught by Connor Abbott), and rename it to type_size_vec4_times_4() (renaming suggested by Jason Ekstrand). Use type_size_vec4 rather than using type_size_vec4_times_4 and then dividing by 4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
c9541a74e4d179ad844bdf8af1e3de541c5b14c2 |
|
24-Sep-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add scalar GS input lowering code. We really ought to compute the VUE map at link time and stash it, rather than recomputing it here, but with the mess of program structures I wasn't sure where to put it. We can improve that later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
a3d0359aff7a9be90149c416844f330b4f9a15ed |
|
26-Oct-2015 |
Timothy Arceri <timothy.arceri@collabora.com> |
glsl: keep track of intra-stage indices for atomics This is more optimal as it means we no longer have to upload the same set of ABO surfaces to all stages in the program. This also fixes a bug where since commit c0cd5b var->data.binding was being used as a replacement for atomic buffer index, but they don't have to be the same value they just happened to end up the same when binding is 0. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Alejandro Piñeiro <apinheiro@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
dbac0a6352053bd6106feff88d95b0fd38b82afe |
|
16-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Switch on shader stage in nir_lower_outputs(). VS, GS, and FS continue doing the same thing they did before. We can simplify the FS code a bit because it is always scalar. Compute shaders now assert that there are no outputs instead of doing a loop over 0 outputs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
0d1eef536bc744f5c4dcdf854ad6adfdfe4f4dcb |
|
14-Oct-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/fs: Ignore compute shaders in brw_nir_lower_inputs The commit shown below caused compute shaders to hit the unreachable in the default of the switch block. Since compute shaders don't have any inputs, we can make brw_nir_lower_inputs a no-op for CS. commit 2953c3d76178d7589947e6ea1dbd902b7b02b3d4 Author: Kenneth Graunke <kenneth@whitecape.org> Date: Fri Aug 14 15:15:11 2015 -0700 i965/vs: Map scalar VS input locations properly; avoid tons of MOVs. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
63728dac57c18df0f45bb2482f60188fac2d1efe |
|
13-Oct-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965/fs: Simplify FS in brw_nir_lower_inputs to only support scalar mode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
bd198b9f0a292a9ff4ffffec3a29bad23d62caba |
|
15-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Simplify fs_visitor's ATTR file. Previously, ATTR was indexed by VERT_ATTRIB_* slots; at the end of compilation, assign_vs_urb_setup() translated those into GRF units, and converted ATTR to HW_REGs. This patch moves the transslation earlier, making ATTR work in terms of GRF units from the beginning. assign_vs_urb_setup() simply has to add the number of payload registers and push constants to obtain the final hardware GRF number. (We can't do this earlier as those values aren't known.) ATTR still supports reg_offset; however, it's simply added to reg. It's not clear whether this is valuable or not. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
2953c3d76178d7589947e6ea1dbd902b7b02b3d4 |
|
15-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Map scalar VS input locations properly; avoid tons of MOVs. Previously, we used nir_lower_io with the scalar type_size function, which mapped VERT_ATTRIB_* locations to...some numbers. Then, in fs_visitor::nir_setup_inputs(), we created temporaries indexed by those numbers, and emitted MOVs from the actual ATTR registers to those temporaries. Virtually all of these were copy propagated away, but it's still ugly. This patch reworks our input lowering to produce NIR lower_input intrinsics that properly index into the ATTR file, so we can access it directly. No changes in shader-db. v2: Fix unreachable() message (Ken), update commit message (Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
5d7f8cb5a511977e256e773716fac3415d01443e |
|
01-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics. Geometry and tessellation shaders process multiple vertices; their inputs are arrays indexed by the vertex number. While GLSL makes this look like a normal array, it can be very different behind the scenes. On Intel hardware, all inputs for a particular vertex are stored together - as if they were grouped into a single struct. This means that consecutive elements of these top-level arrays are not contiguous. In fact, they may sometimes be in completely disjoint memory segments. NIR's existing load_input intrinsics are awkward for this case, as they distill everything down to a single offset. We'd much rather keep the vertex ID separate, but build up an offset as normal beyond that. This patch introduces new nir_intrinsic_load_per_vertex_input intrinsics to handle this case. They work like ordinary load_input intrinsics, but have an extra source (src[0]) which represents the outermost array index. v2: Rebase on earlier refactors. v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
1e3c1b107e075b210998998423901092b8fcd79b |
|
03-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use nir_foreach_variable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
ca941799ce76eac8afe2503fbacffee057e949d3 |
|
03-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Remove the prog parameter from brw_nir_lower_inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
cd1ae6ebfac22f76d26a5b8659423969b2aeddce |
|
06-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
193d29516ddb76f469fea17119493e2b685bc6b7 |
|
26-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Refactor input/output lowering setup into helpers. The code for input lowering is going to get significantly more complicated shortly, so I wanted to pull it out. Vertex shader inputs are handled nearly identically regardless of vec4/scalar mode, so I opted to not split that. I thought about having each function actually do the lowering, but one pass through nir_lower_io that handles all types (which weren't handled earlier) is probably more efficient. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
39a1d36a67974dd9fc3c0d834d6a117cdfed8f33 |
|
13-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Allow nir_lower_io() to only lower one type of variable. We may want to use different type_size functions for (e.g.) inputs vs. uniforms. Passing in -1 for mode ignores this, handling all modes as before. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
df31c1850d14729e27513ae733110a668f6b6e95 |
|
05-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/gs: Use new NIR intrinsics. By performing the vertex counting in NIR, we're able to elide a ton of useless safety checks around every EmitVertex() call: total instructions in shared programs: 3952 -> 3720 (-5.87%) instructions in affected programs: 3491 -> 3259 (-6.65%) helped: 11 HURT: 0 Improves performance in Gl32GSCloth by 0.671742% +/- 0.142202% (n=621) on Haswell GT3e at 1024x768. This should also make it easier to implement Broadwell's "Static Vertex Count" feature someday. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
faf5f174ddbc7680f6947ceababb94fdb552bcdb |
|
16-Sep-2015 |
Rob Clark <robclark@freedesktop.org> |
nir/lower_tex: support projector lowering per sampler type Some hardware, such as adreno a3xx, supports txp on some but not all sampler types. In this case we want more fine grained control over which texture projectors get lowered. v2: split out nir_lower_tex_options struct to make it easier to add the additional parameters coming in the following patches Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
d9b9ff76f17ee36b87b2722fa2a19e1d9f036c26 |
|
17-Sep-2015 |
Rob Clark <robclark@freedesktop.org> |
nir: rename nir_lower_tex_projector Since the following patches will add additional tex-lowering related functionality, which doesn't make sense to split out into a separate pass (as they would require duplication of the projector lowering logic), let's give this pass a more generic name. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
fc11dbe13f3470ff2a4cb91c6b063db2456664da |
|
09-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use nir_move_vec_src_uses_to_dest The idea here is not that it gives register coalescing a little bit of a helping hand. It doesn't actually fix the coalescing problems, but it seems to help a good bit. Shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1746280 -> 1683959 (-3.57%) instructions in affected programs: 1259166 -> 1196845 (-4.95%) helped: 11363 HURT: 148 v2 (Jason Ekstrand): - Run nir_move_vec_src_uses_to_dest after going out of SSA - New shader-db numbers Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
c951bb83056724df02ba7e6fe2dfa720c0f45c1f |
|
09-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_nir: Use partial SSA form rather than full non-SSA We made this switch in the FS backend some time ago and it seems to make a number of things a bit easier. In particular, supporting SSA values takes very little work in the backend and allows us to take advantage of the majority of the SSA information even after we've gotten rid of Phi nodes. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
1484d8c9aa2e7e78462ffb5c207394bef77af89b |
|
01-May-2015 |
Connor Abbott <cwabbott0@gmail.com> |
i965/nir: enable the dead control flow optimization total instructions in shared programs: 7541551 -> 7541381 (-0.00%) instructions in affected programs: 3054 -> 2884 (-5.57%) helped: 29 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
86c57ebe0ed1acc98545746058862db7429412da |
|
21-Aug-2015 |
Boyan Ding <boyan.j.ding@gmail.com> |
i965/nir: Make use of nir_opt_undef Shader-db result on Ivy Bridge: total instructions in shared programs: 145484 -> 145445 (-0.03%) instructions in affected programs: 225 -> 186 (-17.33%) helped: 5 HURT: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
5f14c417c86ced1847746c64d4db54c7e5ddc187 |
|
18-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Use nir_shader::stage rather than passing it around. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
259f7291de2387aa3ac5f856b39b7b934a1d8e7d |
|
18-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Rework uniform handling Previously, we treated the entire UNIFORM file as if it had two elements: One for direct things and one for indirect. This is substantially different from how the old visitor code handled it where each element was effectively its own uniform. This commit makes the NIR path more like the old ir_visitor path where each uniform is separate. This should allow us to more easily make decisions about what to push. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
6c33d6bbf9b54784e4498a81c73b712dca5dd737 |
|
12-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
nir: Pass a type_size() function pointer into nir_lower_io(). Previously, there were four type_size() functions in play - the i965 compiler backend defined scalar and vec4 type_size() functions, and nir_lower_io contained its own similar functions. In fact, the i965 driver used nir_lower_io() and then looped over the components using its own type_size - meaning both were in play. The two are /basically/ the same, but not exactly in obscure cases like subroutines and images. This patch removes nir_lower_io's functions, and instead makes the driver supply a function pointer. This gives the driver ultimate flexibility in deciding how it wants to count things, reduces code duplication, and improves consistency. v2 (Jason Ekstrand): - One side-effect of passing in a function pointer is that nir_lower_io is now aware of and properly allocates space for image uniforms, allowing us to drop hacks in the backend Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> v2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
da1b1bf85cdc691ec27f379de84dec495cdd51e0 |
|
15-Jul-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir: Do not scalarize phis in non-scalar setups Significantly reduces register pressure in some piglit tests. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
db8a6de571bb72ef43209a415e5492001a87b1d8 |
|
17-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir: Add new utility method brw_glsl_base_type_for_nir_type() This method returns the glsl_base_type corresponding to a nir_alu_type. It will factorize code currently present in fs_nir, that can be reused in vec4_nir on its upcoming emit_texture support. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
e4f02f47e70d384531ac68e6d33a62fdcdbd1f28 |
|
16-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Lower "vecN" instructions and mark them unreachable This enables NIR pass "lower_vec_to_movs" on shaders that work on vec4. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
9e5d827f455f3c72af6cb8d60b97890bab8d5ad0 |
|
25-Jun-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/nir: Disable alu_to_scalar pass on non-scalar shaders Disables nir_lower_alu_to_scalar when the shader stage being processed work on vec4 vectors, like the upcoming NIR->vec4 backend. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
662c4c99065381b8e265310d176cfdef6698ca57 |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Implement store_output intrinsic This implementation is based on the current URB setup in vec4_visitor, which requires the output register to be stored in the output_reg array at variable's original shader location index. But since nir_lower_io() pass uses the value in var->data.driver_location, we need to put there var->data.location instead, prior to calling nir_lower_io(), so that we end up with the correct index in const_index[0]. The driver_location is not used at all, so this patch also disables the nir_assign_var_locations pass on non-scalar shaders. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
97e205fd35bf77fd761caf24c611ff72cc0d85e2 |
|
17-Apr-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir: Move brw_type_for_nir_type() to brw_nir to allow reuse Upcoming NIR->vec4 pass can benefit from this method, so lets move it up. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
f7152525374015594e037fa11bb64e1c7174829b |
|
01-Jul-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Implement load_const intrinsic Similar to fs_nir backend, a nir_local_values map will be filled with newly allocated registers as the load_const instrinsic instructions are processed. Later, get_nir_src() will fetch the registers from this map for sources that are ssa. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
6e58fc56a5a396020cd299db11895120ec3da520 |
|
03-Jul-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir: Dot not assign direct uniform locations first for vec4-based shaders In the vec4 backend we want uniform locations to be assigned consecutively since that way the offsets produced by nir_lower_io are exactly what we need to implement nir_intrinsic_load_uniform. Otherwise we would need a mapping to match the output of nir_lower_io to the actual uniform registers we need to use. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
01f6235020f9f0c2bc1a6e6ea9bd15c22fb2bcf5 |
|
18-Jun-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
nir/nir_lower_io: Add vec4 support The current implementation operates in scalar mode only, so add a vec4 mode where types are padded to vec4 sizes. This will be useful in the i965 driver for its vec4 nir backend (and possbly other drivers that have vec4-based shaders). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
5e839727ed2378a01d3b657bad83abd4728e8da6 |
|
22-Jul-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir: Pass a is_scalar boolean to brw_create_nir() The upcoming introduction of NIR->vec4 pass will require that some NIR lowering passes are enabled/disabled depending on the type of shader (scalar vs. vector). With this patch we pass a 'is_scalar' variable to the process of constructing the NIR, to let an external context decide how the shader should be handled. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
864907e2f14523c130e6ff24c081789bb079bae1 |
|
24-Jun-2015 |
Connor Abbott <cwabbott0@gmail.com> |
i965/fs: use SSA values directly Before, we would use registers, but set a magical "parent_instr" field to indicate that it was actually purely an SSA value (i.e., it wasn't involved in any phi nodes). Instead, just use SSA values directly, which lets us get rid of the hack and reduces memory usage since we're not allocating a nir_register for every value. It also makes our handling of load_const more consistent compared to the other instructions. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
2b1a1d8b1294f91b7ac563da1f395deba4384765 |
|
24-Jun-2015 |
Connor Abbott <cwabbott0@gmail.com> |
nir/from_ssa: add a flag to not convert everything from SSA We already don't convert constants out of SSA, and in our backend we'd like to have only one way of saying something is still in SSA. The one tricky part about this is that we may now leave some undef instructions around if they aren't part of a phi-web, so we have to be more careful about deleting them. v2: rename and flip meaning of flag (Jason) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
c8b8e8b29b755cd3d80fc5e470f441cb3716152a |
|
22-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Don't count NIR instructions for shader-db. Matt, Jason, and I haven't found this useful in a long time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
f4310cdbd08f20276237fbefa3eba406aa109636 |
|
10-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Re-index SSA definitions before printing NIR code. This makes the SSA definitions use sequential numbers (0, 1, 2, ...) instead of seemingly random ones. There's not much point normally, but it makes debug output much easier to read. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
58aed1031d40e62c9f41f7c512b3165dd5913d1e |
|
20-May-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
prog_to_nir: Use a variable for uniform data Previously, the prog_to_nir pass was directly generating uniform load/store intrinsics. This converts it to use a single giant "parameters" variable and we now depend on lowering to get the uniform load/store intrinsics. One advantage of this is that we now have one code-path after we do the initial conversion into NIR. No shader-db changes. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|
89c1feb78d010bc457f5d02be84c955eebf3549f |
|
08-Apr-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Create NIR during LinkShader() and ProgramStringNotify(). Previously, we translated into NIR and did all the optimizations and lowering as part of running fs_visitor. This meant that we did all of that work twice for fragment shaders - once for SIMD8, and again for SIMD16. We also had to redo it every time we hit a state based recompile. We now generate NIR once at link time. ARB programs don't have linking, so we instead generate it at ProgramStringNotify time. Mesa's fixed function vertex program handling doesn't bother to inform the driver about new programs at all (which is rather mean), so we generate NIR at the last minute, if it hasn't happened already. shader-db runs ~9.4% faster on my i7-5600U, with a release build. v2: Check NirOptions != NULL in ProgramStringNotify(). Don't bother using _mesa_program_enum_to_shader_stage as we already know it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_nir.c
|