4295af646fa7cf9b2cd8d0c2a481a7fc5eb43553 |
|
06-Jan-2017 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix texturing in the vec4 TCS and GS backends. We were failing to zero m0.2 of the sampler message header for TCS and GS messages in the simple case. fs_generator has done this for about a year now, but we missed it in vec4_generator. Fixes ES31-CTS.core.texture_cube_map_array.sampling, GL45-CTS.texture_cube_map_array.sampling, and many dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler subtests: - dynamically_uniform.tessellation_control.isampler3d - dynamically_uniform.tessellation_control.isamplercube - dynamically_uniform.tessellation_control.sampler2d - dynamically_uniform.tessellation_control.usamplercube - dynamically_uniform.tessellation_control.sampler2darray - dynamically_uniform.tessellation_control.isampler2darray - dynamically_uniform.tessellation_control.usampler3d - dynamically_uniform.tessellation_control.usampler2darray - dynamically_uniform.tessellation_control.usampler2d - dynamically_uniform.tessellation_control.sampler3d - dynamically_uniform.tessellation_control.samplercube - dynamically_uniform.tessellation_control.isampler2d - uniform.tessellation_control.isampler3d - uniform.tessellation_control.isamplercube - uniform.tessellation_control.usampler2d - uniform.tessellation_control.usampler3d - uniform.tessellation_control.sampler2darray - uniform.tessellation_control.isampler2darray - uniform.tessellation_control.usampler2darray - uniform.tessellation_control.sampler2d - uniform.tessellation_control.usamplercube - uniform.tessellation_control.sampler3d - uniform.tessellation_control.samplercube - uniform.tessellation_control.isampler2d Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e481dcc35eefdc9d9c8dc97370174405746a36d3 |
|
17-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: make the generator set correct NibCtrl for SIMD4 DF instructions From the HSW PRM, Command Reference, QtrCtrl: "NibCtrl is only allowed for SIMD4 instructions with a DF (Double Float) source or destination type." v2: Assert that the type is DF (Samuel) v3: Don't set the default group to 0 and then set it only for 4-wide instructions. Instead, assert that exec size and group are always a correct match and then always set the default group from the instruction. (Curro) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
486fd5422c09bbd9b951b3b7124f1a904ecff709 |
|
29-Aug-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: use the IR's execution size In the vec4 backend the generator sets to 8 the execution size for all instructions by default, however, to implement 64-bit floating-point we will need to split certain instruction into smaller sizes so we need the IR to convey this information like we do in the scalar backend. This patch uses the execution size from the vec4 IR. We will use this feature in a later patch when we implement a SIMD splitting pass. v2: - Drop the assertion on the execution size being 8 or 4 (Curro) - Use exec_size from backend_instruction (Curro) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
c722a8e61ebc72d7d21c2bed0f623218d739fdb7 |
|
17-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Rename DF to/from F generator opcodes The opcodes are not specific for conversions to/from float since we need the same for conversions to/from other 32-bit types. Rename the opcodes accordingly and change the asserts to check the size of the types involved instead. v2: - Rename to VEC4_OPCODE_TO_DOUBLE and VEC4_OPCODE_FROM_DOUBLE (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
54b998e0e488189307d2614fe56a3b78b442d316 |
|
17-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add VEC4_OPCODE_SET_{LOW,HIGH}_32BIT opcodes These opcodes will set the low/high 32-bit in each 64-bit data element using Align1 mode. We will use this to implement packDouble2x32. We use Align1 mode because in order to implement this in Align16 mode we would need to use 32-bit logical swizzles (XZ for low, YW for high), but the IR works in terms of 64-bit logical swizzles for DF operands all the way up to codegen. v2: - use suboffset() instead of get_element_ud() - no need to set the width on the dst Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6979e5a41241993b9e7bedea80f29fb43d96aa47 |
|
31-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add VEC4_OPCODE_PICK_{LOW,HIGH}_32BIT opcodes These opcodes will pick the low/high 32-bit in each 64-bit data element using Align1 mode. We will use this, for example, to do things like unpackDouble2x32. We use Align1 mode because in order to implement this in Align16 mode we would need to use 32-bit logical swizzles (XZ for low, YW for high), but the IR works in terms of 64-bit logical swizzles for DF operands all the way up to codegen. v2: - use suboffset() instead of get_element_ud() - no need to set the width on the dst Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
558f27953101c438747c3e9d3c3f98ce21e79007 |
|
14-Aug-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add double/float conversion pseudo-opcodes These need to be emitted as align1 MOV's, since they need to have a stride of 2 on the float register (whether src or dest) so that data from another thread doesn't cross the middle of a SIMD8 register. v2 (Iago): - The float-to-double needs to align 32-bit data to 64-bit before doing the conversion. This was doable in align16 when we tried to use an execsize of 4, but with an execsize of 8 we would need another align1 opcode to do that (since we need data to cross the middle of a SIMD register). Just making the opcode handle this internally seems more practical that adding another opcode just for this purpose and having the caller know about this before converting. - The double-to-float conversion produces 32-bit elements aligned to 64-bit so we make the opcode re-pack the result to 32-bit and fit in one register, as expected by SIMD4x2 operation. This still requires that callers reserve two registers for the float data destination because we need to produce 64-bit aligned data first, and repack it later on the same destination register, but it saves the need for a re-pack opcode only to achieve this making the operation complete in a single opcode. Hopefully that is worth the weirdness of the double register allocation... Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
88b5acfa09d4efa2aea1fc9cc4f8169a48c40286 |
|
23-Dec-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/generator/tex: Handle an immediate sampler with an indirect texture In this case we were dying when we tried to do SHL addr sampler imm(8) because that puts an immediate in src0 of a two source instruction. This fixes 2704 of the new separate sampler Vulkan CTS tests on Sky Lake. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
3c78d31374422b028b19afa5799689c404a5b73e |
|
23-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Let the caller of brw_set_dp_write/read_message control the target cache. brw_set_dp_read_message already had a target_cache argument, but its interpretation was rather convoluted (on Gen6 the render cache was used if the caller asked for it, otherwise it was ignored using the sampler cache instead), and the constant cache wasn't representable at all. brw_set_dp_write_message used the data cache on Gen7+ except for RENDER_TARGET_WRITE messages, in which case it would use the render cache. On Gen6 the render cache was always used. Instead of the above, provide the shared unit SFID that the caller expects will be used. Makes no functional changes. v3: Non-trivial rebase. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b |
|
13-Oct-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
c05a4f11a03dd5614a9462b5cb28e8b630bfddc0 |
|
16-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/ir: Pass identity mask to brw_find_live_channel() in the packed dispatch case. This avoids emitting a few extra instructions required to take the dispatch mask into account when it's known to be tightly packed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
8a468d186e6fc27c26dd12ba989192e7596f667a |
|
15-Sep-2016 |
Jason Ekstrand <jason@jlekstrand.net> |
i965/fs: Take Dispatch/Vector mask into account in FIND_LIVE_CHANNEL On at least Sky Lake, ce0 does not contain the full story as far as enabled channels goes. It is possible to have completely disabled channels where the corresponding bits in ce0 are 1. In order to get the correct execution mask, you have to mask off those channels which were disabled from the beginning by taking the AND of ce0 with either sr0.2 or sr0.3 depending on the shader stage. Failure to do so can result in FIND_LIVE_CHANNEL returning a completely dead channel. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Francisco Jerez <currojerez@riseup.net> [ Francisco Jerez: Fix a couple of typos, add mask register type assertion, clarify reason why ce0 can have bits set for disabled channels, clarify that this may only be a problem when thread dispatch doesn't pack channels tightly in the SIMD thread. Apply same treatment to Align16 path. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
527f37199929932300acc1688d8160e1f3b1d753 |
|
23-Aug-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
intel: s/brw_device_info/gen_device_info/ Generated by: sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
90eaf01616a8cf7a39dae63a3d5636874fa68fa5 |
|
30-Aug-2016 |
Matt Turner <mattst88@gmail.com> |
i965: Pass start_offset to brw_set_uip_jip(). Without this, we would pass over the instructions in the SIMD8 program (which is located earlier in the buffer) when brw_set_uip_jip() is called to handle the SIMD16 program. The assertion about compacted control flow was bogus: halt, cont, break cannot be compacted because they have both JIP and UIP. Instead, we should never see a compacted instruction in this code at all. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
9c9f45b82410646d2f7a8576d03de9916118bf07 |
|
26-Aug-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: remove the generator hack for dual instanced GS This hack was introduced in commit 03ac2c7223f7645e3: i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs Specifically to fixup the code we emitted to deal with gl_PointSize inputs in dual instance mode, where we were emitting a MOV to copy the point size from .w (where the hardware delivers it) to .x (because code will expect this to be a float). This meant that we were emitting a MOV to an ATTR destination that could have a width of 4 (in dual instanced mode) so it was necessary to fix the execution size and regioning of the instruction. Fortunately, Ken fixed this in 67c5d00273ca2: i965/vec4/gs: Stop munging the ATTR containing gl_PointSize. by using a WWWW swizzle instead of a MOV, and as the commit log in that patch states, we no longer emit instructions with ATTR destinations, so that makes the fixup code in the generator unnecessary. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
29eb8059fd7906d2595ea99bc65a27691b9fbe53 |
|
22-Jul-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/eu: Take into account the target cache argument in brw_set_dp_read_message. brw_set_dp_read_message() was setting the data cache as send message SFID on Gen7+ hardware, ignoring the target cache specified by the caller. Some of the callers were passing a bogus target cache value as argument relying on brw_set_dp_read_message not to take it into account. Fix them too. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
3ef31122d08fdf7e8e6a8d74a9d91006fe840f86 |
|
12-Aug-2016 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Print spills:fills. Allows shader-db to work on vec4 programs (has been broken since shader-db commit 646df5ca98b2 from April!) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
de20086eed47e6bfe7c25835d72383114f99c7a9 |
|
22-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Use LZD to implement nir_op_ufind_msb This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6e28976d35cf0a15c62bed1fd2ceeb734a3fc81e |
|
07-Jul-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965: enable the emission of the DIM instruction v2 (Matt): - Take a DF source argument for the DIM instruction emission in the visitors. - Indentation. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
81bc6de8c0f7faafd0f3b0aee944a14ba3ef0b64 |
|
19-May-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/ir: Make BROADCAST emit an unmasked single-channel move. Alternatively we could have extended the current semantics to 32-wide mode by changing brw_broadcast() to emit multiple indexed MOV instructions in the generator copying the selected value to all destination registers, but it seemed rather silly to waste EU cycles unnecessarily copying the exact same value 32 times in the GRF. The vstride change in the Align16 path is required to avoid assertions in validate_reg() since the change causes the execution size of the MOV and SEL instructions to be equal to the source region width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
59156b2e96315910f1e929c14c5b25ce88f75911 |
|
14-May-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix undefined df bits in brw_reg comparisons. Commit 5310bca024f77da40ea6f4c275455f9cb0528f9e added a new "double df" field to the brw_reg struct, adding an extra 4 bytes of data that isn't usually initialized (or may contain irrelevant garbage if the struct is mutated). This means that it's no longer safe to memcmp(). Instead, add a brw_regs_equal() function which ignores the extra df bits unless they matter. To keep the implementation cheap, we wrap the first set of fields in a union/struct so that we can use a single DWord comparison. v2: Drop unnecessary casts (caught by Francisco Jerez). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
aaac8a18904f44e93a2223c93727086358d6a655 |
|
24-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f60750968c66f7aa15181c4ba315bb594e615044 |
|
15-Mar-2016 |
Matt Turner <mattst88@gmail.com> |
i965/vec4/tcs: Set conditional mod on TCS_OPCODE_SRC0_010_IS_ZERO. Missing this causes an assertion failure in the scheduler with the next patch. Additionally, this gives cmod propagation enough information to optimize code better. total instructions in shared programs: 7112991 -> 7112852 (-0.00%) instructions in affected programs: 25704 -> 25565 (-0.54%) helped: 139 total cycles in shared programs: 64812898 -> 64810674 (-0.00%) cycles in affected programs: 127224 -> 125000 (-1.75%) helped: 139 Acked-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
22a10dd0308c4993350e3e0609588a6f4e1cd402 |
|
15-Dec-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/vec4/gen6: fix exec_size for MOV with a width of 4 in generate_gs_ff_sync() Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
b91b9e4b005858bad07eec1f92438a22468ac1ae |
|
04-Dec-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/vec4/gen6: fix exec_size for instructions with destination width of 4 Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
30fc3fa24d90c1ceda33ba95832e17c67584e2bc |
|
03-Dec-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/vec4/gen6: fix exec_size for instructions with width of 4 in generate_gs_svb_write() Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
d03e5d52557ce6523eb65dfec9172d6000f5ff8d |
|
03-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f88027f7bda781701c74bf71ebf89aa3b30b70d8 |
|
03-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Separate the sampler from the surface in generate_tex Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
830b075e86e3e9af1bf12316d0f9d888a85a973b |
|
05-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Explicitly write the "TR DS Cache Disable" bit at TCS EOT. Bit 0 of the Patch Header is "TR DS Cache Disable". Setting that bit disables the DS Cache for tessellator-output topologies resulting in stitch-transition regions (but leaves it enabled for other cases). We probably shouldn't leave this to chance - the URB could contain garbage - which could result in the cache randomly being turned on or off. This patch makes the final EOT write 0 to the first DWord (which only contains this one bit). This ensures the cache is always on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
138a7dc826baeb7451e748d47e508a639bad76c9 |
|
14-Jan-2016 |
Matt Turner <mattst88@gmail.com> |
i965: Drop extra newline from shader compile messages. Ilia changed shader-db's run.c to not expect messages to contain a newline in shader-db commit 51bbc8035.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
82ad571abf2fa2d85047451690f6a335f66d25fa |
|
08-Jan-2016 |
Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> |
glsl: Move _mesa_shader_stage_to_string/abbrev to shader_enums.c These are used by code that doesn't necessarily link to libglsl.la. Move them to shader_enums.[ch] where we keep similar helpers. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
bd8ab8dedb2cc557ea3cb58d507f237743b3f7f9 |
|
24-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Don't set interleave or complete on TCS EOT message. Setting interleave on the TCS EOT message causes Ivybridge hardware to GPU hang like crazy. Individual tests would pass, but running even a simple test like nop.shader_test in a loop would hang within 1-3 runs. Adding sleep delays worked around the problem, somehow. Interleave doesn't make much sense given that we only have one patch URB handle, not two. Complete doesn't seem useful either. There's no reason to actually set those bits. We were just being lazy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
b7793783b3df94880655234bc2a9054eddf01913 |
|
26-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Relase input URB Handles on Gen7/7.5 when TCS threads finish. Pre-Broadwell hardware requires us to manually release the ICP Handles by issuing URB read messages with the "Complete" bit set. We can do this in pairs to use fewer URB read messages. Based heavily on work from Chris Forbes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6ceabb72eae938570d9aa0ae054bab1df421d79a |
|
25-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use proper TCS barrier ID bits for Ivybridge/Baytrail. Gen7 uses bits 15:12 while Gen7+ uses bits 16:13. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
5898cbae2479874a6206e27e6b73a3ba244a2094 |
|
26-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use proper TCS Instance ID bits for Ivybridge/Baytrail. Gen7 uses 22:16 while Gen7.5+ uses 23:17. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
1245724f728915694ecb9c318a68107c01ccc808 |
|
17-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Port tessellation evaluation shaders to vec4 mode. This can be used on Broadwell by setting INTEL_SCALAR_TES=0. More importantly, it will be used for Ivybridge and Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
24be658d13b13fdb8a1977208038b4ba43bce4ac |
|
17-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add tessellation control shaders. The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e3e70698c3cfa7e9acccd6eddfb37516c45d5ac2 |
|
24-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use byte offsets for UBO pulls on Sandy Bridge Previously, the VS_OPCODE_PULL_CONSTANT_LOAD opcode operated on vec4-aligned byte offsets on Iron Lake and below and worked in terms of vec4 offsets on Sandy Bridge. On Ivy Bridge, we add a new *LOAD_GEN7 variant which works in terms of vec4s. We're about to change the GEN7 version to work in terms of bytes, so this is a nice unification. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
55ffa64daf765b1229364518106a4124bd84b9a7 |
|
23-Nov-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/gen9+: Switch thread scratch space to non-coherent stateless access. The thread scratch space is thread-local so using the full IA-coherent stateless surface index (255 since Gen8) is unnecessary and potentially expensive. On Gen8 and early steppings of Gen9 this is not a functional change because the kernel already sets bit 4 of HDC_CHICKEN0 which overrides all HDC memory access to be non-coherent in order to workaround a hardware bug. This happens to fix a full system hang when running any spilling code on a pre-production SKL GT4e machine I have on my desk (forcing all HDC access to non-coherent from the kernel up to stepping F0 might be a good idea though regardless of this patch), and improves performance of the OglPSBump2 SynMark benchmark run with INTEL_DEBUG=spill_fs by 33% (11 runs, 5% significance) on a production SKL GT2 (on which HDC IA-coherency is apparently functional so it wouldn't make sense to disable globally). Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f36993b46962eab4446bc1964eb47149751aee26 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
ecac1aab538d65f0867fd93e23d0d020c1a5d0f1 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Push down inclusion of brw_program.h. We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
2d8c5299032d229c8f6e936db5644cd53716e6c1 |
|
20-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Prevent implicit upcasts to brw_reg. Now that backend_reg inherits from brw_reg, we have to be careful to avoid the object slicing problem. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
94b1031703b1b5759436fe215323727cffce5f86 |
|
25-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Remove fixed_hw_reg field from backend_reg. Since backend_reg now inherits brw_reg, we can use it in place of the fixed_hw_reg field. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e42fb0c2a687cdcd6af2a590f6f5e24f64cfff3b |
|
23-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Make 'dw1' and 'bits' unnamed structures in brw_reg. Generated by sed -i -e 's/\.bits\././g' *.c *.h *.cpp sed -i -e 's/dw1\.//g' *.c *.h *.cpp and then reverting changes to comments in gen7_blorp.cpp and brw_fs_generator.cpp. There wasn't any utility offered by forcing the programmer to list these to access their fields. Removing them will reduce churn in future commits. This is C11 (and gcc has apparently supported it for sometime "compatibility with other compilers") See https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
0b45d47f71f57f685ce1a12a3dcd4fdb63c160b4 |
|
29-Jun-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Add initial assembly validation pass. Initially just checks that sources are non-NULL, which would have alerted us to the problem fixed by commit 6c846dc5. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
93e371c140cb1aa438ce3c1a9946811d92032897 |
|
29-Jun-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Set annotation_info's mem_ctx. It was being memset to 0 previously. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
eca4c43a33c5c1bb63c8aa9d0506ed2ba3f9d8cb |
|
30-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Do not mark used surfaces in VS_OPCODE_GET_BUFFER_SIZE Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6105d1d0a02c7eea83b327965713be3bada306f7 |
|
30-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Do not mark used direct surfaces in VS_OPCODE_PULL_CONSTANT_LOAD Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const, do not add unnecessary temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
4ef27745c8ed5153464db22950a90d74d2ef4435 |
|
09-Sep-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/vec4/skl+: Use ld2dms_w instead of ld2dms In order to support 16x MSAA, skl+ has a wider version of ld2dms that takes two parameters for the MCS data. The MCS data in the response still fits in a single register so we just need to ensure we copy both values rather than just the lower one. Acked-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
45cd76e342d1e8ecea38e2048b96cf5be3a30fab |
|
06-Jun-2015 |
Connor Abbott <cwabbott0@gmail.com> |
i965: dump scheduling cycle estimates The heuristic we're using is rather lame, since it assumes everything is non-uniform and loops execute 10 times, but it should be enough for measuring improvements in the scheduler that don't result in a change in the number of instructions. v2: - Switch loops and cycle counts to be compatible with older shader-db. - Make loop heuristic 10x to match with spilling code. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
ee46c1e6261584326e9153a22861a16778803506 |
|
29-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Test against BRW_IMMEDIATE_VALUE, not IMM. No functional change, since they were both 3, but BRW_IMMEDIATE_VALUE is the hardware value and IMM was the IR value -- and you can see that BRW_IMMEDIATE_VALUE was correctly used in the context of this patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
fa142773d9a7d3249396fe2547da24eaf58962c1 |
|
24-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Drop brw_set_default_* before popping insn state. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
11a7b6bbaaa8790f580ccdd99ac1798629df2041 |
|
24-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Remove unnecessary #includes from the generator. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
c6b24448b578c4a8ab031923df3ef1e2d865d5bb |
|
23-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Eliminate the vec4_generator class altogether. We really weren't taking advantage of vec4_generator being a class. By adding a "p" parameter to the helper methods, and "prog_data" to ones which need binding table information, we can convert everything to static functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
1a094a2ee2d63073ac12c8ab0dbd38c0e9270cf5 |
|
23-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Move vec4_generator class definition into the .cpp file. The public API for the generator is brw_vec4_generate_code(); nobody actually needs to use the class. This means we can extend it without triggering the recompiles associated with altering brw_vec4.h. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
4cba8f5d21e4b50343e7c7bfbeb603b59c5d71dd |
|
23-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Wrap vec4_generator in a C function. vec4_generator is a class for convenience, but only exports a single method as its public API. It makes much more sense to just export a single function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
73ff0ead3688519eb76ea8bc32eabb9004e6f37b |
|
23-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Convert src_reg/dst_reg to brw_reg at the end of the visitor. This patch makes the visitor convert registers to the HW_REG file at the very end, after register allocation, post-RA scheduling, and dependency control flagging. After that, everything is in fixed brw_regs. This simplifies the code generator, as it can just use the hardware registers rather than having to interpret our abstract files. In particular, interpreting the UNIFORM file meant reading prog_data to figure out where push constants are supposed to start. Having the part of the code that performs register allocation also translate everything to hardware registers seems sensible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
8f1d968704858d78d7e78a6b88db3ea2bc0cf749 |
|
06-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Remove gl_program and gl_shader_program from the generator Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e00314bc57a59b3f816daba6249e7b7157761f86 |
|
06-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/asm: Explicitly use a nir_instr for IR annotations Now that everything goes through NIR, we don't need this to be a void pointer anymore. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
08fe5799e61e9251dec163d000709ff33434216d |
|
25-Sep-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/gs: Allow src0 immediates in GS_OPCODE_SET_WRITE_OFFSET. GS_OPCODE_SET_WRITE_OFFSET is a MUL with a constant src[1] and special strides. We can easily make the generator handle constant src[0] arguments by instead generating a MOV with the product of both operands. This isn't necessarily a win in and of itself - instead of a MUL, we generate a MOV, which should be basically the same cost. However, we can probably avoid the earlier MOV to put src[0] into a register. shader-db statistics for geometry shaders only: total instructions in shared programs: 3207 -> 3173 (-1.06%) instructions in affected programs: 3207 -> 3173 (-1.06%) helped: 11 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
bcef2abad7cf255b6ac112b9ebf0ff75e491c968 |
|
25-Sep-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move GS_THREAD_END mlen calculations out of the generator. The visitor was setting a mlen that was wrong for Broadwell, but the generator was ignoring it and doing the right thing regardless. We may as well move the logic fully into the visitor. This will be useful in the next commit as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6485880232df46c0cdded0b063b8841a7855bd32 |
|
28-Aug-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/vec4: Implement VS_OPCODE_GET_BUFFER_SIZE Notice that Skylake needs to include a header in the sampler message so it will need some tweaks to work there. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f50645d05c6dffa6463856ded0b8461ac9d24535 |
|
15-Sep-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation There are some bug reports about shaders failing to compile in gen6 because MRF 14 is used when we need to spill. For example: https://bugs.freedesktop.org/show_bug.cgi?id=86469 https://bugs.freedesktop.org/show_bug.cgi?id=90631 Discussion in bugzilla pointed to the fact that gen6 might actually have 24 MRF registers available instead of 16, so we could use other MRF registers and avoid these conflicts (we still need to investigate why some shaders need up to MRF 14 anyway, since this is not expected). Notice that the hardware docs are not clear about this fact: SNB PRM Vol4 Part2's "Table 5-4. MRF Registers Available in Device Hardware" says "Number per Thread" - "24 registers" However, SNB PRM Vol4 Part1, 1.6.1 Message Register File (MRF) says: "Normal threads should construct their messages in m1..m15. (...) Regardless of actual hardware implementation, the thread should not assume th at MRF addresses above m15 wrap to legal MRF registers." Therefore experimentation was necessary to evaluate if we had these extra MRF registers available or not. This was tested in gen6 using MRF registers 21..23 for spilling and doing a full piglit run (all.py) forcing spilling of everything on the FS backend. It was also tested by doing spilling of everything on both the FS and the VS backends with a piglit run of shader.py. In both cases no regressions were observed. In fact, many of these tests where helped in the cases where we forced spilling, since that triggered the same underlying problem described in the bug reports. Here are some results using INTEL_DEBUG=spill_fs,spill_vec4 for a shader.py run on gen6 hardware: Using MRFs 13..15 for spilling: crash: 2, fail: 113, pass: 6621, skip: 5461 Using MRFs 21..23 for spilling: crash: 2, fail: 12, pass: 6722, skip: 5461 This patch sets the ground for later patches to implement spilling using MRF registers 21..23 in gen6. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
085861083638ec782c17d3aa72ab46f1a0099935 |
|
16-Sep-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Move MRF register asserts out of brw_reg.h In a later patch we will make BRW_MAX_MRF return a different value depending on the hardware generation, but it is inconvenient to add a gen parameter to the brw_reg functions only for the assertions, so move these to places where we have the hardware generation available. Ken suggested to add the asserts to brw_set_src0 and brw_set_dest since that would make sure that we catch all uses of MRF registers, even those coming from modules that generate native code directly, like blorp. Unfortunately, this is very late in the process which can make things harder to debug, so add asserts to the generator as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
d48ac93066190077510d635e71631b6574261d08 |
|
18-Sep-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Maximum allowed size of SEND messages is 15 (4 bits) Until now we only used MRFs 1..15 for regular SEND messages, so the message length could not possibly exceed the maximum size. Soon we'll allow to use MRF registers 1..23 in gen6, so we need to be careful not to build messages that can go beyond the limit. That could occur, specifically, when building URB write messages, which we may need to split in chunks due to their size. Previously we would simply go and create a new message when we reached MRF 13 (since 13..15 were reserved for spilling), now we also want to check the size of the message explicitly. Besides adding that condition to split URB write messages properly, this patch also adds asserts in the generator. Notice that brw_inst_set_mlen already asserts for this, but asserting in the generators is easy and can make debugging easier in some cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
0b91bcea98c0fe201bba89abe1ca3aee4d04c56c |
|
12-Aug-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
i965: add support for textureSamples function Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [v2: kayden-supplied code in fs_nir replacing need for logical opcode] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
2484263fe97cebc9fa7a5c9de04c757dc6cc7713 |
|
29-Jul-2015 |
Anuj Phogat <anuj.phogat@gmail.com> |
Delete duplicate function is_power_of_two() and use _mesa_is_pow_two() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
3af2623da5167aa686bcb2cff01d27058a507026 |
|
20-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Lift the constness restriction on surface indices passed to untyped ops. v2: Update NIR atomic intrinsic handling too (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
d7565b7d65f8203c20735a61b86e9158b8ec4447 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Remove the dependance on brw_context from the generators Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e639a6f68e701f23b977a49c45d646c164991d36 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Plumb compiler debug logging through a function pointer in brw_compiler v2 (Ken): Make shader_debug_log a printf-like function. v3 (Jason): Add a void * to pass the brw_context through Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
c753866cc4ae15313430f9b6edba1b82e44b003a |
|
28-May-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/vec4: Fix the source register for indexed samplers Previously when setting up the sample instruction for an indirect sampler the vec4 backend was directly passing the pseudo opcode's src0. However vec4_visitor::visit(ir_texture *) doesn't set the texture operation's src0 -- it's left as BAD_FILE, which when translated into a brw_reg gives the null register. In brw_SAMPLE, gen6_resolve_implied_move() inserts a MOV from the inst->base_mrf and sets the src0 appropriately. The indirect sampler case did not have a call to gen6_resolve_implied_move(). The fs backend avoids this because the platforms that support dynamic indexing of samplers (IVB+) have been converted to not use the fake-MRF hack, and instead send from proper GRFs. This patch makes it call gen6_resolve_implied_move before setting up the indirect message. This is similar to what is done for constant sampler numbers in brw_SAMPLE. The Piglit tests for sampler array indexing didn't pick this up because they were using a texture with a solid colour so it didn't matter what texture coordinates were actually used. The tests have now been changed to be more thorough in this commit: http://cgit.freedesktop.org/piglit/commit/?id=4f9caf084eda7 With that patch the tests for gs and vs are currently failing on Ivybridge, but this patch fixes them. There are no other changes to a Piglit run on Ivybridge. On Skylake the gs tests were failing even without the Piglit patch because Skylake needs the source registers to work correctly in order to send a message header to select SIMD4x2 mode. (The explanation in the commit message is partially written by Matt Turner) Tested-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
d46d04529b9c1e55b4c3b65a7078bbbd7ab1a810 |
|
03-Jun-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Use UW-typed immediate in multiply inst. Some hardware reads only the low 16-bits even if the type is UD, but other hardware like Cherryview can't handle this. Fixes spec@arb_gpu_shader5@execution@sampler_array_indexing@fs-simple on Cherryview. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90830 Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
7f62fdae1629d75dd581d1c57b28c2f099c5ef6b |
|
29-May-2015 |
Neil Roberts <neil@linux.intel.com> |
i965: Don't add base_binding_table_index if it's zero When calculating the binding table index for non-constant sampler array indexing it needs to add the base binding table index which is a constant within the generated code. Often this base is zero so we can avoid a redundant instruction in that case. It looks like nothing in shader-db is doing non-constant sampler array indexing so this patch doesn't make any difference but it might be worth having anyway. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6c846dc57b1d6f3e015a604dba1976f96c4be9e9 |
|
28-May-2015 |
Neil Roberts <neil@linux.intel.com> |
i965: Don't use a temporary when generating an indirect sample Previously when generating the send instruction for a sample instruction with an indirect sampler it would use the destination register as a temporary store. This breaks when used in combination with the opt_sampler_eot optimisation because that forces the destination to be null. This patch fixes that by avoiding the temp register altogether. The reason the temporary register was needed was because it was trying to ensure the binding table index doesn't overflow a byte by and'ing it with 0xff. The result is then or'd with samper_index<<8. This patch instead just and's the whole thing by 0xfff. This will ensure that a bogus sampler index won't overflow into the rest of the message descriptor but unlike the previous code it won't ensure that the binding table index doesn't overflow into the sampler index. It doesn't seem like that should matter very much though because if the shader is generating a bogus sampler index then it's going to just get garbage out either way. Instead of doing sampler_index<<8|(sampler_index+base_table_index) the new code avoids one operation by doing sampler_index*0x101+base_table_index which should be equivalent. However if we wanted to avoid the multiply for some reason we could do this by adding an extra or instruction still without needing the temporary register. This fixes a number of Piglit tests on Skylake that were using indirect samplers such as: spec@arb_gpu_shader5@execution@sampler_array_indexing@fs-simple Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
76c1086f2dfb37a1edf6d2df6eebbe11ccbfc50b |
|
24-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Change header_present to header_size in backend_instruction Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
715bc6d8b16a0bbdc17fe1e1e46b88a679bf312b |
|
23-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Introduce the FIND_LIVE_CHANNEL pseudo-opcode. This instruction calculates the index of an arbitrary channel enabled in the current execution mask. It's expected to be used as input for the BROADCAST opcode, but it's implemented as a separate instruction rather than being baked into BROADCAST because FIND_LIVE_CHANNEL has no dependencies so it can always be CSE'ed with other instances of the same instruction within a basic block. v2: Whitespace fixes. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
c74511f5dc239eefb8604294c6c1e57b3a394111 |
|
20-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Introduce the BROADCAST pseudo-opcode. The BROADCAST instruction picks the channel from its first source given by an index passed in as second source. This will be used in situations where all channels from the same SIMD thread have to agree on the value of something, e.g. a surface binding table index. This is in particular the case for UBO, sampler and image arrays, which can be indexed dynamically with the restriction that all active SIMD channels access the same index, provided to the shared unit as part of a single scalar field of the message descriptor. Simply taking the index value from the first channel as we were doing until now is incorrect, because it might contain an uninitialized value if the channel had previously been disabled by non-uniform control flow. v2: Minor style fixes. Improve commit message. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f1d1d17db6bdeac0519652aa7432048507154a28 |
|
23-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Add memory fence opcode. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f118e5d15fd9b35cf27a975a702c5fb81d3157aa |
|
23-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Add typed surface access opcodes. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
0775d8835ac8d1f2ab75d04f0cddbad36b6787fe |
|
23-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Add untyped surface write opcode. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
c97a7705ea61f0d1e78bcfe91c0c8e05c79b45cb |
|
19-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Reorder sources of the untyped atomic opcode. This is consistent with the untyped surface read opcode. From now on all typed and untyped surface access opcodes will follow the same pattern: src[0] will be the message payload, src[1] will be the surface index and src[2] will be a control immediate (atomic operation for atomic opcodes and number of vector components for surface read and write opcodes). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
ac747ca5f72332b1ff97041cc808be551596a26f |
|
19-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Pass the number of components as a source of the untyped surface read opcode. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
20915130ace4cc0f700ece2a99c0353581a156bb |
|
26-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Add support for untyped surface message sends from GRF. This doesn't actually enable untyped surface message sends from GRF yet, the upcoming atomic counter and image intrinsic lowering code will. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
8865fe309da2597398071f5171808c27aac787b4 |
|
26-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Don't request untyped atomic writeback message if the destination is null. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
0519a6259b0e6b83eaeafdf0ed30a67713c4b6ed |
|
22-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Simplify generator code for untyped surface messages. The generate_untyped_*() methods do nothing useful other than calling the corresponding function from brw_eu_emit.c. The calls to brw_mark_surface_used() will go away too in a future commit. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
2f1c16df3e997771bcedb60ae7f16a21c4c60144 |
|
23-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Fix the untyped surface opcodes to deal with indirect surface access. Change brw_untyped_atomic() and brw_untyped_surface_read() to take the surface index as a register instead of a constant and to use brw_send_indirect_message() to emit the indirect variant of send with a dynamically calculated message descriptor. This will be required to support variable indexing of image arrays for ARB_shader_image_load_store. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
be119e80c9414aaf5101809c44ad4bf64e8676bf |
|
23-Apr-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/skl: Force the exec size to 8 when initing header for SIMD4x2 On Gen9+ there needs to be a header when sampling using SIMD4x2. The header is set up by copying from the g0 register. Commit 07c571a39f tried to fix this mov instruction to always use an exec size of 8 because previously it was incorrectly using 4. It did this by casting the type of the destination register to vec8. This was done because there is code in brw_set_dest to guess the exec size based on the width of the dest register. However I misunderstood how this works because it is actually only used when the width is less than 8. That means the patch actually changed it to use the default exec size which on SIMD16 would be 16 and the MOV would clobber over the first register in the send message. This patch makes it additionally set the default exec size to 8. This is similar to how the message is set up in fs_generator::generate_tex. I think this wasn't picked up by any Piglit tests because we don't have any fragment shaders that hit this code path so nothing was using SIMD16. However the patch caused failures in deqp tests. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90153 Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Tapani Pälli <tapani.palli@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
a85c4c9b3f75cac9ab133caa91a40eec2e4816ae |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Rename brw_compile to brw_codegen This name better matches what it's actually used for. The patch was generated with the following command: for file in *; do sed -i -e s/brw_compile/brw_codegen/g $file done Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
2bf207b47347ec1c672448e3019029f899a5d3b5 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Add a devinfo field to the generator and use it for gen checks Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
9c89e47806ee0437a2617eb4b90a0b953869fea2 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Make the annotation code take a device_info instead of a context Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
61c4702489fa1694892c5ce90ccf65a5094df3e7 |
|
15-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Remove the context field from brw_compiler Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
4e9c79c847c81701300b5b0d97d85dcfad32239a |
|
15-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Make the brw_inst helpers take a device_info instead of a context Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
33f73e93ff6e14f72153d3df7e80763137fcb943 |
|
24-Mar-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/skl: Add the header for constant loads outside of the generator Commit 5a06ee738 added a step to the generator to set up the message header when generating the VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 instruction. That pseudo opcode is implemented in terms of multiple actual opcodes, one of which writes to one of the source registers in order to set up the message header. This causes problems because the scheduler isn't aware that the source register is written to and it can end up reorganising the instructions incorrectly such that the write to the source register overwrites a needed value from a previous instruction. This problem was presenting itself as a rendering error in the weapon in Enemy Territory: Quake Wars. Since commit 588859e1 there is an additional problem that the double register allocated to include the message header would end up being split into two. This wasn't happening previously because the code to split registers was explicitly avoided for instructions that are sending from the GRF. This patch fixes both problems by splitting the code to set up the message header into a new pseudo opcode so that it will be done outside of the generator. This new opcode has the header register as a destination so the scheduler can recognise that the register is written to. This has the additional benefit that the scheduler can optimise the message header slightly better by moving the mov instructions further away from the send instructions. On Skylake it appears to fix the following three Piglit tests without causing any regressions: gs-float-array-variable-index gs-mat3x4-row-major gs-mat4x3-row-major I think we actually may need to do something similar for the fs backend and possibly for message headers from regular texture sampling but I'm not entirely sure. v2: Make sure the exec-size is retained as 8 for the mov instruction to initialise the header from g0. This was accidentally lost during a rebase on top of 07c571a39fa1. Split the patch into two so that the helper function is a separate change. Fix emitting the MOV instruction on Gen7. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89058 Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
07c571a39fa12c3db1c638302de7aed67844609b |
|
10-Apr-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/skl: Use an exec size of 8 to initialise the message header Commit e93566a15c61c33faa changed the message header code needed to make Skylake use SIMD4x2 so that it uses a register with width 4 instead of 8 as the source register in the send message. However it also changed the width for the dest in the MOV instruction which is used to initialise the header register with the values from g0. The width of the destination is used to determine the exec size in brw_set_dest so this would end up making the MOV have an exec size of 4. I think this would end up leaving the top half of the register uninitialised. The top half of the header has meaningful values so this probably isn't a good idea. This patch just casts the dest register for the MOV instruction back to a vec8 to fix it. It doesn't cause any changes to a Piglit run. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
959d16e38e007b29349d7371fb390a5449c88341 |
|
25-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Pass number of components explicitly to brw_untyped_atomic and _surface_read. And calculate the message response size based on the number of components rather than the other way around. This simplifies their interface somewhat and allows the caller to request a writeback message with more than one vector component in SIMD4x2 mode. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
a815cd8449c207956176020e752cd0051ed842ec |
|
26-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Don't disable exec masking for sampler message sends. This was telling the sampler to do texture fetches for *all* channels in the non-constant surface index case, what could have reduced throughput unnecessarily when some of the channels were disabled by control flow. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
a902a5d6ba921ab006496aeecab0f68bca7ffb09 |
|
19-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Factor out logic to build a send message instruction with indirect descriptor. This is going to be useful because the Gen7+ uniform and varying pull constant, texturing, typed and untyped surface read, write, and atomic generation code on the vec4 and fs back-end all require the same logic to handle conditionally indirect surface indices. In pseudocode: | if (surface.file == BRW_IMMEDIATE_VALUE) { | inst = brw_SEND(p, dst, payload); | set_descriptor_control_bits(inst, surface, ...); | } else { | inst = brw_OR(p, addr, surface, 0); | set_descriptor_control_bits(inst, ...); | inst = brw_SEND(p, dst, payload); | set_indirect_send_descriptor(inst, addr); | } This patch abstracts out this frequently recurring pattern so we can now write: | inst = brw_send_indirect_message(p, sfid, dst, payload, surface) | set_descriptor_control_bits(inst, ...); without worrying about handling the immediate and indirect surface index cases explicitly. v2: Rebase. Improve documentatation and commit message. (Topi) Preserve UW destination type cargo-cult. (Topi, Ken, Matt) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
5a06ee7384934f8b5177b2f01bb7dff08b370145 |
|
12-Mar-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/skl: Send a message header when doing constant loads SIMD4x2 Commit 0ac4c272755c7 made it add a header for the send message when using SIMD4x2 on Skylake because without this it will end up using SIMD8D. However the patch missed the case when a sampler is being used to implement constant loads from a buffer surface in a SIMD4x2 vertex shader. This fixes 29 Piglit tests, mostly related to the ARL instruction in vertex programs. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
65f9b83e05d790ddb3846b7a0e2b02241c61ab02 |
|
23-Nov-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965: Add missing defines for render cache messages. And remove duplicated definition of OWORD_DUAL_BLOCK_WRITE. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e0137fd6f720e4977466b1760ac02a72c5abceb8 |
|
12-Feb-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add and use byte-MOV instruction for unpack 4x8. Previously we were using a B/UB source in an Align16 instruction, which is illegal. It for some reason works on all platforms, except Broadwell. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86811 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
4470bf1f494ce313bda4f1627c775569d886f93f |
|
11-Feb-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
i965/vec4: Silence unused parameter warnings brw_vec4_copy_propagation.cpp:243:59: warning: unused parameter 'reg' [-Wunused-parameter] int arg, struct copy_entry *entry, int reg) ^ brw_vec4_generator.cpp:869:57: warning: unused parameter 'inst' [-Wunused-parameter] vec4_generator::generate_unpack_flags(vec4_instruction *inst, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
dfe957c02b753dbb5b372e768a5677f577daf9ef |
|
06-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Move up fs_inst::flag_subreg to backend_instruction. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
7de8a3e13efe1c3eede531737f6780d388152355 |
|
22-Jan-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/emit: Do the sampler index adjustment directly in header.0.3 Prior to this commit, the adjust_sampler_state_pointer function took an extra register that it could use as scratch space. The usual candidate was the destination of the sampler instruction. However, if that register ever aliased anything important such as the sampler index, this would scratch all over important data. Fortunately, the calculation is such that we can just do it in place and we don't need the scratch space at all. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6de077f01d3439c80c9392455d6ca7e7f4493632 |
|
22-Jan-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Fix fprintf argument ordering. Introduced in commit 3167a80b.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
3167a80bb1119616b70fbbcf2661d3fb511a6034 |
|
13-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix "vertex" vs. "geometry" and "VS" vs. "GS" in debug output. We were happily printing "Native code for unnamed vertex shader" and "VS vec4" program for geometry shaders in our INTEL_DEBUG=gs output, as well as the KHR_debug output used by shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
0ac4c272755c75108a10a84ce33bf6a6234985d3 |
|
10-Dec-2014 |
Kristian Høgsberg <krh@bitplanet.net> |
i965/skl: Always use a header for SIMD4x2 sampler messages SKL+ overloads the SIMD4x2 SIMD mode to mean either SIMD8D or SIMD4x2 depending on bit 22 in the message header. If the bit is 0 or there is no header we get SIMD8D. We always wand SIMD4x2 in vec4 and for fs pull constants, so use a message header in those cases and set bit 22 there. Based on an initial patch from Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
bf2307937995212895375d1e258d50207da3d24e |
|
25-Nov-2014 |
Kristian Høgsberg <krh@bitplanet.net> |
i965: Rename brw_vec4_prog_data/key to brw_bue_prog_data/key These structs aren't vec4 specific, they are shared by shader stages operating on Vertex URB Entries (VUEs). VUEs are the data structures in the URB that hold vertex data between the pipeline geometry stages. Using vue in the name instead of vec4 makes a lot more sense, especially when we add scalar vertex shader support. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
9ed8d00ab546c8d3eadbefa5a6c553cbf9ebebeb |
|
14-Nov-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Give compile stats through KHR_debug. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f13870db09d7a10141b5ffc24058bb2abceaa035 |
|
05-Dec-2014 |
Ben Widawsky <benjamin.widawsky@intel.com> |
i965/gs: Avoid DW * DW mul The GS has an interesting use for mul. Because the GS can emit multiple vertices per input vertex, and it also has a unique count at the top of the URB payload, the GS unit needs to be able to dynamically specify URB write offsets (relative to the global offset). The documentation in the function has a very good explanation from Paul on the mechanics. This fixes around 2000 piglit tests on BSW. v2: Reworded commit message (Ben) no mention of CHV (Matt) Change SHRT_MAX to USHRT_MAX (Ken, and Matt) Update comment in code to reflect the use of UW (Ben) Add Gen7+ assertion for the relevant GS code, since it won't work on Gen6- (Ken) Drop the bogus hunk in emit_control_data_bits() (Ken) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes) Cc: "10.4 10.3 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
0d3cc01b0b092271938ce2cf2b77d27dc385e4d8 |
|
24-Oct-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Allow CSE on uniform-vec4 expansion MOVs. Three source instructions cannot directly source a packed vec4 (<0,4,1> regioning) like vec4 uniforms, so we emit a MOV that expands the vec4 to both halves of a register. If these uniform values are used by multiple three-source instructions, we'll emit multiple expansion moves, which we cannot combine in CSE (because CSE emits moves itself). So emit a virtual instruction that we can CSE. Sometimes we demote a uniform to to a pull constant after emitting an expansion move for it. In that case, recognize in opt_algebraic that if the .file of the new instruction is GRF then it's just a real move that we can copy propagate and such. total instructions in shared programs: 5822418 -> 5812335 (-0.17%) instructions in affected programs: 351841 -> 341758 (-2.87%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
531feec9dc4680046f21c517d13312c7df7b7619 |
|
16-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Handle destination writemasks in VEC4_OPCODE_PACK_BYTES. Since pack_bytes expands to two mov(4) align1 instructions, we can't use swizzles directly. For an instruction like pack_bytes m4.y:UD, vgrf13.xyzw:UD we can write into the .y component by settings the offset based on the swizzle. Also while we're doing this, we can set the dependency control hints properly, so that a series of pack_bytes writing into separate components of a register can issue without blocking.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e14c7c7faff3c204a5eefc1f2ea487d4730b8382 |
|
10-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add VEC4_OPCODE_PACK_4_BYTES. Will be used by emit_pack_{s,u}norm_4x8().
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
cb0ba848d4176c1ed2c4542fd5875867f460fc3b |
|
09-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add vector float immediate infrastructure. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
56ac25918aea37a3c2c52f99fc285f2475be9128 |
|
21-Nov-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Don't overwrite the math function with conditional mod. Ben was asking about the undocumented restriction that the math instruction cannot use the dependency control hints. I went to reconfirm and disabled the is_math() check in opt_set_dependency_control() and saw that the disassembled math instructions with dependency hints had a bogus math function. We were mistakenly overwriting it by setting an empty conditional mod. Unfortunately, this wasn't the cause of the aforementioned problem (I reproduced it). This bug is benign, since we don't set dependeny hints on math instructions -- but maybe some day. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f5bef2d2e53fc43cbdf15914e8886fc51f77b546 |
|
21-Nov-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Assert that math instructions don't have conditional mod. The math function field is at the same location as conditional mod. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
bd502139290ea902cbc4b5f535c102f8f98774b1 |
|
12-Nov-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Combine offset/texture_offset fields. texture_offset was only used by some texturing operations, and offset was only used by spill/unspill and some URB operations. These fields are never used at the same time. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
66ab9c22fecc053f099376c7b20958e0ffdf05ca |
|
30-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use BRW_MATH_DATA_SCALAR when source regioning is scalar. Notice the mistaken (but harmless) argument swapping in brw_math_invert(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
1f77bfce7debe34366942ec441eda38747a47f74 |
|
23-Jul-2014 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/gen6/gs: Add an additional parameter to the FF_SYNC opcode. We will use this parameter in later patches to provide information relevant to transform feedback that needs to be set as part of the FF_SYNC message. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
3ea410972a9954babdcb6a0b1d4e5bc6f1ff61d2 |
|
23-Jul-2014 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/gen6/gs: implement GS_OPCODE_FF_SYNC_SET_PRIMITIVES opcode This opcode will be used when filling FF_SYNC header before emitting vertices and their data. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
5933a08bd98d053a4cc5797d901bb399a8a5b470 |
|
18-Jul-2014 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/gen6/gs: implement GS_OPCODE_SVB_SET_DST_INDEX opcode This opcode generates code to copy the specified destination index into subregister 5 of the MRF message header. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e86ae1b0a32bbbe4fb02ae9cee5b447a75d7e27f |
|
18-Jul-2014 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/gen6/gs: implement GS_OPCODE_SVB_WRITE opcode This opcode will be used when sending SVB WRITE messages to save transform feedback outputs into Streamed Vertex Buffers. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
024b7c0f33a6e8f59d8b3d9dd9f72d671f426890 |
|
24-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_SET_PRIMITIVE_ID. In gen6 the geometry shader payload includes the PrimitiveID information in r0.1. When the shader code uses glPimitiveIdIn we will have to move this to a separate hardware register where we can map this attribute. This opcode takes the selected destination register and moves r0.1 there. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f373b7ed820024080838742f419bbca5fcbde2bf |
|
17-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_SET_DWORD_2. We had GS_OPCODE_SET_DWORD_2_IMMED but this required its source argument to be an immediate. In gen6 we need to set dword 2 of the URB write message header from values stored in separate register, so we need something more flexible. This change replaces GS_OPCODE_SET_DWORD_2_IMMED with GS_OPCODE_SET_DWORD_2. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
621685ad4c747cc67e1b6c7ba95fa59774196a54 |
|
09-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Add instruction URB flags to geometry shaders EOT message. Gen6 seems to require that EOT messages include the complete flag too or else the GPU hangs. We add will this flag to the instruction when we emit the thread end opcode. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
2c85132e511bbef9a0965c69848981b1bffb5bad |
|
09-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_URB_WRITE_ALLOCATE. Gen6 geometry shaders need to allocate URB handles for each new vertex they emit after the first (the URB handle for the first vertex is obtained via the FF_SYNC message). This opcode adds the URB allocation mechanism to regular URB writes. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
d0bdd4ce983ddd52f9f4b70dced4e471c60a130c |
|
09-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_FF_SYNC. This implements the FF_SYNC message required in gen6 geometry shaders to get the initial URB handle. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f3401451070f1b38cc8ed17f486923f03eaeb828 |
|
06-Aug-2014 |
Abdiel Janulgue <abdiel.janulgue@linux.intel.com> |
i965/vec4/fs: Count loops in shader debug Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
ddc1d297bcb219eea176f72d48d15fe5e3333c99 |
|
29-Aug-2014 |
Abdiel Janulgue <abdiel.janulgue@linux.intel.com> |
i965/vec4: inline generate_vec4_instruction() within generate_code() Suggested by Matt. This patch combines and moves back the code-generation functions from generate_vec4_instruction() into generate_code(). Makes generate_code() a bit larger, but helps us to count loops in a straightforward manner. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
a3d0ccb037082f3aa66bd558dfbe89f63a6eedd3 |
|
12-Jul-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Pass a cfg pointer to generate_{code,assembly}. The loop over all instructions is now two-fold, over all of the blocks and all of the instructions in each block. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
301b71557b2f24f7f59402f634cd531d0adb3349 |
|
10-Aug-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4: Add support for non-const sampler indices in generator Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f7146d1a946003bfbb6bc9fc6462a4c827cd93ba |
|
10-Aug-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4: Refactor generate_tex in prep for non-const samplers Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
8ce3fa8e91e96adac9ba909876d3b3066bdcd723 |
|
10-Aug-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: Extract helper function for surface state pointer adjustment Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
a55eae9b6d822ab1d5e61b400426b9350e152cc4 |
|
13-Jul-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4: Generate indirect sends for nonconstant UBO array access Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f7e9756201bb97d845bb4a73ed71efacfbc24c87 |
|
11-Aug-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Drop gen <= 7 assertion in pull constant load handling. I don't see any reason for this to exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
849046b8429f690fcc9eb7c31e193b467dd97e1a |
|
12-Aug-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Port Gen8 SET_VERTEX_COUNT handling to vec4_generator. Broadwell requires the number of vertices written by the geometry shader to be specified in a separate register, as part of the terminating message's payload. This also means GS_OPCODE_THREAD_END needs to increment mlen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6be68767b9b5344d5753b8909f5ec8f57309b71a |
|
04-Aug-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4/Gen4-7: Use src1 for sampler_index instead of ->sampler field Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e64dbd050d6d5b4ea502ee2fc727e12135833771 |
|
04-Aug-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/eu: Merge brw_CONT and gen6_CONT. The only difference is setting PopCount on Gen4-5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
b8c2538e17cd3e0a2fa8f6f80f76eee4a293a90a |
|
27-Jul-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Replace sizeof(struct gen7_sampler_state) with the size itself. These are the last users of struct gen7_sampler_state. v2: Use a local sampler_state_size variable, to help distinguish the various 16s (suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
53992a102ffddf2e0fad401252cfc1c034d022ad |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use immediate storage in brw_reg for visitor regs. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
c0f1929dd23bbc558e9eef0f8fd40e10dfef3c21 |
|
19-May-2014 |
Eric Anholt <eric@anholt.net> |
i965: Move dispatch_grf_start_reg and first_curbe_grf into stage_prog_data. I wanted to access this value from stage-generic code, so stop storing it under two different names. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
3d826729dabab53896cdbb1f453c76fab1c7e696 |
|
29-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use unreachable() instead of unconditional assert(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e65844023492c1ebc12c3fd299fe614164fe32a2 |
|
29-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Remove useless conditionals. Setting a couple of bits is the same cost or less as conditionally setting a couple of bits.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
c5030ac0ac15d3c91c4352789f94281da9a9dcad |
|
25-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use typed foreach_in_list instead of foreach_list. Acked-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e59a9ecc98a9715307ae42f8e267b2f09129d690 |
|
29-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/disasm: Stop using gen8_disassemble in favor of brw_disassemble. At this point, brw_disassemble can do everything gen8_disassemble can do - and, thanks to the new brw_inst API, it supports all generations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
7c79608b5b8a7eb4bed9fa9d594c9bda696dd49a |
|
13-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Replace 'struct brw_instruction' with 'brw_inst'. Use this an an opportunity to clean up the formatting of some old code (brw_ADD, for instance). Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
5e6818faa57b8572478a9993db3367d51a3af10c |
|
08-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Extend is_haswell checks to gen >= 8 in Gen4-7 generators. We're going to use fs_generator/vec4_generator for Gen8+ code soon, thanks to the new brw_instruction API. When we do, we'll generally want to take the Haswell paths on Gen8+ as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
4362631d7b787837210c30ba0d89e1a034c57af8 |
|
08-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Convert vec4_generator to the new brw_inst API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f3ddd71f2878e42d2c9e927bd5f695a62b357c58 |
|
07-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Combine generate_math[12]_gen6 methods. These are trivial to combine: we should just avoid checking the second operand if it's brw_null_reg. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
5260a26e927df2bda7059b170c007a03da65b03b |
|
07-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Drop the generate_math2_gen7() method. It's now a single line of code, so we may as well fold it into the caller. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
b003fc265fc672b35d15ce7c2d05e8b9c81c4ee9 |
|
07-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rename brw_math to gen4_math. Usually, I try to use "brw" for functions that apply to all generations, and "gen4" for dead end/legacy code that is only used on Gen4-5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
de65ec2fdeb3a22d408db24535d738b39cc3402c |
|
07-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Split Gen4-5 and Gen6+ MATH instruction emitters. Our existing functions, brw_math and brw_math2, had unclear roles: Gen4-5 used brw_math for both unary and binary math functions; it never used brw_math2. Since operands are already in message registers, this is reasonable. Gen6+ used brw_math for unary math functions, and brw_math2 for binary math functions, duplicating a lot of code. The only real difference was that brw_math used brw_null_reg() for src1. This patch improves brw_math2's assertions to allow both unary and binary operations, renames it to gen6_math(), and drops the Gen6+ code out of brw_math(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
757d7ddf01db694c51c63ea260510d89febea18a |
|
25-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Emit compaction stats without walking the assembly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6fdfe3f2dc4975a002dd019d1f16ff287d5aadfd |
|
25-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Move program header printing to end of generate_code(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
92b055625da1b8d9144bf746ac67210df7deba73 |
|
25-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Move annotation info into generate code. Suggested by Ken as a way to cut down lines of code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
e374809819d82f2e3e946fe809c4d46061ddc5b5 |
|
01-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Put '_default_' in the name of functions that set default state. Eventually we're going to use functions to set bits on an instruction. Putting 'default' in the name of functions that alter default state will help distinguins them. This patch was generated entirely mechanically, by the following: for file in brw*.{cpp,c,h}; do sed -i \ -e 's/brw_set_mask_control/brw_set_default_mask_control/g' \ -e 's/brw_set_saturate/brw_set_default_saturate/g' \ -e 's/brw_set_access_mode/brw_set_default_access_mode/g' \ -e 's/brw_set_compression_control/brw_set_default_compression_control/g' \ -e 's/brw_set_predicate_control/brw_set_default_predicate_control/g' \ -e 's/brw_set_predicate_inverse/brw_set_default_predicate_inverse/g' \ -e 's/brw_set_flag_reg/brw_set_default_flag_reg/g' \ -e 's/brw_set_acc_write_control/brw_set_default_acc_write_control/g' \ $file; done No manual changes were done after running that command. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
776ad51165b1f7ee18a9a4cccbed1ce3b2c4fcf9 |
|
31-May-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Don't use brw_set_conditionalmod in the FS and vec4 compilers. brw_set_conditionalmod and brw_next_insn work together to set the conditional modifier for the next instruction, then turn it off. The Gen8+ generators don't implement this: we just set it for all future instructions, and whack it for each fs_inst/vec4_instruction. Both approaches work out because we only set conditional_mod on IR instructions like CMP, AND, and so on, which correspond to exactly one assembly instruction. The Gen8 generators would break if we had an IR instruction that generated multiple instructions, and the Gen4-7 EU emit layer would do...something. To safeguard against this, assert that we only generated one instruction if conditional_mod is set, and just set the flag directly on that instruction rather than altering default state. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
f0f7fb181fc267934a44904da4530f50a698b18d |
|
19-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Print disassembly after compaction. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
a35b9cb625495e51a42b56cd1d8d2cb019abe243 |
|
19-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Add annotation data structure and support code. Will be used to print disassembly after jump targets are set and instructions are compacted, while still retaining higher-level IR annotations and basic block information. An array of 'struct annotation' will live along side the generated assembly. The generators will populate the array with their IR annotations, and basic block pointers if the instructions began or ended a basic block pointer. We'll then update the instruction offset when we compact instructions and then using the annotations print the disassembly. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
9976294e867785ea480f52178a3d3dc67ac72d32 |
|
16-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Pass in start_offset to brw_compact_instructions(). Let's us avoid recompacting the SIMD8 instructions when we compact the SIMD16 program. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
db1449b7005af190d2ef1f2ad94f96c4b29943db |
|
16-May-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rename brw/gen8_dump_compile to brw/gen8_disassemble. "Disassemble" is an accurate description of what this function does. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
1ef52d6ab3f298af14088354682ee861573e5284 |
|
10-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Pass brw_context and assembly separately to brw_dump_compile. brw_dump_compile will be called indirectly by code common used by generations before and after the gen8 instruction format change. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
74b252d270404a729fd41df9e835a368fa8f9044 |
|
07-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Pull brw_compact_instructions() out of brw_get_program(). Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
09747066714a341b85907c474f18a0d05bbc7071 |
|
28-Mar-2014 |
Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> |
i965/vec4: Add support for the MAC instruction. This allows us to generate the MAC (multiply-accumulate) instruction, which can be used to implement some expressions in fewer instructions than doing a series of MUL and ADDs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
306ed81b9363721058c568244f9860c5c8c819f4 |
|
04-Apr-2014 |
Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> |
i965: Add writes_accumulator flag Our hardware has an "accumulator" register, which can be used to store intermediate results across multiple instructions. Many instructions can implicitly write a value to the accumulator in addition to their normal destination register. This is enabled by the "AccWrEn" flag. This patch introduces a new flag, inst->writes_accumulator, which allows us to express the AccWrEn notion in the IR. It also creates a n ALU2_ACC macro to easily define emitters for instructions that implicitly write the accumulator. Previously, we only supported implicit accumulator writes from the ADDC, SUBB, and MACH instructions. We always enabled them on those instructions, and left them disabled for other instructions. To take advantage of the MAC (multiply-accumulate) instruction, we need to be able to set AccWrEn on other types of instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
a76e5dce4fc8d50f8699c108833f24e80167d706 |
|
23-Dec-2013 |
Eric Anholt <eric@anholt.net> |
i965: Move compiler debugging output to stderr. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
fdcf6c8fad9edfc87eb5f647af254a6fd6b3b71c |
|
20-Feb-2014 |
Eric Anholt <eric@anholt.net> |
i965: Use the object label when available for INTEL_DEBUG=vs,gs,fs output. Note that this requires updated run.py in shader_db. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
008338bc4e2d9cc5931b9968d019619c09392389 |
|
25-Jan-2014 |
Jordan Justen <jordan.l.justen@intel.com> |
i965: support gl_InvocationID for gen7 v2: * Make gl_InvocationID a system value v3: * Properly shift from R0.1 into DST.4 by adding GS_OPCODE_GET_INSTANCE_ID Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
42b226ef824ed61ccf51fa9a1198cba305ad5472 |
|
19-Feb-2014 |
Francisco Jerez <currojerez@riseup.net> |
i965: Make sure that backend_reg::type and brw_reg::type are consistent for fixed regs. And define non-mutating helper functions to retype fixed and normal regs with a common interface. At some point we may want to get rid of ::fixed_hw_reg completely and have fixed regs use the normal register data members (e.g. backend_reg::reg to select a fixed GRF number, src_reg::swizzle to store the swizzle, etc.), I have the feeling that this is not the last headache we're going to get because of the multiple ways to represent the same thing and the different register interface depending on the file a register is stored in... Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
a32817f3c248125fb537c3a915566445e5600d45 |
|
27-Nov-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965: Unify fs_generator:: and vec4_generator::mark_surface_used as a free function. This way it can be used anywhere. I need it from the visitor. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
2f97119950515c841bca98a890e5110206bad945 |
|
03-Feb-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix INTEL_DEBUG=vs for fixed-function/ARB programs. Since commit 9cee3ff562f3e4b51bfd30338fd1ba7716ac5737, INTEL_DEBUG=vs has caused a NULL pointer dereference for fixed-function/ARB programs. In the vec4 generators, "prog" is a gl_program, and "shader_prog" is the gl_shader_program. This is different than the FS visitor. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
9cee3ff562f3e4b51bfd30338fd1ba7716ac5737 |
|
22-Jan-2014 |
Paul Berry <stereotype441@gmail.com> |
i965: Remove *_generator::shader field; use prog field instead. The "shader" field in fs_generator, vec4_generator, and gen8_generator was only used for one purpose; to figure out if we were compiling an assembly program or a GLSL shader (shader is NULL for assembly programs). And it wasn't being used properly: in vec4 shaders we were always initializing it based on prog->_LinkedShaders[MESA_SHADER_FRAGMENT], regardless of whether we were compiling a geometry shader or a vertex shader. This patch simplifies things by using the "prog" field instead; this is also NULL for assembly programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
ce527a6722491fa7d696266d5dec13f0b72bf8e8 |
|
10-Dec-2013 |
Topi Pohjolainen <topi.pohjolainen@intel.com> |
i965: rename tex_ms to tex_cms Prepares for the introduction of non-compressed multi-sampled lookup used in the blorp programs. v2: now also taking into account gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
15fc919491ea27bd395988a332502bdb23ee44d0 |
|
18-Jan-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Support arbitrarily large sampler state indices on Haswell+. Like the scalar backend, we add an offset to the "Sampler State Pointer" field to select a group of 16 samplers, then use the "Sampler Index" field to select within that group. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
d58e03fe4f04c24c70c76e7ad86fd04b9130a711 |
|
18-Jan-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Refactor sampler message setup. The next patch adds an additional case where the message header is necessary. So we want to do the g0 copy if inst->header_present is set, rather than inst->texture_offset. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
51000c2ff8a6b59b6dab51cbd63ef87ac6f2a317 |
|
23-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Modify some error messages to refer to "vec4" instead of "vs". These messages are in code that is shared between the VS and GS back-ends, so use the terminology "vec4" to avoid confusion. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
7629c489c88a6f6dd47b311a90ad64e216c9a37c |
|
29-Nov-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: Add shader opcode for sampling MCS surface Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6032261682388ced64bd33328a5025f561927a38 |
|
16-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965: Merge together opcodes for SHADER_OPCODE_GEN4_SCRATCH_READ/WRITE I'm going to be introducing gen7 variants, and the previous naming was going to get confusing. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
5e621cb9fef7eada5a3c131d27f5b0b142658758 |
|
11-Sep-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/gen7: Implement code generation for untyped surface read instructions.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
cfaaa9bbb7a6ab5819f4fa9e38352b72d6293cff |
|
11-Sep-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/gen7: Implement code generation for untyped atomic instructions. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
32f898a71cd0a83677944f0444145c4a04c966a1 |
|
10-Oct-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vs: Add support for shadow comparitors with gather4 gather4_c's argument layout is straightforward -- refz just goes on the end. gather4_po_c's layout however -- the array index is replaced with refz. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
6bb2cf2107c4461ea9dd100edaf110b839311b90 |
|
08-Oct-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: Add SHADER_OPCODE_TG4_OFFSET for gather with nonconstant offsets. The generator code ends up clearer this way than if we had to sniff via the message length. Implemented via the gather4_po message in hardware, which is present in Gen7 and later. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
03ac2c7223f7645e30028bf59b4c9cf0f5734fc0 |
|
16-Oct-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs. Geometry shaders that run in "DUAL_INSTANCED" mode store their inputs in vec4's. This means that when compiling gl_PointSize input swizzling (a MOV instruction which uses a geometry shader input as both source and destination), we need to do two things: - Set force_writemask_all to ensure that the MOV happens regardless of which channels are enabled. - Set the source register region to <4;4,1> (instead of <0;4,1> to satisfy register region restrictions. v2: move the source register region fixup to the top of vec4_generator::generate_vec4_instruction(), so that it applies to all instructions rather than just MOV. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
d395485e1df44853cdf86b0bd46b7af36c7e1c13 |
|
03-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965/vec4: Dynamically assign the VS/GS binding table offsets. Note that the dropped comment in brw_context.h is mostly (better written) in brw_binding_table.c as well. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
3c9dc2d31b80fc73bffa1f40a91443a53229c8e2 |
|
02-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965: Make a brw_stage_prog_data for storing the SURF_INDEX information. It would be nice to be able to pack our binding table so that programs that use 1 render target don't upload an extra BRW_MAX_DRAW_BUFFERS - 1 binding table entries. To do that, we need the compiled program to have information on where its surfaces go. v2: Rename size to size_bytes to be more explicit. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
69909c866b6595f80d206c8e2484b1dc6668e7be |
|
20-Sep-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Add Gen assertion checks for newer instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
014cce3dc49f5b0bfd7fbb1940ed661c9fc7bbd7 |
|
19-Sep-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Generate code for ir_binop_carry and ir_binop_borrow. Using the ADDC and SUBB instructions on Gen7. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
dd4c2a516cf2c700694bcbd37d644d7239f4cf48 |
|
15-Sep-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: use gather slots in the binding table for gather4. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
fb455500bfb11cca0f45076a9eaccc0ddd764731 |
|
31-Mar-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: add SHADER_OPCODE_TG4 Adds the Gen7 message IDs, a new SHADER_OPCODE_TG4 pseudo-op, and low-level support for emitting it via generate_tex(). V3: Updated for changes in master. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|
ec44d56a5b20632bcd4cb19ae6fa5d615df4149f |
|
18-Sep-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rename brw_{fs,vec4}_emit.cpp to brw_{fs,vec4}_generator.cpp. The previous names were really confusing to talk about: - brw_fs_visitor() contained methods named emit_whatever(). - brw_fs_generator() contained methods named generate_whatever(), but lived in brw_fs_emit.cpp. So when someone said "the emit layer", or "emit code", we weren't sure whether they meant the visitor's emit() functions or the generator in brw_fs_emit.cpp. By renaming these files, the method names, class names, and file names all match, which is much less confusing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
|