2b57adad0056273e38d9a9736cd98be95c0deb07 |
|
18-Aug-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4/scalarize_df: support more swizzles via vstride=0 By exploiting gen7's hardware decompression bug with vstride=0 we gain the capacity to support additional swizzle combinations. This also fixes ZW writes from X/Y channels like in: mov r2.z:df r0.xxxx:df Because DF regions use 2-wide rows with a vstride of 2, the region generated for the source would be r0<2,2,1>.xyxy:DF, which is equivalent to r0.xxzz, so we end up writing r0.z in r2.z instead of r0.x. Using a vertical stride of 0 in these cases we get to replicate the XX swizzle and write what we want. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c3edacaa288ae01c0f37e645737feeeb48f2c3f2 |
|
19-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4/scalarize_df: do not scalarize swizzles that we can support natively Certain swizzles like XYZW can be supported by translating only the first two 64-bit swizzle channels to 32-bit channels. This happens with swizzles such that the first two logical components, when translated to 32-bit channels and replicated across the second dvec2 row, select the same channels specified by the 3rd and 4th logical swizzle components. Notice that this opens up the possibility that some instructions are not scalarized and can end up with XY or ZW 32-bit writemasks. Make sure we always scalarize in such cases. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d8e123cc5d66022069f3aee53318bfd1075bcc53 |
|
22-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Add a shuffle_64bit_data helper SIMD4x2 64bit data is stored in register space like this: r0.0:DF x0 y0 z0 w0 r1.0:DF x1 y1 z1 w1 When we need to write data such as this to memory using 32-bit write messages we need to shuffle it in this fashion: r0.0:DF x0 y0 x1 y1 r0.1:DF z0 w0 z1 w1 and emit two 32-bit write messages, one for r0.0 at base_offset and another one for r0.1 at base_offset+16. We also need to do the inverse operation when we read using 32-bit messages to produce valid SIMD4x2 64bit data from the data read. We can achieve this by aplying the exact same shuffling to the data read, although we need to apply different channel enables since the layout of the data is reversed. This helper implements the data shuffling logic and we will use it in various places where we read and write 64bit data from/to memory. v2 (Curro): - Use the writemask helper and don't assert on the original writemask being XYZW. - Use the Vec4 IR builder to simplify the implementation. v3 (Iago): - Use byte_offset() instead of offset(). v3: - Fix typo (Matt) - Clarify the example and fix indention (Matt). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
b3a7d0ee9d5f792ab68fbe77da5e3ea85d4bc4c0 |
|
08-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Lower 64-bit MAD The previous patch made sure that we do not generate MAD instructions for any NIR's 64-bit ffma, but there is nothing preventing i965 from producing MAD instructions as a result of lowerings or optimization passes. This patch makes sure that any 64-bit MAD produced inside the driver after translating from NIR is also converted to MUL+ADD before we generate code. v2: - Use a copy constructor to copy all relevant instruction fields from the original mad into the add and mul instructions v3: - Rename the lowering and fix commit log (Matt) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e238601a2da9512c0fd263e8378f30498a0a1507 |
|
24-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: translate 64-bit swizzles to 32-bit The hardware can only operate with 32-bit swizzles, which is a rather limiting restriction. However, the idea is not to expose this to the optimization passes, which would be a mess to deal with. Instead, we let the bulk of the vec4 backend ignore this fact and we fix the swizzles right at codegen time. At the moment the pass only needs to handle single value swizzles thanks to the scalarization pass that runs before it. Notice that this only works for X/Y swizzles. We will add support for Z/W swizzles in the next patch, since they need a bit more work. v2 (Sam): - Do not expand swizzle of 64-bit immediate values. v3: - Do this after translation to hardware registers instead of doing it right before so we don't need the force_vstride0 flag (Curro). - Squashed patch that included FIXED_GRF in the list of register files that need this translation (Iago). - Remove swizzle assignments for VGRF and UNIFORM files in convert_to_hw_regs(), they will be set by apply_logical_swizzle() (Iago). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
fb7cb853c964db44ab99c1592e1ef7dec2f0c25b |
|
24-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add a scalarization pass for double-precision instructions The hardware only supports 32-bit swizzles, which means that we can only access directly channels XY of a DF making access to channels ZW more difficult, specially considering the various regioning restrictions imposed by the hardware. The combination of both things makes handling ramdom swizzles on DF operands rather difficult, as there are many combinations that can't be represented at all, at least not without some work and some level of instruction splitting depending on the case. Writemasks are 64-bit in general, however XY and ZW writemasks also work in 32-bit, which means these writemasks can't be represented natively, adding to the complexity. For now, we decided to try and simplify things as much as possible to avoid dealing with all this from the get go by adding a scalarization pass that runs after the main optimization loop. By fully scalarizing DF instructions in align16 we avoid most of the complexity introduced by the aforementioned hardware restrictions and we have an easier path to an initial fully functional version for the vector backend in Haswell and IvyBridge. Later, we can improve the implementation so we don't necessarily scalarize everything, iteratively adding more complexity and building on top of a framework that is already working. Curro drafted some ideas for how this could be done here: https://bugs.freedesktop.org/show_bug.cgi?id=92760#c82 v2: - Use a copy constructor for the scalar instructions so we copy all relevant instructions fields from the original instruction. v3: Fix indention in one switch (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
58767f0fec7809c3408adbc4d147dd56f2ee3d4d |
|
29-Aug-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add a SIMD lowering pass Generally, instructions in Align16 mode only ever write to a single register and don't need any form of SIMD splitting, that's why we have never had a SIMD splitting pass in the vec4 backend. However, double-precision instructions typically write 2 registers and in some cases they run into certain hardware bugs and limitations that we need to work around by splitting the instructions so we only write to 1 register at a time. This patch implements a SIMD splitting pass similar to the one in the scalar backend. Because we only use double-precision instructions in Align16 mode in gen7 (gen8+ is fully scalar and gens < 7 do not implement fp64) the pass should be a no-op on any other generation. For now the pass only handles the gen7 restriction where any instruction that writes 2 registers also needs to read 2 registers. This affects double-precision instructions reading uniforms, for example. Later patches will extend the lowering pass adding a few more cases. v2: - Move the simd lowering pass after the main optimization loop and run copy-propagation and dce if it reports progress (Curro) - Compute number of registers written instead of fixing it to 1 (Iago) - Use group from backend_instruction (Iago) - Drop assertion that checked that we only split 8-wide instructions into 4-wide. (Curro) - Don't assume that instructions can only be 8-wide, we might want to use 16-wide instructions in the future too (Curro) - Wrap gen7 workarounds in a conditional to ease adding workarounds for other gens in the future (Curro) - Handle dst/src overlap hazard (Curro) - Use the horiz_offset() helper to simplify the implementation (Curro) - Drop the assertion that checks that each split instruction writes exactly one register (Curro) - Use the copy constructor to generate split instructions with all the relevant fields initialized to the values in the original instruction instead of copying only a handful of them manually (Curro) v3 (Iago): - When copying to a temporary, allocate the number of registers required for the copy based on the size written of the lowered instruction instead of assuming that all lowered instructions produce single-register writes - Adapt to changes in offset() Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
98da3623d5dfd991362c4fd3571325fe0277a2f9 |
|
09-Mar-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add a helper function to create double immediates Gen7 hardware does not support double immediates so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation. v2 (Curro): - Use swizzle() and writemask() helpers and make tmp const. v3 (Iago): - Adapt to changes in offset() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bfc1f0f017db6bd11a558237c9a4ebeacf73f5ba |
|
29-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add helpers for conversions to/from doubles Use these helpers to implement d2f and f2d. We will reuse these helpers when we implement things like d2i or i2d as well. v2: - Rename the helpers (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9b6174dffa4c085a0b7f66db6c46b831c5c91f0b |
|
05-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add dst_null_df() Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
fd249c803e3ae2acb83f5e3b7152728e73228b7b |
|
12-Dec-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
treewide: s/comparitor/comparator/ git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g' Just happened to notice this in a patch that was sent and included one of the tokens in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f182e5eafc31ebc7c140e9a369d5f747948733ae |
|
17-Oct-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Handle component qualifiers on non-generic varyings. ARB_enhanced_layouts only requires component qualifier support for generic varyings, so this is all the vec4 backend knew how to handle. This patch extends the backend to handle it for all varyings, so we can use store_output intrinsics with a component set for things like clip/cull distances. We may want to use that for other VUE header fields in the future as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
89e1436e2d4ff0c15202708979eb36761cae4167 |
|
11-Oct-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Silence unused parameter warnings brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter] gl_shader_stage shader_type, ^ brw_nir.c: In function ‘brw_nir_lower_vs_inputs’: brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo, ^ brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter] uint32_t sampler, ^ brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter] vec4_visitor::gs_emit_vertex(int stream_id) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d1b1fca0b7cccff718923f2344ea144dc3ebb869 |
|
22-Jun-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965/vec4: add support for packing vs/gs/tes outputs Here we create a new output_generic_reg array with the ability to store the dst_reg for each component of user defined varyings. This is needed as the previous code only stored the dst_reg based on the varying location which meant packed varyings would overwrite each other. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
96dfed49e47eac7afc100e5b8d3b316dd6652fb6 |
|
19-Jul-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Stop muging cube array lengths by 6 From the Sky Lake PRM: "For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port Surfaces, the range of this field is [0,340], indicating the number of cube array elements (equal to the number of underlying 2D array elements divided by 6). For other surfaces, this field must be zero." In other words, the depth field for cube maps is in number of cubes not number of 2-D slices so we need to divide by 6. ISL will do this correctly for us assuming that we provide it with the correct array bounds which it expects to be in 2-D slices. It appears as if we've been doing this wrong ever since we first added cube map arrays for Sandy Bridge and the change to ISL made things slightly worse. While we're at it, we now need to remoe the shader hacks we've always done since they were only needed because we were setting the depth field six times too large. v2: Fix the vec4 backend as well (not sure how I missed this). Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
6e28976d35cf0a15c62bed1fd2ceeb734a3fc81e |
|
07-Jul-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965: enable the emission of the DIM instruction v2 (Matt): - Take a DF source argument for the DIM instruction emission in the visitors. - Indentation. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
fb5dcb81cc121e4355b7eef014474a5c42a2f6db |
|
19-May-2016 |
Matt Turner <mattst88@gmail.com> |
i965: Pass nir_src/nir_dest by reference. Cuts 6K of .text. text data bss dec hex filename 5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so before 5766074 264648 29320 6060042 5c780a lib/i965_dri.so after Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f687b8e1785df0825443f07778e5d0ddf6f9be09 |
|
13-May-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Silence unused parameter warnings The only place that actually used the type parameter was the GS visitor, and it was always passed glsl_type::int. Just remove the parameter. brw_vec4_vs_visitor.cpp:38:61: warning: unused parameter ‘type’ [-Wunused-parameter] const glsl_type *type) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2a25a5142bd78b22cc9ada41b8988bb282c2a7ac |
|
14-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fold vectorize_mov() back into the one caller. After the previous patch, this helper is only called in one place. So, just fold it back in - there are a lot of parameters here and not much code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8e76f664beb845f8dca30ca5635f9369618563b0 |
|
09-Dec-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Get rid of the uniform_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
056849772f66582fd7e8a181c3fb16955f84243b |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d880c6f9f59dac7cfe33713fff1c09c63ab7fb4f |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Inline get_pull_constant_offset It's not really doing enough anymore to justify a helper function. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reveiewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d3a89a7c494d577fdf8f45c0d8735004a571e86b |
|
04-Mar-2016 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4/nir: remove emit_untyped_surface_read and emit_untyped_atomic at brw_vec4_visitor surface_access emit_untyped_read and emit_untyped_atomic provides the same functionality. v2: surface parameter of emit_untyped_atomic is a const, no need to specify default predicate on emit_untyped_atomic, use retype (Francisco Jerez). Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2f76a9924e7b0b33a508ee3651b0cb2ab536a7dc |
|
02-Mar-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/vec4: add opportunistic behaviour to opt_vector_float() opt_vector_float() transforms several scalar MOV operations to a single vectorial MOV. This is done when those MOV covers all the components of the destination register. So something like: mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf3.0.z:D, 0D is transformed in: mov vgrf3.0:F, [0F, 0F, 0F, 1F] But there are cases where not all the components are written. For example, in: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf4.0.xy:D, 1065353216D mov vgrf4.0.w:D, 0D mov vgrf6.0:UD, u4.xyzw:UD Nor vgrf3 nor vgrf4 .z components are written, so the optimization is not applied. But it could be applied anyway with the components covered, using a writemask to select the ones written. So we could transform it in: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F] mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F] mov vgrf6.0:UD, u4.xyzw:UD This commit does precisely that: opportunistically apply opt_vector_float() when possible. total instructions in shared programs: 7124660 -> 7114784 (-0.14%) instructions in affected programs: 443078 -> 433202 (-2.23%) helped: 4998 HURT: 0 total cycles in shared programs: 64757760 -> 64728016 (-0.05%) cycles in affected programs: 1401686 -> 1371942 (-2.12%) helped: 3243 HURT: 38 v2: change vectorize_mov() signature (Matt). v3: take in account predicates (Juan). v4 [mattst88]: Update shader-db numbers. Fix some whitespace issues. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2f2c00c7279e7c43e520e21de1781f8cec263e92 |
|
11-Feb-2016 |
Matt Turner <mattst88@gmail.com> |
i965: Lower min/max after optimization on Gen4/5. Gen4/5's SEL instruction cannot use conditional modifiers, so min/max are implemented as CMP + SEL. Handling that after optimization lets us CSE more. On Ironlake: total instructions in shared programs: 6426035 -> 6422753 (-0.05%) instructions in affected programs: 326604 -> 323322 (-1.00%) helped: 1411 total cycles in shared programs: 129184700 -> 129101586 (-0.06%) cycles in affected programs: 18950290 -> 18867176 (-0.44%) helped: 2419 HURT: 328 Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d03e5d52557ce6523eb65dfec9172d6000f5ff8d |
|
03-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
eb63640c1d38a200a7b1540405051d3ff79d0d8a |
|
17-Jan-2016 |
Emil Velikov <emil.velikov@collabora.com> |
glsl: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a39a8fbbaa129f4e52f2a3ad2747182e9a74d910 |
|
17-Jan-2016 |
Emil Velikov <emil.velikov@collabora.com> |
nir: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
24be658d13b13fdb8a1977208038b4ba43bce4ac |
|
17-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add tessellation control shaders. The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bb9eb599335ec4ac3a2a579359fb239f16de17e8 |
|
26-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Optimize predicate handling for any/all. For a select whose condition is any(v), instead of emitting cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D mov(8) g7<1>.xUD 0x00000000UD (+f0.any4h) mov(8) g7<1>.xUD 0xffffffffUD cmp.nz.f0(8) null<1>D g7<4,4,1>.xD 0D (+f0) sel(8) g8<1>UD g4<4,4,1>UD g3<4,4,1>UD we now emit cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D (+f0.any4h) sel(8) g9<1>UD g4<4,4,1>UD g3<4,4,1>UD Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
aa35b0c2c71f054f72df5a85779d0862fa7d6e4a |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Get rid of the nir_inputs array It's not really buying us anything at this point. It's just a way of remapping one offset namespace onto another. We can just use the location namespace the whole way through. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f36993b46962eab4446bc1964eb47149751aee26 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ecac1aab538d65f0867fd93e23d0d020c1a5d0f1 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Push down inclusion of brw_program.h. We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d9b8fde963a53d4e06570d8bece97f806714507a |
|
12-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use NIR for lowering texture swizzle Now that nir_lower_tex can do texture swizzle lowering, we can use that instead of repeating more-or-less the same code in both backends. This both allows us to share code and means that things like the tg4 work-arounds are somewhat simpler because they don't have to take the swizzle into account. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1a094a2ee2d63073ac12c8ab0dbd38c0e9270cf5 |
|
23-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Move vec4_generator class definition into the .cpp file. The public API for the generator is brw_vec4_generate_code(); nobody actually needs to use the class. This means we can extend it without triggering the recompiles associated with altering brw_vec4.h. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4cba8f5d21e4b50343e7c7bfbeb603b59c5d71dd |
|
23-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Wrap vec4_generator in a C function. vec4_generator is a class for convenience, but only exports a single method as its public API. It makes much more sense to just export a single function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
73ff0ead3688519eb76ea8bc32eabb9004e6f37b |
|
23-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Convert src_reg/dst_reg to brw_reg at the end of the visitor. This patch makes the visitor convert registers to the HW_REG file at the very end, after register allocation, post-RA scheduling, and dependency control flagging. After that, everything is in fixed brw_regs. This simplifies the code generator, as it can just use the hardware registers rather than having to interpret our abstract files. In particular, interpreting the UNIFORM file meant reading prog_data to figure out where push constants are supposed to start. Having the part of the code that performs register allocation also translate everything to hardware registers seems sensible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
627f94b72e0e9443ad116f072599a7342269f297 |
|
28-Sep-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: adding vec4_cmod_propagation optimization vec4 port of fs_cmod_propagation. Shader-db results (no vec4 grepping): total instructions in shared programs: 6240413 -> 6235841 (-0.07%) instructions in affected programs: 401933 -> 397361 (-1.14%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 2265 HURT: 0 v2: remove extra space and combine two if blocks, as suggested by Matt Turner v3: add condition check to bail out if current inst and inst being scanned has different writemask, as pointed by Matt Turner v3: updated shader-db numbers v4: remove block from foreach_inst_in_block_*_starting_from after commit 801f151917fedb13c5c6e96281a18d833dd6901f Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
0ca401327ef9e280b3a8b008f1e41477afec3a35 |
|
06-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use a const nir_shader in backend_shader Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8f1d968704858d78d7e78a6b88db3ea2bc0cf749 |
|
06-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Remove gl_program and gl_shader_program from the generator Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
73e0dfbaca2fd334fd3505412bf0d38054affd25 |
|
05-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Make vec4_visitor's destructor virtual We need a virtual destructor when at least one of the class' methods is virtual. Failure to do so might lead to undefined behavior when destructing derived classes. Fixes the following warning: brw_vec4_gs_visitor.cpp: In function 'const unsigned int* brw::brw_gs_emit(brw_context*, gl_shader_program*, brw_gs_compile*, void*, unsigned int*)': brw_vec4_gs_visitor.cpp:703:11: warning: deleting object of polymorphic class type 'brw::vec4_gs_visitor' which has non-virtual destructor might cause undefined behaviour [-Wdelete-non-virtual-dtor] delete gs; Curro: This shouldn't be causing any actual bugs at the moment because gen6_gs_visitor is the only subclass of vec4_visitor destroyed through a pointer of a base class (vec4_gs_visitor *) and its destructor is basically the same as its parent's. Anyway it seems sensible to change this so it doesn't bite us in the future. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4caa10193f6a88f476807aee56b900b3a02d9a6a |
|
03-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Remove more dead visitor/vertex program code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bf7b6fd3fd6d98305d64ee6224ca9f9e7ba48444 |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/shader: Get rid of the shader, prog, and shader_prog fields Unfortunately, we can't get rid of them entirely. The FS backend still needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still needs gl_shader_program for handling transfom feedback. However, the VS needs neither and we can substantially reduce the amount they are used. One day we will be free from their tyranny. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
404419ee1a57c79982d93eefe4de099d61ad2eee |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs,vec4: Get rid of the sanity_param_count It doesn't exist for anything other than an assert that, as far as I can tell, isn't possible to trip. Soon, we will remove prog from the visitor entirely and this will become even more impossible to hit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7926c3ea7d8f455cbee390d20c78dadf5432b9bc |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/backend_shader: Add a field to store the NIR shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ea006c4cb5eb2d98d6bfd5a6c32fcae10b636f17 |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Move binding table setup to codegen time. Setting up binding tables really has little to do with the actual process of turning shaders into instructions; it's more part of setting up prog_data. This commit moves it out of the visitors and with the rest of the prog_data setup stuff. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7fee8b6f055831bc070bb36d02a8b1c4d601652a |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Pull GLSL uniform handling into a common function The way we deal with GLSL uniforms and builtins is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3de81508ea513bf01f2c996c25a2cfdb5b3231d0 |
|
30-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/shader: Get rid of the setup_vec4_uniform_value helper It's not used by anything anymore Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5609e0d7b41e861a3359991e8d0f2053b255fc31 |
|
30-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Get rid of the uniform_vector_size array The uniform_vector_size array was only ever used by pack_uniform_registers which no longer needs it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
64b145422b928bed75d3665e4149a323b7208470 |
|
21-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Delete the old vec4_vp code Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1153f12076d121fd0213f58f1953872a60da041d |
|
21-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Delete the old ir_visitor code Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5ef169034c77ede86546d8dc42f7f22abcd6faa0 |
|
07-Aug-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/vec4: Implement nir_intrinsic_ssbo_atomic_* Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
6485880232df46c0cdded0b063b8841a7855bd32 |
|
28-Aug-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/vec4: Implement VS_OPCODE_GET_BUFFER_SIZE Notice that Skylake needs to include a header in the sampler message so it will need some tweaks to work there. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c951bb83056724df02ba7e6fe2dfa720c0f45c1f |
|
09-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_nir: Use partial SSA form rather than full non-SSA We made this switch in the FS backend some time ago and it seems to make a number of things a bit easier. In particular, supporting SSA values takes very little work in the backend and allows us to take advantage of the majority of the SSA information even after we've gotten rid of Phi nodes. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4f4b7c4711d98606270133dfd456acabfa8267a6 |
|
28-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Remove the brw_vue_prog_key base class. The legacy userclip fields are only used for the vertex shader, and at that point there's only program_string_id and the tex struct, which are common to all keys. So there's no need for a "VUE" key base class. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
323962182547aeafcdb3bac28434ef81f70eb785 |
|
28-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Virtualize vec4_visitor::emit_urb_slot(). This avoids a downcast of key, which won't exist in the base class soon. I'm not a huge fan of this patch, but given that we're currently using inheritance, this seems like the "right" way to do it. The alternative is to make key a void pointer in the parent class and continue downcasting. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
27e83b62bb52de7a681ed82679a707555023f43d |
|
28-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Store a key_tex pointer in vec4_visitor. I'm about to remove the base class for VS/GS/HS/DS program keys, at which point we won't be able to use key->tex anymore. Instead, we'll need to store a direct pointer (like we do in the FS backend). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
014b90221ad5cf833bfdd55b0336771d209f0f1d |
|
28-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move legacy clip plane handling to vec4_vs_visitor. This is now only used for the vertex shader, so it makes sense to get it out of any paths run by the geometry shader. Instead of passing the gl_clip_plane array into the run() method (which is shared among all subclasses), we add it as a vec4_vs_visitor constructor parameter. This eliminates the bogus NULL parameter in the GS case. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
cfa056c6a5eadf87f92a71346c0dddd2a080e302 |
|
18-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_nir: Get rid of the uniform_driver_location tracking Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
640c472fd075814972b1276c5b0ed3a769aacda5 |
|
12-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move type_size() methods out of visitor classes. I want to use C function pointers to these, and they don't use anything in the visitor classes anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c56899f41a904762225267cb9c543a0abd901ad5 |
|
19-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Make setup_vec4_uniform_value and _image_uniform_values take an offset This way they don't implicitly increment the uniforms variable and don't have to be called in-sequence during uniform setup. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8d8b8f58540abbdb8a006a38830a08346a0edf34 |
|
19-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Rename setup_vector_uniform_values to setup_vec4_uniform_value The new name more accurately represents what it does: Set up a single vec4 uniform value. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7068a6409c897e44cd98377df310691592ef6d0d |
|
10-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_visitor: Make some function arguments const references Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1d658cf8795383dbef127e46f3740b516bfe21b9 |
|
03-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_nir: Do boolean source modifier resolves on BDW+ On BDW+, the negation source modifier on NOT, AND, OR, and XOR, is actually a boolean negate and not an integer negate. However, NIR's soruce modifiers are the integer version. We have to resolve it with a MOV prior to emitting the actual instruction. This is basically the same thing we do in the FS backend. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7ade42755f8900aaf67073214c073419f734e7a8 |
|
29-Jun-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gs: Refactor ir_emit_vertex and ir_end_primitive So the implementation is independent of GLSL IR and the visit methods of the vec4 visitor. This way we will be able to reuse that implementation directly from the NIR vec4 backend. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1343f403b2d08a0877f17133abb6dccf0f51127b |
|
06-Jul-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/ir/vec4: Refactor visit(ir_texture *ir) Splitted in two. The emission is moved to a new vec4_visitor method, vec4_visitor::emit_texture, ir order to be reused on the nir path. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c15eea2afa7a295992cde949b8e2a5d4552f6290 |
|
06-Jul-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: Change vec4_visitor::swizzle_result() method to allow reuse This patch changes the signature of swizzle_result() to accept lower level arguments. The purpose is to reuse it in the upcoming NIR->vec4 pass. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
57182332b84b58fed6641314def67450893b7419 |
|
18-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/vec4: Change vec4_visitor::gather_channel() method to allow reuse This patch changes the signature of gather_channel() to accept the gather component directly instead of fetching it internally from ir_texture. This will allow reuse in the upcoming NIR->vec4 pass. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
72c8d7721feb966cf8530a3ee2642f0b842dc0f8 |
|
18-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/vec4: Change vec4_visitor::emit_mcs_fetch() method to allow reuse This patch changes the signature of emit_mcs_fetch() to accept lower level arguments. The purpose is to reuse it in the upcoming NIR->vec4 pass. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
434481f3155040217c3e5a8da98dab4248435f0e |
|
18-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/vec4: Move is_high_sample() method to vec4_visitor class The is_high_sample() method is currently accessible only in the implementation of vec4_visitor. Since we need to reuse it in the upcoming NIR->vec4 pass, lets make it a method of the class instead. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
314474872b77f291132a01f7c1df2788586fc943 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/vec4: Return the emitted instruction in emit_lrp() Needed in the NIR backend to set the "saturate" value of the instruction. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d53098393e3929b0c8d82f56144c7497b184f5b7 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/vec4: Return the emitted instruction in emit_minmax() Needed in the NIR backend to set the "saturate" value of the instruction. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
068a41b349e8bc30293c44d96553184f7562949f |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/vec4: Return the last emitted instruction in emit_math() Needed in the NIR backend to set the "saturate" value of the instruction. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f3187ea31ede6bc181ee561573d127aa2e485657 |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Add get_nir_dst() and get_nir_src() methods These methods are essential for the implementation of the NIR->vec4 pass. They work similar to their fs_nir counter-parts. When processing instructions, these methods are invoked to resolve the brw registers (source or destination) corresponding to the NIR sources or destination. It uses the map of NIR register index to brw register for all registers locally allocated in a block. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f7152525374015594e037fa11bb64e1c7174829b |
|
01-Jul-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Implement load_const intrinsic Similar to fs_nir backend, a nir_local_values map will be filled with newly allocated registers as the load_const instrinsic instructions are processed. Later, get_nir_src() will fetch the registers from this map for sources that are ssa. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
59006d3ad3ed5d29e84afa5931f425344e2ef658 |
|
22-Jul-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Add shader function implementation It basically allocates registers local to a function in a nir_locals map, then emits all its control-flow blocks. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4023b55fdd7005a8a100637c229a1c40648cdd2b |
|
16-Jun-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/nir/vec4: Add setup for system values Similar to other variable setups, system values will initialize the corresponding register inside a 'nir_system_values' map, which will then be queried later when processing the different system value intrinsics for the appropriate register. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
01c5617c8edc2f392363e9f8861d62a9fc9aa973 |
|
16-Jun-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: Redefine make_reg_for_system_value() to allow reuse in NIR->vec4 pass The new virtual method is more flexible, it has a signature: dst_reg *make_reg_for_system_value(int location, const glsl_type *type); v2 (Jason Ekstrand): Use the new version in unit tests so make check passes again Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
195156e571e851273c135847f91ed73b3bfc1914 |
|
16-Jun-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/vec4: Add setup of uniform variables Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
b929acb6a8659fdc06623b766bdf59904d8a3558 |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Add setup of input variables in NIR->vec4 pass This implementation sets up a map of input variable offsets to source registers that are already initialized with the corresponding register offset. This map will then be queried when processing load_input intrinsic operations, to obtain the correct register source from which the input data will be loaded. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
78e7ce2b7329f8cc3f771afbf39d3fa662e02d9e |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/vec4: Move type_size() method to brw_vec4_visitor class The type_size() method is currently accessible only in the implementation of vec4_visitor. Since we need to reuse it in the upcoming NIR->vec4 pass, lets make it a method of the class instead. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
abf4fa3c03ebe5716c90c8a310945c3621cf598f |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Add implementation placeholders for a new NIR->vec4 pass This patch will add a brw_vec4_nir.cpp file filled with entry point methods to the main functionality, following a structure similar to brw_fs_nir.cpp. Subsequent patches in this series will be adding the implementations for these methods, incrementally. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a0c02d2bbb765b0e997ad524d8e51838e529d9c0 |
|
28-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Define the setup_vector_uniform_values() backend_visitor interface. This cleans up the VEC4 implementation of setup_uniform_values() somewhat and will avoid duplication of the image uniform upload code by having a common interface to upload a vector of uniforms on either back-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
fadf34773527779eef4622b2586d87ec00476c0f |
|
13-Jul-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Fix stride field for the result of emit_uniformize(). This is essentially the same problem fixed in an earlier patch for immediates. Setting the stride to zero will be particularly useful for my future SIMD lowering pass, because we will be able to just check whether the stride of a source register is zero and skip emitting the copies required to unzip it in that case. Instead of setting stride to zero in every caller of emit_uniformize() I've changed the function to return the result as its return value (previously it was being written into a caller-provided destination register), because this way we can enforce that the result is used with the correct regioning from the function itself. The changes to the prototype of its VEC4 counterpart are mainly for the sake of symmetry, VEC4 registers don't have stride. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
13372a0ce746cde6fa6e0aa3c5130e4227f123e0 |
|
29-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Move c->last_scratch into vec4_visitor. Nothing outside of vec4_visitor uses it, so we may as well keep it internal. Commit db9c915abcc5ad78d2d11d0e732f04cc94631350 for the vec4 backend. (The empty class will be going away soon.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
0163c99e8f6959b5d6c7a937a322127cfdf9315f |
|
30-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Plumb log_data through so the backend_shader field gets set. Jason plumbed this through a while back in the FS backend, but apparently we were just passing NULL in the vec4 backend. This patch passes brw in as intended. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
40801295d5a3d747661abb1e2ca64d44c0e3dc05 |
|
23-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Remove the brw_context from the visitors As of this commit, nothing actually needs the brw_context. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
663f8d121d792edee5c012461bfd0b650011ff4a |
|
20-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vs: Pass the current set of clip planes through run() and run_vs() Previously, these were pulled out of the GL context conditionally based on whether we were running ff/ARB or a GLSL program. Now, we just pass them in so that the visitor doesn't have to grab them itself. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1b0f6ffa15b25e8601d60fe1ea74e893f7d33cf5 |
|
20-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Pull calls to get_shader_time_index out of the visitor Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c7893dc3c590b86787d8118e3920debaea3f16da |
|
19-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use a single index per shader for shader_time. Previously, each shader took 3 shader time indices which were potentially at arbirary points in the shader time buffer. Now, each shader gets a single index which refers to 3 consecutive locations in the buffer. This simplifies some of the logic at the cost of having a magic 3 a few places. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d7565b7d65f8203c20735a61b86e9158b8ec4447 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Remove the dependance on brw_context from the generators Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e639a6f68e701f23b977a49c45d646c164991d36 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Plumb compiler debug logging through a function pointer in brw_compiler v2 (Ken): Make shader_debug_log a printf-like function. v3 (Jason): Add a void * to pass the brw_context through Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e7f628c2fc5ef42672e3281e224226c3d47b1bac |
|
07-Sep-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
glsl: Add ir node for barrier v2: * Changes suggested by mattst88 [jordan.l.justen@intel.com: Add nir support] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8b9ecfff360711cffc41a0a062de5ad810f9cf2b |
|
20-May-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Make fs/vec4_visitor inherit from ir_visitor directly This is using multiple inheritance in C++. However, ir_visitor is really just an interface with no data so it shouldn't be so bad. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
99cb4233205edcfa1a1e2967eef7bb16ff19bec4 |
|
20-May-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Rename backend_visitor to backend_shader The backend_shader class really is a representation of a shader. The fact that it inherits from ir_visitor is somewhat immaterial. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
046abc998c6951ea8a4aee0a2c1b832f6c877b73 |
|
20-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Define helper function to copy an arbitrary live component from some register. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3da9f708d4f1375d674fae4d6c6eb06e4c8d9613 |
|
20-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Perform basic optimizations on the FIND_LIVE_CHANNEL opcode. v2: Save some CPU cycles by doing 'return progress' rather than 'depth++' in the discard jump special case. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
0519a6259b0e6b83eaeafdf0ed30a67713c4b6ed |
|
22-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Simplify generator code for untyped surface messages. The generate_untyped_*() methods do nothing useful other than calling the corresponding function from brw_eu_emit.c. The calls to brw_mark_surface_used() will go away too in a future commit. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a85c4c9b3f75cac9ab133caa91a40eec2e4816ae |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Rename brw_compile to brw_codegen This name better matches what it's actually used for. The patch was generated with the following command: for file in *; do sed -i -e s/brw_compile/brw_codegen/g $file done Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2bf207b47347ec1c672448e3019029f899a5d3b5 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Add a devinfo field to the generator and use it for gen checks Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
33f73e93ff6e14f72153d3df7e80763137fcb943 |
|
24-Mar-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/skl: Add the header for constant loads outside of the generator Commit 5a06ee738 added a step to the generator to set up the message header when generating the VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 instruction. That pseudo opcode is implemented in terms of multiple actual opcodes, one of which writes to one of the source registers in order to set up the message header. This causes problems because the scheduler isn't aware that the source register is written to and it can end up reorganising the instructions incorrectly such that the write to the source register overwrites a needed value from a previous instruction. This problem was presenting itself as a rendering error in the weapon in Enemy Territory: Quake Wars. Since commit 588859e1 there is an additional problem that the double register allocated to include the message header would end up being split into two. This wasn't happening previously because the code to split registers was explicitly avoided for instructions that are sending from the GRF. This patch fixes both problems by splitting the code to set up the message header into a new pseudo opcode so that it will be done outside of the generator. This new opcode has the header register as a destination so the scheduler can recognise that the register is written to. This has the additional benefit that the scheduler can optimise the message header slightly better by moving the mov instructions further away from the send instructions. On Skylake it appears to fix the following three Piglit tests without causing any regressions: gs-float-array-variable-index gs-mat3x4-row-major gs-mat4x3-row-major I think we actually may need to do something similar for the fs backend and possibly for message headers from regular texture sampling but I'm not entirely sure. v2: Make sure the exec-size is retained as 8 for the mov instruction to initialise the header from g0. This was accidentally lost during a rebase on top of 07c571a39fa1. Split the patch into two so that the helper function is a separate change. Fix emitting the MOV instruction on Gen7. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89058 Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a9e4cf5d323dbf11e42deda389ed03db571a7df7 |
|
15-Apr-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/vec4: Add a helper function to emit VS_OPCODE_PULL_CONSTANT_LOAD There were three places in the visitor that had a similar chunk of code to emit the VS_OPCODE_PULL_CONSTANT_LOAD opcode using a register for the offset. This patch combines the chunks into a helper function to reduce the code duplication. It will also be useful in the next patch to expand what happens on Gen9+. This shouldn't introduce any functional changes. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
cdb1eb9a3fa096b0eeef239a602cd1c42cf27498 |
|
02-Apr-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Remove emit_scs() prototype. This has never existed. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3818dfcf3c2d03809774bba613d7dd92752b36db |
|
17-Mar-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Handle scratch accesses where reladdr also points to scratch space This is a problem when we have IR like this: (array_ref (var_ref temps) (swiz x (expression ivec4 bitcast_f2i (swiz xxxx (array_ref (var_ref temps) (constant int (2)) ) )) )) ) ) where we are indexing an array with the result of an expression that accesses the same array. In this scenario, temps will be moved to scratch space and we will need to add scratch reads/writes for all accesses to temps, however, the current implementation does not consider the case where a reladdr pointer (obtained by indexing into temps trough a expression) points to a register that is also stored in scratch space (as in this case, where the expression used to index temps access temps[2]), and thus, requires a scratch read before it is accessed. v2 (Francisco Jerez): - Handle also recursive reladdr addressing. - Do not memcpy dst_reg into src_reg when rewriting reladdr. v3 (Francisco Jerez): - Reduce complexity by moving recursive reladdr scratch access handling to a separate recursive function. - Do not skip demoting reladdr index registers to scratch space if the top level GRF has already been visited. v4 (Francisco Jerez) - Remove redundant checks. - Simplify code by making emit_resolve_reladdr return a register with the original src data except for reg, reg_offset and reladdr. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89508 Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e6e655ef76bb22193b31af2841cb50fda0c39461 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Define helpers to calculate the common live interval of a range of variables. These will be especially useful when we start keeping track of liveness information for each subregister. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
21c829e5cc6fefa5a42550e9043fade3e9e54e64 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Remove unused method definition. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e60318fbcdec139227e427f8ec4d17f07f0d3798 |
|
19-Feb-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Replace debug_flag with debug_enabled. backend_visitor now handles this, so we can delete the vec4_visitor specific code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4470bf1f494ce313bda4f1627c775569d886f93f |
|
11-Feb-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
i965/vec4: Silence unused parameter warnings brw_vec4_copy_propagation.cpp:243:59: warning: unused parameter 'reg' [-Wunused-parameter] int arg, struct copy_entry *entry, int reg) ^ brw_vec4_generator.cpp:869:57: warning: unused parameter 'inst' [-Wunused-parameter] vec4_generator::generate_unpack_flags(vec4_instruction *inst, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bfbb0e84e11e06af3d751701f157a21915976ca1 |
|
06-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Move IR object definitions to separate header files. One should be able to manipulate i965 IR without pulling the whole FS/VEC4 visitor classes -- Optimization passes and other transformations would ideally be visitor-agnostic. Among other issues this avoids a circular dependency between the header file where such visitor-agnostic code will be defined and the main FS/VEC4 header where both IR (layer below) and visitor (layer above) happen to be defined. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
447879eb88b8df41ad32cf4406cc636b112b72d9 |
|
10-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Factor out virtual GRF allocation to a separate object. Right now virtual GRF book-keeping and allocation is performed in each visitor class separately (among other hundred different things), leading to duplicated logic in each visitor and preventing layering as it forces any code that manipulates i965 IR and needs to allocate virtual registers to depend on the specific visitor that happens to be used to translate from GLSL IR. v2: Use realloc()/free() to allocate VGRF book-keeping arrays (Connor). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3167a80bb1119616b70fbbcf2661d3fb511a6034 |
|
13-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix "vertex" vs. "geometry" and "VS" vs. "GS" in debug output. We were happily printing "Native code for unnamed vertex shader" and "VS vec4" program for geometry shaders in our INTEL_DEBUG=gs output, as well as the KHR_debug output used by shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c855f49c99379cc65e5a91fe9297a6b961e09e1f |
|
21-Dec-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add parameter to skip doing constant propagation. After CSEing some MOV ..., VF instructions we have code like mov tmp, [1F, 2F, 3F, 4F]VF mov r10, tmp mov r11, tmp ... use r10 use r11 We want to copy propagate tmp into the uses of r10 and r11, but *not* constant propagate the VF immediate into the uses of tmp. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
44573458bdc52acc304fb75d6df502312b8e149c |
|
20-Dec-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add pass to gather constants into a vector-float MOV. Currently only handles consecutive instructions with the same destination that collectively write all channels. total instructions in shared programs: 5879798 -> 5869011 (-0.18%) instructions in affected programs: 465236 -> 454449 (-2.32%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3978585bccf69ff8f607cad0de025ea91c418587 |
|
20-Dec-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Add fs_reg/src_reg constructors that take vf[4]. Sometimes it's easier to generate 4x values into an array, and the memcpy is 1 instruction, rather than 11 to piece 4 arguments together. I'd forgotten to remove the prototype from fs_reg from a previous patch, so it's already there for us here. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bf2307937995212895375d1e258d50207da3d24e |
|
25-Nov-2014 |
Kristian Høgsberg <krh@bitplanet.net> |
i965: Rename brw_vec4_prog_data/key to brw_bue_prog_data/key These structs aren't vec4 specific, they are shared by shader stages operating on Vertex URB Entries (VUEs). VUEs are the data structures in the URB that hold vertex data between the pipeline geometry stages. Using vue in the name instead of vec4 makes a lot more sense, especially when we add scalar vertex shader support. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2881b123d00562fee8b7d2b4f7825f89a73e0d9f |
|
02-Dec-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use ~0 to represent true on all generations. Jason realized that we could fix the result of the CMP instruction on Gen <= 5 by doing -(result & 1). Also do the resolves in the vec4 backend before use, rather than when the bool was created. The FS does this and it saves some unnecessary resolves. On Ironlake: total instructions in shared programs: 4289762 -> 4287277 (-0.06%) instructions in affected programs: 619430 -> 616945 (-0.40%) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
169b6c1955deee7333d61f9ff149b7124bdea7d1 |
|
01-Dec-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Handle vertex color clamping in emit_urb_slot(). Vertex color clamping only applies to a few specific built-ins: COL0/1 and BFC0/1 (aka gl_[Secondary]{Front,Back}Color). It seems weird to handle special cases in a function called emit_generic_urb_slot(). emit_urb_slot() is all about handling special cases, so it makes more sense to handle this there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a64f3ba3d1c9be83783539203330f32c037abdb1 |
|
29-Oct-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move program key structures to brw_program.h. With fs_visitor/fs_generator being reused for SIMD8 VS/GS programs, we're running into weird #include patterns, where scalar code #includes brw_vec4.h and such. Program keys aren't really related to SIMD4X2/SIMD8 execution - they mostly capture NOS for a particular shader stage. Consolidating them all in one place that's vec4/scalar neutral should help avoid problems. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a50915984fe1205a3479cc8a5d07a8b3bde7d6bc |
|
29-Oct-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Make live_intervals part of the vec4_visitor class. Like in fs_visitor. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f1e5418f402c7ac087b1c127cb4476d0d02e0073 |
|
12-Nov-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Don't treat IF or WHILE with cmod as writing the flag. Sandybridge's IF and WHILE instructions can do an embedded comparison with conditional mod. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
70fcd565388354da5a3c96d8a265e4d0b5ad7292 |
|
10-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Optimize packSnorm4x8(). Reduces the number of instructions needed to implement packSnorm4x8() from 13 -> 7.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3532be76805e79993bc6f684876586c189ec605b |
|
10-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Optimize packUnorm4x8(). Reduces the number of instructions needed to implement packUnorm4x8() from 11 -> 6.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
94a30bbd4fe5f3eda167819e307f736268fd33f6 |
|
10-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Optimize unpackSnorm4x8(). Reduces the number of instructions needed to implement unpackSnorm4x8() from 16 -> 6. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bf686b2785c63116ab4ab7e62eb77be51b97d346 |
|
09-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Optimize unpackUnorm4x8(). Reduces the number of instructions needed to implement unpackUnorm4x8() from 11 -> 4. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
cb0ba848d4176c1ed2c4542fd5875867f460fc3b |
|
09-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add vector float immediate infrastructure. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
88fea85f09e2252035bec66ab26c375b45b000f5 |
|
21-Nov-2014 |
Ben Widawsky <benjamin.widawsky@intel.com> |
i965/vec4/gen8: Handle the MUL dest hazard exception Fix one of the few cases where we can't reliable touch the destination hazard bits. I am explicitly doing this patch individually so it is easy to backport. I was tempted to do this patch before the previous patch which reorganized the code, but I believe even doing that first, this is still easy to backport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84212 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d9432af45a1a69d0cd1dcf12edfae920adeb4734 |
|
12-Nov-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Move common fields into backend_instruction. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bd20fad3168e9c89d7892397466f7d98a002aeb2 |
|
11-Nov-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Combine all the math emitters. 17 insertions(+), 102 deletions(-). Works just as well. v2: Make emit_math take const references (suggested by Matt), drop redundant WRITEMASK_XYZW setting (Matt and Curro). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
dba683cf1624a9a30489df7b88ada1b1a86c991d |
|
11-Nov-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Use const references in emit() functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
0efc53a96ca853db24e7cf96190b1dfa94375b1f |
|
11-Nov-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use macros to create prototypes for emitter helpers. We do this almost everywhere else; this should make it easier to modify. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
072ea414d04f1b9a7bf06a00b9011e8ad521c878 |
|
01-Sep-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Remove cfg-invalidating parameter from invalidate_live_intervals. Everything has been converted to preserve the CFG. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
269b6e24d6ec61d8d8d0c5d1b3d1bfa4f4a55f5f |
|
25-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Preserve CFG in spill_reg(). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c66165ab2b15047792808433b788632a4b9df287 |
|
01-Aug-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Fix binding table clash between TF surfaces and textures. For gen6 geometry shaders we use the first BRW_MAX_SOL_BINDINGS entries of the binding table for transform feedback surfaces. However, vec4_visitor will setup the binding table so that textures use the same space in the binding table. This is done when calling assign_common_binding_table_offsets(0) as part if its run() method. To fix this clash we add a virtual method to the vec4_visitor hierarchy to assign the binding table offsets, so that we can change this behavior specifically for gen6 geometry shaders by mapping textures right after the first BRW_MAX_SOL_BINDINGS entries. Also, when there is no user-provided geometry shader, we only need to upload the binding table if we have transform feedback, however, in the case of a user-provided geometry shader, we can't only look into transform feedback to make that decision. This fixes multiple piglit tests for textureSize() and texelFetch() when these functions are called from a geometry shader in gen6, like these: bin/textureSize gs sampler2D -fbo -auto bin/texelFetch gs usampler2D -fbo -auto Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1f77bfce7debe34366942ec441eda38747a47f74 |
|
23-Jul-2014 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/gen6/gs: Add an additional parameter to the FF_SYNC opcode. We will use this parameter in later patches to provide information relevant to transform feedback that needs to be set as part of the FF_SYNC message. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3ea410972a9954babdcb6a0b1d4e5bc6f1ff61d2 |
|
23-Jul-2014 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/gen6/gs: implement GS_OPCODE_FF_SYNC_SET_PRIMITIVES opcode This opcode will be used when filling FF_SYNC header before emitting vertices and their data. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5933a08bd98d053a4cc5797d901bb399a8a5b470 |
|
18-Jul-2014 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/gen6/gs: implement GS_OPCODE_SVB_SET_DST_INDEX opcode This opcode generates code to copy the specified destination index into subregister 5 of the MRF message header. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e86ae1b0a32bbbe4fb02ae9cee5b447a75d7e27f |
|
18-Jul-2014 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/gen6/gs: implement GS_OPCODE_SVB_WRITE opcode This opcode will be used when sending SVB WRITE messages to save transform feedback outputs into Streamed Vertex Buffers. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
024b7c0f33a6e8f59d8b3d9dd9f72d671f426890 |
|
24-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_SET_PRIMITIVE_ID. In gen6 the geometry shader payload includes the PrimitiveID information in r0.1. When the shader code uses glPimitiveIdIn we will have to move this to a separate hardware register where we can map this attribute. This opcode takes the selected destination register and moves r0.1 there. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5c30da184514f7d20c033a0c4d1f99626adaddd4 |
|
17-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Generalize emit_urb_slot() to emit to any dst_reg. In gen7+ we emit vertices as they come, however in gen6 geometry shaders we have to buffer vertex data for all vertices and then emit it all in one go at the end. To achieve this we need to generalize emit_urb_slot() to store vertex data in general purpose registers and not only MRF registers. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9b32fd0f704cf34172d3fd85934bfff7a6f77753 |
|
16-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Provide means to create registers of a given size. Implemented by Ilia Mirkin <imirkin@alum.mit.edu>. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f373b7ed820024080838742f419bbca5fcbde2bf |
|
17-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_SET_DWORD_2. We had GS_OPCODE_SET_DWORD_2_IMMED but this required its source argument to be an immediate. In gen6 we need to set dword 2 of the URB write message header from values stored in separate register, so we need something more flexible. This change replaces GS_OPCODE_SET_DWORD_2_IMMED with GS_OPCODE_SET_DWORD_2. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2c85132e511bbef9a0965c69848981b1bffb5bad |
|
09-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_URB_WRITE_ALLOCATE. Gen6 geometry shaders need to allocate URB handles for each new vertex they emit after the first (the URB handle for the first vertex is obtained via the FF_SYNC message). This opcode adds the URB allocation mechanism to regular URB writes. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d0bdd4ce983ddd52f9f4b70dced4e471c60a130c |
|
09-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_FF_SYNC. This implements the FF_SYNC message required in gen6 geometry shaders to get the initial URB handle. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1ee1d8ab468cafd25cfcc513319f3f046492947f |
|
31-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Reswizzle sources when necessary. Despite the comment above the function claiming otherwise, the function did not reswizzle sources, which would lead to bad code generation since commit 04895f5c, which began claiming we could do such swizzling when we could not. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82932 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5598458e69232dcab9500717edbbf88085223529 |
|
16-Jun-2014 |
Abdiel Janulgue <abdiel.janulgue@linux.intel.com> |
i965/vec4: Remove try_emit_saturate Now that saturate is implemented natively as an instruction, we can cut down on unneeded functionality. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ddc1d297bcb219eea176f72d48d15fe5e3333c99 |
|
29-Aug-2014 |
Abdiel Janulgue <abdiel.janulgue@linux.intel.com> |
i965/vec4: inline generate_vec4_instruction() within generate_code() Suggested by Matt. This patch combines and moves back the code-generation functions from generate_vec4_instruction() into generate_code(). Makes generate_code() a bit larger, but helps us to count loops in a straightforward manner. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e0aa45768c6bda947b645ae6962054673937a55f |
|
13-Jul-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Add invalidate_cfg parameter to invalidate_live_intervals(). Will let us avoid invalidating the CFG if the optimization pass has removed instructions using the new basic block methods. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9a071e3339afcf6fd937ae31121fa3b3face3bfe |
|
18-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add a pass to reduce swizzles. total instructions in shared programs: 4344280 -> 4288033 (-1.29%) instructions in affected programs: 397468 -> 341221 (-14.15%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a3d0ccb037082f3aa66bd558dfbe89f63a6eedd3 |
|
12-Jul-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Pass a cfg pointer to generate_{code,assembly}. The loop over all instructions is now two-fold, over all of the blocks and all of the instructions in each block. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2cd6169e9298e75e4f71c358471b80eb8bf19f11 |
|
09-Aug-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4: Add support for nonconst sampler indexing in VS visitor V2: Set force_writemask_all on ADD; this *is* necessary in the VS case too. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8c229d306b3f312adbdfbaf79967ee43fbfc839e |
|
11-Aug-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Delete the Gen8 code generators. We now use the brw_eu_emit.c code instead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
298da9fa2adba3f0f4c89220c696684937016f7c |
|
04-Aug-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4/Gen8: Use src1 for sampler_index instead of ->sampler field Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
6be68767b9b5344d5753b8909f5ec8f57309b71a |
|
04-Aug-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4/Gen4-7: Use src1 for sampler_index instead of ->sampler field Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1a3fd11aefdf6ed327f633ea7e13bae2e8a92ca7 |
|
04-Aug-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4: Pass sampler index in src1 for texture ops Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bbd5dd5226f01e4cb5b69eb98d052f8e0e332e75 |
|
07-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Remove unused emit_bool_comparison method. Apparently unused since it was added in commit af3c9803. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
aba15d93a64e4f6619f641e252a7bc6c43442a29 |
|
12-Jul-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Move aeb list into opt_cse_local. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3c8dc48ad1d4061a2a1d0b9ea3126350b98274f0 |
|
06-Mar-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Add basic common subexpression elimination. [mattst88]: Modified to perform CSE on instructions with the same writemask. Offered no improvement before. total instructions in shared programs: 1995633 -> 1995185 (-0.02%) instructions in affected programs: 14410 -> 13962 (-3.11%) Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
423932791d0e4bbae28f3557659f031d3b2ac980 |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Rename intel_asm_printer -> intel_asm_annotation. The #ifndef include guards already said the right thing :) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ce706b4a9bd53fbe274687025965333541a0e70d |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Make a brw_predicate enum. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
46e5b2a497216133be656b38ebfcf96da64b7744 |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Make a brw_conditional_mod enum. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ab74a42eef781b05bab2c67acbd37484f0e3aa2f |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Move common fields into backend_instruction. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3de11cacf0cb307ff3b4130746732d9db73d7583 |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use enum brw_reg_type for register types. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
34ef6a7651d6651e0bca77c4d4b890af582ad360 |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Move is_zero/one/null/accumulator into backend_reg. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c019105f3742b39ba6913235f85ddfb327a39d12 |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Make a common backend_reg class. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9377b189f75e1cc440b7e2ef955cb1700c486887 |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Drop imm union from visitor register classes. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
35b741c8e74cf7c6a99d513c1fd01477545a172d |
|
28-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Pass const references to instruction functions. text data bss dec hex filename 4231165 123200 39648 4394013 430c1d i965_dri.so 4186277 123200 39648 4349125 425cc5 i965_dri.so Cuts 43k of .text and saves a bunch of useless struct copies. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d35f34cea9558c23700532d4a7142dab2cc342a8 |
|
28-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Pass const references to vec4_instruction(). text data bss dec hex filename 4244821 123200 39648 4407669 434175 i965_dri.so 4231165 123200 39648 4394013 430c1d i965_dri.so Cuts 13k of .text and saves a bunch of useless struct copies. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d5432e3f45ee3c5b5b824ad941a40c01025a275d |
|
24-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Make try_copy_propagate() static. Now that can_do_source_mods() isn't part of the visitor, this doesn't need to be either. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7526df70ea249c26332c35017f7a810332b2deee |
|
24-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Rename try_copy/constant_propagat{ion,e} to match the fs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
46659d46a8c2f7bbc8deb472faff2dccbde92d29 |
|
24-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Make can_do_source_mods() a member of the instruction classes. Pretty nonsensical to have it as a method of the visitor just for access to brw. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
63117ac32948310c87e30f67b475a07f98884633 |
|
11-Jun-2014 |
Ian Romanick <ian.d.romanick@intel.com> |
i965/vec4: Emit smarter code for b2f of a comparison Previously we would emit the comparison, emit an AND to mask off extra bits from the comparison result, then convert the result to float. Now, do the comparison, then use a cleverly constructed SEL to pick either 0.0f or 1.0f. No piglit regressions on Ivybridge. total instructions in shared programs: 1642311 -> 1639449 (-0.17%) instructions in affected programs: 136533 -> 133671 (-2.10%) GAINED: 0 LOST: 0 Programs that are affected appear to save between 1 and 5 instuctions (just by skimming the output from shader-db report.py. v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes). Remove extraneous fix_3src_operand (suggested by Matt). The latter change required swapping the order of the operands and using predicate_inverse. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f3ddd71f2878e42d2c9e927bd5f695a62b357c58 |
|
07-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Combine generate_math[12]_gen6 methods. These are trivial to combine: we should just avoid checking the second operand if it's brw_null_reg. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5260a26e927df2bda7059b170c007a03da65b03b |
|
07-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Drop the generate_math2_gen7() method. It's now a single line of code, so we may as well fold it into the caller. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7b9cf797903a5ea70072a28c0486d3e99ee60645 |
|
06-Mar-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Make src_reg::equals() take a constant reference, not a pointer. This is more typical C++ style. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
92b055625da1b8d9144bf746ac67210df7deba73 |
|
25-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Move annotation info into generate code. Suggested by Ken as a way to cut down lines of code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
56d6dcf4f771d57d2759b2a5c5006f24444c696f |
|
29-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Give dump_instruction() a FILE* argument. Use function overloading rather than default arguments, since gdb doesn't know about default arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f0f7fb181fc267934a44904da4530f50a698b18d |
|
19-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Print disassembly after compaction. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
85ce2242cb9adf6bbf32f74e7578c66f426e8fc8 |
|
18-Apr-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Port untyped atomic message support to Broadwell. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
45367d2d092266e60d6f4a5b8f17d5a410cdfee9 |
|
18-Apr-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Port untyped surface reads support to Broadwell. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
69fd0551661797d89fe339ea3310c9e735a651d5 |
|
18-Apr-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Drop mark_surface_used from gen8 generators. Francisco made brw_mark_surface_used a freestanding function in commit a32817f3c248125fb537c3a915566445e5600d45. We should use it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9565392031d96e21ebe21dbf7f2ef55958c674db |
|
28-Apr-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Remove 'mul_arg' from try_emit_mad(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
30c35d1dcb2fde19b1c968751fda5151b795d257 |
|
09-Apr-2014 |
Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> |
i965: Add is_accumulator() function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
10dd6eca89951e0cb40e21c3b53caa33d8fcb383 |
|
13-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add is_null() method to dst_reg. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a6367dfc157ecc71a686955323526ac4de3652f6 |
|
12-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Rename depends_on_flags() to reads_flag(). To be consistent with the fs backend. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
de4692f56cc566e0f6bd979dd2e7c88a0efde7e6 |
|
12-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add and use vec4_instruction::writes_flag(). To be consistent with the fs backend. Also the instruction scheduler incorrectly considered SEL with a conditional modifier to read the flag register. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
b0d3205c2a676d9eeda72335ef61ce3f0bddc63a |
|
12-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add missing doxygen close brace. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a290cd039cc07330598a101e74d25289ce70bcee |
|
18-Feb-2014 |
Topi Pohjolainen <topi.pohjolainen@intel.com> |
i965: Merge resolving of shader program source Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9b169a18939d1bde8db415a001b5c57259231546 |
|
15-Feb-2014 |
Topi Pohjolainen <topi.pohjolainen@intel.com> |
i965/vec4: Mark invariant members as constants in vec4_visitor Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7189fce237cc7f4bc76a85cca8bcf75756d9affc |
|
27-Feb-2014 |
Petri Latvala <petri.latvala@intel.com> |
i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically. v2: Don't add function parameters, pass the required size in prog_data->nr_params. v3: - Use the name uniform_array_size instead of uniform_param_count. - Round up when dividing param_count by 4. - Use MAX2() instead of taking the maximum by hand. - Don't crash if prog_data passed to vec4_visitor constructor is NULL v4: Rebase for current master v5 (idr): Trivial whitespace change. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71254 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
56879a7ac41b8c7513a97cc02921f76a2ec8407c |
|
24-Feb-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Handle ir_triop_lrp on Gen4-5 as well. When the vec4 backend encountered an ir_triop_lrp, it always emitted an actual LRP instruction, which only exists on Gen6+. Gen4-5 used lower_instructions() to decompose ir_triop_lrp at the IR level. Since commit 8d37e9915a3b21 ("glsl: Optimize open-coded lrp into lrp."), we've had an bug where lower_instructions translates ir_triop_lrp into arithmetic, but opt_algebraic reassembles it back into a lrp. To avoid this ordering concern, just handle ir_triop_lrp in the backend. The FS backend already does this, so we may as well do likewise. v2: Add a comment reminding us that we could emit better assembly if we implemented the infrastructure necessary to support using MAC. (Assembly code provided by Eric Anholt). Cc: "10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75253 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
008338bc4e2d9cc5931b9968d019619c09392389 |
|
25-Jan-2014 |
Jordan Justen <jordan.l.justen@intel.com> |
i965: support gl_InvocationID for gen7 v2: * Make gl_InvocationID a system value v3: * Properly shift from R0.1 into DST.4 by adding GS_OPCODE_GET_INSTANCE_ID Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3b032732753b18c84482e30dd3675403eec7919f |
|
29-Nov-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Trivial improvements to the with_writemask() function. Add assertion that the register is not in the HW_REG or IMM file, calculate the conjunction of the old and new mask instead of replacing the old [consistent with the behavior of brw_writemask(), causes no functional changes right now], make it static inline to let the compiler do a slightly better job at optimizing things, and shorten its name. v2: Assert that the new writemask is not zero to avoid undefined hardware behaviour. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
42b226ef824ed61ccf51fa9a1198cba305ad5472 |
|
19-Feb-2014 |
Francisco Jerez <currojerez@riseup.net> |
i965: Make sure that backend_reg::type and brw_reg::type are consistent for fixed regs. And define non-mutating helper functions to retype fixed and normal regs with a common interface. At some point we may want to get rid of ::fixed_hw_reg completely and have fixed regs use the normal register data members (e.g. backend_reg::reg to select a fixed GRF number, src_reg::swizzle to store the swizzle, etc.), I have the feeling that this is not the last headache we're going to get because of the multiple ways to represent the same thing and the different register interface depending on the file a register is stored in... Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
98306e727b8291507ff4fd5dd5c4806f3fed9202 |
|
19-Feb-2014 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Add non-mutating helper functions to modify src_reg::swizzle and ::negate. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2337820d49149126991d0814b225db7b57789016 |
|
19-Feb-2014 |
Francisco Jerez <currojerez@riseup.net> |
i965: Add non-mutating helper functions to modify the register offset. Yes, we could avoid having four copies of essentially the same code by using templates here. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a32817f3c248125fb537c3a915566445e5600d45 |
|
27-Nov-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965: Unify fs_generator:: and vec4_generator::mark_surface_used as a free function. This way it can be used anywhere. I need it from the visitor. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ae8b066da5862b4cfc510b3a9a0e1273f9f6edd4 |
|
19-Feb-2014 |
Francisco Jerez <currojerez@riseup.net> |
i965: Move up duplicated fields from stage-specific prog_data to brw_stage_prog_data. There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and the simplification of the stage-specific brw_*_prog_data_compare. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7f00c5f1a3e0db20a89cfedefd53cbe817fec9e3 |
|
23-Nov-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Add constructor of src_reg from a fixed hardware reg. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
b424da4be07ab8d34986e6f3824c679b623df952 |
|
28-Nov-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Fix confusion between SWIZZLE and BRW_SWIZZLE macros. Most of the VEC4 back-end agrees on src_reg::swizzle being one of the BRW_SWIZZLE macros defined in brw_reg.h, except in two places where we use Mesa's SWIZZLE macros. There is even a doxygen comment saying that Mesa's macros are the right ones. They are incompatible swizzle representations (3 bits vs. 2 bits per component), and the code using Mesa's works by pure luck. Fix it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
31d1077dd2f0fec34ac221168943cecc8c9afbf0 |
|
03-Feb-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4: Emit shader w/a for Gen6 gather Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9cee3ff562f3e4b51bfd30338fd1ba7716ac5737 |
|
22-Jan-2014 |
Paul Berry <stereotype441@gmail.com> |
i965: Remove *_generator::shader field; use prog field instead. The "shader" field in fs_generator, vec4_generator, and gen8_generator was only used for one purpose; to figure out if we were compiling an assembly program or a GLSL shader (shader is NULL for assembly programs). And it wasn't being used properly: in vec4 shaders we were always initializing it based on prog->_LinkedShaders[MESA_SHADER_FRAGMENT], regardless of whether we were compiling a geometry shader or a vertex shader. This patch simplifies things by using the "prog" field instead; this is also NULL for assembly programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a4d68e9ee94cf4855a3240c3516279b4e7740268 |
|
17-Jan-2014 |
Paul Berry <stereotype441@gmail.com> |
i965: Add GS support to INTEL_DEBUG=shader_time. Previously, time spent in geometry shaders would be counted as part of the vertex shader time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9eb568d7531eb4715be24d5076353ea6c10c8ceb |
|
07-Dec-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Create a new vec4 backend for Broadwell. This replaces the old vec4_generator backend. v2: Port to use the C-based instruction representation. Also, remove Geometry Shader offset hacks - the visitor will handle those instead of this code. v3: Texturing fixes (including adding textureGather support). v4: Pass brw_context to gen8_instruction functions as required. v5: Add SHADER_OPCODE_TXF_MCS support; port DUAL_INSTANCED gs fixes (caught by Eric). Simplify ADDC/SUBB handling; add comments to gen8_set_dp_message calls (suggested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3122c2421a46207e06481ade5ccd5c864d13091a |
|
30-Nov-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vs: Sample from MCS surface when required Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1c263f8f4f767df0511e63377c17a95ebebba944 |
|
11-Nov-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add invalidate_live_intervals method. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ba885c30c74f9efc94743d4582d30a0e70924b97 |
|
26-Sep-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/gen7: Handle atomic instructions from the VEC4 back-end. This can deal with all the 15 32-bit untyped atomic operations the hardware supports, but only INC and PREDEC are going to be exposed through the API for now. v2: Represent atomics as GLSL intrinsics. Add support for variably indexed atomic counter arrays. v3: Add comment on why we don't need to assign uniform storage for atomic counters. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5e621cb9fef7eada5a3c131d27f5b0b142658758 |
|
11-Sep-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/gen7: Implement code generation for untyped surface read instructions.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
cfaaa9bbb7a6ab5819f4fa9e38352b72d6293cff |
|
11-Sep-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/gen7: Implement code generation for untyped atomic instructions. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
34cba13ef822faebbb1f10f1400f87fa9bf70d60 |
|
16-Oct-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Add the ability to suppress register spilling. In future patches, this will allow us to first try compiling a geometry shader in DUAL_OBJECT mode (which is more efficient but uses more registers) and then if spilling is required, fall back on DUAL_INSTANCED mode. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8bb15813e3047820a95724e4257aa2c862eeb31a |
|
16-Oct-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Add the ability for attributes to be interleaved. When geometry shaders are operated in "single" or "dual instanced" mode, a single set of geometry shader inputs is interleaved into the thread payload (with each payload register containing a pair of inputs) in order to save register space. This patch modifies vec4_visitor::lower_attributes_to_hw_regs so that it can handle the interleaved format. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e0f34301b29ecf3fb7118b2e05872510c104a49b |
|
23-Oct-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Extract function to set up vec4 prog key for precompiling. This will allow us to re-use it for precompiling geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
068df64ba6a8309427612836e5eb384721ca6d40 |
|
23-Oct-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Remove uses_clip_distance from program key. This should never have been in the program key in the first place, since it's determined by the shader source, not by GL state. Change the code to just refer to gl_program::UsesClipDistanceOut directly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
705a90e30435490c2de84f4f6741cab335fa7608 |
|
03-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965: Move the common binding table offset code to brw_shader.cpp. Now that both vec4 and fs are dynamically assigning offsets, a lot of the code is the same. v2: Avoid passing around the next offset through the class. (Review by Paul) Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3c9dc2d31b80fc73bffa1f40a91443a53229c8e2 |
|
02-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965: Make a brw_stage_prog_data for storing the SURF_INDEX information. It would be nice to be able to pack our binding table so that programs that use 1 render target don't upload an extra BRW_MAX_DRAW_BUFFERS - 1 binding table entries. To do that, we need the compiled program to have information on where its surfaces go. v2: Rename size to size_bytes to be more explicit. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5463b5bbbdf133986ac89fd6afdf2bc9622e3ca6 |
|
03-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965: Always have the struct gl_program * in the backend visitor. vec4 already had it, so put it in the FS, too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
b645913ff6c74228d8c05dd236a545ef2e734071 |
|
28-Sep-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Remove the "ARF" register file. The registers in the architecture register file don't share much in common, so there's no point in grouping them together. Use the HW_REG class instead. The vec4 backend already does this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
014cce3dc49f5b0bfd7fbb1940ed661c9fc7bbd7 |
|
19-Sep-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Generate code for ir_binop_carry and ir_binop_borrow. Using the ADDC and SUBB instructions on Gen7. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4ec37317c55ee6be1a5988867aaeb8e9b3f02892 |
|
19-Sep-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Add UD null register helpers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4ed3930f9741721f4ace2d008678c0c88fdcc501 |
|
31-Mar-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vs: Add support for ir_tg4 Pretty much the same as the FS case. Channel select goes in the header, V2: Less mangling. V3: Avoid sampling at all, for degenerate swizzles. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
70953b5fea1445fe121ac4b4a816c984742f2e19 |
|
12-Sep-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965: Initialize all member variables of vec4_instruction on construction. The vec4_instruction object relies on the memory allocator zeroing out its contents before it's initialized, which is quite an unusual practice in the C++ world because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Initialize all fields from the constructor and stop using the zeroing allocator. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4f1ebb8ddd0294698601a8c4fc38f1e39bfd51f6 |
|
18-Sep-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965, mesa: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS macros. These classes declared a placement new operator, but didn't declare a delete operator. Switching to the macro gives them a delete operator, which probably is a good idea anyway. This also eliminates a lot of boilerplate. v2: Properly use RZALLOC in Mesa IR/TGSI translators. Caught by Eric and Chad. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
564a900a4539996b139b8d7618a40b22bbad1290 |
|
21-Apr-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Add the ability to emit opcodes with just a dst register. This is needed for GS_OPCODE_PREPARE_CHANNEL_MASKS. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
6ced0fa57f1ad308b8cdb0ad7ccb9dffb30ad107 |
|
21-Apr-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/gs: Add opcodes needed for EndPrimitive(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e241e7c979ba2fc558caaeebf7be84f5c705022a |
|
01-Sep-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Make with_writemask() non-static. This will allow it to be shared between brw_vec4_visitor.cpp and brw_vec4_vs_visitor.cpp (which will be created in the next patch). Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8f9a339c10c6a0904c0fbdfdcc7a65696d7246e9 |
|
01-Sep-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Move vs-specific code out of brw_vec4.h. Now brw_vec4.h contains only code that is shared between the vertex and geometry shaders. Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4929be0b5fad98f6f7303a07dd24e4cf6f417467 |
|
02-Aug-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vs: Add support for translating ir_triop_fma into MAD. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7aaaa8bc8fb851a4783292a5e1ffdfeda2451dae |
|
22-Aug-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Expose the payload registers to the register allocator. For now, nothing else can get allocated over them. That may change at some point in the future. This also means that base_reg_count can be computed without knowing the number of registers used for the payload, which is required if we want to allocate the register set once at context creation time. See commit 551e1cd44f6857f7e29ea4c8f892da5a97844377, which implemented virtually identical code in the FS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
cfe39ea14edc8db13c549b853b214e676f8276f1 |
|
23-Aug-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Allow C++ type safety in the use of enum brw_urb_write_flags. (From a suggestion by Francisco Jerez) If an enum represents a bitfield of flags, e.g.: enum E { A = 1, B = 2, C = 4, D = 8, }; then C++ normally prohibits statements like this: enum E x = A | B; because A and B are implicitly converted to ints before OR-ing them, and an int can't be stored in an enum without a type cast. C, on the other hand, allows an int to be implicitly converted to an enum without casting. In the past we've dealt with this situation by storing flag bitfields as ints. This avoids ugly casting at the expense of some type safety that C++ would normally have offered (e.g. we get no warning if we accidentally use the wrong enum type). However, we can get the best of both worlds if we override the | operator. The ugly casting is confined to the operator overload, and we still get the benefit of C++ making sure we don't use the wrong enum type. v2: Remove unnecessary comment and unnecessary use of "enum" keyword. Use static_cast. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
612226c43b072eb45dc3ed21484054824e1c863c |
|
23-Aug-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Remove redundant (and uninitialized) field vec4_generator::ctx. We never noticed that this field was uninitialized because it is only used in an error path that reports internal Mesa errors. But it's silly to have it around anyway because &brw->ctx is equivalent. Should fix Coverity defect CID 1063351: Uninitialized pointer field (UNINIT_CTOR) /src/mesa/drivers/dri/i965/brw_vec4_emit.cpp: 148 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
35bdd552d5beb31e9b8319986c8f78d762c1228c |
|
19-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/gs: Add GS_OPCODE_SET_DWORD_2_IMMED. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7417eddea9969cf09f36b05f218a59b22c076f0c |
|
23-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/gs: Add GS_OPCODE_SET_VERTEX_COUNT. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ce722fd65dde02ed823cbc0b19863cae8d8f6ee2 |
|
23-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/gs: Add GS_OPCODE_SET_WRITE_OFFSET. v2: Added a comment to vec4_generator::generate_gs_set_write_offset(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4416cb79926f089ff55dbbb352b94ec2890ae823 |
|
23-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/gs: Add GS_OPCODE_THREAD_END. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
96eb2f353605b382cf4fc930cc1d322130b12270 |
|
21-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/gs: Add GS_OPCODE_URB_WRITE. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a9e8c10bd76f9a94b878b76bb5ae696beeaae2e0 |
|
11-Aug-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Combine 4 boolean args of brw_urb_WRITE into a flags bitfield. The arguments to brw_urb_WRITE() were getting pretty unwieldy, and we have to add more flags to support geometry shaders anyhow. Also plumb these flags through brw_clip_emit_vue(), brw_set_urb_message(), and the vec4_instruction class. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7f57101ad53112b16e4a400f312f68a85dfbd108 |
|
13-Jul-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Virtualize setup_payload instead of setup_attributes. When I initially generalized the vec4_visitor class in preparation for geometry shaders, I assumed that the setup_attributes() function would need to be different between vertex and geometry shaders, but its caller, setup_payload(), could be shared. So I made setup_attributes() a virtual function. It turns out this isn't true; setup_payload() needs to be different too, since the geometry shader payload sometimes includes an extra register (primitive ID) that has to come before uniforms. So setup_payload() needs to be the virtual function instead of setup_attributes(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
626495d269e2c2df9dae5c46c086ffff93c77a19 |
|
13-Jul-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Allow for dispatch_grf_start_reg to vary. Both 3DSTATE_VS and 3DSTATE_GS have a dispatch_grf_start_reg control, which determines the register where the hardware delivers data sourced from the URB (push constants followed by per-vertex input data). For vertex shaders, we always set dispatch_grf_start_reg to 1, since R1 is always the first register available for push constants in vertex shaders. For geometry shaders, we'll need the flexibility to set dispatch_grf_start_reg to different values depending on the behvaiour of the geometry shader; if it accesses gl_PrimitiveIDIn, we'll need to set it to 2 to allow the primitive ID to be delivered to the thread in R1. This patch eliminates the assumption that dispatch_grf_start_reg is always 1. In vec4_visitor, we record the regnum that was passed to vec4_visitor::setup_uniforms() in prog_data for later use. In vec4_generator, we consult this value when converting an abstract UNIFORM register to a concrete hardware register. And in the code that emits 3DSTATE_VS, we set dispatch_grf_start_reg based on the value recorded in prog_data. This will allow us to set dispatch_grf_start_reg to the appropriate value when compiling geometry shaders. Vertex shaders will continue to always use a dispatch_grf_start_reg of 1. v2: Make dispatch_grf_start_reg "unsigned" rather than "GLuint". Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
72168f5f0069b2a0d8a2434ba80f4446952e84c7 |
|
15-Aug-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Move vec4 data structures and functions to brw_vec4.{cpp,h}. This patch moves the following things into brw_vec4.{cpp,h}: - struct brw_vec4_compile - struct brw_vec4_prog_key - brw_vec4_prog_data_compare() - brw_vec4_prog_data_free() This will allow us to avoid having to include brw_vs.h in geometry-shader-specific files. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e556286802811b4b99c692d1ff5197f8ee1f011b |
|
21-Aug-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Make brw_{shader,vec4}.h safe to include from C. The patch that follows will move the definition of struct brw_vec4_prog_key from brw_vs.h to brw_vec4.h, making it necessary for brw_vs.h to include brw_vec4.h (because brw_vs.h defines struct brw_vs_prog_key, which contains brw_vec4_prog_key as a member). Since brw_vs.h is included from C source files, that means that brw_vec4.h will need to be safe to include from C. Same for brw_shader.h, since it is included by brw_vec4.h. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5fb13d871e062354a77a427b3a3fe7f3d6908e5a |
|
20-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Stop including brw_vs.h from brw_vec4.h. This is backwards from what we are going to want in the long term, which is: - brw_vec4.h declares general-purpose vec4 infrastructure needed by both VS and GS - brw_vs.h includes brw_vec4.h and adds VS-specific parts. - brw_gs.h includes brw_vec4.h and adds GS-specific parts. Note that at the moment brw_vec.h contains a fair amount of VS-specific declarations--I plan to address that in a later patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7e9559c9ba4dd82aca83b08d039103e38a3f94be |
|
15-Aug-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Rework binding table size calculation. Unlike the FS, the VS backend already computed the binding table size. However, it did so poorly: after compilation, it looked to see if any pull constants/textures/UBOs were in use, and set num_surfaces to the maximum surface index for that category. If the VS only used a single texture or UBO, this overcounted by quite a bit. The shader time surface was also noted at state upload time (during drawing), not at compile time, which is inefficient. I believe it also had an off by one error. This patch computes it accurately, while also simplifying the code. It also renames num_surfaces to binding_table_size, since num_surfaces wasn't actually the number of surfaces used. For example, a VS that used one UBO and no other surfaces would have set num_surfaces to SURF_INDEX_VS_UBO(1) == 18, rather than 1. A bit of a misnomer there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c642bd3dcc1a6f1039732c614ab8a56dd3779427 |
|
15-Aug-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Plumb brw_vec4_prog_data into vec4_generator(). This will be useful for the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a9be50f77675a70a44d231fc1f7fa85f875c5153 |
|
07-Aug-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: add new VS_OPCODE_UNPACK_FLAGS_SIMD4X2 Splits the bottom 8 bits of f0.0 for further wrangling in a SIMD4x2 program. The 4 bits corresponding to the channels in each program flow are copied to the LSBs of dst.x visible to each flow. This is useful for working with clipping flags in the VS. V3: - Fixup immediate types - Teach scheduler about the hidden dep on flags Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> V2: Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9e2c1e28a14bb7c5ec49d6e7638b07a9e03ddca9 |
|
15-Aug-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vs: add vec4_instruction::depends_on_flags We're about to have an instruction that depends on the flags but isn't predicated. This lays the groundwork. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
972e2f11c073e71d4c57b005ae1f906d96714849 |
|
07-Jul-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vs: Do legacy clip lowering earlier We need to produce clip flags for the vertex header on Gen4/5, so clip plane lowering has to be done before we try to emit the flags/psiz attribute. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ae6eba3e32a142665d2ae6e15c9122d3201c0b5d |
|
15-Feb-2013 |
Bryan Cain <bryancain3@gmail.com> |
glsl: add ir_emit_vertex and ir_end_primitive instruction types These correspond to the EmitVertex and EndPrimitive functions in GLSL. v2 (Paul Berry <stereotype441@gmail.com>): Add stub implementations of new pure visitor functions to i965's vec4_visitor and fs_visitor classes. v3 (Paul Berry <stereotype441@gmail.com>): Rename classes to be more consistent with the names used in the GL spec. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8c9a54e7bcfc80295ad77097910d35958dfd3644 |
|
06-Jul-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Delete intel_context entirely. This makes brw_context inherit directly from gl_context; that was the only thing left in intel_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
86f2711722dc10c25c2fabc09d8bd020a1ba6029 |
|
03-Jul-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Remove pointless intel_context parameter from try_copy_propagate. It's already part of the visitor class. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
263a7e4cd992738814575b04d2de24ca0a0ad08a |
|
06-Jun-2013 |
Eric Anholt <eric@anholt.net> |
i965/vs: Use the MAD instruction when possible. This is different from how we do it in the FS - we are using MAD even when some of the args are constants, because with the relatively unrestrained ability to schedule a MOV to prepare a temporary with that data, we can get lower latency for the sequence of instructions. No significant performance difference on GLB2.7 trex (n=33/34), though it doesn't have that many MADs. I noticed MAD opportunities while reading the code for the DOTA2 bug. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
0f3068a58bdbceb2cb93e3848b0e2145629cdf43 |
|
01-May-2013 |
Eric Anholt <eric@anholt.net> |
i965/vs: Make virtual grf live intervals actually cover their used range. This is the same change as the previous commit to the FS. A very few VSes are regressed by 1 or 2 instructions, which look recoverable with a bit more dead code elimination. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
acd2bccd852f1e4edbac2e57dd47139908e79b5d |
|
18-Apr-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vs: Add support for bit instructions. v2: Rebase on LRP addition. Use fix_3src_operand() when emitting BFE and BFI2. Add BFE and BFI2 to is_3src_inst check in brw_vec4_copy_propagation.cpp. Subtract result of FBH from 31 (unless an error) to convert MSB counts to LSB counts Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
573d8813fdbb116f4500d2044c56d80aab73ab7f |
|
01-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add instruction scheduling. While this is ignorant of dependency control, it's still good for a 0.39% +/- 0.08% performance improvement on GLBenchmark 2.7 (n=548) v2: Rewrite as a subclass of the base class for the FS instruction scheduler, inheriting the same latency information. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ab04f3b2d74af061a0d2ebf3d1a02d8fcf73ff09 |
|
30-Apr-2013 |
Eric Anholt <eric@anholt.net> |
i965: Share the register file enum between the two backends. I need this so I can look at vec4 and fs registers' files from the same .cpp file without namespaces. As far as I can tell we never rely on the particular numerical values of the files, though I thought it sounded like a good idea when doing the VS (it turns out having 0 be BAD_FILE is nicer). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
63c8155b09bca7917631ec678a0d0db6e7965a1a |
|
29-Apr-2013 |
Eric Anholt <eric@anholt.net> |
i965: Make dump_instructions be a virtual method of the visitor. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5e46482993dfd30b888d5219f6fecf4b4d1f42de |
|
28-Apr-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move is_math/is_tex/is_control_flow() to backend_instruction. These are entirely based on the opcode, which is available in backend_instruction. It makes sense to only implement them in one place. This changes the VS implementation of is_tex() slightly, which now accepts FS_OPCODE_TXB and SHADER_OPCODE_LOD. However, since those aren't generated in the VS anyway, it should be fine. This also makes is_control_flow() available in the VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
0c1d87b0d7e2c9f1ae6e838a8fa7f074557e45f0 |
|
25-Apr-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vs: Add support for LRP instruction. Only 13 affected programs in shader-db, but they were all helped. total instructions in shared programs: 368877 -> 368851 (-0.01%) instructions in affected programs: 1576 -> 1550 (-1.65%) Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c0f67a127b0b3e4bb715d1562a82c984d160280e |
|
25-Apr-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vs: Add a function to fix-up uniform arguments for 3-src insts. Three-source instructions have a vertical stride overloaded to 4, which prevents directly using vec4 uniforms as arguments. Instead we need to insert a MOV instruction to do the replication for the three-source instruction. With this in place, we can use three-source instructions in the vertex shader. While some thought needs to go into deciding whether its better to use a three-source instruction rather than a sequence of equivalent instructions (when one or more sources are uniforms or immediates), this will allow us to skip a lot of ugly lowering code and use the BFE and BFI2 instructions directly. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e9fa3a94486d80da34542cfd24425c208a8d30fe |
|
23-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Don't hardcode DEBUG_VS in generic vec4 code. Since the vec4_visitor and vec4_generator classes are going to be re-used for geometry shaders, we can't enable their debug functionality based on (INTEL_DEBUG & DEBUG_VS) anymore. Instead, add a debug_flag boolean to these two classes, so that when they're instantiated the caller can specify whether debug dumps are needed. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
defdb310b76ad30c192a087292e86377f4ea0d83 |
|
22-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Generalize computation of array strides in preparation for GS. Geometry shader inputs are arrays, but they use an unusual array layout: instead of all array elements for a given geometry shader input being stored consecutively, all geometry shader inputs are interleaved into one giant array. As a result, the array stride we use to access geometry shader inputs must be equal to the size of the input VUE, rather than the size of the array element. This patch introduces a new virtual function, vec4_visitor::compute_array_stride(), which will allow geometry shader compilation to specialize the computation of array stride to account for the unusual layout of geometry shader input arrays. It also renames the local variable that the ir_dereference_array visitor uses to store the stride, to avoid confusion. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
444fce6398556118629ef01204a7d8ff7af0bea3 |
|
22-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Generalize attribute setup code in preparation for GS. This patch introduces a new function, vec4_visitor::lower_attributes_to_hw_regs(), which replaces registers of type ATTR in the instruction stream with the hardware registers that store those attributes. This logic will need to be common between the vertex and geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
28fe02ce6e6fa6061cf69af9b292ee6553591473 |
|
22-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Generalize vertex emission code in preparation for GS. This patch introduces a new function, vec4_visitor::emit_vertex(), which contains the code for emitting vertices that will need to be common between the vertex and geometry shaders. Geometry shaders will need to use a different message header, and a different opcode, for their URB writes, so we introduce virtual functions emit_urb_write_header() and emit_urb_write_opcode() to take care of the GS-specific behaviours. Also, since vertex emission happens at the end of the VS, but in the middle of the GS, we need to be sure to only call emit_shader_time_end() during VS vertex emission. We accomplish this by moving the call to emit_shader_time_end() into the VS implementation of emit_urb_write_opcode(). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7214451bdce6d553620d2b9b3f1f89d14b113357 |
|
17-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: rename vec4_generator::generate_vs_instruction. Since this function is going to get used for geometry shaders too, it deserves a more generic name: generate_vec4_instruction. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9bb6840b28a9a77377d437198c62d705cade5370 |
|
17-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Generalize data structures pointed to by vec4_generator. This patch removes the following field from vec4_generator, since it is not used: - struct brw_vs_compile *c And changes the following field: - struct gl_vertex_program *vp => struct gl_program *prog With these changes, vec4_generator no longer refers to any VS-specific data structures. This will pave the way for re-using it for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Use the name "prog" rather than "p". Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4d773603d33f5628a7e7f407371187a650c3e6e5 |
|
09-Apr-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Rename vec4_generator::prog to shader_prog. The next patch is going to change the type of vec4_generator::vp from struct gl_vertex_program * to struct gl_program *, and rename it. The sensible name to change it to is vec4_generator::prog. However, prog is already used. Since the existing vec4_generator::prog is of type struct gl_shader_program, it makes sense to rename it to shader_prog. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
5743bea0ba1eda07be831d95c5b7729f9ba98a92 |
|
17-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: move VS-specific data members to vs_vec4_visitor. This patch moves the following data structures from vec4_visitor to vec4_vs_visitor, since they contain VS-specific data: - struct brw_vs_compile *c (renamed to vs_compile) - struct brw_vs_prog_data *prog_data (renamed to vs_prog_data) - src_reg *vp_temp_regs - src_reg vp_addr_reg Since brw_vs_compile and brw_vs_prog_data also contain vec4-generic data, the following pointers are added to the base class, to allow it to access the vec4-generic portions of these data structures: - struct brw_vec4_compile *c - struct brw_vec4_prog_key *key - struct brw_vec4_prog_data *prog_data Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> v2: Use shorter names in the base class and longer names in the derived class. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
0ce95222aff64a316b95c67ef427901ffbe3e061 |
|
16-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: move ARB_vertex_program functions to vec4_vs_visitor. This patch moves functions from vec4_visitor to vec4_vs_visitor that deal with ARB (assembly) vertex programs. There's no point in having these functions in the base class since we don't intend to support assembly programs for the GS stage. The following functions are moved: - setup_vp_regs - get_vp_dst_reg - get_vp_src_reg Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
42a3d63dd4470be73b92b5d87daa32a9c293f127 |
|
17-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Add virtual function make_reg_for_system_value(). The system values handled by vec4_visitor::visit(ir_variable *) are VS-specific (vertex ID and instance ID). This patch moves the handling of those values into a new virtual function, make_reg_for_system_value(), so that this VS-specific code won't be inherited by geomtry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8941f73c7ccb3c6cfa965a19f346e4b6ead6abdb |
|
17-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Make some vec4_visitor functions virtual. This patch makes the following vec4_visitor functions virtual, since they will need to be implemented differently for vertex and geometry shaders. Some of the functions are renamed to reflect their generic purpose, rather than their VS-specific behaviour: - setup_attributes - emit_attribute_fixups (renamed to emit_prolog) - emit_vertex_program_code (renamed to emit_program_code) - emit_urb_writes (renamed to emit_thread_end) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e9be5a05f70be7cff58b29bff07af71e6d339085 |
|
16-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Make vec4_vs_visitor class derived from vec4_visitor. This patch just creates the derived class; later patches will migrate VS-specific functions and data structures from the base class into the derived class. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
09cd6e06d2c7a54ca6eb8d3102822efa78e01a9c |
|
16-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Remove brw_vs_prog_data pointer from brw_vs_compile. In patches that follow, we'll be splitting structs brw_vs_prog_data and brw_vs_compile into a vec4-generic base struct and a VS-specific derived struct (this will allow the vec4-generic code to be re-used for geometry shaders). Having brw_vs_compile point to brw_vs_prog_data makes it difficult to do this cleanly. Fortunately most of the functions that use brw_vs_compile (those in the vec4_visitor class) already have access to brw_vs_prog_data through a separate pointer (vec4_visitor::prog_data). So all we have to do is use that pointer consistently, and plumb prog_data through the few remaining functions that need access to it. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
b29613371c316e9273ebe29ba37fb2f04c2ed58d |
|
16-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Make type of vec4_visitor::vp more generic. The vec4_visitor functions don't use any VS specific data from vec4_visitor::vp. So rename it to "prog" and change its type from struct gl_vertex_program * to struct gl_program *. This will allow the code to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Use the name "prog" rather than "p". Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
fe97f26c86d65b1b0e026c725c7da348a91093d9 |
|
09-Apr-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Rename backend_visitor::prog to shader_prog. The next patch is going to change the type of vec4_visitor::vp from struct gl_vertex_program * to struct gl_program *, and rename it. The sensible name to change it to is vec4_visitor::prog. However, prog is already used in backend_visitor (which vec4_visitor derives from). Since backend_visitor::prog is of type struct gl_shader_program *, it makes sense to rename it to shader_prog. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d5f7aebac2b1afbc5023cd114174860d8d763d06 |
|
04-Apr-2013 |
Eric Anholt <eric@anholt.net> |
i965/vs: Use GRFs for pull constant offsets on gen7. This allows the computation of the offset to get written directly into the message source. shader-db results: total instructions in shared programs: 3308390 -> 3283025 (-0.77%) instructions in affected programs: 442998 -> 417633 (-5.73%) No difference in GLB2.7 low res (n=9). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4fee05b020af72ee802d4349de76fbc36cdd53a9 |
|
01-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add a pass to set dependency control fields on instructions. This is a more aggressive version of the old brw_optimize() path. Reduces cycles spent in the vertex shader on minecraft by 18.6% +/- 10.0% (n=15). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
130138030a3dc8bda20766146ca9fda4047133d3 |
|
18-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Teach copy propagation about sends from GRFs. This incidentally also teaches it a bit about gen6 math -- we now allow unswizzled, unmodified GRF temps as the sources for math. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c3a22d42a88c299561dd913d0a00bb986921eeba |
|
18-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Prepare split_virtual_grfs() for the presence of SENDs from GRFs. v2: Fix silly bool handling, and don't add new tabs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d2ba1c24b440ee74436335d8e815be9b72b1ba7f |
|
19-Mar-2013 |
Eric Anholt <eric@anholt.net> |
i965: Track ARB program state along with GLSL state for shader_time. This will let us do much better printouts for non-GLSL programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8371c68a4b4c12f4dd75f82b8b29a624705910a5 |
|
23-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Rename BRW_VARYING_SLOT_MAX -> BRW_VARYING_SLOT_COUNT. The new name clarifies that it represents *one more* than the maximum possible brw_varying_slot value. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ec9c3882d949298366c872f766d3d18c6ae93f8e |
|
22-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Clarify nomenclature: vert_result -> varying This patch removes the terminology "vert_result" from the i965 driver, replacing it with "varying". The old terminology, "vert_result", was confusing because (a) it referred to the enum gl_vert_result, which no longer exists (it was replaced with gl_varying_slot), and (b) it implied a vertex output, but with the advent of geometry shaders, it could be either a vertex or a geometry output, depending what shaders are in use. The generic term "varying" is less confusing. No functional change. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Whitespace fixes.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
36b252e94724b2512ea941eff2b3a3abeb80be79 |
|
23-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
Replace gl_vert_result enum with gl_varying_slot. This patch makes the following search-and-replace changes: gl_vert_result -> gl_varying_slot VERT_RESULT_* -> VARYING_SLOT_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
203c12b18feb596999d9512e108408e72dd4ffd3 |
|
09-Jan-2013 |
Chad Versace <chad.versace@linux.intel.com> |
i965/vs/gen7: Emit code for GLSL ES 3.00 pack/unpack operations (v3) FIXME: This patch emits VS code that violates documented hardware restrictions and then relies on undocumented behavior that results from that violation. This patch passes all tests, but should be fixed ASAP to conform to the hardware documentation. v2: Explain undocumented hardware behavior. Improve comments. v3: Use ALU1 helper methods F32TO16() and F16TO32(). [for anholt] Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
aa6e35e80dddfe1011e845da6325d276263e2242 |
|
21-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Reference the core GL uniform storage for non-builtin uniforms. Like in the FS, there's no reason to use an external copy if the ParameterValues[] relayout of it isn't the layout we need. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d5efc14635cf25bc130bfa77737913913d9202ce |
|
21-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965: Add asserts to check that we don't realloc ParameterValues. Things are even more restrictive than they used to be, so I've made mistakes in this area. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c9e48e5b083b6cf97ecdb2d17c874ea631203b06 |
|
02-Aug-2012 |
Eric Anholt <eric@anholt.net> |
i965: Generalize VS compute-to-MRF for compute-to-another-GRF, too. No statistically significant performance difference on glbenchmark 2.7 (n=60). It reduces cycles spent in the vertex shader by 3.3% +/- 0.8% (n=5), but that's only about .3% of all cycles spent according to the fixed shader_time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
471af25fc57dc43a8277b4b17ec82547287621d0 |
|
01-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Extend opt_compute_to_mrf to handle limited "reswizzling" The way our visitor works, scalar expression/swizzle results that get stored in channels other than .x will have an intermediate MOV from their result in the .x channel to the real .y (or whatever) channel, and similarly for vec2/vec3 results. By knowing how to adjust DP4-type instructions for optimizing out a swizzled MOV, we can reduce instructions in common matrix multiplication cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1cb57ea493d892bf5065e5fb0c5dd745744cc71c |
|
09-Dec-2012 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vs: Fix gen6+ math operand quirks in one place This causes immediate values to get moved to a temp on gen7, which is needed for an upcoming change but hadn't happened in the visitor until then. v2: Drop gen > 7 checks (doesn't exist), and style-fix comments (changes by anholt). Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
338b5f887d462bbe7ef58a233cd00619e43415f0 |
|
10-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965: Adjust the split between shader_time_end() and shader_time_write(). I'm about to emit other kinds of writes besides time deltas, and it turns out with the frequency of resets, we couldn't really use the old time delta write() function more than once in a shader.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
71f06344a0d72a6bd27750ceca571fc016b8de85 |
|
27-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965: Add a debug flag for counting cycles spent in each compiled shader. This can be used for two purposes: Using hand-coded shaders to determine per-instruction timings, or figuring out which shader to optimize in a whole application. Note that this doesn't cover the instructions that set up the message to the URB/FB write -- we'd need to convert the MRF usage in these instructions to GRFs so that our offsets/times don't overwrite our shader outputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) v2: Check the timestamp reset flag in the VS, which is apparently getting set fairly regularly in the range we watch, resulting in negative numbers getting added to our 32-bit counter, and thus large values added to our uint64_t. v3: Rebase on reladdr changes, removing a new safety check that proved impossible to satisfy. Add a comment to the AOP defs from Ken's review, and put them in a slightly more sensible spot. v4: Check timestamp reset in the FS as well.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ef2fbf67d4bd941a9a0e1c6f8515fb4911e05c50 |
|
28-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965: Add a flag for instructions with normal writemasking disabled. For getting values from the new timestamp register, the channels we load have nothing to do with the pixels dispatched.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
eda9726ef51dcfd3895924eb0f74df8e67aa9c3a |
|
27-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Split final assembly code generation out of vec4_visitor. Compiling shaders requires several main steps: 1. Generating VS IR from either GLSL IR or Mesa IR 2. Optimizing the IR 3. Register allocation 4. Generating assembly code This patch splits out step 4 into a separate class named "vec4_generator." There are several reasons for doing so: 1. Future hardware has a different instruction encoding. Splitting this out will allow us to replace vec4_generator (which relies heavily on the brw_eu_emit.c code and struct brw_instruction) with a new code generator that writes the new format. 2. It reduces the size of the vec4_visitor monolith. (Arguably, a lot more should be split out, but that's left for "future work.") 3. Separate namespaces allow us to make helper functions for generating instructions in both classes: ADD() can exist in vec4_visitor and create IR, while ADD() in vec4_generator() can create brw_instructions. (Patches for this upcoming.) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
746fc346eae21d227b06799f3e82a1404c75bdc9 |
|
27-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Rework memory contexts for shader compilation data. During compilation, we allocate a bunch of things: the IR needs to last at least until code generation...and then the program store needs to last until after we upload the program. For simplicity's sake, just keep it all around until we upload the program. After that, it can all be freed. This will also save a lot of headaches during the upcoming refactoring. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
403bb1d306c5bc23ad9e2c26fd39071e6e41f665 |
|
27-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Pass the brw_context pointer into vec4_visitor and do_vs_prog. We used to steal it out of the brw_compile struct...but vec4_visitor isn't going to have one of those in the future. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9136723214136a95a3c915d580943c888cd99503 |
|
21-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Move struct brw_compile (p) entirely inside fs_generator. The brw_compile structure contains the brw_instruction store and the brw_eu_emit.c state tracking fields. These are only useful for the final assembly generation pass; the earlier compilation stages doesn't need them. This also means that the code generator for future hardware won't have access to the brw_compile structure, which is extremely desirable because it prevents accidental generation of Gen4-7 code. v2: rzalloc p, as suggested by Eric. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
4a64efc01bef924d4b22d0878b1fef89e5e5bbac |
|
22-Nov-2012 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: emit w/a for packed attribute formats in VS Implements BGRA swizzle, sign recovery, and normalization as required by ARB_vertex_type_10_10_10_2_rev. V2: Ported to the new VS backend, since that's all that's left; fixed normalization. V3: Moved fixups out of the GLSL-only path, so it works for FF/VP too. V4 (Kayden): Rework ES3 normalization, don't heap allocate registers; tidy comments. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
894fe54ec9b859b9fa47cc153fdc3c23cb98455e |
|
22-Nov-2012 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vs: add support for emitting SHL, SHR, ASR Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a405717b885a4e211dc28c462d174ed8e600fcf9 |
|
11-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Remove dead vec4_visitor::src_reg_for_float prototype. No such function exists. src_reg's constructor does that. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
66c8473e028d416a87783da45de34454e4e9f6b8 |
|
08-Oct-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Replace brw_vs_emit.c with dumping code into the vec4_visitor. Rather than having two separate backends, just create a small layer that translates the subset of Mesa IR used for ARB_vertex_program and fixed function programs to the Vec4 IR. This allows us to use the same optimization passes, code generator, register allocator as for GLSL. v2: Incorporate Eric's review comments. - Fix use of uninitialized src_swiz[] values in the SWIZZLE_ZERO/ONE case: just initialize it to 0 (.x) since the value doesn't matter (those channels get writemasked out anyway). - Properly reswizzle source register's swizzles, rather than overwriting the swizzle. - Port the old brw_vs_emit code for computing .x of the EXP2 opcode. - Update comments, removing mention of NV_vertex_program, etc. - Delete remaining #warning lines and debug comments. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1f0093720de41ca23c408f11784fcc39d58271d2 |
|
08-Oct-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Refactor min/max handling to share code. v2: Properly use "conditionalmod" pre-Gen6, rather than the incorrectly copy-and-pasted "BRW_CONDITIONAL_G". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
fd8655aa7a78f3ded44e9dee572f17309a44a945 |
|
08-Oct-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Add support for emitting DPH opcodes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
54679fcbcae7a2d41cb439e52e386bd811a291b4 |
|
03-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965: Share the predicate field between FS and VS. Note that BRW_PREDICATE_NONE is 0 and BRW_PREDICATE_NORMAL is 1, so that's a lot like the true/false we had in the FS before. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
24aeeb2fdcde7a0c257db6469c6b0f064d53d3cf |
|
03-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965: Make the FS and VS share a few visitor/instruction fields. This will let us reuse brw_fs_cfg.cpp from brw_vec4_*. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
af911b2819e5175008c67e6939d88ec28cda69d1 |
|
16-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Do the temporary allocation in emit_scratch_write(). Both callers were doing basically the same thing, just written differently. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9499f7984e7393f5acf214f126481695a774e8e7 |
|
16-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Simplify emit_scratch_write() prototype. Both callers used (effectively) inst->dst as the argument, so just reference it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
914d8f9f84a3539758716d676d59a1fee4cc559f |
|
04-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add a little bit of IR-level debug ability. This is super basic, but it let me visualize a problem I had with opt_compute_to_mrf(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
34c58acb59bc0b827e28ef9e89044621ab0b3ee1 |
|
03-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for splitting virtual GRFs. This should improve our ability to register allocate without spilling. Unfortuantely, due to the live variable analysis being ignorant of loops, we still have register allocation failures on some programs. v2: Add more context to the comment explaining the function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d4bcc6591812ebe72a363cf98371de5e5016f481 |
|
03-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Try again when we've successfully spilled a reg. Before, we'd spill one reg, then continue on without actually register allocating, then assertion fail when we tried to use a vgrf number as a register number. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9237f0ea8d176fb5dcd41868dcc723fe34f6b1f3 |
|
02-Oct-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Implement register spilling. To validate this code, I ran piglit -t vs quick.tests with the "go spill everything" debugging code enabled. There was only one regression: glsl-vs-unroll-explosion simply ran out of registers. This should be fine in the real world, since no one actually spills every single register. NOTE: This is a candidate for the 9.0 branch. Even if it proves to have bugs, it's likely better than simply failing to compile. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bb020d09c382285210a5aebe412ddabfad19e4a0 |
|
25-Jun-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add a surface index to VS_OPCODE_PULL_CONSTANT instructions. Similar to the previous commit for the fragment shader, now we have a buffer index and an offset. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
b2f5d4c3ec9ec2fec8b39c87eb00121a24107276 |
|
04-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Move class functions to brw_vec4.cpp. This has less impact than for the FS (4k savings), because it was partially done already, but makes things more consistent. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
9195191e50429d9cf25e6498f9fb108758ac2be6 |
|
27-Jan-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Avoid allocating registers in to the gen7 MRF hack region. This is the corresponding fix to the previous one for the FS, but I don't have a particular test for it. NOTE: This is a candidate for the 8.0 branch.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8e34021099527868097b2c877fc32f29aa4d7bb6 |
|
07-Dec-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Implement EXT_texture_swizzle support for VS texturing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
328b693a199a67ce3a17d258f34d7bfd26790871 |
|
12-Nov-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Add support for texel offsets. The visit() half computes the values to put in the header based on the IR and simply stuffs that in the vec4_instruction; the emit() half uses this to set up the message header. This works out well since emit() can use brw_reg directly and access individual DWords without kludgery. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
ca182cd0fa338ad39d531cb1be6a5a1bbf455771 |
|
26-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Implement vec4_visitor::generate_tex(). This is the part that takes the vec4_instruction IR and turns it into actual Gen ISA. v2: Add Gen4 messages, don't retype m0 to UW. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bdd76ebef126281d837f3a817a9f19fca7799a88 |
|
28-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Add a new dst_reg constructor for file, number, type, and mask. This will be especially useful for loading texturing parameters, where I need to (for example) reference m3.xz<D>. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7e7c40ff98cc2b930bc3113609ace5430f2bdc95 |
|
26-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Add vec4_instruction::is_tex() query. Copy and pasted from fs_inst::is_tex(), but without TXB. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a73c65c5342bf41fa0dfefe7daa9197ce6a11db4 |
|
18-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Enable faster workaround-free math on Ivybridge. According to the documentation, Ivybridge's math instruction works in SIMD16 mode for the fragment shader, and no longer forbids align16 mode for the vertex shader. The documentation claims that SIMD16 mode isn't supported for INT DIV, but empirical evidence shows that it works fine. Presumably the note is trying to warn us that the variant that returns both quotient and remainder in (dst, dst + 1) doesn't work in SIMD16 mode since dst + 1 would be sechalf(dst), trashing half your results. Since we don't use that variant, we don't care and can just enable SIMD16 everywhere. The documentation also still claims that source modifiers and conditional modifiers aren't supported, but empirical evidence and study of the simulator both show that they work just fine. Goodbye workarounds. Math just works now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
31874f074c2eaf2a9421c57f0798c79078d296c4 |
|
04-Oct-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Fix comparisons with uint negation. The condmod instruction ends up generating garbage condition codes, because apparently the comparison happens on the accumulator value (33 bits for UD), not the truncated value that would be written. Fixes vs-op-neg-* Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2e5a1a254ed81b1d3efa6064f48183eefac784d0 |
|
07-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
intel: Convert from GLboolean to 'bool' from stdbool.h. I initially produced the patch using this bash command: for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings. Finally, I cleaned up some whitespace issues introduced by the change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chad Versace <chad@chad-versace.us> Acked-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
cea946307f319cc7cf3e2cf730be34cd51047965 |
|
07-Oct-2011 |
Brian Paul <brianp@vmware.com> |
i965: make swizzle_for_size() return unsigned Silences a warning about comparing to an unsigned variable. It looks like the result of swizzle_for_size() is always assigned to unsigned vars. Reviewed-by: Chad Versace <chad@chad-versace.us>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e967c5b38fcde15a7e78910239735d851e7a7e40 |
|
07-Oct-2011 |
Brian Paul <brianp@vmware.com> |
i965: make size_swizzles[] static const Reviewed-by: Chad Versace <chad@chad-versace.us>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
cc3a699e32bed62c38b3b2de280973f067962504 |
|
24-Sep-2011 |
Paul Berry <stereotype441@gmail.com> |
i965 new VS: Fix src_reg(uint32_t) constructor. This constructor was storing its argument in the wrong field of the "imm" enum, resulting in it being converted to a float when it should have remained an unsigned integer. This was preventing clipping from working properly on pre-GEN6. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e7da40afe84349a640fe15e3af408a0dfe880e85 |
|
24-Sep-2011 |
Paul Berry <stereotype441@gmail.com> |
i965 new VS: don't share clip plane constants in pre-GEN6 In pre-GEN6, when using clip planes, both the vertex shader and the clipper need access to the client-supplied clip planes, since the vertex shader needs them to set the clip flags, and the clipper needs them to determine where to insert new vertices. With the old VS backend, we used a clever optimization to avoid placing duplicate copies of these planes in the CURBE: we used the same block of memory for both the clipper and vertex shader constants, with the clip planes at the front of it, and then we instructed the clipper to read just the initial part of this block containing the clip planes. This optimization was tricky, of dubious value, and not completely working in the new VS backend, so I've removed it. Now, when using the new VS backend, separate parts of the CURBE are used for the clipper and the vertex shader. Note that this doesn't affect the number of push constants available to the vertex shader, it simply causes the CURBE to occupy a few more bytes of URB memory. The old VS backend is unaffected. GEN6+, which does clipping entirely in hardware, is also unaffected. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
62bad54727690bff5ed42a74272e7822fd36cdb6 |
|
02-Sep-2011 |
Paul Berry <stereotype441@gmail.com> |
i965: Set up clip distance VUE slots appropriately for gl_ClipDistance. When gl_ClipDistance is in use, the contents of the gl_ClipDistance array just need to be copied directly into the clip distance VUE slots, so we re-use the code that copies all other generic VUE slots (this has been extracted to its own method). When gl_ClipDistance is not in use, the vertex shader needs to calculate the clip distances based on user-specified clipping planes. This patch also removes the i965-specific enum values BRW_VERT_RESULT_CLIP[01], since we now have generic Mesa enums that serve the same purpose (VERT_RESULT_CLIP_DIST[01]). Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
c662764f4f9d9d0303fb2685dfdc93824fa15dca |
|
06-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for compute-to-MRF. Removes 1.8% of the instructions from 97% of the vertex shaders in shader-db.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
160848d8ef96cf3a760c02cc576df7dbffc1f669 |
|
06-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add a function for how many MRFs get written as part of a SEND. This will be used for compute-to-mrf, which needs to know when MRFs get overwritten.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
63bc443f8a026a2035ffd3122c8462c6fa36d20b |
|
06-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Remove dead fields of src_reg. These were copy and pasted from the FS, and are never used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f0c04e6c22babf2aee2ad1ee85dbd6f996be3712 |
|
03-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for simple algebraic optimizations. We generate silly code for array access, and it's easier to generally support the cleanup than to specifically avoid the bad code in each place we might generate it. Removes 4.6% of instructions from 41.6% of shaders in shader-db, particularly savage2/hon and unigine. v2: Fixes by Ken: Make is_zero/one member functions, and fix a progress flag. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
cc9eb936c220267b6130b705fc696d05906a31df |
|
02-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for copy propagation of the UNIFORM and ATTR files. Removes 2.0% of the instructions from 35.7% of vertex shaders in shader-db.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
42ce13195b94d0d51ca8e7fa5eed07fde8f37988 |
|
30-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add constant propagation to a few opcodes. This differs from the FS in that we track constants in each destination channel, and we we have to look at all the swizzled source channels. Also, the instruction stream walk is done in an O(n) manner instead of O(n^2). Across shader-db, this reduces 8.0% of the instructions from 60.0% of the vertex shaders, leaving us now behind the old backend by 11.1% overall.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
87b51fc4a807616eaab0c4b38e41c328c08875e3 |
|
01-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Keep track of indices into a per-register array for virtual GRFs. Tracking virtual GRFs has tension between using a packed array per virtual GRF (which is good for register allocation), and sparse arrays where there's an element per actual register (so the first and second column of a mat2 can be distinguished inside of an optimization pass). The FS mostly avoided the need for this second sparse array by doing virtual GRF splitting, but that meant that instances where virtual GRF splitting didn't work, instructions using those registers got much less optimized.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
df35d691807656d3627b6fa6f51a08674bdc043e |
|
07-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for overflowing the number of available push constants. Fixes glsl-vs-uniform-array-4. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33742 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
6af968b6736c87c05ea579df50e23b6f23b900d4 |
|
06-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add annotation to more of the URB write. While we had nice debug output for most of the instruction stream, it was terminated by a series of anonymous MOVs and a send. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
bba910373fc6cdca939422d94adfe58b43e41b86 |
|
31-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for vector comparison ops resulting in bool cond codes. Fixes a giant pile of VS tests on gen4. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e604f98f580b74dd6c597ef492706ce74697443e |
|
23-Aug-2011 |
Paul Berry <stereotype441@gmail.com> |
i965: new VS: use the VUE map to write out vertex attributes. Previously, the new VS backend used two functions, emit_vue_header_gen6() and emit_vue_header_gen4() to emit the fixed parts of the VUE, and then a pair of carefully-constructed loops to emit the rest of the VUE, leaving out the parts that were already emitted as part of the header. This patch changes the new VS backend to use the VUE map to emit the entire VUE. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d1435a49e9765ab4e988dd8b65a5599da34f3512 |
|
23-Aug-2011 |
Paul Berry <stereotype441@gmail.com> |
i965: new VS: move clip distance computation (GEN5+) to a separate function. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d9eca0b27903acef97f7b69a70dc791b433f1c98 |
|
23-Aug-2011 |
Paul Berry <stereotype441@gmail.com> |
i965: new VS: Move PSIZ/flags computation to a separate function. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
f86d1976f81811aec0a555946e263295ed1403db |
|
23-Aug-2011 |
Paul Berry <stereotype441@gmail.com> |
i965: new VS: move NDC computation (GEN4-5) to a separate function. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
34fbab2125555ba0afffa361e1c74fb3359ef3a7 |
|
23-Aug-2011 |
Paul Berry <stereotype441@gmail.com> |
i965: new VS: Use output_reg[] to find NDC and HPOS registers. Previously, emit_vue_header_gen4() used local variables to keep track of which registers were storing the NDC and HPOS. This patch uses the output_reg[] array instead, so that the code that manipulates NDC and HPOS can be more easily refactored. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
0c9ae24dbdfcfea06fb0e8cdfd7737da48fa4e31 |
|
27-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Avoid the emit(), remove(), insert_before() for array instructions. v2: Add generator instructions for the scratch opcodes. Add emit_before() for handling ->ir and ->annotation inheritance. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
e45d0270c9f6f170e35ae39e95977b60f0f0be9a |
|
27-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Move logic for weird CMP type handling to CMP generators. v2: Don't bother with the no-dst-reg version of CMP() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2f12be5c952ec84eece74a321e5b0a92314aba3a |
|
27-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Create instruction generators outside of the emit() functions. v2: Fixed gen6 IF(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
72cfc6f3778d8297e52c254a5861a88eb62e4d67 |
|
23-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Pack live uniform vectors together in the push constant upload. At some point we need to also move uniform accesses out to pull constants when there are just too many in use, but we lack tests for that at the moment. Fixes glsl-vs-large-uniform-array. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7c84b9d303345fa5075dba8c4ea7af449d93b0f8 |
|
23-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Track uniforms as separate vectors once we've done array access. This will make it easier to figure out which elements are totally unused and not upload them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
483f5b348b0f3c0ca7082fd2047c354e8af285e7 |
|
22-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for pull constant loads for uniform arrays. v2: reworked the instruction emit and made use of gen6_resolve_implied_move, from Ken's review
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
88e08de8017b69591b37dafde9afd15f796fb404 |
|
27-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Restructure emit() functions around a vec4_instruction constructor. We sometimes want to put an instruction somewhere besides the end of the instruction stream, and we also want per-opcode instruction generation to enable compile-time checking of operands.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
2f0edc60f4bd2ae5999a6afa656e3bb3f181bf0f |
|
26-Aug-2011 |
Chad Versace <chad@chad-versace.us> |
i965: Fix Android build by removing relative includes Replace each occurence of #include "../glsl/*.h" with #include "glsl/*.h" Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad@chad-versace.us>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
8174945d3346dc049ae56dcb4bf1eab39f5c88aa |
|
17-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add simple dead code elimination. This is copied right from the fragment shader. It is needed for real register allocation to work correctly.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
3dadc1e3cceac80a1b63cad2e10f0e0f8904531b |
|
17-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Copy the live intervals calculation over from the FS. This is a rather pessimistic calculation, since it doesn't distinguish individual channels of a vec4, or elements of an array, but should be a minimum start for register allocation.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
7642c1de6b65b7dfd9e39904291cc9737cd54b56 |
|
11-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Avoid generating a MOV for most ir_assignment handling. Removes an average of 11.5% of instructions in 54% of vertex shaders in shader-db.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
abf843a797876b5e3c5c91dbec25b6553d2cc281 |
|
09-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for ir_binop_pow. Fixes vs-pow-float-float.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
584ff407482fd3baf5ce081dbbf9653eb76c40f1 |
|
07-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for scratch read/write codegen.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
d0e4d71070cd7fa197ed98612782484ec1f27123 |
|
07-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Move virtual GRFs with array accesses to them to scratch space.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
1ff4f11dd94711a498cde0330101c58636ef2741 |
|
07-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Track the variable index of array accesses. This isn't used currently, as we lower all array accesses.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
31ef2e3ec2f5837eea0899b4bda5ea15e335a6a2 |
|
06-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Avoid generating extra moves when setting up large ir_constants. We were also screwing up the types in the process, and just not emitting moves was easier.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
979072613139870f12e329e4b483c7f688b40560 |
|
06-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Handle assignment of structures/arrays/matrices better. This gets the right types on the instructions, as well as emitting minimal swizzles/writemasks.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
a070d5f363e99b0f846d555e9ca3a74ec807fdc0 |
|
04-May-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Start adding support for uniforms There's no clever packing here, no pull constants, and no array support.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|
af3c9803d818fd33139f1247a387d64b967b8992 |
|
02-May-2011 |
Eric Anholt <eric@anholt.net> |
i965: Start adding the VS visitor and codegen. The low-level IR is a mashup of brw_fs.cpp and ir_to_mesa.cpp. It's currently controlled by the INTEL_NEW_VS=1 environment variable, and only tested for the trivial "gl_Position = gl_Vertex;" shader so far.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.h
|