56ee2df4bf9b1e8c26cf8689f5ef20237c95466b |
|
13-Jan-2017 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/vec4: Fix mapping attributes This patch reverts 57bab6708f2bbc1ab8a3d202e9a467963596d462, which was causing issues with ILK and earlier VS programs. 1. brw_nir.c: Revert "i965/vec4/nir: vec4 also needs to remap vs attributes" Do not perform a remap in vec4 backend. Rather, do it later when setup attributes 2. brw_vec4.cpp: This fixes mapping ATTRx to proper GRFn. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99391 [jordan.l.justen@intel.com: merge Juan's two patches from bugzilla] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
58fdb85f0f413d1a144d4beb6519da59bc52c974 |
|
21-Apr-2016 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: take into account doubles when creating attribute mapping Doubles needs more that one slot per attribute. So when filling the attribute_map we check if it is a double in order to allocate one extra register. Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f8310189f4a31c443657cd0c1aef35db02b86c95 |
|
21-Apr-2016 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: use attribute slots for first non payload GRF As part of the payload setup, setup_attributes is called with the first GRF that can be used for the attributes (first ones are used for uniforms for example) and returns the first GRF that is not part of the payload. Before this patch, it adds directly the number of attributes. But as with 64-bit attributes can consume more than one slot, that is not valid anymore. This patch change the addition to use the number of slots consumed. gen >= 8 would not be affected, as they use the scalar mode. For that case, the vs configuration is done at fs_visitor::assign_vs_urb_setup. v2: add explanation in commit log (Jordan) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c2acf97fcc9b32eaa9778771282758e5652a8ad4 |
|
16-Dec-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
nir/i965: use two slots from inputs_read for dvec3/dvec4 vertex input attributes So far, input_reads was a bitmap tracking which vertex input locations were being used. In OpenGL, an attribute bigger than a vec4 (like a dvec3 or dvec4) consumes just one location, any other small attribute. So we mark the proper bit in inputs_read, and also the same bit in double_inputs_read if the attribute is a dvec3/dvec4. But in Vulkan, this is slightly different: a dvec3/dvec4 attribute consumes two locations, not just one. And hence two bits would be marked in inputs_read for the same vertex input attribute. To avoid handling two different situations in NIR, we just choose the latest one: in OpenGL, when creating NIR from GLSL/IR, any dvec3/dvec4 vertex input attribute is marked with two bits in the inputs_read bitmap (and also in the double_inputs_read), and following attributes are adjusted accordingly. As example, if in our GLSL/IR shader we have three attributes: layout(location = 0) vec3 attr0; layout(location = 1) dvec4 attr1; layout(location = 2) dvec3 attr2; then in our NIR shader we put attr0 in location 0, attr1 in locations 1 and 2, and attr2 in location 3 and 4. Checking carefully, basically we are using slots rather than locations in NIR. When emitting the vertices, we do a inverse map to know the corresponding location for each slot. v2 (Jason): - use two slots from inputs_read for dvec3/dvec4 NIR from GLSL/IR. v3 (Jason): - Fix commit log error. - Use ladder ifs and fix braces. - elements_double is divisible by 2, don't need DIV_ROUND_UP(). - Use if ladder instead of a switch. - Add comment about hardware restriction in 64bit vertex attributes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7c6b714cd0fe06044c9a810186f5ce3690152574 |
|
05-Jan-2017 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Print VS output VUE map in Vulkan too. We need to move this to the shared layer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c762809e49daf61fc986721006ce6a520e6e735f |
|
01-Sep-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: run scalarize_df() after spilling Spilling of 64-bit data requires data shuffling for the corresponding scratch read/write messages. This produces unsupported swizzle regions and writemasks that we need to scalarize. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
73610384a8357287cef64434c789ff03c2f6f37a |
|
01-Sep-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: prevent src/dst hazards during 64-bit register allocation 8-wide compressed DF operations are executed as two separate 4-wide DF operations. In that scenario, we have to be careful when we allocate register space for their operands to prevent the case where the first half of the instruction overwrites the source of the second half. To do this we mark compressed instructions as having hazards to make sure that ther register allocators assigns a register regions for the destination that does not overlap with the region assigned for any of its source operands. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2b57adad0056273e38d9a9736cd98be95c0deb07 |
|
18-Aug-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4/scalarize_df: support more swizzles via vstride=0 By exploiting gen7's hardware decompression bug with vstride=0 we gain the capacity to support additional swizzle combinations. This also fixes ZW writes from X/Y channels like in: mov r2.z:df r0.xxxx:df Because DF regions use 2-wide rows with a vstride of 2, the region generated for the source would be r0<2,2,1>.xyxy:DF, which is equivalent to r0.xxzz, so we end up writing r0.z in r2.z instead of r0.x. Using a vertical stride of 0 in these cases we get to replicate the XX swizzle and write what we want. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c3edacaa288ae01c0f37e645737feeeb48f2c3f2 |
|
19-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4/scalarize_df: do not scalarize swizzles that we can support natively Certain swizzles like XYZW can be supported by translating only the first two 64-bit swizzle channels to 32-bit channels. This happens with swizzles such that the first two logical components, when translated to 32-bit channels and replicated across the second dvec2 row, select the same channels specified by the 3rd and 4th logical swizzle components. Notice that this opens up the possibility that some instructions are not scalarized and can end up with XY or ZW 32-bit writemasks. Make sure we always scalarize in such cases. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2f0bc54e2bf6c7d218f30acc88f5cb94bd6214f7 |
|
01-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: split instructions that read 64-bit interleaved attributes Stages that use interleaved attributes generate regions with a vstride=0 that can hit the gen7 hardware decompression bug. v2: - Make static the function and fix indent (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0579c85e5ca7d406cad42db7c1501d6b1fb9696b |
|
01-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: dump subnr for FIXED_GRF This came in handy when debugging the payload setup for Tess Eval, since it prints correct subnr for attributes that can be loaded in the second half of a register. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5fe8d567d8dadeb2b77addd73762f6bde4acfac2 |
|
06-Oct-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix attribute setup for doubles Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6a01259d8a13aace16e4f1ce9e09e0e41bd52273 |
|
06-Oct-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix indentation in lower_attributes_to_hw_regs() Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ae400e38d90ea2fddf1b050ff94f52bdec94e150 |
|
15-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: make emit_pull_constant_load support 64-bit loads This way callers don't need to know about 64-bit particularities and we reuse some code. v2: - use byte_offset() instead of offset() - only mark the surface as used once Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
df6e3aa6ae23346bad59d071d340a67be0e2a2c5 |
|
13-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix move_push_constants_to_pull_constants() for 64-bit data v2: adapt to changes in offset() Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
eee2c0d7854e55e92e0e72eb0fb94ab83d702754 |
|
13-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix indentation in move_push_constants_to_pull_constants() Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
07bc6a35d3d6d94d45b81bd10002f0e420d855c2 |
|
28-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: do not split scratch read/write opcodes 64-bit scratch read/writes require to shuffle data around so we need to have access to the full 64-bit data. We will do the right thing for these when we emit the messages. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2a857104e41167cef3c6a5132a45c88056c75dff |
|
23-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Do not use DepCtrl with 64-bit instructions The BDW PRM says that it is not supported, but it seems that gen7 is also affected, since doing DepCtrl on double-float instructions leads to GPU hangs in some cases, which is probably not surprising knowing that this is not supported in new hardware iterations. The SKL PRMs do not mention this restriction, so it is probably fine. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
506154f704dcb9185dadcd655fd6d0603916ea97 |
|
23-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: extend the DWORD multiply DepCtrl restriction to all gen8 platforms v2: - Add Broxton as Intel's internal PRMs says that it is needed (Matt). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b3a7d0ee9d5f792ab68fbe77da5e3ea85d4bc4c0 |
|
08-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Lower 64-bit MAD The previous patch made sure that we do not generate MAD instructions for any NIR's 64-bit ffma, but there is nothing preventing i965 from producing MAD instructions as a result of lowerings or optimization passes. This patch makes sure that any 64-bit MAD produced inside the driver after translating from NIR is also converted to MUL+ADD before we generate code. v2: - Use a copy constructor to copy all relevant instruction fields from the original mad into the add and mul instructions v3: - Rename the lowering and fix commit log (Matt) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
83dcd146020f5e54d1e0a46c585ed672e75abaa0 |
|
01-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Skip swizzle to subnr in 3src instructions with DF operands We make scalar sources in 3src instructions use subnr instead of swizzles because they don't really use swizzles. With doubles it is more complicated because we use vstride=0 in more scenarios in which they don't produce scalar regions. Also RepCtrl=1 is not allowed with 64-bit operands, so we should avoid this. v2: Fix typo (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
49be3abbe7afd64f9e3435e9a9e341e30acacb52 |
|
30-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix indentation in pack_uniform_registers Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
bdf5498c6b7870c14139279e76f1e4b281bed2cd |
|
29-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix pack_uniform_registers for doubles We need to consider the fact that dvec3/4 require two vec4 slots. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
23278a75ce06c3c083892b2a20d9efdf794167d6 |
|
18-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: teach register coalescing about 64-bit Specifically, at least for now, we don't want to deal with the fact that channel sizes for fp64 instructions are twice the size, so prevent coalescing from instructions with a different type size. Also, we should check that if we are coalescing a register from another MOV we should be writing the same amount of data in both operations, otherwise we end up wiring more or less than the original instruction. This can happen, for example, when we have split fp64 MOVs with an exec size of 4 that only write one register each and then a MOV with exec size of 8 that reads both. We want to avoid the pass to think that it can coalesce from the first split MOV alone. Ideally we would like the pass to see that it can coalesce from both split MOVs instead, but for now we keep it simple. Finally, the pass doesn't support coalescing of multiple registers but in the case of normal SIMD4x2 double-precision instructions they naturally write two registers (one per vertex) and there is no reason why we should not allow coalescing in this case. Change the restriction to bail if we see instructions that write more than 8 channels, where the channels can be 32-bit or 64-bit. v2: - Make sure that scan_inst and inst write the same amount of data. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ac5a06ff83c32ab14e01e526e729b2fbfe3a2426 |
|
18-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: implement access to DF source components Z/W The general idea is that with 32-bit swizzles we cannot address DF components Z/W directly, so instead we select the region that starts at the the 16B offset into the register and use X/Y swizzles. The above, however, has the caveat that we can't do that without violating register region restrictions unless we probably do some sort of SIMD splitting. Alternatively, we can accomplish what we need without SIMD splitting by exploiting the gen7 hardware decompression bug for instructions with a vstride=0. For example, an instruction like this: mov(8) r2.x:DF r0.2<0>xyzw:DF Activates the hardware bug and produces this region: Component: x0 y0 z0 w0 x1 y1 z1 w1 Register: r0.2 r0.3 r0.2 r0.3 r1.2 r1.3 r1.2 r1.3 Where r0.2 and r0.3 are r0.z:DF for the first vertex of the SIMD4x2 execution and r1.2 and r1.3 are the same for the second vertex. Using this to our advantage we can select r0.z:DF by doing r0.2<0,2,1>.xyxy and r0.w by doing r0.2<0,2,1>.zwzw without needing to split the instruction. Of course, this only works for gen7, but that is the only hardware platform were we implement align16/fp64 at the moment. v2: Adapted to the fact that we now do this after converting to hardware registers (Iago) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e238601a2da9512c0fd263e8378f30498a0a1507 |
|
24-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: translate 64-bit swizzles to 32-bit The hardware can only operate with 32-bit swizzles, which is a rather limiting restriction. However, the idea is not to expose this to the optimization passes, which would be a mess to deal with. Instead, we let the bulk of the vec4 backend ignore this fact and we fix the swizzles right at codegen time. At the moment the pass only needs to handle single value swizzles thanks to the scalarization pass that runs before it. Notice that this only works for X/Y swizzles. We will add support for Z/W swizzles in the next patch, since they need a bit more work. v2 (Sam): - Do not expand swizzle of 64-bit immediate values. v3: - Do this after translation to hardware registers instead of doing it right before so we don't need the force_vstride0 flag (Curro). - Squashed patch that included FIXED_GRF in the list of register files that need this translation (Iago). - Remove swizzle assignments for VGRF and UNIFORM files in convert_to_hw_regs(), they will be set by apply_logical_swizzle() (Iago). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
fb7cb853c964db44ab99c1592e1ef7dec2f0c25b |
|
24-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add a scalarization pass for double-precision instructions The hardware only supports 32-bit swizzles, which means that we can only access directly channels XY of a DF making access to channels ZW more difficult, specially considering the various regioning restrictions imposed by the hardware. The combination of both things makes handling ramdom swizzles on DF operands rather difficult, as there are many combinations that can't be represented at all, at least not without some work and some level of instruction splitting depending on the case. Writemasks are 64-bit in general, however XY and ZW writemasks also work in 32-bit, which means these writemasks can't be represented natively, adding to the complexity. For now, we decided to try and simplify things as much as possible to avoid dealing with all this from the get go by adding a scalarization pass that runs after the main optimization loop. By fully scalarizing DF instructions in align16 we avoid most of the complexity introduced by the aforementioned hardware restrictions and we have an easier path to an initial fully functional version for the vector backend in Haswell and IvyBridge. Later, we can improve the implementation so we don't necessarily scalarize everything, iteratively adding more complexity and building on top of a framework that is already working. Curro drafted some ideas for how this could be done here: https://bugs.freedesktop.org/show_bug.cgi?id=92760#c82 v2: - Use a copy constructor for the scalar instructions so we copy all relevant instructions fields from the original instruction. v3: Fix indention in one switch (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f4b8649233fa10e89205b6b5f6f334279b198f22 |
|
17-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: split double-precision SEL There is a hardware bug affecting compressed double-precision SEL instructions in align16 mode by which they won't read predication mask properly. The bug does not affect other predicated instructions and it does not affect SEL in Align1 mode either. This was found empirically and verified by Curro in the simulator. Fix this by splitting double-precision SEL in Align16 mode to use an execution size of 4. v2: Check that the dst type is 64-bit, since we can have 16-wide single precision bcsel instructions that also write 2 registers. v3: Replace bcsel by SEL in all the comments as bcsel is the nir opcode but SEL is the actual assembly instruction (Matt). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a83608f50483ac397545d3815bfe8dc3be5126b6 |
|
29-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: dump NibCtrl for instructions with execsize != 8 v2: do it in the same fashion as the FS backend for consistency (Curro) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
58767f0fec7809c3408adbc4d147dd56f2ee3d4d |
|
29-Aug-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add a SIMD lowering pass Generally, instructions in Align16 mode only ever write to a single register and don't need any form of SIMD splitting, that's why we have never had a SIMD splitting pass in the vec4 backend. However, double-precision instructions typically write 2 registers and in some cases they run into certain hardware bugs and limitations that we need to work around by splitting the instructions so we only write to 1 register at a time. This patch implements a SIMD splitting pass similar to the one in the scalar backend. Because we only use double-precision instructions in Align16 mode in gen7 (gen8+ is fully scalar and gens < 7 do not implement fp64) the pass should be a no-op on any other generation. For now the pass only handles the gen7 restriction where any instruction that writes 2 registers also needs to read 2 registers. This affects double-precision instructions reading uniforms, for example. Later patches will extend the lowering pass adding a few more cases. v2: - Move the simd lowering pass after the main optimization loop and run copy-propagation and dce if it reports progress (Curro) - Compute number of registers written instead of fixing it to 1 (Iago) - Use group from backend_instruction (Iago) - Drop assertion that checked that we only split 8-wide instructions into 4-wide. (Curro) - Don't assume that instructions can only be 8-wide, we might want to use 16-wide instructions in the future too (Curro) - Wrap gen7 workarounds in a conditional to ease adding workarounds for other gens in the future (Curro) - Handle dst/src overlap hazard (Curro) - Use the horiz_offset() helper to simplify the implementation (Curro) - Drop the assertion that checks that each split instruction writes exactly one register (Curro) - Use the copy constructor to generate split instructions with all the relevant fields initialized to the values in the original instruction instead of copying only a handful of them manually (Curro) v3 (Iago): - When copying to a temporary, allocate the number of registers required for the copy based on the size written of the lowered instruction instead of assuming that all lowered instructions produce single-register writes - Adapt to changes in offset() Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4ea3bf8ebb56c8db6e885a77d81502a0b2adca4f |
|
10-Jun-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/vec4: handle 32 and 64 bit channels in liveness analysis Our current data flow analysis does not take into account that channels on 64-bit operands are 64-bit. This is a problem when the same register is accessed using both 64-bit and 32-bit channels. This is very common in operations where we need to access 64-bit data in 32-bit chunks, such as the double packing and packing operations. This patch changes the analysis by checking the bits that each source or destination datatype needs. Actually, rather than bits, we use blocks of 32bits, which is the minimum channel size. Because a vgrf can contain a dvec4 (256 bits), we reserve 8 32-bit blocks to map the channels. v2 (Curro): - Simplify code by making the var_from_reg helpers take an extra argument with the register component we want. - Fix a couple of cases where we had to update the code to the new way of representing live variables. v3: - Fix indent in multiline expressions (Matt) - Fix comment's closing tag (Matt) - Use DIV_ROUND_UP(inst->size_written, 16) instead of 2 * regs_written(inst) to avoid rounding issues. The same for regs_read(i). (Curro). - Add asserts in var_from_reg() to avoid exceeding the allocated registers (Curro). Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
29dd5cf9d64ac998cb313db8a908272a6154ec46 |
|
30-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: dump the instruction execution size Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f79547840a1951dbf82c7b6629935c6e89020e27 |
|
30-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix regs_read() for doubles Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c722a8e61ebc72d7d21c2bed0f623218d739fdb7 |
|
17-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Rename DF to/from F generator opcodes The opcodes are not specific for conversions to/from float since we need the same for conversions to/from other 32-bit types. Rename the opcodes accordingly and change the asserts to check the size of the types involved instead. v2: - Rename to VEC4_OPCODE_TO_DOUBLE and VEC4_OPCODE_FROM_DOUBLE (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
21cf6f14d5abdf7d0f9641404387e0c00de6f56f |
|
12-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: make opt_vector_float ignore doubles The pass does not support doubles in its current form. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
54b998e0e488189307d2614fe56a3b78b442d316 |
|
17-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add VEC4_OPCODE_SET_{LOW,HIGH}_32BIT opcodes These opcodes will set the low/high 32-bit in each 64-bit data element using Align1 mode. We will use this to implement packDouble2x32. We use Align1 mode because in order to implement this in Align16 mode we would need to use 32-bit logical swizzles (XZ for low, YW for high), but the IR works in terms of 64-bit logical swizzles for DF operands all the way up to codegen. v2: - use suboffset() instead of get_element_ud() - no need to set the width on the dst Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6979e5a41241993b9e7bedea80f29fb43d96aa47 |
|
31-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add VEC4_OPCODE_PICK_{LOW,HIGH}_32BIT opcodes These opcodes will pick the low/high 32-bit in each 64-bit data element using Align1 mode. We will use this, for example, to do things like unpackDouble2x32. We use Align1 mode because in order to implement this in Align16 mode we would need to use 32-bit logical swizzles (XZ for low, YW for high), but the IR works in terms of 64-bit logical swizzles for DF operands all the way up to codegen. v2: - use suboffset() instead of get_element_ud() - no need to set the width on the dst Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c35fa7ac5507a64943aa518b2dac8bddfdc9e14b |
|
18-Nov-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: set correct register regions for 32-bit and 64-bit For 32-bit instructions we want to use <4,4,1> regions for VGRF sources so we should really set a width of 4 (we were setting 8). For 64-bit instructions we want to use a width of 2 because the hardware uses 32-bit swizzles, meaning that we can only address 2 consecutive 64-bit components in a row. Also, Curro suggested that the hardware is probably fixing the width to 2 for 64-bit instructions anyway, so just go with that and use <2,2,1>. v2: - No need to explicitly set the vertical stride of 64-bit regions to 2, brw_vecn_grf with a width of 2 will do that for us. - No need to adjust the width of dst registers. v3 (Ian): - Make type_size and width const. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
558f27953101c438747c3e9d3c3f98ce21e79007 |
|
14-Aug-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add double/float conversion pseudo-opcodes These need to be emitted as align1 MOV's, since they need to have a stride of 2 on the float register (whether src or dest) so that data from another thread doesn't cross the middle of a SIMD8 register. v2 (Iago): - The float-to-double needs to align 32-bit data to 64-bit before doing the conversion. This was doable in align16 when we tried to use an execsize of 4, but with an execsize of 8 we would need another align1 opcode to do that (since we need data to cross the middle of a SIMD register). Just making the opcode handle this internally seems more practical that adding another opcode just for this purpose and having the caller know about this before converting. - The double-to-float conversion produces 32-bit elements aligned to 64-bit so we make the opcode re-pack the result to 32-bit and fit in one register, as expected by SIMD4x2 operation. This still requires that callers reserve two registers for the float data destination because we need to produce 64-bit aligned data first, and repack it later on the same destination register, but it saves the need for a re-pack opcode only to achieve this making the operation complete in a single opcode. Hopefully that is worth the weirdness of the double register allocation... Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2d6eee3144ce16b39909522be466bdb3871f4c1b |
|
13-Aug-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965/vec4: add support for printing DF immediates Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e729504fb1799c3ae31cea76d73946530ef9806f |
|
14-Sep-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir: pass compiler rather than devinfo to functions that call nir_optimize Later we will pass compiler to nir_optimise to be used by the loop unroll pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a1a292d17710a2bfb33f798c9f5fda73a5985261 |
|
04-Oct-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Store a clip_distance_mask field similar to cull_distance_mask. This isn't useful for legacy GL, but will be used in Vulkan. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
19c652b29ce7271374cd0951bdadc9840964e78e |
|
04-Oct-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use shader_info for brw_vue_prog_data::cull_distance_mask. This also allows us to move it from a GL specific location to a part of the compiler shared by both GL and Vulkan. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b |
|
13-Oct-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
89e1436e2d4ff0c15202708979eb36761cae4167 |
|
11-Oct-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Silence unused parameter warnings brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter] gl_shader_stage shader_type, ^ brw_nir.c: In function ‘brw_nir_lower_vs_inputs’: brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo, ^ brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter] uint32_t sampler, ^ brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter] vec4_visitor::gs_emit_vertex(int stream_id) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f57f526fc5cfaedf26b2becf8f1899d5de0d0461 |
|
16-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/ir: Skip eliminate_find_live_channel() for stages with sparse thread dispatch. The eliminate_find_live_channel optimization eliminates FIND_LIVE_CHANNEL instructions in cases where control flow is known to be uniform, and replaces them with 'MOV 0', which in turn unblocks subsequent elimination of the BROADCAST instruction frequently used on the result of FIND_LIVE_CHANNEL. This is however not correct in per-sample fragment shader dispatch because the PSD can dispatch a fully unlit sample under certain conditions. Disable the optimization in that case. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> v2: Add devinfo argument to brw_stage_has_packed_dispatch() to implement hardware generation check.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5ca35c63673dad28854c00ce34ec6f085ba4ec5e |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Assert that ATTR regions are register-aligned. It might be useful to actually handle this once copy propagation becomes smarter about register-misaligned offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8bed1adfc144d9ae8d55ccb9b277942da8a78064 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Assign correct destination offset to rewritten instruction in register coalesce. Because the pass already checks that the destination offset of each 'scan_inst' that needs to be rewritten matches 'inst->src[0].offset' exactly, the final offset of the rewritten instruction is just the original destination offset of the copy. This is in preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3a74e437fdec02c28749c94bc1bcf21c3c4b48d7 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Don't coalesce registers with overlapping writes not matching the MOV source. In preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1bb5074474445ea9f54d0f52383f99ac0fa6128f |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Compare full register offsets in opt_register_coalesce nop move check. In preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3be0d6d040753c62b25077fb6b85ad1f0808b258 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Check that the write offsets match when setting dependency controls. For simplicity just assume that two writes to the same GRF with different sub-GRF offsets will potentially interfere and break the dependency control chain. This is in preparation for adding sub-GRF offset support to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b52fefc4d55a4627bf0d59c78ac531603fa08fda |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Change opt_vector_float to keep track of the last offset seen in bytes. This simplifies things slightly and makes the pass more correct in presence of sub-GRF offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
230615e2280e6d28456e7d6a42b1e42645515b4d |
|
09-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Simplify src/dst_reg to brw_reg conversion by using byte_offset(). This should also have the side effect of fixing convert_to_hw_regs() to handle sub-GRF register offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
eb746a80e5e99bafd3957a1cb2d9db8548a1a6be |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/ir: Update several stale comments. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
47784e2346b56bea6a1111fecaa953239ff198ca |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/ir: Don't print ARF subnr values twice. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5d65d51e78c2f73389a0d30dac6dda4561e91bec |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Print src/dst_reg::offset field consistently for all register files. C.f. 'i965/fs: Print fs_reg::offset field consistently for all register files.'. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
fcd9d1badcd97486eea5d87bf701a3b0a16b4ba9 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Drop backend_reg::in_range() in favor of regions_overlap(). This makes sure that overlap checks are done correctly throughout the back-end when the '*this' register starts before the register/size pair provided as argument, and is actually less annoying to use than in_range() at this point since regions_overlap() takes its size arguments in bytes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
728dd30c0ac0078653974de36087456065d2e3ae |
|
08-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Replace vec4_instruction::regs_read with ::size_read using byte units. The previous regs_read value can be recovered by rewriting each reference of regs_read() like 'x = i.regs_read(j)' to 'x = DIV_ROUND_UP(i.size_read(j), reg_unit)'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
69fdf13c215c2970feaca76f178a5c2c11ba8fec |
|
03-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Replace vec4_instruction::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d28cfa35fec75c367b940ff829ba8eaa035fbd22 |
|
02-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Add wrapper functions for vec4_instruction::regs_read and ::regs_written. This is in preparation for dropping vec4_instruction::regs_read and ::regs_written in favor of more accurate alternatives expressed in byte units. The main reason these wrappers are useful is that a number of optimization passes implement dataflow analysis with register granularity, so these helpers will come in handy once we've switched register offsets and sizes to the byte representation. The wrapper functions will also make sure that GRF misalignment (currently neglected by most of the back-end) is taken into account correctly in the calculation of regs_read and regs_written. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
fba020e5af49d9d9a2c6e4d4b79115ed1e74a127 |
|
01-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes. The dst/src_reg::offset field in byte units introduced in the previous patch is a more straightforward alternative to an offset representation split between ::reg_offset and ::subreg_offset fields. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple FS back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. v2: Fix division by the wrong reg_unit in the UNIFORM case of convert_to_hw_regs(). (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
527f37199929932300acc1688d8160e1f3b1d753 |
|
23-Aug-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
intel: s/brw_device_info/gen_device_info/ Generated by: sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e7c376adfdecd4c1333997c8be8bb066a87c67b4 |
|
19-Aug-2016 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Ignore swizzle of VGRF for use by var_range_end(). var_range_end(v, n) loops over the n components of variable number v and finds the maximum value, giving the last use of any component of v. Therefore it expects v to correspond to the variable associated with the .x channel of the VGRF. var_from_reg() however returns the variable for the first channel of the VGRF, post-swizzle. So, if the last register had a swizzle with y, z, or w in the swizzle component, we would read out of bounds. For any other register, we would read liveness information from the next register. The fix is to convert the src_reg to a dst_reg in order to call the dst_reg version of var_from_reg() that doesn't consider the swizzle. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4c3a6b07e2960266adca634f8607ef38f71b8318 |
|
20-Jul-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Make opt_vector_float reset at the top of each block The pass isn't really control-flow aware and you can get into case where it tries to combine instructions from different blocks. This can actually lead to an assertion failure when removing unneeded instructions if part of the vector is set in one block and part in another. This prevents regressions in the next commit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7ea09511ca4f58640063cc1ee08386cce5300535 |
|
04-Apr-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/fs: calculate first non-payload GRF using attrib slots When computing where the first non-payload GRF starts, we can't rely on the number of attributes, as each attribute can be using 1 or 2 slots depending on whether they are a dvec3/4 or other. Instead, we need to use the number of slots used by the attributes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b7423b485e11b768f68e8d5865fbc74b07ee6d48 |
|
04-Apr-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/vec4: use attribute slots to calculate URB read length Do not use total attributes because a dvec3/dvec4 attribute requires two slots. So rather use total attribute slots. v2: do not use loop to calculate required attribute slots (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1ec466d0ff59ab17edef95c84ed733c1fea5655e |
|
28-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Stop setting dispatch_grf_start_reg from the visitor Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1cc7573162a7f0e8346d7abab50890c58a0dce9a |
|
28-Apr-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965: Pass devinfo pointer to is_3src() helpers. This is not strictly required for the following changes because none of the three-source opcodes we support at the moment in the compiler back-end has been removed or redefined, but that's likely to change in the future. In any case having hardware instructions specified as a pair of hardware device and opcode number explicitly in all cases will simplify the opcode look-up interface introduced in a subsequent commit, since the opcode number alone is in general ambiguous. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c55dc77ab13420a9fe0177ccd21a6b0a950d9113 |
|
28-Apr-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965: Pass devinfo pointer to brw_instruction_name(). A future series will implement support for an instruction that happens to have the same opcode number as another instruction we support already on a disjoint set of hardware generations. In order to disambiguate which instruction it is brw_instruction_name() will need some way to find out which device we are generating code for. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
60a17d071825da4a06303cb699e4417edaaa6386 |
|
14-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Properly handle integer types in opt_vector_float(). Previously, opt_vector_float() always interpreted MOV sources as floating point, and always created a MOV with a F-type destination. This meant that we could mess up sequences of integer loads, such as: mov vgrf6.0.x:D, 0D mov vgrf6.0.y:D, 1D mov vgrf6.0.z:D, 2D mov vgrf6.0.w:D, 3D Here, integer 0/1/2/3 become approximately 0.0f, so we generated: mov vgrf6.0:F, [0F, 0F, 0F, 0F] which is clearly wrong. We can properly handle this by converting integer values to float (rather than bitcasting), and emitting a type converting MOV: mov vgrf6.0:D, [0F, 1F, 2F, 3F] To do this, see first see if the integer values (converted to float) are representable. If so, we use a D-type MOV. If not, we then try the floating point values and an F-type MOV. We make zero not impose type restrictions. This is important because 0D would imply a D-type MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D, where we want to use an F-type MOV. Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend. This recently became visible due to changes in opt_vector_float() which made it optimize more cases, but it was a pre-existing bug. Apparently it also manages to turn more integer loads into VFs, producing the following shader-db statistics on Haswell: total instructions in shared programs: 7084195 -> 7082191 (-0.03%) instructions in affected programs: 246027 -> 244023 (-0.81%) helped: 1937 total cycles in shared programs: 65669642 -> 65651968 (-0.03%) cycles in affected programs: 531064 -> 513390 (-3.33%) helped: 1177 v2: Handle the type of zero better. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1aa28f3509b033e0f86510a6d4c7993fca650b3b |
|
14-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Make opt_vector_float() only handle non-type-conversion MOVs. We don't handle this properly - we'd have to perform the type conversion before trying to convert the value to a VF. While we could do that, it doesn't seem particularly useful - most vector loads should be consistently typed (all float or all integer). As a special case, we do allow type-converting MOVs of integer 0, as it's represented the same regardless of the type. I believe this case does actually come up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2a25a5142bd78b22cc9ada41b8988bb282c2a7ac |
|
14-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fold vectorize_mov() back into the one caller. After the previous patch, this helper is only called in one place. So, just fold it back in - there are a lot of parameters here and not much code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9967561158acd94edff0fa93ceaf4bc527e271ed |
|
14-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rework opt_vector_float() control flow. This reworks opt_vector_float() so that there's only one place that flushes out any accumulated state and emits a VF. v2: Don't break the sequence for non-representable numbers - just skip recording their values. Only break it for non-MOVs or register changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a112391d52a458c588b8770cbf1ca9fce8863b79 |
|
06-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Handle MOV_INDIRECT in pack_uniform_registers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
aaac8a18904f44e93a2223c93727086358d6a655 |
|
24-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
61ee5e62a2beeb2e405ff3aa5e3eb26d1bf5437d |
|
05-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use can_do_writemask in can_reswizzle Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
75b68f9114dc3ba1b501fb7de8198c03b3dcb1fd |
|
05-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Move can_do_writemask to vec4_instruction Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8e76f664beb845f8dca30ca5635f9369618563b0 |
|
09-Dec-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Get rid of the uniform_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
056849772f66582fd7e8a181c3fb16955f84243b |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
01425c45b32fa7f323515b05697c6cc0d245ad32 |
|
17-Mar-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Remove the RCP+RSQ algebraic optimizations NIR already has this optimization and it can do much better than the little peephole in the backend. No shader-db change on Haswell or Broadwell. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7d7990cf657550be4d038a0424ffdc0ef7fd8faa |
|
14-Mar-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Consider removal of no-op MOVs as progress during register coalesce. Bug found by the liveness analysis validation pass that will be introduced in a later commit. The no-op MOV check in opt_register_coalesce() was removing instructions which makes the cached liveness analysis calculation inconsistent with the shader IR. We were failing to set progress to true in that case though, which means that invalidate_live_intervals() wouldn't necessarily be called at the end of the function. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2f76a9924e7b0b33a508ee3651b0cb2ab536a7dc |
|
02-Mar-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/vec4: add opportunistic behaviour to opt_vector_float() opt_vector_float() transforms several scalar MOV operations to a single vectorial MOV. This is done when those MOV covers all the components of the destination register. So something like: mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf3.0.z:D, 0D is transformed in: mov vgrf3.0:F, [0F, 0F, 0F, 1F] But there are cases where not all the components are written. For example, in: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf4.0.xy:D, 1065353216D mov vgrf4.0.w:D, 0D mov vgrf6.0:UD, u4.xyzw:UD Nor vgrf3 nor vgrf4 .z components are written, so the optimization is not applied. But it could be applied anyway with the components covered, using a writemask to select the ones written. So we could transform it in: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F] mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F] mov vgrf6.0:UD, u4.xyzw:UD This commit does precisely that: opportunistically apply opt_vector_float() when possible. total instructions in shared programs: 7124660 -> 7114784 (-0.14%) instructions in affected programs: 443078 -> 433202 (-2.23%) helped: 4998 HURT: 0 total cycles in shared programs: 64757760 -> 64728016 (-0.05%) cycles in affected programs: 1401686 -> 1371942 (-2.12%) helped: 3243 HURT: 38 v2: change vectorize_mov() signature (Matt). v3: take in account predicates (Juan). v4 [mattst88]: Update shader-db numbers. Fix some whitespace issues. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
cfbd9831f89ef165e7998d0b8524a1aefedec404 |
|
25-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Eliminate brw_nir_lower_{inputs,outputs,io} functions. Now that each stage is directly calling brw_nir_lower_io(), and we have per-stage helper functions, it makes sense to just call the relevant one directly, rather than going through multiple switch statements. This also eliminates stupid function parameters, such as the two that only apply to vertex attributes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2f2c00c7279e7c43e520e21de1781f8cec263e92 |
|
11-Feb-2016 |
Matt Turner <mattst88@gmail.com> |
i965: Lower min/max after optimization on Gen4/5. Gen4/5's SEL instruction cannot use conditional modifiers, so min/max are implemented as CMP + SEL. Handling that after optimization lets us CSE more. On Ironlake: total instructions in shared programs: 6426035 -> 6422753 (-0.05%) instructions in affected programs: 326604 -> 323322 (-1.00%) helped: 1411 total cycles in shared programs: 129184700 -> 129101586 (-0.06%) cycles in affected programs: 18950290 -> 18867176 (-0.44%) helped: 2419 HURT: 328 Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8122d21d1507b4d6d351299f88fff0c645c0b4ff |
|
13-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix gl_DrawID in the vec4 backend. brw_draw_upload.c uploads VertexID/InstanceID first, then DrawID. So we need to assign the attribute mapping in that order as well. Fixes the following Pigit tests with the vec4 backend: - arb_shader_draw_parameters-drawid vertexid - arb_shader_draw_parameters-drawid-indirect basevertex Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5743fd957145040a4734b5542ee5187cfad4cf1d |
|
11-Feb-2016 |
Ben Widawsky <benjamin.widawsky@intel.com> |
i965: Rename optimizer debug 00 filename This allows ls, and scripts to get the file names in the correct order of optimization. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
85f5c18fef1ff2f19d698f150e23a02acd6f59b9 |
|
14-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Drop support for ATTR as an instruction destination. This is no longer necessary...and it doesn't make much sense to have inputs as destinations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d56ae2d1605fc1b5a3fdf5aba9aefc3c7692a4ba |
|
14-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Apply VS attribute workarounds in NIR. This patch re-implements the pre-Haswell VS attribute workarounds. Instead of emitting shader code in the vec4 backend, we now simply call a NIR pass to emit the necessary code. This simplifies the vec4 backend. Beyond deleting code, it removes the primary use of ATTR as a destination. It also eliminates the requirement that the vec4 VS backend express the ATTR file in terms of VERT_ATTRIB_* locations, giving us a bit more flexibility. This approach is a little different: rather than munging the attributes at the top, we emit code to fix them up when they're accessed. However, we run the optimizer afterwards, so CSE should eliminate the redundant math. It may even be able to fuse it with other calculations based on the input value. shader-db does not handle non-default NOS settings, so I have no statistics about this patch. Note that the scalar backend does not implement VS attribute workarounds, as they are unnecessary on hardware which allows SIMD8 VS. v2: Do one multiply for FIXED rescaling and select components from either the original or scaled copy, rather than multiplying each component separately (suggested by Matt Turner). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
830b075e86e3e9af1bf12316d0f9d888a85a973b |
|
05-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Explicitly write the "TR DS Cache Disable" bit at TCS EOT. Bit 0 of the Patch Header is "TR DS Cache Disable". Setting that bit disables the DS Cache for tessellator-output topologies resulting in stitch-transition regions (but leaves it enabled for other cases). We probably shouldn't leave this to chance - the URB could contain garbage - which could result in the cache randomly being turned on or off. This patch makes the final EOT write 0 to the first DWord (which only contains this one bit). This ensures the cache is always on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9870f798beab701a9edda81ff7ccc39f1875d610 |
|
15-Jan-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs/generator: Take an actual shader stage rather than a string Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
824d82025d0bff9841647942aca501fba16fc1a9 |
|
14-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Make an is_scalar boolean in brw_compile_vs(). Shorter than compiler->scalar_stage[MESA_SHADER_VERTEX], which can help with line-wrapping. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
53a9b6223f4ebf66e8892e04ffe47eb5586eda5c |
|
31-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move 3-src subnr swizzle handling into the vec4 backend. While most align16 instructions only support a SubRegNum of 0 or 4 (using swizzling to control the other channels), 3-src instructions actually support arbitrary SubRegNums. When the RepCtrl bit is set, we believe it ignores the swizzle and uses the equivalent of a <0,1,0> region from the subnr. In the past, we adopted a vec4-centric approach of specifying subnr of 0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper SubRegNum. This isn't a great fit for the scalar backend, where we don't set swizzles at all, and happily set subnrs in the range [0, 7]. This patch changes brw_eu_emit.c to use subnr and swizzle directly, relying on the higher levels to set them sensibly. This should fix problems where scalar sources get copy propagated into 3-src instructions in the FS backend. I've only observed this with TES push model inputs, but I suppose it could happen in other cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
cddfc2cefa93b884c40329dcb193fe4fb22143ab |
|
10-Dec-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965: Add support for gl_DrawIDARB and enable extension We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
17ebb55a14b5a9aa639845fbda9330ef9421834a |
|
10-Dec-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
bd8ab8dedb2cc557ea3cb58d507f237743b3f7f9 |
|
24-Dec-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Don't set interleave or complete on TCS EOT message. Setting interleave on the TCS EOT message causes Ivybridge hardware to GPU hang like crazy. Individual tests would pass, but running even a simple test like nop.shader_test in a loop would hang within 1-3 runs. Adding sleep delays worked around the problem, somehow. Interleave doesn't make much sense given that we only have one patch URB handle, not two. Complete doesn't seem useful either. There's no reason to actually set those bits. We were just being lazy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b7793783b3df94880655234bc2a9054eddf01913 |
|
26-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Relase input URB Handles on Gen7/7.5 when TCS threads finish. Pre-Broadwell hardware requires us to manually release the ICP Handles by issuing URB read messages with the "Complete" bit set. We can do this in pairs to use fewer URB read messages. Based heavily on work from Chris Forbes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1245724f728915694ecb9c318a68107c01ccc808 |
|
17-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Port tessellation evaluation shaders to vec4 mode. This can be used on Broadwell by setting INTEL_SCALAR_TES=0. More importantly, it will be used for Ivybridge and Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
24be658d13b13fdb8a1977208038b4ba43bce4ac |
|
17-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add tessellation control shaders. The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
83dedb6354d0e9b04e8ccad77e86bdb7bad44bdd |
|
20-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add src/dst interference for certain instructions with hazards. When working on tessellation shaders, I created some vec4 virtual opcodes for creating message headers through a sequence like: mov(8) g7<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; mov(1) g7.5<1>UD 0x00000100UD { align1 WE_all }; mov(1) g7<1>UD g0<0,1,0>UD { align1 WE_all compacted }; mov(1) g7.3<1>UD g8<0,1,0>UD { align1 WE_all }; This is done in the generator since the vec4 backend can't handle align1 regioning. From the visitor's point of view, this is a single opcode: hs_set_output_urb_offsets vgrf7.0:UD, 1U, vgrf8.xxxx:UD Normally, there's no hazard between sources and destinations - an instruction (naturally) reads its sources, then writes the result to the destination. However, when the virtual instruction generates multiple hardware instructions, we can get into trouble. In the above example, if the register allocator assigned vgrf7 and vgrf8 to the same hardware register, then we'd clobber the source with 0 in the first instruction, and read back the wrong value in the last one. It occured to me that this is exactly the same problem we have with SIMD16 instructions that use W/UW or B/UB types with 0 stride. The hardware implicitly decodes them as two SIMD8 instructions, and with the overlapping regions, the first would clobber the second. Previously, we handled that by incrementing the live range end IP by 1, which works, but is excessive: the next instruction doesn't actually care about that. It might also be the end of control flow. This might keep values alive too long. What we really want is to say "my source and destinations interfere". This patch creates new infrastructure for doing just that, and teaches the register allocator to add interference when there's a hazard. For my vec4 case, we can determine this by switching on opcodes. For the SIMD16 case, we just move the existing code there. I audited our existing virtual opcodes that generate multiple instructions; I believe FS_OPCODE_PACK_HALF_2x16_SPLIT needs this treatment as well, but no others. v2: Rebased by mattst88. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f36993b46962eab4446bc1964eb47149751aee26 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2d8c5299032d229c8f6e936db5644cd53716e6c1 |
|
20-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Prevent implicit upcasts to brw_reg. Now that backend_reg inherits from brw_reg, we have to be careful to avoid the object slicing problem. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
799f924073c62c3a012c48a51895b46ad621e36c |
|
24-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Use scope operator to ensure brw_reg is interpreted as a type. In the next patch, I make backend_reg's inheritance from brw_reg private, which confuses clang when it sees the type "struct brw_reg" in the derived class constructors, thinking it is referring to the privately inherited brw_reg: brw_fs.cpp:366:23: error: 'brw_reg' is a private member of 'brw_reg' fs_reg::fs_reg(struct brw_reg reg) : ^ brw_shader.h:39:22: note: constrained by private inheritance here struct backend_reg : private brw_reg ^~~~~~~~~~~~~~~ brw_reg.h:232:8: note: member is declared here struct brw_reg { ^ Avoid this by marking brw_reg with the scope resolution operator.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f093c842e65b251e24ea3a2d6daaa91326a4f862 |
|
21-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Use implicit backend_reg copy-constructor. In order to do this, we have to change the signature of the backend_reg(brw_reg) constructor to take a reference to a brw_reg in order to avoid unresolvable ambiguity about which constructor is actually being called in the other modifications in this patch. As far as I understand it, the rule in C++ is that if multiple constructors are available for parent classes, the one closest to you in the class heirarchy is closen, but if one of them didn't take a reference, that screws things up. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
309a44d63c75a7d688157486b094e555f49c907d |
|
22-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Add and use backend_reg::equals(). Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6c8ba59cff14a1a86273f4008ff2a8e68335ab25 |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use nir_lower_tex for texture coordinate lowering Previously, we had a rescale_texcoords helper in the FS backend for handling rescaling of texture coordinates. Now that we can do variants in NIR, we can use nir_lower_tex to do the rescaling for us. This allows us to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and GL_CLAMP handling in vertex and geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ce767bbdfff7c2a7829b652c111a11eb9ddba026 |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Move postprocess_nir to codegen time This allows us to insert NIR passes between initial NIR compilation and optimization (link time) and actual backend code-gen. In particular, it will allow us to do shader variants in NIR and share some of that shader variant code between backends. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a5b3115f0a9ede775b332b1a669de570668e871c |
|
02-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Drop IMM fs_reg/src_reg -> brw_reg conversions. The previous two commits make this unnecessary. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f9a9ba5eac2f1934bd7fecc92cd309f22411164b |
|
02-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Replace src_reg(imm) constructors with brw_imm_*(). Cuts 1.5k of .text. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
44d6c0c805d2911cc5dfe853e5bc5a505f87775f |
|
12-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Convert scalar_* flags to a scalar_stage array. I was going to add scalar_tcs and scalar_tes flags, and then thought better of it and decided to convert this to an array. Simpler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0eb3db117b56b081ee2674cc8940c193ffc3c41b |
|
02-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Use BRW_MRF_COMPR4 macro in more places. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
49b3215d7076db8b9afe8998b01ef250795b5892 |
|
27-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Combine register file field. The first four values (2-bits) are hardware values, and VGRF, ATTR, and UNIFORM remain values used in the IR. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b3315a6f56fb93f2884168cbf9358b2606641db5 |
|
27-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Replace HW_REG with ARF/FIXED_GRF. HW_REGs are (were!) kind of awful. If the file was HW_REG, you had to look at different fields for type, abs, negate, writemask, swizzle, and a second file. They also caused annoying problems like immediate sources being considered scheduling barriers (commit 6148e94e2) and other such nonsense. Instead use ARF/FIXED_GRF/MRF for fixed registers in those files. After a sufficient amount of time has passed since "GRF" was used, we can rename FIXED_GRF -> GRF, but doing so now would make rebasing awful. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b163aa01487ab5f9b22c48b7badc5d65999c4985 |
|
27-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Rename GRF to VGRF. The 2-bit hardware register file field is ARF, GRF, MRF, IMM. Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to mean an assigned general purpose register. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7638e75cf99263c1ee8e31c6cc5a319feec2c943 |
|
26-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Use brw_reg's nr field to store register number. In addition to combining another field, we get replace silliness like "reg.reg" with something that actually makes sense, "reg.nr"; and no one will ever wonder again why dst.reg isn't a dst_reg. Moving the now 16-bit nr field to a 16-bit boundary decreases code size by about 3k. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3048053908310eaf082058e5be34ae902e1fc02c |
|
26-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Unwrap some lines. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
58fa9d47b536403c4e3ca5d6a2495691338388fd |
|
26-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Remove swizzle/writemask fields from src/dst_reg. Also allows us to handle HW_REGs in the swizzle() and writemask() functions. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
94b1031703b1b5759436fe215323727cffce5f86 |
|
25-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Remove fixed_hw_reg field from backend_reg. Since backend_reg now inherits brw_reg, we can use it in place of the fixed_hw_reg field. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1392e45bfb396ccbfa5bb0c6063522e0550988d3 |
|
24-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Use immediate storage in inherited brw_reg. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e42fb0c2a687cdcd6af2a590f6f5e24f64cfff3b |
|
23-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Make 'dw1' and 'bits' unnamed structures in brw_reg. Generated by sed -i -e 's/\.bits\././g' *.c *.h *.cpp sed -i -e 's/dw1\.//g' *.c *.h *.cpp and then reverting changes to comments in gen7_blorp.cpp and brw_fs_generator.cpp. There wasn't any utility offered by forcing the programmer to list these to access their fields. Removing them will reduce churn in future commits. This is C11 (and gcc has apparently supported it for sometime "compatibility with other compilers") See https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e42a29531ae3d5dedb72011da2947357dfa8715b |
|
10-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Print force_writemask_all in dump_instructions(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4ef27745c8ed5153464db22950a90d74d2ef4435 |
|
09-Sep-2015 |
Neil Roberts <neil@linux.intel.com> |
i965/vec4/skl+: Use ld2dms_w instead of ld2dms In order to support 16x MSAA, skl+ has a wider version of ld2dms that takes two parameters for the MCS data. The MCS data in the response still fits in a single register so we just need to ensure we copy both values rather than just the lower one. Acked-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7c81a6a647257c309cb1ca36c60aa4bfa8e2e022 |
|
26-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Replace default case with list of enum values. If we add a new file type, we'd like to get warnings if it's not handled. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4cba8f5d21e4b50343e7c7bfbeb603b59c5d71dd |
|
23-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Wrap vec4_generator in a C function. vec4_generator is a class for convenience, but only exports a single method as its public API. It makes much more sense to just export a single function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
73ff0ead3688519eb76ea8bc32eabb9004e6f37b |
|
23-Oct-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Convert src_reg/dst_reg to brw_reg at the end of the visitor. This patch makes the visitor convert registers to the HW_REG file at the very end, after register allocation, post-RA scheduling, and dependency control flagging. After that, everything is in fixed brw_regs. This simplifies the code generator, as it can just use the hardware registers rather than having to interpret our abstract files. In particular, interpreting the UNIFORM file meant reading prog_data to figure out where push constants are supposed to start. Having the part of the code that performs register allocation also translate everything to hardware registers seems sensible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
bfc73ff10eafad59b6ae9ca3991f9f1a3700b3a1 |
|
07-Oct-2015 |
Emil Velikov <emil.l.velikov@gmail.com> |
i965: remove unneeded src_reg copy in emit_shader_time_write The variable is already of type src_reg. creating a new instance only to destroy it seems unnecessary. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8cf84a7e470dbd3b46ce4081459d2ecfab22c2d5 |
|
09-Oct-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: print predicate control at brw_vec4 dump_instruction v2: externalize pred_ctrl_align16 from brw_disasm.c instead of adding a copy on brw_vec4.c, as suggested by Matt Turner Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
627f94b72e0e9443ad116f072599a7342269f297 |
|
28-Sep-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: adding vec4_cmod_propagation optimization vec4 port of fs_cmod_propagation. Shader-db results (no vec4 grepping): total instructions in shared programs: 6240413 -> 6235841 (-0.07%) instructions in affected programs: 401933 -> 397361 (-1.14%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 2265 HURT: 0 v2: remove extra space and combine two if blocks, as suggested by Matt Turner v3: add condition check to bail out if current inst and inst being scanned has different writemask, as pointed by Matt Turner v3: updated shader-db numbers v4: remove block from foreach_inst_in_block_*_starting_from after commit 801f151917fedb13c5c6e96281a18d833dd6901f Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
801f151917fedb13c5c6e96281a18d833dd6901f |
|
20-Oct-2015 |
Neil Roberts <neil@linux.intel.com> |
i965: Remove block arg from foreach_inst_in_block_*_starting_from Since 49374fab5d793 these macros no longer actually use the block argument. I think this is worth doing to make the macros easier to use because they already have really long names and a confusing set of arguments. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9e17c36b8ba79e688011a5fd293ad5f42da21b66 |
|
14-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Extract can_change_source_types() functions. Make them members of fs_inst/vec4_instruction for use elsewhere. Also fix the fs version to check that dst.type == src[1].type and for !saturate. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
41c474df53d9dcd5fd8e24eba5b7acc2b3c32795 |
|
15-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vs: Move URB entry_size and read_length calculations to compile_vs Reviewed-By: Eduardo Lima Mitev <elima@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4467344c829f1dccdf74e27bef2c5fda72552be6 |
|
09-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Rename brw_foo_emit to brw_compile_foo Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5d8bf6de6166a686a006478a420bcd373860e9ee |
|
08-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vs: Rework vs_emit to take a nir_shader and a brw_compiler This commit removes all dependence on GL state by getting rid of the brw_context parameter and the GL data structures. v2 (Jason Ekstrand): - Patch use_legacy_snorm_formula through as a function argument rather than trying to go through the shader key. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8f1d968704858d78d7e78a6b88db3ea2bc0cf749 |
|
06-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Remove gl_program and gl_shader_program from the generator Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5e86f5b3d21fe8e96676bb0608990d72dbf61b85 |
|
06-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs: Remove the gl_program from the generator Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
031d3501322aee0a1474c7f2a9b79f9fa9947430 |
|
26-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Unify URB entry size/read length calculations between backends. Both the vec4 and scalar VS backends had virtually identical URB entry size and read length calculations. We can move those up a level to backend-agnostic code and reuse it for both. Unfortunately, the backends need to know nr_attributes to compute first_non_payload_grf, so I had to store that in prog_data. We could use urb_read_length, but that's nr_attributes rounded up to a multiple of two, so doing so would waste a register in some cases. There's more code to be removed in the vec4 backend, but that will come in a follow-on patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ee0f0108c8e87b9cfec25bade66670bbc4254139 |
|
07-Oct-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965: Move brw_get_shader_time_index() call out of emit functions brw_get_shader_time_index() is all tangled up in brw_context state and we can't call it from the compiler. Thanks the Jasons recent refactoring, we can just get the index and pass to the emit functions instead. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ba71d581aeb96c4626500eb5b19f3bef2f40d586 |
|
05-Oct-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965: Move brw_dump_ir() out of brw_*_emit() functions We move these calls one level up into the codegen functions. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5a360dcad1fdb91f9129cb21775b9af60cbf57e4 |
|
03-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Generalize predicated break pass for use in vec4 backend. instructions in affected programs: 44204 -> 43762 (-1.00%) helped: 221 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
bf7b6fd3fd6d98305d64ee6224ca9f9e7ba48444 |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/shader: Get rid of the shader, prog, and shader_prog fields Unfortunately, we can't get rid of them entirely. The FS backend still needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still needs gl_shader_program for handling transfom feedback. However, the VS needs neither and we can substantially reduce the amount they are used. One day we will be free from their tyranny. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
404419ee1a57c79982d93eefe4de099d61ad2eee |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/fs,vec4: Get rid of the sanity_param_count It doesn't exist for anything other than an assert that, as far as I can tell, isn't possible to trip. Soon, we will remove prog from the visitor entirely and this will become even more impossible to hit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ca6a436f12cb55e9415049a217229c99b02ad3b8 |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use nir info instead of pulling things out of [shader_]prog Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ea006c4cb5eb2d98d6bfd5a6c32fcae10b636f17 |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Move binding table setup to codegen time. Setting up binding tables really has little to do with the actual process of turning shaders into instructions; it's more part of setting up prog_data. This commit moves it out of the visitors and with the rest of the prog_data setup stuff. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
28709e37d96d6b64753ca4dcce5fbfeb75f5b499 |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/shader: Pull assign_common_binding_table_offsets out of backend_shader This really has nothing to do with the backend compiler and we'd like to eventually be able to set this up earlier in the compile process. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5609e0d7b41e861a3359991e8d0f2053b255fc31 |
|
30-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Get rid of the uniform_vector_size array The uniform_vector_size array was only ever used by pack_uniform_registers which no longer needs it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ea35fb0fbead2902b1ba37e7cdb1523853fabd8b |
|
30-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use the actual channels used in pack_uniform_registers Previously, pack_uniform_registers worked based on the size of the uniform as given to us when we initially set up the uniforms. However, we have to walk through the uniforms and figure out liveness anyway, so we migh as well record the number of channels used as we go. This may also allow us to pack things tighter in a few cases. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
fc3f45234b4ff9545c84fbe8ec5261604d5ab611 |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vs: Move lazy NIR creation to codegen_vs_prog The next commit will add code to codegen_vs_prog that requires the NIR shader to be there in all cases. It doesn't hurt anything to just move it from brw_vs_emit to its only caller. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b85761d11d2abff4d45a4938b34c1c7840c97339 |
|
21-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Always use NIR GLSL IR vs. NIR shader-db results for vec4 programs on i965: total instructions in shared programs: 1499328 -> 1388354 (-7.40%) instructions in affected programs: 1245199 -> 1134225 (-8.91%) helped: 7469 HURT: 2440 GLSL IR vs. NIR shader-db results for vec4 programs on G4x: total instructions in shared programs: 1436799 -> 1325825 (-7.72%) instructions in affected programs: 1205599 -> 1094625 (-9.20%) helped: 7469 HURT: 2440 GLSL IR vs. NIR shader-db results for vec4 programs on Iron Lake: total instructions in shared programs: 1436654 -> 1325682 (-7.72%) instructions in affected programs: 1205503 -> 1094531 (-9.21%) helped: 7468 HURT: 2440 GLSL IR vs. NIR shader-db results for vec4 programs on Sandy Bridge: total instructions in shared programs: 2016249 -> 1787033 (-11.37%) instructions in affected programs: 1850547 -> 1621331 (-12.39%) helped: 14856 HURT: 1481 GLSL IR vs. NIR shader-db results for vec4 programs on Ivy Bridge: total instructions in shared programs: 1848027 -> 1648216 (-10.81%) instructions in affected programs: 1660279 -> 1460468 (-12.03%) helped: 14668 HURT: 1369 GLSL IR vs. NIR shader-db results for vec4 programs on Bay Trail: total instructions in shared programs: 1848027 -> 1648216 (-10.81%) instructions in affected programs: 1660279 -> 1460468 (-12.03%) helped: 14668 HURT: 1369 GLSL IR vs. NIR shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1848027 -> 1648216 (-10.81%) instructions in affected programs: 1660279 -> 1460468 (-12.03%) helped: 14668 HURT: 1369 I also ran our full suite of benchmarks on a Haswell and had the following statistically significant (according to ministat) changes: Test master-glsl master-nir diff bench_OglGeomPoint 461.556 463.006 1.450 bench_OglTerrainFlyInst 184.484 187.574 3.090 bench_OglTerrainPanInst 132.412 136.307 3.895 bench_OglTexFilterAniso 19.653 19.645 -0.008 bench_OglTexFilterTri 58.333 58.009 -0.324 bench_OglVSInstancing 65.049 65.327 0.278 bench_trexoff 69.474 69.694 0.220 bench_valley 40.708 41.125 0.417 v2 (Jason Ekstrand): - Remove more uses of NirOptions as a switch - New shader-db numbers - Added benchmark numbers Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6485880232df46c0cdded0b063b8841a7855bd32 |
|
28-Aug-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/vec4: Implement VS_OPCODE_GET_BUFFER_SIZE Notice that Skylake needs to include a header in the sampler message so it will need some tweaks to work there. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f2e75ac88a92ab2180de576aca298929cfce03f2 |
|
22-Sep-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/vec4: Don't coalesce regs in Gen6 MATH ops if reswizzle/writemask needed Gen6 MATH instructions can not execute in align16 mode, so swizzles or writemasking are not allowed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033 Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
10da96887c785930c2553b2d5bde91e52b8b034a |
|
21-Sep-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Detect and delete useless MOVs. With NIR: instructions in affected programs: 111508 -> 109193 (-2.08%) helped: 507 Without NIR: instructions in affected programs: 28763 -> 28474 (-1.00%) helped: 186 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a548c75e31b4146d55133cb8c57a82117c196584 |
|
05-Sep-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965: Move perf_debug code to brw_codegen_*_prog() We're trying to avoid a libdrm dependency in the core compiler, so let's move the perf_debug code one level up from the brw_*_emit() helpers to the brw_codegen_*_prog() helpers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
79f1a7ae28c37f77e08e550cd077959a2a1f8341 |
|
05-Aug-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/vec4: Fix saturation errors when coalescing registers If the register types do not match and the instruction that contains the final destination is saturated, register coalescing generated non-equivalent code. This did not happen when using IR because types usually matched, but it is visible in nir-vec4. For example, mov vgrf7:D vgrf2:D mov.sat m4:F vgrf7:F is coalesced to: mov.sat m4:D vgrf2:D The patch prevents coalescing in such scenario, unless the instruction we want to coalesce into is a MOV (without type conversion implied). In that case, the patch sets the register types to the type of the final destination. Shader-db results in HSW (only vec4 instructions shown): total instructions in shared programs: 1754415 -> 1754416 (0.00%) instructions in affected programs: 74 -> 75 (1.35%) helped: 0 HURT: 1 GAINED: 0 LOST: 0 Only one extra instruction in one of the shaders, that comes from eliminating a saturation error by preventing register coalesce. Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1037e0a84f61f4b1815093bcfd548d4b58ca106f |
|
11-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Don't reswizzle hardware registers Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91719 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d4e29af2344c06490913efc35430f93a966061bb |
|
11-Sep-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: check writemask when bailing out at register coalesce opt_register_coalesce stopped to check previous instructions to coalesce with if somebody else was writing on the same destination. This can be optimized to check if somebody else was writing to the same channels of the same destination using the writemask. Shader DB results (taking into account only vec4): total instructions in shared programs: 1781593 -> 1734957 (-2.62%) instructions in affected programs: 1238390 -> 1191754 (-3.77%) helped: 12782 HURT: 0 GAINED: 0 LOST: 0 v2: removed some parenthesis, fixed indentation, as suggested by Matt Turner v3: added brackets, for consistency, as suggested by Eduardo Lima Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0b91bcea98c0fe201bba89abe1ca3aee4d04c56c |
|
12-Aug-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
i965: add support for textureSamples function Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [v2: kayden-supplied code in fs_nir replacing need for logical opcode] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
bd6e516fc24128d604f677a16f692d88d65a49f1 |
|
23-Jul-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Add a debug option for spilling everything in vec4 code Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4f4b7c4711d98606270133dfd456acabfa8267a6 |
|
28-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Remove the brw_vue_prog_key base class. The legacy userclip fields are only used for the vertex shader, and at that point there's only program_string_id and the tex struct, which are common to all keys. So there's no need for a "VUE" key base class. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
014b90221ad5cf833bfdd55b0336771d209f0f1d |
|
28-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move legacy clip plane handling to vec4_vs_visitor. This is now only used for the vertex shader, so it makes sense to get it out of any paths run by the geometry shader. Instead of passing the gl_clip_plane array into the run() method (which is shared among all subclasses), we add it as a vec4_vs_visitor constructor parameter. This eliminates the bogus NULL parameter in the GS case. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
082b7f1876095f32578720f30fdc35771b2b3e0a |
|
28-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Delete the brw_vue_program_key::userclip_active flag. There are two uses of this flag. The primary use is checking whether we need to emit code to convert legacy gl_ClipVertex/gl_Position clipping to clip distances. In this case, we also have to upload the clip planes as uniforms, which means setting nr_userclip_plane_consts to a positive value. Checking if it's > 0 works for detecting this case. Gen4-5 also wants to know whether we're doing clipping at all, so it can emit user clip flags. Checking if output_reg[VARYING_SLOT_CLIP_DIST0] is set to a real register suffices for this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4de86e1371b0d59a5b9a787b726be3d373024647 |
|
01-Sep-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: fill src_reg type using the constructor type parameter The src_reg constructor that received the glsl_type was using it only to build the swizzle, but not to fill this->type as dst_reg is doing. This caused some type mismatch between movs and alu operations on the NIR path, so copy propagation optimization was not applied to remove unneeded movs if negate modifier was involved. This was first detected on minus (negate+add) operations. Shader DB results (taking into account only vec4): total instructions in shared programs: 20019 -> 19934 (-0.42%) instructions in affected programs: 2918 -> 2833 (-2.91%) helped: 79 HURT: 0 GAINED: 0 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8765f1d7ddfb00dc5b202e4e679ebe640a547d50 |
|
18-Aug-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Only consider fixed_hw_reg in equals() if file is HW_REG/IMM. Noticed when debugging things that lead to the next patch. On G45 (and presumably ILK) this helps register coalescing: total instructions in shared programs: 4077373 -> 4077340 (-0.00%) instructions in affected programs: 43751 -> 43718 (-0.08%) helped: 52 HURT: 2 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
34d162260f513a7eaec12611e3859bb34230cf33 |
|
08-Jul-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/vec4: Handle uniform and GRF array access on vertex programs (NIR) When the NIR-vec4 pass is enabled, handles uniform and GRF array access on ARB_vertex_program like it is done on vertex shaders. When the old IR-vec4 pass is used, emit_program_code() emits pull constant loads directly instead of using relative addressing, hence to call to move_uniform_array_access_to_pull_constants() is not needed and it is enough to call to split_uniform_registers(). The patch also calls to move_grf_array_access_to_scratch() like it is done for shaders, however I suspect this is a no-op for vertex programs and we could remove it. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
90825e3ca977057c8f3d6ad2d1aa38277cc3ff11 |
|
08-Jul-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/vec4: Enable NIR-vec4 pass on ARB_vertex_programs Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
38fc4a91cd5c04fdd5921b8776f8e203513ab517 |
|
01-Jul-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir: Enable NIR-vec4 pass on geometry shaders Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0d43d27df742ad95a086580bae2ee08a0bc00e69 |
|
23-May-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: Add a new dst_reg constructor accepting a brw_reg_type This is useful for the upcoming texture support in NIR->vec4 pass, as we found several cases where the brw_type is available, but not the glsl_type. Without this new constructor, the alternative would be: dst_reg reg(MRF, <reg>) reg.type = <brw_type> reg.writemask = <mask> Adding a new constructor makes code easier to read. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5e839727ed2378a01d3b657bad83abd4728e8da6 |
|
22-Jul-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir: Pass a is_scalar boolean to brw_create_nir() The upcoming introduction of NIR->vec4 pass will require that some NIR lowering passes are enabled/disabled depending on the type of shader (scalar vs. vector). With this patch we pass a 'is_scalar' variable to the process of constructing the NIR, to let an external context decide how the shader should be handled. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
47d68908f2c3ad3e9011a2cf910b04cd3300673a |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Select between new nir_vec4 or current vec4_visitor code-paths The NIR->vec4 pass will be activated if both the following conditions are met: * INTEL_USE_NIR environment variable is defined and is positive (1 or true) * The stage is vertex shader (support for geometry shaders and ARB_vertex_program will be added later). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f12302b89836a24255674a251f7a6902b4e9af7c |
|
29-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Get rid of brw_vs_compile completely. After tearing it out another level or two, and just passing the key and vp directly, we can finally remove this struct. It also eliminates a pointless memcpy() of the key. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
64390967c1abc326875e495f233afec6e685db72 |
|
30-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Remove 'c'/vs_compile from vec4_vs_visitor. At this point, the brw_vs_compile structure only contains the key and gl_vertex_program pointer. We may as well pass and store them directly; it's simpler and more convenient (key-> instead of vs_compile->key...). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
13372a0ce746cde6fa6e0aa3c5130e4227f123e0 |
|
29-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Move c->last_scratch into vec4_visitor. Nothing outside of vec4_visitor uses it, so we may as well keep it internal. Commit db9c915abcc5ad78d2d11d0e732f04cc94631350 for the vec4 backend. (The empty class will be going away soon.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8524deb8c8fc37abc2cb2717be64a533746a92f9 |
|
29-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Move total_scratch calculation into the visitor. This is more consistent with how we do it in the FS backend, and reduces a tiny bit of duplication. It'll also allow for a bit more tidying. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
dc776ffb900b21421158ef8efbd675bdd47593bc |
|
29-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Move perf_debug about register spilling into the visitor. This patch makes us only issue the performance warning about register spilling if we actually spilled registers. We also use scratch space for indirect addressing and the like. This is basically commit c51163b0cf7aff0375b1a5ea4cb3da9d9e164044 for the vec4 backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0163c99e8f6959b5d6c7a937a322127cfdf9315f |
|
30-Jun-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Plumb log_data through so the backend_shader field gets set. Jason plumbed this through a while back in the FS backend, but apparently we were just passing NULL in the vec4 backend. This patch passes brw in as intended. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
40801295d5a3d747661abb1e2ca64d44c0e3dc05 |
|
23-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Remove the brw_context from the visitors As of this commit, nothing actually needs the brw_context. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
bcaf4a3f077e3e3fbc66f264fe9124fa920ee70c |
|
23-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_vs: Add an explicit use_legacy_snorm_formula flag This way we can stop doing is_gles3 checks inside of the compiler. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
663f8d121d792edee5c012461bfd0b650011ff4a |
|
20-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vs: Pass the current set of clip planes through run() and run_vs() Previously, these were pulled out of the GL context conditionally based on whether we were running ff/ARB or a GLSL program. Now, we just pass them in so that the visitor doesn't have to grab them itself. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1b0f6ffa15b25e8601d60fe1ea74e893f7d33cf5 |
|
20-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Pull calls to get_shader_time_index out of the visitor Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c7893dc3c590b86787d8118e3920debaea3f16da |
|
19-Jun-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use a single index per shader for shader_time. Previously, each shader took 3 shader time indices which were potentially at arbirary points in the shader time buffer. Now, each shader gets a single index which refers to 3 consecutive locations in the buffer. This simplifies some of the logic at the cost of having a magic 3 a few places. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6e255a3299c9ec5208cb5519b5da2edb0ce2972b |
|
17-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Add compiler options to brw_compiler This creates the options at screen cration time and then we just copy them into the context at context creation time. We also move is_scalar to the brw_compiler structure. We also end up manually setting some values that the core would have set by default for us. Fortunately, there are only two non-zero shader compiler option defaults that we aren't overriding anyway so this isn't a big deal. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d7565b7d65f8203c20735a61b86e9158b8ec4447 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Remove the dependance on brw_context from the generators Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e639a6f68e701f23b977a49c45d646c164991d36 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Plumb compiler debug logging through a function pointer in brw_compiler v2 (Ken): Make shader_debug_log a printf-like function. v3 (Jason): Add a void * to pass the brw_context through Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0f8ec779ddff4126837a7d4216ecf1d4b97e93d2 |
|
12-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Create a shader_dispatch_mode enum to replace VS/GS fields. We used to store the GS dispatch mode in brw_gs_prog_data while separately storing the VS dispatch mode in brw_vue_prog_data::simd8. This patch introduces an enum to represent all possible dispatch modes, and stores it in brw_vue_prog_data::dispatch_mode, unifying the two. Based on a suggestion by Matt Turner. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b95ec49e57f81bdd75795dc93022533704efe509 |
|
20-May-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vs: Rework the logic for generating NIR from ARB vertex programs Whether or not to use NIR is now equivalent to brw->scalar_vs. We can simplify the logic and make it far less confusing. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
99cb4233205edcfa1a1e2967eef7bb16ff19bec4 |
|
20-May-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Rename backend_visitor to backend_shader The backend_shader class really is a representation of a shader. The fact that it inherits from ir_visitor is somewhat immaterial. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3687d752e51829b4723c9abb07ae56d2bbcda570 |
|
12-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Combine the fs_visitor constructors. For scalar GS support, we either need to add a fourth constructor which takes the GS structures, or combine the existing two and pass the shader stage. Given that they're not significantly different, I opted for the latter. v2: Remove more stuff from the .h file (Jason and Jordan). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
76c1086f2dfb37a1edf6d2df6eebbe11ccbfc50b |
|
24-Mar-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Change header_present to header_size in backend_instruction Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3da9f708d4f1375d674fae4d6c6eb06e4c8d9613 |
|
20-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Perform basic optimizations on the FIND_LIVE_CHANNEL opcode. v2: Save some CPU cycles by doing 'return progress' rather than 'depth++' in the discard jump special case. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f2fad0dc80627e853eea558498f18a9fa769992e |
|
19-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Perform basic optimizations on the BROADCAST opcode. v2: Style fixes. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f118e5d15fd9b35cf27a975a702c5fb81d3157aa |
|
23-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Add typed surface access opcodes. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0775d8835ac8d1f2ab75d04f0cddbad36b6787fe |
|
23-Apr-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Add untyped surface write opcode. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
20915130ace4cc0f700ece2a99c0353581a156bb |
|
26-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Add support for untyped surface message sends from GRF. This doesn't actually enable untyped surface message sends from GRF yet, the upcoming atomic counter and image intrinsic lowering code will. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
17233f9bbcbf570f0c7633c63dbd5ed88634ed60 |
|
21-Apr-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
i965: Add brw_setup_tex_for_precompile. Use in VS, GS & FS. Suggested-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1ac7db07b363207e8ded9259f84bbcaa084b8667 |
|
12-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Unhardcode a few more stage names and abbreviations. The stage_abbrev and stage_name fields in backend_visitor provide what we need without any additional effort. It also means we'll get the right names for compute shaders, SIMD8 geometry shaders, and both kinds of tessellation shaders. This does unfortunately change the capitalization of the stage abbreviation in the INTEL_DEBUG=optimizer output filenames. It doesn't seem worth adding code to handle, though. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
dffc1a0ae3a75d426f10c5d3ba021de977467929 |
|
25-Apr-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Remove unnecessary NULL check on generate_code() result. Code generation is not allowed to fail for any reason - in fact, fs_generator has no mechanism for failing. The visitor is responsible for that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
28e9601d0e681411b60a7de8be9f401b0df77d29 |
|
16-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Add a devinfo field to backend_visitor and use it for gen checks Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
89c1feb78d010bc457f5d02be84c955eebf3549f |
|
08-Apr-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Create NIR during LinkShader() and ProgramStringNotify(). Previously, we translated into NIR and did all the optimizations and lowering as part of running fs_visitor. This meant that we did all of that work twice for fragment shaders - once for SIMD8, and again for SIMD16. We also had to redo it every time we hit a state based recompile. We now generate NIR once at link time. ARB programs don't have linking, so we instead generate it at ProgramStringNotify time. Mesa's fixed function vertex program handling doesn't bother to inform the driver about new programs at all (which is rather mean), so we generate NIR at the last minute, if it hasn't happened already. shader-db runs ~9.4% faster on my i7-5600U, with a release build. v2: Check NirOptions != NULL in ProgramStringNotify(). Don't bother using _mesa_program_enum_to_shader_stage as we already know it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
bff421332661bfd0f82ab9eee9e4fec9d06ed1a1 |
|
03-Apr-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Check the INTEL_USE_NIR environment variable once at context creation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
31dc63d5ca090fed3f1adcd4fd0db2f1f7aa19f7 |
|
25-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Use NIR for ARB_vertex_program support on Gen8+. Everything is already in place; we simply have to take the scalar code generation path. This gives us SIMD8 VS programs, instead of SIMD4x2. v2: Rebase on the patch that drops brw->gen >= 8. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ef09cfb51e0c1cc9e3c6f370813a843a6ecaa4e2 |
|
25-Mar-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Drop unnecessary brw->gen >= 8 check from scalar VS code. brw->scalar_vs already implies that brw->gen >= 8. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e6e655ef76bb22193b31af2841cb50fda0c39461 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Define helpers to calculate the common live interval of a range of variables. These will be especially useful when we start keeping track of liveness information for each subregister. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
588859e18cb597612e56980a65a762ef069363e4 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Fix handling of multiple register reads and writes in split_virtual_grfs(). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9304f60cbe7c348a4771a7746606730bea3ae45f |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Fix handling of multiple register reads and writes in opt_register_coalesce(). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
516d45f78a3bbab0288c49c0f876ebdf4ad05bff |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Some more trivial swizzle clean-up. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
430c6bf70e48c08ba4dc9e00f2b88e2230793010 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Improve src_reg/dst_reg conversion constructors. This simplifies the src_reg/dst_reg conversion constructors using the swizzle utils introduced in a previous patch. It also makes them more useful by changing their semantics slightly: dst_reg(src_reg) used to set the writemask to XYZW if the src_reg swizzle was anything other than XXXX, which was almost certainly not what the caller intended if the swizzle was non-trivial. After this patch the same components that are present in the swizzle will be enabled in the resulting writemask. src_reg(dst_reg) used to set the first components of the swizzle to the enabled components of the writemask and then replicate the last enabled component to fill the swizzle, which, in cases where the writemask didn't have exactly the first n components set, would in general not be compatible with the original dst_reg. E.g.: | ADD(tmp, src_reg(tmp), src_reg(1)); would *not* do what one would expect (add one to each of the enabled components of tmp) if tmp didn't have a writemask of the described form (e.g. YZ, YW, XZW would all fail). This pattern actually occurs in many different places in the VEC4 back-end, it's a wonder that it hasn't caused piglit failures until now. After this patch src_reg(dst_reg) will construct a swizzle with each enabled component at its natural position (e.g. Y at the second position, Z at the third, and so on). The resulting swizzle will behave like the identity when used in any instruction with the original writemask. I've manually verified that *none* of the callers of both conversion constructors were relying on the previous broken semantics. There are no piglit regressions on any generation. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
62fd3353387547504966d77f3350afc9b688ef93 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Pass argument by reference to src_reg/dst_reg conversion constructors. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
23bda945f570b4f566ed39b4c1de89a957247df7 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Remove swizzle_for_size() in favour of brw_swizzle_for_size(). It could be objected that swizzle_for_size() is "faster" than brw_swizzle_for_size(). It's not measurably better in any reasonable CPU-bound benchmark on VLV according to the Finnish benchmarking system (including the SynMark2 DrvShComp shader compilation benchmark). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9a17e4e900256b5be73d935fa5f35c98b3b0d7fe |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Simplify opt_register_coalesce() using the swizzle utils. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
05ec72d8ecdba04a81745fbc3ca0df40c7fb8828 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Simplify reswizzle() using the swizzle utils. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7b30493dc4f0b1346fe4c1fe52211f0c0d7ed229 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Simplify opt_reduce_swizzle() using the swizzle utils. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7e816c7feb8cffa878546eee363240b1b66d5c42 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Fix signedness of dst_reg::writemask. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b0d422cd2a99d2fd26ab11880d5d8410ebfc64b2 |
|
16-Mar-2015 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Print spills:fills and number of promoted constants. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
78df9d5e30fbca8b0795594448a3bcae05d5f5f2 |
|
05-Mar-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Handle saturate in dump_instruction(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
63d6d09a3b3790c5ec00f2cbc06f58c82ae40b0c |
|
03-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Don't attempt to reduce swizzles of send from GRF instructions. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
eb47d0efd39d73d4388389d6c0ebe458160f79fa |
|
05-Feb-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Optimize multiplication by -1 into a negated MOV. instructions in affected programs: 968 -> 942 (-2.69%) helped: 4 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
55de910f909ac668ec7ea8fd94ec4f235b0d0335 |
|
11-Feb-2015 |
Eric Anholt <eric@anholt.net> |
i965: Quiet another compiler warning about uninitialized values. The compiler can't tell that we're always going to hit the first if block on the first time through the loop. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b40bcd24e0c86fb02c226261c1fe46fb362be217 |
|
04-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Don't set any dependency control bits for F32TO16 on Gen8. It's expanded to several instructions. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
530445330b403d835a4027b41388b5eea8c2e1ab |
|
03-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Init mlen for several send from GRF instructions. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
de666fc102b805707c7033b203c5b76ccbbcef8d |
|
05-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Fix the scheduler to take into account reads and writes of multiple registers. v2: Avoid nested ternary operators in vec4_instruction::regs_read(). (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8ad486077e122c19b603750e19dd678bb7793d5b |
|
05-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Make vec4_visitor::implied_mrf_writes() return zero for sends from GRF. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
388b136e677e30249e062145b488c2d938c1ef17 |
|
05-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Implement equals() method for dst_reg too. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
dfe957c02b753dbb5b372e768a5677f577daf9ef |
|
06-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Move up fs_inst::flag_subreg to backend_instruction. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
447879eb88b8df41ad32cf4406cc636b112b72d9 |
|
10-Feb-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965: Factor out virtual GRF allocation to a separate object. Right now virtual GRF book-keeping and allocation is performed in each visitor class separately (among other hundred different things), leading to duplicated logic in each visitor and preventing layering as it forces any code that manipulates i965 IR and needs to allocate virtual registers to depend on the specific visitor that happens to be used to translate from GLSL IR. v2: Use realloc()/free() to allocate VGRF book-keeping arrays (Connor). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8030e269e911c4f90a44d9a77eb342dd2657d229 |
|
03-Dec-2014 |
Ben Widawsky <benjamin.widawsky@intel.com> |
i965/vec4: Correct MUL destination hazard As it turns out, we were over-thinking the cause of the hang on Cherryview. It's simply errata for Cherryview. commit 88fea85f09e2252035bec66ab26c375b45b000f5 Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Fri Nov 21 10:47:41 2014 -0800 i965/vec4/gen8: Handle the MUL dest hazard exception This is an explanation to why we never saw the hang on BDW. NOTE: The problem the original patch was trying to fix does still exist. It will have to be fixed at some point. v2: Modify commit message, s/CHV/BDW Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84212 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
94e7b59a75fc2ecc51a74196f6cd198546603b85 |
|
05-Jan-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Convert CMP.GE -(abs)reg 0 -> CMP.Z reg 0. total instructions in shared programs: 5952059 -> 5951603 (-0.01%) instructions in affected programs: 138812 -> 138356 (-0.33%) GAINED: 1 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
41d9f232b6a7f53086b9c428cca30e45905abd48 |
|
12-Jan-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Make sure that imm writes are to registers in the same file. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87887
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3167a80bb1119616b70fbbcf2661d3fb511a6034 |
|
13-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix "vertex" vs. "geometry" and "VS" vs. "GS" in debug output. We were happily printing "Native code for unnamed vertex shader" and "VS vec4" program for geometry shaders in our INTEL_DEBUG=gs output, as well as the KHR_debug output used by shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
68ed14d6adcaf4b91216fc1c53792e88d1fd024d |
|
13-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Pass a shader stage abbreviation to fs_generator(). A lot of messages hardcoded the string "FS", which is confusing on Broadwell, where we use this code for VS support as well. shader-db particularly got confused, as it reported two "FS SIMD8" shaders, and no vertex shaders at all. Craziness ensued. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0b98b2bf535d6e6b6b02c0d47ea03f98adf42f15 |
|
01-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Make the precompile ignore DEPTH_TEXTURE_MODE on Gen7.5+. Gen7.5+ platforms that support the "Shader Channel Select" feature leave key->tex.swizzles[i] as SWIZZLE_NOOP except when GL_DEPTH_TEXTURE_MODE is GL_ALPHA (which is really uncommon). So, the precompile should leave them as SWIZZLE_NOOP (aka SWIZZLE_XYZW) as well. We didn't notice this because prog->ShadowSamplers is not set correctly. The next patch will fix that problem. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
408e298942ffb03c00e05dce2569c291df6bec49 |
|
01-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix INTEL_DEBUG=optimizer with VF types. Hardcoding stderr is wrong; INTEL_DEBUG=optimizer uses other files. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9b8bd67768769b685c25e1276e053505aede5f93 |
|
01-Jan-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Show opt_vector_float() and later passes in INTEL_DEBUG=optimizer. In order to support calling opt_vector_float() inside a condition, this patch makes OPT() a statement expression: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html We've used that elsewhere already. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
798c094e6266bf53b332f332e82a90c338c49915 |
|
21-Dec-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Do separate copy followed by constant propagation after opt_vector_float(). total instructions in shared programs: 5877012 -> 5876617 (-0.01%) instructions in affected programs: 33140 -> 32745 (-1.19%) From before the commit that allows VF constant propagation (which hurt some programs) to here, the results are: total instructions in shared programs: 5877951 -> 5876617 (-0.02%) instructions in affected programs: 123444 -> 122110 (-1.08%) with no programs hurt. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
bbdd3198a5f778ba55c037e4af86d88b06ca4e95 |
|
20-Dec-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Do CSE, copy propagation, and DCE after opt_vector_float(). total instructions in shared programs: 5869005 -> 5868220 (-0.01%) instructions in affected programs: 70208 -> 69423 (-1.12%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
44573458bdc52acc304fb75d6df502312b8e149c |
|
20-Dec-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add pass to gather constants into a vector-float MOV. Currently only handles consecutive instructions with the same destination that collectively write all channels. total instructions in shared programs: 5879798 -> 5869011 (-0.18%) instructions in affected programs: 465236 -> 454449 (-2.32%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7bc6e455e231076bfac6c678c375ea4aca94ebf0 |
|
21-Dec-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Add support for saturating immediates. I don't feel great about assert(!"unimplemented: ...") but these cases do only seem possible under some currently impossible circumstances. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3978585bccf69ff8f607cad0de025ea91c418587 |
|
20-Dec-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Add fs_reg/src_reg constructors that take vf[4]. Sometimes it's easier to generate 4x values into an array, and the memcpy is 1 instruction, rather than 11 to piece 4 arguments together. I'd forgotten to remove the prototype from fs_reg from a previous patch, so it's already there for us here. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8517e665bc4c378e8e7523827090fd1b06abaecd |
|
12-Dec-2014 |
Andres Gomez <agomez@igalia.com> |
i965/brw_reg: struct constructor now needs explicit negate and abs values. We were assuming, when constructing a new brw_reg struct, that the negate and abs register modifiers would not be present by default in the new register. Now, we force explicitly setting these values when constructing a new register. This will avoid problems like forgetting to properly set them when we are using a previous register to generate this new register, as it was happening in the dFdx and dFdy generation functions. Fixes piglit test shaders/glsl-deriv-varyings Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82991 Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ee5fb8d1ba7f50ed94e1a34fa0f6e15a0588145e |
|
21-Oct-2014 |
Kristian Høgsberg <krh@bitplanet.net> |
i965: Generate vs code using scalar backend for BDW+ With everything in place, we can now use the scalar backend compiler for vertex shaders on BDW+. We make scalar vertex shaders the default on BDW+ but add a new vec4vs debug option to force the vec4 backend. No piglit regressions. Performance impact is minimal, I see a ~1.5 improvement on the T-Rex GLBenchmark case, but in general it's in the noise. Some of our internal synthetic, vs bounded benchmarks show great improvement, 20%-40% in some cases, but real-world cases are mostly unaffected. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
bf2307937995212895375d1e258d50207da3d24e |
|
25-Nov-2014 |
Kristian Høgsberg <krh@bitplanet.net> |
i965: Rename brw_vec4_prog_data/key to brw_bue_prog_data/key These structs aren't vec4 specific, they are shared by shader stages operating on Vertex URB Entries (VUEs). VUEs are the data structures in the URB that hold vertex data between the pipeline geometry stages. Using vue in the name instead of vec4 makes a lot more sense, especially when we add scalar vertex shader support. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0d3cc01b0b092271938ce2cf2b77d27dc385e4d8 |
|
24-Oct-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Allow CSE on uniform-vec4 expansion MOVs. Three source instructions cannot directly source a packed vec4 (<0,4,1> regioning) like vec4 uniforms, so we emit a MOV that expands the vec4 to both halves of a register. If these uniform values are used by multiple three-source instructions, we'll emit multiple expansion moves, which we cannot combine in CSE (because CSE emits moves itself). So emit a virtual instruction that we can CSE. Sometimes we demote a uniform to to a pull constant after emitting an expansion move for it. In that case, recognize in opt_algebraic that if the .file of the new instruction is GRF then it's just a real move that we can copy propagate and such. total instructions in shared programs: 5822418 -> 5812335 (-0.17%) instructions in affected programs: 351841 -> 341758 (-2.87%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
afd605f3461462ba1b9f522b079ff5a03e7ab55c |
|
01-Dec-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Make vertex color clamp handling code VS specific. Vertex color clamping only applies to gl_[Secondary]{Front,Back}Color, which are compatibility-only built-in varyings. We only support GS in core profile, so they can't exist in geometry shaders. We can drop several dirty bits from the GS program key - they're unnecessary for a core profile implementation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5df88c2096281f416b2738debac1c4c329e29673 |
|
03-Nov-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Rewrite dead code elimination to use live in/out. Improves 359 shaders by >=10% 114 shaders by >=20% 91 shaders by >=30% 82 shaders by >=40% 22 shaders by >=50% 4 shaders by >=60% 2 shaders by >=80% total instructions in shared programs: 5845346 -> 5822422 (-0.39%) instructions in affected programs: 364979 -> 342055 (-6.28%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e14c7c7faff3c204a5eefc1f2ea487d4730b8382 |
|
10-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add VEC4_OPCODE_PACK_4_BYTES. Will be used by emit_pack_{s,u}norm_4x8().
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
cb0ba848d4176c1ed2c4542fd5875867f460fc3b |
|
09-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add vector float immediate infrastructure. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
88fea85f09e2252035bec66ab26c375b45b000f5 |
|
21-Nov-2014 |
Ben Widawsky <benjamin.widawsky@intel.com> |
i965/vec4/gen8: Handle the MUL dest hazard exception Fix one of the few cases where we can't reliable touch the destination hazard bits. I am explicitly doing this patch individually so it is easy to backport. I was tempted to do this patch before the previous patch which reorganized the code, but I believe even doing that first, this is still easy to backport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84212 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
156f565f9eb36dad3cd959952724bc54f9ff21ea |
|
21-Nov-2014 |
Ben Widawsky <benjamin.widawsky@intel.com> |
i965/vec4: Extract depctrl hazards Move this to a separate function so that we can begin to add other little caveats without making too big a mess. NOTE: There is some desire to improve this function eventually, but we need to fix a bug first. v2: Use const for the inst for the hazard check (Matt) Invert safe logic to get rid of the double negative (Matt) Add PRM reference for predicates (Matt) Add note about empirical evidence for math (Matt) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7d560a3861ff30aa9d8ec872cf9cd7d72a980eb2 |
|
21-Oct-2014 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Silence unused parameter warning in brw_dump_ir Just remove the parameter. Silences: brw_program.c: In function 'brw_dump_ir': brw_program.c:566:33: warning: unused parameter 'brw' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b52126b44f40643aa2c0986c1d51330f4e4130b5 |
|
27-Sep-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Optimize sqrt+inv into rsq. Transform sqrt a, b rcp c, a into sqrt a, b rsq c, b In most cases the sqrt's result is still used, so the improvement here is that we've broken a dependency between these instructions. Leads to 80 fewer INV instructions and 80 more RSQ. Occasionally the sqrt's result is no longer used, leading to: instructions in affected programs: 5005 -> 4949 (-1.12%) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
189ac077644c4ef2c6c15080b6d094410c74abdc |
|
27-Sep-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Call opt_algebraic after opt_cse. The next patch adds an algebraic optimization for the pattern sqrt a, b rcp c, a and turns it into sqrt a, b rsq c, b but many vertex shaders do a = sqrt(b); var1 /= a; var2 /= a; which generates sqrt a, b rcp c, a rcp d, a If we apply the algebraic optimization before CSE, we'll end up with sqrt a, b rsq c, b rcp d, a Applying CSE combines the RCP instructions, preventing this from happening. No shader-db changes. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
72bb3f81c621931e42759148bc8bddc511266dd0 |
|
02-Sep-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Don't iterate between blocks with inst->next/prev. The register coalescing portion of this patch hurts three shaders in Guacamelee by one instruction each, but examining the diff makes me believe that what we were generating was (perhaps harmlessly) incorrect.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
90bfeb22444df6ce779251522e47bf169e130f8e |
|
01-Sep-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Don't use instruction list after calculating the cfg. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a4fb8897a2bd00eefa8a503ec17d45e791bced91 |
|
01-Sep-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Remove now unneeded calls to calculate_cfg(). Now that nothing invalidates the CFG, we can calculate_cfg() immediately after emit_fb_writes()/emit_thread_end() and never again. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
072ea414d04f1b9a7bf06a00b9011e8ad521c878 |
|
01-Sep-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Remove cfg-invalidating parameter from invalidate_live_intervals. Everything has been converted to preserve the CFG. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
269b6e24d6ec61d8d8d0c5d1b3d1bfa4f4a55f5f |
|
25-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Preserve CFG in spill_reg(). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b0b64c85e4a0dafbb46405e4b3c17be24b63347f |
|
25-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Preserve the CFG in a few more places. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c66165ab2b15047792808433b788632a4b9df287 |
|
01-Aug-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Fix binding table clash between TF surfaces and textures. For gen6 geometry shaders we use the first BRW_MAX_SOL_BINDINGS entries of the binding table for transform feedback surfaces. However, vec4_visitor will setup the binding table so that textures use the same space in the binding table. This is done when calling assign_common_binding_table_offsets(0) as part if its run() method. To fix this clash we add a virtual method to the vec4_visitor hierarchy to assign the binding table offsets, so that we can change this behavior specifically for gen6 geometry shaders by mapping textures right after the first BRW_MAX_SOL_BINDINGS entries. Also, when there is no user-provided geometry shader, we only need to upload the binding table if we have transform feedback, however, in the case of a user-provided geometry shader, we can't only look into transform feedback to make that decision. This fixes multiple piglit tests for textureSize() and texelFetch() when these functions are called from a geometry shader in gen6, like these: bin/textureSize gs sampler2D -fbo -auto bin/texelFetch gs usampler2D -fbo -auto Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2c85132e511bbef9a0965c69848981b1bffb5bad |
|
09-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_URB_WRITE_ALLOCATE. Gen6 geometry shaders need to allocate URB handles for each new vertex they emit after the first (the URB handle for the first vertex is obtained via the FF_SYNC message). This opcode adds the URB allocation mechanism to regular URB writes. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d0bdd4ce983ddd52f9f4b70dced4e471c60a130c |
|
09-Jul-2014 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/gen6/gs: Implement GS_OPCODE_FF_SYNC. This implements the FF_SYNC message required in gen6 geometry shaders to get the initial URB handle. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
667f758788f0796d9be16f0f361022d447f622f5 |
|
09-Sep-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965/vec4: slightly improve insn dumping with no srcs Previously, we would get a trailing ', ' which looked strange. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6b6145204dd4a1112f6e1fe10162636141495b79 |
|
11-Sep-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Separate gl_InstanceID and gl_VertexID uploading. We always uploaded them together, mostly out of laziness - both required an additional vertex element. However, gl_VertexID now also requires an additional vertex buffer for storing gl_BaseVertex; for non-indirect draws this also means uploading (a small amount of) data. This is extra overhead we don't need if the shader only uses gl_InstanceID. In particular, our clear shaders currently use gl_InstanceID for doing layered clears, but don't need gl_VertexID. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
87472ae58cf2a5c812630f4eabd485931d243e0c |
|
05-Sep-2014 |
Matt Turner <mattst88@gmail.com> |
i965/fs: Brown bag fix.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e8df6a6b32aae7695ce010f18588c51cb7d18978 |
|
31-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add ability to reswizzle arbitrary swizzles. Before commit 04895f5c we would only reswizzle dot product instructions (since they wrote the same value into all channels, and we didn't have to think about anything else). That commit extended reswizzling to cases when the swizzle was single valued -- i.e., writing the same result into all channels. But allowing reswizzling of arbitrary things is actually really easy and is even less code. (Why didn't we do this in the first place?!) total instructions in shared programs: 4266079 -> 4261000 (-0.12%) instructions in affected programs: 351933 -> 346854 (-1.44%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1ee1d8ab468cafd25cfcc513319f3f046492947f |
|
31-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Reswizzle sources when necessary. Despite the comment above the function claiming otherwise, the function did not reswizzle sources, which would lead to bad code generation since commit 04895f5c, which began claiming we could do such swizzling when we could not. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82932 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f92fbd554f2e9e702a2bd650c9b2571a3f4f1ab8 |
|
02-Sep-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move curb_read_length/total_scratch to brw_stage_prog_data. All shader stages have these fields, so it makes sense to store them in the common base structure, rather than duplicating them in each. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1c573c9adbb8bb95bc10f6ade76a430684918160 |
|
28-Aug-2014 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Don't segfault when debug-logging a null program Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
20a849b4aa63c7fce96b04de674a4c70f054ed9c |
|
13-Jul-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use basic-block aware insertion/removal functions. To avoid invalidating and recreating the control flow graph. Also stop invalidating the CFG in places we didn't add or remove an instruction. cfg calculations: 202951 -> 80307 (-60.43%) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
04895f5c601b240df547739da786b7c2b65bdd1e |
|
15-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Allow reswizzling writemasks when swizzle is single-valued. total instructions in shared programs: 4288033 -> 4266151 (-0.51%) instructions in affected programs: 930915 -> 909033 (-2.35%)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9a071e3339afcf6fd937ae31121fa3b3face3bfe |
|
18-Aug-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add a pass to reduce swizzles. total instructions in shared programs: 4344280 -> 4288033 (-1.29%) instructions in affected programs: 397468 -> 341221 (-14.15%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a3d0ccb037082f3aa66bd558dfbe89f63a6eedd3 |
|
12-Jul-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Pass a cfg pointer to generate_{code,assembly}. The loop over all instructions is now two-fold, over all of the blocks and all of the instructions in each block. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
596990d91e2a4c4a3a303c6c2da623bf1840771b |
|
12-Jul-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Add and use foreach_block macro. Use this as an opportunity to rename 'block_num' to 'num'. block->num is clear, and block->block_num has always been redundant.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
972e87ca30b4c4b7f6269e5f9fe8c5cb6356f744 |
|
14-Aug-2014 |
Pekka Paalanen <pekka.paalanen@collabora.co.uk> |
i965: fix compiler error in union initiliazer gcc 4.6.3 chokes with the following error: brw_vec4.cpp: In member function 'int brw::vec4_visitor::setup_uniforms(int)': brw_vec4.cpp:1496:37: error: expected primary-expression before '.' token Apparently C++ does not do named initializers for unions, except maybe as a gcc extension, which is not present here. As .f is the first element of the union, just drop it. Fixes the build error. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2c50212b14da27de4e3da62488ae4e35c069d84e |
|
11-Aug-2014 |
Neil Roberts <neil@linux.intel.com> |
i965: Store uniform constant values in a gl_constant_value instead of float The brw_stage_prog_data struct previously contained an array of float pointers to the values of parameters. These were then copied into a batch buffer to upload the values using a regular assignment. However the float values were also being overloaded to store integer values for integer uniforms. This can break if x87 floating-point registers are used to do the assignment because the fst instruction tries to fix up invalid float values. If an integer constant happened to look like an invalid float value then it would get altered when it was copied into the batch buffer. This patch changes the pointers to be gl_constant_value instead so that the assignment should end up copying without any alteration. This also makes it more obvious that the values being stored here are overloaded for multiple types. There are some static asserts where the values are uploaded to ensure that the size of gl_constant_value is the same as a float. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81150 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f17bfc9ba954608c58fd0560f255e40eef7e7cea |
|
11-Aug-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Never use the Gen8 code generators. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
074d472398b3cc7f32fe5c0cc742853cf66fabed |
|
30-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Switch to the EU emit layer for code generation on Broadwell. Everything should be in place to unify code generation between Gen4-7 and Gen8+. We should be able to drop the Gen8 generators at this point. However, leave them hooked up for a brief moment, for testing and comparison purposes. Set GEN8=1 to use the old Gen8+ code generator paths. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
36a4a6bbdca0c30e16d56e6b406ea7c94831048f |
|
22-Jul-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Port INTEL_DEBUG=optimizer to the vec4 backend. Largely via copy and paste. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3e9105f7eefae97c928034662f67019973b9e483 |
|
12-Jul-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Use foreach_inst_in_block a couple more places. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1761671b0627ce8e1c0eae721e1fca5c2d04690e |
|
12-Jul-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Replace cfg instances with calls to calculate_cfg(). Avoids regenerating it unnecessarily. Every program in shader-db improved, none by an amount less than a 1/3 reduction. One Dota2 shader decreased from 62 -> 24. cfg calculations: 429492 -> 193197 (-55.02%) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1854ead64ca465ca03e8e5369cd1749bc92c315a |
|
06-Jul-2014 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: Avoid crashing while dumping vec4 insn operands We'd otherwise go looking into virtual_grf_sizes for things that aren't in there at all. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3c8dc48ad1d4061a2a1d0b9ea3126350b98274f0 |
|
06-Mar-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Add basic common subexpression elimination. [mattst88]: Modified to perform CSE on instructions with the same writemask. Offered no improvement before. total instructions in shared programs: 1995633 -> 1995185 (-0.02%) instructions in affected programs: 14410 -> 13962 (-3.11%) Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
34ef6a7651d6651e0bca77c4d4b890af582ad360 |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Move is_zero/one/null/accumulator into backend_reg. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
53992a102ffddf2e0fad401252cfc1c034d022ad |
|
30-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use immediate storage in brw_reg for visitor regs. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
489ec685542590c7412db81623952c1aa75d946f |
|
19-May-2014 |
Eric Anholt <eric@anholt.net> |
i965: Update a ton of comments about constant buffers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c0f1929dd23bbc558e9eef0f8fd40e10dfef3c21 |
|
19-May-2014 |
Eric Anholt <eric@anholt.net> |
i965: Move dispatch_grf_start_reg and first_curbe_grf into stage_prog_data. I wanted to access this value from stage-generic code, so stop storing it under two different names. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3d826729dabab53896cdbb1f453c76fab1c7e696 |
|
29-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use unreachable() instead of unconditional assert(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
266109736a9a69c3fdbe49fe1665a7a63c5cc122 |
|
25-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use typed foreach_in_list_safe instead of foreach_list_safe. Acked-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c5030ac0ac15d3c91c4352789f94281da9a9dcad |
|
25-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use typed foreach_in_list instead of foreach_list. Acked-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
46659d46a8c2f7bbc8deb472faff2dccbde92d29 |
|
24-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Make can_do_source_mods() a member of the instruction classes. Pretty nonsensical to have it as a method of the visitor just for access to brw. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d0575d98fc595dcc17706dc73d1eb461027ca17a |
|
14-Jun-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Fix dead code elimination for VGRFs of size > 1. When faced with code such as: mov vgrf31.0:UD, 960D mov vgrf31.1:UD, vgrf30.xxxx:UD The dead code eliminator didn't consider reg_offsets, so it decided that the second instruction was writing was writing to the same register as the first one, and eliminated the first one. But they're actually different registers. This fixes INTEL_DEBUG=shader_time for vertex shaders. In the above code, vgrf31.0 represents the offset into the shader_time buffer where the data should be written, and vgrf31.1 represents the actual time data. With a completely undefined offset, results were...unexpected. I think this is probably one of the few cases (maybe only case) where we generate multiple MOVs to a large VGRF. Normally, we just use them as texturing results; the other SEND-from-GRF uses a size 1 VGRF. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79029 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7b9cf797903a5ea70072a28c0486d3e99ee60645 |
|
06-Mar-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Make src_reg::equals() take a constant reference, not a pointer. This is more typical C++ style. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
56d6dcf4f771d57d2759b2a5c5006f24444c696f |
|
29-May-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Give dump_instruction() a FILE* argument. Use function overloading rather than default arguments, since gdb doesn't know about default arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
306ed81b9363721058c568244f9860c5c8c819f4 |
|
04-Apr-2014 |
Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> |
i965: Add writes_accumulator flag Our hardware has an "accumulator" register, which can be used to store intermediate results across multiple instructions. Many instructions can implicitly write a value to the accumulator in addition to their normal destination register. This is enabled by the "AccWrEn" flag. This patch introduces a new flag, inst->writes_accumulator, which allows us to express the AccWrEn notion in the IR. It also creates a n ALU2_ACC macro to easily define emitters for instructions that implicitly write the accumulator. Previously, we only supported implicit accumulator writes from the ADDC, SUBB, and MACH instructions. We always enabled them on those instructions, and left them disabled for other instructions. To take advantage of the MAC (multiply-accumulate) instruction, we need to be able to set AccWrEn on other types of instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
30c35d1dcb2fde19b1c968751fda5151b795d257 |
|
09-Apr-2014 |
Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> |
i965: Add is_accumulator() function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
602510395a96a1f6ca29189e4f5cfb3f07f21d23 |
|
13-Feb-2014 |
Mike Stroyan <mike@LunarG.com> |
i965: Avoid dependency hints on math opcodes Putting NoDDClr and NoDDChk dependency control on instruction sequences that include math opcodes can cause corruption of channels. Treat math opcodes like send opcodes and suppress dependency hinting. Signed-off-by: Mike Stroyan <mike@LunarG.com> Tested-by: Tony Bertapelli <anthony.p.bertapelli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
596737ee91cc199a8edff5dc440736471e28f297 |
|
24-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Let DCE eliminate dead writes in other basic blocks. We previously stopped searching for unread writes after encountering control flow, but we can instead just search backwards until we hit control flow. instructions in affected programs: 22854 -> 22194 (-2.89%)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
20dee82a75ac7415fba0b3540a1f99d60b2325db |
|
01-Apr-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Consider sources of non-GRF-dst instructions for dead channels. Previously we'd ignore the sources of instructions with non-GRF destinations when calculating calculating the dead channels. This would lead to us incorrectly removing the first instruction in this sequence: mov vgrf11, ... cmp.ne.f0 null, vgrf11, 1.0 mov vgrf11, ... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76616
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e14cc504f307a7fa88c8b6757df53026aaa39b08 |
|
02-Apr-2014 |
Tapani Pälli <tapani.palli@intel.com> |
i965/vec4: do not trim dead channels on gen6 for math Do not set a writemask on Gen6 for math instructions, those are executed using align1 mode that does not support a destination mask. v2: cleanups, better comment (Matt) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76883 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3a8bd9724196075da76ddcb50eff4867c5a37398 |
|
29-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Don't trim writemasks of texture instructions. It was my understanding that the writemask works in SIMD4x2 mode for texturing instructions and doesn't require a message header. Some bit of this logic must be wrong, so disable it until it's understood. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76617 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
764e25d79dad3096274ab2df04f5aa3ffb232119 |
|
19-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Eliminate dead writes to the flag register. For each write, search previous instructions for unread writes to the flag register and remove them. Note that this will not eliminate the last unread write. total instructions in shared programs: 788074 -> 788004 (-0.01%) instructions in affected programs: 4930 -> 4860 (-1.42%) Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9cd51bb0c4608258199c69bc7738e72f055799d2 |
|
11-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Eliminate writes that are never read. With an awful O(n^2) algorithm that searches previous instructions for dead writes. total instructions in shared programs: 805582 -> 788074 (-2.17%) instructions in affected programs: 144561 -> 127053 (-12.11%) Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1b8f143a2302739de90cb643d732e12b55d4e4eb |
|
12-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Factor code out of DCE into a separate function. Will be reused in the next commit. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9630ba6c6e754b438cf67c7d76ec1c99488df3ba |
|
11-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Let dead code eliminate trim dead channels. That is, modify mad dst, a, b, c to be mad dst.xyz, a, b, c if dst.w is never read. total instructions in shared programs: 811869 -> 805582 (-0.77%) instructions in affected programs: 168287 -> 162000 (-3.74%) Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
dc0f5099fa3cb564c25eb892fde93cacd29df8f1 |
|
11-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Track live ranges per-channel, not per vgrf. Will be squashed with the next patch. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
89ccd11eebeee884d581e831b61368ac97057b43 |
|
11-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Don't dead code eliminate instructions writing the flag. A future patch adds support for removing dead writes to the flag register. This patch simplifies the logic until then. total instructions in shared programs: 811813 -> 811869 (0.01%) instructions in affected programs: 3378 -> 3434 (1.66%) Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3a12f50f9ca7f03f470ee053b9076ac12c4d486d |
|
11-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Preparatory clean up of dead_code_eliminate(). Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
10dd6eca89951e0cb40e21c3b53caa33d8fcb383 |
|
13-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add is_null() method to dst_reg. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0884ce8f42d0e04e889c6d0e4dde91f9aa58e85e |
|
13-Mar-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Print the predicate in dump_instructions(). Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
01d9023a9b9a50b42f7a4ef4799d0e35e0b045ca |
|
11-Mar-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix register types in dump_instructions(), again. In commit e57d77280efcbfd6579a88f071426653287ef833, I fixed this for destinations in the Vec4 backend, and sources in the scalar backend. But not both types in both backends. To prevent this mess from continuing, make the reg_encoding table static, so only the disassembler can use it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a290cd039cc07330598a101e74d25289ce70bcee |
|
18-Feb-2014 |
Topi Pohjolainen <topi.pohjolainen@intel.com> |
i965: Merge resolving of shader program source Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
59989a4a92e638415d50e9acdd0685eb56eb17f3 |
|
27-Feb-2014 |
Petri Latvala <petri.latvala@intel.com> |
i965: Assert array index on access to vec4_visitor's arrays. v2: vec4_visitor::pack_uniform_registers(): Use correct comparison in the assert, this->uniforms is already adjusted. Compare the actual value used to index uniform_size and uniform_vector_size instead. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a76e5dce4fc8d50f8699c108833f24e80167d706 |
|
23-Dec-2013 |
Eric Anholt <eric@anholt.net> |
i965: Move compiler debugging output to stderr. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f28c9208652143b4925bd97ce9823728c34d34a5 |
|
21-Feb-2014 |
Eric Anholt <eric@anholt.net> |
i965: Refactor debug dumping of GLSL IR. This was only going to get worse when tesselation shows up, and was causing too much extra duplication in my stderr changes coming up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c2ebbe2728cd709029313f4b9c9cc53432c510a1 |
|
20-Feb-2014 |
Eric Anholt <eric@anholt.net> |
i965: Stop throwing away our double precision for time calculations. Fixes negative times being reported in our perf debug. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
42b226ef824ed61ccf51fa9a1198cba305ad5472 |
|
19-Feb-2014 |
Francisco Jerez <currojerez@riseup.net> |
i965: Make sure that backend_reg::type and brw_reg::type are consistent for fixed regs. And define non-mutating helper functions to retype fixed and normal regs with a common interface. At some point we may want to get rid of ::fixed_hw_reg completely and have fixed regs use the normal register data members (e.g. backend_reg::reg to select a fixed GRF number, src_reg::swizzle to store the swizzle, etc.), I have the feeling that this is not the last headache we're going to get because of the multiple ways to represent the same thing and the different register interface depending on the file a register is stored in... Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ae8b066da5862b4cfc510b3a9a0e1273f9f6edd4 |
|
19-Feb-2014 |
Francisco Jerez <currojerez@riseup.net> |
i965: Move up duplicated fields from stage-specific prog_data to brw_stage_prog_data. There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and the simplification of the stage-specific brw_*_prog_data_compare. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7f00c5f1a3e0db20a89cfedefd53cbe817fec9e3 |
|
23-Nov-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Add constructor of src_reg from a fixed hardware reg. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b424da4be07ab8d34986e6f3824c679b623df952 |
|
28-Nov-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Fix confusion between SWIZZLE and BRW_SWIZZLE macros. Most of the VEC4 back-end agrees on src_reg::swizzle being one of the BRW_SWIZZLE macros defined in brw_reg.h, except in two places where we use Mesa's SWIZZLE macros. There is even a doxygen comment saying that Mesa's macros are the right ones. They are incompatible swizzle representations (3 bits vs. 2 bits per component), and the code using Mesa's works by pure luck. Fix it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e57d77280efcbfd6579a88f071426653287ef833 |
|
05-Feb-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix register types in dump_instructions(). This regressed when I converted BRW_REGISTER_TYPE_* to be an abstract type that doesn't match the hardware description. dump_instruction() was using reg_encoding[] from brw_disasm.c, which no longer matches (and was incorrect for Gen8+ anyway). This patch introduces a new function to convert the abstract enum values into the letter suffix we expect. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reported-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
ce527a6722491fa7d696266d5dec13f0b72bf8e8 |
|
10-Dec-2013 |
Topi Pohjolainen <topi.pohjolainen@intel.com> |
i965: rename tex_ms to tex_cms Prepares for the introduction of non-compressed multi-sampled lookup used in the blorp programs. v2: now also taking into account gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
71bc11a37508542662132b16a53acd5f541cd2b4 |
|
05-Dec-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Print reg_offset for vgrf of size > 1 in dump_instruction(). Previously we wouldn't print the +0 for the first part of a VGRF of size greater than 1. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a4d68e9ee94cf4855a3240c3516279b4e7740268 |
|
17-Jan-2014 |
Paul Berry <stereotype441@gmail.com> |
i965: Add GS support to INTEL_DEBUG=shader_time. Previously, time spent in geometry shaders would be counted as part of the vertex shader time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9eb568d7531eb4715be24d5076353ea6c10c8ceb |
|
07-Dec-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Create a new vec4 backend for Broadwell. This replaces the old vec4_generator backend. v2: Port to use the C-based instruction representation. Also, remove Geometry Shader offset hacks - the visitor will handle those instead of this code. v3: Texturing fixes (including adding textureGather support). v4: Pass brw_context to gen8_instruction functions as required. v5: Add SHADER_OPCODE_TXF_MCS support; port DUAL_INSTANCED gs fixes (caught by Eric). Simplify ADDC/SUBB handling; add comments to gen8_set_dp_message calls (suggested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
26a3bf5c726199d7664d5878ef1f73592e55caa7 |
|
28-Nov-2013 |
Eric Anholt <eric@anholt.net> |
i965: Stop doing our optimization on a copy of the GLSL IR. The original intent was that we'd keep a driver-private copy, and there would be the normal copy for swrast to make use of without the tuning (or anything more invasive we might do) specific to i965. Only, we don't generate swrast code any more, because swrast can't render current shaders anyway. Thus, our private copy is rather a waste, and we can just do our backend-specific operations on the linked shader. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7629c489c88a6f6dd47b311a90ad64e216c9a37c |
|
29-Nov-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: Add shader opcode for sampling MCS surface Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8814806c97ed60c5bb4d6cb1927cd05445864388 |
|
21-Oct-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Print conditional mod in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
637dda1c307aee921ecc646b75f891deab6585a9 |
|
02-Dec-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Print argument types in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
729fe77e3bdf64768e8447c281f249ac80c1b9a2 |
|
02-Dec-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Don't print swizzles for immediate values. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
2b8e0a73fbc021305fdcab7a3c6661de7af911a9 |
|
02-Dec-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Print negate and absolute value for src args. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a85f1b7adf1023667fea090242ba448d935eaa67 |
|
26-Nov-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add support for printing HW_REGs in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0e4053234df5e3461e80c90dfd743c3ac96006eb |
|
26-Nov-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Don't print extra (null) arguments in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d2fcdd0973ee33a2627d1dee6d78091e605af160 |
|
29-Nov-2013 |
Matt Turner <mattst88@gmail.com> |
i965/cfg: Clean up cfg_t constructors. parent_mem_ctx was unused since db47074a, so remove the two wrappers around create() and make create() the constructor. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a97cd0f4d7902965d5173f4bcbf2ad27c0eb5d12 |
|
30-Oct-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Add a pass to remove dead control flow. Removes IF/ENDIF and IF/ELSE/ENDIF with no intervening instructions. total instructions in shared programs: 1360393 -> 1360387 (-0.00%) instructions in affected programs: 157 -> 151 (-3.82%) (no change in vertex shaders) Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1c263f8f4f767df0511e63377c17a95ebebba944 |
|
11-Nov-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Add invalidate_live_intervals method. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
34fe051e215107dddbaae71e2edf15f88d839936 |
|
20-Oct-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965: Add a 'has_side_effects' back-end instruction predicate. This patch fixes the three dead code elimination passes and the VEC4/FS instruction scheduling passes so they leave instructions with side effects alone. At some point it might be interesting to have the instruction scheduler calculate the exact memory dependencies between atomic ops, but they're rare enough that it seems unlikely that it will make any practical difference. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6032261682388ced64bd33328a5025f561927a38 |
|
16-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965: Merge together opcodes for SHADER_OPCODE_GEN4_SCRATCH_READ/WRITE I'm going to be introducing gen7 variants, and the previous naming was going to get confusing. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5e621cb9fef7eada5a3c131d27f5b0b142658758 |
|
11-Sep-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/gen7: Implement code generation for untyped surface read instructions.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
cfaaa9bbb7a6ab5819f4fa9e38352b72d6293cff |
|
11-Sep-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965/gen7: Implement code generation for untyped atomic instructions. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6bb2cf2107c4461ea9dd100edaf110b839311b90 |
|
08-Oct-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: Add SHADER_OPCODE_TG4_OFFSET for gather with nonconstant offsets. The generator code ends up clearer this way than if we had to sniff via the message length. Implemented via the gather4_po message in hardware, which is present in Gen7 and later. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
89647cffb31ee1ea42d581b1053b4bb147b3e58a |
|
16-Oct-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: if register allocation fails, don't try to schedule. Otherwise the scheduler would be invoked with prog_data->total_grf == 0, causing havoc. In a future patch, this will allow us to try compiling a geometry shader in DUAL_OBJECT mode with spilling disabled, and then fall back to DUAL_INSTANCED mode if that failed. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8bb15813e3047820a95724e4257aa2c862eeb31a |
|
16-Oct-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Add the ability for attributes to be interleaved. When geometry shaders are operated in "single" or "dual instanced" mode, a single set of geometry shader inputs is interleaved into the thread payload (with each payload register containing a pair of inputs) in order to save register space. This patch modifies vec4_visitor::lower_attributes_to_hw_regs so that it can handle the interleaved format. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e0f34301b29ecf3fb7118b2e05872510c104a49b |
|
23-Oct-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Extract function to set up vec4 prog key for precompiling. This will allow us to re-use it for precompiling geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
068df64ba6a8309427612836e5eb384721ca6d40 |
|
23-Oct-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Remove uses_clip_distance from program key. This should never have been in the program key in the first place, since it's determined by the shader source, not by GL state. Change the code to just refer to gl_program::UsesClipDistanceOut directly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
705a90e30435490c2de84f4f6741cab335fa7608 |
|
03-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965: Move the common binding table offset code to brw_shader.cpp. Now that both vec4 and fs are dynamically assigning offsets, a lot of the code is the same. v2: Avoid passing around the next offset through the class. (Review by Paul) Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d395485e1df44853cdf86b0bd46b7af36c7e1c13 |
|
03-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965/vec4: Dynamically assign the VS/GS binding table offsets. Note that the dropped comment in brw_context.h is mostly (better written) in brw_binding_table.c as well. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3c9dc2d31b80fc73bffa1f40a91443a53229c8e2 |
|
02-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965: Make a brw_stage_prog_data for storing the SURF_INDEX information. It would be nice to be able to pack our binding table so that programs that use 1 render target don't upload an extra BRW_MAX_DRAW_BUFFERS - 1 binding table entries. To do that, we need the compiled program to have information on where its surfaces go. v2: Rename size to size_bytes to be more explicit. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
a5ec01fb1bd4ad5418eb16cb05e6f6929d1444e8 |
|
20-Sep-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Don't copy prop source mods into instructions that can't take them.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e7dc88026a821a31bf2afeb934dded11c91401a1 |
|
20-Sep-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Fixup for don't dead-code eliminate instructions that write to the accumulator. Accidentally pushed an old version of the patch. v2: Set destination register using brw_null_reg(). Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
92dc16c3e2e2b9e3e71baaccc67bbe727e9d68ab |
|
20-Sep-2013 |
Matt Turner <mattst88@gmail.com> |
i965: Don't dead-code eliminate instructions that write to the accumulator. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
fb455500bfb11cca0f45076a9eaccc0ddd764731 |
|
31-Mar-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: add SHADER_OPCODE_TG4 Adds the Gen7 message IDs, a new SHADER_OPCODE_TG4 pseudo-op, and low-level support for emitting it via generate_tex(). V3: Updated for changes in master. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4e3d1712a223f9f0b4ff4a34b9b5447a92877347 |
|
28-Aug-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Detect GRF sources in split_virtual_grfs send-from-GRF code. It is incorrect to assume that src[0] of a SEND-from-GRF opcode is the GRF. VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 uses an IMM as src[0], and stores the GRF as src[1]. To be safe, loop over all the source registers and mark any GRFs. We probably won't ever have more than one, but it's simpler to just check all three rather than attempting to bail early. Fixes assertion failures in Unigine Sanctuary since we started making register allocation rely on split_virtual_grfs working. (The register classes were actually sufficient, we were just interpreting an IMM as a virtual GRF number.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68637 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: mesa-stable@lists.freedesktop.org
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
09e2df5961cfe04925bdd820e6ea59af3ba783f6 |
|
30-Aug-2013 |
Eric Anholt <eric@anholt.net> |
i965/vs: Fix regression on pre-gen6 with no VS uniforms in use. df06745c5adb524e15d157f976c08f1718f08efa made it so that we didn't allocate extra uniform space for unused clip planes, which also incidentally made us not allocate any space at all, which we were relying on for this no-uniforms case. Instead of putting the knowledge of this special HW exception into the thing that normally preallocates prog_data for us, just allocate it here. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68766 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4416cb79926f089ff55dbbb352b94ec2890ae823 |
|
23-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/gs: Add GS_OPCODE_THREAD_END. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
96eb2f353605b382cf4fc930cc1d322130b12270 |
|
21-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/gs: Add GS_OPCODE_URB_WRITE. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7f57101ad53112b16e4a400f312f68a85dfbd108 |
|
13-Jul-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Virtualize setup_payload instead of setup_attributes. When I initially generalized the vec4_visitor class in preparation for geometry shaders, I assumed that the setup_attributes() function would need to be different between vertex and geometry shaders, but its caller, setup_payload(), could be shared. So I made setup_attributes() a virtual function. It turns out this isn't true; setup_payload() needs to be different too, since the geometry shader payload sometimes includes an extra register (primitive ID) that has to come before uniforms. So setup_payload() needs to be the virtual function instead of setup_attributes(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
626495d269e2c2df9dae5c46c086ffff93c77a19 |
|
13-Jul-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Allow for dispatch_grf_start_reg to vary. Both 3DSTATE_VS and 3DSTATE_GS have a dispatch_grf_start_reg control, which determines the register where the hardware delivers data sourced from the URB (push constants followed by per-vertex input data). For vertex shaders, we always set dispatch_grf_start_reg to 1, since R1 is always the first register available for push constants in vertex shaders. For geometry shaders, we'll need the flexibility to set dispatch_grf_start_reg to different values depending on the behvaiour of the geometry shader; if it accesses gl_PrimitiveIDIn, we'll need to set it to 2 to allow the primitive ID to be delivered to the thread in R1. This patch eliminates the assumption that dispatch_grf_start_reg is always 1. In vec4_visitor, we record the regnum that was passed to vec4_visitor::setup_uniforms() in prog_data for later use. In vec4_generator, we consult this value when converting an abstract UNIFORM register to a concrete hardware register. And in the code that emits 3DSTATE_VS, we set dispatch_grf_start_reg based on the value recorded in prog_data. This will allow us to set dispatch_grf_start_reg to the appropriate value when compiling geometry shaders. Vertex shaders will continue to always use a dispatch_grf_start_reg of 1. v2: Make dispatch_grf_start_reg "unsigned" rather than "GLuint". Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
72168f5f0069b2a0d8a2434ba80f4446952e84c7 |
|
15-Aug-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Move vec4 data structures and functions to brw_vec4.{cpp,h}. This patch moves the following things into brw_vec4.{cpp,h}: - struct brw_vec4_compile - struct brw_vec4_prog_key - brw_vec4_prog_data_compare() - brw_vec4_prog_data_free() This will allow us to avoid having to include brw_vs.h in geometry-shader-specific files. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5fb13d871e062354a77a427b3a3fe7f3d6908e5a |
|
20-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Stop including brw_vs.h from brw_vec4.h. This is backwards from what we are going to want in the long term, which is: - brw_vec4.h declares general-purpose vec4 infrastructure needed by both VS and GS - brw_vs.h includes brw_vec4.h and adds VS-specific parts. - brw_gs.h includes brw_vec4.h and adds GS-specific parts. Note that at the moment brw_vec.h contains a fair amount of VS-specific declarations--I plan to address that in a later patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c642bd3dcc1a6f1039732c614ab8a56dd3779427 |
|
15-Aug-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Plumb brw_vec4_prog_data into vec4_generator(). This will be useful for the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
53631be4ebaa4fb13a7f129727c1cdd32fcc6f3d |
|
06-Jul-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move intel_context::gen and gt fields to brw_context. Most functions no longer use intel_context, so this patch additionally removes the local "intel" variables to avoid compiler warnings. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b15f1fc3c6b3b9dc4422940c412f80e581c9900d |
|
03-Jul-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move intel_context::perf_debug to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
329779a0b45b63be17627f026533c80b2c8f7991 |
|
03-Jul-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move intel_context::batch to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
426ca34b7a2c3b9edfc0189daece8de3aff80627 |
|
13-Jun-2013 |
Eric Anholt <eric@anholt.net> |
glsl: Remove ir_print_visitor.h includes and usage We have ir->print() to do the old declaration of a visitor and having the IR accept the visitor (yuck!). And now you can call _mesa_print_ir() safely anywhere that you know what an ir_instruction is. A couple of missing printf("\n")s are added in error paths -- when an expression is handed to the visitor, it doesn't print '\n' (since it might be a step in printing a whole expression tree). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6220cc931f15ddb428ea481e8b9a70ce26ca3304 |
|
28-May-2013 |
Eric Anholt <eric@anholt.net> |
i965/vs: Fix implied_mrf_writes() for integer division pre-gen6. Previously it would assertion fail in debug builds (though the correct value was returned in a non-debug build). Marking it as a candidate for stable even though it has no current consumers in the stable branches, in case one shows up in a later backport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64727 NOTE: This is a candidate for stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0f3068a58bdbceb2cb93e3848b0e2145629cdf43 |
|
01-May-2013 |
Eric Anholt <eric@anholt.net> |
i965/vs: Make virtual grf live intervals actually cover their used range. This is the same change as the previous commit to the FS. A very few VSes are regressed by 1 or 2 instructions, which look recoverable with a bit more dead code elimination. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
573d8813fdbb116f4500d2044c56d80aab73ab7f |
|
01-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add instruction scheduling. While this is ignorant of dependency control, it's still good for a 0.39% +/- 0.08% performance improvement on GLBenchmark 2.7 (n=548) v2: Rewrite as a subclass of the base class for the FS instruction scheduler, inheriting the same latency information. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
63c8155b09bca7917631ec678a0d0db6e7965a1a |
|
29-Apr-2013 |
Eric Anholt <eric@anholt.net> |
i965: Make dump_instructions be a virtual method of the visitor. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5e46482993dfd30b888d5219f6fecf4b4d1f42de |
|
28-Apr-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move is_math/is_tex/is_control_flow() to backend_instruction. These are entirely based on the opcode, which is available in backend_instruction. It makes sense to only implement them in one place. This changes the VS implementation of is_tex() slightly, which now accepts FS_OPCODE_TXB and SHADER_OPCODE_LOD. However, since those aren't generated in the VS anyway, it should be fine. This also makes is_control_flow() available in the VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
417d8917d4924652f1cd0c64dbf3677d4eddbf8c |
|
16-Apr-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vec4: Fix hypothetical use of uninitialized data in attribute_map[]. Fixes issue identified by Klocwork analysis: 'attribute_map' array elements might be used uninitialized in this function (vec4_visitor::lower_attributes_to_hw_regs). The attribute_map array contains the mapping from shader input attributes to the hardware registers they are stored in. vec4_vs_visitor::setup_attributes() only populates elements of this array which, according to core Mesa, are actually used by the shader. Therefore, when vec4_visitor::lower_attributes_to_hw_regs() accesses the array to lower a register access in the shader, it should in principle only access elements of attribute_map that contain valid data. However, if a bug ever caused the driver back-end to access an input that was not flagged as used by core Mesa, then lower_attributes_to_hw_regs() would access uninitialized memory, which could cause illegal instructions to get generated, resulting in a possible GPU hang. This patch makes the situation more robust by using memset() to pre-initialize the attribute_map array to zero, so that if such a bug ever occurred, lower_attributes_to_hw_regs() would generate a (mostly) harmless access to r0. In addition, it adds assertions to lower_attributes_to_hw_regs() so that if we do have such a bug, we're likely to discover it quickly. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
dea70404eb615bfa148fbd0fec5670fb2657c47b |
|
11-Apr-2013 |
Eric Anholt <eric@anholt.net> |
i965: Fix a warning in the release build. This was copy and pasted from can_reswizzle_dst(), and we can just fold it in instead to avoid the warning. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
195a6cca3cbc26eeea0f7f8dfc21dd3429911779 |
|
11-Apr-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vs: Print error if vertex shader fails to compile. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
32a8e877666f7c3798d736bb6f05ad2f41356ebf |
|
11-Apr-2013 |
Matt Turner <mattst88@gmail.com> |
i965: NULL check prog on shader compilation failure. Also change if (shader) to if (prog) for consistency. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e9fa3a94486d80da34542cfd24425c208a8d30fe |
|
23-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Don't hardcode DEBUG_VS in generic vec4 code. Since the vec4_visitor and vec4_generator classes are going to be re-used for geometry shaders, we can't enable their debug functionality based on (INTEL_DEBUG & DEBUG_VS) anymore. Instead, add a debug_flag boolean to these two classes, so that when they're instantiated the caller can specify whether debug dumps are needed. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
444fce6398556118629ef01204a7d8ff7af0bea3 |
|
22-Mar-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Generalize attribute setup code in preparation for GS. This patch introduces a new function, vec4_visitor::lower_attributes_to_hw_regs(), which replaces registers of type ATTR in the instruction stream with the hardware registers that store those attributes. This logic will need to be common between the vertex and geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
9bb6840b28a9a77377d437198c62d705cade5370 |
|
17-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Generalize data structures pointed to by vec4_generator. This patch removes the following field from vec4_generator, since it is not used: - struct brw_vs_compile *c And changes the following field: - struct gl_vertex_program *vp => struct gl_program *prog With these changes, vec4_generator no longer refers to any VS-specific data structures. This will pave the way for re-using it for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Use the name "prog" rather than "p". Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5743bea0ba1eda07be831d95c5b7729f9ba98a92 |
|
17-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: move VS-specific data members to vs_vec4_visitor. This patch moves the following data structures from vec4_visitor to vec4_vs_visitor, since they contain VS-specific data: - struct brw_vs_compile *c (renamed to vs_compile) - struct brw_vs_prog_data *prog_data (renamed to vs_prog_data) - src_reg *vp_temp_regs - src_reg vp_addr_reg Since brw_vs_compile and brw_vs_prog_data also contain vec4-generic data, the following pointers are added to the base class, to allow it to access the vec4-generic portions of these data structures: - struct brw_vec4_compile *c - struct brw_vec4_prog_key *key - struct brw_vec4_prog_data *prog_data Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> v2: Use shorter names in the base class and longer names in the derived class. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8941f73c7ccb3c6cfa965a19f346e4b6ead6abdb |
|
17-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Make some vec4_visitor functions virtual. This patch makes the following vec4_visitor functions virtual, since they will need to be implemented differently for vertex and geometry shaders. Some of the functions are renamed to reflect their generic purpose, rather than their VS-specific behaviour: - setup_attributes - emit_attribute_fixups (renamed to emit_prolog) - emit_vertex_program_code (renamed to emit_program_code) - emit_urb_writes (renamed to emit_thread_end) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
e9be5a05f70be7cff58b29bff07af71e6d339085 |
|
16-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Make vec4_vs_visitor class derived from vec4_visitor. This patch just creates the derived class; later patches will migrate VS-specific functions and data structures from the base class into the derived class. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
5fff3752c88255ea3f4eb26cddb2c996694b33b1 |
|
17-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: split brw_vs_prog_data into generic and VS-specific parts. This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Put urb_read_length and urb_entry_size in the generic struct. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0c994f181ce1a09cdbb7db27e4ad5565248bf8e1 |
|
16-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: split brw_vs_prog_key into generic and VS-specific parts. This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
09cd6e06d2c7a54ca6eb8d3102822efa78e01a9c |
|
16-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Remove brw_vs_prog_data pointer from brw_vs_compile. In patches that follow, we'll be splitting structs brw_vs_prog_data and brw_vs_compile into a vec4-generic base struct and a VS-specific derived struct (this will allow the vec4-generic code to be re-used for geometry shaders). Having brw_vs_compile point to brw_vs_prog_data makes it difficult to do this cleanly. Fortunately most of the functions that use brw_vs_compile (those in the vec4_visitor class) already have access to brw_vs_prog_data through a separate pointer (vec4_visitor::prog_data). So all we have to do is use that pointer consistently, and plumb prog_data through the few remaining functions that need access to it. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b29613371c316e9273ebe29ba37fb2f04c2ed58d |
|
16-Feb-2013 |
Paul Berry <stereotype441@gmail.com> |
i965/vs: Make type of vec4_visitor::vp more generic. The vec4_visitor functions don't use any VS specific data from vec4_visitor::vp. So rename it to "prog" and change its type from struct gl_vertex_program * to struct gl_program *. This will allow the code to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Use the name "prog" rather than "p". Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
fe97f26c86d65b1b0e026c725c7da348a91093d9 |
|
09-Apr-2013 |
Paul Berry <stereotype441@gmail.com> |
i965: Rename backend_visitor::prog to shader_prog. The next patch is going to change the type of vec4_visitor::vp from struct gl_vertex_program * to struct gl_program *, and rename it. The sensible name to change it to is vec4_visitor::prog. However, prog is already used in backend_visitor (which vec4_visitor derives from). Since backend_visitor::prog is of type struct gl_shader_program *, it makes sense to rename it to shader_prog. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d5f7aebac2b1afbc5023cd114174860d8d763d06 |
|
04-Apr-2013 |
Eric Anholt <eric@anholt.net> |
i965/vs: Use GRFs for pull constant offsets on gen7. This allows the computation of the offset to get written directly into the message source. shader-db results: total instructions in shared programs: 3308390 -> 3283025 (-0.77%) instructions in affected programs: 442998 -> 417633 (-5.73%) No difference in GLB2.7 low res (n=9). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3badbf7f7fa4898c69545fea3c60ea29cf61ae3b |
|
05-Apr-2013 |
Eric Anholt <eric@anholt.net> |
i965/vs: When asked to make a dst_reg for a src.xxxx, just write to src.x. We have several places in our pull constant handling where we make a temporary src_reg for an int, and then turn it into a dst. In doing so, we were writing to the dst.xyzw, so we never register coalesced it with a later mov from dst.x to real_dst.x. These extra channels written would be removed if we had channel-wise DCE in the backend, but we don't. Fix it for now by just not writing these extra channels that won't get used. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
4fee05b020af72ee802d4349de76fbc36cdd53a9 |
|
01-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add a pass to set dependency control fields on instructions. This is a more aggressive version of the old brw_optimize() path. Reduces cycles spent in the vertex shader on minecraft by 18.6% +/- 10.0% (n=15). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
20d846ce8b46604ced835eb68079a0dbae2e19dc |
|
12-Mar-2013 |
Eric Anholt <eric@anholt.net> |
i965: Add names for all instructions to dump_instruction() in FS and VS. I'd previously added the minimum names to understand my dumps, but this makes dumps in general much easier to read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
6192e9b377c6fa4f36da42af6c06ca32b10e7e62 |
|
20-Mar-2013 |
Eric Anholt <eric@anholt.net> |
i965/vs: Include URB payload setup in shader_time. This much more accurately reflects the cost of the vertex shader, since the payload setup is often a significant fraction of the instructions in the VS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
55feb19704ae69c580f431d6498344521de369cd |
|
18-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Use a send from a 2-register VGRF for shader time writes. This will let us emit it later, after we're setting up MRFs for the URB write. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
130138030a3dc8bda20766146ca9fda4047133d3 |
|
18-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Teach copy propagation about sends from GRFs. This incidentally also teaches it a bit about gen6 math -- we now allow unswizzled, unmodified GRF temps as the sources for math. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c3a22d42a88c299561dd913d0a00bb986921eeba |
|
18-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Prepare split_virtual_grfs() for the presence of SENDs from GRFs. v2: Fix silly bool handling, and don't add new tabs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d2ba1c24b440ee74436335d8e815be9b72b1ba7f |
|
19-Mar-2013 |
Eric Anholt <eric@anholt.net> |
i965: Track ARB program state along with GLSL state for shader_time. This will let us do much better printouts for non-GLSL programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d24819dce8cf6dac23f27df46fabbf756a732229 |
|
11-Mar-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Add IR dumping for immediates. This makes dump_instructions more useful. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
db3a0f13ef13b6d392dfc3b7346351533600d343 |
|
11-Mar-2013 |
Eric Anholt <eric@anholt.net> |
i965: Split shader_time entries into separate cachelines. This avoids some snooping overhead between EUs processing separate shaders (so VS versus FS). Improves performance of a minecraft trace with shader_time by 28.9% +/- 18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4). v2: Add a define for the stride with a comment explaining its units and why. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
14cec07177f438717cc6fb9252525e16d6b3d8dd |
|
22-Feb-2013 |
Eric Anholt <eric@anholt.net> |
i965: Make perf_debug() output to GL_ARB_debug_output in a debug context. I tried to ensure that performance in the non-debug case doesn't change (we still just check one condition up front), and I think the impact is small enough in the debug context case to warrant including all of it. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f52ce6a0ca73d1cd89091689efd8ea2e14748723 |
|
24-Jan-2013 |
Chris Forbes <chrisf@ijw.co.nz> |
i965: add a new virtual opcode: SHADER_OPCODE_TXF_MS This is very similar to the TXF opcode, but lowers to `ld2dms` rather than `ld` on Gen7. V4: - add SHADER_OPCODE_TXF_MS to is_tex() functions, so regalloc thinks it actually writes the correct number of registers. Otherwise in nontrivial shaders some of the registers tend to get clobbered, producing bad results. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f0213b124259804ce8e114575fe9058dffdf5864 |
|
13-Feb-2013 |
Matt Turner <mattst88@gmail.com> |
i965/vs/gen7: Allow MATH instructions to have MRF as a destination total instructions in shared programs: 346873 -> 346847 (-0.01%) instructions in affected programs: 364 -> 338 (-7.14%) (All affected shaders are from Lightsmark) Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
d5efc14635cf25bc130bfa77737913913d9202ce |
|
21-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965: Add asserts to check that we don't realloc ParameterValues. Things are even more restrictive than they used to be, so I've made mistakes in this area. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c9e48e5b083b6cf97ecdb2d17c874ea631203b06 |
|
02-Aug-2012 |
Eric Anholt <eric@anholt.net> |
i965: Generalize VS compute-to-MRF for compute-to-another-GRF, too. No statistically significant performance difference on glbenchmark 2.7 (n=60). It reduces cycles spent in the vertex shader by 3.3% +/- 0.8% (n=5), but that's only about .3% of all cycles spent according to the fixed shader_time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
471af25fc57dc43a8277b4b17ec82547287621d0 |
|
01-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Extend opt_compute_to_mrf to handle limited "reswizzling" The way our visitor works, scalar expression/swizzle results that get stored in channels other than .x will have an intermediate MOV from their result in the .x channel to the real .y (or whatever) channel, and similarly for vec2/vec3 results. By knowing how to adjust DP4-type instructions for optimizing out a swizzled MOV, we can reduce instructions in common matrix multiplication cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f74560f3fb516971e6a7b03a2382db2f58699f59 |
|
10-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965: Scale shader_time to compensate for resets. Some shaders experience resets more than others, which skews the numbers reported. Attempt to correct for this by linearly scaling according to the number of resets that happen. Note that will not be accurate if invocations of shaders have varying times and longer invocations are more likely to reset. However, this should at least be better than the previous situation.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
338b5f887d462bbe7ef58a233cd00619e43415f0 |
|
10-Dec-2012 |
Eric Anholt <eric@anholt.net> |
i965: Adjust the split between shader_time_end() and shader_time_write(). I'm about to emit other kinds of writes besides time deltas, and it turns out with the frequency of resets, we couldn't really use the old time delta write() function more than once in a shader.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
71f06344a0d72a6bd27750ceca571fc016b8de85 |
|
27-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965: Add a debug flag for counting cycles spent in each compiled shader. This can be used for two purposes: Using hand-coded shaders to determine per-instruction timings, or figuring out which shader to optimize in a whole application. Note that this doesn't cover the instructions that set up the message to the URB/FB write -- we'd need to convert the MRF usage in these instructions to GRFs so that our offsets/times don't overwrite our shader outputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) v2: Check the timestamp reset flag in the VS, which is apparently getting set fairly regularly in the range we watch, resulting in negative numbers getting added to our 32-bit counter, and thus large values added to our uint64_t. v3: Rebase on reladdr changes, removing a new safety check that proved impossible to satisfy. Add a comment to the AOP defs from Ken's review, and put them in a slightly more sensible spot. v4: Check timestamp reset in the FS as well.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b126228f1247fb0fed686ee3ef2c87461f2fc7a7 |
|
30-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965: Include codegen time in the INTEL_DEBUG=perf stall detection. In the VS case, we were missing the entire compile time in the stall detection! Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
0f06864ba566eaff5b739a9d0fba5ed7eaadd60b |
|
30-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965: Don't leak the IR annotation into later instructions. After walking our IR instructions (Mesa or GLSL), we don't want to also mark the start of the FB/URB writes or whatever as being that IR. This can end up being misleading when the end of the IR visit got copy propagated out to a later instruction in the URB writes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c1023608002c985b9d72edc64732cd666de2a206 |
|
27-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Move struct brw_compile (p) entirely inside vec4_generator. The brw_compile structure contains the brw_instruction store and the brw_eu_emit.c state tracking fields. These are only useful for the final assembly generation pass; the earlier compilation stages doesn't need them. This also means that the code generator for future hardware won't have access to the brw_compile structure, which is extremely desirable because it prevents accidental generation of Gen4-7 code. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
eda9726ef51dcfd3895924eb0f74df8e67aa9c3a |
|
27-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Split final assembly code generation out of vec4_visitor. Compiling shaders requires several main steps: 1. Generating VS IR from either GLSL IR or Mesa IR 2. Optimizing the IR 3. Register allocation 4. Generating assembly code This patch splits out step 4 into a separate class named "vec4_generator." There are several reasons for doing so: 1. Future hardware has a different instruction encoding. Splitting this out will allow us to replace vec4_generator (which relies heavily on the brw_eu_emit.c code and struct brw_instruction) with a new code generator that writes the new format. 2. It reduces the size of the vec4_visitor monolith. (Arguably, a lot more should be split out, but that's left for "future work.") 3. Separate namespaces allow us to make helper functions for generating instructions in both classes: ADD() can exist in vec4_visitor and create IR, while ADD() in vec4_generator() can create brw_instructions. (Patches for this upcoming.) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8af8a26480e9e71fb1501b675f21a469c1699b9b |
|
27-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Move uses of brw_compile from do_vs_prog to brw_vs_emit. The brw_compile structure is closely tied to the Gen4-7 hardware encoding. However, do_vs_prog is very generic: it just calls out to get a compiled program and then uploads it. This isn't ultimately where we want it, but it's a step in the right direction: it's now closer to the code generator. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
746fc346eae21d227b06799f3e82a1404c75bdc9 |
|
27-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Rework memory contexts for shader compilation data. During compilation, we allocate a bunch of things: the IR needs to last at least until code generation...and then the program store needs to last until after we upload the program. For simplicity's sake, just keep it all around until we upload the program. After that, it can all be freed. This will also save a lot of headaches during the upcoming refactoring. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
403bb1d306c5bc23ad9e2c26fd39071e6e41f665 |
|
27-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Pass the brw_context pointer into vec4_visitor and do_vs_prog. We used to steal it out of the brw_compile struct...but vec4_visitor isn't going to have one of those in the future. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
dd50c88386c8220f4631115b68a10930378ead6c |
|
27-Nov-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Move some functions from brw_vec4_emit.cpp to brw_vec4.cpp. This leaves only the final code generation stage in brw_vec4_emit.cpp, moving the payload setup, run(), and brw_vs_emit functions to brw_vec4.cpp. The fragment shader backend puts these functions in brw_fs.cpp, so this patch also helps with consistency. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
10ff6772c8054aea12ac0f08e2e3898fd4a7f76b |
|
25-Oct-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Don't lose the MRF writemask when doing compute-to-MRF. Consider the following code sequence: mul(8) g4<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF mov.sat(8) m1<1>.xyF g4<4,4,1>F mul(8) g4<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF mov.sat(8) m1<1>.zwF g4<4,4,1>F The compute-to-MRF pass will discover the first mov.sat and attempt to replace it by rewriting earlier instructions. Everything works out, so it replaces scan_inst's destination file, reg, and reg_offset, resulting in: mul(8) m1<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF mul(8) g4<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF mov.sat(8) m1<1>.zwF g4<4,4,1>F Unfortunately, it loses the .xy writemask on the mov.sat's MRF destination. While this doesn't pose an immediate problem, it then proceeds to transform the second mov.sat, resulting in: mul(8) m1<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF mul(8) m1<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF Instead of writing both halves of the vector (like the original code), it overwrites the full vector both times, clobbering the desired .xy values. When encountering a MOV, the compute-to-MRF code scans for instructions which generate channels of the MOV source. It ensures that all necessary channels are available (possibly written by several instructions). In this case, *more* channels are available than necessary, so we want to take the subset that's actually used. Taking the bitwise and of both writemasks should accomplish that. This was discovered by analyzing an ARB_vertex_program test (glean/vertProg1/MUL test (with swizzle and masking)) with my new Mesa IR -> Vec4 IR translator code. However, it should be possible with GLSL programs as well. NOTE: This is a candidate for stable release branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f593acd5778d4fdfa3493bb90c99b52e45667bc0 |
|
19-Oct-2012 |
Tapani Pälli <tapani.palli@intel.com> |
i965/vs: include format argument in debug printf otherwise some compilers will throw error "error: format not a string literal and no format arguments" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
20ebebac5153affcbd44350332678a2fb04d4c96 |
|
03-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Improve live interval calculation. This is derived from the FS visitor code for the same, but tracks each channel separately (otherwise, some typical fill-a-channel-at-a-time patterns would produce excessive live intervals across loops and cause spilling). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48375 (crash -> failure, can turn into pass by forcing unrolling still)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
914d8f9f84a3539758716d676d59a1fee4cc559f |
|
04-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add a little bit of IR-level debug ability. This is super basic, but it let me visualize a problem I had with opt_compute_to_mrf(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
34c58acb59bc0b827e28ef9e89044621ab0b3ee1 |
|
03-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for splitting virtual GRFs. This should improve our ability to register allocate without spilling. Unfortuantely, due to the live variable analysis being ignorant of loops, we still have register allocation failures on some programs. v2: Add more context to the comment explaining the function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
25ca9cc8236845a4be32a6f39b4a6d1664d4b403 |
|
04-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Move the other two src_reg/dst_reg constructors to brw_vec4.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b2f5d4c3ec9ec2fec8b39c87eb00121a24107276 |
|
04-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Move class functions to brw_vec4.cpp. This has less impact than for the FS (4k savings), because it was partially done already, but makes things more consistent. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7e7c40ff98cc2b930bc3113609ace5430f2bdc95 |
|
26-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Add vec4_instruction::is_tex() query. Copy and pasted from fs_inst::is_tex(), but without TXB. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1d4f3ca8f0442821c914b758b323e6e5124149a3 |
|
29-Sep-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Implement integer quotient and remainder math operations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c662764f4f9d9d0303fb2685dfdc93824fa15dca |
|
06-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for compute-to-MRF. Removes 1.8% of the instructions from 97% of the vertex shaders in shader-db.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
160848d8ef96cf3a760c02cc576df7dbffc1f669 |
|
06-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add a function for how many MRFs get written as part of a SEND. This will be used for compute-to-mrf, which needs to know when MRFs get overwritten.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f0c04e6c22babf2aee2ad1ee85dbd6f996be3712 |
|
03-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for simple algebraic optimizations. We generate silly code for array access, and it's easier to generally support the cleanup than to specifically avoid the bad code in each place we might generate it. Removes 4.6% of instructions from 41.6% of shaders in shader-db, particularly savage2/hon and unigine. v2: Fixes by Ken: Make is_zero/one member functions, and fix a progress flag. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
cc9eb936c220267b6130b705fc696d05906a31df |
|
02-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for copy propagation of the UNIFORM and ATTR files. Removes 2.0% of the instructions from 35.7% of vertex shaders in shader-db.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
42ce13195b94d0d51ca8e7fa5eed07fde8f37988 |
|
30-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add constant propagation to a few opcodes. This differs from the FS in that we track constants in each destination channel, and we we have to look at all the swizzled source channels. Also, the instruction stream walk is done in an O(n) manner instead of O(n^2). Across shader-db, this reduces 8.0% of the instructions from 60.0% of the vertex shaders, leaving us now behind the old backend by 11.1% overall.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
df35d691807656d3627b6fa6f51a08674bdc043e |
|
07-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for overflowing the number of available push constants. Fixes glsl-vs-uniform-array-4. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33742 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
72cfc6f3778d8297e52c254a5861a88eb62e4d67 |
|
23-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Pack live uniform vectors together in the push constant upload. At some point we need to also move uniform accesses out to pull constants when there are just too many in use, but we lack tests for that at the moment. Fixes glsl-vs-large-uniform-array. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7c84b9d303345fa5075dba8c4ea7af449d93b0f8 |
|
23-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Track uniforms as separate vectors once we've done array access. This will make it easier to figure out which elements are totally unused and not upload them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8174945d3346dc049ae56dcb4bf1eab39f5c88aa |
|
17-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add simple dead code elimination. This is copied right from the fragment shader. It is needed for real register allocation to work correctly.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
3dadc1e3cceac80a1b63cad2e10f0e0f8904531b |
|
17-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Copy the live intervals calculation over from the FS. This is a rather pessimistic calculation, since it doesn't distinguish individual channels of a vec4, or elements of an array, but should be a minimum start for register allocation.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|