| Mesa 20.2.0 Release Notes / 2020-09-28 |
| ====================================== |
| |
| Mesa 20.2.0 is a new development release. People who are concerned |
| with stability and reliability should stick with a previous release or |
| wait for Mesa 20.2.1. |
| |
| Mesa 20.2.0 implements the OpenGL 4.6 API, but the version reported by |
| glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / |
| glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. |
| Some drivers don't support all the features required in OpenGL 4.6. OpenGL |
| 4.6 is **only** available if requested at context creation. |
| Compatibility contexts may report a lower version depending on each driver. |
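
Because the reported version varies by driver, an application usually parses the string returned by glGetString(GL_VERSION) before relying on a feature. The sketch below is one way to do that. It assumes the string has already been fetched through some GL binding, and the helper names and sample strings are hypothetical, not actual driver output.

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) from a GL_VERSION string, or None if none is found."""
    # Desktop GL strings start with "major.minor"; GLES strings carry an
    # "OpenGL ES" prefix first, so search for the first "N.M" pair.
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required):
    """True if the reported version is at least `required` (a (major, minor) tuple)."""
    parsed = parse_gl_version(version_string)
    return parsed is not None and parsed >= required


# Illustrative strings only (assumed shapes, not captured from a real driver).
core = "4.6 (Core Profile) Mesa 20.2.0"
compat = "3.1 Mesa 20.2.0"
print(parse_gl_version(core), supports(compat, (4, 6)))
```

Comparing tuples keeps the check correct for versions such as 3.10 versus 3.2, which a string comparison would get wrong.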
| |
| Mesa 20.2.0 implements the Vulkan 1.2 API, but the version reported by |
| the apiVersion property of the VkPhysicalDeviceProperties struct |
| depends on the particular driver being used. |
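
The apiVersion property is a packed 32-bit integer, not a string. The sketch below decodes it using the bit layout from the Vulkan specification (patch in the low 12 bits, minor in the next 10, major in the next 7). The function names are hypothetical, and the input value is assumed to have been read from VkPhysicalDeviceProperties by whatever binding the application uses.

```python
def make_api_version(major, minor, patch):
    """Pack a version triple the same way the Vulkan headers do."""
    return (major << 22) | (minor << 12) | patch


def decode_api_version(api_version):
    """Split a packed Vulkan apiVersion into (major, minor, patch)."""
    major = (api_version >> 22) & 0x7F   # 7 bits
    minor = (api_version >> 12) & 0x3FF  # 10 bits
    patch = api_version & 0xFFF          # 12 bits
    return major, minor, patch


# Round-trip a Vulkan 1.2 value to show the layout.
packed = make_api_version(1, 2, 0)
print(packed, decode_api_version(packed))
```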
| |
| SHA256 checksum |
| --------------- |
| |
| :: |
| |
| 63f0359575d558ef98dd78adffc0df4c66b76964ebf603b778b7004964191d30 mesa-20.2.0.tar.xz |
| |
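
The checksum above can be compared against a downloaded tarball. The following is a minimal sketch using Python's standard hashlib module. The function names are hypothetical, and it assumes the tarball has already been downloaded to a local path.

```python
import hashlib

# Checksum published above for mesa-20.2.0.tar.xz.
EXPECTED_SHA256 = "63f0359575d558ef98dd78adffc0df4c66b76964ebf603b778b7004964191d30"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Calling ``matches_checksum("mesa-20.2.0.tar.xz")`` from the download directory returns True only if the file is byte-identical to the published release.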
| |
| New features |
| ------------ |
| |
| - GL_ARB_compute_variable_group_size on Iris. |
| |
| - GL_ARB_gpu_shader5 on llvmpipe |
| |
| - GL_ARB_post_depth_coverage on llvmpipe |
| |
| - GLES 3.2 on llvmpipe |
| |
| - GL_EXT_shader_group_vote on GLES3. |
| |
| - GL_EXT_texture_shadow_lod on llvmpipe |
| |
| - VK_AMD_texture_gather_bias_lod on RADV. |
| |
| - VK_AMD_gpu_shader_half_float on RADV/ACO. |
| |
| - VK_AMD_gpu_shader_int16 on RADV/ACO. |
| |
| - VK_EXT_extended_dynamic_state on ANV and RADV. |
| |
| - VK_EXT_image_robustness on RADV. |
| |
| - VK_EXT_private_data on ANV and RADV. |
| |
| - VK_EXT_custom_border_color on ANV and RADV. |
| |
| - VK_EXT_pipeline_creation_cache_control on ANV and RADV. |
| |
| - VK_EXT_shader_demote_to_helper_invocation on RADV/LLVM. |
| |
| - VK_EXT_subgroup_size_control on RADV/ACO. |
| |
| - VK_GOOGLE_user_type on ANV and RADV. |
| |
| - VK_KHR_shader_subgroup_extended_types on RADV/ACO. |
| |
| - GL_ARB_gl_spirv on nvc0/nir. |
| |
| - GL_ARB_spirv_extensions on nvc0/nir. |
| |
| - RADV now uses ACO as its default backend |
| |
| - RADV_DEBUG=llvm option to enable LLVM backend for RADV |
| |
| - VK_EXT_image_robustness for ANV |
| |
| - VK_EXT_shader_atomic_float on ANV |
| |
| - VK_EXT_4444_formats on ANV and RADV. |
| |
| - VK_KHR_memory_model on RADV. |
| |
| - GL 4.5 on llvmpipe |
| |
| - EGL_KHR_swap_buffers_with_damage on X11 (DRI3) |
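
Since ACO is now the default RADV backend, selecting LLVM means setting RADV_DEBUG=llvm in the environment of the application. The sketch below builds such an environment without discarding flags that are already set. It assumes RADV_DEBUG takes a comma-separated list of flags, and the helper name is hypothetical.

```python
def with_radv_debug_flag(env, flag="llvm"):
    """Return a copy of `env` with `flag` present in RADV_DEBUG."""
    updated = dict(env)
    # Assumption: RADV_DEBUG is a comma-separated list of flags.
    flags = [f for f in updated.get("RADV_DEBUG", "").split(",") if f]
    if flag not in flags:
        flags.append(flag)
    updated["RADV_DEBUG"] = ",".join(flags)
    return updated


print(with_radv_debug_flag({"RADV_DEBUG": "zerovram"})["RADV_DEBUG"])
```

The returned mapping can be passed as the ``env`` argument of ``subprocess.run`` when launching a Vulkan application, with ``os.environ`` as the input so the rest of the environment is preserved.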
| |
| |
| Bug fixes |
| --------- |
| |
| - [Regression][Bisected][20.2][radeonsi] American Truck Simulator continually allocates memory until OOM |
| - anv: dEQP-VK.robustness.robustness2.* failures on gen12 |
| - [RADV] Problems reading primitive ID in fragment shader after tessellation |
| - Massive memory leak (at least AMD, others unknown) |
| - Substance Painter 6.1.3 black glitches on Radeon RX570 |
| - vkCmdCopyImage broadcasts subsample 0 of MSAA src into all subsamples of dst on RADV |
| - Crash in ruvd_end_frame when calling vaBeginPicture/vaEndPicture without rendering anything |
| - X-Plane 11 Installer crashes on startup since `glsl: declare gl_Layer/gl_ViewportIndex/gl_ViewportMask as vs builtins` |
| - Horizon Zero Dawn graphics corruption with radv |
| - Amber test opt_peel_loop_initial_if: Assertion failed |
| - Dirt Rally: Flickering glitches on certain foliage since Mesa 20.1.0 caused by MSAA |
| - [BRW] WRC 5 asserts with gallium nine and iris. |
| - radv: Corruption in "The Surge 2" |
| - [RADV] Detroit: Become Human Demo game lock-ups with RADV |
| - Road Redemption certain graphic effects rendered white color |
| - vulkan/wsi/x11: deadlock with Xwayland when compositor holds multiple buffers |
| - [RADV/ACO] Death Stranding causes a GPU hang (*ERROR* Waiting for fences timed out!) |
| - lp_bld_init.c:172:7: error: implicit declaration of function ‘LLVMAddConstantPropagationPass’; did you mean ‘LLVMAddCorrelatedValuePropagationPass’? [-Werror=implicit-function-declaration] |
| - Intel Vulkan driver crash with alpha-to-coverage |
| - EGL_KHR_swap_buffers_with_damage support on X11 |
| - radv: blitting 3D images with linear filter |
| - [ACO] Compiling pipelines from RPCS3's shader interpreter spins forever in ACO code |
| - Intel Vulkan driver assertion with small xfb buffer |
| - [spirv-fuzz] SPIR-V parsing failed "src->type->type == dest->type->type" |
| - radeonsi: radeonsi crashes in Chrome on chromeos |
| - [RADV] commit d19bc94e4eb94 broke gamescope with Navi |
| - 4e3a7dcf6ee4946c46ae8b35e7883a49859ef6fb breaks Gamescope showing windows properly. |
| - anv: crashes in CTS test dEQP-VK.subgroups.*.framebuffer.*_tess_eval |
| - Intel Vulkan (anv) crash in copy_non_dynamic_state() when using validation layer |
| - Mafia 3: Trees get rendered incorrectly |
| - radv: dEQP-VK.synchronization.op.multi_queue.timeline_semaphore.write_clear_attachments_*_concurrent fail when forcing DCC. |
| - Crash on GTA 5 through proton 5.0.9 and GE versions |
| - Mesa 20.2.0-rc1 fails to build for AMD |
| - Assertion failure compiling shader from Zigguart |
| - Panfrost locks for waiting fence when running Source engine games |
| - ci: `-Dtools=panfrost` should be build-tested |
| - panfrost: Register allocation fails for Firefox WebRender shaders |
| - VRAM leak with vulkan external memory + opengl memory objects |
| - [vulkan/build] Recent build system changes made VK_EXT_acquire_xlib_display unnecessarily depend on GBM |
| - ci: Capture devcoredumps on chezas |
| - Possible array out of bounds in brw_vec4_nir.cpp |
| - freedreno/a6xx: incorrect rendering in asphalt 9 |
| - [tgl][bisected][regression][iris] failure on dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_default |
| - Multiply defined symbols compiling with gcc@10.1.0 |
| - shrinking descriptor pool on intel+vulkan |
| - dEQP-VK.renderpass2.dedicated_allocation.attachment.1.12 fails on NAVI14 |
| - turnip: binning and indirect dependency |
| - Amber test leads to NIR validation failed after nir_opt_if (on spirv-fuzz shader) |
| - Unable to compile mesa-git from b559d26c |
| - Ambient light too bright with ACO in AC: Odyssey |
| - Multiple issues with Detroit Become Human |
| - ci: Capture artifacts in baremetal mode |
| - turnip/ir3: fine derivatives |
| - panfrost: regression: Major stuttering and low compositor FPS with glmark2 |
| - khr_debug-push-pop-group_gl: ../src/util/simple_mtx.h:86: simple_mtx_lock: Assertion \`c != _SIMPLE_MTX_INVALID_VALUE' failed. |
| - freedreno/a6xx: skai/skqp fails |
| - SPIR-V parsing fails in src/compiler/spirv/spirv_to_nir.c |
| - SPIR-V parsing fails in src/compiler/spirv/vtn_cfg.c |
| - Weird GLSL bug |
| - iris driver is broken in Freedesktop 19.08 |
| - LLVM not properly shutdown in `si_pipe.c`? |
| - Panfrost: add current status to docs/features.txt |
| - Opengl incorrect rendering on yuzu Amd |
| - RADV: VK_ACCESS_MEMORY_READ/WRITE_BIT is not implemented |
| - [bisected][regression][all platforms] multiple deqp-gles31/glescts/piglit failures |
| - 7406ea37, "ac/surface: require that gfx8 doesn't have DCC in order to be displayable", breaks Gamescope being able to launch games on RX580, and possibly other gfx8 cards |
| - vkGetSemaphoreCounterValue doesn't update without vkWaitSemaphores calls on Intel UHD 620 |
| - [RADV] System crash when playing XCOM Chimera Squad because of commit #7a5e6fd2 |
| - [RADV] Non-precise occlusion queries return non-zero when all fragments are discarded |
| - [DXVK] Project Cars rendering problems |
| - ADDRLIB ODR Violation |
| - Build fails with current mesa from git "undefined reference to 'nir_lower_clip_disable'" |
| - KDE Compositor stuttering after Check for window destruction in dri3_wait_for_event_locked |
| - Add fallthrough to prevent errors caused by missing break |
| - i965/20.1: gray rendering with torcs racing |
| - glBindBufferRange call seems to be ignored by one of two shader-programs on radeon cards |
| - [bisected][g33] piglit.spec.ext_framebuffer_object.fbo-cubemap failure |
| - Increase GL_MAX_COMPUTE_SHADER_STORAGE_BLOCKS to greater value. |
| - nir: st_nir_lower_builtin fails for gl_LightSource[i] |
| - Sometimes VLC player process gets stuck in memory after closure if video output used is Auto or OpenGL |
| - Double unlock in rbug_context.c |
| - Double copy for TexSubImage |
| - [v3d] corruption when GS omits some vertices |
| - Iris crashes when reading from multisampled front buffer on platforms without front buffer |
| - freedreno: subway surfers crash when repeatedly toggling fullscreen |
| - [RADV/GFX8] Performance drop in DOOM Eternal when "Present from compute" is enabled |
| - freedreno: multiple applications crash on a5xx |
| - Use-after-free crash in nv50_ir::GCRA::RIG_Node::init() |
| - intel: Sample mask writes need to be honored in Vulkan |
| - [RADV] - Path of Exile (238960) - Map outline, landscape and markers are missing with the Vulkan renderer. |
| - ASTC texture decompression fails when using software fallback |
| - [i965][iris][regression][bisected] multiple piglit and glcts failures on all platforms |
| - please publish GPG keyring used to sign new releases |
| - [BISECTED] compiling shader causes crash |
| - Missing render Information on Stellaris |
| - freedreno/ir3: allow copy-propagate from array |
| - Zink + GALLIUM_HUD SIGSEGV |
| - piglit spec@egl_ext_device_base@conformance fails LLVM 11 Git assertion since "llvmpipe/fs: add caching support" |
| - llvmpipe: 1x1 framebuffer with a 2x2 viewport |
| - [regression] nir build failure |
| - ci: need to end baremetal tests after kernel panic/instaboot |
| - If-statement body is executed for false condition |
| - freedreno/a6xx: broken rendering in playcanvas "after the flood" |
| - [regression] performance drop on Dota 2, CS:GO, and gfxbench GL benchmarks on ICL/Iris |
| - [amd] C++ ODR violation for union GB_ADDR_CONFIG |
| - Zink reports incorrect amount of video memory |
| - [RADV/LLVM]: void llvm::ICmpInst::AssertOK(): Assertion \`getOperand(0)->getType() == getOperand(1)->getType() && "Both operands to ICmp instruction are not of the same type!"' failed. |
| - glsl-1.50-gs-max-output hangs on Navi10 + NGG |
| - anv: Runs out of binding tables with PPSSPP during long runs |
| - Segfault in Panfrost with waypipe |
| - ci: Use rsync instead of rm -rf ; cp for baremetal rootfs |
| - i965: Rendering problems replaying a trace of "Refunct" after mesa-20.1.0-rc1 release [bisected] |
| - Panfrost (rk3399 NanoPi M4) hang/crash on playing video on Kodi/X11 |
| - gallium/winsys/radeon/drm fails assertion on 32bit |
| - NIR validation failed after glsl to nir, before function inline, wrong {src,dst}->type ? |
| - nir/spirv asin() function not precise enough |
| - Mesa 20.0.7 / 20.1.0-rc4 regression, extremely long shader compilation time in NIR |
| - Android build error after 689acc73 |
| - freedreno/a6xx: gpu hangs in google earth |
| - Mesa-git build fails on Fedora Rawhide |
| - Doom Eternal 1.1 performs very poorly on RADV |
| - iris/i965: possible regression in 20.0.5 due to changes in buffer manager sharing across screens (firefox/mozilla#1634213) |
| - Incorrect _NetBSD__ macro inside execmem.c |
| - Possible invalid sizeof in device.c |
| - YUV FP16 lowering validation failing |
| - GLSL compiler assertion is_float() failed in glsl/ir_validate.cpp, visit_leave on specific WebGL shader |
| - [RADV] - Doom Eternal (782330) & Metro Exodus (412020) - Title requires 'RADV_DEBUG=zerovram' to eliminate colorful graphical aberrations. |
| - mesa trunk master vulkan overlay-layer meson.build warning empty configuration_data() object |
| - [meson] increase minimum required version |
| - Kicad fails to render 3D PCB models. |
| - freedreno: minetest: alpha channel issue on a6xx |
| - Reproducible i915 gpu hang Intel Iris Plus Graphics (Ice Lake 8x8 GT2) |
| - 7 Days to Die - "Reflection Quality" setting broken, results in environment rendered black |
| - glsl: regression affecting shader compilation time |
| - freedreno: glamor issue with x11 desktops |
| - finish converting from fnv1a to xxhash |
| - Hang in iris_dri in kitty |
| - Setting twice value to output_stream in radv_nir_to_llvm.c |
| - Overwriting value of `jit_tex->sample_stride` in lp_setup.c |
| - [AMDGPU][OpenGL] apitrace of kernel/firmware crash that requires a reboot |
| - Flickering in Superposition benchmark |
| - Double lock in fbobject.c |
| - Possible typo in aco_insert_waitcnt.cpp |
| - [bisected] Steam crashes when newest Iris built with LTO |
| - Freeing null pointer inside radv_amdgpu_cs.c |
| - Duplicated sub expression in radv_nir_to_llvm.c |
| - i965/vec4: opt_cse_local cause the out of bound array access |
| - NIR: Regression on shader using 8/16-bit integers |
| - ACO: Compiler segfault on 8/16-bit integers. |
| - lp_bld_intr.c:70:16: error: use of undeclared identifier 'LLVMFixedVectorTypeKind'; did you mean 'LLVMVectorTypeKind'? |
| - recent seqno changes causing surfaceflinger crash |
| - [radeonsi] [glthread] Crash with glthread enabled |
| - Deadlock in anv_timelines_wait() |
| - [gles3] supertuxkart: some textures are incorrect |
| - post_version.py does not work with release candidates |
| - radv regression on android |
| - ogl: Set mesa_glthread=true as default on the RPCS3 emulator |
| - [iris] android deqp dEQP-EGL.functional.robustness.negative_context#invalid_notification_strategy_enum fails |
| - zink: conditional rendering |
| - [RadeonSI] Glitches on VEGA8 + RX 560X after MR 4863 |
| - RadeonSI OpenGL broken for GFX8 after unify code for overriding offset |
| - freedreno/turnip: Don't request fragcoord components we don't use |
| - Make check fails in ANV |
| - src\util\meson.build:294:4: ERROR: Program or command 'winepath' not found or not executable |
| - Please add Zink to features.txt |
| - llvmpipe: assert triggers in LLVM |
| - debug builds are massively broken on Windows |
| - ci: Report flakes on IRC from baremetal tests |
| - heavy glitches on amd ryzen 5 since version 20.x |
| - zink asserts with 32-bit boolean |
| - OpenGL: Surviving Mars black screen late-game (possible shader problem) |
| - Kerbal Space Program (KSP) hangs entire Navi system |
| - Dirt: Showdown bad performance and broken rendering with advanced lighting enabled |
| - gravit & Firefox WebGL broken since 3dc2ccc14c0e035368fea6ae3cce8c481f3c4ad2 "ac/surface: replace RADEON_SURF_OPTIMIZE_FOR_SPACE with !FORCE_SWIZZLE_MODE" |
| - mesa 20.0.5 causing kitty to crash |
| - radeonsi: "Torchlight II" trace showing regression on mesa-20.0.6 [bisected] |
| - [RADV/LLVM/ACO/Regression] After mesa commit a3dc7fffbb7be0f1b2ac478b16d3acc5662dff66 all games get stuck at start |
| - Android building error after commit 2ab45f41 |
| - freedreno/a6xx: pubg rendering glitches |
| - iris: Crash when trying to capture window in OBS Studio |
| - lp_test_format failure with llvm-11 |
| |
| |
| Changes |
| ------- |
| |
| Abhishek Kumar (1): |
| |
| - egl: Limit the EGL ver for android |
| |
| Adam Jackson (1): |
| |
| - glx: Fix build and warnings with -Dglx=dri -Dglx-direct=false |
| |
| Alejandro Piñeiro (9): |
| |
| - v3d/tex: only look up the 2nd texture gather offset for 1d non-arrays |
| - v3d/tex: set up default values for Configuration Parameter 1 if possible |
| - v3d/tex: use TMUSLOD register if possible |
| - v3d: moving v3d simulator to src/broadcom |
| - v3d/tex: handle correctly coordinates for cube/cubearrays images |
| - vulkan/util: add struct vk_pipeline_cache_header |
| - nir/lower_tex: handle query lod with nir_lower_tex_packing_16 at lower_tex_packing |
| - v3d/packet: fix typo on Set InstanceID/PrimitiveID packet |
| - v3d: set instance id to 0 at start of tile |
| |
| Alyssa Rosenzweig (475): |
| |
| - pan/mdg: Track more types |
| - pan/mdg: Be a bit more pedantic in invert passes |
| - panfrost: Enumify bifrost blend types |
| - pan/bi: Add texture indices to IR |
| - pan/bi: Pipe multiple textures through |
| - pan/bi: Pack round opcodes (FMA, either 16 or 32) |
| - pan/bit: Add framework for interpreting double vs float |
| - pan/bit: Interpret ROUND |
| - pan/bit: Add round tests |
| - panfrost: Fix texture field size |
| - panfrost: Fix size of bifrost sampler descriptor |
| - panfrost: Fix sampler wrap/filter field orders |
| - panfrost: Fix norm coords on bifrost sampler |
| - panfrost: Fix tiled texture "stride"s on Bifrost |
| - pan/decode: Don't crash on missing payload |
| - pan/bi: Enable lower_mediump_outputs NIR pass |
| - panfrost: Update Bifrost fields in mali_shader_meta |
| - pan/bi: Lower for now sincos |
| - pan/mdg: Ingest actual isub ops |
| - pan/mdg: Rename .one to .sat_signed |
| - pan/mdg: Move constant switch opts to algebraic pass |
| - pan/mdg: Drop forever todo |
| - pan/mdg: Drop `opt` in name of midgard_opt_cull_dead_branch |
| - pan/mdg: Enable nir_opt_algebraic_distribute_src_mods |
| - panfrost: Update dEQP expectation list |
| - panfrost: Setup gl_FragCoord as sysval on Bifrost |
| - pan/bi: Add clause type for gl_FragCoord.zw load |
| - pan/bi: Abort on unknown op packing |
| - pan/bi: Abort on unhandled intrinsics |
| - pan/bi: Futureproof COMBINE lowering against non-u32 |
| - pan/bi: Print bad instruction on src packing fail |
| - pan/bi: Passthrough direct ld_var addresses |
| - pan/bi: Lower gl_FragCoord |
| - pan/bi: Set clause type for gl_FragCoord.z |
| - pan/bi: Fix double-abs flipping |
| - pan/bi: Fix missing swizzle |
| - pan/bi: Fix incorrectly flipped swizzle |
| - pan/bi: Disable CSEL4 emit for now |
| - pan/bi: Fix DISCARD ops in disasm |
| - pan/bi: Structify DISCARD |
| - pan/bi: Remove BI_GENERIC |
| - pan/bi: Unwrap BRANCH into CONDITIONAL class |
| - pan/bi: Handle discard_if in NIR->BIR naively |
| - pan/bi: Emit discard (not if) |
| - pan/bi: Add float-only mode to condition fusing |
| - pan/bi: Fuse conditions into discard_if |
| - pan/bi: Handle discard/branch in get_component_count |
| - pan/bi: Pack ADD.DISCARD |
| - pan/bi: Structify ADD ICMP 16 |
| - pan/bi: Pack ADD ICMP 32 |
| - pan/bi: Pack ADD ICMP 16 |
| - pan/bi: Don't pack ICMP on FMA |
| - pan/bit: Add swizzles to round tests |
| - pan/bit: Add more 16-bit fmod tests |
| - pan/bit: Add ICMP tests |
| - pan/bi: Rename BI_ISUB to BI_IMATH |
| - pan/bi: Use IMATH for nir_op_iadd |
| - pan/bi: Pack FMA IADD/ISUB 32 |
| - pan/bi: Pack ADD IADD/ISUB for 8/16/32 |
| - pan/bi: Add SUB.v2i16/SUB.v4i8 opcodes to disasm |
| - pan/bi: Don't schedule <32-bit IMATH to FMA |
| - pan/bit: Interpret IMATH |
| - pan/bit: Interpret v4i8 ops |
| - pan/bit: Remove test names |
| - pan/bit: Use swizzle helper for round |
| - pan/bit: Factor out identity swizzle helper |
| - pan/bit: Add IMATH packing tests |
| - pan/decode: Fix flags_hi printing |
| - pan/mdg: Explain helper invocations dataflow theory |
| - pan/mdg: Analyze helper invocation termination |
| - pan/mdg: Analyze helper execution requirements |
| - pan/mdg: Use the helper invo analyze passes |
| - pan/mdg: Use analysis to set .cont/.last flags |
| - pan/mdg: Remove texture_op_count |
| - pan/mdg: Set types for derivatives |
| - pan/mdg: Fix derivative swizzle |
| - panfrost: Run dEQP-GLES3.functional.shaders.derivate.* on CI |
| - pan/decode: Use a page table for tracking mmaps |
| - pan/decode: Fix min/max_tile_coord mixup |
| - pan/mfbd: Add format codes for PIPE_FORMAT_B5G5R5A1_UNORM |
| - panfrost: Switch formats to table |
| - panfrost: Fix Z24 vs Z32 mixup |
| - panfrost: Enable AFBC for Z24X8 |
| - nir: Add fsat_signed opcode |
| - nir: Add fclamp_pos opcode |
| - panfrost: Add modifier detection helpers |
| - pan/mdg: Remove .pos propagation pass |
| - pan/mdg: Drop nir_lower_to_source_mods |
| - pan/mdg: Prepare for modifier helpers |
| - pan/mdg: Ingest fsat_signed/fclamp_pos |
| - pan/mdg: Apply abs/neg modifiers |
| - pan/mdg: Treat inot as a modifier |
| - pan/mdg: Remove invert optimizations |
| - pan/mdg: Use helpers for branch/discard inversion |
| - pan/mdg: Apply outmods |
| - pan/mdg: Emit fcsel when beneficial |
| - pan/mdg: Optimize pipelining logic |
| - pan/mdg: Precompute mir_special_index |
| - pan/mdg: Optimize liveness computation in DCE |
| - pan/mdg: Handle comparisons in fp16 path |
| - pan/mdg: Fix constant combining crash |
| - pan/mdg: Remove mir_*size routines |
| - pan/mdg: Remove mir_get_alu_src |
| - pan/mdg: Include more types |
| - pan/mdg: Handle dest up/lower correctly with swizzles |
| - pan/mdg: Respect !32-bit sizes in RA |
| - pan/mdg: Explain ld/st sign/zero extension |
| - pan/mdg: Add abs/neg/shift modifiers to IR |
| - pan/mdg: Use src_types to determine size in scheduling |
| - pan/mdg: Use type to determine triviality of a move |
| - pan/mdg: Identify scalar integer mods |
| - pan/mdg: Promote imov to fmov on a NIR level |
| - pan/mdg: Remove promote_float pass |
| - pan/mdg: Defer modifier packing until emit time |
| - pan/mdg: Remove redundant redundancy |
| - pan/mdg: Streamline dest_override handling |
| - pan/mdg: Implement b2f16 |
| - pan/mdg: Don't generate conversions for fp16 LUTs |
| - pan/mdg: Ignore dest.type when offseting load swizzle |
| - pan/lcra: Remove unused alignment parameters |
| - pan/lcra: Allow per-variable bounds to be set |
| - pan/mdg: Use type size to determine alignment |
| - pan/mdg: Eliminate load_64 |
| - pan/mdg: Set RA bounds for fp16 |
| - pan/mdg: Print mask when dest=0 |
| - pan/mdg: Round up bytemasks when spilling |
| - pan/mdg: Print constant vectors less wrong |
| - pan/mdg: Factor out mir_adjust_constant |
| - pan/mdg: Only combine 16-bit constants to lower half |
| - pan/mdg: Separately pack constants to the upper half |
| - pan/mdg: Fix type checking issues with compute |
| - pan/mdg: Pack barriers correctly |
| - pan/mdg: Use shifts instead of division for RA sizes |
| - pan/mdg: Implement vector constant printing for 8-bit |
| - pan/mdg: Implement condense_writemask for 8-bit |
| - pan/mdg: Pack 8-bit swizzles in 16-bit ops |
| - panfrost: Guard experimental fp16 behind debug flag |
| - panfrost: Keep cached BOs mmap'd |
| - panfrost: Remove deadcode |
| - panfrost: Fill in SCALED formats to format table |
| - panfrost: Don't set PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY |
| - panfrost: Don't zero staging buffer for tiling |
| - panfrost: Allow bpp24 tiling |
| - panfrost: Allow tiling on RECT textures |
| - panfrost: Limit blend shader work count |
| - panfrost: Remove dated comment about leaks |
| - panfrost: Disable tib read/write when colourmask = 0x0 |
| - panfrost: Avoid redundant shader executions with mask=0x0 |
| - panfrost: Don't set CAN_DISCARD for MFBD |
| - panfrost: Fix transform feedback types |
| - pan/mdg: Cleanup comments that look like division |
| - pan/mdg: Eliminate expand_writemask division |
| - pan/mdg: Eliminate 64-bit swizzle packing division |
| - pan/mdg: Avoid division in printing helpers |
| - pan/mdg: Eliminate remaining divisions from compiler |
| - panfrost: Fix dated comment |
| - panfrost: Use _mesa_roundevenf when packing clear colours |
| - panfrost: Handle !independent_blend for blend shaders |
| - pan/mdg: Add pack_colour_32 opcode |
| - pan/mdg: Lower shifts to 32-bit |
| - pan/mdg: Ensure we don't DCE into impossible masks |
| - pan/mdg: Allow DCE on ld_color_buffer masks |
| - panfrost: Add debug print before query flushes |
| - panfrost: Only run batch debug when specifically asked |
| - nir: Add un/pack_32_4x8 opcodes |
| - util: Add SATURATE macro |
| - util/format: Use SATURATE |
| - mesa: Use SATURATE |
| - mesa/swrast: Use SATURATE |
| - gallium/draw: Use SATURATE |
| - glsl: Use SATURATE |
| - panfrost: Use SATURATE |
| - softpipe: Use SATURATE |
| - intel: Use SATURATE |
| - i965: Use SATURATE |
| - iris: Use SATURATE |
| - etnaviv: Use SATURATE |
| - nouveau: Use SATURATE |
| - pan/decode: Fix unused variable warning |
| - pan/decode: Fix tiler warning |
| - pan/decode: Dump missing field on Bifrost |
| - pan/decode: Dump unknown2 |
| - panfrost: Fix Bifrost blending with depth-only FBO |
| - panfrost: Adjust null_rt for Bifrost |
| - panfrost: Tweak zsbuf magic numbers for Bifrost |
| - panfrost: Tweak Bifrost colour buffer magic |
| - panfrost: Force Z/S tiling on Bifrost |
| - panfrost: Share MRT blend flag calculation with Bifrost |
| - panfrost: Set unk2 to accommodate blending |
| - panfrost: Identify Bifrost texture format swizzle |
| - panfrost: Ensure nonlinear strides are 16-aligned |
| - panfrost: Document Midgard Inf/NaN suppress bit |
| - panfrost: Add defines for bifrost unk1 flags |
| - panfrost: Identify MALI_BIFROST_EARLY_Z flag |
| - panfrost: Set MALI_BIFROST_EARLY_Z as necessary |
| - pan/decode: Decode Bifrost shader flags |
| - pan/bi: Add TEX.vtx opcode for vertex texturing |
| - pan/bi: Also add compact vertex texturing |
| - pan/bi: Document compute_lod bit for compact tex |
| - pan/bi: Allow vertex txl with lod=0 as compact |
| - pan/bi: Add f16 TEXC.vtx op |
| - pan/bi: Pack compact vertex texturing |
| - pan/bi: Add CSEL.16 packing tests |
| - pan/bi: Suppress inf/nan for now |
| - panfrost: Don't generate gl_FragCoord varying on Bifrost |
| - panfrost: Set reads_frag_coord as a sysval |
| - panfrost: Preload gl_FragCoord on Bifrost |
| - pan/bi: Remove FMA? parameter from get_src |
| - pan/bi: Remove comment about old scheduler design |
| - pan/bi: Move bi_registers to common IR structures |
| - pan/bi: Move bi_registers to bi_bundle |
| - pan/bi: Drop `struct` from bi_registers |
| - pan/bi: Add FILE* argument to bi_print_registers |
| - pan/bi: Move bi_flip_ports out of port assignment |
| - pan/bi: Document constant count invariant |
| - pan/bi: Disassemble pos=0xe |
| - pan/bi: Add MUL.i32 to disasm |
| - pan/bi: Remove more artefacts of 2-pass scheduling |
| - pan/bi: Add bi_layout.c for clause layout helpers |
| - pan/bi: Add helper to measure clause size |
| - pan/bi: Remove schedule_barrier |
| - pan/bi: Allow printing branches without targets |
| - pan/bi: Fix emit_if successor assignment |
| - pan/bi: Only rewrite COMBINE dest if not SSA |
| - pan/bi: Fix CONVERT component counting |
| - pan/bi: Fix branch condition typesize |
| - pan/bi: Passthrough ZERO in branch packing |
| - pan/bi: Add branch constant field to IR |
| - pan/bi: Pack branch offset constants |
| - pan/bi: Set branch_constant if there is a branch |
| - pan/bi: Assign constant port for branch offsets |
| - pan/bi: Preliminary branch packing |
| - pan/bi: Link clauses back to their blocks |
| - pan/bi: Add bi_foreach_clause_in_block_from{_rev} helpers |
| - pan/bi: Measure distance between blocks |
| - pan/bi: Pack proper clause offsets |
| - pan/bi: Set branch_conditional if b2b is set |
| - pan/bi: Set back-to-back bit more accurately |
| - pan/bi: Set branch conditional bit |
| - pan/bi: Pack unconditional branch |
| - pan/bi: Defer block naming until after emit |
| - pan/bi: Add bi_foreach_block_from_rev helper |
| - pan/bi: Measure backwards branches as well |
| - pan/bi: Allow two successors in header packing |
| - pan/bi: Passthrough deps of the branch target |
| - panfrost: Disable QUAD_STRIP/POLYGON on Bifrost |
| - panfrost: Add GPU IDs for G31/G52 |
| - panfrost: Probe G31/G52 if PAN_MESA_DEBUG=bifrost |
| - pan/mdg: Handle un/pack opcodes as moves |
| - pan/mdg: Add pack_unorm_4x8 via 8-bit |
| - pan/mdg: Treat packs "specially" |
| - pan/mdg: Handle bitsize for packs |
| - pan/mdg: Print 8-bit constants |
| - pan/mdg: Drop the u8 from the colorbuf op names |
| - pan/mdg: Implement raw colourbuf loads on T720 |
| - panfrost: Add theory for new framebuffer lowering |
| - panfrost: Determine unpacked type for formats |
| - panfrost: Add quirks for blend shader types |
| - panfrost: Determine load classes for formats |
| - panfrost: Determine classes for stores |
| - panfrost: Stub out lowering boilerplate |
| - panfrost: Un/pack pure 32-bit |
| - panfrost: Un/pack pure 16-bit |
| - panfrost: Un/pack pure 8-bit |
| - panfrost: Un/pack 8-bit UNORM |
| - panfrost: Flesh out dispatch |
| - panfrost: Un/pack UNORM 4 |
| - panfrost: Un/pack RGB565 and RGB5A1 |
| - panfrost: Un/pack RGB10_A2_UNORM |
| - panfrost: Un/pack RGB10_A2_UINT |
| - panfrost: Un/pack R11G11B10 |
| - panfrost: Un/pack sRGB via NIR |
| - panfrost: Switch to pan_lower_framebuffer |
| - panfrost: Conditionally allow fp16 blending |
| - panfrost: Account for differing types in blend lower |
| - panfrost: Let Gallium pack colours |
| - panfrost: Check for large tilebuffer requirements |
| - panfrost: Add separate_stencil BO to batch |
| - panfrost: Use internal_format throughout |
| - panfrost: Update fails list |
| - pan/mdg: Handle 16-bit ld_vary |
| - pan/mdg: Fuse f2f16 into load_interpolated_input |
| - panfrost: Fix PRESENT flag mix-up |
| - panfrost: Permit AFBC of RGB8 |
| - panfrost: Use VTX tag for vertex texturing |
| - panfrost: Don't flush explicitly when mipmapping |
| - panfrost: Remove unused nir_lower_framebuffer pass |
| - pan/mdg: Disassemble out-of-order bits |
| - pan/mdg: Add quirk for missing out-of-order support |
| - pan/mdg: Enable out-of-order execution after texture ops |
| - nir: Fold f2f16(b2f32(x)) to b2f16(x) |
| - pan/mdg: Don't double-replicate blend on T720 |
| - pan/mdg: Distinguish blend shaders in internal shader-db |
| - pan/mdg: Add roundmode enum |
| - pan/mdg: Add opcode roundmode property |
| - pan/mdg: Lower roundmodes |
| - pan/mdg: Implement \*_rtz conversions with roundmode |
| - pan/mdg: Fold roundmode into applicable instructions |
| - pan/mdg: Handle f2u8 |
| - pan/mdg: Allow f2u8 and friends thru |
| - pan/mdg: Handle regular nir_intrinsic_load_output |
| - panfrost: Passthrough NATIVE loads/stores |
| - pan/bi: Handle SEL with vec3 16-bit |
| - pan/bi: Fix SEL.16 swizzle |
| - pan/bi: Pack second argument of F32_TO_F16 |
| - pan/bi: Passthrough second argument of F32_TO_F16 |
| - pan/bi: Handle vectorized load_const |
| - panfrost: Update MALI_EARLY_Z description |
| - panfrost: Document MALI_WRITES_GLOBAL bit |
| - panfrost: Handle writes_memory correctly |
| - panfrost: Readd MIDGARD_SHADERLESS quirk to t760 |
| - panfrost: Explicitly convert to 32-bit for logic-ops |
| - pan/bi: Disassemble gl_PointCoord reads. |
| - panfrost: Prefer sysval for gl_PointCoord on Bifrost |
| - panfrost: Fix gl_PointSize out of GL_POINTS |
| - panfrost: Mark point sprites as todo on Bifrost |
| - pan/mdg: Legalize inverts with constants |
| - pan/mdg: Ensure ld_vary_16 is aligned |
| - panfrost: Ensure we have ro before using it |
| - nir: Remove nir_intrinsic_output_u8_as_fp16_pan |
| - pan/mdg: Avoid fusing ld_vary_16 with non-zero component |
| - panfrost: Calculate varying size by format |
| - panfrost: Add panfrost_streamout_offset helper |
| - panfrost: Introduce bitfields for tracking varyings |
| - panfrost: Determine varying buffer presence |
| - panfrost: Emit unlinked varyings |
| - panfrost: Emit special varyings |
| - panfrost: Emit xfb records |
| - panfrost: Add helper to determine if we are capturing |
| - panfrost: Add high-level varying emit |
| - panfrost: Use new varying linking |
| - panfrost: Remove unused routines |
| - panfrost: Allow R/RG/RGB varyings |
| - panfrost: Only store varying formats |
| - panfrost: Use shader_info harder |
| - panfrost: Override varying format to minimal precision |
| - panfrost: Demote mediump varyings to fp16 |
| - pan/mdg: Explicitly type 64-bit uniform moves |
| - pan/mdg: Analyze types for 64-bitness in RA |
| - pan/mdg: Prefer type over regmode for schedule constraints |
| - pan/mdg: Precolour blend inputs |
| - panfrost: Merge bifrost_bo/midgard_bo |
| - panfrost: Update sampler view in Bifrost path |
| - panfrost: Fix level_2 |
| - panfrost: Correctly calculate tiled stride |
| - panfrost: Enable AFBC for RGB565 |
| - panfrost: Simplify AFBC format check |
| - pan/mdg: Factor out unit check |
| - pan/mdg: Allow scheduling "x + x" to multipliers |
| - pan/mdg: Canonicalize (x * 2.0) to (x + x) |
| - pan/mdg: Reassociate adds for multiply-by-two |
| - nir: Propagate \*2*16 conversions into vectors |
| - panfrost: Specify stack_shift on SFBD |
| - pan/mdg: Defer nir_fuse_io_16 until after opts |
| - pan/mdg: Don't assign destination in writeout block to r1 |
| - pan/mdg: Remove bundle interference code |
| - pan/mdg: Schedule writeout to VLUT |
| - pan/mdg: Defer smul, vlut until after writeout moves |
| - pan/mdg: Allow Z/S writes to use any 2nd stage unit |
| - pan/mdg: Prioritize non-moves on VADD/VLUT |
| - pan/mdg: Skip r1.w write where possible |
| - pan/mdg: Schedule based on liveness |
| - pan/mdg: Respect type/mask in mir_lower_special_reads |
| - pan/mdg: Fix indirect UBO swizzles |
| - pan/decode: Fix MSAA texture decoding |
| - pan/decode: Identify layered MSAA flag |
| - pan/mdg: Allow ignoring move mode |
| - pan/mdg: Handle GLSL_SAMPLER_DIM_MS |
| - pan/mdg: Handle nir_tex_src_ms_index |
| - pan/mdg: Handle nir_texop_txf_ms |
| - pan/mdg: Use _VTX tag for texelFetch in frag shaders |
| - panfrost: Set depth to sample_count for MSAA 2D |
| - panfrost: Identify layer_stride |
| - panfrost: Allocate space for multisampling |
| - panfrost: Index texture by sample |
| - panfrost: Include pointer for each sample |
| - panfrost: Set layer_stride for multisampled rendering |
| - panfrost: Don't advertise MSAA 2x |
| - panfrost: Identify coverage_mask |
| - panfrost: Pass sample_mask to the hardware |
| - panfrost: Implement alpha-to-coverage |
| - panfrost: Identify depth/stencil layer strides |
| - panfrost: Set depth/stencil_layer_stride accordingly |
| - panfrost: Enable MSAA if we render to such a surface |
| - panfrost: Save sample_mask before blitting |
| - panfrost: Expose MSAA 4x |
| - glsl: Handle 16-bit types in loop analysis |
| - docs/features: Track Panfrost |
| - panfrost: Introduce pan_pool struct |
| - panfrost: Allocate pool BOs against the pool |
| - panfrost: Track the device through the pool |
| - panfrost: Expose pool-based allocation API |
| - panfrost: Move debug flags into the device |
| - panfrost: Drop Gallium-local pan_bo_create wrapper |
| - panfrost: Move pool routines to common code |
| - panfrost: Factor out scoreboarding state |
| - panfrost: Pass polygon_list to tiler init function |
| - panfrost: Drop batch from scoreboard routines |
| - panfrost: Move scoreboarding routines to common |
| - panfrost: Handle PIPE_FORMAT_X24S8_UINT |
| - panfrost: Handle PIPE_FORMAT_S8_UINT |
| - panfrost: Move panfrost_translate_texture_type |
| - panfrost: Report blend shader work count |
| - panfrost: Clamp pure int pixels |
| - panfrost: Generate shader variants on framebuffer bind |
| - panfrost: Always use SOFTWARE for pure formats |
| - panfrost: Extend fetched framebuffer results |
| - panfrost: Fix fence leak |
| - panfrost: Fix write to free'd memory |
| - panfrost: Add a sparse array to map GEM handles to BOs |
| - panfrost: Index BOs from the BO map sparse array |
| - panfrost: Merge PAN_BO_IMPORTED/PAN_BO_EXPORTED |
| - panfrost: Remove PAN_BO_COHERENT_LOCAL |
| - panfrost: Remove PAN_BO_DONT_REUSE |
| - panfrost: Remove panfrost_bo_access type |
| - panfrost: Compact unused BO flag bits |
| - panfrost: Add format codes for new compressed textures |
| - panfrost: Pipe in compressed texture feature mask |
| - panfrost: Filter compressed texture formats |
| - panfrost: Map PIPE_{DXT, RGTC, BPTC} to MALI_BCn |
| - docs/features: Update ASTC entries for Panfrost |
| - pan/mdg: Bump compiler RT maximum |
| - pan/mdg: Identify per-sample interpolation mode |
| - pan/mdg: Implement gl_SampleID |
| - panfrost: Force Z/S writeback |
| - panfrost: Expose panfrost_get_blend_shader |
| - panfrost: Add MALI_PER_SAMPLE bit |
| - panfrost: Include sample count in payload estimates |
| - panfrost: Identify zs_samples field |
| - panfrost: Add rectangle subtraction algorithm |
| - panfrost: Handle per-sample shading |
| - panfrost: Set zs_samples as necessary |
| - panfrost: Track surfaces drawn per-batch |
| - panfrost: Extract panfrost_batch_reserve_framebuffer |
| - panfrost: Use Midgard-specific reloads |
| - panfrost: Call util_blitter_save_fragment_constant_buffer_slot |
| - panfrost: Overhaul tilebuffer allocations |
| - panfrost: Set PIPE_CAP_MIXED_COLORBUFFER_FORMATS |
| - panfrost: Fix sRGB clear colour packing |
| - panfrost: Implement Z32F_S8 blits |
| - panfrost: Abort on unsupported blit |
| - panfrost: Avoid integer underflow in rt_count_1 |
| - panfrost: Honour cso->compare_mode |
| - panfrost: Fix faults with RASTERIZER_DISCARD |
| - panfrost: Report CAPs more honestly |
| - panfrost: Enable Chromium |
| - panfrost: Revert "Disable frame throttling" |
| - docs/features: Mark trivial missed feature |
| - panfrost: Enable FP16 by default |
| - panfrost: Avoid wait=true flushing all batches |
| - panfrost: Remove wait parameter to flush_all_batches |
| - panfrost: Skip specifying in_syncs |
| - panfrost: Allocate syncobjs in panfrost_flush |
| - panfrost: Remove unused batch_fence->signaled |
| - panfrost: Remove unused batch_fence->ctx |
| - pan/bit: Update f32->f16 convert test |
| - pan/bit: Remove BI_SHIFT stub |
| - pan/mdg: Mask spills from texture write |
| - pan/mdg: Test for SSA before chasing addresses |
| - docs/features: Add GL_EXT_multisampled_render_to_texture |
| - panfrost: Add MSAA mode selection field |
| - panfrost: Implement EXT_multisampled_render_to_texture |
| - panfrost: Set STRIDE_4BYTE_ALIGNED_ONLY |
| - panfrost: Fix WRITES_GLOBAL bit |
| - pan/mdg: Ensure barrier op is set on texture |
| - panfrost: Fix blend leak for render targets 5-8 |
| - panfrost: Free cloned NIR shader |
| - panfrost: Free NIR of blit shaders |
| - panfrost: Free hash_to_temp map |
| - pan/mdg: Free previous liveness |
| - panfrost: Use memctx for sysvals |
| - panfrost: Free batch->dependencies |
| - pan/mdg: Fix discard encoding |
| - pan/mdg: Fix perspective combination |
| - pan/bit: Set d3d=true for CMP tests |
| |
| Andreas Baierl (1): |
| |
| - nir/lower_int_to_float: Handle umax and umin
| |
| Andres Gomez (10): |
| |
| - .mailmap: add an alias for Iago Toral Quiroga |
| - .mailmap: add an alias for Andres Gomez |
| - gitlab-ci: update tracie README after changes in main script |
| - scripts: remove unittest.mock dependency when not used |
| - gitlab-ci: create always the "results" directory with tracie |
| - gitlab-ci: correct tracie behavior with replay errors |
| - gitlab-ci: build gfxreconstruct from the "dev" branch |
| - gitlab-ci: get the last frame from a gfxr trace using gfxrecon-info |
| - gitlab-ci/traces: updated paths and checksums for POLARIS10 traces |
| - gitlab-ci: Test AMD's Raven with traces |
| |
| Andrey Vostrikov (1): |
| |
| - egl/x11: Free memory allocated for reply structures on error |
| |
| Andrii Simiklit (3): |
| |
| - glsl_type: don't serialize padding bytes from glsl_struct_field |
| - i965/vec4: Ignore swizzle of VGRF for use by var_range_end() |
| - glsl: fix crash on glsl macro redefinition |
| |
| Ani (1): |
| |
| - drirc: Enable glthread for rpcs3 |
| |
| Anuj Phogat (6): |
| |
| - intel/devinfo: Add is_dg1 to device info |
| - intel/l3: Add DG1 L3 configuration |
| - intel/ehl: Use GEN11_URB_MIN_MAX_ENTRIES in device info |
| - intel/ehl: Use macro GEN11_LP_FEATURES in device info |
| - intel/ehl: Rename gen_device_info struct |
| - intel/ehl: Add new PCI-IDs |
| |
| Arcady Goldmints-Orlov (4): |
| |
| - anv: increase minUniformBufferOffsetAlignment to 64 |
| - intel/compiler: fix alignment assert in nir_emit_intrinsic |
| - nir/spirv/glsl450: increase asin(x) precision |
| - intel/compiler: Always apply sample mask on Vulkan. |
| |
| Axel Davy (19): |
| |
| - st/nine: Set correctly blend max_rt |
| - gallium/util: Fix leak in the live shader cache |
| - ttn: Add new allow_disk_cache parameter |
| - ttn: Implement disk cache |
| - st/nine: Enable ttn cache |
| - radeonsi: Enable tgsi to nir disk cache |
| - st/nine: Add checks for pure device |
| - st/nine: Return error when setting invalid depth buffer |
| - st/nine: Do not return invalidcall on getrenderstate |
| - st/nine: Pass more adapter formats for CheckDepthStencilMatch |
| - st/nine: Improve return error code in CheckDeviceFormat |
| - st/nine: Fix uninitialized variable in BEM() |
| - st/nine: Fix a crash if the state is not initialized |
| - st/nine: Add missing NULL checks |
| - st/nine: Increase available GPU memory |
| - st/nine: Retry allocations after freeing some space |
| - st/nine: Improve pDestRect handling |
| - st/nine: Ignore pDirtyRegion |
| - st/nine: Handle full pSourceRect better |
| |
| Bas Nieuwenhuizen (80): |
| |
| - radv: Fix implicit sync with recent allocation changes. |
| - radv: Extend tiling flags to 64-bit. |
| - radv: Provide a better error for permission issues with priorities. |
| - radv: Support VK_PIPELINE_COMPILE_REQUIRED_EXT. |
| - radv: Support VK_PIPELINE_CREATE_EARLY_RETURN_ON_FAILURE_BIT_EXT. |
| - radv: Support VK_PIPELINE_CACHE_CREATE_EXTERNALLY_SYNCHRONIZED_BIT_EXT. |
| - radv: Expose VK_EXT_pipeline_creation_cache_control. |
| - radv/winsys: Finish mapping for sparse residency. |
| - radv/winsys: Remove extra sizeof multiply. |
| - radv: Handle failing to create .cache dir. |
| - radv: Remove dead code. |
| - radv: Do not close fd -1 when NULL-winsys creation fails. |
| - radv: Implement vkGetSwapchainGrallocUsage2ANDROID. |
| - frontend/dri: Implement mapping individual planes. |
| - util/format: Add VK_FORMAT_D16_UNORM_S8_UINT. |
| - util/format: Use correct pipe format for VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM. |
| - util/format: Add more multi-planar formats. |
| - gallium/dri: Remove lowered_yuv tracking for plane mapping. |
| - radeonsi: Explicitly map Z16_UNORM_S8_UINT to None for GFX10. |
| - amd/common,radeonsi: Move gfx10_format_table to common. |
| - radeonsi: Define gfx10_format in the common header. |
| - radv: Include gfx10_format_table.h only from a single source file. |
| - radv: Use common gfx10_format_table.h |
| - radv: Use ac_surface to determine fmask enable. |
| - radv: Pass no_metadata_planes info in to ac_surface. |
| - radv: Enforce the contiguous memory for DCC layers in ac_surface. |
| - radv: Rely on ac_surface for avoiding cmask for linear images. |
| - radv: Use offsets in surface struct. |
| - radv: Disable DCC in ac_surface. |
| - radv: Disable HTILE in ac_surface. |
| - radv: Allocate values/predicates at the end of the image. |
| - amd/common: Add total alignment calculation. |
| - radv: Use ac_surface to allocate aux surfaces. |
| - vulkan/wsi/x11: Ensure we create at least minImageCount images. |
| - radv/winsys: Deal with realloc failures in BO lists. |
| - radv: Handle mmap failures. |
| - radv/winsys: Distinguish device/host memory errors. |
| - radv: Make radv_alloc_shader_memory static. |
| - turnip: semaphore support. |
| - meson: Do not require shader cache for radv. |
| - amd/addrlib: fix another C++ one definition rule violation |
| - radv: Set handle types in Android semaphore/fence import. |
| - radv: Always enable PERFECT_ZPASS_COUNTS. |
| - Revert "radv: add support for MRTs compaction to avoid holes" |
| - radv: Use correct semaphore handle type for Android import. |
| - amd/llvm: Mark pointer function arguments as 32-byte aligned. |
| - amd/common: Cache intra-tile addresses for retile map. |
| - amd/addrlib: Clean up unused colorFlags argument |
| - amd/registers: add RLC_PERFMON_CLK_CNTL for pre-GFX10 |
| - radeonsi: Inhibit clock-gating for perf counters. |
| - meson: Add missing git_sha1.h dependency.
| - amd: Add detection of timeline semaphore support. |
| - radv/winsys: Add binary syncobj ABI changes for timeline semaphores. |
| - radv: Add thread for timeline syncobj submission. |
| - radv: Add winsys support for submitting timeline syncobj. |
| - radv: Add winsys functions for timeline syncobj. |
| - radv: Add timeline syncobj for timeline semaphores. |
| - radv: Fix uninitialized variable in renderpass. |
| - vulkan/wsi/x11: report device-group present rectangles with prime. |
| - vulkan/wsi: Convert usage of -1 to UINT32_MAX. |
| - radv: Fix host->host signalling with legacy timeline semaphores. |
| - mesa/st: Actually free the driver part of memory objects on destruction. |
| - radv: Don't use both DCC and CMASK for single sample images. |
| - radv: Fix assert that is too strict. |
| - radv: Do not consider layouts fast-clearable on compute queue. |
| - radv: When importing an image, redo the layout based on the metadata. |
| - radv: Use getter instead of setter to extract value. |
| - driconf: Support selection by Vulkan applicationName. |
| - radv: Override the uniform buffer offset alignment for World War Z. |
| - radv: Fix handling of attribs 16-31. |
| - radv: Remove conformance warnings with ACO. |
| - radv: Update CTS version. |
| - radv: Fix 3d blits. |
| - radv: Fix threading issue with submission refcounts. |
| - radv: Avoid deadlock on bo_list. |
| - spirv: Deal with glslang not setting NonUniform on constructors. |
| - radeonsi: Work around Wasteland 2 bug. |
| - spirv: Deal with glslang bug not setting the decoration for stores. |
| - ac/surface: Fix depth import on GFX6-GFX8. |
| - st/mesa: Deal with empty textures/buffers in semaphore wait/signal. |
| |
| Ben Skeggs (38): |
| |
| - nir: use bitfield_insert instead of bfi in nir_lower_double_ops |
| - nvir: bump max encoding size of instructions |
| - nvir: introduce OP_LOP3_LUT |
| - nvir: introduce OP_WARPSYNC |
| - nvir: introduce OP_BREV with lowering to EXTBF_REV for current GPUs |
| - nvir: introduce OP_SHF |
| - nvir: introduce OP_BMSK |
| - nvir: introduce OP_SGXT |
| - nvir: introduce OP_FINAL |
| - nvir: add constant folding for OP_PERMT |
| - nvir: run replaceZero() before replaceCvt() |
| - nvir/nir: fix fragment program output when using MRT |
| - nvir/nir: move nir options to codegen |
| - nvir/nir: flesh out options |
| - nvir/nir: turn on lower_rotate |
| - nvir/nir: implement nir_op_extract_u8 |
| - nvir/nir: implement nir_op_extract_i8 |
| - nvir/nir: implement nir_op_extract_u16 |
| - nvir/nir: implement nir_op_extract_i16 |
| - nvir/nir: implement nir_op_urol |
| - nvir/nir: implement nir_op_uror |
| - nvir/nir: nir expects the shift amount to wrap, rather than clamp |
| - nvir/nir: use nir_lower_idiv |
| - nvir/gm107: implement OP_PERMT |
| - nvir/gm107: replace SHR+AND+AND with PRMT+PRMT in PFETCH lowering |
| - nvir/gm107: separate out header for sched data calculator |
| - nvir/nir/gm107: split nir shader compiler options from gf100 |
| - nvir/nir/gm107: turn on nir_lower_extract64 |
| - nvir/nir/gm107: switch off lower_extract_byte |
| - nvir/nir/gm107: switch off lower_extract_word |
| - nvir/gv100: initial support |
| - nvir/gv100: enable support for tu1xx |
| - nvc0: use NVIDIA headers for GK104->GM2xx compute QMD |
| - nvc0: use NVIDIA headers for GP100- compute QMD |
| - nvc0: move setting of entrypoint for a shader stage to a function |
| - nvc0: remove hardcoded blitter vertprog |
| - nvc0: initial support for gv100 |
| - nvc0: initial support for tu1xx |
| |
| Benjamin Cheng (1): |
| |
| - drirc: Add picom to adaptive_sync exclusion list |
| |
| Benjamin Tissoires (3): |
| |
| - CI: reduce bandwidth for git pull |
| - gitlab-ci: update ci-fairy minio to latest upstream |
| - gitlab-ci: do not run full CI on scheduled pipelines |
| |
| Blaž Tomažič (1): |
| |
| - radeonsi: Fix omitted flush when moving suballocated texture |
| |
| Boris Brezillon (14): |
| |
| - spirv: Split the vtn_emit_scoped_memory_barrier() logic |
| - nir: Replace the scoped_memory barrier by a scoped_barrier |
| - intel/compiler: Extract control barriers from scoped barriers |
| - spirv: Use scoped barriers for SpvOpControlBarrier |
| - nir: Add new rules to optimize NOOP pack/unpack pairs |
| - nir: Use a switch in build_deref_offset()/deref_instr_get_const_offset() |
| - nir: Allow casts in nir_deref_instr_get[_const]_offset() |
| - freedreno: Initialize lower_int64_options to a proper value |
| - nir: Stop passing an options arg to nir_lower_int64() |
| - nir: Extend nir_lower_int64() to support i2f/f2i lowering |
| - intel: Set int64_options to ~0 when lowering 64b ops |
| - nir: Get rid of __[u]int64_to_fp32() and __fp32_to_[u]int64() |
| - nir: Fix i64tof32 lowering |
| - spirv: Add a vtn_get_mem_operands() helper |
| |
| Boyuan Zhang (2): |
| |
| - radeon/vcn/enc: Re-write PPS encoding for HEVC |
| - radeon/vcn: bump vcn3.0 encode major version to 1 |
| |
| Brian Ho (14): |
| |
| - turnip: Execute ir3_nir_lower_gs pass again |
| - turnip: Fill out VkPhysicalDeviceSubgroupProperties |
| - nir: Support sysval tess levels in SPIR-V to NIR |
| - nir: Add an option for lowering TessLevelInner/Outer to vecs |
| - turnip: Lower shaders for tessellation |
| - turnip: Offset by component when lowering gl_TessLevel* |
| - turnip: Parse tess state and support PATCH primtype |
| - turnip: Allocate tess BOs as a function of draw size |
| - turnip: Update VFD_CONTROL with tess system values |
| - turnip: Emit HS/DS user consts as draw states |
| - turnip: Support tess for draws |
| - turnip: Force sysmem for tessellation |
| - ir3: Unconditionally enable MERGEDREGS on a6xx |
| - turnip: Enable tessellationShader physical device feature |
| |
| Caio Marcelo de Oliveira Filho (32): |
| |
| - intel/dev: Bail when INTEL_DEVID_OVERRIDE is not valid |
| - intel/fs: Clean up variable group size handling in backend |
| - intel/fs: Add an option to lower variable group size in backend |
| - intel/fs: Add and use a new load_simd_width_intel intrinsic |
| - intel: Let drivers call brw_nir_lower_cs_intrinsics() |
| - iris: Implement ARB_compute_variable_group_size |
| - util/list: Add list_foreach_entry_from_safe |
| - nir: Use deref intrinsics to set writes_memory when gathering info |
| - intel/fs: Use writes_memory from shader_info |
| - nir: Consider atomic counter intrinsics when setting writes_memory |
| - intel/fs: Remove unused emission of load_simd_width_intel
| - intel/fs: Remove unused state from brw_nir_lower_cs_intrinsics |
| - intel/fs: Early return when can't satisfy explicit group size |
| - intel/fs: Remove redundant assert() |
| - intel/fs: Remove min_dispatch_width spilling decision from RA |
| - intel/fs: Support INTEL_DEBUG=no8,no32 in compute shaders |
| - intel/fs: Add helper to get prog_offset and simd_size |
| - i965: Use new helper functions to pick SIMD variant for CS |
| - iris: Set CS KernelStatePointer at dispatch |
| - iris: Use new helper functions to pick SIMD variant for CS |
| - anv: Use new helper functions to pick SIMD variant for CS |
| - intel/fs: Generate multiple CS SIMD variants for variable group size |
| - iris, i965: Drop max_variable_local_size |
| - iris, i965: Update limits for ARB_compute_variable_group_size |
| - intel: Add helper to calculate GPGPU_WALKER::RightExecutionMask |
| - nir: Fix printing execution scope of a scoped barrier |
| - spirv: Memory semantics is optional for OpControlBarrier |
| - intel/fs: Add Fall-through comment |
| - nir: Fix logic that ends combine barrier sequence |
| - spirv: Handle most execution modes earlier |
| - nir: Filter modes of scoped memory barrier in nir_opt_load_store_vectorize |
| - spirv: Propagate explicit layout only in types that need it |
| |
| Charmaine Lee (1): |
| |
| - llvmpipe: do not enable tessellation shader without llvm coroutines support |
| |
| Chris Forbes (12): |
| |
| - bifrost: Set RTZ rounding mode for f2i conversion |
| - bifrost: Lower x->bool conversions to != 0 |
| - bifrost: Emit "d3d" variant of comparison instructions |
| - bifrost: Document d3d/gl comparison control bit |
| - bifrost: Add lowering for b2i32 |
| - bifrost: Add support for nir_op_inot |
| - bifrost: Add support for nir_op_ishl |
| - bifrost: Add support for nir_op_uge |
| - bifrost: Add support for nir_op_imul |
| - bifrost: Add support for nir_op_iabs |
| - bifrost: Honor src swizzle in special math ops |
| - bifrost: Fix packing of ADD_FEXP2_FAST |
| |
| Chris Wilson (6): |
| |
| - iris: Place a seqno at the end of every batch |
| - iris: Convert fences to using lightweight seqno |
| - iris: Store a seqno for each batch in the fence |
| - iris: Initialise stub iris_seqno to 0 |
| - iris: Rename iris_seqno to iris_fine_fence |
| - iris: Fixup copy'n'paste mistake in Makefile.sources |
| |
| Christian Gmeiner (31): |
| |
| - etnaviv: fix SAMP_ANISOTROPY register value |
| - etnaviv: do not use int filter when anisotropic filtering is used |
| - ci: bare-metal: make it possible to use a script for serial |
| - ci: extend expect-output.sh |
| - ci: add U-Boot specific fetch strings |
| - etnaviv: drop translate_blend(..) |
| - ci: add arm_test-base docker image |
| - ci: use separate docker images for baremetal builds |
| - ci: fix possible spuriously run of jobs |
| - etnaviv: delete not used struct |
| - etnaviv: convert enums |
| - etnaviv: move etna_lower_io(..) to etnaviv_nir.c |
| - etnaviv: get rid of etna_compile dependency |
| - etnaviv: move etna_lower_alu(..) to etnaviv_nir.c |
| - etnaviv: drop OPT_V define |
| - etnaviv: make more use of compile_error(..) |
| - etnaviv: move liveness related stuff into own file |
| - etnaviv: merge struct etna_compile and etna_state |
| - etnaviv: drop emit macro |
| - etnaviv: move functions that generate asm to own file |
| - etnaviv: move nir compiler related stuff into .c file |
| - etnaviv: move ra into own file |
| - etnaviv: replace prims-emitted query |
| - ci: bare-metal: use nginx to get results from DUT |
| - etnaviv: explicitly set nir_variable_mode |
| - etnaviv: introduce struct etna_compiler |
| - etnaviv: move shader_count to etna_compiler |
| - etnaviv: do register setup only once |
| - etnaviv: fix nir validation problem |
| - etnaviv: call nir_lower_bool_to_bitsize |
| - etnaviv: completely turn off MSAA |
| |
| Christopher Egert (2): |
| |
| - radv: use util_float_to_half_rtz |
| - r600: Use TRUNC_COORD on samplers |
| |
| Clément Guérin (1): |
| |
| - radv: Always expose non-visible local memory type on dedicated GPUs |
| |
| Con Kolivas (1): |
| |
| - Linux: Change minimum priority threads from SCHED_IDLE to nice 19 SCHED_BATCH. |
| |
| Connor Abbott (88): |
| |
| - tu: Support pipelines without a fragment shader |
| - tu: Add a "scratch bo" allocation mechanism |
| - tu: Add noubwc debug flag to disable UBWC |
| - tu: Implement fallback linear staging blit for CopyImage |
| - freedreno/a6xx: Document dual-src blending enable bits |
| - ir3: Fixup dual-source blending slot |
| - tu: Move RENDER_COMPONENTS setting to pipeline state |
| - tu: Implement dual-src blending |
| - tu: Advertise COLOR_ATTACHMENT_BLEND_BIT for blendable formats |
| - tu: Always initialize image_view fields for blit sources |
| - tu: Fall back to 3d blit path for BC1_RGB_* formats |
| - tu: Fix buffer compressed pitch calculation with unaligned sizes |
| - tu: Support VK_FORMAT_FEATURE_BLIT_SRC_BIT for texture-only formats |
| - tu: Fix IBO descriptor for cubes |
| - tu: Respect VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT |
| - tu: Add missing storage image/texel buffer bits |
| - tu: Remove useless post-binning flushes |
| - tu: Don't actually track seqno's for events |
| - tu: Remove useless event_write helpers |
| - tu: Rewrite flushing to use barriers |
| - tu: Fix context faults loading unused descriptor sets |
| - ir3: Pass reserved_user_consts to ir3_shader_from_nir() |
| - tu: Remove num_samp hack |
| - tu: Use the ir3 shader API |
| - tu: Remove tu_shader_compile_options |
| - tu: Set num_components to 0 when building bindless intrinsics |
| - ir3: Don't calculate num_samp ourselves |
| - tu: Actually remove dead variables after io lowering |
| - ir3: Split out variant-specific lowering and optimizations |
| - ir3, freedreno: Round up constlen earlier |
| - ir3: Include ir3_compiler from ir3_shader |
| - ir3: Support variants with different constlen's |
| - ir3: Add ir3_trim_constlen() |
| - tu: Share constlen between different stages properly |
| - freedreno: Refactor ir3_cache shader compilation |
| - freedreno: Share constlen between different stages properly |
| - freedreno: On a5xx+ INDX_SIZE is MAX_INDICES |
| - freedreno/registers: Label firstIndex field in CP_DRAW_INDX_OFFSET |
| - tu: Pass firstIndex directly to CP_DRAW_INDX_OFFSET |
| - freedreno/a6xx: use firstIndex field |
| - nir: Refactor load/store intrinsic helper |
| - nir: add vec2_index_32bit_offset address format |
| - tu: Rewrite variable lowering |
| - tu: Enable KHR_variable_pointers |
| - ir3: Add layer_zero variant bit |
| - tu: Force gl_Layer to 0 when necessary |
| - freedreno/a6xx: Force gl_Layer to 0 when necessary |
| - freedreno: Include adreno_pm4.xml.h before adreno_a6xx.xml.h |
| - freedreno: Sync registers with envytools |
| - freedreno/a6xx: Rename and document HLSQ_UPDATE_CNTL |
| - freedreno/a6xx: Add some documentation for shared consts |
| - tu: Don't invalidate irrelevant state when changing pipeline |
| - freedreno/a6xx: Add stencilref register info |
| - ir3: Handle gl_FragStencilRefARB |
| - tu: Enable VK_EXT_shader_stencil_export |
| - freedreno: Add a helper for computing guardband sizes |
| - tu: Use common guardband helper |
| - freedreno: Use common guardband helper |
| - freedreno/ir3: Fix SSBO size for bindless SSBO's |
| - tu: Enable VK_EXT_depth_clip_enable |
| - freedreno: Clean up CP_DRAW_MULTI_INDIRECT definition |
| - freedreno: Add INDIRECT_COUNT CP_DRAW_INDIRECT_MULTI variants |
| - tu: Integrate WFI/WAIT_FOR_ME/WAIT_MEM_WRITES with cache tracking |
| - tu: Add missing wfi to tu6_emit_hw() |
| - tu: Implement VK_KHR_draw_indirect_count |
| - tu: Fix empty blit scissor case |
| - tu: Fix hangs for DS with no output |
| - tu: Detect invalid-for-binning renderpass dependencies |
| - tu: Enable vertex & fragment stores & atomics |
| - tu: Fix descriptor update templates with input attachments |
| - ir3: Validate bindless samp_tex correctly |
| - ir3: Remove redundant samp_tex validation |
| - ir3: Fix incorrect src flags for samp_tex |
| - tu: Enable resource dynamic indexing |
| - freedreno/rnn: Return success when parsing addvariant |
| - tu: Dump CP_DRAW_INDIRECT_MULTI draw BO's |
| - freedreno/rnn: Support stripes in rnndec_decodereg |
| - freedreno/cffdec: Handle CP_DRAW_INDIRECT_MULTI like other draws |
| - freedreno: Add trace for CP_DRAW_INDIRECT_MULTI |
| - freedreno/a6xx: Fix CP_BIN_SIZE_ADDRESS name |
| - freedreno/rnn: Make rnn_decode_enum() respect variants |
| - freedreno/cffdec: Stop open-coding enum parsing |
| - freedreno/afuc: Add missing rnn_prepdb() |
| - freedreno/afuc: Fix PM4 enum parsing |
| - tu: Fix DST_INCOHERENT_FLUSH copy/paste error |
| - freedreno: Document draw predication packets |
| - tu: Reset has_tess after renderpass |
| - tu: Implement VK_EXT_conditional_rendering |
| |
| D Scott Phillips (4): |
| |
| - intel/fs: Update location of Render Target Array Index for gen12 |
| - anv,iris: Fix input vertex max for tcs on gen12 |
| - intel/dump_gpu: Fix name of LD_PRELOAD in env append logic |
| - anv/gen11+: Disable object level preemption |
| |
| Daniel Schürmann (54): |
| |
| - aco: either copy-propagate or inline create_vector operands |
| - aco: coalesce parallelcopies during register allocation |
| - nir: add nir_intrinsic_elect to divergence analysis |
| - nir: refactor divergence analysis state |
| - nir: rework phi handling in divergence analysis |
| - nir: simplify phi handling in divergence analysis |
| - nir: reset ssa-defs as non-divergent during divergence analysis instead of upfront |
| - aco: fix WQM coalescing |
| - aco: restrict copying of create_vector operands to GFX9+ |
| - aco: don't move create_vector subdword operands to unsupported register offsets |
| - aco: fix corner case in register allocation |
| - aco: don't allow unaligned subdword accesses on GFX6/7 |
| - aco: fix register assignment for p_create_vector on GFX6/7 |
| - aco: simplify statistics collection for copies |
| - aco: use full-register instructions to implement subdword packing on GFX6/7 |
| - aco: Workarounds subdword lowering on GFX6/7 |
| - aco: adjust GFX6 subdword lowering workarounds for 8bit |
| - aco: add and use scratch SGPR to lower subdword p_create_vector on GFX6/7 |
| - aco: coalesce copies more aggressively when lowering to hw |
| - aco: skip partial copies on first iteration when lowering to hw |
| - aco: optimize packing of 16bit subdword registers on GFX6/7 |
| - aco: remove unnecessary split- and create_vector instructions for subdword loads |
| - aco: fix shared subdword loads |
| - aco: reorder calls to aco_validate() and cleanup aco_compile_shader() |
| - aco: don't allow SGPRs on logical phis |
| - aco: fix WQM handling in nested loops |
| - radv/aco: implement logic64 instead of lowering |
| - aco: align swap operations to 4 bytes on GFX6/7 |
| - aco: don't allow partial copies on GFX6/7 |
| - radv: introduce RADV_DEBUG=llvm option |
| - radv: change use_aco -> use_llvm |
| - radv: enable ACO by default |
| - aco: fix partial copies on GFX6/7 |
| - aco: remove superfluous (bool & exec) if the result comes from VOPC
| - nir: also move vecN in case of nir_move_copies |
| - nir: refactor nir_can_move_instr |
| - nir/algebraic: optimize bcsel(a, 0, 1) to b2i |
| - nir: also move b2i in case of nir_move_copies |
| - nir/algebraic: optimize iand/ior of (n)eq zero |
| - nir/algebraic: add optimizations for fsign/isign |
| - nir/algebraic: add some more unop + bcsel optimizations |
| - nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x) |
| - nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a) |
| - nir/algebraic: add distributive rules for ior/iand |
| - nir/algebraic: propagate b2i out of ior/iand |
| - nir/algebraic: fold some nested bcsel |
| - aco: fix scratch loads which cross element_size boundaries |
| - aco: ensure to not extract more components than have been fetched |
| - aco: don't split store data if it was already split into more elements |
| - aco: prevent infinite recursion in RA for subdword variables |
| - aco: ensure readfirstlane subdword operands are always dword aligned |
| - radv: call radv_nir_lower_ycbcr_textures after first optimizations |
| - aco: add GFX6/7 subdword lowering tests |
| - aco: execute branch instructions in WQM if necessary |
| |
| Daniel Stone (13): |
| |
| - CI: Disable Panfrost T7x0 jobs |
| - CI: Re-enable Panfrost T7x0 jobs |
| - llvmpipe: Expect increased exp precision on Windows |
| - CI: Windows: Build LLVM and llvmpipe |
| - CI: Disable Panfrost T720/T760 |
| - Revert "CI: Disable Panfrost T720/T760" |
| - CI: Enable assertions on Windows |
| - CI: Try shared libraries on Windows |
| - CI: Correct build-directory path on Windows, and keep it |
| - CI: Re-enable the Windows VS2019 build job |
| - CI: Temporarily disable Panfrost T860 jobs |
| - CI: Re-enable Panfrost T860 jobs |
| - CI: Disable Windows build due to unstable infrastructure |
| |
| Danylo Piliaiev (25): |
| |
| - glsl: rename has_implicit_uint_to_int_conversion to *_int_to_uint_* |
| - i965: Fix out-of-bounds access to brw_stage_state::surf_offset |
| - anv: Translate relative timeout to absolute when calling anv_timelines_wait |
| - anv: Fix deadlock in anv_timelines_wait |
| - meson: Disable GCC's dead store elimination for memory zeroing custom new |
| - mesa: Fix double-lock of Shared->FrameBuffers and usage of wrong mutex |
| - st/mesa: Clear texture's views when texture is removed from Shared->TexObjects |
| - intel/fs: Work around dual-source blending hangs in combination with SIMD16 |
| - glsl: Don't replace lrp pattern with lrp if arguments are not floats |
| - glsl: inline functions with unsupported return type before converting to nir |
| - i965: Work around incorrect usage of glDrawRangeElements in UE4 |
| - st/mesa: account for "loose", per-mipmap level textures in CopyImageSubData |
| - iris: Honor scanout requirement from DRI |
| - iris: Fix fast-clearing of depth via glClearTex(Sub)Image |
| - nir/opt_if: Fix opt_if_simplification when else branch has jump |
| - nir/tests: Add tests for opt_if_simplification |
| - st/mesa: Treat vertex outputs absent in outputMapping as zero in mesa_to_tgsi |
| - anv/nir: Unify inputs_read/outputs_written between geometry stages |
| - spirv: Only require bare types to match when copying variables |
| - glsl: Eliminate out-of-bounds triop_vector_insert |
| - intel/compiler: Fix pointer arithmetic when reading shader assembly |
| - glsl: Eliminate assignments to out-of-bounds elements of vector |
| - nir/lower_io: Eliminate oob writes and return zero for oob reads |
| - nir/large_constants: Eliminate out-of-bounds writes to large constants |
| - nir/lower_samplers: Clamp out-of-bounds access to array of samplers |
| |
| Daryl W. Grunau (1): |
| |
| - prevent multiply defined symbols |
| |
| Dave Airlie (199): |
| |
| - i965: add support for gen 5 pipelined pointers to dump |
| - i965: disable shadow batches when batch debugging. |
| - draw/tess: free tessellation control shader i/o memory. |
| - llvmpipe/nir: free compute shader NIR |
| - llvmpipe: simple texture barrier implementation. |
| - gallivm/sample: add multisample support for texel fetch |
| - gallivm/sample: add multisample image operation support |
| - gallivm/nir/tgsi: add multisample texture sampling. |
| - gallivm/nir: add multisample support to image size |
| - gallivm/nir: add multisample image operations |
| - draw: introduce sampler num samples + stride members |
| - draw: add support for num_samples + sample_stride to the image paths |
| - llvmpipe: add num_samples/sample_stride support to jit textures |
| - llvmpipe: add samples support to image jit |
| - util: add a resource wrapper to get resource samples |
| - llvmpipe: add multisample support to texture allocator. |
| - llvmpipe: add a max samples define set to 4. |
| - gallium/util: split out zstencil clearing code. |
| - llvmpipe: fix race between draw and setting fragment shader. |
| - llvmpipe: add get_sample_position support (v2) |
| - llvmpipe/jit: pass fragment sample mask via jit context. |
| - llvmpipe: pass incoming sample_mask into fragment shader context. |
| - llvmpipe: add internal multisample texture mapping path. |
| - llvmpipe: add multisample resource copy region support. |
| - llvmpipe: add clear texture support for multisample textures. |
| - llvmpipe: handle multisample render target clears |
| - draw: disable point/line smoothing for multisample (v2) |
| - llvmpipe: pass color and depth sample strides into fragment shader. |
| - llvmpipe: record sample info for color/depth buffers in scene |
| - llvmpipe/rast: fix tile clearing for multisample color and depth tiles |
| - llvmpipe: plumb multisample state bit into setup code. |
| - llvmpipe: add multisample bit to fragment shader key. |
| - llvmpipe: change mask input to fragment shader to 64-bit. |
| - llvmpipe: add cbuf/zsbuf + coverage samples to the fragment shader key. |
| - gallivm: add sample id/pos intrinsic support |
| - gallivm: add mask api to force mask |
| - nir/tgsi: translate the interp location |
| - llvmpipe: pass interp location into interpolation code. |
| - llvmpipe: add centroid interpolation support. |
| - llvmpipe: add per-sample interpolation. |
| - llvmpipe: move getting mask value out of depth code. (v2) |
| - llvmpipe: add per-sample depth/stencil test |
| - llvmpipe: move some fs code around |
| - llvmpipe: multisample sample mask + early/late depth pass |
| - llvmpipe: handle multisample early depth test/late depth write |
| - llvmpipe: interpolate Z at sample points for early depth test. |
| - llvmpipe: handle multisample color stores. |
| - llvmpipe: hook up sample position system value |
| - llvmpipe: add multisample alpha to coverage support. |
| - llvmpipe: add multisample alpha to one support |
| - llvmpipe: handle gl_SampleMask writing. |
| - llvmpipe: don't allow branch to end for early Z with multisample |
| - llvmpipe: pass mask store into interp for centroid interpolation |
| - llvmpipe: move color storing earlier in frag shader |
| - llvmpipe: fix multisample occlusion queries. |
| - llvmpipe: disable opaque variant for multisample |
| - llvmpipe: add new rast api to pass full 64-bit mask. |
| - llvmpipe: add fixed point sample positions to scene. |
| - llvmpipe: build 64-bit coverage mask in rasterizer |
| - llvmpipe: fixup multisample coverage masks for covered tiles |
| - llvmpipe: generate multisample triangle rasterizer functions (v2) |
| - llvmpipe: choose multisample rasterizer functions per triangle (v2) |
| - llvmpipe: choose correct position for multisample |
| - llvmpipe: don't choose pixel centers for multisample |
| - drisw: add multisample support to sw dri layer. |
| - llvmpipe: enable 4x sample MSAA + texture multisample |
| - gallivm/sample: add num samples query for txqs (v2) |
| - gallivm/nir: hooks up texture samples queries |
| - llvmpipe: enable GL_ARB_shader_texture_image_samples |
| - llvmpipe: add min samples support to the fragment shader. |
| - llvmpipe: enable ARB_sample_shading |
| - llvmpipe: make sample position a global array. |
| - zink: enable conditional rendering if available |
| - r600: enable TEXCOORD semantic for TGSI. |
| - r600/sfn: plumb the chip class into the instruction emission |
| - r600/sfn: fix cayman float instruction emission. |
| - r600/sfn: cayman fix int trans op2 |
| - r600/sfn: add callstack non-evergreen support |
| - r600/sfn: add emit if start cayman support |
| - llvmpipe: don't use sample mask with 0 samples |
| - llvmpipe: use per-sample position not sample id for interp |
| - llvmpipe/interp: fix interpolating frag pos for sample shading |
| - llvmpipe: remove non-simple interpolation paths. |
| - gallivm/nir: add an interpolation interface. |
| - llvmpipe/interp: refactor out use of pixel center offset |
| - llvmpipe/interp: refactor out centroid calculations |
| - llvmpipe: add interp instruction support |
| - llvmpipe/fs: hook up the interpolation APIs. |
| - gallivm/nir: add sample_mask_in support |
| - llvmpipe: add gl_SampleMaskIn support. |
| - r600/sfn: fix nop channel assignment. |
| - llvmpipe: compute shaders work better with all the threads. |
| - llvmpipe: move coroutines out of noopt case |
| - ci: bump virglrenderer to latest version |
| - util/disk_cache: add fallback for disk_cache_get_function_identifier |
| - llvmpipe/cs: overhaul cs variant key state. |
| - llvmpipe/draw: drop variant number from function names. |
| - gallivm: rework coroutine malloc/free callouts. |
| - gallivm: rework debug printf hook to use global mapping. |
| - gallivm: add support for a cache object |
| - gallivm: skip operations if we have a cached object. |
| - gallivm: add cache interface to mcjit |
| - llvmpipe: add infrastructure for disk cache support |
| - gallivm: don't cache shaders that use fetch functions. |
| - llvmpipe/fs: add caching support |
| - llvmpipe/cs: add shader caching |
| - draw: add disk cache callbacks for draw shaders |
| - llvmpipe: hook draw disk cache up |
| - draw: add disk caching for draw shaders |
| - draw/gs: fix emitting inactive primitives crash |
| - draw/gs: add more info to debugging. |
| - gallivm/nir: add group barrier support |
| - llvmpipe: fix subpixel bits reporting. |
| - gallivm/format: convert unsigned values to float properly. |
| - gallivm/conv: enable conversion min code. (v2) |
| - gallivm/sample: fix texel type for stencil 8-bit |
| - llvmpipe/setup: add planes for draw regions if no scissor. |
| - gallivm/cache: don't require a null terminator for cache data. |
| - mesa/gles3: add support for GL_EXT_shader_group_vote |
| - virgl: change vendor id to reflect reality more. |
| - llvmpipe: change vendor to be more generic. |
| - softpipe: change vendor name to something more generic. |
| - gallivm/nir: fix const loading on big endian systems |
| - glsl: fix constant packing for 64-bit big endian. |
| - gallivm/nir: fix big-endian 64-bit splitting/merging. |
| - llvmpipe: fix occlusion queries on big-endian. |
| - mesa/get: fix enum16 big-endian getting. |
| - draw/llvm: fix big-endian mask adjusting |
| - draw: pass nr_samplers into llvm sample state creation. |
| - llvmpipe: pass number of samplers into llvm sampler code. |
| - gallivm/sample: change texture function generator api |
| - gallivm: add indirect texture switch statement builder. |
| - draw: add support for indirect texture access |
| - llvmpipe: add support for indirect texture access. |
| - gallivm/nir: add texture unit indexing |
| - gallivm/nir: handle non-uniform texture offsets |
| - gallivm/sample: pass indirect offset into texture/image units |
| - llvmpipe/draw: wire up indirect offset |
| - gallivm/sample: handle size unit offset |
| - llvmpipe: enable ARB_gpu_shader5 |
| - draw: pass number of images to image soa create |
| - llvmpipe: pass number of images into image soa create |
| - gallivm/nir: support passing image index into image code. |
| - gallivm/nir: refactor image operations for indirect support. |
| - gallivm/img: refactor out the texel return type (v2) |
| - gallivm/nir: add support for indirect image loading |
| - draw/sample: add support for indirect images |
| - llvmpipe: handle indirect images properly |
| - ci: fixup tests after all indirect images fixes. |
| - docs: update llvmpipe GL 4.0 status |
| - draw/clip: cleanup viewport index handling code. |
| - draw/clip: fix viewport index for geometry shaders |
| - mesa/version: only enable GL4.1 with correct limits. |
| - llvmpipe: bump texture/scene limits to enable GL 4.1 |
| - llvmpipe: bump GL support to GL 4.1 |
| - llvmpipe: enable GL 4.2 |
| - gallivm/nir: call end prim at end on all GS streams. |
| - draw: emit so primitives before ending empty pipeline. |
| - draw/gs: fix up current verts in output fetching. |
| - gallivm/draw/gs: pass vertex stream count into shader build |
| - draw/gs: only allocate memory for streams needed. |
| - gallivm/gs_iface: pass stream into end primitive interface. |
| - gallivm/nir: don't access stream var outside bounds |
| - gallivm/nir: end primitive for all streams. |
| - draw: account primitive lengths for all streams. |
| - draw/gs: reverse the polarity of the invocation/prims execution |
| - draw: use common exit path in pipeline finish. |
| - draw: free vertex info from geometry streams. |
| - draw/gs: use mask to limit vertex emission. |
| - ci/virgl: update results after streams fixes. |
| - llvmpipe: add ARB_post_depth_coverage support. |
| - llvmpipe: denote NEW fs when images change. |
| - llvmpipe: flush resources on sampler view binding |
| - llvmpipe/cs: fix image/sampler binding for compute |
| - nouveau: avoid LTO ODR warning (v2) |
| - gallivm/sample: always square rho before fast log2 |
| - llvmpipe/format: fix snorm conversion |
| - mesa: change dsa texture error codes for GL 4.6 |
| - ci: bump piglit checkout for dsa tests |
| - llvmpipe: fix stencil only formats. |
| - llvmpipe: fix position offset interpolation |
| - llvmpipe/cs: respect render condition |
| - llvmpipe: add framebuffer fetching support (v1.1) |
| - ci/llvmpipe: reenable gpu shader5 tests |
| - llvmpipe: enable EXT_texture_shadow_lod |
| - llvmpipe/draw: handle constant buffer limits and robustness (v1.1) |
| - drisw: add robustness extension support. |
| - glx/drisw: add robustness support |
| - llvmpipe: add device reset query context hook. |
| - llvmpipe: enable robust buffer access + GL 4.3, GLES 3.2 and robust buffer access behaviour |
| - llvmpipe/ms: fix sign extension bug in rasterizer. |
| - Revert "llvmpipe: Use the default behavior of ALLOW_MAPPED_BUFFERS." |
| - radv: cleanup locking around timeline waiting. |
| - llvmpipe: only read 0 for channels being read |
| - llvmpipe/blit: for 32-bit unorm depth blits just copy 32-bit |
| - llvmpipe: enable GL 4.5 |
| - llvmpipe/cs: update compute counters not fragment shader. |
| - llvmpipe: include gallivm perf flags in shader cache. |
| - gallivm: disable brilinear for lod bias and explicit lod. |
| |
| David McFarland (1): |
| |
| - radv: link with ld_args_build_id |
| |
| David Stevens (2): |
| |
| - nir: Add colorspace support to YUV lowering pass |
| - i965/i915: Add colorspace support to YUV sampling |
| |
| Denys (1): |
| |
| - gitlab: Ask about reproduction rate in the issue template |
| |
| Dmitriy Nester (8): |
| |
| - mesa: check draw buffer completeness on glClearBufferfv/glClearBufferuiv |
| - nir: replace fnv1a hash function with xxhash |
| - freedreno: replace fnv1a hash function with xxhash |
| - i965: replace fnv1a hash function with xxhash |
| - util/hash_table: replace fnv1a hash function with xxhash |
| - r600: replace fnv1a hash function with xxhash |
| - zink: replace fnv1a hash function with xxhash |
| - util: delete fnv1a hash function |
| |
| Duncan Hopkins (1): |
| |
| - zink. Changed sampler default name. |
| |
| Dylan Baker (41): |
| |
| - docs: Add release notes for 20.0.6 |
| - docs: Add SHA256 sums for 20.0.6 |
| - docs: update calendar, add news item, and link releases notes for 20.0.6 |
| - docs: Add release notes for 20.0.7 |
| - docs/relnotes Add sha256 sums to 20.0.7 |
| - docs: update calendar, add news item, and link releases notes for 20.0.7 |
| - tests: Make tests aware of meson test wrapper |
| - meson: Bump required version to 0.52.0 |
| - meson: Use the check_header function |
| - meson: Use build_always_stale instead of build_always |
| - meson: Use builtins for checking gnu __attributes__ |
| - drm-shim/meson: The name of the target is a string not a list |
| - drm-shim/meson: Use portable override_options for setting C standard |
| - meson: use gnu_symbol_visibility argument |
| - meson: use 2 space not 3 space indent |
| - meson: deprecated 'true' and 'false' in combo options for 'enabled' and 'disabled' |
| - vulkan-overlay/meson: use install_data instead of configure_file |
| - docs: Add release notes for 20.0.8 |
| - docs: Add sha256sums for 20.0.8 |
| - docs: update calendar, add news item, and link releases notes for 20.0.8 |
| - mesa/swrast: use logf2 instead of util_fast_log2 |
| - VERSION: bump for 20.2.0-rc1 |
| - .pick_status.json: Update to 9333a8570d2174b73da63c3ee6f1a740ae487ab8 |
| - .pick_status.json: Update to 1e28745bc0d3528c1dfc25459456849feb58d407 |
| - meson/freedreno: Fix lua requirement |
| - .pick_status.json: Update to fdb97d3d2914c8f887a7968432db4fdbd35d8376 |
| - bump version for 20.2.0-rc2 |
| - .pick_status.json: Update to 61042b1bdb199f98dd34085ed29a8c492ed9b2a3 |
| - .pick_status.json: Update to 6d28270968e0728bf8bdf48a6abd261c50d9ef07 |
| - .pick_status.json: Update to ca7d66e847d08914cec0a5e003b400da9c0a2695 |
| - VERSION: bump for 20.2.0-rc3 |
| - .pick_status.json: Update to 7fbded8b5821a47c26245b181446f972f920a96e |
| - .pick_status.json: Mark e93979ba599355c42df01a89073362b970489a3a as denominated |
| - .pick_status.json: Update to b9927c8c8d0c105699306a68773c015930ff9509 |
| - VERSION: bump for 20.2.0-rc4 |
| - .pick_status.json: Update to ef980ac0c1cd65993ba0c1d20e1c09b45bfef99d |
| - fix: gallivm: disable brilinear for lod bias and explicit lod. |
| - .pick_status.json: Update to a1f46d7b6943699e5efb60fbcfdd1450db85adb1 |
| - amd/ac_surface: convert tabs to 3 spaces |
| - .pick_status.json: Update to 90b98c06493f8a9759e5496d5ec91fb60edf7b92 |
| - .pick_status.json: Update to 472a20c5fc0feda0f074b4ff95fd7c7a6305c8cd |
| |
| Eduardo Lima Mitev (2): |
| |
| - freedreno: Centralize UUID generation into new files freedreno_uuid.c/h |
| - freedreno/uuid: Generate meaningful device and driver UUID |
| |
| Elie Tournier (12): |
| |
| - virgl: implement ARB_clear_texture |
| - virgl: Enable CAP_CLEAR_TEXTURE if host supports it |
| - docs/features: Add ARB_clear_texture to virgl |
| - gallium: add TGSI_PROPERTY_FS_BLEND_EQUATION_ADVANCED |
| - glsl_to_tgsi: Set TGSI_PROPERTY_FS_BLEND_EQUATION_ADVANCED |
| - virgl: Reserved last caps of capability_bits |
| - gallium: Add PIPE_CAP_BLEND_EQUATION_ADVANCED |
| - st: expose KHR_blend_equation_advanced if PIPE_CAP_BLEND_EQUATION_ADVANCED |
| - glsl_to_ir: do lower_blend_equation if PIPE_CAP_FBFETCH |
| - virgl: Use alpha_src_factor to store blend_equation_advanced value |
| - virgl: Encode barrier for blend_equation_advanced |
| - virgl: set PIPE_CAP_BLEND_EQUATION_ADVANCED |
| |
| Emmanuel (3): |
| |
| - meson: Do not enable USE_ELF_TLS for FreeBSD |
| - iris: Explicitly cast value to uint64_t |
| - i965: Explicitly cast value to uint64_t |
| |
| Emmanuel Gil Peyrot (2): |
| |
| - util/rand_xor: use getrandom() when available |
| - Expose EGL_KHR_platform_* when EXT is supported |
| |
| Emmanuel Vadot (1): |
| |
| - meson: Add versioning for xvmc tracker |
| |
| Eric Anholt (228): |
| |
| - freedreno/ir3: Initialize the unused dwords of the immediates consts. |
| - freedreno/ir3: Drop redundant IR3_REG_HALF setup in ALU ops. |
| - freedreno/ir3: Leave bools as 1-bit, storing them in full regs. |
| - freedreno/ir3: Set up the block predecessors for a3xx TF |
| - freedreno/ir3: Fix the a3xx TF outputs stores. |
| - freedreno/ir3: Fix register allocation assertion failures. |
| - freedreno: Stop doing binning shaders other than the VS in shader-db. |
| - freedreno/ir3: Skip tess epilogue if the program is missing stores. |
| - freedreno: Fix assertion failures on GS/tess shaders with shader-db enabled. |
| - freedreno/ir3: Remove unused half precision shader key flag. |
| - freedreno: Emit debug messages when doing draw-time recompiles of shaders. |
| - freedreno/ir3: Improve shader key normalization. |
| - freedreno/ir3: Stop initializing regid of so->outputs during setup. |
| - freedreno/ir3: Set up outputs for multi-slot varyings. |
| - freedreno: Immediately compile a default variant of shaders. |
| - freedreno/ir3: Set the FS .msaa flag to true during precompiles. |
| - freedreno/ir3: Add some more tests of cat6 disasm. |
| - freedreno/ir3: Sync some new changes from envytools. |
| - freedreno/ir3: Define the bindful uniform/nonuniform desc modes for cat6 a6xx. |
| - freedreno/ir3: Disable sin/cos range reduction for mediump. |
| - ci: Clean up setup of the job-specific env vars in baremetal testing. |
| - ci: Enable IRC flake reporting on freedreno baremetal boards. |
| - ci: Improve the flakes reports on IRC. |
| - ci: Fix the nick used in IRC reporting. |
| - freedreno: Deduplicate ringbuffer macros with computerator/fdperf |
| - freedreno: Clean up tests around ORing in the reloc flags. |
| - freedreno: Rename append_bo() in case it doesn't get inlined. |
| - freedreno: Initialize the bo's iova at creation time. |
| - freedreno: Start moving relocs flags into the BOs. |
| - freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it. |
| - freedreno: Mark all ringbuffer BOs as to be dumped on crash. |
| - freedreno: Tell the kernel that all BOs are for writing. |
| - freedreno: Replace OUT_RELOCW with OUT_RELOC. |
| - freedreno: Drop the "write" arg to emit_const_bo now relocs don't care. |
| - nir: Fix count when we didn't lower load_uniforms but did shift load_ubos. |
| - freedreno: Fix non-constbuf-upload UBO block indices and count. |
| - freedreno: Add a nohw flag to skip submitting to the kernel. |
| - freedreno: Split the fd_batch_resource_used by read vs write. |
| - freedreno: Add an early out for preparing to read a resource. |
| - freedreno: Move the resource_read early out to an inline. |
| - freedreno: Skip taking the lock for resource usage if it's already flagged. |
| - freedreno/a4xx+: Increase max texture size to 16384. |
| - freedreno/a6xx: Improve layout testcase logging for UBWC fails. |
| - freedreno/a6xx: Add a testcase for UBWC buffer sharing. |
| - freedreno: Pull the tile_alignment lookup for a layout to a helper. |
| - freedreno/a6xx: Fix UBWC blockheight for RG8. |
| - freedreno/a6xx: Fix UBWC mipmap sizing. |
| - freedreno/a6xx: Fix UBWC mipmapping height alignment. |
| - nir: Include num_ubos in the printed shader (if nonzero). |
| - freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa). |
| - freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift. |
| - freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges. |
| - freedreno: Trim num_ubos to just the ones we haven't lowered to constbuf. |
| - freedreno/a6xx: Use LDC for UBO loads. |
| - freedreno: Drop the noubo fails list for CI, since there aren't any now. |
| - freedreno: Fix attempts to push UBO contents past the constlen on pre-a6xx. |
| - freedreno: Fix resource layout dump loop. |
| - freedreno: Avoid duplicate BO relocs in FD_RINGBUFFER_OBJECTs. |
| - ci: Move cross file generation to a shared script. |
| - ci: Autodetect whether we need cross setup in lava_arm builds. |
| - ci: Make cmake toolchain file for deqp cross build setup. |
| - ci: Make the create-rootfs more resilient. |
| - ci: Update versions of packages to remove from rootfses. |
| - ci: Switch the baremetal runner to be an x86 docker image. |
| - ci: Disable SMP on the a5xx boards. |
| - ci: Make a530's GLES3/31 fractional runs much more complete. |
| - freedreno/a5xx: Move resource layout to fdl. |
| - freedreno/fdl: Separate the list of a6xx testcases from the test code. |
| - freedreno/a5xx: Add the outline of a unit test for a5xx layout. |
| - freedreno/a5xx: Set MIN_LAYERSZ on 3D textures like we do on a6xx. |
| - freedreno/a5xx: Define the 2D blit UBWC pitch fields |
| - ci: Fix DEQP_CASELIST_FILTER (used by a630 noubo run) |
| - ci: Do an explicit NIR validation-enabled pass on freedreno a630. |
| - ci: Don't forget to set NIR_VALIDATE in baremetal runs. |
| - ci: Enable a fractional run with UBO-to-constbuf disabled on a3xx. |
| - ci: Improve baremetal's logging of the job env var passthrough. |
| - freedreno/a6xx: Fix the size of buffer image views. |
| - freedreno: Fix printing of unused src in disasm of cat6 RESINFO. |
| - freedreno: Add more resinfo/ldgb testcases. |
| - freedreno: Fix resinfo asm, which doesn't have srcs besides IBO number. |
| - freedreno: Set the immediate flag in a4/a5xx resinfos. |
| - freedreno/ir3: Refactor out IBO source references. |
| - freedreno/ir3: Move handle_bindless_cat6 to compiler_nir and reuse. |
| - freedreno/ir3: Use RESINFO for a6xx image size queries. |
| - ci: Drop double ".txt" suffix on the unexpected results file. |
| - ci: Drop old comment about enabling --deqp-watchdog. |
| - ci: Auto-detect the architecture for VK ICD filenames. |
| - ci: Add DEQP_EXPECTED_RENDERER support for VK tests. |
| - ci: Move baremetal DEQP_NO_SAVE_RESULTS setup to the yml. |
| - ci: Quick exit qpa extraction for non-matching qpas. |
| - ci: Disable the firmware loader user helper option in arm64 kernels. |
| - ci: Build a cheza kernel. |
| - ci: Add scripts for controlling bare-metal chezas. |
| - ci: Switch cheza (freedreno a630) testing to baremetal. |
| - ci: Don't build an arm_test container now that the last user is gone. |
| - ci: Rename x86_cross_arm_test to just arm_test. |
| - turnip: Move vertex buffer bindings to SET_DRAW_STATE. |
| - turnip: Don't bother clamping VB size. |
| - turnip: Simplify vertex buffer bindings. |
| - turnip: Use tu_cs_emit_regs() for BLEND_CONTROL. |
| - turnip: Add support for alphaToOne. |
| - freedreno/a6xx: Add support for ALPHA_TO_ONE. |
| - freedreno: Upload gallium constbufs as needed when referenced as a UBO. |
| - freedreno/ir3: Refactor ir3_cp's lower_immed(). |
| - freedreno/ir3: Stop pushing immediates once we've filled the constbuf. |
| - freedreno/ir3: Drop unnecessary alignment of pushed UBO size. |
| - freedreno/ir3: Stop shifting UBO 1 down to be UBO 0. |
| - freedreno/ir3: Account for driver params in UBO max const upload. |
| - freedreno/ir3: Drop the max_const on a6xx to 512. |
| - freedreno/ir3: Handle cases where we decide not to lower UBO 0 loads. |
| - turnip: Fix crashes in compute with no descriptors to load. |
| - ci: Bump up to the current version of the VK CTS. |
| - ci: Disable shader cache on vulkan CI runs. |
| - ci: Build the full VK CTS for baremetal testing. |
| - ci: Enable pre-merge fractional vulkan CTS runs on the turnip driver. |
| - ci: Use rsync for initial nfsroot population on cheza. |
| - turnip: Expose robustBufferAccess. |
| - freedreno/a6xx: Fix clip_halfz support. |
| - ci: Leave a note as to what might be going on with a test. |
| - ci: Fix weird filesystem globs appearing in failed test .qpa files. |
| - ci: Disable some flaky tests on turnip. |
| - ci/bare-metal: Reword the final output of the init script on the board. |
| - ci/bare-metal: Make which test to run configurable. |
| - ci/bare-metal: Use the deqp-runner bits straight out of the artifacts. |
| - ci/bare-metal: Stop fetching the git tree. |
| - ci/bare-metal: Terminate the job with an error on kernel panic. |
| - docs: Replace ancient swrast conformance docs with more current information. |
| - docs: Add dri-devel to the mailing lists and drop the DRI wiki link. |
| - ci: disable the windows tests until the runner can be stabilized again |
| - ci: Bump vulkan CTS to 1.2.3.0. |
| - ci: Enable NIR validation on a630 GLES2 and VK tests. |
| - ci/bare-metal: Skip setting of unset variables at startup. |
| - ci/bare-metal: Don't include dev packages in arm*test. |
| - ci/tracie: Print the path if the trace isn't found. |
| - ci/tracie: Fix apitrace dump using "less" which isn't in the ARM rootfs. |
| - ci: Add a freedreno a630 tracie run. |
| - freedreno/a6xx: Define the register fields for polygon fill mode. |
| - turnip: Add support for polygon fill modes. |
| - freedreno/a6xx: Add support for polygon fill mode (as long as front==back). |
| - ci: Remove a stray "always" on the freedreno traces job. |
| - ci/bare-metal: Fail early when we get stuck powering on a cheza. |
| - ci/baremetal: Bump the kernel to a recent drm-msm-fixes for msm semaphores. |
| - turnip: Do better TU_DEBUG=startup logging of drmGetDevices2() failure. |
| - turnip: Fix error handling of DRM_MSM_GEM_INFO ioctls. |
| - turnip: Properly return VK_DEVICE_LOST on queuesubmit failures. |
| - gallium/util: Add a helper function for point sprite handling. |
| - vc4: Enable PIPE_CAP_TGSI_TEXCOORD. |
| - v3d: Enable PIPE_CAP_TGSI_TEXCOORD. |
| - v3d: Fix -Wmaybe-uninitialized compiler warning in the v33 code. |
| - ci: Disable pixmark-piano trace on a630 due to GPU hangs. |
| - util: Avoid strict aliasing bugs in xxhash. |
| - util: Mark util_format_description() as a const function. |
| - softpipe: Clean up softpipe's SSBO load/store interpreting instructions. |
| - util: Remove unused util_format_planar_is_supported(). |
| - etnaviv: Use the util_pack_color_union() helper. |
| - gallium/util: Fix location of the comment about S8_UINT handling. |
| - gallium/util: Clean up the Z/S tile write path. |
| - gallium/util: Move the Z/S handling to the outside of get_tile(). |
| - svga: Reuse util_format_unpack_rgba(). |
| - util: Merge util_format_write_4* functions. |
| - util: Merge util_format_read_4* functions. |
| - util: Use designated initializers to clean up the format tables' pack/unpack. |
| - llvmpipe: Generalize "could llvmpipe fetch this format" check in unit testing. |
| - util: Remove the stub pack/unpack functions for YUV formats. |
| - util: Share a single function pointer for the 4-byte rgba unpack function. |
| - docs: Move the current CI .rst doc to docs/ci/ and link to it from .gitlab-ci. |
| - docs: Move the conformance and the CI docs to a top level Testing section. |
| - docs: Move the gitlab-ci docs to RST. |
| - docs: Relax the expectations of HW CI farms. |
| - docs: Document how to interact with docker containers. |
| - freedreno/ir3_cmdline: Fix an uninit var warning. |
| - freedreno/ir3: Fix uninit var warning. |
| - intel: Fix release-build warnings about sf_entry_size. |
| - intel/perf: Fix unused var warning in release builds. |
| - intel/perf: Move perf query register programming to static tables. |
| - freedreno/a2xx: Fix compiler warning in disasm. |
| - meson: Enable GCing of functions and data from compilation units by default. |
| - freedreno/ir3: Fix duplicated fine derivatives instructions. |
| - freedreno/ir3: Add unit tests for derivatives disasm. |
| - ci: Use FDO_CI_CONCURRENT as our -j flags when present in the runner env. |
| - freedreno/ir3: Add a note about the instructions in the disasm test. |
| - freedreno/ir3: Add a bunch more tests for cat6 opcodes. |
| - freedreno/ir3: Refactor cat6 general dst printing. |
| - freedreno/ir3: Fix disasm of register offsets in ldp/stp. |
| - freedreno/ir3: Add missing ld_args_build_id to the ir3_delay unit test. |
| - ci: Set XDG_CACHE_HOME to tmpfs for bare-metal runners to avoid NFS. |
| - ci: Update checksums for freedreno traces. |
| - llvmpipe: Remove a bunch of default handling of pipe caps. |
| - llvmpipe: Use the default behavior of ALLOW_MAPPED_BUFFERS. |
| - softpipe: Remove a bunch of default handling of pipe caps. |
| - softpipe: Use the default behavior of ALLOW_MAPPED_BUFFERS. |
| - virgl: Remove a bunch of default handling of pipe caps. |
| - swr: Remove a bunch of default handling of pipe caps. |
| - swr: Use the default behavior of ALLOW_MAPPED_BUFFERS. |
| - svga: Remove a bunch of default handling of pipe caps. |
| - i915: Remove a bunch of default handling of pipe caps. |
| - softpipe: Refactor pipe_shader_state setup. |
| - softpipe: Convert to comma-separated SOFTPIPE_DEBUG for debug options. |
| - softpipe: Add support for reporting shader-db output. |
| - softpipe: Enable PIPE_CAP_TGSI_TEXCOORD. |
| - softpipe: Enable PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS; |
| - ci/bare-metal: Capture the first devcoredump a job produces. |
| - drm-shim: Return -EINVAL instead of abort()ing on unknown ioctls. |
| - docs: Explain how to set up a personal gitlab runner. |
| - nir: Add a pass to cut the trailing ends of vectors. |
| - i965: Enable vector shrinking in the vec4 backend. |
| - amd: Swap from nir_opt_shrink_load() to nir_opt_shrink_vectors(). |
| - nir: Remove the old nir_opt_shrink_load. |
| - freedreno: Fix "Offset of packed bitfield changed" warnings: |
| - nir/lower_amul: Use num_ubos/ssbos instead of recomputing it. |
| - nir: Add a little more docs about NIR's constant_data. |
| - nir: Print the constant data size associated with a shader. |
| - freedreno/ir3: Fix the type of half-float indirect uniform loads. |
| - freedreno/a6xx: Document the bit for the magic 32bit-uniforms-as-16b mode. |
| - freedreno/computerator: Set SP_MODE_CONTROL to the same value as vulkan/GL |
| - freedreno/ir3: Merge the redundant immediate_idx/immediates_count fields |
| - freedreno/ir3: Simplify the immediates from an array of vec4 to array of dwords. |
| - freedreno: Rename emit_const_bo() to emit_const_ptrs(). |
| - freedreno: Split ir3_const's user buffer and indirect upload APIs. |
| - freedreno/ir3: Clean up instrlen setup. |
| - freedreno: Increase the NUM_UNIT on compute's consts in indirect dispatch. |
| - freedreno: Add more asserts for DST_OFF/NUM_UNIT in indirect const uploads. |
| - freedreno/ir3: Fix assertion failures dumping CS high full regs. |
| - turnip: Make sure we include the build id. |
| - gallium/tgsi_exec: Fix up NumOutputs counting |
| - freedreno: Make the pack struct have a .qword for wide addresses. |
| - turnip: Fix truncation of CS shader iovas to 32 bits. |
| - turnip: Fix truncation of iovas to 32 bits in queries. |
| |
| Eric Engestrom (146): |
| |
| - cut 20.1 branch |
| - docs: update calendar for 20.1.0-rc2 |
| - post_version.py: fix branch name construction for release candidates |
| - post_version.py: invert `is_point` into `is_first_release` to make its purpose clearer |
| - post_version.py: stop adding release candidates to the index and relnotes |
| - docs: update calendar for 20.1.0-rc3 |
| - gitlab-ci: exclude scripts that don't affect the build |
| - util/rand_xor: make it clear that {,s_}rand_xorshift128plus take *exactly 2* uint64_t |
| - util/rand_xor: drop unused header |
| - util/rand_xor: fallback Linux to time-based instead of fixed seed |
| - util/rand_xor: extend the urandom path to all non-Windows platforms |
| - docs: update calendar for 20.1.0-rc4 |
| - anv: pass the fd directly to anv_gem_reg_read() |
| - anv: replace magic `| 1` with already #define'd name |
| - anv: disable VK_EXT_calibrated_timestamps when the timestamp register is unreadable |
| - git_sha1_gen.py: fix out-of-date comment |
| - git_sha1_gen.py: fix code style |
| - git_sha1_gen.py: fix whitespace |
| - compiler: delete leftover autotools test wrapper |
| - no_extern_c.h: fix typo in comment |
| - tree-wide: fix deprecated GitLab URLs |
| - docs: drop no-longer-relevant comment about bugzilla |
| - docs: Add release notes for 20.1.0 |
| - docs: update calendar, add news item, and link releases notes for 20.1.0 |
| - meson: remove "empty array"/"array of an empty string" confusion |
| - glapi: remove deprecated .getchildren() that has been replaced with an iterator |
| - intel/genxml: drop sort_xml.sh and move the loop directly in gen_sort_tags.py |
| - intel: fix gen_sort_tags.py |
| - docs: Add release notes for 20.1.1 |
| - docs: update calendar, add news item, and link releases notes for 20.1.1 |
| - v3d: add missing unlock() in error path |
| - intel/genxml: drop python 2 support for gen_sort_tags.py |
| - intel/genxml: replace gen_sort_tags.py MIT licence with SPDX equivalent |
| - docs: update the blocks of unused EGL enums assigned to us |
| - i965: drop dead #include "config.h" |
| - iris: drop dead #include "config.h" |
| - gen_release_notes.py: update script to the new rST way of things |
| - post_version.py: update script to the new rST way of things |
| - intel/tools: rewrite run-test.sh in python |
| - intel/tools: make test aware of the meson test wrapper |
| - khronos-update.py: add script to simplify update of Khronos headers & xml files |
| - docs: remove plain-text copy of versions.rst |
| - util/os_file: replace broken windows-detection code with detect_os.h |
| - util: introduce os_dupfd_cloexec() helper |
| - replace all F_DUPFD_CLOEXEC with os_dupfd_cloexec() |
| - vulkan/wsi: replace all dup() with os_dupfd_cloexec() |
| - radv: replace all dup() with os_dupfd_cloexec() |
| - anv: replace all dup() with os_dupfd_cloexec() |
| - iris: replace all dup() with os_dupfd_cloexec() |
| - i965: replace all dup() with os_dupfd_cloexec() |
| - egl: replace all dup() with os_dupfd_cloexec() |
| - etnaviv: replace all dup() with os_dupfd_cloexec() |
| - freedreno: replace all dup() with os_dupfd_cloexec() |
| - svga: replace all dup() with os_dupfd_cloexec() |
| - virgl: replace all dup() with os_dupfd_cloexec() |
| - docs: publish our release maintainers' keys |
| - docs: remind release maintainers to sign the tarballs and publish their key |
| - docs: suggest alternative installation methods for meson |
| - docs: stop considering `Cc: mesa-stable` as an email address |
| - docs: reword "sending a patch revision" to "updating a merge request" |
| - docs: drop `git sendemail` instructions |
| - docs: prefer `Fixes:` over `Cc: mesa-stable` |
| - docs: add some formatting to the "backport merge request" option |
| - docs: reword a sentence a bit |
| - docs: make it clear that the tags needs to be in the commit message |
| - docs: move `Fixes:` tag explanation to its own section |
| - docs: move "stable" tag explanation next to `Fixes:` |
| - driconf: drop 28% catalan translation |
| - driconf: drop 15% german translation |
| - driconf: drop 26% spanish translation |
| - driconf: drop 6% french translation |
| - driconf: drop 8% dutch translation |
| - driconf: drop 9% swedish translation |
| - driconf: drop now unused translation facility |
| - util: rename xmlpool.h to driconf.h |
| - gitlab-ci: drop gettext from the build images |
| - docs: drop deleted file from extra sphinx files |
| - docs: cat maintainer keys to a single file |
| - docs: add some padding to the release calendar |
| - docs: add planning for 20.2 |
| - bin/symbols-check: explain C++ symbols workaround |
| - docs: Add release notes for 20.1.2 |
| - docs: update calendar and link releases notes for 20.1.2 |
| - docs: fix 20.1.2 relnotes |
| - docs: add a page explaining the GitLab CI and the Intel CI |
| - mesa/glformats: make _mesa_gles_error_check_format_and_type() more consistent |
| - docs: add release notes for 20.1.3 |
| - docs: update calendar and link releases notes for 20.1.3 |
| - docs: fix a bunch of typos |
| - egl: always compile surfaceless |
| - vulkan: automatically compile the `display` platform when available |
| - meson: move xlib-lease block further down |
| - egl: automatically compile the `drm` platform when available |
| - introduce `commit_in_branch.py` script to help devs figure this out |
| - bin/gen_release_notes.py: drop new_features.txt when we release XX.Y.0 |
| - egl/wayland: add missing newline between functions |
| - glx: drop always-true #ifdef |
| - docs/submittingpatches: add more than one `Cc: mesa-stable` example to the examples list |
| - meson/intel: add missing dep on git_sha1.h |
| - meson: fix android vulkan build |
| - egl: inline fallback for create_pixmap_surface |
| - egl: inline fallback for create_pbuffer_surface |
| - egl: drop unused fallback function |
| - egl: inline fallback for swap_buffers_with_damage |
| - egl: inline fallback for swap_buffers_region |
| - egl: inline fallback for post_sub_buffer |
| - egl: inline fallback for copy_buffers |
| - egl: inline fallback for query_buffer_age |
| - egl: inline fallback for create_wayland_buffer_from_image |
| - egl: inline fallback for get_sync_values |
| - egl: drop now empty egl_dri2_fallbacks.h |
| - egl: mark the rest of the callbacks as mandatory or optional |
| - egl: inline _EGLAPI into _EGLDriver |
| - docs: add release notes for 20.1.4 |
| - docs: update calendar and link releases notes for 20.1.4 |
| - post_version.py: don't generate relnotes twice |
| - post_version.py: drop incorrect conf.py changes |
| - post_version.py: stop using non-existent functions and fix commit message |
| - post_version.py: update the files in the current worktree, not the one with the script that we run |
| - post_version.py: fix relnotes links |
| - bin/gen_release_notes: automatically commit release notes |
| - docs/releasing: improve wording |
| - bin/khronos-update: having a folder in include/ is not a requirement |
| - bin/khronos-update: add support for the SPIRV files |
| - bin/khronos-update: add workaround for python bug 9625 |
| - egl: replace _eglInitDriver() with a simple variable |
| - egl: drop unnecessary _eglGetDriver() |
| - egl: fix _eglMatchDriver() return type |
| - egl: inline _eglMatchAndInitialize() and refactor _eglMatchDriver() |
| - egl: rename _eglMatchDriver() to _eglInitializeDisplay() |
| - egl: drop left-over function prototype |
| - egl: const _eglDriver |
| - egl/haiku: drop overwritten preset of EGL version |
| - egl: consistently use dri2_egl_display() helper macro |
| - meson: fix `-D xlib-lease=auto` detection |
| - docs: add release notes for 20.1.5 |
| - docs: update calendar and link releases notes for 20.1.5 |
| - pick-ui: specify git commands in "resolve cherry pick" message |
| - egl/entrypoint-check: split sort-check into a function |
| - egl/entrypoint-check: add check that GLVND and plain EGL have the same entrypoints |
| - driconf: fix force_gl_vendor description |
| - meson: bump required glvnd version |
| - egl/x11_dri3: enable & require xfixes 2.0 |
| - egl/x11_dri3: implement EGL_KHR_swap_buffers_with_damage |
| - meson: don't advertise TLS support if glx wasn't built with it |
| - meson: drop leftover PTHREAD_SETAFFINITY_IN_NP_HEADER |
| |
| Erico Nunes (16): |
| |
| - lima/ppir: introduce liveness internal live set |
| - lima/ppir: fix lod bias register codegen |
| - lima/ppir: do not assume single src for pipeline outputs |
| - lima/ppir: combine varying loads in node_to_instr |
| - lima/ppir: duplicate intrinsics in nir |
| - lima/ppir: duplicate consts in nir |
| - lima/ppir: remove unused clone functions |
| - lima/ppir: rework emit nir to ppir |
| - lima/ppir: rework store output |
| - lima/ppir: add fallback mov option for const scheduler |
| - lima/ppir: rework select conditions |
| - lima/ppir: handle failures on all ppir_emit_cf_list paths |
| - lima/ppir: improve handling for successors in other blocks |
| - lima/ppir: rework tex lowering |
| - lima/ppir: optimize tex loads with single successor |
| - lima/ppir: use a ready list in node_to_instr |
| |
| Erik Faye-Lund (124): |
| |
| - compiler/nir: move tan-calculation to helper |
| - vtn/opencl: add native_tan-support |
| - vtn/opencl: native variants of sin/cos |
| - vtn/opencl: native divide support |
| - vtn/opencl: native powr support |
| - vtn/opencl: native recip support |
| - vtn/opencl: native rsqrt support |
| - vtn/opencl: native sqrt support |
| - compiler/glsl: explicitly store NumUniformBlocks |
| - mesa/st: consider NumUniformBlocks instead of num_ubos when binding |
| - zink: use nir_lower_uniforms_to_ubo |
| - zink: lower b2b to b2i |
| - util/os_memory: never use os_memory_debug.h |
| - st/wgl: pass st_context_iface into stw_st_framebuffer_present_locked |
| - st/wgl: allocate and resolve msaa-textures |
| - docs/features: add zink features |
| - zink: load vk_GetMemoryFdKHR while creating screen |
| - zink: add a GET_PROC_ADDR macro to simplify load_device_extensions |
| - docs/features: mark GL_NV_conditional_render as done for zink |
| - zink: disable vkCmdResolveImage when respecting render-condition |
| - zink: do not expose real value for PIPE_CAP_MAX_VIEWPORTS |
| - zink: correct PIPE_SHADER_CAP_MAX_SHADER_IMAGES |
| - zink: mark depth-component cube-maps as done |
| - zink: implement i2b1 |
| - docs: fix broken release-calendar |
| - zink: hammer in an explicit wait when retrieving buffer contents for reading |
| - zink: use samples from state |
| - zink: do not dig into resource for nr_samples |
| - zink: pass batch instead of context for queries |
| - zink: implement nir_texop_txf_ms |
| - zink: expose PIPE_CAP_TEXTURE_MULTISAMPLE |
| - docs/features: mark GL_ARB_texture_multisample as done for zink |
| - zink: use general-layout when blitting to/from same resource |
| - zink: Use store_dest_raw instead of storing an uint |
| - nir: reuse existing psiz-variable |
| - zink: emulate B8G8R8X8_SRGB with B8G8R8A8_SRGB |
| - zink: assert that image-view format isn't undefined |
| - zink: only report device-local memory as video-memory |
| - gallium/hud: do not specify potentially invalid depth-range |
| - TEMP: add rst-conversion scripts |
| - docs: convert articles to restructuredtext |
| - TEMP: remove rst-conversion scripts |
| - docs: delete no longer needed file |
| - docs: fixup botched table |
| - docs: escape double colons |
| - docs: escape asterisks |
| - docs: escape trailing underscores properly |
| - docs: fixup broken rst |
| - docs: fixup heading-levels |
| - docs: use sphinx |
| - docs: disable syntax-highlighting by default |
| - docs: use code-block with caption instead of table |
| - docs: format notes as rst-notes |
| - docs: use code-blocks |
| - docs: drop open-coded toc for articles |
| - docs: add xlibdriver to table-of-contents |
| - docs: do not copy source-files to site |
| - docs: use rst footnotes instead of manual ones |
| - docs: reformat license table as rst table |
| - docs: use rst-note for highlighted text |
| - docs: bundle extra files |
| - docs: include specs into the generated docs |
| - gitlab-ci: build and deploy docs |
| - docs: drop news in favour of the introduction as index-page |
| - README: update references to internal docs |
| - docs: update internal references |
| - docs/relnotes: update internal references |
| - radv: update internal reference |
| - bin/perf-annotate-jit.py: update internal reference |
| - docs/release-calendar: restore missing id |
| - nir: do not try to merge xfb-outputs |
| - Revert "gallium/hud: don't use user vertex buffers" |
| - gallium/hud: don't use user vertex buffers |
| - zink: enable cull-distance if supported |
| - zink: expose GLSL 1.30 |
| - docs: update internal references |
| - docs/relnotes: update internal references |
| - docs: fixup relnotes after rst-conversion |
| - docs/features: mark GL3 as complete for zink |
| - docs/features: update ARB_texture_buffer_object line |
| - docs/features: remove driver-list for forward-compatible context |
| - mesa/main: fix inverted condition |
| - gallium/os: call "ANSI" version of GetCommandLine |
| - graw/gdi: do not depend on UNICODE macro |
| - gallium/util: limit STACK_LEN on Windows |
| - gallium/util: add missing include |
| - docs: update favicon |
| - docs: remove non-existent reference |
| - docs: restore accidentally dropped labels |
| - docs: fix internal references |
| - docs: use ref-links for internal references |
| - gallium/docs: update to recent sphinx |
| - gallium/docs: fixup formatting of numbered lists |
| - gallium/docs: remove reference to non-existent label |
| - gallium/docs: use none for highlight_language |
| - gallium/docs: prefix exts dir with underscore |
| - gallium/docs: remove non-existent static dir |
| - gallium/docs: remove unused imgmath extension |
| - ci: only build docs in the upstream-repo |
| - ci: only build docs if any docs changed |
| - ci: test docs for non-master builds |
| - ci: move deploy-stage later in the pipeline |
| - ci: move test-docs to container stage |
| - ci: add graphviz to the .docs-base template |
| - merge gallium docs into main docs |
| - docs: clean up gallium index-file |
| - docs: add an extension to generate redirects |
| - docs: move gallium specific docs into gallium folder |
| - docs: use svg for graphviz output |
| - docs: fixup envvar output |
| - zink: expose depth-clip if supported |
| - mesa/main: factor out one-time-init into a helper |
| - mesa/main: use call_once instead of open-coding |
| - gallium/util: do not use _MTX_INITIALIZER_NP on Windows |
| - mesa/main: use p_atomic_inc_return instead of locking |
| - mesa: do not use bitfields for advanced-blend state |
| - mesa: treat Color._AdvancedBlendMode as enum |
| - zink: use ralloc in nir-to-spirv |
| - zink: use ralloc for plain malloc-calls |
| - zink: pass mem_ctx to ralloc_size-call |
| - zink: use ralloc for spirv_builder as well |
| - mesa/program: fix shadow property for samplers |
| - docs: add some very basic documentation about zink |
| - mesa: handle GL_FRONT after translating to it |
| |
| Francisco Jerez (23): |
| |
| - intel/ir: Update performance analysis parameters for memory fence codegen changes. |
| - iris: Simplify iris_batch_prepare_noop(). |
| - iris: Extend iris_context dirty state flags to 128 bits. |
| - iris: Add batch-local synchronization book-keeping to iris_bo. |
| - iris: Add infrastructure to partition batch into sync boundaries. |
| - iris: Bracket batch operations which access memory within sync regions. |
| - iris: Annotate all BO uses with domain and sequence number information. |
| - iris: Drop redundant iris_address::write flag. |
| - iris: Report use of any in-flight buffers on first draw call after sync boundary. |
| - iris: Introduce cache coherency matrix for batch-local memory ordering. |
| - iris: Update cache coherency matrix on PIPE_CONTROL. |
| - iris: Implement buffer-local memory barrier based on cache coherency matrix. |
| - iris: Insert buffer barrier in existing cache flush helpers. |
| - iris: Remove batch argument of iris_resource_prepare_access() and friends. |
| - iris: Perform compute predraw flushes from compute batch. |
| - iris: Remove depth cache set tracking and synchronization. |
| - iris: Remove render cache hash table-based synchronization. |
| - iris: Open-code iris_cache_flush_for_read() and iris_cache_flush_for_depth(). |
| - iris: Emit single render target flush PIPE_CONTROL on format mismatch. |
| - iris: Remove iris_flush_depth_and_render_caches(). |
| - OPTIONAL: iris: Perform BLORP buffer barriers outside of iris_blorp_exec() hook. |
| - iris/icl+: Report same caching domain as main surface for clear color BO. |
| - intel/ir/gen12+: Work around FS performance regressions due to SIMD32 discard divergence. |
| |
| Frank Binns (2): |
| |
| - docs: change "Fixes:" tag example to match git fixes output |
| - egl/dri2: only take a dri2_dpy reference when binding a new context/surfaces |
| |
| Frédéric Bonnard (2): |
| |
| - clover: Fix types collision between c++ and altivec |
| - meson: Revert commit overriding C++ standard with gnu++11 on ppc64el |
| |
| Gert Wollny (66): |
| |
| - r600: Annotate some case fallthroughs |
| - r600: remove unused static functions |
| - r600/sb: replace memset by using member initialization/assignment |
| - r600: remove some unused variables to silence warnings |
| - r600: Fix warning regarding mixing enums and unsigned in ?: expression |
| - r600: Fix nir compiler options, i.e. don't lower IO to temps for TESS |
| - r600/sfn: Unify semantic name and index query and use TEXCOORD semantic |
| - r600/sfn: Fix printing vertex fetch instruction flags |
| - r600: Lower int64 ops from TGSI-to-NIR shaders too |
| - r600: Lower lerp after tgsi_to_nir |
| - r600: Add support for loading index register from other than chan X |
| - r600/sfn: Handle CF index loading from non-X channel |
| - r600/sfn: rework getting a vector and uniforms from the value pool |
| - r600/sfn: Skip move instructions if they are only ssa and without modifiers |
| - r600/sfn: re-use an allocated register in lookup |
| - r600/sfn: skip copying LOD if the target register is the same |
| - r600/sfn: Fix memring print output |
| - r600/sfn: Fix RING instruction assembly emission |
| - r600/sfn: Fix GDS assembly emission |
| - r600/sfn: Fix RAT instruction assembly emission |
| - r600/sfn: Make allocate_reserved_registers forward to a virtual function |
| - r600/sfn: Fix handling of output register index |
| - r600/sfn: Make 3vec loads skip possible moves |
| - r600/sfn: Add support for viewport index output |
| - r600/sfn: Take FOGC and backcolors into account in GS outputs |
| - r600/sfn: Handle loading sample_pos |
| - r600/sfn: Add FS output sample_mask |
| - r600/sfn: Don't reject VARYING_SLOT_PCNT |
| - r600/sfn: remove pointless check |
| - r600/sfn: assert when alu dest is missing |
| - r600/sfn: support indirect sampler buffer reads. |
| - r600/sfn: Add support for texture_samples |
| - r600/sfn: use the per shader atomic base |
| - r600/sfn: SSBO: Fix query of dest components |
| - r600/sfn: Fix clip vertex output as possible stream variable |
| - r600/sfn: Fix splitting constants that come from different kcache banks. |
| - r600/sfn: Don't reorder outputs by location |
| - r600/sfn: Fix printing ALU op without dest |
| - r600: Fix duplicated subexpression in r600_asm.c |
| - r600/sfn: Fix mapping for f32tof64 and f64tof32 |
| - r600/sfn: use modern c++ in printing LDS read instruction |
| - r600/sfn: Correctly update the number of literals when forcing a new group |
| - r600/sfn: remove debug output leftover |
| - nir: lower_tex: Don't normalize coordinates for TXF with RECT |
| - r600/sfn: lower image derefs |
| - r600/sfn: Add imageio support |
| - r600/sfn: Add support for image_size |
| - r600/sfn: Add support for reading cube image array dim. |
| - r600/sfn: Take SSBO buffer ID offset into account |
| - r600/sfn: Handle memory_barrier |
| - r600/sfn: Add lowering pass for shared IO |
| - r600/sfn: Add support for shared atomics |
| - r600/sfn: Don't set num_components on TESS sysvalue intrinsics |
| - r600/sfn: lower rotate ALU ops |
| - r600/sfn: Pipe through requesting a register at a given channel |
| - r600/sfn: emit texture instructions in one block |
| - r600/sfn: Add option to get a temp value for a specific channel |
| - r600/sfn: correct handling of loading vec4 with fetching constants |
| - r600/sfn: Add a forced output swizzle for depth write |
| - r600/sfn: Fix Ring output swizzle masks |
| - r600/sfn: Fix default z swizzle for GDS instructions |
| - r600: Add shader key item to identify when the sample mask should be used |
| - r600/sfn: Only use sample mask if the according shader key is set |
| - r600/sfn: Make the pin_to_channel generic |
| - r600/sfn: write stream outputs to correct mem ring |
| - gallivm/nir: Lower uniforms to UBOs in llvm draw if the driver didn't request this already |
| |
| Greg V (1): |
| |
| - gallium,util: undef ALIGN on FreeBSD to prevent name clash |
| |
| Guido Günther (2): |
| |
| - etnaviv: drm: Use NSEC_PER_SEC |
| - etnaviv: drm: Normalize nano seconds |
| |
| Gurchetan Singh (1): |
| |
| - virgl: apply bgra dest swizzle and add Portal 2 |
| |
| Hanno Böck (1): |
| |
| - Properly check mmap return value |
| |
| Hyunjun Ko (6): |
| |
| - freedreno,tu: Don't request fragcoord components not being read. |
| - tu,radv: fix potentially wrong offset of flexible array. |
| - vulkan: Adds helpers for vk_object (de)allocation and (de)initialization. |
| - tu: Fix wrong copies of sampler descriptor. |
| - turnip: Use the common base object type and struct. |
| - turnip: implement VK_EXT_private_data |
| |
| Iago Toral Quiroga (7): |
| |
| - v3d/compiler: don't rewrite unused temporaries to point to NOP register |
| - v3d/compiler: fix spill offset |
| - v3d/compiler: fix image size for 1D arrays |
| - nir/lower_clip: make the pass compatible with Vulkan semantics |
| - v3d/compiler: handle compact varyings |
| - v3d/compiler: request fragment shader clip lowering to be vulkan compatible. |
| - nir/lower_tex: skip lower_tex_packing for the texture samples query |
| |
| Ian Romanick (24): |
| |
| - nir/algebraic: Recognize open-coded byte or word extract from bfe |
| - nir/algebraic: Split ibfe and ubfe with two constant sources |
| - nir/algebraic: Optimize some bfe patterns |
| - nir/algebraic: Optimize ushr of pack_half, not ishr |
| - nir/algebraic: Add some half packing optimizations for pack_half_2x16_split |
| - nir/algebraic: Eliminate useless extract before unpack |
| - i965: Assert that blorp always handles color blits |
| - meta: Make _mesa_meta_texture_object_from_renderbuffer static |
| - meta: Make _mesa_meta_setup_sampler static |
| - meta: Remove support for clearing integer buffers |
| - mesa: Add matrix utility functions to load matrices |
| - mesa: Add function to calculate an orthographic projection |
| - meta: Stop frobbing MatrixMode |
| - meta: Use same vertex coordinates for GLSL and FF clears |
| - meta: Coalesce the GLSL and FF paths in meta_clear |
| - meta: Remove support for multisample blits |
| - anv/tests: Don't rely on assert or changing NDEBUG in tests |
| - anv/tests: Silence unused parameter warnings in main |
| - anv: Silence unused parameter warning in anv_image_get_clear_color_addr |
| - intel: Silence unused parameter warning in __intel_log_use_args |
| - intel/drm-shim: Add noop ioctl handler for set_tiling |
| - intel/drm-shim: Return correct values for I915_PARAM_HAS_ALIASING_PPGTT |
| - glsl: Remove integer matrix support from ir_dereference_array::constant_expression_value |
| - nir/algebraic: Don't distribute absolute-value into dot-products |
| |
| Icecream95 (78): |
| |
| - pan/midgard: Fix old style shadows |
| - panfrost: Fix background showing when using discard |
| - panfrost: Enable PIPE_CAP_VERTEX_COLOR_UNCLAMPED |
| - panfrost: Decode AFBC flag bits |
| - panfrost: Only use AFBC YTR with RGB and RGBA |
| - pan/midgard: Use a signed value for checking inline constants |
| - Revert "panfrost: Keep cached BOs mmap'd" |
| - panfrost: Mark PIPE_BUFFER BOs as not renderable |
| - pan/mdg: Add a macro for printing instruction source information |
| - pan/mdg: Move r1.w writeout to branch->dest |
| - pan/mdg: Remove old zs store lowering |
| - pan/mdg: Remove old depth writeout code |
| - pan/mdg: Remove writeout case from bytemask_of_read_components |
| - nir: Replace the zs_output_pan intrinsic with combined_output_pan |
| - pan/mdg: Replace writeout booleans with a single value |
| - pan/mdg: Add new depth writeout code |
| - pan/mdg: Move search_var to earlier in midgard_compile.c |
| - pan/mdg: Add depth/stencil support to emit_fragment_store |
| - pan/mdg: Add new depth store lowering |
| - pan/mdg: Print writeout sources in mir_print_instruction |
| - panfrost: Add writes_stencil to the EARLY_Z disable list |
| - panfrost: Move sampler view bo creation to a separate function |
| - panfrost: Create a new sampler view bo when the layout changes |
| - panfrost: Tiled to linear layout conversion |
| - panfrost: Clean up panfrost_frag_meta_rasterizer_update |
| - panfrost: Implement ARB_depth_clamp |
| - pan/decode: Fix helper invocations when tracing |
| - pan/decode: Add missing wrap modes |
| - pan/mdg: Fix max_comp calculation for constant printing |
| - panfrost: RGBA4 and RGB5_A1 framebuffer support |
| - panfrost: Update sampler views when the texture bo changes |
| - panfrost: Copy resources when mapping to avoid waiting for readers |
| - panfrost: Only copy resources when they are in a pending batch |
| - panfrost: Add PAN_MESA_DEBUG=gl3 flag |
| - panfrost: Do fine-grained flushing for occlusion query results |
| - pan/mdg: Vectorize vlut operations |
| - pan/decode: Make mapped memory read-only while decoding |
| - nir: Add a base value to load_raw_output_pan |
| - panfrost: Fix MALI_READS_TILEBUFFER |
| - pan/mdg: Handle tilebuffer wait loops |
| - pan/mdg: Use the writeout tag for tilebuffer wait loops |
| - panfrost: Add rt formats to shader state |
| - panfrost: Add a bitset of render targets read by shaders |
| - pan/mdg: Do the pan_lower_framebuffer pass later |
| - pan/mdg: Emit a tilebuffer wait loop when needed |
| - pan/mdg: Handle non-blend framebuffer lowering |
| - pan/mdg: Support MRT in output load lowering |
| - pan/mdg: Set the z/s store intrinsic base correctly |
| - pan/mdg: Use a 32-bit ld_color_buffer op when needed |
| - panfrost: Implement texture_barrier |
| - panfrost: Stop keying on rt format when using native loads |
| - panfrost: Use f2fmp for framebuffer lowering conversions |
| - panfrost: Enable framebuffer fetch |
| - pan/mdg: Fix non-debug compilation |
| - compiler: Add dual-source factors to blend_factor |
| - gallium: Dual source support in blend_factor_to_shader |
| - pan/mdg: Add a nir pass to reorder store_output intrinsics |
| - pan/mdg: Dual source blend input/writeout support |
| - pan/mdg: Skip z/s combining for dual-source writes |
| - panfrost: Dual source blend support |
| - pan/decode: Open the dump file later |
| - pan/mdg: Don't disassemble blit shaders |
| - panfrost: Rename lower_store to is_blend in pan_lower_framebuffer |
| - pan/mdg: Do per-sample framebuffer loads |
| - panfrost: Do per-sample shading when outputs are read |
| - nir: Add a face_sysval argument to nir_lower_two_sided_color |
| - nir: Fix lower_two_sided_color when the face is an input |
| - panfrost: Report TEXTURE_BUFFER_OBJECTS cap when gl3 flag set |
| - panfrost: Set depth_enabled when stencil is enabled |
| - nir: Set the alignment for SSBO lowering |
| - panfrost: Make panfrost_bo_wait take a wait_readers bool |
| - panfrost: Fix calls to panfrost_flush_batches_accessing_bo |
| - panfrost: Fake RGTC support |
| - panfrost: Use more tilebuffer sizes |
| - panfrost: 8x MRT support |
| - pan/mdg: Use the blend RT for blend shader framebuffer fetches |
| - panfrost: Allow PIPE_TEXTURE_1D_ARRAY textures |
| - pan/mdg: Fix spilling of non-32-bit types |
| |
| Icenowy Zheng (1): |
| |
| - panfrost: signal syncobj if nothing is going to be flushed |
| |
| Ilia Mirkin (14): |
| |
| - freedreno/a3xx: there's no r8i/ui rb format, only rg8i/rg8ui |
| - freedreno/a3xx: reinstate rgb10_a2ui texture format |
| - freedreno/ir3: avoid applying (sat) on bary.f |
| - freedreno/a3xx: fix const footprint |
| - freedreno: fix off-by-one in assertions checking for const sizes |
| - freedreno/a3xx: parameterize ubo optimization |
| - freedreno/a3xx: fix rasterizer discard |
| - nouveau: allow invalidating coherent/persistent buffer backings |
| - st/mesa: allow R8 to not be exposed as renderable by driver |
| - a4xx: add noperspective interpolation support |
| - a4xx: add polygon offset clamp, fix units |
| - ir3: mark ucp_enables as allowed values on all keys |
| - a4xx: hook up centroid ij coords |
| - ir3: use empirical size for params as used by the shader |
| |
| Indrajit Kumar Das (2): |
| |
| - st/mesa: use fragment shader to copy stencil buffer |
| - st/mesa: optimize DEPTH_STENCIL copies using fragment shader |
| |
| Italo Nicola (17): |
| |
| - panfrost: Fix outmods on int to float conversions |
| - pan/mdg: fix src_type in instructions that need an implicit zero |
| - pan/mdg: prepare effective_writemask() |
| - pan/mdg: eliminate references to ins->alu.op |
| - pan/mdg: eliminate references to ins->alu.reg_mode |
| - pan/mdg: fix comment |
| - pan/mdg: eliminate references to ins->alu.outmod |
| - pan/mdg: apply float outmods to textures |
| - pan/mdg: eliminate references to ins->texture.op |
| - pan/mdg: eliminate references to ins->load_store.op |
| - pan/mdg: defer register packing |
| - pan/mdg: externalize mir_pack_mod |
| - pan/mdg: remove ins->alu |
| - pan/mdg: refactor emit_alu_bundle |
| - pan/mdg: defer branch packing |
| - pan/mdg: remove ins->br_compact and ins->branch_extended |
| - pan/mdg: emit REGISTER_UNUSED on unused ALU src2 |
| |
| Iván Briano (9): |
| |
| - anv: use the correct format on Android |
| - anv: Disable B5G6R5_UNORM_PACK16 |
| - anv: Add a way to reserve states from a pool |
| - anv: Implement VK_EXT_custom_border_color |
| - anv: support externally synchronized pipeline caches |
| - anv: implement VK_PIPELINE_CREATE_FAIL_ON_PIPELINE_COMPILE_REQUIRED_BIT_EXT |
| - anv: enable VK_EXT_pipeline_creation_cache_control |
| - anv: Add VK_EXT_custom_border_color to relnotes |
| - anv: fix allocation of custom border color pool |
| |
| James Park (1): |
| |
| - amd/llvm: Reorder LLVM headers |
| |
| James Zhu (1): |
| |
| - ac/gpu_info: Correct Arcturus cu bitmap |
| |
| Jan Beich (5): |
| |
| - drm-uapi: Add sync_file.h |
| - anv,iris: unbreak on BSDs after 812cf5f522ab,abf8aed68047 |
| - util: enable futex usage on BSDs after 7dc2f4788288 |
| - meson: unbreak sysctl.h detection on BSDs |
| - anv: disable i915_perf warning on non-Linux |
| |
| Jan Palus (1): |
| |
| - targets/opencl: fix build against LLVM>=10 with Polly support |
| |
| Jan Zielinski (1): |
| |
| - gallium/swr: Fix crashes in sampling code |
| |
| Jason Ekstrand (167): |
| |
| - intel/eu: Use non-coherent mode (BTI=253) for stateless A64 messages |
| - Revert "anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT)" |
| - vulkan: Allow destroying NULL debug report callbacks |
| - vulkan,anv: Add a common base object type for VkDevice |
| - anv: Stop clflushing events |
| - anv: Allocate CPU-side memory for events |
| - vulkan,anv: Add a base object struct type |
| - vulkan,anv: Move the DEFINE_HANDLE_CASTS macros to vk_object.h |
| - anv: Refactor setting descriptors with immutable sampler |
| - vulkan: Add run-time object type asserts in handle casts |
| - vulkan/wsi: Make wsi_swapchain inherit from vk_object_base |
| - anv/allocator: Add a start_offset to anv_state_pool |
| - vulkan/object: Always include the type |
| - anv,vulkan: Implement VK_EXT_private_data |
| - vulkan: Handle vkGet/SetPrivateDataEXT on Android swapchains |
| - nir: Make "divergent" a property of an SSA value |
| - util/list: Add a list pair iterator |
| - util/vma: Add an option to configure high/low preference |
| - util/vma: Add a debug print helper |
| - util/ra: Add [de]serialization support |
| - anv: Set 3DSTATE_VF_INSTANCING on the SVGS element |
| - anv: Set MOCS in 3DSTATE_CONSTANT_* on Gen9+ |
| - nir: Add some docs to the metadata types |
| - anv: Call vk_object_base_finish for image views |
| - anv: Fix descriptor set clean-up on BO allocation failure |
| - nir: Use 8-bit types for most info fields |
| - anv:gpu_memcpy: Emit 3DSTATE_VF_INDEXING on Gen8+ |
| - nir: Validate jump instructions as an instruction type |
| - nir: Use a switch statement in nir_handle_add_jump |
| - nir: Add documentation for each jump instruction type |
| - nir/clone: Re-use clone_alu for nir_alu_instr_clone |
| - nir: Add a new helper for iterating phi sources leaving a block |
| - nir: Add a store_reg helper and use the builder in phis_to_regs |
| - nir: Add const to nir_intrinsic_src_components |
| - nir/lower_double_ops: Rework the if (progress) tree |
| - nir/opt_deref: Report progress if we remove a deref |
| - nir/copy_prop_vars: Record progress in more places |
| - nir: Fix sources for image atomic fadd |
| - intel/vec4: Stomp the return type of RESINFO to UINT32 |
| - intel/fs: Fix unused texture coordinate zeroing on Gen4-5 |
| - intel/fs: Emit HALT for discard on Gen4-5 |
| - anv/allocator: Compare to start_offset in state_pool_free_no_vg |
| - nir: Add a nir_metadata_all enum value |
| - nir: Add a nir_shader_preserve_all_metadata helper |
| - nir: Call nir_metadata_preserve on !progress |
| - nir: Properly preserve metadata in more cases |
| - intel/nir: Call nir_metadata_preserve on !progress |
| - iris: Better handle metadata in NIR passes |
| - anv: Add an anv_batch_set_storage helper |
| - anv: Add anv_pipeline_init/finish helpers |
| - nir/intrinsics: Put the _intel intrinsics together at the end |
| - anv: Use resolve_device_entrypoint for dispatch init |
| - vulkan: Update Vulkan XML and headers to 1.2.145 |
| - anv: Bump the advertised patch version to 145 |
| - intel/fs: Expose a couple of NIR lowering helpers |
| - intel/fs: Break wm_prog_data setup into a helper |
| - intel/fs: Move more prog_data setup into populate_wm_prog_data |
| - intel/compiler: Expose brw_texture_offset to C |
| - intel/eu: Add a brw_urb_dest_msg_type helper |
| - intel/eu: Set the right subnr for ALIGN16 destinations |
| - intel/eu: Add the RNDU opcode |
| - vulkan/wsi: Don't consider VK_SUBOPTIMAL_KHR to be an error condition |
| - wsi/x11: Log swapchain status changes |
| - freedreno: Only call nir_lower_io on shader_in/out |
| - lima: Only call nir_lower_io on shader_in/out |
| - nouveau: Only call nir_lower_io on shader_in/out |
| - vc4: Only call nir_lower_io on shader_in/out |
| - v3d: Only call nir_lower_io on shader_in/out |
| - panfrost: Only call nir_lower_io on shader_in/out |
| - nir: Assert that nir_lower_io is only called with allowed modes |
| - nir: Remove shared support from lower_io |
| - nir: Add docs to nir_lower[_explicit]_io |
| - anv: Handle clamping of inverted depth ranges |
| - nir/validate: Don't abort() until after the shader has printed |
| - spirv: Skip phis in unreachable blocks in the second phi pass |
| - spirv: Allow block-decorated struct types for constants |
| - vulkan: Update Vulkan XML and headers to 1.2.148 |
| - anv: Advertise VK_EXT_image_robustness |
| - spirv: Update headers and grammar json |
| - spirv: Add support for SPV_EXT_shader_atomic_float |
| - intel/fs: Use the correct logical op for global float atomics |
| - anv: Advertise support for VK_EXT_shader_atomic_float |
| - nir: Allow for system values with variable numbers of destination components |
| - nir/lower_io: Choose to set access based on intrinsic metadata |
| - nir/lower_io: Use b2b for shader and function temporaries |
| - nir/lower_io: Add support for global scratch addressing |
| - spirv: Simplify our handling of NonUniform |
| - spirv: Drop the void \*ptr from vtn_value |
| - spirv: Fix indentation in vtn_handle_ptr |
| - spirv: Clean up OpSignBitSet |
| - spirv: Use nir_bany/ball for OpAny/All |
| - spirv: Add helpers for getting types of values |
| - spirv: Rename push_value_pointer to push_pointer |
| - spirv: Add a vtn_push_nir_ssa helper |
| - spirv/amd: Use vtn_push_nir_ssa |
| - spirv: Add a vtn_get_nir_ssa helper |
| - spirv: Use the new helpers in OpConvertUToPtr/PtrToU |
| - spirv: Refactor vtn_push_ssa |
| - spirv/alu: Use vtn_push_ssa_value |
| - spirv/glsl450: Use vtn_push_ssa_value |
| - spirv/subgroups: Stop incrementing w |
| - spirv/subgroups: Refactor to use vtn_push_ssa |
| - spirv: Simplify vtn_ssa_value creation |
| - spirv: Hand-roll fewer vtn_ssa_value creations |
| - spirv: Add better checks for SSA value types |
| - spirv: Drop the sampled boolean from vtn_type |
| - spirv: Give atomic counters their own variable mode |
| - spirv: Add a helper for getting the NIR type of a vtn_type |
| - spirv: Remove a dead case in function parameter handling |
| - spirv: More heavily use vtn_ssa_value in function parameter handling |
| - anv,turnip,radv,clover,glspirv: Run nir_copy_prop before nir_opt_deref |
| - spirv: Rework our handling of images and samplers |
| - spirv: Also copy over binding information for atomic counters |
| - nir: Take a mode in remove_unused_io_vars |
| - nir/dead_variables: Respect the modes passed to remove_dead_vars |
| - nir: Add nir_foreach_shader_in/out_variable helpers |
| - nir: Add a nir_foreach_function_temp_variable helper |
| - nir: Add a nir_foreach_uniform_variable helper |
| - nir: Add a nir_foreach_gl_uniform_variable helper for GL linking |
| - nir: Add and use a nir_variable_list_for_mode helper |
| - nir: Take a nir_shader and variable mode in assign_var_locations |
| - nir: Take a shader and variable mode in nir_assign_io_var_locations |
| - nir/linking: Rework some internal helpers |
| - st/nir: Rework fixup_varying_slots |
| - nir/split_vars: Add mode checks to list walks |
| - nir: Split nir_index_vars into two functions |
| - nir/lower_amul: Add a variable mode check |
| - nir: Use a nir_shader and mode in lower_clip_cull_distance_arrays |
| - nir/lower_io_to_temporaries: Use a separate list for new inputs |
| - nir/io_to_vector: Use nir_foreach_variable_with_modes |
| - nir/lower_two_sided_color: Use nir_variable_create |
| - nir/lower_uniforms_to_ubo: Use nir_foreach_variable_with_modes |
| - nir/split_per_member_structs: Use nir_variable_with_modes_safe |
| - nir/lower_variable_initializers: Restrict the modes we lower |
| - nir/gl_nir_linker: Use nir_foreach_variable_with_modes |
| - freedreno/ir3_lower_tess: Rework var list helpers |
| - lima/standalone: Rework i/o variable fixup |
| - freedreno/ir3_cmdline: Rework i/o variable fixup |
| - r600/sfn/lower_tess_io: Rework get_tcs_varying_offset |
| - r600/sfn/lower_tex: Get rid of the lower_sampler vector |
| - r600/sfn: Use nir_foreach_variable_with_modes in IO vectorization |
| - panfrost/midgard: Make search_var take a nir_shader and mode |
| - panfrost: Use nir_foreach_variable_with_modes in pan_compile |
| - aco: Use nir_foreach_variable_with_modes to walk SSBOs |
| - mesa/ptn: Use nir_variable_create |
| - gallium/ttn: Use variable create/add helpers |
| - nir: Use a single list for all shader variables |
| - nir/split_per_member_structs: Inline split_variables_in_list |
| - nir/gl_nir_linker: Call add_vars_with_modes once for GL_PROGRAM_INPUT |
| - nir: Add a find_variable_with\_[driver\_]location helper |
| - vulkan: Update Vulkan XML and headers to 1.2.149 |
| - anv: Implement VK_EXT_4444_formats |
| - nir/deref: Don't try to compare derefs containing casts |
| - compiler/types: Add a struct_type_is_packed wrapper |
| - spirv: Do more complex unwrapping in get_nir_type |
| - anv: Advertise shaderIntegerFunctions2 |
| - spirv: Don't emit RMW for vector indexing in shared or global |
| - clover/spirv: Don't call llvm::regularizeLlvmForSpirv |
| - intel/nir: Pass the nir_builder by reference in lower_alpha_to_coverage |
| - intel/nir: Rewrite the guts of lower_alpha_to_coverage |
| - intel/fs: Fix MOV_INDIRECT and BROADCAST of Q types on Gen11+ |
| - intel/fs: Don't copy-propagate stride=0 sources into ddx/ddy |
| - iris: Re-emit push constants if we have a varying workgroup size |
| - spirv: Run repair_ssa if there are discard instructions |
| - nir: More NIR_MAX_VEC_COMPONENTS fixes |
| - intel/fs/swsb: SCHEDULING_FENCE only emits SYNC_NOP |
| - radeonsi: Only call nir_lower_var_copies at the end of the opt loop |
| |
| Jesse Natalie (10): |
| |
| - nir_lower_io: Add addr_format_is_offset helper |
| - nir: When nir_lower_vars_to_explicit_types is run on temps, update scratch_size |
| - nir: Support load/store of temps as scratch in nir_lower_explicit_io |
| - nir: Support vec8/vec16 in nir_lower_bit_size |
| - nir: Support algebraic opts on vectors larger than 4 |
| - nir: Support 8 and 16 component vectors for reduceable intrinsics |
| - nir/vtn: Add support for 8 and 16 vector ball/bany |
| - u_debug_stack_test: Fix MSVC compiling by using ATTRIBUTE_NOINLINE |
| - nir: More NIR_MAX_VEC_COMPONENTS fixes |
| - glsl_type: Add packed to structure type comparison for hash map |
| |
| JibbityJobbity (1): |
| |
| - drirc: Enable glthread for PCSX2 |
| |
| Jon Turney (1): |
| |
| - glthread: Fix use of alloca() without #include "c99_alloca.h" |
| |
| Jonathan Gray (13): |
| |
| - util: unbreak endian detection on OpenBSD |
| - util/anon_file: add OpenBSD shm_mkstemp() path |
| - meson: build with _ISOC11_SOURCE on OpenBSD |
| - meson: don't build with USE_ELF_TLS on OpenBSD |
| - meson: conditionally include -ldl in gbm pkg-config file |
| - util: futex fixes for OpenBSD |
| - util/u_thread: include pthread_np.h if found |
| - anv: use os_get_total_physical_memory() |
| - util/os_misc: add os_get_available_system_memory() |
| - anv: use os_get_available_system_memory() |
| - util/os_misc: os_get_available_system_memory() for OpenBSD |
| - radv: remove seccomp includes |
| - vulkan: make VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT conditional |
| |
| Jonathan Marek (135): |
| |
| - turnip: update "fetchsize" value to match fdl6_layout changes |
| - turnip: enable tiling for compressed formats |
| - util/format: translate 422_UNORM and 420_UNORM vulkan formats |
| - freedreno/registers: document 422_UNORM and 420_UNORM formats |
| - turnip: implement VK_KHR_sampler_ycbcr_conversion |
| - turnip: enable 422_UNORM formats |
| - freedreno: move a4xx specific layout code to a4xx code |
| - freedreno/a5xx: remove unused reference to gmem_alignw in layout code |
| - freedreno/a6xx: don't use gmem_alignw for imported buffers |
| - freedreno/a6xx: split up gmem/tile alignment requirements |
| - freedreno: reduce extra height alignment in a6xx layout |
| - freedreno/a6xx: use RESOLVE_TS event |
| - freedreno: add adreno 650 |
| - freedreno/layout: add explicit offset/pitch argument to fdl6_layout |
| - turnip: support VkImageDrmFormatModifierExplicitCreateInfoEXT |
| - turnip: fix RENDER_COMPONENTS value |
| - turnip: move HLSQ_UPDATE_CNTL write to before xs config writes |
| - turnip: update some properties based on blob driver |
| - turnip: clamp sampler minLod/maxLod |
| - freedreno/a6xx: use nonbinning VS when GS is used |
| - turnip: correctly emit non-binning vs in transform feedback case |
| - turnip: fix HW binning with geometry shader |
| - turnip: use common emit_xs_cntl to fill a6xx_sp_xs_ctrl_reg0 |
| - turnip: fix VFD_CONTROL for binning pass |
| - turnip: pipeline program state refactor |
| - turnip: share code between 3D blit/clear path and tu_pipeline |
| - turnip: add layered 3D path clear for CmdClearAttachments |
| - turnip: add emit renderpass cache flushes for sysmem 3D CmdClearAttachments |
| - turnip: remove some dead/redundant code |
| - freedreno/ir3: fix ir3_nir_move_varying_inputs |
| - turnip: remove duplicated stage2opcode and stage2shaderdb |
| - turnip: simplify stage2 helpers |
| - turnip: set VFD_INDEX_OFFSET in 3D clear/blit path |
| - turnip: fix 3D path always being used for CmdBlitImage |
| - turnip: fix cubic filtering with CmdBlitImage |
| - turnip: compute and graphics have completely separate state |
| - turnip: move descriptor set BO tracking to CmdBindDescriptorSets |
| - turnip: improve dirty bit handling a bit |
| - turnip: delete dead dynamic state code |
| - turnip: refactor draw states and dynamic states |
| - turnip: input attachment descriptor set rework |
| - turnip: use draw states for input attachments |
| - turnip: use u_format for packing gmem clear values |
| - freedreno/a6xx: FETCHSIZE is PITCHALIGN |
| - freedreno/fdl6: rework layout code a bit (reduce linear align to 64 bytes) |
| - turnip: fix a crash when rasterizerDiscardEnable is set |
| - turnip: fix a sample shading case |
| - turnip: fix renderpass gmem configs when there are too many attachments |
| - turnip: set the API version |
| - turnip: move enum translation functions to a common header |
| - freedreno/a6xx: VSC "STRM_ARRAY_PITCH" is "STRM_LIMIT" |
| - freedreno/a6xx: remove unnecessary OVERFLOW_FLAG_REG check |
| - turnip: remove unnecessary OVERFLOW_FLAG_REG check |
| - freedreno/a4xx: restore pitch to bytes change to layout code |
| - freedreno/a4xx: simplify setup_slices |
| - turnip: rework streamout state and add missing counter buffer read/writes |
| - turnip: refactor CmdDraw* functions (and a few fixes) |
| - turnip: enable VK_EXT_index_type_uint8 |
| - turnip: implement CmdDrawIndirectByteCountEXT |
| - turnip: fix ts_cs_memory typo |
| - turnip: use pipeline cs for shader programs instead of separate bo |
| - freedreno/registers: a6xx depth bounds test registers |
| - turnip: implement depthBounds |
| - turnip: translate CreateRenderPass to CreateRenderPass2 |
| - turnip: replace a memset(0) with zalloc in CreateRenderPass |
| - turnip: use RenderPassCreateInfo for render_pass_add_implicit_deps |
| - turnip: move some logic out of create_render_pass_common |
| - turnip: implement VK_EXT_vertex_attribute_divisor |
| - turnip: fix empty scissor case |
| - turnip: fix update_stencil_mask |
| - turnip: disable early_z for VK_FORMAT_S8_UINT |
| - freedreno/registers: add CP_DRAW_INDIRECT_MULTI |
| - freedreno/ir3: add support for load_draw_id |
| - turnip: implement VK_KHR_shader_draw_parameters |
| - turnip: fix VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_1_FEATURES |
| - turnip: fix huge scissor min/max case |
| - freedreno/ir3: fix resinfo wrmask |
| - freedreno/regs: add extra bits for UBWC array pitch |
| - turnip: enable largePoints |
| - turnip: enable depthBiasClamp |
| - freedreno/registers: update varying-related registers |
| - freedreno/a3xx: support LINEAR_PIXEL/PERSP_CENTROID/LINEAR_CENTROID sysvals |
| - freedreno/a4xx: fake LINEAR_PIXEL varying support for u_blitter |
| - freedreno/ir3: add generic get_barycentric() |
| - freedreno/a5xx: set missing bary sysvals |
| - freedreno/a6xx: set missing bary sysvals |
| - turnip: set missing bary sysvals |
| - freedreno/ir3: add support for INTERP_MODE_NOPERSPECTIVE |
| - turnip: make tiling config part of framebuffer state |
| - turnip: rework render_tiles loop |
| - turnip: vsc improvements |
| - turnip: fix tess param bo size calculation |
| - turnip: clear_blit: pass aspect mask to setup function |
| - turnip: support multi-image layouts |
| - turnip: enable 420_UNORM formats |
| - freedreno/layout: fix explicit layout offset not added to slice offset |
| - freedreno/ir3: fix/rework tess levels |
| - Revert "nir: Add an option for lowering TessLevelInner/Outer to vecs" |
| - Revert "nir: Support sysval tess levels in SPIR-V to NIR" |
| - freedreno/regs: document SS6_UBO state src |
| - turnip: use global bo for clear blit shaders |
| - freedreno/ir3: add support for a650 tess shared storage |
| - freedreno/regs: document CS shared storage size bit |
| - freedreno/a2xx: fix compressed textures |
| - freedreno: add a fd_resource_pitch helper |
| - freedreno/layout: layout simplifications and pitch from level 0 pitch |
| - turnip: fix active_desc_sets not being set for compute pipeline |
| - freedreno/ir3: fix setup_input for sparse vertex inputs |
| - freedreno/ir3: run nir_opt_loop_unroll in optimization loop |
| - freedreno: fix layout pitchalign field not being set for imported buffers |
| - freedreno/regs: update primitive output related registers |
| - turnip: clean up primitive output state |
| - turnip: drop GS clear path |
| - turnip: use DIRTY SDS bit to avoid making copies of pipeline load state ib |
| - turnip: emit compute pipeline directly in CmdBindPipeline |
| - turnip: fix inconsistencies with tu6_load_state_size |
| - turnip: remove use of tu_cs_entry for draw states |
| - gitlab-ci: re-enable arm64_a630_vk |
| - freedreno/regs: update a6xx GRAS registers |
| - freedreno/regs: update a6xx RB regs |
| - freedreno/regs: update a6xx VPC regs |
| - freedreno/regs: update a6xx PC regs |
| - turnip: disable tiling for NV12/IYUV formats |
| - turnip: remove extra gmem alignment |
| - freedreno/ir3: fix wrong local_primitive_id_start type |
| - turnip: move WFI out of draw state to fix a650 hangs |
| - turnip: use patchControlPoints for HS_INPUT_SIZE value |
| - turnip: fix SP_HS_UNKNOWN_A831 value for A650 |
| - turnip: workaround for a630 d24_unorm_s8_uint fails |
| - turnip: fix sysmem CmdClearAttachments 3D fallback breaking GMEM path flush |
| - turnip: delete tu_clear_sysmem_attachments_2d |
| - turnip: add support for D32_SFLOAT_S8_UINT |
| - turnip: rework extended formats to allow more extended formats |
| - util/format: translate A4R4G4B4_UNORM and A4B4G4R4_UNORM vulkan formats |
| - turnip: implement VK_EXT_4444_formats |
| |
| Jordan Justen (17): |
| |
| - intel/dev: Split .num_subslices out of GEN12_FEATURES macro |
| - intel/dev: Add device info for RKL |
| - intel/l3: Don't rely on cfg entry URB size being 0 as a sentinel |
| - intel/l3: Allow platforms to have no l3 configurations |
| - iris/l3: Enable L3 full way allocation when L3 config is NULL |
| - anv: Set L3 full way allocation at context init if L3 cfg is NULL |
| - intel/dev: Add device info for DG1 |
| - iris: Make use of devinfo has_aux_map field |
| - anv: Make use of devinfo has_aux_map field |
| - anv/pipeline: Split VFE/INTERFACE_DESCRIPTOR out to emit_media_cs_state |
| - anv/cmd_buffer: Split GPGPU_WALKER out to emit_gpgpu_walker |
| - iris: Split walker and state update into iris_upload_gpgpu_walker |
| - iris/compute: Split out iris_load_indirect_location |
| - intel/compiler/cs: Allow simd32 in some more cases with no8 and/or no16 |
| - intel/compiler/fs: Still attempt simd32 when INTEL_DEBUG=no16 is used |
| - iris: Add missing break in switch in modifier_is_supported |
| - anv, iris: Set MediaSamplerDOPClockGateEnable for gen12+ |
| |
| Jose Maria Casanova Crespo (4): |
| |
| - v3d: Fix swizzle in DXT3 and DXT5 formats |
| - v3d: Include supported DXT formats to enable s3tc/dxt extensions |
| - vc4: don't rely on intr->num_components for non-vectorized intrinsics |
| - nir: only uniforms with dynamically_uniform offset are dynamically_uniform |
| |
| Joshua Ashton (7): |
| |
| - anv: Remove RANGE_SIZE usage |
| - radv: Remove RANGE_SIZE usage |
| - turnip: Remove RANGE_SIZE usage |
| - vulkan: Update Vulkan XML and headers to 1.2.140 |
| - radv: Implement VK_EXT_custom_border_color |
| - radeonsi: Use TRUNC_COORD on samplers |
| - radv: Implement VK_EXT_4444_formats |
| |
| José Fonseca (3): |
| |
| - glthread: Add GLAPIENTRY to _mesa_marshal_MultiDrawArrays. |
| - appveyor: Upgrade pip. |
| - appveyor: Use Python3. |
| |
| Karol Herbst (50): |
| |
| - nir/deref: copy ptr_stride when rematerializing |
| - nir/validate: validate the stride for deref_ptr_as_array |
| - Revert "nir/validate: validate the stride for deref_ptr_as_array" |
| - nvir/nir: use component helpers instead of insn->num_components |
| - st/mesa: lower images when needed |
| - nir/lower_images: fix for array of arrays |
| - nir/lower_images: handle dec and inc |
| - nv50/ir/nir: move away from image_deref intrinsics |
| - nv50/ir/nir: handle image atomic inc and dec |
| - nv50/ir/nir: remove image uniform hack |
| - gv100/ir: fix atom cas |
| - gv100/ir: fix shift lowering |
| - gv100/ir: fix OP_TXG for shadow textures |
| - nv50/ir/nir: add workaround for double vertex attribs |
| - nv50/ir/print: add missing VIEWPORT_MASK handling |
| - nv50/ir/nir: fix ext_demote_to_helper_invocation |
| - nv50/ir/nir: fix nv_viewport_array2 |
| - nvc0: enable spirv caps with nir |
| - nv50/ir/nir: don't emit a restart with set a stream_id |
| - nv50/ir/nir: handle clip vertex for tess eval shaders |
| - nv50/ir/nir: rework input output handling |
| - nv50/ir/nir: rework CFG handling |
| - nv50/ir/ra: convert some for loops to Range-based for loops |
| - nv50/ir/ra: fix memory corruption when spilling |
| - nv50/ir/nir: fix interpolation on explicit operations |
| - gv100/ir: implement sample shading |
| - gv100/ir: fix coherent and volatile memory access |
| - nv50/ir/nir: fix cache mode conversion |
| - nv50/ir: fix memset on non trivial types warning |
| - nv50/ir/tgsi: move call to tgsi_scan_shader inside Source constructor |
| - nvc0: set local mem size for compute on gv100 |
| - nvc0: set sampler index mode to independently on gv100 compute |
| - gv100/ir: set ftz bit on floating point operations |
| - ci: bump libdrm to 2.4.102 |
| - nouveau: enable HMM |
| - gallium: add PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY |
| - nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY |
| - nouveau: expose HMM |
| - ci: need to install wget in order to download libdrm |
| - ci: bump libdrm to 2.4.102 |
| - nouveau: enable HMM |
| - gallium: add PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY |
| - nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY |
| - nouveau: expose HMM |
| - st/mesa: fix st_CopyPixels without support for stencil exports |
| - nv50/ir/tgsi: silence warning about unhandled GS_INPUT_PRIM property |
| - nv50/ir: initialize persampleInvocation to false |
| - nir/lower_io: assert that offsets are used for shader_in |
| - nv50/ir/nir: fix global_atomic_comp_swap |
| - spirv: extract switch parsing into its own function |
| |
| Kenneth Graunke (20): |
| |
| - iris: Include linux/sync_file.h instead of cut and pasting contents |
| - anv: Include linux/sync_file.h instead of cut and pasting contents |
| - iris: Rename iris_syncpt to iris_syncobj for clarity. |
| - iris: Give up on not passing ice to iris_init_batch |
| - iris: Destroy transfer slab after batches |
| - iris: Flush any current work in iris_fence_await before adding deps |
| - intel: Move anv_gem_supports_syncobj_wait to common code. |
| - iris: Detect DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT kernel support |
| - iris: Implement PIPE_FLUSH_DEFERRED support. |
| - intel: Delete hardcoded devinfo->urb.size values for Gen7+ (sans DG1). |
| - iris: Delete useless #define |
| - intel/eu: Add a brw_urb_desc helper |
| - CI: Disable Panfrost Mali-T820, Lima Mali-400 and Lima Mali-450 jobs |
| - intel: Disable loading drivers on DG1 devices for now |
| - nir: Fix divergence analysis for tessellation input/outputs |
| - iris: Implement pipe->texture_subdata directly |
| - iris: Fix CCS check in iris_texture_subdata(). |
| - iris: Delete shader variants when deleting the API-facing shader |
| - iris: Reorder the loops in iris_fence_await() for clarity. |
| - iris: Drop stale syncobj references in fence_server_sync |
| |
| Kristian Høgsberg (73): |
| |
| - freedreno/ir3: Pass stream output info to ir3_shader_from_nir |
| - freedreno/ir3: Rename ir3_nir_lower_to_explicit_io |
| - freedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass |
| - freedreno/ir3: Lower GS builtins before lowering IO |
| - freedreno/ir3: Drop hack to clean up split vars |
| - freedreno/fdl: Align after dividing by block size |
| - freedreno/a6xx: Set tfetch correctly for compressed formats |
| - freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics |
| - freedreno/a6xx: Create shader dependent streamout state at compile time |
| - freedreno/a6xx: Map inputs to VFD entries up front |
| - freedreno/a6xx: Allocate ringbuffer based on VFD count |
| - freedreno/a6xx: Emit VFD setup as array writes |
| - freedreno/a6xx: Avoid stalling for occlusion queries |
| - freedreno: Use the right amount of &'s |
| - freedreno: Use explicit \*_NONE enum for undefined formats |
| - turnip: Use hw enum when emitting A6XX_RB_STENCIL_CONTROL |
| - turnip: Use tu6_reduction_mode() to avoid warning |
| - turnip: Use {} initializer to silence warning |
| - freedreno/ir3: Avoid {0} initializer for struct reginfo |
| - src/util: Remove out-of-range comparison |
| - mapi: Fix a couple of warnings in generated code |
| - mesa/st: Use memset to zero out struct |
| - egl/android: Move get_format under HAVE_DRM_GRALLOC guard where it's used |
| - egl/android: Drop unused variable |
| - freedreno/a6xx: Move per element offset to VFD_DECODE |
| - freedreno/a6xx: Decouple VFD_FETCH and VFD_DECODE |
| - freedreno/a6xx: Create stateobj for VFD_DECODE |
| - freedreno/a6xx: Program VFD_DEST_CNTL from program stateobj |
| - freedreno/a6xx: Turn on robustness extensions |
| - docs/features.txt: Update for freedreno |
| - freedreno/a6xx: Fix VFD_CONTROL emit |
| - freedreno/a6xx: Don't write REG_A6XX_RB_SRGB_CNTL in restore |
| - freedreno/a6xx: Set index buffer size to bo size |
| - freedreno: Handle DRM_FORMAT_MOD_INVALID in shared code |
| - turnip: Put VK_KHR_external_fence_fd stubs back |
| - freedreno/a6xx: Don't blit with R2D_RAW |
| - freedreno/a6xx: Move fd6_ifmt into fd6_blitter.c |
| - freedreno/a6xx: Split out src and dst setup helpers for blit |
| - freedreno/a6xx: Don't set unknown bit when tiling differs |
| - freedreno/a6xx: Set src and dst rects outside blit loop |
| - freedreno/a6xx: Program SP_2D_SRC_FORMAT outside blit loop |
| - freedreno/a6xx: Consolidate computing blit_cntl |
| - freedreno/a6xx: Don't emit src state when clearing |
| - freedreno/a6xx: Separate stencil sysmem clear fix |
| - freedreno/a6xx: Enable FMT6_10_10_10_2_UNORM blitting |
| - freedreno/a6xx: Make blit_control helper a little more helpful |
| - freedreno/a6xx: Program A6XX_SP_2D_SRC_FORMAT_COLOR_FORMAT based on dst format |
| - freedreno/a6xx: Move REG_A6XX_SP_2D_SRC_FORMAT programming to helper |
| - freedreno/a6xx: Move CP_SET_MARKER to setup helper |
| - freedreno/a6xx: Program RB_UNKNOWN_8C01 in setup helper |
| - freedreno/a6xx: Don't take pipe_blit_info in emit_blit_dst |
| - freedreno/a6xx: Split clear and blit texture into different functions |
| - freedreno/registers: Rename SP_2D_SRC_FORMAT |
| - turnip: Move device enumeration and feature discovery to tu_drm.c |
| - turnip: Move tu_bo functions to tu_drm.c |
| - turnip: Collapse some tu_drm wrappers |
| - turnip: Move remaining drm code to tu_drm.c |
| - turnip: Only include msm_drm in tu_drm.c |
| - egl/android: Remove unused variable |
| - mapi/test: Change type to unsigned for offset |
| - gallium: Switch u_debug_stack/symbol.c to util/hash_table.h |
| - util: Move stack debug functions to src/util |
| - util: Add unit test for stack backtrace capture |
| - gallium/android: Rewrite backtrace helper for android |
| - ci: Include enough Android headers to let us compile test EGL |
| - mapi: Mark TLS symbols as optional in glapi-symbols.txt |
| - turnip: Make tu_android.c compile again |
| - meson: Define ANDROID and ANDROID_API_LEVEL when compiling for Android |
| - anv: Pass device to setup_gralloc0_usage for error reporting |
| - anv: Add stub for anv_gem_get_tiling() for Android |
| - vulkan: Allow global symbol HMI for Android |
| - radv/android: Remove unused variable |
| - ci: Add a build test for the Android platform |
| |
| Krzysztof Raszkowski (1): |
| |
| - gallium/swr: Fix building swr with MSVC |
| |
| Laura Ekstrand (3): |
| |
| - docs: include meson in the toctree |
| - docs: Remove version. |
| - docs: Add the favicon to the new page. |
| |
| Leo Liu (3): |
| |
| - radeon/vcn: reset the decode flags from message buffer |
| - radeon/vcn: add Sienna to use internal register offset |
| - radeon/vcn/dec: add db_aligned_height to message buffer |
| |
| Lepton Wu (3): |
| |
| - mapi: x86: Fix dynamic entries in x86 tsd stubs. |
| - mapi: Return NULL function pointers for GL_EXT_debug_marker |
| - egl: Allow software rendering for vgem/virtio_gpu in platform_device |
| |
| Lionel Landwerlin (60): |
| |
| - drm-shim: move handle lock to shim_fd |
| - drm-shim: don't create a memfd per BO |
| - drm-shim: silence warnings |
| - intel/dev: print out error when platform is not found by name |
| - intel: add stub_gpu tool |
| - ci: Add intel to shaderdb runs |
| - iris: don't assert on unfinished aux import in copy paths |
| - anv: don't expose VK_INTEL_performance_query without kernel support |
| - anv: fix alignments for uniform buffers |
| - genxml: run sorting script |
| - genxml: fix invalid end value for video fields |
| - genxml: factor out utility functions |
| - genxml: pack: deal with default field not being simple integers |
| - intel/genxml: fix bits generation for MI_LOAD_REGISTER_IMM |
| - intel/mi-builder: add framework for self modifying batches |
| - anv: don't reserve a particular register for draw count |
| - anv: add a new execution mode for secondary command buffers |
| - intel/genxml: add PIPE_CONTROL command cache invalidate bit |
| - intel/perf: make pipeline statistic query loading optional |
| - intel/perf: store the appropriate OA formats in queries |
| - intel/perf: update generated code to ralloc all data |
| - intel/perf: create a unique list of counters |
| - intel/perf: compute number of passes for a set of counters |
| - intel/perf: emit counter units in generated code |
| - intel/perf: add helper to compute metrics from counters |
| - intel/perf: add counter category to generated code |
| - intel/perf: report whether the platform is supported |
| - anv: use a query filled by the perf code |
| - intel/perf: reuse offset specified in the query |
| - anv: Implement VK_KHR_performance_query |
| - intel/perf: repurpose INTEL_DEBUG=no-oaconfig |
| - anv: fixup unwinding of device create failure |
| - blorp: rename workaround address function |
| - anv: store the workaround address |
| - iris: store workaround address |
| - i965: store workaround_bo offset |
| - intel: add identifier for debug purposes |
| - iris: add identifier BO |
| - i965: add identifier BO |
| - anv: add identifier BO |
| - intel/aub_error_decoder: print driver identifier if found |
| - iris: fix BO destruction in error path |
| - i965: don't forget to set screen on duped image |
| - iris: fix export of GEM handles |
| - i965: fix export of GEM handles |
| - anv: add an option to disable secondary command buffer calls |
| - anv: garbage collect timeline semaphore when querying value |
| - iris: fix fallback to swrast driver |
| - anv: fix uninitialized variable access |
| - anv: properly handle fence import of sync_fd = -1 |
| - anv: fix descriptor set free |
| - anv: fix incorrect realloc failure handling |
| - anv: centralize vk to gen arrays |
| - anv: fix up dynamic clip emission |
| - anv: don't fail userspace relocation with perf queries |
| - anv: fix transform feedback surface size |
| - anv: VK_INTEL_performance_query interaction with VK_EXT_private_data |
| - intel/perf: store query symbol name |
| - intel/perf: fix raw query kernel metric selection |
| - intel/compiler: fixup Gen12 workaround for array sizes |
| |
| Liviu Prodea (1): |
| |
| - util: Make process_test path compatible with mingw native toolchains |
| |
| Louis-Francis Ratté-Boulianne (1): |
| |
| - nir: Always create UBO variable when lowering uniforms to ubo |
| |
| Lucas Stach (3): |
| |
| - etnaviv: generalize FE stall before loading shader and sampler states |
| - etnaviv: retarget transfer to render resource when necessary |
| - etnaviv: don't expose timer queries |
| |
| Luigi Santivetti (3): |
| |
| - dri2: dri2_make_current() fold multiple if blocks |
| - dri2: do not conflate unbind and bindContext() failure |
| - egl/dri2: try to bind old context if bindContext failed |
| |
| Marcin Ślusarz (24): |
| |
| - i965: remove unused variable |
| - glsl_to_tgsi: add fallthrough comments |
| - glsl: cleanup vertex shader input checks |
| - iris: remove unused iris_bo->swizzle_mode |
| - intel/compiler: fix Android build |
| - st/mesa: fix reporting of float perf counters max value |
| - iris: return max counter value for AMD_performance_monitor |
| - iris: remove iris_monitor_config |
| - intel/perf: move query_mask and location out of gen_perf_query_counter |
| - iris: propagate error from gen_perf_begin_query to glBeginPerfQueryINTEL |
| - i965: propagate error from gen_perf_begin_query to glBeginPerfQueryINTEL |
| - util: fix possible fd leaks in os_socket_listen_abstract |
| - glsl: catch out of bounds access in the debug version |
| - util: fix possible buffer overflow in util_get_process_exec_path |
| - util/format: initialize non-important components to 0 |
| - mesa: fix out of bounds access in glGetFramebufferParameterivEXT |
| - mesa: quiet down static analyzers |
| - iris: quiet down static analyzers |
| - intel/vec4: fix out of bounds read |
| - intel/perf: fix performance counters availability after glFinish |
| - anv: refresh cached current batch bo after emitting some commands |
| - anv: fix minor gen_ioctl(I915_PERF_IOCTL_CONFIG) error handling issue |
| - intel/perf: split load_oa_metrics |
| - intel/perf: export performance counters sorted by [group|set] and name |
| |
| Marek Olšák (226): |
| |
| - mesa: optimize glPush/PopClientAttrib by removing malloc overhead |
| - mesa: don't call _mesa_update_state for _mesa_get_clamp_fragment_color |
| - mesa: don't set unnecessary program flags in _mesa_update_state |
| - mesa: don't update shaders on fixed-func state changes if user shaders are bound |
| - mesa,st/mesa: add a fast path for non-static VAOs |
| - mesa: inline vbo_context inside gl_context to remove vbo_context dereferences |
| - mesa: add glInternalBufferSubDataCopyMESA for glthread |
| - mesa: add _mesa_InternalBind{ElementBuffer,VertexBuffers} for glthread |
| - glthread: do glBufferSubData as unsynchronized upload + GPU copy |
| - glthread: don't use atomics for refcounting to decrease overhead on AMD Zen |
| - glthread: track pointers and strides for Pointer & EXT_dsa attrib functions |
| - glthread: track instance divisor changes |
| - glthread: track primitive restart state |
| - glthread: initialize VAOs properly |
| - glthread: handle POS vs GENERIC0 aliasing |
| - glthread: handle gl{Push,Pop}ClientAttrib{DefaultEXT} for glthread states |
| - glthread: upload non-VBO vertices and indices for non-Indirect non-IBM draws |
| - tgsi_to_nir: handle TGSI_SEMANTIC_BLOCK_SIZE |
| - tgsi_to_nir: handle TGSI_OPCODE_BARRIER |
| - radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size |
| - radeonsi: clean up and deduplicate code around internal compute dispatches |
| - radeonsi: bind shader images after DCC is disabled for image stores |
| - radeonsi: add SI_IMAGE_ACCESS_DCC_OFF to ignore DCC for shader images |
| - radeonsi: implement and use compute-based DCC decompression on gfx9-10 |
| - radeonsi: add a workaround to fix KHR-GL45.texture_view.view_classes on gfx9 |
| - radeonsi: fix si_compute_clear_render_target with render condition enabled |
| - radeonsi: revert an accidental change in si_clear_buffer |
| - Revert "ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it's always set" |
| - Revert "ac: reassociate FP expressions for inexact instructions for radeonsi" |
| - ac/surface: fix MSAA crash with FORCE_SWIZZLE_MODE on gfx9 |
| - radeonsi: don't wait for idle at the end of gfx IBs |
| - ac/surface: unset RADEON_SURF_TC_COMPATIBLE_HTILE if HTILE hasn't been computed |
| - radeonsi/gfx9: always use IMG_DATA_FORMAT_S8_32 for 8-bit stencil |
| - radeonsi: allow tc_compatible_htile to be mutable |
| - radeonsi: enable TC-compatible HTILE on demand for best Z/S performance |
| - tgsi_to_nir: translate non-vec4 image stores correctly |
| - radeonsi: fix compilation of monolithic PS |
| - amd: update amdgpu_drm.h |
| - amd: remove duplicated definitions from amdgpu_drm.h |
| - amd: assume CMASK is always rb/pipe_aligned, remove ac_surface.u.gfx9.cmask |
| - amd: assume HTILE is always rb/pipe_aligned, remove ac_surface.u.gfx9.htile |
| - ac/surface,radeonsi: move the set/get_bo_metadata code to ac_surface.c |
| - ac/surface,radeonsi: move the set/get_umd_metadata code into ac_surface.c |
| - amd: unify code for overriding offset and stride for imported buffers |
| - ac/surface: override all offsets including metadata offsets |
| - ac/surface: fix broken pitch override on gfx8 |
| - gallium: rename 'state tracker' to 'frontend' |
| - gallium: change comments to remove 'state tracker' |
| - gallium: rename PIPE_RESOURCE_FLAG_ST_PRIV to FRONTEND_PRIV |
| - gallium: remove more "state tracker" occurrences |
| - radeonsi: also enable tgsi_to_nir caching for compute shaders |
| - glthread: stop using GLenum16 to get correct GL errors for out-of-bounds enums |
| - radeonsi: don't expose 16xAA on chips with 1 RB due to an occlusion query issue |
| - ac/nir: honor ACCESS_STREAM_CACHE_POLICY for L1 and L0 caches too |
| - radeonsi: use correct clear value size for EQAA in expand_fmask |
| - radeonsi: optimize access pattern for compute blits with linear textures |
| - radeonsi: tweak clear/copy_buffer limits when to use compute |
| - radeonsi: simplify setting resource usage for si_init_temp_resource_from_box |
| - radeonsi: rename SI_RESOURCE_FLAG_TRANSFER to FORCE_LINEAR |
| - radeonsi: use vi_dcc_enabled instead of using tex->surface.dcc_offset directly |
| - radeonsi: use display_dcc_offset for setting displayable_dcc_cb_mask |
| - winsys/amdgpu: add RADEON_FLAG_UNCACHED for faster blits over PCIe |
| - radeonsi: disable the L2 cache for most CPU mappings of textures |
| - radeonsi: disable the L2 cache for CPU read mappings of buffers |
| - radeonsi: compute perf tests - don't test 1 wave/SA limit, test no limit first |
| - radeonsi: test uncached clear/copy buffer performance with compute shaders |
| - gallium/u_threaded: execute transfer_unmap with THREAD_SAFE directly |
| - ac/gpu_info: compute the best safe IB alignment |
| - ac/surface: don't compute single-sample CMASK if it's unaligned |
| - radeonsi: don't use INDIRECT_BUFFER within IBs |
| - radeonsi: decrease the max GS invocation count to 32 |
| - Revert "radeonsi: don't wait for idle at the end of gfx IBs" |
| - ac: update register and packet definitions for preemption |
| - radeonsi: move resetting tracked registers into a new function |
| - radeonsi: split si_all_descriptors_begin_new_cs and rename functions |
| - radeonsi: don't enable TC-compatible HTILE for stencil if stencil doesn't use it |
| - radeonsi/gfx8: enable TC-compatible HTILE from the beginning as before |
| - radeonsi: don't hardcode most perf counter block counts |
| - ac/gpu_info: replace num_good_cu_per_sh with min/max_good_cu_per_sa |
| - amd: replace SH -> SA (shader array) in comments |
| - radeonsi/gfx10: implement most performance counters |
| - glthread: don't upload for glDraw inside a display list and always sync |
| - nir: add i2imp and u2ump opcodes for conversions to mediump |
| - nir: add int16 and uint16 type helpers |
| - nir: lower int16 and uint16 in nir_lower_mediump_outputs |
| - nir: fix lower_wpos for 16-bit fddy |
| - nir: add options::vectorize_vec2_16bit to limit vectorization to vec2 16 |
| - glsl: treat lowp as mediump when lowering builtins |
| - glsl: handle int16 and uint16 types and add instructions for mediump |
| - glsl: lower mediump integer types to int16 and uint16 |
| - glsl: lower mediump partial derivatives |
| - glsl: lower the precision of imageLoad |
| - glsl: lower samplers with highp coordinates correctly |
| - gallium: add shader caps INT16 and FP16_DERIVATIVES |
| - ac: rename has_double_rate_fp16 -> has_packed_math_16bit |
| - ac/nir: use more types from ac_llvm_context |
| - ac/nir: support vector types in the type suffix of overloaded intrinsics |
| - ac/nir: remove type and num_channels args from ac_build_buffer_store_common |
| - ac/nir: support 16-bit data in buffer_load_format opcodes |
| - ac/nir: support 16-bit data in image opcodes |
| - ac/nir: handle nir_op_[fiu]2[fiu]mp opcodes |
| - ac/nir: select v_cvt_pkrtz for all conversions from f32 to f16 for radeonsi |
| - ac/nir: set the second v_cvt_pkrtz argument to undef if it's unused |
| - ac/nir: support v2f16 derivatives |
| - nir: don't count samplers and images in interface blocks |
| - nir: gather which images are buffers |
| - nir: gather which images are MSAA |
| - radeonsi: remove unused leftover code for INDIRECT_BUFFER inside IBs |
| - radeonsi: remove const_buffers_declared hacks |
| - radeonsi: pass at most 3 images and/or shader buffers via user SGPRs for compute |
| - radeonsi: add a hack to disable TRUNC_COORD for shadow samplers |
| - gallium/u_vbuf: get rid of some pointer dereferences |
| - gallium/u_vbuf: add a faster path for uploading non-interleaved attribs |
| - glthread: sync in glFlush for multiple contexts |
| - radeonsi: enable ARB_sparse_buffer |
| - ac,radeonsi: replace == GFX10 with >= GFX10 where it's needed |
| - ac,radeonsi: start adding support for gfx10.3 |
| - ac/surface: add displayable DCC code for gfx10.3 |
| - radeonsi: honor a user-specified pitch on gfx10.3 |
| - radeonsi: enable larger SDMA clears and copies on gfx10.3 |
| - radeonsi: implement R9G9B9E5 render target and image store support on gfx10.3 |
| - radeonsi: move L2_CACHE_CONTROL registers into si_emit_framebuffer_state |
| - radeonsi: set BIG_PAGE fields on gfx10.3 |
| - radeonsi: don't set any XNACK options on gfx10.3 |
| - ac: align num_vgprs for gfx10.3 |
| - radeonsi: add support for Sienna Cichlid |
| - radeonsi: require LLVM 11 for gfx10.3 |
| - ac/surface: don't recompute the DCC retile map for imported textures |
| - amd/addrlib: don't recompute DCC info for every ComputeDccAddrFromCoord call |
| - amd/addrlib: remove unused members of ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT |
| - ac/surface: add a wrapper structure to hold ADDR_HANDLE |
| - ac/surface: cache DCC retile maps (v2) |
| - amd/addrlib: fix the C++ one definition rule violation |
| - ac/surface: don't set is_displayable if displayable DCC is missing |
| - ac/surface: require that gfx8 doesn't have DCC in order to be displayable |
| - ac/surface: enable DCC for the first level in the mip tail on gfx10 |
| - ac/surface: don't free dcc_retile_map on failure |
| - radeonsi: compact MRTs to save PS export memory space |
| - ac/nir: fix 64-bit division for GL CTS |
| - glapi: fix incorrect param names in ARB_vertex_attrib_binding functions |
| - glthread: rename non_vbo_attrib_mask -> user_buffer_mask, attribs -> buffers |
| - glthread: handle ARB_vertex_attrib_binding |
| - radeonsi: don't wait for idle at the end of gfx IBs |
| - radeonsi: replace ctx->screen with sscreen in si_flush_gfx_cs |
| - glsl,driconf: add allow_glsl_120_subset_in_110 for SPECviewperf13 |
| - driconf: add workarounds for SPECviewperf13 |
| - amd: add proper definitions for NOP packets |
| - ac,winsys/amdgpu: align IBs the same as the kernel |
| - radeonsi: don't add the border color buffer into the init_config state |
| - radeonsi: rename init_config states to cs_preamble states |
| - radeonsi: don't add the tess ring buffers into the cs_preamble state |
| - radeonsi: make wait_mem_scratch unmappable |
| - radeonsi: disallow adding BOs into si_pm4_state except 1 shader BO per state |
| - radeonsi: make si_pm4_cmd_begin/end static and simplify all usages |
| - radeonsi: clear per-context buffers at the end of si_create_context |
| - radeonsi: remove tabs |
| - radeonsi: don't flush in fence_server_sync |
| - ac/gpu_info: fix num_physical_sgprs_per_simd for gfx10 |
| - radeonsi: fix NGG culling for Wave64 |
| - radeonsi: always use Wave32 for GS fast launch, because Wave64 hangs |
| - radeonsi: always use Wave64 for HS/GS/VS shader stages (except GS fast launch) |
| - radeonsi: don't try to enable NGG culling for GS |
| - radeonsi: add a debug option to enable NGG culling for tessellation |
| - glsl: make print_type non-static for debugging |
| - glsl: print precision qualifiers in IR dumps |
| - glsl: print constant initializers |
| - glsl: fix the type of ir_constant_data::u16 |
| - glsl: fix evaluating float16 constant expression matrices |
| - glsl: run validate_ir_tree if GLSL_VALIDATE=1 regardless of the build config |
| - glsl: validate more stuff |
| - glsl: convert reusable lower_precision util code into helper functions |
| - glsl: remove the return type from lower_precision |
| - glsl: cleanups in lower_precision |
| - glsl: flatten a tautological conditional in lower_precision |
| - glsl: don't lower precision of textureSize |
| - glsl: don't lower builtins to mediump that don't allow it |
| - glsl: lower builtins to mediump that ignore precision of certain parameters |
| - glsl: lower builtins to mediump that always return mediump or lowp |
| - glsl: add capability to lower mediump array types |
| - glsl: lower mediump temporaries to 16 bits except structures (v2) |
| - gallium: add PIPE_SHADER_CAP_GLSL_16BIT_TEMPS for LowerPrecisionTemporaries |
| - Revert "ac/surface: require that gfx8 doesn't have DCC in order to be displayable" |
| - glsl: don't validate array types in ir_dereference_variable |
| - radeonsi: prevent a gfx10_ngg_calculate_subgroup_info failure for TES+NGG GS |
| - radeonsi: add missing initialization of registers |
| - radeonsi/gfx10: set the correct value for OFFCHIP_BUFFERING |
| - radeonsi: sort registers in si_emit_initial_compute_regs according to GPU gen |
| - radeonsi: sort registers in si_init_cs_preamble_state according to GPU gen |
| - ac: add helper ac_get_register_name |
| - ac: add tables for CP register shadowing |
| - winsys/amdgpu: make amdgpu_bo_unmap non-static |
| - radeonsi: make cs_preamble_state optional |
| - radeonsi: reorder code in update_gs_ring_buffers and init_tess_factor_ring |
| - radeonsi: implement CP register shadowing |
| - radeonsi: add reg shadowing codepaths to GS and tess ring setup |
| - radeonsi: add debug code for register shadowing |
| - radeonsi: don't restore states at the beginning of IBs if they're shadowed |
| - radeonsi: set up IBs for preemption |
| - radeonsi: enable preemption if the kernel enabled it |
| - amd: rename SIENNA -> SIENNA_CICHLID |
| - amd: add support for Navy Flounder |
| - amd: enable displayable DCC for everything newer than Navi1x |
| - radeonsi: disable SDMA on gfx9 |
| - radeonsi: reorder NIR optimizations |
| - radeonsi: call nir_split_array_vars/shrink_vec_array_vars/opt_find_array_copies |
| - glsl: lower_precision - fix assertion failure with dereferences of constants |
| - glsl: fix constant expression evaluation for 16-bit types |
| - glsl: don't lower atomic functions to mediump |
| - glsl: don't create conversion opcodes for array types |
| - glsl: don't lower to mediump for desktop OpenGL |
| - glsl: improve precision determination for calls |
| - Revert "radeonsi: honor a user-specified pitch on gfx10.3" |
| - radeonsi: use correct wave size in gfx10_ngg_calculate_subgroup_info |
| - radeonsi: use the same units for esgs_ring_size and ngg_emit_size |
| - radeonsi: increase minimum NGG vertex count requirement per workgroup on gfx 10.3 |
| - radeonsi: fix applying the NGG minimum vertex count requirement |
| - radeonsi: don't count unusable vertices to the NGG LDS size |
| - radeonsi: add a common function for getting the size of gs_ngg_scratch |
| - radeonsi: remove the NGG hack decreasing LDS usage to deal with overflows |
| - radeonsi: various fixes for gfx10.3 |
| - radeonsi: disable NGG culling on gfx10.3 because of hangs |
| - st/mesa: don't generate NIR for ARB_vp/fp if NIR is not preferred |
| - radeonsi: fix tess levels coming as scalar arrays from SPIR-V |
| - gallivm: fix build on LLVM 12 due to LLVMAddConstantPropagationPass removal |
| - ac/llvm: fix unaligned VS input loads on gfx10.3 |
| - Revert "ac: generate FMA for inexact instructions for radeonsi" |
| |
| Marek Vasut (3): |
| |
| - etnaviv: Disable seamless cube map on GC880 |
| - etnaviv: Remove etna_resource_get_status() |
| - etnaviv: Add lock around pending_ctx |
| |
| Mario Kleiner (1): |
| |
| - vulkan/wsi: Really terminate DRM lease in wsi_release_display(). |
| |
| Mathias Fröhlich (2): |
| |
| - st/mesa: Move _NEW_FRAG_CLAMP to NewFragClamp driver flag. |
| - mesa: set _NEW_FRAG_CLAMP only when needed |
| |
| Matt Turner (22): |
| |
| - intel/compiler: Drop opt_sampler_eot() |
| - intel/tools: Remove unnecessary reg number checking |
| - intel/tools: Drop srctype from ipreg |
| - intel/tools: Require explicit regions/types for special regs |
| - intel/tools: Disallow control subregisters > 3 |
| - intel/tools: Add assembler tests for the cr0 register |
| - intel/compiler: Add assert that set bits are within mask |
| - intel/compiler: Don't emit no-op cr0 changes |
| - intel/tools: Fix typos |
| - intel/tools: Remove stray newline |
| - intel/tools: Don't allow empty type specifier |
| - intel/tools: Simplify register type handling |
| - intel/tools: Make swizzle an integer |
| - intel/tools: Make writemask an integer |
| - intel/tools: Simplify immediate handling |
| - intel/tools: Simplify dstregion |
| - intel/compiler: Relax SENDS regioning assertions |
| - intel/tools: Pass integers, not enums, to stride() |
| - intel/tools: Manually set ARF register file/nr/subnr |
| - intel/tools: Don't hardcode notification register |
| - intel/tools: Simplify notification register handling |
| - intel/tools: Test notification subregisters |
| |
| Mauro Rossi (17): |
| |
| - android: iris: add iris_seqno.{c,h} to Makefile.sources |
| - freedreno/drm: android: add libfreedreno_registers static dependency |
| - freedreno: android: add adreno-pm4-pack.xml.h generation to android build |
| - android: util: fix build for GL4.1 support |
| - android: svga: fix build for GL4.1 support |
| - android: aco: add aco_ir.cpp to Makefile.sources |
| - android: nvir/gv100: update sources in Makefile.sources |
| - android: freedreno: add fd5_layout.c to Makefile.sources |
| - android: freedreno/ir3: add missing generated sources and rules |
| - android: freedreno/ir3: simplify generated sources rules |
| - android: panfrost/encoder: add libmesa_nir static dependency |
| - radv: fix build on Android 7 (v2) |
| - android: freedreno/registers: fix generated headers rules |
| - android: freedreno/ir3: fix include paths |
| - android: freedreno/common: add support for libfreedreno_common static |
| - android: freedreno: move a2xx disasm out of gallium |
| - android: freedreno/common: add libmesa_git_sha1 static dependency |
| |
| Michel Dänzer (38): |
| |
| - gitlab-ci: Use YAML anchor for llvmpipe paths in virgl rules |
| - gitlab-ci: Update to current templates |
| - gitlab-ci: Move down container_pre_build.sh invocation in x86_build.sh |
| - gitlab-ci: Add Debian testing repository for x86_build image |
| - gitlab-ci: Install WINE from Debian testing |
| - gitlab-ci: Move lib{drm,pciaccess}-dev cross packages out of loop |
| - gitlab-ci: Install g++-mingw-w64-x86-64-win32 instead of mingw-w64 |
| - Revert "ac,radeonsi: fix compilations issues with LLVM 11" |
| - Revert "gallium/gallivm: fix compilation issues with llvm 11" |
| - gitlab-ci: Enable -Werror in `meson-s390x` job |
| - gitlab-ci: Also list arm/x86_build in needs: of test jobs |
| - gitlab-ci: x86_test-base image as common base for x86_test-gl/vk |
| - gitlab-ci: Pull in GCC 9 from Debian testing in x86_test-gl/vk images |
| - gitlab-ci: Move LLVM/clang 6/7 packages to the x86_build_old image |
| - gitlab-ci: Use Debian 10 wine-development packages |
| - gitlab-ci: Stop using packages from Debian testing |
| - gitlab-ci: Move meson back to x86_test-gl/vk ephemeral packages lists |
| - gitlab-ci: Add x86_build-base docker image |
| - gitlab-ci: Use separate docker images for cross builds |
| - loader/dri3: Add dri3_wait_for_event_locked full_sequence out parameter |
| - loader/dri3: Use dri3_wait_for_event_locked in loader_dri3_wait_for_msc |
| - loader/dri3: Check for window destruction in dri3_wait_for_event_locked |
| - gitlab-ci: Automatically run pipelines for Marge Bot pre-merge only |
| - gitlab-ci: Use rules: instead of except:/only: for test-docs job |
| - gitlab-ci: Extend .ci-run-policy template for docs jobs |
| - gitlab-ci: Do not create the "success" job when the test-docs job exists |
| - ci: Use "when: always" for pages job |
| - ci: Move deploy stage between container & build stages |
| - Revert "loader/dri3: Check for window destruction in dri3_wait_for_event_locked" |
| - gitlab-ci: Remove indirect dependencies from needs: |
| - gitlab-ci: Drop dependencies: |
| - Revert https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4580 |
| - gitlab-ci: Fix "triggered by Marge for a merge request" rule |
| - gitlab-ci: Only trigger test-docs job automatically for MRs |
| - ci: Use FDO_CI_CONCURRENT in run-shader-db.sh as well |
| - ci: Do not mark container / pages jobs as interruptible |
| - ci: Use half as many parallel softpipe / virgl test jobs |
| - ci: Use ignore_scheduled_pipelines anchor in .radeonsi-rules |
| |
| Michel Zou (1): |
| |
| - swr: fix build with mingw |
| |
| Mike Blumenkrantz (73): |
| |
| - zink: explicitly zero some arrays in ntv |
| - zink: add SpvId returns to a couple ntv functions |
| - zink: flush active queries on destroy and free query object |
| - zink: fix vkCmdResetQueryPool usage |
| - zink: reset query on-demand when beginning a new query from resume |
| - zink: always use logical eq ops in ntv with 1bit inputs |
| - zink: track program usages for each shader |
| - zink: emit interpolation decorations for ntv outputs |
| - zink: handle more glsl->spirv builtin translation |
| - zink: rework input/output location emission |
| - zink: use '2' variants for device props/feats, check features for ext enabling |
| - zink: add spirv builder util functions for emitting xfb decorations |
| - zink: add spirv_builder methods for OpVectorExtractDynamic and OpVectorInsertDynamic |
| - zink: implement streamout and xfb handling in ntv |
| - zink: implement transform feedback support to finish off opengl 3.0 |
| - zink: set PIPE_CAP_VIEWPORT_TRANSFORM_LOWERED and remove POS special casing |
| - zink: switch to passing VkPhysicalDeviceFeatures2 in VkDeviceCreateInfo |
| - zink: enable xfb extension in screen creation |
| - zink: use int assignment for vk int type |
| - zink: use correct define value for reserved slot count in ntv |
| - zink: clamp VkImageCreateInfo.arrayLayers to 1 for image resource creation |
| - zink: unify code for setting resource barriers |
| - zink: handle signed and unsigned min/max ops in ntv |
| - zink: add ult handling for ntv |
| - zink: add bitfield_reverse handling to ntv |
| - zink: lower byte/word extract ops in nir |
| - zink: handle ixor in ntv |
| - zink: handle isign alu in ntv |
| - zink: set lower_mul_high and lower_rotate in ntv compiler options |
| - zink: use OpFUnordNotEqual for nir_op_fne |
| - zink: set lower_uadd_carry in nir options |
| - zink: implement VK_EXT_index_type_uint8 |
| - nir: add lowering pass for clip plane enabling |
| - st/program: use nir_lower_clip_disable instead of nir_lower_clip_vs conditionally |
| - nir: add lowering pass for fragcolor -> fragdata |
| - zink: translate gl_FragColor to gl_FragData before ntv to fix multi-rt output |
| - u_prim_restart: handle user buffers in util_translate_prim_restart_ib() |
| - nir: allow nir_lower_point_size_mov to run in geometry shader |
| - nir: allow nir_lower_clip_halfz to run in geometry shaders |
| - zink: rework query handling |
| - zink: use #define for number of queries per-pool |
| - zink: only stall during query destroy for xfb queries |
| - zink: properly handle query pool overflows |
| - zink: only reset query pool on query end if current batch isn't in renderpass |
| - zink: use right vulkan type for GL_PRIMITIVES_GENERATED queries |
| - zink: handle ntv case of nested loop instructions more permissively |
| - zink: add lengthy comment and remove assert from discard_if ntv pass |
| - zink: use type of src[0] for ntv store and load ops |
| - zink: try copy_region hook for blits where we can't do a regular blit or resolve |
| - zink: block vkCmdBlitImage usage for multi sampled blits |
| - zink: block resolve blits for depth/stencil buffers |
| - zink: handle empty attachments |
| - zink: try to handle multisampled null buffers |
| - zink: enable tgsi texcoord pipe cap |
| - zink: destroy gfx program when a shader is freed |
| - zink: destroy descriptor pools on context destroy |
| - zink: free pipeline cache during program destroy |
| - zink: free all ntv allocations after creating shader module |
| - zink: use helper function to handle uvec/bvec types |
| - zink: handle texelFetchOffset with offsets |
| - zink: add some asserts for building access chains in ntv |
| - zink: omit Lod image operand in ntv when not using an image texture dim |
| - nir: allow lower_psiz_mov to run in tessellation stages |
| - nir: allow nir_lower_clip_halfz to run in tess eval shader |
| - u_prim_restart: handle indirect draws |
| - zink: add extension loading framework for spirv builder |
| - zink: implement VK_EXT_robustness2 |
| - zink: clamp PIPE_SHADER_CAP_MAX_SHADER_BUFFERS to PIPE_MAX_SHADER_BUFFERS |
| - zink: handle VK_EXT_vertex_attribute_divisor setup |
| - zink: store valid timestamp bits onto zink_screen |
| - zink: implement handling for VK_EXT_calibrated_timestamps |
| - u_prim_restart: add inline function for getting restart index based on index size |
| - zink: reorder create_stream_output_target to fix failure case leak |
| |
| Miklós Máté (1): |
| |
| - docs: add some missing stuff to sourcetree.rst |
| |
| Nanley Chery (18): |
| |
| - iris: Drop can_fast_clear_color's format parameter |
| - iris: Remove the CCS_D fallback |
| - iris: Avoid fast-clear with incompatible view |
| - iris: Disable sRGB fast-clears for non-0/1 values |
| - intel: Add ISL_AUX_USAGE_GEN12_CCS_E |
| - iris: Don't support sRGB + Y_TILED_CCS on gen9 |
| - iris: Use ISL_AUX_USAGE_GEN12_CCS_E on gen12 |
| - isl/drm: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS |
| - gallium/dri2: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS |
| - iris: Handle importing aux-enabled surfaces on TGL |
| - iris: Refactor modifier_is_supported for gen12 |
| - iris: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS |
| - iris: Zero the add-on clear color BO on import |
| - dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_B8G8R8X8_UNORM |
| - iris: Don't call SET_TILING for dmabuf imports |
| - gallium/dri2: Report correct YUYV and UYVY plane count |
| - iris: Fix aux assertion in resource_get_handle |
| - blorp: Fix alignment test for HIZ_CCS_WT fast-clears |
| |
| Nataraj Deshpande (3): |
| |
| - anv: Limit vulkan version to 1.1 for Android |
| - anv: Disable extensions based on Android versions |
| - dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_R8G8B8X8_UNORM |
| |
| Neha Bhende (6): |
| |
| - util: Initialize pipe_shader_state for passthrough and transform shaders |
| - util: Add util functionality for GL4.1 support |
| - winsys/drm: Add GL4.1 support in drm winsys |
| - svga/include: Headers for GL4.1 support |
| - svga: Add GL4.1(compatibility profile) support in svga driver |
| - svga: Performance fixes |
| |
| Neil Armstrong (2): |
| |
| - Revert "CI: Disable Lima jobs due to lab unhealthiness" |
| - Revert "CI: Disable Panfrost Mali-T820 jobs" |
| |
| Neil Roberts (26): |
| |
| - nir/scheduler: Handle nir_intrinsic_load_per_vertex_input |
| - v3d: Remove unused member of v3d_compile |
| - nir/schedule: Store a pointer to the scoreboard in nir_deps_state |
| - nir/scheduler: Add an option to specify what stages share memory for I/O |
| - v3d: Let scheduler know GS doesn’t have shared I/O memory |
| - gallium: Add pipe cap for primitive restart with fixed index |
| - mesa: Add PrimitiveRestartFixedIndex to gl_constants |
| - v3d: Disable PIPE_CAP_PRIMITIVE_RESTART |
| - v3d: Add missing macro for stvpmd instruction |
| - v3d: Use stvpmd for non-uniform offsets in GS |
| - compiler: Add a system value for the line coord |
| - v3d: Implement the line coord intrinsic |
| - nir: Add intrinsics for the line width |
| - v3d: Handle the line width intrinsics |
| - v3d: Add a lowering pass for line smoothing |
| - v3d: Enable perpendicular line caps when line smoothing |
| - broadcom/qpu: set VC5_QPU_RADDR_A out of the switch at _pack_branch |
| - v3d/compiler: Fix sorting the gs and fs inputs |
| - v3d/compiler: Lower geometry output store base into offset src |
| - nir/scheduler: Move nir_scheduler to its own header |
| - nir/schedule: Store a pointer to the options struct in scoreboard |
| - nir/schedule: Add a callback for backend-specific dependencies |
| - v3d: Mark scheduling dependency for prim id and first output |
| - nir/schedule: Add an option for a fallback scheduling algorithm |
| - v3d: Changed v3d_compile:failed to an enum |
| - v3d: Retry with the fallback scheduler when RA fails |
| |
| Oschowa (5): |
| |
| - radv: Don't take absolute value of unsigned type. |
| - aco: Don't declare 'Block' as class, but define as struct. |
| - aco: Don't std::move temporary object. |
| - aco: Use correct reference type in for-range-loop. |
| - radv: Explicitly cast TIMESTAMP_NOT_READY value to uint32_t where needed. |
| |
| Pablo Saavedra (5): |
| |
| - ci: TRACES_DB_PATH and RESULTS_PATH defined as relative paths |
| - ci: ArgumentParser receives the args from the main parameters |
| - ci: Migrate tracie tests done in shell script to pytest |
| - ci: Split test_tracie_skips_traces_without_checksum in separate cases |
| - ci: Fix TypeError error when traces in traces.yml is an empty list |
| |
| Pavel Asyutchenko (1): |
| |
| - vulkan/overlay: fix crash on destroying NULL swapchain |
| |
| Peter Seiderer (3): |
| |
| - vc4_bufmgr: fix time_t printf |
| - pan_bo.h: add time.h include for time_t |
| - v3d_bufmgr: fix time_t printf |
| |
| Pierre Moreau (4): |
| |
| - clover/nir: Check the result of spirv_to_nir |
| - clover/api: Address missing braces for subobj init |
| - clover: Address unnecessary copy warnings |
| - clover/spirv: Remove unused tuple header |
| |
| Pierre-Eric Pelloux-Prayer (62): |
| |
| - radeonsi: fix export count |
| - mesa: add gl_context::ForceIntegerTexNearest |
| - driconf: add force_integer_tex_nearest option |
| - radeonsi: add workaround for issue 2647 |
| - radeonsi: don't print gs_copy_shader stats for shaderdb |
| - glsl: init gl_FragColor if zero_init=true |
| - glsl: rework zero initialization |
| - glsl: add a is_implicit_initializer flag |
| - mesa: extend GLSLZeroInit semantics |
| - gallium: add a new cap PIPE_CAP_GLSL_ZERO_INIT |
| - ac/nir: export some undef as zero |
| - ac/surface: remove shadowing declaration |
| - amdgpu/radeon: add secure api |
| - radeonsi: add AMD_DEBUG=tmz option |
| - radeon: add RADEON_CREATE_ENCRYPTED flag |
| - radeonsi: allocate framebuffer texture as secure when using tmz |
| - amdgpu: add encrypted slabs support |
| - radeonsi: force using staging texture when uploading to secure texture |
| - radeonsi/sdma: implement tmz support |
| - gallium: PIPE_RESOURCE_FLAG_ENCRYPTED |
| - radeonsi: add support for PIPE_RESOURCE_FLAG_ENCRYPTED |
| - amdgpu: use AMDGPU_IB_FLAGS_SECURE when requested |
| - radeonsi: determine secure flag must be set for gfx IB |
| - radeonsi: do not use cmask with encrypted texture |
| - amd/addrlib: fix forgotten char -> enum conversions |
| - radeonsi: fix inversed arguments in si_test_gds_memory_management |
| - amdgpu: fix uninitialized variable |
| - radeonsi/sdma: remove useless compare |
| - radeonsi/drirc: enable zerovram option for 7 Days to Die |
| - winsys/radeon: do not cast bo->va as void* |
| - radeonsi: add return value to gfx10_ngg_calculate_subgroup_info |
| - radeonsi/ngg: try GS multi-cycling mode if default mode failed |
| - ac/surface: set SCANOUT if surf->is_displayable |
| - ac/surface: fix epitch when modifying surf_pitch |
| - ac/llvm: load 1 byte at a time if unaligned on gfx10 |
| - st/mesa: make texture views inherit compressed_data storage |
| - radeonsi: bump SI_NUM_SHADER_BUFFERS to 32 |
| - st/mesa: do not clear NewDriverState for inactive states |
| - glsl: reject size1x8 for image variable with floating-point data types |
| - ac/llvm: remove the -1 hack from ac_atomic_inc_wrap |
| - glsl: don't expose imageAtomicIncWrap for signed image |
| - glsl: only allow 32 bits atomic operations on images |
| - glsl: declare gl_Layer/gl_ViewportIndex/gl_ViewportMask as vs builtins |
| - st/mesa: set compressed_data to NULL when freed |
| - bin/symbols-check.py: add --ignore-symbol argument |
| - ac/llvm: export ac_init_llvm_once in targets |
| - mesa: rename _mesa_free_errors_data |
| - mesa: add bool param to _mesa_free_context_data |
| - mesa/st: release debug_output after destroying the context |
| - ac/surface: adapt surf_size when modifying surf_pitch |
| - radeonsi: adjust epitch for PIPE_FORMAT_R8G8_R8B8_UNORM |
| - radeonsi: extend workaround for KHR-GL45.texture_view.view_classes on gfx9 |
| - ac/llvm: handle static/shared llvm init separately |
| - mesa/st: introduce PIPE_CAP_NO_CLIP_ON_COPY_TEX |
| - radeonsi: enable PIPE_CAP_NO_CLIP_ON_COPY_TEX |
| - ac/llvm: add option to clamp division by zero |
| - radeonsi,driconf: add clamp_div_by_zero option |
| - radeonsi: use radeonsi_clamp_div_by_zero for SPECviewperf13, Road Redemption |
| - glsl: fix per_vertex_accumulator::fields size |
| - r600/uvd: set dec->bs_ptr = NULL on unmap |
| - radeon/vcn: set dec->bs_ptr = NULL on unmap |
| - mesa: fix glUniform* when a struct contains a bindless sampler |
| |
| Pierre-Loup A. Griffais (2): |
| |
| - radv: fix null descriptor for dynamic buffers |
| - radv: fix vertex buffer null descriptors |
| |
| Qiang Yu (6): |
| |
| - radeonsi: remove emacs style config file |
| - panfrost: don't always build bifrost_compiler |
| - radeonsi: fix syncobj wait timeout |
| - radeonsi: fix user fence space when MCBP is enabled |
| - radeonsi: fix max syncobj wait timeout |
| - radeonsi: fix user fence GPU address |
| |
| Rafael Antognolli (8): |
| |
| - intel: Store the aperture size in devinfo. |
| - intel/isl: Update mocs for DG1 |
| - intel/l3: Return the URB size from devinfo for DG1 |
| - intel/devinfo: Add function to check for DRM_I915_GEM_GET_TILING. |
| - iris/bufmgr: Do not use map_gtt or use set/get_tiling on DG1 |
| - anv/dg1: Don't use SET_TILING kernel uapi. |
| - iris: Align last_seqnos to 64 bits. |
| - anv: Align "used" attribute to 64 bits. |
| |
| Rhys Kidd (5): |
| |
| - nv50_2d: regenerate envytools-based rnndb headers |
| - nv50_2d,nvc0_2d: Document SET_PIXELS_FROM_MEMORY_SAFE_OVERLAP from rnndb |
| - nvc0_2d: Document SET_PIXELS_FROM_MEMORY_CORRAL_SIZE from rnndb |
| - nvc0: fix macro define for NVE4_COPY() |
| - nvc0: add documentation for nve4+ (Kepler) COPY class |
| |
| Rhys Perry (174): |
| |
| - aco: remove use of f-strings |
| - aco: add message to static_assert |
| - nir: add missing group_memory_barrier handling |
| - compiler/spirv: flag nclamp/nmin/nmax as exact |
| - nir: make fsat return 0.0 with NaN instead of passing it through |
| - docs: add src/amd/ to sourcetree.html |
| - docs/envvars: document ACO_DEBUG |
| - docs/envvars: update RADV_FORCE_FAMILY |
| - aco: simplify consecutive ordered vmem/lds writes optimization |
| - aco: fix consecutively written vgprs from vmem instructions |
| - aco: mark phi definitions as last-seen phi operands |
| - aco: consider affinities when creating v_mac_f32 |
| - aco: improve phi affinities with p_split_vector |
| - aco: split operations that use a swap's definition |
| - aco: fix disassembly with LLVM 11 |
| - nir/opt_if: run opt_peel_loop_initial_if after all other optimizations |
| - nir/opt_if: use nir_src_as_bool in opt_peel_loop_initial_if helper |
| - aco: fix typo in insert_waitcnt's kill() |
| - nir: fix lowering to scratch with boolean access |
| - aco: fix interaction with 3f branch workaround and p_constaddr |
| - aco: consider SDWA during value numbering |
| - aco: check instruction format before waiting for a previous SMEM store |
| - aco: preserve more fields when combining additions into SMEM |
| - aco: don't reorder barriers in the scheduler |
| - aco: fix 64-bit shared_atomic_exchange |
| - docs: add missing "shader\_" in VK_KHR_shader_subgroup_extended_types |
| - radv: set keep_statistic_info with RADV_DEBUG=shaderstats |
| - ac/gpu_info, radv: set max_wave64_per_simd to 20 on GFX10 |
| - aco: use v_xor3_b32 |
| - aco: validate instructions reading/writing upper halves/bytes |
| - aco: p_extract_vector in 64-bit u2f16/i2f16 |
| - aco: allow reading/writing upper halves/bytes when possible |
| - aco: prefer 4-byte aligned definitions |
| - aco: add Info::{operand_size,definition_size} |
| - aco: use Info::definition_size instead of definition's regclass |
| - aco: fix moving sub-dword values out of a register for a fixed definition |
| - aco: use num_opcodes instead of last_opcode |
| - aco: improve code for f2{i,u}{8,16} |
| - aco: use p_as_uniform in emit_vop1_instruction |
| - aco: add and set precise flag |
| - aco: create mads when signed zeros should be preserved |
| - aco: try to use fma instead of mad when denormals are enabled |
| - aco: create 16-bit mad/fma |
| - aco: update comment about preserving fp16/fp64 denormals |
| - aco: create 16-bit input and output modifiers |
| - aco: improve sub-dword check for sgpr/constant propagation |
| - aco: fix half_pi constant for 16-bit fsin/fcos |
| - aco: use 32-bit inline constants for 16-bit integer instructions |
| - aco: improve 8/16-bit constants |
| - aco: copy-propagate constants through p_extract_vector/p_split_vector |
| - aco: optimize 16-bit and 64-bit float comparisons |
| - aco: validate sub-dword pseudo instructions |
| - aco: add more opcodes to can_swap_operands |
| - aco: allow GFX9 partial writes with instructions which use opsel |
| - aco: improve check for moving temporaries out of fixed definitions |
| - aco: fix encoding of certain s_setreg_imm32_b32 instructions |
| - aco: fix validation error from vgpr spill/restore code |
| - aco: fix sub-dword opsel/sdwa checks |
| - aco: fix validation of opsel when set for the definition |
| - aco: shrink ssa_info |
| - aco: make ssa_info::label 64-bit |
| - aco: shrink mad_info |
| - aco: fix edge check with sub-dword temporaries |
| - aco: use the same regclass as the definition for undef phi operands |
| - radv: add new drirc option radv_no_dynamic_bounds |
| - radv: enable radv_no_dynamic_bounds for Path of Exile |
| - radv: enable radv_no_dynamic_bounds for more Path of Exile executables |
| - nir: slight correction to cube_face_coord constant folding |
| - spirv: set variables to restrict by default |
| - radv: fix image variable types in meta shaders |
| - aco: only use SMEM if we can prove it's safe |
| - aco: allow SMEM for some sub-dword accesses |
| - radv/aco,aco: allow SMEM SSBO loads on GFX6/7 |
| - aco: fix copy+paste error in split_buffer_store |
| - aco: don't store byte-aligned short stores |
| - aco: add missing bld.scc() in byte_align_scalar() |
| - aco: don't create byte-aligned short loads |
| - aco: fix when sub-dword create_vector operand cannot be placed perfectly |
| - aco: improve vectorization of 8/16-bit loads/stores |
| - aco: ignore blocked registers when checking edges in get_reg_impl() |
| - aco: remove outdated assert in handle_operands() |
| - radv: enable zerovram for Quantic Dream games |
| - aco: use VOP2 version of v_mbcnt_hi_u32_b32 on GFX6/7 |
| - aco: rework boolean phi pass |
| - aco: create better code for boolean phis with constant operands |
| - aco: optimize boolean phis with uniform selections |
| - aco: don't create phis with undef operands in the boolean phi pass |
| - aco: read 0 from inactive lanes when using dpp |
| - aco: optimize some masked swizzles to DPP |
| - aco: implement <32-bit masked_swizzle_amd |
| - nir/lower_subgroups: pass options struct to lower_shuffle |
| - nir/lower_subgroups: add lower_shuffle_to_swizzle_amd |
| - radv: use lower_shuffle_to_swizzle_amd |
| - aco: add 32-bit integer addition to can_swap_operands |
| - aco: fix underestimated pressure in spiller when a phi has a killed def |
| - aco: rewrite graph coloring in spiller |
| - aco: use unordered_set for spill id interferences |
| - aco: add add_interference() helper |
| - aco: use s_round_mode/s_denorm_mode |
| - aco: flush denormals before fp16 fabs/fneg if needed |
| - aco: fix nir_op_f2f16_rtne with non-default rounding modes |
| - aco: set tcs_in_out_eq=false if float controls of VS and TCS stages differ |
| - radv: enable more float_controls features |
| - aco: properly recognize that s_waitcnt mitigates VMEMtoScalarWriteHazard |
| - aco: use s_waitcnt_depctr to mitigate VMEMtoScalarWriteHazard |
| - spirv: don't split memory barriers |
| - nir/lower_int64: lower 64-bit amul |
| - aco: always set FI on GFX10 |
| - radv: replace discard with demote for Quantic Dream games |
| - aco: implement b2i8/b2i16 |
| - aco: be more careful combining additions that could wrap into loads/stores |
| - aco: allow overflow for some SMEM instructions |
| - aco: add NUW flag |
| - nir: add nir_unsigned_upper_bound and nir_addition_might_overflow |
| - aco: use nir_addition_might_overflow to combine additions into SMEM |
| - aco: move some setup code into helpers |
| - aco: make validate() usable in tests |
| - aco: print ACO IR before scheduling instead of after |
| - radv: fix invalid conversion warnings in vk_format.h |
| - aco: fix copy of uninitialized boolean |
| - aco: fix includes in aco_ir.cpp |
| - aco: add missing add_to_hazard_query |
| - aco: rework barriers and replace can_reorder |
| - radv/aco,aco: use scoped barriers |
| - aco: consider intrinsic access in visit_{load,store}_image |
| - nir,radv/aco: add and use pass to lower make available/visible barriers |
| - aco: enable value numbering of s_buffer_load_* |
| - aco: use storage_scratch |
| - aco: improve sync_info for TCS output stores |
| - aco: improve workgroup-scope and lower vmem/smem barriers |
| - aco: create acq+rel barriers instead of acq/rel |
| - nir/load_store_vectorize: fix indentation |
| - ac/nir: implement scoped_barrier |
| - radv: use scoped barriers |
| - aco: remove isel for GLSL-style barriers |
| - aco: add framework for unit testing |
| - aco: add a few tests for the assembler and optimizer |
| - aco: add framework for testing isel and integration tests |
| - ci: enable ACO tests |
| - aco/tests: add tests for sub-dword swaps |
| - aco: optimize swizzled SALU 8/16-bit conversions |
| - aco: fix waitcnt insertion on GFX10.3 |
| - aco: don't create v_mad_f32 on GFX10.3 |
| - aco: update bug workarounds for GFX10_3 |
| - aco: fix max_waves_per_simd on Polaris, VegaM and GFX10.3 |
| - aco: update vgpr_alloc_granule for GFX10.3 |
| - aco: implement subgroup shader_clock on GFX10.3 |
| - aco: update aco_opcodes.py for GFX10.3 |
| - aco: disable SMEM stores on GFX10.3 |
| - aco: replace MADs in isel with FMA on GFX10.3 |
| - spirv: set ACCESS_COHERENT for ssbo/global/image atomic load/store |
| - radv/aco: enable VK_KHR_memory_model |
| - ac/nir: consider an image load/store intrinsic's access |
| - ac/nir: fix coherent global loads/stores |
| - radv/llvm: enable VK_KHR_memory_model |
| - aco: fix C++11/C++14 compilation |
| - aco: set constant_data_offset correctly in the case of merged shaders |
| - aco: don't move memory accesses to before control barriers |
| - aco: fix non-rtz pack_half_2x16 |
| - aco: consider branch definitions in spiller |
| - aco: don't consider the first partial spill if it's the wrong type |
| - aco: don't fix break condition for break+discard to exec |
| - aco: fix regclass checks when fixing to vcc/exec with Builder |
| - aco: fix spills_entry heuristic for branch blocks in init_live_in_vars() |
| - aco: keep loop live-through variables spilled |
| - aco: reserve 2 sgprs for each branch |
| - aco: create long jumps |
| - aco: fix byte_align_scalar for 3 dword vectors |
| - aco: fix one-off error in Operand(uint16_t) |
| - nir/opt_if: fix opt_if_merge when destination branch has a jump |
| - aco: fix v_writelane_b32 with two sgprs |
| - aco: don't apply constant to SDWA on GFX8 |
| - radv: initialize with expanded cmask if the destination layout needs it |
| - radv,aco: fix reading primitive ID in FS after TES |
| |
| Rob Clark (265): |
| |
| - util/simple_mtx: add assert_locked() |
| - freedreno: add screen lock wrappers |
| - freedreno: switch to simple_mtx |
| - freedreno: fix buffer import |
| - gallium: extract out logicop helper |
| - freedreno/drm: drop atomic refcnts |
| - freedreno/drm: inline the things |
| - freedreno/a6xx: small query cleanup |
| - freedreno/a6xx: avoid unnecessary clearing VS DP state |
| - freedreno/a6xx: move const state to single stateobj |
| - freedreno/a6xx: move scissor state to stateobj |
| - freedreno/a6xx: limit PROG_FB_RAST state emit |
| - freedreno/a6xx: limit LRZ state emit |
| - freedreno/a6xx: move blend-color to stateobj |
| - freedreno/a6xx: combine sample mask into blend state |
| - freedreno/a6xx: skip unnecessary MRT blend state |
| - freedreno/a6xx: add OUT_PKT() |
| - freedreno/a6xx: convert draw packet to OUT_PKT() |
| - freedreno/a6xx: split out const emit |
| - freedreno/ir3: inline const emit |
| - freedreno/a6xx: convert const emit to OUT_PKT() |
| - freedreno: scissor vs disabled scissor micro-opt |
| - freedreno/a6xx: more OUT_REG() |
| - freedreno: sync registers with envytools |
| - freedreno/a6xx: don't set SP_FS_CTRL_REG0.VARYING for fragcoord |
| - freedreno/a6xx: fix LRZ hang |
| - freedreno/a6xx: add some more formats |
| - freedreno: we don't need aligned vbo's |
| - freedreno/a6xx: compressed blit fixes |
| - freedreno/a6xx: enable tiled compressed textures |
| - freedreno/gmem: don't assume scissor opt when estimating # of bins |
| - freedreno: initialize max_scissor |
| - freedreno/gmem: add div_align() helper |
| - freedreno/gmem: add helper to dump GMEM layout |
| - freedreno: add gmemtool |
| - freedreno/gmem: relax alignment on a6xx |
| - freedreno/gmem: rework gmem layout algo |
| - freedreno/ir3: don't allow negative const_offset |
| - freedreno/ir3: fix indirect cb0 load_ubo lowering |
| - freedreno/ir3: limit # of tex prefetch by shader size |
| - freedreno/ir3/postsched: reset sfu_delay on sync |
| - freedreno/ir3/postsched: try to avoid (sy) syncs |
| - freedreno/ir3/sched: avoid scheduling outputs |
| - freedreno/ir3/sched: try to avoid syncs |
| - freedreno/a6xx: fix max-scissor opt |
| - freedreno/ir3: use const_index accessors |
| - nir: fix indices for ir3 ssbo_atomic intrinsics |
| - nir: add helper to copy const_index[] |
| - nir: add pass to lower disjoint wrmask's |
| - freedreno/ir3: use lower_wrmasks pass |
| - freedreno/fdperf: add dependency on generated headers |
| - freedreno/drm: don't pass thru 'DUMP' flag on older kernels |
| - freedreno/drm: handle ancient kernels |
| - freedreno/ir3: remove Sethi-Ullman numbering pass |
| - freedreno/ir3: juggle around ir3_debug_print() |
| - freedreno/ir3/dce: report progress |
| - freedreno/cf: report progress |
| - freedreno/ir3/cp: report progress |
| - freedreno/ir3/deps: report progress |
| - freedreno/ir3/group: report progress |
| - freedreno/ir3/legalize: report progress |
| - freedreno/ir3/postsched: report progress |
| - freedreno/ir3: add IR3_PASS() macro |
| - freedreno/ir3: move where we preserve binning pass inputs |
| - freedreno/ir3: be iterative |
| - freedreno/ir3: make foreach_src declare cursor ptr |
| - freedreno/ir3: make foreach_ssa_src declare cursor ptr |
| - freedreno/ir3: make input/output iterators declare cursor ptr |
| - freedreno/ir3/group: fix for half-regs |
| - freedreno/ir3: fix mismatched flags on split |
| - freedreno/ir3/cf: handle multiple cov's properly |
| - freedreno/ir3: fix immed type in create_addr0() |
| - freedreno/ir3/print: print cat2 condition |
| - freedreno/ir3/cp: fix cmps folding |
| - freedreno/ir3: fix mismatched wrmask for overlapping VS inputs |
| - freedreno/ir3: add simple validate pass |
| - freedreno/ir3: add helpers to deal with src/dst types |
| - freedreno/ir3/validate: add checking for types and opcodes |
| - freedreno/drm: disallow exported buffers in bo cache |
| - freedreno: add batch debugging |
| - freedreno: clear last_fence after resource tracking |
| - freedreno: handle PIPE_TRANSFER_MAP_DIRECTLY |
| - freedreno/gmem: make noscis debug actually do something on a6xx |
| - freedreno/gmemtool: make GMEM alignment per-gen |
| - freedreno/gmemtool: add a405 |
| - freedreno/gmemtool: add verbose mode |
| - freedreno/gmem: add some asserts |
| - freedreno/gmem: fix nbins_x/y mismatch |
| - freedreno/gmem: split out helper to calc # of bins |
| - freedreno/a6xx: LRZ fix for alpha-test |
| - freedreno/a6xx: document LRZ flag buffer |
| - freedreno/a6xx: fix vsc assert |
| - nir: get_base_type() should return enum type |
| - nir: extract out convert_to_bitsize() helper |
| - nir/builder: add bitsize conversion helpers |
| - nir/lower_tex: fixes for fp16 yuv lowering |
| - freedreno/ir3: split kill from no_earlyz |
| - freedreno/a6xx: sync registers from envytools |
| - freedreno/a6xx: update depth-plane control regs |
| - freedreno/a6xx: re-work LRZ state tracking |
| - freedreno/a6xx: add early-lrz-late-z mode |
| - freedreno/a6xx: also consider alpha-test for ztest-mode |
| - freedreno/a6xx: more early-z |
| - freedreno/computerator: fix missing dependency on generated header |
| - nir/print: print tex dest type |
| - freedreno/ir3: add debug code to print conflicting half-regs |
| - freedreno/ir3: respect tex prefetch limits |
| - freedreno/ir3: remove RA "q-values" optimization |
| - freedreno/ir3: limit pre-fetched tex dest |
| - freedreno/ir3: unify shader create/delete paths |
| - freedreno/ir3: move the libdrm dependency out of shared code |
| - turnip: drop linking libfreedreno_drm |
| - freedreno/ir3: don't rely on intr->num_components |
| - radv: don't set num_components for non-vectorized intrinsics |
| - nir/builder: don't set intr->num_components |
| - nir/lower-atomics-to-ssbo: don't set num_components |
| - spirv: don't set num_components for non-vectorised intrinsics |
| - v3d: don't use intr->num_components for non-vectorized intrinsics |
| - nir/validate: validate intr->num_components |
| - freedreno/log-parser: fix compute times |
| - freedreno/sched: reset delay counters at start of block |
| - freedreno/ir3/validate: also check instr->address |
| - freedreno/ir3/cp: properly handle already-folded RELATIV |
| - freedreno: splitup emit_string_marker |
| - freedreno/a6xx: emit shader names in debug builds |
| - freedreno/ir3/legalize: don't allow (nopN) if (rptN) |
| - freedreno/ir3/print: print (r) flag |
| - freedreno/ir3: add test for delay slot calculation |
| - freedreno/ir3/delay: calculate delay properly for (rptN)'d instructions |
| - freedreno/ir3: add helpers to move instructions |
| - freedreno/ir3: delay test support for vectorish instructions |
| - freedreno/ir3/cp: extract valid_flags |
| - freedreno/ir3: add post-scheduler cp pass |
| - freedreno/ir3: convert regmask_t to struct |
| - freedreno/ir3: move mergedreg state out of reg |
| - freedreno/ir3: decouple regset from gpu gen |
| - freedreno/ir3: pass variant to postsched |
| - freedreno/ir3: re-work assembler API |
| - freedreno/ir3: make mergedregs a property of the variant |
| - freedreno/a6xx: set .MERGEDREGS based on variant |
| - turnip: set .MERGEDREGS based on variant |
| - freedreno/computerator: MERGEDREGS update |
| - freedreno/ir3: update obsolete comment |
| - spirv: atomic_counter_read_deref is not vectorized |
| - spirv: drop some dead code |
| - glsl_to_nir: fix is_helper_invocation |
| - glsl_to_nir: fix shader_clock |
| - glsl_to_nir: fix vote_any/vote_all |
| - freedreno/ir3: refactor out helper to compile shader from asm |
| - freedreno/ir3: add accessor for const_state |
| - freedreno/a6xx: defer userconst cmdstream size calculation |
| - freedreno/ir3: move ubo_state into const_state |
| - freedreno/ir3: drop shader->num_ubos |
| - freedreno/ir3: constify shader key |
| - freedreno/ir3: pass variant to ir3_create() |
| - freedreno/ir3: convert over to ralloc |
| - freedreno/ir3: move num_reserved_user_consts out of const_state |
| - freedreno/ir3: un-embed const_state |
| - freedreno/ir3: move const_state back to variant |
| - freedreno/ir3: move output_loc to variant |
| - freedreno/ir3: split out ubo info from range |
| - freedreno/ir3: splitup get_existing_range() |
| - freedreno/ir3: split ubo analysis/lowering passes |
| - ci: remove some freedreno a6xx skips |
| - freedreno/ir3: add helper to determine point-coord inputs |
| - freedreno/a6xx: de-duplicate vinterp/vpsrepl state building |
| - freedreno/a6xx: use point-coord helper |
| - freedreno/a5xx: use point-coord helper |
| - freedreno/a4xx: use point-coord helper |
| - freedreno/a3xx: use point-coord helper |
| - freedreno: convert builtin blit VS prog to ureg builder |
| - freedreno/ir3: switch PIPE_CAP_TGSI_TEXCOORD |
| - freedreno: make foreach_bit() declare its cursor |
| - freedreno: split out batch draw tracking helper |
| - freedreno: split out batch clear tracking helper |
| - freedreno: handle batch flush in resource tracking |
| - freedreno/ir3/ra: fix pre-color edge case |
| - freedreno/ir3: add ir3_finalize_nir() |
| - freedreno/ir3: move finalize_nir to pscreen hook |
| - freedreno/ir3: add ir3_compiler_destroy() |
| - freedreno/ir3: shuffle some variant fields |
| - freedreno/a6xx+ir3: stop generating pointless binning shaders |
| - freedreno/ir3: build binning variant at same time as draw variant |
| - freedreno/ir3: disk-cache support |
| - freedreno/ir3: move nir finalization to after cache miss |
| - freedreno/fdperf: fix print of base address |
| - freedreno/fdperf: better compatible string matching |
| - freedreno/fdperf: prefer render node |
| - gitlab-ci: reduce a630 runner load |
| - freedreno/ir3: add missing VS driver params |
| - freedreno/ir3: make compile fails more visible |
| - freedreno/a6xx: bail instead of crash for compile fails |
| - freedreno/ir3/ra: be better at failing |
| - freedreno/a6xx: don't enable early-z/lrz if no z-test |
| - freedreno/ir3: DCE unused arrays |
| - driconf: allowlist/denylist |
| - gitlab-ci: re-enable all a630 jobs |
| - freedreno: small comment re-word |
| - freedreno: whitespace fix |
| - freedreno/ir3/parser: half-precision relative regs |
| - freedreno/ir3: set array precision on creation |
| - freedreno/ir3: fix half-reg array stores |
| - freedreno/ir3/ra: debug msgs tweak |
| - freedreno/ir3/ra: assign vreg names to all array elements |
| - freedreno/ir3/ra: fix array conflicts for split/merged |
| - freedreno: sync registers from envytools |
| - freedreno: make gen_header.py check parent directory |
| - freedreno: slurp in rnndb |
| - freedreno: slurp in rnn |
| - freedreno: slurp in decode tools |
| - freedreno: slurp in afuc |
| - freedreno/rnn: warnings cleanup |
| - freedreno/decode: warnings cleanup |
| - freedreno/afuc: warnings cleanup |
| - freedreno: add CI for envytools tools |
| - freedreno/ir3: split out regmask |
| - freedreno: drop shader_t |
| - freedreno: deduplicate a3xx+ disasm |
| - freedreno: move a2xx disasm out of gallium |
| - freedreno: deduplicate a2xx disasm |
| - freedreno/ci: add a2xx trace to CI job |
| - freedreno/tools: check rnn parse status |
| - freedreno/rnn: split out helper to find files |
| - freedreno/rnn: add error helper |
| - freedreno/rnn: rename schema file |
| - freedreno/rnn: update schema for 'pos' |
| - freedreno/rnn: add relaxed boolean type |
| - freedreno/rnn: add high/low/pos to registers |
| - freedreno/rnn: add radix/align |
| - freedreno/rnn: relax Hexadecimal to HexOrNumber |
| - freedreno/rnn: add variants/varset to domain |
| - freedreno/registers/a2xx: fix validation error |
| - freedreno/registers/a4xx: fix validation error |
| - freedreno/registers/adreno_pm4: fix validation errors |
| - freedreno/rnn: describe copyright element in schema |
| - freedreno/rnn: add "addvariant" to schema |
| - freedreno/rnn: allow name to be optional in arrays |
| - freedreno/rnn: fix use-group |
| - freedreno/registers/mdp5: fix validation error |
| - freedreno/rnn: schema updates for dynamic/irregular offsets |
| - freedreno/rnn: add schema validation |
| - freedreno/rnn: headergen2 warnings cleanup |
| - freedreno/decode: cffdec warnings cleanup |
| - freedreno/ir3: add missing track_ubo_use() |
| - freedreno/a6xx: don't emit a bogus size for empty cb slots |
| - freedreno/a6xx: fixup draw state earlier |
| - freedreno/rnn: also look for .xml.gz |
| - freedreno/rnn: rework RNN_DEF_PATH construction |
| - freedreno/registers: add .gitignore |
| - freedreno/registers: split header build into subdirs |
| - freedreno/registers: install gzip'd register database |
| - freedreno/decode: move dependencies up a level |
| - freedreno: allow fence_fd fences to be recycled |
| - freedreno/ir3: ir3_cmdline updates |
| - freedreno/ir3: lower local_index using local_id |
| - glsl/lower_precision: split out const lowering |
| - gallium: replace 16BIT_TEMPS cap with 16BIT_CONSTS |
| - glsl: remove LowerPrecisionTemporaries |
| - glsl: don't inline intrinsics for mediump |
| - glsl_to_nir: fix bitfield_extract with 16-bit operands |
| - freedreno/registers: add some missing regs to build |
| - freedreno/crashdec: handle section name typos |
| - freedreno/a6xx: fix occlusion query with more than one tile |
| - freedreno: handle case of shadowing current render target |
| - freedreno/gmemtool: add tile_alignw/h and a650 |
| |
| Rohan Garg (3): |
| |
| - iris: Fix documentation for _iris_batch_flush |
| - ci: Include trace replay support in ARM rootfses. |
| - gitlab-ci: Replay traces on lava devices |
| |
| Roland Scheidegger (1): |
| |
| - gallivm: fix half to float conversions with llvm 11 |
| |
| Roman Gilg (2): |
| |
| - vulkan/wsi/x11: add sent image counter |
| - vulkan/wsi/x11: wait for acquirable images in FIFO mode |
| |
| Roman Stratiienko (5): |
| |
| - egl: Build surfaceless platform on Android |
| - Android: Fixes for Q and R |
| - panfrost: Android build fixes 2020 week 31 |
| - lima: Fix lima_screen_query_dmabuf_modifiers() |
| - android: freedreno: Another build fix |
| |
| Sagar Ghuge (3): |
| |
| - iris: Use modify disables for 3DSTATE_WM_DEPTH_STENCIL command |
| - intel/compiler: Optimize integer add with 0 into mov |
| - intel/compiler: Remove unnecessary optimization for MUL |
| |
| Samuel Pitoiset (235): |
| |
| - ci: fix reporting the number of unexpected/flakes |
| - ci: add lists of expected failures & skipped tests for RAVEN with ACO |
| - aco: remove unnecessary p_split_vector with v2b reg class |
| - radv: enable shaderInt16 unconditionally with LLVM and only GFX8+ with ACO |
| - radv: cleanup radv_CreateInstance() |
| - radv: rename radv_devices() to radv_enumerate_physical_devices() |
| - radv: fix a memleak if the physical device initialization failed |
| - radv: report INITIALIZATION_FAILED when the amdgpu winsys init failed |
| - radv: don't report error with other vendor DRM devices |
| - radv: use a linked list for physical devices |
| - radv: display an error message if the winsys init failed |
| - radv/winsys: do not count visible VRAM buffers twice in the budget |
| - ci: remove unused .test-radv-fossilize rule |
| - ci: set ACO_DEBUG=validateir,validatera global for RADV testing |
| - ci: run radv-fossils with Pitcairn (GFX6) and Bonaire (GFX7) too |
| - radv: remove the LLVM version string when ACO is used |
| - radv: do not print the LLVM version string twice in hang reports |
| - radv: report correct backend IR in hang reports when ACO is used |
| - aco: fix 64-bit trunc with negative exponents on GFX6 |
| - nir: do not vectorize load/store if offset can overflow and robustness enabled |
| - aco: prevent invalid loads/stores vectorization if robustness is enabled |
| - radv: limit the Vulkan version to 1.1 for Android |
| - radv: handle different Vulkan API versions correctly |
| - radv: update the list of allowed Android extensions |
| - aco: optimize add/sub(a, cndmask(b, 0, 1, cond)) -> addc/subbrev_co(0, a, b) |
| - radv: use the common base object type for VkDevice |
| - radv: use the base object struct types |
| - radv: implement VK_EXT_private_data |
| - vulkan: import common code for generating extensions |
| - radv: use the common code for generating extensions and dispatch tables |
| - anv: use the common code for generating extensions and dispatch tables |
| - turnip: use the common code for generating extensions and dispatch tables |
| - radv: add a LLVM version string workaround for SotTR and ACO |
| - aco: remove useless check for nir_tex_src_bias |
| - aco: add support for texturing with clamped LOD |
| - ac/llvm: add support for texturing with clamped LOD |
| - radv: enable shaderResourceMinLod |
| - spirv: handle OpCopyObject correctly with any types |
| - radv: fix missing break in radv_GetPhysicalDeviceProperties2() |
| - aco: store 16-bit temporary outputs as v2b |
| - aco: convert 16-bit values before exporting MRTs |
| - aco: allow to load/store 16-bit values in VMEM for tess and geom |
| - aco: implement 8-bit/16-bit mov's with p_create_vector |
| - aco: implement 16-bit vertex fetches with tbuffer_load_format_d16_* |
| - aco: validate v_interp_*_f16 as VOP3 instructions instead of VINTRP |
| - aco: emit v_interp_*_f16 instructions as VOP3 instead of VINTRP |
| - aco: implement 16-bit interp |
| - aco: fix off-by-one error with 16-bit MTBUF opcodes on GFX10 |
| - radv/aco: enable storageInputOutput16 on GFX9+ |
| - aco: fix missing break in label_instruction() |
| - radv: fix missing break in radv_GetPhysicalDeviceFeatures2() |
| - radv: fix duplicated expression in ac_setup_rings() |
| - radv/winsys: remove useless free in radv_amdgpu_create_bo_list() |
| - aco: declare 8-bit/16-bit reduce operations |
| - aco: implement 8-bit/16-bit reductions |
| - aco: validate 8-bit/16-bit VGPR operands for readfirstlane/readlane/writelane |
| - aco: implement 8-bit/16-bit nir_intrinsic_read_first_invocation |
| - aco: implement 8-bit/16-bit nir_intrinsic_{shuffle,_read_invocation} |
| - aco: implement 8-bit/16-bit nir_intrinsic_quad_* |
| - aco: use a temporary SGPR for 8-bit/16-bit literal reduction identities |
| - aco: sign-extend the input and identity for 8-bit subgroup operations |
| - radv: do not return from radv_GetPhysicalDeviceFeatures2() |
| - radv: cleanup physical device features |
| - radv: remove useless assignment in build_streamout_vertex() |
| - spirv: add ReadClockKHR support with device scope |
| - aco: implement nir_intrinsic_shader_clock with device scope |
| - ac/nir: fix shader clock with subgroup scope |
| - ac/nir: implement nir_intrinsic_shader_clock with device scope |
| - radv: advertise shaderDeviceClock on GFX8+ |
| - spirv: add SpvCapabilityImageGatherBiasLodAMD |
| - spirv: add support for bias/lod with OpImageGather |
| - ac/nir: add support for bias/lod with texture gather |
| - aco: add support for bias/lod with texture gather |
| - radv: add support for querying which formats support texture gather LOD |
| - radv: advertise VK_AMD_texture_gather_bias_lod |
| - spirv,radv,anv: implement no-op VK_GOOGLE_user_type |
| - radv/aco: enable VK_EXT_subgroup_size_control |
| - aco: fix register allocation for subdword instructions on GFX10 |
| - aco: implement 8-bit/16-bit reductions on GFX10 |
| - aco: allocate a temp VGPR for some 8-bit/16-bit reduction ops on GFX10 |
| - aco: allow gfx10_wave64_bpermute with 8-bit/16-bit input |
| - aco: sign-extend input/identity for 32-bit reduce ops on GFX10 |
| - radv/aco: enable VK_KHR_shader_subgroup_extended_types on GFX8+ |
| - radv: enable zero VRAM for Doom Eternal |
| - radv: enable zero VRAM for all VKD3D (DX12->VK) games |
| - aco: implement 16-bit reduce operations on GFX6-GFX7 |
| - aco: implement 16-bit nir_intrinsic_quad_* on GFX6-GFX7 |
| - aco: fix subdword copies on GFX6-GFX7 |
| - aco: sign-extend input/identity for 16-bit subgroup ops on GFX6-GFX7 |
| - radv/aco: enable 64-bit atomic features if RADV is linked with LLVM 8 |
| - aco: use v_bfe_u32 for unsigned reductions sign-extension on GFX6-GFX7 |
| - aco: fix sign-extend 8-bit subgroup operations on GFX6-GFX7 |
| - aco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7 |
| - radv/aco: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7 |
| - ac/nir: adjust an assertion for D16 on GFX6-GFX7 |
| - nir/lower_explicit_io: fix NON_UNIFORM access for UBO loads |
| - radv/llvm: expose VK_EXT_shader_demote_to_helper_invocation with LLVM 9+ |
| - aco: implement 8-bit/16-bit conversions on GFX6-GFX7 |
| - aco: fix alignment of vectors with 4 elements |
| - radv/aco: enable 8-bit/16-bit storage on GFX6-GFX7 |
| - radv/aco: enable shaderInt16 on GFX6-GFX7 |
| - radv/aco: enable shaderInt8 and VK_KHR_shader_float16_int8 on GFX6-GFX7 |
| - ac/nir: fix integer comparisons with pointers |
| - radv: set DB_SHADER_CONTROL.CONSERVATIVE_Z_EXPORT correctly |
| - radv: add new drirc option radv_enable_mrt_output_nan_fixup |
| - aco: implement radv_enable_mrt_output_nan_fixup workaround |
| - radv/llvm: implement radv_enable_mrt_output_nan_fixup workaround |
| - radv: enable radv_enable_mrt_output_nan_fixup for RAGE 2 |
| - ac: add ac_choose_spi_color_formats() to common code |
| - spirv: fix using OpSampledImage with OpUndef instead of OpType{Image,Sampler} |
| - aco: allow to swap operands for some 16-bit float instructions |
| - spirv: do not set num_components for non-vectorized mbcnt_amd intrinsic |
| - radv/aco: enable FP16 features/extensions on GFX9+ |
| - radv: lower discards to demote to workaround a RDR2 game bug |
| - radv: make sure to set CB_SHADER_MASK correctly for internal CB operations |
| - radv: compute CB_SHADER_MASK from the fragment shader outputs |
| - radv: only requires LLVM 9 for GFX10 if not using ACO |
| - radv: replace == GFX10 with >= GFX10 where it's needed |
| - aco: replace == GFX10 with >= GFX10 where it's needed |
| - radv: add support for Sienna Cichlid |
| - radv: require LLVM 11+ for GFX 10.3 if not using ACO |
| - aco: fix printing ASM on GFX6-7 if clrxdisasm is not found |
| - aco: improve validation checks for readlane/writelane |
| - aco: fix printing ASM on GFX6-7 again |
| - gitlab-ci: stop testing RADV with LLVM |
| - gitlab-ci: update the list of expected CTS failures for RADV/ACO |
| - gitlab-ci: update the list of expected failures for Pitcairn |
| - radv: fix checking the return value of cs_finalize() |
| - gitlab-ci: add parallel-rdp fossils |
| - radv: lower 64-bit drcp/dsqrt/drsq for fixing precision issues |
| - radv: lower 64-bit dfloor on GFX6 for fixing precision issues |
| - gitlab-ci: add a list of expected failures for RADV/ACO on NAVI14 |
| - gitlab-ci: set the number of Fossilize threads to 4 |
| - gitlab-ci: append Fossilize stdout/stderr to a file to reduce spam |
| - gitlab-ci: attach the Fossilize log file as artifact on failure |
| - radv: remove the shader ballot workaround for Youngblood with LLVM |
| - radv: remove the load/store workaround for Monster Hunter World with LLVM |
| - radv: enable VK_AMD_shader_ballot on GFX6-7 with both compiler backends |
| - radv: adjust CB_SHADER_MASK for dual-source blending in the shader info pass |
| - radv: rework 8/16-bit color attachment formats detection |
| - radv: use SPI_SHADER_ZERO for non-written color attachments |
| - radv: add support for MRTs compaction to avoid holes |
| - radv: fix wide points and lines |
| - radv: fix wide lines with multisample enabled |
| - Revert "vulkan/wsi/x11: Ensure we create at least minImageCount images." |
| - radv,vulkan: add a new x11 wsi drirc workaround for DOOM Eternal |
| - radv: disable FMASK compression when drawing with GENERAL layout |
| - radv: set depth/stencil enable values correctly for the meta clear path |
| - radv: implement missing VK_ACCESS_MEMORY_{READ,WRITE}_BIT |
| - radv: store the primitive topology hardware value in the pipeline |
| - radv: adjust IA_MULTI_VGT_PARAM.WD_SWITCH_ON_EOP at draw time |
| - radv: adjust IA_MULTI_VGT_PARAM.PARTIAL_VS_WAVE at draw time |
| - radv: compute prim_vertex_count at draw time |
| - aco: fix more validation errors from vgpr spill/restore code |
| - radv: return VK_ERROR_DEVICE_LOST if wait-for-idle failed or expired |
| - radv: remove the secure compile support feature |
| - radv: rework dynamic viewports/scissors support |
| - radv: add VK_EXT_extended_dynamic_state but leave it disabled |
| - radv: declare new extended dynamic states |
| - radv: add support for dynamic cull mode and front face |
| - radv: add support for dynamic primitive topology |
| - radv: add support for dynamic viewport and scissor count |
| - radv: add support for dynamic depth/stencil states |
| - radv: add support for dynamic vertex input binding stride |
| - radv: advertise VK_EXT_extended_dynamic_state |
| - radv: add the custom border color BO to the list of buffers |
| - radv: destroy the base object if VkCreateQueryPool() failed |
| - radv: destroy the base object if VkCreateRenderPass*() failed |
| - radv: destroy the base object if VkCreateImage() failed |
| - radv: destroy the base object if VkCreateBuffer() failed |
| - radv: destroy the base object if VkCreateEvent() failed |
| - radv: destroy the base object if VkCreateSemaphore() failed |
| - radv: destroy the base object if VkCreateFence() failed |
| - radv: destroy the base object if VkAllocateCommandBuffers() failed |
| - radv: destroy the base object if VkCreateInstance() failed |
| - radv/winsys: replace alloca() by malloc() everywhere |
| - radv/winsys: pass the buffer list via the CS ioctl for less CPU overhead |
| - radv: fix destroying the syncobj when exporting a fence FD |
| - radv: fix the error code when exporting a semaphore/fence fails |
| - radv: fix the error code when allocating a fresh imported syncobj fails |
| - radv: optimize creating signaled syncobj with amdgpu_cs_create_syncobj2() |
| - radv: split fence into two parts as enum+union. |
| - radv: remove one useless goto in radv_queue_submit_deferred() |
| - radv: improve the error messages when a CS submission failed |
| - radv: return better Vulkan error codes when VkQueueSubmit() fails |
| - radv: disable CPU caching for IBS to reduce fetch latency |
| - radv/winsys: always allow GTT placements on APUs |
| - radv: advertise VK_EXT_image_robustness |
| - radv: do not perform read-modify-write with the upload BO |
| - radv: disable CPU caching for the upload BO to reduce fetch latency |
| - aco: add support for nir_intrinsic_shared_atomic_fadd |
| - ac/nir: add support for nir_intrinsic_shared_atomic_fadd |
| - radv: advertise VK_EXT_shader_atomic_float |
| - radv: add missing return values check for some winsys calls |
| - radv/winsys: check more allocation failures |
| - radv/winsys: remove useless check when binding virtual buffers/images |
| - radv/winsys: return a Vulkan error code when binding virtual buffers/images |
| - radv/winsys: be more robust when a CS failed during recording |
| - radv: remove declared but unused radv_pipeline::is_dual_src |
| - radv: remove set but unused radv_pipeline::vertex_elements |
| - radv: remove outdated TODO related to PA_SU_VTX_CNTL.PIX_CENTER |
| - radv: emit more invariant registers as part of the initial gfx state |
| - radv: emit PA_SC_LINE_CNTL as part of the rasterization state |
| - radv: clean up VGT_SHADER_STAGES_EN emission |
| - radv: clean up PA_SC_CLIPRECT_RULE emission |
| - radv: reduce the number of allocated dwords for compute CS |
| - radv: clean up radv_compute_generate_pm4() |
| - radv: remove unnecessary radv_tessellation_state::num_patches |
| - radv: remove no-op si_multiwave_lds_size_workaround() |
| - radv: remove one unnecessary param to radv_generate_graphics_pipeline_key() |
| - radv: align the LDS size in calculate_tess_lds_size() |
| - radv: set LDS TCS size at shaders creation for GFX9+ |
| - radv: remove unnecessary radv_tessellation_state::lds_size |
| - radv: clean up tessellation state emission |
| - radv: add radv_pipeline_init_input_assembly_state() |
| - radv: add radv_pipeline_generate_vgt_gs_out() |
| - radv: clean up adjusting MSAA state if conservative rast is enabled |
| - radv: clean up binning state initialization |
| - radv: assign pipeline gfx fields before PM4 emission |
| - radv: constify all radv_pipeline_generate_*() helpers |
| - radv: add radv_pipeline_init_shader_stages_state() |
| - radv: remove useless return value to radv_pipeline_scratch_init() |
| - radv: clean up remaining pipeline init functions |
| - radv: print warnings for famous RADV_PERFTEST options that no longer exist |
| - radv: do not honor a user-specified pitch on GFX 10.3 |
| - radv: increase minimum NGG vertex count requirement per workgroup on GFX 10.3 |
| - radv: fix sample shading on GFX 10.3 |
| - radv: set BYPASS_VTX_RATE_COMBINER_GFX103 on GFX 10.3 |
| - radv/gfx10: add missing initialization of registers |
| - radv: limit LATE_ALLOC_GS to prevent a GPU hang on GFX10 |
| - radv: fix emitting the border color pointer on the compute queue |
| - nir/algebraic: mark some optimizations with fsat(NaN) as inexact |
| - aco: handle unaligned loads on GFX10.3 |
| - spirv: fix emitting switch cases that directly jump to the merge block |
| - radv: fix transform feedback crashes if pCounterBufferOffsets is NULL |
| |
| Satyajit Sahu (1): |
| |
| - frontends/va: Handle dynamic resolution/SVC for VP9 |
| |
| Satyeshwar Singh (1): |
| |
| - intel/dev: Don't consider all TGL SKUs as GT1 only |
| |
| Serge Martin (3): |
| |
| - amd/common: Fix incorrect use of asprintf instead of vasprintf |
| - clover: add more cl_mem_object_type to pipe_texture_target mapping |
| - clover: implements clEnqueueFillBuffer |
| |
| Shawn Guo (1): |
| |
| - freedreno/a4xx: fix \*_NONE enum conversion |
| |
| Simon Ser (3): |
| |
| - EGL: sync headers with Khronos |
| - gbm: document that gbm_bo_map exposes a linear view |
| - radv: use bitshifts for debug enum values |
| |
| SureshGuttula (1): |
| |
| - radeon/vcn: Corrected vp9 ref associated data in case target->codec is NULL |
| |
| Tapani Pälli (14): |
| |
| - st/mesa: destroy only own program variants when program is released |
| - anv: call base finish only if pass given in DestroyRenderPass |
| - anv: add VK_EXT_extended_dynamic_state but leave it disabled |
| - anv: add new dynamic states |
| - anv: consider dynamic state when creating pipeline |
| - anv: handle dynamic viewport count |
| - anv: add support for dynamic cull mode and winding order |
| - anv: add support for dynamic viewport and scissor with count |
| - anv: add support for dynamic primitive topology change |
| - anv: depth/stencil dynamic state support |
| - anv: dynamic vertex input binding stride and size support |
| - anv: toggle on VK_EXT_extended_dynamic_state |
| - anv: add a check for depthStencilState before using it |
| - anv: null check for buffer before reading size |
| |
| Thong Thai (8): |
| |
| - radeon: Fix whitespaces |
| - gallium/auxiliary/vl: Fix compute shader scaling for non-square pixels |
| - gallium/auxiliary/vl: Fix compute shader scale_y for interlaced videos |
| - frontends/va: Fix deinterlace bottom field first flag |
| - frontends/vdpau: Default destination rect to source rect |
| - radeon/vcn: add vcn 3.0 encode support |
| - radeonsi: use PIPE_FORMAT_P010 for 10-bit VP9 decoding |
| - radeon/vcn: increase render_pic_list size |
| |
| Timothy Arceri (69): |
| |
| - glsl: stop cascading errors if process_parameters() fails |
| - glsl: fix slow linking of uniforms in the nir linker |
| - radv: fix regression with builtin cache |
| - nir: add glsl_get_ifc_packing() helper |
| - nir: add callback to nir_remove_dead_variables() |
| - glsl: add can_remove_uniform() helper to the NIR linker |
| - glsl: remove dead uniforms in the nir linker |
| - glsl/spirv: remove dead uniforms in spirv nir linker |
| - gitlab-ci: bump piglit checkout commit |
| - i965: call brw_nir_lower_uniforms() after uniform linking is complete |
| - util: add BITSET_LAST_BIT() helper |
| - glsl: add struct to gather more info about uniform array access |
| - glsl: add update_array_sizes() helper to the NIR uniform linker |
| - glsl: gather uniform dereference info before main linking loop |
| - glsl: when NIR linker enable use it to resize uniform arrays |
| - glsl: fix potential slow compile times for GLSLOptimizeConservatively |
| - glsl: fix incorrect optimisation in opt_constant_variable() |
| - glsl: fix uniform array resizing in the nir linker |
| - glsl: small optimisation fix for uniform array resizing |
| - st_glsl_to_nir: fix potential use after free |
| - mesa: remove _mesa prefix from static function |
| - mesa: add _mesa_program_state_value_size() helper |
| - glsl: define gl_LightSource members in ARB_vertex_program order |
| - st/glsl_to_nir: disable st_nir_lower_builtin() when packing supported |
| - glsl: remove stale FIXME |
| - i965: add and fix fallthrough comments |
| - llvmpipe: add missing fallthrough comments |
| - gallivm: add missing break |
| - anv: update fallthrough comment so gcc sees it |
| - intel/compiler: add and fix up fallthrough comments for gcc warnings |
| - iris: add missing fallthrough comment |
| - egl: move fallthrough comment so gcc can see it |
| - nir: add missing break to nir_opt_access() |
| - mesa: fix fallthrough in glformats |
| - mesa: add fallthrough comments to glformats.c |
| - mesa: add fallthrough comments to get.c |
| - nir: fix implicit fallthrough warnings |
| - mesa: add fallthrough comments to COPY_SZ_4V() |
| - radeonsi: add missing fallthrough comment |
| - glx: add missing fallthrough comment |
| - glsl: move fallthrough comment to where gcc can see it |
| - radeon: add missing fallthrough comments |
| - spirv: add missing fallthrough comments |
| - mesa/vbo: add some missing fallthrough comments |
| - mesa: add missing fallthrough comment to teximage.c |
| - mesa: fix unintended fallthrough in glIsEnabled() |
| - r300: add and fix up fallthrough comments |
| - svga: add missing fallthrough comments |
| - mesa: update fallthrough comment so gcc can see it |
| - nv30: add missing fallthrough comment |
| - meson: turn on Wimplicit-fallthrough project wide |
| - nouveau: fix pointer-sign warning |
| - gitlab-ci: Enable -Werror in `meson-classic` job |
| - r600/radeonsi: silence zero-length-bounds gcc warnings |
| - radeonsi: fix SI_NUM_ATOMS |
| - iris: fix maybe-uninitialized warning for initial_state variable |
| - iris: silence maybe-uninitialized for stc_dst_aux_usage variable |
| - nouveau/nvc0: silence maybe-uninitialized warning |
| - panfrost: add some missing fallthrough comments |
| - panfrost: hide more unused code in bi_lower_combine.c |
| - panfrost: add some missing fallthrough comments to bi_pack.c |
| - freedreno: fix missing fallthrough comments |
| - v3d: remove redefine of VG(x) |
| - zink: fix missing fallthrough comment |
| - nine: remove unused var |
| - etnaviv: add missing fallthrough comments |
| - lima: add missing fallthrough comments |
| - lima: add missing break |
| - gitlab-ci: Enable -Werror in `meson-gallium` job |
| |
| Timur Kristóf (4): |
| |
| - aco/gfx10: Refactor of GFX10 wave64 bpermute. |
| - aco: Implement subgroup shuffle on GFX6-7. |
| - radv/aco: Always enable subgroup shuffle. |
| - aco: Fix emit_boolean_exclusive_scan in wave32 mode. |
| |
| Tomeu Vizoso (55): |
| |
| - panfrost: Emit blend descriptors on Bifrost |
| - panfrost: Don't leak temporary descriptors array |
| - pan/decode: Check for correct unknown field |
| - pan/decode: Use correct printf modifier for long int |
| - panfrost: Split bit out of format.unk3 |
| - panfrost: Create additional BO for the checksum of imported BOs (Bifrost) |
| - panfrost: Add a bit more info about some tiler fields |
| - pan/bi: Print shaders only if BIFROST_MESA_DEBUG=shaders |
| - pan/decode: Trace to stderr with PANDECODE_DUMP_FILE=stderr |
| - panfrost: GPUs newer than G-71 don't have swizzles... |
| - panfrost: mali_attr_meta.unknown1 is zero on Bifrost |
| - panfrost: Add Bifrost texture trampoline BO to batch |
| - pan/decode: Properly print tripped zeroes |
| - virgl: Properly check for encode_stride when encoding transfers |
| - panfrost: Add checksum BOs to batch |
| - panfrost: Don't trample on top of Bifrost-specific unions |
| - panfrost: Handle MALI_RGB8_UNORM in panfrost_format_to_bifrost_blend |
| - gitlab-ci: Run more dEQP tests for virgl |
| - gitlab-ci: Add manual tests for Virgl using GLES on the host |
| - gitlab-ci: Test virgl with Khronos' OpenGL CTS |
| - gitlab-ci: Update CTS runner |
| - ci: Don't call renderdoc's ReplayController.Shutdown() |
| - ci: Move ARM rootfses to stable |
| - gitlab-ci: Build kernel drivers for a few ethernet USB dongles |
| - gitlab-ci: More stable URL for kernel and ramdisk artifacts, for LAVA |
| - gitlab-ci: Remove left-behind rules: |
| - gitlab-ci: Don't rebuild kernels and rootfs if they have been already built in mainline |
| - gitlab-ci: Run all of GLES3 tests for Panfrost |
| - gitlab-ci: Re-add kernels for bare-metal |
| - gitlab-ci: Download traces from MinIO |
| - gitlab-ci: Upload tracie artifacts to MinIO |
| - gitlab-ci: Fix needs: of the arm64 LAVA test jobs |
| - ci: Upload images of failed replays to MinIO |
| - ci: Use smaller glxgears trace |
| - ci: Prefix tracie artifacts with the device name |
| - ci: Test with more traces |
| - ci: Disable trace testing on Mali T760 |
| - ci: Fix the overwriting of traces.yml for baremetal |
| - ci: Namespace trace artifacts to the job number |
| - ci: Always print status code of HTTP uploads in tracie |
| - ci: Print load stats after running dEQP |
| - ci: Fix URL for glslang |
| - ci: Don't ship vk-build-programs after building dEQP |
| - ci: Split building of libdrm to its own script |
| - ci: Build kernels and rootfs for x86 devices |
| - ci: Upload reference images for traces |
| - ci: Print URL to image diff when a trace replay fails |
| - ci: Generate MinIO credentials within LAVA jobs |
| - ci: Set date in LAVA DUTs from NTP servers |
| - ci: Build-test Panfrost tools |
| - ci: Upload traces' reference and actual images to MinIO |
| - ci: Download traces from MinIO in baremetal runs |
| - ci: Remove kernel module build that slipped in |
| - ci: Actually upload trace artifacts to MinIO for baremetal |
| - ci: Use a rootfs tarball for NFS root, instead of a ramdisk (for LAVA) |
| |
| Tony Wasserka (4): |
| |
| - nir/lower_idiv: Port recent LLVM fixes to emit_udiv |
| - radv: Fix various non-critical integer overflows |
| - aco: Fix integer overflows when emitting parallel copies during RA |
| - amd/common: Fix various non-critical integer overflows |
| |
| Vinson Lee (25): |
| |
| - freedreno: Add missing break statement. |
| - llvmpipe: Fix variable name. |
| - r600/sfn: Initialize VertexStageExportForGS m_num_clip_dist member variable. |
| - panfrost: Ensure final.no_colour is initialized. |
| - r600/sfn: Use correct setter method. |
| - freedreno: Add missing va_end. |
| - pan/bi: Initialize struct fma_op_info member extended. |
| - zink: Check fopen result. |
| - etnaviv: Fix memory leak on error path. |
| - panfrost: Fix printf format specifier. |
| - r300g: Remove extra printf format specifiers. |
| - vdpau: Fix wrong calloc sizeof argument. |
| - mesa: Fix NetBSD compiler macro. |
| - Switch from cElementTree to ElementTree. |
| - intel/genxml: Migrate from deprecated xml.etree.ElementTree getchildren. |
| - rbug: Fix rbug_delete_vs_state lock acquisition. |
| - nir: Add nir_lower_clip_disable.c to SCons build. |
| - util: Fix SCons build. |
| - util: Fix memory leaks in unit test. |
| - meson: Fix lmsensors warning message. |
| - vulkan: Fix memory leaks. |
| - freedreno: Fix file descriptor leak. |
| - svga: Fix unused printf argument. |
| - freedreno: Check file descriptor before write. |
| - panfrost: Delete debug allocated syncobj. |
| |
| Yevhenii Kharchenko (1): |
| |
| - st/mesa: fix corrupted texture levels, when adding more levels than expected |
| |
| Yevhenii Kolesnikov (5): |
| |
| - glsl: subroutine signatures must match exactly |
| - nvir: don't use designated initialisers in C++ code |
| - intel/compiler: don't propagate cmp to add if add is saturated |
| - mesa: change error code of \*TextureSubImage\* for incorrect target |
| - nine: fix incorrect calculation of layer count for 3D textures |
| |
| jzielins (2): |
| |
| - gallium/swr: Fix compilation warnings |
| - swr: Bump maximum 2D texture size to 16kx16k |
| |
| mmenzyns (1): |
| |
| - nv50: Clear nv50_ir_prog_info of dead and codegen specific variables |