| Mesa 25.0.0 Release Notes / 2025-02-19 |
| ====================================== |
| |
| Mesa 25.0.0 is a new development release. People who are concerned |
| with stability and reliability should stick with a previous release or |
| wait for Mesa 25.0.1. |
| |
| Mesa 25.0.0 implements the OpenGL 4.6 API, but the version reported by |
| glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / |
| glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. |
| Some drivers don't support all the features required in OpenGL 4.6. OpenGL |
| 4.6 is **only** available if requested at context creation. |
| Compatibility contexts may report a lower version depending on each driver. |
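
Because the reported version depends on the driver, an application may prefer to check it at run time rather than assume 4.6. The following Python sketch extracts the major and minor numbers from a ``GL_VERSION`` string. The helper names ``parse_gl_version`` and ``supports`` are hypothetical, and the sample strings in the usage note are illustrative rather than output captured from a particular driver.

```python
import re
from typing import Tuple

# Optional "OpenGL ES" / "OpenGL ES-CM" prefix, followed by "major.minor".
# Anything after the minor number (release number, vendor info) is ignored.
_GL_VERSION_RE = re.compile(r"^(?:OpenGL ES(?:-\w+)? )?(\d+)\.(\d+)")


def parse_gl_version(version_string: str) -> Tuple[int, int]:
    """Return (major, minor) parsed from a GL_VERSION-style string."""
    match = _GL_VERSION_RE.match(version_string)
    if match is None:
        raise ValueError("unrecognised GL_VERSION string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def supports(version_string: str, major: int, minor: int) -> bool:
    """True if the reported version is at least major.minor."""
    return parse_gl_version(version_string) >= (major, minor)
```

For example, ``supports("4.6 (Core Profile) Mesa 25.0.0", 4, 6)`` is true, while a context reporting ``"3.3 ..."`` would fail the same check.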
| |
| Mesa 25.0.0 implements the Vulkan 1.4 API, but the version reported by |
| the apiVersion property of the VkPhysicalDeviceProperties struct |
| depends on the particular driver being used. |
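
The ``apiVersion`` property is a packed 32-bit integer, so it has to be decoded before it can be compared with 1.4. The Python sketch below mirrors the bit layout used by the ``VK_MAKE_API_VERSION`` and ``VK_API_VERSION_*`` macros in the Vulkan headers. The function names are hypothetical, and the value in the usage note is constructed for illustration, not read from a driver.

```python
from typing import Tuple


def make_vk_api_version(variant: int, major: int, minor: int, patch: int) -> int:
    """Pack a version: variant in bits 31-29, major 28-22, minor 21-12, patch 11-0."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def decode_vk_api_version(packed: int) -> Tuple[int, int, int, int]:
    """Unpack a Vulkan apiVersion value into (variant, major, minor, patch)."""
    variant = packed >> 29
    major = (packed >> 22) & 0x7F
    minor = (packed >> 12) & 0x3FF
    patch = packed & 0xFFF
    return variant, major, minor, patch


def is_at_least(packed: int, major: int, minor: int) -> bool:
    """True if the decoded version is at least major.minor (patch ignored)."""
    _, got_major, got_minor, _ = decode_vk_api_version(packed)
    return (got_major, got_minor) >= (major, minor)
```

A driver that reports Vulkan 1.4 with patch 0 would expose the value ``make_vk_api_version(0, 1, 4, 0)``, for which ``is_at_least(value, 1, 4)`` is true.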
| |
| SHA checksums |
| ------------- |
| |
| :: |
| |
| SHA256: 96a53501fd59679654273258c6c6a1055a20e352ee1429f0b123516c7190e5b0 mesa-25.0.0.tar.xz |
| SHA512: 7f5b6674c40b6c8dcab7934512ff754b40a6a8a466422c90236f614d322033d4d465307ddcd983f9f3afb1310e132ec3186a085d261c95493a0c460b2ec59ce8 mesa-25.0.0.tar.xz |
| |
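A downloaded tarball can be checked against the digests above before it is unpacked. This is a minimal Python sketch using only the standard library. The function names ``file_digest`` and ``verify`` are hypothetical, and it assumes the archive was saved under the file name shown in the checksum list.

```python
import hashlib
import hmac

# SHA256 digest copied from the checksum list in these notes.
EXPECTED_SHA256 = "96a53501fd59679654273258c6c6a1055a20e352ee1429f0b123516c7190e5b0"


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the file's digest with the expected hex string."""
    actual = file_digest(path, algorithm)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Calling ``verify("mesa-25.0.0.tar.xz", EXPECTED_SHA256)`` should return ``True`` for an intact download. Passing ``algorithm="sha512"`` with the SHA512 value works the same way.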
| |
| New features |
| ------------ |
| |
| - cl_khr_depth_images in rusticl |
| - Vulkan 1.4 on radv/gfx8+ |
| - VK_KHR_dedicated_allocation on panvk |
| - VK_KHR_global_priority on panvk |
| - VK_KHR_index_type_uint8 on panvk |
| - VK_KHR_map_memory2 on panvk |
| - VK_KHR_multiview on panvk/v10+ |
| - VK_KHR_shader_non_semantic_info on panvk |
| - VK_KHR_shader_relaxed_extended_instruction on panvk |
| - VK_KHR_vertex_attribute_divisor on panvk |
| - VK_KHR_zero_initialize_workgroup_memory on panvk |
| - VK_KHR_shader_draw_parameters on panvk |
| - VK_KHR_shader_float16_int8 on panvk |
| - VK_KHR_8bit_storage on panvk |
| - VK_EXT_4444_formats on panvk |
| - VK_EXT_global_priority on panvk |
| - VK_EXT_global_priority_query on panvk |
| - VK_EXT_host_query_reset on panvk |
| - VK_EXT_image_robustness on panvk |
| - VK_EXT_pipeline_robustness on panvk |
| - VK_EXT_provoking_vertex on panvk |
| - VK_EXT_queue_family_foreign on panvk |
| - VK_EXT_sampler_filter_minmax on panvk |
| - VK_EXT_scalar_block_layout on panvk |
| - VK_EXT_tooling_info on panvk |
| - depthClamp on panvk |
| - depthBiasClamp on panvk |
| - drawIndirectFirstInstance on panvk |
| - fragmentStoresAndAtomics on panvk/v10+ |
| - sampleRateShading on panvk |
| - occlusionQueryPrecise on panvk |
| - shaderInt16 on panvk |
| - shaderInt64 on panvk |
| - imageCubeArray on panvk |
| - VK_KHR_depth_clamp_zero_one on RADV |
| - VK_KHR_maintenance8 on radv |
| - VK_KHR_shader_subgroup_rotate on panvk/v10+ |
| - Vulkan 1.1 on panvk/v10+ |
| - VK_EXT_subgroup_size_control on panvk/v10+ |
| - initial GFX12 (RDNA4) support on RADV |
| |
| |
| Bug fixes |
| --------- |
| |
| - radeonsi: regression with running DaVinci Resolve under rusticl since 666a6eb871d5dec79362bdc5d16f15915eb52f96 |
| - [ANV][LNL] - Black Myth: Wukong (2358720) - Corruption is visible near the edge of water. |
| - [ANV][LNL] - Hogwarts Legacy (990080) - Pixelated corruption is visible when looking out at the water. |
| - radv/video/h265: pps.flags.transform_skip_enabled_flag = 1 randomly hangs GPU |
| - [ANV][LNL] - Steel Rats (619700) - Game crashes after opening logos play before reaching main menu |
| - nvk: Implement host-only descriptors |
| - Gnome-shell Wayland fails to start with segfault at modifier-less driver |
| - [ANV][LNL] - DYNASTY WARRIORS: ORIGINS (2384580) - Dithered transparency has vertical bands. |
| - AMD Radeon R9 270 randomly causes video playback applications to crash with "amdgpu: The CS has been rejected" |
| - Rendering issues on GravityMark with RadeonSI ACO |
| - i915: multiple tests assert with tgsi_ureg.h:893: ureg_swizzle: Assertion \`reg.File != TGSI_FILE_NULL' failed. |
| - shaders/closed/steam/deus-ex-mankind-divided/260.shader_test fails NIR validation |
| - panvk : vk_pipeline_cache_object_deserialize: Assertion \`reader.current == reader.end && !reader.overrun' failed. |
| - 46a8d5e7ef61735416d0c54886a7a9930621ae2c causes a permission denied spam |
| - [BUILD] Build Failure: Implicit Function Declaration 'timespec_sub_saturate' (loader_wayland_helper.c) |
| - intel genX_acceleration_structure: missing dependency to bvh/header.spv.h |
| - KHR_subgroup glsl parsing broken |
| - intel: add config options to disable ELK compiler bits |
| - a618: godot-tps-gles3-high trace reproducible flakes |
| - radv: mesh shader depth-only rendering is broken |
| - anv: Enable VK_FORMAT_A4R4G4B4_UNORM_PACK16_EXT for Android 15 |
| - Using a buffer allocated on a rx 6800XT for scanout on a Ryzen 7950X results in glitches |
| - Systemfreeze from mesa version 1:24.3.0-1-x86_64 and above with Chromium and derivatives [and more or less all other graphic related things] |
| - msm_kgsl.h:560:21: error: expected ‘:’, ‘,’, ‘;’, ‘}’ or ‘__attribute__’ before ‘*’ token |
| - [radeonsi] VC1 hardware decoding over vaapi outputs green screen |
| - consecutive glDrawPixels do not reflect a changed pixel mapping |
| - Crashing while Processing Shaders in Marvel Rivals on Mesa 24.3.2 & Mesa 24.3.3 |
| - Assertion \`nir_cf_node_get_function(&block->cf_node)->structured' failed |
| - r300: Conditional jump or move depends on uninitialised value in Xnine.mova test |
| - anv: Mesh shaders with two OpSetMeshOutputsEXT instructions are not supported |
| - hasvk: apps crash since "intel/compiler: Remove usage of variable length arrays" |
| - nir_validate should check metadata |
| - anv: vkcube(pp) segfault in multi-GPU config, apparent vkCreateSwapchainKHR failure |
| - anv,regression: Black square artifacts in Fenyx Rising on BMG |
| - [anv] Cyberpunk visual corruption on BMG |
| - [ANV][LNL] - Cyberpunk 2077 (1091500) - Flickering mesh during benchmark. |
| - Intel Arc A770: Crosshair in THE FINALS renders too large |
| - 3d render issues in Chromium after 1:24.3.1-3 update over 1:24.2.7-1 of mesa package |
| - intel/compiler: Out of bounds read in brw_eu_compact.c |
| - egl,dri2: Segfault when running wayland clients on non-default GPU |
| - anv,regression: Visual glitches in Ghost of Tsushima on BMG |
| - anv, regression: Resident Evil 2 d3d12 freezes in main menu on a Arc b580 |
| - radeonsi: fails to build with libc++ |
| - Random mesa crashes in kwin_wayland on a 6600XT |
| - enc->enc_pic.enc_pic_order_cnt_type always zero even if pic->pic_order_cnt_type non-zero that application set |
| - [anv] Visual corruption in Cyberpunk on LNL and BMG |
| - [anv] Borderlands 3 visual corruption on BMG |
| - [ANV] LNL triangle corruption on clothing in HogwartsLegacy-trace-dx12-1080p-ultra |
| - Intel: Dark graphical glitches on cars and characters on Disney Speedstorm |
| - Regression in VA-API decoding |
| - freedreno: fails to build with Android NDK 27c |
| - hk_cmd_draw.c:3471:32: error: expression in static assertion is not constant |
| - anv/gfx12: Enable non-zero fast clears for non-FCV CCS_E |
| - gen12: 5% regression in factorio |
| - 32-bit: error: format ‘%lx’ expects argument of type |
| - regression;bisected;FTBFS: commit b13e2a495e9e3da56add7d852ca01b2cd7eef52d breaks x86_32 mesa build |
| - glxext.c: error: 'struct glx_screen' has no member named 'frontend_screen' |
| - regression;bisected;FTBFS: commit ae76a6a04596bfdbd37bab165bc5f2a5ff60d389 breaks x86 mesa build |
| - Can't allocate dpb buffer on firefox |
| - Segmentation fault resetting a query pool used to get BLAS properties |
| - libvulkan_lvp link fails if glslangValidator is not installed |
| - lvp acceleration structure broken on \`main` but not on \`staging/24.x` |
| - radv: warning that "radv is not a conformant Vulkan implementation" on Navi 32 |
| - [anv][UHD630] DXVK 2.5 - 2.5.2 with DXVK_HUD=compiler or DXVK_HUD=fps freezes the game or the entire system (Works without compiler/fps HUD, DXVK 2.4.1 works fine) |
| - Licenses seems incomplete/misleading |
| - anv: Symbol clash in intel_batch_decoder build when expat not available |
| - glcts failures on LNL/BMG |
| - Lavapipe vulkan 1.4 support? |
| - d3d12 vaapi: thread safety issues |
| - anv: Missing textures and glitches in It Takes Two (game) |
| - [anv][bisected] GravityMark segfault when enabling u-trace on RT workload |
| - features.txt does not have a Vulkan 1.4 section despite some drivers already supporting the new version |
| - Black screen bug that only affects AMD |
| - Failure to correctly decode H.264, possibly specific to use of array output view |
| - X1-85: Portal 2: Bottom of portal gun disappears |
| - X-Plane 12: Prop disc rendering regression |
| - Errors when enumerating devices create incorrect expectations |
| - Resident evil 3 remake hanging - f8b584d6 regression |
| - R6700XT: QP value doesn't affect output when using CQP rate control w/ H264/H265 VAAPI encoders |
| - Bug in Mesa headers: \`error: redefinition of typedef 'GLsync'` |
| - nak: Crash when starting The First Descendant |
| - [r300] Regression in f424ef18010 breaks wayland on RS480M |
| - anv: Missing text in Age of Mythology Retold on a Arc b580 |
| - RustiCL: and Clover broken with 9b7ea720c93 (!32713 (merged)) |
| - nvk: Artifact Classic crash at loading screen |
| - radeonsi VAAPI - vc-1 interlaced decoding garbled on Polaris |
| - VDPAU AV1 hardware decoding broken for Mesa 25.0.0-devel |
| - mesa: st_glsl_to_nir call to nir_opt_fragdepth might not be valid with MSAA |
| - rusticl: warning: pointers cannot be transmuted to integers during const eval |
| - X1-85: Half Life 2 water rendering artifacts |
| - crash on video playback |
| - anv: Allow buffer compression for vkd3d by default? |
| - anv: bellwright needs force_vk_vendor=-1 %command% to launch |
| - [anv] Possible regression from !31269 |
| - Up to 60% perf drop in SynMark DrvRes benchmark |
| - Memory leak on closing and re-opening X11 windows |
| - SIVPE errors on GPU-based screen recording (Radeon 890M) |
| - d3d12: va-api: build failure regression since 24.3.0-rc1 with MinGW GCC and clang |
| - anv: Marvel Rivals XeSS crash, game needs force_vk_vendor=-1 env variable |
| - anv: \`MESA: warning: INTEL_HWCONFIG_MIN_GS_URB_ENTRIES (2) != devinfo->urb.min_entries[MESA_SHADER_GEOMETRY] (0)` |
| - aco: two nir_shader_clock are miss optimized to one for GFX12 |
| - aco: opengl buffer blit test fail when using aco on GFX12 |
| - aco: nir_ddx/ddy v_interp optimization does not work on GFX12 |
| - VAAPI b_depth 2 causes "manage_dpb_before_encode UVD - Failed to find ref0" error |
| - regression;bisected;FTBFS: commits 37d47913437e2e9f72283ea8bffce00efc40fce2 and e67e44522f4f5de4fcde53ad0fb75e396ef31f52 breaks x86 mesa build |
| - anv: Enable storage image compression on TGL |
| - zink: zink_create_quads_emulation_gs doesn't write primitive ID |
| - DZN/DXIL doesn't validate GTK shaders |
| - black screen and "Failed to add framebuffer" error in wayland compositors when not filtering dmabuf formats with ccs modifiers on intel graphics when upgrading to mesa 24.3.0 |
| - nir: nir_opt_if_merge_test fails validation with NIR_DEBUG=validate_ssa_dominance |
| - radv: Vulkan AV1 video decode glitches |
| - radv: support RGP captures for purely compute pipelines |
| - regression;bisected: c49a71c03c9166b0814db92420eadac74cbc4b11 leads to artifacts if on top of launched game (in full screen mode) show list running apps (Hold Alt + Tab) |
| - !32067 broke piglit "spec\@egl_khr_create_context\@no-error context gl" |
| - Intel: Re-enable bo cache in iris driver (Xe2) |
| - [amdgpu][regression] GPU Hang/Reset Triggered by Several Applications |
| - ANV: X4 Foundations crashes with vkAllocateDescriptorSets -12 |
| - About twenty vulkan-samples cases will crash caused by the same error while running on PanVK |
| - Firestorm crashes on startup with Mesa 24.3 |
| - anv: Use-after-free detected by AddressSanitizer while running dEQP-VK |
| - GPU process crash via WebGPU shader - UAF in mesa gcm_schedule_early_instr at src/compiler/nir/nir_opt_gcm.c:477 |
| - radv: DCC causes glitches in Red Dead Redemption 2 |
| - A5xx rendering issues with firefox |
| - [ANV][Regression] Broken rendering in Flycast + Per-Pixel Alpha Sorting |
| - [TGL][anv] Performance regression in Dota 2 replay |
| - vtn: OpTypeStruct in kernel parameters trigger assertion in glsl_types.h |
| - anv: Assertion failure in \`dEQP-VK.image.extended_usage_bit_compatibility.image_format_list.s8_uint_optimal_transfer_src_bit` |
| - radv: Resident Evil 6 Benchmark Tool has artifacts on 7900 XTX when DCC is enabled, game launched on 4K monitor without scaling and with FullHD settings |
| - [AMD RX 6700 XT] Artifacts while upscaling games in fullscreen mode |
| - Distorted pixelated graphics with Radeon RX 7900 XT with some games |
| - Total War Warhammer 2 Graphical Glitch |
| - Glitching artifacts in tile shaped patterns on 6700 XT, when using upscaled fullscreen game on labwc |
| - anv: Page fault when using MTL simulator in dEQP-VK.ray_tracing_pipeline.data_spill.report_intersection.float32 |
| - mesa_cache_db.c:316:33: error: call to undeclared function 'mremap' |
| - [trunk] shaders fail hard in openmw after cbfc225e2bda2c8627a4580fa3a9b63bfb7133e0 |
| - u_perfetto.h:33:9: error: unknown type name 'clockid_t'; did you mean 'clock_t'? |
| - brw_fs_opt_copy_propagation incorrectly handles size changes of uniforms |
| - RADV Command buffer reuse doesn't reinitialize is_secondary |
| - Virgl:Qcom sa8155 GL_MAX_FRAGMENT_SHADER_STORAGE_BLOCKS/GL_MAX_VERTEX_SHADER_STORAGE_BLOCKS is too small to run antutu benchmark apk |
| - nouveau paraview msaa corruption 23.1 bisected regression |
| - mesa fails to build due to missing SPV_ENV_UNIVERSAL_1_6 symbol |
| |
| |
| Changes |
| ------- |
| |
| Aaron Ruby (6): |
| |
| - meson: Remove experimental from gfxstream driver build |
| - gfxstream: Some cleanup in manual entrypoints |
| - gfxstream: Remove VK_HOST_CONNECTION macro |
| - gfxstream: Fix unused variable warnings in ResourceTracker.cpp |
| - vulkan/util: Add c99_compat.h inclusion for cpp 'restrict' compatibility |
| - gfxstream: Remove internal vk_util.h and vk_struct_id.h entirely |
| |
| Adam Jackson (2): |
| |
| - docs/envvars: Remove mention of IRIS_ENABLE_CLOVER |
| - docs/envvars: Combine WGL sections |
| |
| Alejandro Piñeiro (1): |
| |
| - docs/features: mark VK_EXT_scalar_block_layout as supported for vc7+ |
| |
| Aleksi Sapon (9): |
| |
| - draw: primitive ID is per-patch |
| - llvmpipe: spec\@arb_tessellation_shader\@execution\@gs-primitiveid-instanced is fixed |
| - zink: spec\@arb_tessellation_shader\@execution\@gs-primitiveid-instanced is fixed |
| - draw: front-face injection must check geometry shader primitive type |
| - llvmpipe: PointCoord is offset when multisampling is enabled |
| - meson: fix finding Python on Windows |
| - llvmpipe: fix lp_test_arit on Windows |
| - llvmpipe: LLVM v2f32 trunc/floor/ceil/nearbyint generates optimal x86 code since at least version 8 |
| - llvmpipe: disable anisotropic filtering for non-2D textures |
| |
| Alyssa Rosenzweig (206): |
| |
| - nir/opt_algebraic: optimize patterns from Skia |
| - nir/opt_algebraic: add more 64-bit patterns |
| - nir/opt_algebraic: add another 64-bit pattern |
| - nir: add amul flag |
| - nir: add late_lower_int64 option |
| - nir: add ilea_agx/ulea_agx opcodes |
| - nir/builder: use amul over ishl on agx |
| - nir/opt_algebraic: don't lower amul if requested |
| - nir/lower_uniforms_to_ubo: use amul |
| - rusticl: respect late_lower_int64 |
| - agx: vectorize SSBOs |
| - agx: model IC dispatch |
| - agx: fix bfeil timing |
| - hk: reduce max SSBO size |
| - libagx: promote math to use AGX address mode |
| - agx: rewrite address mode lowering |
| - agx: change int conversion test |
| - agx: add pseudo for signext |
| - agx: optimize signext+iadd |
| - agx: fold zext into int sources |
| - agx: add tests for sign/zero-extend propagate |
| - agx: fix atomics in tess count shaders |
| - hk: don't advertise impossible modifiers |
| - agx: optimize signext imad |
| - agx: fuse iadd+large shift into imad |
| - agx: make imad+ishl rules actually work |
| - hk: drop assert |
| - hk: fix meta shader name |
| - libagx: fix cl warning |
| - libagx: drop branch |
| - libagx: drop dead code |
| - libagx: vectorize triangle def'n |
| - libagx: drop Clockwise |
| - libagx: simplify index patch expression |
| - libagx: don't key unroll to index size |
| - libagx: fix unroll kernel constant qualifier |
| - libagx: drop silliness in restart kernel |
| - agx: fuse also 8-bit address math |
| - asahi: extract agx_get_num_cores |
| - asahi: correct core count, max freq |
| - asahi: fix a2c with sample shading, harder |
| - asahi: assert/cse resource valid |
| - asahi: don't take compiled_shader in agx_build_internal_usc |
| - asahi: drop dead param |
| - asahi: factor out more compiled shader |
| - asahi: move agx_gather_device_key |
| - util: add u_tristate data structure |
| - panfrost: switch to u_tristate |
| - agx: make needs_g13x_coherency a tri-state |
| - nir/lower_convert_alu_types: use intrinsics_pass |
| - nir/conversion_builder: avoid redundant uint->uint clamp |
| - nir/opt_algebraic: optimize convert_uint_sat(ulong) |
| - nir: add names to function parameters |
| - nir/print: print function signature |
| - nir/print: annotate entrypoints |
| - nir/print: print parameter names in calls |
| - vtn: gather function parameter names |
| - vtn: use rzalloc in bindgen |
| - vtn: use named parameters in bindgen |
| - vtn: preserve name, is_return in bindings |
| - nir: split off some definitions for OpenCL |
| - compiler: make glsl_sampler_dim available to CL |
| - nir/lower_system_values: add ID to 32-bit lowering |
| - nir: add nir_fixup_is_exported pass |
| - vtn: introduce vtn_bindgen tool |
| - libagx: switch to vtn_bindgen |
| - libagx: move out of lib/ |
| - libagx: DCE |
| - asahi: drop dead ACCESS |
| - asahi,agx: move texture lowering into the compiler |
| - asahi: drop desc align alloc |
| - asahi/decode: disasm 3D helper progs |
| - asahi/clc: drop getopt |
| - agx: vectorize scratch access |
| - agx: gather workgroup size |
| - asahi,hk: reenable rgb32 buffer textures |
| - hk: generalize internal launch |
| - hk: expose missing eds3 feature |
| - hk: handle mismatching colour vs z/s dimensions |
| - hk: implement EXT_depth_bias_control |
| - hk: be robust against invalid MSAA inputs |
| - hk: do not increment GS queries for passthru GS |
| - hk: use common wg size |
| - hk: add cmd buffer to hk_cs |
| - hk: dce |
| - libagx: fix return type |
| - libagx: don't export vertex_id_for_top |
| - asahi/genxml: fix 0 encoding for groups |
| - asahi/genxml: fix 128-bit in CL path |
| - asahi/genxml: optimize out masking with shr |
| - asahi/genxml: define missing macros |
| - asahi: add XML for cdm stream link with return |
| - asahi: refmt |
| - vtn: ignore SpvFunctionParameterAttributeSret |
| - nir/pack_bits: handle 8-bit vec8 -> 64-bit |
| - nir: add nir_lower_calls_to_builtins pass |
| - asahi/clc: switch to nir_lower_calls_to_builtins |
| - nir: add nir_foreach_entrypoint macros |
| - nir: add workgroup size to functions |
| - vtn: plumb through OpEntryPoint |
| - vtn: gather workgroup size in libraries |
| - nir: add nir_function::pass_flags |
| - nir: add nir_remove_entrypoints helper |
| - nir: add nir_lower_constant_to_temp helper |
| - nir: add helpers for precompiled shaders |
| - asahi,vtn: precompile kernels |
| - libagx: increase wg size for query copy |
| - asahi: crash on fault |
| - hk: fix incorrect index size translate |
| - hk: fix z bias perf regression |
| - hk: implement hack for layered no attachments |
| - hk: clarify bounds check calculations |
| - agx: disable bounds check optimization |
| - agx: reduce preamble/main alignment |
| - asahi: drop dead pool stuff |
| - asahi: don't leak rodata |
| - hk,asahi,libagx: unify a bit of code |
| - asahi: drop dead |
| - asahi: fix page size alignment |
| - asahi: fix u_blitter related leaks |
| - asahi: label individual pools |
| - asahi,hk: mmap BO on first use |
| - asahi: add more asserts around bo add |
| - asahi: fix agx_batch_add_bo |
| - asahi: add =bodump debug help |
| - asahi: fix agxdecode memory mapping |
| - hk: implement timestamps |
| - hk: claim 1.4 |
| - zink: fix gl_PrimitiveID reads with quads |
| - nir/search_helpers: handle bcsel in is_only_used_as_float |
| - nir/opt_algebraic: optimize sign bit manipulation |
| - nir/opt_load_store_vectorize: match amul like imul |
| - nir,asahi: make argument alignment configurable |
| - mesa_clc: add depfile support |
| - libagx: switch to depfile support |
| - libagx: remove redundant source files |
| - vulkan: rename depth bias graphics states |
| - vulkan: bump layer api versions |
| - nir: add printf_abort intrinsic |
| - nir/lower_printf: allow fixed address |
| - nir/lower_printf: lower aborts |
| - nir/lower_printf: use unsigned math |
| - nir/lower_printf: use 64-bit math |
| - util/printf: be robust against truncated buffers |
| - util/printf: add context-ful helpers |
| - vulkan: add vk_check_printf_status helper |
| - nir/lower_point_size: skip non-var derefs |
| - clc: plumb cl_khr_subgroup_ballot |
| - libcl: add a common header for CPU/GPU stuff |
| - libcl: add VkDraw(Indexed)IndirectCommand definitions |
| - util/bitpack_helpers: make partially CL safe |
| - asahi: allow c23 extensions |
| - asahi/clc: remap __FILE__ |
| - asahi,hk: wire up printf, abort |
| - agx: implement halts |
| - libagx: drop pointless helper |
| - libagx: port to common libcl.h |
| - compiler: use libcl.h for CL |
| - compiler: add mesa_prim_has_adjacency helper |
| - asahi: use mesa_prim_has_adjacency |
| - nir: add lower_scratch_to_var pass |
| - compiler/glsl_types: add glsl_get_word_size_align_bytes |
| - agx: optimize scratch access |
| - radeonsi: use mesa_prim_has_adjacency |
| - asahi: fix mmap'ing imported BOs |
| - hk,libagx: move hk_draw to the gpu |
| - asahi: use common draw |
| - libagx: add missing agx_vdm_return |
| - agx: add more 8-bit address fusing rules |
| - asahi: reformat |
| - agx: match another address pattern |
| - libagx: move index size helpers to the gpu |
| - libagx: refactor index buffer code |
| - libagx: factor out load/store_index |
| - hk: use index buffer overflow check |
| - hk: factor out hk_draw_as_indexed_indirect |
| - hk,libagx: accelerate index buffer robustness |
| - hk,libagx: handle adjacency without a GS |
| - libagx,hk: handle pipeline stats queries without a GS |
| - libagx: use designated initializers |
| - hk: avoid compiling unneeded VS->GS variants |
| - hk: fix primitive restart dirty tracking |
| - glsl: fix glsl_get_word_size_align_bytes |
| - nir: pass a callback to nir_lower_robust_access |
| - nir/lower_robust_access: fix robustness with atomic swap |
| - libagx: add agx_barrier enum |
| - nir,asahi,hk: add barrier argument to MESA_DISPATCH_PRECOMP |
| - intel: set max_buffer_size to nir_lower_printf |
| - nir/lower_printf: drop null check |
| - nir/lower_printf: drop default max buffer size |
| - nir,util: move printf serializing into util |
| - util: add u_printf_hash helper |
| - util/u_printf: add singleton implementation |
| - util/u_printf: allow printing from singleton |
| - nir/lower_printf: add option to hash format strings |
| - nir/lower_printf: support dynamic buffer size |
| - nir: add nir_lower_printf_buffer pass |
| - agx: defer printf address lowering |
| - nir/lower_printf: drop static buffer addr lowering |
| - util,vulkan,asahi,hk: hash format strings |
| - nir/lower_robust_access: do not preserve control flow |
| - nir: fix O(N^2) behaviour in nir_remove_dead_variables |
| - meson: project-wide fs = import('fs') |
| - clc,libagx: drop --in for mesa_clc |
| - clc,libagx: automatically set lang version |
| - nir/serialize: strip function names |
| |
| Antonino Maniscalco (1): |
| |
| - nir,zink,asahi: support passing through gl_PrimitiveID |
| |
| Antonio Ospite (53): |
| |
| - ci/deqp: replace local android patches with upstream solution |
| - docs/android: update docs/android.rst after libgallium_dri updates |
| - docs/android: improve documentation about building llvmpipe for Android |
| - docs: remove leftover mention of meson dri3 option |
| - ci/android: unset compiler env vars in debian/android_build.sh |
| - ci/android: add a script to build LLVM libraries for Android |
| - ci/container: remove S3_JWT_FILE when container_job_trampoline.sh exits |
| - ci: set GIT_COMMITTER_DATE in a locale-agnostic format |
| - ci/deqp: refresh some patches to apply on top of recent VK-GL-CTS |
| - ci/deqp: cherry-pick fixes for building GL and GLES deqp on Android |
| - ci/deqp: enable building testlog tools on Android too |
| - ci/deqp: collect the mustpass lists also for the android target |
| - ci/android: fix problem with deqp version file when building for Android |
| - ci/android: build deqp for DEQP_API=VK |
| - ci/android: build llvmpipe driver for Android by forcing llvm fallback |
| - ci/android: don't copy the DRI drivers which are not needed anymore |
| - ci/android: restart all services after copying the new mesa libraries |
| - ci/android: handle premature exit of .gitlab-ci/cuttlefish-runner.sh |
| - ci/android: update version of cuttlefish host tools |
| - ci/android: add sudo to EPHEMERAL deps for debian/x86_64_test-android.sh |
| - ci/android: get custom cuttlefish images from the S3 |
| - ci/android: make cuttlefish-runner.sh more robust against different Android images |
| - ci/android: better separate host and guest mesa artifacts |
| - ci/android: use a custom kernel when launching cuttlefish |
| - ci/android: fix warning when using chown |
| - ci/android: fix result dir for Android guest execution of deqp-runner |
| - ci/android: don't call cuttlefish-host-resources script |
| - ci/android: reorder PATH and LD_LIBRARY_PATH values to clarify priority |
| - ci/android: also copy mesa vulkan libraries to the Android guest |
| - ci/android: update list of deqp files pushed to the guest system |
| - ci/android: use a native adb connection |
| - ci/android: set XDG_CACHE_HOME and pass --shader-cache-dir to deqp-runner |
| - ci/android: use a /data/deqp subdirectory on guest to store dEQP files |
| - ci/android: set VK_DRIVER_FILES before launching cuttlefish |
| - ci/android: add ci rules to test llvmpipe on Android |
| - ci/android: add ci rules to test venus on Android |
| - ci/android: upgrade DEBIAN_TEST_ANDROID_TAG |
| - ci/android: fix meson C++ cross-compiler argument detection |
| - ci/android: update ANDROID_NDK and ANDROID_SDK_VERSION |
| - ci/android: use ANDROID_SDK_VERSION when building deqp components |
| - ci/android: use ANDROID_SDK_VERSION for debian-android job too |
| - ci/android: rename variable ANDROID_NDK to ANDROID_NDK_VERSION |
| - docs/android: bump suggested platform-sdk-version to 34 |
| - freedreno/meson: remove C++ cross-build arguments HACKs |
| - freedreno/meson: sort list of options passed to get_supported_arguments() |
| - ci/android: update CUTTLEFISH_BUILD_NUMBER |
| - ci/android: define an INSTALL var for the source of mesa artifacts |
| - ci/android: improve handling of expectation files |
| - ci/android: fix pulling results from Android device |
| - ci/android: post-process testlog XML and create a junit.xml |
| - ci/android: pass --max-fails to deqp-runner in cuttlefish-runner.sh |
| - ci/android: pass --allow-downgrades when installing cuttlefish host tools |
| - ci/android: stop pushing libglapi.so since it's not available anymore |
| |
| Arseny Kapoulkine (1): |
| |
| - radv: On GFX11, use box sorting heuristic based on ray flags |
| |
| Arvind Yadav (1): |
| |
| - amd: Add amdgpu userqueue IOCTL functions |
| |
| Asahi Lina (16): |
| |
| - asahi: Add pipe bind flags to resource debug |
| - asahi: Add PIPE_BIND_SHARED to imported resources |
| - asahi: Extract agx_decompress_inplace() |
| - asahi: Introduce batch->feedback to disable compression in PBE |
| - asahi: In-place decompress shared resources for feedback loops |
| - hk: Add virtio implicit sync support |
| - hk: Fix DRM modifier selection for compressed surfaces |
| - hk: Enable missing swapchainMaintenance1 support |
| - asahi: Use 64bit size fields |
| - hk: Bump up max buffer size |
| - asahi: UAPI update to add GET_TIME & cleanup |
| - asahi: Fix agx_gpu_time_to_ns & implement DRM_ASAHI_GET_TIME |
| - asahi: UAPI update to add support for user timestamp buffers |
| - asahi: Add timestamp buffer ops |
| - asahi: Virt UABI update |
| - asahi: hk: Enable timestamps for virt |
| |
| Autumn Ashton (1): |
| |
| - radv/video: Fix bitstreamStartOffset including dstBufferOffset |
| |
| Bas Nieuwenhuizen (1): |
| |
| - util/perf: Fix some warnings. |
| |
| Benjamin Cheng (4): |
| |
| - ac/vcn: allow sq signature package to be skipped |
| - radv/video: support event for pre-VCN4 encode queues |
| - radv/video: support event for pre-VCN4 decode queues |
| - radv/video: enable by default on vcn2/3 with latest fw |
| |
| Benjamin Lee (36): |
| |
| - panvk: inherit sample count in secondary cmdbufs |
| - nir: clamp small W in nir_lower_viewport_transform |
| - nir: document order requirement for nir_lower_viewport_transform |
| - panvk: refactor fbinfo into a temp var in get_tiler_desc |
| - panvk: treat provoking vertex as dynamic state |
| - panvk: set provoking vertex in fbinfo |
| - panvk: advertise VK_EXT_provoking_vertex |
| - nir: handle arbitrary per-view outputs in nir_lower_multiview |
| - nir: document index semantics in nir_lower_multiview |
| - nir: treat per-view outputs as arrayed IO |
| - nir: add option to use compact view indices |
| - panvk: implement multiview support |
| - panvk: only clear enabled views |
| - panvk: disable position fifo optimization when multiview enabled |
| - panvk: advertise multiview support on v10+ |
| - panvk: add note about pan_lower_store_component requirements |
| - nir: update docs for nir_get_io_arrayed_index_src |
| - panvk: set uses_sample_shading NIR flag when sample shading is forced |
| - panvk: fix sample position when sample shading is disabled |
| - panvk/csf: fix alpha-to-coverage |
| - panfrost: add intrinsic to load frag coord at a barycentric |
| - panfrost: add nir pass to lower noperspective varyings |
| - panfrost: collect noperspective varyings in shader info |
| - panvk: pass noperspective_varyings sysval as a push constant |
| - panfrost: add pass to lower noperspective varyings to a constant |
| - panvk: use static noperspective when statically linking VS and FS |
| - panfrost: factor FS shader key into a helper function |
| - panfrost: specialize VS on FS interpolation qualifiers |
| - panvk: handle sample mask writes on 1-sample targets |
| - panvk: remove load_multisampled_pan sysval |
| - panfrost/va: add FLUSH instruction |
| - panfrost/va: implement fquantizetf16 ftz |
| - panvk: disable round_to_nearest_even for NEAREST-filtered samplers |
| - panfrost: remove incorrect usage of MALI_PIXEL_KILL_STRONG_EARLY |
| - panfrost: fix hang by using MALI_PIXEL_KILL_WEAK_EARLY in color preload |
| - panfrost: remove is_blit flag |
| |
| Benjamin Otte (1): |
| |
| - vulkan/wsi: Support alpha swapchains on win32 |
| |
| Benjamin ROBIN (1): |
| |
| - util/disk_cache: Do not try to delete old cache if cache is disabled |
| |
| Bo Hu (5): |
| |
| - gfxstream: snapshot: avoid double boxing dispatchable handle |
| - gfxstream: snapshot: DescriptorSet allocate and update |
| - gfxstream-guest: update offset to correct value |
| - update decoder.py to clean up un-used ApiCallInfo |
| - remove the mReconstructionMutex in load |
| |
| Boris Brezillon (103): |
| |
| - panvk: Enable CI on G610 |
| - pan/ci: Move g610-vk jobs to post-merge CI |
| - panvk: Change the prototype of panvk_select_tiler_hierarchy_mask() |
| - panvk: Kill unused fields in panvk_cmd_graphics_state |
| - panvk: Move the panvk_cmd_graphics_state definition to panvk_cmd_draw.h |
| - panvk: Move panvk_cmd_compute_state to a common place |
| - panvk: Move is_dirty() to panvk_cmd_draw.h and rename it |
| - panvk: Don't link the VS and FS shaders on v10 |
| - panvk: Sanitize the driver-internal dirty state tracking |
| - panvk: Move common gfx bits to a new source file in the common dir |
| - panvk: Cache the fs_required() result |
| - panvk/csf: Fix a wait-LS operation in finish_cs() |
| - panvk/cs: Poison cmdbuf registers when PANVK_DEBUG=cs is set |
| - panvk/ci: Update CI expectations to have a green CI |
| - panfrost: Increase AFBC body alignment requirement on v6+ |
| - panfrost: Add a helper to expose the maximum effective tile size |
| - panfrost: Add the concept of render block |
| - panfrost: Add support for AFBC(split) |
| - panfrost: Advertise support for AFBC(32x8,sparse,split) |
| - pan/decode: Flush the dump file before crashing |
| - panvk/csf: Keep a cache of the CS reg file at the panvk_queue level |
| - panvk/csf: Fix cross command buffer render pass suspend/resume |
| - panvk/csf: Explain why the tiler is set to 0xdeadbeefdeadbeef |
| - panvk: Fix panvk_plane_index() for D32_SFLOAT_S8_UINT |
| - pan/cs: Add cs_exception_handler_ctx |
| - pan/cs: Align exception handlers with NOPs |
| - pan/cs: Add dynamic save_reg to exception handler |
| - pan/cs: Add block macro for exception handler |
| - panvk/csf: Fix register overlap in issue_fragment_jobs() |
| - pan/cs: Return the dump region size when an exception handler is defined |
| - pan/cs: Return exception handler size/address |
| - panfrost: Add cs_exception_handler_def() to the ForEachMacros list |
| - panvk/csf: Use the information returned by cs_exception_handler_def() |
| - panfrost: Use the handler size returned by cs_exception_handler_def() |
| - panvk: Filter out input-attachment usage on non renderable formats |
| - pan/decode: Untangle CS disassembling and interpretation |
| - pan/decode: s/interpret_ceu/interpret_cs/ |
| - pan/decode: Rename pandecode_cs() into pandecode_interpret_cs() |
| - pan/decode: Add a helper to print CS binaries without interpreting them |
| - pan/decode: Provide a helper to print messages outside of the decoding path |
| - pan/cs: Add a LOAD_IP pseudo instruction |
| - pan/cs: Add an event-based tracing mechanism |
| - panvk/csf: Use event-based CS tracing |
| - panvk/csf: Don't disable SIMULTANEOUS_USE when tracing is enabled |
| - panvk: Add a flag to force SIMULTANEOUS_USE |
| - pan/texture: Move the plane info retrieval logic to a helper function |
| - pan/texture: Stop passing the view format around |
| - pan/texture: s/index/plane_index/ in panfrost_emit_plane() |
| - pan/texture: Stop passing a layout to panfrost_emit_plane() |
| - pan/texture: Pass pan_image_section_info around |
| - nir: Let nir_lower_texcoord_replace_late() report progress |
| - panfrost: s/NIR_PASS_V/NIR_PASS/ |
| - panfrost: Use nir_shader_intrinsics_pass() for the line_smooth lowering pass |
| - panvk: s/NIR_PASS_V/NIR_PASS/ |
| - pan: s/NIR_PASS_V/NIR_PASS/ |
| - panvk: Move the descriptors preparation out of CreateImageView() |
| - vk/meta: Pass depth/stencil attachments only when a clear is requested |
| - panvk: Ignore the view aspects when dealing with depth/stencil attachments |
| - pan/cs: Fix cs_builder allocation failure robustness |
| - panvk: Wrap our descriptor lowering passes in NIR_PASS() |
| - panvk: Stop using magic values for the sysval push constant offset/range |
| - panvk: Automate sysval access from NIR shaders |
| - panvk: Lower dynamic push_constant loads in desc_copy logic |
| - panvk: Lower load_push_constant with dynamic offset to global loads |
| - pan/bi: Get rid of bi_lower_load_push_const_with_dyn_offset() |
| - panvk: Don't define push_constant range/base when we don't have to |
| - pan/indirect: Don't use .base to pass the push_constant offset |
| - pan/mi: Don't pretend we support push constants |
| - pan/bi: Disallow non-zero .{range,base} on load_push_constant instructions |
| - pan/bi: Fix mem_access_size_align_cb() for push constants |
| - panvk: Don't lower load_base_vertex |
| - panvk: Fix first_vertex/base_instance types |
| - pan: Don't pretend we support load_{vertex_id_zero_base,first_vertex} |
| - panvk: Don't lower load_blend_const_color_rgba |
| - panvk: Factor-out the sysvals initialization logic |
| - panvk: Pass a cmdbuf to blend_emit_descs() |
| - panvk: Pack push constants |
| - panfrost: Kill the mali_ptr typedef |
| - panfrost: Kill the uXX typedefs |
| - panfrost: Move MALI_EXTRACT_INDEX to pan_format.h |
| - panfrost: Move MAX_{MIP_LEVELS,IMAGE_PLANES} to pan_texture.h |
| - panfrost: Kill panfrost-job.h |
| - panvk: Don't invalidate the viewport on cull mode updates |
| - panvk/jm: Fix depth clipping with small viewport depth range |
| - panvk: Fix an alignment issue on x86 |
| - panvk: Fix panvk_priv_mem_bo() on 32-bit platforms |
| - panfrost/ci: Add panvk and panfrost to the debian-x86_32 job |
| - pan/genxml: s/PAN_PAN_HELPERS_H/PAN_PACK_HELPERS_H/ |
| - pan/genxml: Include pan_pack_helpers.h instead of copying it |
| - pan/genxml: Generate MALI_XXX_PACKED_T macros |
| - panfrost: Fix instanced draws when attributes have a non-zero divisor |
| - pan/cs: Fix the tracepoint register dump loops |
| - pan/cs: Allow undefined value if condition=always in cs_branch_label() |
| - pan/cs: cs_{break,continue} are not for_each macros |
| - panvk/csf: Make all sync operations on the CSG scope |
| - panvk/csf: Use cs_sr_reg64() instead of cs_reg64() when setting the OQ pointer |
| - panvk/csf: Rework the occlusion query logic to avoid draw flushes |
| - panvk/csf: Fix add_memory_dependency() for input attachment access |
| - panvk/csf: Add a knob to force texture cache invalidation on RUN_FRAGMENT |
| - panvk: Don't clobber registers if the render pass was suspended |
| - pan/decode: Fix the blend_count mask |
| - panvk/csf: Don't free the resources twice when init_render_desc_ringbuf() fails |
| - panvk: Initialize device virtual address space after the VM creation |
| |
| Brad Smith (1): |
| |
| - util: Support elf_aux_info() on OpenBSD arm and ppc |
| |
| Brian Paul (2): |
| |
| - svga: add svga_resource_create_with_modifiers() function |
| - svga: fix printing 64-bit value for 32-bit build |
| |
| Caio Oliveira (90): |
| |
| - intel/executor: Fix exec_size in \@read macro for Xe2 |
| - intel/brw: Add test for combining SWSB dependencies in SENDs |
| - intel/brw: Allow extra SWSB encodings for Xe2 |
| - intel/common: Properly dispose resources in mi_builder tests |
| - intel/common: Prepare mi_builder tests to support Xe KMD |
| - intel/common: Implement Xe KMD in mi_builder tests |
| - intel/common: Enable mi_builder test for PTL |
| - intel/brw: Add SHADER_OPCODE_BALLOT |
| - intel/brw: Add SHADER_OPCODE_QUAD_SWAP |
| - intel/brw: Omit type and region in payload sources when printing IR |
| - intel/brw: Use <V,W,H> notation for FIXED_GRF and ARF source when printing IR |
| - intel/executor: Enable PTL |
| - intel/brw: Fix decoding of cond_modifier and saturate in EU validation |
| - intel/brw: Fix SWSB output when printing IR |
| - intel/brw: Dump IR after lower scoreboard pass |
| - util/ra: Remove unimplemented function declaration |
| - intel/brw: Add is_control_source for the new subgroup ops |
| - mr-label-maker: Rules for intel/executor |
| - intel/brw: Enable EU validation and compaction tests for PTL |
| - intel/brw: Dump errors when brw_assemble() fails EU validation |
| - intel/compiler: Use #pragma once instead of header guards |
| - intel/brw: Remove overloads for brw_print_instruction/s functions |
| - intel/brw: Consider if SEND is gather variant when setting ex_desc |
| - intel/brw: Add TGL_PIPE_SCALAR value |
| - intel/brw: Add assembly support for ARF scalar register |
| - intel/brw: Add validation for ARF scalar register |
| - intel/executor: Add example using scalar register and send gather |
| - intel/brw: Skip some regioning EU validation for Vx1 and VxH modes |
| - intel/brw: Extract format enum in EU validation code |
| - intel/brw: Add validation for some Xe2 register regioning restrictions |
| - intel/brw: Add some tests for new Xe2 register regioning restrictions |
| - intel/brw: Add SHADER_OPCODE_READ_FROM_CHANNEL and LIVE_CHANNEL |
| - intel/brw: Disallow cmod in some cases of ARF scalar as destination |
| - intel/brw: Use variable instead of manually count the passes |
| - intel/brw: Rename brw_inst.h to brw_eu_inst.h |
| - intel/brw: Rename brw_inst to brw_eu_inst |
| - intel/brw: Rename brw_compact_inst to brw_eu_compact_inst |
| - intel/brw: Rename brw_inst_bits/set_bits to brw_eu_inst_bits/set_bits |
| - intel/brw: Rename brw_inst_* helpers to brw_eu_inst_* |
| - intel/brw: Rename brw_compact_inst_* helpers to brw_eu_compact_inst_* |
| - intel/brw: Gather brw_reg related implementations in brw_reg.cpp |
| - intel/brw: Add missing call to invalidate analysis |
| - intel/brw: Move two NIR passes to brw_nir.c |
| - gallium/meson: Ensure all needed sym_config are set. |
| - intel/brw: Remove 'fs' prefix from passes filenames |
| - intel/brw: Remove 'fs' prefix from passes and related functions |
| - intel/brw: Add missing bits in 3-src SWSB encoding for Xe2+ |
| - intel/brw/xe2+: Do not use $.dst or $.src SWSB annotations in SENDs |
| - intel/compiler: Use INFINITY spill cost to represent no_spill |
| - util: Add operator new[] to linear context helper declarations |
| - intel/compiler: Use linear allocator for ACP trees in copy-prop |
| - intel/brw: Remove uses of VLAs |
| - intel/elk: Add ELK_MAX_MRF_ALL for static allocating arrays |
| - intel/elk: Remove uses of VLAs |
| - intel/elk: Fix typo in assertion |
| - util/ra: Move less used data out of ra_node |
| - util/ra: Don't store a pointer to graph per ra_node |
| - util/ra: Bump the initial size of adjacency lists |
| - util/ra: Don't store a pointer to a ra_regs per ra_reg |
| - intel/brw: Rename brw_fs_validate to brw_validate |
| - docs: Update syntax on Performance tips page |
| - intel/brw: Rename brw_fs_generator.cpp to brw_generator.cpp |
| - intel/brw: Add brw_generator.h header |
| - intel/brw: Rename fs_generator to brw_generator |
| - intel/brw: Add missing cases to flags_written() |
| - intel/brw: Remove extra wrapping around fs_visitor in tests |
| - intel/brw: Rename brw_fs_builder.h to brw_builder.h |
| - intel/brw: Rename fs_builder to brw_builder |
| - intel/brw: Stop using namespace for brw_builder |
| - intel/brw: Move a few builder helpers to brw_builder.h/cpp |
| - intel/brw: Move shuffle_from_32bit_read implementation to brw_builder |
| - intel/brw: Apply conventions to lower_src_modifiers helper |
| - intel/brw: Rename brw_fs_reg_allocate.cpp to brw_reg_allocate.cpp |
| - intel/brw: Remove 'fs' prefix from reg alloc code |
| - intel/brw: Rely on existing helper for dispatch width of geometry stages |
| - intel/elk: Fix wrong destination to memset |
| - intel/brw: Use brw prefix for some schedule instructions identifiers |
| - intel/brw: Use brw prefix instead of namespace in dynamic_msaa_flags() |
| - intel/brw: Remove unused enum |
| - intel/executor: Fix typo when copying result into Lua table |
| - intel/tools: Use idep_libintel_common in meson |
| - intel/tools: Add helpers for decoder_init/disasm |
| - intel/tools: Merge libaub into libintel_tools |
| - intel: Add meson option -Dintel-elk |
| - intel/brw: Add scoreboard support for scalar register |
| - intel/brw: Plumb through generator whether SEND is gather variant |
| - intel/brw: Add SHADER_OPCODE_SEND_GATHER |
| - intel/brw: Add lowering for SHADER_OPCODE_SEND_GATHER |
| - intel/brw: Use SHADER_OPCODE_SEND_GATHER in Xe3 |
| - intel/brw: Fallback to SEND from SEND_GATHER if possible |
| |
| Caleb Callaway (2): |
| |
| - docs: Intel GPU performance tips |
| - docs: clarify ASPM performance tips |
| |
| Casey Bowman (1): |
| |
| - vulkan/screenshot-layer: Add region command option |
| |
| Caterina Shablia (9): |
| |
| - pan/bi: fix a typo |
| - pan/va: fix WMASK packing |
| - pan/bi: handle read_invocation |
| - pan/bi: handle ballot, ballot_relaxed and as_uniform |
| - pan/bi: lower some subgroup intrinsics |
| - pan/bi: lower the rest of subgroup ops using nir_lower_subgroups |
| - pan/bi: add a MEMORY_BARRIER pseudo-instruction |
| - pan/bi: handle barriers with SUBGROUP scope |
| - panvk: enable subgroupSizeControl |
| |
| Chen, Phoebe (1): |
| |
| - amd/vpelib: Refactor YUV format check |
| |
| Chia-I Wu (69): |
| |
| - panvk: ensure res table is restored after meta |
| - panvk: add memory mmap/munmap helpers |
| - panvk: do not leak mapped memory |
| - panvk: update CI expectations |
| - panvk: add get_subqueue_stages |
| - panvk: rework collect_cache_flush_info |
| - panvk: rework collect_cs_deps |
| - panvk: always skip frag->tiler subqueue wait |
| - panvk: skip frag subqueue self-wait within a render pass |
| - panvk: skip tiler subqueue self-wait within a render pass |
| - panvk: improve should_split_render_pass |
| - panvk: fix a missing cache invalidation |
| - panvk: update expectations for G610 |
| - vulkan: include host write in expanded dst access flags |
| - panvk: add normalize_dependency |
| - panvk: improve VK_QUEUE_FAMILY_EXTERNAL support |
| - panvk: add support for VK_EXT_queue_family_foreign |
| - panvk: fix base_workgroup_id sysval |
| - ci: update the comment on MESA_VK_ABORT_ON_DEVICE_LOSS |
| - panvk: report queue lost timely when PANVK_DEBUG=sync |
| - panvk: implement check_status on v10+ |
| - panvk: no need to map IB internally on valhall |
| - panvk: clang-format issue_fragment_jobs |
| - panvk: fix frag_completed for layered rendering |
| - panvk: minor clean up to prepare_blend |
| - panvk: fix dirty check for prepare_blend |
| - panvk: expand top-of-pipe and bottom-of-pipe |
| - panvk: use u_foreach_bit to loop over mask bits |
| - panvk: fix vs image support |
| - panvk: add panvk_queue_submit_init |
| - panvk: add panvk_queue_submit_init_storage |
| - panvk: add panvk_queue_submit_init_waits |
| - panvk: add panvk_queue_submit_init_cmdbufs |
| - panvk: add panvk_queue_submit_init_signals |
| - panvk: add panvk_queue_submit_ioctl |
| - panvk: add panvk_queue_submit_process_signals |
| - panvk: add panvk_queue_submit_process_debug |
| - panvk: clean up panvk_queue_submit |
| - panvk: move pandecode_next_frame a bit earlier |
| - panvk/csf: fix SIMULTANEOUS_USE gpu faults |
| - panvk/csf: fix subqueue ctx memory pool |
| - panvk: use cs_tracing_ctx::enabled for exception handler |
| - panvk: add u_trace_context to panvk_device |
| - panvk: define cmdbuf begin/end tracepoints |
| - panvk/csf: add CS_REG_SCRATCH_COUNT |
| - panvk/csf: add u_trace to panvk_cmd_buffer |
| - panvk/csf: add vk_sync to panvk_queue |
| - panvk/csf: flush and process trace events for one-time cmdbufs |
| - panvk/csf: flush and process trace events for all cmdbufs |
| - panvk: improve C++ compat for perfetto |
| - panvk: add u_trace perfetto support |
| - panvk: silence a perfetto init warning |
| - vulkan: add vk_device_get_timestamp |
| - vulkan: add common GetPhysicalDeviceCalibrateableTimeDomainsKHR |
| - vulkan: add common GetCalibratedTimestampsKHR |
| - anv: use common calibrated timestamp support partially |
| - hasvk: use common calibrated timestamp support |
| - radv: use common calibrated timestamp support |
| - tu: use common calibrated timestamp support |
| - nvk: use common calibrated timestamp support |
| - hk: remove calibrated timestamp support |
| - panvk: no need to zero availability on query create |
| - panvk: no need to check query count on query create |
| - panvk: no need to zero results on query reset |
| - panvk/csf: no need to sb wait on query begin |
| - panvk/csf: no need to sb wait on query end |
| - panvk/csf: no need to sb wait on query copy |
| - panvk/csf: no need to flush caches after query copy |
| - panvk/csf: add a comment on query synchronization |
| |
| Christian Gmeiner (20): |
| |
| - broadcom/common: Make v3d_device_info.h usable for C++ |
| - v3d: Move v3d_ioctl(..) to src/broadcom/common |
| - v3dv: Switch to v3d_ioctl(..) |
| - v3d: Move v3d_X(..) to src/broadcom/common |
| - v3dv: Switch to v3d_X(..) |
| - broadcom: Add perfcount library |
| - v3d: Switch to use libbroadcom_perfcntrs |
| - v3dv: Switch to use libbroadcom_perfcntr |
| - etnaviv: blt: Add DBG(..) why blt usage was not possible |
| - etnaviv: rs: Add DBG(..) why blt usage was not possible |
| - v3d: Sync v3d_drm.h with drm-misc-next |
| - broadcom: Add perfetto data source |
| - pps: Add support for v3d ds |
| - perfetto: Add v3d data sources to system.cfg |
| - perfetto: Add v3d data sources to gpu.cfg |
| - docs: Update perfetto with the latest status |
| - etnaviv: isa: Support src2 for texld |
| - etnaviv: isa: Support src2 for texldb and texldl |
| - egl/meson: Specify which symbols to export |
| - v3dv: Add some CPU tracepoints |
| |
| Christopher Michael (5): |
| |
| - v3d: Add check to see if v3d supports cpu_queue |
| - v3d: Add check to see if v3d supports multisync |
| - v3d: Add support for timestamp queries |
| - v3d: Add support for time elapsed queries |
| - v3d: Add support for PIPE_QUERY_TIMESTAMP_DISJOINT |
| |
| Collabora's Gfx CI Team (5): |
| |
| - Uprev Piglit to eebe1b555f51dbb702f696d08ad5ae8153bcdcdd |
| - Uprev Piglit to d04d6fff00849a2a8e29ef3251c6ca04a2f68dc7 |
| - Uprev Piglit to 468221c722481c470e6a23760b914c33143c2af6 |
| - Uprev Piglit to 4c0fd15fd956ec70c5509bedee219d602b334464 |
| - Uprev Piglit to 631b72944f56e688f56a08d26c8a9f3988801a08 |
| |
| Connor Abbott (55): |
| |
| - vulkan/runtime: Add driver callbacks for BVH building |
| - vulkan/runtime,radv: Add shared BVH building framework |
| - vulkan/runtime,radv: Add shared BVH building framework |
| - ir3: Fix reload_live_out() in shared RA |
| - tu: Add Vulkan 1.4 features and properties |
| - tu: Expose Vulkan 1.4 on a7xx |
| - tu: Move queue-related code to a new file |
| - tu: Refactor the submit path |
| - tu/kgsl: Make wait_timestamp_safe() return VkResult |
| - tu/knl: Move u_trace fence handling to generic code |
| - tu: Rename bo_list to submit_bo_list |
| - util/dynarray: Add macro for appending an array |
| - tu: Make userspace RD dump generic |
| - freedreno/fdl: Make tiled r8g8 images have 4k alignment |
| - tu: Re-enable tiled non-ubwc R8G8 images |
| - freedreno/fdl: Fix 3d mipmapping height alignment |
| - freedreno/fdl, tu: Make mutable part of the image layout |
| - freedreno/fdl: Don't enable r8g8 special case for mutable images |
| - freedreno/fdl, tu: Allow swaps with mutable tiled images |
| - tu: Allow UBWC with images with swapped formats. |
| - vk/bvh: Fix clang build error with turnip |
| - ir3: Allow collect sources to be undef |
| - ir3: Support assembling/disassembling ray_intersection and resbase |
| - ir3: Plumb through two-dimensional UAV loads |
| - ir3: Plumb through ray_intersection intrinsic |
| - tu: Implement cmd_fill_buffer_addr internal function |
| - tu: Implement buffer_write_cp |
| - freedreno: CP_SCRATCH_WRITE exists on a7xx too |
| - freedreno: Add new a7xx CP_REG_RMW and CP_REG_TO_SCRATCH fields |
| - freedreno/a7xx: Document partial workgroup register |
| - tu: Stop emitting HLSQ_CS_KERNEL_GROUP_* |
| - tu/a7xx: Emit HLSQ_CS_LAST_LOCAL_SIZE dynamically |
| - tu: Implement unaligned dispatches |
| - tu: Add common define for maxTexelBufferElements |
| - tu: Create meta device |
| - freedreno: Introduce ray tracing features |
| - tu/kgsl: Bump uapi header |
| - tu: Plumb through raytracing fuse |
| - tu: Move fd_dev_info() before name generation |
| - tu: Display when raytracing is disabled in device string |
| - tu: Support VK_KHR_acceleration_structure |
| - tu: Support VK_KHR_ray_query |
| - tu: Expose VK_KHR_ray_tracing_maintenance1 |
| - tu, ir3: Implement a750 RT workaround |
| - ir3: Use nir_split_struct_vars for temporaries |
| - vk/bvh: Add default stubs for unsupported entrypoints |
| - anv: Delete acceleration structure stubs |
| - radv: Delete acceleration structure stubs |
| - tu: Use image view format for sysmem resolves |
| - tu: Handle non-identity GMEM swaps when resolving |
| - tu: Handle non-identity GMEM swaps for input attachments |
| - tu, freedreno: Write PC_DGEN_SU_CONSERVATIVE_RAS_CNTL |
| - tu: Stop setting binning fields on a7xx |
| - tu: Support VK_EXT_conservative_rasterization on a7xx |
| - tu: Add missing assignment to shared_viewport |
| |
| Constantine Shablia (23): |
| |
| - panvk: move samplerAnisotropy in the order it appears in struct definition |
| - panvk: enable shaderInt64 |
| - panvk: elaborate the comment on the maxMemoryAllocationCount limit |
| - panvk: adjust maxSamplerAllocationCount limit |
| - nir: introduce instance_index system value |
| - nir: lower INSTANCE_{ID,INDEX} to an offset load_instance_{index,id} respectively |
| - Revert "nir: lower INSTANCE_{ID,INDEX} to an offset load_instance_{index,id} respectively" |
| - Revert "nir: introduce instance_index system value" |
| - panvk: replace vkGetBufferMemoryRequirements2 with vkGetDeviceBufferMemoryRequirements |
| - panvk: never prefer or require dedicated allocation for buffers |
| - panvk: never require dedicated allocation for images |
| - panvk: add panvk_image_init helper |
| - panvk: implement vkGetDeviceImageMemoryRequirements |
| - panvk: enable shaderInt8, VK_KHR_8bit_storage and VK_KHR_shader_float16_int8 |
| - pan/util: sort files in meson.build |
| - panvk: order KHR extension enables alphabetically |
| - panvk/csf: use gfx_state_set_dirty instead of touching state directly |
| - pan,nir: introduce load_attribute_pan |
| - pan/bi: handle load_attribute_pan |
| - panvk: Fix base_{instance,vertex} handling |
| - panvk: lower drawid to zero |
| - panvk: enable shaderDrawParameters |
| - panvk: enable drawIndirectFirstInstance |
| |
| Corentin Noël (6): |
| |
| - virgl: Propagate the GL_MAX_stage_SHADER_STORAGE_BLOCKS for each stage |
| - virgl: Simply loop over the resources to figure-out if it is already added |
| - virgl: Update virgl_hw.h from virglrenderer |
| - virgl: Use MAX_SAMPLERS instead of MAX_SHADER_SAMPLER_VIEWS |
| - virgl/ci: Remove screen size arguments |
| - virgl/ci: Re-enable virgl-traces |
| |
| Daniel Schürmann (49): |
| |
| - aco/ra: set Pseudo_instruction::scratch_sgpr to SCC if it doesn't need to be preserved |
| - aco/ra: use bitset for sgpr_operands_alias_defs |
| - aco/ra: explicitly assign scratch SGPR for linear phis |
| - aco: remove Pseudo_instruction::tmp_in_scc |
| - aco/insert_NOPs: implement vector-based RegCounterMap as replacement for VGPRCounterMap |
| - aco/insert_NOPs: use RegCounterMap as replacement for the CounterMap implementation |
| - aco/insert_NOPs: add early exit to handle_valu_partial_forwarding_hazard_instr |
| - aco/print_asm: allow for empty blocks with arbitrary offsets |
| - aco/assembler: constify assembly functions |
| - aco/assembler: Actually insert s_inst_prefetch instructions when aligning blocks for loops |
| - aco/assembler: change ctx.loop_header to uint32_t instead of Block* |
| - aco/assembler: chain branches instead of emitting long jumps |
| - aco: remove definition from SOPP branch instructions |
| - aco: remove definition from Pseudo branch instructions |
| - aco/assembler: Don't emit target basic block index when chaining branches |
| - aco/print_ir: don't print disconnected empty blocks |
| - aco/optimizer_postRA: set branch()->never_taken if exec is constant non-zero |
| - aco: move try_optimize_branching_sequence() to postRA optimizations |
| - aco/jump_threading: remove branch sequence optimization |
| - aco: move branch lowering optimization into separate file 'aco_lower_branches.cpp' |
| - aco/lower_branches: remove edges between blocks if there is no direct branch |
| - ac/lower_ngg: Fix collecting buffer offsets from 4 lanes on gfx12 |
| - ac/lower_ngg: move break blocks after loop in streamout code generation for gfx12/ACO |
| - ac/lower_ngg: move readlane into break blocks in streamout code generation for gfx12/ACO |
| - nir/divergence: change nir_has_divergent_loop() to return true only for divergent breaks |
| - aco/jump_threading: don't remove loop preheaders |
| - aco/assembler: Find loop exits using the successor's loop nest depth |
| - aco: consider s_cbranch_exec* instructions in needs_exec_mask() |
| - aco/lower_branches: do eliminate_useless_exec_writes_in_block() during branch lowering. |
| - aco/lower_branches: implement try_remove_simple_block() in lower_branches() |
| - aco: move try_merge_break_with_continue() to lower_branches() |
| - aco/lower_branches: allow for non-fallthrough loop exits in try_merge_break_with_continue() |
| - aco: delete aco_jump_threading.cpp |
| - aco/lower_branches: stitch linear blocks if there is exactly one successor with one predecessor |
| - nir/from_ssa: only consider divergence if requested |
| - Revert "nir: add nir_clear_divergence_info, use it in nir_opt_varyings" |
| - aco/insert_NOPs: refactor VALUReadSGPRHazard detection |
| - aco/insert_NOPs: implement VALU -> VALU case for VALUReadSGPRHazard on GFX12 |
| - nir/loop_analyze: only iterate loop header phis in compute_induction_information() |
| - nir/loop_analyze: remove nir_loop_variable::in_if_branch and nir_loop_variable::in_nested_loop |
| - nir/loop_analyze: remove nir_loop_variable::in_loop |
| - nir/loop_analyze: directly record induction variables into nir_loop_info |
| - nir/loop_analyze: don't initialize nir_loop_variable separately |
| - nir/loop_analyze: replace nir_loop_variable array with hash table |
| - nir/loop_analyze: insert only induction vars into hash map |
| - nir/loop_analyze: ignore terminating induction variable in guess_loop_limit() |
| - nir/loop_analyze: re-use the same nir_loop_variable struct before and after the increment |
| - nir/loop_analyze: store nir_loop_induction_variable hash table in loop_info |
| - nir/loop_analyze: stack-allocate loop_info_state |
| |
| Daniel Stone (22): |
| |
| - ci: Don't run Meson tests in critical-path jobs |
| - ci: Slash ASan and UBSan build coverage |
| - ci: Give much more time to ASan and UBSan jobs |
| - ci: Let rootfs builds run for 2 hours (!) |
| - pipe_loader: Fix pipe_i915 with the dynamic loader |
| - ci: Disable Werror on wrapped subprojects |
| - ci: Remove obsolete compiler-wrapper |
| - ci: Move build containers above test containers |
| - ci/fedora: Install which into build image |
| - ci: Define LLVM_VERSION as a container property |
| - ci: Require LLVM_VERSION to be set explicitly |
| - ci/debian: Upgrade Debian images to LLVM 19 |
| - ci: Fix dependency on lint job |
| - ci: Fix kernel section nesting |
| - ci: Move dEQP message into section |
| - ci: Pass build targets to dEQP CMake |
| - ci: Don't build Vulkan for GL dEQP |
| - ci: Trim down VVL external builds |
| - ci: Capture Ninja log |
| - ci: Only build Perfetto in build-test jobs |
| - ci: Only build what we use for testing jobs |
| - ci: Move r300/nine/nvk builds out of critical path |
| |
| Danylo Piliaiev (31): |
| |
| - ir3/parser: Print the line where parsing error occurred |
| - nir/nir_opt_offsets: Do not fold load/store with const offset > max |
| - freedreno/registers: Define Fragment Shading Rate registers |
| - ir3,tu: Add support for Fragment Shading Rate and plumb it into Turnip |
| - tu/a7xx: Implement VK_KHR_fragment_shading_rate |
| - ir3/parser: Add fullnop and fullsync sections for debugging |
| - tu: Enable UBWC for 3D images without mipmaps |
| - freedreno/fdl: Pass fd_dev_info to fdl6_layout |
| - tu,freedreno: Enable linear mipmap tail for UBWC images |
| - tu: Disable fragmentShadingRateWithShaderSampleMask due to issues |
| - tu,ir3: Add workaround for reading shading rate on A7XX gen1,gen2 |
| - tu: Handle cmdbuf and rp_blit flags of TU_DEBUG_STALE_REGS_FLAGS |
| - tu/perfetto: Always emit submission event and time it |
| - tu/perfetto: Add app and engine names to the command buffer tracepoint |
| - ir3: Make allocation of consts more generic and order independent |
| - ir3: Use generic consts alloc for driver params |
| - tu,ir3: Make push consts be able to start from higher than c0.x offsets |
| - ir3: Use generic const alloc for everything and call it once |
| - tu: Allocate consts for driver params as early as possible |
| - tu: Do not re-calculate static blend LRZ state |
| - freedreno/regs: Set correct shr for GRAS_LRZ_BUFFER_PITCH.ARRAY_PITCH |
| - tu: Fix LRZ for arrayed depth |
| - tu: Handle 8x MSAA for LRZ |
| - freedreno,tu: Unify LRZ layout calculations |
| - tu: Track at which draw call LRZ is disabled |
| - tu: Do not disable LRZ for whole RP if it is disabled in RP |
| - ir3: Consider const alloc alignment in free space size calcs |
| - tu: Fix stale A7XX_GRAS_LRZ_CNTL2 in 3d blits or !valid lrz case |
| - tu/a7xx: Always have depth/stencil in corresponding resolve groups |
| - tu: Get correct src view when storing gmem attachment |
| - tu: Handle mismatched mutability when resolving from GMEM |
| |
| Dave Airlie (9): |
| |
| - nir/functions: force inlining for barriers. |
| - v3dv: report correct error on failure to probe |
| - venus: handle device probing properly. |
| - vulkan: update to 302 headers for av1 encode |
| - lavapipe: fix beta build due to changes in AMDX ext |
| - radv/video: set max slice counts to 1 for h264/5 encode |
| - anv: add default av1 tables from media-driver |
| - genxml: add av1 fields |
| - anv: add initial support for AV1 decoding |
| |
| David (Ming Qiang) Wu (3): |
| |
| - frontends/va: adding PIPE_FORMAT_P012 |
| - frontends/va: add PIPE_VIDEO_PROFILE_AV1_PROFILE2 |
| - radeonsi/vcn: support 12bit YUV420 AV1 decoding |
| |
| David Heidelberg (14): |
| |
| - util: Drop 3Dnow optimisation leftovers |
| - util: Remove MMX/MMXext detection code |
| - util: Drop ancient Intel CPU detection |
| - util: drop XOP detection code |
| - llvmpipe: align with u_cpu_detect struct changes |
| - compiler/rust: drop duplicated bindgen check |
| - ci/freedreno: update Adreno 306 expectations |
| - ci/freedreno: increase Adreno 618 timeout to 1h |
| - docs: remove deprecated component list and licenses |
| - docs: Clarify project name and include Mesa3D |
| - docs: move license(s) to licenses directory |
| - c11: use SPDX-License-Identifier header |
| - licenses: add missing licenses |
| - drm-uapi: update licenses statement |
| |
| David Rosca (148): |
| |
| - radeonsi/vcn: Fix coding AV1 render size |
| - frontends/va: Add minus_1 to AV1 render_width/height |
| - gallium: Add PIPE_VIDEO_CAP_SKIP_CLEAR_SURFACE |
| - frontends/va: Support skip clear on surface creation |
| - frontends/vdpau: Support skip clear on surface creation |
| - radeonsi: Support PIPE_VIDEO_CAP_SKIP_CLEAR_SURFACE |
| - radeonsi/vcn: Stop clearing decode internal buffers |
| - radv/video: Fix H264 slice control |
| - radv/video: Fix HEVC slice control |
| - radv/video: Report correct encodeInputPictureGranularity |
| - radv/video: Avoid selecting rc layer over maximum |
| - radv/video: Use 64x16 alignment for HEVC encode |
| - radv/video: Override pic_init_qp_minus26 in PPS |
| - radeonsi/vcn: Use correct frame context buffer for preencode on VCN5 |
| - radeonsi: Check all supported formats in si_vid_is_target_buffer_supported |
| - frontends/va: Create surfaces with correct fourcc for RT format |
| - frontends/va: Stop reallocating to preferred format in EndPicture |
| - frontends/va: Stop reallocating from progressive to interlaced in EndPicture |
| - frontends/va: Stop reallocating buffers for protected playback |
| - frontends/va: Stop reallocating according to JPEG sampling factor |
| - frontends/va: Check if target buffer is supported in EndPicture |
| - frontends/va: Stop reallocating buffers in EndPicture |
| - frontends/va: Use compositor blit with different number of planes |
| - frontends/va: Only use interlaced surfaces when progressive is not supported |
| - pipe: Remove video update_decoder_target |
| - radeonsi/vpe: Set correct surface swizzle mode |
| - radeonsi/vpe: Don't allow DCC surfaces |
| - frontends/va: Return correct pixel formats in surface attributes query |
| - frontends/va: Change default fourcc for RGB 10bit to X2R10G10B10 |
| - gallium/vl: Implement rendering to 3-plane YUV formats |
| - gallium/vl: Don't support planar RGB as video format |
| - frontends/va: Enable 3-plane YUV formats as postproc output |
| - radeonsi/vcn: Support tiling for JPEG decode |
| - radv/video: Fix IB signature checksum |
| - radv/video: Always use setup reference slot when valid |
| - ac/surface: Add RADEON_SURF_VIDEO_REFERENCE |
| - radeonsi: Support PIPE_BIND_VIDEO_DECODE/ENCODE_DPB |
| - radeonsi/vcn: Create decode DPB surfaces with PIPE_BIND_VIDEO_DECODE_DPB |
| - radeonsi/vcn: Create encode DPB surfaces with PIPE_BIND_VIDEO_ENCODE_DPB |
| - frontends/va: Add support for VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_3 |
| - frontends/va: Store picture type for buffers in encode DPB |
| - radeonsi/vcn: Don't allow encoding H264 B-frame references |
| - frontends/va: Move mjpeg sampling_factor to pipe_mjpeg_picture_desc |
| - radeonsi/vcn: Remove code handling buffer_get_virtual_address failure |
| - radeonsi/vcn: Unmap bitstream buffer in radeon_dec_destroy |
| - radeonsi/vcn: Gracefully handle decode errors and report to frontend |
| - radeonsi/vcn: Make sure JPEG target buffer format matches sampling factor |
| - radeonsi/vcn: Cleanup JPEG supported formats |
| - radeonsi/vpe: Silence expected errors with unsupported output format |
| - gallium/vl: Add plane order for Y8_400 format |
| - gallium/vl: Fix plane order for IYUV format |
| - frontends/va: Stop converting formats in Put/GetImage |
| - radeonsi: Update minimum supported encode size for VCN5 |
| - radeonsi/vcn: Align bitstream buffer to 128 when resizing |
| - radeonsi/uvd: Align bitstream buffer to 128 when resizing |
| - radeonsi/vcn: Enable write combine for decode |
| - radeonsi/vcn: Don't keep last fence |
| - radeonsi/vcn: Use local variable for destroy fence |
| - pipe: Remove PIPE_DEFAULT_DECODER_FEEDBACK_TIMEOUT_NS |
| - frontends/va: Get AV1 decode subsampling_x/y |
| - radeonsi/vcn: Return error when decoding 12bit VP9 and 4:2:2/4:4:4 AV1 |
| - frontends/va: Fix decoding VC1 interlaced video |
| - frontends/va: Don't allow Render/EndPicture without BeginPicture |
| - frontends/va: Don't allow EndPicture without calling driver begin_frame |
| - ac/parse_ib: Parse VCN IB_COMMON_OP_WRITEMEMORY |
| - radv/amdgpu: Set VCN version for ac_parse_ib |
| - frontends/va: Fix deinterlace filter |
| - radeonsi/vcn: Change required FW version for rc_per_pic_ex on VCN3 |
| - radv/video: Fix DPB tier2 surface params |
| - radv/video: Use correct array index for decode target and DPB images |
| - radv/video: Remove dt_field_mode handling code |
| - radv: Fix sampling from image layers of video decode target |
| - ac/surface: Don't force linear for VIDEO_REFERENCE with emulated image opcodes |
| - frontends/va: Get buffer feedback with locked mutex in MapBuffer |
| - radeonsi/vcn: Use compute only context |
| - gallium/vl: Fix unbinding sampler views |
| - gallium/vl: Create sampler state also when gfx is not supported |
| - gallium/vl: Add rgba compute shader |
| - gallium/vl: Add param to create compute only vl_compositor |
| - gallium: Add param to create compute only multimedia context |
| - frontends/va: Use compute only context if driver prefers compute |
| - radeonsi/vcn: Fix crash when failing to allocate internal buffers |
| - frontends/va: Only report surface alignment when non-zero |
| - frontends/va: Allow creating DRM PRIME surfaces without surface descriptor |
| - frontends/va: Set csc matrix in PutSurface |
| - gallium/vl: Fix creating buffers with auxiliary planes |
| - radeonsi: Add radeon_bitstream and use it in radeon_vcn_enc |
| - radeonsi/vce: Remove support for FW 50 and older |
| - radeonsi/vce: Set more header params |
| - radeonsi/vce: Move dual pipe context to offset 0 of CPB |
| - radeonsi/vce: Use app DPB management |
| - radeonsi/vce: Support slice encoding |
| - radeonsi/vce: Support VBAQ |
| - radeonsi/vce: Support quality presets |
| - radeonsi/vce: Support min/max QP and max frame size |
| - radeonsi/vce: Support intra refresh |
| - radeonsi/vce: Support raw packed headers |
| - radeonsi/vce: Set input pic swizzle mode on GFX9 |
| - radeonsi/vce: Cleanup |
| - radeonsi/uvd: Stop clearing decode internal buffers |
| - radeonsi/uvd: Optimize bitstream buffer resizing |
| - radeonsi/uvd: Set decode target swizzle mode on GFX9 |
| - radeonsi/uvd_enc: Rework DPB allocation |
| - radeonsi/uvd_enc: Use app DPB management |
| - radeonsi/uvd_enc: Consider input surface size for padding |
| - radeonsi/uvd_enc: Support Pre-Encode |
| - radeonsi/uvd_enc: Support VBAQ |
| - radeonsi/uvd_enc: Support quality presets |
| - radeonsi/uvd_enc: Support slice encoding |
| - radeonsi/uvd_enc: Support intra refresh |
| - radeonsi/uvd_enc: Support temporal layer rate control |
| - radeonsi/uvd_enc: Support min/max QP and max frame size |
| - radeonsi/uvd_enc: Support dynamic rate control changes |
| - radeonsi/uvd_enc: Support raw packed headers |
| - radeonsi/uvd_enc: Set input pic swizzle mode on GFX9 |
| - radeonsi: Enable implemented VCE/UVD encode features |
| - gallium/vl: Fix sampler view components for Y8_400 format |
| - gallium/vl: Add vl compositor layer mirror |
| - gallium/vl: Clear remaining planes in YUV conversion |
| - gallium/vl: Use matrix for scale and crop in cs compositor |
| - gallium/vl: Implement rotation and mirror in cs compositor |
| - frontends/va: Simplify format check in PutSurface |
| - frontends/va: Disable color conversion for luma-only source formats |
| - frontends/va: Stop using util_compute_blit |
| - frontends/va: Refactor vlVaPostProcCompositor to be usable outside processing |
| - frontends/va: Support rotation and mirror for processing |
| - frontends/va: Implement format conversions in PutImage/GetImage |
| - gallium/auxiliary: Remove util_compute_blit |
| - radeonsi: Fix reporting support for AV1 Profile2 |
| - radeonsi/vcn: Fix AV1 coded size for VCN 5.0 |
| - radeonsi: Report surface alignment for AV1 encode |
| - gallium/vl: Add compute shader deinterlace filter |
| - frontends/va: Stop using extra context for deinterlacing |
| - frontends/va: Implement QuerySurfaceStatus as SyncSurface with 0 timeout |
| - frontends/va: Don't flush before resource_get_handle |
| - frontends/va: Remove vlVaBuffer derived_image_buffer |
| - frontends/va: Add surface pipe_fence for vl_compositor rendering |
| - gallium/vl: Don't flush in vl_compositor yuv_deint and rgb_to_yuv |
| - frontends/va: Add context mutex |
| - frontends/va: Unlock driver mutex for SyncSurface/Buffer fence wait |
| - frontends/va: Fix decoding VC1 streams with multiple slices |
| - ac/vcn_dec: Fix AV1 film grain on VCN5 |
| - radeonsi/video: Avoid stream handle duplicates in PID namespace |
| - frontends/vdpau: Set H264 chroma_format_idc |
| - radeonsi/vcn: Set correct chroma format for H264 decode |
| - radeonsi/uvd: Set correct chroma format for H264 decode |
| - radv/video: Fix setting balanced preset for HEVC encode with SAO enabled |
| - radv/video: Move IB header from begin/end to encode_video |
| |
| David Tobolik (2): |
| |
| - rusticl/style: use Arc::clone instead of .clone() |
| - rusticl/style: add util for conversion with err |
| |
| Deborah Brouwer (36): |
| |
| - freedreno/ci: add prefix for a630-vk-asan tests |
| - ci: Remove duplicate slash before $RESULTS_DIR |
| - ci/b2c: update RESULTS_DIR for .b2c-test jobs |
| - ci: add a tool to summarize a failed pipeline |
| - ci/pipeline_message: add unit tests for tool |
| - ci: move pipeline_summary tool to .marge/hooks |
| - ci: debian/x86_64_pyutils remove redundant rules |
| - ci: python-test rename artifacts |
| - ci: yaml-toml-shell-test: use pyutils container |
| - ci: separate python tests and artifacts |
| - ci: post gantt: use logging instead of print |
| - ci: add some static typing to the gantt scripts |
| - ci: make the gantt scripts available as modules |
| - ci: post gantt: add --marge-user-id option |
| - ci: post gantt: add --project-id option |
| - ci: post gantt: add pipeline-id to gantt filename |
| - ci: post gantt: ignore pipeline_summary message |
| - ci: gantt chart: include in-progress jobs |
| - ci: add --ci-timeout option for gantt scripts |
| - ci: add pytests for the gantt chart scripts |
| - ci: update token retrieval method for gantt charts |
| - ci: collapse yamllint and shellcheck sections |
| - ci: run-pytest.sh: allow script to run locally |
| - ci: add .flake8 linting to ci scripts and tests |
| - ci: update_traces_checksum: fix E501 line too long |
| - ci: update the pyutils container |
| - ci: stop using a venv for run-pytest.sh |
| - ci: set python version 3.11 for run-pytest.sh |
| - ci: pipeline_message: catch module loading errors |
| - ci: pipeline_message: improve job list formatting |
| - ci: pipeline_message: add test to parse error logs |
| - ci: pipeline_message: ignore \`error_type` errors |
| - ci: pipeline_message: ignore harmless build logs |
| - ci: pipeline_message: ignore \`generated` errors |
| - ci: pipeline_message: parse \`fatal` messages |
| - ci: pipeline_message: reset empty errors |
| |
| Derek Foreman (3): |
| |
| - vulkan/wsi/wayland: Fix time calculation |
| - vulkan/wsi/wayland: Avoid spurious discard event at startup |
| - vulkan/wsi/wayland: Move timing calculations to the swapchain |
| |
| Detlev Casanova (3): |
| |
| - ci/fluster/lava: Add fluster in LAVA rootfs |
| - ci/fluster: Add radeonsi-raven-vaapi-fluster jobs |
| - ci/deqp-runner: uprev from 0.20.2 to 0.20.3 |
| |
| Dylan Baker (25): |
| |
| - VERSION: bump to 25.0 |
| - docs: reset new_features.txt |
| - docs/release-calendar: update one more time for pushed back release |
| - docs: add release notes for 24.3.0 |
| - docs/relnotes/24.3.0: Add SHA sums |
| - docs/release-calendar: remove 24.3 RC dates |
| - docs: Add calendar entries for 24.3 release. |
| - anv: advertise Vulkan 1.4 |
| - anv: bump max number of push constants to 256 |
| - anv: Add new Vulkan 1.4 features and properties |
| - anv: bump conformance version to 1.4 |
| - maintainer-scripts: Bump Vulkan release version to 1.4 |
| - docs: add release notes for 24.3.1 |
| - docs: Add SHA sums for 24.3.1 |
| - docs: update calendar for 24.3.1 |
| - clc: Tell clang to track imported dependencies |
| - docs: add release notes for 24.3.2 |
| - docs: Update checksums for 24.3.2 |
| - docs: update calendar for 24.3.2 |
| - docs/release-calendar: Move next release to January 2nd |
| - intel/tests: Fix coverity warning about possibly leaked memory |
| - intel/tests: Fix missing assignment of error condition |
| - docs: add release notes for 24.3.3 |
| - docs: Add SHA sums to 24.3.3 release notes |
| - docs: update calendar for 24.3.3 |
| |
| Eric Engestrom (139): |
| |
| - meson: bump spirv-tools version needed to v2022.1 |
| - radeonsi/ci: add more flakes seen recently |
| - radv/ci: add more flakes seen recently |
| - broadcom/ci: add more flakes seen recently |
| - freedreno/ci: add more flakes seen recently |
| - ci: upgrade the fedora image from 38 to 41 |
| - ci/build: drop "verify after bump to F39" as that did not help |
| - ci/build: add workaround for incorrect maybe-uninitialized error |
| - ci: move error handling functions at the end |
| - ci: use quiet alias for commands |
| - ci: make error handling quieter |
| - broadcom/ci: add flakes seen recently |
| - freedreno/ci: add flakes seen recently |
| - nvk+zink/ci: add flakes seen recently |
| - radv+zink/ci: add flakes seen recently |
| - ci: raise priority of release manager pipelines |
| - ci: reduce priority of nightly pipeline jobs from 50 to 45 |
| - meson: move openmp block out of the middle of the x11 deps block |
| - meson: define only once the versions of the x11 deps |
| - radv/ci: document flakes seen recently |
| - broadcom/ci: document flakes seen recently |
| - nvk/ci: document flakes seen recently |
| - freedreno/ci: document flakes seen recently |
| - docs: update calendar for 24.2.7 |
| - docs: add release notes for 24.2.7 |
| - docs: add sha sum for 24.2.7 |
| - turnip/ci: document regression |
| - ci/crosvm: remove noise inside deqp-runner output |
| - v3dv/ci: mark whole group as flaky |
| - docs: fix invalid expression in new pipe cap |
| - docs: fix invalid expression in teflon docs |
| - intel/ci: disable CML jobs because of networking issues |
| - intel/ci: add missing .intel-common-manual-rules to .{iris,crocus,i915g}-manual-rules |
| - ci/build: drop mold wrapper for \`ninja install` |
| - ci: drop override forcing ld to be gold (and forcing gold to be installed everywhere) |
| - ci: when installing mold, make its use automatic |
| - ci: bump image tags |
| - radeonsi/ci: drop two failures that are mysteriously fixed by using mold? |
| - ci/container: move deqp build section into the script itself |
| - ci/container: move apitrace build section into the script itself |
| - ci/container: move crosvm build section into the script itself |
| - ci/container: move deqp-runner build section into the script itself |
| - ci/container: move fossilize build section into the script itself |
| - ci/container: move gfxreconstruct build section into the script itself |
| - ci/container: move kdl build section into the script itself |
| - ci/container: move libclc build section into the script itself |
| - ci/container: move llvm-spirv build section into the script itself |
| - ci/container: move mold build section into the script itself |
| - ci/container: move ninetests build section into the script itself |
| - ci/container: move piglit build section into the script itself |
| - ci/container: move rust build section into the script itself |
| - ci/container: move vkd3d-proton build section into the script itself |
| - ci/container: move vulkan-validation build section into the script itself |
| - ci/container: move wayland build section into the script itself |
| - ci/container: add sections around the other build scripts |
| - ci/container: close debian_{setup,cleanup} sections |
| - ci/lava: add setup-test-env.sh to the rootfs |
| - ci/container: add section around strip-rootfs.sh |
| - ci: bump image tags |
| - zink+nvk/ci: fix deqp binary used for gles tests |
| - zink+radv/ci: fix deqp binary used for gles tests |
| - ci/deqp: move testlog-to-* tools to /deqp |
| - ci/deqp: only compress caselists when they exist |
| - ci/deqp: build testlog tools on android |
| - ci/deqp: fetch & checkout exactly the commit/tag/branch requested |
| - ci/deqp: avoid downloading 1.47 GiB multiple times |
| - ci/deqp: error out in case of invalid build API |
| - ci/deqp: build glcts in gles build, for gles*-khr tests |
| - ci/deqp: add build of \`main` branch |
| - ci/deqp: make sure the main commit is actually from the main branch |
| - ci/deqp: fully isolate deqp builds |
| - ci: bump image tags |
| - ci/container: setup sections in all image builds |
| - radv/ci: document regression of test_shader_sm66_is_helper_lane in 7469f99e...25b8f4f7 |
| - meson: simplify logic a bit |
| - meson: drop unused variables |
| - meson: reuse variable |
| - meson/megadriver: s/_/-/ in an argument name to be consistent |
| - meson/megadriver: simplify setting common megadriver arguments |
| - meson/megadriver: support various lib suffixes |
| - ci/deqp: simplify paths since we are already in /deqp-$deqp_api/ |
| - ci/deqp: fix the "is this a build on main?" check |
| - ci/deqp: support having commit backports and local patches for main too |
| - ci/deqp: simplify generating the version description file |
| - ci/deqp: mention the deqp api in the version string |
| - ci/deqp: only print the commit list header when the list is not empty |
| - ci/lava: turn the $BUILD_VK check into a proper if block |
| - ci/deqp: add a deqp-vk build on the \`main` branch |
| - ci: bump image tags |
| - radv/ci: use deqp-vk-main in radv jobs |
| - docs: update calendar for 24.2.8 |
| - docs: add release notes for 24.2.8 |
| - docs: add sha sum for 24.2.8 |
| - ci/meson: make meson wrap fallback list more readable |
| - ci/meson: add FORCE_FALLBACK_FOR variable for build jobs to use |
| - docs/release-calendar: add 25.0 branchpoint and RCs schedule |
| - docs/release-calendar: fixup sed fail |
| - docs/release-calendar: push the 25.0 branchpoint back by 2 weeks |
| - docs: update calendar for 24.3.4 |
| - docs: add release notes for 24.3.4 |
| - docs: add sha sum for 24.3.4 |
| - docs/release-calendar: push back the 24.3.x releases by one week |
| - docs: update url to vulkan features & extensions |
| - anv,gfxstream,panvk,zink: update urls to vulkan docs |
| - radv,lvp: fix url to VkAabbPositionsKHR docs |
| - ci: make linker warnings fatal |
| - VERSION: bump for 25.0.0-rc1 |
| - [25.0-only] hk: comment out dead variable |
| - .pick_status.json: Update to 5b856a741d6dc18d409a0c06ad6492cc3ee9a6bd |
| - .pick_status.json: Mark 0ee5015da4c386c0ef8b6ff12fd2bb34022d86a6 as denominated |
| - .pick_status.json: Update to e49df902b4c1b98569921d8b858e6e3855bf10e0 |
| - .pick_status.json: Update to e192d7d615dec9c9c04447c4b9ab0244d6380944 |
| - .pick_status.json: Mark 39969409f6fb60b21aea36be4d5424718fcc26b8 as denominated |
| - VERSION: bump for 25.0.0-rc2 |
| - .pick_status.json: Update to fdaf7c7b9647874e66e79653050f9d0999dc9134 |
| - docs/android: drop libglapi.so now that it's gone |
| - .pick_status.json: Mark 5f54beb30728f6510ce50071ddaef5f9157b16ef as denominated |
| - gfxstream: fix signedness of shifts |
| - gfxstream: drop dead variables |
| - gfxstream: use \`range` variable for its intended purpose |
| - gfxstream: mark unused variables as such |
| - .pick_status.json: Update to ee9edd46254884ab7fe6c96518e23d421d5f5344 |
| - llvmpipe/tests: include math.h for INFINITY |
| - ci: don't run on tag pipelines |
| - ci: only trigger the CI for release managers when pushing to staging branch |
| - .pick_status.json: Update to 18f0807408425da11cb1d8cd1d73de369317440d |
| - .pick_status.json: Update to 30a3d567c8b996fde86b07d2bad018013a54ff44 |
| - ci: run containers builds on staging branches |
| - .pick_status.json: Mark 13e987669ccee373948753e113e9ce7e9bdbef55 as denominated |
| - VERSION: bump for 25.0.0-rc3 |
| - .pick_status.json: Update to e41438275e005bbb20fc9c8115d7d29343c292d8 |
| - ci: debian-testing-ubsan is used by tests |
| - ci/yaml-toml-shell-py-test: don't run on post-merge pipelines |
| - ci/yaml-toml-shell-py-test: run on direct push pipelines |
| - .pick_status.json: Update to a9b6a54a8cce0aab44c81ea4821ee564b939ea51 |
| - .pick_status.json: Update to 06d8afff640c66e51517bf4bebd2a58abb2fa055 |
| - .pick_status.json: Update to 2361ed27f34774f0a73324915a9ddb57f43e112a |
| - .pick_status.json: Update to 56aac9fdecad0f7d335f82653832927486f07d44 |
| - .pick_status.json: Update to 6b20b0658489afe745a28b8f09c57067e45b47f3 |
| |
| Eric R. Smith (28): |
| |
| - util: rename PIPE_FORMAT_Y8_U8V8_422_UNORM |
| - dri, mesa: fix NV16 texture format |
| - egl, mesa: add support for NV15 and NV20 textures |
| - dri: fix NV15 and NV20 definitions to make sure they will be used |
| - panfrost: add panfrost support for NV15, NV16 and NV20 |
| - panvk: fix depth bias calculation |
| - panfrost: add a perf warning when resources need to be converted |
| - panfrost: convert resources before binding them to images |
| - panfrost: check afbc status in panfrost_query_compression_modifiers |
| - mesa: when blitting between formats clear any unused components |
| - aux: add support for dumping the swizzle in pipe_blit_info |
| - mesa: update more drivers to handle pipe_blit_info swizzle_enable |
| - format: Add R8_G8B8_422_UNORM format |
| - panvk: update feature support |
| - panvk: split device and instance version numbers |
| - panvk: advertise version 1.1 support |
| - panfrost: fix read/write resource confusion in afbc_pack |
| - panfrost: fix potential memory leak |
| - panvk: fix fs_required() |
| - panfrost: apply DEPTH_STENCIL flag consistently |
| - panfrost: Allow ATEST input to be a FAU index |
| - panfrost: ensure sample_mask is written before color |
| - panvk: re-enable fragmentStoresAndAtomics for v10 |
| - drm-uapi: update drm_fourcc.h to latest version |
| - panfrost: support MTK 16L32S detiling |
| - panfrost: avoid potential divide by 0 calculating timer_resolution |
| - panfrost: fix YUV center information for 422 |
| - panfrost: fix backward propagation of values in loops |
| |
| Erico Nunes (2): |
| |
| - ci/lima: update piglit ci expectations |
| - ci/lima: enable again |
| |
| Erik Faye-Lund (134): |
| |
| - panvk: drop unused include |
| - panfrost: use mesa_log infra instead of stdio |
| - glx: avoid null-deref |
| - panfrost: use 64-bits for layout calculations |
| - panvk: set correct max extents for images |
| - panvk: support binding swapchain memory |
| - panvk: wire up swapchain image creation |
| - panvk: remove duplicate property |
| - panvk: implement sampleRateShading |
| - panvk: check for maxResourceSize-overflow in vkCreateImage |
| - panvk: document reason for maxResourceSize-limit |
| - docs: mark GL_ARB_shader_subroutine as always supported |
| - docs: mark GL_ARB_get_program_binary as always supported |
| - docs: update GL_OES_shader_image_atomic support |
| - docs: update GL_ARB_multi_draw_indirect support |
| - docs: refer to panfrost by version |
| - docs: fixup a few mistakes with panfrost |
| - docs: add missing panfrost extensions |
| - lima: fixup typo |
| - lima: add assert to validate list-length |
| - lima: avoid memleak on error |
| - panfrost: sanity-check alignment |
| - panvk: correct signedness of timestamps |
| - panvk: widen type before multiplying |
| - mesa/main: properly check for EXT_memory_object |
| - mesa/main: properly check for EXT_memory_object_fd |
| - mesa/main: properly check for EXT_memory_object_win32 |
| - mesa/main: properly check for EXT_semaphore |
| - mesa/main: properly check for EXT_semaphore_win32 |
| - st/mesa: check requirements for MESA_texture_const_bandwidth |
| - mesa: error-check GL_TEXTURE_TILING_EXT params |
| - panvk: report minmax-support for sampled formats |
| - panvk: expose KHR_dedicated_allocation |
| - vulkan/meta: plug a couple of memory leaks |
| - panvk: free preload-shaders after compiling |
| - panvk, nvk: spell width correctly |
| - panvk/ci: correct name of skips-file |
| - panvk/ci: remove duplicate skips |
| - panvk/ci: add some missing skips |
| - panvk/ci: update ci results for g610 |
| - panvk/ci: add a few flakes |
| - panvk/ci: add a full panvk job |
| - panfrost: match 4-bit format order |
| - panfrost: add missing 4-bit formats |
| - panvk: expose EXT_4444_formats |
| - panvk/ci: update g52 results |
| - panvk/ci: update g610 results |
| - panvk: expose scalarBlockLayout |
| - panvk/ci: remove duplicate skips |
| - panvk/ci: update g52 results |
| - panvk/ci: update g52-vk-full job |
| - panvk: do not expose subgroup support |
| - panvk: disable imageCubeArray on bifrost |
| - panvk: soften the language around opt-in |
| - panvk: do not require opt-in for panvk on v10 |
| - panvk/ci: correct timeouts as crash |
| - panvk/ci: fixup g52 skip sorting |
| - panvk/ci: add a few more g52 skips |
| - panvk: fixup bad indent |
| - panvk: only validate the push-sets that we update |
| - panvk: back out of vk 1.1 support |
| - panvk: make vk-version helper internal to source |
| - docs: add new panvk features |
| - panvk: fix image size for cube-arrays on bifrost |
| - Revert "panvk: disable imageCubeArray on bifrost" |
| - st/mesa: document ARB_texture_float quirk |
| - pan/cs: fix broken allocation-failure check |
| - panfrost: clean up mmap-diagnostics |
| - panfrost: report errors from panfrost_bo_mmap |
| - panfrost: handle mmap failures |
| - panfrost: handle NULL-batches |
| - panfrost: propagate cs_builder error instead of asserting |
| - panfrost: handle pool-allocation errors |
| - panfrost: handle errors allocating csf oom-handler |
| - panfrost: try to survive start-up alloc fails |
| - pan/ci: update t860 ci xfails |
| - panvk: drop fragmentStoresAndAtomics support for now |
| - vulkan: add vk_descriptor_type_is_dynamic helper |
| - v3dv: use vk_descriptor_type_is_dynamic |
| - turnip: use vk_descriptor_type_is_dynamic |
| - dozen: use vk_descriptor_type_is_dynamic |
| - panvk: use vk_descriptor_type_is_dynamic |
| - radv: use vk_descriptor_type_is_dynamic |
| - asahi: use vk_descriptor_type_is_dynamic |
| - turnip: use vk_descriptor_type_is_dynamic |
| - pvr: use vk_descriptor_type_is_dynamic |
| - panvk: use vk_descriptor_type_is_dynamic |
| - lavapipe: use vk_descriptor_type_is_dynamic |
| - anv: use vk_descriptor_type_is_dynamic |
| - hasvk: use vk_descriptor_type_is_dynamic |
| - dozen: use vk_descriptor_type_is_dynamic |
| - nvk: use vk_descriptor_type_is_dynamic |
| - panvk/ci: update expected failures |
| - docs: fixup broken markup |
| - docs: fixup link in radv docs |
| - docs/ci: treat warnings as errors |
| - docs: update panvk status |
| - panvk/ci: drop needless envvar |
| - Revert "panfrost: Disable CRC by default" |
| - pan/ci: update t760 checksum |
| - pan/ci: update opencl expectations |
| - docs/panfrost: document vulkan support |
| - docs: update panvk status |
| - docs/features: fixup panvk KHR_shader_draw_parameters-support |
| - pan/va: fix base-level for nir_texop_lod |
| - pan/ci: add some occasional flakes |
| - docs/features: add a few missing extensions |
| - docs/features: mark panfrost as supporting GL_OES_texture_view |
| - pan/ci: drop empty trailing variables-list |
| - panfrost: reuse tiler hierarchy mask selection from panvk |
| - panfrost: limit maximum texture size |
| - panfrost: do not artificially limit texture-sizes |
| - pan/midgard: use macros for mir_prev_op / mir_next_op |
| - pan/midgard: constify pointers |
| - pan/compiler: don't pass midgard_instruction by value |
| - panvk: expose subgroup operations |
| - panvk: expose vk1.1 on v10 hardware |
| - pan/bi: bump iter_count to 2000 |
| - panvk: do not expose EXT_subgroup_size_control on bifrost |
| - panvk/ci: update expected failures |
| - panfrost: mark helper as static |
| - panfrost: handle allocation errors when afbc-packing |
| - panfrost: unify emit_tls and emit_fbd |
| - panfrost: propagate scratchpad allocation errors |
| - panfrost: propagate errors from panfrost_batch_create_bo |
| - panfrost: in-place map/unmap shouldn't grow |
| - gallium/aux: do not assert on map-failures |
| - meson: build panvk by default on arm |
| - panvk: fix line-rasterization of bifrost |
| - panvk/ci: add back incorrectly removed crash |
| - pan/ci: add flaky tests to the flake-list |
| - pan/ci: add fail from llvm 19 upgrade |
| - panvk: correct number of read bytes for dynamic buffers |
| - panvk: report passing the VK CTS |
| |
| Ernst Persson (1): |
| |
| - intel/vulkan: Add bvh build dependency |
| |
| Evan (1): |
| |
| - amd/vpelib: Shaper Refactor |
| |
| Faith Ekstrand (27): |
| |
| - vulkan: Allow the same item to show up twice in core version <requires> |
| - vulkan: Add Vulkan 1.4 feature aliases |
| - treewide: Stop putting enum in front of Vulkan enum types |
| - vulkan: Update XML and headers to 1.4.303 |
| - nvk: Increase push constant space to 256B |
| - nvk: No-op implement VK_KHR_global_priority |
| - nvk: Add new Vulkan 1.4 features and properties |
| - nvk: Advertise Vulkan 1.4 |
| - nvk: Only support Vulkan 1.4 on Turing+ |
| - nvk: Move Vulkan 1.4 features to the 1.4 section |
| - nvk: Move Vulkan 1.4 properties to the 1.4 section |
| - nvk: Set a command buffer error if pushbuf alloc fails |
| - nvk: Call nir_opt_access |
| - nak: Use ldc.constant for load_global when CAN_REORDER is set |
| - nvk: Handle pCounterBuffers == NULL in Begin/EndTransformFeedback |
| - nvk: Fix scissor bounds |
| - nvk: Rename nvk_descriptor_set::mapped_ptr |
| - nvk: Respect VK_DESCRIPTOR_POOL_CREATE_HOST_ONLY_BIT_EXT |
| - nvk: Implement descriptorBufferPushDescriptors |
| - nvk: Pull shaders from the state command buffer in nvk_cmd_process_cmds() |
| - nvk: Handle shader==NULL in nvk_cmd_upload_qmd() |
| - nvk: Allow sparse loads on EDB buffers |
| - nak: Handle sparse texops with unused color destinations |
| - nvk: Use suld for EDB uniform texel buffers |
| - nvk: Align UBO/SSBO addresses down rather than up |
| - nak: Use suld.constant when ACCESS_CAN_REORDER is set |
| - nvk: Use suld.constant for EDB uniform texel buffers |
| |
| Felix DeGrood (6): |
| |
| - iris: Use vfg distribution mode = RR_STRICT for Xe2+ |
| - anv: Use vfg distribution mode = RR_STRICT for Xe2+ |
| - anv: allow compressed buffers types on vkd3d titles |
| - anv: remove unnecessary driconf entries for anv_enable_buffer_comp |
| - vk/overlay-layer: defer log creation to swapchain creation |
| - intel/perf: add new perf consts to support more metrics |
| |
| Feng Jiang (2): |
| |
| - virgl: Ensure that PIPE_SHADER_CAP_MAX_CONST_BUFFERS is less than PIPE_MAX_CONSTANT_BUFFERS |
| - radv/rt: Fix memleak in radv_init_header() |
| |
| Francisco Jerez (27): |
| |
| - intel/fs/xe2: Fix up subdword integer region restriction with strided byte src and packed byte dst. |
| - intel/brw/xe3+: Relax SEND EOT register assignment restrictions. |
| - intel/brw: Saturate shifted subgroup index to avoid reading past the end of register file. |
| - intel/brw: Use urb_read_length instead of nr_attribute_slots to calculate VS first_non_payload_grf. |
| - intel/brw/xe3+: Mask subgroup shuffle index to be within valid range to avoid VRT hangs. |
| - anv/gfx12.5: Request subgroup size 8 for RT trampoline shader. |
| - intel/brw: Allow specifying a required subgroup size for fragment shaders. |
| - intel/blorp: Specify a subgroup size requirement of 16 for fast clear or repclear shaders. |
| - intel/common/xe2+: Allow SIMD32 PS for all multisample cases. |
| - intel/brw/xe3: Define XE3_MAX_GRF. |
| - intel/brw/xe3: Extend regalloc sets to maximum Xe3 GRF size. |
| - intel/brw/xe3+: Bump number of SBID tokens for Xe3. |
| - intel/brw/xe3+: Disable round-robin allocation heuristic on Xe3+. |
| - intel/brw: Indent body of brw_compile_fs() not applicable to xe3+. |
| - intel/brw: Indent conditional block from brw_compile_fs() not applicable to Xe2+. |
| - intel/brw: Exit early from run_fs() if compilation failed before optimization loop. |
| - intel/brw/xe3+: brw_compile_fs() implementation for Xe3+. |
| - intel/brw/xe3+: Optimize CS/TASK/MESH compile time optimistically assuming SIMD32. |
| - intel/brw: Report number of GRF registers used in brw_stage_prog_data. |
| - intel/brw: Define ptl_register_blocks() helper. |
| - intel/genxml/xe3+: Update definitions for shader state setup. |
| - iris/xe3+: Set RegistersPerThread during shader state setup based on prog_data. |
| - intel/blorp/xe3+: Set RegistersPerThread during shader state setup based on prog_data. |
| - anv/xe3+: Set RegistersPerThread during shader state setup based on prog_data. |
| - anv/xe3+: Set RegistersPerThread for bindless shader dispatch. |
| - iris/xe3+: Enable VRT. |
| - anv/xe3+: Enable VRT. |
| |
| Frank Binns (2): |
| |
| - pvr: add TI j721s2 as a supported device |
| - pvr: add 36.53.104.796 (BXS-4-64) to the list of supported GPUs |
| |
| Friedrich Vock (15): |
| |
| - vulkan/rmv: Correctly set heap size |
| - vulkan/runtime/bvh: Set leaf_node_count for updates |
| - radv,driconf: Apply DOOM Eternal/idTech workarounds for Indiana Jones |
| - aco/lower_to_hw_instr: Check the right instruction's opcode |
| - radv/rt: Remove nir_intrinsic_execute_callable instrs in monolithic mode |
| - aco: Fix dead instruction/index handling for try_insert_saveexec_out_of_loop |
| - nir: Serialize all parameter attributes |
| - nir,vtn: Add return info to parameters |
| - nir: Add parameter divergence info |
| - vtn: Set parameter type in glsl_type_add_to_function_params |
| - nir: Add indirect calls |
| - nir: Apply passes to all functions |
| - nir: Add nir_instr_is_before helper |
| - nir: Free liveness info when invalidating metadata |
| - nir: Add indirect call optimizations |
| |
| GKraats (1): |
| |
| - i915g: fix glClearColor using a 1 byte color format |
| |
| Georg Lehmann (79): |
| |
| - radv: run copy prop before vectorizing |
| - nir/opt_16bit_tex_image: optimize extract half sources |
| - nir: add nir_def_all_uses_ignore_sign_bit |
| - pan/bi: use nir_def_all_uses_ignore_sign_bit |
| - aco: use nir_def_all_uses_ignore_sign_bit |
| - nir: handle fmul(a,a)/ffma(a,a,b) in nir_def_all_uses_ignore_sign_bit |
| - aco/gfx8: use ds_swizzle_b32 rotate mode |
| - nir: return def for debug info in nir_instr_def |
| - nir/instr_set: replace nir_instr_get_def_def with nir_instr_def |
| - nir/instr_set: support instrs with no def |
| - nir: cse terminate/demote |
| - nir/opt_undef: replace undef in a separate pass |
| - nir/opt_undef: use some nir helpers |
| - nir/opt_undef: keep undefs used by partial undef vectors |
| - nir/opt_undef: handle unpack/pack like mov/vec |
| - aco/isel: use undef Operands for p_create_vector created from nir vecs |
| - util: add BITSET_LAST_BIT_BEFORE |
| - nir/move_discards_to_top: single final iteration |
| - nir/move_discards_to_top: don't move across is_helper_invocation |
| - radv/ci: document test_shader_sm66_is_helper_lane as fixed |
| - freedreno/ci: update a630 KSP checksum |
| - nir/opt_intrinsic: rework sample mask opt with vector alu |
| - nir/opt_intrinsic: fix sample mask opt with demote |
| - radv: optimize sample mask comparisons |
| - aco/optimizer: label fcanonicalize like a copy if there is nothing to flush |
| - nir/opt_algebraic: optimize ffma(b2f, b2f, c) |
| - nir/opt_algebraic: optimize d3d9 ftrunc |
| - nir/opt_algebraic: optimize d3d9 ceil |
| - nir/opt_algebraic: mark a - ffract(a) as nan incorrect. |
| - radv: fix reporting mesh/task/rt as supported dgc indirect stages |
| - radv: rework vk_property initialization |
| - aco/gfx12: disable vinterp ddx/ddy optimization |
| - aco/gfx12+: do not use v_pack_b32_f16 to pack untyped data |
| - radeonsi/ci: add vangogh ubo fail |
| - zink: spec\@ext_framebuffer_multisample\@blit-mismatched-formats was fixed |
| - aco/gfx11+: use v_and_b32 to extract local id 0 |
| - radv: track holes in the clip/cull masks |
| - nir: add constant clip/cull distance optimization |
| - radv: use nir_opt_clip_cull_const |
| - nir/uub: properly limit float support to 32bit |
| - nir: add unsigned upper bound support for f2i32 |
| - nir: add unsigned upper bound support for fsat |
| - aco/gfx12: don't assume memory operations complete in order |
| - aco/ra: don't write to exec/ttmp with mulk/addk/cmovk |
| - aco/ra: disallow s_cmpk with scc operand |
| - aco/ra: don't write to scc/ttmp with s_fmac |
| - nir/opt_remove_phis: rematerialize equal alu |
| - nir/opt_algebraic: optimize min(max(a, b), a) |
| - nir: optimize unpacking 8bit values from a 64bit source |
| - aco/isel: skip and(exec) for top level demote_if/terminate_if |
| - aco: rename p_early_exit_if to if_not |
| - aco: allow p_exit_early_if_not with exec condition |
| - aco/insert_exec: exit shader using exec for top level discard |
| - aco: create v_cmpx with s_andn2(exec, v_cmp) |
| - nir: sink/move alu with two identical, non constant sources. |
| - amd: switch to FRONT_FACE_ALL_BITS(0) |
| - nir: add load_front_face_fsign |
| - amd: support load_front_face_fsign |
| - nir: add nir_alu_srcs_negative_equal_typed |
| - nir,amd: optimize front_face ? a : -a |
| - aco/optimizer: fix signed extract of sub dword temps with SDWA |
| - aco/insert_exec: reset top exec for p_discard_if |
| - radv: run peephole_select in optimize_nir_algebraic |
| - nir/peephole_select: allow load_vector/scalar_arg_amd |
| - aco: guard small_vector move/copy operator against self assignment |
| - aco: support less trivial component types in small_vec |
| - aco: implement some more std::vector functions for small_vec |
| - nir/opt_algebraic: convert fadd(a, a) to a * 2.0 |
| - aco: update is_dual_issue_capable for gfx11.5+ |
| - aco/sched_ilp: continue open clauses |
| - aco/sched_ilp: add dependencies of later clause instrs more aggressively |
| - aco/sched_ilp: only remove WaW/WaR for inter clause dependencies |
| - aco/sched_ilp: reorder VINTRP |
| - aco/sched_ilp: new latency heuristic |
| - aco/sched_ilp: rename priority to wait_cycles |
| - aco/sched_ilp: use more realistic memory latencies |
| - aco/sched_ilp: base latency and issue cycles on aco_statistics |
| - nir: fix range analysis for frcp |
| - nir: fix frsq range analysis |
| |
| Gert Wollny (6): |
| |
| - virgl/vtest: take handle from host when using protocol version >=3 |
| - virgl/vtest: When trying to use protocol 3 check host feature |
| - virgl/vtest: change interface of virgl_vtest_submit_cmd |
| - virgl/vtest: Add support for creating blob resources |
| - ci: Uprev virglrenderer version |
| - radeon/evergreen: ensure equal sizes for depth-stencil npot textures |
| |
| Guilherme Gallo (9): |
| |
| - ci/lava: Set default exit code to 1 for failed jobs |
| - ci/lava: Improve exception handling for job failures |
| - ci/lava: Uprev freezegun |
| - ci/intel: Set HWCI modules for puff DUT |
| - ci/iris: Force UART for puff boards |
| - ci/iris: Rebalance iris-cml-deqp jobs |
| - ci/iris: Fix iris-cml-traces expectations |
| - ci/iris: Update iris-cml-deqp CI expectations |
| - ci/container: set up S3_JWT_FILE also for container jobs |
| |
| Gurchetan Singh (17): |
| |
| - util: add c++ guards to u_mm.h |
| - gfxstream: move isHostVisible function |
| - gfxstream: nuke android::base::SubAllocator |
| - gfxstream: use vulkan_lite_runtime |
| - gfxstream: nuke EntityManager.h include |
| - gfxstream: aemu: vendor it |
| - gfxstream: modify libaemu for Mesa use case |
| - gfxstream: guest: use internal version of AEMU headers + impls |
| - gfxstream: use canonical Mesa dependencies |
| - gfxstream: conditionals for using gfxstream::aemu |
| - gfxstream: delete qemu_pipe target |
| - gfxstream: for Android, look for the autogenerated files |
| - gfxstream: change output location |
| - gfxstream: remove abort() |
| - gfxstream: fix issues with VK1.4 build |
| - gfxstream: remove references to Fuchsia Goldfish |
| - gfxstream: fix some integration bugs |
| |
| Hans-Kristian Arntzen (11): |
| |
| - vulkan/wsi/wayland: Use X11-style image count strategy when using FIFO. |
| - radv: Fix missing gang barriers for task shaders. |
| - radv/winsys: Report VA mappings in bo_log too. |
| - radv: Add sparse mappings to radv_check_va.py. |
| - wsi/x11: Do not use allocation callbacks on a thread. |
| - wsi/wayland: Only use commit timing protocol alongside present time. |
| - wsi/wayland: Don't fallback to broken legacy throttling with FIFO |
| - wsi/wayland: Handle FIFO -> MAILBOX transitions correctly |
| - wsi/wayland: Remove unused present_mode member. |
| - wsi/wayland: Add forward progress guarantee for present wait. |
| - radv: Add radv_invariant_geom=true for Indiana Jones. |
| |
| Hsieh, Mike (1): |
| |
| - amd/vpelib: Refactor 3D LUT parameters |
| |
| Hyunjun Ko (10): |
| |
| - anv: define ANV_VIDEO_H264_MAX_DPB_SLOTS |
| - anv: Enable remapping picture ID |
| - anv: handle negative value of slot index for h265 decoding. |
| - intel/genxml: define MEMORYADDRESSATTRIBUTES for Gen12.5 with TILEF |
| - anv/video: Fix to return supported video format correctly. |
| - anv: calculate global parameters correctly for AV1 decoding |
| - anv: support in-loop super resolution for AV1 decoding |
| - anv: fix to set default cdf buf correctly. |
| - anv: change bool to VkResult |
| - anv: Fix to set CDEF filter flag correctly for AV1 decoding |
| |
| Iago Toral Quiroga (15): |
| |
| - v3d: add a V3D_DEBUG option to force synchronous execution of jobs |
| - broadcom: handle double buffer on V3D 7.1 tile size calculations |
| - v3d: group tile spec into a struct inside the job |
| - v3d: save a pointer to the TILE_BINNING_MODE_CFG packet in the CL |
| - v3d: do tile state BO allocation later |
| - v3d: only enable double-buffer for jobs where it might make sense |
| - v3dv: add missing support for double-buffer on V3D 7.x |
| - v3d: drop blank line |
| - v3d: store size of qpu program for compiled shaders |
| - broadcom: add helpers for double-buffer heuristic |
| - v3d: use heuristic to enable double-buffer mode |
| - v3dv: use the double buffer heuristic helpers |
| - broadcom: move double-buffer heuristic helpers to the compiler |
| - v3dv: fix missing access bit flag when checking for texel buffer reads |
| - v3dv: fix crash on 32-bit builds |
| |
| Ian Romanick (57): |
| |
| - brw/emit: Add correct 3-source instruction assertions for each platform |
| - brw/copy: Don't copy propagate through smaller entry dest size |
| - brw/cse: Don't eliminate instructions that write flags |
| - brw/lower: Don't emit spurious moves to or from NULL register |
| - brw/opt: Always do copy prop, DCE, and register coalesce after lower_regioning |
| - brw/opt: Always do both kinds of copy propagation before lower_load_payload |
| - brw/build: Add scalar_group() helper |
| - brw/lower: Lower invalid source conversion to better code |
| - Fix copy-and-paste bug in nir_lower_aapoint_impl |
| - brw/lower: Don't "fix" regioning of broadcast |
| - brw: Use resize_sources several more places |
| - brw/build: Use SIMD8 temporaries in emit_uniformize |
| - brw/copy: Allow copy prop into src1 of broadcast |
| - nir/algebraic: Optimize some trivial bfi |
| - brw/algebraic: Fix ADD constant folding |
| - brw/algebraic: Fix MUL constant folding |
| - brw/emit: Fix typo in recently added ADD3 assertion |
| - brw/algebraic: Partial constant folding of ADD3 |
| - brw/const: Allow mixing signed and unsigned immediate sources |
| - brw/copy: Don't try to be clever about ADD3 constant propagation |
| - brw: Emit immediate value for MAD in canonical position |
| - brw/copy: Commute immediates for MAD multiplicands |
| - brw/algebraic: Constant fold multiplicands of MAD |
| - brw/algebraic: Don't restrict MAD(a, b, 1) optimization to float32 |
| - brw/const: Refactor checking whether an immediate source is allowed |
| - brw/const: Allow constants in integer MAD |
| - brw/const: Allow HF constants in MAD on Gfx11 |
| - brw/const: Remove TODO that isn't allowed by the hardware |
| - brw/algebraic: Pull brw_constant_fold_instruction out of the switch statement |
| - brw/emit: Fix BROADCAST when value is uniform and index is immediate |
| - brw: Add devinfo parameter to fs_inst::regs_read |
| - brw: Basic infrastructure to store convergent values as scalars |
| - brw/lower: Allow uniform and scalar sources to many kinds of SEND |
| - brw/nir: Fix up handling of sources that might be convergent vectors |
| - brw/lower: Adjust source stride on DF is_scalar sources to MAD on Gfx9 |
| - brw/lower: Properly handle UNIFORM globals address in lower_trace_ray_logical_send |
| - brw/emit: Allow scalar sources to HF math instructions on Xe2 |
| - brw/nir: Prepare try_rebuild_source for scalar values |
| - brw/build: Prepare BROADCAST for scalar values |
| - brw/nir: Treat load_const as convergent |
| - brw/nir: Treat some load_uniform as convergent |
| - brw/nir: Treat load_workgroup_id as convergent |
| - brw/nir: Treat some ALU results as convergent |
| - brw/nir: Treat some load_ubo as convergent |
| - brw/nir: Treat load_inline_data_intel as convergent |
| - brw/nir: Treat load_reloc_const_intel as convergent |
| - brw/nir: Treat load_btd_{global,local}_arg_addr_intel and load_btd_shader_type_intel as convergent |
| - brw/nir: Treat load_*_uniform_block_intel as convergent |
| - brw/nir: Treat some resource_intel as convergent |
| - brw/nir: Eliminate nir_to_brw_state::uniform_values |
| - brw/nir: Don't try optimize around emit_uniformize |
| - brw/nir: Simplify get_nir_image_intrinsic_image and get_nir_buffer_intrinsic_index |
| - brw/nir: Treat some ballot as convergent |
| - brw/nir: Don't generate scalar byte to float conversions on DG2+ in optimize_extract_to_float |
| - iris: Add missing nir_metadata_preserve in iris_lower_storage_image_derefs |
| - crocus: Add missing nir_metadata_preserve in crocus_lower_storage_image_derefs |
| - brw/copy: Fix handling of offset in extract_imm |
| |
| Icenowy Zheng (4): |
| |
| - zink: do not set transform feedback bits when not available |
| - meson: prefer 'python3' to 'python' when finding python3 |
| - zink: emit consts as uint only on IMG proprietary drivers |
| - zink: use lazy descriptors for IMG proprietary drivers |
| |
| Igor Torrente (2): |
| |
| - Zink: Add NVK to the non \`driver_workarounds.implicit_sync` list |
| - NVK: Enable RW DMA-BUF export |
| |
| Ivan Avdeev (1): |
| |
| - radv: add a flag to indicate ray tracing support |
| |
| Iván Briano (6): |
| |
| - intel/rt: fix ray_query stack address calculation |
| - intel/decoder: fix INTEL_DEBUG=bat |
| - anv: remove unused/misleading/wrong parameters from the RT trampoline |
| - vulkan: calculate remaining layers of 2d view of 3d image correctly |
| - anv: disable logic op for float/srgb formats |
| - hasvk: disable logic op for float/srgb formats |
| |
| James Hogan (3): |
| |
| - glsl: Expose gl_ViewID_OVR back to GLSL 1.30 |
| - mesa: Fix multiview attachment completeness check |
| - mesa: Fix FramebufferTextureMultiviewOVR num_views check |
| |
| Janne Grunau (1): |
| |
| - panvk: Silence warning on incompatible DRM render devices |
| |
| Jason Macnak (3): |
| |
| - Simplify ApiInfo |
| - Pass VkSnapshotApiCallInfo-s through VkDecoderGlobalState |
| - Update VkDecoderSnapshot locking |
| |
| Jesse Natalie (4): |
| |
| - microsoft/compiler: Put holes in driver_location based on I/O variable sizes |
| - microsoft/clc: Initialize printf buffer for tests |
| - microsoft/compiler: Skip POS for io compaction |
| - microsoft/compiler: Update clip/cull split pass to handle clip/cull getting merged |
| |
| Jianxun Zhang (5): |
| |
| - anv,hasvk,genxml: Rename genxml files using verx10 |
| - isl: Refactor WA 22015614752 |
| - iris: Allow compression on multi-sampled stencil (xe2) |
| - isl: Allow CCS in more cases (xe2) |
| - isl: Move a CCS restriction in GFX 12.x |
| |
| Job Noorman (87): |
| |
| - ir3/ra: prevent moving source intervals for shared collects |
| - ir3,tu: include ir3 debug flags in shader hash key |
| - ir3,tu: filter debug flags included in the hash key |
| - ir3: fold shared movs into other movs |
| - nir: add ir3-specific bitwise triop opcodes |
| - nir/search: make is_only_used_by_iadd reusable |
| - nir/search: add is_only_used_by_{iand,ior} helpers |
| - ir3: fix backend support for bitwise triops |
| - ir3: add codegen for bitwise triops |
| - ir3: add pass to select bitwise triops |
| - ir3/isa: allow rpt6/rpt7 |
| - ir3: add workaround for predication hardware bug |
| - nir/lower_subgroups: support unknown subgroup size |
| - ir3: use generic lowering for 64b scan/reduce |
| - ir3: remove unused ir3_nir_lower_64b_subgroups |
| - nir: add read_getlast_ir3 intrinsic |
| - ir3: add codegen for read_getlast_ir3 |
| - ir3: add helper to get the subgroup size |
| - ir3: rename cluster_size to brcst_cluster_size |
| - nir/lower_subgroups: add extra filter data to options |
| - nir/lower_subgroups: disable boolean reduce when not supported |
| - ir3: add support for clustered subgroup reductions |
| - tu: advertise VK_SUBGROUP_FEATURE_CLUSTERED_BIT |
| - nir/lower_subgroups: add option to only lower clustered rotates |
| - ir3: lower clustered rotates to shuffles |
| - tu: advertise VK_SUBGROUP_FEATURE_ROTATE_CLUSTERED_BIT_KHR |
| - ir3: don't update builder cursor for IR3_CURSOR_AFTER_BLOCK |
| - ir3: add ir3_after_instr_and_phis helper |
| - ir3: use generic INSTR0 implementation for ir3_NOP |
| - ir3: refactor builders to use ir3_builder API |
| - ir3: reformat after refactoring in previous commit |
| - ir3: add reformatting commits to .git-blame-ignore-revs |
| - ir3/isa: fix conflict between stib.b and stsc |
| - ir3/isa: fix cat3-alt immed src |
| - ir3/isa: fix isaspec for sad.s32 |
| - ir3: teach backend about sad |
| - ir3: add codegen for sad |
| - ir3/cp: only mark mad srcs as swapped when swap succeeded |
| - ir3/cp: extract common src swapping code |
| - ir3/cp: make try_swap_mad_two_srcs more generic |
| - ir3/cp: add support for swapping srcs of sad |
| - ir3/validate: print file/line info |
| - ir3,freedreno: remove binning outputs after vs ucp lowering |
| - ir3/cp: swap back correct srcs when swap failed |
| - ir3: always set wrmask for movmsk |
| - ir3: emit uniform iadd3 as two adds |
| - ir3: output early-preamble stat as integer |
| - ir3/ra: fix non-trivial collect detection |
| - ir3/ra: allocate shared collects dst over its srcs when possible |
| - ir3/parser: fix parsing integer as float |
| - ir3/a7xx: properly handle alias scope and type |
| - ir3/a7xx: disasm halfness of alias dst |
| - ir3/a7xx: implement and document unknown alias field |
| - ir3/a7xx: handle alias.rt dst |
| - ir3/a7xx: document alias.rt |
| - ir3/print: add support for alias |
| - ir3: teach backend about alias |
| - ir3: introduce alias groups |
| - ir3: add validation for alias |
| - ir3: add ir3_compiler::has_alias |
| - ir3: add support for alias.tex |
| - ir3: optimize alias register allocation by reusing GPRs |
| - ir3/legalize: insert (ss) to read consts after stc |
| - ir3/legalize: insert (sy) to read consts after ldc.k |
| - ir3/dce: support partial writes from collects |
| - ir3: add some preamble helpers |
| - ir3: make find_end a global helper |
| - tu,ir3: inform ir3 of dynamically remapped FS slots |
| - ir3: make shader output struct non-anonymous |
| - ir3: reuse ir3_find_output in ir3_find_output_regid |
| - tu: add chip param to tu6_emit_fs_outputs |
| - tu: add support for aliased render target components |
| - freedreno: add chip param to emit_fs_output |
| - freedreno: add support for aliased render target components |
| - ir3: add support for alias.rt |
| - ir3: disable alias.rt pre-a750 |
| - ir3: account for inserted nops in delay calculation |
| - freedreno: move ForEachMacros into freedreno |
| - freedreno: remove unused entries from ForEachMacros |
| - freedreno: add missing entries to ForEachMacros |
| - ir3: schedule alias.rt at the end of the preamble |
| - ir3: rematerialize preamble defs in block dominated by sources |
| - ir3: add helper to calculate src read delay |
| - ir3: make delay slots a compiler property |
| - ir3/a7xx: update delay slots |
| - ir3/a7xx: enable delayed src2 read for all cat3 instructions |
| - ir3: fix emitting descriptor prefetches at end of preamble |
| |
| John Anthony (2): |
| |
| - panvk: Enable storageBuffer16BitAccess |
| - panvk: Enable VK_KHR_vertex_attribute_divisor |
| |
| Jordan Justen (6): |
| |
| - intel/dev: Add PTL 0xb0b0 PCI ID |
| - intel/dev: Split hwconfig warning check into hwconfig_item_warning() |
| - intel/dev: Split apply and check paths for hwconfig |
| - intel/dev: Don't process hwconfig table to apply items when not required |
| - intel/dev: Add intel_check_hwconfig_items() |
| - iris: Check that mem_fence_bo was created |
| |
| Jose Maria Casanova Crespo (9): |
| |
| - v3d: Enable Early-Z with discards when depth updates are disabled |
| - rpi4/ci: mark another flaky timeline_semaphore test |
| - rpi4/ci: another detected flaky timeline_semaphore test |
| - vc4/ci: fails update after last piglit uprev |
| - rpi4/ci: Increase timeout for rusticl jobs. |
| - v3d: Don't load/store if rasterizer discard is enabled |
| - v3d/ci: update rpi expectations by last piglit uprev |
| - v3d: Apply FBO resources invalidations on job creation |
| - Revert "ci: take igalia farm offline" |
| |
| Joshua Duong (1): |
| |
| - gfxstream: update auto-generated comments. |
| |
| José Roberto de Souza (16): |
| |
| - intel/dev/xe: Fix access to eu_per_dss_mask |
| - intel/dev/xe: Fix size of eu_per_dss_mask |
| - intel/genxml/xe2: Add STATE_SYSTEM_MEM_FENCE_ADDRESS instruction |
| - anv: Always create anv_async_submit in init_copy_video_queue_state() |
| - anv: Emit STATE_SYSTEM_MEM_FENCE_ADDRESS |
| - iris: Emit STATE_SYSTEM_MEM_FENCE_ADDRESS |
| - iris: Add support for damage region |
| - anv: Allow larger SLM sizes for task and mesh shader |
| - anv: Check VkResult of perf query batch buffer |
| - anv: Check VkResult main batch buffer before start companion batch buffer |
| - iris: Drop BO_ALLOC_COHERENT from iris_utrace_create_ts_buffer() |
| - iris: Rename BO_ALLOC_COHERENT to BO_ALLOC_CACHED_COHERENT |
| - anv: Return scanout PAT entry for scanout and external buffers in discrete GPUs |
| - anv: Allow WSI blit_src Image to be kept compressed when transitioning to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR |
| - iris: Make sure an uncached heap is chosen for scanout and shared buffers when LLC is not available |
| - iris: Pick scanout PAT entry for scanout buffers |
| |
| Juan A. Suarez Romero (26): |
| |
| - util/format: nr_channels is always <= 4 |
| - v3dv: remove unused assignments |
| - v3dv: fix BO allocation |
| - v3dv: free pointers on multisync error |
| - v3dv: ensure there is always a perfmon and counter |
| - broadcom/compiler: ensure offset source exists |
| - broadcom/compiler: fix fp16 conversion operations |
| - v3d: make v3d_flush_resource reallocate non-shareable resources |
| - vc4: ensure sharing tiled resources are of proper format |
| - v3d: fix BO allocation |
| - v3d: remove intermediate variable |
| - v3d: find linear modifier when required |
| - vc4: find linear modifier when required |
| - v3d/ci: clean some asan failures |
| - v3d: avoid 0-size variable length array |
| - v3dv: fix assigned value is garbage or undefined |
| - vc4: initialize variable |
| - v3dv: check requirements for USAGE_INPUT_ATTACHMENT |
| - freedreno: a2xx: fix maybe uninitialized variable |
| - radeonsi/vcn: fix maybe uninitialized |
| - v3d: fix format overflow error |
| - virgl: fix member access to a NULL pointer struct |
| - etnaviv: cast assertion |
| - ci/build: add ubsan build jobs |
| - broadcom/ci: add ubsan jobs for broadcom drivers |
| - ci: take igalia farm offline |
| |
| Jung-uk Kim (1): |
| |
| - FreeBSD: Disable support for "-mtls-dialect" for FreeBSD |
| |
| Juston Li (1): |
| |
| - util/cache_test: Fix racy Cache.List test |
| |
| Kai Wasserbäch (1): |
| |
| - fix(FTBFS): clc/clover: pass a VFS instance explicitly |
| |
| Karmjit Mahil (21): |
| |
| - tu: Fix push_set host memory leak on command buffer reset |
| - tu: Fix potential alloc of 0 size |
| - nir: Fix \`no_lower_set` leak on early return |
| - tu: Fix memory leaks on VK_PIPELINE_COMPILE_REQUIRED |
| - nir/algebraic: turn \`u{ge,lt} a, 1` to \`i{ne,eq} a, 0` |
| - nir,ir3: Add icsel_eqz |
| - nir: Fix the spelling of compare |
| - freedreno/rddecompiler: clang-format fix |
| - freedreno/rddecompiler: Fix some unused function warnings |
| - ir3: Fix some Wsign-compare when compiling a generate-rd.cc |
| - util/idalloc: Fix util_idalloc_foreach() build issue |
| - util/idalloc: Minor refactor of util_idalloc_foreach() |
| - tu: Fix \`clear_values` leak |
| - tu: Fix FDM patchpoint memory leak |
| - tu: Fix leaking of some descriptor sets |
| - tu: Initialize tu_tiling_config even when tiling isn't possible |
| - tu: Free pre_chain patchpoint data |
| - util/simple_mtx: Add ASSERTED to parameter used only in an assert |
| - vulkan: Add initial vram-report-limit layer |
| - freedreno/replay: Define __user for msm_kgsl |
| - loader/wayland: Fix missing timespec.h include |
| |
| Karol Herbst (77): |
| |
| - nv/codegen: Do not use a zero immediate for tex instructions |
| - nvc0: return NULL instead of asserting in nvc0_resource_from_user_memory |
| - clover: drop support for nir drivers |
| - gallium: drop PIPE_SHADER_IR_NIR_SERIALIZED |
| - rusticl/kernel: fix kernel variant selection |
| - vtn: handle struct kernel arguments passed by value |
| - nir/lower_cl_images: lower scalar image_loads to vec4 |
| - rusticl/mem: add restrictions for CL_DEPTH, CL_DEPTH_STENCIL and msaa images |
| - rusticl/image: fix clEnqueueFillImage for CL_DEPTH |
| - rusticl/device: advertize cl_khr_depth_images if supported |
| - rusticl: enable cl_khr_depth_images |
| - rusticl: check for overrun status when deserializing |
| - rusticl/kernel: convert name and type_name to Option<CString> |
| - rusticl/mesa: make driver_name() return a &CStr |
| - rusticl/program: check if provided binary pointers are null |
| - rusticl: rework query APIs |
| - rusticl/api: add a write_len_only variant for writing API properties |
| - rusticl/api: add a write_iter variant for writing API properties |
| - rusticl/program: use write_len_only for CL_PROGRAM_BINARIES |
| - rusticl/program: use write_iter for CL_PROGRAM_DEVICES |
| - rusticl/program: pass the slice directly for CL_PROGRAM_IL |
| - rusticl/program: use write_len_only for CL_PROGRAM_IL |
| - rusticl/platform: pass the slice directly for CL_PLATFORM_EXTENSIONS_WITH_VERSION |
| - rusticl/api: use constant arrays instead of Vecs for queries |
| - rusticl/context: use write_iter for CL_DEVICES_FOR_GL_CONTEXT_KHR |
| - rusticl/proc: make generated entry points unsafe |
| - rusticl/api: mark get_info and get_info_obj as unsafe |
| - rusticl/util: add Properties::is_empty() and len() |
| - rusticl/util: add Properties::iter() |
| - rusticl/util: make Properties::props private |
| - rusticl/util: reimplement Properties over Vec of scalars |
| - rusticl/api: simplify CLProp implementation of Properties |
| - rusticl/api: use Properties for 0 terminated arrays consistently |
| - rusticl/util: make Properties::from_ptr unsafe |
| - rusticl/api: remove Option around Properties |
| - rusticl/util: rename Properties::from_ptr to new |
| - rusticl/util: fix duplicate key detection in Properties::new |
| - rusticl/platform: silence static_mut_refs warning |
| - rusticl/util: fix ptr_to_integer_transmute_in_consts warning |
| - rusticl: fix clippy::needless-lifetimes |
| - rusticl: fix clippy::doc-lazy-continuation |
| - rusticl/queue: add a life check to prevent applications dead locking |
| - rusticl: stop using system headers for CL and GL |
| - include: Update the OpenCL headers to latest |
| - rusticl/mesa: remove PipeTransfer::res |
| - rusticl/mem: remove mem_type argument from new_image |
| - rusticl/device: remove unused functions |
| - rusticl/mesa/context: use Default for pipe_grid_info initialization |
| - rusticl/mesa: add missing files to meson.build |
| - rusticl/queue: make QueueContext::dev public |
| - rusticl/mem: pass around QueueContext instead of PipeContext |
| - rusticl/mesa/resource: port to NonNull |
| - rusticl/device: fix CL_DEVICE_HALF_FP_CONFIG query |
| - rusticl/device: fix default device enumeration |
| - rusticl/kernel: take set kernel arguments into account for CL_KERNEL_LOCAL_MEM_SIZE |
| - rusticl/kernel: fix image_size of 1D buffer images |
| - rusticl/mesa: set take_ownership to true for set_sampler_views |
| - rusticl/mesa: add PipeSamplerView wrapper |
| - rusticl/mesa: use PipeSamplerView over the raw type |
| - rusticl/kernel: create the sampler views earlier |
| - rusticl/mem: add functions to create sampler and image views to Image |
| - rusticl/mesa: rework image and sampler view creation APIs |
| - rusticl/kernel: store memory arguments as Weak references |
| - rusticl/device: add unsynchronized mapping functions to helper context |
| - rusticl/mem: simplify is_svm implementation |
| - rusticl/mem: add Allocation type |
| - rusticl/mem: reimplement has_same_parent and rename it to backing_memory_eq |
| - rusticl/mem: rework last user of get_parent() and remove it |
| - rusticl/mem: add Allocation::is_user_alloc_for_dev |
| - rusticl/mem: use get_res_for_access instead of get_res_of_dev |
| - trace: copy pipe_caps |
| - trace: add get_compute_state_info |
| - rusticl/mem: set bind flags for gl imports |
| - rusticl/mesa: add PipeContext::device_reset_status |
| - rusticl/queue: check device error status |
| - rusticl/kernel: call nir_lower_variable_initializers earlier |
| - rusticl/mem: do not apply offset with in copy_image_to_buffer |
| |
| Kenneth Graunke (35): |
| |
| - brw: Fix emit_a64_oword_block_header UNIFORM -> VGRF copies |
| - brw: Fix try_rebuild_source's ult32/ushr handling to use unsigned types |
| - nir: Use load_global_constant for reorderable nir_var_mem_global access |
| - nir/algebraic: Reassociate fadd into fmul in DP4-like pattern |
| - brw: Drop image deref handling from brw_analyze_ubo_ranges |
| - brw: Drop "regular uniform" concept from UBO push analysis |
| - brw: Drop a few crocus references in comments |
| - brw: Use nir_combined_align in brw_nir_should_vectorize_mem |
| - brw: Only consider components read for UBO loads |
| - brw: Only consider components read for UBO push analysis |
| - brw: Simplify choose_oword_block_size_dwords() |
| - nir: Allow large overfetching holes in the load store vectorizer |
| - anv: Don't consider nir_var_mem_global for vectorizer robustness checks |
| - brw: Tune vectorizer conditions to allow overfetching with holes |
| - brw: Fix register unit calculation in SIMD32 LOAD_PAYLOAD lowering |
| - brw: Allow SIMD32 math instructions on Xe2 |
| - brw: Combine convergent texture buffer fetches into fewer loads |
| - iris: Tune the BO cache's bucket sizes |
| - brw: Don't rely on SIMD splitting in opt_combine_convergent_txfs |
| - brw: Limit maximum push UBO ranges to 64 registers in the NIR pass. |
| - brw: Don't shrink UBO push ranges in the backend |
| - brw: Delete pull constant lowering |
| - brw: Delete assign_constant_locations and push_constant_loc[] |
| - brw: Fix vectorizer hole_size condition after signedness change |
| - nir: Add a nir_def_first_component_read() helper |
| - brw: Add more safeguards against misaligned OWord Block messages |
| - brw: Skip fetching unread leading components of UBO loads |
| - brw: Make get_nir_src_imm() usable for non-32-bit-sizes. |
| - brw: Skip unnecessary work for trivial emit_uniformize of IMMs |
| - brw: Skip unread leading/trailing components in convergent block loads |
| - brw: Add a new MEMORY_MODE_CONSTANT option |
| - brw: Allow CSE of MEMORY_MODE_CONSTANT loads |
| - brw: Align and combine constant-offset UBO loads in NIR |
| - brw: Always use MEMORY_LOAD for load_ubo_uniform_block_intel intrinsics |
| - brw: Fix Xe2 spilling code to limit to SIMD32 rather than SIMD16 |
| |
| Kevin Chuang (3): |
| |
| - anv: Implement encode shader to fit in ANV BVH |
| - anv: Add INTEL_DEBUG for bvh dump and visualization tools |
| - anv/bvh: Dump BVH synchronously upon command buffer completion |
| |
| Kevron Rees (1): |
| |
| - anv, drirc: Add workaround to speed up Spiderman reg allocation |
| |
| Konstantin (5): |
| |
| - nir/lower_non_uniform_access: Group accesses using the same resource |
| - radv/printf: Guard against helper invocations |
| - radv: Do not overwrite VRS rates when doing fast clears |
| - vulkan/meta: Add a pipeline cache |
| - vulkan: Fix the argument order of update_as |
| |
| Konstantin Seurer (39): |
| |
| - util: Fix some brackets in util_dynarray\_.*_ptr |
| - nir: Add missing access flags to print_access |
| - radv: Lower non-uniform access after vectorization |
| - amd: Add ac_shader_debug_info |
| - aco: Handle nir_debug_info_instr |
| - aco: Pass debug information to the driver |
| - radv: Add a helper for accessing the shader binary |
| - radv: Store debug info inside radv_shader |
| - radv: Dump nir shaders before compiling |
| - nir: Add a first_line parameter to gather_debug_info |
| - nir: Do not gather source locations for phis |
| - radv: Add RADV_DEBUG=nirdebuginfo |
| - gallivm: Add float operation behavior flags to lp_type |
| - gallivm: Preserve -0 and nan |
| - lavapipe: Implement VK_KHR_shader_float_controls2 |
| - gallivm: Use an accurate log2 implementation for lodq |
| - lavapipe: Implement VK_KHR_compute_shader_derivatives |
| - radv: Fix encoding empty acceleration structures |
| - llvmpipe: Disable anisotropic filtering for explicit lod |
| - llvmpipe: Use a simpler and faster AF implementation |
| - llvmpipe: Remove unused AF code |
| - llvmpipe: Move max_anisotropy to static sampler state |
| - lavapipe: Advertise vulkan 1.4 |
| - meson: Require glslangValidator when building lavapipe |
| - lavapipe: Check the pool type in handle_reset_query_pool |
| - meson: Include the loader subdir when building lavapipe |
| - gallivm: Take helper invocations into account when skipping branches |
| - nir/print: Print less unused shader info |
| - nir/tests: Improve shader creation |
| - nir/tests: Add a helper for comparing a shader against a string |
| - nir/tests: Add reference shaders |
| - nir: Add a test runner |
| - nir/print: Do not print trailing spaces after preds/succs |
| - docs: Add documentation for NIR unit testing |
| - llvmpipe: Fix half-pixel sample offset with AF |
| - llvmpipe: Avoid a crash when using 5 coords with AF |
| - radv/rmv: Use radv_rmv_log_resource_destroy more |
| - radv/meta: Stop using strings for meta keys |
| - gallivm: Remove loop limiting |
| |
| Koo, Anthony (1): |
| |
| - amd/vpelib: Add system event logging |
| |
| Lars-Ivar Hesselberg Simonsen (26): |
| |
| - panvk: Set fs.multisampled sysval for v10+ |
| - panvk: Add frag->frag barrier before resolve |
| - panvk: update expectations for G610 |
| - pan/genxml: Fix decode of exception_handler 0x0 |
| - pan/cs: Add mask support for reg_perm |
| - panvk: Build cmd_fb_preload on explicit fb_info |
| - panvk: Add incremental rendering support on v10+ |
| - panfrost: Disable AFRC texture/sampler reswizzle |
| - panvk: Disable AFBC for mutable formats on v7 |
| - panfrost: Only allow AFBC(RGB) and AFBC(BGR) on v7 |
| - panfrost: Limit reswizzle to AFBC formats |
| - panfrost: Decouple reswizzling from texture build |
| - panfrost: Standardize naming of sampler reswizzle |
| - panvk: Remove ZS texture_swizzle_replicate_x |
| - panvk: Fix descriptor decode |
| - panvk: Fix valgrind issue in nir_lower_descriptors |
| - panvk: Fix valgrind issue in panvk_compile_shaders |
| - pan/genxml: Fix vertex_packet Attribute on v9+ |
| - panvk: Use LD_VAR[_IMM] + ADs for varyings |
| - panvk: Limit AD allocation to max var loads in v9+ |
| - panvk: Use LD_VAR_BUF[_IMM] when possible |
| - panvk: Fix barriers in secondary cmdbufs w/o rp's |
| - panfrost: Do not evaluate_per_sample for non-MSAA |
| - Revert "panfrost: remove is_blit flag" |
| - Revert "panfrost: fix hang by using MALI_PIXEL_KILL_WEAK_EARLY in color preload" |
| - panvk: Set missing shader_modifies_coverage flag |
| |
| Leder, Brendan Steve (2): |
| |
| - amd/vpelib: Refactor OCSC and update missing check |
| - amd/vpelib: Move bg color |
| |
| Leonard Göhrs (1): |
| |
| - ci/lava: update lavacli from version 1.5.2 to 2.2.0 |
| |
| Lina Versace (3): |
| |
| - anv: Sort extensions in enablement table |
| - anv: Update features.txt |
| - anv: Fix feature pipelineProtectedAccess |
| |
| LingMan (10): |
| |
| - mesa: Bump required Rust version to 1.78 |
| - nak/hw_test: Use std::mem::offset_of!() |
| - compiler/rust: Use std::mem::offset_of!() |
| - mesa: Add rustfmt.toml |
| - rusticl: Use C-string literals |
| - rusticl: Use C-string literals for spirv extension names |
| - rusticl/cl_prop: Use C-string literals |
| - rusticl/core: Use C-string literals for XPlatManager::get_proc_address_func |
| - rusticl: Use C-string literals for NirShader::add_var |
| - rusticl: Use C-string literals for DiskCache::new |
| |
| Lionel Landwerlin (96): |
| |
| - anv: fix extent computation in image->image host copies |
| - anv: update shader descriptor resource limits |
| - anv: split generated draw flags from mocs/dword-count |
| - intel: make sure intel_wa.h can be included by opencl code |
| - anv: implement Wa_16011107343/22018402687 for generated draws |
| - brw: allocate physical register sizes for spilling |
| - anv: fix descriptor asserts |
| - anv: fix incorrect aspect flag for depth/stencil formats |
| - anv: fix missing push constant reallocation |
| - anv: prevent access to destroyed vk_sync objects post submission |
| - anv: track allocated descriptor pool sizes |
| - anv: indent driconf code |
| - anv: add a workaround for X4 Foundations |
| - anv: document the X4 Foundations workaround a bit more |
| - anv: move helpers out of genX_pipeline.c/anv_private.h |
| - anv: remove 3DSTATE_RASTER from pipeline |
| - anv: remove 3DSTATE_MULTISAMPLE from the pipeline |
| - anv: remove 3DSTATE_VF_STATISTICS from pipeline |
| - anv: pass anv_device to batch_set_preemption |
| - anv: rework vertex input helper |
| - anv: split vertex buffer emission in a different function |
| - anv: move gfx tracking values to anv_cmd_graphics_state |
| - anv: move tracking of tcs_input_vertices/fs_msaa_flags to hw state |
| - anv: split runtime flushing code for reuse |
| - brw: change fs_msaa flags checks to test compiled flag first |
| - brw: rename brw_sometimes to intel_sometimes |
| - brw: move barycentric_mode enum to intel_shader_enums.h |
| - brw: move fs_msaa_flags logic to intel_shader_enums.h |
| - fix |
| - Revert in correct commit "fix" |
| - anv: move primitive_topology to anv_gfx_dynamic_state |
| - anv: try to avoid using cmd_buffer in gfx runtime flushing |
| - anv: reuse device local variable in hw state emission |
| - anv: rework Wa_18038825448 to track state on anv_gfx_dynamic_state |
| - anv: avoid using cmd_buffer for TBIMR state computation |
| - anv: avoid using cmd_buffer for flushing runtime |
| - anv/iris: leave 4k alignments for clear colors with modifiers |
| - brw: use transpose unspill messages when possible |
| - anv: report formats supported by the common bvh framework |
| - anv: fix missing bindings valid dynamic state change check |
| - anv: set pipeline flags correctly for imported libs |
| - vulkan: make acceleration structure debug markers virtual |
| - vulkan: add an enum for the build step |
| - vulkan: track encode step of the BVH building |
| - anv: add BVH building tracking through u_trace |
| - intel/decoder: fix COMPUTE_WALKER handling |
| - anv: document UBO descriptor range alignments |
| - blorp: use 2D dimension for 1D tiled images |
| - hk: fix timeline value type |
| - anv: fix index buffer size changes |
| - anv: limit the memcpy data for push constants |
| - vulkan/runtime: avoid emitting empty build_leaves |
| - anv: add tracepoints timestamp mode for empty dispatches |
| - anv: rework tbimr push constant workaround |
| - anv: ensure null-rt bit in compiler isn't used when there is ds attachment |
| - anv: use the correct MOCS for depth destinations |
| - intel: fix generation shader on Gfx9 |
| - brw: introduce a new register type for the address register |
| - brw: use phys_nr() more in generation |
| - brw: split validation iteration into blocks |
| - brw: add infra to make use of the address register in the IR |
| - brw: add scheduler support for address registers |
| - brw: avoid having the scratch surface handle partially written |
| - brw: move final send lowering up into the IR |
| - brw: fix coarse_z computation on Xe2+ |
| - brw: handle load_printf_buffer_size intrinsic |
| - anv: handle printf buffer size relocations |
| - nir: make lower-level printf helper respect buffer size |
| - anv: update debug printf example code |
| - anv: remove print lowering |
| - blorp: disable PS shaders with depth/stencil HiZ ops |
| - brw: fix CSE with negation |
| - anv: don't look at pipelines to figure out CPS values |
| - compiler: add VARYING_BIT_PRIMITIVE_COUNT |
| - anv/Wa_18019110168: copy the primitive count writes |
| - anv/brw: rework primitive count writing |
| - libcl: add MIN2/MAX2 macros |
| - libcl_vk: add some vulkan enums/structures for DGC |
| - spirv: build vtn_bindgen for Anv/Iris |
| - brw/elk: move internal kernel parsing out of intel_clc |
| - meson: build mesa_clc for Anv/Iris |
| - intel/cl: switch to SPIRV as shader storage |
| - meson: rework mesa-clc=system handling |
| - intel: rework CL pre-compile |
| - meson: required SPIRV-Tools LLVM workaround on LLVM17+ |
| - intel: fix dependency for internal CL shaders |
| - anv: use flags for format capabilities |
| - anv: pass physical device to format helpers |
| - anv: add a drirc to disable border colors without format |
| - anv: expose A4B4G4R4_UNORM_PACK16 support when CBCWF is disabled |
| - anv: dirty pipeline & push constants after internal CS shaders |
| - anv: reduce alignment for small heaps |
| - brw: fixup scoreboarding for find_live_channels |
| - anv,driconf: Add sampler coordinate precision workaround for Dynasty Warriors |
| - anv: disable VF statistics for memcpy |
| - anv: ensure Wa_16012775297 interacts correctly with Wa_18020335297 |
| |
| Lorenzo Rossi (1): |
| |
| - nvk: fix preprocess buffer alignment |
| |
| Louis-Francis Ratté-Boulianne (3): |
| |
| - panfrost: Split up allocation and packing of tiler descriptor |
| - panfrost: Select the effective tile size as part of pan_fb_info |
| - panfrost: Re-emit texture descriptor if the data size has changed |
| |
| Lu Yao (1): |
| |
| - zink: fix decomposed_attrs val error when zink_vs_key->size is 4 |
| |
| Lucas De Marchi (1): |
| |
| - intel/tools: Fix Xe KMD error dump parser |
| |
| Lucas Stach (26): |
| |
| - etnaviv: drm: properly handle BO list member |
| - etnaviv: drm: assert mutual exclusivity between cache and zombie list |
| - etnaviv: drm: use list_first_entry |
| - etnaviv: stall after RS/BLT operation when draw_stall debug option is enabled |
| - etnaviv: Update headers from rnndb |
| - etnaviv: add debug switch to disable texture descriptor usage |
| - etnaviv: fix polygon offset for 24bpp depth buffers |
| - ci/etnaviv: drop gl-1.4-polygon-offset fail |
| - etnaviv: isa: fix typo in SRC2_USE map |
| - etnaviv: Update headers from rnndb |
| - etnaviv: clean up component use setting in linker |
| - etnaviv: fix flatshading |
| - etnaviv: emit full varying component use |
| - ci/etnaviv: drop GC2000 flat shading fails |
| - etnaviv: split dummy RT backing store from reloc |
| - etnaviv: fix rendering without vertex buffers/attributes |
| - ci/etnaviv: drop failures caused by missing vertex attributes |
| - etnaviv: fix polygon offset disable |
| - etnaviv: memcpy varying setup from stack |
| - etnaviv: emit varying interpolation state on halti5 |
| - etnaviv: fix flatshading on halti5 GPUs |
| - etnaviv: only emit used PA_SHADER_ATTRIBUTES states |
| - etnaviv: track TS flushed status as bool |
| - etnaviv: dynamically partition the constant memory in unified uniform mode |
| - etnaviv: allow more constants in unified uniform mode |
| - etnaviv: hwdb: fix lookup of GC3000 in i.MX6QP |
| |
| Lukas Lipp (1): |
| |
| - wsi: Fix wrong function name for lvp wsi metal surface |
| |
| M Henning (6): |
| |
| - nvk/cmd_buffer: Pass count to set_root_array |
| - nvk: Fix invalidation of NVK_CBUF_TYPE_DYNAMIC_UBO |
| - nvk: Remove params for dirty_cbufs_for_descriptors |
| - nvk: Fix two typos in comments |
| - nvk: Fix uninitialized var warnings in host_copy |
| - nak/hw_runner: Skip copy call for empty buffer |
| |
| Manuel (1): |
| |
| - gfxstream: Avoid repeated functionality |
| |
| Manuel Dun (4): |
| |
| - gfxstream: Using DETECT_OS_ANDROID from util instead of __ANDROID__ |
| - gfxstream: Using DETECT_OS_FUCHSIA from util instead of __Fuchsia__ |
| - gfxstream: Using DETECT_OS_LINUX from util instead of __linux__ |
| - Gfxstream: Initial mingw "compilable" Windows version of mesa/gfxstream |
| |
| Marc Herbert (5): |
| |
| - docs: add "apt-get build-dep" and "dnf builddep" |
| - docs: cross-compile: add useful "apt" and "dnf" builddep commands |
| - docs: show how to use ccache when cross-compiling |
| - docs: show which pkg-config Fedora uses for cross-compilation |
| - docs: move cross c*_args from [properties] to [built-in options] |
| |
| Marek Olšák (353): |
| |
| - gallium/radeon: import libdrm_radeon source code, drop the dependency |
| - aco: remove unused TCS fields from aco_shader_info |
| - ac/nir: get pass_tessfactors_by_reg from nir_gather_tcs_info |
| - radeonsi: fix passing TCS wave ID from LS to HS for monolithic LS+HS |
| - radeonsi: don't overwrite info.tess._primitive_mode when it can be correct |
| - radeonsi: get the value for load_tcs_primitive_mode_amd from shader info |
| - radeonsi: replace are_tessfactors_def_in_all_invocs with nir_gather_tcs_info |
| - radeonsi: reduce si_shader_key_ge::tes_prim_mode size to 2 bits |
| - radeonsi: remove unused function si_get_tcs_out_patch_stride |
| - radeonsi: don't set tess level outputs in patch_outputs_written unconditionally |
| - radeonsi: remove unused si_shader_info::output_readmask |
| - radeonsi: set \*outputs_written in scan_io_usage instead of later |
| - radeonsi: split outputs_written_before_tes_gs into ls_es_* and tcs_* masks |
| - radeonsi/ci: update navi31 failures |
| - glsl: add a helper for duplicated code calling nir_opt_varyings |
| - gallium: use struct nir_shader * type in finalize_nir instead of void * |
| - st/mesa: call pipe_screen::finalize_nir outside of st_finalize_nir |
| - gallium: add PIPE_CAP_CALL_FINALIZE_NIR_IN_LINKER |
| - st/mesa: add ST_DEBUG=xfb printing xfb info |
| - mesa: capture shaders to disk before invoking the linker |
| - nir/opt_varyings: add nir_io_always_interpolate_convergent_fs_inputs |
| - nir/opt_varyings: add nir_io_compaction_rotates_color_channels |
| - nir/opt_varyings: fix packing color varyings |
| - nir/opt_varyings: implement compaction without flexible interpolation |
| - nir/opt_varyings: don't count the cost of the same instruction multiple times |
| - radeonsi: fix buffer_size for emulated GS statistics |
| - radeonsi: fix an assertion failure in si_shader_ps with AMD_DEBUG=mono |
| - radeonsi: handle nir_intrinsic_component in kill_ps_outputs |
| - radeonsi: fix gl_FrontFace elimination when one side is culled |
| - radeonsi/ci: add options to test llvmpipe, softpipe, virgl, zink |
| - nir/print: print fb_fetch_output for variables |
| - nir/lower_pntc_ytransform: handle lowered IO |
| - nir/lower_clip: fixes for lowered IO without compact arrays |
| - nir/lower_clip: rewrite find_output to handle vec2/3 and make it readable |
| - nir/lower_fragcoord_wtrans: handle trimmed fragcoord loads |
| - nir/lower_two_sided_color: fix for lowered IO |
| - nir: add nir_io_semantics::fb_fetch_output_coherent |
| - nir: rename nir_io_glsl_opt_varyings to nir_io_dont_optimize and deprecate it |
| - nir: add nir_io_separate_clip_cull_distance_arrays to replace PIPE_CAP |
| - vc4/lower_blend: don't read non-existent channels |
| - nir: make use_interpolated_input_intrinsics a nir_lower_io parameter |
| - ac/surface: adjust HiZ enablement |
| - radeonsi: prepare for making SI_NGG_CULL_TRIANGLES/LINES VS only, rename them |
| - radeonsi: optionally return MESA_PRIM_UNKNOWN from si_get_input_prim |
| - radeonsi: rewrite/replace gfx10_ngg_get_vertices_per_prim |
| - radeonsi: return a better value for load_initial_edgeflags_amd |
| - radeonsi: clean up and rename gfx10_edgeflags_have_effect |
| - radeonsi: add helper si_shader_culling_enabled |
| - radeonsi: only compute and use min_direct_count on gfx7-8 |
| - radeonsi: enable NGG culling for non-monolithic TES and GS |
| - radeonsi: don't use nir_io_dont_optimize because it's deprecated |
| - r300: don't lower sin/cos in finalize_nir |
| - nir/opt_varyings: use a hash table to make cloning SSA faster |
| - amd: import libdrm_amdgpu ioctl wrappers |
| - util,amd: add inlinable versions of drmIoctl/drmCommandWrite* |
| - nir: allow cloning indirect array derefs in nir_clone_deref_instr |
| - nir/lower_io_to_temporaries: fix interp_deref_at_* lowering |
| - radeonsi: don't call set_framebuffer_state in si_destroy_context |
| - radeonsi: handle a failure to create gfx_cs |
| - winsys/amdgpu: fix FD mismatch |
| - Revert "gbm: mark surface buffers as explicit flushed" |
| - nir/lower_clip: don't set cursor to fix crashes due to removed instructions |
| - nir/lower_clip: separate code for IO variables and intrinsics |
| - nir/lower_clip: set clip_distance_array_size outside of create_clipdist_vars |
| - nir/lower_clip: convert nir_lower_clip_gs to nir_shader_intrinsics_pass |
| - nir/lower_clip: implement ClipVertex lowering for GS + lowered IO correctly |
| - vc4: lower clip planes in st/mesa |
| - nir/opt_varyings: always call remove_dead_varyings in init_linkage |
| - nir/opt_varyings: add a default callback for varying_estimate_instr_cost |
| - nir/opt_varyings: replace options::lower_varying_from_uniform with a cost number |
| - nir/algebraic: use is_used_once in a few iand/ior patterns |
| - nir/algebraic: optimize (a & b) & (a & c) ==> (a & b) & c |
| - nir/algebraic: optimize (a | b) | (a | c) ==> (a | b) | c |
| - nir/algebraic: optimize (a & b) | (a | c) => a | c, (a & b) & (a | c) => a & b |
| - gallium: replace PIPE_SHADER_CAP_INDIRECT_INPUT/OUTPUT_ADDR with NIR options |
| - st/mesa: replace EmitNoIndirectInput / EmitNoIndirectOutput with NIR options |
| - util/bitset_test: test the return value of BITSET_TEST_RANGE_INSIDE_WORD better |
| - util/bitset: add BITSET_GET_RANGE_INSIDE_WORD |
| - nir/linking_helpers: don't promote interpolated varyings to flat |
| - nir/opt_varyings: remove redundant conditions from a while loop |
| - nir/opt_varyings: fix compaction with sparse indirect FS inputs |
| - nir/opt_varyings: count the number of unused components for compaction correctly |
| - nir/opt_varyings: fix max_slot for color varying compaction |
| - nir/opt_varyings: make top-level compaction code for TES, TCS, GS separate |
| - nir/opt_varyings: change try_move_postdominator param to nir_instr type |
| - amd,zink: remove options.varying_estimate_instr_cost callbacks |
| - nir/opt_varyings: propagate indirect uniform/UBO loads into the next shader |
| - nir/opt_varyings: add inter-shader code motion for uniform/UBO indexing |
| - nir/opt_varyings: fix getting deref variables for sysvals |
| - nir/opt_varyings: remove rare dead output stores after inter-shader code motion |
| - nir/opt_varyings: fix compile failures in the disabled PRINT code |
| - amd/ci: add piglit failures due to an overzealous test |
| - nir/lower_io_passes: lower indirect IO for TCS |
| - radeonsi: pass cull face state via user SGPRs for shader culling |
| - radeonsi: revert to always returning true for load_cull_any_enabled_amd |
| - radeonsi: try to fix Navi14 regression in debug builds |
| - radeonsi: don't compute total_direct_count in si_draw if it's unused |
| - radeonsi/ci: handle glinfo errors better |
| - radeonsi/ci: stop using a global flakes list, only use a per-chip flakes list |
| - radeonsi/ci: remove most flakes and some skips, update navi31 failures |
| - radeonsi/ci: remove --slow |
| - radeonsi/ci: update navi31 failures |
| - r600: fix a constant buffer memory leak for u_blitter |
| - ac/lower_ngg: improve streamout code generation for gfx12/ACO to match LLVM |
| - ac: update SPI_GRP_LAUNCH_GUARANTEE_* register values for gfx12 |
| - ac/surface/gfx12: enable DCC 256B compressed blocks and reorder modifiers |
| - radeonsi/gfx12: set DB_RENDER_OVERRIDE based on stencil state |
| - radeonsi/gfx12: adjust HiZ/HiS logic |
| - ac/nir: reserve the first LDS vec4 for the HS tf0/1 group vote in TCS |
| - ac/nir: use s_sendmsg(HS_TESSFACTOR) to optimize writing tess factors for gfx11 |
| - ac/nir: allow a TCS input to be available from both VGPRs and LDS |
| - ac,radv,radeonsi: enable TCS input reads from VGPRs for all compatible loads |
| - ac/nir: add new helpers for computing the TCS LDS/offchip size accurately |
| - radeonsi: remove unused parameter tcs_vgpr_only_inputs from si_get_nir_shader |
| - radeonsi: switch to the new TCS LDS/offchip size computation |
| - radv: switch to the new TCS LDS/offchip size computation |
| - ac/nir: call nir_gather_tcs_info only once for RADV |
| - nir/opt_varyings: set all IO types to float to facilitate full vectorization |
| - nir/opt_varyings: clear info->clip/cull_distance_array_size if relocated |
| - st/mesa: don't use nir_opt_fragdepth because it's incorrect with MSAA |
| - mesa: set correct XFB prim mode for draw validation after resuming XFB |
| - mesa: fix printing _NEW_* flags |
| - gallium: pass XFB primitive mode to set_stream_output_targets |
| - st/mesa: add a pass that unlowers IO intrinsics to variables |
| - glsl,st/mesa: always lower IO for GLSL, unlower IO for drivers |
| - v3d: enable uniform expression propagation from outputs to the next shader |
| - ci: update fail lists and trace checksums |
| - virgl/ci: disable virgl-traces because it doesn't upload results |
| - radeonsi/ci: don't copy skips.csv to the results directory |
| - radeonsi/ci: update failures and flakes |
| - radeonsi: fix a gfx10.3 regression due to a gfx12 change |
| - radeonsi: kill Z and stencil PS outputs if depth or stencil is disabled |
| - radeonsi/gfx11: fix alpha-to-coverage + alpha-to-one used together |
| - radeonsi: fix alpha-to-coverage + alpha-to-one used together for gfx6-10.3 |
| - radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass |
| - radeonsi: eliminate shader code computing killed Z/S/samplemask PS outputs |
| - radeonsi: make NGG streamout output primitive type known at compile time |
| - radeonsi/gfx12: fix DrawTransformFeedback(stream != 0) |
| - radeonsi/gfx12: tune streamout performance |
| - radeonsi: make nir->info and si_shader_info::base identical |
| - radeonsi: remove some uses of enum pipe_shader_type |
| - radeonsi: make si_init_shader_args static |
| - radeonsi: call si_init_shader_args in si_get_nir_shader |
| - radeonsi: use nir->info instead of sel->info.base |
| - radeonsi: disable luminance alpha formats on gfx6 |
| - radeonsi,radv: fix incorrect min_esverts for NGG subgroup calculation |
| - ac: remove unused code |
| - ac/llvm: remove unused code |
| - radeonsi/ci: update failures |
| - radeonsi: fix a TCS regression |
| - radeonsi: switch si_get_blitter_vs to IO intrinsics |
| - radeonsi: remove unused code |
| - amd: update addrlib |
| - radeonsi: fix a front face regression (crash) |
| - nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads |
| - radv: reduce maxGeometryShaderInvocations to 32 |
| - ac/nir: handle disabled PS VGPRs in ac_nir_load_arg_at_offset |
| - amd: lower load_pixel_coord in NIR |
| - amd: lower load_frag_coord in NIR |
| - amd: lower load_local_invocation_id in NIR |
| - amd: lower load_first_vertex/base_instance/draw_id/view_index in NIR |
| - amd: lower load_invocation_id in NIR |
| - amd: lower load_sample_id in NIR |
| - amd: lower load_sample_pos in NIR |
| - amd: lower load_frag_shading_rate in NIR |
| - amd: lower load_front_face in NIR |
| - ac,radeonsi: move load_vector_arg flags to common code |
| - amd: lower load_barycentric_pixel/centroid/sample in NIR |
| - amd: lower load_barycentric_at_offset in NIR |
| - amd: lower load_gs_wave_id_amd in NIR |
| - amd: lower load_vertex_id/instance_id and overwrite_vs_arguments in NIR |
| - radeonsi: don't return 0 from si_get_max_workgroup_size |
| - ac/nir: extract a load_subgroup_id lowered helper |
| - amd: lower load_local_invocation_index in NIR |
| - amd: lower load_subgroup_invocation in NIR |
| - amd: lower load_tess_rel_patch_id/primitive_id/tess_coord and overwrite.. in NIR |
| - ac/llvm: remove already lowered cases |
| - ac/nir: lower more loads in ac_nir_lower_intrinsics_to_args instead of drivers |
| - ac/nir: clean up ac_nir_lower_indirect_derefs |
| - ac/nir: add helper ac_nir_load_arg_upper_bound |
| - ac/nir: set arg_upper_bound_u32 for vs_rel_patch_id |
| - ac/nir: split local_invocation_ids to 3 separate VGPR inputs |
| - ac/nir: set upper ranges for range analysis while lowering system values |
| - radeonsi: lower sysval intrinsics as late as possible |
| - amd: optimize atomics before lowering intrinsics |
| - radeonsi: use nir_opt_sink |
| - radeonsi: use nir_opt_move |
| - vulkan: silence an unused variable warning |
| - llvmpipe: silence an unused result warning |
| - util/disk_cache: silence unused result warnings |
| - nir: set nir_io_semantics::num_slots to at least 1 in build helpers |
| - nir: set src_type and dest_type to float implicitly for IO build helpers |
| - nir: don't set num_slots/src/dest_type/write_mask when they're set automatically |
| - nir: flip the early exit condition in nir_lower_io_temporaries |
| - nir: remove redundant option linker_ignore_precision |
| - nir: use IO intrinsics in nir_lower_bitmap |
| - nir: use IO intrinsics in nir_lower_drawpixels |
| - mesa: remove unused PROGRAM_SYSTEM_VALUE |
| - mesa: remove unused PROGRAM_WRITE_ONLY |
| - st/mesa: fold st_translate_prog_to_nir into prog_to_nir |
| - st/mesa: run DCE before st_unlower_io_to_vars |
| - st/mesa: use IO intrinsics in st_nir_lower_fog |
| - st/mesa: use IO intrinsics in st_nir_lower_position_invariant |
| - st/mesa: switch ATI_fs to IO intrinsics |
| - st/mesa: unlower IO for internal shaders if needed |
| - st/mesa: switch Z/S DrawPixels shaders to IO intrinsics |
| - st/mesa: switch GL_SELECT shader to IO intrinsics |
| - st/mesa: switch st_nir_make_passthrough_shader to IO intrinsics |
| - st/mesa: switch st_pbo_create_vs and st_pbo_create_gs to IO intrinsics |
| - st/mesa: switch PBO create_fs to IO intrinsics |
| - st/mesa: switch st_nir_make_clearcolor_shader to IO intrinsics |
| - st/mesa: don't use nir_copy_var |
| - st/mesa: recompute IO bases for ARB_vp/fp |
| - glsl: remove unused code |
| - glsl: fix corruption due to blake3 hash not being set for nir_opt_undef |
| - radeonsi: ignore PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY for TC-compatible HTILE |
| - radeonsi: simplify and fix enable_tc_compatible_htile_next_clear logic |
| - radeonsi: re-enable non-TC-compatible HTILE for write-only Z/S |
| - mesa: switch ARB_vp/fp to IO intrinsics |
| - mesa: switch fixed-func fragment program to IO intrinsics |
| - nir/algebraic: use is_used_once for comparison patterns |
| - nir/algebraic: add and improve pack/unpack patterns |
| - nir/algebraic: optimize pack_split(unpack(a).x, unpack(a).y) -> a |
| - radeonsi: fix a perf regression due to slow reply from GEM_WAIT_IDLE for timeout=0 |
| - radeonsi: always use RADEON_USAGE_DISALLOW_SLOW_REPLY |
| - ac: update ATOMIC_MEM definitions |
| - ac/nir: sort xfb info to facilitate vectorization of xfb stores |
| - ac/nir: vectorize streamout stores for legacy pipeline optimally |
| - ac/nir/ngg: vectorize streamout stores for NGG optimally |
| - ac/nir/ngg: fold so_vertex_index * so_stride into immediate offset |
| - ac/nir/ngg: export positions after streamout to improve performance |
| - ac,radeonsi: scalarize overfetching loads |
| - radeonsi: lower descriptors sooner to allow vectorizing descriptor loads |
| - amd: vectorize SMEM loads aggressively, allow overfetching for ACO |
| - radeonsi: don't set BREAK_PRIMGRP/WAVE_AT_EOI when tessellation is disabled |
| - radeonsi: only set BREAK_PRIMGRP/WAVE_AT_EOI when TES/GS need PrimID sysval after TES |
| - radeonsi/gfx12: enable alt_hiz_logic |
| - radeonsi/gfx12: set DIS_PG_SIZE_ADJUST_FOR_STRIP after shader compilation |
| - radeonsi/gfx12: use ACO if LLVM is 19 or older |
| - radeonsi/gfx12: use ACO for streamout because it's faster |
| - mesa: rework enablement of force_gl_names_reuse |
| - mesa: enable GL name reuse by default for all drivers except virgl |
| - ac/nir: remove broadcast_last_cbuf because it can be deduced from NIR |
| - ac/nir: split ac_nir_lower_ps into 2 passes |
| - nir: add barycentric coordinates src to load_point_coord_maybe_flipped |
| - ac: use Z_EXPORT_FORMAT=32_AR for Z + Alpha mrtz exports |
| - ac/llvm: lower vector load_const in NIR |
| - ac/llvm: remove the low-optimizing compiler option |
| - radeonsi: add si_screen::use_aco to shader cache key to fix shader cache failures |
| - radeonsi: remove unused variables from si_shader_context (LLVM) |
| - radeonsi: make many shader functions static or move them to .c files |
| - radeonsi: remove unused functions |
| - nir: add next_stage param to nir_slot_is_varying & nir_remove_sysval_output |
| - Revert "ac/llvm: enable wqm for ac_build_quad_swizzle from ac_build_fs_interp_mov" |
| - nir: add a pass that moves output stores to the end of the shader |
| - st/mesa: move VS & TES output stores to the end before unlowering IO |
| - mesa: switch fixed-func vertex program to IO intrinsics |
| - st/mesa: assert that all incoming shaders use lowered IO |
| - st/mesa: remove dead/no-op code due to IO being always lowered |
| - glsl: remove dead code due to IO being always lowered |
| - glsl: simplify nir_lower_io_to_temporaries logic |
| - nir: remove dead code due to IO being always lowered in st/mesa |
| - st/mesa: inline st_finalize_nir_before_variants |
| - nir: remove handling IO variables from passes used by st/mesa |
| - gallium/u_threaded: move tc_batch_execute after all call functions |
| - gallium/u_threaded: make the execute function table private |
| - gallium/u_threaded: use TC_END_BATCH to terminate the loop |
| - gallium/u_threaded: replace the function table with a switch and direct calls |
| - gallium/u_threaded: inline all tc_call functions |
| - gallium/u_threaded: sort cases in batch_execute by their occurrence |
| - zink/ci: skip KHR-Single-GL46...SizedDeclarationsPrimitive due to random timeout |
| - dri: put shared-glapi into libgallium.*.so |
| - glapi: stop using the remap table |
| - glapi: remove the remap table |
| - loader: improve the existing loader-libgallium non-matching version error |
| - glapi: rename exported symbols so as not to conflict with old libglapi |
| - freedreno/ci: skip a dmat3 div test timing out |
| - radv: don't call ac_nir_lower_ps_early |
| - ac/nir: optimize front_face in ac_nir_lower_ps_early |
| - ac/nir: lower sample_pos in ac_nir_lower_ps_early |
| - ac/nir: lower barycentric_at_offset/sample in ac_nir_lower_ps_early |
| - ac/nir: lower fbfetch_output in ac_nir_lower_ps_early |
| - ac/nir: return progress from ac_nir_lower_ps_early |
| - ac/nir: return progress from ac_nir_lower_ps_late |
| - ac/nir: handle FRAG_RESULT_COLOR with dual src blending in ac_nir_lower_ps_early |
| - ac/nir: switch passes to use nir_shader_intrinsics_pass |
| - ac/nir: drop 16x EQAA support from ac_get_ps_iter_mask |
| - ac/nir: clamp vertex color outputs in the right place |
| - radeonsi: sample shading state fixes |
| - ac,aco,radeonsi: replace SampleMaskIn with 1 << SampleID if full sample shading |
| - ac/nir: simplify force_*_sample_interp options in ac_nir_lower_ps_early |
| - ac/nir: simplify force_*_center_interp options in ac_nir_lower_ps_early |
| - ac/nir: optimize barycentric_at_sample(sample_id) in ac_lower_ps_early |
| - ac/nir: optimize frag_coord <-> pixel_coord in ac_nir_lower_ps_early |
| - ac/nir: eliminate sample_mask_in without MSAA in ac_nir_lower_ps_early |
| - ac/nir: cosmetic stuff for ac_nir_lower_ps |
| - aco: implement replacing frag_coord with pixel_coord in PS prolog |
| - aco: simplify how broadcast_last_cbuf is implemented in PS epilog |
| - aco: implement replacement of sample_mask_in with helper_invocation in PS prolog |
| - ac/nir: compute ddx/ddy for barycentric_at_offset at the beginning of shaders |
| - ac/nir: lower sample_pos to load_sample_positions_amd when frag_coord is center |
| - nir/opt_varyings: handle user barycentrics |
| - mesa: enable GL name reuse for virgl |
| - radeonsi: disallow compute queues on Raven/Raven2 due to hangs |
| - ac/nir: clamp vertex color outputs in the right place |
| - radeonsi: get sample positions from user SGPRs instead of memory |
| - radeonsi: fix PS prolog not counting used fragcoord VGPRs correctly |
| - radeonsi: implement replacing frag_coord with pixel_coord at draw time |
| - radeonsi: don't set the alpha ref user SGPR if alpha test doesn't use it |
| - radeonsi: simplify how broadcast_last_cbuf is implemented for PS epilogs |
| - radeonsi: use load_pixel_coord for polygon stipple lowering |
| - radeonsi: remove si_nir_kill_ps_outputs and use ac_nir_lower_ps_early instead |
| - radeonsi: add load_polygon_stipple_buffer_amd instead of using si_shader_args |
| - radeonsi: call si_init_gs_output_info in si_get_nir_shader |
| - radeonsi: add si_nir_shader_ctx holding parameters from si_get_nir_shader |
| - radeonsi: call si_nir_late_opts unconditionally |
| - radeonsi: set the "first" parameter of si_nir_opts correctly |
| - radeonsi: simplify how the NIR name of shader variants is modified |
| - radeonsi: cosmetic changes in get_nir_shader |
| - radeonsi: reorder NIR passes in get_nir_shader (part 1) |
| - radeonsi: reorder NIR passes in get_nir_shader (part 2) |
| - radeonsi: reorder NIR passes in get_nir_shader (part 3) |
| - radeonsi: split and restructure get_nir_shader |
| - radeonsi: get LS+HS and ES+GS together in get_nir_shader instead of separately |
| - radeonsi: set uses_vmem_load/sampler in get_nir_shaders |
| - radeonsi: move/rewrite PS color input gathering for shader variants |
| - radeonsi: use barycentrics from load_point_coord_maybe_flipped |
| - radeonsi: lower indirect indexing sooner |
| - radeonsi: move spi_ps_input_config functions up |
| - radeonsi: split si_fixup_spi_ps_input_config |
| - radeonsi: get SPI_PS_INPUT_ENA from shader variant NIR for ACO |
| - radeonsi: minor restructuring of si_llvm_compile_shader |
| - radeonsi: verify that SPI_PS_INPUT_ENA from LLVM is equal to ACO |
| - radeonsi: remove ac_shader_config from si_shader_part |
| - radeonsi: precompute COMPUTE_PGM_RSRC3 |
| - radeonsi: set SHARED_VGPR_CNT for compute for ACO |
| - radeonsi: set SHARED_VGPR_CNT for gfx shaders for ACO |
| - radeonsi: gather PS inputs from shader variant NIR |
| - radeonsi: don't set BASE in si_nir_lower_ps_color_input |
| - radeonsi: remove si_shader_info code that is no longer needed |
| - radeonsi: implement replacement of sample_mask_in with helper_invocation |
| - radeonsi: ignore pipe_rasterizer_state::force_persample_interp |
| - radeonsi: fix interpolateAt* with non-GL4 ARB_sample_shading |
| - radeonsi/ci: add more gfx11 flakes |
| - radeonsi: set gl_FragCoord to pixel center to fix GLCTS failures |
| - radeonsi: validate BITSET_TEST_RANGE_INSIDE_WORD assertion at compile time |
| - radeonsi: remove SI_TRACKED__UNUSED_GAP |
| - radeonsi: dead code removal and move some code out of headers |
| - radeonsi: remove redundant divergence analysis and smem flagging |
| - radeonsi: remove an incorrectly defined modifier |
| - winsys/amdgpu: disable DCC for gfx12 when using AMD_FORCE_FAMILY |
| - ac/fake_hw_db: deobfuscate GPU name strings |
| - gallium,st/mesa: allow reporting compile failures from create_vs/fs/.._state |
| |
| Mark Collins (5): |
| |
| - util: Add file modification notifier utility |
| - tu/util: Support toggling TU_DEBUG options at runtime |
| - tu/lrz: Check for TU_DEBUG(nolrz) late |
| - freedreno/docs: Document TU_DEBUG_FILE |
| - util/u_debug: Ignore newlines in \`parse_*_string` |
| |
| Martin Krastev (7): |
| |
| - svga/ci: enable vmware farm |
| - svga/ci: set vmware piglit job parallelism to 2 |
| - svga/ci: triage piglit failures |
| - svga/ci: update svga/ci KERNEL_TAG |
| - svga/ci: drop FDO_CI_CONCURRENT to 1 |
| - svga/ci: disable vmware farm |
| - svga/ci: enable vmware farm |
| |
| Martin Roukala (né Peres) (39): |
| |
| - zink/ci: document new-ish vangogh flakes |
| - ci: disable mupuf's farm |
| - Revert "ci: disable mupuf's farm" |
| - ci: disable mupuf's farm |
| - Revert "ci: disable mupuf's farm" |
| - freedreno-ci: document more a618-gl flakes |
| - freedreno-ci: document an a750-gl flake |
| - turnip/ci: document the a750-vkcts expectations |
| - turnip/ci: bump the vkcts a750 timeout by 15 minutes |
| - turnip/ci: skip a vkd3d test that causes a GPU hang on a750 |
| - nvk/ci: update the ga106 expectations |
| - zink/ci: update the nvk-ga106 expectations |
| - zink/ci: update the radv expectations |
| - radv/ci: update the vkcts expectations |
| - ci/test: make the .b2c-${arch}-test-* jobs provide a default b2c |
| - ci/tests: de-duplicate the b2c version between architectures |
| - ci/test: uprev to b2c v0.9.14 |
| - freedreno/ci: use the default b2c |
| - r300/ci: use the default b2c |
| - i915g/ci: use the default b2c version |
| - ci/b2c: modernize the job description to use run_* |
| - ci/b2c: run the machine registration check before the test container |
| - radeonsi/ci: update the vangogh expectations |
| - radeonsi/ci: run on ACO changes |
| - radeonsi/ci: run a fraction of glcts-vangogh in pre-merge |
| - ci/init-stage2: use the common scripts from the build artifact |
| - ci/b2c: use the runner description rather than ID |
| - ci/b2c: allow defining a boot watchdog |
| - freedreno/ci: use the boot watchdog to ensure the a750 boots |
| - zink/ci: update nvk expectations |
| - zink/ci: update RADV expectations |
| - radeonsi/ci: update the vangogh expectations |
| - ci/b2c: allow jobs to select a file in the dtb url |
| - ci/b2c: allow using another initrd that contains firmware |
| - freedreno/ci: uprev the a750 kernel to msm-next |
| - ci: fix the artifact name |
| - zink/ci: use the debian-built-testing for nvk |
| - ci/b2c: fix the S3 artifact for amd64 manual vk/gl |
| - turnip/ci: re-introduce the \`multiviewport` flakes |
| |
| Mary Guillemard (56): |
| |
| - agx: Add support for EGL_NV_context_priority_realtime |
| - panfrost: Report default value for GROUP_PRIORITIES_INFO in drm-shim |
| - pan/kmod: Expose medium priority on panfrost |
| - panvk: Implement global priority extensions |
| - panvk: Advertise VK_EXT_tooling_info |
| - panvk: Advertise VK_KHR_shader_non_semantic_info |
| - panvk: Advertise VK_KHR_shader_relaxed_extended_instruction |
| - panvk: Implement VK_KHR_zero_initialize_workgroup_memory |
| - bi: Execute nir_opt_algebraic after nir_lower_pack |
| - panvk: Implement VK_EXT_sampler_filter_minmax for v10 |
| - panvk: Only flag rw_nc pool as uncached on v10+ |
| - panvk: Take rasterization samples into account in draw |
| - panfrost: Remove faulty assert in cs_loop_conditional_* |
| - panvk: Wire occlusion queries to internals |
| - panvk: Implement occlusion queries for JM |
| - panvk: Implement occlusion queries for CSF |
| - panvk: Expose precise occlusion queries |
| - panvk: Advertise VK_EXT_host_query_reset |
| - panvk: Enable depthClamp and depthBiasClamp |
| - panvk: Enable shaderInt16 |
| - panvk: Advertise VK_KHR_index_type_uint8 |
| - panvk: Advertise VK_KHR_map_memory2 |
| - panvk: Disable integer array indices clamping |
| - panvk: Advertise VK_EXT_image_robustness |
| - panvk: Advertise VK_EXT_pipeline_robustness |
| - panvk: Call vk_free on queue array instead of vk_object_free |
| - panvk: Use vk_zalloc for queue array allocation |
| - panvk: Update Mali-G52 CI baseline |
| - panvk: Add a nightly job for Mali-G52 |
| - nak: Fix 8-bit selection for vectors |
| - nak: Simplify 16-bit vector selection to not use try_from |
| - meson: Add mesa-clc and install-mesa-clc options |
| - meson: Add precomp-compiler and install-precomp-compiler options |
| - asahi: Remove unneeded dependencies for asahi_clc |
| - util/bitpack_helpers: Use UINT64_MAX instead of ~0ULL |
| - util/bitpack_helpers: Make fixed packs CL safe |
| - nir,agx: Allow nir_precomp_print_blob to print a static array |
| - libcl: Respect NDEBUG for assert |
| - panfrost: Update ForEachMacros |
| - pan/genxml: Move pack_header to an external file |
| - libcl: Add VkQueryType and VkQueryResultFlagBits definitions |
| - pan/genxml: Switch unpack to use uint32_t |
| - pan/genxml: Emit struct details before pack function |
| - pan/genxml: Move [un]pack internals to use packed structs |
| - pan/genxml: Enforce explicit packed types on pan_[un]pack |
| - pan/genxml: Switch pan_section_ptr to cast to packed type |
| - pan/genxml: Switch [un]pack codegen to macros |
| - pan/genxml: Switch __gen_unpack to macros |
| - panfrost: Fix group priorities in drm-shim |
| - panfrost: Fix PROGRESS_LOAD destination register |
| - pan/bi: Properly encode LEA_BUF_IMM |
| - pan/bi: Remove shift lanes invalid encodings |
| - pan/bi: Fix invalid CLPER encoding |
| - pan/bi: Use 2D dimension with TEX_FETCH with CUBE on Valhall |
| - pan/decode: Fix indirect branch calculation for 64-bit |
| - panvk: Disallow unknown GPU models early in physical device init |
| |
| Matt Turner (16): |
| |
| - anv: Align anv_descriptor_pool::host_mem |
| - vulkan: Skip memcpy() call if passed null pointers |
| - anv: Protect memcpy/memset/qsort calls against NULL arguments |
| - anv: Avoid null ptr dereference |
| - intel: Avoid unaligned pointer access |
| - vulkan: Avoid pointer aliasing |
| - nir: Get correct number of components |
| - intel/decoder: Avoid duplicate symbols when expat is not available |
| - brw: Avoid reading past the end of \`p->store` |
| - brw: Pass brw_codegen to next_offset |
| - brw: Bounds check access to \`p->store` |
| - brw: Pass number and sizeof separately to calloc |
| - elk: Avoid reading past the end of \`p->store` |
| - elk: Pass brw_codegen to next_offset |
| - elk: Bounds check access to \`p->store` |
| - elk: Pass number and sizeof separately to calloc |
| |
| Matthew Brost (1): |
| |
| - anv/xe: Bind queue per anv_queue |
| |
| Mauro Rossi (4): |
| |
| - nvk/android: Avoid building error in nak bindings |
| - nvk/android: Advertise Vulkan 1.1 for Android 12L and lower |
| - nvk/android: Add support for ANDROID_native_buffer |
| - android: remove shared-glapi building rules |
| |
| Maíra Canal (3): |
| |
| - v3dv: Check multiple DRM primary nodes before picking the display fd |
| - v3dv: delete \`v3dv_debug.h` |
| - v3dv: use Mesa log infrastructure instead of using stderr |
| |
| Mel Henning (27): |
| |
| - nak: Fix two warnings of elided_named_lifetimes |
| - gallium/winsys/nouveau: Don't mark the api PUBLIC |
| - nak: Add nak_nir_mark_lcssa_invariants |
| - compiler/rust/bitset: Fix the bitset iterator |
| - compiler/rust: Fix running tests |
| - compiler/rust/bitset: Add a basic test |
| - compiler/rust/bitset: Removed unused start param |
| - compiler/rust/bitset: Make BitSetIter private |
| - compiler/rust/bitset: impl FromIterator |
| - compiler/rust/bitset: Remove impl Not |
| - compiler/rust/bitset: Add a lazy expression API |
| - compiler/rust/bitset: Take a stream in union_with |
| - nak: Migrate liveness to new bitset expression api |
| - compiler/rust/bitset: Don't expose words |
| - compiler/rust/bitset: Test next_unset() |
| - nak: Add ShaderModel::hw_reserved_gprs() |
| - nak: Add gpr_limit_from_local_size |
| - nir_validate: Handle unstructured control flow |
| - nak: lower_load_ssbo_descriptor modifies cf |
| - nir: Update num_blocks in sort_unstructured_blocks |
| - nvk: Fix an assertion in nvk_slm_area_ensure |
| - nak: Return VK_ERROR_UNKNOWN on assertion failure |
| - nak: Fix a spelling error |
| - nak/opt_copy_prop: Fix IAdd3 overflow check |
| - nak/opt_copy_prop: Add force_alu_src_type |
| - nak/opt_copy_prop: Force alu src for IAdd2X/IAdd3X |
| - driconf: force_vk_vendor on Deep Rock Galactic+NVK |
| |
| Mi, Yanfeng (2): |
| |
| - anv: Fix memory grow calculation overflow issue |
| - anv: increase instruction heap to 3Gb |
| |
| Michael Cheng (2): |
| |
| - anv: Add tracepoint for as_build |
| - intel: Expose Shader hashes for utrace and Perfetto |
| |
| Michel Dänzer (4): |
| |
| - Revert "util/mesa-db: Further simplify mesa_db_compact" |
| - Revert "util: Use persistent array of index entries" |
| - Revert "winsys/amdgpu: fix FD mismatch" |
| - winsys/amdgpu: Always use amdgpu_device_get_fd for aws->fd |
| |
| Michel Zou (1): |
| |
| - ac/gpu_info: Fix missing prototype mingw error |
| |
| Mike Blumenkrantz (38): |
| |
| - zink: restrict implicit feedback loop detection using miplevels/layers |
| - mesa: use default params for clearbuffer functions |
| - zink: rework query result checking |
| - zink: use internal map flag for qbos |
| - glsl: make gl_ViewID_OVR visible to all shader stages |
| - glsl: enable OVR_multiview if OVR_multiview2 is enabled |
| - lavapipe: stop storing texture handle for samplers |
| - vk/sampler: split out sampler init from create |
| - lavapipe: split out sampler init from create |
| - lavapipe: split out bda descriptor function params from struct |
| - lavapipe: fix bitmask type for sampler updating |
| - lavapipe: move workgraph lowering up and delete pipeline param |
| - lavapipe: unsupport NV_device_generated_commands |
| - lavapipe: stop using pipeline layouts in some places |
| - lavapipe: handle VK_REMAINING_ARRAY_LAYERS with HIC |
| - lavapipe: fix 3D->2D blitting |
| - lavapipe: abort on unsupported depth copy ops |
| - lavapipe: support zs<->color copies |
| - lavapipe: maintenance8 |
| - zink: enable maintenance8 |
| - glsl: plumb num_views down to shader_info::view_mask |
| - zink: fix viewport detection when switching last stage shaders |
| - zink: add radv ci fail |
| - zink: disable shader objects when viewmask is set |
| - zink: fix replacing incompatible pipelines |
| - egl: never select swrast for vmwgfx |
| - zink: deduplicate VkDevice and VkInstance |
| - aco: exclude novalidateir from codegen flags |
| - zink: check for bound gfx stages before dereferencing |
| - zink: add zink_resource_reference() util function |
| - zink: refcount needs_present resource |
| - ci: mark radv-raven-traces-restricted with allow_failure |
| - zink: emit SpvCapabilityDemoteToHelperInvocation for IsHelperInvocation |
| - zink: also refcount needs_present from frontbuffer flush |
| - zink: guard rebar check against fallback heap detection |
| - radv: fix error reporting for VkExternalMemoryTypeFlagBitsKHR |
| - zink: only enable unsynchronized_texture_subdata with HIC |
| - zink: never try to oom flush during unsync texture upload |
| |
| Mike Lothian (1): |
| |
| - gallium/radeon: Fix r600_pci_ids.h include |
| |
| Mykhailo Skorokhodov (1): |
| |
| - drirc/anv: force_vk_vendor=-1 for Bellwright |
| |
| Nanley Chery (22): |
| |
| - anv: Support non-0/1 sRGB fast-clear colors on gfx9 |
| - anv: Store fast-clear colors with the view swizzle |
| - anv: Drop fast-clear value conversion check |
| - intel/blorp: Assert 3D Ys fast-clear restriction |
| - intel/isl: Allow CCS on 3D 64bpp+ Tile64 |
| - intel: Allow CCS on 3D surfaces for gfx120 |
| - intel/isl: Fix DecompressInL3 assignment on gfx12.5 |
| - anv: Enable storage accesses with modifiers on gfx12+ |
| - anv: Enable more storage compression on gfx12+ |
| - anv: Only consider R32 image formats as supporting atomics |
| - anv: Allow compressed memtypes with default buffer types |
| - anv: Slow clear if fast-clear cost is not mitigated |
| - iris: Reduce fast-clear post-amble flushes |
| - iris: Use L3 Fabric flush in fast-clear post-amble on TGL |
| - anv: Reduce fast-clear post-amble synchronization |
| - anv: Use L3 Fabric flush in fast-clear post-amble on TGL |
| - anv: Drop bpc check for non-zero fast clears |
| - Revert "anv: turn off non zero fast clears for CCS_E" |
| - anv: Inline can_fast_clear_with_non_zero_color |
| - anv: Allow more single subresource fast-clears with FCV |
| - anv: Drop can_fast_clear_with_non_zero_color() |
| - anv: Limit slow clear heuristic to ACM and prior |
| |
| Patrick Lerda (8): |
| |
| - r600: fix the evergreen sampler when the minification and the magnification are not identical |
| - r600: restructure r600_create_vertex_fetch_shader() to remove memcpy() |
| - r600: ensure that the last vertex is always processed on evergreen |
| - r600: evergreen stencil/depth mipmap blit workaround |
| - r600: reverse fix spec ext_packed_depth_stencil getteximage |
| - winsys/radeon: fix radeon_winsys_bo_from_handle() related race condition |
| - r600: fix r600_init_screen_caps() has_streamout issue |
| - r600: fix r600_init_shader_caps() has_atomics issue |
| |
| Paulo Zanoni (3): |
| |
| - brw: don't forget the base when emitting SHADER_OPCODE_MOV_RELOC_IMM |
| - brw: don't read past the end of old_src buffer in resize_sources() |
| - brw: increase brw_reg::subnr size to 6 bits |
| |
| Pavel Ondračka (27): |
| |
| - r300: group KIL for R300/R400 |
| - r300: run nir_opt_algebraic in the backend |
| - r300: always transform sin/cos input for fs |
| - r300/ci: update RV410 CI expectations |
| - ci: bring back some i915g testing |
| - i915/ci: update CI expectations |
| - r300: disable ATI2N textures on R400 |
| - r300: disable microtiling for scanout buffers |
| - r300/ci: update CI expectations |
| - r300: fix uninitialized use in transform_vertex_ROUND |
| - nir: add support for clamping in nir_lower_tex_shadow |
| - etnaviv: always clamp shadow sampler comparison reference value |
| - r300: fix presubtract assert |
| - r300: move shadow lowering to NIR |
| - r300: reswizzle some shadow texture calculations to use w channel |
| - r300: delete backend shadow lowering code |
| - r300: use ssa-like form for gl_FragCoord transformation |
| - r300: add some more nir cleanup compiler passes |
| - r300: use ssa-like form for backend texture lowering |
| - r300: don't allocate fs registers when translating from NIR |
| - r300: get rid of the register rename pass |
| - r300: get rid of some texture fixups |
| - r300: remove support for register arrays from nir_to_rc |
| - r300: fix memory leak in constant remapping |
| - ci: fix debian-build-testing BUILDTYPE |
| - i915/ci: use debian-build-testing instead of debian-testing |
| - i915: rework shader compile failures reporting |
| |
| Peyton Lee (5): |
| |
| - frontends/va: add support for VAProcColorStandardExplicit |
| - frontends/va: add support for VAProcColorStandardExplicit |
| - frontends/va: function process_frame has return value |
| - radeonsi/vpe: optimize software functions |
| - radeonsi/vpe: add destroy_fence function |
| |
| Philipp Zabel (11): |
| |
| - teflon: Use correct convolution params struct |
| - teflon: Mark dilated convolutions and fused activation as not supported |
| - teflon: Support fused ReLU activation |
| - etnaviv/nn: Enable fused ReLU activation |
| - teflon: Add is_signed parameter to ml_subgraph_invoke and ml_subgraph_read_output |
| - etnaviv/nn: Add support for signed 8-bit tensors |
| - teflon/tests: prep test executor for signed convolutions |
| - teflon/tests: Enable int8 tests |
| - etnaviv/ml: Create combined input tensors for addition first |
| - teflon: Reject per-axis quantization |
| - teflon: Support fused ReLU6 activation via output saturation |
| |
| Pierre-Eric Pelloux-Prayer (40): |
| |
| - radv: set info->family_overridden when RADV_FORCE_FAMILY is used |
| - ac/surface: add flags to surface metadata |
| - radeonsi: refuse to import texture with family_overridden being set |
| - ac: rename ac_surface_test_common -> ac_fake_hw_db |
| - ac: add 'polaris12' gpu to ac_fake_hw_db |
| - ac: switch AMD_FORCE_FAMILY handling to using ac_fake_hw_db |
| - radeonsi/tests: update expected results |
| - ac/perfcounter: fix buffer overflow |
| - dri: Remove unused function |
| - radeonsi/gfx12: disable display dcc for front buffer rendering |
| - radeonsi: disable DCC for PIPE_BIND_USE_FRONT_RENDERING |
| - glx: return BadMatch for invalid reset notification strategy |
| - ac/nir: remove prim_stride_ret arg from ngg_build_streamout_buffer_info |
| - radeonsi: use bytes units in streamout |
| - DEPENDENCY: ac/llvm: fix sparse code handling |
| - radeonsi: fallback to util_blitter_draw_rectangle |
| - radeonsi/tests: update results |
| - gl/spirv: update subgroup_size if GroupNonUniform is used |
| - amd: move all uses of libdrm_amdgpu to ac_linux_drm |
| - amd: amdgpu-virtio implementation |
| - ac/virtio: disable userptr and local buffers |
| - ac/virtio: disable timeline syncobj support |
| - radeonsi: enable virtio native context support |
| - radv: enable virtio native context support |
| - radv/virtio: disable syncobj timeline support |
| - ac/virtio: add virtio-only AMDGPU_GEM_CREATE flag |
| - radeonsi, radv, virtio: use AMDGPU_GEM_CREATE_VIRTIO_SHARED |
| - radeonsi: clear the debug callback on ctx destroy |
| - ttn: init source_blake3 and name from tgsi_shader_info |
| - ac/llvm: add wqm param to ac_build_quad_swizzle |
| - ac/llvm: enable wqm for ac_build_quad_swizzle from ac_build_fs_interp_mov |
| - radeonsi: do not use std::max |
| - glx: fix glx-create-context-invalid-es-version |
| - dri: use _checked variants of xcb requests |
| - dri: deal with ARGB1555 |
| - egl/wayland: validate dri_screen_display_gpu before use |
| - amd: add ac_drm_device_get_cookie |
| - radeonsi: use ac_drm_device_get_cookie |
| - radeonsi: update si_need_gfx_cs_space upper bound |
| - radeonsi: disable dcc when external shader stores are used |
| |
| Qiang Yu (81): |
| |
| - ac/surface/tests: support all block sizes |
| - ac/surf: add more modifiers to gfx12 supported list |
| - radeonsi: disable use_gfx12_xfb_intrinsic when using ACO |
| - util/blake3: add _mesa_blake3_from_printed_string |
| - radeonsi: add AMD_FORCE_SHADER_USE_ACO for debug |
| - nir: do not generate b2i64 when the driver wants to lower it |
| - aco: enable gfx12 support for radeonsi |
| - radeonsi: fix unigine heaven crash when using aco on gfx8/9 |
| - aco: fix voffset missing when buffer store base >=4096 |
| - radeonsi: fix OpenCL shader compile fail |
| - ac/nir: lower access for shared and scratch memory |
| - ac,radv: move ac_nir_lower_bit_size_callback to common place |
| - radeonsi: fix OpenCL piglit tests fails when using ACO |
| - radeonsi: replace ac_nir_lower_subdword_loads |
| - ac: remove ac_nir_lower_subdword_loads |
| - radeonsi: fix global access ACO compile fail when OpenCL |
| - radeonsi: enable ACO by default for pre-GFX10 GPUs |
| - radeonsi: unify disk cache id no matter use_aco or not |
| - gallium: add pipe_caps struct definition |
| - gallium: add u_init_pipe_screen_caps |
| - asahi: add agx_init_screen_caps |
| - crocus: add crocus_init_screen_caps |
| - d3d12: add d3d12_init_screen_caps |
| - etnaviv: add etna_init_screen_caps |
| - freedreno: add fd_init_screen_caps |
| - i915: add i915_init_screen_caps |
| - iris: add iris_init_screen_caps |
| - lima: add lima_init_screen_caps |
| - llvmpipe: add llvmpipe_init_screen_caps |
| - nouveau/nv30: add nv30_init_screen_caps |
| - nouveau/nv50: add nv50_init_screen_caps |
| - nouveau/nvc0: add nvc0_init_screen_caps |
| - panfrost: add panfrost_init_screen_caps |
| - r300: add r300_init_screen_caps |
| - r600: add r600_init_screen_caps |
| - radeonsi: add si_init_screen_caps |
| - softpipe: add softpipe_init_screen_caps |
| - svga: add svga_init_screen_caps |
| - tegra: init screen caps |
| - v3d: add v3d_init_screen_caps |
| - vc4: add vc4_init_screen_caps |
| - virgl: add virgl_init_screen_caps |
| - zink: add zink_init_screen_caps |
| - nine: change cap macros to use pipe_caps access |
| - egl,gallium,glx: replace dri_get_screen_param with pipe_caps access |
| - mesa/st: enable extension use pipe_caps access |
| - egl,gallium,gbm,mesa: replace get_param with pipe_caps access |
| - gallium,mesa: replace get_paramf with pipe_caps access |
| - rusticl: use pipe_caps access |
| - asahi: remove agx_get_param and agx_get_paramf |
| - crocus: remove crocus_get_param and crocus_get_shader_paramf |
| - d3d12: remove d3d12_get_param and d3d12_get_paramf |
| - etnaviv: remove etna_screen_get_param and etna_screen_get_paramf |
| - freedreno: remove fd_screen_get_param and fd_screen_get_paramf |
| - i915: remove i915_get_param and i915_get_paramf |
| - iris: remove iris_get_param and iris_get_paramf |
| - lima: remove lima_screen_get_param and lima_screen_get_paramf |
| - llvmpipe: remove llvmpipe_get_param and llvmpipe_get_paramf |
| - nouveau/nv30: remove nv30_screen_get_param and nv30_screen_get_paramf |
| - nouveau/nv50: remove nv50_screen_get_param and nv50_screen_get_paramf |
| - nouveau/nvc0: remove nvc0_screen_get_param and nvc0_screen_get_paramf |
| - panfrost: remove panfrost_get_param and panfrost_get_paramf |
| - r300: remove r300_get_param and r300_get_paramf |
| - r600: remove r600_get_param and r600_get_paramf |
| - radeonsi: remove si_get_param and si_get_paramf |
| - softpipe: remove softpipe_get_param and softpipe_get_paramf |
| - svga: remove svga_get_param and svga_get_paramf |
| - tegra: remove tegra_screen_get_param and tegra_screen_get_paramf |
| - v3d: remove v3d_screen_get_param and v3d_screen_get_paramf |
| - vc4: remove vc4_screen_get_param and vc4_screen_get_paramf |
| - virgl: remove virgl_get_param and virgl_get_paramf |
| - zink: remove zink_get_param and zink_get_paramf |
| - gallium: remove get_param and get_paramf |
| - docs,src: replace doc and comments for PIPE_CAP with pipe_caps |
| - gallium,mesa: remove uint suffix from pipe_caps |
| - radeonsi: remove si_screen.max_texel_buffer_elements |
| - etnaviv: remove min/max_texture_gather_offset init |
| - lavapipe: fix min_vertex_pipeline_param |
| - gallium: fix ddebug and noop screen caps init |
| - radeonsi: fix has_non_uniform_tex_access info |
| - radeonsi: fix GravityMark corruption when using aco |
| |
| Rebecca Mckeever (14): |
| |
| - panvk: Use vk_image::drm_format_mod instead of pan_image::layout.modifier |
| - panvk: Replace tab with spaces |
| - panvk: Enable multiplane images and image views |
| - pan/texture: s/pan_image_view_get_zs_image/pan_image_view_get_zs_plane/ |
| - pan/texture: s/pan_image_view_get_rt_image/pan_image_view_get_color_plane/ |
| - pan/texture: Accept holes in the pan_image_view::planes array |
| - pan/desc: Pass an image to pan_force_clean_write_rt() |
| - pan/desc: Add a pan_image_view_get_s_plane() helper and use it |
| - panvk: Support D32_S8 as a multiplanar format |
| - pan/format: Use HW version to determine siting for YUV 422 formats |
| - pan/texture: Only use plane_chroma_2p for chroma planes |
| - util/hash_table: Add _mesa_hash_table_u64_replace() |
| - panvk: Allow a 32-bit binding value in desc id key and use 64-bit keys |
| - panvk: Fix assertion in is_disjoint() |
| |
| Rhys Perry (72): |
| |
| - nir: add more intrinsics to nir_intrinsic_can_reorder |
| - nir/algebraic: optimize bcsel(ieq(b, 0), a, shift(a, b)) |
| - nir/algebraic: optimize ushr(a, ishl(iand(b, 3), 3)) |
| - ac/nir: add ACCESS_CAN_REORDER to lowered load_global_constant |
| - aco: optimize nir_op_shfr with <32 src1 |
| - nir,aco,ac/llvm: add nir_op_alignbyte_amd |
| - nir_lower_mem_access_bit_sizes: support 64-bit offsets |
| - nir_lower_mem_access_bit_sizes: add nir_mem_access_shift_method |
| - nir_lower_mem_access_bit_sizes: pass access to callback |
| - nir_lower_mem_access_bit_sizes: support load_constant |
| - aco,ac/nir: flag loads to use smem in NIR |
| - radv,ac/nir: lower sub-dword loads using nir_lower_mem_access_bit_sizes |
| - aco: remove load byte_align |
| - radv,ac/nir: split global access using nir_lower_mem_access_bit_sizes |
| - nir/algebraic: fix iabs(ishr(iabs(a), b)) optimization |
| - nir/algebraic: check bit sizes in lowered unpack(pack()) optimization |
| - nir/lcssa: fix premature exit of loop after rematerializing derefs |
| - glsl/list: add comments above foreach macros |
| - glsl/list: add and use helpers in foreach_list_typed macros |
| - glsl/list: remove parenthesis in foreach_list_typed macros |
| - glsl/list: remove underscores in foreach_list_typed macros |
| - nir/opt_move_discards_to_top: use nir_tex_instr_has_implicit_derivative |
| - nir: fix return value of nir_instr_move for some cases |
| - nir/opt_move_discards_to_top: remove recursion |
| - nir/opt_move_discards_to_top: update variable name |
| - nir/opt_move_discards_to_top: use nir_intrinsic_can_reorder |
| - nir/opt_move_discards_to_top: add more intrinsics to add_src_to_worklist |
| - nir/opt_move_discards_to_top: allow multiple discards to be moved |
| - nir/lcssa: use nir_intrinsic_can_reorder |
| - nir/algebraic: add ddxy to is_only_used_as_float |
| - nir/algebraic: add is_used_once to bcsel(, bcsel()) opts |
| - nir/algebraic: optimize more bcsel(, bcsel()) |
| - aco: add SSA repair pass |
| - aco: use repair pass for LCSSA workaround |
| - aco: require WQM after demote in control flow |
| - aco: skip code if exec is empty |
| - aco/tests: add tests for empty exec masks |
| - aco: don't use uniform continues if exec might be empty |
| - aco: make small_vec copyable |
| - aco: use small_vec in RegCounterMap |
| - nir/tests: fix SSA dominance in opt_if_merge tests |
| - aco/gfx12: insert wait between VMEM WaW |
| - aco: force linear for event_vmem_sample and event_vmem_bvh |
| - aco: don't CSE p_shader_cycles_hi_lo_hi |
| - radv: constant fold after lowering memory accesses |
| - radv: fix expanded push constant loads when all are inlined |
| - radv: skip loading unused push constants |
| - ac/nir: have ac_nir_lower_mem_access_bit_sizes preserve >128 bit SMEM |
| - nir: make load_helper_invocation non-reorderable |
| - nir/move_discards_to_top: don't move across more intrinsics |
| - nir: make ballot ALU and mbcnt_amd operations reorderable |
| - aco: fix max_workgroup_count[0] |
| - aco: decrease max_workgroup_size |
| - radv: increase maxComputeWorkGroupCount[0] |
| - aco/tests: fix skip_lines=True with remaining characters in matches |
| - aco/util: fix bit_reference::operator&= |
| - aco: use VOP3 v_mov_b16 if necessary |
| - v3dv: fix SSA dominance error |
| - microsoft/compiler: invalidate loop analysis in dxil_nir_lower_double_math |
| - microsoft/compiler: repair SSA in dxil_nir_split_tess_ctrl |
| - d3d12: fix phi handling in d3d12_lower_primitive_id |
| - d3d12: store only once in d3d12_emit_points |
| - nir: rerun loop analysis if the parameters change |
| - nir/loop_analyze: use a sparse array and stop indexing SSA defs |
| - nir/gcm: stop preserving nir_metadata_loop_analysis |
| - nir/liveness: stop requiring instr indices |
| - nir/validate: validate metadata |
| - nir/validate: preserve dominance during SSA validation |
| - nir/validate: validate ssa dominance by default |
| - radv: set has_image_bvh_intersect_ray for null winsys |
| - aco: don't use divergence information for most ALU defs |
| - nir/divergence: assume all instructions are loop invariant if no continues |
| |
| Rob Clark (11): |
| |
| - vdrm+tu+fd: Make cross-device optional |
| - freedreno/registers: Add GMU_CORE_FW_VERSION |
| - freedreno/a6xx: Align lrz setup with tu |
| - freedreno/a6xx: Add nolrzfc debug option |
| - freedreno/a6xx: Align lrz height to 32 |
| - tu: Align lrz height to 32 |
| - freedreno/a6xx: Use LATE_Z with OC + discard |
| - freedreno/a6xx: Fix timestamp emit |
| - ir3: Add preamble instr count metric |
| - freedreno/pps: Fix multiple counter collection runs |
| - tu: Fix raytracing query with vdrm |
| |
| Robert Mader (2): |
| |
| - v3d: Support SAND128 base modifier |
| - freedreno: Support offset query for multi-planar planes |
| |
| Rohan Garg (5): |
| |
| - intel/compiler: disable mesh autostrip for WA 16020916187 |
| - iris: use CALLOC_STRUCT instead of calloc for readability |
| - isl: disable aux when creating uncompressed TileY/Tile64 surfaces from compressed ones |
| - anv: refactor choose_isl_tiling_flags to pass fewer arguments |
| - iris: assert that we're not exporting a TILE64 surface |
| |
| Roland Scheidegger (1): |
| |
| - llvmpipe: Fix overflow issues calculating loop iterations for aniso |
| |
| Roman Stratiienko (1): |
| |
| - v3dv/android: Suppress AHB-related log spam |
| |
| Ruijing Dong (2): |
| |
| - radeonsi/vcn: enable EFC for VCN5.0+ when gfx >= 12 |
| - radeonsi/vcn: center mv map buffer changed in vcn5.x |
| |
| Russell Greene (1): |
| |
| - perfetto: fix macos compile |
| |
| Sagar Ghuge (30): |
| |
| - anv: Enable MCS_CCS compression on Gfx12+ |
| - blorp: Use the calculated execution mask |
| - anv: Update include dir for anv_tests |
| - anv: Split GRL code path in separate file |
| - anv: Add header to track BVH data structures |
| - anv: Add shader to build BVH header |
| - anv: Add shader to copy acceleration structures |
| - anv: Implement cmd_fill_buffer_addr callback |
| - anv: Move update buffer code in helper |
| - anv: Implement write_buffer_cp callback |
| - anv: Implement flush_buffer_write_cp callback |
| - anv: Implement cmd_dispatch_unaligned callback |
| - anv: Implement acceleration structure API |
| - anv: Add helper to copy data from src to dest anv_address |
| - intel: Use the common RT BVH framework |
| - intel/compiler: Extend nir_intrinsic_load_topology_id_intel for xe3 |
| - intel/genxml: Drop morton walk field from Xe2 |
| - intel/genxml: Update COMPUTE_WALKER_BODY |
| - intel: Use Morton compute walk order |
| - intel/genxml: Update SAMPLER_STATE structure |
| - anv: Switch to ANISOTROPIC_FAST filter mode |
| - iris: Switch to ANISOTROPIC_FAST filter mode |
| - intel: Set correct maxComputeSharedMemorySize for Xe3+ |
| - intel/genxml: Add coarse pixel related changes |
| - anv: Add pipelined coarse pixel state |
| - intel/genxml: Update URB related instructions and structures |
| - iris: Use 3DSTATE_URB_ALLOC_* instructions |
| - blorp: Use 3DSTATE_URB_ALLOC_* instructions |
| - anv: Use 3DSTATE_URB_ALLOC_* instructions |
| - intel/brw/xe3+: Don't compile SIMD32 if there are ray queries |
| |
| Sam Lantinga (1): |
| |
| - util: Fixed crash in HEVC encoding on 32-bit systems |
| |
| Samuel Pitoiset (241): |
| |
| - aco: cleanup using fixed registers in the trap handler shader |
| - aco: save/restore SCC in the trap handler shader |
| - aco: use scalar buffer stores for dumping SGPRS from the trap on GFX8 |
| - aco: add a helper to dump SGPR to memory for the trap handler |
| - aco: fix storing SQ_WAVE_STATUS in the trap handler shader |
| - aco: declare phys regs for tba_hi/tma_hi |
| - radv,aco: dump m0 and exec from the trap handler |
| - vulkan/runtime: return same cmdbuf level from the command pool freelist |
| - docs: add missing documentation for RADV_DEBUG=psocachestats |
| - radv: remove unused parameter to radv_fill_nir_compiler_options() |
| - radv: dump the trap handler shader with RADV_DEBUG=dump_trap_handler |
| - aco: do not reorder s_trap instructions |
| - radv: cleanup printing SGPRS dumped from the trap handler |
| - radv,aco: dump more SQ_WAVE regs from the trap handler |
| - radv,aco: add a separate function to compile the trap handler shader |
| - aco: simplify postprocessing the trap handler shader |
| - radv,aco: use the trap handler layout struct while compiling the shader |
| - radv: fix the TMA descriptor size |
| - radv: compute the TMA BO size instead of using a constant |
| - radv,aco: save/restore overwritten VGPRs in the trap handler shader |
| - nir: add nir_intrinsic_debug_break instruction |
| - spirv: handle NonSemantic.DebugBreak to emit nir_debug_break() |
| - aco: emit nir_intrinsic_debug_break |
| - radv: emit nir_debug_break instructions when the trap handler is enabled |
| - radv: do not always invalidate L2 for GPUs with non-coherent RBs on GFX10+ |
| - radv: move the GFX11 special case for mips to radv_image_is_pipe_misaligned() |
| - radv: determine the first mip that is pipe misaligned on GFX10+ |
| - radv: use vk_image_view_subresource_range() when possible |
| - radv: pass the image subresource range to radv_{src,dst}_access_flush() |
| - radv: optimize the pipe misaligned L2 cache invalidation on GFX11 |
| - aco: fix saving/restoring VGPRS in the trap handler on GFX9 |
| - aco: use a 64-bit mov to save exec in the trap handler shader |
| - aco: add a new variant for vop1() with two operands |
| - aco: fix validation for v_movrels_b32 and friends |
| - aco: restore m0/exec before exiting the trap handler |
| - aco: use all invocations from the current wave in the trap handler |
| - aco: save/restore VGPRS on GFX8 in the trap handler shader |
| - aco: drop the second M0 operand for s_set_gpr_idx_on |
| - radv,aco: dump VGPRS from the trap handler shader |
| - radv: mark live invocations when dumping VGPRS with the trap handler |
| - radv: dump SPIR-V and NIR for the faulty shader detected with the trap |
| - radv: fix ignoring src stage mask when dst stage mask is BOTTOM_OF_PIPE |
| - radv: consider VK_PIPELINE_STAGE_2_NONE like BOTTOM_OF_PIPE |
| - radv: destroy meta resources properly when creating the device failed |
| - radv: add a helper to destroy a logical device |
| - radv: add a new drirc option to disable DCC for mips and enable it for RDR2 |
| - radv,aco: dump LDS from the trap handler |
| - radv: remove VK_VALVE_descriptor_set_host_mapping |
| - radv: fix skipping on-disk shaders cache when not useful |
| - radv: mark VERDE (GFX6) as Vulkan 1.3 conformant |
| - radv: fix dumping debug/perftest options when there are holes |
| - radv: add a pipeline helper to skip shaders cache |
| - radv: fix dumping the trap handler shader disassembly |
| - radv: fix printing with RADV_DEBUG=psocachestats |
| - radv: only pass relevant stages when emitting DGC push constants |
| - radv: capture shader executable info at shader creation time |
| - radv: allow shaders caching with RADV_DEBUG=hang and the trap handler |
| - vulkan: add MESA_VK_TRACE_PER_SUBMIT |
| - radv: finish tools after cleaning meta resources |
| - radv: add new start/stop sqtt helpers for capturing with SQTT |
| - radv: add support for capturing RGP per-submit |
| - radv: add address binding report support for BOs imported with a fd |
| - radv: add address binding report support for BOs imported with a ptr |
| - radv: add a small helper to dump VM fault with the GPU hang report |
| - radv: dump address binding report with RADV_DEBUG=hang |
| - radv: try to detect use-after-free with address binding report |
| - zink/ci: skip one more modifier test on POLARIS10 |
| - radv: promote VK_KHR_dynamic_rendering_local_read to core 1.4 API |
| - radv: promote VK_KHR_global_priority to core 1.4 API |
| - radv: promote VK_KHR_index_type_uint8 to core 1.4 API |
| - radv: promote VK_KHR_line_rasterization to core 1.4 API |
| - radv: promote VK_KHR_maintenance5 to core 1.4 API |
| - radv: promote VK_KHR_maintenance6 to core 1.4 API |
| - radv: promote VK_KHR_map_memory2 to core 1.4 API |
| - radv: promote VK_KHR_push_descriptor to core 1.4 API |
| - radv: promote VK_KHR_shader_subgroup_rotate to core 1.4 API |
| - radv: promote VK_EXT_pipeline_robustness to core 1.4 API |
| - radv: add new Vulkan 1.4 features/properties |
| - radv: advertise Vulkan 1.4 on GFX8+ |
| - radv: bump VKCTS conformance version to 1.4.0.0 for some GFX8+ GPUs |
| - radv/ci: mark few tests as expected failures |
| - ac/parse_ib: fix parsing SDMA CONSTANT_FILL packet |
| - ac/parse_ib: print VA for the SDMA CONSTANT_FILL/WRITE packets |
| - radv: fix stencil only copies of depth/stencil images with SDMA |
| - radv: enable DGC IES for compute with ESO |
| - radv: fix initializing HTILE when the image has VRS rates |
| - ci: update VKCTS main to a9f7069b9a5ba94715a175cb1818ed504add0107 |
| - radv: remove redundant drirc for incorrect dual-source blending |
| - radv: add radv_disable_dcc_stores and enable for Indiana Jones: The Great Circle |
| - radv: only dump device name info on Linux with RADV_DEBUG=hang |
| - radv: dump the Mesa version with RADV_DEBUG=hang |
| - radv/meta: add missing vk_meta_device_finish() |
| - radv/meta: move vk_meta_device_init() to radv_device_init_meta() |
| - radv: disable alphaToOne except for Zink |
| - ac/nir: export alpha to MRTZ.a and one to MRT0.a for alpha-to-one on GFX11 |
| - aco: export alpha to MRTZ.a and one to MRT0.a for alpha-to-one on GFX11 |
| - radv: fix alpha-to-coverage with alpha-to-one when MRTZ is also exported |
| - radv: remove remaining discard to demote options |
| - radv: fix disabling DCC for stores with drirc |
| - radv: simplify determining some fragment shader info with epilogs |
| - radv: fix alpha-to-coverage with alpha-to-one without MRTZ |
| - Revert "radv: disable alphaToOne except for Zink" |
| - spirv: add an option to lower SpvOpTerminateInvocation to OpKill |
| - radv: add radv_lower_terminate_to_discard and enable for Indiana Jones |
| - radv: mark HAWAII (GFX7) as Vulkan 1.3 conformant |
| - radv: report same buffer alignment for DGC preprocessed buffer |
| - Revert "radv: fix creating unlinked shaders with ESO when nextStage is 0" |
| - radv/ci: fix expected list of failures for TAHITI |
| - radv: fix missing variants for the last VGT stage with shader object |
| - ci: uprev vkd3d-proton to c965c1351fd6915a65bb7f647319536252a24a93 |
| - radv: fix capturing RT pipelines that return VK_OPERATION_DEFERRED_KHR for RGP |
| - radv: reorganize query code by adding separate begin/end helpers |
| - radv: remove dead code in radv_CmdCopyQueryPoolResults() |
| - radv: add few more query helpers for copying results |
| - radv: only enable emulated mesh/task shader queries on GFX10.3 |
| - radv/nir: fix checking if task shader invocations query is enabled |
| - radv: fix getting the number of vertices per prim for the last VGT stage |
| - radv: rename GDS queries to emulated queries |
| - radv/nir: simplify lowering of query intrinsics |
| - radv: cleanup enabling the global BO list when BDA is used |
| - radv: check descriptor indexing features for enabling the global BO list |
| - radv: rework emitting SPI_SHADER_Z_FORMAT |
| - radv: rename color output state to fragment output state |
| - radv: add support for VK_PRIMITIVE_TOPOLOGY_META_RECT_LIST_MESA |
| - radv: use VK_PRIMITIVE_TOPOLOGY_META_RECT_LIST_MESA for meta pipelines |
| - radv: pass extra graphics pipeline create info using pNext |
| - radv/meta: rework creating meta pipelines for query resolves |
| - radv/meta: convert the copy/fill pipelines to vk_meta |
| - radv/meta: convert the copy VRS to HTILE pipelines to vk_meta |
| - radv/meta: convert the FMASK expand pipelines to vk_meta |
| - radv/meta: convert the FMASK copy pipelines to vk_meta |
| - radv/meta: convert the DCC retile pipelines to vk_meta |
| - radv/meta: convert the HTILE expand CS pipelines to vk_meta |
| - radv/meta: convert the DCC decompress CS pipelines to vk_meta |
| - radv/meta: convert the clear HTILE mask pipelines to vk_meta |
| - radv/meta: convert the DCC comp-to-single pipelines to vk_meta |
| - radv/meta: convert DGC pipeline layout to vk_meta |
| - radv/meta: convert the query resolve pipelines to vk_meta |
| - radv/meta: convert the image-to-buffer pipelines to vk_meta |
| - radv/meta: convert the buffer-to-image pipelines to vk_meta |
| - radv/meta: convert the image-to-image pipelines to vk_meta |
| - radv/meta: convert the clear image pipelines to vk_meta |
| - radv/meta: convert the compute resolve pipelines to vk_meta |
| - radv/meta: remove radv_meta_create_compute_pipeline() |
| - vulkan: add a new vk_meta option to use the rect list pipeline path |
| - vulkan: use the meta pipeline cache for graphics pipelines |
| - radv/meta: convert the HTILE expand GFX pipelines to vk_meta |
| - radv/meta: convert the HW resolve GFX pipelines to vk_meta |
| - radv/meta: convert the fast-clear GFX pipelines to vk_meta |
| - radv/meta: convert the blit GFX pipelines to vk_meta |
| - radv/meta: convert the clear GFX pipelines to vk_meta |
| - radv/meta: convert the resolve GFX pipelines to vk_meta |
| - radv/meta: use only one push constant range for blit2d pipelines |
| - radv/meta: convert the blit2d GFX pipelines to vk_meta |
| - radv/meta: remove unused radv_meta_create_xxx() helpers |
| - radv: fix destroying DGC pipelines |
| - radv: disable RT with LLVM completely |
| - radv/meta: remove a workaround for building accel structs with LLVM |
| - radv/meta: always initialize emulated etc2 on-demand |
| - radv/meta: move initializing emulated astc to radv_device_init_meta() |
| - radv/meta: stop initializing RT accel structs |
| - radv: fix adding the BO to cmdbuf list when emitting buffer markers |
| - radv/meta: fix loading the meta pipeline cache |
| - radv/meta: reduce length of some cache keys |
| - radv/meta: add radv_meta_get_noop_pipeline_layout() |
| - radv/meta: do not create redundant pipeline layout objects |
| - radv: disable logic op for float/srgb formats |
| - ac/descriptors: fix configuring NBC views on GFX12 |
| - aco: fix VS prologs on GFX12 |
| - radv: disable VRS coarse shading with 8x MSAA on GFX12 |
| - radv: configure the VRS surface swizzle mode on GFX12 |
| - radv: fix programming WALK_ALIGN8_PRIM_FITS_ST on GFX12 |
| - radv: program DB_RENDER_OVERRIDE correctly on GFX12 |
| - ac/nir: fix lowering subgroup ID for compute shaders on GFX12 |
| - ac/nir: fix a comment typo in load_subgroup_id_lowered() |
| - ac/gpu_info: add cp_dma_use_L2 |
| - radv: fix CP DMA clears/copies on GFX12 |
| - aco: always use ds_bpermute for shuffle/rotate on GFX12 |
| - radv: fix configuring the attribute ring size on GFX12 |
| - radv: rename attr_ring to ge_rings |
| - radv: change the BASE_HI field for VGT_TF_MEMORY_BASE_HI on GFX12 |
| - ac/surface: honor RADEON_SURF_PREFER_xxx_ALIGNMENT on GFX12 |
| - radv: advertise VK_MESA_image_alignment_control on GFX12 |
| - radv: fix emitting SPI_SHADER_GS_OUT_CONFIG_PS with NULL FS on GFX12 |
| - radv: fail to initialize when the AMD GPU generation is unsupported |
| - radv: mark AMD CDNA as unsupported |
| - radv: add GFX12 support to the null winsys |
| - ac/nir: fix skipping streamout when no buffers are bound on GFX12 |
| - vulkan: Update XML and headers to 1.4.305 |
| - radv: promote VK_EXT_depth_clamp_zero_one to KHR |
| - radv: bump maxViewportDimensions to 32K on GFX12 |
| - radv: add a helper to report if cooperative matrix is enabled |
| - zink/ci: add lists for RADV/GFX1200 |
| - radv: remove duplicate definition of SQTT_BUFFER_ALIGN_SHIFT |
| - ac/sqtt: update programming SQTT on GFX12 |
| - radv: add support for VkMemoryBarrierAccessFlags3KHR |
| - radv: adjust the source aspect for color to depth/stencil image copies |
| - radv: advertise VK_KHR_maintenance8 |
| - radv: do not overallocate the number of exports for streamout on GFX12 |
| - radv: fix transform feedback on GFX12 |
| - radv: declare a new user SGPR for emulating queries on GFX12 |
| - radv: lower emulated queries with global atomics on GFX12 |
| - radv: allocate memory for the shader query buffer on GFX12 |
| - radv: emit the shader buffer query VA on GFX12 |
| - radv: use global atomics for generated/written primitives query on GFX12 |
| - radv: re-emit streamout state for GFX12 when the user SGPR changes |
| - radv: exclude layer when recomputing FS input bases |
| - ac/cmdbuf: program SPI_SHADER_GS_MESHLET_CTRL to 0 in the GFX12 preamble |
| - radv: program COMPUTE_DISPATCH_INTERLEAVE on GFX12 |
| - radv: add support for BO metadata on GFX12 |
| - radv: add a new helper to set image BO metadata |
| - ac/gpu_info: add gfx12_supports_display_dcc |
| - radv: fix an assertion about DCC and modifier on GFX12 |
| - radv: fix the number of drm modifier planes for DCC on GFX12 |
| - ci: update VKCTS main to a9988483c0864d7190e5e6264ccead95423dfd00 |
| - radv/ci: update descriptor buffer skipped tests |
| - radv: fix disabling logic op for srgb/float formats when blending is enabled |
| - radv: disable video support on GFX12 |
| - radv: disable VK_KHR_cooperative_matrix on GFX12 |
| - radv: fix programming pitches for LINEAR_SUB_WINDOW on GFX12 |
| - radv: fix programming mip level for TILED_SUB_WINDOWS on GFX12 |
| - radv/ci: add expected list of failures for GFX1200 |
| - radeonsi: fix programming DCC for SDMA on GFX12 |
| - radv: use stage instead of entrypoint to determine valid gfx stages |
| - docs: add a note about GFX12 (RDNA4) on RADV |
| - ac,radeonsi: add SDMA DCC tiling for GFX12+ |
| - ac/descriptors: allow to configure DCC for buffer descriptors |
| - radv/amdgpu: add support for AMDGPU_GEM_CREATE_GFX12_DCC |
| - radv/meta: add missing pipeline lookups |
| - radv/meta: stop using string keys also for DGC and query objects |
| - util/disk_cache: add a new helper to create a disk cache |
| - vulkan/runtime: allow to use a different disk cache |
| - radv: fix caching on-demand meta shaders |
| - radv: fix adding the BO to cmdbuf list when starting conditional rendering |
| - radv: fix fetching draw vertex data from counter buffers with transform feedback |
| - radv/meta: disable conditional rendering for fill/update buffer operations |
| - radv: fix adding the VRS image BO to the cmdbuf list on GFX11 |
| - ac,radv,radeonsi: add new GFX12_DCC_WRITE_COMPRESS_DISABLE tiling flag |
| - ac/gpu_info: add gfx12_supports_dcc_write_compress_disable |
| - radv: add initial DCC support on GFX12 |
| - radv: fix adding the BO for unaligned SDMA copies to the cmdbuf list |
| |
| Saroj Kumar (1): |
| |
| - ac/surface: fix missing NULL check in gfx12_select_swizzle_mode() |
| |
| Sathishkumar S (1): |
| |
| - radeonsi/vcn: enable roi decode and rgb targets on JPEG_5_0_1 |
| |
| Scott Moreau (1): |
| |
| - dri: Fix hardware cursor for cards without modifier support |
| |
| Serdar Kocdemir (4): |
| |
| - Change C style cast on extension structs |
| - Wrap queue related functions on codegen |
| - The BumpPool of VkStream is not freeAll'ed |
| - gfxstream: add VK_DRIVER_FILES to devenv |
| |
| Sergi Blanch Torne (6): |
| |
| - ci: disable Collabora's farm due to maintenance |
| - Revert "ci: disable Collabora's farm due to maintenance" |
| - ci: disable Collabora's farm due to maintenance |
| - Revert "ci: disable Collabora's farm due to maintenance" |
| - ci: disable Collabora's farm due to unexpected power cut |
| - Revert "ci: disable Collabora's farm due to unexpected power cut" |
| |
| Shashank Sharma (1): |
| |
| - amd: add new AMDGPU_INFO subquery for userqueue metadata |
| |
| Sil Vilerino (26): |
| |
| - vl/vl_winsys: Add missing include for function declaration |
| - u_dynarray.h: Fix warning C4267 conversion from 'size_t' to 'type', possible loss of data |
| - u_math.h: Change power of two assert to fix warning C4146: unary minus operator applied to unsigned type, result still unsigned |
| - src/gallium/auxiliary/util/u_draw.h: Fix C4244 'argument' : conversion from 'type1' to 'type2', possible loss of data |
| - util: Fix warning C4244 'argument' : conversion from 'type1' to 'type2', possible loss of data |
| - src/compiler: Fix warning C4244 'argument' : conversion from 'type1' to 'type2', possible loss of data |
| - src/compiler: Fix warning C4389: An == or != operation involved signed and unsigned variables. This could result in a loss of data. |
| - d3d12: Fix warning C4267 conversion from 'size_t' to 'type', possible loss of data |
| - d3d12: Fix warning C4244 'argument' : conversion from 'type1' to 'type2', possible loss of data |
| - d3d12: Fix warning C4389: An == or != operation involved signed and unsigned variables. This could result in a loss of data. |
| - d3d12: Fix warning C4018 signed/unsigned mismatch |
| - d3d12: Add offset limit check to d3d12_resource_from_memobj |
| - d3d12_bufmgr.cpp: Fix warning C4244 for x86 builds assign uint64_t to size_t |
| - util: cpu_detect.c Fix warning C5274: behavior change: _Alignas no longer applies to the type '<unnamed-tag>' (only applies to declared data objects) |
| - d3d12_video_encoder_bitstream_builder_h264: Fix warning C4244 for x86 builds assign uint64_t to size_t |
| - d3d12_resource: Fix warning C4244 for x86 builds assign uint64_t to uintptr_t |
| - d3d12_video_dec_h264: Fix warning C4244 uint64_t to size_t cast |
| - d3d12_video_dec_vp9.cpp: Fix warning C4244: 'argument': conversion from 'uint64_t' to 'const unsigned int', possible loss of data |
| - d3d12_video_dec_hevc.cpp: Fix warning C4244: 'argument': conversion from 'uint64_t' to 'const unsigned int', possible loss of data |
| - d3d12_video_proc.h/cpp: Fix warning C4244: 'argument': conversion from 'uint64_t' to 'const unsigned int', possible loss of data |
| - d3d12_video_enc_av1.cpp: Fix warning C4244: 'argument': conversion from 'uint64_t' to 'unsigned int', possible loss of data |
| - d3d12_video_enc_h264.cpp: Fix warning C4244: 'argument': conversion from 'uint64_t' to 'unsigned int', possible loss of data |
| - d3d12_video_enc_hevc.cpp: Fix warning C4244: 'argument': conversion from 'uint64_t' to 'unsigned int', possible loss of data |
| - d3d12_video_dec.h/cpp: Fix warning C4244: 'argument': conversion from 'uint64_t' to 'unsigned int', possible loss of data |
| - d3d12_video_enc.h/cpp Fix warning C4244: 'argument': conversion from 'uint64_t' to 'unsigned int', possible loss of data |
| - d3d12: Enable Warnings C4267, C4996, C4146, C4244, C4389, C4838, C4302, C4018 in src/gallium/drivers/d3d12 subtree |
| |
| Simon Perretta (70): |
| |
| - pvr: add initial pco stub/boilerplate |
| - pvr, pco: Add new compiler framework and shader gen stubs |
| - pco: add env debug option parsing |
| - pco: stubs for SPIR-V/NIR compilation options |
| - pvr: connect basic pco functions to the driver |
| - pvr: remove pipeline shader hard-coding support |
| - pvr: add device info and functions for calculating available temps |
| - pvr: add shader compilation stubs |
| - pvr: track pipeline flags |
| - pvr: add device info for additional iterator features |
| - pvr: fix GetInstanceProcAddr ubsan warning when _instance == NULL |
| - pvr: drop PVRX macro |
| - pco: suppress warning for functions passing structs |
| - pco: pygen stubs |
| - pco, pygen: enum emit support, define some enums and op/ref mods/types |
| - pco, pygen: define basic isa field types |
| - pco, pygen: define and emit isa instruction group header variant fields |
| - pco, pygen: isa instruction group header validation and encoding support |
| - pco, pygen: isa lower source definitions |
| - pco, pygen: isa upper sources definitions |
| - pco, pygen: isa internal source selector definitions |
| - pco, pygen: isa destination definitions |
| - pco, pygen: isa main alu ops |
| - pco, pygen: isa backend alu ops |
| - pco, pygen: isa bitwise alu ops |
| - pco, pygen: isa control alu ops |
| - pco, pygen: query bytes required for each variant |
| - pco, pygen: generate op and mod info |
| - pco: define data structures and basic builder implementation with ops |
| - pco: NIR translation and PCO IR pass boilerplate |
| - pco: printing and validation boilerplate |
| - pco, pygen: generate string representations of enum elements |
| - pco: basic instruction printing |
| - pco, pygen: move unnamed tuple structs into classes |
| - pco, pygen: add bitset support for op mods |
| - pco, pygen: common underscore replacement for op names |
| - pco: add verbose printing debug option |
| - pco, pygen: distinguish hw ops that are built directly into instruction groups |
| - pco, pygen: instruction to instruction group mapping, printing |
| - pco: additional ref functions |
| - pco: boilerplate nir lowering passes |
| - pco, pygen: add initial uvsw op boilerplate |
| - pco, pygen: add better exception messages |
| - pco: adjust align padding to be per-function instead of per-shader |
| - pco, pygen: support querying ref mods, if op/ref mods have been set |
| - pco: set up and tear down glsl type singleton with context |
| - pco, pygen: add support for instructions with variable srcs/dests |
| - pco, pygen: re-order some mods to match their evaluation order |
| - pco: print ranges of non-ssa refs with >1 channel, datatypes for immediates |
| - pco, pygen: drop unspecified bit sizes for references |
| - pco, pygen: add defs and mappings for common ops |
| - pco, pygen: restructure igrp alu components into arrays |
| - pco, pygen: amend bitfield assertion messages |
| - pco, pygen: isa ditr op |
| - pco, pygen: isa itrsmp op |
| - pco: initial implementation of translation and passes |
| - pco: add public print wrappers |
| - pco: vector component tracking, vector collation when ingesting NIR |
| - pco: re-indexing debug option and additional vector and component tracking |
| - pco: add mappings and translation for ditr |
| - pco: temporarily add hardcoded vs/fs I/O for testing, BXS-4-64 iteration support |
| - pco: add helpers for overriding ref chans and offsetting vals |
| - pco: vec coalescing improvement to register allocation |
| - pco: add opt subpass for propagating comps referencing hw regs |
| - pco: track the number of bytes encoded for each function |
| - pvr, pco: rewrite compiler/driver interface for vs & fs I/O |
| - pco: modifier propagation optimization, shared opt context boilerplate |
| - pco: initial validation boilerplate and SSA checks |
| - CODEOWNERS: update for new pco compiler tree |
| - pco: fix x86 build |
| |
| Simon Ser (6): |
| |
| - dri: revert INVALID modifier special-casing |
| - llvmpipe: handle llvmpipe_resource_map() errors |
| - dri: don't fetch X11 modifiers if we don't support them |
| - egl/wayland: only supply LINEAR modifier when supported |
| - egl/wayland: fallback to implicit modifiers if advertised by compositor |
| - gbm: fix get_back_bo() failure with gbm_surface and implicit modifiers |
| |
| Sonny Jiang (1): |
| |
| - radeonsi/vcn: Add vcn_5_0_1 support |
| |
| Tapani Pälli (21): |
| |
| - intel/dev: update mesa_defs.json from workaround database |
| - anv: utilize ray query bo per queue for Wa_14022863161 |
| - anv: extend Wa_14017794102 with lineage Wa_14023061436 |
| - isl: modify existing assert by allowing CCS_E aux usage |
| - intel/dev: update mesa_defs.json from workaround database |
| - intel/dev: lower amount of max gs threads for Wa_18040209780 |
| - anv/android: always create 2 graphics and compute capable queues |
| - iris: allow bo cache for compressed bos on verx10 == 200 |
| - drirc/anv: force_vk_vendor=-1 for Marvel Rivals |
| - intel/dev: update mesa_defs.json from internal database |
| - dri: remove GLsync typedef |
| - anv: handle mesh in sbe_primitive_id_override |
| - iris: initialize whole pipe_box struct for memcmp |
| - intel/compiler: take reg_unit size into account with ubo ranges |
| - anv: set dependency between SF_CLIP and CC_PTR states |
| - mesa/st: take pixelmaps into account in drawpixels cache |
| - intel/dev: update mesa_defs.json from internal database |
| - isl: use workaround framework for Wa_1207137018 |
| - mesa: enable GL_EXT_conservative_depth extension |
| - anv: tighten condition for changing barrier layouts |
| - anv: apply cache flushes on pipeline select with gfx20 |
| |
| Thomas H.P. Andersen (2): |
| |
| - drirc/nvk: force_vk_vendor=-1 for Artifact Classic |
| - nvk: follow naming convention for devices |
| |
| Tim Huang (1): |
| |
| - amd: add GFX v11.5.3 support |
| |
| Tim Keller (1): |
| |
| - dril: Check for null config in dril_target.c |
| |
| Timothy Arceri (24): |
| |
| - glsl/nir: fix function cloning at link time |
| - glsl: fix compiler global temp collisions |
| - glsl: tidy up glsl_to_nir() params |
| - glsl: remove unused member |
| - Revert "glsl: Move ForceGLSLAbsSqrt handling to glsl-to-nir." |
| - glsl: remove more now unused params from glsl_to_nir() |
| - glsl: don't copy symbol table to shaders |
| - glsl: drop _mesa_glsl_copy_symbols_from_table() |
| - glsl: use symbol table directly for builtin functions |
| - glsl: drop unused symbol table from gl_shader |
| - glsl: disable function return lowering in glsl ir |
| - glsl: remove return lowering from glsl ir |
| - glsl: drop last remaining lower jump test |
| - glsl: remove now unused ir reader |
| - glsl: move _mesa_glsl_compile_shader() declaration |
| - glsl: remove glsl/program.h |
| - nir: allow loops with unknown induction var initialiser to unroll |
| - glsl: drop unused ir_equals.cpp |
| - glsl: drop unused array refcount code and tests |
| - glsl: drop opt_dead_code_local |
| - glsl: enable layout qualifier if OVR_multiview enabled |
| - glsl: fix num_views validation message |
| - glsl: fix num_views linker error |
| - glsl: fix return value for subgroupBallot() |
| |
| Timur Kristóf (109): |
| |
| - radv: Mark GS copy shaders as internal. |
| - radv: Add ability to dump shaders based on stage. |
| - aco: Separate options for printing IR and recording disassembly. |
| - radv: Separate option to dump NIR. |
| - radv: Separate option to print shader disassembly. |
| - radv: Separate option to dump backend IR. |
| - radv: Refactor RADV_DEBUG=shaders to be a combination of other options. |
| - radv: Slightly reword preoptir debug flag. |
| - radv: Also allow filtering SPIR-V dump per stage. |
| - radv: Set dump flags in a smarter way by default. |
| - amd: Rename GFX1103_R1/R2 to PHOENIX/2 |
| - radv: Add a flush postamble on GFX6. |
| - radv: Don't flush at the end of each command buffer on GFX6. |
| - ac/nir/ngg: Don't emit dead code with dot_op. |
| - ac/nir/ngg: Trade 1 VALU shift for 2 SALU add. |
| - ac/nir/cull: Slightly refactor control flow for small primitive culling. |
| - ac/nir/ngg: Slightly refactor workgroup scan. |
| - ac/nir/ngg: Pass wg_repack_result as pointer instead of returning it. |
| - ac/nir/ngg: Workgroup scan over two bools. |
| - ac/nir/ngg: Implement optional primitive compaction. |
| - ac/nir/ngg: Remove erroneous NUW addition from workgroup scan. |
| - radv: Reorder potentially per-primitive FS builtins. |
| - radv: Slightly simplify potentially per-primitive FS inputs. |
| - radv, aco: Consolidate num_interp + num_prim_interp into num_inputs. |
| - radv: Emit SPI_PS_IN_CONTROL when emitting PS inputs on GFX10.3. |
| - radv: Remove now unused num_prim_interp from shader_info. |
| - radv: Use default 0 for undefined builtin PS inputs. |
| - radv: Only set NGG_DISABLE_PROVOK_REUSE for VS. |
| - ac/nir/ngg: Add ability to store primitive ID as per-primitive. |
| - radv: Reorder FS primitive ID input after layer and viewport. |
| - radv: Configure implicit VS primitive ID to be per-primitive. |
| - ac/nir/ngg: Use ac_nir_prerast_out in mesh shader lowering. |
| - ac/nir/ngg: Simplify updating mesh shader output info. |
| - ac/nir: Pass ac_nir_prerast_out to ac_nir_export_parameters. |
| - ac/nir: Pass ac_nir_prerast_out to ac_nir_export_position. |
| - ac/nir: Introduce ac_nir_store_parameters_to_attr_ring. |
| - ac/nir/ngg: Refactor VS/TES attribute ring stores. |
| - ac/nir/ngg: Refactor GS attribute ring stores. |
| - ac/nir/ngg: Refactor export_pos0_wait_attr_ring. |
| - ac/nir/ngg: Remove dead code for attribute ring stores. |
| - ac/nir/ngg: Move wait attr ring workaround for GS to better place. |
| - ac/nir/ngg: Move emitting GS vertex param exports to if. |
| - ac/nir/ngg: Refactor storing per-primitive primitive ID to attribute ring. |
| - ac/nir: Mark when pre-rast output is used as varying or sysval. |
| - ac/nir: Split GS output usage masks to varying and sysval masks. |
| - ac/nir: Only export positions when they are really system values. |
| - ac/nir: Only export parameters when they are actually varying. |
| - ac/nir: Only store params to attribute ring that are varying. |
| - aco: Update documentation |
| - radv: Add some documentation. |
| - radv: Implement FS layer ID input as a system value. |
| - Revert "nir/opt_varyings: Add workaround for RADV mesh shader multiview." |
| - ac/nir/ngg: Don't mark multiview layer output as varying. |
| - amd: Set lower_layer_fs_input_to_sysval in common code, not in drivers. |
| - radv: Rename layer_input to reads_layer in PS info. |
| - radv: Only print "testing use only" message on GFX12+. |
| - ac/nir: Move ac_nir_lower_bit_size_callback to ac_nir.c |
| - ac/nir: Move ac_nir_get_mem_access_flags to ac_nir.c |
| - ac/nir: Move ac_nir callback functions to ac_nir.c |
| - ac/nir: Move ac_set_nir_options to ac_nir.c |
| - ac: Stop including nir.h in ac_shader_util.h |
| - ac/nir: Rename emit_streamout to ac_nir_emit_legacy_streamout |
| - ac: Move ac_nir_config struct to ac_nir.h |
| - ac/nir: Move ac_nir_create_gs_copy_shader to separate file. |
| - ac/nir: Expose ac_nir_unpack_value in ac_nir_helpers.h |
| - ac/nir: Move ac_nir_lower_intrinsics_to_args to separate file. |
| - ac/nir: Move ac_nir_lower_legacy_vs to separate file. |
| - ac/nir: Move ac_nir_lower_legacy_gs to separate file. |
| - ac/nir: Move ac_nir_gs_shader_query declaration to ac_nir_helpers.h |
| - ac/nir: Move ac_nir_opt_pack_half to separate file. |
| - ac/nir: Move ac_nir_lower_mem_access_bit_sizes to separate file. |
| - ac/nir: Move ac_nir_lower_sin_cos to separate file. |
| - ac/nir: Move pre-rasterization related utilities in separate file. |
| - ac/nir: Rename ac_nir_lower_ngg_ms to ac_nir_lower_ngg_mesh. |
| - ac/nir: Move ac_nir_lower_ngg_mesh to separate file. |
| - ac: Move AC_HS_MSG_VOTE_LDS_BYTES to ac_shader_util.h |
| - ac: Stop including ac_nir.h from ac_shader_util.c |
| - ac/nir: Move all ac_nir_* files to a new folder. |
| - radv: Lower array derefs of vectors outside of shader linking. |
| - ac/nir/ngg: Mitigate NGG fully culled bug when GS output is compile-time zero. |
| - ac/nir/ngg: Mitigate attribute ring wait bug when primitive ID is per-primitive. |
| - aco: Move NGG pos export scheduling determination to drivers. |
| - ac/nir/ngg: Remove some superfluous variables from culling code. |
| - ac/nir/ngg: Add a few comments explaining some variables. |
| - ac/nir/ngg: Remove unused vs_output struct. |
| - ac/nir/ngg: Carve out ac_nir_ngg_alloc_vertices_and_primitives. |
| - ac/nir/ngg: Use ac_nir_ngg_alloc_vertices_and_primitives in mesh shader lowering. |
| - ac/nir/ngg: Carve out ac_nir_create_output_phis. |
| - ac/nir/ngg: Carve out NGG streamout code. |
| - ac/nir/ngg: Carve out ac_nir_repack_invocations_in_workgroup. |
| - ac/nir/ngg: Slightly refactor emitting vertex parameters. |
| - ac/nir/ngg: Add radeon_info to NGG lowering options. |
| - ac/nir/ngg: Add and use a has_attr_ring_wait_bug field to ac_gpu_info. |
| - ac/nir/ngg: Add and use a has_attr_ring field to ac_gpu_info. |
| - ac/nir/ngg: Add and use a has_ngg_fully_culled_bug field to ac_gpu_info. |
| - ac/nir/ngg: Add and use a has_ngg_passthru_no_msg field to ac_gpu_info. |
| - ac/nir/ngg: Use gfx_level from radeon_info. |
| - ac/nir/ngg: Remove gfx_level and family from NGG lowering options. |
| - ac/nir/ngg: Pass radeon_info to mesh shader lowering. |
| - ac/nir/ngg: Use has_attr_ring and has_attr_ring_wait_bug in mesh shader lowering too. |
| - ac/nir/ngg: Rework attribute ring wait workaround in VS/TES. |
| - ac/nir/ngg: Carve out ngg_gs_process_out_primitive. |
| - ac/nir/ngg: Carve out ngg_gs_process_out_vertex. |
| - ac/nir/ngg: Rework GS output code for better attribute ring handling. |
| - ac/nir/ngg: Remove now unused export_pos0_wait_attr_ring. |
| - ac/nir/ngg: Don't call has_input_primitive in GS lowering. |
| - ac/nir/ngg: Move GS lowering to separate file. |
| - radv, radeonsi: Disable early prim export on GFX11+. |
| - ac/nir/ngg: Use SALU to calculate which threads store to attribute ring in GS. |
| |
| Tomeu Vizoso (42): |
| |
| - etnaviv/ml: Fix includes |
| - etnaviv/nn: Fix use of etna_core_info |
| - etnaviv/ci: Add expectation files for the VIPNano-SI+ NPU |
| - etnaviv/ml: Rework the dumping of tensors |
| - etnaviv: Add script to decode weights in Huffman format |
| - etnaviv/ml: Split V7 coefficient encoding to a new file |
| - etnaviv/ml: Add encoding of coefficients for V8 |
| - etnaviv/ml: Fix padding for convolutions in V8 |
| - etnaviv/ml: Implement tiling for V8 |
| - etnaviv/ml: Set two bits in the NN instruction for V8 |
| - etnaviv/ml: Disable caching on V8 |
| - etnaviv/ml: Fix reshuffle TP jobs on V8 |
| - etnaviv/ml: Only reshuffle when needed on V8 |
| - etnaviv/ml: Make use of the new depthwise support in V8 |
| - etnaviv/ci: Update expectations for the NPU in the A311D |
| - etnaviv/ml: Zero out the NN config |
| - etnaviv/ml: Zero all BOs |
| - teflon: Support multiple graph inputs and outputs |
| - etnaviv/ml: Adapt to changes in teflon regarding multiple inputs |
| - etnaviv/ml: Support addition operations on V8 |
| - teflon: Add files mentioned in the docs for image classification |
| - teflon/docs: Update performance measurements on LibreComputer Alta |
| - teflon/docs: Add i.MX8MP to list of supported NPUs |
| - teflon/docs: Clarify smoke test instructions |
| - teflon: Add tests for the YOLOX model |
| - teflon: Support tests with inputs with less than 4 dims |
| - teflon: Rename model tests so they aren't skipped by gtest-runner |
| - teflon: Don't crash when a tensor isn't quantized |
| - teflon/tests: Add support for models with float inputs and outputs |
| - teflon/tests: Also use the cache for models in the test suite |
| - etnaviv/ml: Specify which of the input tensors need transposing. |
| - etnaviv/ml: Fix in_image_slice in transposes when width != height |
| - etnaviv/ml: Take offsets into account in TP operations |
| - teflon: Add support for tensor split and concatenation operations |
| - etnaviv/ml: Add support for tensor split and concatenation operations |
| - teflon: Limit support for Add to two unpopulated tensors |
| - etna/ml: Write out the size of the requested tensor |
| - teflon: Add support for tensor padding operations |
| - etnaviv/ml: Add support for tensor padding operations |
| - teflon: Add support for FullyConnected |
| - teflon: Add tests for FullyConnected |
| - etnaviv/ml: Implement FullyConnected |
| |
| Valentine Burley (99): |
| |
| - amd/ci: Drop x86_64 suffix from job names |
| - amd/ci: Merge and convert Raven piglit testing |
| - amd/ci: Convert LAVA RADV jobs to deqp-runner suites |
| - amd/ci: Increase fraction for radeonsi-raven-piglit |
| - panfrost/ci: Turn redundant GLESCTS-full run into disabled Piglit job |
| - svga/ci: Convert to deqp-runner suite |
| - panfrost/ci: Convert to deqp-runner suite |
| - ci: Drop lava-piglit:(x86_64|arm64) definitions |
| - radv/ci: Convert Valve RADV jobs to deqp-runner suites |
| - turnip/ci: Bump the number of tests per group for a618 |
| - turnip/ci: Bump the number of tests per group for a630 |
| - turnip/ci: Bump the number of tests per group for a660 |
| - turnip/ci: Decrease fraction for a630-vk-asan |
| - turnip/ci: Adjust some timeouts |
| - turnip/ci: Remove a630-vk-asan skip |
| - turnip/ci: Update expectations |
| - freedreno/ci: Drop redundant DEQP_VER |
| - turnip/ci: Only increase hangcheck timer for spilling tests on a630 |
| - lavapipe/ci: Convert lavapipe-vk-asan to a deqp-runner suite |
| - etnaviv/ci: Convert to deqp-runner suites |
| - softpipe/ci: Convert softpipe-asan-gles31 to a deqp-runner suite |
| - radv/ci: Use deqp-vk-main in Raven and Stoney RADV jobs |
| - turnip/ci: Enable ASan leak detection in a630-vk-asan |
| - ci/deqp: Remove non-suite support |
| - llvmpipe/ci: Move Piglit timeout inside the suite |
| - ci/deqp: Simplify conditional arguments |
| - ci/deqp: Add a DEQP_FORCE_ASAN option |
| - llvmpipe/ci: Actually enable ASan testing for llvmpipe-deqp-asan |
| - anv/ci: Fix GPU_VERSION configuration for anv-jsl and anv-jsl-full |
| - anv/ci: Bump the number of tests per group for ADL |
| - anv/ci: Bump the number of tests per group for JSL |
| - anv/ci: Bump the number of tests per group for TGL |
| - anv/ci: Re-enable TGL and JSL manual jobs |
| - anv/ci: Remove fails that are in .gitlab-ci/all-skips.txt |
| - anv/ci: Update expectations |
| - ci/lava: Use CI_JOB_TIMEOUT instead of separate variable |
| - ci/windows: Bump the number of tests per group |
| - ci/windows: Add a manual full job |
| - ci/windows: Update expectations |
| - turnip/ci: Update expectations |
| - ci/windows: Always include windows-msvc in scheduled pipelines |
| - panvk/ci: Move the fractions out of suites |
| - panvk/ci: Bump the number of tests per group for G52 |
| - lavapipe/ci: Bump the number of tests per group |
| - lavapipe/ci: Update expectations |
| - venus/ci: Bump the number of tests per group |
| - venus/ci: Update expectations |
| - angle/ci: Update expectations |
| - zink/ci: Update expectations for ANV |
| - turnip/ci: Document flake |
| - lavapipe/ci: Update expectations |
| - lavapipe/ci: Re-enable lavapipe-vk-asan |
| - ci: Uprev vkd3d-proton to b121e6d746341e0aaba7663e3d85f3194e8e20e1 |
| - virgl/ci: Disable virgl-iris-traces-performance |
| - virgl/ci: Migrate the two iris jobs to 1130g7-volteer |
| - anv/ci: Increase anv-tgl-angle parallelism to 2 |
| - zink/ci: Migrate the two TGL traces jobs to 1130g7-volteer |
| - zink/ci: Increase zink-anv-tgl parallelism to 4 |
| - ci: Add Valentine to the restricted traces access list |
| - freedreno/ci: Update a630-traces-restricted checksums |
| - zink/ci: Skip crashing trace in zink-anv-tgl-traces-restricted |
| - turnip/ci: Decrease the fraction on a660-vk-full |
| - ci: Fix trace update script reading GitLab token from default location |
| - pan/ci: Document some flakes |
| - android/ci: Allow specifying Vulkan driver in cuttlefish-runner.sh |
| - android/ci: Build ANV for Android |
| - freedreno/ci: Update expectations |
| - panfrost/ci: Revert to 6.6 kernel on G57 |
| - amd/ci: Add lava-hp-x360-14a-cb0001xx-zork and use it for VA-API testing |
| - amd/ci: Run full radeonsi-raven-va job pre-merge |
| - freedreno/ci: Update expectations again |
| - turnip/ci: Bump the number of tests per group for a630-vk-asan |
| - anv/ci: Move a test to common anv-skips |
| - ci: Uprev VKCTS to 1.4.1.0 |
| - pan/ci: Properly wire up DRIVER_NAME |
| - panvk/ci: Skip waived tests |
| - ci: Uprev VKCTS to 1.4.1.1 |
| - ci: Skip broken PenumbraOverture trace for zink and freedreno |
| - zink/ci: Update checksum for Osmos trace on TGL |
| - anv/ci: Revert to 6.6 kernel on anv-jsl |
| - iris/ci: Decrease iris-glk-deqp parallelism |
| - panfrost/ci: Move panfrost-g52-piglit to nightly |
| - zink/ci: Increase zink-anv-adl parallelism |
| - turnip/ci: Increase a660-vk fraction |
| - freedreno/ci: Decrease a660-gl parallelism |
| - freedreno/ci: Disable a618-gl, a618-egl, and a618-piglit |
| - turnip/ci: Disable a630-vk |
| - freedreno/ci: Decrease a630-gl parallelism |
| - freedreno/ci: Re-enable some traces on a618 and disable a630-traces |
| - zink/ci: Increase parallelism of zink-tu-a618 |
| - freedreno/ci: Don't automatically retry manual jobs |
| - freedreno/ci: Migrate a618-piglit-full to kingoftown |
| - amd/ci: Migrate amd-raven-skqp from lenovo-zork to hp-zork |
| - anv/ci: Decrease anv-jsl-angle parallelism |
| - virgl/ci: Skip flaky trace |
| - amd/ci: Increase amd-raven-skqp parallelism |
| - freedreno/ci: Document flakes |
| - venus/ci: Skip flaky test due to intermittent timeouts |
| - amd/ci: Revert to 6.6 kernel on Raven |
| |
| Vignesh Raman (6): |
| |
| - ci: Uprev crosvm |
| - ci: Force db410c to host mode |
| - ci: Uprev kernel to 6.13 |
| - ci: update expectation files |
| - ci: export RESULTS_DIR in crosvm-script.sh |
| - ci: use CI_PROJECT_NAME for artifacts name |
| |
| Vinson Lee (4): |
| |
| - hk: Fix hk_ia_update arguments order |
| - vulkan: Add missing va_end |
| - intel/elk: Fix assert with side effect |
| - hk: Fix build error with static_assert |
| |
| Visan, Tiberiu (3): |
| |
| - amd/vpelib: patch to match shader (#456) |
| - amd/vpelib: remove luma offset (#459) |
| - amd/vpelib: fixed file headers for Palamida scan |
| |
| Vldly (1): |
| |
| - freedreno: Fix resource tracking on repeated map with discard |
| |
| Xaver Hugl (1): |
| |
| - vulkan/wsi: unset GAMMA_LUT, CTM and DEGAMMA_LUT when doing a modeset |
| |
| Yinjie Yao (3): |
| |
| - radeonsi/vcn: Indentation fix |
| - radeonsi/vcn: Fix compile warnings with previously uninitialized variables. |
| - radeonsi/vcn: Disable 2pass encode for VCN 5.0. |
| |
| Yiwei Zhang (4): |
| |
| - venus: enable VK_EXT_external_memory_acquire_unmodified if needed |
| - venus: use dedicated allocation for ANB image memory import |
| - venus: fix to handle pipeline flags2 from maint5 |
| - venus: fix maintenance5 props init and create flags2 |
| |
| Yogesh Mohan Marimuthu (25): |
| |
| - amd: update amdgpu_drm.h for new userq ioctl |
| - amd: include amdgpu_drm.h from mesa instead of system for ac_fake_hw_db.h |
| - winsys/amdgpu: add DOORBELL domain to bo |
| - winsys/amdgpu: add CLEAR_VRAM flag to zero vram when creating bo |
| - winsys/amdgpu: add userq helper functions |
| - ac/gpuinfo: add use_userq and AMD_USERQ variable |
| - winsys/amdgpu: call userq init and destroy functions |
| - ac: add new userq signal and wait packet id |
| - ac: add inherit vmid field to indirect buffer packet |
| - winsys/amdgpu: use bo_va_op_raw() function instead of bo_va_op() |
| - winsys/amdgpu: use timeline syncobj for userq vm operations |
| - winsys/amdgpu: destroy bo_fence_lock late in do_winsys_deinit() |
| - winsys/amdgpu: pass job fences to VM ioctl |
| - winsys/amdgpu: wait for vm syncobj before creating userq |
| - winsys/amdgpu: move noop and ib_bytes adjustment to cs_flush |
| - winsys/amdgpu: move legacy chunk init and submission to new function |
| - winsys/amdgpu: add userq cmd submission support in amdgpu_cs_submit_ib() |
| - winsys/amdgpu: don't add fence dependency of other queues for userq |
| - winsys/amdgpu: send hdp flush packet for userq |
| - winsys/amdgpu: keep has_local_buffers true for userq |
| - winsys/amdgpu: use VM_ALWAYS_VALID for all VRAM and GTT allocations |
| - ac/gpu_info: populate fw info using new fw info ioctl for userq |
| - winsys/amdgpu: ring doorbell before calling userq_signal ioctl |
| - winsys/amdgpu: use next_wptr as cache for userq |
| - winsys/amdgpu: ensure strict order in updating mqd wptr and doorbell |
| |
| You, Min-Hsuan (1): |
| |
| - amd/vpelib: fix coverity defects |
| |
| Zan Dobersek (8): |
| |
| - fd/pps: specify counter group for each countable |
| - fd/pps: provide derived counters on a7xx |
| - freedreno/registers: update RB_BLIT_INFO, RB_CCU_CNTL |
| - tu/a7xx: use concurrent resolve groups |
| - tu: ensure completion of generic-clear resolves for color, depth/stencil clears |
| - tu/a7xx: support 8x MSAA |
| - freedreno/registers: fix RBBM_PRIMCTR understanding and usage |
| - freedreno/a7xx: fix fd_lrzfc_layout |
| |
| Zhao, Jiali (1): |
| |
| - amd/vpelib: 420 and 422 Output Single Segment cositing support |
| |
| Zoltán Böszörményi (3): |
| |
| - features.txt: Add Vulkan 1.4 section |
| - docs/features: Mark VK_EXT_host_image_copy as implemented on Turnip |
| - docs/features: Mark more Vulkan 1.4 features as done for drivers |
| |
| duncan.hopkins (9): |
| |
| - glx: change \`#if` guard around \`dri_common.h` to stop missing 'driDestroyConfigs' symbol on MacOS builds. |
| - glx: ignore zink check for has_explicit_modifiers and DRI3 on MacOS. |
| - kopper: Add '#if' guard around \`loader_dri3_get_pixmap_buffer` to stop missing symbol on MacOS. |
| - glx: Guard some of the bind_extensions() code with the same conditions as \`glx_screen`s `frontend_screen` member. |
| - glx: Add back in \`applegl_create_display()` so the OpenGL.framework, on MacOS, pointer get setup. |
| - zink: MoltenVk has conditional VK_DYNAMIC_STATE_VERTEX_INPUT_BINDING_STRIDE support. |
| - zink: Avoid optimalDeviceAccess on MoltenVK when creating depth targets. |
| - zink, kopper: Conditionally add VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT to swap chain imageUsage. |
| - zink: stop zink_set_primitive_emulation_keys producing geometry shaders on platforms that do not support them. |
| |
| liuqiang (2): |
| |
| - lavapipe: Resolved write to pointer after free |
| - d3d10umd: Modify comment |
| |
| nyanmisaka (1): |
| |
| - frontends/vdpau: Get AV1 decode subsampling_x/y |
| |
| sergiuferentz (1): |
| |
| - Use try_unbox in VkDescriptorBufferInfo |