commit | adf8d168af9334a8bf940824fcf4207d01e05ae5 | [log] [tgz] |
---|---|---|
author | bors <bors@rust-lang.org> | Sun Sep 08 13:54:02 2024 +0000 |
committer | bors <bors@rust-lang.org> | Sun Sep 08 13:54:02 2024 +0000 |
tree | 554dcbfba6db92d503399e3122ecfa788a968902 | |
parent | 7b18b3eb6d45efe45ecbe0b6043a1a93cef4a166 [diff] | |
parent | 6b4ff514d99511354ed206c2544b821b6986bf78 [diff] |
Auto merge of #130002 - orlp:better-div-floor-ceil, r=thomcc better implementation of signed div_floor/ceil Tracking issue for signed `div_floor`/`div_ceil`: https://github.com/rust-lang/rust/issues/88581. This PR improves the implementation of those two functions by adding a better branchless algorithm. Side-by-side comparison of `i32::div_floor` on x86-64: ```asm div_floor_new: div_floor_old: push rax push rax test esi, esi test esi, esi je .LBB0_3 je .LBB1_6 mov eax, esi mov eax, esi not eax not eax lea ecx, [rdi - 2147483648] lea ecx, [rdi - 2147483648] or ecx, eax or ecx, eax je .LBB0_2 je .LBB1_7 mov eax, edi mov eax, edi cdq cdq idiv esi idiv esi xor esi, edi test edx, edx sar esi, 31 setg cl test edx, edx test esi, esi cmove esi, edx sets dil add eax, esi test dil, cl pop rcx jne .LBB1_4 ret test edx, edx .LBB0_3: setns cl lea rdi, [rip + .L__unnamed_1] test esi, esi call qword ptr [rip + panic...] setle dl .LBB0_2: or dl, cl lea rdi, [rip + .L__unnamed_1] jne .LBB1_5 call qword ptr [rip + panic...] .LBB1_4: dec eax .LBB1_5: pop rcx ret .LBB1_6: lea rdi, [rip + .L__unnamed_2] call qword ptr [rip + panic...] .LBB1_7: lea rdi, [rip + .L__unnamed_2] call qword ptr [rip + panic...] ``` And on Aarch64: ```asm _div_floor_new: _div_floor_old: stp x29, x30, [sp, #-16]! stp x29, x30, [sp, #-16]! mov x29, sp mov x29, sp cbz w1, LBB0_4 cbz w1, LBB1_9 mov w8, #-2147483648 mov x8, x0 cmp w0, w8 mov w9, #-2147483648 b.ne LBB0_3 cmp w0, w9 cmn w1, #1 b.ne LBB1_3 b.eq LBB0_5 cmn w1, #1 LBB0_3: b.eq LBB1_10 sdiv w8, w0, w1 LBB1_3: msub w9, w8, w1, w0 sdiv w0, w8, w1 eor w10, w1, w0 msub w8, w0, w1, w8 asr w10, w10, #31 tbz w1, #31, LBB1_5 cmp w9, #0 cmp w8, #0 csel w9, wzr, w10, eq b.gt LBB1_7 add w0, w9, w8 LBB1_5: ldp x29, x30, [sp], #16 cmp w1, #1 ret b.lt LBB1_8 LBB0_4: tbz w8, #31, LBB1_8 adrp x0, l___unnamed_1@PAGE LBB1_7: add x0, x0, l___unnamed_1@PAGEOFF sub w0, w0, #1 bl panic... LBB1_8: LBB0_5: ldp x29, x30, [sp], #16 adrp x0, l___unnamed_1@PAGE ret add x0, x0, l___unnamed_1@PAGEOFF LBB1_9: bl panic... adrp x0, l___unnamed_2@PAGE add x0, x0, l___unnamed_2@PAGEOFF bl panic... LBB1_10: adrp x0, l___unnamed_2@PAGE add x0, x0, l___unnamed_2@PAGEOFF bl panic... ```
Website | Getting started | Learn | Documentation | Contributing
This is the main source code repository for Rust. It contains the compiler, standard library, and documentation.
Performance: Fast and memory-efficient, suitable for critical services, embedded devices, and easily integrate with other languages.
Reliability: Our rich type system and ownership model ensure memory and thread safety, reducing bugs at compile-time.
Productivity: Comprehensive documentation, a compiler committed to providing great diagnostics, and advanced tooling including package manager and build tool (Cargo), auto-formatter (rustfmt), linter (Clippy) and editor support (rust-analyzer).
Read “Installation” from The Book.
If you really want to install from source (though this is not recommended), see INSTALL.md.
See https://www.rust-lang.org/community for a list of chat platforms and forums.
See CONTRIBUTING.md.
Rust is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.
See LICENSE-APACHE, LICENSE-MIT, and COPYRIGHT for details.
The Rust Foundation owns and protects the Rust and Cargo trademarks and logos (the “Rust Trademarks”).
If you want to use these names or brands, please read the media guide.
Third-party logos may be subject to third-party copyrights and trademarks. See Licenses for details.