WIP: implement sret_union ABI for pointer-ful types #55045

Draft
@topolarity (Member) commented Jul 5, 2024

This effectively expands our existing union ABI to cover both of these existing cases:

  • sret ABI (which can stack-allocate a single pointer-ful type)
  • union ABI (which can stack-allocate many pointer-free types)
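As a rough illustration of the two cases, the snippet below classifies the types used in the example further down. It is only a sketch: `isbitstype` is public API, but `Base.isbitsunion` is an unexported internal helper, and using it as a proxy for "eligible for the existing union ABI" is an assumption, not something codegen is documented to guarantee.

```julia
# Sketch: which of the example's types are pointer-free, and which are not.
# Assumption: Base.isbitsunion (internal, unexported) approximates the
# "many pointer-free types" condition of the existing union ABI.
wrapped = Some{Vector{Any}}   # immutable wrapper holding a reference to a Vector

pointer_free_single = isbitstype(Nothing)                       # no fields, no pointers
pointerful_single   = !isbitstype(wrapped)                      # contains a pointer
pointer_free_union  = Base.isbitsunion(Union{Nothing, Int})     # every member is isbits
pointerful_union    = !Base.isbitsunion(Union{Nothing, wrapped}) # one member holds a pointer

println("Nothing is pointer-free:                   ", pointer_free_single)
println("Some{Vector{Any}} is pointer-ful:          ", pointerful_single)
println("Union{Nothing, Int} is a bits union:       ", pointer_free_union)
println("Union{Nothing, Some{...}} is NOT a bits union: ", pointerful_union)
```

`Union{Nothing, Some{Vector{Any}}}` is the kind of return type that falls outside both existing cases, which is the gap this PR is aiming at.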

This provides some nice speed-ups for temporary "wrappers":

```julia
const v = Any[]
@noinline maybe_wrapped(i) = (i % 32 != 0) ? Some(v) : nothing
function foo()
    count = 0
    for i = 1:1_000_000
        count += (maybe_wrapped(i) !== nothing) ? 1 : 0
    end
    return count
end
```

On this PR this gives:

```julia
julia> @btime foo()
  1.675 ms (0 allocations: 0 bytes)
968750
```

compared to current master:

```julia
julia> @btime foo()
  6.877 ms (968750 allocations: 14.78 MiB)
968750
```
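For anyone wanting to reproduce the comparison without BenchmarkTools, a minimal sketch using only the standard library might look like the following. The variable names `bytes` and `ir` are illustrative, and the number of bytes reported depends on which build of Julia runs it, so no expected value is shown.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

const v = Any[]
@noinline maybe_wrapped(i) = (i % 32 != 0) ? Some(v) : nothing

function foo()
    count = 0
    for i = 1:1_000_000
        count += (maybe_wrapped(i) !== nothing) ? 1 : 0
    end
    return count
end

foo()                      # warm up so compilation is not measured
bytes = @allocated foo()   # heap bytes allocated by one call
println("bytes allocated by foo(): ", bytes)

# Dump the LLVM IR of the callee to see how the Union return value is passed.
io = IOBuffer()
code_llvm(io, maybe_wrapped, (Int,))
ir = String(take!(io))
println(ir)
```

If the return value is being boxed, the IR for `maybe_wrapped` should contain a GC allocation call; if it is returned through a caller-provided stack slot, it should not.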

TODO:

The biggest outstanding TODO here is how to handle ϕ-nodes. Right now, if the incoming Union{...} type includes a pointer-containing type, this change forces the incoming object to be boxed, even if the object at run time is actually pointer-free.

But that's just a band-aid so the code works - it introduces new boxes where we didn't have them before, which is a regression that almost certainly needs to be fixed before landing this.

@JeffBezanson added the `performance` and `compiler:codegen` labels Jul 5, 2024

This is a combination of the existing:
 - `sret`  ABI (which can stack-allocate a _single_ pointerful type)
 - `union` ABI (which can stack-allocate many _pointer-free_ types)

This provides some nice speed-ups for temporary "wrappers":
```julia
const v = Any[]
@noinline maybe_wrapped(i) = (i % 32 != 0) ? Some(v) : nothing
function foo()
    count = 0
    for i = 1:1_000_000
        count += (maybe_wrapped(i) !== nothing) ? 1 : 0
    end
    return count
end
```

On this PR this gives:
```julia
julia> @btime foo()
  1.675 ms (0 allocations: 0 bytes)
968750
```

compared to current master:
```julia
julia> @btime foo()
  6.877 ms (968750 allocations: 14.78 MiB)
968750
```

The most outstanding TODO here is what to do about PHI nodes. Right now,
if the incoming `Union{...}` type has a pointer-containing type then the
object is forced to be boxed, even if the object at run-time is actually
pointer-free.

But that's just a band-aid - it means we introduce new boxes where we
didn't have them before, which is a regression that almost certainly needs
to be fixed before landing this.

Co-authored-by: Gabriel Baraldi <[email protected]>