octave-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Octave-bug-tracker] [bug #38628] bsxfun slow for complex


From: Arun Giridhar
Subject: [Octave-bug-tracker] [bug #38628] bsxfun slow for complex
Date: Sat, 17 Sep 2022 07:03:31 -0400 (EDT)

Follow-up Comment #12, bug #38628 (project octave):

I started work on this today.

Digging through bsxfun.cc, I found that the slow code path happens only if the
following conditions are all met:
1. A and B are different types.
2. A and B are different sizes.
3. Neither A nor B is empty in any dimension.

If any of the conditions is not met, there's a faster code path available.

In the slow code path, there's a 150-line for-loop with a large if-else tree
inside it, which is done for each element of the result.

The first big question is whether that whole loop can be eliminated and
replaced with a call to broadcasting. (I don't know that yet and would like to
hear your thoughts on the viability of that approach.)

If we do need that big loop, the next question is whether we can do all the
type-checking up front and then end up with a bunch of different tight loops
with no decisions inside, one of which will be chosen based on the type.

I instrumented the code with some cout statements to trace the code paths. I
also added comments for myself as I work on this.

The hg diff is attached.

Test based on comment #10, packed in one line to ease command line recall:

>> Nc = 8; F = complex (randn(7,11189), randn(7,11189)); tic; G1 =
bsxfun(@times, eye(Nc, Nc), shiftdim(F, -2)); toc, tic; Gr = bsxfun(@times,
eye(Nc, Nc), shiftdim(real(F), -2)); Gi = bsxfun(@times, eye(Nc, Nc),
shiftdim(imag(F), -2)); G2 = complex(Gr, Gi); toc, assert (all (G1(:) ==
G2(:)))


Output that was used to deduce the above rules about which code path is used
where:

maybe_optimized_builtin with name = times
        retval.numel == 0
Entering special:
        dva.numel == 64
        dvb.numel == 78323
Code path 3
Elapsed time is 2.46602 seconds.
maybe_optimized_builtin with name = times
        retval.numel == 5012672
maybe_optimized_builtin with name = times
        retval.numel == 5012672
Elapsed time is 0.068238 seconds.



(file #53708)

    _______________________________________________________

Additional Item Attachment:

File name: bsxfun.commented.patch         Size:3 KB
    <https://file.savannah.gnu.org/file/bsxfun.commented.patch?file_id=53708>



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?38628>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]