Performance Of Xtensor Types Vs. NumPy For Simple Reduction
Solution 1:
wow this is a coincidence! I am working on exactly this speedup!
xtensor's sum is a lazy operation -- and it doesn't use the most performant iteration order for (auto-)vectorization. However, we just added an evaluation_strategy parameter to reductions (and the upcoming accumulations) which allows you to select between immediate and lazy reductions.
Immediate reductions perform the reduction right away (rather than lazily) and can use an iteration order optimized for vectorized reductions.
You can find this feature in this PR: https://github.com/QuantStack/xtensor/pull/550
In my benchmarks this is at least as fast as NumPy, and often faster. I hope to get it merged today.
By the way, please don't hesitate to drop by our gitter channel and post a link to the question; we need to monitor StackOverflow better: https://gitter.im/QuantStack/Lobby