New version is up to 3x faster, even better performance coming soon #204
johannesvollmer announced in Announcements
Performance Improvements
Great news! Version 1.5.3 brings 3x better performance on RLE compressed images and 1.5x better performance on DEFLATE compressed images on the benchmarks.
This has been achieved chiefly by leveraging autovectorization as well as the superscalar nature of today's CPUs in interleaving and prefix sum computation. DEFLATE also benefited from switching to a faster decompressor implementation. Thanks to @Shnatsel for profiling the code and contributing these improvements! Also, thank you for all the other support, Shnatsel!
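To make "interleaving and prefix sum computation" more concrete, here is a minimal sketch of what these two steps can look like when undoing the byte-level preprocessing that OpenEXR applies before RLE and DEFLATE compression. This is not the library's actual code: the function names `prefix_sum_in_place` and `interleave_halves` are hypothetical, and the scalar loops below only illustrate the operations, not the optimized versions from the release.

```rust
/// Undo delta encoding in place. Each byte after the first stores the
/// difference to its predecessor, offset by 128, so the original value
/// is recovered with a running (prefix) sum. Arithmetic wraps around.
fn prefix_sum_in_place(bytes: &mut [u8]) {
    let mut previous = match bytes.first() {
        Some(&first) => first,
        None => return, // nothing to do for an empty buffer
    };
    for byte in bytes.iter_mut().skip(1) {
        previous = previous.wrapping_add(*byte).wrapping_sub(128);
        *byte = previous;
    }
}

/// Interleave the two halves of a buffer:
/// [a0 a1 a2 | b0 b1 b2] becomes [a0 b0 a1 b1 a2 b2].
/// For an odd length, the first half holds the extra byte.
fn interleave_halves(separated: &[u8]) -> Vec<u8> {
    let mid = (separated.len() + 1) / 2;
    let (first, second) = separated.split_at(mid);
    let mut interleaved = Vec::with_capacity(separated.len());
    for (a, b) in first.iter().zip(second.iter()) {
        interleaved.push(*a);
        interleaved.push(*b);
    }
    if first.len() > second.len() {
        // Odd length: append the unpaired last byte of the first half.
        interleaved.push(first[first.len() - 1]);
    }
    interleaved
}
```

The prefix sum is inherently sequential, because every output byte depends on the one before it. Simple loops like the interleaving one, by contrast, are the kind of code a compiler can often vectorize automatically.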
More Coming
Even more performance work is already underway. Here's what you can look forward to in subsequent releases:
Currently, the bottleneck when opening images is the automatic channel type conversion. What is that? For example, if your project needs f16 pixels but the file contains f32 pixels, those values will be automatically converted by the library. This will stop being a performance issue once the native CPU instructions to convert between f16 and f32 types are exposed in stable Rust, which is slated for Rust 1.69. Format conversion using native instructions is already prototyped and shows great performance on nightly.
There's also ongoing experimentation with a fully parallel decoding pipeline, with even reading the file being multi-threaded.
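To illustrate why the channel type conversion described above is costly without native instructions, here is a minimal sketch of a software conversion from 16-bit to 32-bit floats using only the standard library. It is not taken from the library: the names `f16_bits_to_f32` and `convert_samples` are hypothetical, and the f16 values are represented by their raw `u16` bit patterns.

```rust
/// Convert the raw bits of an IEEE 754 half-precision float to an f32.
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = ((bits & 0x8000) as u32) << 16;
    let exponent = ((bits >> 10) & 0x1F) as u32;
    let mantissa = (bits & 0x03FF) as u32;

    let f32_bits = match (exponent, mantissa) {
        // Signed zero.
        (0, 0) => sign,
        // Subnormal half: the value is mantissa * 2^-24, which f32
        // represents exactly as a normal number.
        (0, m) => sign | (m as f32 * (1.0 / 16_777_216.0)).to_bits(),
        // Infinity (m == 0) or NaN (m != 0): keep the payload bits.
        (0x1F, m) => sign | 0x7F80_0000 | (m << 13),
        // Normal number: rebias the exponent from 15 to 127 and widen
        // the mantissa from 10 to 23 bits.
        (e, m) => sign | ((e + 112) << 23) | (m << 13),
    };
    f32::from_bits(f32_bits)
}

/// Convert a whole slice of samples. Each sample goes through the
/// branches above, where a native instruction would need none.
fn convert_samples(source: &[u16]) -> Vec<f32> {
    source.iter().map(|&bits| f16_bits_to_f32(bits)).collect()
}
```

For example, `convert_samples(&[0x3C00, 0xC000])` yields `[1.0, -2.0]`. A hardware conversion instruction replaces all of this bit manipulation with a single operation, often on several values at once.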
Help Wanted
We're looking for contributors to help implement the DWA lossy compression format, as well as deep data, which allows you to store multiple colors per pixel. Here's what you need to know for DWA compression.