ImageIO Benchmarks #21

IanButterworth · 2021-02-28T01:20:53Z

Some benchmarks for FileIO save and load functions with the different Image IO backends, log x axis, because ImageMagick can be a lot slower.

All defaults, no kwargs.

This is with
FileIO JuliaIO/FileIO.jl#290
ImageMagick v0.7.6
QuartzImageIO v0.7.3
ImageIO v0.5.1
TiffImages v0.2.2
PNGFiles v0.3.6

Benchmark code

import Pkg
Pkg.develop(path=joinpath(@__DIR__,"..","FileIO.jl"))
Pkg.develop(path=joinpath(@__DIR__,"..","TiffImages.jl"))
Pkg.add.(["ImageCore", "BenchmarkTools", "DataFrames", "CSV"])

using FileIO, ImageCore, BenchmarkTools, DataFrames, CSV

res = DataFrame(backend=String[], file_fmt=String[], img_size=NTuple{2,Int}[], img_eltype=Type[],
                    save_first=Union{Missing,Float64}[], save=Float64[], load_first=Union{Missing,Float64}[], load=Float64[])
tmp, _ = mktemp()
for backend in ["ImageMagick", "QuartzImageIO", "ImageIO"]
    Pkg.add(backend)
    @info backend
    for ext in ["tiff","png"]
        fpath = string(tmp, ".", ext)
        for typ in [Gray{N0f8}, RGB{N0f8}]
            for n in 1:3
                println("$typ $(10^n)x$(10^n) ===")
                img = rand(typ, 10^n, 10^n)
                backend = backend
                file_fmt = ext
                img_size = size(img)
                img_eltype = typ
                save_first = if n == 1
                    @elapsed FileIO.save(fpath, img)
                else
                    missing
                end
                b = @benchmark FileIO.save($fpath, $img)
                save = median(b).time

                load_first = if n == 1
                    @elapsed img2 = FileIO.load(fpath)
                else
                    missing
                end
                b = @benchmark FileIO.load($fpath)
                load = median(b).time
                push!(res, (backend, file_fmt, img_size, img_eltype, save_first, save, load_first, load))
            end
        end
    end
    Pkg.rm(backend)
end
CSV.write("results.csv", res)

cc. @tlnagy @timholy @Drvi

The text was updated successfully, but these errors were encountered:

timholy · 2021-03-01T17:11:54Z

That's awesome.

Just this morning I discovered via profiling that the biggest contributor to ImageMagick's slowness for small files is extracting the pixel depth, which is a single ccall. Crazy. There might of course be a way around that, but I'm not in a rush; having an implementation that works and we can change is so important, and the benchmarks above are really exciting!

For the smallest images, JuliaIO/FileIO.jl#295 will improve matters even further.

It's noteworthy that for TIFF, 10x10 and 100x100 are almost the same speed. That might merit some investigation, eventually.

johnnychen94 · 2022-02-03T23:19:05Z

With my recent JpegTurbo.jl development, I noticed that only benchmarking with randomly generated images can be quite misleading; many image compression tricks work only when there exist overlaps between meaningful blocks and patches. Thus I would suggest adding more test images of the same size and plotting the median result of those samples when we regenerate the graphs.

IanButterworth · 2022-02-03T23:24:04Z

Absolutely. Note that PNGFiles.jl now has automated CI benchmarking set up JuliaIO/PNGFiles.jl#52

i.e. see the report here that asserted there was no performance change in this PR JuliaIO/PNGFiles.jl#51 (comment)

But that is currently using random images, and @Drvi already suggested they should be replaced with test images.

Perhaps we should set the same thing up for ImageIO with TestImages vs. each backend

johnnychen94 · 2022-02-03T23:57:10Z

We just need to add a few high-resolution test images to TestImages.jl... Among those widely used test image dataset, I know there's DIV2k but it's licensed for academic purposes only, do you have any suggestions on where we can find such test images?

IanButterworth · 2022-02-04T00:19:19Z

NASA?

tlnagy · 2022-02-04T01:24:22Z

It's been a goal of mine for awhile to add automated CI benchmarking to TiffImages (ref tlnagy/TiffImages.jl#53) but I'm super busy for the foreseeable future. Does it make more sense for the benchmarks to live at the individual package level or here in ImageIO? Maybe both?

But I agree with @johnnychen94 that it makes sense to use real images in addition to randomly-generated ones.

johnnychen94 · 2022-02-04T02:04:07Z

Performance benchmarks are used for two purposes: 1) test against other similar packages, which can usually be written in other languages, and 2) regression test

For benchmark CI such as JuliaIO/PNGFiles.jl#52, it is used to track if PRs/releases are slowing things down.

For benchmark scripts like this issue, JuliaIO/JpegTurbo.jl#15, and the one @timholy created in https://github.com/JuliaImages/image_benchmarks, it's used for advertising purposes to convince people that we're doing great stuff. Also to prepare for JuliaImages 1.0, we definitely need such benchmarks.

Does it make more sense for the benchmarks to live at the individual package level or here in ImageIO? Maybe both?

Unless we move all packages into one gigantic monorepo, benchmark CIs for regression tests should still be put together with the source codes.

On the other hand, I prefer to have the "benchmark against other frameworks" codes stay in one repo as @timholy already did. I haven't yet committed to https://github.com/JuliaImages/image_benchmarks because the codes there are not very extensible/flexible in the sense that it's not always easy to switch on/off certain cases. Thus if we keep adding more benchmark cases there, we'll soon reach a status that it takes too long to get the result of interest. This is quite similar to the DemoCards I made for https://juliaimages.org/stable/examples/; that we can easily create an ad-hoc version of benchmark/demo scripts that works at first, but it's always a pain to convince/guide others to contribute benchmark/demo cases using the ad-hoc undocumented framework.

Some discussion on this can be found in JuliaImages/Images.jl#947 and I also have a very draft experiment in johnnychen94/Workflows.jl#1, but I certainly don't have enough time to finish it... Maybe we can propose this as this year's GSoC project by updating https://julialang.org/jsoc/gsoc/images/?

timholy · 2022-02-04T09:00:07Z

I'm supportive of changes to the architecture of image_benchmarks. That said, in the long run I expect that image_benchmarks will have a similar fate as Julia's own "microbenchmarks" (repo: https://github.com/JuliaLang/Microbenchmarks): people want them, lots of folks who have different favorite image-processing suites will request that we compare their favorite framework, but nobody wants to maintain them. Building many different languages' suites on a single machine is a major pain in the neck, and I have delayed doing this precisely because it's no fun. But for long-term growth it's important in our current phase. (I don't really expect to keep them going for 10 years, though; realistically I might imagine maintaining them for a couple of years.)

Consequently, anything that you want to live "forever" and be primarily focused on within-Julia performance I would put elsewhere. I'm happy to rename that repo if that would help, e.g., cross-suite-benchmarks or something.

johnnychen94 mentioned this issue Apr 30, 2021

slow JPEG loading JuliaImages/Images.jl#960

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ImageIO Benchmarks #21

ImageIO Benchmarks #21

IanButterworth commented Feb 28, 2021

timholy commented Mar 1, 2021

johnnychen94 commented Feb 3, 2022 •

edited

Loading

IanButterworth commented Feb 3, 2022

johnnychen94 commented Feb 3, 2022

IanButterworth commented Feb 4, 2022

tlnagy commented Feb 4, 2022

johnnychen94 commented Feb 4, 2022 •

edited

Loading

timholy commented Feb 4, 2022 •

edited

Loading

ImageIO Benchmarks #21

ImageIO Benchmarks #21

Comments

IanButterworth commented Feb 28, 2021

timholy commented Mar 1, 2021

johnnychen94 commented Feb 3, 2022 • edited Loading

IanButterworth commented Feb 3, 2022

johnnychen94 commented Feb 3, 2022

IanButterworth commented Feb 4, 2022

tlnagy commented Feb 4, 2022

johnnychen94 commented Feb 4, 2022 • edited Loading

timholy commented Feb 4, 2022 • edited Loading

johnnychen94 commented Feb 3, 2022 •

edited

Loading

johnnychen94 commented Feb 4, 2022 •

edited

Loading

timholy commented Feb 4, 2022 •

edited

Loading