Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lua Scripting Allocation, Performance, and Correctness Improvements #882

Merged
merged 54 commits into from
Dec 18, 2024

Conversation

kevin-montrose
Copy link
Contributor

@kevin-montrose kevin-montrose commented Dec 14, 2024

This is a big one, but kinda had to be.

At a high level this:

  • Removes NLua, switching to KeraLua directly
  • Manually PInvokes in numerous places to avoid encoding issues and allocations
    • I don't have a Linux VM handy, so these may need some tweaking - getting towards my EOD so I'm getting this up now, but this is a known gap
  • Removes allocations from all the .NET-to-Lua and Lua-to-.NET mappings
  • Directly writes responses to networking streams instead of double buffering them
  • Adds numerous tests for matching Redis behavior
  • Adds numerous benchmarks to cover the 'calls into Garnet' cases, as existing ones always returned null
  • Adds benchmarks to cover some of the nitty-gritty details of LuaRunner
  • Better matches Redis behavior

Lower level notes:

  • Lua stack growth is manually managed, quite a lot of work went into checking correctness here in DEBUG builds
  • Because quite a lot of the overhead here is in the PInvoke transition, some more work is moved into Lua
  • SHA lookups are on the fast path for script execution, so a new key type was added that's a bit gnarly
  • I audited basically all of the Lua methods we're using, and marked some of them as CallConvSuppressGCTransition as they do no allocations or unbounded operations

There's still more to do (offhand we need memory tracking, memory limits, and script timeouts) but this is enough to be merged.

Performance

There are now 4 relevant benchmarks, I picked and merged on top of main (commit f1ce0dafecaba15335e8e0767a6eeb441b72e56c) to run these.

LuaScripts

Existing benchmark, really just covers the "invoke Lua"-case.

Roughly allocation free, and the redis.call case is ~8x faster.

main

Method Params Mean Error StdDev Gen0 Allocated
Script1 None 110.2 ns 0.43 ns 0.36 ns 0.0004 24 B
Script2 None 183.7 ns 2.41 ns 2.26 ns 0.0021 144 B
Script3 None 302.2 ns 1.37 ns 1.29 ns 0.0033 240 B
Script4 None 1,822.0 ns 7.98 ns 7.46 ns 0.0114 776 B

nLuaToKeraLua

Method Params Mean Error StdDev Gen0 Allocated
Script1 None 100.5 ns 0.69 ns 0.61 ns - -
Script2 None 166.4 ns 0.56 ns 0.50 ns 0.0002 24 B
Script3 None 248.6 ns 0.90 ns 0.85 ns 0.0005 32 B
Script4 None 229.3 ns 0.64 ns 0.53 ns - -

LuaScriptCacheOperations

New benchmark, covers the caching of scripts and runners.

30-50% faster for many operations, and big improvements in the "initial load into session"-case of LoadInnerHit.

I played around a bit with improving the hash digest method, but didn't really find any wins there. It's covered by a benchmark now though.

main

Method Params Mean Error StdDev Allocated
LookupHit None 8.441 us 0.8641 us 2.479 us 688 B
LookupMiss None 5.448 us 0.6202 us 1.749 us 688 B
LoadOuterHit None 9.741 us 0.7198 us 2.042 us 64 B
LoadInnerHit None 378.921 us 16.8124 us 47.144 us 11392 B
LoadMiss None 10.830 us 1.0620 us 3.081 us 688 B
Digest None 15.038 us 0.7799 us 2.122 us 688 B

nLuaToKeraLua

Method Params Mean Error StdDev Median Allocated
LookupHit None 3.900 us 0.7896 us 2.266 us 4.000 us 688 B
LookupMiss None 2.603 us 0.5921 us 1.708 us 2.950 us 688 B
LoadOuterHit None 4.268 us 1.0059 us 2.837 us 3.950 us 688 B
LoadInnerHit None 237.461 us 11.7043 us 32.627 us 226.350 us 432 B
LoadMiss None 7.274 us 0.9774 us 2.836 us 7.300 us 688 B
Digest None 14.988 us 0.9382 us 2.600 us 15.100 us 688 B

LuaRunnerOperations

New benchmark, covers various operations on LuaRunner. Notably I put a real-ish UDF in here, with non-trivial logic and two redis.calls.

Basic operations are all faster, and very low or no-allocation now. The most common (resetting parameters before running the script) is ~70x faster. Construction (setting up the initial environment) and compilation are modestly (maybe 30%) faster.

main

Method Params Mean Error StdDev Gen0 Allocated
ResetParametersSmall None 7.586 us 0.1486 us 0.1825 us - 416 B
ResetParametersLarge None 7.992 us 0.0228 us 0.0190 us - 416 B
ConstructSmall None 150.604 us 0.5033 us 0.4708 us - 10136 B
ConstructLarge None 152.187 us 0.7295 us 0.6467 us - 10136 B
CompileForSessionSmall None 2.167 us 0.0125 us 0.0117 us 0.0038 424 B
CompileForSessionLarge None 34.670 us 0.0867 us 0.0724 us - 3488 B

nLuaToKeraLua

Method Params Mean Error StdDev Allocated
ResetParametersSmall None 98.95 ns 0.573 ns 0.508 ns -
ResetParametersLarge None 101.28 ns 1.707 ns 2.990 ns -
ConstructSmall None 96,956.18 ns 791.399 ns 701.555 ns 344 B
ConstructLarge None 97,113.50 ns 454.837 ns 403.201 ns 3408 B
CompileForSessionSmall None 1,691.61 ns 4.783 ns 3.994 ns -
CompileForSessionLarge None 34,023.67 ns 103.126 ns 96.464 ns -

Note the above is is ns not us.

ScriptOperations

Existing benchmark, but expanded with more realistic UDF (the same as in LuaRunnerOperations). As before I'm only reporting the None set, as ACL disables most code and AOF isn't relevant here.

There's a throughput loss in ScriptLoad (~23%) but improvements everywhere else - this is probably due to more being in the initial environment definition. Improvements range from ~15% in the LargeScript case to ~75% in the SmallScript case.

Additionally, allocations are basically removed from all paths except ScriptLoad, where they're reduced.

main

Method Params Mean Error StdDev Gen0 Allocated
ScriptLoad None 64.855 us 0.2211 us 0.1960 us 0.2441 16000 B
ScriptExistsTrue None 21.089 us 0.0906 us 0.0803 us - -
ScriptExistsFalse None 18.875 us 0.0393 us 0.0368 us - -
Eval None 70.382 us 0.5100 us 0.4521 us 0.2441 16000 B
EvalSha None 34.546 us 0.3003 us 0.2809 us 0.1831 12000 B
SmallScript None 247.413 us 2.2619 us 2.1158 us 0.9766 88001 B
LargeScript None 4,736.069 us 28.8228 us 26.9609 us 15.6250 1079203 B
ArrayReturn None 293.738 us 1.7299 us 1.5335 us 2.9297 206400 B

nLuaToKeraLua

Method Params Mean Error StdDev Median Allocated
ScriptLoad None 79.533 us 0.4813 us 0.4267 us 79.527 us 9600 B
ScriptExistsTrue None 17.470 us 0.0342 us 0.0303 us 17.469 us -
ScriptExistsFalse None 16.698 us 0.0778 us 0.0690 us 16.686 us -
Eval None 56.661 us 0.2285 us 0.2138 us 56.648 us -
EvalSha None 22.948 us 0.1270 us 0.1187 us 22.904 us -
SmallScript None 52.158 us 1.0171 us 2.2326 us 51.311 us -
LargeScript None 4,094.942 us 16.7428 us 13.9810 us 4,093.703 us 8 B
ArrayReturn None 107.365 us 0.4634 us 0.3618 us 107.198 us -

Correctness

Various fixes added:

  • String values with invalid utf-8 codepoints now work
  • ok and err special cases match Redis behavior
  • Parameter validation to redis.call matches Redis behavior
  • Lua number coercion is fixed, Redis mostly shoves them into longs
  • Lua boolean coercison is fixed, Redis (weirdly) maps them to 1 and nil
  • Metatables are allowed in scripts, but ignored when mapping returns
  • Nils terminate arrays in RESP returns, as in Redis (this is almost but not quite Lua default behavior)

And probably some more I'm forgetting.

…d to upstream some KeraLua additions; also horribly broken
…pt may not be viable... but was worth experimenting with
…s; fix some ZADD scripting tests; special err response stuff can't be done on the Lua side, remove it
…ing else; avoid copying regular used strings into Lua each time they're needed
@kevin-montrose kevin-montrose marked this pull request as ready for review December 14, 2024 00:08
@badrishc
Copy link
Contributor

Fantastic PR, this is huge for Lua users of Garnet!

libs/server/Lua/LuaRunner.cs Outdated Show resolved Hide resolved
libs/server/Lua/LuaRunner.cs Show resolved Hide resolved
Copy link
Contributor

@TedHartMS TedHartMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent PR!

@kevin-montrose kevin-montrose merged commit b256901 into main Dec 18, 2024
18 checks passed
@kevin-montrose kevin-montrose deleted the nLuaToKeraLua branch December 18, 2024 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants