-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: improve ManagedAuthenticatedEncryptor Decrypt() and Encrypt() flow #59424
base: main
Are you sure you want to change the base?
perf: improve ManagedAuthenticatedEncryptor Decrypt() and Encrypt() flow #59424
Conversation
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/KeyManagement/KeyManagementOptions.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Outdated
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs
Show resolved
Hide resolved
src/DataProtection/DataProtection/src/SP800_108/ManagedSP800_108_CTR_HMACSHA512.cs
Outdated
Show resolved
Hide resolved
Code review notes
Scenario notesIs the goal to improve the performance of the [real-world?] AntiForgery benchmark or to improve the performance of DataProtection in a standalone benchmark? The PR description (and attached graph) make it sound like improving the performance of the crank-based benchmark is the goal, but no throughput measurement is provided for the changes in this PR. Please provide that graph. It would supply evidence that these changes have real-world impact and aren't just microbenchmark improvements. |
The goal of PR is to bring linux performance closer to windows performance for
DataProtection
scenario. Below is the picture of Antiforgery benchmarks on win vs lin machines.The benchmark I am relying on to show the result numbers is here, which is basically building default
ServiceProvider
, adding DataProtection via.AddDataProtection()
and callingIDataProtector.Protect()
orIDataProtector.Unprotect()
.Results:
Optimization details
I looked into
Unprotect
method forManagedAuthenticatedEncryptor
and spottedMemoryStream
usage and multipleBuffer.BlockCopy
usages. Also I saw that there is some shuffling ofbyte[]
data, which I think can be skipped and performed in such a way, that some allocations are skipped.In order to be as safe as possible, I created a separate
DataProtectionPool
which provides API to rent and return byte arrays. It is not intersecting withArrayPool<byte>.Shared
.ManagedSP800_108_CTR_HMACSHA512.DeriveKeys
is changed to explicit usageManagedSP800_108_CTR_HMACSHA512.DeriveKeysHMACSHA512
, because _kdkPrfFactory is anyway hardcoded to useHMACSHA512
. There is a static API allowing to hash without allocating kdkbyte[]
which is rented from the buffer:HMACSHA512.TryHashData(kdk, prfInput, prfOutput, out _);
Avoided usage of
DeriveKeysWithContextHeader
which allocates a separate intermediate array forcontextHeader
andcontext
. Instead passing the spansoperationSubkey
andvalidationSubkey
directly intoManagedSP800_108_CTR_HMACSHA512.DeriveKeys
ManagedSP800_108_CTR_HMACSHA512.DeriveKeysHMACSHA512
had 2 more arrays (prfInput
andprfOutput
), which now I am renting (viaDataProtectionPool
) or evenstackalloc
'ing. They are returned to the pool withclearArray: true
flag to make sure key material is removed from the memory after usage.In
Decrypt()
flow I am again using HashAlgorithm.TryComputeHash overload, which works based on theSpan<byte>
types, compared to previously used HashAlgorithm.ComputeHashIn
Decrypt()
flow changed usage to SymmetricAlgorithm.DecryptCbc() instead of CryptoTransform.TransformBlock() with same idea to useSpan<byte>
API instead of anotherbyte[]
allocation.Encrypt()
flow is reusing №1, №2 and №3 optimizations as wellEncrypt()
before was relying on theMemoryStream
andCryptoStream
to write data in the result buffer, but I am pre-calculating the length, and then doing a single allocation of result array:var outputArray = new byte[keyModifierLength + ivLength + cipherTextLength + macLength];
All required data is copied into the outputArray via APIs supportingSpan<byte>
.All listed optimizations are included in the
net10
TFM, but only some (№ 2, №3 and №6) are used innetstandard2.0
andnetFx
TFMs which DataProtection also targets.Related to #59287