[PATCH] Improve endian conversion in umac.c
rapier
rapier at psc.edu
Thu Mar 10 07:02:31 AEDT 2022
On 3/8/22 6:12 PM, Darren Tucker wrote:
> On Wed, 9 Mar 2022 at 09:59, rapier <rapier at psc.edu> wrote:
>> I was poking at the MAC routines looking for some efficiencies for high
>> performance environments. I was looking at the umac.c and comparing it
>> to the original source at https://fastcrypto.org/front/umac/umac.c After
>> a couple of false starts I found that reverting the endian conversion
>> routines back to what Krovetz wrote realized an 8% to 16% improvement
>
> Interesting! One obvious difference is what you have is potentially
> inline-able static functions instead of function calls across
> compilation units that (barring whole program optimization) can't be
> inlined. If you put the existing functions from misc.c into umac.c as
> statics do you see the same improvement?
That worked and I saw the same improvement. For a 20GB test (a dd pipe
with aes256-ctr) I'm seeing peaks at 870MB/s versus 720MB/s for stock.
So it does look like it's being inlined. I'm going to poke at a
couple more things and then provide an updated patch. I think I have a
big-endian system around here somewhere, so I want to test on that as well.
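For anyone following along, here is a minimal sketch of the kind of change
being discussed: byte-order helpers defined as static inline in the same
file as their callers, so the compiler can inline them without whole
program optimization. The names load_be32 and store_be32 are hypothetical
and are not the actual functions in misc.c or umac.c; this only
illustrates the idea and is not the patch itself.

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; not the real misc.c or
 * umac.c routines. Declaring them static inline in the same compilation
 * unit as the caller lets the compiler inline them in hot loops.
 */

/* Read a 32-bit big-endian value from a possibly unaligned buffer. */
static inline uint32_t load_be32(const void *vp)
{
    const unsigned char *p = (const unsigned char *)vp;

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value to a buffer in big-endian byte order. */
static inline void store_be32(void *vp, uint32_t v)
{
    unsigned char *p = (unsigned char *)vp;

    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

Because these work byte by byte, they should give the same result on
little-endian and big-endian hosts, which is part of why testing on both
is worthwhile.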
This is pleasing. Initially I was looking at improving performance by
pipelining the MAC, but that's not possible with ETM (encrypt-then-MAC).
This is about the level of performance gain I was hoping to get with
that, and it's a lot easier.
Anyway, I'll get the new patch up soon.
Chris