Adam Ierymenko
|
0c498556d5
Unroll Salsa20 fully for a little more speed (non-SSE now almost as fast as SSE)
|
9 years ago |
Adam Ierymenko
|
160278c489
Little bit of reorg in Salsa20 which seems to speed things up very slightly.
|
9 years ago |
Adam Ierymenko
|
789046ca57
Speed up Salsa20 just a bit.
|
9 years ago |
Adam Ierymenko
|
a297e4a5bf
Add build def ZT_NO_TYPE_PUNNING, which when defined disables type punning code that might cause unaligned access errors on architectures that care (e.g. Android/ARM)
|
9 years ago |
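The commit above describes a compile-time switch between type-punned loads and an alignment-safe path. A minimal sketch of that general pattern follows. The helper name `load_u32_le` is hypothetical and is not taken from the repository. Only the macro name `ZT_NO_TYPE_PUNNING` comes from the commit message, and how the real code uses it is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Normally passed on the compiler command line (-DZT_NO_TYPE_PUNNING).
 * Defined here only so that this sketch takes the portable path. */
#define ZT_NO_TYPE_PUNNING

/* Hypothetical helper: read a 32-bit little-endian word from a byte buffer. */
static uint32_t load_u32_le(const uint8_t *p)
{
    assert(p != NULL);
#ifdef ZT_NO_TYPE_PUNNING
    /* Byte-wise assembly: works at any alignment and on any host byte order. */
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
#else
    /* Type-punned load: a single instruction on x86, but it assumes a
     * little-endian host that tolerates unaligned access, and it can
     * fault on architectures that enforce alignment. */
    return *(const uint32_t *)p;
#endif
}
```

Calling `load_u32_le(buf + 1)` on a deliberately misaligned buffer is the case the portable branch is meant to handle. Modern compilers typically compile the byte-wise form to a plain load where that is safe.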
Adam Ierymenko
|
f19c3c51d3
Revert slow non-SSE Salsa20 modification since it did not fix Android/ARM issue. Also update Salsa20 comments and clean up a bit.
|
9 years ago |
Adam Ierymenko
|
7c9949eea3
For @glimberg -- a *possible* fix to the alignment headaches on Android/ARM. If this works we should find a define that can be used to enable it there since it will slow things down on non-x86 other architectures.
|
9 years ago |
Adam Ierymenko
|
8d2e20ede6
Get rid of __align stuff in Salsa20 -- not portable, does not seem to help much on newer chips.
|
10 years ago |
Adam Ierymenko
|
f2d372545a
Salsa20 SSE Windows build fix -- turns out you can't be as loose with SSE intrinsics in Visual Studio
|
10 years ago |
Adam Ierymenko
|
12692c551e
SSE optimized Salsa20 -- anywhere from 20% to 50% faster than plain C version
|
10 years ago |
Adam Ierymenko
|
8c9b73f67b
Make Salsa20 variable-round, allowing for Salsa20/12 to be used for Packet encrypt and decrypt. Profiling analysis found that Salsa20 encrypt was accounting for a nontrivial percentage of CPU time, so it makes sense to cut this load fundamentally. There are no published attacks against Salsa20/12, and DJB believes 20 rounds to be overkill. This should be more than enough for our needs. Obviously incorporating ASM Salsa20 is among the next steps for performance.
|
11 years ago |
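To illustrate what "variable-round" means in the commit above, here is a sketch of the Salsa20 core with the round count as a parameter, following the published Salsa20 specification. It is not the repository's code: the names `quarter_round` and `salsa20_core` are hypothetical, and this version is written for clarity rather than speed (no unrolling, no SSE).

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return (v << n) | (v >> (32u - n));
}

/* One Salsa20 quarter-round over four state words. */
static void quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
    *b ^= rotl32(*a + *d, 7);
    *c ^= rotl32(*b + *a, 9);
    *d ^= rotl32(*c + *b, 13);
    *a ^= rotl32(*d + *c, 18);
}

/* Salsa20 core with a caller-chosen round count: 20 for Salsa20/20,
 * 12 for Salsa20/12. Each loop iteration is one double-round
 * (a column round followed by a row round), so rounds must be even. */
static void salsa20_core(uint32_t out[16], const uint32_t in[16], unsigned rounds)
{
    uint32_t x[16];
    unsigned i, r;

    assert(rounds % 2 == 0);
    for (i = 0; i < 16; i++)
        x[i] = in[i];

    for (r = 0; r < rounds; r += 2) {
        /* Column round */
        quarter_round(&x[0],  &x[4],  &x[8],  &x[12]);
        quarter_round(&x[5],  &x[9],  &x[13], &x[1]);
        quarter_round(&x[10], &x[14], &x[2],  &x[6]);
        quarter_round(&x[15], &x[3],  &x[7],  &x[11]);
        /* Row round */
        quarter_round(&x[0],  &x[1],  &x[2],  &x[3]);
        quarter_round(&x[5],  &x[6],  &x[7],  &x[4]);
        quarter_round(&x[10], &x[11], &x[8],  &x[9]);
        quarter_round(&x[15], &x[12], &x[13], &x[14]);
    }

    /* Feed-forward: add the original input to the mixed state. */
    for (i = 0; i < 16; i++)
        out[i] = x[i] + in[i];
}
```

With this shape, switching from 20 to 12 rounds removes four of the ten double-rounds per 64-byte block, which is where the CPU saving mentioned in the commit would come from.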
Adam Ierymenko
|
ef3e319c64
Several things:
|
11 years ago |
Adam Ierymenko
|
150850b800
New git repository for release - version 0.2.0 tagged
|
11 years ago |