The digest routines use init/update/sum, where sum will automatically
copy the internal state to support calculating running digests.
The XOF routines use init/absorb/squeeze, which behave exactly as stated
on the tin.
Apparently this only happens with clang (or with some particular
clang versions), and only on i386.
Fixes 16970; bug not in any released Tor.
Found by Teor; fix from Yawning.
This probably requires the user to manually set CFLAGS, but should
result in a net gain on 32 bit x86. Enabling SSE2 support would be
possible on x86_64, but will result in slower performance.
Implements feature #16535.
The code was always in our Ed25519 wrappers, so enable it when using
the ed25519-donna backend, and deal with the mocking related
crypto_rand silliness.
Implements feature 16533.
The only reason 16 byte alignment is required is for SSE2 load and
store operations, so only align datastructures to 16 byte boundaries
when building with SSE2 support.
This fixes builds with GCC SSP on platforms that don't have special
case code to do dynamic stack re-alignment (everything not x86/x86_64).
Fixes bug #16666.
This needs to be done to allow for the possibility of removing the
ref10 code at a later date, though it is not performance critical.
When integrated by kludging it into tor, it passes unit tests, and is
twice as fast.
Integrating it the "wrong" way into common/crypto_ed25519.c passes
`make check`, and there appear to be some known answer tests for this,
so I assume I got it right.
Blinding a public key goes from 139.10 usec to 70.78 usec using
ed25519-donna (NB: Turboboost/phase of moon), though the code isn't
critical path, so supporting it is mostly done for completeness.
Integrate ed25519-donna into the build process, and provide an
interface that matches the `ref10` code. Apart from the blinding and
Curve25519 key conversion, this functions as a drop-in replacement for
ref10 (verified by modifying crypto_ed25519.c).
Tests pass, and the benchmarks claim it is quite a bit faster, however
actually using the code requires additional integration work.
The compiler is allowed to assume that a "uint64_t *" is aligned
correctly, and will inline a version of memcpy that acts as such.
Use "uint8_t *", so the compiler does the right thing.
This is meant to prevent memory corruption bugs from doing
unspeakable infinite-loop-like things to the hashtables. Addresses
ticket 11737. We should disable these if they turn out to be
expensive.
Silence clang warnings under --enable-expensive-hardening, including:
+ implicit truncation of 64 bit values to 32 bit;
+ const char assignment to self;
+ tautological compare; and
+ additional parentheses around equality tests. (gcc uses these to
silence assignment, so clang warns when they're present in an
equality test. But we need to use extra parentheses in macros to
isolate them from other code).
This helps us avoid undefined behavior. It's based on a patch from teor,
except that I wrote a perl script to regenerate the patch:
#!/usr/bin/perl -p -w -i
BEGIN { %vartypes = (); }
if (/^[{}]/) {
%vartypes = ();
}
if (/^ *crypto_int(\d+) +([a-zA-Z_][_a-zA-Z0-9]*)/) {
$vartypes{$2} = $1;
} elsif (/^ *(?:signed +)char +([a-zA-Z_][_a-zA-Z0-9]*)/) {
$vartypes{$1} = '8';
}
# This fixes at most one shift per line. But that's all the code does.
if (/([a-zA-Z_][a-zA-Z_0-9]*) *<< *(\d+)/) {
$v = $1;
if (exists $vartypes{$v}) {
s/$v *<< *(\d+)/SHL$vartypes{$v}($v,$1)/;
}
}
# remove extra parenthesis
s/\(SHL64\((.*)\)\)/SHL64\($1\)/;
s/\(SHL32\((.*)\)\)/SHL32\($1\)/;
s/\(SHL8\((.*)\)\)/SHL8\($1\)/;
There are some loops of the form
for (i=1;i<1;++i) ...
And of course, if the loop index is initialized to 1, it will never
be less than 1, and the loop body will never be executed. This
upsets coverity.
Patch fixes CID 1221543 and 1221542
When size_t is the most memory you can have, make sure that things
referring to real parts of memory are size_t, not uint64_t or off_t.
But not on any released Tor.
This implementation allows somebody to add a blinding factor to a
secret key, and a corresponding blinding factor to the public key.
Robert Ransom came up with this idea, I believe. Nick Hopper proved a
scheme like this secure. The bugs are my own.
For proposal 228, we need to cross-certify our identity with our
curve25519 key, so that we can prove at descriptor-generation time
that we own that key. But how can we sign something with a key that
is only for doing Diffie-Hellman? By converting it to the
corresponding ed25519 point.
See the ALL-CAPS warning in the documentation. According to djb
(IIUC), it is safe to use these keys in the ways that ntor and prop228
are using them, but it might not be safe if we start providing crazy
oracle access.
(Unit tests included. What kind of a monster do you take me for?)
This is another case where DJB likes sticking the whole signature
prepended to the message, and I don't think that's the hottest idea.
The unit tests still pass.
This reduces the likelihood that I have made any exploitable errors
in the encoding/decoding.
This commit also imports the trunnel runtime source into Tor.
This fixes bug 13102 (not on any released Tor) where using the
standard SSIZE_MAX name broke mingw64, and we didn't realize.
I did this with
perl -i -pe 's/SIZE_T_MAX/SIZE_MAX/' src/*/*.[ch] src/*/*/*.[ch]
We're calling mallocfn() and reallocfn() in the HT_GENERATE macro
with the result of a product. But that makes any sane analyzer
worry about overflow.
This patch keeps HT_GENERATE having its old semantics, since we
aren't the only project using ht.h. Instead, define a HT_GENERATE2
that takes a reallocarrayfn.
We might use libsodium or ed25519-donna later on, but for now, let's
see whether this is fast enough. We should use it in all cases when
performance doesn't matter.
The fix for bug 8746 added a hashtable instance that never actually
invoked HT_FIND. This caused a warning, since we didn't mark HT_FIND
as okay-not-to-use.
scan-build recognizes that in theory there could be a numeric overflow
here.
This can't numeric overflow can't trigger IRL, since in order to fill a
hash table with more than P=402653189 buckets with a reasonable load
factor of 0.5, we'd first have P/2 malloced objects to put in it--- and
each of those would have to take take at least sizeof(void*) worth of
malloc overhead plus sizeof(void*) content, which would run you out of
address space anyway on a 32-bit system.
In digestmap_set/get benchmarks, doing unaligned access on x86
doesn't save more than a percent or so in the fast case. In the
slow case (where we cross a cache line), it could be pretty
expensive. It also makes ubsan unhappy.