There was a frequently repeated block of code that did "find the next
router line, see whether we've hit the end of the list, get the ID hash
from the line, and enforce well-ordering." Per Ahf's review, I'm
extracting it into its own function.
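
As a rough illustration of the shape of that helper (the names here,
like line_t and find_next_router, are assumptions for the sketch, not
the function actually added in this commit):

    #include <string.h>

    typedef struct line_t {
      const char *s;   /* Pointer into the original input buffer. */
      size_t len;      /* Length of the line, without the newline. */
    } line_t;

    /* Starting at *<b>idx</b>, advance to the next "r " (router) line in
     * <b>lines</b>.  Return 1 if we ran off the end of the list, -1 if the
     * router's ID hash does not sort strictly after *<b>prev_hash</b>, and
     * 0 on success, updating *<b>prev_hash</b> to the new hash.  The caller
     * consumes lines[*idx] and advances *idx before the next call. */
    static int
    find_next_router(const line_t *lines, size_t n_lines, size_t *idx,
                     line_t *prev_hash)
    {
      while (*idx < n_lines &&
             !(lines[*idx].len >= 2 && !memcmp(lines[*idx].s, "r ", 2)))
        ++*idx;
      if (*idx == n_lines)
        return 1;                        /* Hit the end of the list. */

      /* Extract the ID hash field from the "r " line: here assumed to be
       * the field right after the nickname. */
      const line_t *line = &lines[*idx];
      const char *cp = line->s + 2, *end = line->s + line->len;
      while (cp < end && *cp != ' ') ++cp;   /* Skip the nickname. */
      while (cp < end && *cp == ' ') ++cp;
      const char *hash_start = cp;
      while (cp < end && *cp != ' ') ++cp;
      line_t hash = { hash_start, (size_t)(cp - hash_start) };

      /* Enforce well-ordering: hashes must be strictly increasing. */
      if (prev_hash->len) {
        size_t minlen = prev_hash->len < hash.len ? prev_hash->len : hash.len;
        int c = memcmp(prev_hash->s, hash.s, minlen);
        if (c > 0 || (c == 0 && prev_hash->len >= hash.len))
          return -1;
      }
      *prev_hash = hash;
      return 0;
    }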
Previously, we operated on smartlists of NUL-terminated strings,
which required us to copy both inputs to produce the NUL-terminated
strings. Then we copied parts of _those_ inputs to produce an
output smartlist of NUL-terminated strings. And finally, we
concatenated everything into a final resulting string.
This implementation, instead, uses a pointer-and-extent pattern to
represent each line as a pointer into the original inputs and a
length. These line objects are then added by reference into the
output. No bytes are copied from the original strings until the very
end, when we concatenate everything into the final result.
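
A minimal sketch of that pattern, with assumed names (the branch's
actual types and helpers may differ): lines are windows into the
caller's buffers, and the single copy happens in the final join.

    #include <stdlib.h>
    #include <string.h>

    /* A line is a window into one of the original input buffers: no copy,
     * no NUL terminator of its own. */
    typedef struct line_t {
      const char *s;   /* Points into the caller's input. */
      size_t len;      /* Length of the line, not counting the newline. */
    } line_t;

    /* Concatenate <b>n</b> lines (gathered by reference from the inputs)
     * into a freshly allocated NUL-terminated string, re-inserting
     * newlines.  This is the only place where line bytes are copied. */
    static char *
    join_lines(const line_t *lines, size_t n)
    {
      size_t total = 1;                 /* For the final NUL. */
      for (size_t i = 0; i < n; ++i)
        total += lines[i].len + 1;      /* +1 for each newline. */

      char *out = malloc(total);
      if (!out)
        return NULL;
      char *cp = out;
      for (size_t i = 0; i < n; ++i) {
        memcpy(cp, lines[i].s, lines[i].len);
        cp += lines[i].len;
        *cp++ = '\n';
      }
      *cp = '\0';
      return out;
    }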
Bookkeeping structures and newly allocated strings (like ed
commands) are allocated inside a memarea, to avoid needless mallocs
or complicated should-I-free-this-or-not bookkeeping.
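
For illustration, the allocation pattern looks roughly like this,
assuming Tor's memarea API (memarea_new(), memarea_alloc(),
memarea_drop_all()); the ed-command formatting below is made up for the
sketch and is not the branch's actual code.

    #include <stdio.h>
    #include "lib/memarea/memarea.h"   /* "memarea.h" on older trees. */

    /* Format an ed "delete" command such as "5,7d" into storage owned by
     * <b>area</b>.  Nothing here needs an individual free(): dropping the
     * whole area releases every bookkeeping string at once. */
    static const char *
    format_ed_delete(memarea_t *area, int first_line, int last_line)
    {
      /* Enough room for two ints, a comma, 'd', and a NUL. */
      char *cmd = memarea_alloc(area, 32);
      if (first_line == last_line)
        snprintf(cmd, 32, "%dd", first_line);
      else
        snprintf(cmd, 32, "%d,%dd", first_line, last_line);
      return cmd;
    }

    /* Usage: one area per diff/apply operation. */
    static void
    example(void)
    {
      memarea_t *area = memarea_new();
      const char *cmd = format_ed_delete(area, 5, 7);   /* "5,7d" */
      (void)cmd;
      /* ... build the rest of the diff, referencing area-owned data ... */
      memarea_drop_all(area);   /* Frees cmd and everything else at once. */
    }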
In my measurements, this improves CPU performance by something like
18%. The memory savings should be much, much higher.
Also, add very strict split/join functions, and totally forbid
nonempty files that end with something besides a newline. This
change is necessary to ensure that diff/apply are actually reliable
inverse operations.
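
To sketch the strictness rule (names assumed, not the actual
functions): a nonempty input that does not end in '\n' is rejected
outright, so every line gets exactly one newline back on join and
split/join round-trip exactly.

    #include <stddef.h>
    #include <sys/types.h>

    /* Count the newline-terminated lines in <b>body</b>.  Return the line
     * count, or -1 if body is nonempty and does not end with '\n': such
     * inputs are forbidden. */
    static ssize_t
    count_lines_strict(const char *body, size_t len)
    {
      if (len && body[len - 1] != '\n')
        return -1;
      ssize_t n = 0;
      for (size_t i = 0; i < len; ++i)
        if (body[i] == '\n')
          ++n;
      return n;
    }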
The 2-line diff change is needed to make the unit tests actually
test the cases that they thought they were testing.
The bogus free was found while testing those cases.
(This commit was extracted by nickm based on the final outcome of
the project, taking only the changes in the files touched by this
commit from the consdiff_rebased branch. The directory-system
changes are going to get worked on separately.)