SSync
A Java and Haskell implementation of the rsync algorithm.
Note that this is not an implementation of rsync itself! The data it produces is not compatible with either rsync or librsync. It implements only the signature-generation, delta-analysis, and patch-application steps described in the paper linked above.
Java
To compute the signature of a file, use SignatureComputer.compute or a SignatureComputer.SignatureFileInputStream. To create a patch, read the generated signature data into a SignatureTable and pass it, together with an input stream, to PatchComputer.compute or a PatchComputer.PatchComputerInputStream. Finally, pass the patch together with a BlockFinder to PatchApplier.apply or a PatchApplier.PatchInputStream to generate the new file.
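The three stages can be illustrated with a toy, self-contained sketch. It does not use the SSync classes and does not reproduce SSync's signature or patch formats; the names RsyncSketch, Op, and BLOCK are hypothetical, the block size is deliberately tiny, MD5 is assumed as the strong hash, and the weak checksum is recomputed at every offset rather than rolled, which a real implementation would avoid.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of SSync.
final class RsyncSketch {
    static final int BLOCK = 4; // deliberately tiny block size for the demo

    // One patch instruction: copy a block of the old data, or emit literal bytes.
    static final class Op {
        final int blockIndex; // index of the old block to copy, or -1 for a literal
        final byte[] literal; // literal bytes, or null for a copy
        Op(int blockIndex, byte[] literal) { this.blockIndex = blockIndex; this.literal = literal; }
    }

    // Cheap Adler-style checksum, used to reject most non-matching windows quickly.
    static int weak(byte[] d, int off, int len) {
        int a = 0, b = 0;
        for (int i = 0; i < len; i++) {
            a = (a + (d[off + i] & 0xff)) & 0xffff;
            b = (b + a) & 0xffff;
        }
        return (b << 16) | a;
    }

    // Strong hash (MD5 assumed here), used to confirm a weak-checksum hit.
    static String strong(byte[] d, int off, int len) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(d, off, len);
            return Base64.getEncoder().encodeToString(md.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stage 1: signature of the old data, as weak checksum -> (strong hash -> block index).
    static Map<Integer, Map<String, Integer>> signature(byte[] old) {
        Map<Integer, Map<String, Integer>> sig = new HashMap<>();
        for (int off = 0, idx = 0; off < old.length; off += BLOCK, idx++) {
            int len = Math.min(BLOCK, old.length - off);
            sig.computeIfAbsent(weak(old, off, len), k -> new HashMap<>())
               .put(strong(old, off, len), idx);
        }
        return sig;
    }

    // Stage 2: scan the new data; emit a copy where a window matches an old block,
    // otherwise accumulate one literal byte and slide the window forward by one.
    static List<Op> delta(Map<Integer, Map<String, Integer>> sig, byte[] fresh) {
        List<Op> ops = new ArrayList<>();
        ByteArrayOutputStream lit = new ByteArrayOutputStream();
        int i = 0;
        while (i < fresh.length) {
            int len = Math.min(BLOCK, fresh.length - i);
            Map<String, Integer> candidates = sig.get(weak(fresh, i, len));
            Integer block = candidates == null ? null : candidates.get(strong(fresh, i, len));
            if (block != null) {
                flush(lit, ops);
                ops.add(new Op(block, null));
                i += len;
            } else {
                lit.write(fresh[i]);
                i++;
            }
        }
        flush(lit, ops);
        return ops;
    }

    // Turn any pending literal bytes into a single literal instruction.
    private static void flush(ByteArrayOutputStream lit, List<Op> ops) {
        if (lit.size() > 0) {
            ops.add(new Op(-1, lit.toByteArray()));
            lit.reset();
        }
    }

    // Stage 3: rebuild the new data from the old data plus the patch instructions.
    static byte[] patch(byte[] old, List<Op> ops) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Op op : ops) {
            if (op.blockIndex >= 0) {
                int off = op.blockIndex * BLOCK;
                out.write(old, off, Math.min(BLOCK, old.length - off));
            } else {
                out.write(op.literal, 0, op.literal.length);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] old = "the quick brown fox jumps".getBytes(StandardCharsets.UTF_8);
        byte[] fresh = "the quick red fox jumps over".getBytes(StandardCharsets.UTF_8);
        List<Op> ops = delta(signature(old), fresh);
        System.out.println(new String(patch(old, ops), StandardCharsets.UTF_8));
    }
}
```

In this sketch the signature plays the role that the SignatureTable plays above, the list of Op values stands in for the patch, and the lookup of old blocks by index is roughly what a BlockFinder would be responsible for; how SSync actually encodes each of these is not shown here.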
The use of these classes is demonstrated in the class com.socrata.ssync.SSync.
Haskell
The SSync library uses conduit for streaming data.
The produceSignatureTable conduit will digest a byte-stream into a signature file, which can itself be read into a SignatureTable value via consumeSignatureTable. If the signature table is malformed, consumeSignatureTable will throw a SignatureTableException. The patchComputer conduit can combine the signature table with a stream of bytes to produce a patch file. Finally, the patchApplier conduit can combine the patch file with the data from the file being patched to produce the target.
The use of these functions is demonstrated in the code for the ssync executable.
The ssync library (but not the executable) is compatible with GHCJS. Note that GHCJS is currently a moving target; ssync has been built with the version at commit 100fa6d67. When using GHCJS, the only HashAlgorithm available is MD5.
The binary-equivalence-test.sh file contains tests that ensure the Java and Haskell versions produce exactly the same output for the same input.