SSH Compression - Block Deduplication
Matt Olson
molson at atlantis.oceanconsulting.com
Tue Sep 13 05:55:30 EST 2011
Hi Gert,
Let me start by saying I'm not an expert in gzip compression internals.
For others to read along:
http://www.gzip.org/algorithm.txt
(RE: LZ77) A distance of 32KB and match lengths (essentially a variable
block size) of 258 bytes are both quite small when talking about graphics
data. With modern processors and memory, it would be interesting to see
how this performs with a distance of 4MB and a length of 32KB. Those fit
well within modern L2 and L1 caches respectively.
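For context, the 32KB distance is baked into the DEFLATE format itself:
zlib caps windowBits at 15, i.e. a 2^15 = 32KB back-reference window, so
a 4MB distance would mean a different wire format, not just a tuning
knob. Here's a rough toy test of my own (not anything from OpenSSH)
showing what the limit means in practice: a repeated 4KB block compresses
away when its two copies sit 16KB apart, but not when they are 64KB apart.

/* Compare how well zlib compresses a repeated 4KB block when the two
 * copies are 16KB apart (inside the 32KB window) versus 64KB apart
 * (outside it). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>

static uLong compressed_size(const unsigned char *src, uLong len)
{
    uLong bound = compressBound(len);
    unsigned char *dst = malloc(bound);
    uLongf dlen = bound;
    if (compress2(dst, &dlen, src, len, Z_BEST_COMPRESSION) != Z_OK)
        dlen = 0;
    free(dst);
    return dlen;
}

static void trial(size_t gap)
{
    size_t len = 4096 + gap + 4096;
    unsigned char *buf = malloc(len);
    unsigned char block[4096];
    size_t i;

    /* Pseudo-random 4KB block, copied once more after 'gap' bytes of
     * random filler. */
    for (i = 0; i < sizeof(block); i++)
        block[i] = (unsigned char)(rand() & 0xff);
    memcpy(buf, block, sizeof(block));
    for (i = 0; i < gap; i++)
        buf[4096 + i] = (unsigned char)(rand() & 0xff);
    memcpy(buf + 4096 + gap, block, sizeof(block));

    printf("gap %6zu bytes: %lu -> %lu compressed\n",
           gap, (unsigned long)len,
           (unsigned long)compressed_size(buf, len));
    free(buf);
}

int main(void)
{
    srand(1);
    trial(16 * 1024);   /* second copy within the 32KB LZ77 window */
    trial(64 * 1024);   /* second copy beyond it: no back-reference */
    return 0;
}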
Of course the actual distance and length values balance a race between
CPU time and network latency. Example: if it takes 500ms to search the
last 4MB for duplicates on a link with 100ms latency, then you really
haven't gained anything in apparent speed; you have only conserved
bandwidth.
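To put rough numbers on that race (purely hypothetical figures):
compression only improves apparent speed when the CPU time it adds is
smaller than the transfer time it removes.

#include <stdio.h>

int main(void)
{
    /* Hypothetical numbers, purely for illustration. */
    double latency_s  = 0.100;              /* 100 ms latency           */
    double bandwidth  = 100e6 / 8.0;        /* 100 Mbit/s, in bytes/s   */
    double payload    = 4.0 * 1024 * 1024;  /* 4MB of screen data       */
    double ratio      = 0.25;               /* assume 4:1 compression   */
    double cpu_time_s = 0.500;              /* 500 ms to search 4MB     */

    double t_plain = latency_s + payload / bandwidth;
    double t_comp  = latency_s + cpu_time_s + (payload * ratio) / bandwidth;

    printf("uncompressed: %.2f s\n", t_plain);
    printf("compressed:   %.2f s\n", t_comp);
    printf("compression %s apparent speed\n",
           t_comp < t_plain ? "improves" : "hurts");
    return 0;
}

With those made-up numbers the compressed path is slower end to end,
even though it moves a quarter of the bytes.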
WAN accelerator deduplication data dictionaries are much larger and can
cache patterns found across an entire session (or multiple sessions).
However, LZ77 with larger distance and length values does have the speed
advantage of not having to go to main memory or disk. I think 4MB/32KB
would be useful with X11 and would make an interesting test.
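For anyone curious what a per-block dictionary looks like next to gzip's
per-byte-sequence matching, here is a toy sketch (fixed 4KB blocks,
hash-only lookups; a real accelerator would use content-defined chunking
and verify the stored bytes before trusting a hash match):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 4096
#define DICT_SLOTS 65536          /* dictionary of previously seen blocks */

struct dict_entry {
    uint64_t hash;
    int      used;
};

static struct dict_entry dict[DICT_SLOTS];

/* FNV-1a: a simple, well-known non-cryptographic hash. */
static uint64_t fnv1a(const unsigned char *p, size_t n)
{
    uint64_t h = 1469598103934665603ULL;
    size_t i;
    for (i = 0; i < n; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Returns 1 if the block was already in the dictionary (send a short
 * reference), 0 if it is new (send the literal block and remember it).
 * Hash-only: a real implementation would also compare stored bytes to
 * guard against collisions. */
static int dedup_block(const unsigned char *block)
{
    uint64_t h = fnv1a(block, BLOCK_SIZE);
    size_t slot = (size_t)(h % DICT_SLOTS);

    if (dict[slot].used && dict[slot].hash == h)
        return 1;

    dict[slot].hash = h;
    dict[slot].used = 1;
    return 0;
}

int main(void)
{
    unsigned char a[BLOCK_SIZE], b[BLOCK_SIZE];
    memset(a, 0xAA, sizeof(a));
    memset(b, 0x55, sizeof(b));

    printf("block a: %s\n", dedup_block(a) ? "duplicate" : "literal");
    printf("block b: %s\n", dedup_block(b) ? "duplicate" : "literal");
    printf("block a: %s\n", dedup_block(a) ? "duplicate" : "literal");
    return 0;
}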
Matt
On Mon, 12 Sep 2011, Gert Doering wrote:
> Hi,
>
> On Mon, Sep 12, 2011 at 08:26:41AM -0700, Matt Olson wrote:
>> I may look around and see if I can find a library that does another layer
>> of tunneling or a Xorg addon to provide deduplication.
>
> Doesn't gzip compression suit your needs? This already does fairly
> thorough deduplication - not on a "per block level" but on a "per byte
> sequence" level, so much more flexible...
>
> gert
> --
> USENET is *not* the non-clickable part of WWW!
> //www.muc.de/~gert/
> Gert Doering - Munich, Germany gert at greenie.muc.de
> fax: +49-89-35655025 gert at net.informatik.tu-muenchen.de
>