Packet Sizes and Information Leakage
Chris Rapier
rapier at psc.edu
Fri Dec 16 03:12:14 EST 2005
So one of my coworkers is doing a little research on SSH usage in the
wild using netflow data. One of the things he's trying to do is
determine a way to differentiate between data transfers and interactive
sessions. We thought of a couple of ways but we wanted to float them
here and see if there are methods incorporated to defeat this sort of
traffic analysis.
The first idea is to look at the average number of packets per second
over the length of the flow. The idea is that a data transfer would have
a significantly higher number of PPS than an interactive session. If
we analyze a few thousand ssh flows and build a histogram we expect to see
two (or maybe three) peaks corresponding to various connection types. I
think this probably has the best chance of statistically significant
results.
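
A minimal sketch of the packets-per-second idea, assuming each flow
record has already been reduced to a (packets, duration in seconds)
pair. The function names, the record layout, and the choice of
logarithmic bins are illustrative assumptions, not part of any netflow
tool:

```python
import math
from collections import Counter

def packets_per_second(packets, duration_s):
    """Average PPS over the life of a flow, or None for zero-length flows."""
    if duration_s <= 0:
        return None
    return packets / duration_s

def pps_histogram(flows, bins_per_decade=4):
    """Count flows per logarithmic PPS bin.

    flows: iterable of (packets, duration_s) pairs (hypothetical layout).
    Returns a Counter mapping bin index -> number of flows, where the
    bin index is floor(log10(pps) * bins_per_decade).
    """
    hist = Counter()
    for packets, duration_s in flows:
        pps = packets_per_second(packets, duration_s)
        if pps is None or pps <= 0:
            continue  # skip flows with no usable rate
        hist[math.floor(math.log10(pps) * bins_per_decade)] += 1
    return hist

# Made-up example: two slow flows, one fast flow, one zero-duration flow.
flows = [(100, 100.0), (120, 100.0), (50000, 10.0), (10, 0.0)]
hist = pps_histogram(flows)
for bin_index in sorted(hist):
    print(bin_index, hist[bin_index])
```

Log-spaced bins are used here only because interactive and bulk rates
could differ by orders of magnitude; linear bins would work the same way.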
The second method would be to look at the packet size. The idea is
that interactive packets would end up being significantly smaller than
full size data packets. I know that some padding is used to protect
against plaintext attacks according to the RFC but I didn't know if
there was any additional padding on top of that to protect against
traffic analysis. Are interactive packets coalesced or padded to the
known MTU? I'm going to run some tcpdumps but I wanted to ask here as well.
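
For reference, a rough sketch of the smallest size an SSH binary packet
can have for a given payload, based on the packet layout in the SSH
transport specification as I understand it: a 4-byte length, a 1-byte
padding length, the payload, at least 4 bytes of random padding, then
the MAC, with everything before the MAC rounded up to a multiple of the
cipher block size or 8, whichever is larger. The function name and the
example cipher and MAC sizes are assumptions, and this says nothing
about whether a given implementation adds padding beyond the minimum:

```python
def min_ssh_packet_size(payload_len, block_size=8, mac_len=0):
    """Smallest SSH binary packet size for a payload of payload_len bytes.

    Assumes the classic layout: length(4) + padding_length(1) + payload
    + padding + MAC, with at least 4 bytes of padding and the part before
    the MAC aligned to max(block_size, 8).
    """
    block = max(block_size, 8)
    unpadded = 4 + 1 + payload_len
    padding = block - (unpadded % block)
    if padding < 4:
        padding += block  # enforce the 4-byte minimum
    return unpadded + padding + mac_len

# A one-character channel data message is about 10 bytes of payload
# (message type + channel number + string length + 1 data byte).
print(min_ssh_packet_size(10, block_size=16, mac_len=20))
# A 1000-byte payload with the same assumed cipher and MAC sizes.
print(min_ssh_packet_size(1000, block_size=16, mac_len=20))
```

If only minimum padding is applied, a single keystroke stays far below
a full-size data packet, which is the gap the packet size method relies on.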
A third method would be to use packet arrival times but we only have
flow data and putting a packet sniffer on a 10G link is prohibitively
expensive for work like this.
Please note: If there aren't any countermeasures for this type of
traffic analysis I'm not saying that is a problem at all. Knowing a flow
is interactive versus a bulk data transfer really doesn't help out an
attacker all that much. I'm just curious at this time and my coworker
needs the data for a presentation to a center director here.
Thanks for your time!
Chris Rapier
PSC