Correlation Attacks

Posts: 1
Joined: Fri Nov 11, 2016 8:15 am

Correlation Attacks

Post by PrivateMailP » Fri Nov 11, 2016 8:23 am

Hi Guys,

Just joined / paying member, loving this VPN service, keep up the amazing work!

Random question time:

Just thinking about how easy / feasible it is to do a correlation attack against VPN traffic.

If it's possible, then presumably the attacker can guess what data you've requested, since they can see that x bytes were requested from a server and that x bytes (+/- known encryption overhead) were delivered to the client.

If this is doable / done, could we pseudo-randomly pad the data once it hits the VPN exit on its way back to the client?

As in: I request 1 MB of data from x; when it reaches the VPN, the VPN / server / magic pads it with anywhere from 1 KB right up to 1 MB of extra data, then sends it back to the client; the VPN client discards the additional data and processes the remaining "de-padded" data.
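
A minimal Python sketch of that pad / de-pad idea, for illustration only. OpenVPN has no such option as far as this thread establishes; the names `pad`, `depad`, and `HEADER`, and the 4-byte length prefix, are assumptions made up for this example. In a real design the padding would have to be added before the tunnel encrypts the data, so that the filler is indistinguishable from payload on the wire.

```python
import secrets
import struct

# Hypothetical framing: a 4-byte big-endian field holding the real payload length.
HEADER = struct.Struct("!I")

def pad(payload: bytes, min_pad: int = 1024, max_pad: int = 1024 * 1024) -> bytes:
    """Prefix the payload with its true length, then append random filler."""
    # Pick a filler length uniformly in [min_pad, max_pad] from a CSPRNG.
    pad_len = min_pad + secrets.randbelow(max_pad - min_pad + 1)
    return HEADER.pack(len(payload)) + payload + secrets.token_bytes(pad_len)

def depad(blob: bytes) -> bytes:
    """Recover the original payload and throw the filler away."""
    if len(blob) < HEADER.size:
        raise ValueError("blob too short to contain a length header")
    (real_len,) = HEADER.unpack_from(blob)
    body = blob[HEADER.size:HEADER.size + real_len]
    if len(body) != real_len:
        raise ValueError("truncated payload")
    return body

if __name__ == "__main__":
    padded = pad(b"hello")
    print(len(padded), depad(padded))
```

Note that the padded size is still an upper bound on the real size, so this blurs the size signal rather than removing it.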

Just thinking out loud: is this a good / possible way to prevent this kind of attack?


Site Admin
Posts: 495
Joined: Thu Jan 01, 1970 5:00 am

Re: Correlation Attacks

Post by df » Sat Nov 12, 2016 7:17 pm

If an attacker can hack into a server (such as a website) that the target/client is known to access, then the attacker would be able to see any data the client sends to that server, provided the attacker already knows that the target is a Cryptostorm member and which exit node the target likes to use. (If the website gets very little traffic, the attacker wouldn't even need to know that, since looking at all the received data from all visitors would be feasible.)

In a scenario like that, additional padding on the packets between the client and the VPN server wouldn't help. SSL/HTTPS on that website also wouldn't matter if the attacker had enough access on the web server to read the private SSL keys, or to listen directly to the syscalls made by the web server (like with `strace`). Either would let the attacker read the client's SSL/HTTPS traffic, and inject something like a browser exploit into the HTTP response before it gets encrypted and sent out.

If the attacker didn't have direct access to the web server in that scenario, but did have enough access to one of the hops (routers) between our exit node and the website, then SSL/HTTPS would help because the attacker wouldn't be able to decrypt/manipulate the traffic passing through. That's why, even when using a VPN, it's a good idea to also use SSL/HTTPS when connecting to anything. Once the client's packets get decrypted by the VPN exit node and are sent out to the internet, plaintext protocols are still vulnerable to the usual MiTM attacks from anyone who's able to access a hop between the source (exit node) and destination (website, or whatever).

Actually, the OpenVPN protocol already includes some padding according to ... tyOverview -
The encrypted packet is formatted as follows:

HMAC(explicit IV, encrypted envelope)
Explicit IV
Encrypted Envelope

The plaintext of the encrypted envelope is formatted as follows:

64 bit sequence number
payload data, i.e. IP packet or Ethernet frame
The HMAC and explicit IV are outside of the encrypted envelope.

The per-packet IV is randomized using a nonce-based PRNG that is initially seeded from the OpenSSL RAND_bytes function.

HMAC, encryption, and decryption functions are provided by the OpenSSL EVP interface and allow the user to select an arbitrary cipher, key size, and message digest for HMAC. Blowfish is the default cipher and SHA1 is the default message digest. The OpenSSL EVP interface handles padding to an even multiple of block size using PKCS#5 padding. CBC-mode cipher usage is encouraged but not required.
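
To make the quoted layout concrete, here is a rough Python sketch of how the pieces fit together. It is not OpenVPN's code: the function names are made up for illustration, and the encryption step is left out because Python's standard library has no Blowfish or AES. `encrypted_envelope` is simply assumed to be the ciphertext, which in CBC mode has the same length as the padded plaintext.

```python
import hashlib
import hmac
import struct

BLOCK = 8  # Blowfish block size in bytes (AES would use 16)

def pkcs_pad(data: bytes, block: int = BLOCK) -> bytes:
    """PKCS#5-style padding: append n bytes, each of value n, with 1 <= n <= block."""
    n = block - (len(data) % block)
    return data + bytes([n]) * n

def pkcs_unpad(data: bytes) -> bytes:
    """Strip PKCS#5-style padding, rejecting malformed input."""
    if not data:
        raise ValueError("empty input")
    n = data[-1]
    if not 1 <= n <= len(data) or data[-n:] != bytes([n]) * n:
        raise ValueError("bad padding")
    return data[:-n]

def envelope_plaintext(seq: int, payload: bytes) -> bytes:
    """64-bit sequence number followed by the payload, padded to a block multiple."""
    return pkcs_pad(struct.pack("!Q", seq) + payload)

def wire_packet(hmac_key: bytes, iv: bytes, encrypted_envelope: bytes) -> bytes:
    """HMAC(explicit IV, encrypted envelope) || explicit IV || encrypted envelope."""
    tag = hmac.new(hmac_key, iv + encrypted_envelope, hashlib.sha1).digest()
    return tag + iv + encrypted_envelope

if __name__ == "__main__":
    env = envelope_plaintext(1, b"x" * 100)
    print(len(env))  # 8-byte sequence number + 100-byte payload + 4 padding bytes
```

The takeaway for this thread is that block padding hides at most one block's worth of length, so the size of each encrypted packet still tracks the size of the payload closely.
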

So introducing additional pseudo-random padding data to OpenVPN packets could make it easier to identify a client's traffic if the pseudo-random part of the padding wasn't implemented with enough random data. Actually, even if enough random data was used, an attacker still might be able to identify your traffic if the attacker knew that you were doing this additional/unusual padding on your traffic.

Those scenarios, though, would require the attacker to have enough access to your ISP's routers, or to any hop between you and us, to sniff your encrypted traffic, which in turn would require them to already know your real IP or at least your ISP. They still wouldn't be able to decrypt your traffic. But in the case where they've hacked your ISP and don't yet know which IP is yours, simply knowing that you're doing this padding thing would allow them to separate your traffic from everyone else's, which would give them your real IP. They could then try to hack you directly, or they could hack your ISP's gateway router and try to manipulate any non-encrypted traffic that you might generate if you switch off the VPN.

With that kind of access they could also simply DoS you by blocking all outgoing VPN/SSL/HTTPS traffic until you get so frustrated that you switch to a plaintext protocol, which would allow them to inject something malicious that could be used to gain access to your system.

I guess padding might fool some simpler DPI firewalls, but even that's unlikely since padding wouldn't change the initial OpenVPN handshake, which is what most DPI rules look for when trying to detect OpenVPN traffic.
Completely obfuscating the packet would be a better method of bypassing those types of firewalls, such as with obfsproxy (which we have running on one server at the moment), or by hiding the VPN traffic in SSL traffic with stunnel (which we might implement soon).

In order for any of those attacks to work, the attacker has to first know that the client is going to visit a particular website, and which exit node the client is going to be using (for a busy website, it would also help to have an estimated time when the client will access the website). That's why OPSEC is important. A VPN won't help much if you're creating chatter on a personal Facebook account tied to your real identity, then switching off the VPN and accessing the same account from your real IP. If a hacker or agency/LEO is able to get access logs from Facebook, they would be able to see that you came in from a Cryptostorm exit node and from another IP that probably looks residential and is probably your real IP. An example is the recent KickassTorrents case where the admin was identified because he was logging into the "official" KAT Facebook account from his real IP, as well as some other things. In that case, even if he was always connected to a VPN it still wouldn't help because some of the things he was logging into (iTunes, an Apple email account, and KAT's Facebook) were registered under his real identity.

Another tool that makes correlation attacks less likely is our Voodoo nodes. With Voodoo, the attacker would need access to the destination web server as well as your ISP's traffic, in order to cross-reference the timestamps of your outgoing traffic to the Voodoo entry node against the destination's incoming traffic from the Voodoo exit node and determine which real IP is yours. That might not be possible if the destination receives a lot of traffic, and it shouldn't be possible anyway, since the attacker wouldn't know your ISP unless you did something silly like talking about your ISP on a public IRC channel, or connecting to something directly both on and off the VPN.
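
For a feel of what that timestamp cross-referencing looks like, here is a toy Python sketch. Everything in it is an assumption for illustration: the function names, the 0.5 second delay window, and the sample data (the IPs are documentation addresses). A real attacker would also have to deal with jitter, batching, and far noisier data.

```python
from bisect import bisect_left

def match_fraction(dest_times, client_times, max_delay=0.5):
    """Fraction of arrivals at the destination that follow a packet
    from this client by at most max_delay seconds."""
    if not dest_times:
        return 0.0
    client_times = sorted(client_times)
    hits = 0
    for t in dest_times:
        # First client packet sent no earlier than (t - max_delay).
        i = bisect_left(client_times, t - max_delay)
        if i < len(client_times) and client_times[i] <= t:
            hits += 1
    return hits / len(dest_times)

def rank_candidates(dest_times, clients, max_delay=0.5):
    """Sort candidate real IPs by how well their send times line up."""
    scored = ((match_fraction(dest_times, ts, max_delay), ip)
              for ip, ts in clients.items())
    return sorted(scored, reverse=True)

if __name__ == "__main__":
    # Seen at the destination, coming from the exit node.
    dest_times = [10.2, 20.3, 30.1]
    # Seen at the ISP, going to the entry node, keyed by subscriber IP.
    clients = {
        "198.51.100.7": [10.0, 20.0, 30.0],
        "198.51.100.9": [5.0, 20.1, 44.0],
    }
    for score, ip in rank_candidates(dest_times, clients):
        print(ip, round(score, 2))
```

The sketch needs both inputs at once, the destination-side timestamps and the ISP-side timestamps, which is exactly the double access described above.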

Hope this information helps answer your questions :)

Posts: 1
Joined: Sun Nov 13, 2016 4:14 pm

Re: Correlation Attacks

Post by goki299 » Sun Nov 13, 2016 4:31 pm

Hi, interesting information in this thread.

How likely would correlation attacks be if you are torrenting?

Let's assume you use a shared VPN server where incoming and outgoing traffic is monitored (for whatever reason). That means the attacker can see the encrypted payload to your client and the connections to the internet/torrent peers.

You download a torrent and the attacker sees "your" IP via the torrent tracker, which is the IP of the VPN server. How likely would correlation attacks be in this scenario, given that you connect to 15+ other torrent peers and there are other users on the shared VPN server?


Site Admin
Posts: 1275
Joined: Wed Feb 05, 2014 3:47 am

Re: Correlation Attacks

Post by parityboy » Tue Nov 15, 2016 11:05 pm


The attacker would need to be sniffing packets "either side" of the exit node, and even then it would be difficult if other users on the exit node are torrenting as well, since the connections from other torrent peers on the public Internet are effectively multiplexed into a single OpenVPN tunnel per VPN user.

However, if the other users on the exit node are merely web browsing it would be much easier, since web browsing is bursty by nature, whereas a torrent is pretty continuous in terms of its traffic pattern.
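
A toy Python sketch of why a continuous pattern stands out among bursty ones. The per-second byte counts, the thresholds, and the names `duty_cycle` and `steady_tunnels` are all invented for illustration; this is not a real traffic classifier.

```python
def duty_cycle(bytes_per_second, threshold=10_000):
    """Fraction of one-second bins carrying at least `threshold` bytes."""
    if not bytes_per_second:
        return 0.0
    busy = sum(1 for b in bytes_per_second if b >= threshold)
    return busy / len(bytes_per_second)

def steady_tunnels(tunnels, min_duty=0.8):
    """Names of tunnels whose traffic looks continuous rather than bursty."""
    return sorted(name for name, series in tunnels.items()
                  if duty_cycle(series) >= min_duty)

if __name__ == "__main__":
    # Encrypted bytes per second seen on each user's tunnel (made-up numbers).
    tunnels = {
        "user_a": [0, 0, 90_000, 0, 0, 0, 85_000, 0, 0, 0],       # browsing
        "user_b": [0, 120_000, 0, 0, 0, 0, 0, 0, 60_000, 0],      # browsing
        "user_c": [510_000, 490_000, 505_000, 495_000, 500_000,
                   510_000, 495_000, 490_000, 505_000, 500_000],  # torrenting
    }
    print(steady_tunnels(tunnels))

    # A second torrenting user leaves two candidates instead of one.
    tunnels["user_d"] = [300_000] * 10
    print(steady_tunnels(tunnels))
```

With only one steady tunnel on the node, the torrent traffic on the Internet side has a single plausible owner; once several users are torrenting, this simple signal no longer separates them.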