Ok, this is a very loosely-structured post as we're asking for member community assistance and despite a few weeks of fairly intensive work in-house, we still don't have things boiled down to a good, strong summary. I've begun work on this post several times, and had other tech team folks contribute both bits of writing and general conceptual guidance to the document... only to throw it out and start from scratch repeatedly.
The problem starts like this:
A few weeks back, members asked us to take a look at results being produced by the website dnsleaktest.com, which is a nice little nonprofit service created by a CS student to help folks check their network sessions for "leaks." Oddly, some members on fully-verified (via other means) cryptostorm sessions were getting unexpected DNS resolvers showing up in the results when they visited that site. Some members... but far from all of them. Even within the same OS, it was hit-and-miss.
This is the kind of issue that tends to lead to important questions, and for a couple of months our support folks have had an ear to the ground for any such reports. When we get them, we ask members to tell us all they can about their local machine, OS, local network, etc. Without a theory to run with, this was a bit of a wild goose chase. Many members have been patient and helpful through this, and we offer our thanks (especially @shrouded, who did extensive testing that led to our current understanding). As the weeks and then a few months wore on, the issue escalated in our tech team; we'd come at it from different angles, frustrated we couldn't really pin down specific theories to test.
Now, I believe we know what's going on... and this post is, in part, a request for folks to help verify our provisional findings. If this holds up, we already have an approach to it that brings things up a notch in terms of session security... but until we get this validated, we've hesitated to start pushing out "solutions" to a problem we don't fully understand.
Let me step back and hit that point squarely:
"DNS leaks" are a hot topic when it comes to "VPN services," and especially to technically unsophisticated reviewers. Every me-too "VPN service" has a "DNS leak blocker" they claim magically solves this issue. Unfortunately, there are a few problems here. The definition of "DNS leak" ends up being entirely opaque, leading one to wonder how one can claim to "solve" a problem that cannot even be described properly! I've written about that issue in a long essay you can find here, if you're curious. Worse, these "DNS leak" prevention tools being peddled by just about every "VPN service" out there are basically junk. This I know because I've sat and analysed enough pcap files via wireshark (another pending post, still in pre-publication edit for now) to see the leaks firsthand, via actual packets leaving actual NICs on actual computers. Those results often don't jibe with the results presented by "DNS leak" detection websites. More on that, below.
Indeed, some of the tools being peddled by "VPN services" are abandonware that legitimate DNS projects have stepped back from and no longer support. So why do they keep getting pushed? One, it looks good to claim a "magical answer" to the problem: the review sites check off that box, never bothering to see if the tools actually work. Marketing hype, basically. Two, some of these tools succeed in making the DNS leak websites show no leak... even though the leaks are still taking place in actual network sessions, as verified by pcaps.
So, we've not pushed out some "solution" to this issue, and won't until we fully understand it. And our research, much of which I've done in bits and pieces over the summer months, keeps running up against anomalies, mutually conflicting explanations, hand-waving, and just plain junk. This has slowed things down, as I'd simply hoped to find folks really smart in this, and ask them what's going on: that's a great approach, and one we prefer around here. There's plenty of smart DNS folks, often generous in their time and expertise, but this "DNS leak" question is so poorly specified and so poorly framed that they generally recoil in disgust when asked about it. And: fair enough. I get it. It's like being asked if internal combustion engines are fast or slow... a vague, poorly-defined, frustrating question.
Back at the core of things:
As I poked at this question, and asked smart folks to help me see why the question itself is so wobbly, I noticed something obvious - in hindsight. We're talking about "DNS leaks" and yet... to "test" them we visit websites. Hold on a second. What does the website do, in general? It asks the client machine: "hey I want to have you look up an IP address for a domain name... what DNS resolver will you use for that?" Which makes sense but also is a bit wtf. Why should a webserver sitting halfway 'round the world be able to ask a client PC to do DNS resolutions for it? That doesn't make any sense at all.
Since I'm slow on the uptake, I tend to do old-fashioned work: I looked at the code that these websites serve up when queried.
They all (that I've seen) load big javascript libraries. Well ok then, that explains a good bit of what's going on.
Javascript - any browser scripting language - has some things it's allowed to ask the browser to tell it about the machine on which the browser is running. This helps the js (or other scripting language) understand what's available resource-wise, so it can optimise what it serves to the browser, from the server. Plus lots of other stuff it can do, some of which is really neat and some of which is fuel both for "browser fingerprinting" attacks and for horrific malware exploits (like last summer's #torsploit, brought to you by the NSA). So the javascript queries the browser about... DNS lookups? Well ok, but that's sort of orthogonal, isn't it?
Yes, it is.
In fact, when I poked at the js itself - and I'm far from a competent js coder, nor do I pretend to be one, thanks - it seemed clear that the ways this question is asked can only be described as hack-ish. Not that saying so is an insult; rather it suggests this is using tweaks to get an answer to a question that wasn't really expected in the design of the tool itself. Which makes sense: why would a webserver be entitled to know what DNS resolvers a client machine is going to use?
Further...
These DNS leak websites, as far as I can tell thus far, don't actually have the client machine do a DNS query and see where it queries. Nope, they ask the client machine - or some bits of the client machine - what DNS resolver it intends to use, and then report that result as a leak. Or not. And this is, in a word, totally fucked when it comes to determining whether there is a "DNS leak" from an on-network client, or not. Think about it for a minute, visually. See?
Once you step back, it's obvious, isn't it? The web server asks the browser to ask the client machine what DNS resolver it will use. The client machine - or some part of it - says "oh I use this/these resolvers for such questions" and hands the answer back to the webserver, via js, over the encrypted tunnel. What the client machine doesn't say anything about is whether it would do this lookup within the tunnel or whether it would "leak" the query plaintext outside the tunnel... this is all a hypothetical answer to a hypothetical question. And there's nothing about the answer that says anything about whether it'd go across the tunnel, in any case. Which, by most relevant definitions, is what a "DNS leak" really means.
Even a little bit of wiresharking shows that the queries that local machines actually do aren't always matched up with what the "DNS leak" websites think the machines are going to do. Sometimes this variance shows a "leak" that isn't happening, and sometimes the websites show no leak when in fact the machine is really leaking plaintext DNS queries. These variances don't show up 100% of the time, and the websites do a useful thing... but they are far from definitive, or really helpful in tracking down what the heck's going on.
Plus they have no idea whether queries go across-tunnel or not, since no actual queries get made during the "tests."
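To make the "across-tunnel or not" distinction concrete, here's a minimal sketch of the check a pcap lets you do and a website can't. It assumes you've already summarised outbound packets from a capture into simple records; the PacketRecord format and the interface names ("tun0", "eth0") are hypothetical, made up for illustration, and it only looks at classic port-53 DNS.

```python
from dataclasses import dataclass
from typing import Iterable, List

DNS_PORT = 53  # classic plaintext DNS; other transports are out of scope here


@dataclass(frozen=True)
class PacketRecord:
    """One outbound packet summarised from a capture (hypothetical format)."""
    interface: str  # interface the packet left on, e.g. "tun0" or "eth0"
    dst_ip: str     # destination address as text
    dst_port: int   # destination port


def leaked_dns_queries(packets: Iterable[PacketRecord],
                       tunnel_interface: str) -> List[PacketRecord]:
    """Return DNS packets that left on any interface other than the tunnel."""
    return [
        p for p in packets
        if p.dst_port == DNS_PORT and p.interface != tunnel_interface
    ]
```

Feed it records from a capture taken on all interfaces and anything it returns is a DNS query that actually left outside the tunnel, regardless of what a test website reported.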
Right ok, then.
Now's where things get weird.
As I've been piecing together these bits of comprehension to a simple question that gets a little non-simple before getting outright pear-shaped (or perhaps cone-shaped?), anomalous results keep popping up that none of us can explain, even so. This goes on for weeks, as we come back to them, research, ask around to colleagues, come back, brainstorm...
Finally one of our member support staffers asks an insightful question: what about IP6?
Aha, there you go. We've written before, here, about the issue of IP6 packet leaks in openvpn-based network frameworks. A known issue, and one we've been keen to tackle once the tools are there to do it properly (long story, separate post). A corresponding issue is, we now know, IP6 DNS resolver information leak... which is sort of the same thing, but also not at all.
Here's an example of what happens:
During cryptostorm session creation, the widget tells the operating system to shift over its DNS resolution requests to the resolvers pushed down from the cryptostorm network. The operating system, in theory, does this and ensures that physical hardware (NICs) also has this new information, and uses it. This isn't always a perfect process, hence conventional "DNS leaks." But there's a secondary hole, too: what about IP6?
Because each NIC, each networking tidbit in fact, likely has its own preferences for what IP6 DNS resolvers to use. Perhaps those come from a local ISP, during DHCP, or perhaps from the local LAN router. Or perhaps from... who really knows? Those settings are rarely something folks put much thought into, as few folks are using IP6 intentionally yet, at the end-user side of things.
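If you want to see whether IP6 resolvers are sitting in your own config, here's a rough sketch that splits "nameserver" entries by address family. It assumes resolv.conf-style text, which is only one place (and only on Unix-like systems) where resolver settings live; per-NIC and browser-level settings won't show up here. The function name is ours, purely illustrative.

```python
import ipaddress
from typing import Dict, List


def split_resolvers_by_family(resolv_conf_text: str) -> Dict[str, List[str]]:
    """Group 'nameserver' entries from resolv.conf-style text into IPv4/IPv6."""
    result: Dict[str, List[str]] = {"ipv4": [], "ipv6": []}
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue  # skips comments, 'search', 'options', blank lines
        # Drop any zone id (e.g. "%eth0") before parsing link-local addresses.
        bare_address = fields[1].split("%", 1)[0]
        try:
            parsed = ipaddress.ip_address(bare_address)
        except ValueError:
            continue  # not a valid IP literal; ignore it
        key = "ipv6" if parsed.version == 6 else "ipv4"
        result[key].append(fields[1])
    return result
```

Any entry landing in the "ipv6" list is a resolver that something on the machine may consult even when the tunnel's IP4 resolvers were pushed correctly.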
Well, then there's this: some browsers (read: Chrome) apparently like IP6 a lot, so they go ahead and use their own IP6-based DNS resolver settings to look up even IP4 DNS questions for matters relating to web browsing resolution of namespace. So even if your OS, and your NIC, and your local router all know nothing of IP6 DNS resolution, it appears (from what the research says; we've not yet tested and worked with this in-house to best understand it) that your web browser might just decide to use IP6 to do DNS lookups. Which... wait a second. These DNS test websites ask the web browser about what DNS servers the machine prefers, right? Right.
Yeah, now you see what's happening, a bit.
Are there "solutions" to this issue? Yes. There's tedious and imprecise ones that might break stuff. There's hypothetical perfect solutions that don't exist and might also have bad side effects. There's also a couple of elegant approaches we've put together, and now - before we test solutions - we're asking for help in validating what we think we're seeing here.
I'd hoped to organise all this into a neat, screenshotted, how-to guide for folks to follow. That's not going to happen. Others on our team are great at that; I'm not. They are working on other high-relevance tasks and can't pull off to make this pretty. So here's some links:
http://ipv6.google.com/
http://test-ipv6.com/
https://dnsleaktest.com
http://ipv6-test.com/
(and in particular:
http://v6.ipv6-test.com/api/myip.php)
...this link randomly shows up in the browser window I've been using:
http://wiki.wireshark.org/OpenVPN
(was there a reason, specifically? Anyhow it's useful info, so pasting it here)
and this is totally unrelated, but also relates to a possible solution, so why not post:
http://code.kryo.se/iodine/
If you can make sense of what I'm trying to communicate in the post above, take a run at these test sites and see what happens. Turn off IP6 locally (this depends on your OS, in the details). Turn off IP6 on your NIC(s). Turn it off on your local router. Try mixes of those turn-offs and see which are relevant, and which oddly conflict. If you're feeling frisky, wireshark network sessions as you turn "6" on and off, at various levels.
Tell your browser not to use IP6 for resolution, especially if it's Chrome. See if that makes a difference in the "DNS leak" results.
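For anyone who wants to be systematic about the mixes of turn-offs, here's a small sketch that just enumerates every on/off combination so none get skipped. The four layer names are our shorthand for the places mentioned above (OS, NIC, router, browser); nothing here touches your actual settings, it only generates a checklist.

```python
from itertools import product
from typing import Dict, Iterator, Sequence

# Shorthand labels for the places where IP6 can be switched on or off.
LAYERS = ("os", "nic", "router", "browser")


def ipv6_test_matrix(layers: Sequence[str] = LAYERS) -> Iterator[Dict[str, bool]]:
    """Yield every combination of IP6 enabled (True) / disabled (False)."""
    for states in product((True, False), repeat=len(layers)):
        yield dict(zip(layers, states))


if __name__ == "__main__":
    for number, combo in enumerate(ipv6_test_matrix(), start=1):
        settings = ", ".join(
            f"{layer}={'on' if enabled else 'off'}"
            for layer, enabled in combo.items()
        )
        print(f"run {number}: {settings}")
```

With four layers that's sixteen runs; noting the test-site result (and ideally a pcap) next to each run number makes reports much easier for us to compare.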
See why this gets weird, and why we need more data before we can deploy a real solution? There's a lot of variables; some, our in-house testing suggests, don't matter at all. A few seem to matter a lot. We're not going to report provisional results, as we don't want to bias the data coming in from the community.
This isn't cause for outright panic. It appears that some substantial chunk of "DNS leaks" reported by some of the more clever websites are really artefacts of IP6 resolution that is theoretically possible on the client machine but is in practice rarely called upon in actual packet traffic coming off NICs. Some is, but not all of it. Are these "false positive" results, then? Not really, not the phrase I'd use. I'd say this is a complex question to answer and that the only real answer comes from sitting on the NIC and seeing what packets come and go: pcaps or STFU, in other words.
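One way to line up what a test site claims against what a pcap shows is a plain set comparison, sketched below. It assumes you've collected two lists of resolver addresses by hand: one copied from the website's results, one pulled from DNS traffic in your capture. The function and key names are illustrative only.

```python
from typing import Dict, Iterable, List


def compare_resolvers(site_reported: Iterable[str],
                      pcap_observed: Iterable[str]) -> Dict[str, List[str]]:
    """Compare resolvers a test site reported with those seen in a capture."""
    reported = set(site_reported)
    observed = set(pcap_observed)
    return {
        # Site listed it, but no packets to it appeared in the capture.
        "reported_not_observed": sorted(reported - observed),
        # Packets went to it, but the site never mentioned it.
        "observed_not_reported": sorted(observed - reported),
        # Both sources agree on these.
        "both": sorted(reported & observed),
    }
```

Entries in "reported_not_observed" are the theoretically-possible resolvers described above; entries in "observed_not_reported" are the ones that should worry you more.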
I suspect all this can be boiled down to a nice, concise explanation. I also suspect the solution to this "problem" is far more elegant than my rambling explanation is thus far. Which are both good things. But there's a long, looping path to get from there - a strange sense that these "DNS leak" websites are telling us something important... just not what they think they're telling us, nor what we first thought we were hearing (that's the "there") - to an elegant way to make sure no errant packets are doing stuff they shouldn't be doing in our security model (the "here"). Blah, that's an awful sentence - apologies.
And by all means, if there's nicely-written technical resources that lay all this out in beautiful form, share those links. I've hunted, and read, and sniffed around, and asked colleagues to help me understand - as have several others on our staff - for quite a few weeks. Lots of general papers on IP6 or DNS resolution or whatnot (of course); not so much on "this is how browsers answer questions about DNS queries on client machines, and how those answers might or might not relate to actual packets doing actual things." We'd love to see more of that, please!
There's quite a bit of stuff out there written about "DNS leaks" - by VPN companies, by review sites, by random folks. I'm discomfited by the fact that I've not seen these browser-side questions tackled by someone smarter than me, already, and posted publicly. It makes me suspect I'm off-track... but we've poked at these wireshark results enough and I've done enough testing with those site links, above, to know something is happening beyond the simplistic explanations posted out there thus far. And I do feel that there's been some (inadvertent or otherwise) tuning of VPN client software to get "green check" passing results from several of the DNS leak websites, with all but total ignorance of why those green checks happen, or not... or whether they relate at all to actual packet traffic. That's a sobering realisation.
Bad news? Right now, your machine is likely leaking some IP6 packets you don't know about, whether you're on cryptostorm or not (folks blocking IP6 at their router, or using similarly hands-on tactics, likely are not... but that's not guaranteed, actually; more on that later). This has been known and flagged by us since our launch, in the general category of "IP6 leaks." Good news is that we can button up those leaks, not just the generic class but also the (I suspect far more pernicious and subtle) IP6 DNS query flavour itself.
tl;dr throw some data at us, and we're going to present to the broader community some tools to improve real-life network security that nobody's ever created previously.
So... that's that.
Cheers,