To what extent is filtering HTTPS traffic possible?

Question

How can Privoxy filter Secure (HTTPS) URLs?

Since secure HTTP connections are encrypted SSL sessions between your browser and the secure site, and are meant to be reliably secure, there is little that Privoxy can do but hand the raw gibberish data through from one end to the other unprocessed.

The only exception to this is blocking by host patterns, as the client needs to tell Privoxy the name of the remote server, so that Privoxy can establish the connection. If that name matches a host-only pattern, the connection will be blocked.

Access to HTTPS websites such as twitter.com can be prevented, but can embedded HTTPS content such as YouTube videos be blocked from loading on websites?

score 5 · Accepted Answer · answered Feb 04 '17 at 12:21


Filtering HTTPS traffic is possible. The question you should be asking is: To what extent?

HTTPS is an application-layer protocol carried over TCP/IP, and the IP layer itself is not encrypted, so the source and destination IP addresses remain visible. The destination host name is usually visible as well, for example in the CONNECT request sent to a proxy or in the Server Name Indication field of the TLS handshake. Hence, it is possible to block a domain name or IP address, effectively stopping all traffic to or from it, including but not limited to HTTPS. I have evidence that this method is more than enough to stop YouTube videos.
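To make the idea concrete, here is a minimal Python sketch of host-based blocking as a proxy might apply it to a CONNECT request. The blocklist contents and the function names (`parse_connect_target`, `is_blocked`) are illustrative assumptions, not anything Privoxy or a particular firewall actually uses.

```python
# Hypothetical blocklist; each entry matches the domain itself and any subdomain.
BLOCKED_SUFFIXES = ("youtube.com", "twitter.com")


def parse_connect_target(request_line):
    """Return (host, port) from a line such as 'CONNECT host:443 HTTP/1.1'."""
    method, target, _version = request_line.split()
    if method.upper() != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, _, port = target.rpartition(":")
    return host.lower(), int(port)


def is_blocked(host):
    """True if host equals a blocked domain or is a subdomain of one."""
    host = host.rstrip(".")
    return any(host == s or host.endswith("." + s) for s in BLOCKED_SUFFIXES)


host, port = parse_connect_target("CONNECT www.youtube.com:443 HTTP/1.1")
print(host, port, is_blocked(host))
```

Note that the decision uses only the host name; nothing inside the encrypted session is inspected.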

But is it possible to peek inside HTTPS traffic? That depends. A simple eavesdropper cannot. But a man-in-the-middle can theoretically intercept the client's session request to the server and send its own session request, thus becoming the encrypted client of the secure server. So, it can decrypt the server's response. Now, it can act as a secure server for the original client, establishing its own secure session with it, re-encrypting the decrypted contents and sending them to the client. However, modern web browsers easily discover this kind of attack by checking the digital certificate of the man-in-the-middle. To evade this detection, the man-in-the-middle must somehow procure the original secure server's digital certificate's key pairs. Needless to say, this is very difficult; most of the time, impossible.
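The certificate check described above is what a TLS client does by default. As a small sketch using Python's standard `ssl` module (standing in for a browser here, which is an assumption made purely for illustration), the default client context requires a certificate that chains to a trusted authority and matches the requested host name. The helper name `fetch_peer_certificate` is hypothetical.

```python
import socket
import ssl

# The default client context demands a certificate that chains to a
# trusted CA and whose name matches the host being contacted.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)  # a valid certificate is mandatory
print(context.check_hostname)                    # the name must match the certificate


def fetch_peer_certificate(host, port=443):
    """Open a verified TLS connection and return the peer certificate as a dict.

    Raises ssl.SSLCertVerificationError if the certificate is not trusted
    or does not match the host name.
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

An interceptor that presents a certificate the client does not trust would cause `fetch_peer_certificate` to raise instead of returning.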


I have changed the title. – user198350 Feb 05 '17 at 19:09
My mobile provider can block tethering to my laptop, even with a VPN connection on my Android phone. There must be information in the headers, or they do a MITM with secure connections. Chinese 4G provider. – jiggunjer Feb 05 '17 at 21:16
@jiggunjer Actually, that has nothing to do with "information in the headers" or anything of the sort. It is a negotiated feature of the phone. The provider sends a "do not allow tethering signal" through the SIM card. Your phone honors the request. Some phones, like those running Cyanogenmod, do not honor it and allow tethering anyway. – Feb 06 '17 at 05:45
But my tethering was working fine with VPN at first. They started blocking the first time I tried tethering multiple devices; that caused the block to trigger. How could they see I was using multiple devices unless something was visible? – jiggunjer Feb 06 '17 at 05:53
@jiggunjer In all probability they didn't see anything specific to you on the network. They just decided to block tethering. The initial condition that led you to think that tethering could be bypassed with VPN has to do with DHCP scenarios. I cannot explain that in the comment section. (I am running out of characters.) Anyway, we are straying from the subject. – Feb 06 '17 at 06:00

score 1 · Answer 2 · answered Feb 05 '17 at 20:38

Fleet Command's answer, as currently written, is mostly correct. I don't want to duplicate that answer, so I recommend starting by reading that answer (in its entirety).

However, there is one point that should be elaborated on. (I figured this would take more characters than a comment, so I'm adding it as an answer.)

To evade this detection, the man-in-the-middle must somehow procure the original secure server's digital certificate's key pairs.

That is just one approach. There is another approach: the "man-in-the-middle" could simply use its own certificate. This technique is used by many organizations (including commercial companies, public schools, etc.).

For example, you could have a firewall that provides "HTTPS-filtering" capabilities. This firewall may receive the outbound HTTPS traffic and, instead of routing it to a website, the firewall may just act like it is the website. Then the firewall establishes its own HTTPS connection to the website, retrieves the data from that website, and passes on the data to the web browser (spying on the traffic, and making any desired filtering/changes, as the firewall desires).

The challenge with this is that unless the "man-in-the-middle" device (the "firewall" in this example) handles the key problem, then the web browser will know that the data is not coming from the website that it tried to reach. The way the web browser knows that the website is what it wants to reach is by using SSL technology. (Historically, although the S in HTTPS meant "Secure", it was also technically accurate to think of it as "HTTP over SSL". Although, nowadays, that's usually "HTTP over TLS".)

One way to handle this challenge is to get the website's private keys, as noted by Fleet Command. However, there is another way.

When the web browser receives an SSL certificate (which contains some information, including an SSL key), whether that comes from the actual website or from the "man-in-the-middle" device, the web browser looks to see whether that SSL certificate should be trusted to identify itself as the website. A common way to do this is for the web browser to look in its certificate store. Let's see three different scenarios of what happens when someone tries to visit a website called Example.com, and then you'll see the other possible weakness that lets HTTPS filtering work.

If the website says "I am Example.com, and you know this because of my SSL certificate, which you can tell is endorsed by GoDaddy.com", and your web browser uses a certificate store containing a certificate that says to trust everything endorsed by GoDaddy.com, then the web browser is satisfied and shows no warning.
Suppose you go to a restaurant and connect to a Wi-Fi device that is trying to use "man-in-the-middle" techniques to spy on you. The Wi-Fi device says "I am Example.com, and you know this because of my SSL certificate, which you can tell is endorsed by Cyberoam". However, your web browser doesn't contain a certificate that says it should trust Cyberoam. As a result, the web browser shows the user a warning that the communication with the website is not trustworthy because the certificate doesn't appear to be valid.
Then you go to work, and your work says that your computer needs to join the Active Directory domain for security reasons. You agree, and then your computer trusts the network's "domain controller" to make whatever security configuration changes are desired. Work wants to control HTTPS traffic, so the domain controller specifies that the Cyberoam certificate should be installed in your computer's certificate store. Now, not only is your work able to spy on your HTTPS traffic (without you really knowing about it), but so can that Wi-Fi device at the restaurant.
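The three scenarios above can be boiled down to a toy model. The sketch below is deliberately simplified (real validation also checks signatures, validity dates, and the whole certificate chain), and the `Certificate` class and `browser_accepts` function are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    subject: str  # the site the certificate claims to identify
    issuer: str   # the authority that endorsed it


def browser_accepts(cert, requested_host, trust_store):
    """Toy check: the name must match and the issuer must be in the store."""
    return cert.subject == requested_host and cert.issuer in trust_store


genuine = Certificate("example.com", "GoDaddy.com")
intercepted = Certificate("example.com", "Cyberoam")

default_store = {"GoDaddy.com"}
work_store = default_store | {"Cyberoam"}  # after the domain controller adds it

print(browser_accepts(genuine, "example.com", default_store))      # scenario 1
print(browser_accepts(intercepted, "example.com", default_store))  # scenario 2
print(browser_accepts(intercepted, "example.com", work_store))     # scenario 3
```

The first and third calls return True and the second returns False: the intercepting device's certificate is accepted only once its issuer has been added to the store, and from then on any device holding a certificate from that issuer is accepted too.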

In general, the attitude of many organizations is "we want to control things, and don't care so much about whether the end users have the privacy that they desire". Here is another example of this same sort of thing happening: Security.StackExchange.com question: "My college is forcing me to install their SSL certificate".

I think of how many people just choose "I agree" without reading the forms or understanding the security implications. As long as most people simply cooperate when an organization says "this is required", and as long as organizations tend to have this controlling attitude, the market seems ripe for companies to keep selling equipment designed to snoop on HTTPS traffic using its own certificates, and for organizations to keep the matching root certificates installed on machines (so that the equipment can effectively do the intended task of snooping on HTTPS traffic).

Now that I've discussed the tech, effectively answering the question in the title ("To what extent is filtering HTTPS traffic possible?"), let me provide a straightforward answer to the other question I see:

can embedded HTTPS content such as YouTube videos be blocked from loading on websites?

As Fleet Command's answer noted, the firewalls could simply notice the destination IP address is related to YouTube, and block the traffic there. The end user would know that the traffic is blocked, because the page wouldn't load.

If MITM techniques are being deployed, then a device could theoretically allow a web browser to get some content from the website, while other content could be changed (including being blocked). For instance, video might be allowed, but comments could be changed (or vice versa). The end user would likely be oblivious to what is happening.

That's not entirely man-in-the-middle; when a company wrests the control of your computer, that's no longer your computer. This is one of the 10 Immutable Laws of Security. Nevertheless, I believe Firefox doesn't receive certificates from Active Directory. – Feb 06 '17 at 06:01
@FleetCommand In the case of the college demanding access, I highly doubt that average students feel like they've given their computer to the college just because they followed instructions to connect. I just reviewed the "10 Immutable Laws of Security" published by Microsoft; it seems you're essentially calling the "company" (/work/college/whatever) "a bad guy". Even if/though the client computer has a modified certificate store, I still say the SSL interceptor is in the middle of the HTTPS connection, hence acting as a MITM. – TOOGAM Feb 06 '17 at 06:12
Lots of malware spreads by social engineering, i.e. persuading a user to follow a set of instructions that ultimately leads to installing malware. Also, I was not calling the company/college a bad guy, but I am now. Blocking YouTube or Facebook in the workplace/college is justifiable. Stealing an employee's/student's password is not. – Feb 06 '17 at 08:14

score 0 · Answer 3 · answered Feb 07 '17 at 00:00

One of the main drivers for deployment of MitM is that browsers refuse to display any content returned by a proxy in response to a CONNECT request.

Initially this issue arose because a researcher was able to get a browser to follow a redirect from a CONNECT method. They should have just fixed the browser, but instead it was recommended to ignore any body on the response from CONNECT as untrustworthy.

The problem with this is that the browser handles https URLs via a proxy by first connecting (CONNECT method) to the server through the proxy. If a proxy wanted to refuse the connection, in the past it would send back a block page response, which could tell the user what was going on, what to do, etc. With the changes to the browsers to ignore the block page response, the user experience became extremely poor: the browser reports a generic connectivity error, which sends people looking for router, cable, and similar problems. This is the current situation, and I wish I had a dollar for every support ticket this has caused us.
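For reference, here is a rough Python sketch of the kind of response a proxy could send when refusing a CONNECT request. The status code, the HTML body, and the helper name `build_block_response` are assumptions made for illustration; actual proxies differ in the details.

```python
import html


def build_block_response(host, reason):
    """Build a raw HTTP/1.1 403 reply to a refused CONNECT request."""
    body = (
        "<html><body><h1>Access blocked</h1>"
        f"<p>{html.escape(host)}: {html.escape(reason)}</p></body></html>"
    ).encode("utf-8")
    head = (
        "HTTP/1.1 403 Forbidden\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


response = build_block_response("www.youtube.com", "blocked by site policy")
status_line = response.split(b"\r\n", 1)[0]
print(status_line.decode("ascii"))
```

The browser sees the refusal but, as described above, discards the HTML body, so the explanation never reaches the user.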

Getting around this and forcing the browser to display a proper block page requires MitM. There are long discussions about this in the IETF HTTP WG mailing list.

Until browsers allow a way to display a message from a proxy about why a connection was blocked, this will continue to be the case.
