There has been pressure brought on webmasters recently to force all websites to use HTTPS. This has even extended to strong-arm tactics such as threatening to downgrade the search engine rankings of sites which don't comply.
The latest browsers now pop up security warnings on pages with password fields that don't use HTTPS.
A somewhat more extreme development is that Chromebooks are, I am told, disabling access to the microphone and camera unless HTTPS is used, even if the use is on an internal LAN.
The rationale behind this is twofold:
A man-in-the-middle attack refers to a person gaining access to the data stream between the end user and the service provider.
So, MITM is really just phone tapping, but with computer data. Whether you think that 'tapping' of your Internet data is likely to be a problem depends, of course, on whether you trust data carriers such as your ISP. If not, then you need to use the equivalent of a voice scrambler, which is effectively what HTTPS is.
There is some controversy over what the term MITM means. Some say it only refers to a literal wiretap on Internet cabling; others say that it can refer to any situation where an attacker is able to place a listening device at some point along the data path between the user's keyboard and the destination hard disk. There is an increasing number of precedents in the security industry for the latter being the accepted meaning today, so I think we can reasonably use it here.
The statistics for computer security breaches suggest that actual tapping of a wired data connection is rare. Also, we've been using the wired Web since the 1990s with no significant number of recorded incidents, so why should this suddenly have become a concern? Good question.
However, the rise of the use of wireless hotspots has created the risk of data interception from the radio signals these use, and there is indeed a possible justification for concern in this case.
Originally created with banking and e-commerce sites in mind, HTTPS encrypts the data stream between the browser and the webserver. It does this using a public/private key pair: the public key is contained in a certificate, while the private key stays on the server. The certificate is typically issued by one of a number of certificate signing authorities and, in addition to providing the public key, offers a degree of proof that the person or organisation operating the website is who they claim to be.
This answers the two abovementioned concerns:
The scheme is something like this:
The classic case for using HTTPS is on banking sites. In such cases, there should be only one data source, the bank's server. When used in that way, it achieves the two stated objectives of ensuring that the data source is where it claims to be, and that the data has not been read or modified en route.
For any security-sensitive Web work, these two protections are highly desirable features. When connecting to your banking site or the like, you should always check for the 'closed padlock' security icon, and preferably also check that the name on the certificate matches the company you think ought to be providing the service. If in doubt, don't enter any passwords, bank card numbers or the like.
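To make the 'check the name on the certificate' step concrete, here is a minimal sketch in Python. It assumes the certificate has already been obtained as a dictionary of the shape returned by `ssl.SSLSocket.getpeercert()`. The helper name `certificate_names` and the sample certificate data are made up for illustration and are not part of any real site or library.

```python
# Hypothetical helper: summarise who a certificate was issued to.
# `cert` is a dict in the shape returned by ssl.SSLSocket.getpeercert().

def certificate_names(cert):
    """Return the organisation, common name and DNS names from a certificate dict."""
    subject = {}
    # 'subject' is a tuple of name components, each a tuple of (key, value) pairs.
    for component in cert.get("subject", ()):
        for key, value in component:
            subject[key] = value
    dns_names = [value for kind, value in cert.get("subjectAltName", ())
                 if kind == "DNS"]
    return {
        "organisation": subject.get("organizationName"),
        "common_name": subject.get("commonName"),
        "dns_names": dns_names,
    }

# Made-up certificate data, for illustration only.
example_cert = {
    "subject": ((("countryName", "GB"),),
                (("organizationName", "Example Bank plc"),),
                (("commonName", "www.examplebank.example"),)),
    "subjectAltName": (("DNS", "www.examplebank.example"),),
}

print(certificate_names(example_cert))
```

If the organisation comes back as `None`, the certificate only vouches for the domain name, not for a named company, which is typically the case with free domain-validated certificates.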
In its original form, this is what HTTPS was intended to do, and if used strictly in this way, it does what it is supposed to.
An important point to realise, though, is that HTTPS only protects data in transit, roughly from the point where it leaves the browser to where it first arrives on the webserver. Before or after those points, it offers no protection whatsoever.
Here's the rub: A fairly exhaustive search of IT security websites turned up only a handful of reported thefts of data from on the wire. By contrast, there are countless thousands of reported data thefts from users' own computers, and from website servers.
If we compile a list of the most common attack vectors, we see a picture like this emerging:
The important point is that the overwhelming majority of these attack vectors affect either the end user's computer or the website operator's equipment. Attacks on data carriers' equipment are much rarer, probably because their security is generally much better than that of either of the endpoints.
Whilst HTTPS prevents the data being read en route, it does nothing to prevent it from being read either on the user's computer or on the website operator's equipment, which is where the majority of attacks take place.
Most of the endpoint attacks are not MITM attacks, but a few do fall into that category. For example, a hacker who can place a password-sniffing script into the browser, where that script will send any discovered passwords to him, is carrying out a MITM attack.
Likewise, an attacker who manages to place a script on the webserver which intercepts the POST variables on all incoming page requests will be able to read any passwords therein, because by the time the data reaches the script interpreter, it has already been automatically decrypted back to plaintext.
Thus, we can see that HTTPS is limited to protecting against a small subset of all attack vectors. Furthermore, it does not even protect against all forms of man-in-the-middle attack.
The current promotional drive aims to get HTTPS rolled out to every single website on the planet. This is a rather different proposition from its original intended use, and brings with it some problems.
The problems here arise not through any fault in HTTPS itself, but when too much is expected of the technology. It is, after all, designed to protect high-sensitivity data, but only when that data is on the wire, and only when part of a strictly one-to-one conversation between client and service.
You'd think that shouldn't be the case. But it is.
The issue this raises is due to the sheer number of offsite data pulls that large sites make. Instead of having to worry about the trustworthiness of one site, we have to worry about 20, 40 or more, some of which we've never heard of. If any of these sites has a mind to, it could very easily spy on passwords being typed into the page. All of them are, in very real terms, potential men in the middle.
Even if you explicitly trust all advertisers with your passwords (OK, hope I didn't make you spill your coffee with that one), that isn't the end of the story. The large adsites serve out data to a vast number of client websites, and some of the larger client sites in turn have millions of subscribers. The sheer number of vulnerable users is bound to make the adsite a prime target for hackers. Why hack a single large website if you can hack an advertiser and plant password-sniffers on many more computers that way?
The hacker need not even load malware onto the hacked advertiser's site. If it is more convenient he could just put a link in the ad code to his own site. Provided this uses HTTPS, even with a free certificate, the browser won't complain about loading content from it.
Therefore, this situation creates a high security risk. So, how is it handled?
Firstly, on an HTTPS site the browser will allow any third party content to be fetched unchallenged so long as the third party is also using HTTPS. The third party need not be using the same credentials as the main site. In fact with so many secondary connections in use, that would be nigh on impossible to arrange.
A browser will warn you if any of these third parties is not using HTTPS. In the old days that warning did have some value, it being unlikely that a hacker would own a bona fide SSL certificate. Today, with free certificates available, it will rarely arise.
The need to handle a situation where every page request spawns numerous certificate validations means that browsers simply cannot display every certificate involved under the 'padlock' information. So instead, they only display the security information for the actual site requested by the user. It is as if the rest do not exist.
Thus, even if one of the third party sources is a hacker using a free certificate, the browser won't inform you, and the certificate won't be visible under the padlock.
All in all, a very unsatisfactory situation. There is a multiple risk of MITM attacks from an unknown number of sites, only one of which the user explicitly trusts. The browser remains silent about this risk, in fact creating the impression that all of the data loaded comes from the one trusted site.
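Since the browser won't show you these third parties, one rough way to see them is to list the hosts that a page's own markup pulls content from. The following is a minimal sketch only, using Python's standard `html.parser`. The class name, function name, sample page and all hostnames are invented for illustration. It compares exact hostnames (so a subdomain of the main site counts as a third party) and it cannot see anything that scripts load dynamically after the page has been fetched, so it will tend to undercount.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceHosts(HTMLParser):
    """Collect the hosts a page loads scripts, frames, images and stylesheets from."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "iframe", "img"):
            url = attrs.get("src")
        elif tag == "link":
            url = attrs.get("href")
        else:
            url = None
        if url:
            # Resolve relative and protocol-relative URLs against the page URL.
            host = urlparse(urljoin(self.page_url, url)).hostname
            if host:
                self.hosts.add(host)


def third_party_hosts(page_url, html):
    """Return the sorted list of hosts other than the page's own host."""
    parser = ResourceHosts(page_url)
    parser.feed(html)
    own_host = urlparse(page_url).hostname
    return sorted(parser.hosts - {own_host})


# Invented sample page: one local script, three offsite resources.
sample = """
<html><head>
<script src="/js/site.js"></script>
<script src="https://ads.adnetwork.example/serve.js"></script>
<link rel="stylesheet" href="https://cdn.fonts.example/font.css">
</head><body>
<img src="https://tracker.example/pixel.gif">
<form><input type="password" name="pw"></form>
</body></html>
"""

print(third_party_hosts("https://news.example/", sample))
```

In this made-up page, the password form shares the document with content from three hosts that never appear under the padlock.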
So, whilst HTTPS is a useful security feature for highly sensitive data in a controlled environment, it is far less effective in the general browsing arena.
Time for a little demo. After all, seeing is believing. Here is a password field created by this site:
You'll note that the offsite script can read not only its own password field, but any on the main page too. It doesn't need to be told where the field is, either. It can find it automatically. There's another right at the foot of the page if you want to scroll down and enter something, just to prove the point.
When you consider that the key advantage of the mass HTTPS rollout is claimed to be protection against an untrustworthy individual at your ISP being able to steal your password, what is the point if the same risk still exists at tens of advertisers?
It might be possible to sandbox the ads on the page so they can't steal passwords, but the issue here is that we have no idea if any given site has implemented such protections, or not. So either way, the reassurance of the certificate info that 'this site is secure' is without any real substance.
It's worth taking a look at a couple of examples of data connections on large websites:
I set a data trace on the browser's Internet connection, and opened a well-known Scottish news site. A respectable journal, this carries no ads. This was the log:
I then opened a well-known international news site which does carry ads:
So in visiting these two sites which I trust, my browser also, without even notifying me, pulled in data from forty-three other sites. All of these 43 additional sites were using HTTPS, but no security information for these sites appeared under the 'padlock' icon. From the user's perspective it was as if they didn't exist.
If we consider that the connection went through one ISP in a protected state to one trusted and validated site, but 43 other untrusted data sources were involved, then that represents a 2/45 or one in 22.5 success rate at safeguarding the connection from men in the middle. Call it five percent and we're being generous.
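For anyone who wants to check the sum, here it is worked through as a few lines of Python. The variable names are mine. The figures are the ones quoted above: two protected links (the ISP and the validated site) against 43 other sources.

```python
from fractions import Fraction

protected = 2    # the ISP link and the one trusted, validated site
unvetted = 43    # additional data sources counted in the traces
total = protected + unvetted

share = Fraction(protected, total)
print(f"{share} = 1 in {total / protected} = {float(share):.1%}")
# prints: 2/45 = 1 in 22.5 = 4.4%
```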
There would seem to be a trading standards issue here. A security product with a 5% success rate is not fit for purpose.
I'll just recap that this situation arises not through any fault in the HTTPS protocol, but through its application to an environment which it was not intended to work in, and to do a job which is essentially beyond its capabilities.
HTTPS is promoted on the strength that it will protect your online data by preventing interception of it, and therefore make the Web a safe place.
There are two distinct issues with this claim:
The vast majority of documented IT security incidents have not been man-in-the-middle attacks. Many of the compromised sites were using HTTPS anyway, and it made not the slightest bit of difference.
That ineffectiveness is not the fault of HTTPS, which was designed with a particular role in mind. It arises through having unrealistic expectations of it.
Again, HTTPS was not designed for use in this kind of environment. It was designed for use in online banking and the like, where the data comes from a single trusted source.
I'd say that the chief concern here is that promotion of HTTPS in this way, when it has so limited an effect on actual security, is bound to create false expectations in the minds of users that it will make their browsing safe and secure. In reality it only attempts to address two out of many security issues, and because of the way in which it is deployed, does not properly achieve even these two objectives.
MITM risk or no, the key concern is that of plaintext password storage on website operators' servers. If we can do something to address that one, we can make a really significant advance in the safe usage of Web services.
The way to achieve this is for the browser itself to handle the one-way encrypting of a password before any other process is allowed access to it.
<input type='password' encryption='sha256' salt='3d2c54b3a7f4cb312c34' >
Once such a scheme is well established, browser authors would make it such that use of unencrypted password fields will trigger a security warning.
Such an arrangement would be compatible with sites carrying advertising.
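As a rough illustration of what the browser might do with such a field, here is a minimal Python sketch of the hashing step. The `encryption` and `salt` attributes are a proposal, not an existing HTML feature, so everything here is an assumption: the function name is hypothetical, the salt is treated as a hexadecimal string, and it is simply prepended to the password before hashing.

```python
import hashlib


def hash_password_field(password, salt_hex, algorithm="sha256"):
    """Return the hex digest a browser could submit in place of the plaintext."""
    salt = bytes.fromhex(salt_hex)            # assumption: the salt attribute is hex
    digest = hashlib.new(algorithm)
    digest.update(salt)                       # assumption: salt is prepended
    digest.update(password.encode("utf-8"))
    return digest.hexdigest()


# The salt is the one from the example tag; the password is made up.
submitted = hash_password_field("correct horse", "3d2c54b3a7f4cb312c34")
print(len(submitted))  # 64 hex characters for SHA-256
```

The page scripts and the server would then only ever see the digest, never the password as typed. A real specification would probably want a deliberately slow function such as `hashlib.pbkdf2_hmac` rather than a single SHA-256 pass, but the principle is the same.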
Session cookies should also be encrypted by the browser, to prevent user impersonation.
The browser coders are keen to say that webmasters should only use official encryption systems, that they should not attempt to devise their own in case they prove to be weak. Well, here is a situation which just begs for the creation of an official encryption system.
Before the point arises, nobody is saying that such a scheme would be unbreakable. No security mechanism is bulletproof. The claim is only that it would be far better than the present situation. The most important benefit of this browser feature is that by providing readily available, persistent encryption it will discourage the storage of passwords as plaintext.
HTTPS could be used as well, if desired.
YES if the dialog between your browser and the secure website is strictly one-to-one.
NO if the website carries third party content such as advertising, captchas, tracking or profiling, hit counters, etc.
Since the majority of websites do carry third party content, the answer will usually be no.
Which ought to be worrying.