HTTPS: Benefits and misconceptions

There has been pressure brought on webmasters recently to force all websites to use HTTPS. This push, which goes hand in hand with the free 'Let's Encrypt' certificate initiative, has been spearheaded by the browser authors, notably Google and Mozilla.

In particular, site maintainers have been warned that pages with password fields which do not use HTTPS may soon cause the browser to display a warning to the user, to the effect that the site is insecure.

Firstly, let's take a look at what HTTPS is, and what it achieves. Primarily created with banking and e-commerce sites in mind, HTTPS encrypts the data stream between the browser and the webserver. It does this using a public/private key pair: the public key is published in a certificate, while the private key stays on the webserver. The certificate is typically issued by one of a number of certificate signing authorities, and in addition to providing the encryption key, offers a degree of proof that the organisation operating the website in question is who it claims to be.

It's not difficult to see that the extra degree of protection offered by HTTPS is highly desirable if you're doing online banking. In fact, it would be most unwise to perform any money transactions, or exchange any sensitive personal information for that matter, on a site which didn't offer it.

What we're looking at here though, is an attempt to enforce the use of HTTPS on every single website throughout the entire world. Not only that, but to fully comply with Google's dictates, the webmaster would have to enforce its use for all site visitors. Leaving the choice  of plain HTTP or HTTPS  up to the visitor would not be an option.

That is a rather different matter from protecting a banking website, and considering that the vast majority of websites handle no sensitive information, it has to be asked, why do this anyway, and does it offer any real advantage?  Additionally, will it create problems that would not otherwise have existed?

Reasons

The primary reason for adopting HTTPS is, we are told, to forestall man-in-the-middle (MITM) attacks. The argument here is that data passing through the hands of Internet carriers is open to inspection by those carriers, and if a dishonest person were to gain access to the carrier's equipment, then your data might be collected and used for criminal purposes. Also, the man-in-the-middle could in principle modify the data being sent to you by the website to suit his own purposes. 

So, MITM is really just phone tapping, but with computer data. Whether you think that 'tapping' of your Internet data is likely to be a problem depends, of course, on whether or not you trust data carriers such as your ISP. If not, then you need to use the equivalent of a voice scrambler, which is effectively what HTTPS is.

A standard website data transaction, without HTTPS

The surprising thing about Web browsers is that they support the use of password-entry fields, but they don't have any inbuilt protection mechanism for those fields. Any data in the password field is sent as-is to the webserver. Thus, where password fields exist on a page, the use of HTTPS does improve the security situation to some degree. I say to some degree, because it does not alter the fact that the password seen by the webserver is still in plaintext.

How it works

When you visit an HTTPS website, an encryption 'handshake' takes place in which a public key is sent to your computer, with which any data you send to the server should be encrypted. Decryption of this data requires the matching private key, which only the website's server has access to. Thus, even if the data and the public key were to be intercepted en route, it still would not be possible to decrypt the data. (Modern HTTPS actually uses the handshake to agree a temporary session key for the bulk of the traffic, but the principle is the same.)
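
To make the public/private key idea concrete, here is a toy sketch in Python using textbook RSA with absurdly small numbers. The primes, the exponent and the function names are my own illustrative choices, not anything a real HTTPS handshake would use; real keys are thousands of bits long and are combined with padding and session keys. The only point being made is that what is locked with the public key can be unlocked only with the private one.

```python
from typing import Tuple

# Toy RSA with deliberately tiny primes, for illustration only.
p, q = 61, 53                  # assumed demo primes, far too small for real use
n = p * q                      # the modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent: modular inverse (Python 3.8+)

public_key = (e, n)            # this half is sent to the browser
private_key = (d, n)           # this half never leaves the server


def encrypt(message: int, key: Tuple[int, int]) -> int:
    """Lock a number (smaller than n) using the given key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: Tuple[int, int]) -> int:
    """Unlock a number using the matching key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


secret = 65                                   # a message encoded as a number
ciphertext = encrypt(secret, public_key)      # what an eavesdropper would see
recovered = decrypt(ciphertext, private_key)  # only the server can do this
```

An eavesdropper who captures both `public_key` and `ciphertext` still lacks `d`, and with realistically sized primes there is no practical way to work it out.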

In practice the public key is delivered in the form of a certificate, which is typically bought from a certificate issuing authority, while the matching private key is kept on the webserver. The certificate provider performs a second function, in allowing your browser to determine whether the certificate was indeed issued to the site you are visiting, and whether it is still valid. This makes it much harder for an impostor to pretend to be the website in question.
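
For the curious, the kind of checks a browser performs can be seen in miniature in Python's standard `ssl` module. This is only a sketch of a client, not a description of how any browser is implemented, and the function name `fetch_certificate_subject` is my own invention. The default context refuses a certificate that does not chain back to a trusted issuer, has expired, or was issued for a different hostname; it does not, by itself, consult revocation lists.

```python
import socket
import ssl

# A client-side context with the library's recommended defaults.
context = ssl.create_default_context()

# CERT_REQUIRED: the handshake fails unless the certificate chains to a
# trusted authority and is within its validity dates.
requires_valid_cert = context.verify_mode == ssl.CERT_REQUIRED

# check_hostname: the certificate must have been issued for the name we asked for.
checks_hostname = context.check_hostname


def fetch_certificate_subject(hostname: str, port: int = 443):
    """Connect to a site and return the subject of its verified certificate.

    Raises ssl.SSLCertVerificationError if any of the checks above fail.
    """
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["subject"]
```

Calling `fetch_certificate_subject` with the name of a real HTTPS site needs a network connection, so it is left uncalled here.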

A website data transaction with HTTPS

The use of certificate-based public/private key encryption makes for better security than simply encrypting the data in a fixed manner. In particular, it means that having the public key does not expose the data to decryption. Only the private key can do that. Furthermore, the public key is embedded in a certificate which is tied to the website's domain, so if a hacker were to set up a fake copy of the website, the browser should issue a warning to the effect that the certificate is being used on a site other than the intended one.

Also, if the original website were to be bought up by untrusted individuals then the certificate can be revoked by its issuer, which again should cause the browser to warn you that the site is not genuine. So, in terms of protecting data in transit, HTTPS is a more effective measure than most.

Limitations

Whilst this sounds like first class security, and in many ways it is, the important point to realise is that as with a telephone voice scrambler, HTTPS only protects the data in transit. Just as the voice scrambler doesn't protect your conversation from being overheard by another person in the same room, so HTTPS cannot protect the data when it is on your computer, nor can it continue to protect any data you send, once that data has arrived at the website.

This arrangement fulfils very well the stated goal of preventing MITM attacks. However, as seen on this diagram, it leaves all other attack points just as unprotected and vulnerable as they were before.  Perhaps most seriously, if the website operator is storing the received passwords in plaintext in a database, then implementing  HTTPS will make not the slightest difference to the security risk this poses.

It's worth adding that the password must be present as plaintext at some point on the client computer, so no matter what hashing or encryption we do, there is always a risk of desktop malware acquiring it in plaintext. This is a risk which can only be mitigated by keeping the client computer clean of malware. The real concern here though, is that the password should not be left vulnerable anywhere on the webserver. That situation ought to be avoidable, and it is if things are designed correctly.

Why no evidence of MITM attacks?

For some months I've been asking around for any verifiable reports of passwords being stolen by way of MITM attacks. So far, nobody has come up with anything.

To be fair, that may be because MITM incidents are hard to quantify or document. Or it may be that they are extremely rare.

The scientist in me says that a hypothesis with no real-world evidence is skating on thin ice, though. The scientific principle requires evidence. Requires. Only in very rare cases would a theory become accepted as fact in the absence of supporting evidence. Even if the principle of MITM can be demonstrated on the testbench, the lack of real world examples is still a major issue. 

By contrast, there is overwhelming and incontrovertible evidence that SQL code injection is a major security problem for webmasters. No one denies this. Interestingly though, there is also overwhelming evidence that posting 'mailtos' on webpages leads to spamming incidents. Yet in spite of this being easy to demonstrate in the real world, we have a substantial cadre of 'deniers' who try to make out that it is not so.

The lesson to be learned is: don't believe everything you are told! Check the facts for yourself, which is something I strongly encourage you to do in this case. Don't take my word. Find out.

If anyone can provide substantiated evidence of real-world MITM attacks being a significant fraction of website password thefts (say 10% just for a ballpark)  then I'll change the content of this page to suit. Maybe even say that every website on the planet should use HTTPS. 

The evidence has to be in the form of genuine incident reports, though. Theorizing, testbench demonstrations or reciting of textbooks is not evidence.

Hashing vs encrypting

The textbooks say that in any case, passwords should be hashed, not encrypted. The distinction here is that whilst encryption is a two-way process which returns the original plaintext, hashing is one-way and instead provides a hash or token which represents the original plaintext password. The hash cannot be converted back to plaintext. The obvious advantage is that even if the hash of the password is stolen, that does not tell the thief what password was originally used.

To verify that the correct password has been supplied in a login process, the server compares the hash of the user's submission with a stored hash of the password as originally set. If the hashes are the same, the user is allowed access.
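
As a rough sketch of that comparison, here is how it might look in Python using only the standard library. The choice of PBKDF2, the round count and the function names are my assumptions for the sake of illustration, and the random salt is there because PBKDF2 expects one. Note that the plaintext is never stored: only the salt and the resulting hash are kept.

```python
import hashlib
import hmac
import os

ROUNDS = 100_000  # assumed work factor, chosen for illustration only


def hash_password(password: str, salt: bytes) -> bytes:
    """One-way conversion of a password into a fixed-length hash."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ROUNDS)


def register(password: str):
    """What the server keeps when the password is first set."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


def verify(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Hash the submission and compare it with the stored hash."""
    submitted_hash = hash_password(password, salt)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(submitted_hash, stored_hash)


salt, stored_hash = register("correct horse")
login_ok = verify("correct horse", salt, stored_hash)
login_bad = verify("wrong guess", salt, stored_hash)
```

A thief who copies `salt` and `stored_hash` from the database has nothing that can be turned directly back into the original password.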

The standard solution to the risk of password theft from the database is, indeed, hashing. This process could be applied at either the client or the server. In practice it's easier to apply it at the server, since more powerful software is typically available there. However, hashing at the client has the advantage that the plaintext password never leaves the client computer.

We'll look at these options in more detail later. For the moment it's sufficient to say that passwords should be hashed, and that a warning that passwords are insecure if HTTPS is not used is somewhat wide of the mark. Passwords are insecure if they are not hashed. That is the priority concern. Get this message across - along with the message that anything using SQL is fundamentally insecure - and we'll make a worthwhile difference to the Web security problem.

Getting our priorities right

-What are the major computer security issues?

In order to make our protection effective, we first need to understand not only what we are trying to protect against, but also how frequently each class of attack arises. It is also no use providing very heavy protection of one attack route if other easier attack routes exist.

In the Web data case, man-in-the-middle attacks are theoretically a serious risk but are not so common in the real world. This is because they require access to a data carrier's equipment which, with the possible exception of public WiFi, is not normally accessible to the public. Far easier to attack a webserver, which after all has to be publicly accessible to do its job. This situation is exacerbated  through the use of SQL databases on webhosts which suffer from one of the most serious security issues in the whole of IT, namely malicious code injection.

A quick scan of an IT security advisories site tends to confirm this, showing on the server side an endless list of buffer over-run and code injection exploits, combined with a good few cross-site scripting hacks. Man-in-the-middle attacks seldom figure in these lists. Meanwhile on the userspace side of things you will see endless reports of phishing, malware, fake antivirus, malicious download, and other social engineering exploits. If we put just a few of these attack vectors into place on our Website-access diagram, we see a picture like this emerging:

Points of attack for a typical Web data interchange

The elephant in the room here is that all but one of these attack vectors affect either the visitor's computer or the webmaster's server. Yet HTTPS protects... only the wire between them. In reality there are probably a few more security issues faced by data carriers that I haven't listed, but the point still remains that the frequency with which these issues arise is heavily weighted towards the two ends of the communication path, with not so much in the middle. What then, is the point in placing so heavy an emphasis on protecting the part of the data path where the fewest vulnerabilities arise?

Most people will be aware of the ongoing and seemingly endless series of high-profile website hackings. The key concern in these incidents has been the theft of user logins and passwords, which in some cases have then been used to break into other sites where the same person had used the same name and password. Incidents of this kind are far and away the greatest Internet security concern these days. The issue that comes to mind is thus one of what is being done to deal with these very serious security problems, and whether HTTPS offers any protection.

Well, the answer is no. HTTPS offers no protection at all against the kinds of hackings we've seen recently. The reason is simple; these intrusions happen mostly on the website itself, or on the hapless user's computer. The two places where HTTPS cannot protect the data.

I've covered the mechanisms behind these hackings in other articles, so no need to repeat here. The point I would raise though, is that converting the entire world wide web to using HTTPS is a gargantuan task, and will involve a colossal amount of work. When there are other more serious security issues in urgent need of fixing, can this be justified? My own thoughts on this would be a resounding NO. It cannot be justified. The HTTPS proponents seem to disagree. Well, they are entitled to their opinions, but the facts and statistics of the situation are readily available for anyone willing to spend a little time investigating them.

The promoters' response

When asked about this aspect, the HTTPS promoters say that yes, hashing should definitely be used, but that hashing should only be performed at the webserver. Naturally this leads to the assumption that passwords will be sent over the wire in plaintext, hence the need for HTTPS. Two arguments are put forward for this approach: one, that officially-coded server-side hashing routines are likely to be more secure than what might be a handcrafted piece of code running in the browser, and two, that hashing before transmission means that the password effectively IS the hash, and therefore a man-in-the-middle who steals the hash can use it as a surrogate password.

We'll look at these arguments in a minute, but the key weakness of purely server-side hashing is that the plaintext password is briefly exposed to any malware running on the server, as in the diagram below: 

HTTPS with server-side only password hashing

We've already discussed the situation where SQL code injection exploits are probably the Number One form of hacking risk in the entire IT industry, let alone in webhosting. Well, a code injection exploit is an ideal way to splice an extra bit of PHP code onto your favourite content management system. You know, the one you forgot to patch last week when that new advisory came out..

A significant weakness in relying purely on HTTPS to protect passwords in transit becomes apparent. Once decrypted by HTTPS in the server's environment, the password is available in plaintext as a POST variable, and any Web scripting language like PHP can easily pick off this value and, for example, email it to a hacker. The code required to do so is extremely trivial; in fact it hardly even qualifies as hacking. The password will no doubt be hashed immediately afterwards, but to no avail, as that malicious bit of PHP code has already sent the plaintext version to the hacker.

As for the two arguments for server-side hashing, well, if there are no industrial-strength client-side hashing routines... who needs to put that right?  Yup. The browser coders. Also, if we did use HTTPS, the hashed password supposedly can't be intercepted in transit, so in that case what's the problem with client-side hashing? I don't see any.

In any event, there is nothing to prevent the password being hashed at both ends. Some of the more secure hashing algorithms actually process the password up to five thousand times, so why not do a few each on client and server?
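
A rough sketch of hashing at both ends might look like the following. Python stands in here for what would be JavaScript in the browser, and the round counts, the salts and the function names are all assumptions of mine rather than an established scheme. The idea is simply that the browser sends a digest instead of the plaintext, and the server hashes that digest again before storing or comparing it, so a stolen database row cannot be replayed as a login.

```python
import hashlib
import hmac
import os

CLIENT_ROUNDS = 50_000   # assumed split of the work, not a recommendation
SERVER_ROUNDS = 50_000
SITE_SALT = b"forum.example"   # hypothetical public per-site value


def client_side_hash(password: str, site_salt: bytes) -> bytes:
    """Runs on the user's machine; the plaintext never leaves this function."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), site_salt, CLIENT_ROUNDS
    )


def server_side_hash(client_digest: bytes, user_salt: bytes) -> bytes:
    """Runs on the webserver, which only ever sees the client's digest."""
    return hashlib.pbkdf2_hmac("sha256", client_digest, user_salt, SERVER_ROUNDS)


# Registration: the server stores a per-user salt and the twice-hashed value.
user_salt = os.urandom(16)
stored = server_side_hash(client_side_hash("letmein", SITE_SALT), user_salt)


def login(submitted_digest: bytes) -> bool:
    """Hash whatever the client sent and compare with the stored value."""
    candidate = server_side_hash(submitted_digest, user_salt)
    return hmac.compare_digest(candidate, stored)


good_login = login(client_side_hash("letmein", SITE_SALT))
bad_login = login(client_side_hash("guess", SITE_SALT))
replayed_row = login(stored)   # a thief replaying the database value
```

Malware on the server that picks off the POST variable in this arrangement gets the client digest, which is still useful on this one site but is not the plaintext the user may have reused elsewhere.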

The right way

Let's take a brief and somewhat simplified look at how passwords should be handled by a browser-to-website data transaction. As we've seen, the browser has no inbuilt protection mechanism. This is bad, and I think it ought to be changed. Because of this situation, it is almost always necessary to resort to JavaScript preprocessing of the form. The preprocessing done here should consist of hashing the password before it is transmitted. This is true regardless of whether or not HTTPS is used, and the reason is that failing to hash on the client leaves the password vulnerable when it arrives at the server.

A client and server-side password hashing arrangement

If we want to do a really good job of protecting passwords we can actually take things one or two stages further, and employ salting of the hash, and/or a challenge/response mechanism. What we are aiming for here is to ensure that the hash we use for any given password will not match the hash used by any other website, even for the same plaintext password. The main objective here is to protect the interests of lazy people who use the same password on multiple websites. If salting is in use, then password hashes lifted from one such site will not work on the others.
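
Both ideas can be sketched in a few lines of Python. Everything named here (the site salts, the round count, the functions) is hypothetical and chosen by me for illustration. The first part shows the same password producing unrelated hashes on two sites; the second shows a bare-bones challenge/response, in which the server sends a random one-off value and the client proves it knows the hash without ever sending the hash itself. In this simple form the server has to keep the salted hash in order to check the response, so a production design would need more care than this.

```python
import hashlib
import hmac
import secrets


def salted_hash(password: str, site_salt: bytes) -> bytes:
    """Hash a password together with a value unique to one website."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), site_salt, 50_000)


# The same lazy password on two hypothetical sites gives unrelated hashes.
hash_site_a = salted_hash("letmein", b"site-a.example")
hash_site_b = salted_hash("letmein", b"site-b.example")


def make_challenge() -> bytes:
    """Server side: a fresh random value for every login attempt."""
    return secrets.token_bytes(16)


def respond(password_hash: bytes, challenge: bytes) -> bytes:
    """Client side: combine the hash with the challenge; send only the result."""
    return hmac.new(password_hash, challenge, hashlib.sha256).digest()


def check(stored_hash: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the expected response and compare."""
    expected = respond(stored_hash, challenge)
    return hmac.compare_digest(expected, response)


challenge = make_challenge()
response = respond(hash_site_a, challenge)
accepted = check(hash_site_a, challenge, response)

# A response captured in transit is useless against the next challenge.
replay_accepted = check(hash_site_a, make_challenge(), response)
```

Because every login uses a new challenge, what crosses the wire is different each time, which goes some way to answering the objection that an intercepted hash is as good as the password.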

The important difference here is that although the password is briefly exposed as plaintext on the user's computer, once it leaves the computer it is in a non human-readable form, and stays that way, right up to and even inside the server database. In fact it is never converted back into the original text. So, as far as passwords are concerned, hashing covers the issues of two thirds of this diagram, whereas HTTPS covers only the middle section.

That said, hashing only covers passwords, and for mission-critical data like bank statements we might prefer to encrypt the lot as well. So, we can combine HTTPS with hashing. In fact that's no problem at all, since the HTTPS layer is transparent to the hashing mechanism. The overall setup might look something like:

Best security practice is to use both HTTPS and client/server hashing.

In conclusion

On sites handling user logins but less-sensitive data, for example social media or forums, it really is far more important to protect passwords by way of a proper hashing mechanism than to implement HTTPS. The majority of password thefts occur on the web or database server, where HTTPS offers no protection whatsoever.

The danger here is that where the site doesn't handle critical data and only a limited amount of working time is available, that time will be spent on implementing HTTPS, leaving passwords to be stored on the server as plaintext. Did I mention that's bad?

This will be exacerbated if browsers start to flag sites which don't use HTTPS as insecure, because by reverse inference it will suggest that those which do use HTTPS are secure, as in, 'Look, it has a closed padlock so it must be OK!' That situation may well lead to proper password protection being omitted.

There is the danger of HTTPS being seen as a password security cure-all. Password hashing is important for all sites that use logins, because people will insist on using the same password for multiple sites. Thus, the theft of a password from a low-importance site is not as trivial an issue as it might seem.

Furthermore, if at all possible it should be arranged that  the password is never exposed as plaintext on the webserver. Any scheme which requires the password to be decrypted server-side to plaintext and then hashed as two separate steps, is vulnerable to malware planted through SQL code injection or other exploits. Since SQL code injection exploits are extremely common, this is likely a more pressing concern than protection against MITM attacks. This consideration suggests the use of client-side hashing, or else of an alternative encryption system which does not automatically decrypt data as it arrives on the server.  

If there is to be any modification to how browsers handle password fields, instead of enforcing HTTPS the browser should focus on enforcing at least a basic level of password hashing. That way, the password remains safe after it arrives on the webserver.  HTTPS does not achieve this fundamental security requirement.

As I say, there is a danger to browser password-field warnings, in that promoting HTTPS this way may lead to its being seen as a security cure-all by inexperienced webmasters. Like the guy who decides he doesn't need to visit his doctor because, after all, he's on a course of Dr Marvel's Wonder Liniment, which is known to cure all ills, the webmaster may decide that having implemented HTTPS, no other security measures are necessary. That's not to say that HTTPS is in the same class of product as snake oil. It isn't. However, like the aforementioned product it isn't a cure-all either. We need to get our priorities right, and if password theft is the big issue, then HTTPS is not going to fix this because passwords are typically stolen from either the user's computer or the website's server. The two places where HTTPS offers no protection.


Site: iwrconsultancy Thread: blog/https.htm
