Custom HTML Authentication

Best Practices on Securing Custom HTML Authentication Procedures

By Gunter Ollmann

Interactive web-based applications now form an important part of the e-business world. There is great pressure on organisations to make available many of their services through the Internet to their end clients, business partners, and own employees. Many of these new online services require end users to positively identify themselves to the application and actively work to ensure the information and level of access is appropriate for the authenticated user. While many methods are available to an organisation seeking to implement an authentication method for their Internet service, the majority have chosen to do so through HTML form submission over HTTP. Although they tend to understand the threats to their hosting environment from attackers, and actively test and patch the hosts against publicly disclosed vulnerabilities, very often the security fails at the implementation of their custom authentication procedure. Organisations must now ensure that adequate secure procedures are implemented within the custom application, particularly the authentication process and the associated management of session state.

This article explains the steps and procedures an organisation should review in the process of constructing a web-based service. Emphasis is placed upon the mechanisms and logic for preventing differing attack types, and the processes for evaluating their relevance to the architecture and the end user.

HTTP Authentication Security

For the majority of organisations, delivery of their Internet services is through the World Wide Web over HTTP. While HTTP versions 1.0 and 1.1 (RFC 1945 and RFC 2616, with the authentication schemes detailed in RFC 2617) include access authentication schemes for controlling access to realms within a web site, most organisations choose not to implement these, largely because of concerns over scalability and control.

There are two native HTTP access authentication schemes available to an organisation: Basic and Digest.

Basic Access Authentication

Basic Access Authentication assumes the client will identify themselves with a login name and password. When the client browser initially accesses a site using this scheme, the web server will reply with a 401 response containing a "WWW-Authenticate" header with a value of "Basic" and the name of the protected realm (e.g. WWW-Authenticate: Basic realm="wwwProtectedSite"). The client browser will then prompt the user for their login name and password for that realm. The client browser then responds to the web server with an "Authorization" header, containing the value "Basic" and the base64-encoded concatenation of the login name, a colon, and the password (e.g. Authorization: Basic QWRtaW46Zm9vYmFy). Unfortunately, base64 is an encoding rather than encryption, so the credentials can be trivially decoded should an attacker sniff the transmission.
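To illustrate how little protection the encoding offers, the following minimal Python sketch recovers the credentials from the example header value quoted above. The function name is a hypothetical choice for illustration only.

```python
import base64


def decode_basic_authorization(header_value: str) -> tuple[str, str]:
    """Recover (login name, password) from a Basic Authorization header value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization value")
    # base64 is reversible by anyone; no key or secret is involved.
    decoded = base64.b64decode(encoded).decode("utf-8")
    # The login name ends at the first colon; the remainder is the password.
    login, _, password = decoded.partition(":")
    return login, password


print(decode_basic_authorization("Basic QWRtaW46Zm9vYmFy"))
# prints ('Admin', 'foobar')
```

Anyone able to observe the request can perform the same step, which is why Basic authentication should only ever be used over an encrypted channel.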

Digest Access Authentication

Digest Access Authentication expands upon the security of Basic Access Authentication by, firstly, using a one-way cryptographic hashing algorithm (MD5) to obscure the authentication data and, secondly, adding a single-use (connection-unique) "nonce" value set by the web server. This value is used by the client browser in the calculation of a hashed password response. While the password is obscured by the cryptographic hashing, and the use of the nonce value guards against replay attacks, the login name is submitted in clear text.
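As a rough sketch of how the nonce enters the hashed response, the following Python follows the simplest form of the calculation described in RFC 2617 (without the optional "qop" extensions). The function names and all sample values are assumptions made for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of the text as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(login: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute the client response for the simplest Digest variant (no qop)."""
    ha1 = md5_hex(f"{login}:{realm}:{password}")   # secret part
    ha2 = md5_hex(f"{method}:{uri}")               # request part
    # The server-supplied nonce ties the response to one challenge.
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical values: the same password yields a different response per nonce.
first = digest_response("Admin", "wwwProtectedSite", "foobar", "GET", "/", "nonce-1")
second = digest_response("Admin", "wwwProtectedSite", "foobar", "GET", "/", "nonce-2")
print(first != second)
```

Because the response changes with every nonce, a captured response cannot simply be replayed against a fresh challenge, although the login name itself still travels in the clear.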

However, while both HTTP access authentication schemes may appear suitable for commercial use over the Internet, particularly when used over an SSL encrypted session, many organisations have chosen to utilise custom HTML and application level authentication procedures in order to provide a more sophisticated authentication procedure. Chief among their reasons are:

To require users to identify themselves more fully and uniquely, beyond simple user name and password fields.
To provide a robust defence against brute force type attacks.
To better handle the client-server sessions, particularly their cancellation and expiry.
To overcome the nuances of a distributed, load-balanced site architecture, and client-side caching or proxy services.

A Standard HTML Authentication Form

As with the HTTP authentication schemes, consider a classic HTML form that contains two text input fields. The client is expected to supply login credentials in the form of user name and password, and then submit the data. When submitted, the remote web server verifies the supplied data through some authentication technique. If the login credentials are correct, the client has successfully authenticated and is allowed to proceed further into the web site. If authentication fails, the client is again presented with the login form and requested to try again.

Although such a form replicates most of the functionality of the HTTP authentication schemes, and is still limited to the classic two user input fields, there are a number of design considerations and improvements an organisation should make to increase the security and robustness of the authentication process.

Security Improvements:

In all cases, when submission of confidential client data is required, the data submission must be conducted over an encrypted channel. At the very least, such data submissions should be conducted over SSL. When submitting authentication data, use of the highest level of encryption available between the client and web server is recommended.
Should the client fail to submit the appropriate credentials and thus fail the corresponding authentication procedure, no information should be passed back to the client indicating why authentication resulted in a failure. The client should be presented with a generic "Authentication Failure" message. If informative messages such as "User does not exist" or "Password incorrect" are passed back, an attacker can enumerate user accounts (half the information required to login) and guess passwords.
Without implementing an account lockout facility, it is a trivial task for an attacker to brute-force an account by automating the guessing of passwords. Many sites will utilise the account lockout procedures supplied by the host or authentication server; traditionally, three authentication failures will result in an account lockout. Numerous tools and scripts are currently available to facilitate HTML form brute-forcing. It should be noted that account brute-forcing may not require knowledge of a particular login name/ID. If site users are permitted to select their own password, an attacker may select a common word for the password (e.g. "password" or their login name/ID) and brute-force the login name/ID instead.
Although client-side data validation is recommended before submitting to the web host, it is vital that comprehensive server-side validation of the submitted data is carried out before carrying out authentication processes. It is a relatively simple task for an attacker to bypass any form of client-side data controls. Client-side data validation should be used to correct unintentional client mistakes in the submitted data and to reduce the necessity for excess corrective client-server communications. Server-side validation of the submitted data should also be carried out, replicating the client-side checks. Any client data received that fails the duplicated validation checks should be treated as highly suspect.
Developers have two HTTP methods for sending the client login credentials, GET and POST. The preferred method of sending data is POST. While it is a trivial task for an attacker to modify client data relying on either method, the GET method requires less skill and understanding by an attacker or malicious user, as most browser applications will display the URL containing information relating to the GET request. Additionally, information appearing in the displayed URL may be bookmarked and locally stored, thus effectively caching the login credentials for any other user of the client system. Importantly, URLs are often logged in the web server's access log and in firewall, proxy and web-cache logs, as well as in the client browser's history file and disk cache.
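The points above on generic failure messages and server-side validation can be sketched as follows. This is a minimal illustration under stated assumptions: the in-memory user store, the login name format rule, the password length limit and the choice of PBKDF2 for password hashing are all hypothetical and are not prescribed by this article.

```python
import hashlib
import hmac
import re

GENERIC_FAILURE = "Authentication Failure"
# Hypothetical server-side rule for what a login name may look like.
LOGIN_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,32}")


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash (PBKDF2 is an assumed choice)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Hypothetical in-memory user store: login name -> (salt, password hash).
USERS = {"alice": (b"demo-salt", hash_password("correct horse", b"demo-salt"))}
# Placeholder record so unknown logins cost the same work as known ones.
DUMMY = (b"dummy-salt", hash_password("dummy", b"dummy-salt"))


def authenticate(login: str, password: str) -> tuple[bool, str]:
    """Validate input server-side and never reveal why a login failed."""
    valid_input = bool(LOGIN_PATTERN.fullmatch(login)) and 0 < len(password) <= 128
    salt, stored = USERS.get(login, DUMMY) if valid_input else DUMMY
    candidate = hash_password(password, salt)
    ok = valid_input and login in USERS and hmac.compare_digest(candidate, stored)
    # Unknown user, wrong password and malformed input all look identical.
    return (True, "Welcome") if ok else (False, GENERIC_FAILURE)


print(authenticate("alice", "wrong"))
print(authenticate("nobody", "wrong"))
```

Both calls return the same generic message, so the response gives an attacker no indication of whether the login name exists.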

Account Lockout

For many organisations, introducing the same lockout processes for authentication on their web-based Intranet as used for their OS domains is probably a logical decision, and the same mechanisms for regulation or resetting of accounts may be used. Unfortunately, for many applications accessible over the Internet, a similar or poorly thought-out account lockout process can quickly result in successful brute-forcing or denial-of-service (DoS) attacks.

It is important that organisations not only evaluate the lockout process, but also review the mechanism for resetting locked accounts. This task may be complicated by country-specific regulations; for instance, Italian Internet banking regulations insist that a locked account may only be unlocked after a minimum of two days. Depending on the legal requirements, the backend support mechanisms (e.g. 24-hour helpdesk and telephone support) and client expectations, several automated authentication restrictions or lockout options and processes should be evaluated before implementation.

Security Improvements:

A popular method for defeating many brute-force attacks is the use of an increasing time interval between registering the failed authentication attempt and sending the authentication denial and retry pages. To implement this solution successfully, the server-side application must be capable of storing and retrieving information relating to the time of the last unsuccessful attempt and the number of failed attempts since the last successful authentication. Many sites currently double the time interval between receiving the authentication failure and sending the response for each failure, quickly rendering a brute-force attack largely impotent. However, caution should be taken to limit the maximum time interval between responses and to reset the login failure counter (e.g. a maximum of 1 hour between responses and a counter reset after 24 hours).
Automatically lock out an account after a threshold has been reached (e.g. three authentication failures), but do not inform the client system that this process has happened, and record the time of the last authentication attempt. Thus any further attempts to authenticate will result in display of the same failure message. If no attempts to authenticate have occurred within a predetermined time span (e.g. 1 hour), the account could then be automatically unlocked.
Again, automatically lock out an account after a threshold has been reached (e.g. three authentication failures), and do not inform the client system that this process has happened. However, if the correct authentication information is later provided, the client could then be presented with information stating that the account is currently locked out and then issued with instructions for unlocking it (e.g. phone the 24-hour helpdesk and answer a number of identity questions).
Ensure that information required for the authentication process cannot be easily guessed. For instance, the authentication of users to an Internet banking portal may require submission of a personal account number. In most cases bank account numbers are allocated sequentially and an attack could be easily automated to intentionally lock out or time-delay a large number of accounts. If sequentially allocated information is required as part of the authentication process, organisations should consider implementing a multilayered authentication process.
For web sites requiring more rigorous authentication processes than name and password, organisations should implement a multilayered challenge-response system instead of receiving all the required login information in a single form submission. The multiple challenge-response authentication process will help defend against many popular automated attack tools.
The web-based application must be able to track and log connections relating to the source IP address of the web client. The application should be able to identify authentication failures to multiple user accounts initiated by a single IP address, and take an appropriate action. Depending upon the flexibility of perimeter network defences (e.g. firewalls and routers), it may be possible to dynamically block an offending IP address. However, the organisation must carefully review the impact such an automated response would have on clients connecting from behind address-translated devices (e.g. proxy servers and NAT firewalls) and other online services (e.g. AOL and Internet cafés).
An option should be made available for the successfully authenticated user to review a history of failed attempts since the last successful login. Advice could be provided on strengthening the account against further attacks, or reassurance could be given by informing the user of the security mechanisms used by the application to prevent these attacks.
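The increasing time interval described in the first improvement above can be sketched as follows. The one-hour cap and 24-hour counter reset are the example values from the text; the one-second starting delay, the record structure and the function names are assumptions for illustration.

```python
from dataclasses import dataclass

BASE_DELAY_SECONDS = 1                  # assumed starting delay
MAX_DELAY_SECONDS = 60 * 60             # cap: 1 hour between responses
COUNTER_RESET_SECONDS = 24 * 60 * 60    # reset the counter after 24 hours


@dataclass
class FailureRecord:
    """Per-account state the server must store between attempts."""
    failures: int = 0
    last_failure: float = 0.0   # time of the last failed attempt, in seconds


def register_failure(record: FailureRecord, now: float) -> int:
    """Record a failed attempt; return seconds to wait before responding."""
    if record.failures and now - record.last_failure >= COUNTER_RESET_SECONDS:
        record.failures = 0     # the previous failures are old enough to forget
    record.failures += 1
    record.last_failure = now
    # Double the delay for each consecutive failure, up to the cap.
    delay = BASE_DELAY_SECONDS * 2 ** (record.failures - 1)
    return min(delay, MAX_DELAY_SECONDS)


def register_success(record: FailureRecord) -> None:
    """A successful authentication clears the failure count."""
    record.failures = 0


record = FailureRecord()
print([register_failure(record, float(t)) for t in range(5)])
# prints [1, 2, 4, 8, 16]
```

With these values the delay reaches the one-hour cap on the thirteenth consecutive failure, which keeps a legitimate user from being delayed indefinitely.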

Brute-forcing & Automated Attacks

As more organisations shift components of their service offering to the Internet and require some level of authentication for access, there has been a substantial increase in the number of tools and methodologies used to brute-force or otherwise gain access to the application and site content. Many of these tools use sophisticated, automated methods to overcome the web application's authentication processes. For applications or sites requiring minimal helpdesk involvement and continual access, or where account lockout procedures are not an option, other anti brute-force security options must be explored. It should also be pointed out that, as processes improve in updating the security posture of the service host (e.g. application of current security patches) and perimeter defence systems increase, attackers are being forced to focus on the security flaws inherent to the organisation's custom-developed application. A secure and robust authentication process is often seen as a key element in the overall security of the web-based application, particularly the prevention of automated attacks.

Security Improvements:

An important step in halting automated attacks that attempt to either brute-force the authentication process or subvert the stability of the web application is the addition of random content to the page presented to the authenticating client browser. The client must be capable of successfully submitting this random content as part of the authentication process to proceed further in the web site or application. For instance, consider an extra text input field that requires the sixth word of this paragraph to be typed into a text field and submitted. Each visit to the login page would require the client to input a different, randomly selected, word referenced in this paragraph.
Unfortunately, a select few specialist tools and scripts can be tuned to overcome this security precaution by successfully interpreting the worded request (e.g. "please type in the sixth word..."). However, by presenting the random word or number to the client in a graphic GIF or JPG format, it becomes much more difficult to automate. Dynamically generating this graphic pass phrase, and using random fonts or colours each time, can make it almost impossible for an automated process to succeed.
One of the key ways automated tools determine whether an attack phase has been successful or not is through returned error codes and page information from the host web server. A secure practice is to force any error or unexpected request to generate an HTTP 200 OK response, instead of the myriad of 4xx-type errors. Ideally, when the web application encounters any invalid request from the client, whether successfully authenticated or not, the response should be to revoke any session ID and cookie information, and route the client to the standard login page and require them to re-authenticate.
The introduction of third-party time-dependent shared passwords or token systems (e.g. SecurID from RSA Security and SafeWord from Secure Computing) can make it extremely difficult for an attacker to launch an automated attack. The use of time-dependent shared passwords and tokens is also of great value in the defence against replay attacks and key logging. Obviously, any user of the site must have access to a valid token or device, and the organisation must be able to control both their physical distribution and revocation.
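The random-content challenge described in the first improvement above might be sketched as follows. The challenge text, the function names and the idea of holding the expected answer in server-side session state are assumptions for illustration; rendering the request as a graphic is not shown.

```python
import hmac
import secrets

# Hypothetical paragraph shown on the login page.
CHALLENGE_TEXT = ("An important step in halting automated attacks "
                  "is the addition of random content")
WORDS = CHALLENGE_TEXT.split()


def new_challenge() -> tuple[int, str]:
    """Return (1-based word position to request, expected answer)."""
    position = secrets.randbelow(len(WORDS)) + 1
    # The expected answer must stay server-side (e.g. in session state)
    # and must never be sent to the client with the page.
    return position, WORDS[position - 1]


def check_answer(expected: str, submitted: str) -> bool:
    """Compare the submitted word with the expected one, ignoring case."""
    return hmac.compare_digest(expected.lower().encode("utf-8"),
                               submitted.strip().lower().encode("utf-8"))


position, expected = new_challenge()
print(f"Please type in word number {position} of the paragraph.")
print(check_answer(expected, expected.upper()))
```

A fresh position is chosen on every visit to the login page, so a script that replays a previously successful submission will usually fail the challenge.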

Client Access from Shared Hosts

If a secure application could be accessed from a shared host such as in an Internet café, or any other potentially insecure system, additional steps will be required to safeguard the integrity of the authentication data. In particular, procedures are required to overcome the impact of host-level monitoring such as key loggers, Trojan horse applications and man-in-the-middle type attacks.

Security Improvements:

Consider modifying the requirement for the client to input a full-length password, and instead ensure the web application requests random elements of the password. For instance, if the client's password is a minimum of 8 characters (e.g. "Aut0m4t3d"), the web application may randomly request the 2nd, 3rd and 6th characters and the client would be required to submit the characters "ut4". Obviously both the frequency with which the shared host is used and the total length of the password would have an effect on the usefulness of this security mechanism against key logging and man-in-the-middle attacks in the longer term.
Just as the preferred method of submitting data is the HTTP POST method due to issues with remote logging and local history/favourites saving, the web application should ensure that page information is not cached locally on shared hosts. By default, many web browsers will cache web page content whether accessed over a secure or insecure connection. Organisations should ensure that the appropriate flags are set for all secure pages, and that the caching information is supplied in both the HTTP and HTML responses. As not all client browsers and caching devices (e.g. proxy servers) are known to successfully implement all no-caching options, organisations should ensure that multiple no-caching options are used. The following HTTP headers should be included in the host responses:
    Pragma: no-cache
    Cache-Control: private, max-age=0, no-cache
    Expires: Fri, 01 Jan 1999 20:00:00 GMT
The following lines should be inserted in the HTML HEAD of pages:
    <meta http-equiv="expires" content="Fri, 01 Jan 1999 20:00:00 GMT">
    <meta http-equiv="pragma" content="no-cache">
    <meta http-equiv="cache-control" content="max-age=0">
    <meta http-equiv="cache-control" content="no-cache">
    <meta http-equiv="cache-control" content="no-store">
Client browsers are expected to request a new copy of the page if the "Expires" data field has been set to a date in the past. This method potentially disables the use of the browser "back button".
Current versions of the popular browsers include functions to "remember" HTML form data, and may try to fill form input fields automatically. Organisations should ensure that appropriate HTML scripting is used to flush all form input fields. Ideally, when a client browser opens the page, an embedded script will automatically clear all input fields contained within the form.
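The partial password request described in the first improvement above can be sketched as follows, using the example password from the text. The function names are hypothetical. Note the design consequence that the server must be able to read individual password characters, so the password cannot be held solely as a single one-way hash of the whole value.

```python
import hmac
import secrets


def pick_positions(password_length: int, count: int = 3) -> list[int]:
    """Choose `count` distinct 1-based character positions, in ascending order."""
    rng = secrets.SystemRandom()
    return sorted(rng.sample(range(1, password_length + 1), count))


def expected_characters(password: str, positions: list[int]) -> str:
    """Return the characters the client should submit for these positions."""
    return "".join(password[p - 1] for p in positions)


def check_partial(password: str, positions: list[int], submitted: str) -> bool:
    """Compare the submitted characters with the expected ones."""
    expected = expected_characters(password, positions)
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))


print(expected_characters("Aut0m4t3d", [2, 3, 6]))
# prints ut4
```

A key logger on the shared host captures only the requested characters, and a later login will most likely ask for a different set of positions.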

Maintaining State

While web servers often provide a versatile and cost-effective platform for delivering application content to clients, the HTTP protocol does not possess a mechanism for managing the state of the connection. Thus web-based applications must include their own processes for managing the state of a client's connection (e.g. authenticated or not).

Typically, the state of an HTTP client is managed through the use of session IDs. Session IDs should be used by the application to uniquely identify a client browser, while background processes are used to associate the session ID with a level of access. Thus, once a client has successfully authenticated to the web-based application, the session ID can be used as a stored authentication voucher so that the client does not have to retype their login information after each page request.

Organisations have three methods available to them to both allocate and receive session ID information:

Session ID information embedded in the URL, which is received by the application through HTTP GET requests when the client clicks on links.
Session ID information stored within the fields of a form and submitted to the application. Typically the session ID information would be embedded within the form as a hidden field.
Through the use of cookies.

The preferred method for managing the state of an authenticated user is through the use of cookies. Each time the client browser accesses content from a particular domain or URL, if a cookie exists, the client browser is expected to submit any relevant cookie information as part of the HTTP request. Thus cookies can be used to preserve knowledge of the client browser across many pages and over periods of time. Cookies can be constructed to contain expiry information and may last beyond a single session. Such cookies are referred to as "persistent cookies", and are stored on the client's hard drive in a location defined by the particular browser or operating system (e.g. c:\documents and settings\clientname\cookies for Internet Explorer on Windows XP). When expiration information is omitted from a cookie, the client browser is expected to store the cookie only in memory. These "session cookies" should be erased when the browser is closed.
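As a small sketch of the difference, the following Python builds a Set-Cookie header for a session cookie by simply leaving out any expiry information. The cookie name and function name are hypothetical, and the Secure attribute is an assumption added here so that the browser returns the cookie only over an encrypted connection.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Build a Set-Cookie header for an in-memory (session) cookie."""
    cookie = SimpleCookie()
    cookie["SESSIONID"] = session_id        # hypothetical cookie name
    cookie["SESSIONID"]["path"] = "/"
    cookie["SESSIONID"]["secure"] = True    # assumed: send over SSL only
    # No "expires" or "max-age" is set, so the browser is expected to
    # hold the cookie in memory and discard it when it is closed.
    return cookie.output(header="Set-Cookie:")


print(session_cookie_header("abc123"))
```

Adding an expiry date to the same cookie would turn it into a persistent cookie written to disk, which is rarely appropriate for an authentication voucher.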

If an attacker is not able to easily compromise the security of the HTML form authentication process, the attacker may attempt to hijack an active session by submitting valid session ID information. A popular method of obtaining valid session IDs is through the application of brute-forcing techniques. The ease of this form of attack depends greatly upon the uniqueness of the session ID and the security of the data channel.

Security Improvements:

Ensure that transmission of the session ID is always conducted over an encrypted data channel such as SSL. If the session ID is used to uniquely identify an authenticated user within the application, under no circumstances should this information ever pass between the client and server unencrypted. If using the cookie method for managing session IDs, organisations should note that the client browser will submit the session ID with every request (this includes pages and graphics) and may even submit it to other servers within the same domain, which may or may not be done over a secure data channel.
Session IDs should never contain specific login information (e.g. user name, password).
Ensure that the session ID is not predictable. It is vital that a cryptographically strong algorithm is used to generate a unique session ID for an authenticated user. Ideally the session ID should be a random value. Do not use linear algorithms based upon predictable variables such as date, time and client IP address.
Ensure that the session ID is of sufficient length to guarantee that any guess or brute-force type attack is unlikely to ever succeed. Given current processor and bandwidth limitations, session IDs of over 50 random characters in length are recommended.
As the most common method of an attacker successfully acquiring a valid session ID is through brute-forcing, it is thus extremely important that this type of attack is identified and managed correctly. Should a series of content requests be made from an IP address using multiple invalid session IDs, the application should disregard any communication request from the IP address for a period of time, effectively enforcing a lockout procedure. The application should wait for multiple invalid session IDs before acting, because a single invalid ID may simply be due to an incorrectly cached ID rather than an attack.
Ensure that the server-side application can correctly manage expiry and revocation information relating to the session ID. Session IDs should only be active for a limited period of time, dependent upon the type of application and value of information accessible. Ideally the application should be capable of monitoring the period of inactivity for each session ID and be able to delete or revoke the session ID when a threshold has been reached.
The processes for handling and manipulating session ID information must be robust and capable of correctly handling attacks targeting the content within. Ensure that the content of the session ID is of the expected size and type, and that the quality of the information is verified before processing. For instance, be capable of identifying over-sized session IDs that may constitute a buffer overflow type attack. Additionally, ensure that the content of the session ID does not contain unexpected information. For example, if the session ID will be used within the application's backend database, care should be taken that the session ID does not contain embedded data strings that may be interpreted as an extension to the SQL SELECT query.
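Several of the improvements above (unpredictable IDs of sufficient length, checks on size and content, and expiry after inactivity) can be combined in a short sketch. The in-memory session table, the 15-minute idle timeout and the function names are assumptions for illustration; a real application would choose values to suit the information being protected.

```python
import re
import secrets

SESSION_ID_BYTES = 48            # 48 random bytes encode to 64 characters
SESSION_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{64}")
IDLE_TIMEOUT_SECONDS = 15 * 60   # assumed inactivity threshold

# Hypothetical in-memory table: session ID -> time of last activity.
sessions: dict[str, float] = {}


def create_session(now: float) -> str:
    """Issue a random session ID from a cryptographically strong source."""
    session_id = secrets.token_urlsafe(SESSION_ID_BYTES)
    sessions[session_id] = now
    return session_id


def is_well_formed(session_id: str) -> bool:
    """Reject IDs of unexpected size or containing unexpected characters."""
    return SESSION_ID_PATTERN.fullmatch(session_id) is not None


def validate_session(session_id: str, now: float) -> bool:
    """Check format first, then existence, then the idle timeout."""
    if not is_well_formed(session_id):
        return False                    # over-sized or malformed input
    last_seen = sessions.get(session_id)
    if last_seen is None:
        return False                    # unknown or already revoked
    if now - last_seen > IDLE_TIMEOUT_SECONDS:
        del sessions[session_id]        # revoke after too long idle
        return False
    sessions[session_id] = now          # record the latest activity
    return True


sid = create_session(now=1000.0)
print(len(sid), validate_session(sid, now=1100.0))
# prints 64 True
```

Because the format check runs before the ID is used anywhere else, over-sized values and embedded SQL fragments are rejected without ever reaching the session table or a backend query.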

Conclusions

While multiple methods for securing the authentication process are available to the organisation for consideration, there is no single fix, and organisations frequently fail to successfully implement all the requisite processes. When implementing an HTML authentication mechanism for a web-based application, organisations should carefully review both the user access requirements and the response processing to potential attacks.

As the commercial web hosting software (e.g. Microsoft IIS Server and Apache) becomes more secure, and the processes for hardening and updating servers are better managed, attackers will increasingly seek to compromise the integrity of the host system through flaws in the custom application. The authentication process should be viewed as the barbican of a castle, capable of defeating the increasing threat of targeted and automated attacks.

There are, however, two vital things that developers of web-based services must always remind themselves of and take preventative steps against: assume that any client-side data checking or encoding can be bypassed, and ensure that all data submitted to the server-side application is thoroughly checked before processing. It is equally important to note that these apply to the entire application, not just the authentication phase.

The Author asserts the moral right to be identified as the author of this work