One of the coolest new features of HTML5 is WebSockets, which let us talk to the server without using AJAX requests.

What are WebSockets?


WebSockets is a technique for two-way, full-duplex communication over a single TCP socket, and a type of push technology. The protocol has since been standardized, and all current major browsers support it.

In more detail, WebSocket is a computer communications protocol that provides full-duplex communication channels over a single TCP connection. The WebSocket protocol was standardized by the IETF as RFC 6455 in 2011; the WebSocket API in Web IDL, originally developed at the W3C, is now maintained by the WHATWG.

WebSocket is a different protocol from HTTP. Both are located at layer 7 in the OSI model and, as such, depend on TCP at layer 4. Although they are different, RFC 6455 states that WebSocket “is designed to work over HTTP ports 80 and 443 as well as to support HTTP proxies and intermediaries”, thus making it compatible with the HTTP protocol. To achieve compatibility, the WebSocket handshake uses the HTTP Upgrade header to change from the HTTP protocol to the WebSocket protocol.

The WebSocket protocol enables interaction between a browser and a web server with lower overhead, facilitating real-time data transfer from and to the server. It does this by providing a standardized way for the server to send content to the browser without being solicited by the client, and by allowing messages to be passed back and forth while keeping the connection open. In this way, a two-way (bi-directional) ongoing conversation can take place between a browser and the server. Communication normally takes place over TCP port 80 (or 443 for TLS-encrypted connections), which is a benefit in environments that use a firewall to block non-web Internet connections. Similar two-way browser-server communication has been achieved in non-standardized ways using stopgap technologies such as Comet.

The WebSocket protocol is currently supported in most major browsers including Google Chrome, Microsoft Edge, Internet Explorer, Firefox, Safari and Opera. WebSocket also requires web applications on the server to support it.

What do WebSockets Replace?


WebSockets can replace long-polling. Long-polling is an interesting concept: the client sends a request to the server and, rather than responding immediately with data it may not have, the server keeps the connection open until fresh, up-to-date data is ready to be sent. The client then receives this and sends another request. This has its benefits, decreased latency being one of them, since the response travels over a connection that is already open rather than waiting for a new one to be established. However, long-polling isn’t really a piece of fancy technology: a request can still time out, and then a new connection will be needed anyway.

Many Ajax applications make use of the above, which can often lead to poor resource utilization.

Wouldn’t it be great if the server could wake up one morning and send its data to clients who are willing to listen, without each client having to ask for it first? Welcome to the world of push technology!

Why do we need WebSockets?


So why do we need WebSockets? What problem are we trying to solve by using them? The answer is easy. We need a better way for web applications running on a client browser to communicate in real time with their servers. Currently, there are two common methods of providing this.
  • The first is for the application to poll the server continuously for any new data. If there is new data, it is sent to the client, generally via AJAX. This is similar to the way some children troll their parents by asking “Are we there yet?” every few seconds when riding in the car. Much to the parents’ chagrin, they have to answer ‘no’ every few seconds until they finally reach their destination. Polling is just like that: the application asks at regular intervals whether there is new data, and the server has to respond every time, even if it has nothing new to give.
  • The second is called ‘Long Polling’. This is a variation of the first technique but instead of the server giving an ‘empty’ response and closing the connection when it has no new data to give, the connection between client and server is kept open (with a timeout period). At some point in the future when the server does have some new data to give, it is given to the client and the connection is closed (provided it does so within the timeout period). This is better than polling in most ways, but if you try to use this approach in applications where a lot of data is generated very fast, then it becomes almost like the polling technique.
Both methods have their merits when compared with each other, but they also share a common set of disadvantages developers could do without.
  • Both use the HTTP protocol to send messages to the server. Every message sent over this protocol is wrapped in a lot of header information describing things like where the message is heading, where it came from, the user agent and so on. All of this adds a lot of overhead when communicating in real time.
  • Neither of these methods is ‘bi-directional full duplex’, where both client and server can send and receive each other’s messages at the same time, as in a telephone call, where the people at both ends can talk and hear simultaneously.
These are the reasons current techniques are not good enough for fast, scalable real time communication on the web. We need a better solution, and that is what WebSockets gives us.
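To make the overhead point concrete, here is a rough back-of-the-envelope sketch. The request line and header values are made-up examples, not measurements, and the function names are ours. The WebSocket figure relies on the framing rules in RFC 6455, where a client-to-server frame carrying fewer than 126 bytes of payload has a 2-byte header plus a 4-byte masking key.

```javascript
// Hypothetical polling request; real browsers usually send more headers
// (cookies, Accept-* and so on), so this is a conservative estimate.
const pollRequestLines = [
  'GET /updates HTTP/1.1',
  'Host: server.example.com',
  'User-Agent: ExampleBrowser/1.0',
  'Accept: application/json',
  'Connection: keep-alive',
];

// Bytes on the wire for the request head: lines are joined with CRLF,
// and an empty line (CRLF CRLF in total) terminates the header block.
function httpHeadBytes(lines) {
  const head = lines.join('\r\n') + '\r\n\r\n';
  return new TextEncoder().encode(head).length;
}

// Framing overhead for a small client-to-server WebSocket message:
// 2 header bytes + 4 masking-key bytes (payloads under 126 bytes only).
function webSocketFrameOverhead(payloadLength) {
  if (payloadLength >= 126) {
    throw new RangeError('this sketch only covers payloads under 126 bytes');
  }
  return 2 + 4;
}

console.log('HTTP request head:', httpHeadBytes(pollRequestLines), 'bytes');
console.log('WebSocket frame overhead:', webSocketFrameOverhead(20), 'bytes');
```

Even this stripped-down request head is many times larger than the WebSocket framing, and with polling it is paid again on every request.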

How does it work?


Before the client and the server start sending and receiving messages, they first need to establish a connection. This is done through a ‘handshake’, where the client sends out a request to connect, and if the server agrees, it sends out a response accepting the connection. The protocol specification makes it clear that one of the design decisions behind this protocol was to ensure that both HTTP-based clients and WebSocket-based ones can operate on the same port. This is why the handshake has the client and server ‘upgrade’ from an HTTP-based protocol to a WebSocket-based protocol.

The protocol spec has an example of such a handshake. The initiating handshake from the client should look like this:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
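As a small illustration of the Sec-WebSocket-Key header: per RFC 6455 the key is a randomly chosen 16-byte value, base64-encoded. Browsers generate it for you, so the Node.js sketch below (the function name is our own) only shows what that amounts to, and decodes the sample key used in the spec.

```javascript
const { randomBytes } = require('crypto');

// Generate a fresh handshake key: 16 random bytes, base64-encoded.
function generateWebSocketKey() {
  return randomBytes(16).toString('base64');
}

const key = generateWebSocketKey();
console.log(key.length); // 24 characters: base64 of 16 bytes, padding included

// The sample key from the spec is simply this ASCII text, base64-encoded.
const sample = Buffer.from('dGhlIHNhbXBsZSBub25jZQ==', 'base64').toString('ascii');
console.log(sample); // "the sample nonce"
```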



The responding handshake from the server should look like this:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat



Here the client sends a base64-encoded key in the Sec-WebSocket-Key header. To form its response, the server takes this key and appends the magic string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 to it, then calculates the SHA-1 hash of the resulting string. Finally it encodes that hash value in base64, and the result is the Sec-WebSocket-Accept header in the server’s response.

In the above example:
  • The client sends the Sec-WebSocket-Key string dGhlIHNhbXBsZSBub25jZQ==
  • The server appends the magic string (with no separator) to form the string dGhlIHNhbXBsZSBub25jZQ==258EAFA5-E914-47DA-95CA-C5AB0DC85B11
  • Now the server generates the SHA-1 hash for this longer string, which is b37a4f2cc0624f1690f64606cf385945b2bec4ea in hexadecimal
  • Finally, the server base64-encodes the raw 20 bytes of the hash (not the hexadecimal text) to give s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
  • And this base64-encoded value is used in the Sec-WebSocket-Accept header in the server’s response.
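The steps above can be reproduced in a few lines of Node.js. This is a minimal sketch using Node's built-in crypto module; the function and constant names are our own, and a real server library would do this for you.

```javascript
const { createHash } = require('crypto');

// GUID fixed by RFC 6455; every server appends this same value.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key.
function computeAcceptValue(secWebSocketKey) {
  return createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID) // plain concatenation, no separator
    .digest('base64');                         // base64 of the raw 20 hash bytes
}

console.log(computeAcceptValue('dGhlIHNhbXBsZSBub25jZQ=='));
// → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```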
An important thing to note is the Origin header. Browser clients always include this header in their handshake, and it is then up to the server to decide whether to accept clients from different origins or not.
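How a server acts on the Origin header is its own policy decision. One simple approach, sketched here under our own assumptions (a hard-coded allowlist, hypothetical names, and headers supplied as a plain object with lower-cased names, which is how Node.js exposes them), is to accept only known origins:

```javascript
// Hypothetical allowlist; replace with the origins your application serves.
const ALLOWED_ORIGINS = new Set(['http://example.com', 'https://example.com']);

// Decide whether a handshake request should be accepted based on its Origin.
function isOriginAllowed(headers, allowedOrigins = ALLOWED_ORIGINS) {
  const origin = headers['origin'];
  // Non-browser clients may omit Origin entirely; this sketch rejects them.
  if (typeof origin !== 'string') return false;
  return allowedOrigins.has(origin);
}

console.log(isOriginAllowed({ origin: 'http://example.com' }));      // true
console.log(isOriginAllowed({ origin: 'http://evil.example.net' })); // false
```

Whether to reject requests with no Origin at all is a judgment call; rejecting them is the stricter of the two options.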

The WebSockets API


Front-end web developers will be more interested in the WebSockets API, a JavaScript-based API used for messaging between a client-side app and the server.

Does The Browser Support WebSockets?


The first thing all developers should do when working with the WebSockets API is to detect whether or not the client browser supports it. If so, we can work our magic with it. If not, we’ll have to fall back to another method of client-server communication, such as the long-polling mentioned above.


if ('WebSocket' in window){
   /* WebSocket is supported. You can proceed with your code */
} else {
   /* WebSockets are not supported. Try a fallback method like long-polling */
}
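If you prefer to keep the check in one place, a small helper along these lines may be convenient. The function name, and the idea of passing the global object in as a parameter, are our own choices, made so that the check can also be exercised outside a browser.

```javascript
// Returns true when the given global object exposes a WebSocket constructor.
function supportsWebSocket(globalObject) {
  return typeof globalObject === 'object' && globalObject !== null &&
    typeof globalObject.WebSocket === 'function';
}

// In a browser you would pass `window`; `globalThis` also works in workers.
if (supportsWebSocket(globalThis)) {
  console.log('WebSocket is supported');
} else {
  console.log('WebSocket is not supported; fall back to another transport');
}
```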


Opening and Closing WebSocket Connections


Assuming that WebSockets are supported by the browser, the first task will be to connect to a WebSocket server by calling the WebSocket constructor:

var connection = new WebSocket('ws://example.org:12345/myapp');



You could also use wss://, which is the secure socket variant to ws:// in the same way https is to http.

var connection = new WebSocket('wss://secure.example.org:6789/myapp'); /* port numbers must not exceed 65535 */
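Which scheme to use usually follows from the page itself, since browsers typically block ws:// connections made from pages served over https:// as mixed content. A minimal sketch of deriving the WebSocket URL from the page URL follows; the helper name and the '/myapp' path are assumptions for illustration.

```javascript
// Build a WebSocket URL whose security matches that of the page.
// `pageUrl` would normally be window.location.href in a browser.
function webSocketUrlFor(pageUrl, path) {
  const page = new URL(pageUrl);
  const scheme = page.protocol === 'https:' ? 'wss:' : 'ws:';
  // page.host includes the port when it is not the scheme's default.
  return `${scheme}//${page.host}${path}`;
}

console.log(webSocketUrlFor('https://secure.example.org/index.html', '/myapp'));
// → wss://secure.example.org/myapp
console.log(webSocketUrlFor('http://example.org:12345/', '/myapp'));
// → ws://example.org:12345/myapp
```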



You could also specify sub-protocols of your own like so:

var connection = new WebSocket('ws://example.org:12345/myapp', ['chat', 'super-awesome-chat']);
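On the wire, the sub-protocols you list are sent in the Sec-WebSocket-Protocol request header, and the server echoes back the one it picked; the browser then exposes that choice as connection.protocol. How a server chooses is up to the server. One plausible policy, sketched below with a hypothetical function name, is to take the first offered name it also supports.

```javascript
// Pick the first client-offered sub-protocol that the server also supports.
// `offeredHeader` is the raw Sec-WebSocket-Protocol request header value.
function selectSubprotocol(offeredHeader, supported) {
  const offered = offeredHeader
    .split(',')
    .map((name) => name.trim())
    .filter(Boolean);
  const match = offered.find((name) => supported.includes(name));
  return match === undefined ? null : match; // null: no common sub-protocol
}

console.log(selectSubprotocol('chat, superchat', ['chat'])); // → chat
```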



If your connection is accepted and established by the server, then the open event is fired on the client’s side. You can handle it through onopen like so:

connection.onopen = function(){
   /*Send a small message to the console once the connection is established */
   console.log('Connection open!');
}



If the connection is refused by the server, or is closed for some other reason, then the close event is fired:

connection.onclose = function(){
   console.log('Connection closed');
}


You can even explicitly close it on your own by calling the close() method.

connection.close();
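The close event also tells you why the connection ended, through its code, reason and wasClean properties, and close() accepts an optional code and reason. The helper below is only a sketch: its name is ours, and it lists just a few well-known codes from RFC 6455.

```javascript
// Turn a close event into a short human-readable description.
function describeClose(event) {
  const knownCodes = {
    1000: 'normal closure',
    1001: 'endpoint going away',
    1006: 'connection lost without a close frame',
  };
  const meaning = knownCodes[event.code] || 'other close code';
  const clean = event.wasClean ? 'clean' : 'unclean';
  return `${event.code} (${meaning}, ${clean})`;
}

// Usage with the `connection` object from the examples above:
// connection.onclose = function (event) { console.log(describeClose(event)); };
// connection.close(1000, 'Finished'); // optional close code and reason

console.log(describeClose({ code: 1000, wasClean: true }));
// → 1000 (normal closure, clean)
```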
    


In case of any errors, you can handle them using the onerror event.

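connection.onerror = function(event){
   /* The event is an object, so log it as-is; concatenating it into a string would only print "[object Event]" */
   console.error('Error detected:', event);
}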
    

Sending and Receiving Messages


Once we’ve successfully opened a connection to the server, we need to send messages to and receive messages from the server. Sending messages is very straightforward. We use the .send() method on our connection object.

connection.send("Hey server, what's up?");
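One caveat: in browsers, calling send() before the connection has opened throws an error. A common workaround is to queue early messages and flush them once the socket is open. The wrapper below is a minimal sketch with hypothetical names; it assumes only that `socket` has a readyState property and a send() method, like a browser WebSocket.

```javascript
// Wrap a socket so that messages sent too early are queued, not lost.
function createQueuedSender(socket) {
  const OPEN = 1; // same value as WebSocket.OPEN
  const pending = [];

  // Send queued messages in order for as long as the socket is open.
  function flush() {
    while (pending.length > 0 && socket.readyState === OPEN) {
      socket.send(pending.shift());
    }
  }

  // Queue the message, then try to deliver everything queued so far.
  function send(message) {
    pending.push(message);
    flush();
  }

  return { send, flush, pendingCount: () => pending.length };
}

// Usage: call flush() from the open handler.
// const sender = createQueuedSender(connection);
// connection.onopen = function () { sender.flush(); };
// sender.send('Hey server, are you there?');
```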
    


Should the client receive a message from the server, the message event is raised for you to handle through onmessage:

connection.onmessage = function(e){
   var server_message = e.data;
   console.log(server_message);
}
    


If you want to send JSON objects to the server rather than a simple message, they should be serialized to a string, like so:

var message = {
   'name': 'bill murray',
   'comment': 'No one will ever believe you'
};
connection.send(JSON.stringify(message));
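The receiving side needs the reverse step. Since a peer can send anything, it is worth guarding the parse. This is a sketch with a hypothetical helper name, and it treats anything that is not a JSON string as unusable.

```javascript
// Parse an incoming message that is expected to contain JSON text.
// Returns the parsed value, or null when the payload is not valid JSON.
function parseJsonMessage(data) {
  if (typeof data !== 'string') return null; // binary payloads are not handled here
  try {
    return JSON.parse(data);
  } catch (err) {
    return null; // malformed JSON
  }
}

// Usage with the `connection` object from the examples above:
// connection.onmessage = function (e) {
//   var message = parseJsonMessage(e.data);
//   if (message !== null) { console.log(message.name); }
// };

console.log(parseJsonMessage('{"name":"bill murray"}').name); // → bill murray
console.log(parseJsonMessage('not json'));                     // → null
```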
    

Supporting WebSockets on the Server


Most web servers revolve solely around the HTTP protocol. As WebSockets use their own protocol, you may need to install additional libraries and add-ons to support ws:// or the wss:// protocols in addition to http:// and https://.
  • JavaScript enthusiasts should install and use Node.js, where dealing with WebSockets on the server side is very easy. You could take a look at the pages for WebSocket-Node or ws to see how to use them in your Node.js apps for WebSocket-related communication.
  • Python fans should check out the Tornado server.
  • Rubyists should take a look at the EventMachine WebSocket server.
  • .NET developers can take a look at SignalR.

Cross browser support: Are we there yet?


The WebSocket Protocol (RFC 6455) is now supported by all the major browsers. If you still need to reach older browsers, or networks where WebSocket connections are blocked, there are several ways to roll out cross-browser real-time applications.

A nice cross-browser way to do JavaScript-based real-time communication is socket.io. It works with Node.js and other technologies, uses WebSockets if the client supports them, and falls back on other transports if not: AJAX polling today, and in older versions also Flash and multi-part streaming. It builds upon various technologies (WebSockets being one of them) to create an abstraction layer that all clients can use in a unified, cross-browser way.

Another way to go would be cloud-hosted API services like Pusher. Instead of rolling out your own WebSocket server, you could use these types of services to run a WebSocket server, and interact on the client side with the API they provide. Generally they provide a Flash fallback (which simulates WebSockets) in case the browser does not support WebSockets.

Summary


WebSockets provide a really simple way to do fast, robust and very efficient communication between the client and the server, removing some of the problems we face with the HTTP protocol. This technology is especially suited to applications where a large amount of data is generated rapidly and needs to be communicated quickly. One very good use is HTML5-based multiplayer online games (especially ones that require quick response times, like first-person shooters). Other possible uses on the web include real-time breaking-news updates, fast-updating streams on social media, sports scores and online chat applications.

The above content is written by:

Abhishek Dey

A Visionary Software Engineer With A Mission To Empower Every Person & Every Organization On The Planet To Achieve More

Microsoft | University of Florida

