HTTP is a term that you should be prepared to come across many times in your programming journey. When I was starting as a junior programmer, I didn’t let the abbreviations intimidate me. Even experienced people have to look up some acronyms because new abbreviated technologies emerge every day.
The main reason why some newbie programmers never make it in the industry is that they don’t do their research well (or at all). You need to dig deeper than just the definition of these abbreviations. Start by looking at some background of how it started to understand it better. In this HTTP for beginner programmers guide, I will provide you with as many details as possible.
What is HTTP?
HTTP is an abbreviation for HyperText Transfer Protocol. It is an application-layer protocol for distributed, collaborative hypermedia information systems. HTTP was designed to facilitate communication between web servers and clients.
Clients are usually web browsers such as Edge, Chrome, Firefox, and Opera, but a client can be any device or program that accesses the web. Servers, on the other hand, are the computers that host websites, which nowadays mostly run in the cloud.
Communication between servers and web browsers is done through requests and responses. Take a look at this simple HTTP process.
- A browser sends an HTTP request to the web server;
- The web server receives the request;
- The server processes the request;
- A response is returned to the browser;
- The browser receives the response;
As you can see in the above process, HTTP follows a client-server model. This means that when the client (web browser) makes a request, it has to wait for the server to process it and send a response. HTTP is a stateless protocol, which means that no data is kept between two requests. However, applications can implement server-side sessions or state using HTTP cookies or hidden variables within web forms.
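To make the request and response exchange more concrete, here is a minimal sketch in Python of what the two messages can look like as plain text. The host name, path, and canned response are made up for illustration, and no real network connection is opened.

```python
def build_request(host: str, path: str = "/") -> str:
    """Return the text of a minimal HTTP/1.1 GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"  # an empty line marks the end of the headers
    )


def parse_response(raw: str) -> tuple[int, dict[str, str], str]:
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# Hypothetical host and path, used only for illustration.
request = build_request("example.com", "/index.html")

# A canned response standing in for what a server might send back.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 14\r\n"
    "\r\n"
    "<h1>Hello</h1>"
)
status, headers, body = parse_response(raw_response)
print(status, headers["content-type"], body)
```

Notice that the request carries everything the server needs to answer it, which is what makes the protocol stateless.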
Evolution of HTTP
The term hypertext, the "HT" in HTTP, was coined in 1965 by Ted Nelson, the founder of Project Xanadu. The idea was later developed into the World Wide Web project, started in 1989 by a team led by Tim Berners-Lee. They created a web server, a web browser, HTML, and HTTP. The first version of the protocol had only the GET method, which requested a page from the server. The server would then return the HTML page as the response.
HTTP/0.9 was the first documented version, in 1991. The subsequent version was developed by the HTTP Working Group back in 1995. The group made the protocol more efficient and expanded it with richer meta-information, extended negotiation, and extended operations. HTTP has improved significantly since it was launched to exchange files in a lab. Now it can carry high-resolution images and videos across the internet.
Tim Berners-Lee proposed building a hypertext system over the internet while he was working at CERN. He initially called the proposed system Mesh before renaming it the World Wide Web. It was implemented in 1990 as four building blocks running on top of the TCP/IP protocols.
- The HyperText Markup Language (HTML) was created to represent hypertext documents;
- HTTP was created as a simple protocol to exchange the HTML files;
- A client was created to display the HTML documents. This first browser was called WorldWideWeb;
- A server was created to give access to the HTML documents;
The first version had no version number when it was initially released. It was later named HTTP/0.9 to differentiate it from subsequent versions. HTTP/0.9 was very simple: a request was a single line that started with the GET method, followed by the resource path.
The second version, HTTP/1.0, started out very limited on both servers and browsers. However, it was later expanded and became more versatile. A few tools still use this version today.
- HTTP/1.1 - The standard protocol
HTTP/1.1 was first published in 1997, shortly after HTTP/1.0 was released. It went through proper standardization and clarified ambiguities in the previous version.
HTTP/2, officially standardized in 2015, was introduced to cope with the growing demands of the modern web.
What are the components of HTTP?
The client, often called the user agent, is the tool that acts on behalf of the user. This role is mostly played by a browser, but it can also be a program that developers use to debug their applications.
The browser always initiates the request; the server only responds. However, some mechanisms have been developed over the years that allow servers to simulate server-initiated messages.
To display a web page, the browser first sends a request to the server to fetch the HTML file. It then parses this file and makes additional requests for scripts, layout information (CSS), and sub-resources such as images, audio, or video. The browser then combines all these resources to display a complete document to the user.
A web page contains text with embedded links that can be activated with a mouse click or a tap on a touchscreen. Activating a link fetches a new web page, which lets users navigate the web easily. The browser translates these actions into HTTP requests and interprets the HTTP responses to present a clear result to the user.
The Web Server
The server is responsible for serving the resources requested by the web browser. Although it appears to the client as a single machine, it can actually be a large collection of servers that share the load, or complex software that queries other computers to generate the document, totally or partially, on demand.
Several server programs can be hosted on the same machine. With HTTP/1.1 and the Host header, they can even share the same IP address.
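Here is a minimal sketch in Python of how the Host header lets one machine serve several websites. The site names and page contents are hypothetical, and a real web server does much more than this single lookup.

```python
# Hypothetical sites that live on the same machine and IP address.
SITES = {
    "blog.example.com": "<h1>Blog</h1>",
    "shop.example.com": "<h1>Shop</h1>",
}


def select_site(headers: dict[str, str]) -> str:
    """Pick the page to serve based on the request's Host header."""
    # Drop an optional ":port" suffix and ignore letter case.
    host = headers.get("Host", "").split(":")[0].lower()
    return SITES.get(host, "<h1>Unknown host</h1>")


print(select_site({"Host": "shop.example.com"}))
print(select_site({"Host": "blog.example.com:8080"}))
```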
There are numerous machines and computers between the web browser and the server, and they relay the HTTP messages. Because of the layered structure of the web stack, most of them operate at the transport, network, or physical level. This makes them transparent at the HTTP layer, even though they can have a significant impact on performance. Those that operate at the application layer are called proxies. There are two types of proxies, namely transparent and non-transparent. Transparent proxies forward requests without altering them, while non-transparent proxies can alter a request before passing it to the server. Let us look at their functions.
- Load balancing;
GET is an HTTP method used to request data from the server. The data sent with the request is part of the URL, so its length is limited, but it also means that users can bookmark the results easily. GET is best used when the data does not include any sensitive information, documents, or images.
- You can bookmark data easily using the GET method;
- GET has length restrictions;
- You can cache GET requests;
- GET method should not be used with sensitive info;
- It should only be used to request data, not to modify it;
- The requests remain in the browser history;
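The points above follow from the fact that GET data travels in the URL itself. Here is a small sketch using Python's standard urllib.parse module. The address and parameter names are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical search parameters, used only for illustration.
params = {"q": "http basics", "page": 2}

# With GET, the data becomes the query string of the URL.
url = "https://example.com/search?" + urlencode(params)
print(url)

# Anyone who has the URL (bookmark, history, logs) can read the data back.
query = parse_qs(urlsplit(url).query)
print(query)
```

Because the whole request fits in one address, it can be bookmarked and cached, and for the same reason it should not carry passwords or other sensitive values.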
The POST method is the counterpart of GET. It is used to send data to a server to create or update a resource. The data is sent in the request body rather than in the URL. It is one of the most commonly used methods.
- It has no length restrictions;
- POST cannot be bookmarked;
- The POST requests cannot be cached;
- These requests do not remain in your browser history;
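As a rough sketch of the difference, here is how a POST request can be prepared with Python's standard urllib.request module. The URL and form fields are hypothetical, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields, used only for illustration.
form = {"username": "alice", "comment": "Hello there"}
body = urlencode(form).encode("ascii")

# The data goes into the request body, not into the URL.
request = Request(
    "https://example.com/comments",
    data=body,
    method="POST",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(request.get_method())
print(request.full_url)
print(request.data)
```

The URL stays the same whatever the user submits, which is why a POST request cannot be bookmarked the way a GET request can.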
The HEAD method is similar to GET, except that the response has no body. It is mainly used to retrieve meta-information, in the form of response headers, without transferring the entire content.
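Like POST, the PUT method is also used to transmit data to the server to update or create a new resource. However, the main difference between the two methods is that PUT is idempotent: sending the same PUT request several times has the same effect as sending it once, whereas repeating a POST request can create several resources.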
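Idempotent means that repeating the same request leaves the server in the same state as sending it once. The sketch below imitates this with a hypothetical in-memory store. The class and method names are invented for illustration and are not part of any HTTP library.

```python
class NoteStore:
    """A toy stand-in for a server that stores notes by ID."""

    def __init__(self):
        self.notes = {}
        self._next_id = 1

    def post(self, text):
        # POST: the server picks a new ID, so every call adds a note.
        note_id = self._next_id
        self._next_id += 1
        self.notes[note_id] = text
        return note_id

    def put(self, note_id, text):
        # PUT: the client names the resource, so repeats overwrite it.
        self.notes[note_id] = text


store = NoteStore()

store.put(10, "draft")
store.put(10, "draft")  # same request again, nothing new is created
notes_after_put = len(store.notes)

store.post("draft")
store.post("draft")  # same request again, a second note appears
notes_after_post = len(store.notes)

print(notes_after_put, notes_after_post)
```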
The DELETE method has a single purpose, which is to remove the specified resource.
The OPTIONS method is used to describe the communication options available for the target resource.
The TRACE method echoes the received request back to the client, so that the client can see what changes, if any, intermediate servers have made.
The CONNECT method converts the request connection into a transparent TCP/IP tunnel. This is usually done to allow SSL/TLS-encrypted communication (HTTPS) through an unencrypted proxy.
The PATCH method is responsible for applying partial modifications to a resource.
Applications and websites improve performance by reusing previously fetched resources. HTTP caching reduces network traffic and latency, and therefore the time needed to display a resource, which makes websites more responsive.
Types of caches
Caching is the process of storing copies of resources that are expensive to fetch and serving them back when they are requested. When a request is made and the web cache has the resource in its store, it returns its copy rather than re-downloading it from the server. This reduces the load on the server and brings the resources closer to the client, which improves the performance of the website.
Caches have to be configured correctly because resources on the server can change over time. A resource should therefore be cached only until it changes. There are two main categories of caches: shared and private. A shared cache stores resources that more than one user can reuse. In contrast, a private cache stores resources for a single user only. Examples of caches include:
- Browser caches;
- CDN caches;
- Proxy caches;
- Load balancers;
- Reverse proxy caches;
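Whatever the type of cache, the basic idea is the same: keep a copy for a limited time and go back to the server only when the copy is too old. Here is a simplified sketch in Python. The class name, the fixed 60-second lifetime, and the fake fetch function are assumptions for illustration, and real HTTP caches take their lifetimes from response headers such as Cache-Control.

```python
import time


class SimpleCache:
    """Keep fetched resources for at most max_age seconds."""

    def __init__(self, max_age, clock=time.monotonic):
        self.max_age = max_age
        self.clock = clock  # injectable so the example can fake time
        self._store = {}

    def get(self, url, fetch):
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]  # fresh copy: no trip to the server
        body = fetch(url)  # missing or stale: fetch again
        self._store[url] = (now, body)
        return body


calls = []


def fake_fetch(url):
    """Stand-in for a real download; records every server hit."""
    calls.append(url)
    return f"content of {url}"


fake_time = [0.0]
cache = SimpleCache(max_age=60, clock=lambda: fake_time[0])

cache.get("/logo.png", fake_fetch)  # miss: fetched from the "server"
cache.get("/logo.png", fake_fetch)  # hit: served from the cache
fake_time[0] = 61.0                 # pretend 61 seconds have passed
cache.get("/logo.png", fake_fetch)  # stale: fetched again

print(len(calls))
```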
An HTTP cookie is a small piece of data that the server sends to the web browser. The browser may store the cookie and send it back with later requests to the same server. Its main job is to tell whether two requests came from the same browser, which is how a website keeps you logged in. In other words, a cookie remembers stateful information for the stateless HTTP protocol.
Cookies are used for three main purposes: session management, personalization, and tracking.
Cookies were once used as general storage on the client. However, modern storage APIs are now preferred for that purpose. Because cookies are sent with every request, they can degrade performance and consume a lot of data. Standard APIs for client storage include local storage and session storage.
How cookies are created
After receiving an HTTP request, a server can send one or more Set-Cookie headers with its response. The browser stores each cookie and sends it back to the same server, inside the Cookie HTTP header, whenever a request is made. A cookie can include an expiration date or a maximum age, after which it is no longer sent. Other restrictions can also be set, such as the domain and path that limit where the cookie is sent.
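Python's standard http.cookies module can be used to sketch both directions of this exchange. The cookie name, value, and domain below are made up for illustration.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header for the response.
cookie = SimpleCookie()
cookie["session_id"] = "abc123"           # hypothetical name and value
cookie["session_id"]["max-age"] = 3600    # expires after one hour
cookie["session_id"]["domain"] = "example.com"
cookie["session_id"]["path"] = "/"
set_cookie_header = cookie.output()
print(set_cookie_header)

# Browser side: later requests carry the stored cookies in a Cookie header.
# Server side again: parse that header to recognise the browser.
received = SimpleCookie("session_id=abc123; theme=dark")
print(received["session_id"].value, received["theme"].value)
```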
HTTP vs HTTPS
HTTPS stands for HyperText Transfer Protocol Secure. It is also known as HTTP over SSL or HTTP over TLS. A website whose address starts with https:// is using the secure protocol. Many websites will redirect you to HTTPS even if you type http:// in the browser. Like HTTP, HTTPS uses the Transmission Control Protocol (TCP) to transmit data packets. However, it sends and receives packets via port 443, as opposed to port 80 used by HTTP.
Netscape created HTTPS in 1994 for its web browser. It initially used the SSL protocol, which later evolved into TLS, and HTTP over TLS was formally specified in 2000. Nowadays, people use the two terms interchangeably, although TLS is the successor of SSL.
When you browse over HTTPS, your connection is encrypted. A few years back, Google started marking HTTP connections as not secure in its Chrome browser. This pushed webmasters to migrate to HTTPS, which browsers show as a secure connection. HTTPS relies on a certificate containing a public key, which has to be deployed on the server first. The browser uses this certificate to verify the server and to set up the encrypted connection.
Certificates are cryptographically signed by a certificate authority (CA). Browsers keep a list of the certificate authorities that they trust. If a browser detects that a certificate is signed by a CA on its trusted list, it shows a small lock in the address bar. Getting a signed certificate is easy nowadays, and some organizations, such as Let's Encrypt, issue certificates for free. Users who notice that your website is still on HTTP might leave.
Differences between HTTP and HTTPS
Let us look at the difference between these two protocols.
- An HTTP URL begins with http:// while an HTTPS URL begins with https://;
- HTTP runs on port 80 while HTTPS runs on port 443;
- HTTP is unsecured while HTTPS is secure;
- Domains running on HTTP don't require validation. In contrast, HTTPS requires a certificate, which involves domain validation and, for some certificate types, legal documents;
- Data in HTTP is not encrypted while HTTPS sends encrypted data;
- HTTP does not require an SSL certificate, while HTTPS requires a signed SSL/TLS certificate;
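The port difference can be illustrated with a few lines of Python. The helper name and the example addresses are made up. The sketch only inspects the URL and does not connect anywhere.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes, as listed above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    # An explicit port in the URL wins over the scheme's default.
    return parts.port or DEFAULT_PORTS[parts.scheme]


print(effective_port("http://example.com/"))
print(effective_port("https://example.com/"))
print(effective_port("https://example.com:8443/"))
```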
HTTP Frequently Asked Questions
- Is HTTP secure?
HTTP is not secure because the messages are not encrypted. It is acceptable for casual browsing, but if you have to fill in forms, HTTP is not safe.
- Where is HTTP used?
HTTP is used to send hypertext and media from servers to clients. It is based on TCP/IP and is used by servers, proxies, and browsers.
- Who invented HTTP?
HTTP was invented by Tim Berners-Lee and his team between 1989 and 1991. Since then, it has seen massive changes to make it better. HTTP can now carry high-resolution videos and 3D images.
- How do HTTP servers work?
HTTP servers process requests from clients and send back the relevant resources. The first thing a server does after receiving a request is match the URL against its files. Then it sends the matching file back to the client, such as a web browser.
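A very simplified sketch of that matching step is shown below in Python. The file names and contents are hypothetical and kept in a dictionary instead of on disk, and a real server also handles headers, content types, and security checks.

```python
# Hypothetical files that the server can serve, keyed by URL path.
FILES = {
    "/index.html": b"<h1>Home</h1>",
    "/about.html": b"<h1>About</h1>",
}


def handle(path):
    """Match the requested path to a file and return (status, body)."""
    if path == "/":
        path = "/index.html"  # a bare "/" usually maps to a default page
    body = FILES.get(path)
    if body is None:
        return 404, b"Not Found"  # nothing matches the requested URL
    return 200, body


print(handle("/"))
print(handle("/about.html"))
print(handle("/missing.html"))
```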
- Is HTTP 1.0 still used?
HTTP 1.0 is still used by many tools that were written for this version of the protocol. If you are creating a proxy, make sure it supports both HTTP 1.0 and 1.1.
Understanding the evolution of the World Wide Web is very important, especially for beginner programmers. HTTP was one of the first building blocks of the World Wide Web. It was created to facilitate the transmission of data between the web servers and the browsers. Since its release, HTTP has undergone many improvements to make it better and cope with emerging technologies. HTTP is one of the greatest inventions that has made the internet such a great success.
In the following article about HTTP, we are going to discuss HTTP and REST APIs.