Hi, let’s start off with a recap of what was discussed in our introduction to APIs on (LINK). If you have not read it yet, make sure to do so, so that you’re familiar with how a web API works.
Web APIs sit on top of the web’s HyperText Transfer Protocol, commonly known as HTTP. Security is becoming more important as APIs tend to carry more user data. Because of this, production traffic usually gets encrypted by using HTTPS. Functionally this makes no difference, as HTTPS just adds a security layer on top of HTTP.
Like most registered protocols that sit on top of IP, communication originates from a random port on your PC (identified by its IP address) and is sent to a specific port on the specific endpoint (with its own IP address) you want to communicate with. Each registered protocol has at least one registered port that should preferably be used. Sometimes alternative ports are defined as well. HTTP(S) traffic is usually transferred on the following ports:
HTTP:
– main: 80
– alternatives: 591, 8008, 8080
HTTPS:
– main: 443
– alternative: 8443
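To make the difference between the random source port and the fixed destination port more tangible, here is a minimal Python sketch. It is only an illustration: it opens a TCP connection on the local machine instead of talking to a real web server, and the function name show_ports is made up for this example.

```python
import socket


def show_ports():
    """Open a loopback TCP connection and return (source_port, destination_port)."""
    # Server side: we bind to port 0 so the OS picks any free port for this demo.
    # A real web server would listen on a fixed port such as 80 or 443.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        server_port = server.getsockname()[1]

        # Client side: we never choose a source port, the OS assigns one for us.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(("127.0.0.1", server_port))
            source_port = client.getsockname()[1]
            destination_port = client.getpeername()[1]
            conn, _ = server.accept()
            conn.close()

    return source_port, destination_port


source, destination = show_ports()
print(f"connection went from port {source} to port {destination}")
```

Running this a few times should print a different source port on most runs, while the destination port is always the one the server side is listening on.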
This brings us to Uniform Resource Locators (URLs). You might have already spotted one while visiting this website in your browser. URLs are used to direct your browser (or other IP-based software) towards a web-accessible resource and serve the same purpose as the address you put on an envelope, assuming you still send physical mail of some sort 😉 Let’s take the example of the first website that was developed and split everything up:
http://info.cern.ch:80/index.html
http:// | tells our browser (or other software that uses URLs) that we want to use the HTTP protocol to transfer the resource defined hereafter. Usually our browser will add this part by itself |
info.cern.ch | is the DNS (Domain Name System) name of the computer we are trying to reach. Our PC will internally convert this name into an IP address |
:80 | defines the port on which the destination computer is listening for incoming requests. Since this is the default HTTP port, this part can be omitted |
/ | defines the location (starting from the server’s root directory) in which the resource we’re looking for exists. In this case, the resource sits directly in the root. This is not necessarily the root of the filesystem on the server we are accessing; it will usually be a virtual root sitting in a directory somewhere on the filesystem |
index.html | defines the resource we want to access. Note that server admins usually configure a default resource to be served when only a directory is requested, typically index.htm, index.html or index.php. Because of this, this part can also be omitted in this case |
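The same split can be done in code. As a small sketch, Python’s standard urllib.parse module breaks the example URL into the parts listed in the table (note that it reports the directory and the resource together as one path):

```python
from urllib.parse import urlsplit

# Split the example URL into the parts described in the table above.
parts = urlsplit("http://info.cern.ch:80/index.html")

print(parts.scheme)    # http
print(parts.hostname)  # info.cern.ch
print(parts.port)      # 80
print(parts.path)      # /index.html
```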
Taking all of the above into account, we can simply enter info.cern.ch into our browser’s URL bar, or download the page using wget http://info.cern.ch on our command line (I use WSL).
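To illustrate which parts get filled in when we leave them out, here is a rough Python sketch. The function expand_url and the DEFAULT_PORTS table are hypothetical helpers written for this example, and the sketch assumes plain http when no protocol is given, which is a simplification of what a real browser does.

```python
from urllib.parse import urlsplit

# Main registered ports for the two protocols discussed in this lesson.
DEFAULT_PORTS = {"http": 80, "https": 443}


def expand_url(text):
    """Return the URL with the protocol, port and root path written out."""
    # Assumption for this sketch: a missing protocol means plain http.
    if "://" not in text:
        text = "http://" + text
    parts = urlsplit(text)
    # A missing port falls back to the main port of the protocol.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    # A missing path means we are asking for the root directory.
    path = parts.path or "/"
    return f"{parts.scheme}://{parts.hostname}:{port}{path}"


print(expand_url("info.cern.ch"))  # http://info.cern.ch:80/
```

The default resource (index.html) is not filled in here, because that choice is made by the server’s configuration and cannot be known by the client.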
In the previous sections we discussed how to define what we want to access. What actually happens when accessing a resource is that the software we’re using sends a request to the server to take a certain action on that resource. When browsing the web, we’re usually requesting a GET of a resource. When submitting a form on a webpage, we’re performing a POST on a resource. In the case of APIs, there are some more actions we can take; the most common ones are PUT (to update a resource) and DELETE (to remove a resource). We can simulate a request using a telnet client (I’m using telnet on Ubuntu in WSL) as follows:
plissens@DESKTOP-0P1AOC4:~$ telnet info.cern.ch 80
Trying 188.184.100.82...
Connected to webafs674.cern.ch.
Escape character is '^]'.
GET / HTTP/1.1
Host: info.cern.ch

HTTP/1.1 200 OK
Date: Wed, 02 Sep 2020 11:48:33 GMT
Server: Apache
Last-Modified: Wed, 05 Feb 2014 16:00:31 GMT
ETag: "40521bd2-286-4f1aadb3105c0"
Accept-Ranges: bytes
Content-Length: 646
Connection: close
Content-Type: text/html

<html><head></head><body><header>
<title>http://info.cern.ch</title>
</header>
<h1>http://info.cern.ch - home of the first website</h1>
<p>From here you can:</p>
<ul>
<li><a href="http://info.cern.ch/hypertext/WWW/TheProject.html">Browse the first website</a></li>
<li><a href="http://line-mode.cern.ch/www/hypertext/WWW/TheProject.html">Browse the first website using the line-mode browser simulator</a></li>
<li><a href="http://home.web.cern.ch/topics/birth-web">Learn about the birth of the web</a></li>
<li><a href="http://home.web.cern.ch/about">Learn about CERN, the physics laboratory where the web was born</a></li>
</ul>
</body></html>
Connection closed by foreign host.
The request I sent to the server is:
GET / HTTP/1.1
Host: info.cern.ch
GET / HTTP/1.1
The first line (once the telnet connection was set up) tells the server I’m connected to that I want to use the GET method on / using HTTP version 1.1. As mentioned before, other common methods are POST, PUT and DELETE.
The second line tells the server that the hostname of the virtual host is info.cern.ch. This has to do with running multiple websites/APIs on one server and is beyond the scope of this training (feel free to contact lissensp for more info). The third, empty line lets the server know that my input is complete and that it can process it.
If I were performing a POST, the empty line would indicate the start of the request body. This would be the data that we want to send to the server; more on that later.
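What I typed into telnet can also be put together in code. The Python sketch below is a rough illustration, not a replacement for a proper HTTP library. The helper names build_request and send_request are made up for this example. Two things differ from the telnet session and are my own additions: a Connection: close header, so that the server closes the connection once it has answered, and a Content-Length header whenever a body is present.

```python
import socket


def build_request(method, path, host, body=b""):
    """Build the raw bytes of a simple HTTP/1.1 request."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, resource, version
        f"Host: {host}",              # name of the virtual host
        "Connection: close",          # addition: ask the server to hang up afterwards
    ]
    if body:
        # Addition: tell the server how many bytes of body follow the empty line.
        lines.append(f"Content-Length: {len(body)}")
    # HTTP separates lines with CRLF; the empty line ends the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def send_request(host, request, port=80):
    """Send raw request bytes and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


print(build_request("GET", "/", "info.cern.ch").decode("ascii"))
```

Passing the result of build_request("GET", "/", "info.cern.ch") to send_request("info.cern.ch", ...) should return raw bytes similar to the telnet output, provided the server is reachable from your network.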
The response I got from the server is:
HTTP/1.1 200 OK
Date: Wed, 02 Sep 2020 11:48:33 GMT
Server: Apache
Last-Modified: Wed, 05 Feb 2014 16:00:31 GMT
ETag: "40521bd2-286-4f1aadb3105c0"
Accept-Ranges: bytes
Content-Length: 646
Connection: close
Content-Type: text/html

<html><head></head><body><header>
<title>http://info.cern.ch</title>
</header>
<h1>http://info.cern.ch - home of the first website</h1>
<p>From here you can:</p>
<ul>
<li><a href="http://info.cern.ch/hypertext/WWW/TheProject.html">Browse the first website</a></li>
<li><a href="http://line-mode.cern.ch/www/hypertext/WWW/TheProject.html">Browse the first website using the line-mode browser simulator</a></li>
<li><a href="http://home.web.cern.ch/topics/birth-web">Learn about the birth of the web</a></li>
<li><a href="http://home.web.cern.ch/about">Learn about CERN, the physics laboratory where the web was born</a></li>
</ul>
</body></html>
This response consists of a few parts.
The first line is always the status line. It contains the response code, which serves as a quick indication of how my request got processed. In this case I got a 200, which indicates that my request was handled correctly and a response was returned:
HTTP/1.1 200 OK
Response codes are grouped into the following standard classes:
1xx | Indicates information, most common code is: 100 Continue |
2xx | Indicates success, most common codes are: 200 OK, 201 Created |
3xx | Indicates redirection, most common code is: 301 Moved Permanently |
4xx | Indicates client errors, most common codes are: 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found |
5xx | Indicates server errors, most common codes are: 500 Internal Server Error, 502 Bad Gateway |
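Because the class of a response code is determined by its first digit, a client can classify any code with a simple division. The sketch below assumes Python; the function describe_status and the STATUS_CLASSES table are illustrative names, while HTTPStatus comes from Python’s standard http module.

```python
from http import HTTPStatus

# The first digit of a response code determines its class.
STATUS_CLASSES = {
    1: "information",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return a readable description such as '404 Not Found (client error)'."""
    status_class = STATUS_CLASSES.get(code // 100, "unknown")
    try:
        # HTTPStatus knows the standard reason phrase of registered codes.
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "unregistered code"
    return f"{code} {phrase} ({status_class})"


for code in (200, 301, 404, 502):
    print(describe_status(code))
```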
The following lines, up until the empty line, are response headers. They contain information about the response, like the date and time it was generated, the type of the content, the server software, …:
Date: Wed, 02 Sep 2020 11:48:33 GMT
Server: Apache
Last-Modified: Wed, 05 Feb 2014 16:00:31 GMT
ETag: "40521bd2-286-4f1aadb3105c0"
Accept-Ranges: bytes
Content-Length: 646
Connection: close
Content-Type: text/html
After that we find the actual body of the response, which is to be interpreted by our browser. If we were testing an API, the body would have to be interpreted by whatever client software is calling the API.
<html><head></head><body><header>
<title>http://info.cern.ch</title>
</header>
<h1>http://info.cern.ch - home of the first website</h1>
<p>From here you can:</p>
<ul>
<li><a href="http://info.cern.ch/hypertext/WWW/TheProject.html">Browse the first website</a></li>
<li><a href="http://line-mode.cern.ch/www/hypertext/WWW/TheProject.html">Browse the first website using the line-mode browser simulator</a></li>
<li><a href="http://home.web.cern.ch/topics/birth-web">Learn about the birth of the web</a></li>
<li><a href="http://home.web.cern.ch/about">Learn about CERN, the physics laboratory where the web was born</a></li>
</ul>
</body></html>
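The three parts of a response (status line, headers and body) can be separated with a few lines of code. The following Python sketch is a simplified illustration: the function parse_response is a made-up helper, the sample response is a shortened stand-in for the real one, and the sketch ignores details such as repeated headers or chunked bodies that a real HTTP library takes care of.

```python
def parse_response(raw):
    """Split raw response bytes into (code, reason, headers, body)."""
    # The first empty line separates the status line and headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: protocol version, response code, optional reason phrase.
    pieces = status_line.split(" ", 2)
    code = int(pieces[1])
    reason = pieces[2] if len(pieces) > 2 else ""

    # Each header is a "Name: value" pair; header names are case-insensitive.
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return code, reason, headers, body


# Shortened sample response, used only for illustration.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Apache\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html></html>"
)
code, reason, headers, body = parse_response(sample)
print(code, reason, headers["content-type"], body)
```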
Great, now that your knowledge on the web has been refreshed, you can continue to the next lesson.