Forward and reverse proxy

The word proxy describes someone or something acting on behalf of someone else.

Forward proxy

Most discussions of web proxies refer to the type of proxy known as a forward proxy.

In this case, the forward proxy retrieves data from another website on behalf of the original requester.

A tale of 3 computers:

X = your computer, or "client" computer on the Internet.

Y = the proxy website, proxy.example.org.

Z = the website you want to visit, www.example.net.

Normally, one would connect directly from X → Z.

However, in some scenarios, it is better for Y to connect to Z on behalf of X, which chains as follows: X → Y → Z.
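From X's side, using a forward proxy means explicitly telling the client to send its requests to Y instead of Z. Here is a minimal sketch in Python using only the standard library. The port 8080 and the function name opener_via_forward_proxy are assumptions made for illustration; a real proxy publishes its own address and port.

```python
import urllib.request

# Y: the forward proxy. The port 8080 is an assumption for this sketch.
PROXY_Y = "http://proxy.example.org:8080"
# Z: the site X actually wants to reach.
SITE_Z = "http://www.example.net/"


def opener_via_forward_proxy(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through proxy_url.

    Hypothetical helper name; it only wraps urllib's ProxyHandler.
    """
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_z_through_y():
    """X asks Y to fetch Z; Z sees the connection coming from Y, not from X."""
    opener = opener_via_forward_proxy(PROXY_Y)
    with opener.open(SITE_Z, timeout=10) as response:
        return response.read()
```

The point to notice is that X has to be configured: the client knows about Y and still names Z as the destination.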

Reasons why X would want to use a forward proxy server

X is unable to access Z directly because:

  1. Someone with administrative authority over X’s Internet connection has decided to block all access to site Z. For example:

    • The Storm Worm virus is spreading by tricking people into visiting onerandomsite.com, so the system administrator has blocked access to the site to prevent users from inadvertently infecting themselves.
    • Employees at a large company have been wasting too much time on facebook.com, so management wants access blocked during business hours.
    • A local elementary school disallows Internet access to the playboy.com website.
    • A government is unable to control the publishing of news, so it controls access to news instead, by blocking sites such as wikipedia.org. See Tor or Freenet.

  2. The administrator of Z has blocked X. For example:

    • The administrator of Z has noticed hacking attempts coming from X, so the administrator has decided to block X’s IP address (and/or net range).
    • Z is a forum website. X is spamming the forum. Z blocks X.

Reverse proxy

A tale of 3 computers:

X = your computer, or “client” computer on the Internet.

Y = the proxy website, proxy.example.org.

Z = the website you want to visit, www.example.net.

Normally, one would connect directly from X → Z.

However, in some scenarios, it’s better for the administrator of Z to restrict or disallow direct access, and force visitors to go through Y first. So, as before, we have data being retrieved by Y from Z on behalf of X, which chains as follows: X → Y → Z.

What is different this time, compared to a forward proxy, is that the user X doesn't know he is accessing Z, because the user X only sees that he is communicating with Y. The server Z is invisible to clients and only the reverse proxy Y is visible externally. A reverse proxy requires no (proxy) configuration on the client side.

The client X thinks he is only communicating with Y (X → Y), but the reality is that Y is forwarding all communication (X → Y → Z again).
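To make the difference concrete, here is a rough sketch of what Y does in the reverse proxy case, written with Python's standard library. It assumes a single backend Z and handles only GET requests; the names make_reverse_proxy_handler and start_server are made up for this example, and a production setup would normally use dedicated reverse proxy software instead.

```python
import http.server
import threading
import urllib.error
import urllib.request

# Talk to the backend directly, ignoring any proxy settings in the environment.
_direct = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def make_reverse_proxy_handler(backend_url):
    """Build the request handler for Y; backend_url is Z (hypothetical helper)."""

    class ReverseProxyHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            # X asked Y for self.path; Y quietly fetches the same path from Z.
            try:
                with _direct.open(backend_url + self.path, timeout=10) as upstream:
                    body = upstream.read()
                    status = upstream.status
                    content_type = upstream.headers.get(
                        "Content-Type", "application/octet-stream"
                    )
            except urllib.error.URLError:
                # Z failed or is unreachable; X only ever sees an error from Y.
                self.send_error(502, "Bad Gateway")
                return
            self.send_response(status)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return ReverseProxyHandler


def start_server(handler_class, port=0):
    """Serve handler_class on 127.0.0.1 in a background thread (port 0 = any free port)."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler_class)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With Z listening on some internal address, start_server(make_reverse_proxy_handler("http://<address of Z>")) gives you Y. The client just requests a URL on Y with no proxy settings at all, which is exactly why X never learns that Z exists.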

Reasons why Z would want to set up a reverse proxy server

Z wants to force all traffic to its website to pass through Y first:

  1. Z has a large website that millions of people want to see, but a single web server cannot handle all the traffic. So Z sets up many servers, and puts a reverse proxy on the Internet that will send users to the server closest to them when they try to visit Z. This is part of how the Content Distribution Network (CDN) concept works. For example:

    • Apple Trailers uses Akamai.
    • jQuery.com hosts its JavaScript files using CloudFront CDN.

  2. The administrator of Z is worried about retaliation for content hosted on the server and doesn't want to expose the main server directly to the public. For example:

    • Owners of spam brands such as “Canadian Pharmacy” appear to have thousands of servers, while in reality having most websites hosted on far fewer servers. Additionally, abuse complaints about the spam will only shut down the public servers, not the main server.

In the above scenarios, Z has the ability to choose Y.