Squid - the cross-platform open source proxy

Performance is one of the most important aspects of a good website. Longer loading times lead visitors to abandon a site before its content has had the chance to convince them to stay. As a vital component of the modern user experience, page speed has also been a Google ranking factor since 2010. Website operators should therefore be well versed in this topic and optimise the speed of their website. Options include compressing images, aggregating and optimising code files, and reducing the number of requests.

A further approach that relieves your web server is to use a reverse proxy server. This software component acts as an interface between browsers and the web server: it handles browser requests on the server’s behalf and delivers cached static content itself, without contacting the web server. This is especially effective when the server generates pages dynamically on each request but their content doesn’t change constantly. One of the most popular solutions for setting up such a cache proxy is the free programme Squid.

What is Squid?

First released in 1996, the proxy server software Squid was published by developer Duane Wessels as a free spin-off of the ‘Harvest object cache’. At the same time, a commercial version was released under the name ‘NetCache’; its development has since been discontinued. Squid is available under the GNU General Public Licence and supports HTTP, HTTPS, and FTP, among other protocols. Squid proxy servers run on most conventional operating systems, such as the various Linux distributions, Mac OS X, or Windows. The proxy server can be operated either via the corresponding command line tool or via a graphical user interface such as GAdmin SQUID or SquidMan.

Thousands of website operators take advantage of the caching capabilities of the open source proxy in practice. Wikipedia, for example, used Squid proxies for years to deliver its content and relieve its database and web servers. Thanks to its HTTPS support, Squid can also take on the task of establishing secured SSL connections. Various internet providers worldwide use Squid as a transparent proxy in order to ensure optimised internet access. Of course, the open source software can also be used to operate a general forward proxy for an individual client; this hides the user’s IP address and provides protection in addition to the firewall’s packet filter. Squid can also filter content itself with the SquidGuard extension.

Why you should use a Squid proxy server

Since its very first version, Squid has been an open source product, which is why no licence fee is required and the source code is freely available. The software can be downloaded for free and adjusted to meet users’ diverse needs. In practice, however, this is rarely necessary: the Squid project members, who have maintained and developed the programme on a volunteer basis for years, have brought decades of experience to it, and this shows in the versatility and speed optimisation that Squid offers. The software is also convincing for private use thanks to its definable access control lists. On the one hand, access to certain content can be blocked or the usable bandwidth reduced; on the other hand, users can analyse the traffic passing through the proxy in detail in order to control the data flow.

An important characteristic of Squid is its high flexibility, which really pays off in larger, more complex networks. For example, it’s possible to create a cache proxy set-up from multiple Squid proxies and distribute requests among them. This type of organisation relieves the individual components and greatly increases the system’s reliability. As in a Content Delivery Network, the individual reverse proxy servers can be placed in multiple locations.

How the caching behaviour of proxy software works

The security and control functions listed above make it clear how versatile the software is. First and foremost, though, Squid is attractive because of its core function as a proxy server that caches data. To guarantee that cached content is up to date, Squid regularly checks the status of the objects it stores. There are two possible results: the inspected object is either still up to date (fresh) or no longer up to date (stale). To avoid having to inspect the entire data set every time, an algorithm calculates how often each individual object requires verification. The following information is taken into account:

LM Last modified; header information on the date on which the last change was made
EX Expires; header information on the expiration date of an object
NOW The current date
OBJ_Date The date on which the object was stored in the Squid cache
MIN Minimum cache duration
MAX Maximum cache duration
PERCENT Duration factor
Obj_Age The time the object has spent in the cache (NOW - OBJ_Date)
LM_Age Age of the object at the time it was cached (OBJ_Date - LM)
LM_FACTOR Age factor (Obj_Age / LM_Age)

Until the date X, the object in the cache remains valid:

X = OBJ_Date + (LM_Age * PERCENT)
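The calculation can be sketched in a few lines of Python. This is a simplified illustration built only from the variables defined above, not Squid’s actual implementation; the names CachedObject, valid_until, and is_fresh are hypothetical, and the order in which EX, MIN, and MAX are checked is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedObject:
    lm: float        # LM: time of the last modification (seconds since epoch)
    obj_date: float  # OBJ_Date: time the object was stored in the cache


def valid_until(obj: CachedObject, percent: float) -> float:
    """Return X = OBJ_Date + (LM_Age * PERCENT)."""
    lm_age = obj.obj_date - obj.lm  # LM_Age
    return obj.obj_date + lm_age * percent


def is_fresh(obj: CachedObject, now: float, min_age: float, max_age: float,
             percent: float, expires: Optional[float] = None) -> bool:
    """Decide whether an object is fresh (True) or stale (False).

    The check order is an assumption made for this sketch.
    """
    obj_age = now - obj.obj_date  # Obj_Age
    if expires is not None:       # EX: an explicit expiry date decides directly
        return now < expires
    if obj_age <= min_age:        # younger than MIN: no check needed yet
        return True
    if obj_age > max_age:         # older than MAX: always revalidate
        return False
    return now < valid_until(obj, percent)


# An object last modified at t=0 and cached at t=1000, with PERCENT = 0.5,
# stays valid until t=1500.
example = CachedObject(lm=0, obj_date=1000)
print(valid_until(example, 0.5))
```

An object that had not changed for a long time before it was cached has a large LM_Age and is therefore checked less often than one that was modified shortly before being stored.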

Simply put, the algorithm is designed so that the Squid proxy server checks the status of an object more often the more frequently the object itself changes. The earliest point at which an object is checked is MIN, i.e. the assigned minimum duration in the cache. Once the maximum duration MAX is reached, Squid has to contact the web server. To this end, the proxy software sends a GET request with an If-Modified-Since header that includes the OBJ_Date. The web server verifies the status of the object and then responds with one of the following:

  • The status code 304 (Not Modified) if the object is unchanged
  • The status code 200 (OK) together with the changed object

This means that data will only be transferred when something has actually changed.
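The revalidation step can be illustrated with a short Python sketch. It only models the decision described above and performs no network access; the function names revalidation_headers and handle_revalidation are hypothetical and not part of Squid.

```python
from email.utils import formatdate


def revalidation_headers(obj_date: float) -> dict:
    """Build the header for a conditional GET that carries OBJ_Date."""
    # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    return {"If-Modified-Since": formatdate(obj_date, usegmt=True)}


def handle_revalidation(status: int, cached_body: bytes,
                        response_body: bytes) -> bytes:
    """Return the body the proxy should deliver after revalidation."""
    if status == 304:   # Not Modified: the cached copy is still valid
        return cached_body
    if status == 200:   # OK: the server sent a changed object
        return response_body
    raise ValueError(f"status {status} is not handled in this sketch")


print(revalidation_headers(0))
```

With a 304 response, only headers travel over the network; the body is served from the cache.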

What hardware requirements does Squid have?

If you want to use a Squid reverse proxy for your web server, you should first make sure that you have the necessary hardware. A caching proxy has no special requirements in terms of processing power. Instead, it requires a sufficient amount of RAM and disk space. Nowadays, both are easy to obtain, which is why acquiring them is less a question of price and more one of correct calculation. Calculate your demand in relation to your web project and allow for potential growth as well. When purchasing your hardware, make sure to choose modern components such as SSD storage, which is characterised by quick access times and therefore allows the best possible speed optimisation of your website.

Installing Squid—how it works

Generally, you have two options for installing Squid on your system. The first requires Squid to be available in the package manager of your distribution. If this is the case, the proxy programme is installed in the usual way via the command line. Under Ubuntu, the corresponding commands are:

sudo apt-get update
sudo apt-get install squid

The second option is to download the source archive. This can be unpacked and compiled using conventional methods (shown here for version 3.5.20):

tar xzf squid-3.5.20.tar.gz
cd squid-3.5.20
./configure
make

After this step is finished, start the installation with the command:

make install

An unofficial prebuilt MSI installation package has been available for Windows systems since version 3.5; just double-click it after downloading to start the process.

For every published stable version, there’s also a development version and a beta version that contain new features. Both are intended first and foremost for testing these features, which is why users should only resort to them if they understand Squid thoroughly.

How to configure your Squid proxy server for increased website acceleration

The configuration file squid.conf allows users to define the type of proxy Squid should act as. It is generally found under /etc or /usr/local/squid/etc, or in the directory that you specified during installation. It already contains various predefined settings that are labelled via comment lines, which begin with a hash mark (#). In the following paragraphs, we’ve compiled some important options that you’ll need in order to set up Squid.

Network options: #NETWORK OPTIONS

This area allows users to configure the IP addresses and ports that are relevant for operating Squid servers. The following entries are relevant for the cache proxy.

http_port

Syntax: http_port [hostname or IP address:]Port number

Description: defines the port on which Squid listens for HTTP requests. Generally, port 3128 is used here. If neither a host name nor an IP address is given, the setting applies to all of the system’s IP addresses. Entering multiple ports is also possible.

Example: http_port 192.168.0.1:3128

https_port

Syntax: https_port [IP addresses:]Port number cert=path to SSL certificate [key=path to private SSL key] [options]

Description: the HTTPS port is required if the Squid proxy is to receive SSL or TLS connections. The path to the certificate to be used (in PEM format) must be given. If no private SSL key is given, Squid automatically assumes that the PEM file already contains the key. The options parameter allows you to enter additional options according to the OpenSSL documentation.

icp_port

Syntax: icp_port Port number

Description: here you can enter the port on which Squid receives ICP (Internet Cache Protocol) requests, which arrive as UDP packets. This statement is only necessary if you use multiple proxies that are to communicate with one another. The standard port is 3130; to turn the function off, enter the value 0.

Example: icp_port 3130

Caching options: #OPTIONS WHICH AFFECT THE CACHE SIZE

The caching options allow users to determine, among other things, whether and how much working memory your Squid proxy should use for caching purposes. They can also be used to define the minimum and maximum object sizes and the general caching behaviour.

cache_mem

Syntax: cache_mem working memory in MB

Description: cache_mem determines the amount of main memory reserved for in-transit objects, hot objects, and negative-cached objects. Given that the data is arranged in blocks of 4 KB, the indicated value must also be a multiple of 4 KB. Make sure that you do not mistake this option for Squid’s total memory requirement, which isn’t limited by this setting.

Example: cache_mem 256 MB

maximum_object_size

Syntax: maximum_object_size object size in KB/MB

Description: this entry tells Squid the maximum size of an object that is to be cached. The lower size limit can be set with minimum_object_size.

Example: maximum_object_size 4 MB

Entries for caching and logfile directories: #LOGFILE PATHNAMES AND CACHE DIRECTORIES

In addition to the entries on ports and caching behaviour, the Squid server needs information on the directories in which the cached content and the resulting log data should be stored.

cache_dir

Syntax: cache_dir directory type directory path storage space directory amount

Description: with cache_dir you define the caching directory, its maximum storage capacity in megabytes, and the number of first-level directories and second-level subdirectories. The default directory type is ufs. In the supplied configuration file, this option is generally commented out and has to be activated.

Example: cache_dir ufs /usr/local/squid/var/cache/squid 100 16 256
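To make the meaning of the individual fields explicit, the following Python sketch splits a cache_dir line into its parts. It is a simplified illustration that assumes exactly the five arguments described above (real configuration lines may carry further options); the names CacheDir and parse_cache_dir are hypothetical.

```python
from typing import NamedTuple


class CacheDir(NamedTuple):
    store_type: str   # directory type, e.g. ufs
    path: str         # caching directory
    size_mb: int      # maximum storage capacity in megabytes
    level1_dirs: int  # number of first-level directories
    level2_dirs: int  # number of subdirectories per first-level directory


def parse_cache_dir(line: str) -> CacheDir:
    """Split a simple cache_dir line into its fields."""
    parts = line.split()
    if len(parts) != 6 or parts[0] != "cache_dir":
        raise ValueError("expected: cache_dir type path size-in-MB L1 L2")
    _, store_type, path, size_mb, level1, level2 = parts
    return CacheDir(store_type, path, int(size_mb), int(level1), int(level2))


entry = parse_cache_dir("cache_dir ufs /usr/local/squid/var/cache/squid 100 16 256")
# 16 first-level directories with 256 subdirectories each
print(entry.size_mb, entry.level1_dirs * entry.level2_dirs)
```

In this example, up to 100 MB of cached data is spread across 16 first-level directories containing 256 subdirectories each.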

cache_log

Syntax: cache_log file path

Description: determines the storage location of your Squid proxy server’s log file, which records information on the software’s behaviour.

Example: cache_log /usr/local/squid/var/logs/cache.log

Access options: # ACCESS CONTROLS

Finally, you’ll need clearly defined access lists for the ports used by Squid. Here, the following two parameters are key:

acl

Syntax: acl listname list type argument

Description: here you can create a detailed access list for all HTTP, ICP, and TCP connections. For a more precise overview of types and options, take a look at the official online guide.

Example: acl all src 0.0.0.0/0

http_access

Syntax: http_access allow|deny [!] listname

Description: allow or deny access to the HTTP port on the basis of the previously defined access lists. An exclamation mark in front of the list name negates it, so the rule applies to all connections that do not belong to the named list.

Example: http_access deny !SSL_ports
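How a list of http_access rules with negation might be evaluated can be sketched in Python. The sketch assumes that rules are checked from top to bottom and that the first matching rule decides; denying a request when no rule matches is a simplification made here. The function name evaluate and the list name localnet are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Rule = Tuple[str, List[str]]  # (action, list names, optionally prefixed with "!")


def evaluate(rules: Iterable[Rule], matching_lists: Set[str]) -> str:
    """Return "allow" or "deny" for a request.

    matching_lists holds the names of all access lists the request belongs to.
    """
    for action, names in rules:
        # A rule applies only if every named list matches;
        # "!" inverts the test for that list.
        if all(
            (name[1:] not in matching_lists) if name.startswith("!")
            else (name in matching_lists)
            for name in names
        ):
            return action
    return "deny"  # simplification: nothing matched, so deny


rules = [
    ("deny", ["!SSL_ports"]),  # deny everything that is not in SSL_ports
    ("allow", ["localnet"]),   # allow clients from the hypothetical list localnet
]
print(evaluate(rules, {"localnet", "SSL_ports"}))
print(evaluate(rules, {"localnet"}))
```

The first call results in allow because the request belongs to SSL_ports and localnet; the second results in deny because the request is not in SSL_ports and is therefore caught by the first rule.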