Making use of a Proxy/Cache and/or Mirror site to relieve Network Congestion
(G-MING/InfoCities: January 1998)

John Heaton

Abstract

This paper arose from a series of demonstrations at one of the meetings of the Culture Work-Package in Manchester, during which several demonstrations of remote sites were slowed down by Network Congestion on the routes to the majority of external (European and even UK) resources.

Introduction

One of the problems often noticed when accessing a remote site via the Internet is Network Congestion. This affects all types of access to a remote site, not just access using a WWW browser. However, there are several ways in which this congestion can be partially overcome for HTTP, FTP and Gopher sites.

 

Proxy/Cache Server

The simplest method is for the users to configure their WWW browser to make use of a Proxy/Cache server. This can speed up the response time from a remote site if the remote documents are accessed via the proxy/cache regularly and/or by several other users.

The first such access via a proxy/cache can actually be slower than accessing the remote site directly as there are more transfers involved.

The user's request is first directed to the proxy/cache, which then issues a request on the user's behalf to the remote site or, more usually, to a peer proxy/cache which is nearer to the remote site. As the requested document makes its way back through the network to the user, it is temporarily stored on disk at each proxy/cache system involved.
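The request path described above can be illustrated with a toy model. This is only a sketch: the class names, the two-level hierarchy and the URL are assumptions made for illustration, and it ignores expiry and freshness checks entirely.

```python
class Origin:
    """Stands in for the remote site; counts how often it is contacted."""

    def __init__(self, documents):
        self.documents = documents
        self.requests = 0

    def fetch(self, url):
        self.requests += 1
        return self.documents[url]


class ProxyCache:
    """A toy proxy/cache that forwards misses to a parent (peer cache or origin)."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        self.store = {}  # url -> document body (the "disk" copy)

    def fetch(self, url):
        if url in self.store:
            return self.store[url]      # served from the stored copy
        body = self.parent.fetch(url)   # miss: ask the next system along
        self.store[url] = body          # keep a copy on the way back
        return body


# Hypothetical hierarchy: local cache -> peer cache near the remote site -> remote site.
url = "http://remote-site/index.html"
origin = Origin({url: "<html>remote page</html>"})
peer = ProxyCache("peer-near-remote-site", parent=origin)
local = ProxyCache("local", parent=peer)

first = local.fetch(url)    # travels local -> peer -> origin (more transfers)
second = local.fetch(url)   # answered by the local cache alone
print(origin.requests)      # prints 1
```

The first fetch involves three systems, which is why it can be slower than a direct access; the repeat request never leaves the local proxy/cache.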

Each subsequent request would normally cause the document to be served from the local copy, rather than being requested from the remote system. To ensure that the local proxy/cache always has an up-to-date copy of the remote document, it still polls the remote site to see whether the document stored locally is older than the remote document. Some remote sites do not play fair in this matter: they return a 'no-cache' header, or a header whose expiry date has already passed, either of which causes the cached document to be expired and forces the local system to replace its local copy with one that has probably not changed. (This is called 'Cache-Busting'.)
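The two checks described above can be sketched as follows. The function names are hypothetical, and the headers examined (Pragma, Cache-Control, Expires, and Last-Modified style dates) are an assumption about which headers a cache would consult; real proxy/cache software applies considerably more rules than this.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_cache_busting(headers, now):
    """Return True if the response headers prevent a cached copy being reused.

    `headers` maps header names to values; `now` is a timezone-aware datetime.
    """
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    if "no-cache" in lowered.get("pragma", ""):
        return True
    if "no-cache" in lowered.get("cache-control", ""):
        return True
    expires = lowered.get("expires")
    if expires is None:
        return False
    try:
        expiry = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        return True  # an unparseable date is treated as already expired
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry <= now


def cached_copy_is_stale(cached_last_modified, remote_last_modified):
    """The polling check: is the stored copy older than the remote document?"""
    cached = parsedate_to_datetime(cached_last_modified)
    remote = parsedate_to_datetime(remote_last_modified)
    return cached < remote


# Illustrative values only.
now = datetime(1998, 1, 15, tzinfo=timezone.utc)
print(is_cache_busting({"Expires": "Thu, 01 Jan 1998 00:00:00 GMT"}, now))
print(cached_copy_is_stale("Thu, 01 Jan 1998 00:00:00 GMT",
                           "Mon, 05 Jan 1998 00:00:00 GMT"))
```

A document whose headers trip the first check is re-fetched on every request, even when the second check would have shown that nothing had changed.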

If the user has their WWW browser configured without a proxy/cache and then switches to one, the odds are that the first attempt will take just as long to process. This will discourage the user, who will promptly reconfigure the WWW browser to do without a proxy/cache. (Who was it that said 'First Impressions..'?) The problem is that users expect a dramatic difference immediately, whereas the benefit only becomes apparent over time.

You can view the statistics from the JANET Proxy/Cache server at the URL in the References section (reference 1), but most proxy/cache servers are doing very well nowadays if they can manage to serve 35% of documents from a cached copy.

Mirroring

Another approach to relieving the network congestion is to make a complete (mirror) copy of a remote site available via a local server. This idea could be expanded to have several mirror sites located strategically around the Internet.

Figure 1: PharmWeb Requests

One good example of this type of mirroring is the PharmWeb service. PharmWeb was originally run from a user's own public_html directory on the WWW server at Manchester Computing, as just a set of normal user pages. As the load on the MC system increased (both from PharmWeb itself and because the same system was also being used for a Search Engine and as a Proxy/Cache server for the Manchester/UMIST campus), Tony D'Emanuele from PharmWeb started to build up a network of PharmWeb Mirror (pwmirror) sites spread all around the world. On entering any one of the pwmirror sites, a user is given the opportunity to stay on that site or change to another pwmirror site which may give a faster response. More recently the PharmWeb service was transferred to its own dedicated system with a much faster Internet connection. We noticed quite a large jump in the local access statistics, as if users had switched back to using that system, but the other pwmirror sites are still increasing in popularity and more are coming online.

Within the InfoCities project, we could make use of this mirroring concept, by keeping partial mirror copies of a remote site locally. For example:

The Culture work-package consists of several 'partner' sites, each with its own WWW site. Each partner site is presented in its native language and normally has a section in English. It would be very useful to collect together the English sections of the partner sites and serve them from the Manchester site.

That is:

http:// partner-site-1/english/homepage.html
http:// partner-site-2/english/homepage.html
....
http:// partner-site-n/english/homepage.html

could be mirrored locally in Manchester as:

http:// manchester-site/partner-site-1/homepage.html
http:// manchester-site/partner-site-2/homepage.html
....
http:// manchester-site/partner-site-n/homepage.html
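The mapping between the two sets of URLs above is mechanical, so it can be written down as a small function. This is a minimal sketch: `mirror_url` is a hypothetical helper, the host names are the placeholders used in the example, and it assumes every partner keeps its English section under `/english/`.

```python
from urllib.parse import urlsplit, urlunsplit

MIRROR_HOST = "manchester-site"  # placeholder name taken from the example above
ENGLISH_PREFIX = "/english/"     # assumed location of each partner's English section


def mirror_url(partner_url):
    """Map a partner's English-section URL to its local mirror URL.

    Returns None for URLs outside the English section, which are not mirrored.
    """
    parts = urlsplit(partner_url)
    if not parts.path.startswith(ENGLISH_PREFIX):
        return None
    remainder = parts.path[len(ENGLISH_PREFIX):]
    # The partner's host name becomes the first directory on the mirror.
    local_path = "/" + parts.netloc + "/" + remainder
    return urlunsplit((parts.scheme, MIRROR_HOST, local_path, parts.query, ""))


print(mirror_url("http://partner-site-1/english/homepage.html"))
# prints http://manchester-site/partner-site-1/homepage.html
```

Using the partner's host name as the top-level directory keeps the mirrored sections from colliding with one another on the Manchester server.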

To make all this possible the Manchester site would need to agree with each of the partner sites how the local mirror would be kept up-to-date.

Figure 2: PharmWeb Mirror Sites

With the PharmWeb example, each of the remote pwmirror sites is kept in sync with the main site by the system administrator of the main site pushing any updated files out, rather than relying on each pwmirror site administrator to update their own copy. This ensures that all the pwmirror sites are updated over a relatively short period of time.

 

 

 

 

 

Country       Location                                                 URL
Australia     University of Sydney                                     http://www.pharm.su.oz.au/pwmirror/
Brazil        University of São Paulo                                  http://www.fcfrp.usp.br/pwmirror/
Canada        Cape Breton Community Network                            http://highlander.cbnet.ns.ca/pwmirror/
Japan         National Institute of Health Sciences, Tokyo             http://www.nihs.go.jp/pwmirror/
New Zealand   New Zealand Pharmacist Online                            http://www.pharmacy.co.nz/pwmirror/
South Africa  Potchefstroom University for Christian Higher Education  http://www.puk.ac.za/pwmirror/
UK            University of Manchester                                 http://www.pharmweb.net/
              University of Manchester                                 http://pharmweb1.man.ac.uk/pwmirror/
USA           School of Pharmacy, University of North Carolina         http://sunsite.unc.edu/pwmirror/
              University of Kansas Medical Center                      http://www.kumc.edu/pwmirror/
              College of Pharmacy, University of Texas                 http://saklad.uthscsa.edu/pwmirror/

Squid Redirectors

Recent developments in the Squid Proxy/Cache software, which is used widely around the world, make it possible to combine the traditional idea of a proxy/cache server with the use of a local mirror copy of a remote site. Squid achieves this by means of a Redirector program which runs on the proxy/cache system alongside the Squid process. The Redirector facility is disabled by default, as it requires an extra program or script to perform the redirection.

If an alternative mirror copy of a remote site exists locally, one problem is getting the local community to make use of it. Once implemented, the redirector program can achieve this transparently to the user.

The redirector intercepts all requests sent to the Squid proxy/cache. If certain predefined text strings are found in a URL, the URL is rewritten to point to the alternative URL and the request is handed back to the Squid program for processing. The user still sees the remote URL in the WWW browser but actually gets the contents of an alternative server.
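A redirector of this kind can be quite small. The sketch below is an illustration under stated assumptions rather than a drop-in helper: it assumes the classic one-line-per-request format ("URL client-ip/fqdn ident method" in, rewritten URL or empty line out), which later Squid releases have changed, so the documentation for the installed version should be checked. The rewrite table is hypothetical, and for simplicity it matches on the start of the URL rather than anywhere within it.

```python
import io

# Hypothetical rewrite table: start of a requested URL -> local replacement.
REWRITES = {
    "http://partner-site-1/english/": "http://manchester-site/partner-site-1/",
}


def rewrite(url):
    """Return the alternative URL, or None when no predefined string matches."""
    for remote_prefix, local_prefix in REWRITES.items():
        if url.startswith(remote_prefix):
            return local_prefix + url[len(remote_prefix):]
    return None


def serve(requests, replies):
    """Answer one request line at a time, as a redirector helper would.

    Assumed input line: "URL client-ip/fqdn ident method".
    Reply: the rewritten URL, or an empty line meaning "leave unchanged".
    """
    for line in requests:
        fields = line.split()
        new_url = rewrite(fields[0]) if fields else None
        replies.write((new_url or "") + "\n")
        replies.flush()  # the proxy waits for each reply, so do not buffer


# Demonstration with in-memory streams; a real helper would be called as
# serve(sys.stdin, sys.stdout) after importing sys.
demo_in = io.StringIO(
    "http://partner-site-1/english/homepage.html 192.0.2.1/- - GET\n"
    "http://elsewhere/index.html 192.0.2.1/- - GET\n"
)
demo_out = io.StringIO()
serve(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

The first demonstration request is rewritten to the local mirror; the second produces an empty reply, so the proxy/cache fetches it as usual.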

Author: John Heaton
WWW: http://www.phers.co.uk
Telephone: +44 161 275 6011
FAX: +44 161 275 6040
Address: G-MING Applications Programme
Manchester Computing
The University of Manchester
Oxford Road
Manchester, M13-9PL

References:

  1. The JANET Proxy/Cache: http://wwwcache.ja.net/

  2. A Distributed Testbed for National Information Provisioning: http://www.nlanr.net/Cache/

  3. The Squid Proxy/Cache software: http://www.nlanr.net/Squid/

  4. Co-operative Web Caching using Multicasting over a Metropolitan Area Network in Greater Manchester, England (a set of PowerPoint slides from a presentation to the JENC 8 conference in Edinburgh 1998): http://www.gap.g-ming.net.uk/JENC8/

  5. WWW Cache Services in Europe: http://www.terena.nl/projects/insight/caching/cache-servers.html

  6. PharmWeb: http://www.pharmweb.net/

  7. Manchester INFOCITIES: http://www.infocities.g-ming.net.uk


 