P.S. Banerjee, G. Sahoo, Umesh Prasad
Abstract
The Internet has become an integral part of everyday
life: services such as mail, news, and chat are widely
available, as is a huge amount of information on
almost any subject. However, in most cases
the bandwidth to connect to the Internet is limited. It
needs to be used efficiently and more importantly
productively. Generally, bandwidth is distributed among
groups of users based on some policy constraints.
However, users do not always use their entire
allocation, and sometimes they need more bandwidth
than has been allocated to them. Ideally, productive usage should be
preferred over unproductive usage when bandwidth is
scarce. When it is abundant, any kind of use
can be permitted, provided it is consistent with
policy. The bandwidth usage patterns of users vary
with time of the day, time of the year and requirements.
So there is a need for dynamic allocation of bandwidth
that satisfies the requirements of the users, manages
variable usage and is consistent with administrative
usage policy.
Internet usage is varied and in the context of an
institution or organization an administrator would like to
maximize productive usage. There is, therefore, a need
to implement access control policies that prevent
unproductive use but, to the extent possible, do not
impose censorship. The Squid proxy
server is a full-featured web proxy, which increases the
efficiency of the Internet link by providing caching and
proxy services. Squid provides many mechanisms to
set access control policies. However, deciding which
policies to implement requires experimentation and
usage statistics, which must be processed to obtain
useful data. The architecture proposed
in this paper uses machine learning to determine
policies based on the content of the
URLs currently being visited. The main component in this
architecture is the Squid Traffic Analyzer, which
classifies the traffic and generates URL lists. These
URL lists are used in formulating access policies.
The concept of delay priority is also introduced,
giving system administrators more options when
setting policies for bandwidth management. Because Squid
allows HTTP tunneling, it leaves a loophole in strict
policy management. This paper also examines proxy tunneling
in Squid and suggests some possible
solutions to this problem.
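
As a rough illustration of how a traffic analyzer might turn
Squid logs into URL lists, the sketch below reads lines in
Squid's native access.log format (where the seventh
whitespace-separated field is the requested URL) and groups the
visited hosts by category. It is not the authors' implementation:
the keyword table, the category names and the functions classify,
host_of and build_url_lists are hypothetical, and the keyword
match is only a stand-in for the machine-learning classifier
described above.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical keyword rules standing in for a trained classifier.
KEYWORDS = {
    "unproductive": ("video", "games", "chat"),
    "productive": ("journal", "docs", "edu"),
}


def classify(url):
    """Return a category label for a URL using simple keyword matching."""
    text = url.lower()
    for label, words in KEYWORDS.items():
        if any(word in text for word in words):
            return label
    return "unknown"


def host_of(url):
    """Extract the host name from a logged URL.

    CONNECT requests are logged as 'host:port' without a scheme,
    so a '//' prefix is added to let urlsplit parse them.
    """
    if "://" not in url:
        url = "//" + url
    return urlsplit(url).hostname or ""


def build_url_lists(log_lines):
    """Group the hosts seen in access.log lines by category."""
    lists = defaultdict(set)
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip malformed or truncated lines
        url = fields[6]  # seventh field of the native log format
        host = host_of(url)
        if host:
            lists[classify(url)].add(host)
    return {label: sorted(hosts) for label, hosts in lists.items()}
```

Each resulting list could be written to a file, one host per
line, and referenced from a Squid acl of type dstdomain, so that
access rules can be applied per category.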
Keywords: Web Proxy, Machine-Learning, Network Traffic, Meta Data