An Efficient Algorithm Distance Calculation of Page Sequences Using Dynamic Programming
Published: 2016
Author(s) Name: Saurabh Dhyani, Ghanshyam Singh Thakur
Author(s) Affiliation: Dept of Comp App., Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India
Abstract
Web data is growing rapidly, but the information on the web is inconsistent and heterogeneous because it combines many different types of content. This heterogeneity makes extracting relevant information from the web a critical task. Web usage mining extracts relevant information from the large volume of data stored in web logs, which hold intrinsic information about the web pages that users accessed. Because web log data is so large, it is better to work with a small set of data at a time than with the complete data set. Doing so requires a distance or similarity function that measures how far apart two user sessions are. Clustering then groups users who exhibit similar browsing patterns. In this paper we propose an efficient algorithm for calculating the similarity between two user sessions based on sequence alignment, using a dynamic programming technique, the Longest Common Subsequence.
Keywords: Clustering, Longest Common Subsequence, Web Logs, Web Usage Mining
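The abstract does not give the algorithm itself, so the following is only a minimal sketch of how an LCS-based session similarity could look, not the authors' method. It assumes a session is an ordered list of page identifiers, and it assumes the LCS length is normalized by the length of the longer session; the function names `lcs_length`, `session_similarity`, and `session_distance` are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Standard dynamic programming recurrence, keeping only two rows
    of the table so memory is O(len(b)) instead of O(len(a) * len(b)).
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching pages extend the best alignment of both prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of skipping x or skipping y.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[len(b)]


def session_similarity(s1, s2):
    """Similarity in [0, 1]; normalizing by the longer session is an assumption."""
    if not s1 or not s2:
        return 0.0
    return lcs_length(s1, s2) / max(len(s1), len(s2))


def session_distance(s1, s2):
    """Distance derived from the similarity above (also an assumption)."""
    return 1.0 - session_similarity(s1, s2)


# Hypothetical sessions: ordered page visits by two users.
session_a = ["home", "products", "cart", "checkout"]
session_b = ["home", "search", "products", "checkout"]
print(session_similarity(session_a, session_b))
```

Because the measure respects page order, two sessions that visit the same pages in a different order score lower than two that follow the same path, which is the property that makes sequence alignment attractive for clustering browsing patterns.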