Clustering approach in context free data cleaning
Published: 2009
Author(s) Name: Sohil D. Pandya, Dr. Paresh V. Virparia
Abstract
In this era of knowledge, organizations can gain a competitive advantage only through proficient data analysis. This paper emphasizes the application of clustering to context free data cleaning: where no reference data set is available, attribute values are corrected using various sequence similarity metrics, which improves the quality of the data and, in turn, the quality of the resulting data analysis. The authors propose an algorithm that examines the suitability of a value for correcting other values of an attribute. Various sequence similarity metrics were used to measure the distance between two attribute values, to test the data, and to generate results. Experimental results show how the approach can effectively clean data without reference data.
Keywords: Clustering, Context free data cleaning, Sequence similarity metrics.
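The general idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes a normalized Levenshtein similarity as the sequence similarity metric, a greedy clustering pass, and the rule that the most frequent value in a cluster is the one suited to correct the others. The function names, the threshold of 0.75, and the sample city names are all hypothetical.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def clean_values(values, threshold=0.75):
    """Greedy clustering of attribute values without reference data.

    Assumption: a frequent value is more likely to be correct, so values
    are visited from most to least frequent, and each value either joins
    the first existing cluster whose center is similar enough or becomes
    the center of a new cluster. Every value is then replaced by the
    center of its cluster.
    """
    centers = []
    mapping = {}
    for value, _count in Counter(values).most_common():
        for center in centers:
            if similarity(value, center) >= threshold:
                mapping[value] = center
                break
        else:
            centers.append(value)
            mapping[value] = value
    return [mapping[v] for v in values]


if __name__ == "__main__":
    cities = ["Ahmedabad", "Ahmedabad", "Ahmedabd", "Ahemdabad",
              "Surat", "Surat", "Surt"]
    print(clean_values(cities))
```

In this sketch the threshold controls how aggressively values are merged: a lower threshold corrects more misspellings but risks merging distinct legitimate values. Any other sequence similarity metric could be substituted for `similarity` without changing the rest of the code.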