An Evolutionary and a Rule-based Approach to String Transformation
Published: 2014
Author(s) Name: Nandita Bhanja Chaudhuri, D. Kamal Kumari, S. Ram Prasad Reddy
Author(s) Affiliation: CSE Department, Vigana’s Institute of Engg. for Women, Ranga Reddy, Telangana, India
Abstract
Natural language processing focuses on analyzing and processing human languages using digital computers. String transformation is an important area of research in the field of natural language processing. It maps a source string to a desired form and underlies applications such as spelling error correction, query reformulation, top k related candidate generation, and word stemming. Although various traditional approaches to string transformation are available, they cannot be considered optimal, because accuracy and efficiency are both parameters to optimize. This paper proposes a novel model for string transformation that is up to 99% accurate, with an improved F-measure and G-measure. The model uses an evolutionary approach, which involves methods to search the population of keywords, an algorithm to find the distances between strings, and finally transformation of the strings with and without a dictionary. The paper focuses on the following: (1) spelling error correction, which detects a wrong spelling and provides a correct suggestion; (2) top k candidate generation, which provides the most related suggestions for a keyword; (3) query reformulation, which transforms a short form of a query into an elaborate form; and (4) word stemming, which identifies the root part of a word when grammatical affixes are attached to it. In short, stemming identifies redundant queries, and it is implemented as a rule-based system without using a dictionary. Graphical comparisons of the candidates generated by the existing and the proposed systems are presented. Experimental results on large-scale data show that the proposed model is accurate and improves on the traditional approach.
Keywords: Natural Language Processing, String Transformation, Spelling Error Correction, Top K Candidate Generation, Query Reformulation, Stemming, F-measure, G-measure
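
The abstract mentions an algorithm that finds distances between strings and a top k candidate generation step, but does not name the distance measure. As an illustration only, the sketch below assumes Levenshtein edit distance and a small in-memory vocabulary; the function names `edit_distance` and `top_k_candidates` and the sample words are hypothetical and are not taken from the paper.

```python
import heapq


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with two rows of the DP table.

    Assumption: the paper does not state which distance it uses;
    Levenshtein is chosen here purely for illustration.
    """
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def top_k_candidates(query: str, vocabulary: list[str], k: int = 3) -> list[str]:
    """Return the k vocabulary words closest to the query.

    Ties in distance are broken alphabetically so the output is deterministic.
    """
    return heapq.nsmallest(
        k, vocabulary, key=lambda w: (edit_distance(query, w), w)
    )


if __name__ == "__main__":
    # Hypothetical vocabulary, used only to exercise the sketch.
    words = ["spelling", "spewing", "sapling", "string"]
    print(top_k_candidates("speling", words, k=2))
```

A ranking like this covers both spelling correction (take the closest candidate) and top k generation (take the k closest). The evolutionary search over the keyword population described in the abstract is not modelled here.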