Comparison of String Similarity Algorithms to Measure Lexical Similarity

National Journal of System and Information Technology

Volume 10 Issue 2

Published: 2017
Author(s) Name: Sagar J. Gandhi, Mihirraj M. Thakor, Jikitsha Sheth, Hariom I. Pandit, Hemin S. Patel
Author(s) Affiliation: Student, MCA Prog., Shrimad Rajchandra Instt. of Mgt. and Comp. Applicn. of UTU, Bardoli, Gujarat.

Abstract

String similarity measures the lexical similarity between two words, and it can be further exploited to identify similarity between questions. Several string similarity algorithms exist in the literature. In this paper, the authors implement five of them: Dice coefficient, Jaccard similarity, Levenshtein distance, Jaro distance, and Cosine similarity. The results of these algorithms are then compared with the ratings of human judges to determine which algorithm most closely resembles the way humans judge the given strings to be dissimilar. The experiments are conducted on 1000 English word pairs.
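
For orientation, the five measures named in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how strings are tokenized, so the choice of character bigrams for Dice and Jaccard and of character counts for Cosine is an assumption, and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def bigrams(s):
    """Adjacent character pairs of s (assumed tokenization, not from the paper)."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice(a, b):
    """Dice coefficient over sets of character bigrams."""
    x, y = set(bigrams(a)), set(bigrams(b))
    if not x and not y:  # both strings shorter than two characters
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def jaccard(a, b):
    """Jaccard similarity over sets of character bigrams."""
    x, y = set(bigrams(a)), set(bigrams(b))
    if not x and not y:
        return 1.0 if a == b else 0.0
    return len(x & y) / len(x | y)


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions), row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def jaro(a, b):
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ca:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    transpositions = sum(p != q for p, q in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def cosine(a, b):
    """Cosine similarity of character-count vectors (assumed representation)."""
    x, y = Counter(a), Counter(b)
    dot = sum(x[c] * y[c] for c in x)
    norm = sqrt(sum(v * v for v in x.values()) * sum(v * v for v in y.values()))
    return dot / norm if norm else 0.0
```

Note that Levenshtein returns a distance (0 means identical), while the other four functions return similarities in [0, 1]. To compare all five against human ratings on a common scale, one option is to convert the distance to a similarity, for example 1 - distance / max(len(a), len(b)); whether the authors did this is not stated in the abstract.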

Keywords: N.A.
