Volume & Issue no: Volume 5, Issue 6, November - December 2016
____________________________________________________________________________________________________
Title: |
SIMILARITY MEASURE FOR TEXT CLASSIFICATION |
Author Name: |
Radha Mothukuri, Nagaraju.M, Divya Chilukuri |
Abstract: |
Text processing plays a very important role in information
retrieval, data mining, and web search. Text classification can
efficiently enhance text processing capability by
automatically sorting documents into a predefined set
of categories. Measuring the similarity between documents is
an important operation in the text processing field. In this
work, a similarity measure is proposed to compute the
similarity between two documents with respect to a feature.
The proposed measure takes the following three cases into
account: the feature appears in both documents, the feature
appears in only one document, and the feature appears in
neither document. For the first case, the similarity
increases as the difference between the two involved feature
values decreases. Furthermore, the contribution of the
difference is normally scaled. For the second case, a fixed
value is contributed to the similarity. For the last case, the
feature has no contribution to the similarity. The proposed
measure will be extended to gauge the similarity between two
sets of documents. The effectiveness of this measure will be
evaluated on several real-world data sets for text classification
problems.
Keywords: text mining, classification, similarity measure, accuracy. |
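The three cases described in the abstract can be illustrated with a minimal sketch. The abstract gives no formula, so the details here are assumptions: the Gaussian-shaped scaling of the difference, the spread parameter `sigma`, the fixed penalty `lam`, the final rescaling to [0, 1], and the function names `feature_contribution` and `similarity` are all hypothetical choices made for illustration, not the authors' stated definitions.

```python
import math


def feature_contribution(x, y, sigma=1.0, lam=1.0):
    """Contribution of a single feature to the similarity of two documents.

    x, y  : non-negative feature values (e.g. term weights), one per document
    sigma : assumed spread used to scale the difference (hypothetical)
    lam   : assumed fixed penalty for a feature found in only one document
            (hypothetical)
    """
    if x > 0 and y > 0:
        # Case 1: the feature appears in both documents.
        # The contribution grows as the difference |x - y| shrinks.
        return 0.5 * (1.0 + math.exp(-((x - y) / sigma) ** 2))
    if x > 0 or y > 0:
        # Case 2: the feature appears in only one document.
        # A fixed value is contributed, independent of the magnitude.
        return -lam
    # Case 3: the feature appears in neither document: no contribution.
    return 0.0


def similarity(doc_a, doc_b, sigma=1.0, lam=1.0):
    """Average the contributions over features present in at least one
    document, then rescale the result to the range [0, 1] (assumed)."""
    if len(doc_a) != len(doc_b):
        raise ValueError("documents must use the same feature space")
    total = 0.0
    active = 0
    for x, y in zip(doc_a, doc_b):
        if x > 0 or y > 0:
            total += feature_contribution(x, y, sigma, lam)
            active += 1
    if active == 0:
        # Assumption: two empty documents are given similarity 0.
        return 0.0
    return (total / active + lam) / (1.0 + lam)


if __name__ == "__main__":
    print(similarity([1.0, 0.0, 2.0], [1.0, 0.0, 2.0]))  # identical documents
    print(similarity([1.0, 0.0], [0.0, 1.0]))            # no shared features
    print(similarity([1.0, 2.0], [1.0, 2.5]))            # small difference
```

Under these assumptions, identical documents score 1, documents with no feature in common score 0, and a feature absent from both documents leaves the score unchanged.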
Cite this article: |
Radha Mothukuri, Nagaraju.M, Divya Chilukuri, "SIMILARITY MEASURE FOR TEXT CLASSIFICATION", International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), Volume 5, Issue 6, November - December 2016, pp. 016-024, ISSN 2278-6856.
|