
Volume & Issue no: Volume 3, Issue 5, September - October 2014

____________________________________________________________________________________________________

Title:
Vector space model for deep web data retrieval and extraction
Author Name:
Dr. Poonam Yadav
Abstract:
Deep web data extraction has recently become a challenging problem because the structured data in deep web pages is embedded in intricate page structures. Consequently, extracting data from deep web pages has received much attention from researchers. In this research, a vector space model and content features are used for deep web data extraction. First, the retrieved deep web pages are taken as input to the proposed method and a Document Object Model (DOM) tree is constructed. Using the DOM tree, the information in each web page is split into blocks, and each block with its contents is passed to the feature computation process. Here, frequency-level, title-level, and numerical-level features are calculated after constructing the vector space model, which is a vector of words and their frequencies. Based on the feature score of every block, the important blocks are chosen as the final useful data for the given web page. The proposed approach is implemented using deep web pages collected from the Complete Planet web site, and the performance of the system is evaluated using precision and recall.
Keywords:
Deep web data extraction, deep search engine, web data extraction, DOM tree, precision, recall
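The pipeline described in the abstract (split a page into blocks, build a vector space model of words and frequencies, score each block on frequency-level, title-level, and numerical-level features, keep the highest-scoring blocks) can be illustrated with a minimal Python sketch. The abstract does not give the feature formulas, so everything specific here is an assumption made for illustration: the set of block-level tags, the three feature definitions, the equal-weight average used as the block score, and the names BlockParser, tokenize, and rank_blocks. It is not the author's implementation.

```python
import re
from collections import Counter
from html.parser import HTMLParser

BLOCK_TAGS = {"div", "p", "li", "td", "section"}  # assumed block-level tags


class BlockParser(HTMLParser):
    """Collects the page title and the text of each block-level element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.blocks = []
        self._in_title = False
        self._buf = []

    def flush(self):
        # Close the current block: normalize whitespace, keep non-empty text.
        text = " ".join("".join(self._buf).split())
        if text:
            self.blocks.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        else:
            self._buf.append(data)


def tokenize(text):
    """Lowercase words and numbers (integers or decimals)."""
    return re.findall(r"[a-z]+|\d+(?:\.\d+)?", text.lower())


def rank_blocks(html):
    """Return (score, block_text) pairs, highest score first."""
    parser = BlockParser()
    parser.feed(html)
    parser.close()
    parser.flush()

    title_terms = set(tokenize(parser.title))
    block_tokens = [tokenize(b) for b in parser.blocks]
    # Vector space model: every word on the page with its frequency.
    vsm = Counter(t for toks in block_tokens for t in toks)
    max_count = max(vsm.values(), default=1)

    scored = []
    for text, toks in zip(parser.blocks, block_tokens):
        if not toks:
            continue
        # Assumed frequency-level feature: mean normalized page frequency.
        freq = sum(vsm[t] / max_count for t in toks) / len(toks)
        # Assumed title-level feature: share of title words found in the block.
        title = len(title_terms & set(toks)) / len(title_terms) if title_terms else 0.0
        # Assumed numerical-level feature: share of numeric tokens.
        num = sum(t[0].isdigit() for t in toks) / len(toks)
        scored.append(((freq + title + num) / 3, text))
    return sorted(scored, reverse=True)


def precision_recall(extracted, relevant):
    """Standard precision and recall over sets of extracted and relevant items."""
    extracted, relevant = set(extracted), set(relevant)
    hits = len(extracted & relevant)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


SAMPLE = """<html><head><title>Laptop deals</title></head><body>
<div>Home About Contact</div>
<div>Laptop deals: Acme laptop 15 inch, price 499.99</div>
<div>Copyright notice</div>
</body></html>"""

ranked = rank_blocks(SAMPLE)
print(ranked[0][1])
```

On the small made-up page above, the product block ranks first because it shares words with the title and contains numeric values, while the navigation and copyright blocks score low. A real system would need a proper DOM library and tuned feature weights.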
Cite this article:
Dr. Poonam Yadav, "Vector space model for deep web data retrieval and extraction", International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), Volume 3, Issue 5, September - October 2014, pp. 274-276, ISSN 2278-6856.


 
