Content based Document Retrieval using Content Extraction

BibTeX:
@article{IJIRSTV4I2005,

Abstract:
The rapid growth of information has made it hard to find relevant information on the web, which increases the need for practical Information Retrieval (IR) strategies. Documents contain a large amount of information, but users usually search it using only a document's title and keywords. We propose a fast and effective content-based document information retrieval (CBDIR) system that retrieves information from the actual content of a document. The proposed system uses a Latent Dirichlet Allocation (LDA) model to extract the major keywords of a given document. To improve performance, we use a MongoDB database for document indexing. MongoDB's B-tree based indexing makes our system more flexible, more effective, and faster than the previous system.

Keywords:
Information Retrieval, CBDIR, Inverted Indexing, B-tree Indexing, MongoDB
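
The pipeline outlined in the abstract (extract the major keywords from each document's content, then index documents by those keywords) can be illustrated with a minimal sketch. This is not the authors' implementation: plain term frequency stands in for the LDA keyword extraction, an in-memory dictionary stands in for the MongoDB-backed index, and every name here (`STOPWORDS`, `extract_keywords`, `build_inverted_index`, `search`, the sample `DOCS`) is hypothetical.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "in", "to", "of", "and", "each", "can", "is", "are", "with"}


def extract_keywords(text, top_n=2):
    """Stand-in for LDA: return the top_n most frequent content words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


def build_inverted_index(docs, top_n=2):
    """Map each extracted keyword to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for keyword in extract_keywords(text, top_n):
            index[keyword].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents indexed under every query term."""
    terms = re.findall(r"[a-z]+", query.lower())
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical sample documents, used only to exercise the sketch.
DOCS = {
    "d1": "MongoDB index: a B-tree index in MongoDB keeps index keys sorted.",
    "d2": "Topic models assign topic weights to words; each topic groups related words.",
    "d3": "MongoDB stores retrieval results; retrieval queries hit MongoDB.",
}

INDEX = build_inverted_index(DOCS)
```

With these sample documents, `search(INDEX, "mongodb")` returns `["d1", "d3"]` and `search(INDEX, "mongodb retrieval")` returns `["d3"]`. In a system like the one the abstract describes, the keyword list would come from a trained LDA model and the keyword-to-document lookup would be served by a database index rather than a Python dictionary.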