International Journal for Innovative Research in Science and Technology :: IJIRST (International Journal for Innovative Research in Science & Technology)

Content based Document Retrieval using Content Extraction


International Journal for Innovative Research in Science & Technology

Volume 4, Issue 2

Year of Publication : 2017

Authors : Ajaykumar Ashok Awad

BibTeX:

@article{IJIRSTV4I2005,
title={Content based Document Retrieval using Content Extraction},
author={Ajaykumar Ashok Awad},
journal={International Journal for Innovative Research in Science & Technology},
volume={4},
number={2},
pages={61--66},
year={2017},
url={http://www.ijirst.org/articles/IJIRSTV4I2005.pdf},
publisher={IJIRST (International Journal for Innovative Research in Science & Technology)},
}

Abstract:

The rapid growth of information has made it hard to find relevant information on the web, which has increased the need for practical Information Retrieval (IR) strategies. Document data contains a large amount of information, which users can easily retrieve using only the title and keywords of a document. We propose a fast and effective content-based document information retrieval (CBDIR) system that retrieves information from the actual content of a document. The proposed system uses a Latent Dirichlet Allocation (LDA) model to extract the major keywords of a given document. To improve performance, we use a MongoDB database for effective document indexing. The B-tree based indexing of MongoDB makes our system more flexible, more effective, and faster than the previous system.
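
The keyword-extraction step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it is a minimal collapsed Gibbs sampler for LDA written with the Python standard library only, and the function name `lda_keywords`, its parameters (`num_topics`, `alpha`, `beta`, `top_n`), and the toy documents are all assumptions made for illustration. Each word of a document is scored by its estimated probability under that document's topic mixture, and the highest-scoring words are returned as keywords.

```python
import random


def lda_keywords(docs, num_topics=2, iterations=200,
                 alpha=0.1, beta=0.01, top_n=3, seed=0):
    """Return up to top_n keywords per tokenized document (hypothetical helper).

    docs is a list of token lists. Topics are fitted with collapsed Gibbs
    sampling; alpha and beta are the usual symmetric Dirichlet priors.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables used by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    def phi(k, wid):
        # Smoothed estimate of p(word | topic k).
        return (topic_word[k][wid] + beta) / (topic_total[k] + vocab_size * beta)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                old = assignments[d][i]
                # Remove the token from the counts, then resample its topic.
                doc_topic[d][old] -= 1
                topic_word[old][wid] -= 1
                topic_total[old] -= 1
                weights = [(doc_topic[d][k] + alpha) * phi(k, wid)
                           for k in range(num_topics)]
                new = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = new
                doc_topic[d][new] += 1
                topic_word[new][wid] += 1
                topic_total[new] += 1

    # Rank each document's words by p(word | document) = sum_k theta_dk * phi_kw.
    keywords = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta = [(doc_topic[d][k] + alpha) / denom for k in range(num_topics)]
        scores = {w: sum(theta[k] * phi(k, word_id[w]) for k in range(num_topics))
                  for w in set(doc)}
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        keywords.append(ranked[:top_n])
    return keywords


# Toy, already-tokenized documents (made up for this example).
toy_docs = [
    "index btree index database query index".split(),
    "topic model topic keyword document topic".split(),
    "database index query btree database".split(),
]
toy_keywords = lda_keywords(toy_docs)
```

In practice the documents would first be tokenized and stripped of stop words, and a library implementation of LDA would normally replace this hand-written sampler; the sketch only shows how per-document keywords can fall out of a fitted topic model.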

Keywords:

Information Retrieval, CBDIR, Inverted Indexing, B-tree Indexing, MongoDB
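
The indexing idea behind the keywords above can be sketched in a few lines. This is an assumption-laden illustration, not the paper's system and not MongoDB: the class `KeywordIndex` and its methods are hypothetical, the inverted index is a plain in-memory dictionary from keyword to document IDs, and a sorted key list searched with `bisect` stands in for the ordered lookups that a B-tree index provides.

```python
import bisect
from collections import defaultdict


class KeywordIndex:
    """In-memory inverted index over extracted keywords (hypothetical sketch)."""

    def __init__(self):
        self._postings = defaultdict(set)  # keyword -> set of document IDs
        self._sorted_keys = []             # keywords kept in sorted order

    def add(self, doc_id, keywords):
        """Register a document under each of its keywords."""
        for kw in keywords:
            kw = kw.lower()
            if kw not in self._postings:
                # Keep keys ordered so prefix lookups can use binary search.
                bisect.insort(self._sorted_keys, kw)
            self._postings[kw].add(doc_id)

    def search(self, keyword):
        """Exact-match lookup: sorted IDs of documents with this keyword."""
        return sorted(self._postings.get(keyword.lower(), ()))

    def prefix_search(self, prefix):
        """Ordered lookup: documents whose keywords start with prefix."""
        prefix = prefix.lower()
        start = bisect.bisect_left(self._sorted_keys, prefix)
        hits = set()
        for kw in self._sorted_keys[start:]:
            if not kw.startswith(prefix):
                break  # sorted order means no later key can match
            hits |= self._postings[kw]
        return sorted(hits)


index = KeywordIndex()
index.add("doc1", ["retrieval", "indexing"])
index.add("doc2", ["retrieval", "mongodb"])
print(index.search("retrieval"))    # ['doc1', 'doc2']
print(index.prefix_search("in"))    # ['doc1']
```

A database-backed version would store one record per document with its keyword list and let the database maintain the ordered index; the in-memory structure here only shows why keyword lookups avoid scanning every document.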


ISSN (online) : 2349-6010
