IJIRST (International Journal for Innovative Research in Science & Technology) ISSN (online): 2349-6010

Improved Cardinality Estimation using Entity Resolution in Crowdsourced Data


International Journal for Innovative Research in Science & Technology
Volume 3, Issue 2
Year of Publication: 2016
Authors: Indulakshmi K R; Preethymol B; Silja Varghese

BibTeX:

@article{IJIRSTV3I2050,
     title={Improved Cardinality Estimation using Entity Resolution in Crowdsourced Data},
     author={Indulakshmi K R and Preethymol B and Silja Varghese},
     journal={International Journal for Innovative Research in Science \& Technology},
     volume={3},
     number={2},
     pages={183--187},
     year={2016},
     url={http://www.ijirst.org/articles/IJIRSTV3I2050.pdf},
     publisher={IJIRST (International Journal for Innovative Research in Science \& Technology)},
}



Abstract:

Crowdsourcing platforms adopt the new Labour as a Service model and allow small tasks to be distributed easily to a large number of workers. Crowdsourced systems introduce the open-world model of databases, in which the database is considered incomplete and data must be collected in real time. For enumeration queries, estimating the cardinality of the crowd-collected data is the basis for monitoring query progress. A statistical tool is proposed to estimate this cardinality, enabling users to judge the completeness of a query over crowdsourced data. Moreover, a crowdsourced database can contain multiple records that represent the same real-world entity. A hybrid human-machine approach is proposed in which machines first make a coarse pass over all the collected data and crowd workers then verify only the most likely matching pairs. Entity resolution merges the duplicate records and can therefore improve the cardinality estimate.
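
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the statistical tool or the similarity measure, so the choices here are assumptions made only for illustration: token-set Jaccard similarity stands in for the machine pass, the bias-corrected Chao1 species-richness estimator stands in for the cardinality estimator, and the function names (jaccard, candidate_pairs, merge, chao1) and the 0.5 threshold are hypothetical. The crowd verification step is not simulated; merge simply takes the pairs that workers are assumed to have confirmed.

```python
from collections import Counter
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two strings (case-insensitive)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def candidate_pairs(records, threshold=0.5):
    """Machine pass: keep only index pairs similar enough to send to the crowd.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(records)), 2)
        if jaccard(records[i], records[j]) >= threshold
    ]


def merge(records, confirmed_pairs):
    """Union-find merge of the pairs that crowd workers confirmed as duplicates.

    Returns one entity id per record; duplicates share the same id.
    """
    parent = list(range(len(records)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in confirmed_pairs:
        parent[find(i)] = find(j)
    return [find(i) for i in range(len(records))]


def chao1(entity_ids):
    """Bias-corrected Chao1 estimate of the total number of distinct entities.

    f1 and f2 are the numbers of entities seen exactly once and exactly twice.
    """
    counts = Counter(entity_ids)
    observed = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    f2 = sum(1 for c in counts.values() if c == 2)
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))
```

In this sketch, calling chao1 on the raw answers treats every spelling variant as a separate singleton, which inflates the estimate; calling it on the ids returned by merge counts each resolved entity once, which is the effect the abstract attributes to entity resolution.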


Keywords:

Cardinality Estimation, Entity Resolution, Hybrid Human Machine, Crowdsourced Data

