Improved Cardinality Estimation using Entity Resolution in Crowdsourced Data
BibTeX:
@article{IJIRSTV3I2050,
Abstract:
Crowdsourcing platforms adopt the new Labour as a Service model and allow small tasks to be distributed easily to a large number of workers. Crowdsourced systems introduce the open-world model of databases, in which the database is considered incomplete and data must be collected in real time. For enumeration queries, estimating the cardinality of the crowd-collected data makes it possible to monitor query progress. A statistical tool is proposed to estimate this cardinality, enabling users to judge the query completeness of crowdsourced data. Moreover, a crowdsourced database contains multiple records representing the same real-world entity. A hybrid human-machine approach is proposed in which machines first make a coarse pass over all the collected data and crowd workers then verify only the most likely matching pairs. Entity resolution merges the duplicate records and hence can improve the cardinality estimation.
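The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the statistical tool or the similarity measure, so everything specific here is an assumption made for illustration: the bias-corrected Chao1 estimator stands in for the cardinality estimator, token-level Jaccard similarity with a 0.5 threshold stands in for the machine pass, and the hypothetical `crowd_confirms` function simulates crowd workers. The sample answers are invented.

```python
from collections import Counter
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def candidate_pairs(values, threshold=0.5):
    """Machine pass: keep only pairs similar enough to send to the crowd."""
    distinct = sorted(set(values))
    return [(a, b) for a, b in combinations(distinct, 2)
            if jaccard(a, b) >= threshold]


def crowd_confirms(a: str, b: str) -> bool:
    """Hypothetical stand-in for crowd verification (simulated answers)."""
    return (a, b) in {("new york", "new york city")}


def resolve(values, confirmed_pairs):
    """Merge confirmed duplicates with union-find; return canonical values."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in confirmed_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Use the lexicographically smaller string as the canonical form.
            parent[max(ra, rb)] = min(ra, rb)
    return [find(v) for v in values]


def chao1(values) -> float:
    """Bias-corrected Chao1 estimate of the total number of distinct items."""
    counts = Counter(values)
    observed = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)  # items seen once
    f2 = sum(1 for c in counts.values() if c == 2)  # items seen twice
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


# Invented crowd answers to an enumeration query.
answers = ["new york", "new york city", "boston", "boston",
           "chicago", "nyc", "new york"]

confirmed = [p for p in candidate_pairs(answers) if crowd_confirms(*p)]
resolved = resolve(answers, confirmed)

print("estimate before entity resolution:", chao1(answers))
print("estimate after entity resolution: ", chao1(resolved))
```

In this toy run the machine pass forwards a single candidate pair to the crowd, and merging it lowers the estimate because one spurious singleton disappears. "nyc" is never proposed as a candidate because it shares no tokens with the other answers, which shows how a coarse machine pass trades recall for fewer crowd tasks.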
Keywords:
Cardinality Estimation, Entity Resolution, Hybrid Human Machine, Crowdsourced Data



