An Iterative Scheme for Leverage-based Approximate Aggregation

Han, Shanshan; Wang, Hongzhi; Wan, Jialin; Li, Jianzhong

Computer Science > Databases

arXiv:1711.01960 (cs)

[Submitted on 6 Nov 2017 (v1), last revised 22 Jan 2019 (this version, v4)]

Abstract: The current data explosion makes it challenging to perform approximate aggregation both efficiently and accurately. To address this problem, we propose a novel approach that computes highly accurate aggregation answers using only a small portion of the data. We introduce leverages to reflect individual differences among the samples from a statistical perspective. Two kinds of estimators, the leverage-based estimator and the sketch estimator (a "rough picture" of the aggregation answer), constrain each other and are iteratively improved according to the actual conditions until their difference falls below a threshold. Owing to the iteration mechanism and the leverages, our approach achieves high accuracy. Moreover, it does not need to record the sampled data and extends easily to various execution modes (e.g., the online mode), which makes it well suited to big data. Experiments show that our approach performs strongly: compared with uniform sampling, it achieves high-quality answers with only 1/3 of the sample size.
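For context, the uniform-sampling baseline that the abstract compares against can be sketched as below. This is not the paper's leverage-based method, whose details are not given on this page. The function name `uniform_sample_avg`, the synthetic Gaussian data, and the sample sizes are illustrative assumptions.

```python
import math
import random


def uniform_sample_avg(data, sample_size, seed=0):
    """Estimate AVG(data) from a uniform random sample drawn without replacement.

    Returns (estimate, standard_error). The standard error uses the usual
    s / sqrt(n) approximation and ignores the finite-population correction.
    Hypothetical helper for illustration only; requires sample_size >= 2.
    """
    rng = random.Random(seed)
    sample = rng.sample(data, sample_size)
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance (divides by n - 1).
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return mean, math.sqrt(var / n)


if __name__ == "__main__":
    # Synthetic data (an assumption, not a dataset from the paper).
    rng = random.Random(42)
    data = [rng.gauss(100.0, 15.0) for _ in range(100_000)]
    exact = sum(data) / len(data)
    for n in (300, 900):
        est, se = uniform_sample_avg(data, n)
        print(f"n={n}: estimate={est:.2f}, std. error={se:.2f}, exact={exact:.2f}")
```

Under uniform sampling the standard error shrinks roughly as 1/sqrt(n), so tripling the sample size reduces it by a factor of about 1.73. That scaling is the baseline cost against which a claim of "1/3 of the sample size" can be read.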

Comments:	17 pages, 9 figures
Subjects:	Databases (cs.DB)
Cite as:	arXiv:1711.01960 [cs.DB]
	(or arXiv:1711.01960v4 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.1711.01960

Submission history

From: Jialin Wan
[v1] Mon, 6 Nov 2017 15:34:35 UTC (3,131 KB)
[v2] Thu, 18 Oct 2018 13:38:36 UTC (608 KB)
[v3] Tue, 30 Oct 2018 15:18:04 UTC (608 KB)
[v4] Tue, 22 Jan 2019 08:44:46 UTC (613 KB)
