In-Database Learning with Sparse Tensors

Khamis, Mahmoud Abo; Ngo, Hung Q.; Nguyen, XuanLong; Olteanu, Dan; Schleich, Maximilian

arXiv:1703.04780v2 (cs)

[Submitted on 14 Mar 2017 (v1), revised 23 Jun 2017 (this version, v2), latest version 6 Feb 2020 (v5)]

Title: In-Database Learning with Sparse Tensors

Authors: Mahmoud Abo Khamis, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, Maximilian Schleich

Abstract: In-database analytics is of great practical importance as it avoids the costly repeated loop data scientists have to deal with on a daily basis: select features, export the data, convert the data format, train models using an external tool, and reimport the parameters. It is also a fertile ground for theoretically fundamental and challenging problems at the intersection of relational and statistical data models. This paper introduces a unified framework for training and evaluating a class of statistical learning models inside a relational database. This class includes ridge linear regression, polynomial regression, factorization machines, and principal component analysis. We show that, by synergizing key tools from relational database theory, such as schema information, query structure, and recent advances in query evaluation algorithms, and from linear algebra, such as various tensor and matrix operations, one can formulate in-database learning problems and design efficient algorithms to solve them. The algorithms and models proposed in the paper have already been implemented inside the LogicBlox database engine and used in retail-planning and forecasting applications, with significant performance benefits over out-of-database solutions that require the costly data-export loop.

Comments:	36 pages, 4 figures
Subjects:	Databases (cs.DB)
ACM classes:	H.2.4; I.2.6
Cite as:	arXiv:1703.04780 [cs.DB]
	(or arXiv:1703.04780v2 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.1703.04780

Submission history

From: Maximilian Schleich
[v1] Tue, 14 Mar 2017 22:27:09 UTC (47 KB)
[v2] Fri, 23 Jun 2017 21:08:38 UTC (80 KB)
[v3] Wed, 30 May 2018 19:48:12 UTC (79 KB)
[v4] Sun, 18 Nov 2018 12:23:53 UTC (166 KB)
[v5] Thu, 6 Feb 2020 21:16:32 UTC (153 KB)
