Computer Science > Data Structures and Algorithms
[Submitted on 10 Apr 2013 (this version), latest version 29 Jun 2014 (v4)]
Title: Nimble Algorithms for Cloud Computing
Abstract: Cloud computing is a new paradigm where data is stored across multiple servers and the goal is to compute a function of all the data. We consider a simple model where each server uses polynomial time and space, but communication among servers, being more expensive, is ideally bounded by a polylogarithmic function of the input size. We dub algorithms that satisfy these resource bounds "nimble".
The main contribution of the paper is to develop nimble algorithms for several areas that involve massive data and for that reason have been extensively studied in the context of streaming algorithms. The areas are approximation of frequency moments, counting bipartite homomorphisms (the number of copies of a fixed bipartite graph H in a graph G), rank-k approximation to a matrix, and clustering. For frequency moments, we use a new importance sampling technique based on high powers of the frequencies. We reduce the problem of counting homomorphisms to estimating implicitly defined frequency moments. For rank-k approximations, besides recent results of several authors developed in the streaming context, we use a new variant of the random projection method. For clustering, we use our rank-k approximation and the small "coreset" of Chen, of size at most polynomial in the dimension.
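To make the frequency moment problem concrete, the following is a minimal sketch of the quantity being approximated, not the paper's algorithm. It assumes each server holds a list of items and that an item's frequency f_i is its total count over all servers, so the k-th frequency moment is F_k = sum_i f_i^k. The function names and the `words_sent` counter are hypothetical, introduced only for illustration. The baseline shown ships every local frequency vector to a coordinator, which is exactly the communication cost a nimble algorithm is meant to avoid.

```python
from collections import Counter


def local_counts(items):
    """Frequency vector held by one server: item -> number of local occurrences."""
    return Counter(items)


def frequency_moment(servers, k):
    """Exact F_k = sum_i f_i**k, where f_i is item i's total count over all servers.

    Naive baseline (an illustration, not the paper's method): every server
    sends its whole local frequency vector to a coordinator, so the
    communication grows with the number of distinct items.
    """
    total = Counter()
    words_sent = 0
    for items in servers:
        counts = local_counts(items)
        words_sent += len(counts)  # one (item, count) pair per distinct local item
        total.update(counts)       # coordinator adds the local vectors
    moment = sum(f ** k for f in total.values())
    return moment, words_sent


# Three servers, each holding part of the data (made-up example input).
servers = [["a", "b", "a"], ["b", "c"], ["a", "c", "c"]]
f2, sent = frequency_moment(servers, 2)
print(f2, sent)
```

For k > 1 the moment of the combined data is not the sum of the servers' local moments (here the local second moments add up to 12, while the global value is 22), which is why the servers cannot simply exchange one number each.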
In contrast to our algorithms in the cloud computing model, in the streaming model, known lower bound results for frequency moments and rank-k approximations rule out the existence of algorithms that use polylogarithmic space.
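For the rank-k approximation problem named in the abstract, the sketch below illustrates the standard random projection idea and why it suits distributed data. It is not the paper's new variant. It assumes the global matrix is the entrywise sum of the servers' local matrices and that all servers share a random seed. Both are assumptions made for illustration, as are all function names. Because projection is linear, each server can project its own matrix locally and send only the smaller sketch.

```python
import random


def gaussian_matrix(rows, cols, seed):
    """Random Gaussian matrix, reproducible from a seed shared by all servers."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matadd(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def distributed_sketch(local_matrices, sketch_cols, seed):
    """Sum of the local sketches A_t R, which equals (sum_t A_t) R by linearity.

    Each server multiplies its n x d matrix by the same d x sketch_cols
    matrix R and sends only the n x sketch_cols result to a coordinator.
    """
    d = len(local_matrices[0][0])
    r = gaussian_matrix(d, sketch_cols, seed)               # same R on every server
    sketches = [matmul(a_t, r) for a_t in local_matrices]   # computed locally
    total = sketches[0]
    for s in sketches[1:]:
        total = matadd(total, s)                            # coordinator adds sketches
    return total


# Two servers holding 2 x 4 matrices (made-up example input).
a1 = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, 2.0]]
a2 = [[0.5, 0.0, 1.0, 1.0], [2.0, 1.0, 0.0, 0.0]]
sketch = distributed_sketch([a1, a2], sketch_cols=2, seed=7)
```

Each server here sends a 2 x 2 sketch instead of its 2 x 4 matrix. The saving matters when the sketch width is much smaller than the original dimension.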
Submission history
From: Santosh Vempala
[v1] Wed, 10 Apr 2013 23:05:01 UTC (20 KB)
[v2] Mon, 20 May 2013 09:51:00 UTC (20 KB)
[v3] Tue, 16 Jul 2013 13:54:22 UTC (27 KB)
[v4] Sun, 29 Jun 2014 13:42:24 UTC (34 KB)