Computer Science > Distributed, Parallel, and Cluster Computing
[Submitted on 7 Jun 2023]
Title:An Analytical Model-based Capacity Planning Approach for Building CSD-based Storage Systems
View PDFAbstract:The data movement in large-scale computing facilities (from compute nodes to data nodes) is categorized as one of the major contributors to high cost and energy utilization. To tackle it, in-storage processing (ISP) within storage devices, such as Solid-State Drives (SSDs), has been explored actively. The introduction of computational storage drives (CSDs) enabled ISP within the same form factor as regular SSDs and made it easy to replace SSDs within traditional compute nodes. With CSDs, host systems can offload various operations such as search, filter, and count. However, commercialized CSDs have different hardware resources and performance characteristics. Thus, it requires careful consideration of hardware, performance, and workload characteristics for building a CSD-based storage system within a compute node. Therefore, storage architects are hesitant to build a storage system based on CSDs as there are no tools to determine the benefits of CSD-based compute nodes to meet the performance requirements compared to traditional nodes based on SSDs. In this work, we proposed an analytical model-based storage capacity planner called CSDPlan for system architects to build performance-effective CSD-based compute nodes. Our model takes into account the performance characteristics of the host system, targeted workloads, and hardware and performance characteristics of CSDs to be deployed and provides optimal configuration based on the number of CSDs for a compute node. Furthermore, CSDPlan estimates and reduces the total cost of ownership (TCO) for building a CSD-based compute node. To evaluate the efficacy of CSDPlan, we selected two commercially available CSDs and 4 representative big data analysis workloads.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.