
Rafiki: Task-Level Capacity Planning in Distributed Stream Processing Systems

EasyChair Preprint 6466

12 pages. Date: August 29, 2021

Abstract

Distributed Stream Processing is a valuable paradigm for reliably processing vast amounts of data at high throughput rates with low end-to-end latencies. Most systems of this type offer fine-grained control over the parallelism of individual tasks within a streaming job. Adjusting the parallelism of tasks has a direct impact on the overall throughput a job can provide as well as on the amount of resources required to provide an adequate level of service. However, finding optimal parallelism configurations that meet the expected Quality of Service requirements is no small feat.

In this paper we present Rafiki, an approach to automatically determine optimal parallelism configurations for Distributed Stream Processing jobs. Rafiki conducts a number of proactive profiling runs to gather information about the processing capacities of individual tasks, which makes it possible to select specific utilization targets. Understanding the capacity information enables users to adequately provision resources so that streaming jobs can deliver the desired level of service at a reduced operational cost with predictable recovery times. We implemented a prototype of Rafiki with Apache Flink and demonstrate its usefulness experimentally.
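
To make the idea of task-level capacity planning concrete, the following is a minimal sketch and not the method from the paper. It assumes that profiling has already produced a maximum processing rate per parallel instance for each task, that every task sees the same input rate, and that capacity scales linearly with parallelism. The function name `required_parallelism`, the task names, and all numbers are hypothetical and chosen for illustration only.

```python
import math


def required_parallelism(target_rate, capacity_per_instance, utilization_target):
    """Smallest parallelism that keeps each instance at or below the utilization target.

    Hypothetical helper: assumes capacity scales linearly with parallelism.
    """
    if target_rate < 0:
        raise ValueError("target_rate must be non-negative")
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    if not 0 < utilization_target <= 1:
        raise ValueError("utilization_target must be in (0, 1]")
    # Rate one instance may handle while staying at the utilization target.
    usable_rate = capacity_per_instance * utilization_target
    # Every task needs at least one instance.
    return max(1, math.ceil(target_rate / usable_rate))


# Illustrative (made-up) profiled capacities in records per second per instance.
profiled_capacity = {"source": 50_000.0, "map": 20_000.0, "window": 8_000.0}

# Assumed workload: 100,000 records/s at every task, 80% utilization target.
plan = {
    task: required_parallelism(100_000.0, capacity, 0.8)
    for task, capacity in profiled_capacity.items()
}
print(plan)
```

Under these assumptions, a lower utilization target leaves more headroom per instance, for example to catch up after a failure, at the price of a higher parallelism and therefore more resources.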

Keyphrases: Parallelization, Profiling, Quality of Service, capacity planning, distributed stream processing, performance modeling, resource optimization

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:6466,
  author    = {Benjamin J. J. Pfister and Wolf Simon Lickefett and Jan Nitschke and Sumit Paul and Morgan K. Geldenhuys and Dominik Scheinert and Kordian Gontarska and Lauritz Thamsen},
  title     = {Rafiki: Task-Level Capacity Planning in Distributed Stream Processing Systems},
  howpublished = {EasyChair Preprint 6466},
  year      = {EasyChair, 2021}}