As with any distributed system, starting from the moment when Pinot receives a query request to the moment the user receives a response, several hosts are at play. As the number of these hosts increases, the blast-radius for failures also increases. LinkedIn’s large Pinot deployment faced a number of resiliency issues related to failures, slowness, and undesirable user access-patterns. Providing a robust query processing framework resilient to such issues becomes paramount. To improve the resiliency and stability of our production clusters, we strengthened the Query Processing Resiliency in Pinot by providing two main capabilities:
1. Adaptive Server Selection at Pinot Broker to route queries intelligently to the best available servers
2. RunTime Query Killing to protect the Pinot Servers from crashing
Adaptive Server Selection
When the Pinot Broker routes queries, it has the option to pick some select servers based on the replication factor. Pinot currently uses a light-weight Server-Selection layer which follows a naive round-robin approach. Pinot Servers do the bulk of the heavy-lifting during query processing. Therefore, selecting the right server(s) to process a query has significant impact on the query processing latency (eg: a single slow server becoming overall bottleneck).
LinkedIn developed an intelligent yet light-weight Server Selection framework to adaptively route queries to the best available server(s). The framework collects various stats regarding the health and capabilities of each Server like Server Latency, Number of In-flight Requests. Using these Server Statistics, the broker now intelligently routes queries to the best available server.
LinkedIn has rolled out this framework to its entire production deployment. In the first part of this presentation, we will cover the design, implementation details and the benefits observed in LinkedIn production. It has improved the overall Availability of our deployment and significantly reduced the need for manual intervention.
Query Pre-emption
In the second part of the presentation, we plan to cover the Query Killing framework design, instrumentation instructions, production benefits, and future extensions.
Pinot has successfully supported high-qps analytical queries on large data sets within LinkedIn. However, resource usage accounting/control remains a challenge. A typical scenario is when unplanned expensive queries / adhoc data exploration queries land, Pinot brokers/servers can be brought down by out of memory errors or held up by a bunch of CPU intensive queries. To deal with this, we introduced a light-weight yet responsive framework to account the CPU and memory footprint of each query. This framework employs a lockless thread usage sampling algorithm, which can easily instrument generic multi-threaded query execution code with adjustable accuracy-overhead trade-off. The usage stats sampled from all threads serving the same queries will be periodically aggregated and ranked. On top of these collected usage numbers, we implemented a mechanism to interrupt culprit queries which degrade the cluster availability/latency SLA. Our experiment and production numbers demonstrate that this framework can effectively find and kill the offending queries while incurring minimum performance overhead.