into one or more partitions, which is an ordered, replayable sequence of records. • Task: the unit of parallelism of the job, just as the partition is to the stream. コンテナ数を増やすとスケール (ただし多重度の限界はprtition数に依存)
assignment of tasks across the individual containers – monitor the liveness of individual containers – redistribute the tasks among the remaining ones during a failure
AM in YARN, is the control hub for Samza applications running on Kubernetes. It is responsible for requesting Pods from Kubernetes and coordinating work assignment across Pods. • Below graph describes the lifecycle of a Samza application running on Kubernetes. 参考<https://cwiki.apache.org/confluence/display/SAMZA/SEP-20%3A+Samza+on+Kubernetes>
Hot-standby containers to support applications with strict downtime requirements • Making it easy to auto-scale and auto-tune Samza applications • Supporting machine learning related use cases • Enabling end-to-end exactly once processing 参考< https://engineering.linkedin.com/blog/2018/11/samza-1-0--stream-processing-at-massive-scale>