
Re: Spark Streams vs Kafka Streams

Spark Structured Streaming has some significant limitations compared to
Kafka Streams.

This one has always proved hard to overcome:

"Multiple streaming aggregations (i.e. a chain of aggregations on a
streaming DF) are not yet supported on streaming Datasets."
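
To make "a chain of aggregations" concrete, here is a rough sketch in plain
Python over a small in-memory batch. It is not Spark or Kafka Streams code,
and the events, window size and function names are made up for illustration.
The second aggregation consumes the output of the first, which is the shape
the quoted limitation refers to.

```python
from collections import Counter

# Hypothetical events: (timestamp_seconds, user). Illustrative data only.
EVENTS = [(1, "a"), (2, "a"), (3, "b"), (61, "b"), (62, "b"), (65, "a"), (70, "b")]


def count_per_window_and_user(events, window_seconds=60):
    """First aggregation: count events per (tumbling window start, user)."""
    counts = Counter()
    for ts, user in events:
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, user)] += 1
    return counts


def busiest_user_per_window(counts):
    """Second aggregation, chained on the first: top user per window."""
    best = {}
    # Sorting makes tie-breaking deterministic (first user in sort order wins).
    for (window_start, user), n in sorted(counts.items()):
        if window_start not in best or n > best[window_start][1]:
            best[window_start] = (user, n)
    return best


first = count_per_window_and_user(EVENTS)
second = busiest_user_per_window(first)
print(second)  # {0: ('a', 2), 60: ('b', 3)}
```

On a bounded batch this is trivial. The difficulty in a streaming engine is
that the first aggregate keeps changing as new events arrive, so the second
one has to cope with updates to its input.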





On Thu, 29 Apr. 2021, 8:13 am Parthasarathy, Mohan, <mparthas@hpe.com>
wrote:

> Matthias,
>
> I will create a KIP or ticket for tracking this issue.
>
> -thanks
> Mohan
>
>
> On 4/28/21, 1:01 PM, "Matthias J. Sax" <mjsax@apache.org> wrote:
>
> Feel free to do a KIP and contribute to Kafka!
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> Or create a ticket for tracking.
>
>
> -Matthias
>
> On 4/28/21 12:49 PM, Parthasarathy, Mohan wrote:
> > Andrew,
> >
> > I am not sure I understand. We have built several analytics
> > applications. We typically use custom aggregations as they are not
> > available directly in the library.
> >
> > -mohan
> >
> >
> > On 4/28/21, 12:12 PM, "Andrew Otto" <otto@wikimedia.org> wrote:
> >
> > I'd assume this is because Kafka Streams is positioned for building
> > streaming applications, rather than doing analytics, whereas Spark is
> > more often used for analytics purposes.
> >
> >
>
>
>
