You could try increasing retries and see if that helps, as well as lowering
the producer batch size. (I think the retries default is
Integer.MAX_VALUE when you're on Kafka Streams version 2.1 or higher, so you
can definitely increase it beyond 5.) Additionally, you could look at the
"delivery.timeout.ms" config property. The default is 2 minutes, but you could
experiment with increasing it as well. Another property to check if you're
getting timeout exceptions would be "default.api.timeout.ms". Those are
just some initial ideas, good luck!
Alex
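
P.S. In case it helps, here is a minimal sketch of how the overrides mentioned
above might be collected before being passed to the KafkaStreams constructor.
The class and method names are made up for illustration, and every value is
just a starting guess to experiment with, not a recommendation. I'm assuming
the "producer." and "consumer." prefixes that Kafka Streams uses to route
settings to its internal clients, and plain string keys so the snippet compiles
without the Kafka jars.

```java
import java.util.Properties;

// Hypothetical helper class; not part of Kafka or of the application discussed here.
class StreamsTimeoutTuning {

    // Collects the overrides suggested in the reply. All values are illustrative.
    static Properties tuningOverrides() {
        Properties props = new Properties();
        // Retry (effectively) forever; delivery.timeout.ms then bounds the total time.
        props.setProperty("producer.retries", String.valueOf(Integer.MAX_VALUE));
        // Smaller batches than the 16384-byte producer default (assumed value).
        props.setProperty("producer.batch.size", "8192");
        // Raise the delivery timeout from the 2-minute default to 5 minutes (assumed value).
        props.setProperty("producer.delivery.timeout.ms", "300000");
        // Give blocking consumer API calls more time before timing out (assumed value).
        props.setProperty("consumer.default.api.timeout.ms", "120000");
        return props;
    }

    public static void main(String[] args) {
        // Print the overrides so they can be checked before wiring them in.
        tuningOverrides().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

You would merge these into the Properties that already hold application.id,
bootstrap.servers and so on. If the Kafka jars are on the classpath,
StreamsConfig.producerPrefix(...) and StreamsConfig.consumerPrefix(...) should
build the same prefixed keys.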
On Thu, Sep 26, 2019 at 6:02 PM Xiyuan Hu <xiyuan.huhu@gmail.com> wrote:
> Thanks Alex! Some updates:
>
> I tried restarting the service with the staging pool, which has far less
> traffic than the production environment. After the restart, the application
> works fine without issues. I assume I can't restart the service in
> production, is caused by the huge lag in pro...