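We have a Kafka Streams application (2.0) talking to Kafka brokers (1.1.0). The Streams application keeps reprocessing the whole log for no apparent reason: the application is not restarted and there is no rebalance, it is just sitting there. In some cases it is in the middle of processing messages; in others it is idle, waiting for new messages (having last processed messages six hours earlier). We have done quite a bit of research and ruled out one potential cause by setting offset-retention-minutes to 1 week, the same as our message retention time. Besides, offset expiry could not have been the root cause anyway, because the consumer group offsets were reset while the application was actively processing messages.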
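For reference, the retention arithmetic can be sketched as follows. This assumes that offset-retention-minutes above refers to the broker property `offsets.retention.minutes` and that the value goes into the broker's `server.properties`; the variable names are only illustrative.

```python
from datetime import timedelta

# Assumption: messages are retained for one week, as stated above.
message_retention = timedelta(weeks=1)

# Offset retention is expressed in minutes, so convert one week to minutes.
offsets_retention_minutes = int(message_retention.total_seconds() // 60)

# Assumed target: the broker property offsets.retention.minutes in server.properties.
broker_setting = f"offsets.retention.minutes={offsets_retention_minutes}"
print(broker_setting)
```

With one week of message retention this works out to 10080 minutes.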
There is nothing interesting in the broker logs at the time of the incident:
[2019-02-21 09:02:20,009] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2019-02-21 09:12:20,009] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2019-02-21 09:12:51,084] INFO [ProducerStateManager partition=MY_TOPIC-1] Writing producer snapshot at offset 422924 (kafka.log.ProducerStateManager)
[2019-02-21 09:12:51,085] INFO [Log partition=MY_TOPIC-1, dir=/data1/kafka] Rolled new log segment at offset 422924 in 1 ms. (kafka.log.Log)
[2019-02-21 09:14:56,384] INFO [ProducerStateManager partition=MY_TOPIC-12] Writing producer snapshot at offset 295610 (kafka.log.ProducerStateManager)
[2019-02-21 09:14:56,384] INFO [Log partition=MY_TOPIC-12, dir=/data1/kafka] Rolled new log segment at offset 295610 in 1 ms. (kafka.log.Log)
[2019-02-21 09:15:19,365] INFO [ProducerStateManager partition=__transaction_state-8] Writing producer snapshot at offset 3939084 (kafka.log.ProducerStateManager)
[2019-02-21 09:15:19,365] INFO [Log partition=__transaction_state-8, dir=/data1/kafka] Rolled new log segment at offset 3939084 in 0 ms. (kafka.log.Log)
[2019-02-21 09:21:26,755] INFO [ProducerStateManager partition=MY_TOPIC-9] Writing producer snapshot at offset 319799 (kafka.log.ProducerStateManager)
[2019-02-21 09:21:26,755] INFO [Log partition=MY_TOPIC-9, dir=/data1/kafka] Rolled new log segment at offset 319799 in 1 ms. (kafka.log.Log)
[2019-02-21 09:22:20,009] INFO [GroupMetadataManager brokerId=2] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2019-02-21 09:23:31,283] INFO [ProducerStateManager partition=__consumer_offsets-17] Writing producer snapshot at offset 47345110 (kafka.log.ProducerStateManager)
[2019-02-21 09:23:31,297] INFO [Log partition=__consumer_offsets-17, dir=/data1/kafka] Rolled new log segment at offset 47345110 in 28 ms. (kafka.log.Log)
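To check a longer stretch of broker log for offset expiry, something like the following rough sketch could be used. It assumes the GroupMetadataManager lines always have the format shown above; the function name `expired_offset_events` is hypothetical, and the sample lines are copied from the excerpt.

```python
import re

# Matches the GroupMetadataManager cleanup line in the format seen in the excerpt above.
EXPIRED_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] INFO \[GroupMetadataManager brokerId=(?P<broker>\d+)\] "
    r"Removed (?P<count>\d+) expired offsets"
)


def expired_offset_events(lines):
    """Return (timestamp, broker_id, count) for each cleanup pass that removed offsets."""
    events = []
    for line in lines:
        match = EXPIRED_RE.match(line)
        if match and int(match.group("count")) > 0:
            events.append(
                (match.group("ts"), int(match.group("broker")), int(match.group("count")))
            )
    return events


# Two lines taken from the broker log excerpt above.
sample = [
    "[2019-02-21 09:02:20,009] INFO [GroupMetadataManager brokerId=2] "
    "Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)",
    "[2019-02-21 09:12:51,084] INFO [ProducerStateManager partition=MY_TOPIC-1] "
    "Writing producer snapshot at offset 422924 (kafka.log.ProducerStateManager)",
]

print(expired_offset_events(sample))
```

An empty result means no cleanup pass in the scanned lines removed any offsets, which matches the "Removed 0 expired offsets" entries in the excerpt.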
Of course, there is nothing in the application logs either (even with the log level set to DEBUG).
Any ideas what could be causing this issue?