Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mosquitto broker disconnects local client prematurely - claiming keepalive timeout has occurred. #3138

Open
RobLeighton67 opened this issue Oct 16, 2024 · 12 comments

Comments

@RobLeighton67
Copy link

Setup.
Mosquitto broker v2.0.19 built/running on Pi Zero W .
Operating system is Dietpi (Raspbian variant of Debian Bookworm).
2 clients running on the same device are connected , one is based on the Paho Java Client (keepalive 60s) , the other is mqtt.js based (also keepalive 60s).
Broker also acting as bridge to a remote cloud based broker.
No persistence.

Issue.
Intermittently (after several hours or so, although it can take longer) the Paho client gets suddenly disconnected by the broker which logs a keepalive timeout as the reason for disconnection.
In each case the last pub and sub to Mosquitto have been within the most recent keepalive period (i.e. much less then the 1.5 * keepalive timeout).

Log excerpt:

1728664952: Received PINGREQ from 180645f5
1728664952: Sending PINGRESP to 180645f5
1728664953: Received PUBLISH from 180645f5 (d0, q1, r0, m422, 'gateway2/00000000180645F5/bridge/request/health_check', ... (0 bytes))
1728664953: Sending PUBLISH to mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/request/health_check', ... (0 bytes))
1728664953: Sending PUBACK to 180645f5 (m422, rc0)
1728664953: Received PUBLISH from mqttjs_625841e0 (d0, q0, r1, m0, 'gateway2/00000000180645F5/bridge/devices', ... (22060 bytes))
1728664953: Sending PUBLISH to 180645f5 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/devices', ... (22060 bytes))
1728664953: Sending PUBLISH to mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/devices', ... (22060 bytes))
1728664953: Received PUBLISH from mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/logging', ... (174 bytes))
1728664953: Sending PUBLISH to mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/logging', ... (174 bytes))
1728664953: Received PUBLISH from mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/response/health_check', ... (39 bytes))
1728664953: Sending PUBLISH to 180645f5 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/response/health_check', ... (39 bytes))
1728664953: Sending PUBLISH to mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/response/health_check', ... (39 bytes))
1728664956: Received PINGREQ from mqttjs_625841e0
1728664956: Sending PINGRESP to mqttjs_625841e0
1728664956: Received PUBLISH from mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/logging', ... (608 bytes))
1728664956: Sending PUBLISH to mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/bridge/logging', ... (608 bytes))
1728664956: Sending PINGREQ to local.00000000180645F5
1728664956: Received PUBLISH from mqttjs_625841e0 (d0, q1, r0, m2109, 'gateway2/00000000180645F5/device/XXXXXXX', ... (426 bytes))
1728664956: Sending PUBLISH to mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/device/XXXXXXXX', ... (426 bytes))
1728664956: Sending PUBLISH to local.00000000180645F5 (d0, q1, r0, m806, 'gateway2/00000000180645F5/device/XXXXXXX', ... (426 bytes))
1728664956: Sending PUBACK to mqttjs_625841e0 (m2109, rc0)
1728664956: Received PINGRESP from local.00000000180645F5
1728664956: Received PUBACK from local.00000000180645F5 (Mid: 806, RC:0)
1728664956: Received PUBLISH from mqttjs_625841e0 (d0, q1, r0, m2110, 'gateway2/00000000180645F5/device/XXXXXXX', ... (426 bytes))
1728664956: Sending PUBLISH to mqttjs_625841e0 (d0, q0, r0, m0, 'gateway2/00000000180645F5/device/XXXXXX', ... (426 bytes))
1728664956: Sending PUBLISH to local.00000000180645F5 (d0, q1, r0, m807, 'gateway2/00000000180645F5/device/XXXXXX', ... (426 bytes))
1728664956: Sending PUBACK to mqttjs_625841e0 (m2110, rc0)
1728664956: Received PUBACK from local.00000000180645F5 (Mid: 807, RC:0)
1728664956: Received PINGREQ from mqttjs_625841e0
1728664956: Sending PINGRESP to mqttjs_625841e0
1728664956: Client 180645f5 has exceeded timeout, disconnecting.
1728665091: mosquitto version 2.0.19 terminating

As this log excerpt shows, the timeout is about 4s after the previous PINGREQ from the client...?
This happens with no warnings or anything else seemingly as the cause.

@RobLeighton67
Copy link
Author

Just to add, it's occurred with every version I have used - 2.0.11, 2.0.12. 2.0.18 - whether installed through APT or built on the device.

@RobLeighton67 RobLeighton67 changed the title Mosquitto broker disconnects local client prematurely - assuming keep a;ive timeout Mosquitto broker disconnects local client prematurely - claiming keepalive timeout has occurred. Oct 16, 2024
@ralight
Copy link
Contributor

ralight commented Oct 30, 2024

Do you have any unusual config options set that might be contributing to the problem?

@RobLeighton67
Copy link
Author

Sorry for the slow response, I was away.
I have persistence disabled, cleansession is false, bridge_insecure is false, Otherwise just standard bridge configuration stuff, pretty simple.

@ggrote
Copy link

ggrote commented Nov 13, 2024

I've got the same problem since a few versions, all clients keep getting disconnected stating they would have exceeded timeout.

Zeile  159: 1731510645: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile  298: 1731510729: Client ESP-VitoWifi2604 has exceeded timeout, disconnecting.
Zeile  419: 1731510807: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile  448: 1731510837: Client SHELLYRGBW_GK_1E0632 has exceeded timeout, disconnecting.
Zeile  449: 1731510837: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile  656: 1731510903: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile  763: 1731510963: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile  844: 1731510993: Client ESP-Garage332c has exceeded timeout, disconnecting.
Zeile  919: 1731511017: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile  961: 1731511053: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile 1169: 1731511125: Client nodered_be70a1b3c7203444 has exceeded timeout, disconnecting.
Zeile 1424: 1731511203: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile 1457: 1731511233: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile 1547: 1731511257: Client ESP-Garage9144 has exceeded timeout, disconnecting.
Zeile 1626: 1731511281: Client ESP-Garage3831 has exceeded timeout, disconnecting.
Zeile 1689: 1731511293: Client ESP-Garage28d5 has exceeded timeout, disconnecting.
Zeile 1732: 1731511305: Client SHELLYRGBW_GK_1E0632 has exceeded timeout, disconnecting.
Zeile 1855: 1731511323: Client ESP-Garagea2f3 has exceeded timeout, disconnecting.
Zeile 1954: 1731511359: Client ESP-Garage99e6 has exceeded timeout, disconnecting.
Zeile 2011: 1731511371: Client ESP-Garage2e61 has exceeded timeout, disconnecting.
Zeile 2019: 1731511389: Client ESP-Garagee763 has exceeded timeout, disconnecting.
Zeile 2020: 1731511395: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile 2030: 1731511401: Client ESP-Garage161e has exceeded timeout, disconnecting.
Zeile 2060: 1731511413: Client ESP-Garage60f8 has exceeded timeout, disconnecting.
Zeile 2085: 1731511431: Client ESP-Garage7871 has exceeded timeout, disconnecting.
Zeile 2660: 1731511605: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile 3131: 1731511749: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile 3317: 1731511803: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile 3418: 1731511857: Client ESP-Garageee25 has exceeded timeout, disconnecting.
Zeile 3419: 1731511857: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile 3420: 1731511869: Client ESP-Garage1154 has exceeded timeout, disconnecting.
Zeile 3454: 1731511881: Client ESP-Garagea49d has exceeded timeout, disconnecting.
Zeile 3508: 1731511911: Client nodered_be70a1b3c7203444 has exceeded timeout, disconnecting.
Zeile 3671: 1731511983: Client esp8266-c11e0b has exceeded timeout, disconnecting.
Zeile 3818: 1731512031: Client esp8266-c11e0b has exceeded timeout, disconnecting.

My config is persistence true, allow_anonymous true and a simple bridge configuration.

A few versions ago everything worked fine. But I can't tell in which version.

@RobLeighton67
Copy link
Author

RobLeighton67 commented Nov 13, 2024 via email

@1MikeMakuch
Copy link

I have a client site also seeing "has exceeded timeout" disconnects, seemingly erroneous. It does seem to be correlated to bursts of traffic, but don't yet have that conclusively.

@ggrote
Copy link

ggrote commented Nov 20, 2024

Any news about the problem? Is someone working on the issue?

@karlri
Copy link

karlri commented Nov 22, 2024

This could be the same issue that we were having with system time adjustments causing clients to be kicked:
#3168

@ggrote
Copy link

ggrote commented Nov 22, 2024

Could be possible, but I didn't change any system time. Today I even tried the newest version 1 Build and still got the same issue.

@karlri
Copy link

karlri commented Nov 22, 2024

I would assume that any system without a realtime clock will sync NTP at the earliest convenience which could have the same effect as manually changing the system time. I'm just guessing here.

@RobLeighton67
Copy link
Author

Yeah, I've pretty much ruled out the time-adjustment idea - as you can see from my attached log the recorded timestamps are contiguous, no weird gap or recession. Also the occurrence is much less than my NTP sync which is daily.

@ggrote
Copy link

ggrote commented Nov 23, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants