Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenThread Border Router 2.12.0 no longer starting after upgrading from 2.11.1 #3826

Open
gomactrading opened this issue Nov 17, 2024 · 27 comments

Comments

@gomactrading
Copy link

Describe the issue you are experiencing

OpenThread Border Router 2.12.0 no longer starting after upgrading from 2.11.1. HAOS core 2024_11.2 on Qnap NAS in VM. Smlight SLZB-06p7 LAN adapter. Rebooted adapter, restarted HA, uninstalled and re-installed the add-on and the integrations. Add-on starts for a few seconds and turns itself off. Reverting to OTBR 2.11.1 brings everything back to normal. Does anyone else encounter the same issue particularly with this adapter?

What type of installation are you running?

Home Assistant OS

Which operating system are you running on?

Home Assistant Operating System

Which add-on are you reporting an issue with?

OpenThread Border Router

What is the version of the add-on?

2.12.0

Steps to reproduce the issue

  1. Installed add-on
  2. Restarted HAOS
  3. Restarted adapter
  4. Uninstalled and re-installed add-on
  5. Reverted to 2.11.1

System Health information

System Information

version core-2024.11.2
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.12.4
os_name Linux
os_version 6.6.54-haos
arch x86_64
timezone America/New_York
config_dir /config
Home Assistant Community Store
GitHub API ok
GitHub Content ok
GitHub Web ok
HACS Data ok
GitHub API Calls Remaining 4992
Installed Version 2.0.1
Stage running
Available Repositories 1459
Downloaded Repositories 6
Home Assistant Cloud
logged_in false
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 13.2
update_channel stable
supervisor_version supervisor-2024.11.2
agent_version 1.6.0
docker_version 27.2.0
disk_total 48.5 GB
disk_used 5.9 GB
healthy true
supported true
host_connectivity true
supervisor_connectivity true
ntp_synchronized true
virtualization kvm
board ova
supervisor_api ok
version_api ok
installed_addons File editor (5.8.0), Samba share (12.3.2), Matter Server (6.6.0), Get HACS (1.3.1), OpenThread Border Router (2.11.1)
Dashboards
dashboards 2
resources 0
views 0
mode storage
Recorder
oldest_recorder_run November 7, 2024 at 1:04 AM
current_recorder_run November 15, 2024 at 10:17 PM
estimated_db_size 214.33 MiB
database_engine sqlite
database_version 3.45.3

Anything in the Supervisor logs that might be useful for us?

No response

Anything in the add-on logs that might be useful for us?

No response

Additional information

No response

@wpfog
Copy link

wpfog commented Nov 18, 2024

I have the same problem here on a HA Yellow. Had to do a rollback to 2.11.1

@gomactrading
Copy link
Author

gomactrading commented Nov 19, 2024 via email

@Xeevis
Copy link

Xeevis commented Nov 20, 2024

Just noticed OTBR integration error and addon in reboot loop, I have ConBee II and assume it's the same problem. Since I don't have any thread devices atm I won't bother with rollback to test with 2.11.1.

Addon log
[14:36:54] INFO: The otbr-web is disabled.
[14:36:55] INFO: Enabled socat-otbr-tcp.
s6-rc: info: service socat-otbr-tcp: starting
s6-rc: info: service mdns: starting
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service mdns successfully started
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service banner: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
[14:36:56] INFO: Starting mDNS Responder...
Default: mDNSResponder (Engineering Build) (Nov 15 2024 16:35:01) starting
[14:36:56] INFO: Starting socat TCP client for OTBR daemon...
s6-rc: info: service socat-otbr-tcp successfully started
-----------------------------------------------------------
 Add-on: OpenThread Border Router
 OpenThread Border Router add-on
-----------------------------------------------------------
 Add-on version: 2.12.0
 You are running the latest version of this add-on.
 System: Home Assistant OS 13.2  (amd64 / qemux86-64)
 Home Assistant Core: 2024.11.2
 Home Assistant Supervisor: 2024.11.2
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[14:36:58] INFO: Flashing firmware is disabled
s6-rc: info: service universal-silabs-flasher successfully started
s6-rc: info: service otbr-agent: starting
[14:37:00] INFO: Setup OTBR firewall...
[14:37:00] INFO: Starting otbr-agent...
[NOTE]-AGENT---: Running 0.3.0-b041fa5-dirty
[NOTE]-AGENT---: Thread version: 1.3.0
[NOTE]-AGENT---: Thread interface: wpan0
[NOTE]-AGENT---: Radio URL: spinel+hdlc+uart:///tmp/ttyOTBR?uart-baudrate=115200&uart-flow-control
[NOTE]-AGENT---: Radio URL: trel://enp0s3
[NOTE]-ILS-----: Infra link selected: enp0s3
49d.17:15:30.927 [C] P-SpinelDrive-: Software reset co-processor successfully
00:00:00.132 [N] RoutingManager: BR ULA prefix: fd1d:2f12:3491::/48 (loaded)
00:00:00.132 [N] RoutingManager: Local on-link prefix: fdd1:92c5:ef2e:eb73::/64
s6-rc: info: service otbr-agent successfully started
s6-rc: info: service otbr-agent-rest-discovery: starting
s6-rc: info: service otbr-agent-configure: starting
00:00:01.156 [N] Mle-----------: Role disabled -> detached
00:00:01.367 [N] P-Netif-------: Changing interface state to up.
read: Connection reset by peer
s6-rc: warning: unable to start service otbr-agent-configure: command exited 1
00:00:02.121 [W] P-Netif-------: Failed to process request#2: No such process
00:00:02.129 [W] P-Netif-------: Failed to process request#6: No such process
[14:37:03] INFO: Successfully sent discovery information to Home Assistant.
s6-rc: info: service otbr-agent-rest-discovery successfully started
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.
/run/s6/basedir/scripts/rc.init: fatal: stopping the container.
s6-rc: info: service otbr-agent-rest-discovery: stopping
s6-rc: info: service otbr-agent-rest-discovery successfully stopped
s6-rc: info: service otbr-agent: stopping
00:00:03.073 [N] Mle-----------: Role detached -> disabled
[14:37:03] INFO: otbr-agent exited with code 0 (by signal 0).
Chain OTBR_FORWARD_INGRESS (0 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere             PKTTYPE = unicast
DROP       all  --  anywhere             anywhere             match-set otbr-ingress-deny-src src
ACCEPT     all  --  anywhere             anywhere             match-set otbr-ingress-allow-dst dst
DROP       all  --  anywhere             anywhere             PKTTYPE = unicast
ACCEPT     all  --  anywhere             anywhere            
otbr-ingress-deny-src
otbr-ingress-deny-src-swap
otbr-ingress-allow-dst
otbr-ingress-allow-dst-swap
Chain OTBR_FORWARD_EGRESS (0 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
[14:37:04] INFO: OTBR firewall teardown completed.
s6-rc: info: service otbr-agent successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service universal-silabs-flasher: stopping
s6-rc: info: service socat-otbr-tcp: stopping
s6-rc: info: service mdns: stopping
2024/11/20 14:37:04 socat[83] W exiting on signal 15
Default: mDNSResponder (Engineering Build) (Nov 15 2024 16:35:01) stopping
s6-rc: info: service socat-otbr-tcp successfully stopped
s6-rc: info: service universal-silabs-flasher successfully stopped
s6-rc: info: service banner: stopping
s6-rc: info: service banner successfully stopped
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped
[14:37:04] INFO: mDNS ended with exit code 4 (signal 0)...
s6-rc: info: service mdns successfully stopped
[14:50:44] INFO: The otbr-web is disabled.
[14:50:44] INFO: Enabled socat-otbr-tcp.
s6-rc: info: service socat-otbr-tcp: starting
s6-rc: info: service mdns: starting
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service banner: starting
s6-rc: info: service mdns successfully started
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
[14:50:47] INFO: Starting mDNS Responder...
Default: mDNSResponder (Engineering Build) (Nov 15 2024 16:35:01) starting
[14:50:47] INFO: Starting socat TCP client for OTBR daemon...
-----------------------------------------------------------
s6-rc: info: service socat-otbr-tcp successfully started
 Add-on: OpenThread Border Router
 OpenThread Border Router add-on
-----------------------------------------------------------
 Add-on version: 2.12.0
 You are running the latest version of this add-on.
 System: Home Assistant OS 13.2  (amd64 / qemux86-64)
 Home Assistant Core: 2024.11.2
 Home Assistant Supervisor: 2024.11.2
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[14:50:49] INFO: Flashing firmware is disabled
s6-rc: info: service universal-silabs-flasher successfully started
s6-rc: info: service otbr-agent: starting
[14:50:51] INFO: Setup OTBR firewall...
[14:50:51] INFO: Starting otbr-agent...
[NOTE]-AGENT---: Running 0.3.0-b041fa5-dirty
[NOTE]-AGENT---: Thread version: 1.3.0
[NOTE]-AGENT---: Thread interface: wpan0
[NOTE]-AGENT---: Radio URL: spinel+hdlc+uart:///tmp/ttyOTBR?uart-baudrate=115200&uart-flow-control
[NOTE]-AGENT---: Radio URL: trel://enp0s3
[NOTE]-ILS-----: Infra link selected: enp0s3
49d.17:08:16.107 [W] P-SpinelDrive-: Wait for response timeout
49d.17:08:18.114 [W] P-SpinelDrive-: Wait for response timeout
49d.17:08:20.116 [W] P-SpinelDrive-: Wait for response timeout
49d.17:08:20.116 [C] Platform------: Init() at spinel_driver.cpp:82: Failure
49d.17:08:22.118 [W] P-SpinelDrive-: Wait for response timeout
[14:51:00] WARNING: otbr-agent exited with code 1 (by signal 0).
Chain OTBR_FORWARD_INGRESS (0 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere             PKTTYPE = unicast
DROP       all  --  anywhere             anywhere             match-set otbr-ingress-deny-src src
ACCEPT     all  --  anywhere             anywhere             match-set otbr-ingress-allow-dst dst
DROP       all  --  anywhere             anywhere             PKTTYPE = unicast
ACCEPT     all  --  anywhere             anywhere            
otbr-ingress-deny-src
otbr-ingress-deny-src-swap
otbr-ingress-allow-dst
otbr-ingress-allow-dst-swap
Chain OTBR_FORWARD_EGRESS (0 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
[14:51:00] INFO: OTBR firewall teardown completed.
s6-svlisten1: fatal: /run/s6-rc/servicedirs/otbr-agent failed permanently or its supervisor died
s6-rc: warning: unable to start service otbr-agent: command exited 1
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service universal-silabs-flasher: stopping
s6-rc: info: service socat-otbr-tcp: stopping
s6-rc: info: service mdns: stopping
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.
/run/s6/basedir/scripts/rc.init: fatal: stopping the container.
s6-rc: info: service universal-silabs-flasher successfully stopped
s6-rc: info: service banner: stopping
2024/11/20 14:51:00 socat[78] W exiting on signal 15
Default: mDNSResponder (Engineering Build) (Nov 15 2024 16:35:01) stopping
s6-rc: info: service banner successfully stopped
s6-rc: info: service socat-otbr-tcp successfully stopped
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped
[14:51:00] INFO: mDNS ended with exit code 4 (signal 0)...
s6-rc: info: service mdns successfully stopped

@gomactrading
Copy link
Author

gomactrading commented Nov 20, 2024 via email

@tl-sl
Copy link
Contributor

tl-sl commented Nov 20, 2024

Log snippet

s6-rc: info: service otbr-agent-rest-discovery: starting
s6-rc: info: service otbr-agent-configure: starting
s6-rc: warning: unable to start service otbr-agent-configure: command exited 1
[14:37:03] INFO: Successfully sent discovery information to Home Assistant.
s6-rc: info: service otbr-agent-rest-discovery successfully started
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.
/run/s6/basedir/scripts/rc.init: fatal: stopping the container.

I can't reproduce here so far, however I wonder if there is a race between otbr-agent-configure and otbr-agent-rest-discovery. As they both are calling ot-ctl at about the same time. First fails and latter succeeds.

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@tl-sl
Copy link
Contributor

tl-sl commented Nov 21, 2024

should I understand that you have the latest HAOS in a VM environment (mine is VM on Qnap NAS) and the latest OTBR and using SMLIGHT SLZB-06p7?

yes I have tested with latest HAOS and OTBR across arm64, x86_64, x86_64 KVM VM and x86_64 Vmware VM. with SLZB-06, SLZB-06p7 and Sonoff dongle-E. None have triggered the error.

I am rather tempted to say that the problem is hidden somewhere in the changes made in OTBR between 2.11.1 & 2.12.0

Yes i suspect its the discovery agent change, but only guessing as can't reproduce locally.

However if you want to test, here is possible fix, if my previous comment was correct, then this might fix it:

https://github.com/tl-sl/addons/tree/otbr-conf

  • Copy openthread_border_router folder to HA /addons folder.
  • Edit config.yaml and delete line starting with image:
  • Go to HA addons and install local addon copy of OTBR Addon (you might need to refresh cache for it to show up in addon store.

I will create a PR if someone affected confirms this fixes.

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@tl-sl
Copy link
Contributor

tl-sl commented Nov 21, 2024

Sorry, no go here. same behavior as 2.12.0

can you post the logs from addon here

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@tl-sl
Copy link
Contributor

tl-sl commented Nov 21, 2024

Add-on version: 2.12.0

That is not running the patched local version. Local addons are installed as separate add-on and can be confirmed from the local- as start of integration hostname on main info page.

[NOTE]-AGENT---: Radio URL: spinel+hdlc+uart:///dev/ttyS0?uart-baudrate=460800&uart-init-deassert

Its also not configured correctly. Make sure you enter network details for SLB-06p7.

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@tl-sl
Copy link
Contributor

tl-sl commented Nov 21, 2024

All is good. Now I have 2 Open Thread Border Router Services in the Integration. Not sure if that is correct.

Yes the local addon is always separate addon and settings from the core addon. Post logs again once you are correctly configured.

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@tl-sl
Copy link
Contributor

tl-sl commented Nov 21, 2024

Add-on version: 2.12.0

oops, seems I forgot to bump the version!

However logs got truncated at the end there. Can post the remainder of logs that are missing above.

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@tl-sl
Copy link
Contributor

tl-sl commented Nov 21, 2024

[NOTE]-AGENT---: Running 0.3.0-b041fa52-dirty [NOTE]-AGENT---: Thread version: 1.3.0 [NOTE]-AGENT---: Thread interface: wpan0 [NOTE]-AGENT---: Radio URL: spinel+hdlc+uart:///tmp/ttyOTBR?uart-baudrate=460800&uart-init-deassert [NOTE]-AGENT---: Radio URL: t

Yes I need the rest of logs after it got cut-off here.

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@tl-sl
Copy link
Contributor

tl-sl commented Nov 21, 2024

no its the same, end part is missing. You can delete all the older logs and just start from here (through to the end):

[22:01:12] INFO: [32mStarting mDNS Responder...[0m
Default: mDNSResponder (Engineering Build) (Nov 21 2024 02:19:23) starting
[22:01:12] INFO: [32mStarting socat TCP client for OTBR daemon...[0m

[34m-----------------------------------------------------------[0m
[34m Add-on: OpenThread Border Router[0m
[34m OpenThread Border Router add-on[0m
[34m-----------------------------------------------------------[0m
[34m Add-on version: 2.12.0[0m
[32m You are running the latest version of this add-on.[0m
s6-rc: info: service socat-otbr-tcp successfully started
[34m System: Home Assistant OS 13.2 (amd64 / qemux86-64)[0m
[34m Home Assistant Core: 2024.11.2[0m
[34m Home Assistant Supervisor: 2024.11.2[0m
[34m-----------------------------------------------------------[0m
[34m Please, share the above information when looking for help[0m
[34m or support in, e.g., GitHub, forums or the Discord chat.[0m
[34m-----------------------------------------------------------[0m
s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[22:01:13] INFO: [32mFlashing firmware is disabled[0m
s6-rc: info: service universal-silabs-flasher successfully started
s6-rc: info: service otbr-agent: starting
[22:01:13] INFO: [32mSetup OTBR firewall...[0m
[22:01:13] INFO: [32mStarting otbr-agent...[0m
[NOTE]-AGENT---: Running 0.3.0-b041fa52-dirty
[NOTE]-AGENT---: Thread version: 1.3.0
[NOTE]-AGENT---: Thread interface: wpan0
[NOTE]-AGENT---: Radio URL: spinel+hdlc+uart:///tmp/tt

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@darkxst
Copy link
Contributor

darkxst commented Nov 21, 2024

There are no errors there, looks to have started ok.

You may need to delete the OTBR integration on device page and reboot so It can pickup the new local add-on.

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@darkxst
Copy link
Contributor

darkxst commented Nov 21, 2024

Yes when core OTBR update is available, you delete the local add-on and do again the same to reinstall OTBR integration and it will switch back to core add-on.

@gomactrading
Copy link
Author

gomactrading commented Nov 21, 2024 via email

@tl-sl
Copy link
Contributor

tl-sl commented Nov 21, 2024

2.12.1 is released now with the above fix. Hopefully resolves this issue! 😁

@Xeevis
Copy link

Xeevis commented Nov 21, 2024

2.12.1 is released now with the above fix. Hopefully resolves this issue! 😁

This has fixed the issue for me, all good now at least with the addon and integration staying up, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants