PointCloud crashes when switching .rviz files #1753
Comments
Thanks for this detailed error report.
Processing the slot onTimeSignal when the original sender() object was deleted, will cause a crash (ros-visualization#1753).
- Drop Display* argument, but rely on QPointer<Display> that is part of the signal
- Check for valid sender display
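The idea behind this fix can be sketched without Qt. Qt's QPointer becomes null once the QObject it tracks is destroyed; the sketch below uses std::weak_ptr as a standard-library stand-in for that behavior. The `Display` struct and this `onTimeSignal` function are simplified, hypothetical versions for illustration, not rviz's actual code.

```cpp
#include <iostream>
#include <memory>
#include <string>

// Hypothetical stand-in for rviz's Display; only a name is needed here.
struct Display {
  std::string name;
};

// Hypothetical slot: takes a weak handle instead of a raw Display*.
// Returns true if the update was applied, false if the sender is gone.
bool onTimeSignal(const std::weak_ptr<Display>& sender, double stamp) {
  // lock() yields an empty shared_ptr once the Display has been destroyed.
  std::shared_ptr<Display> display = sender.lock();
  if (!display) {
    return false;  // sender deleted before the slot ran: ignore the signal
  }
  std::cout << display->name << " @ " << stamp << "\n";
  return true;
}
```

With a raw pointer the slot has no way to tell that the object is gone; with a tracking handle the check is a single null test at the top of the slot.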
Thank you for your response and for the corrections to the issues in my description.
Yeah, that might be an issue. Quoting the doc: "Attempting to destroy a locked mutex results in undefined behavior."
Thank you for your reply and modification. I'm sorry I didn't get back to you sooner.
Thanks for the feedback. The MessageFilter destructor correctly disconnects as expected:
~MessageFilter()
{
  message_connection_.disconnect();
  MessageFilter::clear();
}
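The disconnect-in-destructor pattern and its effect on the mutex problem can be sketched in plain C++. This is a minimal single-threaded illustration, not the MessageFilter implementation: `Signal` and `CloudReceiver` are hypothetical names, and `new_clouds_mutex_` is borrowed from the report only as a member name.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical signal: stores callbacks by id so they can be disconnected.
class Signal {
public:
  int connect(std::function<void(int)> cb) {
    int id = next_id_++;
    callbacks_[id] = std::move(cb);
    return id;
  }
  void disconnect(int id) { callbacks_.erase(id); }
  // Invokes every connected callback and returns how many were called.
  int emit(int value) {
    int called = 0;
    for (auto& entry : callbacks_) {
      entry.second(value);
      ++called;
    }
    return called;
  }

private:
  int next_id_ = 0;
  std::map<int, std::function<void(int)>> callbacks_;
};

// Hypothetical receiver that locks a member mutex inside its callback.
class CloudReceiver {
public:
  explicit CloudReceiver(Signal& signal) : signal_(signal) {
    id_ = signal_.connect([this](int v) { processMessage(v); });
  }
  // Disconnect first: afterwards the signal can no longer reach
  // processMessage, so new_clouds_mutex_ is never locked after destruction.
  ~CloudReceiver() { signal_.disconnect(id_); }
  int received() const { return received_; }

private:
  void processMessage(int) {
    std::lock_guard<std::mutex> lock(new_clouds_mutex_);
    ++received_;
  }
  Signal& signal_;
  int id_ = 0;
  std::mutex new_clouds_mutex_;
  int received_ = 0;
};
```

In a multi-threaded program, disconnecting alone is not sufficient: the destructor would also have to wait for any callback that is already running on another thread before the mutex is destroyed.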
Could you try to build with these CMake flags and post the resulting backtrace(s) when just running rviz: This enables the address sanitizer, which tracks allocated and freed memory in detail.
I built rviz with the CMake flags you gave and, surprisingly, I could no longer reproduce problem (2) from the description. I tried several times and rviz never crashed.
The memory leaks reported are not related to your issue. If you want to report some, only consider those related to rviz. Many low-level libraries have leaks, which we cannot fix anyway. That you can't reproduce the issue (2) anymore might be related to the slower execution with I don't understand why the
I used
According to other printed information, mutex_ is locked in the According to the core dump file, the address of the object when the crash occurs is 0x55ce07f77a68, which is the same as the address of the object that invokes the
This should indicate that the crash occurred in the Signal1::call function and that mutex_ was not unlocked.
Thanks for your investigation. I continued as well and traced the issue down to
A release build disables all assertions. Hence, it is not aborting anymore (due to failing assertions).
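A small sketch of why a release build stops aborting: the standard assert() macro expands to nothing when NDEBUG is defined, and CMake's Release configuration typically defines it for GCC and Clang. The function names below are hypothetical and only illustrate the mechanism.

```cpp
#include <cassert>

// Reports whether assert() is active in this translation unit.
// NDEBUG is the standard macro that turns assert() into a no-op.
constexpr bool assertionsEnabled() {
#ifdef NDEBUG
  return false;
#else
  return true;
#endif
}

// Hypothetical helper: the assert documents a precondition, but only the
// explicit range check still protects callers when NDEBUG is defined.
int checkedIndex(int index, int size) {
  assert(size > 0);
  if (index < 0 || index >= size) {
    return -1;  // out of range: report instead of relying on assert
  }
  return index;
}
```

A failing assertion that disappears in a release build does not mean the underlying bug is gone; the violated condition is simply no longer checked.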
Fixed via #1754
Describe your issue here and explain how to reproduce it.
The description is rather long, so please bear with me :)
Your environment
My scenario:
Hi, I've added a dozen display plugins to rviz, including pointcloud2, marker, markerArray, etc., and saved the settings to an xxx.rviz file. Later, I loaded the xxx.rviz file several times through ‘File -> Open Config’. (The corresponding data was still being published while the config file was switched.) I found that rviz occasionally crashed.
I found three causes of the crash, all related to the PointCloudCommon class:
(1) Based on the backtrace and code analysis, PointCloudCommon::processMessage emits the emitTimeSignal signal, which passes a pointer to the pointcloud2 plugin to TimePanel::onTimeSignal. In some cases, the pointcloud2 plugin is destroyed before TimePanel::onTimeSignal is executed. The display pointer passed to TimePanel::onTimeSignal is then dangling, and accessing it causes a segmentation fault.
I wonder whether this could be avoided by adding a check at the top of TimePanel::onTimeSignal that tests whether sender() is a null pointer.
the backtrace shows that:
(2) Another possible cause of the crash is that PointCloudCommon has already been destructed, and with it the mutex new_clouds_mutex_, but PointCloudCommon::processMessage still tries to lock that mutex, which leads to the crash.
the backtrace shows that:
(3) The last possible cause of the crash is that the mutex transformers_mutex_ is still locked in PointCloudCommon::transformCloud when PointCloudCommon is destructed, which causes the crash.
the backtrace shows that: