Add pairwise distance computation in condensed form for symmetric metrics. #79
Thanks, but I'm not sure we want to provide this kind of function. Julia supports very powerful [...] So instead of returning a vector, which is arguably a hack due to the limitations of other languages, we should use [...] What is the most interesting feature for you in this PR? Saving RAM? Making subsequent computations faster?
Thank you for your answer.
The most interesting feature for me is indeed making subsequent computations faster, for instance when comparing a large number of such matrices... (In my case it also matches the form I intended to store the results in.)
I am fairly new to Julia and could not find the best way to represent the result matrix; however, I agree that we should use something different!
If you mainly care about the computation speed, then returning a [...] I think you could just add a [...]
I too agree that [...]
What is the status of this? I also think that the output for semi-metrics should be symmetric. Also, besides saving computation, we could have a method to flatten a symmetric matrix (its upper or lower part as a vector). That is often what is needed in distance calculations on a huge point cloud.
Restating what was already said in the previous comments: the main advantage of storing half of the matrix entries in a flattened vector is that we can perform subsequent calculations more efficiently. Simple statistics like histograms, means, and variances are easy to apply. Even if pairwise returns a matrix type that is symmetric, its zero diagonal entries would skew the summaries.
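To make the point about summaries concrete, here is a minimal sketch in plain Python (standard library only, not the package's API). The sample points and the `dist` helper are hypothetical; it only compares the mean taken over the full square matrix with the mean taken over the condensed vector.

```python
from itertools import combinations
from statistics import mean

# Hypothetical 1-D sample; any symmetric metric would behave the same way.
points = [0.0, 1.0, 3.0]

def dist(a, b):
    """Absolute difference, used here as a stand-in symmetric metric."""
    return abs(a - b)

# Full (redundant) matrix: n*n entries, including the zero diagonal.
full = [[dist(a, b) for b in points] for a in points]

# Condensed form: one entry per unordered pair, n*(n-1)/2 entries.
condensed = [dist(a, b) for a, b in combinations(points, 2)]

full_mean = mean(v for row in full for v in row)  # diluted by the 3 zeros
condensed_mean = mean(condensed)                  # mean over distinct pairs only

print(full_mean, condensed_mean)
```

For these three points the pairwise distances are 1, 3 and 2, so the condensed mean is 2.0, while the mean over the full matrix drops to 12/9 because the diagonal zeros are counted.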
This patch adds support for computing pairwise distances (for semimetrics) in a condensed form, similar to how pdist works in several packages available for MATLAB/Octave, R, Python, etc. (although on columns).
At the moment I have added no tests for the new methods, nor have I written functions to convert between the condensed form and the redundant one.
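As an illustration of the condensed form and of the conversion the description mentions as missing, here is a rough sketch in plain Python rather than the patch's Julia code. The names `pairwise_condensed`, `condensed_index` and `to_square` are hypothetical, and the ordering (upper triangle, pair by pair with the first index varying slowest) is an assumption; the patch may order entries differently.

```python
from math import dist  # Euclidean distance between two sequences (Python 3.8+)

def pairwise_condensed(columns, metric=dist):
    """Distances between all unordered pairs of columns, as a flat list."""
    n = len(columns)
    return [metric(columns[i], columns[j])
            for i in range(n) for j in range(i + 1, n)]

def condensed_index(n, i, j):
    """Position of the pair (i, j), i != j, in the condensed list (0-based)."""
    if i == j:
        raise ValueError("the diagonal is not stored in condensed form")
    if i > j:
        i, j = j, i  # symmetric metric: d(i, j) == d(j, i)
    return n * i - i * (i + 1) // 2 + (j - i - 1)

def to_square(condensed, n):
    """Rebuild the redundant n-by-n matrix from the condensed list."""
    square = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = condensed[condensed_index(n, i, j)]
            square[i][j] = d
            square[j][i] = d
    return square

# Each tuple plays the role of one column (one observation).
cols = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
cond = pairwise_condensed(cols)
print(cond)                # [5.0, 10.0, 5.0]
print(to_square(cond, 3))  # symmetric matrix with a zero diagonal
```

The condensed list holds n(n-1)/2 values instead of n², and `condensed_index` is all that is needed to look up a single pair without materialising the square matrix.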