Add support for differential analyses #77

Open, wants to merge 1 commit into base: next


rsarky (Contributor) commented Aug 20, 2020

There is scope for optimising differential analyses, i.e. the case where PaStA already holds analysis results and new patches arrive for analysis.
Currently, PaStA assigns each of these new patches to its own single-element cluster and then runs the complete analysis again. This results in a lot of redundant comparisons. Example:

Consider the following existing cluster state in PaStA. I have numbered each cluster for illustration purposes:

1. 1 2 3
2. 4 5 7
3. 6 8

PaStA performed around 8x8 comparisons to get here (ignoring, for now, the other thresholds that PaStA applies). For further comparisons PaStA uses the representative of each cluster. Let's take the first element of each cluster above to be its representative, i.e. repr(1 2 3) = 1.

Now consider that patches 9 and 10 arrive. They are assigned to their own single-element clusters, i.e.:

1. 1 2 3
2. 4 5 7
3. 6 8
4. 9
5. 10

Currently, PaStA performs 5x5 comparisons (the representative of each cluster against every other).
But we can reduce this by comparing only the representatives of the existing clusters with the newly arrived patches, since the other comparisons were already done in the previous step, i.e. we reduce the comparisons to 3x2. Additionally, we need to compare the new patches against each other, a further 2x2 comparisons. Combined, that is a total of 5x2 comparisons, which is still far fewer than the naive 5x5.
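
To make the arithmetic concrete, here is a minimal sketch that counts comparisons for both strategies. The function names are hypothetical and not part of PaStA; the counts include self-comparisons, matching the NxN convention used above.

```python
def naive_comparisons(num_existing_clusters, num_new_patches):
    """Every representative is compared against every representative."""
    total = num_existing_clusters + num_new_patches
    return total * total


def differential_comparisons(num_existing_clusters, num_new_patches):
    """Only new patches are compared, against existing and new representatives."""
    total = num_existing_clusters + num_new_patches
    return num_new_patches * total


# Example from above: 3 existing clusters, 2 new patches (9 and 10).
print(naive_comparisons(3, 2))         # 25
print(differential_comparisons(3, 2))  # 10
```

The saving grows with the number of existing clusters, since the differential count is linear in that number rather than quadratic.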

This can be written in a crude mathematical way as follows:

evaluation_result = new_patches X existing_patches + new_patches X new_patches
[note that existing_patches X existing_patches has already been done and its result exists in the patch groups file]
Thus, evaluation_result = new_patches X (existing_patches + new_patches)
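
As a rough illustration of the formula, the sketch below enumerates the pairs that a differential run would compare. It is not PaStA's actual code: `representative` and `differential_pairs` are hypothetical helpers, and the first element of each cluster is assumed to be its representative, as in the example above.

```python
from itertools import product


def representative(cluster):
    # Assumption: the first element of a cluster acts as its representative.
    return cluster[0]


def differential_pairs(existing_clusters, new_patches):
    """Return the pairs new_patches X (existing representatives + new_patches)."""
    victims = [representative(c) for c in existing_clusters] + list(new_patches)
    return list(product(new_patches, victims))


existing = [[1, 2, 3], [4, 5, 7], [6, 8]]
new = [9, 10]

pairs = differential_pairs(existing, new)
print(len(pairs))  # 10
```

No pair with an existing patch on the left-hand side is produced, which is exactly the existing_patches X existing_patches work that is skipped. Self-pairs such as (9, 9) are kept here only to match the 5x2 count; a real implementation would likely drop them.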

Things to consider

  • What if we lose the patch groups file at some point? In that case the evaluation has to be carried out from scratch again, as
    the cached evaluation results do not contain all the information.
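
One possible way to handle the lost-file case is to fall back to a full evaluation whenever the patch groups file is missing. The sketch below only illustrates that decision; `choose_mode` and its parameters are hypothetical and do not reflect PaStA's real interface.

```python
import os
import tempfile


def choose_mode(patch_groups_path, differential_requested):
    """Pick 'differential' only if it was requested and prior results exist."""
    if differential_requested and os.path.isfile(patch_groups_path):
        return 'differential'
    # Without the patch groups file, earlier comparisons cannot be reused.
    return 'full'


# A missing file forces a full evaluation even if differential was requested.
print(choose_mode('/nonexistent/patch-groups', True))

# An existing file (a temporary stand-in here) allows the differential run.
with tempfile.NamedTemporaryFile() as f:
    print(choose_mode(f.name, True))
```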

Previously analyse compared all patches under consideration disregarding
previous evaluation results.
This patch adds a new differential flag that utilises the existing
evaluation results and only compares the newly added patches to the
existing ones, reducing the number of comparisons.

The differential evaluation process can be explained as follows:

result = new_patches X existing_patches + new_patches X new_patches
= new_patches X (new_patches + existing_patches)
= new_patches X victims

Signed-off-by: Rohit Sarkar <[email protected]>