It combines the idea of information entropy with gSpan to mine the informative sub-graphs instead of just mining the frequent sub-graphs, which enables selecting the more accurate features for the classification. For more details, please see our publication here.
Below is an example of the format of a text file containing a set of graphs. Each line denotes a vertex (v) or edge (e) with a given label (end of line).
t # 0
v 0 2
v 1 2
v 2 2
v 3 3
v 4 2
e 0 1 2
e 0 2 2
e 2 3 3
e 2 4 2
t # 1
v 0 2
v 1 2
v 2 6
e 0 1 2
e 0 2 2
Below is an example of the format of a text file containing a set of labels. Each line denotes a label (a) for a graph with the same number (t).
t # 0
l 0 1
l 1 0
t # 1
l 0 1
l 1 1
This program supports 2 ways to run.
- From the command line.
usage: ESM
-a,--max-node <arg> Maximum number of nodes for each sub-graph
-d,--data <arg> (Required) File path of data set
-e,--max-entropy <arg> Maximum value of entropy that will be returned
-g,--max-graph <arg> Maximum number of sub-graphs that will be
returned
-h,--help Help
-i,--min-node <arg> Minimum number of nodes for each sub-graph
-l,--label <arg> (Required) File path of label
-r,--result <arg> File path of result
-s,--sup <arg> (Required) Minimum support
NOTE: You need to specify the classpath for ESM.jar
and its dependency gSpan.jar
.
For example, if you are using Linux, you need to put all .jar
in the same folder and run the command:
$ java -cp ESM.jar:./* cn.edu.neu.esm.MainKt
- Directly run it from an IDE.
In this mode, you can only specify the file path of input data set and the minimum support.
Java implementation of frequent sub-graph mining algorithm gSpan
Zhu, Z.; Zhao, Y. Multi-Graph Multi-Label Learning Based on Entropy. Entropy 2018, 20, 245.