the Graph Evolution Rule Miner

Michele Berlingerio
Pisa KDD Laboratory
ISTI - CNR, Italy
Francesco Bonchi
Yahoo! Research
Barcelona, Spain
Björn Bringmann
Katholieke Universiteit
Leuven, Belgium
Aristides Gionis
Yahoo! Research
Barcelona, Spain

 

In Proceedings of Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2009, Bled, Slovenia, 115-130.
Bibtex


Software download

Here you can download a statically compiled Linux binary of GERM. It was compiled on a 32-bit machine running the Ubuntu 8.04 Linux operating system.

GERM Linux Binary

Dataset download

Dblp 92-02 dataset
Dblp 03-05 dataset
Dblp 05-07 dataset

Dataset format

GERM takes as input an ASCII file with one header line assumed to be:

t # 0
followed by the node list in the format
v ID LAB
where ID is a non-negative integer indicating the id of the node (ids are consecutive, starting from 0) and LAB is a non-negative integer indicating the label of the node. The node list is followed by the edge list in the format
e NODE1 NODE2 LAB
where NODE1 and NODE2 are node ids indicating the starting and ending node of the edge (NODE1 is assumed to be less than NODE2) and LAB is a positive integer indicating the label of the edge.

Example:

t # 0
v 0 0
v 1 4
v 2 3
e 0 1 1
e 0 2 1
e 1 2 2
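
As an illustration of the input format, the following Python sketch serializes a small graph into the layout described above. The function name format_germ_input and its arguments are hypothetical and not part of GERM; the sketch assumes node ids are the list positions 0..n-1 and it swaps edge endpoints where needed so that NODE1 < NODE2.

```python
def format_germ_input(node_labels, edges):
    """Return the text of a GERM input file.

    node_labels: list of integer labels; the list index is used as the node id.
    edges: iterable of (node_a, node_b, edge_label) integer triples.
    (Hypothetical helper, not part of the GERM distribution.)
    """
    lines = ["t # 0"]  # single header line expected by GERM

    # Node list: ids are consecutive and start from 0.
    for node_id, label in enumerate(node_labels):
        lines.append(f"v {node_id} {label}")

    # Edge list: write the smaller id first so that NODE1 < NODE2.
    for node_a, node_b, label in edges:
        if node_a == node_b:
            raise ValueError("a self-loop cannot satisfy NODE1 < NODE2")
        low, high = min(node_a, node_b), max(node_a, node_b)
        if low < 0 or high >= len(node_labels):
            raise ValueError(f"edge ({node_a}, {node_b}) refers to an unknown node")
        lines.append(f"e {low} {high} {label}")

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Reproduces the example graph shown above.
    print(format_germ_input([0, 4, 3], [(0, 1, 1), (0, 2, 1), (2, 1, 2)]), end="")
```
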
GERM will return a file consisting of patterns described in a format similar to that of the input graph, except for the header line of each pattern, which will be of the format:
t # ID SUPPORT
where ID is a positive consecutive number indicating the id of the pattern found, and SUPPORT is the absolute support of the pattern.
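
A minimal Python sketch of how such an output file might be read is given below. It assumes that the v and e lines of each pattern have exactly the same fields as in the input format, which the description above only states as "similar"; the function name parse_germ_patterns and the sample text are hypothetical, not real GERM output.

```python
def parse_germ_patterns(text):
    """Parse GERM-style pattern output into a list of dictionaries.

    Assumes each pattern starts with 't # ID SUPPORT' and is followed by
    'v ID LAB' and 'e NODE1 NODE2 LAB' lines, as in the input format.
    (Hypothetical helper, not part of the GERM distribution.)
    """
    patterns = []
    current = None
    for raw_line in text.splitlines():
        fields = raw_line.split()
        if not fields:
            continue  # skip blank lines
        kind = fields[0]
        if kind == "t":
            # Header: fields are ['t', '#', ID, SUPPORT]
            current = {
                "id": int(fields[2]),
                "support": int(fields[3]),
                "nodes": {},   # node id -> node label
                "edges": [],   # (node1, node2, edge label)
            }
            patterns.append(current)
        elif kind == "v" and current is not None:
            current["nodes"][int(fields[1])] = int(fields[2])
        elif kind == "e" and current is not None:
            current["edges"].append((int(fields[1]), int(fields[2]), int(fields[3])))
    return patterns


if __name__ == "__main__":
    # Made-up sample for illustration only.
    sample = "t # 1 12\nv 0 0\nv 1 4\ne 0 1 1\nt # 2 7\nv 0 0\nv 1 3\ne 0 1 1\n"
    for pattern in parse_germ_patterns(sample):
        print(pattern["id"], pattern["support"], len(pattern["edges"]))
```
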
Usage

germ minsupp inputfile maxsize
or
germlabs minsupp inputfile maxsize
where germ is the version for datasets without vertex labels and germlabs is the version for datasets with labeled vertices

Mandatory parameters:
minsupp: minimum absolute support threshold for the resulting patterns
inputfile: the input file name, in the above format
maxsize: maximum number of edges in the resulting patterns
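
For scripted runs, the command line above could be assembled as in the following Python sketch. The helper build_germ_command and the file name dblp.txt are hypothetical; only the executable names germ and germlabs and the order of the three parameters come from the usage description. The sketch builds the argument list without running anything, and it assumes that both thresholds must be at least 1.

```python
def build_germ_command(minsupp, inputfile, maxsize, labeled=False):
    """Build the argument list for a GERM run.

    minsupp: minimum absolute support threshold.
    inputfile: path to a file in the GERM input format.
    maxsize: maximum number of edges in the resulting patterns.
    labeled: use 'germlabs' (labeled vertices) instead of 'germ'.
    (Hypothetical helper, not part of the GERM distribution.)
    """
    if minsupp < 1:
        raise ValueError("minsupp is assumed to be at least 1")
    if maxsize < 1:
        raise ValueError("maxsize is assumed to be at least 1")
    executable = "germlabs" if labeled else "germ"
    # Parameter order follows the usage line: minsupp inputfile maxsize
    return [executable, str(minsupp), str(inputfile), str(maxsize)]


if __name__ == "__main__":
    command = build_germ_command(5000, "dblp.txt", 4)
    print(" ".join(command))
    # To actually launch the binary, the list could be passed to
    # subprocess.run(command, check=True), provided the executable is on PATH.
```
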