igraph-help
[Top][All Lists]

## Re: [igraph] Re: Enumeration of subgraphs under constaints

 From: nkavv Subject: Re: [igraph] Re: Enumeration of subgraphs under constaints Date: Wed, 26 Sep 2007 19:25:17 +0300 User-agent: Internet Messaging Program (IMP) 3.2.6

```Quoting Gabor Csardi <address@hidden>:

Hi

> I see. The first is basically motif-finding. This wouldn't be very hard
> to do for larger graphs than the currently supported 3-4 vertices
> (actually there are no such limits in the current implementation,
> only we didn't have good graph-isomorphism code when i wrote this, but
> now we do), but in practice the number of non-isomorphic graphs is
> growing so fast that it makes sense only for at most eight vertices,
> or so. How large are your graphs and what is the size of the
> subgraphs you're enumerating?

Typically the graphs are 1-500 nodes. Nodes represent instructions/opcodes (e.g.
at assembly level) in my problem. Basic blocks of a few hundred nodes can be
found in CFGs with limited control flow and/or that have been unrolled (e.g.
cryptographic functions).

Subgraphs should be around 1-30 nodes typically but there is no strict limit.

> Zero-predessor means in-degree zero?

Yes.

> This is a bit more specific,
> so there is more hope than for the other problem. But i'm not an expert
> on this, could you point me to some algorithms for 1) enumerating
> all non-isomorphic graphs with n vertices and k zero-predessor nodes, and/or
> 2) an algorithm for finding all subgraphs of a graph
> with n vertices and k zero-predessor nodes?

I believe the first polynomial-complexity algorithm for enumerating subgraphs
with m input nodes and k output nodes was published last year:

Polynomial-Time Subgraph Enumeration for Automated Instruction Set Extension
by Paolo Bonzini and Laura Pozzi
Technical report 2006/07, December 2006

It is a bit different problem since nodes that represent input and output
variables are also considered, but could be generalized to all kind of nodes.

For a given constraint of m input nodes and k output nodes, the task is to
enumerate is to enumerate all subgraphs. There some additional constraints that
might help with pruning, namely the "convexity" constraint (to ensure a legal of
instruction nodes). What is even more important is that overlapping is not
really meaningful so it is not permitted. This would help a lot.

The complexity (I believe) should be: O(n^(m+k)).

Implementing the algorithm is a major undertaking, but i think it would be
doable (by me in the future, if noone else interested, that's OK). What is your
opinion?

Still, I would use the VF2 for keeping only the non-isomorphic graphs (in case
the polynomial-time algorithm doesn't do that, i don't remember exactly).

Kind regards