Counting paths in a directed acyclic graph

I’ve got a directed acyclic graph G with two vertices of interest, v[sub]0[/sub] and v[sub]n - 1[/sub]. v[sub]0[/sub] has indegree 0, and v[sub]n - 1[/sub] has outdegree 0. Every other vertex has positive indegree and outdegree. I’d like to count the total number of paths from v[sub]0[/sub] to v[sub]n - 1[/sub], and maybe enumerate them.

I know that I can represent G as an adjacency matrix M and sum the non-zero values of (M[sup]n[/sup])[sub]0, n - 1[/sub], but that has no possibility of enumeration. What’s a better way to go about this?

Well, it’s hardly my field, but assuming this is a computational problem, and not pure math, you could use a list of adjacency lists (with pointers) or a matrix of adjacency lists without pointers

Beyond that, I got nuthin’

I don’t think it matters how you represent the graph. Personally, I’d do a queue-based breadth-first search from V[sub]0[/sub] to V[sub]n-1[/sub], and rather than stopping on the first hit, keep searching until I’ve emptied the queue.

Without any more structure than this on the graph I can’t see how any one enumeration would have any more meaning than any other. Maybe separating them out by length (replace M by tM and check the coefficient of t in exp™), but other than that what meaning can an enumeration possibly have?

I think you mean something different by the word “enumeration”. I’m looking to at a minimum count the paths from v[sub]0[/sub] to v[sub]n - 1[/sub], but it’d be really nice to know what they are. Think enumerative combinatorics.

ultrafilter, are you writing a computer program?

With each node associate a list of “possible next nodes” in the path; populate this list by working backward from the target T. (For each node S with an edge S->T, push T onto S’s list. Then, if S hasn’t been visited before (so that its list has only one element), apply the same function recursively to S: that is, push S onto the list of all nodes R with R->S, etc.) Then just recursively follow all of these possible paths forward from the source. (Each of these steps is easy to do with depth-first recursion.)

Even better: I’m writing a program to describe part of another program that I’m writing.

I thought maybe Dijkstra’s algorithm could be modified to solve this, but it doesn’t look like it’s guaranteed to visit each possible path from A to B.

On review, I think that modifying a search (either breath-first or Omphaloskeptic’s depth-first approach) would be simplest.

If you want any pointers in doing a depth-first or breadth first search, let me know. What language are you using?

Is there any reason to work backwards if I know that there’s a path from every vertex to v[sub]n - 1[/sub]? I can see it being necessary if there are other nodes of outdegree 0, but I think I can just do a normal DFS here.

I think you’re right. I hadn’t really thought through the implications of that constraint, but as long as all the other vertices have positive outdegree there’s always somewhere to go.

This sounds like an incredibly inefficient algorithm, but it’s not really. If you actually want to list all of the possible paths, you’re already asking for something that’s potentially exponential in the size of the input; a boring recursive search won’t be exponential in the output size. You should probably use a more efficient algorithm to count the number of paths first, and maybe skip the output step if there are jillions of them. (There are a number of ways to count the paths efficiently. You mentioned one, with the adjacency matrix; for large n there are more efficient methods for computing M[sup]n[/sup] than doing n-1 matrix multiplications, though I don’t know if this is an issue for you.)

C++. It’s probably not the best choice, but it can be made to work.

I’m not too worried about the size of anything. The graph in question has under 50 vertices, and it’s pretty sparse. It’s just a little larger than I think I’d like to process by hand.

I did manage to implement a DFS, and it’s given me correct results on a couple graphs so far. Here’s the code, if anyone cares to review it:



#define MAX_NAME_LENGTH 256

#include <stack>
#include <string>
#include <vector>

class vertex
{
    public:
        vertex();
        virtual ~vertex();
        // Compiler-provided copy constructor is fine

        const char* getName();
        void setName( const char* name );

    private:
        char m_name[MAX_NAME_LENGTH];
};

class edge
{
    public:
        edge();
        virtual ~edge();

        vertex getInitialVertex();
        vertex getTerminalVertex();
        void setInitialVertex( const vertex& initialVertex );
        void setTerminalVertex( const vertex& terminalVertex );

        void setVertices( const vertex& initialVertex, const vertex& terminalVertex );

    private:
        vertex m_initialVertex;
        vertex m_terminalVertex;
};

class graph
{
    public:
        graph();
        virtual ~graph();

        void addVertex( vertex v );
        void addEdge( vertex initialVertex, vertex finalVertex );

        void enumeratePaths( vertex initialVertex, vertex finalVertex );

    private:
        void enumeratePaths( std::stack< const char* > currentPath, vertex initialVertex, vertex finalVertex );

        std::vector< edge > m_edges;
        std::vector< vertex > m_vertices;
};

vertex::vertex()
{
    memset( m_name, 0, MAX_NAME_LENGTH * sizeof( char ) );
}

vertex::~vertex()
{
}

const char* vertex::getName()
{
    return ( const char* ) m_name;
}

void vertex::setName( const char* name )
{
    strncpy( m_name, name, MAX_NAME_LENGTH );
}

/*************************************************************/

edge::edge()
{
}

edge::~edge()
{
}

vertex edge::getInitialVertex()
{
    return vertex( m_initialVertex );
}

vertex edge::getTerminalVertex()
{
    return vertex( m_terminalVertex );
}

void edge::setInitialVertex( const vertex& initialVertex )
{
    m_initialVertex = initialVertex;
}

void edge::setTerminalVertex( const vertex& terminalVertex )
{
    m_terminalVertex = terminalVertex;
}

void edge::setVertices( const vertex& initialVertex, const vertex& terminalVertex )
{
    setInitialVertex( initialVertex );
    setTerminalVertex( terminalVertex );
}

/*************************************************************/

graph::graph()
{
}

graph::~graph()
{
}

void graph::addVertex( vertex v )
{
    m_vertices.push_back( v );
}

void graph::addEdge( vertex initialVertex, vertex finalVertex )
{
    edge e;
    e.setInitialVertex( initialVertex );
    e.setTerminalVertex( finalVertex );
    m_edges.push_back( e );
}

void graph::enumeratePaths( vertex initialVertex, vertex finalVertex )
{
    std::stack< const char* > currentPath;
    enumeratePaths( currentPath, initialVertex, finalVertex );
}

void graph::enumeratePaths( std::stack< const char* > currentPath, vertex initialVertex, vertex finalVertex )
{
    currentPath.push( initialVertex.getName() );

    if ( strcmp( initialVertex.getName(), finalVertex.getName() ) == 0 )
    {
        printf( "Path: " );

        std::stack< const char* > storage;
        while ( !currentPath.empty() )
        {
            storage.push( currentPath.top() );
            currentPath.pop();
        }
        while ( !storage.empty() )
        {
            printf( "%s ", storage.top() );
            currentPath.push( storage.top() );
            storage.pop();
        }

        printf( "
" );
    }
    else
    {
        std::vector< edge >::iterator edgeIterator = m_edges.begin();
        while ( edgeIterator != m_edges.end() )
        {
            if ( strcmp( initialVertex.getName(), ( edgeIterator->getInitialVertex().getName() ) ) == 0 )
            {
                enumeratePaths( currentPath, edgeIterator->getTerminalVertex(), finalVertex );
            }

            ++edgeIterator;
        }
    }

    currentPath.pop();
}

int main()
{
    graph myGraph;

    vertex vertices[8];

    vertices[0].setName( "A" );
    vertices[1].setName( "B" );
    vertices[2].setName( "C" );
    vertices[3].setName( "D" );
    vertices[4].setName( "E" );
    vertices[5].setName( "F" );
    vertices[6].setName( "G" );
    vertices[7].setName( "H" );

    for ( int i = 0; i < 8; ++i )
    {
        myGraph.addVertex( vertices* );
    }

    edge edges[10];

    myGraph.addEdge( vertices[0], vertices[1] );
    myGraph.addEdge( vertices[0], vertices[5] );
    myGraph.addEdge( vertices[1], vertices[2] );
    myGraph.addEdge( vertices[1], vertices[3] );
    myGraph.addEdge( vertices[2], vertices[4] );
    myGraph.addEdge( vertices[3], vertices[4] );
    myGraph.addEdge( vertices[4], vertices[7] );
    myGraph.addEdge( vertices[5], vertices[6] );
    myGraph.addEdge( vertices[5], vertices[7] );
    myGraph.addEdge( vertices[6], vertices[7] );

    myGraph.enumeratePaths( vertices[0], vertices[7] );

    return 0;
}


I haven’t done all the error-handling that I would if this were production code, so don’t worry about that.

A few comments on your code (I understand this is Q&D code, so you may already know all of this and just have decided not to fix it for the quick hack):
[ol][li]enumeratePaths() has a stack<char*> passed by value (and so copied at every recursive call); it’s pretty clear from the function body that you meant it to be passed as a reference or pointer, since you do the work of restoring the stack state before exiting the function:[/li]


void graph::enumeratePaths( std::stack< const char* > &currentPath,
                            vertex initialVertex, vertex finalVertex )

[li]I’m not sure why you chose a stack; this required you to do the double-pop loop to reverse and print, then reverse again to restore. This looks like it’s made for a doubly-linked std::list (with push_back and forward iterators, say).[/li][li]I cringe every time I see a vector:: push_back (as with your addVertex() and addEdge()). This may require reallocation and copying of the entire array; unless you need random access, using a list is probably cheaper eventually.[/li]On the other hand, you may want to make your edge structure a map< vertex,list<vertex> > outNodes (of all targets of edges from a given node), possibly with the inverse map, of all in-nodes for a node. This means that your loop in enumeratePaths() doesn’t need to loop over all edges; it just loops over the edges in outNodes[initialVertex].[/ol]

“enumerate” means to actually put them in bijection with some set of natural numbers. I can tell you how many there are, but what explicit bijection would have any particular meaning without more structure to the graph?

All bijections are equally meaningful here, because all I care about is the list.

Why do you include <string> and then use const char* everywhere?

For strncpy.