Counting paths in a directed acyclic graph

ultrafilter · November 13, 2006, 5:22pm

I’ve got a directed acyclic graph G with two vertices of interest, v[sub]0[/sub] and v[sub]n - 1[/sub]. v[sub]0[/sub] has indegree 0, and v[sub]n - 1[/sub] has outdegree 0. Every other vertex has positive indegree and outdegree. I’d like to count the total number of paths from v[sub]0[/sub] to v[sub]n - 1[/sub], and maybe enumerate them.

I know that I can represent G as an adjacency matrix M and sum the non-zero values of (M[sup]n[/sup])[sub]0, n - 1[/sub], but that has no possibility of enumeration. What’s a better way to go about this?

KP · November 13, 2006, 5:35pm

Well, it’s hardly my field, but assuming this is a computational problem, and not pure math, you could use a list of adjacency lists (with pointers) or a matrix of adjacency lists without pointers

Beyond that, I got nuthin’

Subway_Prophet · November 13, 2006, 5:50pm

I don’t think it matters how you represent the graph. Personally, I’d do a queue-based breadth-first search from V[sub]0[/sub] to V[sub]n-1[/sub], and rather than stopping on the first hit, keep searching until I’ve emptied the queue.

Mathochist · November 13, 2006, 6:55pm

Without any more structure than this on the graph I can’t see how any one enumeration would have any more meaning than any other. Maybe separating them out by length (replace M by tM and check the coefficient of t in exp™), but other than that what meaning can an enumeration possibly have?

ultrafilter · November 13, 2006, 7:19pm

I think you mean something different by the word “enumeration”. I’m looking to at a minimum count the paths from v[sub]0[/sub] to v[sub]n - 1[/sub], but it’d be really nice to know what they are. Think enumerative combinatorics.

Subway_Prophet · November 13, 2006, 7:52pm

ultrafilter, are you writing a computer program?

Omphaloskeptic · November 13, 2006, 8:12pm

With each node associate a list of “possible next nodes” in the path; populate this list by working backward from the target T. (For each node S with an edge S->T, push T onto S’s list. Then, if S hasn’t been visited before (so that its list has only one element), apply the same function recursively to S: that is, push S onto the list of all nodes R with R->S, etc.) Then just recursively follow all of these possible paths forward from the source. (Each of these steps is easy to do with depth-first recursion.)

ultrafilter · November 13, 2006, 8:18pm

Even better: I’m writing a program to describe part of another program that I’m writing.

Subway_Prophet · November 13, 2006, 9:37pm

I thought maybe Dijkstra’s algorithm could be modified to solve this, but it doesn’t look like it’s guaranteed to visit each possible path from A to B.

On review, I think that modifying a search (either breath-first or Omphaloskeptic’s depth-first approach) would be simplest.

If you want any pointers in doing a depth-first or breadth first search, let me know. What language are you using?

ultrafilter · November 14, 2006, 12:56am

Is there any reason to work backwards if I know that there’s a path from every vertex to v[sub]n - 1[/sub]? I can see it being necessary if there are other nodes of outdegree 0, but I think I can just do a normal DFS here.

Omphaloskeptic · November 14, 2006, 1:17am

I think you’re right. I hadn’t really thought through the implications of that constraint, but as long as all the other vertices have positive outdegree there’s always somewhere to go.

This sounds like an incredibly inefficient algorithm, but it’s not really. If you actually want to list all of the possible paths, you’re already asking for something that’s potentially exponential in the size of the input; a boring recursive search won’t be exponential in the output size. You should probably use a more efficient algorithm to count the number of paths first, and maybe skip the output step if there are jillions of them. (There are a number of ways to count the paths efficiently. You mentioned one, with the adjacency matrix; for large n there are more efficient methods for computing M[sup]n[/sup] than doing n-1 matrix multiplications, though I don’t know if this is an issue for you.)

ultrafilter · November 14, 2006, 1:24am

C++. It’s probably not the best choice, but it can be made to work.

I’m not too worried about the size of anything. The graph in question has under 50 vertices, and it’s pretty sparse. It’s just a little larger than I think I’d like to process by hand.

ultrafilter · November 14, 2006, 3:46am

I did manage to implement a DFS, and it’s given me correct results on a couple graphs so far. Here’s the code, if anyone cares to review it:



#define MAX_NAME_LENGTH 256

#include <stack>
#include <string>
#include <vector>

class vertex
{
    public:
        vertex();
        virtual ~vertex();
        // Compiler-provided copy constructor is fine

        const char* getName();
        void setName( const char* name );

    private:
        char m_name[MAX_NAME_LENGTH];
};

class edge
{
    public:
        edge();
        virtual ~edge();

        vertex getInitialVertex();
        vertex getTerminalVertex();
        void setInitialVertex( const vertex& initialVertex );
        void setTerminalVertex( const vertex& terminalVertex );

        void setVertices( const vertex& initialVertex, const vertex& terminalVertex );

    private:
        vertex m_initialVertex;
        vertex m_terminalVertex;
};

class graph
{
    public:
        graph();
        virtual ~graph();

        void addVertex( vertex v );
        void addEdge( vertex initialVertex, vertex finalVertex );

        void enumeratePaths( vertex initialVertex, vertex finalVertex );

    private:
        void enumeratePaths( std::stack< const char* > currentPath, vertex initialVertex, vertex finalVertex );

        std::vector< edge > m_edges;
        std::vector< vertex > m_vertices;
};

vertex::vertex()
{
    memset( m_name, 0, MAX_NAME_LENGTH * sizeof( char ) );
}

vertex::~vertex()
{
}

const char* vertex::getName()
{
    return ( const char* ) m_name;
}

void vertex::setName( const char* name )
{
    strncpy( m_name, name, MAX_NAME_LENGTH );
}

/*************************************************************/

edge::edge()
{
}

edge::~edge()
{
}

vertex edge::getInitialVertex()
{
    return vertex( m_initialVertex );
}

vertex edge::getTerminalVertex()
{
    return vertex( m_terminalVertex );
}

void edge::setInitialVertex( const vertex& initialVertex )
{
    m_initialVertex = initialVertex;
}

void edge::setTerminalVertex( const vertex& terminalVertex )
{
    m_terminalVertex = terminalVertex;
}

void edge::setVertices( const vertex& initialVertex, const vertex& terminalVertex )
{
    setInitialVertex( initialVertex );
    setTerminalVertex( terminalVertex );
}

/*************************************************************/

graph::graph()
{
}

graph::~graph()
{
}

void graph::addVertex( vertex v )
{
    m_vertices.push_back( v );
}

void graph::addEdge( vertex initialVertex, vertex finalVertex )
{
    edge e;
    e.setInitialVertex( initialVertex );
    e.setTerminalVertex( finalVertex );
    m_edges.push_back( e );
}

void graph::enumeratePaths( vertex initialVertex, vertex finalVertex )
{
    std::stack< const char* > currentPath;
    enumeratePaths( currentPath, initialVertex, finalVertex );
}

void graph::enumeratePaths( std::stack< const char* > currentPath, vertex initialVertex, vertex finalVertex )
{
    currentPath.push( initialVertex.getName() );

    if ( strcmp( initialVertex.getName(), finalVertex.getName() ) == 0 )
    {
        printf( "Path: " );

        std::stack< const char* > storage;
        while ( !currentPath.empty() )
        {
            storage.push( currentPath.top() );
            currentPath.pop();
        }
        while ( !storage.empty() )
        {
            printf( "%s ", storage.top() );
            currentPath.push( storage.top() );
            storage.pop();
        }

        printf( "
" );
    }
    else
    {
        std::vector< edge >::iterator edgeIterator = m_edges.begin();
        while ( edgeIterator != m_edges.end() )
        {
            if ( strcmp( initialVertex.getName(), ( edgeIterator->getInitialVertex().getName() ) ) == 0 )
            {
                enumeratePaths( currentPath, edgeIterator->getTerminalVertex(), finalVertex );
            }

            ++edgeIterator;
        }
    }

    currentPath.pop();
}

int main()
{
    graph myGraph;

    vertex vertices[8];

    vertices[0].setName( "A" );
    vertices[1].setName( "B" );
    vertices[2].setName( "C" );
    vertices[3].setName( "D" );
    vertices[4].setName( "E" );
    vertices[5].setName( "F" );
    vertices[6].setName( "G" );
    vertices[7].setName( "H" );

    for ( int i = 0; i < 8; ++i )
    {
        myGraph.addVertex( vertices* );
    }

    edge edges[10];

    myGraph.addEdge( vertices[0], vertices[1] );
    myGraph.addEdge( vertices[0], vertices[5] );
    myGraph.addEdge( vertices[1], vertices[2] );
    myGraph.addEdge( vertices[1], vertices[3] );
    myGraph.addEdge( vertices[2], vertices[4] );
    myGraph.addEdge( vertices[3], vertices[4] );
    myGraph.addEdge( vertices[4], vertices[7] );
    myGraph.addEdge( vertices[5], vertices[6] );
    myGraph.addEdge( vertices[5], vertices[7] );
    myGraph.addEdge( vertices[6], vertices[7] );

    myGraph.enumeratePaths( vertices[0], vertices[7] );

    return 0;
}

I haven’t done all the error-handling that I would if this were production code, so don’t worry about that.

Omphaloskeptic · November 14, 2006, 4:24am

A few comments on your code (I understand this is Q&D code, so you may already know all of this and just have decided not to fix it for the quick hack):
[ol][li]enumeratePaths() has a stack<char*> passed by value (and so copied at every recursive call); it’s pretty clear from the function body that you meant it to be passed as a reference or pointer, since you do the work of restoring the stack state before exiting the function:[/li]


void graph::enumeratePaths( std::stack< const char* > &currentPath,
                            vertex initialVertex, vertex finalVertex )

[li]I’m not sure why you chose a stack; this required you to do the double-pop loop to reverse and print, then reverse again to restore. This looks like it’s made for a doubly-linked std::list (with push_back and forward iterators, say).[/li][li]I cringe every time I see a vector:: push_back (as with your addVertex() and addEdge()). This may require reallocation and copying of the entire array; unless you need random access, using a list is probably cheaper eventually.[/li]On the other hand, you may want to make your edge structure a map< vertex,list<vertex> > outNodes (of all targets of edges from a given node), possibly with the inverse map, of all in-nodes for a node. This means that your loop in enumeratePaths() doesn’t need to loop over all edges; it just loops over the edges in outNodes[initialVertex].[/ol]

Mathochist · November 14, 2006, 10:23pm

“enumerate” means to actually put them in bijection with some set of natural numbers. I can tell you how many there are, but what explicit bijection would have any particular meaning without more structure to the graph?

ultrafilter · November 14, 2006, 11:14pm

All bijections are equally meaningful here, because all I care about is the list.

Capt.Ridley_s_Shooting_Party · November 14, 2006, 11:24pm

Why do you include <string> and then use const char* everywhere?

ultrafilter · November 14, 2006, 11:26pm

For strncpy.

Topic		Replies	Views
Paths in Directed Graphs Factual Questions	11	938	March 25, 2004
On pathfinding algorithms, and hills. Factual Questions	13	1884	October 20, 2009
Math & systems theory question: Nodes, links, systems Factual Questions	20	975	June 2, 2018
Pure math problem - modified travelling salesman Factual Questions	39	1753	January 17, 2004
Programmer Geeks: Game Programming Factual Questions	6	1128	June 13, 2000

Counting paths in a directed acyclic graph

Related topics