My latest Data Structure assignment has the following requirement: given a binary search tree, comes up with a method of displaying the deepest nodes. Deepest is defined by the leaf nodes with the greatest height.
My classmates and I have been pondering over how we could do this. Here are some of them:
Tag a height variable to each node.
Use a 2-d ‘linked list’ to store the nodes, layers by layers.
A method which requires 3 queues. The method is quite contrived (it’s something like ‘magic’. It works, but I have no idea what’s going on)
Are there any other ways? (BTW, I could have just picked one of the above methods and be done with it. But I am curious…)
What I would do is write a recursive routine like this (pseudocode):
function recursiveRoutine(node, depth)
if node has children
for each childnode in node
recursiveRoutine(childnode, depth+1)
next
else
if depth > bestdepth
clear collection of bestnodes
add node to collection of bestnodes
elseif depth = bestdepth
add node to collection of bestnodes
endif
endif
end function
There exists a purely recursive solution which, if your data structures instructors are anything like mine were, is probably the one they’re looking for. Hint: find a way to encode node depths in the recursion, rather than doing it explicitly in the nodes.
If I can give you any advice on programing it would be this: implement the method that makes the most intuitive sense to you. Don’t worry about optimization – that can be done later, after you have something that works.
For me, method #1 makes the most sense, and would be the easiest for me to implement. But if one of the others makes more sense to you, go for it.
I think this works too, but when depth > bestdepth, do you have to set bestdepth = depth? I think this is simplest solution I have found so far (easier to understand than that 3 queues). I didn’t think of this – guess my brain isn’t used to think recursively.
For anyone who’s interested, here’s the three queues method (I tried my best to explan it…)
Color me befuddled. At what point are you actually traversing the tree?
I could see a method for using some queues that would go like this:
Add the root to the queue/list of candidate nodes.
Go through each item in the list of candidate nodes and find any children, and put them into the queue of deeper nodes. If no deeper nodes, then the candidate nodes are the deepest ones. If deeper nodes, then flush candidate nodes and replace them with deeper nodes.
The three queue method you propose is just a convoluted BFS, right? Rather than put the nodes for the next level directly on the queue, you swap queues (in addition to setting the result queue, which is unnecessary).
Personally, I’d have gone DFS. More intuitive (to me) and uses less space. But either way (assuming you don’t have links between children and depth information stored with the nodes), you have to touch every node. Both run in O(n) time.
Not trying to be a jerk, but your 3-queue method reminds me of a sorting algorthm a good friend created for our 200-level algorithms class back in the Dark Ages. His method was huge, slow, and complex. We dubbed it the “FoolSort”
Most of the others in the class, including me, did some sort of heap sort and a couple wisguys tried to claim they’d independently discoververed QuickSort.
Thanks for the memories. Makes me wonder if he still has a printout of the finished FoolSort program tucked away in his memorabilia.
Practical hint for the OP: Recursion is one of THE fundamental forces in computing, sorta like gravity in physics. Get comfortable with it and get good at it. Rule of thumb: A recursive data structure is almost always best processed using a recursive algorithm, ESPECIALLY in an academic environment. Yes, any recusrive function can be unrolled, but usually at great cost to comprehendability.
Took me a while as well, but it’s very useful indeed. Generally makes for very elegant code. They weren’t always the fastest solution, but now compilers do a very good job at them.
And you are, of course, correct about bestdepth=depth.
Not to be abrasive, but bullshit. Recursion is usually more difficult to comprehend than the non-recursive solution for anyone who didn’t write the code. And for what it’s worth, the non-recursive solution (which always exists btw) has better performance. Yes, you should get comfortable with recursion, but only because pointy headed people who think they are being clever like to write recursive code. For any serious application, you will almost never need recursion. If you find yourself considering a recursive solution, you should think hard about whether there is a reasonably simple non-recursive method. If you can’t think of one, be sure to document your recursive method extensively, because it will likely be difficult for others to understand. Code is much more readable if the input and output to methods can be cleanly encapsulated. With recursion, you’re going to have to understand a bunch of intermediate partial results to understand what’s going on.
Any non-trivial algorithm is going to be difficult to comprehend for anyone who didn’t write it. That said, there are cases where recursive code is by far easier to understand and implement than iterative methods, and where the difference in efficiency just isn’t all that great.
Consider a method to walk a tree and count the number of nodes in it. Here’s the recursive solution:
CountNodes( node N )
Total <- 1
for each child n of N
Total <- Total + CountNodes( n )
And here’s the iterative solution:
CountNodes( node N )
Total <- 0
Stack S
insert N into S
while top( S )
pop top( S ) into J
Total <- Total + 1
for each child n of J
insert n into S
Now who here seriously thinks that the second code snippet is easier to understand than the first? I can assure it wasn’t easier to write.
In terms of efficiency, both algorithms are [symbol]Q/symbol with respect to time and [symbol]Q/symbol with respect to memory usage. The second algorithm actually makes more function calls than the first, so there’s no time savings there. What’s the benefit of that code?
LSLGuy is absolutely right that recursion is one of the fundamental ideas of computing, particularly when it comes to algorithms and data structures. Not to be abrasive, but the fact that you have difficulty with it is not relevant.
Don’t assume I have difficulty with recursion just because I think it is good coding practice to avoid it when possible. Many people have difficulty with recursion, I don’t.
Maybe a better question is, why are you walking a tree to count the nodes in it? Why not increment a counter every time a node is added and decrement when it is removed?
Here’s the code for that:
Tree {
addNode(Node parent){
totalNodes++;
/* more code /
}
removeNode(Node node){
totalNodes–;
/ more code */
}
}
If you’re doing a lot of adding and removing of nodes and not a lot of checking the total, theoretically, your method could be more efficient. But one addition operation per change is almost certainly going to be negligible. And it’s encapsultated, object oriented, and self documenting.
I’m not even saying that recursion is never useful. I’m just saying, it is better to avoid it if you can. But I get the feeling that we’re approaching the question from very different vantage points.
The problem with a DFS is that I need the deepest nodes. In the end, the DFS will give me a queue with the nodes sorted by their height – but how I do know which nodes are the deepest. (They are the last few nodes in the queue, but how many?)
I didn’t think of the 3-queue method. We ‘overheard’ that someone managed to get the ‘deepest nodes’ with a 3 queue, and we worked it out that it may be done that way.
I don’t like the 3-queue method. It’s clumsy, hard to explain what’s going on and etc. I could just do a DFS (as mentioned above), but I have no idea which nodes will be the deepest.