You have to maintain the depth at each node, but that’s not really a problem, is it? I’d also maintain a “bestdepth” variable (as is done in the recursive solution given), only storing nodes at the current best depth (and deleting the others when “bestdepth” is incremented).
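A minimal sketch of that “bestdepth” idea, assuming a bare-bones tree class (the `Node` class here is hypothetical, not from the original solution): track the deepest level seen so far, and keep only the nodes found at that depth, discarding shallower survivors whenever the bar is raised.

```python
class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def deepest_nodes(root):
    best_depth = -1
    best_nodes = []

    def visit(node, depth):
        nonlocal best_depth, best_nodes
        if depth > best_depth:          # found a new deepest level:
            best_depth = depth          # ... increment "bestdepth"
            best_nodes = [node]         # ... and delete the shallower ones
        elif depth == best_depth:
            best_nodes.append(node)     # tie with the current best depth
        for child in node.children:
            visit(child, depth + 1)

    visit(root, 0)
    return best_depth, best_nodes
```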
I think it depends most strongly on your data structure. For an inherently flat structure like an array or list or collection, I agree that a recursive algorithm is almost certainly the result of somebody showing off or screwing around.
Conversely, the systems I’m stuck designing & building are rife with trees, the canonical example of a recursive data structure.
Almost every data object we handle is composed of an arbitrary set of generally self-similar sub-objects to an arbitrary breadth & depth. We’re either modelling an org chart of business entities, or of ownership interests, or of personnel superior/subordinate relationships or of parts in assemblies in subassemblies in subassemblies in … Or we’re traversing files in folders in folders in folders or machines in subnets of subnets of subnets of … or of users in LDAP containers in LDAP containers in LDAP containers in or …
And a recursive algorithm is ideally suited to traversing, assessing, enumerating, restructuring, maintaining, copying, rendering, etc., those sorts of structures. The equivalent iterative algorithm ends up with lots of variables and/or a stack for internal housekeeping. You can either have the stack be implicit in the recursive call chain, or you can expose it (and potentially screw it up) by managing your own stack.
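To make that trade-off concrete, here’s the same preorder traversal written both ways, over a hypothetical minimal `Node` class (not code from this thread). The recursive version hides the stack in the call chain; the iterative version has to manage its own.

```python
class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def preorder_recursive(node, out=None):
    # the "stack" is implicit in the call chain
    out = [] if out is None else out
    out.append(node.value)
    for child in node.children:
        preorder_recursive(child, out)
    return out

def preorder_iterative(root):
    out, stack = [], [root]    # the explicit stack we now must not screw up
    while stack:
        node = stack.pop()
        out.append(node.value)
        # push children reversed so the leftmost child is popped first
        stack.extend(reversed(node.children))
    return out
```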
Switching to your idea about keeping a tree member-node count and updating it on node add/delete …
That assumes your code controls 100% of the potential changes to the structure. For an academic exercise that’s great. For some business objects that works too. (And, not to be abrasive, it didn’t do what the OP/customer asked for.)
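A sketch of what that maintained-count approach looks like when it *does* work, i.e. when every add and delete funnels through one choke point so the cached count can never drift. `CountedTree` is a hypothetical illustration, not code from the thread.

```python
class CountedTree:
    def __init__(self):
        self.children_of = {None: []}  # parent id -> list of child ids
        self.count = 0                 # the maintained metadata

    def add(self, node_id, parent=None):
        self.children_of.setdefault(parent, []).append(node_id)
        self.children_of.setdefault(node_id, [])
        self.count += 1                # updated at the single choke point

    def delete(self, node_id, parent=None):
        self.children_of[parent].remove(node_id)
        # delete the whole subtree, keeping the count honest
        for child in list(self.children_of.get(node_id, [])):
            self.delete(child, node_id)
        del self.children_of[node_id]
        self.count -= 1
```

The moment some other process touches the structure without going through `add`/`delete`, the count silently goes stale, which is exactly the failure mode described below.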
But far more often this hunk of code we’re working on right here is reporting on (or acting on) objects it doesn’t control, at least not fully. There may be no provisions in those objects for storing arbitrary metadata, and if there are, I need to deconflict my metadata from that which other unrelated (and possibly ignorant or selfish) processes may also be writing to the objects’ metadata store.
In the real world of disconnected processes taking place asynchronously on machines halfway across the world (unless the packet gets lost), you can’t generally count on metadata you hope to maintain. What if somebody does a backup/restore operation on 1/3rd of your tree without telling you?
You end up needing to build a resync tool to reset your metadata. And having a method to detect that your metadata is out of sync with whatever it’s summarizing. And a method to handle that degraded condition gracefully until it’s corrected. Finally, for a tree-shaped structure, it’s almost a sure thing your resync tool will operate by enumerating the tree recursively.
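A sketch of such a resync tool, under assumed names (`Node`, `true_count`, and the cached-metadata dict are hypothetical stand-ins for whatever the real system uses): recursively enumerate the tree to get ground truth, detect that the cached summary has drifted, and repair it.

```python
class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def true_count(node):
    # ground truth, computed the honest way: recursive enumeration
    return 1 + sum(true_count(child) for child in node.children)

def resync(root, cached):
    """Detect drift in the cached count and return a corrected copy."""
    actual = true_count(root)
    if cached.get('count') != actual:
        # degraded condition detected: repair rather than trust stale metadata
        cached = dict(cached, count=actual)
    return cached
```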
Maintaining metadata is, like database normalization, a prime topic for religious flamewars. In an ideal world you’d maintain almost none, just as ideally all your databases would be normalized to 5th normal form. Conversely, performance and convenience dictate maintaining some metadata & allowing some denormalization. The same logical problem may have different answers depending on scale and scope. Sometimes Quantity has a Quality all its own.
The challenge for professionals is to use their judgement to choose the best mix of purity & practicality to get the job done reliably.
Not to be abrasive, but what a fine post! You and Ultrafilter made a very clear argument for when recursion is called for, one that I never heard during any of the classes in my M.S. CS. And for what it’s worth, a recursive algorithm seems like the best solution to the OP. Thanks LSLGuy, I learned something today!