I like reading technical interview questions and trying to answer them (i.e. “explain why this works” or “explain why this happens”, not questions like “write a depth-first tree traversal function”), mainly because I’m a smart aleck and like knowing the corner cases of programming languages that most programmers aren’t aware of :o
Anyway, I’ve just come across this Java VM bug:
for(Object o=null;;o=new Object[]{o});
Running this code in the latest JVM (and also the latest OpenJVM) should cause a JVM crash, unless they’ve rushed out a fix. Now, the question is: why does this happen?
Similarly, for C++ programmers, what is printed out when the following code is executed:
int funA() { std::cout << "In A" << std::endl; return 0; }
int funB() { std::cout << "In B" << std::endl; return 1; }
someOtherFunction(funA(), funB());
I don’t know any C++, but the “obvious” answer seems to be it would output to STDOUT
In A
In B
Of course, I’m guessing the obvious answer isn’t the right one, since otherwise you wouldn’t have asked it. (Also, using a side-effect-causing function to return an argument for another function is naughty.)
…ok I cheated and ran the code. It prints B first. Why is the execution order reversed?
The Java one seems straightforward: it’s a memory leak bomb kind of thing in an infinite loop. The code is making holders that contain each other, like Russian dolls, with no endpoint. I just ran it under 1.5 and it does crash the JVM. Odd that it doesn’t just create an out of memory exception.
I imagine it’s because of the order that the values are pushed on the stack for the method call.
(I’ll try to come back with some edge cases of my own!)
IIRC, the reversal of arguments in a C++ function call has to do with not knowing the number of arguments (because of overloading, and/or functions like print where you really don’t know the number of arguments at compile time).
Another edge case, in C++ and C#: if you have a virtual function foo defined in class Parent, and you override it in class Child, an instance of class Child will call the overridden version of foo, except if foo is called in the Parent’s constructor. Even though you’re instantiating a Child, it’ll call the Parent’s version of foo. Why is that?
Then why does the equivalent C program behave the same way?
Behold:
#include <stdio.h>
int funA() { printf("In A\n"); return 0; }
int funB() { printf("In B\n"); return 1; }
int someOtherFunction(int a, int b) {
return 1;
}
int main() {
someOtherFunction(funA(), funB());
}
Output:
In B
In A
There’s no overloading or method dispatch to worry about here.
I’m puzzled about that Java problem; it seems to be on the same level of geekiness as the well-known double-checked locking flaw in Java. A quick Google of the code turns up lots of folks surmising that it is a crash of the GC, but I didn’t see a truly definitive answer as to why the GC died instead of a proper OOM exception being thrown.
The default in C is for the caller to push the arguments on the stack in the reverse order. It seems nice that this makes them pop in original order.
From here:
The stack explanation makes a lot of sense. I’m inclined to believe that the same explanation is true for the C++ code and the issue doesn’t have anything to do with method dispatch there. (Although I know there are all sorts of other gotchas related thereto.)
Unsurprisingly, C-derived languages with higher-level function semantics execute the functions in the expected order. (I tried it in Perl and Javascript.)
Neither C nor C++ guarantees the order in which function (or method) arguments will be evaluated. Although many compilers, and yours in particular, might choose right-to-left order, you shouldn’t depend on this behavior.
The constructor for the base class is called before the child class. The constructor has to call the parent’s function because the child has not been constructed yet.
Yep, got it. Actually, it’s more general than that: C and C++ don’t guarantee the evaluation order of any expression. The only thing that they guarantee is that expressions will be evaluated before a “sequence point”, and function evaluation just happens to be a sequence point.
Whew - y’all scared me for a second. My first thought was “Undefined Behavior” for the C++ example, but the first responses seemed so confident I was afraid I had missed something.
In fact, my default answer for most C/C++ “tricky” questions is “Undefined Behavior”. More often than not that’s the case.
Undefined, but the different compilers are pretty consistent, and most seem to do right-to-left. Some compilers do provide switches to influence this behavior, along with behavior such as “callee clears the stack before returning”.
I certainly hope nobody asks me these kinds of close-to-the-metal questions—I can philosophize on them all day but wouldn’t be able to give a clean definitive answer in an interview.
My own kind of edge-case question might be like this: I might ask a data warehouse ETL developer to consider a certain bit of ETL code that does batch runs based on last-run-timestamp and explain what the risks are for losing source-system rows, and how they can get around it. One area that many don’t consider is the delayed commit problem, where a source row is hung in a tx until after the ETL ran, making it not visible, and then they increment their counter or last-run-timestamp and never see the row in future runs. There isn’t a right answer to fixing this, but the developer should be able to address the pros and cons of each option.
Here’s a cool one that’s more the style of this thread:
At some time (might be fixed now), if you renamed any Oracle OCI-based database application executable to include parentheses, the application would fail to connect to the database.
For example:
“mysqlapp.exe” would work fine, but “mysqlapp(2).exe” would not. The app would run but its database connection would fail.
You can try it with the SQL*Plus executable if you want.
There is a very good reason for this, but I leave it to the reader to figure it out. And it’s Oracle’s fault.
What about the order of evaluation in a binary logical operator?
if (funA() || funB()) cout << "Done." << endl;
IIRC, funA is guaranteed to be evaluated first, and funB won’t be evaluated if funA were to return 1. Or is the || considered a sequence point? That doesn’t sound right to me, but I’m not a guru.
Certainly not a bad rule of thumb when it comes to C and C++. However, I believe this is actually a case of unspecified behavior, not undefined, with this being the difference:
Undefined behavior indicates that an implementation may behave unpredictably when a program reaches a certain state, which almost without exception is a result of a bug. Undefined behavior can be manifested as a run time crash, unstable and unreliable program state, or—in rare cases—it may even pass unnoticed. Examples of undefined behaviors include an attempt to write to a buffer past its boundary, accessing an out of range array subscript, de-referencing a dangling pointer or deleting a non-NULL pointer more than once.
Unspecified behavior, on the other hand, is a consistent and documented behavior that a certain implementation employs in cases that are left intentionally unspecified by the C++ Standard—usually, to allow efficient and simple compiler implementation on various platforms. […]
To sum up, unspecified behavior is usually something you shouldn’t worry about, unless your software is required to be portable. Conversely, undefined behavior is always undesirable and should never occur.
(Bolding mine.)
You’d be mighty peeved if the mere act of calling a multi-argument function had the potential to crash your program, or worse, just because the evaluation order of arguments was undefined in the C/C++ sense of the word.
The || and && operators are special. “Short-circuit evaluation” is specifically required by both the C and C++ standards in the definition of those operators, so they will always be evaluated from left-to-right.
For example, to quote from section 5.15 of the C++ standard:
“Unlike |, || guarantees left-to-right evaluation; moreover, the second operand is not evaluated if the first operand evaluates to true.”
Capt. Ridley is mostly right in what he says, but there are a few C/C++ operators (like ‘||’) that implicitly include a sequence point. For expressions with such operators, there actually is a well-defined evaluation order of the subexpressions.
I don’t necessarily have any super tricky ones (and this may be a hijack), but in my experience it’s surprising how many interviewees are tripped up by the following questions. In both cases, I tell them to use whatever language they’re most comfortable with, and pseudocode is fine:
1. Write a function to reverse an array in-place.
2. Given the following 3 functions, write a method to print a listing of the contents of a directory and its subdirectories. Your function should take one argument, a string, which is the path to a directory.
boolean isDirectory(String) - given a path, return whether it’s a directory
boolean hasChildren(String) - given a path that is a directory, return boolean
String getChildren(String) - given a path that is a directory, return array of children
I cannot fathom how anyone looking for a programming job can’t figure these ones out. Now, I am very careful not to use the word “recursion” or “recursive” for #2, at least until they’re really struggling, but still. Anybody who I would hire had better be able to solve these in their sleep.
Now, I will admit that there is one red herring and one “trick” part in #2. Additionally, in #1, there is an extremely common mistake most people make their first try which causes their method to do nothing, but I don’t penalize them for that (as I have been known to do it in my own code). Can anybody figure them out?