C Programming - Iterate Through Array

Let's say I have the following function:


float foo (float *input){
  int i;
  float rv;
  for (i=0; i<SIZE_OF_ARRAY; ++i){
    if (*input > THRESH){
      rv=*input;
      break;
    }
    ++input;
  }
  return(rv)
}

Am I passing in a pointer to the first item in the array input? So that ++input moves to the next item?

Would it be valid to rewrite it as:


float foo (float *input){
  int i;
  float rv;
  for (i=0; i<SIZE_OF_ARRAY-1; i++){
    if (input[i] > THRESH){
      rv=input[i];
      break;
    }
  }
  return(rv)
}

If so, is one style preferred over the other?

Thanks,

LarsenMTL

Yes, they are equivalent. Array[i] is the same as *(Array+i) in most (all?) instances. Arrays are passed to functions in C as a pointer to the first element.

I prefer to use indices (Array[i]) because it makes it explicit that you are dealing with an array - and it doesn’t freak out programmers not comfortable with pointers who might read/edit/maintain your code. It also keeps the input parameter unchanged (so you could make it const or use it later in your function if you wanted).
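To make the equivalence concrete, here is a minimal sketch of the same loop written both ways. The function names `sum_by_index` and `sum_by_pointer` and the explicit length parameter `n` are made up for illustration; they are not from the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums n floats using subscript notation. */
float sum_by_index(const float *input, size_t n)
{
    assert(input != NULL);
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        total += input[i];      /* input[i] is defined as *(input + i) */
    }
    return total;
}

/* Hypothetical helper: the same loop with an advancing pointer. */
float sum_by_pointer(const float *input, size_t n)
{
    assert(input != NULL);
    float total = 0.0f;
    for (const float *p = input; p != input + n; ++p) {
        total += *p;            /* p walks from the first element to the last */
    }
    return total;
}
```

Calling either one with an array argument, say `sum_by_index(data, 4)`, passes a pointer to `data[0]`, so both functions visit the same elements in the same order.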

Thanks Jas09!

Both will work fine in this case (although you need to add a semicolon to the return statements).

As to which is better, I would personally prefer the second syntax, because it is clearer. Modifying arguments to a function looks odd, and might lead another programmer (or you, at a later date) to think that perhaps you intended the function to modify external data in the calling function (e.g. write out to *input), as is commonly done when a function needs to return more than one variable (and the programmer doesn’t want to package them in one struct).

Furthermore, the first syntax is error-prone because you might use the same pointer in a later part of the function, and it will be pointing at the end of the array.
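A small sketch of that last point, assuming a made-up function `count_max` that needs two passes over the same array. Walking a local cursor `p` instead of the parameter leaves `input` pointing at the first element for the second pass.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-pass function: counts how many elements equal the maximum. */
size_t count_max(const float *input, size_t n)
{
    assert(input != NULL && n > 0);

    /* Pass 1: advance a local cursor, not the parameter itself. */
    float max = *input;
    for (const float *p = input; p != input + n; ++p) {
        if (*p > max) {
            max = *p;
        }
    }

    /* Pass 2: input still points at the first element. */
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (input[i] == max) {
            ++count;
        }
    }
    return count;
}
```

Had the first pass used `++input` directly, the second loop would have started one past the end of the array.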

On edit: I see several people beat me to it…

See questions 6.3 and 6.4 here for a technical discussion.

I see two potential bugs.

The first, the float rv is not initialized. If none of the values is greater than the threshold, the function will return an undefined value. This is usually undesirable behavior.

The second, in the second code block, the final element of the array is not checked. Do you really intend “i<SIZE_OF_ARRAY-1” instead of “i<SIZE_OF_ARRAY” or “i<=SIZE_OF_ARRAY-1”?

Also, in the second block, I’d define the function as “float foo(const float input[])” to make clear that you are not modifying the array.
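Putting those fixes together, one possible version looks like the sketch below. The name `first_above`, the explicit length `n`, and the `fallback` value returned when nothing exceeds the threshold are assumptions added for illustration; the original uses the macros SIZE_OF_ARRAY and THRESH and does not say what should happen in the no-match case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rewrite: returns the first element greater than thresh,
 * or fallback if no element qualifies. */
float first_above(const float *input, size_t n, float thresh, float fallback)
{
    assert(input != NULL);
    float rv = fallback;                /* initialized, so never indeterminate */
    for (size_t i = 0; i < n; ++i) {    /* i < n also checks the last element */
        if (input[i] > thresh) {
            rv = input[i];
            break;
        }
    }
    return rv;
}
```

The `const` qualifier lets the compiler reject any accidental write through `input`.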

In almost all circumstances, I’d prefer the second style, since it is much more clear what the intent of the function is. However, the pointer method will often be fractionally faster than the array dereferencing (since the array dereference will require a pointer addition, unless the compiler is really good). Unless you’re calling this function literally billions of times, it’s not worth the obscuration.

The second style is preferred, but as Pleonast points out it will most likely be minimally slower for arcane reasons that would make sense if you knew assembler.

I’d just point out, though, that if your objective is to find the point at which you exceed a value, you’d do better to do a binary search if speed is of the essence.
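For what it's worth, a binary search for this could look roughly like the sketch below. It only works if the array is sorted in ascending order, and the name `first_above_sorted` and the convention of returning an index (with `n` meaning "not found") are my own choices for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the first element greater than thresh in an
 * ascending-sorted array, or n if no element is greater. */
size_t first_above_sorted(const float *input, size_t n, float thresh)
{
    assert(input != NULL);
    size_t lo = 0;
    size_t hi = n;                          /* the answer lies in [lo, hi] */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;    /* avoids overflow of lo + hi */
        if (input[mid] > thresh) {
            hi = mid;                       /* mid qualifies; look for an earlier one */
        } else {
            lo = mid + 1;                   /* mid and everything before it are too small */
        }
    }
    return lo;
}
```

This takes on the order of log2(n) comparisons instead of up to n for the linear scan.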

Thanks all! My examples above aren’t “real” code, just enough to let me get my question across.

Thanks for the bug hunting. In my real code, the compiler complained about the first one but I would have missed that second one (I meant to use <=).

Binary search requires your array to be ordered, doesn’t it?

It’s just sort of puzzling what you’re trying to do if the array isn’t sorted.

That doesn’t change the answer to your question.

It would need to be ordered for iteration to work as well.

There’s no reason to subtract the one. You’re better off to just use <.

It depends upon what the routine is supposed to be doing. If it’s supposed to be returning the smallest value that is above a threshold then you’re correct.

This is the SDMB. We eat/sleep/breathe/live arcane crap.

The notation A[i] is equivalent to *(A + i), which at the machine level means loading from the address A + i*sizeof(A[0]). Going through and doing that computation eats a couple of CPU cycles per array reference. Most of the time you just don’t care, but if you’re writing a real-time system or iterating over every element of a really big array, it could matter.

There’ve been a lot of good reasons given to prefer this style over using pointer arithmetic. It might also be worth noting that random access gets to be a little trickier if you’re not using the subscript notation.

If they’re really that interested in performance, they should be compiling with optimization turned on, which ought to result in essentially identical performance. In fact, using less-obfuscated code could make the code easier to optimize, resulting in better performance.


Correct. To be more precise, the compiler should perform some sort of strength reduction on the code snippet that ultrafilter posted. That would replace the expensive multiply with an increment that executes on each iteration of the loop – effectively making the second variant do exactly what the first variant does.
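Written out by hand in C, the transformation looks roughly like the sketch below. This only illustrates the idea using explicit byte addresses; it is not what any particular compiler emits, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the byte address is recomputed with a multiply on every iteration. */
float sum_scaled_index(const float *a, size_t n)
{
    assert(a != NULL);
    const unsigned char *base = (const unsigned char *)a;
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        total += *(const float *)(base + i * sizeof(float));
    }
    return total;
}

/* After strength reduction: the multiply becomes a running addition. */
float sum_running_offset(const float *a, size_t n)
{
    assert(a != NULL);
    const unsigned char *addr = (const unsigned char *)a;
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        total += *(const float *)addr;
        addr += sizeof(float);      /* step to the next element's address */
    }
    return total;
}
```

The second function is the pointer-increment loop from the original question in disguise, which is why an optimizing compiler can make the two styles perform the same.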

Multiplication in C isn’t exactly the same as index offsetting in x86 assembly, if I recall correctly. There isn’t really any way to write an assembler equivalent in C beyond “A[i]”.

The following is about this behavior with regard to the LEA instruction, but the same holds for MOV. The key point is that indexing is almost always an addition and a shift (not a multiplication), and doing it as an index off of a register takes only one clock cycle, whereas running a shift and an add as separate ops will take 2 or more cycles.