TCP/IP Packets Arriving Out of Order

Sysadmin Dopers:
Is there any precedent for a situation I’m running into where packets arrive in substantially the wrong order?

I’ve got packet A sent at 10:10:10.
Then I’ve got packet B sent at 10:11:20.
These times are per Server 1.

We’re seeing packet B showing up at Server 2, then 10 seconds later packet A shows up at Server 2.

Let me know. This is bizarre.

I assume these are UDP, not TCP? (Not that I know enough about TCP/IP to offer any suggestion based on your answer, I’m just curious)

One of the specific design goals of TCP/IP is to allow packets to arrive out of order and then be reassembled correctly. I’d be more concerned about why it takes you a minute and 10 seconds to send all of two packets in the first place.

Yeah, it has to be UDP or perhaps raw IP (if anyone uses that). TCP guarantees in-order delivery.

I have never heard of this. The point of TCP is that the stack takes care of making sure the data arrives, and arrives in the order it was sent.

As Revtim said, my first guess would have been that this is UDP and not TCP - but I trust that you know it is TCP. I have no other guess though.

Some basics:

  1. Make sure the duplex settings are correct. Duplex mismatches will give you irregular behavior.

  2. Change cables.

  3. Is the switch using only layer 2 or is layer 3 configured as well?

Good question, I believe they’re TCP/IP.

On edit: I’ll check with the programmers.

These packets are sent in response to specific events; if the embedded device controlled by Server 1 experiences a certain physical event, a particular message is sent to Server 2.

It’s irrelevant whether the packets are TCP or UDP. As friedo says, TCP packets can certainly arrive out of order. The point of TCP is that it is supposed to be able to cope with the foibles of an internet, such as packets arriving out of order or not at all.

My problem is actually at the application level.
The app on server 2 was written by a development team that did not expect packets to show up 70 seconds out of order.

Picture this:
“Storage Facility A temperature exceeds bounds. If this condition persists for more than 2 hours, we’ll send a maintenance guy a page.”
“Storage Facility A temperature within proper range. Do not wake the janitor.”

Okay, that works.

Now picture this:
“Storage Facility A temperature within proper range. Do not wake the janitor.”
“Storage Facility A temperature exceeds bounds. If this condition persists for more than 2 hours, we’ll send a maintenance guy a page.”

Okay, that happens, and next thing you know there’s a guy responding to an imaginary thermal problem.

Thus, I’m trying to ascertain whether packets arriving this far out of order is unprecedented.
Do I beat up my app developers or do I beat up the Cisco guy?

Duly noted.
Thank you very much, and I’ll research item 3.

That can’t be a problem with TCP as long as you keep an open connection between the sensor and the other end. TCP will not deliver the “within proper range” message until after it delivers the “exceeds bounds” message. Either you’re using UDP or two distinct TCP connections to deliver the two messages. I’d suggest using only one TCP connection.
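
As a minimal sketch of the single-connection approach, assuming Python on the sending side (the host name, port, and length-prefix framing here are invented for the example): because both messages ride the same TCP connection, Server 2’s application will read them in exactly the order they were written, whatever the underlying IP packets do in transit.

import socket, struct

def send_message(sock, text):
    data = text.encode()
    # 4-byte length prefix so the receiver can split the byte stream
    # back into discrete messages (TCP itself has no message boundaries)
    sock.sendall(struct.pack("!I", len(data)) + data)

sock = socket.create_connection(("server2.example.com", 9000))
send_message(sock, "Storage Facility A temperature exceeds bounds")
send_message(sock, "Storage Facility A temperature within proper range")
sock.close()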

Mr. Slant, when you say the packets are arriving out of order, are you basing this on direct observation of the network traffic (i.e. with a sniffer), or only on when the application reports having received them?

TCP guarantees in-order delivery to the application, but the packets carrying the TCP stream can arrive in any order. UDP, on the other hand, doesn’t have any inherent order, and packets may be delivered to the application willy-nilly. If your programmers assumed that UDP packets would be delivered to the receiver in the order that they were sent, they’re the ones you have a beef with.
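
For contrast, a sketch of the UDP case (hypothetical host and port): each sendto() is an independent datagram that carries no sequence information, so nothing stops the network from delivering the second one first.

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(b"temperature exceeds bounds", ("server2.example.com", 9000))
sock.sendto(b"temperature within proper range", ("server2.example.com", 9000))
# Nothing ties these two datagrams together; they can arrive swapped, or
# one can be dropped outright, and the receiver has no way to tell.
sock.close()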

You’re right, all events are completely distinct from one another at that level of the software.
As I’m a relative newcomer to this software, I’m as confused as anyone else as to why most of the architecture is like it is.
Personally, I would have just had the sensor end (Server 1) contain the logic, and have it send Server 2 the “we need the HVAC guy pronto” message.

TCP knows nothing of higher-level protocols and will happily deliver messages in whatever order the application wrote them. All that TCP guarantees is that the bytes of a particular connection are reassembled into the order in which they were sent. So even if you had one extremely long TCP session, as you suggest, the messages could still arrive in the wrong order if the sending application emitted them in the wrong order.

Anyway, back to the OP - would it be possible for Server 1 to timestamp the messages it sends? That way, the application on Server 2 ought to be able to cope with out-of-order messages.
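
As a rough sketch of that idea (the field names and JSON framing are invented for the example), Server 1 stamps each message with a sequence number and send time, and Server 2 simply ignores anything older than the newest message it has already acted on:

import json, time

seq = 0
def make_message(state):
    global seq
    seq += 1
    return json.dumps({"seq": seq, "sent": time.time(), "state": state})

last_seen = 0
def handle_message(raw):
    global last_seen
    msg = json.loads(raw)
    if msg["seq"] <= last_seen:
        return                    # an older message arrived late: drop it
    last_seen = msg["seq"]
    print("acting on:", msg["state"])

first = make_message("exceeds bounds")        # the alarm, sent first
second = make_message("within proper range")  # the all-clear, sent second
handle_message(second)  # arrives first: acted on
handle_message(first)   # arrives 70 seconds late: silently discarded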

I’ve got two logs.
One is in the core application and the other is in the TCP/IP listener.
The listener is pretty primitive, so I suspect, but cannot be certain, that it will yield very accurate times.
Based on extensive observation, the core application is at most 2 seconds off of the TCP/IP listener’s log times.

I believe what I’m hearing is that TCP/IP does occasionally deliver packets out of order.
Any clues how common this is?

That’ll require coding, but yes.
If I have to buy the programmer time involved, I’m inclined to just put the logic on Server 1 and be done with it.
Server 2’s product line is closer to end of life than Server 1’s, and if I’m spending the money I might as well make the improvement more portable.

No. TCP packets occasionally arrive out of order on the wire, but TCP guarantees that data sent on the same connection will be delivered to the receiving application in order.

If you don’t have access to an enterprise sniffer appliance, you might want to try installing Wireshark on one of the clients to see what it turns up.
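
For what it’s worth, Wireshark’s TCP analysis can flag reordering directly: the display filter tcp.analysis.out_of_order shows the segments its analyzer believes arrived out of order.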

Sounds like Java is involved. :)

On a more serious note, this sounds like an application problem. TCP guarantees that the data will get to the socket in order. I’d look at your core application and see if it’s possible that what it’s logging and what it’s doing are different – i.e. the messages are actually sent out of order, but that is not reflected in the logs. I see that sort of thing daily with brilliance such as:

Thread A:
    LOG("Send Packet A")
    SendPacket()

Thread B:
    LOG("Send Packet B")
    SendPacket()

If both the LOG and SendPacket() are properly but independently synchronized, it’s possible for the LOG to reflect A then B, but reality to reflect B then A.
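
Here’s a runnable sketch of that failure mode in Python (the locks and the fake “wire” queue are invented for the demo): the log and the send are each synchronized, but not under one shared lock, so the log order and the wire order can disagree.

import threading, queue, random, time

log_lock = threading.Lock()
wire = queue.Queue()   # stands in for the socket; Queue is itself thread-safe
log_lines = []

def send_event(name):
    with log_lock:                        # the LOG is synchronized...
        log_lines.append("Send Packet " + name)
    time.sleep(random.random() / 100)     # the scheduler can slip in here
    wire.put(name)                        # ...and the send is synchronized,
                                          # but independently of the log

threads = [threading.Thread(target=send_event, args=(n,)) for n in "AB"]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("log order :", log_lines)
print("wire order:", [wire.get(), wire.get()])
# Run it a few times: sometimes the log says A then B
# while the wire actually saw B then A.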