Monitoring PCs across wireless - an oddball situation

Alrighty, I’m looking for a solution that’s oddball enough that I really can’t come up with a good way to search for it. I’m hoping someone out in SDMBland can suggest a product.

I’m trying to do some monitoring of PCs for telemetry and video surveillance that are on school buses. These units are accessible via Wi-Fi, but naturally they are only accessible when they are near the school campuses with outdoor Wi-Fi coverage.

These PCs will spend most of their time off, and most of their on-time outside of the Wi-Fi coverage. That being said, there will be a time every day, Monday through Friday, when these things will be both online and in Wi-Fi coverage.

The real problem is that the normal state for these things is to be offline, but just not offline for, say, more than 24 hours. I don’t want to just email the administrator when the things go “offline” (because they will constantly), and I don’t want to do something like emailing a notification on boot because it’ll force someone to dig through 30 messages looking for the one that doesn’t show up. They just won’t do that.

So, the ideal situation is to find a way to get some sort of central monitoring and have it act on a rule like “If it’s Monday through Friday and a system hasn’t checked in within the last 24 hours, send a notification.”

Has anyone got any ideas on how to go about this?

-Joe

Are the PCs running Windows?

If so, you could create a VB script that is scheduled to run every 30 minutes or so, but the script itself decides whether or not to send the email.

The script could also write to a text file that saves the current date/time of the last email.

So, you set a variable for LastSent by grabbing the date/time of the last send from the text file, then turning it into a datetime type.

Then you check …

If today is Monday-Friday…
If now (datetime) compared to LastSent (datetime) >= 24 hours…
Then attempt to send email.

If the email fails because it cannot get a network connection, then no new date/time is written to the text file.

If the email is a success, then write the current date/time to the text file.

The scheduled task will run again 30 minutes later and fail at step 2, thus not sending an email.

Clear as mud?
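
For anyone who wants to see the logic laid out, here is a rough sketch of those steps. It is in Python rather than VB script, purely for illustration. The file name `last_sent.txt` and the `send` callable are made up; `send` stands in for whatever actually delivers the email and is assumed to return True only on success.

```python
from datetime import datetime, timedelta
from pathlib import Path

STATE_FILE = Path("last_sent.txt")  # hypothetical location for the timestamp


def read_last_sent(path):
    """Return the stored timestamp, or None if the file is missing or unreadable."""
    try:
        return datetime.fromisoformat(path.read_text().strip())
    except (OSError, ValueError):
        return None


def should_send(now, last_sent, min_gap=timedelta(hours=24)):
    """Step 1: weekday check. Step 2: at least min_gap since the last send."""
    if now.weekday() > 4:  # 5 and 6 are Saturday and Sunday
        return False
    return last_sent is None or now - last_sent >= min_gap


def run_once(now, path, send):
    """One scheduled run. `send` is a caller-supplied callable (an assumption)."""
    if not should_send(now, read_last_sent(path)):
        return False
    if not send():
        # No network, so leave the file alone and retry on the next run.
        return False
    path.write_text(now.isoformat())
    return True
```

The scheduled task would call something like `run_once(datetime.now(), STATE_FILE, send_email)` every 30 minutes, where `send_email` is whatever mail routine you already have.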

Can you collect the DHCP logs from the Wi-Fi networks? If so, you could periodically analyze them to check which PCs have established contact with one or more of your networks, and issue alerts/reports as needed.

Yes, they are running Windows, and yes, you are clear as mud.

The problem I’m seeing, assuming I’m seeing through the mud correctly, is that eventually an email will be sent by every running machine. Assuming all are working correctly, you’ve got the administrator getting 30 emails that they have to sift through, looking for the one that’s not there. That would be a lot more work than, and accomplish the same thing as, having the software running on the machine send an email when it boots.

(Edit)

Now I understand a little bit more clearly, but if I’m understanding it correctly, your process will involve Machine X emailing a report that it’s been down too long. Which will be a problem in that a dead machine can’t send email.

It might be possible to do something like that with the DHCP logs, but you’re looking at collecting logs from multiple DHCP servers, searching for a particular MAC, checking the lease time, and all that. Plus, a typical DHCP server will have lease times of a week. I can’t go messing with that on their network.

Honestly, I could do something as simple as a ping test (I’d rather make certain particular services are running on the PCs, but one hurdle at a time). Stuff like TheDude can do things like that. I just can’t figure out how to tweak it to only send an email if something is offline more than 24 hours.

-Joe
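
One way to picture the “only complain after 24 hours” behavior: the poller never alerts on a failed check. It only records the time of the last successful one, and an alert would fire only when that timestamp is more than 24 hours old. A minimal sketch in Python, assuming each bus PC exposes some TCP service to connect to; the `targets` mapping and the function names are hypothetical, and this says nothing about how TheDude itself is configured.

```python
import socket
from datetime import datetime


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def poll(targets, last_seen, now):
    """Update last_seen for every target that answers.

    targets:   dict of machine name -> (host, port)   (hypothetical layout)
    last_seen: dict of machine name -> datetime of the last successful check
    Machines that do not answer keep their old timestamp; no alert is raised here.
    """
    for name, (host, port) in targets.items():
        if is_reachable(host, port):
            last_seen[name] = now
    return last_seen
```

Checking a specific service port instead of a plain ping also covers the “make certain particular services are running” wish, at least to the extent that the service accepts connections.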

There are existing canned monitoring solutions for school buses from various surveillance vendors. Wouldn’t one of those make more sense than trying to cobble something together?

My bad. I misread your sentence to mean that you wanted the machines to send email once every 24 hours.

But you can set up a system on the receiving machine (or a machine that can receive copies of the emails) that parses the emails and decides who has checked in. All you would need is a dedicated email address for the reports that only accepts emails from the bus systems (to avoid collecting spam and having to sift through unrelated emails), with the subject line written to indicate the machine.

When a machine checks in (using my example script), the receiving system marks the date/time/machine name in a text file and then a report can be run at your convenience that shows which machines haven’t checked in.
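
A rough sketch of the receiving side, in Python for illustration. It assumes the mailbox has already been read into (subject, received time) pairs by whatever mail tool you use, and that each bus sends a subject like `CHECKIN BUS-12`. That subject format is invented here; use whatever naming you settle on.

```python
import re
from datetime import datetime

# Hypothetical subject format: the word CHECKIN followed by the machine name.
SUBJECT_PATTERN = re.compile(r"^CHECKIN\s+(\S+)$")


def record_checkins(messages, last_seen=None):
    """Return a dict of machine name -> most recent check-in time.

    messages:  iterable of (subject, received_at) pairs
    last_seen: optional dict from a previous run to carry forward
    Subjects that do not match the pattern are ignored.
    """
    last_seen = dict(last_seen or {})
    for subject, received_at in messages:
        match = SUBJECT_PATTERN.match(subject.strip())
        if not match:
            continue
        machine = match.group(1)
        if machine not in last_seen or received_at > last_seen[machine]:
            last_seen[machine] = received_at
    return last_seen
```

The resulting dictionary is the “date/time/machine name” record; writing it out to a text file between runs is left to taste.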

Surprisingly, this is a common issue with many of them. What often happens is there’s an incident, someone goes out to the bus to pull the hard drive out of the DVR to check the incident, and it turns out the thing has been dead for the last three months. Last time it was checked the HDD wasn’t slotted in all the way, a fuse blew, or the DVR is completely fried.

What I’m looking for isn’t something that will tell us immediately if a unit is dead (with all the comings and goings it’s pretty much impossible), but something that will tell us if it died yesterday.

-Joe

As you note, “going dead” can’t trigger some action or notification. The notification that something isn’t working has to be created through the absence of something. Instead of making this a manual search through emails for the one that isn’t there, you could fairly easily automate this with a report or simple program. Have each bus ping “home” every hour when on, and run a report of any buses that haven’t pinged in 24 hours.
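
The report itself can be tiny. Here is a sketch in Python, assuming you keep a list of expected bus names and a record of when each one last pinged home; both structures and the function name are made up for the example.

```python
from datetime import datetime, timedelta


def overdue_machines(expected, last_seen, now, max_age=timedelta(hours=24)):
    """List machines that have not checked in within max_age.

    expected:  iterable of machine names that should be reporting
    last_seen: dict of machine name -> datetime of the last ping
    Returns an empty list on Saturday and Sunday, when no check-ins are expected.
    """
    if now.weekday() > 4:  # 5 and 6 are Saturday and Sunday
        return []
    late = []
    for name in sorted(expected):
        seen = last_seen.get(name)
        # A machine that has never pinged counts as overdue too.
        if seen is None or now - seen > max_age:
            late.append(name)
    return late
```

One design note: with a flat 24-hour window, a report run early Monday morning would flag every bus, since the last pings were on Friday. Running it at the end of each weekday, after the buses have had their chance to check in, is one way around that.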

Why are you trying for a technological solution in the first place? Isn’t this something that can be checked with routine maintenance? I assume that someone checks the bus over once every so often. Just add this to the list of things to check.

Where is the DVR located on the bus, and does it display any sort of “I’m on and working” indication? If it’s in easy sight of the driver, I’d like to think it would be easy to add “make sure the green light is on” to their pre-drive checklists. Or are these things truly black boxes under the floor and not easily monitored?

Truly a black box, and intentionally so. The drivers aren’t supposed to have access or control over the things.

-Joe

If you have an SMS server, it can do this. My terminology might be off a little, but set up a collection with all the bus PCs in it. Install the SMS client on all the bus PCs. Then run a report on the SMS server against that collection showing the last time each system reported to SMS.

-Otanx