Here’s what I want to do (I’m currently using Windows XP):
Check if a program is running
Listen to the program and grab output in various forms (e.g. text from a text box, the color of certain pixels, etc.)
Interact with the program, e.g. click buttons, menus, enter text into text boxes, etc.
Nearly all of my programming experience is in Unix (shell scripting, Perl, C++, etc.) so I’m mostly trying to find out if it’s possible to do this and what to start reading about. Visual Basic? Perl? Windows Scripting Host, whatever that is?
All my Google searches have been in vain, so any guidance would be greatly appreciated.
You can use Visual Basic to write a program of this sort. You will have to learn the Windows API calls (I would google for Windows API). There are plenty of references and help on the net for getting the “handle” of a form and interacting with it and its child controls.
It might help to know what program you are trying to take control of. Some programs, such as Excel and Word, support Automation (COM), which lets you interact with the program without having to use the Windows API calls.
You can use Windows Scripting Host (WSH) also but I prefer VB for its debugging and step through capabilities. It is nice to watch your code execute line by line, query variables, make code changes while the code is executing, and rerun parts of code.
I have found this to be a very time-consuming and frustrating kind of project, so good luck!
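To make the “get the handle and talk to the child controls” idea concrete, here is a minimal C++ sketch, not a definitive implementation. It looks up a top-level window by its exact title (which doubles as a crude “is the program running?” check) and reads the text of the first matching child control with WM_GETTEXT. The function name read_control_text is made up for illustration, and the non-Windows branch is only a stub so the file compiles anywhere.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: reads the text of the first direct child control of
// class `childClass` inside the top-level window whose title is exactly
// `title`. Returns false if the window or the control cannot be found (the
// program is probably not running); on success the text is stored in `out`.
bool read_control_text(const char* title, const char* childClass, std::string& out) {
#ifdef _WIN32
    HWND top = FindWindowA(NULL, title);  // NULL class name = match any class
    if (top == NULL) return false;

    HWND child = FindWindowExA(top, NULL, childClass, NULL);
    if (child == NULL) return false;

    // Ask the control how long its text is, then fetch it.
    LRESULT len = SendMessageA(child, WM_GETTEXTLENGTH, 0, 0);
    std::string buf(static_cast<size_t>(len) + 1, '\0');
    LRESULT copied = SendMessageA(child, WM_GETTEXT,
                                  static_cast<WPARAM>(buf.size()),
                                  reinterpret_cast<LPARAM>(&buf[0]));
    buf.resize(static_cast<size_t>(copied));
    out = buf;
    return true;
#else
    // Stub for non-Windows builds: nothing to look up.
    (void)title; (void)childClass; (void)out;
    return false;
#endif
}
```

As a usage example, calling read_control_text("Untitled - Notepad", "Edit", text) would be expected to fetch what is typed into a fresh Notepad window, assuming that title and class name are right on your system. FindWindowEx only searches direct children, so a control nested inside a panel needs one call per level.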
jsmith got to it first. I don’t know what sort of “clever” solutions there might be out there (I suspect Perl would fall under this category), but I’d just use the tried and true method of API calls. Not sure how familiar you are with the Win32 API, but basically you can send a message (a click, a keystroke, new text) to nearly any control you want.
And yes, using the API is pretty much the lowest level you can work at, so it’s obviously going to be more involved. But it gives you precise control over everything, and really all you need is a solid API reference.
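Here is a rough C++ sketch of what sending commands to controls can look like. It is an illustration under assumptions, not the one true way: it assumes the target window has a standard “Edit” box and a standard “Button” as direct children, and the name fill_and_click is invented for this example. On non-Windows builds it compiles to a stub.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: types `text` into the first Edit control and then
// clicks the button captioned `buttonCaption`, both looked up as direct
// children of the top-level window titled `title`.
// Returns true only if the window and both controls were found.
bool fill_and_click(const char* title, const char* text, const char* buttonCaption) {
#ifdef _WIN32
    HWND top = FindWindowA(NULL, title);
    if (top == NULL) return false;

    HWND edit = FindWindowExA(top, NULL, "Edit", NULL);
    HWND button = FindWindowExA(top, NULL, "Button", buttonCaption);
    if (edit == NULL || button == NULL) return false;

    // WM_SETTEXT replaces the control's text; BM_CLICK simulates a click.
    SendMessageA(edit, WM_SETTEXT, 0, reinterpret_cast<LPARAM>(text));
    SendMessageA(button, BM_CLICK, 0, 0);
    return true;
#else
    // Stub for non-Windows builds.
    (void)title; (void)text; (void)buttonCaption;
    return false;
#endif
}
```

Two caveats: the caption has to match the button’s window text exactly (which may contain an “&” for the keyboard shortcut), and BM_CLICK can be ignored if the button sits in a dialog that is not the active window, so you may have to activate the window first.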
You’ll need to use the Windows API, as the previous posters have said. What they neglected to say is that you can use C++ to do so. As you are familiar with C++, and it is generally the most-used language (short of VB) for dealing with the Windows API, I would strongly suggest you use C++ for this project. (The preferred IDE/compiler would be MS VC++ .NET, but you could use any text editor with GCC, or Borland C++, as well.) There will be many references available to you whether you choose C++ or VB as your programming language. (Previously there would have been more VB references available; now that VB has changed significantly from VB 6.0 to VB.NET, I believe there is probably more good C++ reference material that applies to the “latest version”.) Googling “windows API C++” or “win32 API C++” should turn up a number of references. There are books available on this topic as well.
Win32 API programming is not that difficult once you get into the swing of things. It is conceptually much more difficult, for example, to code any number of introductory computer science data structures: heaps, red-black trees, kd-trees, etc. Win32 API programming can be time-intensive (read: tedious) and frustrating, however. But in the mid-90s there were hundreds of coders who wrote programs to manipulate AOL with the Win32 API. Let me emphasize that for you: AOL users. We used Visual Basic, but since you know C++, that should not be a stumbling block for you.
I would suggest looking for a couple of utility programs that can find information about an object for you. This will make writing code easier. You use these programs by mousing over an object (say, an “OK” button), and the program reports back information about that button, such as its handle, class name, and caption. I know this probably sounds very ill-defined, but that is because my memory of programming the Win32 API is pretty hazy (I haven’t done it since early in high school).
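One utility of this kind is Spy++, which ships with Visual C++. To show roughly what such a tool does under the hood, here is a small C++ sketch that reports the class name, text, and pixel color of whatever is under the mouse cursor. The function names format_rgb and describe_under_cursor are invented for illustration, the output format is arbitrary, and the non-Windows branch is a stub.

```cpp
#include <sstream>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Formats a packed 0x00BBGGRR color (the layout of a Win32 COLORREF) as "R,G,B".
std::string format_rgb(unsigned long colorref) {
    std::ostringstream os;
    os << (colorref & 0xFF) << ','
       << ((colorref >> 8) & 0xFF) << ','
       << ((colorref >> 16) & 0xFF);
    return os.str();
}

// Hypothetical helper: describes the window or control under the mouse cursor.
// Returns an empty string if nothing could be inspected.
std::string describe_under_cursor() {
#ifdef _WIN32
    POINT pt;
    if (!GetCursorPos(&pt)) return "";
    HWND hwnd = WindowFromPoint(pt);
    if (hwnd == NULL) return "";

    char cls[256] = "";
    char text[256] = "";
    GetClassNameA(hwnd, cls, static_cast<int>(sizeof cls));
    SendMessageA(hwnd, WM_GETTEXT, sizeof text, reinterpret_cast<LPARAM>(text));

    // Read the pixel color from the screen's device context.
    HDC screen = GetDC(NULL);
    COLORREF color = GetPixel(screen, pt.x, pt.y);
    ReleaseDC(NULL, screen);

    std::string result = std::string("class=") + cls + " text=\"" + text + "\"";
    if (color != CLR_INVALID) result += " rgb=" + format_rgb(color);
    return result;
#else
    return "";  // stub for non-Windows builds
#endif
}
```

The class name and text it reports are what you would then pass to FindWindowEx in your own program. GetPixel lives in gdi32, so with GCC you may need to link that library explicitly.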
Thanks for the replies so far! I’ll check out that Perl module, as that would be the easiest solution by far. However, I’m always interested in improving my programming skills, so I may check out the API route as well.
Any suggestions for a good book or website on Windows API calls? My initial web searches haven’t turned up any sort of comprehensive reference.
Whoops, factorial’s post wasn’t there when I started posting. Searching for “win32 API C++” did turn up a number of useful tutorials. So let me amend my question in the last post to “does anyone know of a book or website on the Windows API in C++ they think is particularly good and would like to recommend?”
There should be more than enough stuff online for API info…also check newsgroups. But if you’re insistent on using a book, do a search on Charles Petzold.
The reason I didn’t mention C++ in combination with the Win32 API is that I assumed the OP had not done any Windows programming. With VB, you can skip all the Windows side of things and just start messing around with the API calls.
Trigonal, your assumption was right on – I haven’t done any Windows programming. It would be nice to skip the Windows side of things. However, I know absolutely nothing about Visual Basic (for example, I don’t even know if it’s compiled or interpreted), so I guess that’ll be my next web search.
I am ignorant about this aspect of C++ Win32 programming, but is there anything that would stop you from making a console application that uses the Win32 API?
If there isn’t, the fact that he doesn’t know Windows programming would become less relevant.
I don’t know what type of interface you want your program to have, Giraffe (maybe none at all: maybe it periodically gets information and writes it to a text file, for example) so knowing this might be helpful when suggesting a language for you to use.
Nope. In C++ you can make a console application instead of a Windows application with forms (a.k.a. windows), and it can still call the Win32 API.
I am versed in both VB and C++. I used C++ for far longer than I have used VB; however, once I got the hang of VB, I started to like it better for its ease of use on quick and dirty projects. I feel the code is a heck of a lot easier to write than C++, plus I don’t have to worry about cleaning up any of my variables. If you have a .NET version of VB, C++, or C#, I have found their features to be pretty much all the same, so it’s a matter of which language you prefer.
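To illustrate the console point, here is a minimal C++ sketch that needs no forms or message loop of its own: it just asks Windows for the titles of all visible top-level windows. The names collect_title and visible_window_titles are made up for this example, and on non-Windows builds the function simply returns an empty list.

```cpp
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>

// Callback invoked by EnumWindows once per top-level window.
static BOOL CALLBACK collect_title(HWND hwnd, LPARAM lparam) {
    std::vector<std::string>* titles =
        reinterpret_cast<std::vector<std::string>*>(lparam);
    char title[256];
    // Keep only visible windows that actually have a title.
    if (IsWindowVisible(hwnd) &&
        GetWindowTextA(hwnd, title, static_cast<int>(sizeof title)) > 0) {
        titles->push_back(title);
    }
    return TRUE;  // TRUE = keep enumerating
}
#endif

// Hypothetical helper: returns the titles of all visible top-level windows.
std::vector<std::string> visible_window_titles() {
    std::vector<std::string> titles;
#ifdef _WIN32
    EnumWindows(collect_title, reinterpret_cast<LPARAM>(&titles));
#endif
    return titles;
}
```

Calling this from an ordinary main() and printing each title gives a plain console program, which is also a handy way to find the exact window title to pass to FindWindow.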
The Win32::GuiTest module is pretty much just a wrapper around the API calls (and a pretty thin wrapper, at that), so you’ll definitely benefit from knowing about the underlying API.
Depending on your goal, you may be trying to swat a fly with a nuke. And learning even the rudiments of the Win32 API is not for the faint-hearted. It is a gigantic mass of detail, and you won’t start seeing the forest for several months.
If you just want to remotely control a garden variety Windows program on the same PC, there are “macro” programs for just that purpose. They can start another program, click its buttons, and retrieve data from its screens. My personal favorite is called “Macro Express 3” from www.macros.com. Cheap, reliable, and has a lot of capability. There’s no way you could build even 1% of that functionality for the same money unless you work for pennies a day.
The next question is whether the easiest way to tackle your problem is really by remote controlling. For example, if you’re trying to automate some web interaction, it may be easier to write your own program to send/receive over HTTP than it would be to try to remote-control a browser. Only you know what you’re trying to achieve, but my general advice is to take a step back and make sure you’re not starting down the wrong road.