Looking for a program that will get records from a web form without my constant input

I don’t know what to call what I’m about to describe; otherwise I’d search for it myself. I’m looking for a program that gives me an easy interface for looking through records on a website without my constant input. I guess you’d call it a kind of bot, but I want one with a good UI, and one that is free or almost free. I don’t care how long the searches take, and I don’t mind if the program is as slow as a human typing: I don’t want to swamp someone’s server. I just don’t want to sit here and type in hundreds of searches.

I am looking through a long list of records on a county assessor’s site, and have to enter a five-field PIN to pull up each record. The fields in the form don’t require me to hit the tab key, but sometimes the cursor gets stuck in a field. The form that comes up has a hyperlink in it, and you have to click that hyperlink to get the record. Also, the final record has a link to a picture file which is usually empty but sometimes has a picture in it. I’d like to capture that picture as well. Are there any programs out there that will do this?

Rather than generating all these queries through the web, which is gonna swamp the auditor’s server no matter how you do it, I’d suggest contacting your county auditor to see if the information is available in any other format. I know the Lucas County (Ohio) Auditor sells, for a very nominal fee, a CD-ROM containing all the data and images available in their online version.

http://www.co.lucas.oh.us/auditor/Real_Estate/AREIS402.asp

This is Cook County (Chicago), where they’re not very helpful or forthcoming with their information if it doesn’t follow their internal policies to give it out. I’m pretty sure they don’t sell it on CD, and I don’t know any “insiders” who could help me.

My original question stands.

I’m going to jump in with an instinctive response of “Perl” here, though that’s probably not very helpful 'cos if you knew Perl, you’d probably already be using it. :) You could try posting on this site and someone might be able to help you out (of course, you’d need Perl installed…)

When you complete the form and press “send” (or whatever), does it then take you to a complicated URL which contains some of the info you’ve typed, like that five-field PIN? If so, does replacing the PIN in the URL with another one give you the corresponding page? (In other words, is the job of the web form just to generate the URL which delivers the correct record?) If so, then (assuming you’ve got a list of PINs) it’s just a case of generating the URLs yourself and downloading the HTML files.

Can you post a link to the site so we can see what we’re up against? And what type of computer are you on?

guessing that it might be:

http://www.cookcountyassessor.com

and having a quick look it appears that both the records and pictures are accessed by URL in a fairly straightforward manner…we could be in business here.

Yes, that’s it. There’s another site I’m interested in doing the same thing with (Cook County treasurer) but that one has an added twist, which I doubt I could overcome: it has a raster image of a “key” you have to input so as to defeat attempts like mine.

This would all be moot of course, if the different offices made flat electronic data files available to the public. I think that’s too advanced a concept for the County. (It would save paper and cost them almost nothing) ::sigh::

I just thought there might be an easy program out there to handle things like this, because there have been many other occasions I’ve come across where it would have come in handy. I’m sure that others have experienced the same thing.

I expect that any such program would have to have some kind of macro-ish programming language built in, simply because of the huge number of ways of doing these kinds of things. The name for it is screen scraping, so you could try searching for that.

For what it’s worth, you can get any individual record (I think!) using the URL:


http://www.cookcountyassessor.com/filings/searchnew/searchdetails001.asp?pin=XXXXXXXXXXXXXX

where the X’s stand for the PIN you’re interested in, written as 14 digits with the dashes removed - so to get record number 01-01-100-020-0000 the URL is:

http://www.cookcountyassessor.com/filings/searchnew/searchdetails001.asp?pin=01011000200000
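If your list has the PINs written with dashes, something like the following might do the conversion for you. It's only a rough sketch: the function names pin_digits and record_url are made up for illustration, and it assumes the pin= parameter wants all 14 digits with the dashes stripped and the leading zero kept.

```shell
#!/bin/sh
# Base URL copied from the post above.
BASE='http://www.cookcountyassessor.com/filings/searchnew'

# Keep only the digits of a PIN: 01-01-100-020-0000 -> 01011000200000
# (hypothetical helper name)
pin_digits() {
    printf '%s' "$1" | tr -cd '0-9'
}

# Build the record URL for one PIN.
# Assumption: the site expects the bare digits in the pin= parameter.
record_url() {
    printf '%s/searchdetails001.asp?pin=%s\n' "$BASE" "$(pin_digits "$1")"
}

# Example: print the URL for the record mentioned above.
record_url '01-01-100-020-0000'
```

Running it should print the record URL for that one PIN, which you can paste into a browser to check the assumption before fetching anything in bulk.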

Getting the images is a bit trickier; first you have to access the following URL to get the image dynamically loaded onto the server:


http://www.cookcountyassessor.com/filings/searchnew/ParcelImage.asp?pin=XXXXXXXXXXXXX

where again the X’s are the PIN. Then the image will be at:


http://198.173.15.21/AssessorPics/GeoSpan/XXXXXXXXX.jpg

Note that there’s a .jpg on the end of this one.
For me (on a Linux box) the following:



wget 'http://www.cookcountyassessor.com/filings/searchnew/searchdetails001.asp?pin=XXXXXXXXXXXX'
wget 'http://www.cookcountyassessor.com/filings/searchnew/ParcelImage.asp?pin=XXXXXXXXXX'
wget 'http://198.173.15.21/AssessorPics/GeoSpan/XXXXXXXXXXX.jpg'


works fine. A simple shell script to do this (or something similar) for a list of all the PINs you’re interested in (and rename the files to something sensible) would do what you’re after.
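Here's roughly what such a script could look like. Treat it as a sketch, not something tested against the live site: the file name pins.txt, the output names record_PIN.html and image_PIN.jpg, the function names and the 10-second pause are all my own assumptions, and it assumes the image file is named after the same 14 digits as the PIN.

```shell
#!/bin/sh
# Sketch only: file names, function names and the delay are assumptions.
BASE='http://www.cookcountyassessor.com/filings/searchnew'
IMG_BASE='http://198.173.15.21/AssessorPics/GeoSpan'
DELAY=10   # seconds to wait between PINs, to stay gentle on the server

# Fetch the record page and (if there is one) the picture for a single PIN.
fetch_pin() {
    pin=$(printf '%s' "$1" | tr -cd '0-9')   # drop dashes and stray characters
    [ -n "$pin" ] || return 0                 # nothing left: skip blank lines

    # URLs are quoted because an unquoted '?' is a glob character in the shell.
    wget -q -O "record_$pin.html" "$BASE/searchdetails001.asp?pin=$pin"
    # This request gets the image loaded onto the server; its output is discarded.
    wget -q -O /dev/null "$BASE/ParcelImage.asp?pin=$pin"
    wget -q -O "image_$pin.jpg" "$IMG_BASE/$pin.jpg"
    # If nothing came back, wget -O leaves an empty file behind; remove it.
    [ -s "image_$pin.jpg" ] || rm -f "image_$pin.jpg"
}

# Read one PIN per line from the file named in $1, pausing after each.
fetch_all() {
    while IFS= read -r line; do
        fetch_pin "$line"
        sleep "$DELAY"
    done < "$1"
}

# Only run when a PIN list is given on the command line.
if [ "$#" -gt 0 ]; then
    fetch_all "$1"
fi
```

Save it as, say, fetch_pins.sh, put one PIN per line in pins.txt, and run `sh fetch_pins.sh pins.txt`. Raise DELAY if you want it to go even easier on their server.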