I need to search my company’s website for a particular image, to see how many times it occurs throughout the site.
Because the file has a unique name, I thought, ‘easy! I’ll just type the file name into Google.’ But it looks like Google only searches the visible text of a page and doesn’t index the underlying HTML, which is where the image filename actually lives.
Can anyone suggest a way to find how many times an image occurs on a website? Perhaps a search engine that doesn’t filter out HTML code?
(Telnetting in and searching directly on the Unix server is not an option, btw.)
Have you got (or can you get) the ability to FTP into the server? Then it would be a simple matter to check each .html or .htm document manually, or you could automate it with a PHP or Perl script.
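If you do get the files down to a local folder (over FTP or otherwise), the Perl version of that search could be as simple as the rough, untested sketch below. The folder name and image filename are just placeholders for whatever yours really are.

#!/usr/bin/perl
# Rough sketch, not tested against any real site: walk a local copy of the
# site and count how many times an image filename appears in each .html/.htm
# file.  The directory and filename below are placeholders.
use strict;
use warnings;
use File::Find;

my $dir   = 'local_site_copy';           # placeholder folder holding the pages
my $image = 'unique_logo_filename.gif';  # placeholder image filename

my $total = 0;
find(
    sub {
        return unless -f $_ && /\.html?$/i;
        open my $fh, '<', $_ or return;
        local $/;                        # slurp the whole file at once
        my $html  = <$fh>;
        my $count = () = $html =~ /\Q$image\E/g;
        if ($count) {
            print "$File::Find::name : $count\n";
            $total += $count;
        }
    },
    $dir
);
print "Total occurrences: $total\n";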
Probably too late for this idea to work for you now, but I always keep a complete mirror-image copy of all the data for every web site in a folder on my local PC or network, down to the exact directory structure. Then I can use all the standard search tools from a PC, including some dynamite, lightning-fast DOS stuff.
I can’t imagine doing it any other way. In your case, would FTPing all the data down to your local PC be an option? (Just in case PHP or Perl isn’t in timgregory’s vocabulary, Q.E.D.)
Google has an image search feature. Go to their main page and click the “Images” tab. It appears to work based on filename, and the advanced search allows you to specify a domain. It didn’t work on a quick search for an image name on one of my domains, though, so either they haven’t indexed me or it’s not really using the filename and is relying on alt tags or context instead.
Another option would be to use an offline browser like WebWhacker (commercial, but there are probably free alternatives with similar functionality). This might be easier if you don’t have FTP access to download the files since this kind of utility will spider a site and create a local copy for you. You could then search that local copy using a variety of tools.
FTPing is not an option, sadly. For security reasons, users at my level don’t have direct access to any of the web servers; we access the pages through a file-management tool (one without a search feature).
Keeping local copies through this tool would be cumbersome at best. Plus I just started this job, so I haven’t touched many of the pages yet!
Google’s image search didn’t work either, although Google can indeed find our pages (if I type in a phrase from any page).
I’ll check out WebWhacker and any similar (free) tools. Thanks!
In the meantime, if anyone else can think of a decent workaround for my lame server access, I’d appreciate it! Thanks.
Search for WebStripper; it will download the whole website by following the linked pages on the domain, and you can then use those files to see how many times the image appears.
1. Call the web person and ask either for the answer or for access to a copy so you can find the answer yourself.
2. Write a Perl script to spider through the pages and search for the filename you’re looking for. Check out lwp-rget for a start (rough sketch below).
3. Download the site and grep through it. Black Widow is a good tool for grabbing a copy.

Note that for options #2 and #3, you’re relying on the site being easily spiderable. So if it has Flash menus or complex JavaScript or whatever, it won’t work.
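For option #2, something along these lines is the general idea. This is a rough, untested sketch that leans on LWP::UserAgent and HTML::LinkExtor (both in the standard libwww-perl bundle) rather than lwp-rget itself, and the starting URL and image filename are placeholders:

#!/usr/bin/perl
# Rough sketch only: crawl pages on one host, count occurrences of an image
# filename in each page's HTML source, and print a running total.
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;
use URI;

my $start = 'http://www.example.com/';   # placeholder starting page
my $image = 'unique_logo_filename.gif';  # placeholder image filename

my $ua         = LWP::UserAgent->new( timeout => 15 );
my $start_host = URI->new($start)->host;
my %seen;                                # pages already fetched
my @queue = ($start);
my $total = 0;

while ( my $url = shift @queue ) {
    next if $seen{$url}++;
    last if keys %seen > 500;            # crude cap so a runaway crawl stops

    my $res = $ua->get($url);
    next unless $res->is_success && $res->content_type eq 'text/html';
    my $html = $res->decoded_content;

    # Count how many times the filename appears in this page's source.
    my $count = () = $html =~ /\Q$image\E/g;
    if ($count) {
        print "$url : $count\n";
        $total += $count;
    }

    # Pull out the links and queue the ones that stay on the same host.
    my $parser = HTML::LinkExtor->new( undef, $url );
    $parser->parse($html);
    $parser->eof;
    for my $link ( $parser->links ) {
        my ( $tag, %attrs ) = @$link;
        next unless $tag eq 'a' && $attrs{href};
        my $abs = URI->new_abs( $attrs{href}, $url );
        next unless $abs->scheme && $abs->scheme =~ /^https?$/;
        $abs->fragment(undef);           # drop #anchors so pages aren't fetched twice
        push @queue, "$abs" if $abs->host eq $start_host;
    }
}

print "Total occurrences: $total\n";

Option #3 is the same counting step, just run over whatever Black Widow (or WebStripper, mentioned above) saves to disk instead of fetching the pages live.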