I publish a print newsletter and am considering selling a CD or DVD that would contain back issues, probably in PDF format. It would obviously be convenient for users of this disc to be able to search for words and phrases of interest.
Of course, they could use the search features of their native operating system, but that would probably be somewhat cumbersome, since the OS wouldn’t have indexed the disc.
I assume I could include a software package (and an index) that would perform the function, but I know nothing about what would be required.
What is the best way of doing this, and can you recommend any specific products from your own experience? Or point me to other resources that will help me research this further?
Adobe Acrobat can search every PDF on a disc for a search term. Check whether that is the functionality you want. If not, try to work out what it lacks, and maybe someone can point you to a third-party app that fills the gap.
For Windows, you could just write a quick-and-dirty WinForms app in C# (using Microsoft Visual C# Express) to accomplish this. It would require .NET to be installed, but you can compile to .NET 1, which should be available on almost every Windows OS, including most copies of Windows XP.
If you want it to be cross-platform, it’s a bit trickier.
You could take the same approach as the C# path outlined above, but with a cross-platform development environment such as REALbasic, Runtime Revolution, or FileMaker. Those are all expensive.
You could possibly write a Flash movie as a .swf file, but then the customer won’t be able to run it unless they have Flash installed. Even worse, security restrictions will probably prevent Flash from querying the filesystem.
Java has the same problems, with the added complication that it’s pretty terrible for building UIs for something like this.
You could look into an HTML/JavaScript solution, which could work in Windows without triggering too many security warnings. But then you wouldn’t be able to do a natural text search of the files; you’d need your search index “contained” in the JavaScript somehow. That’s rather complex, but doable.
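To give a rough idea of what “contained” might mean, here is a minimal sketch in Python of a script you would run once, before burning the disc. It assumes the text of each PDF has already been extracted (that step needs a PDF library and isn't shown), and the file names, sample text, and the SEARCH_INDEX variable name are all made up for illustration. It builds a simple word-to-files index and serializes it as a JavaScript assignment that an HTML page could load. Note that this is a plain "all words must appear" lookup, not a phrase search.

```python
import json
import re
from collections import defaultdict

# Hypothetical input: text already extracted from each PDF, keyed by file name.
# (Extracting the text requires a PDF library and is not shown here.)
ISSUES = {
    "issue-001.pdf": "Spring planting guide and soil testing tips",
    "issue-002.pdf": "Soil drainage, compost, and autumn planting",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(issues):
    """Map each word to the sorted list of files that contain it."""
    index = defaultdict(set)
    for filename, text in issues.items():
        for word in tokenize(text):
            index[word].add(filename)
    return {word: sorted(files) for word, files in index.items()}


def search(index, query):
    """Return the files that contain every word in the query (AND search)."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


def to_javascript(index, variable="SEARCH_INDEX"):
    """Serialize the index as a JavaScript assignment for a <script> file."""
    return "var %s = %s;" % (variable, json.dumps(index, sort_keys=True))


INDEX = build_index(ISSUES)
```

You would write the output of to_javascript(INDEX) to a file on the disc and load it from the HTML page with a script tag. The lookup in the page's own JavaScript would follow the same logic as search() above: split the query into words, look each one up, and keep only the files common to all of them.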
Yeah, I was thinking of a simple HTML index file to make it easy to get to the PDF files, but searching wouldn’t work well in that situation. However, the full version of Acrobat 9 has the ability to create PDF Portfolios, which can bring together multiple files in a searchable format.