We have Microsoft Word 2003 documents that are prepared as forms for users to fill out. There are gray boxes they can type into, and open squares that populate with an “X” when double-clicked. One of our secretarial support people created these documents.
I have to harvest the user data from a large number of these filled-out .doc files, and was wondering if I could automate the process. If I open the file in a text editor, I can find the user responses typed into the gray boxes, and I could write a program that would pull this out. But I don’t see any sign of where the “X” selected boxes are encoded. The text that accompanies the boxes is easy to find but none of the bytes nearby seem to change for the box that is X’d. There are plenty of non-ASCII bytes in the file, but I haven’t picked up a clue regarding where this information is stored.
I’d make the [del]idiot[/del] secretarial support person key all the data into Excel or a database app or raw tab-delimited text, which would teach him or her not to do something so annoying ever again.
Then import the spreadsheet or table or .txt file with comparative ease.
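For what it's worth, here is a rough sketch of what the "comparative ease" looks like for the tab-delimited case, using Python's standard csv module. The column names (initials, department, option_a) and the sample rows are made up for illustration; nothing in this thread says what the real form contains.

```python
import csv
import io


def read_tab_delimited(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


# Hypothetical sample data: one header row, then one row per respondent.
# An "X" in option_a stands for a checked box; an empty field means unchecked.
sample = (
    "initials\tdepartment\toption_a\n"
    "JQ\tSales\tX\n"
    "RM\tLegal\t\n"
)

rows = read_tab_delimited(sample)
for row in rows:
    checked = row["option_a"] == "X"
    print(row["initials"], row["department"], checked)
```

For a real file you would pass an open file object to csv.DictReader (opened with newline="") instead of wrapping a string in io.StringIO.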
Wait, what? I asked the person to put these documents out there. We need to email people to get their information and so we attached these to the emails, with instructions to fill it out, save the doc with their initials in the filename, and attach that in an email back to us.
Google Forms has an easy, drag-and-drop GUI that allows you to publish a form that will auto-populate a Google Spreadsheet. You can then easily export that spreadsheet to Excel.
You should have had them fill their answers out in an Excel spreadsheet, which is an importable format.
Word processing documents are only good for word processing. They ain’t worth shit for data entry, nor are they the correct format for graphics. I don’t know why the hell people reach for their freaking word processor to create a document that isn’t a word processing document.
As you yourself discovered, you can’t (easily) import directly from Word.
Wait a minute. So if you had done it as a web-based project, you’d have needed reviews and a subcontractor, but you didn’t need reviews, etc., to do it as a Word document? Why?
We don’t know how complicated or elaborate this document was. If it was a fairly simple and short form, with perhaps a little form letter with it, it could easily have been done as well as a web page, with a little bit of PHP and SQL on the back end. Would your company have required endless reviews just because it’s written in HTML instead of Word? (Of course, this thread has morphed from “How do I read the Word file?” to “How could it have been done differently, meaning how could it be done next time?”)
It would only take a limited knowledge of XHTML, PHP, and SQL to do a reasonably simple form. I’m very much a n00b at this myself, and I think (as of just this last week or two) I could have done it, possibly except for the subtleties of authentication and security. (But I have SQL experience too.) It should be easy enough to find someone who knows enough to do this, or can quickly learn enough, if you don’t.
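To give a feel for how small the SQL side of a simple form can be, here is a minimal sketch. It is written in Python with the built-in sqlite3 module rather than PHP, purely so it runs on its own. The table name, the columns, and the save_response helper are all hypothetical, and it leaves out the web page itself along with the authentication and security parts mentioned above.

```python
import sqlite3


def save_response(conn, initials, comments, box_checked):
    """Store one submitted form; placeholders keep user input out of the SQL text."""
    conn.execute(
        "INSERT INTO responses (initials, comments, box_checked) VALUES (?, ?, ?)",
        (initials, comments, int(box_checked)),
    )
    conn.commit()


# An in-memory database stands in for a real server-side database here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE responses ("
    "id INTEGER PRIMARY KEY, "
    "initials TEXT NOT NULL, "
    "comments TEXT, "
    "box_checked INTEGER NOT NULL)"
)

# Two made-up submissions: one with the box checked, one without.
save_response(conn, "JQ", "No changes needed", True)
save_response(conn, "RM", "", False)

rows = conn.execute(
    "SELECT initials, comments, box_checked FROM responses ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

The point is only that each submission becomes one row, so the "harvesting" step turns into a single SELECT instead of digging through .doc files.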
ETA: Okay, so it would take longer than a 10-minute SurveyMonkey job, if that’s how quick it really is with them. But whenever I get a survey from SurveyMonkey, I never answer it. How many other people do you think would respond to a survey that came from a questionnaire company?
Senegoid, that seems a little naive re. corporate politics. I was constantly having to do things like this in my last job, and for me (a senior middle manager in corporate marketing) to get access to a public-facing company webserver with the correct permissions would probably have taken a month of interdepartmental wrangling (and I was in charge of the website!), followed by a three-week testing cycle.
All that grief versus knocking together a Word document and emailing it, which anyone in an office can do. My second point is why bother when all that coding has been done before, and better, by companies that are dedicated to such a job and offer the service for free?
Adding to a large company website, putting something on their webserver or similar isn’t a trivial matter. It needs to go through different layers of approval.
Emailing out a simple Word document to a few hundred or a few thousand recipients is something that I don’t need approval for.
If you’re in marketing, you face this sort of thing on a daily basis.
While a Word form may not be the “best” way, it is easy and fast, it can be done on the spot, and it has the advantage that a Word document is much easier to print and fill out by hand than a web document (which may or may not be a consideration, but shouldn’t be ignored).
It’s not always about coming up with the most elegant solution, but often the one that can be done fastest and with the least hassle.
By the way, in my last job I rented a ‘secret’ webserver that the IT department didn’t know about, that I did use for testing and other stuff similar to your proposal. But again, SurveyMonkey does it better than I ever could.
jjimm and bengangmo, okay, I was just thinking of the technical work involved in doing it, not the corporate politics or web admin hassles. I wasn’t clear on whether the OP’s questionnaire was going to the outside world, or perhaps just to in-house people, nor whether it might all go on some local in-house intranet server.
As for renting a “secret” server for development and testing, I’ve been reading that VMs are all the rage for doing that these days. In my Linux Admin, Windows Server Admin, and SQL programming and admin classes, we always set up our own VMs to play with. (I was a Unix admin 25-some years ago. Things have changed since then, to put it mildly.)