Working with a large document.

I’m currently working on a novel and I’ve been having a lot of problems with corrupt files. I’m on a Vista laptop: dual-core processor, 2GB of RAM, Nvidia GeForce 9200M GE (1GB). Not a particularly good system, but you’d think it would be able to handle a document.
I’ve been working with Open Office Writer (latest version) and every once in a while when I come back to the document it will get stuck trying to open the file. The green bar in the bottom will remain at about two thirds importing the document. I have had to do an enormous rewrite previously because of this, since I had been on a rather lengthy writing spree, and somewhere amongst the endless coffee and take away pizza I’d forgotten to back up. I learned my lesson from that, and back up regularly now, but it still seems to happen every once in a while. It’s quite frustrating having to rewrite sections because suddenly the file became corrupted and I’m slowly losing the will to continue.

The document itself is 195 pages long so far, 125k words. I personally don’t consider that particularly long, but from my google searches it seems that anything over about 50 pages is considered ‘long’ as far as corrupting files goes.
I’ve tried using Microsoft Word 2000 (the only version I have), but my custom dictionary of fantasy names is not present and the formatting is a little off. It also crashes when I try to close the file, so I’m not convinced it can handle what I’m doing here either. I have tried contacting the creators of Open Office about the problem, but they had no idea what to do and asked me to upload the file for them. I am not comfortable doing that; I’m a little bit paranoid and overprotective of my IP, I guess. My hope is that I don’t need to send the document to anybody.

Is there a proper way to approach writing a large document? Should I be using specific settings? Splitting it is not an option: I need to be able to jump instantly from chapter to chapter and to replace nouns throughout the entire thing at the drop of a hat. I do enjoy my writing hobby and would be willing to buy dedicated writing software if there are any recommendations out there.

Pretty much at my wits’ end though. It’s like I’m constantly building a house of cards.
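
On the backup habit mentioned above: for what it's worth, here is a minimal Python sketch of a timestamped backup copy, so that an older good version is never overwritten by a newer corrupt one. The function name `backup_copy` and the file and folder names in the usage note are made up for illustration; this is one possible approach, not a feature of Open Office or Word.

```python
import shutil
import time
from pathlib import Path


def backup_copy(document, backup_dir):
    """Copy `document` into `backup_dir` under a timestamped name.

    Returns the path of the new copy. The original file is left untouched.
    """
    source = Path(document)
    target_dir = Path(backup_dir)
    # Create the backup folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # Example name: novel-20240131-142501.odt
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = target_dir / f"{source.stem}-{stamp}{source.suffix}"
    # copy2 copies the contents and also preserves the modification time.
    shutil.copy2(source, target)
    return target
```

Calling `backup_copy("novel.odt", "backups")` at the end of each writing session would leave a growing set of dated copies to fall back on. Two calls within the same second would produce the same name, so the second copy would replace the first.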

That’s not a large document or anything close to it these days unless you have an insane amount of embedded images or something screwy like that. Your system itself should easily be able to handle that load so I am not sure what the issue is exactly. If it is important enough to you, you can buy MS Word by itself if OpenOffice is choking on it. That should work and it would be supported.

You can get a 60 day trial for free starting now to see if that will work.

You might consider Vedit. They advertise that they can handle files up to 2 GB in size, and larger if the files are split. It is a full-fledged editor, so it would have a ton of capabilities that you probably wouldn’t need, but it can sure handle the big files.

I looked at the VEDIT page and it’s a text editor. If the OP needs features such as text styling, headers and footers, images in the text etc. then VEDIT probably won’t work for him.

I’m not familiar with Open Office, but in Word you might try the Master Document / Subdocument feature. Split your chapters into separate documents and use the Master Document to combine them back into the book. That way search should work on the whole, and obviate your problem with splitting.

Yeah, I took a look at it and it did not seem to have the functionality I require.

I shall have a look and see if there’s a master document feature. It sounds like the best way if it’s possible with Open Office and I imagine it would theoretically limit file corruptions to a single split rather than the whole thing. Why it’s happening so consistently I do not know though. Could a virus have affected the file itself?

If this keeps happening it looks like I might have to shell out for Word 07. Just seems a bit expensive for what it offers me, since I will probably have no need for any of the advanced features.

Microsoft Works is cheaper and includes a spreadsheet program and rudimentary database, and probably some other stuff now. I used to use it all the time but haven’t kept up with the last few versions. The word processor and spreadsheet are not as full-featured as Word and Excel but have 80% of what you actually use for a fraction of the cost.

However, there are options other than Microsoft.

In truth neither Open Office nor Word is fit for purpose when writing a book. They are intended for smaller tasks. They are not really accepted as a professional format for publication anyway. They might be called desktop publication tools, but they are amateur tools, or now, mostly tools for business presentations. If you send a publisher a Word file they will dump the raw text out, remove all formatting, and have a professional reformat it with a proper tool. Which won’t be Word.

For a novel you don’t need about 95% of the whizzy bits that these systems provide anyway. Spell checking maybe. Grammar checking - I would hope not. Either your grammar is awful and you shouldn’t be writing a novel, or you will have your own developed style, and the grammar checker will just get in your way.

If you were a geek, Emacs and AsciiDoc would be a very useful pair. AsciiDoc can output DocBook, which is a format that is acceptable as input for professional publication.

A novel doesn’t typically have a bibliography, so you don’t even need a tool to manage that.

I know that on the Macintosh, Scrivener and Nisus Writer are recommended for long documents like novels. A Google search showed me that the author of the Mac OS X program Scrivener has a page listing software appropriate for novel writers. He lists some Windows programs for novel writers. The first one on his list, PageFour, costs $35.00.

This, or simply save each chapter as a separate file. My husband uses Scrivener for his novel writing. I think it might be a Mac-only program, but a search for that might bring up some PC options for you.

Yeah, I figured this was probably the case. I would never send what I’ve got right now to a publisher anyway, the formatting is completely wrong, it isn’t double spaced etc, it’s just easier for me to read and edit right now how it is.
I generally turn the grammar checking function off, mostly because it doesn’t understand that people don’t speak like robots. For checking small grammatical errors I prefer to have a proof reader, but then I usually prefer to have their input on style rather than the technicalities. No software can tell you if a sentence reads nicely.

As I said before, saving each chapter in a separate file would not be useful, though I might just do that anyway to chapters I’m happy with.

Thanks for the software suggestion, Arnold Winkelried, it looks as if something like that will probably suit my needs. The ability to scan your work for overused phrases sounds quite handy, but then I don’t want to rely too much on such features. Though on the flip side I can’t imagine going back and writing on a typewriter. Guess some advancements are for the better!
Two more things that I may as well use this thread for:

Firstly, thinking back I remember that I did actually start this document in MS Word 2007, on a completely different PC. I got hold of my own desktop PC after that and continued work on the document in Word 2000. Now I’m on a laptop, since I’ve travelled a bit, and it had a Word 2007 trial on it. I used that up, but couldn’t justify purchasing the software when I was aware of Open Office and knew others had said it was pretty good.
Now, some may consider this a stupid question, but could this be the source of the corruption? I’m willing to bet it is. If so, could starting a new Open Office document, then copying and pasting the text solve the problem? Or would that just transfer the problem over to the new document?

Secondly, I’m getting ticked off with another little quirk of the software. If I click on a part of the document, then scroll down and let go it will automatically snap back to where the cursor is. Is there any way to stop this? Seems like an utterly useless feature to me.

I routinely use Word to work with two to four hundred page documents, often with a graphic (table or chart) every six to seven pages. This was with Office 2K until I foolishly “upgraded” to Office 2007. Never really had problems, now I get crashes from time to time (though I’m slightly hesitant to completely blame it on the upgrade).

Anyone know if Word 2007 got the Master Document system bug free?

For styling, Word is actually pretty powerful in its limited extent. When I dump to Quark or InDesign all of the general styles come through cutting a lot of time off the design end (Mrs.Dvl does the design side of our business, so I’m very familiar with the process and time savings).

Clearly Word won’t kern for shit or come close to replacing layout software, but can play nice with others if you give it some candy.

Just a datapoint.
(ETA: worth mentioning anytime Word formatting / relation to design comes up: get thee to and check out their tools. Fantastic stuff.)

I’m curious, where have you heard them called that?

I have never heard anyone call Microsoft Word a desktop publication tool. It’s a word processor. Microsoft does make desktop publishing software; it’s called Publisher. OpenOffice.org’s word processor is called Writer, and I’ve never heard anyone call it a desktop publication tool, either.

I’m not sure what you’re saying here. I’m a professional writer. In my experience virtually every magazine or publisher wants files sent to them in Word. It is the universal standard. Some prefer rtf and nowadays you can find people who will take short pieces sent in the body of an email because they know they’ll have to reformat it in the end anyway. But that’s no different from the old days in which publishers had editors mark up manuscript pages.

If you mean that you can’t publish directly off a Word file, that’s probably true. Note that Fake Tales of San Francisco isn’t doing any such thing, however.

If you want to publish off Word you can use a free pdf converter and move your formatted pages and embedded fonts and images over directly to a final publishing file. The result is as good as your skills can make it. I’ve self-published books that way and they look much better than most PoD books and many books that have come out of New York publishers. Word is not the best tool for projects this complicated and I plan to teach myself LyX but the results are fine and there is no new learning curve.

But again, as a writer you should use Word or create files that are compatible with Word for your standard manuscript. It is omnipresent.

Don’t do a simple cut and paste because that normally preserves formatting. Word has a Paste Special option on the Edit menu in 2003. You can choose unformatted text from the menu. I assume that Open Office has an equivalent function. That would be your best bet and will probably clear up your problems.

This is absolute nonsense.

Standard manuscript format is this: 12-point Courier, single-sided, double-spaced. Use an underline to indicate italics and centered # to indicate a line break.

Word is perfect for this. And since commercial publishers never ask for an electronic manuscript until after they buy your book, there’s no reason to use anything else. Even then, using Word is good since that’s what they’re used to using.

All the pro authors I know use word processors like Word, OpenOffice, or various equivalents. No author in his right mind would use Emacs or anything like that.

Sure there are features you won’t use, but that’s true of any program. But it’s ridiculous to assert that Word is not accepted when 100% of all commercial publishers accept it.

As for the formatting, that isn’t the author’s job. The publisher chooses the format, and won’t use what you use in any case.

My last book was just over 500 pages, and I spent more than a decade writing/editing reference books that topped 1000 pages. Always broke the work up into separate files, usually by chapter. It’s cumbersome to scroll through hundreds of pages, and I can’t imagine how anyone could work that way (or why they’d want to).

As Exapno says, publishers will want word docs. But since they’re going to have to transfer everything into a page layout program, don’t get too hung up on formatting. Margins, fonts, tables – all of that stuff will get stripped out and redone.
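
For anyone who does split by chapter but still wants to rename a character or place everywhere in one go, a rough Python sketch follows. It assumes the chapters are saved as plain-text `.txt` files in one folder, which is an assumption for illustration only; it will not work on `.doc` or `.odt` files. The function name `replace_in_chapters` is made up.

```python
import re
from pathlib import Path


def replace_in_chapters(folder, old, new, pattern="*.txt"):
    """Replace whole-word `old` with `new` in every matching file.

    Returns a dict mapping each changed file name to its replacement count.
    """
    # \b keeps a short name from matching inside a longer one.
    word = re.compile(r"\b" + re.escape(old) + r"\b")
    counts = {}
    for chapter in sorted(Path(folder).glob(pattern)):
        text = chapter.read_text(encoding="utf-8")
        # A function as the replacement avoids backslash surprises in `new`.
        updated, hits = word.subn(lambda _match: new, text)
        if hits:
            chapter.write_text(updated, encoding="utf-8")
            counts[chapter.name] = hits
    return counts
```

It rewrites the files in place, so it is best run on a copy of the chapter folder first.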

Yes, there is a special paste option. I’ll try that and run with it for a while, hopefully that will mean I can continue without having to buy anything dedicated to writing. Thanks for the suggestion.

I’m using an ODT file now, which is Open Office’s standard file type for Writer documents, but I was using .doc before that, since it was compatible with more programs. The ODT file with the exact same information in it is approximately 280 KB, whilst the Word 97-2003 type documents are 1.6 MB. Not huge numbers, but it makes the .doc file nearly six times the size of the ODT file.
Anybody know why that is? Something to do with it being compatible with many different programs? Or does this indicate I had a lot of hidden information in the previous files that could possibly be the source of the files going bad?

If it helps, it would normally only affect the same chapters each time, and when I managed to open them in Word 2000, the said chapters would turn into numbers, 1’s, 2’s, 3’s, 4’s, repeated constantly after one another on a grey background, ascending all the way into the teens.
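
On the size question: an .odt file is a ZIP archive of compressed XML, which accounts for a good part of the difference from the binary .doc format. That also means a copy can be checked for damage without opening it in Writer. Below is a small Python sketch using the standard `zipfile` module; the function name `inspect_odt` is made up for illustration. A clean result only means the archive itself is intact, not that Writer will be happy with the contents.

```python
import zipfile


def inspect_odt(path):
    """Check an .odt file's ZIP container and report member sizes.

    Returns (first_bad_member, sizes). first_bad_member is None when every
    member passes its CRC check. sizes maps each member name to a tuple of
    (compressed_bytes, uncompressed_bytes).
    Raises zipfile.BadZipFile if the file is not a readable ZIP archive.
    """
    with zipfile.ZipFile(path) as archive:
        # testzip() reads every member and returns the first corrupt name.
        first_bad = archive.testzip()
        sizes = {
            info.filename: (info.compress_size, info.file_size)
            for info in archive.infolist()
        }
    return first_bad, sizes
```

In an .odt the document text lives in a member called `content.xml`, so comparing its compressed and uncompressed sizes gives a feel for how much of the saving is down to compression.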

It helps me keep track of it all and using the ctrl + f function it’s actually rather quick to find what I want. Perhaps I’m just odd, I simply like to have the bigger picture in front of me at all times. It also helps when I come up with a better name for a place or character or something like that, I can just replace all in one go.

For every possible aspect of writing there are successful writers who are on opposite sides of how it should be done.

My last book was 400 pages. I can’t imagine how cumbersome that would have been to break into chapters and have to search among chapters for items that applied to many of them. A book is a unit whole. I want it all available to me at any moment. How could anybody think differently? :)