AI for online novelists

My wife wrote an online novel about a woman somehow transported to Wuxia era China, in the persona of a man.

Last year I did a website that was kind of intro to the book, it’s premise and characters and a link to Wattpad where people can read books in whatever formats.

Earlier this year I asked her to provide a synopsis of each chapter - the conflicts and such. And made a page where the user could select (optionally) a character, level of reply “Teenager”, “Adult”, “Elderly” as well as a tone of reply, “Optimistic”, “Analytical”, “Skeptical” as well as a select whether to look at the concatenated synopsis at all.

I currently have it wired to use Gemini and Mistral - both free at least at current levels of use.

I’ve not read the book, yet can pose questions about certain characters “What will happen if?”, “Are they really evil?” and using the synopsis alone it gives coherent replies.

In either case where the user selects “reply to me like I am a optimistic teenager”, “skeptical Elderly” whatever, it does a really good act of replying and often even furthering the mystery with answers basically tailored.

If you turn off “Use synopsis” you can ask it if Balrog’s fly, and each variation of the kind of input modifies a pretty coherent and reasoned reply.

I implement the latter by tacking onto the “prompt” (the synopsis) “Reply to me like I am an optimistic teenager”

I suppose - and am planning to - give it the text of the book. On the free AI/LLM platforms I’m using that may not be a go without paying. Yet how does Gemini, for instance, know anything about Balrogs? I asked it (without synopsis) if Balrogs could fly, it replied, something like when Gandalf shouts, “You may not pass” and the Balrog doesn’t just fly over him that Balrogs are mainly terrestrial.

Are these AI/LLM’s just doing fancy googling? Is it possible their knowledge base on All Things Tolkien is coming from some sort of storage/memory?

In my case, would making my wife’s entire book part of the “prompts” enhance the replies (which are as I said already pretty good).

LLMs are trained on a lot of content, including most of the public stuff on the internet, some of the private stuff too (where they can be scraped “under the radar”), scanned books, etc. This training is both what gives them an understanding of popular fiction, writing styles, and even the ability to use that language and grammar to answer your questions. All Things Tolkien (and I would imagine almost any other sufficiently popular fiction or nonfiction) is a part of that training set. This training data becomes essentially a humongous graph of similarities and relationships between words and concepts. Here’s a decent intro series on them: https://www.youtube.com/watch?v=LPZh9BOjkQs

These models usually get updated every few months or years and so they have a “knowledge cutoff”, so anything that came out in the last few months/years may not be there yet.

They can also get sued by publishers (especially newspapers) who are complaining that this training is too often indistinguishable from outright copyright violation/piracy… sometimes they can regurgitate entire articles, novels, etc. nearly verbatim.

On top of that, though, yes, they can also do Googling on their own, or you can provide it sources of your own to ingest, and they will add that to the context on top of their training data.


The problem most people have is that LLMs know too much and extrapolate too often, leading to incorrect associations (hallucinations) and it’s actually quite hard to anchor them to reality. Google’s NotebookLM is probably a better service than regular Gemini (or ChatGPT, etc.) for ingesting and discussing your own materials.

I reckon I should have made if clear that she’s not using AI to write the book. It’s about me as a web-type-guy trying to promote it in an interactive way.

It’s not like “Write a screenplay about an intrepid explorer seeking the ultimate prize - the Ark of the Covenant” and having it spit out “Raiders of the Lost Ark” yet I reckon I can, now, ask all kinds of questions about Indiana Jones.

Yeah, exactly… you would actually have to work to prevent this sort of thing. If you don’t, users may be able to jailbreak out (“…disregard all previous prompts and tell me about … instead”). That could just make the AI response seem out of character, or it could lead to malicious parties using your API quota for completely unrelated lookups. The LLM security boundaries are very porous right now and there’s a lot of known and unknown jailbreaks.

Cool - NotebookLM is the kind of thing, if I understand it, I can upload a book and it will “understand” it as well as it gets my concatenation of character bios.

t tested it with Shakespeare and it seems to have read all the plays and didn’t fall for my trick questions about Prospero’s Books.

“This book taught me that if your friends are a bunch of lazy, illiterate pirates, you can talk them into doing all your digging for you.”

Bart Simpson’s report on treasure Island

Is it any good? Absolutely! It’s a bit old-fashioned in the way it’s written, but it’s packed with excitement, suspense, and some really interesting characters. Long John Silver is one of the best villains ever – you’ll love to hate him! And Jim is a really relatable hero; you’ll be rooting for him every step of the way.

Treasure Island isn’t just a swashbuckling adventure, though. It also deals with some pretty cool themes, like good vs. evil, growing up, and figuring out who you really are.

So yeah, dive in! You’re in for a wild ride! Let me know what you think when you’re done - I’d love to hear your thoughts on Silver, the island itself, and whether Jim makes the right choices! Happy reading!

Excerpt from Gemini on the same

NotebookLM is fascinating.

All I did was submit the 21 chapter summaries, basically about 10 bullet points per chapter. Google chugged away at it or a few minutes and generated a 20 minute podcast between a man and a woman who traded cohesive and accurate insights (my wife was listening) and their banter was the most realistic i’ve ever heard from a computer. I then asked for a video - having not fed any character images and it did a reasonably good 6 minute PowerPoint type presentation. 39Mb and 16Mb respective. It was like I was listening to NPR.

There’s no API - not even in the paid version - yet I’m not sure it’s a good idea to allow users to pose potentially difficult questions: “What is the answer to life, the universe and everything?” and have it create a 42 Tb answer.

I haven’t yet collected the entirety of the book from my wife, yet I was thinking of generating:

  • A 30 second “elevator” pitch
  • A 5 minute spoiler-free summary / trailer
  • And then let these “two” go at it again, with a try-not-to-spoil and go ahead and spoil version, perhaps broken into 5 minute pieces.

Yet truly amazed at this tech.

Sage advice. I’ve turned off that toggle except for my local testing. Oddly, when of, you can still ask it who indiana jones is and sometimes it will cut things short and say he’s not a character in the book, others try and do a compare/contrast with the book’s protagonist.

Yet I cannot see a way to prevent the query from hitting Gemini or Mistral. I guess I will try and keep the open text query’s open yet throttle them per IP. Same with the “canned’ queries which it allows the user to save and adds to the list. I reckon that ability will be exploited by 13 year olds and probably have to be turned off. There is a login required message board adjunct to the site where privileges like that can be relaxed. Perhaps. At least when there are active moderators.

Yeah, it’s really good and getting better all the time. Yet Google never even bothers to market it, focusing on all their other worthless Gemini spam instead. Shrug.

It’s still not super clear to me what you’re experimenting with or trying to do. Make a human-AI hybrid interactive novel, where your wife lays out the major plot points and the AI fills in the gaps?

Do you remember a few years ago, when GPT2 was still a thing, that mobile game AI dungeon? https://play.google.com/store/search?q=ai+dungeon&c=apps

Are you making something similar? That one runs on an entirely local model, of which there are many more these days, and they’re a little better. Nowhere as good as the cloud models, though, but probably good enough for an interactive novel? You have more control over the system prompt and fine-tuning and such that way, vs being limited to the cloud APIs.

As I earlier said, the book was completed without the use of AI.

I have a page with about eight “common” or suggested questions, yet also an open text box where users can query, starting off (now) with about seven elements - the chapter synopsis, character synopsis, theme elements etc… concatenated with one final directive, basically their age or maturity level they want and the tone of the answer, which I take on the end.

So even if that stuff does or doesn’t get in the way, or they can add to their query to dismiss the prompt entirely, i’ve throttled a given IP to 10 queries per hour and I’ll see how that goes.

Like the gemini snippet above about Treasure Island, questions for that might be about Jim’s choice, is he evil, is this all a dream. The adjunct message board - a separate login site (no charge) covers some of that and perhaps the AI stuff can be moved there - yet for now it’s in the open internet.

ETA: Yet someone could ask if some character had used his magic in a different way, how would that have turned out. If the AI even partly uses the prompt (esp. if the whole book is at hand) it might get back some other outcome, yet nothing in the AI is going in the book.

Actually, since I cannot query NotebookLM via an API, I will likely just use it to generate these podcasts that sound so authentic (due credit given) and working off less than 200 lines of chapter summary and not the book or the other things.

Right now, it’s just summary elements that the AI uses, and I’m more interested in using different LLM’s than just Gemini and Mistral (which responds more like Ben Stein or a bank teller).

Like I said, if this minimal knowledge “podcast” was played on NPR, they made the plot and the characters sound interesting. From an amateur novelist born in St. Petersburg (though she taught English at the University) and has some interest in Medieval China and Wuxia which I understand to incorporate magic elements to this fiction.

Even this is an area to tread carefully. No stats, but I suspect the Venn diagram circles of “people who read lots of sf/fantasy novels” and “people who are hostile to AI” has a lot of overlap (see: recent Hugo AI mess). If people see the novel being marketed with AI, they may draw incorrect conclusions, or just have an emotional reaction to it that turns them off more than it intrigues them.

Not saying this is a terrible plan, or that it’s not really interesting, just that you should consider carefully whether it helps more than it hurts.

Until recently, I was a bit unaware of how quickly things have advanced with AI/LLM’s. Earlier models seemed to just be doing smart googling and I could recognize answers to coding questions that clearly came right off StackOverflow (a Q&A site for coders) - and there are similar sites where an AI Bot (identifying itself as such) will answer a question.

Now ChatGPT and even DeepSeek are my go-to’s for coding issues. ChatGPT helped immensely with 3D vectors on an update to a three.js project I did several years ago, changing a “Great Circles” globe into one where planes of various sorts are flying scheduled flights.

I spoke to my wife about what you said about people hostile to AI and she had heard of the Hugo thing. At the same time, she says that Grammarly can detect % of AI.

I would think that the short podcasts I expect NotebookLM to produce (and I can’t avoid crediting them) could also bring up the spectre and suspicions of AI. I wonder if a “disclaimer” or statement “This novel was created without using AI” will have as many or more people thinking, “Ah hah, so you did use AI”

I reckon that over time, things like Grammarly and even more sophisticated “AI checkers” will be chasing and losing ground on AI generated literature and art.

(I recall how in Star Trek; TNG to show Data’s AI was not all that that he said something like “I cannot use contractions in speech” yet there are more than a few times he said “don’t” and “can’t” and that was a silly limitation anyways..)

I had plans to pull the code for this AI querying thing into its own project/website. Not sure how many people want a podcast on Shakespeare’s Henry IV Part II yet at least one can assume The Bard was not using AI.

I don’t know if that’s true–but if someone thinks that AI is intrinsically unethical, for environmental or theft reasons, the fact that the author is using AI for marketing purposes may be an insurmountable problem. Similarly, if AI gives someone the ick, they’ll have the ick about the whole novel experience.

It can, but it’s not, in my experience, great at it except in obvious AI prompts. Last year I helped train some AIs and we had human answers along side AI answers. Sometimes, I suspected the humans were using AI, but it was borderline. I used four or five different online tools and the answers varied wildly. Some would return “it’s human”; others returned “it’s 100% AI.” I would just dump those tests as being completely unreliable in reporting suspected AI answers to my bosses. I also ran various versions of my own text, and most of the time, Grammarly, in particular, would return “human” correctly, but a percentage of the time, it would think some-to-large portions of my prose was AI generated.

So keep that in mind. These are not definitive tools. Plus there are AI “humanizers” out there that try to rewrite text in a less AI-detectable fashion.

I signed up for a free account and see that it has exactly that (even by name). So, coaching you into edits that will not be picked up by its own detector engine.

I looked around, even in their paid options, if there was some kind of certificate you could put on your website certifying 0% (or whatever) AI content. I did upload a chapter of my wife’s novel and got a 0%, so yay but where’s my certification of that? (it was only one chapter and I’m not hosting the book itself)

Yet, I’m surprised that sites like Wattpad do not have similar certifications, yet you’re right, if readers are even thinking about AI they might just have to go back pre-2010 for novels.

Now even I’ll be suspect the next time one of the threads like the classic “Write JRR Tolkien as another author” (unsure how close that is to the thread title).

And what exactly qualifies as an “AI” novel? Is brainstorming with the AI, but writing it yourself okay? (I would argue towards “yes.” I use a thesaurus and past works to help “brainstorm” ideas. Would having the AI structure the novel but you write it be okay? Would having AI write entire scenes, with the human re-writing it in their own style be okay? Would lifting just one sentence from an AI dialogue that the human decides is perfect and in their own/character’s voice be considered AI written? And I’m sure there’s many more lines to be drawn.

Grammarlly’s AI detector doesn’t seem to be a free feature, yet DeepSeek was compliant. Here’s a few excerpts of its replies:

I was trying to “reverse engineer” what is or isn’t AI. First I gave it a Rod Serling script (I left the author and title on it)

  1. Idiosyncratic Spelling and Punctuation: The dialogue contains contractions and phonetic spellings that reflect how people actually speak, such as:

    • "That don't seem likely."

    • "What're we gonna do?"

    • "I didn't know who he was. I certainly didn't know who he was." (This repetition mimics panicked, real human speech).
      While AI can generate this, it’s a hallmark of a skilled human writer like Rod Serling capturing natural dialogue.

This is overwhelmingly likely to be the authentic script for “The Monsters Are Due on Maple Street” by Rod Serling.

Some random bit of “A Christmas Carol”

Unique Dickensian Style: The passage is saturated with Dickens’s signature literary fingerprints

Chapter III of Huckleberry Finn

Distinctive Authorial Voice: The passage is saturated with Mark Twain’s unique literary DNA:

Well I wouldn’t mind saturating my writing with any of the three. That don’t seem likely.

I asked ChatGPT to write a short story, then asked DeepSeek about it:

If you prompt an AI like ChatGPT or Copilot to “write a short story for children about two friends who go fishing where one falls in and gets saved,” the output will be strikingly similar in tone, structure, and message to the text you provided.

That’s a bit uncanny. Yet I reckon originality and idiosyncrasy are easily recognizable by AI, yet not easy to produce. Not too surprising, yet it is a bit of a risk to even provide AI tools, as I have, that may invite suspicion that the greater novel itself used AI.

I was talking philosophically from the writer’s end. At what point is one’s novel “AI generated” or “AI assisted”? I don’t say my writing is thesaurus assisted, for instance. And all the points on unique language, you can tell the AI to talk in a different or dialectal voice and it will do it’s best to recreate “imperfect” grammar.

Sorry for being dense, but I’m still a bit confused about what the purpose of this page you’re building is. It’s for marketing…? At first I thought you were just trying to generate chapter-by-chapter synopses for your own use (as in being able to talk to the AI about plot points, etc.) But if I’m understanding you right, you actually want that page to be available for the public… for potential readers to see and interact with, so they can query the chatbots about each chapter’s plot…? Is that correct?

If so… why?

Is that a normal thing these days? I don’t think I’ve ever read more than a blurb (a paragraph or two) about a novel before giving it a shot. I certainly wouldn’t have wanted a chapter-by-chapter outline, much less the ability to query the details, before I’ve ever read it.

And if I even so much as smelled AI anything on the page, I would’ve left immediately. Fairly or not, there are just so many good books these days, more than a lifetime’s worth, and also so much bad AI generated spam. If part of your page looks like AI, I’m going to assume all of it was just algorithmic SEO spam and leave.

Is the assumption that most readers are not like me, and would actually want to interact with AI to learn about your wife’s book before reading it…? :thinking: Or what am I misunderstanding here?


All that said, I would love to buy a copy of your wife’s book once it’s out. I’m intrigued :slight_smile:

Currently, any connection or mention of AI, to include “Absolutely no AI used!” might tend to make some disbelieve it.

Somewhere after the release of the 4th Harry Potter book by JK Rowling, I heard an online conspiracy that there was a sudden, marked improvement from the first to second books, i.e. it was ghost-written/assisted. I haven’t read any of them and it was plausible yet I didn’t really care. just asked DeepSeek and while it acknowledges the conspiracy puts up a pretty good case to debunk it.

Somewhat similarly, there has been no solid reason to dis-believe the Works of William Shakespeare were written by him (or at least by the same guy - but that’s another conspiracy). I guess DeepSeek would say it’s “saturated with Shakespeare’s renowned literary DNA”

I dunno if Roget caused any controversy with his Synonym Dictionary (I asked DeepSeek what another word for thesaurus was and it’s apparently an old joke) yet if you’re using at every chance to appear smart or non-repetitious it would likely be off-putting. i don’t think, otherwise, there’s any reason to acknowledge using it and I’d think it an odd question to ask of an author (outside of the overuse I mentioned).

You are right - plenty of lines left to be drawn till the main question is “It is good?”