Need help forming a regexp ("not" operator?)

I, unfortunately, never really learned regexps, and now I’m sort of in dire need of them. I could do this without them, but it would be a general PITA. I basically need to make a delimiter that matches any character NOT in a given set. \W gets close, but the things I don’t want to delimit on include a few special characters like apostrophes and hyphens, so I want something like: all the characters NOT in “\w’-” (that’s not all the characters necessary, but it should give you an idea). You’d think it’d be easy to Google this, but it’s not. Maybe that’s because I don’t know how to use regexps…

[^abcd] means “match any character except a, b, c or d”.

So something like [^\w’-] should do what you want. Keep the hyphen last inside the brackets so it is read as a literal hyphen rather than as a range.
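For what it's worth, here is a minimal sketch of how a negated class like that could be used as a delimiter. It assumes Python's re module and a plain ASCII apostrophe, neither of which comes from the original question, and split_words is just a hypothetical helper name.

```python
import re

# Delimiter: one or more characters that are NOT word characters,
# apostrophes, or hyphens. The hyphen comes last so it is literal.
# (Assumption: a straight ASCII apostrophe, not a curly one.)
DELIMITER = re.compile(r"[^\w'-]+")

def split_words(text):
    # Hypothetical helper: split on the delimiter and drop the empty
    # strings that leading or trailing delimiters leave behind.
    return [token for token in DELIMITER.split(text) if token]

print(split_words("Don't panic, it's a well-known trick!"))
# → ["Don't", 'panic', "it's", 'a', 'well-known', 'trick']
```

The trailing + makes a run of delimiter characters (like ", ") count as one separator instead of producing empty tokens in between.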

If a character class delimited with square brackets begins with a caret, it means match something not in the class:

[^,;:]

Matches a single character which is not a comma, semicolon, or colon, for instance. A hyphen (-) may also be used to indicate ranges:

[^A-G1-5]

Matches a single character that is not an uppercase A through G or a digit 1 through 5.
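A quick way to check the range example is to ask which characters of a test string the class accepts. This is only a sketch, assuming Python's re module; rejected_by_ranges is a made-up helper name.

```python
import re

# Negated class with two ranges: anything except A-G and 1-5.
NOT_A_TO_G_OR_1_TO_5 = re.compile(r"[^A-G1-5]")

def rejected_by_ranges(text):
    # Hypothetical helper: list every character in text that falls
    # OUTSIDE the ranges A-G and 1-5 (i.e. that the class matches).
    return NOT_A_TO_G_OR_1_TO_5.findall(text)

print(rejected_by_ranges("A1H6g"))
# → ['H', '6', 'g']
```

Note that lowercase g matches, because the range A-G only covers the uppercase letters.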

Ah, I got confused because “^” seems to also be an anchor meaning “begins with.” What’s the difference between these usages? I assume it’s something like

^[A-G] means “starts with A-G”

while

[^A-G] means “not A-G”

Is that right?

Yep.
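To make the two meanings of the caret concrete, here is a small sketch (again assuming Python's re module, which the thread itself doesn't specify):

```python
import re

# Caret OUTSIDE the brackets: anchor, "the string starts with A-G".
starts_with_a_to_g = re.compile(r"^[A-G]")

# Caret as the FIRST character INSIDE the brackets: negation,
# "any single character that is not A-G".
not_a_to_g = re.compile(r"[^A-G]")

print(bool(starts_with_a_to_g.search("Apple")))  # starts with 'A'
print(bool(starts_with_a_to_g.search("apple")))  # lowercase 'a' is not in A-G
print(not_a_to_g.search("Apple").group())        # first char outside A-G
print(not_a_to_g.search("ABC"))                  # no such character
```

The four lines print True, False, p, and None, in that order.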

So, I’m trying to make a delimiter that will separate paragraphs based on whether they have one or more blank lines separating them (in other words, 2 or more newline characters). I figured:

\n{2,}

would work. When that failed, I tried

[\n]{2,}

Which also didn’t work. Am I missing something? Is the scanner just being buggy?

Never mind, forgot Windows uses carriage returns. Stripping the file of carriage returns with a simple program (I only have to handle Unix-formatted files) worked perfectly.

Are you sure the input has just newlines (\n) and not CRLFs (\r\n)?

ETA: never mind then.
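For anyone who can't strip the carriage returns first, one possible alternative is to let the delimiter accept an optional \r before each \n. The following is a sketch under that assumption, using Python's re module rather than whatever scanner the original poster had; split_paragraphs is a hypothetical name.

```python
import re

# Two or more line endings in a row, where each line ending is
# either "\n" (Unix) or "\r\n" (Windows).
PARAGRAPH_BREAK = re.compile(r"(?:\r?\n){2,}")

def split_paragraphs(text):
    # Hypothetical helper: trim outer whitespace, split on blank
    # lines, and drop any empty pieces.
    return [p for p in PARAGRAPH_BREAK.split(text.strip()) if p]

windows_text = "first\r\n\r\nsecond\r\n\r\n\r\nthird\r\n"

# The plain \n{2,} pattern finds no break here, because every \n
# is followed by \r rather than by another \n.
print(re.split(r"\n{2,}", windows_text.strip()))
# → ['first\r\n\r\nsecond\r\n\r\n\r\nthird']

print(split_paragraphs(windows_text))
# → ['first', 'second', 'third']
```

Single line breaks inside a paragraph are left alone, since the pattern needs at least two line endings in a row.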