Although I’m rather new to the Java regex scene, I think I have a pretty good grasp of the concepts. That being said, I have a problem. What I’m trying to do is search through an HTML file and find a string that looks like either of the following:
VALUE="this is the string i want"
NAME="or this is what i want"
Basically the goal is to extract the string within the quotes if a VALUE attribute exists OR a NAME attribute if the VALUE does not exist. So I can easily do the following:
Pattern style = Pattern.compile("VALUE=\"([^\"]*)\"");
Then I check to see if I found a match and, if not, recompile with the NAME attribute instead and process all over again. I’m relatively sure I can combine these into something that looks like this:
Pattern style = Pattern.compile("VALUE=\"([^\"]*)\" | NAME=\"([^\"]*)\"");
This way I can check for either one and capture the string from whichever one it finds. The problem is it doesn’t seem to work and won’t find a match.
My question is, is this correct syntax or am I just not able to match and group something like this? If you need any other information just let me know. Thanks in advance!
That’s not exactly right, because the " character ends a Java string literal, so you need to escape it (Java regular expressions are quite annoying for things like this). You want:
String regexp = "(VALUE|NAME)=\\"([^"]*)\\"";
// and so on
Wait, I messed that up. It should be:
String regexp = "(VALUE|NAME)=\"([^\"]*)\"";
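For anyone following along, here is a minimal sketch of the corrected pattern in use. The class and method names (AttributeRegexDemo, firstAttribute) are made up for illustration and are not from the posts above. It also includes the spaced version from the original question for comparison: unless the COMMENTS flag is set, the spaces around | are matched literally, which is one likely reason that attempt never found anything.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AttributeRegexDemo {
    // The backslashes escape the quotes for the Java string literal only;
    // the regex engine itself just sees a plain " character.
    static final Pattern ATTR = Pattern.compile("(VALUE|NAME)=\"([^\"]*)\"");

    // Same idea as the original attempt, with literal spaces around the |.
    static final Pattern SPACED =
            Pattern.compile("VALUE=\"([^\"]*)\" | NAME=\"([^\"]*)\"");

    // Hypothetical helper: returns the quoted text of the first VALUE or NAME
    // attribute in the input, or null if there is none.
    static String firstAttribute(String input) {
        Matcher m = ATTR.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstAttribute("VALUE=\"this is the string i want\""));
        System.out.println(firstAttribute("NAME=\"or this is what i want\""));
        // The spaced pattern needs a space after VALUE="..." or before NAME="...".
        System.out.println(SPACED.matcher("VALUE=\"x\"").find()); // prints false
    }
}
```

Group 1 tells you which attribute matched and group 2 holds the text between the quotes.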
I was looking at that earlier and figured I was trying to OR too much. This ended up working for me:
This is what I’m looking at in my html file:
<PARAM NAME="string-name" VALUE="string-value">
If a VALUE exists, I’d prefer that over the NAME, but apparently the regex doesn’t check the expressions in order. Is there another way for me to do this, or am I stuck compiling and checking for each one separately to get the right order?
Now that I think about it that makes perfect sense. I have no idea why I thought it wouldn’t check it like it is. :smack:
You want to use the repeating search capability of Matcher.find().
Use the pattern (VALUE|NAME)="([^"]*)". If the search succeeds and Matcher.group(1) is VALUE, then you’re done: just get Matcher.group(2). If Matcher.group(1) is NAME, save Matcher.group(2) in a variable and call Matcher.find() again. If you get another match, check that Matcher.group(1) is VALUE and, if it is, use Matcher.group(2) instead of the variable you saved.
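A rough sketch of that approach, assuming the input is the text of a single PARAM tag (on a whole file, the next find() could land in a different tag). The names PreferValueDemo and preferValue are hypothetical, and the loop keeps calling find() until it sees a VALUE rather than stopping after the second call, which is a small generalization of the steps described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PreferValueDemo {
    static final Pattern ATTR = Pattern.compile("(VALUE|NAME)=\"([^\"]*)\"");

    // Hypothetical helper: returns the first VALUE string if one is found,
    // otherwise the first NAME string, otherwise null.
    static String preferValue(String tag) {
        Matcher m = ATTR.matcher(tag);
        String fallbackName = null;
        while (m.find()) {
            if (m.group(1).equals("VALUE")) {
                return m.group(2);         // VALUE wins as soon as it is seen
            }
            if (fallbackName == null) {
                fallbackName = m.group(2); // remember the first NAME in case no VALUE follows
            }
        }
        return fallbackName;
    }

    public static void main(String[] args) {
        // prints string-value
        System.out.println(preferValue("<PARAM NAME=\"string-name\" VALUE=\"string-value\">"));
        // prints string-name
        System.out.println(preferValue("<PARAM NAME=\"string-name\">"));
    }
}
```

find() scans left to right, so NAME is reported first whenever it appears first in the tag. That is why the saved value is only a fallback.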