Although I’m rather new to the Java regex scene, I think I have a pretty good grasp of the concepts. That being said, I have a problem. What I’m trying to do is search through an HTML file and find a string that looks like either of the following:
VALUE="this is the string i want"
NAME="or this is what i want"
Basically the goal is to extract the string within the quotes of the VALUE attribute if one exists, OR of the NAME attribute if VALUE does not exist. So I can easily do the following:
Check to see if I found a match, and if not recompile with the NAME attribute instead and process all over again. I’m relatively sure I can combine these into something that looks like this:
This way I can check either OR and group the string of whichever one it finds. The problem is it doesn’t seem to work and won’t find a match.
My question is, is this correct syntax or am I just not able to match and group something like this? If you need any other information just let me know. Thanks in advance!
That’s not exactly right. The " character has no special meaning to the regex engine, but it does end a Java string literal, so every " inside the pattern has to be escaped with a backslash in the source code (Java regular expressions are quite annoying for things like this). You want:
String regexp = "(VALUE|NAME)=\"([^\"]*)\"";
Pattern pattern = Pattern.compile(regexp);
// and so on
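For illustration, here is a minimal sketch of how the combined pattern above might be used, assuming the input is already available as a String. The class name AttributeMatchDemo and the helper firstAttribute are hypothetical names invented for this example; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original thread.
class AttributeMatchDemo {
    // Group 1 captures the attribute name, group 2 the text between the quotes.
    static final Pattern ATTRIBUTE = Pattern.compile("(VALUE|NAME)=\"([^\"]*)\"");

    // Returns {attributeName, quotedText} for the first match, or null if nothing matches.
    static String[] firstAttribute(String input) {
        Matcher m = ATTRIBUTE.matcher(input);
        if (m.find()) {
            return new String[] { m.group(1), m.group(2) };
        }
        return null;
    }

    public static void main(String[] args) {
        String[] hit = firstAttribute("VALUE=\"this is the string i want\"");
        // prints: VALUE -> this is the string i want
        System.out.println(hit[0] + " -> " + hit[1]);
    }
}
```

Note that find() returns the leftmost match in the input, so which attribute is reported first depends on the order of the text, not on the order of the alternatives inside the parentheses.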
I was looking at that earlier and figured I was trying to OR too much. This ended up working for me:
Pattern.compile("VALUE|NAME=\"([^\"]*)\"");
Kind of…
This is what I’m looking at in my html file:
<PARAM NAME="string-name" VALUE="string-value">
If a VALUE exists I’d prefer that over the NAME, but apparently the regex doesn’t check the expressions in order. Is there another way for me to do this, or am I stuck compiling and checking for each one separately in order to check in the right order?
You want to use the repeating search capability of Matcher.find().
Use the pattern (VALUE|NAME)="([^"]*)" and call Matcher.find(). If the search succeeds and Matcher.group(1) is VALUE, then you’re done: just get Matcher.group(2). If Matcher.group(1) is NAME, save Matcher.group(2) in a variable and call Matcher.find() again. If you get another match, check whether Matcher.group(1) is VALUE, and if it is, use Matcher.group(2) instead of the variable you saved.
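The procedure above could be sketched roughly as follows. This is only one possible reading of it: the class name PreferValueDemo and the method preferValue are hypothetical, and the sketch loops over every match instead of calling find() exactly twice, which is a small generalization so that a VALUE appearing after several NAME attributes is still found. It also assumes the input holds a single tag; for a whole file you would probably apply it to each PARAM element separately.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original thread.
class PreferValueDemo {
    // Group 1 captures the attribute name, group 2 the text between the quotes.
    static final Pattern ATTRIBUTE = Pattern.compile("(VALUE|NAME)=\"([^\"]*)\"");

    // Returns the text of the first VALUE attribute if there is one,
    // otherwise the text of the first NAME attribute, otherwise null.
    static String preferValue(String input) {
        Matcher m = ATTRIBUTE.matcher(input);
        String savedName = null;
        while (m.find()) {
            if (m.group(1).equals("VALUE")) {
                return m.group(2);        // VALUE wins as soon as it is seen
            }
            if (savedName == null) {
                savedName = m.group(2);   // remember the first NAME as a fallback
            }
        }
        return savedName;
    }

    public static void main(String[] args) {
        // prints: string-value
        System.out.println(preferValue("<PARAM NAME=\"string-name\" VALUE=\"string-value\">"));
        // prints: only-name
        System.out.println(preferValue("<PARAM NAME=\"only-name\">"));
    }
}
```

The pattern is compiled once and reused, so there is no need to recompile with a different attribute name when the first search comes back with NAME.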