perl programming question - search&replace

I am an inexperienced Perl programmer, and I have looked on the various Perl websites, but haven’t found anything addressing my problem.

I have a string that looks something like

{\em

and I want to replace it with the string

<em>

I have tried the following code:
$find = "\\em";
$replace = "<em>";
$infile =~ s/$find/$replace/g;

BUT, it doesn’t seem to want to find the string “{\em”. I can’t find anything referencing a search and replace with a backspace in it.

Can anyone help me? Thanks in advance!

If you are enclosing {\em in double quotes, you need to escape the escape character. IOW, you need to use "{\\em".

Zev Steinhardt

Oops - typo in the OP! What I meant (of course) is I can’t find reference to a search and replace with a backSLASH character in it! Thanks again!

Zev, that’s exactly what I’m doing. I have a double backslash in there (which is what I’ve found in the Perl references I’ve looked in). If I do

print "$find";

the string looks exactly right. It just isn’t matching in the search and replace.

The opening curly bracket “{” is itself a special character in Perl. If you’re matching it in your regexp, you’ll also need to escape it with a backslash:

s/\{\\em/<em>/g

ref: Programming Perl, 3rd ed., p. 158

Because the backslash must be escaped twice - once in the assignment to $find, and once in the s/// operator - your code requires that you encode the backslash to be matched as four backslashes.

The following works:



    my $find = "\{\\\\em";
    my $replace = "<em>";
    $infile =~ s/$find/$replace/g;


If you don’t need variable interpolation it can be written as



    $infile =~ s/\{\\em/<em>/g;
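
For anyone who wants to watch the two layers of escaping happen, here is a minimal sketch. The sample text in `$infile` is made up for illustration, and unlike the snippet above I also double the backslash in front of the brace, so that the brace reaches the regex engine already escaped.

```perl
use strict;
use warnings;

# Sample input (an assumption); single quotes keep the backslash as typed.
my $infile = 'Some {\em emphasised} text and {\em more}.';

# Layer 1: the double-quoted string turns each \\ into one backslash.
my $find    = "\\{\\\\em";
my $replace = "<em>";

# What the regex engine will actually receive:
print "pattern: $find\n";    # pattern: \{\\em

# Layer 2: the regex engine reads \{ as a literal brace
# and \\ as one literal backslash.
$infile =~ s/$find/$replace/g;

print "$infile\n";           # Some <em> emphasised} text and <em> more}.
```

Printing the variable before using it is the quickest way to check what the regex engine is really being handed.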


tschild , that worked perfectly. It also makes sense now that you would need to escape it twice (though I never would’ve thought of that). Thank you so much!

Terminus , just FYI, it appears to work for me BOTH ways (if I escape the “{”, but also if I don’t).

Why would you need four backslashes if there’s only one backslash in the original string? Wouldn’t that match two backslashes in the original string?

If it worked I’m guessing the OP had another typo and the original string was “{\\em”

-k

Ah the mysteries of Perl variable interpolation. When you assign a string to a scalar using double quotes:



$find = "{\\em"


Perl interprets the double backslash as a single backslash before it reaches the regex parser. So the regex parser only sees a single backslash, then interprets the “\e” as an ASCII ESC character. To get around this, you need to backslash-escape both backslashes, or you can use single quotes, or use the search string in s/// directly. The latter is a good idea in any case, because a pattern containing a variable has to be compiled at run time rather than once when the script is compiled.
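
A small sketch of the ESC behaviour described above, in case it helps. The variable names and sample strings are made up for illustration, and the curly brace is left out to isolate the backslash issue.

```perl
use strict;
use warnings;

my $two  = "\\em";      # string holds: \em   (3 characters)
my $four = "\\\\em";    # string holds: \\em  (4 characters)

my $tex_text = '\em';   # backslash, e, m (single quotes leave \e alone)
my $esc_text = "\em";   # ESC character followed by m (2 characters)

for my $pattern ($two, $four) {
    my $tex = $tex_text =~ /$pattern/ ? 'yes' : 'no';
    my $esc = $esc_text =~ /$pattern/ ? 'yes' : 'no';
    print "pattern $pattern  TeX text: $tex  ESC text: $esc\n";
}
# pattern \em  TeX text: no  ESC text: yes
# pattern \\em  TeX text: yes  ESC text: no
```

So the two-backslash version is a perfectly valid pattern; it just matches an escape character followed by "m" rather than the text the OP is after.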

Ref: Programming Perl, 3rd ed. pp. 191-193

crozell: You’ve solved your problem, but I’m curious (and too lazy to do it myself). What happens if you say:



$find = '{\\em';    # Single quotes!
$replace = '<em>';
$infile =~ s/$find/$replace/g;


It’s because the OP was putting the text to be replaced in a variable. He wanted to replace {\em. When that’s placed in the replace command, the backslash needed to be escaped, so {\\em. Now, if you use a variable to send that information, the variable needs to contain {\\em. But to put that into the variable, the backslashes each need to be escaped again, hence {\\\\em.

(On preview: yeah, what Terminus said.)

Terminus , I tried it with the single quotes and it didn’t seem to work. Sorry!

It may seem (from the OP) that I am doing extra work by putting the search strings in a variable, but I have several strings I’m trying to replace in the same file. There may be better ways to do this, but I’m a novice at Perl and don’t use it often enough to put a lot of work learning it completely. If there is obviously a better way to do this and I’m just missing it, please let me know!

Thanks again for the help everyone!

If you want to use single quotes, this works:

$find = '\{\\\\em';

That initial { still has to be escaped. At least this worked for me, and I assume that’s why.

Probably the easiest way to handle this kind of thing in general is to use the quotemeta function, as follows:

$find = quotemeta('{\em');

quotemeta will figure out all the escape characters for you.
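
Here is a minimal sketch of the quotemeta approach, alongside the equivalent `\Q...\E` form inside the pattern. The sample text and the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $literal = '{\em';               # the text exactly as it appears in the file
my $find    = quotemeta($literal);  # backslash-escapes every non-word character

print "$find\n";                    # \{\\em

my $text = 'A {\em test}';

# Variant 1: use the pre-escaped pattern string.
(my $out1 = $text) =~ s/$find/<em>/g;

# Variant 2: let \Q...\E apply quotemeta inside the pattern itself.
(my $out2 = $text) =~ s/\Q$literal\E/<em>/g;

print "$out1\n";                    # A <em> test}
print "$out2\n";                    # A <em> test}
```

Either way you type the search text once, in single quotes, exactly as it looks in the file, and never have to count backslashes.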

Ach, you probably need to backslash the curly brace if you use single quotes.

crozell: Looks like you’ve got a legitimate use for variable interpolation in s///. Your script sounds like a one-off deal (or is rarely run). So if it works for you, don’t sweat it!
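
Since you asked whether there is a better way to handle several strings: one possible approach is to keep the pairs in a hash and build a single pattern from the keys. This is only a sketch. The `{\bf` and `{\it` entries and the sample text are made up for illustration; only `{\em` comes from this thread.

```perl
use strict;
use warnings;

# Search text => replacement. Single quotes keep the backslashes as typed.
my %replace = (
    '{\em' => '<em>',
    '{\bf' => '<b>',    # hypothetical extra entry
    '{\it' => '<i>',    # hypothetical extra entry
);

# Escape each key with quotemeta and join them into one alternation.
# Longest keys go first so a short key cannot shadow a longer one.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) }
    keys %replace;
my $regex = qr/($alternation)/;

my $text = '{\em one} and {\bf two}';

# $1 holds whichever key matched; look up its replacement in the hash.
$text =~ s/$regex/$replace{$1}/g;

print "$text\n";    # <em> one} and <b> two}
```

Adding another string to replace then means adding one line to the hash, with no extra escaping to work out.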