bug-gnu-utils

Re: sed problem with ^ and \


From: Bob Proulx
Subject: Re: sed problem with ^ and \
Date: Tue, 15 Mar 2005 09:13:14 -0700
User-agent: Mutt/1.5.6+20040907i

address@hidden wrote:
> My sed filter is supposed to delete everything in the file that is not 
> between <SQL> and </SQL>
> 
>       /^\<SQL>/,/\<\/SQL\>/!d

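I think you meant '/^\<SQL\>/,/\<\/SQL\>/!d', right?  (You did not
backslash the first > in the line.)  I assume you were quoting the
string in some other way, because a plain '>' on the command line would
have been interpreted by the shell as a redirection.  Either way, \<
and \> are undefined in the regular expression syntax that sed uses,
and GNU sed treats them as word boundary anchors rather than as literal
angle brackets.  That is your problem.  The sed manual says: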

  Regex syntax clashes (problems with backslashes)
     `sed' uses the POSIX basic regular expression syntax.  According to
     the standard, the meaning of some escape sequences is undefined in
     this syntax;  notable in the case of `sed' are `\|', `\+', `\?',
     `\`', `\'', `\<', `\>', `\b', `\B', `\w', and `\W'.
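
To see the effect directly, here is a small sketch.  It assumes GNU sed,
where \< and \> match at the start and end of a word; other sed
implementations may behave differently, which is the whole point.  The
variable names are only for illustration.

```shell
# Assumes GNU sed: \< and \> are word-boundary anchors there.
# A line starting with a literal '<' cannot match '^\<SQL', because the
# word "SQL" does not begin at the start of the line.
escaped=$(printf '<SQL>\n' | sed -n '/^\<SQL/p')

# Without the backslash, '<' is an ordinary character and the line matches.
plain=$(printf '<SQL>\n' | sed -n '/^<SQL>/p')

printf 'escaped: [%s]\n' "$escaped"
printf 'plain:   [%s]\n' "$plain"
```

With GNU sed the first result should be empty and the second should be
the input line.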

> This worked fine on HP-UX but on Linux the ^ doesn't seem to be 
> recognised to mean "start of line" and it also doesn't seem to like the 
> \ escape before the < or >.

On HP-UX the libc RE engine treats an undefined escape sequence as the
character itself.  Of course, use of undefined sequences is not
portable.  The collision on syntax is especially likely with a sed that
provides 'egrep' style expressions in addition to the older style
regular expressions.

Try this syntax instead.

  /^<SQL>/,/<\/SQL>/!d

Here is a test case.

  cat >/tmp/testcase <<EOF
  one
  <SQL>
  two
  </SQL>
  three
  <SQL>
  four
  </SQL>
  five
  EOF

  sed '/^<SQL>/,/<\/SQL>/!d' /tmp/testcase
  <SQL>
  two
  </SQL>
  <SQL>
  four
  </SQL>
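
An equivalent way to write the same filter, in case it reads more
naturally, is to turn off the default output with -n and print only the
range.  This sketch feeds the same sample data through a shell function
instead of /tmp/testcase; the function name 'testdata' is made up for
the example.

```shell
# Hypothetical helper that emits the same sample data as the test case.
testdata() {
  printf '%s\n' one '<SQL>' two '</SQL>' three '<SQL>' four '</SQL>' five
}

# Delete everything outside the range (the form used above).
by_delete=$(testdata | sed '/^<SQL>/,/<\/SQL>/!d')

# Print only the range; -n turns off the default output.
by_print=$(testdata | sed -n '/^<SQL>/,/<\/SQL>/p')

# Both forms should select the same lines.
if [ "$by_delete" = "$by_print" ]; then
  echo "same output"
else
  echo "outputs differ"
fi
```

Which form to use is a matter of taste.  Neither one relies on an
undefined escape sequence.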

Does that help?

> We just moved from HP-UX 11 to Linux and I have a behaviour difference 
> with a regular expression pattern with sed.
[...]
> This worked fine on HP-UX but on Linux the ^ doesn't seem to be
> recognised to mean "start of line" and it also doesn't seem to like the
> \ escape before the < or >.

Okay, now it is time for the griping to begin!  You are asking for
help about a GNU Project program on a GNU mailing list.  So why are
you talking about the Linux kernel here?  Your question has absolutely
nothing at all to do with Linux.

  http://www.gnu.org/gnu/linux-and-gnu.html

In this case it would have been better to show the version of the sed
program.  The first line of 'sed --version' would be most appropriate.

  sed --version
  GNU sed version 4.1.4

> Confidentiality Note: This message is intended only for the named 
> recipient and may contain confidential, proprietary or legally 
> privileged information. Unauthorized individuals or entities are not 
> permitted access to this information. Any dissemination, distribution, 
> or copying of this information is strictly prohibited. If you have 
> received this message in error, please advise the sender by reply 
> e-mail, and delete this message and any attachments. Thank you.

And then you have committed a second breach of etiquette.  You have
included an email disclaimer in your message and posted it to a public
mailing list.  Many people on the Internet will refuse to even
acknowledge your messages if they include such notices since basically
you have told them by the notice that they can't.  In any case, they
are annoying.  Don't do it!

  http://www.goldmark.org/jeff/stupid-disclaimers/

If nothing else please post your message from a different account that
does not include such annoying things.

So as not to be completely negative: your choice of subject for this
message was quite good and descriptive.  I give you full marks for it.
Good job there.

Bob

Email Disclaimer: You are not allowed to read this message.



