Gawk and non-ASCII characters

From: Charles Kozierok
Subject: Gawk and non-ASCII characters
Date: Sat, 16 Oct 2010 08:22:56 -0400

Hi folks,

Having a heck of a time dealing with a specific issue and hoping
someone can help.

I am grabbing HTML from a site that has some non-ASCII bytes in it.
Specifically, the byte sequence is C2 A0 (hex). Viewed as ANSI, this
shows up as a capital "A" with a circumflex followed by a space. Viewed
as ASCII, it becomes a regular "A" followed by a space.

I need to be able to identify these sequences reliably so I can get rid
of them, but I can't figure out how to do it. They don't seem to match
any character code within gawk, and I can't find any command-line
option or setting that either filters them out or makes gawk handle
them properly.

It's probably me, but it may be a problem in gawk of some sort. Can
anyone point me in the right direction?
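
One possible approach, sketched under assumptions: the bytes C2 A0 are
the UTF-8 encoding of U+00A0 (no-break space), so if the page is UTF-8
the pair can be matched by its octal escapes, \302\240. The helper name
strip_nbsp is made up for illustration, and plain awk is used here;
gawk should accept the same program. Forcing the C locale is a choice
made so that matching is done byte by byte.

```shell
# Hypothetical helper: delete every C2 A0 byte pair (octal \302\240)
# from standard input and write the result to standard output.
# LC_ALL=C asks awk to treat the input as raw bytes, so the match
# does not depend on the locale's character encoding.
strip_nbsp() {
    LC_ALL=C awk '{ gsub(/\302\240/, ""); print }'
}

# Example: the two bytes between "foo" and "bar" are removed.
printf 'foo\302\240bar\n' | strip_nbsp    # prints "foobar"
```

If the no-break space should become an ordinary space rather than
disappear, the replacement string "" can be changed to " ".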

Best regards,

