Thank you for the recommendation.
I found this problem when I decided to rewrite my old script from Perl to AWK. I like AWK and use it a lot because I process text files. The refactored AWK script worked great for several test files, until it hit files containing extended-ASCII characters. It was difficult to find the source of the problem. Maybe the real source of the trouble is that the converter (used internally by substr(), length(), etc.) silently discards input bytes that cannot be mapped to UTF-8 code points. I would prefer to see a fatal error rather than have data silently ignored; that results in corrupted reports. I found this problem only because I was comparing the report generated by my old Perl script with the one from the new AWK script and noticed a difference (and I feel lucky that I happened to test on a computer with GAWK).
Another question on my mind: is split() working correctly in GAWK? I use it to compute the result in my blength() function, but maybe it works only because of a bug in GAWK, and once that is fixed (so that split() handles strings according to the locale settings), my function will stop working.
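For context, here is a minimal sketch of the locale dependence I mean (my actual blength() is not shown here; this only illustrates the underlying character-vs-byte behavior, and the gawk line assumes gawk is installed and running in a UTF-8 locale):

```shell
s='café'    # 4 characters, 5 bytes when encoded as UTF-8

# In a UTF-8 locale, gawk's length() counts characters, so this
# would print 4 (uncomment if gawk is available):
# printf '%s' "$s" | gawk '{ print length($0) }'

# Forcing the C locale makes awk treat the input as raw bytes,
# so length() counts bytes instead; this prints 5 and is one way
# to get a byte length without relying on split() details:
printf '%s' "$s" | LC_ALL=C awk '{ print length($0) }'
```

My worry is exactly this gap: any byte-counting trick that happens to work in a UTF-8 locale today may stop working once the locale handling is fixed.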