Re: [Question] How the sed deal with the '\0' embedded in string?

From: Assaf Gordon
Subject: Re: [Question] How the sed deal with the '\0' embedded in string?
Date: Tue, 13 Sep 2016 10:49:25 -0400


On 09/13/2016 12:11 AM, Du Dengke wrote:
> I am learning the sed source code, I want to know how the sed deal with the
> '\0' embedded in string.
> I can't find the *.c deal with the '\0', could anybody tell me where?

I assume you're asking about NUL characters in the input file, and not in the 
sed program.

The flow is:
    (using function-pointer input->read_fn)

getdelim(3) is a standard POSIX function that reads from a stream (e.g. stdin)
until the specified delimiter is encountered, or EOF.
If the delimiter is '\n' (the default in sed),
then NULs in the input are read as-is without special treatment.

All these functions use explicit length (number of bytes read),
and not 'strlen' - so embedded NULs are not an issue.

The following demonstrates getdelim(3) (error checking omitted for brevity):

    $ cat getdelimtest.c
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <ctype.h>

    int main(void)
    {
      char *buf = NULL;
      size_t l = 0;
      /* getdelim returns the number of bytes read, embedded NULs included */
      size_t n = getdelim(&buf, &l, '\n', stdin);
      printf("getdelim read %zu bytes\n", n);
      for (size_t i=0;i<n;i++) {
        unsigned char c = buf[i];   /* avoid negative values in isprint/%x */
        printf("[%zu] = ",i);
        if (isprint(c))
          printf("'%c'\n", c);
        else
          printf("0x%02x\n", c);
      }
      free(buf);
      return 0;
    }

    $ gcc -g -Wall -o getdelimtest getdelimtest.c
    $ printf 'a\0b\n' | ./getdelimtest
    getdelim read 4 bytes
    [0] = 'a'
    [1] = 0x00
    [2] = 'b'
    [3] = 0x0a

 - assaf