Could sed or awk use NUL character as record separator?

Question

I have a NUL delimited output coming from the following command :

some commands | grep -i -c -w -Z 'some regex'

The output consists of records of the format :

[file name]\0[pattern count]\0

I want to use text manipulation tools, such as sed/awk, to change the records to the following format :

[file name]:[pattern count]\0

But it seems that sed/awk usually handle only records delimited by the "newline" character. I would like to know how sed/awk could be used to achieve this or, if they cannot handle such a case, which other Linux tool I should use.

Thanks for any suggestion.

Lawrence

so how do you look at this file? with a hex editor? How does it know where to 'break' the lines? Why not just convert the '\0' to '\n' and have a nice easy to read file that can be processed using the standard unix paradigm? Otherwise at every step, you'll be fighting the basic law of unix, "each record on its own line" ! ;-) Life is too short, There are much more interesting problems to do battle with. Can you get the original source of output to use '\n' or ... shudder, '\r\n' ? Good luck. — shellter, Feb 7 '12 at 3:17
The output is not to be displayed, it is piped into another command. I use NUL as separator as Linux file names could have "newline" character in it. I agree that life is only too short for us to figure out all the solutions for our questions. — user1129812, Feb 7 '12 at 3:50
but a filename is a different piece of 'data' than the data included in a pipe. the 2 only meet as and when data is written into a file with a name that may have a '\n' in it. Good luck. — shellter, Feb 7 '12 at 4:07
I finally figured out that grep -c -Z only places a NUL character after [file name] but still places a "newline" character after [pattern count]. I have now chosen not to use the grep -Z option, but TejasP's answer is still helpful to me for parsing NUL delimited files with awk in the future. Thanks all. — user1129812, Feb 7 '12 at 6:03
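
The behaviour described in the last comment can be checked with a small sketch. It assumes GNU grep; the scratch files and the pattern `foo` are made up for illustration, and `tr` turns each NUL into `|` only so that the separator is visible.

```sh
# Scratch directory with two hypothetical sample files.
tmp=$(mktemp -d)
printf 'foo bar\nFOO\nbaz\n' > "$tmp/a.txt"
printf 'nothing here\n' > "$tmp/b.txt"

# With -Z, grep writes a NUL after the file name; the count still ends in '\n'.
# tr maps NUL to '|' for display only.
shown=$(cd "$tmp" && grep -i -c -w -Z 'foo' a.txt b.txt | tr '\0' '|')
printf '%s\n' "$shown"

rm -rf "$tmp"
```

Each record shows the name, the marker standing in for NUL, then the count on the same line, so only the separator after the file name changes.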

Tejas Patil · Accepted Answer · 2012-02-07 02:23:14Z

By default, the record separator is the newline character, defining a record to be a single line of text. You can use a different character by changing the built-in variable RS. The value of RS is a string that says how to separate records; the default value is "\n", the string containing just a newline character.

 awk 'BEGIN { RS = "/" } ; { print $0 }' BBS-list

I have tested that the command awk 'BEGIN { RS = "\0" } ; { print $0 }' does delimit records at the NUL character, but The GNU Awk User's Guide says that RS = "\0" is not portable. Still, I can start from this command and try to change the NUL character before the [pattern count] to the ":" character in my case. – user1129812 Feb 7 '12 at 3:21

Graeme · Answer 2 · 2014-03-22 11:55:05Z

Since version 4.2.2, GNU sed has had the -z (--null-data) option to do exactly this. For example:

sed -z 's/old/new/' null_separated_infile
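
For the record format in the question, one possible sketch with GNU sed 4.2.2 or later is to pull the next NUL-terminated record into the pattern space with `N` and then replace the NUL that joins the two. The sample data is invented, and the final `tr` is only for display.

```sh
# Input: name NUL count NUL name NUL count NUL
# -z    : records end in NUL instead of newline
# N     : append the next record, joined by a NUL
# s///  : turn that joining NUL into ':'
joined=$(printf '%s\0' 'a.txt' 3 'b c.txt' 12 |
  sed -z 'N; s/\x00/:/' |
  tr '\0' '\n')              # display only: NUL -> newline
printf '%s\n' "$joined"
```

Without the `tr`, the output stays NUL-terminated, which is the `[file name]:[pattern count]\0` layout the question asks for.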

jaypal singh · Answer 3 · 2012-02-07 02:50:56Z

Using `sed` to replace the NUL characters with spaces:

sed 's/\x0/ /g' infile > outfile

Or edit the file in place (this keeps a backup of the original as infile.bak and overwrites infile with the substitutions):

sed -i.bak 's/\x0/ /g' infile

Using `tr`:

tr -d "\000" < infile > outfile

or tr "\000" "\n" < infile > output :-?) – shellter Feb 7 '12 at 3:23

@shellter You are right. I was not sure if OP wanted to substitute them with newlines or remove them … :) – jaypal singh Feb 7 '12 at 3:37

But my purpose is to only replace the NUL character before the [pattern count], not to replace all NUL characters. – user1129812 Feb 7 '12 at 3:43

@user1129812 In that case you can use the sed command and remove the g flag from it. The g flag makes the substitution global. When it is removed, sed only changes the first occurrence on each line. – jaypal singh Feb 7 '12 at 3:58
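
A caveat on that last comment, as a hedged sketch assuming GNU sed: without `-z`, sed still splits its input on newlines, so a stream that contains only NUL separators is a single line. Dropping `g` then changes just the first NUL of the whole stream, not one per record. The sample data is invented and `tr` is only for display.

```sh
# One "line" as far as plain sed is concerned: a.txt NUL 3 NUL b.txt NUL 12 NUL
first_only=$(printf '%s\0' 'a.txt' 3 'b.txt' 12 |
  sed 's/\x00/:/' |          # no g, no -z: only the very first NUL is replaced
  tr '\0' '\n')              # display only: remaining NULs -> newlines
printf '%s\n' "$first_only"
```

Only the first pair is joined; the second name and count remain separate records.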
