Q. How do I find and replace character codes (control codes or non-printable characters) such as Ctrl+A using the sed command under UNIX-like operating systems?
A. ASCII is the American Standard Code for Information Interchange. It is a 7-bit code. Many 8-bit codes (such as ISO 8859-1) contain ASCII as their lower half. The international counterpart of ASCII is known as ISO 646.
Character codes often contain code positions which are not assigned to any visible character but reserved for control purposes.
You can find and replace them with the help of shell command substitution, which places the actual control character into the sed expression:
sed -e 's/'$(echo "octal-value")'/replace-word/g'
OR
sed 's/'`echo "octal-value"`'/replace-word/g'
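To replace the ESC character (0x1B, octal 033) with ordinary text, enter the following. This relies on an echo that interprets backslash escapes, as many ksh and sh implementations do; bash's builtin echo needs the -e option by default: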
sed 's/'`echo "\033"`'/foo/g'
OR
sed -e 's/'$(echo "\033")'/ESC/g'
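Because echo treats backslash escapes differently from shell to shell, printf is usually a more predictable way to produce the control character. The following is a minimal sketch, assuming a POSIX shell; the variable names and the sample input are only illustrative:

```shell
# Build the ESC character (octal 033) with printf, which interprets
# octal escapes in its format string.
esc=$(printf '\033')

# Replace every ESC in the sample input with the text "ESC".
# Double quotes let the shell expand $esc inside the sed expression.
result=$(printf 'red%sgreen%sblue\n' "$esc" "$esc" | sed "s/$esc/ESC/g")
printf '%s\n' "$result"
```

The same pattern should work for other control characters by changing the octal value passed to printf.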
You can view the ASCII table by reading its man page (man ascii), or use the table below for reference:
The following table contains the 128 ASCII characters. C program '\X' escapes are noted.

Oct   Dec   Hex   Char                        Oct   Dec   Hex   Char
------------------------------------------------------------------------
000   0     00    NUL '\0'                    100   64    40    @
001   1     01    SOH (start of heading)      101   65    41    A
002   2     02    STX (start of text)         102   66    42    B
003   3     03    ETX (end of text)           103   67    43    C
004   4     04    EOT (end of transmission)   104   68    44    D
005   5     05    ENQ (enquiry)               105   69    45    E
006   6     06    ACK (acknowledge)           106   70    46    F
007   7     07    BEL '\a' (bell)             107   71    47    G
010   8     08    BS '\b' (backspace)         110   72    48    H
011   9     09    HT '\t' (horizontal tab)    111   73    49    I
012   10    0A    LF '\n' (new line)          112   74    4A    J
013   11    0B    VT '\v' (vertical tab)      113   75    4B    K
014   12    0C    FF '\f' (form feed)         114   76    4C    L
015   13    0D    CR '\r' (carriage ret)      115   77    4D    M
016   14    0E    SO (shift out)              116   78    4E    N
017   15    0F    SI (shift in)               117   79    4F    O
020   16    10    DLE (data link escape)      120   80    50    P
021   17    11    DC1 (device control 1)      121   81    51    Q
022   18    12    DC2 (device control 2)      122   82    52    R
023   19    13    DC3 (device control 3)      123   83    53    S
024   20    14    DC4 (device control 4)      124   84    54    T
025   21    15    NAK (negative ack.)         125   85    55    U
026   22    16    SYN (synchronous idle)      126   86    56    V
027   23    17    ETB (end of trans. blk)     127   87    57    W
030   24    18    CAN (cancel)                130   88    58    X
031   25    19    EM (end of medium)          131   89    59    Y
032   26    1A    SUB (substitute)            132   90    5A    Z
033   27    1B    ESC (escape)                133   91    5B    [
034   28    1C    FS (file separator)         134   92    5C    \ '\\'
035   29    1D    GS (group separator)        135   93    5D    ]
036   30    1E    RS (record separator)       136   94    5E    ^
037   31    1F    US (unit separator)         137   95    5F    _
040   32    20    SPACE                       140   96    60    `
041   33    21    !                           141   97    61    a
042   34    22    "                           142   98    62    b
043   35    23    #                           143   99    63    c
044   36    24    $                           144   100   64    d
045   37    25    %                           145   101   65    e
046   38    26    &                           146   102   66    f
047   39    27    '                           147   103   67    g
050   40    28    (                           150   104   68    h
051   41    29    )                           151   105   69    i
052   42    2A    *                           152   106   6A    j
053   43    2B    +                           153   107   6B    k
054   44    2C    ,                           154   108   6C    l
055   45    2D    -                           155   109   6D    m
056   46    2E    .                           156   110   6E    n
057   47    2F    /                           157   111   6F    o
060   48    30    0                           160   112   70    p
061   49    31    1                           161   113   71    q
062   50    32    2                           162   114   72    r
063   51    33    3                           163   115   73    s
064   52    34    4                           164   116   74    t
065   53    35    5                           165   117   75    u
066   54    36    6                           166   118   76    v
067   55    37    7                           167   119   77    w
070   56    38    8                           170   120   78    x
071   57    39    9                           171   121   79    y
072   58    3A    :                           172   122   7A    z
073   59    3B    ;                           173   123   7B    {
074   60    3C    <                           174   124   7C    |
075   61    3D    =                           175   125   7D    }
076   62    3E    >                           176   126   7E    ~
077   63    3F    ?                           177   127   7F    DEL
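If the table is not at hand, printf can do the same conversions. This is a small sketch, assuming a POSIX printf; the variable names are illustrative:

```shell
# Convert decimal 27 (ESC) to octal and hex, zero-padded like the table.
row=$(printf '%03o %d %02X' 27 27 27)

# A leading quote in the argument makes printf use the numeric code
# of the character that follows it.
code=$(printf '%d' "'A")

printf '%s\n' "$row" "$code"
```

The octal value in the first field is the one to use in the sed examples above.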
I tried to use your syntax but I couldn't make it work.
Finally, I ended up with this syntax:
sed -i "s/\oXX/,/g" file
where XX is the octal value of the character to replace.
Thanks for the command. It works for AIX’s sed, which does not support the \x and \o escape characters for non-printing ASCII characters.
Thanks Alexis, it works for me.
What helped me was seeing the non-printable characters in the file I wished to process, using this command:
sed -n 'l' inputfile.txt
You will visibly see what sed can process.
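As an illustration of what the l command shows, here is a short sketch with a made-up input line containing an ESC and a tab (the variable name is hypothetical):

```shell
# Feed sed a line containing ESC (octal 033) and a tab.
# The "l" command prints the pattern space unambiguously:
# non-printable bytes appear as backslash escapes and "$" marks end of line.
listing=$(printf 'a\033b\tc\n' | sed -n 'l')

# Use printf rather than echo so the backslashes are printed as-is.
printf '%s\n' "$listing"
```

The ESC byte is shown as \033 and the tab as \t, which gives you the octal value to search for.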
Thanks Buddy.
You just made my day.
Thanks to Alexis as well for the sed -i "s/\oXX/,/g" file tip (where XX is the octal value of the character to replace), which worked for me.
Using \x and the hex value, e.g. sed 's/\x0D/\n/', also works for GNU sed.
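The two GNU sed escapes mentioned in these comments can be tried side by side. This sketch assumes GNU sed (other sed implementations may not understand \x or \o), and the sample strings are invented:

```shell
# GNU sed only: \xHH (hex) and \oNNN (octal) name a byte directly,
# so no echo or printf substitution is needed.
hex_result=$(printf 'one\rtwo\n' | sed 's/\x0D/,/g')     # CR is hex 0D
oct_result=$(printf 'one\033two\n' | sed 's/\o033/,/g')  # ESC is octal 033

printf '%s\n' "$hex_result" "$oct_result"
```

Both commands replace the control character with a comma.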
Very important: when using the echo command as shown above, the leading 0 is always required, even when the value is three digits long. So for example:
$(echo "\0357\0277\0275") will work, but
$(echo "\357\277\275") does NOT work.
I learned this the hard way when I changed the 033 in the example to 357 and it broke.
Thanks …
Does not work for MS-Windows sed (using GNU SED 0.32 inside Windows). I can’t use the embedded echo, since MS-Windows does not allow it (well I could exec bash, but that’s another story)
For me, it did work provided that the '-e' option is applied to 'echo' instead of 'sed':
sed 's/'`echo -e "\033"`'/foo/g'
(Using sed 4.2.1 on Linux Mint 16). Hope this helps someone.
For Solaris, to replace an octal character (023):
sed 's/'`echo "\023"`'/ foo/g'
But this only works in ksh, not in bash.
Hello,
I have read your piece on replacing non-printable characters but I have not been able to do the job I want to do.
I have about 150 files which are the result of converting scanned JPEGs to text. I removed all the non-printable characters except ^M (Ctrl+M), which I want to replace with two linefeeds so the text will retain paragraphs in LaTeX.
Using sed commands analogous to your examples, I have failed completely. Using
1,$s/^M/\r\r/g
works on a sample file in vi (if I remember correctly!) but the corresponding sed command does not.
Can you help?
Dave
You could also do:
sed "s/[\015]/foo/g" input.txt
Brilliant — thank you!!
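For the carriage-return case raised above (replacing each ^M with two linefeeds), one possible sketch is shown here. It assumes GNU sed, which treats \n in the replacement as a newline; the variable names and sample text are hypothetical:

```shell
# Build the carriage return (octal 015, shown as ^M in vi) with printf.
cr=$(printf '\r')

# Replace each CR with two newlines so paragraphs are separated by a
# blank line. Inside double quotes the shell leaves \n untouched, and
# GNU sed then interprets it as a newline in the replacement.
paragraphs=$(printf 'first paragraph\rsecond paragraph\n' | sed "s/$cr/\n\n/g")

printf '%s\n' "$paragraphs"
```

On a real file, the same sed expression can be given the file name instead of reading from a pipe.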