sed - stream editor
- Last Updated: Thursday, 02 May 2024 05:40
- Published: Monday, 27 September 2021 16:30
sed:
Stream editor for noninteractive use. sed is a Unix/Linux pgm that has long been used to manipulate text files; nowadays Perl and other scripting languages can do the same job. sed lets us do the same kind of editing we would do inside vi, but from the cmd line, by supplying an editing cmd and the name of the file. It's very easy to use, and should be your first choice whenever you want to search for and replace patterns in a file. Other scripting languages usually need quite a bit of code to make any change to a file (i.e. you have to open the file, copy lines one by one to another file while modifying the lines of interest, and then close all the files), and editing a file in place typically takes extra work, whereas sed does it with a single option. So, 10 lines of code in some other language may need just 1 line in sed. sed has a very simple syntax (you just need to know regular expressions), and very few cmds get 99% of the work done.
A very good reference is here: https://www.grymoire.com/Unix/Sed.html
You can also use sed cmds inside any other script, as though they were native unix cmds (sed is installed by default on practically every Linux distro). So, you can think of sed as being like ls, cp, etc., except that it does file editing for you.
NOTE: There are 2 common variants of sed: BSD sed and GNU sed. BSD sed (the default on macOS and the BSDs) has a lot of idiosyncrasies. Prefer GNU sed, which is what is installed on most Linux distros. Most of the cmds you find online assume GNU sed (it is often installed under the name gsed on systems where BSD sed is the default). Type "sed --version" on the terminal to check the sed version; BSD sed does not support --version and reports an error instead.
Terminal prompt$ sed --version
sed (GNU sed) 4.2.2
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
syntax:
sed <options> 'command' filename => command tells sed what to do with the lines in the file: change them, remove them, etc. The options are optional, and each one is preceded by -. The command itself may be put in single quotes or in double quotes (see under variable substitution for when double quotes should be used).
ex: sed -i '/_slash/d' ~/tmp.txt => Here -i is the option, '...' is the cmd, and we are using this cmd on file tmp.txt
options:
sed cmds are given on the terminal, and sed prints the modified o/p on screen; the input file itself is not changed. If we want the modified o/p saved in a file, we can redirect sed's o/p to another file (by using the shell's > operator). There are various options to change the default behaviour.
- -i => If we want to modify the original file, we have to use the -i (in-place) option. NOTE: -i is not for "case insensitive". sed has no cmd line option for case-insensitive matching (in BSD sed, -I is another in-place editing option, not a case-insensitivity switch). In GNU sed, case insensitivity is requested per cmd with the I flag, e.g. 's/abc/xyz/I' or '/abc/Id'.
- -r => sed by default uses BRE (Basic RE, see in Regular Expression section), but -r (also spelled -E) makes it use ERE (ERE is recommended).
- -e => Adds a sed cmd to run. Repeat -e to specify multiple sed cmds; they are applied in order to each line.
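The options above can be sketched as follows. This is only an illustration, assuming GNU sed; demo.txt and the other file names are made-up scratch files, and the I flag used in the last step is a GNU extension.

```shell
# Assumes GNU sed. All file names here are hypothetical scratch files.
printf 'alpha\nbeta\ngamma\n' > demo.txt

# Without -i the result goes to stdout; demo.txt itself is left untouched.
sed 's/beta/BETA/' demo.txt > demo_out.txt

# -e chains several cmds; they are applied in order to each line.
sed -e 's/alpha/A/' -e 's/gamma/G/' demo.txt > demo_multi.txt

# -r switches to ERE, so ( ) and | work without backslashes.
sed -r 's/(alpha|gamma)/[\1]/' demo.txt > demo_ere.txt

# Case-insensitive matching is requested with the I flag on the cmd (GNU extension).
sed 's/ALPHA/A/I' demo.txt > demo_ci.txt

# -i rewrites demo.txt itself, so nothing is printed.
sed -i 's/beta/BETA/' demo.txt
```

Keeping the -i step last means the earlier cmds all see the unmodified file.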
command:
command follows the usual syntax: i.e 'flag1/<original_pattern>/<modified_pattern>/flag2' => This is similar to pattern replacement syntax that you see in Perl, Python, etc. "flags" in beginning or end specify what we want to do with the pattern. d=delete (used in flag2), s=substitute (used in flag1), etc. More details later.
deleting lines: We use '/<pattern>/d' to delete every line that matches the pattern (the whole line is removed, not just the matched text)
sed -i -r '/_slash/d' ~/tmp.txt => deletes every line containing _slash from tmp.txt (-i means the same file tmp.txt is modified. -r means use extended regular expression (ERE)).
delete blank lines: sed '/^$/d' in.txt > out.txt => all lines in file in.txt that start and end with nothing in b/w are deleted, and o/p is passed on to new file out.txt
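A small sketch of the d cmd on a made-up input, assuming GNU sed (all file names are hypothetical). It also shows the difference between deleting a matching line and deleting only the matched text, which needs s rather than d.

```shell
# Assumes GNU sed. in.txt and the o/p files are scratch files for this sketch.
printf 'keep me\nfoo_slash bar\n\nalso keep\n' > in.txt

# /_slash/d removes the whole 2nd line, not just the text "_slash".
sed '/_slash/d' in.txt > no_slash.txt

# To remove only the matched text, substitute it with nothing instead.
sed 's/_slash//g' in.txt > text_only.txt

# /^$/d removes the empty 3rd line.
sed '/^$/d' in.txt > out.txt
```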
substituting pattern: We use 's/<orig_pattern>/<new_pattern>/' to substitute the matched pattern
sed -r "s/'/ /g" in.txt > out.txt => Here we are replacing ' with a space. We had to use double quotes around the cmd, since using single quotes gave an error => unmatched '. g says to do it globally, i.e. for every match on a line and not just the first one.
sed -e 's/.*/PRE: & SUF \\/g' in.txt > out.txt => Here every line is wrapped: "PRE: " is added at the start and " SUF \" at the end (& stands for the matched text, which here is the whole line). We have to use a double backslash, since a single backslash is itself the escape char. If we use \ instead of \\, the / following it is treated as a literal /, so sed finds no closing delimiter and we get an "unterminated cmd" error.
sed -r 's/\$\{.*slash\}//g' ~/tmp${num}.txt > out.txt => here we globally replace ${.....slash} with nothing, i.e remove that pattern. Here instead of doing inline replacement, we pass on the results to another file named out.txt
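The first two substitutions above can be tried on a made-up input like this. It is a sketch assuming GNU sed and a POSIX shell; the file names are hypothetical.

```shell
# Assumes GNU sed. in.txt and the o/p files are scratch files for this sketch.
printf "it's a dog's life\nplain line\n" > in.txt

# Double quotes around the sed cmd let it contain a single quote.
# g replaces every ' on a line, not just the first one.
sed "s/'/ /g" in.txt > no_quotes.txt

# & stands for the whole match; \\ in the replacement emits one backslash.
sed 's/.*/PRE: & SUF \\/' in.txt > wrapped.txt
```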
Remembering matching patterns: This is done by enclosing the pattern to be remembered in () and then recalling it using \1, \2 etc., where \1 is the 1st pattern within (), \2 is the second pattern and so on.
sed -r 's/(.*)\s+(.*)/\2 -SPACE \1/g' ~/orig.txt > ~/mod.txt => Here we are replacing all lines which have patterns of the form " aa/a?aa bbb/ccc" with "bbb/ccc -SPACE aa/a?aa". NOTE: we didn't use \(.*\) since we are using ERE (by using -r), where "(" and ")" are recognized as special chars. If we were using BRE, we would need to use \(.*\)
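A sketch of the same swap in both ERE and BRE form, assuming GNU sed (\s and \+ are GNU extensions; the file names are made up). Since the first (.*) is greedy, the split happens at the last run of whitespace on the line.

```shell
# Assumes GNU sed. orig.txt and the o/p files are scratch files for this sketch.
printf 'aa/a?aa bbb/ccc\n' > orig.txt

# ERE (-r): plain ( ) capture, \1 and \2 recall the captured text.
sed -r 's/(.*)\s+(.*)/\2 -SPACE \1/' orig.txt > mod.txt

# BRE (no -r): the same cmd needs \( \) and \+ instead.
sed 's/\(.*\)\s\+\(.*\)/\2 -SPACE \1/' orig.txt > mod_bre.txt
```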
Substituting with a variable: So far, we used single quotes in the cmd section. We said that single and double quotes don't matter to sed in the cmd section: we pick one over the other depending on whether the cmd itself contains single or double quotes, i.e. double quotes are used if you need a single quote in the cmd itself. The other place double quotes are needed is when you want the value of a var substituted into the sed cmd. It is the shell, not sed, that does this substitution, and it behaves as in other scripting languages, where one kind of quote allows "var substitution" while the other kind doesn't.
ex: set a = "my lord"; sed -i "s/abc.*$/ME $a/" z.log => Here everything from "abc" to the end of the line is substituted with the string "ME my lord". (set is csh syntax; in bash, write a="my lord".) If we used single quotes, the literal text "ME $a" would be substituted instead.
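The quoting difference can be sketched as follows, assuming a POSIX shell such as bash and GNU sed; z.log here is a made-up scratch file and the o/p goes to new files rather than being edited in place.

```shell
# Assumes bash (or another POSIX shell) and GNU sed. File contents are made up.
printf 'xx abc trailing text\nno match here\n' > z.log
a="my lord"

# Double quotes: the shell expands $a before sed ever sees the cmd.
sed "s/abc.*$/ME $a/" z.log > expanded.txt

# Single quotes: sed receives the literal text $a.
sed 's/abc.*$/ME $a/' z.log > literal.txt
```

One caveat with this approach: if the value of the var contains / or &, sed treats those chars as part of the cmd, so they must be escaped first or a different delimiter chosen.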
Print or delete between specific Markers: This is a very commonly used cmd when we have markers in a file and we want to cut out that portion of the file, and replace it with something else. This is mostly used when some automated section of a file is updated via scripts, and we want the rest of the file untouched.
ex: sed -e '/START_MARKER/,/END_MARKER/d' File1 => d says to delete the lines from START_MARKER to END_MARKER, marker lines included. To print those lines instead, use p together with the -n option (sed -n -e '/START_MARKER/,/END_MARKER/p' File1); without -n, sed prints every line anyway, so the lines in the range just show up twice. If END_MARKER is some pattern which is not there in the file, then everything from START_MARKER to the EOF is deleted. So, to delete from START_MARKER to the EOF, use an end marker such as ZZZZ which is not there in the file.
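A sketch of both the delete and the print form on a made-up File1, assuming GNU sed. The p cmd only gives the expected result when automatic printing is switched off with -n; otherwise every line is printed once by default and the lines in the range a second time.

```shell
# Assumes GNU sed. File1 and the o/p files are scratch files for this sketch.
printf 'top\nSTART_MARKER\ninside\nEND_MARKER\nbottom\n' > File1

# d: drop the two marker lines and everything b/w them.
sed -e '/START_MARKER/,/END_MARKER/d' File1 > deleted.txt

# p with -n: print only the lines in the range.
sed -n -e '/START_MARKER/,/END_MARKER/p' File1 > printed.txt

# An end marker that never matches extends the range to the EOF.
sed -e '/START_MARKER/,/ZZZZ/d' File1 > to_eof.txt
```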
Substitute text between specific Markers with contents from another file: This is not easy: I tried several 1 liners from the internet and nothing worked. The only code that worked for me is below, and it ONLY works in Bash shell. It errors out in csh, since the sed cmd has to span 2 lines (which apparently breaks in csh):
Here, the original file is edited in place. The contents b/w "START of Pattern" and "END of Pattern" are replaced by the contents of replacement_file.txt, while the two marker lines themselves are kept. NOTE the 2nd line of the sed cmd. The r cmd treats the rest of its line as the file name, so the closing } has to go on the 2nd line (i.e. after a newline), else it won't work !!
lead='START of Pattern'
tail='END of Pattern'
sed -i "/$lead/,/$tail/{ /$lead/{p; r replacement_file.txt
}; /$tail/p; d }" original_file.txt
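To try the cmd above end to end, here is a sketch with made-up file contents, assuming bash and GNU sed. Only the two printf lines are new; the rest is the same cmd as above.

```shell
# Assumes bash and GNU sed. The file contents are made up for this sketch.
printf 'header\nSTART of Pattern\nold 1\nold 2\nEND of Pattern\nfooter\n' > original_file.txt
printf 'new 1\nnew 2\n' > replacement_file.txt

lead='START of Pattern'
tail='END of Pattern'
# Inside the range: print the lead line and queue the replacement file after it,
# print the tail line, and delete everything else.
# The line break after the file name is required by the r cmd.
sed -i "/$lead/,/$tail/{ /$lead/{p; r replacement_file.txt
}; /$tail/p; d }" original_file.txt
```

Afterwards original_file.txt still has both marker lines, with "new 1" and "new 2" b/w them in place of the old lines.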