It seems that normal practice would put the setting of IFS outside the while loop in order to not repeat setting it for each iteration... Is this just a habitual "monkey see, monkey do" style, as it has been for this monkey until I read man read, or am I missing some subtle (or blatantly obvious) trap here?
4 Answers
The trap is that
IFS=; while read..
sets the IFS for the whole shell environment outside the loop, whereas
while IFS= read
redefines it only for the read invocation (except in the Bourne shell).
You can check this by running a loop like
while IFS= read xxx; ... done
After such a loop, echo "blabalbla $IFS ooooooo" prints
blabalbla
ooooooo
whereas after
IFS=; while read xxx; ... done
the IFS stays redefined: now echo "blabalbla $IFS ooooooo" prints
blabalbla ooooooo
So if you use the second form, you have to remember to reset it afterwards: IFS=$' \t\n'.
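The difference can also be checked without inspecting echo output by eye. The following is only a sketch, assuming a POSIX-style shell such as bash or dash (not the original Bourne shell); the names saved_ifs, after_prefix and after_separate are made up for the illustration.

```shell
# Remember the starting value so the demo can restore it at the end.
saved_ifs=$IFS

# Form 1: the assignment is a prefix of read, so it only applies to read.
while IFS= read -r line; do :; done <<'EOF'
some input
EOF
if [ "$IFS" = "$saved_ifs" ]; then after_prefix=unchanged; else after_prefix=changed; fi

# Form 2: the assignment is a separate command, so it persists in the shell.
IFS=; while read -r line; do :; done <<'EOF'
some input
EOF
if [ "$IFS" = "$saved_ifs" ]; then after_separate=unchanged; else after_separate=changed; fi

# Put the original value back before printing the result.
IFS=$saved_ifs
echo "while IFS= read ...  : IFS $after_prefix"
echo "IFS=; while read ... : IFS $after_separate"
```

The loops read from a here-document rather than a pipe so that they run in the current shell and any lasting change to IFS stays visible.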
The second part of this question has been merged here, so I've removed the related answer from here.
Let's look at an example, with some carefully-crafted input text:
text=' hello  world\
foo\bar'
That's two lines, the first beginning with a space and ending with a backslash. First, let's look at what happens without any precautions around read (but using printf '%s\n' "$text" to carefully print $text without any risk of expansion). (Below, $ is the shell prompt.)
$ printf '%s\n' "$text" |
while read line; do printf '%s\n' "[$line]"; done
[hello  worldfoobar]
read ate up the backslashes: backslash-newline causes the newline to be ignored, and backslash-anything ignores that first backslash. To avoid backslashes being treated specially, we use read -r.
$ printf '%s\n' "$text" |
while read -r line; do printf '%s\n' "[$line]"; done
[hello  world\]
[foo\bar]
That's better, we have two lines as expected. The two lines almost contain the desired content: the double space between hello and world has been retained, because it's within the line variable. On the other hand, the initial space was eaten up. That's because read reads as many words as you pass it variables, except that the last variable contains the rest of the line — but it still starts with the first word, i.e. the initial spaces are discarded.
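As a rough illustration of how read hands words to its variables, here is a small sketch that assumes IFS still has its default value (space, tab, newline). The input string and the variable names first, rest and whole are invented for the example.

```shell
# Hypothetical input: two leading spaces, and a double space before "gamma".
input='  alpha beta  gamma'

# Two variables: "first" gets the first word, "rest" gets the remainder of the line.
read -r first rest <<EOF
$input
EOF

# One variable: it still starts at the first word, so the leading spaces are dropped.
read -r whole <<EOF
$input
EOF

printf '[%s]\n' "$first" "$rest" "$whole"
```

The inner double space survives because it sits inside the text assigned to the last variable; only the whitespace at the edges is discarded.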
So, in order to read each line literally, we need to make sure that no word splitting is going on. We do this by setting the IFS variable to an empty value.
$ printf '%s\n' "$text" |
while IFS= read -r line; do printf '%s\n' "[$line]"; done
[ hello  world\]
[foo\bar]
Note how we set IFS specifically for the duration of the read built-in. The IFS= read -r line sets the environment variable IFS (to an empty value) specifically for the execution of read.
This is an instance of the general simple command syntax: a (possibly empty) sequence of variable assignments followed by a command name and its arguments (also, you can throw in redirections at any point). Since read is a built-in, the variable never actually ends up in an external process's environment; nonetheless the value of $IFS is what we're assigning there as long as read is executing¹. Note that read is not a special built-in, so the assignment does last only for its duration.
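The same syntax works with any variable and any command, not just IFS and read. A minimal sketch, using a made-up variable called greeting and an external sh as the command:

```shell
# Make sure the (hypothetical) variable does not exist beforehand.
unset greeting

# The assignment prefix is placed in the environment of this one command only.
seen_by_child=$(greeting=hello sh -c 'printf %s "$greeting"')

# Back in the calling shell the variable was never set.
seen_by_parent=${greeting-unset}

echo "child saw:   $seen_by_child"
echo "parent sees: $seen_by_parent"
```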
Thus we're taking care not to change the value of IFS for other instructions that may rely on it. This code will work no matter what the surrounding code has set IFS to initially, and it will not cause any trouble if the code inside the loop relies on IFS.
Contrast with this code snippet, which looks files up in a colon-separated path. The list of file names is read from a file, one file name per line.
IFS=":"; set -f
while IFS= read -r name; do
  for dir in $PATH; do
    ## At this point, "$IFS" is still ":"
    if [ -e "$dir/$name" ]; then echo "$dir/$name"; fi
  done
done <filenames.txt
If the loop was while IFS=; read -r name; do …, then for dir in $PATH would not split $PATH into colon-separated components. If the code was IFS=; while read …, it would be even more obvious that IFS is not set to : in the loop body.
Of course, it would be possible to restore the value of IFS after executing read. But that would require knowing the previous value, which is extra effort. IFS= read is the simple way (and, conveniently, also the shortest way).
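For comparison, this is roughly what the save-and-restore alternative could look like. It is only a sketch: old_ifs and ifs_was_set are hypothetical names, and it assumes you also want to tell an unset IFS apart from an empty one.

```shell
# Save: note whether IFS is set at all, and keep its value if it is.
if [ "${IFS+set}" = set ]; then
  old_ifs=$IFS
  ifs_was_set=yes
else
  ifs_was_set=no
fi

# Change IFS for the whole shell, then read one line literally.
IFS=
read -r line <<'EOF'
  indented line
EOF

# Restore: put the old value back, or unset IFS again if it was unset before.
if [ "$ifs_was_set" = yes ]; then
  IFS=$old_ifs
else
  unset IFS
fi

printf '[%s]\n' "$line"
```

All of that bookkeeping does the same job as the single command IFS= read -r line.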
¹ And, if read is interrupted by a trapped signal, possibly while the trap is executing — this is not specified by POSIX and depends on the shell in practice.
Thanks Gilles.. a very nice guided tour..(did you mean 'set -f'?).... Now, for the reader, to restate what has already been said, I'd like to emphasise the issue which had me looking at it the wrong way. First and foremost is the fact that the construct while IFS= read (without a semi-colon after =) is not a special form of while or of IFS or of read.. The construct is generic: ie. any_var=any_value any_command. The lack of ; after setting any_var makes the scope of any_var local to any_command.. The while--do/done loop is 100% unrelated to the local scope of any_var. – Peter.O Aug 18 '11 at 05:00
Apart from the (already clarified) IFS scoping differences between the while IFS='' read, IFS=''; while read and while IFS=''; read idioms (per-command vs script/shell-wide IFS variable scoping), the take-home lesson is that you lose the leading and trailing spaces of an input line if the IFS variable is set to (contain a) space.
This can have pretty serious consequences if file paths are being processed.
Therefore setting the IFS variable to the empty string is anything but a bad idea since it ensures that a line's leading and trailing whitespace does not get stripped.
See also: Bash, read line by line from file, with IFS
(
shopt -s nullglob
touch ' file with spaces '
IFS=$' \t\n' read -r file <<<"$(printf '%s' *file*with*spaces*)"
ls -l "$file"
IFS='' read -r file <<<"$(printf '%s' *file*with*spaces*)"
ls -l "$file"
)
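The same effect can be seen without creating any files. The sketch below uses only POSIX constructs; the string in name is a made-up file name with two spaces on each side, and trimmed and intact are illustrative variable names.

```shell
# Hypothetical file name with two leading and two trailing spaces.
name='  file with spaces  '

# IFS contains a space: read strips the leading and trailing spaces.
IFS=' ' read -r trimmed <<EOF
$name
EOF

# IFS is empty: the line is stored exactly as it was read.
IFS= read -r intact <<EOF
$name
EOF

printf '[%s]\n' "$trimmed" "$intact"
```

The brackets in the output make it easy to see which variable kept its padding.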
Inspired by Yuzem’s answer
If you want to set IFS to an actual character, this worked for me
iconv -f cp1252 zapni.tv.php | while IFS='#' read -d'#' line
do
echo "$line"
done
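When IFS is set to a character in order to split a line into fields, the splitting only becomes visible once read is given more than one variable. A small sketch under that assumption, with an invented colon-separated record and hypothetical field names:

```shell
# Hypothetical record with three colon-separated fields.
record='alice:1001:/home/alice'

# Remember IFS so we can confirm afterwards that it was not changed.
ifs_before=$IFS

# IFS=: applies only to this read, which splits the line at each colon.
IFS=: read -r user uid home_dir <<EOF
$record
EOF

echo "user=$user uid=$uid home_dir=$home_dir"
```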
while IFS=X read doesn't split at X, but while IFS=X; read does... – Peter.O Aug 17 '11 at 08:44
while does not make much sense - the condition for while ends at that semicolon, so there is no actual loop... read becomes just the first command inside the one-element loop... Or not? What about the do then..? – rozcietrzewiacz Aug 17 '11 at 08:49
while condition (before do). – rozcietrzewiacz Aug 17 '11 at 08:55
IFS= work, but IFS=X doesn't... (or maybe I've OD'd on this for a while.. coffee break needed :) – Peter.O Aug 17 '11 at 09:21
IFS= read ... sets the IFS variable only for the duration of the read command itself. For example foo=bar sh -c 'echo $foo' returns "bar", but then typing echo $foo returns an empty string. – glenn jackman Aug 17 '11 at 13:26
while IFS=':' read … is the IFS variable reset to the : string on each loop? Like, will the IFS value be restored to : if I alter it inside the loop as well? while IFS=':' read -r pth ; do ; IFS='/' list=($pth); echo test; done < file.txt ? – Stphane Dec 03 '15 at 22:19
while IFS=':' read … IFS variable is not reset, but overridden, and only for the read built-in. And if you set IFS inside the loop, you did set it also for outside. Check the Gilles answer, it explains this more precisely. – papo Jan 18 '22 at 19:34