Is it possible to grep inside a bunch of archives with regular expressions on both the names of the files they contain and the contents of those files? I would like to know which files in which archives match the pattern. I'm on OS X, if it matters.
1 Answer
If you need to use more expressive grep-style file patterns:
tar -OT <(tar -tf /path/to/file.tar | grep 'FILE_PATTERN') -xf /path/to/file.tar \
| grep 'CONTENT_PATTERN'
-O specifies the output to be stdout, and -T specifies a file containing names to extract, when used in conjunction with -x.
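A minimal sketch of the same two-pass idea on a throwaway archive, in case you want to try it first. It feeds the name list through a pipe with -T - (read names from stdin) instead of process substitution, assuming your tar accepts - for stdin there (GNU tar does). The archive, file names and patterns are made up for the demo.

```shell
# Build a throwaway archive (all names here are hypothetical demo data).
work=$(mktemp -d)
mkdir -p "$work/src/logs"
printf 'alpha\nneedle here\n' > "$work/src/logs/a.log"
printf 'needle in a text file\n' > "$work/src/b.txt"
tar -cf "$work/demo.tar" -C "$work/src" logs/a.log b.txt

# Pass 1 lists the members and filters their names with grep;
# pass 2 extracts only those members to stdout (-O), reading the
# names from stdin (-T -), and greps the extracted contents.
result=$(tar -tf "$work/demo.tar" | grep '\.log$' |
         tar -xOf "$work/demo.tar" -T - | grep 'needle')
printf '%s\n' "$result"

rm -rf "$work"
```

Only logs/a.log passes the name filter, so the match inside b.txt is never searched.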
If simpler pathname expansion (globbing) is good enough, you can replace the process substitution (<( ... )) with a simple echo. This avoids having to run tar on the file twice:
tar -OT <(echo 'FILE_PATTERN') -xf /path/to/file.tar \
| grep 'CONTENT_PATTERN'
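If you also want to see the filenames, add the -v flag (personally I would go for -xvf), but then you'll also need to modify CONTENT_PATTERN so that grep matches the filenames as well. Depending on your tar, the verbose listing may be written to stderr, in which case you also need 2>&1 before the pipe. I'll leave this as an exercise for the reader...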
It gets a bit tricky, and you'll probably have to use awk for a little more output processing... Each matching filename is printed on its own line, so unfortunately there's no clear-cut delimiter between filenames and contents. Assuming no line of file content looks like a filename:
tar ... | awk '/^FILE_AWK_PATTERN$/{f=$0;next}...'
This sets the awk variable f to each new filename as it is encountered and skips to the next line. Then,
tar ... | awk '...f&&/CONTENT_AWK_PATTERN/{print f;f=""}'
Once we see a matching line, we print f and reset it until the next file is 'encountered'. (Note that the variable is written f, not $f: in awk, $f would be a field reference.)
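The two awk rules can be tried without tar on simulated input. This is only a sketch: the printf lines stand in for a stream of filename lines followed by file contents, and the patterns are placeholders for this demo. It uses plain f for the awk variable, because $f would be read as a field reference.

```shell
# Simulated stream: a filename line, then that file's content lines.
matches=$(printf '%s\n' 'a.log' 'alpha' 'needle one' 'needle two' \
                        'b.log' 'nothing to see' |
  awk '
    /^[a-z]+\.log$/ { f = $0; next }     # filename line: remember it
    f && /needle/   { print f; f = "" }  # first hit: report the file once
  ')
printf '%s\n' "$matches"
```

Resetting f after the first hit is what keeps a file with several matching lines from being reported more than once.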
Putting it together:
tar -OT <(echo 'FILE_PATTERN') -xvf /path/to/file.tar \
| awk '/^FILE_AWK_PATTERN$/{f=$0;next};f&&/CONTENT_AWK_PATTERN/{print f;f=""}'
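For a bunch of archives, a slower but simpler alternative is to extract one member at a time, so the archive and member names are known without parsing tar's verbose output. This is a sketch under assumptions, not the method above: grep_tars is a hypothetical helper name, both patterns are extended regular expressions, and member names are assumed to contain no newlines and not to start with a dash.

```shell
# Hypothetical helper: grep_tars NAME_REGEX CONTENT_REGEX ARCHIVE...
# Prints "archive: member" for every member whose name matches
# NAME_REGEX and whose contents match CONTENT_REGEX.
grep_tars() {
    name_re=$1
    content_re=$2
    shift 2
    for archive in "$@"; do
        # List the members, drop directory entries, filter by name.
        tar -tf "$archive" | grep -v '/$' | grep -E -- "$name_re" |
        while IFS= read -r member; do
            # Extract just this member to stdout and test its contents.
            if tar -xOf "$archive" "$member" < /dev/null |
               grep -Eq -- "$content_re"; then
                printf '%s: %s\n' "$archive" "$member"
            fi
        done
    done
}
```

Usage would look like grep_tars '\.log$' 'ERROR' *.tar. It runs tar once per candidate member, so it can be slow on large archives.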
Thanks for your time. I ended up using acat from atools instead of tar and piped to grep + uniq – Emre Aug 28 '15 at 02:30
for file in *.zip ; do acat $file foo | grep bar | uniq | sed -e 's/^/'$file' /g' ; done – Emre Aug 28 '15 at 11:25
zgrep, but you must add more details for your question to clarify it. – cuonglm Aug 27 '15 at 04:31