(Version 4)
It is a simple program for doing search & replace in multiple files from a command line, with support for recursion into subdirectories, regular expressions (a.k.a. regexps a.k.a. regexes) and a test (dry run) mode.
Many such programs have been written by others in the past, but I just wanted a very simple command line one that works on Microsoft Windows as well as Linux, so I hacked a copy of my Bulk File & Directory Renamer with Recursion & Regular Expressions program to do this simpler task.
Although I mainly wrote it for Microsoft Windows 2k (where very few programs support regular expressions), it should run equally well under Linux or Mac OS X under Perl. (Of course one does not really need a special program for this if one is using GNU/Linux/Bash, as a one line mess of 'find', '-name', '-exec' & 'sed'/'awk' can do it, but I am too forgetful of syntax & sloppy with typing to risk destroying files attempting that.)
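For comparison, here is a rough sketch of the kind of 'find' plus 'sed' one-liner alluded to above. It is not part of this program. The temporary directory, the file names and the 'colour'/'color' strings are made up for the demonstration, and it assumes a 'sed' that accepts '-i.bak' (GNU and BSD versions both should).

```shell
# Build a small throwaway directory tree to practise on.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
printf 'colour colour\n' > "$demo_dir/a.txt"
printf 'colour\n' > "$demo_dir/sub/b.txt"

# Recurse from $demo_dir, pick regular files named *.txt, and edit each in place.
# '-i.bak' keeps an untouched copy of every file with a '.bak' suffix.
# Without a trailing 'g' flag, sed replaces only the first match on each line.
find "$demo_dir" -type f -name '*.txt' -exec sed -i.bak 's/colour/color/' {} +

cat "$demo_dir/a.txt" "$demo_dir/sub/b.txt"
```

Note that 'sed' works line by line, so "first match" here means the first match on each line, which is not the same as this program's default of the first match in each file.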
It requires a Perl interpreter (with the 'File::Find', 'File::Path' & 'Getopt::Std' modules, but those usually come as standard with Perl anyway).
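If in doubt, something like the following should report whether Perl and the three modules are present. It is only a sketch: the variable name 'modules_status' is my own, and it relies on nothing more than Perl's standard '-M' (load module) and '-e' (run code) switches.

```shell
# Try to load the three modules and run a trivial program ('1').
# Any failure (Perl missing, or a module missing) lands in the else branch.
if perl -MFile::Find -MFile::Path -MGetopt::Std -e 1 2>/dev/null; then
    modules_status="available"
else
    modules_status="missing"
fi
echo "Required Perl modules: $modules_status"
```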
To process all files and directories in the current working directory and subdirectories thereof, changing any occurrence of the expression <From> to the expression <To>, simply:
perl SearchAndReplaceInMultipleFiles.pl <From> <To>
Depending on how you installed the program, you might be able to drop the 'perl' prefix and the '.pl' extension and, less pedantically, do:
SearchAndReplaceInMultipleFiles <From> <To>
Non-trivial expressions and expressions containing spaces will probably need to be in (double for Windows) quotation marks so the shell passes them to the program as strings rather than trying to split up or process them itself.
You will probably want to use the '-m g' option (see below) as that will cause it to replace every match in each of the files rather than just the first match in each file, its default behaviour.
Although designed primarily for plain text (and similar, such as HTML) files, it can work on any file type (e.g. to replace dates in EXIF headers of JPG files) but take care as it simply treats all files as if they were plain text or binary, blithely ignoring structure, so it is easy to accidentally corrupt more complex formats.
There are additional options which can be inserted before the two obligatory regular expressions:
perl SearchAndReplaceInMultipleFiles.pl <options> <From> <To>
SearchAndReplaceInMultipleFiles <options> <From> <To>
The options are provided in the common Linux short format of single letters each prefixed by '-' and separated from each other & parameters by spaces. Options that don't require additional parameters can be grouped (e.g. '-ft' means the same as '-f -t').
<name> regular expression.
<directory> directory instead of the current working directory.
It can be very useful for correcting spelling mistakes duplicated across lots of files, for example with holiday photographs where a friend spots that I have consistently misspelt the name of a place across dozens of photograph captions.
All it needs is (replace the from & to strings with those required):
Windows:
SearchAndReplaceInMultipleFiles -m g "\bLund'n Bridj\b" "London Bridge"
Linux:
SearchAndReplaceInMultipleFiles -m g "\bLund'n Bridj\b" "London Bridge"
You can use it just as it is as a recipe, but if you want an explanation, here goes. The '-m g' tells it to replace every occurrence in each file (so, for example, 'Lund'n Bridj viewed from Lund'n Bridj tour' becomes 'London Bridge viewed from London Bridge tour', not 'London Bridge viewed from Lund'n Bridj tour').
The '\b' marks word boundaries (more generic than spaces, it also includes punctuation and string ends) to prevent it changing words of which 'Lund'n Bridj' is a substring.
One does not really need these complications in this case, as substrings are not likely to be a problem, so one could simply do 'SearchAndReplaceInMultipleFiles "Lund'n Bridj" "London Bridge"' & repeat the command until it makes no further changes.
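On Linux, a pattern that itself contains an apostrophe (like "Lund'n") cannot simply be wrapped in single quotation marks, because the apostrophe would end the string early. The sketch below shows two ways round that in a Bash-like shell, using 'printf' to reveal what the program would actually receive. The variable names are mine, purely for illustration.

```shell
# Option 1: double quotation marks. A backslash before an ordinary letter
# such as 'b' is passed through unchanged, and the apostrophe is harmless.
in_double=$(printf '%s' "\bLund'n Bridj\b")

# Option 2: single quotation marks, closing the string, adding an escaped
# apostrophe (\'), then reopening the string.
in_single=$(printf '%s' '\bLund'\''n Bridj\b')

# Both lines printed here should be identical.
printf '%s\n' "$in_double" "$in_single"
```

Option 1 is easier to type but only safe while the pattern contains no '$' or similar characters that the shell treats specially inside double quotation marks.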
One complication of running programs from a command line is that the command line interpreter ('shell') treats some characters specially and replaces them with other things (such as values of settings or unprintable characters) before running the program. Some of these characters are treated specially even inside strings.
The risky characters are '$' & '\' on Linux and '%' on Windows.
There are two solutions in Linux. The simplest is to use single quotation marks (''...'') instead of double quotation marks ('"..."') for the strings. The other is to prefix ('escape') each '$' & '\' with another '\'. E.g. instead of
SearchAndReplaceInMultipleFiles -t "$20" "$30"
use
SearchAndReplaceInMultipleFiles -t '$20' '$30'
or
SearchAndReplaceInMultipleFiles -t "\$20" "\$30"
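To see what the shell really does with each form, one can feed the strings to 'printf' instead of to the program. This is just a sketch for a Bash-like shell. The 'set --' line and the variable names are my own scaffolding, there to give the shell a value for '$2' so the substitution is visible.

```shell
# Give the shell two positional parameters so that $2 has a value.
set -- first second

# Double quotation marks: the shell reads "$20" as $2 followed by a literal 0.
double=$(printf '%s' "$20")

# Single quotation marks: nothing is substituted.
single=$(printf '%s' '$20')

# Double quotation marks with the dollar sign escaped: nothing is substituted.
escaped=$(printf '%s' "\$20")

printf 'double:  %s\nsingle:  %s\nescaped: %s\n' "$double" "$single" "$escaped"
```

Only the single-quoted and escaped forms reach the program as the intended '$20'.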
In Windows I don't know how to prevent it substituting things beginning with '%' if it recognises them as settings (such as '%TMP%', the path to the system temporary directory). Within batch files supposedly prefixing '%' with '^' works but that did not work when I tested it directly on a command line. Fortunately Windows typically only has about 20 settings strings (use 'SET' to see which your installation is using) & '%' is usually used in English text with a space or punctuation after it so this is rarely likely to be a problem.
This is a powerful program that, running with sufficient file permissions, could corrupt every file on your computer (and networked drives), so take care. Preferably make a back-up copy of your files before use, take care that you are running it on the directory you intend to run it on, and run it in test mode (the '-t' option) first, checking that the changes it is going to make are what you want them to be. Treat it with the care you would treat 'rm -rf *' on Linux or 'del /s /q *' on Windows.
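One cautious routine on Linux might look like the sketch below: copy the directory, do a test run, do the real run, then compare against the copy. The directory and file names are invented for the demonstration, and the two invocations of this program are left commented out, so as written the comparison reports no differences.

```shell
# A stand-in for the directory of files to be changed.
photos_dir=$(mktemp -d)
printf "Lund'n Bridj at dusk\n" > "$photos_dir/caption.txt"

# Back-up copy of everything in it, made before touching anything.
backup_dir=$(mktemp -d)
cp -R "$photos_dir/." "$backup_dir/"

# Test run first, then the real run (uncomment once the program is installed):
#   cd "$photos_dir" && SearchAndReplaceInMultipleFiles -t -m g "Lund'n Bridj" "London Bridge"
#   cd "$photos_dir" && SearchAndReplaceInMultipleFiles -m g "Lund'n Bridj" "London Bridge"

# Afterwards, check whether anything differs from the back-up.
if diff -r -q "$backup_dir" "$photos_dir" >/dev/null; then
    backup_check="identical"
else
    backup_check="changed"
fi
echo "Compared with back-up: $backup_check"
```

Dropping the '>/dev/null' makes 'diff' list exactly which files differ, which is a quick way to confirm that only the intended files were touched.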
It does not need fancy installation. Provided Perl has been installed and this program has been downloaded, it should be ready to run! The following are just options for making it look tidier.
If you are only going to use it for a one off job, just put it in the directory you want the replacing done in and run it from there with 'perl SearchAndReplaceInMultipleFiles.pl' (it might accidentally corrupt itself by searching & replacing in itself but its job will have been done by then) and delete it afterwards.
Alternatively, to keep it for later use, put it anywhere listed in the computer's 'path' setting (e.g. '/usr/local/bin/' on Linux & 'C:\Windows\System32\' will typically work) or put it wherever you like and add its directory to the computer's 'path' setting.
The '.pl' on the end that tells the computer that it is a Perl program looks a bit untidy.
On Linux, you can remove it by renaming the program, provided you tell your computer it is a Perl script either by always running it explicitly prefixed with 'perl' or by editing the first line of the program so that the bit after the '#!' is the location of your computer's Perl interpreter.
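The sketch below walks through that on a made-up stand-in script ('hello.pl' in a temporary directory) rather than on the real program. 'command -v perl' is one way to find the interpreter's location. Marking the file as executable with 'chmod +x' is an extra step I am assuming is needed on most Linux systems for the '#!' line to take effect.

```shell
# Find where the Perl interpreter lives (fall back to a common location).
perl_path=$(command -v perl || echo /usr/bin/perl)

# Create a tiny stand-in Perl script whose first line points at that interpreter.
demo_bin=$(mktemp -d)
printf '#!%s\nprint "ok\\n";\n' "$perl_path" > "$demo_bin/hello.pl"

# Drop the '.pl' by renaming, then mark the file as executable.
mv "$demo_bin/hello.pl" "$demo_bin/hello"
chmod +x "$demo_bin/hello"

# Show the first line, i.e. the '#!' followed by the interpreter's location.
head -n 1 "$demo_bin/hello"
```

With the directory on the 'path' setting, the script could then be run by typing just 'hello'.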
On Microsoft Windows you cannot remove the '.pl' without using the explicit 'perl' method but you can avoid needing to type it by adding ';.PL' to the 'PATHEXT' system setting (that tells Windows that '.pl' files are executables and therefore, like '.exe' files, don't need the file extension typed when searching for them in the system path directories).
'SearchAndReplaceInMultipleFiles' is nicely descriptive but long to type. That is not a problem if using GNU/Linux/Bash because file names in the system path directories can be autocompleted by pressing the tab key.
Unfortunately on Microsoft Windows, only items in the current working directory, not the system path directories, can be autocompleted. Hence on Windows it is probably worthwhile renaming the program to something much shorter if it is going to be used frequently.
Download SearchAndReplaceInMultipleFiles.pl (5 KiB).
See my computer programs index page for more simple useful computer programs.