Bulk File & Directory Renamer with Recursion & Regular Expressions

(Version 10)

What is This?

It is a simple program for renaming multiple files & directories (a.k.a. folders) from a command line with support for recursion into subdirectories, regular expressions (a.k.a. regexps a.k.a. regexs) and a test (dry run) mode.

You may ask why I have made this when, it being such a useful type of program, there have been many many such programs written by others in the past. Well, I am fussy and did not find one that fitted my liking when I started it so I wrote my own and kept adding new features as I needed them.

Although I mainly wrote it for M$ Windows 2k (where very few programs support regular expressions), it should run equally well under Perl on Linux or Mac OS X. (Of course one does not really need a special program for this if one is using GNU/Linux/Bash, as a one line mess of 'find', '-name', '-type', '-exec', 'sed' & 'mv' can do it, but I am too forgetful of syntax & sloppy with typing to risk destroying files attempting that.)
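For comparison, the 'find'/'sed'/'mv' approach mentioned above might look roughly like the following sketch. It is not part of this program. The temporary directory, the file names and the '.fil' to '.tif' substitution are all made up for illustration, and it only prints the 'mv' commands rather than running them. It assumes no file name contains a newline.

```shell
# Hypothetical demo tree; nothing outside this temporary directory is touched.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/a.fil" "$demo/sub/b.fil" "$demo/keep.txt"

# Dry run: find every file ending '.fil' (recursively) and print, but do not
# run, the mv command that would rename it to end '.tif'.
plan=$(find "$demo" -type f -name '*.fil' | while IFS= read -r old; do
  new=$(printf '%s\n' "$old" | sed 's/\.fil$/.tif/')
  printf 'mv %s %s\n' "$old" "$new"
done)
printf '%s\n' "$plan"
```

To really rename, the 'printf' line inside the loop would be replaced with 'mv -- "$old" "$new"', which is exactly the sort of step where a typing slip can destroy files.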

System Requirements

A Perl interpreter (with the 'File::Find', 'File::Path' & 'Getopt::Std' modules but those usually come as standard with Perl anyway).

How to Use It

Basic Usage

To process all files and directories in the current working directory and subdirectories thereof changing any occurrence of the expression <From> to the expression <To>, simply:

perl RecursiveRegexpRename.pl <From> <To>

Depending on how you installed the program you might be able to discard some of it and less pedantically do:

RecursiveRegexpRename <From> <To>

Non-trivial expressions and expressions containing spaces will probably need to be in (double for Windows) quotation marks so the shell passes them to the program as strings rather than trying to split up or process them itself.

To move the files or directories into another folder, just make <To> the path, relative to where it was, to move to. Note that it uses '/' as the directory separator symbol (even on M$ Windows) and that is a special character in Perl regular expressions & so should be escaped as '\/'.

Options

There are additional options which can be inserted before the two obligatory regular expressions:

perl RecursiveRegexpRename.pl <options> <From> <To>

RecursiveRegexpRename <options> <From> <To>

The options are provided in the common Linux short format of single letters each prefixed by '-' and separated from each other & parameters by spaces. Options that don't require additional parameters can be grouped (e.g. '-ft ' means the same as '-f -t ').

-h
Print a summary of the instructions. (The instruction summary will also be printed if a syntax error is found in the options or if the program is run without parameters.)
-f
Limit the renaming to files only, not directories. (By default it renames both, equivalent to specifying '-fd'.)
-d
Limit the renaming to directories only, not files. (By default it renames both, equivalent to specifying '-fd'.)
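-n <Name>
Limit to filenames matching the <Name> regular expression.
-m [egimosx]
Apply the given Perl search & replace modifiers. The most useful for file renaming are 'g' (replace every occurrence in a name, not just the first), 'i' (match case insensitively) and 'e' (treat <To> as a Perl expression to be evaluated rather than as a plain string).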
-b <directory>
Process files within <directory> directory instead of the current working directory.
-e
Do not recurse subdirectories.
-p
Allow creation of directories. This only matters when the program is being used to move items to other directories (by having directory separator characters in <To>) and some of the destination directories do not already exist. If one tries to move an item to a non-existent directory without this option set then the program will abort with an error, but with this option set it will instead automatically create the directory.
-t
Test mode. (It does a dry run, printing out the changes it would have made, but does not actually make the changes.)

Example: Simple Search and Replace in Multiple File & Directory Names

I find this very useful for correcting spelling mistakes that I have duplicated across lots of file & directory names before noticing, for example with holiday photographs where a friend spots that I have consistently misspelt the name of a place across dozens of photographs.

All it needs is (replace the from & to strings to those required):

Windows: RecursiveRegexpRename -m g "\bLund'n Bridj\b" "London Bridge"

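Linux: RecursiveRegexpRename -m g "\bLund'n Bridj\b" 'London Bridge'

(On Linux the first string stays in double quotation marks here because it contains a single quotation mark, which cannot appear inside a single-quoted string.)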

You can use it just as it is as a recipe but if you want an explanation, here goes. The '-m g' tells it to replace every occurrence in each file name (so, for example, 'Lund'n Bridj view, Lund'n Bridj.jpg' becomes 'London Bridge view, London Bridge.jpg' not 'London Bridge view, Lund'n Bridj.jpg'). The '\b' marks word boundaries (more generic than spaces, it also includes punctuation and string ends) to prevent it changing words of which 'Lund'n Bridj' is a substring. One does not really need these complications in this case as substrings are not likely to be a problem so one could simply do 'RecursiveRegexpRename "Lund'n Bridj" "London Bridge"' & repeat the command until it makes no further changes.

Example: Changing of File Name Extensions

I was asked this by a reader who wanted to rename 2 TiB of image files from '*.fil' to '*.tif'. All it needs is:

Windows: RecursiveRegexpRename -f -m i "\.fil$" ".tif"

Linux: RecursiveRegexpRename -f -m i '\.fil$' '.tif'

The same method (just changing the parameter text appropriately of course) would work for other common ones like changing '*.txt' to '*.csv', '*.tiff' to '*.tif', '*.jpeg' to '*.jpg' & '*.pps' to '*.ppt'.

You can use it just as it is as a recipe but if you want an explanation, here goes. The '-f' restricts it to files lest one has any odd directories with names ending in '.fil'. The '-m i' tells it to be case insensitive (so '.Fil', '.FIL' etc. also become '.tif'). The '\' ensures that the following '.' is treated simply as a '.' character in matches rather than as a wildcard ('.' is the single character wildcard in Perl regular expressions, i.e. it is the 'joker' that matches any character) to prevent, for example, 'tasks to fulfil' becoming 'tasks to fu.tif'. The '$' matches the end of a string so as to prevent, for example, 'space.filler.fil' becoming 'space.tifler.fil'. All that is essentially paranoia. Provided one does not have any directory names ending '.fil', any file or directory names with '.fil' other than at the end or any names ending with 'fil' that aren't ending '.fil' then a simple plain 'RecursiveRegexpRename ".fil" ".tif"' would work fine.
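The effect of the '\' and the '$' can be seen without renaming anything. The following sketch uses 'sed' as a stand-in for the program's Perl substitution, which is an assumption made only for the demonstration. For these three simple patterns the two behave the same way.

```shell
# '.' left unescaped: it matches any character, so the 'lfil' in 'fulfil' is replaced.
unescaped=$(printf '%s\n' 'tasks to fulfil' | sed 's/.fil/.tif/')

# '.' escaped but no '$': the first '.fil' is replaced, not the one at the end.
unanchored=$(printf '%s\n' 'space.filler.fil' | sed 's/\.fil/.tif/')

# '.' escaped and anchored with '$': only the real extension is replaced.
strict=$(printf '%s\n' 'space.filler.fil' | sed 's/\.fil$/.tif/')

printf '%s\n' "$unescaped" "$unanchored" "$strict"
# Prints:
#   tasks to fu.tif
#   space.tifler.fil
#   space.filler.tif
```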

Example: Changing of Spaces Representation

A problem with file names is that they are usually supposed to be useful for a human to read and normal English text has spaces but spaces were (unwisely in my opinion) reserved as file name separator characters in early filesystems so bodges like using "Example_file_name" or "ExampleFileName" for "Example file name" became de facto standards. These days the normal Windows, Linux & Mac filesystems can cope with spaces but some people have got used to the bodged styles and prefer them or have to use them because some applications (most notably the WWW) still don't like spaces.

It can be annoying to receive files from different sources named in a mixture of the formats or in a format that is not one's own preference. However one can, to some extent, use this program to convert them in bulk to the way one wants them.

The following table is for Windows use. For Linux use simply replace all the double quotation marks ('"') with single quotation marks (''').


From                             To                               Command
Underscores (Example_file_name)  Spaces (Example file name)       RecursiveRegexpRename -m g "_" " "
Camel case (ExampleFileName)     Spaces (Example file name)       RecursiveRegexpRename -m g "(?<=.)([A-Z])" " \L$1"
Spaces (Example file name)       Underscores (Example_file_name)  RecursiveRegexpRename -m g " " "_"
Camel case (ExampleFileName)     Underscores (Example_file_name)  RecursiveRegexpRename -m g "(?<=.)([A-Z])" "_\L$1"
Spaces (Example file name)       Camel case (ExampleFileName)     RecursiveRegexpRename -m g "(^| +)(\w*)" "\u\L$2"
Underscores (Example_file_name)  Camel case (ExampleFileName)     RecursiveRegexpRename -m g "(^|_+)([^_]*)" "\u\L$2"

Note that there is no way such an automated conversion can work perfectly in every case. For example, it cannot tell that 'FreeBbcTvProgram.mpg' should become 'Free BBC TV program.mpg' not 'Free bbc tv program.mpg'. However it can often do the majority of the work.

Example: Changing Hex Values into Characters

A reader of this site had a large number of files with some of the characters in the file names replaced with two-digit upper-case hexadecimal character values prefixed by underscores, e.g. spaces had become '_20', left parentheses '_28' and right square brackets '_5D'. The reader wanted them corrected back to characters. Whilst one could run the program repeatedly, once for each of the character codes which needs to be replaced, it can all be done in one pass by using the 'e' Perl search & replace modifier and using as the replacement string a Perl expression that calculates the required character from the hex character code:

RecursiveRegexpRename -m ge "_([0-9A-F]{2})" "chr(hex($1))"

(That is for Windows. For Linux use, simply replace all the double quotation marks ('"') with single quotation marks (''').)

Example: Changing Case

A reader asked how to convert filenames to lower case. That turns out to be particularly easy (shown with Windows quotes):

RecursiveRegexpRename -m ge "(.*)" "\L$1"

It works because '(.*)' takes the whole name so '$1' just replaces the name with itself but it is operated on by '\L' which is Perl's in-string operator for converting to lower case.

To change just the first letter of the name to lower case, replace '\L' with '\l'. Similarly replace it with '\U' for all to upper case or '\u' for just the first letter to upper case.

Example: Sorting Files into Folders by Initial Letter

An MP3 player of mine allowed selecting an individual track to play by filename but a flat folder of hundreds of files was tedious to navigate by its buttons. A simple solution was to turn the flat folder into a hierarchical set of folders by filename first letter:

RecursiveRegexpRename -p "^(.)" "\1\/\1"

It could similarly be extended to a few more levels (or even splitting every letter into its own folder level!).
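The mapping that this produces can be previewed with a short shell sketch. It only prints the old and new names and is not how the program itself works. The three track names are made up for illustration.

```shell
# For each (hypothetical) track name, show where a first-letter folder
# scheme would put it: 'apple.mp3' goes to 'a/apple.mp3' and so on.
plan=$(for name in apple.mp3 avocado.mp3 banana.mp3; do
  rest=${name#?}          # the name without its first character
  first=${name%"$rest"}   # just the first character
  printf '%s -> %s/%s\n' "$name" "$first" "$name"
done)
printf '%s\n' "$plan"
# Prints:
#   apple.mp3 -> a/apple.mp3
#   avocado.mp3 -> a/avocado.mp3
#   banana.mp3 -> b/banana.mp3
```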

Avoiding Command Line Problems

One complication of running programs from a command line is that the command line interpreter ('shell') treats some characters specially and replaces them with other things (such as values of settings or unprintable characters) before running the program. Some of these characters are even treated so inside strings. The risky characters are '$' & '\' on Linux and '%' on Windows.

There are two solutions in Linux. The simplest is to use single quotation marks (''...'') instead of double quotation marks ('"..."') for the strings. The other is to prefix ('escape') each '$' & '\' with another '\'. E.g. instead of

RecursiveRegexpRename -f -m i -t "\.\$00" ".jpg"

use

RecursiveRegexpRename -f -m i -t '\.\$00' '.jpg'

or

RecursiveRegexpRename -f -m i -t "\\.\\\$00" ".jpg"
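To see what the shell actually hands to the program in each of the three cases above, the strings can be printed with 'printf' instead of being passed to the renamer. This is only a way of inspecting the quoting. The variable names are made up for the demonstration.

```shell
broken="\.\$00"       # double quotes: the shell swallows the '\' before '$'
single='\.\$00'       # single quotes: passed through untouched
escaped="\\.\\\$00"   # double quotes with every '\' and '$' escaped

printf '%s\n' "$broken" "$single" "$escaped"
# Prints:
#   \.$00
#   \.\$00
#   \.\$00
```

The first form loses the backslash before the '$', so the program would receive a different regular expression from the one intended. The other two forms deliver the same string.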

In Windows I don't know how to prevent it substituting things beginning with '%' if it recognises them as settings (such as '%TMP%', the path to the system temporary directory). Within batch files supposedly prefixing '%' with '^' works but that did not work when I tested it directly on a command line. Fortunately Windows typically only has about 20 settings strings (use 'SET' to see which your installation is using) & file names only usually contain '%' when representing non-ASCII characters in files saved from a website so this is rarely likely to be a problem (unlike '\').

Safety

This is a powerful program that, running with sufficient file permissions, could corrupt the name of every file and directory on your computer (and networked drives) so take care. Preferably make a back-up copy of your files before use, take care that you are running on the directory you intend to run it on and run it in test mode (the '-t ' option) first, checking that the changes it is going to make are what you want them to be. Treat it with the care you would treat 'rm -rf *' on Linux or 'del /s /q *' on Windows.

In particular, note that it overwrites files at the destination without warning. Hence if 2 files end up with the same name (e.g. if one removes the vowels from the file names "Coot.txt" & "Cat.txt" they both become "Ct.txt") the latter overwrites the former. A less obvious situation (which caught me myself out - fortunately I had all the lost photograph files well backed up) is that a moved file can overwrite a yet-to-be-moved file. E.g. if one has a series of files "01.txt", "02.txt", "03.txt" etc. which one wants to increment the serial numbers of and naïvely does it in one operation then "01.txt" may be moved to "02.txt" before the original "02.txt" itself is moved so that file is lost, then the new "02.txt" (which started off as "01.txt") is moved to "03.txt" destroying that file too & so on. A simple work-around is to use a temporary renaming that does not clash with existing file names then rename to the final values, e.g. move "01.txt" to "02.XmovedX.txt" etc. then trivially remove the ".XmovedX"s from the names.
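The overwrite chain and the two-step work-around can be reproduced safely with plain 'mv' in throwaway directories. This sketch does not use the program itself. The directories come from 'mktemp' and the file contents are made up, so that one can see which file survived.

```shell
# Naive single pass, in ascending order: each move clobbers the next file.
naive=$(mktemp -d)
for n in 01 02 03; do printf 'was %s\n' "$n" > "$naive/$n.txt"; done
mv "$naive/01.txt" "$naive/02.txt"   # original 02.txt is lost here
mv "$naive/02.txt" "$naive/03.txt"   # original 03.txt is lost here
mv "$naive/03.txt" "$naive/04.txt"
ls "$naive"            # only 04.txt is left
cat "$naive/04.txt"    # and it contains 'was 01'

# Two-step work-around: move to names that cannot clash, then tidy up.
safe=$(mktemp -d)
for n in 01 02 03; do printf 'was %s\n' "$n" > "$safe/$n.txt"; done
mv "$safe/01.txt" "$safe/02.XmovedX.txt"
mv "$safe/02.txt" "$safe/03.XmovedX.txt"
mv "$safe/03.txt" "$safe/04.XmovedX.txt"
for f in "$safe"/*.XmovedX.txt; do
  mv "$f" "${f%.XmovedX.txt}.txt"    # strip the temporary marker
done
ls "$safe"             # 02.txt, 03.txt and 04.txt, all with their own contents
```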

Installation Options

It does not need fancy installation. Provided Perl has been installed and this program has been downloaded, it should be ready to run! The following are just options for making it look tidier.

Changing Installation Directory

If you are only going to use it for a one off job, just put it in the directory you want renaming done in and run it from there with 'perl RecursiveRegexpRename.pl' (it might accidentally rename itself but its job will have been done by then) and delete it afterwards.

Alternatively, to keep it for later use, put it anywhere listed in the computer's 'path' setting (e.g. '/usr/local/bin/' on Linux & 'C:\Windows\System32\' will typically work) or put it wherever you like and add its directory to the computer's 'path' setting.

Getting rid of the '.pl' Filename Extension

The '.pl' on the end that tells the computer that it is a Perl program looks a bit untidy.

On Linux, you can remove it by renaming the program provided you tell your computer it is a Perl script either by always running it explicitly prefixed with 'perl ' or by editing the first line of the program so that the bit after the '#! ' is the location of your computer's Perl interpreter.

On M$ Windows you cannot remove the '.pl' without using the explicit 'perl ' method but you can avoid needing to type it by adding ';.PL' to the 'PATHEXT' system setting (that tells Windows that '.pl' files are executables and therefore, like '.exe' files, don't need the file extension typed when searching for them in the system path directories).

Shortening the Name

'RecursiveRegexpRename' is nicely descriptive but long to type. That is not a problem if using GNU/Linux/Bash because file names in the system path directories can be autocompleted by pressing the tab key.

Unfortunately on M$ Windows, only items in the current working directory, not the system path directories, can be autocompleted. Hence on Windows I rename the program to the much shorter name 'RRR'. Note that renaming it to simply 'Rename' on Windows is not a good idea as Windows already has a built-in command of that name.

Download

Download RecursiveRegexpRename.pl (6 kiB).

Other Perl Scripts, Disclaimers Etc.

See my computer programs index page for more simple useful computer programs.