8th May 2017
Sorting Files by Name
Including advanced sorts, and coping with spaces
Sorting files is normally a fairly straightforward task: "ls -lSr" sorts them by size (smallest to largest), "ls -ltr" sorts them by last-modified time (oldest to newest), and so on. For more advanced sorts, say to sort them in numerical order, you may want to pipe the filenames through the dedicated "sort" utility. For example, here I have a set of files which I'd like to sort numerically. For an extra twist, the names contain spaces which, as we shall see, complicates things slightly.
$ ls
10. Blade Runner   2. Pulp Fiction
11. The Shining    3. The Wizard of Oz
12. Fight Club     4. 2001 - A Space Odyssey
13. Alien          5. Schindler's List
14. Toy Story      6. Star Wars
15. The Matrix     7. Se7en
16. Amelie         8. To Kill a Mockingbird
1. Casablanca      9. The Silence of the Lambs
$
These files are listed in alphabetical order by default; "10" comes before "11", which is good, but also before "1", which is not what we want. We need to pass these names to "sort -n", which compares the leading numbers by value rather than character by character.
$ ls | sort -n
1. Casablanca
2. Pulp Fiction
3. The Wizard of Oz
4. 2001 - A Space Odyssey
5. Schindler's List
6. Star Wars
7. Se7en
8. To Kill a Mockingbird
9. The Silence of the Lambs
10. Blade Runner
11. The Shining
12. Fight Club
13. Alien
14. Toy Story
15. The Matrix
16. Amelie
$
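To see the two orderings side by side without touching any real files, here is a small sketch that feeds three of the names through "sort" with and without "-n". The variable names are my own, and I have pinned the locale with LC_ALL=C so that the result is repeatable; in other locales the plain sort may place "1." and "10." differently, but "10." still lands before "2.".

```shell
# Three sample names, deliberately out of numeric order.
names='10. Blade Runner
1. Casablanca
2. Pulp Fiction'

# Plain sort compares the names character by character.
# LC_ALL=C pins the collation rules so the result is repeatable.
lexical=$(printf '%s\n' "$names" | LC_ALL=C sort)

# sort -n compares the leading number by its numeric value.
numeric=$(printf '%s\n' "$names" | LC_ALL=C sort -n)

printf 'Plain sort:\n%s\n\nNumeric sort:\n%s\n' "$lexical" "$numeric"
```

The plain sort leaves "2. Pulp Fiction" at the end of the list, after "10. Blade Runner"; the numeric sort puts "10. Blade Runner" last.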
As an aside, you may notice that "ls" by itself formatted the output into columns; this is because it knew it was writing to a terminal, so it prettified the output a little. "ls -1" (that's minus 1, the digit one, not minus l, the letter L) tells "ls" to always list one file per line. If it is writing to a pipe, a file, or anything else that is not a terminal, "ls" defaults to writing one file per line.
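If you want to check this behaviour for yourself, the sketch below captures the output of "ls" with and without "-1" in a throwaway directory. The directory and variable names are just for illustration. Because command substitution is not a terminal, both forms should produce the same one-name-per-line text.

```shell
# Work in a throwaway directory so that nothing real is touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/1. Casablanca" "$demo_dir/2. Pulp Fiction"

# Command substitution is not a terminal, so ls writes one name per line.
piped=$(ls "$demo_dir")

# -1 asks for the same layout explicitly, wherever the output goes.
explicit=$(ls -1 "$demo_dir")

# Count the lines that ls produced when it was not writing to a terminal.
line_count=$(printf '%s\n' "$piped" | grep -c '')
echo "ls wrote ${line_count} lines"

rm -r "$demo_dir"
```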
Now that we have sorted the files, we can iterate through them in a loop. Here is one well-intentioned attempt:
$ for filename in `ls | sort -n`
> do
>   echo "Found file: ${filename}"
> done
Unfortunately, that does not deal with the fact that these filenames contain spaces. The shell splits the output of the command on whitespace, so every word passed to the for loop is processed in turn:
Found file: 1.
Found file: Casablanca
Found file: 2.
Found file: Pulp
Found file: Fiction
Found file: 3.
Found file: The
Found file: Wizard
. . .
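The splitting can be reproduced with a single name. In this minimal sketch (assuming a Bourne-style shell such as sh or bash, with the default IFS), an unquoted command substitution is broken into words on spaces, tabs and newlines before the for loop ever sees it. The variable names are mine.

```shell
# One line of text, standing in for a single filename from ls.
line='3. The Wizard of Oz'

# Unquoted, the substitution is split on whitespace,
# so the loop runs once per word rather than once per line.
count=0
for word in $(printf '%s\n' "$line")
do
  count=$((count + 1))
done

echo "The for loop saw ${count} words"
```

One filename becomes five loop iterations: "3.", "The", "Wizard", "of" and "Oz".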
Probably the tidiest way around this problem is to pipe the output into a loop: "while read filename" reads its input line by line into the variable named filename. So rather than grabbing the output of "ls | sort -n" into a for loop, we pipe it into a while loop, like this:
$ ls | sort -n | while read filename
> do
>   echo "Found file: ${filename}"
> done
This produces the desired result: spaces are preserved, and the files are listed in the order that you would expect:
Found file: 1. Casablanca
Found file: 2. Pulp Fiction
Found file: 3. The Wizard of Oz
Found file: 4. 2001 - A Space Odyssey
Found file: 5. Schindler's List
. . .
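One caveat worth knowing: a plain "read" trims leading and trailing whitespace from each line and treats backslashes as escape characters. None of the film names here are affected, but if your filenames might be, "IFS= read -r" is a common way to keep each line exactly as it was written. The sketch below uses a made-up name to show the difference; it is an illustration, not a change to the loop above.

```shell
# A made-up name with two leading spaces and a backslash.
tricky='  back\slash name'

# Plain read trims the leading spaces and swallows the backslash.
plain=$(printf '%s\n' "$tricky" | { read name; printf '%s' "$name"; })

# IFS= keeps the surrounding whitespace; -r keeps the backslash.
exact=$(printf '%s\n' "$tricky" | { IFS= read -r name; printf '%s' "$name"; })

printf 'plain: [%s]\nexact: [%s]\n' "$plain" "$exact"
```

Neither form copes with a newline inside a filename, since both still read one line at a time.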
These files can now be processed however you wish; this technique could be useful for a Flyway-style database migration tool, for example, where you name all of your SQL files to indicate the order in which they should be run.
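As a rough sketch of that idea, the loop below walks a directory of numbered SQL files in numeric order. The file names and the run_in_order function are hypothetical, and the loop only prints what it would run; a real tool would hand each file to its database client at that point.

```shell
# Hypothetical migration scripts, created in a throwaway directory.
migrations=$(mktemp -d)
touch "$migrations/1. create tables.sql" \
      "$migrations/2. add indexes.sql" \
      "$migrations/10. drop old columns.sql"

# List the scripts in the given directory in numeric order.
run_in_order() {
  ls "$1" | sort -n | while read filename
  do
    # A real tool would pass the file to its database client here.
    echo "Would run: ${filename}"
  done
}

order=$(run_in_order "$migrations")
echo "$order"

rm -r "$migrations"
```

The script numbered 10 is listed last, after 1 and 2, which is the order a migration tool would want.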
This is one of those problems where the answer seems quite obvious once you know it, but for those not so familiar with the shell and how it splits strings into words, it can be frustrating and time-consuming to work out by trial and error. Hopefully this short page has helped you to understand what is going on, and why the for loop didn't work where the while loop did.