3 Jun 2015
Sorting on Fields
The "sort" utility seems a pretty obvious thing. But it can catch you out in odd little ways. Here's a simple example. Given some machine sizings as follows, how does a script display and sort them appropriately?
Size | CPU | Memory |
---|---|---|
Tiny | 1 | 2048 |
Small | 1 | 4096 |
Medium | 2 | 4096 |
Big | 2 | 8192 |
Large | 4 | 8192 |
XL | 4 | 16384 |
XXL | 8 | 32768 |
Start with a template file which declares two associative arrays - call it size.tmpl. This can be read in by the main script via "source ./size.tmpl" (or just ". ./size.tmpl"):
    declare -A CPU RAM

    # CPU Count
    CPU['Tiny']=1
    CPU['Small']=1
    CPU['Medium']=2
    CPU['Big']=2
    CPU['Large']=4
    CPU['XL']=4
    CPU['XXL']=8

    # RAM in Mb
    RAM['Tiny']=2048     # 2Gb
    RAM['Small']=4096    # 4Gb
    RAM['Medium']=4096   # 4Gb
    RAM['Big']=8192      # 8Gb
    RAM['Large']=8192    # 8Gb
    RAM['XL']=16384      # 16Gb
    RAM['XXL']=32768     # 32Gb
Then create a script which reads those arrays and displays their contents; call it sort.sh:
    #!/bin/bash
    . ./size.tmpl
    printf "%-10s%4s%8s\n" Size CPU RAM    # show header
    for SIZE in ${!CPU[*]}
    do
      printf "%-10s%4d%8d\n" ${SIZE} ${CPU[$SIZE]} ${RAM[$SIZE]}
    done
Unfortunately, bash does not return the keys of an associative array in any particular order:
    $ ./sort.sh
    Size       CPU     RAM
    XL           4   16384
    Medium       2    4096
    Tiny         1    2048
    Small        1    4096
    Large        4    8192
    Big          2    8192
    XXL          8   32768
The answer is to use sort. The -n switch tells it to sort numerically (so that "9" comes before "10", for example). And you can give it keys to sort on. By default, fields are separated by whitespace, which is what we have here, so we just need to use "sort -n -k 3 -k 2". This tells it to sort on column 3 (and anything which might come after it on the line), then on column 2 to break any ties:
    #!/bin/bash
    . ./size.tmpl
    printf "%-10s%4s%8s\n" Size CPU RAM    # show header
    for SIZE in ${!CPU[*]}
    do
      printf "%-10s%4d%8d\n" \
        ${SIZE} ${CPU[$SIZE]} ${RAM[$SIZE]}
    done | sort -n -k 3 -k 2
This now gives a more sensibly ordered output:
    $ ./sort.sh
    Size       CPU     RAM
    Tiny         1    2048
    Small        1    4096
    Medium       2    4096
    Big          2    8192
    Large        4    8192
    XL           4   16384
    XXL          8   32768
And so we have a nicely formatted display, sorted by RAM and then by CPU.
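A key such as "-k 3" runs from field 3 to the end of the line. To compare a single field only, the key can be given an end field too, with the sort type attached to the key itself. The following is a minimal sketch of that idea rather than part of the script above: the rows() helper is a made-up stand-in that prints a few sample lines in the same layout, "-k 2,2n" sorts on the CPU column alone, and "-k 3,3nr" breaks ties on the RAM column alone, largest first.

```shell
#!/bin/bash
# rows() is a hypothetical helper for this sketch: it prints sample lines
# in the same "Size CPU RAM" layout that sort.sh produces.
rows() {
    printf '%-10s%4d%8d\n' \
        Big 2 8192 \
        Small 1 4096 \
        Medium 2 4096 \
        Tiny 1 2048
}

# -k 2,2n  : the key is field 2 only, compared numerically
# -k 3,3nr : ties are broken on field 3 only, numerically, in reverse
rows | sort -k 2,2n -k 3,3nr
```

With these sample rows the order comes out as Small, Tiny, Big, Medium: CPU ascending, and within the same CPU count the larger RAM first.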
Bonus Points
For bonus points, we can tell sort more about the input format. If it were CSV, for example, we can use "sort -t," to tell it that the comma separates the fields (this version is saved as sort-csv.sh):
    #!/bin/bash
    . ./size.tmpl
    echo "Size,CPU,RAM"
    for SIZE in ${!CPU[*]}
    do
      printf "%s,%d,%d\n" \
        ${SIZE} ${CPU[$SIZE]} ${RAM[$SIZE]}
    done | sort -t, -n -k 3 -k 2
Then you can create a sorted CSV file:
    $ ./sort-csv.sh
    Size,CPU,RAM
    Tiny,1,2048
    Small,1,4096
    Medium,2,4096
    Big,2,8192
    Large,4,8192
    XL,4,16384
    XXL,8,32768
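The script above keeps the header out of the sort by printing it before the pipeline. If the CSV already exists with its header on the first line, something similar is needed. One possible approach, sketched here with a made-up function name (sort_csv_body) and sample data supplied inline, is to read the first line with "read", print it, and let sort consume whatever is left of the same input:

```shell
#!/bin/bash
# sort_csv_body is a hypothetical helper for this sketch.
# It expects CSV on standard input with a header on the first line.
sort_csv_body() {
    # Read the header line and print it unchanged.
    IFS= read -r header
    printf '%s\n' "$header"
    # sort reads the remaining lines: RAM (field 3) first, then CPU (field 2).
    sort -t, -k 3,3n -k 2,2n
}

# Sample input, in deliberately mixed-up order.
sort_csv_body <<'EOF'
Size,CPU,RAM
XL,4,16384
Tiny,1,2048
Big,2,8192
Small,1,4096
EOF
```

For a file on disk, the same function could be called as "sort_csv_body < sizes.csv", where sizes.csv is only an example name.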