Sum of File Sizes in a Directory

An Oracle DBA has a big issue. The directory usage as shown by AIX is not the same as shown by his Windows-based tools. Something does not add up... He suspects that du does not work correctly: “after all this is AIX 6.1, it’s got to have a bug!” I propose to sum up the files in each directory in question and to compare the result with the values produced by du -sk *. If we are “close”, then it is not AIX. Why “close” and not “the same”? Because I will be adding up just the files, not the directories themselves, as we travel through the contents of a directory, its subdirectories, and so forth, while du also counts the space taken by the directories.

Here is the code (script I call addmeup.ksh) to accomplish the tasks we want to do:

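#!/usr/bin/ksh
# execution example: ./addmeup.ksh /path/to/somewhere
cd $1
# running total, in kilobytes
TotalSizeKB=0
for file in `find . -type f`
do
  # first column of ls -s is the size of the file in KB
  sizeInKB=`ls -s $file | awk '{print $1}'`
  ((TotalSizeKB = TotalSizeKB + sizeInKB))
done
echo "$TotalSizeKB KB"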

So, what is cooking here? The first line specifies what command line interpreter (shell) will be used to execute the rest of the file (our script). In our case it will be the Korn shell. The line reading cd $1 takes the script's argument and uses it as the target of the cd command: we change to the directory indicated by $1. Next, the value of the variable TotalSizeKB is set to 0. The following line defines the start of the for loop and it should be “read” from right to left. Following this advice, we see that in the current (.) directory we look just for files (`find . -type f`): no directories, subdirectories, . or .. entries. The name of each file found by the find command is assigned to the variable called file. Do I have a vivid imagination or not?

The following lines define the body of our loop. Between the do and the done is the space to conduct our “actions”, aka calculations. Our first “action” line executes ls -s against the file whose name resides in the file variable. Next, still on the same line, awk '{print $1}' extracts the value in the first column of the output generated by the preceding ls -s $file. This value is the size of the file in kilobytes, and it is assigned to the variable sizeInKB.

The addition on the next line increases the value of the variable TotalSizeKB by the amount stored in sizeInKB. The done marks the end of the loop body and it also causes the whole process to start over from the line containing the for file .... The addition is cumulative: the size of every encountered file is added to the variable storing the total size. When there are no more files to find, the first line (the line with the for in it) has nothing left to load into the file variable and the loop ends.
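One caveat with a for loop over `find` output: a file name containing a space is split into several words, and ls -s then fails on the pieces. Below is a minimal sketch of a variant that reads the find output line by line instead. It is not the script used in this post; the function name sum_kb is hypothetical, it assumes ls -s reports kilobytes (as it does on AIX), and it still breaks on names that contain a newline.

```shell
# Hypothetical helper: sum_kb DIR prints the total of the `ls -s` sizes
# of all regular files below DIR, followed by "KB".
sum_kb() {
    (
        # work inside a subshell so the caller's directory is unchanged
        cd "$1" || exit 1
        find . -type f | {
            total=0
            # read -r with an empty IFS keeps spaces and backslashes intact
            while IFS= read -r file
            do
                # first column of `ls -s` is the allocated size
                size=$(ls -s "$file" | awk '{print $1}')
                # treat a file that vanished in the meantime as size 0
                total=$((total + ${size:-0}))
            done
            # print inside the braces: the total lives in this pipeline stage
            echo "$total KB"
        }
    )
}
```

It is called the same way as the script, for example sum_kb /path/to/somewhere. The total is printed inside the braces because some shells run the last stage of a pipeline in a subshell, where a variable set in the loop would be lost afterwards.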

The next executed statement is the one that prints the value of TotalSizeKB followed by the abbreviation KB. Now let's see how it works.

MarcoPolo:/u40/oradata> du -sk *
260252352 CLTYcmn8
527459228 CLTYcmn8_adi
527459228 CLTYftr8
263282568 CLTYtst
laorrdu001:/u40/oradata> ./addmeup.ksh CLTYcmn8
260252340 KB
MarcoPolo:/u40/oradata> ./addmeup.ksh CLTYcmn8_adi
527459212 KB
MarcoPolo:/u40/oradata> ./addmeup.ksh CLTYftr8
527459208 KB
MarcoPolo:/u40/oradata> addmeup.ksh CLTYtst7
263282552 KB
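How “close” is close? Subtracting each addmeup.ksh total from the matching du -sk value gives the gap per directory. Here is a small sketch of that arithmetic, with the numbers copied from the session above; the function name show_gaps is hypothetical and exists only for this illustration.

```shell
# Hypothetical helper: print (du -sk value) - (addmeup.ksh value) per directory.
show_gaps() {
    # columns: du -sk value, addmeup.ksh value, directory name
    while read du_kb files_kb name
    do
        echo "$name: $((du_kb - files_kb)) KB"
    done <<EOF
260252352 260252340 CLTYcmn8
527459228 527459212 CLTYcmn8_adi
527459228 527459208 CLTYftr8
263282568 263282552 CLTYtst
EOF
}

show_gaps
# prints:
# CLTYcmn8: 12 KB
# CLTYcmn8_adi: 16 KB
# CLTYftr8: 20 KB
# CLTYtst: 16 KB
```

A gap of a few kilobytes on trees of hundreds of gigabytes is consistent with du also counting the blocks used by the directories themselves, which the script skips on purpose.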

This is how the du command was exonerated and the Oracle DBA had one less thing to worry about.

Please check out Jeff’s comment below. He has a different way to do it. You can also learn how to use ‘awk’ more effectively. AWK rules!!!!
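For a taste of that awk style, here is a minimal sketch of the accumulate-then-print pattern on fixed input, so the result does not depend on any file system. The function name sum_first_column and the sample lines are made up for the illustration.

```shell
# Hypothetical helper: add up the first column of whatever arrives on stdin.
sum_first_column() {
    # total grows on every input line; the END block runs once after the last line.
    # total+0 forces a numeric 0 when there was no input at all.
    awk '{ total += $1 } END { print total+0 " KB" }'
}

# two sample lines shaped like `ls -s` output: size first, then the name
printf '4 ./a\n8 ./b\n' | sum_first_column
# prints: 12 KB
```

One awk process handles all the lines, instead of one ls and one awk per file as in the loop.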

Posted in AIX, scripts.

3 Responses

  1. Jeff says

    handy little script, to save some time you can basically accomplish everything in two commands:

    this returns results on large directories much faster than individual ls’s of every file.
    find . -xdev -type f -ls | awk '{size+=$2} END {print size" KB"}'

    to upconvert your output to GB for du -sg:
    find . -xdev -type f -ls | awk '
    {size += $7} END {printf("%5.2f GB\n",size/1024/1024/1024)}'

    i added the xdev in there for my own safety sake.

    however, while its much faster the results are even a bit more skewed from du.. i wonder why? ah well.

  2. MarekD:-) says


    yes, you are 100% right, as -ls is a stripped down version of ls internal to find – I just checked with the man page.
    By the way, do you awk often? What you show is very elegant.

    Thanks for your comment and all the best!!!


  3. Jeff says

    I wouldn’t say I awk often, I mean I do awk quite a bit but i’m a little limited in my awk skills, I use it extensively for one liners and such but writing larger, more complex tasks I’ll usually revert back to shell or perl.

    btw, I greatly enjoy your blog.. have a good one

Copyright © 2016 Waldemar Mark Duszyk. All Rights Reserved. Created by Blog Copyright.