Wednesday, February 20, 2008

Average file size within a directory

To calculate the average file size within a directory on a Linux system, the following command can be used:

ls -l | gawk '{sum += $5; n++;} END {print sum/n;}'

If you'd like to know the average size of a particular kind of file (such as jpg files), use the following:

ls -l *.jpg | gawk '{sum += $5; n++;} END {print sum/n;}'
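If no files match the pattern, ls reports an error and awk receives no input, so sum/n becomes a division by zero. A minimal sketch that guards against this is shown here; the function name avg_size_matching is hypothetical, and plain awk is used in place of gawk on the assumption that either is installed:

```shell
# Average size in bytes of the regular files given as arguments.
# avg_size_matching is a hypothetical helper name, not part of any tool.
avg_size_matching() {
    # -d lists directories themselves instead of their contents;
    # errors for non-matching patterns are discarded.
    ls -ld -- "$@" 2>/dev/null | awk '
        /^-/ { sum += $5; n++ }   # count regular files only
        END  { if (n > 0) print sum / n; else print "no matching files" }
    '
}
```

It could be called as avg_size_matching *.jpg from within the directory of interest.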

9 comments:

Anonymous said...

Works great, thanks a ton. /hat_tip

Anonymous said...

Much obliged. I googled for exactly this, and found exactly what I needed.

ctford said...

The command has an off-by-one error. It increments n one too many times because of the "total" line from ls -l.

The average is therefore slightly lower than it should be. Try the following:

ls -l | gawk '{sum += $5; n++;} END {print sum/(n-1);}'
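Subtracting one handles the "total" line, but subdirectories in the listing are still counted as files. As a rough sketch of an alternative, the filter can live inside awk itself by matching only lines whose permission string starts with "-" (regular files). The function name avg_size_ls is hypothetical, and awk is assumed to behave like gawk here:

```shell
# Average size in bytes of the regular files directly inside directory $1
# (defaults to the current directory). avg_size_ls is a hypothetical name.
avg_size_ls() {
    ls -l "${1:-.}" | awk '
        /^-/ { sum += $5; n++ }   # skips the "total" line and directory entries
        END  { if (n > 0) print sum / n; else print 0 }
    '
}
```

Like the original command, this parses ls output, so it assumes the size is in the fifth column.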

Unknown said...

Or you could just ignore total by adding another pipe and grep:

ls -l | grep -v '^total' | gawk '{sum += $5; n++;} END {print sum/n;}'

Here's another approach that uses find, which counts only non-directory files. Note that this also traverses subdirectories:

find . -type f -print | xargs ls -l | gawk '{sum += $5; n++;} END {print sum/n;}'

All of the find options are available so you can restrict it to only the current level:

find . -maxdepth 1 -type f -print | xargs ls -l | gawk '{sum += $5; n++;} END {print sum/n;}'

And for the example counting the average size of jpgs in the current directory (the pattern is quoted so the shell does not expand it before find sees it):

find . -maxdepth 1 -type f -name '*.jpg' -print | xargs ls -l | gawk '{sum += $5; n++;} END {print sum/n;}'

Hope this is useful.

Note: The command syntax above doesn't support files/dirs with spaces in the name.

Unknown said...

Ok, an updated version that should work with spaces in file names

find . -type f -print0 | xargs -0 ls -l | gawk '{sum += $5; n++;} END {print sum/n;}

Or, to report it in KB:

find . -type f -print0 | xargs -0 ls -l | gawk '{sum += $5; n++;} END {print sum/n/1024;}

Unknown said...

Sorry for the spam, I'm not able to edit my comments. I left off the trailing apostrophe on the previous command.

Here's an updated one with bonus output :-)


find /some/dir -type f \
-print0 | xargs -0 ls -l | gawk \
'{sum += $5; n++;} END {print "Total Size: " sum/1024/1024 " MB : Avg Size: " sum/n/1024 " KB : Total Files: " n ;}'

Total Size: 55214.8 MB : Avg Size: 9792.17 KB : Total Files: 5774
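For what it's worth, the same report can be produced without parsing ls output at all. The sketch below assumes GNU find, whose -printf '%s\n' prints each file's size in bytes on its own line, so spaces in file names are not an issue. The function name dir_size_stats is hypothetical:

```shell
# Total size, average size and file count for regular files under $1
# (defaults to the current directory), including subdirectories.
# Assumes GNU find for -printf; dir_size_stats is a hypothetical name.
dir_size_stats() {
    find "${1:-.}" -type f -printf '%s\n' | awk '
        { sum += $1; n++ }
        END {
            if (n == 0) { print "Total Files: 0"; exit }   # avoid dividing by zero
            printf "Total Size: %.1f MB : Avg Size: %.2f KB : Total Files: %d\n",
                   sum / 1024 / 1024, sum / n / 1024, n
        }
    '
}
```

Usage would be dir_size_stats /some/dir, and adding -maxdepth 1 to the find call should restrict it to the top level.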

Unknown said...

This is awesome

Anonymous said...

Thanks, exactly what I needed

Unknown said...

Hi All,

I want to take the sizes of all the files in a directory and its subdirectories and calculate the average file size.