Tuesday, October 29, 2013

Deleting a million directories in Linux

This morning a component went ballistic and created more than a million folders under /path/to/myfolder, until the file system ran out of inodes.
This command was showing 100% of inodes used:
watch df -i /opt/oracle/domains/osbpr1do/shared/apps/fileadapter/controlDir/fileftp/controlFiles
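
Note that disk space and inode usage are separate budgets: with millions of tiny folders, df -h can still report plenty of free space while df -i is already at 100%. A quick way to compare the two on the same file system (only standard df flags, nothing exotic):

#block usage vs inode usage for the file system holding the folder
df -h /path/to/myfolder
df -i /path/to/myfolder
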
You could delete folders/files in several ways, most of them VERY inefficient:
#this takes forever and also deletes the parent folder
rm -rf /path/to/myfolder
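
If you can live with recreating the folder afterwards, the obvious workaround for the "deletes also the parent" problem is something like this (a trivial sketch, still painfully slow on a million entries):

#brute force: remove the whole tree, then recreate the empty parent
rm -rf /path/to/myfolder && mkdir -p /path/to/myfolder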

#this funnily fails with "file not found", most likely because there are 
#special characters like an equal sign = in the folder names
find /path/to/myfolder -type d -mtime +5 -exec ls -ltr {} \;

#of course I have tried also with quotes, with the same result
find /path/to/myfolder -type d -mtime +5 -exec ls -ltr "{}" \;

#I have tried deleting only subsets
rm -rf /opt/oracle/domains/osbpr1do/shared/apps/fileadapter/controlDir/fileftp/controlFiles/a*
#but the problem is that the shell expands the a*, and this can resolve to too many arguments,
#so the command fails
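
To delete subsets without letting the shell expand the glob, you can have find do the matching and batch the arguments itself; a sketch with GNU find (the -exec ... + form groups names into large batches, much like xargs):

#find matches a* itself, so there is no "argument list too long"
find /path/to/myfolder -maxdepth 1 -type d -name 'a*' -exec rm -rf {} +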


#this is very simple but also deletes the parent folder
find /path/to/myfolder/ -type d -delete
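
With GNU find you should be able to keep the top folder itself by adding -mindepth 1 (and dropping -type d, so any plain files inside the folders do not block the delete); a sketch I have not benchmarked against the rsync trick below:

#delete everything under the folder, but keep the folder itself
find /path/to/myfolder/ -mindepth 1 -delete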

Then I read this article and tested this:
rsync -av --delete --remove-source-files /tmp/empty/ /path/to/myfolder/

where /tmp/empty/ is an empty folder, and it works like magic. Reading around, the explanation seems to be that rsync reads file system info in a LIFO fashion rather than FIFO.
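
For completeness, the whole sequence is roughly this (assuming /tmp/empty does not exist yet; the trailing slashes matter, otherwise rsync would copy the folder itself rather than its contents):

#sync an empty folder over the full one: everything not present in /tmp/empty gets deleted
mkdir -p /tmp/empty
rsync -av --delete --remove-source-files /tmp/empty/ /path/to/myfolder/
rmdir /tmp/empty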

1 comment:

Unknown said...

"find" has a -delete parameter to avoid the risks of -exec rm "{}" \; and such.

I only discovered it recently and I'm not sure it's available on Solaris, but on Linux and Mac it is.

For other complex situations, it is also sometimes useful to pipe the find results to xargs.
find . -atime +4 -print | xargs ls -l
An advantage of specifying xargs -n2 is that it will run the command as soon as it gets two items. This is helpful for grep, which then displays the name of the file too. (I know grep -r is better for simple cases.)
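
Something like this, for instance (a sketch; "pattern" is just a placeholder, and with at least two file names per invocation grep prefixes each match with the file name):

find . -type f -atime +4 -print | xargs -n2 grep "pattern"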