Tracking Unix (Solaris) disk usage

Where I work, the particular Unix server I use regularly runs at 90%+ disk usage, and often bumps up to 100%. I wrote a little two-line shell script and couldn’t even save it! What are some ways to report disk usage and find out where all the stuff that’s filling up storage actually lives?

I am not a sysadmin, obviously, but it seems to me the people who are aren’t being quite strict enough about it. I’m just curious, I’m not out to get anyone.

Try the “df -k” command. That will give you the Disk Free (in kilobytes, rather than the traditional 512-byte blocks) for each file system.
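
The output looks something like this (the device names and numbers here are made up, just to show the columns):

Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t0d0s0    4032504 3528000  464184    89%    /
/dev/dsk/c0t0d0s7    8205600 7876000  247544    97%    /export/home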

The “du -k” command will give you the disk usage (again KB instead of blocks) for your current directory and all subdirectories. This can take a long time to run and produce a jillion lines of output, and it may skip directories where you don’t have access permissions. I prefer df for an overview of system disk usage.
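
That said, if you do reach for du, sorting the output makes those jillion lines a lot more manageable. Something like this (the path is only an example) puts the biggest directories at the bottom:

du -k /export/home | sort -n | tail -20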

Once you know which filesystem has the problem, use a Unix find command to search for big files. The syntax for find is subtle and complex. Try some simple examples before you go hunting for big files, and have a second login active so you can kill the find if it uses up too many system resources.
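
A harmless warm-up, assuming your home directory is a safe place to practice: list everything over about a megabyte under it. With no suffix, -size counts 512-byte blocks, so +2048 is roughly 1 MB.

find $HOME -size +2048 -print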

Read the man pages for these commands on your system, YMMV, IAN your Sysadmin, etc.

I just discovered the du command today, and, as you say, it can have quite a lot of output. I was hoping for a more general reporting/summary tool. And I know which filesystem is filling up, but the find command, once again, is too specific. Thanks for the response.

I don’t think you can have the best of both worlds here. Either you get a high-level report of disk usage (via ‘df’) that doesn’t tell you about individual files, or you get a per-file report (via ‘find’ and ‘du’ or ‘ls’) so that you can track down the troublemaker(s).

A command like this might help:


find DIRECTORY -type f -exec ls -s {} \; | sort -n

This will list all your files in order by size, with the largest ones at the end. It will probably be pretty slow to run. But, once you have a few “suspect” files to watch, you can track them more rapidly.
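
For example (the filename here is purely hypothetical; substitute your actual suspects), a dumb little loop like this will show whether they are still growing:

while :
do
    ls -l /var/tmp/suspect.dat    # hypothetical suspect file
    sleep 60
done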

Try this:

du -a / | sort -nr | less

If you don’t have less, then you might have to use more. The kernel logger can also be configured to track file usage or overly large files, but that would have to be configured in syslog by someone with admin privileges.

Given the variation, it sounds like someone might be using a bunch of scratch space to run some kind of calculation job. That should be configured on its own partition which is managed separately, so it doesn’t eat into disk space allocated for user accounts; I’ve noticed that some more recent Linux distros dispense with this, though a commercial Solaris machine shouldn’t have this problem. Alternatively, it could be that someone is scratching to their user directory and there aren’t any limits on disk usage per account, which would be unusual for a public server. You should definitely bring this to the attention of the sysadmins; an overloaded filesystem can crash the system. Just don’t give them your name or account number.
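
If you want to check whether your own account even has a quota (assuming quotas are enabled on that filesystem at all), running this should print your limits; if it prints nothing, there is probably no quota set:

quota -v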

Stranger

Oh, and if you’re trying to pick up Unix, do yourself a favor and pick up a copy of the excellent Unix For The Impatient. It’s a good reference covering a lot of the older commercial true Unix distros, whereas most of what you’ll find online tends to be biased toward GNU/Linux, for which the utilities often have somewhat different syntax and output. (All commercial Unices are derived from a common tree with two main branches, System V and BSD, which were subsequently remerged to varying degrees, whereas Linux is a “Unix-like” OS in which the GNU utilities were almost exclusively written from the ground up to behave similarly to the Unix utils but may function differently under the hood.)

Stranger

I usually do “du -hs *” on Linux, which shows the size of everything in the current directory. Then I go into the biggest directories and run the same command to get a better breakdown.

On Solaris, it might work with “du -ks *”. I don’t have access to Solaris right now to test it.

A guy I worked with years ago solved this by writing something that scanned the current directory and all subdirectories, counted the files, and summed the file sizes. I didn’t realize how sweet a tool that was until now. He wrote it using recursion, and I know it was a resource hog. Oddly enough, they (the bosses) are very particular about processes hogging resources, but apparently not about disk hogs. Oh well.

If I can dig up a few spare minutes, or hours really, I might try and throw it together. Or not, I’m lazy.
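
Off the top of my head, something like this find/awk pipeline would do roughly the same job without recursion (untested, and it forks ls once per file, so it won’t be fast):

# count files under the current directory and total their sizes
# (ls -s reports 512-byte blocks, so divide by 2 for kilobytes)
find . -type f -exec ls -s {} \; |
awk '{ files++; blocks += $1 }
     END { printf "%d files, %d KB total\n", files, blocks / 2 }'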

On Linux, du -ksh * will do exactly this.



[/] $ du -ksh *
4.3M    bin
51M     boot
0       cdrom
4.0K    debootstrap
236K    dev
37M     etc
3.1G    home
4.0K    initrd

...etc etc...


Typo: the -k is unnecessary; du -sh * is sufficient. Also, I have no idea if this works on Solaris, sorry.

ed and c_goat: The “du -sh *” command doesn’t work under Solaris (at least under the Solaris I tried), but “du -sk *” does work. Thanks! I learned something new today.

If you’re looking for files larger than a certain size, for example, 1000000 bytes, try:


find DIRECTORY -size +1000000c -ls

or, if your system doesn’t support the -ls option in find:


find DIRECTORY -size +1000000c -exec ls -l {} \;

where DIRECTORY is the name of the directory you want to start your search in.

The -mount option in find is useful if you want to restrict your search to the same file system you started in. For example, if the / partition has the big files and the other disk partitions do not (determined from a df -k command), you’ll want to start your search in the root (/) directory, but there’s no point in traversing the other file systems. Running


find / -mount -size +1000000c -exec ls -l {} \;

will restrict the search just to the root partition.

The “c” at the end of the file size option specifies the search in characters (bytes) rather than 512-byte blocks. Makes a leetle difference, as I’ve discovered more than once. :smack: The “+” at the beginning of the file size option tells find to look for files larger than one million bytes, instead of exactly one million bytes. Again, :smack:
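
To make the difference concrete (the directory is just an example), these two commands are not asking for the same thing at all:

find /var -size +1000000c -print     # larger than 1,000,000 bytes (about 1 MB)
find /var -size +1000000 -print      # larger than 1,000,000 512-byte blocks (about 488 MB)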

I think “du -sh *” will certainly be of help. I’ll still probably need to write something to get exactly what I want.

I did exactly this, once… I copied some data files I was working on to my home directory, for convenience, and never bothered to look at how big they were. It turned out that the sysadmin had forgotten to set my quota when he created my account, and I was floored when I realized that my home directory was taking up more space than would have fit on the entire hard drive of my personal computer.

Apparently, while the ACN folks at the University of Manitoba have user space quotas and a public scratch area, they don’t limit disk usage of /usr/local/tmp to somewhat less than the maximum:

Seriously, that seems unconscionable for a public server. I used to grab big chunks of virtual disk space on the old IBM mainframe back when I was a wee fumbling freshman numnick in undergrad, just for “fun” (which tells you how much there was to do in that town), but the system would scavenge them back pretty quickly if they weren’t being used, and it didn’t infringe upon the partition for user account space. (This is back in the days when getting your user account upped from 2Mb to 4Mb was a Big Deal. Yeah, laugh it up…your day will come.)

Stranger

Here’s what we teach our operators:

Go to the top level of the filesystem in question. So, for this df output:



Filesystem            kbytes    used   avail capacity  Mounted on
/dev/vx/dsk/rootvol  9290743 8490000  707836    93%    /
/proc                      0       0       0     0%    /proc
fd                         0       0       0     0%    /dev/fd
mnttab                     0       0       0     0%    /etc/mnttab
/dev/vx/dsk/var      9279907 5584074 3603034    98%    /var


…suppose /var is the problem child.

cd /var
du -dks * | sort -n

…the “d” keeps the du command from adding files outside the filesystem. That’s useful if you have nested mountpoints. The “k” is just for kilobytes, because blocks are non-intuitive. The “s” is for summary. So what you’re doing is adding up the sizes of the files & directories visible in this directory. The sort is just to order them numerically.

Output can look like this:



131     snmp
164     saf
216     run
343     apache
375     spool
562     cron
1568    preserve
3152    tmp
6863    mail
286245  opt
292381  log
338525  adm
544591  sadm


From here you can see which directories are the largest. You can then cd into each of the biggest directories and repeat the command. Doing this, you can sometimes iterate yourself into the largest directories. From experience I know that /var/sadm doesn’t have much you can delete or compress. /var/adm, though, is a typical log directory and may have a lot of old log files that can be removed. /var/log, obviously, is the same.
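
If you get tired of cd-ing and repeating the command by hand, here is a rough sketch of a loop that keeps descending into whatever is biggest (/var is just the example starting point; it stops when the biggest entry turns out to be a plain file, and Ctrl-C works any time):

dir=/var
while [ -d "$dir" ]
do
    echo "==== $dir ===="
    du -dks "$dir"/* 2>/dev/null | sort -n > /tmp/dudown.$$
    tail -5 /tmp/dudown.$$
    # descend into the largest entry
    dir=`tail -1 /tmp/dudown.$$ | awk '{print $2}'`
done
rm -f /tmp/dudown.$$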

Other useful commands:
find / -type f -size +100000000c -mtime -1

Finds all files over 100M in size that have changed in the last 24 hours. Useful for spotting rapidly growing files.

Note, for all of these, you have to be “root” to get an accurate count. Otherwise, the totals won’t include the files & directories that you cannot read.

A few more possibilities for reporting.

The program xdu presents a graphical view of the output of du, making it easy to see which directories are hogging the most space.
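
If memory serves, xdu reads du output on standard input, so something along these lines should do it (the path is only an example):

du -k /var | xdu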

And if your Solaris system is running Samba, you can mount Unix directories as drives on a PC and use your favorite disk space manager from the “other” world.
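
A minimal share stanza in smb.conf for that might look something like the following; the share name and path are made up, and you’d want to clear it with whoever runs Samba first:

[diskspace]
   path = /export/home
   read only = yes
   browseable = yes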