DISC SPACE: HOW MUCH IS ENOUGH?
by Vladimir Volokh, VESOFT
Presented at 1993 INTEREX Conference, San Francisco, CA, USA
Published by INTERACT Magazine, June 1993.
ABSTRACT. In spite of many new developments in disc technology,
the good old Winchester drive, with its mechanically movable arms,
is still the primary medium for all our files -- be it programs,
sources or Data Bases.
It seems that many simple questions related to disc usage do not
have easy answers:
How do we measure disc space -- what's the relation between megs
and sectors? How can an MPE user find out the disc capacity? How
much of it is free and how usable is the free space? And is 'used
space' really used by us? What is reblocking, squeezing,
trimming, condensing and other space transformations?
The paper presents some observations on HP3000 file structure,
similarities and differences between "Classic" and "Spectrum".
This author's hope is that such knowledge will help users to
better control their computing environment.
BIOGRAPHY OF THE AUTHOR:
Vladimir Volokh is the president of VESOFT, Inc., a software house
based in Los Angeles, CA, USA which was founded in 1980 by him and
his son, Eugene.
They are the creators of MPEX/3000, a productivity and system
control tool, SECURITY/3000, a log-on access control package, and
VEAUDIT/3000, an auditing tool that reports loopholes on the
HP3000 system, of which there are 10,000+ packages installed
worldwide.
Vladimir Volokh is a computer scientist with more than 10 years of
HP3000 experience as system analyst, consultant, and technical
manager; he is a frequent speaker at users groups around the
world.
In spite of many new developments in disc technology, the good old
Winchester drive, with its mechanically movable arms, is still the
primary medium for all of our files.
In this article I will try to present some observations on HP3000 file
structure -- both for "Classic" (MPE/V) and "Spectrum" (MPE/iX)
computers -- in the hope that it might help HP3000 users manage disc
space better. It seems that many simple questions have answers that
are not so simple.
HOW DO WE MEASURE DISC SPACE?
In various discussions about disc space, you've seen terms like
"sectors", "kilobytes", "megabytes", and "gigabytes". What do these
words mean? Well, nothing is simple. A sector, by HP's definition,
is 256 bytes; "kilo" (K) is 1000, or 1024 for memory devices; "mega"
is 1000000, or 1024K for memory devices; and "giga" is a prefix
denoting a billion, or 1.073 billion for memory devices. My
dictionary tells me that "tera" means one trillion (10^12) of a given
unit. HP's Glossary of Terms mentions only "kilo" and "mega" (in
1989). So, considering all this mathematics, how many megabytes does
your disc have if after :DISCFREE C on your XL machine you see the
following:
ALL MEASUREMENTS ARE IN SECTORS.
ALL PERCENTAGES ARE RELATIVE TO THE DEVICE SIZE.
| Configured | In Use | Availab
-----------+-------------------+-------------------+---------------
LDEV : 1 -- (MPEXL_SYSTEM_VOLUME_SET:MEMBER1)
Device | 2232192 | 1708144 ( 77%) | 524048
Permanent | 1852720 ( 83%) | 1605344 ( 72%) | 247376
Transient | 1852720 ( 83%) | 102800 ( 5%) | 524048
Since four sectors (4*256 = 1024 bytes) come close enough to 1000
bytes, you can do it easily -- just divide the number of sectors by
4000 and you will have it in megs. In this case, it'll be
2232192/4000 = 558, close enough to the real answer, 545 megs (that
is, 2232192/4096, counting a meg as 1024K).
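
The same arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the function names are invented for the example, and it assumes that a "meg" is 1024*1024 bytes, which is the convention that reproduces the 545 figure.

```python
SECTOR_BYTES = 256  # HP's sector size, as stated in the text


def megs_quick(sectors):
    """Rule of thumb: 4 sectors are about 1K, so divide by 4000."""
    return sectors / 4000


def megs_exact(sectors):
    """Exact size, assuming one meg is 1024 * 1024 bytes."""
    return sectors * SECTOR_BYTES / (1024 * 1024)


configured = 2232192  # "Configured" column for LDEV 1 in :DISCFREE
print(int(megs_quick(configured)))    # quick estimate
print(round(megs_exact(configured)))  # exact figure
```

The quick estimate runs a little over 2% high, which is harmless when all you want is a feel for the size of the disc.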
HOW BIG IS THE DISC?
As you've seen above, MPE/iX gives you an answer via :DISCFREE. In
MPE/V the utility FREE5.PUB.SYS -- true to its name -- shows only free
space. But if it shows you that X sectors are free, that's X sectors
out of how many? This information is hidden deep inside the VINIT
utility (as if it were unimportant). Try this:
:VINIT >pfspace 1;addr
You will see a lot of information about addresses and sizes of free
space (and you don't care much about that). But at the end of this
listing you will see:
TOTAL VOLUME CAPACITY: 216832 SECTORS
TOTAL FREE SPACE AVAILABLE: 16490 SPACE
MAXIMUM CONTIGUOUS AREA: 5505 SECTORS
By the way, it's not my typo (if you're wondering about "16490 SPACE")
-- it's an unknown MPE designer's mistake, frozen in time ....
HOW MUCH OF IT IS FREE?
MPE/iX gives a pretty straightforward answer to this question: look at
its output in the example above -- this time not on the first line but
on the second:
Permanent | 1852720 ( 83%) | 1605344 ( 72%) | 247376
As you see, 83% of the whole space is configured to be used as
permanent, 72% is used, so only 11% (which is 83-72) is available for
permanent files. But why doesn't this simple calculation work for the
third line (transient space)?
Transient | 1852720 ( 83%) | 102800 ( 5%) | 524048
Even though transient space can also take up to 83% of the space on
LDEV 1, in this case only 28% is left for it: the 17% that can't be
permanent plus the 11% unused by permanent files. Because 5% is
already used by transient space, 23% (28-5) is available.
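
For readers who like to see the numbers worked out, here is a small Python sketch of the calculation. The sector counts are copied from the :DISCFREE listing; the variable names are made up for the example, and the rounding to whole percents simply mimics what the listing prints.

```python
# Figures for LDEV 1, copied from the :DISCFREE output (in sectors).
DEVICE = 2232192
PERM_CONFIGURED = 1852720
PERM_IN_USE = 1605344
TRANS_IN_USE = 102800


def pct(sectors):
    """Percentage of the whole device, rounded to a whole percent."""
    return round(100 * sectors / DEVICE)


perm_free = pct(PERM_CONFIGURED) - pct(PERM_IN_USE)   # 83 - 72
never_permanent = 100 - pct(PERM_CONFIGURED)          # 100 - 83
trans_free = never_permanent + perm_free - pct(TRANS_IN_USE)

print(perm_free, never_permanent, trans_free)

# The same answers in sectors, matching the "Available" column:
print(PERM_CONFIGURED - PERM_IN_USE)          # permanent
print(DEVICE - PERM_IN_USE - TRANS_IN_USE)    # transient
```

The last two lines give 247376 and 524048 sectors, the same values that :DISCFREE reports as available for permanent and transient space.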
On MPE/V machines available space is supposed to be shown by the
FREE5.PUB.SYS utility or via the PFSPACE command of :VINIT (16490
sectors in the PFSPACE example above). But what about virtual
(transient) space? This information, again, is hidden -- this time
inside the :SYSDUMP output:
:SYSDUMP $NULL
ANY CHANGES? YES
...
DISC ALLOCATION CHANGES? YES
VIRTUAL MEMORY CHANGES? YES
LIST VIRTUAL MEMORY DEVICE ALLOCATION? YES
VOLUME NAME LDEV # VM ALLOCATION
LDEV1 1 25
...
ENTER VOLUME NAME , SIZE IN KILOSECTORS (MAX = 255 )?
This shows only how much virtual space is allocated; MPE/V tells us
nothing about how much of it is in use at the moment. Some space is
also taken (possibly) by spool files and by temporary files. Note also
that even though total, free and virtual space are given by DEV#, used
space is not. (The :REPORT command gives used filespace-sectors by
group and account.) One way to know this distribution is to use the
MPEX command %LISTF @.@.@,DISCUSE.
IS FREE SPACE REALLY AVAILABLE TO US?
When you look at the FREE5 output on a "Classic", pay attention not
only to the "TOTAL FREE SPACE" line but also to the preceding ones:
:RUN FREE5.PUB.SYS
VOLUME MH7945U1 LDEV 1
LARGEST FREE AREA= 25530
SIZE COUNT SPACE AVERAGE
>100000 0 0 0
>10000 2 42796 21398
>1000 0 0 0
>100 4 540 135
>10 29 1065 36
>1 107 217 2
TOTAL FREE SPACE=44618
If you have a lot of small pieces, they might not be usable at all
because none of your files may have small enough extents (more on this
later). What you need is not just free space but CONTIGUOUS space.
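
To make the point concrete, here is a rough Python sketch that reads the FREE5 histogram and asks how much of the free space is certain to hold an extent of a given size. The function name and the "count only whole buckets" rule are assumptions made for this illustration; FREE5 itself offers nothing of the kind.

```python
# (lower bound of bucket, number of free areas, sectors in them),
# copied from the FREE5 listing.
FREE5_BUCKETS = [
    (100000, 0, 0),
    (10000, 2, 42796),
    (1000, 0, 0),
    (100, 4, 540),
    (10, 29, 1065),
    (1, 107, 217),
]


def surely_usable(extent_sectors):
    """Free sectors in buckets whose every area can hold the extent.

    Conservative: a bucket with a smaller lower bound may still hold
    a few big-enough areas, but the histogram cannot tell us that.
    """
    return sum(space for low, _count, space in FREE5_BUCKETS
               if low >= extent_sectors)


total = sum(space for _low, _count, space in FREE5_BUCKETS)
print(total)                # agrees with TOTAL FREE SPACE
print(surely_usable(1000))  # space fit for 1000-sector extents
```

For 1000-sector extents only the two big areas (42796 sectors) count; the other 1822 free sectors are scattered over 140 small pieces.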
On "Classics", disc space can be condensed to some degree by the >COND
command in :VINIT; on "Spectrum" machines the disc fragmentation
shouldn't be a problem (or so HP tells us).
IS THE "USED" SPACE REALLY USED BY US?
OK, by subtracting "free" space from the "total" space or just looking
at the :DISCFREE output we might get an idea of how many sectors are
"used" -- physically, that is. Keep in mind, however, that probably
about half of those files which you see on the full backup listing
have not been used (either modified or accessed) for a long time --
6 months or more. But which half? Some answers to this question can
be found in the :STORE command of MPE or, better yet, using selection
by ACCDATE and/or MODDATE in MPEX (with totals of files and space).
Archiving and purging seldom used files saves a lot of disc space,
directory space, and backup time.
TO BLOCK OR NOT TO BLOCK?
Another question is: how is the space used inside "active" files? One
factor -- relevant on MPE/V machines but not on MPE/iX machines -- is
blocking. MPE/V does all disc I/Os in multiples of one sector (256
bytes). The blocking factor is the number of records that we choose
to fit into a certain number of sectors (block). But very often we
don't choose -- we simply rely on MPE/V defaults, which can range from
good to very bad (see [1] for more details). A bad blocking factor
wastes not only disc space, but also I/O time -- the more records we
read or write per I/O, the better. Consider some examples:
ACCOUNT= SYS GROUP= OPERATOR
FILENAME CODE ------------LOGICAL RECORD----------- ----SPACE----
SIZE TYP EOF LIMIT R/B SECTORS #X MX
REPORT1 132B FA 26 10000 1 1251 1 8
REPORT2 132B FA 26 10000 60 651 1 8
REPORT3 132B FA 26 26 60 62 1 1
REPORT4 132B FA 26 26 9 20 1 1
Here the file REPORT1 is built with the default blocking factor 1 (256
bytes / 132 bytes, rounded down, is 1); the remainder (256 - 132 = 124
bytes) is simply wasted, though it's almost 50% of the space; this
file is like a piece of Swiss cheese -- with many big holes inside.
The second
file is the result of changing the blocking factor to 60, thus
achieving the BEST space utilization for this file -- now 60 records
take 60*132=7920 bytes which is close to the size of a block of 31
sectors (256*31=7936). However, we can get an even bigger saving by
SQUEEZEing this file (setting FLIMIT down to EOF) -- that's how we got
the file REPORT3. By reblocking it again we save more space; as a
result, the difference in size between REPORT1 and REPORT4 files is
quite significant. Things like this can be done using our very own
MPEX (the %ALTFILE command with options SQUEEZE and BLKFACT=BEST).
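
The waste inside a block is easy to compute. The Python sketch below is only an illustration: it assumes fixed-length records and no per-block overhead, the function name is invented, and it makes no attempt to reproduce how MPEX picks its BEST value.

```python
SECTOR_BYTES = 256


def block_layout(record_bytes, blocking_factor):
    """Return (sectors per block, wasted bytes per block)."""
    data_bytes = record_bytes * blocking_factor
    sectors = -(-data_bytes // SECTOR_BYTES)  # round up to whole sectors
    return sectors, sectors * SECTOR_BYTES - data_bytes


for factor in (1, 60):
    sectors, wasted = block_layout(132, factor)
    used_pct = 100 * (132 * factor) / (sectors * SECTOR_BYTES)
    print(factor, sectors, wasted, round(used_pct, 1))
```

With a blocking factor of 1 each block is one sector with 124 bytes wasted (about 52% used); with 60 it is 31 sectors with only 16 bytes wasted (about 99.8% used).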
And what about XL (or should we say iX) computers? The blocking
factor does not mean much there; all the records are tightly packed,
except for the last extent which can (for very big files) be up to
2048 sectors. The good news is that the FCLOSE intrinsic (on the XL)
has a new option called "XLTRIM", which allows the system to reuse
free space beyond the end of file without decreasing the file limit.
Look at the following before-and-after example:
ACCOUNT= SYS GROUP= PUB
FILENAME CODE ------------LOGICAL RECORD----------- ----SPACE----
SIZE TYP EOF LIMIT R/B SECTORS #X MX
REPORT1 132B FA 26 10000 1 256 1 *
REPORT2 132B FA 26 10000 1 16 1 8
Quite a saving (MPEX's %ALTFILE ;XLTRIM does it) -- and we can append
to the file!
THE EXTENT QUESTION, OR WHERE IS THE FILE?
The extent is MPE's compromise between two extremes in file size
management: assigning all file space requested to the file immediately
or giving space one record (or sector) at a time. In MPE/V a file can
consist of anywhere from 1 to 32 extents (the default number is 8).
Each extent resides wholly on one disc, but different extents may be
located on different discs. So where is any given file? You have to
know the full extent map of the file and only then can you think about
improving system performance through disc balancing. If you use the
LISTDIR5 >LISTF you might see the DISC DEV # line, but this of course
is only the device of the first extent (the same goes for :STORE
listings). >LISTF ...;MAP, however, gives you a map (the first digit
is the "volume table index", which is not necessarily the device
number, and is hard to convert to the device number):
LISTDIR5 G.06.00 (C) HEWLETT-PACKARD CO., 1983
>LISTF VESOFT.PUB.SYS
FCODE: 0 FOPTIONS: STD,ASCII,VARIABLE
BLK FACTOR: 1 CREATOR: **
REC SIZE: 1276(B) LOCKWORD: **
BLK SIZE: 640(W) SECURITY--READ: ANY
EXT SIZE: 10(S) WRITE: ANY
# REC: 482 APPEND: ANY
# SEC: 70 LOCK: ANY
# EXT: 7 EXECUTE: ANY
MAX REC: 13 **SECURITY IS ON
MAX EXT: 7 COLD LOAD ID: %24025
# LABELS: 0 CREATED: THU, 9 APR 1992
MAX LABELS: 0 MODIFIED: THU, 9 APR 1992
DISC DEV #: 3 ACCESSED: THU, 9 APR 1992
DISC TYPE: 3 LABEL ADR: **
DISC SUBTYPE: 4 SEC OFFSET: %5
CLASS: DISC FLAGS: NO ACCESSORS
>LISTF VESOFT.PUB.SYS;MAP
FCODE: 0 FOPTIONS: STD,ASCII,VARIABLE
BLK FACTOR: 1 CREATOR: **
REC SIZE: 1276(B) LOCKWORD: **
BLK SIZE: 640(W) SECURITY--READ: ANY
EXT SIZE: 10(S) WRITE: ANY
# REC: 482 APPEND: ANY
# SEC: 70 LOCK: ANY
# EXT: 7 EXECUTE: ANY
MAX REC: 13 **SECURITY IS ON
MAX EXT: 7 COLD LOAD ID: %24025
# LABELS: 0 CREATED: THU, 9 APR 1992
MAX LABELS: 0 MODIFIED: THU, 9 APR 1992
DISC DEV #: 3 ACCESSED: THU, 9 APR 1992
DISC TYPE: 3 LABEL ADR: **
DISC SUBTYPE: 4 SEC OFFSET: %5
CLASS: DISC FLAGS: NO ACCESSORS
EXT MAP: %300161067 %200233735 %300162124 %200240307
%300162207 %100211521 %200240326
>
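
The EXT MAP words can be picked apart mechanically. In the Python sketch below the leading octal digit is taken as the volume table index, as described above; treating the remaining eight octal digits as the sector address is my assumption for the illustration, and the function name is invented.

```python
from collections import Counter

EXT_MAP = """%300161067 %200233735 %300162124 %200240307
             %300162207 %100211521 %200240326"""


def decode(word):
    """Split an octal extent-map word such as '%300161067'.

    The leading digit is the volume table index (per the text);
    reading the other eight digits as a sector address is assumed.
    """
    value = int(word.lstrip("%"), 8)
    return divmod(value, 8 ** 8)  # (volume table index, address)


extents = [decode(word) for word in EXT_MAP.split()]
per_volume = Counter(index for index, _address in extents)
print(len(extents), dict(per_volume))
```

The seven extents of this file turn out to be spread over three volumes: three each on volume table indexes 3 and 2, and one on index 1.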
In MPE/XL file labels are kept separately from the data, and yet
:LISTF ,3 still shows the file label address, which might have no
relevance to the location of the data at all. Here is an example of
:LISTF ,3 and MPEX's %LISTF ,4 showing the full extent map of the
file:
:LISTF LOG3320,3
FILE CODE : 0 FOPTIONS: BINARY,VARIABLE,NOCCTL,STD
BLK FACTOR: 1 CREATOR : **
REC SIZE: 2044(BYTES) LOCKWORD: **
BLK SIZE: 2048(BYTES) SECURITY--READ : CR
EXT SIZE: 0(SECT) WRITE : CR
NUM REC: 2720 APPEND : CR
NUM SEC: 2304 LOCK : CR
NUM EXT: 9 EXECUTE : CR
MAX REC: 1024 **SECURITY IS ON
FLAGS : 1 ACCESSORS,SHARED,1 R,1 W
NUM LABELS: 0 CREATED : THU, APR 9, 1992, 2:01 PM
MAX LABELS: 0 MODIFIED: THU, APR 9, 1992, 2:01 PM
DISC DEV #: 1 ACCESSED: THU, APR 9, 1992, 2:01 PM
SEC OFFSET: 0 LABEL ADDR: **
VOLCLASS : MPEXL_SYSTEM_VOLUME_SET:DISC
MPEX %LISTF log3320 PAGE 1
MANAGER.SYS,PUB THU, APR 9, 1992, 4:01 PM
ACCOUNT= SYS GROUP= PUB
-----FILE------ EXTENTS -----SECTORS----- DEVICE
NAME CODE NUM MAX USED NOW SAVABLE CLASS
LOG3320 10 * 2560 208 DISC
Dev/Sector: 2/%0000004516700 2/%0000000444440 2/%0000000072040
Dev/Sector: 3/%0000001407620 3/%0000007737620 1/%0000003207300
Dev/Sector: 1/%0000002675520 2/%0000006572620 3/%0000000536200
Dev/Sector: 2/%0000006577760
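
With the device numbers spelled out like this, counting extents per device becomes trivial, which is the starting point for any disc balancing. The following Python sketch simply tallies the Dev/Sector lines shown above; the regular expression and variable names are made up for the example.

```python
import re
from collections import Counter

MPEX_LINES = """\
Dev/Sector: 2/%0000004516700 2/%0000000444440 2/%0000000072040
Dev/Sector: 3/%0000001407620 3/%0000007737620 1/%0000003207300
Dev/Sector: 1/%0000002675520 2/%0000006572620 3/%0000000536200
Dev/Sector: 2/%0000006577760
"""

# Each entry has the form "<device>/%<octal sector address>".
entries = re.findall(r"(\d+)/%([0-7]+)", MPEX_LINES)
per_device = Counter(int(dev) for dev, _sector in entries)

for dev in sorted(per_device):
    print(f"device {dev}: {per_device[dev]} extents")
```

For LOG3320 this gives 2 extents on device 1, 5 on device 2 and 3 on device 3, for the 10 extents reported by MPEX.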
To finish this little essay I propose a puzzle to MPE/iX users:
what do the two "*" mean in the following :LISTF ,2 output?
ACCOUNT= SYS GROUP= PUB
FILENAME CODE ------------LOGICAL RECORD----------- ----SPACE----
SIZE TYP EOF LIMIT R/B SECTORS #X MX
PUZZLE 128W FB 1608 2222 1 1616 * *
The answer is in one of the recommended reading items:
1. Eugene Volokh, "The Truth About Disc Files",
Presented at 1982 HPIUG Conference, San Antonio, TX, USA
2. Andy Tauber, "Disc Balancing",
INTERACT Magazine, Jan. 1986
3. Greg Englestad, "HP3000 Disc Management",
SUPERGROUP Magazine, Sep.-Nov. 1987
4. Eugene Volokh, "The Truth About MPE/XL Disc Files",
Presented at 1989 INTEREX Conference, San Francisco, CA, USA
5. S.Gordon, V.Volokh, "The Art And Science Of Disc Space Management",
INTERACT Magazine, July 1991