DISC SPACE: HOW MUCH IS ENOUGH?
                      by Vladimir Volokh, VESOFT
     Presented at 1993 INTEREX Conference, San Francisco, CA, USA
              Published by INTERACT Magazine, June 1993.


    ABSTRACT.  In  spite of many new  developments in disc technology,
    the good old Winchester drive, with its mechanically movable arms,
    is  still the primary medium for all  our files -- be it programs,
    sources or Data Bases.

    It  seems that many simple questions  related to disc usage do not
    have easy answers:

    How  do we measure disc space  -- what's the relation between megs
    and  sectors?  How can an MPE user find out the disc capacity? How
    much  of it is free and how usable is the free space? And is 'used
    space'   really  used  by  us?   What  is  reblocking,  squeezing,
    trimming, condensing and other space transformations?

    The  paper  presents  some observations on  HP3000 file structure,
    similarities and differences between "Classic" and "Spectrum".

    This  author's  hope  is  that  such knowledge will  help users to
    better control their computing environment.

    BIOGRAPHY OF THE AUTHOR:

    Vladimir Volokh is the president of VESOFT, Inc., a software house
    based in Los Angeles, CA, USA which was founded in 1980 by him and
    his son, Eugene.

    They  are  the  creators  of MPEX/3000, a  productivity and system
    control  tool, SECURITY/3000, a log-on access control package, and
    VEAUDIT/3000,  an  auditing  tool  that  reports loopholes  on the
    HP3000  system,  of  which  there  are 10,000+  packages installed
    worldwide.

    Vladimir Volokh is a computer scientist with more than 10 years of
    HP3000  experience  as  system analyst,  consultant, and technical
    manager;  he  is  a  frequent  speaker at users  groups around the
    world.

In  spite  of  many new developments in  disc technology, the good old
Winchester  drive,  with  its mechanically movable  arms, is still the
primary medium for all of our files.

In this article I will try to present some observations on HP3000 file
structure  --  both  for  "Classic"  (MPE/V)  and  "Spectrum" (MPE/iX)
computers  -- in the hope that it  might help HP3000 users manage disc
space  better.  It seems that many  simple questions have answers that
are not so simple.

HOW DO WE MEASURE DISC SPACE?

In  various  discussions  about  disc  space,  you've seen  terms like
"sectors",  "kilobytes",  "megabytes", and "gigabytes".  What do these
words  mean?  Well, nothing is simple.   A sector, by HP's definition,
is  256 bytes; "kilo" (K) is 1000,  or 1024 for memory devices; "mega"
is  1000000,  or  1024K  for  memory  devices; and "giga"  is a prefix
denoting   a  billion,  or  1.073  billion  for  memory  devices.   My
dictionary tells me that "tera" means one trillion (10^12) of a given
unit.  HP's  Glossary  of  Terms  mentions only "kilo"  and "mega" (in
1989).  So, considering all this  mathematics, how many megabytes does
your  disc  have  if after :DISCFREE C on  your XL machine you see the
following:

     ALL MEASUREMENTS ARE IN SECTORS.
     ALL PERCENTAGES ARE RELATIVE TO THE DEVICE SIZE.

                |    Configured     |      In Use       |     Availab
     -----------+-------------------+-------------------+---------------
     LDEV :     1 -- (MPEXL_SYSTEM_VOLUME_SET:MEMBER1)
      Device    |    2232192        |    1708144 ( 77%) |     524048
      Permanent |    1852720 ( 83%) |    1605344 ( 72%) |     247376
      Transient |    1852720 ( 83%) |     102800 (  5%) |     524048

Since 4*256 = 1024 is close enough to 1000, you can do it easily --
just divide the number of sectors by 4000 and you will have it in
megs. In this case, it'll be 2232192/4000 = 558, close enough to the
real answer, 545 megs (2232192*256 bytes, counting a meg as 1024K
bytes).
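
The rule of thumb can be checked with a few lines of Python. This is
only a sketch: the function names are made up for illustration, and a
"meg" is taken as 1024K bytes, which is the reading that gives the 545
quoted above.

```python
SECTOR_BYTES = 256  # a sector, by HP's definition


def sectors_to_megs_quick(sectors):
    """Rule of thumb from the text: divide the sector count by 4000."""
    return sectors / 4000


def sectors_to_megs_exact(sectors):
    """Exact size in megs, counting a meg as 1024 * 1024 bytes."""
    return sectors * SECTOR_BYTES / (1024 * 1024)


device_sectors = 2232192  # "Configured" size of LDEV 1 in the :DISCFREE output
print(round(sectors_to_megs_quick(device_sectors)))  # rule of thumb
print(round(sectors_to_megs_exact(device_sectors)))  # exact figure
```

The quick estimate runs about 2.4% high, because it treats 1024*1024
as if it were 1000*1024.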

HOW BIG IS THE DISC?

As  you've  seen  above, MPE/iX gives you  an answer via :DISCFREE. In
MPE/V the utility FREE5.PUB.SYS -- true to its name -- shows only free
space.  But if it shows you that  X sectors are free, that's X sectors
out  of  how  many?  This information is hidden  deep inside the VINIT
utility (as if it were unimportant).  Try this:

      :VINIT >pfspace 1;addr

You  will  see a lot of information  about addresses and sizes of free
space  (and  you  don't care much about that).  But at the end of this
listing you will see:

      TOTAL VOLUME CAPACITY:   216832 SECTORS
      TOTAL FREE SPACE AVAILABLE: 16490 SPACE
      MAXIMUM CONTIGUOUS AREA: 5505 SECTORS

By the way, it's not my typo (if you're wondering about "16490 SPACE")
-- it's an unknown MPE designer's mistake, frozen in time ....

HOW MUCH OF IT IS FREE?

MPE/iX gives a pretty straightforward answer to this question: look at
its output in the example above -- this time not on the first line but
on the second:

    Permanent |    1852720 ( 83%) |    1605344 ( 72%) |     247376

As  you  see,  83%  of  the  whole  space is configured  to be used as
permanent,  72% is used, so only 11% (which is 83-72) is available for
permanent files.  But why doesn't this simple calculation work for the
third line (transient space)?

    Transient |    1852720 ( 83%) |     102800 (  5%) |     524048

Even though transient space can also take up to 83% of the space on
LDEV 1, in this case only 28% is left for it: the 17% that can't be
permanent plus the 11% that is unused by permanent files. Because 5%
is actually used by transient space, 23% is available.
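
The same arithmetic can be written out in sectors. The sketch below
is my reading of how the :DISCFREE columns relate to each other, not a
documented MPE/iX formula; the function name and the min() rule are
assumptions, chosen because they reproduce the "Available" figures in
the listing above.

```python
def available_space(device, perm_max, perm_used, trans_max, trans_used):
    """Return (permanent, transient) sectors still available.

    Assumption: each kind of space is limited both by its configured
    ceiling and by what is physically left on the device.
    """
    device_free = device - perm_used - trans_used
    perm_free = min(perm_max - perm_used, device_free)
    trans_free = min(trans_max - trans_used, device_free)
    return perm_free, trans_free


device = 2232192  # figures for LDEV 1 from the listing above
perm_free, trans_free = available_space(
    device, perm_max=1852720, perm_used=1605344,
    trans_max=1852720, trans_used=102800)

print(perm_free, round(100 * perm_free / device))    # sectors, percent
print(trans_free, round(100 * trans_free / device))  # sectors, percent
```

Under these assumptions the permanent figure is bounded by its 83%
ceiling, while the transient figure is bounded by what is physically
left on the device.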

On  MPE/V  machines  available  space  is supposed to  be shown by the
FREE5.PUB.SYS  utility  or  via  the PFSPACE command  of :VINIT (16490
sectors  in  the  PFSPACE  example  above).  But  what  about  virtual
(transient)  space?   This information, again, is  hidden -- this time
inside the :SYSDUMP output:

     :SYSDUMP $NULL
     ANY CHANGES? YES
     ...
     DISC ALLOCATION CHANGES? YES
     VIRTUAL MEMORY CHANGES? YES
     LIST VIRTUAL MEMORY DEVICE ALLOCATION? YES
     VOLUME NAME   LDEV #   VM ALLOCATION
       LDEV1       1         25
     ...
     ENTER VOLUME NAME , SIZE IN KILOSECTORS (MAX = 255 )?

This shows only how much virtual space is allocated; MPE/V tells us
nothing about how much of it is in use at any given moment. Some space
is also taken (possibly) by spool files and by temporary files. Note
also that even though total, free and virtual space are given by
device number, used space is not. (The :REPORT command gives used
filespace sectors by group and account.) One way to see this
distribution is to use the MPEX command %LISTF @.@.@,DISCUSE.

IS FREE SPACE REALLY AVAILABLE TO US?

When reading the FREE5 output on a "Classic", pay attention not only
to the "TOTAL FREE SPACE" line but also to the lines that precede it:

:RUN FREE5.PUB.SYS
VOLUME MH7945U1            LDEV 1
LARGEST FREE AREA= 25530
  SIZE  COUNT  SPACE   AVERAGE
>100000 0      0       0
>10000  2      42796   21398
>1000   0      0       0
>100    4      540     135
>10     29     1065    36
>1      107    217     2
TOTAL FREE SPACE=44618

If  you  have  a lot of small pieces, they  might not be usable at all
because none of your files may have small enough extents (more on this
later).   What  you need is not just  free space but CONTIGUOUS space.
On "Classics", disc space can be condensed to some degree by the >COND
command in :VINIT; on "Spectrum" machines disc fragmentation
shouldn't be a problem (or so HP tells us).
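
To see how much of the free space is really usable, the FREE5
histogram can be summed by size class. This is a rough sketch: the
function name is hypothetical, and the result is only a lower bound,
because a size class such as ">100" may also contain a few areas big
enough for a larger extent.

```python
# (size class lower bound, COUNT, SPACE) copied from the FREE5 listing above
free5 = [
    (100000, 0, 0),
    (10000, 2, 42796),
    (1000, 0, 0),
    (100, 4, 540),
    (10, 29, 1065),
    (1, 107, 217),
]


def usable_free_space(histogram, min_extent):
    """Free sectors in classes where every area exceeds min_extent."""
    return sum(space for bound, _count, space in histogram
               if bound >= min_extent)


total_free = sum(space for _bound, _count, space in free5)
small_pieces = sum(count for bound, count, _space in free5 if bound < 100)

print(total_free)                      # should match TOTAL FREE SPACE
print(usable_free_space(free5, 1000))  # space usable for 1000-sector extents
print(small_pieces)                    # number of areas of 100 sectors or less
```

On this disc almost all of the free space sits in two big areas, and
the 136 small pieces together hold very little.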

IS THE "USED" SPACE REALLY USED BY US?

OK, by subtracting "free" space from the "total" space or just looking
at  the :DISCFREE output we might get  an idea of how many sectors are
"used"  -- physically, that is.  Keep  in mind, however, that probably
about  half  of  those files which you see  on the full backup listing
have not been used (either modified or accessed) for a long time --
6  months or more. But which half?   Some answers to this question can
be  found in the :STORE command of MPE or, better yet, using selection
by  ACCDATE  and/or MODDATE in MPEX (with  totals of files and space).
Archiving and purging seldom-used files saves a lot of disc space,
directory space, and backup time.

TO BLOCK OR NOT TO BLOCK?

Another  question is: how is the space used inside "active" files? One
factor  -- relevant on MPE/V machines but not on MPE/iX machines -- is
blocking.  MPE/V  does  all disc I/Os in  multiples of one sector (256
bytes).   The blocking factor is the  number of records that we choose
to  fit  into a certain number of  sectors (block).  But very often we
don't choose -- we simply rely on MPE/V defaults, which can range from
good to very bad (see [1] for more details). A bad blocking factor
wastes not only disc space but also I/O time -- the more records we
read or write per I/O, the better. Consider some examples:

ACCOUNT=  SYS         GROUP=  OPERATOR

FILENAME  CODE  ------------LOGICAL RECORD-----------  ----SPACE----
                  SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX

REPORT1           132B  FA          26      10000   1     1251  1  8
REPORT2           132B  FA          26      10000  60      651  1  8
REPORT3           132B  FA          26         26  60       62  1  1
REPORT4           132B  FA          26         26   9       20  1  1

Here the file REPORT1 is built with the default blocking factor of 1
(only one 132-byte record fits into a 256-byte sector); the remainder
(256 - 132 = 124 bytes) is simply wasted, though it's almost 50% of
the space; this file is like a piece of Swiss cheese -- with many big
holes inside. The second file is the result of changing the blocking
factor to 60, thus achieving the BEST space utilization for this file
-- now 60 records take 60*132=7920 bytes, which is close to the size
of a block of 31 sectors (256*31=7936). However, we can get an even
bigger saving by SQUEEZEing this file (setting FLIMIT down to EOF) --
that's how we got the file REPORT3. By reblocking it again (REPORT4,
9 records per block) we save more space; as a result, the difference
in size between the REPORT1 and REPORT4 files is quite significant
(1251 sectors versus 20). Things like this can be done using our very
own MPEX (the %ALTFILE command with options SQUEEZE and BLKFACT=BEST).
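
The waste inside a block is easy to compute for any blocking factor.
The sketch below assumes fixed-length records and blocks rounded up to
whole 256-byte sectors; it ignores any MPE/V limits on block size and
is not a description of how MPEX picks BLKFACT=BEST. The function
names are made up for the example.

```python
SECTOR_BYTES = 256  # MPE/V does disc I/O in whole sectors


def block_sectors(record_bytes, blocking_factor):
    """Sectors needed to hold one block, rounded up to a whole sector."""
    block_bytes = record_bytes * blocking_factor
    return (block_bytes + SECTOR_BYTES - 1) // SECTOR_BYTES


def utilization(record_bytes, blocking_factor):
    """Fraction of the block's sectors that actually holds record data."""
    used = record_bytes * blocking_factor
    return used / (block_sectors(record_bytes, blocking_factor) * SECTOR_BYTES)


# Blocking factors of REPORT1, REPORT4 and REPORT2, all with 132-byte records
for factor in (1, 9, 60):
    print(factor, block_sectors(132, factor), f"{utilization(132, factor):.1%}")
```

With a blocking factor of 1 barely half of each sector carries data;
with 60 the 31-sector block is almost completely full.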

And  what  about  XL  (or  should we say  iX) computers?  The blocking
factor  does not mean much there;  all the records are tightly packed,
except  for  the  last extent which can (for  very big files) be up to
2048  sectors. The good news is that  the FCLOSE intrinsic (on the XL)
has  a  new  option called "XLTRIM", which  allows the system to reuse
free  space beyond the end of  file without decreasing the file limit.
Look at the following before-and-after example:

ACCOUNT=  SYS         GROUP=  PUB

FILENAME  CODE  ------------LOGICAL RECORD-----------  ----SPACE----
                  SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX

REPORT1           132B  FA          26      10000   1      256  1  *
REPORT2           132B  FA          26      10000   1       16  1  8

Quite a saving, from 256 sectors down to 16 (MPEX's %ALTFILE ;XLTRIM
does it) -- and we can still append to the file!

THE EXTENT QUESTION, OR WHERE IS THE FILE?

The  extent  is  MPE's  compromise  between two extremes  in file size
management: assigning all file space requested to the file immediately
or  giving space one record (or sector) at a time. In MPE/V a file can
consist  of  anywhere from 1 to 32  extents (the default number is 8).
Each  extent resides wholly on one  disc, but different extents may be
located  on different discs.  So where is any given file?  You have to
know the full extent map of the file and only then can you think about
improving  system performance through disc  balancing.  If you use the
LISTDIR5  >LISTF you might see the DISC DEV # line, but this of course
is  only  the  device  of  the first extent (the  same goes for :STORE
listings).   >LISTF ...;MAP, however, gives you a map (the first digit
is  the  "volume  table  index",  which is not  necessarily the device
number, and is hard to convert to the device number):

LISTDIR5 G.06.00 (C) HEWLETT-PACKARD CO., 1983
>LISTF VESOFT.PUB.SYS

FCODE: 0              FOPTIONS: STD,ASCII,VARIABLE
BLK FACTOR: 1         CREATOR: **
REC SIZE: 1276(B)     LOCKWORD: **
BLK SIZE: 640(W)      SECURITY--READ:    ANY
EXT SIZE: 10(S)                 WRITE:   ANY
# REC: 482                      APPEND:  ANY
# SEC: 70                       LOCK:    ANY
# EXT: 7                        EXECUTE: ANY
MAX REC: 13                   **SECURITY IS ON
MAX EXT: 7            COLD LOAD ID: %24025
# LABELS: 0           CREATED: THU,  9 APR 1992
MAX LABELS: 0         MODIFIED: THU,  9 APR 1992
DISC DEV #: 3         ACCESSED: THU,  9 APR 1992
DISC TYPE: 3          LABEL ADR: **
DISC SUBTYPE: 4       SEC OFFSET: %5
CLASS: DISC           FLAGS: NO ACCESSORS

>LISTF VESOFT.PUB.SYS;MAP

FCODE: 0              FOPTIONS: STD,ASCII,VARIABLE
BLK FACTOR: 1         CREATOR: **
REC SIZE: 1276(B)     LOCKWORD: **
BLK SIZE: 640(W)      SECURITY--READ:    ANY
EXT SIZE: 10(S)                 WRITE:   ANY
# REC: 482                      APPEND:  ANY
# SEC: 70                       LOCK:    ANY
# EXT: 7                        EXECUTE: ANY
MAX REC: 13                   **SECURITY IS ON
MAX EXT: 7            COLD LOAD ID: %24025
# LABELS: 0           CREATED: THU,  9 APR 1992
MAX LABELS: 0         MODIFIED: THU,  9 APR 1992
DISC DEV #: 3         ACCESSED: THU,  9 APR 1992
DISC TYPE: 3          LABEL ADR: **
DISC SUBTYPE: 4       SEC OFFSET: %5
CLASS: DISC           FLAGS: NO ACCESSORS
EXT MAP: %300161067   %200233735   %300162124   %200240307
         %300162207   %100211521   %200240326
>
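
The EXT MAP entries above can be pulled apart mechanically. This
sketch follows the statement that the first octal digit is the volume
table index; treating the remaining digits as the sector address of
the extent is my assumption, and the function name is hypothetical.
It also assumes every entry is printed with the same number of digits,
as in the listing.

```python
def split_extent_entry(entry):
    """Split an EXT MAP entry such as '%300161067' into two parts.

    Returns (volume table index, remaining digits as an integer).
    """
    digits = entry.lstrip("%")
    volume_index = int(digits[0], 8)     # first octal digit
    sector_address = int(digits[1:], 8)  # assumed to be the sector address
    return volume_index, sector_address


ext_map = ["%300161067", "%200233735", "%300162124", "%200240307",
           "%300162207", "%100211521", "%200240326"]

for entry in ext_map:
    volume, address = split_extent_entry(entry)
    print(f"volume table index {volume}, sector %{address:o}")
```

Even for this small 7-extent file the extents are spread over three
different volume table indexes.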

In  MPE/XL  file  labels  are  kept separately from  the data, and yet
:LISTF  ,3  still  shows  the file label address,  which might have no
relevance  to  the location of the data at  all. Here is an example of
:LISTF  ,3  and  MPEX's  %LISTF ,4 showing the  full extent map of the
file:

:LISTF LOG3320,3

FILE CODE : 0                   FOPTIONS: BINARY,VARIABLE,NOCCTL,STD
BLK FACTOR: 1                   CREATOR : **
REC SIZE: 2044(BYTES)           LOCKWORD: **
BLK SIZE: 2048(BYTES)           SECURITY--READ    : CR
EXT SIZE: 0(SECT)                         WRITE   : CR
NUM REC: 2720                             APPEND  : CR
NUM SEC: 2304                             LOCK    : CR
NUM EXT: 9                                EXECUTE : CR
MAX REC: 1024                           **SECURITY IS ON
                                FLAGS   : 1 ACCESSORS,SHARED,1 R,1 W
NUM LABELS: 0                   CREATED : THU, APR  9, 1992,  2:01 PM
MAX LABELS: 0                   MODIFIED: THU, APR  9, 1992,  2:01 PM
DISC DEV #: 1                   ACCESSED: THU, APR  9, 1992,  2:01 PM
SEC OFFSET: 0                   LABEL ADDR: **
VOLCLASS  : MPEXL_SYSTEM_VOLUME_SET:DISC

               MPEX %LISTF log3320   PAGE 1
       MANAGER.SYS,PUB   THU, APR  9, 1992,  4:01 PM

ACCOUNT=  SYS         GROUP=  PUB

-----FILE------ EXTENTS           -----SECTORS-----  DEVICE
NAME      CODE  NUM MAX           USED NOW  SAVABLE  CLASS

LOG3320          10   *               2560      208  DISC
   Dev/Sector:    2/%0000004516700   2/%0000000444440   2/%0000000072040
   Dev/Sector:    3/%0000001407620   3/%0000007737620   1/%0000003207300
   Dev/Sector:    1/%0000002675520   2/%0000006572620   3/%0000000536200
   Dev/Sector:    2/%0000006577760
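
For disc balancing, what matters first is how the extents are spread
over the devices. A small sketch that counts them from the Dev/Sector
lines above; the function name and the regular expression are mine,
not part of MPEX.

```python
import re
from collections import Counter

# Dev/Sector lines copied from the MPEX listing above
listing = """\
   Dev/Sector:    2/%0000004516700   2/%0000000444440   2/%0000000072040
   Dev/Sector:    3/%0000001407620   3/%0000007737620   1/%0000003207300
   Dev/Sector:    1/%0000002675520   2/%0000006572620   3/%0000000536200
   Dev/Sector:    2/%0000006577760
"""


def extents_per_device(text):
    """Count extents on each device from 'dev/%octal-address' pairs."""
    devices = re.findall(r"(\d+)/%[0-7]+", text)
    return Counter(int(dev) for dev in devices)


for device, count in sorted(extents_per_device(listing).items()):
    print(f"device {device}: {count} extents")
```

Half of this file's 10 extents turn out to be on device 2.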

To finish this little essay I propose a puzzle to MPE/iX users:
what do these two "*" mean in the following :LISTF ,2?

ACCOUNT=  SYS         GROUP=  PUB

FILENAME  CODE  ------------LOGICAL RECORD-----------  ----SPACE----
                  SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX

PUZZLE            128W  FB        1608       2222   1     1616  *  *

The answer is in one of the recommended reading items:

1. Eugene Volokh, "The Truth About Disc Files",
   Presented at 1982 HPIUG Conference, San Antonio, TX, USA

2. Andy Tauber, "Disc Balancing",
   INTERACT Magazine, Jan. 1986

3. Greg Englestad, "HP3000 Disc Management",
   SUPERGROUP Magazine, Sep.-Nov. 1987

4. Eugene Volokh, "The Truth About MPE/XL Disc Files",
   Presented at 1989 INTEREX Conference, San Francisco, CA, USA

5. S.Gordon, V.Volokh, "The Art And Science Of Disc Space Management",
   INTERACT Magazine, July 1991
