Database Modeling
F. Alfredo Rego, Adager, Calle del Arco 24, Antigua Guatemala
(Note: Transcribed from the 1987 typewriter original)
My "Database Dynamics" essay begins with these three paragraphs:
A database models the dynamic behavior of entities and their relationships by means of entries. An entry consists of a key (which uniquely identifies the entity or relationship) and a collection of attributes (which give quality and color to the entity or relationship).
Entities and relationships don't just sit there. They interact with one another and with their environments: Transactions happen which affect (and are affected by) entities and relationships. Such transactions include changes in database structure as well as changes in the meaning and value of the information maintained by the structure.
We cannot store a real entity or a real relationship in a database, just as we cannot store a real orchestra in a stereo cassette. At best, we can hope to store a half-decent description or representation which, through the magic of electronics, will play back a reasonably useful likeness. The representation, due to limitations of technology and economics, will consist of a group of values for a relatively small collection of characteristics which, in the case of databases, we call fields.
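As a minimal sketch of the definition quoted above (it is not part of the original essay), an entry can be pictured as a unique key plus a small collection of fields. The names Entry, ToyDatabase, put and get, and the sample values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Entry:
    """A description of an entity or relationship, not the thing itself."""
    key: str                                              # uniquely identifies the entity
    fields: Dict[str, Any] = field(default_factory=dict)  # a small set of attributes


class ToyDatabase:
    """Hypothetical in-memory store: entries indexed by their keys."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def put(self, entry: Entry) -> None:
        # A key must identify exactly one entry, so duplicates are rejected.
        if entry.key in self._entries:
            raise KeyError(f"duplicate key: {entry.key}")
        self._entries[entry.key] = entry

    def get(self, key: str) -> Entry:
        return self._entries[key]


db = ToyDatabase()
db.put(Entry(key="ORCH-001", fields={"name": "City Orchestra", "players": 60}))
likeness = db.get("ORCH-001")
print(likeness.fields["name"])
```

Whatever the fields omit (the sound of the orchestra, for instance) simply does not exist as far as this model is concerned.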
Database Modeling
In this "Database Modeling" essay, I would like to discuss the word that specifies the purpose of a database. Let's refer to my previous definition, with the emphasis now shifted from entries to models: "A database models the dynamic behavior of entities and their relationships by means of entries".
Why modeling? Modeling implies redundancy, since we now have to deal with two things: the model and the modeled. Why do we bother? For convenience, either intellectual or economic. We can afford to manipulate the model (even in ways which will break it). We cannot afford to fool around with the real thing.
Even though we model because of convenience, most of the time we deal with cumbersome models. Why do we put up with such ignominy? Let's explore the issues in this essay.
Static Modeling
Please note that most database models reflect only the static part of a database (the entries and the datasets, with access paths, for instance). Few database models concern themselves with the dynamic behavior of entries, which they exhibit when they become involved in transactions. Unfortunately, we find this malady in all kinds of modeling. The database modeling problem is just a special case of a very general and unwholesome condition.
Dynamic Modeling
Let's now consider models that, besides dealing with static structures, also deal with dynamic transactions. Instead of just asking "What is it?", we will now also ask "What is going on?". Let's deal with modeling, in general, since this will cover database modeling in particular.
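One hedged way to picture the difference between "What is it?" and "What is going on?" is to keep, next to the entries themselves, a record of the transactions that change them. The sketch below is only an illustration under assumed names (Transaction, apply_transaction, the "ACCT-1" entry); it does not describe any particular database product.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class Transaction:
    """A hypothetical change request: set one field of one entry."""
    key: str
    field_name: str
    new_value: Any


def apply_transaction(
    db: Dict[str, Dict[str, Any]],
    log: List[Tuple[str, str, Any, Any]],
    tx: Transaction,
) -> None:
    """Apply the change and remember what happened (old and new value)."""
    entry = db[tx.key]
    old_value = entry.get(tx.field_name)
    entry[tx.field_name] = tx.new_value
    log.append((tx.key, tx.field_name, old_value, tx.new_value))


# Static part: what the entry is right now.
db = {"ACCT-1": {"balance": 100}}
# Dynamic part: what has been going on.
log: List[Tuple[str, str, Any, Any]] = []

apply_transaction(db, log, Transaction("ACCT-1", "balance", 75))
print(db["ACCT-1"])
print(log)
```

The dictionary answers the static question; the log answers the dynamic one.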
Painful choices
In my paper, "The Database Administrator: A Ruthless Juggler of Reality, Algorithms and Data Structures", I say:
"Traditionally, the DBA's position has been linked to the rather misunderstood task of 'maintaining the database'. I say 'misunderstood' because the emphasis has been mainly on the data-structure aspect of the database. Two other fundamental aspects have usually been slighted: the reality that the model is trying to reflect, and the algorithms that manipulate the model to produce the results actually desired from the computer model itself.
The DBA is faced with unenviable choices: (1) Which aspects of reality are going to be incorporated into the computer model, and which are going to be left out? (2) Which algorithms are going to be allowed to manipulate the model, and which are going to be left out? (3) Which data structures are going to be allowed to support the model, and which are going to be left out? The painful part in these choices comes from the fact that what is left out of the computer model is as important for the model's effectiveness as what is actually put into the model."
Modeling Checklist
A model builder has the awesome responsibility of choosing what is left out of the model as well as what is put into the model. As a practitioner of this craft, I have developed, through the decades, a checklist which I would like to share with you.
Avoid the blinding effects of powerful conventions and unquestioning familiarity. Don't take first principles for granted. Question them. Are they convenient? Are they useful? Are they current? Are they obsolete? Let's look at two specific examples: agendas and badges.
An agenda is a model for a meeting. There are many ways to arrange the order of topics in an agenda, depending on the meeting's objective. One seldom-tried order might actually increase the likelihood of agreement on thorny issues. The idea is to begin with items that will, most likely, bring consensus. This will create a feeling of cooperation and agreement that, hopefully, will spill into later discussions regarding contentious items.
Many companies (including, of course, Hewlett-Packard) require that outsiders wear badges (bright-red, to boot!) when they visit their buildings. This is very understandable. The badge is, in a way, a model of the person as that person is perceived by the company. Without affecting the usefulness of the model (from the company's perspective), the writing could be changed from "Escorted visitor" to "Guest". The difference, as perceived by the outsider, would be dramatic and would benefit the visited company in terms of good will!
A good model should provide us with insight and understanding. It should encourage synthesis in thought, the shaping of analogy, and the discovery of latent parallels. Given a choice among several models, go for the model that facilitates patterns. For example, the touch-tone model of telephone dialing (besides being quicker) helps us remember long numbers by means of the "graphics" formed by the sequence of digits; such "graphics" are next to impossible to detect in a rotary dialing model. As another example, consider the design of musical instruments, which are models for melody and harmony. I am familiar with the piano and the guitar, but my first introduction to harmonics was through the acoustic (Spanish) guitar. This fact probably biases my opinion: The various keys and their harmonic chords, progressions and inversions are easier to detect, as patterns or graphics, on the guitar (although I am sure that some pianists would disagree with me, since they have the genius to detect any pattern on the piano keys!).
A good model should not strain our memory. It should have good mnemonics. Software programs are notorious for failing to provide this common courtesy. Database schemas are not far ahead. What do people mean when they say "MO-PD-X2-MSTR"?
A good model should not overload the mind. Construction blueprints provide an example of division, in the name of sanity. We have an architect's drawing which represents a perspective of the finished product (usually with trees, people, cars, and so on, in a highly stylized format). We also have specialized blueprints for the plumbing, electrical, structural, and decorative elements. And blueprints, in themselves, come in various flavors, such as "as planned" and "as built".
A good model should "go with the flow". We should use binary notation when we deal with individual bits or with combinations of individual bits. We should use quaternary (tetramerous) notation when we deal with groups of two bits (since 2 to the power of 2 is 4). We should use octal notation when we deal with groups of 3 bits (since 2 to the power of 3 is 8). We should use hexadecimal notation when we deal with groups of 4 bits (since 2 to the power of 4 is 16). And so on. Why, then, do we use octal notation on a 16-bit machine? This forces us to treat the rightmost 15 bits as 5 groups of 3 bits and the leftmost bit as an anomaly! This is not too bad when we deal with one 16-bit word, but it becomes a stretching exercise for the brain when we deal with double words!
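The mismatch between octal notation and a 16-bit word can be seen in a short sketch. The helper name group_bits is hypothetical; the arithmetic is standard.

```python
from typing import List


def group_bits(value: int, width: int, group: int) -> List[str]:
    """Split a width-bit value into groups of `group` bits, starting from the right."""
    bits = format(value, f"0{width}b")
    groups: List[str] = []
    end = len(bits)
    while end > 0:
        start = max(0, end - group)
        groups.insert(0, bits[start:end])
        end = start
    return groups


word = 0xFFFF  # a 16-bit word with every bit set

# Hexadecimal: four full groups of 4 bits.
print(group_bits(word, 16, 4))
# Octal: five groups of 3 bits, plus one lone leftmost bit.
print(group_bits(word, 16, 3))

print(format(word, "X"))  # FFFF
print(format(word, "o"))  # 177777

double_word = 0xFFFFFFFF  # two such words side by side
print(format(double_word, "X"))  # FFFFFFFF
print(format(double_word, "o"))  # 37777777777
```

In hexadecimal, the double word reads as the two single words written one after the other. In octal, it does not: 177777 followed by 177777 is not 37777777777, because the 3-bit groups straddle the boundary between the two words.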
An ideal model should be a work of fine art. It should have an immediate intelligibility because of its beauty, order and elegance. It should deal with useful questions and issues and it should be applicable to the greatest number of practical cases. It should stand the test of time and it should be usable by a large number of persons.
Clocks and calendars model time. Even though we conceive time as a linear arrow that forever moves forward, our practical models of time tend to be modular. The classical clock has a round face with 12 divisions. The little ("hours") hand goes around the face in 12 hours. The big ("minutes") hand goes around the face in one hour. The optional "seconds" hand goes around the face in one minute. There are digital clocks that provide a "window" instead of a "face". This window flashes numbers that represent hours, minutes, seconds. Some digital clocks also flash days, weeks, months and years. It is easier to grasp the modularity of our time conventions if we look at a "classical" clock, with a face and with hands that go around in the face.
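The modularity of the classical clock face can be sketched with ordinary modular arithmetic. The function name clock_hands is hypothetical, and the sketch assumes time is given as a count of whole seconds since midnight.

```python
from typing import Tuple


def clock_hands(seconds_since_midnight: int) -> Tuple[int, int, int]:
    """Return (hours, minutes, seconds) as a 12-hour clock face would show them."""
    seconds = seconds_since_midnight % 60           # seconds hand: modulus 60
    minutes = (seconds_since_midnight // 60) % 60   # minutes hand: modulus 60
    hours = (seconds_since_midnight // 3600) % 12   # hours hand: modulus 12
    return hours, minutes, seconds


afternoon = 13 * 3600 + 5 * 60 + 9   # 13:05:09 on the linear "arrow" of the day
print(clock_hands(afternoon))                  # (1, 5, 9)
print(clock_hands(afternoon - 12 * 3600))      # (1, 5, 9): the face cannot tell them apart
```

The linear count keeps growing, while the face only ever shows remainders.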
In the classical calendar, we pull out a page at the beginning of each month. There are other calendar systems that have different moduli: They may have a page per day, a page per week, a page per year. Even though we know about other time chunks, such as decades, fourscore, centuries, millennia, and so on, we do not see calendars that model them. The ancient monuments created by the Maya Indians in cities such as Tikal in Guatemala dealt with time in terms of thousands of years. The modern agendas and time-management systems have a hard time providing anything more long-term than a five-year calendar!
Conclusion
Few database models concern themselves with the dynamic behavior of entries, which they exhibit when they become involved in transactions. Unfortunately, we find this malady in all kinds of modeling. The database modeling problem is just a special case of a very general and unwholesome condition.
Let me share one of my pet peeves with you, to illustrate my conclusion.
I travel extensively on airplanes. As a result, I suffer (among other things) the opprobrium of "Theatre in the Air" movies. Movies, by the way, are models of fantasies. The fantasy created by the director, actors, etc., gets modeled by means of still frames that, when flickered on a screen in front of our eyes, fool us into thinking that the fantasy is actually moving. At any rate, when the movie is good, the captain comes on the sound system, just at the most intriguing moment in the dialogue, to tell us that the outside temperature is -60 degrees or whatever. The captain, who is supposedly in charge, seems to take special pride in ruining the flow of a good movie. On the other hand, when the movie is bad (which is most of the time), I have no way to escape and I am even forced to close my window, so I don't even have the choice of enjoying the view outside. Even worse, when the movie is horrendously violent and vicious and I have my little children with me, my blood really boils. Particularly when I read this pap in the in-flight magazine: "Although this film has been edited for airline use, parental guidance is suggested." Notice that they use the cowardly passive voice (i.e., they don't say who, specifically, suggests that I use parental guidance). Also notice that they don't say how, specifically, I am supposed to exercise parental guidance, given the fact that we are all strapped to the seat and must remain strapped to the seat, facing the screen.
May your models (and the systems that you impose on your users) be nicer!