Style, Architectural, & Design Principles of the Z Directory.






It may help to know some of the thought processes behind the Z Directory to be able to use it more effectively. This section delves into an assortment of topics related to its overall usage.






Synopsis.
The Z Directory is a software library. It is similar to Rogue Wave, STL, MFC, or other general-purpose toolsets. It is written entirely in c++. It runs on Unix (Sun Solaris, HP-UX) and Microsoft (Win32) platforms. It has a wider scope than most software toolkits.






Objectives:
  • SIMPLE OBJECTS. c++ is a language of interfaces. Each object is intended to have the simplest possible interface (i.e., set of member functions). Simplicity includes minimizing the number of functions, keeping parameters intuitive, and standardizing on naming and usage conventions. Though an object may be complex, its interface should be as intuitive and simple as possible.

    This should be intuitive to anyone who has driven an automobile. The complexity of every car reduces to 4 controls: throttle, brake, steering, and gearshift (5, if you include clutch). Thus, the end user can focus on the driving experience and other things besides learning how to use it (a long way from the first autos, and most software today, which would include a repair manual).


  • HIDING IMPLEMENTATION. Implementation specifics are hidden from the user. The interface ("API") is intended to reflect only logical functionality. For example, moving a block of data from one place to another is considered "message transport". Client code is written in such terms. The user does not refer to sockets, shared memory, X.25, or other implementation vehicles.

    The mechanism by which something is to be done takes a back seat to providing a correct view. In most cases, the user need not be forced to know implementation details. For example, regarding transporting data, sockets, TLI, or COM/DCOM may be here today, gone tomorrow, but the principle of moving data from A to B will endure. This is equally true in the areas of encryption, window operations (GUIs), databases, etc.

  • INSULATING THE ENVIRONMENT. Software architecture can be dramatically affected by the operating system it runs on or the compiler used to create it. Computer languages are traditionally obligated to provide a rigid specification for a compiler, but c++ compilers have notoriously failed to live up to the language's specifications. Software may be tied to a specific database or window system, reflecting the limitations or characteristics of that system. These are bad things. On the other hand, the software may limit itself to a particular environment in order to take advantage of useful functionality that the software can leverage, such as, say, transaction rollbacks (in a database) or threads (in an OS). This is the "best of breed" approach.

    Operating-system independent code: the Z Directory tries to isolate all ties between the OS and the code as much as possible. It is here to protect you. Operating system nuances are isolated to the lowest possible layer. The foundation that the Z Directory provides means that your program behaves uniformly across all platforms. One simple technique is to install wrappers (usually at low layers) for OS API functions (system calls). The Z Directory offers functions such as z_strcpy(), as a replacement for strcpy(). Sometimes the Z Directory version improves on what the OS or compiler offers. For example, z_strncpy() null-terminates, which strncpy() does not guarantee. One may wonder, why use z_strcpy() when you can use strcpy()? Why yet another layer of complication? Consider the layer 0 string function z_bcopy(). In Microsoft, moving a [binary] block of bytes is accomplished via the system call memmove(). HP-UX (Hewlett-Packard's interpretation of unix) also uses memmove(). In Sun Microsystems' Solaris OS, it's bcopy(). On some versions of System V unix, none of these are available. Or perhaps they are available now.

    Freedom of choice: you can spend your weekends updating your code to make it work on all the platforms you are interested in, or to reflect the latest changes in some vendor's operating system. Or, you can use the Z Directory and go to the beach. You can argue that your favorite vendor's implementation is the best, and keep working on your code, or you can use the Z Directory and have a barbecue picnic. While some bigshot, or programmer at Microsoft, or Borland, or Sun Micro, or POSIX committee decides to either make their own version so that you can pay them more money and argue that they are the best, or change the way they do things in order to standardize, the Z Directory's z_bcopy() function interface has remained the same for over 15 years. There are many other examples: a thread under Microsoft is created via the system call _beginthreadex(), and under unix it is done via pthread_create(). The Z Directory's z_strchr() is equivalent sometimes to strchr(), other times to index(). Z Directory's z_sleep() under Microsoft is a Sleep() {with a big 'S', counting in milliseconds}, or in unix, a sleep() {counting in seconds}. In System V, one can use poll(); in BSD, one can use select(), etc, etc. To quote Jimi Hendrix: blah blah, woof, woof.

    Unfortunately, some behaviors cannot be resolved in the Z Directory. For example, to send e-mail in Win32 environments, an MTA (Mail Transfer Agent) server and a return address must be designated, whereas in Unix environments, neither is required. Also, Microsoft environments provide the Registry and cut-and-paste buffers that handle font type and other information besides straight ASCII text.


  • MINIMIZATION OF CODE DUPLICATION. One of the design principles of the Z Directory is that "less is more". Copying a block of code is a very common and very bad practice: every copy must be maintained separately, and the code degrades. Often, when you want to do something similar to what an existing block of source code does, you simply copy that code and modify it. This may be fine for disposable software with a very limited shelf life. But software tends to grow, and once the overall system needs to expand, each copy has to be found and changed. The more you copy-and-paste, the more you damage the software. Multiple occurrences of the same source code indicate the need for creating an inheritance hierarchy, or breaking the code into subroutines and putting the repetitive process in a loop.

    Sometimes it is unavoidable. For example, consider the "rundriver_o" class. It provides routines common to most console-based programs, such as parsing a command-line, or loading variables from an INI file. The file_o class is used for reading an INI file. This is a layer 9 object. For code in layers 0-8, a simpledriver_o class is used. This uses its own set of code for opening and reading a file. Not only is this functionality duplicated by 2 sets of code, but in this case, operating-system dependencies are introduced in both places; namely, system calls such as fread() and fopen(). In this case, this is mitigated by putting the simpledriver class at layer 1, and having it use low-level file operation code found at layer 0. The cost of doing this is to deny this object to layer 0 code. We can repeat the code-duplication process by introducing a layer-0 version of the rundriver's functionality (do you see a recursive solution here?), but we stop here, by simply denying rundriver functionality to layer 0 objects.


  • ALL SOFTWARE IS A LIBRARY COMPONENT. Typically software is created by someone accustomed to a particular environment, with a specific set of software tools. New software is built with the current environment in mind. Only after the software evolves does the typical designer realize how to de-couple it from the systems it interfaces or is built upon, or how to make it more general-purpose. Most "finished" programs can be folded into a general-purpose library. It takes knowledge, experience, intelligence and skill in designing and writing code so that it can be created as a generic library component in advance. As software grows larger and more complex, protocols must be defined. The protocols often define how future software is constructed. For example, a simple software object such as string operations has very little protocol built into it; whereas, if constructing a more complex component such as a server object, one must decide on how many input and output channels are to be implemented, what sort of transport mechanisms are to be employed, over-all architecture (event driven, polling, threaded, etc.), format of data packets, and a host of other policies and protocols. The trick to software is to know how to factor out protocols, so that generic-universality is maximized. The more this design principle can be maintained, the better the overall quality of the software architecture: it will be easier to grow and maintain, and have a longer shelf life.

    Vettrasoft policy is to write code without an end target in mind. This may seem rather strange, but it is a cornerstone of why the Z Directory succeeds. An analogy can be found in mathematics, where it is commonly accepted that those who dive into "pure math" might do so without a specific end goal (eg, real-world application) in sight. This decoupling is one secret to writing "permacode".
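
As a rough illustration of the wrapper technique described under INSULATING THE ENVIRONMENT, here is a minimal sketch of two low-layer wrappers. The names demo_strncpy() and demo_move_bytes() are made up for this example. They are not the Z Directory's own functions, and the real z_strncpy() and z_bcopy() may differ in signature and behavior. The sketch assumes only the standard C library.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper (not the actual z_strncpy): copies at most
// size - 1 characters from src and always null-terminates dst.
char *demo_strncpy(char *dst, const char *src, std::size_t size)
{
    if (size == 0)
        return dst;                 // no room, not even for the terminator
    std::strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';           // strncpy() alone skips this on truncation
    return dst;
}

// Hypothetical wrapper (not the actual z_bcopy): moves n bytes, and is
// safe for overlapping blocks because memmove() is.
void *demo_move_bytes(void *dst, const void *src, std::size_t n)
{
    return std::memmove(dst, src, n);
}
```

Client code calls only the wrappers, so a platform that offers a different primitive would need a change in one place rather than in every caller.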







General Structure and Architecture.
Layering of code is nothing new. Often the code of some operating systems is likened to an onion: a block, or "layer" of code, is surrounded by successive blocks of code. The Z Directory is more like a high-rise building. There are strict rules to the construction and placement of Z Directory components: given code ("block A") dependent on other code ("block B"), the dependent code must be situated at a layer higher than the other code. In other words, A must reside above B.

One approach in design of toolkit libraries is to make a library based on categories - ie, your program would link to say, a "window", a "container", and a "math" library. Though more intuitive, such an approach could lead to problems due to interdependencies - if a part of library A used another library (lib B), and somewhere in library B it accesses library A, there is a cyclic reference that can cause problems for some linkers.

Source code for "categories" (or "groups" - logical units for organizing source code files according to its over-all purpose) may be split across layers in the Z Directory. The components are ordered by their layer, starting from 0 and going upward. In order to link a program to the z library, one must include all the appropriate libraries. It is necessary to explicitly list each library in your link command. For example, if you are building an application called "program" using gnu's gcc compiler, the compiler command (using unix and gnu) would look something like the following:
gcc -o program program.o -lz05 -lz04 -lz03 -lz02 -lz01 -lz00 -lm


Since code in layer 5 uses code in layer 4, code in layer 4 uses code in layer 3, and so on, the libraries should be given in descending order, down to the layer 0 library. A standard, core z library name is of the form libzx.a (libzx.lib on Microsoft), where "x" is the layer the library corresponds to. All "regular" (core) code in a layer goes into a corresponding library. There are additional z libraries for specific add-on packages, such as X-Windows or databases:

X-Windows                      libzxwn.a
SQL Server or Informix RDBMS   libzdbn.a
Vantive                        libzvann.a

The lowest layer is 0. There are two rules defining the position of code in the Z Directory:
  • First, if code uses other code, the first code must be at a higher layer.
  • Second, code must be placed at the lowest possible layer. Thus, if code A uses code B and code C (objects or functions), and B is at layer n, while C is at layer n-2, then A must reside in layer (n+1). There are occasional exceptions, made for pragmatic maintenance of the code. This usually happens when a group of code is closely related and one piece of it depends on code that the rest does not. In this case it may be more convenient to keep the code together, whereas breaking it up across layers would cause unnecessary confusion.
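
The two rules amount to a small calculation: a component that uses nothing sits at layer 0, and anything else sits one layer above the highest component it uses. The following is only a sketch of that arithmetic. The DependencyMap type and the lowest_layer() function are invented for this illustration and are not part of the Z Directory.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical bookkeeping: maps a component name to the names it uses.
using DependencyMap = std::map<std::string, std::vector<std::string>>;

// Returns the lowest layer a component may occupy: 0 if it uses nothing,
// otherwise one more than the highest layer among the components it uses.
// Assumes the dependencies contain no cycle; a cycle would recurse forever.
int lowest_layer(const std::string &name, const DependencyMap &deps)
{
    auto it = deps.find(name);
    if (it == deps.end() || it->second.empty())
        return 0;

    int highest = 0;
    for (const std::string &used : it->second)
        highest = std::max(highest, lowest_layer(used, deps));
    return highest + 1;
}
```

With B at layer n and C at layer n-2, the maximum is n, so A lands at n+1, as the second rule states.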
The lower layers tend to have ties to the operating system, so there are heavy infestations of "run-time optioning" (despite the name, these are preprocessor conditionals, resolved at compile time), consisting of constructs such as:
      #if os_Win32
          // (--do this--)
      #else
        #if os_Unix
          // (--do it another way--)
        #else
          #error unknown OS
        #endif
      #endif
    
Run-time optioning attempts to trap OS nuances. The code is expected to behave the same, regardless of the OS - that is the goal, which is not always possible.

All header files (aka include files) are put into one directory. This makes for a very large bucket, and makes it very simple to specify include file search paths. Almost all of the header files start with "z_", and are standard ".h" file types. A header file may also have a "_p.h" file associated with it - a private header file. Private header files are mainly for storage of inline member functions.

Each layer contains various subject categories. These groupings reflect the category the code belongs to. Examples of this include "string", "sql", "www", "money", or "time" (These same names are used in the reference section). A given name may be spread out over different layers. For example, "time" appears in layer 2 and layer 3. Lower layer code may consist of just simple c-code functions, or wrappers around operating system calls ("z_strcpy()" simply calls "strcpy()", for example). Higher-level code usually provides more powerful operations.

Difficulties maintaining the Z Directory layering.

This section explores the aggravation of cyclic loops and how to tackle this problem. It serves as an example only and can be skipped.

Suppose you have this code:
    t.cpp:  class Top : public Base { };
    b.cpp:  class Base { void fun() { primitive(); } };
    pf.cpp: void primitive() { }
    
If the Z directory consisted of only these elements, then the Z Directory would look like this:
    layer2/t.cpp
    layer1/b.cpp
    layer0/pf.cpp
    
Cyclical dependencies can create problems that challenge this scheme. Suppose a File class is introduced, which used class Top:
    f.cpp:
        class File
        {
            void file_op () { Top t; t.top_dog(); }
        };

    t.cpp:
        class Top : public Base
        {
        public:
            void top_dog() { /* .. */ }
        };

    b.cpp:
        class Base
        {
            void fun() { primitive(); }
        };

    pf.cpp:
        void primitive() { }
    
In the Z Directory scheme of things, the files would be organized so:
    layer3/f.cpp
    layer2/t.cpp
    layer1/b.cpp
    layer0/pf.cpp
    
(from this point, the file contents will be collapsed into single lines)
Now suppose class Base wants to use the new File class. This would require relocating Base, so that it is above File:
    b.cpp:  class Base  { void fun() { primitive(); File f; f.file_op(); } };
    f.cpp:  class File  { void file_op () { Top t; t.top_dog(); } };
    t.cpp:  class Top : public Base { void top_dog() {} };
    pf.cpp: void primitive() { }
    
However, Top uses Base (as a base class), so Top would have to be moved above Base:
    t.cpp:  class Top : public Base { void top_dog() {} };
    b.cpp:  class Base { void fun() { primitive(); File f; f.file_op(); } };
    f.cpp:  class File { void file_op () { Top t; t.top_dog(); } };
    pf.cpp: void primitive() { }
    
But File calls Top::top_dog(), which means File must be moved above the Top class:
    f.cpp:  class File { void file_op () { Top t; t.top_dog(); } };
    t.cpp:  class Top : public Base { void top_dog() {} };
    b.cpp:  class Base { void fun() { primitive(); File f; f.file_op(); } };
    pf.cpp: void primitive() { }
    
Which is exactly where we started, and the problem is not solved - an infinite loop. The following countermeasures are available:
  • Copy the functionality (this is effectively code duplication) into the layer where the code is to be added.
  • Create a similar class or function(s) as the ones you want to use, but water it down so that it can fit into a lower layer.
Admittedly these are shortcomings. Such a problem exists with the rundriver_o class, which uses the layer 9 file object to parse an INI file. This deprives code in layers 0-9 of rundriver. A simpledriver_o class was constructed to satisfy that need.
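
A loop like the one walked through above can be spotted mechanically before any file is moved. The sketch below is one way to do it, using a depth-first search over a "who uses whom" table. The UsesMap type and the has_cycle() function are invented for this illustration and are not Z Directory tools.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical bookkeeping: maps a component name to the names it uses.
using UsesMap = std::map<std::string, std::vector<std::string>>;

// Depth-first walk; "visiting" holds the components on the current path,
// "done" holds components already known to be free of cycles.
static bool walk(const std::string &name, const UsesMap &uses,
                 std::set<std::string> &visiting, std::set<std::string> &done)
{
    if (done.count(name))
        return false;                       // already cleared
    if (!visiting.insert(name).second)
        return true;                        // back on our own path: a cycle

    auto it = uses.find(name);
    if (it != uses.end())
        for (const std::string &used : it->second)
            if (walk(used, uses, visiting, done))
                return true;

    visiting.erase(name);
    done.insert(name);
    return false;
}

// Returns true if any component uses itself, directly or indirectly.
bool has_cycle(const UsesMap &uses)
{
    std::set<std::string> visiting, done;
    for (const auto &entry : uses)
        if (walk(entry.first, uses, visiting, done))
            return true;
    return false;
}
```

Fed the File, Top, Base, and primitive relationships from the example, the check passes until Base is made to use File, at which point it reports the cycle.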






Naming Conventions.
The most obvious naming convention in the Z Directory is the ubiquitous prefix "z_". Many functions start with "z_". This was chosen to distinguish its functions from other libraries, and its short but visually intense appearance is intended to improve readability. Underscores ("_") are found in abundance. This character, shunned by many programmers, is part of the set of characters that comprise a 'word' in the c language. It eases readability of function names.

Consider an X-window function such as "XmInitializeAppSet()". The z-dir counterpart would probably be named "z_init_appset()". Upper case is shunned in most class and function names. Thus, it tends to be reminiscent of its roots - unix. Ultra-long and ultra-short names are discouraged. Class member functions do not contain the "z", and shorter names are preferred, as well as operator overloading ("+=", "==", "^", etc).

The names of class objects end in "_o". The last character, an 'o', represents "object". The roots of this naming convention came from the Unix community's habit of naming types with a "_t" at the end (such as "time_t"). However, this particular habit appears to be rather unique to the Z Directory. Having such a non-standard convention is actually an advantage. The "_o" naming convention provides an immediate visual clue that the object is a Z Directory object.
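
To show how these conventions look together, here is a small sketch. Both z_demo_count_char() and tally_o are invented for this illustration and do not exist in the Z Directory. They only mimic the style: a lower-case, underscore-separated free function with the "z_" prefix, and a class with the "_o" suffix whose members drop the prefix and favor operators.

```cpp
#include <cstddef>

// Hypothetical free function in the "z_" style: lower case, with
// underscores between words. Counts occurrences of c in the string s.
std::size_t z_demo_count_char(const char *s, char c)
{
    std::size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}

// Hypothetical class with the "_o" suffix. Member functions carry no
// "z_" prefix, stay short, and lean on operator overloading.
class tally_o
{
public:
    tally_o() : total_(0) { }
    tally_o &operator+=(int amount) { total_ += amount; return *this; }
    bool operator==(const tally_o &other) const { return total_ == other.total_; }
    int total() const { return total_; }

private:
    int total_;
};
```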

Macro (#define) naming conventions include the following:
zos_<operating_system>   macro to control inclusion of code, depending on operating system
zcc_<compiler>           macro that depends on type of compiler in use







Coding Standards.
Vettrasoft has a very strong culture and history with regard to its c++ coding standards. Coding standards ensure consistency, allowing the user to focus on the contents of the code, rather than deciphering what is written, or having to deal with some programmer trying to show that he is clever. For more information, please see the coding standards page.






SQA - Quality Assurance (aka "Testing").
There are many programs available for verification of the correctness of the Z Directory. A set of such programs called "main drivers" can be downloaded. A main driver program usually tests a particular class. A main driver program contains main(), and provides 3 sets of testing: an interactive facility, a batch run, and a validation function. The interactive part allows for manual testing of primary features of the class. It is also a laboratory for how to use the code that it applies to. The batch section is usually the best source of examples for how to use a class. The validation (or verification) section allows for automated (regression) testing of the class. A main driver uses standard command-line switches to control the test program:
parameter   description
-h, -?      Show a brief help message and exit
-V          Show the program version information
-i          Run interactive mode
-batch      Run a batch mode [test]
-e          Run a validation test
-vol        Set volume level (output debugging messages). Range is from 0 (complete silence) to 100 (maximum verbosity)


Extended command-line switches are for the orthodox class or higher:

parameter       description
-db, database   The name of the database to be used
-server         Database server name


Many of the command-line switches are the same as those used by the rundriver_o class object, and many main driver programs have been [recently] converted to be based on using the run-driver class. An extension of the run-driver class (in layer 1), called "testdriver_o", has been created to automate much of the testing process. It is used in the following fashion:
    class xxx_test_o : public testdriver_o
    {
    public:
        inline xxx_test_o () { }

        int cmdline (int, char **);
        void version (const char *);
        int validate (const char *, const int = 0, const boolean = TRUE);
        int interactive ();

        // void testdriver_description (const char * = NULL);
    };

    void xxx_test_o::version (const char *prog)
    { cout << prog << ": build date: " << __TIMESTAMP__ << endl; }

    int main (int argc, char **argv)
    {
        int ie;
        xxx_test_o md;

        ie = md.run (argc, argv);
        exit (ie);
    }


Here, xxx should be a name representing the class to be tested. The test-driver class was designed to be sub-classed in the manner given above, and the version() member function must be declared in the above fashion (if not, the __TIMESTAMP__ macro will not work properly). This function, and validate(), must be written; that is, version() and validate() are pure virtual (abstract) member functions. The command-line arguments may be overridden or added to by defining your own cmdline() in the test driver class. If additional arguments are to be employed, this function may be written like so:
    int xxx_test_o::cmdline (int argc, char **argv)
    {
        testdriver_o::cmdline (argc, argv);

        int i;
        for (i = 1; i < argc; i ++)
        {
            // your parameter parsing goes here
        }

        return (0);
    }
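
The body of that loop is left to the test author. As one possible filling, the sketch below scans for a made-up switch, "-repeat <n>", and ignores everything else. The switch, the parse_repeat() helper, and the convention of returning -1 on a malformed value are all assumptions for this illustration, and it is written as a free function so that it does not depend on any testdriver_o internals.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: scans argv for "-repeat <n>" and stores n in *repeat.
// Returns 0 on success (switch absent or well-formed), or -1 if the value
// is missing or is not a non-negative whole number.
int parse_repeat(int argc, char **argv, int *repeat)
{
    for (int i = 1; i < argc; i++)
    {
        if (std::strcmp(argv[i], "-repeat") != 0)
            continue;               // not ours; leave it to other parsers
        if (i + 1 >= argc)
            return -1;              // switch given without a value

        char *end = 0;
        long n = std::strtol(argv[i + 1], &end, 10);
        if (end == argv[i + 1] || *end != '\0' || n < 0)
            return -1;              // empty, trailing junk, or negative

        *repeat = static_cast<int>(n);
        i++;                        // skip the value we just consumed
    }
    return 0;
}
```

A cmdline() override could call such a helper from inside its loop, or in place of it, after the base class has processed the standard switches.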