C++ Coding Standards



Our layman's introduction to coding standards.

"Coding standards" in computer science refers to the style in which someone writes source code (that is, computer software programs). Since the compiler (the thing that you submit your 'source code' to in order to create a program) ignores blanks, tabs, line breaks, and anything else it considers "whitespace", the writer can use them to create a wide assortment of styles that all result in the exact same program. This freedom has positive and negative side effects. Imagine a [typical] scenario where a number of programmers are brought in to work on a big project. Each one has his or her unique style, naming conventions, documentation habits, etc. The source code is aggregated into a set of documents. Clearly the style changes as the code bounces from one author to the next. As the original authors go away and get replaced by others, new styles are injected. Furthermore, when bug fixes, modifications, and additions are made by others, the styles blend together in a random fashion.

Consider the following code fragment:

//----------------------------------------------------------------------
// z_FACTOR -- get the factors of a number
//
// RETURNS
//       0 -- success
//      -1 -- error (too many prime factors)
//----------------------------------------------------------------------
int z_factor (count_t n, count_t primes[])
{
    int ie;
    int i, j;

    for (i = 0; i < z_Maxnum_LCD_Factors; i ++)
        primes[i] = -1;

    // --for each number, [2..n], see if it goes into 'n'--
    for (i = 2, j = 0; i <= n; i ++)
    {
        if (is_z_prime (i, &ie))
        {
            while (z_is_divisible_by (n, i))
            {
                primes [j++] = i;               // add 'i' to list of primes
                n /= i;                         // cut n down: n = n/i

                if (j >= z_Maxnum_LCD_Factors)  // ..still have room?
                    { return (-1); }
            }
        }
    }

    return (0);
}


This actual Z Directory code fragment reflects these coding standards:
  • 'blocks' are indented by 4 spaces (always);
  • spaces are used liberally to separate tokens;
  • in-line comments are lined up;
  • the function is preceded by a descriptive comment block;
  • braces are used to elucidate loops and blocks, even when they are not required;
  • logical "phrases" are separated by 1 blank line.


The same code can be written in a more cramped style:
int z_factor (count_t n, count_t primes[]) {
    int ie; int i, j;
    for (i = 0; i < z_Maxnum_LCD_Factors; i ++) primes[i] = -1;
    for (i = 2, j = 0; i <= n; i ++)
        if (is_z_prime (i, &ie))
        {
            while (z_is_divisible_by (n, i))
            {
                primes [j++] = i; n /= i;
                if (j >= z_Maxnum_LCD_Factors) { return (-1); }
            }
        }
    return (0);
}


Some may consider this an improvement, as it saves vertical space. In fact, many religiously put the opening brace at the end of a line (notice the "{" at the end of the first line above, the one containing "int z_factor"), rather than lining it up in the same column as its corresponding closing brace. The same code can be condensed even further:
int z_factor (count_t n, count_t primes[]) { int ie; int i, j;
for(i=0; i<z_Maxnum_LCD_Factors; i++) primes[i]= -1;
for(i=2, j=0; i<=n; i++) if(is_z_prime(i,&ie))
while(z_is_divisible_by(n,i))
{ primes [j++] = i; n /= i; if (j >= z_Maxnum_LCD_Factors)
return -1; } return 0; }


Whatever the style or format, it makes no difference to the compiler. Often, those who pay for source code don't care either. Where to settle on the axis between total freedom and strict rules is a personal choice. Yes, one may choose to have lines a thousand characters wide, indent 1 space or 50, put comments in arbitrary locations, or omit them altogether. Vettrasoft policy is close to the side of stringent control. There are good reasons for that, which should be obvious to the intelligent, experienced software engineer.

Purpose

The C++ development standards outlined here can be used by all software developers using C++. This document describes how our software is organized and written, which may help developers find things. For anyone working for Vettrasoft, submitting source code, partnering, or simply following the Vettrasoft way, we offer this document as a guide to writing the associated source code.

These standards are intended to be reasonable and common sense rules. As part of the rules of organization, Z Directory code should have a common look and feel. This helps simplify maintenance and eases readability.

We hope programmers who are faced with reading this document approach the task of writing software as an art form. As art, software should be written carefully, with great precision, and with the goal of making it as attractive as possible. It is easy to make code ugly. The opposite is not easy, and requires skill, dedication, experience, and the other ingredients found in a good artist.

note: Also of interest are Google's standards for writing C++ code. We are currently looking into them and comparing them with our own.

File Organization

Ordering of Items in Files

A C++ object should generally be contained in 1 file. This is not a strict rule, since big, complex objects may span several files. The file should be broken down into the following sections, in this order:

HEADER                  big comment block for the entire file
INCLUDES                a short section of "#include" files
DEFINES                 macros, definitions, constants
STRUCTURES              struct's, class declarations
function declarations   forward declarations of functions used in the file
function definitions    the guts of the source code
test-driver section     a block of code that provides a sample of how the
                        code is used, and/or validation routines that validate
                        that the code operates according to specifications
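
As a rough sketch of this layout, the following shows one small file with each section in order. All names here (box.cpp, box_o, BOX_MAX_SIDE, box_area, and the BOX_TEST_DRIVER macro) are invented for illustration and are not part of the Z Directory; guarding the test driver with a macro is likewise only one possible arrangement.

```cpp
//======================================================================
// file: box.cpp -- sample file layout (illustrative sketch only)
//======================================================================

// --INCLUDES--
#include <cassert>
#include <cstdio>

// --DEFINES--
#define BOX_MAX_SIDE 1000               // hypothetical limit for the sketch

// --STRUCTURES--
class box_o                             // hypothetical class
{
public:
    box_o (int w, int h) : m_width (w), m_height (h) { }

    int width() const  { return m_width; }
    int height() const { return m_height; }

private:
    int m_width;
    int m_height;
};

// --function declarations--
int box_area (const box_o &b);

// --function definitions--

//----------------------------------------------------------------------
// box_area -- compute the area of a box
//
// RETURNS
//    >= 0 -- the area
//      -1 -- error (a side is negative or exceeds BOX_MAX_SIDE)
//----------------------------------------------------------------------
int box_area (const box_o &b)
{
    if (b.width() < 0 || b.height() < 0)
        return -1;

    if (b.width() > BOX_MAX_SIDE || b.height() > BOX_MAX_SIDE)
        return -1;

    return b.width() * b.height();
}

// --test-driver section (compiled only when BOX_TEST_DRIVER is defined)--
#ifdef BOX_TEST_DRIVER
int main()
{
    box_o b (3, 4);
    int area = box_area (b);

    assert (area == 12);
    std::printf ("area = %d\n", area);
    return 0;
}
#endif
```

Compiling with the (assumed) macro defined, for example by passing -DBOX_TEST_DRIVER to the compiler, turns the file into a small stand-alone program; without it, only the class and function are built.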

File Naming

Source file names are made up of a base name followed by the suffix ".c" or ".cpp". We have been moving to ".cpp" since porting the code to the Microsoft environment. Source file names will usually be all-lowercase, with "_" judiciously placed where 2 words strongly need a separator. Acronyms may be added, usually at the beginning of the name, and in caps.

A top-level makefile, or one specific to a program that resides in the same directory as the program, will be named "Makefile". Each source code directory should have a "Makefile" which is linked to the "Makefile" in the parent directory, so that all package directories share the top-level "Makefile". Z-dir leaf-node directories will not have a makefile. A Z-dir level directory will contain a makefile. The file name has the form "Make".

Line Width

Source files will have a width not greater than 80 characters. Actually, they should not exceed 79 characters (to accommodate odd terminals that wrap around on the 80th character). This will allow source listings to fit on a page or terminal screen without dropping characters.

Indentation

Structural indentation will be used to improve readability and clarity. Each logical indentation block should be indented four (4) spaces relative to the prior block. Indentation should be done with blank characters (hex 0x20). Programmers should not use tabs in source code files (.h, .c, .cpp, .sh, .ksh, .pl, etc). Please set Visual Studio or other development tools to indent 4 spaces and to convert tabs to blank characters. Also, do not leave trailing blanks or tabs at the end of a line.

Language Issues

Compound Statements

The braces and indentation of compound statements will be as follows. The opening curly brace '{' shall be placed on the line following the 'if', 'while', or 'for' statement. Indent the opening and closing braces to the same column as the statement. The following is a correctly formatted example.
        if (...)
        {
            ...
        }
    


If there is only a single line in the block, the curly braces, '{' and '}', need not be used. Do not merge everything onto 1 line where there is an 'if' statement:
        if (xenv.area & MASK)
            create_tables (people_list, &ie);
    
NOT:
        if (xenv.area & MASK) create_tables (people_list, &ie);
    
Keeping the statement on its own line is also useful for setting debugger breakpoints.
The following style is not recommended:
        if (name == "business")
        {
            // code
        }
        if (name == "person")
        {
            // more code
        }
    


Where there is a sequence of 'if()' statements that resemble a switch, separate the statements with "else"s. The following is better:
        if (name == "business")
        {
            // code
        }
        else if (name == "person")
        {
            // more code
        }
    


Arguments of a Function and Function Calls

The arguments of method and function calls should be separated by a comma and a space. There should be a space between the function name and the opening parenthesis, as follows:
        calculator.reset (x1, y1, x2, y2);
    
This is not always required. Macros, and very short or very long lines, can drop the space:
        if (max(x,y))
            // ...
    
If the first line of a function definition exceeds 79 characters, the line should be broken up in one of the following ways:
    return_type function_name (type_0 first_arg, type_1 second_arg,
                type_2 third_arg, type_3 fourth_arg)
OR
    return_type a_very_long_function_name
            (type_0 first_arg, type_1 second_arg, type_2 third_arg)


Preprocessor Directives

Preprocessor directives (those lines that start with a '#') should always have the '#' at the very start of a line, in column 1. This is because they should rarely appear in source code, and when they do, they should visually jump out. They are usually found near the top of a file. The body statements of nested preprocessor directives can be indented by 2 spaces per nesting level, like so:
#if zos_BSD || zos_SysV || zos_linux
#  include 
#  include 
#  include 
#  if defined(zos_solaris) || defined(zos_freeBSD)
#    include 
#  endif
#endif
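
A self-contained sketch of the same layout follows. The macros tested, the headers included, the SKETCH_FAMILY macro, and the function platform_family() are all choices made for this illustration only; they are not the ones the Z Directory uses.

```cpp
#include <cassert>
#include <cstring>

#if defined(__unix__) || defined(__APPLE__)
#  include <unistd.h>
#  if defined(__linux__)
#    include <sys/types.h>
#  endif
#  define SKETCH_FAMILY "unix"
#else
#  define SKETCH_FAMILY "other"
#endif

// --report which branch of the directives above was taken--
const char *platform_family()
{
    const char *family = SKETCH_FAMILY;

    assert (std::strlen (family) > 0);
    return family;
}
```

Every '#' stays in column 1; only the directive name moves right, by 2 spaces per nesting level.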


The 'Return' keyword

In the Z Directory, you will usually see the return value wrapped in parentheses:
    z_finish();
    return (0);                 // the old way!
Vettrasoft will be moving towards a free-air return parameter, like so:
    z_finish();
    return 0;                   // the new way
As Google's coding standards point out: "Do not needlessly surround the return expression with parentheses". You wouldn't write "var = (value);".

Naming Conventions

Class and function names are almost always in lower case. Class names end in "_o". The following are some examples:
    curator_o
    string_buffer_o

Sample member function names:

    int buffer_length()
    void draw_box (int x, int y)

Class data members shall begin with an "m" and underscore. The reason for this convention is to alleviate the problem of name space conflicts between the data members, argument lists, member functions, and external macros. Another reason is to help identify if a name belongs to the current class context, without resorting to the use of the "this" pointer.

    class student_o
    {
    private:
        string_o m_name;
        int m_id;
    };

Local variable names in a method should be descriptive, lower case, and have words separated with underscores. For some reason, upper-case lettering seems to have won elsewhere as the way to distinguish words:

   DWORD dwiInitializeVectorOtherNetworkByteOrder = 0;
vs:
   size_t initialize_array_network_byte_order = 0;
Vettrasoft considers UsingCaSeToDistinguishWords hard on the eyes and avoids it. Other misguided souls (or perhaps the same people) don't know how long a variable name should be. Here is a (real life) example of excess:
bcopy(putoutEncryptedDataPtr, &dwInitializeFinalVectorLengthNetworkByteOrder, sizeof(DWORD));
bzero(outputLocalEncryptedDataPtr + dwInitializeTotalLengthVector, &dwNetworkByteOrderOutBufferLength);
if (!GetRandomhashCryptographyProviderHandle(&mdata_CryptographyProviderHandle))
{
    // [this code monkey is out of control] ...
}
Besides being impossible to read (one may wonder why the author did not take it the next step: encrypting and uuencoding his variable names, then inserting the new names in the source), it becomes harder to have a line of code do something, as the names take up all the space and begin to crowd out anything else.

Of course, there are those who go to the other extreme. Another fallacy is that computer science is a part of mathematics, hence variable names should be single letters:
    count_t i;
    double k = 1.0 / (1.0 + 0.2316419 * fabs(x));
    double m, t, s = 0.0;
    s +=  0.319382 * (t=k);
    double b = z_mean_avg(x, n);
    for (i = 0, m = 0.0; i < n; i ++)
        m += (x[i] - b) * (x[i] - b);
One may wonder what those people do when their programs require more than 20-30 variables.
Here are a couple of nice variable names:
    int box_length;
    string_o student_name;

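Do not declare variables inside loops, including inside 'for' statements. This can be dangerous on some compilers, for subtle reasons: older, pre-standard compilers, for example, let a variable declared in a 'for' statement remain visible after the loop ends, so the same source can behave differently from one compiler to the next. A 'for' statement should be as follows: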

    int i;
    for (i = 0; i < my_vector.length(); i++)
    {
        ...
    }

NOT:
    for (int i = 0; i < my_vector.length(); i++)
    {
        ...
    }
Also, avoid declaring variables inside switch statements. There are times (particularly when goto's are involved) when such variables fail to get initialized.
DO LIKE SO:
    void function (int value)
    {
        string_o s, *me = new string_o ("& you");
        switch (value)
        {
        case 2:
            if (all_fails)
                goto cleanup;
            break;
        case 1:
            s = "hello";
            break;
        default:
            break;
        }

    cleanup:
        delete me;
    }
NOT:
    void function (int value)
    {
        string_o *me = new string_o ("& you");
        switch (value)
        {
        case 2:
            if (all_fails)
                goto cleanup;
            break;
        case 1:
            string_o s ("hello");
            break;
        default:
            break;
        }

    cleanup:
        delete me;
    }

Constants should use upper-case. For example:

    static const double MY_PI = 3.14159265;


Class data members should not be declared as public. Instead, make a function that returns the value of the class member. This upholds the class's encapsulation and makes the class easier to maintain.
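
A minimal sketch of this rule follows. The class thermostat_o, its members, and the function thermostat_demo() are hypothetical, and std::string stands in for the Z Directory's string_o so that the example compiles on its own.

```cpp
#include <cassert>
#include <string>

class thermostat_o                      // hypothetical class, for illustration
{
public:
    thermostat_o (const std::string &room, int target)
        : m_room (room), m_target (target) { }

    // --read-only access to the data members--
    const std::string &room() const { return m_room; }
    int target() const              { return m_target; }

    // --every change goes through this function, so limits are enforced--
    int set_target (int degrees)
    {
        if (degrees < 5 || degrees > 35)
            return -1;                  // rejected; m_target is unchanged

        m_target = degrees;
        return 0;
    }

private:
    std::string m_room;                 // data members stay private
    int m_target;
};

// --sample usage: returns the target temperature after two attempts--
int thermostat_demo()
{
    thermostat_o t ("lab", 20);
    int rc;

    rc = t.set_target (50);             // out of range, so refused
    assert (rc == -1);

    rc = t.set_target (22);             // in range, so accepted
    assert (rc == 0);

    return t.target();
}
```

Because callers cannot touch m_target directly, the range check in set_target() is the only place that has to change if the limits are ever revised.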

C++ Structural Issues

Some consider pointers to be a thing of the past. Try to use references, when possible:
DO LIKE SO:

    business_o b;
    db_create(b);

    static int db_create (orthodox_o &x)
    {
        // ...
    }
NOT:
    business_o b;
    db_create(&b);

    static int db_create (orthodox_o *px)
    {
        // ...
    }


 

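Avoid global-scope C++ objects.

There are very solid reasons why you should never do this (chief among them: the order in which global objects in different source files get constructed is not defined, so one may be used before it has been constructed):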
string_o global_myname("I am an ignorant code monkey");

int some_function (string_o &s, time_o &when)
{
    // ..
}
The solution can be somewhat tedious. Objects outside of functions (and static variable-objects in functions) should be made pointers, and a 1-time initializer should instantiate them:
string_o *global_var = NULL;

int initializer()
{
    static boolean did_master_init = FALSE;
    if (did_master_init == FALSE)
    {
        if (global_var == NULL)
            global_var = new string_o("I am a software engineer");
        did_master_init = TRUE;
    }

    return 0;
}

Documentation

Comments in Source Code

Comments should describe what is happening, how it is being done, what parameters mean, which globals are used and any restrictions. You should comment profusely: all thoughts, reasons why the code was written in such manner, meanings of variables, etc. Thoughts about what is going on, alternative implementations, pitfalls, limitations, and problems with the code should be jotted down in the code as comments while the thought is being created. Of course, this slows down the entire process of creating code, but the future benefits will outweigh the immediate slowdown. With practice (perhaps a lot of practice), you should find your code improving after you jot down comments in code. Sometimes the ill-formed ideas of what to implement improve after you've thought things through, having forced yourself to think by writing down the plan. It will be much harder, even impossible, to record the same thoughts later, after they are forgotten.

There are 3 types of comments in a source code file:
  • File header comment block. This is a block at the top of a file. This is generally large - it can potentially span several pages. The comments found here are full-width, i.e., the block spans the width of an 80-character terminal. This header may also contain some SCCS or RCS information, although currently (year 2002) embedding source code control pragmas is not done. The first and last lines of this header are exactly as in the following header:
    //======================================================================
    // file: {filename} -- {short, 1-line description}
    //======================================================================
    


  • Function definition introductory header comments.


  • In-function or in-line snippets. These are intermingled with source code. They could occupy their own lines, or share a line with source code. When such comments start a non-trivial operation they can be of the form:
        // --this is a 1-line comment--
                    
    or, if the comment takes up multiple lines, it should form a "box", like so:
        //..........................................................
        // this is a multi-line comment..
        // that's all!
        //..........................................................
    
    In the latter case, dots are to be used (for readability), and the last dot is 1 character short of where an 8-char tab item would start.


In the case of short intra-line comments, they should line up. Start them where an 8-char tab item would start. Which tab stop is not important, so long as it is aesthetically pleasing:
int pktbuff_o::add_string (string_o &s, int *p_ie, boolean nd)
{
    string_o swork, swork2;                     // holds copy of input string
    if (nd == TRUE)                             // does user want to copy it?
        swork = s;                              // yes, copy entire buffer

    string_o &rs_incoming = (nd) ? swork : s;   // and set up a reference to it

    if (mp_pkt == NULL)                         // is this object initted?
        { *p_ie = 1; return (-1); }             // no, don't know packet type

    // ...
    r_pkt.write (s, pkt.data(), nb, &ie);
    if (ie) return -1;                  // convert input buffer to string
    m_buffer += s;                      // concatenate to internal string buff
    i ++;                               // and increment internal msg counter


Either the "//" style comments or the "/* */" style comments can be used.

Class Documentation

Source-code control tags, such as "@see", "@version", and "@author" may be introduced in the future. "@version" references SCCS-ID commands to insert filename and version number, if we decide to use SCCS. I will add more to this section after that decision is made.

Function Documentation

Public member functions will be documented using the following conventions. These tags will be used:
  • SYNOPSIS -- quick top-level description
  • DESCRIPTION -- all the details; can be big
  • INPUT -- a list of input variables
  • OUTPUT -- a list of output variables
  • BUGS -- known defects
  • EXAMPLE -- sample usage(s)
  • SIDE EFFECTS -- globals, internal vars, state changes set
  • RETURNS -- a list of descriptions re. values returned
  • HISTORY -- contains concise descriptions of what transpired


Which tags are to be included is discretionary. They are to follow the ordering listed above.
The following is an example of member function documentation:
    //----------------------------------------------------------------------
    // myclass_o::PARENT -- gets the parent of the specified name.
    //
    // SYNOPSIS
    //
    // DESCRIPTION
    //
    // RETURNS
    //       -- a name
    //
    // HISTORY
    //      Tue 02/22/2011: did a bug fix {--Author's Initials}
    //----------------------------------------------------------------------
    string_o myclass_o::parent (string_o &name)
    {
    }

Error Handling

We DO allow C++ try/catch exceptions, though much of the earlier code was built without this C++ mechanism. Almost all functions that return a variable of type "int" return an error code. Vettrasoft follows Unix standards: 0 indicates success, and -1 indicates failure.

The following is an example of a function with typical error handling:
    class myclass_o
    {
    public:
        int create_table (string_o &, float, int * = NULL);
    };

    int myclass_o::create_table (string_o &arg, float f, int *pexi)
    {
        int ie;                                 // internal error variable
        int *pie = (pexi != NULL) ? pexi : &ie; // caller's variable, or ours
        *pie = 0;

        // ...
        return 0;
    }
The caller of this function can supply a parameter (which is almost always the last parameter of the function's argument list) to get a code that provides more information about the error. It should be made optional. This lets the programmer call the member function with or without an int pointer, allowing for 2 coding styles:
  • fewer parameters (simpler), but less information can be obtained if an error occurs
  • more cluttered, but with more access to details about errors.
The value of the code for the output error parameter is set by the person that created the function. There are no standards as to how to set this value, except that if no error occurred, it should be set to 0. If the function's client does not supply an integer to receive the error code, e.g.:
    int main ()
    {
        myclass_o mc;
        string_o s;
        float f0 = 2.71828;
        int ie = mc.create_table (s, f0);
    }
then the error code that would normally be supplied to the output parameter ("pexi") is received by the internal stack variable "ie" (inside function "myclass_o::create_table()"), and is discarded. Since it is unknown whether the pointer has been provided or not, it is suggested that the programmer set up an alternate, internal error variable, and a second pointer. This other pointer points to either the internal or external error variable, depending on whether the caller-client explicitly provided his or her own variable.
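
The pattern described above can be sketched as a complete, compilable example. This is only an illustration: the class table_maker_o, the body of its create_table(), the function create_table_demo(), and the error code values 1 and 2 are invented here, and std::string replaces string_o so that the sketch has no dependency on the z library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class table_maker_o                     // hypothetical class
{
public:
    int create_table (const std::string &name, float scale,
                      int *pexi = NULL);
};

//----------------------------------------------------------------------
// table_maker_o::CREATE_TABLE -- pretend to create a table
//
// RETURNS
//       0 -- success
//      -1 -- error; *pexi (if supplied) is 1 for an empty name,
//            2 for a non-positive scale
//----------------------------------------------------------------------
int table_maker_o::create_table (const std::string &name, float scale,
                                 int *pexi)
{
    int ie;                             // internal error variable
    int *pie = (pexi != NULL) ? pexi : &ie;     // caller's, or our own
    *pie = 0;                           // 0 means "no error"

    if (name.empty())
    {
        *pie = 1;
        return -1;
    }

    if (scale <= 0.0f)
    {
        *pie = 2;
        return -1;
    }

    return 0;
}

// --sample usage showing both calling styles--
int create_table_demo()
{
    table_maker_o tm;
    int ie = 0;
    int rc;

    rc = tm.create_table ("people", 1.5f);      // simple style
    assert (rc == 0);

    rc = tm.create_table ("", 1.5f, &ie);       // detailed style
    assert (rc == -1);

    return ie;                          // 1: the name was empty
}
```

In the first call the detail code lands in the function's own "ie" and is discarded; in the second it is written to the caller's variable.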

Another anticipated method of error-handling is to use an "error object" when doing a throw (inside a try/catch construction). The z library has just such an "error_o" [object].
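
As a rough illustration of that approach, the sketch below throws and catches a small error class. The interface of the z library's "error_o" is not described in this document, so the class sketch_error_o, its members, and the functions validate_percent() and check_percent() are stand-ins invented for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class sketch_error_o                    // stand-in, not the real error_o
{
public:
    sketch_error_o (int code, const std::string &text)
        : m_code (code), m_text (text) { }

    int code() const                { return m_code; }
    const std::string &text() const { return m_text; }

private:
    int m_code;
    std::string m_text;
};

// --returns 'value' if it lies in [0..100]; otherwise throws--
int validate_percent (int value)
{
    if (value < 0)
        throw sketch_error_o (1, "value is negative");
    else if (value > 100)
        throw sketch_error_o (2, "value exceeds 100");

    return value;
}

// --wrapper that maps the exception back to the 0 / -1 convention--
int check_percent (int value, int *pexi = NULL)
{
    int ie;
    int *pie = (pexi != NULL) ? pexi : &ie;
    *pie = 0;

    try
    {
        validate_percent (value);
    }
    catch (const sketch_error_o &e)
    {
        *pie = e.code();                // pass the detail code to the caller
        return -1;
    }

    return 0;
}
```

The wrapper shows one way the two styles could coexist: code that throws internally can still present the 0 / -1 return convention to older callers.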


This section will receive more documentation later.

Other Notes.

Submit any comments to: support@vettrasoft.com

Document Revision History:

Tue 06/25/2002: updated
Tue 02/22/2011: document page reformatted {--GeG}