About Software (in really simple language)

a eulogy on software and the Z Directory, for non-programmers



Overview



Humans and computers are not compatible. This may seem like a pointlessly obvious statement, but it applies in many non-obvious contexts. Most of us expect that the more time and effort (and bodies) are applied to a problem, the closer the solution gets and the nearer the job comes to completion. With computers, the opposite can (and often will) happen. Many software systems grow and grow as effort is applied to them, and unless the architecture is carefully planned, the system turns into an unmanageable mess with cancer-like, out-of-control growth. Most of the time a system is created under duress: time and budget constraints prevent those building it from engineering it optimally, while shortcuts and patches undermine the overall health of its design and architecture. 'What it takes' to create quality software does not fit into the human experience. Though made by humans, "the right stuff" for software defies the logic of humans.

One common mistake is the attitude that more is better. Surely a squad of 100 programmers can create a better system than a tight team of 5 focused software engineers? Learning to write truly good code is an evolutionary path of experience. At first, you may think that the more you write, the better a programmer you are. The old IBM "KLOC" (1,000 Lines Of Code) philosophy is: the bigger the better (a traditional American attitude). This simply doesn't work, because the bigger software is, the more there is to learn about it, and so the more difficult it is to explain it, remember it, document it, or have others maintain it. Eventually you will learn that you were wrong to take pride in churning out volumes of code. What you have done is pollute: you have only made a lot of garbage.



Enter OOP



Object-Oriented Programming, or OOP, truly took programming to a higher level. Without it, code consists basically of a lot of functions. This is the case regardless of the [non-OOP] language, whether it be BASIC, COBOL, RPG, Pascal, C or one of the many other choices. When software consists of a bunch of functions (aka subroutines; whatever you prefer to call these blocks of code), its care and feeding increases linearly with the amount of code. I find that after about 25,000 LOC (Lines of Code) most software becomes very hard to manage. OOP, on the other hand, almost forces you to think small. It gives you a tool to pull together a massive amount of functionality in one line, by packaging up groups of software into "objects". What exactly is an object? It can be anything you define it to be. I am not trying to confuse you, but software is a very synthetic creation that lets you define your own realities.

For instance, you can create and initialize [the software representation of] a hospital or a government in a few lines of code:
Hospital h;
Government g;

h.run();
g.control(h);
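
Those four lines only work because somebody has already written the Hospital and Government classes. The following is a minimal sketch of what such definitions might look like. Everything inside the classes (the members running_ and controlled_, the helper functions is_running() and controlled_count(), and what run() and control() actually do) is an assumption made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class: a real hospital object would hide far more detail
// (staff, patients, billing) behind the same small interface.
class Hospital {
public:
    void run() { running_ = true; }               // start operations
    bool is_running() const { return running_; }
private:
    bool running_ = false;
};

// Hypothetical class: remembers which hospitals it oversees.
class Government {
public:
    void control(Hospital& h) { controlled_.push_back(&h); }
    std::size_t controlled_count() const { return controlled_.size(); }
private:
    std::vector<const Hospital*> controlled_;     // non-owning pointers
};
```

With these definitions in place, the four lines above compile inside any function. The point is that the caller never sees the internals; all of the complexity stays inside the two classes.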




Why OOP?



A good analogy for how OOP affects software source code can be found in the automotive industry: a carburetor or an engine can be a very complicated device, but nevertheless it is contained in a single metal box (of sorts). The complexity of software, which before was spread out over many, many subroutines, is now hidden away in "objects". One needs to go through the pain of working in a pre-OOP language on big projects to fully understand the benefits of OOP. And, if OOP is used correctly, the overall code size should be relatively small. Compactness is important from the engineering perspective for many reasons: it makes things 'easier to find' in the source code; it reduces the amount of associated documentation required and the overhead for a new person to learn the system; and the program consumes less memory. If you think that making things "easier to find" is not an engineering issue, and that the programmer should just "deal with it", read on.

As to which OOP language, suggesting one or another often results in a passionate debate. Although Smalltalk existed for a decade before C++, it was the C++ computer language that really changed things in a big way.

Going back to software and its intractability for humans: if you approach creating computer programs to solve one particular real-world problem, you will create more problems for yourself over time. First of all, it requires a huge amount of work to create software in the best possible manner. This is mainly because (a) writing code, line by line, is very labor intensive; and (b) as you create software for a particular context (say you want it to run on X Windows, or Microsoft Windows, or perhaps via a command line) and then try to expand that context, you need to continuously restructure the software so that it becomes more universal and does not depend on the context you are used to.
Some quick examples:
  • You use a specific database (Informix, or Oracle, or MySQL - whatever) and you embed the SQL code in the application. If you want to make your code work for another database (or even universally, for most any database) you add blocks of code to accommodate each new database. Although SQL is supposed to be a standardized language across databases, the truth is that it is not. Every vendor adds some adjustments or extensions. There is no sheriff out there to tell them "hold on buddy, that's a violation of the standard". Adding a unique block of code for each newly supported database (or other item, such as an operating system or window system) will quickly degrade the program: it bloats up with additional code that often replicates the prior blocks with only a few lines changed.


  • You create and market a word processing program (like Microsoft Word). The code uses MFC and is interspersed with GUI calls (functions that interact with the window-display system). Now you want a version that runs on Linux with X Windows. You copy the entire source code wholesale, then begin editing the new code set from the top down to make it work on the new OS and window system. You end up supporting 2 entirely separate pieces of code, even though a lot of the formatting, text search operations and other things consist of the same code in both environments.


  • You are building a web site. It has a "contact-us" page that requires you to enter a name and e-mail address. You design a "contact" database table consisting of the fields on that page: name, e-mail address, subject line, message text, and maybe a few other supporting fields. You get it all working, then proceed to build a page for downloading software. As before, you make a download-request database table which includes the fields for name and e-mail address. Then you realize that those 2 items don't logically belong to a web-mail message or to a download request, but represent traits of a user (aka client, person, or whatever you want to call the person that accesses things in your web site). You have already built code that fully works for sending a message or doing a download. What do you do? The choices are (a) to build on the existing system, or (b) to re-structure the database schema and re-write the corresponding code that applies to it. If your choice was unquestionably (b), you are a software engineer and you sacrifice time in exchange for quality of the system. If you always choose (a) you are a code monkey. One thing of note: the rewrite that option (b) entails will be completely invisible to the end user. By contrast, if you replace a run-of-the-mill Chevy 350 engine with a top-of-the-line Corvette Z-type 350 engine, you can immediately demonstrate the benefit by showing that the improved car can go 50 MPH faster. In software, the disadvantage of continuing with the incorrect (in this case downright crazy) design shows up only gradually over time. This philosophy doesn't mesh too well with the short-term focus on quarterly earnings targets at most American corporations.
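
Option (b) from the web site example can be sketched in a few lines of C++. This is only an illustration under assumed names: the record types User, ContactMessage and DownloadRequest, their fields, and the find_user() helper are all hypothetical, and a real system would keep these records in database tables rather than in memory.

```cpp
#include <string>
#include <vector>

// Name and e-mail address live in exactly one place: the user record.
struct User {
    int id;
    std::string name;
    std::string email;
};

// A contact message refers to its sender by id instead of copying the fields.
struct ContactMessage {
    int user_id;
    std::string subject;
    std::string body;
};

// A download request refers to the same kind of user record.
struct DownloadRequest {
    int user_id;
    std::string file_name;
};

// Look up a user by id; returns nullptr when no such user exists.
inline const User* find_user(const std::vector<User>& users, int id) {
    for (const User& u : users) {
        if (u.id == id) return &u;
    }
    return nullptr;
}
```

With this layout, correcting a user's e-mail address is one change in one record, and any later feature (a newsletter sign-up, say) can refer to the same user without growing a third copy of the name and e-mail fields.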




Man vs. Machine



Usually, people facing such problems are in a hurry to get the problem at hand solved as quickly as possible via computer software. More hacks and shortcuts are applied to the software system, which ends up actually polluting the software. Think of using opiate narcotics to reduce pain, which results in doing more harm to the organism over time. By the way, the last (web site) example was something that actually occurred while creating the Vettrasoft web site.

"Non-computer people" - say, the manager or business owner who doesn't understand the process of writing code - don't really have a metric to estimate the progress or quality of a software project. Such a person will probably end up relying on superficial appearances: the programmer is putting in long hours, or appears to be sincere in trying to solve the problems at hand. This is often doubly frustrating for the non-programmer manager, as the arguments for the additional time, effort or money that the engineers say are required sound plausible, but it never seems to end. One may end up wondering if they are concocting arguments for job security, or come up with other conspiracy theories. Many organizations and businesses have been shot down by failed software. History is littered with examples of huge amounts of money being wasted in this regard. Why? Because few understand what software is.



Opportunities for Huckster Software



This opens up the floodgates for pundits and false prophets. Various products are touted as the end-all solution to the software problem. Maybe using PowerBuilder, or switching to Oracle, or doing a design in UML, or running the code through a memory leak checker will solve the budget overruns. Use AJAX and the program will be done in half the time. Throw in some XML; that will make it high-tech. Still behind? Buy a CRM system. That didn't solve our problems? Add in a document management system.

This search-for-the-holy-grail approach opens up opportunities for "shyster engineering": come up with a 'new technology', give it a fancy buzzword, present scholarly lectures by professorial-looking types, back it up with a team of muscular, well-polished salesmen, and the cycle repeats. These lies are successful as long as people don't understand the fundamentals of the problem. Professional software systems cannot be written by hand. It simply takes too much work. And, as the size of the software system increases, the time and effort required (and hence, cost) to work on it - whether to maintain it, improve it, or expand on it - increases not linearly, but exponentially.



The [True] Future of Computing



One day (say in a few hundred years, maybe by 2250) the art of writing software will no doubt be unrecognizable from its current form. You won't be writing loops and incrementing variables (if you live that long), or worrying about incomplete if statements. Perhaps you'll be assembling very powerful software "objects", whose behaviours and interfaces have been completely standardized for a long time. If that is so, the key to the future of software - and the best approach to current software development - is not finding a better integrated development environment, or switching to the latest [fad] language, or learning a new protocol (whether it be XML, SNMP, DOM, ..). The key is to build your code on top of (i.e., by using) powerful object libraries.



Why Vettrasoft is the Future of Software



The approach of the Z Directory is the object library: modular components. It is the intent of Vettrasoft to provide massively powerful software objects that you can play with like building blocks, assembling and mixing them however you want. Instead of writing code to dissect sentences or chapters from the text of a book, you would be able to apply the "book object" to a mass of text and generate the index, table of contents, chapters, search patterns, etc. Vettrasoft does not give you a massive, singular [overpriced] system that replaces programming with configuration. Rather, it provides you with building blocks that you assemble via traditional programming techniques.
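
To give a feel for the "book object" idea, here is a toy sketch. It is not the Z Directory's actual interface: the class name Book, the method table_of_contents(), and the rule that a chapter begins at any line starting with "Chapter " are all assumptions invented for this illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical "book object": hand it a mass of text, ask it for structure.
class Book {
public:
    explicit Book(const std::string& text) {
        std::istringstream in(text);
        std::string line;
        while (std::getline(in, line)) {
            // Assumption: a chapter heading is a line that begins with "Chapter ".
            if (line.rfind("Chapter ", 0) == 0) {
                titles_.push_back(line);
            }
        }
    }

    // The table of contents is simply the chapter heading lines, in order.
    const std::vector<std::string>& table_of_contents() const { return titles_; }

private:
    std::vector<std::string> titles_;
};
```

The caller writes one line to construct the object and one line to ask for the table of contents; the text-dissecting loop is written once, inside the object, and never again in application code.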

One philosophy of Vettrasoft is to free your program from being tied to a particular vendor. That is, you won't be restricted to, say, just the Microsoft operating system, or to the Oracle database. On the other hand, it means you are tied to Vettrasoft. Unfortunately, like politics or war, you have to choose sides in the land of computers. Another approach is to use a variety of vendors in your application - say you want to use STL for your containers, Rogue Wave for your strings, somebody's proprietary memory allocation system, and MySQL or Sybase's SQL Anywhere for your database. The mix-and-match approach isn't quite the same as "system integration", though the idea is similar. We are still discussing code writing. Then there is the problem of getting it all to work together (and you might not be able to!).

A real-world example of this issue involved a particular vendor's library. Somewhere deep inside, it threw an exception. This exception could not be caught - but only when linking to an Oracle back-end on the IBM AIX operating system. This combination was crucial to our protagonist, and many weeks were spent trying to resolve this bug, which caused the application program to crash [and burn].



Z Directory: Pluses and Minuses



On a typical Microsoft development computer at Vettrasoft, running "Windows Server 2003 R2", are installed components such as:
  • Microsoft .NET Compact Framework: "version 1.0 SP3 Developer", "2.0"
  • Microsoft .NET Framework: 2.0 Service Pack 2
  • Microsoft .NET Framework: 2.0 Service Pack 3
  • Microsoft .NET Framework 3.5 SP1
  • Microsoft .NET Framework 4 Client Profile
  • Microsoft SQL Server Backward compatibility
  • Microsoft Visual C++ 2008 Redistributable 9.0.21022
  • Microsoft Visual J# 2.0 Redistributable Package


What these packages, modules, frameworks, or whatever Microsoft wants to call them, actually do - we can't say. Apparently some programs need a higher version, others need a specific service pack or framework, and still others need a 'backward compatibility hook'.

Before launching into the various aspects and properties of our system, consider this passage, taken from the Wikipedia article on Microsoft Windows PowerShell, discussing the need for shells:
The shell is a command line interpreter .. the shell also includes a scripting language.. which can be used to automate various tasks. However, the shell cannot be used to automate all facets of GUI functionality, in part because command-line equivalents of operations exposed via the graphical interface are limited, and the scripting language is elementary and does not allow the creation of complex scripts.

Microsoft attempted to address some of these shortcomings by introducing the Windows Script Host in 1998 with Windows 98, and its command-line based host: cscript.exe. It integrates with the Active Script engine and allows scripts to be written in compatible languages, such as JScript and VBScript, leveraging the APIs exposed by applications via COM. ... Different versions of Windows provided various special-purpose command line interpreters (such as netsh and WMIC) with their own command sets. None of them were integrated with the command shell; nor were they interoperable.


What this excerpt reveals are design practices abhorrent to the Vettrasoft philosophy: piling on new software to address shortcomings in the existing framework; attempts to connect disparate systems (in this case, the Active Script engine and VBScript); a hodgepodge of versions providing alternate solutions (and hence differing protocols, languages, and APIs, which require learning the 'language' of each version-set); and other problems we'll just bundle into one term: "programming hell".

The Vettrasoft approach is, in a way, an all-in-one solution: the Z Directory provides you with, say 80-90% of the software building blocks you need (and you can link in a few other specialized vendors in the remaining 10-20%). This is the approach SAP's [Germanic mentality] ERP system presents - a monolithic, large system that takes care of all your needs. The main purpose of this approach in the Z Directory is to make sure that all the pieces fit: the single-vendor approach (Vettrasoft) lets us deal with the problems of making sure that everything has good, uniform interfaces and interoperates correctly under all possible environments (which is a lot of work that you don't have to do).

Another disadvantage of isolating your code from the vagaries of its environment is that the "insulator code" is reduced to the lowest common denominator. Say you develop a program that runs only on a Microsoft operating system. In that case, you can take advantage of the "Registry" to store information permanently (like a database). Whereas an SQL-based database requires you to store data as a matrix, i.e. rows and columns, the Microsoft Registry is oriented towards a tree-structured format. But you won't be able to move your program to a Unix system, which doesn't have the Registry. Since the Z Directory proclaims operating system ("OS") independence, it can't utilize the built-in Registry facility provided in the Microsoft OS. Thus, the capabilities provided by the Z Directory are limited to the lowest common denominator: if a database, OS, or transport protocol has a unique feature, that feature cannot be provided by the Z Directory interface, because the Z Directory is designed to work the same across all its supported environments. Another case is the setting of an environment variable. These variables belong to the user, not the computer, on both Microsoft and Unix systems. Setting them persistently (that is, pushing them "up", rather than just into the current environment) is relatively straightforward in Microsoft: it translates to editing the Registry. In Unix, however, it depends on the shell a program runs in. If one sets an environment variable, which shell is affected? Csh? Ksh? All? If so, text files need to be modified; it gets ugly. The quickest solution is to disallow setting an environment variable - a dissatisfying solution. A further complication is that Microsoft has two types of environment variables: the user's, and the system's.

These issues are not insurmountable tragedies. Almost all such 'cutting edge' features can be either implemented internally or emulated. In the Registry case, files and directories in a file system can be used to store data in logically the same (tree-structure) fashion as Microsoft's Registry.
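
As a rough illustration of that last point, the sketch below stores values in a directory tree using the standard C++17 filesystem library. It is not how the Z Directory implements anything; the class name TreeStore, its set() and get() methods, and the convention that a key such as "app/window/width" is a relative path with "/" separators are all assumptions made for this example. Keys are not validated here, which a real implementation would need to do.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <optional>
#include <string>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical tree-structured store: every key maps to a file under root_,
// and the leading parts of the key become nested directories.
class TreeStore {
public:
    explicit TreeStore(fs::path root) : root_(std::move(root)) {}

    // Write (or overwrite) the value stored under the given key.
    void set(const std::string& key, const std::string& value) const {
        const fs::path file = root_ / key;
        fs::create_directories(file.parent_path());
        std::ofstream out(file, std::ios::trunc);
        out << value;
    }

    // Read the value under the key; empty optional if the key holds no value.
    std::optional<std::string> get(const std::string& key) const {
        const fs::path file = root_ / key;
        if (!fs::is_regular_file(file)) return std::nullopt;
        std::ifstream in(file);
        return std::string(std::istreambuf_iterator<char>(in),
                           std::istreambuf_iterator<char>());
    }

private:
    fs::path root_;
};
```

A call such as store.set("app/window/width", "800") creates the directories app and app/window under the root and writes the value into a file named width, which mirrors the key/sub-key/value layout of the Registry while working on any OS that has a file system.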