(Historical note added in 2008: The following article was written in the year 2000 and is for readers who are used to programming but not to OOP. It explains the basics of OOP from a traditional (procedural) programming perspective and without the distracting hype that OOP was being evangelised with at the time I was writing. If you are a recently trained programmer whose first language was an object oriented one like Java, and so naturally think in OOP ways, this article might seem strange or even derogatory to OOP. It might nevertheless be slightly amusing :-) .)
Object Oriented Programming (called 'OOP' for short) is promoted as a radical, difficult to comprehend & even frightening way of programming that is vital to know about. Is this true? No.
Object Oriented Programming is not very radical or very difficult compared to conventional programming. When one looks under the ideology and sees what is actually there, one finds there is not really much different at all!
The reason for it seeming so difficult is that introductions to OOP are normally given by the following types of people, who are not ideal for teaching the basics:
In the following article I will try to explain what OOP is and why it is used but without the customary hype, irrelevant extras & detail (and, hopefully, without the incompetence). I am assuming the reader is familiar with the fundamental concepts of normal programming such as a program being made of 'commands' which tell the computer to do things, 'variables' to store data in and the idea of grouping a set of commands together in 'functions' (or 'subroutines' which are virtually the same) which can be called as needed from different parts of a program. However, the ability to write programs is not required and I won't be giving the detailed syntax for any particular OOP language. If you want that then there is a surfeit of books & web pages to choose from already.
Firstly, some background. A conventional ('procedural') program consists of a sequence of commands. The commands can do input, output, manipulate data and control the order in which the commands are carried out. So as not to have to duplicate commands in different places in a program where the same action needs to be performed, a set of commands can be combined into a 'function' or 'subroutine' which acts like a new command. This same set of commands can then be called from different places in the program. As well as substantially reducing the length of programs, this can make the structure neater by encapsulating the commands which, together, perform some particular operation in one place. The rest of the program need not be concerned about the details of what commands make up the function and can just treat it as something which does that operation. This neatness is not merely an aesthetic feature but a great aid to making programs quicker to write, easier to debug, more reliable & more reusable.
Splitting a program into functions makes programming quicker not only because different programmers may be able to work on separate functions at the same time but, crucially, because it breaks down a huge task, which would be difficult for a human to hold in mind as one piece, into smaller units more suited to human memory. It makes debugging easier because it localises the effect of a single bug so it is easier to track down and, when bugs have been eliminated from a function, one need not waste time rechecking it when debugging other parts of the program. To aid this, it is normal to have the variables a function uses internally hidden from the rest of the program, unless there is a special necessity to reveal them. This ensures that any problem related to such a 'local' variable can be traced to the function which needs correcting. This also makes the program more reliable because, once a function is working, it should stay working as more of the program is built up. Other parts of the program cannot disrupt those hidden variables & commands inside the function. The splitting also, of course, makes subsequent programs quicker to write as whole functions can be reused from earlier programs and even stored in libraries for use by any future program. The generic term for this approach of breaking a big solution into small pieces whose internal workings are not of concern to other pieces is 'modular'.
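For readers who like to see such things concretely, here is a minimal sketch of a function with a hidden local variable, written in C++. The name 'average' and the decision to return zero for an empty list are my own assumptions for the example.

```cpp
#include <vector>

// A function groups the commands that together perform one operation.
// 'total' is a local variable: no other part of the program can see
// or disturb it, so any problem with it must lie inside this function.
double average(const std::vector<double>& values) {
    if (values.empty()) {
        return 0.0;  // assumption for this sketch: an empty list averages to zero
    }
    double total = 0.0;
    for (double value : values) {
        total += value;
    }
    return total / static_cast<double>(values.size());
}
```

Any part of a program can now call `average(...)` as if it were a built-in command, without caring how the sum is accumulated inside.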
Enough about functions. Now for variables. In many computer languages, a programmer can define a compound variable type in addition to those which are ready made in the language. For example, if a particular language has variable types for names & dates, a type for storing birth records could be made by combining two name type variables, for the family name and given names, with a date type variable, for the date of birth. These combined variables are called different things in different languages including 'structures' (C), 'clusters' (LabVIEW) & even just 'types' (Fortran). Combined variable types can be useful. For example, instead of having to work on and pass around three variables together whenever a birth record is used, a single variable of this combined type could be used. Commands & functions which don't need to use all the member variables of a combined variable need not be concerned that they are there. One can even copy a combined variable in one command, rather than one command for each member variable, without knowing or caring what all the member variables are. This applies a modular approach to collections of variables, just as functions do for collections of commands.
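The birth record described above might look something like the following in C++. This is only a sketch: C++ has no ready-made name or date types of the kind imagined above, so 'Date' is a made-up stand-in, and 'BirthRecord', 'full_name' and 'duplicate' are likewise names invented for the example.

```cpp
#include <string>

// Hypothetical date type, standing in for a language's built-in one.
struct Date {
    int year;
    int month;  // 1 to 12
    int day;    // 1 to 31
};

// A combined variable type: two names and a date bundled together.
struct BirthRecord {
    std::string family_name;
    std::string given_names;
    Date date_of_birth;
};

// Uses only the two name members; it need not know the date is there.
std::string full_name(const BirthRecord& record) {
    return record.given_names + " " + record.family_name;
}

// A single assignment copies every member, whatever they happen to be.
BirthRecord duplicate(const BirthRecord& original) {
    BirthRecord copy = original;
    return copy;
}
```

A whole record is passed to `full_name` as one variable, and `duplicate` would not need changing if more member variables were added to the record later.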
That was around for decades, then someone came up with the idea of including functions as well as variables in those combined variable types. Combined variable types could then have member functions as well as member variables. These functions are only really in the program once (it would be very inefficient otherwise) because they are written into the variable type specification, not the individual variables of that type themselves. However, they act as if they were duplicated in each variable of that combined variable type because they, by default, act on the member variables of the particular variable they are called with. For example, one could have included a member function to calculate a person's age in that birth record combined variable type. When a particular variable of that type, storing a particular given name, family name & date of birth, has its age calculating member function called, the function will automatically use the member variables of that particular variable, not the generic variable type, to calculate an age. This can be quite handy because it neatly bundles data & the functions which act on it together. It can also aid conceptualising a program because, in calling a member function, one is effectively telling the data what to do to itself which is, in some situations, closer to reality than giving the data to a command to process.
Now you have understood that, we can go on to Object Oriented Programming at last. Correction: if you have understood that then you have understood Object Oriented Programming! That idea in the preceding paragraph of putting functions into combined variable types is Object Oriented Programming! Does it not sound dramatic enough? Okay, let's put some hype in: rename 'member functions' to 'methods'; rename 'combined variable types' to 'classes'; and rename 'variables of combined types' to 'objects'. That's all it is!
There are a few useful extras which normally come with OOP. They can mostly exist in procedural programming languages as well so they are not necessarily OOP features but they are ubiquitous in OOP languages so I suppose I ought to mention them. You can skip this section if it is too detailed.
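A birth record with an age calculating member function could be sketched in C++ roughly as follows. The names 'Date', 'BirthRecord' and 'age_on' are invented for the example, and passing in today's date as a parameter (rather than reading the computer's clock) is my own simplification.

```cpp
#include <string>

// Hypothetical date type, standing in for a language's built-in one.
struct Date {
    int year;
    int month;  // 1 to 12
    int day;    // 1 to 31
};

struct BirthRecord {
    std::string family_name;
    std::string given_names;
    Date date_of_birth;

    // Member function: written once here in the type, but it always works
    // on the member variables of whichever variable it is called on.
    int age_on(const Date& today) const {
        int age = today.year - date_of_birth.year;
        bool birthday_still_to_come =
            today.month < date_of_birth.month ||
            (today.month == date_of_birth.month && today.day < date_of_birth.day);
        if (birthday_still_to_come) {
            age -= 1;
        }
        return age;
    }
};
```

Given two variables `alice` and `bob` of this type, `alice.age_on(today)` uses Alice's date of birth and `bob.age_on(today)` uses Bob's, although the function exists only once in the program.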
OOP is good for the same reasons that other modular programming schemes are: aesthetic neatness, quick writing, eased debugging, improved reliability & increased reusability in large programs.
In addition, an object based structure naturally fits certain common uses of computer programs including graphical user interfaces (with buttons, windows, scroll bars as objects) and databases (records as objects).
Of course, all this modular structuring can be done with a classical 'procedural' language (indeed the OOP C++ language was originally implemented as a translator that converted C++ programs into the procedural C language!) but it looks cleaner in OOP because OOP was designed for this structuring. And OOP is fashionable!!
There are drawbacks to OOP as well. It is not the best thing to use in all circumstances. Don't fall into the trap of using it for ideological reasons when it is not the most suitable method for a particular task (and similarly don't dogmatically stick to a single programming language; pick the most suitable one for each job).
For a start, it should be obvious that OOP, or any such heavy programming method, is probably not ideal for small quickly-written one-off programs where the time taken to define a class structure is more than the time you will save by having it neatly modular. Of course, one can use an OOP language for short programs; it is just that structuring your own programs in an OOP fashion, in addition to using the language's own built-in objects, would be inefficient. I don't know what the cross-over point is but I guess it is several hundred lines of program for myself, although I would probably do modular structuring into functions at well below a hundred lines.
Neither is it suitable for very low power computers such as the microcontrollers embedded in consumer products, which often have less than a kilobyte of memory to fit the program into (compared to gigabytes on an office PC) and only a few bytes (compared to megabytes) to store variables in.
There are also some jobs which are naturally "do this ... then this ... then this ... then this ..." tasks, in which case programming them procedurally could well be neater and easier than doing it object oriented. I've found this for simple one-task programs that control mechanical devices or batch process files. They typically read in the parameters, read in data from files, apply a series of manipulations to the data & parameters and output to electronic hardware or to a file, in that order. Sometimes they don't even have conditionals or loops. Object orientation would be an ill-fitting arrangement for these programs. (An interesting aside: often such small programs are called in turn by other small programs and a collection of such programs naturally builds up into an effectively object oriented system, where the little programs act as classes, without any planned intention for them to be so.)
The most serious drawback is one common to all neatly structured modular programming: the division into modules really needs to be decided in advance of programming. In an ideal programming situation this would be the case but, in reality, customers who don't understand programming often change the requirements drastically after programming has been started (or even finished!). Often spec's are changed in a way that looks small from the outside but which means that objects which were built to act totally independently of each other are changed so that they need to control each other directly. The program alterations are then either a time-consuming restructuring of much of the program or messy ad-hoc direct links which wreck the modularity. For example, if a customer asked for a simple product database in which records are only ever set once & recalled one at a time for reading, then the obvious class structure would be a record class for storing the product data and a database class consisting of an array of those product record objects along with a method to add a new record object and a search method that returns a copy of a requested record for reading. If the customer then demands that a product record should link to related products, with the links changeable from the display terminal, then the restructuring will be serious. Not only will it need that extra variable added to the record class (easy) but record objects will need to link to other record objects, which was previously done only via the database object (disrupts the neat modular structure), the database will need to pass around the original record object instead of a copy so it can be altered (lots of changes needed in different places in the program) and a contention-resolution system will be needed to prevent the records now being altered from different parts of the program simultaneously (a difficult and time consuming programming task).
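The original, simple version of that product database might be sketched in C++ like this. It is an illustration only: the class names, the 'name' and 'price' members and the way 'find' reports a missing record are all assumptions of mine, not part of any real customer spec'.

```cpp
#include <string>
#include <vector>

// Hypothetical record class: product data which is set once and then only read.
struct ProductRecord {
    std::string name;
    double price;
};

class ProductDatabase {
public:
    // Adds a new record to the database.
    void add(const ProductRecord& record) {
        records_.push_back(record);
    }

    // Copies the requested record into 'result' and returns true,
    // or returns false (leaving 'result' alone) if there is no such record.
    bool find(const std::string& name, ProductRecord& result) const {
        for (const ProductRecord& record : records_) {
            if (record.name == name) {
                result = record;  // the caller gets a copy, never the original
                return true;
            }
        }
        return false;
    }

private:
    std::vector<ProductRecord> records_;  // hidden from the rest of the program
};
```

Because callers only ever hold copies, nothing outside the database can alter a stored record. That is exactly the neat property the changed spec' would destroy, since editable links between records require handing out the originals.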
The only solution I know of for this is to warn customers that such spec' changing is like asking for different foundations in a house after the walls are built, and to get them to contractually agree to pay for any such alterations they request, but customers don't like that.
The rest of this article was not language specific but one major misconception needs to be cleared up about the language 'C++'.
OOP is most commonly advocated for 'C++'. In part this advocacy is just because 'C++' is the procedural language 'C' with OOP added in later so they make a nice pair of languages to compare & contrast unlike, for example, 'Java' which was OOP from its first creation. However this does not explain the fanatical enthusiasm with which C++ was welcomed. Was the explanation that OOP was so much better than procedural programming? No. It was simply that there were some very useful things absent from the original C language that were either rectified in C++ or could be bodged up with OOP tricks:
- C needs 'abs()' to calculate the absolute (unsigned) value of an integer but 'fabs()' to do the same thing for a floating point number. The compiler should be able to distinguish these itself from the type of parameter. In C++ it can. This is not a specifically OOP feature though.
- In a large C program, a function name like 'IncreaseCount()' could easily be accidentally duplicated. A common solution was to start each function name with the name of the file or module which it was in but that was messy, increased typing & actually broke the official C spec' (which stated compilers could ignore all but the first 6 characters of function names). In C++ there is a neater bodge-up: bunging all the functions from one file or module in a class, even if that class does not have any variables, neatly localises the function names. Once more, this need not have required OOP; for example, in the Perl language one can specify anything to be only locally visible (indeed adding OOP features to Perl required almost no changes to the language, essentially just an alternative syntax which called such localised regions classes!).
- C++ has the '//' command which merely means "ignore everything else on this line"! It is quicker to type than the original C comment markers '/* ... */' which needed one to mark both ends of the stuff to ignore.
The reason for these deficiencies in C is that C was designed for low level fast programs on low power computers, not for database & graphical user interface programs on the far more powerful computers available at present. It is C++'s adaptation of C towards this changed role which gives it its popularity, not so much its OOP nature.
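The three C++ conveniences listed above can be seen together in a short sketch. The function name 'magnitude' and the class names 'Visitors' and 'Errors' are made up for the example, as is what each 'IncreaseCount' actually does.

```cpp
#include <cmath>
#include <cstdlib>

// Overloading: one name, with the compiler picking the version to use
// from the type of the parameter. ('magnitude' is a hypothetical name.)
int magnitude(int value) { return std::abs(value); }
double magnitude(double value) { return std::fabs(value); }

/* An old style C comment has to be marked at both ends like this. */

// A class with no member variables, used purely to localise function names.
class Visitors {
public:
    static int IncreaseCount(int count) { return count + 1; }
};

// The same function name again, but with no clash because it is in another class.
class Errors {
public:
    static int IncreaseCount(int count) { return count + 10; }
};
```

Calling `magnitude(-3)` picks the integer version and `magnitude(-2.5)` the floating point one, while `Visitors::IncreaseCount` and `Errors::IncreaseCount` remain two separate functions. None of this involves any objects being created.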
Object Oriented Programming is not as different from normal procedural programming as is made out by its advocates and is not as difficult to understand as their proselytising implies. It is useful in making big modular programs but such programs should have been structured very similarly to an OOP structure anyway. It can be more hassle than it is worth for short & quick programs. The enthusiasm for C++ in particular is mainly because it adds in some important basic features that were missing from C.