Object orientation (OO) is an object-centered way of looking at the world. The claim behind object orientation is that it lets us use a "natural" model for analysis, design, and implementation of systems. It is "natural" because as humans, we perceive things as "objects." (Unlike the Ferengi, who perceive things as profit centers. ;-) We do this so we can deal with complexity. For example, think about the human body, the atmosphere, and the galaxy. Each of these objects is an enormously complex system, yet we can conceive of each as a single object. This is a highly effective abstraction mechanism. So let us explore the definition of "object" in more detail.
There are several common definitions of the term object:
In the context of object-oriented programming, we usually use the third definition of object. Is everything an object in one of the senses above? Others may say, "No, love is something that is not an object." But I would say, "Yes, everything is an object." However, I'll concede that some objects, such as love, are best considered as relationships between other objects. The problem with OO is that most implementations of it relegate relationships to second-class status -- they get implemented as attributes and sometimes methods (see definitions below), and they get buried. But really OO should recognize that objects and relationships are both primary modeling constructs. Since each relationship is itself an object, it is easy to place relationships in a position of secondary importance.
An object has five characteristics: identity, lifetime, state, behavior, and boundary.
Objects are created, they exist for a space of time, and they are destroyed. This cycle is called an object's lifetime. Programming languages need to deal very carefully with the process of object creation and destruction.
Every object has a state that indicates the object's attributes and relationships to other objects. For example, a blacksmith may be in the state of holding a hammer poised to strike a hot ingot. A telephone operator may be in the state of talking with a customer. A bicycle may be in the state of being red; next week after I paint it, the same bike could be blue. Two people can be in the state of marriage.
An object acts and reacts to its environment in a certain way. This activity is called object behavior. For example, a ball may react to being pushed by rolling away. A person may react to being pushed by pushing back. Some objects are entirely passive -- they do not actively generate new events -- they are simply acted upon by the world around them. Other objects are active -- they cause new things to happen. A baby smiles, evoking a response from a mother. A bomb derails a train. Thus we say that babies and bombs are active objects. As objects behave, they very often interact with other objects. Object interaction is often a major part of an object's behavior.
Each object has a boundary that delimits the extent of the object -- what is part of the object and what is not. In the real world, sometimes object boundaries are fuzzy. For example, where does a bump on a cloud begin, and where does it end? In our software systems, we require that object boundaries be crisp. For instance, a report has a first and a last page. Object boundaries are well-defined and the extent of our software objects is finite.
All of these characteristics combine to help us understand what an object is -- regardless of which definition of "object" we're using. Let us focus on sense #3 now. First, we want to explore a definition of object-oriented programming (OOP). According to Grady Booch, here it is:
"Object-oriented programming is a method of implementation in which programs are organized as cooperative collections of objects, each of which represents an instance of some class, and whose classes are all members of a hierarchy of classes united via inheritance relationships." -- Grady Booch, Object-Oriented Analysis and Design with Applications, 2nd ed., Benjamin-Cummings, 1994, p. 38.
We won't argue with his definition for now -- let's just accept it and focus on what an object is in this context. In object-oriented programming languages (OOPL's), an object has attributes that define its state, and methods that define its behavior.
Attributes are fields or variables associated with an object that define its state. For example, a money object may have one attribute called value and another called monetary_unit. The first attribute describes how much money the object represents, and the second attribute describes the currency unit (e.g., US dollar, British pound) for this money object. Attributes are also sometimes called instance variables because objects are usually created as instances of classes, and attributes are stored in variables associated with an instance.
Methods are pieces of code associated with an object. Methods are invoked in much the same way as functions are invoked. An object method is invoked by some other method in the system. Methods may have arguments and a return value. However, unlike general functions, a method must be invoked for a specific object. For example, we may call the print method for a money object, and this results in the appropriate currency symbol and monetary value being printed. A method can access and update the attributes of the object for which it was invoked. Methods that update an object's attributes are called mutators. Methods that do not modify attributes are called accessors. The act of invoking methods on objects is often described as passing messages to objects. We pass a money object the print message, and the object responds to the message by printing itself. In most OOPL's, message-passing is nothing more than standard method invocation.
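The money object described above might be sketched in Java as follows. This is only an illustration: the text names the attributes value and monetary_unit and the print method, while the class name Money, the add mutator, the format helper, and the choice to store the value as a non-negative count of hundredths are all assumptions made for the example.

```java
// Illustrative sketch only. The attributes "value" and "monetary_unit" and the
// "print" method come from the text; every other name here is an assumption.
class Money {
    private long value;            // assumed: non-negative amount in hundredths (e.g. cents)
    private String monetaryUnit;   // currency code such as "USD" or "GBP"

    Money(long value, String monetaryUnit) {
        this.value = value;
        this.monetaryUnit = monetaryUnit;
    }

    // Accessors: read the object's state without modifying it.
    long getValue() { return value; }

    String getMonetaryUnit() { return monetaryUnit; }

    // Mutator: updates the object's state.
    void add(long amount) { value += amount; }

    // Builds the text that print() writes, e.g. "USD 20.00".
    String format() {
        long hundredths = value % 100;
        return monetaryUnit + " " + (value / 100) + "."
                + (hundredths < 10 ? "0" : "") + hundredths;
    }

    // The "print message": the object prints itself.
    void print() { System.out.println(format()); }
}

class MoneyDemo {
    public static void main(String[] args) {
        Money price = new Money(1999, "USD");
        price.add(1);     // invoke a mutator on this specific object
        price.print();    // prints: USD 20.00
    }
}
```

Note that add and print are always invoked on a specific object (price here), which is what distinguishes a method from a general function.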
Object lifetime is controlled by constructors and destructors in many OOPL's. You create an object by invoking the right sort of constructor, and you destroy an object by calling the appropriate destructor for that object. Some OOPL's destroy objects automatically when they are no longer needed. Other languages require the programmer to invoke object destructors directly to destroy some kinds of objects. For example, in Java you must create objects with the new operator, but you never need to worry about destroying objects. Instead, they are "garbage-collected" whenever it can be proven that they are no longer needed by the program. But in C++, you must always use the delete operator to destroy an object you have created with the new operator. (By the way, constructor/destructor/memory management is a major source of complexity in C++ programs.)
Object boundaries are cleanly delimited in OOPL's according to class definitions. Since objects are created as instances of classes, the class defines which attributes and methods are available for a given object. The other aspect of boundary that crops up in OOPL's is that some attributes and methods, though present, may not be accessible to all other objects in the system. This protection is described more thoroughly in the section on encapsulation.
Object identity has been the topic of much discussion in the OO community, but now it is reasonably well understood. In traditional programming languages, different variables are distinguished by their names or by where they are stored in memory. For example, since x and y are two different names, we can assume that they represent different variables. However, this distinction breaks down in OOPL's, since we can usually assign an object reference to a variable. Because two different variables could hold references to the same object, we cannot use variable names as the distinguishing property. Furthermore, since two objects could have the same values for their attributes, we cannot use the values of those attributes to identify objects. Since OOP must deal with the concept of object identity, some languages have provided separate operators to test equality and identity. For example, in Java when we ask "a == b" we are asking whether the objects to which a and b refer are identical. But if we ask "a.equals(b)" we are testing whether a and b have the same attribute values. "==" is an operator that takes two arguments, but "equals" is a method we invoke on a, passing b as a parameter.
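Here is a small Java sketch of the difference between identity and equality. The Coin class and its fields are hypothetical names invented for the example. One caveat worth knowing: equals compares attribute values only if the class overrides it, as Coin does here; the default equals inherited from Object behaves like ==.

```java
import java.util.Objects;

// Hypothetical class used only to illustrate identity versus equality.
class Coin {
    private final int value;
    private final String unit;

    Coin(int value, String unit) {
        this.value = value;
        this.unit = unit;
    }

    // Equality: two coins are equal when their attribute values match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Coin)) return false;
        Coin that = (Coin) other;
        return value == that.value && unit.equals(that.unit);
    }

    // Objects that are equal must report the same hash code.
    @Override
    public int hashCode() {
        return Objects.hash(value, unit);
    }
}

class IdentityDemo {
    public static void main(String[] args) {
        Coin a = new Coin(25, "USD");
        Coin b = new Coin(25, "USD");   // same attribute values, different object
        Coin c = a;                     // second reference to the same object

        System.out.println(a == b);       // false: two distinct objects
        System.out.println(a.equals(b));  // true: same attribute values
        System.out.println(a == c);       // true: one object, two variable names
    }
}
```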
So why do we want to do object-oriented programming in the first place? The answer is tied up in how we manage complexity in a program. In the 1970's, we learned that there was a better way to program by dividing a problem into smaller functions that could be composed to solve a larger problem. Thus was born structured programming. However, structured programming paid little attention to data, and focused on how the logic of a program was divided up. We'd start with a problem and perform functional decomposition to discover the smaller steps that make up our larger solution. That was great as far as it went, but it didn't go far enough. Object orientation follows the train of thought that was started with the work on abstract data types (ADT's). An ADT is a collection of data fields together with the code needed to manipulate the data. ADT's were created to provide a way to implement user-defined data types that act just like built-in data types such as integer or string. Thus, if you need a complex number type, for example, you could implement it as an ADT, and thus extend the data types your program can use for arithmetic operations. We added a few mechanisms to ADT's and ended up with the concepts of object-oriented programming. Thus, an object-oriented program is not divided into functions like a structured program, but instead is divided into objects. An OO program is expressed not as a collection of interacting functions, but as a collection of interacting objects. An object-oriented program has a form that is more like the real world than does a structured program. An object-oriented program emphasizes its data, and binds functions more closely to the data on which they operate. Instead of decomposing a problem into subfunctions, in OOP we create user-defined types and then build a system of interacting objects from those types to accomplish our intended purpose. The world is beginning to accept OOP as a superior paradigm to the structured approach. 
I see OOP as a natural extension of structured programming.
The concept of "object" allows us to form abstractions that hide a great deal of complexity. For example, we say "human body" and in one phrase -- one object -- hide all the body's subsystems which themselves are extremely complex objects (e.g., circulatory system, nervous system, brain, hand, etc.). This is a powerful form of abstraction, but it is not enough by itself. We also have large numbers and varieties of objects in our world, and we use abstraction techniques to manage quantity and variety as well.
For example, when we say "galaxy" we are talking about billions of unique stars, each with its own peculiar properties. In our natural minds there is no way to comprehend those billions of stars individually. We must conceive of them as a system. We organize these billions of star objects into a class we call "star." Every object in this class shares a set of properties. For example, all stars have mass, luminance, core temperature, surface temperature, composition, date of birth, etc. When we talk about the class "star" we automatically think of an object that radiates light and has the properties mentioned before. Furthermore, we think of the billions and billions of such objects that exist in the universe. So in the concept of class, we are able to abstract variety and quantity. Classification is the process of grouping objects into classes according to the properties they share, as we have described here.
In OOP, class is usually defined as a template used for the creation (or instantiation) of objects. A class serves as a type that indicates what the structure of an object should be. A class defines the attributes and methods that belong to all objects instantiated from the class. Attributes are considered instance variables (variables that correspond to an object instance, or object). Classes may also themselves have attributes (called class variables or class attributes) and methods (class methods).
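To make the distinction between instance variables and class variables concrete, here is a minimal Java sketch built on the star example. The field names, the units, and the starsCreated counter are assumptions made for illustration; in Java, class variables and class methods are marked with the static keyword.

```java
// Illustrative sketch: a class as a template for star objects.
class Star {
    // Class variable: a single copy that belongs to the class itself.
    private static int starsCreated = 0;

    // Instance variables: every Star object gets its own copies.
    private final double massKg;
    private final double surfaceTemperatureK;

    Star(double massKg, double surfaceTemperatureK) {
        this.massKg = massKg;
        this.surfaceTemperatureK = surfaceTemperatureK;
        starsCreated++;   // update the shared class-level count
    }

    double getMassKg() { return massKg; }

    double getSurfaceTemperatureK() { return surfaceTemperatureK; }

    // Class method: invoked on the class, not on a particular instance.
    static int getStarsCreated() { return starsCreated; }
}

class StarDemo {
    public static void main(String[] args) {
        // The numeric values are placeholders, not astronomical data.
        Star first = new Star(2.0e30, 5800.0);
        Star second = new Star(4.0e30, 9900.0);

        System.out.println(first.getMassKg() != second.getMassKg());  // true: separate state
        System.out.println(Star.getStarsCreated());                   // 2
    }
}
```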
In the real world, when we classify objects, we also relate classes to one another. For example, "law enforcement official" and "pilot" are special classes of "person." The set of police is a subset of the set of people. Elephant is a special kind of mammal. This process is called generalization/specialization. We can say that some classes are specializations of others, or conversely that some class is a generalization of another class. Classes may exhibit many kinds of relationships among themselves, and generalization/specialization is one of the most common.
Another very common class relationship is that of aggregation. One class may be composed of parts that come from other classes. For example, a computer workstation may be an aggregation of monitor, CPU unit, power cords, keyboard, mouse, speakers, and printer objects. The computer workstation is made up of the aggregation of the other objects. This relationship is sometimes called the part-of relationship. An object from one class is "part of" an object in another class.
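In code, aggregation usually shows up as one object holding references to its parts. The following Java sketch models a cut-down workstation with only three of the parts listed above; the class names, fields, and the describe method are hypothetical and chosen only to show the part-of relationship.

```java
// Illustrative sketch of aggregation: a Workstation is made up of part objects.
class Monitor {
    final int diagonalInches;
    Monitor(int diagonalInches) { this.diagonalInches = diagonalInches; }
}

class Keyboard {
    final String layout;
    Keyboard(String layout) { this.layout = layout; }
}

class Mouse {
    final int buttons;
    Mouse(int buttons) { this.buttons = buttons; }
}

class Workstation {
    // Each attribute refers to an object that is "part of" this workstation.
    private final Monitor monitor;
    private final Keyboard keyboard;
    private final Mouse mouse;

    Workstation(Monitor monitor, Keyboard keyboard, Mouse mouse) {
        this.monitor = monitor;
        this.keyboard = keyboard;
        this.mouse = mouse;
    }

    // The whole answers questions by consulting its parts.
    String describe() {
        return monitor.diagonalInches + "-inch monitor, "
                + keyboard.layout + " keyboard, "
                + mouse.buttons + "-button mouse";
    }
}

class WorkstationDemo {
    public static void main(String[] args) {
        Workstation ws = new Workstation(new Monitor(24), new Keyboard("QWERTY"), new Mouse(3));
        System.out.println(ws.describe());  // 24-inch monitor, QWERTY keyboard, 3-button mouse
    }
}
```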
There are many other class-to-class relationships, such as uses, derives-from, is-a-sequence-of, and so forth. In object-oriented programming, we use the concept of inheritance in place of the generalization/specialization relationship.
One of the powerful advantages of object-oriented programming is that it allows us to take an existing class that is close to what we want, and make minor modifications to give us a new class that does what we need. This is done through the mechanism of inheritance. When we define a class, we can say that the class derives from an existing class. We say that the derived class inherits from or extends the base class. In the derived class we inherit the properties of the base class, so all attributes and methods defined in the base class are also present for the derived class. Furthermore, in the derived class we may add new attributes and methods. We may also override definitions of methods from the base class, providing customized implementations of methods for the derived class. For example, suppose we have the class Person and we need to implement the class Driver. We'd rather not start from scratch to implement Driver, so we start by deriving Driver from Person. Thus Driver inherits properties such as name, age, height, and width from Person. But now we add a new attribute to Driver that indicates the driver license number. We also add a method to Driver that performs the behavior DriveVehicle. We didn't need to reimplement the other attributes and methods of Person -- we just added one attribute and one method. Now if we want to create a class BusDriver we derive from Driver and add an attribute to store the chauffeur license number. We may want to override the definition of DriveVehicle with a new implementation that allows a BusDriver to drive commercial as well as regular vehicles.
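The Person, Driver, and BusDriver example might look like this in Java. This is a sketch under several assumptions: Person is reduced to name and age, DriveVehicle is written driveVehicle to follow Java naming conventions, and its boolean parameter and returned message are invented so that the effect of overriding is visible.

```java
// Illustrative sketch of inheritance; Person is reduced to two attributes.
class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() { return name; }

    int getAge() { return age; }
}

// Driver inherits everything in Person, then adds one attribute and one method.
class Driver extends Person {
    private final String licenseNumber;

    Driver(String name, int age, String licenseNumber) {
        super(name, age);
        this.licenseNumber = licenseNumber;
    }

    String getLicenseNumber() { return licenseNumber; }

    // Assumed behavior: an ordinary driver handles regular vehicles only.
    String driveVehicle(boolean commercial) {
        if (commercial) {
            return getName() + " may not drive a commercial vehicle";
        }
        return getName() + " drives a regular vehicle";
    }
}

// BusDriver adds another attribute and overrides driveVehicle.
class BusDriver extends Driver {
    private final String chauffeurLicenseNumber;

    BusDriver(String name, int age, String licenseNumber, String chauffeurLicenseNumber) {
        super(name, age, licenseNumber);
        this.chauffeurLicenseNumber = chauffeurLicenseNumber;
    }

    String getChauffeurLicenseNumber() { return chauffeurLicenseNumber; }

    @Override
    String driveVehicle(boolean commercial) {
        if (commercial) {
            return getName() + " drives a commercial vehicle";
        }
        return super.driveVehicle(false);  // reuse the inherited behavior
    }
}

class InheritanceDemo {
    public static void main(String[] args) {
        Driver bob = new Driver("Bob", 30, "D-2");
        BusDriver ann = new BusDriver("Ann", 40, "D-1", "C-9");

        System.out.println(bob.driveVehicle(true));   // Bob may not drive a commercial vehicle
        System.out.println(ann.driveVehicle(true));   // Ann drives a commercial vehicle
        System.out.println(ann.driveVehicle(false));  // Ann drives a regular vehicle
        System.out.println(ann.getAge());             // 40 (inherited from Person)
    }
}
```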
People often confuse inheritance with the more natural concept of generalization/specialization. Don't do this! Inheritance represents derives-from, not the is-a relationship. Inheritance is an implementation technique, not a conceptual modeling technique. We use inheritance to help us reuse existing code, not as the primary way of modeling how classes relate to one another. For example, suppose we have a Point class that has an x_position and y_position. To implement a Circle we only need to add a radius attribute and customize the Draw method. Thus, we could say that Circle derives from Point, but not Circle is a Point. However, sometimes derives-from and is-a coincide. In our previous example, Driver derives from Person and Driver is a Person. It is best to make the derives-from and is-a hierarchies coincide as much as possible. Think about it really hard before making them diverge.
Encapsulation literally means the process of enclosing in a capsule or other small container. In a programming context, encapsulation indicates the process of organizing data and functions in such a way that some information is hidden, and other information is accessible. In OOP, some attributes may be visible to other objects, and some attributes may be inaccessible. Similarly, some methods may be visible and some invisible in other contexts. For example, in a method, we may define a local variable. This variable is accessible only within the method. When the method completes, the local variable is destroyed. We are guaranteed that no other method can change this local variable without our cooperation. This reduces side-effects one method may have on another, and thus decreases the complexity of our program.
With OOP, we use encapsulation to protect an object's attributes from being modified by other objects. In some OOPL's, like C++ and Java, you can declare an attribute to be public, in which case it can be accessed and changed by any object in the system. However, you can also declare attributes to be private, meaning that only methods associated with that object can access and update such attributes. Given this kind of encapsulation, we can assume that changes to an object come only from within the object's methods. Thus we can reduce the scope of our search when we're looking for what could have caused a particular change. Rather than search the whole program for the offending code, we need only look within the object's class.
Similarly, some methods may be private, thus restricting their accessibility to other methods within the class. Again, when determining how a particular method is invoked, it is much easier when the method is private, because the invocation could only come from within the class.
The idea behind encapsulation is to make the system simpler by reducing unwanted coupling of modules (remember, this is a structured programming idea). We want to make changes have as small a "ripple effect" on the system as possible. For this reason, we place as little as possible into the public interface of an object -- we hide as much as we can. For example, a typical OO thing to do is to make an object's attributes accessible only by calling a method on the object. This makes it possible to change the type of the attribute without changing the public interface. Suppose p1 is a Person object, and Person has a method called Age() that returns an int telling how many years old a person is (this age is stored in an int attribute called age). Further, suppose we decide that we don't want to store the person's age any more -- we want to store the person's birth date and compute the age whenever someone asks for it. Now, because we've placed the Age() method into Person's public interface, but we've hidden the age attribute in the private interface, we are free to change the structure of Person. We can eliminate the age attribute, and replace the body of the Age() method with a computation instead of a retrieval. We could just as easily have changed the type of age from int to float. We still would have had to change the body of Age() to convert from float to int, but it would have been easy to do. More importantly, this change would have no impact on other classes in the system. We have contained the ripple effect.
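Here is a sketch of Person after the change described above, written in Java. The method is spelled age() rather than Age() to follow Java naming conventions, and the ageOn helper is an assumption added so the computation can be checked against a fixed date. Callers still see a method that returns an int; the private birthDate attribute has quietly replaced the old private int age.

```java
import java.time.LocalDate;
import java.time.Period;

// Illustrative sketch: the stored attribute changed, the public interface did not.
class Person {
    // Hidden state. This replaces the earlier "private int age" attribute.
    private final LocalDate birthDate;

    Person(LocalDate birthDate) {
        this.birthDate = birthDate;
    }

    // Public interface: still returns the age in whole years as an int.
    public int age() {
        return ageOn(LocalDate.now());
    }

    // Assumed helper: the age in whole years as of a given date.
    public int ageOn(LocalDate date) {
        return Period.between(birthDate, date).getYears();
    }
}

class EncapsulationDemo {
    public static void main(String[] args) {
        Person p1 = new Person(LocalDate.of(2000, 6, 15));

        System.out.println(p1.ageOn(LocalDate.of(2024, 6, 14)));  // 23 (day before the birthday)
        System.out.println(p1.ageOn(LocalDate.of(2024, 6, 15)));  // 24 (on the birthday)
        System.out.println(p1.age());  // depends on today's date
    }
}
```

Any class that calls p1.age() compiles and runs unchanged, which is the contained ripple effect in practice.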
Overloading is defining multiple versions of the same method or operator. For example, we may have two versions of a method named Print. One takes a string as its argument, and another takes an integer. Or perhaps another version of the Print method takes two strings instead of one.
Overloading is not necessary to object orientation, but it is an advanced programming-language technique that is found in most object-oriented programming languages. In most languages that support overloading, the variants of an overloaded method must differ in the number or types of their arguments; a difference in return type alone is not enough to distinguish them. Some languages, like Java, support method overloading but not operator overloading. Finally, note that overloading is a special kind of polymorphism.
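The three Print variants mentioned above can be sketched in Java like this. The Printer class name and the message formats are assumptions; each variant also returns the text it printed, purely so the example makes it obvious which variant the compiler selected.

```java
// Illustrative sketch of method overloading: one name, three parameter lists.
class Printer {
    // Variant 1: takes a single string.
    static String print(String text) {
        String line = "string: " + text;
        System.out.println(line);
        return line;
    }

    // Variant 2: takes an integer.
    static String print(int number) {
        String line = "integer: " + number;
        System.out.println(line);
        return line;
    }

    // Variant 3: takes two strings instead of one.
    static String print(String first, String second) {
        String line = "two strings: " + first + ", " + second;
        System.out.println(line);
        return line;
    }
}

class OverloadDemo {
    public static void main(String[] args) {
        Printer.print("hello");        // prints: string: hello
        Printer.print(42);             // prints: integer: 42
        Printer.print("left", "right"); // prints: two strings: left, right
    }
}
```

The compiler chooses among the variants by looking at the number and types of the arguments at each call site.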
Polymorphism means having many shapes. Overloading is one kind of polymorphism (the same method or operator name may take different kinds of arguments). Dynamically-bound functions exhibit another kind of polymorphism (the same invocation may exhibit many different behaviors, depending on which specific code is dynamically bound to the invocation).
The ultimate expression of polymorphism lets us define generic methods that can apply to many kinds of objects. Consider what happens when a method takes a parameter whose type is a class that has many specializations. For example, the method may be Start, which takes a parameter of type Vehicle, and Vehicle may have several specializations, such as Car, Truck, and Motorcycle. A language that supports polymorphism lets us pass any kind of Vehicle object to Start, regardless of whether it is a car, truck, or motorcycle. In this case we have one method that operates on many types of input parameters. There are more exotic expressions of polymorphism that we won't explore here. Again, polymorphism is an advanced programming-language technique that is not essential to the OO way of programming, but it is found in most OO programming languages, and it is highly useful.
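The Start example might be sketched in Java as follows. The Garage class that holds the method, the name() method on Vehicle, and the returned message are assumptions introduced for illustration; Start is written start to follow Java naming conventions.

```java
// Illustrative sketch: one method that accepts any specialization of Vehicle.
abstract class Vehicle {
    abstract String name();
}

class Car extends Vehicle {
    @Override
    String name() { return "car"; }
}

class Truck extends Vehicle {
    @Override
    String name() { return "truck"; }
}

class Motorcycle extends Vehicle {
    @Override
    String name() { return "motorcycle"; }
}

class Garage {
    // The parameter type is Vehicle, so any kind of Vehicle may be passed in.
    static String start(Vehicle vehicle) {
        return "Starting the " + vehicle.name();
    }
}

class PolymorphismDemo {
    public static void main(String[] args) {
        System.out.println(Garage.start(new Car()));         // Starting the car
        System.out.println(Garage.start(new Truck()));       // Starting the truck
        System.out.println(Garage.start(new Motorcycle()));  // Starting the motorcycle
    }
}
```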
Dynamic binding or late binding is the process of waiting until run-time when a method is invoked to determine which actual method will be selected to execute. This matters mostly when you're calling a method on a class that has multiple subclasses. Depending on the actual type of the object, you may want a different version of the method (because perhaps the method has been overridden in the subclasses). If a method is dynamically bound, this means that when the method is invoked, the system first checks the type of the object, and then calls the method for that specific type of object. Dynamic binding isn't needed if you never override a method in a derived class. (Do you see why?)*
Some languages, like Smalltalk and Java, bind all methods dynamically (though optimizers may convert some dynamic invocations to static invocations). Other languages, like C++, allow some methods to be bound statically (this means "at compile-time") and others to be bound dynamically (C++ uses virtual functions to do this).
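To see dynamic binding at work, consider this small Java sketch. The Animal, Dog, and Cat classes and the describe helper are hypothetical names made up for the example. The variable passed to describe is declared as Animal, yet the version of speak that runs is chosen at run-time from the actual type of the object.

```java
// Illustrative sketch of dynamic binding with a hypothetical class hierarchy.
class Animal {
    String speak() { return "..."; }
}

class Dog extends Animal {
    @Override
    String speak() { return "Woof"; }
}

class Cat extends Animal {
    @Override
    String speak() { return "Meow"; }
}

class BindingDemo {
    // The declared type of the parameter is Animal. The call below is bound
    // at run-time to the speak() of whatever object is actually passed in.
    static String describe(Animal animal) {
        return animal.speak();
    }

    public static void main(String[] args) {
        Animal[] animals = { new Animal(), new Dog(), new Cat() };
        for (Animal animal : animals) {
            System.out.println(describe(animal));
        }
        // prints "...", then "Woof", then "Meow"
    }
}
```

If Dog and Cat did not override speak, every call would reach the single version in Animal and there would be nothing left to decide at run-time.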
*The answer is that if you never override, the system has only one choice for the method that will satisfy the method invocation.