C++ started out as a preprocessor for C (the cfront compiler, which translated C++ into C), but has since become an independent language. Although it is still possible to use plain C code and C's standard libraries, C++ now offers its own libraries for the most important extensions such as input/output, strings, and collections (see Section 13.2.5). We therefore start all over again with some simple examples.
#include <iostream>
using namespace std;

int main() {
    cout << "Hello World" << endl;
    return 0;
}
The code still resembles the original in C, but also shows many differences. First, the standard C++ header files don't have a suffix (since the standard committee could not agree on one). Second, the using directive tells the compiler that we want to use the standard library without explicit qualifiers. C++ solves (as a late addition) C's problem of name clashes between different libraries by introducing namespaces. Without the using directive, we would need to put a std:: qualifier in front of all symbols defined in the std namespace, in our case the symbols cout and endl from the input/output library.
#include <iostream>

int main() {
    std::cout << "Hello World" << std::endl;
    return 0;
}
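To make the name clash problem concrete, here is a minimal sketch. The namespaces graphics and audio, their open functions, and the helper openBoth are hypothetical names invented for this illustration; they do not belong to any real library.

```cpp
#include <string>

// Two hypothetical libraries that both define a function called open().
namespace graphics {
    std::string open(std::string name) { return "window:" + name; }
}

namespace audio {
    std::string open(std::string name) { return "device:" + name; }
}

// The qualifier selects the intended function. Without the namespaces,
// the two definitions of open() would clash.
std::string openBoth(std::string name) {
    return graphics::open(name) + " " + audio::open(name);
}
```

Calling openBoth("x") combines the results of both functions, which shows that the two definitions coexist as long as each call names its namespace.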
As another observation, C++ is stricter about the return type of the main function: main has to be declared as returning an integer. The return statement itself may be omitted in main (and only in main), in which case reaching the end of the function is equivalent to return 0, but we write it out here for clarity. However, the most striking line is the print statement which does not even remotely resemble the print statements we have seen so far.
cout << "Hello World" << endl;
It is more reminiscent of a UNIX shell command, although the arrows point in the other direction. The analogy is not so far-fetched, since the shift operator << indicates that objects are pushed into the standard output stream cout. The last object, endl, writes a newline and flushes the stream. C++ relies heavily on operator overloading to implement type-safe and efficient input and output streams. In their standard incarnation, streams are even template types depending on the underlying character type and its stream-related properties (or traits). In other words, to understand the simple print statement to its full extent, we have to understand most of the complex C++ features in the first place.
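As a rough illustration of how operator overloading extends the stream syntax to new types, consider the following sketch. The Point structure and the describe helper are hypothetical and exist only for this example. The & in the signatures marks references, which are explained later in this section.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical value type used only for this illustration.
struct Point {
    int x;
    int y;
};

// Overload the shift operator for Point. Returning the stream
// allows chaining, as in: out << a << b.
std::ostream& operator<<(std::ostream& out, const Point& p) {
    return out << '(' << p.x << ", " << p.y << ')';
}

// Formats a point by writing it to an in-memory string stream.
std::string describe(const Point& p) {
    std::ostringstream buffer;
    buffer << "point " << p;
    return buffer.str();
}
```

With this overload in place, a Point can be written to cout in the same way as a string or an integer, and the compiler picks the matching operator from the argument type.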
Before diving into the heart of C++, let us mention a few minor differences between C and C++. Besides the C-style comment /* ... */, C++ also ignores everything starting with two slashes // until the end of the line. This is the preferred style of comments in C++. Moreover, variables can be declared anywhere in the code, not just at the beginning of a block. As a useful application of this rule, a loop variable can be declared as part of a for statement.
for (int i=0; i<n; i++) { ... }
The scope of the variable is just the for statement; it is not visible outside (like a number of other things, this has changed during the evolution of C++).
Handling strings in an efficient and safe manner is a non-trivial task in C. You have to consider pointers, buffer lengths, and memory management (who is responsible for the deletion of a string allocated on the heap?). Therefore, the (late) addition of a standard string implementation to C++ was a most welcome and, compared to other languages, long overdue improvement. Since we will use them in the examples below, we introduce strings here, although, just like the standard input/output streams, their implementation can't be fully understood without most of the other C++ features.
#include <iostream>
#include <string>
using namespace std;

int main() {
    string name = "Homer";
    cout << "Hello, " << name << endl;
    cout << "How are you, " + name << endl;
    string ho = name.substr(0, 2).append(", ");
    cout << ho << ho << ho << endl;
    return 0;
}

Hello, Homer
How are you, Homer
Ho, Ho, Ho,
The most interesting line
string ho = name.substr(0, 2).append(", ");
takes the first two characters as a substring and appends the separator string ", ". Besides these simple string operations, the standard implementation contains everything from access to individual characters to complex search methods.
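To give an impression of the character access and search methods mentioned above, here is a small sketch. The helper functions replaceFirst and countChar are hypothetical names chosen for this example; the string methods they call (find, replace, size, and the index operator) are part of the standard string class.

```cpp
#include <cstddef>
#include <string>

// Replaces the first occurrence of 'from' in 'text' with 'to'.
// Returns the text unchanged if 'from' does not occur.
std::string replaceFirst(std::string text, std::string from, std::string to) {
    std::string::size_type pos = text.find(from);   // search for a substring
    if (pos != std::string::npos) {
        text.replace(pos, from.size(), to);
    }
    return text;
}

// Counts how often the character c occurs, using index access.
std::size_t countChar(std::string text, char c) {
    std::size_t count = 0;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == c) ++count;
    }
    return count;
}
```

The arguments are passed by value here to keep the sketch simple; the next paragraphs discuss the alternatives.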
In C, arguments are always passed by value. If we want a function to change a variable which is defined outside of the function, we have to pass the pointer (i.e., the address) of the variable to the function. The pointer is again passed by value.
static void count(int* counter) {
    *counter += 1;
}

int main(void) {
    int counter = 0;
    count(&counter);
    return 0;
}
C++ introduces the option to pass arguments by reference. For this purpose, a reference behaves much like a pointer (although it cannot be null and cannot be made to refer to another variable later), but the syntax differs.
static void count(int& counter) {
    counter += 1;
}

int main() {
    int counter = 0;
    count(counter);
}
The actual advantage over the pointer notation is debatable, but references are used consistently throughout the C++ standard library. To avoid expensive copying of argument objects, most read-only arguments are passed as a const reference. In this case, the reference notation is the more readable choice, because the reference merely replaces pass-by-value for performance reasons and the call looks exactly the same.
static void show(const string& s) { cout << s; }
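The difference between a const reference and a plain reference can be sketched as follows. The function names length and shout are hypothetical and only serve this illustration.

```cpp
#include <string>

// Read-only access: no copy is made, and s cannot be modified here.
std::string::size_type length(const std::string& s) {
    return s.size();
}

// Read-write access: the caller's string is changed in place.
void shout(std::string& s) {
    s += "!";
}
```

A const reference also accepts temporary objects, so length("Doh") compiles, whereas shout("Doh") does not, because a temporary string cannot be bound to a non-const reference.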
Objects and classes are C++'s first and foremost addition to C. A C++ class is basically a C structure with methods. C++ supports inheritance (including full multiple inheritance) and fine-grained visibility control. Like Objective C, C++ separates the declaration of a class from its implementation. Here is the declaration of the Person class (typically to be found in a header file called Person.h).
class Person {
    string _name;
    int _age;
public:
    Person(const string& name, int age);
    const string& name() const;
    void name(const string& name);
    void printOn(ostream& out) const;
};
Apart from the many const keywords, this declaration does not contain any surprises. First, we define the two attributes _name and _age. The default visibility in a class is private, so that these attributes will be visible inside of the class only. All the methods are defined with public visibility as indicated by the public: directive. Similar to Objective C, a visibility directive is valid until overridden by a new one. The first method is a constructor. It is named after the class itself and has no return value. As we will see, constructors in C++ are initialization methods which are called automatically once the memory for an object has been reserved.
The next two methods are the accessor methods for the name attribute. There are about as many naming conventions for C++ as there are developers. The one we follow here uses the same name for the getter and setter. This demonstrates C++'s ability to use a method name multiple times in a class as long as the method signatures differ. However, we cannot use the same name for the attribute itself. Hence, we follow the convention to begin attribute names with an underscore character. The last method is supposed to print a Person object on an output stream.
Now it is time to explain the many const keywords. As explained above, most C++ APIs use constant references as an efficient replacement of passing by value. The const keyword behind the name() getter and the print method indicates that calling these methods does not change the person object. Both these meanings of const play together. The compiler ensures that only constant methods are called for a constant object reference. As an example, the name setter obtains a constant reference to a string object (the new name). Inside of this method, we can call constant string methods of this string. Calling a destructive (non-const) method such as append will cause a compile error.
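The interplay of the two meanings of const can be sketched with a small class. Counter, peek, and bump are hypothetical names invented for this example, and the methods are implemented directly inside the class declaration to keep it short.

```cpp
// Hypothetical class used only to illustrate const checking.
class Counter {
    int _value;
public:
    Counter() { _value = 0; }
    int value() const { return _value; }   // const method: allowed on const objects
    void increment() { _value += 1; }      // non-const method: modifies the object
};

// Receives a constant reference, so only const methods may be called.
int peek(const Counter& c) {
    // c.increment();   // would not compile: increment() is not a const method
    return c.value();
}

// Receives a plain reference, so the object may be modified.
int bump(Counter& c) {
    c.increment();
    return c.value();
}
```

Removing the comment markers in front of c.increment() makes the compiler reject peek, which is exactly the check described above.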
After this lengthy explanation let's look at the implementation of the class. In C++, each method is implemented individually by taking the method signature, qualifying the method name with the class name, and following it with the method body. [1]
Person::Person(const string& name, int age)
    : _name(name), _age(age) {}

const string& Person::name() const {
    return _name;
}

void Person::name(const string& name) {
    _name = name;
}

void Person::printOn(ostream& out) const {
    out << "name=" << _name << ", age=" << _age;
}
The only striking definition is the constructor which uses initializers for the two attributes. These initializers are put between the signature and the constructor body. When a person object is constructed, the attributes are directly initialized with the associated constructor calls. The alternative assignment in the constructor body
Person::Person(const string& name, int age) {
    _name = name;
    _age = age;
}
first initializes the name string with the default constructor and then assigns the actual name. Besides being more efficient, initializers may be the only option in scenarios where the attribute's class does not support a default constructor or an assignment operator.
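The following sketch shows attributes that can only be set by initializers. The classes Id and Badge are hypothetical and exist only for this illustration. Besides an attribute whose class has no default constructor, the sketch also uses a const attribute and a reference attribute, which likewise cannot be assigned in the constructor body.

```cpp
#include <string>

// Hypothetical class without a default constructor.
class Id {
    int _number;
public:
    Id(int number) : _number(number) {}
    int number() const { return _number; }
};

// All three attributes must be set in the initializer list:
// _id has no default constructor, _label is const, and _uses is
// a reference. Assigning them in the constructor body would not compile.
class Badge {
    Id _id;
    const std::string _label;
    int& _uses;
public:
    Badge(int number, const std::string& label, int& uses)
        : _id(number), _label(label), _uses(uses) {}
    int number() const { return _id.number(); }
    const std::string& label() const { return _label; }
    void use() { _uses += 1; }   // increments the caller's counter
};
```

Rewriting the Badge constructor with assignments in its body leads to compile errors for each of the three attributes.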
It looks like we are finally ready to use the new class.
int main() {
    Person person("Homer", 55);
    person.printOn(cout);
    return 0;
}

name=Homer, age=55
C++ leaves the developer many choices. One of these choices is the memory management. Like C's structures, instances of C++ classes can be allocated on the stack or the heap. We have already used stack-based strings in the examples above. Here, we create a Person object on the stack by passing the constructor arguments to the person variable as if it were a function. Behind the scenes, C++ reserves a block of memory for the object on the stack and calls the constructor whose signature matches the passed arguments. The string initialization
string s = "blah";
is actually equivalent to
string s("blah");
and therefore calls the string constructor which takes a plain C-string (const char*).
The heap-based version of the program looks more familiar against the background of the previous chapters.
int main() {
    Person* person = new Person("Homer", 55);
    person->printOn(cout);
    delete person;
    return 0;
}
The allocation is indicated by the new operator and the initialization by the constructor call. This is complemented by the delete operator which first calls the destructor (which we have not covered yet) before releasing the allocated memory back to the heap. We will not get into the details here, but the new and delete operators can be changed to implement different allocation policies for particular classes or in general.
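As a rough sketch of what changing the allocation policy for a particular class can look like, the following hypothetical class Tracked defines its own new and delete operators and counts its live heap objects. The class and its members are invented for this example. The sketch simply forwards to malloc and free and uses static members and an exception, neither of which has been covered in this section.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class that counts its own heap allocations.
class Tracked {
    static int _live;   // number of Tracked objects currently on the heap
    int _value;
public:
    Tracked(int value) : _value(value) {}
    int value() const { return _value; }
    static int live() { return _live; }

    // Class-specific allocation: used by "new Tracked(...)".
    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (p == 0) throw std::bad_alloc();
        ++_live;
        return p;
    }

    // Class-specific deallocation: used by "delete ptr",
    // after the destructor has run.
    static void operator delete(void* p) {
        if (p != 0) {
            --_live;
            std::free(p);
        }
    }
};

int Tracked::_live = 0;
```

Only heap allocations go through these operators. A Tracked object created on the stack does not change the counter.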
For a long time, C++ had no standard collection library. Instead, one had to rely on either the compiler's collection classes (e.g., as contained in Microsoft's MFC library) or third party libraries. There are mainly two ways to implement collections in C++. On the one hand, there is the Smalltalk model based on a common base class for all elements in a collection and a hierarchy of classes modelling the different kinds of collections. On the other hand, there is the template model based on parametrized functions and classes. Both approaches have their virtues. The Smalltalk model provides clean interfaces, a simpler implementation, and smaller executables. As a trade-off, all objects in a collection must be heap based and the application code requires a lot of casting which takes away some of the compile-time type safety of C++. The template approach is truly type safe, but requires a lot more effort on the implementation side. Also, the code bloat associated with the excessive use of templates causes long compile times and large executables. In the end, the template model became part of the ANSI C++ standard with the standard template library, or STL for short.
The STL applies templates in a radical way. Everything is a template: the collections themselves, their iterators, and the functions acting on them. Although this goes against object-oriented design, these three elements are treated as separate entities. To start with, let's see what a standard iteration through a container looks like.
#include <iostream>
#include <vector>
#include <iterator>
using namespace std;

int main() {
    vector<int> v;
    for (int i=0; i<5; i++) v.push_back(i);
    for (vector<int>::iterator i=v.begin(); i!=v.end(); ++i) {
        cout << *i << ' ';
    }
}
We construct a vector, fill it with the five integers from zero to four, and print it by iterating through the container using STL's standard iterator syntax. To understand the STL iterator model we need to recollect the pointer-based loop through an array.
#include <iostream>
using namespace std;

int main() {
    const int n = 5;
    int v[n];
    for (int i=0; i<n; i++) v[i] = i;
    int* begin = &v[0];
    int* end = begin + n;   // one past the last element
    for (int* i=begin; i!=end; ++i) {
        cout << *i << ' ';
    }
}
The similarities are intentional. The standard template library was defined to accommodate the most efficient implementation, that is, pointer arithmetic. We can easily take the last example and cast it into a class definition that complies with the STL algorithms.
// Note: copying an IntArray is not handled in this example.
class IntArray {
    int _n;
    int* _v;
public:
    IntArray(int n) : _n(n), _v(new int[n]) {}
    ~IntArray() { delete[] _v; }
    typedef int* iterator;
    iterator begin() { return _v; }
    iterator end() { return _v + _n; }
    int& operator[](int i) { return _v[i]; }
};
The standard STL loop looks exactly the same as for the standard vector collection.
IntArray v(5);
for (int i=0; i<5; ++i) v[i] = i;
for (IntArray::iterator i=v.begin(); i!=v.end(); ++i) {
    cout << *i << ' ';
}
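Since the iterators of IntArray are plain pointers, the class should also work with function templates that only rely on the iterator operations. The following sketch repeats the IntArray class from the text to stay self-contained and adds two hypothetical helpers, sumRange and contains, that are not part of the original example. The call to std::find uses the standard algorithm of that name.

```cpp
#include <algorithm>

// IntArray as defined in the text (copying is not handled).
class IntArray {
    int _n;
    int* _v;
public:
    IntArray(int n) : _n(n), _v(new int[n]) {}
    ~IntArray() { delete[] _v; }
    typedef int* iterator;
    iterator begin() { return _v; }
    iterator end() { return _v + _n; }
    int& operator[](int i) { return _v[i]; }
};

// Hypothetical function template: sums all elements in [first, last).
// It only needs !=, ++ and * on the iterator type, so it accepts
// vector iterators, IntArray iterators, and raw pointers alike.
template <typename Iterator>
int sumRange(Iterator first, Iterator last) {
    int total = 0;
    for (Iterator i = first; i != last; ++i) total += *i;
    return total;
}

// Uses the standard find algorithm on an IntArray.
bool contains(IntArray& a, int value) {
    return std::find(a.begin(), a.end(), value) != a.end();
}
```

The same sumRange template can be called with v.begin() and v.end() of a vector<int>, of an IntArray, or with two pointers into a built-in array, which is the separation of containers, iterators, and functions described above.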
[1] As you can see, this individual implementation of each method requires more typing and reduces the readability when compared to treating the implementation as a block (like Objective C's @implementation/@end).