16.3. More Features

16.3.1. Input and Output

Perl uses a different scalar data type for input and output: file handles. Besides the three standard file handles STDIN, STDOUT, and STDERR, you can create your own by opening a file.

open(TEST, ">test.dat");
print TEST "Hello World";
close(TEST);

The open function takes the symbol of the file handle as the first argument and creates the associated new file handle as a side effect. The second argument describes which file to open and how to open it. The syntax is derived from the UNIX shell. A greater-than symbol opens a file for output, the shift symbol >> opens a file in append mode. Perl's print functions can write to different output streams, the default one being STDOUT. If the stream is specified, it is placed between the name of the print function and the argument list. The stream is not an additional argument (no comma between the stream and the following arguments). It is really a special syntax just for specifying an output stream.

The return value of open indicates the success or failure of the operation. The idiom you'll find in many Perl programs is the combination of the opening of a file and the die function which exits the program if the operation failes.

open(TEST, ">test.dat") || die "Could not create test.dat";
print TEST "Hello World";
close(TEST);

Once a file handle has been created, it can be used like any other scalar value. It can be assigned to variables and passed to functions.

sub hello {
    my $out = shift;
    print $out "Hello again";
}

open(TEST, ">test.dat") || die "Could not create test.dat";
hello(TEST);
close(TEST);

16.3.2. Special Variables

A Perl program can use a large number of predefined variables. You need the environment variables of the operating system context in which the script is running? They are readily available in the special hash variable %ENV. The command line arguments of your script are contained in the @ARGV array. The process id of the Perl process running your script or the name of your script file? They are found in the variables with the fancy names $$ and $0, respectively. All in all there are a few dozens of special variables defined in Perl. Some of them customize the behavior of Perl functions, some of them are set as side effects of functions. We have already used the special variables $, and $\ which control how the print function separates output fields and records.

16.3.3. Packages and Modules

Packages are Perl's namespace mechanism. When defining a symbol (variable, function), the symbol is always placed in the current package. You can refer explicitly to a symbol in a package by preceding the symbol's name by the package qualifier consisting of the package name and two colons. Up to now, we have only used the default package main. We could therefore refer to this package using the main:: prefix.

$x = "Hello World";
print $main::x;

Perl's package statement switches to a different package. It can occur anywhere in the code and causes the current package to be changed to the new specified package until the end of the block or a new package statement.

sub hello {"hello-main"; }
{
    package A;
    print hello();
    sub hello {"hello-A"; }
    package B;
    print hello();
    sub hello {"hello-B"; }
}
print hello(), main::hello(), A::hello(), B::hello();

-> hello-A
hello-B
hello-main hello-main hello-A hello-B

Within a package, the call of the hello function without package qualifier refers to the function defined in the package, even if the function is defined after the call. The same could be demonstrated with global variables, but since the strict mode enforces fully qualified global variable names, this does not make sense anymore.

A Perl module is package defined in a file whose name is the package name with the .pm. Here is a tiny module called Sample and stored in a file Sample.pm.

package Sample;

BEGIN { print "Loading module Sample"; }
END { print "Unloading module Sample"; }

sub hello { print "Hello", $_[0]; }

1;

After the package declaration, we define a constructor and a destructor for the module. These functions have the special names BEGIN and END (an awk heritage) and are called before loading the module and right before unloading it, respectively. Note that the sub keyword of these functions can be omitted. Besides these special functions, we only define one more function hello to be used by clients of the module. The expression 1; at the end of the module makes sure that the module returns a true value when loading it. To use the module (e.g., in our sample script sample.pl), we have to load it using the require command.

require Sample;
Sample::hello("World");

-> Hello World

We can then use the module's package as if it was defined in the script.

16.3.4. Object Oriented Perl

Perl uses its packages and modules to add object oriented features to the language. Object are references which have been "blessed" with a package. The package plays the role of the class definition, its functions are the methods. Class methods are functions which obtain the package (i.e., class) as the first argument, instance methods are functions whose first argument is the instance. Here is a first example.

package Person;

sub new {
    my $class = shift;
    my ($name, $age) = @_;
    my $self = { name => $name, age => $age};
    return bless $self, $class;
}

sub name {
    my $self = shift;
    $self->{name};
}

package main;

my $person = Person->new("Homer", 55);
print ref($person), $person->name();

-> Person Homer

We create a new package called Person (it does not have to be in a new module) and define two methods. The constructor new is a class method creating new instances of Person class. We use a hash reference (the most common case) for the object $self and set the name and age entries as supplied to the constructor. The next step is the magic blessing of this hash reference. It links the reference to the class so that Perl can find its method (i.e., the functions of the attached package). Next, we define a simple getter method for the name. It demonstrates how the instance is passed to the instance method as the first argument. By convention this argument is called $self like in Smalltalk and Python.

The only new syntax is the call of the class and instance methods using the arrow ->. Doing so lets Perl pass the class and instance implicitly as the first argument to the respective functions of the package implementing the class. Alternatively, we could have done this ourselves.

$person = Person::new("Person", "Homer", 55);
print Person::name($person);

The method call supported by the new syntax implicitly uses the package attached to the reference $person during the "blessing". It then tries to find the method in this package (and its base class packages as we will see below) and calls the function with the instance as the first argument.

With the knowledge about the internal structure of our Person object, we could have accessed the name directly as $person->{name}, but this would have violated the encapsulation of the class. Encapsulation is not enforced by Perl, but merely a convention. Use the provided methods only and do not exploit knowledge about the structure of a class.

Inheritance is realized by putting the names of the base class packages into a special global array called ISA.

package Employee;

@Employee::ISA = qw(Person);

sub new {
    my $self = Person::new(@_);
    $self->{no} = $_[3];
    return $self;
}

sub no {
    my $self = shift;
    $self->{no};
}

package main;

my $employee = Employee->new("Homer", 55, 12345);
print ref($employee), $employee->name(), $employee->no();

Employee Homer 12345

In the Employee constructor new, we first call the Person constructor and then add the additional attribute to the blessed hash table.

16.3.5. Function Prototypes