
11.3. Call-by-Reference

11.3.1. Symbolic References—Typeglobs

Definition

A typeglob is an alias for a variable; i.e., another name for a variable. It is called a symbolic reference and is analogous to a soft link in the UNIX filesystem. You can create an alias by prefixing the name of a Perl variable with a "*". The "*" represents all the different types of variables: scalar, array, hash, filehandle, subroutine, etc. It is another name for all identifiers on the symbol table with the same name. The name typeglob comes from the fact that it "globs" onto all datatypes with the same name. For example, *name would represent $name, @name, %name, &name, etc.

Aliases were used predominantly in early (Perl 4) programs as a mechanism to pass parameters by reference and can still be used, although with the advent of hard references (see Chapter 13, "Does This Job Require a Reference?"), the practice of using typeglobs and aliases is largely a thing of the past. They are introduced here because they still appear in a number of library routines that evolved during the early years of Perl and because Perl uses them to build the symbol table for your program (see Chapter 12). (To see an example of how hard references are used with subroutines, see "Hard References—Pointers" on page 349, and for a complete discussion, see Chapter 13.)

Passing by Reference with Aliases

Aliases (or typeglobs) can be passed to functions to ensure true call-by-reference so that you can modify the global copy of the variable rather than the local copy stored in the @_ array. If you are passing an array or multiple arrays to a subroutine, rather than copying the entire array into the subroutine, you can pass an alias or a pointer. (See "Hard References—Pointers" on page 349.) To create an alias for a variable, an asterisk is prepended to the alias name, as in

*alias=*variable;

The asterisk represents all of the funny characters that prefix variables, including subroutines, filehandles, and formats. Typeglobs produce a scalar value that represents all objects with the same name; i.e., it "globs" onto all the symbols in the symbol table that have that name.[5] It is your job to determine what symbol you want the alias to reference. This is done by prepending the correct funny character to the alias name when you want to access its underlying value. For example:

[5] This is not the same as the globbing done for filename substitution, as in <p*>.

Given:  *alias = *var
Then:   $alias          refers to the scalar $var
        @alias          refers to the array @var
        $alias{string}  refers to an element of the hash %var
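
The aliasing rules listed above can be tried out with a short, self-contained sketch. The variable names and values are chosen for illustration only, and the our declarations are an assumption added so that the code also runs under use strict.

```perl
use strict;
use warnings;

# Package (symbol-table) variables; typeglobs cannot alias "my" variables.
our $var = "scalar value";
our @var = (1, 2, 3);
our %var = (string => "hash value");

# Declare the alias names so "use strict" accepts them.
our ($alias, @alias, %alias);

*alias = *var;              # alias every "var" symbol at once

$alias = "changed";         # writes through to $var
push @alias, 4;             # writes through to @var

print "$var\n";             # changed
print "@var\n";             # 1 2 3 4
print "$alias{string}\n";   # hash value
```

Which variable is reached depends only on the funny character placed in front of the alias name.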


If a filehandle is passed to a subroutine, a typeglob can be used to make the filehandle local.

Perl 5 improved the alias mechanism so that the alias can now represent one funny character rather than all of them and introduced an even more convenient method for passing by reference, the hard reference, or what you may recognize as a C-like pointer.

Making Aliases Private—local versus my

The names of variables created with the my operator are not stored on the symbol table but within a temporary scratch pad. The my operator creates a new variable that is private to its block. Since typeglobs are associated with the symbol table of a particular package, they cannot be made private with the my operator. To make typeglobs local, the local function must be used.
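
As a minimal sketch of this point, the following code makes a typeglob local to a subroutine and shows that the previous meaning of the name is restored when the subroutine returns. The names double_all, @data, and @view are hypothetical.

```perl
use strict;
use warnings;

our @data = (1, 2, 3);
our @view = ("original view");

sub double_all {
    local *view = shift;    # temporary alias; "my" cannot be applied to a typeglob
    $_ *= 2 for @view;      # modifies the caller's array through the alias
    return scalar @view;
}

my $count = double_all(*data);

print "@data\n";            # 2 4 6
print "@view\n";            # original view  (restored when the sub returned)
```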

Example 11.18.

(The Script)
    #!/usr/bin/perl
1   $colors="rainbow";
2   @colors=("red", "green", "yellow" );
3   &printit(*colors);            # Which color is this?
4   sub printit{
5       local(*whichone)=@_;      # Must use local, not my with globs
6       print *whichone, "\n";       # The package is main
7       $whichone="Prism of Light";  # Alias for the scalar
8       $whichone[0]="PURPLE";       # Alias for the array
    }
9   print "Out of subroutine.\n";
10  print "\$colors is $colors.\n";
11  print "\@colors is @colors.\n";

(Output)
6   *main::colors
9   Out of subroutine.
10  $colors is Prism of Light.
11  @colors is PURPLE green yellow.

Explanation

  1. The scalar $colors is assigned rainbow.

  2. The array @colors is assigned three values: red, green, and yellow.

  3. The printit subroutine is called. An alias for all symbols named colors is passed as a parameter. The asterisk creates the alias (typeglob).

  4. The printit subroutine is defined.

  5. The @_ array contains the alias that was passed. Its value is assigned to the local alias, *whichone. *whichone is now an alias for any colors symbol.

  6. Attempting to print the value of the alias itself tells you only that it is in the main package and is a symbol for all variables, subroutines, and filehandles called colors.

  7. The scalar represented by the alias is assigned a new value.

  8. The first element of the array represented by the alias is assigned a new value.

  9. Out of the subroutine.

  10. Out of the subroutine, the scalar $colors has been changed.

  11. Out of the subroutine, the array @colors has also been changed.

Example 11.19.

(The Script)
    # Revisiting Example 11.6 -- Now using typeglob
1   print "Give me 5 numbers: ";
2   @n = split(' ', <STDIN>);
3   &params(*n);
4   sub params{
5       local(*arr)=@_;
6       print 'The values of the @arr array are ', @arr, "\n";
7       print "The first value is $arr[0]\n";
8       print "The last value is ", pop(@arr), "\n";
9       foreach $value(@arr){
10          $value+=5;
11          print "The value is $value.\n";
        }
    }
    print "Back in main\n";
12  print "The new values are @n.\n";

(Output)
1   Give me 5 numbers: 1 2 3 4 5
6   The values of the @arr array are 12345
7   The first value is 1
8   The last value is 5
11  The value is 6.
    The value is 7.
    The value is 8.
    The value is 9.
    Back in main
12  The new values are 6 7 8 9.  <--- Look here. Got popped this time!

					  

Explanation

  1. The user is asked for input.

  2. The user input is split on whitespace and returned to the @n array.

  3. The subroutine params is called. An alias for any n in the symbol table is passed as a parameter.

  4. The params subroutine is defined.

  5. In the subroutine, the alias was passed to the @_ array. This value is assigned to a local typeglob, *arr.

  6. The values in the @arr array are printed. Remember, @arr is just an alias for the array @n. It refers to the values in the @n array.

  7. The first element in the array is printed.

  8. The last element of the array is popped, not just the reference to it.

  9. The foreach loop assigns, in turn, each element of the @arr array to the scalar $value.

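  10. Each element of the array is incremented by 5. Because $value is an alias for the current element inside a foreach loop, adding 5 to $value changes the element of @arr itself, and therefore of @n.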

  11. The new values are printed.

  12. The values printed illustrate that those values changed in the function by the alias really changed the values in the @n array. See Example 11.6.

Passing Filehandles by Reference

The only way to pass a filehandle directly to a subroutine is by reference. You can use a typeglob to create an alias for the filehandle or use a hard reference. (See Chapter 13, "Does This Job Require a Reference?" for more on hard references.)

Example 11.20.

(The Script)
    #!/bin/perl
1   open(READMEFILE, "f1") || die;
2   &readit(*READMEFILE);     # Passing a filehandle to a subroutine
    sub readit{
3       local(*myfile)=@_;    # myfile is an alias for READMEFILE
4       while(<myfile>){
           print;
        }
    }

Explanation

  1. The open function opens the UNIX file f1 for reading and attaches it to the READMEFILE handle.

  2. The readit subroutine is called. The filehandle is aliased with typeglob and passed as a parameter to the subroutine.

  3. The local alias myfile is assigned the value of @_; i.e., the alias that was passed into the subroutine.

  4. The alias is another name for READMEFILE. It is enclosed in angle brackets, and each line from the filehandle will be read and then printed as it goes through the while loop.
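
The hard-reference alternative mentioned at the start of this section can be sketched as follows. The in-memory file, the INPUT handle, and the count_lines subroutine are assumptions made to keep the sketch self-contained; a real program would open a path such as f1 instead.

```perl
use strict;
use warnings;

# An in-memory "file" keeps the sketch self-contained.
my $text = "first line\nsecond line\n";
open(INPUT, '<', \$text) or die "Can't open in-memory file: $!";

sub count_lines {
    my $fh = shift;          # a reference to the typeglob, held in a scalar
    my $lines = 0;
    $lines++ while <$fh>;    # read until end of file
    return $lines;
}

my $total = count_lines(\*INPUT);   # pass a hard reference to the glob
close(INPUT);

print "Read $total lines\n";        # Read 2 lines
```

Because the reference is an ordinary scalar, the subroutine can keep it in a my variable instead of a local typeglob.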

Selective Aliasing and the Backslash Operator

Perl 5 references allow you to alias a particular variable rather than all variable types with the same name. For example:

*array=\@array;
*scalar=\$scalar;
*hash=\%assoc_array;
*func=\&subroutine;

Example 11.21.

(The Script)
    # References and typeglob
1   @list=(1, 2, 3, 4, 5);
2   $list="grocery";
3   *arr = \@list;      # *arr is a reference only to the array @list
4   print @arr, "\n";
5   print "$arr\n";         # Not a scalar reference
    sub alias {
6       local (*a) = @_;    # Must use local, not my
7       $a[0] = 7;
8       pop @a;
    }
9   &alias(*arr);       # Call the subroutine
10  print "@list\n";
11  $num=5;
12  *scalar=\$num;      # *scalar is a reference to the scalar $num
13  print "$scalar\n";

(Output)
4   1 2 3 4 5
5
10  7 2 3 4
13  5

					  

Explanation

  1. The @list array is assigned a list of values.

  2. The $list scalar is assigned a value.

  3. The *arr alias is another name for the array @list. It is not an alias for any other type.

  4. The alias *arr is used to refer to the array @list.

  5. The alias *arr does not reference a scalar. Nothing prints.

  6. In the subroutine, the local alias *a receives the value of the alias passed in as a parameter and assigned to the @_ array.

  7. The first element of the array is assigned a new value via the alias.

  8. The last value of the array is popped off via the alias.

  9. The subroutine is called, passing the alias *arr as a parameter.

  10. The @list values are printed reflecting the changes made in the subroutine.

  11. The scalar $num is assigned a value.

  12. A new alias is created. *scalar refers only to the scalar $num.

  13. The alias is just another name for the scalar $num. Its value is printed.
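
Example 11.21 covers arrays and scalars. A similar sketch for the hash and subroutine forms might look like this; the data values are hypothetical, and the our declarations are added so the code runs under use strict.

```perl
use strict;
use warnings;

our %assoc_array = (apple => "red");
our ($hash, %hash);               # declare alias names for "use strict"
$hash = "separate scalar";

sub subroutine { return "called with: @_" }

*hash = \%assoc_array;            # fills only the hash slot of *hash
*func = \&subroutine;             # fills only the code slot of *func

$hash{pear} = "green";            # writes through to %assoc_array
my $result = func("a", "b");      # calls subroutine() under its alias

print join(",", sort keys %assoc_array), "\n";   # apple,pear
print "$hash\n";                                 # separate scalar
print "$result\n";                               # called with: a b
```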

11.3.2. Hard References—Pointers

Passing values to a subroutine by reference with pointers is now a more common practice than using typeglobs. Before demonstrating how to do this, we will define a pointer, its syntax, and how to use it and then provide examples. For more on pointers and other uses for them, see Chapter 13.

Definition

A hard reference, commonly called a pointer, is a scalar variable that contains the address of another variable. The backslash operator (\) is used to create the pointer. When printing the value of the pointer, you see not only a hexadecimal address stored there but also the data type of the variable that resides at that address.

For example, if you write

$p = \$name;

then $p will be assigned the address of the scalar $name. $p is a reference to $name. The value stored in $p, when printed, looks like SCALAR(0xb057c).

Since pointers contain addresses, they can be used to pass arguments by reference to a subroutine; and, because the pointer is simply a scalar variable, not a typeglob, it can be made a private, lexical my variable.

my $arrayptr=\@array;       # creates a pointer to an array
my $scalarptr=\$scalar;     # creates a pointer to a scalar
my $hashptr=\%assoc_array;  # creates a pointer to a hash
my $funcptr=\&subroutine;   # creates a pointer to a subroutine
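
To check what kind of data a pointer refers to without printing the address, Perl's built-in ref function can be used. The following sketch is an illustration only; the variable contents are made up, and the address shown when a pointer is printed will differ from run to run.

```perl
use strict;
use warnings;

my @array       = (1, 2, 3);
my %assoc_array = (one => 1);
sub subroutine { return "ok" }

my $arrayptr = \@array;
my $hashptr  = \%assoc_array;
my $funcptr  = \&subroutine;

# ref() returns the type name without the address.
my @types = map { ref($_) } ($arrayptr, $hashptr, $funcptr);
print "@types\n";                     # ARRAY HASH CODE

# The printed form is TYPE(0xADDRESS).
print "looks like an array pointer\n"
    if "$arrayptr" =~ /^ARRAY\(0x[0-9a-f]+\)$/;
```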

Dereferencing the Pointer

If you print the value of the reference, you will see an address. If you want to go to that address and get the value stored there (that is, dereference the pointer), the pointer must be prefaced by two funny symbols: one is the dollar sign because the pointer itself is a scalar, and preceding that, the funny symbol representing the type of data it points to. For example, if $p is a reference to a scalar $x, then $$p will get the value of $x, and if $p is a reference to an array @x, then @$p will get the values in @x. In both examples, the reference $p is preceded by the funny symbol representing the data type of the variable it points to. When using more complex types, the arrow (infix) operator can be used. (See "References and Anonymous Variables" on page 406 for more on the arrow operator.) Table 11.1 shows examples of creating and dereferencing pointers.

Table 11.1. Creating and Dereferencing Pointers
Assignment               Create a Reference   Dereference                 Dereference with Arrow
$sca=5;                  $p = \$sca;          print $$p;
@arr=(4,5,6);            $p = \@arr;          print @$p; print $$p[0];    $p->[0]
%hash=(key=>'value');    $p = \%hash;         print %$p; print $$p{key};  $p->{key}
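
The table entries can be combined into one runnable sketch. Separate pointer names ($sp, $ap, $hp) are used here, as an assumption, so that all three references can exist at the same time.

```perl
use strict;
use warnings;

my $sca  = 5;
my @arr  = (4, 5, 6);
my %hash = (key => 'value');

my $sp = \$sca;
my $ap = \@arr;
my $hp = \%hash;

print $$sp, "\n";                        # 5
print "@$ap\n";                          # 4 5 6
print $$ap[0], " ", $ap->[0], "\n";      # 4 4
print $$hp{key}, " ", $hp->{key}, "\n";  # value value

$ap->[0] = 40;        # assignment through the pointer changes @arr
print "$arr[0]\n";    # 40
```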


Example 11.22.

(The Script)
    #!/bin/perl
1   $num=5;
2   $p = \$num;       # The backslash operator means "address of"
3   print 'The address assigned $p is ', $p, "\n";
4   print "The value stored at that address is $$p \n";

Memory Addresses

  (Output)
  3   The address assigned $p is SCALAR(0xb057c)
  4   The value stored at that address is 5

Explanation

  1. The scalar $num is assigned the value 5.

  2. The scalar $p is assigned the address of $num. The function of the backslash operator is to create the reference. $p is called either a reference or a pointer (the terms are interchangeable).

  3. The address stored in $p is printed. Perl also tells you the data type is SCALAR.

  4. To dereference $p, another dollar sign is prepended to $p. This dollar sign tells Perl that you are looking for the value of the scalar that $p references, which is $num.

Example 11.23.

(The Script)
1   @toys = qw( Buzzlightyear Woody Thomas Pokemon );
2   $num = @toys;
3   %movies=("Toy Story"=>"US",
             "Thomas"=>"England",
             "Pokemon"=>"Japan",
            );
4   $ref1 = \$num;        # Scalar pointer
5   $ref2 = \@toys;       # Array pointer
6   $ref3= \%movies;      # Hash pointer
7   print "There are $$ref1 toys.\n";    # Dereference pointers
8   print "They are: @$ref2.\n";
9   while( ($key, $value) = each ( %$ref3 )){
10     print "$key--$value\n";
    }
11  print "His favorite toys are $ref2->[0] and $ref2->[3].\n";
12  print "The Pokemon movie was made in $ref3->{'Pokemon'}.\n";
(Output)
7   There are 4 toys.
8   They are: Buzzlightyear Woody Thomas Pokemon.
10  Thomas--England
    Pokemon--Japan
    Toy Story--US
11  His favorite toys are Buzzlightyear and Pokemon.
12  The Pokemon movie was made in Japan.

					  

Explanation

  1. The array @toys is assigned a list.

  2. The array @toys is assigned to the scalar variable $num, returning the number of elements in the array.

  3. The hash %movies is assigned key/value pairs.

  4. The reference $ref1 is a scalar. It is assigned the address of the scalar $num. The backslash operator allows you to create the reference.

  5. The reference $ref2 is a scalar. It is assigned the address of the array @toys.

  6. The reference $ref3 is a scalar. It is assigned the address of the hash %movies.

  7. The reference is dereferenced, meaning: Go to the address that $ref1 is pointing to and print the value of the scalar stored there.

  8. The reference is again dereferenced, meaning: Go to the address that $ref2 is pointing to, get the array, and print it.

  9. The built-in each function gets keys and values from the hash. The hash pointer, $ref3, is preceded by a percent sign; in other words, dereference the pointer to the hash.

  10. Key/value pairs are printed from the hash %movies.

  11. Dereference the pointer and get the first and fourth elements of the array. To dereference a pointer to an array, use the arrow (infix operator) and the subscript of the array element you are fetching. You could also use the form $$ref2[0] or $$ref2[3], but it's not as easy to read or write.

  12. The pointer is dereferenced using the arrow notation. Curly braces surround the hash key. You could also use the form $$ref3{Pokemon}.

Pointers as Arguments

When a subroutine receives parameters, they are stored in the special @_ array. For example, if you send two arrays to a subroutine, both arrays are stored in @_ as a single list. There is really no way to separate the two arrays without knowing at least the size of the first one. However, if you send two pointers to the subroutine, they contain the addresses of the original arrays and allow easy separation and dereferencing of those arrays. See Example 11.24.

Example 11.24.

 (The Script)
# Passing by reference with pointers
1 @list1= (1..100);
2 @list2 = (5..200);

3 display(@list1, @list2); # Pass two arrays

  print "-" x 35,"\n";

4 display(\@list1, \@list2); # Pass two pointers


5 sub display{
    print "@_\n";
 }


(Output)
 3   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67

                     <continues>

    177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192
193 194 195 196 197 198 199 200
    -----------------------------------
 4   ARRAY(0x182e048) ARRAY(0x182ea38)
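
The difference between the two calls can also be seen by counting elements. In this sketch, which reuses the arrays from Example 11.24, the subroutines sizes and flat_size are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

my @list1 = (1 .. 100);
my @list2 = (5 .. 200);

sub sizes {
    my ($first, $second) = @_;          # two array pointers, still separate
    return (scalar @$first, scalar @$second);
}

sub flat_size { return scalar @_ }      # array boundaries are lost here

my ($n1, $n2) = sizes(\@list1, \@list2);
my $flat      = flat_size(@list1, @list2);

print "By pointer: $n1 and $n2 elements\n";   # By pointer: 100 and 196 elements
print "Flattened:  $flat elements\n";         # Flattened:  296 elements
```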

Passing Pointers to a Subroutine
Example 11.25.

Explanation

  1. The array @list1 is assigned values between 1 and 100.

  2. The array @list2 is assigned four values.

  3. The &addemup subroutine is called with two arguments. The backslash is used to create the pointers. The addresses of @list1 and @list2 are being passed.

  4. The subroutine &addemup is declared.

  5. The @_ array contains the two arguments just passed in. The arguments are shifted from the @_ into my variables $arr1 and $arr2. They are pointers.

  6. $total is assigned an initial value of 0.

  7. The value of the pointer is printed. It points to the array @list1.

  8. The value of the pointer is printed. It points to the array @list2.

  9. The foreach loop is entered and, by using the dereferenced pointers, each element from @list1 and @list2 is assigned, in turn, to $num until all of the elements in both arrays have been processed.

  10. Each value of $num is added on and assigned to the value in $total until the loop ends.

  11. The sum of $total is returned to line 3, where it is passed as an argument to the print function and then printed.
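
A subroutine along the lines described in the explanation might look like the following sketch. It is an assumption-based illustration, not the original listing: the contents of @list2, the message strings, and the use of strict with my variables are all chosen here for the example.

```perl
use strict;
use warnings;

my @list1 = (1 .. 100);
my @list2 = (5, 10, 15, 20);     # hypothetical values, chosen for illustration

sub addemup {
    my $arr1 = shift;            # pointer to the first array
    my $arr2 = shift;            # pointer to the second array
    my $total = 0;
    print "First pointer:  $arr1\n";    # e.g. ARRAY(0x...)
    print "Second pointer: $arr2\n";
    foreach my $num (@$arr1, @$arr2) {  # dereference both pointers
        $total += $num;
    }
    return $total;
}

my $sum = addemup(\@list1, \@list2);
print "The total is $sum.\n";    # The total is 5100.
```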

11.3.3. Autoloading

The Perl AUTOLOAD function lets you check to see if a subroutine has been defined. The AUTOLOAD subroutine is called whenever Perl is told to call a subroutine and the subroutine can't be found. The special variable $AUTOLOAD is assigned the name of the undefined subroutine.

The AUTOLOAD function can also be used with objects to provide an implementation for calls to methods that have not been defined. (A method is an object-oriented name for a subroutine.)
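
As a rough sketch of the object-oriented use, the following hypothetical Person class defines no name or miles method; AUTOLOAD catches those calls and returns the matching field. The class, its fields, and the error message are assumptions made for illustration.

```perl
use strict;
use warnings;

package Person;                       # hypothetical class for illustration

our $AUTOLOAD;

sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                # strip the package name
    return if $name eq 'DESTROY';     # ignore the destructor call
    die "No such method: $name\n" unless exists $self->{$name};
    return $self->{$name};            # act as a getter for that field
}

package main;

my $p = Person->new(name => "Jody", miles => 50);
my $summary = $p->name . " drove " . $p->miles . " miles.";
print "$summary\n";                   # Jody drove 50 miles.
```

The check for DESTROY matters because Perl also routes the implicit destructor call through AUTOLOAD when the class defines no DESTROY method.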

Example 11.26.

(The Script)
    #!/bin/perl
1   sub AUTOLOAD {
2      my(@arguments)=@_;
3      $args=join(', ', @arguments);
4      print "$AUTOLOAD was never defined.\n";
5      print "The arguments passed were $args.\n";
    }

6   $driver="Jody";
    $miles=50;
    $gallons=5;

7   &mileage($driver, $miles, $gallons);  # Call to an undefined
                                          # subroutine
(Output)
4   main::mileage was never defined.
5   The arguments passed were Jody, 50, 5.

Explanation

  1. The subroutine AUTOLOAD is defined.

  2. The AUTOLOAD subroutine is called with the same arguments as would have been passed to the original subroutine called on line 7.

  3. The arguments are joined by commas and stored in the scalar $args.

  4. The name of the package and the subroutine that was originally called are stored in the $AUTOLOAD scalar. (For this example, main is the default package.)

  5. The arguments are printed.

  6. The scalar variables are assigned values.

  7. The mileage subroutine is called with three arguments. Perl calls the AUTOLOAD function if there is a call to an undefined function, passing the same arguments as would have been passed in this example to the mileage subroutine.

Example 11.27.

    #!/bin/perl
    # Program to call a subroutine without defining it
1   sub AUTOLOAD {
2       my(@arguments) = @_;
3       my($package, $command)=split("::",$AUTOLOAD, 2);
4       return `$command @arguments`;   # Command substitution
    }

5   $day=date("+%D");     # date is an undefined subroutine
6   print "Today is $day.\n";
7   print cal(3,2007);    # cal is an undefined subroutine


(Output)
Today is 03/26/07.

     March 2007
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

Explanation

  1. The subroutine AUTOLOAD is defined.

  2. The AUTOLOAD subroutine is called with the same arguments as would have been passed to the original subroutine on lines 5 and 7.

  3. The $AUTOLOAD variable is split into two parts by a double colon delimiter (::). The array returned consists of the package name and the name of the subroutine that was called.

  4. The string made up of the name of the function called and its arguments is enclosed in backquotes, so it is executed as a UNIX command, and the output of that command is returned. In the first case, the function name happens to be the UNIX date command. Tricky!

  5. The date function has never been defined. AUTOLOAD will pick its name and assign it to $AUTOLOAD in the AUTOLOAD function. The date function will pass an argument. The argument, +%D, is also an argument to the UNIX date command. It returns today's date.

  6. The returned value is printed.

  7. The cal function has never been defined. It takes two arguments. AUTOLOAD will assign cal to $AUTOLOAD. The arguments, 3 and 2007, are assigned to @arguments. They will be passed to the AUTOLOAD function and used in line 4. After variable substitution, the backquotes cause the string to be executed. The UNIX command cal 3 2007 is executed and the result returned to the print function.

11.3.4. BEGIN and END Subroutines (Startup and Finish)

The BEGIN and END subroutines may remind UNIX programmers of the special BEGIN and END patterns used in the awk programming language. For C++ programmers, the BEGIN has been likened to a constructor, and the END a destructor. The BEGIN and END subroutines are similar to both in functionality.

A BEGIN subroutine is executed immediately, before the rest of the file is even parsed. If you have multiple BEGINs, they will be executed in the order they were defined.

The END subroutine is executed when all is done; that is, when the program is exiting, even if the die function caused the termination. Multiple END blocks are executed in reverse order.

The keyword sub is not necessary when using these special subroutines.
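
The ordering rules can be observed with a small sketch. The @order array and the messages are hypothetical; note that @order is declared without an initializer so that the values pushed at compile time are not wiped out at run time.

```perl
use strict;
use warnings;

our @order;                       # records the execution order

push @order, "main";              # runs at run time, after all BEGINs

BEGIN { push @order, "BEGIN 1" }
BEGIN { push @order, "BEGIN 2" }

END { print "END 1 (defined first, runs last)\n" }
END { print "END 2 (defined last, runs first)\n" }

print join(" -> ", @order), "\n"; # BEGIN 1 -> BEGIN 2 -> main
```

When the program exits, the END 2 line is printed before the END 1 line.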

Example 11.28.

    #!/bin/perl
    # Program to demonstrate BEGIN and END subroutines
1   chdir("/stuff") || die "Can't cd: $!\n";
2   BEGIN{ print "Welcome to my Program.\n"};
3   END{ print "Bailing out somewhere near line ", __LINE__,
                                                 ". So long.\n"};

(Output)
Welcome to my Program.
Can't cd: No such file or directory
Bailing out somewhere near line 5. So long.

Explanation

  1. An effort is made to change directories to /stuff. The chdir fails and the die is executed. Normally, the program would exit immediately, but this program has defined an END subroutine. The END subroutine will be executed before the program dies.

  2. The BEGIN subroutine is executed as soon as possible; that is, as soon as it has been defined. This subroutine is executed before anything else in the program happens.

  3. The END subroutine is always executed when the program is about to exit, even if a die is called. The line printed is there just for you awk programmers.

11.3.5. The subs Function

The subs pragma, loaded with use subs, allows you to predeclare subroutine names. Its arguments are a list of subroutine names. Predeclaring lets you call a subroutine without the ampersand or parentheses, even before it is defined, and lets you override built-in Perl functions.

Example 11.29.

    #!/bin/perl
    # The subs module
1   use subs qw(fun1 fun2 );

2   fun1;
3   fun2;

4   sub fun1{
       print "In fun1\n";
    }

5   sub fun2{
       print "In fun2\n";
    }

(Output)
In fun1
In fun2

Explanation

  1. The subs module is loaded (see "The use Function (Modules and Pragmas)" on page 378) into your program and given a list of subroutines.

  2. fun1 is called with neither an ampersand nor parentheses, because it was in the subs list. The function is not defined until later.

  3. fun2 is also called before it is defined.
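
Example 11.29 does not show the overriding of a built-in. A minimal sketch of that use, assuming the built-in hex function as the target and a hypothetical @calls array to record what the replacement sees, might look like this:

```perl
use strict;
use warnings;
use subs qw(hex);                 # predeclare hex so our version overrides the built-in

our @calls;                       # records arguments seen by our hex

my $value = hex "ff";             # no ampersand, no parentheses

sub hex {
    my $digits = shift;
    push @calls, $digits;
    return CORE::hex($digits);    # delegate to the real built-in
}

print "$value (calls: @calls)\n"; # 255 (calls: ff)
```

The original built-in remains reachable through the CORE:: prefix, which is how the replacement delegates to it here.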
