You can use the terms "reference" and "pointer" interchangeably in Perl. A reference is a scalar that refers to another variable or value; in short, it contains the address of that data. We saw how useful references are when passing values to a subroutine (Chapter 11). We can also use references, or pointers, to create more complex data types, such as a hash whose keys are each associated with a list of values, or an array of arrays. We will also need references in the next chapter when creating Perl objects.
A hard reference is a scalar variable that holds the address of another type of data. It is similar to a pointer found in the C programming language.[1] This chapter will focus on hard references.
[1] Unlike C pointers, Perl pointers are not raw addresses that you can manipulate. A pointer prints as a string such as SCALAR(0xb057c), and you cannot perform pointer arithmetic with it.
A Perl variable resides in a symbol table and holds only one hard reference to its underlying value. Its value may be as simple as a single number or as complex as a hash. There may be other hard references that point to the same value, but the variable that actually holds the value is unaware of them.
A symbolic reference names another variable rather than just pointing to a value.[2] Typeglobs, variable names preceded by *, are a kind of symbolic reference. They are aliases.
[2] Wall, L., Programming Perl, O'Reilly & Associates: Sebastopol, CA, 1996, p. 244.
You may remember using typeglobs in previous examples. In Chapter 11, we discussed how typeglobs were used in the early days of Perl to pass arguments to subroutines by reference. In Chapter 12, typeglobs were used to import symbols onto the symbol table of a package. The following statement uses typeglobs:
*town = *city; # Any type called city can also be referenced as town
The asterisk represents all of the funny characters that prefix variables, including subroutines, filehandles, and formats; i.e., it "globs" onto all the symbols in the symbol table that have that name.[3] *town is an alias for *city. It is your job to determine what symbol you want the alias to reference. This is done by prepending the correct funny character to the alias name when you want to access its underlying value. For example:
[3] This is not the same as the globbing done for filename substitution, as in <p*>.
Given | Then |
---|---|
*town = *city | $town refers to the scalar $city |
 | @town refers to the array @city |
 | $town{"mayor"} refers to an element of a hash, $city{"mayor"} |
Example 13.1 demonstrates another type of symbolic reference where the value of one variable references the name of another variable.
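A minimal sketch of this kind of symbolic reference might look like the following. The variable names $animal and $name are illustrative, not taken from the original example. Symbolic references resolve only to package (global) variables, which is why $animal is declared with our.

```perl
#!/usr/bin/perl
# Sketch: $name holds the *name* of another variable, not its address.
use warnings;

our $animal = "dog";       # package variable (illustrative name)
my  $name   = "animal";    # an ordinary string containing a variable name

{
    no strict 'refs';      # symbolic references are allowed in this block
    print "Value via symbolic reference: ${$name}\n";   # reads $animal
    ${$name} = "cat";                                   # assigns to $animal
}
print "\$animal is now $animal\n";
```

Because $name contains a string rather than an address, Perl looks up a variable called $animal in the symbol table each time ${$name} is evaluated.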
To protect yourself from inadvertently using symbolic references in a program, use the strict pragma with the refs argument. This causes Perl to check that symbolic references are not used in the program. Here, we reexecute the previous example using the strict pragma.
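As a sketch of what the pragma does, the following program attempts a symbolic reference with strict refs in effect. The variables $animal and $name are again illustrative, and the eval block is used here only so that the program can report the run-time error instead of stopping.

```perl
#!/usr/bin/perl
# Sketch: strict refs turns a symbolic reference into a run-time error.
use strict 'refs';
use warnings;

our $animal = "dog";       # package variable (illustrative name)
my  $name   = "animal";    # a string, not a hard reference

# Dereferencing a string is no longer permitted; eval traps the error.
my $value = eval { ${$name} };
my $error = $@;

print "Perl refused the symbolic reference:\n$error" if $error;
```

Without the eval, the program would die at the dereference with a message stating that a string cannot be used as a SCALAR ref while "strict refs" is in use.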
We discussed pointers in Chapter 11 when passing references to a subroutine. To reiterate: A hard reference is a scalar that holds the address of another data type. A variable that is assigned an address can also be called a pointer because it points to some other address or to another reference. This type of reference can point to a scalar, array, associative array, or a subroutine. The pointer was introduced in Perl 5 to give you the ability to create complex data types, such as arrays of arrays, arrays of hashes, hashes of hashes, etc. In all of the examples where typeglobs were used, we can now opt for pointers instead. Pointers provide a way to pass parameters to subroutines by reference.
The backslash unary operator is used to create a hard reference, similar to the & used in C to get the "address of." In the following example, $p is the reference. It is assigned the address of the scalar $x.
$p = \$x;
An example of hard references from the Perl man page perlref:
    $scalarref = \$foo;        # reference to scalar $foo
    $arrayref  = \@ARGV;       # reference to array @ARGV
    $hashref   = \%ENV;        # reference to hash %ENV
    $coderef   = \&handler;    # reference to subroutine handler
    $globref   = \*STDOUT;     # reference to typeglob STDOUT
    $reftoref  = \$scalarref;  # reference to another reference (pointer to pointer, ugh)
If you print the value of a reference (or pointer), you will see an address. If you want to go to that address and get the value stored there—that is, dereference the pointer—the pointer must be prefaced by two "funny" symbols. The first is the dollar sign, because the pointer itself is a scalar, and preceding that goes the funny symbol representing the type of data to which it points. When using more complex types, the arrow (infix) operator can be used.
(The Script)
    #!/bin/perl
    $num = 5;
    $p = \$num;    # $p gets the address of $num
    print 'The address assigned $p is ', $p, "\n";
    print "The value stored at that address is $$p\n";    # dereference

(Output)
    The address assigned $p is SCALAR(0xb057c)
    The value stored at that address is 5
(The Script)
    #!/bin/perl
    @toys = qw( Barbie Elmo Thomas Barney );
    $num = @toys;
    %games = ( "Nintendo"  => "Wii",
               "Sony"      => "PlayStation 3",
               "Microsoft" => "XBox 360",
             );
    $ref1 = \$num;      # Create pointers
    $ref2 = \@toys;
    $ref3 = \%games;
    print "There are $$ref1 toys.\n";    # dereference pointers
    print "They are: ", join(",", @$ref2), ".\n";
    print "Jessica's favorite toy is $ref2->[0].\n";
    print "Willie's favorite toy is $ref2->[2].\n";
    while( ($key, $value) = each(%$ref3) ){
        print "$key => $value\n";
    }
    print "They waited in line for a $ref3->{'Nintendo'}\n";

(Output)
    There are 4 toys.
    They are: Barbie,Elmo,Thomas,Barney.
    Jessica's favorite toy is Barbie.
    Willie's favorite toy is Thomas.
    Microsoft => XBox 360
    Sony => PlayStation 3
    Nintendo => Wii
    They waited in line for a Wii
It is not necessary to name a variable to create a reference (pointer) to it. If a variable or subroutine has no name, it is called anonymous. If an anonymous variable (or subroutine) is assigned to a scalar, then the scalar is a reference to that variable (subroutine).
The arrow operator (->), also called the infix operator, is used to dereference the reference to anonymous arrays and hashes. Although not really necessary, the arrow operator makes the program easier to read.
Anonymous array elements are enclosed in square brackets ([]). These square brackets are not to be confused with the square brackets used to subscript an array. Here they are used as an expression to be assigned to a scalar. The brackets will not be interpolated if enclosed within quotes. The arrow (infix) operator is used to get the individual elements of the array.
(The Script)
    #!/bin/perl
    my $arrayref = [ 'Woody', 'Buzz', 'Bo', 'Mr. Potato Head' ];
    print "The value of the reference, \$arrayref is ", $arrayref, "\n";
    # All of these examples dereference $arrayref
    print "$arrayref->[3]", "\n";
    print $$arrayref[3], "\n";
    print ${$arrayref}[3], "\n";
    print "@{$arrayref}", "\n";

(Output)
    The value of the reference, $arrayref is ARRAY(0x8a6f134)
    Mr. Potato Head
    Mr. Potato Head
    Mr. Potato Head
    Woody Buzz Bo Mr. Potato Head
An anonymous hash is created by using curly braces ({}). You can mix array and hash composers to produce complex data types. These braces are not the same braces that are used when subscripting a hash. The anonymous hash is assigned to a scalar reference.
(The Script)
    #!/bin/perl
    my $hashref = { "Name" => "Woody",
                    "Type" => "Cowboy"
                  };
    print $hashref->{"Name"}, "\n\n";
    print keys %$hashref, "\n";
    print values %$hashref, "\n";

(Output)
    Woody

    NameType
    WoodyCowboy
The ability to create references (pointers) to anonymous data structures lends itself to more complex types. For example, you can have hashes nested in hashes or arrays of hashes or arrays of arrays, etc.
Just as with simpler references, the anonymous data structures are dereferenced by prepending the reference with the correct funny symbol that represents its data type. For example, if $p is a pointer to a scalar, you can write $$p to dereference the scalar, and if $p is a pointer to an array, you can write @$p to dereference the array or $$p[0] to get the first element of the array. You can also dereference a pointer by treating it as a block: $$p[0] could also be written ${$p}[0], and a slice of the first four elements could be written @{$p}[0..3]. Sometimes the braces are used to prevent ambiguity, and sometimes they are necessary so that the funny character dereferences the correct part of the structure.
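The equivalent forms described above can be compared side by side. The following is a small sketch; the array @colors and the hash %palette are made-up names used only for illustration.

```perl
#!/usr/bin/perl
# Sketch: equivalent ways to dereference an array reference.
use strict;
use warnings;

my @colors = qw(red green blue yellow);   # illustrative data
my $p      = \@colors;

my $first_a = $$p[0];          # funny-character form
my $first_b = ${$p}[0];        # block form
my $first_c = $p->[0];         # arrow form
my @slice   = @{$p}[0..2];     # slice of the first three elements
my $count   = @{$p};           # array in scalar context: element count

# Braces are required when the reference is stored inside a structure,
# so that @ applies to the value of $palette{warm} as a whole.
my %palette = ( warm => [ 'red', 'orange' ] );
my @warm    = @{ $palette{warm} };

print "$first_a $first_b $first_c\n";    # red red red
print "@slice ($count elements)\n";      # red green blue (4 elements)
print "@warm\n";                         # red orange
```

All three single-element forms reach the same element; the arrow form is usually the easiest to read once structures are nested.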
A list may contain another list or set of lists, most commonly used to create a multidimensional array. A reference is assigned an anonymous array containing another anonymous array in Examples 13.7 and 13.8.
(The Script)
    #!/bin/perl
    # Program to demonstrate a reference to a list with a
    # nested list
    my $arrays = [ '1', '2', '3', [ 'red', 'blue', 'green' ] ];
    for($i=0;$i<3;$i++){
        print $arrays->[$i], "\n";
    }
    for($i=0;$i<3;$i++){
        print $arrays->[3]->[$i], "\n";
    }
    print "@{$arrays}\n";
    print "--@{$arrays->[3]}--", "\n";

(Output)
    1
    2
    3
    red
    blue
    green
    1 2 3 ARRAY(0x8a6f134)
    --red blue green--
(The Script)
    #!/bin/perl
    # Program to demonstrate a pointer to a two-dimensional array.
    my $matrix = [
                   [ 0, 2, 4 ],
                   [ 4, 1, 32 ],
                   [ 12, 15, 17 ]
                 ];
    print "Row 3 column 2 is $matrix->[2]->[1].\n";
    print "Dereferencing with two loops.\n";
    for($x=0;$x<3;$x++){
        for($y=0;$y<3;$y++){
            print "$matrix->[$x]->[$y] ";
        }
        print "\n\n";
    }
    print "\n";
    print "Dereferencing with one loop.\n";
    for($i = 0; $i < 3; $i++){
        print "@{$matrix->[$i]}", "\n\n";
    }
    $p = \$matrix;    # Reference to a reference
    print "Dereferencing a reference to reference.\n";
    print ${$p}->[1][2], "\n";

(Output)
    Row 3 column 2 is 15.
    Dereferencing with two loops.
    0 2 4

    4 1 32

    12 15 17


    Dereferencing with one loop.
    0 2 4

    4 1 32

    12 15 17

    Dereferencing a reference to reference.
    32
A list may contain a hash or a set of hashes. In Example 13.9, a reference is assigned an anonymous array containing two anonymous hashes.
(The Script)
    my $petref = [ { "name"  => "Rover",
                     "type"  => "dog",
                     "owner" => "Mr. Jones",
                   },
                   { "name"  => "Sylvester",
                     "type"  => "cat",
                     "owner" => "Mrs. Black",
                   }
                 ];
    print "The first pet's name is $petref->[0]->{name}.\n";
    print "Printing an array of hashes.\n";
    for($i=0; $i<2; $i++){
        while( ($key, $value) = each %{$petref->[$i]} ){
            print "$key -- $value\n";
        }
        print "\n";
    }
    print "Adding a hash to the array.\n";
    push @{$petref}, { "owner" => "Mrs. Crow",
                       "name"  => "Tweety",
                       "type"  => "bird"
                     };
    while( ($key, $value) = each %{$petref->[2]} ){
        print "$key -- $value\n";
    }

(Output)
    The first pet's name is Rover.
    Printing an array of hashes.
    owner -- Mr. Jones
    type -- dog
    name -- Rover

    owner -- Mrs. Black
    type -- cat
    name -- Sylvester

    Adding a hash to the array.
    type -- bird
    owner -- Mrs. Crow
    name -- Tweety
A hash may contain another hash or a set of hashes. In Example 13.10, a reference is assigned an anonymous hash consisting of two keys, each of which is associated with a value that happens to be another hash (consisting of its own key/value pairs).
(The Script)
    #!/bin/perl
    # Program to demonstrate a hash containing anonymous hashes.
    my $hashref = {
        Math    => {                 # key
                     "Anna" => 100,
                     "Hao"  => 95,   # values
                     "Rita" => 85,
                   },
        Science => {                 # key
                     "Sam"   => 78,
                     "Lou"   => 100, # values
                     "Vijay" => 98,
                   },
    };
    print "Anna got $hashref->{'Math'}->{'Anna'} on the Math test.\n";
    $hashref->{'Science'}->{'Lou'} = 90;
    print "Lou's grade was changed to $hashref->{'Science'}->{'Lou'}.\n";
    print "The nested hash of Math students and grades is: ";
    print %{$hashref->{'Math'}}, "\n";    # Prints the nested hash, Math
    foreach $key (keys %{$hashref}){
        print "Outer key: $key \n";
        while( ($nkey, $nvalue) = each(%{$hashref->{$key}}) ){
            printf "\tInner key: %-5s -- Value: %-8s\n", $nkey, $nvalue;
        }
    }

(Output)
    Anna got 100 on the Math test.
    Lou's grade was changed to 90.
    The nested hash of Math students and grades is: Rita85Hao95Anna100
    Outer key: Science
        Inner key: Lou   -- Value: 90
        Inner key: Sam   -- Value: 78
        Inner key: Vijay -- Value: 98
    Outer key: Math
        Inner key: Rita  -- Value: 85
        Inner key: Hao   -- Value: 95
        Inner key: Anna  -- Value: 100
A hash may contain nested hashes whose keys are associated with lists of values. In Example 13.11, a reference is assigned an anonymous hash with two keys. The value of each key is a nested hash, and the key of each nested hash is, in turn, associated with an anonymous array of values.
(The Script)
    # A hash with nested hash keys and anonymous arrays of values
    my $hashptr = {
        "Teacher"  => { "Subjects"    => [ qw(Science Math English) ] },
        "Musician" => { "Instruments" => [ qw(piano flute harp) ] },
    };
    # Teacher and Musician are keys.
    # The values consist of nested hashes.
    print $hashptr->{"Teacher"}->{"Subjects"}->[0], "\n";
    print "@{$hashptr->{'Musician'}->{'Instruments'}}\n";

(Output)
    Science
    piano flute harp
An anonymous subroutine is created by using the keyword sub without a subroutine name. Because the definition is an expression assigned to a scalar, it is terminated with a semicolon. For more on using anonymous subroutines, see "Closures" in Chapter 14.
(The Script)
    #!/bin/perl
    my $subref = sub { print @_ ; };
    &$subref('a','b','c');
    print "\n";

(Output)
    abc
Arguments passed to a subroutine are stored in the @_ array. If you pass several arguments, say an array, a scalar, and another array, they are all flattened into one list in @_. It would be hard to tell where one argument ends and the next begins unless you also passed the size of each array, and then you would have to fetch each size from @_ to find out where the first array ended, and so on. The @_ array could also become quite large if you pass a 1,000-element array. So, the easiest and most efficient way to pass arguments is by address, as shown in Example 13.13.
(The Script)
    @toys = qw(Buzzlightyear Woody Bo);
    $num = @toys;    # Number of elements in @toys is assigned to $num
    gifts( \$num, \@toys );    # Passing by reference
    sub gifts {
        my($n, $t) = @_;    # Localizing the reference with 'my'
        print "There are $$n gifts: ";
        print "@$t\n";
        push(@$t, 'Janey', 'Slinky');
    }
    print "The original array was changed to: @toys\n";

(Output)
    There are 3 gifts: Buzzlightyear Woody Bo
    The original array was changed to: Buzzlightyear Woody Bo Janey Slinky
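To make the flattening problem concrete, the following sketch passes the same two arrays first as plain lists and then as references. The subroutine names count_flat and sizes_by_ref are hypothetical, chosen only for this illustration.

```perl
#!/usr/bin/perl
# Sketch: flattened arguments versus arguments passed by reference.
use strict;
use warnings;

my @first  = (1, 2, 3);
my @second = (4, 5);

# Passed as lists, both arrays are merged into one flat @_.
sub count_flat {
    my @args = @_;
    return scalar @args;
}

# Passed as references, each array arrives as a single scalar in @_.
sub sizes_by_ref {
    my ($aref, $bref) = @_;
    return ( scalar @$aref, scalar @$bref );
}

my $flat      = count_flat(@first, @second);
my ($n1, $n2) = sizes_by_ref(\@first, \@second);

print "Flattened: $flat elements, boundaries lost\n";    # 5 elements
print "By reference: $n1 and $n2 elements\n";            # 3 and 2 elements
```

In the second call @_ holds exactly two scalars regardless of how large the arrays are, so the subroutine can tell them apart and no elements are copied.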
One of the only ways to pass a filehandle to a subroutine is by reference. You can use a typeglob to create an alias for the filehandle and then use the backslash to create a reference to the typeglob. Wow...
(The Script)
    #!/bin/perl
    open(README, "/etc/passwd") || die;
    &readit(\*README);    # Reference to a typeglob
    sub readit {
        my ($passwd) = @_;
        print "\$passwd is a $passwd.\n";
        while(<$passwd>){
            print;
        }
    }
    seek(README,0,0) || die "seek: $!\n";    # Reset back to beginning of file

(Output)
    $passwd is a GLOB(0xb0594).
    root:x:0:1:Super-User:/:/usr/bin/csh
    daemon:x:1:1::/:
    bin:x:2:2::/usr/bin:
    sys:x:3:3::/:
    adm:x:4:4:Admin:/var/adm:
    lp:x:71:8:Line Printer Admin:/usr/spool/lp:
    smtp:x:0:0:Mail Daemon User:/:
    uucp:x:5:5:uucp Admin:/usr/lib/uucp:
    nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
    listen:x:37:4:Network Admin:/usr/net/nls:
    nobody:x:60001:60001:Nobody:/:
    noaccess:x:60002:60002:No Access User:/:
    nobody4:x:65534:65534:SunOS 4.x Nobody:/:
    ellie:x:9496:40:Ellie Quigley:/home/ellie:/usr/bin/csh
    seek: Bad file number
The ref function is used to test whether a value is a reference. If the argument for ref is a pointer variable, ref returns the type of data the reference points to; e.g., SCALAR is returned if the reference points to a scalar, and ARRAY is returned if it points to an array. If the argument is not a reference, the null string is returned. Table 13.1 lists the values returned by the ref function.
What Is Returned | Meaning |
---|---|
REF | Pointer to pointer |
SCALAR | Pointer to scalar |
ARRAY | Pointer to array |
HASH | Pointer to hash |
CODE | Pointer to subroutine |
GLOB | Pointer to typeglob |
(The Script)
    sub gifts;    # Forward declaration
    $num = 5;
    $junk = "xxx";
    @toys = qw/Budlightyear Woody Thomas/ ;
    gifts( \$num, \@toys, $junk );
    sub gifts {
        my( $n, $t, $j) = @_;
        print "\$n is a reference.\n" if ref($n);
        print "\$t is a reference.\n" if ref($t);
        print "\$j is not a reference.\n" if ref($j);
        printf "\$n is a reference to a %s.\n", ref($n);
        printf "\$t is a reference to an %s.\n", ref($t);
    }

(Output)
    $n is a reference.
    $t is a reference.
    $n is a reference to a SCALAR.
    $t is a reference to an ARRAY.
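As a further sketch, the return values listed in Table 13.1 can be observed by calling ref on one sample of each kind of reference. The helper describe and the sample values are hypothetical and serve only to illustrate the table.

```perl
#!/usr/bin/perl
# Sketch: what ref returns for each kind of reference in Table 13.1.
use strict;
use warnings;

# Hypothetical helper: reports the type named by ref, or notes that
# the argument is not a reference (ref returns the null string).
sub describe {
    my ($arg) = @_;
    my $type  = ref($arg);
    return $type eq '' ? "not a reference" : "reference to $type";
}

my $num     = 5;
my @samples = ( \$num,            # SCALAR
                [ 1, 2 ],         # ARRAY
                { a => 1 },       # HASH
                sub { 1 },        # CODE
                \*STDOUT,         # GLOB
                \\$num,           # REF (pointer to pointer)
                "xxx",            # an ordinary string
              );

my @results = map { describe($_) } @samples;
print "$_\n" for @results;
```

Testing the return value against the null string, as describe does, is one way to handle both cases explicitly, since ref returns a false value for anything that is not a reference.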