Chapter 5. What's in a Name

5.1. About Perl Variables

Before starting this chapter, a note to you, the reader. Each line of code in an example is numbered. The output and explanations are also numbered to match the number in the code. These numbers are provided to help you understand important lines of each program. When copying examples into your text editor, don't include these numbers, or you will generate many unwanted errors! With that said, let's proceed.

5.1.1. Types

Variables are fundamental to all programming languages. They are data items whose values may change throughout the run of the program, whereas literals or constants remain fixed. They can be placed anywhere in the program and do not have to be declared as in other higher-level languages, where you must specify the data type that will be stored there. You can assign strings, numbers, or a combination of these to Perl variables. For example, you may store a number in a variable and then later change your mind and store a string there. Perl doesn't care.
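As a minimal sketch of this flexibility, the same scalar below holds a number, then a string, then a string that Perl converts back to a number when arithmetic calls for it. The variable names are only illustrative, and the `use strict`, `use warnings`, and `my` lines are a modern safety habit assumed here, not something the text above requires.

```perl
use strict;
use warnings;

my $item    = 42;               # the variable holds a number
my $doubled = $item * 2;        # numeric use: 84

$item = "forty-two";            # now the same variable holds a string
my $label = "Item: $item";      # string use

$item = "10";                   # a string that looks like a number
my $sum = $item + 5;            # converted for arithmetic: 15

print "$doubled\n$label\n$sum\n";
```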

Perl variables are of three types: scalar, array, and associative array (more commonly called a hash). A scalar variable contains a single value (e.g., one string or one number), an array variable contains an ordered list of values indexed by an integer, starting at zero, and a hash contains an unordered set of key/value pairs indexed by a string (the key) that is associated with a corresponding value. (See "Scalars, Arrays, and Hashes" on page 77.)

5.1.2. Scope and the Package

The scope of a variable determines where it is visible in the program. By default, a variable in a Perl script is visible to the entire script (i.e., it is global in scope) and can be changed anywhere within the script.

The Perl sample programs you have seen in the previous chapters are compiled internally into what is called a package, which provides a namespace for variables. Almost all variables are global within that package. A global variable is known to the whole package and, if changed anywhere within the package, the change will permanently affect the variable. The default package is called main, similar to the main() function in the C language. Such variables in C would be classified as static. At this point, you don't have to worry about naming the main package or the way in which it is handled during the compilation process. The only purpose in mentioning packages now is to let you know that the scope of variables in the main package, your script, is global. Later, when we talk about the our, local, and my functions in packages, you will see that it is possible to change the scope and namespace of a variable.
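The following sketch illustrates the idea of a global variable in package main. It assumes the script runs in the default package; the names `$count` and `bump` are hypothetical, and `our` is used only so that the example also passes under `use strict`.

```perl
use strict;
use warnings;

our $count = 1;                 # package (global) variable in package main

sub bump { $count++ }           # code anywhere in the package can change it
bump();

print "Current package: ", __PACKAGE__, "\n";   # main
print "count = $count\n";                       # 2
print "same variable: $main::count\n";          # 2, via the qualified name
```

The fully qualified name `$main::count` and the short name `$count` refer to the same storage, which is what "global within the package" means in practice.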

Figure 5.1. Namespaces for scalars, lists, and hashes in package main.

5.1.3. Naming Conventions

Unlike C or Java, Perl variables don't have to be declared before being used. They spring to life just by the mere mention of them. Variables have their own namespace in Perl. They are identified by the "funny characters" that precede them. Scalar variables are preceded by a $ sign, array variables are preceded by an @ sign, and hash variables are preceded by a % sign. Since the "funny characters" indicate what type of variable you are using, you can use the same name for a scalar, array, or hash and not worry about a naming conflict. For example, $name, @name, and %name are all different variables; the first is a scalar, the second is an array, and the last is a hash.^[1]

^[1] Using the same name is allowed but not recommended; it makes reading too confusing.
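A short sketch of the separate namespaces follows. The values are made up for illustration, and `use strict` with `my` declarations is an assumption of modern practice rather than part of the text above.

```perl
use strict;
use warnings;

my $name = 'Tom';                                  # scalar
my @name = ('Tom', 'Dick', 'Harry');               # array
my %name = (first => 'Tom', last => 'Savage');     # hash

# The brackets that follow the name tell Perl which variable is meant.
print "$name\n";          # Tom     (the scalar $name)
print "$name[1]\n";       # Dick    (an element of @name)
print "$name{last}\n";    # Savage  (a value in %name)
```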

Since reserved words and filehandles are not preceded by a special character, variable names will not conflict with reserved words or filehandles. Variables are case sensitive. The variables named $Num, $num, and $NUM are all different.

If a variable starts with a letter, it may consist of any number of letters (an underscore counts as a letter) and/or digits. If the variable does not start with a letter, it must consist of only one character. Perl has a set of special variables (e.g., $_, $^, $., $1, $2, etc.) that fall into this category. (See "Special Variables" on page 845 in Appendix A.) In special cases, variables may also be preceded with a single quote but only when packages are used.

An uninitialized variable will get a value of zero or null (the empty string), depending on whether its context is numeric or string.
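The behavior can be sketched as follows. The example assumes `use strict` and `use warnings`, so the variable is declared with `my` and the "uninitialized value" warnings are switched off explicitly; without that line Perl would still produce the same values but would also print warnings.

```perl
use strict;
use warnings;
no warnings 'uninitialized';    # silence the warnings this demo would trigger

my $x;                          # declared but never assigned
my $sum  = $x + 3;              # numeric context: treated as 0
my $text = "***$x***";          # string context: treated as ""

print "$sum\n";                                    # 3
print "$text\n";                                   # ******
print defined $x ? "defined\n" : "undefined\n";    # undefined
```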

5.1.4. Assignment Statements

The assignment operator, the equal sign (=), is used to assign the value on its right-hand side to a variable on its left-hand side. Any value that can be "assigned to" represents a named region of storage and is called an lvalue.^[2] Perl reports an error if the operand on the left-hand side of the assignment operator does not represent an lvalue.

^[2] The value on the left-hand side of the equal sign is called an lvalue, and the value on the right-hand side an rvalue.

When assigning a value or values to a variable, if the variable on the left-hand side of the equal sign is a scalar, Perl evaluates the expression on the right-hand side in a scalar context. If the variable on the left of the equal sign is an array, then Perl evaluates the expression on the right in an array (list) context. (See "Scalars, Arrays, and Hashes" on page 77.)
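A small sketch shows how the left-hand side decides the context. The array contents are borrowed from the chapter's examples; the other names are illustrative, and `my` with `use strict` is assumed.

```perl
use strict;
use warnings;

my @months = ('Mar', 'Apr', 'May');

my @copy    = @months;    # array on the left: all three elements are copied
my $count   = @months;    # scalar on the left: the array yields its element count
my ($first) = @months;    # parentheses make the left side a list: first element

print "@copy\n";          # Mar Apr May
print "$count\n";         # 3
print "$first\n";         # Mar
```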

A simple statement is an expression terminated with a semicolon.

Format

variable=expression;

Example 5.1.

(The Script)
    # Scalar, array, and hash assignment
1   $salary=50000;                        # Scalar assignment
2   @months=('Mar', 'Apr', 'May');        # Array assignment
3   %states= (                            # Hash assignment
        'CA' => 'California',
        'ME' => 'Maine',
        'MT' => 'Montana',
        'NM' => 'New Mexico',
    );
4   print "$salary\n";
5   print "@months\n";
6   print "$months[0], $months[1], $months[2]\n";
7   print "$states{'CA'}, $states{'NM'}\n";
8   print $x + 3, "\n";                   # $x just came to life!
9   print "***$name***\n";                # $name is born!

(Output)
4   50000
5   Mar Apr May
6   Mar, Apr, May
7   California, New Mexico
8   3
9   ******

Explanation

1. The scalar variable $salary is assigned the numeric literal 50000.
2. The array @months is assigned the comma-separated list, Mar, Apr, May. The list is enclosed in parentheses and each list item is quoted.
3. The hash, %states, is assigned a list consisting of a set of strings separated by either a digraph symbol (=>) or a comma.^[a] The string on the left is called the key.^[b] The string to the right is called the value. The key is associated with its value.
4. The value of the scalar, $salary, is printed, followed by a newline.
5. The @months array is printed. Inside double quotes, the elements are printed with a space between each one.
6. The individual elements of the array, @months, are scalars and are thus preceded by a dollar sign ($). The array index starts at zero.
7. The key elements of the hash, %states, are enclosed in curly braces ({}). The associated value is printed. Each value is a single value, a scalar, so it is preceded by a dollar sign ($).
8. The scalar variable, $x, is referenced for the first time. Because the number three is added to $x, the context is numeric. $x has an initial value of zero.
9. The scalar variable, $name, is referenced for the first time. The context is string and the initial value is null.

^[a] The comma can be used in both Perl 4 and Perl 5. The => symbol was introduced in Perl 5.
^[b] The => operator, unlike the comma, causes a single-word key on its left to be quoted automatically. If the key consists of more than one word or begins with a number, you must still quote it yourself.
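The quoting behavior of => can be sketched with a small hash. The hash name `%capital` and its contents are hypothetical, chosen only for illustration, and `my` with `use strict` is assumed.

```perl
use strict;
use warnings;

my %capital = (
    Maine        => 'Augusta',     # single-word key: => quotes it automatically
    'New Mexico' => 'Santa Fe',    # two words: explicit quotes are required
);

# With plain commas, every key and value must be quoted by hand.
my %same = ('Maine', 'Augusta', 'New Mexico', 'Santa Fe');

print "$capital{Maine}\n";               # Augusta
print "$capital{'New Mexico'}\n";        # Santa Fe
print scalar(keys %same), " keys\n";     # 2 keys
```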

5.1.5. Quoting Rules

Since quoting affects the way in which variables are interpreted, this is a good time to review Perl's quoting rules. Perl quoting rules are similar to shell quoting rules. This may not be good news to shell programmers, who find using quotes frustrating, to say the least. It is often difficult to determine which quotes to use, where to use them, and how to find the culprit if they are misused; in other words, it's a real debugging nightmare.^[3] For those of you who fall into this category, Perl offers an alternative method of quoting.^[4]

^[3] Barry Rosenberg, in his book KornShell Programming Tutorial, has a chapter titled "The Quotes From Hell."

^[4] Larry Wall, creator of Perl, calls his alternative quoting method "syntactic sugar."

Perl has three types of quotes and all three types have a different function. They are single quotes, double quotes, and backquotes.

The backslash (\) behaves like a set of single quotes but can be used only to quote a single character.

A pair of single or double quotes may delimit a string of characters. Quotes will either allow the interpretation of special characters or protect special characters from interpretation, depending on the kind of quotes you use.

Single quotes are the "democratic" quotes. All characters enclosed within them are treated equally; in other words, there are no special characters. But the double quotes discriminate. They treat some of the characters in the string as special characters. The special characters include the $ sign, the @ symbol, and escape sequences such as \t and \n.

When backquotes surround an operating system command, the command will be executed by the shell. This is called command substitution. The output of the command will either be printed as part of a print statement or assigned to a variable. If you are using Windows, Linux, or UNIX, the commands enclosed within backquotes must be supported by the particular operating system and will vary from system to system.

No matter what kind of quotes you are using, they must be matched. Because the quotes mark the beginning and end of a string, Perl will complain about a "Might be a multiline runaway string" or "Execution of quotes aborted . . ." or "Can't find string terminator anywhere before EOF..." and fail to compile if you forget one of the quotes.

Double Quotes

Double quotes must be matched unless embedded within single quotes or preceded by a backslash.

When a string is enclosed in double quotes, scalar variables (preceded with a $) and arrays (preceded by the @ symbol) are interpolated (i.e., the value of the variable replaces the variable name in the string). Hashes (preceded by the % sign) are not interpolated within the string enclosed in double quotes.

Strings that contain string literals (e.g., \t, \n) must be enclosed in double quotes for backslash interpretation.

A single quote may be enclosed in double quotes, as in "I don't care!"

Example 5.2.

(The Script)
    # Double quotes
1   $num=5;
2   print "The number is $num.\n";
3   print "I need \$5.00.\n";
4   print "\t\tI can't help you.\n";

(Output)
2   The number is 5.
3   I need $5.00.
4               I can't help you.

Explanation

1. The scalar variable $num is assigned the value 5.
2. The string is enclosed in double quotes. The value of the scalar variable is printed. The string literal, \n, is interpreted.
3. The dollar sign ($) is printed as a literal dollar sign when preceded by a backslash; in other words, variable substitution is ignored.
4. The special literals \t and \n are interpreted when enclosed within double quotes.
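Example 5.2 interpolates only a scalar. The sketch below extends the same rules to arrays and hashes: a whole array and individual elements are interpolated, but a whole hash is not. The variable names and the e-mail address are illustrative, and `my` with `use strict` is assumed.

```perl
use strict;
use warnings;

my @months = ('Mar', 'Apr', 'May');
my %states = (CA => 'California');

my $list  = "Months: @months";       # whole array: elements joined by spaces
my $one   = "First: $months[0]";     # a single array element
my $value = "State: $states{CA}";    # a single hash value
my $hash  = "Hash: %states";         # a whole hash is NOT interpolated
my $email = "user\@example.com";     # the backslash keeps @ literal

print "$list\n$one\n$value\n$hash\n$email\n";
```

The last two lines of output are `Hash: %states` and `user@example.com`, which is why an @ sign inside double quotes usually needs a backslash.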

Single Quotes

If a string is enclosed in single quotes, it is printed literally (what you see is what you get).

If a single quote is needed within a string, then it can be embedded within double quotes or backslashed. If double quotes are to be treated literally, they can be embedded within single quotes.

Example 5.3.

(The Script)
    # Single quotes
1   print 'I need $100.00.', "\n";
2   print 'The string literal, \t, is used to represent a tab.', "\n";
3   print 'She cried, "Help me!"', "\n";

(Output)
1   I need $100.00.
2   The string literal, \t, is used to represent a tab.
3   She cried, "Help me!"

Explanation

1. The dollar sign is interpreted literally. In double quotes, it would be interpreted as a scalar. The \n is in double quotes in order for backslash interpretation to occur.
2. The string literal, \t, is not interpreted to be a tab but is printed literally.
3. The double quotes are protected when enclosed in single quotes (i.e., they are printed literally).
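One detail worth a sketch: within single quotes the backslash is still honored in front of a single quote or another backslash, and nowhere else. The strings below are illustrative, and `my` with `use strict` is assumed.

```perl
use strict;
use warnings;

my $quote = 'I can\'t help you.';     # \' yields a single quote
my $path  = 'C:\\temp';               # \\ yields one backslash
my $raw   = 'tab is \t, total $5';    # everything else stays literal

print "$quote\n";    # I can't help you.
print "$path\n";     # C:\temp
print "$raw\n";      # tab is \t, total $5
```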

Backquotes

UNIX/Windows^[5] commands placed within backquotes are executed by the shell, and the output is returned to the Perl program. The output is usually assigned to a variable or made part of a print string. When the output of a command is assigned to a variable, the context is scalar (i.e., a single value is assigned).^[6] For command substitution to take place, the backquotes cannot be enclosed in either double or single quotes. (Make note, UNIX shell programmers, backquotes cannot be enclosed in double quotes as in shell programs.)

^[5] If using other operating systems, such as DOS or Mac OS 9.1 and below, the OS commands available for your system will differ.

^[6] If output of a command is assigned to an array, the first line of output becomes the first element of the array, the second line of output becomes the next element of the array, and so on.

Example 5.4.

(The Script for Unix/Linux)
    # Backquotes and command substitution
1   print "The date is ", `date`;           # Windows users: `date /T`
2   print "The date is `date`", ".\n";      # Backquotes treated literally
3   $directory=`pwd`;                       # Windows users: `cd`
4   print "\nThe current directory is $directory.";

(Output)
1   The date is Mon Jun 25 17:27:49 PDT 2007
2   The date is `date`.
4   The current directory is /home/jody/ellie/perl
    .

Explanation

1. The UNIX date command will be executed by the shell, and the output will be returned to Perl's print string. The output of the date command includes the newline character. For Windows users, the command is `date /T`.
2. Command substitution will not take place when the backquotes are enclosed in single or double quotes.
3. The scalar variable $directory is assigned the output of the UNIX pwd command (i.e., the present working directory), including the newline. For Windows users, the command is `cd`.
4. The value of the scalar, $directory, is printed to the screen. Because the value still ends with a newline, the period appears on the following line.
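A common follow-up is to strip the trailing newline with chomp, and to capture multi-line output in an array. The sketch below assumes a UNIX-like shell (the semicolon between commands is shell syntax); the variable names are illustrative, and `my` with `use strict` is assumed.

```perl
use strict;
use warnings;

# Scalar context: all of the command's output arrives as one string.
my $greeting = `echo hello`;
chomp $greeting;                        # remove the trailing newline
print "The shell said $greeting.\n";    # The shell said hello.

# Array on the left: one element per line of output.
my @lines = `echo one; echo two`;
chomp @lines;                           # chomp works on every element
print scalar(@lines), " lines: @lines\n";    # 2 lines: one two
```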

Perl's Alternative Quotes

Perl provides an alternative form of quoting—the q, qq, qx, and qw constructs.

The q represents single quotes.
The qq represents double quotes.
The qx represents backquotes.
The qw represents a quoted list of words. (See "Array Slices" on page 84.)

Table 5.1. Alternative Quoting Constructs

Quoting Construct             What It Represents
q/Hello/                      'Hello'
qq/Hello/                     "Hello"
qx/date/                      `date`
@list=qw/red yellow blue/;    @list=('red', 'yellow', 'blue');

The string to be quoted is enclosed in forward slashes, but alternative delimiters can be used for all four of the q constructs. You can use any nonalphanumeric character for the delimiter, such as a pound sign (#) or an exclamation point (!), or paired characters, such as parentheses, square brackets, or curly braces. A single character or paired characters can be used:

q/Hello/
q#Hello#
q{Hello}
q[Hello]
q(Hello)

Example 5.5.

(The Script)
    # Using alternative quotes
1   print 'She cried, "I can\'t help you!"',"\n";   # Clumsy
2   print qq/She cried, "I can't help you!"\n/;     # qq for double quotes
3   print qq(I need $5.00\n);    # Really need single quotes
                                 # for a literal dollar sign to print
4   print q/I need $5.00\n/;     # What about backslash interpretation?
    print qq(I need \$5.00\n);   # Can escape the dollar sign
5   print qq/\n/, q/I need $5.00/,"\n";
6   print q!I need $5.00!,"\n";
7   print "The present working directory is ", `pwd`;
8   print qq/Today is /, qx/date/;
9   print "The hour is ", qx{date +%H};

(Output)
1   She cried, "I can't help you!"
2   She cried, "I can't help you!"
3   I need .00
4   I need $5.00\nI need $5.00
5   I need $5.00
6   I need $5.00
7   The present working directory is /home/jody/ellie/perl
8   Today is Mon Jun 25 17:29:34 PDT 2007
9   The hour is 17

Explanation

1. The string is enclosed in single quotes. This allows the conversational quotes to be printed as literals. The single quote in can\'t is quoted with a backslash so that it will also be printed literally. If it were not quoted, it would be matched with the first single quote. The ending single quote would then have no mate, and, alas, the program would either tell you that you have a runaway quote or search for its mate until it reached the end of file unexpectedly.
2. The qq construct replaces double quotes. Forward slashes delimit the string.
3. Parentheses now delimit the string. Because qq is used, the dollar sign ($) in $5.00 is interpreted as the start of a scalar variable, $5, which has a null value, so only .00 is printed. (This is not the way to handle your money!)
4. The single q replaces single quotes. The $5 is treated as a literal. Unfortunately, so is the \n because backslash interpretation does not take place within single quotes. Without a newline, the next line is run together with line 4. In the next line, the dollar sign is preceded by a backslash, which "escapes" the special meaning of the $. Now the string prints correctly.
5. The \n is double quoted with the qq construct, the string I need $5.00 is single quoted with the q construct, and old-fashioned double quotes are used for the second \n.
6. An alternative delimiter, the exclamation point (!), is used with the q construct (instead of the forward slash) to delimit the string.
7. The string "The present working directory is " is enclosed in double quotes; the UNIX command pwd is enclosed in backquotes for command substitution.
8. The qq construct quotes Today is; the qx construct replaces the backquotes used for command substitution.
9. Alternative delimiters, the curly braces, are used with the qx construct (instead of the forward slash). The output of the UNIX date command is printed.
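Example 5.5 does not exercise qw or nested delimiters, so here is a brief sketch of both. The variable names and strings are illustrative, and `my` with `use strict` is assumed.

```perl
use strict;
use warnings;

my @colors = qw(red yellow blue);           # same as ('red', 'yellow', 'blue')
my $count  = @colors;

# Paired delimiters may nest, so the inner parentheses need no escaping.
my $note = q(Call me (maybe) for $5.00);    # single-quote rules: $5 stays literal
my $line = qq{Colors: @colors};             # double-quote rules: array interpolated

print "$count\n";    # 3
print "$note\n";     # Call me (maybe) for $5.00
print "$line\n";     # Colors: red yellow blue
```

Choosing a delimiter that does not occur in the string itself is usually the simplest way to avoid backslashes altogether.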