If the first line of the script begins with the #! symbols (which makes it the shbang line) followed by the full pathname of the file where your version of the Perl executable resides, it tells the kernel which program is to interpret the script. An example of the startup line might be
#!/usr/bin/perl
It is extremely important that the path to the interpreter is entered correctly after the shbang (#!). Perl may be installed in different directories on different systems. Most Web servers will look for this line when invoking CGI scripts written in Perl. Any inconsistency will cause a fatal error. To find the path to the Perl interpreter on your system, type at your UNIX prompt[1]:
which perl
[1] Another way to find the interpreter would be: find / -name '*perl*' -print;
If the shbang line is the first line of the script, you can execute the script directly from the command line by its name. If the shbang is not the first line of the script, the UNIX shell will try to interpret the program as a shell script, and the shbang line will be interpreted as a comment line. (See "Executing the Script" on page 36 for more on how to execute Perl programs.)
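As a minimal sketch of a script that begins with the startup line, the following assumes Perl is installed at /usr/bin/perl; the path on your system may differ. It uses Perl's built-in $^X variable, which holds the name of the Perl executable running the script, to report which interpreter was actually started. The variable name $interpreter is chosen only for illustration.

```perl
#!/usr/bin/perl
# Assumed interpreter path; replace it with the output of "which perl".
use strict;
use warnings;

# $^X is a built-in variable holding the name of the Perl executable
# that is running this script.
my $interpreter = $^X;
print "This script is being run by: $interpreter\n";
```

If the reported name differs from the path on the first line, the script was probably started by passing it to a different Perl on the command line.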
Mac OS X is really just a version of UNIX and comes bundled with Perl 5.8. You open a terminal and use Perl exactly the same way you would use it on Solaris, Linux, *BSD, HP-UX, AIX, etc.
Win32 platforms don't provide the shbang syntax or anything like it.[2] For Windows XP and Windows NT 4.0[3] you can associate a Perl script with extensions such as .pl or .plx and then run your script directly from the command line. At the command-line prompt or from the system control panel, you can set the PATHEXT environment variable to the name of the extension that will be associated with Perl scripts. At the command line, to set the environment variable, type
SET PATHEXT=.pl;%PATHEXT%
[2] Although Win32 platforms don't ordinarily require the shbang line, the Apache Web server does, so you will need the shbang line if you are writing CGI scripts that will be executed by Apache.
[3] File association does not work on Windows 95 unless the program is started from the Explorer window.
At the control panel, to make the association permanent, do the following:
1. Go to the Start menu.
2. Select Settings or just select Control Panel.
3. Select Control Panel.
4. In the control panel, click on the System icon.
5. Click on Advanced.
6. Click on Environment Variables.
7. Click on New.
8. Type PATHEXT in the Variable Name box.
9. In the Variable Value box, type the extension you want, followed by a semicolon and %PATHEXT%.
10. OK the setting.
From now on, when you create a Perl script, append the extension you have chosen to its name, such as myscript.pl or myscript.plx. Then the script can be executed directly at the command line by just typing the script name without the extension, e.g., myscript. (See "Executing the Script" on page 36 for more on script execution.)
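One way to confirm that the setting took effect is to inspect PATHEXT from inside Perl. The sketch below is only an illustration: has_extension is a hypothetical helper written for this example, not a built-in function, and it assumes PATHEXT holds extensions separated by semicolons, as in the SET command shown earlier.

```perl
use strict;
use warnings;

# Hypothetical helper: return true if $ext (for example ".pl") appears
# in a PATHEXT-style string of semicolon-separated extensions.
# The comparison ignores case.
sub has_extension {
    my ($pathext, $ext) = @_;
    return 0 unless defined $pathext;
    foreach my $entry (split /;/, $pathext) {
        return 1 if lc($entry) eq lc($ext);
    }
    return 0;
}

# On Windows, PATHEXT is found in the environment; on other systems
# it is usually not set at all.
my $pathext = $ENV{PATHEXT};
if (has_extension($pathext, '.pl')) {
    print ".pl is listed in PATHEXT\n";
} else {
    print ".pl is not listed in PATHEXT\n";
}
```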
Since you will be using a text editor to write Perl scripts, you can use any of the editors provided by your operating system or download more sophisticated editors specifically designed for Perl, including third-party editors and Integrated Development Environments (IDEs). Table 3.1 lists some of the editors available.
Editor | Operating System |
BBEdit, JEdit | Macintosh |
Wordpad, Notepad, UltraEdit, vim, PerlEdit, JEdit, TextPad | Windows |
pico, vi, emacs, PerlEdit, JEdit | Linux/UNIX |
Komodo | Linux, Mac OS, Windows |
OptiPerl, PerlExpress | Windows |
Affrus | Mac OS X |
The only naming convention for a Perl script is that it follow the naming conventions for files on your operating system (upper-/lowercase letters, numbers, etc.). If, for example, you are using Linux, filenames are case sensitive, and since there are a great number of system commands, you may want to add an extension to your Perl script names to make sure the names are unique. You are not required to add an extension to the filename unless you are creating libraries or modules, writing CGI scripts if the server requires a specific extension, or have set up Windows to expect an extension on certain types of files. By adding a unique extension to the name, you can prevent clashes with other programs that might have the same name. For example, UNIX provides a command called "test". If you name a script "test", which version will be executed? If you're not sure, you can add a .plx or .perl extension to the end of the Perl script name to give it its own identity.
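To see whether a name you have in mind is already taken, you can search the directories in your PATH. The following is a rough sketch under stated assumptions: commands_named is a hypothetical helper written for this example, and it only looks for executable files, so it will not notice commands that are built into the shell itself.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full path of every executable file
# called $name in the directories listed in the PATH variable.
sub commands_named {
    my ($name) = @_;
    my @found;
    foreach my $dir (File::Spec->path()) {
        my $candidate = File::Spec->catfile($dir, $name);
        push @found, $candidate if -f $candidate && -x $candidate;
    }
    return @found;
}

my @clashes = commands_named('test');
if (@clashes) {
    print "A command named 'test' already exists: $_\n" for @clashes;
} else {
    print "No command named 'test' was found in your PATH.\n";
}
```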
And of course, give your scripts sensible names that indicate the purpose of the script rather than names like "foo", "foobar", or "testing".
Perl is called a free-form language, meaning you can place statements anywhere on the line and even continue them across lines. Whitespace refers to spaces, tabs, and newlines. The newline is represented as "\n" and must be enclosed in double quotes. Whitespace is used to delimit words. Any number of blank spaces is allowed between symbols and words. Whitespace enclosed in single or double quotes is preserved; otherwise, it is ignored. The following expressions are the same:
5+4*2 is the same as 5 + 4 * 2;
And both of the following Perl statements are correct even though the output will show that the whitespace is preserved when quoted.
print "This is a Perl statement.";
print "This is also a Perl statement.";
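The difference between ignored and preserved whitespace can be sketched as follows. The variable names and the exact spacing inside the second string are made up for illustration; any amount of spacing would make the same point.

```perl
use strict;
use warnings;

# Outside quotes, whitespace between symbols is ignored.
my $compact = 5+4*2;
my $spread  = 5   +   4
                  *   2;
print "compact = $compact, spread = $spread\n";   # both are 13

# Inside quotes, every space and newline is kept exactly as typed.
my $plain  = "This is a Perl statement.";
my $spaced = "This    is also
    a Perl statement.";
print "$plain\n";
print "$spaced\n";
```

The second string prints on two lines, with its extra spaces intact, because the newline and the spaces are part of the quoted text.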
Even though you have a lot of freedom when writing Perl scripts, it is better to put statements on their own line and to provide indentation when using blocks of statements (we'll discuss this in Chapter 5). Of course, annotating your program with comments, so that you and others will understand what is going on, is vitally important. See the next section for more on comments.
You may write a very clever Perl script today and in two weeks have no idea what your script was trying to do. If you pass the script on to someone else, the confusion magnifies. Comments are plain text that allow you to insert documentation in your Perl script with no effect on the execution of the program. They are used to help you and other programmers maintain and debug scripts. Perl comments are preceded by a # mark. They extend across the line, but do not continue onto the next line.
Perl does not understand the C language comments /* and */ or C++ comments //.
# This is a comment
print "hello"; # And this is a comment
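A slightly longer sketch shows the same rules at work, along with one case that sometimes surprises newcomers: a # inside a quoted string is ordinary text and does not start a comment. The variable names are chosen only for illustration.

```perl
use strict;
use warnings;

# A comment on a line of its own.
my $greeting = "hello";      # A comment after a statement.

# A comment does not continue onto the next line, so each
# line of a longer note needs its own # mark.

# Inside quotes, # is ordinary text and does not start a comment.
my $label = "Item #5";
print "$greeting\n";
print "$label\n";
```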
Perl executable statements make up most of the Perl script. As in C, the statement is an expression, or series of expressions, terminated with a semicolon. Perl statements can be simple or compound, and a variety of operators, modifiers, expressions, and functions make up a statement, as shown in the following example.
print "Hello, to you!\n";
$now = localtime();
print "Today is $now.\n";
$result = 5 * 4 / 2;
print "Good-bye.\n";
A big part of any programming language is the set of functions built into the language or packaged in special libraries (see Appendix A.1). Perl comes with many useful functions, independent program code that performs some task. When you call a Perl built-in function, you just type its name, or optionally you can type its name followed by a set of parentheses. All function names must be typed in lowercase. Many functions require arguments, messages that you send to the function. For example, the print function needs an argument, the string of text you want to print on the screen; without one, it prints only the contents of the special variable $_. If the function requires arguments, then place the arguments, separated by commas, right after the function name. The function usually returns something after it has performed its particular task. In the script shown at the beginning of this chapter, we called two built-in Perl functions, print and localtime. The print function took a string as its argument and displayed the string of text on the screen. The localtime function, on the other hand, didn't require an argument but returned the current date and time. Both of the following statements are valid ways to call a function with an argument. The argument is "Hello, there.\n".
print("Hello, there.\n");
print "Hello, there.\n";
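The same calling rules apply to other built-in functions. The sketch below uses uc, lc, join, and localtime, all standard Perl built-ins; the variable names are chosen only for illustration.

```perl
use strict;
use warnings;

# Parentheses around the argument list are optional for built-in functions.
my $shout   = uc("hello");      # with parentheses
my $whisper = lc "HELLO";       # without parentheses

# Several arguments are separated by commas.
my $joined = join("-", "a", "b", "c");

# A function that needs no argument can still return a value.
my $now = localtime();          # assigned to a scalar: a date/time string

print "$shout $whisper $joined\n";
print "Today is $now.\n";
```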
A Perl script can be executed at the command line directly by its name when the #! startup line is included in the script file and the script has execute permission (see Example 3.3) or, if using Windows, filename association has been set as discussed in "Startup" on page 32. If the #! is not the first line of the script, you can execute a script by passing the script as an argument to the Perl interpreter.
Perl will then compile and run your script using its own internal form. If you have syntax errors, Perl will let you know. You can check to see if your script has compiled successfully by using the -c switch as follows:
$ perl -c scriptname
To execute a script at either the UNIX or MS-DOS prompt, type
$ perl scriptname
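Both commands can also be driven from inside a Perl program, which gives a self-contained way to try them out. This is only a sketch: it writes a one-line script (made up for this example) to a temporary file, then uses the built-in system function and $^X, the name of the running Perl executable, to do what you would otherwise type at the prompt.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny script to a temporary file (made-up content).
my ($fh, $scriptname) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print $fh qq{print "Hello from a checked script.\\n";\n};
close $fh or die "Cannot close $scriptname: $!";

# Same as typing "perl -c scriptname": compile only, do not run.
my $check_status = system($^X, '-c', $scriptname);
print "Syntax check ", ($check_status == 0 ? "passed" : "failed"), "\n";

# Same as typing "perl scriptname": compile and run.
my $run_status = system($^X, $scriptname);
print "Run ", ($run_status == 0 ? "succeeded" : "failed"), "\n";
```

The system function returns 0 when the command it ran exited successfully, so a nonzero value from the -c step signals that the script did not compile.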
The following example illustrates the five parts of a Perl script:
The startup line (UNIX)
Comments
The executable statements in the body of the script
Checking Perl syntax
The execution of the script (UNIX, Windows)
Expect to make errors and maybe lots of them. You may try many times before you actually get a program to run perfectly. Knowing your error messages is like knowing the quirks of your boss, mate, or even yourself. Some programmers make the same error over and over again. Don't worry. In time, you will learn what most of these messages mean and how to prevent them.
When you execute a Perl script, it takes just one step on your part, but internally the Perl interpreter takes two steps. First, it compiles the entire program into bytecode, an internal representation of the program. After that, Perl's bytecode engine runs the bytecode line by line. If you have compiler errors, such as a missing semicolon at the end of the line, misspelled keyword, or mismatched quotes, you will get what is called a syntax error. These types of errors are picked up by using the -c switch and are usually easy to find once you have become acquainted with them.
After the program passes the compile phase (i.e., you don't get any syntax errors or complaints from the compiler), then you may get what are called runtime, or logical, errors. These errors are harder to find and are probably caused by not anticipating problems that might occur when the program starts running. Or it's possible that the program has faulty logic in the way it was designed. Runtime errors may be caused if a file or database you're trying to open doesn't exist, a user enters bad input, you get into an infinite loop, or you try to illegally divide by zero. Whatever the case, these problems, called "bugs," are harder to find. Perl comes with a debugger that is helpful in determining what caused these logical errors by letting you step through your program line by line. (See "Debugger" on page 858.)
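The two kinds of errors can be seen side by side in a short sketch. It relies on Perl's built-in eval, which traps an error and stores its message in the special variable $@ instead of stopping the program; the code fragments and variable names are made up for illustration.

```perl
use strict;
use warnings;

# A syntax error: the right-hand side of the assignment is missing.
# Evaluating the code as a string lets this sketch survive the
# compile failure and look at the message.
my $broken_code = 'my $total = ;';
eval $broken_code;
my $compile_error = $@;
print "Compile-time problem: $compile_error";

# A runtime error: the code compiles, but fails while running.
my $divisor = 0;
my $result  = eval { 10 / $divisor };
my $runtime_error = $@;
print "Runtime problem: $runtime_error";
```

The first message is produced before any of the broken code runs; the second appears only when the division is actually attempted.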