Perl Programming Language

About

Perl is a general-purpose programming language originally developed for text manipulation and now used for a wide range of tasks including system administration, web development, network programming, GUI development, and more.

Using Perl

To use Perl on a machine, it must be installed. A Perl installation consists mainly of the perl interpreter, which compiles each script into an internal form and executes it every time the script is run (it does not produce a standalone machine-code executable), along with the standard library modules. Most Linux distributions include Perl in their base install, so apart from installing any missing library modules there is usually nothing to change.

Perl scripts or Perl programs are just simple text files (usually with a .pl extension, but not always). Any text editor can create and edit them.

Perl is run from the Linux command line using the perl command.

Variables

Variable names in Perl are prefixed with a symbol called a sigil, which indicates the variable's data type.
https://www.perltutorial.org/perl-variables/

$ – scalar

A scalar can contain a single value such as a number or a string. It can also contain a reference to another data structure.
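
As a minimal sketch of the description above, the following stores a number, a string, and a reference in scalars. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# A scalar holds exactly one value: a number, a string, or a reference.
my $answer   = 42;
my $language = "Perl";
my $list_ref = [10, 20, 30];   # reference to an anonymous array

# Dereference with -> to reach the data behind the reference.
print "$language: $answer, first element: $list_ref->[0]\n";
```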

@ – array

By definition, an array is a variable that provides dynamic storage for a list, with each item in the list given an index, starting at 0.
https://www.perltutorial.org/perl-array/
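
A short illustrative sketch of array indexing and dynamic growth (the variable names are hypothetical). Note that a single element is accessed with the $ sigil, because one element is a scalar.

```perl
use strict;
use warnings;

my @colors = ('red', 'green', 'blue');

my $first = $colors[0];       # indices start at 0
my $last  = $colors[-1];      # negative indices count from the end

push @colors, 'yellow';       # the array grows as needed
my $count = scalar @colors;   # number of elements

print "first=$first last=$last count=$count\n";
```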

% – hash

A Perl hash is defined by named key-value pairs (not indices) and is sometimes referred to as an associative array.
https://www.perltutorial.org/perl-hash/
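
A small sketch of creating, extending, and reading a hash. The hash name and its contents are made up for illustration.

```perl
use strict;
use warnings;

# Keys map to values; there is no guaranteed order.
my %port_for = (http => 80, https => 443);

$port_for{ssh} = 22;                            # add a new pair
my $https_port = $port_for{https};              # look up by key
my $has_ftp    = exists $port_for{ftp} ? 1 : 0; # test for a key

# Sort the keys to get a stable listing.
for my $service (sort keys %port_for) {
    print "$service => $port_for{$service}\n";
}
```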

my – local scope

A my declares the listed variables to be local (lexically) to the enclosing block, file, or eval. If more than one variable is listed, the list must be placed in parentheses.
https://perldoc.perl.org/functions/my.html
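
The scoping rule can be sketched as follows, with hypothetical variable names: a my variable declared inside a block hides an outer variable of the same name only until the block ends.

```perl
use strict;
use warnings;

my $level = 'outer';
my $seen_inside;

{
    my $level = 'inner';     # a new variable, visible only in this block
    $seen_inside = $level;
}

# Declaring more than one variable requires parentheses.
my ($x, $y) = (1, 2);

print "inside=$seen_inside outside=$level sum=", $x + $y, "\n";
```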

$_ – the topic

The scalar variable $_, known as the topic, is Perl's default variable. It is always a scalar, but it can hold any scalar value: a number, a string, or a reference to an array, hash, or object. Many functions and operators use $_ as their default argument when none is given.
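
A brief sketch of $_ acting as the implicit argument. The word list is made up; for, grep, and map each set $_ to the current element, and uc, length, and a bare regex match all fall back to $_.

```perl
use strict;
use warnings;

my @words = ('apple', 'Banana', 'cherry');

my @upper;
for (@words) {          # no loop variable given, so each element is in $_
    push @upper, uc;    # uc with no argument operates on $_
}

my @with_a  = grep { /a/ } @words;     # a bare match tests $_
my @lengths = map  { length } @words;  # length defaults to $_

print "@upper\n@with_a\n@lengths\n";
```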

POD

Plain Old Documentation is what Perl calls its commenting and documentation system.

A POD region is an area of code ignored by the regular Perl compiler, but with the potential to be formatted and output to create documentation.

POD Commands

POD commands start and stop POD regions and format the text within them.

Lines that start with an equals sign (=) followed directly by a POD directive are called POD commands and are interpreted as the start of a POD region.

Perl parses all subsequent lines of a POD region until it reaches a =cut POD command or the end of the file.

Everything after a POD command, and before the next POD command is called a POD block.

Within a POD block, there are POD paragraphs. A POD paragraph consists of non-blank lines of text, separated by one or more blank lines.

Command                      Description
=head1 =head2 =head3 =head4  Begins a POD block; the text in the remainder of the paragraph is a heading, which may contain formatting codes.
=pod                         Begins a POD block. Any text after =pod in the same paragraph is ignored. If =pod appears in the middle of a POD block, it has no effect.
=cut                         Ends a previously started POD block. Any text after =cut on the line is ignored. Starting a POD block with =cut is an error, and POD processors emit a warning for it.

More commands and documentation can be found at perldoc.perl.org.
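
As an illustration of the commands above, here is a minimal sketch of a script with one POD region. The greet function and its documentation are hypothetical. Perl skips everything from =head1 through =cut, so the code after =cut runs normally. POD commands must start in the first column of a line.

```perl
use strict;
use warnings;

=head1 NAME

greet - build a greeting (hypothetical example function)

=head2 greet($name)

Returns the string C<Hello, $name!>.

=cut

# The POD region ended at =cut; this is ordinary code again.
sub greet {
    my ($name) = @_;
    return "Hello, $name!";
}

print greet('World'), "\n";
```

Running perldoc on a file like this should display the formatted documentation instead of executing the code.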

Regex

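Perl has three operators that share this delimiter syntax. The first two take regular expressions; tr/// works on lists of characters and does not use regular expressions:

  • m//: Match a regular expression
  • s///: Substitute text matched by a regular expression
  • tr///: Transliterate characters (no regular expression involved)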

s/// Search and replace

Syntax

# $target       The text to operate on.
# =~            Perl binding operator; applies the operation on its right to $target.
# s             Perl substitution (search and replace) operator.
# /             Delimiter separating the regex, the replacement, and the modifiers.
# {regex}       Text to find in the target (regex pattern to match).
# {replacement} The text to replace found instances with.
# {modifiers}   Flags that change how the substitution behaves:
#     g         Replace all occurrences of the regex in the target, not just the first.
#     r         Return the modified copy and leave the target unchanged (Perl 5.14+).

$target =~ s/{regex}/{replacement}/{modifiers};
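
A concrete sketch of the syntax above, using a made-up string. It contrasts the r modifier, which returns a modified copy, with a plain in-place substitution, which returns the number of replacements made.

```perl
use strict;
use warnings;

my $text = "cat hat cat";

# With /r the original is untouched and the new string is returned.
my $copy = $text =~ s/cat/dog/gr;
my $unchanged = $text;

# Without /r the target is modified and the match count is returned.
my $count = ($text =~ s/cat/dog/g);

print "copy=$copy original_was=$unchanged replaced=$count now=$text\n";
```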

Common Functions

glob()

glob() returns a list of filenames matching the given pattern, expanding wildcards such as * and ? the way a shell would.
https://www.geeksforgeeks.org/perl-glob-function/
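
A self-contained sketch of glob() follows. It creates a few empty files (hypothetical names) in a temporary directory so there is something to match, and it assumes the temporary path contains no spaces, since glob() splits its pattern on whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Basename qw(basename);

# Create a scratch directory that is removed when the program exits.
my $dir = tempdir(CLEANUP => 1);

for my $name ('a.pl', 'b.pl', 'notes.txt') {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

# glob() returns full paths; keep only the file names and sort them.
my @scripts = sort map { basename($_) } glob("$dir/*.pl");

print "@scripts\n";
```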

use

use imports some semantics into the current package from the named module, generally by aliasing certain subroutine or variable names into your package.
https://perldoc.perl.org/functions/use.html

use lib – Add directory to search path

use lib is typically used to add extra directories to Perl’s search path (@INC) so that later do, require, and use statements will find library files that aren’t located in Perl’s default search path. It has no effect on anything outside of the program or module which contains the use lib directive.
https://docstore.mik.ua/orelly/perl/prog3/ch31_13.htm

Add the current directory to the search path:

use lib '.';

Add ~/libperl (the libperl directory under the user's home directory) to the search path:

use lib "$ENV{HOME}/libperl";

Perl Scripting

Starting with a shebang

The #! is commonly called a “shebang”, and it tells the operating system which program should run the script. You’ll also see many shell scripts that start with #!/bin/sh or #!/bin/bash. Here, /usr/bin/perl is the path to the Perl interpreter that will execute the script.

A new Perl script should always begin with:

#!/usr/bin/perl
use warnings;
use strict;

NOTES:

  • #!/usr/bin/perl -w turns on warnings everywhere, including in loaded modules, whereas use warnings; is lexically scoped.
  • #!/usr/bin/perl -T turns on Perl’s taint mode (or “tainting”), which marks external input as “not trusted” until its format has been checked.

Ending with a 1

Place this just before the end of a module file:

1;

Modules must end with a true value; otherwise the module is considered not to have loaded. By convention this value is 1, though any true value works. A module can end with a false value to indicate failure, but this is rarely done; it should instead die() (exit with an error).
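
The effect can be sketched by writing two throwaway module files and loading them with require. GoodDemo and BadDemo are hypothetical names invented for this demonstration; the only difference between them is the final value.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Hypothetical modules: one ends with a true value, one with a false value.
my %source_for = (
    'GoodDemo.pm' => "package GoodDemo;\nsub hello { 'hi' }\n1;\n",
    'BadDemo.pm'  => "package BadDemo;\nsub hello { 'hi' }\n0;\n",
);

for my $file (sort keys %source_for) {
    open my $fh, '>', "$dir/$file" or die "Cannot write $dir/$file: $!";
    print {$fh} $source_for{$file};
    close $fh or die "Cannot close $dir/$file: $!";
}

# require dies if the file does not return a true value.
my $good_ok = eval { require "$dir/GoodDemo.pm"; 1 } ? 1 : 0;
my $bad_ok  = eval { require "$dir/BadDemo.pm";  1 } ? 1 : 0;
my $error   = $@;

print "good loaded: $good_ok, bad loaded: $bad_ok\n";
print "error: $error";
```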

Common Perl Functions

Perl's builtin functions can serve as terms in an expression. A builtin function may be called with or without parentheses around its arguments.

Some examples of common builtin functions are:

package, use, print, die, return, sub, chomp, substr, each, pop, push, splice, sort

See the perlfunc doc for complete builtin function information.
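
A few of these builtins in action, as a rough sketch with made-up data. Some calls use parentheses and some do not, to show that both forms work.

```perl
use strict;
use warnings;

my $line = "hello world\n";
chomp $line;                              # strip the trailing newline
my $word = substr($line, 0, 5);           # first five characters

my @stack = (3, 1, 2);
push @stack, 5;                           # append: (3, 1, 2, 5)
my $top = pop @stack;                     # remove the last element
my @removed = splice(@stack, 1, 1);       # remove one element at index 1
my @sorted  = sort { $a <=> $b } @stack;  # numeric sort of what is left

print "$word $top @removed @sorted\n";
```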

Common Perl Modules

Module Name         Description
Data::Dumper        Pretty-prints data structures.
Date::Parse         Parses date strings into time values.
Date::Format        Formats dates.
DateTime            Date and time object for representing, manipulating, and formatting dates and times.
DBI                 Database-independent access interface; a driver such as DBD::mysql provides MySQL access.
Exporter            Allows a package to export functions and variables to the user’s namespace using the standard import method.
File::Basename      Parses file paths into directory, filename, and suffix.
                    Example: Get the name of the Perl script currently running:
                    my $name = basename($0);
File::stat          By-name (object) interface to the built-in stat(), which returns a 13-element list of file status info.
Getopt::Long        Parses command-line options passed to the script.
HTTP::Cookies       HTTP cookie jar class.
JSON                JSON (JavaScript Object Notation) encoder/decoder.
LWP::UserAgent      Makes HTTP requests.
Net::DNS::Resolver  A DNS resolver class.
Term::ReadKey       Reads keyboard input.
Time::Zone          Miscellaneous timezone manipulation routines.
URI::Escape         Percent-encodes and percent-decodes unsafe URI characters.
YAML::Tiny          Reads and writes YAML-style files.
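
As a quick sketch of two of these modules (both ship with Perl), the following splits a path with File::Basename and prints a structure with Data::Dumper. The path is only an example string, it does not need to exist, and the results shown in the comments assume Unix-style paths.

```perl
use strict;
use warnings;
use File::Basename qw(basename fileparse);
use Data::Dumper;

my $path = '/path/to/a/perl/file.pl';

# fileparse returns (filename, directory, suffix); the regex says what
# counts as a suffix.
my ($name, $dir, $suffix) = fileparse($path, qr/\.[^.]*/);

my $script = basename($0);   # name of the currently running script

# Sort hash keys so the dump is stable between runs.
$Data::Dumper::Sortkeys = 1;
print Dumper({ name => $name, dir => $dir, suffix => $suffix });
print "running as: $script\n";
```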

Installing Perl Modules

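Use the perl -c command to check a script's syntax without running it. Because use statements are processed at compile time, this also reports any dependencies (modules) that cannot be found: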

perl -c /path/to/a/perl/file.pl

If all is good you will see:

/path/to/a/perl/file.pl syntax OK

If problems are found, they will be listed:

Can't locate PHP/Serialization.pm in @INC (you may need to install the PHP::Serialization module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.24.1 /usr/local/share/perl/5.24.1 /usr/lib/x86_64-linux-gnu/perl5/5.24 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.24 /usr/share/perl/5.24 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /path/to/a/perl/file.pl line 56, <DATA> line 1.
BEGIN failed--compilation aborted at /path/to/a/perl/file.pl line 56, <DATA> line 1.

In this case the message itself suggests the fix: (you may need to install the PHP::Serialization module).

Aptitude

On Debian-based Linux distributions, the aptitude package manager can be used to find and install Perl modules.

First search for the package to make sure it exists. Debian packages for Perl modules are conventionally named lib<name>-perl, with the module name lowercased and each :: replaced by a hyphen, so the package for PHP::Serialization is libphp-serialization-perl. You can search for the whole name or for part of it.

Use aptitude search to search for the package:

aptitude search libphp-serialization-perl

If one or more results are found they will be listed:

i    libphp-serialization-perl    - Perl module to manipulate serialized PHP data structures  

Install the package:

sudo aptitude install libphp-serialization-perl

Perl Commands

Command line uses for perl.

Get the version of Perl currently installed

Use perl -v to see if Perl is installed, and which version is running.

perl -v