C.3. Some Perl Examples
The following examples are deliberately basic. The first uses the tr function to produce the reverse complement of a sequence, the second uses a regular expression to search a file for a pattern or motif, and the third uses a regular expression inside a loop to filter out every character of a sequence that is not a base.
Example C.1.
# Sequence--Reverse Complement
$dna_strand='GGGGaaaaaaaattAtAtat';
print "The original DNA strand is $dna_strand\n";
$dna_strand=reverse($dna_strand);        # reverse the order of the bases
$dna_strand =~ tr/ACGTacgt/TGCAtgca/;    # replace each base with its complement
print "The reverse complement is ",$dna_strand,"\n";
(Output)
The original DNA strand is GGGGaaaaaaaattAtAtat
The reverse complement is ataTaTaattttttttCCCC
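The same two steps can also be wrapped in a subroutine so that the original strand is left unchanged. This is only a minimal sketch: the name reverse_complement is a hypothetical helper introduced here for illustration, and it assumes the input contains nothing but the bases A, C, G, and T in either case.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not a built-in): returns the reverse
# complement of a DNA string and leaves its argument untouched.
sub reverse_complement {
    my ($dna) = @_;
    my $revcomp = reverse $dna;           # scalar context: reverses the characters
    $revcomp =~ tr/ACGTacgt/TGCAtgca/;    # swap each base for its complement
    return $revcomp;
}

my $dna_strand = 'GGGGaaaaaaaattAtAtat';
print "The original DNA strand is $dna_strand\n";
print "The reverse complement is ", reverse_complement($dna_strand), "\n";
```

Because the subroutine works on a copy, $dna_strand still holds the original sequence after the call.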
Example C.2.
File name: fasta
>testsrings
MTKKIGLFYGTQTGKTESVAEIIRDEFGNDVVTLHDVSQAEVTDLNDYQYLIIGCPTWNI
GELQSDWEGLYSELDDVDFNGKLVAYFGTGDQIGYADNFQDAIGILEEKISQRGGKTVGY
WSTDGYDFNDSKALRNGKFVGLALDEDNQSDLTDDRIKSWVAQLKSEFG
(Perl script)
# Finding a pattern/motif in a protein file
open(PFH, "<", "fasta") or die "Can't open file: $!\n";
$pattern="QTGK";
while($string=<PFH>){
    # $. holds the number of the line just read; the second dot is a literal period
    print "$pattern found on line $..\n"
        if $string =~/$pattern/i;
}
close(PFH);
(Output)
QTGK found on line 2.
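The header line counts as line 1 of the file, which is why the match is reported on the first sequence line. A possible refinement, sketched below, skips FASTA header lines and also reports the column where each match starts. The name find_motif is a hypothetical helper, the in-memory string stands in for the fasta file so the sketch runs on its own, and the motif is treated as literal text (via \Q...\E) rather than as a regular expression.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not a built-in): reads lines from $fh
# and returns [line number, column] pairs for every occurrence of $pattern,
# ignoring FASTA header lines (those starting with ">").
sub find_motif {
    my ($fh, $pattern) = @_;
    my @hits;
    while (my $line = <$fh>) {
        next if $line =~ /^>/;                 # skip the header line
        while ($line =~ /\Q$pattern\E/gi) {    # /g finds every occurrence
            push @hits, [$., $-[0] + 1];       # $. = line number, $-[0] = match offset
        }
    }
    return @hits;
}

# In-memory stand-in for the first two lines of the "fasta" file.
my $fasta = ">testsrings\n"
          . "MTKKIGLFYGTQTGKTESVAEIIRDEFGNDVVTLHDVSQAEVTDLNDYQYLIIGCPTWNI\n";
open(my $pfh, '<', \$fasta) or die "Can't open in-memory file: $!\n";
printf "QTGK found on line %d, column %d\n", @$_ for find_motif($pfh, 'QTGK');
close($pfh);
```

Columns are counted from 1, so a motif at the very start of a line is reported at column 1.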
Example C.3.
# Filter sequence from any unwanted characters
# Prints just nucleotides
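open(JF, "<", "junkseq") or die "Can't open junkseq: $!\n";
@line=<JF>;
close(JF);
chomp($line[0]);                # drop the trailing newline before splitting
@chars = split(//,$line[0]);    # one array element per character
print "Original sequence with junk\n";
print @chars,"\n\n";
print "Cleaned up sequence\n";
for($i=0; $i <= $#chars; $i++){
    if($chars[$i] =~ /[ACGT]/i){    # keep only the four bases, in either case
        print uc $chars[$i];
    }
}
print "\n";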
(Output)
Original sequence with junk
tg^c*ttcgh#ittgcatgggttc'tt:igg!tt~8$gttcggsstt$$@^ucgte+2%%&tagc
Cleaned up sequence
TGCTTCGTTGCATGGGTTCTTGGTTGTTCGGTTCGTTAGC
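The same filtering can be done without an explicit loop by using the tr function from Example C.1 with its c (complement) and d (delete) modifiers. This is a sketch of an alternative, not a replacement for the regular-expression version; clean_sequence is a hypothetical helper name, and the junk string is copied from the output above instead of being read from the junkseq file.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not a built-in): keeps only A, C, G,
# and T (either case) and returns the result in upper case.
sub clean_sequence {
    my ($seq) = @_;
    $seq =~ tr/ACGTacgt//cd;    # c = complement the set, d = delete every match
    return uc $seq;
}

# q{} does not interpolate, so the $, @, and % characters stay literal.
my $junk = q{tg^c*ttcgh#ittgcatgggttc'tt:igg!tt~8$gttcggsstt$$@^ucgte+2%%&tagc};
print "Cleaned up sequence\n";
print clean_sequence($junk), "\n";
```

For a single string, tr visits each character once and avoids building an intermediate array of characters.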