From the documentation:
$p = HTML::Parser->new(api_version => 3,
                       text_h => [ sub {...}, "dtext" ]);
This creates a new parser object with a text event handler subroutine that receives the original text with general entities decoded.
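To make the "entities decoded" part concrete, here is a small sketch comparing the "text" and "dtext" argspecs on the same input. The sample string and the variable names ($raw, $decoded) are made up for illustration and are not from the documentation.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample input, chosen only to show entity decoding.
my $html = '<p>Fish &amp; chips</p>';

my ($raw, $decoded) = ('', '');

# "text" hands the handler the source text unchanged.
HTML::Parser->new(api_version => 3,
                  text_h => [ sub { $raw .= shift }, 'text' ])
            ->parse($html)->eof;

# "dtext" hands the handler the same text with entities decoded.
HTML::Parser->new(api_version => 3,
                  text_h => [ sub { $decoded .= shift }, 'dtext' ])
            ->parse($html)->eof;

print "text:  $raw\n";      # text:  Fish &amp; chips
print "dtext: $decoded\n";  # dtext: Fish & chips
```

The handlers append rather than assign because the parser may deliver a run of text in more than one chunk.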
Edit:
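use HTML::Parser;
use LWP::Simple;
# get() returns undef if the fetch fails
my $html = get("http://perltraining.stonehenge.com") or die "Could not fetch page\n";
# An array reference as the handler collects each text event in @accum as [ $text ]
HTML::Parser->new(text_h => [\my @accum, "text"])->parse($html)->eof;  # eof flushes buffered text
print map $_->[0], @accum;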
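If the array-reference form looks odd: when the handler is an array reference instead of a sub, the parser pushes one array reference per event onto it, holding the values named in the argspec. A rough sketch with an inline string, so it runs without network access (the sample HTML and the names @events and @chunks are my own, just for illustration):

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample input, used instead of a fetched page.
my $html = '<h1>Title</h1><p>Body</p>';

my @events;
# Argspec "event,text" means each pushed element is [ $event_name, $text ].
HTML::Parser->new(api_version => 3,
                  text_h => [ \@events, 'event,text' ])
            ->parse($html)->eof;

# Pull the text (second slot) out of every collected event.
my @chunks = map { $_->[1] } @events;

printf "%s: %s\n", @$_ for @events;
```

That is why the snippet above prints `$_->[0]`: with a single-item argspec, the text is the only element of each inner array.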
Another example:
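#!/usr/bin/perl -w
use strict;
use HTML::Parser;
my $text = '';
# Append each chunk of decoded text to $text as the parser reports it
my $p = HTML::Parser->new(text_h => [ sub { $text .= shift },
                                      'dtext' ]);
# parse_file returns undef (and sets $!) if the file cannot be opened
$p->parse_file('test.html') or die "Cannot open test.html: $!\n";
print $text;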
Which, when used on a file like this:
<html>
<head>
<title>Test</title>
</head>
<body>
<h1>Test Stuff</h1>
<p>This is a test</p>
<ul>
<li>this</li>
<li>is a</li>
<li>list</li>
</ul>
</body>
</html>
produces the following output:
Test
Test Stuff
This is a test
this
is a
list
Does that help?