{"id":7161,"date":"2022-12-20T19:35:41","date_gmt":"2022-12-20T22:35:41","guid":{"rendered":"http:\/\/lode.uno\/linux-man\/index.php\/2022\/12\/20\/htmlfilter-man3\/"},"modified":"2022-12-20T19:35:41","modified_gmt":"2022-12-20T22:35:41","slug":"htmlfilter-man3","status":"publish","type":"post","link":"https:\/\/lode.uno\/linux-man\/2022\/12\/20\/htmlfilter-man3\/","title":{"rendered":"HTML::Filter (man3)"},"content":{"rendered":"<h1 align=\"center\">HTML::Filter<\/h1>\n<p> <a href=\"#NAME\">NAME<\/a><br \/> <a href=\"#NOTE\">NOTE<\/a><br \/> <a href=\"#SYNOPSIS\">SYNOPSIS<\/a><br \/> <a href=\"#DESCRIPTION\">DESCRIPTION<\/a><br \/> <a href=\"#EXAMPLES\">EXAMPLES<\/a><br \/> <a href=\"#SEE ALSO\">SEE ALSO<\/a><br \/> <a href=\"#COPYRIGHT\">COPYRIGHT<\/a> <\/p>\n<hr>\n<h2>NAME <a name=\"NAME\"><\/a> <\/h2>\n<p style=\"margin-left:11%; margin-top: 1em\">HTML::Filter \u2212 Filter HTML text through the parser<\/p>\n<h2>NOTE <a name=\"NOTE\"><\/a> <\/h2>\n<p style=\"margin-left:11%; margin-top: 1em\"><b>This module is deprecated.<\/b> &#8220;HTML::Parser&#8221; now provides the functionality of &#8220;HTML::Filter&#8221; much more efficiently with the &#8220;default&#8221; handler.<\/p>\n<h2>SYNOPSIS <a name=\"SYNOPSIS\"><\/a> <\/h2>\n<p style=\"margin-left:11%; margin-top: 1em\">require HTML::Filter; <br \/> $p = HTML::Filter->new->parse_file(&quot;index.html&quot;);<\/p>\n<h2>DESCRIPTION <a name=\"DESCRIPTION\"><\/a> <\/h2>\n<p style=\"margin-left:11%; margin-top: 1em\">&#8220;HTML::Filter&#8221; is an <small>HTML<\/small> parser that by default prints the original text of each <small>HTML<\/small> element (basically a slow version of <b>cat<\/b>(1)). 
The callback methods may be overridden to modify the filtering for some <small>HTML<\/small> elements, and you can override the <b>output()<\/b> method, which is called to print the <small>HTML<\/small> text.<\/p>\n<p style=\"margin-left:11%; margin-top: 1em\">&#8220;HTML::Filter&#8221; is a subclass of &#8220;HTML::Parser&#8221;. This means that the document should be given to the parser by calling the $p-><b>parse()<\/b> or $p-><b>parse_file()<\/b> methods.<\/p>\n<h2>EXAMPLES <a name=\"EXAMPLES\"><\/a> <\/h2>\n<p style=\"margin-left:11%; margin-top: 1em\">The first example is a filter that will remove all comments from an <small>HTML<\/small> file. This is achieved by simply overriding the <b>comment()<\/b> method to do nothing.<\/p>\n<p style=\"margin-left:11%; margin-top: 1em\">package CommentStripper; <br \/> require HTML::Filter; <br \/> @ISA=qw(HTML::Filter); <br \/> sub comment { } # ignore comments<\/p>\n<p style=\"margin-left:11%; margin-top: 1em\">The second example shows a filter that will remove any &lt;<small>TABLE<\/small>&gt; elements found in the <small>HTML<\/small> file. 
We specialize the <b>start()<\/b> and <b>end()<\/b> methods to count table tags and then suppress output while inside a table.<\/p>\n<p style=\"margin-left:11%; margin-top: 1em\">package TableStripper; <br \/> require HTML::Filter; <br \/> @ISA=qw(HTML::Filter); <br \/> sub start <br \/> { <br \/> my $self = shift; <br \/> $self->{table_seen}++ if $_[0] eq &quot;table&quot;; <br \/> $self->SUPER::start(@_); <br \/> } <br \/> sub end <br \/> { <br \/> my $self = shift; <br \/> $self->SUPER::end(@_); <br \/> $self->{table_seen}-- if $_[0] eq &quot;table&quot;; <br \/> } <br \/> sub output <br \/> { <br \/> my $self = shift; <br \/> unless ($self->{table_seen}) { <br \/> $self->SUPER::output(@_); <br \/> } <br \/> }<\/p>\n<p style=\"margin-left:11%; margin-top: 1em\">To collect the filtered text in memory instead of printing it, you could do something like this:<\/p>\n<p style=\"margin-left:11%; margin-top: 1em\">package FilterIntoString; <br \/> require HTML::Filter; <br \/> @ISA=qw(HTML::Filter); <br \/> sub output { push(@{$_[0]->{fhtml}}, $_[1]) } <br \/> sub filtered_html { join(&quot;&quot;, @{$_[0]->{fhtml}}) }<\/p>\n<h2>SEE ALSO <a name=\"SEE ALSO\"><\/a> <\/h2>\n<p style=\"margin-left:11%; margin-top: 1em\">HTML::Parser<\/p>\n<h2>COPYRIGHT <a name=\"COPYRIGHT\"><\/a> <\/h2>\n<p style=\"margin-left:11%; margin-top: 1em\">Copyright 1997\u22121999 Gisle Aas.<\/p>\n<p style=\"margin-left:11%; margin-top: 1em\">This library is free software; you can redistribute it and\/or modify it under the same terms as Perl itself.<\/p>\n<hr>\n","protected":false},"excerpt":{"rendered":"<p>  HTML::Filter \u2212 Filter HTML text through the parser 
<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[3496,3007],"class_list":["post-7161","post","type-post","status-publish","format-standard","hentry","category-sin-categoria","tag-htmlfilter","tag-man3"],"gutentor_comment":0,"_links":{"self":[{"href":"https:\/\/lode.uno\/linux-man\/wp-json\/wp\/v2\/posts\/7161","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lode.uno\/linux-man\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lode.uno\/linux-man\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lode.uno\/linux-man\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/lode.uno\/linux-man\/wp-json\/wp\/v2\/comments?post=7161"}],"version-history":[{"count":0,"href":"https:\/\/lode.uno\/linux-man\/wp-json\/wp\/v2\/posts\/7161\/revisions"}],"wp:attachment":[{"href":"https:\/\/lode.uno\/linux-man\/wp-json\/wp\/v2\/media?parent=7161"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lode.uno\/linux-man\/wp-json\/wp\/v2\/categories?post=7161"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lode.uno\/linux-man\/wp-json\/wp\/v2\/tags?post=7161"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}