Clean HTML AddOn

The Clean HTML AddOn cleans up the HTML embedded in your email and makes it W3C compliant so that it won’t cause problems on your blog. This can save hours of fiddling with your posts.

Configuration

There is a configuration page that allows you to set commonly used settings.

For example, if you change Forbidden Attributes to “style,class”, all style and class attributes will be removed.

Optional Configuration

The Clean HTML AddOn comes preconfigured to handle most people’s needs, but read on if you need something more.

This AddOn has one filter, “postie_htmlcleaner_config”, which lets you modify the configuration.

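To use the filter, create a file named filterPostie.php in the wp-content directory and paste one of the following examples into it. Both examples define a function named my_htmlcleaner_config, so use only one of them per file or rename one of the functions.

This example removes all style and class attributes from the incoming email: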

<?php
add_filter('postie_htmlcleaner_config', 'my_htmlcleaner_config');

function my_htmlcleaner_config($config)
{
    // Strip the class and style attributes from every element.
    $config->set('HTML.ForbiddenAttributes', 'class,style');
    return $config;
}

This example allows only the <b>, <u>, <div>, <p> and <a> elements. The href attribute is also allowed on <a>:

<?php
add_filter('postie_htmlcleaner_config', 'my_htmlcleaner_config');

function my_htmlcleaner_config($config)
{
    // Whitelist of elements; attributes allowed on an element go in square brackets.
    $config->set('HTML.Allowed', 'a[href],b,u,div,p');
    return $config;
}

Internally this AddOn uses HTML Purifier. See http://htmlpurifier.org/live/configdoc/plain.html for the full list of configuration options.
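
As a sketch of how other options from that list can be set through the same filter, the following turns on two of HTML Purifier’s AutoFormat directives. The function name my_htmlcleaner_autoformat_config is made up for this illustration, and it is an assumption that the HTML Purifier version bundled with the AddOn supports both directives.

```php
<?php
add_filter('postie_htmlcleaner_config', 'my_htmlcleaner_autoformat_config');

// Hypothetical function name; any name not already in use will do.
function my_htmlcleaner_autoformat_config($config)
{
    // Remove empty elements that carry no meaning, such as <p></p>.
    $config->set('AutoFormat.RemoveEmpty', true);
    // Remove <span> tags that have no attributes left after cleaning.
    $config->set('AutoFormat.RemoveSpansWithoutAttributes', true);
    return $config;
}
```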


A user asked whether this AddOn cleans up the “junk” that MS Word and MS Outlook add. It does, but some of what Word adds is valid HTML (such as class and style attributes), so you need to check whether the styles your theme provides conflict with those that arrive in the email.

For example, here is some markup from Outlook that was cleaned but didn’t display quite the way the user wanted:

<p class="MsoNormal" style="text-align: center;" align="center"><span style="font-size: 12pt; font-family: 'Times New Roman', serif;">BOARD OF DIRECTORS MEETING</span></p>

This is valid HTML, but if your theme defines styles for the “MsoNormal” class, things might look strange. Note also that the markup carries a lot of specific inline styling that might look odd in your theme, especially “font-size: 12pt; font-family: 'Times New Roman', serif”.
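
One possible way to handle a case like this is to drop only the pieces that clash with your theme instead of every class and style attribute. The sketch below assumes that the bundled HTML Purifier version supports the Attr.ForbiddenClasses and CSS.ForbiddenProperties directives. The function name my_htmlcleaner_outlook_config is hypothetical.

```php
<?php
add_filter('postie_htmlcleaner_config', 'my_htmlcleaner_outlook_config');

// Hypothetical function name; any name not already in use will do.
function my_htmlcleaner_outlook_config($config)
{
    // Drop the MsoNormal class but keep other class names.
    $config->set('Attr.ForbiddenClasses', 'MsoNormal');
    // Drop font declarations from inline styles but keep the rest, such as text-align.
    $config->set('CSS.ForbiddenProperties', 'font-size,font-family');
    return $config;
}
```

With these settings the paragraph above should keep its centered alignment while taking its font from your theme.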