Archive: Mon 4 Nov 2013

  1. A WordPress plugin: Icon Table of Contents and Menus

    Though I haven’t posted much to this site recently, I have been busy working on a number of different projects. One of these projects has involved developing a new WordPress plugin to generate a table of contents. This is the first plugin that I’ve taken the time to submit to the WordPress.org plugin directory.

    Background

    While I have been working with PHP and WordPress for many years, and have developed a number of bespoke plugin solutions for clients as well as for my own sites, this is the first time that I’ve submitted a plugin to the public domain. That’s primarily because, even for a relatively straightforward bit of code, there is a big step in moving from something that is going to be used in a controlled environment to something that is designed to be used by anyone.

    The plugin itself is relatively straightforward in terms of its output – it creates an icon which can be expanded to display a table of contents for the HTML of the current page.

    The way that it generates this table (actually an unordered HTML list) is a little interesting and draws on some work which I did a long time ago when building a reverse proxy to sit between some content and a web server. This proxy was used to display content in a unified format using static HTML templates but with content generated from multiple sources and different content management systems.

    As part of that process all of the fetched content was built into a PHP DOMDocument so that it could be easily manipulated and inserted into the static HTML templates. In some cases, that required quite a lot of tidying up of the generated HTML content, using PHP’s Tidy function and other bespoke PHP code to clean everything up. It’s aspects of this code that are used in my apparently fairly simple new table of contents plugin.

    So how does the plugin work?

    The plugin works by manipulating the content of the current post or page through a WordPress filter on the content. When the filter is applied, the plugin receives the content for the page or post as a string via the WordPress the_content hook and loads it into a PHP DOMDocument. It determines the charset from the database and sets the appropriate HTML headers and metadata to make sure the content is processed properly by the DOMDocument.
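
    As a rough illustration of that loading step, here is a minimal sketch rather than the plugin's actual code. The helper name load_fragment() and the hard-coded UTF-8 default are assumptions for the example; inside WordPress the charset would presumably come from something like get_option('blog_charset').

```php
<?php
// Hypothetical helper (not taken from the plugin): wrap an HTML fragment in a
// minimal document that declares its charset, then load it into a DOMDocument.
function load_fragment($content, $charset = 'UTF-8') {
    $doc  = new DOMDocument();
    $html = '<html><head><meta http-equiv="Content-Type" content="text/html; charset='
          . $charset . '"></head><body>' . $content . '</body></html>';

    // Real-world markup is rarely perfect, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

$doc       = load_fragment('<h2>Café</h2><p>Some text</p>');
$heading   = $doc->getElementsByTagName('h2')->item(0);
$paragraph = $doc->getElementsByTagName('p')->item(0);
echo $heading->nodeName, ': ', $heading->textContent, "\n";
```

    Declaring the charset in a meta element is one common way of telling the parser how to decode the string; without it, multi-byte characters can end up mangled.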

    It then uses DOMXPath to query the DOMDocument to get the HTML headings we are interested in (h1 to h4):

    $xpath->query('//*[self::h1 or self::h2 or self::h3 or self::h4]')
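
    To see what that query returns, here is a small self-contained sketch. The sample HTML is made up for the example; only the XPath expression comes from the plugin.

```php
<?php
// Sample document (invented for this example) with headings at several levels.
$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body><h1>Title</h1><p>Intro</p><h2>Part one</h2>'
    . '<h5>Too deep</h5><h3>Detail</h3></body></html>'
);

$xpath = new DOMXPath($doc);
$found = array();

// The query matches h1 to h4 in document order; the h5 is ignored.
foreach ($xpath->query('//*[self::h1 or self::h2 or self::h3 or self::h4]') as $heading) {
    $found[] = $heading->nodeName . ': ' . $heading->textContent;
}

print_r($found);
```

    Because the nodes come back in document order, a single pass over the result is enough to build the nested list.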

    Each of these HTML headings then gets its own bookmark based on its position within the page hierarchy. This ‘id’ is calculated from the DOM fragment that is being generated to display the unordered list at the top of the page:

    $levels = array();
    // here $head represents root level of the current section
    // in the form of a <ul> element
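    $tmp = $head; // objects are passed by handle, so no reference is needed
    while (!is_null($tmp) && $tmp !== $frag) {
       $levels[] = $tmp->childNodes->length;
       // climb two levels: from this list to the list that contains it
       $tmp = $tmp->parentNode ? $tmp->parentNode->parentNode : null;
    }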
    $id = 'section'.implode('.', array_reverse($levels));
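
    A worked example may make the numbering clearer. This is only a sketch: the function name section_id(), the wrapper div and the sample list items are assumptions for the illustration, and the loop is a variant of the one above written without references.

```php
<?php
// Hypothetical helper: walk up from a <ul> to the fragment, recording how many
// children each list has at that moment.
function section_id($head, $frag) {
    $levels = array();
    $tmp = $head;
    while (!is_null($tmp) && $tmp !== $frag) {
        $levels[] = $tmp->childNodes->length;
        // Step from this list to the list that contains it (two levels up).
        $tmp = $tmp->parentNode ? $tmp->parentNode->parentNode : null;
    }
    return 'section' . implode('.', array_reverse($levels));
}

$doc  = new DOMDocument();
$frag = $doc->createDocumentFragment();

// Assumed layout: fragment > div > ul > li > ul > li
$wrap = $frag->appendChild($doc->createElement('div'));
$root = $wrap->appendChild($doc->createElement('ul'));
$root->appendChild($doc->createElement('li', 'Introduction'));
$item = $root->appendChild($doc->createElement('li', 'Background'));
$sub  = $item->appendChild($doc->createElement('ul'));
$sub->appendChild($doc->createElement('li', 'History'));

$rootId = section_id($root, $frag); // root list currently has 2 items
$subId  = section_id($sub, $frag);  // nested list has 1 item, under item 2
echo $rootId, ' ', $subId, "\n";
```

    Each component of the id is the number of items in a list at the time the heading is processed, so the ids read like section numbers: the first subsection under the second top-level item becomes section2.1.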

    This is the sort of content manipulation we had to do a lot of when creating our reverse proxy content processing system, though in this case I followed the recommendation of a very useful StackOverflow answer.

    Once all of this processing is done, the plugin calls a DOM-to-string function called get_content_as_string(). This function is derived directly from code from the earlier system. It works by iterating through each node and building a string which can be returned and displayed as the content of the page.

    function get_content_as_string($node) {
       $str = "";
       if ($node) {
          if ($node->nodeName=="script"||$node->nodeName=="style"||
              $node->nodeName=="object"||$node->nodeName=="embed"||
              $node->nodeName=="canvas") $str .= $node->nodeValue;
          if ($node->childNodes) {
             foreach ($node->childNodes as $cnode) {
                if ($cnode->nodeType==XML_TEXT_NODE) {
                   $str .= $cnode->nodeValue;
                }
                else if ($cnode->nodeType==XML_ELEMENT_NODE) {
                   $str .= "<" . $cnode->nodeName;
                   if ($attribnodes=$cnode->attributes) {
                      $str .= " ";
                      foreach ($attribnodes as $anode) {
                         if ($anode) {
                         $nodeName = $anode->nodeName;
                         $nodeValue = $anode->nodeValue;
                         $str .= $nodeName . "=\"" . $nodeValue . "\" ";
                         }
                      }
                   }   
                   $nodeText = $this->get_content_as_string($cnode);
                   if (empty($nodeText) && !$attribnodes)
                      $str .= " />";        // unary
                   else
                      $str .= ">" . $nodeText . "</" . $cnode->nodeName . ">";
                }
             }
             // A bit of cleanup
             $str = preg_replace("/\s>/si",">",$str);
             $str = preg_replace("/\><\/input>/is","/>",$str);
             $str = preg_replace("/<\/img>/is","",$str);
             return preg_replace("/<br><\/br>/is","<br />",$str);
          }
       }
    }

    The function can be modified to do quite a bit of content tidying, but in this case it does its best to leave the content alone. It’s not a perfect solution, in that it isn’t designed to handle very large strings, but in most use cases this isn’t likely to be an issue. If anyone ever finds that it is, I promise to do some more work on streamlining the code.
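
    The cleanup step at the end of the function can be tried in isolation. The sketch below applies three of the patterns to a hand-written string; the wrapper name tidy_serialised_html() and the sample markup are assumptions for the example.

```php
<?php
// Hypothetical wrapper around three of the cleanup patterns from the function.
function tidy_serialised_html($str) {
    $str = preg_replace("/\s>/si", ">", $str);           // drop whitespace before ">"
    $str = preg_replace("/\><\/input>/is", "/>", $str);  // <input ...></input> becomes <input .../>
    $str = preg_replace("/<\/img>/is", "", $str);        // remove the stray </img>
    return $str;
}

// Roughly what the serialiser produces before cleanup: a trailing space after
// the attributes and explicit closing tags on void elements.
$raw   = '<p class="x" >Hi <img src="a.png" ></img><input type="text" ></input></p>';
$clean = tidy_serialised_html($raw);
echo $clean, "\n";
```

    Note that the patterns run over the whole string, so a space followed by ">" inside ordinary text would also be tightened up. That is one of the places where the function could be made stricter.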

    Find out more

    You can find more about the plugin itself, including details of how to download, install and apply it to your templates, here:

    Updated on 7 November 2013 to reflect a change in the plugin to handle charsets properly (version 1.2).