robertandrews Posted March 30, 2022

I am making use of PHP's DOMDocument within WordPress to carry out two DOM operations on the post content, by filtering the_content:

1. Wrap a certain element pattern in a div.
2. Add a class to a particular element.

But I have a problem with that: DOMDocument is also destroying embedded WordPress content like tweets. E.g. an embedded tweet is not rendered as an <iframe> as it should be; it is just rendered using <p>.

(FYI, in WordPress, authors can paste in the URL of an oEmbed object like a tweet, YouTube video or SoundCloud track. Whilst they are stored in the post database as just a plain URL, when the_content is output, they are rendered as <iframe>s or similar.)

I'm new to DOMDocument. Is there any way I can stop it from destroying embedded elements?

My two pieces of code are below. They each use DOMDocument similarly.

    /* Wrap element in another element. Contributed by @XzKto, https://stackoverflow.com/a/8428323/1375163 */
    function wrap_element($dom, $wrapped_element, $new_element) {
        // Initialise the new wrapper
        $wrapper = $dom->createElement($new_element);
        // Clone our created element
        $wrapper_clone = $wrapper->cloneNode();
        // Replace the element with this wrapper
        $wrapped_element->parentNode->replaceChild($wrapper_clone, $wrapped_element);
        // Append the element to the wrapper
        $wrapper_clone->appendChild($wrapped_element);
    }

    /* Blockquote */
    add_filter( 'the_content', 'bootstrap_blockquote' );
    function bootstrap_blockquote( $content ) {
        // Load DOM of post content
        // $content = mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8');
        // $dom = new DOMDocument('1.0', 'utf-8');
        // libxml_use_internal_errors(true);
        // $dom->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
        $dom = new DOMDocument;
        libxml_use_internal_errors(true);
        $dom->loadHTML($content);
        libxml_clear_errors();

        foreach ($dom->getElementsByTagName('blockquote') as $blockquote) {
            // Add blockquote class
            // Class addition contributed by @Gillu13, https://stackoverflow.com/a/63088684/1375163
            $class_to_add = 'blockquote';
            $blockquote->setAttribute('class', $class_to_add);
            // Wrap blockquote in <figure>
            wrap_element($dom, $blockquote, 'figure');
        }

        $content = $dom->saveHTML();
        return $content;
    }

..

    /* Original contributed by @jack, https://stackoverflow.com/a/10683463/1375163 */
    add_filter( 'the_content', 'segment_post' );
    function segment_post( $content ) {
        // Load post as document object model
        $dom = new DOMDocument;
        libxml_use_internal_errors(true);
        $dom->loadHTML($content);
        libxml_clear_errors();

        // Initialise variables
        $segments = array();
        $card = null;

        foreach ($dom->getElementsByTagName('h3') as $h3) {
            // 1. First, collect all nodes
            $card_nodes = array($h3);
            // iterate until another h3 or no more siblings
            for ($next = $h3->nextSibling; $next && $next->nodeName != 'h3'; $next = $next->nextSibling) {
                $card_nodes[] = $next;
            }

            // 2. Create <section> placeholder, with class attributes
            $card = $dom->createElement('section');
            $card->setAttribute('class', 'card p-4 mb-3');
            // replace the h3 with the new card
            $h3->parentNode->replaceChild($card, $h3);
            // and move all nodes into the newly created card
            foreach ($card_nodes as $node) {
                $card->appendChild($node);
            }
            // keep title of the original h3
            $segments[] = $h3->nodeValue;

            /*
            // 3. Also wrap with <section>
            // Initialise the new wrapper
            $wrapper = $dom->createElement('section');
            // Clone our created element
            $wrapper_clone = $wrapper->cloneNode();
            // Replace the card with this wrapper
            $card->parentNode->replaceChild($wrapper_clone, $card);
            // Append the card to the wrapper
            $wrapper_clone->appendChild($card);
            */
        }

        // make sure we have segments (card is the last inserted card in the dom)
        /*
        if ($segments && $card) {
            $ul = $dom->createElement('ul');
            foreach ($segments as $title) {
                $li = $dom->createElement('li');
                $a = $dom->createElement('a', $title);
                $a->setAttribute('href', '#');
                $li->appendChild($a);
                $ul->appendChild($li);
            }
            // add as sibling of last card added
            $card->parentNode->appendChild($ul);
        }
        */

        // TODO: examine https://stackoverflow.com/questions/10703057/wrap-all-html-tags-between-h3-tag-sets-with-domdocument-in-php

        $content = $dom->saveHTML();
        return $content;
    }

Link to comment https://forums.phpfreaks.com/topic/314642-how-to-stop-domdocument-destroying-embeds/
maxxd Posted March 30, 2022

I'm not sure how WP handles the oEmbeds any more, but I'd say try firing your hook later. The third parameter to add_filter is a priority - set it to like 100 or something high. It's possible that somehow what you're doing is interfering with however WP handles oEmbeds in the content.

Link to comment https://forums.phpfreaks.com/topic/314642-how-to-stop-domdocument-destroying-embeds/#findComment-1594735
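For anyone following along, the priority suggestion would look roughly like the sketch below. The callback name example_late_filter is hypothetical and only there for illustration; the real callbacks in this thread are bootstrap_blockquote() and segment_post(). As far as I know, WordPress core hooks its own oEmbed expansion onto the_content at priority 8, which is already earlier than the default priority of 10, so a higher number may well make no difference here.

```php
<?php
// Hypothetical callback, used only to show where the priority goes.
function example_late_filter( $content ) {
    // At priority 100, earlier the_content filters have already run.
    return $content . '<!-- filtered late -->';
}

// Signature: add_filter( $hook_name, $callback, $priority = 10, $accepted_args = 1 )
// The guard lets this snippet also run outside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'the_content', 'example_late_filter', 100 );
}
```

Lower numbers run first, so 100 runs after anything registered at the default of 10.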
robertandrews (Author) Posted March 31, 2022

On 3/30/2022 at 12:59 PM, maxxd said:

    I'm not sure how WP handles the oEmbeds any more, but I'd say try firing your hook later. The third parameter to add_filter is a priority - set it to like 100 or something high. It's possible that somehow what you're doing is interfering with however WP handles oEmbeds in the content.

I tried 100 (it was already 50) - no difference.

For clarity, I have separated out these pieces of DOMDocument-dependent code, packaging them up as plugins released on GitHub...

WP Bootstrapify: "Optimise WordPress post HTML elements on-the-fly for enhanced Bootstrap requirements."
WP Post Segmenter: "Turn post H3 segments into <sections> and Bootstrap cards for impactful reader presentation."

Things that are true to say...

Re: the frame embed getting eliminated... I think the fault was the code of WP Bootstrapify. This aims to wrap and add new classes to <blockquote>. As embedded tweets include a fallback to <blockquote>, they were getting wrapped in my new element, <figure>, and having additional classes added, as though they were any other <blockquote>. If I include an override to ignore items with class .tweet-embed, the Twitter frames stay as-is.

Look toward DOMDocument. If I disable both plugins, the posts appear without those awful special characters (an eye-opener as to how many are in my underlying posts?). If I test by leaving only one of them active, e.g. WP Post Segmenter, the formatting problem returns. This leads me to believe the issue must be with DOMDocument or my particular DOMDocument options, and how it might be interfering with the formatting.

Anyone have any ideas? Currently using utf8_decode($content) - I wonder if that's a problem.

Link to comment https://forums.phpfreaks.com/topic/314642-how-to-stop-domdocument-destroying-embeds/#findComment-1594788
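A minimal sketch of how both points above might be handled together, under some assumptions. The function name example_wrap_blockquotes and the $skip_classes default are hypothetical: twitter-tweet is the class Twitter's embed markup normally carries, and tweet-embed is the class mentioned in the post, so adjust to whatever your markup really contains. On the encoding side, loadHTML() treats input as ISO-8859-1 unless a charset is declared, which is a common cause of stray special characters, and utf8_decode() converts to ISO-8859-1, replacing anything outside that range with "?" (it is also deprecated as of PHP 8.2). Declaring UTF-8 in a wrapper document is one way around both.

```php
<?php
/**
 * Sketch only: add a class to <blockquote> elements and wrap them in
 * <figure>, skipping embed fallbacks and keeping UTF-8 text intact.
 * The function name and the $skip_classes default are assumptions.
 */
function example_wrap_blockquotes( $content, $skip_classes = array( 'twitter-tweet', 'tweet-embed' ) ) {
    if ( trim( $content ) === '' ) {
        return $content;
    }

    // Declare the charset explicitly so the parser reads the bytes as UTF-8.
    $html = '<!DOCTYPE html><html><head>'
        . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '</head><body>' . $content . '</body></html>';

    $dom      = new DOMDocument();
    $previous = libxml_use_internal_errors( true );
    $dom->loadHTML( $html );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    // Copy the live node list before changing the tree.
    $blockquotes = iterator_to_array( $dom->getElementsByTagName( 'blockquote' ), false );

    foreach ( $blockquotes as $blockquote ) {
        $classes = preg_split( '/\s+/', $blockquote->getAttribute( 'class' ), -1, PREG_SPLIT_NO_EMPTY );

        // Leave embed fallbacks (e.g. tweets) exactly as they arrived.
        if ( array_intersect( $classes, $skip_classes ) ) {
            continue;
        }

        // Append the class rather than overwriting existing ones.
        if ( ! in_array( 'blockquote', $classes, true ) ) {
            $classes[] = 'blockquote';
        }
        $blockquote->setAttribute( 'class', implode( ' ', $classes ) );

        // Swap in a <figure>, then move the blockquote inside it.
        $figure = $dom->createElement( 'figure' );
        $blockquote->parentNode->replaceChild( $figure, $blockquote );
        $figure->appendChild( $blockquote );
    }

    // Serialise only the children of <body>, so the doctype/html/body
    // wrapper added above is not returned with the post content.
    $body   = $dom->getElementsByTagName( 'body' )->item( 0 );
    $output = '';
    foreach ( $body->childNodes as $child ) {
        $output .= $dom->saveHTML( $child );
    }
    return $output;
}
```

In WordPress this could be registered with add_filter( 'the_content', 'example_wrap_blockquotes' ). Passing each child node to saveHTML(), instead of calling it with no argument, is also what avoids the extra doctype and <html>/<body> tags that the original filters return.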