Wednesday, June 03, 2009

Tagging a pattern in an XML element using PHP

Recently I needed to wrap every occurrence of a pattern in the text of an XML element in a tag of its own.

The following code worked:
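<?php

// Replaces $element with a copy in which every match of $pattern
// is wrapped in a <$tagNameForPattern> element.
// Note: nodeValue is plain text, so child elements of $element are flattened.
function replaceElementWithTaggedElement($doc, $element, $pattern, $tagNameForPattern)
{
    // Create the replacement detached from the tree; appending it to $doc
    // would add a second top-level element next to the root.
    $newElement = $doc->createElement($element->nodeName);

    $content = $element->nodeValue;
    while (preg_match($pattern, $content, $matches, PREG_OFFSET_CAPTURE))
    {
        $match = $matches[0][0];   // matched text
        $offset = $matches[0][1];  // byte offset of the match in $content
        if ($match === '')
        {
            break; // an empty match would never shorten $content
        }
        $firstPart = substr($content, 0, $offset);
        $secondPart = substr($content, $offset + strlen($match));

        // Text before the match stays a plain text node.
        $newElement->appendChild($doc->createTextNode($firstPart));

        // The match itself goes inside the new tag.
        $taggedElement = $doc->createElement($tagNameForPattern);
        $taggedElement->appendChild($doc->createTextNode($match));
        $newElement->appendChild($taggedElement);

        // Keep searching in the text after the match.
        $content = $secondPart;
    }
    // Whatever is left after the last match.
    $newElement->appendChild($doc->createTextNode($content));

    $element->parentNode->replaceChild($newElement, $element);
}

$doc = new DOMDocument();
$doc->loadXML("<root><one>This is the first text node</one><two>This is the second text node and the word to be highlighted is second</two></root>");
$oldElement = $doc->getElementsByTagName("two")->item(0);

replaceElementWithTaggedElement($doc, $oldElement, "/second/", "tagged");

echo $doc->saveXML();
?>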

OUTPUT

<?xml version="1.0"?>
<root><one>This is the first text node</one><two>This is the <tagged>second</tagged> text node and the word to be highlighted is <tagged>second</tagged></two></root>
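
One limitation of reading nodeValue is that any markup nested inside the element is flattened to plain text, and the element's attributes are not carried over. As a rough sketch only, assuming PHP 7.1 or later, the variant below walks the element's text nodes with DOMXPath and wraps the matches in place, so nested elements and attributes survive. The function name tagPatternInTextNodes and the sample XML are made up for illustration and are not part of the code above.

```php
<?php

// Hypothetical helper (not from the original post): wraps every non-empty
// match of $pattern found in the text nodes beneath $element in <$tagName>.
function tagPatternInTextNodes(DOMDocument $doc, DOMElement $element, string $pattern, string $tagName): void
{
    $xpath = new DOMXPath($doc);

    // Copy the node list into an array first, because the loop edits the tree.
    $textNodes = iterator_to_array($xpath->query('.//text()', $element), false);

    foreach ($textNodes as $textNode) {
        $text = $textNode->nodeValue;
        if (!preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE)) {
            continue; // no match in this text node
        }

        $fragment = $doc->createDocumentFragment();
        $position = 0; // byte offset of the text not yet copied

        foreach ($matches[0] as [$match, $offset]) {
            if ($match === '') {
                continue; // ignore empty matches
            }
            // Plain text between the previous match and this one.
            $before = (string) substr($text, $position, $offset - $position);
            if ($before !== '') {
                $fragment->appendChild($doc->createTextNode($before));
            }
            // The match itself, wrapped in the new tag.
            $tagged = $doc->createElement($tagName);
            $tagged->appendChild($doc->createTextNode($match));
            $fragment->appendChild($tagged);

            $position = $offset + strlen($match);
        }

        // Text after the last match.
        $rest = (string) substr($text, $position);
        if ($rest !== '') {
            $fragment->appendChild($doc->createTextNode($rest));
        }

        $textNode->parentNode->replaceChild($fragment, $textNode);
    }
}

// Sample input chosen for illustration: <two> contains a nested <b> element.
$doc = new DOMDocument();
$doc->loadXML('<root><two id="t">The <b>second</b> node; second again</two></root>');
$target = $doc->getElementsByTagName('two')->item(0);

tagPatternInTextNodes($doc, $target, '/second/', 'tagged');

echo $doc->saveXML($doc->documentElement), "\n";
```

Because each text node is handled separately, a pattern that spans an element boundary (for example text that starts outside the b element and ends inside it) will not be matched by this sketch.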
