Regex help


lowpitch

Hello there chaps,

Fairly experienced PHP user, but I must admit I wince whenever I have to do regex, and normally find an alternative way of solving it, mainly out of confusion/fear.

Anyway, this is my situation. I have a load of HTML content being spat out by my CMS. Inside this HTML, I've added some custom attributes to my <a> tags so that, at display time, I can rewrite the links dynamically depending on how the site is being viewed (HTML, Flash, PDA, etc.). So, an example link in the CMS-generated HTML might look like

[code]<a linkType="internal" linkID="36" href="#" title="Link to this page" class="whatever">Some text</a>[/code]

If viewing the site as HTML, I might want to rewrite this to...

[code]<a href="viewPage.php?page=36" title="Link to this page" class="whatever">Some text</a>[/code]

Now, I was planning on getting hold of all the <a>something</a> tags, manipulating them as XML nodes (so I can access the attributes), and creating a new XML node for my rewritten <a> tag and saving this new node in place of the old node before writing the HTML to the browser.

So, essentially, in pseudo code, what I'd like to achieve is the following

[code]$links = getSomeKindOfArrayContainingAllTheATags();
foreach ($links as $link)
{
  createNewTagBasedOnExistingTag();
  replaceOldHTMLwithNewHTML();
}[/code]
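For the "treat them as XML nodes" idea, here is a minimal sketch using PHP's DOM extension rather than a regex. It assumes the DOM extension is available, that the custom attributes are named as in the example link, and that the target URL has the form shown above; `rewriteInternalLinks()` is a hypothetical helper name, not something from the CMS.

```php
<?php
// Hypothetical helper: rewrites CMS links by editing DOM nodes instead of strings.
function rewriteInternalLinks($html)
{
    $doc = new DOMDocument();
    // Wrap the fragment in a known container; @ hides parser warnings
    // that loadHTML() may raise on loose markup.
    @$doc->loadHTML('<div>' . $html . '</div>');

    foreach ($doc->getElementsByTagName('a') as $a) {
        // The HTML parser lower-cases attribute names, hence 'linktype'.
        if ($a->getAttribute('linktype') === 'internal') {
            $a->setAttribute('href', 'viewPage.php?page=' . (int) $a->getAttribute('linkid'));
            $a->removeAttribute('linktype');
            $a->removeAttribute('linkid');
        }
    }

    // Serialise only the children of the wrapper <div>, not the whole document.
    $out = '';
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveXML($child);
    }
    return $out;
}
```

The parser may normalise the markup slightly (quoting, entity encoding), so the output is not guaranteed to be byte-identical to the input outside the rewritten links.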

However, I'm stuck on a couple of things here. Firstly, I'm very stuck trying to write the regex to isolate the <a> tags. Note that I don't just want the href, or the content within the <a>sdfsd</a> tags; I want the whole thing. I'm also stuck on which PHP functions to use to achieve this, and on how I'd go about replacing the old HTML with the new HTML.

One way I assume I could do it is to get hold of all the <a> tags along with their offsets, or something, and do it that way.
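The offset idea can be sketched with the `PREG_OFFSET_CAPTURE` flag of `preg_match_all()`, which returns each match together with its byte offset. This is only an illustration: `buildReplacementTag()` and `replaceLinksByOffset()` are hypothetical names, and the stand-in rewriting rule assumes the attributes appear in the same order as in the example link.

```php
<?php
// Hypothetical stand-in for the real rewriting logic. It assumes the
// attribute order of the example link (linkType, linkID, href) and
// returns any other tag unchanged.
function buildReplacementTag($tag)
{
    return preg_replace(
        '/^<a\s+linkType="internal"\s+linkID="(\d+)"\s+href="[^"]*"/i',
        '<a href="viewPage.php?page=$1"',
        $tag
    );
}

function replaceLinksByOffset($html)
{
    // With PREG_OFFSET_CAPTURE, $matches[0] holds array(text, byteOffset) pairs.
    preg_match_all('#<a\s[^>]*>.*?</a>#is', $html, $matches, PREG_OFFSET_CAPTURE);

    // Walk backwards so earlier offsets stay valid after each splice.
    for ($i = count($matches[0]) - 1; $i >= 0; $i--) {
        list($tag, $offset) = $matches[0][$i];
        $html = substr_replace($html, buildReplacementTag($tag), $offset, strlen($tag));
    }
    return $html;
}
```

Replacing from the last match to the first avoids having to recalculate offsets when a new tag is shorter or longer than the one it replaces.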

Another way I can think of is to do it in three steps - firstly, get an array of all the a tags. Then iterate through them, creating an array of new tags which will act as replacements. Then finally, do some kind of uber regex search and replace, replacing old for new.

So what I'm hoping for is that someone can give me a few pointers on the actual regex I need for this, and point me in the right direction on how this kind of process would work, which functions to look up, etc. I'd very much appreciate any help on this.

I'm using PHP 5.1, and I apologise if the formatting of this post goes strange.

Many thanks,
lp.

Thank you, that's very interesting and looks just what I need.

I've been looking at preg_replace a little - if I first manage to get an array of all the matching tags, and then build up an array of tags to replace them with, would I just be able to call

$whatever = preg_replace ($theSamePatternUsedToFindThemBefore, $arrayOfReplacementStrings, $myContent);

It looks to me like that would work, am I off course?

Thanks again,
Toby
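On the `preg_replace` question: as far as I know, `preg_replace()` does not hand out one array element per match when the pattern is a single string; an array of replacements is meant to be paired with an array of patterns. A closer fit is `preg_replace_callback()`, which calls a function for each match and substitutes whatever it returns. The sketch below assumes double-quoted attributes and the URL format from the first post; `rewriteLinkCallback()` is a hypothetical name, and a named function is used because PHP 5.1 predates anonymous functions.

```php
<?php
// Hypothetical callback: receives one match and returns its replacement.
function rewriteLinkCallback($m)
{
    // $m[0] = whole tag, $m[1] = attribute string, $m[2] = link text.
    preg_match_all('/([\w-]+)="([^"]*)"/', $m[1], $pairs, PREG_SET_ORDER);
    $attrs = array();
    foreach ($pairs as $pair) {
        $attrs[$pair[1]] = $pair[2];
    }

    // Leave ordinary links exactly as they were.
    if (!isset($attrs['linkType'], $attrs['linkID']) || $attrs['linkType'] !== 'internal') {
        return $m[0];
    }

    $attrs['href'] = 'viewPage.php?page=' . (int) $attrs['linkID'];
    unset($attrs['linkType'], $attrs['linkID']);

    // Rebuild the tag from the remaining attributes, in their original order.
    $out = '<a';
    foreach ($attrs as $name => $value) {
        $out .= ' ' . $name . '="' . $value . '"';
    }
    return $out . '>' . $m[2] . '</a>';
}

$html = '<p><a linkType="internal" linkID="36" href="#" title="Link to this page" class="whatever">Some text</a></p>';
$html = preg_replace_callback('#<a ([^>]*)>(.*?)</a>#is', 'rewriteLinkCallback', $html);
```

This does the find and the replace in one pass, so there is no need to keep a separate array of replacement strings in step with the matches.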

Actually, I have just achieved what I need.

preg_match_all ("#<a ([^>]*)>(.*?)</a>#is", $theSourceHTML, $arrayToStoreOutput, PREG_SET_ORDER);

Then I iterate through $arrayToStoreOutput, doing a str_replace on each one.

It works as I want it to, although I'm sure it could be more efficient.

Thanks for the link to the regex - very helpful.
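For anyone landing here later, the match-then-replace approach described above might look roughly like the sketch below. It is a guess at the shape of the solution, not the poster's actual code: `rewriteLinksInBatch()` is a hypothetical name, and the attribute handling assumes double-quoted values and the URL format from the first post. Collecting the old and new tags into two arrays allows a single `str_replace()` call instead of one call per link.

```php
<?php
// Hypothetical helper: find all <a> tags, build replacements, swap them in one go.
function rewriteLinksInBatch($html)
{
    // PREG_SET_ORDER gives one entry per link:
    // [0] = whole tag, [1] = attribute string, [2] = link text.
    preg_match_all('#<a ([^>]*)>(.*?)</a>#is', $html, $found, PREG_SET_ORDER);

    $old = array();
    $new = array();
    foreach ($found as $match) {
        $attrs = $match[1];
        // Skip anything that is not an internal CMS link.
        if (strpos($attrs, 'linkType="internal"') === false
            || !preg_match('/\blinkID="(\d+)"/', $attrs, $id)) {
            continue;
        }
        // Drop the custom attributes, then point href at the real page.
        $attrs = preg_replace('/\s*\blink(?:Type|ID)="[^"]*"/', '', $attrs);
        $attrs = preg_replace('/\bhref="[^"]*"/', 'href="viewPage.php?page=' . $id[1] . '"', $attrs);

        $old[] = $match[0];
        $new[] = '<a ' . trim($attrs) . '>' . $match[2] . '</a>';
    }

    // One call replaces every collected tag with its counterpart.
    return str_replace($old, $new, $html);
}
```

Note that the pattern requires a space after `<a`, so it will not match tags written with a newline or tab before the first attribute.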
