
regex help


raimis100

As the others have mentioned, I am sure there are many threads on this. I suppose it boils down to the search terms used; perhaps try terms like 'scrape'.

 

But to give you one example:

$str = 'This is an <abbr title="silly example">string</abbr> contains a <a href="[url=http://www.somesite.bork/somefile.php]http://www.somesite.bork/somefile.php[/url]"><strong> hyperlink </strong></a> but you can also visit <a href="[url=http://www.whatever.com/somefile2.php]http://www.whatever.com/somefile2.php[/url]">this link</a> as well.';

$href = $linkText = []; // result arrays
// the quotes inside the character classes must be escaped within a single-quoted PHP string
preg_match_all('#<a[^>]*href=[\'"]([^\'"]+)[\'"][^>]*>(.+?)</a>#si', $str, $link);
$arrTotal = count($link[0]); // number of matched <a> tags ($link[0] holds the full matches)
for ($a = 0 ; $a < $arrTotal ; $a++) {
    $href[] = $link[1][$a]; // stores the value of attribute href into array $href
    $linkText[] = trim(strip_tags($link[2][$a])); // stores hyperlink text into array $linkText
}
echo '<pre>'.print_r($href, true).'</pre>'; // output array $href
echo '<pre>'.print_r($linkText, true).'</pre>'; // output array $linkText

 

But I prefer using DOM / XPath for parsing tags. Assuming we use $str from the first snippet:

 

$dom = new DOMDocument;
$dom->loadHTML($str); // replace $str with string name in question
$xpath = new DOMXPath($dom);
$aTag = $xpath->query('//a[@href]');

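$href = $linkText = []; // reset so results from the first snippet are not carried over
foreach ($aTag as $val) {
    $href[] = $val->getAttribute('href'); // stores the value of attribute href into array $href
    $linkText[] = $val->nodeValue; // stores hyperlink text into array $linkText
}
$linkText = array_map('trim', $linkText); // strip surrounding whitespace from each link text
echo '<pre>'.print_r($href, true).'</pre>'; // output array $href
echo '<pre>'.print_r($linkText, true).'</pre>'; // output array $linkText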
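
If you need this in more than one place, the DOM / XPath approach can be wrapped in a small function. This is only a rough sketch: the function name extractLinks and the shape of the returned array are my own assumptions, and the libxml_use_internal_errors() call is there because real-world pages are often sloppy enough to make loadHTML() emit warnings.

```php
<?php
// Hypothetical helper: the name and return shape are assumptions for illustration.
function extractLinks(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the previous setting

    $links = [];
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//a[@href]') as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'), // value of the href attribute
            'text' => trim($a->nodeValue),      // link text with nested tags flattened
        ];
    }
    return $links;
}

$links = extractLinks('<p>Visit <a href="http://www.whatever.com/somefile2.php"><strong> this link </strong></a> today.</p>');
print_r($links);
```

Keeping each href and its text together in one entry avoids having to line up two separate arrays by index afterwards.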

Edit: the posting system is detecting the bogus URLs in the href values and inserting url BBCode tags around them, so simply remove those url tags when you copy and paste these snippets to test them.
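
If you would rather not remove the url tags by hand, something along these lines should strip them. It is a quick sketch that assumes the tags always take the [url=...]text[/url] form shown in the href values above; the variable names are only for illustration.

```php
<?php
// Sample input mirroring how the posting system mangled the href values.
$mangled = '<a href="[url=http://www.whatever.com/somefile2.php]http://www.whatever.com/somefile2.php[/url]">this link</a>';

// Replace each [url=...]text[/url] with just the text between the tags.
$clean = preg_replace('#\[url=[^\]]*\](.*?)\[/url\]#i', '$1', $mangled);

echo $clean, PHP_EOL;
```

Running $str through the same preg_replace() call before either snippet leaves plain URLs in the href attributes.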

