alinn
Posted December 18, 2012

How can I publish the extracted content as a WordPress post with this script, and how can I update that post if the images on the original page have changed?

<?php // File: MatchAllDivMain.php

// Read html file to be processed into $data variable
$data = file_get_contents('test.html');

// Commented regex to extract contents from <div class="main">contents</div>
// where "contents" may contain nested <div>s.
// Regex uses PCRE's recursive (?1) subexpression syntax to recurse into group 1
$pattern_long = '{                      # recursive regex to capture contents of "main" DIV
    <div\s+class="main"\s*>             # match the "main" class DIV opening tag
    (                                   # capture "main" DIV contents into $1
      (?:                               # non-cap group for nesting * quantifier
        (?: (?!<div[^>]*>|</div>). )++  # possessively match all non-DIV tag chars
      |                                 # or
        <div[^>]*>(?1)</div>            # recursively match nested <div>xyz</div>
      )*                                # loop however deep as necessary
    )                                   # end group 1 capture
    </div>                              # match the "main" class DIV closing tag
    }six';  // single-line (dot matches all), ignore case and free spacing modes ON

// short version of same regex
$pattern_short = '{<div\s+class="main"\s*>((?:(?:(?!<div[^>]*>|</div>).)++|<div[^>]*>(?1)</div>)*)</div>}si';

$matchcount = preg_match_all($pattern_long, $data, $matches);
// $matchcount = preg_match_all($pattern_short, $data, $matches);

echo("<pre>\n");
if ($matchcount > 0) {
    echo("$matchcount matches found.\n");
    // print_r($matches);
    for ($i = 0; $i < $matchcount; $i++) {
        echo("\nMatch #" . ($i + 1) . ":\n");
        echo($matches[1][$i]); // print 1st capture group for match number i
    }
} else {
    echo('No matches');
}
echo("\n</pre>");
?>
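As a rough sketch of the second part of the question (not a definitive answer), one way to notice that the images have changed is to fingerprint the image URLs inside the captured "main" DIV and compare that fingerprint with the one saved on the previous run. The sample HTML string, the `$fingerprint` variable, and the choice of `md5()` are assumptions made for illustration; the regex is the short pattern from the script above.

```php
<?php
// Hypothetical sample input standing in for test.html.
$data = '<div class="main">Intro <div class="inner"><img src="a.jpg"></div> outro</div>';

// Short version of the recursive regex from the script above.
$pattern = '{<div\s+class="main"\s*>((?:(?:(?!<div[^>]*>|</div>).)++|<div[^>]*>(?1)</div>)*)</div>}si';

$content = null;
$fingerprint = null;

if (preg_match($pattern, $data, $m)) {
    // Group 1 holds everything inside the "main" DIV, nested DIVs included.
    $content = $m[1];

    // Collect the src attribute of every <img> tag in the captured content.
    preg_match_all('/<img[^>]+src="([^"]*)"/i', $content, $imgs);

    // Assumption: a hash of the image URL list is enough to detect a change.
    // Store it alongside the post and compare it on the next run.
    $fingerprint = md5(implode("\n", $imgs[1]));

    echo $content, "\n";
    echo $fingerprint, "\n";
}
```

If the script runs inside a loaded WordPress environment, `$content` could be passed as `post_content` to `wp_insert_post()` the first time, and to `wp_update_post()` (with the stored post `ID`) whenever the saved fingerprint differs from the new one. How the post ID and fingerprint are stored is left open here.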