
Matching value between certain tags


WillyTheFish


Hey guys,

this is probably pretty simple, but I really have a problem with regex.

This is the input string:

</w:r><w:r><w:rPr><w:b/><w:sz w:val="24"/></w:rPr><w:br/></w:r><w:r><w:rPr><w:b/>
<w:sz w:val="24"/></w:rPr><w:br/></w:r><aml:annotation aml:id="2" w:type="Word.Bookmark.Start" w:name="something"/>
<w:r><w:rPr><w:sz w:val="24"/>
</w:rPr><w:t>THIS IS THE TEXT I WANT TO EXTRACT!</w:t></w:r><w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:br/>
</w:r><aml:annotation aml:id="2" w:type="Word.Bookmark.End"/><w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:br/></w:r><w:r><w:rPr>
<w:sz w:val="24"/></w:rPr><w:br/></w:r><w:r><w:rPr><w:sz w:val="24"/><w:highlight w:val="light-gray"/></w:rPr><w:t>NO MATCH</w:t>

 

 

I want to match the text between "Word.Bookmark.Start" and "Word.Bookmark.End", but not the tags in between. So for the example above, the regex should match:

"THIS IS THE TEXT I WANT TO EXTRACT!"

I do not want to extract the text "NO MATCH", since it is not located between "Word.Bookmark.Start" and "Word.Bookmark.End".

I want to gather all matches in an array.

Please help! Many thanks :)

 

https://forums.phpfreaks.com/topic/209354-matching-value-between-certain-tags/

Try:

<?php
$test = '</w:r><w:r><w:rPr><w:b/><w:sz w:val="24"/></w:rPr><w:br/></w:r><w:r><w:rPr><w:b/>
<w:sz w:val="24"/></w:rPr><w:br/></w:r><aml:annotation aml:id="2" w:type="Word.Bookmark.Start" w:name="something"/>
<w:r><w:rPr><w:sz w:val="24"/>
</w:rPr><w:t>THIS IS THE TEXT I WANT TO EXTRACT!</w:t></w:r><w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:br/>
</w:r><aml:annotation aml:id="2" w:type="Word.Bookmark.End"/><w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:br/></w:r><w:r><w:rPr>
<w:sz w:val="24"/></w:rPr><w:br/></w:r><w:r><w:rPr><w:sz w:val="24"/><w:highlight w:val="light-gray"/></w:rPr><w:t>NO MATCH</w:t>';
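// Lazy (.*?) stops at the nearest End; preg_match_all collects every Start/End pair.
preg_match_all('/"Word\.Bookmark\.Start"[^>]*>(.*?)<[^>]*"Word\.Bookmark\.End"/s', $test, $matches);
// Strip the inner tags and surrounding whitespace from each captured fragment.
$out = array_map('trim', array_map('strip_tags', $matches[1]));
print_r($out);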
?>
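One thing to watch with this kind of pattern once the document contains several bookmarks: a greedy `(.*)` runs from the first Start to the last End, while a lazy `(.*?)` stops at the nearest End. Here is a minimal sketch of the difference. The two-bookmark input is made up for illustration, not taken from the original file.

```php
<?php
// Made-up sample: two bookmarks, each wrapping one text run, with text in between.
$xml = '<aml:annotation aml:id="1" w:type="Word.Bookmark.Start" w:name="first"/>'
     . '<w:r><w:t>ONE</w:t></w:r>'
     . '<aml:annotation aml:id="1" w:type="Word.Bookmark.End"/>'
     . '<w:r><w:t>OUTSIDE</w:t></w:r>'
     . '<aml:annotation aml:id="2" w:type="Word.Bookmark.Start" w:name="second"/>'
     . '<w:r><w:t>TWO</w:t></w:r>'
     . '<aml:annotation aml:id="2" w:type="Word.Bookmark.End"/>';

$greedyPattern = '/"Word\.Bookmark\.Start"[^>]*>(.*)<[^>]*"Word\.Bookmark\.End"/s';
$lazyPattern   = '/"Word\.Bookmark\.Start"[^>]*>(.*?)<[^>]*"Word\.Bookmark\.End"/s';

// Greedy: a single match that swallows everything up to the last End.
preg_match_all($greedyPattern, $xml, $m);
$greedy = array_map('strip_tags', $m[1]);

// Lazy: one match per Start/End pair.
preg_match_all($lazyPattern, $xml, $m);
$lazy = array_map('strip_tags', $m[1]);

print_r($greedy); // one element: "ONEOUTSIDETWO"
print_r($lazy);   // two elements: "ONE", "TWO"
```

With a single bookmark, as in the posted input, both patterns give the same result, so the difference only shows up on larger documents.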

Thanks sasa, almost... this regex gives me an array with two elements:

array(2) { [0]=> string(242) ""Word.Bookmark.Start" w:name="something"/> THIS IS THE TEXT I WANT TO EXTRACT! string(147) " THIS IS THE TEXT I WANT TO EXTRACT! " }

Could you have another look at it? Thank you so much!

Oops, sorry, I forgot to print it properly... Actually, both elements are incorrect :(

 

Array
(
    [0] => "Word.Bookmark.Start" w:name="something"/>
<w:r><w:rPr><w:sz w:val="24"/>
</w:rPr><w:t>THIS IS THE TEXT I WANT TO EXTRACT!</w:t></w:r><w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:br/>
</w:r><aml:annotation aml:id="2" w:type="Word.Bookmark.End"
    [1] =>
<w:r><w:rPr><w:sz w:val="24"/>
</w:rPr><w:t>THIS IS THE TEXT I WANT TO EXTRACT!</w:t></w:r><w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:br/>
</w:r>
)
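A note on reading that output, as far as I can tell: this is how `preg_match` normally fills its third argument. Index 0 holds the whole matched text and index 1 holds the first capture group, so the markup in `[1]` is expected until `strip_tags()` is applied to it. A minimal sketch on a shortened, made-up fragment of the same shape:

```php
<?php
// Shortened, made-up fragment in the same shape as the posted input.
$xml = '<aml:annotation aml:id="2" w:type="Word.Bookmark.Start" w:name="something"/>' . "\n"
     . '<w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:t>WANTED TEXT</w:t></w:r>' . "\n"
     . '<aml:annotation aml:id="2" w:type="Word.Bookmark.End"/>';

preg_match('/"Word\.Bookmark\.Start"[^>]*>(.*?)<[^>]*"Word\.Bookmark\.End"/s', $xml, $out);

// $out[0] is the full match, from "Word.Bookmark.Start" through "Word.Bookmark.End".
// $out[1] is capture group 1: the raw markup between the two annotation tags.
$text = trim(strip_tags($out[1]));

var_dump($text); // string(11) "WANTED TEXT"
```

When inspecting such results in a browser, wrapping the output in `htmlspecialchars()` keeps the tags visible instead of letting the browser swallow them.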

<?php
$test = '</w:r><w:r><w:rPr><w:b/><w:sz w:val="24"/></w:rPr><w:br/></w:r><w:r><w:rPr><w:b/>
<w:sz w:val="24"/></w:rPr><w:br/></w:r><aml:annotation aml:id="2" w:type="Word.Bookmark.Start" w:name="something"/>
<w:r>  <w:rPr><w:sz w:val="24"/>   <blkah>  <blah>
</w:rPr><w:t>THIS IS THE TEXT I WANT TO EXTRACT!</w:t></w:r><w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:br/>
</w:r><aml:annotation aml:id="2" w:type="Word.Bookmark.End"/><w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:br/></w:r><w:r><w:rPr>
<w:sz w:val="24"/></w:rPr><w:br/></w:r><w:r><w:rPr><w:sz w:val="24"/><w:highlight w:val="light-gray"/></w:rPr><w:t>NO MATCH</w:t>';
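// Skip tags and whitespace after the Start marker, then capture the first run of plain text.
// \s already covers \n, so [\n\s] is simplified to \s. preg_match_all collects every bookmark.
preg_match_all('/"Word\.Bookmark\.Start".*?>\s*([^<\s][^<\n]+)<.*?"Word\.Bookmark\.End"/s', $test, $out);
print_r($out[1]);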
?>
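If a bookmark can contain more than one text run, a two-step approach may be easier to reason about: first isolate the markup of each bookmark, then pull the contents of every `<w:t>` element out of it. The sketch below assumes PHP 7 or later (for the type declarations) and that bookmarks are not nested. The helper name `extract_bookmark_texts` and the sample input are made up for illustration.

```php
<?php
/**
 * Hypothetical helper: returns one string per bookmark, built from every
 * <w:t> run found between its Start and End annotations.
 */
function extract_bookmark_texts(string $xml): array
{
    $texts = [];
    // Step 1: grab the raw markup of each bookmark (lazy, so pairs stay separate).
    preg_match_all(
        '/"Word\.Bookmark\.Start"[^>]*>(.*?)<[^>]*"Word\.Bookmark\.End"/s',
        $xml,
        $bookmarks
    );
    foreach ($bookmarks[1] as $fragment) {
        // Step 2: collect the contents of every <w:t> element in that fragment.
        // The optional (?:\s[^>]*)? allows attributes but rejects tags such as <w:tab/>.
        preg_match_all('/<w:t(?:\s[^>]*)?>(.*?)<\/w:t>/s', $fragment, $runs);
        $texts[] = implode('', $runs[1]);
    }
    return $texts;
}

// Made-up sample: one bookmark with two text runs, followed by text outside it.
$xml = '<aml:annotation aml:id="1" w:type="Word.Bookmark.Start" w:name="a"/>'
     . '<w:r><w:t>Hello, </w:t></w:r><w:r><w:rPr><w:b/></w:rPr><w:t>world</w:t></w:r>'
     . '<aml:annotation aml:id="1" w:type="Word.Bookmark.End"/>'
     . '<w:r><w:t>NO MATCH</w:t></w:r>';

print_r(extract_bookmark_texts($xml)); // one element: "Hello, world"
```

Targeting `<w:t>` directly also avoids depending on where whitespace and line breaks happen to fall between the tags.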

