dsaba Posted May 3, 2007

I need some help stripping certain text out of some span tags. Here's a sample string:

<?php
$string = '<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">othertext</span>hello</span><br />
<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">blablabla</span>how</span><br />
<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">blaijsalk2</span>are</span><br />
<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">superblabla</span>you</span><br />';
?>

I want to strip everything out of the string except:

<?php
$newstring = 'hello<br />how<br />are<br />you<br />';
?>

How can I do this with regex or any other kind of string-parsing algorithm? I'm posting after trying many different methods that failed, so I did try to do it myself first and have run out of ideas. Now I'm asking for the community's help/advice. Thank you.

fert Posted May 3, 2007

$str = preg_replace("/<span(.*)>(.*)<\/span>/", "$2", $str);

That might work, but I haven't tried it.

corbin Posted May 3, 2007

Because of the similarity of the spans, and the randomness by which you've chosen spans to read, I'm pretty sure it's either gonna be the most complex regexp I've ever seen, or it'll be impossible. You could maybe try to write a regexp that does it based on the order of the spans. Are the spans that you're trying to pull text out of always in the same order, and are there always the same number of spans? If both of those are yes, then I can write a regexp that'll pull the stuff out, but if not, then I have no idea ;p.

Edit: Fert's thing will work to grab text from all the spans, but the way you worded your question, I'm under the impression that you only want text from certain spans (in which case you could use Fert's pattern with preg_match_all and just use certain array keys of the variable set to the matches ;p)

dsaba Posted May 3, 2007

I edited fert's code to do this:

<?php
$string = '<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">othertext</span>hello</span><br />
<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">blablabla</span>how</span><br />
<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">blaijsalk2</span>are</span><br />
<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">superblabla</span>you</span><br />';

// Split on the line breaks so the pattern only ever sees one pair of spans.
$array = explode('<br />', $string);
$newArray = array();
foreach ($array as $value) {
    // Keep only the text that follows the inner span.
    $newArray[] = preg_replace("/<span(.*)>(.*)<\/span>/", "$2", $value);
}
// Put the line breaks back.
$fulltext = implode('<br />', $newArray);
echo $fulltext;
?>

It gives me my desired result. Thanks, fert!

dsaba Posted May 3, 2007

OK, this code does what I wanted before:

$newstring = preg_replace("/<span(.*)>(.*)<\/span>/", "$2", $oldstring);

However, I tried using it on the string below and it doesn't do the job. I don't know why. Here is the original string:

<?php
$string = '<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">blablabla<a href="http://64.233.179.104/translate_c?hl=en&langpair=en%7Car&u=http://www.google.com/">blablalink</a></span>hello how are you <a href="http://64.233.179.104/translate_c?hl=en&langpair=en%7Car&u=http://www.google.com/">click here</a> </span>';
?>

Here is the new desired result:

<?php
$string = 'hello how are you <a href="http://64.233.179.104/translate_c?hl=en&langpair=en%7Car&u=http://www.google.com/">click here</a>';
?>

There are two spans, one inside the other, and I want the text from inside the second span. As you can see, it worked before; the only difference now is that there is an <a> tag as part of the text in the second span.

What do I need to change in the line that worked before?

$newstring = preg_replace("/<span(.*)>(.*)<\/span>/", "$2", $oldstring);

corbin Posted May 3, 2007

Hmmm, I think you might be looking for:

$string = '<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">blablabla<a href="http://64.233.179.104/translate_c?hl=en&langpair=en%7Car&u=http://www.google.com/">blablalink</a></span>hello how are you <a href="http://64.233.179.104/translate_c?hl=en&langpair=en%7Car&u=http://www.google.com/">click here</a> </span>';
$string = preg_replace("/<span (.*)>/Ui", "", $string);
$string = preg_replace("#<\/span>#Ui", "", $string);
echo $string;

I think I might have misunderstood what you're trying to get from the spans though....

dsaba Posted May 3, 2007

No corbin, that's not what I'm looking for. Let me try to re-explain myself, because fert had the right idea. Nobody seems to know what I'm trying to do, which makes it harder if you're trying to help me, huh?

Breaking it down now. Here is the original string:

<?php
$originalstring = '<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">blablabla<a href="http://www.blabla.com">blablalink</a></span>hello how are you <a href="http://www.clickhere.com">click here</a> </span>';
?>

I want to make this string into:

<?php
$newstring = 'hello how are you <a href="http://www.clickhere.com">click here</a>';
?>

Some observations:

1. There are two spans in the original string.
2. One span is inside of the other (something like that...).
3. The text that is in $newstring is inside of the second span.

Fert's code:

$newstring = preg_replace("/<span(.*)>(.*)<\/span>/", "$2", $originalstring);

worked when there weren't any <a> tags in the text I want to keep (the text inside the second span, remember, look above). So I need to edit this in some way to make it also accept the <a> tag as text to keep. This is where I NEED YOUR HELP.

Now corbin: your code took everything out of BOTH spans, not just the second span. Taking stuff only out of the second span can be done; look at fert's preg_replace, it does it. I don't know how, but it does. How can I edit it so it also works with the <a> tag?

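A possible explanation for the behaviour described above, as a minimal sketch: both `(.*)` groups are greedy, so the first group swallows as much as it can and then backs off to the last `>` that still leaves `(.*)</span>` matchable. In the earlier samples that `>` closes the inner `</span>`, so `$2` ends up holding the trailing text. Once the kept text ends in `</a> </span>`, the last usable `>` belongs to `</a>`, and `$2` holds only the space before `</span>`. The variable names and the shortened strings are illustrative stand-ins, not the posters' data.

```php
<?php
// The pattern from the posts above: both (.*) groups are greedy.
$pattern = '/<span(.*)>(.*)<\/span>/';

// Shortened, hypothetical stand-ins for the two sample strings.
$plain    = '<span style="x"><span class="google-src-text">othertext</span>hello</span>';
$withLink = '<span style="x"><span class="google-src-text">blablabla</span>hello <a href="#">click here</a> </span>';

// Group 1 grabs as much as it can, then backs off to the LAST ">" that
// still leaves "(.*)</span>" matchable after it.
preg_match($pattern, $plain, $m);
$plainText = $m[2];   // the last usable ">" closes the inner </span>

preg_match($pattern, $withLink, $m);
$linkText = $m[2];    // the last usable ">" now closes </a>

var_dump($plainText, $linkText);
```

With these inputs, `$plainText` is `hello` and `$linkText` is a single space, which is why the same `preg_replace` call returns almost nothing once an `<a>` tag sits at the end of the text.
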
ToonMariner Posted May 3, 2007

The text is actually inside the first span, so try this:

$newstring = preg_replace("/<span(.*?)>(.*?)<span(.*?)>(.*?)<\/span>(.*?)<\/span>/", "<span$1>$2$5</span>", $originalstring);

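If the goal is the bare `$newstring` from the question rather than the text still wrapped in the outer span, one way to adapt the lazy-group idea is sketched below. The helper name `keepOuterSpanText`, the `s` modifier, and the `trim()` call are assumptions added for illustration (the sketch also assumes PHP 7 or later for the type declarations); it only handles exactly one span nested inside another, as in the samples.

```php
<?php
// Hypothetical helper, not from the thread: keeps the outer span's own text
// and drops the nested span together with its contents.
function keepOuterSpanText(string $html): string
{
    // Lazy groups: each (.*?) stops at the first point where the rest can match.
    // $1 outer attributes, $2 text before the inner span, $3 inner attributes,
    // $4 inner span contents (discarded), $5 text after the inner span.
    // The "s" modifier lets "." cross newlines (an assumption about the input).
    $pattern = '/<span(.*?)>(.*?)<span(.*?)>(.*?)<\/span>(.*?)<\/span>/s';

    // Emit only the outer span's own text, without the surrounding tags.
    return trim(preg_replace($pattern, '$2$5', $html));
}

$originalstring = '<span onmouseover="_tipon(this)" onmouseout="_tipoff()" style="direction: rtl; text-align: right"><span class="google-src-text" style="direction: ltr; text-align: left">blablabla<a href="http://www.blabla.com">blablalink</a></span>hello how are you <a href="http://www.clickhere.com">click here</a> </span>';

$newstring = keepOuterSpanText($originalstring);
echo $newstring;
```

Because every group is lazy, the `<a>` tag inside the kept text no longer changes where the match ends. Deeper nesting or extra spans inside the kept text would still break this pattern; for markup like that, an HTML parser such as PHP's DOMDocument is likely the more robust choice.
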