marklarah, posted March 17, 2009:
Thread: https://forums.phpfreaks.com/topic/149887-solved-simple-string-parsing/

I have the HTML of a remote webpage stored in a variable (fetched through cURL). It has been run through htmlentities, so I have the text for it. I need to get one specific link from this webpage. I think there are other links, but this one is in a specific format:

    <a href="http://linkhere" style="font-size:15px;">

I have read things online about how to parse it, but I can't find one that does exactly this. How would I go about doing it?

Thanks... Mark.

premiso, posted March 17, 2009:

    <?php
    $string = '<a href="http://linkhere" style="font-size:15px;">';
    preg_match('~<a href="(.+?)" style="font-size:15px;">~si', $string, $matches);
    echo $matches[1];
    ?>

marklarah, posted March 17, 2009:

Oh OK, that's a great way of doing it, thanks. The only thing is, it's not echoing anything, which it should. The string has been through htmlentities, if that makes a difference.

premiso, posted March 17, 2009:

Then give me the exact string you are trying to get out, or modify the pattern to fit your needs. The code above should give you a good idea of how to modify it. If it is not echoing anything, it is not finding any matches. Simple as that.

marklarah, posted March 17, 2009:

OK, sort of half-fixed it. It turns out the entities do matter. I'm using:

    preg_match('~<a href="(.+?)" style="font-size:15px;">~si', $result, $matches);

Now I get a lot of stuff, but the last thing is the link. The first characters of what's on the page are ?"> and I presume this comes from the preg_match pattern. If the pattern needs editing, what comes before the link is id="link"> and there is a space between the end of that div and the link.

premiso, posted March 17, 2009:

Why not parse it before you run it through htmlentities? That would make this about 10 times easier on you. Or, if it is already entity-encoded, run html_entity_decode on it before passing it to preg_match.

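A minimal sketch of the decode-first suggestion above, assuming the fetched page looks like the hypothetical fragment in `$encoded` (the real page content is not shown in the thread). It uses the pattern from the earlier post and shows that the match fails on the entity-encoded text but succeeds once `html_entity_decode` has been applied.

```php
<?php
// Hypothetical input: a fragment that has been run through htmlentities(),
// so '<', '>' and '"' are stored as &lt;, &gt; and &quot;.
$encoded = htmlentities('<div id="link"> <a href="http://linkhere" style="font-size:15px;">');

$pattern = '~<a href="(.+?)" style="font-size:15px;">~si';

// On the encoded text the literal '<a href="' never occurs, so nothing matches.
$matchedBefore = preg_match($pattern, $encoded, $ignored);

// Decode the entities first, then run the same pattern.
$html = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
$link = preg_match($pattern, $html, $matches) ? $matches[1] : null;

echo $matchedBefore, "\n";                 // 0
echo $link ?? 'no match', "\n";            // http://linkhere
```

Checking the return value of `preg_match` before reading `$matches[1]` avoids an undefined-index notice when nothing is found.
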
marklarah, posted March 17, 2009:

Well, OK, htmlentities is off now, but I'm still getting lots of stuff before the link.

marklarah, posted March 17, 2009:

OK, we have this on the page:

    <a href="?">

So it is echoing from ?"> to the end of the link.

premiso, posted March 17, 2009:

"but I'm still getting lots of stuff before the link."

Hmm, that gives me a ton of information. Post your current code and the "stuff" before the link. If you take the original code I posted above, it works fine, right? So the issue lies with how the data is stored in the string.

Do a print_r on the $matches variable, view the page source, and paste that array here as well, just to make sure you are echoing $matches[1] and not $matches[0].

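To illustrate the `$matches[0]` versus `$matches[1]` distinction mentioned above, here is a small sketch reusing the sample string from the first reply. Index 0 holds the entire text the pattern matched, and index 1 holds only the first parenthesised group.

```php
<?php
$string = '<a href="http://linkhere" style="font-size:15px;">';

$found = preg_match('~<a href="(.+?)" style="font-size:15px;">~si', $string, $matches);

// Dump the whole array to see both entries. In a browser, view the page
// source, because the tag in $matches[0] would otherwise be rendered as HTML.
print_r($matches);

$whole = $matches[0]; // the full matched tag
$link  = $matches[1]; // only the captured href value

echo $link, "\n"; // http://linkhere
```
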
marklarah, posted March 17, 2009:

print_r returns exactly the same. The problem is that there are a few links on the page, so the match runs from the first link starting with <a href="... all the way to the link we're trying to single out, which is unique because of its style. Can we edit the preg_match pattern to identify a line of text before the link?

marklarah, posted March 17, 2009:

Never mind, I've done it. There was a tab before the link, so I've put that into the preg pattern and that seems to work. Thanks a million for your help, though! <3 you...

samshel, posted March 17, 2009:

Try this:

    <?php
    $string = '<a href="http://linkhere" style="font-size:20px;"><a href="http://linkhere" style="font-size:15px;">';
    preg_match('~<a href="([^\"]+)" style="font-size:15px;">~si', $string, $matches);
    echo $matches[1];
    ?>

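The sketch below compares the two patterns from this thread on a page with more than one link. The fragment in `$html` is hypothetical, pieced together from the details mentioned in the posts (an earlier `<a href="?">` link, an element ending in `id="link">`, and a tab before the target link). The lazy `(.+?)` still starts at the first `<a href="` and, because of the `s` flag, keeps extending across tags until the style attribute appears. The negated class `[^"]+` cannot cross a quote, so the match is confined to a single href value.

```php
<?php
// Hypothetical page fragment: two links, only the second carries the style.
$html = '<a href="?">first</a> <div id="link">' . "\t"
      . '<a href="http://linkhere" style="font-size:15px;">';

// Lazy dot: begins at the FIRST link and runs on to the styled one.
preg_match('~<a href="(.+?)" style="font-size:15px;">~si', $html, $lazy);

// Negated class: the capture stops at the first double quote, so the first
// link fails to match and the engine moves on to the styled link.
preg_match('~<a href="([^"]+)" style="font-size:15px;">~si', $html, $strict);

echo $lazy[1], "\n";   // starts with ?"> and ends with http://linkhere
echo $strict[1], "\n"; // http://linkhere
```
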