Ninjakreborn Posted July 11, 2010

<?php
$url = file_get_contents('http://search.yahoo.com/search;_ylt=A0geu5jCGDhIk6gAr6pXNyoA?p=link:' . $domain);
$preg = preg_match_all("=<span class\=\"url\">(.*)</span>=siU", $url, $results);
?>

This fetches a page and grabs everything inside each <span class="url">whatever</span> tag. It works great, but why is it doubling the array? It returns $results as $array[0] and $array[1], and both seem to hold duplicates: all of 0 has the text I want, but the same text is in 1 as well. Why did it create it that way? I mostly want to figure this out so I can understand regular expressions better.
wildteen88 Posted July 11, 2010

If you look at the page source of the output, you'll see two distinct arrays. $array[0] contains the full matches, both the HTML and the URLs, e.g.

Array ( [0] => Array ( [0] => <span class="url">http://some url here</span> .... rest of matches ) )

whereas $array[1] contains just the URLs (what your capture group (.*) matched), e.g.

Array ( [1] => Array ( [0] => http://some url here .... rest of matches ) )

Hope that helps you understand what preg_match_all is doing.
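To make the two sub-arrays concrete, here is a minimal sketch that runs the same pattern against a hard-coded string instead of the live search page. The sample markup and the example.com URLs are made up for illustration. It also shows the PREG_SET_ORDER flag, which groups the results per match rather than per capture group, in case that layout is easier to loop over.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<span class="url">http://example.com/a</span> text '
      . '<span class="url">http://example.com/b</span>';

$pattern = '=<span class\="url">(.*)</span>=siU';

// Default order (PREG_PATTERN_ORDER):
//   $results[0] = every full match, including the <span> tags
//   $results[1] = what capture group 1, the (.*), matched
$count = preg_match_all($pattern, $html, $results);

// PREG_SET_ORDER: one sub-array per match, as [full match, group 1].
preg_match_all($pattern, $html, $sets, PREG_SET_ORDER);

print_r($results);
print_r($sets);
```

So the array is not really duplicated: index 0 always holds the whole match, and each pair of parentheses in the pattern adds one more index after it.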
Ninjakreborn Posted July 11, 2010 Author

Very much so, thank you. Oh, and Wildteen, I have not seen you in forever. It's nice to see you around here again. Hope things are going well for you. I am really trying to jump into this regex stuff and am finally starting to write some of my own. After all this time I am finding out it's very helpful for some of the things I have had to do. Thanks again.
Ninjakreborn Posted July 11, 2010 Author

<?php
$temp = preg_match_all("\bInlinks\b \(([:digit:])\)", $url, $test);
?>

OK, I can't get this to work. I want to search a giant string and return one instance of "Inlinks (123)", where 123 could be any number of results. I am getting a strange error:

Warning: preg_match_all() [function.preg-match-all]: Delimiter must not be alphanumeric or backslash in /users1/functions.php on line 51

I also tried the above regular expression without the \b's around the text, because I thought you could match a word directly. When the regex finds the word Inlinks followed by a set of numbers in parentheses, I want to grab those numbers. My logic was: match the word Inlinks followed by () (which is why I escaped the outer set), then capture any numeric characters inside. But instead I get this error. Any advice would be appreciated.
wildteen88 Posted July 11, 2010

You forgot the delimiters (they define the start and end of your regex pattern). I tend to use ~ as the delimiter. You'll want to wrap [:digit:] within square brackets too, otherwise you'll receive an error. You'll also need to add the one-or-more quantifier (the + symbol) after [:digit:] to match one or more numbers, otherwise your pattern will only match a single digit.

Fixed code

$temp = preg_match_all("~\bInlinks\b \(([[:digit:]+])\)~", $url, $test);
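As a rough illustration of the delimiter point only, the sketch below runs a simplified pattern (just the word Inlinks, on a made-up test string) with and without delimiters. The @ operator is used here purely to silence the warning so the return value can be inspected; the variable names are hypothetical.

```php
<?php
// Hypothetical test string, not taken from the real page.
$text = 'Inlinks (123)';

// No delimiters: the first character is a backslash, so PCRE refuses the
// pattern, PHP emits the "Delimiter must not be ..." warning and returns false.
$noDelimiter = @preg_match('\bInlinks\b', $text);

// Any matching pair of non-alphanumeric, non-backslash characters works.
$tilde = preg_match('~\bInlinks\b~', $text);
$hash  = preg_match('#\bInlinks\b#', $text);
$slash = preg_match('/\bInlinks\b/', $text);

var_dump($noDelimiter, $tilde, $hash, $slash);
```

Picking a delimiter that does not appear in the pattern itself, such as ~ or #, saves escaping it inside the pattern.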
Ninjakreborn Posted July 11, 2010 Author

OK, I appreciate that. I made some notes about the delimiters and the other things you said. I tried it and it wasn't working, so I cut it down to a test case, and that wasn't working either. I then reduced the pattern to make it more specific and it still wasn't working. Here is the current pattern and test case:

<?php
$temp = preg_match_all("~\(([[:digit:]+])\)~", 'test whatever (1234) whatever test', $test);
echo '<pre>';
print_r($test);
echo '</pre>';
?>

I basically want to grab the numbers between (). So in this case it should just be: start delimiter, an escaped (, a group that grabs all the digits, an escaped ), and the closing delimiter. This test case returns an empty array. Once the test case works I will expand it myself to make sure Inlinks is in front, but there must be something I am doing wrong here. Any advice is appreciated. Thanks again.
wildteen88 Posted July 11, 2010

Whoops, my bad. I meant to say add the + operator after [[:digit:]], not after [:digit:]. I forgot the extra [].
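For reference, a minimal sketch of the correction described above, run against a made-up test string. It contrasts the quantifier inside the character class with the quantifier after it, and also shows \d+ as a shorter way of writing the same thing. preg_match is used instead of preg_match_all on the assumption that only one instance is wanted.

```php
<?php
// Hypothetical test string modelled on the earlier test case.
$text = 'test whatever Inlinks (1234) whatever test';

// + inside the class: [[:digit:]+] matches exactly ONE character that is
// either a digit or a literal "+", so "(1234)" does not match.
$broken = preg_match('~\(([[:digit:]+])\)~', $text, $m1);

// + after the class: one or more digits between the parentheses.
$fixed = preg_match('~\bInlinks\b \(([[:digit:]]+)\)~', $text, $m2);

// \d is shorthand for a digit and behaves the same here.
$short = preg_match('~\bInlinks\b \((\d+)\)~', $text, $m3);

var_dump($broken, $m2[1], $m3[1]);
```

In each working pattern the number ends up in index 1 of the matches array, while index 0 holds the full "Inlinks (1234)" text.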
Ninjakreborn Posted July 11, 2010 Author

Oh, OK. I got it adjusted and now it's even working with the Inlinks part as well. Thanks again, I appreciate it.