
Help with this small php script


kevinkhan


Please have a look at this script and tell me what is wrong.

I'm trying to extract the link and the text between <a href> and </a>, and I only want the ones that have class="vehicle".

 


<?php

function getMatches($strMatch,$strContent) 
  {
	if(preg_match_all($strMatch,$strContent,$objMatches))
    {
		return $objMatches;
	}
	return "";
}


  $strContent = '<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle1</a>
                 <li class="bus"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
                 <li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle2</a>
                 <li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
                 <li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>';

  $strMatch "#<li class=\"vehicle\"><a href=\"(.*)\">(.*)</a>#";
  $objListMatches = getMatches($strMatch,$strContent);


    $sUrl = $objListMatches[1];
    $sTitle = $objListMatches[2];


echo $sUrl;
echo "<br />";
echo $sTitle;



 

Any help will be greatly appreciated ;)


Sorry, I left out the equals sign :)

It worked fine, but now I've changed the code to this, and when I run the script I get nothing :(

 

 

<?php

$strContent = file_get_contents('http://www.carzone.ie/search/results?searchsource=browse&cacheBuster=1256829754944212');

function getMatches($strMatch, $strContent)
{
    if (preg_match_all($strMatch, $strContent, $objMatches)) {
        return $objMatches;
    }
    return "";
}

$strMatch = '#<li class="vehicle"><a href="([^"]*)">([^<]*)</a>#is';

$objListMatches = getMatches($strMatch, $strContent);

$sUrl = $objListMatches[1];
$sTitle = $objListMatches[2];

print_r($sUrl);
echo "<br />";
print_r($sTitle);

?>

 

Does the regular expression have to change now?

I'm looking for the same piece of HTML in the carzone.ie URL.


Sorry, you are right.

I changed it to this:

$strMatch ='#<li class="vehicle-make-model"><a href="([^"]*)">([^<]*)</a>#is';

Please check this.

I'm looking for the titles of ads on the page with a class of vehicle-make-model.

Please check, I really want to know how to do this...

 


As suggested in previous threads, you would probably be better off with DOMDocument. Interestingly, the source code received from file_get_contents is different from what you get when you view source in a browser, and that is probably what is causing your problems. When clicking View Source in Firefox, we get this...

 

<li class="vehicle-make-model"><a title="Alfa Romeo 146 Alfa Romeo  Alfa Romeo 146 1.4 5DR" href="http://www.carzone.ie/search/Alfa-Romeo/146/Alfa-Rom/200935195049580/advert?channel=CARS">Alfa Romeo 146 Alfa Romeo  …</a></li>

 

When viewing the contents of file_get_contents, we get this format instead...

 

<li class="vehicle-make-model">  <a href="http://www.carzone.ie/search/Alfa-Romeo/146/1.4/200840190284999/advert?channel=CARS">  Alfa Romeo 146 </a>
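For reference, here is a minimal sketch of the DOMDocument route mentioned above, which should cope with both markup variants because it does not depend on attribute order or whitespace. The sample HTML, the example.com URLs, and the function name extractVehicleLinks are all hypothetical, made up for illustration; the real page may be structured differently.

```php
<?php
// Hypothetical sample: one item in the "browser" style (title before href)
// and one in the whitespace-padded style returned by file_get_contents.
$html = <<<'HTML'
<ul>
<li class="vehicle-make-model"><a title="Alfa Romeo 146" href="http://example.com/advert/1">Alfa Romeo 146</a></li>
<li class="vehicle-make-model">  <a href="http://example.com/advert/2">  Alfa Romeo 147 </a></li>
<li class="other"><a href="http://example.com/ignored">Ignored</a></li>
</ul>
HTML;

// Hypothetical helper: returns url/title pairs for links inside
// <li class="vehicle-make-model"> elements.
function extractVehicleLinks(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//li[@class="vehicle-make-model"]/a[@href]') as $a) {
        $links[] = [
            'url'   => $a->getAttribute('href'),
            'title' => trim($a->textContent),
        ];
    }
    return $links;
}

$links = extractVehicleLinks($html);
foreach ($links as $link) {
    echo $link['url'], ' => ', $link['title'], PHP_EOL;
}
```

The XPath test @class="vehicle-make-model" only matches when that is the entire class attribute; if the real items carry extra classes, the expression would need adjusting.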

 

Making a guess at which bits you want, you should try something like...

 

'~<li class="vehicle-make-model">\s*<a href="([^"]*)">(.*?)</a>~is'
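To show how that pattern might be used, here is a small sketch run against a hypothetical sample in the whitespace-padded format (the example.com URLs and the $ads array are made up for illustration). It uses PREG_SET_ORDER so each match arrives as one URL/title pair, and trim() to strip the padding around the link text.

```php
<?php
// Hypothetical sample in the format returned by file_get_contents.
$strContent = <<<'HTML'
<li class="vehicle-make-model">  <a href="http://example.com/advert/1">  Alfa Romeo 146 </a>
<li class="vehicle-make-model">
    <a href="http://example.com/advert/2">Alfa Romeo 147</a>
<li class="other"><a href="http://example.com/ignored">Ignored</a>
HTML;

// \s* allows whitespace between the <li> and the <a>;
// (.*?) is lazy, so it stops at the first closing </a>.
$strMatch = '~<li class="vehicle-make-model">\s*<a href="([^"]*)">(.*?)</a>~is';

$ads = [];
if (preg_match_all($strMatch, $strContent, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        // $match[1] is the href value, $match[2] the link text.
        $ads[] = ['url' => $match[1], 'title' => trim($match[2])];
    }
}

foreach ($ads as $ad) {
    echo $ad['url'], ' => ', $ad['title'], PHP_EOL;
}
```

Note that this pattern expects href to be the first attribute of the <a> tag, so it would not match the browser version of the markup, where title comes first.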


That's exactly what I want...

 

Thanks for that.. ;):) :) :)

 

 
