How to retrieve specific content from a web page

etusha · February 11, 2009

hello

I need help. I'm new to PHP and my boss gave me an advanced task that I have to finish fast. Please help me.

For example, I need the Top News Story from http://www.bbc.co.uk/

//div[@id='hpFeatureBoxInt']

cola · February 11, 2009

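I'm not sure what you mean, but to fetch the content of a site:

<?php

// The URL needs a scheme (http://), otherwise PHP looks for a local file.
$l = "http://www.something.net";

// Read the whole page into a string.
$cg = file_get_contents($l);

?>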

etusha · February 11, 2009

I don't need the whole site, I only need the Top News Story:

"Zimbabwe PM pledges 'new chapter'

Zimbabwe's new Prime Minister Morgan Tsvangirai vows to stabilise the shattered economy and end political violence."

cola · February 11, 2009

The simplest way to do this is to go to the web site, copy the text,

and put it into a string:

<?php

$title = "Zimbabwe PM pledges 'new chapter'";

$text = "Zimbabwe's new Prime Minister Morgan Tsvangirai vows to stabilise the shattered economy and end political violence.";

echo "$title <br> $text";

?>

premiso · February 11, 2009

Quote from cola: "The simplest way to do this is to go to the web site, copy the text, and put it into a string"

lol, I am sure he wants it done automatically/dynamically.

You need to look at the HTML source, find common tags, then use either preg_match or a series of explode calls to grab the data. Of course, to retrieve the site data you first have to pull in the full page's HTML source with file_get_contents or curl.
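As a rough illustration of the explode route mentioned above, here is a minimal sketch run on a hard-coded sample string rather than a live page. The helper name text_between() and the sample markup are made up for this example; on a real page the string would come from file_get_contents or curl.

```php
<?php
// Hypothetical helper: returns the text between $start and $end,
// or null if either marker is missing. Built only from explode().
function text_between($html, $start, $end)
{
    $parts = explode($start, $html, 2);   // split once at the opening marker
    if (count($parts) < 2) {
        return null;                      // opening marker not found
    }
    $rest = explode($end, $parts[1], 2);  // split the remainder at the closing marker
    if (count($rest) < 2) {
        return null;                      // closing marker not found
    }
    return $rest[0];
}

// Made-up sample markup standing in for a downloaded page.
$sample = '<div id="box"><h3>Some headline</h3><p>Some summary.</p></div>';

$headline = text_between($sample, '<h3>', '</h3>');
$summary  = text_between($sample, '<p>', '</p>');

echo "Title: $headline\n";
echo "Body: $summary\n";
```

This works only as long as the markers appear exactly as written in the page source, so it tends to be more brittle than a regex or a real HTML parser.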

etusha · February 11, 2009

premiso, please help me.

I don't have any idea where to start.

I'm new to PHP.

premiso · February 11, 2009

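<?php
// Download the full HTML source of the page.
$content = file_get_contents('http://www.bbc.co.uk/');
if ($content === false) {
    die('Could not retrieve the page.');
}

// Capture everything between the unique div's opening tag and the first </div>.
if (!preg_match('~"hpFeatureBoxInt">(.+?)</div>~s', $content, $matches)) {
    die('Could not find the Top News Story box.');
}

// Split once at </h3>: the headline comes before it, the summary after it.
list($title, $body) = explode("</h3>", $matches[1], 2);
$title = trim(strip_tags($title));
$body = trim(strip_tags($body));

echo "Title: $title <br />";
echo "Body: $body <br />";

die();
?>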

First I viewed the source of the BBC page you wanted to parse. Second, I looked for an identifying tag around what you wanted out of that page.

I found:

<div id="hpFeatureBoxInt">
<h2><span class="dy">Top News Story</span></h2>
<h3><a href="/go/homepage/i/int/news/world/1/-/news/1/hi/world/africa/7884282.stm"><img width="201" height="150" src="/feedengine/homepage/images/_45468316_84737466_201x150.jpg" alt="Morgan Tsvangirai addresses crowds"/>Zimbabwe PM pledges 'new chapter'</a></h3>
	<p>Zimbabwe's new Prime Minister Morgan Tsvangirai vows to stabilise the shattered economy and end political violence.</p>

	<p id="fbilisten"><a href="/go/homepage/i/int/news/heading/-/news/">More from BBC News</a>
</p>
</div>

I did a quick search for "hpFeatureBoxInt"> and did not find any other matches on the page, which means matching on it returns the right block.

Next I used file_get_contents to retrieve the HTML source of the website you wanted to parse and put it into a string.

I then used that string in preg_match with the regex '~"hpFeatureBoxInt">(.+?)</div>~s', which finds the tag, grabs everything between the opening tag and the first closing div tag, and stores it in an array of matches. The match is stored at index 1 of the array; index 0 holds what was found with the surrounding tags intact, so we do not want that one.

Next I used explode to split the match into two separate variables, $title and $body. I exploded it on </h3> because that is what separates the two.

Next I used strip_tags to remove any HTML tags left over and trim to remove the extra whitespace. Now you have the two items in strings and can display them however you want.

Any questions, let me know.
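For comparison, the XPath from the first post, //div[@id='hpFeatureBoxInt'], can be run through PHP 5's DOM extension instead of a regex. This is only a sketch and not what the script above does: it parses a trimmed copy of the markup quoted in this post from a string, so nothing is downloaded. On a live page, $html would come from file_get_contents or curl.

```php
<?php
// Trimmed copy of the markup quoted above, used as a stand-in for the live page.
$html = <<<HTML
<html><body>
<div id="hpFeatureBoxInt">
<h2><span class="dy">Top News Story</span></h2>
<h3><a href="/go/homepage/i/int/news/world/1/-/news/1/hi/world/africa/7884282.stm">Zimbabwe PM pledges 'new chapter'</a></h3>
<p>Zimbabwe's new Prime Minister Morgan Tsvangirai vows to stabilise the shattered economy and end political violence.</p>
</div>
</body></html>
HTML;

$doc = new DOMDocument();
// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// First <h3> and first <p> inside the div with the unique id.
$titleNode = $xpath->query("//div[@id='hpFeatureBoxInt']//h3")->item(0);
$bodyNode  = $xpath->query("//div[@id='hpFeatureBoxInt']/p")->item(0);

// item(0) is null when nothing matched, so fall back to an empty string.
$title = $titleNode ? trim($titleNode->textContent) : '';
$body  = $bodyNode ? trim($bodyNode->textContent) : '';

echo "Title: $title\n";
echo "Body: $body\n";
```

Because the query targets the h3 and p elements directly, there is no need for explode or strip_tags here, and the "Top News Story" label from the h2 is left out of the title.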

redarrow · February 11, 2009

Didn't work for me. I see "Body:" and that's it, lol.

premiso · February 11, 2009

Quote from redarrow: "Didn't work for me. I see "Body:" and that's it, lol."

Works great on my end. Are you using a host that has allow_url_fopen disabled? If so, grabbing the data via curl should solve that problem and make it work.

Either way, I did test it and it is working great on my box.

redarrow · February 11, 2009

Can you kindly give another way?

My server seems to ignore all of that. Very bad, I am so upset. I enjoyed reading your example.

redarrow · February 11, 2009

It's your preg_match, man, it's wrong...

premiso · February 11, 2009

Quote from redarrow: "It's your preg_match, man, it's wrong..."

Come on now, no dissing on me.

It is not wrong, here it is so you can see for yourself:

http://www.emocium.com/test/test.php

Tested and working on PHP 4 and PHP 5.

redarrow · February 11, 2009

Not dissing anyone, you're great.

I am sorry.

Why is my server not working then? I've never had this problem before.

Any ideas?

premiso · February 11, 2009

No clue. My bet is that file_get_contents is not working on your end, for whatever reason. Maybe you are blocked from viewing that site?

There are a lot of scenarios that would make that not work. Try using curl to retrieve the web page data and see if that makes it work. If you are trying this on a shared host, chances are they have allow_url_fopen turned off, which would make file_get_contents fail for remote URLs.
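A minimal sketch of the curl route suggested here, assuming the curl extension is installed. The function names remote_fopen_enabled() and fetch_page() are made up for this example; allow_url_fopen is the php.ini setting that controls whether file_get_contents may open remote URLs.

```php
<?php
// Hypothetical helper: true when file_get_contents() may open remote URLs.
function remote_fopen_enabled()
{
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

// Hypothetical helper: fetch a page with curl, returning the HTML or false.
function fetch_page($url)
{
    if (!function_exists('curl_init')) {
        return false;                               // curl extension not available
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
    return curl_exec($ch);                          // string on success, false on failure
}

// Report the setting; no network request is made until fetch_page() is called.
echo 'allow_url_fopen is ', remote_fopen_enabled() ? 'on' : 'off', "\n";
```

To try it, replace the file_get_contents line in the script above with $content = fetch_page('http://www.bbc.co.uk/'); and keep the rest unchanged.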

redarrow · February 11, 2009

I've got that set to on. I am trying from home. Very strange stuff.

redarrow · February 11, 2009

file_get_contents works, I get the page. It's the preg_match: the last array match is not working.
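One way to narrow this down is to check what preg_match returns before using $matches: 1 means the pattern matched, 0 means it did not, and false means the regex itself failed. The sketch below is only an assumption about how one might debug this, run on two made-up strings; describe_match() is a hypothetical helper.

```php
<?php
// Hypothetical helper: runs premiso's pattern and reports what happened.
function describe_match($content)
{
    $result = preg_match('~"hpFeatureBoxInt">(.+?)</div>~s', $content, $matches);
    if ($result === false) {
        return 'regex error';
    }
    if ($result === 0) {
        return 'no match';   // the marker is not in the HTML that was fetched
    }
    return 'matched ' . strlen($matches[1]) . ' bytes';
}

// Made-up inputs: one with the marker, one without it.
$with    = '<div id="hpFeatureBoxInt"><h3>Title</h3><p>Body</p></div>';
$without = '<div id="somethingElse"><h3>Title</h3><p>Body</p></div>';

echo describe_match($with), "\n";
echo describe_match($without), "\n";
```

If the real page gives "no match", the HTML your server receives probably differs from what premiso saw, so it is worth saving $content to a file and searching it for hpFeatureBoxInt.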

redarrow · February 11, 2009

Can this affect file_get_contents?

max_execution_time = 900 ; Maximum execution time of each script, in seconds

max_input_time = 60 ; Maximum amount of time each script may spend parsing request data

;max_input_nesting_level = 64 ; Maximum input variable nesting level

memory_limit = 128M ; Maximum amount of memory a script may consume (128MB)

etusha · February 11, 2009

premiso, thanks for your help.
