etrader

March 10, 2011

My issue is about renaming. How can I rename the files in a folder from a PHP script?

March 9, 2011

I have a folder containing files. I want to replace some characters in the file names (e.g. "_" with "-"), then build a list of the renamed files.

Thanks :shy:
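A minimal sketch of one way to do this, assuming the files sit directly in one folder (no recursion) and that an existing file with the target name should not be overwritten. The function name rename_files_in_dir is made up for illustration.

```php
<?php
// Rename every file in $dir whose name contains $search, replacing it with $replace.
// Returns the new names of the files that were actually renamed.
function rename_files_in_dir(string $dir, string $search, string $replace): array
{
    $renamed = [];
    foreach (scandir($dir) as $name) {
        $oldPath = $dir . DIRECTORY_SEPARATOR . $name;
        // Skip ".", "..", subdirectories, and names that need no change.
        if (!is_file($oldPath) || strpos($name, $search) === false) {
            continue;
        }
        $newName = str_replace($search, $replace, $name);
        $newPath = $dir . DIRECTORY_SEPARATOR . $newName;
        // Assumption: never overwrite a file that already has the target name.
        if (!file_exists($newPath) && rename($oldPath, $newPath)) {
            $renamed[] = $newName;
        }
    }
    return $renamed;
}
```

Calling rename_files_in_dir('/path/to/folder', '_', '-') would return the list of new names, which can then be echoed in a foreach.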

March 2, 2011

I have a list generated from an array with foreach:

foreach ($list_array as $item) {
    echo "$item<br />";
}

This is a long list. How can I split it across several pages? I am looking for the simplest way to do so.
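One simple approach is to slice the array per page with array_slice. The sketch below assumes a 1-based page number; the function name paginate and the default of 20 items per page are made up for illustration.

```php
<?php
// Return the items for page $page (1-based) together with the paging information.
function paginate(array $items, int $page, int $perPage = 20): array
{
    $perPage = max(1, $perPage);
    $totalPages = max(1, (int) ceil(count($items) / $perPage));
    // Clamp the requested page into the valid range.
    $page = min(max(1, $page), $totalPages);
    return [
        'items'      => array_slice($items, ($page - 1) * $perPage, $perPage),
        'page'       => $page,
        'totalPages' => $totalPages,
    ];
}
```

The page number would typically come from the query string (for example ?page=2, cast to int), and the loop from the post would then run over paginate($list_array, $page)['items'] instead of the whole array.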

February 20, 2011

A linked image in HTML has the form

<a href="http://site.com/link.html"><img class="not important" src='imagelink.jpg' title="not important" alt="not important"/></a>

How can I collect each href and src into an array in a foreach?
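A rough regex sketch, assuming the attribute values are quoted and the img tag follows the opening a tag directly, as in the sample above. Regular expressions are brittle on real-world HTML, so a DOM parser is usually the sturdier choice. The function name linked_images_regex is made up for illustration.

```php
<?php
// Collect href/src pairs for images wrapped directly in a link.
// Assumes quoted attribute values and an <img> right after the opening <a>.
function linked_images_regex(string $html): array
{
    $pattern = '~<a\s[^>]*href=(["\'])(.*?)\1[^>]*>\s*<img\s[^>]*src=(["\'])(.*?)\3[^>]*>~is';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);
    $pairs = [];
    foreach ($matches as $m) {
        // Groups 2 and 4 hold the attribute values; 1 and 3 hold the quote characters.
        $pairs[] = ['href' => $m[2], 'src' => $m[4]];
    }
    return $pairs;
}
```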

February 20, 2011

Nice idea, but I got an error

PHP Parse error:  syntax error, unexpected '(' in the preg_match_all line

February 20, 2011

It is easy to get an image or a link with DomDocument, but I did not find a way to get an image together with its target link. Imagine HTML such as

<div class=image>
<a href='http://site.com'><img src='imagelink.jpg'></a>
</div>

How can I get both the image src and the link href?

$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//div[@class='image']");
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i); // each $href is a <div class="image"> element
}

Now, to get the image and its href, we first need getElementsByTagName('a') and getElementsByTagName('img'), but they do not work inside the loop.

What's your idea?
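One possible approach is to let XPath select the a elements that contain an img, then read both attributes from each match. This is a sketch under the assumption that the image is a direct child of the link, as in the sample markup; the function name linked_images_dom is made up for illustration.

```php
<?php
// For every <a> inside <div class="image"> that wraps an <img>,
// return the link href and the image src.
function linked_images_dom(string $html): array
{
    $dom = new DOMDocument();
    // Suppress warnings from imperfect real-world markup.
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $pairs = [];
    foreach ($xpath->query("//div[@class='image']//a[img]") as $a) {
        // getElementsByTagName is available on each element node, not only on the document.
        $img = $a->getElementsByTagName('img')->item(0);
        $pairs[] = [
            'href' => $a->getAttribute('href'),
            'src'  => $img->getAttribute('src'),
        ];
    }
    return $pairs;
}
```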

February 19, 2011

Thanks for your informative reply. One more question: does the order have any influence on cURL processing? I mean changing the order of the lines in

curl_setopt($ch, CURLOPT_USERAGENT, 'Googlebot/2.1 (http://www.googlebot.com/bot.html)');
curl_setopt($ch, CURLOPT_URL,"http://www.site.com");
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/bot.html'); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
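As far as I know, these options are only stored on the handle and nothing is sent until curl_exec() runs, so for this particular set the order should not matter. A few options are known to interact with each other (CURLOPT_FILE and CURLOPT_RETURNTRANSFER, for example), so treat this as a sketch rather than a general rule. The function name make_handle is made up for illustration.

```php
<?php
// Build a handle with the options from the post; nothing is sent until curl_exec() is called.
function make_handle(string $url)
{
    $ch = curl_init();
    // curl_setopt_array applies all options in one call.
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => 'Googlebot/2.1 (http://www.googlebot.com/bot.html)',
        CURLOPT_REFERER        => 'http://www.google.com/bot.html',
        CURLOPT_FAILONERROR    => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
    ]);
    return $ch;
}
```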

February 18, 2011

When making a fake identity in cURL, we use

curl_setopt($ch, CURLOPT_USERAGENT, 'Googlebot/2.1 (http://www.googlebot.com/bot.html)');

but which is better for the referrer:

curl_setopt($ch, CURLOPT_AUTOREFERER, true);

or

curl_setopt ($ch, CURLOPT_REFERER, "http://www.google.com/bot.html");

:shy:

February 17, 2011

Imagine an HTML document with the following structure

<div class="item"> 
    <div class="title"> 
        <a class="title" href="http://www.domain.com/title.html">Title is here</a>   
    </div> 
    <div class="image"> 
        <a href="http://www.domain.com/title.html"><img src=image.jpg /></a>
    </div>
</div>

How can I build an array containing $title, $url and $image_url for each item?
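A minimal sketch with DOMDocument and DOMXPath, assuming every item follows the structure shown above (a div with class "item" that holds a "title" div and an "image" div). The function name extract_items and the array keys are made up for illustration.

```php
<?php
// Build one [title, url, image] entry per <div class="item">.
function extract_items(string $html): array
{
    $dom = new DOMDocument();
    // Suppress warnings from imperfect real-world markup.
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $items = [];
    foreach ($xpath->query("//div[@class='item']") as $item) {
        // Queries starting with "." search only inside the current item.
        $link = $xpath->query(".//div[@class='title']/a", $item)->item(0);
        $img  = $xpath->query(".//div[@class='image']//img", $item)->item(0);
        if ($link === null || $img === null) {
            continue; // skip incomplete items
        }
        $items[] = [
            'title' => trim($link->textContent),
            'url'   => $link->getAttribute('href'),
            'image' => $img->getAttribute('src'),
        ];
    }
    return $items;
}
```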

February 17, 2011

download the file first?

(untested)

$cachepage = "cache/pagename.html";
$external = "http://google.com";
if(!file_exists($cachepage)) {
	$ch = curl_init($external);
	$fp = fopen($cachepage,'w');
	curl_setopt($ch, CURLOPT_HEADER, false);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
	curl_setopt($ch, CURLOPT_FILE, $fp);
	curl_exec($ch);
	curl_close($ch);
	fclose($fp);
}
$html = file_get_html($cachepage);
unset($cachepage);

The only problem I have is that it writes pagename.html with permissions 644, and then unset() neither deletes the file nor lets it be rewritten on the next run.
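Note that unset() only destroys the PHP variable that holds the file name; removing the file itself is the job of unlink(). A small sketch of writing and deleting a cache file follows. The function names cache_write and cache_delete are made up for illustration, and whether the script may overwrite or delete the file still depends on the permissions of the user the web server runs as.

```php
<?php
// Write $content to $cachePath, creating the file or overwriting an existing one.
function cache_write(string $cachePath, string $content): bool
{
    // file_put_contents truncates an existing file before writing.
    return file_put_contents($cachePath, $content, LOCK_EX) !== false;
}

// Delete the cache file from disk; unset() would only discard the variable.
function cache_delete(string $cachePath): bool
{
    return file_exists($cachePath) ? unlink($cachePath) : true;
}
```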

February 16, 2011

The interesting point about simple_html_dom is the ability to capture the links from a webpage like this:

foreach ($html->find('a') as $element) {
    $string = $element->href;
}

Is it possible to do so without simple_html_dom?
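The built-in DOM extension can do much the same. A minimal sketch, assuming the page source is already in a string; the function name extract_links is made up for illustration.

```php
<?php
// Return the href of every <a> element in the given HTML string.
function extract_links(string $html): array
{
    $dom = new DOMDocument();
    // Suppress warnings from imperfect real-world markup.
    @$dom->loadHTML($html);
    $links = [];
    foreach ($dom->getElementsByTagName('a') as $element) {
        // Anchors without an href (named anchors) are skipped.
        if ($element->hasAttribute('href')) {
            $links[] = $element->getAttribute('href');
        }
    }
    return $links;
}
```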

February 16, 2011

Just throwing it out there, but maybe using one of the half dozen built-in DOM libraries in PHP instead of one thrown together with a bunch of regular expressions and recursive calls would improve performance?

What do you mean by that? I did not get it ::)

February 16, 2011

Will this write "http://google.com" to pagename.html in the cache directory?

Do I have to create this file beforehand so that it can be written?

February 16, 2011

I successfully load a page with simple_html_dom.php (from simplehtmldom.sourceforge.net) as

$html = file_get_html('externalpage');

But sometimes this puts a high load on the CPU and the page does not load for a long time (probably because of the external site's server). How can I skip the process when it takes abnormally long, to avoid high CPU usage?
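One option is to fetch the page yourself with a timeout and only parse it if the fetch succeeded. The sketch below uses a stream context; the function name fetch_with_timeout and the 5-second default are made up for illustration. The http timeout option limits how long a read may wait, so it is not a strict cap on the total time, and it does nothing about CPU time spent parsing a very large page.

```php
<?php
// Fetch a URL (or local path) and give up when the server stalls.
// Returns null on failure or timeout instead of raising a warning.
function fetch_with_timeout(string $url, float $seconds = 5.0): ?string
{
    $context = stream_context_create([
        'http' => ['timeout' => $seconds],
    ]);
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}
```

If the result is not null, it can be handed to str_get_html() from simple_html_dom instead of calling file_get_html() on the URL directly.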

February 10, 2011

Perfect solution! Thanks!

February 10, 2011

I have a foreach loop:

foreach ($xml->channel->item as $value) {
    $title = $value->title;
    $text = $value->text;
    echo "$title <br /> $text";
}

I want to skip any entry whose title contains a character like ":".
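A minimal sketch using strpos and continue, assuming the same channel/item structure as in the loop above. The function name items_without_char is made up for illustration.

```php
<?php
// Return the items whose title does not contain $needle.
function items_without_char(SimpleXMLElement $xml, string $needle): array
{
    $kept = [];
    foreach ($xml->channel->item as $value) {
        $title = (string) $value->title;
        // strpos returns false only when $needle is absent; position 0 is a valid match.
        if (strpos($title, $needle) !== false) {
            continue; // skip this entry
        }
        $kept[] = ['title' => $title, 'text' => (string) $value->text];
    }
    return $kept;
}
```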

February 10, 2011

I have two XML files, loaded as

$xml1 = simplexml_load_file("1.xml");
$xml2 = simplexml_load_file("2.xml");

and I want to merge them into one array to use in a foreach loop.
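One way is to collect the items of both documents into a plain PHP array. The sketch assumes both files use a channel/item layout, which the post does not actually state; the function name merge_items is made up for illustration.

```php
<?php
// Collect the <item> elements of several feeds into one array.
// Assumption: every document has the shape <rss><channel><item>...</item></channel></rss>.
function merge_items(SimpleXMLElement ...$feeds): array
{
    $items = [];
    foreach ($feeds as $xml) {
        foreach ($xml->channel->item as $item) {
            $items[] = $item;
        }
    }
    return $items;
}
```

With the variables from the post this would be called as merge_items($xml1, $xml2), and the result can be looped over with foreach like any other array.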

February 10, 2011

I have several similar arrays. How can I combine them and shuffle the resulting array?
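A minimal sketch with array_merge and shuffle, assuming the arrays are plain lists (with string keys, array_merge lets later values overwrite earlier ones). The function name merge_and_shuffle is made up for illustration.

```php
<?php
// Combine any number of list-style arrays and return them in random order.
function merge_and_shuffle(array ...$arrays): array
{
    $merged = array_merge(...$arrays);
    // shuffle works in place and re-indexes the array from 0.
    shuffle($merged);
    return $merged;
}
```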

February 10, 2011

Sorry for that. I thought that RFC 822 could have a four-digit year.

I think DATE_RFC2822 or DATE_RFC1123 suits my needs.

February 10, 2011

I would prefer a four-digit year, as in Mon, 15 Aug 2005 15:52:01 +0000

February 10, 2011

Yes! When I run the above-mentioned code, it produces Mon, 15 Aug 05 15:52:01 +0000

February 10, 2011

Sounds good, but it shows a two-digit year instead of a four-digit one: 11 instead of 2011. ::)

February 9, 2011

How can I convert a date from ISO-8601 format (example: 2005-08-15T15:52:01+0000) to RFC 822 (example: Mon, 15 Aug 05 15:52:01 +0000)? :shy:
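A minimal sketch with DateTime, assuming the input always looks like the example (a numeric offset without a colon). DATE_RFC822 formats the year with two digits, while DATE_RFC2822 and DATE_RFC1123 use four. The function name iso8601_to_rfc is made up for illustration.

```php
<?php
// Convert an ISO-8601 timestamp such as 2005-08-15T15:52:01+0000 to an RFC-style date.
// Returns null when the input does not match the expected format.
function iso8601_to_rfc(string $iso, string $format = DATE_RFC2822): ?string
{
    $date = DateTime::createFromFormat(DateTime::ISO8601, $iso);
    return $date === false ? null : $date->format($format);
}

echo iso8601_to_rfc('2005-08-15T15:52:01+0000', DATE_RFC822), "\n";  // Mon, 15 Aug 05 15:52:01 +0000
echo iso8601_to_rfc('2005-08-15T15:52:01+0000', DATE_RFC2822), "\n"; // Mon, 15 Aug 2005 15:52:01 +0000
```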

February 9, 2011

Actually, I loved option 1. It was brilliant. Thanks a million!

February 9, 2011

It is a variable string. It is not a static text. It comes from a dynamic array.


Posts posted by etrader

Renaming files in a folder and make a list

Renaming files in a folder and make a list

Paging a list (divide to separate pages)

Regex to get src together with href

Getting image with its link by DomDocument

Getting image with its link by DomDocument

fake referrer in cURL

fake referrer in cURL

How to parse html by DomDocument?

High CPU load on simple_html_dom

High CPU load on simple_html_dom

High CPU load on simple_html_dom

High CPU load on simple_html_dom

High CPU load on simple_html_dom

skipping an entry in foreach

skipping an entry in foreach

How to mix two arrays?

How to mix two arrays?

Changing date to RFC822 format

Changing date to RFC822 format

Changing date to RFC822 format

Changing date to RFC822 format

Changing date to RFC822 format

Deleting urls from a string

Deleting urls from a string
