
[SOLVED] Probably really simple question


LemonInflux


#\<div>(.*?)</div>#

 

That is meant to find div tags, and return the insides. However, it doesn't.

 

I'm trying to create something so that when users search for images, the code goes to Google Images, takes the div tag (all the images are stored in a div tag with no name), and runs preg_match on it so I can echo the results. However, I get an empty array. Any ideas?

Link to comment
Share on other sites

post your sample input data, and exact code you use to parse through it

 

however this works fine for me:

<?php
$str = '<div>hello</div>  jkljdsfkljsdff <div>whatever></div> somethign else later <div> lalalal!</div>';
$pat = '~<div>(.*?)</div>~';
preg_match_all($pat, $str, $out);
print_r($out);

?>

 

 

you can add the s modifier if the match may span multiple lines (it makes . match newlines as well):

$pat = '~<div>(.*?)</div>~s';
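
To make the effect of the s modifier concrete, here is a small sketch. The input string is made up for illustration; only the pattern comes from the posts above.

```php
<?php
// Hypothetical input: the second div's content spans two lines.
$str = "<div>one</div>\n<div>two\nlines</div>";

// Without the s modifier, . does not match "\n", so the second div is skipped.
preg_match_all('~<div>(.*?)</div>~', $str, $without);

// With the s modifier, . also matches "\n", so both divs are captured.
preg_match_all('~<div>(.*?)</div>~s', $str, $with);

print_r($without[1]); // one element: "one"
print_r($with[1]);    // two elements: "one" and "two\nlines"
```

Real pages almost always have line breaks inside their tags, so the s version is usually the one you want when scraping.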

Link to comment
Share on other sites

<?php

if(isset($_POST['search']) && $_POST['search'] != '')
{
$str = htmlspecialchars($_POST['str']);

/** Here I have the bit to ready it for the URL bar **/

$subject = "http://images.google.co.uk/images?q=". $str;

$pattern = '~<div>(.*?)</div>~s';

preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);

echo $matches[0];

}

echo '<form action="" method="post">
<p><input type="text" name="search" /></p>
<p><input type="submit" name="Submit" value="Search!"></p>
</form>';

?>

 

That's the full relevant code.

Link to comment
Share on other sites

the subject (haystack) you are searching in is this string:

$subject = "http://images.google.co.uk/images?q=". $str;

 

I suspect $str holds a queryString for a search term

 

you want to search through the contents of that page, not the page url string!

 

try this:

 

$subject = file_get_contents("http://images.google.co.uk/images?q=". $str);

 

when debugging your code, try outputting your input (for example echo $subject;) and checking that it is what you expect; next time you can catch something like this on your own
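
A rough sketch of the difference, runnable without a network connection. The helper name first_div() and the sample page are assumptions for illustration; $page only stands in for whatever file_get_contents() would return and is not real Google output.

```php
<?php
// Hypothetical helper: inner text of the first <div>...</div>, or null if none.
function first_div($haystack)
{
    if (preg_match('~<div>(.*?)</div>~s', $haystack, $m)) {
        return $m[1];
    }
    return null;
}

$url = 'http://images.google.co.uk/images?q=bob';

// Stand-in for file_get_contents($url); made-up markup.
$page = '<html><body><div>result markup</div></body></html>';

var_dump(first_div($url));  // NULL: the URL string itself contains no <div>
var_dump(first_div($page)); // "result markup"
```

Dumping the haystack before matching would have shown straight away that it was a one-line URL rather than a page of HTML.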

 

 

*edit I also noticed you did this:

$str = htmlspecialchars($_POST['str']);

 

if you are trying to mimic a queryString that google sends in the URL I would do:

$str = urlencode($_POST['str']);
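
To show why the choice of function matters, here is a short comparison. The search term is a made-up example; the two functions are the ones named above.

```php
<?php
$term = 'cats & dogs'; // hypothetical search term

// htmlspecialchars() escapes text for HTML output, not for URLs.
$forHtml = htmlspecialchars($term); // cats &amp; dogs

// urlencode() escapes text for use as a query string value.
$forUrl = urlencode($term);         // cats+%26+dogs

// Applying both sends the HTML entity itself as part of the search term.
$both = urlencode(htmlspecialchars($term)); // cats+%26amp%3B+dogs

echo 'http://images.google.co.uk/images?q=' . $forUrl, "\n";
```

With htmlspecialchars() alone, the raw & left in the URL would start a new parameter and cut the search term short.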

Link to comment
Share on other sites

<?php

if(isset($_POST['search']) && $_POST['search'] != '')
{
$str = htmlspecialchars($_POST['search']);
$str = urlencode($str);

$subject = file_get_contents("http://images.google.co.uk/images?q=". $str);

$pattern = '~<div>(.*?)</div>~s';

preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);

echo $matches[0];

}

echo '<form action="" method="post">
<p><input type="text" name="search" /></p>
<p><input type="submit" name="Submit" value="Search!"></p>
</form>';

?>

 

A search of 'bob' outputs:

 

Array ( [0] => Array ( [0] =>

 

[1] => 8066 ) [1] => Array ( [0] =>

[1] => 8071 ) )

Link to comment
Share on other sites

The array source, if anyone's bothered:

 

Array
(
    [0] => Array
        (
            [0] => <div><br><div id=ImgContent></div>
            [1] => 8076
        )

    [1] => Array
        (
            [0] => <br><div id=ImgContent>
            [1] => 8081
        )

)
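
For anyone puzzled by the output above, a minimal sketch of what seems to be happening. The pattern only requires the opening tag to be a bare <div>; the lazy .*? then stops at the first </div> it meets, which here closes the nested div with the id. A regex does not track nesting. The sample markup is modelled on the match shown above; everything around it is made up.

```php
<?php
// Made-up page fragment containing the markup from the match above.
$html = '<p>x</p><div><br><div id=ImgContent></div></div>';

preg_match('~<div>(.*?)</div>~s', $html, $m, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE each entry is array(matched text, byte offset),
// so use print_r() rather than echo, and escape the result so the browser
// shows the tags instead of rendering them as (invisible) HTML.
echo htmlspecialchars(print_r($m, true));
```

That escaping step is also why the earlier browser output looked empty: the matched tags were rendered rather than displayed.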

Link to comment
Share on other sites

Dsaba, your idea of removing the htmlspecialchars didn't work (so all your profanity and abuse was for absolute nothing :/). I'm not sure why it's returning id's in the div tags, I need the tags with no id.

What profanity and abuse are you talking about?

I thought you were trying to mimic urlencode on queryStrings in a way with htmlspecialchars, I suggested the best way to mimic url encode on querystrings is of course to use the urlencode() function. Something not working? I have no idea what you mean. It was a suggestion, if you don't want to do it, don't. If you're offended by people's attempts to help you or offer suggestions, then simply do not press that "new thread button." Because suggestions whether you agree with them or not are exactly what you are going to get.

 

If you are going through personal issues with yourself or me, I suggest you be mature and contact me directly about your issue, instead of slandering me on public forums because that is profane.

Link to comment
Share on other sites

you know you're an ass

 

I'm not sure what I did to deserve that. I'm not trying to do anything to you, I'm trying to understand what your problem is so I can try and help :/ As well as this, I tried to contact you; you blocked me. It's a bit difficult if I have no way of even talking to you.

 

 

 

Anyways, back on topic. Any ideas?

Link to comment
Share on other sites

I tried to contact you; you blocked me.

 

You seem to know how to use a forum. Don't you know about PMs?

You have a very bad attitude. Why? Because you purposely provoke and try to start confrontation when there is none. It's splendid how one can choose to propagandize information and deceive others by not telling the whole truth. If you have a problem with me lemoninflux, so be it. What does this have to do with this thread and your php problem? Grow up. I'm done discussing this and lowering to your level.

 

Anyways, back on topic. Any ideas?

You are responsible for this thread going off topic, by provoking others.

Link to comment
Share on other sites

Well uh... anyway, here's two ways to do it.

 

The first way pulls all the image srcs from the results page (the images on their original pages, not the google thumbnails).

The second way pulls the div containing the table of all of the images.  Please note that the two ways fetch different page sources: the second request adds gbv=1 to the URL, which tells Google that JavaScript is not available.

 

Oh, and I pushed the output into files since I developed this in the CLI, and I didn't feel like looking through raw HTML to see if I had done it correctly.

 

<?php

// Way 1: collect the original image URLs from the dyn.Img(...) calls in the page.
$p = '/dyn\.Img((.*?));/';

$content = file_get_contents('http://images.google.com/images?hl=en&q=dog');

if($content !== false && preg_match_all($p, $content, $matches)) {
	$urls = array();
	foreach($matches[0] as $match) {
		$e = explode('","', $match);
		// The fourth field is used as the image URL; skip calls with fewer fields.
		if(isset($e[3])) {
			$urls[] = $e[3];
		}
	}
	$h = fopen('out.html', 'w');
	foreach($urls as $url) {
		fwrite($h, "<img src=\"{$url}\" />\r\n");
	}
	fclose($h);
}
else {
	echo 'Error with processing Google page!';
}

// Way 2: grab the first attribute-less <div> from the non-JavaScript page (gbv=1).
$p = '/<div>(.*?)<\/div>/s';

$content = file_get_contents('http://images.google.com/images?hl=en&q=dog&gbv=1');

if($content !== false && preg_match($p, $content, $match)) {
	$h = fopen('out2.html', 'w');
	fwrite($h, $match[1]);
	fclose($h);
}
else {
	echo 'Error with processing Google page!';
}

?>

Link to comment
Share on other sites
