
[SOLVED] searching


-entropyman


Hello,

 

I need someone's help again. I'm attempting to write a script that searches a .htm file for a specific number. I want to find the line that $number is on and return only that line. Here's what I've got:

 

$image = 'http://xx.x.xx.xx/######.png';
$server = 'http://xx.x.xx.xx/xxxxxxx.htm';
$number = basename($image, ".png");
$ns = file_get_contents($server);
if (preg_match ($number, $ns))
{
	echo "match was found";
}
else
{
	echo "failure!";
}

 

now i'm pretty sure my preg match line is wrong but i'm not sure how to fix it.

 

after i fix the preg match i'm getting rid of that if but i'm not sure how to return the line it's on.

any ideas?

 

and the htm page is setup like this

 

00001 testtestest test testwe
00002 testtestest test testwew
00003 testtestest test testrrrrrrr
00004 testtestest test testeee
00005 testtestest test testrrrrrr
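
For reference, a minimal sketch of how the preg_match call could be repaired so that it also returns the matching line. The pattern needs delimiters, the number should go through preg_quote(), and the m modifier lets ^ and $ anchor at line boundaries. The function name find_line() and the inline sample text are made up for illustration; a real script would pass in the result of file_get_contents($server).

```php
<?php
// Hypothetical helper: returns the first line of $text containing $number, or null.
function find_line($text, $number)
{
    // preg_quote() escapes any regex metacharacters in the number;
    // the "m" modifier makes ^ and $ match at the start/end of each line.
    $pattern = '/^.*' . preg_quote($number, '/') . '.*$/m';
    if (preg_match($pattern, $text, $m)) {
        // rtrim() drops a trailing "\r" if the page uses Windows line endings
        return rtrim($m[0], "\r");
    }
    return null;
}

// Sample data copied from the page layout shown above
$ns = "00001 testtestest test testwe\n"
    . "00002 testtestest test testwew\n"
    . "00003 testtestest test testrrrrrrr\n";

$line = find_line($ns, '00003');
echo $line === null ? "failure!" : $line;
```

Because the dot does not match a newline by default, the match stops at the end of the line the number is on.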

 

any ideas?

 

???

Link to comment
Share on other sites

The way I'd do it would be:

<?php
$image = 'http://xx.x.xx.xx/######.png';
$server = 'http://xx.x.xx.xx/xxxxxxx.htm';
$number = basename($image, ".png");
$content = file_get_contents($server);
// split the page into lines
$data = explode("\n", $content);
$matched = array();
foreach ($data as $key => $value) {
    // case-insensitive substring check; remember the index of each matching line
    if (stristr($value, $number)) {
        $matched[] = $key;
    }
}
print_r($matched);
?>

 

A regular expression could be more useful if you knew the number was inside an <img> tag, for example.
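
As a rough sketch of that <img> idea: the pattern below looks for an <img> tag whose src attribute ends in the number followed by .png. The function name find_img_tag(), the example.test URL and the sample markup are all assumptions made for the illustration, and a regex like this only copes with simple, well-formed tags.

```php
<?php
// Hypothetical helper: returns the first <img ...> tag whose src ends in "<number>.png", or null.
function find_img_tag($html, $number)
{
    $pattern = '/<img\b[^>]*\bsrc=["\'][^"\']*'
             . preg_quote($number, '/')
             . '\.png["\'][^>]*>/i';
    return preg_match($pattern, $html, $m) ? $m[0] : null;
}

// Made-up sample markup; example.test is a placeholder host
$html = '<p>00003 <img src="http://example.test/00003.png" alt="x"> test</p>';
$tag = find_img_tag($html, '00003');
echo $tag === null ? "no tag found" : htmlspecialchars($tag);
```

For anything beyond simple pages, a real HTML parser such as DOMDocument is usually the safer route.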

Link to comment
Share on other sites

mine is more lengthy, but here it is:

<?php
$image = 'http://xx.x.xx.xx/######.png';
$server = 'http://xx.x.xx.xx/xxxxxxx.htm';
$number = basename($image, ".png");
$ns = file_get_contents($server);
$data_array = explode("\n", $ns);
$result_array = array();
$count = 0;
foreach ($data_array as $value){
	if (strstr($value, $number)){
		// append the whole matching line
		$result_array[] = $value;
		$count++;
	}
}
// == compares; a single = would assign 0 and always fall through to the else branch
if ($count == 0){
	print "failed to find number";
}
else {
	print "Found $count results. Results as follows:\n<br />";
	foreach ($result_array as $value){
		print $value."\n<br />";
	}
}

Link to comment
Share on other sites

cooldude, you explode everything into a giant array, run a loop over each "word" and check whether it matches $number. If it does, you save the array key from that giant array, so your code will just return (in an array) something like "0 8 12 ..."

 

jonsjava, you also explode everything into a giant array, run a loop over each "word" and check whether it matches $number. If it does, you save the array value at that position, so your code will just return (in an array) something like "00003 00003 00003 00003"

 

-entropy:  Assuming that the point of finding the line is to get the data on the line and separate it for use, here's my take:

 

<?php
  $image = 'http://xx.x.xx.xx/######.png';
  $server = 'http://xx.x.xx.xx/xxxxxxx.htm';
  $number = basename($image, ".png");

  $list = file($server);
  $row = array();
  foreach ($list as $key => $val) {
     if (stristr($val, $number)) {
        // accommodates if there's more than 1 row
        // (splitting on spaces is an assumption based on the sample layout)
        $row[] = explode(' ', trim($val));
    } // end if $number in $val
  } // end foreach $list

 

edit: Oops, my bad, both of you. For some reason I read \n as a space  :o  Anyway, you can skip the explode step by using file() instead of file_get_contents().

Link to comment
Share on other sites

Using file()

and using

$var = file_get_contents($server);
$data = explode("\n", $var);

 

produces the same array, so I don't see your method saving resources.
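
One small caveat worth checking for yourself: the two approaches give almost the same array, but not an identical one. By default file() keeps the newline at the end of each element, and explode() leaves an empty last element when the text ends with a newline. The sketch below compares them on a throwaway temp file (the file contents are made up for the illustration).

```php
<?php
// Write a tiny sample file so the comparison runs without a remote server
$tmp = tempnam(sys_get_temp_dir(), 'cmp');
file_put_contents($tmp, "00001 aaa\n00002 bbb\n");

$viaFile      = file($tmp);                              // each element keeps its "\n"
$viaExplode   = explode("\n", file_get_contents($tmp));  // no "\n", plus a trailing empty string
$viaFileClean = file($tmp, FILE_IGNORE_NEW_LINES);       // newlines stripped, no empty element

unlink($tmp);

var_dump($viaFile === $viaExplode);
print_r($viaFile);
print_r($viaExplode);
print_r($viaFileClean);
```

For a substring search with stristr() the difference does not matter, but it shows up as soon as the line is echoed, compared, or split further.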

 

Our method is actually better, assuming you find a match, because you then have the data broken into lines, so you can directly say:

 

foreach ($matched as $value) {
    echo $data[$value];
}

 

The point is to find the matches. What to do with the output is up to the end user; we just point out how to do it simply (most people asking a question know how to format output the way they like).

Link to comment
Share on other sites

I see file() as over-treating the variable, because if the file I'm working with is of reasonable size and I want to re-apply conditions to the original file (say, strip_tags), I can apply them to the unexploded contents and then do additional treatment as needed.

 

I've done a lot of reverse engineering of formatted output back into databases, and in my experience, using file_get_contents() combined with preconditioning and then exploding into lines (or on <td> tags, which I find very common) gives output that is easier to work with.

 

 

I can also recall the entire file from a variable if it needs to be inserted.

 

 

However, if I'm working on a grossly oversized file, then the file() method is better for saving resources. That said, a grossly oversized .htm file is hard to find, because in reality it's a glorified text file.
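
If memory really were a concern, one option not mentioned so far in the thread is to read the file one line at a time with fgets() and stop at the first match, so the whole page is never held in memory at once. This is only a sketch: find_first_line() is a made-up name, and the temp file stands in for the real page (opening a URL with fopen() also depends on allow_url_fopen being enabled).

```php
<?php
// Hypothetical helper: streams $path line by line and returns the first line containing $number, or null.
function find_first_line($path, $number)
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    $found = null;
    while (($line = fgets($handle)) !== false) {
        if (strpos($line, $number) !== false) {
            $found = rtrim($line, "\r\n"); // drop the line ending
            break;                          // stop reading after the first match
        }
    }
    fclose($handle);
    return $found;
}

// Throwaway sample file with made-up contents
$tmp = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($tmp, "00001 aaa\n00002 bbb\n00003 ccc\n");

$row = find_first_line($tmp, '00002');
$missing = find_first_line($tmp, '99999');
unlink($tmp);

echo $row === null ? "failed to find number" : $row;
```

For a page of a few kilobytes the saving is negligible, which fits the point made above about .htm files rarely being large.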

Link to comment
Share on other sites

True, your method does preserve the file as a whole in case you need it; I'll give you that one. Though I can't really think of a situation off the top of my head where you'd actually need both. I mean, parsing one thing might be easier by the line, while parsing another might be better done as a whole, but you can accomplish any task from just one or the other. So in the end, having 2 copies of the same information loaded in memory is always gonna be harder on the computer than just executing another line of code.

Link to comment
Share on other sites

so in the end, having 2 copies of the same information loaded in memory is always gonna be harder on the computer than just executing another line of code. 

 

You're talking aggregate on modern servers running 4-64 gigs of RAM, if the file is under 100 KB, which 99% of HTML files are.

 

If you don't believe me about 64 gig of ram on a server

(http://www.newegg.com/Product/Product.aspx?Item=N82E16813151008)

 

 

And yes, a 64-bit architecture can address all of it: 2^64 = 1.84467441 × 10^19 bytes, which is about 1.84 × 10^10 gigs of RAM (more than anyone needs) :)

Link to comment
Share on other sites

As I reread this code I realize I forgot to tell you something very important  :(.  The reason I broke up the URL to get the number is that the number won't be repeated in this .htm file, so only one result will ever be returned. I'm also going to take the returned line and manipulate it further, so I need to store the line in a variable? I'm not sure how to fix your code to do that. I'm still very new ^^

 

thank you again  :)  

Link to comment
Share on other sites

modifying my example a bit

 

<?php
$image = 'http://xx.x.xx.xx/######.png';
$server = 'http://xx.x.xx.xx/xxxxxxx.htm';
$number = basename($image, ".png");
$content = file_get_contents($server);
$data = explode("\n", $content);
$row = null; // stays null if the number is not found
foreach ($data as $key => $value) {
    if (stristr($value, $number)) {
        $matched = $key;
        # stop at the first match
        break;
    }
}
# the row is now
if (isset($matched)) {
    $row = $data[$matched];
}
?>

Link to comment
Share on other sites
