
strpos does not behave like it should


MockY


I have a string like this:

$data = '<a href="">Arbitrary characters. Bla bla bla</a><p>123456789</p><p>$10,100</p><p>01/02/2010</p><p>01/02/2011</p>'; // single quotes so the inner "" does not end the string

 

What I am trying to accomplish is to extract the number between the first <p> tags as well as the date between the last <p> tags. I can extract the first number just fine by doing this:

 

$opening_p = strpos($data, '<p>');
$opening_p += 3;
$closing_p = strpos($data, '</p>');
$length = $closing_p - $opening_p; // The length of the number changes, so I need to know the current length
$first_number = substr($data, $opening_p, $length);

 

However, no matter what I do, the position of the fourth <p> comes out the same as the position of the first <p>.

In other words, this produces exactly the same position as the first <p>:

 

$opening_fourth_p = strpos($data, '<p>', 4);
$opening_fourth_p += 3;

 

So even with the offset in strpos, the position is exactly the same for both "extractions".

 

So to conclude, both of these return the same value:

$opening_p = strpos($data, '<p>');
$opening_fourth_p = strpos($data, '<p>', 4);

 

Is there something I'm missing?


UPDATE:

 

It seems like the offset does not work at all.

I tried this and got the same value both times:

$test_data = "testing<i<testing<hey";
strpos($test_data, '<'); // results in 7
strpos($test_data, '<', 2); // Also results in 7

 

Shouldn't this work?

The 3rd argument is an offset from the start of the string, not from the matches. So

$opening_fourth_p = strpos($data, '<p>', 4);

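starts searching at character offset 4 of $data, not at the fourth occurrence of <p>. The first <p> sits well past that point, so you get the same position back. Also, offsets are generally 0-indexed, so offset 4 is actually the fifth character.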
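
If you do want the Nth occurrence, one way is to feed each match position back in as the next offset. Here's a rough sketch of that idea; nth_strpos is a made-up helper name for illustration, not a built-in PHP function, and it assumes $n is at least 1:

```php
<?php
// Hypothetical helper (not part of PHP): returns the position of the
// $n-th occurrence of $needle in $haystack, or false if there are fewer.
function nth_strpos($haystack, $needle, $n)
{
    $pos = -1;
    for ($i = 0; $i < $n; $i++) {
        // Resume the search one character past the previous match.
        $pos = strpos($haystack, $needle, $pos + 1);
        if ($pos === false) {
            return false;
        }
    }
    return $pos;
}

$data = '<a href="">Arbitrary characters. Bla bla bla</a><p>123456789</p><p>$10,100</p><p>01/02/2010</p><p>01/02/2011</p>';

$start = nth_strpos($data, '<p>', 4) + 3;   // skip past the '<p>' itself
$end   = strpos($data, '</p>', $start);     // first '</p>' after that point
$last_date = substr($data, $start, $end - $start);

echo $last_date; // 01/02/2011
```

Note the offset is only used to say where the search resumes; the position that comes back is still counted from the start of the string.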

 

http://php.net/manual/en/function.strpos.php

 

 

Assuming the data format stays the same, here's a much easier way:

$data = "<a href=''>Arbitrary characters. Bla bla bla</a><p>123456789</p><p>$10,100</p><p>01/02/2010</p><p>01/02/2011</p>";
list(,$str_1,,,$str_2) = explode('<p>',$data);
$str_1 = substr($str_1,0,strlen($str_1)-4);
$str_2 = substr($str_2,0,strlen($str_2)-4);

 

What this does is split your string into an array at every '<p>', then assign the pieces you want to variables by using list(). Finally, it strips off the trailing '</p>' that is still attached to each piece.
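
To see why list() skips positions 0, 2 and 3, it can help to look at the array explode() hands back. This is only a sketch against the sample string from the question; the variable names $parts, $first and $last are mine, and the negative length passed to substr() is just a shorter way of cutting the last 4 characters:

```php
<?php
$data = '<a href="">Arbitrary characters. Bla bla bla</a><p>123456789</p><p>$10,100</p><p>01/02/2010</p><p>01/02/2011</p>';

$parts = explode('<p>', $data);
// $parts[0] = '<a href="">Arbitrary characters. Bla bla bla</a>'
// $parts[1] = '123456789</p>'
// $parts[2] = '$10,100</p>'
// $parts[3] = '01/02/2010</p>'
// $parts[4] = '01/02/2011</p>'

// A negative length tells substr() to stop that many characters from the end,
// which drops the trailing '</p>' (4 characters).
$first = substr($parts[1], 0, -4);
$last  = substr(end($parts), 0, -4);

echo $first, "\n"; // 123456789
echo $last, "\n";  // 01/02/2011
```

Using end() instead of a fixed index means the last piece is picked up even if the number of <p> blocks changes, though the first piece still relies on the format staying the same.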

 

http://php.net/manual/en/function.list.php

http://php.net/manual/en/function.explode.php

http://php.net/manual/en/function.substr.php

http://php.net/manual/en/function.strlen.php

 

 

Here's a similar approach to the last post:

<?php
$data = "<a href=''>Arbitrary characters. Bla bla bla</a><p>123456789</p><p>$10,100</p><p>01/02/2010</p><p>01/02/2011</p>";
$data = str_replace(array('<p>','</p>'),'|',$data); //just make sure you use a replacement character that's not in the data string already.
list(,$p1,,,,,,$p2) = explode('|',$data);
?>

 

Ken

 

Archived

This topic is now archived and is closed to further replies.
