
Non Utf-8 Characters & Special Characters - No Errors But No Output



I am trying to extract the text between the <title> and </title> tags from several different websites. The code below works perfectly unless the result contains non-ASCII characters. The weird thing is that when the string has odd characters, it produces no output, but it doesn't produce an error either.

 

The characters in the example that breaks are ➹➹➹➹➹ (they won't always be the same characters), as in the title below. I want the regex to pick up these characters (in this case ➹) along with everything else between the title tags.

➹➹➹➹➹ Carnival Cruise Departure Dates

 

 

 

preg_match('/<title">([^"]*)<\/title>/isu',$var2,$title);
$title=$title[1];
 if (preg_last_error() == PREG_NO_ERROR) {
 echo "----------Title: ".$title."<br>";
 }
 else if (preg_last_error() == PREG_INTERNAL_ERROR) {
 echo "----------Title: There is an internal error!";
 }
 else if (preg_last_error() == PREG_BACKTRACK_LIMIT_ERROR) {
 echo "----------Title: Backtrack limit was exhausted!";
 }
 else if (preg_last_error() == PREG_RECURSION_LIMIT_ERROR) {
 echo "----------Title: Recursion limit was exhausted!";
 }
 else if (preg_last_error() == PREG_BAD_UTF8_ERROR) {
 echo "----------Title: Bad UTF8 error!";
 }
 else if (preg_last_error() == PREG_BAD_UTF8_OFFSET_ERROR) {
 echo "----------Title: Bad UTF8 offset error!";
 }

 

Thanks in advance.

Edited by phoenixx
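
A note on the "no error but no output" symptom described above: preg_match() returns 0 when the pattern simply does not match, and in that case preg_last_error() still reports PREG_NO_ERROR, so the error chain prints an empty title. It returns false only on a real PCRE failure, such as invalid UTF-8 in the subject when the u modifier is set. Below is a minimal sketch of telling the two cases apart. describe_match() is a hypothetical helper name, not something from the original post, and the pattern in the usage line is only an illustration.

```php
<?php
// Hypothetical helper (not from the original post): reports what preg_match() did.
function describe_match(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject, $m);

    if ($result === false) {
        // A real PCRE failure, e.g. invalid UTF-8 in the subject with the /u modifier.
        return 'error code ' . preg_last_error();
    }
    if ($result === 0) {
        // The pattern ran fine but found nothing; preg_last_error() is PREG_NO_ERROR here.
        return 'no match';
    }
    // One match found; $m[1] holds the first capture group.
    return 'matched: ' . $m[1];
}

// Usage: a subject with no title tag is reported as "no match", not as an error.
echo describe_match('/<title>(.*?)<\/title>/isu', '<p>no title here</p>'), "\n";
```

Checking the return value before reading $title[1] also avoids an undefined-index notice when nothing matched.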

Solved. With a little trial and error, I simply changed

preg_match('/<title">([^"]*)<\/title>/isu',$var2,$title);

to

preg_match('/<title">([^"]*.?)<\/title>/isu',$var2,$title);
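
For anyone landing here later, a pattern that does not depend on which characters appear in the title may be easier to reason about than the tweak above. This is only a sketch under my own assumptions: extract_title() is a hypothetical helper name, the input is assumed to be valid UTF-8, and the lazy (.*?) capture is my substitution, not the pattern from the post.

```php
<?php
// Hypothetical helper, not from the thread: returns the title text, or null.
function extract_title(string $html): ?string
{
    // <title\b[^>]*> tolerates attributes on the opening tag.
    // (.*?) lazily captures everything up to the first closing tag.
    // /i ignores case, /s lets . span newlines, /u treats the input as UTF-8.
    $result = preg_match('/<title\b[^>]*>(.*?)<\/title>/isu', $html, $m);

    if ($result !== 1) {
        return null; // either no match (0) or a PCRE error (false)
    }
    return trim($m[1]);
}

// Usage with the title from the question.
$html = '<html><head><title>➹➹➹➹➹ Carnival Cruise Departure Dates</title></head></html>';
echo "----------Title: " . extract_title($html) . "<br>";
```

Because the capture group accepts any character, symbols such as ➹ need no special handling as long as the page really is UTF-8. If it is not, preg_match() returns false and the helper returns null rather than an empty title.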

Edited by PFMaBiSmAd
removed wysiwyg micro-font formatting