phoenixx
Posted December 5, 2012

I am trying to extract the text between the <title> and </title> tags from several different websites. The code below works perfectly unless there are non-ASCII characters in the result. The weird thing is that when the string has odd characters, it doesn't produce any output, but it doesn't produce an error either. The characters in the example that breaks are ➹➹➹➹➹ (they won't always be the same characters). I want the regex to pick up those characters (in this case ➹) and everything else between the title tags. The title that breaks is:

➹➹➹➹➹ Carnival Cruise Departure Dates

The code:

preg_match('/<title">([^"]*)<\/title>/isu',$var2,$title);
$title=$title[1];
// Report the captured title, or the PCRE error if one occurred
if (preg_last_error() == PREG_NO_ERROR) {
    echo "----------Title: ".$title."<br>";
} else if (preg_last_error() == PREG_INTERNAL_ERROR) {
    echo "----------Title: There is an internal error!";
} else if (preg_last_error() == PREG_BACKTRACK_LIMIT_ERROR) {
    echo "----------Title: Backtrack limit was exhausted!";
} else if (preg_last_error() == PREG_RECURSION_LIMIT_ERROR) {
    echo "----------Title: Recursion limit was exhausted!";
} else if (preg_last_error() == PREG_BAD_UTF8_ERROR) {
    echo "----------Title: Bad UTF8 error!";
} else if (preg_last_error() == PREG_BAD_UTF8_OFFSET_ERROR) {
    echo "----------Title: Bad UTF8 offset error!";
}

Thanks in advance.

Link to comment: https://forums.phpfreaks.com/topic/271652-non-utf-8-characters-special-characters-no-errors-but-no-output/
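A note on the "no output, no error" symptom: a pattern that simply fails to match is not an error, so preg_last_error() still returns PREG_NO_ERROR in that case. preg_match() itself returns 1 for a match, 0 for no match, and false on failure, so the return value is what separates the three outcomes. The following is only a rough sketch of that check. The helper names describePregError and matchStatus are made up for illustration, and the pattern passed in is assumed to contain one capture group.

```php
<?php
// Hypothetical helper: turns a preg_last_error() code into a readable message.
function describePregError(int $code): string
{
    $messages = [
        PREG_NO_ERROR              => 'No error',
        PREG_INTERNAL_ERROR        => 'There is an internal error!',
        PREG_BACKTRACK_LIMIT_ERROR => 'Backtrack limit was exhausted!',
        PREG_RECURSION_LIMIT_ERROR => 'Recursion limit was exhausted!',
        PREG_BAD_UTF8_ERROR        => 'Bad UTF8 error!',
        PREG_BAD_UTF8_OFFSET_ERROR => 'Bad UTF8 offset error!',
    ];
    return $messages[$code] ?? 'Unknown error code ' . $code;
}

// Hypothetical helper: reports "match", "no match" and "error" separately.
// Assumes $pattern has one capture group; falls back to the whole match.
function matchStatus(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject, $m);
    if ($result === false) {
        // Only a false return means PCRE actually failed.
        return 'error: ' . describePregError(preg_last_error());
    }
    if ($result === 0) {
        // Ran fine, but the pattern did not match the subject.
        return 'no match';
    }
    return 'match: ' . ($m[1] ?? $m[0]);
}

// A double quote inside the title defeats [^"]*, yet no error is raised.
echo matchStatus('/<title>([^"]*)<\/title>/isu', '<title>Say "hi"</title>'), "\n";
```

With a check like this, an empty result can be traced either to the pattern not matching or to a genuine PCRE error such as invalid UTF-8 in the subject.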
phoenixx (Author)
Posted December 5, 2012

Solved. With a little trial and error, I simply changed

preg_match('/<title">([^"]*)<\/title>/isu',$var2,$title);

to

preg_match('/<title">([^"]*.?)<\/title>/isu',$var2,$title);

Link to comment: https://forums.phpfreaks.com/topic/271652-non-utf-8-characters-special-characters-no-errors-but-no-output/#findComment-1397756
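Since `.?` matches at most one extra character, the change above may not cover every title. As an alternative sketch (not the fix from this thread), a lazy `(.*?)` capture accepts any characters up to the closing tag, including quotes and multibyte symbols such as ➹. The function name extractTitle is hypothetical, and the input is assumed to be valid UTF-8 because of the `u` modifier.

```php
<?php
// Hypothetical helper, not the code from the post: returns the title text,
// or null when nothing matches or preg_match() reports an error.
function extractTitle(string $html): ?string
{
    // [^>]* tolerates attributes on the opening tag;
    // (.*?) lazily captures everything up to the first closing tag.
    $result = preg_match('/<title[^>]*>(.*?)<\/title>/isu', $html, $m);
    if ($result !== 1) {
        return null;
    }
    return trim($m[1]);
}

// Sample input modeled on the title quoted in the thread.
$var2 = '<html><head><title>➹➹➹➹➹ Carnival Cruise Departure Dates</title></head></html>';
$title = extractTitle($var2);
echo "----------Title: " . ($title ?? '(none found)') . "<br>";
```

Returning null keeps the "nothing found" case explicit, so the caller does not index into an empty match array.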
Archived
This topic is now archived and is closed to further replies.