Issues with regex in preg_replace


limex

Hi,

I want to strip the characters that are invalid in XML, based on the spec from the W3C homepage:

function strip_invalid_xml_chars( $in ) {
    $out = "";
    $length = strlen($in);
    for ( $i = 0; $i < $length; $i++ ) {
        // ord() yields one byte (0-255), so the ranges above 0xFF below can never match.
        // Square brackets are used because the {} string offset syntax was removed in PHP 8.
        $current = ord($in[$i]);
        if ( ($current == 0x9) || ($current == 0xA) || ($current == 0xD)
            || (($current >= 0x20) && ($current <= 0x7E))
            || (($current >= 0xA0) && ($current <= 0xD7FF))
            || (($current >= 0xE000) && ($current <= 0xFFFD))
            || (($current >= 0x10000) && ($current <= 0x10FFFF)) ) {
            $out .= chr($current);
        } else {
            $out .= " ";
        }
    }
    return $out;
}

But the performance is not great, so I decided to use a regex instead:

$input_string = "abcdefg ™ ´ ®";
$clean_string = preg_replace('/[^\x9\xA\xD\x20-\x7E\xA0-\{xD7FF}\x{E000}-x{FFFD}\x{10000}-\x{10FFFF}]/u', '', $input_string);

But I get a warning and an empty $clean_string:

Compilation failed: range out of order in character class at offset 26

Could someone fix this? Thanks a lot
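
For anyone landing here later, a minimal sketch of why the result comes back empty, assuming a reasonably current PHP build. The first typo, \{xD7FF}, turns part of the class into the range \xA0-\{, and an escaped brace is U+007B, which sorts below U+00A0. That appears to be what "range out of order" at offset 26 points at. Since the pattern never compiles, preg_replace() returns null rather than a string. The @ is there only to keep the warning out of this demonstration.

```php
<?php
// The pattern exactly as posted, typos included.
$pattern = '/[^\x9\xA\xD\x20-\x7E\xA0-\{xD7FF}\x{E000}-x{FFFD}\x{10000}-\x{10FFFF}]/u';

// @ suppresses the "Compilation failed" warning for this demonstration only.
$result = @preg_replace($pattern, '', 'abcdefg');

// preg_replace() returns null on failure, and null echoes as an empty string.
var_dump($result === null);
```

Checking the return value with === null is a cheap way to tell a failed pattern apart from a legitimately empty result.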


There are a number of silly mistakes in your pattern: for example, the hex escape typos \{xD7FF} and x{FFFD}, and it nukes the range \x20-\x7E, which are perfectly normal, printable, safe, happy-in-XML characters. Can you link us to where precisely you got these ranges of characters from?
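
With the two escape typos repaired, the pattern might look like the sketch below. It assumes the input is UTF-8 and keeps the ranges from the original post, which also drop U+007F through U+009F. The Char production in XML 1.0 itself allows the whole of #x20-#xD7FF, so widen that range if you want to follow the spec to the letter. Because the class is negated with ^, the listed ranges (including \x20-\x7E) are the characters that survive. The function name strip_invalid_xml_chars_regex is made up for this example.

```php
<?php
// Hypothetical helper: removes every code point outside the ranges listed in the thread.
function strip_invalid_xml_chars_regex(string $in): string
{
    // \x{...} is the PCRE escape for a code point; the /u flag enables UTF-8 mode.
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{7E}\x{A0}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';

    $out = preg_replace($pattern, '', $in);

    // In /u mode preg_replace() returns null if the subject is not valid UTF-8.
    if ($out === null) {
        throw new InvalidArgumentException('Input could not be processed; is it valid UTF-8?');
    }

    return $out;
}

echo strip_invalid_xml_chars_regex("abcdefg \u{2122} \u{B4} \u{AE}"), "\n";
```

Unlike the loop version, this removes invalid characters instead of replacing them with a space; pass ' ' as the replacement if you want the old behaviour.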
