JacobSeated, posted November 3, 2020 (edited)

I recently fixed a bug on my own website involving preg_replace. Each time I edited an article in the front-end, preg_replace was called on the HTML to replace certain HTML elements. I suspect that preg_replace has a bug that causes it to remove certain characters from the replacement string, which corrupts the output. Obviously only the HTML (the haystack) should be modified, and the replacement string should be dropped in place of the needle without being modified in any way. This is not what happens if the replacement string contains backslashes.

I have tried to figure out what exactly is going on here, and I have come up with a fix that uses str_replace instead. But I wonder if there is a solution that would allow me to keep using preg_replace? I also wonder if there are other characters that might be removed during the replacement operation?

I know that you must escape backslashes when declaring string variables in PHP source, but the replacement string is obtained directly from a MySQL database, and I know the data is OK. The fact that you need to escape literal backslashes in PHP scripts makes the problem harder to debug. For example, if you just try my solution directly, without addslashes, you will be missing backslashes. I guess you either have to escape those or load the data from a file.
This is my current solution (I do not use addslashes in the live version):

    $html = '<div>REPLACEMENT_ID</div>';
    $replacement_id = 'REPLACEMENT_ID';
    $replacement = addslashes('<pre>\\\\</pre>');

    // $html = preg_replace("|{$replacement_id}|", $replacement, $html);
    // $html = str_replace($replacement_id, $replacement, $html);
    $pos = strpos($html, $replacement_id);
    if ($pos !== false) {
        $html = substr_replace($html, $replacement, $pos, strlen($replacement_id));
    }
    print_r($html);

If you comment out the substr_replace test and uncomment the preg_replace one instead, you will get an inaccurate number of backslashes, similar to the result I got when using data directly from my database.

Hope someone can help shed some light on this 😄

Edited November 3, 2020 by JacobSeated
kicken, posted November 3, 2020 (Solution)

This is just a matter of two separate levels of escape-sequence processing that you need to wrap your mind around, which can be difficult at times. When you set a string in PHP source, PHP's own escape-sequence processing runs first. PCRE then applies its own level of processing to the value that was passed to the function.

For example, if you want \0 to appear literally in a replacement rather than be interpreted as a back reference, preg_replace has to receive the string \\0 as the replacement. If you define that value as a string in your PHP source, you need to escape those slashes again for PHP's sake, so you have $replacement = "\\\\0". If you get the value from a file or database, you don't have to worry about the PHP level of escaping, but you still need to account for the PCRE level, so your file needs to contain \\0, not just \0.

It's not clear to me exactly what output you're expecting in your code sample. The addslashes call effectively cancels out PHP's escaping, meaning $replacement holds the literal value <pre>\\\\</pre> (four backslashes). preg_replace then sees that and processes its own escaping, which means the value it works with is effectively <pre>\\</pre> (two backslashes). That means your final replaced output is <div><pre>\\</pre></div>.

If you have <pre>\\\\</pre> stored in your database and pull that value from there, you should get the same result. Just don't run it through addslashes(), since you don't have to deal with the PHP level of escaping there.
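To make the two levels concrete, here is a minimal sketch that counts backslashes at each stage. It reuses the strings from the sample in the question; the variable names ($value, $viaPreg, $viaStr, $backref, $literal) are only illustrative and are not taken from anyone's live code.

```php
<?php
// Level 1 (PHP parser): this single-quoted literal is written with four
// backslashes, but the resulting string value contains only two.
$value = '<pre>\\\\</pre>';
echo substr_count($value, '\\'), "\n";            // 2

// addslashes() doubles every backslash, so the replacement now holds four.
$replacement = addslashes($value);
echo substr_count($replacement, '\\'), "\n";      // 4

$html = '<div>REPLACEMENT_ID</div>';

// Level 2 (replacement processing in preg_replace): each \\ pair collapses
// to a single backslash, so two backslashes reach the output.
$viaPreg = preg_replace('|REPLACEMENT_ID|', $replacement, $html);
echo substr_count($viaPreg, '\\'), "\n";          // 2

// str_replace() does no processing of its own, so all four survive.
$viaStr = str_replace('REPLACEMENT_ID', $replacement, $html);
echo substr_count($viaStr, '\\'), "\n";           // 4

// \0 refers to the whole match; \\0 yields a literal backslash followed by 0.
$backref = preg_replace('/b/', '[\\0]', 'abc');   // value passed: [\0]
$literal = preg_replace('/b/', '[\\\\0]', 'abc'); // value passed: [\\0]
echo $backref, "\n";                              // a[b]c
echo $literal, "\n";                              // a[\0]c
```

Counting with substr_count sidesteps the question of how many backslashes to type when printing or comparing, which is the part that tends to make this confusing to debug.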
JacobSeated, posted November 3, 2020 (Author)

That makes sense, thanks. I am now wondering whether I should use preg_replace_callback or whether it is safe to use addslashes. I guess I have to read up on those.
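On that last question, a rough sketch of the options, under some stated assumptions. The nowdoc stands in for a value loaded from a database (PHP does no escape processing inside a nowdoc), and escape_replacement() is a hypothetical helper, not a PHP built-in. As far as the documented behaviour goes, the return value of a preg_replace_callback callback is inserted as-is, while the replacement argument of preg_replace treats both backslash and dollar sign specially. addslashes() does not escape the dollar sign and also puts backslashes in front of quotes, so it does not look like a safe general fix.

```php
<?php
// Stand-in for a value read from the database: two backslashes,
// something that looks like a $1 reference, and single quotes.
$stored = <<<'TXT'
<pre>\\ $1 'q'</pre>
TXT;

$html     = '<div>REPLACEMENT_ID</div>';
$expected = '<div>' . $stored . '</div>';

// Option A: the callback's return value is used verbatim.
$viaCallback = preg_replace_callback(
    '/REPLACEMENT_ID/',
    function () use ($stored) {
        return $stored;
    },
    $html
);

// Option B: hypothetical helper that escapes only backslash and dollar sign,
// the two characters that are special in a preg_replace replacement.
function escape_replacement(string $s): string
{
    return strtr($s, ['\\' => '\\\\', '$' => '\\$']);
}
$viaEscape = preg_replace('/REPLACEMENT_ID/', escape_replacement($stored), $html);

// Option C: addslashes() keeps the backslash count right, but $1 is still
// read as a reference to a group that does not exist, and the quotes pick
// up stray backslashes.
$viaAddslashes = preg_replace('/REPLACEMENT_ID/', addslashes($stored), $html);

var_dump($viaCallback === $expected);   // bool(true)
var_dump($viaEscape === $expected);     // bool(true)
var_dump($viaAddslashes === $expected); // bool(false)
```

When the needle is a fixed string rather than a real pattern, the str_replace or substr_replace approach from the first post avoids the issue altogether.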