newbtophp Posted October 15, 2009
Example: \166\x65\156\x6f\x6f\165\1631\x5f\x6d\141\x77\x61\x6c\171

Always 3-4 characters preceded by a backslash; the first character is always a number or an x. All help is very appreciated.

Link to comment https://forums.phpfreaks.com/topic/177790-solved-regex-for-some-characters/
nrg_alpha Posted October 15, 2009
#[x0-9][0-9]{2,3}#

newbtophp, here are some links to help you out with regex:
http://www.phpfreaks.com/tutorial/regular-expressions-part1---basic-syntax
http://www.regular-expressions.info/
http://weblogtoolscollection.com/regex/regex.php
nrg_alpha Posted October 15, 2009
Ooops... missed the slash part:

#\\\[x0-9][0-9]{2,3}#
newbtophp Posted October 15, 2009 Author
Thanks for the links. I'll try and experiment, although regex seems hard and complex. But I'll challenge myself. I created a function using your expression:

<?php
$text = "<?php //Begin xurlencoded code '\166\x65\156\x6f\x6f\165\1631\x5f\x6d\141\x77\x61\x6c\171'; ?>";

function xurldecode($input){
    //find xurlencoded data
    $input = preg_match('#\\\[x0-9][0-9]{2,3}#', $output);
    //little trick to decode the found data
    $output = urldecode(str_replace('\x','%',$input));
    return $output;
}

echo(xurldecode($text));
?>

But it doesn't seem to work. Can you perhaps point me in the right direction? :-\
cags Posted October 15, 2009
I suggest you read up on the preg_match function on php.net, because that's not how it is used.
newbtophp Posted October 15, 2009 Author
Quote: "I suggest you read up on the preg_match function on php.net, because that's not how it is used."

Sure, had a quick look:

<?php
$text = '<? \166\x65\156\x6f\x6f\165\1631\x5f\x6d\141\x77\x61\x6c\171 ?>';

if(preg_match('#\\\[x0-9][0-9]{2,3}#', $text, $matches)) {
    print_r($matches);
}
?>

I'm sure I'm doing it correctly, but it doesn't print the whole match :-\
nrg_alpha Posted October 15, 2009
<?php
$text = '\166\x65\156\x6f\x6f\165\1631\x5f\x6d\141\x77\x61\x6c\171';

// Each repetition must start with a literal backslash.
if(preg_match('#^(?:\\\[x0-9a-f][0-9a-f]{2,3})+$#', $text, $matches)) {
    echo $matches[0];
}
?>

Sorry, my head's not on straight today..
cags Posted October 15, 2009
You have the usage much closer now. It matches the pattern it is supposed to just fine; the issue is that I don't think the pattern does what you want it to do. A layman's breakdown of the pattern is: look for a backslash, followed by either x or a digit 0-9, followed by 2 or 3 more digits (digits meaning 0-9).

If you run your code, I suspect $matches will contain just \166. If you wish to get all the parts of the string that match, you would use preg_match_all. But I'm guessing what you want is to check that the entire line repeats that pattern?

EDIT: which, by the looks of it, nrg_alpha just showed you (including the valid hex chars that were omitted originally).
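To illustrate the difference cags describes, here is a minimal sketch (not from the thread; the shortened sample string and the variable names are assumptions) that runs the same pattern through preg_match and preg_match_all:

```php
<?php
// Shortened sample; single quotes keep the backslashes literal.
$text    = '<? \166\x65\156\x6f ?>';
$pattern = '#\\\[x0-9][0-9]{2,3}#';

// preg_match stops after the first match.
preg_match($pattern, $text, $first);

// preg_match_all collects every match; $all[0] holds the full matches.
$count = preg_match_all($pattern, $text, $all);

print_r($first);
print_r($all[0]);
echo $count, "\n";
```

Under these assumptions $first holds only \166, while $all[0] holds \166, \x65 and \156. The trailing \x6f is skipped because the pattern only allows the digits 0-9 after the first character, which is the omission of hex characters mentioned in the EDIT above.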
nrg_alpha Posted October 15, 2009
Quote: "EDIT: which by the looks of it nrg_alpha just showed you (including the valid hex chars that were omitted originally)"

Yeah, sorry about that.. got stuff on my mind.. so I'm not all there today. Perhaps this is a sign I should stop offering suggestions for the day.
newbtophp Posted October 15, 2009 Author
Quote: "EDIT: which by the looks of it nrg_alpha just showed you (including the valid hex chars that were omitted originally)"
Quote: "Yeah, sorry about that.. got stuff on my mind.. so I'm not all there today. Perhaps this is a sign I should stop offering suggestions for the day"

It's OK mate, take a break. I appreciate the help. Put your feet up and watch TV.
salathe Posted October 15, 2009
What precisely are you aiming to do? Regular expression creation, and programming in general, thrives on having a specific task to accomplish. The original post says you want to match a sequence of characters, yet your own code snippets show that the subject string may contain other things before/after the sequence.

Perhaps it would better suit us to be given whatever input you will have, what exactly you want to be done, and some output which should result. For example, do you wish to make sure that a string only contains the type of sequence your original post mentions; to extract that sequence from a string which may also contain other things; something else entirely?
Garethp Posted October 15, 2009
Here's my code:

function CheckString($String)
{
    if(preg_match('~^(?:\\\\[x0-9][a-z0-9#&]{2,3})+$~i', $String, $M))
    {
        echo $M[0];
        return 1;
    }
    return 0;
}

The problem is, for us to even be able to use regex on it, it needs to be single quoted, but for it to echo the value of the string, it needs to be double quoted. Is there a way to convert single quoted strings into double quoted strings?
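One possible answer to Garethp's question is PHP's built-in stripcslashes(), which interprets C-style escapes, including octal and hexadecimal ones, in a string whose backslashes are still literal. A minimal sketch, assuming the encoded text arrives as a single-quoted (or file-read) string; the variable names are illustrative and not from the thread:

```php
<?php
// Single quotes: the backslashes are ordinary characters here.
$raw = '\166\x65\156\x6f\x6f\165\1631\x5f\x6d\141\x77\x61\x6c\171';

// stripcslashes() resolves the octal and hex escapes at run time.
$decoded = stripcslashes($raw);

echo $decoded, "\n"; // venoous1_mawaly
```

Note that stripcslashes() also processes every other backslash in the input (for example \n or \"), so it probably suits isolated strings better than a whole source file.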
newbtophp Posted October 15, 2009 Author
Quote: "What precisely are you aiming to do? Regular expression creation, and programming in general, thrives on having a specific task to accomplish. The original post says you want to match a sequence of characters, yet your own code snippets show that the subject string may contain other things before/after the sequence. Perhaps it would better suit us to be given whatever input you will have, what exactly you want to be done, and some output which should result. For example, do you wish to make sure that a string only contains the type of sequence your original post mentions; to extract that sequence from a string which may also contain other things; something else entirely?"

To summarise: I'm trying to create a way to replace all encoded code with readable/decoded code. I know how to decode it; it's very simple, just echo the encoded string:

<?php echo "\166\x65\156\x6f\x6f\165\1631\x5f\x6d\141\x77\x61\x6c\171"; ?>

Which will decode to readable code:

venoous1_mawaly

I can do it manually by echoing each encoded string and then replacing it with the echo output, but this can get tiring. It would be nice to create a way to do this so that it echoes the whole file with the encoded strings replaced by their readable output. Here's a sample encoded file:

<?php
$x0b = "\166\x65\156\x6f\x6f\165\1631\x5f\x6d\141\x77\x61\x6c\171";
$x0c = "ve\156\x6f\x6f\x75\163\061_m\x61waly";
$x0d = "\x33%\x28\124\124?\x3a\x7dA\102A4";
$x0e = "\x6c\157ca\x6c\x68os\x74";
$x10 = "\x35\x2d10-\x32\x301\x30";
$x11 = mysql_connect($x0e, $x0c, $x0d);
?>

If I manually echoed each string and replaced it, the whole code would be readable and look like:

<?php
$x0b = "venoous1_mawaly";
$x0c = "venoous1_mawaly";
$x0d = "3%(TT?:}ABA4";
$x0e = "localhost";
$x10 = "5-10-2010";
$x11 = mysql_connect($x0e, $x0c, $x0d);
?>

I'm not entirely sure how to do this, so I thought that if I can get some help with regex I can use preg_match_all() and then echo it or something.

Looking at the pattern of the encoded strings, they look similar to urlencode() output, except that they have \x instead of %, but this is not always true. Any help is appreciated.
salathe Posted October 15, 2009
A quick solution would be to replace all of the escape sequences present in the code (\123 is an escape sequence; the chances of finding one in PHP code outside of a string are minimal).

It is worth noting here that escape sequences like those presented in this thread represent ASCII characters within double-quoted strings and can use either octal or hexadecimal notation. (For more info, peek at the Double quoted strings section of the PHP Manual.)

The octal notation of a character can be represented by the regular expression \[0-7]{1,3} (Aside: for now, assume the backslash means a literal backslash character.) In other words, a backslash followed by between one and three digits between zero and seven (since octal numbers are base 8, those are the only digits used).

The hexadecimal notation of a character can be represented by the regular expression \x[0-9A-Fa-f]{1,2} In other words, a backslash and the letter x followed by one or two hexadecimal digits.

We can put these two together to help solve our problem, which can be divided into two main steps:
1. Match our (octal or hexadecimal) escape sequences
2. Replace them with their ASCII characters

The first step requires us to figure out a regular expression which will match the octal/hex escape sequences. Luckily, here's one that I prepared earlier, so we will just use it (if you need an explanation, just ask).

The Regex

/\\([0-7]{1,3}|x[0-9A-Fa-f]{1,2})/

We will eventually need three backslashes in our PHP code: the PHP parser thinks that a backslash might be the start of an escape sequence (like the PHP code we're going to be replacing!), and the regular expression engine uses that same character for its own escape sequences (the latter is why we need two in the example given)! The details aren't super-important; suffice it to say that you will need all three there in your PHP code (as shown below). Now, the replacement.
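As a quick check of the expression above, the sketch below (not part of salathe's post; the sample string is one line borrowed from the thread and the variable names are assumptions) prints the pattern as the regex engine receives it and lists what the capture group picks up:

```php
<?php
// Three backslashes in single-quoted PHP source: the first two collapse to
// one backslash and the third is left alone, so the engine sees two.
$pattern = '/\\\([0-7]{1,3}|x[0-9A-Fa-f]{1,2})/';
echo $pattern, "\n"; // /\\([0-7]{1,3}|x[0-9A-Fa-f]{1,2})/

// Single quotes keep the escape sequences as literal text.
$sample = 've\156\x6f\x6f\x75\163\061_m\x61waly';
preg_match_all($pattern, $sample, $matches);

// $matches[1] holds the part after the backslash for each sequence.
print_r($matches[1]);
```

Group 1 comes back as 156, x6f, x6f, x75, 163, 061 and x61; the plain characters ve, _m and waly are not matched, so a replacement would leave them untouched.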
Replacement function

The easiest way to do our replacements is to use a callback function (using preg_replace_callback), which takes in the matched values, examines them, and returns the value we want in their place. The idea is fairly simple: the matched values (our octal or hexadecimal escape sequences) will be fed into the function, which converts each one to the appropriate ASCII character and returns that character. The callback function is called once for each escape sequence found. This function could be written in a myriad of different ways to get the job done; this is just one quick example.

function decode_octhex($match)
{
    // E.g. 166 or x6f
    $value = $match[1];

    // Hexadecimal notation
    if ($value[0] == 'x') {
        return chr(hexdec(ltrim($value, 'x')));
    }

    // Octal notation
    return chr(octdec($value));
}

Now it's all well and good having the component parts, but for the sake of simplicity (and what I know you're really here for) let's put all of the pieces into a small script so that we can tie everything neatly together.

Example Script

<?php

// Use file_get_contents() or whatever here.
$encoded = '<?php
$x0b = "\166\x65\156\x6f\x6f\165\1631\x5f\x6d\141\x77\x61\x6c\171";
$x0c = "ve\156\x6f\x6f\x75\163\061_m\x61waly";
$x0d = "\x33%\x28\124\124?\x3a\x7dA\102A4";
$x0e = "\x6c\157ca\x6c\x68os\x74";
$x10 = "\x35\x2d10-\x32\x301\x30";
$x11 = mysql_connect($x0e, $x0c, $x0d);
?>';

/**
 * Takes a hexadecimal or octal value
 * and returns its equivalent ASCII character.
 */
function decode_octhex($match)
{
    // E.g. 166 or x6f
    $value = $match[1];

    // Hexadecimal notation
    if ($value[0] == 'x') {
        return chr(hexdec(ltrim($value, 'x')));
    }

    // Octal notation
    return chr(octdec($value));
}

// Replace octal/hexadecimal escape sequences with their ASCII values
echo preg_replace_callback('/\\\([0-7]{1,3}|x[0-9A-Fa-f]{1,2})/', 'decode_octhex', $encoded);

?>

Example Output

<?php
$x0b = "venoous1_mawaly";
$x0c = "venoous1_mawaly";
$x0d = "3%(TT?:}ABA4";
$x0e = "localhost";
$x10 = "5-10-2010";
$x11 = mysql_connect($x0e, $x0c, $x0d);
?>

There we go!
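The script above decodes escape sequences wherever they appear. As a hedged variation (not from the thread, and assuming PHP 7 or later), the sketch below limits decoding to double-quoted string literals, leaves escaped backslashes alone, and re-escapes characters that would otherwise break the literal. The function names decode_escape, decode_string_literal and decode_source are hypothetical, and the literal-matching regex is a simplification that a real tool would likely replace with PHP's tokenizer (token_get_all).

```php
<?php
// Hypothetical helpers for illustration; none of these names come from the thread.

// Decode one backslash sequence found inside a double-quoted literal.
function decode_escape(array $m): string
{
    // No capture group 1 means some other escape (\n, \\, \" ...): keep it.
    if (!isset($m[1]) || $m[1] === '') {
        return $m[0];
    }
    $seq  = $m[1];
    $char = ($seq[0] === 'x')
        ? chr(hexdec(substr($seq, 1)))
        : chr(octdec($seq));

    // Keep non-printable bytes encoded so the output stays readable.
    if (!ctype_print($char)) {
        return $m[0];
    }
    // Re-escape characters that are special inside double quotes.
    return in_array($char, ['"', '$', '\\'], true) ? '\\' . $char : $char;
}

// Decode every octal/hex sequence within one matched literal.
function decode_string_literal(array $m): string
{
    $escape = '/\\\\(?:([0-7]{1,3}|x[0-9A-Fa-f]{1,2})|.)/s';
    return preg_replace_callback($escape, 'decode_escape', $m[0]);
}

// Apply the decoder to each double-quoted literal in the source text.
function decode_source(string $source): string
{
    $literal = '/"(?:[^"\\\\]|\\\\.)*"/s';
    return preg_replace_callback($literal, 'decode_string_literal', $source);
}

$encoded = '$a = "\x6c\157ca\x6c\x68os\x74"; $b = \'\x41 stays\'; $c = "\x22quoted\x22";';
echo decode_source($encoded), "\n";
```

With this input the first literal becomes "localhost", the single-quoted \x41 is left as it is (PHP does not interpret it there either), and the decoded double quotes in the last literal are written back as \" so the result is still valid PHP.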
newbtophp Posted October 15, 2009 Author
salathe, thank you very much for that reply and tutorial! :D :D

Solved