Olney Posted October 4, 2006

I have a bit of code that I'm struggling with (I'm not a PHP pro). I hope it's OK to ask how to do this.

I have an open source script I'm trying to modify that takes tags in the submit form. The problem is that the delimiter is a comma " ,"

I want to use it in Japanese, where the double-byte comma would look like this "„ÄÅ" (if written in ASCII).

Can anyone give me advice on which part of the code to change?

[code]
function tags_normalize_string($string) {
    return preg_replace('/[;,]\s+$/', ",", $string);
}

function tags_insert_string($link, $lang, $string, $date = 0) {
    global $db;
    $string = tags_normalize_string($string);
    if ($date == 0) $date = time();
    $words = preg_split('/[,;]+/', $string);
    if ($words) {
        $db->query("delete from tags where tag_link_id = $link");
        foreach ($words as $word) {
            $word = trim($word);
            if (!$inserted[$word] && !empty($word)) {
                $db->query("insert into tags (tag_link_id, tag_lang, tag_words, tag_date) values ($link, '$lang', '$word', from_unixtime($date))");
                $inserted[$word] = true;
            }
        }
        return true;
    }
    return false;
}

function tags_get_string($link, $lang) {
    global $db;
    $counter = 0;
    $res = $db->get_col("select tag_words from tags where tag_link_id=$link and tag_lang='$lang'");
    if (!$res) return false;
    foreach ($db->get_col("select tag_words from tags where tag_link_id=$link and tag_lang='$lang'") as $word) {
        if ($counter > 0) $string .= ', ';
        $string .= $word;
        $counter++;
    }
    return $string;
}
[/code]

I'm almost sure this is where the change belongs, but I'm not sure which part to change. I've been going at it by trial and error with no luck.

Thanks in advance if anyone knows.
effigy Posted October 5, 2006

I'm confused; do you want to use the double comma as a delimiter?

[code]
<meta charset="utf-8"/>
<pre>
<?php
    $double_comma = pack('c*', 0xE2, 0x80, 0x9E);
    $string = "A{$double_comma}B${double_comma}C${double_comma}";
    $result = preg_split("/$double_comma/", $string, -1, PREG_SPLIT_NO_EMPTY);
    print_r($result);
?>
</pre>
[/code]

[quote]
Array
(
    [0] => A
    [1] => B
    [2] => C
)
[/quote]
Olney Posted October 5, 2006

Thank you for the reply.

No, I don't want to use a double comma. I would like to use the Japanese double-byte encoded comma. If I write it in ASCII characters it would look like this: "„ÄÅ"

So currently this is the delimiter: " ,"

But I'm not sure where to put "„ÄÅ" in the code to make it the delimiter.

Hypothetically, it's like saying that instead of " ," as the delimiter I would like to put " -" or something.

I would put either the ASCII code or the native Japanese comma in the code, but I'm just not sure where. I'm stumped.
effigy Posted October 5, 2006

I'm not familiar with this area, but it looks like you want something from the "Multibyte String" suite--maybe [url=http://us2.php.net/mb_convert_encoding]mb_convert_encoding[/url].
Olney Posted October 5, 2006

Thank you, but using mb_convert would probably screw me up more than just finding out which part of the above code to change. Even though it's Japanese, the encoding is still UTF-8.

Since Japanese users won't type a Latin comma, I'm just trying to take the Latin comma out of the above code and put the Japanese comma in its place. This would be the delimiter.

Instead of thinking of it as a foreign language, let's say I was trying to change the delimiter in the code from " ," to " 45"

[code]
$words = preg_split('[b]/[,;]+[/b]/', $string);
if ($words) {
[/code]

So I know somewhere in the code it goes from the above to

[code]
$words = preg_split('/[b][45;][/b]+/', $string);
if ($words) {
[/code]

But I'm just not sure what to change in the whole code so that if a user types in Cars 45 Trucks (instead of Cars, Trucks) it separates the two terms Cars and Trucks.

[code]
function tags_normalize_string($string) {
    return preg_replace('/[;,]\s+$/', ",", $string);
}

function tags_insert_string($link, $lang, $string, $date = 0) {
    global $db;
    $string = tags_normalize_string($string);
    if ($date == 0) $date = time();
    $words = preg_split('/[,;]+/', $string);
    if ($words) {
        $db->query("delete from tags where tag_link_id = $link");
        foreach ($words as $word) {
            $word = trim($word);
            if (!$inserted[$word] && !empty($word)) {
                $db->query("insert into tags (tag_link_id, tag_lang, tag_words, tag_date) values ($link, '$lang', '$word', from_unixtime($date))");
                $inserted[$word] = true;
            }
        }
        return true;
    }
    return false;
}

function tags_get_string($link, $lang) {
    global $db;
    $counter = 0;
    $res = $db->get_col("select tag_words from tags where tag_link_id=$link and tag_lang='$lang'");
    if (!$res) return false;
    foreach ($db->get_col("select tag_words from tags where tag_link_id=$link and tag_lang='$lang'") as $word) {
        if ($counter > 0) $string .= ', ';
        $string .= $word;
        $counter++;
    }
    return $string;
}
[/code]
effigy Posted October 5, 2006

It took me a while to figure out what the [url=http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=3001]Japanese comma[/url] is. How about this?

[code]
<meta charset="utf-8">
<pre>
<?php
    echo 'Japanese comma = ', $j_comma = pack('c*', 0xE3, 0x80, 0x81), '<br/>';
    echo 'String delimited with Latin and Japanese comma = ', $string = "A{$j_comma}B,C{$j_comma}", '<br/>';
    // The u modifier makes PCRE read the pattern as UTF-8, so the class
    // matches the whole comma rather than each of its three bytes.
    $result = preg_split("/[,$j_comma]/u", $string, -1, PREG_SPLIT_NO_EMPTY);
    print_r($result);
?>
</pre>
[/code]
Olney Posted October 5, 2006

I really thank you for taking the time to write the above code, but I'm still not sure how to adapt it to the code I have. I don't want to change the Japanese comma for the entire program, just in the code I posted above.

Looking at your example, do I do this?

[code]
function tags_normalize_string($string) {
    return preg_replace('/[b][,$j_comma][/b]\s+$/', ",", $string);
}

function tags_insert_string($link, $lang, $string, $date = 0) {
    global $db;
    $string = tags_normalize_string($string);
    if ($date == 0) $date = time();
    $words = preg_split('/[b][,$j_comma][/b]+/', $string);
    if ($words) {
        $db->query("delete from tags where tag_link_id = $link");
        foreach ($words as $word) {
            $word = trim($word);
            if (!$inserted[$word] && !empty($word)) {
                $db->query("insert into tags (tag_link_id, tag_lang, tag_words, tag_date) values ($link, '$lang', '$word', from_unixtime($date))");
                $inserted[$word] = true;
            }
        }
        return true;
    }
    return false;
}

function tags_get_string($link, $lang) {
    global $db;
    $counter = 0;
    $res = $db->get_col("select tag_words from tags where tag_link_id=$link and tag_lang='$lang'");
    if (!$res) return false;
    foreach ($db->get_col("select tag_words from tags where tag_link_id=$link and tag_lang='$lang'") as $word) {
        if ($counter > 0) $string .= ', ';
        $string .= $word;
        $counter++;
    }
    return $string;
}
[/code]
effigy Posted October 5, 2006

An easier solution is a combination of [tt]/u[/tt] and [tt]\x{...}[/tt]. Try changing your preg_split line from

[tt]$words = preg_split('/[,;]+/', $string);[/tt]

to

[tt]$words = preg_split('/[,;\x{3001}]+/u', $string);[/tt]
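A minimal sketch of the [tt]/u[/tt] plus [tt]\x{3001}[/tt] approach described above, assuming the script file and the submitted tags are both UTF-8. The sample tag string is made up for illustration and is not from the original script:

```php
<?php
// Sketch only: split a UTF-8 tag string on Latin commas, semicolons,
// and the ideographic comma U+3001. The input below is a made-up example.
$string = "車、トラック, バス;電車";

// /u switches PCRE to UTF-8 mode; \x{3001} then names a single code point,
// so no byte of another Japanese character can be mistaken for a delimiter.
$words = preg_split('/[,;\x{3001}]+/u', $string, -1, PREG_SPLIT_NO_EMPTY);

// Trim ASCII whitespace around each tag, as the original loop does.
$words = array_map('trim', $words);

print_r($words);
```

The single quotes matter here: PHP passes [tt]\x{3001}[/tt] through untouched and PCRE interprets it, so the Japanese comma never has to be typed into the source file.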
Olney Posted October 5, 2006

I'm getting a bit more confused, and I really thank you for continuing to try to explain, but the original code works completely fine except that it uses an ASCII comma.

I'm realizing there might be any of 3 places I need to change.

THIS
function tags_normalize_string($string) { return preg_replace('/[b][;,][/b]\s+$/', ",",$string);
TO perhaps THIS
function tags_normalize_string($string) { return preg_replace('/[b][,$j_comma][/b]\s+$/', ",",$string);

THIS
$words = preg_split('/[,;]+/', $string);
TO THIS
$words = preg_split('/[b][,$j_comma]][/b]+/', $string);

& THIS
if($counter>0) $string .= ', '; $string .= $word;
TO
if($counter>0) $string .= [b]', '; [/b](Not sure, maybe changed?) $string .= $word;
effigy Posted October 5, 2006

Ah, I didn't see the counter. In that case, I would stick with the $j_comma method. You need to make sure that your regex is enclosed in double quotes so that the $j_comma variable will interpolate.
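A rough sketch of how the $j_comma method could cover both the split and the counter-based join. The function names demo_split_tags and demo_join_tags are hypothetical stand-ins, not part of the original script, and the database calls are left out. It assumes UTF-8 input:

```php
<?php
// Hypothetical helpers for illustration; they are not part of the original script.

// Build the ideographic comma (U+3001) from its three UTF-8 bytes.
$j_comma = pack('C*', 0xE3, 0x80, 0x81);

// Split on a Latin comma, a semicolon, or the Japanese comma.
// Double quotes let $j_comma interpolate; /u keeps the class character-based.
function demo_split_tags($string, $j_comma) {
    $words = preg_split("/[,;{$j_comma}]+/u", $string, -1, PREG_SPLIT_NO_EMPTY);
    $words = array_map('trim', $words);
    // Drop blanks and duplicates, then reindex from zero.
    return array_values(array_unique(array_filter($words, 'strlen')));
}

// Join tags for display with the Japanese comma instead of ', '.
function demo_join_tags(array $words, $j_comma) {
    return implode($j_comma, $words);
}

$tags = demo_split_tags("車、トラック,車 、バス", $j_comma);
print_r($tags);
echo demo_join_tags($tags, $j_comma), "\n";
```

Passing $j_comma as a parameter avoids a global. In the original functions the same effect would need either a global declaration or the delimiter built inside each function.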
Olney Posted October 5, 2006

Hey guy, thanks for the help, but I've got a feeling now that this code change is just beyond my level. I appreciate your time and hope that it helps someone else.