strimpak Posted October 11, 2012 (edited)

First of all, this is my first post to this great site, so excuse me if I do not define my problem precisely.

I am trying to use the spellchecker extension for Horde Webmail, which uses aspell. It works like a charm for English, but I want to use it for Greek, and it does not work for Greek words. I debugged the whole source code to find out why. Below I'll describe how it works.

When you click spellcheck, it calls aspell as a command, which returns two arrays: one with the bad words and one with the suggestions. These arrays are converted to a JSON string and passed to a JavaScript function that uses two regular expressions to match every occurrence of each bad word and replace it with a <span> element. With English strings these regular expressions work like a charm. With Greek strings the word matching does not work.

Trying to find a solution, I encoded the text as UTF-8, but this failed. After reading a lot on the internet I found a possible solution with XRegExp, but I couldn't make it work. My last resort was to move this work to PHP, so I made an XMLHttpRequest for each bad word and used PHP to do the regexp replacements. But I could not make it work with preg_replace either.

Below is the actual JavaScript code:

    // node is the bad word
    var re_text = '<span index="' + (i++) + '" class="spellcheckIncorrect">' + node + '</span>';
    // content is the whole string
    content = content.replace(new RegExp("(?:^|\\B)" + RegExp.escape(node) + "(?:\\b|$)", 'g'), re_text);
    // Go through and see if we matched anything inside a tag (i.e.
    // class/spellcheckIncorrect is often matched if using a
    // non-English lang).
    content = content.replace(new RegExp("(<[^>]*)" + RegExp.escape(re_text) + "([^>]*>)", 'g'), '\$1' + node + '\$2');

So I would like to ask for your help with the Greek words. I am not a regular expression specialist, but as I understand it, this regex tries to match every occurrence of "node" as a whole word, delimited by \b or by the start (^) or end ($) of the whole string, with g making the match global. Can anyone suggest a way to do this, either in JavaScript or in PHP (I have already implemented passing all the needed information to PHP and back to JavaScript using XHR), or any other way?

My development and production server runs Solaris 10, PHP 5, and Apache 2.

Thanks in advance, and sorry if I couldn't describe my problem well.

Edited October 11, 2012 by strimpak
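A likely cause of the behaviour described above: in JavaScript, \b is defined in terms of \w, which covers only ASCII letters, digits, and underscore, so there is no word boundary between a space and a Greek letter. A minimal sketch of one workaround follows, assuming an engine that supports the `u` flag, `\p{...}` property escapes, and lookbehind (which did not exist in 2012 browsers). The names `escapeRegExp` and `wrapBadWord` are hypothetical and not part of Horde; the original second step that undoes matches inside tags is not reproduced here.

```javascript
// Sketch only: escapeRegExp and wrapBadWord are hypothetical names, not Horde's API.
// Assumes an engine with the `u` flag, \p{...} escapes, and lookbehind (e.g. Node 10+).

// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every whole-word occurrence of `node` in `content` with the marker span.
// A "word edge" here means: no Unicode letter, mark, digit, or underscore on that side.
function wrapBadWord(content, node, index) {
  const reText = '<span index="' + index + '" class="spellcheckIncorrect">' + node + '</span>';
  const wordChar = '[\\p{L}\\p{M}\\p{N}_]';
  const pattern = new RegExp(
    '(?<!' + wordChar + ')' + escapeRegExp(node) + '(?!' + wordChar + ')',
    'gu'
  );
  // A function replacement keeps any "$" in the word from being read as a pattern.
  return content.replace(pattern, () => reText);
}
```

Called as `wrapBadWord(content, node, i++)`, this would stand in for the first `content.replace(...)` call in the code above. The lookarounds replace \b on both sides, so a bad word is only wrapped when it is not part of a longer word.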
strimpak (Author) Posted October 12, 2012

OK, let me describe what I have done so far. I passed all the bad words and the content to a PHP script. I encode the word and the content as UTF-8 and do the replacement like this:

    $content = preg_replace("/".$node."/", $retext, $content);

This matches Greek patterns and replaces them with the span. I think it's a step forward. Now my problem is how to match only separate words. The regexp /(?:^|\b)greekword(?:\b|$)/g does not work. I think preg_replace has no limit by default, so I dropped the g flag. Does anyone have an idea how I can do it?
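On the PHP side, the preg functions treat the pattern and subject as bytes unless the u modifier is given, which may be why \b misbehaves around UTF-8 Greek text there. An alternative that avoids word boundaries altogether is to scan the text once, token by token, and replace only the tokens that aspell flagged. The sketch below shows the idea in JavaScript, to match the thread's original code; `markBadWords` is a hypothetical helper, and it assumes `content` is plain text with no existing markup and that words contain no apostrophes or hyphens.

```javascript
// Sketch only: markBadWords is a hypothetical helper, not part of Horde or aspell.
// Assumes an engine with the `u` flag and \p{...} property escapes (e.g. Node 10+).

// Replace each run of word characters that appears in `badWords` with a marker span.
// The text is scanned once, so the inserted markup is never matched again.
function markBadWords(content, badWords) {
  // Map each bad word to the number used in the span's `index` attribute.
  const indexByWord = new Map(badWords.map((word, i) => [word, i]));
  return content.replace(/[\p{L}\p{M}\p{N}_]+/gu, (token) => {
    if (!indexByWord.has(token)) {
      return token; // Not flagged by the spellchecker: leave untouched.
    }
    return '<span index="' + indexByWord.get(token) +
      '" class="spellcheckIncorrect">' + token + '</span>';
  });
}
```

Because each token is compared as a whole string, a bad word is never matched inside a longer word, and words such as "class" in the inserted span cannot be hit by a later replacement.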
Christian F. Posted October 12, 2012

That's not a PHP RegExp but a JavaScript one. Also, without some examples of the input, the nodes, and the expected output, we can't really help you. All we can do is guess.