pr0no

  1. Oh, never mind! It works great. For some reason, when I use live output from the database, it produces the error described above, but it works perfectly with the string as I gave it in this post. Thanks!
  2. Hey, thanks. It doesn't fully work as expected, however. Consider the input:

         It/PRP was/VBD not/RB okay/JJ or/CC funny/JJ and/CC I/NN will/MD never/RB buy/VB from/IN them/PRP ever/RB again/RB

     The output now is:

         It was not not-okay or funny and I will not-buy from them not-ever again

     However, the expected output is:

         It was not not-okay or not-funny and I will not-buy from them not-ever not-again

     The difference is in "not-funny" and "not-again". They are a JJ and an RB word respectively, but they do not get tagged like the others. I think this is due to the second if-statement:

         if($type!='RB' || !in_array($word, $neg_adv)) { if($type=='JJ' || $type=='RB' ...

     Why do you first check that $type is not 'RB', and then check that $type *is* 'RB'? Is the first check meant to remove the negation word (not, never)? I think this is what stops "funny" and "again" from being tagged. Could you explain?
  3. Consider the following POS-tagged string:

         It/PRP was/VBD not/RB okay/JJ or/CC funny/JJ and/CC I/NN will/MD never/RB buy/VB from/IN them/PRP ever/RB again/RB

     (It was not okay or funny and I will never buy from them ever again)

     I want to accomplish the following:

       • Check for negating adverbs (RB) against a defined array('not', 'never')
       • When there's a match, remove the adverb
       • Prepend "not-" to every subsequent adjective (JJ), adverb (RB), or verb (VB, or VBN for the past participle)
       • Remove all POS tags (/XX)

     Thus, the desired output would be:

         It was not-okay or not-funny and I will not-buy from them not-ever not-again

     My first thought was to do this the way I know how to: explode the string on spaces, then explode every word on "/" to get [JJ => okay], then use a switch statement to treat every word (case JJ: concatenate, etc.), but this seems very sloppy. Does anybody have a cleaner and/or more efficient way of doing this, for instance with a regex?

     The strings have been pre-cleaned, so they will only ever contain words (no punctuation, no characters other than a-z, etc.). Any tips, example code fragments, etc. would be greatly appreciated!

     *Edit: I am aware, btw, of how basic this way of treating negations is, but it is good enough for what I need. There will be an error margin, but that's ok.*
  4. Consider the following string:

         $text = "Dat foo 13.45 and $600 bar {baz:70} and {8}";

     I need to label all numbers in $text, except when they are between curly braces. I now have this:

         preg_replace("/(?<!{)([0-9]+(?:\.[0-9]+)?)(?!})/","{NUMBER:$0}",$text);

     which outputs:

         Dat foo {NUMBER:13.45} and ${NUMBER:600} bar {baz:{NUMBER:7}0} and {8}

     However, the desired output is:

         Dat foo {NUMBER:13.45} and ${NUMBER:600} bar {baz:70} and {8}

     where numbers between { and } are ignored, even if they are surrounded by alphanumeric (or other) characters. In other words: how do I need to adjust the regex so that it completely ignores whatever is between curly braces? Your help would be greatly appreciated!
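
For the number-labelling question in post 4, one possible approach (a sketch, not necessarily what the thread settled on) is to let the pattern match a whole {...} group first and then throw that match away with the PCRE backtracking verbs (*SKIP)(*FAIL), so that only numbers outside braces are left for the second alternative. The variable names $pattern and $labeled are made up for this illustration, and the sketch assumes braces are never nested.

```php
<?php
// Single quotes keep PHP from trying to interpolate anything after the "$".
$text = 'Dat foo 13.45 and $600 bar {baz:70} and {8}';

// Alternative 1: consume an entire {...} group, then (*SKIP)(*FAIL) rejects
//                it and resumes scanning after the closing brace.
// Alternative 2: an integer or decimal number outside any braces.
// Assumption: braces are balanced and never nested.
$pattern = '/\{[^}]*\}(*SKIP)(*FAIL)|[0-9]+(?:\.[0-9]+)?/';

$labeled = preg_replace($pattern, '{NUMBER:$0}', $text);

echo $labeled, PHP_EOL;
// Dat foo {NUMBER:13.45} and ${NUMBER:600} bar {baz:70} and {8}
```

The lookbehind/lookahead version only inspects the single character on either side of a number, which is why "70" inside {baz:70} was split; matching the braces as a unit avoids that.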
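
For the negation question in post 3, the explode-based idea can be kept fairly tidy with a single loop and a flag, without a switch statement. The following is a minimal sketch under stated assumptions: the function name tagNegations is hypothetical, tokens are assumed to be separated by whitespace in word/TAG form, and once a negating adverb is seen the "not-" prefix is assumed to apply to every later JJ, RB, VB, or VBN word up to the end of the string.

```php
<?php
/**
 * Strip POS tags and prefix "not-" to adjectives, adverbs, and verbs that
 * follow a negating adverb. Hypothetical helper written for illustration.
 */
function tagNegations(string $tagged, array $negAdverbs = ['not', 'never']): string
{
    $prefixTags = ['JJ', 'RB', 'VB', 'VBN']; // tags that receive "not-"
    $negated = false;                        // becomes true after the first negator
    $out = [];

    foreach (preg_split('/\s+/', trim($tagged), -1, PREG_SPLIT_NO_EMPTY) as $token) {
        // Split "word/TAG"; a token without a tag gets an empty tag.
        [$word, $tag] = array_pad(explode('/', $token, 2), 2, '');

        if ($tag === 'RB' && in_array(strtolower($word), $negAdverbs, true)) {
            $negated = true; // remember the negation and drop the adverb itself
            continue;
        }

        if ($negated && in_array($tag, $prefixTags, true)) {
            $word = 'not-' . $word;
        }
        $out[] = $word;
    }

    return implode(' ', $out);
}

$input = 'It/PRP was/VBD not/RB okay/JJ or/CC funny/JJ and/CC I/NN will/MD '
       . 'never/RB buy/VB from/IN them/PRP ever/RB again/RB';

echo tagNegations($input), PHP_EOL;
// It was not-okay or not-funny and I will not-buy from them not-ever not-again
```

Checking for the negating adverb before the general RB case is what lets "ever" and "again" be prefixed while "not" and "never" are removed.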