dsaba
Members | Posts: 724 | Last visited: Never
Everything posted by dsaba
-
~(?:(?:(?:https?|ftp)://)|www\.)\S+(?<!\p{P})\.\S+(?<!\p{P})~
-
Effigy, your regex will match:
www.lalallookiamnotareallink
http://again
(look, even SMF thinks these are links). All URLs have at least one '.' in them, so I'd say:
~(?:(?:(?:https?|ftp)://)|www\.)(?:\S+\.\S+)(?<!\p{P})~
I also don't see why you are using the x modifier, because \S will not match whitespace anyway. I couldn't figure out the meaning of the \p{P} Unicode escape; could you provide a link to a listing of these, or just tell me what it is?
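To check the point about requiring at least one '.', you can run the two fake links and a couple of real URLs through the pattern. This is only a minimal sketch: the sample text and the variable names are made up for illustration, and the pattern is assumed to read as written in the code.

```php
<?php
// Pattern from the post: a scheme or "www." prefix, then at least one dot,
// and no punctuation character (\p{P}) as the last character of the match.
$urlPattern = '~(?:(?:(?:https?|ftp)://)|www\.)(?:\S+\.\S+)(?<!\p{P})~';

// Hypothetical sample text: two fake "links" followed by two real ones.
$sampleText = 'www.lalallookiamnotareallink http://again '
            . 'see http://example.com/page. or www.example.org';

preg_match_all($urlPattern, $sampleText, $found);

// The fake links are skipped, and the trailing "." after "page" is left out.
print_r($found[0]);
```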
-
'\b(((((H|h)(T|t)|(F|f))(T|t)(P|p)((S|s)?))\://)?(www.|[a-zA-Z0-9].)[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,6}(\:[0-9]{1,5})*(/(|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+?))*)($|[^\w/][<\s]|[<\s]|[^\w/]$)';
$replacement = '\'<a href="\'.((\'$4\' == \'\')?\'http://$1\':\'$1\').\'" target="_blank">$1[/url]$16\'';
return preg_replace('¦'.$pattern.'¦e', $replacement, $text);
};
You are using the | delimiter, but your pattern also uses the | symbol, so PHP treats the first | inside the pattern as the end of it. Escaping every | would turn your alternations into literal characters, so pick a different delimiter that is not used in your pattern. Try the ~ delimiter; your pattern has one ~ inside a character class, which you can safely escape as \~.
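Here is a small sketch of the delimiter problem on its own, using a made-up two-word pattern rather than the long URL pattern. The variable names are only for illustration.

```php
<?php
// With "|" as the delimiter, the pattern ends at the second "|" and PHP
// tries to read the leftover "dog|" as modifiers, so compilation fails.
// The @ only hides the warning; preg_match() still returns false.
$withPipe = @preg_match('|cat|dog|', 'hot dog');

// With "~" as the delimiter, "|" keeps its meaning as alternation.
$withTilde = preg_match('~cat|dog~', 'hot dog');

var_dump($withPipe, $withTilde);
```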
-
There is a difference between finding links and finding URLs. I assume you call these "links" if they are in href attributes, so to match those:
$pat = '~href\s*=\s*[\'"]([^\'"]+)[\'"]~i';
preg_match_all($pat, $source, $out);
print_r($out[1]); // array of URLs found in href attributes
To get just the URLs from any text:
$pat = '~(?:(?(?<=http://|ftp://|https://)(?:http://|ftp://|https://)|)www\.|http://|https://|ftp://)[^\s]+~i';
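A quick sketch of the href case, assuming the pattern allows an = sign (with optional whitespace) between href and the opening quote. The HTML fragment and variable names are hypothetical.

```php
<?php
// Hypothetical HTML fragment: one double-quoted and one single-quoted href.
$html = '<a href="http://example.com/a">A</a> <a HREF=\'/b.html\'>B</a>';

// Capture whatever sits between the quotes of each href attribute.
$hrefPattern = '~href\s*=\s*[\'"]([^\'"]+)[\'"]~i';
preg_match_all($hrefPattern, $html, $hrefMatches);

// Group 1 holds the attribute values only, without "href=" or the quotes.
print_r($hrefMatches[1]);
```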
-
This is better for matching just the URL within any text:
~(?:(?(?<=http://|ftp://|https://)(?:http://|ftp://|https://)|)www\.|http://|https://|ftp://)[^\s]+~
Tested: http://nancywalshee03.freehostia.com/regextester/regex_tester.php?seeSaved=ev0obz7e
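For anyone who wants to try it from PHP instead of the tester, a minimal sketch follows. It assumes the pattern reads exactly as in the code, adds the i modifier, and uses a made-up sentence as input.

```php
<?php
// A bare "www." or an explicit http/https/ftp scheme, followed by
// everything up to the next whitespace character.
$bareUrlPattern = '~(?:(?(?<=http://|ftp://|https://)(?:http://|ftp://|https://)|)www\.|http://|https://|ftp://)[^\s]+~i';

// Hypothetical input with one URL that has a scheme and one that does not.
$text = 'go to http://example.com/a?b=1 or www.example.org today';
preg_match_all($bareUrlPattern, $text, $urlMatches);

print_r($urlMatches[0]);
```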
-
$pat = '~(?<=visit: )(?:http://|https://|ftp://)*(?:www)?[^\s]+(?= and Win!)~i';
$source = preg_replace($pat, '[url REMOVED]', $source);
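As a rough check, here is the same replacement run against a made-up sentence. The sample text is an assumption; only the pattern and the replacement string come from the post.

```php
<?php
// Hypothetical spam sentence for illustration.
$source = 'Please visit: http://www.spam.example/win and Win! today';

// Match the URL only when it sits between "visit: " and " and Win!".
// The lookbehind and lookahead are not part of the match, so they survive.
$pat = '~(?<=visit: )(?:http://|https://|ftp://)*(?:www)?[^\s]+(?= and Win!)~i';
$cleaned = preg_replace($pat, '[url REMOVED]', $source);

echo $cleaned, "\n";
```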
-
Both your patterns work fine within the haystack you provided. If it doesn't work, the problem is coming from somewhere else. When you have these "problems", verify that the content returned by file_get_contents is what you think it is.
-
2 suggestions, index tags and topic bookmarks
dsaba replied to jeffjohnvol's topic in PHPFreaks.com Website Feedback
I'm working on an algorithm that's similar to the way word clouds work: it automatically parses text and finds the words most likely to need tagging for that text. Other than the number of times a word occurs in a text, there are other ways to determine what needs tagging. This would be a nice mod for SMF.
-
In that case I could have just used ~(hi my )a( name )b( is)~ instead of your pattern, because it yields the same results. I wanted to see if this string alone could be the full match "hi my name is" In your regex pattern, the full match is "hi a my name b is", so you didn't "skip" over any words, and the reason for the output is because of php code, not regex. The two example matches I provided were quite easy, but in my question of only matching "hi my name is" in that haystack...this boggled my mind. I didn't know if this could be done, so I take it you can't do it then?
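To make the distinction concrete, here is a sketch with a slightly different grouping than the pattern above (the spaces are kept outside the groups). The full match is still one contiguous piece of the haystack, and the "skipping" only happens when PHP joins the groups. Variable names are made up.

```php
<?php
$haystack = 'hi my a name b is';

// The full match ($m[0]) is the contiguous text "hi my a name b is";
// only the captured groups leave out the "a" and the "b".
preg_match('~(hi my) a (name) b (is)~', $haystack, $m);

// Joining the groups is done by PHP, not by the regex engine.
$joined = implode(' ', array_slice($m, 1));

echo $m[0], "\n", $joined, "\n";
```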
-
Match 6 digits in a row, no repeats in any of the digits
dsaba replied to dsaba's topic in Regex Help
I thought "it" was understood to be the regex pattern. However, you responded with a non-regex solution below. I see this is the essence of your shorter/better solution. It's a great way to solve the problem, but not with regex. Regardless, how/why is your non-regex solution a shorter/better way to do it?
-
this does not provide a parse error and there are no delimiters in the pattern: preg_match("\r\n", 'hello world');
-
Here is the sample haystack:
hi my a name b is
I want to match the string "hi my name is" without the 'b' and the 'a'. I know how to skip over words, or not match them, when they come at the ends of strings, but not when they come in the middle of the string. For example, these two patterns skip over the 'a':
~hi my name is (?=a)~ in "hi my name is a"
~(?<=a) hi my name is~ in "a hi my name is"
I'd like to see how this can be done, or whether it can be done at all with the PCRE engine in PHP. Thanks
-
Only allowing numbers, letters, dashes, and underscores
dsaba replied to LanceT's topic in Regex Help
Because the m modifier turns on the meanings of the ^ and $ characters as anchors for the start and end of each line.
-
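A small sketch of what the m modifier changes, using a made-up three-line subject and a character class for letters, digits, dashes, and underscores. The names and the sample data are assumptions for illustration.

```php
<?php
// Hypothetical multi-line input: the middle line contains a space and "!".
$lines = "good_name-1\nbad name!\nalso_ok";

// Without m, ^ and $ anchor the whole subject, so nothing matches here.
$withoutM = preg_match_all('~^[-\w]+$~', $lines, $a);

// With m, ^ and $ also match at the start and end of every line.
$withM = preg_match_all('~^[-\w]+$~m', $lines, $b);

var_dump($withoutM, $withM);
print_r($b[0]);
```
-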
so the reason why "\r\n" works as a pattern but this doesn't '\r\n' and why \r?\n? yields so many subgroups is because of sheer confusion & mainstream lingo?
-
This works:
~{if cond=(.+?)}(.+?){/if}\s*(?:{else}(.+?){/else})?~si
I don't see why you need * either; why would you have empty if and/or else statements?
(regex)? indicates an optional subgroup
(?:regex) indicates a subgroup that is not captured
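Here is a minimal sketch of the pattern against two made-up template strings, one with an else part and one without. The template text and variable names are hypothetical.

```php
<?php
$ifElsePattern = '~{if cond=(.+?)}(.+?){/if}\s*(?:{else}(.+?){/else})?~si';

// Hypothetical template strings for illustration.
$withElse    = '{if cond=logged_in}Welcome back{/if} {else}Please log in{/else}';
$withoutElse = '{if cond=logged_in}Welcome back{/if}';

preg_match($ifElsePattern, $withElse, $full);
preg_match($ifElsePattern, $withoutElse, $partial);

print_r($full);    // groups 1 to 3: condition, if body, else body
print_r($partial); // groups 1 and 2 only; the optional else group did not take part
```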
-
These work for matching 6 digits in a row with none repeating. Is there a shorter/better way to write it?
~(\d)(?!\1)(\d)(?!\1|\2)(\d)(?!\1|\2|\3)(\d)(?!\1|\2|\3|\4)(\d)(?!\1|\2|\3|\4|\5)(\d)~
~(?:(\d)(?!.*\1)){6}~
Checked against: 123456 in PCRE
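A rough way to compare the two patterns is to run them over a few six-digit strings. This sketch assumes the last lookahead in the long pattern uses the back-reference \5 and that the short pattern starts with a non-capturing group; the test inputs are made up.

```php
<?php
// Long form: each lookahead lists every digit captured so far.
$explicit = '~(\d)(?!\1)(\d)(?!\1|\2)(\d)(?!\1|\2|\3)(\d)(?!\1|\2|\3|\4)(\d)(?!\1|\2|\3|\4|\5)(\d)~';
// Short form: each digit must not appear again anywhere later in the subject.
$compact  = '~(?:(\d)(?!.*\1)){6}~';

// Hypothetical inputs: one valid, two containing a repeated digit.
$results = [];
foreach (['123456', '112345', '123451'] as $candidate) {
    $results[$candidate] = [
        preg_match($explicit, $candidate),
        preg_match($compact, $candidate),
    ];
    printf("%s explicit=%d compact=%d\n", $candidate, ...$results[$candidate]);
}
```

Note that the short form looks ahead across the rest of the subject, so on longer input it can reject six digits that the long form would accept.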
-
Do you have an explanation for this behavior, effigy?
-
The expression is still greedy because you're spoiling it, haha, j/k. It is still greedy because you haven't made it lazy:
.* is greedy
.*? is lazy
See the difference? Add a ? after the repetition operator (+ or *) and it becomes lazy.
When you say (.+)} you have to realize that . means match ANY character, including the }, so it never stops at the first one. You can either match repetitions of a specific class like [^}]+, which is "anything that is not the } sign", or you can make it lazy: (.+?)} will now stop at the first } it reaches.
Read this tutorial: http://www.regular-expressions.info/tutorial.html
Also take out the U modifier, since it inverts the greediness of the quantifiers. Try my regex tester and it will show you visually, with colors, where your subgroups begin and end: http://nancywalshee03.freehostia.com/regextester/regex_tester.php?seeSaved=fkfwfka2
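The three variants side by side, as a minimal sketch on a made-up subject string (the variable names are only for illustration):

```php
<?php
// Hypothetical subject with two closing braces.
$subject = 'a} b} c';

preg_match('~(.+)}~',    $subject, $greedy);  // .+ runs on to the last }
preg_match('~(.+?)}~',   $subject, $lazy);    // .+? stops at the first }
preg_match('~([^}]+)}~', $subject, $negated); // [^}]+ cannot cross a }

echo $greedy[1], "\n", $lazy[1], "\n", $negated[1], "\n";
```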
-
I went here: http://www.phpfreaks.com/forums/index.php?action=profile;u=29536;sa=showPosts;start=15
You can press the page numbers and it changes the start=... variable in the URL. I don't see where you can specify the limit of how many posts are shown per page. I want to see them all at once, like a "See All" link would do. I don't feel like digging through the SMF docs to see if this is possible, so I figured I'd just ask. Thanks
-
str_replace requires 3 arguments:
mixed str_replace ( mixed $search , mixed $replace , mixed $subject [, int &$count ] )
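A short sketch of a complete call, with made-up strings, showing the three required arguments plus the optional count:

```php
<?php
// Search, replace, and subject are required; the optional fourth argument
// receives the number of replacements that were made.
$result = str_replace('world', 'PHP', 'hello world, world', $count);

echo $result, ' (', $count, " replacements)\n";
```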
-
This isn't too hard. You can use Perl-style regex with preg_match_all(). All of these patterns will work, where ~ is the delimiter:
~<h2>((?:(?!</h2>).)+)</h2>~is
~<h2>(.+?)</h2>~is
~<h2>([^<]+)</h2>~i
$pat = '~<h2>([^<]+)</h2>~i';
preg_match_all($pat, $content, $out);
print_r($out); // look at your outputted array; the matches are in subgroup 1, aka $out[1]
$numH2s = count($out[1]);
See this tutorial: http://www.regular-expressions.info/repeat.html
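To see that the three patterns agree, here is a sketch that runs each of them over a made-up fragment. It assumes the first pattern starts with a non-capturing group around the negative lookahead; the HTML and the $results name are hypothetical.

```php
<?php
// Hypothetical HTML fragment with two headings, one in upper case.
$content = "<h2>First</h2>\n<p>text</p>\n<H2>Second</H2>";

$patterns = [
    '~<h2>((?:(?!</h2>).)+)</h2>~is', // any character, as long as </h2> does not start there
    '~<h2>(.+?)</h2>~is',             // lazy: stop at the first </h2>
    '~<h2>([^<]+)</h2>~i',            // no nested tags allowed inside the heading
];

$results = [];
foreach ($patterns as $pat) {
    preg_match_all($pat, $content, $out);
    $results[] = $out[1];
    echo count($out[1]), ' headings: ', implode(', ', $out[1]), "\n";
}
```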