KingOfHeart Posted January 4, 2013
https://forums.phpfreaks.com/topic/272698-dummy-looking-for-a-little-help/

preg_match/preg_match_all... something about these functions always trips me up; there are way too many symbols. So when helping me, try to simplify what I need to do so I can understand it.

http://tf2spreadsheet.blogspot.com/

I want to extract the quality, class, item, refined, and alt version (notes not needed). How do I echo all this data? Using file_get_contents I can get the main part, but how do I split it up?

Please don't link me to "http://php.net/manual/en/function.preg-match-all.php". Too many patterns at once. Is there any way we can break it up and teach me, one step at a time, all the symbols I need to use?
requinix Posted January 4, 2013

Fortunately for you the source doc is public and accessible as CSV.

https://spreadsheets.google.com/pub?key=0AnM9vQU7XgF9dHZvNXl1dHJqZ21oNmp4UklSU2RUYXc&output=csv

You can use fopen+fgetcsv+fclose to read it.
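A minimal sketch of the fopen+fgetcsv+fclose approach suggested above. The function name read_csv_rows is hypothetical (it does not appear in the thread), and the sketch assumes the source is a path or URL that fopen() can read; opening a URL this way requires allow_url_fopen to be enabled.

```php
<?php
// Hypothetical helper, not from the thread: reads every row of a CSV
// source into an array of arrays using fopen + fgetcsv + fclose.
function read_csv_rows(string $source): array
{
    $handle = fopen($source, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $source");
    }

    $rows = [];
    // fgetcsv() returns false at end of file (or on error).
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // A blank line comes back as an array holding a single null.
        if ($row === [null]) {
            continue;
        }
        $rows[] = $row;
    }

    fclose($handle);
    return $rows;
}
```

Each element of the returned array is one spreadsheet row, so the columns can be echoed by index (for example $row[0]) once you have checked which column holds which field.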
KingOfHeart Posted January 4, 2013

I need it split up, and I really want to use preg_match. I've needed this function so often but have been too confused to use it other than by copy & paste.
requinix Posted January 4, 2013

Okay, I would accept "because I'd like to try using regular expressions" as a reason, though only after I made sure you understood that a regex is not the solution to scanning an HTML page.

But as it so happens regex is not a solution at all for this. The table is built entirely using JavaScript, with JSON returned from Google Docs. The closest you could get would be running the regex against the JSON, but it would be entirely ridiculous to try to construct something that would work as you'd want.

Sorry, but this isn't a problem you can use to learn regular expressions.
KingOfHeart Posted January 5, 2013 (edited)

Fine.. then how about turning bbcode into html output?

    [b]This area would be bolded[/b]
    [color = red]This color would be red[/color]

etc. How would I use preg for that?
KingOfHeart Posted January 5, 2013

I found a script I used, now all I need help with is for someone to teach me how it works exactly.

    /*
    function bbcode2html($message) {
        $preg = array(
            '/(?<!\\\\)\[color(?::\w+)?=(.*?)\](.*?)\[\/color(?::\w+)?\]/si' => "<span style=\"color:\\1\">\\2</span>",
            '/(?<!\\\\)\[size(?::\w+)?=(.*?)\](.*?)\[\/size(?::\w+)?\]/si' => "<span style=\"font-size:\\1pt\">\\2</span>",
            '/(?<!\\\\)\[font(?::\w+)?=(.*?)\](.*?)\[\/font(?::\w+)?\]/si' => "<span style=\"font-family:\\1\">\\2</span>",
            '/(?<!\\\\)\[align(?::\w+)?=(.*?)\](.*?)\[\/align(?::\w+)?\]/si' => "<div style=\"text-align:\\1\">\\2</div>",
            '/(?<!\\\\)\[b(?::\w+)?\](.*?)\[\/b(?::\w+)?\]/si' => "<span style=\"font-weight:bold\">\\1</span>",
            '/(?<!\\\\)\[i(?::\w+)?\](.*?)\[\/i(?::\w+)?\]/si' => "<span style=\"font-style:italic\">\\1</span>",
            '/(?<!\\\\)\[u(?::\w+)?\](.*?)\[\/u(?::\w+)?\]/si' => "<span style=\"text-decoration:underline\">\\1</span>",
            '/(?<!\\\\)\[center(?::\w+)?\](.*?)\[\/center(?::\w+)?\]/si' => "<div style=\"text-align:center\">\\1</div>",
            // [email]
            '/(?<!\\\\)\[email(?::\w+)?\](.*?)\[\/email(?::\w+)?\]/si' => "<a href=\"mailto:\\1\" class=\"bb-email\">\\1</a>",
            '/(?<!\\\\)\[email(?::\w+)?=(.*?)\](.*?)\[\/email(?::\w+)?\]/si' => "<a href=\"mailto:\\1\" class=\"bb-email\">\\2</a>",
            // [url]
            '/(?<!\\\\)\[url(?::\w+)?\]www\.(.*?)\[\/url(?::\w+)?\]/si' => "<a href=\"http://www.\\1\" target=\"_blank\" class=\"bb-url\">\\1</a>",
            '/(?<!\\\\)\[url(?::\w+)?\](.*?)\[\/url(?::\w+)?\]/si' => "<a href=\"\\1\" target=\"_blank\" class=\"bb-url\">\\1</a>",
            '/(?<!\\\\)\[url(?::\w+)?=(.*?)?\](.*?)\[\/url(?::\w+)?\]/si' => "<a href=\"\\1\" target=\"_blank\" class=\"bb-url\">\\2</a>",
            // [img]
            '/(?<!\\\\)\[img(?::\w+)?\](.*?)\[\/img(?::\w+)?\]/si' => "<img width = 100 height = 100 src=\"\\1\" alt=\"\\1\" class=\"bb-image\" />",
            '/(?<!\\\\)\[img(?::\w+)?=(.*?)x(.*?)\](.*?)\[\/img(?::\w+)?\]/si' => "<img width=\"\\1\" height=\"\\2\" src=\"\\3\" alt=\"\\3\" class=\"bb-image\" />",
            // [list]
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[\*(?::\w+)?\](.*?)(?=(?:\s*<br\s*\/?>\s*)?\[\*|(?:\s*<br\s*\/?>\s*)?\[\/?list)/si' => "\n<li class=\"bb-listitem\">\\1</li>",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[\/list(:(?!u|o)\w+)?\](?:<br\s*\/?>)?/si' => "\n</ul>",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[\/list:u(:\w+)?\](?:<br\s*\/?>)?/si' => "\n</ul>",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[\/list:o(:\w+)?\](?:<br\s*\/?>)?/si' => "\n</ol>",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[list(:(?!u|o)\w+)?\]\s*(?:<br\s*\/?>)?/si' => "\n<ul>",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[list:u(:\w+)?\]\s*(?:<br\s*\/?>)?/si' => "\n<ul class=\"bb-list-unordered\">",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[list:o(:\w+)?\]\s*(?:<br\s*\/?>)?/si' => "\n<ol>",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[list(?:)?(:\w+)?=1\]\s*(?:<br\s*\/?>)?/si' => "\n<ol class=\"bb-list-ordered,bb-list-ordered-d\">",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[list(?:)?(:\w+)?=i\]\s*(?:<br\s*\/?>)?/s' => "\n<ol class=\"bb-list-ordered,bb-list-ordered-lr\">",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[list(?:)?(:\w+)?=I\]\s*(?:<br\s*\/?>)?/s' => "\n<ol class=\"bb-list-ordered,bb-list-ordered-ur\">",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[list(?:)?(:\w+)?=a\]\s*(?:<br\s*\/?>)?/s' => "\n<ol class=\"bb-list-ordered,bb-list-ordered-la\">",
            '/(?<!\\\\)(?:\s*<br\s*\/?>\s*)?\[list(?:)?(:\w+)?=A\]\s*(?:<br\s*\/?>)?/s' => "\n<ol class=\"bb-list-ordered,bb-list-ordered-ua\">",
            //line breaks
            '/\n/' => "<br>",
            // escaped tags like \[b], \[color], \[url], ...
            '/\\\\(\[\/?\w+(?::\w+)*\])/' => "\\1"
        );
        $message = preg_replace(array_keys($preg), array_values($preg), $message);
        return $message;
    }
    */

    function bbcode2html($message) {
        $bbcode = array(
            "'\[center\](.*?)\[/center\]'is" => "<center>\\1</center>",
            "'\[left\](.*?)\[/left\]'is" => "<div style='text-align: left;'>\\1</div>",
            "'\[right\](.*?)\[/right\]'is" => "<div style='text-align: right;'>\\1</div>",
            "'\[pre\](.*?)\[/pre\]'is" => "<pre>\\1</pre>",
            "'\[b\](.*?)\[/b\]'is" => "<b>\\1</b>",
            "'\[quote\](.*?)\[/quote\]'is" => "<div class='top'><b>Quote:</b><hr>\\1</div>",
            "'\[i\](.*?)\[/i\]'is" => "<i>\\1</i>",
            "'\[u\](.*?)\[/u\]'is" => "<u>\\1</u>",
            "'\[s\](.*?)\[/s\]'is" => "<del>\\1</del>",
            "'\[url\](.*?)\[/url\]'is" => "<a href='\\1' target='_BLANK'>\\1</a>",
            "'\[url=(.*?)\](.*?)\[/url\]'is" => "<a href=\"\\1\" target=\"_BLANK\">\\2</a>",
            "'\[page=(.*?)\](.*?)\[/page\]'is" => "<a href=\"http://openzelda.thegaminguniverse.org/\\1\" target=\"_BLANK\">\\2</a>",
            "'\[img\](.*?)\[/img\]'is" => "<img border=\"0\" src=\"\\1\">",
            "'\[img=(.*?)\]'" => "<img border=\"0\" src=\"\\1\">",
            "'\[email\](.*?)\[/email\]'is" => "<a href='mailto: \\1'>\\1</a>",
            "'\[size=(.*?)\](.*?)\[/size\]'is" => "<span style='font-size: \\1;'>\\2</span>",
            "'\[font=(.*?)\](.*?)\[/font\]'is" => "<span style='font-family: \\1;'>\\2</span>",
            "'\[color=(.*?)\](.*?)\[/color\]'is" => "<span style='color: \\1;'>\\2</span>",
            "'\n'is" => "<br>",
            "' 'is" => " ",
            "' 'is" => " ",
            "'\[list=o\](.*?)\[/list\]'is" => "<ol>\\1</ol>",
            "'\[list=u\](.*?)\[/list\]'is" => "<ul>\\1</ul>",
            "'\[list\](.*?)\[/list\]'is" => "<ol>\\1</ol>",
            "'\[li\](.*?)\[/li\]'is" => "<li>\\1</li>",
            "'\[code\](.*?)\[/code\]'is" => "<div class='code'>\\1</div>",
            "'\[spoiler=(.*?)\](.*?)\[/spoiler\]'is" => "<a href=\"javascript:unhide('\\1');\">\\2</a>",
            "'\[hide=(.*?)\](.*?)\[/hide\]'is" => "<div id='\\1' class='hidden'>\\2</div>"
        );
        $message = preg_replace(array_keys($bbcode), array_values($bbcode), $message);
        return $message;
    }

What do the \\1 and \\2 mean?

(.*?) < where can I find out about this? What other symbols are there?
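The \\1 / \\2 and (.*?) pieces asked about here can be seen in isolation with a single pattern. This is a small sketch built on the [color] rule from the posted script, with the optional (?::\w+)? parts left out to keep it short; the variable names are only for illustration.

```php
<?php
// Each (...) in the pattern is a capture group, numbered left to right.
// (.*?) means "any characters, as few as possible".
$pattern = '/\[color=(.*?)\](.*?)\[\/color\]/si';

// In the replacement, \\1 is the text captured by the first group and
// \\2 the text captured by the second ($1 and $2 mean the same thing).
$replacement = '<span style="color:\\1">\\2</span>';

$html = preg_replace($pattern, $replacement, '[color=red]This color would be red[/color]');

echo $html, "\n"; // <span style="color:red">This color would be red</span>
```

The doubled backslash is only PHP string escaping: the regex engine sees \1 and \2.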
requinix Posted January 5, 2013 (edited)

The world of regular expressions is very complicated. Not something people can cover in an online forum. You should pick up a book (an actual book) on the subject; I believe the owl book is still the reigning champion. Meanwhile look for tutorials on Perl or PCRE syntax (and not POSIX).

Eh, I'll type out the basics. It's like 99% of what you may ever need.

Miscellaneous
    .         Dot-all. Any character except newlines
    |         Alternation. Either everything to the left or everything to the right. Can be limited to a group; can be chained

Metacharacters
    \\        Backslash
    \d        Digit
    \D        Non-digit (opposite of \d)
    \n        Unix newline (LF); \r\n is the Windows newline (CRLF)
    \r        Mac newline (CR)
    \s        Whitespace (spaces and tabs)
    \S        Non-whitespace (opposite of \w)
    \t        Tab
    \w        "Word" character (lower- and uppercase letters, numbers, and underscores)
    \W        Non-word character (opposite of \w)

Character sets (most special symbols, like . and | and ?, lose their meanings; metacharacters still apply)
    [abc]     Either 'a' or 'b' or 'c'
    [^abc]    Neither 'a' nor 'b' nor 'c'
    [a-c]     Either 'a' or 'c' or anything between them
    [-ac]     Either '-' or 'a' or 'c' (hyphen becomes normal at the beginning)
    [ac^-]    Either 'a', 'c', '^', or '-' (^ is only special at the beginning, hyphen becomes normal at the end)
    [a-c-f]   Either 'a', 'c', something between them, '-', or 'f' (cannot chain ranges, hyphen becomes normal)

Quantifiers (work on the preceding single unit)
    X         Exactly one of X
    X?        Zero or one of X (ie, X is optional)
    X*        Zero or more of X
    X+        One or more of X
    X{a}      Exactly a-many of X (eg, {2} is exactly two)
    X{a,b}    Between a- and b-many of X; either a or b can be optional (eg, {2,} is at least two)
    X*+       Possessive. Zero or more of X, as many as possible and no backtracking
    X+?       Lazy. One or more of X, as few as possible

Grouping and capturing
    (abc)     Group "abc" together as one unit and capture it for later
    \N        The N-th captured group, counting capturing "("s from left to right
    $N        Same as \N
    (?:abc)   Group "abc" together as one unit but do not capture it

Assertions/anchors (none of these capture or consume characters)
    ^         Beginning of the string
    $         End of the string
    \b        Word boundary (between a \w and a \W)
    \B        Non-word boundary (opposite of \b)
    (?=X)     Positive lookahead. Ensure that X follows immediately
    (?!X)     Negative lookahead. Ensure that X does not follow immediately
    (?<=X)    Positive lookbehind. Ensure that X preceded immediately
    (?<!X)    Negative lookbehind. Ensure that X did not precede immediately

Flags
    /e        Eval, preg_replace() only. Replacement string is evaluated as PHP code first. Deprecated in favor of using preg_replace_callback()
    /i        Letters (besides \w) are case-insensitive
    /m        ^ and $ can also match the beginning and end of a line
    /s        Dot-all will match newlines
    /x        Whitespace is ignored, line comments allowed

See also the manual.
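Two of the entries above are easier to see by running them. The sketch below is only an illustration (the sample strings and variable names are made up for the purpose): it compares a greedy quantifier with a lazy one, and shows the (?<!\\\\) negative lookbehind that the posted bbcode script puts in front of every tag.

```php
<?php
$text = '[b]one[/b] and [b]two[/b]';

// Greedy: .* grabs as much as it can, so one match spans both tags.
preg_match_all('/\[b\](.*)\[\/b\]/', $text, $greedy);

// Lazy: .*? stops at the first [/b] it can, giving two separate matches.
preg_match_all('/\[b\](.*?)\[\/b\]/', $text, $lazy);

print_r($greedy[1]); // one element: "one[/b] and [b]two"
print_r($lazy[1]);   // two elements: "one" and "two"

// (?<!\\\\) is a negative lookbehind for a literal backslash: PHP turns
// \\\\ into \\, which the regex engine reads as one escaped backslash.
// A tag written as \[b] is therefore skipped and left alone.
$escaped = preg_replace(
    '/(?<!\\\\)\[b\](.*?)\[\/b\]/',
    '<b>$1</b>',
    '\[b]raw[/b] [b]bold[/b]'
);

echo $escaped, "\n"; // \[b]raw[/b] <b>bold</b>
```

This is why nearly every rule in the bbcode script uses (.*?) rather than (.*): with several tags on one line, the greedy form would swallow everything between the first opening tag and the last closing tag.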
KingOfHeart Posted January 5, 2013

Now that looks more like it. I will copy and save all the info.
requinix Posted January 5, 2013

Wish I could edit... One change:

    \s        Whitespace (spaces and tabs and newlines)
    \S        Non-whitespace (opposite of \s)
KingOfHeart Posted January 6, 2013

BTW, what does the :: mean?? I managed to create some new bbcodes but I'm not 100% sure on the required 1 space thing.

    '/(?<!\\\\)\[img(?::\s+)width(?::\w+)?=(.*?)(?::\s+)height(?::\w+)?=(.*?)(?::\s)src(?::\w+)?=(.*?)\]\]/si' => "<img width=\"\\1\" height=\"\2\" src=\"\\3\"/>",

< this should output an image with a width of 24 and a height of 40. I managed to create a bbcode without the width and height so far.
requinix Posted January 6, 2013

Only that first colon matters: it's a (?:...) where the first character inside is a colon.

And if you didn't guess, a backslash can escape an otherwise important character. "[abc]" would be a character set while "\[abc\]" is literally a left bracket, "abc", and a right bracket.
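Both points can be checked with a few lines. This is a sketch only: the [b:a1f3] input is a made-up example of a tag carrying a ":word" suffix, which is the form (?::\w+)? is written to allow, and the variable names are hypothetical.

```php
<?php
// (?::\w+)? breaks down as: (?: ... ) is a non-capturing group holding
// a literal colon followed by one or more word characters, and the
// trailing ? makes the whole group optional.
$pattern = '/\[b(?::\w+)?\](.*?)\[\/b(?::\w+)?\]/si';

// Because that group does not capture, (.*?) is still group 1.
$plain    = preg_replace($pattern, '<b>$1</b>', '[b]plain[/b]');
$suffixed = preg_replace($pattern, '<b>$1</b>', '[b:a1f3]with suffix[/b:a1f3]');

echo $plain, "\n";    // <b>plain</b>
echo $suffixed, "\n"; // <b>with suffix</b>

// Escaping: [abc] is a character set, \[abc\] is the literal text "[abc]".
$setMatchesLetter     = preg_match('/[abc]/', 'b');       // 1
$literalMatchesLetter = preg_match('/\[abc\]/', 'b');     // 0
$literalMatchesText   = preg_match('/\[abc\]/', '[abc]'); // 1
```

Written as (?::\s+), the same construct demands a literal colon before the whitespace, so a tag such as "[img width=..." with only a space after img does not satisfy it.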
KingOfHeart Posted January 6, 2013

Figured the backslash part since I used it lots of times in other scripts. I understand now for the colon.

Still can't get the required space so far..trying different combinations.

    '/(?<!\\\\)\[img(?::\s+)? width(?::\w+)?=(.*?)(?::\s+)? height(?::\w+)?=(.*?)(?::\s+)? src(?::\w+)?=(.*?)\]\]/si' => "<img width=\"\\1\" height=\"\2\" src=\"\\3\"/>",
requinix Posted January 6, 2013

What are you aiming for? If it's "" then what are the :\w+ things for?
KingOfHeart Posted January 6, 2013 (edited)

The w+ worked in all the other scripts so far, but none of the scripts required spacing. The w+ allowed the numbers after the equal sign, if I understood correctly.

I know \s+ or maybe just \s is used to check for spacing, but I'm not sure yet which symbols to use next to it.

Guess I could/should replace the ones that require numbers with a d+ if I only allowed numbers.. but I might also want to allow things like this "" or "" as an option.
requinix Posted January 6, 2013

Not those \w+s. I mean the ones with the colons. That you were asking about before.

I'd still love to hear exactly what strings you're trying to match but how about

    /\[img(?:\s+width=(\d+(?:px|%)?)|\s+height=(\d+(?:px|%)?)|\s+src=([^\s\]]+))+\]/i

Take a minute to look over that. $1 will be the width, $2 the height, and $3 the src. Possibly empty, like if one wasn't given.

Then you can feed that to preg_replace_callback() like

    $new = preg_replace_callback('...', function($matches) {
        // need the src, width, and height to work
        if (!isset($matches[1], $matches[2], $matches[3])) {
            return $matches[0]; // unchanged
        }

        list(, $width, $height, $src) = $matches;

        // you may want to check for a valid width and height here
        // eg, is there a risk that someone will use width=9999999px?

        if (ctype_digit($width)) {
            $width .= "px";
        }
        if (ctype_digit($height)) {
            $height .= "px";
        }

        return "<img src=\"" . htmlentities($src) . "\" style=\"width: {$width}; height: {$height}\" />";
    }, $old);
KingOfHeart Posted January 6, 2013

You're actually getting one step ahead, so let me state the problem. As of right now when I use it returns it as meaning plain text... it did not find this exact match.

--------------

I was wondering how I should go about setting limits. Haven't used too much preg yet.. would I just use it like this...

    function bbcode2html($message) {
        $preg = array(
            '/(?<!\\\\)\[img(?::\s+)? width(?::\w+)?=(.*?)(?::\s+)? height(?::\w+)?=(.*?)(?::\s+)? src(?::\w+)?=(.*?)\]\]/si' => "<img width=\"\\1\" height=\"\2\" src=\"\\3\"/>"
        );
        preg_replace_callback(array_keys($preg), array_values($preg), $message);// < do I call it like this???
        $message = preg_replace(array_keys($preg), array_values($preg), $message);
        return $message;
    }

    $new = preg_replace_callback('...', function($matches) {
        // need the src, width, and height to work
        if (!isset($matches[1], $matches[2], $matches[3])) {
            return $matches[0]; // unchanged
        }

        list(, $width, $height, $src) = $matches;

        // you may want to check for a valid width and height here
        // eg, is there a risk that someone will use width=9999999px?

        if (ctype_digit($width)) {
            $width .= "px";
        }
        if (ctype_digit($height)) {
            $height .= "px";
        }

        return "<img src=\"" . htmlentities($src) . "\" style=\"width: {$width}; height: {$height}\" />";
    }, $old);

-------------------------

I think I'll create a shorter function like [test ] and make it only work if I have a space after the word test, and work my way on from there.
KingOfHeart Posted January 6, 2013

Sweet, I got the image preg to work finally.... Not sure if this is how you would do it, but at least now I can use sizes if needed... I'll try to see if I can use your function.

    '/(?<!\\\\)\[img(?::\w+)?=(.*?)\]/si' => "<img src=\"\\1\" alt=\"\\1\"/>",
    '/(?<!\\\\)\[img(?:\s+)width(?::\w+)?=(.*?)(?:\s+)height(?::\w+)?=(.*?)(?:\s+)src(?::\w+)?=(.*?)\]/si' => "<img width=\"\\1\" height=\"\\2\" src=\"\\3\"/>",
KingOfHeart Posted January 6, 2013 (edited)

I don't consider myself ready to use something like this. If it becomes an issue I'll use non-preg or remove the rights for the user to use this function. Unless you think you can work with me, I'm going to just pass.

Also, besides px, they're allowed to use % as well.
KingOfHeart Posted January 6, 2013

I found something shorter and simpler.

    preg_match_all("/(?<!\\\\)\[img(?:\s+)width(?::\d+)?=(.*?)(?:\s+)height(?::\d+)?=(.*?)(?:\s+)src(?::\w+)?=(.*?)\]/si", $message, $matches);
    foreach ($matches[1] as $num) {
        $wid = "width=" . $num;
        $message = ($num > 700)? str_replace("$wid", "width=700", $message) : $message;
    }
    foreach ($matches[2] as $num2) {
        $hei = "height=" . $num2;
        $message = ($num2 > 700)? str_replace("$hei", "height=700", $message) : $message;
    }

I decided to remove px and % but don't know how to make the bbcode work only if you use numbers. I started to use d+ but it still converted the bbcode into html. I've got to go to bed now, but any idea how?

    '/(?<!\\\\)\[img(?:\s+)width(?::\d+)?=(.*?)(?:\s+)height(?::\d+)?=(.*?)(?:\s+)src(?::\w+)?=(.*?)\]/si' => "<img width=\"\\1\" height=\"\\2\" src=\"\\3\"/>",

I want to convert it to an image with the width and height < this works fine
and I want to convert to plain text as < no idea how

Hope all I have to do is add another symbol to the search.
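One possible way to get the "numbers only" behaviour, offered as a sketch rather than as the thread's answer: in the pattern above the optional (?::\d+)? only allows a ":digits" suffix after the word width, while the value itself is still captured by (.*?), which accepts anything. Capturing with (\d+) instead means a non-numeric size never matches, so the tag stays as plain text. The function name img_bbcode_to_html is hypothetical, the 700 limit is taken from the post, and the callback form follows the preg_replace_callback() suggestion made earlier in the thread.

```php
<?php
// Hypothetical helper, not from the thread: converts
// [img width=N height=N src=URL] only when both sizes are digits,
// and caps each dimension at $max.
function img_bbcode_to_html(string $message, int $max = 700): string
{
    // (\d+) replaces (.*?) so non-numeric sizes simply fail to match
    // and the tag is left in the message untouched.
    $pattern = '/(?<!\\\\)\[img\s+width=(\d+)\s+height=(\d+)\s+src=([^\s\]]+)\]/i';

    $result = preg_replace_callback($pattern, function (array $m) use ($max) {
        $width  = min((int) $m[1], $max);
        $height = min((int) $m[2], $max);
        $src    = htmlspecialchars($m[3], ENT_QUOTES);

        return "<img width=\"$width\" height=\"$height\" src=\"$src\"/>";
    }, $message);

    // preg_replace_callback() returns null on a regex error; fall back
    // to the original message in that case.
    return $result ?? $message;
}
```

Doing the clamping inside the callback also replaces the separate preg_match_all() and str_replace() passes, since each match is limited at the moment it is converted.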