tgavin Posted April 1, 2008
I'm working on stripping all of the special characters, etc. so that I have SEO-friendly URLs, but I'm not getting the results I want. Starting with the title "Cheech & Chong's Nice Dreams", I would like to end up with 'Cheech-and-Chongs-Nice-Dreams'. Nothing I've tried is working.
<?php
function create_slug($string) {
    return ereg_replace("[^A-Za-z0-9]", "-", $string);
}
$title = "Cheech & Chong's Nice Dreams";
$title_slug = create_slug($title);
echo $title_slug; // or return

// another attempt at the function body
return preg_replace('/\s/', '-', $str);

// i've tried more
$return = ereg_replace("[^A-Za-z0-9]", "-", $string);
$return = preg_replace('/\s/', '-', $return);
return str_replace('--', '-', $return);
?>
ugh.
dsaba Posted April 1, 2008
<?php
$str = 'Cheech & Chong\'s Nice Dreams';
$str = str_replace('&', 'and', $str);
// [^\w]+ matches a run of anything that is not a letter, digit or underscore
$str = preg_replace('~[^\w]+~', '-', $str);
echo $str; // Cheech-and-Chong-s-Nice-Dreams
?>
Notice how multiple non-alphanumeric chars are replaced by a SINGLE '-'.
tgavin (Author) Posted April 1, 2008
Quoting dsaba: "notice how multiple non-alphanumeric chars are replaced by a SINGLE '-'"
And how! Perfect. That leaves it wide open to add anything else I may need. Thanks!
wizman45 Posted April 7, 2008
It's amazing what you find out. I had the exact same problem. Thanks for posting the solution.
effigy Posted April 7, 2008
[^\w]+
FYI: This can be shortened to \W+.
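A quick way to check effigy's point is to run both spellings over the same title. This is only a sketch: `hyphenate` is a hypothetical helper name, and the title already has the ampersand swapped for "and" to keep the example short.

```php
<?php
// Replace each run of characters matched by $pattern with one hyphen,
// so the two spellings of the pattern can be compared side by side.
function hyphenate($pattern, $title)
{
    return preg_replace($pattern, '-', $title);
}

$title = "Cheech and Chong's Nice Dreams";
echo hyphenate('~[^\w]+~', $title), "\n"; // negated class, as posted by dsaba
echo hyphenate('~\W+~', $title), "\n";    // shorthand suggested by effigy
// Both lines print: Cheech-and-Chong-s-Nice-Dreams
```

Since \W is defined as the negation of \w, the two patterns should agree on any input; note that the underscore counts as a word character in both.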
BrandonK Posted April 8, 2008
Very nice idea tgavin. That's a lot more efficient than what I was using before. I added one more step to the replacement for slightly better results (it strips punctuation):
<?php
$str = 'Cheech & Chong\'s Nice Dreams';
$str = str_replace('&', 'and', $str);
// First pattern deletes punctuation, second turns what is left between words into hyphens
$str = preg_replace(array('/[\p{P}]+/', '/[^\w]+/'), array('', '-'), $str);
echo $str;
This way you don't get Cheech-and-Chong-s-Nice-Dreams, but instead Cheech-and-Chongs-Nice-Dreams. YMMV.
\p{P} matches all punctuation (PHP 4.4.0+ and 5.1.0+).
tgavin (Author) Posted April 8, 2008
Effigy, thank you! Brandon, thank you too. I was using this because I don't know what the hell I'm doing:
<?php
function create_slug($str) {
    $str = str_replace("'", '', $str);
    $str = str_replace('&', 'and', $str);
    $str = str_replace('!', '', $str);
    $str = str_replace('?', '', $str);
    $str = str_replace('(', '', $str);
    $str = str_replace(')', '', $str);
    $str = str_replace('.', '', $str);
    // The pattern needs delimiters; a bare '[^\w]+' is read as [ ... ] delimiters plus an unknown modifier
    $str = preg_replace('/[^\w]+/', '-', $str);
    return strtolower($str);
}
?>
Throw that in with mod_rewrite and you have an SEO gold mine!
BrandonK Posted April 10, 2008
I discovered a bug in the regex that I posted. Since I am planning on using this function to replace my current method for creating slugs, I ran it through some of my odd product names. If your product name has a hyphen, it is stripped by the \p{P}, so "Cheech-Chong Up In Smoke" becomes "CheechChong-Up-In-Smoke". After a lot of trial and error, I came up with this:
<?php
$str = 'Cheech & Chong\'s Nice Dreams';
$str = str_replace('&', 'and', $str);
// Pass 1 deletes everything except word characters, whitespace, slashes and hyphens
// (the hyphen sits last in the class so it is read as a literal, not as a range).
// Pass 2 turns each remaining run of non-word characters into a single hyphen.
$str = preg_replace(array('/[^\w\s\/-]+/', '/[^\w]+/'), array('', '-'), $str);
echo $str;
It loses some of the simplicity we had earlier, but it allows the above string to come out as "Cheech-Chong-Up-In-Smoke", and it also turns slashes into hyphens like I needed it to. If someone else has an idea to clean it up, I'd be happy to run it through my tests.
P.S. For those that are interested, here's what I found out about the \p variations:
not matched by \p{P}: `~$^+=<>
{Pc}: _
{Pd}: -
{Pe}: )]}
{Pf}: ????? - no clue, probably has to do with the placement of the punctuation
{Pi}: ????? - no clue, probably has to do with the placement of the punctuation
{Po}: !@#%&*:"?\;',./
{Ps}: ({[
Not much more info here: http://www.php.net/manual/en/reference.pcre.pattern.syntax.php
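For anyone who wants to repeat that experiment, a small probe along these lines should do it. This is a sketch, not BrandonK's test code: `punctuation_classes` is a hypothetical helper, and it assumes a PCRE build with Unicode property support and UTF-8 input. In the PCRE documentation, Pi is "initial punctuation" and Pf is "final punctuation", which in practice means opening and closing quotation marks such as « and ».

```php
<?php
// Return the punctuation sub-categories that match $char.
// Assumes $char is exactly one UTF-8 encoded character.
function punctuation_classes($char)
{
    $classes = array('Pc', 'Pd', 'Ps', 'Pe', 'Pi', 'Pf', 'Po');
    $matched = array();
    foreach ($classes as $class) {
        // The u modifier makes PCRE treat pattern and subject as UTF-8.
        if (preg_match('/^\p{' . $class . '}$/u', $char) === 1) {
            $matched[] = $class;
        }
    }
    return $matched;
}

// "\xC2\xAB" and "\xC2\xBB" are the UTF-8 bytes for « and ».
$samples = array('_', '-', '(', ')', "\xC2\xAB", "\xC2\xBB", '!', '$');
foreach ($samples as $char) {
    echo $char, ': ', implode(',', punctuation_classes($char)), "\n";
}
```

Characters such as $ or + come back with an empty list because Unicode files them under symbols (\p{S}) rather than punctuation.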
tgavin (Author) Posted April 11, 2008
I noticed that bug too. After trial and error, scratching my head, and pulling my hair, I finally realized why my previous database records were being rewritten. The solution I had wasn't very elegant. However, it did keep the title intact, sans the punctuation, and made everything lowercase. The problem I see in your fix is that if somebody was looking for 'up in smoke', they would be more apt to type 'cheech-and-chongs', not 'Cheech-Chong'. It's also easier to read in a search result, and to remember. Making it lowercase just simplifies everything. Plus, I *think* it's common practice too.
BrandonK Posted April 11, 2008
I threw Cheech-Chong in there to show that it preserves existing hyphens. Your initial example still works with Cheech & Chong. I tend to make URLs lowercase because servers can be case sensitive. I do think that search engines are smart enough not to mistake Cheech for cheech, but it's always better to err on the side of caution. strtolower() is easy to add.
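Pulling the pieces of the thread together, one possible shape for the whole helper is sketched below. `create_slug_sketch` is a hypothetical name, and the choices here (ASCII letters and digits only, apostrophes dropped, leading and trailing hyphens trimmed) are assumptions for illustration rather than anything a poster settled on.

```php
<?php
// Sketch of a slug helper combining the ideas in this thread.
function create_slug_sketch($title)
{
    $slug = str_replace('&', 'and', $title);
    // Drop apostrophes so "Chong's" becomes "Chongs" rather than "Chong-s".
    $slug = str_replace("'", '', $slug);
    // Collapse every run of characters other than ASCII letters and digits
    // (spaces, punctuation, existing hyphens, underscores) into one hyphen.
    $slug = preg_replace('/[^A-Za-z0-9]+/', '-', $slug);
    // Trim hyphens left over from leading or trailing punctuation.
    $slug = trim($slug, '-');
    return strtolower($slug);
}

echo create_slug_sketch("Cheech & Chong's Nice Dreams"), "\n"; // cheech-and-chongs-nice-dreams
echo create_slug_sketch('Cheech-Chong Up In Smoke'), "\n";     // cheech-chong-up-in-smoke
```

Existing hyphens survive because a hyphen is simply replaced by a hyphen, and digits are kept, so "Shrek" and "Shrek 2" map to different slugs.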
tgavin (Author) Posted April 12, 2008
Little problem: I noticed that numbers are disappearing. If I use shrek, then shrek2, the original shrek is overwritten.
BrandonK Posted April 14, 2008
They shouldn't be. You'd have to post your code, but numerics should stay:
<?php
$str = 'Shrek2 vs Shrek 2';
$str = str_replace('&', 'and', $str);
// \w covers digits, so neither pass removes them; the hyphen is last in the class so it is a literal
$str = preg_replace(array('/[^\w\s\/-]+/', '/[^\w]+/'), array('', '-'), $str);
echo $str; // Shrek2-vs-Shrek-2
tgavin (Author) Posted April 14, 2008
My bad. It was something else. I forgot that I had made this post or I would have mentioned it. Sorry.