lag Posted October 11, 2007

Hi there, long-time lurker, first-time poster. I am teaching myself PHP and trying to learn about regular expressions, preg_replace and that sort of thing, but I'm having some trouble figuring out what to do.

I'm trying to write a script to help import some old HTML files as blog posts for my website. These HTML files are 300-2000 lines long. I am currently using fgets() to read the files one line at a time, clean them, and write them to the database in the appropriate format once the entire article is completed.

My issue is that many of the links included in these posts no longer exist (some date back to the '90s), and I want my importing function to "fix" these links. Here is an example line I might get:

    <p>12. Then use a <a href="http://www.deadsite.com/randompage.html">coping saw</a> to cut the initial grooves onto the mark you just made.</p>

What I would like to do is parse each line and replace it like so:

    <p>12. Then use a <a href="http://www.google.com/search?q=coping+saw">coping saw</a> to cut the initial grooves onto the mark you just made.</p>

Basically I am taking the contents of the link (the part the user can read) and using it as a search query link to, say, Google or Wikipedia. There is no rhyme or reason to how these links are formatted other than the standard <a href> tag; sometimes there is more than one link on a line, sometimes none.

In my draft this is a two-line function, but of course that's my imagination. Any ideas? I would greatly appreciate it.

Link to comment: https://forums.phpfreaks.com/topic/72803-solved-partially-replace-a-string-with-data-from-the-same-string/
thedarkwinter Posted October 11, 2007

The str_replace function should work; it replaces all occurrences of the first param with the second. Using regex (preg_replace) can become more complicated.

    $input = str_replace("www.old.com", "www.new.com", $input);

cheers, tdw
lag Posted October 11, 2007 (Author)

I could use this to replace the content once I have it separated, but I also need a method of extracting the link text and the original link. For example, I want to go from this:

    Random Text <a href="http://randomsiteblah.com/rand4308/">Link Text 1</a> more random text.
    Random Characters <a href="http://anotherdomain.com/448random.html">Link Text 2</a> more random text.

to this:

    Random Text <a href="http://google.com/search?q=Link+Text+1">Link Text 1</a> more random text.
    Random Characters <a href="http://google.com/search?q=Link+Text+2">Link Text 2</a> more random text.

Each one is different, but they all follow a standard: the original link sits after <a href=" and before ">, and the link text sits after that and before the next </a>.
Psycho Posted October 11, 2007

This will do exactly as you requested. However, I have not done any extensive testing.

    <?php
    function replaceLinks ($string) {
        preg_match_all("|<a[^>]+>(.*)</a>|U", $string, $links, PREG_SET_ORDER);
        foreach ($links as $link) {
            $oldLink = $link[0];
            $linkText = $link[1];
            $searchParams = str_replace(' ', '+', $linkText);
            $newLink = '<a href="http://www.google.com/search?q='.$searchParams.'">'.$linkText.'</a>';
        }
        return str_replace($oldLink, $newLink, $string);
    }

    $string = '<p>12. Then use a <a href="http://www.deadsite.com/randompage.html">coping saw</a> to cut the initial grooves onto the mark you just made.</p>';
    $string = replaceLinks ($string);
    echo $string;

    //Output:
    //
    //<p>12. Then use a <a href="http://www.google.com/search?q=coping+saw">coping saw</a> to cut the initial grooves onto the mark you just made.</p>
    ?>

You could add some functionality to the function to strip out the actual link from the href param, test whether the link is currently valid, and only replace it if it is not.

EDIT: Modified function. The return was out of place and was exiting after the first replacement if there were multiple links!
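The link-checking idea could be sketched roughly as follows. This is only an illustration, not code from the thread: the names linkIsAlive and replaceDeadLinks are hypothetical, the pattern assumes double-quoted href values, and the type declarations assume PHP 7.1 or later. The checker is passed in as a callable so that the slow network lookup can be swapped for a cached or stubbed one.

```php
<?php
// Hypothetical helper: true when $url answers with a 2xx or 3xx status.
// Assumes allow_url_fopen is enabled; makes one HTTP request per link.
function linkIsAlive(string $url): bool
{
    $headers = @get_headers($url);
    if ($headers === false) {
        return false; // DNS failure, refused connection, and so on
    }
    // $headers[0] is the status line, e.g. "HTTP/1.1 200 OK"
    return (bool) preg_match('~^HTTP/\S+\s+[23]\d\d~', $headers[0]);
}

// Replaces only the links that $isAlive reports as dead.
function replaceDeadLinks(string $html, callable $isAlive): string
{
    // Group 1 is the href value, group 2 is the visible link text.
    $pattern = '~<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>~is';

    return preg_replace_callback($pattern, function (array $m) use ($isAlive): string {
        [$whole, $href, $text] = $m;
        if ($isAlive($href)) {
            return $whole; // working link: leave it untouched
        }
        // Build the search query from the link text without any nested tags.
        $query = urlencode(trim(strip_tags($text)));
        return '<a href="http://www.google.com/search?q=' . $query . '">' . $text . '</a>';
    }, $html);
}
```

A call such as replaceDeadLinks($theLine, 'linkIsAlive') would check each href over the network, so for files with many links it may be worth caching results per URL.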
lag Posted October 11, 2007 (Author)

Ah, that is exactly what I was trying to do. Now I see where my error was (in the syntax). Thank you so much for your help.

Also, very good idea about checking the links first! I will do that! Good catch on the multiple links. I will run this in my script for a bit and post here how my results came out.
lag Posted October 11, 2007 (Author)

OK, for some reason this breaks it...

    <p>Sample Here: <a href="http://www.cookies.com/index.html">Sample 1</a> <a href="http://www.cookies.com/index.html">Sample 2</a> (Mirror)</p>

Returns

    <p>Sample Here: <a href="http://www.cookies.com/index.html">Sample 1</a> | <a href="http://www.google.com/search?q=Low+Quality">Sample 2</a> (Mirror)</p>

Throughout the files I try to process, it seems to work and then not work sporadically. Here is my stripped-down debug script:

    <?
    $pointer = @fopen("processme.html", "r");
    if ($pointer) {
        while (!feof($pointer)) {
            $theLine = fgets($pointer, 4096);
            $theLine = replaceLinks($theLine); //keep apart for debugging
            echo $theLine;
        }
        fclose($pointer);
    }

    function replaceLinks ($string) {
        preg_match_all("|<a[^>]+>(.*)</a>|U", $string, $links, PREG_SET_ORDER);
        foreach ($links as $link) {
            $oldLink = $link[0];
            $linkText = $link[1];
            $searchParams = str_replace(' ', '+', $linkText);
            $newLink = '<a href="http://www.google.com/search?q='.$searchParams.'">'.$linkText.'</a>';
        }
        return str_replace($oldLink, $newLink, $string);
    }
    ?>
Psycho Posted October 11, 2007

OK, the str_replace() and the return need to be separated. Corrected function below. I also added trim() to the link text when creating the query parameters.

    <?php
    function replaceLinks ($string) {
        preg_match_all("|<a[^>]+>(.*)</a>|U", $string, $links, PREG_SET_ORDER);
        foreach ($links as $link) {
            $oldLink = $link[0];
            $linkText = $link[1];
            $searchParams = str_replace(' ', '+', trim($linkText));
            $newLink = '<a href="http://www.google.com/search?q='.$searchParams.'">'.$linkText.'</a>';
            $string = str_replace($oldLink, $newLink, $string);
        }
        return $string;
    }
    ?>
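As an alternative sketch (not the approach posted in this thread), the match and the replacement can be done in one pass with preg_replace_callback(), and urlencode() can build the query so that characters such as & in the link text are escaped as well as spaces. The function name replaceLinksOnePass and the $searchUrl parameter are made up for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Rewrites every <a>...</a> so that its href becomes a search for the link text.
// $searchUrl is a hypothetical parameter so another search engine can be swapped in.
function replaceLinksOnePass(string $html, string $searchUrl = 'http://www.google.com/search?q='): string
{
    return preg_replace_callback(
        // \b keeps the pattern from matching tags such as <abbr>;
        // .*? is the lazy equivalent of .* with the U modifier.
        '~<a\b[^>]*>(.*?)</a>~is',
        function (array $m) use ($searchUrl): string {
            $linkText = $m[1];
            // urlencode() turns spaces into "+" and escapes &, ?, # and similar
            $query = urlencode(trim($linkText));
            return '<a href="' . $searchUrl . $query . '">' . $linkText . '</a>';
        },
        $html
    );
}
```

Because each match is replaced where it is found, there is no separate str_replace() step to keep inside or outside a loop.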
lag Posted October 11, 2007 (Author)

Ah, good deal. Sorry for being such a noob... but I've learned a lot trying to fix it myself before you posted the solution, so hopefully I won't have to be a pain again, heh.
Psycho Posted October 12, 2007

FYI: I'm only "OK" at regular expressions. I don't completely understand that regular expression. I picked it up from the manual page on php.net where it explains preg_match_all().
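For anyone else puzzling over that pattern, a small experiment may help. The sample line below is made up; the pattern is the one used in replaceLinks(). It runs the pattern with and without the trailing U modifier to show why the modifier matters when a line holds more than one link.

```php
<?php
$line = '<a href="http://a.example/">Sample 1</a> <a href="http://b.example/">Sample 2</a>';

// "|" is the pattern delimiter. <a[^>]+> matches the opening tag up to its
// first ">". (.*) captures the link text. The trailing U makes the
// quantifiers lazy (ungreedy), so (.*) stops at the nearest </a>.
preg_match_all('|<a[^>]+>(.*)</a>|U', $line, $lazy);

// Same pattern without U: .* is greedy and runs to the last </a> on the line.
preg_match_all('|<a[^>]+>(.*)</a>|', $line, $greedy);

print_r($lazy[1]);   // two captures: "Sample 1" and "Sample 2"
print_r($greedy[1]); // one capture that spans both links
```

Index 1 of the result holds the captured link text and index 0 holds the whole matched tag, which is what replaceLinks() reads as $link[1] and $link[0] under PREG_SET_ORDER.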