spikeon Posted February 27, 2008

I'm trying to make a script that will reformat HTML into sexy, pretty code. To do that I need indents, but the indent level only changes in certain spots: it increases after an opening tag and decreases before a closing tag.

What I need is a script that will take this:

    <u><b><i><a href ='http://www.gmail.com/'><img src='gmail.png' /></a></i></b></u>

modified with this:

    //remove spacing and newlines
    $file = str_replace("\n", " ", $file);
    $file = str_replace("\t", " ", $file);
    $file = preg_replace('/\s+/', ' ', $file);
    $file = str_replace("/ >", "/>", $file);
    //add newlines in all the right places
    $file = str_replace(">", ">\n", $file);
    $file = str_replace("<", "\n<", $file);
    $file = str_replace("\n ", "\n", $file);
    $file = str_replace("\n\n", "\n", $file);

(which becomes this:)

    <u>
    <b>
    <i>
    <a href ='http://www.gmail.com/'>
    <img src='gmail.png' />
    </a>
    </i>
    </b>
    </u>

into this:

    <u>
        <b>
            <i>
                <a href ='http://www.gmail.com/'>
                    <img src='gmail.png' />
                </a>
            </i>
        </b>
    </u>

Here's what I've got:

    foreach($lines as $line){
        if(preg_match("[<][^/].*[^/][>]", $line)){
            $indent = $indent + 1;
        }
        if(preg_match("[<][/].*[/][>]", $line)){
            $indent = $indent - 1;
        }
        if($indent > 0){
            $breaks = "";
            for($i = 0; $i <= $indent; $i++){
                $breaks .= " ";
            }
            $line = $breaks . $line;
        }
        $total .= $line . "\n";
    }

What am I doing wrong?

Link to comment: https://forums.phpfreaks.com/topic/93347-finding-opening-tags-and-closting-tags-with-preg_match/
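A note on the loop in the question, offered as a sketch rather than a definitive fix. preg_match() expects the pattern to be wrapped in delimiters, so as far as I can tell "[<][^/].*[^/][>]" is read as the bracket-delimited pattern `<` followed by stray modifiers, and PHP warns instead of matching. The closing-tag pattern also demands a `/` right before the `>`, which `</a>` does not have, and the indent is changed before the opening-tag line is printed rather than after it. The version below assumes one tag or text fragment per array element; indent_lines() and its $unit parameter are hypothetical names, not code from the thread.

```php
<?php
// Sketch only: indent_lines() is a hypothetical helper name.
// Assumes $lines holds one tag or text fragment per element, as produced
// by the str_replace() steps in the question.
function indent_lines(array $lines, string $unit = "\t"): string
{
    $indent = 0;
    $out = '';
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip empty fragments
        }
        // Closing tag such as </b>: step back BEFORE printing the line.
        if (preg_match('#^</#', $line)) {
            $indent = max(0, $indent - 1);
        }
        $out .= str_repeat($unit, $indent) . $line . "\n";
        // Opening tag such as <b> or <a href='...'>: step in AFTER printing.
        // Self-closing tags ending in "/>" leave the level unchanged.
        if (preg_match('#^<[a-zA-Z][^>]*>$#', $line) && substr($line, -2) !== '/>') {
            $indent++;
        }
    }
    return $out;
}

echo indent_lines(['<u>', '<b>', "<img src='gmail.png' />", '</b>', '</u>'], '  ');
```

Note the `#` delimiters around each pattern, and that tags written without a trailing slash (for example `<br>`) would still need special handling.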
effigy Posted February 27, 2008

Have you tried tidy?
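For readers who do have the tidy extension available, a minimal sketch of what that suggestion might look like. It assumes the optional tidy extension is installed; tidy_or_null() is a hypothetical wrapper name, and the option values are just one plausible choice.

```php
<?php
// Sketch only: requires the optional tidy extension; tidy_or_null() is a
// hypothetical wrapper name used for illustration.
function tidy_or_null(string $html): ?string
{
    if (!function_exists('tidy_repair_string')) {
        return null; // extension not installed, e.g. on locked-down shared hosting
    }
    $config = [
        'indent'         => true, // indent block-level tags
        'output-xhtml'   => true, // emit XHTML-style markup
        'show-body-only' => true, // return the fragment, not a full document
        'wrap'           => 0,    // do not wrap long lines
    ];
    $result = tidy_repair_string($html, $config, 'utf8');
    return $result === false ? null : $result;
}

$pretty = tidy_or_null('<ul><li>one<li>two</ul>');
echo $pretty === null ? "tidy is not available\n" : $pretty;
```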
spikeon Posted February 27, 2008

Well, I don't have access to my php.ini file, so I find myself having to re-make a lot of these "extensions", and if I didn't code some of my own stuff every now and then, my fingers would rust and I'd start to speak in grunts.
spikeon Posted February 28, 2008

??? i'm still stumped, my question has not been answered
wpt394 Posted February 28, 2008

Explode each line into an array, use a test to determine if the previous line is an opening or closing tag, and keep a running tally. Suppose you start with

    <html>\n<head>\n</head>\n<body>\n<h1>Here is a title\n</h1>\n<p>Here is a paragraph\n</p>\n</body>\n</html>

Turn the code into a massive string, then explode it into an array...

    $string = '<html>\n<head>\n</head>\n<body>\n<h1>Here is a title\n</h1>\n<p>Here is a paragraph\n</p>\n</body>\n</html>';
    $array = explode('\n',$string);

Then output the code keeping track of how many indents each line should have...

    $indent_count = -1;
    for($i=0; $i<sizeof($array); $i++){
        if(substr($array[$i], 2) == '</'){$indent_count--;} //if the array element is a closing tag, subtract an indent
        else if(substr($array[$i], 1) == '<'){$indent_count++;} //otherwise, check to see if it is an opening tag -- if so, add one indent to the count
        for($j=0; $j<sizeof($indent_count); $j++){ //print as many indents as there are in the count
            echo '\t';
        }
        echo $array[$i].'\n'; //print the actual piece of code, followed by a new line
    }
wpt394 Posted February 28, 2008

Found a couple errors in my last post... use this...

    $string = '<html>\n<head>\n</head>\n<body>\n<h1>\nHere is a title\n</h1>\n<p>\nHere is a paragraph\n</p>\n</body>\n</html>';
    $array = explode('\n',$string);
    print_r($array);
    $indent_count = -1;
    for($i=0; $i<sizeof($array); $i++){
        if(substr($array[$i], 0, 1) == '<' && substr($array[$i], 0, 2) != '</'){$indent_count++;} //if an opening tag -- add one indent to the count
        //This next line is confusing but it just means:
        //"If this line is not a tag, and the last line was an opening tag, add an indent"
        if(substr($array[$i], 0, 1) != '<' && substr($array[$i-1], 0, 1) == '<' && substr($array[$i-1], 0, 2) != '</'){$indent_count++;}
        //Again confusing, but...
        //"If this line is a closing tag, and the last line was not a tag, subtract an indent"
        if(substr($array[$i-1], 0, 1) != '<' && substr($array[$i],0, 2) == '</'){$indent_count--;}
        for($j=0; $j<$indent_count; $j++){ //print as many indents as there are in the count
            echo '\t';
        }
        echo $array[$i].'\n'; //print the actual piece of code, followed by a new line
        if(substr($array[$i],0, 2) == '</'){$indent_count--;} //if the array element is a closing tag, subtract an indent
    }
wpt394 Posted February 28, 2008

Hi, me again... I was thinking about this code, and realized that for tags that don't have closing tags (for example image tags), you will end up adding an extra indentation. The way to fix it would just be to add another if statement to the loop that checks to see if the array element is an image tag, and if so, subtract an indentation... something like

    if(substr($array[$i],0, 4) == '<img'){$indent_count--;} //if the array element is an image tag, subtract the accidentally added indent
wpt394 Posted February 28, 2008

OK, last post, I promise. Here are the last three posts combined. Same concept, more elegant coding:

    //remove spacing and newlines
    $file = str_replace("\n", " ", $file);
    $file = str_replace("\t", " ", $file);
    $file = preg_replace('/\s+/', ' ', $file);
    $file = str_replace("/ >", "/>", $file);
    //add newlines in all the right places
    $file = str_replace(">", ">\n", $file);
    $file = str_replace("<", "\n<", $file);
    $file = str_replace("\n ", "\n", $file);
    $file = str_replace("\n\n", "\n", $file);

    $array = explode('\n',$file);
    $indent_count = 0;
    $newfile = '';
    for($i=0; $i<sizeof($array); $i++){
        /* If a closing tag, subtract an indent */
        if(substr($array[$i], 0, 2) == '</'){$indent_count--;}
        /* Add this line's indents */
        for($j=0; $j<$indent_count; $j++){$newfile .= '\t';}
        /* Add the actual line followed by a \n (since it was destroyed in the array explode) */
        $newfile .= $array[$i].'\n';
        /* If a tag, and not a closing tag, must be an opening tag, so add an indent */
        if(substr($array[$i],0, 1) == '<' && substr($array[$i],0, 2) != '</'){$indent_count++;}
        /* Previous step doesn't account for image tags (or any tag without a closing) so test for an image tag, and subtract indent if found */
        if(substr($array[$i],0, 4) == '<img'){$indent_count--;}
    }
    echo $newfile;
spikeon Posted February 28, 2008

I implemented that, and it still didn't work. I'll give out the FULL CODE so I can get help.

OK, here's a link to the page it's on: http://www.youwereloved.org/format.php

Here's the full code:

    <?php header("Content-type: text/html; charset=UTF-8") ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>Format Me</title>
    </head><body>
    <?php
    // edit number: 42
    function format_code($dir){
        //get site
        $file = file_get_contents($dir);
        //remove spacing and newlines
        $file = str_replace("\n", " ", $file);
        $file = str_replace("\t", " ", $file);
        $file = preg_replace('/\s+/', ' ', $file);
        $file = str_replace("/ >", "/>", $file);
        //add newlines in all the right places
        $file = str_replace(">", ">\n", $file);
        $file = str_replace("<", "\n<", $file);
        $file = str_replace("\n ", "\n", $file);
        $file = str_replace("\n\n", "\n", $file);
        $array = explode('\n',$file);
        $indent_count = 0;
        $newfile = '';
        for($i=0; $i<sizeof($array); $i++){
            /* If a closing tag, subtract an indent */
            if(substr($array[$i], 0, 2) == '</'){$indent_count--;}
            /* Add this line's indents */
            for($j=0; $j<$indent_count; $j++){$newfile .= '\s';}
            /* Add the actual line followed by a \n (since it was destroyed in the array explode) */
            $newfile .= $array[$i].'\n';
            /* If a tag, and not a closing tag, must be an opening tag, so add an indent */
            if(substr($array[$i],0, 1) == '<' && substr($array[$i],0, 2) != '</'){$indent_count++;}
            /* Previous step doesn't account for image tags (or any tag without a closing) so test for an image tag, and subtract indent if found */
            if(substr($array[$i],0, 4) == '<img' || substr($array[$i],0, 4) == '<br' || substr($array[$i],0, 4) == '<hr' ){$indent_count--;}
        }
        return $newfile;
    }
    $dir = "http://www.akirablaid.com/index.php";
    echo "<xmp>";
    echo format_code($dir);
    echo " ";
    echo "</xmp>"
    ?>
    </body></html>
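A likely reason the code above still prints everything at one level: PHP only interprets escape sequences such as \n and \t inside double-quoted strings. explode('\n', $file) therefore searches for a literal backslash followed by the letter n, finds none in a file that contains real newlines, and returns the whole document as a single element. '\s' is not a string escape in either quoting style. Separately, substr($line, 0, 4) is four characters long, so it can never equal the three-character '<br' or '<hr'. A small demonstration, using made-up sample text:

```php
<?php
// Made-up sample: three lines separated by real newline characters.
$text = "<p>\nhello\n</p>";

// Single quotes: '\n' is two characters, a backslash and the letter n.
$single = explode('\n', $text);
// Double quotes: "\n" is one newline character.
$double = explode("\n", $text);

echo strlen('\n'), ' ', strlen("\n"), "\n";     // 2 1
echo count($single), ' ', count($double), "\n"; // 1 3

// A four-character slice never equals a three-character string.
var_dump(substr('<br />', 0, 4) == '<br'); // bool(false)
var_dump(substr('<br />', 0, 3) == '<br'); // bool(true)
```

Switching the explode() separator, the indent string, and the line terminator to double quotes ("\n", "\t") should address the first problem.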
spikeon Posted February 28, 2008

Played around with it a little more... still doesn't work.

    <?php header("Content-type: text/html; charset=UTF-8") ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>Format Me</title>
    </head><body>
    <?php
    // edit number: 50
    function format_code($dir){
        //get site
        $file = file_get_contents($dir);
        //remove spacing and newlines
        $file = str_replace("\n", " ", $file);
        $file = str_replace("\t", " ", $file);
        $file = preg_replace('/\s+/', ' ', $file);
        $file = str_replace("/ >", "/>", $file);
        //make br's, img's and hr's kosher
        //$file = str_replace("<br>", "<br />" $file);
        //$file = str_replace("<br/>", "<br />" $file);
        //add newlines in all the right places
        $file = str_replace(">", ">\n", $file);
        $file = str_replace("<", "\n<", $file);
        $file = str_replace("\n ", "\n", $file);
        $file = str_replace("\n\n", "\n", $file);
        $array = explode('\n',$file);
        $indent_count = 0;
        $newfile = '';
        for($i=0; $i<sizeof($array); $i++){
            /* If a closing tag, subtract an indent */
            if(substr($array[$i], 0, 2) == '</'){$indent_count--;}
            /* Add this line's indents */
            for($j=0; $j<$indent_count; $j++){$newfile .= '\s';}
            //$array[$i] = ereg_replace("<img\s*(.*)\s*/*>", "<img \\1 />", $array[$i]);
            //$array[$i] = ereg_replace("<hr\s*(.*)/*\s*>", "<hr \\1 />", $array[$i]);
            /* Add the actual line followed by a \n (since it was destroyed in the array explode) */
            $newfile .= $array[$i].'\n';
            /* If a tag, and not a closing tag, must be an opening tag, so add an indent */
            if(substr($array[$i],0, 1) == '<' && substr($array[$i],0, 2) != '</'){$indent_count++;}
            /* Previous step doesn't account for image tags (or any tag without a closing) so test for an image tag, and subtract indent if found */
            if(substr($array[$i],0, 4) == '<img' || substr($array[$i],0, 3) == '<br' || substr($array[$i],0, 3) == '<hr' ){
                $indent_count--;
            }
        }
        return $newfile;
    }
    $dir = "http://www.akirablaid.com/index.php";
    echo "<xmp>";
    echo format_code($dir);
    echo " ";
    echo "</xmp>"
    ?>
    </body></html>
spikeon Posted February 28, 2008

Also, if someone can tell me how to do what I'm attempting to do in the commented-out lines, that'd be nice.
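On the commented-out lines: the str_replace() calls are missing the comma before $file, and ereg_replace() belongs to the POSIX regex functions that were removed in PHP 7. One possible way to get the same effect with the PCRE functions is sketched below. normalize_void_tags() is a hypothetical helper name, and the tag list is limited to the three tags mentioned in the thread.

```php
<?php
// Sketch only: normalize_void_tags() is a hypothetical helper, and the tag
// list (br, hr, img) is just the three tags mentioned in the thread.
function normalize_void_tags(string $html): string
{
    // Matches <br>, <br/>, <BR >, <img src="x">, <hr class="y" /> and so on.
    // Group 1 is the tag name, group 2 the attributes, if any.
    return preg_replace_callback(
        '#<(br|hr|img)\b([^>]*?)\s*/?>#i',
        function (array $m): string {
            $attrs = trim($m[2]);
            return '<' . strtolower($m[1]) . ($attrs === '' ? '' : ' ' . $attrs) . ' />';
        },
        $html
    );
}

echo normalize_void_tags('<br><BR/><img src="a.png"><hr  class="x" />'), "\n";
// <br /><br /><img src="a.png" /><hr class="x" />
```

Running this before the newline-insertion step would give every such tag a uniform " />" ending, so a single check on the last two characters is enough to skip the indent.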
wpt394 Posted March 5, 2008

This works, but a few notes:

1) For the "special" array (tags that shouldn't induce indents because they have no closing tags), things are case sensitive -- there's got to be a way to get around that, I'll leave that to you.

2) Your sample code contains errors (e.g. more </strong> tags than <strong> tags). That's a pain, and it screws up the output of the script... I'm sure you could write a bit of code that could test for that, and prevent changes to the indent_count on errors... I'll leave that one to you too. Do something where you have a for loop that tests each line, and for every opening tag of a particular sort do a ++ and for every closing tag of a particular sort do a --. Work from there... Or just don't write erroneous code!

3) I added a str_replace so that <br> (breaks) don't get their own line... Most scripts I've seen don't usually give each and every <br> tag its own line, but feel free to take that out if you want.

4) Your meta keyword and description tags are RIDICULOUS. CHANGE THEM NOW! First of all, search engines hardly even look at the keyword tag anymore (although description is important). Secondly, not only does keyword stuffing NOT help, in some cases it can actually get you penalized!

OK, with that... here is the code...

    $file = 'text.txt';
    $opFile = fopen ($file, "r");
    $string = fread ($opFile, filesize ($file));
    fclose ($opFile);

    //remove spacing and newlines
    $string = str_replace("\n", " ", $string);
    $string = str_replace("\t", " ", $string);
    $string = preg_replace('/\s+/', ' ', $string);
    $string = str_replace("/ >", "/>", $string);
    //add newlines in all the right places
    $string = str_replace(">", ">\n", $string);
    $string = str_replace("<", "\n<", $string);
    $string = str_replace("\n ", "\n", $string);
    $string = str_replace("\n\n", "\n", $string);
    $string = str_replace("\n<br", "<br", $string);

    // next part, break by lines (explode() rather than split(), which was removed in PHP 7)
    $array = explode("\n", $string);
    $indent_count = 0;
    $newfile = '';
    $specials = array('<br','<hr','<img','<?xml','<!DOC','<!--','<meta','<link','<INPUT');
    for($i=0; $i<sizeof($array); $i++){
        /* If a closing tag, subtract an indent */
        if(substr($array[$i], 0, 2) == '</'){$indent_count--;}
        /* Add this line's indents */
        for($j=0; $j<$indent_count; $j++){$newfile .= "\t";}
        /* Add the actual line followed by a newline (since it was destroyed in the explode) */
        $newfile .= $array[$i]."\n";
        /* If a tag, and not a closing tag, must be an opening tag, so add an indent */
        if(substr($array[$i],0, 1) == '<' && substr($array[$i],0, 2) != '</'){$indent_count++;}
        /* Previous step doesn't account for tags w/o closings, so test for them, and subtract indent if found */
        for ($z=0; $z<sizeof($specials); $z++){
            if(substr($array[$i],0, strlen($specials[$z])) == $specials[$z]){$indent_count--;}
        }
    }
    echo '<pre>'.htmlspecialchars (print_r ($newfile, TRUE)).'</pre>';
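Notes 1 and 2 above are left as exercises; one way they might be approached is sketched here. is_special_tag() is a hypothetical helper, the lower-cased $specials list is an assumption, and the max() clamp is a simple stand-in for real error detection rather than a full tag-balance check.

```php
<?php
// Sketch only: is_special_tag() and the lower-cased $specials list are
// illustrative, not part of the script above.
function is_special_tag(string $line, array $specials): bool
{
    foreach ($specials as $prefix) {
        // strncasecmp() compares the first N bytes, ignoring case.
        if (strncasecmp($line, $prefix, strlen($prefix)) === 0) {
            return true;
        }
    }
    return false;
}

$specials = ['<br', '<hr', '<img', '<?xml', '<!doc', '<!--', '<meta', '<link', '<input'];

var_dump(is_special_tag('<INPUT type="text">', $specials)); // bool(true)
var_dump(is_special_tag('<Img src="a.png">', $specials));   // bool(true)
var_dump(is_special_tag('<div>', $specials));               // bool(false)

// For note 2, clamping keeps a stray closing tag from pushing the count negative.
$indent_count = 0;
$indent_count = max(0, $indent_count - 1);
var_dump($indent_count); // int(0)
```

In the loop above, the inner for over $specials could then be replaced by a single is_special_tag() call, and each $indent_count-- by the clamped form.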