l3rodey Posted March 14, 2016
Hi everyone, I just have a quick question (I'm not a hard-core programmer, I just know my way around). I have some PHP running in the back end of my site that strips things like line breaks and comments from my HTML before rendering, to produce smaller HTML. It uses preg_replace() and str_replace() to do so. What I want to do feels like it should be possible, but I need to know how. I have <h2>This is the title</h2> on my page, and I want it to look like this: <h2 id="this-is-the-title">This is the title</h2>. I spent hours this morning trying to work it out but got nowhere. All I can manage is id="1" with the number increased by 1 each time, which is not what I want. I want it to read the h2 and replace special characters and spaces with -. Is there a way of doing this? Here are a couple of the preg_replace() and str_replace() calls that are already there:
    $start_buffer = preg_replace('/" style=/', '"style=', $start_buffer);
    $start_buffer = preg_replace('/" class=/', '"class=', $start_buffer);
    $start_buffer = str_replace("; }", "}", $start_buffer);
    $start_buffer = str_replace("{ ", "{", $start_buffer);
As you can see, all this really does is remove spaces that are not required. Any help whatsoever will be greatly appreciated.
Jacques1 Posted March 14, 2016 (edited)
First off: Why do you want to do this? None of what you've said so far makes a lot of sense. What are you going to do with all those auto-generated IDs? Which problem are they supposed to solve? And why on earth would you remove spaces from your HTML tags? Optimizing a website is a valid goal, but this is rather silly. Have you ever actually measured the performance gain? A much more effective approach, for example, would be to stop cluttering your HTML markup with inline styles and use a proper external CSS hierarchy instead.
l3rodey Posted March 14, 2016
Hi Jacques, the PHP file that does the HTML compression is really a caching tool; its job is to cache more than to compress. The reason it compresses is mostly to store a smaller cache file on our server. It was not an SEO strategy as such, more of a speed issue. The auto-generated IDs are for anchoring. Our site contains more than 300,000 pages, and we want to add a sidebar that follows the page, lists the H2 elements, and takes you to the matching section when you click one. I know there are other ways of doing this, but this is the way I wanted to go. I cannot go back and edit 300,000 pages to add the IDs manually, so a PHP script is best.
Jacques1 Posted March 14, 2016
l3rodey wrote: "The PHP file that does the HTML compression is really a caching tool; its job is to cache more than to compress. The reason it compresses is mostly to store a smaller cache file on our server. It was not an SEO strategy as such, more of a speed issue."
Like I said, it's a bit silly. Either use a professional minifier and actually measure the benefit, or simply keep your original HTML.
l3rodey wrote: "The auto-generated IDs are for anchoring. Our site contains more than 300,000 pages, and we want to add a sidebar that follows the page, lists the H2 elements, and takes you to the matching section when you click one. I know there are other ways of doing this, but this is the way I wanted to go. I cannot go back and edit 300,000 pages to add the IDs manually, so a PHP script is best."
OK. Regular expressions are a very poor approach for HTML parsing, though. I'd use an actual HTML parser together with a slugifier to generate the IDs.
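For readers following along, here is a minimal sketch of what the "slugifier" half of that suggestion could look like. The function name slugify() is made up for illustration and is not from any post in this thread, and the rule it applies (lowercase, then collapse anything that is not a-z or 0-9 into a single hyphen) is only one assumed interpretation of "replace special characters and spaces with -".

```php
<?php
// Hypothetical helper: turns heading text into an id-friendly slug.
// Assumption: only ASCII letters and digits are kept; everything else,
// including accented characters, collapses into a hyphen.
function slugify(string $text): string
{
    $slug = strtolower(trim($text));
    // Replace every run of characters other than a-z and 0-9 with one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    // Drop hyphens left over at either end.
    return trim($slug, '-');
}

echo slugify('This is the title'); // this-is-the-title
```

If the headings contain non-English text, a transliteration step before the preg_replace() call would probably be needed; that is left out of this sketch.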
l3rodey Posted March 14, 2016
Thanks, but this does not answer my question. The slugs I am fine with. Reading what is inside the h2 and then going back and editing that exact h2 is where I am stuck. I can make a slug, that is fine, but I am not sure how to read in PHP what is inside the h2 (which I will then slug) and then edit the h2 that the text came from. This is where I'm stuck.
Jacques1 Posted March 14, 2016
You haven't read the answer. The only reason you're stuck is that you're using the wrong approach (regular expressions). Use an actual HTML parser, and this will be done in 5 minutes and maybe 5 lines of code. Or you can spend a couple of hours on a buggy regex, but then you'll have to wait for somebody else.
requinix Posted March 14, 2016
Load the HTML into DOMDocument, use getElementsByTagName to get all the H2s, and loop over them. For each one, set the "id" attribute to whichever value you want, based on its textContent.
Reference: DOMDocument
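The steps requinix describes can be sketched roughly as follows. The function name addHeadingIds() is invented for this example, and the inline slug expression is only a placeholder for whatever slug logic you already have. The sketch assumes the input is a complete HTML page: DOMDocument::loadHTML() wraps bare fragments in html and body elements, and it may misread UTF-8 input that has no charset declaration.

```php
<?php
// Hypothetical function: gives every <h2> an id derived from its text.
function addHeadingIds(string $html): string
{
    $doc = new DOMDocument();

    // Real-world markup is rarely perfect, so collect libxml warnings
    // internally instead of letting them be emitted as PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('h2') as $h2) {
        // Placeholder slug logic (an assumption, not from the thread).
        $slug = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($h2->textContent)), '-');
        if ($slug !== '') {
            $h2->setAttribute('id', $slug);
        }
    }

    return $doc->saveHTML();
}
```

In the original poster's setup this would presumably be called on the output buffer, for example $start_buffer = addHeadingIds($start_buffer);, before the cache file is written. Note that this version sets the id unconditionally.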
Jacques1 Posted March 14, 2016
I gave him that link already. When you set the ID attribute, make sure you're not overwriting an existing ID. This may break JavaScript code or CSS rules. So first check if the attribute exists.
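A possible way to apply that check is sketched below. The helper name assignIdIfMissing() and the $usedIds bookkeeping array are assumptions made for this example. The numeric suffix for repeated headings goes beyond what was asked in the thread; it is included only because IDs must be unique within a page, and it does not account for IDs on elements the helper never sees.

```php
<?php
// Hypothetical helper: sets an id only when the heading has none, and
// appends -2, -3, ... if the candidate was already handed out on this page.
function assignIdIfMissing(DOMElement $heading, string $candidate, array &$usedIds): void
{
    if ($heading->hasAttribute('id')) {
        // Leave the existing id alone; scripts or CSS may depend on it.
        $usedIds[$heading->getAttribute('id')] = true;
        return;
    }

    $id = $candidate;
    $n = 2;
    while (isset($usedIds[$id])) {
        $id = $candidate . '-' . $n++;
    }

    $heading->setAttribute('id', $id);
    $usedIds[$id] = true;
}
```

Inside a loop over getElementsByTagName('h2'), you would start with $usedIds = []; and call assignIdIfMissing($h2, $slug, $usedIds); for each heading.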
requinix Posted March 14, 2016
Jacques1 wrote: "I gave him that link already."
I know. I was anticipating a "what is domdocument" reply.