guymclarenza Posted November 9, 2023

This function is not working, and I also need some additional info from it. Assume $content is populated. I want to count outbound links (links going to another website) and internal links (those staying on the website but going to another page). I also want to list those links, grouped as outbound and internal, with link text and URL. It would also be nice if I could pass or fail the link text at the same time: if the link text is "click here" or something equally non-descriptive, it would be marked with a red cross, or with a green tick if it is good.

Why is this not working, and how do I change the script to add this functionality?

    function checkLinksForDescriptiveText($content) {
        // Example: Check links for descriptive link text
        preg_match_all('/<a [^>]*href=["\']([^"\']+)["\'][^>]*>([^<]+)<\/a>/i', $content, $links);
        $total_links = count($links[1]);
        $descriptive_links = 0;
        foreach ($links[2] as $linkText) {
            $linkText = strip_tags($linkText);
            if (strlen($linkText) > 0 && strlen($linkText) < 50) {
                $descriptive_links++;
            }
        }
        return [$total_links, $descriptive_links];
    }

Link to comment https://forums.phpfreaks.com/topic/317427-count-links-on-a-page-sort-them-and-grade-them/
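One likely reason the function above undercounts: the link-text group `([^<]+)` cannot cross a `<`, so any link whose text is wrapped in another tag (or is only an image) never matches. The pattern also requires a quoted `href` and a space right after `<a`. A minimal check, using made-up sample markup that is not from the original post:

```php
<?php
// The pattern from the post, unchanged.
$pattern = '/<a [^>]*href=["\']([^"\']+)["\'][^>]*>([^<]+)<\/a>/i';

// Hypothetical sample markup, for illustration only.
$content = '<a href="/about">About us</a>'
         . '<a href="/home"><strong>Home</strong></a>'
         . '<a href="/logo"><img src="logo.png" alt="Logo"></a>';

preg_match_all($pattern, $content, $links);

// Only the plain-text link is captured. The two links with nested tags are
// skipped because ([^<]+) needs at least one character that is not "<"
// directly after the opening tag.
$matched = count($links[1]);

echo "Links in sample: 3, links matched by the pattern: {$matched}", PHP_EOL;
```

If the real $content has links like the second and third ones, that alone would explain totals that look too low.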
Barand Posted November 9, 2023

This thread may help
guymclarenza (Author) Posted November 20, 2023

"This function parses the input using an HTML 4 parser. The parsing rules of HTML 5, which is what modern web browsers use, are different. Depending on the input this might result in a different DOM structure. Therefore this function cannot be safely used for sanitizing HTML. As an example, some HTML elements will implicitly close a parent element when encountered. The rules for automatically closing parent elements differ between HTML 4 and HTML 5 and thus the resulting DOM structure that DOMDocument sees might be different from the DOM structure a web browser sees, possibly allowing an attacker to break the resulting HTML."

Seems like that is not a fantastic option.
gizmola Posted November 22, 2023

On 11/20/2023 at 5:40 AM, guymclarenza said:
"This function parses the input using an HTML 4 parser. The parsing rules of HTML 5, which is what modern web browsers use, are different. Depending on the input this might result in a different DOM structure. Therefore this function cannot be safely used for sanitizing HTML."

What function are you talking about?
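For reference, the quoted warning appears to come from the PHP manual page for DOMDocument::loadHTML (an assumption, since the post does not name the function). That warning concerns sanitizing HTML; reading links out of a page is a different use, so a DOM-based approach may still be workable here. A rough sketch of the counting, grouping, and grading asked for in the first post follows. The function name analyseLinks(), the $siteHost parameter, the list of vague phrases, and the sample markup are all hypothetical choices for illustration, not anything from the thread:

```php
<?php
// Sketch only: analyseLinks(), $siteHost and the $vague list are
// illustrative assumptions, not part of the original script.
function analyseLinks(string $content, string $siteHost): array
{
    $vague = ['click here', 'here', 'read more', 'more', 'link'];

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $dom->loadHTML($content);                     // assumes $content is not empty
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = ['internal' => [], 'outbound' => []];

    foreach ($dom->getElementsByTagName('a') as $a) {
        $url = trim($a->getAttribute('href'));
        if ($url === '') {
            continue; // anchors without an href are not links
        }
        // Skip mailto:, tel:, javascript: and similar non-page links.
        $scheme = parse_url($url, PHP_URL_SCHEME);
        if (is_string($scheme) && !in_array(strtolower($scheme), ['http', 'https'], true)) {
            continue;
        }
        // textContent includes text inside nested tags such as <strong>.
        $text = trim(preg_replace('/\s+/', ' ', $a->textContent));
        $host = parse_url($url, PHP_URL_HOST);
        // Relative URLs have no host, so they are treated as internal.
        $type = (!is_string($host) || strcasecmp($host, $siteHost) === 0)
            ? 'internal'
            : 'outbound';

        $result[$type][] = [
            'text' => $text,
            'url'  => $url,
            'pass' => $text !== '' && !in_array(strtolower($text), $vague, true),
        ];
    }

    return $result;
}

// Usage with made-up markup and a made-up host name.
$sample = '<p><a href="/about">About us</a> '
        . '<a href="https://other.example/">click here</a> '
        . '<a href="https://example.com/pricing"><strong>Pricing</strong></a></p>';

$report = analyseLinks($sample, 'example.com');

foreach (['internal', 'outbound'] as $type) {
    echo ucfirst($type), ' links: ', count($report[$type]), PHP_EOL;
    foreach ($report[$type] as $link) {
        echo '  ', ($link['pass'] ? '✔' : '✘'), ' ', $link['text'], ' -> ', $link['url'], PHP_EOL;
    }
}
```

Some caveats with this sketch: the host comparison is exact, so www.example.com and example.com would count as different sites unless normalized first; image-only links have empty text and are marked as failing; and non-ASCII pages may need an explicit encoding hint before loading.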