yanjchan Posted April 7, 2010

Hi! Does anyone here know how to find out which div has the most text in it on a remote webpage? For example, if the remote webpage looked like this:

<html>
<head>
<title>Hi</title>
</head>
<body>
<div id="big">sjdf0afh0u9 4uf0a udf8saudfo9aufosalfd</div>
<div id="small">iaosjfioajdf</div>
</body>
</html>

the PHP script would determine that the "big" div is the largest and return its id and contents. So far, I have found a method of doing this ... provided the site is pure XML. Unfortunately, not all webpages parsed by this script can be trusted to be well-formed XML. Thanks in advance!
salathe Posted April 7, 2010

The method that you discovered is probably what anyone else would suggest. Care to elaborate on your method, and on any shortcomings or faults that you've noticed with it (for instance, needing well-formed XML)?
yanjchan (Author) Posted April 7, 2010

Hi! Thanks for replying. The problem is that the DOM throws errors on things such as ul and nbsp. This is unacceptable because, for obvious reasons, I cannot ask everyone who submits a site to write perfectly good XML without those elements. Thanks.
salathe Posted April 8, 2010

There are numerous tools that you can use to coerce HTML into well-formed HTML or XML (the latter being most useful to you); one such tool is HTML Tidy.
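
To make the suggestion above concrete, here is a minimal sketch of one way it could look in PHP. It is not the original poster's method (which was never shown in the thread). The function name findLongestDiv is hypothetical. The sketch assumes the DOM extension is available, uses the Tidy extension only if it happens to be loaded, and otherwise relies on DOMDocument::loadHTML, which is more forgiving than the XML loader with markup such as ul and nbsp.

```php
<?php
// Hypothetical helper (not from the thread): returns the id, text and length
// of the <div> containing the most text, or null if there is no <div>.
function findLongestDiv(string $html): ?array
{
    if (trim($html) === '') {
        return null; // loadHTML() rejects empty input
    }

    // Optional clean-up step: only runs if the Tidy extension is installed.
    if (function_exists('tidy_repair_string')) {
        $config = ['output-xhtml' => true, 'numeric-entities' => true, 'wrap' => 0];
        $repaired = tidy_repair_string($html, $config, 'utf8');
        if (is_string($repaired) && $repaired !== '') {
            $html = $repaired;
        }
    }

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $best = null;
    foreach ($doc->getElementsByTagName('div') as $div) {
        $text = trim($div->textContent);
        $length = strlen($text); // byte length; an assumption, not a character count
        if ($best === null || $length > $best['length']) {
            $best = [
                'id'     => $div->getAttribute('id'),
                'text'   => $text,
                'length' => $length,
            ];
        }
    }

    return $best;
}
```

For a remote page, the markup could be fetched first, for example with file_get_contents($url) where allow_url_fopen is enabled, and then passed to the function. Two caveats about this sketch: textContent includes the text of nested divs, so an outer wrapper div will always measure at least as long as anything inside it, and strlen counts bytes rather than characters, which matters for multibyte text.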