Get div with longest length from remote webpage


yanjchan

Hi!

Does anyone here know how to find out which div has the most text in it on a remote webpage?

For example, if the remote webpage looked like this:

 

<html>
<head>
<title>Hi</title>
</head>
<body>
<div id="big">sjdf0afh0u9 4uf0a udf8saudfo9aufosalfd</div>
<div id="small">iaosjfioajdf</div>
</body>
</html>

 

the PHP script would find out that the "big" div was the biggest and return its id and contents.

 

So far, I have found a way of doing this, provided the page is well-formed XML. Unfortunately, not every webpage this script parses can be trusted to be well-formed.

 

Thanks in advance!


The method that you discovered is probably what anyone else would suggest. Care to elaborate on your method, and on any shortcomings or faults you've noticed with it (for instance, needing well-formed XML)?


Hi!

Thanks for replying.

The problem is that the DOM parser throws errors on things such as ul elements and nbsp entities.

This is unacceptable, because I cannot ask everyone who submits a site to write well-formed XML without those elements.

 

Thanks.
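
For what it's worth, one possible way around the strict-XML problem is DOMDocument::loadHTML(), which parses HTML rather than XML and so tolerates markup like ul and nbsp. Below is a minimal sketch, not a definitive solution. The function name findLongestDiv is made up for illustration, length is measured in bytes with strlen(), and a div that wraps other divs counts all of their text as its own, so it may win over the inner ones.

```php
<?php
// Hypothetical helper: returns ['id' => ..., 'text' => ..., 'length' => ...]
// for the <div> with the most text, or null if the markup has no <div>.
function findLongestDiv(string $html): ?array
{
    // loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return null;
    }

    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them,
    // then restore whatever setting was active before.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $best = null;
    foreach ($doc->getElementsByTagName('div') as $div) {
        // textContent includes the text of nested elements as well.
        $text = trim($div->textContent);
        $length = strlen($text); // byte length; an assumption of this sketch

        if ($best === null || $length > $best['length']) {
            $best = [
                'id' => $div->getAttribute('id'), // '' when the div has no id
                'text' => $text,
                'length' => $length,
            ];
        }
    }

    return $best;
}
```

For a remote page, the markup could be fetched first, for example with file_get_contents($url) where allow_url_fopen is enabled, and the resulting string passed to findLongestDiv().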

