meman1188 Posted December 20, 2007

I was wondering how, from a logic perspective, web bots are able to pick out the content picture in HTML. For example, Google News is very good at this, picking out the picture in the article that the news is coming from. Example 2: Facebook also does a pretty good job when you post a link and it puts a picture next to it. It is almost always the most appropriate image.

My thoughts on this so others can bounce off them... mostly I've come up with ways that don't work.

- It's not always the biggest picture (header pictures can be very large in area).
- You can't rely on someone naming a div 'content', so you can't narrow it down that way.
- The only one I came up with: a lot of background images are inserted through CSS, while the image I'm looking for should be an image tag. This helps increase the odds but still doesn't guarantee the right image. Not to say that's possible, but Google News seems to do very well.

Thanks for the help
redbullmarky Posted December 20, 2007

Facebook's way of doing things is quite slick. (For anyone that doesn't know: when sending a message, all you do is paste the URL of the article in the message window, and an AJAXy thing inserts a brief article preview as well as an image.) It lets you pick from any of the images on the page (aside from background ones). I've not noticed it pick the exact image yet (as I've only tried it once), but my guess would be that it's just analysing the page flow a bit, excluding things like 'logo.gif' (and other common element names), and picking an image close to a header tag, provided the page is formatted well enough.
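The guess above can be tried out with a rough sketch like the one below. It is not how Facebook or Google News actually work (nothing in this thread documents that); it only combines the two ideas already mentioned: look at real img tags only, so CSS backgrounds never enter the picture, skip filenames that look like page chrome, and prefer the image closest to a heading. The blocklist in `SKIP_PATTERN`, the choice of h1 to h3 as "header tags", and the use of tag count as a stand-in for distance are all assumptions made for illustration.

```python
from html.parser import HTMLParser
import re

# Hypothetical blocklist: filename fragments that usually mark page chrome.
SKIP_PATTERN = re.compile(r"logo|icon|spacer|banner|sprite|button", re.IGNORECASE)
# Assumption: only h1-h3 count as "header tags" for this heuristic.
HEADINGS = {"h1", "h2", "h3"}


class ImageCollector(HTMLParser):
    """Record the document-order position of every heading and <img> tag."""

    def __init__(self):
        super().__init__()
        self.position = 0            # running count of start tags seen
        self.heading_positions = []  # positions of h1/h2/h3 tags
        self.images = []             # (position, src) for each <img>

    def handle_starttag(self, tag, attrs):
        self.position += 1
        if tag in HEADINGS:
            self.heading_positions.append(self.position)
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append((self.position, src))


def pick_content_image(html):
    """Return the src of the non-chrome <img> nearest a heading, or None."""
    collector = ImageCollector()
    collector.feed(html)

    # Drop images whose filename looks like a logo, icon, and so on.
    candidates = [
        (pos, src) for pos, src in collector.images
        if not SKIP_PATTERN.search(src.rsplit("/", 1)[-1])
    ]
    if not candidates:
        return None
    if not collector.heading_positions:
        # No headings to measure against: fall back to the first candidate.
        return candidates[0][1]

    def distance(item):
        pos, _ = item
        return min(abs(pos - h) for h in collector.heading_positions)

    return min(candidates, key=distance)[1]


if __name__ == "__main__":
    sample = """
    <html><body>
    <div><img src="/static/logo.gif"></div>
    <img src="/img/header-wide.jpg">
    <p>nav</p><p>nav</p><p>nav</p>
    <h1>Title</h1>
    <p>Text</p>
    <img src="/photos/story.jpg">
    </body></html>
    """
    print(pick_content_image(sample))
```

On the sample page this skips logo.gif by name and prefers story.jpg over the wide header image because it sits closer to the h1. Like the ideas it is built from, it only improves the odds; a page with no headings or with oddly named files will still fool it.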