natasha_thomas Posted May 14, 2011

Folks,

I am using SIMPLEHTMLPARSER, but I am not able to parse the HTML below. Nothing shows up when I run:

var_dump($html->find('div[id=Teaser_Item] img[src]', 0));

What I actually want to extract is the IMG SRC, which is:

http://wap.ebay.com/Pages/RbHttpHandler.ashx?width=313&height=592&fsize=999000&format=jpg&url=http%3A%2F%2Fi.ebayimg.com%2F00%2F%24%28KGrHqN%2C!jEE2n%28iTLozBNwBPG0bUg~~0_1.JPG%3Fset_id%3D8800005007

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html><head><META http-equiv="Content-Type" content="text/html; charset=UTF-8"><META HTTP-EQUIV="expires" CONTENT="0"><META HTTP-EQUIV="cache-control" CONTENT="no-cache"><META HTTP-EQUIV="pragma" CONTENT="no-cache"><META name="google-site-verification" content="2AT13qxpDWUTCw-6xXa-Hme6iQ7ds3rYZ5cH5-_K13Y"><META http-equiv="Content-Style-Type" content="text/css"><title>YSLBlack Suede Platform Pumps Size 7 (39) - eBay Mobile (item 160586179890 5/14/2011 8:39:22 AM)</title><link rel="stylesheet" type="text/css" href="/nbinternal/global.css"><style>div.body {margin-left:5px !important;margin-right:5px !important;width:1253px;} div, p, td, span, li {color:#000000;} div.body > div, div.body > table {color:#000000;} hr {color:#000000;} a {color:#0000CC;} td.tabbed-active, a.tabbed-active {border-bottom-color:#FFFFFF;} div, td, form, li, input, select, textarea {font-size:12px;} .medium, .medium *, .medium td * {font-size:12px !important;} .headline *, .medium .headline * {font-size:14px !important;} .large .headline *, .headline .large * {font-size:16px !important;} .large, .large *, .large td * {font-size:14px !important;} .small .headline * {font-size:12px !important;} .small, .small *, .small td * {font-size:10px !important;} </style></head><body style="width:1253px;background-color:#FFFFFF;"><div class="body" style="width:1253px;background-color:#FFFFFF;"> <div style="margin-bottom: 
4px;background-color: #ffffff;" id="CommonHeader" class="pageheader mode1"><table class="pageheader" cellspacing="0" cellpadding="0"><tr><td class="logo" style="background-color: #ffffff;"><a href="/Default.aspx?emvAD=1263x592&aid=160586179890&emvcc=0"><img src="RbHttpHandler.ashx?width=1253&height=592&fsize=999000&format=gif&url=%7E%2FImages%2FeBayLogos%2Funscaled___ebay_logo_large.gif" alt="eBay mobile"></a></td></tr></table></div><div id="ebayLine1" class="separator mode1"> <img src="RbHttpHandler.ashx?width=1253&height=592&fsize=999000&format=gif&url=%2Fimages%2FeBayLines%2Funscaled___630.gif" alt="" class="separator "> </div><div id="Status" class="default"> <div style="margin-left: 5px;margin-right: 5px;padding-top: 4px;padding-bottom: 4px;border:none;" id="Teaser_Item" class="teaser mode11"><table cellpadding="0" cellspacing="0" style="width:100%;"><tr><td style="vertical-align:top;padding-right:2px;width:317px;" valign="top"><img src="RbHttpHandler.ashx?width=313&height=592&fsize=999000&format=jpg&url=http%3A%2F%2Fi.ebayimg.com%2F00%2F%24%28KGrHqN%2C%21jEE2n%28iTLozBNwBPG0bUg%7E%7E0_1.JPG%3Fset_id%3D8800005007" alt=""></td><td class="ttext" style="vertical-align:top;" valign="top"><strong>YSLBlack Suede Platform Pumps Size 7 (39)</strong></td></tr></table></div> <div style="padding-top: 0px;padding-bottom: 0px;border-color: #fae273;border-style: solid;border-width: 1px;border-top:none;border-left:none;border-right:none;background-color: #ffd869;background-image: url(RbHttpHandler.ashx?url=/images/BlockHeader/unscaled___630.gif);background-repeat: no-repeat;background-position: top-left;" id="BgHeader" class="text mode1 small"> <div> </div></div> <div style="padding-top: 4px;padding-bottom: 4px;vertical-align: middle;border-color: #bababa;border-style: solid;border-width: 1px;border-top:none;border-bottom:none;background-color: #f0eff7;text-align: center;" class="buttonmenu mode2"> <span id="ButtonRefresh" class="button-image" 
style="margin:0px;background-color: #7a7a7a;margin-right: 3px;"><a href="/Pages/ViewItem.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" class="button-inactive" style="color: #ffffff !important;border-color: #606060;border-style: solid;border-width: 1px;border-style: solid;"><strong>Refresh</strong></a></span> </div></div><div id="Content" class="default"> <div style="padding-top: 0px;padding-bottom: 0px;text-align: center;line-height: 1.5em;background-image: url(RbHttpHandler.ashx?url=~/Images/TabbedMenu_BgGradient.jpg);background-repeat: repeat-x;background-position: top-left;" id="MenuA" class="tabbedmenu mode1 small"> <table cellspacing="0" cellpadding="0"> <tr> <td style="color: #000000 !important;background-color: #ffffff;border-color: #bababa;border-style: solid;border-bottom-color: #ffffff;border-style: solid;text-align: center;line-height: 1.5em;" class="tabbed-active"> <a href="/Pages/ViewItem.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" style="color: #000000 !important;border:none;" id="ButtonMenuItem1" class="tabbed-active"> Summary</a> </td> <td style="color: #00008b !important;border-color: #bababa;border-style: solid;border-style: solid;text-align: center;line-height: 1.5em;" class="tabbed-inactive"> <a href="/Pages/ViewItemPic.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" style="color: #00008b !important;border:none;" id="ButtonMenuItem2" class="tabbed-inactive"> Picture</a> </td> <td style="color: #00008b !important;border-color: #bababa;border-style: solid;border-style: solid;text-align: center;line-height: 1.5em;" class="tabbed-inactive tabbed-last"> <a href="/Pages/ViewItemDesc.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" style="color: #00008b !important;border:none;" id="ButtonMenuItem3" class="tabbed-inactive"> Description</a> </td> </tr> </table> </div> <div style="vertical-align: top;border-color: #bababa;border-style: solid;border-width: 1px;border-top:none;border-bottom:none;" class="table mode1"><table cellpadding="0" cellspacing="0" 
style="vertical-align: top;"><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;font-weight:bold;">Item number:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top"><span style="color:#000000 !important;font-weight:bold;">160586179890</span></td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;font-weight:bold;">Last Bid:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top"><span style="color:#000000 !important;font-weight:bold;">US $99.00</span></td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;font-weight:bold;">Ended:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top"><strong>5/14/2011 8:39:22 AM</strong></td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">Bid count:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top"><span style="color:#000000 !important;">0</span></td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">High bidder:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top"><span style="color:#000000 !important;">-</span></td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">Quantity:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top"><span style="color:#000000 !important;">1</span></td></tr></table></div> <div style="padding-top: 0px;padding-bottom: 0px;border-color: #bababa;border-style: dotted;border-width: 
1px;border-bottom:none;border-left:none;border-right:none;" id="SeparatorLine1" class="text mode1"> <div></div></div> <div style="vertical-align: top;border-color: #bababa;border-style: solid;border-width: 1px;border-top:none;border-bottom:none;" class="table mode1"><table cellpadding="0" cellspacing="0" style="vertical-align: top;"><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">Seller:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top">namtalae (64)</td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">Feedback:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top">97.3% Positive</td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">Location:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top">Istanbul<br>TR</td></tr></table></div> <div style="padding-top: 0px;padding-bottom: 0px;border-color: #bababa;border-style: dotted;border-width: 1px;border-bottom:none;border-left:none;border-right:none;" class="text mode1"> <div></div></div> <div style="padding-top: 4px;vertical-align: top;border-color: #bababa;border-style: solid;border-width: 1px;border-top:none;border-bottom:none;" class="table mode1"><table cellpadding="0" cellspacing="0" style="vertical-align: top;"><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">Ships to:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top">Worldwide</td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">Postal costs:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" 
valign="top">US $25.50<br><a href="/Pages/ShippingCosts.aspx?emvAD=1263x592&aid=160586179890&emvcc=0">Additional</a></td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">Insurance:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top">Optional</td></tr><tr> <td style="padding: 4px;vertical-align: top;text-align: right;width: 45%;" valign="top"><span style="color:#999999 !important;">Payment<br>methods:</span></td> <td style="padding: 4px;vertical-align: top;width: 55%;" valign="top">PayPal</td></tr></table></div> <div style="padding-top: 0px;padding-bottom: 0px;border-color: #bababa;border-style: dotted;border-width: 1px;border-bottom:none;border-left:none;border-right:none;" id="SeparatorLine2" class="text mode1"> <div></div></div> <div style="padding-top: 3px;padding-bottom: 3px;border-color: #bababa;border-style: solid;border-width: 1px;border-bottom:none;background-color: #eaedf7;" id="PayPalInfo" class="text mode1"> <div></div></div> <div style="padding-top: 4px;padding-bottom: 4px;vertical-align: middle;border-color: #bababa;border-style: solid;border-width: 1px;background-color: #f0eff7;text-align: center;" class="buttonmenu mode2"> <span id="ButtonRefresh" class="button-image" style="margin:0px;background-color: #7a7a7a;margin-right: 3px;"><a href="/Pages/ViewItem.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" class="button-inactive" style="color: #ffffff !important;border-color: #606060;border-style: solid;border-width: 1px;border-style: solid;"><strong>Refresh</strong></a></span> </div> <div style="color: #808080 !important;" id="NotFullInfo" class="text mode1"> <div style="color: #808080 !important;"><strong>Note:</strong> To view the full item listing, visit www.ebay.com using a computer before you bid or buy.</div></div> <div style="padding-top: 0px;padding-bottom: 0px;border-color: #bababa;border-style: solid;border-width: 
1px;border-bottom:none;border-left:none;border-right:none;" id="SeparatorLineBottom" class="text mode1"> <div></div></div> <div id="TextBreadcrump" class="text mode1"> <div><a href="/Pages/SearchResults.aspx?emvcc=0"><span style="color:#7A7A7A !important;font-weight:bold;"><</span> Results</a></div></div></div><div style="margin-top: 4px;" id="EBayLine2" class="separator mode1"> <img src="RbHttpHandler.ashx?width=1253&height=592&fsize=999000&format=gif&url=%2Fimages%2FeBayLines%2Funscaled___630.gif" alt="" class="separator "> </div> <div style="padding-top: 0px;padding-bottom: 0px;" id="MainMenu" class="buttonmenu mode1"> <table width="1253" cellspacing="0" cellpadding="0"> <tr> <td><a href="/Pages/Search.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" style="border:none;"><img src="RbHttpHandler.ashx?width=417&height=592&fsize=999000&format=gif&url=%2Fimages%2FButtonMenu%2Fen%2Fgif%2F630%2Funscaled___bmenu_highlight_left.gif" alt="Search"></a></td> <td><a href="/Member/MyEbay.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" style="border:none;"><img src="RbHttpHandler.ashx?width=417&height=592&fsize=999000&format=gif&url=%2Fimages%2FButtonMenu%2Fen%2Fgif%2F630%2Funscaled___bmenu_normal_mid.gif" alt="My eBay"></a></td> <td><a href="/Default.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" style="border:none;"><img src="RbHttpHandler.ashx?width=417&height=592&fsize=999000&format=gif&url=%2Fimages%2FButtonMenu%2Fen%2Fgif%2F630%2Funscaled___bmenu_normal_right.gif" alt="Home"></a></td></tr></table></div> <div style="padding-left: 7px;" id="FooterMenu" class="pipedmenu mode1 small"> <a href="/Pages/About/US.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" id="BAbout" class="piped-inactive">About eBay</a> <span>|</span> <a href="/Pages/UserAgreement/US.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" id="BUA" class="piped-inactive">User Agreement</a> <span>|</span> <a href="/Pages/Help.aspx?emvAD=1263x592&aid=160586179890&emvcc=0" id="BHelp" class="piped-inactive">Help</a> </div> <div 
id="Text1" class="text mode1 small"> <div>view ebay in<br>Mobile | <a href="http://www.ebay.com/?redirect=mobile"> Classic </a></div></div></div></body></html>

Can someone help me debug this, please?

Cheers,
Natasha Thomas
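
For comparison, here is a minimal sketch of the same lookup using PHP's built-in DOMDocument and DOMXPath classes instead of the parser named in the post. It is only a cross-check, not a diagnosis of why find() returns nothing. The markup is a trimmed, hypothetical stand-in for the page above (the example.JPG path is made up), and the ampersands are written as &amp;amp; so the sample parses without warnings.

```php
<?php
// Hypothetical, trimmed-down stand-in for the eBay Mobile page quoted above.
$html = <<<'HTML'
<html><body>
<div id="Status" class="default">
  <div id="Teaser_Item" class="teaser mode11">
    <table><tr>
      <td><img src="RbHttpHandler.ashx?width=313&amp;format=jpg&amp;url=http%3A%2F%2Fi.ebayimg.com%2F00%2Fexample.JPG" alt=""></td>
      <td class="ttext"><strong>YSLBlack Suede Platform Pumps Size 7 (39)</strong></td>
    </tr></table>
  </div>
</div>
</body></html>
HTML;

$doc = new DOMDocument();
// Real-world markup is rarely valid; collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Every <img> that has a src attribute and sits anywhere inside the div with id Teaser_Item.
$nodes = $xpath->query('//div[@id="Teaser_Item"]//img[@src]');

// Take the first match, or null when nothing matched.
$src = $nodes->length > 0 ? $nodes->item(0)->getAttribute('src') : null;
var_dump($src);
```

If this prints the src value while the original find() call prints nothing for the same input, the selector itself is probably not the problem, and it would be worth checking that $html was actually loaded before find() is called.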
QuickOldCar Posted May 14, 2011

I can't get that image with my own parser either, even after adding many exceptions and rules to try to match any type of image. Here is the tag, and I'll go through the issues.

<img src="RbHttpHandler.ashx?width=313&height=592&fsize=999000&format=jpg&url=http%3A%2F%2Fi.ebayimg.com%2F00%2F%24%28KGrHqN%2C%21jEE2n%28iTLozBNwBPG0bUg%7E%7E0_1.JPG%3Fset_id%3D8800005007" alt="">

Problem one: This is an internal (relative) link. There is no ./, ../ or host before the script name. I'm sure you could use some kind of pattern that starts from the script name, RbHttpHandler.ashx, but the simple parser isn't going to do that.

Problem two: There is no image file extension, which the simple parser looks for.

Problem three: If you visit this link, you'll see that it runs through a script.

http://cgi.ebay.com/RbHttpHandler.ashx?width=313&height=592&fsize=999000&format=jpg&url=http%3A%2F%2Fi.ebayimg.com%2F00%2F%24(KGrHqN%2C!jEE2n(iTLozBNwBPG0bUg~~0_1.JPG

And I can't even connect to either of these links to see the image.

http%3A%2F%2Fi.ebayimg.com%2F00%2F%24%28KGrHqN%2C%21jEE2n%28iTLozBNwBPG0bUg%7E%7E0_1.JPG%3Fset_id%3D8800005007
http%3A%2F%2Fi.ebayimg.com%2F00%2F%24%28KGrHqN%2C%21jEE2n%28iTLozBNwBPG0bUg%7E%7E0_1.JPG

I guess eBay is making every effort to keep people from scraping its data, so that they use one of its APIs instead.
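
The two links that would not open are still percent-encoded, so they are not usable as URLs until they are decoded. A rough sketch of how the relative src could be turned into an absolute URL, and how the inner image address could be pulled out of its url parameter, follows. The base directory http://wap.ebay.com/Pages/ is an assumption taken from the expected URL in the first post, and nothing here confirms that either address actually serves the image.

```php
<?php
// The src value as it appears in the quoted markup (relative, no host or path prefix).
$src = 'RbHttpHandler.ashx?width=313&height=592&fsize=999000&format=jpg'
     . '&url=http%3A%2F%2Fi.ebayimg.com%2F00%2F%24%28KGrHqN%2C%21jEE2n%28iTLozBNwBPG0bUg%7E%7E0_1.JPG%3Fset_id%3D8800005007';

// Assumption: the page was served from this directory, as the first post's expected URL suggests.
$base = 'http://wap.ebay.com/Pages/';

// A relative reference with no leading slash resolves against the base directory.
$absolute = $base . $src;

// Extract the query string and parse it; parse_str() URL-decodes each value.
$query = parse_url($absolute, PHP_URL_QUERY);
parse_str($query, $params);

echo $absolute, PHP_EOL;
// The decoded address of the underlying image on i.ebayimg.com.
echo $params['url'], PHP_EOL;
```

Simple string concatenation is enough here only because the src has no leading slash, ./ or ../ segments. A general-purpose resolver would need to handle those cases as well.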