HaLo2FrEeEk Posted January 10, 2009

I'm building a service that needs to retrieve and parse information from another page on the internet: a page from Bungie.net's Halo 3 fileshare. Here's mine: http://www.bungie.net/stats/halo3/fileshare.aspx?gamertag=HaLo2FrEeEk

I need to get all the film and film clip entries off the page. Only two features identify a slot's contents as a film or film clip. One is the image, whose URL is either:
.../images/halo3stats/fileshareiconssm/filmclips/sm/...
or:
.../images/halo3stats/fileshareiconssm/films/sm/...
The other is the Film Length field.

I need to pull out the title, film length, and h3fileid from every slot, but ONLY if the slot is a film or film clip. Is there a way I can do this efficiently? What I have in mind now is sort of hard on the server. A regex parses the page and gets a list of the h3fileids for the film slots (using the image URL), then the script loops through that list, retrieves the page for each file, and gets its title and film length. That means one extra request per film, which I fear is a lot of work for the server and will probably take too long.

Since the information I need is all on the fileshare page itself, is there a way I can get the title, film length, and h3fileid for each film and film clip item, and only those items, in a single pass? Thank you in advance for any help.
HaLo2FrEeEk Posted January 10, 2009 (Author)

Sorry for the double post; I would have edited, but it wouldn't let me. Here was the edit: Actually, now that I think about it, I only need the h3fileid and title of each item. I'll get the film length when the user selects which film he wants. So can I just get the title and h3fileid for only film and filmclip items?

Here is an example of a NON film/filmclip item:
<div class="slotWrap"><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_messageBoxPanelPanel"> <div id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_messageBoxPanel"> </div> </div> <div id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_outerShell" class="user_content_mini_outer_shell"> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_screenshotBoxPanelPanel"> <div id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_screenshotBoxPanel" class="user_content_mini_box"> <div id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_ajaxLoadingPanel" style="display:none;"> <img src="/images/ajax-loading-horizontal.gif" alt="Loading..." 
width="30px" height="13px" style="text-align: center;" /> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_fsItemPanelPanel"> <div id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_fsItemPanel" class="user_content_mini_box_inner"> <div class="shareTitle"> <ul class="infoA"> <li class="float_right">0%</li> <li><h3><a id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_titleLink" href="/Online/Halo3UserContentDetails.aspx?h3fileid=61127153" target="primaryWindow">Valhalla X-Mas</a></h3></li> <li id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_fileFlagsListItem" class="float_right"> <a id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_hlFavIcon" href="/Online/Halo3UserContentDetails.aspx?h3fileid=61127153" target="primaryWindow"></a> <a id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_hlFileSetIcon" href="/Online/Halo3UserContentDetails.aspx?h3fileid=61127153" target="primaryWindow"><img id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_set_flag_image" src="/images/halo3stats/fileshareicons/linkedfile_icon.gif" alt="Part of File Set" style="height:16px;width:16px;border-width:0px;" /></a> </li> <li>Created 12.22.2008 by <a id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_authorLink" href="/Stats/Halo3/Default.aspx?player=HaLo2FrEeEk++++" target="primaryWindow">HaLo2FrEeEk </a></li> </ul> </div> <div class="share-mid"> <a id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_thumbnailLink" class="relative_image_container" href="/Online/Halo3UserContentDetails.aspx?h3fileid=61127153" target="primaryWindow"><img id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_fileThumbnail" class="not_screenshot_pic" src="/images/halo3stats/fileshareiconssm/maps/sm/valhalla.gif" style="border-width:0px;" /></a> <div class="shareCommon"> <ul class="infoC"> <li id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_xboxDownload_listitem"><div 
id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_xboxDownloadButtonPanel"> <a id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_xboxDownloadButton" href="javascript:__doPostBack('ctl00$mainContent$shareRepeater$ctl00$fileshareitem$xboxDownloadButton','')">Download to Halo 3</a> </div></li><li id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_dl_listitem">22 downloads</li><li id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_map_listitem">Map: Valhalla</li> </ul> </div> </div> <div class="clear"></div> <!--[if IE]><div class="IE_description_fix"><![endif]--> <div id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_bottomArea" class="description"> The crew of V-398 barely survived their unplanned landing in... </div> <!--[if IE]></div><![endif]--> <div class="ssMoreDetails"><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_moreDetailsLinkPanel"> <a id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_moreDetailsLink" href="/Online/Halo3UserContentDetails.aspx?h3fileid=61127153" target="primaryWindow">more details</a> </div></div> </div> </div> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_moreDetailsLinkPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_reportSpamLinkPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_reportResultsLabelPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_saveGalleryLinkButtonPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_fsItemPanelPanel"> </div> </div> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_empty_fsItemPanelPanel"> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_unpinned_ssPanelPanel"> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_actionBarPanel"> <div id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_actionBar" class="bottom_bar"> <ul 
class="links"> <li class="slotNum">Slot : 1 </li> <li id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_copyListItem"><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_shareCopyButtonPanel"> <a id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_shareCopyButton" href="javascript:__doPostBack('ctl00$mainContent$shareRepeater$ctl00$fileshareitem$shareCopyButton','')">copy to my share</a> </div></li> <li id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_removeListItem"><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_removeButtonPanel"> <a onclick="return confirm('Are you sure you wish to remove this item?');" id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_removeButton" href="javascript:__doPostBack('ctl00$mainContent$shareRepeater$ctl00$fileshareitem$removeButton','')">delete</a> </div></li> <li id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_groupListItem"><a id="ctl00_mainContent_shareRepeater_ctl00_fileshareitem_groupButton" onclick="return openFileSetAddWindow(61127153,0,'ctl00_topLevelControls_fileSetWindow');" href="../StatControls/#">add to file set</a></li> </ul> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_shareCopyButtonPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_trophyLinkButtonPanel"> </div> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl00_fileshareitem_adminInfoPanel"> </div> </div> And here is one that is a filmclip: <div class="slotWrap"><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_messageBoxPanelPanel"> <div id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_messageBoxPanel"> </div> </div> <div id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_outerShell" class="user_content_mini_outer_shell"> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_screenshotBoxPanelPanel"> <div id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_screenshotBoxPanel" 
class="user_content_mini_box"> <div id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_ajaxLoadingPanel" style="display:none;"> <img src="/images/ajax-loading-horizontal.gif" alt="Loading..." width="30px" height="13px" style="text-align: center;" /> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_fsItemPanelPanel"> <div id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_fsItemPanel" class="user_content_mini_box_inner"> <div class="shareTitle"> <ul class="infoA"> <li class="float_right">0%</li> <li><h3><a id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_titleLink" href="/Online/Halo3UserContentDetails.aspx?h3fileid=21305912" target="primaryWindow">flare</a></h3></li> <li>Created 11.17.2007 by <a id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_authorLink" href="/Stats/Halo3/Default.aspx?player=Chewyy+++++++++" target="primaryWindow">Chewyy </a></li> </ul> </div> <div class="share-mid"> <a id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_thumbnailLink" class="relative_image_container" href="/Online/Halo3UserContentDetails.aspx?h3fileid=21305912" target="primaryWindow"><img id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_fileThumbnail" class="not_screenshot_pic" src="/images/halo3stats/fileshareiconssm/filmclips/sm/construct.gif" style="border-width:0px;" /></a> <div class="shareCommon"> <ul class="infoC"> <li id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_xboxDownload_listitem"><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_xboxDownloadButtonPanel"> <a id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_xboxDownloadButton" href="javascript:__doPostBack('ctl00$mainContent$shareRepeater$ctl01$fileshareitem$xboxDownloadButton','')">Download to Halo 3</a> </div></li><li id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_dl_listitem">0 downloads</li><li id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_length_listitem">Film Length: 00:00:05</li> </ul> </div> </div> 
<div class="clear"></div> <!--[if IE]><div class="IE_description_fix"><![endif]--> <div id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_bottomArea" class="description"> suicide by flare </div> <!--[if IE]></div><![endif]--> <div class="ssMoreDetails"><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_moreDetailsLinkPanel"> <a id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_moreDetailsLink" href="/Online/Halo3UserContentDetails.aspx?h3fileid=21305912" target="primaryWindow">more details</a> </div></div> </div> </div> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_moreDetailsLinkPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_reportSpamLinkPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_reportResultsLabelPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_saveGalleryLinkButtonPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_fsItemPanelPanel"> </div> </div> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_empty_fsItemPanelPanel"> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_unpinned_ssPanelPanel"> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_actionBarPanel"> <div id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_actionBar" class="bottom_bar"> <ul class="links"> <li class="slotNum">Slot : 2 </li> <li id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_copyListItem"><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_shareCopyButtonPanel"> <a id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_shareCopyButton" href="javascript:__doPostBack('ctl00$mainContent$shareRepeater$ctl01$fileshareitem$shareCopyButton','')">copy to my share</a> </div></li> <li id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_removeListItem"><div 
id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_removeButtonPanel"> <a onclick="return confirm('Are you sure you wish to remove this item?');" id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_removeButton" href="javascript:__doPostBack('ctl00$mainContent$shareRepeater$ctl01$fileshareitem$removeButton','')">delete</a> </div></li> <li id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_groupListItem"><a id="ctl00_mainContent_shareRepeater_ctl01_fileshareitem_groupButton" onclick="return openFileSetAddWindow(21305912,0,'ctl00_topLevelControls_fileSetWindow');" href="../StatControls/#">add to file set</a></li> </ul> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_shareCopyButtonPanel"> </div><div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_trophyLinkButtonPanel"> </div> </div> <div id="ctl00_ctl00_mainContent_shareRepeater_ctl01_fileshareitem_adminInfoPanel"> </div> </div>
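For what it's worth, the one marker that separates the two samples above is the thumbnail path (.../maps/sm/... in the first, .../filmclips/sm/... in the second). Below is a minimal sketch of that check on its own. The helper name slotKind is hypothetical, and the two test strings are shortened copies of the thumbnails quoted above, not the full slot markup.

```php
<?php
// Hypothetical helper: classify a slot's HTML by its thumbnail path.
// Returns 'film', 'filmclip', or null for any other kind of item.
function slotKind(string $slotHtml): ?string
{
    if (preg_match('%/fileshareiconssm/(films|filmclips)/sm/%i', $slotHtml, $m)) {
        return strtolower($m[1]) === 'films' ? 'film' : 'filmclip';
    }
    return null;
}

// Shortened copies of the two thumbnails quoted in this thread.
$map  = '<img src="/images/halo3stats/fileshareiconssm/maps/sm/valhalla.gif" />';
$clip = '<img src="/images/halo3stats/fileshareiconssm/filmclips/sm/construct.gif" />';

var_dump(slotKind($map));   // NULL
var_dump(slotKind($clip));  // string(8) "filmclip"
```

Checking the path rather than the Film Length text keeps the test to a single short pattern, and it would still work if the length field were ever left out of a slot.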
effigy Posted January 12, 2009

<pre>
<?php
$content = file_get_contents('http://www.bungie.net/stats/halo3/fileshare.aspx?gamertag=HaLo2FrEeEk');
// One piece per slot: every slot's title block starts with this div.
$pieces = explode('<div class="shareTitle">', $content);
foreach ($pieces as $piece) {
    // Skip slots whose thumbnail is not a film or film clip icon.
    if (!preg_match('%/images/halo3stats/fileshareiconssm/film(?:clip)?s/sm/%', $piece)) {
        continue;
    }
    // Skip the slot if the expected fields are missing.
    if (!preg_match('%
        fileshareitem_titleLink[^>]+?
        h3fileid=(?P<h3fileid>\d+)
        [^>]+>
        (?P<title>.+?)
        </a>
        .+?
        Film\s+Length:\s+
        (?P<length>[\d:]+)
        %xis', $piece, $matches)) {
        continue;
    }
    // Keep only the named captures, not their numeric duplicates.
    $result = array();
    foreach (array_keys($matches) as $key) {
        if (!is_numeric($key)) {
            array_push($result, $matches[$key]);
        }
    }
    print_r($result);
}
?>
</pre>
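Since the follow-up post says only the h3fileid and title are needed, the same split-and-filter idea can be trimmed down. This is only a sketch under a few assumptions: the function name extractFilms is hypothetical, the Film Length part of the pattern is dropped, and the sample string is a shortened stand-in built from the two slots quoted in the thread, not the real page.

```php
<?php
// Hypothetical function: return h3fileid and title for film and film clip slots only.
function extractFilms(string $html): array
{
    $films = [];
    // Each slot's title block starts with this div, so split on it.
    foreach (explode('<div class="shareTitle">', $html) as $piece) {
        // Films and film clips are identified by their thumbnail path.
        if (!preg_match('%/fileshareiconssm/film(?:clip)?s/sm/%i', $piece)) {
            continue;
        }
        // The title link carries both the file ID and the title text.
        if (preg_match('%fileshareitem_titleLink[^>]+?h3fileid=(?P<h3fileid>\d+)[^>]*>(?P<title>.+?)</a>%is', $piece, $m)) {
            $films[] = [
                'h3fileid' => $m['h3fileid'],
                'title'    => trim($m['title']),
            ];
        }
    }
    return $films;
}

// Two shortened slots based on the samples in this thread (a map, then a film clip).
$sample = '<div class="shareTitle"><a id="ctl00_fileshareitem_titleLink" '
    . 'href="/Online/Halo3UserContentDetails.aspx?h3fileid=61127153" target="primaryWindow">Valhalla X-Mas</a>'
    . '<img src="/images/halo3stats/fileshareiconssm/maps/sm/valhalla.gif" />'
    . '<div class="shareTitle"><a id="ctl01_fileshareitem_titleLink" '
    . 'href="/Online/Halo3UserContentDetails.aspx?h3fileid=21305912" target="primaryWindow">flare</a>'
    . '<img src="/images/halo3stats/fileshareiconssm/filmclips/sm/construct.gif" />';

print_r(extractFilms($sample));
```

With the sample above, only the "flare" slot should come back, because the map slot fails the thumbnail check. Everything comes from a single copy of the fileshare page, so there is no extra request per film.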