vapor33 Posted June 25, 2015

I want to pull data from this page: http://74.80.133.251/7778/

That page is generated by default by the game server provider to list our game server screenshots. I'm not sure how to go about this, as I've never run into it before. Basically, I want to pull the image from each separate screenshot and then display the player name, GUID, time and date under each one (see attached image). I would like to place them into CSS flex boxes, 5 across and 5 down (25 per page), plus some other misc. features, but you get the point.

Is this feasible? If so, how can it be done?

Source code for the existing page:

    <p> <a href=pb001758.htm target=_blank>001758</a> "-=D3G=-V@POR" (W) GUID=00000000000000076561197999415753(VALID) [2015.06.23 15:31:33]
    <p> <a href=pb001759.htm target=_blank>001759</a> "-=D3G=-sn1per" (W) GUID=00000000000000076561198054321032(VALID) [2015.06.23 20:06:04]
    <p> <a href=pb001760.htm target=_blank>001760</a> "-=D3G=-sn1per" (W) GUID=00000000000000076561198054321032(VALID) [2015.06.23 20:10:44]
    <p> <a href=pb001761.htm target=_blank>001761</a> "-=D3G=-sn1per" (W) GUID=00000000000000076561198054321032(VALID) [2015.06.23 20:15:03]
    <p> <a href=pb001762.htm target=_blank>001762</a> ".ronin" (W) GUID=00000000000000076561198197734605(VALID) [2015.06.23 20:16:06]
    <p> <a href=pb001763.htm target=_blank>001763</a> "Springfield1903" (W) GUID=00000000000000076561198156993122(VALID) [2015.06.23 20:19:35]
    <p> <a href=pb001764.htm target=_blank>001764</a> "-=D3G=-sn1per" (W) GUID=00000000000000076561198054321032(VALID) [2015.06.23 20:19:55]
    <p> <a href=pb001765.htm target=_blank>001765</a> ".ronin" (W) GUID=00000000000000076561198197734605(VALID) [2015.06.23 20:20:23]
    <p> <a href=pb001766.htm target=_blank>001766</a> "-=D3G=-RotGM" (W) GUID=00000000000000076561198133386615(VALID) [2015.06.23 20:20:43]
    <p> <a href=pb001767.htm target=_blank>001767</a> "-=D3G=-snak3dr1" (W) GUID=00000000000000076561198045586709(VALID) [2015.06.23 20:21:03]
    <p> <a href=pb001768.htm target=_blank>001768</a> "-=D3G=-HALF-EVIL" (W) GUID=00000000000000076561198164691378(VALID) [2015.06.23 20:21:23]
    <p> <a href=pb001769.htm target=_blank>001769</a> "-=D3G=-V@POR" (W) GUID=00000000000000076561197999415753(VALID) [2015.06.23 20:21:43]
    <p> <a href=pb001770.htm target=_blank>001770</a> "Springfield1903" (W) GUID=00000000000000076561198156993122(VALID) [2015.06.23 20:23:48]
    <p> <a href=pb001771.htm target=_blank>001771</a> "-=D3G=-sn1per" (W) GUID=00000000000000076561198054321032(VALID) [2015.06.23 20:24:12]
    <p> <a href=pb001772.htm target=_blank>001772</a> ".ronin" (W) GUID=00000000000000076561198197734605(VALID) [2015.06.23 20:25:11]
    <p> <a href=pb001773.htm target=_blank>001773</a> "-=D3G=-snak3dr1" (W) GUID=00000000000000076561198045586709(VALID) [2015.06.23 20:25:31]
    <p> <a href=pb001774.htm target=_blank>001774</a> "-=D3G=-Sailan" (W) GUID=00000000000000076561198010795033(VALID) [2015.06.23 20:25:51]
    <p> <a href=pb001775.htm target=_blank>001775</a> "-=D3G=-RotGM" (W) GUID=00000000000000076561198133386615(VALID) [2015.06.23 20:26:11]
    <p> <a href=pb001776.htm target=_blank>001776</a> "-=D3G=-HALF-EVIL" (W) GUID=00000000000000076561198164691378(VALID) [2015.06.23 20:26:31]
    <p> <a href=pb001777.htm target=_blank>001777</a> "-=D3G=-V@POR" (W) GUID=00000000000000076561197999415753(VALID) [2015.06.23 20:26:51]
    <p> <a href=pb001778.htm target=_blank>001778</a> "awesomeguy2422" (W) GUID=00000000000000076561198161397621(VALID) [2015.06.23 20:27:11]
    <p> <a href=pb001779.htm target=_blank>001779</a> "dwagonade" (W) GUID=00000000000000076561198125858550(VALID) [2015.06.23 20:28:15]
    <p> <a href=pb001780.htm target=_blank>001780</a> "Springfield1903" (W) GUID=00000000000000076561198156993122(VALID) [2015.06.23 20:28:37]

Link to comment https://forums.phpfreaks.com/topic/297035-how-to-pull-data-from-html-page/
CroNiX Posted June 25, 2015

It's not going to be easy because of the malformed HTML on those pages. You'd start by using either cURL or file_get_contents() to retrieve the remote page. The rest gets trickier. Normally you could use a library to traverse the DOM tree of the retrieved page, but this page has no document type declaration, no html element, no head, no body, etc. It's just a straight list of <p> tags, so I'm not sure whether any of the DOM-traversing libraries, such as simple_html_dom, will be able to parse it. You might need to use regex to get the bits you want out of each line (player name, GUID, time and date).

Then, for each line, you'd also need to grab the href from the <a> tag, since the page it points to is where the image is located. Again, those pages are not using proper HTML markup:

    <p> <img src=pb001759.png>

Once you have extracted the img src, you'd grab the image using, again, file_get_contents() or cURL.

I'd store all the data in a database so it's easily sortable and you can look things up faster, like retrieving all data for a particular username.
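A minimal sketch of the "extract the img src" step described above, assuming each detail page (for example pb001759.htm) contains an unquoted <img src=...> tag like the one quoted in the post. The function name extractImageUrl and the sample HTML string are illustrative only; a real run would pass in HTML fetched with file_get_contents() or cURL.

```php
<?php
/**
 * Find the first <img> tag in a (possibly malformed) HTML fragment and
 * return its src resolved against $baseUrl, or null if there is none.
 * Accepts double-quoted, single-quoted and unquoted attribute values.
 */
function extractImageUrl(string $html, string $baseUrl): ?string
{
    // (?| ... ) is a branch-reset group, so whichever alternative
    // matches, the captured value always ends up in $m[1].
    $pattern = '#<img\b[^>]*?\bsrc\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>]+))#i';
    if (!preg_match($pattern, $html, $m)) {
        return null;
    }

    $src = $m[1];
    // Leave absolute URLs alone; prefix relative ones with the base URL.
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }
    return rtrim($baseUrl, '/') . '/' . ltrim($src, '/');
}

// Sample fragment copied from the post above (not fetched from the server).
$html = '<p> <img src=pb001759.png>';
echo extractImageUrl($html, 'http://74.80.133.251/7778/'), "\n";
// prints http://74.80.133.251/7778/pb001759.png
```

A regex is used here only because the fragment is too malformed to rely on a DOM parser; for well-formed pages a parser would usually be the safer choice.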
vapor33 Posted June 26, 2015

Awesome, thanks for the quick response! I will see what I can do.
vapor33 Posted June 26, 2015

Upon further examination, this file seems to be in the same folder as the screenshots. I'm not sure what it actually does:

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
      <system.webServer>
        <directoryBrowse enabled="false" />
        <defaultDocument>
          <files>
            <add value="pbsvss.htm" />
          </files>
        </defaultDocument>
      </system.webServer>
    </configuration>
CroNiX Posted June 26, 2015

It's a Microsoft IIS web server config file. Not positive about it, but it seems to disable directory listing when you browse to that folder, and to make pbsvss.htm the default document served for it instead.
vapor33 Posted June 26, 2015

Apparently we can't PM on this forum? Can you contact me via email, CroNiX? vapor>33@>i>cloud>.com (just remove the > signs)

Thanks
Psycho Posted June 26, 2015 (edited)

Here's a function that appears to extract the data correctly from that page. You can then iterate through the returned array to build whatever output you wish.

    <?php

    function getData($url)
    {
        $output = array();

        // Fetch the remote index page; return an empty result on failure
        $content = file_get_contents($url);
        if ($content === false) {
            return $output;
        }

        // Collect every line that contains an <a> tag
        preg_match_all("#<a[^\n]*\n#", $content, $matches);

        foreach($matches[0] as $line)
        {
            // Captures: 1 = id, 2 = username, 3 = GUID, 4 = date, 5 = time
            if(preg_match("#<a[^>]*>([^<]*)</a> \"([^\"]*)\" \(W\) GUID\=(\d*)[^\[]*\[([^ ]*) ([^\]]*)#", $line, $match))
            {
                $output[] = array(
                    'id'       => $match[1],
                    'image'    => "{$url}pb{$match[1]}.png",
                    'username' => $match[2],
                    'guid'     => $match[3],
                    'date'     => $match[4],
                    'time'     => $match[5]
                );
            }
        }
        return $output;
    }

    $url = "http://74.80.133.251/7778/";
    $data = getData($url);

    echo "<pre>" . print_r($data, 1) . "</pre>";

    ?>

Here is an example of part of the output:

    Array
    (
        [0] => Array
            (
                [id] => 001646
                [image] => http://74.80.133.251/7778/pb001646.png
                [username] => -=D3G=-RotGM
                [guid] => 00000000000000076561198133386615
                [date] => 2015.06.25
                [time] => 17:20:36
            )

        [1] => Array
            (
                [id] => 001647
                [image] => http://74.80.133.251/7778/pb001647.png
                [username] => -=D3G=-Icey842
                [guid] => 00000000000000076561198091035675
                [date] => 2015.06.25
                [time] => 17:22:18
            )

        [2] => Array
            (
                [id] => 001648
                [image] => http://74.80.133.251/7778/pb001648.png
                [username] => budsanonymous
                [guid] => 00000000000000076561198188792511
                [date] => 2015.06.25
                [time] => 17:23:28
            )

Note that fetching the remote page and running the regex over it on every request is not very efficient. So, you may want to cache the data and only update it if the cache is older than some time period.

Edited June 26, 2015 by Psycho
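The caching suggestion above could look roughly like this. It is a sketch under assumptions: the helper name cachedData, the JSON file format, the cache path and the 300-second lifetime are all illustrative choices, and $loader stands in for a call such as getData($url).

```php
<?php
/**
 * Return data from a JSON cache file if it is younger than $maxAge
 * seconds; otherwise call $loader, store its result and return that.
 */
function cachedData(string $cacheFile, int $maxAge, callable $loader): array
{
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        $cached = json_decode((string) file_get_contents($cacheFile), true);
        if (is_array($cached)) {
            return $cached; // cache hit
        }
    }

    // Cache missing, stale or unreadable: rebuild it.
    $fresh = $loader();
    // LOCK_EX keeps two requests from writing the file at the same time.
    file_put_contents($cacheFile, json_encode($fresh), LOCK_EX);
    return $fresh;
}

// Demo with a stand-in loader (hypothetical data, no network access).
// With the thread's code the loader would be something like:
//   function () use ($url) { return getData($url); }
$cacheFile = sys_get_temp_dir() . '/pb_cache_demo.json';
@unlink($cacheFile); // start the demo from an empty cache

$calls = 0;
$loader = function () use (&$calls) {
    $calls++;
    return [['id' => '001646', 'username' => '-=D3G=-RotGM']];
};

$first  = cachedData($cacheFile, 300, $loader); // runs the loader
$second = cachedData($cacheFile, 300, $loader); // served from the file
echo $calls, "\n"; // prints 1
```

A file cache is only one option; if the rows end up in a database, as suggested earlier in the thread, the time of the last import could serve the same purpose.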
CroNiX Posted June 26, 2015

I was playing with a similar approach that uses regex only to extract the GUID status. Psycho's is a bit cleaner.

    function get_pb_data()
    {
        $pb_data  = array();
        $base_url = 'http://74.80.133.251/7778/';

        // Retrieve main page
        $base_page = file_get_contents($base_url);

        // Exit and show error if couldn't be retrieved
        if ( ! $base_page)
        {
            exit('Could not retrieve main page: ' . $base_url);
        }

        // These are substring items that will be removed from each line of text of the HTML
        $items_to_remove = array(
            "\r",        // hidden return chars (if present)
            '<p> ',      // <p> tags with trailing space (note no closing </p>'s in src)
            '"',         // Quotes around username
            '[',         // Bracket around date/time
            ']',         // Bracket around date/time
            '(W) GUID='  // (W), don't know if needed, and GUID= text
        );

        // Remove the items from the raw page text
        $base_page = str_replace($items_to_remove, '', $base_page);

        // Create an array for each line
        $lines = explode("\n", $base_page);

        // Cycle through the lines, format the data and store it in a new array
        foreach($lines as $line)
        {
            // Remove the anchor tag from the line to isolate the punkbuster id ($pb_id)
            $line = strip_tags($line);

            // Only handle lines with exactly 5 space-separated fields
            // (a username containing spaces would be skipped here)
            if (substr_count($line, ' ') == 4)
            {
                // Grab the fields by exploding on spaces
                list($pb_id, $username, $guid, $date, $time) = explode(' ', $line);

                // Grab the "status" string from the GUID field within ();
                // skip the line if it is missing
                if ( ! preg_match('/\((.*?)\)/', $guid, $guid_status))
                {
                    continue;
                }

                // Remove the status string from the GUID field to isolate GUID
                $guid = str_replace($guid_status[0], '', $guid);

                // Replace dots with dashes in the date for a mysql valid format
                $date = str_replace('.', '-', $date);

                // Store the formatted user data. Normally would go in a db or something...
                $pb_data[] = array(
                    'id'           => $pb_id,
                    'username'     => $username,
                    'guid'         => $guid,
                    'guid_status'  => $guid_status[1],
                    'image_source' => $base_url . 'pb' . $pb_id . '.png',
                    'date'         => $date,
                    'time'         => $time,
                    'datetime'     => $date . ' ' . $time
                );
            }
        }

        return $pb_data;
    }

output:

    $data = get_pb_data();
    echo '<pre>';
    print_r($data);

    --------------
    Array
    (
        [0] => Array
            (
                [id] => 001646
                [username] => -=D3G=-RotGM
                [guid] => 00000000000000076561198133386615
                [guid_status] => VALID
                [image_source] => http://74.80.133.251/7778/pb001646.png
                [date] => 2015-06-25
                [time] => 17:20:36
                [datetime] => 2015-06-25 17:20:36
            )

        [1] => Array
            (
                [id] => 001647
                [username] => -=D3G=-Icey842
                [guid] => 00000000000000076561198091035675
                [guid_status] => VALID
                [image_source] => http://74.80.133.251/7778/pb001647.png
                [date] => 2015-06-25
                [time] => 17:22:18
                [datetime] => 2015-06-25 17:22:18
            )

        [2] => Array
            (
                [id] => 001648
                [username] => budsanonymous
                [guid] => 00000000000000076561198188792511
                [guid_status] => VALID
                [image_source] => http://74.80.133.251/7778/pb001648.png
                [date] => 2015-06-25
                [time] => 17:23:28
                [datetime] => 2015-06-25 17:23:28
            )

        [3] => Array
            (
                [id] => 001649
                [username] => -=D3G=-Roosevelt
                [guid] => 00000000000000076561198161436214
                [guid_status] => VALID
                [image_source] => http://74.80.133.251/7778/pb001649.png
                [date] => 2015-06-25
                [time] => 17:23:48
                [datetime] => 2015-06-25 17:23:48
            )

        [4] => Array
            (
                [id] => 001650
                [username] => -=D3G=-RotGM
                [guid] => 00000000000000076561198133386615
                [guid_status] => VALID
                [image_source] => http://74.80.133.251/7778/pb001650.png
                [date] => 2015-06-25
                [time] => 17:26:12
                [datetime] => 2015-06-25 17:26:12
            )

    )
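With an array shaped like the output of get_pb_data() above, the layout asked for in the first post (5 across, 5 down, 25 per page) might be sketched as follows. The function name renderPage, the CSS class names and the $page parameter are assumptions made for illustration; the two sample rows are copied from the output above. Values go through htmlspecialchars() because player names come from an external page.

```php
<?php
// Five cards per row: each card takes 20% of the container width.
const GRID_CSS = '.ss-grid{display:flex;flex-wrap:wrap}'
               . '.ss-card{flex:0 0 20%;box-sizing:border-box;padding:4px}'
               . '.ss-card img{max-width:100%}';

/**
 * Render one page of screenshots as a flexbox grid.
 * $rows uses the keys produced by get_pb_data(); pages start at 1.
 */
function renderPage(array $rows, int $page, int $perPage = 25): string
{
    // ENT_SUBSTITUTE keeps names with invalid UTF-8 from becoming empty.
    $esc = function (string $v): string {
        return htmlspecialchars($v, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    };

    $offset = max(0, $page - 1) * $perPage;
    $html   = '<div class="ss-grid">' . "\n";
    foreach (array_slice($rows, $offset, $perPage) as $row) {
        $html .= '<div class="ss-card">'
               . '<img src="' . $esc($row['image_source']) . '" alt="">'
               . '<div>' . $esc($row['username']) . '</div>'
               . '<div>' . $esc($row['guid']) . '</div>'
               . '<div>' . $esc($row['datetime']) . '</div>'
               . "</div>\n";
    }
    return $html . "</div>\n";
}

// Two sample rows copied from the output above.
$rows = [
    ['image_source' => 'http://74.80.133.251/7778/pb001646.png',
     'username' => '-=D3G=-RotGM',
     'guid' => '00000000000000076561198133386615',
     'datetime' => '2015-06-25 17:20:36'],
    ['image_source' => 'http://74.80.133.251/7778/pb001647.png',
     'username' => '-=D3G=-Icey842',
     'guid' => '00000000000000076561198091035675',
     'datetime' => '2015-06-25 17:22:18'],
];

$totalPages = (int) ceil(count($rows) / 25); // for building page links
echo '<style>' . GRID_CSS . '</style>' . "\n" . renderPage($rows, 1);
```

The page number would typically come from a query string value such as an integer-cast $_GET['page'], clamped to the range 1 to $totalPages; that part is left out here.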
vapor33 Posted June 26, 2015 Author Share Posted June 26, 2015 Here is the source from a ss viewer that does work for all games but I hate the layout and gui....that's why I was attempting to create my own. But can you check the code for me as I am unsure what's "clean & secure" code and what's not. Are your examples better than the code below? Thanks for all the suggestions guys! <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <!----------------------------------------- Main Stream Gamers PunkBuster Screenshot Viewer An original work by Bruce "Goose" Kaskubar Licensed under GPL v3 http://www.msgamers.com --------------------------------------- --> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> <title>MSGPBSSV: MSG's PunkBuster™ Screenshot Viewer</title> <style type="text/css"> a { text-decoration:none; color:#00F; } a:hover { color: #FFF; background-color: #009; } a.white { color:#FFF; } a.white:hover { background-color: #FFF; color:#565; } td a { display: block; } tr { vertical-align:top; } html { height:99%; } .footer { background-color:#787; height:1.4em; font-size:80%;} .invisible { visibility:hidden; height:0px; } .ss { color:#933; font-weight:bold; } .small { font-size:90%; font-weight:normal; } .thorax { background-color:#EEE; } .title { background-color:#787; height:1.4em; font-weight:bold; } #nav ul { margin:0 0 0 0; padding:0 0 0 0; list-style:none; } #nav ul li a { display:block; text-decoration:none; height: 1.3em; } </style> </head> <body style="background-color:#ABA; font-family:Verdana, Arial, Helvetica, sans-serif; height:100%;"> <!--------------------------------------- HEADER ------------------------------------- --> <div style="height:3%;"> <a href="http://deadendgaming.net"> <img alt="DEAD END GAMING" height="94" src="http://deadendgaming.net/images/d3g_email_logo.png" width="500" /></a> <span 
class="small" style="float:right;">By: MSG version <span id="versionNum">1.06</span></span></div> <!--------------------------------------- LEFT SIDE ------------------------------------- --> <div style="float:left; width:29em; margin:4px 0px 4px 0px;"> <div class="title"> <table width="100%" border="0" cellspacing="0" cellpadding="0"> <tr> <td>Settings</td> <td align="right" class="small" style="cursor:pointer;"><div id="hideSettings" style="display:none;"><a onclick="javascript:document.getElementById('settings').style.display ='none';document.getElementById('hideSettings').style.display ='none';document.getElementById('showSettings').style.display ='';">hide</a></div><div id="showSettings"><a onclick="javascript:document.getElementById('settings').style.display ='';document.getElementById('showSettings').style.display ='none';document.getElementById('hideSettings').style.display ='inline';">show</a></div></td> </tr> </table> </div> <div class="thorax" style="height:auto; overflow:auto; display:none;" id="settings"> <form name="settings"> <input name="pbss" type="checkbox" /> Display screenshots in <a href="http://www.aliasfinder.co.uk/products/pbss/index.htm" target="msgx">PBSS plugin</a> <select name="pbssw"><option>400</option><option>500</option><option selected="selected">600</option><option>700</option><option>800</option></select> x <select name="pbssh"><option>300</option><option selected="selected">400</option><option>500</option><option>600</option></select> <input type="radio" name="timeFormatInd" value="0" checked="checked" onclick="TFx=this.value;"/>12-hour time <input type="radio" name="timeFormatInd" value="1" onclick="TFx=this.value;"/>24-hour time </form> </div> <div id="navTitle" class="title" style="margin-top:8px;">List Selection</div> <div id="nav" class="thorax"> <ul> <li><a href="javascript:showXsummary();">Index Summary</a></li> <li><a href="javascript:showPlayersWithAliases();">Players With Aliases</a></li> <li><a 
href="javascript:showPlayerSScensus('Screenshot Census by Player, all','1');">Screenshot Census by Player, all</a></li> <li><a href="javascript:showPlayerSScensus('Screenshot Census by Player, >0','Players[i].ssCnt >0');">Screenshot Census by Player, >0</a></li> <li><a href="javascript:showScreenshotCensusByTime(15);">Screenshot Census by 15 Minute Periods</a></li> </ul> </div> <div id ="listTitle" class="title" style="margin-top:8px;"></div> <div id="list" class="thorax" style="height:256px; overflow:auto;">loading PunkBuster's screenshot index...</div> <div id="listFooter" class="footer"></div> <div style="height:auto; margin-top:8px;"> <div id="detailsTitle" class="title" style="height:auto;"></div> <div id="details" class="thorax" style="height:auto; overflow:auto;"></div> </div> </div> <!--------------------------------------- RIGHT SIDE ------------------------------------- --> <div id="screenshots" style="float:right; background-color:#EEE; margin:4px 4px 4px 4px; height:92%; width:auto;"> <div id="ssTitle" class="title"></div> <div id="ss" class="small" style="height:97%; overflow:auto;"></div> </div> <!--------------------------------------- FOOTER ------------------------------------- --> <div style="clear:both; background-color:#565; height:4%;" class="small"> <table style="color:#FFF; width:100%;"> <tr> <td width="33%"><a href="http://anticheatinc.net">ACI WEBSITE</a> <a href="http://www.pbbans.com">PBBANS WEBSITE</a></td> <td width="34%" align="center"><a href="http://deadendgaming.net">D3G WEBSITE</a></td> <td width="33%" align="right"><a href="http://www.msgamers.com/smf/index.php?board=23.0" target="msgx" class="white">Viewer forum</a></td> </tr> <tr><td colspan="3" align="center" style="color:#DDD;"><script>m=/.*\/\/(.+).+)@(.+)/.exec(document.URL);document.write((m===null)?document.URL:m[3]);</script></td></tr> </table> </div> <!--------------------------------------- CONTENT LOADER ------------------------------------- --> <iframe 
src="pbsvss.htm" name="loader" onload="parsePBSV(this);" class="invisible" style="color:#963; font-weight:bold;">Sorry, the Viewer uses inline frames and your browser does not support them.</iframe> </body> <script type="text/javascript"> //----------------------------------- // Globals //----------------------------------- var MOTY =["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]; var DOTW =["Sun","Mon","Tue","Wed","Thu","Fri","Sat"]; var TF =[["h:i a", "h:i:s a"], ["H:i", "H:i:s"]]; var TFx =0; // default time format var linesFoundCnt =0; // number of entries in index var linesLoadedCnt =0; // number of index entries imported var excludedLines =new Array(); // skipped PB index entries // [0]: reason index // [1]: the excluded line var exclusionReasons =[ 'player name could not be found', 'line item format was nonstandard' ]; var earliestItemTS =-1;// PB line item timestamp range var latestItemTS =-1; // PB line item timestamp range var pwssCnt =0; // players with screenshots var maxPSScnt =0; // highest number of screenshots for a single player (name & GUID) var maxTSScnt =0; // highest number of screenshots for a single time period function parsePBSV(f) { // create Viewer objects from PB index found in frame f var x =frames[f.getAttribute('name')].document; var p =x.getElementsByTagName('p'); linesFoundCnt =p.length; for (i =0; i <p.length; i++) importItem(p[i].innerHTML); document.getElementById('list').innerHTML ="calculating statistics..."; setTimeout("calcStats()", 256); } function importItem(l) { // import PB line item l var m =ssRE.exec(l); if (m ===null) { excludedLines.push([1, l]); return; } var ts =new Date(m[7], m[8] -1, m[9], m[10], m[11], m[12], 0); addScreenshot(m[2], addPlayer(m[3], m[5], ts).OID, ts, m[1]); linesLoadedCnt++; } function calcStats () { pwssCnt =thePlayerCensus('Players[i].ssCnt >0'); for (i in Players) if (Players[i].ssCnt >maxPSScnt) maxPSScnt =Players[i].ssCnt; document.getElementById('list').innerHTML 
="sequencing player names..."; setTimeout("sequencePlayers()", 256); } function sequencePlayers () { var i, p, s =new Array(); for (i in Players) s.push(Players[i]); s.sort(function (a, b) { if (a.sortNm > b.sortNm) return 1; if (a.sortNm < b.sortNm) return -1; return 0; }); if (s.length >0) { PlayerAOID =s[0].OID; PlayerZOID =s[s.length -1].OID; for (i =0; i <s.length; i++) { // build player linked list p =getPlayerByOID(s[i].OID); if (i >0) p.prevOID =s[i -1].OID; if (i <s.length -1) p.nextOID =s[i +1].OID; } } showXsummary(); } function showXsummary () { // present the PB index file summary document.getElementById('listTitle').innerHTML ='Index Summary'; document.getElementById('list').innerHTML = '<table cellpadding="3" cellspacing="0">' +'<tr><td>items encountered:</td><td>' +fix(linesFoundCnt) +'</td></tr>' +'<tr><td>items loaded:</td><td>' +fix(linesLoadedCnt) +'</td></tr>' +'<tr><td><acronym title="index entries not elsewhere included due to noncompliance with expected format">items excluded</acronym>:</td><td><a href="javascript:showExcludedLines();">' +fix(excludedLines.length) +'</a></td></tr>' +'<tr><td>from:</td><td>' +((earliestItemTS ==-1) ? '' : theDT(earliestItemTS, 'D dMy ' +TF[TFx][1])) +'</td></tr>' +'<tr><td>through:</td><td>' +((latestItemTS ==-1) ? '' : theDT(latestItemTS, 'D dMy ' +TF[TFx][1])) +'</td></tr>' +'<tr><td>players:</td><td>' +fix(thePlayerCnt()) +'</td></tr>' +'<tr><td>screenshots:</td><td>' +fix(theSSCnt()) +'</td></tr>' +'<tr><td>players<br>w/screenshots:</td><td><br>' +fix(pwssCnt) +'</td></tr>' +'<tr><td><acronym title="average (mean) screenshot count among players with screenshots" >screenshots<br>/player</acronym>:</td><td><br>' +((pwssCnt ==0) ? 
'quite meaningless' : fix(Math.round(theSSCnt() /pwssCnt))) +'</td></tr>' +'</table>'; document.getElementById('listFooter').innerHTML =''; } function showExcludedLines() { // present details of non-imported index entries document.getElementById('ssTitle').innerHTML ='Excluded Index Entries'; var i, s, m, s2; var d ='<table cellpadding="3" cellspacing="0">'; for (i =0; i <excludedLines.length; i++) { m =ssX.exec(excludedLines[i][1]); if (m ===null) s ='[The line is so bad it could not be formatted for display.]'; else s =m[1] +' ' +m[2]; s2 =excludedLines[i][1].replace(/'/g, """); // escape excluded line's single and double quotes s2 =s2.replace(/"/g, """); d +='<tr>' +'<td align="right">' +(i +1) +' of ' +excludedLines.length +'</td>' +'<td><a href="javascript:alert(\'' +s2 +'\');">' +s +'</a></td>' +'</tr>'; } d +='</table>'; document.getElementById('ss').innerHTML =d; } //----------------------------------- // Player class and methods //----------------------------------- var Players =new Array(); // object store; key is OID var PlayersXng =new Object(); // Players index; key is name and GUID, value is OID var PlayersXg =new Object(); // Players index; key is GUID, value is array of OIDs var PlayersS =new Array(); // Players index; key is name, value is OID; in ascending sequence by name var PlayerAOID =-1; // OID of first player in sequence var PlayerZOID =-1; // OID of last player in sequence function Player (o, n, g, t) { this.OID =o; // unique identifier this.name =n; // name this.GUID =g; // Activision ID this.ssCnt =0; // screenshot count this.firstXts =t; // earliest indexed screenshot timestamp this.lastXts =t; // latest indexed screenshot timestamp this.sortNm =theSortNm(n);// key for sorting by name this.prevOID=-1; // previous player in ascending sequence this.nextOID=-1; // next player in ascending sequence } function addPlayer (n, g, t) { // if not present, add Player with name n, guid g, and screenshot timestamp t; maintain indexes // return Player 
object if ((n +',' +g) in PlayersXng) { if (Players[PlayersXng[n +',' +g]].firstXts >t) Players[PlayersXng[n +',' +g]].firstXts =t; if (Players[PlayersXng[n +',' +g]].lastXts <t) Players[PlayersXng[n +',' +g]].lastXts =t; return Players[PlayersXng[n +',' +g]]; } Players.push(new Player(Players.length, n, g, t)); PlayersXng[n +',' +g] =Players.length -1; if (g in PlayersXg) {} else PlayersXg[g] =[]; PlayersXg[g].push(Players.length -1); return Players[Players.length -1]; } function getPlayerAliases (g, n, c) { // return list of names used by Player whose GUID is g except for name n for criteria c; var d =""; for (var i in PlayersXg[g]) if (Players[PlayersXg[g][i]].name !=n) d +='<a href="javascript:showPlayerDetails(' +Players[PlayersXg[g][i]].OID +', \'' +c +'\')">' +theDisplayNm(Players[PlayersXg[g][i]].name) +'</a>'; return d; } function getPlayerByOID (o) { // return the Player whose OID is o, or null; return Players[o]; } function showPlayerDetails (o, c) { // show details of player whose OID is o matching criteria c var gpa =getPlayerAliases(Players[o].GUID, Players[o].name, c); if (gpa <='') gpa ='[none]'; var d = '<table cellpadding="0" cellspacing="0" width="100%"><tr>' +'<td>Player Details</td>' +'<td align="right"><input type="button" value="'; if (Players[o].prevOID >=0) d +='<" onclick="showPlayerDetails(' +Players[o].prevOID +', \'' +c +'\')">'; else d +=' ">'; d +=' <input type="button" value="'; if (Players[o].nextOID >=0) d +='>" onclick="showPlayerDetails(' +Players[o].nextOID +', \'' +c +'\')'; else d +=' '; d += '"></td>' +'</tr></table>'; document.getElementById('detailsTitle').innerHTML =d; var spd = '<table cellpadding="2" cellspacing="0" width="100%">' +'<tr>' +'<td><acronym title="name without color codes">name:</acronym></td>' +'<td colspan="2">' +theDisplayNm(Players[o].name) +'</td>' +'</tr>' +'<tr>' +'<td><acronym title="name as entered in Multiplayer Options, including color codes">real name</acronym>:</td><td colspan="2">' 
+Players[o].name +'</td>' +'</tr>' +'<tr>' +'<td><acronym title="other players with the same GUID">aliases</acronym>:</td>' +'<td colspan="2">' +gpa +'</td>' +'</tr>' +'<tr>' +'<td>GUID:</td>' +'<td colspan="2">' +Players[o].GUID +'</td>' +'</tr>' +'<tr>' +'<td><acronym title="time of first reference in index">earliest</acronym>:</td>' +'<td colspan="2">' +theDT(Players[o].firstXts, 'D dMy ' +TF[TFx][1]) +'</td>' +'</tr>' +'<tr>' +'<td><acronym title="time of last reference in index">latest</acronym>:</td>' +'<td colspan="2">' +theDT(Players[o].lastXts, 'D dMy ' +TF[TFx][1]) +'</td>' +'</tr>' +'<tr>' +'<td>screenshots:</td>' +'<td colspan="2"><a href="javascript:showScreenshots(\'Screenshots from ' +theDisplayNm(Players[o].name) +'\', \'Screenshots[i].playerOID ==' +o +'\');">' +Players[o].ssCnt +'</a></td>' +'</tr>'; // build an array of screenshots from this player var s =new Array(), i; for (i in Screenshots) if (Screenshots[i].playerOID ==o) s.push(Screenshots[i]); s.sort(function (a, b) { return a.dtm -b.dtm; } ); // build an array of days for which the player has screenshots var s2 =new Array(), d, cnt, maxCnt =0, t=new Date(); for (i =0; i <s.length;) { d =new Date(s[i].dtm.getFullYear(), s[i].dtm.getMonth(), s[i].dtm.getDate(), 0, 0, 0).valueOf(); cnt =0; while (i <s.length && s[i].dtm.valueOf() <d +1000*60*60*24) { cnt++; i++; } s2.push(new Array(d, cnt)); if (cnt >maxCnt) maxCnt =cnt; } // emit the daily line items if (s2.length >0) { spd += '<tr style="background-color:#CCC;">' +'<td colspan="3">Screenshots by Day</td>' +'</tr>'; spd +='<div style="max-height:256px;">'; for (i =0; i <s2.length; i++) { spd += '<tr>' +'<td>' +'<a href="javascript:showPeriodDetails(' +s2[i][0] +',' +(s2[i][0] +1000*60*60*24) +');">' +theDT(new Date(s2[i][0]), 'D dMy') +'</a>' +'</td>' +'<td align="right">' +'<a href="javascript:showScreenshots(\'Screenshots from ' +theDisplayNm(Players[o].name) +'\', \'Screenshots[i].playerOID==' +o +'&&Screenshots[i].dtm>=' +s2[i][0] 
+'&&Screenshots[i].dtm<' +(s2[i][0] +1000*60*60*24) +'\');">' +s2[i][1] +'</a>' +'</td>' +'<td width="40%">' +theLine(s2[i][1], maxCnt) +'</td>' +'</tr>'; } spd +='</div>'; } document.getElementById('details').innerHTML =spd +'</table>'; } function showPlayerSScensus (t, c) { // return list of player SS links entitled t for players meeting criteria c document.getElementById('listTitle').innerHTML =t; var tpl = '<tr style="background-color:#CCC;">' +'<td>Name</td>' +'<td align="right">Count</td>' +'<td width="40%"> </td>' +'</tr>'; var itemCnt =0; var i =PlayerAOID; while (i >=0) { if (eval(c)) { itemCnt++; tpl +='<tr>' +'<td>' +'<a href="javascript:showPlayerDetails(' +Players[i].OID +', \'' +c +'\');">' +theDisplayNm(Players[i].name) +'</a>' +'</td>' +'<td align="right">' +'<a href="javascript:showScreenshots(\'Screenshots from ' +theDisplayNm(Players[i].name) +'\', \'Screenshots[i].playerOID ==' +Players[i].OID +'\');">' +Players[i].ssCnt +'</a>' +'</td>' +'<td>' +theLine(Players[i].ssCnt, maxPSScnt) +'</td>' +'</tr>'; } i =Players[i].nextOID; } document.getElementById('list').innerHTML = '<table cellpadding="2" cellspacing="0" width="100%">' +tpl +'</table>'; document.getElementById('listFooter').innerHTML =itemCnt +' players'; } function showPlayersWithAliases () { // return list of players with aliases document.getElementById('listTitle').innerHTML ='Players With Aliases'; var s =new Array(), i; for (i in Players) if (PlayersXg[Players[i].GUID].length >1) s.push(Players[i]); s.sort(function (a, b) { if (a.sortNm > b.sortNm) return 1; if (a.sortNm < b.sortNm) return -1; return 0; }); var d = '<tr style="background-color:#CCC;">' +'<td width="50%">Name</td>' +'<td width="50%">Aliases</td></tr>'; for (i in s) { d +='<tr><td><a href="javascript:showPlayerDetails(' +s[i].OID +', \'1\');">' +theDisplayNm(s[i].name) +'</a></td>' +'<td>' +getPlayerAliases(s[i].GUID, s[i].name, '1') +'</a></td></tr>'; } document.getElementById('list').innerHTML = '<table 
cellpadding="2" cellspacing="0" width="100%">' +d +'</table>'; document.getElementById('listFooter').innerHTML =fix(s.length) +' players'; } function theDisplayNm (n) { // return the display name for name n var tdn =n; var re = /\^\d/; // replace color codes while (re.test(tdn)) tdn = tdn.replace(re, ''); var re = /\</; // replace opening brackets while (re.test(tdn)) tdn = tdn.replace(re, '<'); var re = /\>/; // replace closing brackets while (re.test(tdn)) tdn = tdn.replace(re, '>'); return tdn; } function thePlayerCnt () { return Players.length; } function thePlayerCensus (c) { // return count of players matching criteria c var cnt =0; for (i in Players) if (eval(c)) cnt++; return cnt; } function theSortNm (n) { // return the sort key for name n var tsn =n; var re = /\^\d|\W|_/g; // remove special characters while (re.test(tsn)) tsn = tsn.replace(re, ''); return tsn.toUpperCase(); } function updatePlayerSSCnt(o, a) { // add a to screenshot count of Player whose OID is o // return the new count Players[o].ssCnt +=a; return Players[o].ssCnt; } //----------------------------------- // Screenshot classes and methods //----------------------------------- var Screenshots =new Array(); // object store; key is OID var ScreenshotsXf =new Object(); // Screenshots index; key is file name, value is OID var ssID =0; // viewed screenshot sequence number (used as global instead of local because some browsers' DOM methods refuse to recognize an object whose ID matches that of a previously deleted object) var SSrequests =new Array(); // screenshot request queue object store var ssTm1, ssTm2; function Screenshot (o, s, p, t, f) { this.OID =o; // unique identifier this.seqNo =s; // PB sequence number this.playerOID =p; // a Player.OID this.dtm =t; // PB ss time stamp this.fileNm =f; // PB ss HTML file name } function SSrequest (i, f, tr, tc) { this.ssID =i; // ss identifier this.fileNm =f; // ss file name this.addDT =0; // request add time this.startDT =tr; // request start time 
this.compDT =tc; // request completion time } function addScreenshot (s, p, t, f) { // add screenshot with sequence number s, player OID p, timestamp t, and file name f // if its file is already recorded, update the existing ss entry if (t <earliestItemTS || earliestItemTS ==-1) earliestItemTS =t; if (t >latestItemTS || latestItemTS ==-1) latestItemTS =t; if (f in ScreenshotsXf) { // overlay prior entry Screenshots[ScreenshotsXf[f]].seqNo =s; updatePlayerSSCnt(Screenshots[ScreenshotsXf[f]].playerOID, -1); // decrease count for overridden player Screenshots[ScreenshotsXf[f]].playerOID =p; updatePlayerSSCnt(p, 1); Screenshots[ScreenshotsXf[f]].dtm =t; return; } Screenshots.push(new Screenshot(Screenshots.length, s, p, t, f)); ScreenshotsXf[f] =Screenshots.length -1; updatePlayerSSCnt(p, 1); } function loadSS() { // retrieve next SS in queue if (SSrequests.length <1) return; // prevent the last screenie from requesting another one var ss =SSrequests.pop(); f =document.getElementById(ss.ssID); f.src =ss.fileNm; // Why go to all this trouble, instead of just setting each iFrame's SRC parameter from the get go? // There's no difference for local file access. For FTP access, all the screenies would be requested at practically the same time. // Most FTP servers get fed up with that rush of activity and start refusing connections. Error messages ensue. // By queueing the file loads, the total load time increases but at least we tend to get all the files. 
}

function showScreenshots(t, c) {
  // return list of screenshots entitled t that meet criteria c
  var s =new Array(), i, f;
  document.getElementById('ssTitle').innerHTML =t;
  for (i in Screenshots) if (eval(c)) s.push(Screenshots[i]);
  s.sort(function (a, b) { return a.dtm -b.dtm; } );
  var d ='';
  SSrequests.length =0;
  for (i =0; i <s.length; i++) {
    d +='<tr>'
      +'<td><p class="small">'
      +(i +1) +' of ' +s.length +'<br />'
      +'<a href="javascript:showPlayerDetails(' +s[i].playerOID +', ' +'\'1\');">' +theDisplayNm(getPlayerByOID(s[i].playerOID).name) +'</a>'
      +theDT(s[i].dtm, 'D dMY') +'<br />'
      +theDT(s[i].dtm, TF[TFx][1]) +'<br />'
      +s[i].fileNm
      +'</p></td>'
      +'<td>'
      +'<div class="ss"></div>'
      +'<iframe id="ss' +ssID +'" name="ss' +ssID +'" class="invisible" onload="showSS(this);"></iframe>'
      +'</td>'
      +'</tr>';
    SSrequests.push(new SSrequest("ss" +ssID, s[i].fileNm, 0, 0));
    ssID++;
  }
  document.getElementById('ss').innerHTML ='<table cellpadding="2" cellspacing="2">' +d +'</table>';
  SSrequests.reverse(); // process as a queue, not a stack
  loadSS(); // get first SS
}

function showSS (f) {
  // show screenshot loaded by frame f
  if (f.getAttribute('src') <='') return; //skip loads prior to setting src parameter
  if (!document.forms.settings["pbss"].checked || frames[f.getAttribute('name')].document.images.length !=1)
    f.previousSibling.innerHTML =frames[f.getAttribute('name')].document.body.innerHTML;
  else {
    f.previousSibling.innerHTML ='<embed'
      +' src="' +frames[f.getAttribute('name')].document.images[0].src +'"'
      +' gamma="1.5"'
      +' image="' +frames[f.getAttribute('name')].document.images[0].src +'"'
      +' pluginspage="http://www.aliasfinder.co.uk/products/pbss/files/nppbss.dll"'
      +' type="application/x-png-pbss"'
      +' width="' +document.forms.settings["pbssw"].value +'"'
      +' height="' +document.forms.settings["pbssh"].value +'"'
      +'></embed>'
  }
  loadSS(); // get next SS
}

function showScreenshotsForPeriod(t1, t2) {
  // present the screenshots from time t1 to time t2
  var dt1 =new Date(t1);
  var dt2 =new Date(t2);
  showScreenshots('Screenshots from ' +theDT(dt1, 'dMy ' +TF[TFx][0]) +' to ' +theDT(dt2, 'dMy ' +TF[TFx][0]), 'Screenshots[i].dtm >=' +t1 +' && Screenshots[i].dtm <' +t2)
}

function showScreenshotCensusByTime (m) {
  // present screenshot list by periods of m minutes
  var mInt =parseInt(m);
  document.getElementById('listTitle').innerHTML ='Screenshot Census by '
    +'<select onchange="showScreenshotCensusByTime(this[selectedIndex].text);">'
    +'<option' +((mInt ==1) ? ' selected="selected"' : '') +'>1</option>'
    +'<option' +((mInt ==5) ? ' selected="selected"' : '') +'>5</option>'
    +'<option' +((mInt ==10) ? ' selected="selected"' : '') +'>10</option>'
    +'<option' +((mInt ==15) ? ' selected="selected"' : '') +'>15</option>'
    +'<option' +((mInt ==20) ? ' selected="selected"' : '') +'>20</option>'
    +'<option' +((mInt ==30) ? ' selected="selected"' : '') +'>30</option>'
    +'<option' +((mInt ==60) ? ' selected="selected"' : '') +'>60</option>'
    +'</select>'
    +' Minute Periods';
  var s =new Array(), i, p, pNext;
  for (i in Screenshots) s.push(Screenshots[i]);
  s.sort(function (a, b) { return (a.dtm -b.dtm) });
  var d = '<tr style="background-color:#CCC;">'
    +'<td>Period Start</td>'
    +'<td align="right">Count</td>'
    +'<td width="40%"> </td>'
    +'</tr>';
  if (s.length >0) {
    // find maximum number of screenies in a period
    var p =new Date(s[0].dtm.getFullYear(), s[0].dtm.getMonth(), s[0].dtm.getDate(), s[0].dtm.getHours(), Math.floor(s[0].dtm.getMinutes() /mInt) *mInt, 0, 0); // start time of census period
    var pNext =new Date(p.getFullYear(), p.getMonth(), p.getDate(), p.getHours(), p.getMinutes() +mInt, 0, 0); // start time of next census period
  }
  var ssCnt =0;
  maxTSScnt =0;
  for (i =0; i <s.length; i++) {
    if (s[i].dtm <pNext) { ssCnt++; continue; } // count screenies in this period
    if (ssCnt >maxTSScnt) maxTSScnt =ssCnt;
    p =new Date(s[i].dtm.getFullYear(), s[i].dtm.getMonth(), s[i].dtm.getDate(), s[i].dtm.getHours(), Math.floor(s[i].dtm.getMinutes() /mInt) *mInt, 0, 0);
    pNext =new Date(p.getFullYear(), p.getMonth(), p.getDate(), p.getHours(), p.getMinutes() +mInt, 0, 0);
    ssCnt =1;
  }
  if (ssCnt >maxTSScnt) maxTSScnt =ssCnt;
  if (s.length >0) {
    var p =new Date(s[0].dtm.getFullYear(), s[0].dtm.getMonth(), s[0].dtm.getDate(), s[0].dtm.getHours(), Math.floor(s[0].dtm.getMinutes() /mInt) *mInt, 0, 0); // start time of census period
    var pNext =new Date(p.getFullYear(), p.getMonth(), p.getDate(), p.getHours(), p.getMinutes() +mInt, 0, 0); // start time of next census period
  }
  var ssCnt =0, pCnt =0;
  for (i =0; i <s.length; i++) {
    if (s[i].dtm <pNext) { ssCnt++; continue; } // count screenies in this period
    d +=theSSCBTline(p, pNext, ssCnt);
    p =new Date(s[i].dtm.getFullYear(), s[i].dtm.getMonth(), s[i].dtm.getDate(), s[i].dtm.getHours(), Math.floor(s[i].dtm.getMinutes() /mInt) *mInt, 0, 0);
    if (p.toString() !=pNext.toString()) d +='<tr><td> </td><td> </td></tr>'; // show gap in time
    pNext =new Date(p.getFullYear(), p.getMonth(), p.getDate(), p.getHours(), p.getMinutes() +mInt, 0, 0);
    ssCnt =1;
    pCnt++;
  }
  if (ssCnt >0) { d +=theSSCBTline(p, pNext, ssCnt); pCnt++; }
  document.getElementById('list').innerHTML ='<table cellpadding="2" cellspacing="0" width="100%">' +d +'</table>';
  document.getElementById('listFooter').innerHTML =fix(pCnt) +' period' +((pCnt !=1) ?
    's' : '');
}

function theSSCBTline(t1, t2, c) {
  // return a formatted line for the screenshot census
  return '<tr>'
    +'<td><a href="javascript:showPeriodDetails(' +t1.valueOf() +', ' +t2.valueOf() +')">' +theDT(t1, 'D dMy ' +TF[TFx][0]) +'</a></td>'
    +'<td align="right">'
    +'<a href="javascript:showScreenshotsForPeriod(' +t1.valueOf() +', ' +t2.valueOf() +')">' +fix(c) +'</a>'
    +'</td>'
    +'<td>' +theLine(c, maxTSScnt) +'</td>'
    +'</tr>';
}

function theSSCnt () {
  return Screenshots.length;
}

//-----------------------------------
// Time Period classes and methods
//-----------------------------------
function showPeriodDetails (t1, t2) {
  // show details of time period starting at time t1 and ending just before time t2
  document.getElementById('detailsTitle').innerHTML =
    '<table cellpadding="0" cellspacing="0" width="100%">'
    +'<tr>'
    +'<td>Time Period Details</td>'
    +'<td align="right">'
    +'<input type="button" value="<" onclick="showPeriodDetails(' +(t1 -(t2 -t1)) +', ' +t1 +')">'
    +' '
    +'<input type="button" value=">" onclick="showPeriodDetails(' +t2 +', ' +(t2 +(t2 -t1)) +')">'
    +'</td>'
    +'</tr>'
    +'</table>';
  // build an array of screenshots for the time period sequenced by player name
  var s =new Array(), i;
  for (i in Screenshots)
    if (Screenshots[i].dtm.valueOf() >=t1 && Screenshots[i].dtm.valueOf() <t2) s.push(Screenshots[i]);
  s.sort(function (a, b) {
    if (Players[a.playerOID].sortNm > Players[b.playerOID].sortNm) return 1;
    if (Players[a.playerOID].sortNm < Players[b.playerOID].sortNm) return -1;
    return 0;
  });
  // build an array of players for the time period
  var s2 =new Array(), p, cnt, maxCnt =0, spd;
  for (i =0; i <s.length;) {
    p =s[i].playerOID;
    cnt =0;
    while (i <s.length && s[i].playerOID ==p) { cnt++; i++; }
    s2.push(new Array(p, cnt));
    if (cnt >maxCnt) maxCnt =cnt;
  }
  var spd =
    '<table cellpadding="2" cellspacing="0" width="100%">'
    +'<tr>'
    +'<td>period start:</td>'
    +'<td colspan="2">' +theDT(new Date(t1), 'D dMy ' +TF[TFx][0]) +'</td>'
    +'</tr>'
    +'<tr>'
    +'<td>period end:</td>'
    +'<td colspan="2">' +theDT(new Date(t2), 'D dMy ' +TF[TFx][0]) +'</td>'
    +'</tr>'
    +'<tr>'
    +'<td>screenshots:</td>'
    +'<td><a href="javascript:showScreenshotsForPeriod(' +t1 +', ' +t2 +')">' +s.length +'</a></td>'
    +'<td width="40%"> </td>'
    +'</tr>';
  if (s2.length >0) {
    spd +=
      '<tr style="background-color:#CCC;">'
      +'<td colspan="3">Screenshots by Player</td>'
      +'</tr>';
    spd +='<div style="max-height:256px;">';
    for (i =0; i <s2.length; i++) {
      spd +=
        '<tr>'
        +'<td>'
        +'<a href="javascript:showPlayerDetails(' +s2[i][0] +',\'\');">' +theDisplayNm(Players[s2[i][0]].name) +'</a>'
        +'</td>'
        +'<td align="right">'
        +'<a href="javascript:showScreenshots(\'Screenshots from ' +theDisplayNm(Players[s2[i][0]].name) +'\', \'Screenshots[i].playerOID==' +s2[i][0] +'&&Screenshots[i].dtm>=' +t1 +'&&Screenshots[i].dtm<' +t2 +'\');">' +s2[i][1] +'</a>'
        +'</td>'
        +'<td>' +theLine(s2[i][1], maxCnt) +'</td>'
        +'</tr>';
    }
    spd +='</div>';
  }
  document.getElementById('details').innerHTML =spd +'</table>';
}

//-----------------------------------
// More Globals
//-----------------------------------
function fix(n) {
  // return integer n as a comma-delimited string
  var f =n.toString();
  var re = /(\d+)(\d{3})/;
  while (re.test(f)) f = f.replace(re, '$1,$2');
  return f;
}

function format(n, d) {
  // return integer n as a string of at least d digits
  var f =n.toString();
  while (f.length <d) f ='0' +f;
  return f;
}

function hide(i) {
  // hide the object whose id is i
  // document.getElementById(i).style.visibility ='hidden';
  document.getElementById(i).style.display ='none';
}

function show(i) {
  // show the object whose id is i
  // document.getElementById(i).style.visibility ='visible';
  document.getElementById(i).style.display ='block';
}

function theDT (dt, f) {
  // return text date and time for date object dt according to format f
  // f: use d, D, m, M, Y, y, h, H, i, s, a, [any other character], respectively, to include day, abbreviated textual day of the week, numeric month with leading zero, abbreviated textual month, full year, 2-digit year, hours of 12-hour clock, hours of 24-hour clock, minutes, seconds, am/pm, and any other character, in given sequence (codes are consistent with PHP conventions)
  var td ='', ampm ='';
  var ff =f.split('');
  for (var i =0; i <ff.length; i++)
    switch (ff[i]) {
      case "d" : td +=dt.getDate().toString(); break;
      case "D" : td +=DOTW[dt.getDay()]; break;
      case "m" : td +=format(dt.getMonth() +1, 2); break;
      case "M" : td +=MOTY[dt.getMonth()]; break;
      case "Y" : td +=dt.getFullYear().toString(); break;
      case "y" : td +=dt.getFullYear().toString().substr(2,2); break;
      case "h" :
        if (dt.getHours() >12) { td +=(dt.getHours() -12).toString(); break; }
        if (dt.getHours() >0) { td +=dt.getHours().toString(); break; }
        td +='12';
        break;
      case "H" : td +=dt.getHours().toString(); break;
      case "i" : td +=format(dt.getMinutes(), 2); break;
      case "s" : td +=format(dt.getSeconds(), 2); break;
      case "a" : td +=(dt.getHours() >11) ? 'p.m.' : 'a.m.'; break;
      default: td +=ff[i];
    }
  return td;
}

function theLine(c, t) {
  // render an HTML table whose width is c's percentage of t
  if (c <=0 || t <=0) return ' ';
  return '<table cellpadding="0" cellspacing="0" bgcolor="#BBBBBB" width="' +Math.round(100 *c /t) +'%"><tr><td> </td></tr></table>';
}

// structure of a PB screen shot index line
// e.g.,<a href=pb000914.htm target=_blank>000914</a> "^3[MSG]Goose" (W) GUID=xxx...xxx(VALID) [2007.11.26 17:47:25]
// $1 ss file name
// $2 ss sequence #
// $3 player name
// $4 ?? e.g., W
// $5 GUID
// $6 ?? e.g., VALID
// $7 ss year
// $8 ss month
// $9 ss day
// $10 ss hour
// $11 ss minute
// $12 ss second
var ssRE = /.*?<a\s+href=.*?(pb\d+?\.htm).*?>(.*?)<\/a>\s*"([\s\S]*?)"\s+\((.*?)\).+GUID=([0-9a-f]+)\s*\((.*?)\)\s+\[(\d{4})\.(\d{2})\.(\d{2})\s+(\d{2}):(\d{2}):(\d{2})\]/i;
var ssPlayerRE = /.*?<a.*?>.*?<\/a>\s*"([\s\S]*?)"/i;
var ssX = /.*?<a\s+href=.*?(pb\d+?\.htm).*?>.*?<\/a>\s*([\s\S]*)/i;
</script> </html>
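The capture-group layout described in the comments at the end of that script can be tried out on one of the index lines from the original post. What follows is a minimal sketch, not the viewer's own code: `PB_LINE_RE` and `parsePbIndexLine` are hypothetical names, and the pattern assumes the time fields are separated by colons, as they are in the sample lines.

```javascript
// Sketch only: PB_LINE_RE and parsePbIndexLine are hypothetical names,
// not part of the viewer script. Assumes timestamps look like [2015.06.23 15:31:33].
const PB_LINE_RE =
  /<a\s+href=.*?(pb\d+\.htm).*?>(.*?)<\/a>\s*"([\s\S]*?)"\s+\((.*?)\)\s+GUID=([0-9a-f]+)\s*\((.*?)\)\s+\[(\d{4})\.(\d{2})\.(\d{2})\s+(\d{2}):(\d{2}):(\d{2})\]/i;

function parsePbIndexLine(line) {
  const m = PB_LINE_RE.exec(line);
  if (m === null) return null; // not a screenshot entry
  const [, fileNm, seqNo, name, flag, guid, status, y, mo, d, h, mi, s] = m;
  return {
    fileNm, // $1 ss file name
    seqNo,  // $2 ss sequence #
    name,   // $3 player name
    flag,   // $4 e.g. W
    guid,   // $5 GUID
    status, // $6 e.g. VALID
    // $7..$12: Date months are zero-based, hence mo - 1
    dtm: new Date(Number(y), Number(mo) - 1, Number(d), Number(h), Number(mi), Number(s)),
  };
}

const sample =
  '<a href=pb001758.htm target=_blank>001758</a> "-=D3G=-V@POR" (W) ' +
  'GUID=00000000000000076561197999415753(VALID) [2015.06.23 15:31:33]';
const shot = parsePbIndexLine(sample);
console.log(shot.fileNm, shot.name, shot.status);
```

Returning `null` for lines that do not match lets a caller skip the non-entry lines of the index page without special-casing them.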
vapor33 (Author) Posted June 26, 2015

Also, would it be more beneficial to use a CSS flexbox for the extracted data within the new HTML page, or not?
Psycho Posted June 29, 2015

Quote from CroNiX: I was playing with a similar approach that didn't use regex except to extract the GUID status. Barand's is a bit cleaner.

Except Barand hasn't posted in this thread . . . so I assume you are talking about the code I posted.

Quote from vapor33: Here is the source from an ss viewer that does work for all games, but I hate the layout and GUI... that's why I was attempting to create my own. But can you check the code for me, as I am unsure what's "clean & secure" code and what's not? Are your examples better than the code below? Also, would it be more beneficial to use a CSS flexbox for the extracted data within the new HTML page, or not?

Vapor, you posted a problem and we took the time to provide solutions. Now you are asking us to analyze some other code to determine what is better and "clean & secure". Sorry, but I'm not going to take the time to do that. The code I posted works with the content on the page provided at the time I wrote it. If that content changes (specifically the format), I can make no guarantees.

As to how you should display the output, that I leave to you. That is where you definitely need to make sure the code will not create security problems (e.g. XSS). Using htmlentities() on anything you output is one way to resolve that. But how you create the output (CSS tables, etc.) is up to you. This forum is about helping people with code they have written. It would be nice to see you at least make an attempt, as opposed to us just writing everything for you.
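On the XSS point: htmlentities() is the PHP-side tool named above. If the markup is instead assembled in JavaScript, as the viewer script earlier in the thread does through innerHTML, the same precaution might look like the sketch below. `escapeHtml` is a hypothetical helper written for illustration, not something from either posted script.

```javascript
// Sketch only: escapeHtml is a hypothetical helper that plays the role
// htmlentities()/htmlspecialchars() would play on the PHP side.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  // Replace each markup-significant character with its entity
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Player names come from the game server, so treat them as untrusted input.
const playerName = '<script>alert(1)</script>';
const cell = '<td>' + escapeHtml(playerName) + '</td>';
console.log(cell);
```

Escaping at the point of output, rather than when the data is parsed, keeps the stored values unchanged for sorting and comparison.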
CroNiX Posted June 29, 2015

Quote from Psycho: Except Barand hasn't posted in this thread . . . so I assume you are talking about the code I posted.

Ah yes, apologies, Psycho.
vapor33 (Author) Posted June 30, 2015

Quote from Psycho: Except Barand hasn't posted in this thread . . . so I assume you are talking about the code I posted. Vapor, you posted a problem and we took the time to provide solutions. Now you are asking us to analyze some other code to determine what is better and "clean & secure". Sorry, but I'm not going to take the time to do that. The code I posted works with the content on the page provided at the time I wrote it. If that content changes (specifically the format), I can make no guarantees. As to how you should display the output, that I leave to you. That is where you definitely need to worry about ensuring the code will not create security problems (e.g. XSS). That can be resolved using htmlentities() as one solution. But, how you create the output (CSS tables, etc.) is up to you. This forum is about helping people with code they have written. It would be nice to see you at least make an attempt as opposed to just writing everything for you.

I was going to create one from scratch, hence my original post. Then I received permission to alter an existing script, so I added the reply. I figured an existing script would be easier to work with, seeing as it already works, but I don't like the GUI or layout of the page.

I don't expect you to write it for me. As I already said, I have no idea how PHP works or what I can or can't do with it. Sorry that my issues as a PHP noob have taken up your free time in responding to my thread. I did not mean to "waste your time". I asked for assistance and received it, and I'm grateful; I just don't know how to proceed, is all. I will try to figure it out through trial and error. Thanks.
vapor33 (Author) Posted June 30, 2015

@cronix, your example does not work when placed into a PHP page.

@psycho, I have used your example and it's working here: http://deadendgaming.net/phptest/phpfreaks.php

Is there a way to change the formatting of the date field? I looked up the str_replace function (regex) but can't seem to get the format right. It just produces an error. I tried to use this, to no avail:

$match[4] = str_replace('$match[4]' , 'date("m.d.y g:i a"), $match[4]);
CroNiX Posted June 30, 2015 (edited)

What doesn't work about it? Strange, it works here. Did you add the open/close PHP tags to the start/end of the script? Did you add the output part, which is at the very top of the 2nd code block, to the very bottom of the script? That is where it does:

$data = get_pb_data();
echo '<pre>';
print_r($data);

Edited June 30, 2015 by CroNiX
vapor33 (Author) Posted June 30, 2015

Quote from CroNiX: What doesn't work about it? Strange, it works here. Did you add the open/close php tags to the start/end of the script? Did you add the output that is from the very top of the 2nd code block to the very bottom of the script? When it does: $data = get_pb_data(); echo '<pre>'; print_r($data);

I closed the PHP tags but forgot to add the 2nd code block to the bottom of the script. Told you guys I have no idea what I'm doing... lol

It works just fine too: http://deadendgaming.net/phptest/cronix.php

Thanks
CroNiX Posted July 1, 2015

You wouldn't want to change the date format that the array is storing. With mine, the "datetime" value in the array output is in the same format as a MySQL datetime column (Y-m-d H:i:s), which makes date calculations easy, such as getting all the entries from the last 30 minutes. Instead, you'd want to format it at the point where you loop over the PHP array and output the HTML.
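The idea of keeping the stored value as Y-m-d H:i:s and formatting only at output time can be sketched as follows. The sketch is in JavaScript, the language of the viewer script posted earlier, purely for illustration; in PHP the same step would typically go through date() with strtotime(), or DateTime::format. `formatForDisplay` and the sample `rows` array are hypothetical, and the target layout mirrors the "m.d.y g:i a" pattern from the attempt above.

```javascript
// Sketch only: formatForDisplay is a hypothetical helper. It assumes the
// stored value looks like "2015-06-23 20:16:06" (Y-m-d H:i:s).
function formatForDisplay(datetime) {
  const m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/.exec(datetime);
  if (m === null) return datetime; // leave unexpected input unchanged
  const [, year, month, day, hour, minute] = m;
  const h = Number(hour);
  const hour12 = h % 12 === 0 ? 12 : h % 12; // 0 and 12 both display as 12
  const ampm = h < 12 ? 'am' : 'pm';
  return `${month}.${day}.${year.slice(2)} ${hour12}:${minute} ${ampm}`;
}

// The stored rows keep the sortable format; only the rendered text changes.
const rows = [{ name: '.ronin', datetime: '2015-06-23 20:16:06' }];
for (const row of rows) {
  console.log(row.name, formatForDisplay(row.datetime));
}
```

Because the array itself is never rewritten, comparisons such as "entries from the last 30 minutes" keep working on the original values.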
vapor33 (Author) Posted July 1, 2015

OK... I will see what I can do.
Ani123 Posted January 1, 2016

Hey, I am sharing some code, check this:

// Needs java.io.BufferedReader, java.io.IOException, java.io.InputStreamReader
// and java.net.URL; place the method inside a class.
public static String getTitle(String address) throws IOException {
    URL url = new URL(address);
    BufferedReader reader = null;
    try {
        reader = new BufferedReader(new InputStreamReader(url.openStream()));
        String line = null;
        while ((line = reader.readLine()) != null) {
            int start = line.indexOf("<title>");
            int end = line.indexOf("</title>");
            // only handles a <title> that opens and closes on the same line;
            // the end check avoids an exception when the closing tag is missing
            if (start != -1 && end > start) {
                return line.substring(start + "<title>".length(), end);
            }
        }
        return "";
    } finally {
        if (reader != null) reader.close();
    }
}