QuickOldCar
Everything posted by QuickOldCar
-
I wasn't happy with anything I did before, as it still had complications. So I sat down, really thought this out, and came up with better working code. I set a url GET value, so try links like this one on my site with url=.
http://get.blogdns.com/dynaindex/simple-parse?url=http://www.phpfreaks.com/forums/index.php
Put simple_html_dom.php in the same folder.
<?php
include('simple_html_dom.php');

// return the host of a url, falling back to the first path piece when no scheme was given
function getHost($url)
{
    $parseUrl = parse_url(trim($url));
    if (!empty($parseUrl['host'])) {
        return trim($parseUrl['host']);
    }
    $path = isset($parseUrl['path']) ? $parseUrl['path'] : '';
    $parts = explode('/', $path, 2);
    return trim($parts[0]);
}

$url = isset($_GET['url']) ? trim($_GET['url']) : '';
if ($url == '') {
    die('No url given.');
}
//simple way to add the http:// that dom requires, using curl is a better option
if (substr($url, 0, 4) != "http") {
    $url = "http://$url";
}
$parsed_url = getHost($url);
$http_parsed_host = "http://$parsed_url/";

$html = file_get_html($url);
if (!$html) {
    die('Could not load that url.');
}
$dom = new DOMDocument();
@$dom->loadHTML((string) $html);
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");

$final_href_link = array();
$mail_links = array();
$prefixes = array("https://", "http://", "ftp://", "feed://");

for ($i = 0; $i < $hrefs->length; $i++) {
    $href_link = trim($hrefs->item($i)->getAttribute('href'));

    //keep mailto: links apart from the page links
    if (substr($href_link, 0, 7) == "mailto:") {
        $mail_links[] = $href_link;
        continue;
    }

    //links that already start with a known scheme are left alone
    $has_scheme = false;
    foreach ($prefixes as $prefix) {
        if (substr($href_link, 0, strlen($prefix)) == $prefix) {
            $has_scheme = true;
            break;
        }
    }

    if ($has_scheme) {
        $final_href_link[] = $href_link;
    } else {
        //strip any leading slashes and put the parsed host in front
        $final_href_link[] = $http_parsed_host . ltrim($href_link, "/");
    }
}

$links_array = array_unique($final_href_link);
sort($links_array);
foreach ($links_array as $links) {
    $links = htmlspecialchars($links, ENT_QUOTES);
    echo "<a href='$links'>$links</a><br />";
}
foreach (array_unique($mail_links) as $mail_link) {
    $mail_link = htmlspecialchars($mail_link, ENT_QUOTES);
    echo "<a href='$mail_link'>$mail_link</a><br />";
}
?>
Some other thoughts: you would be able to look at the endings of the href links and sort them by type, such as images in an array of jpg, jpeg, bmp, png, gif, or even audio or video types.
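The idea of sorting links by their endings could look roughly like the sketch below. The function name groupLinksByType is made up for illustration, and the audio and video extension lists are my own assumptions; only the image list comes from the post.

```php
<?php
// Hypothetical helper: group links by the file type their extension suggests.
function groupLinksByType(array $links): array
{
    $types = [
        'image' => ['jpg', 'jpeg', 'bmp', 'png', 'gif'], // list from the post
        'audio' => ['mp3', 'wav', 'ogg'],                // assumed examples
        'video' => ['mp4', 'avi', 'mov'],                // assumed examples
    ];

    $groups = [];
    foreach ($links as $link) {
        // Look only at the path, so a query string such as ?v=2 is ignored.
        $path = parse_url($link, PHP_URL_PATH);
        $ext  = strtolower(pathinfo(is_string($path) ? $path : '', PATHINFO_EXTENSION));

        $group = 'other';
        foreach ($types as $name => $extensions) {
            if (in_array($ext, $extensions, true)) {
                $group = $name;
                break;
            }
        }
        $groups[$group][] = $link;
    }
    return $groups;
}

// Usage with placeholder links:
$groups = groupLinksByType([
    'http://example.com/logo.PNG?v=2',
    'http://example.com/song.mp3',
    'http://example.com/about.html',
]);
```

Anything without a recognised ending lands in the 'other' group, so no link is lost.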
-
For a very simple host parse, use this code and apply your html dom above it to find all the href links. I just wrote this up inside the comment box, so I hope I got it right, but it will show you what you need to do.
<?php
function getHost($url)
{
    $parseUrl = parse_url(trim($url));
    if (!empty($parseUrl['host'])) {
        return trim($parseUrl['host']);
    }
    $path = isset($parseUrl['path']) ? $parseUrl['path'] : '';
    $parts = explode('/', $path, 2);
    return trim($parts[0]);
}

//Usage:
//the page url and an href from the dom
$url = "url of the page being parsed";
$href_link = "value from dom element";

//parse the url host
$parsed_url = getHost($url);
//add http:// and the end slash to the parsed host so it can be a href link again
$http_parsed_host = "http://$parsed_url/";

$final_href_link = array();
//check for some common href beginnings, if there leave the link alone, else modify it
if ((substr($href_link, 0, 8) == "https://") OR (substr($href_link, 0, 7) == "http://") OR (substr($href_link, 0, 4) == "www.") OR (substr($href_link, 0, 6) == "ftp://") OR (substr($href_link, 0, 7) == "feed://")) {
    $final_href_link[] = $href_link;
} else {
    //drop leading slashes and any ../ or ./ pieces, then put the parsed host in front
    $href_link = ltrim($href_link, "/");
    $href_links_input = str_replace(array("../", "./"), '', $href_link);
    $final_href_link[] = "$http_parsed_host$href_links_input";
}

$links_array = array_unique($final_href_link);
sort($links_array);
foreach ($links_array as $links) {
    //echo "$links<br />";
    echo "<a href='$links'>$links</a><br />";
}
?>
-
Let me rephrase everything that needs to be done. You need to make a link checking system. For every href discovered you must do substring checks. If it follows a regular format such as http://, www., ftp:// and so on, you keep the link as it is. Otherwise you do more checks, and if there is anything odd at the beginning of the link you have to trim it off. Then take the parsed host, with the end slash added, and place it at the beginning of the href. The parser I made does all this: curl to find the resolved pages, dom to find the href links, and a very complex parse for the host that handles the host, the main host, any queries and second level domains as well. I place them all in arrays and a loop. I still didn't get that site's logo image, since it was using ../../../. I added some rules but didn't quite get it right yet.
-
I got it working for that. Some of their links are written as ../, so I added a rule to put the parsed host in front of those types of links. I already had rules for / and for when no http was present in front of the href url, so I have now added ./ and ../ as well. One day I will need to do more. Parsing links from many places takes a lot more than that simple code you have. http://get.blogdns.com/dynaindex/page-parser?domainname=http%3A%2F%2Fschulnetz.nibis.de%2Fdb%2Fschulen%2Fschule.php%3Fschulnr%3D94468%26lschb
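Instead of stripping ./ and ../ from the href, another option is to resolve them against the folder of the page the link was found on. Below is a minimal sketch of that idea. The function name resolveHref and the example urls are made up for illustration, and it does not try to cover every case in the url standard.

```php
<?php
// Hypothetical helper: turn an href found on a page into a full url.
function resolveHref(string $base, string $href): string
{
    // Hrefs that already carry a scheme (http:, ftp:, mailto: ...) are left alone.
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $href)) {
        return $href;
    }

    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $root   = $scheme . '://' . ($parts['host'] ?? '');
    if (isset($parts['port'])) {
        $root .= ':' . $parts['port'];
    }

    // Links starting with // reuse the scheme of the page.
    if (substr($href, 0, 2) === '//') {
        return $scheme . ':' . $href;
    }

    if (substr($href, 0, 1) === '/') {
        $path = $href;                       // relative to the site root
    } else {
        $basePath = $parts['path'] ?? '/';   // relative to the page's folder
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $href;
    }

    // Keep any query string or fragment out of the segment clean-up.
    $tail = '';
    $cut  = strcspn($path, '?#');
    if ($cut < strlen($path)) {
        $tail = substr($path, $cut);
        $path = substr($path, 0, $cut);
    }

    // Walk the segments: ".." drops the previous one, "." and empty ones are skipped.
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($out);
        } elseif ($segment !== '.' && $segment !== '') {
            $out[] = $segment;
        }
    }
    $clean = '/' . implode('/', $out);
    if (count($out) > 0 && substr($path, -1) === '/') {
        $clean .= '/';                       // keep a trailing slash
    }
    return $root . $clean . $tail;
}

// Usage with a made-up page and image path:
echo resolveHref('http://example.com/db/pages/page.php?id=1', '../../images/logo.gif');
```

Going up more levels than the page has folders simply stops at the site root, which is also how browsers treat it.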
-
The code below works; I can read "Welcome natasha thomas". You should now change your login password on that site before someone does harm to your account. Here is a link to the working code http://get.blogdns.com/dynaindex/testscrape I just echoed the html. You will see that the href links go to my own site. That's because that site uses relative links instead of the full http link. You would have to fix those, maybe with dom, as I had to do for my page parser.
<?php
$url = "https://www.majesticseo.com/account/login?redirect=%2Faccount";
$user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3';

/*connect to the url using curl to see if it exists and get the information*/
$cookie = tempnam('tmp', 'cookie');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, "LoginEmail=wow@mailinator.com&LoginPassword=natashaworld");
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_MAXREDIRS, 15);
curl_setopt($ch, CURLOPT_HEADER, true); //the headers come back in front of the html
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FILETIME, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); //skips the certificate check, fine for a test only
curl_setopt($ch, CURLOPT_ENCODING, "");

$html = curl_exec($ch);
$info = curl_getinfo($ch);
$curl_error = curl_error($ch);
curl_close($ch);

/*if there was a curl error - job ends and back to the url insert area*/
if ($html === false) {
    echo "<br /><b><font color='red'>No url inserted:</font></b>";
    echo "<br /><b><font color='orange'>Please try another url, that website may not exist. The url may or may not require the www.</font></b><br />";
    if ($curl_error != '') {
        echo "<b><font color='orange'>" . htmlspecialchars($curl_error) . "</font></b><br />";
    }
    exit;
}

/*resolve the url to the actual location; curl already followed any http redirects*/
$finalurl = $info['url'];
if (preg_match("/window\.location\.replace\('(.*?)'\)/i", $html, $value)
    || preg_match("/window\.location\s*=\s*[\"'](.*?)[\"']/i", $html, $value)
    || preg_match("/location\.href\s*=\s*[\"'](.*?)[\"']/i", $html, $value)) {
    //only trust a javascript redirect that gives a full url
    if (substr($value[1], 0, 4) == "http") {
        $finalurl = $value[1];
    }
}

/*parse the url into the main domain name*/
function get_main_domain($url)
{
    $host = parse_url(trim($url), PHP_URL_HOST);
    if (!$host) {
        //no scheme given, take everything before the first slash
        $pieces = explode('/', trim($url), 2);
        $host = $pieces[0];
    }
    $domain_parts = explode('.', strtolower($host));
    $count = count($domain_parts);
    //a two letter ending usually means a second level domain such as co.uk, so keep three parts
    if ($count >= 3 && strlen($domain_parts[$count - 1]) == 2) {
        return implode('.', array_slice($domain_parts, -3));
    }
    return implode('.', array_slice($domain_parts, -2));
}

$final_main_parsed_host = get_main_domain($finalurl);
echo "Main Parsed Host: " . htmlspecialchars($final_main_parsed_host) . "<br />";

/*because some people resolve their sites to all uppercase - I have to check it and attempt to fix it*/
$p = parse_url($finalurl);
$checkscheme = isset($p['scheme']) ? strtolower($p['scheme']) : 'http';
$checkhost = isset($p['host']) ? strtolower($p['host']) : '';
$checkport = isset($p['port']) ? ":" . $p['port'] : '';
$checkpath = isset($p['path']) ? str_replace(array('Www.', 'WWW.'), 'www.', $p['path']) : '';
$checkquery = isset($p['query']) ? "?" . $p['query'] : '';
$checkfragment = isset($p['fragment']) ? "#" . $p['fragment'] : '';
$url = "$checkscheme://$checkhost$checkport$checkpath$checkquery$checkfragment";

echo "Resolved: " . htmlspecialchars($finalurl) . "<br />";
echo "Lowercased: " . htmlspecialchars($url) . "<br />";
$md5_url = md5($url);

/*report the connection status*/
$code = $info['http_code'];
if ($code >= 200 && $code < 400) {
    echo "<b><font color='lime'>Connection OK</font></b>";
} elseif ($code >= 400) {
    echo "<b><font color='red'>Connection Error</font></b>";
}
echo "<br />";
if (in_array($code, array(300, 301, 302, 303, 307))) {
    echo "<b><font color='orange'>Redirection</font></b>";
} elseif ($code >= 200 && $code < 300) {
    echo "<font color='lime'>Direct Connection</font><br />";
    echo $html;
}
echo "<br />";
?>
-
Am I incorrect that you do not need to log in to the backend there? Aren't you supposed to upload some sort of site, and from then on it would be your own login credentials from whatever platform or scripts you then have? I did try, for the heck of it, to use curl to log in with your information, and it didn't work. I then modified the code with https and my own scraper. What seems to happen is that the site always wants to redirect back to their main page for a check, and I'm not sure how to go about that through curl.
-
Seems to work fine. Just a heads up, you wrote "sing up" instead of "sign up".
-
A better explanation of the types of files being modified, when the action is to take place and under what circumstances would help in determining this. One thing with php is that pretty much everything you can imagine is possible, but there may be easier ways. As a first thought, I would say just include() the php file under some type of if statement, possibly a file_exists() check or a check of the date modified. Just post the process here so everyone can get a better understanding of what exactly you would like to do.
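As a rough sketch of the include-under-an-if idea, assuming the condition is "the file exists and was modified within the last day". The helper name shouldInclude, the file name extra.php and the one day limit are all placeholders, since the actual process was never posted.

```php
<?php
// Hypothetical helper: true when the file exists and was modified recently enough.
function shouldInclude(string $path, int $maxAgeSeconds): bool
{
    clearstatcache(true, $path);     // make sure php is not using a cached file time
    if (!is_file($path)) {
        return false;
    }
    $modified = filemtime($path);
    return $modified !== false && (time() - $modified) <= $maxAgeSeconds;
}

// Usage: extra.php is a placeholder name for the file to pull in.
$file = __DIR__ . '/extra.php';
if (shouldInclude($file, 86400)) {   // 86400 seconds = one day
    include $file;
}
```

The same if statement could just as well test a date, a user setting or a database flag; the point is only that include() runs when the condition holds.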
-
Try this:
<?php
$query = mysql_query("SELECT MAX(ART_ID) AS maxAID FROM Artisan");
$row = mysql_fetch_array($query);
$maxAID = $row["maxAID"];

for ($artID = 1; $artID <= $maxAID; $artID++) {
    $Aquery = mysql_query("SELECT * FROM Artisan WHERE ART_ID = '$artID'");
?>
<table align="center" width="90%" border="0" style="font-size: 12px; margin-top: 30px; font-family: Tahoma;">
<?php while ($row = mysql_fetch_array($Aquery)) { $ART_Name = $row["ART_Name"]; ?>
<tr> <td><?php echo $ART_Name; ?></td> <td> Filler </td> </tr>
<?php } ?>
</table>
<?php
    $Qquery = mysql_query("SELECT * FROM ArtisanQuestion ORDER BY ARQ_ID");
?>
<table align="left" width="35%" border="0" style="font-size: 12px; margin-top: 30px; font-family: Tahoma;">
<?php while ($row = mysql_fetch_array($Qquery)) { $Ques_ID = $row["ARQ_ID"]; $Ques = $row["ARQ_Question"]; ?>
<tr> <td><?php echo "<b>".$Ques_ID."</b>. ".$Ques; ?></td> </tr>
<?php } ?>
</table>
<?php
    $Aquery = mysql_query("SELECT * FROM ArtisanAnswer WHERE ART_ID = '$artID' ORDER BY ARQ_ID");
?>
<table align="right" width="35%" border="0" style="font-size: 12px; margin-top: 30px; font-family: Tahoma;">
<?php while ($row = mysql_fetch_array($Aquery)) { $Ans = $row["ARAN_Answer"]; ?>
<tr> <td><?php echo $Ans; ?></td> </tr>
<?php } ?>
</table>
<?php
} //end of the for loop over the artisan ids
?>
-
My best guess is that somewhere in there you have an extra } or are missing a } at a certain position. Do you have error reporting turned on, and do you see any messages? It would tell you the line with the issue.
-
You have a pile of <?php with no closing ?>. If it's all php you just need the start and stop. EDIT: I opened my eyes better and see the closes are all there.
-
if ($session->isAdmin()) {
    echo "Logged in - allowing the action<br />";
    $action_location_or_function = "whatever you need here";
} else {
    echo "Not Logged in - Not allowing the action<br />";
    $action_location_or_function = "whatever you need here";
}
There are a few ways you can do this. Even something like below, if you have a field with user rights:
if ($user_rights >= 6) {
    echo "echo your form here";
} else {
    echo "You don't have sufficient rights to do that.";
}
Whatever you have and need to do, put it inside the brackets: what you want done when the user is allowed goes within the if brackets, and what should happen otherwise goes within the else brackets.
-
I decided to test my above code within my website, using the same random code as above, except that instead of inserting a value I fetch the results by id and display them. So here's a random website display. http://dynaindex.com/random-url
-
I was going to say the same. Also, the way this is doing it, you would have to include an insert or an update as well, because the original user could change their image and then the copy stored for the friend would be incorrect. Just query the users table by the friend's name and get the image to display. The ids differ between the two tables, but the names should remain the same.
-
Random is not truly random when it comes to computers; you will see what I mean if you use it a lot. There are many ways to do random and many of them are very slow, but this should do the job. You need to find the stored player ids, and my example below picks one row at random with a mysql query. So it pulls one player_id, reads that row's crystal_value and then updates it. You would do something like this, with your specific values of course:
<?php
$con = mysql_connect('localhost', 'username', 'userpassword');
if (!$con) {
    die('Could not connect: ' . mysql_error());
}
mysql_select_db('databasename', $con);

//pick a random row offset
$offset_result = mysql_query("SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `table`");
$offset_row = mysql_fetch_object($offset_result);
$offset = (int) $offset_row->offset;

//fetch the row sitting at that offset
$query = mysql_query("SELECT * FROM `table` LIMIT $offset, 1");
$row = mysql_fetch_array($query);

if ($row) {
    $player_id = $row['player_id'];
    $crystal_value = $row['crystal_value'];
    $crystal_add = 10; //set this to the value you want
    $crystal_value = $crystal_value + $crystal_add;
    mysql_query("UPDATE `table` SET crystal_value='$crystal_value' WHERE player_id='$player_id'");
}
mysql_close($con);
?>
I wrote this code out so you can understand the process; I'm sure you can optimize it more.
-
You need to make a php file called delete.php. Inside this file is a POST form with a value unique to that row. I personally use id because there should only be one row per id; with names and such there could be multiples. So the POST value should be coming from your display page.
I do something like this, and it is hard for me to explain fully. I made a .php file that connects to the database and takes the post's unique id from a GET value in a hyperlink. I use GET versus POST because I use it in many places throughout my site. All my admin hyperlinks for functions target a new window, with javascript code at the end that calls window.close. My admin functions are protected by admin user sessions before the action can be executed. If the user is not an admin, it redirects them with header back to the main site url.
The php file is located in the same directory, so the link just references the php file.
<a href='delete-id.php?theid=$post_id'> [ Delete Post ] </a>
Inside delete-id.php will be a form. The value is theid; the hyperlink above uses $post_id to get its value from the post.
<form action="delete-id.php" method="get">
<p>Post ID : <input type="text" name="theid" value="" class="text" style="width:30px; height:25px;" />
<input type="submit" value="Delete" class="button" style="width:20px; height:30px;" /></p>
</form>
delete-id.php
<?php
//force the id to a whole number so nothing else can get into the query
$id = isset($_GET['theid']) ? (int) $_GET['theid'] : 0;
if ($id > 0) {
    $connect = mysql_connect('localhost', 'user', 'password');
    mysql_select_db('databasename');
    mysql_query("DELETE FROM `table` WHERE id = $id");
    mysql_close($connect);
}
?>
I attached a text file that you could rename to a .php extension and modify to your needs. My way is just a different way of doing this; you can just use POST from the same page and form. It depends on what you need to do with it, any checking, how it gets called and so on. Hope this helps. [attachment deleted by admin]
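Since the id arrives through the url, it is worth checking that it really is a number before it goes anywhere near the DELETE query. A small sketch of such a check follows; the helper name parsePostId is made up, and theid is the GET name from the post.

```php
<?php
// Hypothetical helper: returns the id as a positive integer, or null if it is not one.
function parsePostId($raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

// Usage: read theid from the link and refuse anything that is not a plain number.
$id = parsePostId($_GET['theid'] ?? null);
if ($id === null) {
    // nothing valid was passed in, so do not run the delete
} else {
    // safe to use $id in the delete query from here
}
```

This only validates the value. The admin session check described above is still needed to decide who may delete at all.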
-
Well, the more....more....more may seem like a good idea, but it's not when you want to see the end, and it throws an unresponsive script error when you go down really far. That's just way too much information loaded onto a single page at one time. You need to at least paginate the results into pages.
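Paginating mostly comes down to working out a LIMIT and an offset from the page number. A minimal sketch of that arithmetic is below; the function name paginate and the numbers in the usage line are only examples.

```php
<?php
// Hypothetical helper: work out the limit and offset for one page of results.
function paginate(int $totalRows, int $perPage, int $page): array
{
    $perPage    = max(1, $perPage);
    $totalPages = max(1, (int) ceil($totalRows / $perPage));
    $page       = min(max(1, $page), $totalPages);   // clamp to the valid range

    return [
        'page'        => $page,
        'total_pages' => $totalPages,
        'limit'       => $perPage,
        'offset'      => ($page - 1) * $perPage,
    ];
}

// Usage: 95 rows, 20 per page, visitor asked for page 3.
$p = paginate(95, 20, 3);
// The values then go into the query as: LIMIT $p['offset'], $p['limit']
```

Clamping the page number means a visitor who edits the url to page 9999 just gets the last page instead of an empty one.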
-
I see that now. Is there an unlocked area specifically for code and snippets?
-
Thanks for the welcome. I would like to post some code here: http://www.phpfreaks.com/forums/faqcode-snippet-repository/ There is no "new topic" button though. Also, as I browsed around, there seems to be no "reply" on any posts either.
-
If anyone can tell me whether it's normal for new members to have some sort of time limit, post view requirement or anything like that, I'd appreciate it. I can't seem to do anything here except this intro post, besides reading the others. I would like to reply to some posts, and also add a snippet of code or two to share.
-
Hi everyone, I've been doing php and other coding for quite some time now, and I just recently decided to join a community. In the future I can see myself sharing some hints, tricks, tutorials or some code, when I have enough time to do so.