
bschultz
Members-
Posts
486 -
Everything posted by bschultz
-
Maxxd, when using a browser, there is a button to press to go from page 1 to page 2. This button is just a link, so the second curl request uses this link as the URL.

Using Firefox Developer Tools - Network, page #4 makes two GET calls to two other pages (which include the SQL Selects), then the content loads on page #4.

Here are the two external page requests:

curl 'http://209.151.229.186/AffWeb_USRN/V2/ASP/GTD.asp?SQLCMD=spGetStationOptions%20%27WBJI-FM%27,%20%27Virtual%20News%20Network%20MF%27,%20%2710/19/2020%27,%20%2710/19/2020%27&DT=1603034713365' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0' \
  -H 'Accept: */*' \
  -H 'Accept-Language: en-US,en;q=0.5' \
  --compressed \
  -H 'Referer: http://209.151.229.186/AffWeb_USRN/V2/log_exact.asp?startDate=10/19/2020&endDate=10/19/2020&SD=10/19/2020&ED=10/19/2020&gsfCode=0' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -H 'Connection: keep-alive' \
  -H 'Cookie: SavePW=1; Password=xxxx; ASPSESSIONIDAARTRTBC=EAPNEIJAHEJILFGECNDAKNPE; ASPSESSIONIDQSBQTQDD=HNHHELCCMPNLDFACKGPLCJHF'

curl 'http://209.151.229.186/AffWeb_USRN/V2/ASP/GTD.asp?SQLCMD=spGetAllSpots%20%27WBJI-FM%27,%20%27Virtual%20News%20Network%20MF%27,%20%2710/19/2020%27,%20%2710/19/2020%27&DT=1603034713563' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0' \
  -H 'Accept: */*' \
  -H 'Accept-Language: en-US,en;q=0.5' \
  --compressed \
  -H 'Referer: http://209.151.229.186/AffWeb_USRN/V2/log_exact.asp?startDate=10/19/2020&endDate=10/19/2020&SD=10/19/2020&ED=10/19/2020&gsfCode=0' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -H 'Connection: keep-alive' \
  -H 'Cookie: SavePW=1; Password=xxxx; ASPSESSIONIDAARTRTBC=EAPNEIJAHEJILFGECNDAKNPE; ASPSESSIONIDQSBQTQDD=HNHHELCCMPNLDFACKGPLCJHF'

I can generate a temp password if someone wants to use a browser to see what's happening...just private message me for the login details. Thanks!
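Since both captured requests are plain GETs with a session cookie and a Referer, they could probably be replayed from PHP after the login steps. This is only a rough sketch: `buildGtdUrl()` and `fetchWithSession()` are made-up names, the cookie jar path is whatever the login script already uses, and the DT value merely looks like a millisecond timestamp, which is a guess.

```php
<?php
// Hypothetical helper: rebuild the GTD.asp URL that page 4 requests.
function buildGtdUrl(string $base, string $procedure, array $args, string $dt): string
{
    // Quote each argument the way the captured request does: 'WBJI-FM', '10/19/2020', ...
    $quoted = array_map(function ($arg) {
        return "'" . $arg . "'";
    }, $args);
    $sqlCmd = $procedure . ' ' . implode(', ', $quoted);

    return $base . '?SQLCMD=' . rawurlencode($sqlCmd) . '&DT=' . $dt;
}

// Hypothetical helper: GET a URL with the same cookie jar the login requests used.
function fetchWithSession(string $url, string $referer, string $cookieJar)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_COOKIEFILE     => $cookieJar,  // send cookies collected so far
        CURLOPT_COOKIEJAR      => $cookieJar,  // keep any new ones
        CURLOPT_REFERER        => $referer,
        CURLOPT_ENCODING       => '',          // accept compressed responses, like --compressed
    ]);

    return curl_exec($ch);                     // string on success, false on failure
}
```

rawurlencode() also encodes the commas and slashes that the browser left alone; the server should decode both forms to the same string, but that is untested against this site.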
-
4 pages deep means login page (page 1). Simulate a link click to page 2. Simulate a link click to page 3. Simulate a link click to page 4. Pages 2 and 3 have no JavaScript or AJAX coding. Page 4 does! Pages 2 and 3 have some coding that is tied to the login to display certain info. As far as I can tell, page 4 uses AJAX to make some database calls...thus without the AJAX info passed via CURL, I get the AJAX:0 error. I'm assuming the AJAX:0 error is in the code of the page...but when I visit that page via a browser, it works...so no error. What code would you like me to post? Page 1, 2, 3, or 4?
-
I'm trying to log in and scrape a page 4 pages deep. I can get to the fourth page...but that page only returns AJAX ERROR:0. I know NOTHING about AJAX calls via Curl. Can someone please help me with what to look for in the source code of the 4th page (when using a browser) to work out what I'm supposed to pass along via CURL? If you need the source code or login credentials to see what's happening in the background, I can generate a temp password for you. Thanks!
-
Upon further investigation, there is a cookie being set by a jQuery script...which in turn sets the rowid. So, how can I get CURL to interact with JavaScript? Headless browser? I've never had much success with headless browsers before. Any tips? Thanks!
-
But the rowid isn't set until you submit the form...so parse_url returns NULL. The flow of the login is: login page -> form processing page -> landing page. In the script, I'm sending a POST to the form processing page. Once that is done, the form processing page sets the rowid. I can see the rowid in my browser, but CURL doesn't know that value.
-
My real job is as a radio announcer. We are required to play advertising commercials for various programs that we broadcast. I wrote a script a few years ago to automatically log in to the provider's website (PHP, Curl) and download the mp3's that we are supposed to play each day.

Now, the provider has updated their website and changed the login process. It used to be (in a browser) that when you logged in, you were taken to a landing page with a unique "rowid" in the URL. Once you knew that rowid, you could simply go to that page, and bypass the login process. Not very secure...but easy to scrape!

Now, when you log in, you are assigned a random rowid. Using the Firefox Developer Tools, I see that there is NO cookie associated with the login. The rowid is now set in the form processing page. In the Developer Tools, this is seen in the Params...Query String section.

How can I extract these params (rowid) from the form processing script, and proceed to the landing page...with the rowid in the Curl command?

This just returns me to the login page...since the rowid isn't set. Thanks!

$url = 'http://domain.com/formprocessing.html';
$fields = array(
    'username' => urlencode('xxx'),
    'password' => urlencode('xxx')
);
$fields_string = '';
foreach ($fields as $key => $value) {
    $fields_string .= $key.'='.$value.'&';
}
// rtrim() returns the trimmed string, so assign it back
$fields_string = rtrim($fields_string, '&');

//open connection
$ch = curl_init();

//set the url, number of POST vars, POST data
curl_setopt($ch,CURLOPT_URL, $url);
curl_setopt($ch,CURLOPT_POST, count($fields));
curl_setopt($ch,CURLOPT_POSTFIELDS, $fields_string);

//execute post
$result = curl_exec($ch);

//close connection
curl_close($ch);
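One thing that might be worth trying, assuming the form processing page answers with an ordinary HTTP redirect whose target URL carries the rowid: let Curl follow the redirect and then read the final URL back. The helper names here are made up, and if the redirect is done by JavaScript instead of a Location header this will not see it.

```php
<?php
// Hypothetical helper: pull one query-string parameter (e.g. rowid) out of a URL.
function extractQueryParam(string $url, string $name): ?string
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;                    // the URL has no query string
    }
    parse_str($query, $params);

    return isset($params[$name]) && is_string($params[$name]) ? $params[$name] : null;
}

// Hypothetical helper: POST the login form, follow redirects, report where Curl ended up.
function loginAndGetFinalUrl(string $loginUrl, array $fields, string $cookieJar): string
{
    $ch = curl_init($loginUrl);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),  // handles the urlencoding
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,                       // follow the redirect to the landing page
        CURLOPT_COOKIEJAR      => $cookieJar,
        CURLOPT_COOKIEFILE     => $cookieJar,
    ]);
    curl_exec($ch);

    return (string) curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
}
```

Usage would be something like `extractQueryParam(loginAndGetFinalUrl($url, $fields, '/tmp/cookies.txt'), 'rowid')`, where the cookie jar path is just a placeholder.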
-
I have a given date on a webpage that I'm scraping to insert into a DB. The date is in this format:

Sun, Feb 9<br />3:00 PM ET

I need to insert this into the DB in this format:

2020-02-09 15:00:00

It will always be this year...How can I change this data to be inserted correctly? I'm trying this...and it's inserting as 1969:

$date1 = 'Sun, Feb 9<br />3:00 PM ET';
$healthy = array("<br />", " ET", "Sun.", "Mon.", "Tue.", "Wed.", "Thu", "Fri.", "Sat.", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec");
$yummy = array(" ", "2020", "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December");
$date = str_replace($healthy, $yummy, $date1);
$start = date('Y-m-d H:i:s', strtotime("$date -120 minutes"));

Thanks!
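A 1969 date usually means strtotime() could not parse the string and returned false. An alternative that avoids the search-and-replace tables is to clean the string and parse it against an explicit format. This is a sketch under a couple of assumptions: `scrapedDateToSql()` is a made-up name, "ET" is taken to mean US Eastern time, and the value is stored as local Eastern time with no offset applied.

```php
<?php
// Hypothetical helper: turn "Sun, Feb 9<br />3:00 PM ET" into "2020-02-09 15:00:00".
function scrapedDateToSql(string $scraped, int $year): ?string
{
    $text = str_replace('<br />', ' ', $scraped);          // "Sun, Feb 9 3:00 PM ET"
    $text = preg_replace('/^[A-Za-z]+,\s*/', '', $text);   // drop the weekday: "Feb 9 3:00 PM ET"
    $text = preg_replace('/\s+ET$/', '', trim($text));     // drop the zone label: "Feb 9 3:00 PM"

    // Assumption: ET is US Eastern time.
    $zone   = new DateTimeZone('America/New_York');
    // "!" resets all fields so nothing leaks in from the current time.
    $parsed = DateTime::createFromFormat('!M j g:i A Y', $text . ' ' . $year, $zone);

    return $parsed === false ? null : $parsed->format('Y-m-d H:i:s');
}

echo scrapedDateToSql('Sun, Feb 9<br />3:00 PM ET', 2020), "\n";   // prints 2020-02-09 15:00:00
```

Returning null for anything unparseable makes it easy to skip or log bad rows instead of inserting a 1969 date.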
-
That class will work....thank you!
-
30 files in this directory
-
All files are 128k constant. id3 tags are empty. That's how they're downloaded. I read that -k would force KB...but without it, $showsize was off from what right clicking the directory showed me for file size. Would it be better to get size of all files instead of directory size? Does du take into account the size of all files individually, or just the combined disk space used?
-
I have a directory of mp3 files, and I need to find the combined length in minutes and seconds of all the audio files. The files are all 128kbps stereo mp3's. Here's what I have so far...

<?php
//connect to remote server (hostname, port)
$connection = ssh2_connect('192.168.2.4', 22);

//authenticate
ssh2_auth_password($connection, 'username', 'password');

//execute remote command (replace /path/to/directory with absolute path)
$stream = ssh2_exec($connection, 'du -k /remotedirectory');
stream_set_blocking($stream, true);

//get the output
$dirSize = stream_get_contents($stream);

//show the output and close the connection
$showsize = $dirSize;
//echo $showsize; exit;

$math = (($showsize * 1000) / 128); //without the /128 it shows 34308000...which is correct. the files are 128kbps
//echo $math; exit;

echo gmdate("i:s", $math); // shows 27:11 which is wrong...the actual total time of all of the files in the directory is 36:34 ...minutes and seconds

fclose($stream);
?>

Any ideas where I'm off in the logic of the math? Thanks.
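The units look like the likely culprit: du -k counts in blocks of 1024 bytes, while 128kbps is 128,000 bits per second, so the size has to become bits before dividing by the bitrate. A sketch of that arithmetic, with `estimateCbrSeconds()` as a made-up name and the 34308 figure taken from the comment in the code above:

```php
<?php
// Hypothetical helper: estimate playing time of constant-bitrate audio from its size.
// $kibibytes is what `du -k` reports (units of 1024 bytes); $kbps is the bitrate
// in kilobits per second (1 kbit = 1000 bits).
function estimateCbrSeconds(int $kibibytes, int $kbps = 128): float
{
    $bits = $kibibytes * 1024 * 8;   // KiB -> bytes -> bits

    return $bits / ($kbps * 1000);   // bits divided by bits per second
}

$duOutput = "34308\t/remotedirectory";               // shape of `du -k` output
$seconds  = estimateCbrSeconds((int) $duOutput);     // the (int) cast keeps the leading number only
echo gmdate('i:s', (int) round($seconds)), "\n";     // prints 36:36
```

That lands within a couple of seconds of the real 36:34. du reports disk blocks used rather than exact file lengths, so summing the individual file sizes should get closer still. Also note that gmdate('i:s') wraps around once the total passes 60 minutes.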
-
Simple Dome to get url NOT inside an <a href tag
bschultz replied to bschultz's topic in PHP Coding Help
I used Wordpress. I want to bail on Wordpress. I've moved the Wordpress posts to a new database. I'm displaying the post on the page. Using Wordpress, I could have a post that read: This is a post. Here is an mp3 address. (with the address shown, no a href tags needed) The Wordpress plugin would take the url and display an html5 audio player on the site for that mp3. Another plugin would take a Youtube link and embed the movie in a player. I want to scrape, or parse, the database content...pull out the mp3 url's...and replace them with an html5 player.
-
I posted this....http://forums.phpfreaks.com/topic/297984-multiple-regex-on-the-same-string/ If you don't want to read the original post, I'm trying to abandon Wordpress, but still use the functionality of a couple of plugins. Not seeing any responses tells me that maybe I should be barking up a different tree. Is it possible to parse a database post and look for a URL that is NOT inside an a href tag? In Googling, I didn't see anything in the simpledom examples about scraping for anything OUTSIDE of a tag. Thanks!
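One regex-only way to approach "a URL that is NOT inside an a href" is a negative lookbehind that refuses any URL sitting directly after `=`, `"` or `'`. The sketch below is only that: `embedBareMp3Urls()` is a made-up name, the player markup is the plain html5 audio element, and a URL used as the visible text of a link (directly after `>`) would still be replaced.

```php
<?php
// Hypothetical helper: wrap bare .mp3 URLs in an html5 player, skipping URLs that
// are already an attribute value such as href="..." or src='...'.
function embedBareMp3Urls(string $html): string
{
    // (?<![="']) means the URL must not directly follow =, " or '
    $pattern = '~(?<![="\'])https?://[^\s<>"\']+?\.mp3\b~i';

    return (string) preg_replace_callback($pattern, function (array $m): string {
        $src = htmlspecialchars($m[0], ENT_QUOTES);   // escape the URL for use in an attribute
        return '<audio controls><source src="' . $src . '" type="audio/mpeg">'
             . 'Your browser does not support the audio element.</audio>';
    }, $html);
}
```

For example, `embedBareMp3Urls('Listen Here: http://domain.com/link.mp3')` wraps the address in a player, while `<a href="http://domain.com/link.mp3">Listen</a>` comes back unchanged.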
-
Another option...update all the old wordpress posts with updated html5 player code. Any ideas on how to write that auto-update code?
-
I'm trying to move away from a Wordpress site. The site used multiple plugins for taking Youtube URL's in a post and embedding a Youtube player. I have that part figured out. The old site also used a plugin to take an mp3 URL and change that to an html5 audio player. That's where I'm having some problems.

I need to strip the [php] and [/php] tags from the Wordpress post...and replace them with correct open and close php tags. I want to remove all LINKS to mp3 files...and put a player in place. I also want to take all mp3 URL's...and put a player in place. One wrinkle...in some of the Wordpress posts, I have a php include of another file. In that included file are mp3 links.

The following code somewhat works. It matches the third PATTERN correctly (mp3 LINK). The second PATTERN does NOT match (mp3 URL...no a href tags).

Are the second and third PATTERNS conflicting? Can they both match the same thing? I don't know nearly enough about regex to know. Also, why isn't the second PATTERN matching a URL? Also, how can I handle the included file...since it doesn't appear to be matching those LINKS? (The included file is in the Wordpress post content...do I need to run eval on that post BEFORE running the regex? If so, how do you store eval results in a variable for further processing?)

<?php
$patterns = array();
$patterns[] = '#(https?://)(www.)(?:youtube(?:-nocookie)?\.com/(?:[^/\s]+/.+/|(?:v|e(?:mbed)?)/|[^?&\s]*[?&]v=)|youtu\.be/)([^"&?/ ]{11})#x';
$patterns[] = '((https?:\/\/)?(\w+?\.)+?(\w+?\/)+\w+?.(mp3|ogg))';
$patterns[] = "((?i)a\\s+[^>]*?href\\s?=[\\s'\"]+(.*?(mp3))['\"]+.*?[^<]*<\/a>)";

$replacements = array();
$replacements[] = '<iframe width="640" height="385" src="http://www.youtube.com/embed/\\3" frameborder="0" allowfullscreen></iframe>';
$replacements[] = '<a href="\\0" class="sm2_button">BRN</a>';
$replacements[] = '<a href="\\1" class="sm2_button">BRN</a>';

$newwithyoutube1 = str_replace("[php]","<?php ",$row['content']);
$newwithyoutube2 = str_replace("[/php]"," ?>",$newwithyoutube1);
$newwithyoutube3 = preg_replace($patterns, $replacements, $newwithyoutube2);
$newwithyoutube4 = str_replace(' <<a', ' <a', $newwithyoutube3); //the third PATTERN is adding an extra < symbol...so remove it

if (strpos($newwithyoutube4 ,'<br')) {
    eval('?>'.$newwithyoutube4.'<?php ');
} else {
    $nlnewphrase = nl2br($newwithyoutube4);
    eval('?>' . $nlnewphrase . '<?php ');
}
?>

Thanks!
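On the last question, output buffering is the usual way to get what eval prints into a variable: start a buffer, run the eval, then read the buffer back. A minimal sketch, with `renderPhpString()` as a made-up name; since eval runs whatever is in the string, this only makes sense for content you fully trust.

```php
<?php
// Hypothetical helper: run a string that mixes HTML and embedded PHP blocks and
// return what it prints, instead of sending it straight to the browser.
// WARNING: eval() executes the string as code; use it on trusted content only.
function renderPhpString(string $content): string
{
    ob_start();                        // start capturing output
    try {
        eval('?' . '>' . $content);    // the leading close tag makes eval start in HTML mode
    } finally {
        $output = ob_get_clean();      // stop capturing and fetch the buffer
    }

    return (string) $output;
}

$rendered = renderPhpString('Total: <?php echo 1 + 2; ?> files');
```

Anything an include inside the post prints ends up in the returned string too, so the regex replacements could then be run on `$rendered` rather than on the raw post.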
-
That worked. THANK YOU! You have no idea how much I hate the formatting of regex!
-
I'm trying to search a string for Youtube url's and mp3 url's...and replace each with a player. Here's the code:

$patterns = array();
$patterns[0] = '#(http://)(www.)(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})#x';
$patterns[1] = '((https?:\/\/)?(\w+?\.)+?(\w+?\/)+\w+?.(mp3|ogg))';

$replacements = array();
$replacements[0] = '<iframe width="640" height="385" src="http://www.youtube.com/embed/\\3" frameborder="0" allowfullscreen></iframe>';
$replacements[1] = '<a id="m5" class="audio {skin:"green", autoPlay:false, addShadow:false,addGradientOverlay:true}" href="\\0">Beaver Radio Network</a>';

$brnpost = 'This is our story content. Here is the youtube link: <br /><br /> http://www.youtube.com/watch?v=qvwefWgIQhY <br /><br />Listen Here: http://beaverradionetwork.com/audio/1011/brnpodcasts/leonhardt_skaar_2015.mp3';

echo preg_replace($patterns, $replacements, $brnpost);

If I run each Regex individually, it works. If I join them into the array, it takes part of the mp3 link...and puts it in the Youtube embed. Like this:

This is our story content. Here is the youtube link: <br /><br /> <iframe width="640" height="385" src="http://www.youtube.com/embed/leonhardt_s" frameborder="0" allowfullscreen></iframe>kaar_2015.mp3</body>

What am I doing wrong? Thanks!
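The output above is consistent with the greedy `.+/` in the first pattern: `[^/]+/.+/` can start at `watch?v=...`, run across the spaces and `<br />` tags to the last slash of the mp3 URL, and then grab `leonhardt_s` as the 11-character ID. A tighter pattern that cannot cross whitespace might look like the sketch below, which assumes only the `watch?v=` and `youtu.be/` link forms and an 11-character ID made of letters, digits, `_` and `-`. `embedYoutubeLinks()` is a made-up name.

```php
<?php
// Hypothetical helper: replace YouTube watch/short links with an embed iframe.
function embedYoutubeLinks(string $html): string
{
    // Nothing in this pattern can match whitespace, so it cannot run into a later URL.
    $pattern = '~https?://(?:www\.)?(?:youtube\.com/watch\?v=|youtu\.be/)([A-Za-z0-9_-]{11})~';
    $replacement = '<iframe width="640" height="385" src="http://www.youtube.com/embed/$1" '
                 . 'frameborder="0" allowfullscreen></iframe>';

    return (string) preg_replace($pattern, $replacement, $html);
}

$brnpost = 'This is our story content. Here is the youtube link: <br /><br /> '
         . 'http://www.youtube.com/watch?v=qvwefWgIQhY <br /><br />Listen Here: '
         . 'http://beaverradionetwork.com/audio/1011/brnpodcasts/leonhardt_skaar_2015.mp3';
$converted = embedYoutubeLinks($brnpost);
```

With the sample post, the embed src becomes `.../embed/qvwefWgIQhY` and the mp3 address is left alone for the second pattern to handle.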
-
I am abandoning Wordpress after several years. I need to handle some plugin functionality though OUTSIDE of Wordpress. I currently have a plugin that takes a link to an mp3, and Wordpress displays an html5 audio player. I also have a plugin that takes a youtube link, and embeds that video in the post.

I don't need any code (unless you'd like to show me)...but what functions should I be using to find those links...and then add in the other coding needed to achieve what the plugins did? I don't want to include the Wordpress files and structure in every page just to be able to use the plugins with the new code.

Here's what I currently have in the database:

http://domain.com/link.mp3

Here's the code I need to add:

<audio controls>
<source src="http://domain.com/link.mp3" type="audio/mpeg">
Your browser does not support the audio element.
</audio>

And for Youtube:

https://www.youtube.com/watch?v=qvwefWgIQhY

Needs to be:

<iframe width="420" height="315" src="https://www.youtube.com/embed/qvwefWgIQhY" frameborder="0" allowfullscreen></iframe>

I've already pulled all the posts from the Wordpress database into the new database. Just looking for how to process the links in the new html layout.

EDIT...after putting in the raw links...I see that this site already does what I want to achieve...so had to put them in code tags! Thanks!
-
Nevermind...figured it out. I never knew you could stick variables in an echo that way. Learned TWO things today! Thanks again!
-
Thanks...that did the trick! So that I can understand what's going on with this code...what is the significance of the {} brackets around the XML field names?
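For reference, and assuming the code in question put expressions like `{$xml->SpotData->SpotTitle}` inside a double-quoted string: the braces mark where the expression starts and ends, so PHP evaluates the whole chain instead of stopping after the first property. The XML fragment below is invented for the illustration.

```php
<?php
// Hypothetical fragment; the element names follow the broadcast-log XML.
$xml = new SimpleXMLElement(
    '<Rotation><SpotData><SpotTitle>Biore</SpotTitle></SpotData></Rotation>'
);

// With braces, the whole property chain is evaluated inside the string.
$with = "Title - {$xml->SpotData->SpotTitle}";

// Without braces, PHP only interpolates $xml->SpotData and prints the rest literally.
$without = "Title - $xml->SpotData->SpotTitle";

echo $with, "\n";
echo $without, "\n";
```

The same brace syntax is what allows array elements with quoted keys, such as `{$row['title']}`, inside a string.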
-
I work at a radio station in the US. We are required to air some commercial content from our news provider. Our commercial scheduling people are currently putting these commercials in by hand (they read the provider's webpage for the schedule that tells them which commercials are to be played...and when they are to be aired). I'd begun writing some code to read through their html page to download the commercials and rename them so that they can show up each day on the on-air log...and they would air when they are supposed to. Now, the provider is providing an xml file...which should ease how I'm pulling the info...but I'm running into some problems. Here's the xml:

<BroadcastLogs>
  <DayData>
    <Date>2015-05-11</Date>
    <Rotation>
      <Time> 6a - 7p</Time>
      <SpotData>
        <SpotTitle>Biore</SpotTitle>
        <SpotISCI>QBIO-0065-000</SpotISCI>
        <Length>30</Length>
        <Type>Primary</Type>
        <SpotType/><Script></Script>
      </SpotData>
      <SpotData>
        <SpotTitle>Motel 6</SpotTitle>
        <SpotISCI>YOTL-5022-000</SpotISCI>
        <Length>30</Length>
        <Type>Linked</Type>
        <SpotType/><Script></Script>
      </SpotData>
    </Rotation>
    <Rotation>
      <Time> 6a - 7p</Time>
      <SpotData>
        <SpotTitle>Biore</SpotTitle>
        <SpotISCI>QBIO-0065-000</SpotISCI>
        <Length>30</Length>
        <Type>Primary</Type>
        <SpotType/><Script></Script>
      </SpotData>
      <SpotData>
        <SpotTitle>Motel 6</SpotTitle>
        <SpotISCI>YOTL-5022-000</SpotISCI>
        <Length>30</Length>
        <Type>Linked</Type>
        <SpotType/><Script></Script>
      </SpotData>
    </Rotation>
    <Rotation>
      <Time> 6a - 7p</Time>
      <SpotData>
        <SpotTitle>Sprint/Boost Promotional Offer</SpotTitle>
        <SpotISCI>BQRAA-4051</SpotISCI>
        <Length>60</Length>
        <Type>Primary</Type>
        <SpotType/><Script></Script>
      </SpotData>
    </Rotation>
  </DayData>
</BroadcastLogs>

You will notice that there's only one <DayData><Date>xxxx-xx-xx</Date> for the day...which shouldn't be too tough to solve. The part I'm having a problem with is if the commercial is 30 seconds...there are TWO ads...for every ONE <Rotation> tag.

Here you will find TWO 30 second ads:

<Rotation>
  <Time> 6a - 7p</Time>
  <SpotData>
    <SpotTitle>Biore</SpotTitle>
    <SpotISCI>QBIO-0065-000</SpotISCI>
    <Length>30</Length>
    <Type>Primary</Type>
    <SpotType/><Script></Script>
  </SpotData>
  <SpotData>
    <SpotTitle>Motel 6</SpotTitle>
    <SpotISCI>YOTL-5022-000</SpotISCI>
    <Length>30</Length>
    <Type>Linked</Type>
    <SpotType/><Script></Script>
  </SpotData>
</Rotation>

If the commercial is a 60 second commercial...there's only ONE ad in the <Rotation> tag...

<Rotation>
  <Time> 6a - 7p</Time>
  <SpotData>
    <SpotTitle>Sprint/Boost Promotional Offer</SpotTitle>
    <SpotISCI>BQRAA-4051</SpotISCI>
    <Length>60</Length>
    <Type>Primary</Type>
    <SpotType/><Script></Script>
  </SpotData>
</Rotation>

I've tried some code from the php.net site...but couldn't seem to get the foreach's right for the difference in tags used if it's a 30 second or a 60 second ad. Now, I'm trying this code:

<?php
function RecurseXML($xml,$parent="")
{
    $child_count = 0;
    foreach($xml as $key=>$value)
    {
        $child_count++;
        if(RecurseXML($value,$parent.".".$key) == 0) // no children, aka "leaf node"
        {
            //////////the code on the following line was taken from a google search...and worked by displaying $parent, $key and $value
            // print($parent . "." . (string)$key . " = " . (string)$value . "<br />\n");

            ////// I added the next section to the code, to set variables
            if ($parent . "." . (string)$key === '.DayData.Date') { $date = (string)$value; }
            if ($parent . "." . (string)$key === '.DayData.Rotation.Time') { $time = (string)$value; }
            if ($parent . "." . (string)$key === '.DayData.Rotation.SpotData.SpotTitle') { $title = (string)$value; }
            if ($parent . "." . (string)$key === '.DayData.Rotation.SpotData.SpotISCI') { $link = (string)$value; }
            if ($parent . "." . (string)$key === '.DayData.Rotation.SpotData.Length') { $length = (string)$value; }
        }
        echo "Date - " . $date . "<br />Time - " . $time . "<br />Title - " . $title . "<br />Link - " . $link . "<br />Length - " . $length . "<br />-------------------<br />";
    }
    return $child_count;
}

$string = file_get_contents('full url goes here');
$xml = new SimpleXMLElement($string);
RecurseXML($xml);
?>

The code above isn't working since the loop is off...the first line returned is the date...the next four are blank. The next chunk has the date empty, the time is listed...and the rest in that chunk are blank...and so on.

What I need is the values for:

DayData->Date
DayData->Rotation->Time
DayData->Rotation->SpotData->SpotTitle
DayData->Rotation->SpotData->SpotISCI
DayData->Rotation->SpotData->Length

Am I even on the right path here? I've never dealt with xml...much less parsing it! Thanks!
-
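For the broadcast-log XML in the post above, three nested foreach loops over DayData, Rotation and SpotData may be simpler than recursing, because each level already knows its date and time and it does not matter whether a Rotation holds one SpotData or two. A sketch with a made-up function name and a shortened copy of the sample XML:

```php
<?php
// Hypothetical helper: flatten the log into one row per SpotData element.
function flattenBroadcastLog(string $xmlString): array
{
    $xml  = new SimpleXMLElement($xmlString);
    $rows = [];
    foreach ($xml->DayData as $day) {
        $date = trim((string) $day->Date);
        foreach ($day->Rotation as $rotation) {
            $time = trim((string) $rotation->Time);
            // Runs once for a 60 second spot, twice for a pair of 30 second spots.
            foreach ($rotation->SpotData as $spot) {
                $rows[] = [
                    'date'   => $date,
                    'time'   => $time,
                    'title'  => trim((string) $spot->SpotTitle),
                    'isci'   => trim((string) $spot->SpotISCI),
                    'length' => (int) (string) $spot->Length,
                ];
            }
        }
    }

    return $rows;
}

$sample = <<<'XML'
<BroadcastLogs><DayData><Date>2015-05-11</Date>
<Rotation><Time> 6a - 7p</Time>
<SpotData><SpotTitle>Biore</SpotTitle><SpotISCI>QBIO-0065-000</SpotISCI><Length>30</Length></SpotData>
<SpotData><SpotTitle>Motel 6</SpotTitle><SpotISCI>YOTL-5022-000</SpotISCI><Length>30</Length></SpotData>
</Rotation>
<Rotation><Time> 6a - 7p</Time>
<SpotData><SpotTitle>Sprint/Boost Promotional Offer</SpotTitle><SpotISCI>BQRAA-4051</SpotISCI><Length>60</Length></SpotData>
</Rotation>
</DayData></BroadcastLogs>
XML;

$rows = flattenBroadcastLog($sample);
```

Each entry in `$rows` carries the date and time of its parent elements, so it can be echoed or inserted into a table directly.
-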
I work at a radio station. Each week, we have to download (using SCP or wget) a bunch of mp3's to air during the weekend. The problem I'm having is that the remote directory that these files are located in changes the name from one week to another. For instance, this week one directory is named 'Live In Concert ROCK (Sunday) 02.22.15 version3'. Next week, it might be 'Live In Concert ROCK (Sunday) 02.29.15 version2' What I need is a matching pattern of 'Live in Concert' and the 'date'...and there will be other stuff in between those two patterns. To complicate this...it's on a remote machine. What are my options? Thanks.
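One option, assuming the remote directory names can be listed first (over SSH or otherwise), is to match them against a pattern built from the show name and the date in mm.dd.yy form, ignoring whatever sits in between. `findShowDirectory()` is a made-up name and the second directory in the usage example is invented.

```php
<?php
// Hypothetical helper: pick the directory whose name starts with $prefix and
// contains the date written as mm.dd.yy, whatever comes between the two.
function findShowDirectory(array $names, string $prefix, DateTimeInterface $date): ?string
{
    $pattern = '/^' . preg_quote($prefix, '/') . '\b.*\b'
             . preg_quote($date->format('m.d.y'), '/') . '\b/i';   // "i" ignores case

    foreach ($names as $name) {
        if (preg_match($pattern, $name) === 1) {
            return $name;
        }
    }

    return null;   // nothing matched this week
}

$names = [
    'Some Other Show 02.22.15',
    'Live In Concert ROCK (Sunday) 02.22.15 version3',
];
$match = findShowDirectory($names, 'Live in Concert', new DateTimeImmutable('2015-02-22'));
```

The returned name can then be quoted and handed to scp or wget.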
-
Typecast worked for the int values...but all the other fields had whitespace too. I ran trim on the foreach for the array...and it didn't work. I ran trim on the insert of the mysql table for each field...and it worked. Still don't know where the whitespace came from, but I got rid of it. Thanks!
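One possible reason the trim in the foreach had no effect, assuming the loop iterated by value: a plain `foreach ($row as $value)` hands you a copy of each element, so trimming it never touches the array itself. array_map() returns a cleaned copy instead. The row below is made up.

```php
<?php
// Hypothetical row, standing in for one record before it is inserted.
$row = ['name' => "  Bemidji \n", 'score' => ' 3 '];

// Trimming inside a by-value foreach changes only the loop variable:
foreach ($row as $value) {
    $value = trim($value);          // $row itself is untouched
}

// array_map() builds a new array with every value trimmed (string keys are kept).
$clean = array_map('trim', $row);
```

After this, `$clean['name']` is `Bemidji` while `$row['name']` still has its padding.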
-
That's what I have for them. The WYSIWYG box shows the copied data...not the raw html. If they don't know how to read the html, how are they supposed to know what to label the elements for my code?
-
My clients have ZERO html knowledge...they copy and paste...there's no way for them to know what the elements are from one school to the next.