PNewCode Posted March 9

Hello. I have the following code, which works for its intended purpose for all YouTube links, including share (si) extensions. Can someone please tell me how to alter it to add Shorts? This is in PHP and has to remain that way. I can't use JavaScript for this because it is part of a very long PHP-coded page.

Shorts example: https://www.youtube.com/shorts/J0iIQ629N2c

My code that works for all but Shorts:

$regex_pattern = "/(youtube.com|youtu.be)\/(watch)?(\?v=)?(\S+)?/";

I have tried the following and it failed. I should note that I have been trying to understand regex patterns, and my mind doesn't seem to want to learn them at a normal rate, but I keep trying.

$regex_pattern = "/(youtube.com|youtu.be)\/(shorts|watch)?(\?v=)?(\S+)?/";
$regex_pattern = "/(youtube.com|youtu.be)\/(shorts)\/(watch)?(\?v=)?(\S+)?/";
$regex_pattern = "/(youtube.com|youtu.be)\/(watch)?(\?v=)?(\S+)?(\shorts)?/";
requinix Posted March 9

Why would you use JavaScript for this?

It's okay to have the regex be multiple patterns. You don't necessarily have to use a single capture group to get the one value you care about.

youtube.com/shorts/(\w+)|youtube.com/watch\?v=(\w+)|youtu.be/whatever else

Only one of $1 or $2 (or what you put in the "whatever else") will ever have a value. And do remember that an unescaped "." matches any character, so "youtubexcom/shorts/blah" will match the above too.
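A minimal sketch of the multi-pattern idea above, for anyone following along. It assumes PHP's preg_match, a "~" delimiter (so the slashes need no escaping), and escaped dots. The function name find_video_id is hypothetical, and [\w-]+ is used in place of \w+ on the assumption that video IDs can contain hyphens.

```php
<?php
// Hypothetical helper: returns the video ID, or null when nothing matches.
function find_video_id(string $url): ?string
{
    // Three alternatives, one capture group each. "~" is the delimiter,
    // so "/" needs no escaping; "\." matches a literal dot only.
    $pattern = '~youtube\.com/shorts/([\w-]+)'
             . '|youtube\.com/watch\?v=([\w-]+)'
             . '|youtu\.be/([\w-]+)~';

    if (!preg_match($pattern, $url, $m)) {
        return null;
    }

    // Only the group that belongs to the matching alternative is non-empty.
    foreach (array_slice($m, 1) as $group) {
        if ($group !== '') {
            return $group;
        }
    }
    return null;
}

echo find_video_id('https://www.youtube.com/shorts/J0iIQ629N2c'), PHP_EOL;
```

Looping over the groups avoids having to know which of $1, $2 or $3 was filled in for a given link.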
PNewCode Posted March 11 (edited)

@requinix Thank you for that. I tried the following, based on what you said, but with no luck. Admittedly, I have to come up with a best-guess solution for it because regex is the bane of my brain haha

$regex_pattern = "/(youtube.com/shorts|youtube.com|youtu.be)\/(watch)?(\?v=)?(\S+)?/";
$regex_pattern = "/youtube.com/shorts/(\w+)|youtube.com/watch\?v=(\w+)|youtu.be/(\S+)?/";
$regex_pattern = "/youtube.com/shorts/(\w+)|youtube.com/watch\?v=(\w+)|youtu.be/?si=(\w+)/";
$regex_pattern = "/youtube.com/shorts/(\w+)|youtube.com/watch\?v=(\w+)|youtu.be/?(\S+)/";

I should also note that the following is my original one, which works for all but the Shorts. I tried to use it with variations of what you said and had no luck...

$regex_pattern = "/(youtube.com|youtu.be)\/(watch)?(\?v=)?(\S+)?/";

Edit: No, I don't want to use JavaScript. I just added that in the original post because others I have asked keep telling me to use JavaScript instead haha
requinix Posted March 11

1. If you use /s for regex delimiters (at the beginning and end) then any /s you want inside the regex have to be escaped. Look at what your original had.
2. What's the rest of the code?
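To illustrate point 1, here is a short sketch (variable names are illustrative, not from the original page) that writes the same Shorts check three ways: with unescaped slashes inside / delimiters, with escaped slashes, and with a different delimiter.

```php
<?php
$url = 'https://www.youtube.com/shorts/J0iIQ629N2c';

// Broken: the first unescaped "/" after the opening delimiter ends the
// pattern, so PHP reads the rest as modifiers, warns, and returns false.
// The "@" only silences that warning for this demonstration.
$broken = @preg_match('/youtube\.com/shorts/([\w-]+)/', $url);

// Fixed by escaping every "/" inside the pattern.
$escaped = preg_match('/youtube\.com\/shorts\/([\w-]+)/', $url, $a);

// Fixed by choosing a delimiter that does not occur in the pattern.
$other = preg_match('#youtube\.com/shorts/([\w-]+)#', $url, $b);

var_dump($broken, $escaped, $other);
```

Picking a delimiter such as # or % is usually the least error-prone option for URL patterns, since nothing inside the pattern then needs extra escaping.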
PNewCode Posted March 11 (edited)

@requinix Here's the whole thing. It may not make sense (maybe?) because there's a lot of other stuff on the page (very long) that goes with everything involved for the user. But basically... the user enters a YouTube link in a form. Then that form is sent to link-insert.php, which sends that link along with the title of the YouTube video to the database (example: the link for the music video Korn - Life is peachy will send that title to the database).

What I have originally works for shared links (youtu.be?si=), regular links (youtube.com/watch?v=) and also songs from playlists. However, if someone sends a link that is a Short (youtube.com/shorts/VIDEO-ID-HERE), it sends back a blank entry to the database as the title, because it can't translate that link extension.

Note: $link is the field name in the form and $band is the column in the database. The following also successfully allows the thumbnail to show on the page that calls the info from the database. Shorts are the only thing that will send a blank entry for $band into the database ($band is the title of the video). You'll notice the last part is the part I'm having trouble with.

EDIT: Also, if it's not a YouTube link at all, then "Not A Youtube Request" is entered in $band in the database so that it's not blank.

$ytvideo1 = $link;
$linkurl = "$ytvideo1";
parse_str( parse_url( $linkurl, PHP_URL_QUERY ), $vid );
preg_match('%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})%i', $linkurl, $match);
$youtube_id = $match[1];
$preurl = "https://www.youtube.com/watch?v=$match[1]";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $preurl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
$document = htmlspecialchars($output);
curl_close($ch);
$line = explode("\n", $document);
$judul = "";
foreach($line as $strline){
    preg_match('/\<title\>(.*?)\<\/title\>/s', $strline, $hasil);
    if (!isset($hasil[0]) || $hasil[0] == "") continue;
    $title = str_replace(array("<title>", "</title>"), "", $hasil[0]);
}
$validateurl = $link;
$regex_pattern = "/(youtube.com|youtu.be)\/(watch)?(\?v=)?(\S+)?/";
$match;
if(!preg_match($regex_pattern, $validateurl, $match)){
    $band = "Not A Youtube Request";
}else{
    $band = $title;;
}
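For readers following along, a small sketch (not part of the original page; the variable names are made up) that runs both patterns from the code above against the Shorts example. It suggests the validation pattern already accepts the Shorts URL, while the ID-extraction pattern does not match it, which would leave $match[1] unset and the title lookup without a video ID.

```php
<?php
$shorts = 'https://www.youtube.com/shorts/J0iIQ629N2c';

// Validation pattern from the post: everything after the domain is optional.
$validate = "/(youtube.com|youtu.be)\/(watch)?(\?v=)?(\S+)?/";

// ID-extraction pattern from the post: no alternative covers "shorts/".
$extract = '%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})%i';

$isValid = preg_match($validate, $shorts);         // the link passes validation
$hasId   = preg_match($extract, $shorts, $match);  // but no ID is captured

var_dump($isValid, $hasId, $match);
```

If that reading is right, the line to change is the first preg_match call rather than $regex_pattern.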
PNewCode Posted March 12

I've been giving this a solid effort and I'm still stuck; I'm no further along since my last post. Any thoughts?
gizmola Posted March 12 (Solution)

Seems pretty cut and dried that you just need to add an OR to optionally match the "shorts/". I don't know whether the rest of the code will also return the data you are looking to scrape.

preg_match('%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:shorts/)?|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})%i', $linkurl, $match);
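A quick way to try the pattern above against a few link shapes, as a sketch only: the wrapper name youtube_id and the sample URLs are assumptions for illustration, and the pattern itself is copied unchanged from the post.

```php
<?php
// Hypothetical wrapper around the pattern from the post above.
function youtube_id(string $url): ?string
{
    $pattern = '%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:shorts/)?|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})%i';

    // Checking the return value avoids reading $match[1] when nothing matched.
    return preg_match($pattern, $url, $match) ? $match[1] : null;
}

$samples = [
    'https://www.youtube.com/shorts/J0iIQ629N2c',
    'https://www.youtube.com/watch?v=J0iIQ629N2c',
    'https://youtu.be/J0iIQ629N2c?si=example',
    'https://example.com/not-youtube',
];

foreach ($samples as $url) {
    echo $url, ' => ', youtube_id($url) ?? '(no ID found)', PHP_EOL;
}
```

One thing to be aware of: because the "shorts/" branch is optional, the pattern also accepts any 11 allowed characters directly after youtube.com/ (a long channel handle, for instance), so it is better treated as an ID extractor than as a strict validator.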
PNewCode Posted March 12

@gizmola You nailed it! And to @requinix: I just realized from gizmola's reply that I originally gave the wrong part of my code. I apologize for that. I thought the line that I provided was where my issue was. Apparently not.

So for my education, if you don't mind... The reason gizmola's worked is because it's the first part of a three-section translation? First being the domain, second being the path type (watch, si, etc.) and the last being the ID? That's what it looks like to me now. I couldn't see that before. And it looks as though they are separated by the " | " character?

That really helped me a lot to better understand this. Thank you. Am I correct in my understanding of how it works?
gizmola Posted March 13

The | is just an OR. (This thing)|(that thing).

There are two great regex testing sites you should try. They can really help you experiment and understand how regex works. The first is https://regex101.com/ and the second is https://regexr.com/

They both have resources and a testing interface that is really useful. I have loaded the regex I provided, with some tests, into regexr here: https://regexr.com/7tc1q

One thing to keep in mind is that the testing tools don't allow you to change the delimiter from the default of /. You can continue to use the slash delimiter without issue, so long as you escape any slashes: \/

Note that the testing tools may let you leave slashes unescaped inside a character class, i.e. [ ."/ ]. In PHP itself, a / inside a character class still has to be escaped when / is the delimiter, because PHP looks for the closing delimiter before it parses the pattern; with a different delimiter such as % it can stay unescaped.
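To make the OR structure easier to see, here is a sketch of the same pattern laid out with PCRE's x (extended) modifier, so each branch can carry a comment. The layout and comments are an editorial reading of the pattern, not something from the original posts.

```php
<?php
// Same pattern as in the solution, spread over several lines with the "x"
// modifier. Whitespace outside character classes is ignored under "x";
// the space inside the final character class is still literal.
$pattern = '%
    (?:
        youtube(?:-nocookie)?\.com/   # domain, then ONE of these branches:
        (?:
            [^/]+/.+/                 #   a path segment, more path, then a slash
          | (?:shorts/)?              #   OR an optional "shorts/" prefix
          | (?:v|e(?:mbed)?)/         #   OR "v/", "e/" or "embed/"
          | .*[?&]v=                  #   OR anything followed by "?v=" or "&v="
        )
      | youtu\.be/                    # OR the short-link domain
    )
    ([^"&?/ ]{11})                    # group 1: the 11-character video ID
%ix';

preg_match($pattern, 'https://www.youtube.com/shorts/J0iIQ629N2c', $m);
echo $m[1], PHP_EOL;
```

Since % is the delimiter here, none of the / characters need escaping, including the one inside the character class.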