jjk2 Posted March 30, 2009

I am kind of lost on how to approach this. Basically, I have many file paths like this:

crazy.com/main/videos/something/popular/index.html
crazy.com/latest/news/odds/home.jpg
crazy2.com/funny/world/politics/welcome.html
another.com/news/business/index.html

How can I get only the things in bold? Also, the filenames differ dynamically.

sasa Posted March 30, 2009

preg_match('#/.*/#', $input, $output);
print_r($output);

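To make the behaviour of the pattern above concrete, here is a minimal sketch that runs it over the four sample paths from the question. The `$paths` and `$dirs` variable names are assumptions added for illustration; only the pattern itself comes from the post.

```php
<?php
// Sample paths copied from the original question.
$paths = array(
    'crazy.com/main/videos/something/popular/index.html',
    'crazy.com/latest/news/odds/home.jpg',
    'crazy2.com/funny/world/politics/welcome.html',
    'another.com/news/business/index.html',
);

$dirs = array();
foreach ($paths as $input) {
    // The greedy .* runs from the first slash to the last slash in the string,
    // so the match is the directory part between the domain and the file name.
    if (preg_match('#/.*/#', $input, $output)) {
        $dirs[] = $output[0];
    }
}

print_r($dirs);
```

For these inputs the matches should be /main/videos/something/popular/, /latest/news/odds/, /funny/world/politics/ and /news/business/. Note that this relies on the entries having no scheme: with http:// in front, the match would start at the slashes of the scheme.
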
Salkcin Posted March 30, 2009

Even though the above regex does what you ask, here's another one:

preg_match('#(?<=crazy\.com).*?(?=(?:index|welcome\.html)|home\.jpg)#', $string, $match);
print_r($match);

nrg_alpha Posted March 30, 2009

Salkcin, there are a few issues with your suggestion, however.

a) That's probably more work than sasa's method. While I don't advocate .* too often, it does have its uses. If the url entries are checked by themselves (not nested within some large block of text), sasa's method is more likely to be faster. Granted, I haven't tested the speed difference between yours and sasa's; I'm going on the assumption that positive lookbehind and lookahead assertions cost more than some minor .* backtracking, and admittedly I could be wrong on this.

b) Your pattern requires specific domains, (?<=crazy\.com) [so what happens with crazy2.com or another.com?], with specific ending file names (such as index or welcome.html, for example).

The following code illustrates these issues:

$arr = array(
    'crazy.com/main/videos/something/popular/index.html',
    'crazy.com/latest/news/odds/home.jpg',
    'crazy2.com/funny/world/politics/welcome.html',
    'another.com/news/business/index.html'
);
foreach ($arr as $val) {
    // Print the url if the pattern matches, otherwise a notice.
    echo (preg_match('#(?<=crazy\.com).*?(?=(?:index|welcome\.html)|home\.jpg)#', $val)) ? $val . "<br />\n" : 'Url format not found using regex pattern...' . "<br />\n";
}

Output:

crazy.com/main/videos/something/popular/index.html
crazy.com/latest/news/odds/home.jpg
Url format not found using regex pattern...
Url format not found using regex pattern...

Point being, I think the idea is to be able to match directories of any url (hence the need for a flexible regex pattern), which sasa's does.

laffin Posted March 30, 2009

Yes, but it fails the condition of capturing the path only. So even though Salkcin's method is a bit longer, it does what was requested. But you may want to look at the parse_url function and play with that instead of using regex.

nrg_alpha Posted March 30, 2009

I wasn't illustrating the path capturing so much as the restriction on domain names and file names that need to be found within the pattern in the first place. sasa's is more flexible. And yes, parse_url would be even better (again, assuming that the url in question is by itself and not embedded within a string).

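The parse_url suggestion can be sketched as follows. This is only one possible approach under stated assumptions: `url_directory` is a hypothetical helper name, not something from the thread, and the sample entries have no scheme, so the sketch prepends http:// before parsing (otherwise parse_url() would report the domain as part of the path).

```php
<?php
// Hypothetical helper (not from the thread): returns the directory part
// of a url's path, including the trailing slash, or null if there is none.
function url_directory($url)
{
    // Assumption: entries without a scheme are plain "host/path" strings,
    // so add a scheme to let parse_url() separate the host from the path.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }

    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // no path component, or a url that parse_url() rejects
    }

    // Keep everything up to and including the last slash.
    return substr($path, 0, strrpos($path, '/') + 1);
}

$paths = array(
    'crazy.com/main/videos/something/popular/index.html',
    'crazy.com/latest/news/odds/home.jpg',
    'crazy2.com/funny/world/politics/welcome.html',
    'another.com/news/business/index.html',
);
foreach ($paths as $p) {
    echo url_directory($p), "\n";
}
```

Unlike the lookaround pattern, this does not depend on particular domains or file names, and unlike the bare '#/.*/#' pattern it also copes with entries that do carry a scheme. As noted in the thread, it still assumes each url is handled by itself rather than embedded in a larger block of text.
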