Regex to extract URL with special characters

By abhi_madhani
May 23, 2012 in Regex Help

I am trying to extract URLs from an HTML page using the following regular expression.

<a href="      (....till....)     "> Name </a>

preg_match_all('/<\s*a\s+[^>]*href\s*=\s*[\"\']?([^\"\'>]+)[\"\']>(.*)<\/a>/isU'

It works on most URLs, but it fails to recognise URLs like the following in the HTML:

http://en.wikipedia.org/wiki/B'ham - note the single quote in B'ham.

The URLs in the HTML are not encoded for special characters, so my regex has to account for URLs that contain special characters.

Could anyone point me towards a solution?
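
One possible direction, as a rough sketch rather than a definitive answer: the pattern above excludes both quote characters from the captured value (`[^\"\'>]+`), so a double-quoted href containing a single quote can never match. A common workaround is to capture the opening quote and use a backreference (`\1`), so the value ends only at the same kind of quote that opened it. The helper name `extractLinks`, the sample `$html`, and the `example.com` link are made up for illustration, and the sketch assumes every href is quoted.

```php
<?php
// Hypothetical helper (not from the original post): returns url/text pairs
// for every <a ... href="..."> or <a ... href='...'> found in $html.
function extractLinks(string $html): array
{
    // (["\'])  captures the opening quote, either " or '
    // (.*?)    lazily captures the URL
    // \1       requires the same quote character that opened the value
    $pattern = '/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1[^>]*>(.*?)<\/a>/is';

    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $m) {
        $links[] = ['url' => $m[2], 'text' => $m[3]];
    }
    return $links;
}

// Sample input (assumed for illustration): one double-quoted href that
// contains a single quote, and one single-quoted href.
$html = <<<'HTML'
<p><a href="http://en.wikipedia.org/wiki/B'ham">Birmingham</a>
<a href='http://example.com/'>Example</a></p>
HTML;

foreach (extractLinks($html) as $link) {
    echo $link['url'], ' => ', $link['text'], PHP_EOL;
}
```

Unquoted href values are not handled by this sketch. For arbitrary real-world HTML, an HTML parser such as PHP's DOMDocument is generally more robust than any regex.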
