
regex for matching everything but URL that starts with http.


jamesmiller


Providing sample data will help with that.

 

My expression worked very well with the sample you provided.

 

Then again, you're the same guy who provides very little REAL data to test our expressions against, and who likes to modify someone else's RegEx before saying it "doesn't work."

Link to comment
Share on other sites

My apologies. The actual sample data is <a href="/torrents/798659/Jim-avi">, and I want to retrieve all the hrefs that start with /torrents/, but I don't want to retrieve hrefs like <a href="http://www.fulldls.com/torrents/Jim" target="_blank">. That's the proper sample data. In the other post I really did try the regex and it didn't work for me. Apologies once again.

Link to comment
Share on other sites

It seems like you want us to help you parse a web page. Again. Did it ever occur to you to post a link to the page you want to parse? Or perhaps the complete source of the web page?

 

We can't guess what magical differences might occur between the samples you give us and the data you use it on.

 

Before you reply with "this doesn't work," please put together some REAL SAMPLE DATA (I.E., THE SOURCE YOU WANT TO EXTRACT THE INFORMATION FROM) or I'll just ignore the thread.

 

<?php 

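// Capture a double-quoted value that starts with "/" (a site-relative path).
// Requiring the leading "/" is what rejects "http://..." values; the lookbehind
// always succeeds here because the character before it is the opening quote.
$expr = '%"(?<!http:/)(/[^"]++)"%';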

$data = 'my apologies the actual real sample data is <a href="/torrents/798659/Jim-avi"> and i want it to retrieve all the hrefs with /torrents/but i dont want it to reteriev the hrefs with <a href="http://www.fulldls.com/torrents/Jim" target="_blank"> thats the propper sample data, the other post i really tried the regex and it didnt work for me apologies once again';

preg_match_all( $expr, $data, $matches );

print_r( $matches );

?>

 

OUTPUT

 

Array
(
    [0] => Array
        (
            [0] => "/torrents/798659/Jim-avi"
        )

    [1] => Array
        (
            [0] => /torrents/798659/Jim-avi
        )

)

 

That was from an exact copy and paste of your reply. If it doesn't work, it's your fault.
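
For anyone adapting the expression above, here is a minimal sketch of a slightly stricter variant. It assumes the links always appear as href="..." with double quotes. The names $hrefPattern and extractRelativeHrefs() are made up for illustration and are not from this thread. Tying the match to href= avoids picking up other quoted strings that happen to start with a slash, and the (?!/) lookahead skips protocol-relative values such as //host/path.

```php
<?php
// Sketch only: $hrefPattern and extractRelativeHrefs() are illustrative names.
// Capture the value of href="..." only when it starts with a single "/",
// so absolute links (http://...) and protocol-relative links (//host/...) are skipped.
$hrefPattern = '%href="(/(?!/)[^"]++)"%i';

function extractRelativeHrefs(string $html, string $pattern): array
{
    // Default PREG_PATTERN_ORDER: $matches[1] holds every capture of group 1.
    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

$sample = '<a href="/torrents/798659/Jim-avi"> '
        . '<a href="http://www.fulldls.com/torrents/Jim" target="_blank">';

print_r(extractRelativeHrefs($sample, $hrefPattern));
```

On the two sample anchors from the post, only /torrents/798659/Jim-avi is captured. Single-quoted or unquoted attributes would need a different pattern.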

Link to comment
Share on other sites

I don't know what you're doing or changing.

 

<?php 

$site = 'http://www.torrentreactor.net/search.php?search=2&words=Linux&lang=';
$expr = '%"(/torrents/[^"]++)"%';
$data = file_get_contents($site);

preg_match_all( $expr, $data, $matches, PREG_SET_ORDER );

if( empty($matches) ) {
	echo 'No matches found';
} else {
	foreach( $matches as $k => $match ) {
		echo "Match $k -> {$match[1]}<br>\n";
	}
}

?>

 

Output (as of 13/9/2001 14:25 -08:00 GMT)

Match 0 -> /torrents/839930/Kurumin-Linux-7-0-Final-%28Brazilian-Linux-distribution%29
Match 1 -> /torrents/994128/Linux-books-amp%3B-Linux-Programming-books
Match 2 -> /torrents/2036087/Linux-Security-%28Craig-Hunt-Linux-Library-Series%29%7Etqw%7E-darksiderg
Match 3 -> /torrents/3315461/Linux-Real-Time-Linux-Who-needs-it%3F
Match 4 -> /torrents/3453932/Linux-Transfer-for-Windows-Power-Users-Getting-Started-with-Linux-for-the-Desktop-%28
Match 5 -> /torrents/3460090/Securing-Linux-A-Survival-Guide-for-Linux-Security-Version-1-0-2003-allfeeebook-tk
Match 6 -> /torrents/3482085/Puppy-Linux-4-3-1-%28Linux-1CD-ENG%29-Sistema-Operativo-PC-light-amp%3B-live-%28TNT-Village%29
Match 7 -> /torrents/3517319/Securing-Linux-A-Survival-Guide-for-Linux-Security-Version-1-0
Match 8 -> /torrents/4846856/Linux-For-Dummies-6th-Edition-%282005%29-Linux-For-Dummies-7th-Edition-%282006%29
Match 9 -> /torrents/4937930/Understanding-The-Linux-Kernel-3rd-Edition-%282005%29-chm-Understanding-The-Linux-Kernel-2nd-Edition-%282002%29-chm
Match 10 -> /torrents/4999164/The-Linux-Programming-Interface-A-Linux-and-UNIX-System-Programming-Handbook
Match 11 -> /torrents/38107/SUSE-Linux-10-Live-CD
Match 12 -> /torrents/52749/Various-Linux-and-Unix-Books-PDF-%2F-CHM-%2F-HTML
Match 13 -> /torrents/195234/WMware-v4%2Bv5-workstation-Linux%2FWindows-Keymaker-TEAM-ZWT
Match 14 -> /torrents/254667/Maple-9-5-Hybrid-Mac-OS-X%2FLinux%2FWindows
Match 15 -> /torrents/320302/Mathematica-5-2-Win-Linux-Mac
Match 16 -> /torrents/461371/Open-SUSE-Linux-10-1-for-x86-32-bit-Official-ISO
Match 17 -> /torrents/569062/RevolutionOS-%28Linux-story%29-avi
Match 18 -> /torrents/750568/Visual-SlickEdit-version-11-Windows-%2B-Linux
Match 19 -> /torrents/806233/Novell-Course-3071-SUSE-Linux-Enterprise-Server10-Fundamentals-e
Match 20 -> /torrents/838710/Ubuntu-Linux-Bible-Jan-2007-pdf
Match 21 -> /torrents/955297/Cisco-Java-Perl-C-C%2B%2B-Linux-PHP-Ebooks
Match 22 -> /torrents/956762/Linux-Magazine-HorsSerie-BSD-Acte1-FRENCH-eBook-pdf
Match 23 -> /torrents/1034967/Understanding-The-Linux-Kernel-3rd-Edition
Match 24 -> /torrents/1064721/Red-Hat-Linux-Networking-And-System-Administration
Match 25 -> /torrents/1070173/Damn-Small-Linux-3-4
Match 26 -> /torrents/1118254/Wicked-Cool-Shell-Scripts-101-Scripts-For-Linux-Mac-OS-X-An
Match 27 -> /torrents/1118722/Beginning-Ubuntu-Linux-2nd-Edition
Match 28 -> /torrents/1124744/Beginning-SUSE-Linux-From-Novice-To-Professional
Match 29 -> /torrents/1126905/SELinux-By-Example-Using-Security-Enhanced-Linux
Match 30 -> /torrents/1126922/Puppy-Linux-2-17-1
Match 31 -> /torrents/1131875/Penumbra%3A-Overture-Linux-x86_64
Match 32 -> /torrents/1143158/Beginning-Linux-Programming-3rd-DDU-pdf
Match 33 -> /torrents/1176753/For-Dummies-Linux-For-Dummies-8th-Edition-Jul-2007-eBook-BBL
Match 34 -> /torrents/1181957/WOLFRAM-RESEARCH-MATHEMATICA-V6-0-1-WINDOWS-LINUX-MAC-EDGEISO
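
If the regex turns out to be fragile against the real page source, one alternative (a different technique from the one used above, not something posted in this thread) is to let PHP's DOM extension parse the markup and then filter the href values. This is a rough sketch, assuming the DOM extension is enabled. torrentLinks() is a hypothetical helper name, and the inline HTML stands in for whatever file_get_contents() would return.

```php
<?php
// Sketch only: torrentLinks() is a hypothetical helper, not code from the thread.
function torrentLinks(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        // Keep only site-relative torrent paths; absolute http:// links fail this check.
        if (strncmp($href, '/torrents/', 10) === 0) {
            $links[] = $href;
        }
    }
    return $links;
}

// Stand-in markup built from the two sample anchors quoted earlier in the thread.
$html = '<html><body>'
      . '<a href="/torrents/798659/Jim-avi">Jim</a>'
      . '<a href="http://www.fulldls.com/torrents/Jim" target="_blank">Jim</a>'
      . '</body></html>';

print_r(torrentLinks($html));
```

The prefix check plays the same role as the leading /torrents/ in the expression above, but it no longer depends on quoting style or attribute order.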

Link to comment
Share on other sites
