thebadbad

Posts posted by thebadbad

  1. preg_match('~\$Revision: (\S+)~', $this->revision, $rev);

    \S is a shorthand character class matching any non-whitespace character.

     

    The only other difference is the pair of pattern delimiters (tildes in my case), which preg_match() requires.
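    A quick way to check the pattern above is to run it against a sample string. This is only a sketch: the value of $revision is an assumed example, standing in for $this->revision from the post.

```php
<?php
// Assumed sample value; in the post this comes from $this->revision.
$revision = '$Revision: 1234 $';

// Tildes delimit the pattern; \$ matches a literal dollar sign and
// (\S+) captures the run of non-whitespace characters after the space.
$number = null;
if (preg_match('~\$Revision: (\S+)~', $revision, $rev)) {
    $number = $rev[1];
}
```

    With this input, $number should hold the string 1234, because \S+ stops at the space before the closing dollar sign.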

  2. Or if you want to use the external page's title (if applicable):

     

    function _callback($matches) {
        $html = file_get_contents($matches[0]);
        //strict comparison, since file_get_contents() returns false on failure
        if ($html !== false) {
            //use the page's title as the link text when it has one
            if (preg_match('~<title\b[^>]*>(.+?)</title>~is', $html, $match)) {
                return '<a href="' . $matches[0] . '">' . $match[1] . '</a>';
            }
        }
        //otherwise fall back to the host name
        return '<a href="' . $matches[0] . '">' . parse_url($matches[0], PHP_URL_HOST) . '</a>';
    }
    $ret = preg_replace_callback('~\bhttp://\S+(?![^<]*?>)~i', '_callback', $ret);

  3. Quick example:

     

    function _callback($matches) {
    return '<a href="' . $matches[0] . '">' . parse_url($matches[0], PHP_URL_HOST) . '</a>';
    }
    $ret = preg_replace_callback('~\bhttp://\S+(?![^<]*?>)~i', '_callback', $ret);

  4. One way you can do it, although it's not very elegant:

     

    $content = 'Testing a test with this! Tag: <span title="test">tag</span>. <h1>Test heading</h1>';
    //replace keywords (that are not part of HTML tags)
    $content = preg_replace('~\btest\b(?![^<]*?>)~i', '<a href="" UNIQUE>$0</a>', $content);
    //remove created links between heading tags
    function _callback($matches) {
    return preg_replace('~<a href="[^"]*" UNIQUE>(.*?)</a>~s', '$1', $matches[0]);
    }
    $content = preg_replace_callback('~<h([1-6])\b[^>]*>.+?</h\1>~is', '_callback', $content);
    //remove UNIQUE marks
    $content = preg_replace('~(<a href="[^"]*") UNIQUE>~', '$1>', $content);
    header('Content-type: text/plain; charset=utf-8');
    echo $content;

  5. You should be able to use get_meta_tags() for the META tags (failing that, the comments on that page also contain regex solutions).

     

    You can check for iframes with this:

     

    //$html contains page source code
    if (preg_match_all('~<iframe\b[^>]+src\s?=\s?([\'"])(.+?)\1[^>]*>~is', $html, $matches)) {
    echo '<pre>' . print_r($matches[2], true) . '</pre>';
    } else {
    echo 'No iframes found.';
    }
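    For the META tag half of this answer, here is a minimal sketch of get_meta_tags(). The page source is made up and written to a temporary file so the example runs without network access; in practice you would pass the page's URL or file name directly.

```php
<?php
// Hypothetical page source, one tag per line.
$html = implode("\n", array(
    '<html><head><title>Demo</title>',
    '<meta name="description" content="A demo page">',
    '<meta name="keywords" content="php, regex">',
    '</head><body></body></html>',
));
$file = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($file, $html);

// get_meta_tags() returns an array keyed by each tag's name attribute,
// with the content attribute as the value.
$tags = get_meta_tags($file);
unlink($file);
```

    Here $tags['description'] and $tags['keywords'] should hold the two content values.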

  6. $str = '            <h2 style="color: #383737; margin-bottom: 3px; text-transform: capitalize;">Details</h2><br/>
                <h2>input</h2><br/>';
    preg_match('~>Details</h2><br/>\s*<h2>(.*?)</h2><br/>~is', $str, $match);
    echo $match[1];

    \s* matches zero or more whitespace characters (including line breaks).

  7. When you're just replacing simple text, you can achieve this effect with strtr(). But I'm not sure how to get it done with regular expressions involved.

     

    $str = 'this string contains this body of text and works for an example';
    $replace = array(
    'this body of text' => '<span class="highlight">this body of text</span>',
    'this body' => '<span class="highlight">this body</span>',
    'body of text' => '<span class="highlight">body of text</span>',
    'body' => '<span class="highlight">body</span>'
    );
    echo strtr($str, $replace);
    //this string contains <span class="highlight">this body of text</span> and works for an example
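    The example above works because strtr(), when given an array, tries the longest keys first and never rescans text it has already replaced. A small sketch contrasting it with str_replace(), which applies its pairs one after another ($pairs and the sample string are made up for illustration):

```php
<?php
$pairs = array('Hi' => 'Hello', 'Hello' => 'Bye');

// strtr() makes a single pass, so the inserted "Hello" is left alone.
$single_pass = strtr('Hi all', $pairs);

// str_replace() runs each pair in turn, so the "Hello" produced by the
// first pair is replaced again by the second.
$sequential = str_replace(array_keys($pairs), array_values($pairs), 'Hi all');
```

    $single_pass ends up as "Hello all", while $sequential ends up as "Bye all". That is why nested highlight spans don't appear with strtr().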

  8. No worries. In that case you can define an array of patterns and replacements:

     

    $replace = array(
    '~\[\[#([0-9]+)#\]\](.*?)\[\[end\]\]~is' => '<a href="javascript:;" onclick="Show_Stuff(\'$1\');">Show</a>
    <div style="display:none;" id="$1">$2</div><br />',
    '~\[\*(.*?)\*\]~s' => '<strong>$1</strong>',
    '#\[~(.*?)~\]#s' => '<em>$1</em>'
    );
    $str = preg_replace(array_keys($replace), $replace, $str);

    I used strong and em tags rather than b and i, since they describe emphasis rather than just appearance.

  9. I would just go with a simple preg_replace():

     

    <?php
    $str = '[[#201001251351#]]Some text here[[end]]
    Something else in this bit
    [[#201001251353#]]Some different text here[[end]]
    And more different bits
    [[#201001251357#]]Some more here[[end]]
    Further different bits which may include other replaced things';
    $replace = '<a href="javascript:;" onclick="Show_Stuff(\'$1\');">Show</a>
    <div style="display:none;" id="$1">$2</div><br />';
    echo preg_replace('~\[\[#([0-9]+)#\]\](.*?)\[\[end\]\]~is', $replace, $str);
    ?>

  10. Not very elegant, but does the job:

     

    <?php
    $array = array('a' => 'A', 'b' => 'B', 'c' => 'C', 'd' => 'D');
    $key = 'c';
    $val = $array[$key];
    unset($array[$key]);
    $array = array_merge(array($key => $val), $array); //you can also simply add the arrays here instead; array($key => $val) + $array
    echo '<pre>' . print_r($array, true) . '</pre>';
    ?>
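    One difference between the two options mentioned in the comment above: array_merge() renumbers integer keys, while the + operator leaves every key untouched. With string keys, as in the example, both give the same result. A short sketch with an assumed integer-keyed array:

```php
<?php
$array = array(0 => 'A', 1 => 'B', 2 => 'C');
$front = array(2 => 'C');
unset($array[2]);

// array_merge() renumbers integer keys, so the original key 2 is lost.
$merged = array_merge($front, $array);

// The + operator keeps the keys as they are.
$added = $front + $array;
```

    $merged has the keys 0, 1, 2 (values C, A, B), whereas $added has the keys 2, 0, 1 with the same values in the same order.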

  11. The only thing you need to add is a pair of pattern delimiters (I most often use the tilde ~). To get the same effect as eregi(), you would also add the pattern modifier i, which makes the search case-insensitive. With your current pattern that has no actual effect, since both a-z and A-Z are already included, so in the pattern below I've removed A-Z and added the i. I would also add the pattern modifier D, to make the dollar sign really mean end of string; without it, $ also matches just before a trailing line break.

     

    ~^[a-z0-9._-]{4,12}$~iD
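    To see what the D modifier changes, compare the pattern with and without it on an input that ends in a line break. The input value is an assumed example:

```php
<?php
// Assumed input: a valid-looking name followed by a line break.
$input = "user_1\n";

// Without D, $ also matches just before a trailing newline.
$without_d = preg_match('~^[a-z0-9._-]{4,12}$~i', $input);

// With D, $ matches only at the very end of the string.
$with_d = preg_match('~^[a-z0-9._-]{4,12}$~iD', $input);
```

    $without_d is 1 (the trailing newline slips through), while $with_d is 0.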

  12. You forgot to load the source code of the page.

     

    <?php
    $urls = array('http://www.megaupload.com/?d=ZD6ACN1J', 'http://www.megaupload.com/?d=ZACN1J');
    foreach ($urls as $url) {
        //load the source code of the page
        $source = file_get_contents($url);
        echo "<strong>$url</strong>: ";
        //look for an input tag instead of a language-specific error phrase
        if (preg_match('~<input\b[^>]*>~i', $source)) {
            echo '<span style="color: green;">File is available.</span><br />';
        } else {
            echo '<span style="color: red;">File does not exist.</span><br />';
        }
    }
    ?>

     

    Megaupload serves a language-specific error message, so I search for an input tag instead of the phrase in your code.

  13. You simply have to find a way to distinguish a page with a 'dead' file from a page with a live file. A Rapidshare page, for example, contains the header <h1>FILE DOWNLOAD</h1> when the file is live. You can use strpos() to check whether that's present in the source code:

     

    <?php
    $source = file_get_contents('http://rapidshare.com/files/326980450/TOIOU_NIN_AVOTT_GIFT_1080p.part01.rar');
    if (strpos($source, '<h1>FILE DOWNLOAD</h1>') !== false) {
    //file is live
    }
    ?>

  14. There's also optimising the regular expression itself. Currently, due to the arrangement of the lookahead, at every position in the article the engine first looks at every following character that isn't a greater-than symbol, then checks the less-than-or-end alternation, then potentially does a whole heap of backtracking until less-than-or-end is met, and then (and only then!) moves on to check for the target word itself. That's a lot of overhead for each and every character in the article, even for non-matches. One quick fix would be to match the target word first, then check afterwards that it is not within an HTML tag.

    @OP, you can find an alternative pattern that uses a look-behind instead of a look-ahead in this post: http://www.phpfreaks.com/forums/index.php/topic,258333.msg1215689.html#msg1215689
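    The "match the word first" idea can be sketched with the trailing lookahead used in other posts on this page. Note that this is not the look-behind pattern from the linked post, and the sample string and the <b> replacement are assumed for illustration:

```php
<?php
$article = 'Testing a test with this! Tag: <span title="test">tag</span>.';

// The engine looks for the word first; only when it is found does the
// negative lookahead check that no ">" follows before the next "<",
// which would mean the word sits inside a tag.
$marked = preg_replace('~\btest\b(?![^<]*?>)~i', '<b>$0</b>', $article);
```

    Only the standalone word in the text is wrapped; "Testing" fails the word boundary and the "test" inside the title attribute fails the lookahead.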

  15. @thebadbad, mainly because I didn't understand it and couldn't get it to work...

    if I used your function to parse the url of the root page (not necessarily a directory, but also a file) as $absolute and the link as $relative; would that work? I can try it, but need to test the current system first... I have a few issues with Firefox and ajax which need sorting before I implement any changes to the current script.

     

    Yes, that would work. And it's actually really simple to do it (given the relative2absolute() function), and by far the best solution as far as I know.

     

    echo relative2absolute('http://example.com/folder/page.php', '../relative/link/file.php');
    //http://example.com/relative/link/file.php
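    The relative2absolute() function itself isn't shown in this post. As a rough idea of what such a helper has to do, here is a simplified, hypothetical stand-in (resolve_relative_url() is a made-up name, not the function from the thread). It assumes the base URL has a scheme and host, and it ignores edge cases such as trailing slashes and slashes inside query strings.

```php
<?php
// Hypothetical stand-in, not the relative2absolute() from the thread.
function resolve_relative_url($absolute, $relative) {
    // A link that already has a scheme is returned unchanged.
    if (parse_url($relative, PHP_URL_SCHEME) !== null) {
        return $relative;
    }
    $base = parse_url($absolute);
    $root = $base['scheme'] . '://' . $base['host'];
    if (isset($base['port'])) {
        $root .= ':' . $base['port'];
    }
    $basePath = isset($base['path']) ? $base['path'] : '/';
    if (substr($relative, 0, 1) === '/') {
        // Root-relative link: replaces the whole base path.
        $path = $relative;
    } else {
        // Drop the file name (everything after the last slash) from the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $relative;
    }
    // Resolve "." and ".." segments.
    $resolved = array();
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved);
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;
        }
    }
    return $root . '/' . implode('/', $resolved);
}

$result = resolve_relative_url('http://example.com/folder/page.php', '../relative/link/file.php');
```

    For the pair of URLs used above, $result is http://example.com/relative/link/file.php, matching the output shown in the post.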
