mnybud

Help With Basic String Replacement

Hi, I have this simple plugin for WordPress that fetches content from Wikipedia:
http://dev.wp-plugins.org/wiki/GetWIKI

My question is: how can I remove specific content from the extracted content, such as the wiki navigation, external links, etc.? I am guessing it can be done with some sort of string replace, but I can't figure it out. Can any of you PHP gurus help with this?

Wow, fast reply ;D

I included the full code below (it's short). I want to be able to do two things:

1. remove all hyperlinks from the grabbed content
2. remove specific sections of the grabbed content like the external link section and error messages
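
For the first item, one possible approach is a regular expression that drops the opening and closing anchor tags and leaves the link text in place. This is only a sketch: `stripLinks` is a made-up helper name, not part of GetWIKI, and it assumes no attribute value contains a literal `>`.

```php
<?php
// Hypothetical helper (not part of GetWIKI): unwrap every <a ...>text</a>
// so that only the link text remains.
function stripLinks( $html ) {
    // Matches "<a>", "<a href=...>" and "</a>", but not other tags
    // that merely start with "a" (such as <abbr>).
    return preg_replace( '#</?a(\s[^>]*)?>#i', '', $html );
}

echo stripLinks( 'See <a href="http://en.wikipedia.org/wiki/PHP" title="PHP">PHP</a> for details.' );
// prints: See PHP for details.
```

If something like this were used, it would probably be called on `$article` inside `cleanUp()`, before the wrapper div and the GFDL notice are added, so that the licence links survive.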

Here is an example of something I would like to remove:
<div class="messagebox cleanup metadata">This article or section does not cite its <b><a href="http://en.wikipedia.orghttp://en.wikipedia.org/wiki/Wikipedia:Citing_sources" title="Wikipedia:Citing sources">references or sources</a>.</b><br /><small>You can <a href="http://en.wikipedia.orghttp://en.wikipedia.org/wiki/Wikipedia:WikiProject_Fact_and_Reference_Check" title="Wikipedia:WikiProject Fact and Reference Check">help</a> Wikipedia by introducing appropriate citations.</small></div>
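
A block like the one above could be removed in the same style the plugin already uses for `printfooter` and `editsection`: match the div by its class and replace it with an empty string. The sketch below is an assumption, not something from the plugin. `removeDivByClass` is a hypothetical name, and the pattern only works when the targeted div has no other `<div>` nested inside it (true for the sample above). For nested markup, a real HTML parser such as PHP's DOMDocument would be the sturdier choice.

```php
<?php
// Hypothetical helper (not part of GetWIKI): delete <div> blocks that carry
// a given class. Assumes the div contains no nested <div>.
function removeDivByClass( $html, $class ) {
    $pattern = '#<div\b[^>]*\bclass="[^"]*(?<![\w-])'
             . preg_quote( $class, '#' )
             . '(?![\w-])[^"]*"[^>]*>.*?</div>#is';
    return preg_replace( $pattern, '', $html );
}

$sample = '<p>Intro.</p>'
        . '<div class="messagebox cleanup metadata">This article does not cite its <b>sources</b>.</div>'
        . '<p>Body.</p>';

echo removeDivByClass( $sample, 'messagebox' );
// prints: <p>Intro.</p><p>Body.</p>
```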

Here is the WordPress plugin code. I think it already removes some things, but I just can't figure out how ???
Thanks for your help!

<?php
/*
Plugin Name: GetWIKI
Version: 1.0
Plugin URI: http://saj.in/blog/techtalk/82/getwiki-plugin-for-wordpress.asp
Author: Sajin Kunhambu
Author URI: http://saj.in/
Description: Get a WIKI article anywhere on your blog (e.g. ~GetWIKI(Your_Search_Term)~ )
*/

//Server Configuration
$host = "en.wikipedia.org";
$port = 80;
$path = "/wiki/";
//Plugin Configuration
$use_cache = true;
$cache_life = 10080;
$edit_link = false;
$retrieved_link = false;
$copy_left = "<div class=\"gfdl\">&copy; This material from <a href=\"http://en.wikipedia.org\">Wikipedia</a> is licensed under the <a href=\"http://www.gnu.org/copyleft/fdl.html\">GFDL</a>.</div>";
if( !function_exists('cache_recall') || !function_exists('cache_store') ) {
        // caching function not available
        $use_cache = false;
}

function cleanUp( $article ) {
    global $edit_link,$retrieved_link,$copy_left;
    $article = str_replace("\n","",$article);
    if(preg_match("/^.*(\<\!\-\- start content \-\-\>.*\<\!\-\- end content \-\-\>).*$/i",$article,$match)!=0) $article = $match[1];
    // The "e" (eval) modifier is omitted from these patterns: it was removed in PHP 7
    // and is not needed when the replacement is an empty string.
    $article = preg_replace("#\<\!\-\-.*\-\-\>#imsU","",$article);
    $article = preg_replace("#\[\!\&\#.*\]#imsU","",$article);
    if(!$retrieved_link) $article = preg_replace("#\<div\sclass=\"printfooter\".*\<\/div\>#imsU","",$article);
    if(!$edit_link) $article = preg_replace("#\s*\<div\s*class=\"editsection\".*\<\/div\>\s*#imsU","",$article);
    $article = str_replace("/w/","http://en.wikipedia.org/w/",$article);
    $article = str_replace("/wiki/","http://en.wikipedia.org/wiki/",$article);
    $article = str_replace("/skins-1.5/","http://en.wikipedia.org/skins-1.5/",$article);
    $article = "<div class=\"wiki\">".$article.$copy_left."</div>";
    return $article;
}

function getArticle( $title ) {
    global $host,$port,$path,$use_cache,$cache_life;
    if($use_cache) {
        $function_string = "getArticle(".$title.")";
        if($article = cache_recall($function_string,$cache_life)) return $article;
    }
    $out = "GET $path$title HTTP/1.0\r\nHost: $host\r\nUser-Agent: GetWiki for WordPress\r\n\r\n";
    $fp = fsockopen($host, $port, $errno, $errstr, 30);
    if(!$fp) return "== WIKI Error ==";   // connection failed, nothing to read
    fwrite($fp, $out);
    $article = "";
    while (!feof($fp)) {
        $article .= fgets($fp, 128);
    }
    fclose($fp);
    if(substr($article,0,12)=="HTTP/1.0 301")
    {
        if(preg_match("/^.*Location\:\s(\S*).*$/im",$article,$match)!=0) {
            // The recursive call already returns a cleaned article, so return it
            // directly instead of passing it through cleanUp() a second time.
            return getArticle( str_replace("http://en.wikipedia.org/wiki/","",$match[1]) );
        }
        $article = "== WIKI Error ==";
    }
    $article = cleanUp($article);
    if($use_cache) cache_store($function_string,$article);
    return $article;
}

function wikify( $text ) {
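    // preg_replace() with the "e" modifier no longer works in PHP 7+,
    // so the article lookup is done in a callback instead.
    $text = preg_replace_callback(
        "#\~GetWIKI\((\S*)\)\~#imsU",
        function( $match ) { return getArticle( $match[1] ); },
        $text
    );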
    return $text;
}

function wiki_css() {
    echo "
    <style type='text/css'>
    div.wiki {
        border: 1px dashed silver;
        background-color: #f0f0f0;
    }
    div.gfdl {
        font-size: 80%;
    }
    </style>
    ";
}


//echo wikify("~GetWIKI(user:Sajin)~");
add_action('wp_head', 'wiki_css');
add_filter('the_content', 'wikify', 2);
add_filter('the_excerpt', 'wikify', 2);
?>
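
A side note on the sample output quoted earlier: the doubled `http://en.wikipedia.orghttp://en.wikipedia.org/wiki/` prefix is consistent with what the plain `str_replace("/wiki/", ...)` calls in `cleanUp()` produce when they hit a URL that is already absolute. A possible alternative, sketched below, rewrites only `href` and `src` values that begin with a single `/`. `absolutizeLinks` and its `$base` parameter are hypothetical names, not part of the plugin, and the sketch assumes attributes are written with double quotes.

```php
<?php
// Hypothetical replacement for the three str_replace() calls in cleanUp().
function absolutizeLinks( $html, $base = 'http://en.wikipedia.org' ) {
    // Only href/src values starting with exactly one "/" are root-relative;
    // absolute ("http://...") and protocol-relative ("//...") URLs are left alone.
    return preg_replace( '#\b(href|src)="/(?!/)#i', '$1="' . $base . '/', $html );
}

$once  = absolutizeLinks( '<a href="/wiki/PHP">PHP</a>' );
$twice = absolutizeLinks( $once );

echo $once, "\n";
// prints: <a href="http://en.wikipedia.org/wiki/PHP">PHP</a>
var_dump( $once === $twice );
// prints: bool(true)
```

Because the result does not change on a second pass, running the cleanup twice on the same article would no longer stack the host name.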

I get the impression that the code works in conjunction with the code provider and that what you want to do is impossible, but I might be wrong.

Is there any documentation on how to configure this code?

Here is some more info:
http://dev.wp-plugins.org/wiki/GetWIKI

And the author's blog post announcing the release of the plugin:
http://saj.in/blog/techtalk/82/getwiki-plugin-for-wordpress.asp

I am pretty sure it can be done, but maybe I am wrong.

Come on guys, is this possible or not? I know it can't be that hard. This script does it perfectly: http://www.wikifetcher.com/wf/index.php (which I have), but it's not a WordPress plugin, which is what I need. Does anyone have any suggestions?
