Jump to content

rework a python script from python 2 to version 3


dil_bert

Recommended Posts

i try to rework a python script form

 

 

python 2 to python 3

 

-welll the print function cannot be used any more

import urllib
from bs4 import BeautifulSoup
import urlparse
import mechanize


# Set the startingpoint for the spider and initialize
# the a mechanize browser object
url = "http://sparkbrowser.com"
br = mechanize.Browser()

# create lists for the urls in que and visited urls
urls = [url]
visited = [url]


# Since the amount of urls in the list is dynamic
#   we just let the spider go until some last url didn't
#   have new ones on the webpage
while len(urls)>0:
    try:
        br.open(urls[0])
        urls.pop(0)
        for link in br.links():
            newurl =  urlparse.urljoin(link.base_url,link.url)
            #print newurl
            if newurl not in visited and url in newurl:
                visited.append(newurl)
                urls.append(newurl)
                print newurl
    except:
        print "error"
        urls.pop(0)
       
print visited

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.