dil_bert Posted December 8, 2017

I am trying to port a Python script from Python 2 to Python 3. Well, the print statement can no longer be used:

    import urllib
    from bs4 import BeautifulSoup
    import urlparse
    import mechanize

    # Set the starting point for the spider and initialize
    # a mechanize browser object
    url = "http://sparkbrowser.com"
    br = mechanize.Browser()

    # create lists for the urls in queue and visited urls
    urls = [url]
    visited = [url]

    # Since the amount of urls in the list is dynamic
    # we just let the spider go until some last url didn't
    # have new ones on the webpage
    while len(urls)>0:
        try:
            br.open(urls[0])
            urls.pop(0)
            for link in br.links():
                newurl = urlparse.urljoin(link.base_url,link.url)
                #print newurl
                if newurl not in visited and url in newurl:
                    visited.append(newurl)
                    urls.append(newurl)
                    print newurl
        except:
            print "error"
            urls.pop(0)

    print visited
dil_bert (Author) Posted December 9, 2017

print "error" becomes print("error"). See also 2to3: https://docs.python.org/2/library/2to3.html
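Besides print, the urlparse module no longer exists in Python 3: urljoin now lives in urllib.parse. The following is only a rough sketch of how the spider might look in Python 3, not a drop-in port of the script above. It assumes mechanize is swapped for the standard library (urllib.request plus html.parser), and the names LinkParser, fetch and crawl are made up for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin      # Python 2: urlparse.urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Hypothetical helper: collect the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch(page_url):
    """Download a page as text (stands in for br.open in the original)."""
    with urlopen(page_url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch):
    """Follow links breadth-first, keeping only URLs that contain start_url."""
    queue = deque([start_url])
    visited = [start_url]
    while queue:
        current = queue.popleft()
        try:
            html = fetch_page(current)
        except Exception as exc:
            # print is a function in Python 3
            print("error", current, exc)
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            newurl = urljoin(current, href)
            if newurl not in visited and start_url in newurl:
                visited.append(newurl)
                queue.append(newurl)
                print(newurl)
    return visited
```

Calling crawl("http://sparkbrowser.com") would start from the same URL as the original script. The fetch_page parameter is there so the loop can be tried with a fake fetch function and no network access.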