dil_bert
Posted January 29, 2018

Hello dear community,

I am currently working on a little Python program that extracts data with BeautifulSoup4 (BS4) and stores it as list elements in Python. As I am fairly new to Python, I need some help with that. I'm trying to write a very simple spider for web crawling. Here's my first approach.

I need to fetch the data from this page: http://europa.eu/youth/volunteering/evs-organisation_en

First, I view the page source to find the HTML elements: view-source:https://europa.eu/youth/volunteering/evs-organisation_en

I have to extract data wrapped within multiple HTML tags from the above-mentioned webpage using BeautifulSoup4. I have to store all of the extracted data in a list, with each extracted item as a separate list element.

Here is the HTML content structure:

    <div class="view-content">
      <div class="row is-flex">
        <div class="col-md-4">
          <div class
            <div class= >
              <h4 Data 1
              <div class= Data 2
              <p class= <i class= <strong>Data 3
              </p>
              <p class= Data 4
              <p class= Data 5
              <p><strong>Data 6
              <div class=
                <a href="Data 7
            </div>
          </div>

Well, an approach would be:

    from urllib.request import urlopen as uReq
    from bs4 import BeautifulSoup as soup
    import urllib

    my_url = 'http://europa.eu/youth/volunteering/evs-organisation_en'

    # Download the page
    uClient = uReq(my_url)
    page_html = uClient.read()
    uClient.close()

    # Parse it and print the first ten matches
    page_soup = soup(page_html, "html.parser")
    cc = page_soup.findAll("td", {"class": ""})
    for i in range(10):
        print(cc[0+i].text, i)

I guess I need some slight changes to the code in order to get this working.

Code to extract:

    for data in elem.find_all('span', class_=""):

This should give an output:

    data = [ele.text for ele in soup.find_all('span', {'class':'NormalTextrun'})]
    print(data)

Output: [' Data 1 ', ' Data 2 ', ' Data 3 ' and so forth]

Question: I need help with the extraction part.

Love to hear from you.

Yours,
dilbert
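One possible way to approach the extraction part is sketched below. It is only a sketch under assumptions, not a tested scraper for the live page. The attempts in the post search for `td` and `span` elements, while the quoted structure is built from `div`, `h4`, `p` and `a` tags, so this version looks for each `div` with class `col-md-4` instead and collects one list per entry. `SAMPLE_HTML` and the function name `extract_cards` are hypothetical. The sample markup only imitates the structure quoted in the post, and the real class names (cut off in the post) are left out rather than guessed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modelled on the structure quoted in the post.
# The inner class names are unknown, so the wrapper tags carry no classes.
SAMPLE_HTML = """
<div class="view-content">
  <div class="row is-flex">
    <div class="col-md-4">
      <div>
        <div>
          <h4>Data 1</h4>
          <div>Data 2</div>
          <p><i></i><strong>Data 3</strong></p>
          <p>Data 4</p>
          <p>Data 5</p>
          <p><strong>Data 6</strong></p>
          <div><a href="Data 7">More</a></div>
        </div>
      </div>
    </div>
  </div>
</div>
"""


def extract_cards(html):
    """Return one list of extracted values per 'col-md-4' entry."""
    soup = BeautifulSoup(html, "html.parser")
    cards = []
    for card in soup.find_all("div", class_="col-md-4"):
        heading = card.find("h4")
        if heading is None:
            continue  # not an entry of the expected shape
        # Assumption: the fields are siblings of the <h4> heading.
        container = heading.parent
        items = []
        for child in container.find_all(True, recursive=False):
            link = child.find("a", href=True)
            if link is not None:
                # For the link block, keep the URL rather than the link text.
                items.append(link["href"])
            else:
                text = child.get_text(strip=True)
                if text:
                    items.append(text)
        cards.append(items)
    return cards


if __name__ == "__main__":
    for card in extract_cards(SAMPLE_HTML):
        print(card)
```

To try it against the real page, the `page_html` value read with `urlopen` in the post could be passed in place of `SAMPLE_HTML`. Whether that works depends on the live page still using this structure, which the sketch does not check.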