
fetching data with PHP Simple HTML DOM Parser or preg_match_all


dil_bert


Good day, dear experts,

I need to fetch the data from this page:

http://europa.eu/youth/volunteering/evs-organisation_en

First I look at the page source to find the HTML elements: view-source:https://europa.eu/youth/volunteering/evs-organisation_en

Note: I need the data that comes right below this line:

<h3>EVS accredited organisations search results: <span class="ey_badge">6066</span></h3>  </div>

I have several options. One is PHP Simple HTML DOM Parser (cf. http://simplehtmldom.sourceforge.net/manual.htm); with it I first need to create an HTML DOM object.

BTW, there are other options. One is a special function, pc_link_extractor, which extracts all the links:

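function pc_link_extractor($s) {
    $a = array();
    // Capture the href value ($match[1]) and the link text ($match[2]) of every <a> tag.
    if (preg_match_all('/<a\s+.*?href=[\"\']?([^\"\' >]*)[\"\']?[^>]*>(.*?)<\/a>/i', $s, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            array_push($a, array($match[1], $match[2]));
        }
    }
    return $a;
}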
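To see what pc_link_extractor returns, here is a minimal sketch that runs it on two links copied from the listing markup quoted further down. The regex is the usual form of this function and should be treated as an assumption; the variable names $sample and $links are only for illustration.

```php
<?php
// Link extractor: returns an array of [href, link text] pairs.
function pc_link_extractor($s) {
    $a = array();
    if (preg_match_all('/<a\s+.*?href=[\"\']?([^\"\' >]*)[\"\']?[^>]*>(.*?)<\/a>/i', $s, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            array_push($a, array($match[1], $match[2]));
        }
    }
    return $a;
}

// Two anchors taken from the result block of the listing page.
$sample = <<<'HTML'
<h4 class="text-center"><a href="/youth/volunteering/organisation/948417016_en" target="_blank">"Academy for Peace and Development" Union</a></h4>
<span> <a href="http://www.apd.ge" target="_blank">www.apd.ge</a></span>
HTML;

$links = pc_link_extractor($sample);

// Print each link as "href => text".
foreach ($links as $link) {
    echo $link[0], ' => ', $link[1], "\n";
}
```

This returns every link on the page, including navigation and "Read more" links, so the result would still need filtering to get only the organisation entries.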



Or I can do it with preg_match_all; see for example:

preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>example: </b><div align=\"left\">this is a test</div>",
    $out,
    PREG_PATTERN_ORDER);
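For reference, a small runnable sketch of what that call puts into $out. With PREG_PATTERN_ORDER, $out[0] holds the full matches and $out[1] holds the text captured by the first group; the variable name $html is only for illustration.

```php
<?php
// Sample input from the example above.
$html = '<b>example: </b><div align="left">this is a test</div>';

// The U modifier makes every quantifier lazy, so (.*) stops at the first closing tag.
preg_match_all('|<[^>]+>(.*)</[^>]+>|U', $html, $out, PREG_PATTERN_ORDER);

// $out[0]: '<b>example: </b>' and '<div align="left">this is a test</div>'
print_r($out[0]);

// $out[1]: 'example: ' and 'this is a test'
print_r($out[1]);
```

This works on flat markup like the sample, but the pattern pairs any opening tag with the next closing tag, so on nested markup such as the listing page it would need a much more specific pattern.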

Here is the dataset I am interested in, taken from the site http://europa.eu/youth/volunteering/evs-organisation_en:

  <div class="view-content">
    
<div id="views-bootstrap-grid-1" class="views-bootstrap-grid-plugin-style">
            <div class="row is-flex">
                  <div class="col-md-4">
            <div class="vp ey_block block-is-flex">
  <div class="ey_inner_block">
    <h4 class="text-center"><a href="/youth/volunteering/organisation/948417016_en" target="_blank">"Academy for Peace and Development" Union</a></h4>
          <div class="org_cord"><strong>Topics: </strong>Access for disadvantaged; Youth (Participation, Youth Work, Youth Policy); Intercultural/intergenerational education and (lifelong)learning</div>
            <p class="ey_info">
    <i class="fa fa-location-arrow fa-lg"></i>
    Tbilisi, <strong>Georgia</strong>
</p>    <p class="ey_info"><i class="fa fa-hand-o-right fa-lg"></i> Receiving, Sending</p>
          <p class="ey_info"><i class="fa fa-external-link fa-lg"></i><span> <a href="http://www.apd.ge" target="_blank">www.apd.ge</a></span></p>
                  <p><strong>PIC no:</strong> 948417016</p>
        <div class="empty-block">
      <a href="/youth/volunteering/organisation/948417016_en" target="_blank" class="ey_btn btn btn-default pull-right">Read more</a>    </div>
  </div>
</div>
          </div>
                  <div class="col-md-4">

Note: there are hundreds of result pages (see the pagination at the bottom of the listing).
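For comparison, here is a minimal sketch of a third route that is not one of the two options named above: PHP's built-in DOMDocument and DOMXPath, which need no extra library. It parses a trimmed copy of the result block quoted above. The function name extract_organisations and the array keys are hypothetical, and the XPath expressions assume the class names stay as they appear in that snippet.

```php
<?php
// Trimmed copy of one result block from the listing page.
$html = <<<'HTML'
<div class="view-content">
  <div class="col-md-4">
    <div class="vp ey_block block-is-flex">
      <div class="ey_inner_block">
        <h4 class="text-center"><a href="/youth/volunteering/organisation/948417016_en" target="_blank">"Academy for Peace and Development" Union</a></h4>
        <p class="ey_info">
          <i class="fa fa-location-arrow fa-lg"></i>
          Tbilisi, <strong>Georgia</strong>
        </p>
        <p class="ey_info"><i class="fa fa-hand-o-right fa-lg"></i> Receiving, Sending</p>
        <p><strong>PIC no:</strong> 948417016</p>
      </div>
    </div>
  </div>
</div>
HTML;

// Hypothetical helper: returns one array per organisation block found in $html.
function extract_organisations(string $html): array
{
    libxml_use_internal_errors(true);   // real-world HTML is rarely valid; keep warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $orgs = [];

    foreach ($xpath->query('//div[contains(@class, "ey_inner_block")]') as $block) {
        $link = $xpath->query('.//h4/a', $block)->item(0);
        if ($link === null) {
            continue;                   // not an organisation block
        }

        // Location paragraph is the one carrying the location-arrow icon.
        $locNode = $xpath->query('.//p[i[contains(@class, "fa-location-arrow")]]', $block)->item(0);
        $location = $locNode ? trim(preg_replace('/\s+/', ' ', $locNode->textContent)) : null;

        // PIC paragraph reads "PIC no: 948417016"; keep only the digits.
        $picNode = $xpath->query('.//p[strong[contains(., "PIC no")]]', $block)->item(0);
        $pic = ($picNode && preg_match('/\d+/', $picNode->textContent, $m)) ? $m[0] : null;

        $orgs[] = [
            'name'     => trim($link->textContent),
            'url'      => $link->getAttribute('href'),
            'location' => $location,
            'pic'      => $pic,
        ];
    }
    return $orgs;
}

$orgs = extract_organisations($html);
print_r($orgs);
```

For the hundreds of result pages, the same function could be called once per fetched page and the arrays merged; the page URL scheme is not shown here, so that loop is left out.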

 

Well, you see that we have some options here.

Which way should I go? Which way would you go?

Love to hear from you.

Greetings

 
