
fetching data with PHP Simple HTML DOM Parser or preg_match_all



Good day, dear experts,

I need to fetch the data from this page:

http://europa.eu/youth/volunteering/evs-organisation_en

First I looked at the page source to find the relevant HTML elements: view-source:https://europa.eu/youth/volunteering/evs-organisation_en

Note: I need the data that comes right below this line:

<h3>EVS accredited organisations search results: <span class="ey_badge">6066</span></h3>  </div>

I have several options. One is PHP Simple HTML DOM Parser (cf. http://simplehtmldom.sourceforge.net/manual.htm); with it I first need to create an HTML DOM object from the page.

BTW, there are other options. One is a dedicated function, pc_link_extractor, which extracts all the links:

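function pc_link_extractor($s) {
    $a = array();
    // Group 1 captures the href value, group 2 the link text.
    if (preg_match_all('/<a\s+.*?href=[\"\']?([^\"\' >]*)[\"\']?[^>]*>(.*?)<\/a>/i', $s, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            array_push($a, array($match[1], $match[2]));
        }
    }
    // Returns a list of array(url, anchor text) pairs.
    return $a;
}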



Or I could do it with preg_match_all.

See, for example:

preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>example: </b><div align=\"left\">this is a test</div>",
    $out,
    PREG_PATTERN_ORDER);
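
To make clear what that call returns, here is a small runnable sketch of the same example. The variable name $html is only my choice for illustration; the pattern and the input string are the ones from the example above.

```php
<?php
// Sample input taken from the example above.
$html = "<b>example: </b><div align=\"left\">this is a test</div>";

// The U modifier makes the quantifiers ungreedy, so each tag pair is matched on its own.
preg_match_all("|<[^>]+>(.*)</[^>]+>|U", $html, $out, PREG_PATTERN_ORDER);

// With PREG_PATTERN_ORDER, $out[0] holds the full matches
// and $out[1] holds the text captured by the first group.
print_r($out[0]);
print_r($out[1]);
```

So $out[1] ends up holding the text between each pair of tags, which works for flat markup like this but gets fragile quickly once tags are nested.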

Here is the data set I am interested in, taken from the site http://europa.eu/youth/volunteering/evs-organisation_en:

  <div class="view-content">
    
<div id="views-bootstrap-grid-1" class="views-bootstrap-grid-plugin-style">
            <div class="row is-flex">
                  <div class="col-md-4">
            <div class="vp ey_block block-is-flex">
  <div class="ey_inner_block">
    <h4 class="text-center"><a href="/youth/volunteering/organisation/948417016_en" target="_blank">"Academy for Peace and Development" Union</a></h4>
          <div class="org_cord"><strong>Topics: </strong>Access for disadvantaged; Youth (Participation, Youth Work, Youth Policy); Intercultural/intergenerational education and (lifelong)learning</div>
            <p class="ey_info">
    <i class="fa fa-location-arrow fa-lg"></i>
    Tbilisi, <strong>Georgia</strong>
</p>    <p class="ey_info"><i class="fa fa-hand-o-right fa-lg"></i> Receiving, Sending</p>
          <p class="ey_info"><i class="fa fa-external-link fa-lg"></i><span> <a href="http://www.apd.ge" target="_blank">www.apd.ge</a></span></p>
                  <p><strong>PIC no:</strong> 948417016</p>
        <div class="empty-block">
      <a href="/youth/volunteering/organisation/948417016_en" target="_blank" class="ey_btn btn btn-default pull-right">Read more</a>    </div>
  </div>
</div>
          </div>
                  <div class="col-md-4">
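
For comparison, here is a rough sketch of how one such block could be read with PHP's built-in DOMDocument and DOMXPath classes. Note that this is a different technique from both Simple HTML DOM Parser and preg_match_all. The function name extract_organisations and the array keys are hypothetical, and the sample markup is a trimmed copy of the block above, so this is only a starting point under those assumptions.

```php
<?php
// Trimmed copy of one result block from the page source shown above.
$html = <<<'HTML'
<div class="view-content">
  <div class="ey_inner_block">
    <h4 class="text-center"><a href="/youth/volunteering/organisation/948417016_en" target="_blank">"Academy for Peace and Development" Union</a></h4>
    <p class="ey_info"><i class="fa fa-location-arrow fa-lg"></i> Tbilisi, <strong>Georgia</strong></p>
    <p><strong>PIC no:</strong> 948417016</p>
  </div>
</div>
HTML;

// Hypothetical helper: returns one array per organisation block found in $html.
function extract_organisations(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them for imperfect markup.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//div[contains(@class, "ey_inner_block")]') as $block) {
        // The organisation name and detail link sit in the <h4> heading.
        $link = $xpath->query('.//h4/a', $block)->item(0);
        if ($link === null) {
            continue;
        }
        // The country is the <strong> inside the first "ey_info" paragraph that has one.
        $country = $xpath->query('.//p[contains(@class, "ey_info")]/strong', $block)->item(0);
        $rows[] = [
            'name'    => trim($link->textContent),
            'url'     => $link->getAttribute('href'),
            'country' => $country ? trim($country->textContent) : null,
        ];
    }
    return $rows;
}

$orgs = extract_organisations($html);
print_r($orgs);
```

The appeal of this route is that the class names (ey_inner_block, ey_info) act as stable hooks, whereas a regular expression would have to cope with the whitespace and nesting in the real page.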

Note that there are hundreds of result pages (see the pagination at the bottom of the page).
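
Whichever parser is used, the result pages have to be walked one by one. Below is a minimal sketch of building the page URLs. It ASSUMES the site paginates with a zero-based "page" query parameter, which I have not verified; the helper name build_page_urls is hypothetical, and the real pagination links should be checked first.

```php
<?php
// Hypothetical helper: builds the list of result-page URLs to fetch.
// ASSUMPTION: pagination uses a zero-based "page" query parameter (not verified against the site).
function build_page_urls(string $baseUrl, int $pageCount): array
{
    $urls = [];
    for ($page = 0; $page < $pageCount; $page++) {
        $urls[] = $baseUrl . '?' . http_build_query(['page' => $page]);
    }
    return $urls;
}

$urls = build_page_urls('https://europa.eu/youth/volunteering/evs-organisation_en', 3);
print_r($urls);
```

Each URL could then be fetched, for example with file_get_contents, and handed to whichever extraction approach turns out to be best, ideally with a short pause between requests.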

 

Well, as you can see, there are several options here.

Which way should I go? Which way would you go?

I would love to hear from you.

Greetings

 
