Getting data from between two strings


I have found this online, but all it does is print out an array.

 

How can I get it to echo the values of the array into the page?
 

$homepage = file_get_contents('http://www.website.com');
 
function getContents($str, $startDelimiter, $endDelimiter) {
  $contents = array();
  $startDelimiterLength = strlen($startDelimiter);
  $endDelimiterLength = strlen($endDelimiter);
  $startFrom = $contentStart = $contentEnd = 0;
  while (false !== ($contentStart = strpos($str, $startDelimiter, $startFrom))) {
    $contentStart += $startDelimiterLength;
    $contentEnd = strpos($str, $endDelimiter, $contentStart);
    if (false === $contentEnd) {
      break;
    }
    $contents[] = substr($str, $contentStart, $contentEnd - $contentStart);
    $startFrom = $contentEnd + $endDelimiterLength;
  }
 
  return $contents;
}
 
$sample = $homepage;
print getContents($sample, 'href="', '"');

 

I have found this online [...]

 

And there's your problem. ;)

 

It looks like what you actually want to do is scrape a website and look for links. That's what HTML parsers are for. Parse the page, search for <a> elements (I assume that's what you're interested in), and get their href attribute. This is far more readable, robust and flexible than any string-fiddling code you may find online.
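As a rough sketch of the parser approach described above, assuming PHP's bundled DOM extension is available: the $html string is invented sample markup standing in for the downloaded page, and the variable names are only illustrative.

```php
<?php
// Hypothetical sample markup; in the thread this would be the $homepage string.
$html = '<p><a href="/category/category-1">Category 1</a> '
      . '<a href="/about">About</a> <a name="top">no href</a></p>';

$doc = new DOMDocument();
// Real-world pages are rarely valid HTML, so keep parser warnings off the page.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$hrefs = array();
foreach ($doc->getElementsByTagName('a') as $anchor) {
  // Skip anchors that have no href attribute at all.
  if ($anchor->hasAttribute('href')) {
    $hrefs[] = $anchor->getAttribute('href');
  }
}

// Print one URL per line.
foreach ($hrefs as $href) {
  echo $href, "\n";
}
```

Unlike searching for 'href="', this does not care about attribute order, single versus double quotes, or extra whitespace inside the tag.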

Thread pruned. Next time this happens I will hand out bans.

 

 

MrLeN, the best thing really is to use something that understands HTML and have it extract links or whatever. You said you tried it and couldn't get it to work? What code do you have and how is it not working?

 

If you don't want that and think that the earlier code is good enough: the return value from getContents() will be an array, so you can't simply print it. As Barand said, you can use a foreach to deal with it "properly", or print_r/var_dump to simply see what's in there (like for testing). Or something else, depending on exactly what kind of output you're trying to get.
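To make the foreach suggestion concrete, here is a minimal sketch. getContents() is copied from the first post; the $sample string and the <br /> output format are assumptions made up for illustration, not the real page.

```php
<?php
// getContents() as posted at the top of the thread, unchanged.
function getContents($str, $startDelimiter, $endDelimiter) {
  $contents = array();
  $startDelimiterLength = strlen($startDelimiter);
  $endDelimiterLength = strlen($endDelimiter);
  $startFrom = $contentStart = $contentEnd = 0;
  while (false !== ($contentStart = strpos($str, $startDelimiter, $startFrom))) {
    $contentStart += $startDelimiterLength;
    $contentEnd = strpos($str, $endDelimiter, $contentStart);
    if (false === $contentEnd) {
      break;
    }
    $contents[] = substr($str, $contentStart, $contentEnd - $contentStart);
    $startFrom = $contentEnd + $endDelimiterLength;
  }
  return $contents;
}

// Hypothetical sample markup standing in for the downloaded page.
$sample = '<a href="https://forums.phpfreaks.com">Forum</a> '
        . '<a href="http://www.google.com">Google</a>';

$links = getContents($sample, 'href="', '"');

// Quick look at the whole array while testing.
print_r($links);

// foreach visits each array element in turn, so each link can be echoed.
$output = '';
foreach ($links as $link) {
  // htmlspecialchars() escapes the value so it is safe to put into HTML.
  $safe = htmlspecialchars($link);
  $output .= '<a href="' . $safe . '">' . $safe . "</a><br />\n";
}
echo $output;
```

No while loop is needed here: getContents() already does the looping internally, and foreach handles walking over the array it returns.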

Thanks Administrator. The code above is working fine, and it is outputting exactly what I need, but is printing an array.

 

I would like the code to echo the php code like:
 

 
some kind of while loop that I really don't know how to code properly {
  echo $all_the_links;
}
 

But I really don't know how to code it. I did a similar thing last year (as previously mentioned), but it is a little different. The way I code stuff in PHP is I search for snippets online and edit them. If I can't edit one (and believe me, I try), I go looking for help. I am not very good at PHP, but I do spend many hours, sometimes days, trying. I have been like that for 20 years. I am not very good at programming and I never will be; and believe me, it's not because I haven't put in thousands of hours of learning. I just suck at advanced stuff. I can write basic PHP, echos and variables, and I can make sessions and even cookies. I know how to make a basic while loop (snippets are available online), I can use email functions, redirects, headers etc -- all basic to intermediate stuff; but I don't know how to combine "get contents between two strings" INSIDE a while loop. I know I am on the right track with the code I found (and edited) above, but I have fallen short of being able to echo the output into the page as href links. All I know is it needs a while loop somehow.

 


There are many ways you could show the links, so you'll need to be more specific.

 

Do you maybe know the HTML you want to see? If the links were "https://forums.phpfreaks.com" and "http://www.google.com" then what should you end up with?

 

The links are like

<a href="http://www.mywebsite.com/category/a-category/">My Category</a>

I want to get everything in between '/category/' and '</a>'

 

In other words, I want to get all instances of 'a-category' out and list them like:

 

<a href="/category/category-1">Category 1</a><br />
<a href="/category/category-2">Category 2</a><br />
<a href="/category/category-3">Category 3</a><br />
<a href="/category/category-4">Category 4</a>

You're not doing this with your own website, right? Where are these links coming from, exactly?

 

 

The BRs are annoying. How about using smarter HTML, and perhaps a dash of CSS, to get the same effect? Are you looking for a list?

<ul>
	<li><a href="/category/category-1">Category 1</a></li>
	<li><a href="/category/category-2">Category 2</a></li>
</ul>

If these things represent a list, like logically they form a thing you would call "a list", then a UL would be the most appropriate way to show them in HTML; if you don't want to see bullet points then they're easy enough to remove.

 

But there's a larger problem: getContents() will only give you the URLs. No "Category 1" and whatnot that was used as the link text. If you need that then... well, sorry, but the DOM method is not only superior to doing string math but can also get you both the link and text at the same time with no additional work (less work, in fact).
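For what it's worth, a sketch of the DOM method that collects both the URL and the link text could look something like this. The $html string is invented sample markup modelled on the link format given earlier in the thread, and the '/category/' filter and trailing-slash trimming are assumptions about the desired output.

```php
<?php
// Hypothetical sample markup modelled on the link format described in the thread.
$html = '<a href="http://www.mywebsite.com/category/category-1/">Category 1</a>'
      . '<a href="http://www.mywebsite.com/about/">About</a>'
      . '<a href="http://www.mywebsite.com/category/category-2/">Category 2</a>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // tolerate sloppy real-world HTML
$doc->loadHTML($html);
libxml_clear_errors();

$items = array();
foreach ($doc->getElementsByTagName('a') as $anchor) {
  // Reduce the full URL to its path, e.g. /category/category-1/
  $path = parse_url($anchor->getAttribute('href'), PHP_URL_PATH);
  // Keep only links whose path starts with /category/.
  if (is_string($path) && strpos($path, '/category/') === 0) {
    $items[] = array(
      'href' => rtrim($path, '/'),          // drop the trailing slash
      'text' => trim($anchor->textContent), // the visible link text
    );
  }
}

// Build the list markup, escaping both values for safe output.
$list = "<ul>\n";
foreach ($items as $item) {
  $list .= "\t<li><a href=\"" . htmlspecialchars($item['href']) . '">'
         . htmlspecialchars($item['text']) . "</a></li>\n";
}
$list .= "</ul>\n";
echo $list;
```

The link text comes along for free via textContent, which is the part getContents() cannot provide without a second round of string searching.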
