memoryproblems's Content

problems with carriage returns, tabs (I think).

memoryproblems replied to memoryproblems's topic in Regex Help

Try this.
$regex = '/title="Ruler: (.+?)"><\/a>\s+?<\/td>/s';

Thanks a ton, that got me fixed up. At first it went wrong after the first instance of something it shouldn't collect, but I removed the s from the end and that fixed it, for some reason.

Yeah, I started out using just \r for the carriage return, but it wasn't working so well, so I tried some trial and error with different things to see if anything made a difference. Where it appeared to be a return, tab, return, I'd try \r.*?\r and it wouldn't work, but \r.*?\n would. I've got no idea how far down the page the .*? might have let it look to find that, though.
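A likely reason dropping the s helped: with the s modifier, . also matches newlines, so the lazy (.+?) can stretch across an unwanted second link until it finally reaches a "></a> followed by </td>. The sketch below illustrates that under assumptions: the sample HTML, the CRLF line endings, and the names Alice, Bob and Z are made up, not taken from the real page.

```php
<?php
// Hypothetical sample: the first cell has the extra alliance link, the second does not.
$data = "<td>\r\n<a href=\"m\"><img title=\"Ruler: Alice\"></a>\r\n"
      . "<a href=\"s\"><img title=\"Alliance: Z\"></a>\r\n</td>\r\n"
      . "<td>\r\n<a href=\"m\"><img title=\"Ruler: Bob\"></a>\r\n</td>";

// With /s, "." matches "\n", so (.+?) can run across lines into the alliance link.
preg_match_all('/title="Ruler: (.+?)"><\/a>\s+?<\/td>/s', $data, $m);
$withS = $m[1];

// Without /s, "." stops at "\n", so the cell with the extra link cannot match.
preg_match_all('/title="Ruler: (.+?)"><\/a>\s+?<\/td>/', $data, $m);
$withoutS = $m[1];

var_dump($withS, $withoutS);
```

On this sample the first pattern returns two entries, the first of which is polluted with the alliance markup, while the second returns only Bob.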

June 17, 2011
12 replies

IT'S ALIVE!! - Feedback Please.

memoryproblems replied to staci's topic in Website Critique

Yeah, your navigation is a little strange looking. I'd either tone down the height a little, or change the background to have a more header-ish feel towards the top half. Even though the site name is in there, it simply doesn't feel much like a site header/banner. Past that, in my browser, the whole thing is set to the left with about 40px to spare on the right side. I'd suggest that you either center it (maybe it's supposed to be centered but isn't in my browser), or expand it to cover the entire width of the page. If you go that direction, at least make the header the full width of the page. I think that would really make the whole thing pop.

June 17, 2011
6 replies

problems with carriage returns, tabs (I think).

memoryproblems replied to memoryproblems's topic in Regex Help

Essentially, what yours is doing is grabbing everything that matches the Ruler= .*?">. To clarify what I'm trying to do a little better: in the page source, there are several instances of a table cell opening up like I noted below, and all of them have the Ruler= DATA, but some of them also have something else inside the table cell and some don't. I want to grab the data only from the table cells that don't have that something extra inside. So everything that I'm looking for will match Ruler= .*?", but not everything that matches that is what I'm looking for. I want to collect data only from table cells that do not contain this:

<a href="stats_alliance_stats_custom.asp?Alliance=XXXXXX"><img src="images/alliance_statistic.gif" border="0" title="Alliance: XXXXXX"></a>

and your code is giving it the flexibility to match that. I tried around a little, and this works and matches stuff:

$regex = '/title="Ruler: (.+?)"><\/a>\r.*?\r\n/s';

but this doesn't match anything:

$regex = '/title="Ruler: (.+?)"><\/a>\r\t\r\n/s';

So it seems that throwing the tab in there is what breaks it. I'm not sure what I'm messing up, because I'm reading the source as if there is a tab there.

June 17, 2011
12 replies

problems with carriage returns, tabs (I think).

memoryproblems replied to memoryproblems's topic in Regex Help

That works, but isn't quite what I need, I'm afraid. In the page source I'm attempting to get the data from, some entries show up like this:

<td> <a href="send_message.asp?Nation_ID=XXXXXX"><img border="0" src="assets/compose_message.png" width="16" height="16" title="Ruler: DATA"></a> </td>

and some like this:

<a href="send_message.asp?Nation_ID=XXXXXX"><img border="0" src="assets/compose_message.png" width="16" height="16" title="Ruler: DATA"></a> <a href="stats_alliance_stats_custom.asp?Alliance=XXXXXX"><img src="images/alliance_statistic.gif" border="0" title="Alliance: XXXXXX"></a> </td>

What I'm trying to do here is to match only the entries that fit the format of the first code segment. Both are structured similarly, except that entries fitting the second code segment have something additional thrown in that the first segment doesn't, and I don't want it to match any that fit the second code segment.

I appreciate your help. Are there any other possibilities that jump out at you as to why it wouldn't work? When I look at the source, it shows that after the </a> there's a carriage return, a tab, two more carriage returns and a tab before it closes out with </td>. I've tried to put that in the regex, but either I'm misreading what the source contains or I'm writing the regex incorrectly (I assume).

June 17, 2011
12 replies

problems with carriage returns, tabs (I think).

memoryproblems replied to memoryproblems's topic in Regex Help

Yes, and it didn't return anything either. I did var_dump($match) and the array was empty.

June 17, 2011
12 replies

problems with carriage returns, tabs (I think).

memoryproblems replied to memoryproblems's topic in Regex Help

Thanks, I was wondering whether the problem was that I wasn't escaping everything I needed to, but I'm not sure. It worked fine when it was just

$regex = '/title="Ruler: (.+?)">/';

and even when I closed the link and added the first carriage return

$regex = '/title="Ruler: (.+?)"><\/a>\r/';

but when I add the first tab, that's when it returns nothing.

$regex = '/title="Ruler: (.+?)"><\/a>\r\t/';

Am I doing the tab right? Or perhaps I'm reading the source code wrong and getting that wrong?
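One way to stop guessing at the invisible characters is to dump the bytes between </a> and </td> as hex. This is only a sketch: the sample string assumes Windows-style CRLF line endings, which would explain why \r\t fails (a \n would sit between the \r and the tab), but the real page may differ.

```php
<?php
// Hypothetical sample with CRLF line endings; the real source may differ.
$data = "title=\"Ruler: Alice\"></a>\r\n\t\r\n\t</td>";

// Cut out whatever sits between "</a>" and "</td>".
$start = strpos($data, '</a>') + strlen('</a>');
$end   = strpos($data, '</td>');
$gap   = substr($data, $start, $end - $start);

// 0d = carriage return, 0a = line feed, 09 = tab.
$hex = bin2hex($gap);
echo $hex, "\n";

// "\r\t" fails on this sample because "\n" follows the "\r"; "\r\n\t" matches.
$crOnly = preg_match('/<\/a>\r\t/', $data);
$crlf   = preg_match('/<\/a>\r\n\t/', $data);
```

If the hex dump shows 0d0a pairs, each visible line break is two characters, and matching the gap with \s* or \s+ avoids having to spell them out.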

June 17, 2011
12 replies

problems with carriage returns, tabs (I think).

memoryproblems posted a topic in Regex Help

I am attempting to use regex to gather data out of some page source. Below is an example of the source code that has what I'm looking for. DATA refers to the portion that I'm looking for, and I want it to gather that only if it continues out to the </td> exactly as is shown in the quote below.

<td> <a href="send_message.asp?Nation_ID=XXXXXX"><img border="0" src="assets/compose_message.png" width="16" height="16" title="Ruler: DATA"></a> </td>

This is the PHP code for how I'm attempting to do it.

<?php
$data = $_POST['data'];
$regex = '/title="Ruler: (.+?)"><\/a>\r\t\r\r\t<\/td>/';
preg_match_all($regex,$data,$match);
reset($match);
foreach ($match[1] as $value) {
    echo "$value \n";
}

Yet when I run it, it returns nothing. I assume that I've done the regex formatting wrong somehow, so it's not matching anything. Apologies if this is a stupid question, but I'm pretty new to this and haven't managed to find any solutions anywhere else. If anybody has any insight on how to help me, I'd appreciate it.

June 17, 2011
12 replies

Data scraping, preg_match_all/regex questions

memoryproblems posted a topic in PHP Coding Help

First off, I'm pretty new at this, so please try not to laugh (too hard) at me. I'm trying to put together a script to scrape some data out of some page source for me. This is for an online game, and I'm looking to pull out everything that sits where DATA is in the code shown below.

title="Ruler: DATA">

I've looked around the web (again, I'm very new), found a few tutorials that look interesting, and went about doing this with regex and preg_match_all. Here is my code:

<?php
$data = file_get_contents('scrapedata.html');
$regex = '/title="Ruler: (.+?)">/';
preg_match_all($regex,$data,$match);
var_dump($match);
echo ($match);
?>

I've got two problems:

1) var_dump($match) spits out the entire array, but echo ($match) says only "Array". If I change preg_match_all to simply preg_match, echo ($match) shows the first item that I'm looking for, but obviously it doesn't go through the entire source to find all the instances of what I'm looking for (each page has roughly 20 items that I'm looking to collect). My main question here is, how do I take the results of preg_match_all (which is an array) and list them one by one with echo?

2) For what I'm doing, I need two different versions, one just like I coded above, and another that modifies the $regex line. In the source code, there is an extra element that can be listed among the data, and I want to skip over any listing that has it. For example, I want to collect it if it's like this:

<td> <a href="send_message.asp?Nation_ID=XXXXXX"><img border="0" src="assets/compose_message.png" width="16" height="16" title="Ruler: DATA"></a> </td>

but if it's like this, I want to skip over it:

<td> <a href="send_message.asp?Nation_ID=XXXXXX"><img border="0" src="assets/compose_message.png" width="16" height="16" title="Ruler: DATA"></a> <a href="stats_alliance_stats_custom.asp?Alliance=Rapture"><img src="images/alliance_statistic.gif" border="0" title="Alliance: DATA"></a> </td>

I figured that the way to do this would be to change the $regex line to

$regex = '/title="Ruler: (.+?)"></a></td>/';

but it returns a warning (shown below) and var_dump($match) says null. Is there some way to put the </a></td> into the $regex line and have that work?

Sorry if my questions are a little dumb. I've been trying to find answers to this all day (and fighting off the inevitable heart attack from all the frustration) with little luck. Thanks for any insight you might have.

mp
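A minimal sketch of one way to approach both questions, assuming an inline sample string in place of scrapedata.html (the sample markup and the names Alice, Bob and Z are made up for illustration). preg_match_all puts the captured group for every hit in $match[1], so looping over that prints the items one per line. The warning with </a></td> most likely comes from the unescaped / ending the pattern early; writing \/ or choosing another delimiter such as # avoids it. The sketch also swaps (.+?) for ([^"]+) so the capture cannot run past the closing quote.

```php
<?php
// Hypothetical stand-in for file_get_contents('scrapedata.html').
$data = '<td><img title="Ruler: Alice"></a></td>' . "\n"
      . '<td><img title="Ruler: Bob"></a> <a href="x"><img title="Alliance: Z"></a></td>';

// Version 1: every ruler. $match[0] holds full matches, $match[1] the captures.
preg_match_all('/title="Ruler: ([^"]+)">/', $data, $match);
$rulers = $match[1];
foreach ($rulers as $ruler) {
    echo $ruler, "\n";
}

// Version 2: only cells that close right after the first link.
// "#" is the delimiter here, so the "/" in "</a>" needs no escaping.
preg_match_all('#title="Ruler: ([^"]+)"></a></td>#', $data, $strict);
$strictRulers = $strict[1];
```

If the real page has whitespace between </a> and </td>, adding \s* between them in the second pattern should cover it.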

June 15, 2011
1 reply
