Everything posted by .josh
-
That would work if that's exactly what your div tag looks like. There's also a good chance you need to be using the m modifier for multiline mode.
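The div tag and the pattern under discussion aren't shown here, so the markup below is made up. This is a rough sketch of what the modifiers change: m only affects the ^ and $ anchors, while s is the one that lets "." cross a newline. Which one is needed depends on the pattern.

```php
<?php
// Hypothetical markup: the div's content spans two lines.
$html = "<div class=\"box\">first line\nsecond line</div>";

// No modifier: "." stops at the newline, so this finds nothing.
$plain = preg_match('~<div class="box">(.*)</div>~', $html);

// m makes ^ and $ match at every line boundary, not just the string ends.
$multiline = preg_match('~^second line</div>$~m', $html);

// s lets "." match newlines, so the capture can span both lines.
$dotall = preg_match('~<div class="box">(.*)</div>~s', $html, $match);

var_dump($plain, $multiline, $dotall);
```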
-
bah. I like the real time highlighting as I type it in. IMO it helps me build a regex faster, because I can see that hey, this thing right here is(n't)? matching what I want it to match - as opposed to having to sort through a long string of whatever nested groups I have, trying to sort out where it went wrong.
-
I want to see a regex tool where you can enter the subject, highlight a portion of it, and have it show various ways to express that portion with regex. For instance, if I were to highlight the word "and" in this post, it would show some suggestions with descriptions, like:
literal: and
3 letter word: \b\w{3}\b
word starting with 'a': \ba\w*
etc...
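No such tool is being described as existing; this is only a small sketch that runs the three suggested patterns against a sample sentence to show what each one would pick up. The sample subject and the array layout are assumptions for illustration.

```php
<?php
// The three suggestions from the post, keyed by their description.
$suggestions = array(
    'literal'                => '~and~',
    '3 letter word'          => '~\b\w{3}\b~',
    "word starting with 'a'" => '~\ba\w*~',
);

// Hypothetical subject text.
$subject = 'enter in the subject, and highlight a portion of it';

$found = array();
foreach ($suggestions as $label => $pattern) {
    // Collect every match so the looser patterns show what else they catch.
    preg_match_all($pattern, $subject, $m);
    $found[$label] = $m[0];
    echo $label, ': ', implode(', ', $m[0]), "\n";
}
```

The literal only hits "and", while the two looser patterns also pick up other words ("the" and "a"), which is the kind of trade-off such a tool would need to describe.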
-
Reinstatement of post based ranking system
.josh replied to Daniel0's topic in PHPFreaks.com Website Feedback
Yep. No titles that somehow suggest coding knowledge, ability, etc... -
CCTV :: Would you be interested in knowing where they all are near you?
.josh replied to toxictoad's topic in Miscellaneous
One would think that more effort and more failsafe mechanisms would be put into a $13k item than a $5 item, regardless of who is supposed to install them ...
-
Yes, crawlers, scrapers, etc.. constantly have to be updated. Call it regex, call it patterns, call it regex and patterns, call it whatever; things are constantly changing. Standards constantly changing. People constantly interpreting them in their own way. People constantly changing site layout, so the regex no longer works as expected. That's why it's not really ideal to come to a site like this, asking for a regex to grab xyz from some page. Tomorrow or next week, it may no longer be applicable. Better to take the time to learn it yourself.
-
Okay I obviously don't know what google's crawler bot code looks like. For all I know, they could indeed have their crawler bot do some ranking evaluation right there on the spot. But I wouldn't bet on that. It certainly wouldn't adhere to what we like to call "good programming practice." I'm just saying, in general, that's what crawler bots do. They have a very specific function: go in and pull out info based on specific regex(es). I mean, it's like going to a form on a site and me saying that 99% of this form's job is to take information, and you saying "nuh-uh, prove it, it could be doing other things." Well sure, it could, but it's not the so-called "standard." Perhaps we are still mis-communicating. Look at what I've been talking to Daniel about. When I say a crawler is 99% regex, I mean that it is 99% about pattern recognition. Yes, it probably does have "trade secret" algorithms, actual code with conditions and loops, not just a simple preg_match(pattern), but it's all for filtering the data, so that when it's done, it reports back to the script that sent it out. Those "trade secret" algorithms are part of the regex in the sense that it is pattern recognition. That's why I was trying to explain it to Dan from a broader PoV.
-
Right. That's what I thought you said in the first place. So you tried
    $password1 = trim($_POST['Password']);
    $downloads = "downloads/$password1";
and it still doesn't work? Post your form. Or actually, is the form that takes in the password on some previous page? It looks to me like, overall, you're getting the password from a form on a previous page, then going to this page to get more info, and then reloading this page, right? Well, the posted info from the first page is lost after that. You're going to have to pass $password1 as a hidden field in the form on this page, or else put it into a session variable, so that it persists.
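A minimal sketch of the hidden-field option, assuming the field is still called Password. The helper name hiddenField is made up for illustration, and the actual form isn't shown in the thread.

```php
<?php
// Hypothetical helper: build a hidden input that carries a value to the next request.
function hiddenField($name, $value)
{
    // Escape both parts so quotes or angle brackets cannot break the markup.
    return sprintf(
        '<input type="hidden" name="%s" value="%s">',
        htmlspecialchars($name, ENT_QUOTES),
        htmlspecialchars($value, ENT_QUOTES)
    );
}

// Stand-in for trim($_POST['Password']) from the first page.
$password1 = trim(" secret\n");

// Echo this inside the second page's <form> so the value is posted again.
$field = hiddenField('Password', $password1);
echo $field, "\n";
```

The session route would instead call session_start() on both pages and store the value in $_SESSION, which also keeps it out of the page source.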
-
Yes, I think there is a misunderstanding. I am not asserting that one's site ranking on a google search is solely based on regex. I know that much, much more goes into it. Anything from keyword > content relevance to "presence" on other sites (which is why link exchange type systems were so popular at one point in time, to the point of exploitation), to straight up paying for a premium spot goes into being ranked. In short, everything you mentioned, and more, goes into it. But we're not talking about that. We're talking about crawler bots specifically. Their job is to filter info, nothing more. Some other part of the google code/system/whatever decides what to do with the info returned. So yes, a crawler bot is indeed 99% regex, but the extent of google is not just the crawler bot. As far as becoming a google slayer with regex: I guess it depends on what side of the fence you're on. From a site's perspective, no you can't. If you were to use '?at' as a keyword on your site (in a meta tag or as content or whatever), google is not going to bring up your site if someone types in 'cat' or 'bat' (at least, I don't think it will...one would think that would have been something already exploited by now..and fixed). But as someone who uses google to search, google does offer very limited regexing for your google searching pleasure. For instance, I can go to google.com and type in (cat|bat) and it will return results for both. Interestingly, if I type in (google|yahoo) yahoo comes in first! LOL.
-
If this works:
    $password1 = $_POST['Password'];
    $downloads = "downloads/something";
but this doesn't:
    $password1 = $_POST['Password'];
    $downloads = "downloads/$password1";
then $_POST['Password'] is at fault. Either you used the wrong var name (capitalized the same? typo in spelling in your form?) or else there could be whitespace or even a \n thrown in there. trim will take care of the whitespace and/or \n.
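A quick way to check for the stray whitespace, sketched with a hard-coded stand-in for $_POST['Password'] (the real form input isn't available here):

```php
<?php
// Stand-in for $_POST['Password'] arriving with a leading space and a trailing newline.
$raw = " something\n";

// var_dump prints the string length, which exposes whitespace that echo would hide.
var_dump($raw);

// trim strips whitespace (including \n) from both ends.
$password1 = trim($raw);
$downloads = "downloads/$password1";

var_dump($downloads);
```

If the length reported by var_dump is longer than the visible text, whitespace is the likely culprit.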
-
[SOLVED] apostrophe in INSERT INTO SQL statement
.josh replied to dennismonsewicz's topic in PHP Coding Help
Yes, you can escape quotes:
    INSERT INTO table (row1, row2) VALUES ('Hello', 'Don\'t Try');
Or you can use double quotes around the values (MySQL accepts this unless the ANSI_QUOTES SQL mode is on), like so:
    INSERT INTO table (row1, row2) VALUES ("Hello", "Don't Try");
-
[SOLVED] apostrophe in INSERT INTO SQL statement
.josh replied to dennismonsewicz's topic in PHP Coding Help
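Escape it: \'. In practical terms (like if you're using vars instead of hardcoding), depending on your situation, you can use addslashes or mysql_real_escape_string. Note that the mysql_* functions were removed in PHP 7; mysqli_real_escape_string is the current equivalent.
-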
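For the "using vars" case, a different technique than the escaping functions named in the post is a prepared statement, which keeps the value out of the SQL text entirely. This is only a sketch: it assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for MySQL, and the table name greetings is made up.

```php
<?php
// Assumption: pdo_sqlite is installed; the in-memory database stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table, using the column names from the post.
$pdo->exec('CREATE TABLE greetings (row1 TEXT, row2 TEXT)');

// Placeholders mean the apostrophe never needs manual escaping.
$stmt = $pdo->prepare('INSERT INTO greetings (row1, row2) VALUES (?, ?)');
$stmt->execute(array('Hello', "Don't Try"));

// Read the value back to confirm it was stored unchanged.
$stored = $pdo->query('SELECT row2 FROM greetings')->fetchColumn();
echo $stored, "\n";
```
-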
Yes, I would call the 'checkerboard' a regex, the same as I would call it a pattern. Now, if I were to look at it in a broader sense of using black and white boxes to make up a checkerboard, that would be pattern recognition. Or I could use the checkerboard as a whole, as a pattern in a larger context. life>domain>kingdom kingdom is a sub-pattern of domain, which in turn is a sub-pattern of life. But each one is indeed a pattern, or regex. We use pattern recognition in general to make those patterns or regexes.
-
Or rather: 'dog' is to 'pattern' or 'animal' is to 'pattern' vs. 'animal > dog' is pattern recognition.
-
I wasn't expecting to take the contextX reference and apply it to contextY, merely point out that it is a general principle, that can be applied to many situations, and that in principle, it goes beyond contextY. Why do you not think all patterns are regexes? I see your animal > dog analogy, but the more appropriate application to that analogy is 'dog' is to 'pattern' as 'animal' is to 'pattern recognition.' A regular expression is a pattern, and a pattern is a regular expression. Moving beyond single application into broader classification is pattern recognition.
-
Again, you're failing to understand that the programming community is just one circle of mentality. As I said before (I think you missed the edit), yes, it is the "standard" within the programming community. We can safely assume that when someone is talking about regex on a site like phpfreaks.com, that they are talking about pcre or posix or something similar. But in other communities, that's not the case. I will concede that our community is probably the only one that conveniently shortens it to "regex," and maybe that's why you continue to only think inside this box. A pattern is a regular expression. They are synonyms. A sentence is a pattern. We use words as building blocks and put them in certain orders, tenses, etc... in order to convey an idea or intent. I read the sentence, and my brain interprets it, based on a set of rules (part of speech, grammar rules, etc..).
-
Don't confuse "most referenced" with "standard." Things like pcre are "most referenced" in the programming community, because that's the level we work on. It would indeed be safe to assume that when a programmer is talking about regex, they are talking about things like pcre libraries. But when you move beyond the program itself, into what the program is being used for, what is impacted because of it (SEO, for instance), we move beyond the confines of programming. As said, things like the pcre library are what most people think of as regex, but regex is indeed a principle, the principle used with computers and programming in general. Right down to the hard wiring of computers. It all boils down to "on" or "off." Circuit boards are made up of capacitors and transistors and all that junk, laid out in specific patterns, and computers are hardwired to physically do something based off those patterns. Machine code interprets higher level code to send specific patterns to the hardware. Higher level languages send specific patterns to be parsed and interpreted by the machine code. As you move up to higher levels, the subject and pattern become more abstract to the computer, but less abstract to us. What do you think a programming language does with the code you feed it? The code you write is a pattern. It parses and interprets that pattern, as a whole. It recognizes a certain sub pattern and hands it off to a certain function that uses its own code pattern to filter through that pattern. Regex is the concept of pattern recognition. It is used in a fractal sort of way, a bunch of filters boiling down to one thing: on, or off.
-
well, for instance, you can do something like this:
    $keepers = array('(amp)','(bar)');
    $string = "&foo; something blah &amp; blah &bar; something &blah;";
    $keep = implode ("|",$keepers);
    // negative lookahead: leave alone any entity whose name starts with one of the keepers
    $string = preg_replace("~&(?!".$keep.")[^;]*;~","",$string);
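A slightly stricter variant of the same whitelist idea, offered as a sketch and not as the original poster's code. It assumes entity names are plain word characters, anchors the lookahead on the closing ";" so that a name like "barn" is not spared just because it starts with "bar", and uses \w+ so a bare "&" in the text cannot swallow everything up to the next ";". The extra &barn; in the sample string is added for illustration.

```php
<?php
// Entity names to keep, written without capture groups.
$keepers = array('amp', 'bar');
$string  = "&foo; something blah &amp; blah &bar; something &blah; &barn;";

// preg_quote guards against names that contain regex metacharacters.
$keep = implode('|', array_map('preg_quote', $keepers));

// (?!(?:amp|bar);) spares only an exact keeper name followed by ";".
// \w+ limits the match to a single entity name.
$result = preg_replace('~&(?!(?:' . $keep . ');)\w+;~', '', $string);

var_dump($result);
```

Numeric entities such as &#39; are not matched by \w+ and would be left in place; whether that is wanted depends on the input.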
-
When someone mentions 'regex' most people think of it in limited terms. That is, they think of just parsing through content with things like the pcre library. But using regular expressions is the core principle of programming, in general. No computer knows what to do with anything except for straight machine language. But since writing in machine language sucks for us, we use higher level languages to act as a middle man. That middle man parses and interprets the code with its own regex functions. Nested inside that might be some arbitrary function/system (like pcre) that uses regex to interpret something else. Something like the pcre library is just one instance of regex. But regex is a concept, not a single instance of something. So, anything that involves interpreting or parsing is regex. How one thing goes about parsing or interpreting and for what reason, and what the subject (content) is, is irrelevant. It is the principle of pattern recognition which causes it to be a form of regexing.
-
okay so you either need to make a whitelist of the ones you want to keep, or else a blacklist of the ones you want removed.
-
"I'm quite sure that the majority of interpreters do not use regular expressions for parsing and interpreting code. I'm not an expert on this though, so I'll return in some two years where I am hopefully studying computer science and have aced the compilers/interpreters course. I also very much doubt they pay the dudes over at Google a lot of money for just writing a couple of regular expressions. I'd assume that the algorithm for determining search relevancy is much more sophisticated than just figuring out which words there are on a page."
Virtually anything that involves parsing and interpreting, be it content or code or whatever, is what the principle of Regular Expressions is. Even doing something like this:
    $x = substr($string,0,1);
    // or
    if (substr($string,0,1) == 'a') {
        // do something
    }
is regexing. Now, whether the fine folks at google are using something like the pcre library or whether they brewed up their own library, is another story. But the principle is the same. Crawler takes a page and breaks it down, looking for specific things. That's regex.
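Without taking a side on what counts as "regex", here is a small sketch showing that the substr check above and an anchored pcre pattern answer the same question. The helper names and sample strings are made up for illustration.

```php
<?php
// Hypothetical helpers: both ask "does the string start with 'a'?"
function startsWithASubstr($string)
{
    return substr($string, 0, 1) == 'a';
}

function startsWithARegex($string)
{
    // ^ anchors the match to the start of the string.
    return preg_match('~^a~', $string) === 1;
}

$agree = true;
foreach (array('apple', 'banana', '') as $sample) {
    // Record whether the two checks ever disagree.
    if (startsWithASubstr($sample) !== startsWithARegex($sample)) {
        $agree = false;
    }
}
var_dump($agree);
```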
-
anyways, if you want to go the preg_replace route, you could do
    $string = "&foo; something blah blah &bar; something";
    $string = preg_replace("~&[^;]*;~","",$string);
-
"I think he wants to completely remove them not decode them."
Well, I was thinking more along the lines of decoding, then using something else to remove the decoded stuff, hence the whole "throw it into the mix" thing.