aunquarra

Members
  • Posts: 41
  • Joined
  • Last visited: Never

Profile Information
  • Gender: Not Telling
  • Location: Dallas, Texas

aunquarra's Achievements
  • Member (2/5)
  • Reputation: 0

  1. My end goal: take in a string of HTML and return a whitespace-compressed version of it. I've got some code I've used on many occasions through the years, and it's been tweaked a fair bit over time as well. I'm happy with it except for one thing... it won't leave <pre> blocks alone.

     I can match <pre> blocks just fine:

     '/<[\s]*pre([^>]*)>([^<]*)<\/pre[\s]*>/Ui'

     ...seems to work as I'd like (though I can't warrant that it's perfect by a long shot). What I don't know is how to apply that kind of detection in the manner I want. I'd like a neat regex solution, but I can accept a clever workaround; I just haven't thought of anything.

     Here's what I typically use for the search and replace in a preg_replace()...

     $search = array(
         '/(\s+)?(.+):(.+);(\s+)?/',         // css each item
         '/(\s+)?(.+)(\s+)?\{(.+)\}(\s+)?/', // css between items
         '/\n/',                             // replace end of line by a space
         '/\>[^\S ]+/s',                     // strip whitespaces after tags, except space
         '/[^\S ]+\</s',                     // strip whitespaces before tags, except space
         '/(\s)+/s',                         // shorten multiple whitespace sequences
     );
     $replace = array(
         '\\2:\\3;',
         '\\2 {\\4}',
         ' ',
         '>',
         '<',
         '\\1',
     );

     So the question is how I can either:

       • modify the patterns to ignore anything inside <pre> blocks (if that's even possible); or
       • add some kind of tokenization to remove <pre> blocks first, then run my preg_replace(), then put the <pre> blocks back (a rough sketch of that idea follows).
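
     For illustration, a minimal sketch of what I mean by the tokenization option might look like this (assuming PHP 5.3+ for the closure; the placeholder format and function name are arbitrary, and $search/$replace are the arrays above):

     function compress_html_preserving_pre($html, array $search, array $replace)
     {
         $blocks = array();

         // Swap each <pre>...</pre> block for a unique placeholder token
         // that the whitespace patterns won't touch.
         $html = preg_replace_callback(
             '/<\s*pre\b[^>]*>.*?<\/pre\s*>/is',
             function ($m) use (&$blocks) {
                 $token = '@@PRE_BLOCK_' . count($blocks) . '@@';
                 $blocks[$token] = $m[0];
                 return $token;
             },
             $html
         );

         // Run the existing compression patterns on everything else.
         $html = preg_replace($search, $replace, $html);

         // Put the untouched <pre> blocks back.
         return strtr($html, $blocks);
     }

     That works in my head; what I don't know is whether a pure-regex approach could avoid the round trip entirely.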
  2. I'm about to produce a pretty extensive set of log tables on a project, and I'm evaluating which storage engine to use. My gut instinct is InnoDB. I don't need transactions, but the row-level locking makes me feel more comfortable in an INSERT-heavy scenario like logging. That said, these won't be set-and-forget logs; I'll be querying them regularly, so the read performance of MyISAM comes to mind as well...

     Anyway, I did some Googling and found some folks recommending Archive. On the surface, it looks like a good fit: row-level locking, INSERT and SELECT only (no need to UPDATE or DELETE logs), compression... So far, so good. Looks like I might be playing with a new engine, which is fun for me (for some reason). Then I see that it doesn't support indexes, and I flinch. :-\

     I read that it performs well with SELECTs pertaining to timestamps, which is great; undoubtedly, I'll be querying based on timestamps regularly. But I'll also have a user_id column and other similar, relationally relevant IDs that I'll want to query on just as often (perhaps more often). Querying a massive MyISAM or InnoDB table with no indexes is a scary prospect to imagine, and I would hope that Archive has features to combat that, but I can't find any examples. It almost seems like Archive is great for logs if you're doing overall log analysis, but not so much for granular analysis.

     So, all that said, what storage engine would you recommend for logs that will be SELECTed regularly, knowing that SELECTs will likely fall into one of the following scenarios?

       • by timestamp
       • full-table query
       • per-user analysis (SUM() of certain columns WHERE user_id='123')
       • per event (I'll check the credit_card_transactions_log table by transaction_id, for example)
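
     For what it's worth, if I stay with InnoDB (or MyISAM), the kind of indexing I have in mind would look roughly like this; the table and column names are placeholders, just to illustrate the query patterns listed above:

     -- Hypothetical log table; names are placeholders, not the real schema.
     CREATE TABLE event_log (
         id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
         logged_at   DATETIME        NOT NULL,
         user_id     INT UNSIGNED    NOT NULL,
         amount      DECIMAL(10,2)   NULL,
         PRIMARY KEY (id),
         KEY idx_logged_at (logged_at),          -- by-timestamp queries
         KEY idx_user_time (user_id, logged_at)  -- per-user analysis over a date range
     ) ENGINE=InnoDB;

     -- Per-user analysis:
     SELECT SUM(amount) FROM event_log WHERE user_id = 123;

     -- Per-event lookups (e.g. credit_card_transactions_log) would carry their own
     -- KEY on transaction_id in the same way.

     The question is whether Archive can get anywhere near that without indexes.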
  3. Interesting. The only issue I see is that this wouldn't support a many-to-many scenario, because the suppliers are the initial point rather than the products. The end result I'm going for is a list of all product/supplier combinations (with some WHERE filtering, but that's irrelevant at this stage), with the price reflecting the user-supplied volume. So the same supplier might be on the list multiple times (if they sell 4 different products), and one product might be on the list multiple times (if it is sold by multiple suppliers). Go figure that I didn't do due diligence in the initial explanation.
  4. It has been a few years since I was here last, but you guys were always so helpful, and now I'm stuck again. It's a more complex bit of logic than I used to deal with... Oh, and this is MySQL 5.0.77 (Rackspace Cloud hosting, currently).

     On to the problem. It's not built yet. Or at least the new version isn't; the old version was junk, hence the rebuild. At any rate, that means I don't actually have table structures, queries, or anything.

     The idea is that I've got a set of specific products sold by a variety of sellers, whose price is determined by a variety of factors. For the most part, I've got that covered already, even the seller-specific stuff. The one issue I keep coming back to is the fact that each seller can set up their own volume pricing. So if someone is looking to buy 1 of this product, they fall into the first volume pricing tier for each seller, but if they want to buy 100, it might be the fifth tier (90-100) for Seller A but the second tier (50+) for Seller Z, and each would have their own adjustments to the price according to the actual volume break point. Standard stuff, but I've always done the legwork in PHP. For a variety of reasons, I need to try to keep this in MySQL.

     I know this is vague and conceptual, but I'm at the project design point here, so that's all I have. The last time we built it, we did it in a very "slapped-together" fashion, forced to use a lot of external, unoptimized data, and we used a set of subselects to do the work (plus, I didn't actually do it; it was another guy who has since left). The EXPLAIN on that made me shudder. As I said before, "hence the rebuild."

     Any suggestions on how I might structure/query the database to be best optimized for this? Conceptual and pseudocode is more than sufficient for me; I really just need a starting point (a rough sketch of where my head is at follows). Thanks, all.
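
     To make that concrete, the rough shape I keep sketching looks something like this; every name here is hypothetical, and I'm not committed to any of it:

     -- Hypothetical: one row per seller/product/tier, keyed by the quantity
     -- where the tier starts.
     CREATE TABLE seller_price_tiers (
         seller_id   INT UNSIGNED  NOT NULL,
         product_id  INT UNSIGNED  NOT NULL,
         min_qty     INT UNSIGNED  NOT NULL,
         unit_price  DECIMAL(10,2) NOT NULL,
         PRIMARY KEY (seller_id, product_id, min_qty)
     ) ENGINE=InnoDB;

     -- For a requested quantity (@qty), take the highest tier each seller/product
     -- combination qualifies for.
     SET @qty := 100;

     SELECT t.seller_id, t.product_id, t.unit_price
     FROM seller_price_tiers AS t
     JOIN (
         SELECT seller_id, product_id, MAX(min_qty) AS min_qty
         FROM seller_price_tiers
         WHERE min_qty <= @qty
         GROUP BY seller_id, product_id
     ) AS best
       ON  best.seller_id  = t.seller_id
       AND best.product_id = t.product_id
       AND best.min_qty    = t.min_qty;

     Whether that derived-table approach would fare any better under EXPLAIN than the old subselects is exactly the kind of thing I'm hoping for input on.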
  5. The safest bet to avoid overwriting stuff like this is to use "AS" aliases, like...

     SELECT table1.text, table1.text2, table1.mid,
            table1.id AS table1id, merch.id AS merchid,
            merch.name, table1.expire
     FROM table1
     INNER JOIN merch ON merch.mid = table1.mid
     WHERE table1.status = 'Y'
       AND table1.linkType = %s
     ORDER BY table1.$col $sort
  6. Well, I don't think my SSH client would send a kill, but I would think that one sure way to prevent it would be to unplug the ethernet first, then close the client. And, unfortunately, hourly crons are waiting on this restore, and the workload is backing up. This is one of those "it needs to be done yesterday" things. And it might be running through the night anyway; the backup.sql file is 4.5G.
  7. To clarify, the backup.sql file is on the DB server itself, in a 0777 directory. That said, I didn't reference it with a full path, since my shell was already in the same directory.
  8. I feel dumb for having to ask this, and I think I know the answer, but I want a second opinion. I have SSHed into a DB server and run a "mysql -u [user] -p [dbname] < backup.sql" type command. The problem is, it's a big backup, and it's going to take a good while. I didn't think to do it from the local machine rather than over SSH... and I'm probably going to need to pack up my laptop and leave before it's done.

     Having seen how MySQL responds to having Apache/PHP go away (it keeps on trucking on whatever PHP told it to do until it's done, even if PHP isn't listening anymore), I don't think I've ever just shut down an SSH session in the middle of such a large import (usually I have the forethought to _not_ do it over SSH). So what happens when I close my SSH client?
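
     For the record, what I probably should have done is detach the job from the terminal up front. My assumption is that either of these (adapted to the command above) would have been the safer route:

     # Keep the import alive even if the SSH session (and its terminal) goes away.
     # The password has to be inline or in ~/.my.cnf, since there's nothing to prompt on.
     nohup mysql -u [user] -p[password] [dbname] < backup.sql > import.log 2>&1 &

     # ...or run it inside a screen session and detach before leaving:
     #   screen -S restore
     #   mysql -u [user] -p [dbname] < backup.sql
     #   (Ctrl-A then D to detach; "screen -r restore" to reattach later)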
  9. Here's the query slimmed down (some of the excessive and verbose retrieved g_info columns have been removed, but everything else is there). As you can tell, I ended up doing both an inner and a left join. Since it's possible for g_membership to have zero matching rows, it needed to be joined with a LEFT JOIN.

     SELECT `g_info`.`id`, `persons`.`name`, COUNT(`g_membership`.`g_id`)
     FROM (`g_info`
           INNER JOIN `persons` ON `persons`.`id` = `g_info`.`person_id`)
     LEFT JOIN `g_membership` ON `g_info`.`id` = `g_membership`.`g_id`
     WHERE `category_id` = '2'
     GROUP BY `g_info`.`id`

     Thanks for the INNER JOIN tip. I'm finding that a lot of my LEFT joins can be accomplished with INNERs.
  10. +----+-------------+--------------+--------+---------------+----------+---------+------------------+------+----------------------------------------------+
      | id | select_type | table        | type   | possible_keys | key      | key_len | ref              | rows | Extra                                        |
      +----+-------------+--------------+--------+---------------+----------+---------+------------------+------+----------------------------------------------+
      |  1 | SIMPLE      | g_info       | ALL    | NULL          | NULL     | NULL    | NULL             |    3 | Using where; Using temporary; Using filesort |
      |  1 | SIMPLE      | persons      | eq_ref | PRIMARY       | PRIMARY  | 3       | g_info.person_id |    1 |                                              |
      |  1 | SIMPLE      | g_membership | ref    | g/person      | g/person | 3       | g_info.id        |    1 | Using index                                  |
      +----+-------------+--------------+--------+---------------+----------+---------+------------------+------+----------------------------------------------+

      I'll look into inner joins... All I ever learned was left joins, and they've filled every purpose I've had so far. I'll research, and if I don't specifically need left joins, then I'll start fixing all my old code...
  11. Okay, so I've got three tables that have data I want. I've done left joins on three tables before, but I've managed to avoid it for the most part because I wasn't sure if it was a good or bad idea. In fact, I've erred on the side of doing two separate queries with simple two-table joins to avoid it. This time, I'm going to have to either do it or err on the side of just not having the data visible.

      So, given that the joins are done on integers that are indexed, if not primary-keyed, in all three tables... is it a bad idea to do three-table joins?

      Also, specifically in this scenario, I'll be pulling a lot of data from one table, its one-to-one value from a second table, and a COUNT() of one-to-many rows in a third table. Which brings up a second question: assuming three-table joins are acceptable, what about when working with a GROUP BY as well?

      If table sizes make a difference, the primary table could have up to a few thousand entries or so, the second table (the one-to-one relationship) could have up to 20,000 entries, and the third table (the one-to-many which would be COUNT()ed) could have 50,000 or more entries.
  12. Well, the reporting is automatically performed, but on access. So it's effectively realtime. Would table creation, the inserts, and dropping (with approximately 1k-10k sets of data) cause any kind of performance issue?
  13. Well, I'm certainly open to suggestions, but I don't see any cleaner way of doing it, short of writing code that continually alters the table structure to fit my original post's output.

      Basically, it controls sets of user-definable fields on some forms. Originally, I had a max of four user-definable fields per form, which were each just columns in the tables with correlating data. Efficient and easy, but limited. And now I've learned that unlimited is more important to the users than efficient and easy. So I've got to figure out a way to make the system support an infinite number of fields. If you've got some direction, I'm all ears.

      ----

      As for the legwork, it will be a little more complicated than I'd like. The reason I wanted this query was that an existing automated report-generating system with user-definable conditions/filters needs it all to be in a single row from a single query. Just writing the necessary exceptions so I'll even have a place to insert some app scripting to compile the data will be a hassle. A database solution would certainly be ideal, and if that means restructuring, I'm fine with that.

      [edit because I forgot to explain the legwork aspect]
  14. So, PHP legwork it is... I hate this model, but unfortunately, the application required it. Thanks.
  15. Explaining this might be awkward, but if you just see it, it'll make sense real quick. So let's say I have a table like this...

      CREATE TABLE `test_table` (
        `id` int(11) NOT NULL auto_increment,
        `key_id` int(11) NOT NULL,
        `field` varchar(10) NOT NULL,
        `value` varchar(10) NOT NULL,
        PRIMARY KEY (`id`)
      );
      INSERT INTO `test_table` VALUES (1, 1, 'color', 'red');
      INSERT INTO `test_table` VALUES (2, 1, 'size', 'large');
      INSERT INTO `test_table` VALUES (3, 2, 'color', 'blue');
      INSERT INTO `test_table` VALUES (4, 2, 'size', 'small');

      So, to review...

      mysql> select * from test_table;
      +----+--------+-------+-------+
      | id | key_id | field | value |
      +----+--------+-------+-------+
      |  1 |      1 | color | red   |
      |  2 |      1 | size  | large |
      |  3 |      2 | color | blue  |
      |  4 |      2 | size  | small |
      +----+--------+-------+-------+
      4 rows in set (0.00 sec)

      What I would like to do is combine all the `field`s, grouped per `key_id`, into a single row of the result. So the output would be something similar to this...

      +--------+-------+-------+
      | key_id | color | size  |
      +--------+-------+-------+
      |      1 | red   | large |
      |      2 | blue  | small |
      +--------+-------+-------+

      And, if other `field`s were added, it would reflect that...

      +--------+-------+-------+---------+-------+
      | key_id | color | size  | article | price |
      +--------+-------+-------+---------+-------+
      |      1 | red   | large | pants   | 15.99 |
      |      2 | blue  | small | tshirt  |  5.99 |
      +--------+-------+-------+---------+-------+

      A guy I know told me to look into views to accomplish this. While I'm glad he told me to check them out (they look interesting for future reference), they don't really seem to do what I'm trying to do.

      If I have to, I'll fall back on rewriting the PHP to do this legwork for me, but if MySQL has some sort of feature to accomplish this with single rows... that would be stinkin' sweet. The current PHP setup is very clean and efficient because it's just looking for one row; if I have to build up that row, it could take a hit performance-wise. So I'm just making sure there's not a MySQL feature I'm just ignorant of...

      Thanks in advance. PHP Freaks forums rock. No, really. You knowledgeable veterans are God's gift to haphazard developers such as myself.
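
      As I understand it, the closest plain-SQL approximation is a conditional-aggregation pivot along these lines; note that it only works when the `field` names are known ahead of time, which is exactly the limitation I'm trying to get around:

      -- Hard-codes one MAX(CASE ...) per field, so new fields mean editing the query.
      SELECT key_id,
             MAX(CASE WHEN field = 'color' THEN value END) AS color,
             MAX(CASE WHEN field = 'size'  THEN value END) AS size
      FROM test_table
      GROUP BY key_id;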