
The M in MVC


roopurt18


I've been thinking about the model a lot lately and there's just one nagging item in the back of my mind - they seem inefficient.

 

I love good OOP design; as a developer it makes things so much cleaner.  I love the idea that I could create an instance of a User, assign various fields, and then the model will automatically tell me whether they're valid or not, etc.

 

90% of the time I'm dealing with the DB, I'm SELECTing data to display.  In this case, which happens to be the majority, objects seem like overkill IMO.  If I'm displaying 100 records of data, why create 100 instances of an object when all I'm doing is calling get* methods?  It seems to me an array would be more optimal in terms of performance.

 

Additionally, it seems most models I've looked at return just about everything from the DB.  I don't always need everything; sometimes I just want 2 or 3 fields out of the 10 available.  Now I can program the model so that it only requests those 2 or 3 fields, but I'd want to avoid pushing the responsibility of specifying which fields onto the client:

<?php
  // Somewhere in controller or view
  // The array specifies which fields to pull
  $user = new User($id, Array( 'username', 'last_login', 'date_registered' ) );
?>

I see this as bad because the view and controller now have seemingly intimate knowledge of the table structure, although I guess you can shove those off as class constants.  The other way this is bad is that any other potential fields for the $user model are invalid, so you have to be very careful about how you write your Save() method, i.e. you don't want to replace their e-mail address in the DB with an empty string just because it's not currently present in this limited-field instance.

 

But what happens when I have two different views for a user, showing these fields:

username, last_login, date_registered

username, signature, e_mail

<?php
  // In one controller
  $user = new User($id, Array('username', 'last_login', 'date_registered'));

  // And in another controller
  $user = new User($id, Array('username', 'signature', 'e_mail'));
?>

Again, the controllers have too much knowledge about the User.  You could get around this with something like:

<?php
  // In one controller
  $user = new User();
  $user->loadForWhatever($id); // This function knows which fields to load

  // In another controller
  $user = new User();
  $user->loadForSomethingElse($id); // This function knows which fields to load
?>

Now the controller knows less about the inner-workings of the model, but the model knows too much about how it's being used.  Ideally all the model does is handle the flow of data back and forth from the DB, it shouldn't know about how it's being used, right?

 

Another item that's bugging me is that relational DB != objects.  A lot of the time I'm SELECTing from the DB I'm using JOINs across tables that may not necessarily be related as models.  Let's say I want to see the 10 most recent orders and the registered users who purchased them.  I imagine a single function that queries the DB for the 10 most recent orders, creates an object for each one, and as each one is created that model creates a $user member that will again query the DB for the user details.

 

I think what I'm trying to get at is I think of all the neat, concise, and efficient MySQL queries I can make and for the life of me can not come up with a way to translate that into models.  Is there a way to maintain efficiency of DB querying and still wrap it up neatly as a model for the sake of the developer?


Good post and a good read all the way around.

 

I guess in my mind the Model having intimate knowledge of the database isn't a bad thing. If for some reason your actual database schema has to change, a few code changes are acceptable. In general, a production application is not going to undergo database changes unless you are adding to the existing database (new tables, new fields, etc). You would very seldom rename or remove fields.

 

What I like to do is have an additional layer on things that acts as an API or webservice to the Model. This is designed to handle whatever complex queries that exist. My Model then just interacts with the API, so the Model itself doesn't have to have a lot of the information. In turn, the API layer can be leveraged in multiple applications so that you're able to expose data to other systems, do integrations, etc.

 

I dunno, maybe I missed what you were after completely. Just my ramblings.


An instance of the model doesn't necessarily have to represent one row. You can also use it as an API so you perhaps have UserModel::getLatestUsers() which would get the latest x users in an array. That's what I do and that's what I plan to do for the models I'll create when working on the main site here. The idea of a model is just to separate the data as much as possible from the business logic, nothing else.


Agree.

 

I guess my thoughts of having the API layer separate is that it is clearly separate from whatever the PHP application is. That way a user who removes Application X, could theoretically do so without interrupting Services/Integration components Y and Z, whereas the relationship might not be as clear should the model itself be used as an API.


Ok, but if you want the latest orders, which do you choose:

  Orders::getLatestOrders();
  // ~or~
  Users::getLatestOrders();

 

We're dealing with orders, so I would assume we want to use the Orders::getLatestOrders() approach.  So let's look at the getLatestOrders() method:

// In this implementation, I assume the Order constructor will instantiate a User
// object.
function getLatestOrders(){
  // query the latest orders
  $sqlOrders = "
    SELECT x, y, z FROM `orders` WHERE ... ORDER BY ... LIMIT 10
  ";
  $qOrders = mysql_query($sqlOrders);
  if(!$qOrders){
    return Array(); // or null or false
  }
  $Models = Array();
  while($row = mysql_fetch_assoc($qOrders)){
    // Remember, Order constructor creates a User() object of its own
    $Models[] = new Order($row);
  }
  return $Models;
}

In that implementation, things are clean, but they are inefficient.  There is a single query to pull the 10 latest orders and then 1 more query for each order to pull the user information.  So we have 11 DB queries to pull the data we could have gotten from a single query.  Additionally, the Order model does not know how it's going to be used (because of good OOP design), so when it creates its User instance, the user object will likely pull all of the columns for the user from the DB, including the ones we don't want or need.  And all of this work is being done just so we can loop over the orders and display them in a table.  Seems like overkill to me.

 

Now consider this second approach to the method:

// In this implementation, I use a single query to pull ALL of the required data
// and the objects will figure it out themselves
function getLatestOrders(){
  // query the latest orders
  $sqlOrders = "
    SELECT x, y, z 
    FROM `orders`
    INNER JOIN `users` ON ...
    WHERE ... ORDER BY ... LIMIT 10
  ";
  $qOrders = mysql_query($sqlOrders);
  if(!$qOrders){
    return Array(); // or null or false
  }
  $Models = Array();
  while($row = mysql_fetch_assoc($qOrders)){
    // Remember, Order constructor creates a User() object of its own
    // But this time $row contains the user information we want, so it will
    // pass that information along to the User() constructor so that it
    // does not re-query the database.
    $Models[] = new Order($row);
  }
  return $Models;
}

This example is the desired one from the MySQL point of view.  A single to-the-point query that grabs just what we need.  But now the Orders model knows too much about the User model IMO.  Now if we happen to change or add a field in the `users` table, our code changes are not limited to the User model.  Now we have to find all the places in our code where we did things the desirable way from a database standpoint.


IMO your second approach is fine. Regarding your first question about whether to use the orders or users model to get the latest orders... I'd say it depends on if you're trying to get a user's latest orders (then it would be the users model) or you'd get the latest orders overall (then it would be the orders model).


Loading only parts of the user seems fine to me, you have 3 main approaches for doing so:

 

1) Lazy loading, put a proxy in fields/relations you don't always need and reload if requested (not a fan of this in php)

2) Specify which fields you want at the query stage

3) Create additional classes e.g. User extends UserBasicInfo

 

With option 2, you can get round the problem of overwriting data with empty values by tracking dirty fields. Each set marks a field as dirty, and the update statement is then built using only the dirty fields, i.e. if a field's value hasn't changed, it isn't updated.
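
A minimal sketch of the dirty-field idea, assuming PHP 7.4 or later. The class name PartialUser, the `users` table and the `id` column are hypothetical, and nothing is executed against a database: buildUpdate() only returns the statement and its parameters.

```php
<?php
// Hypothetical sketch: a model that records which fields were changed via set()
// and builds an UPDATE statement from those fields only.
class PartialUser
{
    private int $id;
    private array $fields;      // values loaded from the DB (possibly only some columns)
    private array $dirty = [];  // field name => true for every field changed since load

    public function __construct(int $id, array $loadedFields)
    {
        $this->id = $id;
        $this->fields = $loadedFields;
    }

    public function set(string $name, $value): void
    {
        // Only mark the field dirty if the value actually changed.
        if (!array_key_exists($name, $this->fields) || $this->fields[$name] !== $value) {
            $this->fields[$name] = $value;
            $this->dirty[$name] = true;
        }
    }

    // Returns [sql, params] for a prepared statement, or null if nothing changed.
    // Field names are assumed to be trusted column names, not user input.
    public function buildUpdate(): ?array
    {
        if ($this->dirty === []) {
            return null;
        }
        $assignments = [];
        $params = [];
        foreach (array_keys($this->dirty) as $name) {
            $assignments[] = "`$name` = ?";
            $params[] = $this->fields[$name];
        }
        $params[] = $this->id;
        $sql = 'UPDATE `users` SET ' . implode(', ', $assignments) . ' WHERE `id` = ?';
        return [$sql, $params];
    }
}
```

Because columns that were never loaded are never marked dirty, a limited-field instance cannot blank out an e-mail address it never fetched.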

 

Getting round the problem of the last 10 orders and the users, I've given up trying to make the OO nice, and basically followed the DB structure on the idea that I'd rather have efficient queries than the cleanest OO.

 

What I currently do is build the query and, in this case, the user id would be in the order, so I tell it to grab the 10 most recent orders, and that the user id is a foreign key, so it should do a join, and the iterator will return an array of objects (order, user), reducing it to 1 query, which is generated on the fly, but losing the ability to say: $order->getUser()
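
One way the (order, user) pairing could look, as a rough sketch rather than the poster's actual code. It assumes the generated query aliases its columns with hypothetical "o_" and "u_" prefixes, and it returns plain arrays where a real implementation would probably build objects.

```php
<?php
// Hypothetical sketch: split each row of a JOINed result into an (order, user) pair.
// Assumes the query aliases its columns with "o_" and "u_" prefixes, e.g.
//   SELECT o.id AS o_id, o.total AS o_total, u.id AS u_id, u.username AS u_username ...

// Returns the columns of $row that start with $prefix, with the prefix removed.
function splitByPrefix(array $row, string $prefix): array
{
    $out = [];
    foreach ($row as $column => $value) {
        if (strpos($column, $prefix) === 0) {
            $out[substr($column, strlen($prefix))] = $value;
        }
    }
    return $out;
}

// Turns joined rows into a list of ['order' => [...], 'user' => [...]] pairs.
function pairOrdersWithUsers(iterable $rows): array
{
    $pairs = [];
    foreach ($rows as $row) {
        $pairs[] = [
            'order' => splitByPrefix($row, 'o_'),
            'user'  => splitByPrefix($row, 'u_'),
        ];
    }
    return $pairs;
}
```

The order side never has to name the user columns; it only has to know the prefix convention.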

 

For me, I can live with the lack of oo purity. I know some can't.


"I'd say it depends on if you're trying to get a user's latest orders (then it would be the users model) or you'd get the latest orders overall (then it would be the orders model)."

This is the crux of what I'm struggling with.  A normalized database establishes a relationship between the data, whereas an object / class establishes a hierarchy: "this belongs to that" rather than "this is related to that".  Often the data in the database will be a hierarchy, but it doesn't have to be, as exemplified by the portion of your response that I quoted.

 

(edit) I suppose classes represent "is related to" with member variables, but meh.


Generating objects from database data is less efficient than storing them in a big array and dumping the values in a template, obviously.

 

But things are hardly ever that simple. In a real system you have to operate on the data and relate pieces of data, compare them, poke them with pointy objects and see if they explode when you put them in a microwave. That's when OO comes into its own, and that's when it becomes hard to create an efficient and manageable procedural equivalent.

 

About your struggles with the M in MVC: I sense that you're feeling a bit guilty about tying the concept of a user to the database schema. And indeed you should :P

 

If you want to feel less guilty without changing your methodology too much, look into Table Data Gateway (PoEAA).

 

Did you read PoEAA? If you haven't, you really should. Fowler presents Data Mapper and Transfer Object as well as a couple of insanely useful patterns for managing data integration, including Unit of Work and a variety of Lazy Load patterns. Some of the ideas have been exchanged with Alur et al.'s Core J2EE Patterns, which presents DAO.

 

Core J2EE Patterns is a bit bloated and very geared towards Java, but there are some useful things in there, and it's worth looking over. In the Core J2EE implementations of DAO, they use Transfer Objects to exchange data, but a simpler, more common implementation is to simply abstract out all database access logic and have the Domain Object (e.g. User) delegate to the "DAO". In essence the resulting objects are more like Table Data Gateway with the Domain Object extracted from it, but it still works better than just Table Data Gateway.

 

Personally, when I use DAO or Data Mapper (I always use Domain Object, never Row Data Gateway) I do one or more of the following things:

 

Before I continue I should stress that these are interpretations of the original patterns. As Fowler said: a pattern is a recipe. Judge for yourself if you like my brownies, even if they are half baked.

 

1: Simplified Data Access Object

 

As I said, this is a very widely used implementation of DAO. It is also probably the most used "real" Data Mapping pattern (not counting ad hoc Row Data Gateways and Table Data Gateways). It is really stunningly simple. Take a look at one of your model classes (which is probably some implementation of Row or Table Data Gateway, whether you know it or not) and imagine taking out all the database access code and moving it to a new class. Then imagine making calls to this object from what's left of the original object to load fields. There is also the option of having client code call on the DAO to return Domain Objects created from store, which seriously blurs the already thin line between DAO and Data Mapper.
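
The split described above might look roughly like this, assuming PHP 7.4 or later. MemberDao and Member are hypothetical names, and an in-memory array stands in for the database so the sketch stays self-contained.

```php
<?php
// Hypothetical sketch of the "simplified" DAO: all storage access lives in MemberDao,
// and the Member domain object delegates to it. An array stands in for the DB table.
class MemberDao
{
    private array $rows; // id => associative row

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function findById(int $id): ?array
    {
        return $this->rows[$id] ?? null;
    }

    public function update(int $id, array $row): void
    {
        $this->rows[$id] = $row;
    }
}

class Member
{
    private MemberDao $dao;
    private int $id;
    private array $data;

    public function __construct(MemberDao $dao, int $id)
    {
        $row = $dao->findById($id);
        if ($row === null) {
            throw new RuntimeException("No member with id $id");
        }
        $this->dao = $dao;
        $this->id = $id;
        $this->data = $row;
    }

    public function getUsername(): string
    {
        return $this->data['username'];
    }

    // Domain rules stay in the domain object; the DAO knows nothing about them.
    public function rename(string $username): void
    {
        if ($username === '') {
            throw new InvalidArgumentException('Username must not be empty');
        }
        $this->data['username'] = $username;
    }

    public function save(): void
    {
        $this->dao->update($this->id, $this->data);
    }
}
```

Swapping the array for real queries would only touch MemberDao; Member keeps its validation and never sees SQL.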

 

2: Data Access Object à la Alur et al. (with Data Transfer Object)

 

Imagine the previous setting (bar the last addition, which makes it almost identical to Data Mapper). Now, instead of fetching data for fields in arbitrary fashion, imagine that you have an intermediate storage medium, not unlike an array, crafted to transfer the data from the DAO to the client (the Domain Object in this case) and back. Voilà, a "real" DAO. Sounds simple enough, and it is indeed simple, but it has some definite advantages over the "simplified" DAO. To name a few:

 

- the data is typed (which will prevent mixups)

- the data is separated and simple (which makes it possible to do stuff to it you might not want to do to your Domain Object, such as serializing and cloning).

- the data is accessible. (you can freely access and modify it without worrying about visibility or even type safety -- the Domain Object will worry about those things for it)

 

The base of Data Transfer Object, Transfer Object (there is really hardly any difference), is useful in other cases as well. You could for example use a TO to dispatch data to the View. Simply aggregate the DTOs your View needs into one TO and make it available to it.
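
A small sketch of a DAO exchanging a typed DTO with a Domain Object, assuming PHP 7.4 or later for typed properties. AccountDto, AccountDao and Account are hypothetical names, an array stands in for the table, and unknown ids are not handled.

```php
<?php
// Hypothetical sketch: a typed Data Transfer Object carried between a DAO and a domain object.
final class AccountDto
{
    public int $id;
    public string $username;
    public string $email;

    public function __construct(int $id, string $username, string $email)
    {
        $this->id = $id;
        $this->username = $username;
        $this->email = $email;
    }
}

class AccountDao
{
    private array $rows; // id => row, standing in for a database table

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // No handling of unknown ids, to keep the sketch short.
    public function load(int $id): AccountDto
    {
        $row = $this->rows[$id];
        return new AccountDto($id, $row['username'], $row['email']);
    }

    public function store(AccountDto $dto): void
    {
        $this->rows[$dto->id] = ['username' => $dto->username, 'email' => $dto->email];
    }
}

class Account
{
    private AccountDto $dto;

    public function __construct(AccountDto $dto)
    {
        $this->dto = $dto;
    }

    // The domain object guards the data; the DTO itself stays dumb.
    public function changeEmail(string $email): void
    {
        if (strpos($email, '@') === false) {
            throw new InvalidArgumentException('Not an e-mail address');
        }
        $this->dto->email = $email;
    }

    // Hands out a copy so callers cannot bypass the validation above.
    public function toDto(): AccountDto
    {
        return clone $this->dto;
    }
}
```

The DTO is cheap to clone or serialize, which is exactly the kind of thing you might not want to do to the Domain Object itself.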

 

3: Data Mapper (with or without DTO)

 

DAO and Data Mapper are very much alike. There are a couple of key differences.

 

1) A DAO is (mostly or fully) stateless. A Data Mapper commonly does have state, and often uses this state to keep track of the objects it operates on (Identity Map).

2) A DAO does not create Domain Objects (bar some looser implementations, see above)

3) With DAO, the client (Domain Objects/Units of Work/Registries) is aware of the DAO and operates on it. With Data Mapper, this is less the case, though "awareness may vary" ;)

 

An Identity Map is just a simple list (hashtable) of objects loaded by the Data Mapper, which prevents it from loading the same object twice. Implementing Unit of Work can make this list obsolete.
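
A bare-bones illustration of a mapper with an Identity Map, assuming PHP 7.4 or later. Customer and CustomerMapper are hypothetical names, and the callable passed to the mapper stands in for a SELECT by primary key; the public counter exists only to make the effect visible.

```php
<?php
// Hypothetical sketch: a Data Mapper that keeps an Identity Map of loaded objects.
class Customer
{
    public int $id;
    public string $name;

    public function __construct(int $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }
}

class CustomerMapper
{
    private $fetchRow;               // callable(int): ?array, stands in for a SELECT by id
    private array $identityMap = []; // id => Customer already loaded
    public int $loads = 0;           // number of times the store was actually hit

    public function __construct(callable $fetchRow)
    {
        $this->fetchRow = $fetchRow;
    }

    public function find(int $id): ?Customer
    {
        // Serve the already-loaded instance instead of querying again.
        if (isset($this->identityMap[$id])) {
            return $this->identityMap[$id];
        }
        $this->loads++;
        $row = ($this->fetchRow)($id);
        if ($row === null) {
            return null;
        }
        return $this->identityMap[$id] = new Customer($id, $row['name']);
    }
}
```

Two calls to find() with the same id return the very same object, so changes made through one reference are visible through the other.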

 

With Mapper (and thus with Data Mapper) the two subjects of the mapping are unaware of each other and even of the mapper. Though the exact gain of this is mostly dependent on how badly you want the two independent of each other, since a client will need to be aware of both.

 

4: Unit of Work

 

The Unit of Work pattern keeps track of Domain Objects and their state compared to store. Traditionally, that requires that objects be explicitly marked 'dirty', 'clean' or 'new' (though marking 'new' could be omitted if you use NULL Identity Fields and get the id from the db -- sorry, I'll try to stay on track). It can be implemented as either a Registry-like solution (a separate type) or an Identity Map solution (with the Data Mapper). In my experience, you can't really do without a Unit of Work when you have a Domain Model.

 

5: Unit of Work controller (not explicitly listed by Fowler, just mentioned in the Unit of Work chapter, but it's one of my personal favourites).

 

A Unit of Work controller eliminates the most annoying of these dependencies (marking 'dirty'), which is bug prone (though it is still better than no tracking at all). When a new object is created from stored data, the Unit of Work controller makes a clone of the loaded object. Next to your reference copy, keep a working copy, removing and adding domain objects from it. When it is time to commit changes, the "controller" simply updates the diff.

 

Because Domain Objects aren't necessarily good candidates for cloning, and because the Unit of Work controller needs access to internal values (which you'd rather not expose to the rest of the world), I like to use the Unit of Work controller in combination with Data Transfer Object.
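
A stripped-down sketch of the clone-and-diff idea with DTOs, assuming PHP 7.4 or later. ChangeTracker and PostDto are hypothetical names, and the sketch only computes the changed fields; turning them into UPDATE statements is left out.

```php
<?php
// Hypothetical sketch: keep a reference clone of each loaded DTO and diff against it on commit.
final class PostDto
{
    public string $title;
    public string $body;

    public function __construct(string $title, string $body)
    {
        $this->title = $title;
        $this->body = $body;
    }
}

class ChangeTracker
{
    private array $reference = []; // key => clone of the DTO as it was loaded
    private array $working = [];   // key => the DTO the application keeps modifying

    // Register a freshly loaded DTO. The clone is shallow, which is enough for flat DTOs.
    public function register(string $key, object $dto): void
    {
        $this->reference[$key] = clone $dto;
        $this->working[$key] = $dto;
    }

    // Returns key => [field => newValue] for every public field that differs from its reference copy.
    public function diff(): array
    {
        $changes = [];
        foreach ($this->working as $key => $dto) {
            $before = get_object_vars($this->reference[$key]);
            $changed = [];
            foreach (get_object_vars($dto) as $field => $value) {
                if (!array_key_exists($field, $before) || $before[$field] !== $value) {
                    $changed[$field] = $value;
                }
            }
            if ($changed !== []) {
                $changes[$key] = $changed;
            }
        }
        return $changes;
    }
}
```

Nothing has to be marked dirty by hand: client code just edits the working DTO, and the tracker works out what changed when it is time to commit.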

 

Once in place, it gives the developer the flexibility of using a Domain Model, without many of the usual data mapping headaches. It probably isn't the best performing solution in the world though.

 

I'm out of time, maybe next time I can introduce you to some Lazy Load patterns.


First of all, let me say that using an object for a table record can be overkill, but it provides useful functionality and easy extension of the object.

 

Let me explain how my "MVC" framework works.

 

I technically have 4 common parts: views, controllers, models, and model controllers.  View files are pretty standard and contain HTML and PHP echo/control statements.  Controllers use model controllers and models to get and process data for the view.  A model object stores information about the table it links to in the database and holds all of that table's fields as members.  Model controllers, which I don't think are standard in the MVC design, give the ability to retrieve X number of models based on passed information.  For example, my status_controller has a function called get_table_statuses($table_name) which will return all available statuses for a table as an array of status_model objects.

 

I have designed my base_model class to be flexible and to allow for the fastest retrieval of data.  For instance, as has been said, you can run a query to select ids and then loop through those ids, passing each one to a model which in turn runs a query for each id.  Let's say the first query pulled 1000 ids; that is a total of 1001 queries.  That is quite inefficient, however I think my model and model controller classes allow for a flexible solution.  Not only can I load a model from a primary key (like an id), I can also load it from an array.  This allows me to select all the data I want from the table (not just the id) and then loop through that multi-dimensional array and load each model from the array data, so there is only 1 query and not 1001.  From some basic testing, loading by array for a large amount of data is about 2 times faster than getting ids and looping through those (.4-.6 seconds compared to .9-1.1 seconds).  That explains how my model class allows for faster retrieval of data than the "normal" way I know MVC frameworks to work (my experience with other frameworks is not extensive, so other methods might be available that I don't know about).  Let me now explain why I like having a model class.
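
The difference between the two loading strategies can be sketched as follows, assuming PHP 7.4 or later. FakeDb, StatusRecord and both loader functions are hypothetical stand-ins, not the poster's base_model; FakeDb simply counts how many queries each approach would issue.

```php
<?php
// Hypothetical sketch: load models one query per id versus from the rows of a single query.
class FakeDb
{
    public int $queries = 0; // how many "queries" have been issued
    private array $rows;     // id => row, standing in for a table

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function selectIds(): array
    {
        $this->queries++;
        return array_keys($this->rows);
    }

    public function selectById(int $id): array
    {
        $this->queries++;
        return $this->rows[$id];
    }

    public function selectAll(): array
    {
        $this->queries++;
        return $this->rows;
    }
}

class StatusRecord
{
    public array $fields = [];

    public function loadById(FakeDb $db, int $id): self
    {
        $this->fields = $db->selectById($id); // costs one query per model
        return $this;
    }

    public function loadFromArray(array $row): self
    {
        $this->fields = $row; // no query at all
        return $this;
    }
}

// One query for the ids, then one more per id.
function loadOneByOne(FakeDb $db): array
{
    $models = [];
    foreach ($db->selectIds() as $id) {
        $models[] = (new StatusRecord())->loadById($db, $id);
    }
    return $models;
}

// A single query; every model is filled from its row.
function loadInBulk(FakeDb $db): array
{
    $models = [];
    foreach ($db->selectAll() as $row) {
        $models[] = (new StatusRecord())->loadFromArray($row);
    }
    return $models;
}
```

With N rows, the first function issues N + 1 queries and the second issues exactly one, while both return the same list of models.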

 

I come from a C++ background, most of my work was within classes in C++, and I really like OOP.  Even though it might seem there is a performance hit (and if you use objects just to store data from a database then there is), in the end the ease of adding functionality is, I think, well worth it.  For instance, let's say I have a user object and each user has a board tied to it (like a Facebook wall).  If I just had an array I would have to write a query to get the user's board, but with a user object all I have to do is $user->get_board(), and then with the board I can do $board->get_posts().  Now granted, I have to write the queries for those functions, however the functions encapsulate these queries and keep the controller code very clean.  Another very big thing I have built into my framework is cascading on model updates.  This basically provides a way to update other tables that link to the updated model, in ways that are not possible through MySQL InnoDB.  For instance, in our system we do not delete records from the database through our management system; we set their status to 'trash'.  'trash' is different from 'inactive' because trashed records will not display in the list for the table (users can however go through the trash and untrash them, kind of like an OS trash bin).  Now if I set the status of a board to 'trash', I want to set the status of all posts related to the board to 'trash' as well, but I can't do that with the database (well, not with all database types; with some I am sure I can).  I now provide a way to set foreign keys and an action to perform, and no matter what the database type, it will be able to cascade any type of query to any table.
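
A very reduced sketch of what such an application-level cascade could look like. The rule format, the function name buildStatusCascade and the table and column names are all hypothetical; it cascades only one level deep and returns the statements with their parameters instead of executing them.

```php
<?php
// Hypothetical sketch: cascade a status change from a parent row to its child tables.
// $rules maps a parent table to a list of [child table, foreign key column] pairs.
// Table and column names are assumed to be trusted configuration, not user input.
function buildStatusCascade(array $rules, string $table, int $id, string $status): array
{
    // First the parent row itself...
    $statements = [
        ["UPDATE `$table` SET `status` = ? WHERE `id` = ?", [$status, $id]],
    ];
    // ...then every child table that points at it through a foreign key.
    foreach ($rules[$table] ?? [] as [$childTable, $foreignKey]) {
        $statements[] = [
            "UPDATE `$childTable` SET `status` = ? WHERE `$foreignKey` = ?",
            [$status, $id],
        ];
    }
    return $statements;
}
```

With a rule such as `['boards' => [['posts', 'board_id']]]`, trashing board 4 yields one UPDATE for `boards` and one for `posts`, regardless of what the database engine itself supports.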

 

I just think having this functionality available is well worth it.

 

If anyone disagrees with any of my points, please let me know why, because this post has been quite useful to me.


I have a little time to introduce you to a solution to this problem:

 

"90% of the time I'm dealing with the DB, I'm SELECTing data to display.  In this case, which happens to be the majority, objects seem like overkill IMO.  If I'm displaying 100 records of data, why create 100 instances of an object when all I'm doing is calling get* methods?  It seems to me an array would be more optimal in terms of performance."

 

Enter: "virtual collections".

 

Patterns for variations of this principle are Value Holder and Value List Handler. The essence is the same: create an object which represents a list of Domain Objects, but doesn't necessarily contain the full objects. IMO, it works best with DTOs, since it allows you to simply get a DTO for the whole list (without instantiating the full list of objects) as well as get/operate on individual objects in the list. You could accomplish this without a DTO as well though. To make any of this possible, the virtual collection needs a reference to the DAO or Data Mapper.
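
A rough sketch of such a virtual collection, assuming PHP 7.4 or later. Product and ProductList are hypothetical names, and the callable handed to the list stands in for the reference to a DAO or Data Mapper.

```php
<?php
// Hypothetical sketch: a list that holds raw rows and builds a Product only when asked for one.
class Product
{
    public int $id;
    public string $name;

    public function __construct(int $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }
}

class ProductList implements Countable
{
    private $fetchRows;           // callable(): array of id => row, stands in for the DAO
    private ?array $rows = null;  // fetched on first use
    private array $objects = [];  // id => Product, built lazily

    public function __construct(callable $fetchRows)
    {
        $this->fetchRows = $fetchRows;
    }

    private function rows(): array
    {
        if ($this->rows === null) {
            $this->rows = ($this->fetchRows)();
        }
        return $this->rows;
    }

    public function count(): int
    {
        return count($this->rows());
    }

    // Cheap path for display: plain arrays, no objects instantiated.
    public function toArray(): array
    {
        return array_values($this->rows());
    }

    // Full object on demand, created at most once per id.
    public function get(int $id): Product
    {
        $rows = $this->rows();
        if (!isset($rows[$id])) {
            throw new OutOfBoundsException("No product with id $id");
        }
        if (!isset($this->objects[$id])) {
            $this->objects[$id] = new Product($id, $rows[$id]['name']);
        }
        return $this->objects[$id];
    }

    public function instantiated(): int
    {
        return count($this->objects);
    }
}
```

A view that only displays the list can use toArray() and pay for no objects at all, while code that needs behaviour asks for individual objects through get().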

 

 
