Relational => OO Breakdown

utexas_pjm · August 28, 2007

I run into this situation quite often:

I have two objects which exhibit a "has a" relationship. To demonstrate this consider the following trivial example involving cities and states. In this example let us assume that all cities have a state and all states have 0 - n cities. So I store these hypothetical entites in two relational db tables, cities and states, like so:

cities { city_id, city_name, state_id, ... a bunch of other relevant city attributes }

states { state_id, state_name, ... a bunch of other relevant state attributes }

Once I have my persistence layer defined I create a mapping layer which converts objects into db relations (for storage) and db relations into objects (for display). I typically create mapping classes (which I call object-relational mappers or ORMs) for each entity. An ORM for cities might look something like this:

<?php
class Orm_City
{
   /**
    * Load an array containing all cities from persistence.
    * 
    * @return array <Entity_City>   
    */
    public static function loadAllCities()
    {
        $allCities = array ();
        
        $rs = Db_Facade::query('SELECT * FROM cities');

        /* Iterate over result set row by row and map relational data to objects */        
        foreach($rs as $row){
            /* Push resulting object on to return array */
            $allCities[] = Orm_City::bindAttrbiutes($row);                  
        }    
        return $allCities;
    }

   /**
    * Bind data from relational columns to object attributes.
    * 
    * @return Entity_City   
    */
    public static function bindAttributes($row)
    {
        $city = new Entity_City();
        $city->setCityId($row['city_id']);
        $city->setCityName($row['city_name']);
        $city->setStateId($row['state_id']);        
        // ... etc ...
        return $city;
    }
}
?>

You can see from the above code that I am binding only from the city table. This means (presumably) that my Entity_City object must contain a method like this:

<?php
class Entity_City
{
    // ...
    public function getState()
    {
        return Orm_State::loadById($this->_stateId);
    }
    // ...
}
?>

So every time I want to access to the city's state an additional query is run behind the scenes in the state's ORM class. This becomes terribly inefficient when a collection of cities state's need to be accessed (I end up making n + 1 queries where n is the number of cities). This also, IMO, negates one of the nice properties of relational databases, the ability to join two (or more) relations and yield a composite relation. This observation leads me to the approach below...

Let's rewrite the city's ORM from above like this:

<?php
class Orm_City
{
   /**
    * Load an array containing all cities from persistence.
    * 
    * @return array <Entity_City>   
    */
    public static function loadAllCities()
    {
        $allCities = array ();
        
        $rs = Db_Facade::query('SELECT * FROM cities INNER JOIN states USING (state_id)');

        /* Iterate over result set row by row and map relational data to objects */        
        foreach($rs as $row){
            /* Push resulting object on to return array */
            $allCities[] = Orm_City::bindAttrbiutes($row);                  
        }   
        return $allCities;
    }

   /**
    * Bind data from relational columns to object attributes.
    * 
    * @return Entity_City   
    */
    public static function bindAttributes($row)
    {
        $city = new Entity_City();
        $city->setCityId($row['city_id']);
        $city->setCityName($row['city_name']);
        $city->setStateId($row['state_id']);        
        $city->setState(Orm_State::bindAttributes($row));        
        // ... etc ...
        
        return $city;
    }
}
?>

Notice that I have now joined the city and state tables on state_id, my City_Entity also now has an implied setState(Entity_State $state) method which takes a Entity_State produced by the Orm_State::bindAttributes() method.

<?php
class Entity_City
{
    // ...
    public function setState(Entity_State $state)
    {
        $this->_state = $state;
    }

    public function getState()
    {
        return $this->_state;
    }
    // ...
}
?>

This approach is clearly more efficient in terms of database queries, however I'm now lugging around an extra Entity_State instance with each Entity_City instance regardless of whether this data is ever accessed or not. The overhead is rather trivial in this case, but imagine the consequences with an object which is composed of 10 or more other instances.

Just wanted to see how some of you guys would tackle this. I usually implement the latter approach and then feel dirty about it and try to minimize the memory overhead by using references between the related objects. I'm hoping someone might suggest an approach that will allow me to have my cake and eat it too.

Best,

Patrick

ReDucTor · August 28, 2007

You could make a function to fetch a single city if your ownly using one city most the time (or even if your wanting use __get and __set to handle which city you want to get, then implement some caching, and you have alot less memory and alot smaller db result set)

utexas_pjm · August 28, 2007

Thanks for taking the time to read my post and posting a reply.

You could make a function to fetch a single city if your ownly using one city most the time (or even if your wanting use __get and __set to handle which city you want to get, then implement some caching, and you have alot less memory and alot smaller db result set)

My goal is to not make any assumptions about how the object (city in this case) will be used. Caching is always an option but that to me is an implementation detail, much like my reference hack, whereas I'm looking for a more general solution.

Thanks again,

Patrick

designationlocutus · August 28, 2007

Does a City always exist in a State?

Your City table implies that a city must have exactly one State (you specified a 0-n relationship between State and City and the state_id is in the table). To truly create the 0-n your table structure could be:

states_cities { states_cities_id, state_id, city_id }

states { state_id, state_name, ... a bunch of other relevant state attributes }

cities { city_id, city_name, ... a bunch of other relevant city attributes } // State id removed

Because how this is organised, both State and City are distinct objects

utexas_pjm · August 29, 2007

Does a City always exist in a State?

Your question is a tad ambiguous. Let's try this qualifier: A state can have zero cities. However, for a city to exist it must be in a state. (Which, if we generalize state to mean nation is mostly true in the real world... maybe with the exception of The Vatican?)

Your City table implies that a city must have exactly one State (you specified a 0-n relationship between State and City and the state_id is in the table). To truly create the 0-n your table structure could be:

states_cities { states_cities_id, state_id, city_id }

states { state_id, state_name, ... a bunch of other relevant state attributes }

cities { city_id, city_name, ... a bunch of other relevant city attributes } // State id removed

Perhaps I was unclear, I said:

In this example let us assume that all cities have a state and all states have 0 - n cities.

A state which may have 0 - n cities (read: between zero and n cities) constitutes a classic 1 to many relationship. Think of the state as the "1" and the city as the "many". A 1 to many relationship can be sufficiently represented by two tables and a foreign key. The use of a mapping table, as you've suggested, is only necessary when you are dealing with a many to many relationship. For example this construct would be useful if we could have the case where a city A could exist in both State 1 and State 2. Unfortunately this is not the case in the example I outlined above.

Because how this is organised, both State and City are distinct objects

Yes, as they should be.

Thank you for taking the time to read my post. I apologize for any ambiguities in my original post. If you feel that I am missing the point that you were trying to get across (which is always possible) please let me know.

Best,

Patrick

Buyocat · August 29, 2007

Personally I think you should let the size of the result sets, in this case the number of cities a state might "own", guide your decision. If you anticipate that on average a state will have many cities then you should implement a lazy-loading method which performs a second query; frankly a single query isn't that expensive, and this ensures that memory isn't wasted on large record sets that may not be used. On the other hand if you're anticipating a small number of cities to a given state then you don't have to worry about memory issues, and in that case shaving off a single query may be advantageous.

Obviously there are other things that should be brought into consideration; for instance, how often does the application access all of the cities? Do states ever share instances of the same city (probably not in this case, but for another example such as customers/merchants it may be relevant)?

3.5k · August 29, 2007

The tricky thing here is that State is a parent object. A State mapper will need an Identity Map to avoid creating a multitude of equal objects. That's going to save you plenty of memory. Seeing as the number of states will be tiny compared to cities, when you load all cities, you might as well construct the whole thing, states included.

If you load all states, I'd mark the City objects as Ghosts.

3.5k · August 29, 2007

Or a Virtual Proxy list:

<?php 
class StateMapper {
    public static function loadAllStates() {        
        $rs = Db_Facade::query('SELECT * FROM states');
        foreach($rs as $row){
            $states[] = self::bindAttrbiutes($row);                  
        }   
        return $states;
    }
    public static function bindAttributes($row) {
        $state = new State();
        $state->setId($row['state_id']);
        $state->setName($row['state_name']);      
        $state->setCityList(new CityList);
        return $state;
    }
}
class CityList extends VirtualProxyList {

public function get($id){
	if(!isset($this->list[$id])){
		$this->list[$id] = CityMapper::find($id);	
	}
	return $this->list[$id];
}
}
?>

A simplified example, but would work nicely. The next thing then becomes that you need an Identity Map, to avoid CityMapper instantiating the same City twice. A statefull mapper is the better option IMO.

Sign In

Relational => OO Breakdown

Recommended Posts

utexas_pjm

Link to comment

Share on other sites

ReDucTor

Link to comment

Share on other sites

utexas_pjm

Link to comment

Share on other sites

designationlocutus

Link to comment

Share on other sites

utexas_pjm

Link to comment

Share on other sites

Buyocat

Link to comment

Share on other sites

448191

Link to comment

Share on other sites

448191

Link to comment

Share on other sites

Join the conversation

Browse

Activity

Important Information