Jump to content

Looping through a tree of arbitrary depth.


Ayvah

Recommended Posts

I'm using MySQL 5.1.49-1ubuntu8.1

 

The table essentially contains a tree structure, with columns looking something like:

idparent_idname

1NULLanimal

21cat

31dog

42Persian

52Savannah

The depth of any particular row in this tree is arbitrary, and could easily be more than 10 branches deep.

 

Starting with a row with a parent_id of 100, I can select its parent with the simple statement:

mysql_query("SELECT * FROM `table` WHERE `id` = 100")

However, often I'll be wanting the entire tree of parents, and rather than executing mysql_query after mysql_query until I get to the top, I would like to use a single statement that can loop through parent rows all the way to the top of the tree, and select each row it loops through -- without the overhead of multiple mysql queries.

 

I have been unable to find anything in the documentation that would make this possible. Does anyone know a way to do this?

 

I get the feeling a UNION might help, but I don't know how I would use it in this circumstance.

Link to comment
Share on other sites

To retrieve the tree levels, you have to join the table to itself. But it cannot be done in a single statement for infinite levels. To get three levels, you would do something like this:

 

Select T1.name, T2.name, T3.name
FROM table T1 JOIN table T2 on T1.parent = T2.id
JOIN table T3 on T2.parent = T3.id

 

A good way to do this is to write a stored procedure or function to retrieve the complete tree. Here is a function I wrote recently for a similar table. Using a function, you can retrieve a string containing the entire tree. 

 

DROP function IF EXISTS `get_place_text`;

DELIMITER $$

CREATE FUNCTION `get_place_text`(id$ INTEGER) 
RETURNS varchar(1024) 
    READS SQL DATA
BEGIN
    DECLARE out$ varchar(1024);
    DECLARE pln$ varchar(1024);
    DECLARE pli$ integer unsigned;

    -- Initialize the output and loop variables
    SELECT '', id$ INTO out$, pli$;

    -- As long as we have a PlaceID get it's name and the ParentID
    WHILE (pli$ != 0) DO
        SELECT PlaceName, ParentID INTO pln$, pli$ 
        FROM Places
        WHERE ID = pli$;

        -- Concatenate the name with the output variable
        SELECT CONCAT_WS(out$, pln$) INTO out$;
    END WHILE;

    RETURN out$;
END
$$

DELIMITER ;

-- Use it like this
SELECT EventDate, get_place_text(PlaceID) AS PlaceName 
FROM table WHERE EventID = 1;

Link to comment
Share on other sites

So it seems there's really no good way to go about it.

 

One issue with using a procedure in that way is that I'm not just after some data ie PlaceName, but the entire row... So I would have to concatenate a list of ids so that I can use it in a SELECT. Or maybe turn the procedure's output into CSV.

 

Either way, I'll have to study up a bit more on stored procedures, as it clearly looks like the best compromise. Thanks for your help.

Link to comment
Share on other sites

If you return a comma-separated list of ID's from, say, a function; you could then use PHP to put it into an SQL statement.

 

$ids = '1,12,6';

$sql = "SELECT * FROM Places WHERE ID IN ($ids)";

(don't put quotes around $ids.)

 

of course, I'm not sure how to get it sorted with an ORDER BY. I'll have to think about that.

Link to comment
Share on other sites

The order that they're returned isn't much of an issue in my case. The problem is just getting all of the rows fed to PHP as efficiently as possible.

 

I've looked into it a little and it seems that dumping the rows into a temporary table may be the best way to do it. I really wish I knew more about the comparative speed of different methods, but it sounds like it would be more efficient than going to PHP and back.

Link to comment
Share on other sites

  • 2 weeks later...

Ah, yes. "The Nested Set Model" is something I have looked into, though that's a good article and I don't think I've read it before. Thanks for the link.

 

It does, of course, solve the problem that led to the creation of this thread. However, I am quite concerned about the Nested Set Model's ability to scale horizontally. I find it hard to believe that there isn't a significant cost to having to update half the table every time you insert a child node.

 

I imagine it's great when you're working with a static table, but with my admittedly limited MySQL administrative knowledge, I can only imagine that the cost of the "Adjacency List Model" is the smaller hurdle.

Link to comment
Share on other sites

Indeed. It is something I'm considering, along with a "depth" field. But I would like to wait and see if I can determine the speed advantage that I'd actually be able to get from including them. It just feels like a dirty solution. In the end, the stored procedure is just searching for rows by their primary key, so it shouldn't be *all* doom and gloom.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.