Jump to content

sakila database help


null00

Recommended Posts

Hi, so I have been trying to get this to work but I haven't been able to. I am supposed to do this:

"Using the Sakila database, give me a combined list of films that have an actor with the last name Depp and the actors in the film Ace Goldfinger.  Order the output by type ascending." 

I have this coded and yet it will not return what I need it to. 

SELECT NOW() firstlastname, CONCAT(actor.first_name,' ', actor.last_name) AS film_or_actor , 'Actor' AS type
FROM actor 
INNER JOIN film_actor ON actor.actor_id = film_actor.actor_id
INNER JOIN film ON film_actor.film_id = film.film_id
INNER JOIN film_text ON film_text.title = film.title
INNER JOIN film_list ON film.film_id = film_list.FID
WHERE actor.last_name LIKE '%depp%'
UNION
SELECT NOW() firstlastname, film.title AS film_or_actor, 'Film' AS type
FROM actor
INNER JOIN film_actor ON actor.actor_id = film_actor.actor_id
INNER JOIN film ON film_actor.film_id = film.film_id
INNER JOIN film_text ON film_text.title = film.title
INNER JOIN film_list ON film.film_id = film_list.FID
WHERE film_list.actors IN film.film_id = 2
ORDER BY type ASC;

any help would be appreciated.

Edited by gizmola
Code tag
Link to comment
Share on other sites

try

SELECT f.title as film_or_actor
     , concat(a.first_name, ' ', a.last_name) as `Search for`
     , 'Film' as `type`
FROM film f 
     JOIN film_actor fa USING (film_id)
     JOIN actor a USING (actor_id)
WHERE a.last_name LIKE '%depp%'
UNION
SELECT concat(a.first_name, ' ', a.last_name)
     , f.title
     , 'Actor'
FROM film f 
     JOIN film_actor fa USING (film_id)
     JOIN actor a USING (actor_id)
WHERE f.film_id = 2
ORDER BY `type`, film_or_actor;

 

Link to comment
Share on other sites

@Barand:  The way the question is posed, isn't the clearest, but I think what is wanted is a list of films that a) has an actor named "depp"  or "featured any of the actors in film_id = 2".  

The result set should be films and not films or actors.  I based this off the initial description:  

6 hours ago, null00 said:

"Using the Sakila database, give me a combined list of films..."

 

Link to comment
Share on other sites

My take on this, is that it makes more sense if you can break it down into problems:

1.  What list of actors (actor_id) do you need?

You need actors with a name like "depp" AND actors who were in the 'ACE GOLDFINGER film".  This 1st query gets you those actors.

SELECT actor_id FROM actor WHERE last_name like '%depp%' 
UNION 
SELECT actor_id FROM film_actor WHERE film_id = (SELECT film_id FROM film WHERE title = 'ACE GOLDFINGER');

 

To get the full list of films, you need to get all the films for any of those actors, so you have to join your list of actors to the film_actor table, so one very simple way to think about this, is to use some subqueries fed into the upper query. 

The original query, can be used as a subquery to get you the list of actors.  

 

SELECT DISTINCT f.film_id, f.title
FROM film f
JOIN film_actor fa ON f.film_id = fa.film_id
WHERE fa.actor_id IN 
  (SELECT actor_id
   FROM actor
   WHERE last_name LIKE '%depp%' 
   UNION
   SELECT actor_id
   FROM film_actor
   WHERE film_id = 
      (SELECT film_id FROM film WHERE title = 'ACE GOLDFINGER')
  )
ORDER BY f.title;

 

Here's another syntactical take, joining to the subquery rather than using IN

SELECT DISTINCT f.film_id, f.title
FROM film f
JOIN film_actor fa ON f.film_id = fa.film_id
JOIN  
  (SELECT actor_id
   FROM actor
   WHERE last_name LIKE '%depp%' 
   UNION
   SELECT actor_id
   FROM film_actor
   WHERE film_id = 
      (SELECT film_id FROM film WHERE title = 'ACE GOLDFINGER')
  ) as a
ON fa.actor_id = a.actor_id  
ORDER BY f.title;

 

And here's a version, where the 'ACE GOLDFINGER' query is also a join to a subquery.  This one is probably a bit more correct, in the case that there was more than one 'ACE GOLDFINGER" titled film, that might have different actors.

SELECT DISTINCT f.film_id, f.title
FROM film f
JOIN film_actor fa ON f.film_id = fa.film_id
JOIN  
  (SELECT actor_id
   FROM actor
   WHERE last_name LIKE '%depp%' 
   UNION
   SELECT actor_id
   FROM film_actor
   JOIN 
    (SELECT film_id
     FROM 
     film 
     WHERE title = 'ACE GOLDFINGER'
     ) as ace
     ON film_actor.film_id = ace.film_id   
  ) as a
ON fa.actor_id = a.actor_id  
ORDER BY f.title;

 

  • Like 1
Link to comment
Share on other sites

I prefer to JOIN to the subquery - it should be faster

SELECT DISTINCT f.film_id, f.title
FROM film f
JOIN film_actor fa ON f.film_id = fa.film_id
JOIN
  (SELECT actor_id
   FROM actor
   WHERE last_name LIKE '%depp%' 
   UNION
   SELECT actor_id
   FROM film_actor
   WHERE film_id = 
      (SELECT film_id FROM film WHERE title = 'ACE GOLDFINGER')
  ) actid USING (actor_id)
ORDER BY f.title;

 

  • Like 1
Link to comment
Share on other sites

8 minutes ago, Barand said:

Yep, definitely faster. Timing results...

    IN subquery    0.0666 seconds
    JOIN subquery  0.0012 seconds

 

Yes, much faster, but why?  The explains are nice (since we're conducting a mini- mysql class here  😃.

I do like your application of USING ... really nice reduction of syntax, although the plans are exactly the same if you utilize the full ON syntax. 

 

Nested Subquery:

MariaDB [sakila]> explain SELECT DISTINCT f.film_id, f.title
    -> FROM film f
    -> JOIN film_actor fa ON f.film_id = fa.film_id
    -> WHERE fa.actor_id IN 
    ->   (SELECT actor_id
    ->    FROM actor
    ->    WHERE last_name LIKE '%depp%' 
    ->    UNION
    ->    SELECT actor_id
    ->    FROM film_actor
    ->    WHERE film_id = 
    ->       (SELECT film_id FROM film WHERE title = 'ACE GOLDFINGER')
    ->   )
    -> ORDER BY f.title;
+------+--------------------+------------+--------+------------------------+----------------+---------+------------------+------+----------------------------------------------+
| id   | select_type        | table      | type   | possible_keys          | key            | key_len | ref              | rows | Extra                                        |
+------+--------------------+------------+--------+------------------------+----------------+---------+------------------+------+----------------------------------------------+
|    1 | PRIMARY            | f          | index  | PRIMARY                | idx_title      | 514     | NULL             |  934 | Using index; Using temporary; Using filesort |
|    1 | PRIMARY            | fa         | ref    | idx_fk_film_id         | idx_fk_film_id | 2       | sakila.f.film_id |    1 | Using where; Using index; Distinct           |
|    2 | DEPENDENT SUBQUERY | actor      | eq_ref | PRIMARY                | PRIMARY        | 2       | func             |    1 | Using where                                  |
|    3 | DEPENDENT UNION    | film_actor | eq_ref | PRIMARY,idx_fk_film_id | PRIMARY        | 4       | func,const       |    1 | Using where; Using index                     |
|    4 | SUBQUERY           | film       | ref    | idx_title              | idx_title      | 514     | const            |    1 | Using where; Using index                     |
| NULL | UNION RESULT       | <union2,3> | ALL    | NULL                   | NULL           | NULL    | NULL             | NULL |                                              |
+------+--------------------+------------+--------+------------------------+----------------+---------+------------------+------+----------------------------------------------+
6 rows in set (0.00 sec)

 

Join to Subquery: 

MariaDB [sakila]> explain SELECT DISTINCT f.film_id, f.title
    -> FROM film f
    -> JOIN film_actor fa ON f.film_id = fa.film_id
    -> JOIN
    ->   (SELECT actor_id
    ->    FROM actor
    ->    WHERE last_name LIKE '%depp%' 
    ->    UNION
    ->    SELECT actor_id
    ->    FROM film_actor
    ->    WHERE film_id = 
    ->       (SELECT film_id FROM film WHERE title = 'ACE GOLDFINGER')
    ->   ) actid USING (actor_id)
    -> ORDER BY f.title;
+------+--------------+------------+--------+------------------------+---------------------+---------+-------------------+------+---------------------------------+
| id   | select_type  | table      | type   | possible_keys          | key                 | key_len | ref               | rows | Extra                           |
+------+--------------+------------+--------+------------------------+---------------------+---------+-------------------+------+---------------------------------+
|    1 | PRIMARY      | <derived2> | ALL    | NULL                   | NULL                | NULL    | NULL              |  204 | Using temporary; Using filesort |
|    1 | PRIMARY      | fa         | ref    | PRIMARY,idx_fk_film_id | PRIMARY             | 2       | actid.actor_id    |    1 | Using index                     |
|    1 | PRIMARY      | f          | eq_ref | PRIMARY                | PRIMARY             | 2       | sakila.fa.film_id |    1 |                                 |
|    2 | DERIVED      | actor      | index  | NULL                   | idx_actor_last_name | 182     | NULL              |  200 | Using where; Using index        |
|    3 | UNION        | film_actor | ref    | idx_fk_film_id         | idx_fk_film_id      | 2       | const             |    4 | Using where; Using index        |
|    4 | SUBQUERY     | film       | ref    | idx_title              | idx_title           | 514     | const             |    1 | Using where; Using index        |
| NULL | UNION RESULT | <union2,3> | ALL    | NULL                   | NULL                | NULL    | NULL              | NULL |                                 |
+------+--------------+------------+--------+------------------------+---------------------+---------+-------------------+------+---------------------------------+

 

Link to comment
Share on other sites

Since we are talking about these plans and optimizations, I think it's good to point this out to @null00

You should always avoid using "LIKE '%something%'".  This type of query can not use an index.  The explains show this clearly.

MariaDB [sakila]> select count(*) from actor;
+----------+
| count(*) |
+----------+
|      200 |
+----------+

 So you have 200 rows in your actor table, and even though it attempts to utilize the idx_actor_last_name index on actor, the explain shows that even with a join to the subquery, the portion that looks for '%depp%' "table scans" all 200 rows.  

+------+--------------+------------+--------+------------------------+---------------------+---------+-------------------+------+---------------------------------+
| id   | select_type  | table      | type   | possible_keys          | key                 | key_len | ref               | rows | Extra                           |
+------+--------------+------------+--------+------------------------+---------------------+---------+-------------------+------+---------------------------------+
|    2 | DERIVED      | actor      | index  | NULL                   | idx_actor_last_name | 182     | NULL              |  200 | Using where; Using index        |
+------+--------------+------------+--------+------------------------+---------------------+---------+-------------------+------+---------------------------------+

I removed all the query plan steps, except for the part that searches for '%depp%' and you can see in the rows column that it examines all 200 rows.

So long as a string can be searched from left to right, wildcards can be used and a btree index is still useful.  So if this query is modified to be 'depp%' then it will no longer table scan the actor table, but instead will find the only actor row that matches.

MariaDB [sakila]> explain SELECT DISTINCT f.film_id, f.title
    -> FROM film f
    -> JOIN film_actor fa ON f.film_id = fa.film_id
    -> JOIN
    ->   (SELECT actor_id
    ->    FROM actor
    ->    WHERE last_name LIKE 'depp%' 
    ->    UNION
    ->    SELECT actor_id
    ->    FROM film_actor
    ->    WHERE film_id = 
    ->       (SELECT film_id FROM film WHERE title = 'ACE GOLDFINGER')
    ->   ) actid USING (actor_id)
    -> ORDER BY f.title;
+------+--------------+------------+--------+------------------------+---------------------+---------+-------------------+------+---------------------------------+
| id   | select_type  | table      | type   | possible_keys          | key                 | key_len | ref               | rows | Extra                           |
+------+--------------+------------+--------+------------------------+---------------------+---------+-------------------+------+---------------------------------+
|    1 | PRIMARY      | <derived2> | ALL    | NULL                   | NULL                | NULL    | NULL              |    6 | Using temporary; Using filesort |
|    1 | PRIMARY      | fa         | ref    | PRIMARY,idx_fk_film_id | PRIMARY             | 2       | actid.actor_id    |    1 | Using index                     |
|    1 | PRIMARY      | f          | eq_ref | PRIMARY                | PRIMARY             | 2       | sakila.fa.film_id |    1 |                                 |
|    2 | DERIVED      | actor      | range  | idx_actor_last_name    | idx_actor_last_name | 182     | NULL              |    2 | Using where; Using index        |
|    3 | UNION        | film_actor | ref    | idx_fk_film_id         | idx_fk_film_id      | 2       | const             |    4 | Using where; Using index        |
|    4 | SUBQUERY     | film       | ref    | idx_title              | idx_title           | 514     | const             |    1 | Using where; Using index        |
| NULL | UNION RESULT | <union2,3> | ALL    | NULL                   | NULL                | NULL    | NULL              | NULL |                                 |
+------+--------------+------------+--------+------------------------+---------------------+---------+-------------------+------+---------------------------------+
7 rows in set (0.00 sec)

 

Really when you look at it, for this query, a wildcard search of name doesn't make much sense, even with 'depp%', but I guess this is for homework/test prep.  In general, you never want to use LIKE '%something%'.  If you need full text search then use a fulltext engine or mysql's FULLTEXT indexing and searching syntax.  

 

Link to comment
Share on other sites

Weird! I see your results have 7 rows. When I run it I get 124 rows

mysql> SELECT DISTINCT f.film_id, f.title
    -> FROM film f
    -> JOIN film_actor fa ON f.film_id = fa.film_id
    -> JOIN
    ->   (SELECT actor_id
    ->    FROM actor
    ->    WHERE last_name LIKE 'depp%'
    ->    UNION
    ->    SELECT actor_id
    ->    FROM film_actor
    ->    WHERE film_id =
    ->       (SELECT film_id FROM film WHERE title = 'ACE GOLDFINGER')
    ->   ) actid USING (actor_id)
    -> ORDER BY f.title;
+---------+-------------------------+
| film_id | title                   |
+---------+-------------------------+
|       2 | ACE GOLDFINGER          |
|       3 | ADAPTATION HOLES        |
|      11 | ALAMO VIDEOTAPE         |
|      14 | ALICE FANTASIA          |
|      17 | ALONE TRIP              |
|      43 | ATLANTIS CAUSE          |
|      72 | BILL OTHERS             |
|      85 | BONNIE HOLOCAUST        |
|      92 | BOWFINGER GABLES        |
|     100 | BROOKLYN DESERT         |
               .
               .
               .
|     883 | TEQUILA PAST            |
|     895 | TOMORROW HUSTLER        |
|     901 | TRACY CIDER             |
|     909 | TREASURE COMMAND        |
|     919 | TYCOON GATHERING        |
|     925 | UNITED PILOT            |
|     944 | VIRGIN DAISY            |
|     950 | VOLUME HOUSE            |
|     954 | WAKE JAWS               |
|     957 | WAR NOTTING             |
|     960 | WARS PLUTO              |
|     967 | WEEKEND PERSONAL        |
|     979 | WITCHES PANIC           |
|     991 | WORST BANGER            |
|     992 | WRATH MILE              |
+---------+-------------------------+
124 rows in set (0.02 sec)

There are 43 films starring a "Depp" before starting on the other actors from Ace Goldfinger

mysql> SELECT COUNT(DISTINCT film_id) FROM film_actor WHERE actor_id IN (100, 160);
+-------------------------+
| COUNT(DISTINCT film_id) |
+-------------------------+
|                      43 |
+-------------------------+
1 row in set (0.01 sec)

I've noticed in past projects that MariaDB can be a bit flaky with table subqueries. Don't know if that's the case here

Link to comment
Share on other sites

@gizmola What results do you get if you put the subquery results into a temp table and use that

CREATE TEMPORARY TABLE temp_actor (actor_id smallint not null primary key);
-- 
-- 
INSERT INTO temp_actor
   SELECT actor_id
   FROM actor
   WHERE last_name LIKE 'depp%' 
   UNION
   SELECT actor_id
   FROM film_actor
   WHERE film_id = 
      (SELECT film_id FROM film WHERE title = 'ACE GOLDFINGER');
-- 
-- 
SELECT DISTINCT f.film_id
     , f.title
FROM film f
     JOIN film_actor fa USING (film_id)
     JOIN temp_actor USING (actor_id)
ORDER BY f.title;

I get the same as before (124 recs)

Link to comment
Share on other sites

5 hours ago, Barand said:

OK  - I'm senile. 7 rows in the explain results!!!

No worries my friend.  I hope others enjoy this thread as much as we did.  I don't interact with too many people that have as much interest in SQL/Rdbms, relational design and theory, and db internals as we do.  Always a pleasure, and frequently an educational experience to read your posts!

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.