Jump to content

Proper way to pull data from four different tables


galvin

Recommended Posts

Say I have four simple tables likes this, based on the National Football League...

 

table (fields)

players (playerid, firstname, lastname, teamid, positionid)

positions (positionid, position)

teams (teamid, teamname)

schedule (gameid, homeid, awayid, weeknum, seasonnum)  NOTE:  Every team has 16 games listed in the schedule table (8 home and 8 away, plus they will also have a BYE week, for 17 total)

 

I want to query my database and get the following info from those tables and display it like this...

 

NOTE: It would repeat the following for EVERY player that I query on (more on the specific query below)

 

firstname lastname - teamname (teamid):

Week 1: v -if home- @ -if away- (opponent's teamname)

Week 2: v -if home- @ -if away- (opponent's teamname)

Week 3: v -if home- @ -if away- (opponent's teamname)

Week 4: v -if home- @ -if away- (opponent's teamname)

Week 5: v -if home- @ -if away- (opponent's teamname)

Week 6: v -if home- @ -if away- (opponent's teamname)

Week 7: v -if home- @ -if away- (opponent's teamname)

Week 8: v -if home- @ -if away- (opponent's teamname)

Week 9: v -if home- @ -if away- (opponent's teamname)

Week 10: v -if home- @ -if away- (opponent's teamname)

Week 11: v -if home- @ -if away- (opponent's teamname)

Week 12: v -if home- @ -if away- (opponent's teamname)

Week 13: v -if home- @ -if away- (opponent's teamname)

Week 14: v -if home- @ -if away- (opponent's teamname)

Week 15: v -if home- @ -if away- (opponent's teamname)

Week 16: v -if home- @ -if away- (opponent's teamname)

Week 17: v -if home- @ -if away- (opponent's teamname)

 

For example...

 

Julio Jones - Falcons (id#):

Week 1: @ Giants

Week 2: @ Seahawks

Week 3: v Redskins

Week 4: @Patriots

Week 5: v Bears

Week 6: v Buccs

Week 7: @49ers

Week 8: @Saints

Week 9:  v Buccs

Week 10: BYE

Week 11: @ Eagles

Week 12: v Cowboys

Week 13: v Lions

Week 14: @ Packers

Week 15: v Rams

Week 16: v 49ers

Week 17:  @ Chiefs

 

This was my apparently really poor attempt at the query because it brings back results that repeat  the same name A LOT...

 

(side note:  Is it bad practice to try to get this much info in ONE query (i.e. should I do multiple queries)?  Clearly I stink at the whole "join" thing...)

 

$sql = "SELECT * from players, positions, teams, schedule
		WHERE players.positionid = positions.positionid
		AND teams.teamid = players.teamid
		AND playerid IN ($specificplayerids) //this will be a comma separated string of specific IDs (for example... "47, 57, 103, 8, 17, 19)
		AND players.teamid = schedule.homeid //this one of many areas where I'm confused. I THINK i need to say, in plain english, =schedule.homeid OR schedule.awayid but not sure 
		ORDER BY lastname";
$players= mysql_query($sql, $connection);
		if (!$players) {
		die("Database query failed: " . mysql_error());
		} else {
			while ($row = mysql_fetch_assoc($players)) {
				$output .=	"<div>" . $row['firstname'] . " " . $row['lastname'] . ", " . $row['teamname'] . " </div>";
			}  //I tried to just get it to list names first (i.e. I didn't even try to get the schedule part), but I can't even get just the names to display right.  :-\
		}

Link to comment
Share on other sites

If you want to put all that information into one table, as you're doing here, then fetching all of the data in one query is the correct way of going about it.

 

However, your query itself is not written in the best manner. It would have been a lot better to use explicit joins, instead of the implicit ones that you've done here. Not only will it give you a lot more control on actually how the data is fetched and added, but it'll also make the query a whole lot easier to read and thus maintain.

 

So, I recommend reading up on how the different types of JOINs work, figure out which ones you want to use, and rewrite your query to spec.

 

PS: As a pet-peeve of mine... else after an if () { die(); } block is completely unnecessary. If the test returns true, the script will never proceed past the "die" anyway.

I also recommend using proper indentation, as it'll help on readability as well. I've seen lots of issues stemming directly from the confusion caused by someone not indenting their code properly.

Link to comment
Share on other sites

Here are some things to consider, assuming you are doing this for real and it is not just a classroom assignment.

 

Players are not on one team for their entire career and some players switch the position and/or jersey number that they play. You would actually need a player table with the 'static' information about a player - id, first name, last name, dob, collage, date drafted... This would assign each player an id.

 

You would then need a player_details (or perhaps call it player_history) table that holds the 'variable' details of where a player played at, when, what position, and jersey number - id, playerid, teamid, positionid, jersey_no, start_date, end_date. There can be multiple records for each playerid that tracks what teams he has been on. The end_date would be either a null/empty/zero/or a far-future date in the record for the current team the player is a member of.

 

By storing the date(s) in the above table, you can then determine the team roster for any game (if the date the game was played on is between the player's start and end date and the teamid matches, that player was on that team for that game.) You would also need to store the actual complete date for each game in the schedule table.

Link to comment
Share on other sites

Here's an example of using JOINs in the query that will get the schedule with the teamname for both the visitor and home team (this assumes that you have a date column in your schedule as suggested above) -

 

<?php
// query to display the schedule from the database information

// make database connection/select database here...

//teams (teamid, teamname)
//schedule (gameid, homeid, awayid, weeknum, seasonnum, date)

$query = "SELECT weeknum, date, v.teamname as visitor, h.teamname as home FROM schedule s INNER JOIN teams v ON s.awayid = v.teamid INNER JOIN teams h ON s.homeid = h.teamid ORDER BY gameid";
$result = mysql_query($query) or die(mysql_error());

$last_week = null;
$last_date = null;
while($row = mysql_fetch_assoc($result)){
// output any week # heading
if($last_week != $row['weeknum']){
	// week changed (or is the first one)
	echo "Week {$row['weeknum']}<br />";
	$last_week = $row['weeknum'];
}
// output any date sub-heading
if($last_date != $row['date']){
	// date changed (or is the first one)
	$date = date('D, M j',strtotime($row['date'])); // format the date as: Wed, Sep 5 
	echo "$date<br />";
	$last_date = $row['date'];
}
// output data under each week/date heading
echo "{$row['visitor']} at {$row['home']}<br />";
}

 

The above should display your schedule, something like the following -

 

Week 1

Wed, Sep 5

Cowboys at Giants

Sun, Sep 9

Colts at Bears

Eagles at Browns

...

Link to comment
Share on other sites

To elaborate on the other good answers you have received:

 

It is the nature of a relational database that you will get the same data in some columns many times when querying multiple tables. You can use these columns to know when the "outer" data changes. So, using your schema, if you wanted to print the schedule for ALL teams, you could issue a single query, and start a new HTML table whenever the TeamID changes.

 

Having learned and worked with SQL before the JOIN phrase became a standard, I do not usually push it as the "correct" way to build a query. When I first learned the JOIN syntax, it was confusing; especially when doing OUTER JOINS. However, it does make the query easier to read and makes it harder to leave out a relationship. For instance, if you left out the AND teams.teamid = players.teamid clause in your query, you would get a Cartesian product, which means that every matching row from the rest of the query would match EVERY ROW in the teams table, and you would get back a whole lot more data than you really want. Using the JOIN syntax, it would be very obvious in looking at the query that the relationship is missing -- in fact, the table would be missing from the query, entirely:

 

SELECT * 
FROM players JOIN positions ON players.positionid = positions.positionid
JOIN teams ON teams.teamid = players.teamid
JOIN schedule ON (schedule.homeid = players.teamid OR schedule.awayid = players.teamid)
WHERE playerid IN ($specificplayerids)
ORDER BY lastname";

 

Do not use SELECT * especially in multi-table queries. SELECT * sends every column from every table to the client (your PHP script) for each selected row. Then PHP has to buffer all of that data internally, and pass it to your while loop in turn. This is a lot of wasted resources (transfer band-width, memory, cpu). Not to mention the fact that columns with the same name (i.e. "positionid") will be "aliased" (i.e. "players.positionid" and "positions.positionid"). If you are using mysql_fetch_assoc you will probably try to reference it as "positionid" and you will find that that index does not exist in the returned array.

 

You may want to add "weeknum" to your ORDER BY. Another "feature" of a relational database is that there is no "natural order" of data. Just because the schedule table was built in the proper date sequence does not guarantee that the data will be returned in that sequence. When joining tables and/or ordering results, the database may actually retrieve the data using some index that does not give it "sequential data".

 

Also, I would add the player's ID to the order by: ORDER BY lastname, playerid, weeknum. Why? because if you have multiple players with the same last name, their rows could be intermixed since there is no "natural order" to data (see above). Then use the "playerid" (not the "lastname") to determine when you start dealing with a different player.

Link to comment
Share on other sites

Do not use SELECT * especially in multi-table queries.

 

Are you at all flexible on that?  For example, say you are querying a table that has 20 columns and you need to use 17 of them?  Would you use "SELECT *" in that case, or would you stick to your guns and actually list out the specific 17 fields you need (i.e. SELECT column1, column2, ... column17)?

 

Just curious, as I'm trying to adopt good/proper habits :)

 

 

Link to comment
Share on other sites

Not to mention the fact that columns with the same name (i.e. "positionid") will be "aliased" (i.e. "players.positionid" and "positions.positionid")

I'm afraid that this isn't quite correct.

 

Associative array indices will be overwritten, so that only the data from the last field with the same name will be kept. The numerical indices, however, will still count normally. As evidenced by the following query:

php > $query = 'SELECT user.user, db.user FROM user LEFT JOIN db ON user.user = db.user';
php > $res = mysql_query ($query);
php > print_r (mysql_fetch_array ($res));
Array
(
    [0] => amarokuser
    [user] => amarokuser
    [1] => amarokuser
)

 

Galvin: Only when I'm dealing with a large number of columns, and know that the data I don't need is negligible. So it's a judgement call on a case-by-case basis, really. Though, only if it hurts readability and maintainability of the script, to type it all out, will I consider this.

Link to comment
Share on other sites

Do not use SELECT * especially in multi-table queries.

 

Are you at all flexible on that?  For example, say you are querying a table that has 20 columns and you need to use 17 of them?  Would you use "SELECT *" in that case, or would you stick to your guns and actually list out the specific 17 fields you need (i.e. SELECT column1, column2, ... column17)?

 

Just curious, as I'm trying to adopt good/proper habits :)

 

 

 

It's trivial to write a DB wrapper to exclude 3 columns.

Link to comment
Share on other sites

Just adding my 0.02 worth again:

 

If those extra 3 columns are large TEXT fields then definitely leave them out.

 

Even if all columns are the same size then you are still adding an unnecessary 15% overhead to the data retrieved from the server which could be significant on a large result set.

 

For the purpose of subsequent processing and display it may be advantageous not to retrieve the columns in the same order as they are stored in the database

 

edit: And lastly, by using * it removes the documentation element which lets one know what the the query is retrieving.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.