stubarny Posted September 15, 2012 Share Posted September 15, 2012 Hi All, I need to create a job search tool for the below table of job adverts: JobAdvertNumber | JobAdvertTitle 1 | Accountant 2 | Management Accountant 3 | Financial Accountant 4 | Management Accountant 5 | Accountant 6 | Financial Accountant I also have a big (millions of records) table of job title correlations, for example: JobTitle1 | JobTitle2 | Correlation Accountant | Financial Accountant | 0.76 Accountant | Management Accountant | 0.84 Please could you tell me how to create a job search tool that returns the adverts in order of their job title's correlation with the searched for term. For example if I did a job search for "Accountant" it should return all the "Accountant" jobs first, then all the "Management Accountant" jobs and finally all the "Financial Accountant" jobs. Many thanks, Stu Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/ Share on other sites More sharing options...
Christian F. Posted September 15, 2012 Share Posted September 15, 2012 Change "JobTitle1" and "JobTitle2" to ID fields, referencing "JobAdvertNumber", and then JOIN them like any other Many-to-Many relationship. Ordered by the relevance score. Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1378216 Share on other sites More sharing options...
stubarny Posted September 15, 2012 Author Share Posted September 15, 2012 Thank you Christian, Please my ask for you to explain the below part further? I assume this has nothing to do with aliases?: Change "JobTitle1" and "JobTitle2" to ID fields, referencing "JobAdvertNumber", Thank you, Stu Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1378222 Share on other sites More sharing options...
Christian F. Posted September 15, 2012 Share Posted September 15, 2012 That's a basic foreign key relation, and if you're not sure on how to set that up in MySQL then I recommend searching for it. It's the cornerstone of relational database management systems, after all. It's also a part of normalizing the data, as you don't want to duplicate data in the database, as you've done here. That's why you utilize an ID field as the relation: Less chance of errors creeping into your data. And no, it doesn't have anything to do with aliases. At least not per se, but aliases are often used in JOINs to shorten down the finished query and/or make it easier to read. Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1378230 Share on other sites More sharing options...
stubarny Posted September 15, 2012 Author Share Posted September 15, 2012 sorry, I didn't realise you meant foreign keys. But using your methodology I could enter 1 or 5 for "Accountant" which isn't normalised? Let me try again, this time adding an extra table for the "keyphrases" and then referring to it with foreign keys: Keyphrases: KeyphraseID | KeyphraseContent 1 | Accountant 2 | Management Accountant 3 | Financial Accountant JobAdverts: JobAdvertID | JobAdvert_KeyphraseID 1 | 1 2 | 2 3 | 3 4 | 2 5 | 1 6 | 3 I also have a big (millions of records) table of job title correlations, for example: Correlations CorrelationID | CorrelationKeyphrase1_KeyphraseID | CorrelationKeyphrase2_KeyphraseID | CorrelationContent 1| 1 | 3 | 0.76 2| 1 | 2 | 0.84 Is that more useful? I think I need an SQL query to: 1. Find KeyphraseID for the searched for keyphrase 2. Select all records in the JobAdverts table 3. Attach CorrelationContent to each record in the JobAdverts table --> (done by finding CorrelationContent where KeyphraseID of the search term = CorrelationKeyphrase1_KeyphraseID and JobAdvert_KeyphraseID = CorrelationKeyphrase2_KeyphraseID) 4. Sort by CorrelationContent DESC Please could you show me how to create an SQL query to do that? Thanks, Stu Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1378235 Share on other sites More sharing options...
Barand Posted September 15, 2012 Share Posted September 15, 2012 I know it is implied but I think it would simplify your queries if you had Correlations CorrelationID | CorrelationKeyphrase1_KeyphraseID | CorrelationKeyphrase2_KeyphraseID | CorrelationContent 1 | 1 | 3 | 0.76 2 | 1 | 2 | 0.84 3 | 1 | 1 | 1.00 <- add this Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1378239 Share on other sites More sharing options...
stubarny Posted September 15, 2012 Author Share Posted September 15, 2012 Thank you Barand, yes that would be fine. Any idea how best to construct the SQL query? Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1378241 Share on other sites More sharing options...
Barand Posted September 15, 2012 Share Posted September 15, 2012 Use your "keyphrases" table to generate a dropdown search menu so that the KeyphraseID is passed to the search query (in this case 1 for Accountant). This is more reliable than someone entering "Acountant" (mis-spelling deliberate to make the point). All you need to do then is join correlations to adverts on keyphrase2 = jobAdvert_keyphraseID and WHERE keyphrase1 = 1 (in this case) and order by CorrelationContent DESC. I won't write the whole query as I hate using essays for column names. I'd also extend the correlations table like below so that id Accountant is selected it finds the other accountant types, but if Management Accountant is selected it just finds those Correlations CorrelationID | CorrelationKeyphrase1_KeyphraseID | CorrelationKeyphrase2_KeyphraseID | CorrelationContent 1| 1 | 3 | 0.76 2| 1 | 2 | 0.84 3| 1 | 1 | 1.00 4| 2 | 2 | 1.00 <- add also 5| 3 | 3 | 1.00 <- add also Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1378245 Share on other sites More sharing options...
Christian F. Posted September 16, 2012 Share Posted September 16, 2012 In addition to what Barand wrote, I'd also drop the "CorrelationID" field. The Unique ID, and thus primary key, for each correlation is combination of the two keyphrase IDs. After all, you only want one instance of each possible combination, right? There's also a by-sentence in Barands last post which I think you should strongly consider as well, namely the fact that you replicate information in your fieldnames as well. There's no need to name the fields after the table which you find them in, as you always reference the table in the queries anyway. All that it does is waste space and time, plus make it harder to read the queries. For example, this is how I'd name the fields in the "Correlations" table: keyphrase_1_id, keyphrase_2_id, relevance Thus, when referencing them, you'd have: correlation.keyphrase_2_id, correlation.relevance I think you'll agree that it's much better than: Correlations.CorrelationKeyphrase2_KeyphraseID, Correlations.CorrelationContent Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1378297 Share on other sites More sharing options...
stubarny Posted September 22, 2012 Author Share Posted September 22, 2012 (edited) Hi, Thank you very much for your help - I've tidied up the field names (I've left the initial of the table name just so that the php variables are ready-named when I retrieve them out of sql), and I've removed the ID field on the correlations field as suggested. Keyphrases: kID | kContent 1 | Accountant 2 | Management Accountant 3 | Financial Accountant JobAdverts: jaID | ja_kID 1 | 1 2 | 2 3 | 3 4 | 2 5 | 1 6 | 3 Correlations: cKeyphrase1_kID | cKeyphrase2_kID | cContent 1 | 3 | 0.76 1 | 2 | 0.84 1 | 1 | 4 (my correlations are 'relative' instead of absolute therefore a good correlation often has a value greater than 1 - sorry, not my source data!) Please could you give me a hand creating SQL code for the below tasks?: I think I need an SQL query to: 1. Find kID for the searched for keyphrase 2. Select all records in the JobAdverts table 3. Attach cContent to each record in the JobAdverts table --> (done by finding cContent where kID of the search term = cKeyphrase1_KeyphraseID and JobAdvert_KeyphraseID = cKeyphrase2_KeyphraseID) 4. Sort by cContent DESC Is the below code is close to the correct answer? Is there a way of finding the relevant value for kID and using the value all in the same query, instead of splitting this into 2 queries like below? SELECT kID from Keyphrases WHERE kContent = '$searched_for_keyphrase' # extract result to variable $kID SELECT * from JobAdverts INNER JOIN Correlations ON (Correlations.cKeyphrase1_kID=$kID AND Correlations.cKeyphrase2_kID=JobAdverts.ja_kID) ORDER BY JobAdverts.cContent DESC Many thanks, Stu Edited September 22, 2012 by stubarny Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1380088 Share on other sites More sharing options...
Barand Posted September 23, 2012 Share Posted September 23, 2012 (edited) try SELECT k.kContent, j.jaID, c.cContent FROM Correlations c INNER JOIN Keyphrases k ON c.cKeyphrse1_kID = k.kID INNER JOIN JobAdverts j ON c.cKeyphrase2_kID = j.ja_kID WHERE k.kContent = '$search_kkeyword' ORDER BY c.cContent DESC Edited September 23, 2012 by Barand Quote Link to comment https://forums.phpfreaks.com/topic/268410-returning-results-in-order-of-relevence/#findComment-1380142 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.