
User classification problem


twsowerby


This may not be on the right board, let me know if it isn't.

 

Hey All,

 

I've got a bit of a conundrum on my hands so was wondering if you could help.

 

I'm building a recommendation system for a holiday villa booking website. Rather than just being a rating-based recommender, it will base recommendations on the link between similar users and their villa interests.

 

So a recommendation could be made by looking at user 1's personal details and the villas they have shown an interest in on the website, and then comparing their profile to user 2 and 3.

 

To do this I need to find a way of classifying the users based on their attributes, but I'm struggling to think of a good way to classify them. I would like the classification to take into account more than one attribute, so I don't want them to just be classified by age, for example. What I need to end up with is different user classifications that each user can be put into.

 

I'm open to any suggestions or ideas, and feel free to ask me for more information.

 

Regards,

 

Tom


It's a great idea, although this is a HUGE task. It's more a job of psychology, sociology, statistics and intelligent computing than PHP.

To give you an idea of how huge: the music website Pandora recommends a selection of songs you will most likely enjoy based on a few songs you enter into a playlist. They have received over 10 million dollars to accomplish this, and have hundreds of people indexing songs to classify them.

 

Before you even code, you need to answer this question:

 

What makes a person 'similar' to another?

 

Every person is unique (although you could argue against that), but you can pretty safely make assumptions and generalisations about people.

 

Your first task is to figure out which generalisations you want to make about people, so that you can collect information from them that will help you classify them.

 

The more information you collect, the more difficult it will be, but the more accurate and helpful your service. For instance, suppose you collect 5 different categories of information, each with 5 subcategories, e.g. AGE (18-24, 25-34, 35-44, 45-54, 55-64), INDUSTRY (Marketing, Professional, Tradesperson, Retail, Finance), etc. That yields up to 5^5 = 3125 different sets of people.

 

More likely you will have to collect 10 categories of information with 10 subcategories each, which will yield 10^10 = 10 billion combinations, i.e. more combinations than there are people on Earth.

 

So as you can see it will be very difficult to code for every outcome.
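The growth described above is just the number of options per category raised to the number of categories. A minimal sketch, assuming every category has the same number of options; the function name and the two example lists are illustrative only:

```python
from itertools import product

# Example bands taken from the post; any real site would choose its own.
AGE_BANDS = ["18-24", "25-34", "35-44", "45-54", "55-64"]
INDUSTRIES = ["Marketing", "Professional", "Tradesperson", "Retail", "Finance"]


def count_combinations(categories: int, options_per_category: int) -> int:
    """Number of distinct user profiles when every category has the same option count."""
    return options_per_category ** categories


# Enumerating just two categories already gives 5 * 5 = 25 distinct profiles.
two_category_profiles = list(product(AGE_BANDS, INDUSTRIES))

if __name__ == "__main__":
    print(len(two_category_profiles))   # 25
    print(count_combinations(5, 5))     # 3125
    print(count_combinations(10, 10))
```

Enumerating profiles like this is only practical for a handful of categories, which is the argument for a small, hand-picked set of classifications.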

 

The alternative, however, is to create a small set of classifications. You could have a questionnaire form and assign points to each answer. The number of points a person receives dictates how they are classified.
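A points-based questionnaire could be sketched roughly as follows. Everything here is an assumption made for illustration: the questions, the point values, the score bands and the class labels are invented, not taken from any real system.

```python
# Hypothetical questionnaire: each answer carries a number of points.
QUESTION_POINTS = {
    "budget": {"low": 0, "medium": 2, "high": 4},
    "group": {"solo": 0, "couple": 1, "family": 3},
    "setting": {"beach": 0, "town": 1, "countryside": 3},
}

# Inclusive upper score bound for each hypothetical classification, ascending.
CLASS_BANDS = [
    (3, "budget explorer"),
    (6, "comfort seeker"),
    (10, "premium retreat"),
]


def score(answers: dict) -> int:
    """Sum the points for one user's answers (question -> chosen answer)."""
    return sum(QUESTION_POINTS[question][answer] for question, answer in answers.items())


def classify(answers: dict) -> str:
    """Map a total score onto the first band whose upper bound it does not exceed."""
    total = score(answers)
    for upper_bound, label in CLASS_BANDS:
        if total <= upper_bound:
            return label
    raise ValueError(f"score {total} is outside the defined bands")


if __name__ == "__main__":
    print(classify({"budget": "medium", "group": "couple", "setting": "town"}))
```

A single total score is the simplest option. It collapses several attributes into one number, so two very different users can land in the same band.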

 

It's kind of like the Myers-Briggs personality test, which tests for 16 different personality types, although those classifications might not be appropriate or accurate in deciding what kind of holiday villa a person likes.

 

http://en.wikipedia.org/wiki/Myers-Briggs_Type_Indicator

 

Or you can create your own empirical classification system just by looking at who likes what and making guesses as to why. A college kid is going to be impressed by a good-looking holiday villa, while a rich lawyer might see it as shabby by their standards. An elderly couple might enjoy a villa in the countryside, while a young couple would like something near the beach. A single guy might want to holiday somewhere near a place he can meet women, a single woman might want a holiday villa somewhere isolated, etc.

 

To make accurate classifications, do some research on statistics and psychology, and just observe the way people behave.

 

To summarise, create a quiz which places people into a classification.

 

Assign this classification a unique id and place it into your database. Then search for recommendations where classification = classification, i.e. among users who share the same classification id.
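The lookup step could look something like the sketch below. It uses Python's built-in sqlite3 module with an in-memory database; the table names, column names and sample rows are all hypothetical, and a real site would use its own schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with made-up users and villa interests."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, classification_id INTEGER);
        CREATE TABLE interests (user_id INTEGER, villa TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [(1, "user1", 1), (2, "user2", 1), (3, "user3", 2)],
    )
    conn.executemany(
        "INSERT INTO interests VALUES (?, ?)",
        [(1, "Villa A"), (2, "Villa A"), (2, "Villa B"), (3, "Villa C")],
    )
    return conn


def recommend(conn: sqlite3.Connection, user_id: int) -> list:
    """Villas liked by other users in the same classification, most popular first."""
    rows = conn.execute(
        """
        SELECT i.villa, COUNT(*) AS n
        FROM users me
        JOIN users peer
          ON peer.classification_id = me.classification_id AND peer.id != me.id
        JOIN interests i ON i.user_id = peer.id
        WHERE me.id = ?
          AND i.villa NOT IN (SELECT villa FROM interests WHERE user_id = me.id)
        GROUP BY i.villa
        ORDER BY n DESC, i.villa
        """,
        (user_id,),
    ).fetchall()
    return [villa for villa, _ in rows]


if __name__ == "__main__":
    print(recommend(build_demo_db(), 1))
```

The query skips villas the user has already shown interest in, which is a design choice of this sketch rather than something the thread specifies.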

 

 

 

 

 


Thank you for your interesting and well thought out reply.

 

I actually used to be an avid user of Pandora before they got shut down over here due to licensing issues and I liked the way their system worked a lot. I had no idea they needed that amount of resources to pull it all together though.

 

Unfortunately I don't have the manpower or the hardware to create a system capable of handling that many combinations, but it is something to work towards, possibly in the future. Just out of interest, do you have any idea what the hardware requirements of such a system would be? If such a system was created and made applicable to more than, say, a villa rental scenario, do you think it would have good commercial prospects? I.e. do you think it could be sold effectively, or would it just be one of many?

 

Going back to my particular scenario, I was also having a look at the "weighted slope one algorithm", which would base recommendations on users' ratings. Not exactly what I was looking for, but it may present some possibilities, especially as it doesn't require the user receiving the recommendation to have actually rated anything to start with.

 

http://en.wikipedia.org/wiki/Slope_One
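For reference, a minimal sketch of weighted Slope One as it is usually described: average the rating differences between each pair of items, then predict a missing rating from the target user's existing ratings, weighting each difference by how many users rated both items. The user names, villa names and ratings are invented. One caveat to check against the description above: in this formulation, at least, the prediction is built from ratings the target user has already given, so a user with no ratings gets no prediction.

```python
from collections import defaultdict


def build_deviations(ratings):
    """ratings: {user: {item: rating}}.

    Returns (dev, freq), where dev[i][j] is the average of (rating_i - rating_j)
    over users who rated both items, and freq[i][j] is how many such users exist.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    freq = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, rating_i in user_ratings.items():
            for j, rating_j in user_ratings.items():
                if i != j:
                    diff_sum[i][j] += rating_i - rating_j
                    freq[i][j] += 1
    dev = {i: {j: diff_sum[i][j] / freq[i][j] for j in diff_sum[i]} for i in diff_sum}
    return dev, freq


def predict(user_ratings, item, dev, freq):
    """Weighted Slope One prediction of `item` for one user, or None if impossible."""
    numerator = 0.0
    denominator = 0
    for j, rating_j in user_ratings.items():
        if j != item and j in dev.get(item, {}):
            count = freq[item][j]
            numerator += (dev[item][j] + rating_j) * count
            denominator += count
    return numerator / denominator if denominator else None


if __name__ == "__main__":
    # Hypothetical ratings on a 1-5 scale.
    ratings = {
        "user1": {"villa_a": 5, "villa_b": 3, "villa_c": 2},
        "user2": {"villa_a": 3, "villa_b": 4},
        "user3": {"villa_b": 2, "villa_c": 5},
    }
    dev, freq = build_deviations(ratings)
    print(predict(ratings["user3"], "villa_a", dev, freq))
```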

 

Regards,

 

Tom


I have no idea what hardware you'd need to handle that many combinations.

 

The commercial aspect of such a system would be massive. As in the previous post, Netflix is offering a million dollars to anyone who can improve their system. I think the winner would be getting ripped off.

 

Basing recommendations on user ratings, using whichever method, is a bad idea in my opinion. User ratings are subjective, and you can't rely on the user to accurately translate their opinion into a number. User ratings are just there for show and to create a sense of community and interactivity, but I have yet to find a user rating system which accurately recommends anything to me, at least not every time. User ratings can be a helpful tool in directing you to something you may like, but they can have the opposite effect and direct you away from those same things.

 

A lot of user rating systems depend heavily on statistics, just as the current Netflix system does. Statistics should only be one tool. Just as important as finding out how many people like what, you need to find out why they like it.

 

Your holiday villa system is doable, but involves more planning than coding. You can start doing your research by implementing a questionnaire right now. Ask your users a little about themselves and why they have chosen this particular villa.

 

 

Have you got a link to your website?


Unfortunately the website is still a work in progress; I'm still programming the back end at the moment. I can send you a link when it's up and running though, if you like.

 

I share your views on user rating systems regarding their inaccuracy, but I am still going to include a rating system simply for the community/interactivity element. I plan to work on a system that will gather less statistical information on users which can be analyzed for trends, with the ultimate and somewhat lofty goal of being able to almost preempt what the user is looking for. However, I'm going to have to be wary of doing it too obtrusively, as recommendations can often be more annoying than helpful.

 

I am very interested in developing a system that can be applied to many scenarios. I would imagine a truly accurate system would greatly increase profits for a lot of businesses. Definitely something to look into in the future.
