Cleep


Okay, tell me how this sounds (already in place):

 

I generate a random SHA-256 hash in PHP and save it in the session, then set it as a JavaScript value, which populates the forms on the page. When the page is submitted, the field is compared against the session value. If they don't match, the user is redirected. If they do match, they can proceed with what they were doing.
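
For anyone following along, here is a minimal sketch of that token check. It is written in Python rather than PHP, and the `session` dict, the `form_token` key, and both function names are made up for illustration. It is not the code running on the site. The single-use behaviour (the token is removed once checked) is an assumption on top of what is described above.

```python
import hmac
import secrets


def issue_token(session: dict) -> str:
    """Create a random token, store it in the session, and return it
    so the page can embed it in its forms."""
    # 32 random bytes as hex: 64 characters, the same length as a SHA-256 digest.
    token = secrets.token_hex(32)
    session["form_token"] = token  # "form_token" is a hypothetical key name
    return token


def check_token(session: dict, submitted: str) -> bool:
    """Return True only if the submitted field matches the session value."""
    # Assumption: tokens are single use, so remove it from the session here.
    expected = session.pop("form_token", None)
    if expected is None or not submitted:
        return False
    # Constant-time comparison; bytes are used so non-ASCII input cannot raise.
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))


if __name__ == "__main__":
    session = {}
    token = issue_token(session)
    print(check_token(session, token))  # matching token is accepted
    print(check_token(session, token))  # reuse is rejected, token already consumed
```

A request that fails `check_token` would be the one that gets redirected. Note that the token is readable by anything that can load the page, so this mainly filters out scripts that post to the form blindly.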

Link to comment
Share on other sites

Do you realize that the titles of your links don't match the link locations? They're off by one: each description belongs to the link above it.

 

Edit:

Actually, different links have their descriptions in different orders. The descriptions are all there on the page, just not in the correct order.

Link to comment
Share on other sites

Once a user registers, an email with a password is sent to them. They must use that password, and when they log in they must reset it to something new.

A simple bot could perform that task, though it may deter some of them. A site like yours will have to be heavily moderated. If you look at similar sites using the likes of Pligg, PHP Dug, or Scuttle, they get spammed to death. Sites that use the likes of reCAPTCHA get spammed less, although there are desktop apps that can get through it.

 

Do what you can & see how it goes.

Link to comment
Share on other sites

What if I use Flash to create a socket connection between the client and the server?

 

- Client loads a flash file

- flash creates a connection with the server

- javascript sends urls to the flash file

- the flash file sends data to the php socket file

- the file does what it does now

- sends back info to the flash

- flash posts back to the javascript

- the javascript writes it out to the screen

 

This would require the bot to do the following:

1. process javascript

2. use flash to create a socket connection

3. enable Flash to talk to JavaScript and vice versa

 

What do you think of that?

Link to comment
Share on other sites

Never heard of that one. To be honest, I would steer clear of Flash unless you use it for advertising. JavaScript/AJAX, CAPTCHA, and email validation are enough to stop (well, make it difficult for) a bot unless it is a desktop application. Since yours is a niche site and not a script that will be installed on thousands of domains, I'm guessing nobody will build an app to crack yours. However, as you have it now, I could construct a bot in 5 minutes to start posting all kinds of links on your site.

 

An example of what to expect: http://www.smash-up.com

Link to comment
Share on other sites

5 minutes....

 

Do it. For me!

 

Add all your bits and pieces and then I'll test it. 5 minutes was a bit of an exaggeration lol, but it wouldn't take very long for a simple script to complete your forms, read the email, extract the data, and then log in.

Link to comment
Share on other sites

If you need lists of domains to crawl, let me know. I have over 100 million.

 

I systematically crawl the lists with curl for the DynaIndex.

 

But I also have a link ripper crawler too.

http://get.blogdns.com/dynaindex/ripper/

 

It sorts any link it finds and saves it to its main domains folder.

Something like this:

http://get.blogdns.com/dynaindex/ripper/?domain=about.com

 

It's a text file based system I came up with.
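
To give a rough idea of what "sort links by domain into text files" can look like, here is a minimal Python sketch. It is not the ripper's actual code: the class and function names, the one-file-per-domain layout, and the `.txt` naming are all assumptions made for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def group_links_by_domain(html: str, base_url: str) -> dict:
    """Return {domain: sorted unique http(s) links found in the page}."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    grouped = defaultdict(set)
    for link in collector.links:
        parts = urlsplit(link)
        # Skip mailto:, javascript:, and anything without a host name.
        if parts.scheme in ("http", "https") and parts.hostname:
            grouped[parts.hostname].add(link)
    return {domain: sorted(urls) for domain, urls in grouped.items()}


def save_grouped(grouped: dict, folder: str) -> None:
    """Write one text file per domain (hypothetical layout: <folder>/<domain>.txt)."""
    root = Path(folder)
    root.mkdir(parents=True, exist_ok=True)
    for domain, urls in grouped.items():
        (root / f"{domain}.txt").write_text("\n".join(urls) + "\n", encoding="utf-8")
```

Fetching the page (with curl or anything else) is left out on purpose. The sketch only covers the sorting and saving step.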

 

I don't save titles for URLs because all the sites rename them to whatever they want.

I have other ways to show the original and exact title for the url.

I just didn't finish everything I wanted to do with it yet.

 

Here's a very nice open source search crawler I used a few years ago.

http://www.sphider.eu/

Just add any domains you want to crawl, set the link depth level, and let it go.

 

I may just incorporate this into my links ripper one day.

 

If you want to search lists from Alexa, DMOZ, and WHOIS by a word or words in the domain name, and also exclude words, you can use the crude domain search I made.

http://get.blogdns.com/dynaindex/urls/
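
The include/exclude idea is simple enough to sketch in a few lines of Python. This is only an illustration, not how the search above is built. The function name is made up, and it assumes every include word must appear in the domain name, which may not match how the real search treats multiple words.

```python
def search_domains(domains, include, exclude=()):
    """Return the domains whose name contains every include word
    and none of the exclude words (case-insensitive substring match)."""
    include = [word.lower() for word in include]
    exclude = [word.lower() for word in exclude]
    results = []
    for domain in domains:
        name = domain.lower()
        has_all = all(word in name for word in include)
        has_excluded = any(word in name for word in exclude)
        if has_all and not has_excluded:
            results.append(domain)
    return results


if __name__ == "__main__":
    sample = ["phpfreaks.com", "freephphosting.net", "python.org", "phpbb.com"]
    # Domains containing "php" but not "free".
    print(search_domains(sample, ["php"], ["free"]))
```

With a plain text list of domains, this is a single pass over the file, so it stays workable for large lists as long as they are read line by line.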

 

I really think switching to Cassandra as the database is the way to do all this properly, and using Python for the search. There is an insane number of links on the net. A few little niche sites may be better.

Link to comment
Share on other sites
