Facebook, Twitter, foursquare, yahoo Coding

phpSensei · November 1, 2010

I don't know how to word this, but I will do my best.

Say, what's the difference between the login system at Facebook or another big company and the login system you find in most tutorials? I mean, I don't see the difference between an ADVANCED user login and just plainly logging the user in with sessions... I've looked at some of phpBB's code, for example, and I see a lot of seemingly useless and repeated stuff happening, using a lot of code that's unnecessary. I could provide a few examples, but I am too lazy.

I find that when I am doing big projects, I get paranoid and think that there are a lot of things that go into a secure login or registration system...

Of course, I do know a lot of the security holes you have to watch out for, but if the site uses sessions and such, is there really a way to get hacked? I recently saw the Firesheep program hijacking other people's accounts; I mean, that's something everyone should watch out for now... I don't know, maybe I am getting paranoid, because I have a big project I am releasing soon and feel as if it's not ready security-wise.

See what I am getting at?

I mean, if you had to program for Facebook, would you use this CakePHP script to clean up variables?

<?php
/**
* Washes strings from unwanted noise.
*
* Helpful methods to make unsafe strings usable.
*
* PHP versions 4 and 5
*
* CakePHP(tm) : Rapid Development Framework (http://cakephp.org)
* Copyright 2005-2010, Cake Software Foundation, Inc. (http://cakefoundation.org)
*
* Licensed under The MIT License
* Redistributions of files must retain the above copyright notice.
*
* @copyright     Copyright 2005-2010, Cake Software Foundation, Inc. (http://cakefoundation.org)
* @link          http://cakephp.org CakePHP(tm) Project
* @package       cake
* @subpackage    cake.cake.libs
* @since         CakePHP(tm) v 0.10.0.1076
* @license       MIT License (http://www.opensource.org/licenses/mit-license.php)
*/

/**
* Data Sanitization.
*
* Removal of non-alphanumeric characters, SQL-safe slash-added strings, HTML-friendly strings,
* and all of the above on arrays.
*
* @package       cake
* @subpackage    cake.cake.libs
*/
class Sanitize {

/**
* Removes any non-alphanumeric characters.
*
* @param string $string String to sanitize
* @param array $allowed An array of additional characters that are not to be removed.
* @return string Sanitized string
* @access public
* @static
*/
function paranoid($string, $allowed = array()) {
	$allow = null;
	if (!empty($allowed)) {
		foreach ($allowed as $value) {
			$allow .= "\\$value";
		}
	}

	if (is_array($string)) {
		$cleaned = array();
		foreach ($string as $key => $clean) {
			$cleaned[$key] = preg_replace("/[^{$allow}a-zA-Z0-9]/", '', $clean);
		}
	} else {
		$cleaned = preg_replace("/[^{$allow}a-zA-Z0-9]/", '', $string);
	}
	return $cleaned;
}

/**
* Makes a string SQL-safe.
*
* @param string $string String to sanitize
* @param string $connection Database connection being used
* @return string SQL safe string
* @access public
* @static
*/
function escape($string, $connection = 'default') {
	$db =& ConnectionManager::getDataSource($connection);
	if (is_numeric($string) || $string === null || is_bool($string)) {
		return $string;
	}
	$string = substr($db->value($string), 1);
	$string = substr($string, 0, -1);
	return $string;
}

/**
* Returns given string safe for display as HTML. Renders entities.
*
* strip_tags() does not validate HTML syntax or structure, so it might strip whole passages
* with broken HTML.
*
* ### Options:
*
* - remove (boolean) if true strips all HTML tags before encoding
* - charset (string) the charset used to encode the string
* - quotes (int) see http://php.net/manual/en/function.htmlentities.php
*
* @param string $string String from where to strip tags
* @param array $options Array of options to use.
* @return string Sanitized string
* @access public
* @static
*/
function html($string, $options = array()) {
	static $defaultCharset = false;
	if ($defaultCharset === false) {
		$defaultCharset = Configure::read('App.encoding');
		if ($defaultCharset === null) {
			$defaultCharset = 'UTF-8';
		}
	}
	$default = array(
		'remove' => false,
		'charset' => $defaultCharset,
		'quotes' => ENT_QUOTES
	);

	$options = array_merge($default, $options);

	if ($options['remove']) {
		$string = strip_tags($string);
	}

	return htmlentities($string, $options['quotes'], $options['charset']);
}

/**
* Strips extra whitespace from output
*
* @param string $str String to sanitize
* @return string whitespace sanitized string
* @access public
* @static
*/
function stripWhitespace($str) {
	$r = preg_replace('/[\n\r\t]+/', '', $str);
	return preg_replace('/\s{2,}/', ' ', $r);
}

/**
* Strips image tags from output
*
* @param string $str String to sanitize
* @return string String with images stripped.
* @access public
* @static
*/
function stripImages($str) {
	$str = preg_replace('/(<a[^>]*>)(<img[^>]+alt=")([^"]*)("[^>]*>)(<\/a>)/i', '$1$3$5<br />', $str);
	$str = preg_replace('/(<img[^>]+alt=")([^"]*)("[^>]*>)/i', '$2<br />', $str);
	$str = preg_replace('/<img[^>]*>/i', '', $str);
	return $str;
}

/**
* Strips scripts and stylesheets from output
*
* @param string $str String to sanitize
* @return string String with <script>, <style>, <link> elements removed.
* @access public
* @static
*/
function stripScripts($str) {
	return preg_replace('/(<link[^>]+rel="[^"]*stylesheet"[^>]*>|<img[^>]*>|style="[^"]*")|<script[^>]*>.*?<\/script>|<style[^>]*>.*?<\/style>|<!--.*?-->/is', '', $str);
}

/**
* Strips extra whitespace, images, scripts and stylesheets from output
*
* @param string $str String to sanitize
* @return string sanitized string
* @access public
*/
function stripAll($str) {
	$str = Sanitize::stripWhitespace($str);
	$str = Sanitize::stripImages($str);
	$str = Sanitize::stripScripts($str);
	return $str;
}

/**
* Strips the specified tags from output. First parameter is string from
* where to remove tags. All subsequent parameters are tags.
*
* Ex.`$clean = Sanitize::stripTags($dirty, 'b', 'p', 'div');`
*
* Will remove all `<b>`, `<p>`, and `<div>` tags from the $dirty string.
*
* @param string $str String to sanitize
* @param string $tag Tag to remove (add more parameters as needed)
* @return string sanitized String
* @access public
* @static
*/
function stripTags() {
	$params = params(func_get_args());
	$str = $params[0];

	for ($i = 1, $count = count($params); $i < $count; $i++) {
		$str = preg_replace('/<' . $params[$i] . '\b[^>]*>/i', '', $str);
		$str = preg_replace('/<\/' . $params[$i] . '[^>]*>/i', '', $str);
	}
	return $str;
}

/**
* Sanitizes given array or value for safe input. Use the options to specify
* the connection to use, and what filters should be applied (with a boolean
* value). Valid filters:
*
* - odd_spaces - removes any non space whitespace characters
* - encode - Encode any html entities. Encode must be true for the `remove_html` to work.
* - dollar - Escape `$` with `\$`
* - carriage - Remove `\r`
* - unicode -
* - escape - Should the string be SQL escaped.
* - backslash -
* - remove_html - Strip HTML with strip_tags. `encode` must be true for this option to work.
*
* @param mixed $data Data to sanitize
* @param mixed $options If string, DB connection being used, otherwise set of options
* @return mixed Sanitized data
* @access public
* @static
*/
function clean($data, $options = array()) {
	if (empty($data)) {
		return $data;
	}

	if (is_string($options)) {
		$options = array('connection' => $options);
	} else if (!is_array($options)) {
		$options = array();
	}

	$options = array_merge(array(
		'connection' => 'default',
		'odd_spaces' => true,
		'remove_html' => false,
		'encode' => true,
		'dollar' => true,
		'carriage' => true,
		'unicode' => true,
		'escape' => true,
		'backslash' => true
	), $options);

	if (is_array($data)) {
		foreach ($data as $key => $val) {
			$data[$key] = Sanitize::clean($val, $options);
		}
		return $data;
	} else {
		if ($options['odd_spaces']) {
			$data = str_replace(chr(0xCA), '', str_replace(' ', ' ', $data));
		}
		if ($options['encode']) {
			$data = Sanitize::html($data, array('remove' => $options['remove_html']));
		}
		if ($options['dollar']) {
			$data = str_replace("\\\$", "$", $data);
		}
		if ($options['carriage']) {
			$data = str_replace("\r", "", $data);
		}

		$data = str_replace("'", "'", str_replace("!", "!", $data));

		if ($options['unicode']) {
			$data = preg_replace("/&#([0-9]+);/s", "&#\\1;", $data);
		}
		if ($options['escape']) {
			$data = Sanitize::escape($data, $options['connection']);
		}
		if ($options['backslash']) {
			$data = preg_replace("/\\\(?!&#|\?#)/", "\\", $data);
		}
		return $data;
	}
}

/**
* Formats column data from definition in DBO's $columns array
*
* @param Model $model The model containing the data to be formatted
* @access public
* @static
*/



function formatColumns(&$model) {
	foreach ($model->data as $name => $values) {
		if ($name == $model->alias) {
			$curModel =& $model;
		} elseif (isset($model->{$name}) && is_object($model->{$name}) && is_subclass_of($model->{$name}, 'Model')) {
			$curModel =& $model->{$name};
		} else {
			$curModel = null;
		}

		if ($curModel != null) {
			foreach ($values as $column => $data) {
				$colType = $curModel->getColumnType($column);

				if ($colType != null) {
					$db =& ConnectionManager::getDataSource($curModel->useDbConfig);
					$colData = $db->columns[$colType];

					if (isset($colData['limit']) && strlen(strval($data)) > $colData['limit']) {
						$data = substr(strval($data), 0, $colData['limit']);
					}

					if (isset($colData['formatter']) || isset($colData['format'])) {

						switch (strtolower($colData['formatter'])) {
							case 'date':
								$data = date($colData['format'], strtotime($data));
							break;
							case 'sprintf':
								$data = sprintf($colData['format'], $data);
							break;
							case 'intval':
								$data = intval($data);
							break;
							case 'floatval':
								$data = floatval($data);
							break;
						}
					}
					$model->data[$name][$column]=$data;
					/*
					switch ($colType) {
						case 'integer':
						case 'int':
							return  $data;
						break;
						case 'string':
						case 'text':
						case 'binary':
						case 'date':
						case 'time':
						case 'datetime':
						case 'timestamp':
						case 'date':
							return "'" . $data . "'";
						break;
					}
					*/
				}
			}
		}
	}
}
}

phpSensei · November 1, 2010

Do you think using a framework like Zend Framework would be a lot safer?

What I am asking is: is it hard to code sites like these? I mean, one guy did it at the beginning; I am sure it wasn't very safe and other people had to reprogram it.

gizmola · November 1, 2010

If you want to talk about really big sites, one of their biggest issues is scalability, so the code would be different. In terms of sessions, PHP sessions are file-based by default. That doesn't work for a big site that employs a cluster of web servers, so there's invariably an alternate session storage mechanism, where the session data is stored in a database or memcache, or possibly on a big NFS-mounted NetApp.
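To make the alternate storage idea concrete, here is a minimal sketch of a custom session handler, assuming PHP 8. The class name `SharedSessionStore` is hypothetical, and a plain array stands in for the database table or memcache pool; an array does not survive between requests, so a real handler would replace the array operations with queries against the shared store.

```php
<?php
// Hypothetical class name. The array is only a stand-in for a shared store
// (database table, memcache, ...) that every web server in the cluster can reach.
final class SharedSessionStore implements SessionHandlerInterface
{
    /** @var array<string, array{data: string, touched: int}> */
    private array $rows = [];

    public function open(string $path, string $name): bool
    {
        return true; // a real store would connect here
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        // PHP expects an empty string when the session does not exist yet.
        return $this->rows[$id]['data'] ?? '';
    }

    public function write(string $id, string $data): bool
    {
        $this->rows[$id] = ['data' => $data, 'touched' => time()];
        return true;
    }

    public function destroy(string $id): bool
    {
        unset($this->rows[$id]);
        return true;
    }

    public function gc(int $max_lifetime): int|false
    {
        $cutoff = time() - $max_lifetime;
        $before = count($this->rows);
        // Keep only the sessions touched since the cutoff.
        $this->rows = array_filter(
            $this->rows,
            fn(array $row): bool => $row['touched'] >= $cutoff
        );
        return $before - count($this->rows); // number of sessions removed
    }
}

// Registration has to happen before session_start():
// session_set_save_handler(new SharedSessionStore(), true);
// session_start();
```

Once a handler like this is registered, the rest of the application keeps using `$_SESSION` as usual; only the storage behind it changes.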

From there you have a number of issues: password and session ID sniffing on shared networks, insecure configurations that allow session files to be read off the server, session fixation exploits, and cookie-based exploits. The reason there's no one right answer is that the amount of security people need tends to relate directly to the type of site it is.

Many community sites forgo SSL, not because they don't know that passwords are sent in cleartext, but because the computational cost and overhead of using it would require more hardware than they can afford to invest in.

Twitter and Facebook have both struggled with scalability, and while they offer HTTPS, they don't take any technical steps to ensure that people use it. When you look at something like Firesheep, all it did was once again remind people that if you're on a shared network (in most cases public Wi-Fi), your data is sniffable, and it's not unlikely that somebody sitting a table over has a sniffer looking at the data you're sending.

In terms of sessions, well, the session ID is a token that represents your login. It's not supposed to be guessable (by default it's an md5() hash that's not predictable, although you can change it). That session ID gets put in a cookie, so your browser sends it with every request. Needless to say, if I can sniff your traffic, I can get hold of that session ID, set a bogus cookie, and at that point the site in question will react as if I am you.

One thing you can do to mitigate this issue is to require reauthentication and to regenerate the session ID any time there's a request to "escalate privilege". The basic idea there is that you might be able to masquerade as me, but as long as you can't change my password or become superadmin, it's not as bad a situation as it might be. The same ideas are involved in combating session fixation attacks. You also want to configure PHP so that it does not allow the session ID to be passed as a URL parameter or POST variable.
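The regeneration and cookie-only ideas above could look roughly like this. It is a sketch rather than a complete login system: `escalatePrivilege()` and the `$_SESSION` keys are hypothetical names, and the function assumes the caller has already checked the user's password again. The `ini_set()` calls use real PHP session settings; on recent PHP versions the first two already match the defaults, and they could equally live in php.ini.

```php
<?php
// Accept the session ID from cookies only, never from the URL or a form field.
ini_set('session.use_only_cookies', '1');
ini_set('session.use_trans_sid', '0');
// Hide the session cookie from JavaScript.
ini_set('session.cookie_httponly', '1');

// Hypothetical helper: call it only after the user has re-entered a valid password.
function escalatePrivilege(string $role): void
{
    if (session_status() !== PHP_SESSION_ACTIVE) {
        session_start();
    }
    // Issue a fresh ID and delete the old session data, so a sniffed or
    // fixated ID stops working at the moment privileges go up.
    session_regenerate_id(true);
    $_SESSION['role'] = $role;
    $_SESSION['reauthenticated_at'] = time();
}
```

A password-change or admin page would then call something like `escalatePrivilege('admin')` right after the password check succeeds, and refuse to proceed if `reauthenticated_at` is missing or too old.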

Last but not least, the sniffing issues cease to be a problem if you implement HTTPS for the whole session, not just for the login page.
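As a rough illustration of the HTTPS point, the sketch below works out where to redirect a plain-HTTP request. `httpsRedirectTarget()` is a hypothetical helper, not a PHP built-in. It assumes the web server sets `$_SERVER['HTTPS']` itself; behind a load balancer or proxy that terminates SSL this is usually not the case, and the `Host` header is client-supplied, so a real site would rather compare against a configured host name.

```php
<?php
// Hypothetical helper: returns the HTTPS URL to redirect to, or null when the
// request already arrived over HTTPS. $server is expected to look like $_SERVER.
function httpsRedirectTarget(array $server): ?string
{
    $https = $server['HTTPS'] ?? '';
    if ($https !== '' && strtolower($https) !== 'off') {
        return null; // already on HTTPS
    }
    $host = $server['HTTP_HOST'] ?? 'localhost';
    $uri  = $server['REQUEST_URI'] ?? '/';
    return 'https://' . $host . $uri;
}

// Typical use at the top of a front controller (left commented out in this sketch):
// ini_set('session.cookie_secure', '1'); // send the session cookie over HTTPS only
// if (($url = httpsRedirectTarget($_SERVER)) !== null) {
//     header('Location: ' . $url, true, 301);
//     exit;
// }
```

Marking the session cookie as secure matters as much as the redirect, because otherwise the browser may still send the cookie on a first plain-HTTP request, where it can be sniffed.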

.josh · November 1, 2010

As for the code you posted... I didn't go over it with a fine-tooth comb, but it looks more like validating user input than securing a session... and for the most part, yes, I would trust it. It looks like it uses mostly straight regex to validate. For instance, preg_match('~^[0-9]+$~',$subj) means to match one or more digits across the complete $subj. While there are a number of built-in PHP functions that effectively do the same thing, it doesn't get any more cut and dried than that.
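For comparison, here is the digits-only check done both ways. The two function names are hypothetical and only there for illustration. One detail worth knowing: a plain `$` also matches just before a single trailing newline, so the sketch anchors with `\z` instead.

```php
<?php
// Hypothetical helper names, for illustration only.
function isDigitsRegex(string $subj): bool
{
    // \z anchors at the very end of the string; a plain $ would also
    // accept one trailing newline, as in "123\n".
    return preg_match('~^[0-9]+\z~', $subj) === 1;
}

function isDigitsBuiltin(string $subj): bool
{
    // ctype_digit() is true only for a non-empty string made of 0-9.
    return ctype_digit($subj);
}
```

Both helpers validate, meaning they accept or reject the input. The Sanitize::paranoid() method in the posted class takes the other route and silently strips the characters it does not like.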

phpSensei · November 1, 2010

I see... The coding itself isn't a particularly hard task, but the scalability and some of the session storage issues are.

Thanks to both of you guys, I have a better idea of what I am facing.

Also, that NetApp seems to be something I should look into; hopefully it's not too expensive.

gizmola · November 1, 2010

They are great devices, but they aren't cheap. I'm not entirely current, but from just a quick look I saw people selling the FAS2020 with 6TB of storage for around $9k. As far as I know, the FAS2020 is the entry-level device. NetApps basically implement RAID-6, support multiple protocols, and have all sorts of incredible features, like being able to do snapshots. If you're doing this work for a company, it might be financially viable, as the reliability of those things is awesome and NetApp support is outstanding.
