Jump to content

using arrays to create lisp-like mini language for mysql (and maybe html); any suggestions?


Goat

Recommended Posts

In case you guys don't know what lisp is, tl;dr is that is an old language without syntax that uses polish notation and parentheses to represent all language constructs. You can also create your own language constructs. For example, this mysql statement :

 

UPDATE table_name SET column1=value1,column2=value2 WHERE some_column=some_value
Can be represented as

 

 (update "table_name" (set ('column1 "value1") ('column2 "value2") (where (= some_column "some_value")))
One advantage of this whole thing is that you don't have to mix multiple languages to create a site. When you program in php, you actually use html, css, php and MySQL. When you program in lisp, you only use lisp. Another advantage is that the language with no syntax is easier to programaticaly generate than one with it. I can auto generate various sql statements as lisp trees. The disadvantage is that lisp is obscure as hell and no one uses it.

 

So I was thinking of using php arrays to get some of that expressiveness of lisp in PHP and to abstract away necessity to glue together sql strings. So far it has been mixed results. One way to represent update is this

 

 ["update", "table_name", ["column1" => "value1", "column2" => "value2"], ["where", [ "=", "some_column", "some_value"]]]
I represent columns to be updated as key=>value pairs since column names are unique anyway. One advantage of this method is that I don't have to use mysql_real_escape_string on variables for updating. My interpreter automatically escapes everything inside SET bracket and adds quotes to values. This is however, less than practical in its WHERE part. That's because it is less clear what part you can and can't escape, since column names, functions and values can all be mixed up in WHERE part, and arrays are not nearly as flexible as s-expressions.

 

Anyone has any idea how best to do this?

Edited by Goat
Link to comment
Share on other sites

The idea of generating SQL queries with PHP is nothing new and has been many different times in many different ways. Most approaches are object-oriented. See the Doctrine Query Builder, for example.

 

The problem is: This idea is incredibly stupid. It's good to have different languages for different purposes. SQL is excellent at querying relational databases, because that's exactly what it's designed for. Why on earth would I want some half-assed emulation written in PHP which makes the queries five times as long and absolutely unreadable? What's the benefit of that?

 

Sure, it's all PHP then. But who says that's a good idea? By all means, I do not want to fumble with dozens of nested PHP arrays instead of writing down one simple SQL query. Even your trivial example is difficult to read. And you've actually cheated. You can't just use strings, because then the SQL identifier foo is indistinguishable from the SQL string 'foo'. So you need even more arrays to hold the type information. Now imagine the equivalent of a more complex query with joins, subqueries, grouping, aggregate functions etc. It will simply be unusable.

 

I mean, this is certainly an interesting project if you're into compiler building. But in practice: No, this is not a good idea.

Link to comment
Share on other sites

using arrays to create lisp-like mini language

whyyyyyyyyy

 

I have attempted this exact same thing before and I gave up. It's clean but still gets so ugly and confusing, what with supporting all the things you will realistically need to use in SQL, that I decided it wasn't worth it.

 

Let's say you're working with a SELECT, with one JOIN, and two conditions.

SELECT t1.a, t1.b, t2.c
FROM table1 AS t1
JOIN table2 AS t2 ON t1.x = t2.x AND t1.y = t2.y
WHERE (t1.a > 0 AND t1.b > 0) OR (t1.a = 0 AND t1.b = 0)
Using roughly the syntax I arrived at,

[
	"select",
	[
		["t1", "a"],
		["t1", "b"],
		["t2", "c"]
	],
	["t1" => "table1"],
	[
		"join" => [
			"t2" => ["table2", "and",
					["=", ["t1", "x"], ["t2", "y"]],
					["=", ["t1", "y"], ["t2", "y"]]
				]
			]
		],
		"where" => ["or",
			["and",
				[">", ["t1", "a"], 0],
				[">", ["t1", "b"], 0]
			],
			["and",
				["=", ["t1", "a"], 0],
				["=", ["t1", "b"], 0]
			]
		]
	]
]
[edit] It uses a few more []s than you might expect but that's how the system differentiates names from literal values: ["a"] is a column name and "a" is a string. [/edit]

 

Now add to that support for:

- SELECTs with UNIONs

- SELECTs as subqueries

- HAVING

- GROUP BY

- ORDER BY

- LIMIT

and it gets even worse. And that's just SELECTs.

 

So then you think about taking the easy way out and not deliberately handling operators like "or" and "=" - just using them verbatim in the query. But then you've basically just rearranged the original SQL statement into another form. One that's much harder to read.

Edited by requinix
Link to comment
Share on other sites

When you program in php, you actually use html, css, php and MySQL.

Huh?

 

If you're building an app that will eventually be placed on the web, you're going to have to use HTML and CSS to display results to the screen. Even with the various template engines available. Doesn't matter if it's backed by PHP, Java, Python, Ruby, C#, etc. To claim that it's somehow an integral flaw to PHP is simply not correct.

 

Similarly, if you're going to work with a database, SQL is going to be written somewhere. You can make your own query language to run on top of it, like what you're trying to do (although, Lisp? Really?), or like Doctrine's DQL language like Jacques mentioned. But at some point, the conversion needs to be made, and the programmer that's using this new language still needs to have some idea of SQL to make it work.

Link to comment
Share on other sites

Huh?

 

If you're building an app that will eventually be placed on the web, you're going to have to use HTML and CSS to display results to the screen. Even with the various template engines available. Doesn't matter if it's backed by PHP, Java, Python, Ruby, C#, etc. To claim that it's somehow an integral flaw to PHP is simply not correct.

 

Well it is integral flaw of all languages except lisp :) . Now, I don't want to evangelize for lisp - the language has some massive flaws, most important being fragmented (and frankly arrogant) community and bunch of incompatible implementations. It frankly deserves to be marginalized. Nevertheless having something like s-expressions would be a great help as then you can represent foreign languages in tree form instead of gluing together bunch of strings. My s-expressions MySQL example doesn't require programmer to manually apply mysql_real_escape_string, as you can write a macro that will slap it on every variable. That's certainly convenient. So would be ability to traverse HTML nodes. PHP sorta tries to have some of those advantages with its array syntax, but only gets there halfway. I was inspired to try something like this by reading the Emacs problem. But I don't want to get on this soapbox as I might get lynched.

Edited by Goat
Link to comment
Share on other sites

The idea of generating SQL queries with PHP is nothing new and has been many different times in many different ways. Most approaches are object-oriented. See the Doctrine Query Builder, for example.

 

The problem is: This idea is incredibly stupid. It's good to have different languages for different purposes. SQL is excellent at querying relational databases, because that's exactly what it's designed for. Why on earth would I want some half-assed emulation written in PHP which makes the queries five times as long and absolutely unreadable? What's the benefit of that?

 

While I agree that the idea is highly impractical in PHP, I strongly disagree that it is "incredibly stupid". For one, the whole of human genome is encoded in just one language, DNA, and that's a more complex machine than anything you and I are bashing together. It is more likely that our approach is ultimately stupid even if it works better short-term. But more seriously, all else being equal, it is better to have a tree representation of a language instead of string representation. Imagine if jQuery conceptualized HTML as one giant string instead of as annotated tree? For that matter, why not have server side jQuery? It would be convenient to transform some html server side for the same reason it is convenient to do it client side.

 

Anyway I can think of some good reasons why one would want to have SQL as three (provided we had concise syntax for building the tree, which we don't) :

 

1. eliminate all sql injections. You would recursively traverse the whole tree and escape all strings.

2. enable creating sql query from multiple sources. Imagine that you have huge amount of filtering functions and all of them are controlling what rows from a table can be displayed based on user level, age, preferences and, hell, ear size. It would be more practical for every function to return tree fragment that is eventually programatically combined into one giant WHERE statement than to assemble one giant sql string by hand.

 

It is not that there are no reasons to do this. It is that it is probably not worth it in PHP.

 

 

Sure, it's all PHP then. But who says that's a good idea? By all means, I do not want to fumble with dozens of nested PHP arrays instead of writing down one simple SQL query. Even your trivial example is difficult to read. And you've actually cheated. You can't just use strings, because then the SQL identifier foo is indistinguishable from the SQL string 'foo'. So you need even more arrays to hold the type information. Now imagine the equivalent of a more complex query with joins, subqueries, grouping, aggregate functions etc. It will simply be unusable.

 

I mean, this is certainly an interesting project if you're into compiler building. But in practice: No, this is not a good idea.

Yeah I agree that I have cheated and that it would have been even longer to do properly.

Edited by Goat
Link to comment
Share on other sites

Anyway I can think of some good reasons why one would want to have SQL as three (provided we had concise syntax for building the tree, which we don't) :

 

A tree is fine, but that doesn't mean that we, as humans, should build that tree. That's a job for the computer.

 

For example, we might write down the base SQL query, have it parsed and then manipulate it through method calls.

 

 

 

1. eliminate all sql injections. You would recursively traverse the whole tree and escape all strings.

 

Escaping is (mostly) dead. We have prepared statements now.

Link to comment
Share on other sites

The main drive for my attempt was query building. This kind of approach would make it very easy to build a query dynamically, like for an advanced search form: may need to add criteria, sometimes those criteria involve other tables, sometimes those JOIN conditions vary based on other criteria... The best way (IMO) to approach that normally is a set of arrays that you append to, and that happens to be half of this Lisp-y thing already.

 

Personally I think it's a good idea - I just couldn't come up with a practical and satisfactory implementation. (I actually had a fully functional implementation but as I said I really didn't like the "syntax".)

Link to comment
Share on other sites

All the wasted cpu cycles just to transform a bunch of overly verbose fancy stuff into SQL, when you could have just written...SQL. If you're worried about SQL injection, just use parameterized queries/binds/prepared statements like PDO or even MySQLi offers. I'd be interested to see you create that, and then benchmark it against querying the raw SQL it produces. And then think about 500 people accessing your app simultaneously, or thousands, or tens of thousands.

 

SQL is fairly complex. Subqueries, joins, aliases, aggregate functions, math, casting, date functions, protecting reserved words for people who use them as column names like `order`, etc. It would take a lot of code to accurately replicate all of it's functionality to be able to recompile back into raw SQL and still have it work in all cases. I've never run across a "query builder" that did everything I needed it to and still had to write raw SQL to get around its limitations and waste a lot of time trying to figure out how to make it bend to what you want it to do. In the end, you have to learn an entirely new complex language. KISS.

 

It's like you want to write a book for an English audience.  So you write it in French and have someone translate it to English.  It never comes out 1:1.

Edited by CroNiX
Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.