Category: sql

  • Sql Injection Slides Posted

    I gave a presentation today at the MySQL Conference & Expo 2010, titled SQL Injection Myths and Fallacies. Thanks to everyone who came to my talk! I appreciate your interest in learning to develop more secure applications. SQL Injection is a serious threat to web applications, and it’s only going to get worse. It’s incumbent on you as software developers to learn how to write secure code!

    My slides are now online in two places: on the MySQL Conference website, and at SlideShare.net/billkarwin.

    I also handed out cards for a 20% discount on my upcoming book, SQL Antipatterns. One chapter in my book is devoted to SQL Injection risks and methods for defending against them. You can pre-order the hardcopy book and receive it as soon as it ships. You can also get the downloadable beta e-book right away, and receive an update when the editing is done.

    I left a stack of the leftover discount cards on the collateral table in the hallway. If you didn’t get one, you’ll have another chance when I talk at the PHP TEK-X conference in Chicago in May!

  • Rendering Trees with Closure Tables

    I got a comment from a reader about the Naive Trees section of my presentation SQL Antipatterns Strike Back. I’ve given this presentation at the MySQL Conference & Expo in the past.

    I’d also like to mention that I’ve developed these ideas into a new book, SQL Antipatterns: Avoiding the Pitfalls of Database Programming. The book is now available in Beta and for pre-order from Pragmatic Bookshelf.

    Here’s the reader’s question:

    I would like to ask if there’s a way I can dump all the hierarchies in a single query using a closure table? For example I have a following tree:

    rootree
    – 1stbranch
    – midbranch
    – corebranch
    – leafnodes
    – lastbranch
    – lastleaf

    and I want to display it like:

    rootree -> 1stbranch
    rootree -> midbranch
    rootree -> midbranch -> corebranch
    rootree -> midbranch -> corebranch -> leafnodes
    rootree -> lastbranch
    rootree -> lastbranch -> lastleaf

    The Closure Table is a design for representing trees in a relational database by storing all the paths between tree nodes. Using the reader’s example, one could define and populate two tables like this:

    drop table if exists closure;
    drop table if exists nodes;

    create table nodes (
    node int auto_increment primary key,
    label varchar(20) not null
    );

    insert into nodes (node, label) values
    (1, ‘rootree’),
    (2, ‘1stbranch’),
    (3, ‘midbranch’),
    (4, ‘corebranch’),
    (5, ‘leafnodes’),
    (6, ‘lastbranch’),
    (7, ‘lastleaf’);

    create table closure (
    ancestor int not null,
    descendant int not null,
    primary key (ancestor, descendant),
    foreign key (ancestor) references nodes(node),
    foreign key (descendant) references nodes(node)
    );

    insert into closure (ancestor, descendant) values
    (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (1,7),
    (2,2),
    (3,3), (3,4), (3,5),
    (4,4), (4,5),
    (5,5),
    (6,6), (6,7),
    (7,7);

    What we need to do is find all the descendants of the root node 1, then for each of these descendant nodes, list its ancestors in order, separated by an arrow. We can use MySQL’s useful GROUP_CONCAT() function to build this list for us.

    select group_concat(n.label order by n.node separator ‘ -> ‘) as path
    from closure d
    join closure a on (a.descendant = d.descendant)
    join nodes n on (n.node = a.ancestor)
    where d.ancestor = 1 and d.descendant != d.ancestor
    group by d.descendant;

    Here’s the output in the MySQL client. It looks like what the reader asked for:

    +-------------------------------------------------+
    | path |
    +-------------------------------------------------+
    | rootree -> 1stbranch |
    | rootree -> midbranch |
    | rootree -> midbranch -> corebranch |
    | rootree -> midbranch -> corebranch -> leafnodes |
    | rootree -> lastbranch |
    | rootree -> lastbranch -> lastleaf |
    +-------------------------------------------------+

    I do assume for the purposes of ordering that all of a node’s ancestors have a lower node number. You could alternatively use a pathlength column to the closure table and sort by that.

    The Closure Table design is nice compared to the Nested Sets (or Preorder Traversal) design, because it supports the use of referential integrity. By using indexes, the EXPLAIN report shows that MySQL query optimizer does a pretty good job on it (I’ve omitted a few columns for brevity):

    +-------+--------+-------------------+--------------------------+
    | table | type | ref | Extra |
    +-------+--------+-------------------+--------------------------+
    | d | range | NULL | Using where; Using index |
    | a | ref | test.d.descendant | |
    | n | eq_ref | test.a.ancestor | |
    +-------+--------+-------------------+--------------------------+