Author: Bill Karwin

  • Enabling the Success of a Software Team

    There are three “must haves” for excellent managers, which I look for when I work for a manager, and which I try to live up to when I work as a manager.

    I thought I’d write down these thoughts after seeing two things: Jeremy Cole’s blog this week, with some great advice about ways to attract, motivate, and retain expert MySQL DBA architects, and, earlier this summer, Cal Evans’ podcast “Attracting Talent” on the PHP Abstract site.

    This brings me to my ideas about how to manage excellent software developers. There are many things a manager has to do to be effective at leading a team, but to boil it down to a single principle, I like to say that the manager’s job is to enable the success of their team.

    Here is my list of three management responsibilities to support the success of the team:

    1. Give clear and achievable assignments

    The first step to making your team successful is to tell them what you want them to do. What is the goal of the project? What does the resulting software need to do? Are there constraints on schedule or technology? Who is the audience? Who approves the final deliverable? These and other high-level questions must be answered, even if all the fine details are still in “discovery phase.”

    Counter-example: in a classic Dilbert comic strip, the boss asks Dilbert to work on a new project. Dilbert says, “great, I’m ready, what’s the project?” The boss says, “It’s not all worked out yet, so you just start coding, and I’ll stand here looking over your shoulder. If you do something wrong, I’ll scream.”

    The assignment must be achievable. That doesn’t mean softball; giving a developer a challenge is a good thing. Many people thrive on this, and sometimes they can pull a rabbit out of a hat and surprise everyone (including themselves). But it does no good to give someone a task that is truly impossible. That just sets them up for certain failure.

    Also, the assignment must be consistent, or at least acknowledge clearly when it changes. We all know that project requirements tend to evolve and we are okay with that. But a manager who implies that the developer should have anticipated the change is being disrespectful. Or worse, I’ve seen managers claim that the changed requirements were what was “intended” (though not stated) from the beginning, and that it’s the developer’s fault for not inferring this information. What can one say about this behavior? Let’s just say that the manager is not fulfilling his or her responsibility to make the team successful.

    2. Provide the resources needed to be successful

    I have a pretty broad definition of “resources” in this context, including hardware and software tools, enough time to complete the assignment, access to people who are needed such as IT support staff or subject matter experts, any existing technology or research that is part of the desired solution, etc.

    Counter-example: I once was told to set up a testing environment, but we had no server on which to install it. The VP’s solution was to tell me to use VMware and then I’d have as many servers as I need. But we still needed a real server on which to run the VMware software, and we had none. This is an example of being told to make bricks without straw.

    Another counter-example: a manager who won’t authorize a $250 expense for a commercial-quality code-analysis tool, and would rather let highly paid developers spend weeks debugging elusive issues. That’s not a smart use of time or money. Sure, one doesn’t want expenses to get out of control, but being either too stingy or too frivolous is likely to put the team’s success at risk.

    3. Give constructive feedback

    The manager must communicate clearly and deliberately, instead of assuming “no news is good news.”

    Feedback doesn’t need to be full of hollow affirmations or cheerleading; it should let the developer know how close he or she is to the goal of success. Also, if the developer is off-target, it’s important to communicate about this and correct it as early as possible. Most people naturally want to do a good job, and being allowed to do the wrong job for weeks is sure to discourage them once they learn the truth.

    Ultimately, when the team completes an assignment, a manager should tell them they did so, and how well it meets expectations. An important part of enabling a team’s success is letting them know when they have done it.

  • Proposals for MySQL Conference

    I submitted proposals for the MySQL Conference & Expo.

    SQL AntiPatterns II

    I thought it would be a no-brainer to do a sequel of my 2007 talk, “SQL AntiPatterns”. That talk was very well attended, thanks to Jay Pipes’ endorsement in his guide to the conference. It’s not hard to come up with all-new content for a sequel!

    Topics in this presentation:
    * Corrupt your data by storing images in files instead of BLOB fields.
    * Kill your query performance using the HAVING clause.
    * Use the FLOAT datatype and lose money.
    * Add an “id” column to every table — whether it needs one or not.
    * Prepare queries using parameters for identifiers and keywords.

    Designing Models and Such: using MySQL in MVC Applications

    I just recently finished working for Zend Technologies, spending a year developing the database access components for the Zend Framework.

    A database like MySQL is an integral part of virtually every web application. This talk describes practical ways to leverage MySQL in your project, to meet goals of development productivity, application performance, and security.

    Model-View-Controller (MVC) is a popular architecture pattern for web applications, but it may be novel to PHP developers. Designing Models in an MVC application is the subject of many questions, so this talk will focus on these issues.

    Examples use the Zend Framework web application library for PHP 5.

    Topics:
    * Designing database-backed Model classes for MVC applications
    * Caching data and metadata appropriately
    * Storing authentication credentials in a database
    * Configuration management and testing issues
    * Logging application events to a database

    The audience for this talk is assumed to know object-oriented programming concepts in PHP 5.

  • Leaving Zend

    I’ve worked at Zend for the past 13 months, heading up an open source project called the Zend Framework. Zend Framework is a library of PHP 5 classes providing simple, object-oriented solutions for most features common to modern web applications. I was the project manager, and I also developed a lot of code, tests, and documentation, and engineered the product releases through its 1.0 release.

    When I joined Zend in September 2006, that project had made a few “Preview Releases”, but it was losing momentum. My assignment was to organize the project, finish development of the 50+ components in the library, make regular beta releases to demonstrate progress, and to move the product to a general 1.0 release as rapidly as possible.

    To achieve this goal, I knew we had to manage the scope of the project carefully. There’s always tension between CSSQ (Cost, Scope, Schedule, and Quality) in any project. The project already had bare-bones cost, we had high standards for quality (who doesn’t?), and Zend placed a very high priority on making a general 1.0 release as soon as possible. So the only thing remaining to control was the scope.

    We were blessed with an enthusiastic user community, but this meant that we had dozens of people submitting feature requests and proposals for new components every week. Though the ideas were genuinely very attractive, there was simply no way to add them to the project scope without consequences for the schedule or quality of the project. And there were some features that we wished we had had time to do before we had to reach that 1.0 milestone.

    Some members of the user community voiced objections to the emphasis on schedule over feature-set. I hope they understand that we were following the priorities of Zend, who is after all the sponsor of the project.

    We released Zend Framework 1.0 on June 30, 2007, and followed it with a couple of bug-fix releases in July and September. Zend Framework is accelerating in popularity, with over 2.3 million downloads to date. I feel proud to have contributed to a successful web application component library. It’s very satisfying to see so many people using code I worked on, making their own projects more successful.

    I was honored to work with a great team of software developers. I learned a lot from Darby, Matthew, and Alexander as we worked together, about technical subjects, productivity, and teamwork. Those guys were great to work with, and I hope to work with them again someday. There are a lot of other fine people at Zend and in the Zend Framework developer community, but I worked most closely with those three.

    I completed my assignment at Zend successfully. But Zend and I were unable to define a next objective for me. That usually means it’s time to declare victory and move on, so I gave notice and I finished working there last week.

    During the time I was on the project, the Zend Framework increased by:

    • 104,000 lines of PHP code in its core library;
    • 100,000 lines of PHP code in its unit tests;
    • 3,945 unit test functions (overall code coverage from tests is 84%);
    • 40,000 lines of documentation;
    • 200 new community contributors.

  • SHA2() patch for MySQL 5.0

    I’ve created a patch for MySQL 5.0.33 to provide a function SHA2().
    Download it here:

    http://www.karwin.com/sha2.patch.gz

    It really just calls out to the OpenSSL library for the digest functions. So you have to build MySQL from source with OpenSSL support enabled.

    You can use the function in SQL syntax like:

    SELECT SHA2('message', 256);

    The second argument is 224, 256, 384, or 512, depending on what digest algorithm you want to use. If you pass 0 as the second argument, it uses SHA-256.
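
    To make the meaning of the second argument concrete, here is a minimal sketch in Python of the same dispatch, using the standard hashlib module. It is not the patch’s code: the helper name sha2 and the table _DIGESTS are made up for illustration, and raising ValueError for an unsupported length is an assumption, since the behavior for other values isn’t spelled out here.

```python
import hashlib

# Hypothetical lookup table mirroring the bit lengths listed above.
# It illustrates the described behaviour; it is not taken from the patch.
_DIGESTS = {
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}


def sha2(message, bits):
    """Return the hex digest of `message`, choosing the algorithm by bit length."""
    if bits == 0:
        bits = 256  # 0 is treated as SHA-256, as described above
    try:
        digest = _DIGESTS[bits]
    except KeyError:
        # Assumption: reject any other length with an error.
        raise ValueError("unsupported digest length: %r" % (bits,))
    return digest(message.encode("utf-8")).hexdigest()


print(sha2("message", 256))
print(len(sha2("message", 512)))  # hex characters, i.e. bits / 4
```

    Each hex digest is bits / 4 characters long, so a column meant to hold SHA-512 output in hex needs room for 128 characters.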

    This is my first code contribution to MySQL. I’d be grateful if someone wants to review it and let me know if it needs any changes.

    UPDATE 2/5/2007: I re-packaged the patch, excluding more of the MySQL generated files. Thanks to Stewart Smith @ MySQL for the suggestion.

    UPDATE 12/3/2010: MySQL 5.5.8 has been released for General Availability, including my SHA2() patch. Happy day!

  • Change == Opportunity

    Bob Field posted his reactions to the recent MySQL announcement to offer two versions of the MySQL Server product: Enterprise Server and Community Server.

    I feel somewhat similarly; the change has the potential to give greater value to both the corporate customers of MySQL and their community users. It will be interesting to see how this develops as we go forward.

    By making less frequent releases of MySQL, the company could put more work into quality control, and this could enable them to provide an overall better product. Not that they aren’t quite healthy and competitive in that area, but QC can always be improved.

    This would be good of course for MySQL’s Enterprise customers, because the product will become an even better technology for stable, secure, performant data management.

    But it could also benefit the open source community. The improvements that go into MySQL Enterprise will still be available for download in source form. Anyone can build the MySQL server to get these improvements at any time. Well, almost anyone. 🙂 It’s not that easy to build the sources. But recent documentation (tutorial 1, tutorial 2, tutorial 3) and tools have made it a lot easier to build MySQL from source.

    I don’t mind the reduction in frequency of prebuilt binaries. The number of people who eagerly upgrade to the latest binaries is a tiny percentage of the most dedicated power users, based on the types of questions I see online. Most people upgrade once every two or three years. The few who stay current can check out the code and build it, so they’ll be even more current than they have been in the past. There is a school of thought that power users of open-source software should be building from source anyway.

    I usually recommend that newbies use XAMPP or a similar prebuilt software stack. It will be interesting to see if XAMPP adopts MySQL ES or CS.

    How the division of the code develops over time is largely in the hands of community developers. One area that could benefit from contributions is the MySQL build tools.

    MySQL Server also needs support for popular IDE tools like Visual Studio and Eclipse. Also the Build Farm Initiative has started to address continuous integration systems. These are areas where the community can help greatly.

    But it’s also in the hands of MySQL AB, because they still control both the ES and CS code trees. There needs to be a more reliable process for community contributions to be integrated into the official code.

  • Catch-22 of the Active Database

    People frequently ask if they can do fancy things in triggers, such as writing to the filesystem, sending an email, or notifying other applications of data changes.

    I always recommend against doing things like this. Calling an external process from a trigger or UDF is very difficult to get right, and it is very easy to cause serious problems with your application.

    Say for instance that you want to send an email from within a trigger, to notify someone of a data change.

    An operation that spawned the trigger may be rolled back, or it may fail for another reason (violated a constraint, etc.). But the call out to the external process occurs anyway. So the email is sent, but then the database change turns out not to be committed. Thus someone has been notified of a change that effectively hasn’t happened.

    Even if the operation is successful, it may not be committed immediately. But triggers fire at the time of the operation, not the time of the commit. So the email could be received and the recipient goes to look for the new data. They might not find it, because it’s still pending a commit.
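
    The timing problem can be reproduced in miniature. The sketch below uses Python’s built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere), and a hypothetical send_email function registered as a SQL function so the trigger can call it. The names accounts, accounts_notify, and send_email are all invented for the demonstration.

```python
import sqlite3

sent = []  # stands in for emails that have already left the system


def send_email(account_id):
    # Hypothetical external side effect; unlike a row change, it cannot be rolled back.
    sent.append(account_id)
    return None


conn = sqlite3.connect(":memory:")
conn.create_function("send_email", 1, send_email)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TRIGGER accounts_notify AFTER INSERT ON accounts
    BEGIN
        SELECT send_email(NEW.id);
    END
    """
)
conn.commit()

# The trigger fires as part of the INSERT, before any commit happens.
conn.execute("INSERT INTO accounts (id, name) VALUES (1, 'alice')")
conn.rollback()  # the row is gone, but the "email" is not

rows = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print("rows committed:", rows)
print("emails sent:", len(sent))
```

    After the rollback the table is empty, yet the notification list still holds an entry: someone was told about a change that never took effect.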

    It’s better to perform notifications to external processes in your application code. After you have confirmed that the database change happened successfully, and the transaction was committed, you can notify other external applications directly.
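
    A minimal sketch of that ordering, again using Python’s sqlite3 module as a stand-in for MySQL. The functions notify and rename_account and the accounts table are hypothetical; the only point is that the notification happens after the commit has succeeded, and never on the failure path.

```python
import sqlite3

notifications = []


def notify(message):
    # Hypothetical stand-in for sending an email or calling another service.
    notifications.append(message)


def rename_account(conn, account_id, new_name):
    """Update one row, commit, and only then notify."""
    try:
        cur = conn.execute(
            "UPDATE accounts SET name = ? WHERE id = ?", (new_name, account_id)
        )
        if cur.rowcount != 1:
            raise LookupError("no such account: %r" % (account_id,))
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    # Reached only after the commit succeeded.
    notify("account %d renamed to %s" % (account_id, new_name))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO accounts (id, name) VALUES (1, 'alice')")
conn.commit()

rename_account(conn, 1, "alicia")
try:
    rename_account(conn, 2, "bob")  # no such row: rolled back, nothing sent
except LookupError:
    pass

print(notifications)
```

    One caveat with this design: if the process dies between the commit and the call to notify, the notification is lost, so it suits cases where a missed message is tolerable or can be detected later.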

    This doesn’t quite work for secondary operations, though. For instance, your app does an UPDATE, and the update trigger does another SQL operation, like INSERT. That INSERT operation is the one about which you want to notify some person or application. How can the application know that the INSERT was successful? How can your app know the values that were inserted, so that it can include those in the notification?

    Another similar example is if your app does an operation which affects other data via a foreign key declared with cascading referential integrity rules. For instance, your app does a DELETE, and a dependent table which refers to that table cascades the delete. The deletes on the dependent table are the changes about which you want to notify some person or application. How does the calling application know which tables were affected by cascading operations?

    I don’t have answers to these latter problems yet. But in most cases, your application does know the tables and rows on which it operates. In those cases, the application should generate notification events or perform other actions outside the database.

  • Working on SHA-2

    Several months ago, it became clear that one could crack a SHA-1 message digest. It was still a nontrivial problem, but it could be done thousands of times faster than brute-force guessing. So SHA-1 has become undesirable as a secure message digest, and U.S. federal software security standards now call for software to use SHA-256 (one of the group of algorithms which comprise SHA-2).

    MySQL currently has a built-in function to produce message digests with the SHA-1 algorithm, but not with SHA-2. Bug #13174 is logged at the MySQL site, but it seems they’re deprioritizing it.

    So I thought it would be nice to contribute some code to MySQL. I’ve been using it for about six years, and helping answer questions on newsgroups and forums, and I’ve also logged a few decent bugs. But I’ve never contributed code. How hard could it be? I’m no expert on implementing cryptography code, so I don’t want to write the code and get it wrong.

    Fortunately MySQL uses OpenSSL, which qualifies on NIST’s list of Validated FIPS 140-1 and 140-2 Cryptographic Modules. MySQL already relies on the OpenSSL library for the DES encryption and decryption algorithms.

    I’ve made good progress. I checked out the sources with BitKeeper, and built the MySQL 5.0 and 5.1 sources. I kept an unmodified copy of the tree so I could create diffs after I added my code. I followed the documentation on adding native functions to MySQL. I could have implemented it as a UDF, but I thought the proper fix for the lack of functionality would be a built-in function.

    I now have it working. One can apply a patch to the MySQL sources, and build the tree. If you configured MySQL with “--with-openssl”, then the SHA-2 function calls into the OpenSSL library to get the SHA-224, SHA-256, SHA-384, and SHA-512 message digest algorithms. Here’s an example demonstrating the usage:

    SELECT SHA2('plaintext string', 256);

    The first argument is a string expression. The second argument is one of 224, 256, 384, or 512, according to the bit length of the desired message digest. Other values cause the function to return an error.

    I ran tests using sample test vectors available from NIST. This is of course not a certification, but I can at least perform unpublishable validation. The test vectors include sample strings, and the expected hash values for SHA-1 through SHA-512. I wrote some shell scripts to help run through the tests, submitting the strings to SQL statements using “mysql -e”, and comparing the result to the test vector data, and also running the same data through standard command-line utilities like “sha256sum” as a double-check. Everything passes except the simplest case: a short, 8-bit value “00”. I suspect I just have the length of the input set wrong.
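
    The comparison loop can be sketched in a few lines of Python. This is not the shell scripts described above: hashlib stands in for the call to SHA2() through “mysql -e”, the helper name check is invented, and the three vectors are the widely published digests for “abc” and for the empty message rather than the full NIST files.

```python
import hashlib

# (algorithm, message bytes, expected hex digest)
VECTORS = [
    ("sha1", b"abc", "a9993e364706816aba3e25717850c26c9cd0d89d"),
    ("sha256", b"abc",
     "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"),
    ("sha256", b"",
     "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"),
]


def check(vectors):
    """Return the vectors whose computed digest differs from the expected one."""
    failures = []
    for algorithm, message, expected in vectors:
        actual = hashlib.new(algorithm, message).hexdigest()
        if actual != expected:
            failures.append((algorithm, message, expected, actual))
    return failures


failures = check(VECTORS)
print("failures:", failures)

# A zero-length message and a single zero byte are different inputs,
# so a vector's declared length matters as much as its hex data.
empty = hashlib.sha256(b"").hexdigest()
one_zero_byte = hashlib.sha256(bytes.fromhex("00")).hexdigest()
print("same digest:", empty == one_zero_byte)
```

    The last two lines only illustrate how a length mix-up could produce a mismatch on a very short input; whether that explains the failing “00” case is not established here.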

    The next step is to figure out how to convert these tests to integrate with MySQL’s own test framework.