Keeping It Simple

Author: Bill Karwin

Parrot Web Framework?
Wondering if the following idea could be feasible:
- Architect a web framework that emphasizes Inversion of Control.
- Implement the core web framework in Parrot (now that this dynamic language platform has released its 1.0).
- Voila! A web framework that supports any language implemented for the Parrot platform.
- Developers write plugins in any language: Python, Ruby, PHP, Perl6, Lua, C, or any other language supported on Parrot.
Although the Parrot platform is now 1.0, specific language implementations are still in various stages of development. Realistically, I would guess that it’ll be some years before the architecture above is ready for production.
March 19, 2009
How do the Proxy, Decorator, Adaptor, and Bridge Patterns differ?
A user recently asked:

I was looking at the Proxy Pattern, and to me it seems an awful lot like the Decorator, Adaptor, and Bridge Patterns. Am I misunderstanding something? What’s the difference? Why would I use the proxy pattern versus the others? How have you used them in the past in real world projects?

Proxy, Decorator, Adapter, and Bridge are all variations on “wrapping” a class. But their uses are different.
- Proxy could be used when you want to lazy-instantiate an object, or hide the fact that you’re calling a remote service, or control access to the object.
- Decorator is sometimes described as a “Smart Proxy.” This is used when you want to add functionality to an object, but not by extending that object’s type. This allows you to do so at runtime.
- Adapter is used when you have an abstract interface, and you want to map that interface to another object which has similar functional role, but a different interface.
- Bridge is very similar to Adapter, but we call it Bridge when you define both the abstract interface and the underlying implementation. I.e. you’re not adapting to some legacy or third-party code, you’re the designer of all the code but you need to be able to swap out different implementations.
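The four roles above can be sketched side by side in a few lines of Python. This is only an illustrative sketch: every class name here (`Report`, `LazyReportProxy`, `UppercaseDecorator`, `LegacyPrinter`, `PrinterAdapter`, `HtmlBackend`, `PlainBackend`, `Document`) is hypothetical and invented for the example, not taken from any library.

```python
class Report:
    """The 'real' object; imagine it is expensive to construct."""
    def render(self):
        return "report"


class LazyReportProxy:
    """Proxy: same interface, but defers creating the real object until first use."""
    def __init__(self):
        self._real = None

    def render(self):
        if self._real is None:
            self._real = Report()  # lazy instantiation
        return self._real.render()


class UppercaseDecorator:
    """Decorator: same interface, adds behavior at runtime without subclassing."""
    def __init__(self, wrapped):
        self._wrapped = wrapped

    def render(self):
        return self._wrapped.render().upper()


class LegacyPrinter:
    """Stand-in for legacy or third-party code: similar role, different interface."""
    def print_document(self):
        return "legacy document"


class PrinterAdapter:
    """Adapter: maps the render() interface onto LegacyPrinter's interface."""
    def __init__(self, printer):
        self._printer = printer

    def render(self):
        return self._printer.print_document()


class HtmlBackend:
    """One of two interchangeable implementations designed alongside Document."""
    def emit(self, text):
        return "<p>" + text + "</p>"


class PlainBackend:
    def emit(self, text):
        return text


class Document:
    """Bridge: the abstraction delegates to a swappable implementation."""
    def __init__(self, backend):
        self._backend = backend

    def render(self):
        return self._backend.emit("report")
```

In every case the caller only sees `render()`. What differs is the reason for the wrapper: deferring work, adding behavior, translating a foreign interface, or swapping implementations that were designed together.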
I’m posting to my blog the questions I’ve answered on StackOverflow, which earned the “Good Answer” badge. This was my answer to “How do the Proxy, Decorator, Adaptor, and Bridge Patterns differ?”
March 12, 2009
Can I Use Example Code from Internet Q&A Sites?
A user recently asked:
Scenario:
- A developer is working on a project and encounters a problem.
- They ask a question on the internet somewhere (e.g. stackoverflow.com)
- Someone answers their question and provides a nice code snippet that just about does what they want.
Where does one legally stand if the developer includes the code verbatim in their project’s code?

I know I’ve done this before…and I’m sure others have too…but I’d really like to know what the legal or ethical answer is to this question.
Note: never make business decisions based on legal advice you read on the internet — including mine! Confirm this with a qualified legal professional.

StackOverflow seems to offer its content under the Creative Commons Attribution-Share Alike 2.5 license. See the logo and link at the bottom of the page.

This means you are free to copy, distribute, and transmit any content you see here, or to remix or adapt the content.

However, you must attribute the work in the manner specified by the author (though I can’t find where this is specified on StackOverflow), and if you alter or build upon the content, you may distribute it only under a compatible license. So it’s similar to LGPL, in that respect.

I would recommend that you do not copy code or other content from StackOverflow verbatim, if you need to use it in a commercial project.
- Should one use the example code and move on?
  
  No. You need to be aware of the license and comply with it, or in theory you could get your project into trouble. Realistically, this is very unlikely. But it is still possible to cause problems.
- Should one use the example code and provide a comment referencing its origin?
  
  Yes, attribution is required by the Creative Commons license used by StackOverflow.
- Should one inform the provider of the example code that they’ve used their code?
  
  You are not obligated to do so, so it would be a courtesy and that’s up to you.
- Should one not use the example code at all and use the basic idea to create your own code?
  
  Yes, this would be a conservative and safe policy. Plus, it’s always better to understand how the code works, which you would have to in order to write code with similar functionality.
- Is it ok to use said example code in proprietary closed-source projects?
  
  No. The license would require you to make your project available under the same Creative Commons Share Alike (or compatible) license.
- Is it ok to use said example code in proprietary but internal-use-only projects?
  
  Technically, yes, because like GPL & LGPL, the licensing requirement only activates when you distribute a derived work. But how can you be sure that code from the internal-use-only project will never be duplicated into a product that gets distributed? Do you plan to annotate code fragments within your internal projects? I would not recommend relying on this.
- Are the legal implications currently undefined?
  
  No, they are spelled out in the Creative Commons license mentioned above.
I’m posting to my blog the questions I’ve answered on StackOverflow, which earned the “Good Answer” badge. This was my answer to “What legal issues can I run into if I use example code (say from stackoverflow) in my projects?”
March 10, 2009
Quantity Over Quality
Alex Netkachov recently reported a list of micro-optimizations for PHP. Several other bloggers (Sebastian, Maggie, Pádraic) responded with appropriate messages, reminding people that proper application design usually counts more than micro-optimizations.

They are all correct.

When I was an intern, I emailed a C compiler developer to ask a question that had occurred to me regarding optimization: which is faster, ++i or i++, assuming either form will work in my case, as in the increment expression of a for() loop? His response (paraphrased):

“By emailing me this question, you have already wasted more computing resources than you will ever save by choosing one form over the other during your entire programming career.”

(I’m still not sure if he meant to emphasize that the performance difference between the two expressions was extremely small, or that he didn’t think very highly of my career prospects. I prefer to assume the former.)

A list of performance factoids like the one Alex posted is missing the guidance and wisdom that software developers need to judge the importance of each item. All of the responses from other bloggers are similarly qualitative, instead of quantitative.

I know that it’s hard to make quantitative statements with regard to optimization.
- How much benefit can I get by replacing print with echo? It depends on how much printing you do in a given application — and also what else you’re doing in that application.
- Can I benefit from caching page output or results of resource-consuming functions? Probably, but not if the content is 100% dynamic and must be re-calculated for each request.
- Which of these micro-optimizations should I employ with greater priority? Which is the best use of my development time?
These micro-optimization tips are interesting and worth knowing, but they should also be taken with a grain of salt. Their importance varies, depending on the nature of your application. There are no magic words that are guaranteed to double performance in every application.

Finding the best way to optimize your code is your job as a software developer. You must use scientific measurement, as well as good judgment, experience, and intuition to get your job done most effectively.
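As one small illustration of what measurement can look like, here is a sketch in Python using the standard `timeit` module. The two functions being compared (`build_with_join`, `build_with_concat`) and the helper `measure` are hypothetical examples, not taken from any of the posts mentioned above. The numbers depend entirely on the machine and workload, which is the point.

```python
import timeit


def build_with_join(n):
    """Build a string of n digits using str.join."""
    return "".join(str(i % 10) for i in range(n))


def build_with_concat(n):
    """Build the same string using repeated concatenation."""
    result = ""
    for i in range(n):
        result += str(i % 10)
    return result


def measure(func, n=1000, repeat=5, number=200):
    """Return the best per-call time in seconds over several repeated runs."""
    timings = timeit.repeat(lambda: func(n), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    for func in (build_with_join, build_with_concat):
        print(f"{func.__name__}: {measure(func) * 1e6:.1f} microseconds per call")
```

A measurement like this says how large the difference is in isolation. Whether it matters still depends on how often the code runs compared with everything else the application does.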
March 10, 2009
Accepting a job that failed The Joel Test
A user recently asked:

I’m about to accept a job offer for a company that has failed The Joel Test with flying colors.

Now, my question is how do I improve the conditions there. I am positive that within a few months I will be able to make a difference.

But where do I start? And how?

Don’t view yourself as the “new sheriff in town” who’s here to clean it all up in one year. The habits they have settled into have been a long time forming.

Watch and listen, and ask questions about the most severe and recurring pain points. Find out what bad habits have actually caused loss of work, late nights, quality problems, or lost customers. Try to quantify the cost of these bad habits.

Then at some point talk to your boss in a one-on-one meeting and make a proposal for how you could mitigate one specific risk that seems to be their biggest problem. It could be almost anything on the Joel Test, but I’d guess it’s most likely to be one of:
- No source code control means the code is a mess, with lots of “commented out” sections. Can’t track which code changes were made for a given bug. It’s hard to do major features in parallel with ongoing maintenance. No way to roll back changes. No way to track which developer made which changes.
- No build process means some code changes exist only on the live server. Developers are constantly pushing and pulling code to and from the live server. No one has a development environment that’s in sync with the live code, so it’s hard to reproduce bugs.
- No bug database means some tasks “fall through the cracks” from time to time. Customers report bugs that fall into a black hole. Managers don’t know what’s being worked on. Employees have no record of their work when it comes time for annual reviews.
When presenting the solutions, don’t try to justify them with abstract concepts like “best practices” or “it’s the industry standard way” or anything so intangible. If those were enough to motivate this company, they would have done it by now.

Instead, focus on what their deciding factor is. I’d guess it’s probably related to how much time and money it costs the business to use best practices, versus how much it can save them. But you should find out if this is really their reason. It’ll take some setup work to establish these tools and practices, but you can explain the recurring benefits for quality, productivity, and predictability of the development work. All those can contribute to the business’s bottom line.

In one year, you’ll be doing extremely well if you can make just one change to help them. It’ll take a lot of patience to overcome a development culture that has been building for so long. Keep in mind that the rest of the team isn’t there by coincidence — they may actually be compatible with that level of disorganization.

I’m posting to my blog the questions I’ve answered on StackOverflow, which earned the “Good Answer” badge. This was my answer to “Accepting a job that failed The Joel Test”
March 6, 2009
Unit Test Coverage

S.Lott writes in his blog about unit test code-coverage: how much is enough?

Effective tests should account not only for code paths, but also input values and other application state or external environment that may affect the behavior.

For example, it may be easy to get 100% code coverage from tests for a function like the following:

divide(x, y) { return x/y; }

But unless you test for division-by-zero (when the parameter y is zero), you haven’t tested sufficiently.

The code-coverage metric doesn’t reveal whether you’ve tested a good variety of input values. It only measures whether your tests have visited the given lines of code, not what values were in each variable at the time.
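A minimal Python sketch of the same point follows. The test function names are hypothetical. Python happens to raise `ZeroDivisionError` for this case, while other languages may return infinity or crash instead, so the zero check would look different elsewhere.

```python
def divide(x, y):
    return x / y


def test_divide_typical():
    # This single test executes every line of divide(), so a line-coverage
    # tool would already report 100% for the function.
    assert divide(6, 3) == 2


def test_divide_by_zero():
    # Same lines, different input value: the coverage figure does not change,
    # yet this is the input most likely to cause trouble.
    try:
        divide(1, 0)
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected ZeroDivisionError")
```

Deleting the second test would leave the coverage report unchanged, which is why the metric alone cannot show that the zero case was ever exercised.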

Likewise for other application state besides input parameters. Values in other application objects, the contents of databases or files, or the operating system environment can all affect the proper functioning of a class or function that you’re testing. These variations are not measured by code-coverage metrics.

It could be argued that if you’re testing for external state, you aren’t doing unit testing by its strict definition; you’re doing functional or system testing. Nevertheless, most people rely chiefly on unit testing tools, because automated unit testing tools that generate code-coverage metrics are pretty easy to use.

While it’s a worthwhile goal to try to get high code-coverage in your unit-testing, a score of 100% doesn’t guarantee that you’ve tested enough. Likewise, a score below 100% isn’t necessarily an indication of inadequate testing. Code-coverage is therefore not a goal in itself; it’s one way of measuring one type of testing.

February 25, 2009
How Do You Reward Good Clients?
A user recently asked:

I find when I get a ‘good client’ things go so much smoother on a project (there even seems to be less bugs – weird?). I have a habit of rewarding good behavior from anyone (even if its just a simple thank you).

I am interested to know what sort of things you guys do, and even how you feel about good client behavior.

It would be nice if “good clients” were simply “normal clients,” and bad clients were those you avoid working for.

If you want to give them a more explicit and material reward, you could give them a “good customer” discount on the next project. My invoices specify a penalty for late payment, so I suppose you could also offer a small discount (2%) for extra-prompt payment.

Probably the best reward is to give them your best value for their dollar, and continue doing business with them. Work with them with equal respect, communication, and commitment to quality and value. Plus occasional free support, or advice on projects you’re not actually working on for them, etc.

It goes both ways. They know when they’ve got a good contractor who delivers quality work on time and charges fairly. They want to continue the relationship when they get good results. So a good customer won’t quibble about nickel-and-dime line items on invoices, pester you about delivery dates, or question your technology choices. They’ll accept your rates as economical if you actually give them good work, instead of skipping to another consultant who charges less but does poor work. They’ll also refer other clients your way (and probably not the ones they know will be annoying for you).

Really bad clients will get a bit more of a cold shoulder:
- “I’d really like to bid on your new project but I’m swamped with other work this quarter.”
- “That’ll be a rush order so I’m going to have to charge you a premium.”
- “Sorry, I was traveling for a few days and didn’t get your RFP.”
Unfortunately, bad clients are probably least able to read between the lines when they get this sort of message. It’s much easier for them to believe that all contractors are difficult to work with, than to accept that it’s they who are difficult.
I’m posting to my blog the questions I’ve answered on StackOverflow, which earned the “Good Answer” badge. This was my answer to “How do you reward your clients for good behavior?”
February 6, 2009
Splitting a String in Perl
A user recently asked:
How do I take a string in Perl and split it up into an array with entries two characters long each?

Ultimately I want to turn something like this
```
F53CBBA476
```
into an array containing
```
F5 3C BB A4 76
```
This was my answer:
```
# In list context, m//g returns a list of every successive match.
my @array = ( $string =~ m/../g );
```
The pattern-matching operator behaves in a special way in a list context in Perl. It processes the operation iteratively, matching the pattern against the remainder of the text after the previous match. Then the list is formed from all the text that matched during each application of the pattern-matching.
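For comparison, the same iterative matching can be sketched in Python rather than Perl. The helper name `split_pairs` is hypothetical. `re.findall` is used here because it also scans left to right and resumes after each match.

```python
import re


def split_pairs(text):
    """Return successive two-character chunks of text.

    Like Perl's m/../g in list context, re.findall resumes each match where
    the previous one ended. A trailing single character (odd-length input)
    does not match '..' and is dropped.
    """
    return re.findall(r"..", text)


print(split_pairs("F53CBBA476"))  # ['F5', '3C', 'BB', 'A4', '76']
```

The odd-length case is worth noticing: the last unpaired character is silently dropped, here and in the Perl one-liner.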

I’m posting to my blog the questions I’ve answered on StackOverflow, which earned the “Good Answer” badge. This was my answer to “How can I split a string into chunks of two characters each in Perl?”
January 21, 2009
Understanding Unfamiliar Databases
A user recently asked:
What kind of approaches and techniques can you employ to become familiar with an existing database if you are tasked with supporting and/or modifying it? How can you easily and effectively ramp up your knowledge of a database you have never seen before?

Here was my reply:
The first thing I do is create an Entity-Relationship Diagram (ERD). Sometimes you can simply describe the metadata with command-line tools, but to save time there are tools that can generate a diagram automatically.

Second, examine each table and column, and make sure I learn the meaning of what it stores.

Third, examine each relationship and make sure I understand how the tables relate to one another.

Fourth, read any views or triggers to understand custom data integrity enforcement or cascading operations.

Fifth, read any stored procedures. Also read SQL access privileges if there are such.

Sixth, read through parts of the application code that use the database. That’s where some additional business rules and data integrity rules are enforced.
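The first three steps (listing tables, columns, and relationships) can be partly automated by reading the database's own metadata. The following is a sketch for SQLite only, using Python's standard `sqlite3` module. The function name `describe_schema` and the `authors`/`books` tables are hypothetical. Other database brands expose metadata differently, for example through `INFORMATION_SCHEMA` views.

```python
import sqlite3


def describe_schema(conn):
    """Return {table: {"columns": [...], "foreign_keys": [...]}} for a SQLite database."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            (row[1], row[2])
            for row in conn.execute(f"PRAGMA table_info('{table}')")
        ]
        # PRAGMA foreign_key_list rows:
        # (id, seq, table, from, to, on_update, on_delete, match)
        foreign_keys = [
            (row[3], row[2], row[4])
            for row in conn.execute(f"PRAGMA foreign_key_list('{table}')")
        ]
        schema[table] = {"columns": columns, "foreign_keys": foreign_keys}
    return schema


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE books (
            id INTEGER PRIMARY KEY,
            title TEXT,
            author_id INTEGER REFERENCES authors(id)
        );
    """)
    for table, info in describe_schema(conn).items():
        print(table, info)
```

A listing like this only covers relationships that are declared as foreign keys. Relationships enforced only in application code still have to be found by reading that code, as in the sixth step.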
I’m posting to my blog the questions I’ve answered on StackOverflow, which earned the “Good Answer” badge. This was my answer to “What are the Best Ways to Understand an Unfamiliar Database?”
January 21, 2009
Why Should You Use an ORM?
A user recently asked for good arguments in favor of using Object/Relational Mapping technology:
If you were to motivate [sic] the “pro’s” of why you would use an ORM to management/client, what would the reasons be?

Try and keep one reason per answer so that we can see what gets voted up as the best reasons.

I offered four answers. The first three got the most votes, but my last answer got little interest.
1. Speeding development. For example, eliminating repetitive code like mapping query result fields to object members and vice-versa.
2. Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don’t have to.
3. Supporting OO encapsulation of business rules in your data access layer. You can write (and debug) business rules in your application language of preference, instead of clunky trigger and stored procedure languages.
4. Generating boilerplate code for basic CRUD operations (Create, Read, Update, Delete). Some ORM frameworks can inspect database metadata directly, read metadata mapping files, or use declarative class properties.
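To illustrate the first point, here is a toy Python sketch contrasting hand-written row-to-object mapping with a tiny generic mapper that derives the column list from the class definition. It is not a real ORM and does not reflect how any particular framework works. The `Customer` class, the `customers` table, and both function names are hypothetical.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Customer:
    id: int
    name: str
    email: str


def fetch_customer_by_hand(conn, customer_id):
    """The repetitive style: each column is copied to a member explicitly."""
    row = conn.execute(
        "SELECT id, name, email FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return None
    return Customer(id=row[0], name=row[1], email=row[2])


def fetch_by_id(conn, cls, table, row_id):
    """A toy generic mapper: the column list comes from the dataclass fields.

    The table name is interpolated into the SQL, so it must come from
    trusted code, never from user input.
    """
    columns = ", ".join(f.name for f in fields(cls))
    row = conn.execute(
        f"SELECT {columns} FROM {table} WHERE id = ?", (row_id,)
    ).fetchone()
    return None if row is None else cls(*row)
```

The generic version saves typing for every new class, which is the development-speed argument. It also hides the SQL that actually runs, which is where the objections further down come from.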
There are lots of other reasons for and against using ORM frameworks. Generally, I’m not a fan of ORMs, because their benefits don’t seem to make up for their complexity and tendency to perform slowly. Their chief value is in reducing the time taken in repetitive development tasks.

Hibernate, for example, is about 800,000 lines of code (Java and XML), and it’s complex enough that I doubt it’s easier to learn or to use than SQL. Besides, there seem to be fundamental tasks, such as a simple JOIN, that are impossible to do through the entity interface. Please correct me if I’m wrong, but I’ve been searching tutorials and examples and I haven’t found a way to fetch a joined result set from two entities, without writing a custom query in HQL (Hibernate’s abstract version of SQL).

I was also led to a blog by Glenn Block, titled “Ten advantages of an ORM (Object Relational Mapper).” I disagree with Block on several points. He cites some traits of ORMs as advantages where I see them as defects. He also cites features that are not specific to ORMs; they could be achieved with any type of data access library.

Update: Upon request, here are some specific comments on Glenn Block’s list of advantages of an ORM:

1. Facilitates implementing the Domain Model pattern

Not necessarily. I can design Domain Model classes containing plain SQL as easily as I can design classes that operate on the database via an ORM layer. Keep in mind that ActiveRecord is not a Domain Model.

2. Huge reduction in code.

Depends. When executing simple CRUD operations against a single table, yes. When executing complex queries, most ORM implementations fail spectacularly compared to the simplicity of using SQL queries.

3. Changes to the object model are made in one place.

This is not a benefit of an ORM. Many people use ORM interfaces inexpertly, so when the database structure changes, they still have to update many places in their application to reflect the change. But instead of redesigning SQL queries, they have to redesign usage of the ORM. There is no net win. They could structure their application using plain SQL queries and still be as likely to achieve the benefit of DRY.

4. Rich query capability.

Absolutely wrong.

5. You can navigate object relationships transparently.

This is definitely a negative rather than a positive. When you want a result set to include rows from dependent tables, do a JOIN. Doing the “lazy-load” approach, executing additional SQL queries internally when you reference columns of related tables, is usually less efficient. Leaving it up to the ORM internals deprives you of the opportunity to decide which solution is better.

6. Data loads are completely configurable …

This is not a benefit of an ORM. It is actually easier to achieve this using plain SQL.

7. Concurrency support.

Again, not a benefit of an ORM.

8. Cache management.

This has nothing to do with using an ORM. I can cache data using SQL.

9. Transaction management and Isolation.

Also has nothing to do with using an ORM versus a more direct DAL.

10. Key Management.

Ditto.
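The trade-off described under point 5 (lazy loading versus a JOIN) can be made concrete by counting queries. This is a sketch using Python's standard `sqlite3` module with hypothetical `authors` and `books` tables. The "lazy" function imitates by hand what a lazy-loading layer might do internally. Real ORMs differ in the details, and some cache or batch these lookups.

```python
import sqlite3


def make_db():
    """Create a small in-memory database with two related tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT,
                            author_id INTEGER REFERENCES authors(id));
        INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO books VALUES (1, 'A1', 1), (2, 'A2', 1), (3, 'B1', 2);
    """)
    return conn


def titles_lazy(conn):
    """Lazy-load style: one query for the books, then one more per book."""
    queries = 0
    result = []
    books = conn.execute(
        "SELECT title, author_id FROM books ORDER BY id"
    ).fetchall()
    queries += 1
    for title, author_id in books:
        (name,) = conn.execute(
            "SELECT name FROM authors WHERE id = ?", (author_id,)
        ).fetchone()
        queries += 1
        result.append((title, name))
    return result, queries


def titles_joined(conn):
    """JOIN style: the same rows from a single query."""
    rows = conn.execute(
        "SELECT b.title, a.name FROM books AS b "
        "JOIN authors AS a ON a.id = b.author_id ORDER BY b.id"
    ).fetchall()
    return rows, 1
```

With three books, the lazy version issues four queries and the JOIN issues one, for identical results. The gap grows with the number of rows, and it is invisible to a developer who only sees object navigation.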

I’m posting to my blog the questions I’ve answered on StackOverflow, which earned the “Nice Answer” or “Good Answer” badges. This was my answer to “Why Should You Use An ORM?”
January 21, 2009