<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Basil Vandegriend: Professional Software Development &#187; design</title>
	<atom:link href="http://www.basilv.com/psd/blog/category/design/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.basilv.com/psd</link>
	<description></description>
	<lastBuildDate>Wed, 25 Jan 2012 13:23:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Streaming Data to Reduce Memory Usage</title>
		<link>http://www.basilv.com/psd/blog/2011/streaming-data-to-reduce-memory-usage</link>
		<comments>http://www.basilv.com/psd/blog/2011/streaming-data-to-reduce-memory-usage#comments</comments>
		<pubDate>Thu, 05 May 2011 13:01:52 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[Hibernate]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=641</guid>
		<description><![CDATA[I recently performed a series of optimizations to reduce an application's memory usage. After completing several of these I noticed that there was a common theme to many of my optimizations that I could explicitly apply to help identify further opportunities for improvement. As a recurring solution, this qualifies as a design pattern which I [...]]]></description>
			<content:encoded><![CDATA[<p>I recently performed a series of optimizations to reduce an application's memory usage. After completing several of these I noticed that there was a common theme to many of my optimizations that I could explicitly apply to help identify further opportunities for improvement. As a recurring solution, this qualifies as a design pattern which I refer to as <em>Streaming Data</em>.</p>
<h3>Context</h3>
<p>This pattern applies when you need to process a significant volume of data but the processing can be done incrementally on small subsets of the data. A typical example is loading a list of entities and then iterating through the list to process each one. While the results (output) of processing can be combined across all the entities, it is important that the input to the processing only requires a small subset of all the data, and not the entire list of entities. A code example illustrating this problem context is shown below:</p>
<pre class=" prettyprint">
List&lt;Entity&gt; entities = loadEntities();
List&lt;ProcessingResult&gt; results = new ArrayList&lt;ProcessingResult&gt;();
for (Entity entity : entities) {
  ProcessingResult result = processEntity(entity);
  results.add(result);
}
</pre>
<h3>Solution</h3>
<p>Reducing the memory usage in the above example is based on the observation that loading the entire list of objects to process can consume a large amount of memory and is not necessary since we only use one object at a time. So the solution is to stream - incrementally retrieve - these objects instead of loading them all at once. For the consumer of this data the only change required is to first obtain a reference to the stream such as an <code>Iterable</code> that incrementally fetches data. Updating our prior code example results in the following (changed lines shown in green background):</p>
<pre class=" prettyprint">
<span style="background:#97FF77;">Iterable&lt;Entity&gt; entities = streamEntities();</span>
List&lt;ProcessingResult&gt; results = new ArrayList&lt;ProcessingResult&gt;();
for (Entity entity : entities) {
  ProcessingResult result = processEntity(entity);
  results.add(result);
}
</pre>
<h3>Examples</h3>
<p>The mechanism to use for streaming objects will depend on the source of the data and may require significant changes compared to a bulk load. Here are some specific examples.</p>
<h4>Parsing XML</h4>
<p><a href="http://www.basilv.com/psd/blog/2008/simple-xml-parsing-using-jaxb">Parsing XML files using JAXB</a> is a convenient approach for converting the entire file into a tree of Java objects, but it populates the entire tree at once. To instead stream such data use the SAX parser provided as part of the JAXP API. The SAX parser is event-based, which means that it iterates over the elements (and attributes) of your XML and for each item invokes callbacks you define.</p>
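<p>As a rough illustration of the callback style, here is a minimal sketch that counts elements while the parser streams through a document. The <code>entity</code> element name, the <code>EntityCounter</code> handler and the inline sample XML are assumptions made up for this example; a real handler would process each item rather than just count it.</p>
<pre class=" prettyprint">
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxStreamingExample
{
  // Hypothetical handler: invoked once per element as the parser streams.
  static class EntityCounter extends DefaultHandler {
    int count = 0;

    @Override
    public void startElement(String uri, String localName,
        String qName, Attributes attributes) {
      if ("entity".equals(qName)) {
        count++; // Process the element here; nothing is retained.
      }
    }
  }

  static int countEntities(String xml) throws Exception {
    SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    EntityCounter handler = new EntityCounter();
    parser.parse(new ByteArrayInputStream(
      xml.getBytes(StandardCharsets.UTF_8)), handler);
    return handler.count;
  }

  public static void main(String[] args) throws Exception {
    String xml = "&lt;entities&gt;&lt;entity id=\"1\"/&gt;"
      + "&lt;entity id=\"2\"/&gt;&lt;/entities&gt;";
    System.out.println(countEntities(xml));
  }
}
</pre>
<p>Running this prints 2. The handler only ever sees one element at a time, so memory usage does not grow with the size of the file.</p>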
<h4>Querying Databases using Hibernate</h4>
<p>When using Hibernate to query for a collection of entities it is convenient to simply ask Hibernate for the entire collection. A typical example of doing this using the query by criteria API within Hibernate is below:</p>
<pre class=" prettyprint">
public List&lt;Entity&gt; queryData() {
  Criteria criteria = session.createCriteria(Entity.class);
  // Add appropriate restrictions
  // ...
  List&lt;Entity&gt; result = criteria.list();
  return result;
}
</pre>
<p>When the criteria returns a large volume of data, however, this approach will consume a large amount of memory. Instead use the <code>scroll</code> method on <code>Criteria</code> to return a <code>ScrollableResults</code> instance that can be used to iterate through the results. If you prefer to not expose the rest of the application to Hibernate classes, you can wrap the <code>ScrollableResults</code> in a special implementation of <code>Iterator</code> (which I leave as an exercise to the reader). The revision of the above example using streaming looks like the following (changed lines shown in green background):</p>
<pre class=" prettyprint">
<span style="background:#97FF77;">public Iterator&lt;Entity&gt; queryData() {</span>
  Criteria criteria = session.createCriteria(Entity.class);
  // Add appropriate restrictions
  // ...
<span style="background:#97FF77;">  ScrollableResults scrollableResults = criteria.scroll();</span>
<span style="background:#97FF77;">  Iterator&lt;Entity&gt; result = new ScrollableResultsIterator(scrollableResults);</span>
  return result;
}
</pre>
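<p>One possible shape for such an iterator wrapper is sketched below. To keep the sketch runnable without Hibernate on the classpath it is written against a hypothetical <code>Cursor</code> interface rather than <code>ScrollableResults</code> itself. It assumes the real class offers equivalent <code>next()</code> and <code>get(int)</code> methods, which should be checked against the Hibernate version in use. All class and method names here are illustrative only.</p>
<pre class=" prettyprint">
import java.util.Iterator;
import java.util.NoSuchElementException;

public class CursorIteratorExample
{
  // Hypothetical stand-in for the parts of ScrollableResults used here.
  interface Cursor {
    boolean next();         // Advance a row; false when none remain.
    Object get(int column); // Value from the current row.
  }

  static class CursorIterator&lt;T&gt; implements Iterator&lt;T&gt; {
    private final Cursor cursor;
    private boolean lookedAhead = false;
    private boolean hasRow = false;

    CursorIterator(Cursor cursor) {
      this.cursor = cursor;
    }

    public boolean hasNext() {
      if (!lookedAhead) {
        hasRow = cursor.next(); // Fetch at most one row ahead.
        lookedAhead = true;
      }
      return hasRow;
    }

    @SuppressWarnings("unchecked")
    public T next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      lookedAhead = false;
      return (T) cursor.get(0);
    }

    public void remove() {
      throw new UnsupportedOperationException();
    }
  }

  // In-memory cursor used only to exercise the iterator.
  static Cursor arrayCursor(final Object[] rows) {
    return new Cursor() {
      private int index = -1;

      public boolean next() {
        if (index &lt; rows.length) {
          index++;
        }
        return index &lt; rows.length;
      }

      public Object get(int column) {
        return rows[index];
      }
    };
  }

  public static void main(String[] args) {
    Iterator&lt;String&gt; names = new CursorIterator&lt;String&gt;(
      arrayCursor(new Object[] {"first", "second"}));
    while (names.hasNext()) {
      System.out.println(names.next());
    }
  }
}
</pre>
<p>The look-ahead flag is needed because a cursor of this kind advances and reports availability in a single call, whereas <code>Iterator</code> separates <code>hasNext()</code> from <code>next()</code>.</p>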
<p>This scroll approach only works when all the data can be processed within the same database transaction since the Hibernate session must remain open for the <code>ScrollableResults</code> to be able to continue fetching data. If this is not suitable then another option is to load the data using multiple queries that each return a subset of the data. One common example of this is when displaying search results to a user. Rather than showing all the results (which may number in the hundreds or thousands) show one page at a time and let the user step through the various pages of results. Due to the frequency with which this occurs I refer to this solution as <em>paging</em>. Implementing this in Hibernate using the query by criteria API is fairly simple:</p>
<ol>
<li>Start by creating your criteria object and defining its restrictions as you normally would.</li>
<li>Apply an ordering to the criteria. It is best if this ordering is consistent, by which I mean that database updates or inserts between queries will not result in invalid or unexpected results being returned. This assumes each query for a page executes in a separate database transaction which provides no guarantees of transactional isolation for the group of queries as a whole. In some contexts, consistency is not required. If it is then I prefer to use an auto-incrementing surrogate primary key as the field to sort by in order to achieve the highest level of consistency. </li>
<li>Apply restrictions to retrieve only the specific page. This is done using the methods <code>setFirstResult</code> and <code>setMaxResults</code> on the <code>Criteria</code> object.</li>
</ol>
<h3>Consequences</h3>
<p>One potential consequence of streaming data is a reduction in performance because data is loaded piece by piece rather than in bulk. To mitigate this, the solution is to use what I call <em>loading sets</em>: define subsets of the total data volume that are small enough to not impact memory usage but large enough to minimize performance impacts. Then load the data one set at a time. The consuming API does not need to change: it can still iterate or stream over each loaded set, and then fetch the next set once the current one is exhausted.</p>
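<p>A minimal sketch of an iterator built on loading sets is shown below. The <code>SetLoader</code> interface and the in-memory <code>rangeLoader</code> are hypothetical names invented for this example. With Hibernate, a loader could plausibly be backed by the <code>setFirstResult</code> and <code>setMaxResults</code> methods described earlier, but that wiring is not shown here.</p>
<pre class=" prettyprint">
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LoadingSetExample
{
  // Hypothetical loader that returns one set of at most maxResults items.
  interface SetLoader&lt;T&gt; {
    List&lt;T&gt; load(int firstResult, int maxResults);
  }

  static class LoadingSetIterator&lt;T&gt; implements Iterator&lt;T&gt; {
    private final SetLoader&lt;T&gt; loader;
    private final int setSize;
    private Iterator&lt;T&gt; current = Collections.&lt;T&gt;emptyList().iterator();
    private int nextFirstResult = 0;
    private boolean lastSetLoaded = false;

    LoadingSetIterator(SetLoader&lt;T&gt; loader, int setSize) {
      if (setSize &lt; 1) {
        throw new IllegalArgumentException("setSize must be positive");
      }
      this.loader = loader;
      this.setSize = setSize;
    }

    public boolean hasNext() {
      // Fetch the next set only once the current one is exhausted.
      while (!current.hasNext() &amp;&amp; !lastSetLoaded) {
        List&lt;T&gt; set = loader.load(nextFirstResult, setSize);
        nextFirstResult += set.size();
        lastSetLoaded = set.size() &lt; setSize; // A short set is the last one.
        current = set.iterator();
      }
      return current.hasNext();
    }

    public T next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return current.next();
    }

    public void remove() {
      throw new UnsupportedOperationException();
    }
  }

  // In-memory loader over the numbers 0 to total - 1, for demonstration.
  static SetLoader&lt;Integer&gt; rangeLoader(final int total) {
    return new SetLoader&lt;Integer&gt;() {
      public List&lt;Integer&gt; load(int firstResult, int maxResults) {
        List&lt;Integer&gt; set = new ArrayList&lt;Integer&gt;();
        for (int i = firstResult; i &lt; total &amp;&amp; set.size() &lt; maxResults; i++) {
          set.add(i);
        }
        return set;
      }
    };
  }

  static int sum(Iterator&lt;Integer&gt; numbers) {
    int total = 0;
    while (numbers.hasNext()) {
      total += numbers.next();
    }
    return total;
  }

  public static void main(String[] args) {
    // Seven items loaded three at a time: sets of 3, 3 and 1.
    System.out.println(
      sum(new LoadingSetIterator&lt;Integer&gt;(rangeLoader(7), 3)));
  }
}
</pre>
<p>This prints 21. At most one set is held in memory at a time, and the set size is the knob for trading memory usage against the number of round trips.</p>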
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2011/streaming-data-to-reduce-memory-usage/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Exploring Mental Processes behind Developing Software</title>
		<link>http://www.basilv.com/psd/blog/2010/exploring-mental-processes-behind-developing-software</link>
		<comments>http://www.basilv.com/psd/blog/2010/exploring-mental-processes-behind-developing-software#comments</comments>
		<pubDate>Wed, 14 Jul 2010 14:00:35 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=526</guid>
		<description><![CDATA[How do you go about designing and coding software? More specifically, what is your mental process for accomplishing this? Becoming more aware of the approach you use allows you to deliberately control and improve it. Mental thought processes are, however, very intangible and difficult to put into words. In the software development literature much has [...]]]></description>
			<content:encoded><![CDATA[<p>How do you go about designing and coding software? More specifically, what is your mental process for accomplishing this? Becoming more aware of the approach you use allows you to deliberately control and improve it. Mental thought processes are, however, very intangible and difficult to put into words. In the software development literature much has been written about how to go about design and coding ranging from a naive object-oriented design approach (find the nouns in the requirements) to <a href="http://www.basilv.com/psd/blog/2009/test-driven-development-benefits-limitations-and-techniques">test-driven development</a> (TDD). These approaches, however, deal with tangible actions to be performed rather than the thinking that must necessarily occur. </p>
<p>So I decided it would be an interesting challenge to write about my mental processes during design and coding that I hope will prove beneficial for you. I use a recent development task I worked on as a case study for examining my thinking. The task involved designing and coding a framework for generating a set of reports for an application. In my analysis I identify a number of mental 'stages' that I used, which together loosely comprise my mental process map. I first present a narrative of my thinking during this case study with references to these stages identified in bold and then provide some concluding thoughts. </p>
<h3>Case Study Narrative</h3>
<p>I start by <strong>uploading</strong> the problem domain into my working memory – reviewing all relevant requirements and any upfront or preexisting design. I am not trying to gain comprehensive knowledge at this point – there are many specific, incidental details that are not relevant to the big picture that I can safely ignore. I instead focus on the significant elements that will feed into my initial design work:</p>
<ul>
<li><em>Domain Model</em>: What concepts or data does the system need to manipulate or store?</li>
<li><em>Process Model</em>: What operations or events occur? How do they make use of the data?</li>
<li><em>Constraints</em>: What are the main constraints with respect to what I need to build? What do I need to consider or watch out for?</li>
</ul>
<p>The initial upload raises a number of questions, points requiring clarification, and ideas for improvement (of the requirements and existing design). I consult with the business analyst and business team, using face-to-face conversation if possible, to gain clarity. Simultaneously I am thinking about the key <strong>concepts</strong> that I will use in the solution to address the required functionality. I use these concepts to assemble an initial domain model and process model – mostly in my head initially, perhaps supported by some sketches or doodles on a few pieces of paper. </p>
<p>As I iterate between understanding and clarifying the requirements and developing the concepts and models, I begin to balance competing requirements through a series of <strong>trade-offs</strong>. The most frequent trade-off is between minimizing design complexity and development cost versus fully providing the requested functionality. Having face-to-face meetings with the business helps me identify soft versus hard constraints and requirements. Soft ones can potentially be discarded through negotiation while hard ones are mandatory and must be addressed. </p>
<p>At this point I feel comfortable about certain parts of the design and feel fairly confident it will work, while for other parts I am still left with an uncomfortable feeling that further thinking does not help alleviate. This pushes me into an <strong>exploration</strong> mode in which I start writing code for the pieces I am uncertain about. I do not try to write complete, production-quality code. I instead do what I call 'design-level' coding where I define interfaces or classes with important method signatures, but with no real implementations for the methods – perhaps just some pseudo-code. At this point I find it hard to do TDD on non-trivial methods as method signatures and even the classes and interfaces can change dramatically. I leave lots of <a href="http://www.basilv.com/psd/blog/2010/using-to-do-comments-in-code">to-do comments</a> in the code about specific questions or issues regarding specific functionality or design elements that are not relevant for the big picture I am working on. What I am looking for is significant gaps in my solution that may require additional clarification of requirements, or further refinement of the concepts and models I came up with earlier.</p>
<p>Throughout the entire process and especially during these initial stages there are times when I need a <strong>mental shift</strong>. This is usually when I am undecided how to resolve a particular design issue or when I feel mentally fatigued. I use a number of different strategies. One is to simply change location – get out of my cubicle and walk around. Another is to change activities to work on something unrelated and mentally less taxing in order to recharge. For thorny design issues I find that sleeping on them is a great way of letting the subconscious work on the problem and help arrive at a resolution.</p>
<p>At some point I feel that I have resolved all of the big uncertainties so I begin converting my separate chunks of design-level code into a unified set of working production-quality code. I call this <strong>consolidation</strong>. I usually begin by doing a sweep through the design-level code to resolve the outstanding to-dos. This helps identify any outstanding requirements clarifications that I need. I then switch to coding the functionality class by class and method by method using mostly strict TDD. Development feels slow at first because I need to write many utility methods or helper methods (for either the production code or for the test code), or refactor them into existence out of duplicate code that I introduce. But using TDD gives me that satisfying sense of progress as I slam out one fully-tested method after another. </p>
<p>Once the first draft of the code is written I <strong>polish</strong> it to ensure a high level of <a href="http://manifesto.softwarecraftsmanship.org/">craftsmanship</a>. This involves aspects such as renaming classes, methods and variables to ensure good readability, refactoring to eliminate duplication, and commenting when appropriate to ensure good maintainability. For more information on why and how to polish code see my article <a href="http://www.basilv.com/psd/blog/2007/why-you-should-polish-your-code">Why you should polish your code</a>.</p>
<p>After reaching code-complete on the functionality I switch to <strong>feedback</strong> mode. I have two goals. The first goal is to pass the code through as many quality checks as I can to identify and eliminate defects. This includes asking for a peer code review, reviewing the results of static code analysis tools, verifying sufficient coverage by automated tests, adding automated integration tests, and performing manual functional testing. This list does not include automated unit testing because I have already done this concurrent with the coding. The second goal is to put the code to use to identify functional gaps, usability issues, and operational issues relating to non-functional attributes such as monitoring / logging, error handling, and performance. This second goal is especially relevant for infrastructure code, where putting the code to use generally means coding business functionality that exercises the infrastructure. Although I go into the feedback stage with usually ~95% code coverage from my automated unit tests, I do expect to discover and fix a few issues. As I progress through the stage, the code gradually stabilizes. At the end, it has reached <a href="http://www.basilv.com/psd/blog/2009/my-definition-of-done">feature-done</a> status which means I consider it production-quality code ready for final testing.</p>
<h3>Discussion</h3>
<p>The process I have described may seem like it consists of discrete steps with a linear transition from start to finish, but it is anything but that. Each 'step' is fuzzy, blurring from one to the next. There are multiple transitions going back and forth between steps. Different portions of the functionality can simultaneously be in drastically different steps – I might be polishing one class while in the midst of consolidating a second and in exploration mode for a third. I like to characterize the actual process flow as <a href="http://en.wikipedia.org/wiki/Chaordic">chaordic</a> - a blend of chaos and order based on balancing creativity with discipline. </p>
<p>The process may give an impression of a big-design-up-front approach, which is inaccurate for two reasons. First, I consider <a href="http://www.basilv.com/psd/blog/2008/the-source-code-is-the-design">coding to be an act of design</a>, so I am really designing throughout the entire process. Second, I do believe in doing an appropriate amount of thinking and analysis (what some call design) prior to starting coding. The amount needed depends on the size and complexity of the problem to be solved and my current understanding of it. For simple, straightforward problems I may only spend a few minutes doing this, but those few minutes will include upload, trade-off, and exploration activities prior to diving into the coding.</p>
<p>In conversations with experienced developers I have noticed some correlations between how they describe the way they develop and my process, but there are also differences. So I am interested to hear what you think of this process and how it may match or differ from yours.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2010/exploring-mental-processes-behind-developing-software/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Exposing Mutable Objects as Public Properties</title>
		<link>http://www.basilv.com/psd/blog/2008/exposing-mutable-objects-as-public-properties</link>
		<comments>http://www.basilv.com/psd/blog/2008/exposing-mutable-objects-as-public-properties#comments</comments>
		<pubDate>Mon, 11 Aug 2008 14:09:25 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=129</guid>
		<description><![CDATA[I recently had an interesting design discussion with a coworker in which we discussed the pros and cons of exposing mutable objects as public properties of a class. This article provides my thoughts on the subject. An immutable class (or object) is one whose state cannot be changed once the instance is constructed. Mutable objects [...]]]></description>
			<content:encoded><![CDATA[<p>I recently had an interesting design discussion with a coworker in which we discussed the pros and cons of exposing mutable objects as public properties of a class. This article provides my thoughts on the subject.</p>
<p>An immutable class (or object) is one whose state cannot be changed once the instance is constructed. Mutable objects do allow state changes, typically via setter methods. </p>
<p>Below is an example of such a mutable class:</p>
<pre class="prettyprint">
import java.util.Calendar;

public class Order
{
  private Calendar date = Calendar.getInstance();

  public Calendar getDate() {
    return date;
  }

  public void setDate(Calendar calendar) {
    this.date = calendar;
  }
}
</pre>
<p>The public property <code>date</code> uses Calendar as its type. Calendar is itself a mutable class – it has methods such as setters that modify the state of an instance. So what is the issue? Consider the following code:</p>
<pre class="prettyprint">
  public static void exampleOne(Order order) {
    Calendar cal = order.getDate();
    // Is order past due?
    cal.add(Calendar.DAY_OF_YEAR, -10);
    if (cal.before(Calendar.getInstance())) {
      // Order past due logic...
    }
  }
</pre>
<p>Can you spot the problem? I must admit that I failed to notice anything wrong at first glance. Here is the same example instrumented with print statements. Can you predict what will happen when you run it?</p>
<pre class="prettyprint">
import java.text.DateFormat;
import java.util.*;

public class ExampleTwo
{
  public static void exampleTwo(Order order) {
    print("Original order date = "
      + convertToText(order.getDate()));
    Calendar cal = order.getDate();

    // Is order past due?
    cal.add(Calendar.DAY_OF_YEAR, -10);
    if (cal.before(Calendar.getInstance())) {
      // Order past due logic...
    }
    print("Ending order date = "
      + convertToText(order.getDate()));
  }

  private static void print(String message) {
    System.out.println(message);
  }

  private static String convertToText(Calendar calendar) {
    return DateFormat.getDateInstance().format(
      new Date(calendar.getTimeInMillis()));
  }

  public static void main(String[] args) {
    Order order = new Order();
    exampleTwo(order);
  }
}
</pre>
<p>Below is the console output from executing this code:</p>
<pre class="box">
Original order date = 8-Aug-2008
Ending order date = 29-Jul-2008
</pre>
<p>This clearly shows the problem: the date contained in the <code>Order</code> instance is changed within the <code>exampleTwo</code> method, which is likely incorrect behavior. The trap for a developer using the <code>Order</code> class (i.e. to write <code>exampleTwo()</code>) is that they would not necessarily consider or realize that changing the <code>Calendar</code> object returned by <code>order.getDate()</code> would change the value within the instance. The <code>Order</code> class is correct (no defects) but the implementation is unsafe because a mutable object is returned from the getter. How can this be addressed?</p>
<p>One solution for this particular example is to take the order-past-due check and turn it into a method on the <code>Order</code> class that does not change the date. While I would likely employ this solution, it unfortunately does not address the underlying issue. Even if the <code>getDate()</code> method is removed, the underlying problem can occur with the <code>setDate()</code> method. The following example demonstrates this:</p>
<pre class="prettyprint">
import java.text.DateFormat;
import java.util.*;

public class ExampleThree
{
  public static void exampleThree() {
    Calendar calendar = Calendar.getInstance();
    Order firstOrder = new Order();
    Order secondOrder = new Order();
    firstOrder.setDate(calendar);

    calendar.add(Calendar.DAY_OF_YEAR, 10);
    secondOrder.setDate(calendar);

    print("First date = "
      + convertToText(firstOrder.getDate()));
    print("Second date = "
      + convertToText(secondOrder.getDate()));
  }

  private static void print(String message) {
    System.out.println(message);
  }

  private static String convertToText(Calendar calendar) {
    return DateFormat.getDateInstance().format(
      new Date(calendar.getTimeInMillis()));
  }

  public static void main(String[] args) {
    exampleThree();
  }
}
</pre>
<p>The console output is:</p>
<pre class="box">
First date = 19-Aug-2008
Second date = 19-Aug-2008
</pre>
<p>The same instance of <code>Calendar</code> is assigned to both <code>Order</code> instances, so when the calendar's date value is changed for assignment to the <code>secondOrder</code> instance, it also changes within the <code>firstOrder</code> instance. This is likely incorrect behavior which occurs because the setter directly stores the provided mutable object. So we end up with two references to <code>Calendar</code>, one in each of the <code>Order</code> instances, that point to the same instance. This is called <a href="http://en.wikipedia.org/wiki/Aliasing_(computing)">aliasing</a>. </p>
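<p>Returning briefly to the getter case, the order-past-due method mentioned earlier could look something like the sketch below. The method name <code>isPastDue</code> and the use of <code>clone()</code> to work on a copy are my assumptions for illustration; the check itself simply mirrors the one in <code>exampleOne</code>.</p>
<pre class="prettyprint">
import java.util.Calendar;

public class Order
{
  private Calendar date = Calendar.getInstance();

  public Calendar getDate() {
    return date;
  }

  // Same check as exampleOne, but performed on a copy so that
  // the order's own date is left untouched.
  public boolean isPastDue() {
    Calendar cal = (Calendar) date.clone();
    cal.add(Calendar.DAY_OF_YEAR, -10);
    return cal.before(Calendar.getInstance());
  }

  public static void main(String[] args) {
    Order order = new Order();
    long before = order.getDate().getTimeInMillis();
    order.isPastDue();
    // The stored date has not moved.
    System.out.println(before == order.getDate().getTimeInMillis());
  }
}
</pre>
<p>This prints true, but as noted it only protects this one check: callers can still modify the object returned by <code>getDate()</code>.</p>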
<p>By now you are perhaps convinced that getters and setters should never deal with mutable objects. It is unfortunately not that simple. Consider the following example:</p>
<pre class="prettyprint">
public class Order
{
  private Customer customer;

  public Customer getCustomer() {
    return customer;
  }

  public void setCustomer(Customer customer) {
    this.customer = customer;
  }
}

public class Customer
{
  private String name;

  public String getName() {
    return name;
  }

  public void setName(String name) {
    this.name = name;
  }
}

public class ExampleFour
{
  public static void exampleFour(Order order) {

    print("Original order customer name = "
      + order.getCustomer().getName());

    String newName = "New name";
    Customer customer = order.getCustomer();
    customer.setName(newName);

    print("Final order customer name = "
      + order.getCustomer().getName());
  }

  private static void print(String message) {
    System.out.println(message);
  }

  public static void main(String[] args) {
    Customer customer = new Customer();
    customer.setName("Starting name");
    Order order = new Order();
    order.setCustomer(customer);
    exampleFour(order);
  }
}
</pre>
<p>The console output is:</p>
<pre class="box">
Original order customer name = Starting name
Final order customer name = New name
</pre>
<p>In this case it seems perfectly reasonable to update the customer's name and have this change remembered within the <code>Order</code> instance. Before I explain the difference between these scenarios, I will show one more example involving a collection property:</p>
<pre class="prettyprint">
import java.util.*;

public class Order
{
  private Customer customer;

  public Customer getCustomer() {
    return customer;
  }

  public void setCustomer(Customer customer) {
    this.customer = customer;
  }
}

public class Customer
{
  private List&lt;Order&gt; orders = new ArrayList&lt;Order&gt;();

  public List&lt;Order&gt; getOrders() {
    return orders;
  }

  public void addOrder(Order order) {
    if (order == null) {
      return;
    }
    orders.add(order);
    order.setCustomer(this);
  }
}
</pre>
<p>In this case, calling <code>customer.getOrders().add(order)</code> might seem reasonable but is not correct as this will not invoke the extra logic in <code>customer.addOrder(order)</code> to call <code>order.setCustomer()</code>. </p>
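<p>To make the difference concrete, here is a small self-contained sketch that adds one order each way. The class name <code>ExampleFive</code> and the nested copies of <code>Order</code> and <code>Customer</code> are only there to keep the listing runnable on its own.</p>
<pre class="prettyprint">
import java.util.ArrayList;
import java.util.List;

public class ExampleFive
{
  static class Order {
    private Customer customer;

    public Customer getCustomer() {
      return customer;
    }

    public void setCustomer(Customer customer) {
      this.customer = customer;
    }
  }

  static class Customer {
    private List&lt;Order&gt; orders = new ArrayList&lt;Order&gt;();

    public List&lt;Order&gt; getOrders() {
      return orders;
    }

    public void addOrder(Order order) {
      if (order == null) {
        return;
      }
      orders.add(order);
      order.setCustomer(this);
    }
  }

  private static void print(String message) {
    System.out.println(message);
  }

  public static void main(String[] args) {
    Customer customer = new Customer();

    Order viaAddOrder = new Order();
    customer.addOrder(viaAddOrder);

    Order viaGetter = new Order();
    customer.getOrders().add(viaGetter); // Bypasses addOrder().

    print("Customer set via addOrder = "
      + (viaAddOrder.getCustomer() == customer));
    print("Customer set via getter = "
      + (viaGetter.getCustomer() == customer));
  }
}
</pre>
<p>The first line reports true and the second false: both orders end up in the customer's list, but only the one added through <code>addOrder()</code> knows its customer.</p>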
<p>The following table summarizes the above discussion.</p>
<table class="fancy" cellspacing="0">
<tr>
<th>Property Type</th>
<th>Expect Modification of Original Object When Getter Result is Changed?</th>
<th>Is Actual Behavior Correct?</th>
</tr>
<tr>
<td>Calendar</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Customer</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr>
<td>List</td>
<td>Maybe</td>
<td>No</td>
</tr>
</table>
<p>Given that the code for the <code>Calendar</code> property compared to the <code>Customer</code> property is basically identical, why should we have different expectations of their behavior? The reason has to do with the nature of the two types. <code>Calendar</code> is a <a href="http://martinfowler.com/eaaCatalog/valueObject.html">Value Object</a> - a simple object whose equality is based on its value rather than its identity. Changing a calendar means ending up with a new date that is not equal to the original. <code>Customer</code> is a Reference Object – an object with business meaning whose identity is the basis for equality. Two customers can have the same values but be distinct. Changing a customer is just that – a change to that customer that should propagate throughout the system. For a fuller discussion of value objects see page 486 of the book <a href="http://www.amazon.ca/gp/product/0321127420?ie=UTF8&#038;tag=basilvandegri-20&#038;linkCode=as2&#038;camp=15121&#038;creative=330641&#038;creativeASIN=0321127420">Patterns of Enterprise Application Architecture</a> by Martin Fowler. This discussion includes a short section on the risk of encountering aliasing defects if a value object is mutable and recommends that they be immutable.</p>
<p>I agree with this recommendation and I think it should be a design principle for classes: class properties that are value objects should be immutable. Many typical value object types in Java such as <code>String</code> and <code>Long</code> are already immutable, but we have already seen that others such as <code>Calendar</code> and <code>List</code> are not. What are the options in such circumstances? I will address this question in a future article.</p>
<p>The source code listed in this article is provided in the <em>Java Examples</em> project which can be downloaded from the <a href="http://www.basilv.com/psd/software">Software</a> page.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2008/exposing-mutable-objects-as-public-properties/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Source Code is the Design</title>
		<link>http://www.basilv.com/psd/blog/2008/the-source-code-is-the-design</link>
		<comments>http://www.basilv.com/psd/blog/2008/the-source-code-is-the-design#comments</comments>
		<pubDate>Fri, 01 Aug 2008 14:00:53 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[process]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=122</guid>
		<description><![CDATA[I first came across the thought-provoking article What Is Software Design? by Jack Reeves as an appendix titled "The Source Code Is the Design" in the book Agile Software Development: Principles, Patterns, and Practices. The article was written in 1992 so ignore the references to C++ (I mentally translated them to Java) and instead focus [...]]]></description>
			<content:encoded><![CDATA[<p>I first came across the thought-provoking article <a href="http://www.developerdotstar.com/mag/articles/reeves_design.htm">What Is Software Design?</a> by Jack Reeves as an appendix titled "The Source Code Is the Design" in the book <a href="http://www.amazon.ca/exec/obidos/redirect?link_code=as2&#038;path=ASIN/0135974445&#038;tag=basilvandegri-20&#038;camp=15121&#038;creative=330641">Agile Software Development: Principles, Patterns, and Practices</a>. The article was written in 1992 so ignore the references to C++ (I mentally translated them to Java) and instead focus on what Jack is saying about the nature of design in software development. I recommend starting with the summary at the end of the article.</p>
<p>In traditional software development methodologies such as the waterfall methodology, software design is an explicit phase that produces a design document as an output of that activity. The design is usually completed before coding starts. Jack takes a contrary view: the main thesis of his article is that "final source code is the real software design", and most of the article is dedicated to exploring the ramifications of this thesis. The main implication is that "programming is a design activity—a good software design process recognizes this and does not hesitate to code when coding makes sense. " And it is not just coding that is a design activity. "Coding is design, testing and debugging are part of design, and what we typically call software design is still part of design. " One may get the impression that Jack is just a cowboy coder who disdains the traditional notion of design. The following quote, however, shows that Jack values all types of design. "In software engineering, we desperately need good design at all levels. In particular, we need good top level design. The better the early design, the easier detailed design will be."</p>
<p>As Jack expands on the implications of the source code being the design, he clearly demolishes the assumptions behind a traditional waterfall approach to development by emphasizing the need for the design to evolve and be refined. "Eventually, we have to create the real software design, and it will be in some programming language. Therefore, we should not be afraid to code our designs as we derive them. We simply must be willing to refine them as necessary. ... All design activities interact. A good software design process recognizes this and allows the design to change, sometimes radically, as various design steps reveal the need."</p>
<p><a href="http://www.developerdotstar.com/mag/articles/reeves_13yearslater.html">Jack's follow-up article to his original</a> 13 years later clarified his stance on design documentation beyond that of the code. "The source code may be the master design document, but it is seldom the only one necessary." But he also pointed out the flaw in insisting that this separate design documentation be a formal deliverable: "What approach they choose doesn’t matter; until someone starts insisting that these intermediate designs should be products in their own right. It’s the code that matters. If you get good code, does it really matter how it came about? If you don’t get good code, does it really matter how much other garbage you made people do before they wrote the bad code?" </p>
<p>These points I've quoted and almost all of the articles ring true for me. I especially love the last quote – what truly matters is the code. When it is deployed into production, it does not matter how polished the design document was. In some of the formal, document-heavy client projects I have been involved with, the focus placed upon the project documentation is extreme, with the client in many cases critiquing even the grammar or punctuation within these documents. Meanwhile the code is produced with limited if any code reviews and inadequate testing. Recently on a project, after all the effort I and others put into the design document to obtain the formal sign-off by the client, I had to change part of my design after only a few days of coding. (And that is after I 'cheated' and coded a prototype during the design phase.) I discovered that another part of the design I was not involved with changed at the end of the design phase and invalidated one rather key assumption I made within my own design.</p>
<p>I find myself agreeing with Jack not just on the basis of his arguments, but also on the basis of my own experience. So why then are significant portions of I.T. still focused on the traditional waterfall / document-heavy approaches to developing software? Are there counterarguments I am missing? Please let me know by posting a comment, especially if you disagree with Jack and me.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2008/the-source-code-is-the-design/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Inspiring Great Design</title>
		<link>http://www.basilv.com/psd/blog/2008/inspiring-great-design</link>
		<comments>http://www.basilv.com/psd/blog/2008/inspiring-great-design#comments</comments>
		<pubDate>Sun, 13 Jan 2008 21:35:02 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[requirements]]></category>
		<category><![CDATA[usability]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2008/inspiring-great-design</guid>
		<description><![CDATA[I recently acquired a design tool – a set of IDEO method cards, where each card presents a design approach or a method of gaining inspiration. IDEO's design philosophy is to keep people at the center of the design process, and the four categories they divide the cards into reflect this: Ask people to help. [...]]]></description>
			<content:encoded><![CDATA[<p>I recently acquired a design tool – a set of <a href="http://www.ideo.com">IDEO</a> method cards, where each card presents a design approach or a method of gaining inspiration. IDEO's design philosophy is to keep people at the center of the design process, and the four categories they divide the cards into reflect this:</p>
<ul>
<li><strong>Ask</strong> people to help.</li>
<li><strong>Look</strong> at what people do.</li>
<li><strong>Learn</strong> from the facts you gather.</li>
<li><strong>Try</strong> it yourself.</li>
</ul>
<p>IDEO designs an incredibly wide variety of products – corporate websites, hand-held electronics, clothing, business services, furniture, and more. With such diversity, it is no surprise that not all of the cards appear applicable to software development. I found it interesting to go through the cards and think about how they relate to I.T. I ended up classifying the relevant cards into three groups:</p>
<ul>
<li><strong>Requirements</strong>: These cards provide ideas on how to gather or analyze requirements. This was the largest group by far – over one third of the total number of cards, and they spanned the four categories above. If you work in I.T., it may seem inappropriate to call tips for gathering requirements design approaches. In software development there is often a divide between requirements and design. You do need to understand the client's needs and determine a solution that meets those needs, but an explicit separation in role between these activities can often hurt the final product. IDEO takes a holistic approach: determining requirements and understanding the user are part of the design process and not a precursor to it.
</li>
<li><strong>User Interface Design</strong>: About one-sixth of the cards presented ideas related to user interface design. These mostly fell into the Try category, with a few in Learn and Ask. Prototyping and testing of various sorts were recurring themes in over half of these cards.
</li>
<li><strong>Software Design</strong>: Only a few cards seemed relevant to application architecture and software design. I initially found this surprising since I was expecting more. After further reflection, I realized that the commonly held understanding of design in the context of software development is very technical and narrowly focused. This 'technical' design activity (for lack of a better term) is necessary but not sufficient for creating a great piece of software. Other activities within the course of a development project that we in I.T. do not call design – activities such as requirements gathering, analysis, and usability testing – are all part of IDEO's holistic view of design.
</li>
</ul>
<p>In order to provide a concrete example of what the cards are like, I list in the table below a few of the cards I found particularly interesting. Each card explains not only a design method (the how), but also the reason for using it (the why). (Each card also briefly describes an IDEO project that used this method, but I do not list that below.)</p>
<table class="fancy" cellspacing="0">
<tr>
<th>Title</th>
<th>How</th>
<th>Why</th>
</tr>
<tr>
<td>Five Whys?</td>
<td>Ask "Why?" questions in response to five consecutive answers.</td>
<td>This exercise forces people to examine and express the underlying reasons for their behaviors and attitudes.</td>
</tr>
<tr>
<td>Rapid Ethnography</td>
<td>Spend as much time as you can with people relevant to the design topic. Establish their trust in order to visit and/or participate in their natural habitat and witness specific activities.</td>
<td>This is a good way to achieve a deep firsthand understanding of habits, rituals, natural language, and meanings around relevant activities and artifacts.</td>
</tr>
<tr>
<td>Error Analysis</td>
<td>List all the things that can go wrong when using a product and determine the various possible causes.</td>
<td>This is a good way to understand how design features mitigate or contribute to inevitable human errors and other failures.</td>
</tr>
</table>
<p>Looking at these design methods and the many others listed in the set of IDEO cards makes me appreciate all that goes into a well-designed product, and inspires me to think more carefully about how I approach design.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2008/inspiring-great-design/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Designing for Deployability</title>
		<link>http://www.basilv.com/psd/blog/2007/designing-for-deployability</link>
		<comments>http://www.basilv.com/psd/blog/2007/designing-for-deployability#comments</comments>
		<pubDate>Mon, 12 Feb 2007 04:25:13 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[deploy]]></category>
		<category><![CDATA[EnvGen]]></category>
		<category><![CDATA[Java EE]]></category>
		<category><![CDATA[Ruby on Rails]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2007/designing-for-deployability</guid>
		<description><![CDATA[In my previous article Architecting for Deployability, I wrote about the importance of deployability - how reliably and easily software can be deployed from development into the production environment. To accomplish this, one approach I recommended was to encapsulate differences between environments to isolate them from the majority of the application, and thus simplify deployment. [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous article <a href="http://www.basilv.com/psd/blog/2007/architecting-for-deployability">Architecting for Deployability</a>, I wrote about the importance of deployability - how reliably and easily software can be deployed from development into the production environment. To accomplish this, one approach I recommended was to encapsulate differences between environments to isolate them from the majority of the application, and thus simplify deployment. This is assuming, of course, that these differences cannot be eliminated. The technology (language and platform) you are using and the type of environmental difference you need to deal with will influence the specific techniques available to manage the difference. </p>
<p>Differences between environments that you are likely to encounter are:</p>
<ul>
<li>Database connections</li>
<li>Third-party service connections (e.g. for a web service, or naming service)</li>
<li>Logging settings</li>
<li>Performance tuning settings (e.g. database storage options)</li>
<li>Security settings (e.g. user ids, passwords)</li>
<li>Directory paths (e.g. for installed programs or libraries)</li>
</ul>
<p>Based on my experience, there are three design approaches to dealing with these environmental differences. They are discussed below in the order in which I feel they should be used: i.e. only use the second approach if the first does not work out, and only use the third when the first two approaches are not appropriate.</p>
<h3>Use support provided by the platform - but only when it makes sense</h3>
<p>The platform you are basing your development on - whether an application server, operating system, or set of language libraries - may have built-in support for dealing with certain environmental differences. Rather than building your own solution (which the other two approaches cover), it is often easiest to use the provided functionality. I have a number of examples involving a variety of platforms, including an example that shows why you should not necessarily use the built-in support if it has been badly designed.</p>
<p>The <a href="http://java.sun.com/javaee/index.jsp">Java Enterprise Edition</a> (Java EE) platform provides several options for dealing with environmental differences. One of the essential pieces is <a href="http://java.sun.com/products/jndi/tutorial/getStarted/index.html">JNDI - the Java Naming and Directory Interface</a>, which provides an environment-neutral way to look up basically anything, ranging from simple strings to fully configured services. JNDI works great for looking up a <code>DataSource</code> as per the following sample code:</p>
<pre class="prettyprint">
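import javax.naming.Context;
import javax.naming.InitialContext;
import javax.sql.DataSource;

// Look up the DataSource configured in the application server through JNDI.
Context rootContext = new InitialContext();

String jndiDataSourcePrefix = "java:/"; // Varies by application server
String dataSourceName = "myDataSource";
DataSource dataSource = (DataSource) rootContext.lookup(
  jndiDataSourcePrefix + dataSourceName);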
</pre>
<p>This not only hides the environment-specific details concerning the actual database connection, but also hides details concerning connection pooling which often vary between environments as well. The specifics concerning the database connection and connection pooling are specified within the application server for each environment. The code can therefore be promoted between environments without change.</p>
<p>Java EE also allows for simple strings to be stored in JNDI essentially like environment variables. The retrieval of these values is much like the retrieval of a <code>DataSource</code> as the following code shows:</p>
<pre class="prettyprint">
import javax.naming.Context;
import javax.naming.InitialContext;

Context rootContext = new InitialContext();
Context envContext = (Context) rootContext.lookup("java:comp/env");
String supportEmail = (String) envContext.lookup("support.email");
</pre>
<p>Unfortunately, this approach is not well suited to handling environmental differences. The values for such variables are specified in the deployment descriptor - <code>ejb-jar.xml</code> for EJBs or <code>web.xml</code> for web applications - as shown below:</p>
<pre class="prettyprint">
&lt;env-entry&gt;
  &lt;description&gt;Support Email Address&lt;/description&gt;
  &lt;env-entry-name&gt;support.email&lt;/env-entry-name&gt;
  &lt;env-entry-type&gt;java.lang.String&lt;/env-entry-type&gt;
  &lt;env-entry-value&gt;support@company.com&lt;/env-entry-value&gt;
&lt;/env-entry&gt;
</pre>
<p>The problem is that the deployment descriptor also contains important configuration information that does not vary per environment. Worse, the deployment descriptor file is bundled into the EJB jar file or WAR file for deployment. So how exactly can you specify different values for environment variables, without complicating your deployment process? You cannot. </p>
<p>The deployment descriptor represents a good idea but a flawed design. The creators of the Java EE specification envisioned multiple roles, including not just a developer role, but also an application assembler role and a deployer role. The idea was that the developer could specify the default value for the environment variable, and it could be overridden by either the assembler or deployer. However, the specification does not specify an easy way to do this override, and implies that the assembler or deployer would have to directly modify the deployment descriptor or use the proprietary administration interface of the application server as is necessary for configuring datasources and connection pools. The other flaw with this design is the idea of these separate roles. In practice, the developers are also the assemblers and often the deployers as well. When there are separate individuals doing the deployment, they typically know nothing about the application and could only override settings based on instructions provided by the developers. </p>
<p>The Java EE platform provides good support for data sources, but other types of environmental differences are really not supported well. I recommend not using the environment variable mechanism provided in Java EE, as it does not address the requirement of easily deployable software.</p>
<p>My next example involves <a href="http://www.rubyonrails.org/">Ruby on Rails</a>, a web application framework for the <a href="http://www.ruby-lang.org/en/">Ruby language</a> that has been receiving a lot of attention and hype in the last few years. One of the attractions of Rails is that it provides built-in support for many of the common features of web applications, including <a href="http://wiki.rubyonrails.com/rails/pages/Environments">handling of environmental differences</a>. Rails explicitly defines the notion of an environment via the <code>RAILS_ENV</code> environment variable, and even comes pre-configured with three: development, test, and production. Specifying environment-specific settings is extremely simple. There is a common file shared between environments (<code>config/environment.rb</code>), and one configuration file per environment (<code>config/environments/&lt;env&gt;.rb</code>). Since the configuration files are Ruby scripts, any type of setup can be done - you are not limited as in the case of Java EE deployment descriptor XML files. Rails is a clear winner over Java EE when it comes to deployability for multiple environments.</p>
<h3>Parameterize and look up differences by environment</h3>
<p>If the platform you are using does not provide the support you need for multiple environments, then the next design approach is to implement your own support for multiple environments within the platform. While the specific mechanisms can vary, conceptually such support requires a parameter representing the environment (like the <code>RAILS_ENV</code> environment variable), and a lookup mechanism to retrieve environment-specific values based on this parameter (like the <code>config/environments/*.rb</code> files in Rails).  The lookup mechanism can be as simple as settings in a property file named after the environment, or can involve properties in a database table keyed by the environment. A more sophisticated mechanism is to use a configuration interface with a factory to create the appropriate environment-specific subclass based on the environment. This allows for any type of setup to be implemented, rather than being limited to strings or other primitive types.</p>
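<p>As a minimal sketch of the configuration interface with a factory, the code below picks an environment-specific implementation from an environment name. Everything in it is hypothetical and chosen only for illustration: the class names, the <code>app.env</code> system property, the environment names, and the email addresses.</p>
<pre class="prettyprint">
// Hypothetical names throughout; "app.env" is an assumed system property.
public class ConfigurationFactory {

  // Settings that differ between environments are exposed through one interface.
  interface Configuration {
    String supportEmail();
    boolean verboseLogging();
  }

  static class DevelopmentConfiguration implements Configuration {
    public String supportEmail() { return "dev-team@example.com"; }
    public boolean verboseLogging() { return true; }
  }

  static class ProductionConfiguration implements Configuration {
    public String supportEmail() { return "support@example.com"; }
    public boolean verboseLogging() { return false; }
  }

  // Selects the implementation for the named environment and rejects unknown names.
  static Configuration forEnvironment(String environment) {
    if ("dev".equals(environment)) {
      return new DevelopmentConfiguration();
    }
    if ("prod".equals(environment)) {
      return new ProductionConfiguration();
    }
    throw new IllegalArgumentException("Unknown environment: " + environment);
  }

  public static void main(String[] args) {
    // The environment parameter defaults to "dev" when the property is not set.
    Configuration config = forEnvironment(System.getProperty("app.env", "dev"));
    System.out.println(config.supportEmail());
  }
}
</pre>
<p>Rejecting an unknown environment name, rather than silently falling back to a default, is a deliberate choice in this sketch: a mistyped environment then fails at startup instead of running with the wrong settings.</p>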
<p>The catch with using this approach is that the parameter representing the environment must be handled using whatever support the platform provides for dealing with environmental differences. For a Java EE application this is not a big deal - there are several different options I have used. One approach is to store the environment in a special database table. Since Java EE handles the data source fine, the code can retrieve the data source and then query the environment table to determine the environment. Another approach is to use a Java system property or even an environment variable like Rails does. A third approach I have used is to have an environment-specific directory (e.g. <code>config/prod/</code>) holding one or more property files or other resources. The classpath is defined within each application server to include the appropriate environment-specific directory. The code simply loads the property files or other resources from the classpath, which resolves correctly to the desired version of the files. This works especially well for log4j configuration files.</p>
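<p>The classpath-based approach might look roughly like the following sketch. The resource name <code>application.properties</code>, the <code>support.email</code> key, and the class name are assumptions made for the example; the code itself contains nothing environment-specific, because the application server's classpath decides which copy of the file is found.</p>
<pre class="prettyprint">
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical sketch: the resource name and property key are assumptions.
public class ClasspathConfig {

  // Loads a property file by name from the classpath. Which copy is found
  // depends on the environment-specific directory placed on the classpath.
  static Properties load(String resourceName) throws IOException {
    Properties properties = new Properties();
    try (InputStream in =
        ClasspathConfig.class.getClassLoader().getResourceAsStream(resourceName)) {
      if (in == null) {
        // Fail loudly rather than continue with an empty configuration.
        throw new IOException("Resource not found on classpath: " + resourceName);
      }
      properties.load(in);
    }
    return properties;
  }

  public static void main(String[] args) throws IOException {
    Properties config = load("application.properties");
    System.out.println(config.getProperty("support.email"));
  }
}
</pre>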
<p>For platforms that provide absolutely no support for handling environmental differences, this design approach will not work. That is when the third approach becomes useful.</p>
<h3>Generate per environment</h3>
<p>This design approach is most appropriate when the platform provides no support for handling environmental differences. The most common example of this I have encountered is database DDL statements, which can have environment-specific storage settings but do not support variables that would allow one to parameterize these settings. If you want to fully script database changes, then it is necessary to have these DDL scripts be environmentally neutral. The solution is to use file generation: a single template is combined with the values for each environment to produce a different version of the script per environment.</p>
<p>I provide a software utility called <strong>EnvGen</strong> on my <a href="http://www.basilv.com/psd/software">Software</a> page that performs environment-specific file generation. EnvGen is an <a href="http://ant.apache.org/">Ant</a> task for generating different versions of the same file parameterized for different environments (e.g. development, test, and production). File generation is done using <a href="http://freemarker.sourceforge.net/">FreeMarker</a>, a template engine with a full-featured templating language. You specify environment-specific properties in a CSV file (comma separated value spreadsheet). You can read more about EnvGen in the <a href="http://www.basilv.com/psd/software-files/EnvGen/index.html">EnvGen Release Documentation</a>.</p>
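<p>To make the idea concrete, here is a deliberately simplified sketch of per-environment generation using plain string substitution. It is not how any particular tool works: the <code>${name}</code> placeholder syntax, the class name, the table, and the tablespace names are all invented for illustration, and a real template engine offers far more than this.</p>
<pre class="prettyprint">
import java.util.Properties;

// Hypothetical sketch: placeholder syntax and all names are assumptions.
public class DdlGenerator {

  // Replaces each ${name} placeholder in the template with the value
  // defined for one environment.
  static String generate(String template, Properties values) {
    String result = template;
    for (String name : values.stringPropertyNames()) {
      result = result.replace("${" + name + "}", values.getProperty(name));
    }
    return result;
  }

  public static void main(String[] args) {
    // One environment-neutral template...
    String template = "CREATE TABLE customer (id INTEGER) TABLESPACE ${tablespace};";

    // ...and one set of values per environment.
    Properties test = new Properties();
    test.setProperty("tablespace", "TEST_DATA");
    Properties prod = new Properties();
    prod.setProperty("tablespace", "PROD_DATA");

    System.out.println(generate(template, test));
    System.out.println(generate(template, prod));
  }
}
</pre>
<p>Only the template is maintained by hand; the generated scripts are build outputs that can be recreated at any time.</p>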
<p>For all of these approaches, no matter what language or platform you are using, the underlying concept remains the same: separate the settings and code that changes across environments from that which remains the same to achieve the goal of reliably and easily promoting the code into the production environment.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2007/designing-for-deployability/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Error Handling and Reliability</title>
		<link>http://www.basilv.com/psd/blog/2007/error-handling-and-reliability</link>
		<comments>http://www.basilv.com/psd/blog/2007/error-handling-and-reliability#comments</comments>
		<pubDate>Fri, 12 Jan 2007 15:00:44 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[error handling]]></category>
		<category><![CDATA[reliability]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2007/error-handling-and-reliability</guid>
		<description><![CDATA[I have been thinking a lot lately about how to create reliable systems. I previously examined the link between complexity and reliability. Recently, however, I have come to appreciate the impact of error handling on reliability. For the purposes of this discussion, I consider two aspects of reliability: correctness - does the application produce the [...]]]></description>
			<content:encoded><![CDATA[<p>I have been thinking a lot lately about how to create reliable systems. I previously examined the link between <a href="http://www.basilv.com/psd/blog/2006/complexity-and-reliability">complexity and reliability</a>. Recently, however, I have come to appreciate the impact of error handling on reliability. For the purposes of this discussion, I consider two aspects of reliability: <em>correctness</em> - does the application produce the correct results, and <em>uptime</em> - the length of time the software can operate without terminating due to an error. A single defect or environmental problem can impact one or both of these measures. For example, a defect in an algorithm can cause a program to calculate the wrong result, without impacting uptime. A memory leak or network outage can impact uptime without impacting correctness. A null pointer exception impacts both. The error handling strategy you choose for your system affects both the correctness and the uptime. I am familiar with three main approaches to handling errors:</p>
<ul>
<li>Ignore errors</li>
<li>Fail fast</li>
<li>Degrade gracefully</li>
</ul>
<p>The <em>ignore errors</em> approach is very simple: assume errors will not happen and ignore them. Some of you may object that this is not a 'real' error handling strategy, but considering how often I have seen it used in production systems, I cannot agree. This approach does have the benefit of maximizing uptime: even if things go wrong, the program will keep running. Of course, if the program is producing incorrect output due to these errors, then you have a problem. So this approach tends to minimize correctness. Any problems that do occur are what I call <em>silent failures</em> that go undetected, at least for a while. Unix scripts and the C programming language adopt this strategy as the default: utilities and functions have return codes to report errors, so a call that results in an error will not affect the operation of your program or script unless you have an explicit check.</p>
<p>The <em>fail fast</em> approach is also very simple: whenever an error or unexpected event happens, immediately terminate execution. This approach tends to maximize correctness, but tends to minimize uptime, since any abnormality causes the program to end. These applications tend to be brittle. The slightest problem in the environment, such as a blip in the network, can bring down the application. Modern enterprise programming languages such as Java and C# adopt this strategy through the use of exceptions. If a problem occurs, an exception is thrown which will terminate the program unless explicitly caught and dealt with.</p>
<p>The <em>degrade gracefully</em> approach combines the best of the other two approaches. It detects errors like the fail fast approach, but instead of failing immediately, it handles the error and continues execution as appropriate. It therefore maximizes the reliability of the system by maximizing both correctness and uptime. The downside of this approach is that it requires much more thought and effort to implement. No programming language I am aware of provides explicit support for this approach.</p>
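<p>A small sketch may help show the difference in code. In the hypothetical batch job below, a bad record is neither ignored silently nor allowed to terminate the run: the error is detected, reported, and counted, and processing continues with the remaining records. The class, the record format, and the choice to skip bad records are assumptions for the example; whether skipping is acceptable depends on the application.</p>
<pre class="prettyprint">
// Hypothetical sketch: names and the skip-and-report policy are assumptions.
public class BatchProcessor {

  // Per-record operation that rejects bad input by throwing an exception.
  static int parseQuantity(String record) {
    return Integer.parseInt(record.trim());
  }

  // Processes every record. A failure is reported and counted rather than
  // ignored (ignore errors) or allowed to end the run (fail fast).
  // Returns a two-element array: the total, then the number of failures.
  static int[] process(String[] records) {
    int total = 0;
    int failures = 0;
    for (String record : records) {
      try {
        total += parseQuantity(record);
      } catch (NumberFormatException e) {
        failures++;
        System.err.println("Skipping bad record '" + record + "': " + e.getMessage());
      }
    }
    return new int[] { total, failures };
  }

  public static void main(String[] args) {
    int[] result = process(new String[] { "3", "oops", "4" });
    // Prints: total=7 failures=1
    System.out.println("total=" + result[0] + " failures=" + result[1]);
  }
}
</pre>
<p>Even in this tiny case the extra effort is visible: the caller has to decide what a failure count means and what to do with the skipped records.</p>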
<p>I was originally a strong proponent of the fail fast approach, but last year I started to appreciate the degrade gracefully approach, as I wrote in my article <a href="http://www.basilv.com/psd/blog/2006/fail-fast-or-degrade-gracefully">Fail Fast or Degrade Gracefully?</a>. Over the past year, my viewpoint has shifted further. I now feel that the degrade gracefully approach should be used by default. Only if it would require too much effort or complexity to implement should the fail fast approach be used instead. (Naturally I do not support the use of the ignore errors approach.)</p>
<p>There are many examples of the degrade gracefully approach within the IT infrastructure we rely on. TCP/IP networking stacks are designed to degrade gracefully when problems such as dropped packets occur. Web servers do not shut down if a web application experiences a failure - they instead terminate the current request by sending an error response to the client and continue to serve other requests. Email clients do not fail if the email server becomes unavailable, and more importantly the mail you were trying to send is not lost. Modern compilers do not stop upon encountering the first syntax error but instead continue parsing the same file (and other files) as best they can.</p>
<p>The validity of these examples could be debated. One could argue that some of these situations such as dropped network packets and bad user input (syntax errors in code) are expected - a normal part of operation - rather than representing an exceptional situation or error. The systems handle these situations because it is a requirement, not because they are using the degrade gracefully error handling approach. I would instead argue that the requirement is to use the degrade gracefully approach to handle these problematic situations, primarily because both the ignore errors approach and the fail fast approach are unacceptable.</p>
<p>Reliable systems do not happen by accident, but require careful thought and effort to create. The approach you choose for handling errors can have a bigger impact on reliability than you might expect.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2007/error-handling-and-reliability/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Complexity and Reliability</title>
		<link>http://www.basilv.com/psd/blog/2006/complexity-and-reliability</link>
		<comments>http://www.basilv.com/psd/blog/2006/complexity-and-reliability#comments</comments>
		<pubDate>Thu, 14 Sep 2006 15:00:32 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[complexity]]></category>
		<category><![CDATA[reliability]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2006/complexity-and-reliability</guid>
		<description><![CDATA[Unrestrained complexity is a critical limiting factor in producing working software. The more complex a system, the more it will cost to create and operate and the less reliable it will be. Yet the bane of complexity is largely ignored by the IT industry. Software vendors, competing on the basis of feature sets, are constantly [...]]]></description>
			<content:encoded><![CDATA[<p>Unrestrained complexity is a critical limiting factor in producing working software. The more complex a system, the more it will cost to create and operate and the less reliable it will be. Yet the bane of complexity is largely ignored by the IT industry. Software vendors, competing on the basis of feature sets, are constantly enhancing their existing products and introducing new, more capable ones. IT consultants trying to win more work are constantly pitching ideas for new systems, new business solutions, and new capabilities. Customers are constantly asking for new or enhanced functionality. Software developers thrive on creating this functionality. These forces all lead towards greater complexity. No one benefits from fighting complexity, so its harmful effects are not publicized.</p>
<p>Actually, my last sentence is not true. Customers do want software that works, and since simpler software is more reliable, they benefit from fighting complexity. Unfortunately, the costs of complexity are largely hidden from the sight of the customer, so they seldom realize the cost involved in asking for more features. They just get upset when the software stops working or works poorly, and they do not appreciate their contribution to the problem. IT operational staff also benefit from fighting complexity to keep systems as reliable as possible, since they need to keep them running. But they seldom have much if any influence upon the procurement or development of these systems.</p>
<p>Lately I have been struggling with improving the reliability of a particular system. As the team has identified and tried to resolve various issues, I have come to see that the high complexity of the system overshadows our efforts. Why does complexity so strongly affect reliability? I like using a mechanical analogy: the more moving parts in a device, the higher the probability that one of them will fail within a fixed time, thus lowering the overall reliability of the device. In an IT system, the failure points are different. The actual physical devices - the hardware - are ironically simpler to manage since their reliability is easy to improve through redundancy. It is the software that is the problem. The greater the complexity of the software, the higher the likelihood of defects - not just within the application code itself, but also in the overall software stack that is used. For an enterprise business application, this typically includes third party libraries, application server, web server, database server, and operating system, and can include additional services such as email, scheduling or messaging. A defect anywhere in the stack can cause the application to fail.</p>
<p>The problem with software reliability goes beyond just defects. In an enterprise setting, applications experience a wide variety of changes, each of which represents an opportunity for failure. Each of these changes is in essence a "moving part", even though the actual code for the application has not changed. The most typical change is enhancements to the application, which can introduce new defects in both the new and existing functionality. Other examples of changes include upgrades to application servers, web servers, database servers, operating systems, or hardware, configuration changes to systems such as email, network addresses, or scheduling, or security changes such as password expiration. The more complex the system, the more of these changes it experiences, which increases the chance of failure.</p>
<p>The relationship between complexity and reliability can be modeled statistically. I will represent an IT system as a collection of pieces (P) that each has a chance of failure (F), expressed as a probability of failing within one year. I think of each piece as abstractly representing something that can fail - the equivalent of that moving part in a mechanical device. The number of pieces correlates with the complexity of the system. While it is hard to determine even approximate values for these measures in a real system, just using abstract concepts and figures can provide an appreciation for the relationship between the two values. Assuming the pieces fail independently of each other, the probability of the system having no failures in one year is (1-F)<sup>P</sup>. Using baseline values of 100 pieces and a 0.01 probability of failure for each piece in the year (1%), the chance of no failures in a year is only 37%. This means the chance of having one or more failures is 63%. What happens as the complexity increases? </p>
<table class="fancy" cellspacing="0">
<tr>
<th># of Pieces (P)</th>
<th> % Chance of failure per piece (F)</th>
<th>Overall % chance of no failures</th>
</tr>
<tr>
<td>100</td>
<td>1%</td>
<td>37%</td>
</tr>
<tr>
<td>200</td>
<td>1%</td>
<td>13%</td>
</tr>
<tr>
<td>500</td>
<td>1%</td>
<td>0.7%</td>
</tr>
</table>
<p>The reliability of the system falls quickly as the number of pieces is increased. In order to maintain the same reliability when the complexity doubles, the chance of failure of each piece must be roughly halved.</p>
<table class="fancy" cellspacing="0">
<tr>
<th># of Pieces (P)</th>
<th> % Chance of failure per piece (F)</th>
<th>Overall % chance of no failures</th>
</tr>
<tr>
<td>100</td>
<td>1%</td>
<td>37%</td>
</tr>
<tr>
<td>200</td>
<td>0.5%</td>
<td>37%</td>
</tr>
<tr>
<td>500</td>
<td>0.2%</td>
<td>37%</td>
</tr>
</table>
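<p>The figures in both tables can be reproduced with a short calculation. The following is only a minimal sketch in Java: the class and method names are made up for illustration, and it assumes that the pieces fail independently of each other, which is what the formula (1-F)<sup>P</sup> implies.</p>

```java
// Illustrative sketch only: names are hypothetical, and pieces are
// assumed to fail independently of each other.
final class ReliabilityModel {

    /** Probability that none of the pieces fails within the year: (1 - F)^P. */
    static double chanceOfNoFailures(int pieces, double failureChancePerPiece) {
        return Math.pow(1.0 - failureChancePerPiece, pieces);
    }

    /**
     * Per-piece failure chance needed to reach a target chance of no failures,
     * obtained by solving (1 - F)^P = target for F.
     */
    static double requiredFailureChance(int pieces, double targetChanceOfNoFailures) {
        return 1.0 - Math.pow(targetChanceOfNoFailures, 1.0 / pieces);
    }

    public static void main(String[] args) {
        int[] pieceCounts = {100, 200, 500};
        double baseline = chanceOfNoFailures(100, 0.01);
        for (int pieces : pieceCounts) {
            // First table: per-piece failure chance held at 1%.
            double noFailures = chanceOfNoFailures(pieces, 0.01);
            // Second table: per-piece failure chance needed to match the baseline.
            double required = requiredFailureChance(pieces, baseline);
            System.out.printf("P=%d: %.1f%% chance of no failures at F=1%%; F=%.2f%% keeps the baseline%n",
                    pieces, 100 * noFailures, 100 * required);
        }
    }
}
```

<p>Solving for F shows why the second table works out the way it does: when F is small, keeping P multiplied by F roughly constant keeps the overall chance of no failures roughly constant.</p>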
<p>In practice, however, more complex systems are harder to understand and change, thus reducing the reliability of each change that is made. Once a system does fail, greater complexity means that it is often harder to diagnose and fix the problem. This makes the downtime longer. Complexity therefore also leads to more serious failures.</p>
<p>Complexity and reliability are closely connected. If you have no plan to manage the complexity of a system, then you may be unpleasantly surprised by what happens to its reliability. Since our goal as professionals is to provide software that works, thinking about complexity and reliability is a necessity.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2006/complexity-and-reliability/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Reuse Trap in Software Design</title>
		<link>http://www.basilv.com/psd/blog/2006/the-reuse-trap-in-software-design</link>
		<comments>http://www.basilv.com/psd/blog/2006/the-reuse-trap-in-software-design#comments</comments>
		<pubDate>Thu, 07 Sep 2006 15:00:39 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[refactoring]]></category>
		<category><![CDATA[reuse]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2006/the-reuse-trap-in-software-design</guid>
		<description><![CDATA[I stared at my code on the screen, but inspiration wouldn't come. I was trying to design a new feature which shared some commonalities with the existing code base. In particular, there were a couple classes that I knew I could reuse. I just wasn't sure how they would have to be modified because I [...]]]></description>
			<content:encoded><![CDATA[<p>I stared at my code on the screen, but inspiration wouldn't come. I was trying to design a new feature which shared some commonalities with the existing code base. In particular, there were a couple classes that I knew I could reuse. I just wasn't sure how they would have to be modified because I was still figuring out the structure of the new feature. This also was difficult - I didn't have a clear picture of how those reused classes would fit in. Around and around my thoughts went, without any progress. I was stuck in the reuse trap.</p>
<p>Does this story sound familiar? I have been ensnared by the reuse trap many times, especially early in my career. At first I labored in vain, wondering why I was making no progress, oblivious of the trap I was in. As I gained experience, I became aware of the reuse trap and learned techniques and skills to avoid or escape it. </p>
<p>The <em>reuse trap</em> is a term I coined to describe the situation when one becomes stuck trying to design new functionality while simultaneously attempting to reuse existing code that needs some modifications. I believe this happens because we are trying to solve two separate yet interrelated problems: one, implementing the new functionality, and two, modifying the code to be reused. The new functionality depends on the reusable code, and the way we modify this code depends on how it will be used by the new functionality. Trying to reason about both issues at the same time imposes too high a cognitive load, so we fail to make progress on either front. </p>
<p>How can we escape the reuse trap? Simply put, don't try to do two things at the same time. You need to produce the new functionality, so you can't avoid working on this problem. The solution, therefore, is to defer the code reuse problem. Trying to achieve reuse is the <a href="http://www.basilv.com/psd/blog/2006/how-to-do-root-cause-analysis">root cause</a> of this trap, and is the key to escaping it. The first step is to just start developing the new functionality. You can reuse code, but only as is, like you would do for a third party library. If you need to modify it, then just ignore it (for now). When you reach points where you could reuse existing code with some modification, feel free to take a copy of this existing code and change it as needed. This is also called cut-and-paste reuse, which is generally frowned upon. The use of it here is only temporary, as you'll see. Your goal is to finish enough of the new functionality to know that the design will work. This may mean finishing it completely, or just writing a basic skeleton. In either case, once you reach this point the next step is to eliminate duplication in the code base. Any sections of new code that you copied (cut-and-pasted) from existing code will be primary targets for your efforts. During this step, it is helpful to follow a disciplined refactoring process as described in <a href="http://www.amazon.ca/exec/obidos/redirect?link_code=as2&#038;path=ASIN/0201485672&#038;tag=basilvandegri-20&#038;camp=15121&#038;creative=330641">Martin Fowler's refactoring book</a><img src="http://www.assoc-amazon.ca/e/ir?t=basilvandegri-20&#038;l=as2&#038;o=15&#038;a=0201485672" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />. </p>
<p>In this second refactoring step, I deliberately omitted the word <em>reuse</em> and instead used the term <em>eliminating duplication</em>. Reuse implies thinking about the future - how can this code be re-used in a different context? By its very nature, it is more abstract, more uncertain, and thus more difficult to reason about. Eliminating duplication is more concrete and focused on what exists now, in the present. Adopting a mindset of eliminating duplication rather than up-front reuse therefore helps you avoid the reuse trap. This relates closely to some of the principles of <a href="http://www.amazon.ca/exec/obidos/redirect?link_code=as2&#038;path=ASIN/0321278658&#038;tag=basilvandegri-20&#038;camp=15121&#038;creative=330641">Extreme Programming</a><img src="http://www.assoc-amazon.ca/e/ir?t=basilvandegri-20&#038;l=as2&#038;o=15&#038;a=0321278658" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />: "You ain't gonna need it", "Do the simplest thing that could possibly work", and "Once and only once".</p>
<p>As your design experience grows, it becomes easier to avoid the reuse trap. You become more adept at knowing how to modify existing code to reuse it without really having to think hard about the problem. This allows you to focus on the problem of developing the new functionality, which is also likely easier. Therefore, your overall cognitive load is reduced - you can solve both problems at once - and the reuse trap is avoided. However, even experienced developers will still encounter difficult situations for which their experience is insufficient. That is when you need to know how to escape the reuse trap.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2006/the-reuse-trap-in-software-design/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Fail Fast or Degrade Gracefully?</title>
		<link>http://www.basilv.com/psd/blog/2006/fail-fast-or-degrade-gracefully</link>
		<comments>http://www.basilv.com/psd/blog/2006/fail-fast-or-degrade-gracefully#comments</comments>
		<pubDate>Thu, 09 Mar 2006 15:00:16 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[error handling]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2006/fail-fast-or-degrade-gracefully</guid>
		<description><![CDATA[There are two approaches to handling internal application errors. In the fail fast approach you immediately terminate the operation (or even the application) once an error is detected. In the degrade gracefully approach you try to continue with as much of the operation as you can. For quite a while I have been a firm [...]]]></description>
			<content:encoded><![CDATA[<p>There are two approaches to handling internal application errors. In the fail fast approach you immediately terminate the operation (or even the application) once an error is detected. In the degrade gracefully approach you try to continue with as much of the operation as you can.</p>
<p>For quite a while I have been a firm proponent of the fail fast approach. If you encounter an internal application error (e.g. a method parameter is unexpectedly null), this is often a sign of a defect. The presence of a defect means you can no longer trust the operation of the application, so the safest approach is to terminate the operation or even the application. (In Java, this is typically done by throwing an appropriate RuntimeException.) Besides being the safer of the two approaches, another advantage of fail fast is that it forces the problem into the open, which makes it more likely it will be detected and fixed.</p>
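<p>As a minimal illustration of failing fast in Java, the sketch below rejects an unexpectedly null parameter by throwing IllegalArgumentException, which is a RuntimeException. The Account class and its owner field are hypothetical and exist only for this example.</p>

```java
// Hypothetical class used only to illustrate the fail fast approach.
final class Account {
    private final String owner;

    Account(String owner) {
        // Fail fast: a null owner signals a defect in the calling code,
        // so stop immediately instead of continuing with bad state.
        if (owner == null) {
            throw new IllegalArgumentException("owner must not be null");
        }
        this.owner = owner;
    }

    String owner() {
        return owner;
    }
}
```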
<p>However, I recently came across a situation in which the degrade gracefully approach made more sense. The application in question had a generic message class for formatting messages with parametrized arguments. To use the class, you provide the message embedded with tokens representing one or more parameters, plus the parameters to be substituted for the tokens. One day I came across a use of this message class that supplied a parametrized message with the wrong number of parameters. Curious as to why this block of code had not 'died' (thrown an exception) during testing, I looked into the implementation of this message class. I discovered that the class did absolutely no checking of the arguments supplied to it. As a result, you could supply the wrong number of parameters (too many or too few), and the class would still return the formatted string, ignoring extra parameters and treating missing parameters as empty strings. A little investigation quickly revealed that there were other places in the application that were supplying the wrong number of parameters to this class.</p>
<p>So I refactored the message class to use the fail fast approach, then searched for usages of the class to fix the cases where the arguments were invalid. It didn't take that long before the changes were done and all the unit tests were successful, so I committed my code. Some time later someone encountered an error which I quickly recognized - an invalid argument supplied to that generic message class. Obviously, I had missed a place in the application that was calling the message class incorrectly. But the error made me think: the message class was used to format a message to be displayed to the user. Before, with the degrade gracefully approach, the users had been able to perform the operation in question successfully, despite getting a poorly constructed message. Now with the fail fast approach, we did quickly find out about the bad message, but the user could no longer complete the work they were trying to do. I wasn't happy about my change having made the application less useful for the user.</p>
<p>After some thought, I realized that the degrade gracefully approach was appropriate in this situation. A message to the user missing some parameters is almost always still somewhat understandable, and has nothing to do with the actual business logic being performed, so it is fairly safe to continue with constructing the message despite receiving the incorrect number of parameters. But I still wanted to be able to find out about these cases - they did represent defects (albeit minor) in the code. I really wanted the advantages from both approaches.</p>
<p>To achieve this, I again refactored the message class to allow it to proceed despite having the wrong number of parameters. I changed the code checking for invalid parameters to log an error to the application log instead of throwing an exception. By logging an error I ensured that the developers would find out about the problem, but the application would proceed. (You may be thinking that this error in the log file is likely to be overlooked by developers, but I had already implemented changes to ensure this wouldn't happen. I'll save the details of this for a future article.)</p>
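<p>A message class along these lines might look roughly like the following sketch. This is not the actual class from the application: the name LenientMessage, the token syntax ({0}, {1}, and so on), and the use of java.util.logging are all assumptions made for illustration.</p>

```java
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: class name, token syntax and logging library are assumptions.
final class LenientMessage {
    private static final Logger LOG = Logger.getLogger(LenientMessage.class.getName());
    // Tokens look like {0}, {1}, ... (indexes limited to three digits).
    private static final Pattern TOKEN = Pattern.compile("\\{(\\d{1,3})\\}");

    /** Substitutes parameters for tokens; a mismatch is logged, not thrown. */
    static String format(String template, Object... params) {
        StringBuilder out = new StringBuilder();
        Matcher matcher = TOKEN.matcher(template);
        int expected = 0; // highest token index seen, plus one
        int last = 0;     // end of the previously matched token
        while (matcher.find()) {
            int index = Integer.parseInt(matcher.group(1));
            expected = Math.max(expected, index + 1);
            out.append(template, last, matcher.start());
            // Missing (or null) parameters are treated as empty strings.
            if (index < params.length && params[index] != null) {
                out.append(params[index]);
            }
            last = matcher.end();
        }
        out.append(template.substring(last));
        if (expected != params.length) {
            // Degrade gracefully: report the defect, but still return a message.
            LOG.severe("Message '" + template + "' expects " + expected
                    + " parameter(s) but received " + params.length);
        }
        return out.toString();
    }
}
```

<p>With this sketch, calling LenientMessage.format("Hello {0}, you have {1} items", "Bob") logs an error and still returns a usable, if incomplete, message, while extra parameters are logged and otherwise ignored.</p>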
<p>In most cases, I still prefer the fail fast approach. Even in this case involving the message class, if the original developers had used the fail fast approach then I suspect there would have been far fewer cases of calling code supplying the wrong number of parameters. This is a potential drawback of the degrade gracefully approach: if you are not careful, you end up hiding information about a defect. If you do decide to use the degrade gracefully approach, ensure you have a mechanism to detect and reveal any defects, rather than completely hiding them. One case where the degrade gracefully approach is often used is at the application architecture level. Applications such as web servers and business web applications that process multiple independent operations do not terminate upon encountering an internal error. Instead, the current operation is terminated with the appropriate error reported while the application continues running, able to process other requests.</p>
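<p>The architecture-level case can be sketched as a loop that handles each operation independently. The RequestProcessor class below is hypothetical and greatly simplified compared to a real web or application server: an operation that throws is terminated and logged, and the loop moves on to the next one.</p>

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical, greatly simplified stand-in for a server's request handling.
final class RequestProcessor {
    private static final Logger LOG = Logger.getLogger(RequestProcessor.class.getName());

    /** Runs each operation independently and returns how many completed. */
    static int processAll(Runnable[] operations) {
        int succeeded = 0;
        for (Runnable operation : operations) {
            try {
                operation.run();
                succeeded++;
            } catch (RuntimeException e) {
                // The failing operation is terminated and reported,
                // but the application keeps processing other requests.
                LOG.log(Level.SEVERE, "Operation failed", e);
            }
        }
        return succeeded;
    }
}
```

<p>Each individual operation can still fail fast internally; only the boundary between operations degrades gracefully.</p>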
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2006/fail-fast-or-degrade-gracefully/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

