<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Basil Vandegriend: Professional Software Development &#187; architecture</title>
	<atom:link href="http://www.basilv.com/psd/blog/tag/architecture/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.basilv.com/psd</link>
	<description></description>
	<lastBuildDate>Wed, 25 Jan 2012 13:23:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Assessing Service Oriented Architecture</title>
		<link>http://www.basilv.com/psd/blog/2008/assessing-service-oriented-architecture</link>
		<comments>http://www.basilv.com/psd/blog/2008/assessing-service-oriented-architecture#comments</comments>
		<pubDate>Mon, 12 May 2008 20:30:38 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[architecture]]></category>
		<category><![CDATA[SOA]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2008/assessing-service-oriented-architecture</guid>
		<description><![CDATA[I have always been skeptical of Service Oriented Architecture (SOA) and web services since they first came on the scene. This was not necessarily a rational skepticism, but was instead more of a visceral response to all the hype. As the years passed by, SOA and web services stuck around and I heard it mentioned [...]]]></description>
			<content:encoded><![CDATA[<p>I have been skeptical of Service Oriented Architecture (SOA) and web services ever since they first came on the scene. This was not necessarily a rational skepticism, but was instead more of a visceral response to all the hype. As the years passed by, SOA and web services stuck around and I heard them mentioned more and more. I began to wonder if there was something worthwhile hiding under all the buzz, but my skepticism kept me from looking closer. This changed recently when I took a week-long <a href="http://www.webagesolutions.com/training/xml/wa1471/outline.html">course on SOA</a> from <a href="http://www.webagesolutions.com/">Web Age Solutions</a>. Besides covering the course material, I did my own research online and talked with colleagues with relevant experience to get a sense for others' opinions of SOA. This article presents a summary of SOA and my assessment of it.</p>
<h3>What is SOA?</h3>
<p>There are many definitions of SOA. (See for example <a href="http://en.wikipedia.org/wiki/Service_Oriented_Architecture">Wikipedia's definitions</a>.) I consider SOA to be an enterprise architecture strategy where business logic is organized into services, and business processes are realized through the invocation of these services. I consider a service in the context of SOA to be a related set of automated business activities accessible through a well-defined interface. Services are often expected to be accessible remotely over the network, usually as web services using HTTP as the transport protocol, SOAP as the message format, and complying with one or more of the <a href="http://www.w3.org/2002/ws/">'million' WS-* specifications</a>. "SOA is more about good design and architecture than it is technology." (from <a href="http://www.looselycoupled.com/stories/2003/tactics-soa-infr0415.html">Lawrence Wilkes</a>). This point is often obscured, unfortunately, by all the vendor hype around particular products and specifications, especially web services.</p>
<p>A successful service oriented architecture needs to address three areas:<br />
<a href='http://www.basilv.com/psd/wp-content/uploads/2008/05/soa-triangle.png' title='SOA Triangle of Services, Processes, and Governance'><img src='http://www.basilv.com/psd/wp-content/uploads/2008/05/soa-triangle.png' alt='SOA Triangle of Services, Processes, and Governance' /></a></p>
<ol>
<li><strong>Services</strong>: the units of composition for a service oriented architecture.</li>
<li><strong>Processes</strong>: the business processes using the services. Each business process is implemented by orchestrating together multiple services.</li>
<li><strong>Governance</strong>: the policies governing the development and maintenance of services.</li>
</ol>
<p>I am hesitant to call SOA an architecture pattern. A pattern is generally defined as a recurring solution to a recurring problem. SOA qualifies as a recurring solution, but it is not clear whether it addresses a single recurring problem. The more hype-driven literature assumes that SOA is a given without any justification. In other cases, SOA is presented as addressing one or more of a collection of enterprise issues without any discussion of the context around these issues and the consequences of using SOA. I think SOA could be expressed as an architectural pattern, and that doing so would be beneficial in reducing the hype and the sometimes contentious discussion surrounding SOA. I feel that looking at the motivation, applicability, and consequences of SOA would be particularly helpful, and in the remainder of this article I will touch on these points.</p>
<p>Does SOA represent something new? I have heard it said that SOA is merely a set of "best practices" for enterprise I.T. - a collection of what has worked before. Certainly some details of SOA like the enterprise service bus (ESB) seem like a rehash of the message bus in enterprise application integration (EAI). The inclusion of business processes in SOA, especially executable via the business process execution language (BPEL), comes straight from the notion of automated workflow engines. From the middleware vendors' perspectives SOA means a new 'stack' of products to sell. I suspect this has been a major driving force behind the SOA hype. As existing middleware such as Java EE servers becomes commoditized and open-sourced, it is rational for vendors to create new, higher-profit markets where this has not yet happened. My perspective is that SOA is at least an incremental refinement of what has come before: it does pull formerly disparate concepts together into a unified whole. Its explicit inclusion of process and governance seems unique, although I remain uncertain how those aspects will work out in practice. Whether SOA is something new or just the product of vendor hype, the question "does SOA work?" remains. Below I look at some of the benefits and weaknesses of SOA.</p>
<h3>Benefits of SOA</h3>
<p>SOA promises many benefits, but how many of these can be achieved in practice? Below I list the reported benefits of using SOA and evaluate each one.</p>
<ul>
<li><strong>Integration of legacy or stovepipe applications</strong>: Legacy or stovepipe applications, even those implemented in variant technologies, can be wrapped in a service and orchestrated with other, perhaps newer, services to achieve a business process. This mixing of old and new allows existing I.T. assets to be reused rather than needing to be rewritten, which is expected to result in cost savings. I feel this is a legitimate benefit, especially when such an application is not under the control of your organizational unit and cannot therefore be modified directly.</li>
<li><strong>Bridging the divide between I.T. and the business</strong>: SOA explicitly considers business processes and thus links business goals and business use cases with the I.T. architecture through services. While I think SOA goes much farther in this direction than other architectures, I suspect that many of the normal issues that arise between I.T. and the business will remain. The people involved will still come primarily from one side or the other and bring their existing viewpoints and biases, which will inherently lead to conflicts.</li>
<li><strong>Loose coupling</strong>: SOA encourages loose coupling through encapsulating functionality behind coarse-grained services with well-defined interfaces. This allows the implementations of services to change without impacting users of the service, since other services and business processes will only invoke the service through its interface. This is similar conceptually to the use of dependency injection, where the code using an external resource knows nothing about the concrete implementation since it is injected by the container. While I think this is a significant benefit, bad design can still ruin it.</li>
<li><strong>Reuse</strong>: Reuse is one of the more highly touted benefits of SOA. The theory is that the services can be reused to implement new business processes. Since each service operation represents a business activity, its reuse represents a significant savings in development costs. This is the benefit I am most skeptical of. It seems far too similar to the promises made when object-oriented programming was first becoming mainstream. While OO has many benefits, the level of reuse that does occur is far below what was initially hyped. There are a few hopeful signs, however, that SOA may actually enable a reasonable level of reuse. One of the key requirements to enable reuse is the proper management support, which SOA explicitly addresses through governance. Services are coarse-grained, so even a large corporation will have a limited number of them (fewer than 500 is a ballpark I have heard). This makes it easier to review the catalog of services and determine possibilities for reuse. And since the service operations have business meaning, both I.T. management and the business can better understand and commit to their reuse, unlike in OO where individual classes or even libraries are at too technical a level to be truly comprehensible to such people. A significant obstacle to service reuse, however, is that it requires the organization's business processes to become more standardized in how various business activities are performed, since the same services to perform these activities will be invoked by these different processes. In a typical large organization with independently-minded business units, perhaps with varying requirements, this standardization will likely be difficult to achieve without senior management involvement.</li>
</ul>
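<p>The loose coupling point above can be illustrated with a minimal sketch. It assumes a hypothetical <code>CustomerService</code> interface with two interchangeable implementations; none of these names come from a real product, and a real SOA service would be invoked remotely rather than in-process.</p>

```python
from typing import Protocol


class CustomerService(Protocol):
    """Hypothetical service interface: the only thing consumers depend on."""

    def credit_limit(self, customer_id: str) -> int: ...


class LegacyCustomerService:
    """Stand-in for a wrapper around a legacy application."""

    def credit_limit(self, customer_id: str) -> int:
        return 1000  # pretend the legacy system grants everyone the same limit


class NewCustomerService:
    """Stand-in for a rewritten implementation behind the same interface."""

    def __init__(self, limits: dict[str, int]) -> None:
        self._limits = limits

    def credit_limit(self, customer_id: str) -> int:
        return self._limits.get(customer_id, 0)


def approve_order(service: CustomerService, customer_id: str, amount: int) -> bool:
    """A business process step that only knows the interface, not the implementation."""
    return service.credit_limit(customer_id) >= amount


print(approve_order(LegacyCustomerService(), "c1", 500))
print(approve_order(NewCustomerService({"c1": 100}), "c1", 500))
```

<p>Swapping the implementation does not require touching <code>approve_order</code>, which is the property the service interface is meant to protect. It does not guard against an implementation that quietly changes what the operation means.</p>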
<h3>Weaknesses of SOA</h3>
<p>There are a number of drawbacks and challenges to successfully using SOA, of which I highlight what I feel are the most significant:</p>
<ul>
<li><strong>Distributed Systems Problems</strong>: Making a system distributed is problematic. Martin Fowler's <a href="http://martinfowler.com/bliki/FirstLaw.html">first law of distributed object design</a> is "Don't do it" for this reason. The issues you need to deal with include the following:
<ul>
<li><strong>Reliability</strong>: If any required service in your business process becomes unavailable then your entire business process cannot be executed. Even having systems with different scheduled maintenance windows can cause difficulties. One way to address this is to use asynchronous messaging as much as possible and make your messaging infrastructure high availability. </li>
<li><strong>Performance</strong>: The performance of remote calls is much, much worse than local (in-process) calls. This is exacerbated when using Web Services with the verbosity and extra parsing that XML and the various WS-* specs add. This is why service operations should be as coarse-grained as possible to minimize the number of remote invocations.</li>
<li><strong>Transactions</strong>: For a typical application, database transactions are used to ensure data integrity. When multiple, remote services are being invoked this becomes much more difficult. One option is to use a distributed transaction, which requires both that the service messaging infrastructure (i.e. Web Services) supports distributed transactions, and that the underlying service implementations can enlist in these transactions. When wrapping services around legacy or off-the-shelf applications, this may not be possible. Even when distributed transactions can be made to work, they introduce yet another performance hit. Another option is to forget about distributed transactions and deal with the consequences: this usually involves more careful design and the use of compensating transactions in the case of failures. It sounds dangerous to those such as myself used to working with transactions, but there are <a href="http://martinfowler.com/bliki/Transactionless.html">cases where transaction-limited systems have been built successfully</a>.</li>
<li><strong>Security</strong>: The current security context - the identity of the user and the actions it is authorized to perform – often needs to be propagated to each service. Propagation is actually not that difficult, especially with the introduction of specifications like WS-Security which automatically include security context in the SOAP header. A harder problem is managing user identity and access privileges across separate, perhaps legacy, systems. Even if you introduce an enterprise-wide security service for all your applications to use, you still face the problem of dealing with off-the-shelf or legacy applications that do not use it.</li>
</ul>
</li>
<li><strong>Changing Services</strong>: Once services are defined, created and in operational use, changing them becomes difficult. One problem is versioning: what happens when the interface to a service needs to change? Especially troubling are interface changes that are non-backwards compatible. Frameworks in languages such as Java with explicit interfaces have already had to deal with this problem. One solution is to change all the consumers of the service at the same time the service changes: this requires significant organizational coordination and is frequently unrealistic. Another solution is to create a version 2.0 of the service, and leave the old service running but deprecated to allow time for consumers to migrate to the new service. This requires that both versions of the service are operational, which increases maintenance costs and is not always possible. Service implementations can also be changed without touching the interface. While a purported benefit of SOA, there is the risk that an implementation change actually subtly changes the semantics of one or more operations and breaks one or more of the consumers of the service.
</li>
<li><strong>Crossing Organizational Boundaries</strong>: For all the benefits of SOA to be realized, a number of organizational boundaries need to be bridged. This typically requires a change in organization culture, which is very difficult to achieve. Integrating services in separate, traditionally stovepipe, organizational units requires them to coordinate more closely, both during the initial development project and on an ongoing basis for maintenance. Effective management and reuse of services requires more effective communication between development teams and architecture within I.T., and potentially requires better alignment of separate organizational business processes with enterprise-wide processes. I've already mentioned bridging the divide between the business and I.T. as one of the reported benefits of SOA that is likely challenging to achieve in practice.
</li>
</ul>
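<p>The compensating transaction approach mentioned under Transactions can be sketched roughly as follows. This is a simplified, in-process illustration: the step names are hypothetical, and a real implementation would also have to cope with a compensation that itself fails.</p>

```python
from typing import Callable

# Each step pairs an action with the compensation that undoes it.
Step = tuple[Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: list[Step]) -> bool:
    """Run each action in order; on failure, undo completed steps in reverse."""
    completed: list[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log: list[str] = []


def fail_shipping() -> None:
    # Simulates a remote service that is unavailable.
    raise RuntimeError("shipping service unavailable")


ok = run_with_compensation([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"), lambda: log.append("refund card")),
    (fail_shipping, lambda: None),
])
print(ok, log)
```

<p>Unlike a database rollback, the intermediate states (stock reserved, card charged) are visible to other consumers until the compensations run, which is why this style demands more careful design.</p>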
<h3>Other Viewpoints</h3>
<p>As part of my research into SOA, I came across a number of sites with varying views and feedback:</p>
<ul>
<li><a href="http://wanderingbarque.com/nonintersecting/2007/10/05/what-is-soa/">What is SOA? (wanderingbarque.com)</a></li>
<li><a href="http://www.indiawebdevelopers.com/articles/SOA.asp">What is SOA? - SOA and Web Services Explained (www.indiawebdevelopers.com)</a></li>
<li><a href="http://dotnetaddict.dotnetdevelopersjournal.com/soamanifesto.htm">SOA Manifesto (dotnetaddict.dotnetdevelopersjournal.com)</a></li>
<li><a href="http://webservices.sys-con.com/read/45094.htm">Can I Be of Service? (webservices.sys-con.com)</a></li>
<li><a href="http://www.looselycoupled.com/stories/2003/tactics-soa-infr0415.html">Tactics, not strategy, drive SOA adoption (www.looselycoupled.com)</a></li>
<li><a href="http://www.networkworld.com/news/2004/110804soapart2.html?page=2">Early adopters: SOA worth the effort (www.networkworld.com)</a></li>
</ul>
<p>What is your take on SOA? Please leave a comment and share your experiences.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2008/assessing-service-oriented-architecture/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Architecting for Deployability</title>
		<link>http://www.basilv.com/psd/blog/2007/architecting-for-deployability</link>
		<comments>http://www.basilv.com/psd/blog/2007/architecting-for-deployability#comments</comments>
		<pubDate>Fri, 26 Jan 2007 22:15:53 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[architecture]]></category>
		<category><![CDATA[maintenance]]></category>
		<category><![CDATA[deploy]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2007/architecting-for-deployability</guid>
		<description><![CDATA[Deployability is a non-functional requirement that addresses how reliably and easily software can be deployed from development into the production environment. For desktop (client-side) software, deployability addresses the installation and update mechanisms that may be built into the software itself. For server-side software, deployability is addressed through the system architecture and the deploy process. The [...]]]></description>
			<content:encoded><![CDATA[<p>Deployability is a non-functional requirement that addresses how reliably and easily software can be deployed from development into the production environment. For desktop (client-side) software, deployability addresses the installation and update mechanisms that may be built into the software itself. For server-side software, deployability is addressed through the system architecture and the deploy process. The remainder of this article will focus on server-side software, although the basic principles also apply to desktop software.</p>
<h3>Why is deployability important?</h3>
<p>Maintaining the proper operation of the production systems is a fundamental I.T. goal. The end users do not care how well the software works in the development or test environments. All the testing and other quality assurance activities in the world will not help if the software is not promoted properly into production and fails as a result. Ease of deployment is also important. If promoting changes is a cumbersome and labor-intensive task, people will be tempted to take shortcuts (e.g. skipping the test environment), and it will take longer to get changes into production.</p>
<h3>How can you architect for deployability?</h3>
<p>My first guideline is to minimize differences between environments. The more similar the environments, the simpler and more reliable it is to deploy the software. Each difference between environments is something that may trip you up when deploying if you have failed to properly address it. Environmental differences can also affect the accuracy of testing: what worked in one environment may not work in a different one. Unfortunately, it is not possible to eliminate all differences without eliminating all environments but one. Common unavoidable differences between environments include different servers, different hardware configurations, different databases, different URLs, and different security settings. When encountering an environmental difference, I ask the question "Is there a good reason for this difference to exist?". If not, then it should be eliminated.</p>
<p>One example from my own experience is a system deployed to Unix servers which had a standard directory structure except for the root directory that differed across environments. In my investigation I discovered that the reason for the difference was that the test server contained two separate test environments, each of which therefore required a different root directory. I considered this a valid reason given the existing hardware at the time. When the opportunity arose, I suggested that an additional test server be added and the second test environment moved to it to allow the paths to be standardized.</p>
<p>To handle unavoidable environmental differences, I recommend encapsulating them within the software to isolate these differences from the rest of the system. The specific mechanism will vary depending on the technology being used and the particular difference being addressed. I discuss some of these mechanisms in my article <a href="http://www.basilv.com/psd/blog/2007/designing-for-deployability">Designing for Deployability</a>. The goal you should aim for is to be able to deploy the software using a simple, automated process. For deploying code, this typically means an automated script that takes the output of the build process and copies it to the appropriate location (usually on another server) for the environment you are deploying to. </p>
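<p>As one possible illustration of encapsulating environmental differences, the sketch below keeps every per-environment value in a single settings object that the rest of the code asks for. The <code>APP_ENV</code> variable name, the paths, and the database URLs are all hypothetical.</p>

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentSettings:
    root_dir: str
    database_url: str


# Hypothetical values; in practice these might live in per-environment files.
SETTINGS = {
    "dev": EnvironmentSettings("/apps/dev", "db://dev-server/app"),
    "test": EnvironmentSettings("/apps/test", "db://test-server/app"),
    "prod": EnvironmentSettings("/apps/prod", "db://prod-server/app"),
}


def load_settings(environ=os.environ) -> EnvironmentSettings:
    """Pick settings by the (assumed) APP_ENV variable, defaulting to dev."""
    name = environ.get("APP_ENV", "dev")
    if name not in SETTINGS:
        raise ValueError(f"unknown environment: {name}")
    return SETTINGS[name]


settings = load_settings({"APP_ENV": "test"})
print(settings.root_dir, settings.database_url)
```

<p>With this arrangement the same build output can be copied to any environment, and the only thing the deploy script has to set is the environment name.</p>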
<p>In an earlier article on <a href="http://www.basilv.com/psd/blog/2006/deploying-application-changes">Deploying Application Changes</a>, I distinguished between full deployments where the entire application or component is deployed as a unit, and partial deployments in which only the delta - the pieces that have changed - is promoted. As I discuss in that article, there are advantages and disadvantages to either approach. Within a single application, I prefer the full deployment approach. When multiple applications are involved, especially across multiple support or business units, I prefer to have the option of partial deployments. If separate applications are too tightly coupled, then changes in one will require changes in the other, which then requires that both applications be deployed together. So high coupling between applications reduces the ease of deployability of each individual application. Clearly defined interfaces help minimize coupling, and if the changing application maintains backwards-compatibility of its interface, the other application will then not need to change, or can choose to change on its own timeline.</p>
<p>Like most non-functional requirements, deployability is often neglected or overlooked during application development. Inexperienced or rushed developers can create systems that work great in development, but have many problems in the test or production environments due to issues with deployment. The absence of a defined and preferably automated deploy process or problems in testing due to environment differences are warning signs that more attention to deployability is needed.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2007/architecting-for-deployability/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ICE Conference Highlights</title>
		<link>http://www.basilv.com/psd/blog/2006/ice-conference-highlights</link>
		<comments>http://www.basilv.com/psd/blog/2006/ice-conference-highlights#comments</comments>
		<pubDate>Thu, 16 Nov 2006 15:00:03 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[management]]></category>
		<category><![CDATA[professional]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[project management]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2006/ice-conference-highlights</guid>
		<description><![CDATA[I recently attended two seminars at the ICE 2006 Technology Conference in Edmonton thanks to my employer CGI. I enjoyed both presentations and regret not attending more. I was able to pull some useful tips and ideas from each seminar that resonated with me. The first session was Lessons for Risk Management Taken from the [...]]]></description>
			<content:encoded><![CDATA[<p>I recently attended two seminars at the <a href="http://www.iceconference.com/">ICE 2006 Technology Conference</a> in Edmonton thanks to my employer <a href="http://www.cgi.com/">CGI</a>. I enjoyed both presentations and regret not attending more. I was able to pull some useful tips and ideas from each seminar that resonated with me. </p>
<p>The first session was <a href="http://www.lessons-from-history.com/">Lessons for Risk Management Taken from the Titanic</a>, presented by Mark Kozak-Holland. I really enjoyed his use of the story of the Titanic as a case study to relate to modern IT projects. Mark talked about a variety of project management issues and not just risk management as the title suggested. Some of his most noteworthy points were:</p>
<ul>
<li>When do projects really finish? Most software development projects are considered complete when the software is deployed into production, if not earlier. Mark argued, and I agree, that the solution should be operational in production for a period of time for the project to be considered done. For the project to be considered a true success, the key return-on-investment metrics that drove the project should be measured to demonstrate that the original purpose of the project was achieved.
</li>
<li>There are three different phases of a project when failures can happen: during the development, during implementation, and during operation. The later the phase, the more costly the failure. The key to successful operation is to ensure that non-functional requirements such as maintainability, performance, security, manageability, etc. get the same focus during development and testing as the functional requirements. Most executives and business representatives understand the functional requirements, but seldom understand the non-functional requirements, so there is a natural tendency for them to be neglected - either cut from the project scope, or not sufficiently tested. Mark pointed out that it is often much easier to add on functional requirements than non-functional requirements, and emphasized that project managers and architects should push back when the sponsors or business makes decisions that would compromise the solution.
</li>
</ul>
<p>The second session was <em>The Role of the Enterprise Architect</em>, presented by <a href="http://www.quickresponse.ca/">Jason Uppal</a>. I have had conflicting opinions regarding the work and value of architects, so it was refreshing to hear Jason present a very pragmatic and reasonable viewpoint of enterprise architects. Some of the points that I liked were:</p>
<ul>
<li>Enterprise architecture starts with a vision - is this project / solution the right thing to do? Jason had a great example of an architecture review he did of an overdue project. He determined that the original business goal of the project had been to improve the functionality and performance of a set of business reports. But the actual project being delivered was to migrate the mainframe application to a more modern web-based application.
</li>
<li>Jason had an interesting rule of thumb regarding the problem statement provided by a client: shorter is better. A one sentence problem statement is ideal, while a multi-page document spells doom for the project. Why? The longer the problem statement, the more the client is specifying the solution they think they want, rather than specifying the actual problem. The more the client specifies the solution, the less they make use of the expertise of the architect in determining the solution.
</li>
<li>Jason made an interesting point regarding project cancellation: it should be considered operational learning rather than a failure. One of the responsibilities of the architect is to determine whether the project makes sense. If it doesn't, it should be killed - the earlier the better.
</li>
<li>Architects should never be the cause of project delays. Each architectural iteration should be one to two weeks in length. Each iteration elaborates on what was produced before and leaves fewer unanswered questions.
</li>
<li>Architects should not have assumptions in their work. Instead, any assumptions should be identified and treated as risks. This allows risk management strategies, such as risk mitigation, to be planned in case the risk occurs.
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2006/ice-conference-highlights/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Complexity and Reliability</title>
		<link>http://www.basilv.com/psd/blog/2006/complexity-and-reliability</link>
		<comments>http://www.basilv.com/psd/blog/2006/complexity-and-reliability#comments</comments>
		<pubDate>Thu, 14 Sep 2006 15:00:32 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[complexity]]></category>
		<category><![CDATA[reliability]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2006/complexity-and-reliability</guid>
		<description><![CDATA[Unrestrained complexity is a critical limiting factor in producing working software. The more complex a system, the more it will cost to create and operate and the less reliable it will be. Yet the bane of complexity is largely ignored by the IT industry. Software vendors, competing on the basis of feature sets, are constantly [...]]]></description>
			<content:encoded><![CDATA[<p>Unrestrained complexity is a critical limiting factor in producing working software. The more complex a system, the more it will cost to create and operate and the less reliable it will be. Yet the bane of complexity is largely ignored by the IT industry. Software vendors, competing on the basis of feature sets, are constantly enhancing their existing products and introducing new, more capable ones. IT consultants trying to win more work are constantly pitching ideas for new systems, new business solutions, and new capabilities. Customers are constantly asking for new or enhanced functionality. Software developers thrive on creating this functionality. These forces all lead towards greater complexity. No one benefits from fighting complexity, so its harmful effects are not publicized.</p>
<p>Actually, my last sentence is not true. Customers do want software that works, and since simpler software is more reliable, they benefit from fighting complexity. Unfortunately, the costs of complexity are largely hidden from the sight of the customer, so they seldom realize the cost involved in asking for more features. They just get upset when the software stops working or works poorly, and they do not appreciate their contribution to the problem. IT operational staff also benefit from fighting complexity to keep systems as reliable as possible, since they need to keep them running. But they seldom have much if any influence upon the procurement or development of these systems.</p>
<p>Lately I have been struggling with improving the reliability of a particular system. As the team has identified and tried to resolve various issues, I have come to see that the high complexity of the system overshadows our efforts. Why does complexity so strongly affect reliability? I like using a mechanical analogy: the more moving parts in a device, the higher the probability that one of them will fail within a fixed time, thus lowering the overall reliability of the device. In an IT system, the failure points are different. The actual physical devices - the hardware - are ironically simpler to manage since their reliability is easy to improve through redundancy. It is the software that is the problem. The greater the complexity of the software, the higher the likelihood of defects - not just within the application code itself, but also in the overall software stack that is used. For an enterprise business application, this typically includes third party libraries, application server, web server, database server, and operating system, and can include additional services such as email, scheduling or messaging. A defect anywhere in the stack can cause the application to fail.</p>
<p>The problem with software reliability goes beyond just defects. In an enterprise setting, applications experience a wide variety of changes, each of which represents an opportunity for failure. Each of these changes is in essence a "moving part", even though the actual code for the application has not changed. The most typical change is enhancements to the application, which can introduce new defects in both the new and existing functionality. Other examples of changes include upgrades to application servers, web servers, database servers, operating systems, or hardware, configuration changes to systems such as email, network addresses, or scheduling, or security changes such as password expiration. The more complex the system, the more of these changes it experiences, which increases the chance of failure.</p>
<p>The relationship between complexity and reliability can be modeled statistically. I will represent an IT system as a collection of pieces (P) that each has a chance of failure (F), expressed as a probability of failing within one year. I think of each piece as abstractly representing something that can fail - the equivalent of that moving part in a mechanical device. The number of pieces correlates with the complexity of the system. While it is hard to determine even approximate values for these measures in a real system, just using abstract concepts and figures can provide an appreciation for the relationship between the two values. The probability of the system having no failures in one year is (1-F)<sup>P</sup>. Using baseline values of 100 pieces and a 0.01 probability of failure for each piece in the year (1%), the chance of no failures in a year is only 37%. This means the chance of having one or more failures is 63%. What happens as the complexity increases? </p>
<table class="fancy" cellspacing="0">
<tr>
<th># of Pieces (P)</th>
<th> % Chance of failure per piece (F)</th>
<th>Overall % chance of no failures</th>
</tr>
<tr>
<td>100</td>
<td>1%</td>
<td>37%</td>
</tr>
<tr>
<td>200</td>
<td>1%</td>
<td>13%</td>
</tr>
<tr>
<td>500</td>
<td>1%</td>
<td>0.7%</td>
</tr>
</table>
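<p>The figures in the table above can be reproduced with a few lines of code. This is only a sketch of the (1-F)<sup>P</sup> model, and it assumes that the pieces fail independently of one another; the function name is my own.</p>

```python
def chance_of_no_failures(pieces: int, failure_rate: float) -> float:
    """Probability that none of `pieces` independent pieces fails in a year."""
    return (1.0 - failure_rate) ** pieces


for pieces in (100, 200, 500):
    chance = chance_of_no_failures(pieces, 0.01)
    print(f"{pieces} pieces at 1% each: {chance:.1%} chance of no failures")
```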
<p>The reliability of the system falls quickly as the number of pieces is increased. In order to maintain the same reliability when the complexity doubles, the chance of failure of each piece must be cut roughly in half.</p>
<table class="fancy" cellspacing="0">
<tr>
<th># of Pieces (P)</th>
<th> % Chance of failure per piece (F)</th>
<th>Overall % chance of no failures</th>
</tr>
<tr>
<td>100</td>
<td>1%</td>
<td>37%</td>
</tr>
<tr>
<td>200</td>
<td>0.5%</td>
<td>37%</td>
</tr>
<tr>
<td>500</td>
<td>0.2%</td>
<td>37%</td>
</tr>
</table>
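<p>The per-piece failure rates in this second table come from solving (1-F)<sup>P</sup> = R for F, which gives F = 1 - R<sup>1/P</sup>. The sketch below does that calculation under the same independence assumption; the function name is my own.</p>

```python
def required_failure_rate(pieces: int, target_no_failure: float) -> float:
    """Per-piece failure rate F such that (1 - F) ** pieces equals the target."""
    return 1.0 - target_no_failure ** (1.0 / pieces)


# Baseline reliability: 100 pieces at a 1% chance of failure each.
baseline = 0.99 ** 100

for pieces in (100, 200, 500):
    rate = required_failure_rate(pieces, baseline)
    print(f"{pieces} pieces need a per-piece failure rate of {rate:.2%}")
```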
<p>In practice, however, more complex systems are harder to understand and change, thus reducing the reliability of each change that is made. Once a system does fail, greater complexity means that it is often harder to diagnose and fix the problem. This makes the downtime longer. Complexity therefore also leads to more serious failures.</p>
<p>Complexity and reliability are closely connected. If you have no plan to manage the complexity of a system, then you may be unpleasantly surprised by what happens to its reliability. Since our goal as professionals is to provide software that works, thinking about complexity and reliability is a necessity.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2006/complexity-and-reliability/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

