June 2007 Entries
A Unit of Work Implementation for Webservices

This post is in response to my last one.  After a bit of refactoring (and lots of thinking), here's what I've come up with.

The first enabling refactoring I must make is to remove the request from the response.  Instead, I flip the relationship and wrap the response inside the request.

Secondly, I'm introducing a base class for Requests.  This base class will include the Correlation Id for the message.  The correlation id is an optional, client-provided value.  Typically, it would be used to correlate responses and requests.  I'm including it for the convenience of my clients (in case they want to use it).

    public abstract class Request
    {
        private string _correlationId;

        public string CorrelationId
        {
            // this is a client-provided identifier for
            // easily matching request/response pairs on the 
            // client side
            get { return _correlationId; }
            set { _correlationId = value; }
        }
    }

We also want to embed the Response inside the request, so I'm introducing a generic class for convenience.

    public abstract class Request<T> : Request
    {
        private T _response;

        public T Response
        {
            get { return _response; }
            set { _response = value; }
        }
    }

Here's an example request message for our webservice:

    public class HelloRequest : Request<HelloResponse>
    {
        private string _from;

        public string From
        {
            get { return _from; }
            set { _from = value; }
        }
    }


And its corresponding response:

    public class HelloResponse : Response
    {
        private string _message;

        public string Message
        {
            get { return _message; }
            set { _message = value; }
        }
    }

Ah, yes, don't forget the common type for our Responses:

    public abstract class Response
    {
        private bool _wasSuccessful;
        private string[] _errors;

        public bool WasSuccessful
        {
            get { return _wasSuccessful; }
            set { _wasSuccessful = value; }
        }

        public string[] Errors 
        {
            // a better implementation would use a DTO for an error
            // providing an error code, error name, language-specific error,
            // etc
            get { return _errors; }
            set { _errors = value; }
        }
    }

This is our example scenario.  Let's get this simple case working first.  The pleasant side-effect from encapsulating the response inside the request is that it makes writing a message handler simpler.  Here's our interface:

    public interface IRequestHandler<T>
    {
        T Handle(T request);
    }

By removing the notion of an input/output pair, and sticking simply with a single type to contain the interaction, we now have a very simple interface to implement.  This is akin to the client sending us a document to fill out.  We take the information they have provided to us and append the information they requested.

Our webservice implementation is now extremely simple:

        [WebMethod]
        public HelloRequest Hello(HelloRequest request)
        {
            IRequestHandler<HelloRequest> handler = DependencyContainer.Resolve<IRequestHandler<HelloRequest>>();
            return handler.Handle(request);
        }

And our handler (in this case) is also extremely simple:

    public class HelloHandler : IRequestHandler<HelloRequest>
    {
        public HelloRequest Handle(HelloRequest request)
        {
            HelloResponse response = new HelloResponse();
            response.Message = string.Format("Hello {0}!", request.From);
            response.WasSuccessful = true;
            request.Response = response;
            return request;
        }
    }

Let's look at some client code:

        private void btnHello_Click(object sender, EventArgs e)
        {
            HelloRequest request = new HelloRequest();
            request.From = txtName.Text;
            HelloRequest result;
            using (Service service = new Service())
            {
                result = service.Hello(request);
            }
            if (result == null)
            {
                throw new Exception("The result was null.");
            }
            MessageBox.Show(string.Format("The result message was:\n\n{0}",result.Response.Message));
        }


That's great, but where's my unit of work?  The beauty lies in the simplicity.  Here's the request:

    public class UnitOfWorkRequest : Request
    {
        private Request[] _requests;

        public Request[] Requests
        {
            get { return _requests; }
            set { _requests = value; }
        }
    }

Here's the handler:

    public class UnitOfWorkHandler : IRequestHandler<UnitOfWorkRequest>
    {
        public UnitOfWorkRequest Handle(UnitOfWorkRequest request)
        {
            for (int i = 0; i < request.Requests.Length; i++)
            {
                Request work = request.Requests[i];
                request.Requests[i] = Handle(work);
            }
            return request;
        }

        private Request Handle(Request request)
        {
            Type messageType = request.GetType();
            Type handlerType = typeof(IRequestHandler<>).MakeGenericType(messageType);
            object handler = DependencyContainer.Resolve(handlerType);
            object[] args = { request };
            return (Request)handlerType.GetMethod("Handle").Invoke(handler, args);
        }
    }
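
The reflection-based dispatch above can be tried in isolation.  Here's a minimal, self-contained sketch of the same technique.  The names SimpleContainer, EchoRequest, EchoHandler and Dispatcher are hypothetical stand-ins (the real DependencyContainer isn't shown in this post), so treat this as an illustration under those assumptions rather than the prototype's actual code.

```csharp
using System;
using System.Collections.Generic;

public abstract class Request
{
    public string CorrelationId { get; set; }
}

public interface IRequestHandler<T>
{
    T Handle(T request);
}

// Hypothetical message type, used only for this sketch.
public class EchoRequest : Request
{
    public string Text { get; set; }
    public string Reply { get; set; }
}

public class EchoHandler : IRequestHandler<EchoRequest>
{
    public EchoRequest Handle(EchoRequest request)
    {
        request.Reply = "Echo: " + request.Text;
        return request;
    }
}

// Hypothetical stand-in for DependencyContainer: a registry keyed by service type.
public static class SimpleContainer
{
    private static readonly Dictionary<Type, object> Instances = new Dictionary<Type, object>();

    public static void Register<TService>(TService instance)
    {
        Instances[typeof(TService)] = instance;
    }

    public static object Resolve(Type serviceType)
    {
        object instance;
        if (!Instances.TryGetValue(serviceType, out instance))
        {
            throw new InvalidOperationException("No handler registered for " + serviceType);
        }
        return instance;
    }
}

public static class Dispatcher
{
    public static Request Dispatch(Request request)
    {
        // Close IRequestHandler<> over the runtime type of the message.
        Type handlerType = typeof(IRequestHandler<>).MakeGenericType(request.GetType());
        object handler = SimpleContainer.Resolve(handlerType);
        return (Request)handlerType.GetMethod("Handle").Invoke(handler, new object[] { request });
    }
}
```

One thing to keep in mind: MethodInfo.Invoke wraps any exception thrown by the handler in a TargetInvocationException, so error handling around the dispatch needs to look at InnerException.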

And here's the Webservice implementation:

        [WebMethod]
        public UnitOfWorkRequest ProcessWork(UnitOfWorkRequest request)
        {
            IRequestHandler<UnitOfWorkRequest> handler = DependencyContainer.Resolve<IRequestHandler<UnitOfWorkRequest>>();
            return handler.Handle(request);
        }


The client code looks exactly like what you would imagine:

        private void btnBoth_Click(object sender, EventArgs e)
        {
            List<Request> requests = new List<Request>();
            
            PingRequest ping = new PingRequest();
            ping.CorrelationId = "ping-123";
            ping.From = "Win App";
            requests.Add(ping);

            HelloRequest hello = new HelloRequest();
            hello.CorrelationId = "hello-124";
            hello.From = txtName.Text;
            requests.Add(hello);            

            UnitOfWorkRequest unitOfWork = new UnitOfWorkRequest();
            unitOfWork.CorrelationId = "unitofwork-1";
            unitOfWork.Requests = requests.ToArray();

            UnitOfWorkRequest result;
            using (Service service = new Service())
            {
                result = (UnitOfWorkRequest)service.Process(unitOfWork); //using generic webmethod
            }
            if (result == null)
            {
                throw new Exception("The result was null.");
            }

            if (result.Requests[0].CorrelationId != "ping-123")
            {
                throw new Exception("The Ping was not found or is out of order.");
            }
            PingRequest pingResult = (PingRequest)result.Requests[0];
            MessageBox.Show(string.Format("The ping was successfully received at {0}.", pingResult.Response.Received));

            if (result.Requests[1].CorrelationId != "hello-124")
            {
                throw new Exception("The Hello was not found or is out of order.");
            }
            HelloRequest helloResult = (HelloRequest) result.Requests[1];
            MessageBox.Show(string.Format("The message was:\n\n{0}", helloResult.Response.Message));
        }

If you look closely at the code above, you'll notice that I actually used my next (potential) step in refactoring.  Now that the Handlers are all injectable based on the message type, we technically only need a single webservice method for all our messages:

        [WebMethod]
        public Request Process(Request request)
        {
            // food for thought
            Type messageType = request.GetType();
            Type handlerType = typeof(IRequestHandler<>).MakeGenericType(messageType);
            object handler = DependencyContainer.Resolve(handlerType);
            object[] args = {request};
            return (Request)handlerType.GetMethod("Handle").Invoke(handler, args);
        }

The client can pass any of our messages into our application, then simply cast the result back to the same message type.  The entire interaction is encapsulated in a single message.

There is one caveat with the generic method above.  By default, the XML serializer won't figure out how to handle your types if you remove the other webmethods.  You could either leave the old ones in place, or, I believe, mitigate this by building your DTOs contract-first (i.e., WSDL or XSD first).  I'll be trying this shortly.  If it works, I will be looking very hard at building a DSL (in Boo?) to generate the WSDL or XSD for me.
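
Another option that may be worth trying (it is not something the prototype does) is to tell the serializer about the derived message types with XmlIncludeAttribute on the base class.  The sketch below only demonstrates the XmlSerializer behaviour outside of any webservice.  GreetRequest and SerializerDemo are hypothetical names, and whether this is enough for a particular asmx setup is an assumption to verify.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// XmlInclude lets XmlSerializer recognize GreetRequest when it is
// handed (or asked to produce) the abstract base type.
[XmlInclude(typeof(GreetRequest))]
public abstract class Request
{
    public string CorrelationId { get; set; }
}

// Hypothetical message type, used only for this sketch.
public class GreetRequest : Request
{
    public string From { get; set; }
}

public static class SerializerDemo
{
    // Serializes through the base type and reads the message back again.
    public static Request RoundTrip(Request request)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Request));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, request);
            using (StringReader reader = new StringReader(writer.ToString()))
            {
                return (Request)serializer.Deserialize(reader);
            }
        }
    }
}
```

The downside is that the base class has to list every message type, so adding a message means touching Request.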

I'm attaching a working prototype of all of the above for anyone who's interested.  There are no unit tests since it was only a prototype.  It is simply a proof of concept.

This particular example was built using legacy webservices (asmx), but given that the entire interaction between the service consumer and the service is through a very simple interface (IRequestHandler), using this with WCF should be a non-issue.  In fact, going the DSL route, it should be merely a configuration switch when generating the DTOs (use WCF attributes or not).

As a side note, I ran into a set of new patterns while building this.  They are the patterns of Xml Schema design.  Did you know the Russian Doll pattern is the least extensible? lol

I would also recommend reading up on the Notification pattern, as creating a supertype for the Request Handlers would be fairly trivial (i.e., a stateless version of Martin's ServerCommand).  Also see Dave's post on the same topic.

One last thing: if you look closely at the resulting webservice, you will notice a subtle redefinition.  The webservice is not defined by the methods it exposes (à la RPC) but rather by the documents it can process (à la messaging).

You can grab the bits from here.

posted @ Thursday, June 28, 2007 10:07 PM | Feedback (5)
Transactions, Services, and Unit of Work

First off, let me just say that the transaction per request model (for webservices) sucks.  Here's why in a simple scenario:

I have a simple request:

public class UpdateCustomerAddressRequestDto
{
   public int CustomerId { get; set; }
   public string Address1 { get; set; }
   public string Address2 { get; set; }
   public string City { get; set; }
   ...
}

With a simple response:

public class UpdateCustomerAddressResponseDto
{
   public bool Success { get; set; }
   public ViolationsDto[] Errors { get; set; }
   public UpdateCustomerAddressRequestDto Request { get; set; }
}

The transaction model works perfectly for the following WebMethod:

[WebMethod]
public UpdateCustomerAddressResponseDto UpdateCustomerAddress(UpdateCustomerAddressRequestDto request);

However, as soon as I want to get creative, things start breaking:

[WebMethod]
public UpdateCustomerAddressResponseDto[] UpdateCustomerAddress(UpdateCustomerAddressRequestDto[] requests);

How does that work in the Transaction Per Request model?  It doesn't, unless you want to accept or reject all the requests as one.  In this scenario, you are way better off letting each request be in its own transaction scope.  What else can't I do in that model?

public void Blah()
{
   UnitOfWorkDto work = new UnitOfWorkDto();
   UpdateCustomerAddressRequestDto addressRequest = new UpdateCustomerAddressRequestDto();
   addressRequest.CustomerId = 123;
   // add more stuff
   work.Add(addressRequest);
   UpdateRecurringBillingDateRequestDto billingRequest = new UpdateRecurringBillingDateRequestDto();
   billingRequest.CustomerId = 123;
   billingRequest.NewDay = 15;
   work.Add(billingRequest);
   UnitOfWorkResponseDto response = webservice.ProcessWork(work);
   UpdateCustomerAddressResponseDto addressResponse = (UpdateCustomerAddressResponseDto)response.GetResponseFor(addressRequest);
}

In short, the transaction per request model is not granular enough.  It's a really limiting model.  Sorry, this has been a bit of a rant.

What would be really nice:

A webservice implementation of Unit of Work (such as the above) that allows the client to specify a transaction scope for the work (i.e., transaction per request or transaction per unit of work).  In the case of a transaction per unit of work, the requests would all enlist in the work's transaction.  I *think* this is possible, but I haven't dug deep enough into the goo to make sure.  I could be wrong, but I don't think the current WS-Stuff and WCF implementations will cut it if you want to do this.
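
To make the idea concrete, here is a rough sketch of what the server side could look like using System.Transactions.TransactionScope.  TransactionMode and UnitOfWorkRunner are hypothetical names, and each piece of work is reduced to a Func<bool> that reports success.  This only covers a single process; it says nothing about flowing a transaction across the webservice boundary, which is the part that still needs digging.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

// Hypothetical: the scope the client asks for.
public enum TransactionMode
{
    PerRequest,
    PerUnitOfWork
}

public static class UnitOfWorkRunner
{
    // Runs each piece of work and reports which ones were committed.
    public static bool[] Run(IList<Func<bool>> work, TransactionMode mode)
    {
        bool[] committed = new bool[work.Count];

        if (mode == TransactionMode.PerUnitOfWork)
        {
            // One ambient transaction for everything: all or nothing.
            using (TransactionScope scope = new TransactionScope())
            {
                for (int i = 0; i < work.Count; i++)
                {
                    if (!work[i]())
                    {
                        // Leaving the scope without Complete() rolls back.
                        return committed;
                    }
                }
                scope.Complete();
            }
            for (int i = 0; i < work.Count; i++)
            {
                committed[i] = true;
            }
            return committed;
        }

        // Each request gets its own independent transaction.
        for (int i = 0; i < work.Count; i++)
        {
            using (TransactionScope scope = new TransactionScope(TransactionScopeOption.RequiresNew))
            {
                committed[i] = work[i]();
                if (committed[i])
                {
                    scope.Complete();
                }
            }
        }
        return committed;
    }
}
```

Any database connection opened inside the work while a scope is active would enlist in that ambient transaction, which is what lets the handlers stay unaware of the chosen mode.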

posted @ Saturday, June 23, 2007 8:55 PM | Feedback (1)
The Challenge

Here is part of Rob's response to my post:

So, here's my take on this - and on a side note you're actually raising issues that I struggled with with the Commerce Starter Kit :).
Regarding the inventory - your SP is only 2/3s there and isn't finished. If your inventory rule is that you can't sell an item that's not in inventory, then you need to run another query and debit the baskets right? Moreover, all of this needs to run inside a transaction doesn't it?
So - do you do this in code? Never. We tackled this exact issue with the CSK (when we had inventory) and Microsoft *suggested* that we run a batch update and flag all current carts with a particular item that the item was no longer available. Oddly enough (with regards to your example) - an SP was really the only option here since, again, all of this needed to happen inside a transaction.

I emailed Rob earlier this evening, and I'm committing to posting a solution to this problem later this week (probably over the weekend when I have time to write a proof of concept).  And, as Rob pointed out in his response to my email, I won't be looping through a collection of shopping carts and saving them back out to the database.  That's a scalability nightmare.

Stay tuned.

posted @ Wednesday, June 06, 2007 11:32 PM | Feedback (0)
Database Business Logic

First off, let me just say that I think Rob Conery has managed to create a terrific product.  If you've never heard of SubSonic, be kind to yourself and go check it out.  With that off my chest, let me just say that Rob and I just don't see eye to eye on everything (at least not right now).  I'll just start from here:

"Stop trying to fight it - use the DB for what it's good for (DRI and data handling) and let it do the heavy lifting. Your goal should always be performance - don't sacrifice extra connections just to satisfy the ORM design model."

The 411: Stored Procedures, Views, and ORM

While I agree 100% that you should let the DB do what it's good for and let the application do what it's good for, I would disagree that you should always let the DB do the heavy lifting.  I would also strongly disagree that you should always build with performance as your goal.

First, let's frame this so we can be a bit more concrete with the discussion.  Let's suppose that we are working on an enterprise application.  By enterprise application, I mean an application whose active development will last at least a year.  I also mean an application with multiple developers and a constant influx of business changes (requirement changes, new features, etc.).  I'm also going to take Data Warehousing and OLAP off the table.  These topics have different concerns, and different concerns get different solutions.

On Database Business Logic

Why is putting business logic inside the database harmful? 

Talk is cheap and archi-speak without implementation is worthless, so let's look at a few examples of why this matters.  Here's an overly simplified example:

CREATE PROCEDURE usp_AddLineItem
(
@OrderID int,
@ProductID int,
@Qty int
)
AS

INSERT INTO LineItems
(OrderID, ProductID, Description, Qty, UnitCost)
SELECT @OrderID, @ProductID, Description, @Qty, UnitCost
FROM Products WHERE ProductID = @ProductID

INSERT INTO InventoryTransactions
(ProductID, Qty, OrderID, Date)
VALUES
(@ProductID, @Qty, @OrderID, GETDATE())
 

This is very simple business logic.  As each line item is added to an order, we make an appropriate entry in the inventory log.  This works great because it saves us a roundtrip to the database each time a line item is added to an order.  However, what you don't see is what you are sacrificing in order to save the roundtrip and write the "simple logic" in the database.  Let's look a bit closer.

If you take a step back and realize what this sproc is actually doing, you will notice that the simple business logic we introduced into our sproc creates a side-effect in our data model.  The side effect is by design and is what saves our extra db roundtrips.  Each time a line item is added, the inventory changes.  Why do I care?  Because while it saved some execution time during the order process, it made the application more complex to develop for scalability.  Simple business logic implemented in the database as a side effect wreaks havoc on data caching.  That's right, we no longer need to make the extra roundtrip to the database to manage the inventory levels, but now my application must make a roundtrip to the database every time it wants to check the inventory levels.  Well, not quite, you could duplicate every side effect both in the code and the database, but that's asking for bugs.  If you happen to have an icon for the product information page to show when the item is backordered, you will lose way more than you hoped to gain.

What does this do to scalability?  Given enough of these side effects, your application will be forced to scale at the database level.  That's right, when a click on the page forces a database call every time it needs a simple data retrieval, you can kiss scalability goodbye (good scalability anyway).  Data caching in the application tier is one of the most effective performance tweaks you can make to a well-written application.  Its effects will be more than an order of magnitude better than anything you can do inside the database.  [Assuming you aren't doing anything stupid in the application.]  Memory is fast AND cheap; use it.  This isn't a license to abuse extra connections to the database for the sake of caching.  Given that your application is designed to share database connections across multiple database calls in the same pageload, service call, etc., the performance hit for executing the calls separately inside the same connection will be marginal.  In a domain-driven design or model-driven application, we let the application model handle reflecting itself onto the data model.  We can then cache parts of the application model in memory for huge performance gains.  You either sacrifice all of that for your "simple logic in the database", or you go to great lengths to mitigate it.  Introducing SQL Server Service Broker and Notification Services just to preserve your simple logic in the database is ludicrous.  It's not bad medicine, it's bad side effects.
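
To illustrate the caching argument, here is a minimal sketch of an application-tier inventory cache.  InventoryCache and its loader delegate are hypothetical and not thread-safe.  The point is only that when the application records the inventory change itself, it can keep the cached level in step instead of rereading it from the database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical read-through cache of inventory levels, keyed by product id.
public class InventoryCache
{
    private readonly Dictionary<int, int> _levels = new Dictionary<int, int>();
    private readonly Func<int, int> _loadFromDatabase;

    public InventoryCache(Func<int, int> loadFromDatabase)
    {
        _loadFromDatabase = loadFromDatabase;
    }

    // Counts how often we actually had to go to the database.
    public int DatabaseReads { get; private set; }

    public int GetLevel(int productId)
    {
        int level;
        if (!_levels.TryGetValue(productId, out level))
        {
            level = _loadFromDatabase(productId);
            DatabaseReads++;
            _levels[productId] = level;
        }
        return level;
    }

    // Called by application code when it writes the inventory transaction
    // itself, so the cached level never goes stale.
    public void RecordSale(int productId, int qty)
    {
        _levels[productId] = GetLevel(productId) - qty;
    }
}
```

If the inventory entry is written as a side effect inside usp_AddLineItem instead, nothing in the application knows to call RecordSale, and the only safe option left is to reread the level every time.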

Let's look at a bit more complex example.

CREATE PROCEDURE usp_FinalizeOrder
(
@OrderID int
)
AS

DECLARE @Total money
SELECT @Total = SUM(Qty * UnitCost) FROM LineItems WHERE OrderID = @OrderID
UPDATE Orders SET Total = @Total WHERE OrderID = @OrderID

DECLARE @CustomerID int
SELECT @CustomerID = CustomerID 
FROM Orders
WHERE OrderID = @OrderID

DECLARE @AvgTotal money
SELECT @AvgTotal = AVG(Total)
FROM ( SELECT TOP 20 Total FROM Orders 
WHERE CustomerID = @CustomerID ORDER BY OrderID DESC ) o

IF @AvgTotal >= 500
UPDATE Customers SET Preferred = 1 WHERE CustomerID = @CustomerID


This one is a bit trickier.  We don't really care that the sproc is calculating the order total.  Duplicating that logic in C# is trivial, but our other requirement was that if the Customer's average order total (over the previous 20 orders) was greater than or equal to $500, the Customer receives preferred status.  In the previous example, the side effect was clearly not worth it to implement the "simple business logic" inside the database.  But look at our example now: the application has to perform a set operation on the Customer's previous orders in order to determine the Customer's status.  Do we bring all the orders back to the application tier to create this calculation and avoid the side effect?  No way!  That's silly.  But we do pull the business logic away from the database!

We simply split this sproc in half.  We stick with the FinalizeOrder sproc and create another proc to pull back the average order total over the Customer's previous X orders (usp_GetCustomerOrderAvg).  If the application decides that the Customer's status has changed, it can call usp_UpdateCustomer to update the data model.  We are now free to use application level constructs to handle this business logic.  We may use a domain-driven design approach to allow the customer object to check its own status after each order is added.  Or, we could use an event-driven solution, and allow the calculation to be performed when an OrderCreated event is fired.  The problem with putting simple business logic in the database is that we have to go to the database to execute it.
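
Here's a rough sketch of the application side of that split.  The two delegates stand in for calls to usp_GetCustomerOrderAvg and usp_UpdateCustomer.  The Customer and OrderFinalizedHandler classes are hypothetical, and the $500 threshold is the rule from the sproc above.

```csharp
using System;

public class Customer
{
    private const decimal PreferredThreshold = 500m;

    public int CustomerId { get; set; }
    public bool Preferred { get; private set; }

    // The business rule now lives in the application model.
    // Returns true only when the status actually changed.
    public bool ApplyOrderAverage(decimal averageOrderTotal)
    {
        if (Preferred || averageOrderTotal < PreferredThreshold)
        {
            return false;
        }
        Preferred = true;
        return true;
    }
}

public class OrderFinalizedHandler
{
    // Stand-in for a call to usp_GetCustomerOrderAvg.
    private readonly Func<int, decimal> _getOrderAverage;
    // Stand-in for a call to usp_UpdateCustomer.
    private readonly Action<Customer> _updateCustomer;

    public OrderFinalizedHandler(Func<int, decimal> getOrderAverage, Action<Customer> updateCustomer)
    {
        _getOrderAverage = getOrderAverage;
        _updateCustomer = updateCustomer;
    }

    public void Handle(Customer customer)
    {
        decimal average = _getOrderAverage(customer.CustomerId);
        if (customer.ApplyOrderAverage(average))
        {
            // Only write when something changed, so cached customers stay valid.
            _updateCustomer(customer);
        }
    }
}
```

The set operation still runs in the database, where it belongs, but the decision about what the average means is made (and is testable) in the application.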

Given that we have pulled the "simple business logic" away from the database, we are now free to cache until our memory is full.  Guess what other freedom we have just gained?  Data Model Partitioning!  If we follow these simple principles and really make an effort to keep our "simple business logic" from tying us down, we can now split the data model in two.  As our application scales, we are now free to very easily put Customer data and Order data on completely separate database servers.  Talk about scalability!  We may still opt to link the servers together for referential integrity, but forget those replication nightmares.

Yes, there really is nothing wrong with putting simple business logic inside the database.  If you plan on running the webserver and database server on the same physical machine, most of the above is moot.  But if you are building enterprise-level applications that must scale, do you really want to sacrifice all scalability except scalability at the database?  I know I don't.  The "simple logic" just isn't worth the trouble.  Not when the alternatives run "good enough" and allow much more flexibility down the road.  Remember, we should opt to build wider highways, not faster cars.  Latency only needs to be "good enough."  Throughput is much more rewarding.  Caching and data model partitioning are two avenues for building a wider "application highway".

In his response to my comment, Rob asked if I spoke from theory or practice.  I speak from both.  It took me quite some time (and reading) to figure the above out, and the first time it hit me, a light bulb went on.  Early on in my career, I was asked to spend some time with IBM in Boston, stress testing our .NET 1.0 ERP/CRM application.  We as developers (especially me) took great pains to put quite a bit of "simple business logic" into the database.  Needless to say, I was fairly shocked when we actually started the load tests.  Our final configuration included a dual processor application server and a four processor database server.  Guess where the bottleneck was?  Disk I/O and CPU usage on the database server.  We had moved all the "heavy lifting" and "simple business logic" into the database tier, and it came back and bit us hard.  We had relegated the CPU on the webserver to little more than sending queries back and forth to the database.  Webserver disk I/O was mostly used for serving static content.  The database turned into a big ugly performance monster, and the only way to quench its thirst was to let it replicate.  Ouch.  If I had the chance to go back and do it again, following what I know now, I know for a fact that application scalability would be more than an order of magnitude greater.  But don't take my word for it, try load testing your own app if you don't believe me.

In short, putting "simple business logic" into the database of your enterprise application is a dangerous avenue to approach.  Do so only with fear and trepidation, and realize that you are likely crippling some of the best options for application scalability.

On Developing for Performance

This post is starting to get longer than I like.  Tomorrow I'll follow up with a post on developing applications with performance as the goal.

posted @ Tuesday, June 05, 2007 9:52 PM | Feedback (15)
On Sticking with C#

I've seen a lot of posts about developers migrating to Ruby-on-Rails recently.  The posts all tend to say the same thing.  Microsoft is no longer innovating.  This is driving the alpha geeks to go open-source and adopt the languages and frameworks where the latest innovation is.  Ruby-on-Rails seems to be the current trend.

First off, I feel a bit torn between sticking with C# and learning RoR.  While I would like to learn a new language, I really don't see that much of a benefit at this point.  Yes, I will be the first to admit that I don't understand all the improvements in RoR.  I will also admit that the grass on the other side of the fence looks really, really green right now.

I do not have an ego big enough to call myself an alpha geek.  I love working with software.  Building new things from scratch is what I really enjoy.  Software development is unlike any other profession I can think of.  We build usable things from thin air using only our brains and keyboards.  This is my passion.

That being said, I will also readily admit that I do not know everything.  There is still so much left for me to learn, and much of that is language agnostic and readily usable in C# today.  While switching to Ruby or another language would be a blast, learning a new syntax and way of working on top of the existing items I want to learn will only steepen the learning curve.  For all the references I see to Mort flying around, this should be especially relevant.  If you are looking to strengthen the developers around you, is learning a new language really what you want them to focus on?  If you wanted to adopt a full-blown agile, TDD, OOP, BDD, DDD, etc. methodology, is learning a new language really what's important?  I would say no.  We should be trying to flatten the learning curve for them, not steepen it.  For those around us who don't have the time or passion to hone their craft, we need to focus them on something more valuable.  I would suggest TDD as a good starting point.

That is, unless you are one of those "jumping ship" to leave MS and all us "Morts" behind.  I won't be following you.  The .NET jobs are paying really well right now, and most of the self-described "alpha geeks" I've met in the wild are a pain-in-the-ass to talk with, let alone work on a team with.  It's typically either their way or the highway.  They know everything and you suck.  Screw that mentality.  I do hope all the alpha geeks migrate to RoR.  I won't miss the attitude.

This is what it all boils down to.  If you want to herd cats into a completely new direction, you don't start by dropping them into a completely unfamiliar environment.  They won't feel comfortable and won't have it.  Instead, try changing their direction a degree at a time.  Flatten the learning curve by eating the Agile elephant one bite at a time.  Try it, and they just may like you for it.

I will be watching RoR closely for the future, but I'm not even considering a language switch anytime soon.

posted @ Saturday, June 02, 2007 4:10 PM | Feedback (7)
Frameworks are Hard

Yes, I am stating the obvious here.  Frameworks are hard to build.  Here's an interesting thought about Frameworks vs Tools.

Frameworks are something you code against.  Unit test frameworks like NUnit and MbUnit are good examples.  You structure your code and the framework calls your code to do stuff.  WinForms and its related controls are another really good example.  You design the form, and it calls your code to handle the processing.  The key thing here is that it calls you.  You don't call it.

Tools are something you use.  ADO.NET and NHibernate are good examples of tools.  You call them whenever you want them to do something for you (i.e., persistence).
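
The difference is easy to see in a toy example.  Everything here (SlugTool, IMessageHandler, TinyMessageFramework, RecordingHandler) is hypothetical and exists only to show the two shapes of API side by side.

```csharp
using System;
using System.Collections.Generic;

// Tool: exposes an API that your code calls when it wants something done.
public static class SlugTool
{
    public static string Slugify(string title)
    {
        return title.Trim().ToLowerInvariant().Replace(' ', '-');
    }
}

// Framework: exposes an API that your code implements.
public interface IMessageHandler
{
    void Handle(string message);
}

public class TinyMessageFramework
{
    private readonly List<IMessageHandler> _handlers = new List<IMessageHandler>();

    public void Register(IMessageHandler handler)
    {
        _handlers.Add(handler);
    }

    // The framework decides when your code runs; it calls you.
    public void Publish(string message)
    {
        foreach (IMessageHandler handler in _handlers)
        {
            handler.Handle(message);
        }
    }
}

// Your code: an implementation the framework will call back into.
public class RecordingHandler : IMessageHandler
{
    public readonly List<string> Received = new List<string>();

    public void Handle(string message)
    {
        Received.Add(message);
    }
}
```

With the tool, the call site is in your code.  With the framework, your code is the call target, which is why the framework's author has to anticipate how it will be extended.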

Building a tool is easy.  Building a good framework is hard.  I've seen the terms used interchangeably in a lot of places, but I think this subtle difference can be seen if you look closely.  It's not exactly an earth-shattering idea, but it's a cool thought nevertheless.  I'm attempting to build a simple messaging framework at the moment, and the difficulty level is definitely there.  I can say this is probably one of the tougher things I've built in quite a while.  It's also one of the most fun.

Here's one last thought.  Tools provide an API for you to call.  Frameworks provide an API for you to implement.  Nuff said.  Now go code something cool.  ;-)

posted @ Saturday, June 02, 2007 3:05 PM | Feedback (3)