Thursday, January 13, 2011

Overriding BizTalk Dehydration mechanism

One of the most annoying things I find in orchestration design is the amount of planning one has to put into exception management.

If we consider a BPM process spanning several weeks (implemented as a long-running orchestration), we may proceed for days (completing several conversations with several systems in the meantime) before noticing that some data (maybe even in the message that started the orchestration) is wrong and throws an exception.

What to do if we have already contacted five external systems when the sixth rejects the data?

If we’re lucky we can compensate (roll back) the actions already performed on the five systems, terminating the orchestration without leaving traces (and possibly repeating the request with correct data), but this is not always possible:

  • Some systems may not have a way to roll back (compensate) an action (they may rely on manual fix procedures for this rare event).
  • Having to roll back only some of our actions means we must be able to redo only part of them:
    • Our orchestration should be able to start from each different point of the process.
    • Our orchestration should be able to skip particular process steps if necessary.
    • The activation (original) message should contain an itinerary of which steps to execute and which steps to skip.
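To fix ideas, such an activation message might carry its itinerary roughly like this (a made-up shape, element names included, purely for illustration):

```xml
<Order>
  <Data>...</Data>
  <Itinerary>
    <Step name="CheckCredit"   execute="true" />
    <Step name="ReserveStock"  execute="true" />
    <Step name="NotifyBilling" execute="false" /> <!-- already completed in a previous run -->
  </Itinerary>
</Order>
```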

EAI patterns usually approach these problems by implementing steps as atomic actions coordinated by a master process manager (Hohpe, Woolf – 2004).

Even the main BizTalk BPM scenario separates the process into stages to guarantee the ability to modify an order while it is executing.

In my experience this approach is often overkill where a simple, on-the-fly data fix would suffice to resume and successfully complete the process.

The most frustrating thing is that, in such a situation, the BizTalk engine suspends the orchestration as Suspended (Resumable), allowing the administrator to resume it with its complete state and retry the faulty operation; but since we’re unable to act on the orchestration state, the operation is doomed to fail again and again and again.

POCO Domain Model.

Having recently come into contact with the DDD discipline, I started implementing my orchestrations more as a set of POCO entities representing my model and less as a set of multi-part messages, as I did before (more on this in a future post).

Having the state of my orchestration represented by CLR objects, rather than by immutable BizTalk messages, allowed me to think about “updating the state” as a way to correct possible anomalies and, why not, to inspect the orchestration status without having to run the a-bit-too-technical Orchestration Debugger.

I was not interested in accessing the data during normal orchestration execution but only when:

  • The orchestration had been dehydrated for a long time (what on earth is it doing, and what is its actual state?)
  • The orchestration was suspended as a consequence of an error (why did it fault, and is it possible to fix its state?)

The most important thing to notice is that, in both cases, the orchestration is dehydrated and this means that its whole state (including my POCO domain model) is persisted in the BizTalk database. Unfortunately the BizTalk database is not accessible to us.

Serialization

Remember that every class used in a BizTalk orchestration must be marked with the Serializable attribute (except when used inside atomic scopes)?

Well, the BizTalk FAQ explains why it is necessary (as one can imagine… but it’s reassuring to read an official statement :) ):

The XLANGs runtime may persist to the database (dehydrate) your orchestration, including all of its data, at any point (except in the atomic scope). When the orchestration dehydrates and rehydrates, user-defined variables are binary serialized and deserialized.

So BizTalk simply invokes the BinaryFormatter and asks our objects to serialize themselves.

But we know that default serialization can be overridden, and this will help us externalize our domain model’s state.
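As a quick illustration of what the engine does at a dehydration point, here is a minimal sketch of a BinaryFormatter round-trip over a [Serializable] class (the class and its members are made up for the example; BizTalk does the equivalent against its own database streams):

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical stand-in for orchestration state; names are invented for the example.
[Serializable]
public class OrchestrationState
{
    public int Integer;
    public string Text;
}

public static class Demo
{
    // Dehydration/rehydration in miniature: binary-serialize, then deserialize.
    public static OrchestrationState RoundTrip(OrchestrationState state)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, state);   // what the engine does on dehydrate
            stream.Position = 0;
            return (OrchestrationState)formatter.Deserialize(stream); // on rehydrate
        }
    }

    public static void Main()
    {
        var restored = RoundTrip(new OrchestrationState { Integer = 10, Text = "Hello" });
        Console.WriteLine("{0} / {1}", restored.Integer, restored.Text);
    }
}
```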

Here I present two ways to implement our custom serialization; both have their pros and cons, as we will see.

Preparing the sample.

For simplicity let’s imagine that the whole state of my orchestration is contained in the simple State object reported below.

    [Serializable]
    public class State
    {
        internal int _integer;
        public int Integer
        {
            get { return _integer; }
        }

        internal String _text;
        public String Text
        {
            get { return _text; }
            set { _text = value; }
        }

        internal Guid _id;
        public Guid Id
        {
            get { return _id; }
        }
    }

It consists of three properties: an integer, a string and a Guid (the Guid will be initialized with the OrchestrationId during state initialization).

Presenting the Sample Orchestration

The sample orchestration is very simple: it takes a message from a receive port (the message just contains an integer and a text value), initializes the State object with the message data and the OrchestrationId, and enters the RepeatableScope, which repeats on error.

In the scope I placed a decide shape where a simple check is made:

If the Text data from the State (and therefore from the original message) is empty, an exception is raised (simulating an error); otherwise the flow exits the scope (clearing the repeat flag to avoid repeating the loop).

If instead an exception is raised (i.e. Text in the state is empty), the catch shape will first raise the repeat flag (because the loop must be repeated) and then suspend the orchestration (therefore persisting the State).

When (and if) the orchestration exits the repeatable scope, a new message is created, populated with the State data, and sent on a send port.

[image: the sample orchestration]

Introducing our Repository

According to DDD we won’t access the State object directly; we will use a Repository object to access it.

The Repository object will be responsible for managing the serialization/deserialization of our State object (representing our domain model); therefore our repository will implement just a couple of members enabling us to set or retrieve our State:

    interface IRepository
    {
        void SetState(State state);

        State CurrentState { get; }
    }

Using ISerializable

The first IRepository implementation is based on ISerializable.

We have to implement a GetObjectData method, which will be called by the infrastructure when the Repository needs to be serialized (and therefore needs to serialize the contained State), and a special constructor (which will be invoked when the runtime deserializes the Repository):

    // Assumes: using System.Runtime.Serialization; using System.Runtime.Serialization.Formatters.Binary; using System.IO;
    [Serializable]
    public class ISerializableRepository : IRepository, ISerializable
    {
        private State _state;

        public ISerializableRepository()
        {
        }

        public void SetState(State state)
        {
            _state = state;
        }

        public State CurrentState
        {
            get { return _state; }
        }

        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            // Here I'm going to serialize to the filesystem, so I'll use the Guid as a filename,
            // but I could even serialize to a DB, using a connection string and the Guid as a lookup value.
            // Hardwired folder name just for the sample's sake: change it to an existing path or, even better, externalize it.
            String filename = String.Format(@"C:\Temp\SerializationStore\{0}.txt", _state.Id);
            // Serialize the file location to the true SerializationInfo (the BizTalk DB: remember that BizTalk invoked the serialization).
            info.AddValue("SerializedStateFileName", filename);
            // Now that the file location has been saved into BizTalk, proceed to save the true data.
            BinaryFormatter bf = new BinaryFormatter();
            using (FileStream fout = new FileStream(filename, FileMode.Create, FileAccess.Write))
            {
                bf.Serialize(fout, _state);
            }
        }

        protected ISerializableRepository(SerializationInfo info, StreamingContext context)
        {
            // First of all, read the location of the file containing the persisted state.
            String filename = info.GetString("SerializedStateFileName");
            // Then open the existing file and deserialize the _state object.
            BinaryFormatter bf = new BinaryFormatter();
            using (FileStream fin = new FileStream(filename, FileMode.Open, FileAccess.Read))
            {
                _state = (State)bf.Deserialize(fin);
            }
        }
    }

In this code, serialization first creates a file named after the OrchestrationId and then uses the normal BinaryFormatter on the newly created file to store the state. Deserialization operates in reverse: it first obtains the filename from the serialization stream, then uses the BinaryFormatter to deserialize the state object from it.

Let’s consider the following message:

    <ns0:Root xmlns:ns0="http://TCPSoftware.CustomDehydration.Orchestrations.Schema">
      <Integer>10</Integer>
      <Text></Text>
    </ns0:Root>

When such a message is published to the orchestration, the orchestration suspends with an exception and, as the following screenshots show, a file named after the OrchestrationId pops up in the filesystem.

[screenshot: the suspended orchestration instance]

[screenshot: the state file named after the OrchestrationId]

Now we may resume the instance and, as expected, it will execute the loop again and re-suspend in error.

But using the BinaryFormatter we can recreate and fix the State object persisted in the file outside of BizTalk Server, for example with the following simple console application that simulates state correction.

    static void Main(string[] args)
    {
        // args[0] is the filename of the state to fix.
        Console.WriteLine("Opening DomainModel from file '{0}'", args[0]);
        FileStream fin = new FileStream(args[0], FileMode.Open, FileAccess.Read);
        BinaryFormatter bf = new BinaryFormatter();
        State state = (State)bf.Deserialize(fin);
        fin.Close();

        // State recreated, now fixing it.
        if (String.IsNullOrEmpty(state.Text))
        {
            state.Text = "Fixed!";
        }

        // Saving the fixed state.
        FileStream fout = new FileStream(args[0], FileMode.Create, FileAccess.Write);
        bf.Serialize(fout, state);
        fout.Flush();
        fout.Close();
    }
After using the fixer we can resume the suspended instance, and this time the orchestration completes, publishing the following output message:
    <ns0:Root xmlns:ns0="http://TCPSoftware.CustomDehydration.Orchestrations.Schema">
      <Integer>10</Integer>
      <Text>Fixed!</Text>
    </ns0:Root>

Using OnSerializing / OnDeserialized Attributes

This approach is different from the previous one: instead of overriding the normal serialization mechanism, we will run alongside it.

    [OnSerializing()]
    public void OnSerializing(StreamingContext context)
    {
        // Check _serializeExternally and decide whether to serialize externally as well.
        if (_serializeExternally)
        {
            // For simplicity, in this sample a file is used to store the data.
            _filename = String.Format(@"C:\Temp\SerializationStore\{0}.txt", _state.Id);

            // _filename will be serialized with the normal serialization inside BizTalk.
            // Time to serialize the state externally.
            BinaryFormatter bf = new BinaryFormatter();
            using (FileStream fout = new FileStream(_filename, FileMode.Create, FileAccess.Write))
            {
                bf.Serialize(fout, _state);
            }
        }
    }

    [OnDeserialized()]
    public void OnDeserialized(StreamingContext context)
    {
        // ONLY if _serializeExternally is true was external serialization done.
        if (_serializeExternally)
        {
            // Read the state back from the external file.
            BinaryFormatter bf = new BinaryFormatter();
            using (FileStream fin = new FileStream(_filename, FileMode.Open, FileAccess.Read))
            {
                _state = (State)bf.Deserialize(fin);
            }
        }
    }
Before normal serialization runs, our method marked with the OnSerializing attribute is executed: it checks whether the _serializeExternally flag is raised and, if it is, creates a file named after the OrchestrationId and serializes the state object to it using the BinaryFormatter.

When normal deserialization completes, our custom OnDeserialized method is invoked; it checks whether the _serializeExternally flag is raised and, if it is (in other words, if the data was serialized externally before), opens the file and deserializes the state object from it.

Unlike the previous method, this approach serializes externally ONLY when instructed to do so; in fact our orchestration needs to be modified a bit in order to use this kind of repository:

  • The SetInErrorFlag shape, which before only contained the code to set the boolean retry flag, now also contains the following instruction (which raises our repository’s _serializeExternally flag): Repository.EnableExternalSerialization();
  • The ClearInErrorFlag shape similarly now contains the following instruction: Repository.DisableExternalSerialization();
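The bodies of those two methods aren’t shown in the post; here is a minimal sketch of what they might look like (the class name and the read-only property are my own, added only for illustration — the essential part is that they flip the _serializeExternally flag checked by the serialization callbacks):

```csharp
using System;

// Hypothetical name for the attribute-based repository described in the post.
[Serializable]
public class AttributeRepository
{
    // Flag checked by the OnSerializing / OnDeserialized callbacks; it is
    // persisted by the normal BizTalk serialization, so its value survives
    // dehydration and is visible again on rehydration.
    private bool _serializeExternally;

    // Exposed only for illustration/testing; not required by the technique.
    public bool ExternalSerializationEnabled
    {
        get { return _serializeExternally; }
    }

    // Called from the SetInErrorFlag shape: from now on, persist state externally.
    public void EnableExternalSerialization()
    {
        _serializeExternally = true;
    }

    // Called from the ClearInErrorFlag shape: back to BizTalk-only persistence.
    public void DisableExternalSerialization()
    {
        _serializeExternally = false;
    }
}
```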

The main differences between this approach and the previous one (ISerializable) are reported below:

    ISerializable                         | OnXXX attributes
    --------------------------------------|-------------------------------------------------------------
    Data is always serialized externally. | Data is serialized externally only if explicitly requested.
    Data is always serialized just once.  | Data is always serialized inside BizTalk too, so when it is
                                          | also serialized externally it is serialized twice.

There’s no clear winner here: if you’re scared of persisting your data outside BizTalk Server you may prefer the attribute-based approach (incurring the double serialization cost only when an exception is raised, which is more than acceptable); if you prefer not to expose the persistence mechanism in your domain code (such as the Enable/DisableExternalSerialization methods seen above) and you trust your code enough, you’ll prefer the ISerializable approach.

Simplify Workflow with Atomic Shapes

If you look at the above orchestration, chances are you won’t like it much, and I definitely agree.

The point is that there are too many “infrastructure concepts” leaking into the business process design (the orchestration).

The loop shape and the surrounding Catch and Suspend shapes are infrastructure noise, placed there just to enable the suspend & retry mechanism.

In fact an atomic scope has almost the behavior we are looking for: an orchestration can’t suspend in the middle of an atomic scope, so if an exception is raised inside one, the whole scope is retried “automatically” the next time the suspended instance is resumed.

Unfortunately I said “almost” because there’s a catch: the first time I tried an orchestration with an atomic scope (instead of the suspend & retry scope above), I was puzzled because even though the error was raised there was no file at all in the SerializationStore folder.

Thinking about it, the reason is obvious and was described in the previous post: there’s no persistence point on “entering” an atomic scope, only on “exiting” it; the orchestration was therefore starting, entering the scope without persisting, and raising the exception inside the atomic scope.

This means our Repository had no chance to be persisted between its creation and the exception being thrown, and was therefore doomed to repeat the atomic scope forever without ever persisting its data.

Luckily I found a way to programmatically persist the orchestration, allowing it to persist just before entering the atomic scope and obtaining all the advantages of external serialization while keeping infrastructure noise out of my orchestration; a quick visual comparison between the orchestration below and the one above should convince anyone…

[image: the simplified orchestration using an atomic scope]

You may download all the code from this article here; I hope you’ll find it interesting.

BizTalk Versioning Strategy (4/4)

Previous part available here.

Truth be told, the versioning problem is not the same for every BizTalk artifact:

Orchestrations

Orchestrations, for example, are not referenced by any other artifact and can therefore be updated with the Modify procedure described in the previous part, replacing the old orchestration with the new one without any risk.

If we want to publish a new version of an orchestration while keeping the old version up and running (an often necessary step in the presence of long-running processes), we can: BizTalk will host both the new orchestration and the old one and will let us keep both active or switch from the old to the new atomically.

Pipelines

Pipelines, like orchestrations, are not referenced by any other artifact (well, orchestrations can reference pipelines in some cases, even though I don’t recommend it), so the Modify approach should suffice to replace an existing pipeline with another.

But if the pipeline is referenced by ports in other applications, the Modify will obviously fail.

A quick workaround could be to temporarily move all those ports into the application containing the pipeline through the Administration Console, do the Modify, and move the ports back to their original applications.

If the workaround fails too, one may increase the version number of the pipeline and redeploy it.

All ports will continue to use the old version and, if needed, WMI scripts can easily be written to move all port references from the old version to the new one automatically.

Maps

Maps can be used by orchestrations (again, I’m against that kind of usage; maps should be used mainly on ports) and therefore the Modify approach may or may not work.

And, as with pipelines, if maps are referenced by ports in other applications the Modify will fail; but, as with pipelines, one may try to move every port and/or orchestration referencing the map into the same application and try the Modify again.

If everything else fails, one may increase the map version number and deploy it.

Ports will continue to use the old map version and, if needed, WMI scripts can easily be written to move all port references from the old map version to the new one automatically.

Orchestration references are not so easy to redirect: references to the old map are compiled statically into the orchestration code, so the orchestration will keep referencing the old map, and there’s no configuration setting in the Administration Console that can help us redirect the version.

Luckily, the same solution found for schema versioning will work also in this case: deploy a new version of the map and apply publisher policy to it.

Schemas

Schemas are usually very problematic when it comes to versioning in BizTalk, as seen in the previous parts:

Schemas are referenced by nearly all other artifacts (orchestrations, maps and pipelines too) and are usually placed in a central application shared with several others (because schemas usually cut across different processes).

Because of this, Modify will almost certainly fail, and this makes schemas nearly un-updatable.

The BizTalk engine allows us to deploy a new version of the schemas simply by increasing the assembly version number; this will make maps and pipelines happy but will break orchestrations.

Luckily, the publisher policy procedure described in the previous post will help us solve this last issue, allowing us to update every BizTalk artifact freely in a simple and agile way (once you get used to it…).

Wednesday, January 12, 2011

Programmatically persisting an Orchestration

Some time ago I needed to persist an orchestration just before entering an atomic scope (you’ll see why in the next post…).

Unfortunately, as any good BizTalker knows, persistence points are decided by the engine when one of the following conditions happens:

  • Start Orchestration shape
  • Suspend shape
  • At the end of a transactional scope (Atomic or Long Running)
  • When the orchestration terminates
  • When the engine determines that the instance should be dehydrated
  • When the orchestration engine is asked to shut down
  • When a debugger breakpoint is reached
  • Send shape (at the end of it)

So it seems there’s no way to persist the state just before entering an atomic scope (the closest thing is to put another transactional scope just before the atomic one, but you’ll agree that’s a bad & ugly workaround).

Anyway, looking a bit inside the Service class, I noticed the following method:

    public void Persist(
        bool dehydrate,
        Context ctx,
        bool idleRequired,
        bool finalPersist,
        bool bypassCommit,
        bool terminate
    )

The parameter names seem self-explanatory to me, so I tried to invoke the method from an expression shape placed just before the atomic scope:

    Microsoft.XLANGs.Core.Service.RootService.Persist(
        false,  // dehydrate
        Microsoft.XLANGs.Core.Service.RootService.RootContext,  // ctx: the actual service instance context
        false,  // idleRequired
        false,  // finalPersist
        false,  // bypassCommit
        false   // terminate
    );

The result was that the orchestration service instance persisted as soon as it reached the expression shape, instead of waiting for the end of the atomic scope.

Monday, January 10, 2011

BizTalk Versioning Strategy (3/4)

Continued from part 2.

.NET Versioning.

So every problem seems due to the fact that orchestrations are tightly coupled with the .NET types representing message schemas, and therefore they’ll fail to execute if fed the wrong .NET type.

In fact, we’ve reduced the BizTalk versioning strategy problem to a normal .NET versioning one:

  • We’ve deployed an application (our orchestration) referencing types contained in another assembly (our schemas, v1.0.0.0).
  • We’ve deployed a new version of this referenced assembly (v1.0.0.1).
  • Will our application continue to work unaffected when receiving 1.0.0.1 types instead of 1.0.0.0?

The .NET answer is “it depends”.

In fact, as described here:

The specific version of an assembly and the versions of dependent assemblies are recorded in the assembly's manifest. The default version policy for the runtime is that applications run only with the versions they were built and tested with

So the answer seems to be a resounding no but, continuing to read:

unless overridden by explicit version policy in configuration files (the application configuration file, the publisher policy file, and the computer's administrator configuration file).

So there’s a way: by default the orchestration will crash (because it receives an unexpected type), but we can override this behavior simply by declaring an explicit version policy!

The above page suggests three approaches; let’s examine them in detail:

Application Configuration File

This approach consists in putting redirection directives directly inside the application configuration file.

This means that we should place our redirection from 1.0.0.0 to 1.0.0.1 directly inside the BTSNTSvc.exe.config file.
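For reference, such a redirection in BTSNTSvc.exe.config would look roughly like this (the assembly name and public key token are placeholders; use your schema assembly’s actual values):

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- name and publicKeyToken are made up for the example -->
        <assemblyIdentity name="MyCompany.Schemas"
                          publicKeyToken="0123456789abcdef"
                          culture="neutral" />
        <bindingRedirect oldVersion="1.0.0.0" newVersion="1.0.0.1" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```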

I don’t like to mess with the BizTalk application config file, and it’s wrong even from a logical point of view: this kind of redirection is meant for application developers (which in this case is the BizTalk product group) to explicitly state (in the application config) that the program will work with these assembly version redirections. But I didn’t develop BizTalk Server; I simply developed some components hosted in BizTalk Server.

So I don’t like this approach too much.

Computer Administrator Configuration File

This approach consists in putting redirection directives inside the machine configuration file.

If possible, I like this approach even less than the previous one, for the very same reasons:

If I don’t like messing with the BizTalk configuration file, you can imagine how much I hate messing with the whole Machine.config.

And, from a logical point of view, this kind of redirection is meant for the machine’s system administrators, and I’m not one of those either; I’m just a developer with components hosted on that machine.

Publisher Policy File

This approach, a bit more complex than the previous two, consists in generating a twin policy assembly which, when deployed together with the original assembly, enables the redirection.

This is the best approach for a BizTalk developer: it is used by a component developer (and we’re component developers) to state that a component is compatible with another version of itself (and this is exactly what we’re trying to do).

Understanding how to generate a publisher policy file is off topic for this (too long) post, but I placed on MSDN Code Gallery a PowerShell script that should help you produce your publisher policy files.
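For the curious, generating a publisher policy by hand boils down to two steps: writing a small config file containing the same kind of bindingRedirect (names and token below are placeholders), and linking it into a policy assembly with al.exe:

```xml
<!-- policy.config: redirect requests for the old schema version to the new one -->
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- name and publicKeyToken are made up for the example -->
        <assemblyIdentity name="MyCompany.Schemas"
                          publicKeyToken="0123456789abcdef"
                          culture="neutral" />
        <bindingRedirect oldVersion="1.0.0.0" newVersion="1.0.0.1" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```

The file is then linked (and signed with the same key pair used for the schemas assembly) with something like `al.exe /link:policy.config /out:policy.1.0.MyCompany.Schemas.dll /keyfile:key.snk /version:1.0.0.1`; the resulting policy.1.0.MyCompany.Schemas.dll (publisher policy assemblies follow the policy.major.minor.assemblyName naming convention) is what gets GACed next to the schemas.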

Recap

My policy for schema versioning is therefore the following:

Each time I make a (backward compatible) change in a schema (such as adding a field to an existing schema or adding a new schema to the assembly), I simply increase the build or revision number.

My build process builds the newly changed schema artifacts, also producing the publisher policy file (using a slightly modified version of the script linked above).

Then, on the BizTalk box, I deploy the new version of the schemas (no need to unenlist orchestrations or remove artifacts, because I’m simply adding a new schema side by side).

Afterwards I GAC the publisher policy assembly corresponding to the new schemas.

We had zero downtime (we just deployed new artifacts; we neither stopped nor removed the old ones) and the net effect is that now every BizTalk artifact (orchestrations, maps, pipelines, etc.) will use the newest (and backward compatible) schemas without problems.