Tuesday, March 31, 2009

SOA Suite Skill Map

While working on SOA Suite, various areas of technology criss-cross each other, and it's sometimes overwhelming to look at issues in a structured manner. From a Reference Architecture perspective things are clear; they stack up as follows:

1. User Interaction Layer (Portal, Composite Applications, BAM)
2. Connectivity Layer (Security, ESB, OWSM)
3. Integration Layer (SOA Suite, BPM, B2B, ODI)
4. Infrastructure Layer (Weblogic Suite, Coherence)

There are also the cross-cutting layers of IDE, management, and governance.

Coming back to SOA Suite, I divide it into four major buckets:
-BPEL
-ESB
-Adapters
-App Server (OC4J or Weblogic)

Here is the list of items that would come under each category. I have split each category/area into Development and Administration parts. This can help to map an issue to one of these areas/items and so help in troubleshooting. I plan to keep adding to the list; it is basically a very high-level taxonomy, and the details behind each item are in the development/administration guides.





Update on 7/22/09
Two other parts of SOA Suite are increasingly being used and also require skills:
-BAM
-B2B

Monday, March 23, 2009

Error Handling in BPEL #2

As I noted about BPEL error handling here, SOA Suite 10.1.3.3 and higher has an error handling framework in the form of fault policies, which is very useful for setting up automatic recovery requirements in bpel.

Basically, a fault policy says what to do (action) for which fault (condition). It can be attached to a partnerlink, port type, process or domain. Attaching a fault policy to a partnerlink, port type or process is done by including it in bpel.xml. Attaching a domain-level fault policy is done in fault-bindings.xml under each domain in $BPELHOME/bpel/domains/"domain-name"/config

Any error that happens while invoking a partnerlink gets captured by this framework, and an action is taken based on the fault policy. The pre-defined actions are retry, rethrow, human-intervention etc.

If we use rethrow as the action, the fault goes back to bpel and is handled by any catch/catch-all blocks.

By default, the error handling framework doesn't do anything, because no policy is configured out of the box.
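To make the condition-to-action idea concrete, here is a minimal plain-Java sketch of how a policy lookup behaves. It is not the SOA Suite implementation or its API: the class and method names (FaultPolicySketch, when, actionFor) are made up for illustration, and the fault names are just example conditions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: a fault policy maps a fault name (condition) to an action.
// None of these names come from the SOA Suite API.
class FaultPolicySketch {
    enum Action { RETRY, RETHROW, HUMAN_INTERVENTION, NONE }

    private final Map<String, Action> rules = new LinkedHashMap<>();

    // Registers "for this fault, take this action" and returns this for chaining.
    FaultPolicySketch when(String faultName, Action action) {
        rules.put(faultName, action);
        return this;
    }

    // With no matching rule (the out-of-the-box state) nothing is done.
    Action actionFor(String faultName) {
        return rules.getOrDefault(faultName, Action.NONE);
    }

    public static void main(String[] args) {
        FaultPolicySketch policy = new FaultPolicySketch()
            .when("remoteFault", Action.RETRY)
            .when("bindingFault", Action.RETHROW);
        System.out.println(policy.actionFor("remoteFault"));
        // An empty policy models the default installation.
        System.out.println(new FaultPolicySketch().actionFor("remoteFault"));
    }
}
```

In the real product the rules live in the fault policy XML file rather than in code; the sketch only shows the lookup behaviour.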

Please check this for more information.

Update on 6/Apr:
To refresh fault policy changes, a server restart is not required; we can use the following link: http://host:port/BPELConsole/domain_name/doReloadFaultPolicy.jsp. Also, while creating a fault policy file, don't forget to give the faultPolicy tag the same id as the one referenced in bpel.xml.

Update on 30/Apr:
While showcasing fault policies to a customer, it became clear that business faults can't be handled by fault policies, only technical faults. That is quite a disadvantage; however, it's possible to handle business faults by converting the partnerlinks into separate bpel processes and then throwing the business fault. Customizing a fault policy for requirements like notification or calling another bpel process also seems to be a challenge. Can we get ora-java among the actions available in the bpel console? Need to check that.

Monday, March 16, 2009

AQ-JMS on SOA Suite 10.1.3.4 on WebLogic 9.2

There are 3 options for messaging in SOA Suite on Weblogic:
-AQ Adapter
-AQ with JMS
-JMS Providers
--OC4J JMS Provider (only on OC4J)
--Weblogic JMS Provider
--Any other 3rd party JMS provider

AQ is the native queuing mechanism in Oracle, which uses the database to create queue tables and queues/topics.

Other JMS providers primarily use an in-memory/file-based approach to create queues and topics.

Here are the steps to create AQ/JMS adapters on SOA Suite on Weblogic.

AQ/JMS basically means AQ queues with a JMS message type. AQ/JMS supports SYS.AQ$_JMS_TEXT_MESSAGE, SYS.AQ$_JMS_BYTES_MESSAGE, SYS.AQ$_JMS_STREAM_MESSAGE, SYS.AQ$_JMS_MAP_MESSAGE, SYS.AQ$_JMS_MESSAGE.

For the SQL to create the queue tables/queues, here is a reference.
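As a rough idea of what that SQL looks like, here is a small Java sketch that only builds the PL/SQL text for one queue using the standard DBMS_AQADM procedures. It is not the script from the reference; DEMO_QTAB and DEMO_QUEUE are hypothetical names, and nothing here connects to a database.

```java
// Sketch: builds the PL/SQL block that creates and starts an AQ queue with a JMS payload type.
// Queue and table names are hypothetical; this only produces text to run in SQL*Plus or similar.
class AqJmsDdlSketch {
    static String createQueueScript(String queueTable, String queueName, String payloadType) {
        return "BEGIN\n"
            // The payload type on the queue table is what makes this an AQ/JMS queue.
            + "  DBMS_AQADM.CREATE_QUEUE_TABLE(queue_table => '" + queueTable + "', "
            + "queue_payload_type => '" + payloadType + "');\n"
            + "  DBMS_AQADM.CREATE_QUEUE(queue_name => '" + queueName + "', "
            + "queue_table => '" + queueTable + "');\n"
            // A queue has to be started before it accepts enqueue/dequeue.
            + "  DBMS_AQADM.START_QUEUE(queue_name => '" + queueName + "');\n"
            + "END;\n/";
    }

    public static void main(String[] args) {
        System.out.println(createQueueScript("DEMO_QTAB", "DEMO_QUEUE", "SYS.AQ$_JMS_TEXT_MESSAGE"));
    }
}
```

The generated block would need to be run by a database user that has execute rights on DBMS_AQADM.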

Next, we realized that AQ/JMS is not supported on Weblogic yet, so we had to use custom code created by Robert Patrick.

The code is also available here

Basically, we need a Weblogic startup class (AQJMSStartupClass.jar), one property file, a couple of user credential files, and a couple of jars (aqapi13.jar and ojdbc14.jar). We followed the readme file and created the required files.

Next, we registered the startup class in Weblogic with the necessary parameters and restarted the SOA server, which then used the startup class to create the AQJMS JNDI entries for QueueConnectionFactories, Queues etc.

Then we created a JMS Adapter entry in weblogic-ra.xml that uses the AQJMS_QueueConnectionFactory, and used this JMS Adapter JNDI name in the bpel wsdl.

This just worked great. We have seen some AQ-related errors in the log, which I will cover in the next post.

Update on Aug/19/09
The AQ-related errors, like 'Cannot delist resource when transaction state is committed', got fixed after applying MLR#8.

Tuesday, March 10, 2009

Error Handling in BPEL

Error handling is always a key core service in BPEL, and there is always confusion around how to handle it.

There are various aspects to error handling in BPEL/ESB:
1. esb errors
2. bpel partner link errors
3. bpel non-partnerlink errors

The simple answer is catch and catch-all

catch is for system-defined faults like bindingfault, remotefault, selectionfailure etc. There are 12 of them that we can see in the Fault explorer. Once we catch one, we create a variable of type RuntimeFaultMessage from RuntimeFault.wsdl (under SOA_HOME\bpel\system\xmllib).

catchall is for everything else, and we can get the error string with ora:getFaultAsString().

The best practice for handling these faults in catch/catch-all blocks is to do the following steps:

1. Create a custom fault message based on all the values we need to capture, for example processid, instanceid etc.

2. Send the fault message to a custom error handler, which basically can log the error, send it to the worklist or send an email.

3. Send a reply back to the client with the error message; this needs the wsdl to have provisions for sending fault data as part of the response or as a fault.

4. Terminate.
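The four steps above can be pictured with a plain-Java analogy. This is not BPEL and not a SOA Suite API; in a real process these would be catch/catch-all branches with assign, invoke, reply and terminate activities. All class, field and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java analogy of the four steps in a catch/catch-all branch.
class CatchBlockSketch {
    // Step 1: custom fault message carrying the values we want to capture.
    static final class FaultMessage {
        final String processId;
        final String instanceId;
        final String detail;

        FaultMessage(String processId, String instanceId, String detail) {
            this.processId = processId;
            this.instanceId = instanceId;
            this.detail = detail;
        }
    }

    // Step 2: stand-in for the custom error handler; a real one might log,
    // create a worklist task or send an email. Here it only records the fault.
    static final List<FaultMessage> errorLog = new ArrayList<>();

    static void errorHandler(FaultMessage fault) {
        errorLog.add(fault);
    }

    // Returns the reply for the client (step 3); returning ends the flow (step 4).
    static String process(String processId, String instanceId, Runnable partnerCall) {
        try {
            partnerCall.run();
            return "OK";
        } catch (RuntimeException e) {
            FaultMessage fault = new FaultMessage(processId, instanceId, String.valueOf(e.getMessage()));
            errorHandler(fault);
            return "FAULT: " + fault.detail;
        }
    }

    public static void main(String[] args) {
        System.out.println(process("OrderProcess", "1001", () -> {
            throw new IllegalStateException("partner unavailable");
        }));
    }
}
```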

Alternatively, we can use ThrowFault in step #3 if the wsdl has provisions for faults. However, I found that when using ThrowFault, step #2 gets rolled back; adding a checkpoint() resolves it.

SOA Suite 10.1.3.3 onwards also has a built-in fault policy framework to handle errors. Basically, whenever partnerlink errors happen, bpel checks the fault policy to decide what to do. In the fault policy we can configure different action types like retry, rethrowfault etc.

I will cover fault policies and error handling frameworks in more detail in another post.

Monday, March 02, 2009

Clusters Vs Grids

Recently I worked on a Coherence data grid and then on setting up a SOA cluster on Weblogic. That made me wonder what the difference is between grids and clusters. A bit of googling turned up many links that explain it. However, here it is in my own understanding.

Grids: Grid computing is about optimizing the usage of resources such as CPU, memory or IO. The Coherence data grid we built is basically a set of cache servers that cache the data so that we can avoid the expensive database trip. That reduces IO and makes the application faster. Many of these cache servers can be started across machines, and they all talk to each other using multicast communication, which is the same technology Weblogic clusters use to keep the managed servers in sync. Similarly, other grids optimize the usage of CPUs that would otherwise sit idle and be a waste of money to keep and maintain. Server virtualization is a grid solution.

Clusters: These, on the other hand, basically provide scalability, so that the application can support more and more user requests, along with high availability thanks to failover support in cluster-aware managed servers. Clusters are an extension of distributed computing and let the application scale dynamically, meaning you can add more managed servers whenever demand grows. Similar failover and dynamic addition of servers are supported in Coherence grids as well. Coherence cache servers also partition the data across multiple servers, thus providing more scalability.
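The caching idea described under Grids (serve repeated reads from a cache and only go to the database on a miss) can be sketched in a few lines of Java. This is a single-process, single-threaded stand-in with hypothetical names, not the Coherence API, and it leaves out everything that makes a data grid a grid (multiple servers, partitioning, failover).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal cache-aside sketch: check the cache first, load from the database only on a miss.
class CacheAsideSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> database; // stand-in for the expensive database call
    int databaseTrips = 0;                            // counts how often we really hit the database

    CacheAsideSketch(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        if (cache.containsKey(key)) {
            return cache.get(key);      // cache hit: no database trip
        }
        databaseTrips++;
        String loaded = database.apply(key);
        cache.put(key, loaded);         // remember the value for later reads
        return loaded;
    }

    public static void main(String[] args) {
        CacheAsideSketch customers = new CacheAsideSketch(id -> "customer-" + id);
        customers.get("42");
        customers.get("42");
        System.out.println("database trips: " + customers.databaseTrips);
    }
}
```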

Interesting thread on weblogic clusters here
Interesting discussion on scalability here


So basically the underlying technology, like multicast communication, could be the same, and the results, like performance (response time) and scalability (throughput), could be the same; however, grids and clusters differ in the basic goals they work toward. It would not be wrong to say grids help you scale up while clusters help you scale out.