Friday, December 18, 2009

bpel tuning, best practices and troubleshooting

Last week we went to production on SOA 10.1.3.4 on Weblogic 9.2, a 3-node cluster with a lot of interfaces. It was a very good exercise to prepare the servers. Here are some references; if you read through all of them you can avoid most of the issues ;)

1. tuning
check the apache-weblogic plugin parameters here
Oracle 10g tuning guide here (however, a lot of the info there doesn't apply to weblogic)
tuning bpel quartz scheduler mentioned here
weblogic tuning guide here
linux parameters here (some good benchmark data here)
JVM parameters here (JVM Proxy settings here)
dehydration db tuning is already covered as part of bpel tuning; however, make sure the DBAs tune processes, sessions, SGA space and tablespaces properly

2. best practices
if you have heavy batch jobs, make sure you understand optSoapShortcut in a cluster scenario; please check this
metalink IDs for esb-bpel communication optimizations (780822.1 (esb-bpel) 726490.1 (bpel-esb))
bpel performance best practices here
esb performance best practices here , one more ppt, one more
a must-read on bpel-adapter integration here, and optimization metalink ids [565944.1 and 730580.1]

3. troubleshooting
gcviewer to monitor gc memory issues
a thread dump can be created from the weblogic console, or with kill -QUIT on the server pid, to monitor socket and thread issues
ORA-600 Errors Metalink ID [754336.1,18485.1, 333338.1 and 460244.1]
"Cannot deserialize DOM element" Error When Running a BPEL Process [ID 559261.1]
FOTY0001: TYPE Error Using ORA:PARSEESCAPEDXML() and "&" in Input [ID 861637.1]
ORABPEL-11802 Error Calling Stored Procedure Using a Database Adapter [ID 887736.1]
BPEL waits well after expiry time set by wait or retry by fault policy [ID 561172.1]

Monday, November 30, 2009

Careers in SOA

As the erstwhile EAI jobs become SOA jobs, it’s interesting to see how many dimensions there are to this job and how to enhance one's skills to stay on top of it.

Technical (Type 1-4) - It is also possible that one person has to be involved in all these types (and be occupied throughout the project life-cycle), so one has to prepare for it. Also, some of the work is in product-specific roles. By the way, learning the technology might not guarantee success; as they say, attitude is the key :)

Work-Type1- This goes to the folks good in linux/unix, mostly the unix admin or dba - they can help in installing all the software (linux, database, app servers, jdk, etc.), RAC install, DR setup, backup activities, code migration, system upgrades, applying patches. These people can grow to be Infrastructure/Data Architects or Cloud Architects :)

Work-Type2- This goes to folks good in the JVM, App Server, SOA Server and product architecture; call them 'Specialists' - they can configure the SOA/App servers and clusters, tune them as required, monitor system health, read log files, create SRs, find patches, do system sizing, create solutions for a product stack etc. These people can grow to be Technical/Solution Architects.

Work-Type3- This goes to the folks good in Java, XSLT, JMS, EJBs, transactions, SQL, PL/SQL, BPEL API and Adapters; they are the Developers. (Here I am not including the skills which are more primary in an E2.0 space, including JSF, Spring, Hibernate, Struts etc.; however, a Developer may have to work on some of these if the project demands.) This group can grow to be Application/Solution Architects.

Work-Type4- Let's say this is the group of Architects, who take care of governance aspects of the organization, including documentation as per Enterprise standards, coming up with the physical/logical architectures, enterprise security, hardware provisioning, roadmaps, the big picture etc. This group is the Enterprise Architects.

Work-Type5- This is the group of project managers and sponsors who take care of planning, budgeting/financials, status tracking, reporting to business users etc. This group is the Managers/Leaders of organizations.

I have not included the sales, testing, business analyst and training roles in these categories. I have also assumed all role types know SOA and its principles.

Monday, November 02, 2009

Email CSV File As Attachment

I had to create a bpel which gets data from a couple of db tables, creates a csv file and sends it as an email attachment. I bumped into a couple of interesting challenges; luckily google is always there to help :). Let me capture some of the lessons learnt.

1. creating a csv file in append mode
2. setting/getting filename in the fileadapter
3. handling multiple records, nillable fields in nxsd
4. interesting xslt, xpath functions
5. sending email with attachment


Creating a csv file using the file adapter was quite straightforward: you take a sample of the csv file that you want to create and give it to the fileadapter wizard to create an nxsd (native xsd). more on creating a csv file here and reading a csv file here

As the file was created in append mode (how-to here), the file name has to be unique, so it had to be set by creating a fileadapter header variable and passing it in the invoke. If the file name is created by the fileadapter, its name can be obtained as prescribed in technotes 10.1.3.3, by adding an output of header type to the write operation.

I had to write a header to the file, and the fileadapter supports multiple record types; however, each record type needs a condition value to identify it, which was not possible to define for data that doesn't start with any fixed value. So I had to abandon the idea of using records and printed the header separately, as all my fields were string type. The nxsd can be tweaked to change datatypes or make fields optional (nillable=true). more here


While working through the transformations I came across very effective xslt functions like translate, which replaces any specific character in a string with any other value. The function create-delimited-string creates a delimited string out of a particular node in a repeating XML structure.

Finally, sending the file as an email attachment: the bpel sample sendEmailWithAttachment clearly shows how to do it, however I faced two challenges. One was that if ora:readFile cannot find the file it throws an XSLT error (here), which is difficult to debug. And for some reason my text/html data was overwriting the csv data, for which I changed the order of data setting: setting csv first and then text/html.

update-Nov10

I could not use the translate method to replace newline characters, the sql replace(col,chr(10),null) helped there.

Also, if you get javax.xml.xpath.XPathExpressionException: FOTY0001: type error with nothing wrong in the XSLT, it's because you edited the XSLT before bpel loaded the parts in the Transform activity; always wait till the parts load before clicking edit for the XSLT.


Update Dec-7

In order to make the directory to which we write the file dynamic, we can edit the Outboundheader xsd to add 'directory' there after fileName, and send the directory name as a preference to the bpel at runtime.

Tuesday, October 20, 2009

transaction=participate

The first thing to note is that global transactions only work among sync processes, so an async bpel/esb cannot participate in a global transaction; the way to make an async process sync is to set deliveryPersistPolicy to off.immediate.

Now partnerlinks in sync processes participate in global tx in 2 ways
1. set transaction=participate at the partnerlink (child BPEL or esb call) level
2. set transaction=participate at the global level in bpel.xml (it seems to be the default in 10.1.3.4) - this is for all adapter calls (DBAdapter, AQ, JMS etc.) to participate in the global tx
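Roughly, the two settings above look like this in bpel.xml (element placement recalled from the 10.1.3 docs; the process and partnerlink names are illustrative):

```xml
<BPELSuitcase>
  <BPELProcess id="MyProcess" src="MyProcess.bpel">
    <partnerLinkBindings>
      <partnerLinkBinding name="ChildBPELService">
        <property name="wsdlLocation">ChildBPELService.wsdl</property>
        <!-- 1. partnerlink level: join the caller's global tx -->
        <property name="transaction">participate</property>
      </partnerLinkBinding>
    </partnerLinkBindings>
    <configurations>
      <!-- 2. global level: applies to all adapter calls from this process -->
      <property name="transaction">participate</property>
    </configurations>
  </BPELProcess>
</BPELSuitcase>
```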

Now for the global tx to work, you have to use a datasource with global tx enabled, using an XA driver (may not be true XA).

So when there is an error (bindingfault) from dbadapter calls, it will mark the tx for rollback, but it will actually not roll back.

What we have seen is that you have to set handleTopLevelFault=false in configurations/bpel.xml so that the rollback works. The other approach is to throw a rollback exception in the catch-all block.

The following seems to be best-way for global tx handling (10.1.3.4 on weblogic 9.2)
1. XA driver with Global tx enabled
2. in catch-all throwing rollback exception

This rolls back the db inserts, as well as dehydrating the bpel.
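In bpel source, the catch-all rollback throw (approach 2 above) is roughly the following; bpelx:rollback is the extension fault that marks the global tx for rollback (sketched from memory of the 10.1.3 syntax, treat as a sketch):

```xml
<catchAll>
  <!-- throwing bpelx:rollback marks the whole global transaction for
       rollback, undoing the dbadapter inserts made earlier in the flow -->
  <throw name="ForceRollback" faultName="bpelx:rollback"/>
</catchAll>
```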

As bpel dehydration is happening, it seems it's happening in a separate thread/tx (need to check if it's a supported feature in 10.1.3.4). However, we also saw the instance getting into recover(invoke) while using handleTopLevelFault=false.

a good discussion in OTN here and here
some good bpel tx details here and here

update on 23/10
10.1.3.4 does support audit in a separate thread in async mode as per the release notes.

Monday, October 12, 2009

ORABPEL-11825 Attempt to use an unsupported database platform.

If you get this error -

WSIF JCA Execute of operation 'T' failed due to: Attempt to use an unsupported database platform.Database platform is not supported: oracle.toplink.platform.database.DatabasePlatform; nested exception is: ORABPEL-11825Attempt to use an unsupported database platform.

Then the DBAdapter configuration needs the PlatformClassName set; sample values:

  • oracle.toplink.platform.database.Oracle9Platform
  • oracle.toplink.platform.database.DB2Platform
  • oracle.toplink.platform.database.SQLServerPlatform

you can also take this value from the MCF property settings.

Sunday, October 11, 2009

Dynamic partnerlink

While creating EAI flows involving many bpels/esbs, one common requirement is to be able to call some dynamic logic at run-time. For example, say we have a set of requester and provider bpels, which transform the source-format to canonical and canonical to destination-format respectively. However, we also have a requirement to do certain enrichment to the data if the source is A and some other enrichment if the source is B. Since our requester bpel is common to both sources, we need to extend it in a manner that takes care of the specific enrichment requirements for both A and B while creating the canonical. So one way to do this is to create source-specific (or "whatever" needs extension) bpels and call them from the common requester or provider bpels, and which bpel to call gets decided at run-time. This is a typical requirement for creating dynamic partnerlinks, covered in detail in this article

So let's say your master bpel is M and you need to call either A or B based on the source system. In order to create a dynamic link to A or B, roughly these are the steps
1. A and B should be based on the same wsdl/xsd
2. In M we need to include this wsdl/xsd with a service endpoint url (this url doesn't matter, as at runtime it will be replaced with the actual url)
3. Create a variable of type EndpointReference and set the following XML fragment
<wsa:EndpointReference xmlns:wsa="http://schemas.xmlsoap.org/ws/2003/03/addressing"><wsa:Address/></wsa:EndpointReference>
4. The actual address has to be obtained from somewhere, either from a database or from a config file
5. Once the actual address is obtained, set it to the EndpointReference address
6. Assign the EndpointReference to the partnerlink directly

Please refer to the article for screenshots.
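Steps 5 and 6 above boil down to an assign like the following (variable and partnerlink names are illustrative; treat this as a sketch of the pattern, not the exact article code):

```xml
<assign name="SetDynamicEndpoint">
  <!-- step 5: put the runtime-resolved url into the EndpointReference -->
  <copy>
    <from variable="targetURL"/>
    <to variable="partnerEPR" query="/wsa:EndpointReference/wsa:Address"/>
  </copy>
  <!-- step 6: point the partnerlink at the new endpoint -->
  <copy>
    <from variable="partnerEPR"/>
    <to partnerLink="DynamicService"/>
  </copy>
</assign>
```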

JMS Cluster in Weblogic

When we had to deploy queues/topics in a Weblogic cluster, there were two challenges: how to make the queues load-balance and how to fail over between nodes. Both are taken care of by creating (uniform) distributed queues and deploying them to the jms servers in the soa cluster (using a subdeployment module). For internal JMS clients like bpel, both load-balance and failover are taken care of by the server. However, for external (weblogic) JMS clients we used a provider url of t3://host1name,host2name:port to achieve HA, and it works.

Sunday, October 04, 2009

Sync DBAdapter call time-out

While working on a real-time interface to sync receipts to Oracle EBS using pl/sql apis, I got a sense that the DBAdapter is probably the epicenter of all EAI/SOA work (not to forget the fileadapter). There is so much to the DBAdapter. We had to do two things: firstly, create a DBAdapter partnerlink to call a custom pl/sql api to insert data into the open interface tables; secondly, call another custom pl/sql api (which in turn calls a standard oracle api) to move the data from the open interface tables to the base tables.

The partnerlink creation was quite straightforward using the wizard; we had the customary issues of recreating the partnerlinks again and again as the signature of the api kept changing. Since the api had complex data types such as record types and table types, the wizard created wrapper packages for the bpel. While recreating, we had to take care of the names and namespaces so they didn't conflict with the mappings we had created. One thing to take care of was that the order of fields in the mapping has to be the same as in the XSD. We didn't use the DetectOmission or DirectSQL optimizations in these calls, as they were api calls and not direct db inserts.

One major challenge we faced was when the second api was taking too long to respond and the bpel instance went into manual Recover(invokes). In the logs we saw timeout errors after more than 120 secs, so we increased the time-out based on this post, also here

This improved things, but we were still not getting bpel time-out errors and the instance was going into recovery. As it turned out, the timeout properties on a partnerlink only work for soap calls, as suggested here. The next option is to set the DBStoredProcedureInteractionSpec property QueryTimeout (set this in your adapter WSDL, jca operation); this option is supposed to work for stored proc calls in 10.1.3.4, but it didn't work for me. We also tried setting deliveryPersistPolicy to off.immediate for the partnerlink.

However, some optimizations on the Oracle api side, like index creation and process thread increases, improved the performance from bpel drastically.

Sunday, September 06, 2009

inside bpel pm

The most common question in bpel is sync vs async. The answer is mostly that sync is request-response in a blocking thread and async is a non-blocking one-way request with a callback later. From the bpel pm perspective the simple answer is: sync is a two-way invoke and async is a one-way invoke. So if your operation has only an <input> it's async, and if it has both <input> and <output> then it's sync.
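The difference shows up in the WSDL portType; a sketch with illustrative message names:

```xml
<!-- async: one-way operation, the callback arrives later
     on a second (callback) portType -->
<operation name="initiate">
  <input message="client:MyProcessRequestMessage"/>
</operation>

<!-- sync: two-way operation, the caller blocks for the reply -->
<operation name="process">
  <input message="client:MyProcessRequestMessage"/>
  <output message="client:MyProcessResponseMessage"/>
</operation>
```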

Now how does it make a difference performance wise? async requests go through what is called invoke_message table in bpel pm. So when an async request is made this is basically a two-step process -

Step1 - the message is persisted to the dehydration store / invoke_message table and also sent to a JMS queue
Step2- a worker MDB picks up the message from JMS queue and invokes the bpel

Both these steps happen in two separate threads/transactions.

The configuration property which makes all this happen is DeliveryPersistPolicy=on


So whenever there is a one-way invoke, its important to recognize that from the client perspective only step1 happens. Step2 happens internally and performance depends on resources (threads, MDBs) availability.

One more property to keep in mind is CompletionPersistPolicy, this policy basically says how much to write while dehydration.

Dehydration happens at 7 different points (in the middle of the bpel):
1. receive (if first node and transaction=participate set)
2. pick (on message, on alarm)
3. wait
4. invoke (idempotent=false)
5. flow/flowN (nonblockingInvoke = true)
6. reply (idempotentreply=false)
7. dspMaxRequestDepth has reached (default 600)

dehydration happens at two levels, metadata and auditdata.

If there is dehydration in the middle of the bpel flow, it's called a durable process, else transient. Dehydration helps recover processes in case of node failures.


I will cover some bpel best practices, esb internals and tuning parameters in upcoming posts.

Thursday, August 13, 2009

XML Schema extension

Recently, working on an ARIBA-EBS integration, we decided to use/extend the oagis schema. The approach was to create project-specific wrapper xsds with project-specific namespaces. Wherever there was a need to extend a type, <xsd:extension> was used. An example here, here and here


However, whether to use the UserArea for the extension type or to define new types in the extension type was an open question. Also, replacing the UserArea of type 'any' with our own custom type, in my opinion, defeats the purpose, as it would no longer be possible to accept dynamic content at runtime.

One approach for dynamic xml content at runtime is the xsd polymorphic type (xsi:type).

What do you think?

Sunday, August 02, 2009

BPEL - Java

Coming from a Java/J2EE background, one would think bpel is just drag and drop; however, while working on an EAI-type project one would definitely need Java. There are three types of Java extension in bpel

1. Custom XSL function for XSLT and XPATH
This fits best for any lookup functionalities, a good article here on how to do it.

The xslt function can return a nodelist, documentfragment, string, date or boolean. A nodelist is appropriate when a hashmap kind of collection is required; a documentFragment can be used to return a dom tree. When calling the function from bpel/assign, documentfragment works better.
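As a sketch of the documentFragment return style, here is a plain-JAXP version of such a lookup function (the class name and the wrapping element are hypothetical; registering it with the Oracle XSLT engine is a separate step covered by the article linked above):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;

public class LookupFunction {
    // returns <entry key="...">value</entry> wrapped in a DocumentFragment,
    // which an XSLT engine can splice directly into its output tree
    public static DocumentFragment lookup(String key, String value) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        DocumentFragment frag = doc.createDocumentFragment();
        Element entry = doc.createElement("entry");
        entry.setAttribute("key", key);
        entry.setTextContent(value);
        frag.appendChild(entry);
        return frag;
    }

    public static void main(String[] args) throws Exception {
        DocumentFragment f = lookup("country", "IN");
        System.out.println(f.getFirstChild().getNodeName());
    }
}
```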

for error handling, we are wrapping all exceptions as some error code in form of a nodelist.

2. Java Embedding
This fits best for invoking any Java from within bpel. If the Java code is developed in a layered manner, the same code can be used from XSLT or from bpel.

3. WSIF to Java Web Service
This is the web service approach to exposing Java as a service; if it is co-located in the same server, WSIF can be used to optimize the call.

From a deployment perspective all the Java code/jar has to be deployed to all servers in a cluster.

Tuesday, July 14, 2009

Scheduling BPEL jobs

Last month we worked on an interesting assignment where we used two 3rd-party libraries in bpel: itext for creating pdf files using java, and quartz for scheduling bpel jobs.

Creating pdf files using the itext libraries was very straightforward, and using java embedding it worked fine, though formatting the pdf was a bit of a challenge.

Using quartz to schedule bpel jobs, there is a sample here and here

We deployed a web-app where the triggering pattern for the scheduler servlet was specified in web.xml. The servlet creates a singleton scheduler and adds all the jobs, which are nothing but java classes. We had a job class where we used the BPEL client API to call a BPEL.
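The shape of that scheduler, sketched here with the JDK's ScheduledExecutorService instead of quartz (so it runs standalone; in the real job class the Runnable body would use the BPEL client API to invoke the process, and the class/method names below are illustrative):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class BpelJobScheduler {
    // singleton scheduler, as the servlet in the post creates one scheduler
    // and registers all jobs against it
    private static final ScheduledExecutorService SCHEDULER =
            Executors.newSingleThreadScheduledExecutor();

    // register a job to fire on a fixed interval (quartz would use a cron
    // expression here instead)
    public static ScheduledFuture<?> schedule(Runnable job, long periodMillis) {
        return SCHEDULER.scheduleAtFixedRate(job, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch fired = new CountDownLatch(3);
        // the Runnable here stands in for the job that calls the BPEL
        ScheduledFuture<?> handle = schedule(fired::countDown, 50);
        fired.await(5, TimeUnit.SECONDS);
        handle.cancel(false);
        SCHEDULER.shutdown();
        System.out.println("job fired 3 times");
    }
}
```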

There were a couple of challenges in deploying the web-app to weblogic from Jdev 10.1.3.4, which got resolved by making sure the property files are under web-inf/classes; sandeep has blogged this here

While deploying this web-app to a cluster, we deployed it to admin server and used load-balanced url to call the bpels.

Wednesday, June 17, 2009

JMS Adapter Stories - SOA on Weblogic

Last month, many interesting SOA on weblogic issues came up, here are some of them -

issue#1. Connecting JMS Queues (hosted on SOA Suite10.1.3.4 on Weblogic9.2) from SOA Suite10.1.3.4 on OC4J MLR#8

The steps mentioned here should have just worked; however, we found that post MLR#2 there is a bug and it was not working, which got fixed by removing the WLclient classes from orabpel-thirdparty.jar

issue#2 no connections available in the JMS Adapter connection pool. This issue is still open; we really don’t know why the pool was getting full with 100 active connections and not releasing any. We retired/turned off all the bpels except the one we needed to test and the error didn't happen. Also, we were using weblogic.jms.ConnectionFactory; we need to change it to a proper JNDI-referenced connection factory and try.

We tried setting the ‘cacheConnections’ outbound partnerlink property in bpel.xml to false, but no visible difference was noticed.

issue#3 Cannot call Connection.commit in distributed transaction.
Transactions are a big topic and need special attention and research. However, the quick findings were: if there are commits in pl/sql, the datasource used in the dbadapter has to be non-xa. On a separate note, there is an LLR option which, if selected, causes errors. Also, the JMS Adapter doesn't start unless it's XA, though the connection factory can be non-xa. I will try to elaborate on the transaction topic in new posts later.

issue#4 We tried looking up AQ/JMS from weblogic using the steps here; however, on one server it was not working, throwing: Caused by: oracle.jms.AQjmsException: JMS-204: An error occurred in the AQ JNI layer
It turned out to be an access privilege issue with the userid; need to investigate more what access issues the user has. A good discussion here

Monday, May 18, 2009

Changing an Async BPEL to Sync (Calling WS from pl/sql)

This week I got an interesting task of calling a BPEL from PL/SQL. It seemed much easier than I initially thought. There are utilities available to do it and I got the references here and here. Initially the code as mentioned here didn't work; then marC's blog code worked fine. One caveat to this whole exercise is that web service invocation from pl/sql only works for synchronous services, as for async services the client needs to have the capability to receive a callback soap call.

Initially my bpel service was an async one, and the pl/sql code, though able to invoke it, was not able to receive the callback. So I thought of changing it to synchronous. What followed was an interesting discovery. First, some basics: in bpel, what makes a partnerlink async or sync is the role definition. For async you would see both a partner role and my role; for sync you will see only my role (for inbound) or partner role (for outbound).

I tried to change my async bpel to sync by removing the callback porttype/operations/bindings from the wsdl, adding a reply activity, changing the initiate operation to process and adding an output message. From a wsdl/bpel perspective everything was fine, but it didn't work. In bpel-console I was getting: Failed to get the WSDL operation definition of "process" in portType


After some googling, it seems there is a known problem: after changing the operation name or async to sync and refreshing the wsdl, it might not work. The solution is to deploy a new version 2.0 - it worked for me. Then just deploy the 2.0 as 1.0 (in Jdev we can do this) and everything is right again.

Wednesday, May 06, 2009

Plain Vanilla BPEL #2 (Adapters)

I noted some of the day1 bpel issues here, there are also some interesting gotchas on the adapter side. Adapters are a big area covering various issues over file/ftp, database/aq adapters, JMS and Apps Adapters.

ftp adapter - I had used an outbound ftp adapter w/o much issue; inbound ftp adapters in cluster environments have to be made singleton, refer to this

db adapter - with the db adapter one can do 4 types of things: 1) call a stored proc 2) execute Insert/Update/Delete/Select operations 3) poll a database table 4) execute custom sql

While using the select operation, parameters can be passed using #parametername in the generated SQL. While using custom sql, the XSD is automatically generated as per the SQL.

aq adapter - aqs can be created as multi-consumer; in such cases the consumer name has to be used while creating adapters to send/receive messages.

I will add more as we go along.

update on May/28/09
In the meantime a lot of adapter issues have been found, so they need a mention here. Firstly FileAdapter/FtpAdapter: we had a good challenge getting append to work for the FtpAdapter. The format of the file was based on nxsd. We tried a bunch of things around making sure the transformation to the nxsd was correct and the ftpadapter configuration (messagecount=1, unique filename, opaque schema) was correct, but append just didn’t work. After further debugging it turned out the ftp server didn’t support append in that particular directory. So a good learning was to test the ftp server configuration by doing simple ftp operations like put/get/append etc. from the prompt, and then test from the ftpadapter.

One other learning was that the schema used in the ftp adapter can be opaque or an xsd. If it’s opaque, we need to send a base64-encoded string. The following is a code snippet to do that; make sure your XML is converted to a string by using getContentAsString()
String inString = (String) getVariableData("XMLString");
// sun.misc is an internal API, but it is what was available in the 10g JVM;
// on newer JDKs java.util.Base64.getEncoder() is the supported equivalent
sun.misc.BASE64Encoder encoder = new sun.misc.BASE64Encoder();
String encodedStr = encoder.encode(inString.getBytes());

JMSAdapter: we had to connect to the weblogic JMSAdapter from OC4J, and this blog had all that we needed. And it worked. A good learning: we were using weblogic.jms.ConnectionFactory, which is a default connection factory, so we don’t see it in the admin console; so making the right configurations in oc4j-ra.xml and putting the right entries in the JMS Adapter wizard are the key points.

an earlier post on aq-jms on Weblogic.

Another day#1 issue: should the JCA port configuration in the adapter wsdl use mcf properties (connection details as created by Jdev) or the JNDI name created in the admin console? If you have both, how will it behave? The best thing we found is to use the one on the server and not use mcf at all.


Update on Aug/19/09
DBAdapter: if the mapping order is different between the toplink mapping and the XSLT, it throws a 'cannot insert null' error; the solution was to set detectOmission to false in the dbadapter wsdl.

On another note, it's possible to change the JNDI for outbound adapters using a similar mechanism as dynamic partnerlinks; for dynamic partnerlinks here is a link.

On DBAdapter polling, there is a distributedPolling setting for it to work in clusters.

FTPAdapter: one catch is that the directory specified is relative to the user's default directory after login.


update 1/7/2010
While creating an nxsd for a file with a header/detail structure, the file had a lot of fixed-length fields separated by spaces, and the lengths had to be specified using the wizard. I noticed that while using the ruler you should not mark the last position (the length of the line), or else it throws an error. good reference here

Wednesday, April 15, 2009

Plain Vanilla BPEL

This month I had to work on a couple of bpel processes, and it reminded me of last yr when I was working on my first bpel. Anyone starting on bpel development goes through a series of similar hiccups to get a comfortable feel for the subject.

A plain vanilla bpel is one which takes an input xml in one format and sends out another xml in a different format. So there are 3 areas of challenge that we learn about in bpel on day1, namely 1) xsd of the messages 2) transformation 3) adapter details

Each area has some unique challenges. XSD needs a good understanding of XML namespaces, creating elements and types, importing/extending xsds and wsdls etc. Generally the biggest challenge in bpel lies in XSLT, particularly if you have complex mapping requirements. JDev's XSL Mapper is a good starting point: it helps you visually verify both the XSDs and do some preliminary mappings. However, if you have to use some if-else kind of logic, the mapper doesn't work, so we need to fall back on the source view and hand-code the XSLT. XSLT is a functional language and it doesn't follow Java coding principles, and you will first be surprised at how you cannot do many things that you easily do in Java. For example, you cannot change a variable's value in XSLT; it's like a constant. Then there are many learnings around XPATH expressions, how to pass parameters to XSLT, single quote/double quote issues etc.

Also, many functions like ora:getInstanceId() or ora:getFaultString() don’t work in XSLT, so they have to be assigned separately to the resulting xml.

One best practice is to ‘test the XSLT outside of bpel’ and then use it in bpel, to avoid the endless build/deploy/test loop. On the console the XSLT error will also not be clear; it will give a FOTY0001 kind of code, and we have to check the actual error in the server log.
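Testing the XSLT outside bpel needs nothing more than the JAXP transformer that ships with the JDK; a minimal harness (class and sample payload are illustrative):

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltTester {
    // apply a stylesheet to a payload entirely in memory, so XSLT bugs
    // surface before a bpel build/deploy/test cycle
    public static String transform(String xslt, String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/order'><xsl:value-of select='id'/></xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(xslt, "<order><id>42</id></order>"));
    }
}
```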

One other best practice is to use addAuditTrailEntry or System.out.println to verify variable values at bpel runtime.

I will cover the adapter issues that we face on day1 in another blog.

update7/2/10
a good article on using position() as a variable while selecting something in XSLT here

Tuesday, March 31, 2009

SOA Suite Skill Map

While working on SOA Suite, various areas of technology criss-cross each other, and it's sometimes overwhelming to look at issues in a structured manner. From a Reference Architecture perspective things are clear; they stack up as follows

1. User Interaction Layer (Portal, Composite Applications, BAM)
2. Connectivity Layer (Security, ESB, OWSM)
3. Integration Layer (SOA Suite, BPM, B2B, ODI)
4. Infrastructure Layer (Weblogic Suite, Coherence)

Also the cross-cutting Layers of IDE, management and Governance

Coming back to SOA Suite - I divide it into four major buckets
-BPEL
-ESB
-Adapters
-App Server (OC4J or Weblogic)

Here is the list of items that would come under each category. I have split each category/area into Development and Administration parts - this can help to map issues to one of these areas/items and so help in troubleshooting. I plan to keep adding to the list; it is basically a very high-level taxonomy that in turn has a lot of detail in the development/administration guides.





updates on 7/22/09
Two other aspects of SOA Suite that are increasingly being used and require skills are
-BAM
-B2B

Monday, March 23, 2009

Error Handling in BPEL #2

As I noted about BPEL error handling here, SOA Suite 10.1.3.3 and higher has an error handling framework in the form of fault policies, which is very useful for setting up automatic recovery requirements in bpel.

Basically a fault policy says what to do (action) for which fault (condition). It can be attached to a partnerlink, port type, process or domain. Attaching a fault policy to a partnerlink, porttype or process can be done by including it in bpel.xml. Domain-level fault policy attachment can be done in fault-bindings.xml under each domain in $BPELHOME/bpel/domains/"domain-name"/config

Any error happening while invoking a partnerlink will get captured by this framework, and based on the fault policy an action will be taken. The pre-defined actions are retry, rethrow, human-intervention etc.

If we use rethrow as action, the fault will go back to bpel and will be handled by any catch/catch all blocks.

By default the error handling framework doesn't do anything, as out of the box no policy is configured.
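A minimal policy, recalled from the 10.1.3.3+ fault-policy syntax (retry a remoteFault 3 times, then hand off to human intervention; the id and intervals are illustrative):

```xml
<faultPolicy id="MyFaultPolicy">
  <Conditions>
    <faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
               name="bpelx:remoteFault">
      <condition>
        <action ref="ora-retry"/>
      </condition>
    </faultName>
  </Conditions>
  <Actions>
    <Action id="ora-retry">
      <retry>
        <retryCount>3</retryCount>
        <retryInterval>60</retryInterval>
        <retryFailureAction ref="ora-human-intervention"/>
      </retry>
    </Action>
    <Action id="ora-human-intervention">
      <humanIntervention/>
    </Action>
  </Actions>
</faultPolicy>
```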

please check this for more information

Update on 6/Apr:
To refresh fault policy changes, a server restart is not required; we can use the following link: http://host:port/BPELConsole/domain_name/doReloadFaultPolicy.jsp. Also, while creating a faultpolicy file, don't forget to put the correct id in the faultPolicy tag, the same as in bpel.xml

Update on 30/Apr
While showcasing fault policies to a customer, it became clear that business faults can’t be handled by fault policies, only technical faults, which is quite a disadvantage. However, it's possible to handle business faults by converting the partnerlinks into separate bpel processes and then throwing the business fault. Also, customizing the fault-policy for requirements like notification or calling another bpel seems to be a challenge. Can we get the ora-java actions available in the bpel console? Need to check that.

Monday, March 16, 2009

AQ-JMS on SOA Suite10.1.3.4 on WebLogic9.2

There are 3 options in SOA Suite on Weblogic for messaging
-AQ Adapter
-AQ with JMS
-JMS Providers
--OC4J JMS Provider (only on OC4J)
--Weblogic JMS Provider
--Any other 3rd party JMS provider

AQ is the native queuing mechanism in Oracle, which uses the database to create queue tables and queues/topics.

Other JMS providers primarily use an in-memory/file-based approach to create queues and topics.

To create Aq/JMS adapters on SOA Suite on Weblogic, here are the steps -

AQ/JMS basically means AQ queues with a JMS message type. AQ/JMS supports SYS.AQ$_JMS_TEXT_MESSAGE, SYS.AQ$_JMS_BYTES_MESSAGE, SYS.AQ$_JMS_STREAM_MESSAGE, SYS.AQ$_JMS_MAP_MESSAGE and SYS.AQ$_JMS_MESSAGE.

For the SQL to create the queue tables/queues, here is a reference.
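As a rough sketch, the DBMS_AQADM calls look like this. The queue and queue-table names here are made up for illustration; the payload type is what makes it a JMS-compatible queue:

```sql
-- create a single-consumer queue table typed for JMS text messages
BEGIN
  DBMS_AQADM.CREATE_QUEUE_TABLE(
    queue_table        => 'DEMO_JMS_QT',
    queue_payload_type => 'SYS.AQ$_JMS_TEXT_MESSAGE',
    multiple_consumers => FALSE);
  DBMS_AQADM.CREATE_QUEUE(
    queue_name  => 'DEMO_JMS_Q',
    queue_table => 'DEMO_JMS_QT');
  DBMS_AQADM.START_QUEUE(queue_name => 'DEMO_JMS_Q');
END;
/
```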

Next we realized that AQ/JMS is not supported on Weblogic yet, so we had to use custom code created by Robert Patrick.

The code is also available here

Basically, we need to create a Weblogic startup class (AQJMSStartupClass.jar) and use one property file, a couple of user credential files, and a couple of jars (aqapi13.jar and ojdbc14.jar); we followed the readme file and created the required files.

Next we configured the Weblogic startup class with the necessary parameters and restarted the SOA server; on startup it created the AQJMS JNDI entries for the QueueConnectionFactories, Queues, etc.

Then we created a JMS adapter entry in weblogic-ra.xml to use the AQJMS_QueueConnectionFactory, and used this JMS adapter JNDI name in the BPEL WSDL.

This just worked great. We have seen some AQ-related errors in the log, which I will cover in the next post.

Update on Aug/19/09:
The AQ-related error 'Cannot delist resource when transaction state is committed' got fixed after applying MLR#8.

Tuesday, March 10, 2009

Error Handling in BPEL

Error handling is a key core service in BPEL, and there is always confusion around how to do it.

There are various aspects to error handling in BPEL/ESB
1. esb errors
2. bpel partner link errors
3. bpel non-partnerlink errors

The simple answer is catch and catch-all

catch is for system-defined faults like bindingFault, remoteFault, selectionFailure, etc. There are 12 of them, visible in the fault explorer. To catch them, we create a variable of type RuntimeFaultMessage from RuntimeFault.wsdl (under SOA_HOME\bpel\system\xmllib).


catchAll is for everything else, and we can get the error string with ora:getFaultAsString().
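Put together, the fault handlers have roughly this shape (a sketch in the Oracle BPEL 1.1 syntax of 10.1.3; faultString is assumed to be a simple string variable declared in the process):

```xml
<faultHandlers xmlns:bpelx="http://schemas.oracle.com/bpel/extension">
  <catch faultName="bpelx:remoteFault" faultVariable="runtimeFault">
    <!-- runtimeFault is declared with the messageType from RuntimeFault.wsdl -->
    <!-- handle the system fault here -->
  </catch>
  <catchAll>
    <assign name="CaptureFault">
      <copy>
        <from expression="ora:getFaultAsString()"/>
        <to variable="faultString"/>
      </copy>
    </assign>
  </catchAll>
</faultHandlers>
```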

The best practice for handling these faults in catch/catchAll blocks is to do the following steps:

1. Create a custom fault message based on all the values we need to capture, for example process id, instance id, etc.

2. Send the fault message to a custom error handler, which can log the error, send it to the worklist, or send an email.

3. Send a reply back to the client with the error message; this needs the WSDL to have provisions for sending fault data as part of the response or fault.

4. Terminate
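As an illustration of step 1, a custom fault message schema could look like the following. This is a hypothetical example; every name in it is made up:

```xml
<!-- hypothetical CustomFault.xsd capturing the values mentioned in step 1 -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.com/fault"
            elementFormDefault="qualified">
  <xsd:element name="customFault">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="processId" type="xsd:string"/>
        <xsd:element name="instanceId" type="xsd:string"/>
        <xsd:element name="faultString" type="xsd:string"/>
        <xsd:element name="timestamp" type="xsd:dateTime"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```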

Alternatively, we can use a throw in step #3 if the WSDL has provisions for faults. However, I found that when using throw, step #2 gets rolled back, so adding a checkpoint() resolves it.

SOA Suite 10.1.3.3 onwards also has a built-in fault policy mechanism to handle errors. Basically, whenever partnerlink errors happen, BPEL checks the fault policy to decide what to do. In the fault policy we can configure different action types like retry, rethrowFault, etc.

I will cover fault policies and error handling frameworks in more detail in another post.

Monday, March 02, 2009

Clusters Vs Grids

Recently I worked on a Coherence data grid and then set up a SOA cluster on Weblogic. That made me wonder what the difference is between grids and clusters. A couple of Google searches turn up many links that explain it; however, here it is in my own understanding.

Grids: Grid computing is about optimizing the usage of resources like CPU, memory, or IO. The Coherence data grid we built is basically a cache server that caches data so we can avoid the expensive database trip; we reduce the IO and make the application faster. These cache servers can be started in large numbers across machines, and they all talk to each other using multicast communication, which is actually the same technology used in Weblogic clusters to keep the managed servers in sync. Similarly, other grids optimize CPU usage that would otherwise sit idle, wasting the money spent to keep and maintain the machines. Server virtualization is a grid solution.
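The cache-aside idea behind the data grid can be sketched in plain Java. This is only an illustration: a HashMap stands in for a Coherence NamedCache, and loadFromDatabase() is a made-up stand-in for the expensive database trip:

```java
import java.util.HashMap;
import java.util.Map;

// Cache-aside sketch: serve repeated reads from memory so the
// (pretend) database is only hit once per key.
public class CacheAsideDemo {
    private final Map<String, String> cache = new HashMap<String, String>();
    int dbTrips = 0; // counts how often we actually hit the "database"

    private String loadFromDatabase(String key) {
        dbTrips++;                   // the expensive IO happens here
        return "row-for-" + key;     // pretend result row
    }

    public String get(String key) {
        String value = cache.get(key);
        if (value == null) {         // cache miss: load once, then cache
            value = loadFromDatabase(key);
            cache.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        CacheAsideDemo demo = new CacheAsideDemo();
        demo.get("42");              // miss: one trip to the database
        demo.get("42");              // hit: served from the in-memory cache
        System.out.println("db trips = " + demo.dbTrips); // prints "db trips = 1"
    }
}
```

In a real Coherence grid the map is partitioned across many cache servers, but the caller-side pattern is the same.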

Clusters: on the other hand, clusters basically provide scalability, so that the application can support more and more user requests, along with high availability thanks to failover support in cluster-aware managed servers. Clusters are an extension of distributed computing and let you scale the application dynamically, meaning you add more managed servers any time you have more demand. Similar failover and dynamic addition of servers is supported in Coherence grids as well; Coherence cache servers also partition the data across multiple servers, providing more scalability.

Interesting thread on weblogic clusters here
Interesting discussion on scalability here


So basically the underlying technology (like multicast communication) can be the same, and the results, like performance (response time) and scalability (throughput), can be the same; however, grids and clusters differ in the basic goals they work toward. It would not be wrong to say grids help you scale up while clusters help you scale out.

Tuesday, February 17, 2009

SOA Suite 10.1.3.4 on Weblogic HA installation

Recently we did a simple HA installation of SOA Suite 10.1.3.4 on Weblogic, following the enterprise deployment guide. This went quite smoothly without many issues. However, we had to redo it once, as one key thing in this installation is that the directory structure of SOA_HOME and BEA_HOME has to be the same on both nodes.

We were doing this remotely, so to start with we got VNC access to both boxes; also make sure ports 8001, 9700, 9701, and 9702 are open for you.

The steps are quite straightforward:
1. Set up the RAC DB (we went for a single DB); make sure all DB initialization parameters are set.
2. Install SOA Suite 10.1.3.1, then apply the 10.1.3.4 patch (basic install). Remember that if you try to access EM after these steps it will not come up, as the ascontrol webapp is not enabled by default; you can enable it using this.
3. Install Weblogic, then run the Weblogic scripts to configure the domain, clusters, nodes, and datasources, and deploy the apps.
A couple of things to note here: make sure to deploy the datasources to the cluster, and if any script fails with an 'unable to get lock' error you can follow this. Also, start the SOA server from the Admin Console, and remember we don't need to use opmn start/stop here.
4. After setting up apphost2 as per the document, we were able to bring up all the consoles. We made one deviation from the document for the esb_dt setup: instead of setting 9700 in the esb_parameter table, we set 9702.
5. Next was the webhost and load balancer setup.

From a deployment perspective, we deploy our BPEL processes to both nodes and use round-robin load balancing across them.

We will be exploring more to validate the setup and failover scenarios.

Update on 2/mar/09:
We faced a major issue when we installed one more SOA/HA/Weblogic environment and by mistake used the same multicast IP/port. We had to reinstall both setups, the earlier one and the new one. We got multicast errors while starting server2:
Error: Cluster: BEA-000110 Multicast socket receive error: java.io.EOFException
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:178)
at java.io.DataInputStream.readLong(DataInputStream.java:380)
at java.io.ObjectInputStream$BlockDataInputStream.readLong(ObjectInputStream.java:2744)
at java.io.ObjectInputStream.readLong(ObjectInputStream.java:941)
at weblogic.cluster.HeartbeatMessage.readExternal(HeartbeatMessage.java:55)
Truncated. see log file for complete stacktrace


So one has to be very careful that the multicast IP/port is unique for every install, at install time itself; changing it later on doesn't help.

updated on 13/mar/09

While our Weblogic cluster was working fine, we found issues accessing one of the BPEL consoles: 404 - Not Found. This made us look into BPEL clustering on top of Weblogic clustering. The SOA-HA FAQ document is very useful for understanding this. Basically, the BPEL nodes talk to each other to sync processes, any changes in bpel.xml, adapter state, etc., and they use the jGroups configuration to do this. collaxa-config.xml has to be updated with 4 changes as per the EDG. The jGroups changes are a little tricky: the jgroups-protocol.xml file has to have a unique multicast IP address (when using UDP, not TCP), and if the servers have multiple IPs then bind_to_all_interfaces can be set to false and bind_addr added with the one IP. While our jGroups settings are fine and the BPEL nodes seem to propagate new processes without restart, we still have a 404 issue, which seems to be something to do with the network IPs.
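The relevant part of jgroups-protocol.xml looks roughly like this. A sketch only: the attribute names follow the jGroups 2.x UDP protocol and the address values are illustrative, so check against the file shipped with your install:

```xml
<!-- sketch: unique multicast address per install, pinned to one interface -->
<config>
  <UDP mcast_addr="230.0.0.1"
       mcast_port="45566"
       bind_addr="10.0.0.5"
       bind_to_all_interfaces="false"/>
  <!-- the rest of the protocol stack (PING, FD, pbcast.GMS, etc.) stays as shipped -->
</config>
```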

Thursday, January 29, 2009

BPEL-Weblogic ClassLoader issue for Coherence

I had to deploy some existing code developed on SOA Suite on OC4J to SOA Suite on Weblogic. It was a good struggle for me, so here it goes:

The code basically is a bpel calling a java web service using wsif. And the java service uses the caching software coherence to read/write to an in memory cache.

1. First of all, the Java web service built in JDev didn't deploy to Weblogic. It gave me: java.lang.IllegalStateException: could not find schema type named {http}//weblogic/types/}getHelloResponse" (I edited the actual type).

After some googling, it seems there are interoperability issues, so I had to use Weblogic ant tasks to create/deploy the web service. Though ant tasks like wsdlc etc. were available, my ant script didn't run as it couldn't find the XMLBeans libraries, which I couldn't find in weblogic/lib either. So my next attempt was to develop this in Workshop, which I am still trying, as Workshop didn't like my localhost Weblogic SOA domain created as part of the SOA Suite on Weblogic installation scripts.

2. Since my Java web service attempts were not getting anywhere, I decided to package the Java class (which would be called by WSIF anyway) in the BPEL process itself. That went pretty well; my WSIF binding started working after a few issues with the namespace in my schema and the BAM code in the BPEL (which I removed later, as at deployment it was throwing a JCE error asking me to put new jars in jre/lib/ext). I couldn't find a JAXB ant task to replace schemac, so I had to go with schemac to generate the Java binding classes from the WSDL/schemas. Here too I faced compilation issues with the generated classes, so I had to edit the schema to remove the (maxOccurs="100000") entries. So finally, when my BPEL was ready with everything else, I had to test the Coherence part.

3. The Coherence part was the real hurdle to cross. With coherence.jar packaged in my BPEL output jar, I couldn't start the CacheFactory. It complained: Caused by: (Operation failed!; nested exception is: (Wrapped: Failed to load the factory) java.lang.reflect.InvocationTargetException - Wrapped: Failed to load configuration resource: coherence-cache-config.xml) java.io.IOException: Configuration is missing: "coherence-cache-config.xml", loader=null. So I tried putting coherence.jar in the Weblogic server classpath, and then it worked fine. I was able to create the CacheFactory, and just when I thought it was done, the put/get to the cache started throwing: (Wrapped) java.io.IOException: readObject failed: java.lang.ClassNotFoundException: com.schemas.Customer.CustomerQueryResultSet
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method).

Finally, with the help of Coherence support, I learnt that I need to pass the classloader to CacheFactory.getCache(), as the classloaders differ when we go out of the BPEL domain into the Coherence code and vice versa. After passing the classloader, the whole thing worked fine.
Sample code below (sName is the cache name passed in by the caller):

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

NamedCache cache;
Thread thread = Thread.currentThread();
// remember the caller's context classloader so we can restore it
ClassLoader loaderPrev = thread.getContextClassLoader();
try
{
    // switch to the classloader that loaded the Coherence classes
    thread.setContextClassLoader(NamedCache.class.getClassLoader());
    cache = CacheFactory.getCache(sName);
}
finally
{
    // always restore the original context classloader
    thread.setContextClassLoader(loaderPrev);
}

Saturday, January 17, 2009

Oracle Middleware

Oracle Products are primarily of 3 types
-Database
-Packaged Applications
-Middleware

Oracle Middleware products are categorized as

-SOA/BPM
I have already written about SOA/BPM line of products here
-Web2.0
Under this come Enterprise Content Management and the portal servers from both the Oracle and BEA product lines.
-Application Grid
This basically has the app servers (Weblogic, OC4J) and the various grid technology products like Coherence, Web Cache, etc.
-Identity management
-BI/EPM

Each category has a lot of products for any kind of customer requirement and application pattern.

Tuesday, January 13, 2009

2009++

Recently I worked on a POC to call Coherence Java code from SOA Suite using a WSIF binding. While trying to google the subject, I found an interesting article on 'real-time SOA'. This term 'real-time SOA' kind of hit me and I started googling more, and hit upon many articles in SYS-CON on cloud computing. After going through a couple of YouTube videos on cloud computing and looking at the list of hundreds of companies working on related technologies like SaaS, virtualization, etc., I am amazed at what the future holds.

I just completed 10 years in IT as a developer in the outsourcing industry, and as I understand it, this decade (98-08) is what can be branded as web 1.0, SOA 1.0, cloud computing 1.0 - all 1.0 - and the next decade is going to be 2.0. We are already seeing it - be it SOA/BPM or Web 2.0/AJAX, the 2.0 is starting to unfold.

After listening to Bill Coleman here, I am right now just sleepless thinking about how exciting the future looks.

Tuesday, January 06, 2009

My Oracle SOA Experience log -

From Apr/08

Installation of SOA Suite 10.1.3.3
Installation of AIA FP2.0.1
MDM PIP Development
Upgrading SOA Suite to 10.1.3.4 on Weblogic9.2

BPA suite installation
OSB installation and POC
Oracle Coherence

All installations on Windows laptop.

SOA Suite HA installation on Weblogic9.2/Linux