While deploying a composite to a SOA cluster, the deployment was stuck for more than 20 minutes. To make matters worse, cancelling and retrying the deployment corrupted the MDS, causing soa-infra to fail on restart.
The logs clearly showed a stuck thread:
<[STUCK] ExecuteThread: '56' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "602" seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 602461 ms
[
POST /soa-infra/deployer HTTP/1.1
Connection: TE
TE: trailers, deflate, gzip, compress
User-Agent: Oracle HTTPClient Version 10h
Accept-Encoding: gzip, x-gzip, compress, x-compress
ECID-Context:
Authorization: Basic amNoZW42Ol8xYW1BZG1pbg==
Content-type: application/octet-stream
Content-Length: 69483
]", which is more than the configured time (StuckThreadMaxTime) of "600" seconds. Stack trace:
Thread-701 "[STUCK] ExecuteThread: '56' for queue: 'weblogic.kernel.Default (self-tuning)'" {
-- Waiting for notification on: java.util.HashMap@4343a522[fat lock]
java.lang.Object.wait(Object.java:???)
oracle.integration.platform.blocks.deploy.CoherenceCompositeDeploymentCoordinatorImpl.submitRequestAndWaitForCompletion(CoherenceCompositeDeploymentCoordinatorImpl.java:352)
oracle.integration.platform.blocks.deploy.CoherenceCompositeDeploymentCoordinatorImpl.coordinateCompositeRedeploy(CoherenceCompositeDeploymentCoordinatorImpl.java:255)
oracle.integration.platform.blocks.deploy.servlet.BaseDeployProcessor.overwriteExistingComposite(BaseDeployProcessor.java:487)
oracle.integration.platform.blocks.deploy.servlet.BaseDeployProcessor.deploySARs(BaseDeployProcessor.java:298)
^-- Holding lock: java.lang.Object@73823526[thin lock]
The soa-infra error on restart:
weblogic.application.ModuleException: [HTTP:101216]Servlet: "FabricInit" failed to preload on startup in Web application: "/soa-infra".
oracle.fabric.common.FabricException: Error in getting XML input stream: oramds:/deployed-composites/AccountBS_rev1.0/composite.xml: oracle.mds.exception.MDSException: MDS-00054: The file to be loaded oramds:/deployed-composites/AccountBS_rev1.0/composite.xml does not exist.
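When a server log holds many of these reports, it can help to pull out the key fields rather than read each one by hand. The following is a minimal sketch, assuming the message wording matches the excerpt above (other WebLogic releases may phrase it differently). The function name summarize_stuck_threads and the dictionary keys are hypothetical, chosen only for illustration.

```python
import re

# Patterns are based on the stuck-thread message shown above; this is an
# assumption about the log format, not a documented WebLogic contract.
STUCK_RE = re.compile(
    r"\[STUCK\] ExecuteThread: '(?P<thread>\d+)' for queue: '(?P<queue>[^']+)' "
    r"has been busy for \"(?P<busy>\d+)\" seconds"
)
MAX_RE = re.compile(r"\(StuckThreadMaxTime\) of \"(?P<limit>\d+)\" seconds")
COHERENCE_FRAME = "CoherenceCompositeDeploymentCoordinatorImpl"


def summarize_stuck_threads(log_text):
    """Return one dict per stuck-thread report found in log_text."""
    reports = []
    matches = list(STUCK_RE.finditer(log_text))
    for i, m in enumerate(matches):
        # A report runs from its header up to the next header (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(log_text)
        body = log_text[m.start():end]
        limit = MAX_RE.search(body)
        reports.append({
            "thread": int(m.group("thread")),
            "queue": m.group("queue"),
            "busy_seconds": int(m.group("busy")),
            "max_seconds": int(limit.group("limit")) if limit else None,
            # True when the stuck request is a composite deployment.
            "is_deployer_request": "POST /soa-infra/deployer" in body,
            # True when the stack trace includes the Coherence coordinator.
            "waits_on_coherence": COHERENCE_FRAME in body,
        })
    return reports
```

A report with both is_deployer_request and waits_on_coherence set matches the situation described in this post: a deployment request waiting on the Coherence deployment coordinator.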
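For the soa-infra startup error, this blog has steps on how to recover.
The stuck deployment thread is waiting inside CoherenceCompositeDeploymentCoordinatorImpl.submitRequestAndWaitForCompletion, which points to Coherence-related issues. There are many useful troubleshooting documents on Oracle Support: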
General Coherence Network Troubleshooting And Configuration Advice (Doc ID 1389045.1)
Coherence and SOA Suite Integration Recommendations (Doc ID 1557370.1)
Troubleshooting Tips for Coherence - Oracle Service Oriented Architecture (SOA) Suite Integration Issues (Doc ID 1388786.1)
"oracle.integration.platform.blocks.deploy.CoherenceCompositeDeploymentCoordinatorImpl.submitRequestAndWaitForCompletion" Error and Slow Response While Accessing Composites In EM Console (Doc ID 1437883.1)
SOA 11g Composite Deployment Results in Stuck Thread Error: <[STUCK] ExecuteThread - Unable to Deploy the Composites in a Cluster (Doc ID 1086654.1)
SOA 11g Health Check: Verify Consistency of Coherence wka and wka.port Configuration (Doc ID 1578203.1)
SOA 11g: How Many Nodes are Required to be Specified as Coherence WKA Members in a SOA/OSB Cluster? (Doc ID 1511706.1)
Stuck Threads during SOA Cluster Deployment (Doc ID 1564586.1)
IpMonitor Failed To Verify The Reachability Of Senior Member (Doc ID 1530288.1)
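Several of the documents above concern keeping the Coherence WKA settings consistent across the cluster. As a rough illustration of that kind of check, here is a minimal sketch. It assumes the WKA hosts and ports are passed in each managed server's start arguments as -Dtangosol.coherence.wka<N>=host and -Dtangosol.coherence.wka<N>.port=port; your environment may configure them elsewhere. The function names, server names, hosts, and ports are hypothetical.

```python
import re

# Assumption: WKA settings appear as -Dtangosol.coherence.wka<N>=host and
# -Dtangosol.coherence.wka<N>.port=port in the server start arguments.
WKA_RE = re.compile(r"-Dtangosol\.coherence\.(wka\d*(?:\.port)?)=(\S+)")


def wka_settings(server_args):
    """Extract only the WKA-related properties from a start-argument string."""
    return dict(WKA_RE.findall(server_args))


def inconsistent_servers(args_by_server):
    """Return names of servers whose WKA settings differ from the first server's."""
    items = list(args_by_server.items())
    if not items:
        return []
    _, reference_args = items[0]
    reference = wka_settings(reference_args)
    return [name for name, args in items[1:] if wka_settings(args) != reference]


# Hypothetical example: soa_server2 has a different wka1.port.
example = {
    "soa_server1": "-Dtangosol.coherence.wka1=hostA -Dtangosol.coherence.wka1.port=8088 "
                   "-Dtangosol.coherence.localhost=hostA",
    "soa_server2": "-Dtangosol.coherence.wka1=hostA -Dtangosol.coherence.wka1.port=8089 "
                   "-Dtangosol.coherence.localhost=hostB",
}
print(inconsistent_servers(example))
```

The localhost property is deliberately ignored, since it is expected to differ per node; only the WKA list and ports are compared.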