Thursday, April 9, 2015

Using LCM for Backups

Posted by Damon Hannah Jul 12, 2012 11:02:00 AM

Lifecycle Management is most commonly used to promote changes from one environment to another. However, there's no reason you can't use the same tool to recover back into the same environment. With that in mind, LCM can become a terrific backup solution. The steps are simple enough: a bit of up-front work, a bit of scripting, and you will have a nightly "hot backup" that can be used to recover almost everything, from a lost app to a single report that was accidentally deleted or modified.

First, we have to create the definition. From Shared Services, access LCM as you would for a live migration. Define what you want migrated (normally everything) and walk through the pages to define the migration. However, instead of selecting "Execute Migration", select "Save Migration Definition". This outputs an .xml file that holds the LCM export definition. Repeat this for each item in the LCM tree (Shared Services, Planning applications, HFM applications, EPMA, Reporting and Analysis, etc.), giving each its own distinct and identifiable backup name and xml file. I recommend you keep those two names the same and make them obvious to the casual reader (i.e. HSS.xml, MYAPP.xml, Reports.xml, etc.).

A couple of warnings here. First, Essbase is not really migrated via LCM: the export gets substitution variables and not much else (no outlines, etc.). Second, if your reporting folder structure gets too large, you will find that backup taking hours (at one point I saw it approach 24 hours). If this happens, break it into several backups, one for each reporting folder (i.e. Reports-FolderA, Reports-FolderB). For this approach to work, you will need to severely limit or keep reports out of the root folder altogether.

Now that you have migration (or backup) definition files, you need to get an admin-level encrypted password into them.
Otherwise you'll be asked for a username and password every time they are run, making them useless as an automated backup. I always use the 'admin' account: it is an automated process and should have all the rights required for the backup, and it is a "known value" that won't change. Log in to the Shared Services server, place the xml files somewhere local on that server, and set up a backup directory. Once you have done this, run the LCM tool from the command line. You can run it on either Windows or UNIX; it works the same way. In a Windows environment, the command would look something like this:

D:\Hyperion\common\utilities\LCM\9.5.0.0\bin\utility.bat -local -b

You will be prompted for a username and password. Once you provide them, they are encrypted and placed in the .xml file so that you won't be prompted again. Repeat this process for each xml file.

Now you're ready to script. Use a scripting language you are comfortable with that works on the OS you are going to use. My most commonly used are Perl, Bash, and (reluctantly) Windows CMD scripting. You can make the script as intelligent or as simple as your needs require. Some basic steps would be:
- Make sure the backup directory exists
- Clear / move / delete previous backups in that directory
- For each xml file found, run utility.bat as was done manually above (capture both normal output and error output to a log file)

I always zip up the output directory to save space and also to make it easier to move to an archive location. Note: you will need to use 7-Zip on a Windows system because of limitations of normal Zip and the requirements of zipping up Reporting and Analysis directories. Some more advanced steps would be to kick off the LCM backups in parallel, email upon error and/or completion, and test for older backups, removing any that are more than a week / month old.
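As a sketch, the basic steps above might look like this in Bash. Every path here, and the way the utility is invoked, is an assumption to adjust for your own install (the utility's exact arguments vary by release):

```shell
#!/bin/sh
# Nightly LCM backup driver -- a sketch only. LCM_UTIL, the directory layout,
# and the utility invocation are assumptions; adjust them to your install.
LCM_UTIL="${LCM_UTIL:-/Hyperion/common/utilities/LCM/9.5.0.0/bin/utility.sh}"
DEF_DIR="${DEF_DIR:-/backup/lcm/definitions}"   # saved .xml migration definitions
OUT_DIR="${OUT_DIR:-/backup/lcm/output}"        # tonight's export lands here
LOG_DIR="${LOG_DIR:-/backup/lcm/logs}"

run_backup() {
    mkdir -p "$OUT_DIR" "$LOG_DIR"              # make sure the directories exist
    rm -rf "${OUT_DIR:?}"/*                     # clear the previous night's output
    for def in "$DEF_DIR"/*.xml; do
        [ -e "$def" ] || continue               # no definitions found, skip
        name=$(basename "$def" .xml)
        # run the utility once per definition, capturing stdout and stderr
        "$LCM_UTIL" "$def" >"$LOG_DIR/$name.log" 2>&1 \
            || echo "FAILED: $name" >>"$LOG_DIR/errors.log"
    done
    # archive the output to save space (use 7-Zip instead on Windows)
    tar czf "lcm_$(date +%Y%m%d).tar.gz" -C "$OUT_DIR" .
}
```

Append a call to run_backup at the end and schedule the script nightly via cron (or Task Scheduler on Windows); checking errors.log afterwards is a simple hook for the "email upon error" idea.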
These are nice, but not central to the basic steps; as I said, you can make this as simple or complicated as your needs dictate. Test the script a few times manually, keeping an eye on the time it takes to run. If you get the results you expect, you're ready to schedule it nightly. There is no downtime required. Keep in mind, however, that the backup will impact system performance, so try to schedule it when there aren't competing processes (consolidations, data loads, etc.).

Finally, a backup is only good if you can actually do a recovery with it. Make sure you use the backup to recover some sample data and verify it. If you've zipped the backup to a file, unzip it to the import_export\ directory (otherwise just copy the outputted LCM application directory for one backup to that location). Open Shared Services and make sure you can select items for import. To be completely certain it's working, rename a sample item (i.e. a report) and actually perform the import for that item. Verify the imported item matches what it should be as of the time of the backup.

There is a side benefit to this process: when it comes time to promote changes to a new environment, you already have an export ready to go from the last backup. This may not be the most efficient way to promote a single form, but it works great when the promotion is more comprehensive.
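The unpack-and-verify step can likewise be scripted. A sketch, assuming a tar.gz archive produced by a UNIX backup script; both paths are assumptions for your environment:

```shell
#!/bin/sh
# Restore-prep sketch: unpack a nightly LCM archive into the Shared Services
# import_export directory so its contents can be selected for import.
# Both paths are assumptions -- adjust them to your install.
ARCHIVE="${ARCHIVE:-/backup/lcm/lcm_20150409.tar.gz}"
IMPORT_EXPORT="${IMPORT_EXPORT:-/Hyperion/common/import_export}"

restore_prep() {
    mkdir -p "$IMPORT_EXPORT"
    tar xzf "$ARCHIVE" -C "$IMPORT_EXPORT"   # unpack into import_export
    ls "$IMPORT_EXPORT"                      # quick sanity check of what landed
}
```

After running it, open Shared Services and confirm the unpacked exports appear under the File System group before doing the sample import.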

LCM command line utility to automate the shared services/planning application backup

How To: Use the LCM command line utility to automate Shared Services / Planning application backups.

Solution: LCM has a graphical interface for taking Shared Services and application backups. There is also a command line utility that can be used to automate and schedule the backup jobs. The steps below are for backing up Shared Services; the same can be replicated for a Planning application backup.

• The first step is to save a migration definition file from the LCM graphical interface.
  o Go to Shared Services Console -> Application Groups -> Foundation -> Shared Services.
  o Select all/required artifacts (options: Native Directory, TaskFlows) and click "Define Migration".
  o Select the destination as a file system and give it a name, say "SharedServiceDefinition".
  o Select the option "Save Migration Definition" and save this xml file in a folder, say C:\Backup\SharedServicesMigrationDefinition.xml.
• Create a batch file with the following commands in it:
  cd C:\Oracle\Middleware\user_projects\epmsystem1\bin
  Utility.bat C:\Backup\SharedServicesMigrationDefinition.xml
• Run this batch file manually the first time. It will prompt for a username and password.
• From the second run onwards, you don't need to type the username and password, since this information gets stored in the migration definition xml file (the password is stored in encoded form). Now you can schedule this batch file for a regular backup.
• Note that the utility is located in the folder \\Oracle_Home\Middleware\user_projects\epmsystem1\bin and the backup file is generated in this folder only, so you might want to move the output to another suitable destination through another batch file.

Source: http://www.adistrategies.com/resources/knowledge-base/id0127-lcm-command-line-utility-to-automate-the-shared-servicesplanning-application-backup/
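On UNIX, the run-then-move pattern from the final note above might be sketched like this. Every path and the Utility.sh name are assumptions (use the .bat equivalents in a Windows batch file):

```shell
#!/bin/sh
# Run a saved migration definition, then move the generated backup folder out
# of the epmsystem1/bin directory to an archive location. Every path and the
# Utility.sh name are assumptions -- adjust to your environment.
EPM_BIN="${EPM_BIN:-/Oracle/Middleware/user_projects/epmsystem1/bin}"
DEFINITION="${DEFINITION:-/backup/SharedServicesMigrationDefinition.xml}"
ARCHIVE_DIR="${ARCHIVE_DIR:-/backup/archive}"
BACKUP_NAME="${BACKUP_NAME:-SharedServiceDefinition}"  # name given in the definition

backup_and_move() {
    # the utility writes its output folder into EPM_BIN, so run it from there
    ( cd "$EPM_BIN" && ./Utility.sh "$DEFINITION" ) || return 1
    mkdir -p "$ARCHIVE_DIR"
    # date-stamp the folder so nightly runs don't collide
    mv "$EPM_BIN/$BACKUP_NAME" "$ARCHIVE_DIR/${BACKUP_NAME}_$(date +%Y%m%d)"
}
```

Once the first manual run has stored the encoded credentials in the definition file, this wrapper can be scheduled unattended.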

Wednesday, April 8, 2015

Oracle’s Hyperion® Life Cycle Management

https://ranzal.wordpress.com/2010/03/24/using-oracles-hyperion%C2%AE-life-cycle-management/

Using Oracle's Hyperion® Life Cycle Management
Posted on March 24, 2010 by Larry Goetz

WHAT IS LCM? LCM (Life Cycle Management) is a tool which can be used to migrate Hyperion applications, cubes, repositories, or artifacts across product environments and operating systems. It is accessed through the Shared Services Console.

DOES IT WORK? After using LCM at a few clients I think the answer is a definite YES, but there needs to be a realistic setting of expectations: yes, LCM has some very good and handy uses; but no, it is not necessarily going to be a painless, simple answer to your migration and/or backup needs.

WHAT CAN I DO WITH IT? You can use it for migrations:
- One environment to another
- One app to another (same Shared Services environment)
- Selected dimensions or other artifacts

And for backups/restores, including keeping two separate environments synchronized:
- Selected artifacts
- Lights-out operation

Products which can be migrated are: Shared Services, Essbase, Planning, Reporting, HFM, and the dimensions housed in EPMA.

This blog is going to concentrate on using LCM for Planning application migrations although, as you can see from the list above, it can be used for other products as well. First I'll show how a migration is done, using screen shots, to give a detailed look. Then I'll point out things to look out for, including things which will cause the migration to fail, with work-arounds where possible.

To migrate an entire Planning application, you will need to copy four areas:
1. Shared Services
2. Essbase (for Planning, you only need the Essbase global variables; all app/db-specific variables are migrated with the Planning application)
3. Planning application
4. Reporting and Analysis (if applicable)

The order in which you export these is not important, but when doing the import the order is very important.
Some important considerations:
- The target app can have a different name from the source.
- Source and destination plan types must match (can be changed by editing the files).
- Target plan types must be in the same order as the source.
- The start year must be the same; the number of years doesn't need to match.
- The base time period must be the same.
- The target app's currency settings must match the source.
- Standard dimension names must match (can be changed by editing the files).

When exporting any application it is advisable to just export everything; if necessary you can be selective on the import side.

Start the process by opening the Shared Services Console and go to Application Groups -> Application (in this case, Shared Services under Foundation). In the lower part of the screen, click "Select All" and then "Define Migration". Now go through the screens:
- Leave each field with an * and choose "Next".
- Type in a file name for the export. It is advisable to use a naming convention here, since you will end up with (possibly multiple) files for each application.
- Review the destination options and click "Next".
- Finally, review the migration summary and click "Execute Migration."

NOTE: If this process is going to be re-run in a lights-out environment, you should instead choose the "Save Migration Definition" button. I'll discuss this more fully later on.

You will get an information screen. Click "Launch Migration Status Report" to actually see the migration progress. As long as the migration is running you will get a status of In Progress; click Refresh to keep checking status (if desired) until you get a status of Completed or Failed.

All of the other applications can be exported this same way, each with slightly different screen sequences but generally the same process. The primary differences will be for Planning and Essbase where, if there are other applications in the same Shared Services environment, they will be offered as possible targets for the export, in addition to the file system.
Selecting one of these will cause a direct migration from the source application to the selected target application.

After the exports are finished, the LCM export files can be copied to the target server environment if needed. These export files can be found on the Shared Services server under \Hyperion\common\import_export\username@directory. Copy the entire directory (in this example, Essadmin@Native Directory) to the Hyperion\common\import_export directory on the target server.

The import side is where things are more likely to be tricky. Here you will reverse the process, selecting the export files in the proper order (Shared Services, Essbase, Planning, Reporting) and importing them to whatever target is appropriate. Start the process by logging in to the Shared Services Console as the same username you used in the export process. Under Application Groups -> File System, find the appropriate export files and click "Define Migration." Click through the screens, including the Shared Services screen selecting the target application to import to. On the destination options screen select Create/Update and increase the max errors if desired (default = 100), then run the migration.

For the Planning import, select all to begin, click through the screens, select the Planning application to import to, and click through the remaining screens to execute the migration. The Reporting migration is similar: select all the artifacts you want to import and go through the remaining screens to execute the migration.

In many cases, especially where you are keeping two identical environments in sync, these migrations should go smoothly and complete without error. However, at other times, especially when doing an initial migration or one where the security will be much different from one environment to another, you may have to make several passes at the migration. When even one item fails to migrate successfully, LCM will report a status of "Failed".
Click on that link in the status report and LCM will tell you which items failed to migrate; all other items will usually have migrated successfully. You will then have to figure out why an item failed and either fix the problem, work around it, or ignore it and migrate the item another way. Here are some things I've found which will cause you problems in using LCM:

- In exporting a Planning application with many substitution variables, the EXPORT failed, refusing to export all of the variables. This was worked around by exporting only the variables and then exporting everything except the variables. Alternatively, you can play with the group count/size settings, as well as the report and log file locations, in the migration.properties file. The default settings are usually:
  grouping.size=100[mb]
  grouping.size_unknown_artifact_count=10000
- Using "All Locations" in HBR will cause failure for those forms.
- Essbase server names: if not the same in source and target, you will have to modify the import files for the target name.
- Report name length is limited to 131 characters, less the folder name.
- Dimension members "Marked for Delete" won't migrate. You will have to delete them using a SQL query if you want them migrated.
- Form folders may get re-ordered on migration. You can work around this by manually adding the folders to the target application in the proper order; LCM will not reorder existing folders.
- Parentheses ( ) in form names are not supported. You won't get an error indication in the export/import; the forms just won't be there in the imported app. You'll have to rename the forms to get them to come over.
- Member formulas need to be in Planning; if they exist only in Essbase they don't come over. If this is a one-time migration you can use the EAS migration utility to bring the outline over after the LCM migration.
- You must manually delete Shared Services groups in the target app if you deleted them in the source app (otherwise they will remain).
- Reports: you must manually update the data source in the target.
- Members with certain special characters don't come over.
- Clusters are not supported; you must use the outline as the HBR location.
- Global variables with limits in their definition don't work.

Well, now you should be able to use LCM and judge for yourself whether it is right for your application. In another blog I'll show how to run LCM in lights-out mode and also how to do some modifications to the export files so you can do things like sharing dimension detail between Planning applications.
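As an example of "editing the files", the Essbase server-name fix from the list above can be scripted before the import. A sketch; the export directory and host names below are purely illustrative:

```shell
#!/bin/sh
# Sketch of "editing the files": rewrite the source Essbase server name to the
# target name across an LCM export directory before importing it. The export
# directory and host names below are purely illustrative.
EXPORT_DIR="${EXPORT_DIR:-/Hyperion/common/import_export/admin@Native Directory}"
SOURCE_HOST="${SOURCE_HOST:-essbase-dev}"
TARGET_HOST="${TARGET_HOST:-essbase-prod}"

rename_essbase_host() {
    grep -rl "$SOURCE_HOST" "$EXPORT_DIR" | while read -r f; do
        # substitute via a temp file (portable across sed variants)
        sed "s/$SOURCE_HOST/$TARGET_HOST/g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}
```

Keep a copy of the untouched export first; a blanket substitution is convenient but will rewrite every occurrence of the name, wherever it appears.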

Essbase Server Clustering

Posted by Jeff Henkel Jan 23, 2012 6:13:00 PM
Reference: http://blog.checkpointllc.com/essbase-server-clustering

At long last, after many years of customer requests and many unsupported, creative workarounds, Oracle now has an officially supported Essbase clustering method. This is a software-based, active-passive cluster using Oracle's OPMN (Oracle Process Monitoring and Notification service). Due to the nature of Essbase, and its agent's need to have exclusive locking rights over the files associated with applications and databases, only one agent can be active at any given time. What OPMN provides is automatic failover, high availability, and write-back on the other Essbase agent upon failure of the active agent. The only capability missing is load balancing.

This functionality was first introduced with EPM System 11.1.2, though there were many issues in that first release. Oracle recommends implementing Essbase clustering in EPM System 11.1.2.1. In addition, you need to apply OPMN patch 11744008, which resolves some known issues with OPMN. What Essbase clustering still doesn't give you is live backups, but Oracle is supposed to be working on finally making that a feature of future releases.

An active-passive Essbase cluster can contain two Essbase servers. To add an Essbase server, you must install an additional instance of Essbase, either on the same server (really not recommended, since the physical hardware remains a single point of failure) or on another physical server (recommended). The applications must be on a shared drive, and the cluster name must be unique within the deployment environment. These types of shared drive are supported:
- SAN storage device with a shared disk file system supported on the installation platform, such as OCFS.
- NAS device over a supported network protocol.
Note: any networked file system that can communicate with a NAS storage device is supported, but the cluster nodes must be able to access the same shared disk over that file system. SAN or a fast NAS device is recommended because of shorter I/O latency and failover times.

Initial Essbase cluster setup occurs on the first instance of Essbase, where you define the Essbase cluster name and the local Essbase instance name and location using the EPM System Configurator. This version of Essbase still uses the old variable name ARBORPATH, but the variable is now used to define the location of the application files, not the location of the Essbase system files as in previous versions of Essbase. All of this information is stored in the EPM System Registry, which lives in the Shared Services database. When you set up each instance, not only for Essbase but for the entire system, you connect to the Shared Services database so that the same EPM System Registry is in use for the entire system. OPMN also reads the Essbase cluster information from the EPM System Registry and keeps track of the active node there.

When you set up the second instance of Essbase and connect to the same EPM System Registry, you will be presented with an option to join the previously configured cluster that was set up on the first instance. All information regarding the previously configured cluster will automatically populate and will be grayed out. Once you complete the setup with the EPM System Configurator, there are still quite a few manual steps that must be taken to update the OPMN configuration files on each Essbase instance. Consult the EPM System High Availability guide and the Oracle EPM System Installation and Configuration guide for more detailed information on the manual changes required to complete the setup. Happy clustering!

Starting and Stopping the Essbase Server 11.1.2 using OPMN (Doc ID 1114453.1)

APPLIES TO: Hyperion Essbase - Version 11.1.2.0.00 to 11.1.2.0.00 [Release 11.1]. Information in this document applies to any platform. ***Checked for relevance on 17-May-2013***

PURPOSE
How to start, stop, monitor and control the Essbase Agent process.

SCOPE
Possible ways of monitoring and controlling the Essbase Agent process.

DETAILS
There are two possible ways of monitoring and controlling the Essbase Agent process:
- Starting and stopping Essbase using OPMN.
- Starting and stopping Essbase without using OPMN.

Why use OPMN with Essbase: the Oracle Process Manager and Notification server (OPMN) enables you to monitor and control the Essbase Agent process. You add the Essbase Agent information to the opmn.xml file to enable OPMN to start, stop, and restart the agent using the OPMN command line interface. OPMN can automatically restart the Essbase Agent when it becomes unresponsive, terminates unexpectedly, or becomes unreachable as determined by ping and notification operations. During the EPM installation, EPM System Installer installs OPMN and registers Essbase Server for OPMN. OPMN manages the Essbase Agent, which manages the Essbase Server.
Methods to start the Essbase server using OPMN:

1. Command line. Use these commands to start and monitor Essbase from OPMN:
   opmnctl status
     Determines the status of system component processes.
   opmnctl startproc ias-component=EssbaseInstanceName
     Starts the system component named EssbaseInstanceName*.
   opmnctl restartproc ias-component=EssbaseInstanceName
     Restarts the system component named EssbaseInstanceName*.

   *) EssbaseInstanceName:
   - If you did not implement failover clustering, EssbaseInstanceName is the name of the Essbase instance that you entered when you configured Essbase Server on the Essbase Server Configuration page of the EPM System Configurator.
   - If you implemented failover clustering, EssbaseInstanceName is the name of the Essbase cluster that you entered when you set up the Essbase cluster.

   Note: components that are managed by OPMN should never be started or stopped manually. Use the opmnctl command line utility to start and stop system components.

2. Windows Start Menu command: select Start --> Programs --> Oracle EPM System --> Essbase --> Essbase Server --> Start Essbase. This command launches startEssbase.bat and redirects to OPMN. Note: the registered service name is opmn_instanceName, and the display name in the Windows Services control panel is Oracle Process Manager (instanceName).

3. Windows startup script: launch MIDDLEWARE_HOME/user_projects/epmsystem1/bin/startEssbase.bat to start the Essbase server; it redirects to OPMN. Note: if you have more than one Essbase instance, each instance of Essbase Server has its own startup script. If you configured an additional instance of Essbase, startEssbase.bat|sh is located in additionalInstanceLocation/bin. Launch the start script from that location to launch that instance of Essbase.
   The command-line client launchers sit alongside it:
   ESSCMD: MIDDLEWARE_HOME/user_projects/epmsystem1/EssbaseServer/EssbaseServerInstanceName/bin/startEsscmd.bat (also available in the /EssbaseClient directory)
   ESSMSH: MIDDLEWARE_HOME/user_projects/epmsystem1/EssbaseServer/EssbaseServerInstanceName/bin/startMaxl.bat (also available in the /EssbaseClient directory)
   All the scripts call setEssbaseEnv.bat to set up ESSBASEPATH, ARBORPATH, and PATH before starting.

4. UNIX startup script: launch MIDDLEWARE_HOME/user_projects/epmsystem1/bin/startEssbase.sh to start the Essbase server; it redirects to OPMN. Note: if you have more than one Essbase instance, each instance of Essbase Server has its own startup script. If you configured an additional instance of Essbase, startEssbase.bat|sh is located in additionalInstanceLocation/bin. Launch the start script from that location to launch that instance of Essbase.
   ESSCMD: MIDDLEWARE_HOME/user_projects/epmsystem1/EssbaseServer/EssbaseServerInstanceName/bin/startEsscmd.sh (also available in the /EssbaseClient directory)
   ESSMSH: MIDDLEWARE_HOME/user_projects/epmsystem1/EssbaseServer/EssbaseServerInstanceName/bin/startMaxl.sh (also available in the /EssbaseClient directory)
   All the scripts call hyperionenv.doc to set up ESSBASEPATH, ARBORPATH, and PATH before starting.
   - When running Essbase manually from a console, the console cannot be set to UTF-8 encoding.
   - There is another instance of startEssbase.sh located in MIDDLEWARE_HOME/user_projects/epmsystem1/EssbaseServer/essbaseserver1/bin. This file does not redirect to OPMN.
   - You must use that startEssbase.sh file to start Essbase if Oracle Business Intelligence Enterprise Edition is the data source for Essbase.
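For lights-out use, the opmnctl start commands above can be wrapped in a small script that starts the component and then polls status until OPMN reports it Alive. A sketch; the component name and retry count are assumptions (find the real name with "opmnctl status"):

```shell
#!/bin/sh
# Lights-out start sketch: start the Essbase component through OPMN, then poll
# "opmnctl status" until it reports Alive. The component name and retry count
# are assumptions -- find the real name with "opmnctl status".
ESSBASE_COMPONENT="${ESSBASE_COMPONENT:-EssbaseInstanceName}"
OPMNCTL="${OPMNCTL:-opmnctl}"

start_and_wait() {
    "$OPMNCTL" startproc ias-component="$ESSBASE_COMPONENT" || return 1
    tries=0
    while [ "$tries" -lt 30 ]; do
        # status lists each managed component with its state, e.g. "Alive"
        "$OPMNCTL" status | grep "$ESSBASE_COMPONENT" | grep -q Alive && return 0
        tries=$((tries + 1))
        sleep 2
    done
    echo "Essbase did not report Alive" >&2
    return 1
}
```

A nonzero exit from start_and_wait gives a scheduler or monitoring job something concrete to alert on.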
To stop Essbase Server using OPMN: if you started the Essbase Server using OPMN, then you have to use OPMN to stop it, using one of the following methods:

1. Command line:
   opmnctl stopproc ias-component=EssbaseInstanceName
   Stops the system component named EssbaseInstanceName.
   Note: if you attempt to use MaxL to shut down an Essbase instance that was started using OPMN, you are warned to use OPMN to shut down Essbase. Stopping Essbase Server can take some time, depending on how many Essbase applications are running on the server. To stop the Essbase Server, you need Administrator permissions.

2. Windows stop script: MIDDLEWARE_HOME/user_projects/epmsystem1/bin/stopEssbase.bat (redirects to OPMN).

3. UNIX stop script: MIDDLEWARE_HOME/user_projects/epmsystem1/bin/stopEssbase.sh (redirects to OPMN).

Starting and stopping Essbase without using OPMN: Oracle provides an alternate method of starting Essbase that does not use OPMN.

To start the Essbase server without using OPMN, launch the instance of startEssbase.sh located in MIDDLEWARE_HOME/user_projects/epmsystem1/EssbaseServer/essbaseserver1/bin. Note: if you use this method of starting Essbase, then OPMN is not used for managing Essbase and active-passive failover clusters are not supported.

To stop the Essbase server without using OPMN (if you used this alternate method to start Essbase), use one of the following methods:

1. Stopping via Administration Services Console: in the Enterprise View, right-click the Essbase Server node and select Stop.

2. Stopping Essbase via MaxL:
   alter system shutdown;
   Example:
   login admin 'password' on local;
   alter system shutdown;

3. Stopping Essbase via ESSCMD:
   SHUTDOWNSERVER servername username password
   Where:
   servername: host name associated with the Essbase Server you want to shut down.
   username: your user name.
   password: your password.
   Example:
   shutdownserver local admin password;

4. Agent: if you started the Agent in the foreground, you can use the quit command:
   quit

COMMON ISSUES
See:
- Document 1277055.1 for a number of common issues and solutions when using Essbase with OPMN
- Document 1179893.1 for a problem when signing off from a remote HSS server (will affect Essbase login)
- Document 1156592.1 for a problem when signing off from a remote Essbase server
- Document 1272837.1 for installing a second Essbase Server instance on the same physical machine

REFERENCES
NOTE:1156592.1 - Essbase Service Fails After Being Started from Remote Desktop Client. OPMN.exe
NOTE:1179893.1 - Oracle Epm 11.1.2 Crashed After Signing Off From Remote Server Using The Account Which Installed EPM Services
NOTE:1272837.1 - Install a Second Essbase Instance on the Same Server in Essbase 11.1.2
NOTE:1277055.1 - Troubleshooting Essbase Issues With Oracle Process Manager and Notification (OPMN) server

Essbase Clustering 11.1.2.2

http://database.developer-works.com/article/17609016/Essbase+Active+Passive+Clustering+11.1.2.3
http://hyperionvirtuoso.blogspot.com/2014/01/essbase-clustering-part-1.html
https://blogs.oracle.com/pa/entry/epm_11_1_2_23

Essbase v11.1.2.x Cluster Configuration on Unix Systems (Doc ID 1429843.1)

APPLIES TO: Hyperion Essbase - Version 11.1.2.1.000 and later, Generic UNIX. *** checked for currency 10-07-2014 ***

PURPOSE
This document will help in understanding how to set up an Essbase active/passive failover environment.

SCOPE
This document is intended for system administrators responsible for installing and configuring the Essbase clustered environment.

DETAILS
Pre-requisites:
- Identical user IDs must be configured on both servers, for example "epmadmin".
- A shared drive must be created that is accessible and writable from both Essbase nodes. These types of shared drive are supported:
  - SAN storage device with a shared disk file system supported on the installation platform, such as OCFS.
  - NAS device over a supported network protocol.
  Note: any networked file system that can communicate with a NAS storage device is supported, but the cluster nodes must be able to access the same shared disk over that file system at the same time. SAN or a fast NAS device is recommended because of shorter I/O latency and failover times.
- The mount point on both servers for the shared disk must be identical.

To successfully cluster Essbase 11.1.2.x in a Linux environment, the user IDs and passwords used to install/configure Essbase must match on both the active and passive nodes. The following must also match:
- Primary group IDs
- Numeric UIDs
- Numeric primary group IDs

The easiest way to accomplish this is to create the users on both nodes prior to installing the software, making sure that all of the above criteria are met.
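Checking that the numeric UID and GID match on both nodes can be scripted. A minimal sketch, assuming the install user is epmadmin:

```shell
#!/bin/sh
# Sketch: print the numeric UID and GID of the install user so the values can
# be compared across the two nodes. The user name is an assumption.
EPM_USER="${EPM_USER:-epmadmin}"

uid_gid_of() {
    # prints "uid:gid" for EPM_USER from a passwd-format file (default /etc/passwd)
    awk -F: -v u="$EPM_USER" '$1 == u { print $3 ":" $4 }' "${1:-/etc/passwd}"
}
```

Run uid_gid_of on each node (for example over ssh) and diff the two outputs; they must be identical before installing the software.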
To check the numeric UIDs and primary group GIDs, use the following command on both servers:
   # cat /etc/passwd | grep epmadmin
The following is displayed:
   epmadmin:x:502:502::/home/epmadmin:/bin/bash
The two numbers ":502:502" are the UID and GID of the user.

Tips on configuring the cluster: for instructions on configuring the Essbase cluster, read the chapter "OPMN Service Failover for Essbase Server" in the Oracle Hyperion Enterprise Performance Management System Installation and Configuration Guide and "Essbase Server Clustering and Failover" in the Oracle Hyperion Enterprise Performance Management System High Availability and Disaster Recovery Guide. Use different instance names when configuring the Essbase nodes; for example, on the active node use the instance name epminstance2 and on the passive node use epminstance3. After the configuration is complete, modify the opmn.xml files on both the active and passive nodes; refer to the "Setting Up Active-Passive Essbase Clusters" chapter in the Installation and Configuration Guide. Define the ias-component name in the opmn.xml file as the cluster name defined during the configuration; this should be the same on both Essbase nodes.

Testing the cluster: once Essbase is installed and configured on both nodes, and the opmn.xml files have been modified, test starting the cluster and connecting to it from an Essbase client. NOTE: the cluster name used in the opmnctl commands is case-sensitive.

1. Start OPMN and Essbase on the active node using the opmnctl command:
   $ opmnctl startall
2. Verify that OPMN and Essbase are up and running on the active node:
   $ ./opmnctl status
   Confirm Essbase is running as a process:
   $ ps -ae | grep ESS
   11517 ?        00:00:07 ESSBASE
3. Start OPMN (and Essbase) on the passive server:
   $ ./opmnctl start   (or startall)
4. Verify that OPMN is running on the passive server but Essbase is down:
   $ ./opmnctl status
5.
Stop Essbase on the active node:
   $ ./opmnctl stopproc ias-component=EssbaseCluster-1
6. Check that Essbase has stopped on the active node:
   $ ./opmnctl status
7. Check that Essbase has started on the passive node:
   $ ./opmnctl status
8. Using MaxL, connect to the Essbase cluster. Use the full path to the Analytic Provider Services, for example:
   login admin password on "http://:/aps/Essbase?ClusterName=EssbaseCluster-1";
   To connect using the cluster name, refer to "How To Connect To an Essbase v11.1.2.x Cluster via MaxL or ESSCMD?"

NOTE: If you then run the same stopproc command on the second node, Essbase will NOT fail over to the first node, as it was explicitly stopped. To re-establish the failover, stop OPMN on both nodes, then start up the first node and then the second node.

Notes: only the Agent (ESSBASE) fails over. Applications are not restarted on an Agent failover; clients will need to reconnect/relogin and re-submit their requests.

TIPS:

Restart-on-Death setting: when restart-on-death is set to TRUE, OPMN will first attempt to restart Essbase on the same node 3 times (first start + 2 retries, configurable in opmn.xml), and only when all attempts fail will Essbase fail over to the second node. When set to FALSE, OPMN will immediately fail over to the second node without a retry. If set to TRUE and Essbase crashes, it will be restarted on the same node in a few seconds; the only time it will take longer is when the current node has trouble restarting Essbase after all retries.

Check the communications status of OPMN: to confirm the communication between the active and passive nodes, use the following command on both servers:
   $ netstat -a | grep 671
You should see at least one entry in the results on each server showing a communications link to the other.
For example:

tcp 0 0 Passive.example.com:37409 Active.example.com:6712 ESTABLISHED

Logs

$MIDDLEWARE_HOME/user_projects/epmsystem1/diagnostics/logs/OPMN/opmn:
console~Essbase2~EssbaseAgent~AGENT~1.log
console~EssbaseCluster-1~EssbaseAgent~AGENT~1.log
EssbasePing.log
opmn.log
opmn.out

$MIDDLEWARE_HOME/user_projects/epmsystem1/diagnostics/logs/essbase/essbase_0:
ESSBASE_ODL.log
ESSBASE_ODL_1328087597.log
leasemanager_essbase_{servername}.log

OPMN

The behavior of the OPMN process and all of the OPMN components is controlled and configured in the opmn.xml file. The following command can be used to detect any syntax errors in that file:

opmnctl validate

NOTE: This only checks for syntax and format issues. It will not tell you whether the information you entered into the opmn.xml file is incorrect.

Common OPMN Commands

The OPMN process and the opmnctl command are used to control the Essbase process as of version 11.1.2. Here are a few commonly used opmnctl examples as they are used with Essbase.

The following command starts the OPMN process only, not Essbase:

opmnctl start

To check the current status of OPMN and all associated components, use the following command:

opmnctl status

Once the OPMN process is running, you can then start Essbase in the following manner:

opmnctl startproc ias-component=EssbaseCluster-1

NOTE: You must use the correct ias-component name with this command. You can find that name by using the "opmnctl status" command as described above.
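If you don't have the component name handy, it can also be pulled straight out of opmn.xml. A hypothetical sketch (the opmn.xml content is inlined here to keep the snippet self-contained; the real file lives under your EPM instance directory):

```shell
# Extract the ias-component id from opmn.xml; this is the name you pass
# to "opmnctl startproc ias-component=<name>".
opmn_xml='<ias-component id="EssbaseCluster-1"><process-type id="EssbaseAgent"/></ias-component>'

name=$(printf '%s' "$opmn_xml" | sed -n 's/.*<ias-component id="\([^"]*\)".*/\1/p')
echo "$name"   # EssbaseCluster-1
```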
The command to stop Essbase is similar:

opmnctl stopproc ias-component=EssbaseCluster-1

The following command starts OPMN and all ias-components defined in the opmn.xml file (that is, OPMN and Essbase):

opmnctl startall

Similarly, to stop all OPMN components and OPMN itself:

opmnctl stopall

Configurable Settings (essbase.cfg)

Take care when setting these values, especially if the timeout value has been modified in the opmn.xml file. If the timeout is set too low, you may run into a situation where both nodes try to start and obtain a lease.

Essbase Agent:
AGENTLEASEEXPIRATIONTIME - Specifies the number of seconds before a lease expires.
AGENTLEASEMAXRETRYCOUNT - Specifies the number of times that the Essbase Agent attempts to acquire or renew a lease. If the attempts are unsuccessful, the agent terminates itself.
AGENTLEASERENEWALTIME - Specifies the time interval, in seconds, after which the Essbase Agent attempts to renew a lease. This value must be less than the value of AGENTLEASEEXPIRATIONTIME.

Essbase Server (ESSSVR/applications):
SERVERLEASEEXPIRATIONTIME - Sets the maximum amount of time that Essbase Server can own a lease before the lease is terminated.
SERVERLEASEMAXRETRYCOUNT - Specifies the number of times that Essbase Server attempts to acquire or renew a lease. If the attempts are unsuccessful, the server terminates itself.
SERVERLEASERENEWALTIME - Specifies the time interval after which Essbase Server renews its lease.

Bin Directories

The cluster configuration lays down two EssbaseServer/bin directories: one under the $MIDDLEWARE_HOME directory and another on the shared mount point. When troubleshooting issues, check both directories.

Essbase Clustering Part 1

Essbase clustering can be used to mitigate the risk of an Essbase server going down and taking your Planning or reporting application with it.
When it comes to Essbase clustering, the most commonly used method is the Essbase cluster that can be configured via the main installation program. These Essbase clusters are a "hot spare", or active/passive, configuration. The drawback of this approach is that you need two (ideally identical) servers but can only use one at a time, leaving the horsepower of the "passive" server unused. To build a cluster of this type you need to configure a Microsoft cluster to monitor both servers, services, and DNS entries; configure a cluster disk (outside of the Essbase cluster) to store the Essbase data files (ARBORPATH); and configure OPMN to monitor the Essbase process on both servers. Ideally, the Microsoft cluster will detect a down server, point the cluster name to the other server, and switch the shared disk. OPMN will then detect that Essbase is not running on the downed server and start it on the server that is still available. Does this sound like a lot? There's another option.

Another method of clustering is an active/active cluster, which mitigates the risk of having only one Essbase server and also serves as a load balancer. The main "gotcha" of an active/active cluster is that you cannot write back to it, which Planning applications require. This is why this type of cluster is not as popular and is almost exclusively used with ASO reporting applications. Another gotcha of active/active clusters is that you have to build your cubes and load data on both servers. These need to be kept in sync so that users sent to one server see exactly the same data as on the other servers in the cluster. This method uses APS and the JAPI to establish server pools. The main advantage of this setup is that you don't have to configure any Microsoft clustering, and you can treat each server individually, with its own ARBORPATH.
In EAS, each will show as an individual server with its own set of applications. On the downside, you will have to build each application on each server. Again, this is most suitable for ASO applications, or BSO apps that do not require write-back (which rules out Planning), and it can take advantage of the horsepower of both servers at all times (unless a server goes down). I will be writing two follow-up posts on how to configure each type of cluster identified here.

Monday, April 6, 2015

Sun QFS

What Is Sun QFS? Sun QFS software is a high-performance file system that can be installed on Oracle Solaris x64 AMD and SPARC platforms. This high-availability file system ensures that data is available at device-rated speeds when requested by one or more users. The Sun QFS file system's inherent scalability enables the storage requirements of an organization to grow over time, with virtually no limit to the amount of information that can be managed. This file system enables you to store many types of files (text, image, audio, video, and mixed media) all in one logical place. In addition, the Sun QFS file system enables you to implement disk quotas and a shared file system. This file system also includes the following features:

- Metadata separation
- Direct I/O capability
- Shared reader/writer capability
- File sharing in a storage area network (SAN) environment
- Oracle Solaris Cluster support for high availability

Using the Sun QFS Shared File System

The Sun QFS shared file system is always installed in the global-cluster voting node, even when a file system is used by a zone cluster. You configure a specific Sun QFS shared file system into a specific zone cluster using the clzc command. The scalable mount-point resource belongs to that zone cluster, while the metadata server resource, SUNW.qfs, belongs to the global cluster. You must use the Sun QFS shared file system with one storage management scheme from the following list:

- Hardware RAID support
- Solaris Volume Manager for Sun Cluster

Distributing Oracle Files Among Sun QFS Shared File Systems

You can store all the files that are associated with Oracle RAC on the Sun QFS shared file system. Distribute these files among several file systems as explained in the subsections that follow.
- Sun QFS File Systems for RDBMS Binary Files and Related Files
- Sun QFS File Systems for Database Files and Related Files

Sun QFS File Systems for RDBMS Binary Files and Related Files

For RDBMS binary files and related files, create one file system in the cluster to store the files. The RDBMS binary files and related files are as follows:

- Oracle relational database management system (RDBMS) binary files
- Oracle configuration files (for example, init.ora, tnsnames.ora, listener.ora, and sqlnet.ora)
- System parameter file (SPFILE)
- Alert files (for example, alert_sid.log)
- Trace files (*.trc)
- Oracle Cluster Ready Services (CRS) binary files

Sun QFS File Systems for Database Files and Related Files

For database files and related files, determine whether you require one file system for each database or multiple file systems for each database. For simplicity of configuration and maintenance, create one file system to store these files for all Oracle RAC instances of the database. To facilitate future expansion, create multiple file systems to store these files for all Oracle RAC instances of the database.

Note – If you are adding storage for an existing database, you must create additional file systems for the storage that you are adding. In this situation, distribute the database files and related files among the file systems that you will use for the database.

Each file system that you create for database files and related files must have its own metadata server. For information about the resources that are required for the metadata servers, see Resources for the Sun QFS Metadata Server.
The database files and related files are as follows:

- Data files
- Control files
- Online redo log files
- Archived redo log files
- Flashback log files
- Recovery files
- Oracle Cluster Registry (OCR) files
- Oracle CRS voting disk

Optimizing the Performance of the Sun QFS Shared File System

For optimum performance with Solaris Volume Manager for Sun Cluster, configure the volume manager and the file system as follows:

- Use Solaris Volume Manager for Sun Cluster to mirror the logical unit numbers (LUNs) of your disk arrays.
- If you require striping, configure the striping by using the file system's stripe option.

Mirroring the LUNs of your disk arrays involves the following operations:

- Creating RAID-0 metadevices
- Using the RAID-0 metadevices, or Solaris Volume Manager soft partitions of such metadevices, as Sun QFS devices

The input/output (I/O) load on your system might be heavy. In this situation, ensure that the LUN for Solaris Volume Manager metadata or hardware RAID metadata maps to a different physical disk than the LUN for data. Mapping these LUNs to different physical disks ensures that contention is minimized.

Procedure: How to Install and Configure the Sun QFS Shared File System

Before You Begin

You might use Solaris Volume Manager metadevices as devices for the shared file systems. In this situation, ensure that the metaset and its metadevices are created and available on all nodes before configuring the shared file systems.

Ensure that the Sun QFS software is installed on all nodes of the global cluster where Sun Cluster Support for Oracle RAC is to run. For information about how to install Sun QFS, see Using SAM-QFS With Sun Cluster.

Ensure that each Sun QFS shared file system is correctly created for use with Sun Cluster Support for Oracle RAC. For information about how to create a Sun QFS file system, see Using SAM-QFS With Sun Cluster.

For each Sun QFS shared file system, set the correct mount options for the types of Oracle files that the file system is to store.
For the file system that contains binary files, configuration files, alert files, and trace files, use the default mount options. For the file systems that contain data files, control files, online redo log files, and archived redo log files, set the mount options as follows:

In the /etc/vfstab file, set the shared option.

In the /etc/opt/SUNWsamfs/samfs.cmd file or the /etc/vfstab file, set the following options:

fs=fs-name
stripe=width
mh_write
qwrite
forcedirectio
rdlease=300 (set this value for optimum performance)
wrlease=300 (set this value for optimum performance)
aplease=300 (set this value for optimum performance)

fs-name - Specifies the name that uniquely identifies the file system.
width - Specifies the required stripe width for devices in the file system. The required stripe width is a multiple of the file system's disk allocation unit (DAU). width must be an integer that is greater than or equal to 1.

Note – Ensure that settings in the /etc/vfstab file do not conflict with settings in the /etc/opt/SUNWsamfs/samfs.cmd file. Settings in the /etc/vfstab file override settings in the /etc/opt/SUNWsamfs/samfs.cmd file.

Mount each Sun QFS shared file system that you are using for Oracle files:

# mount mount-point

mount-point - Specifies the mount point of the file system that you are mounting.

If you are using a zone cluster, configure the Sun QFS shared file system into the zone cluster; otherwise, go to Step 5. For information about configuring a Sun QFS shared file system into a zone cluster, see How to Add a QFS Shared File System to a Zone Cluster in the Sun Cluster Software Installation Guide for Solaris OS.

Change the ownership of each file system that you are using for Oracle files.

Note – If you have configured the Sun QFS shared file system for a zone cluster, perform this step in that zone cluster.
Change the file-system ownership as follows:

- Owner: the database administrator (DBA) user
- Group: the DBA group

The DBA user and the DBA group are created as explained in How to Create the DBA Group and the DBA User Accounts.

# chown user-name:group-name mount-point

user-name - Specifies the user name of the DBA user. This user is normally named oracle.
group-name - Specifies the name of the DBA group. This group is normally named dba.
mount-point - Specifies the mount point of the file system whose ownership you are changing.

Grant the owner of each file system whose ownership you changed in Step 5 read and write access to the file system.

Note – When the Sun QFS shared file system is configured for a zone cluster, perform this step in that zone cluster.

# chmod u+rw mount-point

mount-point - Specifies the mount point of the file system whose owner you are granting read and write access.

Next Steps

Ensure that all other storage management schemes that you are using for Oracle files are installed. After all storage management schemes that you are using for Oracle files are installed, go to Registering and Configuring the RAC Framework Resource Group.
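Pulling the mount-option steps above together, the two configuration files might look like the following. This is a sketch under assumptions: the family-set name Data1 and the mount point /db_data are invented for illustration, and the stripe width depends on your DAU.

```
# /etc/vfstab entry for the shared file system (fstype samfs, shared option):
# device  device-to-fsck  mount-point  fstype  pass  mount-at-boot  options
Data1     -               /db_data     samfs   -     no             shared

# /etc/opt/SUNWsamfs/samfs.cmd (per-file-system mount parameters):
fs = Data1
  stripe = 1
  mh_write
  qwrite
  forcedirectio
  rdlease = 300
  wrlease = 300
  aplease = 300
```

Remember that options set in /etc/vfstab override the same options in samfs.cmd, so keep the two files in agreement.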

Wednesday, April 1, 2015

Exception-handling-in-soa-suite-10g-and SOA 11g

Very good link given below: http://javaoraclesoa.blogspot.com/2012/05/exception-handling-in-soa-suite-10g-and.html

Exception handling in SOA Suite 10g and SOA Suite 11g

Introduction

Sometimes, the longer you think about how to solve a problem, the less complex the solution becomes. Error handling in SOA Suite 11g is one of those examples. It is tempting to implement your own mechanism for exception/error handling (for example http://javaoraclesoa.blogspot.com/2012/05/re-enqueueing-faulted-bpel-messages.html), although an extensive fault management framework is already part of the SOA Suite. In this post I describe the method used in SOA Suite 10g to implement fault policies using a custom Java class, and then implement a similar exception handling mechanism in Oracle SOA Suite 11g. Marcel Bellinga provided most of the code in the example below.

Challenges to tackle

Some of the challenges involved when implementing exception handling:

- How do I make it easy for the people monitoring and maintaining the application to detect and recover from errors?
- How do I make sure no messages are lost?
- How do I make sure the order in which messages are offered to the application does not change when exceptions occur?
- How do I prevent "hammering" a system (continuously retrying faulted messages)?

With these questions in mind, the following solution provides a good option.

A bit of background

Oracle BPEL 10g has the option to use fault-policies and fault-bindings (and to use custom Java classes in the policies), which are placed on the application server and referred to by a BPEL process in bpel.xml. See http://docs.oracle.com/cd/E14101_01/doc.1013/e15342/bpelrn.htm#BABCHCED. Oracle SOA Suite 11g has (in addition to the method described above) the option to deploy custom Java classes, fault-policies, and fault-bindings as part of the composite. This mechanism makes it easier to use the fault management framework on a per-composite basis.
See http://docs.oracle.com/cd/E12839_01/integration.1111/e10224/bp_faults.htm. Keep in mind, when using the fault management framework, that the fault-policies are triggered before a catch branch defined in a BPEL process. If you want the catch branch to be activated, the action to rethrow the fault needs to be part of the policy.

Solution in short

The solution for handling faults, while taking into account the above questions, uses the following method:

- In Oracle BPEL 10g, a custom Java class and a specific policy XML file are deployed on the application server.
- The bpel.xml file refers to the policy defined in the specific policy XML file.
- The custom Java class first deactivates the activation agents of the process and then retires the process (avoiding the issue that messages are picked up while the process is already retired, causing loss of messages).
- The faulted message is put in manual recovery mode so the error hospital can be used to recover (retry) the message after the problem is fixed.
- If the problem is fixed, the process can be activated again.
- The ORABPEL schema tables can be monitored for messages that can be recovered, or to alert someone that something has gone wrong and a recovery action is required.

In Oracle SOA Suite 11g the method is similar; however, the activation agents do not need to be deactivated explicitly, the API calls are a bit different (due to the SCA implementation), and the error handling is deployed as part of the composite (in this example; see http://mazanatti.info/index.php?/archives/75-SOA-Fault-Framework-Creating-and-using-a-Java-action-fault-policy.html for an example of how to deploy custom Java code centrally on the server).

Implementation

Implementation BPEL 10g exception handling

Custom Java action

Create a new Java project and include the orabpel.jar from your BPEL distribution in the root folder of your project. Update the project libraries to include the library. Create a new Java class.
I've used the following:

    package testapi;

    import com.oracle.bpel.client.BPELProcessMetaData;
    import com.oracle.bpel.client.IBPELProcessConstants;
    import com.oracle.bpel.client.IBPELProcessHandle;
    import com.oracle.bpel.client.Locator;
    import com.oracle.bpel.client.config.faultpolicy.IFaultRecoveryContext;
    import com.oracle.bpel.client.config.faultpolicy.IFaultRecoveryJavaClass;

    public class RetireProcess implements IFaultRecoveryJavaClass {

        public RetireProcess() {
        }

        /**
         * This method is called by the BPEL Error Hospital framework when this
         * action is selected as retrySuccessAction (with the retry option) or
         * when this action is selected as successor in the human intervention
         * screen in the BPEL Console.
         *
         * @param iFaultRecoveryContext
         */
        public void handleRetrySuccess(IFaultRecoveryContext iFaultRecoveryContext) {
            System.out.println("RetireProcess RetrySuccess start");
            setLifeCycle(iFaultRecoveryContext, IBPELProcessConstants.LIFECYCLE_ACTIVE);
            System.out.println("RetireProcess RetrySuccess end");
        }

        /**
         * This method is called by the BPEL Error Hospital framework when this
         * class is configured as action in the fault handling policy.
         *
         * @param iFaultRecoveryContext
         * @return String that can be used to influence the choice for the next
         *         action (not used in this case)
         */
        public String handleBPELFault(IFaultRecoveryContext iFaultRecoveryContext) {
            System.out.println("RetireProcess HandleFault start");
            setLifeCycle(iFaultRecoveryContext, IBPELProcessConstants.LIFECYCLE_RETIRED);
            System.out.println("RetireProcess HandleFault end");
            return null;
        }

        private void setLifeCycle(IFaultRecoveryContext iFaultRecoveryContext, int status) {
            IBPELProcessHandle procHandle = null;
            Locator loc = null;
            BPELProcessMetaData bpelProcessMetadata = null;
            String processName;
            String revision;
            try {
                processName = iFaultRecoveryContext.getProcessId().getProcessId();
                revision = iFaultRecoveryContext.getProcessId().getRevisionTag();
                /* Get a Locator instance. */
                loc = iFaultRecoveryContext.getLocator();
                /* Look up the process. Revision is optional. */
                if (revision == null || revision.trim().equals("")) {
                    procHandle = loc.lookupProcess(processName);
                } else {
                    procHandle = loc.lookupProcess(processName, revision);
                }
                if (procHandle == null) {
                    throw new Exception("Unable to find process: " + processName);
                }
                System.out.println("RetireProcess set lifecycle to retired");
                /* Get the metadata of the process. */
                bpelProcessMetadata = procHandle.getMetaData();
                if (bpelProcessMetadata.getLifecycle() != status) {
                    /*
                     * Set the lifecycle to retired.
                     * Use setState(IBPELProcessConstants.STATE_OFF) to change
                     * the process state to off.
                     */
                    bpelProcessMetadata.setLifecycle(status);
                    System.out.println("RetireProcess lifecycle set to retired");
                    /* Stop or start the activation agents. */
                    if (status == IBPELProcessConstants.LIFECYCLE_RETIRED) {
                        procHandle.stopAllActivationAgents();
                    } else {
                        procHandle.startAllActivationAgents();
                    }
                    /* Finally, update the process with the modified metadata. */
                    procHandle.updateMetaData(bpelProcessMetadata);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

Noteworthy here is the method to retire the process: obtain a locator, use the locator to get a process handle, use the process handle to get to the metadata, and update the metadata. The process handle can also be used to stop the activation agents. Compile the project using JDK 1.5.0_06. Place this class in [ORACLE_HOME]/bpel/system/classes/.

Fault policy and fault binding

Create a fault policy, for example one that retries 8 times with a 2-second interval. Place the fault policy in [ORACLE_HOME]/bpel/domains/{domain}/config/fault-policies. Create a reference to the fault policy in the bpel.xml of the process. Noteworthy in this policy is the defaultAction. My custom Java class returns null, which triggers the defaultAction, which is set to ora-human-intervention. This causes the invoke to be visible in the error hospital (Activities tab in the process manager).
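The policy XML itself did not survive in this copy of the post. As a hedged reconstruction of what such a 10g policy might look like (the ids, the catch-all condition, and the 8/2 retry values are assumptions based on the surrounding text):

```xml
<faultPolicy version="0.0.1" id="RetirePolicy"
             xmlns="http://schemas.oracle.com/bpel/faultpolicy">
  <Conditions>
    <!-- catch all faults and start with a retry -->
    <faultName>
      <condition>
        <action ref="ora-retry"/>
      </condition>
    </faultName>
  </Conditions>
  <Actions>
    <Action id="ora-retry">
      <retry>
        <retryCount>8</retryCount>
        <retryInterval>2</retryInterval>
        <retryFailureAction ref="retire-process"/>
      </retry>
    </Action>
    <!-- custom Java action; returning null triggers the defaultAction -->
    <Action id="retire-process">
      <javaAction className="testapi.RetireProcess"
                  defaultAction="ora-human-intervention"/>
    </Action>
    <Action id="ora-human-intervention">
      <humanIntervention/>
    </Action>
  </Actions>
</faultPolicy>
```

Check the 10g documentation linked above for the exact schema and for the binding entry that goes into bpel.xml.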
From the error hospital it is also possible to specify an on-retry-success method to be executed (by clicking the specific error).

Result

When an error occurs, the failed messages arrive (in order) in the error hospital (usually a small number before the process is retired). The process instances that have faulted remain open, and the process is retired. You can retry the activities to check whether the error is fixed. If the error is fixed, the process can be activated again, resuming normal operation. This way the order of messages is guaranteed, and there is no useless hammering by retrying the action that has failed. The process can be activated when the problem is fixed, avoiding a lot of manual re-offering of messages.

Implementation BPEL 11g exception handling

The 11g implementation is very similar to the 10g implementation. Deployment does not require any server-side configuration. You can download the example project here: http://dl.dropbox.com/u/6693935/blog/TestExceptionHandling.zip. If you encounter errors deploying the project, remove the MDS entry in .adf\META-INF\adf-config.xml causing the issue. The example project requires the setup described in http://javaoraclesoa.blogspot.com/2012/05/re-enqueueing-faulted-bpel-messages.html. Also mind that when importing the project, your MDS configuration might differ; remove the entries not relevant for your configuration from the .adf/META-INF/adf-config.xml file.

Custom Java class

I've used the following Java class (created in SCA-INF/src). No additional project configuration (like including libraries) is required in 11g.
    package ms.testapp.exceptionhandling;

    import com.collaxa.cube.engine.fp.BPELFaultRecoveryContextImpl;
    import java.util.logging.Logger;
    import oracle.integration.platform.faultpolicy.IFaultRecoveryContext;
    import oracle.integration.platform.faultpolicy.IFaultRecoveryJavaClass;
    import oracle.soa.management.facade.Composite;
    import oracle.soa.management.facade.Locator;
    import oracle.soa.management.facade.LocatorFactory;

    public class RetireProcess implements IFaultRecoveryJavaClass {

        private final static Logger logger =
            Logger.getLogger(RetireProcess.class.getName());

        public RetireProcess() {
            super();
        }

        public void handleRetrySuccess(IFaultRecoveryContext iFaultRecoveryContext) {
        }

        public String handleFault(IFaultRecoveryContext iFaultRecoveryContext) {
            System.out.println("handleFault started");
            BPELFaultRecoveryContextImpl bpelCtx =
                (BPELFaultRecoveryContextImpl) iFaultRecoveryContext;
            try {
                Locator loc = LocatorFactory.createLocator();
                System.out.println("locator obtained");
                Composite comp =
                    loc.lookupComposite(bpelCtx.getProcessDN().getCompositeDN());
                System.out.println("composite found");
                comp.retire();
                //bpelCtx.addAuditTrailEntry("retired " + comp.getDN());
                System.out.println("process retired");
                logger.info("retired " + comp.getDN());
            } catch (Exception e) {
                System.out.println("fault in handler");
                //bpelCtx.addAuditTrailEntry("Error in FaultHandler " + RetireProcess.class.getName());
                logger.severe("Error in FaultHandler " + RetireProcess.class.getName());
                e.printStackTrace();
            }
            return null;
        }
    }

Fault policy and fault binding

My fault-policy file is called fault-policies.xml (the composite.xml picks up that file by default, but a different file can be specified in composite.xml if required). A fault-bindings.xml file binds the policy to the composite. Both files are placed in the same folder as the composite.xml.

Result

The behavior in 11g is similar to the behavior described for 10g in both examples.
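The fault-policies.xml and fault-bindings.xml snippets were lost from this copy of the post. A sketch of what they might contain, assuming a policy id of RetirePolicy (the className matches the class above; everything else is illustrative):

```xml
<!-- fault-policies.xml -->
<faultPolicies xmlns="http://schemas.oracle.com/bpel/faultpolicy">
  <faultPolicy version="2.0.1" id="RetirePolicy">
    <Conditions>
      <faultName>
        <condition>
          <action ref="retire-composite"/>
        </condition>
      </faultName>
    </Conditions>
    <Actions>
      <Action id="retire-composite">
        <javaAction className="ms.testapp.exceptionhandling.RetireProcess"
                    defaultAction="ora-human-intervention"/>
      </Action>
      <Action id="ora-human-intervention">
        <humanIntervention/>
      </Action>
    </Actions>
  </faultPolicy>
</faultPolicies>

<!-- fault-bindings.xml -->
<faultPolicyBindings version="2.0.1"
                     xmlns="http://schemas.oracle.com/bpel/faultpolicy">
  <composite faultPolicy="RetirePolicy"/>
</faultPolicyBindings>
```

Binding at the composite level means every component in the composite uses the policy; the framework also allows binding per component or per reference.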
One thing to note is that the API works at the composite level, and I've not found a way to directly stop or start the activation agents. I did, however, not encounter the 10g error in which the JCA adapter tried to start a retired process.

First the correct situation. Use the test script to enqueue a message:

    DECLARE
      queue_options      DBMS_AQ.ENQUEUE_OPTIONS_T;
      message_properties DBMS_AQ.MESSAGE_PROPERTIES_T;
      recipients         DBMS_AQ.AQ$_RECIPIENT_LIST_T;
      message_id         RAW(16);
      message            SYS.XMLType;
    BEGIN
      recipients(1) := sys.aq$_agent('EXCEPTIONTEST', NULL, NULL);
      message_properties.recipient_list := recipients;
      message := sys.XMLType.createXML('<Name>Piet</Name>');
      DBMS_AQ.ENQUEUE(
        queue_name         => 'TESTUSER.TEST_SOURCE_QUEUE',
        enqueue_options    => queue_options,
        message_properties => message_properties,
        payload            => message,
        msgid              => message_id);
      COMMIT;
    END;

The result is a correct execution of the process. Next, disable the TEST_TARGET_QUEUE, again submit a test message, and confirm in the Enterprise Manager that the error handler has been activated.

Conclusion

Error handling in SOA Suite 11g is more extensive (has more options) than error handling in SOA Suite 10g. SOA Suite 11g also provides options for implementing fault handling on a per-process basis, which was absent in SOA Suite 10g. For accessing the API, there have been many changes going from 10g to 11g; the most significant were caused by the introduction of the SCA framework. SOA Suite 11g makes it a lot easier to use the Java API. Another lesson learned is to think about error handling very early in a project: do not start with the implementation that seems logical to a single developer, but discuss the different options and requirements with the customer and other developers. In this case a relatively simple solution using standard Oracle functionality meets many requirements. However, if the purpose is to bill as many hours as possible and to tackle every requirement as a new change, then this solution is not for you!

Coherence in SOA Suite 11g

https://blogs.oracle.com/ateamsoab2b/entry/coherence_in_soa_suite_11g
http://biemond.blogspot.com/2014/02/configure-coherence-hotcache.html

Configure Coherence HotCache

Coherence can really accelerate and improve your application because it's fast, highly available, easy to set up, and scalable. When you use it together with the JCache framework of Java 8, or with the new Coherence Adapter in Oracle SOA Suite and OSB 12c, it becomes even easier to use Coherence as your main HA cache. Before Coherence 12.1.2, if you wanted to use Coherence together with JPA for database connectivity, you had to make sure that no batch job or application modified the database directly; that would lead to an out-of-sync Coherence cache. But with Coherence 12.1.2 together with GoldenGate, you can capture these database changes and send updates to the Coherence cache. This is called Coherence HotCache.

Oracle SOA Suite 11g utilises an embedded Coherence cache to coordinate several cluster-wide activities, including composite deployment. By default, this embedded Coherence cache is configured for multicast node discovery. This is normally fine, because you probably should only have one cluster on a subnet, and the multicast packets usually don't cross the router boundary. However, it may be the case that you wish to have multiple clusters on a single subnet, or your router is forwarding multicast for you. You may encounter problems with deployment, for example, if you have two independent SOA clusters that can see each other through Coherence.
Check your log files for a message like this:

[SOA_server1] [ERROR] [] [Coherence] [] [] [APP: soa-infra] 2011-08-01 00:00:00.000/0000.000 Oracle Coherence GE 3.6.0.4 (thread=Cluster, member=1): This senior Member(Id=1, Timestamp=2011-08-01 00:00:00.000, Address=10.10.10.10:8088, MachineId=43868, Location=site:blah,machine:blah1,process:15875, Role=WeblogicServer) appears to have been disconnected from another senior Member(Id=1, Timestamp=2011-07-01 00:00:00.000, Address=10.11.11.11:8088, MachineId=43815, Location=site:blah,machine:blah2,process:18955, Role=WeblogicServer); stopping cluster service.

What has happened here is that the machine blah1 has seen the cluster that's owned by machine blah2, because they're both sharing the same multicast information. This can cause a deployment on blah1 to stop responding, ultimately resulting in things like "Stuck Thread" warnings from WebLogic Server. If you wish to separate the two clusters (and you should, because they shouldn't be communicating with each other), you have two options:

- You can reconfigure the embedded Coherence for unicast operation and use Well-Known Addressing (WKA). This is documented in the HA documentation for SOA: http://download.oracle.com/docs/cd/E21764_01/core.1111/e10106/ha_soa.htm#ASHIA3848. For a single-node environment, you can simply specify "localhost" as both the wka1 address and the local host address at the same time, and you've effectively isolated it from other environments.
- You can reconfigure the multicast address and port that Coherence is listening on. This can be done by specifying -Dtangosol.coherence.clusteraddress and -Dtangosol.coherence.clusterport instead of using the WKA technique above (otherwise, you should follow those instructions for where to set these properties). These should follow the usual rules for Coherence multicast cluster addressing (specifically, the address has to be from the multicast IP address range and unique amongst your Coherence clusters).
It is normally recommended to make both unique, and not just set the port differently, because there have been problems with OS-level sockets on different ports "seeing" each other's packets. Hopefully this will help you get your cluster (even a cluster of one node) working.
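As a sketch of the two options above, the system properties could be appended to the managed server's Java options (for example in setDomainEnv.sh); the multicast address and port values here are placeholders you would pick per cluster:

```shell
# Hypothetical: give this SOA cluster its own multicast address and port.
JAVA_OPTIONS=""
JAVA_OPTIONS="$JAVA_OPTIONS -Dtangosol.coherence.clusteraddress=224.1.2.3"
JAVA_OPTIONS="$JAVA_OPTIONS -Dtangosol.coherence.clusterport=9778"

# For the WKA alternative on a single node, you would instead set:
# JAVA_OPTIONS="$JAVA_OPTIONS -Dtangosol.coherence.wka1=localhost -Dtangosol.coherence.localhost=localhost"

echo "$JAVA_OPTIONS"
```

Whichever route you take, apply the same settings to every managed server in the cluster so all members agree on how they discover each other.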