Sunday, March 10, 2013

Logging OSB Webservice call response times - A simple approach

1. Overview
There are multiple ways of monitoring the service performance in OSB, as follows:
  1. Using OEM
  2. Using BTM
  3. Enabling monitoring for every Proxy service in OSB and writing scripts to collect and aggregate the data
  4. Explicitly adding logging in the pipeline: either using the Reporting Provider or another mechanism.
  5. Using an Audit WS-Policy
  6. Extending the Weblogic access.log to log the time taken and optionally pipeline metadata
  7. Other
If you have implemented items 1, 2 or 5, you can probably skip this post as you most likely have good monitoring coverage.

If you have implemented items 3, 4 or none, you should definitely read this post.

If you have implemented item 6, you have probably already been through the steps in this post. :)

  
2. Objective

The main objective of this post is to present a simple, effective and quick way to implement the first level of monitoring, or to extend your existing monitoring.

Remember: when it comes to good monitoring coverage, there is no one-size-fits-all solution. You need to monitor different components and correlate the events.

If you have zero or limited monitoring coverage, this approach will bring you to a level which will either help you to troubleshoot issues or it will give you additional performance indicators.


3. Approach

The approach consists of extending the WebLogic access.log to log the time taken to return the response to the consumer, and to log additional metadata.

By default, WebLogic already logs the most relevant information:
  • The timestamp
  • The URL (which is the service endpoint)
  • The HTTP response code
  • The payload response size in bytes
As you might have noticed, there is a limitation to this approach: it covers only the HTTP transport; it does not cover JMS, MQ, Tuxedo, etc.

However, it will give you some level of monitoring, which is better than nothing. On the positive side, you don't have to change your code to log the time taken, and the configuration change takes less than five minutes.


4. Using this approach in combination with other approaches

4.1 If you are using the OSB monitoring for the Proxy Services

In case you are using the OSB monitoring for your Proxy Services, you should consider adding time taken to the access.log, for the following reasons:
  • The OSB monitoring mechanism is a great feature, but it requires you to collect the data regularly and store it somewhere in order to trace performance trends. This is due to how the data is aggregated: it aggregates the data since the last statistics reset, and over the last X minutes (which is configurable).
  • If you use the information from the access.log, you will be able to aggregate it for whichever interval you want, as you have the raw data stored per service invocation.
  • Also, the OSB monitoring does not aggregate the response time in case of an error. You might be interested in this information.
  • The OSB monitoring response time aggregation does not include the time spent before the request reaches the pipeline. In other words, if your authentication provider takes 2 seconds to authenticate a user and the service execution takes 100 milliseconds, the OSB monitoring will report only 100 milliseconds, whereas the time taken in the access.log for this particular request will be 2.1 seconds.

4.2 If you already have implemented the logging explicitly in the pipeline

  • The explicit log you have implemented cannot measure the time spent before the request reaches the pipeline, so the response time it records may be inaccurate in some circumstances. The time taken in the access.log, on the other hand, captures this time correctly.
  • To correlate the data from both sources (the access.log and your own log entries), follow the next section, which covers logging additional metadata to the access.log.
  • If you define a Work Manager and some requests start being rejected, the exception will not reach the OSB pipeline, which means you will not be able to log it using your pipeline log mechanism. An entry in the access.log, however, will still be generated, with HTTP error code 503.


5. Adding time taken to the access.log

5.1 Configuration


  • Access the WebLogic console.
  • Select Environment → Servers, then select each of the managed servers
  • Select Logging → HTTP
  • Select Advanced
  • Set Format to Extended (as per the following screenshot, selection 1)
  • Enter the following value in the Extended Logging Format Fields (as per the following screenshot, selection 2)

c-ip date time time-taken cs-method cs-uri sc-status bytes

  • Click on Save


Note: For test purposes only, you can change the Log File Buffer (selection 3 above) to zero. This makes it easier to see the output in the access.log file; otherwise, WebLogic writes to the file only when the buffer reaches 8 KB (the default value).

  • Repeat the above procedure for each of the managed servers
  • Activate the session
  • Restart each of the managed servers


5.2 Log File output

If you look at the managed server access.log, you will see output like the following. The time taken, in seconds, appears in the fourth column.


#Version:       1.0
#Fields:        c-ip date time time-taken cs-method cs-uri sc-status bytes
#Software:      WebLogic
#Start-Date:    2013-03-07      00:25:02
127.0.0.1       2013-03-07      00:28:53        1.468   POST    /CustomerV1/Customer  200     1207
127.0.0.1       2013-03-07      00:28:57        0.063   POST    /CustomerV1/Customer  200     1207
127.0.0.1       2013-03-07      00:28:58        0.052   POST    /CustomerV1/Customer  200     1207
127.0.0.1       2013-03-07      00:28:58        0.045   POST    /CustomerV1/Customer  200     1207
127.0.0.1       2013-03-07      00:28:59        0.012   POST    /CustomerV1/Customer  200     1207
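
As a quick sanity check on records like the ones above, a short script can parse them and compute first-level statistics. The following Python sketch assumes the field layout configured in 5.1 (whitespace-separated columns, with the field names taken from the `#Fields:` header); it is only a minimal offline example, not part of the OSB configuration:

```python
from collections import defaultdict

def parse_access_log(lines):
    """Parse WebLogic extended-format access.log lines into dicts.

    Field names are read from the '#Fields:' header; other '#' lines
    are skipped. Works as long as no field value contains whitespace.
    """
    fields = []
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            fields = line.split()[1:]          # field names from the header
        elif not line.startswith("#"):
            records.append(dict(zip(fields, line.split())))
    return records

def avg_time_taken_per_uri(records):
    """Average the time-taken column (in seconds) per cs-uri."""
    totals = defaultdict(lambda: [0.0, 0])    # uri -> [sum, count]
    for rec in records:
        t = totals[rec["cs-uri"]]
        t[0] += float(rec["time-taken"])
        t[1] += 1
    return {uri: total / count for uri, (total, count) in totals.items()}
```

For the five sample records above, this yields an average of roughly 0.33 seconds for /CustomerV1/Customer, with the first (cold) request clearly dominating.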


6. Adding transactionID to access.log

If you have implemented your own logging inside your pipeline and you also want to benefit from the access.log metrics, you might be wondering how to correlate events. You could try to correlate based on the URI and the timestamp; however, this is not 100% reliable, especially if you have a high throughput.

The aim here is to add a correlation in the access.log. You can achieve this by:

  • Adding a response HTTP header in your pipeline, containing the transactionID
  • Changing the Weblogic Extended Logging Format Fields, adding the transactionID header
Those two steps are documented below:

6.1 Adding a response HTTP Header in the pipeline


You will have to add a Transport Header action to your Proxy Service, as follows. It's recommended to add it to your global exception handler as well, so the header is also set when an exception occurs.




The transport header is called "transactionID".
This header is populated with the value of $messageID, which is automatically generated by OSB and is the same value I'm using in my pipeline auditing, allowing both events to be correlated.





6.2 Adding the transactionID to be logged in the access.log

In order to log the transactionID in the access.log, you will have to repeat step 5.1, but this time you will set the following value for the "Extended Logging Format Fields":

c-ip date time time-taken cs-method cs-uri sc-status bytes sc(transactionID)

6.3 Log File output


If you look at the managed server access.log, you will see output like the following, with the transactionID in the last column.
To make it easier to visualise each record, I've added an extra line break between them.



#Version:       1.0
#Fields:        c-ip date time time-taken cs-method cs-uri sc-status bytes sc(transactionID)
#Software:      WebLogic
#Start-Date:    2013-03-10      16:46:25
127.0.0.1       2013-03-10      16:54:23        1.393   POST    CustomerV1/Customer 200     1207    "5129230022325267856-60c5b97c.13d52d60fb3.-7fde"

127.0.0.1       2013-03-10      16:54:39        0.034   POST    CustomerV1/Customer 200     1207    "5129230022325267856-60c5b97c.13d52d60fb3.-7fdc"

127.0.0.1       2013-03-10      16:54:40        0.017   POST    CustomerV1/Customer 200     1207    "5129230022325267856-60c5b97c.13d52d60fb3.-7fda"

127.0.0.1       2013-03-10      16:54:41        0.05    POST    CustomerV1/Customer 200     1207    "5129230022325267856-60c5b97c.13d52d60fb3.-7fd8"

127.0.0.1       2013-03-10      16:54:42        0.013   POST    CustomerV1/Customer 200     1207    "5129230022325267856-60c5b97c.13d52d60fb3.-7fd6"
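
With the transactionID present in both files, the join itself is straightforward. The sketch below is a minimal Python illustration; the pipeline log format used here is purely hypothetical (one `transactionID<TAB>message` entry per line), so adapt the indexing function to whatever your pipeline logging actually writes:

```python
def index_pipeline_log(lines):
    """Build a transactionID -> message map from a (hypothetical)
    tab-separated pipeline log: transactionID<TAB>message."""
    index = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        txn_id, _, message = line.partition("\t")
        index[txn_id] = message
    return index

def correlate(access_records, pipeline_index):
    """Join parsed access.log records (dicts keyed by field name)
    to pipeline entries via transactionID. The sc(transactionID)
    column is written with surrounding quotes in the access.log."""
    joined = []
    for rec in access_records:
        txn_id = rec["sc(transactionID)"].strip('"')
        joined.append((txn_id, rec["time-taken"], pipeline_index.get(txn_id)))
    return joined
```

Entries whose transactionID has no pipeline counterpart (for example, requests rejected before reaching the pipeline) come out with `None` as the message, which is itself useful information.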




7. Aggregating access.log data

Once the access.log files are generated, you need to aggregate the information.
That is outside the scope of this post. I might write another post exploring different options for:

  • Loading the data in realtime to a repository
  • Making the data searchable
  • Aggregating the data
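
Just to give a flavour of the aggregation step, here is a minimal offline Python sketch (not a real-time loader) that buckets time-taken values per URI and reports count, average and a nearest-rank 95th percentile:

```python
import math
from collections import defaultdict

def aggregate(samples):
    """samples: iterable of (cs_uri, time_taken_seconds) pairs.
    Returns {uri: (count, average, p95)} over the raw values."""
    buckets = defaultdict(list)
    for uri, taken in samples:
        buckets[uri].append(taken)
    stats = {}
    for uri, values in buckets.items():
        values.sort()
        # nearest-rank 95th percentile over the sorted values
        rank = max(math.ceil(0.95 * len(values)) - 1, 0)
        stats[uri] = (len(values), sum(values) / len(values), values[rank])
    return stats
```

Because the access.log gives you raw per-invocation data, the same values can be re-bucketed over any interval you like, which is exactly the flexibility the OSB monitoring aggregation lacks.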


8. Conclusion

Just to emphasise what I mentioned at the beginning of this post: this approach is a quick win. You can implement it with low effort, in under five minutes, and it will give you additional metrics for your environment.

I often hear organisations saying: "We are selecting the perfect tool, which will monitor everything".

The bottom line is: tool selection and implementation usually take months, or even years, depending on the company's bureaucracy.

If you have an application in production you cannot wait. You need a pragmatic monitoring approach RIGHT NOW.

This is just one out of many monitoring approaches. Its coverage is limited, therefore it should be used in combination with other approaches.

I will post other monitoring approaches, which will complement the one explained in this post.


Wednesday, February 27, 2013

OSB Action Visibility in the Message Flow

1. Overview

How many times have you had to revisit your OSB pipeline, review someone else's code, or collaborate with other developers on the same code?

One of the most annoying limitations of the Oracle Service Bus plugin for Eclipse is that it does not display the action context in the message flow in a useful way.

To understand it better, look at the following fragment of a message flow, which contains 3 Assign actions in a row.

I'm using the Assign action to illustrate; however, the same approach applies to any action.


A) We can see 3 Assign labels, but we cannot see what values are being assigned to what variables.





B) The default tooltip is not helpful, as it just displays the word "Assign".





C) Then, if you click on the Assign, you will be able to see its properties, as follows:




2. Issue

As you can see, if you want to navigate a message flow and get a picture of what is being assigned, you have to click on each action to see the variable name and the value being assigned to it.

I would say this is not a very productive approach, and you might find yourself frustrated if you use the tool regularly.


3. What I'm trying to achieve

I decided to address this issue, and evaluated different approaches. The main goal is to make some properties visible in the pipeline for specific actions.

In the initial version, I'm focusing on the most common actions: Assign, Replace and Service Callout.


4. Approaches

4.1.  Changing the behaviour of the tool

In order to address the issue described in the Overview section, I decided to create an Extension to the OSB Plugin.

The extension will change the message flow behaviour. Whenever you position the mouse cursor over a set of defined actions, it will display relevant information in the Tooltip.

This approach doesn't change any metadata and does not affect anything at runtime; it works only at design time. I strongly recommend opting for this approach.

You can see in the following picture that when you position your mouse cursor over the Assign action, it shows automatically to which variable the result of the action is being assigned.

Just download and install the Extension described in item 5 of this article.

You can still use comments whenever you need to, in order to add useful information, and they will be shown in the tooltip as well.



4.2.  Using comments

If you use the property comment section, you will achieve the same result as in 4.1, but in a manual way.

I've been using this approach for a while. It's hard to convince developers to do it. It's also error-prone.

You might rename the variable and forget to update the comments.

You can automate the generation of comments in your automated build by inspecting the metadata and updating the comments, however you have to be careful otherwise useful comments will be lost.



4.3.  One Action per stage


This is another approach, where you have only one Action per stage and you name the stage with the variable name or relevant information.

I think this is overkill. It makes sense to group related actions per stage, but having 'one Action per Stage' as a rule is not practical, and it will increase the size of the message flow.

I wouldn't use this approach for Assign and Replace actions.


5. Installing the Eclipse Plugin Extension

  1. Download the following 2 Jars: OSBEclipseExtensionPlugin.jar and asm-all-4.1.jar
  2. Copy the above 2 files to a location in your computer. e.g.: c:/dev/OSBExtension
  3. Locate eclipse.ini used in your Eclipse (assuming you have already installed OEPE)
  4. Add the following 2 lines to eclipse.ini (this assumes the files are located under c:/dev/OSBExtension; change the directory values to match your installation). Add each entry on a new line, immediately after the -vmargs entry.
    -javaagent:c:/dev/OSBExtension/OSBEclipseExtensionPlugin.jar
    -DOSBEclipseExtensionLibs=c:/dev/OSBExtension/asm-all-4.1.jar,c:/dev/OSBExtension/OSBEclipseExtensionPlugin.jar
  5. Start Eclipse
  6. Open an OSB Proxy Service and you will see the new behaviour described in 4.1 whenever you hover over Assign, Replace or Service Callout actions.

6. Conclusion

Opting for the approach "4.1 Changing the behaviour of the tool" is my recommendation to address the issue mentioned in this article, where crucial information is being hidden in the Pipeline.

Additionally, you should also use approach 4.2 on top of it whenever you want to capture valuable information.

Please leave feedback about the approaches and the tool, whether it helped you somehow, or if you would like new functionality added to the plugin extension.