Wednesday, April 26, 2006

Informatica keeps it up

Good to see Informatica reporting a good quarter. Some of you will remember Sohaib Abbasi, Informatica's CEO, from the good ole days (late '80s and certainly well into the '90s) when he was in charge of the Tools (i.e. Forms and Reports) group at Oracle. Looks like he's doing a fair job of shepherding the company - one of the very few 20th-century ETL/EAI outfits still surviving as an independent.

Sunday, April 23, 2006

Is this an Oracle WTF?

Using JDBC with connection pooling, we see this in our statspack report (over a 15-minute period, on one node of a 3-node RAC setup):

SQL ordered by Executions for DB: XXX

Executions   Rows Processed   Statement
   192,827          192,826   select 'x' from dual
    54,644           54,644   INSERT INTO AUDIT ...


We seem to get select 'x' from dual issued exactly once as each connection is opened. This can represent a significant fraction of the executions. I raised this as an SR and was told:

This is a standard feature in the later JDBC libraries (10.1.0.x onwards) and is designed to ensure that the connection remains up and running - this is documented in the JDBC documentation ... This can be altered but not stopped by setting the min connections in the datasource to a minimum needed.

See BUG:4601037 - PINGDATABASE() PERFORMANCE CAN BE IMPROVED IN BOTH DRIVERS (against JDBC 10.1.x); however, this is not published, and not fixed until JDBC 11.1.

Is it just me, or does it seem mad to have to test that your connection is up by sending an SQL statement? That's the first thing the app is going to do itself anyway, so why not detect the failure then? In a well-tuned app, the sessions should spend very little time 'on the shelf' when the app is busy; the database is just as likely to disappear while the session is being actively used as while it is resting. So this logic simply provides extra protection for the very first JDBC operation in a transaction. Why not at least be able to tell it not to send this 'heartbeat' query unless the pooled session has been idle for longer than some (configurable) threshold?

Cheers Nigel

Thursday, April 20, 2006

DBMS_LOB.APPEND eats resources on 9.2.0.5

We're generating XML data and storing it in CLOB columns as part of our reporting solution. We've been aware for a while that there are some performance issues with LOBs in 9.2.0.5, e.g.:

  • 4280096 - HIGH BUFFER GETS FOR INSERT INTO LOB (fixed 9.2.0.8, 10.1.0.5 and 10.2)
  • 3315549 - POOR PERFORMANCE WITH USING DBMS_LOB.WRITEAPPEND IN PLSQL LOOP
  • 2602264 - DBMS_LOB.WRITEAPPEND() POOR PERFORMANCE WITH LARGE LOB AND SMALL CHUNK SIZE
  • 2420525 - DBMS_LOB.WRITEAPPEND() PERFORMANCE ISSUES
  • 2595083 - (description not published, although it's a base bug for some of the previous ones)

But I'd never sat down and actually measured how much difference it can make - until today. My test appended a short string (around 35 characters) 10,000 times, using each of DBMS_LOB.append, DBMS_LOB.WriteAppend, and TEST.LOB_APPEND - a procedure that buffers the appended strings, and only flushes to the CLOB when the buffer is full:

for i in 1..:loopcount loop
  dbms_lob.append(lt_doc_clob, '('||i||')'||lv_padding);
end loop;
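
The TEST.LOB_APPEND code isn't worth reproducing in full, but here is a minimal sketch of that kind of buffering wrapper (the flush procedure, buffer size and internal details are illustrative, not the real code):

create or replace package test is
  procedure lob_append(p_clob in out nocopy clob, p_text in varchar2);
  procedure lob_flush (p_clob in out nocopy clob);
end test;
/

create or replace package body test is

  g_buf    varchar2(32767);                -- text accepted but not yet written to the CLOB
  g_maxlen constant pls_integer := 32000;  -- flush threshold (illustrative)

  procedure lob_flush(p_clob in out nocopy clob) is
  begin
    if g_buf is not null then
      -- one WRITEAPPEND call covers many logical appends
      dbms_lob.writeappend(p_clob, length(g_buf), g_buf);
      g_buf := null;
    end if;
  end lob_flush;

  procedure lob_append(p_clob in out nocopy clob, p_text in varchar2) is
  begin
    if nvl(length(g_buf), 0) + nvl(length(p_text), 0) > g_maxlen then
      lob_flush(p_clob);
    end if;
    g_buf := g_buf || p_text;
  end lob_append;

end test;
/

The caller makes one final lob_flush call after its loop, so the tail of the buffer isn't lost.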

Here are the results from tkprof, for 10000 iterations, 30 chars of padding:

call          count    cpu  elapsed   disk   query  current   rows
Append            1   1.57     1.96      0   20087   194567      1
WriteAppend       1   0.45     0.48      0    9999    74343      1
MyAppend          1   0.10     0.10      0     182     2485      1

Repeating 1,000,000 times, timing from SQL*Plus, and trying to help by wrapping the whole lot in DBMS_LOB.OPEN and DBMS_LOB.CLOSE:



dbms_lob.append without open/close: 00:03:38.07
dbms_lob.append with open/close: 00:03:40.07
Simple dbms_lob.writeappend: 00:01:12.02
Append with buffering: 00:00:10.08


We used APPEND for lazy reasons (no need to keep working out the length of the buffer, which WRITEAPPEND needs) - but the implicit conversion of each VARCHAR2 argument to a temporary CLOB (APPEND takes two LOBs) clearly contributes substantially to the problem.
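
For reference, the straight WRITEAPPEND version of the test loop just has to track the length itself (a sketch; lv_buf is an illustrative local variable):

for i in 1..:loopcount loop
  lv_buf := '('||i||')'||lv_padding;
  dbms_lob.writeappend(lt_doc_clob, length(lv_buf), lv_buf);
end loop;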

In our case these LOBs are also being manipulated by many concurrent sessions in a RAC environment - so we've seen evidence of (space management) contention for the LOB segments. Reducing LIOs seems to help more than a little... at least while we wait for our client to upgrade to the latest patch - which doesn't look likely any time soon!

Tuesday, April 18, 2006

"Micro-partitioning" pooled sessions in a RAC environment

One of the problems we've come across testing an OC4J based system against an Oracle 9.2.0.5 RAC database is that as a given application session borrows database sessions from the session pool, it finds itself skipping at high speed around all the nodes in the RAC. If you're not careful, it's possible to insert or update some piece of data (on one database node) and then find that it's not quite there when you take a look a millisecond later via a different node.


You can work around this by setting MAX_COMMIT_PROPAGATION_DELAY to zero. However, that does impose extra load on the cluster.
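
(It's just an instance initialisation parameter, set identically on every instance; in the spfile it's something like

*.max_commit_propagation_delay=0

with zero meaning commit SCNs are broadcast to the other instances immediately.)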


The problem arises because your connect string (in the JDBC URL) specifies a RAC service, something like this:

url="jdbc:oracle:thin:@(description=(LOAD_BALANCE=on)(address=(protocol=tcp)(host=rac1)(port=1521))(address=(protocol=tcp)(host=rac2)(port=1521))(address=(protocol=tcp)(host=rac3)(port=1521))(connect_data=(service_name=RACDB)))"


This typically load-balances across all nodes in an annoyingly random and yet not entirely effective way - especially when there is a sudden increase in connections. Oracle’s default load balancing mechanism is particularly badly suited to a connection pool.


What you'd really like to do is to pin application sessions to database nodes - so that you can be (reasonably) sure that a given user, once logged on, will keep on using the same RAC node (node failure excepted, of course). Just have each app server connect to its 'own' database server.


Unfortunately, the JDBC URL configuration is clustered: whatever you set it to has to work for all the app server nodes. So we came up with a simple workaround:
  • Turn load balancing off

  • Add some new IP address mappings to each app server node's /etc/hosts file

  • Use the 'virtual' hostnames in the JDBC URL


Now you have this URL:


url="jdbc:oracle:thin:@(description=(LOAD_BALANCE=off)(address=(protocol=tcp)(host=db1)(port=1521))(address=(protocol=tcp)(host=db2)(port=1521))(address=(protocol=tcp)(host=db3)(port=1521))(connect_data=(service_name=RACDB)))"


and in each /etc/hosts you have something like this (app1's version shown here):


10.20.30.1 rac1.prt.stl.int rac1 db1
10.20.30.2 rac2.prt.stl.int rac2 db2
10.20.30.3 rac3.prt.stl.int rac3 db3



On each app server, the mappings from the dbN aliases to the racN hosts are varied, so that each app server "prefers" its own RAC server, forming a stovepipe.
If the app server tier is on nodes app1, app2 and app3, the mapping is:





RAC node    app1 mapping    app2 mapping    app3 mapping
rac1        db1             db2             db3
rac2        db2             db3             db1
rac3        db3             db1             db2
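
So, reading the table, app2's /etc/hosts carries the same three IP addresses but rotates the dbN aliases:

10.20.30.1 rac1.prt.stl.int rac1 db2
10.20.30.2 rac2.prt.stl.int rac2 db3
10.20.30.3 rac3.prt.stl.int rac3 db1

and because LOAD_BALANCE is off and db1 is listed first in the URL, app2 ends up preferring rac3.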

So in normal operation, we have a stovepipe all the way down from web to app to database tiers. Ideally we would have the web load balancer intelligently cluster like users together (e.g. by organisation or region) to further reduce cluster overhead, but that's currently politically unacceptable. Another time, perhaps!


Cheers Nigel

Friday, April 07, 2006

SQL on Rails

My former colleague James Strachan has spotted an exciting new single tier web framework that seems absolutely perfect for us SQL hands.

I especially like the O'Reilly title Fragile Web Development with SQL on Rails. Funny, I think that's what we must have been doing all along...

Enjoy.

Sunday, April 02, 2006

How many binds does it take to save time?

Here's a tale about how something that seems intuitively correct can turn out to be 100% wrong - even though it has been stated as gospel by someone who seems like he should know.


We're building an application that - for the first year or two of operation - will include a legacy system component. Originally based on a kind of multiuser ISAM file system, the legacy data now lives in a series of over 100 Oracle schemas (one per site). The legacy app uses a legacy-to-relational bridge to access the data.

The bridge generates single table SQL to perform file-based actions like find-by-key, next record, previous record and so on. The tables are suitably indexed. However... the bridge (designed to support many different database platforms) issues SQL with no bind variables. So although there is a relatively small number of distinct access paths, each time you try one, it's got a new set of literal values in the predicates, and a hard parse is required.


The quick and easy solution is to have every bridge session start with:

ALTER SESSION SET CURSOR_SHARING='SIMILAR'

'Similar' statements share a parse tree, and the number of hard parses drops. At low volume everything works great. But as you crank up the work rate (we're using LoadRunner), the shared pool eventually gives up under the strain. Although there are relatively few 'families' of similar statements (i.e. statements which are exactly the same once all literals have been replaced by system-generated bind variables), each statement family may contain hundreds of versions and use up to 150M of shared pool.
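
You can watch this happening with something like the query below (illustrative; any account that can see V$SQLAREA will do) - the big families show up with huge version counts and sharable memory figures:

select substr(sql_text, 1, 60) sql_text,
       version_count,
       round(sharable_mem/1024/1024, 1) sharable_mb,
       executions
from   v$sqlarea
where  version_count > 100
order  by sharable_mem desc;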


So the shared pool becomes fragmented and sooner or later it's ORA-04031 time - unable to allocate memory. Even with 800M allocated to the shared pool, and 10% of it still available, the largest free chunk may be too small to be useful. We got down to a largest free chunk of 23 bytes at one point.


A nasty side effect is that these errors result in everything slowing up. The main web application runs on clustered OC4J, using JDBC with session pooling to connect to a RAC back end. When queries start to slow down, the pool is exhausted so there's a sudden connection storm - which leads the entire system into a spiral of doom.


Now, the system architects aren't complete idiots. They had Oracle consultants on hand to remind them that bind variables are important. They even had access to the code base for the legacy bridge. So an attempt was made to add bind variable support.


It so nearly succeeded - but there was an itsy-bitsy bug in it.


The original bridge developer was asked to help solve the problem. And this is where he dug himself a big hole and jumped right in. A scientist proposes a theory, and conducts experiments that are designed to disprove it. This person simply stated what he believed:

Bind variables are used to keep certain info constant between calls. This
requires keeping and maintaining bind variable arrays. [...] To do it properly one must maintain a lot of stuff in separate arrays and relate them to the tables and types of calls being used. Rather than altering the session and letting Oracle bind them properly.


He (correctly, if sarcastically) describes what is happening:

I looked at your code briefly and your programmer brilliantly solved all the above problems. He creates a bind array, uses it in whatever the DB call is made, and then DESTROYS the array at the end of the call. It is not being used between calls AT ALL! Each DB call (in other words for every single record you read) creates, uses, then destroys the bind arrays.



And now he leaps to the wrong conclusion:

So technically you are binding, actually you have ADDED overhead and destroyed performance. [...] I am surprised you are not getting unexplained core dumps in a heavy locking situation where it accesses multiple files.


He assumed that:
  • using binds in the legacy bridge would be expensive

  • closing the OCI statement immediately after use would waste any savings

  • using CURSOR_SHARING=SIMILAR in Oracle would (a) work and (b) be more efficient

The team lost heart in the face of this clear statement, and the attempt to use bind variables was abandoned. Attempts were made over many months to resolve the 4031 problem by throwing memory at it, or even ignoring it.


But in fact, his advice was 100% wrong. Here's why:

The cost to the legacy bridge of setting up and using bind variables is barely detectable because:
  • Bind variables can easily be added into the statement as it is constructed

  • the memory management is simple (malloc and free)

  • few or no additional network roundtrips for bind data are needed - OCI piggybacks most or all of the information it needs onto the execution request

The Oracle server, on the other hand, is flat out. For it to avoid a hard parse it has to:
  • lexically analyse every incoming SQL to identify whether and where literals have been used

  • set up bind variables and extract the literals

  • now do a soft or hard parse using the transformed SQL

  • retain both forms of the SQL at least until the cursor is closed

  • manage all the memory structures in a pool shared by hundreds or thousands of sessions

No contest. Although a small overhead is indeed added at the bridge, a major bottleneck is cleared on the RAC server.
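
The bridge itself is C calling OCI, but the contrast is easy to demonstrate from PL/SQL with a toy example (nothing to do with the bridge code; ALL_OBJECTS is just a convenient table):

declare
  l_cnt pls_integer;
begin
  -- literals: 1,000 distinct SQL texts, so 1,000 hard parses and
  -- 1,000 separate cursors competing for shared pool space
  for i in 1 .. 1000 loop
    execute immediate
      'select count(*) from all_objects where object_id = '||i
      into l_cnt;
  end loop;

  -- binds: one SQL text, hard parsed once and shared thereafter;
  -- only the bind value changes from execution to execution
  for i in 1 .. 1000 loop
    execute immediate
      'select count(*) from all_objects where object_id = :1'
      into l_cnt using i;
  end loop;
end;
/

Run each loop on a quiet instance and compare the session's 'parse count (hard)' statistic before and after: the literal loop pays for every single iteration, the bind loop only for the first.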


Now - nearly a year later - a more experienced team has had another go. Using bind variables solved the 4031 problem pretty much at the first attempt - just as any old Oracle hand would expect. And that bug? It turned out to be a simple one-line coding error, which took a couple of hours' code review to find and fix.


The moral of the story is: don't believe dogma; actually try and prove or disprove a hypothesis properly.


And the answer to the question in the title: if you are continually issuing 'similar' SQL statements, it is worth using bind variables even if you discard the prepared statement after a single use.