Tuesday, February 26, 2008

SYS_CONTEXT versus V$ views for getting session information

A thread on Oracle-L today sidetracked into the use of V$ views for getting hold of session information such as username, SID and module. The poster had a logon trigger that was supposed to record these items for each logon from a particular client.

I don't approve of granting access to V$ views willy-nilly; best practice is always to grant the minimum privileges necessary to achieve an objective.
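For example - a minimal sketch, with the testuser account from the test below standing in for the real client - direct access to a V$ view needs an explicit grant from a DBA, whereas SYS_CONTEXT is available to any session:

-- as a DBA: grant on the underlying view (V$MYSTAT is a public synonym for SYS.V_$MYSTAT)
grant select on v_$mystat to testuser;

-- whereas any session can call SYS_CONTEXT with no extra privileges:
select sys_context('userenv', 'session_user') from dual;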

Another poster raised the issue of performance. In the past, SYS_CONTEXT was considered slower than direct access to the views.

So here is a test to compare the two:



set echo off feedback off
set timing on

set termout off

variable v_loops number;
exec :v_loops := 1000000;

set termout on

prompt Testing sys_context

declare
  l_user      varchar2(30);
  l_action    varchar2(32);
  l_module    varchar2(48);
  l_sid       number;
  l_loopcount pls_integer := :v_loops;
begin
  for i in 1..l_loopcount loop
    dbms_application_info.read_module(l_module, l_action);
    l_user := sys_context('userenv', 'session_user');
    l_sid  := sys_context('userenv', 'sessionid');
  end loop;
end;
/


prompt Testing mystat

declare
  l_user      varchar2(30);
  l_action    varchar2(32);
  l_module    varchar2(48);
  l_sid       number;
  l_loopcount pls_integer := :v_loops;
begin
  -- note this only gets one of the three pieces of information
  for i in 1..l_loopcount loop
    select sid
      into l_sid
      from v$mystat
     where rownum = 1;
  end loop;
end;
/



And here are the results:


C:\sql>sqlplus testuser/testuser

SQL*Plus: Release 10.2.0.1.0 - Production on Tue Feb 26 22:23:10 2008

Copyright (c) 1982, 2005, Oracle. All rights reserved.


Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

SQL> @sys_context_test
Testing sys_context
Elapsed: 00:00:07.31
Testing mystat
Elapsed: 00:00:38.57
SQL>


I suspect that in the past SYS_CONTEXT issued recursive SQL under the covers (just as the SYSDATE PL/SQL function used to, and as the USER function and 11g's PL/SQL assignment from a sequence still do).

Now I assume SYS_CONTEXT gets its information directly.
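A minimal sketch of how one might check (I haven't dug out an old version to reproduce the recursive-SQL behaviour, so this just shows the method): trace the session while calling the function in a loop, then look for recursive statements in the tkprof output.

alter session set sql_trace = true;

declare
  l_user varchar2(30);
begin
  for i in 1..1000 loop
    l_user := sys_context('userenv', 'session_user');
  end loop;
end;
/

alter session set sql_trace = false;

-- run tkprof on the resulting trace file; any recursive SQL issued on behalf
-- of SYS_CONTEXT (or USER, or a sequence assignment) shows up as separate statements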

Wednesday, February 20, 2008

The UML industry - as predicted

The Register has been poking fun at UML (and why wouldn't you?). They have dug up a 1997 term-paper piss-take which was probably pretty funny at the time; now it seems inspired, prescient even.

Monday, February 18, 2008

Shared Business Vocabulary

Following on from my active metadata post yesterday, Mike Ferguson at Dataflux says Shared Business Vocabulary (is) needed everywhere. By "shared business vocabulary" he means metadata, as this earlier post explains. I don't particularly like his phrase - metadata is fine by me - but maybe he's right and we need to use a less loaded (and misunderstood) term for it. And particularly in the warehousey world "metadata" also refers to the dynamics of data (when was it loaded, how long did it take) so there's scope for a better word or phrase.

Whatever that word or phrase is, please please can it not be ontology ....

Sunday, February 17, 2008

Active Metadata

Yesterday I moaned about poor use of metadata. Today I'd like to point to an example of the way metadata can be used actively. Andy Hayler describes Kalido's Business Information Modeller, which allows you to actively control - and when necessary reshape - the data warehouse by changing metadata; you can generate a BO universe, or deploy metadata to Cognos BI tools (more to come, apparently). You can (apparently) even undo the metadata changes and see your warehouse as it would have been; kind of like Oracle's flashback recovery, but for data warehouse structures.

See Kalido's podcast, screenshots and other resources.

Saturday, February 16, 2008

2008 Data Management Predictions from Dataflux

Mike Ferguson posted these over a week ago. Key thoughts to take away from my point of view:
  • Information and data architects will continue to be in demand
  • Companies will need to invest again in data modeling tools and in data modeling skills
  • Holding this metadata in spreadsheets is no longer acceptable.

It is depressing to me that so many projects keep their business-critical metadata in Word documents, Excel spreadsheets and Visio drawings. Some developers prefer to rely on their IDE, keeping DDL definitions as SQL in text files. That protects the definitions (at least you can find them quickly), but it doesn't get the maximum value out of them.

CASE tools have been around since the mid 80s or before - Oracle's SQL*CASE, now better known as Designer, was under development when I joined in '86 - so how come they are used less and less? They did get a bad name for encouraging complex, expensive and ultimately useless corporate data models - and we're glad to see the back of those, and the ivory towers they came from - but they can still be very helpful in defining and developing the metadata we need as a basis for system development.

I wonder whether the main problem is that many CASE tools are simply too expensive and/or too closed; they just can't cope with all the different kinds and layers of metadata we would like to throw at them, and they don't integrate well with all the other development tools around. Look at Designer - it's been more or less static for the last 10 years, and other Oracle products barely take any notice of it. No wonder it's slowly fading away.

Perhaps it's time for Oracle to get a grip and provide some common repository / metadata management for use across all its myriad of tools? Or for a small vendor or OSS project to take up the challenge? Let me know if you've already found the tool that can pull together ERDs, schema models, UML process diagrams, Discoverer EULs, a BO universe, Warehouse Builder or ODI transformation definitions and all the other kinds of development metadata that projects deal with every day.

Friday, February 15, 2008

Writing XML to a file

Marco Gralike has posted a couple of items at the Amis blog and a later followup on his own blog about ways of writing XML to a file - comparing DBMS_XMLDOM.WRITETOFILE with DBMS_XSLPROCESSOR.CLOB2FILE - and finding the latter faster.

Surely this is simply because the input to the test is an XMLTYPE (the result of his test query).
  • The call to CLOB2FILE requires an implicit conversion from XMLTYPE to CLOB, and then simply serialises the CLOB (a large string already in XML format).
  • WRITETOFILE requires an implicit conversion from XMLTYPE to DOMDOCUMENT - it constructs a DOM tree and then walks it to produce the serialised output. Here's the signature for WRITETOFILE:

DBMS_XMLDOM.WRITETOFILE(
   doc      IN DOMDOCUMENT,
   fileName IN VARCHAR2,
   charset  IN VARCHAR2);
Building that tree is a major overhead - in this case - though obviously it wouldn't be if you actually needed to navigate around the tree adding/moving/updating or pruning nodes and branches.
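To make that concrete, here is a minimal sketch of the two calls side by side (the XML_DIR directory object, the file names and the trivial query are all stand-ins for Marco's test setup):

declare
  l_xml xmltype;
  l_doc dbms_xmldom.domdocument;
begin
  select xmlelement("demo", xmlforest(sysdate as "now")) into l_xml from dual;

  -- option 1: serialise the XMLTYPE to a CLOB and write it straight out
  dbms_xslprocessor.clob2file(l_xml.getclobval(), 'XML_DIR', 'via_clob2file.xml');

  -- option 2: build a DOM tree from the XMLTYPE, then let WRITETOFILE walk it
  l_doc := dbms_xmldom.newdomdocument(l_xml);
  dbms_xmldom.writetofile(l_doc, 'XML_DIR/via_writetofile.xml');
  dbms_xmldom.freedocument(l_doc);
end;
/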

Update 16-Feb:
Marco has continued the story here, in great detail and at great length (tkprof listings and all). He's particularly interested in file size; I'd be more interested in the performance impact of the various steps in his operations. I will try to reproduce his tests and summarise the results soon.

Nominative determinism

Nominative determinism: when your job or hobby matches your name. For example, the famous Major Major, from Catch 22. Or the British Antarctic Survey scientist I heard on Radio 4 yesterday explaining how he was going to drill down through 3km of ice - a Mr Core (or was it Bore - they both work, and I was in the car at the time).

The latest to catch my eye was in this quote at El Reg: "The vice president in charge of Red Hat's JBoss middleware Craig Muzilla, said Red Hat is going to undercut giants IBM, Oracle and Microsoft in the saturated and expensive middleware sector."

With a name like Muzilla, I guess working for an OSS company was pretty much inevitable.

DATAllegro, DW appliances and the Oracle lock-in effect

Andy Hayler has an interesting post on DATAllegro - talking about its sweet spot at the very high end of data volumes; apparently helped by a grid with up to 1TB/minute transfer speeds.

What caught my eye was Andy's observation that "Oracle customers can be a harder sell (than Teradata's, for example) not because of the technology, but because of the weight of stored procedures and triggers ... in Oracle's proprietary extension" (ie PL/SQL).

Just looking at the various technologies people talk about on OraNA blogs, I wonder whether the continuing growth in Java and .NET based applications over the last decade - where many of the business rules are executed outside the database - might hurt Oracle precisely by removing the PL/SQL "stickiness" Andy describes.

Admittedly Apex is a counter-example - thoroughly PL/SQL, thoroughly Oracle; no chance of moving Apex clients to another RDBMS yet. And the tendency to provide management interfaces through PL/SQL (or very Oracle-specific SQL extensions) also helps to keep Oracle customers tied down. Which is good for us in the dyed-in-the-wool Oracle crowd. Whether the lock-in is good for the customers (or even for Oracle) is another question; an existing customer's lock-in can feel like a lock-out to new customers.

Any thoughts out there?

Regards Nigel

Tuesday, February 05, 2008

Schema Version Control

An outbreak of synchronicity has produced a volley of blogs and questions about database version control in the last few days.
I've seen so many projects take code control seriously but have no real idea how to control a database definition (let alone how to control changes to multiple instances of development, test and live data, both in-house and on client sites).

Code control is a walk in the park compared to "schema control" (where the delta, such as an ALTER TABLE, needs to be developed alongside the new version of the database object creation script).
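A trivial sketch of what that means (the table, column and script names here are made up): the delta for databases already in the field has to be developed and tested alongside the full creation script used for fresh installs, and the two must stay in step.

rem upgrade_1_0_to_1_1.sql - the delta, run against databases already at version 1.0
alter table customer add (loyalty_tier varchar2(10));

rem customer.sql - the creation script, run on fresh installs of version 1.1
create table customer
( customer_id  number         not null
, cust_name    varchar2(100)  not null
, loyalty_tier varchar2(10)
, constraint customer_pk primary key (customer_id)
);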

And "data control" adds even more challenges:
  • can you upgrade the data in place? or do you need to create a replacement object?
  • Is there enough space?
  • Do you need to suspend user access? How long will it take?
  • what are the chances of reversing the upgrade if something goes wrong?
  • Will you even know if something goes wrong?
  • Who decides go forward, go back or muddle through?

A complex application version upgrade should be treated with as much respect as a data migration project. There should be a strategy (sure, it may be a standard strategy). Use standard tools / scripts. The more the process is repeatable - and the more it is repeated - the better. Make sure that all likely error conditions (missing log files, out of DB storage, non-writeable working directories, no listener) are tested for - test the tests!

By the way, the "process" includes all the operating procedures. Make sure that upgrades aren't nannied through by developers. Script the whole thing, and have the production DBAs try out the procedures as early as possible in test, so that they understand what's supposed to be happening (and how to know whether it worked OK). The best thing is if the script just says:

IT ALL WORKED OK

or

SOMETHING WENT WRONG IN STEP 13 - SEE LOGFILE xxxyyy13
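Something along those lines, as a minimal SQL*Plus sketch (the step number, table and log file names are made up for illustration):

rem upgrade_step13.sql - one step of the scripted upgrade; the driver script
rem checks this script's exit code and, on failure, reports which step to look at

whenever sqlerror exit failure
spool upgrade_step13.log

prompt STEP 13: widening customer.phone_number
alter table customer modify (phone_number varchar2(30));

prompt IT ALL WORKED OK

spool off
exit success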

By the time the schema-and-data upgrade is executed for real, nothing - not even the unexpected - should be unexpected.

Monday, February 04, 2008

Patently ridiculous

SCO has had a pretty hostile press over the last few years as it morphed from a real software business into a patent mining outfit. Well, I've now come across another case of patent ambulance chasing over at Vincent McBurney's blog on ITToolbox. JuxtaComm and Teilhard versus IBM Microsoft Informatica SAP etc tells the story so far - basically that a small company is claiming it invented ETL, patented it in 1997, and can it now have loads-a-money if you please.

I followed Vince's link to Groklaw where InterSystems, one of the parties, is asking for help.

One of the parties is DataMirror (now part of IBM, and the purchaser of my former employer Constellar). As the SQL Group, Constellar had an ETL system that seems to fit the key claims as early as 1994 (to my personal knowledge - it may have been on the market in 1993).

I'm stunned that (apparently) Oracle has already caved, handing over $2 million. Among the reasons I am so surprised:
  • I was actually working for Oracle UK in 1994 - the Constellar Hub (then known as Information Junction) was evaluated for one of my clients (who ended up purchasing ETI instead).
  • Oracle UK was involved in quite a few pre-sales to Telecoms, Pharma and others over the period 94-2000.
  • Oracle US consultants helped support the product in at least a few US clients.
  • Constellar's US office was actually sublet from Oracle Corp, and I know many efforts were made to sell (the product and/or the company) to Oracle around 1997.
  • Oracle's own Warehouse Builder must date from around then, if not earlier.

Of course, Constellar wasn't alone at the time. We competed over time against ETI, Prism, Vmark/Ardent/Ascential, Informatica, and all the others (as well as the ubiquitous roll-your-own, which is still probably the market leader).

Slipping the litigants $2 million may be cheaper than arguing the case for Oracle - but for the rest of us (especially any cash strapped open source projects) it's not an option. And I'm sure a sharp lawyer at IBM might like a bit of ammunition to see them off. Hey IBM, you're innocent!

So if any ETL pioneer (or their heirs and successors) wants my testimony to fight off JuxtaComm's malicious and meretricious suit - just let me know! I'm available for testing out the trans-atlantic business class beds any time you like...

Update 13 March 2008: more from Vincent here.