Monday, March 31, 2008

IBM FastTrack for Source To Target Mapping for DataStage

Vincent McBurney was first to report IBM FastTrack for Source To Target Mapping, and his report has since been followed by Philip Howard's note "the business face of data integration" at IT-Director.

I find it interesting that releasing a simple attribute mapping tool is seen as a major breakthrough for the DataStage/Information Server family. Constellar had more or less exactly that 12 years ago (no glossary, though). Its UI may not have been quite so business-user friendly, but it certainly supported point and click, plus simple integration with metadata repositories.

Thursday, March 13, 2008

Vitria brings Web 2.0 and high transaction rates to BPM with M3O

Vitria recently announced its new M3O product, which it claims to be the convergence of BPM, Web 2.0 and Event Processing. Bloor's Simon Holloway reviews it here, where he quotes Vitria saying:
  • "BPM provides standards-based executable modelling (based on BPMN) on top of business knowledge Repository.
  • Web 2.0 provides the rich user experience with zero footprint to enable a collaborative design environment.
  • Event processing provides the support for rule and process definition and real-time runtime performance based on event driven architecture.
  • Only when you combine these together do you get a fundamentally new user experience with multilayer visualization, collaborative modelling environment, business level abstractions and event management"

Leaping shamelessly onto a passing bandwagon, Vitria explains M3O as "think iPhone meets dashboards" (quoted from ebizQ). The idea is that the "iPhone coolness" of the Web 2.0 interface will remove the gap between business and IT people. Well, as long as it doesn't (like the iPhone) lock users into an expensive long term relationship...

This looks like the first fruits from the return of JoMei Chang as CEO last July and the decision to go private, executed last March.

Wednesday, March 12, 2008

Cape Clear-out

I just caught up with the month-old news that Irish ESB/SOA vendor Cape Clear has been bought by WorkDay, a supplier of ERP SaaS.

Of course, SaaS needs integration, and Cape Clear CEO Annrai O'Toole promises that they will be providing the necessary Integration-on-Demand - but it seems like one of the leading independent commercial vendors has now been marginalised.

Ronan Bradley, former CEO of PolarLake (also Irish - one of our partners at SpiritSoft, and later a client of mine) also worries where the ESB market is going.

Is this a general problem in the middleware market, or is it just 'cos they is Irish? Are small vendors being caught between the rock of large vendors and the hard place of open source? Or is this (as Steve Craggs suggests in a comment on Ronan's post) simply a result of Cape Clear's own hubris?

Sunday, March 09, 2008

Change of scene

My current project is coming to a sudden end, so it looks like my weekly commute from Suffolk to Lancashire will be done by Easter. I've already got two or three interesting (and very different) opportunities to consider over the weekend - and if they don't come off, there's plenty on jobserve. More news as it happens; I am looking forward to spending a little bit more time at home and a little less time on the M6.

Saturday, March 08, 2008

Quantum dot memory

Next time you have trouble with I/O performance, think about this: wouldn't it be great to fit a terabyte of non-volatile RAM onto a postage stamp one inch square, with write times of 6 nanoseconds? Soon quantum-dot memory may do just that for you. That's nearly as fast as regular DRAM, and around 1000 times faster than typical flash memory. Scientists even say they could eventually get the write time down to picoseconds. Scary or what?

Saturday, March 01, 2008

H-Store - a new architectural era, or just a toy?

Philip Howard's commentary "Merchant relational databases: over-engineered and out-of-date?" supports the idea that perhaps general purpose relational databases should now be treated as "legacy". He references a paper, "The End of an Architectural Era (It's time for a rewrite)", by Michael Stonebraker and others from MIT.

My first thought was that RDBMS developers such as Oracle have seen off previous architectural challengers - most notably object oriented databases (OODBMS) - in the past 25-30 years. What makes Stonebraker's H-Store any different?

First, a quick summary of the paper:

  • RDBMSs were designed 30 years ago - since then memory and cpu have become faster, cheaper and bigger, changing the balance against magnetic (disc) storage
  • increasingly they are failing to meet today's complex challenges
  • niche solutions have overtaken "general purpose" RDBMS in many areas (eg data appliances for business intelligence; specialist text search engines; etc)
  • and now even OLTP, the "core competence" of the RDBMS, is no longer safe; a new approach (such as H-Store) can easily beat traditional RDBMS by cutting out non-functional architectural features (eg: redo logs stored on disc) and achieving the same goals (ACID transactions) in another way.

The paper claims that H-Store can beat a traditional RDBMS at TPC-C style benchmarks; an early version runs up to 80 times faster than "a very popular RDBMS" which itself underwent several days of tuning by a "professional DBA".

H-Store's secret sauce is that it is (in effect) single threaded. It assumes that all transactions are very fast, and then executes each transaction in turn. This gets rid of the need for complex read-consistency models. Other optimisations include keeping undo in memory (because transactions are short and sharp) and discarding it at the end of the transaction.
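To make that concrete, here is a toy sketch of the idea as I read it: one queue, one thread, one transaction at a time, with an undo list that lives in memory only for the duration of the transaction. This is not H-Store's code; the class `SerialStore` and the functions `credit` and `bad_debit` are names I have made up for illustration.

```python
from collections import deque


class SerialStore:
    """Toy single-threaded store (hypothetical, not H-Store's implementation)."""

    def __init__(self):
        self.data = {}
        self.queue = deque()

    def submit(self, txn):
        # A transaction is a function taking (read, write) callables.
        self.queue.append(txn)

    def run(self):
        outcomes = []
        while self.queue:
            txn = self.queue.popleft()
            undo = []  # in-memory undo log, private to this transaction

            def write(key, value):
                # Remember the previous state before overwriting it.
                undo.append((key, key in self.data, self.data.get(key)))
                self.data[key] = value

            try:
                txn(self.data.get, write)
                outcomes.append("committed")
            except Exception:
                # Roll back by replaying the undo log in reverse order.
                for key, existed, old in reversed(undo):
                    if existed:
                        self.data[key] = old
                    else:
                        self.data.pop(key, None)
                outcomes.append("aborted")
            # The undo list simply goes out of scope here: nothing hits disc.
        return outcomes


def credit(read, write):
    write("balance", (read("balance") or 0) + 100)


def bad_debit(read, write):
    write("balance", read("balance") - 500)
    if read("balance") < 0:
        raise ValueError("insufficient funds")


store = SerialStore()
store.submit(credit)
store.submit(bad_debit)
outcomes = store.run()
```

Because nothing else runs while a transaction is in flight, there are no locks and no read-consistency machinery; the price is that one slow transaction holds up everything queued behind it.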

Well, go off and read the paper for the details, but here's what I think.

On the negative side:

  • As a comparative benchmark, this fails through insufficient disclosure. For a paper that passes as academic, there is remarkably little detail on what they actually did.
  • Stonebraker assumes that "most" OLTP systems can be represented by a hierarchical model - what he calls a "constrained tree application" (CTA). As a result these applications are relatively easy to partition over a shared-nothing architecture. I wonder whether this is really the case. Parts of your application may be like that, or they may be like that for some periods (during the online day, for example). But even OLTP applications need to manage longer transactions, complex reporting, and updates to the "read only" tables. In his example, he assumes (section 5.0) that the Items table is read only, so it doesn't break his tree and it can easily be replicated. But we know that new items will be added; others will be re-priced, re-categorised, phased out. Can that be handled without interrupting a 24/7 H-Store style application?
  • He also seems to assume that there is only one axis of partitioning - in his case, the warehouse. But over time, the main focus of interest changes. An order is taken at a shop; it is ordered from a warehouse; it is delivered to the customer. Different "CTAs" at each stage. How does the H-Store morph its representation through the course of the information lifecycle?
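A small sketch may help show why the "read only" table worries me. It assumes a shared-nothing layout partitioned on warehouse, with the Items table copied to every partition; the names `Partition`, `Cluster`, `new_order` and `reprice` are my own inventions, not anything from the paper, and prices are in whole pence to keep the arithmetic exact.

```python
class Partition:
    """One shared-nothing node: its own orders plus a full copy of Items."""

    def __init__(self, items):
        self.items = dict(items)  # replicated "read only" table
        self.orders = []


class Cluster:
    """Hypothetical cluster partitioned on warehouse id."""

    def __init__(self, partition_count, items):
        self.partitions = [Partition(items) for _ in range(partition_count)]

    def route(self, warehouse_id):
        # Single axis of partitioning: the warehouse decides the node.
        return self.partitions[warehouse_id % len(self.partitions)]

    def new_order(self, warehouse_id, item_id, quantity):
        # Touches exactly one partition, so it needs no coordination.
        partition = self.route(warehouse_id)
        total = partition.items[item_id] * quantity
        partition.orders.append((warehouse_id, item_id, quantity, total))
        return total

    def reprice(self, item_id, new_price):
        # Changing a replicated table touches every partition.
        touched = 0
        for partition in self.partitions:
            partition.items[item_id] = new_price
            touched += 1
        return touched


cluster = Cluster(3, {"widget": 250})
first_total = cluster.new_order(4, "widget", 2)
nodes_touched = cluster.reprice("widget", 300)
second_total = cluster.new_order(5, "widget", 1)
```

Taking an order is a single-partition operation, which is exactly the case the paper optimises for. Re-pricing an item is not: it has to reach every node, and in a real system all the copies would need to change consistently while orders keep arriving.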

On the positive side, though, this represents a call to action for the traditional vendors.

  • Any performance specialist knows that the best way to tune something is to stop doing it. If we really can change the rules of the game, we can avoid all that expensive "insurance". We're used to making calculated design tradeoffs for performance; this could be just another one of those.
  • I suspect that the RDBMS vendors will (more or less rapidly) steal any really good ideas. Stonebraker states that "no [RDBMS] system has had a complete redesign since its inception". But, like the proverbial axe, RDBMS internals have been refreshed, a piece at a time. Oracle's database kernel has had at least two major rehashes in its lifetime. It's well within Oracle's or IBM's capability to incorporate the more realistic ideas from this paper, and to find a way to blend them with current state of the art.
  • They can also learn from the approach MySQL has taken of supporting multiple storage engines - horses for courses. Oracle already merges XML, relational and OLAP data stores; building in a high performance OLTP kernel to address specific classes of OLTP application is not at all inconceivable. Although it will lead to all sorts of information lifecycle difficulties, we are already used to migrating data from OLTP to OLAP; with good tool support it should be possible to work round the constraints that allow H-Store to dispense with so much that we normally take for granted.

I may revisit this paper to tease out other issues - for example Stonebraker rants against SQL (perhaps he's never got over Ingres being forced by the market to provide SQL rather than the more academically respectable Quel). H-Store uses C++, and may move to Ruby; the implication is that applications will be object-oriented, making row-by-row navigations (like so many J2EE apps, and suspiciously like COBOL/Codasyl) rather than being set-oriented.

Let's watch this space.