Monday, January 7, 2013

The Semantic Argument

This post was inspired by a Video Report Card for SAP Analytics that I was fortunate to participate in with Jon Reed (the sponsor), John Appleby, Derek Loranca, and Clint Vosloo. No one paid me anything for that. :)

The last two years have seen a flurry of activity from the Analytics team at SAP. A lot of things that were promised were delivered. New tools like Visual Intelligence and Predictive Analysis? Check. Xcelsius roadmap (including the new Design Studio)? Check. BI4 stability? Getting checkier by the day. Are there still some concerns with these items? Sure, but we are certainly better off on these fronts than we were a year ago.

The bigger problem with the current state of the portfolio is a burgeoning list of Semantic Layers for the SAP Analytics tool set (see around 17:19 of the video). We have the Universe (the one that SAP paid €4.8 billion for in 2007 and temporarily referred to as the Common Semantic Layer in BI4 until they realized it wasn't), we have the BW semantic layer, and we have the HANA semantic layer, and none of them talk to each other particularly well (or at least best practice is that they shouldn't).*

Certainly there are some advantages to allowing these three semantic layers to continue growing in their own directions. Innovation that applies to only one data source can be sped up because it doesn't have to worry about integrating with the others. Performance can be optimized because only one source system with its particular constraints must be considered. Unfortunately, I don't think that begins to outweigh the disadvantages for both customers and SAP itself.

The disadvantages for customers apply to both legacy SAP and classic BusinessObjects customers. The two key issues I see are the slowing of innovation across the portfolio and the extra work required to maintain multiple semantic layers. Even if SAP wanted to invest in all three data sources (HANA, BW, Universe) simultaneously, it just wouldn't make sense. Getting every minor semantic layer change in each system to work with all of the reporting tools in the portfolio would be absolute murder, not to mention that any change to a reporting tool would need to be run back through three different sets of semantic layer developers to ensure integration. This creates a lot of points of failure at a time when stability has been recognized as a serious source of concern for customers.

If this three-headed monster continues, it also follows that the Total Cost of Ownership of any such system is going to balloon for customers. If the BusinessObjects platform has more things running under the covers, that means more (or at least bigger) patches, which means you'll need more administration time, more testing time, and more end user communication and training. And that's just if you employ one of these tools. If you've got more than one semantic layer running, you'll need to have experts in each, or one very thinly stretched expert (who will be constantly looking for another job). This also means more training for users, more confusion, and a real loss of goodwill from users who now have to understand connecting to multiple data sources in a reporting tool rather than just understanding how the Universe works and going from there.

BW isn't either.

Multiple options exist. SAP could continue investing in all three, but that is problematic for the reasons given above. Encouraging customers to run all of their HANA data through BW has some potential, but it seems that takes away a lot of the benefits of HANA without a lot of benefit from BW. They could invest heavily in HANA semantics and offer fantastic pricing, although that still won't solve the problem of the enormous amounts of data that are not and will never be stored in-memory but still need to be accessed (not to mention the loss of goodwill from legacy BusinessObjects customers whose only solution is buying new software).

In the end the best solution is simple, albeit not easy: make HANA and BW work properly through the Information Design Tool (IDT, the successor to the pre-BI4 Universe Designer). I know this wouldn't be easy, but I'm not sure why we have to actively encourage people to avoid using the IDT when connecting to BW or HANA data sources. Perfect the "common" semantic layer for any data source, and every reporting tool downstream would be able to innovate on features and not just play catch-up on connectivity. It would have been better to do this before the release of BI4, spending the million man hours on perfecting the semantic layer with slight tweaks to the reporting tools. I realize that exploding pie charts with sound sell software, but software that is easy to use and cheap to maintain sells itself.

* It's worth noting that we also have legacy .UNV files (from the old Universe Designer and new Universe Design Tool), Analysis Views (for OLAP), and Crystal Reports Business Views (which are deprecated but have no migration option).


  1. Great post Jamie. Good points. Business Views is basically a redo at this point, hence my post about dynamic data connections a while back.

    1. C'mon Dave, if you're going to bring up a helpful blog you wrote, at least post a link. :)

      Seriously, though, I'd like to see a more permanent direction out there so that at the very least if one of these layers is going to be deprecated people start weaning themselves off of it sooner rather than later.

  2. The resemblance of your three-headed monster to Cerberus, the three-headed dog in Greek mythology that guarded the gates of hell, is completely unintentional. Seriously, thank you for contributing to this important conversation.

    1. In Greek mythology did you pay the ferryman before or after you got past the dog? ;)

  3. Awesome post Jamie, I believe this is the biggest issue facing SAP as they move forward in developing their analytics platform.

    A couple of thoughts popped into my head as I read this:

    1. BW as a semantic layer has some very strong capabilities that are still absent in BOBJ, such as automated currency conversion, units of measure, proper hierarchy navigation, etc. There are still lots of areas to improve on in the BOBJ semantic layer.

    2. HANA is the future for SAP databases, but still to be addressed is how the Real Time Data Platform (RTDP) is going to address non-HANA data sources, such as IQ, ASE, which will remain as part of the RTDP - and where will the RTDP finish and the BI semantic layer start?

    3. In my simplistic mind, HANA has some great capability to serve up data astoundingly quickly - is there an opportunity to look at an encapsulated HANA instance sitting inside the BOBJ platform, which pulls the data in from disparate data sources, but then uses compression and columnar storage in memory to accelerate BI for any data source?

    Thanks for drawing attention to this important topic.


  4. Obviously each has its own strengths, some of which should be put into the final version. I do think, however, that with HANA out there it makes little sense to keep investing in the semantic layer portion of BW, and if we can come up with a HANA nugget to bundle into BOBJ, someone is still gonna have to pay for it.

  5. I posted my comments on the cross posting on SDN as I was not able to enter my comments here based on a size limit:

    Comments can be found here: