Showing posts with label BI. Show all posts

Sunday, November 20, 2011

A full day of rest leads to a synastry breakthrough

As I mentioned in previous blog posts, I am a walking pile of training injuries.  I am having great difficulty making the jump to Kettlebell and CrossFit training intensities.  About 8 days ago, I suffered a pretty serious shoulder injury which has left my right hand numb most of the time.  That's right, I am currently typing with a partially numb right hand.

Because I have only slowed down, rather than stopped, it just hasn't gotten better.  In truth, this may require a visit to Dr. Bachner's surgery center, and he may need to perform the Sam Bradford procedure on me.  In the interests of healing, and in view of our harsh weather conditions today, I declared a full day of rest.  No exercise, period!  I mean zip, zero, nada, nothing; not even abs.

Naturally, I was at loose ends today.  It might have been impossible if it hadn't been a football Sunday.  Even so, I was at loose ends, not knowing what to do with my spare energy.

I found myself doing extensive searches online for key aspects of the pseudo-science of Synastry.  As you know, I have a project in progress (currently on hold) to build my own Synastry engine.  The objective is to create a set of web services that will power both a collection of mobile apps and a major-league Windows WPF application.

As I did my research, it suddenly dawned on me.  I had the massive ah-ha! experience.  I suddenly hit on a perfect, simple, elegant, high-performance architecture for this Synastry engine.  I now know how I will do the high-speed computations for millions of people per minute.

In short, this is a classic application of Multi-Dimensional Analytics.  Multi-Dimensional databases are popular and very much in vogue in the business intelligence market.  The notion is fairly simple, mathematically speaking.  Each point of data has three or more dimensional coordinates.  Think of each data point as sitting at one coordinate in a hyper-cubic space.

In a simple cube, each data-point has an X, Y and Z coordinate.  You can fetch any point of data in the cube by referencing its three coordinates.  However, the cube is only the start.  You can have n dimensions, where n is any finite number.  This means 6, 8, 9 and 12 dimensions are all possible.  One word of warning: As you increase the number of dimensions in your hyper-cube, the exponential explosion of data points can easily blow past even the most powerful server's capacity.
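To put rough numbers on that warning, here is a minimal Python sketch. It assumes a dense cube with a hypothetical 10 members on every dimension; the function name `cell_count` is made up for illustration.

```python
from math import prod

def cell_count(dimension_sizes):
    """Number of cells in a dense hyper-cube, given the member count of each dimension."""
    return prod(dimension_sizes)

# Hypothetical cube: 10 members on every dimension.
for n in (3, 6, 9, 12):
    print(n, "dimensions:", cell_count([10] * n), "cells")
```

Every extra dimension multiplies the cell count by that dimension's size, which is why the total grows exponentially with the number of dimensions.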

I am not worried about this.  In my case, Synastry is a relatively manageable 4 dimensional data problem.  Consider the following Dimensions:

  1. The A chart {e.g. Ascendant, Sun, Moon, Mercury, Venus, Mars, etc.}
  2. The B chart {e.g. Ascendant, Sun, Moon, Mercury, Venus, Mars, etc.}
  3. The Angles {e.g. Conjunction, Sextile, Square, Trine, Opposition, etc.}
  4. The Categories {e.g. Romance, Communication, Creativity, Aggression, Mutual Success, etc.}
I need one point of data corresponding to each unique combination of these four coordinates.  As soon as I populate the four-dimensional cube, it is just a matter of looking up the scores in the cube.  Of course, we still need to compute and compare two people's charts, but this is the relatively easy part.  Scoring their potential for a relationship accurately... this is the hard part.
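A four-coordinate lookup like this can be sketched with a plain Python dictionary keyed by a 4-tuple. This is only an illustration, not the engine's actual storage: the member lists are shortened to the examples named above, and the 8.0 score is a made-up placeholder.

```python
from itertools import product

# Shortened, illustrative member lists for each dimension.
POINTS = ["Ascendant", "Sun", "Moon", "Mercury", "Venus", "Mars"]
ASPECTS = ["Conjunction", "Sextile", "Square", "Trine", "Opposition"]
CATEGORIES = ["Romance", "Communication", "Creativity", "Aggression", "Mutual Success"]

# One cell per (A point, B point, aspect, category); 0.0 is a placeholder score.
cube = {key: 0.0 for key in product(POINTS, POINTS, ASPECTS, CATEGORIES)}

# Hypothetical score, for illustration only.
cube[("Venus", "Mars", "Trine", "Romance")] = 8.0

def lookup(a_point, b_point, aspect, category):
    """Fetch the single score stored at the four given coordinates."""
    return cube[(a_point, b_point, aspect, category)]

print(len(cube))  # 6 * 6 * 5 * 5 cells
print(lookup("Venus", "Mars", "Trine", "Romance"))
```

With these shortened lists the cube holds only 900 cells, so even a much fuller set of chart points stays small enough to keep in memory.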

Consider a simple hypothetical scenario.  I am a Virgo.  A female I am interested in is a Capricorn born January 14, 1979.  Her moon is in Leo, her Venus is in Sagittarius.  There are a number of excellent aspects here.  We have a near-trine between suns.  Her Sun conjuncts my Vertex.  Her moon trines my moon and conjuncts my Venus.  Her Venus trines my Mars.  What should her score be?
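Finding aspects such as a "near-trine" comes down to comparing the angular separation of two chart points with the standard aspect angles. The Python sketch below assumes positions are given as ecliptic longitudes in degrees and uses a hypothetical 6 degree orb (allowed tolerance); the longitudes in the example are invented, not taken from any real chart.

```python
# Standard aspect angles in degrees.
ASPECT_ANGLES = {
    "Conjunction": 0,
    "Sextile": 60,
    "Square": 90,
    "Trine": 120,
    "Opposition": 180,
}

def separation(lon_a, lon_b):
    """Shortest angular distance between two longitudes, in the range 0 to 180."""
    diff = abs(lon_a - lon_b) % 360.0
    return min(diff, 360.0 - diff)

def classify_aspect(lon_a, lon_b, orb=6.0):
    """Return the aspect name if the separation is within `orb` of an aspect angle, else None."""
    sep = separation(lon_a, lon_b)
    for name, angle in ASPECT_ANGLES.items():
        if abs(sep - angle) <= orb:
            return name
    return None

# Invented longitudes: 124 degrees apart, so a trine within the assumed orb.
print(classify_aspect(170.0, 294.0))
```

Running this for every pair of points across the two charts yields the list of (A point, B point, aspect) triples that the score lookup needs.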

It depends on the category.  Each aspect I just mentioned will have an impact on several different category scores.  Romance, Aggression, Mutual Success will all be impacted to varying degrees by each of these aspects.  Without calculation, we can already know the score will be very, very good, but just how good?

Using a four-dimensional hyper-cube, I can look up each score, for each aspect, for each category.  The sum can be fetched with an MDX query.
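The lookup-and-sum step can be sketched in plain Python instead of MDX. Everything here is an assumption for illustration: the `SCORES` table, its values, and the `category_totals` helper are hypothetical, and the aspects listed are the ones from the scenario above, with the A chart point first.

```python
from collections import defaultdict

# Hypothetical score table: (A point, B point, aspect, category) -> score.
SCORES = {
    ("Moon", "Moon", "Trine", "Romance"): 6.0,
    ("Moon", "Moon", "Trine", "Communication"): 3.0,
    ("Venus", "Moon", "Conjunction", "Romance"): 7.0,
    ("Mars", "Venus", "Trine", "Romance"): 8.0,
    ("Mars", "Venus", "Trine", "Aggression"): 2.0,
}
CATEGORIES = ["Romance", "Communication", "Aggression"]

def category_totals(found_aspects):
    """Sum the score of every found aspect, separately for each category."""
    totals = defaultdict(float)
    for a_point, b_point, aspect in found_aspects:
        for category in CATEGORIES:
            # Combinations missing from the table contribute nothing.
            totals[category] += SCORES.get((a_point, b_point, aspect, category), 0.0)
    return dict(totals)

found = [
    ("Moon", "Moon", "Trine"),
    ("Venus", "Moon", "Conjunction"),
    ("Mars", "Venus", "Trine"),
]
print(category_totals(found))
```

Because each aspect is just a key lookup, the cost per couple is proportional to the number of aspects found times the number of categories.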

Of course, there are other ways you can do it.  You could do it in a purely functional manner, but this could be extremely computationally intensive in a situation where you are attempting to serve a million people in an Internet/Smart Phone world.  A pre-calculated hyper cube should be faster and more scalable.

It's also extremely well organized, testable, verifiable, and maintainable.

Friday, June 4, 2010

Witnessing the power of BI and SQL Analytics


So, a very fun story has dawned on me in the past week or so. Amazon.com has been demonstrating the power of business intelligence tools and SQL Analytics packages.

As you all know, I recently became obsessed with perfecting Alton Brown's apple pie recipe. Evidently, I was not the only one. Enough of us jumped on this bandwagon that Amazon.com clearly associated a group of products together. Witness what happens when you do an Amazon.com search for "Grains of Paradise":

Customers who bought this item also bought:
  1. A 10"x2" tart pan with a false bottom
  2. A pair of porcelain pie birds
  3. A jumbo stainless steel apple core slicer with 12 slots
  4. Tapioca flour
All of these are unusual tools and ingredients used in Alton Brown's apple pie recipe. I am surprised it didn't bring up the Apple Jack. Maybe they just can't sell the booze at Amazon. It is interesting that these items might completely overwhelm the conventional middle-eastern stuff you would normally associate with Mustapha's spices.

You can see that Alton has a large footprint. He can make an impact even on internet super-stores as large as Amazon.com. Amazon's completely automated analytical engine shows the power of this impact.

I will show you another example from the baking domain. The New York Times published a how-to video on YouTube some 4 years ago showing how to make no-knead bread. As the New York Times ambassador said, they got a huge response, and it became a cult.

Amazon clearly confirms that this is so. When you do an Amazon search for 'Dutch Ovens', guess what customers often buy with their Dutch Ovens? They buy a book on how to make no-knead bread. Don't think so? Check it out! The products are linked in both directions. It takes purchasing power to make that happen.

I must say, that did look like pretty awesome bread. For those who are out of the house for some 12 hours a day, the recipe is easy. Just stir, walk out the door, fire up the car, and go to work. When you come back, heat up the Dutch oven on your stove, and heat up the real oven. You toss them together and in about 1 hour you have your bread. The problem with the recipe is that housewives are impatient. They want to hurry the process. Time is the key ingredient.

Amazon's fully automated business intelligence solution has drawn a connection between the Dutch oven and this book regarding no-knead bread. It knows that people who buy one often buy the other. It just so happens that these products are related by a hit internet video which teaches busy people how to make great bread.
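Amazon's actual engine is not public, but the basic "people who buy one often buy the other" idea can be sketched as simple co-purchase counting. The Python below is a toy illustration under that assumption; the orders are invented and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def co_purchase_counts(orders):
    """Count how many orders contain each unordered pair of items."""
    counts = Counter()
    for order in orders:
        # sorted(set(...)) gives each pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(order)), 2):
            counts[(a, b)] += 1
    return counts

def also_bought(item, orders):
    """Items most often bought together with `item`, most frequent first."""
    related = Counter()
    for (a, b), n in co_purchase_counts(orders).items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return related.most_common()

# Invented orders, for illustration only.
orders = [
    ["dutch oven", "no-knead bread book"],
    ["dutch oven", "no-knead bread book", "oven mitts"],
    ["dutch oven", "no-knead bread book"],
    ["oven mitts"],
]
print(also_bought("dutch oven", orders))
print(also_bought("no-knead bread book", orders))
```

Because the pair counts are symmetric, the link shows up from either product's side, which matches the two-way linking described above.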

Folks, this is only the tip of the iceberg. I chose these examples because they are simple, clear-cut, and undeniable. There is far more to the story than that. We can explore all manner of natural phenomena like this. We can study football in this manner. We can find out what is really important... like scoring points.

You have these lovely little mathematical techniques called ANOVA and ANCOVA, which allow us to analyze variance and covariance. We can see how numerical values move together. We can see how much scoring impacts your victory total. We can see how much stopping the other team from scoring impacts your victory total. We can see that teams with great punting stats have lousy win totals. The Rams had a great punt team last year. They punted a lot from deep in their own territory. Donnie Jones could really cut loose. The Saints had fairly poor punter stats.
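As a much simpler stand-in for a full ANOVA or ANCOVA, sample covariance and Pearson correlation already show whether two numbers move together. The Python sketch below uses made-up season totals for five imaginary teams, not real NFL data, purely to illustrate the calculation.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled into the range -1 to 1."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Made-up season totals for five imaginary teams.
points_scored = [510, 420, 350, 300, 175]
wins = [13, 11, 8, 6, 1]
punts = [58, 62, 75, 80, 96]

print("points vs wins:", round(correlation(points_scored, wins), 3))
print("punts vs wins:", round(correlation(punts, wins), 3))
```

A value near +1 means the two columns rise together and a value near -1 means one rises as the other falls; with real data the relationships would be noisier than in this invented set.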

Believe me, the world is wide open to analysis by these marvelous SQL tools. Microsoft claims that they bundle the world's most sophisticated BI engine in the box with every copy of Microsoft SQL Server. This is very interesting to me. We happen to have a lot of SQL Server boxes lying around here at work. I happen to have the developer's edition at home.

In the past, I have used the common tools the NFL and ESPN provide us to gather basic statistical information I use to form my opinions. You can get a hell of a good grounding in the facts with these tools. The simulators, like John Madden Football, can also tell you a hell of a lot. This season, I think I am going to step it up. I am going to gather in this data, slap it into a SQL MDB and see what the BI engine can tell me.

This ought to be fun.