Using Complex Event Processing (CEP) with Microsoft StreamInsight to Analyze Twitter Tweets 2: What are CEP and StreamInsight?

Note: This post is part of a series; the overview can be found here: Complex Event Processing with StreamInsight

What is CEP?

Wikipedia defines Event Processing as:

“Event processing is a method of tracking and analyzing (processing) streams of information (data) about things that happen (events), and deriving a conclusion from them”

and Complex Event Processing as:

“event processing that combines data from multiple sources to infer events or patterns that suggest more complicated circumstances. The goal of complex event processing is to identify meaningful events (such as opportunities or threats) and respond to them as quickly as possible.”
http://en.wikipedia.org/wiki/Complex_Event_Processing

CEP is often used for the following types of applications:

  • algorithmic trading
  • pattern detection (e.g. fraud detection)
  • security applications
  • sensor data, e.g. in robotics

[Image: Google car]
Example of a CEP scenario: the sensor data of the Google car.

[Image: data visualization]
Example of a CEP scenario: processing and visualization of financial market data.

CEP Solutions

The following image gives an overview of a typical CEP architecture. Data from devices, sensors, computers, data stores or other sources is fed, often in real time, into a CEP engine. The engine uses standing queries, which are registered once and then evaluated continuously as data arrives, to analyze the data streams on the fly. The output of each query is forwarded to targets that typically alert on, visualize, store or pass on the data.

[Image: overview of a typical CEP architecture]
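To make the source, engine and target roles more concrete, here is a minimal sketch in plain Python. It is not StreamInsight code; the names `Event`, `sensor_source`, `over_threshold` and `run`, as well as the sample readings and the threshold of 30.0, are made up for illustration. The standing query is modelled as a generator that stays attached to the stream and emits matching events as they arrive.

```python
from typing import Callable, Iterable, Iterator, List, NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds since the start of the stream
    source: str       # hypothetical sensor name
    value: float      # measured value


def sensor_source(readings: Iterable[tuple]) -> Iterator[Event]:
    """Data source: turns raw tuples into events, one at a time."""
    for ts, name, value in readings:
        yield Event(ts, name, value)


def over_threshold(events: Iterable[Event], limit: float) -> Iterator[Event]:
    """Standing query: evaluated for every event as it arrives."""
    for event in events:
        if event.value > limit:
            yield event


def run(events: Iterable[Event], sink: Callable[[Event], None]) -> None:
    """Engine loop: pulls query results and forwards them to the target."""
    for event in events:
        sink(event)


# Hypothetical sample data standing in for a live sensor feed.
raw = [
    (0.0, "temp-1", 20.5),
    (1.0, "temp-1", 31.2),
    (2.0, "temp-2", 19.8),
    (3.0, "temp-2", 35.0),
]

alerts: List[Event] = []  # the target: here simply a list that stores alerts
run(over_threshold(sensor_source(raw), limit=30.0), alerts.append)
print(alerts)
```

Because every stage is a generator, nothing is buffered: each reading flows through the query and reaches the target before the next one is read, which is the essential difference from querying data at rest.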

Two important points to note:

  • In CEP we are often more interested in events (and non-events, i.e. something expected that did not happen) or in aggregates of the data than in the raw data itself.
  • StreamInsight should not be confused with HDInsight. HDInsight is the Big Data solution (Hadoop) offered on Windows Azure.
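The first point, aggregates and non-events, can be illustrated with a small Python sketch. It is a simplification under stated assumptions: it works on a finite list of timestamps instead of a live stream, and the function names `tumbling_counts` and `silent_windows`, the heartbeat data and the 2-second window are hypothetical. A real CEP engine would compute the same result incrementally as time advances.

```python
from collections import Counter
from typing import Dict, Iterable, List


def tumbling_counts(timestamps: Iterable[float], window: float) -> Dict[int, int]:
    """Aggregate: count events per fixed-size, non-overlapping window.

    Returns a mapping of window index to event count.
    """
    return dict(Counter(int(ts // window) for ts in timestamps))


def silent_windows(counts: Dict[int, int], first: int, last: int) -> List[int]:
    """Non-events: windows in [first, last] in which nothing arrived."""
    return [w for w in range(first, last + 1) if counts.get(w, 0) == 0]


# Hypothetical heartbeat timestamps (in seconds) from a single device.
heartbeats = [0.5, 1.2, 1.9, 2.4, 6.1, 6.8]

counts = tumbling_counts(heartbeats, window=2.0)
gaps = silent_windows(counts, first=0, last=3)

print(counts)  # events per 2-second window
print(gaps)    # windows without any heartbeat
```

Neither result is an individual data item: the counts summarize the stream, and the gap is information that exists only because something did not happen.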

What is Microsoft StreamInsight?

StreamInsight is a CEP engine and framework developed by Microsoft. It belongs to the SQL Server stack, but it can be used completely independently of SQL Server.

However, in order to use StreamInsight on a production system you need a SQL Server license. To get to know the product, there is a free 180-day trial license available (it can be selected during installation).

I am a huge StreamInsight fan: it is very lightweight, extremely powerful and, for me as a .NET developer, really easy to use. The query system in StreamInsight extends the Language Integrated Query (LINQ) concept that should be familiar to every .NET developer.

I highly recommend the following TechNet article for a nice overview of StreamInsight:

StreamInsight for non-programmers:
http://social.technet.microsoft.com/wiki/contents/articles/14437.streaminsight-for-non-programmers.aspx

If we look at the architecture of a StreamInsight solution, it matches the CEP architecture depicted above:

[Image: architecture of a StreamInsight solution]

In StreamInsight we speak of data sources, the objects that feed data into StreamInsight, and data sinks, the objects that receive a data feed from StreamInsight. Note that the StreamInsight query language is LINQ, and it is even possible to combine multiple queries inside the engine.
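The idea of combining queries between a source and a sink can be sketched independently of the product. The following Python example is only an analogy, not the StreamInsight API: the tweet dictionaries, the `lang` and `text` fields and the functions `english_only` and `hashtags` are assumptions made for illustration. Each query consumes the output of the previous one, much like chained LINQ operators.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator

Tweet = Dict[str, str]  # hypothetical shape: {"lang": ..., "text": ...}


def english_only(tweets: Iterable[Tweet]) -> Iterator[Tweet]:
    """First query: keep only tweets marked as English."""
    return (t for t in tweets if t["lang"] == "en")


def hashtags(tweets: Iterable[Tweet]) -> Iterator[str]:
    """Second query: project each tweet to its lower-cased hashtags."""
    for t in tweets:
        for word in t["text"].split():
            if word.startswith("#") and len(word) > 1:
                yield word.lower()


# Data source: a fixed list standing in for a live tweet feed.
source = [
    {"lang": "en", "text": "Trying out #StreamInsight today"},
    {"lang": "de", "text": "Guten Morgen #cep"},
    {"lang": "en", "text": "#CEP with #streaminsight is fun"},
]

# Data sink: a counter that receives the output of the combined queries.
sink: Counter = Counter()
sink.update(hashtags(english_only(source)))

top = sink.most_common(1)
print(top)
```

Keeping each query small and composing them keeps the individual steps easy to test; the sink only sees the final result and does not need to know how many queries produced it.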

In the next post we are going to take a closer look at the StreamInsight programming model: 

Using Complex Event Processing (CEP) with Microsoft StreamInsight to Analyze Twitter Tweets 3: StreamInsight Programming