Monday, February 20, 2012

New Database System Overview

Hi guys,
I am currently finishing off my high level structure for the new
business intelligence database system for my company and without going
into too much detail I would like to get your comments on it. Maybe
there is something you have tried that did/didn't work. Any comments
at all would be appreciated.
I have a pretty standard system plan in place. We have about 10 OLTP
systems/databases currently and all vary in the amount of traffic from
low (5 transactions a minute) to heavy (5-30 rows a second). We have
thousands of financial terminals all over the world and these
terminals feed our OLTP servers in near real time. Our customer
service team must have near real time access to this data so this is
where I want to make use of an ODS. My thought is to develop a
messaging system from all of the OLTP systems and stage this data to
this ODS. This will be where the tech support/ops/customer care teams
can have instant access to the data and be able to support the
customers with a real time view to the session events. Much of our
daily reporting and information gathering is done by the
afforementioned teams. Customer payments, machine health, peripheral
activity and software events are all gathered and sent back. I am
designnig the ODS so our employees can make quick decisions based on
the incoming events. Separate from the ODS, I am also going to
implement a data warehouse. This area will contain more long term
trending of transactions and other data used for executive decisions.
I have already designed and successfully built data warehouses before
so I am very confident in this area. The area that I do have a few
questions is the bridge between the ODS and the data warehouse. I have
read many articles dealing with real time data warehousing and ODS ETL
and many people have different opinions on what they think is the way
to go. My initial thought is to stage OLTP data to the ODS in near
real time. From there I will stage the data into the data warehouse
every night as we have the hardware to do it this frequently. So my
question is:
Should I run two systems (ODS/DW) side by side for long periods of
time and use the systems as separate entities for different groups of
people or should I use the ODS for say just the current day and the
run any earlier reporting off the data warehouse?
I know having redundant data costs money and greater effort to manage
but we have the resources to back this if need be. Our OPS team needs
data in real time, our financial team needs it daily and the executive
team needs it monthly so the ODS would serve the OPS team pretty much
exclusively.
Does anyone have any general comments to describe their experiences in
designing and building similiar systems? Our data flow is not overly
large right now but it is growing exponentially every month and I want
to be ready for it years down the road. Thanks for reading my long
winded article and I would love to hear all of your thoughts. It is
late on a Sunday night and I am very tired so I apologize if my
article is not gramatically ready for tomorrow's Wall Street
Journal . Thanks in advance.
Corey
Hi Corey,
this is an interesting scenario.
I can only do some general consideration.
It would be important to know if you are going to use Analysis
Services or not. One of the new features of Analysis Services is the
ability to directly push data into dimensions and measure group
partitions. This means that you could implement an almost continuous
update of the cube with data that are pushed into analysis services
directly from a queue that is fed by the middle tier. This can be
useful when you cannot afford a query on OLTP during daily operations.
Anyway, there are a wide range of opportunities here, ranging from
these extremely optimized one to the classical relational data
warehouse with intra-day operations that is updated frequently and
consolidated into a larger data warehouse nightly.
I you want to design a "classical" relational data warehouse, the best
practice is to design a simpler star schema, with fewer dimensions
than the big one, just to make sure that you are able to insert data
into fact table in a fast way. It is important to use some decoupling
technique between the OLTP transaction and the ODS update: Message
Queue, Service Broker or something like this.
I would not consider using a single data warehouse with even the intra-
day updates, unless both data warehouse and daily values can share the
same dimensions, both in number and in contents. In that case, I would
create a partition (or a separate fact table) with only the intra-day
operation, and then each night I would consolidate data making all
necessary cleansing operations.
Good work!
Marco Russo
http://www.sqlbi.eu
http://www.sqljunkies.com/weblog/sqlbi
On Jan 29, 9:10 am, kild...@.yahoo.ca wrote:
> Hi guys,
> I am currently finishing off my high level structure for the new
> business intelligence database system for my company and without going
> into too much detail I would like to get your comments on it. Maybe
> there is something you have tried that did/didn't work. Any comments
> at all would be appreciated.
> I have a pretty standard system plan in place. We have about 10 OLTP
> systems/databases currently and all vary in the amount of traffic from
> low (5 transactions a minute) to heavy (5-30 rows a second). We have
> thousands of financial terminals all over the world and these
> terminals feed our OLTP servers in near real time. Our customer
> service team must have near real time access to this data so this is
> where I want to make use of an ODS. My thought is to develop a
> messaging system from all of the OLTP systems and stage this data to
> this ODS. This will be where the tech support/ops/customer care teams
> can have instant access to the data and be able to support the
> customers with a real time view to the session events. Much of our
> daily reporting and information gathering is done by the
> afforementioned teams. Customer payments, machine health, peripheral
> activity and software events are all gathered and sent back. I am
> designnig the ODS so our employees can make quick decisions based on
> the incoming events. Separate from the ODS, I am also going to
> implement a data warehouse. This area will contain more long term
> trending of transactions and other data used for executive decisions.
> I have already designed and successfully built data warehouses before
> so I am very confident in this area. The area that I do have a few
> questions is the bridge between the ODS and the data warehouse. I have
> read many articles dealing with real time data warehousing and ODS ETL
> and many people have different opinions on what they think is the way
> to go. My initial thought is to stage OLTP data to the ODS in near
> real time. From there I will stage the data into the data warehouse
> every night as we have the hardware to do it this frequently. So my
> question is:
> Should I run two systems (ODS/DW) side by side for long periods of
> time and use the systems as separate entities for different groups of
> people or should I use the ODS for say just the current day and the
> run any earlier reporting off the data warehouse?
> I know having redundant data costs money and greater effort to manage
> but we have the resources to back this if need be. Our OPS team needs
> data in real time, our financial team needs it daily and the executive
> team needs it monthly so the ODS would serve the OPS team pretty much
> exclusively.
> Does anyone have any general comments to describe their experiences in
> designing and building similiar systems? Our data flow is not overly
> large right now but it is growing exponentially every month and I want
> to be ready for it years down the road. Thanks for reading my long
> winded article and I would love to hear all of your thoughts. It is
> late on a Sunday night and I am very tired so I apologize if my
> article is not gramatically ready for tomorrow's Wall Street
> Journal . Thanks in advance.
> Corey
|||Thanks for the reply Marco. I think we are along the same lines here.
I guess my biggest concern is the difference in the requirements of
our operations department and the rest of the company. The biggest
challenge will be to "combine" their needs within the ODS with the
warehouse. Also, transactions can be voided days later and since the
datawarehouse is a non volitile area I am going have to figure out how
to make changes to the fact tables or wait a few days before the
warehouse gets staged. So, in your opinion,would you run the ODS side
by side with the warehouse to serve the functional needs of the
different corporate departments or should I try my best to get all of
the info into the warehouse and use the ODS for only 1 day or so worth
of data. I am just concerned that the two systems will not mesh well
enough to support everyone involved. Thanks for your input. It is very
much appreciated.
BTW... we haven't made up our minds on Analysis server yet but I am
going to definitely do some research on its feasibility here.
Corey
|||Corey,
Your approach sounds reasonable. If I remember correctly, Kimball addresses
the ODS in his book: The Data Warehouse Lifecycle Toolkit.
-- Bill
<kildenc@.yahoo.ca> wrote in message
news:1170058228.206470.269570@.s48g2000cws.googlegr oups.com...
> Hi guys,
> I am currently finishing off my high level structure for the new
> business intelligence database system for my company and without going
> into too much detail I would like to get your comments on it. Maybe
> there is something you have tried that did/didn't work. Any comments
> at all would be appreciated.
> I have a pretty standard system plan in place. We have about 10 OLTP
> systems/databases currently and all vary in the amount of traffic from
> low (5 transactions a minute) to heavy (5-30 rows a second). We have
> thousands of financial terminals all over the world and these
> terminals feed our OLTP servers in near real time. Our customer
> service team must have near real time access to this data so this is
> where I want to make use of an ODS. My thought is to develop a
> messaging system from all of the OLTP systems and stage this data to
> this ODS. This will be where the tech support/ops/customer care teams
> can have instant access to the data and be able to support the
> customers with a real time view to the session events. Much of our
> daily reporting and information gathering is done by the
> afforementioned teams. Customer payments, machine health, peripheral
> activity and software events are all gathered and sent back. I am
> designnig the ODS so our employees can make quick decisions based on
> the incoming events. Separate from the ODS, I am also going to
> implement a data warehouse. This area will contain more long term
> trending of transactions and other data used for executive decisions.
> I have already designed and successfully built data warehouses before
> so I am very confident in this area. The area that I do have a few
> questions is the bridge between the ODS and the data warehouse. I have
> read many articles dealing with real time data warehousing and ODS ETL
> and many people have different opinions on what they think is the way
> to go. My initial thought is to stage OLTP data to the ODS in near
> real time. From there I will stage the data into the data warehouse
> every night as we have the hardware to do it this frequently. So my
> question is:
> Should I run two systems (ODS/DW) side by side for long periods of
> time and use the systems as separate entities for different groups of
> people or should I use the ODS for say just the current day and the
> run any earlier reporting off the data warehouse?
> I know having redundant data costs money and greater effort to manage
> but we have the resources to back this if need be. Our OPS team needs
> data in real time, our financial team needs it daily and the executive
> team needs it monthly so the ODS would serve the OPS team pretty much
> exclusively.
> Does anyone have any general comments to describe their experiences in
> designing and building similiar systems? Our data flow is not overly
> large right now but it is growing exponentially every month and I want
> to be ready for it years down the road. Thanks for reading my long
> winded article and I would love to hear all of your thoughts. It is
> late on a Sunday night and I am very tired so I apologize if my
> article is not gramatically ready for tomorrow's Wall Street
> Journal . Thanks in advance.
> Corey
>
|||Corey,
On further thought, I would address each one as an independent requirement.
If you constrain the ODS based upon the requirements of the DW, then the ODS
may not be as agile as it could be. More than likely, the requirements of
the ODS will be far more dynamic than that of the DW. Just a few thoughts.
-- Bill
<kildenc@.yahoo.ca> wrote in message
news:1170179507.956288.267830@.q2g2000cwa.googlegro ups.com...
> Thanks for the reply Marco. I think we are along the same lines here.
> I guess my biggest concern is the difference in the requirements of
> our operations department and the rest of the company. The biggest
> challenge will be to "combine" their needs within the ODS with the
> warehouse. Also, transactions can be voided days later and since the
> datawarehouse is a non volitile area I am going have to figure out how
> to make changes to the fact tables or wait a few days before the
> warehouse gets staged. So, in your opinion,would you run the ODS side
> by side with the warehouse to serve the functional needs of the
> different corporate departments or should I try my best to get all of
> the info into the warehouse and use the ODS for only 1 day or so worth
> of data. I am just concerned that the two systems will not mesh well
> enough to support everyone involved. Thanks for your input. It is very
> much appreciated.
> BTW... we haven't made up our minds on Analysis server yet but I am
> going to definitely do some research on its feasibility here.
> Corey
>
|||I agree with AlterEgo: if requirements are going to change, the
maintenance of your Data Warehouse could become a nightmare.
That said, if one of your requirements is the real-time update of
information compared with historical data, you should maintain some
relationship between the two worlds.
Talking about void transaction, you can incrementally handle them by
inserting opposite transactions in the Data Warehouse. Not so easy,
but possible.
Marco Russo
http://www.sqlbi.eu
http://www.sqljunkies.com/weblog/sqlbi
On Jan 31, 12:10 am, "AlterEgo" <altereg...@.dslextreme.com> wrote:
> Corey,
> On further thought, I would address each one as an independent requirement.
> If you constrain the ODS based upon the requirements of the DW, then the ODS
> may not be as agile as it could be. More than likely, the requirements of
> the ODS will be far more dynamic than that of the DW. Just a few thoughts.
> -- Bill
> <kild...@.yahoo.ca> wrote in message
> news:1170179507.956288.267830@.q2g2000cwa.googlegro ups.com...
>
>
>
> - Show quoted text -
|||Thanks guys,
As the project proceeds more of these questions will be answered. I
have read some very interesting articles from Kimball dealing with the
ODS. I can understand why he would want to bring the ODS under the
roof of the data warehouse and it is a definite possibility but as you
guys mentioned that it may become a very tedious task to keep the
requirements of both systems the same so they mesh well together. I
just see the operations requirements being too granular. For instance,
a simple question from our operations team would be, "What are the
transactions details for the last payment for the customer with the
phone number 5555555?". Then show me all of the session details for
this transaction. I just don't see this data fitting well into the
datawarehouse. Thanks for the input Bill and Marco. It is very much
appreciated.
Corey

No comments:

Post a Comment