Data Warehouses

The basic reasons organizations implement data warehouses are: To perform
server/disk bound tasks associated with querying and reporting on servers/disks
not used by transaction processing systems most firms want to set up transaction
processing systems so there is a high probability that transactions will be
completed in what is judged to be an acceptable amount of time. Reports and
queries, which can require a much greater range of limited server/disk resources
than transaction processing, run on the servers/disks used by transaction
processing systems can lower the probability that transactions complete in an
acceptable amount of time. Or, running queries and reports, with their variable
resource requirements, on the servers/disks used by transaction processing
systems can make it quite complex to manage servers/disks so there is a high
enough probability that acceptable response time can be achieved. Firms
therefore may find that the least expensive and/or most organizationally
expeditious way to obtain high probability of acceptable transaction processing
response time is to implement a data warehousing architecture that uses separate
servers/disks for some querying and reporting. To use data models and/or server
technologies that speed up querying and reporting and that are not appropriate
for transaction processing There are ways of modeling data that usually speed up
querying and reporting (e.g., a star schema) and may not be appropriate for
transaction processing because the modeling technique will slow down and
complicate transaction processing. Also, there are server technologies that that
may speed up query and reporting processing but may slow down transaction
processing (e.g., bit-mapped indexing) and server technologies that may speed up
transaction processing but slow down query and report processing (e.g.,
technology for transaction recovery.) - Do note that whether and by how much a
modeling technique or server technology is a help or hindrance to
querying/reporting and transaction processing varies across vendors' products
and according to the situation in which the technique or technology is used. To
provide an environment where a relatively small amount of knowledge of the
technical aspects of database technology is required to write and maintain
queries and reports and/or to provide a means to speed up the writing and
maintaining of queries and reports by technical personnel Often a data warehouse
can be set up so that simpler queries and reports can be written by less
technically knowledgeable personnel. Nevertheless, less technically
knowledgeable personnel often "hit a complexity wall" and need IS
help. IS, however, may also be able to more quickly write and maintain queries
and reports written against data warehouse data. It should be noted, however,
that much of the improved IS productivity probably comes from the lack of
bureaucracy usually associated with establishing reports and queries in the data
warehouse. To provide a repository of "cleaned up" transaction
processing systems data that can be reported against and that does not
necessarily require fixing the transaction processing systems the data warehouse
provides an opportunity to clean up the data without changing the transaction
processing systems. Some data warehousing completions provide a means to capture
corrections made to the data warehouse data and feed the corrections back into
transaction processing systems. Sometimes it makes more sense to handle
corrections this way than to apply changes directly to the transaction
processing system. To make it easier, on a regular basis, to query and report
data from multiple transaction processing systems and/or from external data
sources an/or from data that must be stored for query/report purposes only For a
long time firms that need reports with data from multiple systems have been
writing data extracts and then running sort/merge logic to combine the extracted
data and then running reports against the sort/merged data. In many cases this
is a perfectly adequate strategy. However, if a company has large amounts of
data that need to be sort/merged frequently, if data purged from transaction
processing systems needs to be reported upon, and most importantly, if the data
need to be "cleaned", data warehousing may be appropriate. To provide
a repository of transaction processing system data that contains data from a
longer span of time than can efficiently be held in a transaction processing
system and/or to be able to generate reports "as was" as of a previous
point in time Older data are often purged from transaction processing systems so
the expected response time can be better controlled. For querying and reporting,
this purged data and the current data may be stored in the data warehouse where
there presumably is less of a need to control expected response time or the
expected response time is at a much higher level. - As for "as was"
reporting, some times it is difficult, if not impossible, to generate a report
based on some characteristic at a previous point in time. For example, if you
want a report of the salaries of employees at grade Level 3 as of the beginning
of each month in 1997, you may not be able to do this because you only have a
record of current employee grade level. To be able to handle this type of
reporting problem, firms may implement data warehouses that handle what is
called the "slowly changing dimension" issue. To prevent persons who
only need to query and report transaction processing system data from having any
access whatsoever to transaction processing system databases and logic used to
maintain those databases The concern here is security. For example, data
warehousing may be interesting to firms that want to allow report and querying
only over the Internet. Some firms implement data warehousing for all the
reasons cited. Some firm implement data warehousing for only one of the reasons
cited. I do believe that the achievement of a "business" objective for
a data warehouse necessarily comes about because of the achievement of one or
many of the above objectives. If you examine the list you may be struck that
need for data warehousing is mainly caused by the limitations of transaction
processing systems. These limitations of transaction processing systems are not,
however, inherent. That is, the limitations will not be in every implementation
of a transaction processing system. Also, the limitations of transaction
processing systems will vary in how crippling they are. Finally, to repeat the
point I made initially, a firm that expects to get business intelligence, better
decision making, closeness to its customers, and competitive advantage simply by
plopping down a data warehouse is in for a surprise. Obtaining these next order
benefits requires firms to figure out, usually by trial and error, how to change
business practices to best use the data warehouse and then to change their
business practices. And that can be harder than implementing a data warehouse.