Data Warehouse Research Paper

Data warehousing, as a means of organizing enterprise information in order for businesses to manage knowledge and benefit from the knowledge acquired from possible analysis, is a common business venture in most firms today. Gone are the days when one large and expensive supercomputer would be used to manage an entire organization’s data.

Today, multiple Central Processing Units (CPUs) are at the disposal of the IT team. The beauty of this scenario is that the CPUs can work simultaneously on different but related parts of a major task, thus completing it in record time.

One of the many advantages of data warehousing is the fact that these systems become a central data source after consolidation, which is accessible to end users and information derivation becomes simpler if not straightforward. Consequently, this element increases the efficiency of business transactions, which eventually draws the line between the firms with business acumen and those without.

However, one inherent disadvantage follows data warehousing and it involves data mining. Ideally, data mining is the final stage of data warehousing because at this point, it is possible to gather all possible types of relational information from the system and determine links and relationships that were not decipherable before. As a result, the accuracy of queries increases and business output increases.

However, this ideal does not hold in practice because of a few hitches that attach to data mining. First, once a data-mining capability is in place, only a few users in the entire enterprise can actually use it because of the high level of specialization required. In fact, the number typically peaks at about five.

Given this scenario, it is unsurprising that most organizations do not see the point of paying heavily for a process that only five people in the firm would use, and they therefore budget very little for it. On the other hand, data-warehouse builders know that they need substantial upfront capital and a heavy investment of time before they can come up with a data-mining algorithm, which is infamous for its complexity.

This aspect, coupled with the fact that it is virtually impossible to predict the usefulness of a data-mining infrastructure at the outset, which deprives the technician of a sales pitch, makes a very bad case for data mining; yet its importance cannot be overemphasized.

This paper looks into several such salient features of data warehousing and closes with a few recommendations as well as forecasts on the future of data warehousing.

Introduction

Data warehousing is a rather new term for an old concept. The term emerged in the 1990s; earlier systems of this kind were referred to as Decision Support Systems or Executive Information Systems. William Inmon is widely regarded as the father of data warehousing, and Ralph Kimball is usually named beside him in reviews as a co-innovator.

Several definitions exist for what has come to be accepted as data warehousing in the 21st century. One states, “A data warehouse is an organized system of enterprise data derived from multiple data sources designed primarily for decision making in the organization” (Bertman, 2005, p. 12).

This definition brings out the idea of a myriad of data sources, which is especially relevant because most organizations today have multiple data sources.

Moreover, when customizing a data warehouse, it is essential to ensure that the infrastructure being set up, including the ETL tools (Extraction, Transformation, Transportation and Loading solutions), is compatible with all the data sources. Additionally, the definition touches on the issue of decision making as a primary focus when establishing a data-warehousing project.

A second definition is briefer, viz. “…a data warehouse is a structured repository of historic data” (Kimball, Ross, Thornthwaite, Mundy, & Becker, 2008, p. 32).

The authors of this definition add that it is “…developed in an evolutionary process by integrating data from non-integrated legacy systems” (Kimball, Ross, Thornthwaite, Mundy, & Becker, 2008, p. 32).

This definition is attractive for its introduction of the term “integrated”, because the main idea behind data warehousing is that the information that was previously archived in a jumble is reorganized to make sense in the form of tables and even graphs depending on the presentation format preferred by the end user.

At this point, it is appropriate to introduce Inmon’s definition. Because he is regarded as the father of data warehousing, his definition carries particular weight among data warehouse builders and other experts in the field, and it has even been used to divide data warehousing into branches.

He states, “A data warehouse is a subject-oriented, integrated, time variant, and non-volatile collection of data used in strategic decision making” (Inmon, 2003, p. 34). It is important to note the usage of several defining terms that have since achieved the status of “mandatory” features of a data warehouse: subject oriented, integrated, time variant, and non-volatile.

Another definition reads, “A data warehouse is an electronic storage of an organization’s historical data for the purpose of analysis and interpreting” (Prine, 1998, p. 54).

The interesting concept introduced by this final definition is the term “historical data”, which is a very important feature of data warehouses, as shall be seen in the ensuing discourse. Additionally, the tasks of analysis and interpretation mentioned by this definition are crucial features of data warehousing.

The next section provides a run through the definitions of other important terms outlined within this paper.

Definitions

OLAP: – Online Analytical Processing refers to the procedure through which multidimensional analysis occurs.

OLTP: – this term refers to a transaction system that collects business data and it is optimized for INSERT and UPDATE operations. It is highly normalized because the emphasis is on updating the system since transactions take precedence here and so the currency of the information is crucial for the relevance of the data.

Data Mart: – this term refers to a data structure designed for access. It is designed with the aim of enhancing end-user access to information stored by subject. For instance, an organization has numerous departments, including IT, HR, Management, Finance, and Research, among others.

The organization may set up a data mart on top of the hardware platform for each department, so that after data warehousing there exists the traditional centralized data store envisioned by the creators and, in addition, a further layer in the architecture that provides for data marts (Hackney, 2007, p. 45). These data marts in effect separate the information into the relevant sub-sections based on subject matter.

ER Model: – this term refers to the entity-relationship model, a data modeling methodology whose aim is to normalize data by reducing redundancy.

Dimensional Model: – this model qualifies the data, with the main goal of improving data retrieval. It is ideal for data warehousing that is driven by queries. For example, keying in “1kg” alone as a search term would return a convoluted set of results.

On the contrary, if one keys in “1kg of soya (product) bought by Becker (customer) on 23rd November 2012 (date),” one has in effect introduced three dimensions: product, customer, and date.

Dimensions are mutually independent, non-overlapping classifications of data (Imhoff, Galemmo, & Geiger, 2003, p. 101). A fact, by contrast, is something that can be measured or quantified, conventionally but not always in numerical values that can be aggregated; here, the 1kg quantity is the fact.

Star schema: – this term refers to a technique used in data warehousing models in which one centralized fact table is used as the reference for all the dimension tables so that the keys (primary keys) from the entirety of dimension tables can flow directly into the fact table (as foreign keys of course) in which the measures are stored. The entity relationship represented diagrammatically resembles a star, hence the name.
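The star schema and the three-dimension query described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the table names (dim_product, dim_customer, dim_date, fact_sales), the column names, and the sample rows are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical star schema: one central fact table whose foreign keys
# point at three dimension tables (product, customer, date).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date     (date_id     INTEGER PRIMARY KEY, iso_date TEXT);
CREATE TABLE fact_sales (
    product_id  INTEGER REFERENCES dim_product(product_id),
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    date_id     INTEGER REFERENCES dim_date(date_id),
    quantity_kg REAL  -- the measure stored in the fact table
);
INSERT INTO dim_product  VALUES (1, 'soya'), (2, 'maize');
INSERT INTO dim_customer VALUES (1, 'Becker'), (2, 'Ross');
INSERT INTO dim_date     VALUES (1, '2012-11-23'), (2, '2012-11-24');
INSERT INTO fact_sales   VALUES (1, 1, 1, 1.0), (1, 2, 1, 3.0), (2, 1, 2, 5.0);
""")


def kg_bought(product, customer, iso_date):
    """Total kilograms for one (product, customer, date) combination."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(f.quantity_kg), 0)
        FROM fact_sales f
        JOIN dim_product  p ON p.product_id  = f.product_id
        JOIN dim_customer c ON c.customer_id = f.customer_id
        JOIN dim_date     d ON d.date_id     = f.date_id
        WHERE p.name = ? AND c.name = ? AND d.iso_date = ?
        """,
        (product, customer, iso_date),
    ).fetchone()
    return row[0]


print(kg_bought("soya", "Becker", "2012-11-23"))
```

Each dimension narrows the search independently, which is why the query filters on the three dimension tables while the measure itself is read only from the fact table.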

Different Types of Data Warehousing Architectures

There are three main types of data warehousing architectures:

  • Data Warehouse Architecture (Basic)
  • Data Warehouse Architecture (with a Staging Area)
  • Data Warehouse Architecture (with a Staging Area and Data Marts)

The basic architecture comprises metadata, raw data, and summary data. Metadata and raw data are a classical feature of all operational systems, but the summary data is what makes the architecture distinctively a data warehouse.

Summaries pre-compute long operations in advance; for instance, a summary can answer a query on August sales without rescanning the detail rows (Imhoff & White, 2011, p. 25). In Oracle, a summary is also known as a materialized view. In terms of granularity, warehouse data may be atomic (transaction oriented), lightly summarized, or highly summarized.
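The idea of a pre-computed summary can be illustrated with a small sketch. SQLite has no materialized views, so the example below assumes a plain table (monthly_sales_summary, a hypothetical name) built once from the atomic rows stands in for one; the sales table and its figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Atomic, transaction-level rows (hypothetical sample data).
CREATE TABLE sales (sale_date TEXT, amount REAL);
INSERT INTO sales VALUES
    ('2012-07-30', 40.0), ('2012-08-02', 100.0),
    ('2012-08-15', 250.0), ('2012-09-01', 75.0);

-- Lightly summarized level: one row per month, computed once in advance.
CREATE TABLE monthly_sales_summary AS
    SELECT substr(sale_date, 1, 7) AS month, SUM(amount) AS total
    FROM sales
    GROUP BY month;
""")


def august_sales():
    """Answer the 'August sales' question from the summary, not the detail rows."""
    row = conn.execute(
        "SELECT total FROM monthly_sales_summary WHERE month = '2012-08'"
    ).fetchone()
    return row[0]


print(august_sales())
```

In a real warehouse the summary would have to be refreshed whenever new detail rows are loaded; that step is left out of this sketch.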

The architecture with a staging area is relevant when there is a need to clean and process operational data before it is stored in the warehouse. This task can be done either programmatically or by using a staging area. A staging area simply refers to that “region of the architecture that simplifies building summaries and general warehouse management” (Jarke, Lenzerini, Vassiliou, & Vassiliadis, 2003, p. 67).

The architecture with a staging area and data marts is ideal for customizing a data warehouse for different groups within an organization. It adds “data marts to the staging area, where data marts are systems that are designed for a particular line of business” (Hackney, 2007, p. 18). A good example is a case where a firm needs to separate inventories from sales and purchases.

At this point, it is important to introduce the concept of Business Intelligence for a better understanding of the working of data warehouses. Business intelligence covers information that is available for strategic decision making by businesses. In this setting, the data warehouse is simply the backbone or the infrastructural component (Prine, 1998, p. 39).

Business intelligence includes the insight that is obtained upon the execution of a data mining analysis and other unstructured data, and this aspect explains the significance of content management systems because in an unstructured context, they organize the information logically for better analysis.

When choosing a business intelligence tool, one needs to weigh the following considerations: as tools become more advanced, their cost, functionality, and complexity increase, while the number of end users able to use them decreases (Eliott, 2012). Interestingly, the most popular business intelligence tool is Microsoft Excel.

This assertion holds for several reasons, including the fact that Microsoft Excel is cheap to acquire and conveniently simple to use.

In addition, the user does not have to worry whether the other user can decipher the information or figure out how the reports are to be interpreted (because the presentation is simple to interpret), and finally, Excel has all the functionalities that are necessary for the display of data (Barwick, 2012).

Other tools include a reporting tool, which can be either custom built or commercial and it is used for the running, creation, and scheduling of operations or reports (Kimball, Ross, Thornthwaite, Mundy, & Becker, 2008, p. 67).

Another tool is the OLAP tool, which is a favorite amongst advanced users because it features a multidimensional perspective of findings. Finally, there is the data mining tool, which is for specialized users, hence the limitation to fewer than five users in an entire enterprise.

Overall structure

The primary features of a data warehouse are better relayed in a graphical format, but this section aims to provide a comprehensive textual explanation of the same. At the front end, there exist data sources, which are stored in different formats and are largely unorganized and very general.

The idea is to get the data to the other end where, in an ideal scenario, it is available to end users in data marts and the users are able to export it to media such as CDs, DVDs, or flash drives.

In a bid to get to that end, the data has to pass through data acquisition, which refers to retrieval of information from the data sources; that is, “a set of processes and programs that extract data for the data warehouse and operational data store from the operational systems” (Imhoff, Galemmo, & Geiger, 2003, p. 17).

At this stage, features touching on cleansing, integrating, and transformation of data stand out. Next, the data, through data delivery, is moved to the open marts and ready for harvesting.
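The acquisition stages named above (extraction, cleansing, integration, transformation, and delivery) can be sketched as a small pipeline. Everything in this example is an assumption made for illustration: the two source systems (crm_rows, pos_rows), their field names, and the cleansing rules are hypothetical, and a Python list stands in for the warehouse.

```python
from datetime import datetime

# Hypothetical raw extracts from two source systems with inconsistent formats.
crm_rows = [{"customer": " becker ", "date": "23/11/2012", "kg": "1"}]
pos_rows = [
    {"cust_name": "ROSS", "sold_on": "2012-11-24", "weight_kg": 3.0},
    {"cust_name": "", "sold_on": "2012-11-24", "weight_kg": 2.0},  # dirty row
]


def extract():
    """Extraction and integration: read both sources into one common shape."""
    for r in crm_rows:
        yield {
            "customer": r["customer"],
            "date": datetime.strptime(r["date"], "%d/%m/%Y").date(),
            "kg": r["kg"],
        }
    for r in pos_rows:
        yield {
            "customer": r["cust_name"],
            "date": datetime.strptime(r["sold_on"], "%Y-%m-%d").date(),
            "kg": r["weight_kg"],
        }


def transform(rows):
    """Cleansing and transformation: normalize names, drop rows with no customer."""
    for r in rows:
        name = r["customer"].strip().title()
        if not name:
            continue
        yield (name, r["date"].isoformat(), float(r["kg"]))


def load(rows, warehouse):
    """Delivery: append the cleaned rows to the target store."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(extract()), [])
print(warehouse)
```

Keeping each stage as a separate function mirrors the way the stages are described in the text and makes it easy to swap the list for a real database loader.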

Advantages of data warehousing

This process makes accurate data more accessible, so that end users do not have to fumble through scores of unsorted data in order to answer their queries. Consequently, accessing that information becomes cheaper and more efficient.

It reduces the costs of acquiring this data because the accessibility means that users do not need to spend additional resources on fruitless tasks; in addition, these resources can be expended elsewhere. Another advantage is that it increases the competitive advantage of the enterprise that integrates it into its infrastructure.

The data in a data warehouse can be used in multiple scenarios, including producing reports for long-term analyses, producing reports that aggregate enterprise data, and producing reports that are multidimensional; for instance, a query can be lodged on the profits accrued by month, product, and branch.

The information stored in a warehouse provides a basis for strategic decision-making, it is available for access, and it is consistent. Additionally, it assists in introducing an organization to the continuous changes in information within the enterprise. Finally, it helps protect the data from abusers.

Disadvantages of data warehousing

Data warehousing is a very costly investment, which is bound to dig into the capital pool of the enterprise that adopts it. Additionally, it takes a lot of time to get the project underway and see it to completion, typically anywhere between two and six months. The time matters because the data-warehousing infrastructure being installed may end up obsolete by the time it goes into production.

Business is volatile by nature and therefore vulnerable to this risk: in contemporary times, even formerly static fields like finance undergo multiple changes within such a period in order to increase sales. In such a scenario, the data warehousing technique may be relevant at the onset of installation but obsolete by the end of the project.

It is also worrying that colleges and other institutions are churning out new data warehousing experts every other day. The effect on the industry is troubling because these graduates are eager to apply what they have learnt in school, yet they have not practiced and lack quality experience.

Ultimately, they install data warehouses that are slow or ineffective because of sticking to ideals that may not be practical in real life scenarios.

Another disadvantage is that, because data warehousing delivers results so efficiently, organizational users may be tempted to use the data warehouse inappropriately.

This scenario occurs when the data warehouse is used to replace the operational systems or the reports that operational systems normally produce, or to analyze current operational results. It is noteworthy that the two systems are not supposed to be used interchangeably; on the contrary, they should complement each other.

OLTP and Data Warehousing Environments

Before getting to the contrasts, it is important to create a background that is relevant to this discourse. With that in mind, a data warehouse “is a relational database, which is designed for queries and analyses rather than for transaction processing” (Imhoff, Galemmo, & Geiger, 2003, p. 111).

Consequently, it comprises historical data as well as data from other sources, which in most cases falls into the category of unstructured data. The surrounding environment features the following components:

ETL solution

This component comprises the extraction, transportation, transformation, and loading stages that are required for unstructured data to be cleaned and transformed into an integrated block of information.

Online Analytical Processing Engine (OLAP)

This component underscores the reporting and analyzing system that processes business data. It is deliberately de-normalized in order to ensure fast data retrieval. As a result, instead of the update and insert features that are commonplace for OLTP, this system features SELECT operations that are ideal for queries (Jarke, Lenzerini, Vassiliou, & Vassiliadis, 2003, p. 54).

A good example is a department store. At the point of sale, the cashier looks up the price list and charges the customer’s credit card; this amounts to a transaction, so OLAP is not in play (Hackney, 2007, p. 39).

However, if the store manager were to require a list of out-of-stock products, he would turn to the OLAP operation to retrieve that data.

With the environment of a data warehouse surveyed, it is important to look into the founding father’s perspective, as it forms the basis of the contrast between OLTP and data warehousing environments. William Inmon’s definition of a data warehouse, mentioned above, brings four distinguishing features to mind:

  • Subject oriented. A data warehouse can be designed to analyze a particular subject, for example, the sale of Ferraris. It is then possible to arrive at, say, the best customer for Ferraris in June 2012. This aspect is known as subject orientation.
  • Integrated. This feature is in reference to an organization and so it is safe to say that it is an organizational feature. At this point, it is apparent that in an organizational context, there exist various sources of data.

The cumulative effect of this aspect is that the bulk of the data will be disparate and inconsistent and thus the job of ensuring that this data goes through consolidation and alignment into a sensible platform belongs to the data warehouse (Bertman, 2005, p. 41).

In the course of executing this task, various challenges are expected to emerge. These challenges must be resolved, and a data warehouse that resolves them qualifies as an integrated data warehouse.

  • Time variant. The idea behind data warehousing is to carry out an analysis that spans a given period and the width of its scope may be infinite. This aspect explains why data warehouses contain historical data ranging back years or decades.

This element is very different from Online Transaction Processing (OLTP) systems, which move historical data into archives to make room for current data. On the contrary, data warehousing analysts need large volumes of data in order to glean change over time, which underscores the concept of time variance.

  • Non volatile. This feature refers to the stability of data once it has been loaded into the data warehouse. The data warehouse should maintain the information in the state in which it was initially entered. There should not be any deletions or other alterations; otherwise, the information would be jumbled and too inaccurate to use in the analysis of business intelligence.
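The time-variant and non-volatile properties can be illustrated together. The sketch below is one possible way to enforce them, not a method taken from the cited sources: it assumes a hypothetical price_history table in SQLite, keeps every historical row with a valid_from date, and uses triggers with RAISE(ABORT, ...) to reject updates and deletes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Time variant: every price is stored with the date it became valid.
CREATE TABLE price_history (product TEXT, price REAL, valid_from TEXT);

-- Non volatile: loaded rows may not be changed or removed.
CREATE TRIGGER price_history_no_update BEFORE UPDATE ON price_history
BEGIN
    SELECT RAISE(ABORT, 'warehouse rows are read-only');
END;
CREATE TRIGGER price_history_no_delete BEFORE DELETE ON price_history
BEGIN
    SELECT RAISE(ABORT, 'warehouse rows are read-only');
END;
""")


def record_price(product, price, valid_from):
    """Loading only ever appends a new row."""
    conn.execute(
        "INSERT INTO price_history VALUES (?, ?, ?)", (product, price, valid_from)
    )


def price_as_of(product, iso_date):
    """Historical lookup: the latest price valid on or before iso_date."""
    row = conn.execute(
        """
        SELECT price FROM price_history
        WHERE product = ? AND valid_from <= ?
        ORDER BY valid_from DESC LIMIT 1
        """,
        (product, iso_date),
    ).fetchone()
    return row[0] if row else None


def try_overwrite():
    """Attempt an in-place update; returns False when the trigger blocks it."""
    try:
        conn.execute("UPDATE price_history SET price = 9.9")
        return True
    except sqlite3.DatabaseError:
        return False


record_price("soya", 2.0, "2012-01-01")
record_price("soya", 2.5, "2012-06-01")
print(price_as_of("soya", "2012-03-15"), try_overwrite())
```

Because old rows are never overwritten, a question about any past date can still be answered after newer data has been loaded.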

Contrast between OLTP and Data Warehousing Environments

Data warehouses accommodate ad hoc queries, which is to say that the queries they deal with are not known in advance. The ideal system should perform well across a wide array of possible questions in various categories. On the other hand, OLTP systems rely on the predefinition of key operations. It follows that their applications are specifically tuned or designed for those preset operations.

Data modifications

Data warehouses are updated regularly through the ETL process (extraction, transportation, transformation, and loading). The process is set to run nightly or weekly depending on organizational preferences. To accomplish this, the enterprise employs bulk data modification techniques; the end users do not individually update the data warehouse.

On the contrary, in OLTP systems, “the end users are responsible for system updates and they do this by way of routinely issuing individual modification statements to the database warehouse; consequently, the database is always up to date” (Reddy, Rao, Srinivasu, & Rikkula, 2010, p. 2869).
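The two modification styles can be contrasted in a short sketch. The table names (warehouse_sales, oltp_stock) and the functions nightly_bulk_load and sell_one are hypothetical; the example only assumes Python's sqlite3 module, where executemany inserts a whole batch and a connection used as a context manager wraps the work in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse_sales (sale_date TEXT, product TEXT, amount REAL)")
conn.execute("CREATE TABLE oltp_stock (product TEXT PRIMARY KEY, on_hand INTEGER)")
conn.execute("INSERT INTO oltp_stock VALUES ('soya', 10)")
conn.commit()


def nightly_bulk_load(batch):
    """Warehouse style: a scheduled job inserts many rows in one transaction."""
    with conn:  # commits on success, rolls back if any row fails
        conn.executemany("INSERT INTO warehouse_sales VALUES (?, ?, ?)", batch)
    return conn.execute("SELECT COUNT(*) FROM warehouse_sales").fetchone()[0]


def sell_one(product):
    """OLTP style: an end-user action issues one small modification statement."""
    with conn:
        conn.execute(
            "UPDATE oltp_stock SET on_hand = on_hand - 1 WHERE product = ?",
            (product,),
        )
    return conn.execute(
        "SELECT on_hand FROM oltp_stock WHERE product = ?", (product,)
    ).fetchone()[0]


batch = [
    ("2012-11-23", "soya", 2.5),
    ("2012-11-23", "maize", 4.0),
    ("2012-11-24", "soya", 5.0),
]
print(nightly_bulk_load(batch), sell_one("soya"))
```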

Schema design

Data warehouses “use fully or partially de-normalized schemas such as the star schema for optimal query performance” (Reddy, Rao, Srinivasu, & Rikkula, 2010, p. 2870). On the other hand, OLTP systems use normalized schemas to optimize update, insert, and delete performance and to guarantee data consistency, because they are transactional and the accuracy of current information is critical.

Typical operations

For data warehouses, the typical operation is querying. They need the capacity to scan thousands or even millions of rows to come up with the required result. A good example of such a demanding query is finding the total sales for all the cashiers for the last month.

On the other hand, OLTP systems have a lighter burden to contend with in terms of volume. A transactional operation touches only a handful of records at a go; for instance, it retrieves the current price for a customer’s order.
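The two kinds of operation can be placed side by side in a short sketch. The sales and prices tables, their columns, and the sample rows are assumptions made for illustration; the point is only the difference in shape between an aggregate that reads every matching row and a lookup that reads one row by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY, cashier TEXT, sale_date TEXT, amount REAL
);
CREATE TABLE prices (product TEXT PRIMARY KEY, current_price REAL);
INSERT INTO sales (cashier, sale_date, amount) VALUES
    ('ann', '2012-10-03', 20.0), ('ann', '2012-10-19', 30.0),
    ('bob', '2012-10-07', 15.0), ('bob', '2012-11-01', 99.0);
INSERT INTO prices VALUES ('soya', 2.5);
""")


def total_sales_by_cashier(month):
    """Warehouse-style query: scans and aggregates every sale in the month."""
    rows = conn.execute(
        """
        SELECT cashier, SUM(amount) FROM sales
        WHERE substr(sale_date, 1, 7) = ?
        GROUP BY cashier ORDER BY cashier
        """,
        (month,),
    ).fetchall()
    return dict(rows)


def current_price(product):
    """OLTP-style query: fetches a single row by its primary key."""
    row = conn.execute(
        "SELECT current_price FROM prices WHERE product = ?", (product,)
    ).fetchone()
    return row[0] if row else None


print(total_sales_by_cashier("2012-10"), current_price("soya"))
```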

Historical data

Due to the nature and the intended use of data warehouses, it is relevant for them to store up to decades of information in a region that is easily accessible when queries are executed. Such a structure is ideal for historical analyses. On the contrary, OLTP systems are just the opposite.

They store up data for at most a few weeks or months and only retain historical data as is relevant for the current transaction. Moreover, this additional historical data is stored up in archives and a special retrieval process is necessary when it becomes relevant or necessary.

Hardware and I/O Considerations in Data Warehouses

Scalability.

It is important to ensure that the data warehouse can grow as the volume of stored data grows. To guarantee this, it would be wise to choose RDBMS and hardware platforms that are adequately structured to handle large volumes of data efficiently (Kimball, Reeves, Ross, & Thornthwaite, 1998, p. 90).

However, this move may be a difficult task to embark on in advance when it is still not apparent what amount of data shall be stored in the data warehouse in its maturity. This realization explains why it is also advisable to approximate the amount and use it as a basis in setting up the data warehouse.

Parallel Processing Support

It is necessary to refrain from using one CPU as the main processor and instead use multiple CPUs each performing a related part of the task separately but simultaneously (South, 2012, p. 67).
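The idea of splitting one task into related parts that run at the same time can be sketched with the standard concurrent.futures module. This is an illustration under stated assumptions: the helper names (partition, partial_total, parallel_total) are hypothetical, and worker threads stand in for separate CPUs to keep the example simple; for CPU-bound work, ProcessPoolExecutor from the same module would be the closer analogue.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split the rows into n_parts roughly equal slices."""
    return [rows[i::n_parts] for i in range(n_parts)]


def partial_total(chunk):
    """The part of the task that each worker performs on its own slice."""
    return sum(chunk)


def parallel_total(rows, n_workers=4):
    """Run the partial sums concurrently, then combine them into one answer."""
    chunks = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_total, chunks))
    return sum(partials)


amounts = list(range(1, 1001))  # hypothetical sale amounts
print(parallel_total(amounts))  # same result as sum(amounts)
```

The final combining step is what makes the separate pieces "related parts of the major task": each worker's result is meaningless on its own until the partial totals are added together.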

RDBMS – Hardware combination

The combination of RDBMS and hardware matters because the RDBMS sits physically on top of the hardware platform, and this arrangement may bring issues with bugs and bug fixing (Kimball & Ross, 2002, p. 26).

eBay data warehouse (structure)

Oliver Ratzesberger and his team at eBay are responsible for two of the world’s largest data warehouses. The Greenplum data warehouse, which is fully equipped with a data mart, comprises 6.5 petabytes of user data, which translates to more than 17 trillion records, and “each day, an additional 150 billion new records are added and this amounts to 100 days of event data” (Dignan, 2010, Para. 12).

The ultimate goal is to reach 90-180 days of event data. The system sustains an impressive 200 MB/node/sec of I/O. This rate improves further because the number of concurrent end users is kept small.

The second data warehouse is “a Teradata warehouse with two (2) petabytes of user data, which is fed by tens of thousands of production databases” (Miller & Monash, 2009, Para. 6).

Its speed is 140 GB/sec of I/O, or 2 GB/node/sec. Using resource partitions, eBay relies on workload management software to deliver on numerous Service-Level Agreements (SLAs) simultaneously.
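The figures quoted for the two eBay warehouses can be cross-checked with some quick arithmetic. The sketch below uses only the numbers given above and assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes), which the sources do not specify; the derived values are therefore rough estimates, not reported facts.

```python
PB = 10**15  # assumption: decimal petabyte
GB = 10**9   # assumption: decimal gigabyte

# Greenplum warehouse: 6.5 PB of user data holding more than 17 trillion records.
greenplum_bytes = 6.5 * PB
greenplum_records = 17 * 10**12
bytes_per_record = greenplum_bytes / greenplum_records  # about 382 bytes

# 150 billion new records per day, kept for 100 days of event data.
records_per_day = 150 * 10**9
days_retained = 100
records_in_window = records_per_day * days_retained  # 15 trillion

# Teradata warehouse: 140 GB/sec in total at 2 GB/node/sec.
teradata_nodes = (140 * GB) / (2 * GB)  # 70 nodes

print(round(bytes_per_record), records_in_window, teradata_nodes)
```

The 15 trillion event records implied by the daily rate sit just under the "more than 17 trillion records" total, so the quoted figures are at least mutually consistent.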

This paper has addressed the topic of data warehousing. It has touched on the system’s definitions, characteristics, advantages and disadvantages, contrasts with OLTP, and even hardware considerations. Finally, it has concluded by looking into eBay’s data warehousing, an exemplary system that most organizations throughout the globe envy and would be wise to learn from.

Barwick, H. (2012). Security, Business Intelligence ‘critical’ for Australian CIOs in 2013: Telstyle. Web.

Bertman, J. (2005). Dispelling Myths and Creating Legends: Database Intelligence Groups. Web.

Dignan, L. (2010). eBay’s Teradata implementation headed to 20 petabytes. Web.

Eliott, T. (2012). Rethinking Business Intelligence: 3 Big New Old Ideas. Web.

Hackney, D. (2007). Picking a Data Mart Tool. Web.

Imhoff, C., Galemmo, N., & Geiger, J. (2003). Mastering Data Warehouse Design: Relational and Dimensional Technique. Indianapolis, IN: Oxford University Press.

Imhoff, C., & White, C. (2011). Self-Service Business Intelligence Empowering Users to Generate Insights. Web.

Inmon, W. (2005). Building the Data Warehouse. Indianapolis, IN: Wiley.

Jarke, M., Lenzerini, M., Vassiliou, Y., & Vassiliadis, P. (2003). Fundamentals of Data Warehousing (2nd edn.). New York, NY: Springer.

Kimball, R., Reeves, L., Ross, M., & Thornthwaite, W. (1998). Data Warehouse Lifecycle Toolkit: Expert methods for Designing, Developing, and Deploying Data Warehouses. Indianapolis, IN: Wiley.

Kimball, R., & Ross, M. (2002). The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (2nd edn.). Indianapolis, IN: Wiley.

Kimball, R., Ross, M., Thornthwaite, W., Mundy, J., & Becker, B. (2008). Data Warehouse Toolkit: Practical Techniques for Building Data warehouse and Business Intelligence Systems (2nd edn.). Indianapolis, IN: Wiley.

Miller, R., & Monash, C. (2009). eBay’s two enormous data warehouses. Web.

Prine, G. (1998). Coherent Data Warehouse Initiative. London, UK: Unisys Presentations.

Reddy, S., Rao, M., Srinivasu, R., & Rikkula, S. (2010). Data Warehousing, Data Mining, OLAP and OLTP Technologies are Essential Elements to Support Decision-Making Process in Industries. International Journal of Computer Science and Engineering, 2(9), 2865-73.

South, G. (2012). Small business: Savings lead to a Stellar business. New Zealand Herald, 67.

IvyPanda. (2019, June 18). Data Warehouse. https://ivypanda.com/essays/data-warehouse-3/

Data Warehousing

A data warehouse is a system for storing data so that it can be linked and presented. It integrates data from various sources and lets users populate reports that answer specific questions. Unlike the operational systems that handle daily transactions, a data warehouse is designed to preserve large quantities of interrelated, historical data for both analysis and reporting. It makes it easier to reconstruct snapshots of past data and to relate such data over time using a defined rule. Data warehousing is thus the construction of an extensible environment designed for the analysis of non-volatile data, logically and physically transformed from various source applications to align with the structure of the business. In a warehouse, the data is typically updated and retained over a long period, expressed in simple business terms, and summarized for quicker analysis (Inmon, H., 1995).

Elements of data warehousing

The data replication managers

They manage the copying and distribution of data across databases as defined by information users. The users specify which data needs copying and what its source and destination platforms are. The managers also carry out the data transformations. In a refresh, a manager copies an entire data source; in an update, it propagates only the changes made to particular data items.
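The difference between a refresh and an update can be illustrated with a minimal sketch. It assumes tables are plain dictionaries keyed by a primary key; the names `refresh`, `update`, `changes`, and `deleted` are hypothetical and do not come from any particular replication product.

```python
from typing import Dict, Iterable

Row = Dict[str, object]
Table = Dict[int, Row]  # rows keyed by a (hypothetical) primary key


def refresh(source: Table) -> Table:
    """Refresh: copy the entire source, replacing whatever the target held."""
    return {key: dict(row) for key, row in source.items()}


def update(target: Table, changes: Table, deleted: Iterable[int] = ()) -> Table:
    """Update: apply only the changed and deleted rows to the existing target."""
    removed = set(deleted)
    result = {key: dict(row) for key, row in target.items() if key not in removed}
    for key, row in changes.items():
        result[key] = dict(row)  # insert new rows or overwrite changed ones
    return result


source = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
target = refresh(source)  # full copy on the first run
target = update(
    target,
    changes={2: {"name": "Robert"}, 3: {"name": "Cy"}},
    deleted=[1],
)
print(target)
```

A refresh is simpler but moves every row each time, while an update moves only the delta, which is why it tends to be preferred once the target is populated.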

The informational database

This component organizes and stores copies of the data from the various data sources. A decision support server transforms and aggregates the data to make it more valuable than it was in the individual sources. The database also stores both the system-level metadata and the semantic-level metadata.

The information directory

This combines the functions of a technical directory, a business directory, and an information navigator. Its main use is to help information users find out what data is available in the organization's different databases, what format the data is in, and how to access it. It also helps Database Administrators (DBAs) manage the data warehouse. The information directory obtains its contents by discovering which databases are available on the network and then querying their metadata repositories. DBAs use the information directory to access system-level metadata and to keep track of data sources, targets, cleanup rules, transformation rules, and details of predefined queries and reports (Ganczarski, J., 2009).
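As a rough illustration of a directory that learns what data exists by querying a database's own metadata, the sketch below uses Python's built-in `sqlite3` module. SQLite and the `customers` and `sales` tables are assumptions chosen for the example; the essay does not name a specific database product.

```python
import sqlite3


def describe_database(conn: sqlite3.Connection) -> dict:
    """Return {table_name: [(column_name, declared_type), ...]} from SQLite metadata."""
    directory = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields one row per column:
        # (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        directory[table] = [(col[1], col[2]) for col in columns]
    return directory


# Hypothetical source database with two tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)

catalog = describe_database(conn)
for table, columns in catalog.items():
    print(table, columns)
```

A real information directory would also record business descriptions, cleanup rules, and transformation rules, which cannot be read from the schema alone.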

DSS tool support

A Database Administrator also has to gather data from different sources, replicate it, clean it, store it, catalogue it, and make it available to other tools such as data mining tools, which help discover relevant information in large data volumes. Data mining also tries to discover rules and patterns automatically from the organization's data (Abdullah, A., 2009).

Advantages of data warehousing

  • Data warehousing provides a common data model for all data of interest, regardless of its source. This makes reporting and analysis easier.
  • Data warehousing identifies and resolves inconsistencies before data is loaded, which significantly simplifies reporting and analysis.
  • Data warehousing ensures that information is kept safe, even for very long periods, by keeping it under the organization's control.
  • Data warehouses allow data to be retrieved efficiently without slowing down the operational systems.
  • Data warehouses notably improve the value of operational business applications, especially Customer Relationship Management (CRM) systems.
  • Data warehouses support decision support system applications such as trend reports, exception reports, and reports that show actual performance against an organization's goals.

Disadvantages of data warehousing

  • Data warehouses are not an ideal environment for unstructured data. Because data has to be extracted, transformed, and loaded into the warehouse, the warehoused data also carries an element of latency.
  • Data warehouses that are maintained over a long period can incur very high costs.
  • Data warehouses can become outdated relatively quickly.
  • There is a cost to delivering suboptimal information to the company.

There is often a fine line between data warehouses and operational systems. Duplicate, costly functionality may be developed. Or functionality may be developed in the data warehouse that, in retrospect, ought to have been developed in the operational systems (Yang, J., 1998).

The future of Data Warehousing

Data warehousing, like other technologies, has a history of innovations that did not gain market acceptance. A 2009 Gartner Group report predicted the following developments in the business intelligence and data warehousing market (Gartner Reveals Five Business Intelligence Predictions for 2009 and Beyond, 2009). Because of a lack of information, processes, and tools, by 2012 more than 35% of the top 5,000 global companies would regularly fail to make insightful decisions about significant changes in their business and markets (Abdullah, A., 2009).

By 2012, business units would control at least 40% of the total budget for data warehousing and business intelligence. By 2010, 20% of companies would have an industry-specific analytic application delivered via software as a service (SaaS) as a standard component of their data warehousing and business intelligence portfolio. In 2009, collaborative decision making would emerge as a new product category combining social software with the capabilities of data warehousing and business intelligence platforms. By 2012, a third of the analytic applications applied to business processes such as warehousing would be delivered through coarse-grained application mashups (Yang, J., 1998).

Raw data may be too large to keep in a warehouse. This can be addressed by storing only summary data obtained by aggregating over a relation, rather than keeping the entire relation. Data warehousing is a very broad topic, and it is hard to sum it up in one short paper. This paper introduced the central concepts of data warehousing. It is important to note that data warehousing is a discipline that continues to develop. Many of the design and development concepts introduced in this paper largely determine the quality of the analysis that is possible with the information in the data warehouse. If invalid or corrupt data is allowed into the data warehouse, analysis done with this data is likely to be invalid (Inmon, H., 1995).
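The idea of keeping only aggregated summary data instead of the full relation can be sketched as follows. The `raw_sales` rows and the `summarize` helper are hypothetical and serve only to illustrate the aggregation step.

```python
from collections import defaultdict

# Hypothetical raw relation: one row per individual sale.
raw_sales = [
    {"region": "North", "month": "2009-01", "amount": 120.0},
    {"region": "North", "month": "2009-01", "amount": 80.0},
    {"region": "North", "month": "2009-02", "amount": 50.0},
    {"region": "South", "month": "2009-01", "amount": 200.0},
]


def summarize(rows, keys, measure):
    """Aggregate rows by the given key columns, keeping a count and a total."""
    totals = defaultdict(lambda: {"count": 0, "total": 0.0})
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group]["count"] += 1
        totals[group]["total"] += row[measure]
    return dict(totals)


summary = summarize(raw_sales, keys=("region", "month"), measure="amount")
for group, stats in sorted(summary.items()):
    print(group, stats)
```

The summary holds one row per region and month rather than one row per sale, at the price of no longer being able to answer questions about individual transactions.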

After the rapid adoption of data warehousing systems over the last three years, there will continue to be many improvements and changes to the data warehousing model. Further advances in hardware and software technology will also continue to shape the capabilities that are built into data warehouses. Data warehousing systems have become a fundamental component of information technology architecture. A flexible enterprise data warehouse strategy can yield substantial benefits over a long period.


Data Warehousing and Big Data in Business

Introduction.

Data warehousing and Big Data have both been designed to facilitate business analytics. Data Warehousing is an established information management concept that is championed by vast and well-established methodologies. However, Big Data is still a paradigm under development, which aims to address individual aspects of the considerable data volume challenge but does not have an integrated solution.

Some research portrays Big Data as a Data Warehouse evolution, others as its replacement; some recommend extending Data Warehouse to support some Big Data features and the probability of merging the two. However, it is essential to note that Data Warehouse overrides Big Data on data quality; hence, some organizations that do not want to compromise data quality and governance are reluctant to shift. This paper aims to differentiate between Data Warehouse and Big Data in relation to their structure, principles, and mode of action.

Data Warehousing

Components of a data warehouse (DW) architecture

A DW is an information environment that contains data obtained from multiple sources and stored in a uniform schema. The information present is used to perform queries and analyses, hence enabling improved decision-making. DW has different architectures that comprise centralized, federated, hub and spoke, and independent data marts (Ariyachandra & Watson, 2008). Architecture refers to the proper arrangement of components in a system.

A DW is usually built with software and hardware components that are arranged in a specific manner to suit the organizational requirements and for maximum benefits. As a result, a typical DW usually consists of the following basic components: data source, data staging, data storage, information delivery, metadata, and management and control. DWs in every organization utilize the same building blocks; however, the difference lies in how they are arranged, and thus some components might be made to be stronger than others (Ariyachandra & Watson, 2008).

In the source data component, the electronic information entering a warehouse is obtained from several systems. It can also include internal data, archived data, and external data. This is followed by the data staging component that entails the preparation of extracted data to be stored for querying and analysis. The methods of preparation include extraction, transformation, and loading (ETL). During extraction, it is essential to ensure that the most suitable techniques are employed for each data source. This is because the data might be from several source machines that are in different formats. Data transformation is a critical step in the full web data integration process.

Moreover, depending on the integration process, data may need to be cleaned, merged, deduplicated, converted, and summarized for storage and use. This is followed by data loading, which entails two distinct processes. When the warehouse goes live for the first time, large volumes of data are loaded into the repository, which can take a significant amount of time. As the data repository continues functioning, changes in the data sources are continuously extracted and transformed, and the incremental revisions are fed to the repository on an ongoing basis.
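The two loading processes described above, an initial full load followed by ongoing incremental loads, can be sketched in a few lines. The row layout, the `modified` timestamp used to detect changes, and the function names are all assumptions made for this illustration; real ETL tools detect changes in various other ways.

```python
from datetime import date


def extract(source_rows, since=None):
    """Extract every row (initial load) or only rows modified after `since`."""
    return [r for r in source_rows if since is None or r["modified"] > since]


def transform(rows):
    """Clean names and deduplicate on `id`, keeping the most recent version."""
    latest = {}
    for r in sorted(rows, key=lambda r: r["modified"]):
        latest[r["id"]] = {
            "id": r["id"],
            "name": r["name"].strip().title(),
            "modified": r["modified"],
        }
    return list(latest.values())


def load(warehouse, rows):
    """Insert new rows or overwrite existing ones in the warehouse store."""
    for r in rows:
        warehouse[r["id"]] = r
    return warehouse


source = [
    {"id": 1, "name": " alice ", "modified": date(2021, 1, 5)},
    {"id": 2, "name": "BOB", "modified": date(2021, 1, 6)},
]

# Initial load: everything in the source goes through the pipeline.
warehouse = load({}, transform(extract(source)))
last_run = date(2021, 1, 6)

# Later, the source changes; only the delta is extracted and loaded.
source.append({"id": 2, "name": "bob smith", "modified": date(2021, 2, 1)})
source.append({"id": 3, "name": "carol", "modified": date(2021, 2, 2)})
delta = extract(source, since=last_run)
load(warehouse, transform(delta))

for key in sorted(warehouse):
    print(key, warehouse[key]["name"])
```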

In the data storage component, data for the warehouse is separately kept from that of operational systems as it has to be stored in a form suitable for analysis, that is, read-only. Therefore, when analysts perform an evaluation, they are assured that it is stable and represents snapshots at particular periods. The fourth component – information delivery – comprises different methods such as complex queries, Executive information systems (EIS) feed, multidimensional analysis, and data mining. This information is essential for novice users as it enables them to perform complex analyses.

The fifth is the metadata component which is referred to as the data catalog in a database management system. It contains information regarding the logical data structures, files and addresses, and indexes among others. Last is the management and control component which sits on top of all other building blocks. This is because it coordinates services and activities in the DW.

Current key trends in data warehousing

DW is no longer a new area of study and implementation. It has become mainstream in businesses of every size. DW has transformed business analytics and how it is used in decision-making. In 2017, the global data warehousing market was valued at $18.61 billion, with the BFSI segment accounting for the largest share. Moreover, it is expected to increase at a CAGR of 8.2% from 2018 to 2025 (Gaul, 2019). This growth is attributed to the heightened need for an efficient repository system for the increasing data volume and to the demand for low-latency, real-time views and analytics for Big Data. However, high implementation costs hold back the expansion of the data warehousing market, especially among small and medium-sized enterprises.
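To see what a compound annual growth rate (CAGR) of 8.2% implies, the following sketch compounds the 2017 valuation forward. It assumes that 2017 is the base year and that growth compounds once per year through 2025; the cited report may use a different base, so its own forecast figure can differ from this arithmetic.

```python
def project(value: float, rate: float, years: int) -> float:
    """Compound `value` at annual growth `rate` for `years` periods."""
    return value * (1 + rate) ** years


base_2017 = 18.61  # market size in billions of USD, from the text
cagr = 0.082       # 8.2% per year, from the text

# Assumption: 2017 is period zero and each later year adds one period.
for year in range(2018, 2026):
    print(year, round(project(base_2017, cagr, year - 2017), 2))
```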

Currently, North American companies have dominated the data warehousing market share, and approximately 90% of multinational corporations have committed to data warehousing (Gaul, 2019). Furthermore, in relation to the market segment, the on-premise deployment dominated the cloud and hybrid because of the presence of conventional IT infrastructure in most of the companies (Gaul, 2019).

Understanding of Big Data (BD) and how it is applied

BD refers to large volumes of structured and unstructured data that cannot be stored or processed by conventional computing techniques within a given duration (Grable & Lyons, 2018). Its purpose is to reveal hidden patterns.

BD has three main characteristics: volume, velocity, and variety. Based on these three dimensions, BD can be described as high-volume, high-velocity, and high-variety data assets that require innovative and cost-effective processing techniques to enhance decision-making. Volume is defined as the sheer scale of processed information. Today, the volume of data that organizations collect from people continues to grow and is measured in petabytes.

Volume presents the greatest opportunity, as BD could enable corporations to understand people better and allocate resources more efficiently. However, conventional relational database techniques do not scale to data of this magnitude. Velocity, on the other hand, is defined as the rate at which information flows into organizations. Users are increasingly demanding real-time data, which can prove to be a challenge for conventional analytics since data is in constant motion. Lastly, variety comprises the data types to be processed. BD contains both structured and unstructured raw data formats.

The significant data variety cannot be categorized and processed by traditional analytical methods. Other characteristics of BD include veracity, which focuses on ambiguity, in other words, noise and abnormalities within data. The presence of a vast volume of data makes it challenging to differentiate between essential data and distractions.

To deal with the challenges of traditional analytical techniques in regards to the three main dimensions of BD, researchers have designed “predictive analytics.” The power of predictive analytics is based on the development of learning algorithms that identify patterns having predictive power (Grable & Lyons, 2018). As a result, BD overcomes the challenges of traditional methods and enables organizations to gain insight and make well-informed decisions from a large volume of data. Currently, the most publicized areas of BD use are in the retail industry to understand consumers’ behaviors and preferences.

Organizations are dedicated to expanding their traditional data with other information from social networks and browser logs to have a clear and comprehensive picture of their clients. For example, Walmart can predict which products will sell. Walmart is a global retailer, and thus to efficiently conduct its operations, it created an analytics hub known as the Data Café. The Data Café allows petabytes of information to be quickly modeled, manipulated, and visualized.

Demands BD is placing on organizations and data management technology

Beyond its benefits to various sectors, BD is a technological innovation with the capacity to alter data management technology and organizations in many ways. For instance, organizations that have adopted BD have already restructured. Procter & Gamble has built control towers to maintain continuously updated control of its supply chain (Slinger & Morrinson, 2014).

Companies are restructuring themselves as BD allows flexible resource allocation; hence, enterprises are finding it easier to move employees, capital, and other resources across roles, sites, and positions. If this continues at the enterprise level, BD might change organizational structure because large corporations will restructure to add a structural dimension. Some new functions will integrate BD operations that will differentiate themselves from the rest of the organization.

Moreover, BD might change the source of influence in an organization. For instance, HiPPOs (Highest-Paid People in the Organization) report that their judgment-based decision-making has sometimes been overruled by data-driven decision-making (Slinger & Morrinson, 2014). Intriguingly, Slinger and Morrinson (2014) also claim that the restructuring of organizations will generate new tensions, and that the most successful enterprises will be those that handle the resulting conflicts of interest effectively and efficiently.

In regards to data management technology, the availability of BD in analyzing large data volumes has shifted the business towards data-driven decision-making. Organizations and vendors are continually researching new tools and models to improve the current BD utilization. New models and concepts are continuously appearing in the market, while the old ones are fading away. There are presently efforts to employ the Internet of Things to merge streaming analytics and machine learning. Usually, machine learning uses stored data for training in a controlled learning environment.

However, in the new model, the availability of streaming data will enable the provision of real-time data to facilitate learning in a less controlled environment. Furthermore, there are efforts to use Artificial Intelligence (AI) in processing BD. This is considered a significant improvement as it will lead to the faster and efficient gathering of business intelligence.

Information is a valuable primary resource of enterprises as it supports decision-making. With the advancement of technology, the volume of data is increasing substantially, which makes storing, updating, and efficiently exploiting it increasingly complex. DW and BD are concepts that have been designed to mitigate these challenges. However, DW is more of a conventional approach, and its efficiency has been hampered by the ever-increasing volume of data. This led to the development of BD. Nevertheless, the scope of BD is not clearly understood, and it still lacks a standardized proposal; therefore, its future success or failure cannot be predicted.

Ariyachandra, T., & Watson, H. J. (2008). Which data warehouse architecture is the best? Communications of the ACM, 51 (10), 146-147. Web.

Gaul, V. (2019). Data warehousing market by type of offering (Extraction, Transportation & Loading (ETL) solutions, statistical analysis, data mining, and others), type of data (Unstructured and semi-structured & structured), deployment (On-premise, cloud, and hybrid), organization size (Small & medium sized enterprises and large enterprises), and industry vertical (BFSI, telecom & IT, government, manufacturing, retail, healthcare, media & entertainment, and others): Global opportunity analysis and industry forecast, 2018 – 2025 . Portland, OR: Allied Market Research. Web.

Grable, J. E., & Lyons, A. C. (2018). An introduction to Big Data. Journal of Financial Service Professionals, 72 (5), 17-20.

Slinger, G., & Morrinson, R. (2014). Will organization design be affected by BD? Journal of Organization Design, 3 (3), 17-26. Web.


StudyCorgi. (2021, August 8). Data Warehousing and Big Data in Business. https://studycorgi.com/data-warehousing-and-big-data-in-business/

Before publication, the StudyCorgi editorial team proofread and checked the paper to make sure it meets the highest standards in terms of grammar, punctuation, style, fact accuracy, copyright issues, and inclusive language. Last updated: August 8, 2021 .


Data Warehousing - What is it and How to Use it

18 Oct 2022

Format: APA

Academic level: Master’s

Paper type: Research Paper


A data warehouse is a system that integrates data from a wide range of sources, which helps in the running of business intelligence. Data warehouses are often preloaded with large amounts of data, which allows queries and analyses that guide businesses on how to invest. In simpler terms, data warehousing gathers information from various sources to produce more accurate results that could profit businesses. For instance, the data from cash registers in a company and its website ratings occur in different locations. Using data warehousing could combine the data and assess the general customer service of the business.
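The cash register and website rating example can be made concrete with a small sketch. The two record lists, the shared `store` key, and the `combine` function are hypothetical; they only illustrate how data held in different locations might be brought into one view.

```python
from statistics import mean

# Hypothetical extracts from two separate systems.
register_sales = [
    {"store": "A", "amount": 20.0},
    {"store": "A", "amount": 30.0},
    {"store": "B", "amount": 15.0},
]
website_ratings = [
    {"store": "A", "stars": 4},
    {"store": "A", "stars": 5},
    {"store": "B", "stars": 2},
    {"store": "C", "stars": 3},
]


def combine(sales, ratings):
    """Build one record per store with total revenue and average rating."""
    view = {}
    for s in sales:
        entry = view.setdefault(s["store"], {"revenue": 0.0, "ratings": []})
        entry["revenue"] += s["amount"]
    for r in ratings:
        entry = view.setdefault(r["store"], {"revenue": 0.0, "ratings": []})
        entry["ratings"].append(r["stars"])
    return {
        store: {
            "revenue": e["revenue"],
            # A store with no ratings gets None instead of an average.
            "avg_rating": mean(e["ratings"]) if e["ratings"] else None,
        }
        for store, e in view.items()
    }


combined = combine(register_sales, website_ratings)
for store in sorted(combined):
    print(store, combined[store])
```

Seeing revenue and ratings side by side per store is the kind of combined assessment the paragraph describes; neither source system could provide it alone.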

Data warehousing could also be used to create better conditions for employees. Using data warehousing to combine their clocking in and out information, demographic data, and salary statements could inform a company how to improve its employee treatment. For organizations that have grown from mergers, data warehousing is essential because data sourcing from different areas allows the organization a holistic view of its running (Krishnan, 2019). Data warehousing is particularly preferred in data mining because it allows analysts to find patterns that could increase a business's profit.


Data warehousing accommodates a hefty workload since it supports data analysis and ad hoc queries. It also updates automatically at regular intervals and can store data for a long time. It also scans data fast because it can handle up to one million rows of records (Ballard et al., 2018). Data warehousing uses the same model for its data, regardless of the model used by the data source, giving analysts an easy time interpreting it. Further, it arranges information simply and understandably so that concerned parties who have little or no knowledge of data analysis can interpret it too.

Data warehousing can arrange information with regard to a particular subject (Ballard et al., 2018). For instance, sourcing data from the cash register could give different results, such as the number of sales for a particular item and the standard means of payment for most customers. With data warehousing, a company may focus only on the sales while disregarding the other results or channeling them to a different project. This topic arrangement makes it easier for analysts to interpret and present their findings (Elhebir et al., 2017). It also considers time by analyzing the difference in data from different periods. Data warehousing is also advantageous because data does not change once it is in a warehouse. This factor reduces the risk of intentional or unintentional data alteration.
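The three properties in this paragraph (arrangement by subject, comparison across time periods, and data that does not change once stored) can be sketched with an append-only store of immutable records. The `SalesFact` and `Warehouse` classes are hypothetical illustrations, not a description of any real warehouse product.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import date


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class SalesFact:
    snapshot_date: date
    item: str
    units_sold: int


class Warehouse:
    """Append-only store: facts can be added and read, never edited."""

    def __init__(self):
        self._facts = []

    def append(self, fact: SalesFact) -> None:
        self._facts.append(fact)

    def by_item(self, item: str):
        """Subject orientation: select only the facts about one item."""
        return [f for f in self._facts if f.item == item]

    def change(self, item: str, start: date, end: date) -> int:
        """Time variance: compare the same measure at two snapshot dates."""
        units = {f.snapshot_date: f.units_sold for f in self.by_item(item)}
        return units[end] - units[start]


wh = Warehouse()
wh.append(SalesFact(date(2022, 1, 31), "coffee", 40))
wh.append(SalesFact(date(2022, 2, 28), "coffee", 55))
wh.append(SalesFact(date(2022, 2, 28), "tea", 12))

print(wh.change("coffee", date(2022, 1, 31), date(2022, 2, 28)))

fact = wh.by_item("tea")[0]
try:
    fact.units_sold = 0  # attempted alteration of stored data
    blocked = False
except FrozenInstanceError:
    blocked = True
print("alteration blocked:", blocked)
```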

Role in Data Analytics 

Although it is expensive to design and challenging to build, data warehousing gives companies an upper hand in marketing and customer satisfaction. AT&T is one of the companies that have greatly benefited from data warehousing. The company sources data from its customers' interactions with its website to reduce system failures. If a customer has trouble connecting to the website, the company is made aware long before the customer makes a complaint. Data analytics can improve a company's services to its customers by assessing their preferences and satisfaction. Data warehousing plays a key role in this solution as it acquires data from different areas and determines patterns (Krishnan, 2019). For instance, AT&T offers video subscription membership to its customers. To improve this service, it could use data warehousing to determine how many customers are satisfied with particular videos, then use that information for advertisement.

Conclusion 

Data warehousing is in itself efficient and revolutionary, but it is also a technological stepping stone to better systems for analyzing data. Comparing its features from its inception to today is enough proof that, like all technologies, data warehousing is subject to improvement over time. As such, large companies that would benefit from its service must adopt data warehousing. In doing so, these companies will not only improve their profits and administration but also pave the way for the development of warehousing.

References 

Ballard, C., Herreman, D., Schau, D., Bell, R., Kim, E., & Valencic, A. (2018). Data modeling techniques for data warehousing (p. 25). IBM Corporation International Technical Support Organization.

Krishnan, K. (2019). Data warehousing in the age of big data. Newnes.

Sen, A., & Sinha, A. P. (2017). A comparison of data warehousing methodologies. Communications of the ACM, 48(3), 79-84.


StudyBounty. (2023, September 15). Data Warehousing - What is it and How to Use it . https://studybounty.com/data-warehousing-what-is-it-and-how-to-use-it-research-paper

Data Warehousing and Data Mining, Term Paper Example


Data warehousing is a useful tool for many companies because it creates an easily accessible permanent central storage space that supports data analysis, retrieval, and reporting (Rosencrance, 2011). Five benefits of using data warehousing include delivery of enhanced business intelligence, saving time, heightened and consistent data quality, ability to access previous information, and a high return on investment. Ultimately, data warehousing is ideal for businesses that currently make important decisions without consulting data. Creation of a data warehouse makes it simple for business professionals to consult various aspects of their business's history, ranging from marketing information to profits and inventory needs. Since all of this information is located on a single system, it saves time compared to digging through paper files; in addition, this centralization allows the IT department to focus on its other responsibilities, which increases the overall efficiency of the company. The data retrieved from a data warehouse can be made to appear in a consistent format, which allows businesses to compare new data to data previously collected in a way that gives them a better understanding of their business's progress. Lastly, practice has shown that businesses that implement a data warehouse generate more revenue than those that use other formats of data storage. Although the initial monetary investment necessary for data warehouse creation is high, many business owners believe that it is worth it. Databases are useful for data storage practices that support both enterprise and web-based applications. The use of this system allows company owners to collect data from the internet and convert this information into usable models that predict trends. Eventually, the company will be able to use this information to understand patterns that will help the business succeed.

Data mining is the process of extracting information and patterns from a data warehouse (Alexander, n.d.). This process is of particular use to businesses because it allows them to predict future patterns and trends based on current and previous information, which assists them in making important decisions. Even expert statisticians may be unable to predict trends as well as certain data mining schemes, because manual analysis of very large data sets is prone to missed data points that skew the end result. Data mining uses algorithms that allow the computer to build mathematical models to answer problems such as market segmentation, customer churn, fraud detection, direct marketing, interactive marketing, market basket analysis, and trend analysis. Respectively, these models help businesses determine the similarities between customers who buy similar products from their company, predict how likely current customers are to switch to a competing company, indicate which customer transactions are most likely to be fraudulent, identify the customers who should be included in a mailing list for marketing purposes, determine what customers would like to see on the company’s website, identify which products are usually purchased at the same time, and analyze how the behavior of recurring customers differs over a certain period of time. All of these strategies help businesses gain a greater understanding of who they are marketing to, what their market is interested in, and how to persuade people to purchase more of their product.
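To make market basket analysis concrete, the following is a minimal sketch that counts how often two products appear in the same transaction. The transactions, the item names, and the helper names `pair_counts` and `support` are hypothetical and chosen only for illustration; production data mining tools use more scalable algorithms than this direct count.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how often each unordered pair of items shares a basket."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicate items and gives every pair
        # one canonical (alphabetical) order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


def support(pair, counts, n_baskets):
    """Fraction of all baskets that contain both items of the pair."""
    return counts[pair] / n_baskets


# Hypothetical transactions, one list of purchased items per customer visit.
BASKETS = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

counts = pair_counts(BASKETS)
for pair, n in counts.most_common(3):
    print(pair, n, support(pair, counts, len(BASKETS)))
```

Pairs with high support are candidates for being shelved or promoted together, which is the kind of connection between products that the paragraph above describes.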

According to a January 2013 article published in Forbes magazine, data warehousing is becoming a trend and is replacing physical storage and some forms of less organized computer storage. The article explains that one of the major reasons for this shift is the acquisition of increasingly large amounts of data; for businesses to be successful in the modern era, they must be able to both create and store ever-growing volumes of data (Evans, 2013). Performance has become one of the most important goals of the 21st century, which means that we have been forced to find new ways to store, retrieve, and analyze data; data warehouses have addressed this problem in a way that provides a significant return on investment. Simplicity, accessibility, and mixed workload support have allowed data warehousing to replace conventional methods of data storage. The efficacy and efficiency of data mining and data warehousing have allowed businesses to evolve and take on more tasks than they were capable of in the past.

Many well-known companies have successfully integrated data warehouses into their daily business activities; two notable corporations that rely on these systems are Apple and Walmart. Although it may seem obvious that Apple uses data warehousing since it is a computer company, this business actually uses data warehousing and mining for far more than its development of electronics and programs. Apple uses a multiple-petabyte Teradata system to study its customers and determine which types of people buy its products (Harris, 2013). It saves the demographic information that people enter into its website, iTunes, and electronics to study these relationships. Walmart began using a data warehouse in 1992, and the system has grown significantly over the years. The corporation now uses three separate databases: one for Walmart stores, one for Sam’s Club, and a backup system. One of the major uses of this system is to provide store managers with information about their store’s layout. It tracks where there is free shelf space, and the business uses data mining to figure out how to optimize this space. It also tells the stores which items are selling best, how fast they are selling, and even whether packaging should be redesigned for certain products so that they fit the shelving more efficiently.

There are several types of data warehouse architecture; however, the precise elements used will depend on the needs of the individual business. For a company to have a highly efficient data warehouse, the warehouse should be able to run nightly updates, its servers should have connectivity around the world so that different offices can access the same database, it should allow customer-level service, it should be capable of storing new data sources, and it must be reliable. Therefore, this data warehouse should have adequate staging horsepower, parallel or distributed servers, a large server size, flexible tools with support for metadata, and job control features (Hadley, 2002).

The two main forms of system design are OLTP and OLAP. OLTP (On-line Transaction Processing) is “characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE)” (datawarehouse4u, n.d.). This kind of system emphasizes very fast query processing and the maintenance of data integrity. It holds detailed, current data, and the schema used to store transactional databases is called the entity model. OLAP (On-line Analytical Processing) is characterized by a relatively low volume of transactions because the queries are often complex. OLAP applications are generally used for data mining over aggregated and historical data stored in multi-dimensional schemas (usually the star schema). For our purposes, the OLAP application will be the most effective because it allows the data mining that the company requires for its business analyses.
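The contrast between the two workloads can be sketched with Python's built-in `sqlite3` module. The tables `dim_product` and `fact_sales`, their columns, and the sample rows are hypothetical; SQLite stands in here for a real warehouse engine only so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A tiny star schema: one fact table referencing one dimension table.
conn.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    category   TEXT
);
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    sale_year  INTEGER,
    amount     REAL
);
""")

# OLTP style: short transactions that insert or update a few rows each.
with conn:
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)",
        [(1, "grocery"), (2, "electronics")],
    )
    conn.executemany(
        "INSERT INTO fact_sales (product_id, sale_year, amount) VALUES (?, ?, ?)",
        [(1, 2012, 10.0), (1, 2013, 12.5),
         (2, 2012, 300.0), (2, 2013, 250.0), (2, 2013, 50.0)],
    )
with conn:
    conn.execute("UPDATE fact_sales SET amount = 11.0 WHERE sale_id = 1")

# OLAP style: one complex, read-only query that aggregates historical data
# across the fact table and its dimension.
olap_rows = conn.execute("""
    SELECT p.category, s.sale_year, SUM(s.amount)
    FROM fact_sales AS s
    JOIN dim_product AS p ON p.product_id = s.product_id
    GROUP BY p.category, s.sale_year
    ORDER BY p.category, s.sale_year
""").fetchall()

for row in olap_rows:
    print(row)
```

The inserts and the update each touch a handful of rows, while the final query scans and groups the whole fact table, which is why the two workloads are usually served by differently designed systems.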

The model used in the creation of the data warehouse should be the normalized approach, which follows database normalization rules. This is the most effective model for a business’s data warehouse because it allows tables to be grouped together by subject category; these categories can be further divided into entities, which become tables in a relational database (Kimball, 1996). Since studying relationships between consumers and products is a goal of many businesses, this representation of information will be the most useful. This system will use materialized views because they are useful for many business applications. One of the most important advantages of materialized views is that aggregated data is computed in advance and stored, which improves performance and allows fast lookups (Oracle, n.d.).
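The idea behind a materialized view, computing an aggregate once and then reading the stored result, can be sketched as follows. This is an emulation under stated assumptions: SQLite has no native materialized views, so a summary table named `mv_sales_by_region` (hypothetical, as are the `sales` table and both helper functions) is rebuilt on demand. Oracle's own materialized views are created and refreshed with different syntax.

```python
import sqlite3


def refresh_summary(conn):
    """Rebuild the stored aggregate; plays the role of a view refresh."""
    with conn:
        conn.execute("DROP TABLE IF EXISTS mv_sales_by_region")
        conn.execute("""
            CREATE TABLE mv_sales_by_region AS
            SELECT region, COUNT(*) AS n_sales, SUM(amount) AS total
            FROM sales
            GROUP BY region
        """)


def total_for(conn, region):
    """Fast lookup against the precomputed totals, not the raw sales."""
    row = conn.execute(
        "SELECT total FROM mv_sales_by_region WHERE region = ?", (region,)
    ).fetchone()
    return row[0] if row else 0.0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 70.0)],
)
conn.commit()

refresh_summary(conn)
print(total_for(conn, "east"))
```

The trade-off is staleness: rows added to `sales` after a refresh are not reflected in the summary until `refresh_summary` runs again, which is why refresh schedules such as nightly updates matter.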

There are several techniques that are useful for optimizing data warehousing. Some of the problems that can arise when using a database include overloading the system, poor overall performance, and low storage. To avoid system overload, one should monitor parallel execution performance; for example, overload may be indicated if many parallel statements are being downgraded. In this situation, I/O performance should also be monitored. If the system has poor overall performance and low storage, it may be necessary to reduce the storage requirements; a useful way to do this is data compression. Several optimization methods can also be used to increase the efficiency of data mining. According to a 2009 dissertation from the University of Iowa, the main ways that data mining can be improved involve changing the algorithm that is used (Yu, 2009). It states that optimization can either be one part of a larger data mining process or be applied directly to data mining as a whole. Although the work cites several useful ways that mathematical optimization can improve data mining, the most interesting ones discussed include the use of the k-means cluster analysis algorithm to minimize the distances of points to their nearest centers, time-series data mining to define similarity, and the practice of determining what defines normal measures. There are many ways to optimize data warehousing and data mining, and the number of available methods will grow as the IT field evolves. Although the aforementioned optimization techniques are ideal for the business setting, it is essential to review the literature regularly for new techniques as they arise.
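As an illustration of the k-means idea mentioned above, here is a minimal sketch of the standard iterative procedure (often called Lloyd's algorithm): assign each point to its nearest center, then move each center to the mean of its points. It is not taken from Yu (2009); the function names, the sample points, and the fixed random seed are assumptions made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Return (centers, clusters) after at most `iterations` update rounds."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        new_centers = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(c) / len(cluster) for c in zip(*cluster))
                new_centers.append(mean)
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, clusters


# Hypothetical two-dimensional data with two visually obvious groups.
POINTS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(POINTS, 2)
print(sorted(centers))
```

Each round can only lower or keep the total squared distance from points to their assigned centers, which is the quantity the paragraph above describes as being minimized.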

If the company has accumulated 20 terabytes of data and expects the data warehouse to grow by 20% per year, I would recommend Oracle Exadata because it will accommodate the company’s need for data storage as it continues to grow, it performs well, requires little maintenance, and is quick to install. Since this is Big Data, the network configuration will require 1GbE access layer switch capacity (Borovick, 2012). However, as the amount of storage that the company requires increases, it may need to upgrade to 10GbE server connections after one or two years, which in turn will create a need for 40GbE switch capacity. The specific hardware needed for the database depends mostly on the growth in data storage and the number of people who will need to access it. Since the company may have many employees in many different parts of the world, I recommend RAID 10. RAID is well suited to this situation because it offers several storage levels, each with its own advantages and disadvantages (Acronis, n.d.). RAID 10 is one of these levels and is a good fit because it tolerates disk failures and offers fast speeds. RAID 10 requires at least four physical drives, arranged as striped mirrored pairs, along with a disk controller that supports it. One of the most useful functions of RAID 10 is its ability to protect the database’s information; it uses a process called mirroring to write the data to two or more disks at once. If one disk fails completely, the information is still safe on its mirror. RAID 10 also increases performance because it can retrieve information from more than one disk at a time. Despite this redundancy, Acronis recommends that a separate backup be kept because mirrored data can still be deleted or overwritten. Because this recommendation comes from Acronis, I would also use Acronis backup software.
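The sizing figures above can be turned into a quick projection. The sketch below assumes compound growth (20% of the previous year's size each year) and a two-way mirror, so raw disk capacity is twice the usable capacity; both function names are hypothetical, and the calculation ignores compression, indexes, and staging space.

```python
def projected_size_tb(initial_tb, annual_growth, years):
    """Warehouse size after `years` of compound growth."""
    return initial_tb * (1 + annual_growth) ** years


def raid10_raw_needed_tb(usable_tb, mirror_copies=2):
    """Raw disk capacity needed when every block is stored `mirror_copies` times."""
    return usable_tb * mirror_copies


for year in range(0, 6):
    usable = projected_size_tb(20, 0.20, year)
    raw = raid10_raw_needed_tb(usable)
    print(f"year {year}: {usable:.1f} TB usable, {raw:.1f} TB raw")
```

Under these assumptions the warehouse reaches 24 TB after one year and roughly 49.8 TB after five, so the mirrored array would need about 100 TB of raw disk by year five.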

Data warehouses are generally broken up into three tiers: the data tier, the application tier, and the presentation tier. The data tier is responsible for physically storing all of the warehouse’s data, in addition to handling the ETL process in which data is extracted, transformed, and loaded (Paoletti, 2012). The application tier is where business information is converted to models for company use; this is the tier that deals with all user queries. Finally, the presentation tier delivers reports, analyses, and event management information to the business. It allows for user administration, dashboarding, and scorecarding. A graphical representation of these three tiers can be seen below:

A graphical representation of these three tiers
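The division of labor among the three tiers can also be sketched in code. In this minimal example the data tier is an extract, transform, and load pipeline feeding an in-memory list, the application tier answers a query, and the presentation tier formats a report. All function names, the comma-separated input format, and the sample records are hypothetical.

```python
# Data tier: extract raw records, transform them, and load them into storage.
def extract(raw_lines):
    """Split non-empty comma-separated lines into fields."""
    return [line.split(",") for line in raw_lines if line.strip()]


def transform(rows):
    """Normalize text fields and convert amounts to numbers."""
    return [
        {
            "store": store.strip().lower(),
            "product": product.strip().lower(),
            "amount": float(amount),
        }
        for store, product, amount in rows
    ]


def load(warehouse, records):
    """Append cleaned records to the warehouse (here, a plain list)."""
    warehouse.extend(records)


# Application tier: turn stored data into an answer to a user query.
def total_by_store(warehouse):
    totals = {}
    for record in warehouse:
        totals[record["store"]] = totals.get(record["store"], 0.0) + record["amount"]
    return totals


# Presentation tier: format the answer as a report.
def render_report(totals):
    return "\n".join(f"{store}: {total:.2f}" for store, total in sorted(totals.items()))


RAW = ["Store A, Widget, 10.50", " store a ,Gadget,4.50", "", "Store B,Widget,3"]
warehouse = []
load(warehouse, transform(extract(RAW)))
report = render_report(total_by_store(warehouse))
print(report)
```

Keeping the tiers as separate functions mirrors the architectural point: the cleaning rules, the query logic, and the report layout can each change without touching the other two.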

Acronis. (n.d.). What is RAID 10 and why should I use it? Retrieved from http://www.acronis.eu/resource/tips-tricks/2005/whats-raid-10.html

Alexander, D. (n.d.). Data Mining. University of Texas. Retrieved from http://www.laits.utexas.edu/~anorman/BUS.FOR/course.mat/Alex/

Borovick, L. (2012). The Critical Role of the Network in Big Data Applications. IDC. Retrieved from http://www.cisco.com/en/US/solutions/collateral/ns340/ns517/ns224/ns944/critical_big_data_applications.pdf

Datawarehouse4u. (n.d.). OLTP vs. OLAP. Retrieved from http://datawarehouse4u.info/OLTP-vs-OLAP.html

Evans, B. (2013). Data Warehouse 2.0: The 10 Top Trends Driving the Revolution. Forbes. Retrieved from http://www.forbes.com/sites/oracle/2013/01/14/data-warehouse-2-0-the-10-top-trends-driving-the-revolution/2/

Hadley, L. (2002). Developing a Data Warehouse Architecture. Retrieved from http://www.users.qwest.net/~lauramh/resume/thorn.htm

Harris, D. (2013). Why Apple, eBay, and Walmart have some of the biggest data warehouses you’ve ever seen. Gigaom. Retrieved from http://gigaom.com/2013/03/27/why-apple-ebay-and-walmart-have-some-of-the-biggest-data-warehouses-youve-ever-seen/

Kimball, R. (1996). The Data Warehouse Toolkit. Wiley.

Oracle. (n.d.). Data Warehousing with Materialized Views. Retrieved from http://docs.oracle.com/cd/F49540_01/DOC/server.815/a67775/ch1.htm#12524

Paoletti, L. (2012). Data Warehousing: “Conceptual Architecture”. Retrieved from http://www.tomsitpro.com/articles/data_warehsouing-business_intelligence-data_warehouse_conceptual_design2-271.html

Rosencrance, L. (2011). Top Five Benefits of a Data Warehouse. TIBCO Software. Retrieved from http://spotfire.tibco.com/blog/?p=7597

Yu, Z. (2009). Optimization techniques in data mining with applications to biomedical and psychophysiological data sets. University of Iowa. Retrieved from http://ir.uiowa.edu/cgi/viewcontent.cgi?article=1459&context=etd


