71. What is a data mart?
A data mart is a collection of data smaller in scope and size than a data warehouse. It is dedicated to data from a particular business component or business functional area. A data mart may function as a subset of a larger data warehouse. Users of a data mart are usually knowledgeable analysts in the business area that the data mart serves.
72. What is RFM analysis?
RFM analysis is a Business Intelligence (BI) reporting system that analyzes and ranks customers based on their purchasing patterns. R refers to “how recently” a customer placed an order, F refers to “how frequently” the customer orders, and M refers to “how much money” the customer spends. Typically, the customers are ranked into “20%” groups and assigned a number to represent their ranking. Thus 1 means top 20%, 2 the next 20% and so on. In this system a score of 1 is best and a score of 5 is worst. Thus a customer with an RFM score = 1 5 1 would be one who has ordered recently, does not order frequently, and who makes large purchases.
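The ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer names, dates, and amounts are invented, the helper names quintile_scores and rfm_scores are hypothetical, and each customer is assumed to be pre-summarized as (last order date, order count, total spent). Ties are broken by input order, which a real reporting system would likely handle differently.

```python
from datetime import date

def quintile_scores(values, higher_is_better):
    """Rank customers on one metric and return {customer: score}, where 1 = top 20%."""
    ranked = sorted(values, key=values.get, reverse=higher_is_better)
    n = len(ranked)
    # Position 0 is the best customer; integer division splits the ranking into five bands.
    return {customer: (position * 5) // n + 1 for position, customer in enumerate(ranked)}

def rfm_scores(summary, today):
    """summary maps customer -> (last_order_date, order_count, total_spent)."""
    days_since = {c: (today - last).days for c, (last, _, _) in summary.items()}
    order_count = {c: count for c, (_, count, _) in summary.items()}
    total_spent = {c: total for c, (_, _, total) in summary.items()}
    r = quintile_scores(days_since, higher_is_better=False)  # fewer days since last order is better
    f = quintile_scores(order_count, higher_is_better=True)
    m = quintile_scores(total_spent, higher_is_better=True)
    return {c: (r[c], f[c], m[c]) for c in summary}

# Hypothetical sample data, invented for illustration only.
today = date(2024, 7, 1)
customers = {
    "Ann": (date(2024, 6, 28), 1, 900.0),
    "Bob": (date(2024, 6, 20), 5, 200.0),
    "Cy":  (date(2024, 6, 10), 4, 300.0),
    "Dee": (date(2024, 5, 15), 3, 120.0),
    "Eve": (date(2024, 3, 1),  2, 60.0),
}

scores = rfm_scores(customers, today)
for name, (r, f, m) in scores.items():
    print(name, r, f, m)
```

With this sample data, Ann receives the score 1 5 1: she ordered most recently, orders least often, and spends the most.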
73. What are the functions of a reporting system?
A reporting system has three functions: 1. Report authoring — connecting to data sources, creating the report structure and formatting the report. 2. Report management — defining who receives which reports, when they receive them and how the reports are delivered. 3. Report delivery — based on report management metadata, either pushing the reports to the recipients or allowing them to be pulled by the recipients.
74. What is OLAP?
OnLine Analytical Processing (OLAP) is a Business Intelligence (BI) reporting system. OLAP provides the user with the capability to sum, count, average and do other simple arithmetic operations on groups of data. An OLAP report has measures and dimensions. Measures are the data values to be displayed. Dimensions are characteristics of the measures. OLAP reports are called OLAP cubes, although such reports are not limited to three dimensions.
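As a rough illustration of measures and dimensions, the following Python sketch groups a small set of sales rows by one or more dimensions and aggregates a measure. The data and the function name olap_summary are assumptions made for this example; a real OLAP product would offer far more than this.

```python
from collections import defaultdict

# Hypothetical sales facts: "amount" is the measure; the other fields are dimensions.
sales = [
    {"region": "East", "product": "Tea",    "quarter": "Q1", "amount": 100},
    {"region": "East", "product": "Tea",    "quarter": "Q2", "amount": 150},
    {"region": "East", "product": "Coffee", "quarter": "Q1", "amount": 200},
    {"region": "West", "product": "Tea",    "quarter": "Q1", "amount": 50},
    {"region": "West", "product": "Coffee", "quarter": "Q2", "amount": 300},
]

def olap_summary(rows, dimensions, measure, agg=sum):
    """Group rows by the chosen dimensions and aggregate the measure within each group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        groups[key].append(row[measure])
    return {key: agg(values) for key, values in groups.items()}

# Sum of amount by one dimension, then by two dimensions.
by_region = olap_summary(sales, ["region"], "amount")
by_region_product = olap_summary(sales, ["region", "product"], "amount")

# Swapping the aggregate function gives counts instead of sums.
orders_per_region = olap_summary(sales, ["region"], "amount", agg=len)

print(by_region)
print(by_region_product)
print(orders_per_region)
```

Adding "quarter" to the list of dimensions would produce a three-dimensional summary, and nothing in the sketch limits the number of dimensions to three.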
75. What is market basket analysis?
Market basket analysis is a data mining technique that determines which sets of products tend to be purchased together. A common technique uses conditional probabilities. In addition to the basic probability that an item will be purchased, three results are of particular interest:
Support — the probability of two items being purchased together.
Confidence — the probability of a second item being purchased GIVEN that another item has been purchased.
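Lift — calculated as confidence divided by the basic probability of the second item being purchased, this shows how much more (or less) likely the second item is to be purchased when the first item is purchased than it is in general. A lift greater than 1 means that buying the first item makes buying the second more likely.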
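The three measures can be worked through on a small example. The sketch below uses five invented baskets and hypothetical helper names (probability, basket_metrics). It assumes the first item appears in at least one basket, and it uses exact fractions to avoid rounding noise.

```python
from fractions import Fraction

# Hypothetical transactions, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def probability(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    hits = sum(1 for basket in baskets if items <= basket)
    return Fraction(hits, len(baskets))

def basket_metrics(baskets, first, second):
    """Return (support, confidence, lift) for 'second is bought given first is bought'."""
    p_first = probability(baskets, {first})
    p_second = probability(baskets, {second})
    support = probability(baskets, {first, second})  # P(first and second)
    confidence = support / p_first                   # P(second | first)
    lift = confidence / p_second                     # confidence relative to P(second)
    return support, confidence, lift

support, confidence, lift = basket_metrics(baskets, "bread", "milk")
print(support, confidence, lift)
```

Here bread and milk appear together in 3 of 5 baskets (support 3/5), milk appears in 3 of the 4 baskets that contain bread (confidence 3/4), and milk appears in 4 of 5 baskets overall, so the lift is (3/4) / (4/5) = 15/16. Because the lift is slightly below 1, buying bread does not make buying milk more likely in this sample.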
76. Explain the differences between structured data and unstructured data.
Structured data are facts concerning objects and events. The most important structured data are numeric, character, and dates. Structured data are stored in tabular form. Unstructured data are multimedia data such as documents, photographs, maps, images, sound, and video clips. Unstructured data are most commonly found on Web servers and Web-enabled databases.
77. Explain why it is still necessary to have at least some familiarity with file processing systems even though it has become evident that traditional file processing systems have a number of shortcomings and limitations.
Many businesses still use file processing systems today. This is especially true in the creation of backups for a database system. In addition, if you understand the limitations of a file processing system, such as program-data dependence, duplication of data, limited data sharing, lengthy development times, and excessive program maintenance, you can try to avoid them as you design and develop a database.
78. What are some of the disadvantages associated with conventional file processing systems?
There are five disadvantages. Program-data dependence means that file descriptions are stored within each program that uses the file, so every such program must be changed whenever a file description changes. Duplication of data is storing the same data more than once. Limited data sharing occurs when files are private to one application, so no one outside that application can access the data. Lengthy development times exist because file processing systems take longer to develop. Lastly, excessive program maintenance exists because the effort required to maintain programs is greater in this environment.
79. The range of database applications can be divided into five categories. Explain the five different categories.
The five categories are distinguished by the number of users they support. A personal database supports a single user. A workgroup database supports a relatively small group of people. A department database supports a functional unit in an organization, such as marketing. An enterprise database supports the entire organization. An Internet database supports requests from users around the world.
80. Explain the differences between an intranet and an extranet.
An Internet database is accessible by everyone who has access to a Web site. An intranet database limits access to people within a given organization. An extranet database extends that access to people within a company and to that company’s customers and suppliers.