Advanced and Predictive Analytics
Glossary of Terms
Advanced Analytics - the analysis of data from differing sources, including structured and unstructured sources, using sophisticated quantitative methods to produce insights that traditional approaches to business intelligence (BI), such as query and reporting, are unlikely to discover.
Big Data Analytics - the technology that an organization requires to handle data at extreme scales.
"Big Data" is a term used to describe a massive volume of diverse data, both structured and unstructured, that is so large and fast-moving that it is difficult or impossible to process using traditional databases and software technology. In most enterprise scenarios, the data is too large, arrives too quickly at unpredictable and variable speeds, or exceeds current processing capacity.
Business Analyst/Business Analytics - Business Analytics is the combination of skills, technologies, applications and processes used by organizations and individuals to gain insight into their business, based on data and statistics, in order to drive business planning. Business Analytics is used to evaluate organization-wide operations and can be implemented in any department, from sales and product development to customer service and finance. Business Analytics solutions typically use data together with statistical and quantitative analysis to measure past performance and influence an organization's business plan.
Columnar Database - A columnar database, also known as a column-oriented database, is a database management system (DBMS) that stores data by column rather than by row, as traditional row-oriented DBMSs do. The main differences between a columnar database and a traditional row-oriented database center on performance, storage requirements and schema modification techniques. The goal of a columnar database is to efficiently write and read data to and from hard disk storage and to reduce the time it takes to return a query result.
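The difference between the two layouts can be sketched in a few lines of Python. This is only an in-memory illustration under simplifying assumptions: the table, its column names (order_id, region, amount) and the helper functions are hypothetical, and the sketch does not model disk storage, compression or any real DBMS.

```python
# Hypothetical sample table: three records with three fields each.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.0},
    {"order_id": 3, "region": "east", "amount": 200.0},
]

# Row-oriented layout: all fields of one record are kept together.
row_store = [(r["order_id"], r["region"], r["amount"]) for r in rows]

# Column-oriented layout: all values of one column are kept together.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def total_amount_row_store(store):
    """Sum the amount field; every full record has to be visited."""
    return sum(record[2] for record in store)


def total_amount_column_store(store):
    """Sum the amount field; only the 'amount' column is read."""
    return sum(store["amount"])


print(total_amount_row_store(row_store))        # 400.0
print(total_amount_column_store(column_store))  # 400.0
```

Both functions return the same total, but the column-oriented version touches only the values it needs, which is the intuition behind faster aggregate queries in columnar systems.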
Customer Analytics - Customer Analytics allows firms to analyze customer data to optimize customer decisions and use the analytical insight to design customer-focused programs and initiatives that drive acquisition, retention, cross-sell/upsell, and targeted marketing campaigns. Customer analytics exploits behavioral data to identify unique segments in a customer base that the business can act upon. Insights obtained through customer analytics are often used to segment markets, predict customer behavior and guide future products and services.
Data Scientist - a data scientist is an individual responsible for modeling complex business problems, discovering business insights and identifying opportunities through the use of statistical, algorithmic, mining and visualization techniques. In addition to advanced analytics skills, this individual is also proficient at integrating and preparing large, varied datasets, architecting specialized databases and computing environments, and communicating results.
Data Source Integration - the discipline of data integration comprises the practices, architectural techniques and tools for achieving consistent access to, and delivery of, data across the spectrum of data subject areas and data structure types in the enterprise, to meet the data consumption requirements of all applications and business processes.
Predictive Analytics - Predictive analysis hinges upon “predictors”: variables, or sets of variables, that can be measured to calculate the statistical likelihood of future occurrences. Predictive analytics encompasses a variety of techniques from statistics, modeling, machine learning and data mining that analyze current and historical facts to make predictions about future or unknown events. In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities. Models capture relationships among many variables so that the risk associated with a particular set of conditions can be assessed, which in turn informs business decisions.
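As a minimal sketch of a single predictor at work, the Python below fits a straight line to historical observations by ordinary least squares and uses it to estimate an unseen value. Least squares is only one of the many techniques the definition covers, and the data (ad_spend as the predictor, sales as the outcome) and function names are hypothetical, chosen purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the predictor, and how it moves together with the outcome.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new predictor value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical data: ad spend (predictor) and sales (outcome).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

model = fit_line(ad_spend, sales)
print(model)                # (2.0, 1.0)
print(predict(model, 6.0))  # 13.0
```

Real predictive models usually combine many predictors and are validated on data held back from fitting; the sketch shows only the core idea of learning a relationship from history and applying it to a new case.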
Statistical Analysis - Statistics is the study of the collection, organization, examination, summarization, manipulation, interpretation and presentation of quantitative data. It deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments. There are two statistical methodologies: descriptive and inferential statistics.
- Descriptive statistics: A statistic is a number that is derived from data, for example a mean (average) or a standard deviation. It can be very helpful when examining data to obtain a suitable set of relevant descriptive statistics. In particular, it can be very interesting to compare statistics obtained from different (but related) columns, or between levels of a factor. This gives an idea of the similarities or differences between the data.
- Inferential Statistics: After data exploration, aided by visualization and description techniques, one will need to identify what formal statistical analysis technique (if any) is required to investigate the data further and to draw general conclusions. A very large number of statistical techniques have been developed to handle many different types of data and to model the relationships between them.
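The two methodologies can be illustrated together with Python's standard statistics module. The sketch first computes descriptive statistics for two levels of a factor, then computes Welch's t statistic as one possible inferential follow-up. The measurements, the group names and the choice of Welch's test are assumptions made for illustration only; turning the statistic into a p-value would additionally require the t distribution, which is not covered here.

```python
import math
import statistics

# Hypothetical measurements for two levels of a factor.
group_a = [12.0, 15.0, 11.0, 14.0, 13.0]
group_b = [18.0, 17.0, 20.0, 19.0, 16.0]


def describe(values):
    """Descriptive statistics: summarize one column of data."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def welch_t(a, b):
    """Inferential step: Welch's t statistic for a difference in means."""
    var_a = statistics.variance(a)  # sample variance
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


print(describe(group_a)["mean"])  # 13.0
print(describe(group_b)["mean"])  # 18.0
print(welch_t(group_a, group_b))  # -5.0
```

Comparing the two summaries shows how the levels differ; the t statistic expresses that difference relative to the variability within the groups, which is the starting point for drawing a general conclusion.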
Visual Data Mining - Visual Data Mining presents the data in some visual form, allowing users to mine and gain insight into the data, draw conclusions and directly interact with the data. For data mining to be effective, it is important to combine the flexibility, creativity, and general knowledge of the user with the enormous storage capacity and computational performance of technology.
Visual Data Mining techniques have proven to be of high value in exploratory data analysis and have high potential for mining large databases. Visual Data Mining is especially useful when little is known about the data and the exploration goals are vague. Because the user is directly involved in the exploration and mining process, the exploration goals can be shifted and adjusted as needed along the way.