RemNote Community

Study Guide

📖 Core Concepts

- Analytics – systematic computational analysis of data to uncover, interpret, and communicate meaningful patterns for decision-making.
- Data Analysis – examines past data through a workflow: business understanding → data understanding → preparation → modeling → evaluation → deployment.
- Data Analytics – extends data analysis to answer why something happened and what may happen, supporting broader organizational decisions.
- Advanced Analytics – technical layer (e.g., machine learning) that powers predictive and prescriptive capabilities.
- Four Core Types
  - Descriptive: “What happened?” (e.g., summary statistics, dashboards).
  - Diagnostic: “Why did it happen?” (e.g., root-cause analysis, segmentation).
  - Predictive: “What will happen?” (e.g., regression, classification, forecasting).
  - Prescriptive: “What should we do?” (e.g., optimization, recommendation engines).
- Cognitive Analytics – adds AI reasoning (natural-language processing, knowledge graphs) to augment analysis.
- Key Disciplines – statistics, programming, and operations research working together to quantify performance.

📌 Must Remember

- Analytics ≠ simple reporting – it moves from description → diagnosis → prediction → prescription.
- Supervised ML methods: neural nets, decision trees, logistic/linear/multiple regression, classification.
- Unsupervised ML methods: clustering, PCA, segmentation, association analysis.
- Big Data = massive, fast-changing, often unstructured data sets that require parallel processing.
- Bias risk – analytics can unintentionally reinforce discrimination (price, hiring, etc.).
- Marketing mix modeling = attribution of sales to media/price/promotion inputs.

🔄 Key Processes

Data Analysis Workflow

1. Define business problem.
2. Understand and acquire data.
3. Clean/transform (handle missing, outliers, unstructured sources).
4. Model (choose descriptive, diagnostic, predictive, or prescriptive technique).
5. Evaluate (metrics, validation, bias checks).
6. Deploy (report, integrate into decision process).
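
The clean, model, and evaluate steps of the workflow above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a recommended pipeline: the `RAW` records are made up, the 1.5 × IQR outlier rule is just one common choice, and the helper names `clean`, `fit`, and `rmse` are hypothetical.

```python
import math
import statistics

# Hypothetical records: (ad_spend, sales). None marks a missing value.
RAW = [
    (10.0, 25.0), (12.0, 29.0), (14.0, 33.5), (16.0, 37.0),
    (18.0, None), (20.0, 45.5), (22.0, 49.0), (24.0, 400.0),
    (26.0, 57.5), (28.0, 61.0),
]

def clean(rows):
    """Clean/transform: drop rows with missing values, then drop sales outliers (1.5 x IQR rule)."""
    complete = [(x, y) for x, y in rows if x is not None and y is not None]
    q1, _, q3 = statistics.quantiles([y for _, y in complete], n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [(x, y) for x, y in complete if low <= y <= high]

def fit(rows):
    """Model: ordinary least squares for sales = slope * ad_spend + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(model, rows):
    """Evaluate: root-mean-square error of the fitted line on the given rows."""
    slope, intercept = model
    squared = [(slope * x + intercept - y) ** 2 for x, y in rows]
    return math.sqrt(sum(squared) / len(squared))

cleaned = clean(RAW)
train, holdout = cleaned[:6], cleaned[6:]  # simple split; the last rows are held out
model = fit(train)
print(f"rows kept: {len(cleaned)} of {len(RAW)}")
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}")
print(f"hold-out RMSE: {rmse(model, holdout):.2f}")
```

The missing-sales row and the 400.0 outlier are removed before fitting, so the line is estimated from eight plausible records. Evaluating on rows the model never saw is what makes the RMSE a check on generalization rather than on fit.
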

Building a Predictive Model (Supervised)

1. Split data → training / validation / test.
2. Select algorithm (e.g., linear regression for a continuous outcome, decision tree for a categorical one).
3. Train the model on the training set, tune hyper-parameters against the validation set, and assess performance with a metric suited to the task (RMSE for regression; accuracy or ROC-AUC for classification).
4. Confirm performance on the held-out test set and check for over-fitting.

Customer Segmentation (Unsupervised)

1. Preprocess: scale numeric variables, encode categories.
2. Choose clustering method (k-means, hierarchical).
3. Determine optimal number of clusters (elbow method, silhouette).
4. Profile each segment for marketing actions.

🔍 Key Comparisons

Analytics vs Analysis

- Analytics: focuses on why & what-if, leverages advanced techniques.
- Analysis: primarily looks at past data (what happened).

Descriptive vs Diagnostic

- Descriptive: reports metrics (e.g., total sales).
- Diagnostic: probes causes (e.g., regression to identify drivers).

Predictive vs Prescriptive

- Predictive: forecasts outcomes (e.g., churn probability).
- Prescriptive: recommends optimal actions (e.g., next-best-offer).

Supervised vs Unsupervised ML

- Supervised: uses labeled outcome to train (regression, classification).
- Unsupervised: discovers structure without labels (clustering, PCA).

⚠️ Common Misunderstandings

- “Analytics = dashboards.” – Dashboards are only the descriptive layer; true analytics progresses beyond that.
- “More data automatically means better models.” – Quality, relevance, and proper preprocessing matter more than sheer volume.
- “Unstructured data can be stored as-is in relational DBs.” – It requires transformation (e.g., text mining, feature extraction).
- “Predictive models are always accurate.” – They are probabilistic; evaluate uncertainty and monitor drift.

🧠 Mental Models / Intuition

- The “Analytics Funnel” – Visualize data flowing from raw → clean → understand → model → action. Each step narrows focus and adds value.
- “What-If Tree” – For prescriptive analytics, imagine branching from a forecast to possible decisions, then evaluate each branch with a cost/benefit model.

🚩 Exceptions & Edge Cases

- When data are highly imbalanced (e.g., fraud detection), standard accuracy is misleading → evaluate with precision and recall instead, and consider resampling the training data (e.g., SMOTE).
- Big Data with real-time constraints may require streaming analytics (complex event processing) instead of batch modeling.
- Regulatory constraints (e.g., GDPR) limit use of certain personal data; bias checks become mandatory.

📍 When to Use Which

| Situation | Best Analytic Type | Reason |
|-----------|-------------------|--------|
| Understanding past sales trends | Descriptive | Summarizes historic performance |
| Identifying cause of a sudden dip | Diagnostic | Root-cause tools (regression, segmentation) |
| Forecasting next-quarter demand | Predictive (supervised regression/ML) | Generates probabilistic forecasts |
| Deciding optimal price mix across channels | Prescriptive (optimization, simulation) | Provides actionable recommendations |
| Detecting fraudulent transactions in real time | Cognitive + advanced predictive (anomaly detection) | Handles unstructured, streaming data |
| Grouping customers for targeted campaigns | Unsupervised clustering | No predefined labels needed |

👀 Patterns to Recognize

- “Sharp drop after a promotion” → likely a promotion effect – investigate with diagnostic analytics.
- High variance in model performance across time slices → data drift; consider retraining or online learning.
- Correlation matrix with many strong off-diagonal values → multicollinearity – may need PCA or feature reduction.
- Repeated false positives in fraud detection → imbalance; adjust threshold or use cost-sensitive learning.

🗂️ Exam Traps

- Choosing “descriptive” for a “why” question – exam will expect diagnostic or predictive reasoning.
- Selecting a supervised method for a problem with no labeled outcome – that’s an unsupervised scenario (clustering/PCA).
- Confusing “precision” with “accuracy” on imbalanced data – precision/recall are the correct metrics.
- Assuming big data always requires Hadoop – sometimes a well-tuned relational DB + columnar storage suffices; focus on processing needs, not technology hype.
- Overlooking bias – a model that appears high-performing but discriminates will be flagged as a flawed answer.

---

Use this guide for a quick, confidence-building review before your analytics exam. Good luck!
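
As a worked illustration of the accuracy trap on imbalanced data described in this guide, the sketch below scores two hypothetical fraud classifiers on 1,000 transactions of which 10 are fraudulent. The label vectors and the helper names `confusion_counts` and `metrics` are assumptions made for the example; reporting precision as 0.0 when nothing is flagged is a convention chosen here, not a universal rule.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = fraud, 0 = legitimate."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Convention for this sketch: 0.0 when the denominator is zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical ground truth: 10 fraud cases followed by 990 legitimate ones.
y_true = [1] * 10 + [0] * 990

# Model A never flags anything.
always_legit = [0] * 1000

# Model B catches 8 of the 10 fraud cases and raises 12 false alarms.
flagging = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

for name, preds in [("always legit", always_legit), ("flagging", flagging)]:
    m = metrics(y_true, preds)
    print(f"{name}: accuracy={m['accuracy']:.3f}, "
          f"precision={m['precision']:.3f}, recall={m['recall']:.3f}")
```

Model A scores 0.990 accuracy while catching no fraud at all (recall 0.0). Model B scores slightly lower accuracy (0.986) but has precision 0.4 and recall 0.8, which is why accuracy alone ranks the two models the wrong way round.
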