Data Ethics Study Guide

📖 Core Concepts

- Big Data Ethics – The study of right‑and‑wrong conduct surrounding massive, complex data sets, especially personal data.
- Ownership – Individuals are considered the owners of their personal data and can control its use.
- Consent – Informed, explicit permission required before personal data are transferred or used for a specific purpose.
- Privacy – Reasonable effort to keep personal data from unwanted exposure during any transaction.
- Transaction Transparency – Users must be able to see how algorithms turn raw data into aggregated outputs.
- Currency – Individuals should know of any financial value or transactions derived from their data.
- Openness – Aggregated data sets should be freely accessible without restrictive barriers.
- Control Creep – Re‑using data collected for one purpose in unrelated contexts without new consent.
- Algorithmic Bias – Systematic errors in outcomes caused by biased data, design, or incentives.

---

📌 Must Remember

- Ownership = control – Personal data owners can limit sharing and demand deletion.
- Explicit consent is mandatory; implied consent is not sufficient.
- Transparency ≠ Openness – Transparency is about algorithmic insight; openness is about data availability.
- EU GDPR leans toward data ownership; US law lacks a general “right to informational privacy.”
- Right to be Forgotten (EU) vs. Right to Delete (US) – The EU can require removal of any irrelevant data; the US only covers voluntarily submitted data.
- Bias sources: historic data, intentional design, profit‑driven incentives, black‑box opacity.
- Predictive policing & COMPAS illustrate real‑world bias and discriminatory outcomes.

---

🔄 Key Processes

Assessing Algorithmic Bias
1. Identify the data source → check for historic inequities.
2. Examine the model design → look for incentive‑driven objectives.
3. Test outcomes across demographic groups → spot disparity (see the sketch at the end of this section).
4. Document and remediate bias (re‑weight, augment data, adjust objectives).

Obtaining Informed Consent
1. Clearly state what data will be collected.
2. Explain how it will be used and who will receive it.
3. Outline financial implications (currency).
4. Provide an easy opt‑out mechanism.

Implementing Transaction Transparency
1. Publish an algorithmic overview (inputs, key transformations, outputs).
2. Offer explainability tools (feature importance, decision trees).
3. Allow users to audit their own aggregated data profile.

Open Data Release Workflow
1. Anonymize / de‑identify personal identifiers.
2. Conduct a privacy impact assessment.
3. Publish the data with clear licensing and metadata.
4. Provide a feedback channel for error correction.
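
The disparity check in “Assessing Algorithmic Bias” (step 3) is easy to prototype. Below is a minimal sketch, assuming a pandas DataFrame with hypothetical column names `group` (a protected attribute) and `predicted_positive` (the model’s binary decision); the column names and toy data are illustrative assumptions, not something prescribed by the guide.

```python
# Hypothetical disparity check: compare positive-decision rates across groups.
import pandas as pd

def selection_rates(df: pd.DataFrame, group_col: str = "group",
                    outcome_col: str = "predicted_positive") -> pd.Series:
    """Share of positive decisions per demographic group."""
    return df.groupby(group_col)[outcome_col].mean()

def disparity_ratio(rates: pd.Series) -> float:
    """Lowest group rate divided by the highest; values well below 1.0 warrant a closer look."""
    return rates.min() / rates.max()

# Synthetic decisions, for illustration only.
decisions = pd.DataFrame({
    "group":              ["A", "A", "A", "A", "B", "B", "B", "B"],
    "predicted_positive": [1,   1,   1,   0,   1,   0,   0,   0],
})

rates = selection_rates(decisions)
print(rates)                    # group A: 0.75, group B: 0.25
print(disparity_ratio(rates))   # ≈ 0.33, so document and remediate (step 4)
```

A ratio close to 1.0 does not prove fairness on its own (base rates, error rates, and sample sizes all matter), but a large gap is exactly the kind of disparity step 3 is meant to surface.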

---

🔍 Key Comparisons

Big Data Ethics vs. Information Ethics
- Scope: big data ethics → data collectors/disseminators; information ethics → IP, librarians, archivists.

EU Right to be Forgotten vs. US Right to Delete
- EU: can demand removal of any irrelevant/outdated data, regardless of how it was obtained.
- US: only data voluntarily submitted by the user can be deleted.

Algorithmic Transparency vs. Openness
- Transparency: insight into how decisions are made.
- Openness: free access to the raw aggregated data set.

Ownership vs. Corporate IP
- Ownership: individual control over personal data.
- Corporate IP: the company’s claim over the processed dataset or model, not the raw personal data.

---

⚠️ Common Misunderstandings

- “Data is free to use if it’s public.” – Public availability does not waive ownership or consent requirements.
- “The US guarantees privacy like the EU.” – The US lacks a comprehensive constitutional right to informational privacy.
- “Open data automatically benefits everyone.” – Without proper de‑identification, openness can expose vulnerable groups.
- “Anonymized data can never be re‑identified.” – Advanced re‑identification techniques can defeat simple anonymization.
- “Consent once given lasts forever.” – Consent should be revisited when data are repurposed (control creep).

---

🧠 Mental Models / Intuition

- Data as Property – Imagine each personal datum as a piece of land you own; you can sell, lease, or block access.
- Pipeline Model – Data flows: Collection → Consent → Processing (Algorithm) → Aggregation → Distribution; a breach at any stage violates ethics.
- Black‑Box = Locked Safe – If you can’t see inside, you can’t verify that the contents aren’t harmful or biased.

---

🚩 Exceptions & Edge Cases

- GDPR Interpretation – Some scholars argue GDPR does not grant outright ownership, only control rights.
- Control Creep – Emergency public‑health data may be repurposed for security profiling without fresh consent.
- Open Data with Sensitive Attributes – Publishing health statistics may be open, but small geographic granularity can re‑identify individuals.

---

📍 When to Use Which

- Use explicit consent when data will be sold, combined with other datasets, or used for targeted advertising.
- Apply anonymization + aggregate release for public‑health research where individual identification isn’t needed.
- Demand algorithmic transparency for high‑stakes decisions (credit scoring, policing, hiring).
- Choose openness for government budgets, environmental data, and other civic‑interest datasets; choose restricted access for personally identifiable information.

---

👀 Patterns to Recognize

- Historical Data → Reproduced Inequity – Whenever a model’s training set mirrors past discrimination, expect biased outcomes.
- Profit‑Driven Objectives → Lower Ethical Safeguards – Algorithms optimized for engagement or revenue often sacrifice privacy and fairness.
- Repurposing Without New Consent → Control Creep – Look for “same data, new purpose” language in contracts.
- Black‑Box Claims + Proprietary Licensing → Transparency Red Flags – Companies citing trade secrets while making high‑impact decisions often hide bias.

---

🗂️ Exam Traps

- “Ownership belongs to the collector.” – Remember the principle: individuals own their personal data.
- Confusing “right to be forgotten” with “right to delete.” – The EU vs. US distinction is a frequent distractor.
- Assuming anonymized data is risk‑free. – Re‑identification attacks make this a trap (see the linkage sketch below).
- Choosing “open data” as the default answer for any dataset. – Sensitive personal data require restriction despite openness ideals.
- Believing algorithmic bias only occurs with intentional design. – Unintentional bias from historical data is equally important.
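
To make the re‑identification warnings above concrete, here is a minimal sketch of a linkage attack. Both tables are entirely synthetic and the column names are hypothetical: the health release has names stripped out, yet joining it to a public record on shared quasi‑identifiers (ZIP code, birth year, sex) re‑attaches names to diagnoses.

```python
# Hypothetical linkage attack: "anonymized" release + public record = re-identification.
import pandas as pd

# Health release with direct identifiers removed, quasi-identifiers kept.
health = pd.DataFrame({
    "zip":        ["02138", "02139", "02140"],
    "birth_year": [1954,    1960,    1987],
    "sex":        ["F",     "M",     "F"],
    "diagnosis":  ["hypertension", "asthma", "diabetes"],
})

# Public, named record sharing the same quasi-identifiers (e.g. a voter roll).
public = pd.DataFrame({
    "name":       ["Alice", "Bob", "Carol"],
    "zip":        ["02138", "02139", "02140"],
    "birth_year": [1954,    1960,    1987],
    "sex":        ["F",     "M",     "F"],
})

# A plain join on the quasi-identifiers defeats the name removal.
reidentified = public.merge(health, on=["zip", "birth_year", "sex"])
print(reidentified[["name", "diagnosis"]])
```

This is why step 1 of the Open Data Release Workflow has to address quasi‑identifiers (for example by coarsening ZIP codes and dates, or aggregating small groups), not just direct identifiers such as names.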