Dark data : why what you don't know matters / David J. Hand.
By: Hand, D. J. (David J.) [author.].
Material type: Book
Publisher: Princeton : Princeton University Press, [2020]
Description: 1 online resource.
Content type: text
Media type: computer
Carrier type: online resource
ISBN: 9780691198859; 0691198853.
Subject(s): Missing observations (Statistics) | Big data | Observations manquantes (Statistique) | Données volumineuses | COMPUTERS -- Database Management -- Data Mining
Genre/Form: Electronic books.
Additional physical formats: Print version: Dark data
DDC classification: 519.5
Other classification: SK 850
Includes bibliographical references and index.
"Data describe and represent the world. However, no matter how big they may be, data sets don't - indeed cannot - capture everything. Data are measurements - and, as such, they represent only what has been measured. They don't necessarily capture all the information that is relevant to the questions we may want to ask. If we do not take into account what may be missing/unknown in the data we have, we may find ourselves unwittingly asking questions that our data cannot actually address, come to mistaken conclusions, and make disastrous decisions. In this book, David Hand looks at the ubiquitous phenomenon of "missing data." He calls this "dark data" (making a comparison to "dark matter" - i.e., matter in the universe that we know is there, but which is invisible to direct measurement). He reveals how we can detect when data is missing, the types of settings in which missing data are likely to be found, and what to do about it. It can arise for many reasons, which themselves may not be obvious - for example, asymmetric information in wars; time delays in financial trading; dropouts in clinical trials; deliberate selection to enhance apparent performance in hospitals, policing, and schools; etc. What becomes clear is that measuring and collecting more and more data (big data) will not necessarily lead us to better understanding or to better decisions. We need to be vigilant to what is missing or unknown in our data, so that we can try to control for it. How do we do that? We can be alert to the causes of dark data, design better data-collection strategies that sidestep some of these causes - and, we can ask better questions of our data, which will lead us to deeper insights and better decisions"-- Provided by publisher.
Description based on print version record and CIP data provided by publisher.
Preface; Part I: Dark Data: Their Origins and Consequences; Chapter 1: Dark Data: What We Don't See Shapes Our World; The Ghost of Data; So You Think You Have All the Data?; Nothing Happened, So We Ignored It; The Power of Dark Data; All around Us; Chapter 2: Discovering Dark Data: What We Collect and What We Don't; Dark Data on All Sides; Data Exhaust, Selection, and Self-Selection; From the Few to the Many; Experimental Data; Beware Human Frailties; Chapter 3: Definitions and Dark Data: What Do You Want to Know?; Different Definitions and Measuring the Wrong Thing
You Can't Measure Everything; Screening; Selection on the Basis of Past Performance; Chapter 4: Unintentional Dark Data: Saying One Thing, Doing Another; The Big Picture; Summarizing; Human Error; Instrument Limitations; Linking Data Sets; Chapter 5: Strategic Dark Data: Gaming, Feedback, and Information Asymmetry; Gaming; Feedback; Information Asymmetry; Adverse Selection and Algorithms; Chapter 6: Intentional Dark Data: Fraud and Deception; Fraud; Identity Theft and Internet Fraud; Personal Financial Fraud; Financial Market Fraud and Insider Trading; Insurance Fraud; And More
Chapter 7: Science and Dark Data: The Nature of Discovery; The Nature of Science; If Only I'd Known That; Tripping over Dark Data; Dark Data and the Big Picture; Hiding the Facts; Retraction; Provenance and Trustworthiness: Who Told You That?; Part II: Illuminating and Using Dark Data; Chapter 8: Dealing with Dark Data: Shining a Light; Hope!; Linking Observed and Missing Data; Identifying the Missing Data Mechanism; Working with the Data We Have; Going Beyond the Data: What If You Die First?; Going Beyond the Data: Imputation; Iteration; Wrong Number!
Chapter 9: Benefiting from Dark Data: Reframing the Question; Hiding Data; Hiding Data from Ourselves: Randomized Controlled Trials; What Might Have Been; Replicated Data; Imaginary Data: The Bayesian Prior; Privacy and Confidentiality Preservation; Collecting Data in the Dark; Chapter 10: Classifying Dark Data: A Route through the Maze; A Taxonomy of Dark Data; Illumination; Notes; Index.
IEEE IEEE Xplore Princeton University Press eBooks Library