000 03807nam a22005415i 4500
001 978-3-031-01865-7
003 DE-He213
005 20240730163438.0
007 cr nn 008mamaa
008 220601s2019 sz | s |||| 0|eng d
020 _a9783031018657
_9978-3-031-01865-7
024 7 _a10.1007/978-3-031-01865-7
_2doi
050 4 _aTK5105.5-5105.9
072 7 _aUKN
_2bicssc
072 7 _aCOM043000
_2bisacsh
072 7 _aUKN
_2thema
082 0 4 _a004.6
_223
100 1 _aAbedjan, Ziawasch.
_eauthor.
_4aut
_4http://id.loc.gov/vocabulary/relators/aut
_978573
245 1 0 _aData Profiling
_h[electronic resource] /
_cby Ziawasch Abedjan, Lukasz Golab, Felix Naumann, Thorsten Papenbrock.
250 _a1st ed. 2019.
264 1 _aCham :
_bSpringer International Publishing :
_bImprint: Springer,
_c2019.
300 _aXV, 136 p.
_bonline resource.
336 _atext
_btxt
_2rdacontent
337 _acomputer
_bc
_2rdamedia
338 _aonline resource
_bcr
_2rdacarrier
347 _atext file
_bPDF
_2rda
490 1 _aSynthesis Lectures on Data Management,
_x2153-5426
505 0 _aPreface -- Acknowledgments -- Discovering Metadata -- Data Profiling Tasks -- Single-Column Analysis -- Dependency Discovery -- Relaxed and Other Dependencies -- Use Cases -- Profiling Non-Relational Data -- Data Profiling Tools -- Data Profiling Challenges -- Conclusions -- Bibliography -- Authors' Biographies.
520 _aData profiling refers to the activity of collecting data about data, i.e., metadata. Most IT professionals and researchers who work with data have engaged in data profiling, at least informally, to understand and explore an unfamiliar dataset or to determine whether a new dataset is appropriate for a particular task at hand. Data profiling results are also important in a variety of other situations, including query optimization, data integration, and data cleaning. Simple metadata are statistics, such as the number of rows and columns, schema and datatype information, the number of distinct values, statistical value distributions, and the number of null or empty values in each column. More complex types of metadata are statements about multiple columns and their correlation, such as candidate keys, functional dependencies, and other types of dependencies. This book provides a classification of the various types of profilable metadata, discusses popular data profiling tasks, and surveys state-of-the-art profiling algorithms. While most of the book focuses on tasks and algorithms for relational data profiling, we also briefly discuss systems and techniques for profiling non-relational data such as graphs and text. We conclude with a discussion of data profiling challenges and directions for future work in this area.
650 0 _aComputer networks.
_931572
650 0 _aData structures (Computer science).
_98188
650 0 _aInformation theory.
_914256
650 1 4 _aComputer Communication Networks.
_978574
650 2 4 _aData Structures and Information Theory.
_931923
700 1 _aGolab, Lukasz.
_eauthor.
_4aut
_4http://id.loc.gov/vocabulary/relators/aut
_978575
700 1 _aNaumann, Felix.
_eauthor.
_4aut
_4http://id.loc.gov/vocabulary/relators/aut
_978576
700 1 _aPapenbrock, Thorsten.
_eauthor.
_4aut
_4http://id.loc.gov/vocabulary/relators/aut
_978577
710 2 _aSpringerLink (Online service)
_978578
773 0 _tSpringer Nature eBook
776 0 8 _iPrinted edition:
_z9783031000928
776 0 8 _iPrinted edition:
_z9783031007378
776 0 8 _iPrinted edition:
_z9783031029936
830 0 _aSynthesis Lectures on Data Management,
_x2153-5426
_978579
856 4 0 _uhttps://doi.org/10.1007/978-3-031-01865-7
912 _aZDB-2-SXSC
942 _cEBK
999 _c84613
_d84613