Data lakes [electronic resource] / edited by Anne Laurent, Dominique Laurent, Cédrine Madera.
Contributor(s): Laurent, Anne | Laurent, Dominique | Madera, Cédrine.
Material type: Book
Series: Computer engineering series | Databases and big data set: volume 2
Publisher: London : Hoboken : ISTE, Ltd. ; Wiley, 2020
Description: 1 online resource (249 p.)
ISBN: 9781119720430; 1119720435; 9781119720423; 1119720427
Subject(s): Big data | Databases
Genre/Form: Electronic books.
Additional physical formats: Print version: Data Lakes
DDC classification: 005.7
Online resources: Wiley Online Library
Description based upon print version of record.
Cover -- Half-Title Page -- Dedication -- Title Page -- Copyright Page -- Contents -- Preface -- 1. Introduction to Data Lakes: Definitions and Discussions -- 1.1. Introduction to data lakes -- 1.2. Literature review and discussion -- 1.3. The data lake challenges -- 1.4. Data lakes versus decision-making systems -- 1.5. Urbanization for data lakes -- 1.6. Data lake functionalities -- 1.7. Summary and concluding remarks -- 2. Architecture of Data Lakes -- 2.1. Introduction -- 2.2. State of the art and practice -- 2.2.1. Definition -- 2.2.2. Architecture -- 2.2.3. Metadata
2.2.4. Data quality -- 2.2.5. Schema-on-read -- 2.3. System architecture -- 2.3.1. Ingestion layer -- 2.3.2. Storage layer -- 2.3.3. Transformation layer -- 2.3.4. Interaction layer -- 2.4. Use case: the Constance system -- 2.4.1. System overview -- 2.4.2. Ingestion layer -- 2.4.3. Maintenance layer -- 2.4.4. Query layer -- 2.4.5. Data quality control -- 2.4.6. Extensibility and flexibility -- 2.5. Concluding remarks -- 3. Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures -- 3.1. Our expectations -- 3.2. Modeling data lake functionalities
3.3. Building the knowledge base of industrial data lakes -- 3.4. Our formalization approach -- 3.5. Applying our approach -- 3.6. Analysis of our first results -- 3.7. Concluding remarks -- 4. Metadata in Data Lake Ecosystems -- 4.1. Definitions and concepts -- 4.2. Classification of metadata by NISO -- 4.2.1. Metadata schema -- 4.2.2. Knowledge base and catalog -- 4.3. Other categories of metadata -- 4.3.1. Business metadata -- 4.3.2. Navigational integration -- 4.3.3. Operational metadata -- 4.4. Sources of metadata -- 4.5. Metadata classification -- 4.6. Why metadata are needed
4.6.1. Selection of information (re)sources -- 4.6.2. Organization of information resources -- 4.6.3. Interoperability and integration -- 4.6.4. Unique digital identification -- 4.6.5. Data archiving and preservation -- 4.7. Business value of metadata -- 4.8. Metadata architecture -- 4.8.1. Architecture scenario 1: point-to-point metadata architecture -- 4.8.2. Architecture scenario 2: hub and spoke metadata architecture -- 4.8.3. Architecture scenario 3: tool of record metadata architecture -- 4.8.4. Architecture scenario 4: hybrid metadata architecture
4.8.5. Architecture scenario 5: federated metadata architecture -- 4.9. Metadata management -- 4.10. Metadata and data lakes -- 4.10.1. Application and workload layer -- 4.10.2. Data layer -- 4.10.3. System layer -- 4.10.4. Metadata types -- 4.11. Metadata management in data lakes -- 4.11.1. Metadata directory -- 4.11.2. Metadata storage -- 4.11.3. Metadata discovery -- 4.11.4. Metadata lineage -- 4.11.5. Metadata querying -- 4.11.6. Data source selection -- 4.12. Metadata and master data management -- 4.13. Conclusion -- 5. A Use Case of Data Lake Metadata Management -- 5.1. Context
5.1.1. Data lake definition
Includes bibliographical references and index.