Shen, Jiaming.

Automated Taxonomy Discovery and Exploration [electronic resource] / by Jiaming Shen, Jiawei Han. - 1st ed. 2022. - XI, 103 p. 34 illus., 31 illus. in color. online resource. - Synthesis Lectures on Data Mining and Knowledge Discovery, 2151-0075.

Introduction -- Concept Set Expansion -- Taxonomy Construction -- Taxonomy Enrichment -- Taxonomy-Guided Classification -- Conclusions.

This book provides a principled, data-driven framework that progressively constructs, enriches, and applies taxonomies without relying on massive human-annotated data. Traditionally, people construct domain-specific taxonomies through extensive manual curation, which is time-consuming and costly. In today's information era, people are inundated with vast amounts of text data. Despite the usefulness of taxonomies, people have not yet exploited their full power because of the heavy curation needed to create and maintain them. To bridge this gap, the authors discuss automated taxonomy discovery and exploration, with an emphasis on label-efficient machine learning methods and their real-world uses. A taxonomy organizes entities and concepts in a hierarchical way. Taxonomies are ubiquitous in daily life, ranging from product taxonomies used by online retailers, to topic taxonomies deployed by news outlets and social media, to scientific taxonomies deployed by digital libraries across various domains. When properly analyzed, these taxonomies can play a vital role in science, engineering, business intelligence, policy design, e-commerce, and more. Intuitive examples are used throughout, enabling readers to grasp concepts more easily. In addition, this book: discusses the process of creating, maintaining, and applying taxonomies via simple, easy-to-understand examples; provides a systematic review of the current research frontier of each task and discusses its real-world applications; and includes supporting materials containing links to commonly used evaluation datasets and a code repository of representative algorithms.

9783031114052

10.1007/978-3-031-11405-2 doi


Machine learning.
Computer science.
Information storage and retrieval systems.
Data mining.
Big data.
Machine Learning.
Computer Science.
Information Storage and Retrieval.
Data Mining and Knowledge Discovery.
Big Data.

Q325.5-.7

006.31