000 | 04388nam a22005535i 4500 | ||
---|---|---|---|
001 | 978-3-031-03763-4 | ||
003 | DE-He213 | ||
005 | 20240730164017.0 | ||
007 | cr nn 008mamaa | ||
008 | 220601s2022 sz | s |||| 0|eng d | ||
020 |
_a9783031037634 _9978-3-031-03763-4 |
||
024 | 7 |
_a10.1007/978-3-031-03763-4 _2doi |
|
050 | 4 | _aQ334-342 | |
050 | 4 | _aTA347.A78 | |
072 | 7 |
_aUYQ _2bicssc |
|
072 | 7 |
_aCOM004000 _2bisacsh |
|
072 | 7 |
_aUYQ _2thema |
|
082 | 0 | 4 |
_a006.3 _223 |
100 | 1 |
_aPaun, Silviu. _eauthor. _4aut _4http://id.loc.gov/vocabulary/relators/aut _981638 |
|
245 | 1 | 0 |
_aStatistical Methods for Annotation Analysis _h[electronic resource] / _cby Silviu Paun, Ron Artstein, Massimo Poesio. |
250 | _a1st ed. 2022. | ||
264 | 1 |
_aCham : _bSpringer International Publishing : _bImprint: Springer, _c2022. |
|
300 |
_aXIX, 197 p. _bonline resource. |
||
336 |
_atext _btxt _2rdacontent |
||
337 |
_acomputer _bc _2rdamedia |
||
338 |
_aonline resource _bcr _2rdacarrier |
||
347 |
_atext file _bPDF _2rda |
||
490 | 1 |
_aSynthesis Lectures on Human Language Technologies, _x1947-4059 |
|
505 | 0 | _aPreface -- Acknowledgements -- Introduction -- Coefficients of Agreement -- Using Agreement Measures for CL Annotation Tasks -- Probabilistic Models of Agreement -- Probabilistic Models of Annotation -- Learning from Multi-Annotated Corpora -- Bibliography -- Authors' Biographies. | |
520 | _aLabelling data is one of the most fundamental activities in science, and has underpinned practice, particularly in medicine, for decades, as well as research in corpus linguistics since at least the development of the Brown corpus. With the shift towards Machine Learning in Artificial Intelligence (AI), the creation of datasets to be used for training and evaluating AI systems, also known in AI as corpora, has become a central activity in the field as well. Early AI datasets were created on an ad-hoc basis to tackle specific problems. As larger and more reusable datasets were created, requiring greater investment, the need for a more systematic approach to dataset creation arose to ensure increased quality. A range of statistical methods were adopted, often but not exclusively from the medical sciences, to ensure that the labels used were not subjective, or to choose among different labels provided by the coders. A wide variety of such methods is now in regular use. This book is meant to provide a survey of the most widely used among these statistical methods supporting annotation practice. As far as the authors know, this is the first book attempting to cover the two families of methods in wider use. The first family of methods is concerned with the development of labelling schemes and, in particular, ensuring that such schemes are such that sufficient agreement can be observed among the coders. The second family includes methods developed to analyze the output of coders once the scheme has been agreed upon, particularly although not exclusively to identify the most likely label for an item among those provided by the coders. The focus of this book is primarily on Natural Language Processing, the area of AI devoted to the development of models of language interpretation and production, but many if not most of the methods discussed here are also applicable to other areas of AI, or indeed, to other areas of Data Science. | ||
650 | 0 |
_aArtificial intelligence. _93407 |
|
650 | 0 |
_aNatural language processing (Computer science). _94741 |
|
650 | 0 |
_aComputational linguistics. _96146 |
|
650 | 1 | 4 |
_aArtificial Intelligence. _93407 |
650 | 2 | 4 |
_aNatural Language Processing (NLP). _931587 |
650 | 2 | 4 |
_aComputational Linguistics. _96146 |
700 | 1 |
_aArtstein, Ron. _eauthor. _4aut _4http://id.loc.gov/vocabulary/relators/aut _981639 |
|
700 | 1 |
_aPoesio, Massimo. _eauthor. _4aut _4http://id.loc.gov/vocabulary/relators/aut _981640 |
|
710 | 2 |
_aSpringerLink (Online service) _981641 |
|
773 | 0 | _tSpringer Nature eBook | |
776 | 0 | 8 |
_iPrinted edition: _z9783031037733 |
776 | 0 | 8 |
_iPrinted edition: _z9783031037535 |
776 | 0 | 8 |
_iPrinted edition: _z9783031037832 |
830 | 0 |
_aSynthesis Lectures on Human Language Technologies, _x1947-4059 _981642 |
|
856 | 4 | 0 | _uhttps://doi.org/10.1007/978-3-031-03763-4 |
912 | _aZDB-2-SXSC | ||
942 | _cEBK | ||
999 |
_c85215 _d85215 |