000 10326nam a2200553 i 4500
001 8039688
003 IEEE
005 20220712211703.0
006 m o d
007 cr |n|||||||||
008 171024s2008 maua ob 001 eng d
010 _z 2006036199 (print)
015 _zGBA685620 (print)
016 _z013572667 (print)
020 _a9780470060599
_qelectronic
020 _z9780470028346
_qcloth : print
020 _z0470028343
_qcloth : alk. paper
024 7 _a10.1002/9780470060599
_2doi
035 _a(CaBNVSL)mat08039688
035 _a(IDAMS)0b00006485f0d822
040 _aCaBNVSL
_beng
_erda
_cCaBNVSL
_dCaBNVSL
050 4 _aTK7882.S65
_bB87 2007eb
082 0 0 _a006.4/54
_222
082 0 0 _a004.62
100 1 _aBurke, Dave,
_eauthor.
_930098
245 1 0 _aSpeech processing for IP networks :
_bMedia Resource Control Protocol (MRCP) /
_cDave Burke.
264 1 _aChichester ;
_bJohn Wiley & Sons,
_cc2007.
264 2 _a[Piscataway, New Jersey] :
_bIEEE Xplore,
_c[2007]
300 _a1 PDF (xiv, 354 pages) :
_billustrations.
336 _atext
_2rdacontent
337 _aelectronic
_2isbdmedia
338 _aonline resource
_2rdacarrier
504 _aIncludes bibliographical references (p. [343]-346) and index.
505 0 _aPART I. BACKGROUND. 1. Introduction. 1.1 Introduction to Speech Applications. 1.2 The MRCP Value Proposition. 1.3 History of MRCP Standardisation. 1.3.1 Internet Engineering Task Force. 1.3.2 World Wide Web Consortium. 1.3.3 MRCP: From Humble Beginnings Toward IETF Standard. 1.4 Summary. 2. Basic Principles of Speech Processing. 2.1 Human Speech Production. 2.1.1 Speech Sounds: Phonemics and Phonetics. 2.2 Speech Recognition. 2.2.1 Endpoint Detection. 2.2.2 Mel-Cepstrum. 2.2.3 Hidden Markov Models. 2.2.4 Language Modelling. 2.3 Speaker Verification and Identification. 2.3.1 Feature Extraction. 2.3.2 Statistical Modelling. 2.4 Speech Synthesis. 2.4.1 Front-end Processing. 2.4.2 Back-end Synthesis. 2.5 Summary. 3. Overview of MRCP. 3.1 Architecture. 3.2 Media Resource Types. 3.3 Network Scenarios. 3.3.1 VoiceXML IVR Service Node. 3.3.2 IP PBX with Voicemail. 3.3.3 Advanced Media Gateway. 3.4 Protocol Operation. 3.4.1 Establishing Communication Channels. 3.4.2 Controlling a Media Resource. 3.4.3 Walkthrough Examples. 3.5 Security. 3.6 Summary. PART II. MEDIA AND CONTROL SESSIONS. 4. Session Initiation Protocol. 4.1 Introduction. 4.2 Walkthrough Example. 4.3 SIP URIs. 4.4 Transport. 4.5 Media Negotiation. 4.5.1 Session Description Protocol. 4.5.2 Offer/Answer Model. 4.6 SIP Servers. 4.6.1 Registrars. 4.6.2 Proxy Servers. 4.6.3 Redirect Servers. 4.7 SIP Extensions. 4.7.1 Capability Discovery. 4.8 Security. 4.8.1 Transport and Network Layer Security. 4.8.2 Authentication. 4.8.3 S/MIME. 4.9 Summary. 5. Session Initiation in MRCP. 5.1 Introduction. 5.2 Initiating the Media Session. 5.3 Initiating the Control Session. 5.4 Session Initiation Examples. 5.4.1 Single Media Resource. 5.4.2 Adding and Removing Media Resources. 5.4.3 Distributed Media Source/Sink. 5.5 Locating Media Resource Servers. 5.5.1 Requesting Server Capabilities. 5.5.2 Media Resource Brokers. 5.6 Security. 5.7 Summary. 6. The Media Session. 6.1 Media Encoding. 
6.1.1 Pulse Code Modulation (PCM). 6.1.2 Linear Predictive Coding (LPC). 6.2 Media Transport. 6.2.1 Real-Time Protocol (RTP). 6.2.2 DTMF. 6.3 Security. 6.4 Summary. 7. The Control Session. 7.1 Message Structure. 7.1.1 Request Message. 7.1.2 Response Message. 7.1.3 Event Message. 7.1.4 Message Bodies. 7.2 Generic Methods. 7.3 Generic Headers. 7.4 Security. 7.5 Summary. PART III. DATA REPRESENTATION FORMATS. 8. Speech Synthesis Markup Language (SSML). 8.1 Introduction. 8.2 Document Structure. 8.3 Recorded Audio. 8.4 Pronunciation. 8.4.1 Phonemic/Phonetic Content. 8.4.2 Substitution. 8.4.3 Interpreting Text. 8.5 Prosody. 8.5.1 Prosodic Boundaries. 8.5.2 Emphasis. 8.5.3 Speaking Voice. 8.5.4 Prosodic Control. 8.6 Markers. 8.7 Metadata. 8.8 Summary. 9. Speech Recognition Grammar Specification (SRGS). 9.1 Introduction. 9.2 Document Structure. 9.3 Rules, Tokens, and Sequences. 9.4 Alternatives. 9.5 Rule References. 9.5.1 Special Rules. 9.6 Repeats. 9.7 DTMF Grammars. 9.8 Semantic Interpretation. 9.8.1 Semantic Literals. 9.8.2 Semantic Scripts. 9.9 Summary. 10. Natural Language Semantics Markup Language (NLSML). 10.1 Introduction. 10.2 Document Structure. 10.3 Speech Recognition Results. 10.3.1 Serialising Semantic Interpretation Results. 10.4 Voice Enrollment Results. 10.5 Speaker Verification Results. 10.6 Summary. 11. Pronunciation Lexicon Specification (PLS). 11.1 Introduction. 11.2 Document Structure. 11.3 Lexical Entries. 11.4 Abbreviations and Acronyms. 11.5 Multiple Orthographies. 11.6 Multiple Pronunciations. 11.7 Summary. PART IV. MEDIA RESOURCES. 12. Speech Synthesiser Resource. 12.1 Overview. 12.2 Methods. 12.2.1 SPEAK. 12.2.2 PAUSE. 12.2.3 RESUME. 12.2.4 STOP. 12.2.5 BARGE-IN-OCCURRED. 12.2.6 CONTROL. 12.2.7 DEFINE-LEXICON. 12.3 Events. 12.3.1 SPEECH-MARKER. 12.3.2 SPEAK-COMPLETE. 12.4 Headers. 12.5 Summary. 13. Speech Recogniser Resource. 13.1 Overview. 13.2 Recognition Methods. 13.2.1 RECOGNIZE. 13.2.2 DEFINE-GRAMMAR. 13.2.3 START-INPUT-TIMERS. 
13.2.4 GET-RESULT. 13.2.5 STOP. 13.2.6 INTERPRET. 13.3 Enrollment Methods. 13.3.1 START-PHRASE-ENROLLMENT. 13.3.2 ENROLLMENT-ROLLBACK. 13.3.3 END-PHRASE-ENROLLMENT. 13.3.4 MODIFY-PHRASE. 13.3.5 DELETE-PHRASE. 13.4 Events. 13.4.1 START-OF-INPUT. 13.4.2 RECOGNITION-COMPLETE. 13.4.3 INTERPRETATION-COMPLETE. 13.5 Recognition Headers. 13.6 Enrollment Headers. 13.7 Summary. 14. Recorder Resource. 14.1 Overview. 14.2 Methods. 14.2.1 RECORD. 14.2.2 START-INPUT-TIMERS. 14.2.3 STOP. 14.3 Events. 14.3.1 START-OF-INPUT. 14.3.2 RECORD-COMPLETE. 14.4 Headers. 14.5 Summary. 15. Speaker Verification Resource. 15.1 Overview. 15.2 Methods. 15.2.1 START-SESSION. 15.2.2 END-SESSION. 15.2.3 VERIFY. 15.2.4 VERIFY-FROM-BUFFER. 15.2.5 VERIFY-ROLLBACK. 15.2.6 START-INPUT-TIMERS. 15.2.7 GET-INTERMEDIATE-RESULT. 15.2.8 STOP. 15.2.9 CLEAR-BUFFER. 15.2.10 QUERY-VOICEPRINT. 15.2.11 DELETE-VOICEPRINT. 15.3 Events. 15.3.1 START-OF-INPUT. 15.3.2 VERIFICATION-COMPLETE. 15.4 Headers. 15.5 Summary. PART V. PROGRAMMING SPEECH APPLICATIONS. 16. Voice eXtensible Markup Language (VoiceXML). 16.1 Introduction. 16.2 Document Structure. 16.2.1 Applications and Dialogs. 16.3 Dialogs. 16.3.1 Forms. 16.3.2 Menus. 16.3.3 Mixed Initiative Dialogs. 16.4 Media Playback. 16.5 Media Recording. 16.6 Speech and DTMF Recognition. 16.6.1 Specifying Grammars. 16.6.2 Grammar Scope and Activation. 16.6.3 Configuring Recognition Settings. 16.6.4 Processing Recognition Results. 16.7 Flow Control. 16.7.1 Executable Content. 16.7.2 Variables, Scopes, and Expressions. 16.7.3 Document and Dialog Transitions. 16.7.4 Event Handling. 16.8 Resource Fetching. 16.9 Call Transfer. 16.10 Summary. 17. VoiceXML and MRCP Interworking. 17.1 Introduction. 17.2 Interworking Fundamentals. 17.2.1 Play Prompts. 17.2.2 Play and Recognise. 17.2.3 Record. 17.3 Application Example. 17.3.1 VoiceXML Scripts. 17.3.2 MRCP Flows. 17.4 Summary. Appendix A. MRCP Version 1. A.1 Overview. A.2 Session Management and Message Transport. 
A.3 General Protocol Details. A.4 Speech Synthesiser Resource. A.5 Speech Recogniser Resource. Appendix B. XML Primer. B.1 Background. B.2 Basic Concepts. B.3 Namespaces. B.4 Document Schemas. Appendix C. HTTP Primer. C.1 Background. C.2 Basic Concepts. C.2.1 GET Method. C.2.2 POST Method. C.3 Caching. C.4 Cookies. C.5 Security. References. Index. Acronyms.
506 _aRestricted to subscribers or individual electronic text purchasers.
520 _aMedia Resource Control Protocol (MRCP) is a new IETF protocol, providing a key enabling technology that eases the integration of speech technologies into network equipment and accelerates their adoption, resulting in exciting and compelling interactive services to be delivered over the telephone. MRCP leverages IP telephony and Web technologies such as SIP (Session Initiation Protocol), HTTP (Hypertext Transfer Protocol), and XML (Extensible Markup Language) to deliver an open standard, vendor-independent, and versatile interface to speech engines. Speech Processing for IP Networks brings these technologies together into a single volume, giving the reader a solid technical understanding of the principles of MRCP, how it leverages other protocols and specifications for its operation, and how it is applied in modern IP-based telecommunication networks. Focusing on the MRCPv2 standard developed by the IETF SpeechSC Working Group, this book will also provide an overview of its precursor, MRCPv1. Speech Processing for IP Networks: Gives a complete background on the technologies required by MRCP to function, including SIP, RTP (Real-time Transport Protocol), and HTTP. Covers relevant W3C data representation formats including Speech Synthesis Markup Language (SSML), Speech Recognition Grammar Specification (SRGS), Semantic Interpretation for Speech Recognition (SISR), and Pronunciation Lexicon Specification (PLS). Describes VoiceXML - the leading approach for programming cutting-edge speech applications and a key driver to the development of many of MRCP's features. Explains advanced topics such as VoiceXML and MRCP interworking. This text will be an invaluable resource for technical managers, product managers, software developers, and technical marketing professionals working for network equipment manufacturers, speech engine vendors, and network operators. 
Advanced students on computer science and engineering courses will also find this to be an excellent guide to the topic.
530 _aAlso available in print.
538 _aMode of access: World Wide Web
588 _aDescription based on PDF viewed 10/24/2017.
650 0 _aSpeech processing systems.
_93831
650 0 _aAutomatic speech recognition.
_95558
650 0 _aTCP/IP (Computer network protocol)
_911529
655 0 _aElectronic books.
_93294
710 2 _aIEEE Xplore (Online Service),
_edistributor.
_930099
710 2 _aWiley,
_epublisher.
_930100
776 0 8 _iPrint version:
_z9780470028346
856 4 2 _3Abstract with links to resource
_uhttps://ieeexplore.ieee.org/xpl/bkabstractplus.jsp?bkn=8039688
942 _cEBK
999 _c74746
_d74746