MultiMedia Modeling [electronic resource] : 26th International Conference, MMM 2020, Daejeon, South Korea, January 5-8, 2020, Proceedings, Part II /
edited by Yong Man Ro, Wen-Huang Cheng, Junmo Kim, Wei-Ta Chu, Peng Cui, Jung-Woo Choi, Min-Chun Hu, Wesley De Neve.
- 1st ed. 2020.
- XXX, 820 p. 385 illus., 271 illus. in color. online resource.
- Information Systems and Applications, incl. Internet/Web, and HCI, 2946-1642 ; 11962.
Poster Papers -- Multi-Scale Comparison Network for Few-Shot Learning -- Semantic and Morphological Information guided Chinese Text Classification -- A Delay-aware Adaptation Framework for Cloud Gaming under the Computation Constraint of User Devices -- Efficient Edge Caching for High-Quality 360-Degree Video Delivery -- Inferring Emphasis for Real Voice Data: an Attentive Multimodal Neural Network Approach -- PRIME: Block-wise Missingness Handling for Multi-modalities in Intelligent Tutoring Systems -- A New Local Transformation Module for Few-shot Segmentation -- Background Segmentation for Vehicle Re-Identification -- Face Tells Detailed Expression: Generating Comprehensive Facial Expression Sentence through Facial Action Units -- A Deep Convolutional Deblurring and Detection Neural Network for Localizing Text in Videos -- Generate images with obfuscated attributes for private image classification -- Context-Aware Residual Network with Promotion Gates for Single Image Super-Resolution -- A Compact Deep Neural Network for Single Image Super-Resolution -- An Efficient Algorithm of Facial Expression Recognition by TSG-RNN Network -- Structured Neural Motifs: Scene Graph Parsing via Enhanced Context -- Perceptual Localization of Virtual Sound Source Based on Loudspeaker Triplet -- TK-Text: Multi-shaped Scene Text Detection via Instance Segmentation -- More-Natural Mimetic Words Generation for Fine-grained Gait Description -- Lite Hourglass Network for Multi-person Pose Estimation --
SS1: AI-Powered 3D Vision -- Single View Depth Estimation via Dense Convolution Network with Self-supervision -- Multi-Data UAV Images for Large Scale Reconstruction of Buildings -- Deformed Phase Prediction Using SVM for Structured Light Depth Generation -- Extraction of Multi-class Multi-instance Geometric Primitives from Point Clouds Using Energy Minimization -- Similarity Graph Convolutional Construction Network for Interactive Action Recognition -- Content-Aware Cubemap Projection for Panoramic Image via Deep Q-Learning -- Robust RGB-D Data Registration Based on Correntropy and Bi-directional Distance -- InSphereNet: a Concise Representation and Classification Method for 3D Object -- 3-D Oral Shape Retrieval Using Registration Algorithm -- Face Super-Resolution by Learning Multi-view Texture Compensation -- Light Field Salient Object Detection via Hybrid Priors --
SS2: Multimedia Analytics: Perspectives, Tools and Applications -- Multimedia Analytics Challenges and Opportunities for Creating Interactive Radio Content -- Interactive Search and Exploration in Discussion Forums Using Multimodal Embeddings -- An inverse mapping with manifold alignment for zero-shot learning -- Baseline Analysis of a Conventional and Virtual Reality Lifelog Retrieval System -- An Extensible Framework for Interactive Real-time Visualizations of Large-scale Heterogeneous Multimedia Information from Online Sources --
SS3: MDRE: Multimedia Datasets for Repeatable Experimentation -- GLENDA: Gynecologic Laparoscopy Endometriosis Dataset -- Kvasir-SEG: A Segmented Polyp Dataset -- Rethinking the Test Collection Methodology for Personal Self-Tracking Data -- Experiences and Insights from the Collection of a Novel Multimedia EEG Dataset --
SS4: MMAC: Multi-Modal Affective Computing of Large-Scale Multimedia Data -- Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection -- Enhanced Gaze Following via Object Detection and Human Pose Estimation -- Region Based Adversarial Synthesis of Facial Action Units -- Facial Expression Restoration Based on Improved Graph Convolutional Networks -- Global Affective Video Content Regression Based on Complementary Audio-Visual Features --
SS5: MULTIMED: Multimedia and Multimodal Analytics in the Medical Domain and Pervasive Environments -- Using Publicly Available Medical Images from the Open Access Literature and Social Networks for Model Training and Knowledge Extraction -- AttenNet: Deep Attention based Retinal Disease Classification in OCT Images -- NOVA: A Tool for Explanatory Multimodal Behavior Analysis and its Application to Psychotherapy -- Instrument Recognition in Laparoscopy for Technical Skill Assessment -- Real-time Recognition of Daily Actions Based on 3D Joint Movements and Fisher Encoding -- Model-based and Class-based Fusion of Multisensor Data -- Evaluating the Generalization Performance of Instrument Classification in Cataract Surgery Videos --
SS6: Intelligent Multimedia Security -- Compact Position-aware Attention Network for Image Semantic Segmentation -- Law is Order: Protecting Multimedia Network Transmission by Game Theory and Mechanism Design -- Rational Delegation Computing Using Information Theory and Game Theory Approach -- Multi-hop Interactive Cross-modal Retrieval --
Demo Papers -- Browsing Visual Sentiment Datasets using Psycholinguistic Groundings -- Framework Design for Multiplayer Motion Sensing Game in Mixture Reality -- Lyrics-Conditioned Neural Melody Generation -- A Web-based Visualization Tool for 3D Spatial Coverage Measurement of Aerial Images -- An Attention Based Speaker-Independent Audio-Visual Deep Learning Model for Speech Enhancement -- DIME: An Online Tool for the Visual Comparison of Cross-Modal Retrieval Models -- Real-time Demonstration of Personal Audio and 3D Audio Rendering Using Line Array Systems -- CNN-based Multi-Scale Super-Resolution Architecture on FPGA for 4K/8K UHD Applications -- Effective Utilization of Hybrid Residual Modules in Deep Neural Networks for Super Resolution --
VBS Papers -- diveXplore 4.0: The ITEC Deep Interactive Video Exploration System at VBS2020 -- Combining Boolean and Multimedia Retrieval in vitrivr for Large-Scale Video Search -- An Interactive Video Search Platform for Multi-modal Retrieval with Advanced Concepts -- VIREO @ Video Browser Showdown 2020 -- VERGE in VBS 2020 -- VIRET at Video Browser Showdown 2020 -- SOM-Hunter: Video Browsing with Relevance-to-SOM Feedback Loop -- Exquisitor at the Video Browser Showdown 2020 -- Deep Learning-Based Video Retrieval using Object Relationships and Associated Audio Classes -- IVIST: Interactive Video Search Tool in VBS 2020.
The two-volume set LNCS 11961 and 11962 constitutes the thoroughly refereed proceedings of the 26th International Conference on MultiMedia Modeling, MMM 2020, held in Daejeon, South Korea, in January 2020. Of the 171 submitted full research papers, 40 papers were selected for oral presentation and 46 for poster presentation; 28 special session papers were selected for oral presentation and 8 for poster presentation; in addition, 9 demonstration papers and 6 papers for the Video Browser Showdown 2020 were accepted. The papers of LNCS 11961 are organized in the following topical sections: audio and signal processing; coding and HVS; color processing and art; detection and classification; face; image processing; learning and knowledge representation; video processing; and poster papers. The papers of LNCS 11962 are organized in the following topical sections: poster papers; AI-powered 3D vision; multimedia analytics: perspectives, tools and applications; multimedia datasets for repeatable experimentation; multi-modal affective computing of large-scale multimedia data; multimedia and multimodal analytics in the medical domain and pervasive environments; intelligent multimedia security; demo papers; and VBS papers.
9783030377342
10.1007/978-3-030-37734-2 doi
Multimedia systems.
Computer vision.
Artificial intelligence.
Application software.
User interfaces (Computer systems).
Human-computer interaction.
Multimedia Information Systems.
Computer Vision.
Artificial Intelligence.
Computer and Information Systems Applications.
User Interfaces and Human Computer Interaction.
QA76.575
006.7