Junsong Yuan, Gang Yu, Zicheng Liu
Title: Human action analysis with 2D and 3D sensors
Abstract: Human action analysis is a critical task and an emerging topic in many multimedia applications. In the past few years, there has been significant progress in action recognition with conventional 2D video cameras. Effective techniques have been developed to address many challenging issues in real-world environments, such as dynamic and cluttered backgrounds and occlusions. More recently, the availability of commodity depth cameras has brought a new level of excitement to this field, and rapid progress has been made on the new technical issues that arise in action recognition with 3D depth cameras. In this tutorial, we introduce the basics of human action analysis using both regular and depth cameras. The topics cover action analysis with depth cameras, action and abnormal event detection in surveillance videos, and action analysis in consumer videos such as movies and user-generated YouTube videos.
Action Detection and Search
For action analysis, the majority of research so far has focused on classifying a whole video sequence, assigning a single concept label to the entire sequence. However, the granularity of the resulting concept labels is usually too coarse to be useful in real-world applications. It is not uncommon for a video sequence to contain multiple activities that occur simultaneously at different locations. Therefore, instead of annotating an entire video sequence as a single event, efficient and powerful tools are required to “locate” the actions despite cluttered and dynamic scenes. Action localization in videos involves finding both the spatial and temporal extent of an action, that is, where and when the action or activity occurs; it is inherently more difficult than classification. For video surveillance applications, detection of both normal and abnormal actions (e.g., abandoning luggage, deviating from a normal path) is of great importance. We will give a comprehensive overview of the recently developed techniques in this direction, and discuss the challenges and potential applications in multimedia information retrieval.
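To make the notion of temporal localization concrete, below is a minimal sketch in Python of one common ingredient: given per-frame confidence scores from some pre-trained action classifier (with background frames scoring negative), the temporal extent of the action can be estimated as the maximum-sum segment of the score sequence. The function and variable names are hypothetical illustrations, not the tutorial presenters' actual method, and spatial localization is omitted for brevity.

```python
def localize_action(frame_scores):
    """Return (start, end, score) of the temporal segment with the maximum
    summed score (Kadane's algorithm). Assumes scores are centered so that
    background frames contribute negative values."""
    best_start, best_end, best_score = 0, 0, float("-inf")
    cur_start, cur_score = 0, 0.0
    for i, s in enumerate(frame_scores):
        if cur_score <= 0:
            # Restarting the segment here beats extending a non-positive prefix.
            cur_start, cur_score = i, s
        else:
            cur_score += s
        if cur_score > best_score:
            best_start, best_end, best_score = cur_start, i, cur_score
    return best_start, best_end, best_score

# Example: positive scores mark frames where the action is likely present.
scores = [-0.5, -0.2, 0.8, 1.1, 0.9, -0.3, -0.6]
print(localize_action(scores))  # -> (2, 4, 2.8): frames 2..4 contain the action
```

In practice the same idea is applied over spatio-temporal subvolumes or paths rather than whole frames, which is where the efficiency challenges discussed in the tutorial arise.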
Depth-camera based Action Recognition
To use a depth camera for action recognition, the first question is: what is a good visual representation? In this tutorial, we will provide a comprehensive overview of the visual representations developed for different types of 3D sensors, with more detailed descriptions of recently developed representations that are better suited to commodity depth cameras. In addition, we will provide case studies of some recently developed 3D action recognition systems and discuss future research directions.
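As one simple illustration of a depth-based representation, the sketch below normalizes skeleton joint positions (as produced by a commodity depth sensor's skeleton tracker) relative to a reference joint and to body size, yielding a per-frame pose feature that is roughly invariant to where the subject stands. This is a hedged, generic example assuming such joint data is available; it does not correspond to any specific system covered in the tutorial.

```python
import numpy as np

def relative_pose_feature(joints, ref_idx=0):
    """joints: (J, 3) array of 3D joint positions from a depth sensor's
    skeleton tracker. Returns a flat feature vector of length J*3."""
    joints = np.asarray(joints, dtype=float)
    centered = joints - joints[ref_idx]             # reference joint becomes the origin
    scale = np.linalg.norm(centered, axis=1).max()  # rough body-size normalization
    if scale > 0:
        centered /= scale
    return centered.ravel()

# A per-frame feature like this can be concatenated or pooled over time
# before being fed to a sequence classifier.
frame = np.random.rand(20, 3)              # e.g., 20 joints from a Kinect-style tracker
print(relative_pose_feature(frame).shape)  # (60,)
```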
Bio: Junsong Yuan is a Nanyang Assistant Professor at Nanyang Technological University (NTU), Singapore, and currently the program director of video analytics at the Infocomm Center of Excellence, School of EEE, NTU. He received the EECS Outstanding Ph.D. Thesis Award from Northwestern University, USA, and the Doctoral Spotlight Award from the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR'09). He has been invited to present his action detection work at a number of universities and industry labs in the past three years, including UIUC, Peking University, the Chinese Academy of Sciences, Microsoft Research Redmond, Motorola Applied Research Center, and Nokia Research Center. He has published 60 papers in peer-reviewed journals and conferences, and has filed three US patents and two international patents. He is co-chair of two workshops at IEEE CVPR'12 and has served as an editor, co-chair, PC member, and reviewer for many international journals, conferences, workshops, and special sessions.
Gang Yu is currently a Ph.D. candidate at Nanyang Technological University. Before that, he received an M.Eng. degree from Shanghai Jiao Tong University in 2010. His research interests focus on computer vision and multimedia analysis, specifically human action recognition and search. He received the MSRA Fellowship in 2011.
Zicheng Liu is a senior researcher at Microsoft Research, Redmond. His current research interests include human activity recognition, face modeling and animation, and multimedia collaboration. He received a Ph.D. in Computer Science from Princeton University. He has published over 80 papers in peer-reviewed international journals and conferences, and holds over 50 granted patents. He co-authored the book “Face Geometry and Appearance Modeling: Concepts and Applications”, Cambridge University Press, 2011. He has served on the technical committees of many international conferences. He is a technical co-chair of both the 2010 and 2014 ICME, a co-organizer of the 2011 and 2012 CVPR Workshops on Human Activity Understanding from 3D Data, and a general co-chair of the 2012 IEEE Visual Communication and Image Processing conference. He is an associate editor of both the Machine Vision and Applications journal and the Journal of Visual Communication and Image Representation. He is a senior member of the IEEE.
Title: Immersive Media Technology Experiences
Abstract: The tutorial will start by defining and discussing the aspects of Immersion, Media Technology, and Experiences, and then link these to three main Focus Areas: User, Content, and Infrastructure. The glue in this linking is the Users interacting with the Content through devices connected to the Infrastructure. Such interactions and devices demand new digital media and the ability for the Content to adapt to User requests and to the available Infrastructure (Networked Media Handling). This structure allows for piloting new advanced applications and services such as digital storytelling, digital art, serious gaming, and presence and immersive experience (interactivity). The talk will also outline potential usage scenarios and industrial collaborations.
Topics that will be covered:
• Content processing
o Physical representations of digital media, compression and representation
§ Focus on the five “highs”: resolution, frames per second, high dynamic range, color, and views
o Networked media handling
o Applications and services
§ Focus on: Internet and IP networks
o Digital storytelling, semantics
§ Focus on: new digital media, haptics, and GUIs
o Creativity, quality and trust
§ Focus on: experiences, quality, Quality of Experience, and satisfaction
• Eco system
o Holistic view, business models, economy
§ Focus on: workflow, stickiness, and finances
Bio: Prof. Andrew Perkis was born in Norway in 1961. He received his Siv.Ing. and Dr. Techn. degrees in 1985 and 1994, respectively. In 2008 he received an executive Master of Technology Management degree, offered in cooperation between NTNU, NHH, and the National University of Singapore (NUS). He is a professor at the Department of Electrical Engineering and Telecommunications at NTNU. In 1999/2000 he was a visiting professor at the University of Wollongong, Australia, and in 2008 a visiting professor at NUS.
He is a member of the management team of the Norwegian National Centre of Excellence Quantifiable Quality of Service for Communication Systems (Q2S), in the ICT domain, where he manages the area of "Networked Media Handling". He was a co-founder and manager of the Midgard Media Lab at NTNU (1999-2010) and a co-founder of Adactus AS (2003; sold to Vizrt AS in 2010).
He has previously managed UMA (Universal Multimedia Access from Wired and Wireless Systems), funded by the Norwegian Research Council; WIRAC (Wideband Radio Access), funded by the Royal Norwegian Research Council; and Universal Access to the Multimedia Portal, funded by NORDUnet2 under the Nordic Council of Ministers.
His research focuses on multimedia signal processing, specifically new digital media using second-generation image and video compression schemes, multimedia descriptors, and the Multimedia Framework (MPEG-21). He has been one of the initiators and developers of the concept of Universal Multimedia Access applied to wired and wireless systems. His main contributions are in ensuring optimal media presentation on handheld devices by negotiating device capabilities prior to media delivery. His work includes methods and functionality for content representation, quality assessment, and their use within the media value chain in a variety of applications. He is also involved in setting directions and visions for new research within media technology and art. He is currently heading a group of researchers aiming to establish a national centre for research-based innovation within Immersive Media Technology Experiences (IMTE).
Within applied research he is heavily involved in multi-platform publishing, especially to handheld devices. He has been involved in the start-up company Adactus and in the commercial aspects of the Digital Cinema roll-out through running the Norwegian trial project NORDIC.
He is a member of the Norwegian Academy of Technological Sciences (NTVA), a senior member of the IEEE, a member of the ACM, and a member of the Norwegian Society of Chartered Engineers (TEKNA).
He has more than 200 publications at international conferences and workshops and more than 60 contributions to international standards bodies.