
Update (MM/DD/YYYY): 06/21/2005

World's Best Performance in Recognizing Persons and Movements

- A Major Step Toward Practical Unattended Monitoring and Recognition by Computer Vision -

Key points

  • With crime and terrorism on the rise, demand is growing for R&D on video surveillance, that is, technologies for video monitoring and automatic recognition. Conventional systems, however, have not achieved practical recognition performance for persons and movements.
  • An innovative surveillance technology with the world's best performance has been developed. It identifies individual persons by gait and immediately detects abnormal (unusual) movements in monitored video.
  • The technology opens the way to practical uses of computer vision, such as intelligent anti-crime cameras that pick out a specified person, and robot vision.


Synopsis

A research group led by Prof. Nobuyuki Otsu, Fellow of the National Institute of Advanced Industrial Science and Technology (AIST) and Specially Assigned Professor at the University of Tokyo, has developed an innovative technology for automatically recognizing persons and movements in monitored video, a key step toward automatic surveillance and identification with anti-crime cameras. The technology extends an adaptive learning recognition system based on higher-order local autocorrelation (HLAC) feature extraction, which was originally developed for two-dimensional still images. The extension, cubic HLAC (CHLAC), applies the same idea to motion images so that features of a target's movements or gait can be extracted. The technology is versatile, fast, and accurate.

CHLAC makes it possible to identify a particular person by gait in monitored video and to detect abnormal movements immediately. When applied to the Human ID Gait Challenge Dataset, the test dataset prepared for the international gait recognition competition organized by the National Institute of Standards and Technology (NIST) as part of the US Human ID Program, CHLAC demonstrated the world's best recognition capability, far surpassing conventional technologies.

Because CHLAC applies not only to human recognition but also to abnormal movement detection and to tracking a specified moving target, it is expected to contribute greatly to R&D in computer vision. Examples include automatic (unattended) video surveillance, for which demand is increasing, such as intelligent anti-crime cameras in the security field, as well as robot vision.

Future research will focus on R&D toward practical application of CHLAC and on vision applications in "real-world information systems," including interactive systems and robotics. This work will proceed in line with R&D projects such as the Urban Area Industrial-Academic-Governmental Collaboration Program "Advanced Video Surveillance" of the Ministry of Education, Culture, Sports, Science and Technology (MEXT), the Priority Task Solution Research "Traffic Accident Prevention Technology" of the Science and Technology Promotion and Coordination Funds, and the 21st Century COE Program "Information Science and Technology Strategic Core" (a priority theme of the Graduate School of Information Science and Technology, University of Tokyo).

The details of this study were presented at MVA2005, the International Association for Pattern Recognition (IAPR) Conference on Machine Vision Applications, held May 16 to 18, 2005, in Tsukuba, Ibaraki, Japan, and have also been submitted to ICCV2005 (10th IEEE International Conference on Computer Vision), to be held October 15 to 21, 2005, in Beijing, China.



Background

With crime and terrorism on the rise, demand is growing for R&D on video surveillance with monitoring cameras. An intelligent monitoring camera must automatically recognize persons and movements and detect abnormal movements in a video scene, and practical solutions are urgently needed. Conventional technologies, however, have not delivered practical recognition capability.

Most existing techniques segment individual moving objects from a video clip and extract features of a target person or his/her behavior by reference to model patterns prepared in advance. These methods are limited in accuracy and require an enormous amount of computation. Moreover, optical flow, the mainstream approach to motion feature extraction, is poorly suited to practical use because it imposes strict prerequisites and is highly susceptible to noise.

History of Research Work

Prof. Otsu's group has been developing an image recognition system with adaptive learning capability, built on the versatile higher-order local autocorrelation (HLAC) feature extraction scheme for 2D still images and grounded in the theory of statistical feature extraction. This work was selected as the 50th Spotlight Invention by the Science and Technology Agency (STA, as it was then) in 1991.

In the present study, HLAC is extended to 3D motion images. The resulting technology, CHLAC, provides versatile, fast, and accurate feature extraction of target motion (patent filed).

Details of Research Work

A video clip is 3D numeric data: a series of 2D still images arranged in time sequence. To recognize and measure a particular target in a video clip, such as a walking person, it is desirable to extract features that do not depend on the person's spatial position, that is, position-invariant features. If a 3D video scene contains multiple targets, it is also desirable that the overall feature value be the sum of the feature values of the individual objects (additivity), which simplifies subsequent processing and improves recognition accuracy. In addition, feature extraction should have a low computational load so that it can run in real time.

Feature extraction based on HLAC and CHLAC is a basic, versatile technique that meets all of these requirements. Combining the extracted features with statistical information integration makes it possible to recognize a wide range of objects in video clips through adaptive learning.
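
The position invariance and additivity described above can be checked on a toy example. The sketch below is an illustration only, not the AIST implementation: it computes just the order-0 and order-1 autocorrelations over a 3x3x3 neighborhood of a binary volume (14 values rather than the full CHLAC feature set), and the names `autocorrelation_features` and `place` are hypothetical. Additivity holds exactly here because the two objects are more than one voxel apart.

```python
import itertools
import numpy as np

def autocorrelation_features(volume):
    """Order-0 and order-1 local autocorrelations of a (T, H, W) volume.

    Simplified illustration: 1 + 13 features, not the full CHLAC mask set.
    """
    v = np.asarray(volume, dtype=np.int64)
    t, h, w = v.shape
    padded = np.pad(v, 1)  # zero border so every displacement stays in bounds
    features = [v.sum()]   # order 0: sum over r of f(r)
    for d in itertools.product((-1, 0, 1), repeat=3):
        if d <= (0, 0, 0):  # skip the zero shift; keep one of each +d / -d pair
            continue
        dt, dy, dx = d
        shifted = padded[1 + dt:1 + dt + t, 1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        features.append((v * shifted).sum())  # order 1: sum of f(r) * f(r + d)
    return np.array(features)

def place(shape, block, origin):
    """Return an empty volume with `block` copied in at `origin`."""
    vol = np.zeros(shape, dtype=np.int64)
    t, y, x = origin
    bt, by, bx = block.shape
    vol[t:t + bt, y:y + by, x:x + bx] = block
    return vol

# A small asymmetric "object" placed at two different positions.
block = np.ones((2, 3, 2), dtype=np.int64)
block[0, 0, 0] = 0
shape = (8, 16, 16)
here = place(shape, block, (1, 2, 3))
there = place(shape, block, (4, 9, 10))

f_here = autocorrelation_features(here)
f_there = autocorrelation_features(there)
f_both = autocorrelation_features(here + there)

print("position invariant:", np.array_equal(f_here, f_there))
print("additive:", np.array_equal(f_both, f_here + f_there))
```

Each feature is a single pass of multiply-and-sum over the volume, which is consistent with the low, fixed computational load the text asks for.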

CHLAC's recognition capability proved to be the best in the world, far surpassing conventional approaches, when applied to the Human ID Gait Challenge Dataset. The dataset, covering 71 persons, was prepared for an international gait recognition competition organized by the National Institute of Standards and Technology (NIST) as part of the US Human ID Program. As Fig. 1 shows, the AIST-developed CHLAC clearly outperformed earlier techniques, particularly on the most difficult problems.


Fig. 1 Comparison of available technologies applied to the Human ID Gait Challenge Dataset, prepared for an international competition. The AIST-developed CHLAC achieved outstanding performance.

The new method also makes it possible to detect abnormal movements in a video clip immediately. The overall feature is the sum of the features of the individual objects in the clip, and the feature vectors of normal movements are distributed in a subspace, the "normal movement subspace," of a 251-dimensional feature space. This subspace is established through continual learning of normal movements. An abnormal movement can then be detected immediately and accurately, without segmentation, as a deviation (measured by distance) from the normal movement subspace. Detection performance remains the same even when multiple persons are present (Fig. 2).

Fig. 2 Examples of human abnormal movement detection (Normal: walking through, Abnormal: tumbling over)
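
The subspace idea can be sketched in a few lines. The example below is a minimal illustration on synthetic data, not the authors' procedure: it assumes the normal movement subspace is a linear subspace through the origin, estimated by SVD without mean-centering, so that a sum of normal feature vectors (several persons moving normally) also lies in the subspace. The function names, the subspace dimension of 3, and the random data are all hypothetical; only the 251-dimensional feature space comes from the text.

```python
import numpy as np

def fit_normal_subspace(normal_features, n_components):
    """Orthonormal basis (rows) of the subspace spanned by normal samples."""
    x = np.asarray(normal_features, dtype=float)
    # No mean-centering: the subspace passes through the origin,
    # so sums of normal vectors remain inside it.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return vt[:n_components]

def deviation(feature, basis):
    """Distance from a feature vector to the normal subspace."""
    f = np.asarray(feature, dtype=float)
    residual = f - basis.T @ (basis @ f)  # remove the in-subspace component
    return float(np.linalg.norm(residual))

# Synthetic stand-in for learned features of normal movements.
rng = np.random.default_rng(0)
directions = rng.normal(size=(3, 251))            # hypothetical normal patterns
normal_train = rng.random((200, 3)) @ directions  # 200 normal samples
basis = fit_normal_subspace(normal_train, 3)

one_person = rng.random(3) @ directions                # one normal mover
two_people = one_person + rng.random(3) @ directions   # features add up
with_fall = one_person + rng.normal(size=251)          # off-subspace component

for name, vec in [("one person", one_person),
                  ("two people", two_people),
                  ("with abnormal motion", with_fall)]:
    print(name, deviation(vec, basis))
```

In this toy setup the first two deviations are numerically zero and the third is clearly larger, so a threshold on the distance would separate them. How the threshold and the subspace dimension are chosen in practice is not stated in the text.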

CHLAC requires neither an object model nor prior knowledge, and needs only a constant amount of computation, which allows real-time processing. It is therefore applicable to a variety of abnormal movement detection problems in automatic, unattended video surveillance.

The new technique also makes it possible to track a moving object automatically by dividing the screen into regions and exploiting the additivity of HLAC. Because color as well as shape is important for tracking an object, HLAC has been extended to color images. Most conventional techniques depend on template matching at the image level, so tracking errors occur frequently because of segmentation errors, including misalignment, or changes in the target's shape. Tracking also fails when the target is hidden or crosses paths with other objects. In contrast, the CHLAC method identifies and recognizes a moving target at the feature level. It requires neither segmentation of the moving object nor massive computation, which makes it a robust, stable method capable of real-time tracking (Fig. 3).


Fig. 3 Application to robust and stable tracking of a moving object
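
One way to read "dividing the screen and using additivity" is sketched below. This is an assumption-laden illustration rather than the published tracker: it works on a single binary frame, uses only order-0 and order-1 terms in a 3x3 window, ignores color, and all names (`product_maps`, `cell_features`, `locate`) are hypothetical. Features are computed once per grid cell, the feature of any block of cells is obtained by summing cell features, and the block closest to the target's feature vector is reported.

```python
import numpy as np

# One displacement from each +/- pair inside a 3x3 window.
OFFSETS = [(0, 1), (1, -1), (1, 0), (1, 1)]

def product_maps(frame):
    """Per-pixel order-0 and order-1 autocorrelation terms, shape (5, H, W)."""
    f = np.asarray(frame, dtype=np.int64)
    h, w = f.shape
    padded = np.pad(f, 1)  # zero border keeps every displacement in bounds
    maps = [f]
    for dy, dx in OFFSETS:
        maps.append(f * padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w])
    return np.stack(maps)

def cell_features(frame, cell):
    """Sum the per-pixel terms inside each grid cell; shape (rows, cols, 5)."""
    maps = product_maps(frame)
    k, h, w = maps.shape
    rows, cols = h // cell, w // cell
    blocks = maps[:, :rows * cell, :cols * cell].reshape(k, rows, cell, cols, cell)
    return blocks.sum(axis=(2, 4)).transpose(1, 2, 0)

def locate(frame, target, cell, span):
    """Top-left cell (row, col) of the span x span block nearest to `target`."""
    cells = cell_features(frame, cell)
    best, best_dist = None, np.inf
    for r in range(cells.shape[0] - span + 1):
        for c in range(cells.shape[1] - span + 1):
            # Additivity: a block's feature is the sum of its cells' features.
            window = cells[r:r + span, c:c + span].sum(axis=(0, 1))
            dist = np.linalg.norm(window - target)
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best

# Target: an L-shaped object; its features are taken from an isolated template.
obj = np.zeros((5, 5), dtype=np.int64)
obj[:, 0] = 1
obj[4, :] = 1
target = product_maps(obj).sum(axis=(1, 2))

# Scene: the target plus a differently shaped distractor.
frame = np.zeros((32, 32), dtype=np.int64)
frame[9:14, 17:22] = obj
frame[24:26, 3:5] = 1

found = locate(frame, target, cell=4, span=2)
print("target found in cell block starting at", found)
```

Matching happens on summed feature vectors rather than on pixel templates, so no segmentation step is involved. A practical tracker would presumably also handle targets that straddle block boundaries, which this sketch does not.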

Future Prospects

Future research will focus on R&D toward practical application of CHLAC, in line with "Advanced Video Surveillance" in the Urban Area Industrial-Academic-Governmental Collaboration Program of the Ministry of Education, Culture, Sports, Science and Technology (MEXT), and "Risk Finding and Avoidance Based on Situation and Intention Understanding" for "Traffic Accident Prevention Technology" in Promotion of Priority Task Solution Research under the Science and Technology Promotion and Coordination Funds. In addition, CHLAC will be applied to R&D on machine vision in the area of "Real-world Information Systems," including interactive dialog systems and robotics, in line with the 21st Century COE Program "Information Science and Technology Strategic Core" (a priority theme of the Graduate School of Information Science and Technology, University of Tokyo).

With its high performance and broad versatility, the newly developed CHLAC is expected to find use in a wide variety of applications: video surveillance for security and disaster prevention, including monitoring camera systems, guard and security systems, and disaster-prevention monitoring systems; automatic indexing of video records, that is, building a video index by automatically detecting and editing scene breaks; medical care, welfare, and sports, including rehabilitation aids, behavioral correction, and training systems; and other fields of computer vision in general, including interactive systems and robot vision.





