Research Summary


Kazunori Okada, Ph.D.

Curriculum Vitae


A)    Biomedical Image Analysis

B)    Computer Vision & Pattern Recognition & Machine Learning

C)    Face Recognition

D)   Education Technology


A) Biomedical Image Analysis:


Lung Nodule Analysis in 3D CT Scans

Lung cancer is the leading cause of cancer death in the US. These cancers often appear as focal concentrations of high intensity in CT scans, known as lung nodules. This project aims to provide a comprehensive suite of robust and accurate methods for the detection, segmentation, and analysis of such nodules for early detection and accurate staging. In particular, our method can robustly segment technically challenging Ground-Glass Opacity (GGO) nodules, which pose a higher malignancy rate than the more common solitary nodules.

(Example Results: MICCAI04)

Nodule Detection:

KDD06, US7756313

Nodule Fitting:

ECCV04, CVPR04, TMI05, MMMISR10, US7308132, US7616792, US7995809

Nodule Segmentation:

MICCAI04, MICCAI05, CVPR05, US7430321

RECIST Measurement:

CVBIA05, US7590273

GGO Nodule Segmentation:


GBM Brain Tumor Segmentation & CAD in 3D Diffusion-Weighted MRI

Glioblastoma multiforme (GBM) is the most common and most aggressive malignant primary brain tumor in humans. GBM has the worst prognosis of any CNS malignancy. Image-based delineation and diagnosis of GBM remain a significant challenge. We propose a robust Computer-Aided Therapeutic Response (CARrx) paradigm that uses diffusion-weighted MRI as a surrogate biomarker to reveal changes in the tumor microenvironment that precede morphologic tumor changes. To account for the large morphological changes of GBM, we propose ensemble segmentation algorithms that combine results from multiple segmentation algorithms as well as perturbations due to user interaction.

Brain Tumor Segmentation:



Brain Tumor CADrx:





Dental Lesion CAD in 3D Cone Beam CT Scans

In endodontics, 3D cone beam CT has shown promise for classifying granulomas (tissue infection) and cysts, which could not be differentiated by traditional X-ray imaging in the past, leading to potentially unnecessary surgical removal of these lesions. We propose a new CAD scheme that offers a non-invasive differential diagnosis of the two types of dental lesions to avoid unnecessary tissue damage. Our solution combines graph-based random walker segmentation and a new LDA-Boost algorithm for diagnostic classification.

Dental Lesion Segmentation:



Dental Lesion CAD:






3D Topology Type Classification in 3D Chest CT Scans

Correct classification of the topological types of local anatomical and pathological structures plays a crucial role in many 3D medical image analysis applications, such as vascular and airway tree analysis, detection of internal bleeding, and false-positive detection for lung nodule detection. We propose a novel algorithm to classify a local 3D structure into four types of topology (i.e., blob, plane, tube, and branching tube). The key idea is to transform the analysis of 3D objects into a 2D clustering problem that can be solved efficiently. The proposed algorithm is also extended to a Bayesian formalism with a novel mean shift clustering method that accounts for our circular domain data.





Robust and Efficient Click Point Linking as Local Image Registration

In some clinical workflows that involve medical image registration, readers attend to only one local area at a time but study a series of such areas successively. We propose a robust and efficient algorithm to establish a local point correspondence between a pair of 3D images for therapy-monitoring applications, as an alternative to standard non-rigid registration. Given a set of landmark feature correspondences, our solution estimates the point in the other volume that corresponds to a user-specified point. The estimation process uses mean shift mode seeking over a set of affine-invariant candidates.





Biological 2D Cellular Image Analysis

This project aims to develop a suite of tools to analyze technically challenging biomedical images, such as 2D optical images of immunostained cell nuclei in the nervous system of the tobacco hornworm, for studying the effect of starvation on neural development, and of Nissl-stained brain slices, for data mining a rat brain atlas. Our cell counting solution for the tobacco hornworm images exploits the fast radial symmetry transform originally proposed for generic object/face detection. A Gabor wavelet-based image representation is applied to estimate the slice depth in an atlas corresponding to an arbitrary input image.

Stained Cell Nuclei Counting:


Stained Brain Slice Image Classification:





B) Computer Vision & Pattern Recognition & Machine Learning:


Histogram Distance Measure

Histogram features, such as SIFT and Shape Context, are ubiquitous tools in computer vision and pattern recognition. Earth mover’s distance is a well-known cross-bin metric for such histogram features that accounts for ill-aligned histogram bins; however, it suffers from a high computational cost that is larger than O(n^3). We propose an O(n^2) exact earth mover’s distance algorithm that simplifies the underlying linear program by exploiting the L1 ground distance. We further improve the computational complexity of cross-bin distance computation by proposing Diffusion Distance, whose cost is reduced to near O(n) while maintaining high scores on the SIFT and shape matching tasks we evaluated.
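
To illustrate why a cross-bin metric matters, the sketch below computes the earth mover’s distance for the simplest possible case: two 1D histograms of equal total mass with the L1 ground distance |i - j| between bins. In that special case the minimum transport cost has a well-known closed form, the sum of absolute differences of the cumulative histograms. This is not the O(n^2) algorithm or the Diffusion Distance described above; the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def emd_1d_l1(h1, h2):
    """Minimum transport cost between two 1D histograms of equal mass,
    with ground distance |i - j| measured in bin units."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    if h1.shape != h2.shape:
        raise ValueError("histograms must have the same number of bins")
    if not np.isclose(h1.sum(), h2.sum()):
        raise ValueError("histograms must have equal total mass")
    # The running surplus of mass must be carried across each bin boundary.
    return float(np.abs(np.cumsum(h1 - h2)).sum())

def bin_to_bin_l1(h1, h2):
    """Plain bin-to-bin L1 distance, which ignores how far bins are apart."""
    return float(np.abs(np.asarray(h1, float) - np.asarray(h2, float)).sum())

a = np.array([0, 1, 0, 0, 0], dtype=float)
b = np.array([0, 0, 1, 0, 0], dtype=float)  # a shifted by one bin
c = np.array([0, 0, 0, 0, 1], dtype=float)  # a shifted by three bins

print(bin_to_bin_l1(a, b), bin_to_bin_l1(a, c))  # identical for both shifts
print(emd_1d_l1(a, b), emd_1d_l1(a, c))          # grows with the shift
```

The bin-to-bin distance cannot tell a one-bin shift from a three-bin shift, whereas the cross-bin cost grows with the size of the misalignment.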

Earth Mover’s Distance with L1:



Diffusion Distance:



Mean Shift Algorithms

Mean shift is a convergent hill-climbing algorithm that efficiently seeks a mode of a kernel-smoothed function, such as a kernel density function or Gaussian-smoothed data. Mean shift has been successfully applied to various vision applications, such as clustering, segmentation, and tracking, to name a few. Our research extends mean shift to a wider range of data analysis tasks, including Gaussian scale space, MAP estimation, and directional/circular domain data analysis. These extensions have been successfully applied to various biomedical image analysis problems. A general tutorial on mean shift in Japanese is also provided.
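
The basic iteration can be sketched as follows. This is the standard Gaussian-kernel mean shift for a kernel density estimate in a Euclidean space, not any of the scale-space, prior-constrained, or directional extensions listed in this section. The function name, bandwidth, and synthetic data are assumptions made for the example.

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=500):
    """Climb from `start` to a nearby mode of the Gaussian kernel density
    estimate of `points` by repeatedly moving to the weighted sample mean."""
    points = np.asarray(points, dtype=float)
    x = np.asarray(start, dtype=float).copy()
    for _ in range(max_iter):
        sq_dist = np.sum((points - x) ** 2, axis=1)
        weights = np.exp(-0.5 * sq_dist / bandwidth ** 2)
        new_x = weights @ points / weights.sum()
        if np.linalg.norm(new_x - x) < tol:  # converged
            return new_x
        x = new_x
    return x

# Two synthetic 2D clusters (hypothetical data for the example).
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(200, 2))
cluster_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(200, 2))
data = np.vstack([cluster_a, cluster_b])

mode_a = mean_shift_mode(data, start=[0.5, -0.5], bandwidth=0.5)
mode_b = mean_shift_mode(data, start=[4.5, 5.5], bandwidth=0.5)
print(mode_a, mode_b)
```

Each start point converges to the mode of the cluster it is closest to, which is the property that mean shift clustering and voting schemes rely on.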

Scale Space Mean Shift:



Prior Constrained Mean Shift:



Directional Mean Shift:


General Tutorials:

CVIM08 (In Japanese)

CV09 (In Japanese)

Anisotropic Scale Selection & Robust Gaussian Fitting

Automatic scale selection with gamma-normalized scale-space derivative functions was pioneered by Lindeberg in the 1990s and offered a theoretical basis for the popular SIFT keypoint detection algorithm and other related studies. We propose an extension of Lindeberg’s linear scale selection principles to anisotropic (covariate) scales, yielding three novel scale selection rules with scale-space gradients and Hessians. The resulting rules are applied to formulate a new Gaussian function fitting method that is robust against data truncation and strong noise.
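
As background, the classical isotropic case can be sketched numerically. For a 2D Gaussian blob of standard deviation sigma0, the scale-normalized Laplacian response at the blob center (normalization factor sigma^2, i.e. gamma = 1) peaks at sigma = sigma0. The sketch below checks this on a sampled blob. It covers only the isotropic principle, not the anisotropic rules proposed here, and all names, grid sizes, and scale ranges are illustrative assumptions.

```python
import numpy as np

def gaussian_blob(size, sigma0):
    """Unnormalized isotropic 2D Gaussian blob centered on an odd-sized grid."""
    half = size // 2
    ax = np.arange(-half, half + 1, dtype=float)
    xx, yy = np.meshgrid(ax, ax, indexing="ij")
    return np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma0 ** 2))

def normalized_log_at_center(image, sigma):
    """Magnitude of the scale-normalized Laplacian-of-Gaussian response
    at the center pixel of a square, odd-sized image."""
    half = image.shape[0] // 2
    ax = np.arange(-half, half + 1, dtype=float)
    xx, yy = np.meshgrid(ax, ax, indexing="ij")
    r2 = xx ** 2 + yy ** 2
    gauss = np.exp(-r2 / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)
    log_kernel = (r2 / sigma ** 4 - 2.0 / sigma ** 2) * gauss
    # sigma**2 is the gamma = 1 normalization factor (t = sigma**2).
    return sigma ** 2 * abs(float(np.sum(log_kernel * image)))

sigma0 = 4.0
blob = gaussian_blob(size=129, sigma0=sigma0)
sigmas = np.arange(2.0, 8.01, 0.25)
responses = [normalized_log_at_center(blob, s) for s in sigmas]
best_sigma = float(sigmas[int(np.argmax(responses))])
print(best_sigma)  # expected to land close to sigma0
```

Without the sigma^2 factor the response would decrease monotonically with scale, so no characteristic scale could be read off from its maximum.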







Machine Learning (Various)

Many detection and classification tasks in biomedical image analysis research involve machine learning. Our research in this context addresses adopting and improving generic ML algorithms to best solve the specific biomedical imaging problems at hand. Bayesian classification, Random Forest, 1-norm SVM, AdaBoost, and LDA have been studied in solving various biomedical image analysis tasks. In particular, performance analysis and its application is a promising approach for meeting clinical demands for high accuracy in data analysis results. To this end, we propose a coupled segmentation and classification algorithm that exploits the classification accuracy measure to improve the segmentation accuracy.

Bayesian Classification:


Random Forest for CAD:

CDMRI08, Algorithms09

1-norm SVM for Cascade Detection:

KDD06, US7756313

Boosted LDA:


Performance Analysis & Its Applications: ICPR08

Entropic Image Feature Measure with Jensen-Shannon Divergence

This paper proposes a stratified regularity measure: a novel entropic measure to describe data regularity as a function of data domain stratification. Jensen-Shannon divergence is used to compute a set-similarity of intensity distributions derived from stratified data. We prove that the derived regularity measures form a continuum as a function of the stratification’s granularity and are also upper-bounded by the Shannon entropy. This allows the measure to be interpreted as a generalized Shannon entropy with an intuitive spatial parameterization.
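
The set form of the Jensen-Shannon divergence can be sketched as follows: the entropy of the weighted mixture of the distributions minus the weighted mean of their individual entropies. This is the standard definition only, not the stratified regularity measure itself. The function names and the toy distributions are assumptions for illustration.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits, treating 0 * log(0) as 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

def jensen_shannon(distributions, weights=None):
    """Generalized Jensen-Shannon divergence of a set of distributions:
    H(sum_i w_i p_i) - sum_i w_i H(p_i)."""
    dists = np.asarray(distributions, dtype=float)
    n = dists.shape[0]
    if weights is None:
        w = np.full(n, 1.0 / n)  # equal weight per distribution
    else:
        w = np.asarray(weights, dtype=float)
    mixture = w @ dists
    mean_entropy = sum(wi * shannon_entropy(pi) for wi, pi in zip(w, dists))
    return shannon_entropy(mixture) - float(mean_entropy)

# Toy intensity distributions of two strata (hypothetical).
same = [[0.5, 0.5], [0.5, 0.5]]      # identical strata
disjoint = [[1.0, 0.0], [0.0, 1.0]]  # strata with no shared intensities

print(jensen_shannon(same))      # identical distributions give zero
print(jensen_shannon(disjoint))  # equals the entropy of the mixture here
```

Because each individual entropy is non-negative, the divergence can never exceed the Shannon entropy of the mixture, which is one simple way to see an entropy upper bound of the kind stated above.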


Ensemble Segmentation

Segmentation is an ill-posed problem in that no single solution is optimal for all problems. Moreover, initialization specified by user interaction is often used in semi-automatic solutions, which adds further uncertainty and dependence on the particular data and problem. We propose an ensemble segmentation approach that addresses these uncertainties by integrating segmentation results from different segmentation algorithms and different initializations. We derive a data-driven confidence map for each algorithm for the integration. Otsu, Fuzzy-connectedness, GrowCut, and voxel classification methods are considered. For the semi-automatic algorithms, a Monte Carlo simulation of the user-interaction space is used to generate and weight various segmentation results, yielding repeatable segmentation results that are robust against different initializations.
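
One simple way to integrate several binary segmentations with per-algorithm confidence maps is a confidence-weighted vote at each voxel, sketched below. The text above does not specify how the confidence maps are derived or combined, so the fusion rule, the threshold, and the toy inputs are all assumptions made for illustration.

```python
import numpy as np

def fuse_segmentations(masks, confidences, threshold=0.5):
    """Confidence-weighted voting over k binary masks.

    masks:       array of shape (k, ...) with values 0 or 1
    confidences: array of shape (k, ...) with non-negative weights
    Returns a boolean mask of shape (...).
    """
    masks = np.asarray(masks, dtype=float)
    conf = np.asarray(confidences, dtype=float)
    total = conf.sum(axis=0)
    # Voxels with zero total confidence fall back to background.
    safe_total = np.where(total > 0, total, 1.0)
    score = (conf * masks).sum(axis=0) / safe_total
    return score >= threshold

# Three hypothetical algorithms segmenting a four-voxel strip.
masks = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 1],
                  [1, 1, 1, 0]])
uniform = np.ones_like(masks, dtype=float)
# The second algorithm is trusted much more at the last voxel.
weighted = uniform.copy()
weighted[1, 3] = 5.0

print(fuse_segmentations(masks, uniform))
print(fuse_segmentations(masks, weighted))
```

With uniform confidence this reduces to a majority vote; a spatially varying confidence map lets a single algorithm override the majority where it is known to be reliable.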






C) Face Recognition:


Pose Invariant Face Recognition with Statistical Face Models

Major challenges in automatic face recognition include image variations due to head pose changes. We propose a statistical model that learns a bi-directional mapping function between image features and head pose parameters. Once trained, the model offers head pose analysis (feature-to-pose mapping) and image synthesis with arbitrary pose (pose-to-feature mapping). A Gabor wavelet-based image object representation is adopted as our image feature, and PCA is used to establish a compact basis for this feature space. The resulting models in both the 2D and 3D data domains are used to design a pose-invariant face recognition system.

2D Models:









3D Models:


Adaptive Face and Person Recognition for Automatic Video Indexing

In automatic face recognition research, little effort has been made to study its memory learning aspect, with more focus placed on information processing. To develop fully adaptive AI systems, an FR system must be able to find a new knowledge instance automatically and integrate it incrementally into its domain knowledge. We propose an automatic known-person database management method that adapts the database of known persons according to whom the system meets over time.

(Example results: ROMAN01)




Elastic Graph Matching: FERET Phase II Face Recognition Competition

Inspired by how the human brain processes visual information, we adopt a multi-scale and multi-orientation Gabor wavelet transform to represent images. In this scheme, each face is represented by a labeled graph in which each node is associated with a Gabor feature sampled at a specific fiducial point in a facial image. Sets of graphs are generated to represent knowledge of human faces in different sizes and orientations. Such graph sets can be matched against an arbitrary input by elastic graph matching. This book chapter summarizes the overall algorithm we successfully entered into the FERET competition held in 1997 by the US Army Research Labs.


ICA-based Face Recognition

In this study, we adopt independent component analysis (ICA) as our base face representation and assess its properties in comparison with PCA. Exploiting the sparse nature of the ICA basis, we apply ICA to build a facial expression recognition system.

 (Example results: PRMU99)

PRMU99 (In Japanese)

Face Detection & Recognition with Infrared Imaging

We propose a pose- and illumination-invariant face detection and recognition system using an active near-infrared camera. Illumination variation is a major issue that prevents FR technology from being deployed in uncontrolled environments. The active near-infrared camera used in this work makes it possible to image faces under strong illumination conditions. We train pose-specific AdaBoost face detectors for different head poses in order to handle pose variations.


Distributed Face Recognition with Data Compression

We investigate the application of a face recognition system in a distributed environment. Images of faces are captured by clients remotely and transmitted to a server for recognition or authentication using a central database. In many distributed scenarios, bandwidth may be limited and transmission of image data may not be feasible. We assume the client does not have processing limitations and can extract and transmit compressed features. In this paper we explore the impact of feature compression on face recognition performance. Specifically, we propose an embedded coding scheme for Gabor-based wavelet features extracted from optimally selected landmarks on the face. Our results show that the impact on recognition rates, even at the highest compression rates, is minimal.


Part-based Object Detection & Recognition

3D shape primitives, known as Geons (Biederman 87), have been proposed as an image-based model in the brain for object recognition. We propose a parametric method to represent and detect such Geons by exploiting joint statistical constraints of steerable pyramids. In computer vision, similar part-based object representation and detection have been widely investigated lately. We propose a robust method to detect parts of objects that are visually dissimilar but semantically equivalent, such as the mouth in various facial images. Mean shift is used to achieve a robust voting consensus from a set of candidate locations of the target object part, derived from geometric invariants.

Joint-Statistics of Geons:



Detection of Visually Dissimilar Object Parts:


Comparative Cognitive Studies on Face Recognition

The respective influences of exposure and inborn neural networks on conspecific and nonconspecific face processing remain unclear. Although the importance of exposure in the development of object and face recognition in general is well documented, studies explicitly comparing face recognition across species showed a species-specific effect. In this study, we investigate conspecific and nonconspecific face recognition in chimpanzees (Pan troglodytes) from 2 primate centers that provided different exposure to chimpanzee and human faces. Our results showed that the chimpanzees from the center providing more exposure to human faces than to chimpanzee faces were better at discriminating human faces than at discriminating chimpanzee faces. Chimpanzees’ scores were significantly correlated with the theoretical facial similarity values computed with Gabor wavelets. Overall, the results show that exposure is a critical determinant in conspecific and nonconspecific face recognition.

Face Recognition in Chimpanzees:


Face and No-Face Recognition:





D) Education Technology:


Web-Based Lesson Plan Creator for Teaching Special Education Program

We develop an online web-based tool for assisting teaching credential students in a mild-to-moderate special education program in CA to create sound lesson plans and manage their e-portfolios. Despite the tremendous challenges in preparing next-generation special education teachers to meet our classroom demands, there are currently no useful tools in practice to help these teacher candidates learn how to make sound lesson plans for specific IEPs and regulatory content standards. Our goal is to develop a software tool that automatically suggests evidence-based teaching strategies that conform to specific IEPs and content standards by analyzing published peer-reviewed articles with data mining techniques.






Last modified on 2011.07.02