Humanid

60

By mizan

The DARPA Human ID Project, as a major funder of ongoing U.S. research in this area, serves as a magnet, mixing bowl, and ultimately test bed provider for the rapidly evolving knowledge base and set of systems. SED is tasked, by DARPA and the NIST-resident Human ID Group (Jonathan Phillips and Patrick Grother of the Information Access Division of ITL), to help with the algorithm comparison test collection.

Background/Impetus

Customers

Goals

Impact

SED Milestones

R&D Team

Achievements

Presentations

Identification and verification of a person's identity are two generic application areas of face recognition systems. In identification applications, an algorithm identifies an unknown face in an image by searching through an electronic mugbook. In verification applications, an algorithm confirms the claimed identity of a particular face. Proposed applications have the potential to impact all aspects of everyday life by controlling access to physical and information facilities, confirming identities for legal and commercial transactions, and controlling the flow of citizens at borders.

For face recognition systems to be successfully fielded, one has to be able to evaluate their performance. To evaluate an algorithm, its behavior is scored on a test set of matchable images in a mugbook known as the Gallery. One computes a similarity matrix that quantifies the proximities of images of a subset of the Gallery (called the Probe set) to each image in the Gallery.

The task of SED is to develop methods for comparing algorithmic performance based solely on comparison of similarity matrices generated by the algorithms under test running against prefixed Probe/Gallery sets. Since the algorithms are proprietary, comparison methods may not presuppose detailed knowledge of any particular algorithm.

Large collections of test images are already in existence (FERET/Army Research Lab/ George Mason Univ./93-96) or currently undergoing development (Human ID/DARPA/99-04). These databases (which include IR, still, video, and hyperspectral images of the face, gait, and iris of thousands of human subjects) provide the Human ID research community with de facto database standards for algorithm development and comparison.

A first, simple approach is to limit the comparisons to replicated same-face match scores, transform the scores from the multiple algorithm outputs to a common scale, and examine ranking's and clustering's produced by application of standard Multiple Comparisons procedures, e.g., Student-Newman-Keuls. A useful common scaling is achieved by Probability Integral Transform-ing (PIT) each algorithm's scores using knowledge of its characteristic score EDF based on larger heterogeneous (FERET) experiments, then applying the inverse Gaussian cumulative distribution. Application of this procedure to a sizable extract of the FERET database yields a credible ranking of 15 algorithms dated 1996-1997.

An extension to a mixture of same-subject and different-subject match scores can be achieved by use of ordinary 2-dimensional MultiDimensional Scaling (MDS). MDS translates similarity matrices into pictorial maps with matrix row/column headers converted into mapped locations with appropriate inter-location distances. A good algorithm should cluster same-subject images and cleanly discriminate among different-subject images. The ability to discriminate, and tightness of clusters as quantified, e.g., by circumscribed Voronoi ellipse aspect ratios, can be used to rank algorithm performance. Demonstration tests against small-scale FERET extracts show this clearly.

While -1-PIT and use of Multiple Comparisons and MDS have the advantage of retaining the ratio scale of the original similarity scores, much of the work already published and currently being done in this area makes use of rank statistics. We are exploring multiple properties and statistics derived from the use of partial rank correlations (PRC). This involves extending the known distributional theory for PRC's based on Kendall and Spearman statistics and applying them to the study of interesting dependency patterns among different algorithms. Loosely, the ID community recognizes that most current algorithms perform most reasonably in scoring true (close) matches and (far) dramatically disparate non-matches: i.e., algorithms perform best at the far ends of the performance scale. It is commonly presumed that enhanced understanding of algorithmic performance (and the dual issue of image difficulty) will come from pushing in at either end of the match/nonmatch performance scale. The application of nonparametric dependence via copula theory to partial rank co-occurrences seems to hold promise for enhanced understanding here.

In addition, the team has performed rough draft work on All Possible Subsets and Alternating Conditional Expectation (ACE) modeling of covariates' explanatory power for FERET similarity scores, as well as the application of simple Stochastic Matrix Ordering techniques for ranking similarity matrices. This may lead to better designs of image recognition studies.

Comments

No comments yet.

Submit a Comment
Members and Guests

Sign in or sign up and post using a hubpages account.



    • No HTML is allowed in comments, but URLs will be hyperlinked
    • Comments are not for promoting your Hubs or other sites

    Please wait working