Abstract—To address the shortcomings of the locally linear embedding (LLE) algorithm and the maximum margin criterion (MMC) algorithm in feature extraction, an efficient dimensionality reduction and classification algorithm, local graph embedding based on the maximum margin criterion (LGE/MMC), is proposed. The method constructs an intrinsic graph and a penalty graph under the premise of preserving nearest-neighbor relationships. In the intrinsic graph, the nonlinear structure of the high-dimensional data space is discovered through the local symmetry of linear reconstruction, drawing samples of the same class together as much as possible; at the same time, the penalty graph pushes samples of different classes as far apart as possible. The "small sample size" problem is avoided by employing MMC, and the neighborhood relationship is better described by an appropriate modification of the adjacency matrix. Face recognition experiments on the ORL, Yale and AR face databases demonstrate the effectiveness of the proposed method in comparison with the DLA and LLE+LDA methods.
Index Terms—locally linear embedding; dimensionality reduction; face recognition; maximum margin criterion; local graph embedding
I. Introduction
Face recognition has become a major research topic in pattern recognition, machine vision and computer vision, and many linear and nonlinear dimensionality reduction methods have been applied to it. Principal component analysis (PCA) [5] and linear discriminant analysis (LDA) [6] are two classical dimensionality reduction methods, but both are linear and do not perform well on nonlinear data. In recent years, the theory and applications of manifold-learning-based nonlinear dimensionality reduction have made great progress. The most representative methods are isometric mapping [7], locally linear embedding (LLE) [8-9] and Laplacian eigenmaps [10-11]. These methods map high-dimensional data to a corresponding low-dimensional space while preserving the topology of the original data. Among them, LLE is an unsupervised learning algorithm that maps high-dimensional data to a single global low-dimensional coordinate system while maintaining local neighborhood relationships. LLE can uncover the intrinsic structure of the data points, but it is not well suited to classification. Existing supervised variants of LLE fall into two categories: the first incorporates class information when selecting the k nearest neighbors of each sample point [12-16]; the second combines LLE with LDA, first reducing dimensionality with LLE and then classifying with LDA [16-17].
Although these nonlinear dimensionality reduction techniques can in theory discover low-dimensional embeddings of complex manifolds, and produce good results on artificial test data, they face two problems in many practical applications: 1) a heavy computational load; and 2) they only produce an embedding for the training points and cannot map new data. He et al. proposed the Laplacianfaces algorithm [18] and the neighborhood preserving embedding algorithm [19], which have been successfully applied to face recognition; but, like LDA, these two algorithms also suffer from the "small sample size" problem [20]. Li et al. [21] proposed the maximum margin criterion as an alternative criterion function, but it may not be valid for nonlinear data.
To solve the problems faced by the above methods, this paper combines the MMC and LLE algorithms and proposes an effective dimensionality reduction and classification method, local graph embedding feature extraction based on the maximum margin criterion (LGE/MMC), and applies it to face recognition. In the compact within-class (intrinsic) graph, the local symmetry of linear reconstruction in the high-dimensional data space is used to discover the nonlinear structure, drawing samples of the same class together as much as possible. Since LLE is unsupervised and does not enhance classification ability, the intrinsic graph takes the class information of the samples into account so that samples of the same class become as compact as possible; meanwhile, the between-class penalty graph pushes samples of different classes as far apart as possible. By optimizing over these two adjacency graphs, the algorithm makes the dimension-reduced data as close as possible within each class and as far apart as possible between classes. Experimental results on the standard ORL, Yale and AR face databases verify its effectiveness.
1. LLE and MMC
Let $X = \{x_1, x_2, \dots, x_n\}$, $x_i \in R^D$, be a sample set in a high-dimensional Euclidean space. We seek a projection matrix $A$ that maps these samples to a relatively low-dimensional feature space $R^d$, $d \ll D$. The representation of the samples in the new feature space is $Y = \{y_1, y_2, \dots, y_n\}$, $y_i = A^T x_i$, where $A = [a_1, a_2, \dots, a_d]$ is the projection matrix composed of the optimal discriminant vectors $a_i$.
1.1 LLE
The LLE algorithm is a nonlinear dimensionality reduction method; its basic idea is to preserve the local neighborhood relationships of the original manifold among the high-dimensional data while mapping the data to a single global low-dimensional coordinate system.
The LLE algorithm is usually implemented in three steps:
Step 1. For each sample point $x_i$ in the data set, find its $k$ nearest neighbors $\{x_{i1}, x_{i2}, \dots, x_{ik}\}$, where $k$ is a pre-specified value.
Step 2. Compute the local reconstruction weight matrix of the sample points by defining the error function

$$\varepsilon(W)=\sum_{i=1}^{n}\Big\|x_i-\sum_{j=1}^{k}W_{ij}x_{ij}\Big\|^2 \qquad (1)$$
By minimizing formula (1), the reconstruction weights of each sample point are computed subject to $\sum_{j} W_{ij} = 1$; if $x_j$ is not among the neighbors of $x_i$, then $W_{ij} = 0$.
Step 3. Map all sample points to the low-dimensional space by minimizing the embedding cost function

$$\Phi(Y)=\sum_{i=1}^{n}\Big\|y_i-\sum_{j=1}^{k}W_{ij}y_{ij}\Big\|^2 \qquad (2)$$
With the weights $W_{ij}$ fixed, minimizing the objective function (2) yields the $d$-dimensional projection $y_i$ of each $x_i$. To guarantee a unique solution, the constraints $\sum_{i} y_i = 0$ and $\frac{1}{n}\sum_{i} y_i y_i^T = I$ must be satisfied. By the Rayleigh-Ritz theorem, the problem reduces to computing the $d+1$ smallest eigenvalues of the cost matrix and the corresponding eigenvectors in ascending order; discarding the eigenvector corresponding to the smallest eigenvalue (the constant vector), the remaining $d$ eigenvectors form the low-dimensional embedding of the samples.
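For concreteness, the following is a minimal NumPy sketch of the three steps above: neighbor search, reconstruction weights as in (1), and the embedding obtained from the eigenvectors of the cost matrix as in (2). The function name and the regularization term are our own illustrative choices, not code from the paper.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def lle(X, k=10, d=2, reg=1e-3):
    """Minimal LLE sketch. X: (n, D) array of samples; returns (n, d) embedding."""
    n = X.shape[0]
    dist = cdist(X, X)
    np.fill_diagonal(dist, np.inf)               # exclude each point itself
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(dist[i])[:k]            # Step 1: k nearest neighbors of x_i
        Z = X[idx] - X[i]                        # centered neighbors
        G = Z @ Z.T                              # local Gram matrix
        G += reg * np.trace(G) * np.eye(k)       # regularize for numerical stability
        w = np.linalg.solve(G, np.ones(k))       # Step 2: minimize (1) ...
        W[i, idx] = w / w.sum()                  # ... with weights summing to 1
    I = np.eye(n)
    M = (I - W).T @ (I - W)                      # cost matrix of objective (2)
    vals, vecs = eigh(M)                         # Step 3: eigenvalues ascending
    return vecs[:, 1:d + 1]                      # drop the constant eigenvector, keep d
```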
1.2 MMC
Like feature extraction based on the Fisher linear discriminant criterion, the MMC algorithm aims to compress data from the original high-dimensional space to a low-dimensional space while maintaining high separability in the low-dimensional space.
In the MMC algorithm, $S_b$, $S_w$ and $S_t$ denote the between-class scatter matrix, the within-class scatter matrix and the total scatter matrix of the training samples, respectively. By definition, $S_b$, $S_w$ and $S_t$ are non-negative definite matrices satisfying $S_t = S_b + S_w$, where

$$S_b=\sum_{i=1}^{c}p_i(\bar{x}_i-\bar{x})(\bar{x}_i-\bar{x})^T,\qquad S_w=\sum_{i=1}^{c}\frac{p_i}{N_i}\sum_{j=1}^{N_i}\big(x_j^{(i)}-\bar{x}_i\big)\big(x_j^{(i)}-\bar{x}_i\big)^T$$

Here $p_i = N_i/N$ is the prior probability of class $i$, $\bar{x}$ is the mean of all samples, $\bar{x}_i$ is the mean of the class-$i$ training samples, $N_i$ is the number of training samples in class $i$, and $N$ is the total number of samples.
The MMC criterion function is defined as

$$J=\operatorname{tr}(S_b-S_w)$$
Let $S'_b = A^T S_b A$ and $S'_w = A^T S_w A$ denote the between-class and within-class scatter matrices of the projected samples $y_i = A^T x_i$ in the feature space, so that $J(A)=\operatorname{tr}(S'_b-S'_w)=\operatorname{tr}\big(A^T(S_b-S_w)A\big)$.
Under the constraint $a_i^T a_i = 1$, the optimization problem becomes

$$\max \sum_{i=1}^{d} a_i^T(S_b-S_w)\,a_i \quad \text{s.t. } a_i^T a_i=1 \qquad (3)$$
Applying Lagrange multipliers to formula (3) leads to the eigenvalue problem

$$(S_b-S_w)\,a_i=\lambda_i a_i \qquad (4)$$
Accordingly, the eigenvectors corresponding to the $d$ largest eigenvalues are easily obtained from formula (4), where $\lambda_i$ is an eigenvalue of $S_b - S_w$ and $a_i$ the corresponding eigenvector. The eigenvectors associated with the $d$ largest eigenvalues form the final projection matrix, denoted $A$ in this paper.
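As a sketch, the MMC projection of equations (3)-(4) amounts to an eigendecomposition of $S_b - S_w$. The code below assumes the scatter-matrix definitions given above; the helper name `mmc` is illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def mmc(X, labels, d):
    """Minimal MMC sketch. X: (N, D) samples, labels: (N,) class labels.
    Returns the (D, d) projection matrix A of eq. (4); project with Y = X @ A."""
    N, D = X.shape
    mean_all = X.mean(axis=0)
    Sb = np.zeros((D, D))
    Sw = np.zeros((D, D))
    for c in np.unique(labels):
        Xc = X[labels == c]
        p = len(Xc) / N                                # prior probability p_i = N_i / N
        diff = (Xc.mean(axis=0) - mean_all)[:, None]
        Sb += p * (diff @ diff.T)                      # between-class scatter term
        Sw += p * np.cov(Xc, rowvar=False, bias=True)  # within-class scatter term
    vals, vecs = eigh(Sb - Sw)                         # eigenvalue problem of eq. (4)
    return vecs[:, ::-1][:, :d]                        # d largest eigenvalues' eigenvectors
```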
2. MMC-based local graph embedding
2.1 Basic concept
In the MMC algorithm, maximizing the between-class scatter matrix separates the different classes, and minimizing the within-class scatter matrix makes samples of the same class as compact as possible; however, MMC cannot effectively preserve the local intrinsic structure of the sample manifold. This paper generalizes the MMC method by effectively combining it with graph embedding of the local intrinsic structure of the samples: if $x_i$ and $x_j$ are adjacent before the transformation, then $y_i$ and $y_j$ are also adjacent after the transformation; otherwise, $y_i$ and $y_j$ are non-adjacent after the transformation.
The LGE/MMC algorithm proposed in this paper aims, under the premise of maintaining local neighborhood relations, to draw similar samples together as much as possible and to push samples of different classes as far apart as possible.
Let $X = \{x_1, \dots, x_n\}$ denote the sample data and $Y = \{y_1, \dots, y_n\}$ its low-dimensional representation. $G = (X, S)$ denotes a weighted undirected graph, where $X$ is the vertex set and $S$ is the similarity matrix. The diagonal degree matrix $D$ of $G$ and the Laplacian matrix $L$ are defined as $D_{ii} = \sum_j S_{ij}$ and $L = D - S$.
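As a small illustration of these definitions, the Laplacian of a weighted undirected graph can be computed directly from its similarity matrix (a generic sketch, not code from the paper):

```python
import numpy as np

def graph_laplacian(S):
    """L = D - S, where D is the diagonal degree matrix with D_ii = sum_j S_ij."""
    D = np.diag(S.sum(axis=1))
    return D - S
```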
2.2 LGE/MMC algorithm steps
The LGE/MMC algorithm can be divided into three steps:
1) Within-class scatter matrix characterization
In the compact within-class graph, each data point and its $k_c$ nearest neighbors of the same class implicitly lie on a locally linear patch, so each data point can be reconstructed as a weighted combination of its neighbors. The local within-class representation of LGE/MMC is similar to that of the LLE algorithm; the only difference in constructing the weight matrix $W$ is that the $k_c$ nearest neighbors of each sample point $x_i$ are sought among samples of the same class, whereas LLE ignores class information. The within-class scatter matrix is therefore characterized in three steps analogous to those of LLE, with the final mapping minimizing the objective function

$$S_c=\sum_{i=1}^{n}\Big\|y_i-\sum_{j=1}^{k_c}W_{ij}y_{ij}\Big\|^2 \qquad (5)$$
Introducing the linear transformation $y_i = A^T x_i$, formula (5) becomes $S_c=\operatorname{tr}\big(A^T X M X^T A\big)$, where $M=(I-W)^T(I-W)$.
2) Between-class scatter matrix characterization
In the between-class penalty graph, if two outputs $y_i$ and $y_j$ belong to different classes, a reasonable way to find the optimal mapping is to maximize the loss function $S_p$ below. Introducing the linear transformation $y_i = A^T x_i$ to simplify the expression, we obtain

$$S_p=\sum_{i,j}\|y_i-y_j\|^2 B_{ij}=2\operatorname{tr}\big(A^T X L_p X^T A\big) \qquad (6)$$

where $B$ is the between-class similarity matrix and $L_p$ the Laplacian of the penalty graph.
The essence of (6) is to maximize the variance of outputs belonging to different classes. PCA also maximizes the output variance, but it is a global method whose output mixes within-class and between-class data; the proposed between-class characterization is a local approach that considers only between-class data and ignores interference from within-class data, and thus classifies nonlinear high-dimensional data more accurately. The similarity matrix is therefore constructed as

$$B_{ij}=\begin{cases}1, & x_j \text{ is among the } k_p \text{ between-class nearest neighbors of } x_i\\ 0, & \text{otherwise}\end{cases}$$
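Putting the two graphs together, the sketch below builds the within-class reconstruction weights $W$ of (5) and the between-class penalty matrix $B$ of (6), then solves an MMC-style eigenproblem. How the two objectives are combined through the balance parameter $u$, here as $\operatorname{tr}\big(A^T X(L_p-uM)X^T A\big)$ in the paper's column convention, is our reading of the method rather than a verbatim reproduction of the authors' derivation; all names and the regularization constant are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def lge_mmc(X, labels, d, kc, kp, u):
    """Hedged LGE/MMC sketch. X: (n, D) samples (rows), labels: (n,) classes.
    Returns a (D, d) projection matrix; project with Y = X @ A."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    W = np.zeros((n, n))                             # intrinsic-graph weights, eq. (5)
    B = np.zeros((n, n))                             # penalty-graph similarities, eq. (6)
    for i in range(n):
        same = np.where(labels == labels[i])[0]
        same = same[same != i]
        nbr = same[np.argsort(dist[i, same])[:kc]]   # kc within-class neighbors
        Z = X[nbr] - X[i]
        G = Z @ Z.T
        G += 1e-3 * np.trace(G) * np.eye(len(nbr))   # regularize the local Gram matrix
        w = np.linalg.solve(G, np.ones(len(nbr)))
        W[i, nbr] = w / w.sum()
        diff = np.where(labels != labels[i])[0]
        B[i, diff[np.argsort(dist[i, diff])[:kp]]] = 1.0  # kp between-class neighbors
    B = np.maximum(B, B.T)                           # symmetrize the penalty graph
    I = np.eye(n)
    M = (I - W).T @ (I - W)                          # within-class cost matrix of (5)
    Lp = np.diag(B.sum(axis=1)) - B                  # penalty-graph Laplacian of (6)
    vals, vecs = eigh(X.T @ (Lp - u * M) @ X)        # maximize tr(A^T X^T (Lp - uM) X A)
    return vecs[:, ::-1][:, :d]                      # top-d eigenvectors
```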
3. Experimental results and analysis
To verify the effectiveness of the proposed LGE/MMC algorithm in face recognition, we carried out full experiments on the ORL, Yale and AR face image databases and compared the classification performance of LGE/MMC with the LDA [6], MMC [21], LLE [8-9], DLA [22-23] and LLE+LDA [16-17] algorithms; all algorithms use the Euclidean distance and the nearest neighbor classifier. In the experiments, in order to obtain results quickly, each algorithm is preceded by PCA preprocessing that retains about 95% of the image energy. The experimental environment is as follows: Dell PC, Athlon(tm) 64 processor, 1024 MB RAM, MATLAB 7.0.
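The PCA preprocessing that retains about 95% of the image energy can be sketched as follows; `pca_95` is a hypothetical helper name, and the threshold logic is a straightforward reading of the description above.

```python
import numpy as np

def pca_95(X, energy=0.95):
    """Project X (n, D) onto the fewest principal components holding `energy` of the variance."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)   # cumulative energy of the components
    m = int(np.searchsorted(ratio, energy)) + 1  # smallest m reaching the threshold
    return Xc @ Vt[:m].T                         # (n, m) reduced representation
```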
3.1 Face image databases and experimental design
The ORL face image database consists of 40 subjects with 10 images each. Some images were taken at different periods, and the subjects' facial expressions and facial details vary to different degrees (laughing or smiling, eyes open or closed, with or without glasses); face pose also varies considerably, with in-depth and in-plane rotation of up to 20°, and face scale varies by up to 10%. In the experiments, the images are resized to 56 × 46 pixels; l (l = 2, 3, 4, 5, 6) images per person are randomly selected for training, and the remaining 10 − l images are used for testing. For each value of l, the experiment is repeated 50 times, and the final result is the average over the 50 runs.
The Yale face database includes 165 grayscale face images of 15 subjects, 11 photos per person, taken under varying conditions such as different lighting and facial expressions. In the experiments, the images are resized to 50 × 40 pixels; l (l = 2, 3, 4, 5, 6) images per person are randomly selected for training, and the remaining 11 − l images are used for testing. For each value of l, the experiment is repeated 50 times, and the final result is the average over the 50 runs.
The AR face database contains over 4000 color face images of 126 individuals (70 men, 56 women), comprising frontal face images under different lighting conditions, with different expressions and with different occlusions. For most subjects, the images were taken in two sessions, each session containing 13 color images. In the experiments, the face images of 120 individuals without scarf occlusion were selected, 20 images per person, and converted to 50 × 40 grayscale images; l (l = 2, 3, 4, 5, 6) images per person are randomly selected for training, and the remaining 20 − l images are used for testing. For each value of l, the experiment is repeated 10 times, and the final result is the average over the 10 runs. Figures 1a-1c show preprocessed sample images from the three face databases.
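The repeated random-split protocol shared by the three databases can be sketched as below. `run_once` is a hypothetical helper performing one train/test pass given the chosen per-subject training indices; applying one index set to every subject is a simplification of the protocol.

```python
import numpy as np

def averaged_rate(images_per_subject, l, n_trials, run_once, seed=0):
    """Average recognition rate over n_trials random l/(rest) splits per subject."""
    rng = np.random.default_rng(seed)
    rates = []
    for _ in range(n_trials):
        train_idx = rng.choice(images_per_subject, size=l, replace=False)
        rates.append(run_once(train_idx))        # train on l images, test on the rest
    return float(np.mean(rates))
```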
3.2 Parameter Selection
In classical feature extraction algorithms, the choice of the various parameters has long been an open question. Similarly, in LGE/MMC the balance parameter u, the within-class neighbor number kc and the between-class neighbor parameter kp have a significant impact on recognition accuracy. Here the within-class neighbor number is set to kc = l − 1, while cross-validation is used to choose the balance parameter u and the between-class neighbor parameter kp. From the Yale face image database, l = 2 images per person are randomly selected as training samples and all remaining images are used as test samples; the experiment is repeated 50 times. Figures 2a and 2b show the average recognition rate for different values of the balance parameter u and the between-class neighbor parameter kp. It can be seen that the average recognition rate is relatively stable for u between 0.05 and 1 and kp between 4 and 10, and reaches its maximum of 94.79% at u = 0.3 and kp = 4. Because the values u = 0.3 and kp = 4 obtained on one database may not achieve the maximum average recognition rate on the other face databases, in the subsequent experiments u is searched over 0.05 to 1 in steps of 0.05, and kp over 4 to 10 in steps of 1.
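The parameter sweep described above can be sketched as a simple grid search; `evaluate` is a hypothetical helper that runs one full cross-validated experiment and returns the average recognition rate for a given (u, kp) pair.

```python
import numpy as np

def grid_search(evaluate):
    """Search u over 0.05..1 (step 0.05) and kp over 4..10 (step 1)."""
    best = (None, None, -np.inf)
    for u in np.arange(0.05, 1.0 + 1e-9, 0.05):
        for kp in range(4, 11):
            rate = evaluate(u=u, kp=kp)
            if rate > best[2]:
                best = (u, kp, rate)
    return best                                  # (best u, best kp, best rate)
```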
3.3 Recognition experiments and analysis
3.3.1 Relationship between average recognition rate and feature dimension
This experiment investigates how the average recognition rate of the LGE/MMC algorithm changes with the feature dimension. Experiments were carried out on the ORL, Yale and AR face image databases, with 4, 6 and 5 randomly chosen images per person, respectively, used as training samples and the remaining images of each database used as test samples. The experiments were repeated 50 times on the ORL and Yale databases and 10 times on the AR database; the results are shown in Figure 3. The following conclusions can be drawn from Figure 3:
1) The average recognition rate of the LGE/MMC algorithm increases with the feature dimension, and at relatively high dimensions its average recognition rate is better than that of the other five classical algorithms.
2) When the feature dimension is relatively low, the average recognition rates of the LDA and LLE+LDA algorithms are higher than that of the LGE/MMC algorithm. This is because the optimal number of discriminant dimensions of LDA is no more than c − 1, at which it achieves its best recognition performance. The MMC algorithm, by contrast, avoids the matrix singularity problem faced by the LDA algorithm.
3) The average recognition rate of the LLE algorithm is relatively low, indicating that LLE is not suitable for data classification. The average recognition rate of LLE+LDA is higher than that of LLE because of the strong classification ability of the LDA algorithm.
4) In most cases, the average recognition rates of the supervised learning algorithms LDA, MMC and LLE+LDA are better than that of the unsupervised LLE algorithm. Compared with LDA, MMC and LLE+LDA, the LGE/MMC algorithm achieves the best average recognition rate. LGE/MMC is far superior to LLE because it considers not only the distribution of the data within each class but also the distribution of the data between classes.
3.3.2 Face recognition performance comparison
This section compares how the maximum average recognition rate of the different algorithms changes with the number of training samples. We compared the recognition performance of the LGE/MMC algorithm with several classical algorithms on the ORL, Yale and AR face image databases. In the experiments on each database, the face images are first preprocessed with PCA, features are then extracted with each feature extraction algorithm, and classification is finally completed with the nearest neighbor classifier.
As can be seen from Tables 1-3, the LGE/MMC algorithm achieves very good recognition results on the ORL, Yale and AR face databases. The following conclusions can be drawn from Tables 1-3:
1) In Table 1, the maximum average recognition rate of the LGE/MMC algorithm is higher than that of the other classical algorithms, and the advantage is more obvious with larger numbers of training samples. This is because, when the samples are sparse, the LGE/MMC algorithm cannot accurately reflect the manifold distribution of the samples in the original space.
2) In Table 2, the maximum average recognition rate of the LGE/MMC algorithm is higher than that of the other classical algorithms. However, as the number of training samples increases, the change in the algorithm's maximum average recognition rate is not very obvious, because the Yale face database is strongly affected by illumination.
3) In Table 3, the classical algorithms achieve good recognition results on small-scale face databases, but on a large database such as the AR face database their recognition performance needs further improvement, while the LGE/MMC algorithm shows a distinct advantage over the other algorithms.
4. Conclusion
In this paper, the LGE/MMC algorithm, based on the LLE and MMC algorithms, is proposed and its derivation is given. The purpose of the algorithm is, under the premise of preserving neighborhoods, to use MMC to construct a compact within-class graph and a between-class penalty graph, and, by adjusting the balance parameter u, the within-class neighbor number kc and the between-class neighbor parameter kp, to perform local graph embedding on the basis of mutually orthogonal projection vectors, so as to more effectively extract the local linear structure of face data. Experiments on the ORL, Yale and AR face databases also show that the algorithm preserves local structure well and performs better in face recognition than several other classical subspace learning methods. It is noteworthy that there is still no theoretical basis for selecting the best balance parameter u, within-class neighbor number kc and between-class neighbor parameter kp; how to find the optimal parameters theoretically, so as to more effectively explore the intrinsic structure of nonlinear high-dimensional face image data in the subspace, will be one of our future research directions.
References
[1] Zhao W; Chellappa R; Phillips P J. Face recognition: a literature survey. 2003(04)
[2] Chellappa R; Wilson C L; Sirohey S. Human and machine recognition of faces: a survey. 1995(05)
[3] Liu Q S; Lu H Q; Ma S D. A survey: subspace analysis for face recognition. Acta Automatica Sinica, 2003(06)
[4] Turk M; Pentland A. Eigenfaces for recognition. 1991(01)
[5] Belhumeur P N; Hespanha J P; Kriegman D J. Eigenfaces vs. fisherfaces: recognition using class specific linear projection. 1997(07)
[6] Tenenbaum J B; de Silva V; Langford J C. A global geometric framework for nonlinear dimensionality reduction. 2000(5500)
[7] Roweis S T; Saul L K. Nonlinear dimensionality reduction by locally linear embedding. 2000(5500)
[8] Saul L K; Roweis S T. Think globally, fit locally: unsupervised learning of low dimensional manifolds. 2003
[9] Belkin M; Niyogi P. Laplacian eigenmaps and spectral techniques for embedding and clustering. 2002(1/2)
[10] Belkin M; Niyogi P. Laplacian eigenmaps for dimensionality reduction and data representation. 2003(06)
[11] de Ridder D; Duin R P W. Locally linear embedding for classification. 2002
[12] de Ridder D; Kouropteva O; Okun O. Supervised locally linear embedding. 2003
[13] Bai X M; Yin B C; Shi Q. Face recognition based on supervised locally linear embedding method. 2005(04)
[14] Kouropteva O; Okun O; Pietikainen M. Supervised locally linear embedding algorithm for pattern recognition. 2003
[15] Zhang J P; Shen H X; Zhou Z H. Unified locally linear embedding and linear discriminant analysis algorithm for face recognition. 2005
[16] Zhang J P; He L; Zhou Z H. Ensemble-based discriminant manifold learning for face recognition. 2006
[17] He X F; Yan S C; Hu Y X. Face recognition using Laplacianfaces. 2005(03)
[18] He X F; Cai D; Yan S C. Neighborhood preserving embedding. 2005
[19] Fukunaga K. Introduction to Statistical Pattern Recognition. 1991
[20] Li H F; Jiang T; Zhang K S. Efficient and robust feature extraction by maximum margin criterion. 2006(01)
[21] Zhang T H; Tao D C; Yang J. Discriminative locality alignment. 2008
[22] Zhang T H; Tao D C; Li X L. Patch alignment for dimensionality reduction. 2009(09)