當前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

homework7_ZhankunLuo

發布時間：2023/12/31 编程问答 21 豆豆

生活随笔收集整理的這篇文章主要介紹了 homework7_ZhankunLuo 小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

Homework 7

Zhankun Luo

PUID: 0031195279

Email: luo333@pnw.edu

Fall-2018-ECE-59500-009

Instructor: Toma Hentea

文章目錄

Homework 7
- Function
- - plot_cluster.m
  - k_means.m
  - - Hard Clustering: k-means
    - drawbacks
    - code
  - fuzzy_c_means.m
  - - Fuzzy Clustering
    - code
- Exercise
- - Exercise 7.5.1
  - - exercise_751.m
    - Result
    - Conclusion
  - Exercise 7.5.9
  - - exercise_759.m
    - Result
    - Conclusion

Function

plot_cluster.m

function [] = plot_cluster(X2, bel, theta) plot(X2(1,bel==1),X2(2,bel==1),'r.',... X2(1,bel==2),X2(2,bel==2),'g*',X2(1,bel==3),X2(2,bel==3),'bo',... X2(1,bel==4),X2(2,bel==4),'cx',X2(1,bel==5),X2(2,bel==5),'md',... X2(1,bel==6),X2(2,bel==6),'yp',X2(1,bel==7),X2(2,bel==7),'ks') hold on, plot(theta(1,:), theta(2,:),'k+') end

k_means.m

Hard Clustering: k-means

Where
$d(xi,θj(t))=[xi?θj(t)]T[xi?θj(t)]∑i=1Nuij(t?1)?d(xi,θj(t))?θj=2∑i=1Nuij(t?1)[θj(t)?xi]=0d(x_i, \theta_j(t))=[x_i - \theta_j(t)]^T[x_i - \theta_j(t)]\\ \sum_{i=1}^{N} u_{ij}(t-1) \frac{\partial d(x_i, \theta_j(t))}{\partial \theta_j} = 2\sum_{i=1}^{N} u_{ij}(t-1) [\theta_j(t) - x_i] = 0\\$
because $θj(t)\theta_j(t)$ is not related to $i$
$θj(t)=∑i=1Nuij(t?1)xi∑i=1Nuij(t?1)\theta_{j}(t) = \frac {\sum_{i=1}^{N} u_{ij}(t-1)x_i}{\sum_{i=1}^{N} u_{ij}(t-1)}\\$
We have

u_{ij}(t - 1) = (bel==j) % 1xN vector \sum u_{ij}(t - 1) * x_i = sum(X'.*((bel==j)'*ones(1,l))) % 1xm vector \sum u_{ij} = sum(bel==j)

Thus

% in function k_means.m theta(:,j)=sum(X'.*((bel==j)'*ones(1,l))) / sum(bel==j);

drawbacks

Different initial partitions may lead k-means to produces different final clusterings, each one corresponding to a different local minimum.

Knowledge of the number of clusters m is required a priori.

k-means is sensitive to outliers and noise.

k-means is not suitable for data with nominal (categorical) coordinates.

code

function [theta,bel,J]=k_means(X,theta) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % FUNCTION % [theta,bel,J]=k_means(X,theta) % This function implements the k-means algorithm, which requires % as input the number of clusters underlying the data set. The algorithm % starts with an initial estimation of the cluster representatives and % iteratively tries to move them into regions that are "dense" in data % vectors, so that a suitable cost function is minimized. This is % achieved by performing (usually) a few passes on the data set. The % algorithm terminates when the values of the cluster representatives % remain unaltered between two successive iterations. % % INPUT ARGUMENTS: % X: lxN matrix, each column of which corresponds to % an l-dimensional data vector. % theta: a matrix, whose columns contain the l-dimensional (mean) % representatives of the clusters. % % OUTPUT ARGUMENTS: % theta: a matrix, whose columns contain the final estimations of % the representatives of the clusters. % bel: N-dimensional vector, whose i-th element contains the % cluster label for the i-th data vector. % J: the value of the cost function (sum of squared Euclidean % distances of each data vector from its closest parameter % vector) that corresponds to the estimated clustering. % % (c) 2010 S. Theodoridis, A. Pikrakis, K. Koutroumbas, D. Cavouras %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% [l,N]=size(X); [l,m]=size(theta); e=1; iter=0; while(e~=0)iter=iter+1;theta_old=theta;dist_all=[];for j=1:mdist=sum(((ones(N,1)*theta(:,j)'-X').^2)'); % sum of l dimensionsdist_all=[dist_all; dist];end[q1,bel]=min(dist_all); % get min value of every col of dist_all: mxNJ=sum(min(dist_all));% bel =% 1 2 1% (bel==1)=% 1 0 1 % choose 1st classfor j=1:mif(sum(bel==j)~=0)theta(:,j)=sum(X'.*((bel==j)'*ones(1,l))) / sum(bel==j);endende=sum(sum(abs(theta-theta_old))); end

fuzzy_c_means.m

Fuzzy Clustering

In fcm.m, define
$d(xi,θj(t))=[xi?θj(t)]T[xi?θj(t)]∑i=1Nuij(t?1)?d(xi,θj(t))?θj=2∑i=1Nuij(t?1)[θj(t)?xi]=0d(x_i, \theta_j(t))=[x_i - \theta_j(t)]^T[x_i - \theta_j(t)]\\ \sum_{i=1}^{N} u_{ij}(t-1) \frac{\partial d(x_i, \theta_j(t))}{\partial \theta_j} = 2\sum_{i=1}^{N} u_{ij}(t-1) [\theta_j(t) - x_i] = 0\\$
because $θj(t)\theta_j(t)$ is not related to $i$
$θj(t)=∑i=1Nuij(t?1)xi∑i=1Nuij(t?1)\theta_{j}(t) = \frac {\sum_{i=1}^{N} u_{ij}(t-1)x_i}{\sum_{i=1}^{N} u_{ij}(t-1)}\\$

code

function [theta,U,obj_fun]=fuzzy_c_means(X,m,q) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % FUNCTION % [w, U, obj_fun]=fuzzy_c_means(X,m,q) % This function applies the FCM algorithm by calling the corresponding % MATLAB function, called "fcm". % % INPUT ARGUMENTS: % X: lxN matrix, each column of which corresponds to % an l-dimensional data vector. % m: the number of clusters. % q: the fuzzifier. % % OUTPUT ARGUMENTS: % theta: lxm matrix, each column of which corresponds to % a cluster representative, after the convergence of the % algorithm. % U: Nxm matrix containing in its i-th row % the ``grade of membership'' of the data vector xi to the m % clusters. % obj_fun: a vector, whose t-th coordinate is the value of the cost % function, J, for the clustering produced at the t-th teration. % % (c) 2010 S. Theodoridis, A. Pikrakis, K. Koutroumbas, D. Cavouras %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% X=X'; options(1)=q; % options(1): Exponent for the partition matrix U. Default: 2.0 % options(2): Maximum number of iterations. Default: 100 % options(3): Minimum amount of improvement. Default: 1e-5 % options(4): Information displayed during iteration. Default: 1. [theta,U,obj_fun] = fcm(X,m,options); theta=theta'; U=U';

where function: [center, U, obj_fcn] = fcm(data, cluster_n)

[center, U, obj_fcn] = fcm(data, cluster_n)

applies the fuzzy c-means clustering method to a given data set.

The input arguments of this function are

data:

data set to be clustered;

each row is a sample data point
cluster_n:

number of clusters (greater than one)

The output arguments of this function are

center:

matrix of final cluster centers where each row provides the center coordinates
U:

final fuzzy partition matrix (or membership function matrix)
obj_fcn:

values of the objective function during iterations

Exercise

Exercise 7.5.1

Generate and plot a data set $X_2$ , that consists of 300 2-dimensional stemming from the normal distribution with mean $m_1 = [0, 0]^T$ and covariance matrix equal to the 2 x 2 identity matrix.

repeat step I of Example 7.4.2

randn('seed', 0); m = [0 0; 4 0; 0 4; 5 4]; S(:,:,1) = eye(2); S(:,:,2) = [1.0 .2; .2 1.5]; S(:,:,3) = [1.0 .4; .4 1.1]; S(:,:,4) = [.3 .2; .2 .5]; n_points = 300 * ones(1,4); % Number of points per group

Apply the k-means algorithm for $m = 2, 3$ and random initialization if $θj\theta_j$ 's, and draw conclusions.

exercise_751.m

% Exercise 7.5.1 % NOTE: this example generates a total of six figures. close('all'); clear % 1. To generate and plot X2, work as in Example 7.4.2, where now the means % of the Gaussians are different. The plot shows that X3 contains four % clearly separated compact clusters. randn('seed', 0); m = [0 0; 4 0; 0 4; 5 4]; S(:,:,1) = eye(2); S(:,:,2) = [1.0 .2; .2 1.5]; S(:,:,3) = [1.0 .4; .4 1.1]; S(:,:,4) = [.3 .2; .2 .5]; n_points = 300 * ones(1,4); % Number of points per group X2=[]; for i=1:4X2=[X2; mvnrnd(m(i,:),S(:,:,i),n_points(i))]; end X2=X2'; % X3: lxN=2x1200 figure(1), plot(X2(1,:),X2(2,:),' b.') figure(1), axis equal% 2. To apply the k-means algorithm, for m = 4 and random initialization m = 4; [l, N] = size(X2); rand('seed', 0) theta_ini = rand(l, m); [theta, bel, J] = k_means(X2, theta_ini); % Plot X3, using different colors for points from different clusters figure(2), plot_cluster(X2, bel, theta) figure(2), axis equal% 3. Work as in step 2, for m = 2 m = 2; [l, N] = size(X2); rand('seed', 0) theta_ini = rand(l, m); [theta, bel, J] = k_means(X2, theta_ini); figure(3), plot_cluster(X2, bel, theta) figure(3), axis equal% 4. Work as in step 2, for m = 3 m = 3; [l, N]=size(X2); rand('seed', 0) theta_ini = rand(l, m); [theta,bel,J] = k_means(X2, theta_ini); figure(4), plot_cluster(X2, bel, theta) figure(4), axis equal% 5. Work as in step 2 with different theta_ini set1 m = 4; [l, N] = size(X2); theta_ini = [-2 -2; -2.1 -2.1; -2 -2.2; -2.1 -2.2]'; [theta, bel, J] = k_means(X2, theta_ini); figure(5), plot_cluster(X2, bel, theta) figure(5), axis equal% 6. Work as in step 2 with different theta_ini set2 m = 4; [l,N] = size(X2); rand('seed', 0) theta_ini = rand(l, m); theta_ini(:, m) = [5 5]; [theta, bel, J] = k_means(X2, theta_ini); figure(6), plot_cluster(X2, bel, theta) figure(6), axis equal

Result

original data

m = 4, theta_ini = rand(l, m)

m = 2, theta_ini = rand(l, m)

m = 3, theta_ini = rand(l, m)

m = 4, theta_ini = [-2 -2; -2.1 -2.1; -2 -2.2; -2.1 -2.2]’

m = 4, theta_ini = rand(l, m), theta_ini(:, m) = [12 12];

Conclusion

When $m = 2, 3, 4$ , divisions of clusters would be different

So, for different $m$ , it could get different divisions.

Sometimes when $θj(0),(j=1,...,m)\theta_j(0), (j =1, ... , m)$ are not set properly,

Division would be trapped in local best value,

Thus, it would not get global best value ( which is unique for $m$ = certain value)

Exercise 7.5.9

Apply the FCM on the data set

X_3

generated in Example 7.5.1 for

q = 2, q = 10

and

q = 25

Define and plot the three corresponding hard clustering, as discussed, as discussed previously.

Compare the

u_{ij}

parameters and the

θj\theta_j

's for the three cases and draw conclusions.

exercise_759.m

% Exercise 7.5.9 % NOTE: this example generates a total of 4 figures: 1 original + 3 cases close('all'); clear % 1. To generate and plot X2, work as in Example 7.4.2, where now the means % of the Gaussians are different. The plot shows that X3 contains four % clearly separated compact clusters. randn('seed', 0); mu = [0 0; 4 0; 0 4; 5 4]; S(:,:,1) = eye(2); S(:,:,2) = [1.0 .2; .2 1.5]; S(:,:,3) = [1.0 .4; .4 1.1]; S(:,:,4) = [.3 .2; .2 .5]; n_points = 300 * ones(1,4); % Number of points per group m = 4; % cluster number X2=[]; for i=1:4X2=[X2; mvnrnd(mu(i,:),S(:,:,i),n_points(i))]; end X2=X2'; % X3: lxN=2x1200 [l, N] = size(X2); % l: deminsion N: point number rand('seed', 0) theta_ini = rand(l, m); figure(1), plot(X2(1,:),X2(2,:),' b.') figure(1), axis equal% 2. To apply the k-means algorithm, for q = 2 and random initialization q = 2; [theta, U, obj_fun] = fuzzy_c_means(X2, m, q); [U_max, bel] = max(U'); % Plot X3, using different colors for points from different clusters figure(2), hold on, plot_cluster(X2, bel, theta) figure(2), axis equal% 3. Work as in step 2, for q = 10 q = 10; [theta, U, obj_fun] = fuzzy_c_means(X2, m, q); [U_max, bel] = max(U'); figure(3), hold on, plot_cluster(X2, bel, theta) figure(3), axis equal% 4. Work as in step 2, for q = 25 q = 25; [theta, U, obj_fun] = fuzzy_c_means(X2, m, q); [U_max, bel] = max(U'); figure(4), hold on, plot_cluster(X2, bel, theta) figure(4), axis equal

Result

original data

q = 2

q = 10

q = 25

>> exercise_759 % q = 2 Iteration count = 1, obj. fcn = 4374.311623 Iteration count = 2, obj. fcn = 3344.540096 Iteration count = 3, obj. fcn = 3335.790961 Iteration count = 4, obj. fcn = 3304.365018 Iteration count = 5, obj. fcn = 3195.008604 Iteration count = 6, obj. fcn = 2884.099872 Iteration count = 7, obj. fcn = 2364.754387 Iteration count = 8, obj. fcn = 2037.567145 Iteration count = 9, obj. fcn = 1834.607024 Iteration count = 10, obj. fcn = 1606.009571 Iteration count = 11, obj. fcn = 1483.160811 Iteration count = 12, obj. fcn = 1454.601969 Iteration count = 13, obj. fcn = 1450.174378 Iteration count = 14, obj. fcn = 1449.552804 Iteration count = 15, obj. fcn = 1449.463887 Iteration count = 16, obj. fcn = 1449.450440 Iteration count = 17, obj. fcn = 1449.448266 Iteration count = 18, obj. fcn = 1449.447888 Iteration count = 19, obj. fcn = 1449.447816 Iteration count = 20, obj. fcn = 1449.447801 Iteration count = 21, obj. fcn = 1449.447798 % q = 10 Iteration count = 1, obj. fcn = 14.151425 Iteration count = 2, obj. fcn = 0.051120 Iteration count = 3, obj. fcn = 0.050908 Iteration count = 4, obj. fcn = 0.050800 Iteration count = 5, obj. fcn = 0.050680 Iteration count = 6, obj. fcn = 0.050538 Iteration count = 7, obj. fcn = 0.050373 Iteration count = 8, obj. fcn = 0.050195 Iteration count = 9, obj. fcn = 0.049981 Iteration count = 10, obj. fcn = 0.049709 Iteration count = 11, obj. fcn = 0.049374 Iteration count = 12, obj. fcn = 0.048968 Iteration count = 13, obj. fcn = 0.048510 Iteration count = 14, obj. fcn = 0.048036 Iteration count = 15, obj. fcn = 0.047543 Iteration count = 16, obj. fcn = 0.047002 Iteration count = 17, obj. fcn = 0.046502 Iteration count = 18, obj. fcn = 0.046041 Iteration count = 19, obj. fcn = 0.045677 Iteration count = 20, obj. fcn = 0.045404 Iteration count = 21, obj. fcn = 0.045249 Iteration count = 22, obj. fcn = 0.045160 Iteration count = 23, obj. fcn = 0.045097 Iteration count = 24, obj. fcn = 0.045053 Iteration count = 25, obj. fcn = 0.045025 Iteration count = 26, obj. fcn = 0.045009 Iteration count = 27, obj. fcn = 0.044999 % q = 25 Iteration count = 1, obj. fcn = 1.678658 Iteration count = 2, obj. fcn = 0.000000 Iteration count = 3, obj. fcn = 0.000000

Conclusion

When $q = 2, 10, 25$ , divisions of clusters are different

So, for different $q$ , it could get different divisions.

When $q = 25$ , its division has great difference to the original division

So, we need choose proper $q$ to certain clustering problems.

總結

以上是生活随笔為你收集整理的homework7_ZhankunLuo的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

homework7_ZhankunLuo

上一篇：安装网易云音乐
下一篇： Audio HAL 通话录音上下行分离