
Published: 2023/11/28


Sentiment Analysis: Using Convolutional Neural Networks

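Earlier, we explored how to process two-dimensional image data with two-dimensional convolutional neural networks. In previous language models and text classification tasks, we treated text data as a one-dimensional time series and, naturally, used recurrent neural networks to process it. In fact, we can also treat text as a one-dimensional image, so that one-dimensional convolutional neural networks can capture associations between adjacent words. As shown in Fig. 1, this section describes a groundbreaking approach to applying convolutional neural networks to sentiment analysis: textCNN [Kim, 2014].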

Fig. 1. This section feeds pretrained GloVe to a CNN-based architecture for sentiment analysis.

First, import the packages and modules required for the experiment.

from d2l import mxnet as d2l

from mxnet import gluon, init, np, npx

from mxnet.gluon import nn

npx.set_np()

batch_size = 64

train_iter, test_iter, vocab = d2l.load_data_imdb(batch_size)

One-Dimensional Convolutional Layer

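Before introducing the model, let us explain how a one-dimensional convolutional layer works. Like a two-dimensional convolutional layer, it uses a cross-correlation operation, here in one dimension. The convolution window starts at the leftmost side of the input array and slides across it from left to right. At each position, the input subarray in the window and the kernel array are multiplied elementwise and summed to give the element at the corresponding position of the output array. As shown in Fig. 2, the input is a one-dimensional array of width 7 and the kernel array has width 2, so the output width is 7−2+1=6. The first output element is obtained by multiplying the leftmost input subarray of width 2 elementwise with the kernel array and summing the results.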

Fig. 2. One-dimensional cross-correlation operation. The shaded parts are the first output element as well as the input and kernel array elements used in its calculation: 0×1+1×2=2.

Next, we implement one-dimensional cross-correlation in the corr1d function. It accepts the input array X and the kernel array K and outputs the array Y.

def corr1d(X, K):
    w = K.shape[0]
    Y = np.zeros((X.shape[0] - w + 1))
    for i in range(Y.shape[0]):
        Y[i] = (X[i: i + w] * K).sum()
    return Y

Now, we reproduce the result of the one-dimensional cross-correlation operation in Fig. 2.

X, K = np.array([0, 1, 2, 3, 4, 5, 6]), np.array([1, 2])

corr1d(X, K)

array([ 2., 5., 8., 11., 14., 17.])

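The one-dimensional cross-correlation operation for multiple input channels is also similar to its two-dimensional counterpart. On each channel, we perform the one-dimensional cross-correlation operation on the kernel and its corresponding input, then add the per-channel results together to get the output. Fig. 3 shows a one-dimensional cross-correlation operation with three input channels.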

Fig. 3. One-dimensional cross-correlation operation with three input channels. The shaded parts are the first output element as well as the input and kernel array elements used in its calculation: 0×1+1×2+1×3+2×4+2×(−1)+3×(−3)=2.

Now, we reproduce the result of the one-dimensional cross-correlation operation with multiple input channels in Fig. 3.

def corr1d_multi_in(X, K):
    # Traverse the 0th (channel) dimension of X and K together, compute the
    # one-dimensional cross-correlation for each channel, and add the
    # per-channel results together with the built-in sum function
    return sum(corr1d(x, k) for x, k in zip(X, K))

X = np.array([[0, 1, 2, 3, 4, 5, 6],
              [1, 2, 3, 4, 5, 6, 7],
              [2, 3, 4, 5, 6, 7, 8]])

K = np.array([[1, 2], [3, 4], [-1, -3]])

corr1d_multi_in(X, K)

array([ 2., 8., 14., 20., 26., 32.])

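The definition of the two-dimensional cross-correlation operation tells us that a one-dimensional cross-correlation operation with multiple input channels can be regarded as a two-dimensional cross-correlation operation with a single input channel. As shown in Fig. 4, the multi-channel one-dimensional operation in Fig. 3 can also be expressed as an equivalent two-dimensional operation with a single input channel. Here, the height of the kernel is equal to the height of the input.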

Fig. 4. Two-dimensional cross-correlation operation with a single input channel. The highlighted parts are the first output element and the input and kernel array elements used in its calculation: 2×(−1)+3×(−3)+1×3+2×4+0×1+1×2=2.
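The equivalence between the multi-channel one-dimensional operation and the single-channel two-dimensional operation can be checked numerically. The following is a minimal sketch in plain Python lists, so it runs without MXNet; the function corr2d and the list-based versions of corr1d and corr1d_multi_in are written here only for illustration and are not the d2l implementations.

```python
def corr1d(x, k):
    """One-dimensional cross-correlation of two plain Python lists."""
    w = len(k)
    return [sum(x[i + j] * k[j] for j in range(w))
            for i in range(len(x) - w + 1)]


def corr1d_multi_in(X, K):
    """Sum the per-channel one-dimensional cross-correlations."""
    per_channel = [corr1d(x, k) for x, k in zip(X, K)]
    return [sum(vals) for vals in zip(*per_channel)]


def corr2d(X, K):
    """Two-dimensional cross-correlation with a single input channel."""
    kh, kw = len(K), len(K[0])
    out_h, out_w = len(X) - kh + 1, len(X[0]) - kw + 1
    return [[sum(X[r + a][c + b] * K[a][b]
                 for a in range(kh) for b in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


# Same input and kernel values as in Fig. 3
X = [[0, 1, 2, 3, 4, 5, 6],
     [1, 2, 3, 4, 5, 6, 7],
     [2, 3, 4, 5, 6, 7, 8]]
K = [[1, 2], [3, 4], [-1, -3]]

multi = corr1d_multi_in(X, K)  # channels treated as separate 1D inputs
as_2d = corr2d(X, K)           # channels stacked as rows of one 2D input

print(multi)  # [2, 8, 14, 20, 26, 32]
print(as_2d)  # [[2, 8, 14, 20, 26, 32]]
```

Because the kernel height equals the input height, the two-dimensional output has a single row, and that row matches the multi-channel one-dimensional output.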

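The outputs in Fig. 2 and Fig. 3 have only one channel. Just as we can specify multiple output channels in a two-dimensional convolutional layer, we can also specify multiple output channels in a one-dimensional convolutional layer to extend the model parameters in the layer.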
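One way to picture multiple output channels is to keep one stack of per-input-channel kernels for each output channel. The sketch below uses plain Python lists and assumes this layout (output channels, input channels, width); the function corr1d_multi_in_out and the second kernel stack are hypothetical and serve only as an illustration.

```python
def corr1d(x, k):
    """One-dimensional cross-correlation of two plain Python lists."""
    w = len(k)
    return [sum(x[i + j] * k[j] for j in range(w))
            for i in range(len(x) - w + 1)]


def corr1d_multi_in(X, K):
    """Sum the per-channel one-dimensional cross-correlations."""
    per_channel = [corr1d(x, k) for x, k in zip(X, K)]
    return [sum(vals) for vals in zip(*per_channel)]


def corr1d_multi_in_out(X, K):
    """Apply one multi-input kernel stack per output channel."""
    return [corr1d_multi_in(X, k_out) for k_out in K]


X = [[0, 1, 2, 3, 4, 5, 6],
     [1, 2, 3, 4, 5, 6, 7],
     [2, 3, 4, 5, 6, 7, 8]]
K_first = [[1, 2], [3, 4], [-1, -3]]
# Hypothetical second output channel: every weight of the first stack plus 1
K_second = [[w + 1 for w in row] for row in K_first]
K = [K_first, K_second]

Y = corr1d_multi_in_out(X, K)
print(len(Y), len(Y[0]))  # 2 6
print(Y[0])  # [2, 8, 14, 20, 26, 32]
print(Y[1])  # [11, 23, 35, 47, 59, 71]
```

Each output channel has the same width, 7−2+1=6, and the number of kernel parameters grows in proportion to the number of output channels.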

Max-Over-Time Pooling Layer

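Similarly, there is a one-dimensional pooling layer. The max-over-time pooling layer used in textCNN actually corresponds to a one-dimensional global maximum pooling layer. Assuming the input contains multiple channels and each channel consists of values at different timesteps, the output of each channel is the largest value over all timesteps in that channel. Therefore, the input of the max-over-time pooling layer can have a different number of timesteps on each channel.

To improve computing performance, we often combine timing examples of different lengths into a minibatch and make the lengths of the examples in the batch consistent by appending special characters (such as 0) to the end of the shorter ones. Naturally, the added special characters have no intrinsic meaning. Because the main purpose of the max-over-time pooling layer is to capture the most important timing features, it usually allows the model to be unaffected by the manually added characters.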
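The two points above can be illustrated with a small sketch in plain Python. It is a simplification: padding is applied directly to the channel values rather than to input tokens, and the channel values are assumed to be non-negative (as they would be after a ReLU activation), so that appended zeros cannot exceed the true maximum. The names max_over_time and pad are hypothetical helpers.

```python
def max_over_time(channels):
    """Return the maximum over all timesteps for each channel."""
    return [max(channel) for channel in channels]


def pad(seq, length, pad_value=0.0):
    """Append pad_value to seq until it has the requested length."""
    return seq + [pad_value] * (length - len(seq))


# Two channels with different numbers of timesteps (4 and 2)
channels = [[0.2, 1.5, 0.7, 0.1], [0.9, 0.3]]
pooled = max_over_time(channels)

# Pad the shorter channel with zeros so both have 4 timesteps
padded = [pad(channel, 4) for channel in channels]
pooled_padded = max_over_time(padded)

print(pooled)         # [1.5, 0.9]
print(pooled_padded)  # [1.5, 0.9]
```

The pooled output has one value per channel regardless of how many timesteps each channel had, and under the non-negativity assumption the zero padding leaves it unchanged.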

The TextCNN Model

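TextCNN mainly uses a one-dimensional convolutional layer and a max-over-time pooling layer. Suppose the input text sequence consists of n words, and each word is represented by a d-dimensional word vector. Then the input example has a width of n, a height of 1, and d input channels.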

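The computation of textCNN can be divided into the following steps:

1. Define multiple one-dimensional convolution kernels and use them to perform convolution on the inputs. Kernels of different widths can capture correlations among different numbers of adjacent words.

2. Perform max-over-time pooling on all output channels, then concatenate the pooled output values of these channels into a vector.

3. Transform the concatenated vector into the output for each category through a fully connected layer. A dropout layer can be used in this step to mitigate overfitting.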

Fig. 5. TextCNN design.

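Fig. 5 gives an example to illustrate textCNN. The input here is a sentence with 11 words, each represented by a 6-dimensional word vector. Therefore, the input sequence has a width of 11 and 6 input channels. Assume there are two one-dimensional convolution kernels with widths 2 and 4, with 4 and 5 output channels respectively. After one-dimensional convolution, the width of the four output channels is 11−2+1=10, while the width of the other five channels is 11−4+1=8. Even though the widths differ, we can still perform max-over-time pooling on every channel and concatenate the pooled outputs of the 9 channels into a 9-dimensional vector. Finally, a fully connected layer transforms the 9-dimensional vector into a 2-dimensional output: the positive and negative sentiment predictions.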
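The shape arithmetic of the Fig. 5 example can be traced in a few lines of plain Python. This sketch only tracks shapes, not values; the helper conv1d_output_width is a hypothetical name and assumes no padding and a stride of 1, as in the example.

```python
def conv1d_output_width(n, k):
    """Output width of a 1D convolution with no padding and stride 1."""
    return n - k + 1


n_words = 11           # sentence length (input width)
kernel_widths = [2, 4]
num_channels = [4, 5]  # output channels for each kernel width

# (channels, width) of each feature map after the convolutions
feature_maps = [(c, conv1d_output_width(n_words, k))
                for c, k in zip(num_channels, kernel_widths)]

# Max-over-time pooling keeps one value per channel, so the concatenated
# vector has as many entries as there are output channels in total
encoding_dim = sum(num_channels)

print(feature_maps)  # [(4, 10), (5, 8)]
print(encoding_dim)  # 9
```

The widths 10 and 8 disappear after pooling, which is why kernels of different widths can be combined into a single 9-dimensional vector.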

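Next, we implement the textCNN model. Compared with the previous section, besides replacing the recurrent neural network with one-dimensional convolutional layers, we use two embedding layers here: one with fixed weights and another that participates in training.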

class TextCNN(nn.Block):
    def __init__(self, vocab_size, embed_size, kernel_sizes, num_channels,
                 **kwargs):
        super(TextCNN, self).__init__(**kwargs)
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # The embedding layer does not participate in training
        self.constant_embedding = nn.Embedding(vocab_size, embed_size)
        self.dropout = nn.Dropout(0.5)
        self.decoder = nn.Dense(2)
        # The max-over-time pooling layer has no weight, so it can share an
        # instance
        self.pool = nn.GlobalMaxPool1D()
        # Create multiple one-dimensional convolutional layers
        self.convs = nn.Sequential()
        for c, k in zip(num_channels, kernel_sizes):
            self.convs.add(nn.Conv1D(c, k, activation='relu'))

    def forward(self, inputs):
        # Concatenate the output of two embedding layers with shape of
        # (batch size, number of words, word vector dimension) by word vector
        embeddings = np.concatenate((
            self.embedding(inputs), self.constant_embedding(inputs)), axis=2)
        # According to the input format required by Conv1D, the word vector
        # dimension, that is, the channel dimension of the one-dimensional
        # convolutional layer, is transformed into the previous dimension
        embeddings = embeddings.transpose(0, 2, 1)
        # For each one-dimensional convolutional layer, after max-over-time
        # pooling, an ndarray with the shape of (batch size, channel size, 1)
        # can be obtained. Use np.squeeze to remove the last dimension and
        # then concatenate on the channel dimension
        encoding = np.concatenate([
            np.squeeze(self.pool(conv(embeddings)), axis=-1)
            for conv in self.convs], axis=1)
        # After applying the dropout method, use a fully connected layer to
        # obtain the output
        outputs = self.decoder(self.dropout(encoding))
        return outputs

Create a TextCNN instance. It has 3 convolutional layers with kernel widths of 3, 4, and 5, all with 100 output channels.

embed_size, kernel_sizes, nums_channels = 100, [3, 4, 5], [100, 100, 100]

ctx = d2l.try_all_gpus()

net = TextCNN(len(vocab), embed_size, kernel_sizes, nums_channels)

net.initialize(init.Xavier(), ctx=ctx)
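As a rough sanity check on this configuration, the sizes that flow through the network can be worked out by hand. The sketch below is plain Python arithmetic, not a query of the actual net object; it assumes that the convolutional and dense layers each carry a bias term and that the two embedding outputs are concatenated along the word vector dimension, as in the forward function above.

```python
embed_size = 100
kernel_sizes = [3, 4, 5]
nums_channels = [100, 100, 100]

# Two embedding layers are concatenated, doubling the input channels
conv_in_channels = 2 * embed_size

# Weights plus one bias per output channel for each Conv1D layer (assumed)
conv_params = [c * conv_in_channels * k + c
               for c, k in zip(nums_channels, kernel_sizes)]

# Pooled outputs of all convolutional layers are concatenated
encoding_dim = sum(nums_channels)

# Dense layer mapping the encoding to 2 classes, with bias (assumed)
decoder_params = encoding_dim * 2 + 2

print(conv_in_channels)  # 200
print(conv_params)       # [60100, 80100, 100100]
print(encoding_dim)      # 300
print(decoder_params)    # 602
```

The embedding layers are left out of the count because their size depends on the vocabulary, which is only known after loading the dataset.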

3.1. Load Pre-trained Word Vectors

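As in the previous section, load the pretrained 100-dimensional GloVe word vectors and use them to initialize the embedding layers embedding and constant_embedding. The former participates in training, while the latter has fixed weights.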

glove_embedding = d2l.TokenEmbedding('glove.6b.100d')

embeds = glove_embedding[vocab.idx_to_token]

net.embedding.weight.set_data(embeds)

net.constant_embedding.weight.set_data(embeds)

net.constant_embedding.collect_params().setattr('grad_req', 'null')

3.2. Train and Evaluate the Model

Now we can train the model.

lr, num_epochs = 0.001, 5

trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': lr})

loss = gluon.loss.SoftmaxCrossEntropyLoss()

d2l.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, ctx)

loss 0.094, train acc 0.968, test acc 0.866

3834.5 examples/sec on [gpu(0), gpu(1)]

Below, we use the trained model to classify the sentiment of two simple sentences.

d2l.predict_sentiment(net, vocab, 'this movie is so great')

'positive'

d2l.predict_sentiment(net, vocab, 'this movie is so bad')

'negative'

4. Summary

· We can use one-dimensional convolution to process and analyze timing data.

· A one-dimensional cross-correlation operation with multiple input channels can be regarded as a two-dimensional cross-correlation operation with a single input channel.

· The input of the max-over-time pooling layer can have different numbers of timesteps on each channel.

· TextCNN mainly uses a one-dimensional convolutional layer and max-over-time pooling layer.
