
TensorFlow Comprehensive Example 4: Logistic Regression with an Estimator


Contents

    • 1. Load the CSV dataset and build a Dataset
      • 1.1 Read the CSV data into a DataFrame with pandas
      • 1.2 Build a Dataset from the DataFrame
    • 2. Wrap the data in feature columns
    • 3. Build and train the model
    • 4. Build crossed features
    • 5. Make predictions

This section implements logistic regression with the Estimator API.

The Titanic dataset in CSV format is used as the training and evaluation data.

import tensorflow as tf
import pandas as pd
from IPython.display import clear_output

print(tf.__version__)
print(pd.__version__)

2.3.0
1.0.1

1. Load the CSV dataset and build a Dataset

1.1 Read the CSV data into a DataFrame with pandas

dftrain = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
dfeval = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')
y_train = dftrain.pop('survived')
y_eval = dfeval.pop('survived')

Let's take a look at the data:

dftrain.head()

      sex   age  n_siblings_spouses  parch     fare  class     deck  embark_town alone
0    male  22.0                   1      0   7.2500  Third  unknown  Southampton     n
1  female  38.0                   1      0  71.2833  First        C    Cherbourg     n
2  female  26.0                   0      0   7.9250  Third  unknown  Southampton     y
3  female  35.0                   1      0  53.1000  First        C  Southampton     n
4    male  28.0                   0      0   8.4583  Third  unknown   Queenstown     y
dftrain.describe()

              age  n_siblings_spouses       parch        fare
count  627.000000          627.000000  627.000000  627.000000
mean    29.631308            0.545455    0.379585   34.385399
std     12.511818            1.151090    0.792999   54.597730
min      0.750000            0.000000    0.000000    0.000000
25%     23.000000            0.000000    0.000000    7.895800
50%     28.000000            0.000000    0.000000   15.045800
75%     35.000000            1.000000    0.000000   31.387500
max     80.000000            8.000000    5.000000  512.329200

The training and evaluation sets contain the following numbers of examples:

dftrain.shape, dfeval.shape

((627, 9), (264, 9))

Let's look at the age distribution:

# Split the ages into 20 bins, count the number in each bin, and plot a histogram.
dftrain.age.hist(bins=20)
# dftrain.age is equivalent to dftrain['age']; pandas exposes each column as an attribute with the same name.

<matplotlib.axes._subplots.AxesSubplot at 0x7f83dd224c50>

Now the sex distribution:

dftrain.sex.value_counts()

male      410
female    217
Name: sex, dtype: int64

# Show the same counts as a bar chart.
dftrain.sex.value_counts().plot(kind='barh')

<matplotlib.axes._subplots.AxesSubplot at 0x7f83e18777d0>

# And the distribution of passenger class.
dftrain['class'].value_counts().plot(kind='barh')

<matplotlib.axes._subplots.AxesSubplot at 0x7f83e072a410>

Let's look at the survival rate by sex:

pd.concat([dftrain, y_train], axis=1).groupby('sex').survived.mean()

sex
female    0.778802
male      0.180488
Name: survived, dtype: float64

pd.concat([dftrain, y_train], axis=1).groupby('sex').survived.mean().plot(kind='barh').set_xlabel('% survived')

Text(0.5, 0, '% survived')

1.2 Build a Dataset from the DataFrame

Use the training data above to build a tf.data.Dataset:

# This line only demonstrates how to convert a DataFrame into a Dataset; the variable is not used below.
dataset = tf.data.Dataset.from_tensor_slices((dict(dftrain), y_train))
for ds in dataset:
    print(ds)

def make_input_fn(data_df, label_df, num_epochs=10, shuffle=True, batch_size=32):
    def input_function():
        # Build a Dataset from the DataFrame.
        ds = tf.data.Dataset.from_tensor_slices((dict(data_df), label_df))
        if shuffle:
            ds = ds.shuffle(1000)
        ds = ds.batch(batch_size).repeat(num_epochs)
        return ds
    return input_function

train_input_fn = make_input_fn(dftrain, y_train)
eval_input_fn = make_input_fn(dfeval, y_eval, num_epochs=1, shuffle=False)

ds = make_input_fn(dftrain, y_train, batch_size=10)()
for feature_batch, label_batch in ds.take(1):
    print('Some feature keys:', list(feature_batch.keys()))
    print()
    print('A batch of class:', feature_batch['class'].numpy())
    print()
    print('A batch of Labels:', label_batch.numpy())

Some feature keys: ['sex', 'age', 'n_siblings_spouses', 'parch', 'fare', 'class', 'deck', 'embark_town', 'alone']

A batch of class: [b'Second' b'Second' b'Third' b'Second' b'Third' b'Third' b'Third'
 b'Third' b'Third' b'Third']

A batch of Labels: [0 0 0 1 0 0 0 1 0 0]

The next two cells inspect individual feature columns with tf.keras.layers.DenseFeatures; note that they use the feature_columns list defined in section 2 below.

age_column = feature_columns[7]
tf.keras.layers.DenseFeatures([age_column])(feature_batch).numpy()

array([[23.],
       [28.],
       [32.],
       [31.],
       [28.],
       [ 4.],
       [28.],
       [25.],
       [35.],
       [28.]], dtype=float32)

gender_column = feature_columns[0]
tf.keras.layers.DenseFeatures([tf.feature_column.indicator_column(gender_column)])(feature_batch).numpy()

array([[1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [1., 0.]], dtype=float32)

2. Wrap the data in feature columns

CATEGORICAL_COLUMNS = ['sex', 'n_siblings_spouses', 'parch', 'class', 'deck',
                       'embark_town', 'alone']
NUMERIC_COLUMNS = ['age', 'fare']

feature_columns = []
for feature_name in CATEGORICAL_COLUMNS:
    vocabulary = dftrain[feature_name].unique()
    feature_columns.append(tf.feature_column.categorical_column_with_vocabulary_list(feature_name, vocabulary))

for feature_name in NUMERIC_COLUMNS:
    feature_columns.append(tf.feature_column.numeric_column(feature_name, dtype=tf.float32))

feature_columns[5]

VocabularyListCategoricalColumn(key='embark_town', vocabulary_list=('Southampton', 'Cherbourg', 'Queenstown', 'unknown'), dtype=tf.string, default_value=-1, num_oov_buckets=0)

3. Build and train the model

TensorFlow's prebuilt LinearClassifier makes it easy to build and train the model:

linear_est = tf.estimator.LinearClassifier(feature_columns=feature_columns)
linear_est.train(train_input_fn)
result = linear_est.evaluate(eval_input_fn)

clear_output()
print(result)

{'accuracy': 0.75, 'accuracy_baseline': 0.625, 'auc': 0.825375, 'auc_precision_recall': 0.7897542, 'average_loss': 0.5112618, 'label/mean': 0.375, 'loss': 0.50658554, 'precision': 0.6386555, 'prediction/mean': 0.47108686, 'recall': 0.7676768, 'global_step': 200}
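The metrics dictionary returned by evaluate can be hard to scan. As an optional aside that is not part of the original code, it can be displayed as a sorted pandas Series; a minimal sketch, assuming result is the dictionary printed above:

# Optional: show the evaluation metrics as a sorted pandas Series for easier reading.
# `result` is the dict returned by linear_est.evaluate(eval_input_fn) above.
metrics = pd.Series(result).sort_index()
print(metrics)   # accuracy 0.75, auc 0.825..., recall 0.768..., etc.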

4. Build crossed features

The features above are all used individually, but sometimes a combination of features correlates more strongly with the outcome than either feature alone. For example, it may be hard to predict survival from sex alone or age alone, yet a particular age-plus-sex combination can be highly predictive.

age_x_gender = tf.feature_column.crossed_column(['age', 'sex'], hash_bucket_size=100)
derived_feature_columns = [age_x_gender]

linear_est = tf.estimator.LinearClassifier(feature_columns=feature_columns + derived_feature_columns)
linear_est.train(train_input_fn)
result = linear_est.evaluate(eval_input_fn)

clear_output()
print(result)

{'accuracy': 0.7765151, 'accuracy_baseline': 0.625, 'auc': 0.85026014, 'auc_precision_recall': 0.7782748, 'average_loss': 0.48670053, 'label/mean': 0.375, 'loss': 0.4771559, 'precision': 0.7631579, 'prediction/mean': 0.29976445, 'recall': 0.5858586, 'global_step': 200}

With the crossed feature added, accuracy rises from 0.75 to about 0.777 and AUC from about 0.825 to about 0.850.

5. Make predictions

pred_dicts = list(linear_est.predict(eval_input_fn))
probs = pd.Series([pred['probabilities'][1] for pred in pred_dicts])

probs.plot(kind='hist', bins=20, title='predicted probabilities')

INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:Layer linear/linear_model is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because its dtype defaults to floatx.
If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.
To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /var/folders/vx/w_50bfjj6xn9j5_lqhfrbcv00000gn/T/tmpo5ivmpz_/model.ckpt-200
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.

<matplotlib.axes._subplots.AxesSubplot at 0x7f83dccac390>
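Each entry in pred_dicts holds a 'probabilities' array whose second element is the predicted probability of survival. As a quick sanity check that is not part of the original code, these probabilities can be thresholded and compared against y_eval; a minimal sketch, where the 0.5 cutoff is an arbitrary choice:

# Turn the survival probabilities into hard class predictions and compare with the true labels.
# `probs` and `y_eval` are the objects defined above; 0.5 is an assumed threshold.
predicted_classes = (probs >= 0.5).astype(int)
accuracy = (predicted_classes.values == y_eval.values).mean()
print('accuracy at a 0.5 threshold:', accuracy)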

Compute the ROC curve:

from sklearn.metrics import roc_curve
from matplotlib import pyplot as plt

fpr, tpr, _ = roc_curve(y_eval, probs)
plt.plot(fpr, tpr)
plt.title('ROC curve')
plt.xlabel('false positive rate')
plt.ylabel('true positive rate')
plt.xlim(0,)
plt.ylim(0,)

(0, 1.05)
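The area under this curve can also be computed directly with scikit-learn. This step is not in the original code; a minimal sketch, which should come out close to the 'auc' value reported by evaluate above:

# Compute the ROC AUC from the same labels and predicted probabilities.
from sklearn.metrics import roc_auc_score

auc = roc_auc_score(y_eval, probs)
print('ROC AUC:', auc)  # expected to be close to the ~0.85 reported by linear_est.evaluate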
