
Getting Started with Deep Learning: How to Use the MNIST Data Format

Published: 2025/3/15 pytorch 43 豆豆

I downloaded the MNIST dataset directly from the web.

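After extracting it, I found that each archive contains a single idx-ubyte file and no image files. IDX is a file format for storing vectors and multidimensional matrices.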
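To make the layout concrete, here is a minimal sketch that builds a tiny IDX buffer in memory and reads its header back. The helper name `parse_idx_header` and the 2x3 sample data are assumptions made up for illustration and are not part of the MNIST distribution. The byte layout follows the IDX description: two zero bytes, a type code, a dimension count, then one big-endian 4-byte size per dimension.

```python
import struct

# Type codes from the IDX description mapped to struct format characters.
IDX_TYPES = {0x08: 'B', 0x09: 'b', 0x0B: 'h', 0x0C: 'i', 0x0D: 'f', 0x0E: 'd'}


def parse_idx_header(buf):
    """Return (type_code, dims, data_offset) for an IDX byte buffer.

    Hypothetical helper, for illustration only.
    """
    # The magic number is 4 bytes: 0, 0, type code, number of dimensions.
    zero1, zero2, type_code, ndim = struct.unpack_from('>BBBB', buf, 0)
    if zero1 != 0 or zero2 != 0:
        raise ValueError('not an IDX buffer: first two bytes must be 0')
    if type_code not in IDX_TYPES:
        raise ValueError('unknown IDX type code: 0x%02X' % type_code)
    # Each dimension size is a big-endian 4-byte integer.
    dims = struct.unpack_from('>' + 'I' * ndim, buf, 4)
    return type_code, dims, 4 + 4 * ndim


# Build a made-up 2x3 unsigned-byte matrix in IDX form.
sample = (struct.pack('>BBBB', 0, 0, 0x08, 2)
          + struct.pack('>II', 2, 3)
          + bytes(range(6)))

type_code, dims, offset = parse_idx_header(sample)
values = struct.unpack_from('>6' + IDX_TYPES[type_code], sample, offset)
print(type_code, dims, offset, values)
```

For an MNIST image file, which has three dimensions, the same calculation gives a data offset of 16 bytes, and for a label file with one dimension it gives 8.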

Program

Reposted from: https://blog.csdn.net/Barry_J/article/details/78749620

# encoding: utf-8
"""
@author: monitor1379
@contact: yy4f5da2@hotmail.com
@site: www.monitor1379.com
@version: 1.0
@license: Apache Licence
@file: mnist_decoder.py
@time: 2016/8/16 20:03

Parses the MNIST handwritten digit data files so they can be converted to images (e.g. bmp).
The dataset can be downloaded from http://yann.lecun.com/exdb/mnist.
For the format conversion, see the official site and the code comments.

========================
Parsing rules for the IDX file format:
========================
THE IDX FILE FORMAT

the IDX file format is a simple format for vectors and multidimensional matrices of various numerical types.
The basic format is

magic number
size in dimension 0
size in dimension 1
size in dimension 2
.....
size in dimension N
data

The magic number is an integer (MSB first). The first 2 bytes are always 0.

The third byte codes the type of the data:
0x08: unsigned byte
0x09: signed byte
0x0B: short (2 bytes)
0x0C: int (4 bytes)
0x0D: float (4 bytes)
0x0E: double (8 bytes)

The 4-th byte codes the number of dimensions of the vector/matrix: 1 for vectors, 2 for matrices....

The sizes in each dimension are 4-byte integers (MSB first, high endian, like in most non-Intel processors).

The data is stored like in a C array, i.e. the index in the last dimension changes the fastest.
"""
import struct

import numpy as np
import matplotlib.pyplot as plt

# Training set image file
train_images_idx3_ubyte_file = '../../data/mnist/bin/train-images.idx3-ubyte'
# Training set label file
train_labels_idx1_ubyte_file = '../../data/mnist/bin/train-labels.idx1-ubyte'

# Test set image file
test_images_idx3_ubyte_file = '../../data/mnist/bin/t10k-images.idx3-ubyte'
# Test set label file
test_labels_idx1_ubyte_file = '../../data/mnist/bin/t10k-labels.idx1-ubyte'


def decode_idx3_ubyte(idx3_ubyte_file):
    """
    Generic function for parsing idx3 files.
    :param idx3_ubyte_file: path to the idx3 file
    :return: the dataset
    """
    # Read the binary data (the with-block closes the file afterwards)
    with open(idx3_ubyte_file, 'rb') as f:
        bin_data = f.read()

    # Parse the header: magic number, number of images, image height, image width
    offset = 0
    fmt_header = '>iiii'
    magic_number, num_images, num_rows, num_cols = struct.unpack_from(fmt_header, bin_data, offset)
    print('magic number: %d, number of images: %d, image size: %d*%d' % (magic_number, num_images, num_rows, num_cols))

    # Parse the dataset, one image at a time
    image_size = num_rows * num_cols
    offset += struct.calcsize(fmt_header)
    fmt_image = '>' + str(image_size) + 'B'
    images = np.empty((num_images, num_rows, num_cols))
    for i in range(num_images):
        if (i + 1) % 10000 == 0:
            print('parsed %d images' % (i + 1))
        images[i] = np.array(struct.unpack_from(fmt_image, bin_data, offset)).reshape((num_rows, num_cols))
        offset += struct.calcsize(fmt_image)
    return images


def decode_idx1_ubyte(idx1_ubyte_file):
    """
    Generic function for parsing idx1 files.
    :param idx1_ubyte_file: path to the idx1 file
    :return: the dataset
    """
    # Read the binary data (the with-block closes the file afterwards)
    with open(idx1_ubyte_file, 'rb') as f:
        bin_data = f.read()

    # Parse the header: magic number and number of labels
    offset = 0
    fmt_header = '>ii'
    magic_number, num_images = struct.unpack_from(fmt_header, bin_data, offset)
    print('magic number: %d, number of labels: %d' % (magic_number, num_images))

    # Parse the dataset, one label at a time
    offset += struct.calcsize(fmt_header)
    fmt_image = '>B'
    labels = np.empty(num_images)
    for i in range(num_images):
        if (i + 1) % 10000 == 0:
            print('parsed %d labels' % (i + 1))
        labels[i] = struct.unpack_from(fmt_image, bin_data, offset)[0]
        offset += struct.calcsize(fmt_image)
    return labels


def load_train_images(idx_ubyte_file=train_images_idx3_ubyte_file):
    """
    TRAINING SET IMAGE FILE (train-images-idx3-ubyte):
    [offset] [type]          [value]          [description]
    0000     32 bit integer  0x00000803(2051) magic number
    0004     32 bit integer  60000            number of images
    0008     32 bit integer  28               number of rows
    0012     32 bit integer  28               number of columns
    0016     unsigned byte   ??               pixel
    0017     unsigned byte   ??               pixel
    ........
    xxxx     unsigned byte   ??               pixel
    Pixels are organized row-wise. Pixel values are 0 to 255. 0 means background (white), 255 means foreground (black).

    :param idx_ubyte_file: path to the idx file
    :return: np.array of shape n*row*col, where n is the number of images
    """
    return decode_idx3_ubyte(idx_ubyte_file)


def load_train_labels(idx_ubyte_file=train_labels_idx1_ubyte_file):
    """
    TRAINING SET LABEL FILE (train-labels-idx1-ubyte):
    [offset] [type]          [value]          [description]
    0000     32 bit integer  0x00000801(2049) magic number (MSB first)
    0004     32 bit integer  60000            number of items
    0008     unsigned byte   ??               label
    0009     unsigned byte   ??               label
    ........
    xxxx     unsigned byte   ??               label
    The labels values are 0 to 9.

    :param idx_ubyte_file: path to the idx file
    :return: 1-D np.array of length n, where n is the number of images
    """
    return decode_idx1_ubyte(idx_ubyte_file)


def load_test_images(idx_ubyte_file=test_images_idx3_ubyte_file):
    """
    TEST SET IMAGE FILE (t10k-images-idx3-ubyte):
    [offset] [type]          [value]          [description]
    0000     32 bit integer  0x00000803(2051) magic number
    0004     32 bit integer  10000            number of images
    0008     32 bit integer  28               number of rows
    0012     32 bit integer  28               number of columns
    0016     unsigned byte   ??               pixel
    0017     unsigned byte   ??               pixel
    ........
    xxxx     unsigned byte   ??               pixel
    Pixels are organized row-wise. Pixel values are 0 to 255. 0 means background (white), 255 means foreground (black).

    :param idx_ubyte_file: path to the idx file
    :return: np.array of shape n*row*col, where n is the number of images
    """
    return decode_idx3_ubyte(idx_ubyte_file)


def load_test_labels(idx_ubyte_file=test_labels_idx1_ubyte_file):
    """
    TEST SET LABEL FILE (t10k-labels-idx1-ubyte):
    [offset] [type]          [value]          [description]
    0000     32 bit integer  0x00000801(2049) magic number (MSB first)
    0004     32 bit integer  10000            number of items
    0008     unsigned byte   ??               label
    0009     unsigned byte   ??               label
    ........
    xxxx     unsigned byte   ??               label
    The labels values are 0 to 9.

    :param idx_ubyte_file: path to the idx file
    :return: 1-D np.array of length n, where n is the number of images
    """
    return decode_idx1_ubyte(idx_ubyte_file)


def run():
    train_images = load_train_images()
    train_labels = load_train_labels()
    # test_images = load_test_images()
    # test_labels = load_test_labels()

    # Show the first ten samples and their labels to check that they were read correctly
    for i in range(10):
        print(train_labels[i])
        plt.imshow(train_images[i], cmap='gray')
        plt.show()
    print('done')


if __name__ == '__main__':
    run()
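The per-image loop in the decoder above works, but it unpacks 60,000 images one at a time. As an alternative sketch (not the original author's method), `np.frombuffer` can decode the whole pixel block in a single call. The function name `decode_idx3_fast` and the synthetic two-image buffer are assumptions for illustration; in real use the bytes would come from reading one of the idx3-ubyte files.

```python
import struct

import numpy as np


def decode_idx3_fast(bin_data):
    """Decode an idx3-ubyte byte string into an (n, rows, cols) uint8 array.

    Hypothetical helper: an alternative to the loop-based decoder.
    """
    # Header: magic number, image count, rows, columns (big-endian int32).
    magic, num_images, num_rows, num_cols = struct.unpack_from('>iiii', bin_data, 0)
    if magic != 2051:
        raise ValueError('unexpected magic number: %d' % magic)
    # Pixel data starts right after the 16-byte header.
    count = num_images * num_rows * num_cols
    pixels = np.frombuffer(bin_data, dtype=np.uint8, count=count, offset=16)
    return pixels.reshape(num_images, num_rows, num_cols)


# Synthetic stand-in for a real file: two 2x2 "images".
header = struct.pack('>iiii', 2051, 2, 2, 2)
body = bytes([0, 255, 128, 64, 1, 2, 3, 4])
images = decode_idx3_fast(header + body)
print(images.shape, images.dtype)
print(images[1])
```

The result is a read-only uint8 view on the input bytes, whereas the loop version fills a float64 array created by `np.empty`. Calling `.astype(np.float64)` on the result gives a writable floating-point copy if that is what later code expects.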

Fighting!!
