Pandas for Newbies: An Introduction Part I

Published: 2023/11/29

For those just starting out in data science, the Python programming language is a prerequisite, so if you aren’t familiar with Python, get familiar with it first and then come back here to start on pandas.

You can start learning Python with a series of articles I just started, called Minimal Python Required for Data Science.

One of the most important tools in the data science toolbox is pandas, a data analysis library for Python developed by Wes McKinney during his tenure at a hedge fund.

For this entire series of articles, we’re going to be using Anaconda, a Python distribution and package manager geared toward data science and machine learning. If you aren’t familiar with what I just talked about, go ahead and check out this video, which will teach you about Anaconda and Jupyter Notebook, both of which are central to data science work.

You can activate your conda environment (virtual environment) with:

$ conda activate [name of environment]
# my environment is named `datascience` so
$ conda activate datascience

Once you activate your conda virtual environment, you should see this in your terminal:

(datascience)$

Assuming you have Miniconda or Anaconda installed on your system, you can easily install pandas with:

$ conda install pandas

We’re also going to be using Jupyter Notebook to do our coding, so go ahead and

$

And start up your Jupyter Notebook with:

$ jupyter notebook

Pandas is the glue that holds it all together

Photo by Juhasz Imre from Pexels

Pandas becomes more important as we move up the hierarchy of data science into machine learning, because it lets us “clean” and “wrangle” data before feeding it to algorithms like Random Forest and Neural Networks. If ML algorithms are Doc, then pandas is Marty.

A Guided Bus Tour

My favorite. Photo by Venkat Ragavan from Pexels

One of my favorite places to visit, ever since childhood, is the San Diego Zoo. And one thing I always do is take the guided bus tour while drinking a Blue Moon.

We’re going to do something similar: I’m going to give a brief tour of just some of the things you can do with pandas. You’re on your own with the Blue Moon.

Both the data and the inspiration for this Medium series come from Ted Petrou’s excellent courses on Dunder Data.

Pandas essentially deals with tabular data: rows and columns. In this respect, it’s very much like an Excel spreadsheet.

The two primary objects you’ll interface with in pandas are the Series and the DataFrame. A DataFrame is two-dimensional data, complete with rows and columns.

It’s okay if you don’t know what the code below does; we will go over it in detail later. The data that we use here concerns bicycle riders in the city of Chicago, Illinois.

DataFrame: tabular data

A Series is one-dimensional data: a single column of a DataFrame.

Series: A single column of data
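The relationship between the two objects can be sketched in a few lines. This is a minimal illustration and not the article's original code: the rows are made up, and the column names `gender`, `tripduration`, and `wind_speed` are assumptions chosen to resemble a bike-ride dataset.

```python
import pandas as pd

# Made-up sample rows; the column names are assumptions for illustration.
df = pd.DataFrame(
    {
        "gender": ["Male", "Female", "Male"],
        "tripduration": [993, 623, 1040],
        "wind_speed": [12.7, 6.9, 16.1],
    }
)

# Selecting a single column with square brackets returns a Series.
wind = df["wind_speed"]

print(type(df).__name__)    # DataFrame
print(type(wind).__name__)  # Series
print(df.shape)             # (3, 3): three rows, three columns
print(wind.shape)           # (3,): one dimension only
```

Note that the Series keeps the same row index as the DataFrame it came from, which is what lets pandas line the two up again later.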

As shown above, one of the highlights of pandas is that it lets you load data into a Jupyter Notebook session whatever the source format is: CSV (comma-delimited), XLSX (Excel), SQL, or JSON.
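As a rough sketch of loading the same table from two different formats, the snippet below reads identical made-up rows from CSV text and from JSON text. The in-memory `io.StringIO` buffers stand in for real files, and the column names are assumptions; `pd.read_excel` and `pd.read_sql` follow the same pattern but need an extra engine or a database connection.

```python
import io

import pandas as pd

# Made-up rows in two formats; StringIO buffers stand in for real files.
csv_text = "trip_id,gender,wind_speed\n1,Male,12.7\n2,Female,6.9\n"
json_text = (
    '[{"trip_id": 1, "gender": "Male", "wind_speed": 12.7},'
    ' {"trip_id": 2, "gender": "Female", "wind_speed": 6.9}]'
)

from_csv = pd.read_csv(io.StringIO(csv_text))
from_json = pd.read_json(io.StringIO(json_text), orient="records")

# Both readers produce a DataFrame with the same columns and values.
print(from_csv)
print(from_json)
```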

One of the first things we always do is take a peek at the dataset we’re studying by using the head method. By default, head returns the first five rows of the data. We can pass an integer to control how many rows we want to see:

df.head(7)

First seven rows

If we want to see the last five rows:

df.tail()
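Here is a small sketch of how head and tail behave with and without an argument. The ten-row DataFrame is made up for the example, and the column names `trip_id` and `wind_speed` are assumptions.

```python
import pandas as pd

# Ten made-up rows so the default of five is visible.
df = pd.DataFrame(
    {
        "trip_id": list(range(1, 11)),
        "wind_speed": [5.0 + i for i in range(10)],
    }
)

first_five = df.head()    # no argument: first five rows
first_seven = df.head(7)  # first seven rows
last_five = df.tail()     # no argument: last five rows
last_two = df.tail(2)     # last two rows

print(len(first_five), len(first_seven), len(last_five), len(last_two))
# 5 7 5 2
```

Both methods return a new DataFrame, so the result can be stored, filtered, or displayed like any other.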

Read In Data

We use the read_csv function to load CSV-formatted data.

We pass the path to the file containing our data, as a string, to the pandas read_csv function. In my case, I’m using the URL of my GitHub repo, which holds all the data that I will be using. I highly recommend reading the documentation for the pandas read_csv function, as it’s one of the most important and flexible functions in the whole library.
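To give a feel for that flexibility, here is a minimal sketch using a few of the optional read_csv parameters. The CSV text is made up and read from an in-memory buffer instead of the author's GitHub URL (which is not reproduced here); the column names, including `starttime`, are assumptions.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a file path or URL.
csv_text = """trip_id,gender,starttime,wind_speed
1,Male,2013-06-28 19:01:00,12.7
2,Female,2013-06-28 22:53:00,6.9
3,Male,2013-06-30 14:43:00,16.1
"""

df = pd.read_csv(
    io.StringIO(csv_text),                           # a path or URL string also works here
    usecols=["trip_id", "starttime", "wind_speed"],  # load only these columns
    parse_dates=["starttime"],                       # convert this column to datetimes
    nrows=2,                                         # read only the first two data rows
)

print(df.dtypes)
print(df)
```

Restricting columns and rows at read time is a handy way to explore a large file before loading all of it.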

Filter Data

We can filter rows of a pandas DataFrame with conditional logic. For programmers familiar with SQL, this is like using the WHERE clause.

To retrieve only the rows where wind_speed is greater than 42.0, we can do this:

The filt variable stands for ‘filter’.
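A filter of this kind can be sketched as follows. This is an illustration under assumptions and not the article's original code: the rows are made up, and only the `wind_speed` column name and the `filt` variable name come from the text.

```python
import pandas as pd

# Made-up rows; only the wind_speed column name comes from the article.
df = pd.DataFrame(
    {
        "gender": ["Male", "Male", "Female", "Male"],
        "wind_speed": [12.7, 43.7, 6.9, 45.0],
    }
)

# Comparing a column to a value gives a boolean Series, one True/False per row.
filt = df["wind_speed"] > 42.0

# Indexing the DataFrame with that boolean Series keeps only the True rows.
windy = df[filt]

print(filt.tolist())  # [False, True, False, True]
print(windy)
```

The filtered result keeps the original row labels (1 and 3 here), which makes it easy to trace rows back to the full dataset.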

We can filter for more than one condition like this:

Here we filter for rows where the wind speed is greater than 42.0 (I’m assuming miles per hour) and the gender of the bicyclist is female. As we can see, it returns an empty dataset.

We can verify that the empty result isn’t caused by an error on our part by trying the same pair of filters for male riders.
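Combining two conditions can be sketched like this. The data is made up and deliberately arranged so that the female filter comes back empty, mirroring the result described above; the `gender` column name and its `"Female"` and `"Male"` values are assumptions.

```python
import pandas as pd

# Made-up rows arranged so that no female rider has wind_speed above 42.0.
df = pd.DataFrame(
    {
        "gender": ["Male", "Male", "Female", "Male"],
        "wind_speed": [12.7, 43.7, 6.9, 45.0],
    }
)

# Each condition needs its own parentheses; & means "and" (| would mean "or").
filt_female = (df["wind_speed"] > 42.0) & (df["gender"] == "Female")
filt_male = (df["wind_speed"] > 42.0) & (df["gender"] == "Male")

windy_female = df[filt_female]
windy_male = df[filt_male]

print(len(windy_female))  # 0
print(len(windy_male))    # 2
```

The parentheses matter because `&` binds more tightly than `>` and `==` in Python, so leaving them out raises an error or gives the wrong result.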

We can also do something like this:

Query: A Simpler Alternative to Filtering

Pandas also has a query method, which is somewhat limited in its abilities but allows for simpler and more readable code. As before, programmers familiar with SQL should feel comfortable with this method.
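As a minimal sketch, the same kind of filter can be written as a query string. The data is made up, and the `gender` column name and its values are assumptions; the final comparison just checks that query and a boolean mask select the same rows.

```python
import pandas as pd

# Made-up rows; column names other than wind_speed are assumptions.
df = pd.DataFrame(
    {
        "gender": ["Male", "Male", "Female", "Male"],
        "wind_speed": [12.7, 43.7, 6.9, 45.0],
    }
)

# Column names are written directly inside the query string.
windy = df.query("wind_speed > 42.0")

# Conditions can be joined with "and" / "or"; string values need quotes.
windy_male = df.query("wind_speed > 42.0 and gender == 'Male'")

# The equivalent boolean-mask version, for comparison.
mask_version = df[(df["wind_speed"] > 42.0) & (df["gender"] == "Male")]

print(windy_male.equals(mask_version))  # True
```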

To Be Continued

Pandas for Newbies is meant to be a Medium series, so watch for the next tutorial, Pandas for Newbies: An Introduction Part II, which will be posted soon.

What I do

I help people find mentors, code in Python, and write about life. If you’re thinking about switching careers into the tech industry, or just want to talk, you can sign up for my Slack channel via VegasBlu.

Translated from: https://towardsdatascience.com/pandas-for-newbies-an-introduction-part-i-8246f14efcca
