Seaborn Tutorial: Data Visualization Tutorial Using Seaborn

Published: 2023/12/15

“Seaborn makes the exploratory data analysis phase of your data science project beautiful and painless”

Introduction

This tutorial is aimed at readers who have worked with Seaborn before but have lost touch with it. I hope that reading it helps you recall Seaborn's visualization style and commands so that you can get started with your data exploration. The tutorial is organized by the number of numerical and categorical features you have: each section shows which visualizations Seaborn offers for that combination and how to produce them.

Let's import Seaborn:

import seaborn as sns

The Dataset:

We will be using the tips dataset available within the seaborn library.

Load the dataset using:

tips = sns.load_dataset('tips')

The columns used in this tutorial are:

total_bill (Numerical): Total bill for the table
tip (Numerical): Tip for the waiter serving the table
sex (Categorical): Gender of the bill payer (Male/Female)
smoker (Categorical): Whether the bill payer was a smoker (Yes/No)
day (Categorical): Day of the week (Sun, Mon… etc)
table_size (Numerical): Capacity of the table
date: Date and time of the bill payment

Note that the copy of the dataset bundled with seaborn names the table size column size and has a time column (Lunch/Dinner) rather than date, so the table_size and date examples in this tutorial need those columns to be prepared first.
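One way to get a frame with the column names used in this tutorial is sketched below. This is only an illustration under my own assumptions, not the original author's preparation step: the helper name prepare_tips is hypothetical, size is simply renamed to table_size, and the date column is filled with made-up consecutive days. The demo frame is hand-written so the sketch runs without downloading anything.

```python
import pandas as pd


def prepare_tips(df, start="2020-01-01"):
    """Return a copy of df with the column names this tutorial uses.

    Assumptions (not from the original article): `size` is renamed to
    `table_size`, and `date` is a made-up series of consecutive days.
    """
    out = df.rename(columns={"size": "table_size"})
    # Hypothetical dates: one consecutive day per row, purely for illustration.
    out["date"] = pd.date_range(start=start, periods=len(out), freq="D")
    return out


# Tiny hand-made frame using the bundled dataset's column names (values are illustrative).
demo = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01],
    "tip": [1.01, 1.66, 3.50],
    "sex": ["Female", "Male", "Male"],
    "smoker": ["No", "No", "No"],
    "day": ["Sun", "Sun", "Sun"],
    "time": ["Dinner", "Dinner", "Dinner"],
    "size": [2, 3, 3],
})

prepared = prepare_tips(demo)
print(prepared.columns.tolist())
```

With the real data, the same helper could be applied as tips = prepare_tips(sns.load_dataset('tips')).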

Seaborn Styles:

Let's start with the different styles available in Seaborn. The styles differ in the background colour, grid layout and axis ticks of the plot. There are five basic styles: dark, darkgrid, white, whitegrid and ticks.

sns.set_style('dark')
sns.set_style('darkgrid')
sns.set_style('ticks')
sns.set_style('white')
sns.set_style('whitegrid')

Visualizations

Let’s look at various visualizations we can do using Seaborn. Each segment below shows how to perform visualizations given the number of categorical and numerical variables that are available to you.

One Numerical Variable:

If we have one numerical variable, we can analyse the distribution of that variable.

# distplot is deprecated in newer seaborn releases (histplot/displot replace it).
g = sns.distplot(tips.tip)
g.set_title('Tip Amount Distribution');

g = sns.distplot(tips.tip, kde=False)
g.set_title('Tip Amount Histogram');

g = sns.distplot(tips.tip, rug=True)
g.set_title('Tip Amount Distribution with rug');

We can observe that the tip amount data is approximately normal, with a longer tail to the right.

One categorical variable

If we have one categorical variable, we can draw a count plot, which shows how often each value of that variable occurs.

g = sns.catplot(x="day", kind='count', order=['Thur', 'Fri', 'Sun', 'Sat'], data=tips)
g.fig.suptitle("Frequency of days in the tips dataset [Count Plot]", y=1.05);
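The bar heights in a count plot are simply value counts. As a rough sketch, the same numbers can be obtained with pandas; the day labels here are made up and only stand in for tips.day.

```python
import pandas as pd

# Made-up day labels standing in for tips.day.
days = pd.Series(
    ["Sun", "Sat", "Sat", "Thur", "Fri", "Sat", "Sun", "Thur", "Sat"],
    name="day",
)

order = ["Thur", "Fri", "Sun", "Sat"]  # same bar order as in the count plot
counts = days.value_counts().reindex(order)
print(counts)
```

Checking the counts this way is a quick sanity test before reading anything into the plot.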

Two Numerical variables

To analyse the relationship between two numerical variables, we can draw a scatter plot in Seaborn.

g = sns.relplot(x="total_bill", y="tip", data=tips, kind='scatter')
g.fig.suptitle('Relationship between continuous variables [Scatter Plot]', y=1.05);

Seaborn also makes it easy to visualize the joint density of two numerical variables.

g = sns.jointplot(x="total_bill", y='tip', data=tips, kind='kde')
g.fig.suptitle('Density distribution among tips and total_bill [Joint Plot]', y=1.05);

A KDE plot is another way to visualize the joint distribution of two continuous variables.

# kdeplot with x=/y= keywords needs seaborn 0.11+; it returns a matplotlib Axes.
g = sns.kdeplot(x="total_bill", y="tip", data=tips)
g.set_title('Density distribution among tips and total_bill [KDE Plot]');

We can also plot a regression line with its confidence interval, treating one numerical variable as the dependent variable and the other as the independent variable.

g = sns.lmplot(x="total_bill", y="tip", data=tips)
g.fig.suptitle('Relationship b/w tip and total_bill [Scatter Plot + Regression Line]', y=1.05);
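By default the line drawn by lmplot is a straight-line least-squares fit. To see the numbers behind such a line, a minimal sketch with NumPy is shown here; the data points are made up and stand in for total_bill and tip, and np.polyfit is my choice for the fit rather than what Seaborn calls internally.

```python
import numpy as np

# Made-up (total_bill, tip) pairs standing in for the dataset columns.
total_bill = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
tip = np.array([1.5, 2.0, 3.0, 3.5, 4.5])

# Degree-1 least-squares fit: tip is approximated by slope * total_bill + intercept.
slope, intercept = np.polyfit(total_bill, tip, deg=1)
predicted = slope * total_bill + intercept

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

The slope tells us how much the tip is expected to grow per extra unit of the bill, which is the quantity the regression line shows visually.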

If the independent variable is a datetime, we can draw a line plot, which then serves as a time series plot.

g = sns.lineplot(x="date", y="total_bill", data=tips)
g.set_title('Total bill amount over time [Line plot]');

Two Numerical and One Categorical Variable

With two numerical variables and one categorical variable, we can draw all the plots from the two numerical variables section. The categorical variable adds a dimension that can be encoded as colour or marker style to distinguish its values in the plot.

g = sns.relplot(x="total_bill", y="tip", hue='sex', kind='scatter', data=tips)
g.fig.suptitle('Relationship b/w totalbill and tip distinguished by gender [Scatter Plot]', y=1.05);

g = sns.relplot(x="total_bill", y="tip", style='sex', kind='scatter', data=tips)
g.fig.suptitle('Relationship b/w totalbill distinguished by gender as marker [Scatter Plot]', y=1.05);

Alternatively, we can draw a separate subplot of the two numerical variables for each value of the categorical variable.

g = sns.relplot(x="total_bill", y="tip", col='sex', kind='scatter', data=tips)
g.fig.suptitle('Relationship between totalbill and tip by gender [Scatter Plot]', y=1.05);

Three Numerical Variables

If we have three numerical variables, we can draw a scatter plot of two of them and use the third as the size of the points.

g = sns.relplot(x="total_bill", y="tip", size='table_size', kind='scatter', data=tips)
g.fig.suptitle('total bill vs tip distinguished by table size [Scatter Plot]', y=1.05);

Three Numerical Variables and One Categorical variable:

If we have three numerical variables and one categorical variable, the plot from the previous section can be drawn once for each value of the categorical variable.

g = sns.relplot(x="total_bill", y="tip", col='sex', size='table_size', kind='scatter', data=tips)
g.fig.suptitle('Total bill vs tip by gender distinguished by table size [Scatter Plot]', y=1.03);

One Numerical and One Categorical variable:

This is probably the most basic, common and useful combination in data visualization. If we have one numerical variable and one categorical variable, we can draw various plots such as a bar plot and a strip plot.

# ci=None hides the error bars (newer seaborn releases use errorbar=None instead).
g = sns.catplot(x="day", y="tip", kind='bar', order=['Thur', 'Fri', 'Sun', 'Sat'], ci=None, data=tips)
g.fig.suptitle('Tip amount by day of week [Bar Plot]', y=1.05);

# A strip plot draws the raw points, so no ci argument is needed.
g = sns.catplot(x="day", y="tip", kind='strip', order=['Thur', 'Fri', 'Sun', 'Sat'], data=tips)
g.fig.suptitle('Tip amount by day along with tips as scatter [Strip Plot]', y=1.03);

The swarm plot and violin plot shown below let us visualize the distribution of the numerical variable within each value of the categorical variable.

# A swarm plot draws the raw points, so no ci argument is needed.
g = sns.catplot(x="day", y="tip", kind='swarm', order=['Thur', 'Fri', 'Sun', 'Sat'], data=tips)
g.fig.suptitle('Tip amount by day along with tip distribution [Swarm Plot]', y=1.05);

g = sns.catplot(x="day", y="tip", kind='violin', order=['Thur', 'Fri', 'Sun', 'Sat'], data=tips)
g.fig.suptitle('Tips distributions by day [Violin Plot]');

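We can visualize the mean of a continuous variable, together with its confidence interval, within each value of a categorical variable using a point plot. Note that the error bars show the confidence interval of the mean, not the interquartile range (25th percentile to 75th percentile); a box plot is the usual choice for the latter.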

g = sns.catplot(x="day", y="tip", kind='point', order=['Thur', 'Fri', 'Sun', 'Sat'], data=tips, capsize=0.5)
g.fig.suptitle('Mean tip by day with confidence interval [Point Plot]', y=1.05);
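Since the mean and the quartiles are easy to mix up, here is a small sketch that computes both per group with pandas. The tip values are made up; the point is only that the mean (what a point plot marks) and the 25th/75th percentiles (what a box plot marks) are different summaries.

```python
import pandas as pd

# Made-up tips for two days, standing in for the dataset.
demo = pd.DataFrame({
    "day": ["Thur", "Thur", "Thur", "Thur", "Fri", "Fri", "Fri", "Fri"],
    "tip": [1.0, 2.0, 3.0, 6.0, 2.0, 2.0, 3.0, 5.0],
})

by_day = demo.groupby("day")["tip"]
summary = pd.DataFrame({
    "mean": by_day.mean(),         # the point marked by a point plot
    "q25": by_day.quantile(0.25),  # lower edge of the interquartile range
    "q75": by_day.quantile(0.75),  # upper edge of the interquartile range
})
summary["iqr"] = summary["q75"] - summary["q25"]
print(summary)
```

Both made-up days have the same mean here but different interquartile ranges, which is exactly the kind of difference a point plot alone would not show.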

One Numerical and Two Categorical variables:

With one numerical and two categorical variables, we can use all the plots from the previous section and accommodate the second categorical variable either as a column variable or as a subgroup within each subplot, as shown below.

# ci=None hides the error bars (newer seaborn releases use errorbar=None instead).
g = sns.catplot(x="day", y="tip", kind='bar', col='smoker', order=['Thur', 'Fri', 'Sun', 'Sat'], ci=None, data=tips)
g.fig.suptitle('Tip amount by day of week by smoker/non-smoker [Bar Plot]', y=1.05);

g = sns.catplot(x="day", y="tip", kind='bar', hue='smoker', order=['Thur', 'Fri', 'Sun', 'Sat'], ci=None, data=tips)
g.fig.suptitle('Tips by day with smoker/non-smoker subgroup [Grouped Bar Plot]', y=1.05);
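By default each bar in a bar plot is the mean of the numerical variable for that group. A rough sketch of the same computation with pandas follows; the rows are made up and stand in for the day, smoker and tip columns.

```python
import pandas as pd

# Made-up rows standing in for the day, smoker and tip columns.
demo = pd.DataFrame({
    "day":    ["Thur", "Thur", "Thur", "Fri", "Fri", "Fri"],
    "smoker": ["Yes",  "No",   "No",   "Yes", "Yes", "No"],
    "tip":    [2.0,    3.0,    5.0,    1.0,   3.0,   4.0],
})

# One mean per (day, smoker) pair; unstack puts the smoker values in columns.
bar_heights = demo.groupby(["day", "smoker"])["tip"].mean().unstack("smoker")
print(bar_heights)
```

Each row of the result corresponds to one group of bars, and each column to one colour within the group.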

One Numerical and Three Categorical Variables:

With one numerical and three categorical variables, we can draw all the visualizations from the one numerical and one categorical variable section and accommodate the two additional categorical variables by using one as a column or row variable of the figure and the other as a subgroup within each subplot.

# ci=None hides the error bars (newer seaborn releases use errorbar=None instead).
g = sns.catplot(x="day", y="tip", kind='bar', hue='smoker', col='sex', order=['Thur', 'Fri', 'Sun', 'Sat'], ci=None, data=tips)
g.fig.suptitle('Tips by day with smoker/non-smoker subgroup by gender [Grouped Bar Plot]', y=1.05);

More than three continuous variables:

Finally, if we have more than three numerical variables, we can use a heat map or a pair plot. With these plots, we visualize the relationship between every pair of numerical variables in a single figure.

# numeric_only=True (pandas 1.5+) skips the categorical columns.
g = sns.heatmap(tips.corr(numeric_only=True))
g.set_title('Correlation between continuous variables [Heat Map]');

g = sns.pairplot(tips)
g.fig.suptitle('Relationship between continuous variables [Pair Plot]', y=1.03);
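The heat map is only as good as the correlation matrix passed to it, so it can help to build that matrix explicitly. A minimal sketch, using a made-up frame with one categorical column; select_dtypes is my choice here because it works the same way across pandas versions.

```python
import pandas as pd

# Made-up frame with three numerical columns and one categorical column.
demo = pd.DataFrame({
    "total_bill": [10.0, 15.0, 20.0, 25.0, 30.0],
    "tip": [1.5, 2.0, 3.0, 3.5, 4.5],
    "table_size": [2, 2, 3, 4, 4],
    "sex": ["Male", "Female", "Male", "Female", "Male"],
})

# Keep only the numeric columns, then compute pairwise Pearson correlations.
corr = demo.select_dtypes(include="number").corr()
print(corr.round(2))
```

The resulting square frame can be passed to sns.heatmap directly, and annot=True prints the coefficients inside the cells.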

Setting Titles, Labels and legends

Some Seaborn functions return a matplotlib AxesSubplot while others return a FacetGrid (if you have forgotten what matplotlib AxesSubplots are, check my notes on matplotlib for reference).

The FacetGrid is a grid (2D array) of matplotlib AxesSubplots. You can access each subplot using array indices and set the labels and title of each plot.

g = sns.relplot(x="total_bill",y="tip",data=tips,kind='scatter');
g.axes[0,0].set_title('Relationship between continuous variables [Scatter Plot]');
g.axes[0,0].set_xlabel('Total Bill Amount');
g.axes[0,0].set_ylabel('Tip Amount');
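When the grid has more than one subplot, looping over the array is more convenient than indexing each cell. The sketch below assumes a made-up frame in place of the tips dataset and the Agg backend so that it runs without a display; with col='sex' the grid has one row and two columns.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Made-up rows standing in for the tips dataset.
demo = pd.DataFrame({
    "total_bill": [10.0, 15.0, 20.0, 25.0, 30.0, 35.0],
    "tip": [1.5, 2.0, 3.0, 3.5, 4.5, 5.0],
    "sex": ["Male", "Female", "Male", "Female", "Male", "Female"],
})

g = sns.relplot(x="total_bill", y="tip", col="sex", kind="scatter", data=demo)

# g.axes is a 2D array: one row, and one column per value of `sex`.
for ax in g.axes.flat:
    ax.set_xlabel("Total Bill Amount")
    ax.set_ylabel("Tip Amount")
```

FacetGrid also offers set_axis_labels for the common case where all subplots share the same labels.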

If the plot returns AxesSubplot, you can use AxesSubplot methods to set titles and legends.

g = sns.distplot(tips.tip)
g.set_title('Tip Amount Probability Distribution');
g.set_xlabel('Tip Amount')
g.set_ylabel('probability')

For a FacetGrid, you can get the figure object from the FacetGrid object and set a title on the figure.

g = sns.relplot(x="total_bill", y="tip", col='sex', kind='scatter', data=tips)
g.fig.suptitle('Relationship between totalbill and tip by gender [Scatter Plot]', y=1.05);

Conclusion

Hopefully, you found this tutorial helpful for getting started with making beautiful visualizations easily with Seaborn. Although Seaborn is easy to use, it also offers a lot of customisation, which is an advanced topic. Once you are comfortable with the basic plots, you can explore Seaborn further as you use it for your visualizations.

Translated from: https://towardsdatascience.com/data-visualisation-tutorial-using-seaborn-26e1ef9043db
