Python data analysis: using Python & SQL for data analysis in Jupyter Notebook

Published: 2024/7/23 · python · 28 · 豆豆


Published: 2019-01-14 21:14, views: 1143, tags: python, jupyter, notebook, SQL

Just as you can run R in Jupyter, you can also run SQL statements.

For details, see the GitHub project: https://github.com/catherinedevlin/ipython-sql

Install ipython-sql

pip install ipython-sql

Load the extension

%load_ext sql

Connect to a database (same URL format as SQLAlchemy)

* postgresql://will:<password>@<host>/shakes

* mysql+pymysql://scott:<password>@<host>/foo

* oracle://scott:<password>@<host>:1521/sidname

* sqlite://

* sqlite:///foo.db

* mssql+pyodbc://username:<password>@<host>/database?driver=SQL+Server+Native+Client+11.0
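To make the parts of such a URL easier to read, here is a small sketch that splits one apart with Python's standard library. It is only an illustration, not how SQLAlchemy or ipython-sql parse URLs internally, and the user, password, host, port, and database below are made-up placeholders.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL: every credential and host name here is a placeholder.
url = ("mssql+pyodbc://alice:s3cret@db.example.com:1433/sales"
       "?driver=SQL+Server+Native+Client+11.0")

parts = urlsplit(url)
# The scheme is "dialect+driver"; the driver part is optional.
dialect, _, driver = parts.scheme.partition("+")
database = parts.path.lstrip("/")
options = parse_qs(parts.query)

print("dialect :", dialect)
print("driver  :", driver)
print("user    :", parts.username)
print("password:", parts.password)
print("host    :", parts.hostname)
print("port    :", parts.port)
print("database:", database)
print("options :", options)
```

A URL such as sqlite:// has no user, host, or database part at all, which is why it stands for an in-memory database.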

I am using MySQL over a local connection, with user name ffzs, password 666666, and the test database:

%sql mysql+pymysql://ffzs:666666@localhost/test

Basic usage

%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('bmh')

1. Show the tables

%%sql show tables;

2. Select the first 5 rows of the steam_users table

df = %sql select * from steam_users limit 5
df.DataFrame()

3. Count the distinct games and players in the table

%%sql select count(distinct Game) gameCount, count(distinct UserID) userCount
from steam_users
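If you want to try count(distinct ...) without a MySQL server, the following sketch runs the same kind of query against an in-memory SQLite database from plain Python. The rows are made up, and the 'purchase' value for Action is an assumption; only the column names come from the queries in this post.

```python
import sqlite3

# Made-up sample rows shaped like steam_users (UserID, Game, Action, Hours).
# The 'purchase' action value is an assumption for illustration.
rows = [
    (1, "Dota 2", "purchase", 1.0),
    (1, "Dota 2", "play", 120.0),
    (2, "Dota 2", "play", 30.0),
    (2, "Portal", "play", 5.0),
    (3, "Portal", "purchase", 1.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table steam_users (UserID int, Game text, Action text, Hours real)"
)
conn.executemany("insert into steam_users values (?, ?, ?, ?)", rows)

# count(distinct col) counts each value once, however many rows contain it.
game_count, user_count = conn.execute(
    "select count(distinct Game), count(distinct UserID) from steam_users"
).fetchone()
total_rows = conn.execute("select count(*) from steam_users").fetchone()[0]

print(total_rows, game_count, user_count)
```

The five rows collapse to 2 distinct games and 3 distinct users, which is the difference between count(*) and count(distinct ...).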

4. Top 10 games by number of users

%%sql data << select Game, count(1) as count from steam_users
where Action='play' group by Game order by count desc limit 10

data.DataFrame()[::-1].plot.barh("Game","count")
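The group by / order by / limit pattern used here can be checked on a tiny data set. This is a minimal sketch using SQLite in memory rather than MySQL, with made-up rows, an assumed 'purchase' action value, and a limit of 2 instead of 10.

```python
import sqlite3

# Made-up (UserID, Game, Action) rows; 'purchase' is an assumed Action value.
rows = [
    (1, "Dota 2", "play"), (2, "Dota 2", "play"), (3, "Dota 2", "play"),
    (1, "Portal", "play"), (2, "Portal", "play"),
    (3, "Portal", "purchase"),
    (1, "Terraria", "play"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table steam_users (UserID int, Game text, Action text)")
conn.executemany("insert into steam_users values (?, ?, ?)", rows)

# where drops non-play rows, group by makes one row per game,
# order by ... desc limit 2 keeps the two largest groups.
top = conn.execute(
    "select Game, count(1) as players from steam_users "
    "where Action = 'play' group by Game order by players desc limit 2"
).fetchall()

print(top)  # [('Dota 2', 3), ('Portal', 2)]
```

The purchase row for Portal is not counted because the where clause removes it before grouping.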

5. Top 10 games by total play time

%%sql playHour << select Game, sum(Hours) as playHour from steam_users
where Action='play' group by Game order by playHour desc limit 10

playHour.DataFrame()[::-1].plot.barh('Game', 'playHour')

6. Top 10 games by average play time

%%sql avgHour << select Game, avg(Hours) as avgHour from steam_users
where Action='play' group by Game order by avgHour desc limit 10

avgHour.DataFrame()[::-1].plot.barh('Game','avgHour')

7. Player counts for the top 10 games by average play time

%%sql select Game, avg(Hours) as avgHour, count(1) as count from steam_users
where Action='play' group by Game order by avgHour desc limit 10

The same result using join ... on:

%%sql select a.Game, avgHour, count from
(select Game, avg(Hours) as avgHour from steam_users
where Action='play' group by Game order by avgHour desc limit 10) a
left join
(select Game, count(1) as count from steam_users
where Action='play' group by Game) b
on a.Game=b.Game order by avgHour desc

As the result shows, the games with the longest average play time are mostly niche games.
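To see why a niche game can top the average ranking, and that the single query and the join version return the same rows, here is a small sketch on made-up data. It uses SQLite in memory instead of MySQL, and the game names and hours are invented for illustration.

```python
import sqlite3

# Made-up rows: a popular game with short sessions and a niche game
# with a single very heavy player.
rows = [
    (1, "Popular Game", 2.0), (2, "Popular Game", 4.0), (3, "Popular Game", 6.0),
    (4, "Niche Game", 500.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table steam_users (UserID int, Game text, Action text, Hours real)"
)
conn.executemany("insert into steam_users values (?, ?, 'play', ?)", rows)

# Version 1: average and player count in a single grouped query.
single = conn.execute(
    "select Game, avg(Hours) as avgHour, count(1) as players from steam_users "
    "where Action = 'play' group by Game order by avgHour desc"
).fetchall()

# Version 2: compute the two aggregates separately and join them on Game.
joined = conn.execute(
    "select a.Game, a.avgHour, b.players from "
    "(select Game, avg(Hours) as avgHour from steam_users "
    " where Action = 'play' group by Game) a "
    "left join "
    "(select Game, count(1) as players from steam_users "
    " where Action = 'play' group by Game) b "
    "on a.Game = b.Game order by a.avgHour desc"
).fetchall()

print(single)
```

With only one player, the niche game's average equals that player's hours, so a single heavy user is enough to put it first.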

8. Number of games with more than 500 players (using having)

%%sql select count(1) as count from
(select Game, count(1) as count from steam_users
where Action='play' group by Game having count > 500) a
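The point of having is that it filters groups after aggregation, while where filters rows before grouping. A minimal sketch on made-up data, using SQLite in memory and a threshold of 1 instead of 500 so that the tiny sample produces a result:

```python
import sqlite3

# Made-up rows: Dota 2 has 4 players, Portal 2, Terraria 1.
rows = (
    [(user_id, "Dota 2") for user_id in range(1, 5)]
    + [(1, "Portal"), (2, "Portal"), (1, "Terraria")]
)

conn = sqlite3.connect(":memory:")
conn.execute("create table steam_users (UserID int, Game text, Action text)")
conn.executemany("insert into steam_users values (?, ?, 'play')", rows)

# Inner query: one row per game, kept only if its player count exceeds 1.
# Outer query: count how many games survived the having filter.
popular_games = conn.execute(
    "select count(1) from ("
    " select Game, count(1) as players from steam_users"
    " where Action = 'play' group by Game having count(1) > 1"
    ") a"
).fetchone()[0]

print(popular_games)  # 2
```

The sketch repeats count(1) in the having clause instead of using the alias, which is the more portable spelling across databases.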

9. Top 10 users by number of games

%%sql games << select UserID, count(1) count from steam_users
where Action='play' group by UserID order by count desc limit 10

games.DataFrame()[::-1].plot.barh('UserID','count')

10. The 5 users with the most total play time and the 5 with the least (using union)

%%sql (select UserID, sum(Hours) as allHour from steam_users
where Action='play' group by UserID order by allHour desc limit 5)
union
(select UserID, sum(Hours) as allHour from steam_users
where Action='play' group by UserID order by allHour limit 5)
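The same top/bottom union can be tried on made-up data with SQLite in memory. Note one difference that this sketch works around: SQLite does not accept parenthesised selects with their own order by and limit around union, so each half is wrapped in a subquery instead. The rows and the limit of 2 are invented for illustration.

```python
import sqlite3

# Made-up data: one 'play' row per user, so sum(Hours) equals that row.
rows = [(1, 10.0), (2, 200.0), (3, 50.0), (4, 1.0), (5, 75.0)]

conn = sqlite3.connect(":memory:")
conn.execute("create table steam_users (UserID int, Action text, Hours real)")
conn.executemany("insert into steam_users values (?, 'play', ?)", rows)

totals = (
    "select UserID, sum(Hours) as allHour from steam_users "
    "where Action = 'play' group by UserID"
)

# Each ordered and limited half becomes a subquery; union then merges them
# and the final order by sorts the combined result.
query = (
    f"select * from ({totals} order by allHour desc limit 2) "
    "union "
    f"select * from ({totals} order by allHour limit 2) "
    "order by allHour desc"
)
result = conn.execute(query).fetchall()

print(result)  # [(2, 200.0), (5, 75.0), (1, 10.0), (4, 1.0)]
```

Because union removes duplicate rows, a user who appears in both halves (possible when the table has fewer users than the two limits combined) is listed only once; union all would keep both copies.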
