[DS Practice | Coursera] Assignment 2 | Applied Plotting, Charting & Data Representation in Python

Published: 2023/12/8

Contents

  • 1. Problem Analysis
    • 1.1 Problem Description
    • 1.2 Problem Analysis
  • 2. Code and Comments
    • 2.1 Code
    • 2.2 Plot Result


1. Problem Analysis

1.1 Problem Description

Before working on this assignment please read these instructions fully. In the submission area, you will notice that you can click the link to Preview the Grading for each step of the assignment. These are the criteria that will be used for peer grading. Please familiarize yourself with the criteria before beginning the assignment.

An NOAA dataset has been stored in the file data/C2A2_data/BinnedCsvs_d400/fb441e62df2d58994928907a91895ec62c2c42e6cd075c2700843b89.csv. This is the dataset to use for this assignment. Note: The data for this assignment comes from a subset of The National Centers for Environmental Information (NCEI) Daily Global Historical Climatology Network (GHCN-Daily). The GHCN-Daily is comprised of daily climate records from thousands of land surface stations across the globe.

Each row in the assignment datafile corresponds to a single observation.

The following variables are provided to you:

  • id : station identification code
  • date : date in YYYY-MM-DD format (e.g. 2012-01-24 = January 24, 2012)
  • element : indicator of element type
    • TMAX : Maximum temperature (tenths of degrees C)
    • TMIN : Minimum temperature (tenths of degrees C)
  • value : data value for element (tenths of degrees C)

For this assignment, you must:

  • Read the documentation and familiarize yourself with the dataset, then write some Python code which returns a line graph of the record high and record low temperatures by day of the year over the period 2005-2014. The area between the record high and record low temperatures for each day should be shaded.
  • Overlay a scatter of the 2015 data for any points (highs and lows) for which the ten-year (2005-2014) record high or record low was broken in 2015.
  • Watch out for leap days (i.e. February 29th), it is reasonable to remove these points from the dataset for the purpose of this visualization.
  • Make the visual nice! Leverage principles from the first module in this course when developing your solution. Consider issues such as legends, labels, and chart junk.
  • The data you have been given is near Ann Arbor, Michigan, United States, and the stations the data comes from are shown on the map below.


1.2 Problem Analysis

The assignment breaks down into four parts:

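  • First, find the record high and record low temperature for each calendar day over 2005-2014. Convert the date column with pd.to_datetime, split it into year, month, and day, then use groupby with a max/min aggregation. With the per-day values in hand, draw them as line plots against the date and shade the area between the record high and record low.
  • Find the days in 2015 whose high exceeded the 2005-2014 record high or whose low fell below the 2005-2014 record low, and show them on the chart as a scatter plot. The positions can be obtained with np.where() and used as the index for the data passed to plt.scatter(); called with only a condition, np.where() returns the indices at which the condition is true.
  • Remove the February 29 data.
  • Make the visualization clean and reduce chart junk, for example by cutting the ink spent on non-essential data.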
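The first three steps can be tried out on a tiny table before touching the real file. The following is only a minimal sketch: the toy rows are made up, the column names (Date, Element, Data_Value) are assumed to match those used in the code of section 2, and only record highs are handled to keep it short.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data shaped like the assignment file (values are made up).
df = pd.DataFrame({
    "Date": ["2005-01-01", "2010-01-01", "2008-02-29", "2005-01-02",
             "2012-01-02", "2015-01-01", "2015-01-02"],
    "Element": ["TMAX"] * 7,
    "Data_Value": [100, 150, 999, 50, 80, 200, 60],  # tenths of degrees C
})

dates = pd.to_datetime(df["Date"])
df["value"] = df["Data_Value"] / 10  # convert to degrees C
df["year"] = dates.dt.year
df["month"] = dates.dt.month
df["day"] = dates.dt.day

# Drop leap days before any grouping.
df = df[~((df["month"] == 2) & (df["day"] == 29))]

# Record high per calendar day over 2005-2014, and the 2015 high per day.
hist = df[df["year"].between(2005, 2014)]
record_high = hist.groupby(["month", "day"])["value"].max()
high_2015 = df[df["year"] == 2015].groupby(["month", "day"])["value"].max()

# Align 2015 to the historical index, then locate the days where the record fell.
high_2015 = high_2015.reindex(record_high.index)
broken = np.where(high_2015.to_numpy() > record_high.to_numpy())[0]
print(broken)  # integer positions along the (month, day) axis
```

Here np.where receives only a condition, so it returns a tuple of index arrays; the trailing [0] picks the positions along the first axis, which can then be used both as x coordinates and to select the matching 2015 values.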

2. Code and Comments

2.1 Code

    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np

    %matplotlib notebook

    binsize = 400
    hashid = 'fb441e62df2d58994928907a91895ec62c2c42e6cd075c2700843b89'

    # Read the data
    df = pd.read_csv('data/C2A2_data/BinnedCsvs_d{}/{}.csv'.format(binsize, hashid))
    # df = pd.read_csv('assignment2_data.csv')

    # Convert temperatures from tenths of degrees C to degrees C
    df['value'] = df['Data_Value'] / 10

    # Split the date into year, month, and day
    dates = pd.to_datetime(df['Date'])
    df['year'] = dates.dt.year
    df['month'] = dates.dt.month
    df['day'] = dates.dt.day

    # Remove the February 29 data
    df = df[~((df['month'] == 2) & (df['day'] == 29))]

    # Take the 2005-2014 data as df_05_14
    df_05_14 = df[(df['year'] >= 2005) & (df['year'] <= 2014)]
    # Take the 2015 data as df_15
    df_15 = df[df['year'] == 2015]

    # Record highs and lows per calendar day for 2005-2014
    df_max_05_14 = df_05_14[df_05_14['Element'] == 'TMAX'].groupby(['month', 'day']).agg({'value': 'max'})
    df_min_05_14 = df_05_14[df_05_14['Element'] == 'TMIN'].groupby(['month', 'day']).agg({'value': 'min'})

    # Highs and lows per calendar day for 2015
    df_max_15 = df_15[df_15['Element'] == 'TMAX'].groupby(['month', 'day']).agg({'value': 'max'})
    df_min_15 = df_15[df_15['Element'] == 'TMIN'].groupby(['month', 'day']).agg({'value': 'min'})

    # Find the days on which a record was broken
    # (the comparison requires both frames to share the same (month, day) index)
    broken_max = np.where(df_max_15 > df_max_05_14)[0]
    broken_min = np.where(df_min_15 < df_min_05_14)[0]
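The listing stops once the record-breaking days have been located; the plotting calls themselves are not shown. As a rough illustration of how the remaining step might look, here is a minimal sketch that draws the two record lines, shades the band between them, and overlays the broken records. It is not the original plotting code: the temperature arrays are synthetic stand-ins, and the variable names, colors, and labels are assumptions chosen for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for the per-day arrays (made-up values, 365 days, no leap day).
days = np.arange(365)
rng = np.random.default_rng(0)
seasonal = 15 + 15 * np.sin((days - 100) * 2 * np.pi / 365)
record_high = seasonal + 8
record_low = seasonal - 12
high_2015 = record_high + rng.normal(-3, 2, size=365)
low_2015 = record_low + rng.normal(3, 2, size=365)

# Positions of the days on which 2015 broke a 2005-2014 record.
broken_max = np.where(high_2015 > record_high)[0]
broken_min = np.where(low_2015 < record_low)[0]

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(days, record_high, color="tab:red", linewidth=1, label="Record high 2005-2014")
ax.plot(days, record_low, color="tab:blue", linewidth=1, label="Record low 2005-2014")
ax.fill_between(days, record_low, record_high, color="gray", alpha=0.2)
ax.scatter(broken_max, high_2015[broken_max], s=12, color="darkred",
           label="2015 broke record high")
ax.scatter(broken_min, low_2015[broken_min], s=12, color="navy",
           label="2015 broke record low")

# Labels and legend, with the top and right spines removed to cut chart junk.
ax.set_xlabel("Day of year")
ax.set_ylabel("Temperature (°C)")
ax.set_title("Record temperatures 2005-2014 and records broken in 2015")
ax.legend(frameon=False)
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
```

With the real data, the synthetic arrays would be replaced by the value columns of the grouped frames, and fig.savefig() or the notebook display would render the result.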

2.2 Plot Result
