Data Analysis Report: A Workflow Walkthrough

Published: 2023/12/14

Pest-control market potential analysis

import glob
import os
import re
import datetime as dt

import numpy as np
import pandas as pd
import seaborn as sns
import jieba
import jieba.analyse
import imageio
from matplotlib import pyplot as plt
from sklearn.linear_model import LinearRegression
from wordcloud import WordCloud

# Chinese font settings for matplotlib on Windows
plt.rcParams['font.sans-serif'] = 'simhei'
plt.rcParams['axes.unicode_minus'] = False

os.chdir("C:/data")
os.chdir("./驅蟲劑市場")

Loading the data

  • Read the transaction data of each sub-category and merge it

filenames = glob.glob('*市場近三年交易額.xlsx')
filenames

['滅鼠殺蟲劑市場近三年交易額.xlsx',
 '電蚊香套裝市場近三年交易額.xlsx',
 '盤香滅蟑香蚊香盤市場近三年交易額.xlsx',
 '蚊香加熱器市場近三年交易額.xlsx',
 '蚊香液市場近三年交易額.xlsx',
 '蚊香片市場近三年交易額.xlsx',
 '防霉防蛀片市場近三年交易額.xlsx']

# The lookahead (?=市場) matches everything that comes before "市場"
re.search(r'.*(?=市場)', '滅鼠殺蟲劑市場近三年交易額.xlsx').group()

'滅鼠殺蟲劑'

def load_xlsx(filename):
    # Extract the sub-category name from the file name
    colname = re.search(r'.*(?=市場)', filename).group()
    # Read the file
    df = pd.read_excel(filename)
    # If the time column holds integers (Excel serial dates), convert it to datetime
    if df['時間'].dtypes == 'int64':
        df['時間'] = pd.to_datetime(df['時間'], unit='D', origin=pd.Timestamp('1899-12-30'))
        # Alternative: df['時間'] = pd.TimedeltaIndex(df['時間'], unit='D') + dt.datetime(1899, 12, 30)
    # Rename the second column (transaction amount) to the sub-category name
    df.rename(columns={df.columns[1]: colname}, inplace=True)
    # Use the time column as the row index
    df = df.set_index('時間')
    return df

  • Read all files into a list

dfs = [load_xlsx(i) for i in filenames]

  • Merge all files into one data frame, aligned on the row index (time)

# reset_index turns the time index back into an ordinary column
df = pd.concat(dfs, axis=1).reset_index()
df.dtypes

時間            datetime64[ns]
滅鼠殺蟲劑             float64
電蚊香套裝             float64
盤香滅蟑香蚊香盤          float64
蚊香加熱器             float64
蚊香液               float64
蚊香片               float64
防霉防蛀片             float64
dtype: object

df.head()

   時間          滅鼠殺蟲劑      電蚊香套裝    盤香滅蟑香蚊香盤  蚊香加熱器     蚊香液          蚊香片          防霉防蛀片
0  2018-10-01  1.136548e+08  106531.29  4171283.35   315639.48   7814546.15   1032414.29   8541153.59
1  2018-09-01  1.440261e+08  105666.63  6784500.17   457366.41   10654973.47  1566651.88   8825870.43
2  2018-08-01  1.540426e+08  201467.03  10709683.41  746513.13   17835577.80  2617149.00   6320153.44
3  2018-07-01  1.480032e+08  438635.29  16589184.89  1871757.00  38877917.83  6209040.06   6302595.06
4  2018-06-01  1.359438e+08  953749.78  23526385.73  3641025.92  76499091.86  12484919.63  7047206.98
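The two tricks inside load_xlsx can be tried in isolation. The block below is a minimal standard-library sketch, not the author's code: the helper names `excel_serial_to_date` and `subcategory_name` are made up for illustration, and the serial number 43374 is an assumed example value rather than one read from the spreadsheets.

```python
import re
from datetime import date, timedelta

# Day zero used by load_xlsx above when converting integer dates.
EXCEL_EPOCH = date(1899, 12, 30)

def excel_serial_to_date(serial: int) -> date:
    """Convert an Excel serial day number to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=serial)

def subcategory_name(filename: str) -> str:
    """Return the text before '市場', using the same lookahead as load_xlsx."""
    return re.search(r'.*(?=市場)', filename).group()

converted = excel_serial_to_date(43374)  # assumed example serial
name = subcategory_name('蚊香液市場近三年交易額.xlsx')
print(converted, name)
```

The lookahead keeps "市場" out of the match, so no slicing is needed afterwards.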

Cleaning the data

  • Check the share of missing values in each column

df.isna().mean()

時間            0.0
滅鼠殺蟲劑         0.0
電蚊香套裝         0.0
盤香滅蟑香蚊香盤      0.0
蚊香加熱器         0.0
蚊香液           0.0
蚊香片           0.0
防霉防蛀片         0.0
dtype: float64

  • Extract the month for use as a filter later

month = df['時間'].dt.month  # month of each row
month

0     10
1      9
2      8
3      7
4      6
5      5
6      4
7      3
8      2
9      1
10    12
11    11
12    10
13     9
14     8
15     7
16     6
17     5
18     4
19     3
20     2
21     1
22    12
23    11
24    10
25     9
26     8
27     7
28     6
29     5
30     4
31     3
32     2
33     1
34    12
35    11
Name: 時間, dtype: int64
  • Predict the November and December 2018 sales amounts in a loop. The data ends in October 2018, so each missing month is extrapolated from the same month of earlier years with a linear regression on the year.

for i in [11, 12]:
    # Rows for the given month; the month is recomputed from df so the mask
    # always matches the current index
    dm = df[df['時間'].dt.month == i]
    # Training x: the year, reshaped into a 2-D column
    xtrain = np.array(dm['時間'].dt.year).reshape(-1, 1)
    # ytest is the new row; its first entry is the date
    # (pd.datetime no longer exists in current pandas)
    ytest = [pd.Timestamp(2018, i, 1)]
    for j in range(1, len(dm.columns)):
        # Training y: column j
        ytrain = np.array(dm.iloc[:, j]).reshape(-1, 1)
        # Fit the regression
        lm = LinearRegression().fit(xtrain, ytrain)
        # Predict the transaction amount for 2018
        yhat = lm.predict(np.array([2018]).reshape(-1, 1))
        # Append the prediction to the new row
        ytest.append(round(yhat[0][0], 2))
    # Pair the predictions with the column names
    newrow = pd.DataFrame([dict(zip(df.columns, ytest))])
    # Put the predicted row in front of the data
    # (DataFrame.append was removed in pandas 2.0)
    df = pd.concat([newrow, df])

df

    時間          滅鼠殺蟲劑      電蚊香套裝     盤香滅蟑香蚊香盤  蚊香加熱器     蚊香液         蚊香片          防霉防蛀片
0   2018-12-01  5.256763e+07  50204.53    928554.26    86849.17    3.081492e+06  426812.59    3958717.58
1   2018-11-01  7.175250e+07  38692.61    1801318.82   193874.39   5.543204e+06  776627.04    6678677.55
2   2018-10-01  1.136548e+08  106531.29   4171283.35   315639.48   7.814546e+06  1032414.29   8541153.59
3   2018-09-01  1.440261e+08  105666.63   6784500.17   457366.41   1.065497e+07  1566651.88   8825870.43
4   2018-08-01  1.540426e+08  201467.03   10709683.41  746513.13   1.783558e+07  2617149.00   6320153.44
5   2018-07-01  1.480032e+08  438635.29   16589184.89  1871757.00  3.887792e+07  6209040.06   6302595.06
6   2018-06-01  1.359438e+08  953749.78   23526385.73  3641025.92  7.649909e+07  12484919.63  7047206.98
7   2018-05-01  1.241642e+08  1238967.37  28118581.25  5032466.78  1.050396e+08  15309721.94  7942340.44
8   2018-04-01  7.509661e+07  841051.93   16420341.87  3130513.43  6.254165e+07  7954875.07   7031364.60
9   2018-03-01  5.918182e+07  475177.48   7900094.91   1198332.81  2.632447e+07  2950648.32   6051561.02
10  2018-02-01  2.292138e+07  33232.95    545917.66    75714.46    2.235774e+06  218915.63    1393948.47
11  2018-01-01  3.653873e+07  54305.20    592663.20    86670.45    1.759451e+06  298146.11    2607776.07
12  2017-12-01  4.292283e+07  71600.17    796930.46    69145.59    2.213103e+06  314120.38    3259747.23
13  2017-11-01  5.838217e+07  94993.76    1581530.20   168141.79   4.257594e+06  617094.94    5447184.43
14  2017-10-01  8.226882e+07  145925.31   2824785.80   166522.62   4.290843e+06  766588.77    6152868.25
15  2017-09-01  1.010081e+08  242194.37   5581352.42   353042.45   7.833349e+06  1574779.65   5792065.80
16  2017-08-01  1.049504e+08  332922.02   7229409.84   544076.63   1.376039e+07  2323304.14   5081714.64
17  2017-07-01  1.116729e+08  913425.95   13718046.88  1357778.93  4.257757e+07  6627299.71   6691694.17
18  2017-06-01  1.051463e+08  2045163.59  19635925.59  2639777.66  8.283230e+07  12422420.21  7155138.87
19  2017-05-01  9.185035e+07  3606141.82  20275515.85  3185961.75  1.014605e+08  15961946.71  8145781.12
20  2017-04-01  5.363586e+07  1285599.49  9197868.29   1554864.42  4.880687e+07  6214963.68   6682161.49
21  2017-03-01  4.078967e+07  390486.57   3397837.60   317206.48   1.488979e+07  1319399.22   3904656.82
22  2017-02-01  3.467502e+07  209643.87   1519446.34   148158.07   5.929509e+06  687697.46    2584035.90
23  2017-01-01  2.047156e+07  39434.76    596744.04    48164.63    1.213749e+06  238973.09    1781773.46
24  2016-12-01  3.546668e+07  84350.57    1234900.05   52118.96    1.558634e+06  293737.20    3504367.98
25  2016-11-01  4.780625e+07  106291.23   1473418.20   82835.82    2.758827e+06  512990.23    4975519.21
26  2016-10-01  6.339722e+07  179015.23   2543813.78   130484.07   3.641803e+06  690912.02    4600717.78
27  2016-09-01  6.864724e+07  210456.69   3092898.02   168724.83   4.632818e+06  930513.91    4642681.07
28  2016-08-01  7.610885e+07  316467.14   4389862.79   272553.95   8.956868e+06  1581021.50   4151326.68
29  2016-07-01  7.832954e+07  932728.10   7384968.66   761159.35   2.260036e+07  4088320.77   5412185.06
30  2016-06-01  7.693264e+07  2184985.33  10859461.67  1728788.53  4.640197e+07  8004562.69   5694825.13
31  2016-05-01  5.812696e+07  2059879.80  9912801.93   1618361.54  4.777690e+07  7474421.97   5469360.60
32  2016-04-01  3.762602e+07  1034992.53  4687913.18   758206.81   2.432917e+07  3435257.35   5253619.06
33  2016-03-01  2.952610e+07  352013.31   1204574.20   246106.75   6.656382e+06  746709.07    3481194.46
34  2016-02-01  1.500135e+07  96979.48    449199.41    36193.85    6.939075e+05  109108.05    1274810.96
35  2016-01-01  2.107822e+07  108412.71   619042.01    49670.25    4.828890e+05  113284.71    1562393.95
36  2015-12-01  2.472756e+07  110068.83   818479.56    34076.91    5.832845e+05  134890.48    2333602.08
37  2015-11-01  3.303873e+07  185094.22   1197791.27   86889.91    1.579796e+06  325744.43    3364112.14
  • Drop the original index

df.reset_index(drop=True, inplace=True)

  • Drop the 2015 rows

df = df[df['時間'].dt.year != 2015]

  • Check the predicted rows

df.head()

   時間          滅鼠殺蟲劑      電蚊香套裝    盤香滅蟑香蚊香盤  蚊香加熱器    蚊香液          蚊香片         防霉防蛀片
0  2018-12-01  5.256763e+07  50204.53   928554.26    86849.17   3081491.99   426812.59   3958717.58
1  2018-11-01  7.175250e+07  38692.61   1801318.82   193874.39  5543203.83   776627.04   6678677.55
2  2018-10-01  1.136548e+08  106531.29  4171283.35   315639.48  7814546.15   1032414.29  8541153.59
3  2018-09-01  1.440261e+08  105666.63  6784500.17   457366.41  10654973.47  1566651.88  8825870.43
4  2018-08-01  1.540426e+08  201467.03  10709683.41  746513.13  17835577.80  2617149.00  6320153.44
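The November forecast can be checked by hand. The sketch below is a plain-Python stand-in for the `LinearRegression` fit used in the notebook (ordinary least squares with one predictor, written out in closed form); `fit_line` is a hypothetical helper name. The three inputs are the November values of 滅鼠殺蟲劑 for 2015 to 2017 as printed in the full table.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# November values of 滅鼠殺蟲劑 for 2015-2017, copied from the printed table.
years = [2015, 2016, 2017]
november = [3.303873e+07, 4.780625e+07, 5.838217e+07]

intercept, slope = fit_line(years, november)
forecast_2018_11 = intercept + slope * 2018
print(f'{forecast_2018_11:.6e}')
```

Up to display rounding of the inputs, the result agrees with the 7.175250e+07 shown for 2018-11-01. With only three training points per month, the forecast is a straight-line extrapolation and should be read as a rough estimate.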

Pest-control market potential analysis

  • Analyse the overall trend of the whole market
  • Analyse the share of each sub-category market and how it changes
  • Analyse market concentration, i.e. whether a monopoly exists

Market trend

  • Add a column with the row-wise total across all markets
  • Add a column with the year

# numeric_only skips the datetime column when summing across each row
df['colsums'] = df.sum(axis=1, numeric_only=True)
df.insert(1, 'year', df['時間'].dt.year)
df.head()

   時間          year  滅鼠殺蟲劑      電蚊香套裝    盤香滅蟑香蚊香盤  蚊香加熱器    蚊香液          蚊香片         防霉防蛀片    colsums
0  2018-12-01  2018  5.256763e+07  50204.53   928554.26    86849.17   3081491.99   426812.59   3958717.58  6.110026e+07
1  2018-11-01  2018  7.175250e+07  38692.61   1801318.82   193874.39  5543203.83   776627.04   6678677.55  8.678489e+07
2  2018-10-01  2018  1.136548e+08  106531.29  4171283.35   315639.48  7814546.15   1032414.29  8541153.59  1.356363e+08
3  2018-09-01  2018  1.440261e+08  105666.63  6784500.17   457366.41  10654973.47  1566651.88  8825870.43  1.724211e+08
4  2018-08-01  2018  1.540426e+08  201467.03  10709683.41  746513.13  17835577.80  2617149.00  6320153.44  1.924731e+08

  • Group by year and sum the transaction amount of each sub-category market

byyear = df.groupby('year').sum(numeric_only=True).reset_index()
byyear

   year  滅鼠殺蟲劑      電蚊香套裝   盤香滅蟑香蚊香盤   蚊香加熱器     蚊香液         蚊香片        防霉防蛀片     colsums
0  2016  6.080471e+08  7666572.12  4.785285e+07  5905204.71   1.704905e+08  27980839.47  50023001.94  9.179661e+08
1  2017  8.477740e+08  9377531.68  8.635539e+07  10552841.02  3.300656e+08  49068587.96  62678822.18  1.395873e+09
2  2018  1.137893e+09  4537682.09  1.180885e+08  16836723.43  3.582077e+08  51845921.56  72701365.23  1.760111e+09
  • Set up Chinese fonts before plotting

plt.rcParams['font.sans-serif'] = 'simhei'
plt.rcParams['axes.unicode_minus'] = False

  • Line plot of the overall market trend

# relplot draws a relational plot; height sets the figure height
sns.relplot(x='year', y='colsums', kind='line', marker='o', data=byyear, height=4)
plt.title('近三年驅蟲市場趨勢')       # title
plt.xticks(byyear.year, rotation=45)  # x-axis tick labels
plt.xlabel('年份')                    # x-axis title
plt.ylabel('總交易額')                # y-axis title
plt.show()

  • Sales trend of each sub-category market

# Figure size
f, ax = plt.subplots(figsize=(10, 6))
# There are more than 6 categories but only 6 dash styles,
# so dashes=False draws every series with a solid line
sns.lineplot(data=byyear.set_index('year').iloc[:, :-1], dashes=False, marker='^')
plt.title('近三年各類目市場銷量趨勢分析')
plt.xticks(byyear.year, rotation=45)
# Add value labels at the data points
for a, b in zip(byyear.year, byyear['滅鼠殺蟲劑']):
    plt.text(a, b, '%.3e' % b, ha='center', va='bottom', size=12)
plt.xlabel('年份')
plt.ylabel('總交易額')
plt.show()

  • A direct look at the three-year growth of 滅鼠殺蟲劑 (rodenticides and insecticides)

g = sns.FacetGrid(byyear, height=5)
g.map(sns.barplot, 'year', '滅鼠殺蟲劑', color='wheat')
g.map(sns.pointplot, 'year', '滅鼠殺蟲劑')
for a, b in zip(range(len(byyear)), byyear['滅鼠殺蟲劑']):
    plt.text(a, b, '%.3e' % b, ha='center', va='bottom', size=12)
plt.xlabel('年份')
plt.title('近三年滅鼠滅蟲市場增量趨勢分析')
plt.xticks(rotation=45)
plt.show()

D:\ProgramData\Anaconda3\lib\site-packages\seaborn\axisgrid.py:715: UserWarning: Using the barplot function without specifying `order` is likely to produce an incorrect plot.
  warnings.warn(warning)
D:\ProgramData\Anaconda3\lib\site-packages\seaborn\axisgrid.py:715: UserWarning: Using the pointplot function without specifying `order` is likely to produce an incorrect plot.
  warnings.warn(warning)

  • Share of each sub-category market in the yearly total

# Share of each sub-market per year
byyear_per = byyear.iloc[:, 1:-1].div(byyear.colsums, axis=0)
byyear_per.index = byyear.year
byyear_per

year  滅鼠殺蟲劑  電蚊香套裝  盤香滅蟑香蚊香盤  蚊香加熱器  蚊香液    蚊香片    防霉防蛀片
2016  0.662385  0.008352  0.052129    0.006433  0.185726  0.030481  0.054493
2017  0.607343  0.006718  0.061865    0.007560  0.236458  0.035153  0.044903
2018  0.646489  0.002578  0.067092    0.009566  0.203514  0.029456  0.041305

# stacked=True draws a stacked bar chart
byyear_per.plot(kind='bar', stacked=True, figsize=(10, 8), colormap='tab10')
for a, b in zip(range(len(byyear_per)), byyear_per['滅鼠殺蟲劑']):
    plt.text(a, b / 2, f'{b*100:.2f}%', ha='center', va='bottom', size=16, color='white')
plt.xlabel('年份')
plt.ylabel('總交易額占比')
plt.title('近三年各類目市場銷量占比')
plt.show()

  • Year-over-year growth of each sub-category market

# Year-over-year growth: the yearly difference divided by the previous year's value
byyear0 = byyear.iloc[:, 1:-1]
byyear_diff = byyear0.diff().iloc[1:, :].reset_index(drop=True) / byyear0.iloc[:2, :]
byyear_diff.index = ['16-17', '17-18']
byyear_diff

       滅鼠殺蟲劑  電蚊香套裝   盤香滅蟑香蚊香盤  蚊香加熱器  蚊香液    蚊香片    防霉防蛀片
16-17  0.394257  0.223171   0.804603    0.787041  0.935976  0.753650  0.253000
17-18  0.342213  -0.516111  0.367471    0.595468  0.085262  0.056601  0.159903

# Plot the growth rates
f, ax = plt.subplots(figsize=(10, 8))
sns.lineplot(data=byyear_diff, dashes=False)
plt.title('近三年各類目市場銷量年增幅')
plt.xlabel('年份')
plt.ylabel('總交易額年增幅')
plt.show()
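The growth figures follow from a one-line formula, (current - previous) / previous. The sketch below recomputes the 滅鼠殺蟲劑 column from the yearly totals printed in byyear; it is a plain-Python illustration with a made-up helper name `yoy_growth`, not the notebook's pandas expression.

```python
def yoy_growth(values):
    """Year-over-year growth: (current - previous) / previous."""
    return [(cur - prev) / prev for prev, cur in zip(values, values[1:])]

# Yearly totals of 滅鼠殺蟲劑 for 2016, 2017, 2018, copied from byyear.
totals = [6.080471e+08, 8.477740e+08, 1.137893e+09]
growth = yoy_growth(totals)
print([round(g, 4) for g in growth])
```

Up to rounding of the printed totals, the two values agree with the 16-17 and 17-18 rows of byyear_diff.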

Market concentration

  • Read the brand data and describe it

df1 = pd.read_excel('top100品牌數據.xlsx')
df1.isna().mean()

品牌         0.0
行業排名       0.0
交易指數       0.0
交易增長幅度     0.0
支付轉化指數     0.0
操作         0.0
dtype: float64

df1.head()

   品牌            行業排名  交易指數  交易增長幅度  支付轉化指數  操作
0  PREMISE/拜滅士  1      530344  -0.3235   1521     趨勢分析
1  科凌蟲控         2      474937  -0.1910   1581     趨勢分析
2  ARS/安速        3      402372  -0.2682   1448     趨勢分析
3  思樂智          4      360780  0.2056    841      趨勢分析
4  希諾           5      346656  -0.1085   1865     趨勢分析

df1.describe(include='all')

        品牌    行業排名      交易指數          交易增長幅度   支付轉化指數    操作
count   100   100.000000  100.000000     100.000000  100.000000   100
unique  100   NaN         NaN            NaN         NaN          1
top     羅貝特  NaN         NaN            NaN         NaN          趨勢分析
freq    1     NaN         NaN            NaN         NaN          100
mean    NaN   50.500000   147327.560000  0.395790    1247.870000  NaN
std     NaN   29.011492   88177.182391   2.038278    350.304014   NaN
min     NaN   1.000000    65194.000000   -0.781900   577.000000   NaN
25%     NaN   25.750000   86129.000000   -0.266325   967.750000   NaN
50%     NaN   50.500000   118682.500000  -0.061800   1245.000000  NaN
75%     NaN   75.250000   163373.250000  0.334350    1491.500000  NaN
max     NaN   100.000000  530344.000000  17.675100   2000.000000  NaN

  • Add a column with each brand's share of the total transaction index, as a proxy for market share

df1['交易指數占比'] = df1['交易指數'] / df1['交易指數'].sum()

  • Plot the shares

df1.plot(x='品牌', y='交易指數占比', kind='bar', figsize=(15, 4))
plt.show()

  • Compute and print the HHI (Herfindahl-Hirschman Index)

  • Gini: a measure of impurity; the larger the Gini coefficient, the less pure

  • HHI: a measure of market purity; the larger it is, the purer the market, and a very pure market indicates monopoly

HHI = sum(df1['交易指數占比']**2)
print(f'驅蟲市場HHI指數:{HHI:.6f}(或{HHI*10000:.2f}),等效公司數:{1/HHI:.2f}')

驅蟲市場HHI指數:0.013546(或135.46),等效公司數:73.82

With an HHI of 0.013546, equivalent to about 74 equally sized firms among the top 100 brands, the market is far from concentrated.
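To get a feel for the scale of the HHI, the sketch below computes it for two toy markets. The transaction indices are hypothetical, chosen only to contrast an evenly split market with a dominated one, and `hhi` is an illustrative helper rather than part of the notebook.

```python
def hhi(values):
    """Herfindahl-Hirschman Index: the sum of squared market shares."""
    total = sum(values)
    return sum((v / total) ** 2 for v in values)

# Hypothetical transaction indices for four brands (not from the dataset).
equal_market = [100, 100, 100, 100]
dominated_market = [970, 10, 10, 10]

for label, market in [('equal', equal_market), ('dominated', dominated_market)]:
    h = hhi(market)
    # 1 / HHI is the number of equally sized firms that would give the same index
    print(f'{label}: HHI={h:.4f}, equivalent firms={1 / h:.2f}')
```

The index ranges from 1/n for n equal firms up to 1 for a single monopolist, which is why its reciprocal reads as an "equivalent number of firms".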

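Opportunities in the rodenticide and insecticide market (滅鼠殺蟲劑)

os.chdir('..')
os.chdir('./滅鼠殺蟲劑細分市場')

Loading and cleaning the data

  • Read the files and merge them

filenames1 = glob.glob('*.xlsx')
dfs1 = [pd.read_excel(i) for i in filenames1]
df2 = pd.concat(dfs1, sort=False)

  • The five tables have different features, so the merged table has more features and many missing values

df2.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 6556 entries, 0 to 1742
Columns: 229 entries, 類別 to 產品名
dtypes: float64(129), int64(5), object(95)
memory usage: 11.5+ MB

  • A feature with more than 98% missing values carries no information across the full data set, so drop it

ind1 = df2.isna().mean() > 0.98
sum(ind1)

191

df20 = df2.loc[:, ~ind1]

  • A feature whose values are all identical carries no information either, so drop it

# A column with exactly one unique value is constant
ind2 = np.array([len(df20[i].unique()) == 1 for i in df20.columns])
df21 = df20.loc[:, ~ind2]

  • Drop columns that logically will never be used, such as links

  • Columns after 藥品登記號 (pesticide registration number) are mostly missing and add little to a market analysis, so none of them are kept

ind3 = df21.columns.get_loc('藥品登記號')
df22 = df21.iloc[:, :ind3]

  • Other columns that are not needed

useless = ['時間', '鏈接', '主圖鏈接', '主圖視頻鏈接', '頁碼', '排名', '寶貝標題', '運費', '下架時間', '旺旺']
df23 = df22.drop(columns=useless)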
  • Inspect the cleaned data

df23.isna().mean()

類別        0.000000
寶貝ID      0.000000
銷量(人數)    0.000000
售價        0.000000
預估銷售額     0.005491
評價人數      0.022880
收藏人數      0.000000
地域        0.406955
店鋪類型      0.000000
品牌        0.095333
型號        0.423276
凈含量       0.421599
適用對象      0.279896
物理形態      0.286303
dtype: float64

df23.dtypes

類別        object
寶貝ID      int64
銷量(人數)    int64
售價        float64
預估銷售額     float64
評價人數      float64
收藏人數      int64
地域        object
店鋪類型      object
品牌        object
型號        object
凈含量       object
適用對象      object
物理形態      object
dtype: object

  • 寶貝ID (item ID) is an identifier, so an integer type is not appropriate; change it to object

df23 = df23.astype({'寶貝ID': 'object'})
df23.reset_index(drop=True, inplace=True)
df23.describe()

       銷量(人數)       售價          預估銷售額       評價人數         收藏人數
count  6556.000000    6556.000000  6.520000e+03  6406.000000    6556.000000
mean   324.518609     39.559527    1.032676e+04  1942.746800    1345.778981
std    3207.470186    49.678113    7.851193e+04  13493.925308   6947.250438
min    0.000000       0.010000     1.000000e-02  0.000000       0.000000
25%    14.000000      13.900000    2.307375e+02  36.000000      33.000000
50%    26.000000      25.800000    6.741000e+02  160.000000     133.000000
75%    70.000000      48.000000    2.288000e+03  602.750000     496.500000
max    143037.000000  618.000000   2.672898e+06  502295.000000  234645.000000

df23.head()

   類別  寶貝ID         銷量(人數)  售價   預估銷售額  評價人數  收藏人數  地域       店鋪類型  品牌                  型號          凈含量   適用對象  物理形態
0  殺蟲  578459866289  99       29.9  2960.1  26.0    202   NaN       天貓    拜耳                  特姆得         NaN    蟑螂    液體
1  殺蟲  548196868239  99       0.6   59.4    1330.0  242   浙江 金華   淘寶    佰凌                  180325      NaN    NaN   NaN
2  殺蟲  580839295562  99       98.0  9702.0  44.0    27    廣東 深圳   淘寶    NaN                 NaN         NaN    NaN   NaN
3  殺蟲  580264662322  99       6.9   683.1   24.0    26    河南 商丘   淘寶    SHURONGCROP/樹榮作物  NaN         30g    NaN   NaN
4  殺蟲  44484517973   99       18.8  1861.2  121.0   133   河北 秦皇島  天貓    Raid/雷達             雷達殺蟲氣霧劑清香  600ml  蟑螂    噴霧
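The two column filters used in the cleaning step (mostly missing, or constant) are easy to express without pandas. The sketch below is an illustration under stated assumptions: the table is a hypothetical dict of lists with None for missing values, and `prune_columns` is a made-up helper name.

```python
def prune_columns(table, max_missing=0.98):
    """Drop columns that are mostly missing or hold a single repeated value."""
    kept = {}
    for name, values in table.items():
        missing_ratio = sum(v is None for v in values) / len(values)
        if missing_ratio > max_missing:
            continue  # almost empty: carries no information
        if len(set(values)) == 1:
            continue  # constant: cannot tell rows apart
        kept[name] = values
    return kept

# Hypothetical 100-row table; None marks a missing value.
table = {
    '售價': [float(i) for i in range(100)],
    '藥品登記號': [None] * 99 + ['X'],   # 99% missing
    '操作': ['趨勢分析'] * 100,           # constant
}
print(list(prune_columns(table)))
```

As in the notebook, a missing value counts as a value of its own in the constant check, so a column that is half missing and half one repeated value is kept.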

Detailed analysis

  • Find the most popular product category in the rodenticide and insecticide market, split it into price bands, and then analyse the remaining attributes within the chosen band

Distribution by product category

byclass = df23['預估銷售額'].groupby(df23['類別']).sum()
byclass

類別
殺蟲     8207628.10
滅鼠    25686011.99
虱      4512886.01
螨     10886752.88
蟑螂    18037223.68
Name: 預估銷售額, dtype: float64

byclass.plot.barh()

<matplotlib.axes._subplots.AxesSubplot at 0x1536ed24518>

byclass.plot.pie(autopct='%.2f')

<matplotlib.axes._subplots.AxesSubplot at 0x1536f592e48>

  • The markets that deserve the most attention are 滅鼠 (rodent control) and 蟑螂 (cockroaches), the two largest by estimated sales; we pick 滅鼠

Rodent-control category

  • Select the rodent-control rows

# .copy() gives an independent frame, so adding columns later raises no SettingWithCopyWarning
df24 = df23[df23['類別'] == '滅鼠'].copy()

  • Split by price

df24['售價'].describe()

count    1523.000000
mean       49.018910
std        69.762057
min         0.010000
25%        15.800000
50%        27.700000
75%        52.600000
max       498.000000
Name: 售價, dtype: float64

df24['售價'].plot.hist()

<matplotlib.axes._subplots.AxesSubplot at 0x15374a7d668>

bins = [0, 50, 100, 150, 200, 250, 300, 500]
labels = ['0_50', '50_100', '100_150', '150_200', '200_250', '250_300', '300以上']
df24['價格區間'] = pd.cut(df24['售價'], bins, labels=labels, include_lowest=True)
df24['價格區間'].value_counts()

0_50       1138
50_100      242
100_150      62
150_200      35
300以上       28
250_300       9
200_250       9
Name: 價格區間, dtype: int64
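  • For each price band, compute sales, share of sales, number of items, share of items, and competition (the inverse of average sales per item, min-max scaled to 0-1)

def byfun(df, by, sort='單寶貝平均銷售額'):
    # Total estimated sales per group
    byc = df.groupby(by)[['預估銷售額']].sum()
    byc['銷售額占比'] = byc['預估銷售額'] / byc['預估銷售額'].sum()
    # Number of distinct items per group
    byc['寶貝數'] = df.groupby(by)['寶貝ID'].nunique()
    byc['寶貝數占比'] = byc['寶貝數'] / byc['寶貝數'].sum()
    byc['單寶貝平均銷售額'] = byc['預估銷售額'] / byc['寶貝數']
    # Relative competition: 1 minus the min-max scaled average sales per item
    byc['相對競爭度'] = 1 - (byc['單寶貝平均銷售額'] - byc['單寶貝平均銷售額'].min()) / (byc['單寶貝平均銷售額'].max() - byc['單寶貝平均銷售額'].min())
    if sort:
        byc.sort_values(sort, ascending=False, inplace=True)
    return byc

byprice1 = byfun(df24, '價格區間')
byprice1

價格區間  預估銷售額      銷售額占比  寶貝數  寶貝數占比  單寶貝平均銷售額   相對競爭度
200_250  2743758.00   0.106819  7    0.006173  391965.428571  0.000000
100_150  2758086.29   0.107377  46   0.040564  59958.397609   0.887218
300以上   819468.00    0.031903  18   0.015873  45526.000000   0.925786
250_300  237740.00    0.009256  9    0.007937  26415.555556   0.976854
150_200  629813.00    0.024520  28   0.024691  22493.321429   0.987336
50_100   3335060.19   0.129840  172  0.151675  19389.884826   0.995629
0_50     15162086.51  0.590286  854  0.753086  17754.199660   1.000000

  • Define the plotting function

def mcplot(bydf, figsize=(10, 4)):
    # Line: relative competition; bars: share of sales
    ax = bydf.plot(y='相對競爭度', linestyle='-', marker='o', figsize=figsize)
    bydf.plot(y='銷售額占比', kind='bar', alpha=0.8, color='wheat', ax=ax)
    plt.show()

mcplot(byprice1)

  • The result is sorted by average sales per item in descending order, i.e. by competition in ascending order
  • The 0-50 band is the mass market: it holds 59% of sales and has the highest competition (by comparison, 50-100 holds 13% with slightly lower competition)
  • The 200-250 band has the lowest competition, which makes it the first choice for a high-price play and an opportunity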
The 0-50 price segment

  • Select the 0-50 price segment

# .copy() again avoids SettingWithCopyWarning when a column is added below
df25 = df24[df24['價格區間'] == '0_50'].copy()

  • Split the segment into finer price bands

df25['售價'].plot.hist()

<matplotlib.axes._subplots.AxesSubplot at 0x15374802e10>

bins1 = [0, 10, 20, 30, 40, 50]
labels1 = ['0_10', '10_20', '20_30', '30_40', '40_50']
df25['價格子區間'] = pd.cut(df25['售價'], bins1, labels=labels1, include_lowest=True)

byprice2 = byfun(df25, '價格子區間')
byprice2

價格子區間  預估銷售額     銷售額占比  寶貝數  寶貝數占比  單寶貝平均銷售額  相對競爭度
10_20   8102634.14  0.534401  272  0.318501  29789.096103  0.000000
20_30   4969620.92  0.327766  278  0.325527  17876.334245  0.411674
40_50   707568.49   0.046667  40   0.046838  17689.212250  0.418141
30_40   1240874.19  0.081841  98   0.114754  12661.981531  0.591869
0_10    141388.77   0.009325  166  0.194379  851.739578    1.000000

mcplot(byprice2)

  • The 10-20 band has the lowest competition and the largest volume (53% of sales), so it is the preferred choice; 20-30 is also attractive
  • The 200-250 segment can be analysed in the same way
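The binning rule behind `pd.cut(..., include_lowest=True)` is right-closed intervals, with the lowest edge included in the first bin. The sketch below imitates that rule with the standard library; `cut` here is a hypothetical helper written for illustration, not the pandas function.

```python
from bisect import bisect_left

def cut(value, bins, labels):
    """Right-closed binning that also includes the lowest edge."""
    if value < bins[0] or value > bins[-1]:
        return None                   # outside every interval
    i = bisect_left(bins, value)      # index of the first edge >= value
    return labels[max(i, 1) - 1]      # value == bins[0] goes to the first bin

bins = [0, 10, 20, 30, 40, 50]
labels = ['0_10', '10_20', '20_30', '30_40', '40_50']
print([cut(price, bins, labels) for price in (0, 9.9, 10, 26.8, 50)])
```

A price of exactly 10 lands in 0_10, not 10_20, which matters when many items are priced at round numbers.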

Other attributes within the segment

  • Look at market share and competition for the other attributes

df25.isna().mean()

類別        0.000000
寶貝ID      0.000000
銷量(人數)    0.000000
售價        0.000000
預估銷售額     0.007030
評價人數      0.061511
收藏人數      0.000000
地域        0.501757
店鋪類型      0.000000
品牌        0.152021
型號        0.350615
凈含量       0.696837
適用對象      0.120387
物理形態      0.202109
價格區間      0.000000
價格子區間     0.000000
dtype: float64

df25.head()

      類別  寶貝ID         銷量(人數)  售價   預估銷售額   評價人數   收藏人數  地域      店鋪類型  品牌   型號     凈含量  適用對象  物理形態  價格區間  價格子區間
2000  滅鼠  566054780243  9976     26.8  267356.8  11901.0  11596  廣東 韶關  天貓    優璇福  MT007  NaN   老鼠    膠水    0_50  20_30
2001  滅鼠  566054780243  9976     26.8  267356.8  NaN      11596  廣東 深圳  天貓    優璇福  MT007  NaN   老鼠    膠水    0_50  20_30
2002  滅鼠  572115448996  9945     9.9   98455.5   26442.0  3569   NaN      淘寶    創馳   21/32  NaN   老鼠    固體    0_50  0_10
2003  滅鼠  39868408322   99       29.9  2960.1    20.0     352    河南 南陽  天貓    云殺   粘鼠板    NaN   老鼠    固體    0_50  20_30
2004  滅鼠  520282897220  99       39.9  3950.1    559.0    1250   NaN      淘寶    得碩   NaN    g     老鼠    固體    0_50  30_40

  • Store type

bystore = byfun(df25, '店鋪類型')
bystore

店鋪類型  預估銷售額      銷售額占比  寶貝數  寶貝數占比  單寶貝平均銷售額  相對競爭度
天貓    14019740.58  0.924658  220  0.257611  63726.093545  0.0
淘寶    1142345.93   0.075342  634  0.742389  1801.807461   1.0

mcplot(bystore)

  • 天貓 (Tmall) beats 淘寶 (Taobao) on every measure: it takes 92% of sales with only 26% of the items

  • Model (型號)

bytype = byfun(df25, '型號', sort='預估銷售額')

  • Keep the models in the top 5% by sales; models with high sales are assumed to be the mainstream products

bytype1 = bytype[bytype['預估銷售額'] >= bytype['預估銷售額'].quantile(0.95)]
bytype1

型號            預估銷售額     銷售額占比  寶貝數  寶貝數占比  單寶貝平均銷售額   相對競爭度
超強力粘鼠板       2120129.32  0.142219  12   0.021661  176677.443333  0.669585
粘鼠板老鼠貼       2051699.02  0.137629  7    0.012635  293099.860000  0.451856
0005          927590.70   0.062223  2    0.003610  463795.350000  0.132628
馳天粘鼠板        876606.33   0.058803  4    0.007220  219151.582500  0.590151
QL-866        759629.30   0.050956  3    0.005415  253209.766667  0.526457
MT007         534713.60   0.035869  1    0.001805  534713.600000  0.000000
強力粘鼠魔毯       496974.60   0.033337  1    0.001805  496974.600000  0.070578
拜滅士5g        420982.40   0.028240  1    0.001805  420982.400000  0.212696
CQL-1         391271.40   0.026247  1    0.001805  391271.400000  0.268260
qb-031        368146.74   0.024695  6    0.010830  61357.790000   0.885251
希諾粘鼠板        362077.20   0.024288  1    0.001805  362077.200000  0.322858
祛螨包          320779.80   0.021518  1    0.001805  320779.800000  0.400090
粘鼠板          285096.86   0.019124  21   0.037906  13576.040952   0.974611
A1009         265719.20   0.017825  1    0.001805  265719.200000  0.503063
A1001         225685.20   0.015139  1    0.001805  225685.200000  0.577933
新款老鼠板        222272.40   0.014910  1    0.001805  222272.400000  0.584315
BK300(博克二代)  213570.00   0.014326  1    0.001805  213570.000000  0.600590
蟲蟲祛螨包        209308.20   0.014040  1    0.001805  209308.200000  0.608560
老鼠貼          209189.40   0.014033  4    0.007220  52297.350000   0.902196

mcplot(bytype1)

  • Glue boards (粘鼠板) generally hold high market shares, but model 0005 has a clear advantage on competition

  • Physical form (物理形態)

byshape = byfun(df25, ['物理形態'])
byshape

物理形態        預估銷售額      銷售額占比  寶貝數  寶貝數占比  單寶貝平均銷售額   相對競爭度
膠水          534713.60    0.040706  1    0.001490  534713.600000  0.000000
噴霧          275280.33    0.020956  6    0.008942  45880.055000   0.914433
粉狀          322931.28    0.024584  9    0.013413  35881.253333   0.933137
固體          11381498.83  0.866443  583  0.868852  19522.296449   0.963738
液體          354235.50    0.026967  22   0.032787  16101.613636   0.970137
啫喱          256687.10    0.019541  38   0.056632  6754.923684    0.987622
5個1元硬幣厚硬板  5226.00      0.000398  1    0.001490  5226.000000    0.990482
氣體          1357.18      0.000103  1    0.001490  1357.180000    0.997719
油狀          2341.60      0.000178  2    0.002981  1170.800000    0.998068
膠水紙板        358.20       0.000027  1    0.001490  358.200000     0.999588
聲波          328.90       0.000025  1    0.001490  328.900000     0.999642
粘膠          235.20       0.000018  1    0.001490  235.200000     0.999818
粘鼠板         413.90       0.000032  3    0.004471  137.966667     1.000000
膠狀          275.50       0.000021  2    0.002981  137.750000     1.000000

mcplot(byshape)

  • 固體 (solid) has by far the largest market share (87%) but also high competition; 膠水 (glue) has the lowest competition but a small share (4%)

  • In practice, solid can be treated as the standard physical form

  • Physical form and net content (物理形態, 凈含量)

byshape_con = byfun(df25, ['物理形態', '凈含量'], ['物理形態', '預估銷售額'])
byshape_con

Columns of the output below: 預估銷售額, 銷售額占比, 寶貝數, 寶貝數占比, 單寶貝平均銷售額, 相對競爭度. Rows are indexed by (物理形態, 凈含量); the index labels, as extracted, are:
膠水紙板5片\套粘鼠板200g30G粉狀50g90g液體500g450ML450ml500ml450100ml氣體30gm固體120g118.5g2.2KG6個裝0.17KG12粒170g020610克4片200克720g17040G膠2303g10張60克膠水...1n0.08225克120g個陶土10克膠水0.08kg110g0.021kg噴霧600ml500ML450ml500ml啫喱12g10克/支10g4g20g15050克45g0.1840g240g400g30g440克約50克
    358.208.961336e-0510.004098358.2000000.998289
    207.005.178661e-0510.004098207.0000000.999011
    193.704.845926e-0510.004098193.7000000.999075
    1105.002.764454e-0420.008197552.5000000.997360
    179.404.488173e-0510.004098179.4000000.999143
    165396.004.137825e-0210.004098165396.0000000.209797
    51494.781.288280e-0220.00819725747.3900000.876988
    37570.209.399195e-0340.0163939392.5500000.955126
    17920.404.483270e-0320.0081978960.2000000.957191
    180.004.503184e-0510.004098180.0000000.999140
    111.862.798479e-0510.004098111.8600000.999466
    1357.183.395351e-0410.0040981357.1800000.993516
    2122422.345.309810e-01170.069672124848.3729410.403519
    243594.066.094160e-0230.01229581198.0200000.612065
    209308.205.236407e-0210.004098209308.2000000.000000
    140739.323.520972e-0220.00819770369.6600000.663799
    139575.503.491856e-0210.004098139575.5000000.333158
    133950.003.351119e-0210.004098133950.0000000.360035
    36313.209.084723e-0310.00409836313.2000000.826508
    27011.406.757628e-0330.0122959003.8000000.956983
    23078.085.773602e-0340.0163935769.5200000.972435
    13132.803.285523e-0310.00409813132.8000000.937256
    12960.003.242292e-0310.00409812960.0000000.938082
    9943.202.487559e-0320.0081974971.6000000.976247
    9412.202.354715e-0310.0040989412.2000000.955032
    9177.602.296023e-0310.0040989177.6000000.956153
    8977.502.245963e-0310.0040988977.5000000.957109
    7810.001.953881e-0310.0040987810.0000000.962687
    7637.421.910706e-0340.0163931909.3550000.990878
    7040.001.761245e-0310.0040987040.0000000.966365
    ..................
    16.204.052865e-0610.00409816.2000000.999923
    16.004.002830e-0610.00409816.0000000.999924
    12.903.227282e-0610.00409812.9000000.999938
    11.002.751946e-0610.00409811.0000000.999947
    10.002.501769e-0610.00409810.0000000.999952
    8.802.201557e-0610.0040988.8000000.999958
    7.201.801274e-0610.0040987.2000000.999966
    7.041.761245e-0610.0040987.0400000.999966
    3.328.305872e-0710.0040983.3200000.999984
    1.503.752653e-0710.0040981.5000000.999993
    0.000.000000e+0010.0040980.0000001.000000
    274808.806.875081e-0220.008197137404.4000000.343531
    179.494.490425e-0520.00819789.7450000.999571
    160.004.002830e-0510.004098160.0000000.999236
    132.043.303335e-0510.004098132.0400000.999369
    135912.003.400204e-0210.004098135912.0000000.350661
    40650.001.016969e-0210.00409840650.0000000.805789
    4452.801.113988e-0320.0081972226.4000000.989363
    2851.207.133043e-0410.0040982851.2000000.986378
    2600.006.504599e-0410.0040982600.0000000.987578
    582.501.457280e-0410.004098582.5000000.997217
    336.008.405943e-0510.004098336.0000000.998395
    112.002.801981e-0510.004098112.0000000.999465
    105.002.626857e-0510.004098105.0000000.999498
    39.509.881987e-0610.00409839.5000000.999811
    33.008.255837e-0610.00409833.0000000.999842
    29.907.480289e-0610.00409829.9000000.999857
    28.147.039977e-0610.00409828.1400000.999866
    19.504.878449e-0610.00409819.5000000.999907
    5.401.350955e-0610.0040985.4000000.999974

    160 rows × 6 columns

    mcplot(byshape_con,(30,10))

  • When the physical form is 固體 (solid) and the net content is 1, the potential is comparatively large

Competitor analysis

os.chdir('..')
os.chdir('./競爭數據')

Category mix (share)

os.chdir('./商品銷售數據')

  • Load the data

filenames2 = glob.glob('*.xlsx')
filenames2

['安速家居近30天銷售數據.xlsx', '拜耳近30天銷售數據.xlsx', '科凌蟲控旗艦店近30天銷售數據.xlsx']

  • Inspect the data

def load_xlsx1(filename):
    df = pd.read_excel(filename)
    # Drop columns that are not needed for the analysis
    useless = ['序號', '店鋪名稱', '主圖鏈接', '商品鏈接', '商品名稱']
    df.drop(columns=useless, inplace=True)
    return df

df3bai = load_xlsx1(filenames2[1])
df3bai.head()

   商品ID         商品原價  商品售價  30天銷售量  總銷量     類目       物理形態  型號         凈含量       使用對象  銷售額
0  527604730327  109.0  39.9  43542   3023212  滅鼠/殺蟲劑  啫喱    5g         5g        蟑螂    1737325.8
1  535731556857  199.0  59.9  4860    285440   滅鼠/殺蟲劑  啫喱    拜滅易        12g       螞蟻    291114.0
2  530229854741  199.0  89.9  838     70516    滅鼠/殺蟲劑  液體    特密得100ml   100ml     白蟻    75336.2
3  569753894890  198.0  79.9  1487    19602    滅鼠/殺蟲劑  啫喱    拜滅士5g-除敵  5g+5ml*4  蟑螂    118811.3
4  549862604116  109.0  39.9  1641    155203   滅鼠/殺蟲劑  液體    除敵         20ml      殺蟲劑   65475.9

df3an = load_xlsx1(filenames2[0])
df3an.head()

   商品ID         商品原價  商品售價  30天銷售量  總銷量    類目       適用對象  30天銷售額
0  527032566392  60.0   54.4  2540    236321  滅鼠/殺蟲劑  蟑螂    138176.0
1  534230487113  32.0   29.9  883     10498   滅鼠/殺蟲劑  蟑螂    26401.7
2  527797679530  33.4   29.9  1073    117070  滅鼠/殺蟲劑  蟑螂    32082.7
3  527113108079  48.0   45.5  471     55672   滅鼠/殺蟲劑  蟑螂    21430.5
4  531350777813  58.0   48.0  566     19705   滅鼠/殺蟲劑  蟑螂    27168.0

df3kl = load_xlsx1(filenames2[2])
df3kl.head()

   商品ID         商品原價  商品售價  30天銷售量  總銷量     類目       適用對象     30天銷售額
0  541418255867  49.9   16.8  76608   3175991  滅鼠/殺蟲劑  蟑螂       1287014.4
1  528722144927  39.0   19.8  5852    174989   滅鼠/殺蟲劑  蟑螂       115869.6
2  545526161662  49.9   39.0  2497    74352    滅鼠/殺蟲劑  蟑螂       97383.0
3  536261470312  49.0   29.8  540     76572    滅鼠/殺蟲劑  (blank)  16092.0
4  553350699341  48.9   13.8  6408    324171   滅鼠/殺蟲劑  (blank)  88430.4

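Category (類目)

# numeric_only keeps the text columns out of the sums
bai31 = df3bai.groupby('類目').sum(numeric_only=True)
bai31

Columns of the output below: 商品ID, 商品原價, 商品售價, 30天銷售量, 總銷量, 銷售額. Row (類目): 滅鼠/殺蟲劑.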
    82861087920662623.01195.05918537204112673315.2
an31 = df3an.groupby('類目').sum(numeric_only=True)
an31

Columns of the output below: 商品ID, 商品原價, 商品售價, 30天銷售量, 總銷量, 30天銷售額. Rows (類目), in order: 漱口水, 滅鼠/殺蟲劑, 空氣芳香劑, 空調清潔劑, 蚊香液, 蚊香片.
    1130081741597208.0149.812727109137.3
    158535518830061650.31196.711082564638494539.3
    1620098374783196.8141.911870654668.6
    1056404562798122.9113.01513293910420.9
    3878968356851565.0344.65734200024343.6
    1129949034942157.081.0123045.0
kl31 = df3kl.groupby('類目').sum(numeric_only=True)
kl31

Columns of the output below: 商品ID, 商品原價, 商品售價, 30天銷售量, 總銷量, 30天銷售額. Rows (類目), in order: 其它園藝用品, 滅鼠/殺蟲劑, 滅鼠籠/捕鼠器.
    55269531577639.926.08804166722880.0
    131094393262491728.4817.216383344162283734597.0
    111246542599788.942.723154676062293.0
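  • Plot the category shares of the three stores side by side

fig, axes = plt.subplots(1, 3, figsize=(10, 6))
ax = axes[0]
bai31['銷售額'].plot.pie(autopct='%.f', title='拜耳', startangle=30, ax=ax)
ax.set_ylabel('')
ax = axes[1]
an31['30天銷售額'].plot.pie(autopct='%.f', title='安速', startangle=60, ax=ax)
ax.set_ylabel('')
ax = axes[2]
kl31['30天銷售額'].plot.pie(autopct='%.f', title='科凌蟲控', startangle=90, ax=ax)
ax.set_ylabel('')
plt.show()

  • 拜耳 (Bayer) sells in a single category, while the other two sell in several; for all three the main category is 滅鼠/殺蟲劑 (rodenticides and insecticides)

Target pest (適用對象)

bai32 = df3bai.groupby('使用對象').sum(numeric_only=True)
bai32

Columns of the output below: 商品ID, 商品原價, 商品售價, 30天銷售量, 總銷量, 銷售額. Rows (使用對象), in order: 上門服務, 殺蟲劑, 白蟻, 螞蟻, 蟑螂.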
    578090143145199.099.0743087326.0
    1074833731154208.079.82593187382103460.7
    530229854741199.089.98387051675336.2
    1106313745903448.0173.94989286942305820.0
    49966413171231569.0752.45069131752632181372.3
an32 = df3an.groupby('適用對象').sum(numeric_only=True)
an32

Columns of the output below: 商品ID, 商品原價, 商品售價, 30天銷售量, 總銷量, 30天銷售額. Rows (適用對象), in order: 殺蟲劑, 漱口水, 空氣芳香劑, 空調清潔劑, 螞蟻, 蚊, 蠅, 螨, 蟑螂, 鼠.
    4366320607112435.7349.67551941634687.5
    1130081741597208.0149.812727109137.3
    1620098374783196.8141.911870654668.6
    1056404562798122.9113.01513293910420.9
    56362813371529.022.2538331176.6
    5008917391793722.0425.65744223024388.6
    56499356525248.024.85151655112772.0
    2820972369214399.0242.6330219587164381.3
    6446879299188654.0481.36178500064268585.3
    109075790852584.676.2279818712936.6
kl32 = df3kl.groupby('適用對象').sum(numeric_only=True)
kl32

Columns of the output below: 商品ID, 商品原價, 商品售價, 30天銷售量, 總銷量, 30天銷售額. Rows (適用對象), in order: 其它園藝用品, 蟲, 虱, 蛾, 螨, 蟑螂, 鼠.
    55269531577639.926.08804166722880.0
    1606589035658247.6118.63765962035.0
    55933582767865.025.1739580838185614.5
    107489743767077.044.35757835416599.5
    1143807512213198.078.8360371450551114222.4
    2193692130180237.8105.58578534270311525024.2
    7643582808847991.9487.636319725114953394.4
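  • Plot

fig, axes = plt.subplots(1, 3, figsize=(10, 6))
ax = axes[0]
bai32['銷售額'].plot.pie(autopct='%.f', title='拜耳', startangle=30, ax=ax)
ax.set_ylabel('')
ax = axes[1]
an32['30天銷售額'].plot.pie(autopct='%.f', title='安速', startangle=60, ax=ax)
ax.set_ylabel('')
ax = axes[2]
kl32['30天銷售額'].plot.pie(autopct='%.f', title='科凌蟲控', startangle=90, ax=ax)
ax.set_ylabel('')
plt.show()

  • 拜耳 mainly targets cockroaches (蟑螂), while the other two stores also sell against mites (螨) and rodents (鼠)
  • The earlier analysis showed that rodent control and cockroach control both hold large shares of the overall market
  • 拜耳 should therefore open new markets, rodent control above all, and also evaluate the mite market that both competitors have entered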

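Product structure analysis

os.chdir('..')
os.chdir('./商品交易數據')

  • The data for 拜耳, 安速 and 科凌蟲控 is read and analysed separately
  • The focus is the product structure
  • The steps are the same for each store, so they are wrapped in functions

filenames3 = glob.glob('*.xlsx')
filenames3

['安速全店商品交易數據.xlsx', '拜耳全店商品交易數據.xlsx', '科凌蟲控全店商品交易數據.xlsx']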

拜耳 (Bayer)

  • Read the data

df4bai = pd.read_excel(filenames3[1])
df4bai.head()

   品牌  時間          商品                                   行業排名  交易指數  交易增長幅度  支付轉化指數  操作    交易金額
0  拜耳  2018-07-01  德國拜耳拜滅士蟑螂藥一窩端殺蟑膠餌滅蟑螂屋無毒克星家用全窩端     1      583483  0.0350   1500     趨勢分析  9354158.37
1  拜耳  2018-07-01  進口蟑螂藥一窩端德國拜耳拜滅士強力殺蟑膠餌蟑螂屋克星家用全窩      6      278542  0.1258   1194     趨勢分析  2470202.91
2  拜耳  2018-07-01  德國拜耳進口螞蟻藥拜滅易滅蟻餌劑除殺螞蟻殺蟲劑家用室內全窩端      11     212329  0.5070   1328     趨勢分析  1518114.06
3  拜耳  2018-07-01  進口蟑螂藥德國拜耳拜滅士滅殺蟑膠餌劑粉屋克星全窩端12g家用      44     107697  -0.2044  1117     趨勢分析  451897.06
4  拜耳  2018-07-01  德國拜耳 除敵跳蚤殺蟲劑家用滅蟑螂藥虱子殺潮蟲臭蟲除蟲劑噴霧     45     105901  -0.2134  936      趨勢分析  438583.74

df4bai.info()
df4bai['商品'].value_counts().count()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 142 entries, 0 to 141
Data columns (total 9 columns):
品牌        142 non-null object
時間        142 non-null datetime64[ns]
商品        142 non-null object
行業排名      142 non-null int64
交易指數      142 non-null int64
交易增長幅度    142 non-null float64
支付轉化指數    142 non-null int64
操作        142 non-null object
交易金額      142 non-null float64
dtypes: datetime64[ns](1), float64(2), int64(3), object(3)
memory usage: 10.1+ KB

44

  • The data covers five months; each product appears in at least one and at most five of them, so the rows need to be aggregated per product

  • Define the aggregation function

def byproduct(df):
    # Mean growth per product
    dfb = df.groupby('商品')[['交易增長幅度']].mean()
    # Total transaction amount per product and its share of the store total
    dfb['交易金額'] = df.groupby('商品')['交易金額'].sum()
    dfb['交易金額占比'] = dfb['交易金額'] / dfb['交易金額'].sum()
    # Number of monthly records per product
    dfb['商品個數'] = df.groupby('商品')['交易金額'].count()
    dfb.reset_index(inplace=True)
    return dfb

bai4 = byproduct(df4bai)
bai4.head(5)

   商品                                   交易增長幅度   交易金額      交易金額占比  商品個數
0  17年德國拜耳進口螞蟻藥拜滅易滅蟻餌劑粉除殺螞蟻殺蟲劑全窩端        -0.247600  42340.55    0.000523  1
1  德國原裝進口拜耳蟑螂藥全窩端拜滅士5g+12g殺蟑膠餌劑粉屋捕捉器     -0.120333  197377.17   0.002439  3
2  德國拜耳 除敵跳蚤殺蟲劑家用滅蟑螂藥虱子殺潮蟲臭蟲除蟲劑噴霧       -0.221080  1394780.59  0.017232  5
3  德國拜耳丁香醫生限量款拜滅士加量家用蟑螂全窩端蟑螂藥3支裝        1.279300   26742.75    0.000330  1
4  德國拜耳上門除滅鼠滅白蟻蟑螂蚊子跳蚤蒼蠅上海地區滅蟲按件拍下       2.439300   28969.28    0.000358  1

bai4.describe()

       交易增長幅度      交易金額        交易金額占比   商品個數
count  44.000000    4.400000e+01  44.000000  44.000000
mean   25.539933    1.839560e+06  0.022727   3.227273
std    155.083825   6.321050e+06  0.078095   1.668639
min    -0.292100    2.255362e+04  0.000279   1.000000
25%    -0.061040    6.828508e+04  0.000844   1.750000
50%    0.109000     2.061879e+05  0.002547   3.000000
75%    1.287410     7.673291e+05  0.009480   5.000000
max    1030.000000  4.010385e+07  0.495473   5.000000
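    • 交易增長幅度 (transaction growth rate) stands in for market growth, and 交易金額占比 (share of transaction amount) stands in for market share

    • The maximum of each indicator is far above its third quartile, so these values are treated as outliers; capping is applied to keep the later plots readable

    • Define the capping function (only the right tail is capped, at the 90th percentile)

    def block(x):
        # Replace every value above the 90th percentile with the 90th percentile
        qu = x.quantile(.9)
        out = x.mask(x > qu, qu)
        return out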
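To see what right-tail capping does, here is a minimal sketch on a toy series. It uses `Series.clip`, which gives the same result as the `mask` call; the name `cap_upper` and the sample values are assumptions for illustration.

```python
import pandas as pd

def cap_upper(x: pd.Series, q: float = 0.9) -> pd.Series:
    """Replace values above the q-th quantile with that quantile."""
    upper = x.quantile(q)        # linear interpolation by default
    return x.clip(upper=upper)   # values at or below `upper` are unchanged

# One extreme value dominates the scale of the series
s = pd.Series([1.0, 2.0, 3.0, 4.0, 1000.0])
capped = cap_upper(s)
print(capped.tolist())
```

Only the extreme value is pulled in; the ordering of the points is preserved, which is what matters for reading a scatter plot.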
    • Define a function that caps both indicators
    def block2(df):
        df1 = df.copy()
        df1['交易增長幅度'] = block(df1['交易增長幅度'])
        df1['交易金額占比'] = block(df1['交易金額占比'])
        return df1

    bai41 = block2(bai4)
    bai41.describe()
           交易增長幅度  交易金額  交易金額占比  商品個數
    count  44.000000  4.400000e+01  44.000000  44.000000
    mean   1.171668  1.839560e+06  0.007699  3.227273
    std    2.212537  6.321050e+06  0.010333  1.668639
    min    -0.292100  2.255362e+04  0.000279  1.000000
    25%    -0.061040  6.828508e+04  0.000844  1.750000
    50%    0.109000  2.061879e+05  0.002547  3.000000
    75%    1.287410  7.673291e+05  0.009480  5.000000
    max    6.717030  4.010385e+07  0.031863  5.000000
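    • After capping, the two indicators are far less extreme, which makes the plot easier to read

    • Define the plotting function

    def plotBOG(df, mean=False, q1=0.5, q2=0.5):
        f, ax = plt.subplots(figsize=(10, 8))
        # x and y are passed as keywords; recent seaborn versions require this
        sns.scatterplot(x='交易金額占比', y='交易增長幅度',
                        hue='商品個數', size='商品個數', sizes=(20, 200),
                        palette='cool', legend='full', data=df, ax=ax)
        # Label each point with its row index
        for i in range(len(df)):
            ax.text(df['交易金額占比'][i] + 0.001, df['交易增長幅度'][i], i)
        # Quadrant dividers: the means, or the q1/q2 quantiles (medians by default)
        if mean:
            plt.axvline(df['交易金額占比'].mean())
            plt.axhline(df['交易增長幅度'].mean())
        else:
            plt.axvline(df['交易金額占比'].quantile(q1))
            plt.axhline(df['交易增長幅度'].quantile(q2))
        plt.show()
    • Use the means as the dividing lines of the BCG (Boston) matrix
    plotBOG(bai41, mean=True)

    • Use the medians as the dividing lines of the BCG matrix
    plotBOG(bai41)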

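    • The dividing lines can be set to suit the business, based on industry experience (for example, if a growth rate of 0.1 counts as high in this industry, 0.1 can be used as the divider)

    • Stars and cash cows generally have more monthly records (商品個數)

    • There is no standout star, but some question marks are close to becoming stars

    • List the products in each quadrant (except dogs)

    • Each quadrant is sorted by the metric that matters most for it

      • Stars: both metrics matter, so either sort key works; there are usually few of them
      • Cash cows: established best-sellers; market share matters, so sort by 交易金額占比
      • Question marks: products with potential; market growth matters, so sort by 交易增長幅度
    • The actual values are needed here, so the uncapped data are used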

    def extractBOG(df, q1=0.5, q2=0.5, by='交易金額占比'):
        high_share = df['交易金額占比'] >= df['交易金額占比'].quantile(q1)
        high_growth = df['交易增長幅度'] >= df['交易增長幅度'].quantile(q2)
        # Stars: high share, high growth
        star = df.loc[high_share & high_growth, :].sort_values(by, ascending=False)
        # Cash cows: high share, low growth
        cow = df.loc[high_share & ~high_growth, :].sort_values(by, ascending=False)
        # Question marks: low share, high growth
        que = df.loc[~high_share & high_growth, :].sort_values(by, ascending=False)
        return star, cow, que

    # Sorted by share (default) and by growth
    baistar, baicow, baique = extractBOG(bai4)
    baistar1, baicow1, baique1 = extractBOG(bai4, by='交易增長幅度')
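When the dividers come from business judgment rather than quantiles (such as the growth threshold of 0.1 mentioned above), the quadrant can be attached to every product as a label. A minimal sketch, assuming explicit cut-offs; the function name `label_bcg`, the `quadrant` column, the toy values and the share cut-off of 0.05 are hypothetical.

```python
import pandas as pd

def label_bcg(df: pd.DataFrame, share_cut: float, growth_cut: float) -> pd.DataFrame:
    """Add a 'quadrant' column using explicit share and growth cut-offs."""
    high_share = df['交易金額占比'] >= share_cut
    high_growth = df['交易增長幅度'] >= growth_cut
    quadrant = pd.Series('dog', index=df.index)          # low share, low growth
    quadrant[high_share & high_growth] = 'star'
    quadrant[high_share & ~high_growth] = 'cash cow'
    quadrant[~high_share & high_growth] = 'question mark'
    return df.assign(quadrant=quadrant)

# Toy products, one per quadrant
toy = pd.DataFrame({
    '交易金額占比': [0.50, 0.30, 0.01, 0.02],
    '交易增長幅度': [0.20, -0.10, 0.50, -0.30],
})
labelled = label_bcg(toy, share_cut=0.05, growth_cut=0.1)
print(labelled)
```

Keeping all four labels in one column makes it easy to count products per quadrant with `value_counts()`, including the dogs that the extraction step leaves out.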
    • Bayer stars
    baistar
        商品  交易增長幅度  交易金額  交易金額占比  商品個數
    6   德國拜耳拜滅士5g+除敵5ml*4支進口蟑螂藥家用全窩端殺蟑螂套裝  1.311740  1484952.41  0.018346  5
    20  德國拜耳除敵殺蟲劑滅蟑螂藥跳蚤蒼蠅臭蟲除螨虱子噴霧5ml*8  0.325580  983199.69  0.012147  5
    25  拜耳拜滅士 蟑螂藥家用全窩端 強力滅除廚房蟑螂屋克星殺德國進口  1.454067  413902.42  0.005114  3
    8   德國拜耳拜滅士蟑螂藥一窩端進口全窩端家用滅除殺蟑膠餌5g包郵  29.260600  236386.32  0.002920  1
    • Mostly cockroach control and general insecticides, but their shares are small and their growth is moderate

    • Bayer cash cows (established best-sellers)

    baicow
        商品  交易增長幅度  交易金額  交易金額占比  商品個數
    7   德國拜耳拜滅士蟑螂藥一窩端殺蟑膠餌滅蟑螂屋無毒克星家用全窩端  -0.121360  40103850.97  0.495473  5
    39  進口蟑螂藥一窩端德國拜耳拜滅士強力殺蟑膠餌蟑螂屋克星家用全窩  -0.034680  13617307.87  0.168238  5
    18  德國拜耳進口螞蟻藥拜滅易滅蟻餌劑除殺螞蟻殺蟲劑家用室內全窩端  -0.056240  6130488.42  0.075741  5
    42  進口蟑螂藥德國拜耳拜滅士滅殺蟑膠餌劑粉屋克星全窩端12g家用  0.070960  3589799.54  0.044351  5
    30  蟑螂藥一窩端德國拜耳拜滅士除蟑滅殺蟑螂克星全窩端家用殺蟑膠  0.038900  2976922.63  0.036779  5
    41  進口蟑螂藥德國拜耳拜滅士殺蟑膠餌蟑螂克星全窩端家用滅蟑屋12g  -0.127140  1650681.34  0.020394  5
    10  德國拜耳拜滅易滅蟻餌劑粉除殺防螞蟻藥殺蟲劑全窩端家用室內花園  -0.027300  1520126.15  0.018781  5
    2   德國拜耳 除敵跳蚤殺蟲劑家用滅蟑螂藥虱子殺潮蟲臭蟲除蟲劑噴霧  -0.221080  1394780.59  0.017232  5
    31  蟑螂藥德國拜耳拜滅士強力除殺蟑螂克星膠餌屋家用捕捉器貼全窩端  0.011680  1198694.67  0.014810  5
    27  拜耳蟑螂藥一窩端家用拜滅士殺蟑膠餌德國小強藥粉滅蟑螂廚房克星  0.035400  695372.27  0.008591  3
    36  進口蟑螂藥 德國拜耳拜滅士家用殺蟑螂膠餌捕捉器蟑螂屋粉全窩端  -0.148240  582636.66  0.007198  5
    15  德國拜耳進口白蟻藥除滅防殺白蟻殺蟲劑全窩端家用特傚觸殺型粉藥  -0.134425  564169.77  0.006970  4
    23  必搶 德國進口拜耳蟑螂克星家用小強全窩端蟑螂藥殺蟑膠餌滅蟑17g  -0.292100  561689.93  0.006940  4
    19  德國拜耳除敵殺蟲劑滅蚊蟑螂螞蟻藥跳蚤蒼蠅臭蟲除螨虱子家用  -0.075440  393582.71  0.004863  5
    14  德國拜耳跳蚤殺蟲劑家用潮蟲滅蛾蚋虱子臭蟲藥除蟑螂5ml*4  0.007220  318059.43  0.003930  5
    5   德國拜耳拜滅士5g+拜滅易12g進口殺蟑螂螞蟻藥蟑螂克星家用全窩端  0.048340  251225.58  0.003104  5
    29  蟑螂藥30克拜滅士德國拜耳進口安全滅蟑螂殺蟑膠餌顆粒劑傳染傳毒  -0.107200  221081.46  0.002731  4
    16  德國拜耳進口螞蟻藥家用拜滅易滅蟻餌劑殺螞蟻殺蟲劑室內全窩端蟻  -0.047140  214998.61  0.002656  5
    • Cockroach control has by far the largest share; general insecticides account for a modest share

    • Bayer question marks (products with potential)

    baique1
        商品  交易增長幅度  交易金額  交易金額占比  商品個數
    28  電子貓超聲波驅鼠器家用大功率滅鼠防鼠趕老鼠夾藥捕鼠干擾粘鼠板  1030.000000  41046.03  0.000507  1
    40  進口蟑螂藥一窩端德國拜耳拜滅士殺蟑膠餌誘防蟑螂屋全窩端5+12g  31.218500  67043.14  0.000828  1
    11  德國拜耳拜滅易進口螞蟻藥一窩端滅蟻餌劑清除螞蟻粉家用全窩端  7.867000  32246.39  0.000398  1
    26  拜耳滅螞蟻藥家用一窩端室內室外用殺小黃紅螞蟻藥神器膠餌拜滅易  7.442700  22553.62  0.000279  1
    12  德國拜耳白蟻藥殺蟲劑全窩端家用除殺防治滅飛螞蟻特密得預防裝修  5.023800  55113.41  0.000681  1
    4   德國拜耳上門除滅鼠滅白蟻蟑螂蚊子跳蚤蒼蠅上海地區滅蟲按件拍下  2.439300  28969.28  0.000358  1
    24  拜滅士蟑螂藥蟑螂克星家用無毒強力滅蟑清德國拜耳殺蟑餌劑全窩端  1.797267  140032.47  0.001730  3
    33  蟑螂藥進口德國拜耳拜滅士家用除殺蟑螂屋膠餌捕捉器強力清全窩端  1.465350  101079.33  0.001249  2
    3   德國拜耳丁香醫生限量款拜滅士加量家用蟑螂全窩端蟑螂藥3支裝  1.279300  26742.75  0.000330  1
    13  德國拜耳蟑螂藥拜滅士殺蟑膠餌蟑螂克星全窩端家用滅蟑加量裝12g  1.069050  68699.06  0.000849  2
    35  進口螞蟻藥德國拜耳拜滅易殺蟻膠餌滅蟻餌劑紅黑黃螞蟻全窩端家用  0.742200  33514.83  0.000414  1
    21  德國進口拜耳蟑螂藥拜滅士殺滅除蟑膠餌劑粉屋家用全窩端12g*2盒  0.586000  28605.01  0.000353  1
    32  蟑螂藥拜耳拜滅士殺蟑膠餌強力滅蟑清貼捕捉器蟑螂克星家用全窩端  0.538100  55395.16  0.000684  2
    9   德國拜耳拜滅易12g+除敵5ml*4支 進口螞蟻藥全窩端家用殺蟲劑組合  0.450550  89566.01  0.001107  2
    17  德國拜耳進口螞蟻藥拜滅易滅蟻餌劑除殺螞蟻無毒家用室內全窩端  0.364950  93400.98  0.001154  2
    38  進口蟑螂藥 德國拜耳拜滅士殺蟑螂膠餌劑33克滅蟑螂藥全窩端家用  0.331367  143883.32  0.001778  3
    43  預售德國拜耳進口蟑螂藥進口螞蟻藥組合裝5g+12g  0.218150  85546.09  0.001057  2
    22  德國進口拜耳蟑螂藥拜滅士殺蟑膠餌5g兩盒套裝全窩端殺滅蟑螂劑屋  0.147040  189376.20  0.002340  5
    • Most of these are still cockroach control and insecticides

    • The fastest-growing product is a rodent-control item, and rodent control was shown earlier to hold the largest market share, so it is a sensible next focus

    • Summary: most of Bayer's products are concentrated in cockroach control, with insecticides at some scale. Its stars are somewhat weak, and the rodent-control question mark could be developed into a star

    安速

    • Read and describe the data
    df4an = pd.read_excel(filenames3[0])
    df4an.head(2)
       日期  商品  行業排名  交易指數  交易增長幅度  支付轉化指數  操作  交易金額
    0  2018-07-01  日本安速小強恢恢蟑螂屋紙盒子捕捉器藥滅殺強力家用貼克星全窩端  3  310517  0.6037  1445  趨勢分析  3002740.75
    1  2018-07-01  日本進口安速小強恢恢滅蟑螂屋藥無毒捕捉器克星家用強力清全窩端  25  151749  -0.2359  1200  趨勢分析  832540.52
    df4an.info()
    df4an['商品'].value_counts().count()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 141 entries, 0 to 140
    Data columns (total 8 columns):
    日期  141 non-null datetime64[ns]
    商品  141 non-null object
    行業排名  141 non-null int64
    交易指數  141 non-null int64
    交易增長幅度  141 non-null float64
    支付轉化指數  141 non-null int64
    操作  141 non-null object
    交易金額  141 non-null float64
    dtypes: datetime64[ns](1), float64(2), int64(3), object(2)
    memory usage: 8.9+ KB
    49
    • Aggregate per product
    an4 = byproduct(df4an)
    an4.describe()
           交易增長幅度  交易金額  交易金額占比  商品個數
    count  49.000000  4.900000e+01  49.000000  49.000000
    mean   1.831989  6.150227e+05  0.020408  2.877551
    std    6.706975  1.954368e+06  0.064851  1.666241
    min    -0.641300  1.916612e+04  0.000636  1.000000
    25%    -0.059500  4.044140e+04  0.001342  1.000000
    50%    0.176750  1.162554e+05  0.003858  3.000000
    75%    0.604900  4.153983e+05  0.013784  5.000000
    max    42.014300  1.329498e+07  0.441164  5.000000
    • Apply capping
    an41 = block2(an4)
    an41.describe()
           交易增長幅度  交易金額  交易金額占比  商品個數
    count  49.000000  4.900000e+01  49.000000  49.000000
    mean   0.593263  6.150227e+05  0.008969  2.877551
    std    1.160903  1.954368e+06  0.009983  1.666241
    min    -0.641300  1.916612e+04  0.000636  1.000000
    25%    -0.059500  4.044140e+04  0.001342  1.000000
    50%    0.176750  1.162554e+05  0.003858  3.000000
    75%    0.604900  4.153983e+05  0.013784  5.000000
    max    3.530507  1.329498e+07  0.029506  5.000000
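    • BCG plot
    plotBOG(an41)

    • Cash cows are plentiful, some stars look promising, some question marks have potential, and there are few dogs

    • Look at the individual products

    anstar, ancow, anque = extractBOG(an4)
    anstar1, ancow1, anque1 = extractBOG(an4, by='交易增長幅度')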
    • 安速 stars
    anstar
        商品  交易增長幅度  交易金額  交易金額占比  商品個數
    35  日本安速蟑螂小屋滅除殺蟑螂藥一窩端神器捕捉器家用克星小強恢恢  0.440000  881587.79  0.029254  5
    34  日本安速螞蟻藥滅螞蟻清驅除殺紅螞蟻小黃螞蟻全窩端家用室內花園  21.464140  444230.43  0.014741  5
    21  日本安速小黑帽小強黑克殺蟑餌劑9枚 蟑螂藥屋蟑螂克星家用全窩端  0.419900  320927.90  0.010649  4
    39  日本進口安速天然除螨噴霧劑350ml床上免洗家用正品螨蟲非除殺菌  3.859333  304915.11  0.010118  3
    0   10枚裝日本進口安速小強恢恢蟑螂屋誘捕捉器殺藥貼環保無毒包郵  0.176750  217061.51  0.007203  4
    43  日本進口安速小強恢恢蟑螂屋5片藥捕捉器滅殺蟑螂克星家用全窩端  0.189600  215689.30  0.007157  4
    29  日本安速紅阿斯殺蟲煙霧劑20g彈煙熏滅跳蚤蟑螂藥克星家用全窩端  0.399800  193282.28  0.006414  4
    18  日本安速小強恢恢蟑螂屋滅蟑螂強力捕捉器誘捕器除殺蟑螂清藥貼  0.227425  142604.89  0.004732  4
    • Both insecticides and cockroach control perform well

    • 安速 cash cows

    ancow
        商品  交易增長幅度  交易金額  交易金額占比  商品個數
    19  日本安速小強恢恢蟑螂屋紙盒子捕捉器藥滅殺強力家用貼克星全窩端  0.012180  13294975.97  0.441164  5
    41  日本進口安速小強恢恢滅蟑螂屋藥無毒捕捉器克星家用強力清全窩端  -0.111000  3685204.19  0.122285  5
    40  日本進口安速小強恢恢殺蟑滅蟑螂屋10枚捕捉器不含蟑螂藥  -0.000740  2354769.41  0.078138  5
    24  日本安速小黑帽蟑螂屋家用強力滅蟑螂藥環保無毒無味除小強包郵  0.072400  1083611.82  0.035957  5
    16  日本安速小強恢恢蟑螂屋家用殺蟑膠餌小蟑螂藥無毒蟑螂克星全窩端  -0.254840  919683.22  0.030518  5
    1   ?【10枚裝】日本進口安速小強恢恢蟑螂屋捕捉器殺藥貼家用全窩端  -0.159780  812113.82  0.026948  5
    32  日本安速紅阿斯煙霧殺蟲劑滅跳蚤藥煙彈家用神器螨蟲克星送蟑螂屋  0.095920  762628.32  0.025306  5
    46  日本進口安速紅阿斯殺蟲煙霧劑彈煙熏滅跳蚤蟑螂克星家用全窩端  -0.120460  622279.57  0.020649  5
    6   原裝進口日本安速紅阿斯殺蟲煙霧劑熏殺滅跳蚤臭蟲螨蟲蟑螂20克  -0.283000  480525.86  0.015945  4
    17  日本安速小強恢恢蟑螂屋捕捉器神器廚房清滅強力貼克星家用全窩端  -0.059500  423102.79  0.014040  5
    20  日本安速小強恢恢蟑螂屋藥6片 無毒捕捉器強力滅清克星家用全窩端  -0.203980  415398.34  0.013784  5
    38  日本安速除螨蟲噴霧劑床上免洗祛去螨蟲噴劑家用殺菌送除螨包神器  -0.102467  358839.15  0.011907  3
    45  日本進口安速殺蠅餌劑蒼蠅藥1盒 粘蠅紙滅蒼蠅貼強力神家用捕蠅器  0.012960  293644.17  0.009744  5
    26  日本安速小黑帽蟑螂屋蟑螂藥克星家用安全無毒強力滅蟑清全窩端!  -0.209850  279383.03  0.009271  4
    36  日本安速蟑螂藥12枚家用滅殺蟑螂屋膠餌劑清強力捕捉器克星全窩端  -0.137750  184701.40  0.006129  4
    15  日本安速小強恢恢蟑螂屋6片家用無毒蟑螂貼捕捉器克星家用全窩端  -0.301267  168503.43  0.005591  3
    31  日本安速紅阿斯殺蟲煙霧劑彈煙熏強力滅跳蚤蟑螂藥克星家用全窩端  -0.010433  116255.35  0.003858  3
    • Mainly cockroach control, in direct competition with Bayer

    • 安速 question marks

    anque1
        商品  交易增長幅度  交易金額  交易金額占比  商品個數
    4   沖銷量日本安速小黑帽蟑螂屋蟑螂藥家用強力滅蟑清安全無毒小強  42.01430  32034.34  0.001063  1
    37  日本安速除螨蟲噴霧劑床上免洗去螨蟲神器噴劑家用非殺菌送除螨包  6.64800  41991.71  0.001393  1
    13  日本安速ARS地球制藥earth小飛蟲恢恢果蠅誘捕器單只裝 0315  4.36430  40441.40  0.001342  1
    27  日本安速殺蟑氣霧劑精純無味型2瓶 滅蟑螂藥殺蟲劑家用潮蟲百蟲靈  3.44830  52292.37  0.001735  2
    42  日本進口安速小強恢恢蟑螂屋5片家用無毒貼捕捉器克星家用全窩端  1.16750  19943.68  0.000662  1
    30  日本安速紅阿斯殺蟲煙霧劑彈10g煙熏滅跳蚤蟑螂克星家用全窩端  0.97050  60200.62  0.001998  1
    2   【20枚裝】日本進口安速小強恢恢蟑螂屋蟑螂捕捉器誘捕器滅蟑小屋  0.93215  58785.41  0.001951  2
    9   日本 安速EARTH小果蠅恢恢殺蠅餌劑滅蒼蠅小飛蟲神器誘捕捕捉器  0.88560  33633.57  0.001116  1
    47  日本進口安速紅阿斯殺蟲煙霧劑跳蚤螨蟲螞蟻藥蟑螂克星家用全窩端  0.73170  19166.12  0.000636  1
    33  日本安速老鼠吱吱板4片 老鼠貼強力粘鼠板驅鼠滅鼠器老鼠膠藥家用  0.66030  30567.08  0.001014  1
    5   原裝正品日本安速小強恢恢蟑螂屋蟑螂捕捉器誘捕器小屋20枚包郵  0.60490  32627.73  0.001083  1
    12  日本ARS安速小黑帽蟑螂屋盒子無毒無味滅小強安全室內12枚蟑螂藥  0.51235  81303.15  0.002698  2
    48  現貨 日本正品安速小黑帽蟑螂屋殺小強滅蟑螂藥環保無毒無刺激  0.48520  105556.48  0.003503  3
    10  日本ARS安速小黑帽環保無毒滅蟑螂藥無味除小強小黑屋12枚  0.38120  34503.54  0.001145  1
    8   德國拜耳進口螞蟻藥拜滅易滅蟻餌劑除殺螞蟻無毒家用室內全窩端  0.36495  93400.98  0.003099  2
    14  日本安速小強恢恢蟑螂屋20片蟑螂藥滅蟑螂克星家用全窩端  0.28400  25693.46  0.000853  1
    25  日本安速小黑帽蟑螂屋滅蟑螂藥家用廚房滅蟑清蟑螂克星全窩端無毒  0.20310  35358.87  0.001173  1
    • The top few are cockroach control, mite removal and insecticides, all with room to grow

    • Summary: 安速 has no significant presence in rodent control

    • Bayer vs 安速: Bayer's insecticides are established best-sellers, and the two brands compete in cockroach control

    科凌蟲控

    • Read and describe the data
    df4kl = pd.read_excel(filenames3[2])
    df4kl.head(2)
       日期  商品  行業排名  交易指數  交易增長幅度  支付轉化指數  操作  交易金額
    0  2018-07-01  蟑螂藥一窩端蟑螂屋膠餌滅蟑螂無毒廚房家用強力殺蟑螂克星全窩端  2  466881  0.4525  1850  趨勢分析  6256693.23
    1  2018-07-01  蟑螂屋捕捉器除滅蟑螂藥一窩端神器紙盒子膠餌殺小蟑螂貼廚房家用  14  204545  0.1933  1577  趨勢分析  1419883.88
    df4kl.info()
    df4kl['商品'].value_counts().count()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 118 entries, 0 to 117
    Data columns (total 8 columns):
    日期  118 non-null datetime64[ns]
    商品  118 non-null object
    行業排名  118 non-null int64
    交易指數  118 non-null int64
    交易增長幅度  118 non-null float64
    支付轉化指數  118 non-null int64
    操作  118 non-null object
    交易金額  118 non-null float64
    dtypes: datetime64[ns](1), float64(2), int64(3), object(2)
    memory usage: 7.5+ KB
    31
    • Aggregate per product
    kl4 = byproduct(df4kl)
    kl4.describe()
           交易增長幅度  交易金額  交易金額占比  商品個數
    count  31.000000  3.100000e+01  31.000000  31.000000
    mean   13.479448  1.500410e+06  0.032258  3.806452
    std    73.221448  4.039568e+06  0.086849  1.558190
    min    -0.317840  2.566598e+04  0.000552  1.000000
    25%    -0.065360  1.099735e+05  0.002364  2.500000
    50%    0.054800  3.286985e+05  0.007067  5.000000
    75%    0.552800  1.138542e+06  0.024478  5.000000
    max    407.982650  2.196606e+07  0.472259  5.000000
    • Apply capping
    kl41 = block2(kl4)
    kl41.describe()
           交易增長幅度  交易金額  交易金額占比  商品個數
    count  31.000000  3.100000e+01  31.000000  31.000000
    mean   0.263058  1.500410e+06  0.014589  3.806452
    std    0.492999  4.039568e+06  0.014994  1.558190
    min    -0.317840  2.566598e+04  0.000552  1.000000
    25%    -0.065360  1.099735e+05  0.002364  2.500000
    50%    0.054800  3.286985e+05  0.007067  5.000000
    75%    0.552800  1.138542e+06  0.024478  5.000000
    max    1.268350  2.196606e+07  0.044609  5.000000
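    • Plot
    plotBOG(kl41)

    • Cash cows are plentiful; stars are few but most look promising; some question marks have potential; dogs are few

    • Look at the individual products

    klstar, klcow, klque = extractBOG(kl4)
    klstar1, klcow1, klque1 = extractBOG(kl4, by='交易增長幅度')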
    • 科凌蟲控 stars
    klstar
        商品  交易增長幅度  交易金額  交易金額占比  商品個數
    28  除螨蟲家用噴劑床上祛防螨去螨蟲神器噴霧劑包貼殺蟲劑免洗非殺菌  0.80742  2270518.54  0.048815  5
    20  老鼠貼超強力粘鼠板滅鼠神器捕鼠魔毯yao顆粒老鼠膠家用藥捕鼠器  0.08576  2074886.77  0.044609  5
    13  科凌蟲控蟑螂藥屋蟑螂克星殺蟑膠餌滅蟑螂粉全窩端家用強力滅蟑清  0.15184  1253677.00  0.026953  5
    1   去兒童頭虱除陰虱凈噴劑百部酊虱卵用虱子藥成人一掃光凈虱靈噴霧  0.67058  1112790.54  0.023924  5
    • Mainly rodent control, mite removal and insecticides

    • 科凌蟲控 cash cows

    klcow
        商品  交易增長幅度  交易金額  交易金額占比  商品個數
    23  蟑螂藥一窩端蟑螂屋膠餌滅蟑螂無毒廚房家用強力殺蟑螂克星全窩端  -0.09628  21966057.02  0.472259  5
    22  蟑螂屋捕捉器除滅蟑螂藥一窩端神器紙盒子膠餌殺小蟑螂貼廚房家用  -0.00548  7464346.74  0.160480  5
    5   殺蟑螂藥一窩端蟑螂藥粉家用捕捉器屋廚貼無毒滅蟑螂粉除小強克星  -0.19370  1408637.81  0.030285  5
    26  跳蚤殺蟲劑家用廁所衛生間下水道除蟲滅小飛蟲蛾蚋蟑螂臭蟲藥室內  -0.18824  1316871.08  0.028312  5
    21  蟑螂屋強力滅蟑清蟑螂貼家用無毒粉殺蟑螂藥膠餌捕捉器克星全窩端  0.05320  1157585.18  0.024888  5
    15  粘鼠板超強力驅老鼠貼滅鼠抓老鼠夾藥捕鼠器黏老鼠膠沾鼠神器家用  -0.08956  1119499.65  0.024069  5
    10  科凌蟲控殺蟑螂藥一窩端滅蟑螂膠餌藥粉克星南方大蟑螂全窩端家用  -0.09338  1009419.11  0.021702  5
    14  米面蛾誘捕器蛾子粘捕器家用小飛蟲殺蟲劑滅飛蛾除米蛾衣蛾蚋蛾蠓  -0.31784  578549.20  0.012439  5
    17  老鼠籠捕鼠器全自動超強家用抓老鼠夾藥捉耗子連續滅鼠神器驅鼠器  -0.01008  465621.51  0.010011  5
    29  除跳蚤噴劑床上殺蟲劑氣霧家用潮蟲驅蟲滅去螞蟻藥神器室內殺蜘蛛  -0.19702  437756.89  0.009412  5
    8   滅蟑螂藥煙劑殺蜘蛛驅煙霧彈神器克星家用全窩端除螞蟻蜈蚣煙熏片  -0.04116  424747.11  0.009132  5
    16  綠葉老鼠貼強力粘鼠板沾滅鼠神器膠藥yao顆粒捕鼠器克星正品家用  0.05318  328698.54  0.007067  5
    • Mainly cockroach control, with a small share of insecticides and rodent control

    • 科凌蟲控 question marks

    klque1
        商品  交易增長幅度  交易金額  交易金額占比  商品個數
    2   天然除螨蟲包噴霧劑中草藥祛防殺去螨蟲墊貼床上用品家用驅蟲神器  407.98265  290170.09  0.006239  2
    4   抓老鼠貼強力粘鼠板膠藥沾滅鼠器捕鼠神器克星家用正品20張一窩端  4.21950  119948.74  0.002579  3
    18  老鼠籠捕鼠器家用一窩端連續全自動強力撲捉抓滅老鼠夾子捕鼠神器  1.31100  62508.93  0.001344  1
    25  蟑螂藥蟑螂克星家用非無毒全窩端室內廚房南方大蟑螂一窩端臟螂藥  1.26835  86020.53  0.001849  2
    19  老鼠貼強力粘鼠板正品一窩端滅鼠器老鼠克星膠yao顆粒家用10片裝  0.88335  81123.80  0.001744  2
    27  跳蚤殺蟲劑家用氣霧劑除螨蟲噴霧潮蟲百蟲靈蜈蚣蟑螂藥滅螞蟻虱子  0.56420  31783.47  0.000683  1
    12  科凌蟲控蟑螂藥南方大蟑螂強力殺蟑餌劑蟑螂膠餌家用全窩端滅蟑清  0.54140  32057.62  0.000689  1
    24  蟑螂藥粉德國小蠊小強專殺滅蟑螂屋紙盒子家用蟑螂克星全窩端廚房  0.30630  125300.73  0.002694  3
    11  科凌蟲控老鼠貼超強力粘鼠板日本版捉抓老鼠夾膠家用滅鼠藥捕鼠器  0.21870  99998.35  0.002150  3
    6   汽車家用驅鼠劑防老鼠克星噴霧劑耗子發動機艙包防鼠滅鼠藥驅鼠器  0.13730  25665.98  0.000552  1
    3   抓老鼠夾子捕鼠器籠家用連續全自動逮捉老鼠籠超強撲鼠籠滅鼠神器  0.07335  51645.49  0.001110  2
    0   4 只裝驅老鼠夾捕鼠器家用滅鼠神器抓殺撲老鼠夾子捉老鼠籠全自動  0.05480  289907.35  0.006233  5
    • Mite removal shows the most potential

    • Summary: 科凌蟲控 is actively developing several product lines, but each quadrant is occupied by a different line (cash cows in cockroach control, stars in rodent control, potential in mite removal), so no line has a follow-up product behind it and overall competitiveness is limited

    Traffic structure analysis

    os.chdir('..')
    os.chdir('./流量渠道數據')
    filenames4 = glob.glob('*.xlsx')
    filenames4
    ['安速家居旗艦店流量渠道.xlsx', '拜耳官方旗艦店流量渠道.xlsx', '科凌蟲控旗艦店流量渠道.xlsx']
    • Bayer
    df5bai = pd.read_excel(filenames4[1])
    df5bai.head()
       流量來源  交易指數  交易指數.1
    0  淘內免費  399466  320128
    1  手淘搜索  336457  274916
    2  淘內免費其他  195308  153255
    3  手淘問大家  123512  108108
    4  手淘旺信  88024  59198
    • The transaction index (交易指數) reflects sales

    • Define a function that plots and summarizes the traffic structure

    • Only the 10 channels with the highest transaction index are analyzed

    def flow(df):
        df0 = df.copy()
        # Keep the 10 channels with the highest transaction index
        top10 = df0.sort_values('交易指數', ascending=False).reset_index(drop=True).iloc[:10, :].copy()
        top10['交易指數占比'] = top10['交易指數'] / top10['交易指數'].sum()
        top10.set_index('流量來源', inplace=True)
        # Paid channels are pulled out of the pie for emphasis
        paid = ['付費流量', '直通車', '淘寶客', '淘寶聯盟']
        ind = top10.index.isin(paid)
        explode = ind * 0.1
        ax = top10['交易指數占比'].plot.pie(autopct='%.1f%%', figsize=(8, 8), colormap='cool', explode=explode)
        ax.set_ylabel('')
        plt.show()
        # Summary line: total index, paid share, index attributable to paid channels
        paidsum = top10['交易指數占比'][ind].sum()
        salesum = top10['交易指數'].sum()
        paidsale = salesum * paidsum
        print(f'前10流量中:總交易指數:{salesum:.0f};付費流量占比:{paidsum*100:.2f}%;付費流量帶來交易指數:{paidsale:.0f}')
        return top10

    bai5top10 = flow(df5bai)

    前10流量中:總交易指數:2334051;付費流量占比:21.85%;付費流量帶來交易指數:509959
    • Detailed figures for the top 10 channels
    bai5top10
    流量來源  交易指數  交易指數.1  交易指數占比
    淘內免費  399466  320128  0.171147
    手淘搜索  336457  274916  0.144152
    自主訪問  312587  234293  0.133925
    購物車  251600  186323  0.107795
    付費流量  223315  206480  0.095677
    我的淘寶  205162  151825  0.087900
    淘內免費其他  195308  153255  0.083678
    直通車  187952  147463  0.080526
    手淘問大家  123512  108108  0.052917
    淘寶客  98692  135320  0.042284
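The paid-share figure can also be computed on its own, without drawing the pie. A minimal sketch on made-up numbers; the function name `paid_share` and the toy values are assumptions, while the channel list is the one used in `flow`.

```python
import pandas as pd

# Same paid-channel labels as in flow()
PAID_CHANNELS = ['付費流量', '直通車', '淘寶客', '淘寶聯盟']

def paid_share(df: pd.DataFrame, n: int = 10) -> float:
    """Share of the transaction index that the top-n table attributes to paid channels."""
    top = df.nlargest(n, '交易指數').set_index('流量來源')
    share = top['交易指數'] / top['交易指數'].sum()
    return float(share[share.index.isin(PAID_CHANNELS)].sum())

# Toy channel table (made-up values)
toy = pd.DataFrame({
    '流量來源': ['手淘搜索', '付費流量', '直通車'],
    '交易指數': [700, 200, 100],
})
toy_paid = paid_share(toy)
print(f'{toy_paid:.2%}')
```

One caveat when reading the result: the list contains 付費流量 alongside 直通車 and 淘寶客, which appear to be its sub-channels, so a share computed this way may count the same traffic twice. Restricting the table to one level of the channel hierarchy would avoid that.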
    • 安速
    df5an = pd.read_excel(filenames4[0])
    df5an.head()
       流量來源  交易指數
    0  淘內免費  119751
    1  手淘搜索  86389
    2  淘內免費其他  62653
    3  手淘問大家  31348
    4  手淘旺信  25514
    an5top10 = flow(df5an)

    前10流量中:總交易指數:748539;付費流量占比:18.58%;付費流量帶來交易指數:139048
    • Bayer and 安速 have a similar channel mix, but 安速's overall volume is much smaller, so Bayer's traffic is clearly more productive

    • 科凌蟲控

    df5kl = pd.read_excel(filenames4[2])
    df5kl.head()
       流量來源  交易指數
    0  淘內免費  320128
    1  手淘搜索  274916
    2  淘內免費其他  153255
    3  手淘問大家  108108
    4  手淘旺信  59198
    kl5top10 = flow(df5kl)

    前10流量中:總交易指數:1918111;付費流量占比:25.51%;付費流量帶來交易指數:489263
    • Overall volume is close to Bayer's, but 科凌蟲控 relies more on paid traffic

    Review sentiment analysis

    os.chdir('..')
    os.chdir('./評論輿情數據')
    • Read the data
    filenames5 = glob.glob('*.xlsx')
    filenames5
    ['安速.xlsx', '德國拜耳.xlsx', '科林蟲控.xlsx']
    df6bai = pd.read_excel(filenames5[1])
    df6bai.head()
       產品名稱  鏈接  評論頁碼  評論  評論日期
    0  德國拜耳拜滅士蟑螂藥一窩端殺蟑膠餌滅蟑螂屋無毒克星家用全窩端  https://detail.tmall.com/item.htm?id=527604730327  0  剛收到,家里廚房突然出現小強了,看了這個評價挺多挺好,銷量也大,趕緊定了三盒,一定要管用啊一...  2018-11-21 19:01:20
    1  德國拜耳拜滅士蟑螂藥一窩端殺蟑膠餌滅蟑螂屋無毒克星家用全窩端  https://detail.tmall.com/item.htm?id=527604730327  0  朋友推薦的說之前用的挺管用的。在放藥的前幾天就沒怎么見蟑螂了,然后出去玩之前把家里角角落落全...  2018-11-23 11:07:03
    2  德國拜耳拜滅士蟑螂藥一窩端殺蟑膠餌滅蟑螂屋無毒克星家用全窩端  https://detail.tmall.com/item.htm?id=527604730327  0  真心坑人啊!😂還沒到24小時就凝固了!小強依然活躍🤑🤑🤑🤑  2018-11-24 00:28:17
    3  德國拜耳拜滅士蟑螂藥一窩端殺蟑膠餌滅蟑螂屋無毒克星家用全窩端  https://detail.tmall.com/item.htm?id=527604730327  0  盆友推薦的,說特別好用,效果杠杠的,看雙十一做活動,就買啦,效果應該不錯吧,不過尸體都是家里...  2018-11-25 03:07:25
    4  德國拜耳拜滅士蟑螂藥一窩端殺蟑膠餌滅蟑螂屋無毒克星家用全窩端  https://detail.tmall.com/item.htm?id=527604730327  0  我是買到假貨嗎?那么貴的蟑螂藥居然還有,還是蟑螂已經百毒不侵了?  2018-11-26 07:49:43
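    • Extract the review column
    bai6 = list(df6bai['評論'])
    • Replace every run of characters that are neither Latin letters nor Chinese characters with a single space
    bai61 = [re.sub(r'[^a-z\u4E00-\u9Fa5]+', ' ', i, flags=re.I) for i in bai6]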
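To make the effect of the character class concrete, here is a small sketch that applies the same pattern to two sample strings. The helper name `clean` and the trailing `strip()` are additions for illustration and are not part of the original pipeline.

```python
import re

# Anything that is not a Latin letter or a CJK ideograph in U+4E00..U+9FA5
# is collapsed into a single space (digits, punctuation and emoji included).
NON_TEXT = re.compile(r'[^a-z\u4E00-\u9FA5]+', flags=re.I)

def clean(comment: str) -> str:
    return NON_TEXT.sub(' ', comment).strip()

print(clean('真心坑人啊!😂還沒到24小時就凝固了!'))
print(clean('好用!!! 24h OK'))
```

Note that digits are dropped as well, so quantities such as "24小時" lose their number. If numbers matter for the analysis, `0-9` could be added to the character class.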
    • A handy library for text processing: shorttext

    • Read the stop-word file and build the stop-word list

    stopwords = list(pd.read_csv('D:/data/python/百度停用詞表.txt', names=['stopwords'])['stopwords'])
    stopwords.extend([' '])
    • The loop below builds a list of lists; each inner list holds the tokens of one review
    bai62 = []
    for i in bai61:
        seg0 = pd.Series(jieba.lcut(i))  # full mode (cut_all=True) is also worth trying
        # Drop single-character tokens
        ind1 = pd.Series([len(j) for j in seg0]) > 1
        seg1 = seg0[ind1]
        # Drop stop words and de-duplicate within the review
        ind2 = ~seg1.isin(pd.Series(stopwords))
        seg2 = list(seg1[ind2].unique())
        # Skip reviews that are empty after filtering
        if len(seg2) > 0:
            bai62.append(seg2)
    Building prefix dict from the default dictionary ...
    Loading model from cache C:\Users\jiang\AppData\Local\Temp\jieba.cache
    Loading model cost 0.755 seconds.
    Prefix dict has been built succesfully.
    bai62[:2]
    [['收到','家里','廚房','小強','評價','銷量','趕緊','三盒','管用','后續','效果','追加','多久','才能','消滅','干凈','沒法','做飯','進去','擔心','揮發','很多','試試'],
     ['朋友','推薦','管用','放藥','幾天','蟑螂','出去玩','家里','角角落落','全都','點涂','四天','回來','開門','內心','忐忑','居然','一只','沒見','真的','錯峰','出行','但愿','第二次','購買','超級','好用','翻爛','兩支','一支','點上']]
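The three filtering rules (length, stop words, de-duplication) can be checked in isolation from the tokenizer. In this sketch a hand-written token list stands in for the output of `jieba.lcut`; the function name `filter_tokens` and the sample stop words are assumptions.

```python
def filter_tokens(tokens, stopwords):
    """Keep multi-character, non-stop-word tokens, once each, in first-seen order."""
    kept = [t for t in tokens if len(t) > 1 and t not in stopwords]
    return list(dict.fromkeys(kept))  # dict keys preserve insertion order

stop = {'這個', '一定'}                              # hypothetical stop words
tokens = ['這個', '效果', '好', '效果', '管用']      # stand-in for jieba.lcut output
filtered = filter_tokens(tokens, stop)
print(filtered)
```

Using a set for the stop words keeps each membership test constant-time, which helps when the stop-word file is long and there are many reviews.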
    • Flatten the nested lists into one list
    bai63 = [y for x in bai62 for y in x]
    # Alternatively:
    # from itertools import chain
    # bai63 = list(chain(*bai62))
    bai63[:10]
    ['收到', '家里', '廚房', '小強', '評價', '銷量', '趕緊', '三盒', '管用', '后續']
    • Count word frequencies
    baifreq = pd.Series(bai63).value_counts()
    baifreq[:10]
    效果    541
    蟑螂    409
    雙十    145
    不錯    144
    評論    138
    小強    114
    收到    106
    用戶    100
    填寫    100
    東西     95
    dtype: int64
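Flattening and counting can also be done with the standard library alone. A minimal sketch on a toy nested list; the sample tokens are made up.

```python
from collections import Counter
from itertools import chain

# Toy stand-in for bai62: one token list per review, already de-duplicated
tokenized = [['效果', '蟑螂', '不錯'], ['效果', '小強'], ['蟑螂', '效果']]

# chain.from_iterable flattens lazily; Counter tallies the tokens
freq = Counter(chain.from_iterable(tokenized))
top = freq.most_common(2)
print(top)
```

Because tokens were de-duplicated within each review beforehand, each count is the number of reviews that mention the word, not the number of times the word was written.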
    • Join the tokens into one long string, separated by spaces
    bai64 = ' '.join(bai63)
    • Draw the word cloud
    mask = imageio.imread('D:/data/python/leaf.jpg')
    # For Chinese text, a font with Chinese glyphs must be specified
    font = r'C:\Windows\Fonts\simkai.ttf'
    wc = WordCloud(background_color='wheat', mask=mask, font_path=font).generate(bai64)
    plt.figure(figsize=(8, 8))
    plt.imshow(wc)
    plt.axis('off')
    plt.show()

    • Save the word cloud to a file
    wc.to_file('D:/data/python/拜耳輿情詞云.png')
    <wordcloud.wordcloud.WordCloud at 0x153001f4588>
    • Keyword extraction based on TF-IDF (top 20 keywords with weights)
    jieba.analyse.extract_tags(bai64, 20, True)
    [('蟑螂', 0.35868827490141125),
     ('效果', 0.2849843733535205),
     ('雙十', 0.12792281324949317),
     ('小強', 0.09200132125972145),
     ('評論', 0.08243514073285675),
     ('濕巾', 0.0800848241710357),
     ('填寫', 0.07859912763569854),
     ('不錯', 0.07854611456578228),
     ('好評', 0.07205703576029969),
     ('追評', 0.06743985193350374),
     ('沒用', 0.06355521640160776),
     ('收到', 0.06240940420464874),
     ('用戶', 0.06013517913107095),
     ('好用', 0.05909939270004407),
     ('尸體', 0.05479502199496166),
     ('劃算', 0.05456206954124107),
     ('濕紙巾', 0.05268657021031292),
     ('家里', 0.04667477953221154),
     ('發貨', 0.04579483339359189),
     ('期待', 0.04317101167929661)]
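The ranking above differs from the raw frequency ranking because TF-IDF multiplies each term's frequency by a weight that is higher for words that are rare in general text. The sketch below shows the idea on toy data; the IDF values, the default IDF for unknown words and the function name `tfidf` are made up for illustration and are not jieba's actual figures.

```python
from collections import Counter

def tfidf(tokens, idf, default_idf):
    """Rank tokens by term frequency times inverse document frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    weights = {
        word: count / total * idf.get(word, default_idf)
        for word, count in counts.items()
    }
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)

tokens = ['效果', '效果', '蟑螂', '濕巾']
idf = {'效果': 2.0, '蟑螂': 8.0}   # hypothetical IDF values
ranked = tfidf(tokens, idf, default_idf=10.0)
print(ranked)
```

In the toy input the most frequent word ends up last because its IDF is low, which is the same effect that lets a relatively infrequent word such as 濕巾 appear among the extracted keywords.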
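    • Judging by both the word cloud and the keywords, the reviews lean positive and show no obvious problems
    • Adding 好評 and 蟑螂 to the stop words and rerunning the analysis may surface more informative terms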
