How to Obtain and Analyse Fitbit Sleep Scores
Smartwatches and other wearable devices have gained popularity over the past couple of years and have given rise to the cultural phenomenon of the “Quantified Self”. Devices such as the Apple Watch or Fitbit have made it possible for anyone to easily self-track and thereby quantify their lives in some way. Popular self-quantifications include calories burnt, steps walked during the day or quality of sleep.
In this article, I will focus on the last of these, quality of sleep, using real-life data from approximately one year of Fitbit usage. Fitbit provides users with a Sleep Score, which is supposed to be a measure of sleep quality. I will train and test different Machine Learning models using Python in an attempt to predict the Fitbit Sleep Score as accurately as possible, while explaining how different metrics, such as minutes of REM sleep, affect the score.
The article is structured as follows:
Because there is a lot to cover, I split the article into three parts. Part 1 covers points 1 through 4 and focuses on getting the sleep data, preprocessing and visualising it. Part 2 covers points 5 through 10, i.e. actually building the Machine Learning models based on the preprocessed data from part 1. Part 3 covers the rest and is all about improving the models from part 2 to get the most accurate predictions possible.
What exactly is the Fitbit Sleep Score?
The Fitbit Sleep Score is best described through an example, so here are two screenshots of what the App provides to its users:
[Screenshots: sleep statistics provided by Fitbit]
In the Fitbit App, users are given a Sleep Score (78 in this case), a graphical representation of the sleep stages across the sleep window, a breakdown of these sleep stages in both minutes and percent, and an estimated oxygen variation.
This in and of itself seems fairly straightforward. Fitbit just has some algorithm into which they plug the relevant sleep statistics, such as minutes spent in REM sleep, and it spits out the Sleep Score.
To anyone with a Fitbit who has ever tried to understand patterns in their Sleep Score, however, it is clear that this is far from straightforward. The screenshots below show where the confusion comes from:
[Screenshots: more sleep statistics provided by Fitbit]
Comparing these sleep statistics to the first ones tells us the following:
- Time asleep is more than an hour longer
- Time spent in REM sleep is almost the same
- Time spent in deep sleep is a lot longer
Based on these observations one would expect the second sleep score to be higher than the first one but it is actually the same. What is going on here? What role do the different statistics play in the calculation of the Sleep Score? Is it possible to predict Sleep Scores yourself by only looking at the sleep statistics provided?
This article answers all those questions and provides a detailed walk-through of a Machine Learning project. I hope you enjoy it!
Getting the sleep data from Fitbit
Fitbit allows users to export sleep data in CSV files through their online dashboards. This process turned out to involve a bit of manual labor because Fitbit only allows a maximum of 31 days of data to be exported at a time. A few minutes later I had all the data and quickly combined them into one CSV file.
There was one problem. The manually exported CSV files included all of the sleep statistics (Minutes Asleep, Minutes Awake, Minutes REM Sleep, etc.) but did not include the actual sleep score. What the hell?!
After some digging, I discovered that there was another export option called “Lifetime Export”, which exports all the data Fitbit has collected on you ever since you started wearing their watch. You have to request this export from Fitbit before being able to download it and once approved you can download a zip folder with all sorts of different files. Included in that zip folder is a CSV file with additional sleep statistics, including the Sleep Score.
I saved the CSV file containing the sleep statistics as sleep_stats.csv and the CSV file containing the Sleep Scores as sleep_score.csv. Let’s move on to Python.
Data cleaning and preparation
This section explains how to get from the CSV file to a DataFrame that is ready to be used in Machine Learning models. In the process, I encountered some common problems that can arise when importing data into Python and I explain how to deal with them in order to end up with a neatly preprocessed data set.
After importing all the relevant libraries (see the full notebook for the libraries) the first step is to import the sleep data from the CSV files into Python using the pd.read_csv() function:
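The import step can be sketched roughly as follows. This is an illustration rather than the notebook's exact code: the in-memory files, the sample row, and every column name other than those mentioned in the text (for example `timestamp`, `overall_score`, `Time in Bed`) are assumptions made so that the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two files so the sketch runs on its own.
# With the real exports, pass "sleep_stats.csv" and "sleep_score.csv" instead.
sleep_stats_file = io.StringIO(
    "Sleep,,,,,,,,\n"
    "Start Time,End Time,Minutes Asleep,Minutes Awake,Number of Awakenings,"
    "Time in Bed,Minutes REM Sleep,Minutes Light Sleep,Minutes Deep Sleep\n"
    "2020-03-01 11:05PM,2020-03-02 7:10AM,430,55,30,485,95,260,75\n"
)
sleep_score_file = io.StringIO(
    "timestamp,overall_score,composition_score\n"
    "2020-03-02T07:10:30Z,78,19\n"
)

sleep_stats_data = pd.read_csv(sleep_stats_file)
# usecols=[0, 1] keeps only the first two columns (date and score).
sleep_score_data = pd.read_csv(sleep_score_file, usecols=[0, 1])

print(sleep_stats_data.head())
print(sleep_score_data.head())
```

In this made-up sample the real column names land in the first data row of `sleep_stats_data`, which is the kind of symptom discussed next.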
I only import the first two columns of sleep_score.csv as they are the ones that contain the date and the actual sleep score; all other relevant data is found in sleep_stats.csv. Let’s have a look at the first five rows in sleep_stats_data:
This is the first common problem: because of the way the CSV file is structured, the column names are in the first row. Here is one way to fix this problem:
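One possible fix, sketched on a toy DataFrame with made-up values (the variable names `raw` and `fixed` are hypothetical), is to promote the first row to the header and then drop it from the data:

```python
import pandas as pd

# Toy frame mimicking the symptom: the real column names sit in row 0.
raw = pd.DataFrame(
    [
        ["Start Time", "End Time", "Minutes Asleep"],
        ["2020-03-01 11:05PM", "2020-03-02 7:10AM", "430"],
        ["2020-03-02 10:40PM", "2020-03-03 6:55AM", "455"],
    ],
    columns=["Sleep", "Unnamed: 1", "Unnamed: 2"],
)

fixed = raw.copy()
fixed.columns = fixed.iloc[0]                  # use the first row as header
fixed = fixed.iloc[1:].reset_index(drop=True)  # drop that row from the data
fixed.columns.name = None                      # remove the leftover axis name

print(fixed)
```

An alternative is to avoid the problem at import time by passing `header=1` to `pd.read_csv()`, which tells pandas to take the column names from the second line of the file.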
Using the .info() function we can obtain a high-level summary of the data in the DataFrame, which in our case looks like this:
Here, we encounter the second common problem: there are NaN values (missing data) in the last three columns. This is indicated by the fact that the summary above reports 322 entries (rows), while the non-null count for the last three columns is only 287. Let’s have a look at the rows that contain missing data using the following code:
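A minimal sketch of such a filter, using a small made-up DataFrame in place of the real data (the values and the light and deep sleep column names are assumptions):

```python
import numpy as np
import pandas as pd

# Toy data: the middle row mimics a short nap without sleep-stage values.
sleep_stats_data = pd.DataFrame(
    {
        "Minutes Asleep": [430, 48, 455],
        "Minutes REM Sleep": [95, np.nan, 101],
        "Minutes Light Sleep": [260, np.nan, 270],
        "Minutes Deep Sleep": [75, np.nan, 84],
    }
)

# Boolean mask that is True for every row with at least one NaN.
has_missing = sleep_stats_data.isnull().any(axis=1)
rows_with_nan = sleep_stats_data[has_missing]

print(rows_with_nan)
```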
If we look at the column Minutes Asleep or Start and End Time it becomes clear that these rows refer to afternoon naps that Fitbit recorded. Naps are too short for Fitbit to be able to reliably measure important sleep statistics and therefore we will drop all these rows from the data set:
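Dropping those rows can be done with `dropna()`. The sketch below again uses made-up values, with the second row standing in for a nap:

```python
import numpy as np
import pandas as pd

# Toy data: the second row mimics a nap with missing sleep-stage values.
sleep_stats_data = pd.DataFrame(
    {
        "Minutes Asleep": [430, 48, 455],
        "Minutes REM Sleep": [95, np.nan, 101],
        "Minutes Deep Sleep": [75, np.nan, 84],
    }
)

# Remove every row that contains at least one NaN and renumber the index.
sleep_stats_data = sleep_stats_data.dropna().reset_index(drop=True)

print(sleep_stats_data)
```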
In the above data summary, we also encounter the third common problem, which is related to the first: all columns are of data type “object” but columns with index 2 to 8 should clearly be numerical, i.e. either of data type “int” or “float”. The reason these columns are of data type “object” is most likely because the column headers were initially placed in the first row, thereby causing the entire column to be classified as “object”. Let’s convert these columns to data type “float”:
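The conversion could look roughly like this. The toy DataFrame has only four columns, so the slice `columns[2:]` plays the role of columns 2 to 8 in the real data; the values are made up.

```python
import pandas as pd

# Toy frame in which every column is of dtype "object", as in the summary.
sleep_stats_data = pd.DataFrame(
    {
        "Start Time": ["2020-03-01 11:05PM", "2020-03-02 10:40PM"],
        "End Time": ["2020-03-02 7:10AM", "2020-03-03 6:55AM"],
        "Minutes Asleep": ["430", "455"],
        "Minutes Awake": ["55", "49"],
    },
    dtype=object,
)

# Everything from column index 2 onwards holds numbers stored as text.
numeric_cols = sleep_stats_data.columns[2:]
sleep_stats_data = sleep_stats_data.astype({col: float for col in numeric_cols})

print(sleep_stats_data.dtypes)
```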
Let’s now have a look at the first few rows and the summary of sleep_score_data:
This DataFrame looks a lot better: the column headers were automatically recognised and there are no missing values.
For the purpose of further analyses, I would like to combine the two DataFrames into one, meaning I want to merge them. To ensure that the Sleep Scores end up in the row with the corresponding sleep statistics I need a column that is identical in both DataFrames, which will be used as the column to merge on.
In our case, both DataFrames have a column with some sort of timestamp. The sleep statistics DataFrame has a start and an end time and the Sleep Score DataFrame has a timestamp. Because a sleep score is always provided after awakening, the date that is relevant in the sleep statistics DataFrame is the end time and we can drop the start time. But there is one more issue: the format of the end time in the sleep statistics DataFrame is different from the format of the timestamp in the Sleep Score DataFrame. If we tried to merge the DataFrames on these columns, the rows would not be matched up. My solution was to create a “Date” column in both DataFrames that contains only the date, merge the DataFrames on those columns, drop the redundant columns and drop one row that contained a missing value after the merge. The following code accomplishes this:
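The steps described above can be sketched as follows. The two timestamp formats, the column names `timestamp` and `overall_score`, the left join, and all values are assumptions chosen for illustration; the third night deliberately has no score so that the merge produces a missing value.

```python
import pandas as pd

# Toy sleep statistics; the last night has no matching Sleep Score.
sleep_stats_data = pd.DataFrame(
    {
        "Start Time": ["2020-03-01 11:05PM", "2020-03-02 10:40PM", "2020-03-03 11:20PM"],
        "End Time": ["2020-03-02 7:10AM", "2020-03-03 6:55AM", "2020-03-04 7:00AM"],
        "Minutes Asleep": [430.0, 455.0, 410.0],
        "Number of Awakenings": [30.0, 27.0, 33.0],
    }
)
sleep_score_data = pd.DataFrame(
    {
        "timestamp": ["2020-03-03T06:55:30Z", "2020-03-02T07:10:30Z"],
        "overall_score": [84, 78],
    }
)

# Reduce both timestamps to a plain calendar date to merge on.
sleep_stats_data["Date"] = pd.to_datetime(
    sleep_stats_data["End Time"], format="%Y-%m-%d %I:%M%p"
).dt.date
sleep_score_data["Date"] = pd.to_datetime(sleep_score_data["timestamp"]).dt.date

# Left join keeps every night from the statistics; unmatched scores become NaN.
sleep_data = sleep_stats_data.merge(sleep_score_data, on="Date", how="left")

# Drop the date-related columns and the number of awakenings, then the NaN row.
sleep_data = sleep_data.drop(
    columns=["Start Time", "End Time", "timestamp", "Date", "Number of Awakenings"]
)
sleep_data = sleep_data.dropna().reset_index(drop=True)

print(sleep_data)
```

Merging on a date only works cleanly when there is at most one sleep record per day, which is another reason to remove naps before this step.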
The resulting combined DataFrame looks like this:
[Figure: merged and preprocessed data]
I dropped all columns related to dates because this is not a time series analysis and we do not need the dates going forward. The number of awakenings is not provided by the Fitbit app, and because I want to predict Sleep Scores using only data that is provided in the app, I dropped it as well.
With the combined and preprocessed DataFrame we can move on to some Exploratory Data Analysis.
Exploratory Data Analysis (EDA)
In this section I will use visualisations to provide a better understanding of the underlying data. These initial insights will be the foundation for later analyses.
First let’s have a look at the distribution of the Sleep Scores:
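A histogram is one simple way to look at this distribution. The sketch below uses plain matplotlib and a handful of made-up scores, not the real data, and also computes the sample skewness as a numerical check on the shape.

```python
import matplotlib

# Non-interactive backend so the sketch also runs without a display;
# remove this line when working in a notebook.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import pandas as pd

# Made-up Sleep Scores with a longer tail towards low values.
scores = pd.Series(
    [62, 71, 75, 78, 80, 82, 83, 84, 85, 86, 87, 88], name="overall_score"
)

fig, ax = plt.subplots(figsize=(8, 5))
ax.hist(scores, bins=range(60, 95, 5), edgecolor="black")
ax.axvline(scores.mean(), linestyle="--", color="red", label="mean")
ax.set_xlabel("Sleep Score")
ax.set_ylabel("Number of nights")
ax.set_title("Distribution of Sleep Scores")
ax.legend()

# Negative skewness means the tail points to the left (towards low scores).
skewness = scores.skew()
print(f"mean = {scores.mean():.1f}, skewness = {skewness:.2f}")
```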
The distribution of sleep scores is skewed to the left, which makes sense: bad nights of sleep are more likely than exceptionally good ones, for reasons such as staying out late or having to get up extremely early. In addition, the average sleep score is already relatively high at 82 (out of 100), so it is unlikely (basically impossible) to have many outliers far above the mean.
Let’s also have a look at the relationship that each individual feature has with the Sleep Score to get a sense of which features might be important and what their relationships to the Sleep Score are. I have defined a function that takes as inputs a DataFrame that contains the target variable in the last column as well as the number of columns to be contained in the entire plot. The number of columns determines how many subplots there are in each row. Here is the function:
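A function along these lines might look like the sketch below. The name `plot_feature_relationships`, the choice of scatter plots, and the toy data are assumptions; only the two inputs (a DataFrame with the target in the last column, and the number of subplot columns) come from the description above.

```python
import math

import matplotlib

# Non-interactive backend so the sketch also runs without a display;
# remove this line when working in a notebook.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import pandas as pd


def plot_feature_relationships(df, num_cols):
    """Scatter each feature against the target, which is the last column."""
    features = df.columns[:-1]
    target = df.columns[-1]
    num_rows = math.ceil(len(features) / num_cols)

    fig, axes = plt.subplots(
        num_rows, num_cols, figsize=(5 * num_cols, 4 * num_rows), squeeze=False
    )
    flat_axes = axes.ravel()

    for ax, feature in zip(flat_axes, features):
        ax.scatter(df[feature], df[target], alpha=0.6)
        ax.set_xlabel(feature)
        ax.set_ylabel(target)

    # Hide the unused subplots in the last row.
    for ax in flat_axes[len(features):]:
        ax.set_visible(False)

    fig.tight_layout()
    return fig


# Made-up data with four features and the target in the last column.
sleep_data = pd.DataFrame(
    {
        "Minutes Asleep": [380, 430, 455, 470, 400],
        "Minutes Awake": [60, 55, 49, 45, 58],
        "Minutes REM Sleep": [70, 95, 101, 110, 80],
        "Minutes Deep Sleep": [60, 75, 84, 90, 66],
        "overall_score": [72, 78, 84, 86, 75],
    }
)
fig = plot_feature_relationships(sleep_data, num_cols=3)
```

With four features and `num_cols=3` this produces a 2 x 3 grid in which the last two panels are hidden.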
Calling this function with the sleep_data DataFrame and num_cols=3 as inputs results in the following plots:
Taken by themselves, Minutes Asleep and Minutes REM Sleep seem to have the strongest positive relationship with Sleep Score. Generally speaking this makes sense because more time asleep should be a positive thing when thinking about sleep quality and therefore Sleep Score. The same is true for more time spent in REM sleep.
To complete the picture about the relationships between the different features and Sleep Score let’s have a look at the correlation matrix:
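A correlation matrix can be computed directly with pandas. The sketch below uses made-up values and the assumed target name `overall_score`, so the numbers it prints say nothing about the real data.

```python
import pandas as pd

# Made-up data; the real DataFrame has more features and many more rows.
sleep_data = pd.DataFrame(
    {
        "Minutes Asleep": [380, 430, 455, 470, 400],
        "Minutes REM Sleep": [70, 95, 101, 110, 80],
        "overall_score": [72, 78, 84, 86, 75],
    }
)

# Pairwise Pearson correlations between all numeric columns.
corr_matrix = sleep_data.corr()
print(corr_matrix.round(2))

# Correlation of every feature with the target, strongest first.
target_corr = (
    corr_matrix["overall_score"]
    .drop("overall_score")
    .sort_values(ascending=False)
)
print(target_corr.round(2))
```

Large off-diagonal values between features, rather than between a feature and the target, are the ones that hint at possible multicollinearity.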
Indeed, Sleep Score has the highest correlation with Minutes REM Sleep, closely followed by Minutes Asleep. Another important thing to note is that many of the features are highly correlated. This makes sense because more time asleep should lead to more time spent in all stages of sleep and the features will tend to move together. While this may be an inevitable by-product of the nature of the features included here it could lead to multicollinearity issues down the road. More on this later.
Part 2 builds on the preprocessed data and the insights from the Exploratory Data Analysis to build a couple of different Machine Learning models that predict Sleep Scores. Part 2 can be found here:
Original article: https://towardsdatascience.com/how-to-obtain-and-analyse-fitbit-sleep-scores-a739d7c8df85