Relationship management is one of the determining factors of business health. A key part of that relationship is the ability to identify when a customer is likely to cancel a service, which is why initiatives that maximize customer retention are necessary.
As a result, projects that identify customers prone to churn have become a frequent concern for organizations, since the cost of retention is usually lower than the cost of acquisition.
Although the churn problem has gained the attention of many companies, there is no magic formula for solving it. The solution can also be complex in its own right: for example, it may require identifying the reason for churn so that different retention strategies can be applied.
Challenges
Is the cost of acquiring new customers greater than the cost of retention?
It is essential to weigh the financial and strategic expenses of acquiring and retaining customers, since for some companies the cost of acquisition may be 5x higher than the cost of retention.
What type of churn will be treated?
It is important to highlight that churn for a product or service occurs in several ways, such as:
Voluntary: when the customer chooses to cancel the service due to dissatisfaction or preference for a competitor.
Silent: when a customer stops using the service for a long period without it generating any costs, as with a credit card that has no monthly fee.
Involuntary: when the customer does not intend to cancel the service but, through negligence, ends up with the plan not renewed or canceled because of irregular use, missed payments, or similar causes.
How much do your experts know about the problem?
Having a skilled team is very important for judging whether the project can be executed internally or needs outsourced help. Tailored solutions and well-prepared professionals help overcome the challenges of the problem and obtain rich, applicable results.
Do you have a database that allows you to extract information about the business and its customers?
A solid database makes project execution much more feasible and produces robust, reliable results. It is a fundamental step toward knowing your customers and, consequently, understanding how to map and develop your solution. Which brings us to the next question:
How well do you know your clients?
It is also necessary to diagnose how your actions affect customers, and for that you need to gather the information that defines their individual profile and behavior. This analysis is the key to identifying whether or not they are prone to churn.
Ways to solve
When it comes to solving the problem, the team of experts has a few more challenges to overcome. The first is combining technical knowledge with business understanding, since exploratory analysis and feature engineering must take the organizational model into account to be successful.
Once the features are consolidated and business insights incorporated, it is time to start modeling. At this stage you may encounter imbalanced data: when you split the base into people who churned and people who remained with the service, you may find a far higher proportion of loyal customers.
The biggest problem with imbalanced data is that, if it is not addressed, machine learning algorithms tend to perform well only on the majority class. This produces many false negatives, because the model is inclined to classify customers who are likely to leave as loyal.
Techniques to deal with imbalanced data
At this point, techniques are needed to address the imbalanced dataset and improve how well the model separates customer behavior. Some of the most common are oversampling, undersampling, SMOTE and ADASYN. None of them is a general-purpose fix, so each problem has to be treated according to its specifics.
Undersampling and oversampling are the more elementary techniques: the former reduces the more heavily represented class, and the latter expands the less represented one.
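As a rough illustration of the two elementary techniques, the sketch below balances a toy dataset by random undersampling and random oversampling. The function names and the toy customer records are hypothetical, and the code assumes the majority class is the larger one; a real project would more likely rely on a dedicated library.

```python
import random

def random_undersample(majority, minority, seed=0):
    """Keep a random subset of the majority class, the same size as the minority class."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), list(minority)

def random_oversample(majority, minority, seed=0):
    """Duplicate randomly chosen minority rows until both classes have the same size."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

# Hypothetical toy data: 95 loyal customers and 5 churners.
loyal = [("loyal", i) for i in range(95)]
churned = [("churn", i) for i in range(5)]

under_loyal, under_churn = random_undersample(loyal, churned)
over_loyal, over_churn = random_oversample(loyal, churned)

print(len(under_loyal), len(under_churn))  # 5 5
print(len(over_loyal), len(over_churn))    # 95 95
```

Undersampling discards information from the majority class, while oversampling repeats the same minority rows, so resampling is usually applied to the training split only.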
SMOTE and ADASYN are more complex and generate synthetic samples from the data. The two strategies are similar, but ADASYN uses the density distribution to decide where to create the synthetic elements.
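The core idea behind SMOTE can be sketched in a few lines: pick a minority point, pick one of its nearest minority neighbours, and create a new point somewhere on the segment between them. The function `smote_like` below is a simplified, hypothetical illustration of that interpolation step, not the reference implementation; ADASYN follows the same interpolation but generates more samples around minority points that are harder to learn.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding the point itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical churners described by two numeric features.
points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote_like(points, n_new=6)
print(len(new_points))  # 6
```

Because every synthetic point lies between two real minority points, the new samples stay inside the region the minority class already occupies.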
Understand the performance of your churn solution
The churn model must be built around the expected responses, with attention to performance and to how the output should be presented. When measuring model performance, it is important to choose the right evaluation metric. Accuracy, for example, can give a false impression of a stunning model when the result comes only from correctly classifying the majority class, in which there is no churn.
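A small worked example makes the accuracy trap concrete. The numbers are invented for illustration: 100 customers, 5 of whom churn, and a "model" that simply predicts that nobody churns.

```python
# Hypothetical labels: 1 = churned, 0 = stayed.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100  # a model that always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
churners_found = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
churn_recall = churners_found / sum(y_true)

print(accuracy)      # 0.95
print(churn_recall)  # 0.0
```

The model scores 95% accuracy while identifying none of the customers who actually left.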
[Image: Walber, on Wikipedia]
Such evaluation can be centered on how much the solution improves your current retention strategy. If we assume that retention actions are currently applied to randomly chosen clients, we can evaluate how much the sample indicated by the model improves the selection of clients prone to churn.
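One common way to express this comparison is lift: the churn rate among the customers the model ranks highest, divided by the overall churn rate that random selection would reach on average. The sketch below uses a hypothetical helper, `lift_at_k`, and invented scores and labels.

```python
def lift_at_k(scores, churned, k):
    """Churn rate among the k highest-scored customers divided by the overall churn rate."""
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(churned) / len(churned)
    return top_rate / base_rate

# Hypothetical model scores and observed outcomes (1 = churned).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
churned = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift = lift_at_k(scores, churned, k=3)
print(round(lift, 2))  # 2.22
```

A lift above 1 means that targeting the model's top picks reaches more churners than contacting the same number of customers at random.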
Traditional evaluation metrics, such as precision and recall, can also be quite useful. Precision is the number of correct churn indications divided by the total number of churn indications, while recall is the percentage of churned clients correctly classified out of the total number of churners. Another metric is the F1 score, which can be described as:
F1 = 2 * (precision * recall) / (precision + recall)
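The three metrics can be computed directly from the counts of true positives, false positives and false negatives. The helper name and the labels below are illustrative only.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the churn class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical outcomes: 4 real churners, 4 customers flagged, 3 of them correctly.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)  # 0.75 0.75 0.75
```

The zero checks return 0.0 when the model flags nobody or when there are no churners, instead of raising a division error.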
Understanding the results
To choose the metric to be used, it is crucial to understand the operational cost of retaining a customer relative to the expected future revenue from that customer (lifetime value, LTV).
Customers with a high LTV may justify a higher retention expense, while customers with a low LTV may not justify the investment needed to retain them.
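One simple way to reason about this trade-off is an expected-value calculation. The formula below is an assumption made for illustration, not something stated in the article: it multiplies the probability of churn, the chance that the retention offer works (`save_rate`, hypothetical) and the LTV, then subtracts the cost of the offer.

```python
def expected_retention_gain(churn_probability, ltv, offer_cost, save_rate):
    """Expected revenue saved by a retention offer minus its cost (illustrative formula)."""
    return churn_probability * save_rate * ltv - offer_cost

# Hypothetical customers with the same churn risk but different lifetime values.
high_ltv_gain = expected_retention_gain(0.4, ltv=2000, offer_cost=50, save_rate=0.3)
low_ltv_gain = expected_retention_gain(0.4, ltv=200, offer_cost=50, save_rate=0.3)

print(high_ltv_gain > 0)  # True: the offer is expected to pay for itself
print(low_ltv_gain > 0)   # False: the offer costs more than it is expected to save
```

Under these assumed numbers, the same offer is worthwhile for the high-LTV customer and loses money on the low-LTV one.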
Knowing the parameters for retaining a customer makes it possible to scope the retention operation and decide how tolerant it can be of wrongly classified customers. That tolerance is directly related to the penalty for false positives, that is, loyal customers classified as churners.
If the cost of the retention operation is low, you can choose to flag more customers and thereby capture most of the real churners, at the price of more false positives. Conversely, if the cost is high, it is essential to focus on the precision of the selected group in order to avoid unnecessary expenses.
In classification models, the default threshold for classifying a client as a churner is a probability of leaving the service above 50%. This limit can be changed to suit the business: for example, if higher precision is required, we can treat as churners only those clients with a probability above 70%.
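The effect of moving the threshold can be seen on a handful of predicted probabilities. The values and the helper `flag_churners` are hypothetical.

```python
def flag_churners(probabilities, threshold=0.5):
    """Flag a customer as a churner when the predicted probability exceeds the threshold."""
    return [p > threshold for p in probabilities]

# Hypothetical churn probabilities produced by a classifier.
probs = [0.95, 0.72, 0.65, 0.55, 0.30, 0.10]

default_flags = flag_churners(probs)       # default 50% threshold
strict_flags = flag_churners(probs, 0.7)   # stricter 70% threshold

print(sum(default_flags))  # 4
print(sum(strict_flags))   # 2
```

Raising the threshold shrinks the flagged group to the customers the model is most confident about, which typically trades recall for precision.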
[Image: Sin-Yi Chou, on GitHub]
The model
The expected output can influence the strategy used to solve the problem. In addition to classification algorithms, which give binary responses, there are approaches based on survival and hybrid models. Survival analysis models do not classify customers as prone to churn or not; the response they generate is a curve that can be used to track each client's probability of churning over time.
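To give a feel for what a survival curve looks like, the sketch below implements the Kaplan-Meier estimator, a standard non-parametric method chosen here for illustration; the article does not say which survival model it has in mind. It estimates one curve for a whole group of customers, whereas per-customer curves require models that use customer features. The durations and churn flags are invented.

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival probability)] at each time where churn was observed."""
    events = sorted(zip(durations, churned))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        churn_count = 0
        leaving = 0
        # Group all customers whose observation ends at time t.
        while i < len(events) and events[i][0] == t:
            churn_count += events[i][1]
            leaving += 1
            i += 1
        if churn_count:
            survival *= 1 - churn_count / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # churned and censored customers both leave the risk set
    return curve

# Hypothetical tenures in months; 1 = churned, 0 = still active when observation ended.
durations = [2, 3, 3, 5, 8, 8, 12]
churned = [1, 1, 0, 1, 0, 1, 0]

curve = kaplan_meier(durations, churned)
for month, prob in curve:
    print(month, round(prob, 3))
```

Customers who are still active when observation ends are treated as censored: they count toward the number at risk until their last observed month but never register as churn.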
To handle survival analysis problems that involve complex, non-linear risk functions, models have been developed that extend binary classification and transform its results into survival analysis. Such models are known as hybrid models; examples include RF-SRC, deepSurv and WTTE-RNN.
Conclusion
In summary, churn modeling is vital for companies that want to retain customers and reduce costs. Its success depends on several aspects, ranging from knowledge of the customer base to the complexity and robustness of the model. If you have any questions, feel free to contact me!
Translated from: https://towardsdatascience.com/unraveling-churn-and-its-challenges-a207276ff4a9