How to Calculate the Cost of Data Downtime
In addition to wasted time and sleepless nights, data quality issues lead to compliance risks, lost revenue to the tune of several million dollars per year, and erosion of trust. But what does bad data really cost your company? I’ve created a novel data downtime calculator that will help you measure the true financial impact of bad data on your organization.
What’s big, scary, and keeps even the best data teams up at night?
If you guessed the ‘monster under your bed,’ nice try, but you’d be wrong. The answer is far more real, all-too-common, and you’re probably already experiencing it whether or not you realize it.
The answer? Data downtime. Data downtime refers to periods of time when your data is partial, erroneous, missing, or otherwise inaccurate, ranging from a few null values to completely outdated tables. These data fire drills are time-consuming and costly, corrupting otherwise excellent data pipelines with garbage data.
The true cost of bad data
One CDO I spoke with recently told me that his 500-person team spends 1,200 cumulative hours per week tackling data quality issues, time that could otherwise be spent on activities that drive innovation and generate revenue.
To demonstrate the scope of this problem, here are some fast facts about just how much time data teams waste on data downtime:
50–80 percent of a data practitioner’s time is spent collecting, preparing, and fixing “unruly” data. (The New York Times)
40 percent of a data analyst’s time is spent on vetting and validating analytics for data quality issues. (Forrester)
27 percent of a salesperson’s time is spent dealing with inaccurate data. (ZoomInfo)
50 percent of a data practitioner’s time is spent on identifying, troubleshooting, and fixing data quality, integrity, and reliability issues. (Harvard Business Review)
Based on these numbers, as well as interviews and surveys conducted with over 150 different data teams across industries, I estimate that data teams spend 30–40 percent of their time handling data quality issues instead of working on revenue-generating activities.
The cost of bad data is more than wasted time and sleepless nights; there are serious compliance, financial, and operational implications that can catch data leaders off guard, impacting both your team’s ROI and your company’s bottom line.
Compliance risk
For several decades, the medical and financial services sectors, with their responsibility to protect personally identifiable information (PII) and their stewardship of sensitive customer data sources, were the poster children for compliance.
Now, with nearly every industry handling user data, companies from e-commerce sites to dog food distributors must follow strict data governance mandates, from GDPR to CCPA, and other privacy protection regulations.
Bad data can manifest in any number of ways, from a mistyped email address to misreported financials, and can have serious ramifications down the road. For instance, in Vermont, outdated information about whether or not a customer wants to renew their annual subscription to a service can spell the difference between a seamless user experience and a class action lawsuit. Such errors can lead to fines and steep penalties.
Lost revenue
It’s often said that “time is money,” but for any company seeking the competitive edge, “data is money” is more accurate.
One of the most explicit links I’ve found between data downtime and lost revenue is in financial services. In fact, one data scientist at a financial services company that buys and sells consumer loans told me that a field name change can result in a $10M loss in transaction volume, or a week’s worth of deals.
Behind these numbers is the reality that firefighting data downtime incidents not only wastes valuable time but tears teams away from revenue-generating projects. Instead of making progress on building new products and services that can add material value for your customers, data engineering teams spend time debugging and fixing data issues. A lack of visibility into what’s causing these problems only makes matters worse.
Erosion of data trust
The insights you derive from your data are only as accurate as the data itself. In fact, it’s my firm belief that numbers can lie and using bad data is worse than having no data at all.
Data won’t hold itself accountable, but decision makers will, and over time, bad data can erode organizational trust in your data team as a revenue driver for the organization. After all, if you can’t rely on the data powering your analytics, why should your CEO? And for that matter, why should your customers?
To help you mitigate your data downtime problem, we put together a Data Downtime Cost Calculator that factors in how much money you’re likely to lose dealing with data downtime fire drills instead of working on revenue-generating activities.
Your Data Downtime Cost Calculator
As such, the annual cost of your data downtime can be measured by the engineering time or other resources you need to spend to resolve it.
I’d propose that the right data downtime calculator factors in the cost of labor to tackle these issues, your compliance risk (in this case, we used the average GDPR fines), and the opportunity cost of losing stakeholder trust in your data. Per earlier estimates, you can assume that around 30 percent of an engineer’s time will be spent tackling data issues.
Bringing this all together, your Data Downtime Cost Calculator is:
Labor Cost: ([Number of Engineers] × [Annual Salary of Engineer]) × 30%
+
Compliance Risk: [4% of Your Revenue in 2019]
+
Opportunity Cost: [Revenue you could have generated if you had moved faster, released X new products, and acquired Y new customers]
= $ Annual Cost of Data Downtime
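For readers who would rather plug numbers into code than into a spreadsheet, the calculator above could be sketched roughly as follows. This is a minimal illustration, not an official implementation: the class and function names (`DowntimeInputs`, `annual_downtime_cost`) are hypothetical, the 30 percent and 4 percent defaults are taken from the formula above, and the figures in the example are made up purely to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class DowntimeInputs:
    """Inputs to the calculator. All names here are illustrative assumptions."""
    num_engineers: int
    avg_engineer_salary: float          # annual salary, in dollars
    annual_revenue: float               # the article uses 2019 revenue
    opportunity_cost: float             # your own estimate of forgone revenue
    time_on_data_issues: float = 0.30   # share of engineer time, per the formula
    compliance_rate: float = 0.04       # share of revenue, per the formula


def annual_downtime_cost(inputs: DowntimeInputs) -> dict:
    """Return each term of the formula plus their sum, in dollars."""
    labor = (inputs.num_engineers * inputs.avg_engineer_salary
             * inputs.time_on_data_issues)
    compliance = inputs.annual_revenue * inputs.compliance_rate
    total = labor + compliance + inputs.opportunity_cost
    return {
        "labor": labor,
        "compliance": compliance,
        "opportunity": inputs.opportunity_cost,
        "total": total,
    }


# Made-up example figures, for illustration only.
example = DowntimeInputs(
    num_engineers=10,
    avg_engineer_salary=150_000,
    annual_revenue=50_000_000,
    opportunity_cost=1_000_000,
)
for term, dollars in annual_downtime_cost(example).items():
    print(f"{term}: ${dollars:,.0f}")
```

Keeping the two percentages as overridable defaults is a deliberate choice: if your team tracks the actual share of time spent on data issues, or your compliance exposure differs, you can pass your own values instead of the article's estimates.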
Keep in mind that this equation will vary by company, but we’ve found that our framework can get most teams started.
Measuring the cost of your data downtime is the first step towards fully understanding the implications of bad data at your company. Fortunately, data downtime is avoidable. With the right approach to data reliability, you can keep the cost of bad data at bay and prevent bad data from corrupting good pipelines in the first place.
Have another way to measure the impact of data downtime? Would love to hear from you!
Source: https://towardsdatascience.com/how-to-calculate-the-cost-of-data-downtime-c0a48733b6f0