10x Machine Learning Productivity with Stellar Questionnaire

Published: 2023/12/15

With the availability of massive data and computation, Machine Learning (ML) and other areas of artificial intelligence are growing rapidly. AI has become the need of the hour. To keep up, almost every company is either starting a new Data Science/Machine Learning department or expanding rapidly, with multiple projects in the pipeline. We now have more ML competitions and hackathons than ever before. Every day there are new courses focusing entirely on Python libraries and Machine Learning APIs. People share the latest ML algorithms, computations, graphs, charts, and code snippets daily, focusing on technical aspects and implementations.

Given this overload of technical information, less attention goes to the Machine Learning project discovery or requirement-gathering session, which focuses on the business aspects of the problem. Having been in this field for the past few years, I have seen many ML projects succeed and fail. I strongly believe that, as with any other project, the requirement or discovery session is one of the prime deciders between the success and failure of an ML project.

So, let's start our journey towards making the complex simple and enhancing the ML productivity of your team by 10x.

What is Machine Learning?

Machine learning is a branch of artificial intelligence (AI) that uses historical data to find patterns and relationships between input and output data attributes, and produces an optimized mathematical function (also called a model) that captures the relationship. The model is then used to predict the output for new data.
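
To make the definition concrete, here is a minimal sketch in plain Python. It fits a straight line to a handful of historical input/output pairs with ordinary least squares and then predicts the output for a new input. The data values and the names `fit_line` and `predict` are made up for illustration; a real project would usually rely on a library and far more data.

```python
# Historical data: one input attribute and the observed output.
# These numbers are invented for illustration and follow y = 2x + 1 exactly.
history_x = [1.0, 2.0, 3.0, 4.0, 5.0]
history_y = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the function (the model) from historical data.
slope, intercept = fit_line(history_x, history_y)


def predict(x):
    """Apply the learned model to a new input."""
    return slope * x + intercept


# "Prediction": use the model on data it has not seen.
new_prediction = predict(6.0)
print(slope, intercept, new_prediction)
```

The pair `(slope, intercept)` is the "optimized mathematical function" in the definition above: it is learned from history and then reused on new inputs.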

How to do better discovery for Machine Learning?

  • Unbiased View: At the very first introduction/discovery meeting, go in with an unbiased view and a clear mind, and note the high-level requirements carefully. The project may be a Machine Learning one, or it may just need a logical model. Be open and receptive to the scope only.

  • Do not take anything at face value: You will hear the initial stakeholders say machine learning/deep learning many times; do not worry about that at this stage. Do not think about any technology, model, or other complexities as you progress through this meeting. This will help you grasp the idea behind the project.

  • Create Discovery Notes: After the meeting, spend the next 30 minutes writing your understanding of the scope in a plain document, and decide whether the scope qualifies as a machine learning project.

  • Involve Other Team Members: After documenting the scope, discuss it with your team and confirm that their understanding matches yours.

  • Share Your Scope: Share your understanding of the scope with the initial stakeholders for confirmation, and review it for any gaps.

  • Build excellent questionnaires: Now you have your initial sign-off and are ready for detailed discovery. As a data scientist, you are the detective investigating and solving the mystery. At this stage, you will build three different questionnaires: Process Questionnaire, Data Questionnaire, and Architecture Questionnaire.

Process Questionnaire

The process questionnaire targets business stakeholders such as application owners, process owners, business analysts, and process analysts. It aims to understand the existing and new process flows in detail. The questions below are intentionally generic and have to be tuned to your situation:

  • What is the overall goal of the project?
  • Overview of the current process at a high level.
  • What are the challenges in the current system (manual work, bad predictions, etc.)?
  • What metrics are being used to measure inefficiency in the current process?
  • How many months of historical data are being used to predict today?
  • Are we going to use the same span of historical data or a much wider span?
  • Is there a feedback system available to capture data that would increase efficiency?
  • Are we planning to introduce a new process/product or improve an existing process?
  • What are the input/predecessor systems to the process? How are they connected to the process?
  • What are the output/successor systems to the process? How are they connected to the process?
  • Which part of the process needs better automation in the form of machine learning?
  • What is expected from the machine learning model?
  • What are the success criteria?
  • How will the output of the machine learning model be consumed (API/batch update to database/reporting)?
  • Is there a reporting requirement?
  • What is the prediction level (transaction level or any other level)?
  • How frequently is the prediction required (real-time, daily, weekly, monthly)?
  • How long will the prediction be valid?
  • Who will be the subject matter expert supporting the machine learning team with business questions over the course of the project?
  • How many users will be accessing the application?

Data Questionnaire

The data questionnaire tries to identify the data for the business problem. It targets data analysts, data owners, database administrators, and business analysts, and aims to understand the data for the new process:

  • Which data sources and attributes are being used in the existing process today?
  • Which data sources and attributes are required for the new process?
  • Who are the data source owners and database administrators?
  • How can we connect to the different sources (may need firewall requests approved, connection strings, permissions, etc.)?
  • Where can we get the data dictionary (attribute name, description, source, and datatype)?
  • What are the join criteria among the data sources?
  • How much historical data is available in the data source?
  • What is the data volume (in GB/TB)?
  • How frequently is the data updated?
  • Do we need to build a new database and tables, or can we use existing ones?
  • How is the data indexed in the table?
  • How is the data partitioned in the table?
  • Is the data source updated in real time or in batch?
  • Will fetching the data impact the existing production process?
  • Is the data sensitive (such as credit card numbers, SSNs, health records, or financial records), such that it could identify an individual or an entity?
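
One way to record the answers to the data dictionary and sensitivity questions is a small structured list rather than free text. The sketch below is only an illustration: the `DictionaryEntry` class, the `sensitive` flag, and all attribute and source names are hypothetical and not taken from any real system.

```python
from dataclasses import dataclass


@dataclass
class DictionaryEntry:
    """One row of a data dictionary: attribute name, description, source, datatype."""
    attribute: str
    description: str
    source: str
    datatype: str
    sensitive: bool = False  # hypothetical flag for data that could identify a person


# Example entries; every name here is made up for illustration.
data_dictionary = [
    DictionaryEntry("customer_id", "Unique customer key", "crm.customers", "int"),
    DictionaryEntry("card_number", "Payment card number", "billing.payments", "str",
                    sensitive=True),
    DictionaryEntry("order_total", "Order value in USD", "sales.orders", "float"),
]


def sensitive_attributes(entries):
    """Attributes that need masking or restricted access."""
    return [e.attribute for e in entries if e.sensitive]


def sources(entries):
    """Distinct data sources we need connections and permissions for."""
    return sorted({e.source for e in entries})


print(sensitive_attributes(data_dictionary))
print(sources(data_dictionary))
```

Keeping the dictionary in this form makes it easy to list which sources need access requests and which attributes need special handling before any modelling starts.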

Architecture Questionnaire

The architecture questionnaire focuses on the technical side: the architecture and implementation needed to turn the business problem into an enterprise-grade solution. It needs technical owners, application architects, and enterprise architects to answer.

  • What problem are we solving: predicting data, text, image, sound, or speech? This will qualify the problem for time series, machine learning, deep learning, natural language processing, signal processing, etc.
  • Will this be implemented in real time, in batch, or on an edge device?
  • What is the best architecture and framework for the proof of concept?
  • What is the best architecture and framework for production deployment?
  • Can the solution be implemented with on-premise servers, or is cloud required?
  • Do we have the infrastructure ready, or do we have to build it?
  • How soon can we spin up the infrastructure?
  • How do we handle the servers and data with proper AD authentication and tokens for cloud?
  • How can we scale the solution for x number of users without hampering speed and processing?
  • Where are we going to keep the code and documents for the project?
  • How will we handle passwords?
  • How will we handle financial information with utmost safety (if credit cards are involved)?

Once we have answers to most of the above questions, we will be in great shape to position ourselves for a successful implementation.
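
To judge whether "most" questions are answered, it can help to track the three questionnaires in one place. The following is a rough sketch under my own assumptions: the dictionary layout, the sample answers, and the helper names `coverage` and `open_questions` are all hypothetical.

```python
# Each questionnaire maps a question to its answer; None means still unanswered.
# The sample answers are invented for illustration.
questionnaires = {
    "process": {
        "What is the overall goal of the project?": "Reduce manual invoice review",
        "What are the success criteria?": None,
    },
    "data": {
        "What is the data volume (in GB/TB)?": "About 200 GB",
        "How frequently is the data updated?": "Nightly batch",
    },
    "architecture": {
        "Will this be implemented in real time, in batch, or on an edge device?": None,
    },
}


def coverage(answers):
    """Fraction of questions in one questionnaire that have an answer."""
    answered = sum(1 for a in answers.values() if a)
    return answered / len(answers)


def open_questions(all_questionnaires):
    """(questionnaire, question) pairs that still need an owner to answer them."""
    return [
        (name, question)
        for name, answers in all_questionnaires.items()
        for question, answer in answers.items()
        if not answer
    ]


report = {name: coverage(answers) for name, answers in questionnaires.items()}
print(report)
print(open_questions(questionnaires))
```

A report like this gives a quick view of which questionnaire is lagging and which stakeholders to follow up with before analysis begins.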

Conclusion

For any ML use case to be implemented successfully, it's extremely important to spend significant time understanding the business, data, and architecture aspects before we set out on analysis and modelling. The above questions will set us up for the first layer of success and help us reach toward 10x productivity.

If you liked the article, please feel free to clap/like/follow.

To connect: LinkedIn | jagannath.banerjee@gmail.com

Translated from: https://medium.com/@jagannath.banerjee/10x-machine-learning-productivity-with-stellar-questionnaire-c72c7b99ca93
