Author: Jeroen Janssens
Publisher: O'Reilly Media
Subtitle: Facing the Future with Time-Tested Tools
Publication date: 2014-10-20
Pages: 212
Price: USD 39.99
Binding: Paperback
ISBN: 9781491947852

Description

This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, whether you're on Windows, OS X, or Linux, author Jeroen Janssens introduces the Data Science Too...

About the Author

Jeroen is a Senior Data Scientist at YPlan in New York City. He has an M.Sc. in Artificial Intelligence and a Ph.D. in Machine Learning. He has authored a book titled Data Science at the Command Line, which has just been published by O'Reilly. Jeroen enjoys biking the Brooklyn Bridge, building tools, and eating stroopwafels.

Table of Contents

Chapter 1 Introduction
  Overview
  Data Science Is OSEMN
  Intermezzo Chapters
  What Is the Command Line?
  Why Data Science at the Command Line?
  A Real-World Use Case
  Further Reading
Chapter 2 Getting Started
  Overview
  Setting Up Your Data Science Toolbox
  Essential Concepts and Tools
  Further Reading
Chapter 3 Obtaining Data
  Overview
  Copying Local Files to the Data Science Toolbox
  Decompressing Files
  Converting Microsoft Excel Spreadsheets
  Querying Relational Databases
  Downloading from the Internet
  Calling Web APIs
  Further Reading
Chapter 4 Creating Reusable Command-Line Tools
  Overview
  Converting One-Liners into Shell Scripts
  Creating Command-Line Tools with Python and R
  Further Reading
Chapter 5 Scrubbing Data
  Overview
  Common Scrub Operations for Plain Text
  Working with CSV
  Working with HTML/XML and JSON
  Common Scrub Operations for CSV
  Further Reading
Chapter 6 Managing Your Data Workflow
  Overview
  Introducing Drake
  Installing Drake
  Obtain Top Ebooks from Project Gutenberg
  Every Workflow Starts with a Single Step
  Well, That Depends
  Rebuilding Specific Targets
  Discussion
  Further Reading
Chapter 7 Exploring Data
  Overview
  Inspecting Data and Its Properties
  Computing Descriptive Statistics
  Creating Visualizations
  Further Reading
Chapter 8 Parallel Pipelines
  Overview
  Serial Processing
  Parallel Processing
  Distributed Processing
  Discussion
  Further Reading
Chapter 9 Modeling Data
  Overview
  More Wine, Please!
  Dimensionality Reduction with Tapkee
  Clustering with Weka
  Regression with SciKit-Learn Laboratory
  Classification with BigML
  Further Reading
Chapter 10 Conclusion
  Let's Recap
  Three Pieces of Advice
  Where to Go from Here?
  Getting in Touch
It's okay...
A very interesting book.
Hahahahahaha
I learned a great deal from it!