Think Like a Data Scientist

Tackle the Data Science Process Step-by-step
Author: Brian Godsey
Publisher: Manning Publications
ISBN: 9781633430273
Category:
Page: 340
View: 4163

Data science is more than just a set of tools and techniques for extracting knowledge from data sets and data streams. Data science is also a process of getting from goals and questions to real, valuable outcomes by exploring, observing, and manipulating a world of data. Traversing this world can be difficult and confusing. Software developers and non-technical folks may struggle with the uncertainty and fuzzy answers that data invariably provide, and statisticians may have trouble working with any of the multitude of relevant software tools that lie outside of their expertise. Others may not even know where to begin. Think Like a Data Scientist presents a step-by-step approach to data science, combining analytic, programming, and business perspectives into easy-to-digest techniques and thought processes for solving real-world data-centric problems. This book helps you fill in conceptual knowledge gaps in the daunting fields of statistics and software development, and relates those skills to the real concerns of data science in the business world. As you work through the many practical examples, you'll use your existing knowledge of statistics and programming to solve real problems in data science. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

Data Scientist

The Definitive Guide to Becoming a Data Scientist
Author: Zacharias Voulgaris, PhD
Publisher: Technics Publications
ISBN: 1634620283
Category: Computers
Page: 278
View: 1591

As our society transforms into a data-driven one, the role of the Data Scientist is becoming more and more important. If you want to be on the leading edge of what is sure to become a major profession in the not-too-distant future, this book can show you how. Each chapter is filled with practical information that will help you reap the fruits of big data and become a successful Data Scientist:
• Learn what big data is and how it differs from traditional data through its main characteristics: volume, variety, velocity, and veracity.
• Explore the different types of Data Scientists and the skillset each one has.
• Dig into what the role of the Data Scientist requires in terms of the relevant mindset, technical skills, experience, and how the Data Scientist connects with other people.
• Be a Data Scientist for a day, examining the problems you may encounter and how you tackle them, what programs you use, and how you expand your knowledge and know-how.
• See how you can become a Data Scientist, based on where you are starting from: a programming, machine learning, or data-related background.
• Follow step-by-step through the process of landing a Data Scientist job: where you need to look, how you would present yourself to a potential employer, and what it takes to follow a freelancer path.
• Read the case studies of experienced, senior-level Data Scientists, in an attempt to get a better perspective of what this role is, in practice.
At the end of the book, there is a glossary of the most important terms that have been introduced, as well as three appendices: a list of useful sites, some relevant articles on the web, and a list of offline resources for further reading.

Practical Big Data Analytics

Hands-on techniques to implement enterprise analytics and machine learning using Hadoop, Spark, NoSQL and R
Author: Nataraj Dasgupta
Publisher: Packt Publishing Ltd
ISBN: 1783554401
Category: Computers
Page: 412
View: 4794

Get command of your organizational Big Data using the power of data science and analytics

Key Features
- A perfect companion to boost your Big Data storing, processing, and analyzing skills to help you take informed business decisions
- Work with the best tools such as Apache Hadoop, R, Python, and Spark for NoSQL platforms to perform massive online analyses
- Get expert tips on statistical inference, machine learning, mathematical modeling, and data visualization for Big Data

Book Description
Big Data analytics relates to the strategies used by organizations to collect, organize and analyze large amounts of data to uncover valuable business insights that otherwise cannot be analyzed through traditional systems. Crafting an enterprise-scale cost-efficient Big Data and machine learning solution to uncover insights and value from your organization's data is a challenge. Today, with hundreds of new Big Data systems, machine learning packages and BI Tools, selecting the right combination of technologies is an even greater challenge. This book will help you do that. With the help of this guide, you will be able to bridge the gap between the theoretical world of technology and the practical ground reality of building corporate Big Data and data science platforms. You will get hands-on exposure to Hadoop and Spark, build machine learning dashboards using R and R Shiny, create web-based apps using NoSQL databases such as MongoDB and even learn how to write R code for neural networks. By the end of the book, you will have a very clear and concrete understanding of what Big Data analytics means, how it drives revenues for organizations, and how you can develop your own Big Data analytics solution using different tools and methods articulated in this book.

What you will learn
- Get a 360-degree view into the world of Big Data, data science and machine learning
- Explore a broad range of technical and business Big Data analytics topics that cater to the interests of technical experts as well as corporate IT executives
- Get hands-on experience with industry-standard Big Data and machine learning tools such as Hadoop, Spark, MongoDB, KDB+ and R
- Create production-grade machine learning BI Dashboards using R and R Shiny with step-by-step instructions
- Learn how to combine open-source Big Data, machine learning and BI Tools to create low-cost business analytics applications
- Understand corporate strategies for successful Big Data and data science projects
- Go beyond general-purpose analytics to develop cutting-edge Big Data applications using emerging technologies

Who this book is for
The book is intended for existing and aspiring Big Data professionals who wish to become the go-to person in their organization when it comes to Big Data architecture, analytics, and governance. While no prior knowledge of Big Data or related technologies is assumed, it will be helpful to have some programming experience.

Introducing Data Science

Big Data, Machine Learning, and More, Using Python Tools
Author: Davy Cielen,Arno Meysman,Mohamed Ali
Publisher: Manning Publications
ISBN: 9781633430037
Category: Computers
Page: 320
View: 5030

Summary
Introducing Data Science teaches you how to accomplish the fundamental tasks that occupy data scientists. Using the Python language and common Python libraries, you'll experience firsthand the challenges of dealing with data at scale and gain a solid foundation in data science. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Technology
Many companies need developers with data science skills to work on projects ranging from social media marketing to machine learning. Discovering what you need to learn to begin a career as a data scientist can seem bewildering. This book is designed to help you get started.

About the Book
Introducing Data Science explains vital data science concepts and teaches you how to accomplish the fundamental tasks that occupy data scientists. You'll explore data visualization, graph databases, the use of NoSQL, and the data science process. You'll use the Python language and common Python libraries as you experience firsthand the challenges of dealing with data at scale. Discover how Python allows you to gain insights from data sets so big that they need to be stored on multiple machines, or from data moving so quickly that no single machine can handle it. This book gives you hands-on experience with the most popular Python data science libraries, Scikit-learn and StatsModels. After reading this book, you'll have the solid foundation you need to start a career in data science.

What's Inside
- Handling large data
- Introduction to machine learning
- Using Python to work with data
- Writing data science algorithms

About the Reader
This book assumes you're comfortable reading code in Python or a similar language, such as C, Ruby, or JavaScript. No prior experience with data science is required.

About the Authors
Davy Cielen, Arno D. B. Meysman, and Mohamed Ali are the founders and managing partners of Optimately and Maiton, where they focus on developing data science projects and solutions in various sectors.

Table of Contents
- Data science in a big data world
- The data science process
- Machine learning
- Handling large data on a single computer
- First steps in big data
- Join the NoSQL movement
- The rise of graph databases
- Text mining and text analytics
- Data visualization to the end user

Data Science at the Command Line

Facing the Future with Time-Tested Tools
Author: Jeroen Janssens
Publisher: "O'Reilly Media, Inc."
ISBN: 1491947802
Category: Computers
Page: 212
View: 9465

This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You’ll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, whether you’re on Windows, OS X, or Linux, author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools. Discover why the command line is an agile, scalable, and extensible technology. Even if you’re already comfortable processing data with, say, Python or R, you’ll greatly improve your data science workflow by also leveraging the power of the command line.
- Obtain data from websites, APIs, databases, and spreadsheets
- Perform scrub operations on plain text, CSV, HTML/XML, and JSON
- Explore data, compute descriptive statistics, and create visualizations
- Manage your data science workflow using Drake
- Create reusable tools from one-liners and existing Python or R code
- Parallelize and distribute data-intensive pipelines using GNU Parallel
- Model data with dimensionality reduction, clustering, regression, and classification algorithms

Data Science for Business

What You Need to Know about Data Mining and Data-Analytic Thinking
Author: Foster Provost,Tom Fawcett
Publisher: "O'Reilly Media, Inc."
ISBN: 144937428X
Category: Computers
Page: 414
View: 8357

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today. Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.
- Understand how data science fits in your organization, and how you can use it for competitive advantage
- Treat data as a business asset that requires careful investment if you’re to gain real value
- Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way
- Learn general concepts for actually extracting knowledge from data
- Apply data science principles when interviewing data science job candidates

Data Smart

Using Data Science to Transform Information into Insight
Author: John W. Foreman
Publisher: John Wiley & Sons
ISBN: 1118839862
Category: Business & Economics
Page: 432
View: 9985

Data Science gets thrown around in the press like it's magic. Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions. But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope. Data science is little more than using straightforward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data. Each chapter will cover a different technique in a spreadsheet so you can follow along:
- Mathematical optimization, including non-linear programming and genetic algorithms
- Clustering via k-means, spherical k-means, and graph modularity
- Data mining in graphs, such as outlier detection
- Supervised AI through logistic regression, ensemble models, and bag-of-words models
- Forecasting, seasonal adjustments, and prediction intervals through Monte Carlo simulation
- Moving from spreadsheets into the R programming language
You get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

Privacy, Big Data, and the Public Good

Frameworks for Engagement
Author: Julia Lane,Victoria Stodden,Stefan Bender,Helen Nissenbaum
Publisher: Cambridge University Press
ISBN: 1316094456
Category: Mathematics
Page: N.A
View: 7861

Massive amounts of data on human beings can now be analyzed. Pragmatic purposes abound, including selling goods and services, winning political campaigns, and identifying possible terrorists. Yet 'big data' can also be harnessed to serve the public good: scientists can use big data to do research that improves the lives of human beings, improves government services, and reduces taxpayer costs. In order to achieve this goal, researchers must have access to this data - raising important privacy questions. What are the ethical and legal requirements? What are the rules of engagement? What are the best ways to provide access while also protecting confidentiality? Are there reasonable mechanisms to compensate citizens for privacy loss? The goal of this book is to answer some of these questions. The book's authors paint an intellectual landscape that includes legal, economic, and statistical frameworks. The authors also identify new practical approaches that simultaneously maximize the utility of data access while minimizing information risk.

Data Science in Higher Education

A Step-By-Step Introduction to Machine Learning for Institutional Researchers
Author: Jesse Lawson
Publisher: N.A
ISBN: 9781515206460
Category:
Page: 226
View: 6950

Be the Change your Institution Needs

What are leaders in research saying about Data Science in Higher Education?
"Where has this book been all these years? This is THE starting point for researchers looking for a leg up in today's college environment. Two parts discussion, one part methodology, and one part witty humor. I love it!"
"Buy this book for your analysts. They and your college will thank you."
"This is the only book on data science specific for higher education research that covers both theory and practice. I'm not a programmer at all, and I found this book very enjoyable. You won't regret it -- I know I don't!"
"When our department was tasked with coming up with a predictive 'machine-learning' model, we hired Jesse to help us. His charisma and knowledge are unmatched, and this book only helps to breathe fresh life into issues in research today that are all too often swept under the rug."

Discover the tools to take your institution to the next level! Data Science in higher education is the process of turning raw institutional data into actionable intelligence. With this introduction to foundational topics in machine learning and predictive analytics, ambitious leaders in research can develop and employ sophisticated predictive models to better inform their institution's decision-making process. You don't need an advanced degree in math or statistics to do data science. With the open-source statistical programming language R, you'll learn how to tackle real-life institutional data challenges (with actual institutional data!) by going step-by-step through different case studies. Topics include:
- Simple, Multiple, & Logistic Regression Techniques, and Naive Bayes Classifiers
- Best Practices for Data Scientists in Higher Education
- Narrative-style stories, gotchas, and insights from actual data science jobs at colleges and universities
"Forget the textbooks. This is a book on data science written for institutional researchers *by* an institutional researcher. You need this book."

Data Science is the art of carefully picking through that pile of book pages and putting together a complete book. It's the art of developing a narrative for your data, so that all the raw information that your institution warehouses and reports in bar charts and histograms is replaced with actionable intelligence. Here's what we know: Data science can and should be an integral part of college and university operations. Institutional effectiveness should be working side-by-side with faculty and educators to collect, clean, and mine through data of current and past students' behaviors in order to better empower counseling and advisement services (whether virtual or otherwise). Data itself should be considered an asset to an institution, and the data mining process a necessary function of institutional operations. So how do we do it? It starts with a solid perspective and great research tools. With Data Science in Higher Education you'll learn about and solve real-world institutional problems with open-source tools and machine learning research techniques. Using R, you'll tackle case studies from real colleges and develop predictive analytical solutions to problems that colleges and universities face to this day.

R for Data Science

Import, Tidy, Transform, Visualize, and Model Data
Author: Hadley Wickham,Garrett Grolemund
Publisher: "O'Reilly Media, Inc."
ISBN: 1491910364
Category: Computers
Page: 520
View: 2413

Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way. You’ll learn how to:
- Wrangle: transform your datasets into a form convenient for analysis
- Program: learn powerful R tools for solving data problems with greater clarity and ease
- Explore: examine your data, generate hypotheses, and quickly test them
- Model: provide a low-dimensional summary that captures true "signals" in your dataset
- Communicate: learn R Markdown for integrating prose, code, and results

Building Data Science Teams


Author: DJ Patil
Publisher: "O'Reilly Media, Inc."
ISBN: 1449316778
Category: Computers
Page: 24
View: 5724

As data science evolves to become a business necessity, the importance of assembling strong and innovative data teams grows. In this in-depth report, data scientist DJ Patil explains the skills, perspectives, tools and processes that position data science teams for success. Topics include:
- What it means to be "data driven."
- The unique roles of data scientists.
- The four essential qualities of data scientists.
- Patil's first-hand experience building the LinkedIn data science team.

Big Data

Algorithms, Analytics, and Applications
Author: Kuan-Ching Li,Hai Jiang,Laurence T. Yang,Alfredo Cuzzocrea
Publisher: CRC Press
ISBN: 1482240564
Category: Computers
Page: 498
View: 8741

As today’s organizations are capturing exponentially larger amounts of data than ever, now is the time for organizations to rethink how they digest that data. Through advanced algorithms and analytics techniques, organizations can harness this data, discover hidden patterns, and use the newly acquired knowledge to achieve competitive advantages. Presenting the contributions of leading experts in their respective fields, Big Data: Algorithms, Analytics, and Applications bridges the gap between the vastness of Big Data and the appropriate computational methods for scientific and social discovery. It covers fundamental issues about Big Data, including efficient algorithmic methods to process data, better analytical strategies to digest data, and representative applications in diverse fields, such as medicine, science, and engineering. The book is organized into five main sections:
- Big Data Management: considers the research issues related to the management of Big Data, including indexing and scalability aspects
- Big Data Processing: addresses the problem of processing Big Data across a wide range of resource-intensive computational settings
- Big Data Stream Techniques and Algorithms: explores research issues regarding the management and mining of Big Data in streaming environments
- Big Data Privacy: focuses on models, techniques, and algorithms for preserving Big Data privacy
- Big Data Applications: illustrates practical applications of Big Data across several domains, including finance, multimedia tools, biometrics, and satellite Big Data processing
Overall, the book reports on state-of-the-art studies and achievements in algorithms, analytics, and applications of Big Data. It provides readers with the basis for further efforts in this challenging scientific field that will play a leading role in next-generation database, data warehousing, data mining, and cloud computing research. It also explores related applications in diverse sectors, covering technologies for media/data communication, elastic media/data storage, cross-network media/data fusion, and SaaS.

Data Science

Mindset, Methodologies, and Misconceptions
Author: Zacharias Voulgaris
Publisher: N.A
ISBN: 9781634622561
Category:
Page: 300
View: 488

Master the concepts and strategies underlying success and progress in data science. From the author of the bestsellers Data Scientist and Julia for Data Science, this book covers four foundational areas of data science. The first area is the data science pipeline, including methodologies and the data scientist's toolbox. The second is the set of essential practices needed to understand the data, including questions and hypotheses. The third is the set of pitfalls to avoid in the data science process. The fourth is an awareness of future trends and how modern technologies like Artificial Intelligence (AI) fit into the data science framework. The following chapters cover these four foundational areas:
Chapter 1 - What Is Data Science?
Chapter 2 - The Data Science Pipeline
Chapter 3 - Data Science Methodologies
Chapter 4 - The Data Scientist's Toolbox
Chapter 5 - Questions to Ask and the Hypotheses They Are Based On
Chapter 6 - Data Science Experiments and Evaluation of Their Results
Chapter 7 - Sensitivity Analysis of Experiment Conclusions
Chapter 8 - Programming Bugs
Chapter 9 - Mistakes Through the Data Science Process
Chapter 10 - Dealing with Bugs and Mistakes Effectively and Efficiently
Chapter 11 - The Role of Heuristics in Data Science
Chapter 12 - The Role of AI in Data Science
Chapter 13 - Data Science Ethics
Chapter 14 - Future Trends and How to Remain Relevant
Targeted towards data science learners of all levels, this book aims to help the reader go beyond data science techniques and obtain a more holistic and deeper understanding of what data science entails. With a focus on the problems data science tries to solve, this book challenges the reader to become a self-sufficient player in the field.

Perspectives on Data Science for Software Engineering


Author: Tim Menzies,Laurie Williams,Thomas Zimmermann
Publisher: Morgan Kaufmann
ISBN: 0128042613
Category: Computers
Page: 408
View: 5499

Perspectives on Data Science for Software Engineering presents the best practices of seasoned data miners in software engineering. The idea for this book was created during the 2014 conference at Dagstuhl, an invitation-only gathering of leading computer scientists who meet to identify and discuss cutting-edge informatics topics. At the 2014 conference, the question of how to transfer the knowledge of seasoned software engineers and data scientists to newcomers in the field was a highlight of many discussions. While there are many books covering data mining and software engineering basics, they present only the fundamentals and lack the perspective that comes from real-world experience. This book offers unique insights into the wisdom of the community’s leaders gathered to share hard-won lessons from the trenches. Ideas are presented in digestible chapters designed to be applicable across many domains. Topics covered include data collection, data sharing, data mining, and how to utilize these techniques in successful software projects. Newcomers to software engineering data science will learn the tips and tricks of the trade, while more experienced data scientists will benefit from war stories that show what traps to avoid.
- Presents the wisdom of community experts, derived from a summit on software analytics
- Provides contributed chapters that share discrete ideas and techniques from the trenches
- Covers top areas of concern, including mining security and social data, data visualization, and cloud-based data
- Presented in clear chapters designed to be applicable across many domains

Visualizing Graph Data


Author: Corey Lanum
Publisher: Manning Publications
ISBN: 9781617293078
Category: Computers
Page: 232
View: 1113

Summary
Visualizing Graph Data teaches you not only how to build graph data structures, but also how to create your own dynamic and interactive visualizations using a variety of tools. This book is loaded with fascinating examples and case studies to show you the real-world value of graph visualizations. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Technology
Assume you are doing a great job collecting data about your customers and products. Are you able to turn your rich data into important insight? Complex relationships in large data sets can be difficult to recognize. Visualizing these connections as graphs makes it possible to see the patterns, so you can find meaning in an otherwise overwhelming sea of facts.

About the Book
Visualizing Graph Data teaches you how to understand graph data, build graph data structures, and create meaningful visualizations. This engaging book gently introduces graph data visualization through fascinating examples and compelling case studies. You'll discover simple, but effective, techniques to model your data, handle big data, and depict temporal and spatial data. By the end, you'll have a conceptual foundation as well as the practical skills to explore your own data with confidence.

What's Inside
- Techniques for creating effective visualizations
- Examples using the Gephi and KeyLines visualization packages
- Real-world case studies

About the Reader
No prior experience with graph data is required.

About the Author
Corey Lanum has decades of experience building visualization and analysis applications for companies and government agencies around the globe.

Table of Contents
PART 1 - GRAPH VISUALIZATION BASICS
- Getting to know graph visualization
- Case studies
- An introduction to Gephi and KeyLines
PART 2 - VISUALIZE YOUR OWN DATA
- Data modeling
- How to build graph visualizations
- Creating interactive visualizations
- How to organize a chart
- Big data: using graphs when there's too much data
- Dynamic graphs: how to show data over time
- Graphs on maps: the where of graph visualization

Data Science from Scratch

First Principles with Python
Author: Joel Grus
Publisher: "O'Reilly Media, Inc."
ISBN: 1491904402
Category: BUSINESS & ECONOMICS
Page: 330
View: 3221

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.
- Get a crash course in Python
- Learn the basics of linear algebra, statistics, and probability, and understand how and when they're used in data science
- Collect, explore, clean, munge, and manipulate data
- Dive into the fundamentals of machine learning
- Implement models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
- Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

Doing Data Science

Straight Talk from the Frontline
Author: Cathy O'Neil,Rachel Schutt
Publisher: "O'Reilly Media, Inc."
ISBN: 144936389X
Category: Computers
Page: 408
View: 1316

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know. In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science. Topics include:
- Statistical inference, exploratory data analysis, and the data science process
- Algorithms
- Spam filters, Naive Bayes, and data wrangling
- Logistic regression
- Financial modeling
- Recommendation engines and causality
- Data visualization
- Social networks and data journalism
- Data engineering, MapReduce, Pregel, and Hadoop
Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

Algorithms of the Intelligent Web


Author: Douglas G McIlwraith, Haralambos Marmanis, Dmitry Babenko
Publisher: Manning Publications
ISBN: 9781617292583
Category: Computers
Page: 240
View: 9620

Summary: Algorithms of the Intelligent Web, Second Edition teaches the most important approaches to algorithmic web data analysis, enabling you to create your own machine learning applications that crunch, munge, and wrangle data collected from users, web applications, sensors and website logs. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Technology: Valuable insights are buried in the tracks web users leave as they navigate pages and applications. You can uncover them by using intelligent algorithms like the ones that have earned Facebook, Google, and Twitter a place among the giants of web data pattern extraction.

About the Book: Algorithms of the Intelligent Web, Second Edition teaches you how to create machine learning applications that crunch and wrangle data collected from users, web applications, and website logs. In this totally revised edition, you'll look at intelligent algorithms that extract real value from data. Key machine learning concepts are explained with code examples in Python's scikit-learn. This book guides you through algorithms to capture, store, and structure data streams coming from the web. You'll explore recommendation engines and dive into classification via statistical algorithms, neural networks, and deep learning.

What's Inside:
• Introduction to machine learning
• Extracting structure from data
• Deep learning and neural networks
• How recommendation engines work

About the Reader: Knowledge of Python is assumed.

About the Authors: Douglas McIlwraith is a machine learning expert and data science practitioner in the field of online advertising. Dr. Haralambos Marmanis is a pioneer in the adoption of machine learning techniques for industrial solutions. Dmitry Babenko designs applications for banking, insurance, and supply-chain management. Foreword by Yike Guo.

Table of Contents:
• Building applications for the intelligent web
• Extracting structure from data: clustering and transforming your data
• Recommending relevant content
• Classification: placing things where they belong
• Case study: click prediction for online advertising
• Deep learning and neural networks
• Making the right choice
• The future of the intelligent web
• Appendix - Capturing data on the web

Practical Data Science with R


Author: Nina Zumel, John Mount
Publisher: Manning Publications
ISBN: 9781617291562
Category: Computers
Page: 416
View: 5030

Summary: Practical Data Science with R lives up to its name. It explains basic principles without the theoretical mumbo-jumbo and jumps right to the real use cases you'll face as you collect, curate, and analyze the data crucial to the success of your business. You'll apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business intelligence, and decision support. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Book: Business analysts and developers are increasingly collecting, curating, analyzing, and reporting on crucial business data. The R language and its associated tools provide a straightforward way to tackle day-to-day data science tasks without a lot of academic theory or advanced mathematics. Practical Data Science with R shows you how to apply the R programming language and useful statistical techniques to everyday business situations. Using examples from marketing, business intelligence, and decision support, it shows you how to design experiments (such as A/B tests), build predictive models, and present results to audiences of all levels. This book is accessible to readers without a background in data science. Some familiarity with basic statistics, R, or another scripting language is assumed.

What's Inside:
• Data science for the business professional
• Statistical analysis using the R language
• Project lifecycle, from planning to delivery
• Numerous instantly familiar use cases
• Keys to effective data presentations

About the Authors: Nina Zumel and John Mount are cofounders of a San Francisco-based data science consulting firm. Both hold PhDs from Carnegie Mellon and blog on statistics, probability, and computer science at win-vector.com.

Table of Contents:
PART 1 INTRODUCTION TO DATA SCIENCE
• The data science process
• Loading data into R
• Exploring data
• Managing data
PART 2 MODELING METHODS
• Choosing and evaluating models
• Memorization methods
• Linear and logistic regression
• Unsupervised methods
• Exploring advanced methods
PART 3 DELIVERING RESULTS
• Documentation and deployment
• Producing effective presentations