Building Machine Learning Systems with Python
Luis Pedro Coelho, Willi Richert, Matthieu Brucher
Updated: 2021-07-23 17:12:06
Latest chapter: Leave a review - let other readers know what you think
Cover
Title Page
Copyright and Credits
Building Machine Learning Systems with Python Third Edition
Packt Upsell
Why subscribe?
PacktPub.com
Contributors
About the authors
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Getting Started with Python Machine Learning
Machine learning and Python – a dream team
What the book will teach you – and what it will not
How to best read this book
What to do when you are stuck
Getting started
Introduction to NumPy, SciPy, Matplotlib, and TensorFlow
Installing Python
Chewing data efficiently with NumPy and intelligently with SciPy
Learning NumPy
Indexing
Handling nonexistent values
Comparing the runtime
Learning SciPy
Fundamentals of machine learning
Asking a question
Getting answers
Our first (tiny) application of machine learning
Reading in the data
Preprocessing and cleaning the data
Choosing the right model and learning algorithm
Before we build our first model
Starting with a simple straight line
Toward more complex models
Stepping back to go forward – another look at our data
Training and testing
Answering our initial question
Summary
Classifying with Real-World Examples
The Iris dataset
Visualization is a good first step
Classifying with scikit-learn
Building our first classification model
Evaluation – holding out data and cross-validation
How to measure and compare classifiers
A more complex dataset and the nearest-neighbor classifier
Learning about the seeds dataset
Features and feature engineering
Nearest neighbor classification
Looking at the decision boundaries
Which classifier to use
Summary
Regression
Predicting house prices with regression
Multidimensional regression
Cross-validation for regression
Penalized or regularized regression
L1 and L2 penalties
Using Lasso or ElasticNet in scikit-learn
Visualizing the Lasso path
P-greater-than-N scenarios
An example based on text documents
Setting hyperparameters in a principled way
Regression with TensorFlow
Summary
Classification I – Detecting Poor Answers
Sketching our roadmap
Learning to classify classy answers
Tuning the instance
Tuning the classifier
Fetching the data
Slimming the data down to chewable chunks
Preselecting and processing attributes
Defining what a good answer is
Creating our first classifier
Engineering the features
Training the classifier
Measuring the classifier's performance
Designing more features
Deciding how to improve the performance
Bias, variance, and their trade-off
Fixing high bias
Fixing high variance
High or low bias?
Using logistic regression
A bit of math with a small example
Applying logistic regression to our post-classification problem
Looking behind accuracy – precision and recall
Slimming the classifier
Ship it!
Classification using TensorFlow
Summary
Dimensionality Reduction
Sketching our roadmap
Selecting features
Detecting redundant features using filters
Correlation
Mutual information
Asking the model about the features using wrappers
Other feature selection methods
Feature projection
Principal component analysis
Sketching PCA
Applying PCA
Limitations of PCA and how LDA can help
Multidimensional scaling
Autoencoders or neural networks for dimensionality reduction
Summary
Clustering – Finding Related Posts
Measuring the relatedness of posts
How not to do it
How to do it
Preprocessing – similarity measured as a similar number of common words
Converting raw text into a bag of words
Counting words
Normalizing word count vectors
Removing less important words
Stemming
Installing and using NLTK
Extending the vectorizer with NLTK's stemmer
Stop words on steroids
Our achievements and goals
Clustering
K-means
Getting test data to evaluate our ideas
Clustering posts
Solving our initial challenge
Another look at noise
Tweaking the parameters
Summary
Recommendations
Rating predictions and recommendations
Splitting into training and testing
Normalizing the training data
A neighborhood approach to recommendations
A regression approach to recommendations
Combining multiple methods
Basket analysis
Obtaining useful predictions
Analyzing supermarket shopping baskets
Association rule mining
More advanced basket analysis
Summary
Artificial Neural Networks and Deep Learning
Using TensorFlow
TensorFlow API
Graphs
Sessions
Useful operations
Saving and restoring neural networks
Training neural networks
Convolutional neural networks
Recurrent neural networks
LSTM for predicting text
LSTM for image processing
Summary
Classification II – Sentiment Analysis
Sketching our roadmap
Fetching the Twitter data
Introducing the Naïve Bayes classifier
Getting to know the Bayes theorem
Being naïve
Using Naïve Bayes to classify
Accounting for unseen words and other oddities
Accounting for arithmetic underflows
Creating our first classifier and tuning it
Solving an easy problem first
Using all classes
Tuning the classifier's parameters
Cleaning tweets
Taking the word types into account
Determining the word types
Successfully cheating using SentiWordNet
Our first estimator
Putting everything together
Summary
Topic Modeling
Latent Dirichlet allocation
Building a topic model
Comparing documents by topic
Modeling the whole of Wikipedia
Choosing the number of topics
Summary
Classification III – Music Genre Classification
Sketching our roadmap
Fetching the music data
Converting into WAV format
Looking at music
Decomposing music into sine-wave components
Using FFT to build our first classifier
Increasing experimentation agility
Training the classifier
Using a confusion matrix to measure accuracy in multiclass problems
An alternative way to measure classifier performance using receiver-operator characteristics
Improving classification performance with mel frequency cepstral coefficients
Music classification using TensorFlow
Summary
Computer Vision
Introducing image processing
Loading and displaying images
Thresholding
Gaussian blurring
Putting the center in focus
Basic image classification
Computing features from images
Writing your own features
Using features to find similar images
Classifying a harder dataset
Local feature representations
Image generation with adversarial networks
Summary
Reinforcement Learning
Types of reinforcement learning
Policy and value network
Q-network
Excelling at games
A small example
Using TensorFlow for the text game
Playing Breakout
Summary
Bigger Data
Learning about big data
Using jug to break up your pipeline into tasks
An introduction to tasks in jug
Looking under the hood
Using jug for data analysis
Reusing partial results
Using Amazon Web Services
Creating your first virtual machines
Installing Python packages on Amazon Linux
Running jug on our cloud machine
Automating the generation of clusters with cfncluster
Summary
Where to Learn More About Machine Learning
Online courses
Books
Blogs
Data sources
Getting competitive
All that was left out
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think