{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# LM Plot and Reg Plot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Welcome to another lecture on Seaborn! This is the first in a series of plots that we shall be drawing with Seaborn. In this lecture, we shall cover how to plot **Linear Regression** analyses, a very common method in *Business Intelligence* and in *Data Science* in particular. To begin with, we shall first try to gain a *statistical overview* of the concept of *Linear Regression*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As our intention isn't to dive deeply into each statistical concept, I shall instead pick a curated dataset and show you different ways in which we can visualize whatever we deduce during our analysis. Using Seaborn, there are two important types of figure that we can plot to fulfil our project needs. One is known as **LM Plot** and the other one is **Reg Plot**. Visually, they look very similar, but they do have functional differences that I will highlight in detail." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Linear Regression** is a *statistical concept for predictive analytics*, where the core aim is to examine three aspects:\n", "\n", "- Does a set of predictor variables do a good job of predicting an outcome (dependent) variable?\n", "- Which variables in particular are significant predictors of the outcome variable?\n", "- In what way do they (indicated by the magnitude and sign of the beta estimates) impact the outcome variable? These **Beta Estimates** are just the *standardized coefficients* resulting from a *regression analysis*, that is, coefficients that have been standardized so that the variances of the dependent and independent variables are 1." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us begin by importing the libraries that we might need on our journey. You will find me doing this at the start of every lecture so that we don't have to worry about dependencies throughout the lecture." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2021-07-16T15:12:40.433826Z", "start_time": "2021-07-16T15:12:36.712669Z" } }, "outputs": [], "source": [ "# Importing intrinsic libraries:\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "%matplotlib inline\n", "sns.set(style=\"whitegrid\", palette=\"hsv\")\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us now generate some data to play around with using **NumPy** for two imaginary classes of points. Please note that throughout the course I won't be explaining data generation, as that is a component of Data Analysis. With that being said, let us try to plot something here:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-07-16T15:12:40.555891Z", "start_time": "2021-07-16T15:12:40.440659Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " | total_bill | \n", "tip | \n", "sex | \n", "smoker | \n", "day | \n", "time | \n", "size | \n", "
---|---|---|---|---|---|---|---|
0 | \n", "16.99 | \n", "1.01 | \n", "Female | \n", "No | \n", "Sun | \n", "Dinner | \n", "2 | \n", "
1 | \n", "10.34 | \n", "1.66 | \n", "Male | \n", "No | \n", "Sun | \n", "Dinner | \n", "3 | \n", "
2 | \n", "21.01 | \n", "3.50 | \n", "Male | \n", "No | \n", "Sun | \n", "Dinner | \n", "3 | \n", "
3 | \n", "23.68 | \n", "3.31 | \n", "Male | \n", "No | \n", "Sun | \n", "Dinner | \n", "2 | \n", "
4 | \n", "24.59 | \n", "3.61 | \n", "Female | \n", "No | \n", "Sun | \n", "Dinner | \n", "4 | \n", "
5 | \n", "25.29 | \n", "4.71 | \n", "Male | \n", "No | \n", "Sun | \n", "Dinner | \n", "4 | \n", "
6 | \n", "8.77 | \n", "2.00 | \n", "Male | \n", "No | \n", "Sun | \n", "Dinner | \n", "2 | \n", "
7 | \n", "26.88 | \n", "3.12 | \n", "Male | \n", "No | \n", "Sun | \n", "Dinner | \n", "4 | \n", "
8 | \n", "15.04 | \n", "1.96 | \n", "Male | \n", "No | \n", "Sun | \n", "Dinner | \n", "2 | \n", "
9 | \n", "14.78 | \n", "3.23 | \n", "Male | \n", "No | \n", "Sun | \n", "Dinner | \n", "2 | \n", "