Kernel: Python 3 (system-wide)

Website Analysis Using Python

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
plt.style.use('fivethirtyeight')
%matplotlib inline
dataset = pd.read_csv('3dlookmewebsite.csv', encoding='latin')
dataset.head()

Quick Statistical Analysis

dataset.describe()

Dealing with data types

dataset.dtypes
User-ID                                              float64
Gender                                               object
Age                                                  float64
Month wise progress with DSM at 3dlook.me website    object
Number of Customers                                  object
Customer Retention Rate                              object
Time spent (minutes)                                 float64
Engagement Rate                                      object
Increase in Users                                    float64
CTR                                                  object
Bounce Rate                                          object
Net Promoter Score                                   float64
Average Order Value                                  object
Sales                                                object
Visit Number                                         float64
Items Purchased                                      float64
Overall customer satisfaction                        float64
Online Ambiance Review                               object
Customer Loyalty Change                              object
Users Referred                                       float64
Sensory Method                                       object
Time of Day                                          object
Marketing Channels                                   object
dtype: object
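Several rate columns (for example Customer Retention Rate and Engagement Rate) are stored as `object` because their values are strings such as `63.20%`. The following is a minimal sketch of one way to convert such columns to floats. The `toy` frame and the `percent_to_float` helper are illustrative assumptions, not part of the original notebook; the same loop could be pointed at `dataset` once the list of percent-formatted columns has been checked by hand.

```python
import pandas as pd

# Toy frame standing in for `dataset`; values are illustrative only.
toy = pd.DataFrame({
    "Customer Retention Rate": ["63.20%", "62.90%", None],
    "Engagement Rate": ["56.80%", "53.20%", "70.80%"],
})

def percent_to_float(series: pd.Series) -> pd.Series:
    """Turn strings such as '63.20%' into floats such as 63.2.

    Missing or unparseable entries become NaN.
    """
    cleaned = series.str.strip().str.rstrip("%")
    return pd.to_numeric(cleaned, errors="coerce")

# Hypothetical list of percent-formatted columns; verify it against the real data.
percent_cols = ["Customer Retention Rate", "Engagement Rate"]
for col in percent_cols:
    toy[col] = percent_to_float(toy[col])

print(toy.dtypes)
```

Using `errors="coerce"` keeps the conversion from failing on unexpected strings, at the cost of silently turning them into NaN, so it is worth comparing null counts before and after.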
dataset.isnull().sum()
User-ID                                              25
Gender                                               25
Age                                                  25
Month wise progress with DSM at 3dlook.me website     2
Number of Customers                                   2
Customer Retention Rate                               2
Time spent (minutes)                                  2
Engagement Rate                                       2
Increase in Users                                     2
CTR                                                   2
Bounce Rate                                           2
Net Promoter Score                                    2
Average Order Value                                   2
Sales                                                 2
Visit Number                                          2
Items Purchased                                       2
Overall customer satisfaction                         2
Online Ambiance Review                               17
Customer Loyalty Change                               2
Users Referred                                        9
Sensory Method                                       22
Time of Day                                           7
Marketing Channels                                    2
dtype: int64
dataset[dataset['Online Ambiance Review'].isnull()].head()
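Every column has at least two missing values, which would be consistent with (but does not prove) two rows that are empty across the board. A minimal sketch of how to check for and drop such rows is shown here. The `toy` frame is made-up data standing in for `dataset`.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for `dataset`; row 1 is empty in every column.
toy = pd.DataFrame({
    "User-ID": [1.0, np.nan, 3.0, np.nan],
    "Age": [25.0, np.nan, np.nan, 31.0],
    "Gender": ["Male", None, "Female", "Female"],
})

# Boolean mask of rows where every column is missing.
fully_empty = toy.isnull().all(axis=1)
n_empty = int(fully_empty.sum())

# Drop only the rows that carry no information at all.
cleaned = toy.dropna(how="all")

# Remaining per-column null counts after removing the empty rows.
remaining_nulls = cleaned.isnull().sum()
print(n_empty)
print(remaining_nulls)
```

`dropna(how="all")` leaves partially filled rows in place, so column-specific gaps (such as the missing reviews) can still be handled separately.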

Checking out columns separately

dataset['Online Ambiance Review'].value_counts().head()
Good product                                                            37
Excellent product                                                       17
Nice product                                                            11
Awesome product                                                          8
Facebook post about the entertaining projection mapping in the store     3
Name: Online Ambiance Review, dtype: int64
item_counts = dataset['Online Ambiance Review'].value_counts().sort_values(ascending=False).iloc[0:15]
sns.barplot(x=item_counts.index, y=item_counts.values, palette=sns.cubehelix_palette(15))
plt.ylabel("Counts")
plt.title("Most frequent online ambiance reviews");
plt.xticks(rotation=90);
[Figure: bar chart of the 15 most frequent Online Ambiance Review values]
dataset['Online Ambiance Review'].value_counts().tail()
"The website's use of videos and animations is so fun and engaging"                          1
"The website's music and colors are so calming and enjoyable, it feels like a nice escape"   1
4                                                                                            1
Positive comment on a YouTube video about the informative product videos in the stor         1
Positive comment on a YouTube video about the informative product videos in the store        1
Name: Online Ambiance Review, dtype: int64
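The tail of the counts shows reviews wrapped in literal double quotes and a bare numeric entry (`4`) mixed in with the text. A minimal sketch of one possible clean-up before counting is given here. The `reviews` series is toy data, and treating number-only entries as non-reviews is an assumption rather than something the notebook states.

```python
import pandas as pd

# Toy series standing in for dataset['Online Ambiance Review'].
reviews = pd.Series([
    "Good product",
    " Good product ",
    '"The website\'s use of videos and animations is so fun and engaging"',
    "4",
    None,
])

# Strip surrounding whitespace and literal double quotes.
normalized = reviews.str.strip().str.strip('"').str.strip()

# Flag entries that consist only of a number (assumed not to be real reviews).
is_numeric_only = normalized.str.fullmatch(r"\d+(\.\d+)?", na=False)

# Keep textual, non-missing reviews and count them.
text_reviews = normalized[~is_numeric_only].dropna()
counts = text_reviews.value_counts()
print(counts)
```

Near-duplicates that differ by a truncated word (such as `stor` versus `store`) are not merged by this step and would need a separate decision.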

Checking out columns separately

dataset.head()
item_counts = dataset['Online Ambiance Review'].value_counts().sort_values(ascending=False).iloc[0:15]
plt.figure(figsize=(18,6))
sns.barplot(x=item_counts.index, y=item_counts.values, palette=sns.cubehelix_palette(15))
plt.ylabel("Counts")
plt.title("Most frequent online ambiance reviews");
plt.xticks(rotation=90);
[Figure: wide bar chart of the 15 most frequent Online Ambiance Review values]
dataset['Customer Loyalty Change'].value_counts().tail()
Increased by 0.22    2
Increased by 0.98    2
Increased by 0.65    2
Increased by 0.87    2
Increased by 0.88    2
Name: Customer Loyalty Change, dtype: int64
dataset['Overall customer satisfaction'].value_counts().tail()
8.9     1
8.8     1
8.7     1
8.6     1
28.7    1
Name: Overall customer satisfaction, dtype: int64
dataset['Customer Retention Rate'].value_counts().tail()
63.20%    1
62.90%    1
62.60%    1
62.30%    1
91.80%    1
Name: Customer Retention Rate, dtype: int64
dataset['Net Promoter Score'].value_counts().tail()
46.3     1
46.6     1
46.9     1
47.2     1
123.7    1
Name: Net Promoter Score, dtype: int64
dataset['Number of Customers'].value_counts().tail()
211.60%     1
216.00%     1
220.40%     1
224.90%     1
1283.00%    1
Name: Number of Customers, dtype: int64
dataset['Time spent (minutes) '].value_counts().tail()
99.0      1
102.0     1
105.0     1
108.0     1
3011.0    1
Name: Time spent (minutes) , dtype: int64
dataset['Engagement Rate'].value_counts().tail()
56.80%    1
53.20%    1
58.40%    1
57.20%    1
70.80%    1
Name: Engagement Rate, dtype: int64
dataset['Increase in Users'].value_counts().tail()
1172.0    1
1188.0    1
1205.0    1
1221.0    1
3198.0    1
Name: Increase in Users, dtype: int64
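In several of the numeric columns above, the last value sits far from its neighbours (for example 3011.0 next to 108.0 in the time-spent column). Whether these are totals rows or entry errors is not stated in the notebook. A minimal sketch of flagging such values with the common 1.5 × IQR rule follows. It uses only the five values printed above, so it illustrates the rule rather than analysing the full column, and the `iqr_outliers` helper is a hypothetical name.

```python
import pandas as pd

# The five values shown in the tail of the time-spent counts above.
time_spent = pd.Series([99.0, 102.0, 105.0, 108.0, 3011.0],
                       name="Time spent (minutes) ")

def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)

mask = iqr_outliers(time_spent)
flagged = time_spent[mask]
print(flagged)
```

Flagged rows are best inspected before removal, since a summary row and a genuine extreme value call for different handling.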
# replace NaN values with empty string
dataset['Online Ambiance Review'] = dataset['Online Ambiance Review'].fillna('')
# keep only rows whose review is not written entirely in upper case
lcase_counts = dataset[~dataset['Online Ambiance Review'].str.isupper()]['Online Ambiance Review'].value_counts().sort_values(ascending=False).iloc[0:15]
# plot the results
plt.figure(figsize=(18,6))
sns.barplot(x=lcase_counts.index, y=lcase_counts.values, palette=sns.color_palette("hls", 15))
plt.ylabel("Counts")
plt.title("Reviews not written entirely in upper case")
plt.xticks(rotation=90);
[Figure: bar chart of the 15 most frequent reviews that are not entirely upper case]
dataset['Customer Loyalty Change'].value_counts().head()
Increased by 0.91    7
Increased by 0.62    6
Increased by 0.57    5
Increased by 0.76    5
Increased by 0.89    5
Name: Customer Loyalty Change, dtype: int64
customer_loy = dataset['Customer Loyalty Change'].value_counts().sort_values(ascending=False).iloc[0:15]
plt.figure(figsize=(18,6))
sns.barplot(x=customer_loy.index, y=customer_loy.values, palette=sns.color_palette("GnBu_d"))
plt.ylabel("Counts")
plt.title("Customer loyalty change rate")
plt.xticks(rotation=90);
[Figure: bar chart of the 15 most frequent Customer Loyalty Change values]
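The loyalty values are strings of the form `Increased by 0.91` or `Decreased by 0.18`, so the bar chart treats each one as a separate category. A minimal sketch of turning them into signed numbers, which would allow a histogram or summary statistics, is shown here. The `loyalty` series is toy data in the same format, and the assumption that every entry matches this exact pattern should be checked against the real column.

```python
import pandas as pd

# Toy series in the same format as dataset['Customer Loyalty Change'].
loyalty = pd.Series([
    "Increased by 0.91",
    "Increased by 0.62",
    "Decreased by 0.18",
    None,
])

# Split each entry into a direction word and a numeric amount.
parts = loyalty.str.extract(
    r"^(?P<direction>Increased|Decreased) by (?P<amount>\d+(?:\.\d+)?)$"
)

# Map the direction to a sign and combine it with the amount.
sign = parts["direction"].map({"Increased": 1.0, "Decreased": -1.0})
loyalty_change = sign * pd.to_numeric(parts["amount"], errors="coerce")
print(loyalty_change)
```

Entries that do not match the pattern, including missing ones, come out as NaN, which makes unexpected formats easy to spot with `loyalty_change.isna()`.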