4000-word report on the case study below.


Task A Case: Customer Shopping Trends


About Dataset




The revenue in the U.S. eCommerce market is anticipated to experience a steady compound annual growth rate (CAGR) of 7.8% from 2023 to 2027, reaching a projected market volume of US$1,327,412.9 million by 2027. In 2023 alone, there’s an expected 6.8% increase, contributing to the global growth rate of 8.7% in the same year.


The Customer Shopping Preferences Dataset offers valuable insights into consumer behaviour and purchasing patterns. Understanding customer preferences and trends is critical for businesses to tailor their products, marketing strategies, and overall customer experience. This dataset captures a wide range of customer attributes including age, gender, purchase history, preferred payment methods, frequency of purchases, and more. Analysing this data can help businesses make informed decisions, optimize product offerings, and enhance customer satisfaction. The dataset stands as a valuable resource for businesses aiming to align their strategies with customer needs and preferences.




This dataset encompasses various features related to customer shopping preferences, gathering essential information for businesses seeking to enhance their understanding of their customer base. The features include customer age, gender, purchase amount, preferred payment methods, frequency of purchases, and feedback ratings. Additionally, data on the type of items purchased, shopping frequency, preferred shopping seasons, and interactions with promotional offers is included. With a collection of 3900 records, this dataset serves as a foundation for businesses looking to apply data-driven insights for better decision-making and customer-centric strategies.


Dataset Glossary (Column-wise)


Customer ID – Unique identifier for each customer




Age – Age of the customer


Gender – Gender of the customer (Male/Female)


Item Purchased – The item purchased by the customer


Category – Category of the item purchased


Purchase Amount (USD) – The amount of the purchase in USD


Location – Location where the purchase was made


Size – Size of the purchased item


Colour – Colour of the purchased item


Season – Season during which the purchase was made


Review Rating – Rating given by the customer for the purchased item


Subscription Status – Indicates if the customer has a subscription (Yes/No)


Shipping Type – Type of shipping chosen by the customer


Discount Applied – Indicates if a discount was applied to the purchase (Yes/No)


Promo Code Used – Indicates if a promo code was used for the purchase (Yes/No)


Previous Purchases – The total count of transactions concluded by the customer at the store, excluding the ongoing transaction


Payment Method – Customer’s most preferred payment method


Frequency of Purchases – Frequency at which the customer makes purchases (e.g., Weekly, Fortnightly, Monthly)


Assignment: With a collection of 3900 records, which are organised as a database in the Excel file Customer Shopping Trends:


1. Explain how the data and subsequent analysis using business analytics might lead to a better understanding of customer shopping trends. Specifically, state some of the key insights that you would hope to answer by analysing the data.


2. Summarise the data (numeric and categorical) using frequency distributions and histograms/column charts, cross-tabulations (you may find PivotTables friendly to work with) and descriptive statistics measures and so on. Write up your findings in a formal document as if you were consulting to the UK eCommerce market. Your report should fully analyse the data and draw appropriate conclusions.
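The brief expects this work to be done in Excel with PivotTables. Purely as an illustration of what each summary in item 2 computes, here is a minimal Python sketch using only the standard library. The column names come from the glossary above; the six records are invented toy values (not rows from the Excel file), and the helper names `frequency`, `crosstab`, `describe` and `histogram` are hypothetical.

```python
from collections import Counter
from statistics import mean, median, stdev

# Toy records: column names follow the glossary, values are invented for illustration.
records = [
    {"Gender": "Male", "Category": "Clothing", "Purchase Amount (USD)": 53},
    {"Gender": "Female", "Category": "Footwear", "Purchase Amount (USD)": 64},
    {"Gender": "Male", "Category": "Clothing", "Purchase Amount (USD)": 73},
    {"Gender": "Female", "Category": "Clothing", "Purchase Amount (USD)": 90},
    {"Gender": "Male", "Category": "Accessories", "Purchase Amount (USD)": 49},
    {"Gender": "Female", "Category": "Footwear", "Purchase Amount (USD)": 20},
]


def frequency(rows, column):
    """Frequency distribution of one categorical column."""
    return Counter(row[column] for row in rows)


def crosstab(rows, row_col, col_col):
    """Cross-tabulation: counts for each (row value, column value) pair."""
    return Counter((row[row_col], row[col_col]) for row in rows)


def describe(rows, column):
    """Descriptive statistics for one numeric column."""
    values = [row[column] for row in rows]
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def histogram(rows, column, bin_width):
    """Histogram counts keyed by the lower edge of each bin."""
    return Counter((row[column] // bin_width) * bin_width for row in rows)


if __name__ == "__main__":
    print(frequency(records, "Category"))
    print(crosstab(records, "Gender", "Category"))
    print(describe(records, "Purchase Amount (USD)"))
    print(histogram(records, "Purchase Amount (USD)", 20))
```

The cross-tabulation is the same quantity a PivotTable produces with one field on rows and another on columns, and the histogram counts are what a column chart of binned purchase amounts would display.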




This is a synthetic dataset, created for learning about business/data analytics and machine learning.


Task B Correlation and Regression Case: AutoMobile Inc Background


You are working in sales for AutoMobile Inc., a firm of car distributors, who are thinking of expanding into a new market. You have been asked to investigate the factors which determine vehicle ownership in various Asian countries; the most recent data you can find is for 2019 and is as shown in Cars3.xls.


To prepare for a presentation to management about opportunities for sales in Cyprus, carry out the following steps:


a.) Plot scatter graphs of vehicles per thousand population against income, population, population density, and percentage of population in urban areas in the Excel file with which you believe there might be correlation. What do your results suggest?


b.) For the variable which is most closely correlated to vehicles per thousand population, calculate the equation of the regression line, and interpret the results.


c.) Plot scatter graphs of total vehicle ownership against income, population, population density per km^2 and population in urban areas. What do these graphs suggest?


d.) For the variable which is most closely correlated to total vehicle ownership, calculate the equation of the regression line and interpret the results.


e.) Which of the two regression equations do you think will be more useful to the company, and why?


f.) Data for Cyprus is as follows:


Income    Population    Population density    % urban
6.71      73.7          99                    73.3

Use this data and the regression equations calculated in b.) and d.) above to predict the total number of vehicles and number of vehicles per 1000 population for Cyprus. The actual figures are 7.04 million and 105.6 per 1000 population. Explain why your predictions differ from these values.
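Steps a.) to f.) rest on three calculations: the correlation coefficient, the least-squares regression line, and a prediction with its error. The brief expects these in Excel; the Python sketch below only illustrates the arithmetic, under stated assumptions. The five (income, vehicles per 1000) pairs are invented toy values and are NOT taken from Cars3.xls, so the fitted coefficients and the prediction they give are meaningless for the assignment. Only the Cyprus income (6.71) and the actual figure (105.6 per 1000) come from the brief. The function names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares regression line y = intercept + slope * x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def predict(intercept, slope, x):
    """Value of the fitted line at x."""
    return intercept + slope * x


# Toy data, invented for illustration only (not from Cars3.xls).
income = [1.0, 2.0, 3.0, 4.0, 5.0]
vehicles_per_1000 = [12.0, 25.0, 31.0, 48.0, 54.0]

r = pearson_r(income, vehicles_per_1000)
intercept, slope = fit_line(income, vehicles_per_1000)

# Values stated in the brief for Cyprus.
CYPRUS_INCOME = 6.71
ACTUAL_PER_1000 = 105.6

predicted = predict(intercept, slope, CYPRUS_INCOME)
error = ACTUAL_PER_1000 - predicted  # actual minus predicted

if __name__ == "__main__":
    print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
    print(f"y = {intercept:.2f} + {slope:.2f} * x")
    print(f"toy prediction = {predicted:.1f}, error = {error:.1f}")
```

Repeating `pearson_r` for each candidate explanatory variable gives the comparison needed in b.) and d.); the squared coefficient is the share of variation in the dependent variable that the line accounts for. In f.) the error is the gap between the actual value and the fitted line, and the point to check is whether the Cyprus inputs lie inside the range of the data used to fit the line.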


Assessment criteria for essays                    Weight

Analytical and writing skills                     30
Logical sequence and development                  10
Evidence of background reading                    10
Appropriate depth of analysis                     10
Presentation and interpretation of information    30
Use of supporting data/evidence                   10
Total mark                                        100