Photo by Carlos Muza on Unsplash
Data Insight from Retail Sample Sales Data
Identifying Key Trends and Insights for Further Analysis
This is an initial analysis of the sample data provided (from Kaggle) for the Stage 0 task of the HNG 11 internship (learn more about HNG here).
The dataset appears to describe sales by an automobile company or dealer, since the product lines sold include cars, trains, and motorcycles.
The sample dataset has 25 columns and 2,823 rows, with information about order quantities, customers, dates, statuses, sales amounts, prices, and other details.
Some of the key ones are:
PRODUCTLINE: Categorical. The type of product sold, for example motorcycles or cars.
PRODUCTCODE: Object. A unique identifier for each product.
ORDERNUMBER: Numerical. A unique identifier for each order placed.
QUANTITYORDERED: Numerical. The quantity of items ordered.
PRICEEACH: Numerical. The price of each item.
SALES: Numerical. The total sales amount for a particular order placed at a time.
STATUS: Categorical. The current status of the order, such as In Process, Disputed, Shipped, or Cancelled.
ORDERDATE: Datetime. When the order was made.
QTR_ID, MONTH_ID, YEAR_ID: Numerical. The quarter, month, and year in which the order was made.
DEALSIZE: Categorical. The size of the deal: small, medium, or large.
Other columns tell us more about the customer, such as the customer's location and the first and last name of the customer's contact person.
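To make the column overview concrete, here is a minimal sketch of how the row and column counts could be checked using only Python's standard library. The rows in SAMPLE_CSV are made up for illustration and include only a few of the columns, and load_rows and describe are hypothetical helper names, not part of the original analysis.

```python
import csv
import io

# Made-up rows for illustration only; they are NOT taken from the Kaggle file.
SAMPLE_CSV = """ORDERNUMBER,QUANTITYORDERED,PRICEEACH,SALES,STATUS,PRODUCTLINE,DEALSIZE
10001,30,95.70,2871.00,Shipped,Motorcycles,Small
10002,34,81.35,2765.90,Shipped,Classic Cars,Small
10003,41,94.74,3884.34,Cancelled,Classic Cars,Medium
"""


def load_rows(handle):
    """Read CSV rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def describe(rows):
    """Return (row count, column names) for a list of row dicts."""
    columns = list(rows[0].keys()) if rows else []
    return len(rows), columns


rows = load_rows(io.StringIO(SAMPLE_CSV))
n_rows, columns = describe(rows)
print(f"{n_rows} rows, {len(columns)} columns")
print(columns)
```

With the real Kaggle file, an open file handle would be passed to load_rows in place of the io.StringIO object, and the counts should then match the 2,823 rows and 25 columns mentioned earlier.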
Quick Review of the Dataset
A quick review gives us a basic idea of what's happening in the data. It is summarised below:
The most-sold product line is "Classic Cars".
The average amount spent per order is about $3,000.
The most common order STATUS is "Shipped" (2,617), which suggests the company has good products and a good order processing system in place.
Most deals have a DEALSIZE of "Medium" (1,384), which suggests that most customers prefer medium-sized deals over small or large ones.
By territory, most products are purchased by customers in EMEA (Europe, the Middle East, and Africa). By country, the United States recorded more sales than any other individual country, with most of those sales in California.
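The kind of summary listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records in orders are made up, so the results will not match the figures quoted in the review, and most_common_value is a hypothetical helper name.

```python
from collections import Counter
from statistics import mean

# Made-up records for illustration only; real rows would come from the dataset.
orders = [
    {"PRODUCTLINE": "Classic Cars", "STATUS": "Shipped", "DEALSIZE": "Medium", "SALES": 3200.0},
    {"PRODUCTLINE": "Classic Cars", "STATUS": "Shipped", "DEALSIZE": "Medium", "SALES": 4100.0},
    {"PRODUCTLINE": "Motorcycles", "STATUS": "Shipped", "DEALSIZE": "Small", "SALES": 1800.0},
    {"PRODUCTLINE": "Classic Cars", "STATUS": "Cancelled", "DEALSIZE": "Large", "SALES": 7500.0},
    {"PRODUCTLINE": "Trains", "STATUS": "Shipped", "DEALSIZE": "Medium", "SALES": 3400.0},
]


def most_common_value(records, column):
    """Return (value, count) for the most frequent value in a column."""
    counts = Counter(record[column] for record in records)
    return counts.most_common(1)[0]


top_product = most_common_value(orders, "PRODUCTLINE")
top_status = most_common_value(orders, "STATUS")
top_deal_size = most_common_value(orders, "DEALSIZE")
average_sales = mean(record["SALES"] for record in orders)

print("Most common product line:", top_product)
print("Most common status:", top_status)
print("Most common deal size:", top_deal_size)
print("Average SALES per row:", average_sales)
```

Counting values per column keeps each finding easy to re-check when the data changes, and the same helper works for any categorical column such as COUNTRY or TERRITORY.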
Conclusion
This quick review briefly touched on the average amount spent per order, the most-sold product line, customers' deal-size preferences, and customer location.
While the initial data analysis suggested a positive result, further analysis could go deeper into:
Which year had the most shipped orders, and how do shipped orders compare with sales, location, and price from year to year? This would show what changed and what could be improved.
Few cancellations are recorded, but why do they happen? Is it location, late delivery, delivery of the wrong item, or something else? Further analysis could delve deeper into this.
How can sales from other countries increase? What are the barriers?
What makes customers prefer the medium size over other deal sizes?
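As a starting point for the first two questions above, here is a minimal sketch of how shipped orders and sales could be grouped by year, and cancellations counted by country. The records are made up for illustration, the function names are hypothetical, and the exact spelling of the "Cancelled" status value is an assumption that should be checked against the data.

```python
from collections import Counter, defaultdict

# Made-up records for illustration only; they are NOT taken from the dataset.
records = [
    {"YEAR_ID": 2003, "STATUS": "Shipped", "COUNTRY": "USA", "SALES": 3000.0},
    {"YEAR_ID": 2003, "STATUS": "Shipped", "COUNTRY": "France", "SALES": 2500.0},
    {"YEAR_ID": 2004, "STATUS": "Shipped", "COUNTRY": "USA", "SALES": 4000.0},
    {"YEAR_ID": 2004, "STATUS": "Cancelled", "COUNTRY": "Spain", "SALES": 1500.0},
    {"YEAR_ID": 2004, "STATUS": "Shipped", "COUNTRY": "Spain", "SALES": 3500.0},
    {"YEAR_ID": 2005, "STATUS": "Cancelled", "COUNTRY": "Spain", "SALES": 2000.0},
]


def shipped_summary_by_year(rows):
    """Count shipped rows and total their SALES for each year."""
    summary = defaultdict(lambda: {"shipped": 0, "sales": 0.0})
    for row in rows:
        if row["STATUS"] == "Shipped":
            summary[row["YEAR_ID"]]["shipped"] += 1
            summary[row["YEAR_ID"]]["sales"] += row["SALES"]
    return dict(summary)


def cancellations_by_country(rows):
    """Count cancelled rows per country (status spelling is an assumption)."""
    return Counter(row["COUNTRY"] for row in rows if row["STATUS"] == "Cancelled")


by_year = shipped_summary_by_year(records)
cancelled = cancellations_by_country(records)

for year in sorted(by_year):
    print(year, by_year[year])
print(cancelled.most_common())
```

A breakdown like this only shows where cancellations cluster. Explaining why they happen would need information, such as delivery dates or complaint reasons, that may not be in the sample data.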
About HNG Internship
The HNG internship is an annual program geared towards helping creatives, developers, and other people in tech showcase their skills by working on real-world tasks.
The program also has a premium network where interested participants can connect with like-minded people, access job updates, and more. Learn more about the premium network here.
This is my second time at HNG, and I hope to become a finalist in this cohort.