Data Insight from Retail Sample Sales Data

Photo by Carlos Muza on Unsplash

Identifying Key Trends and Insights for Further Analysis

This is an initial analysis of the sample data provided (from Kaggle) for the stage 0 task at the HNG 11 internship. (Learn more about HNG here.)

Looking at the dataset, it appears to cover sales from an automobile company or dealer, because the products sold include cars, trains, and motorcycles.

The sample dataset has 25 columns and 2,823 rows, which contain information about order quantities, customers, dates, statuses, sales amounts, prices, and other details.

Some of the key columns are:

  • PRODUCTLINE: Categorical, tells us the type of product sold, for example motorcycles or cars.

  • PRODUCTCODE: Object, an identifier for each product within a product line

  • ORDERNUMBER: Numerical, an identifier for each order placed

  • QUANTITYORDERED: Numerical, tells us the quantity ordered in each order line

  • PRICEEACH: Numerical, tells us the price of each item

  • SALES: Numerical, tells us the total sales amount for a particular order line

  • STATUS: Categorical, tells us the current status of the order, such as In Process, Disputed, Shipped, or Cancelled.

  • ORDERDATE: Datetime, when the order was made

  • QTR_ID, MONTH_ID, YEAR_ID: Numerical, tell us the quarter, month, and year in which the order was made

  • DEALSIZE: Categorical, tells us the size of the deal: small, medium, or large.

Other columns tell us more about the customer, such as the address, the country, and the first and last name of the customer's contact person.
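To make the column types above more concrete, here is a minimal sketch that parses a couple of rows with Python's standard library. The two rows, the `parse_row` helper, and the `%m/%d/%Y` date format are assumptions made up for illustration; they are not taken from the actual Kaggle file.

```python
import csv
import io
from datetime import datetime

# Two made-up rows that only mimic the column layout described above.
SAMPLE_CSV = """ORDERNUMBER,QUANTITYORDERED,PRICEEACH,SALES,ORDERDATE,STATUS,PRODUCTLINE,DEALSIZE
10001,30,95.50,2865.00,2/24/2003,Shipped,Motorcycles,Small
10002,40,100.00,4000.00,5/7/2004,Shipped,Classic Cars,Medium
"""


def parse_row(row):
    """Convert the string fields of one CSV row to the types listed above."""
    return {
        "ORDERNUMBER": int(row["ORDERNUMBER"]),          # numerical identifier
        "QUANTITYORDERED": int(row["QUANTITYORDERED"]),  # numerical
        "PRICEEACH": float(row["PRICEEACH"]),            # numerical
        "SALES": float(row["SALES"]),                    # numerical
        # Assumed date format: month/day/year.
        "ORDERDATE": datetime.strptime(row["ORDERDATE"], "%m/%d/%Y"),
        "STATUS": row["STATUS"],                         # categorical
        "PRODUCTLINE": row["PRODUCTLINE"],               # categorical
        "DEALSIZE": row["DEALSIZE"],                     # categorical
    }


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(len(rows), "rows,", len(rows[0]), "columns")
```

With the real file, the `io.StringIO(SAMPLE_CSV)` part would be replaced by an opened file handle, and the date format may need adjusting to match what the file actually contains.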

Quick Review of the Dataset

A quick review gives us a basic idea of what's happening in the data, summarised below:

  • The most-sold product line is "Classic Cars"

  • The average amount spent per order is about $3,000

  • The most common order STATUS is "Shipped" (2,617), which suggests that the company has an order processing system that works well.

  • The most common DEALSIZE is "Medium" (1,384 records), which suggests that most customers place medium-sized deals rather than small or large ones.

  • Most of the products are purchased by customers in EMEA (Europe, the Middle East, and Africa), while at the country level the United States recorded more sales than any other individual country, with most of those sales coming from California.
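The kind of summary above can be reproduced with a few counts and an average. The sketch below uses a tiny made-up list of order lines and a hypothetical `most_common` helper, so its numbers are illustrative only and do not match the figures from the real dataset.

```python
from collections import Counter
from statistics import mean

# Hypothetical order lines; the real dataset has 2,823 rows.
orders = [
    {"PRODUCTLINE": "Classic Cars", "SALES": 4000.0, "STATUS": "Shipped", "DEALSIZE": "Medium"},
    {"PRODUCTLINE": "Classic Cars", "SALES": 3000.0, "STATUS": "Shipped", "DEALSIZE": "Medium"},
    {"PRODUCTLINE": "Motorcycles", "SALES": 2000.0, "STATUS": "Cancelled", "DEALSIZE": "Small"},
]


def most_common(rows, column):
    """Return (value, count) for the most frequent value in a column."""
    return Counter(row[column] for row in rows).most_common(1)[0]


top_product = most_common(orders, "PRODUCTLINE")
top_status = most_common(orders, "STATUS")
top_deal = most_common(orders, "DEALSIZE")
average_sales = mean(row["SALES"] for row in orders)

print("Top product line:", top_product)
print("Top status:", top_status)
print("Top deal size:", top_deal)
print("Average sales per row:", average_sales)
```

The same helper works for any categorical column, for example a country or territory column, as long as the rows carry that key.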

Conclusion

This quick review briefly touched on the average amount spent per order, the most-sold product line, the most common deal size, and customer location.

While the initial data analysis suggests a positive result, further analysis could go deeper into:

  • Which year has the most shipped orders, and how that compares with sales, location, and price across years, to see what changed and what could be improved

  • Few cancellations are recorded, but why did they happen? Is it location, late delivery, wrong item delivery, etc.? Further analysis could delve deeper into this

  • How can sales from other countries increase? What are the barriers?

  • What makes customers prefer the medium size over other deal sizes?
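As a starting point for the first two questions, here is a rough sketch that groups shipped orders by year and counts cancellations by country. The rows and both helper functions are hypothetical, and the "Cancelled" spelling of the status value is an assumption that should be checked against the real data.

```python
from collections import defaultdict

# Hypothetical order lines for illustration only.
orders = [
    {"YEAR_ID": 2003, "STATUS": "Shipped", "SALES": 2500.0, "COUNTRY": "USA"},
    {"YEAR_ID": 2003, "STATUS": "Shipped", "SALES": 3500.0, "COUNTRY": "France"},
    {"YEAR_ID": 2004, "STATUS": "Shipped", "SALES": 4000.0, "COUNTRY": "USA"},
    {"YEAR_ID": 2004, "STATUS": "Cancelled", "SALES": 1500.0, "COUNTRY": "Spain"},
]


def shipped_summary_by_year(rows):
    """Count shipped rows and sum their sales for each year."""
    summary = defaultdict(lambda: {"orders": 0, "sales": 0.0})
    for row in rows:
        if row["STATUS"] == "Shipped":
            summary[row["YEAR_ID"]]["orders"] += 1
            summary[row["YEAR_ID"]]["sales"] += row["SALES"]
    return dict(summary)


def cancellations_by_country(rows):
    """Count cancelled rows per country (assumes the status is spelled 'Cancelled')."""
    counts = defaultdict(int)
    for row in rows:
        if row["STATUS"] == "Cancelled":
            counts[row["COUNTRY"]] += 1
    return dict(counts)


by_year = shipped_summary_by_year(orders)
cancelled = cancellations_by_country(orders)

for year in sorted(by_year):
    print(year, by_year[year])
print("Cancellations:", cancelled)
```

Counts like these only show where cancellations occur; the reasons behind them (late delivery, wrong item, and so on) would need information that may not be present in this dataset.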

About HNG Internship

The HNG internship is an annual program geared towards helping creatives, developers, and other people in tech showcase their skills by working on real-world tasks.

The program also has a premium network for interested participants to network with like-minded people, access job updates, etc. Learn more about the premium network here.

This is my second time at HNG, and I hope to become a finalist in this cohort.