Factors Affecting Online Shopping Behavior
Marketing Batch (M3)
Authors:
· Kartik Zode
· Vedika Khatu
· Gyanesh Kumar Sharma
Introduction:
As of 2026, e-commerce has transitioned from a convenience to a daily necessity across all age groups and income levels. However, the decision-making process is increasingly complex, influenced by emerging trends like social commerce, AI-driven personalization, and a heightened sensitivity to data privacy. This study aims to reduce a wide range of consumer concerns to a small number of meaningful “Factors” using Exploratory Factor Analysis (EFA). It then uses Cluster Analysis to identify how different groups of respondents (Clusters) prioritize attributes such as price, security, and brand ethics.
Objectives:
· To identify the underlying dimensions (latent factors) that influence online buying behavior through Exploratory Factor Analysis (EFA).
· To group MBA students into distinct segments using Cluster Analysis based on their shopping motivations and frequency.
Data collection:
· Method – Structured questionnaire.
· Scale – 5-point Likert scale.
· Sample – Approximately 40 respondents.
· Sampling technique – Convenience sampling.
Data analysis:
1. Factor analysis –
· Communalities –
| Variable | Initial | Extraction |
| How often do you shop online? | 1.000 | .315 |
| Do online reviews affect your buying decision? | 1.000 | .787 |
| How important is website/app security? | 1.000 | .655 |
| Social media influence on purchases | 1.000 | .787 |
| Importance of easy return/refund policy | 1.000 | .640 |
| Avoidance due to high delivery charges | 1.000 | .351 |
Extraction Method: Principal Component Analysis.
The Communalities table reports the proportion of variance in each variable that is accounted for by the extracted factors; values closer to 1.000 mean the variable is better represented by the factor solution. In this study, Social Media Influence and Online Reviews have the highest extraction values (both .787), meaning that nearly 79% of their variance is explained by the extracted factors, so they are the items most closely tied to the underlying themes. Website Security (.655) and Return Policy (.640) are also well represented. Shopping Frequency (.315) and Delivery Charge Avoidance (.351) show much lower values, which suggests that these behaviors are driven largely by individual circumstances or other influences that the extracted factors do not capture.
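The reading of the table above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the shortened item labels and the 0.50 cutoff for a "low" communality are assumptions made for this example, not values taken from the study.

```python
# Extraction communalities copied from the Communalities table.
# The shortened item labels are illustrative, not the original question wording.
communalities = {
    "Shopping frequency": 0.315,
    "Online reviews": 0.787,
    "Website/app security": 0.655,
    "Social media influence": 0.787,
    "Return/refund policy": 0.640,
    "Delivery charge avoidance": 0.351,
}

# Assumption: 0.50 is used here as an illustrative cutoff for a weakly
# represented item; the report itself does not state a cutoff.
LOW_CUTOFF = 0.50

low_items = [name for name, h2 in communalities.items() if h2 < LOW_CUTOFF]

# Each standardized item contributes a variance of 1 (the "Initial" column),
# so the sum of communalities over the item count is the share of total
# variance reproduced by the extracted factors.
total_explained = sum(communalities.values())
share_of_total = total_explained / len(communalities)

print("Weakly represented items:", low_items)
print(f"Sum of communalities: {total_explained:.3f}")
print(f"Share of total variance: {share_of_total:.1%}")
```

Under these assumptions the two flagged items are Shopping Frequency and Delivery Charge Avoidance, and the communalities sum to 3.535 out of a possible 6.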
· KMO and Bartlett test –
| Kaiser-Meyer-Olkin Measure of Sampling Adequacy | | 0.658 |
| Bartlett’s Test of Sphericity | Approx. Chi-Square | 37.056 |
| | df | 15 |
| | Sig. | .001 |
The KMO and Bartlett’s Test table indicates that the data are suitable for Factor Analysis. The Kaiser-Meyer-Olkin (KMO) value of 0.658 is above the commonly used minimum of 0.60, so the pattern of correlations among the survey questions is adequate for factoring, although a value in this range is modest rather than strong. Bartlett’s Test of Sphericity is significant (chi-square = 37.056, df = 15, p = .001), so the null hypothesis that the variables are uncorrelated (an identity correlation matrix) is rejected. Together, these results provide a “green light”: the items are sufficiently related to one another for underlying themes or “factors” to be extracted from the respondents’ answers.
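The reported significance level can be reproduced from the chi-square statistic and its degrees of freedom. The sketch below uses only the Python standard library; the helper name `chi2_sf_odd_df` is hypothetical, and it relies on the closed-form upper-tail formula that applies when the degrees of freedom are odd, as they are here.

```python
import math


def chi2_sf_odd_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with odd df.

    Hypothetical helper for this example; uses the closed-form series that
    holds for odd degrees of freedom.
    """
    if df < 1 or df % 2 == 0:
        raise ValueError("this sketch only handles odd degrees of freedom")
    root = math.sqrt(x)
    normal_pdf = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    series = 0.0
    term = root  # root**(2r-1) / (1*3*...*(2r-1)) for r = 1
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)  # advance to the term for r + 1
    return math.erfc(root / math.sqrt(2.0)) + 2.0 * normal_pdf * series


n_items = 6                            # six survey items in the analysis
df = n_items * (n_items - 1) // 2      # number of distinct correlations
p_value = chi2_sf_odd_df(37.056, df)   # chi-square value from the table

print("df:", df)
print("p-value rounded to 3 decimals:", round(p_value, 3))
```

The degrees of freedom follow from the number of distinct pairs among six items, which matches the df of 15 in the table, and the rounded p-value agrees with the reported Sig. of .001.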
· Variance explained –
| Component | Initial Eigenvalues: Total | Initial Eigenvalues: % of Variance | Initial Eigenvalues: Cumulative % | Rotation Sums of Squared Loadings: Total | Rotation: % of Variance | Rotation: Cumulative % |
| 1 | 2.348 | 39.128% | 39.128% | 1.947 | 32.450% | 32.450% |
| 2 | 1.188 | 19.794% | 58.922% | 1.588 | 26.472% | 58.922% |
| 3 | 0.979 | 16.320% | 75.242% | | | |
| 4 | 0.821 | 13.678% | 88.920% | | | |
The Total Variance Explained table lists the components (factors) in the data and measures how much of the total variance each one accounts for. In this study, the first two components are the only ones with eigenvalues greater than 1, the conventional cutoff for retaining a component; Component 3 falls just short at 0.979. Component 1 is the strongest, explaining 39.1% of the variance in the six survey items, while Component 2 adds another 19.8%, bringing the Cumulative % to 58.9%. This means that two themes, instead of six original questions, retain nearly 60% of the information in the responses. The Rotation Sums columns show that after rotation, which makes the two themes easier to label, the variance is distributed more evenly between them (roughly 32% and 26%), while the combined total stays at 58.9%.
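The percentages in the table follow directly from the eigenvalues. A minimal sketch, assuming the analysis was run on six standardized items so that the total variance equals 6 (consistent with the initial communalities of 1.000):

```python
# Eigenvalues of the first four components, copied from the table.
eigenvalues = [2.348, 1.188, 0.979, 0.821]
n_items = 6  # total variance of six standardized items

# Percentage of total variance carried by each component.
pct_variance = [ev / n_items * 100 for ev in eigenvalues]

# Running total, matching the "Cumulative %" column.
cumulative = []
running = 0.0
for pct in pct_variance:
    running += pct
    cumulative.append(running)

# Eigenvalue-greater-than-1 rule: keep components above the cutoff.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

for i, (pct, cum) in enumerate(zip(pct_variance, cumulative), start=1):
    print(f"Component {i}: {pct:.1f}% of variance, {cum:.1f}% cumulative")
print("Retained components:", retained)
```

Small differences in the second or third decimal place relative to the table are expected, because the eigenvalues are entered here already rounded to three decimals.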
2. Cluster analysis –
| Variable | Cluster 1 | Cluster 2 | Cluster 3 |
| Frequency of Shopping | .096 | .252 | -.894 |
| Influence of Reviews | .034 | -.252 | .593 |
| Importance of Security | .357 | -.688 | 1.000 |
| Social Media Influence | .370 | -.759 | 1.159 |
| Return Policy Importance | .318 | -.606 | .874 |
| High Delivery Charge Avoidance | -1.254 | .798 | .798 |
a) Cluster 1 (The Delivery-Tolerant Shoppers): This group is defined by a strongly negative score on Delivery Charge Avoidance (-1.254). While the other two clusters are deterred by extra fees, this is the only cluster willing to pay for shipping. Its members are moderate, “middle-of-the-road” shoppers who rate security (.357), social media (.370), and return policies (.318) slightly above the average.
b) Cluster 2 (The Price-Sensitive Skeptics): This is the most frequent shopping group (.252), but its members are relatively cynical. They show the lowest scores for Social Media Influence (-.759), Security (-.688), and Return Policies (-.606). They are functional, high-frequency shoppers who are strongly deterred by Delivery Charges (.798).
c) Cluster 3 (The Cautious Influentials): This group shops the least often (-.894) but is the most “plugged in.” It has the highest scores for Social Media Influence (1.159), Security (1.000), and Return Policies (.874), as well as for the Influence of Reviews (.593). Its members are careful, high-stakes shoppers who buy only when they feel safe and inspired; like Cluster 2, they are strongly deterred by high delivery fees (.798).
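The three profiles above can also be used to place a new respondent in a segment. The sketch below assumes that the tabulated values are cluster centers on a standardized scale and that assignment is by smallest Euclidean distance, as in k-means; the report does not state the clustering procedure, so both points are assumptions. The function name and the example respondent are hypothetical.

```python
import math

# Order of variables matches the cluster table.
VARIABLES = [
    "Frequency of Shopping",
    "Influence of Reviews",
    "Importance of Security",
    "Social Media Influence",
    "Return Policy Importance",
    "High Delivery Charge Avoidance",
]

# Cluster centers copied column by column from the cluster table.
CENTERS = {
    1: [0.096, 0.034, 0.357, 0.370, 0.318, -1.254],
    2: [0.252, -0.252, -0.688, -0.759, -0.606, 0.798],
    3: [-0.894, 0.593, 1.000, 1.159, 0.874, 0.798],
}


def nearest_cluster(z_scores):
    """Return the cluster whose center is closest in Euclidean distance.

    Assumes z_scores are standardized the same way as the cluster centers.
    """
    if len(z_scores) != len(VARIABLES):
        raise ValueError("expected one score per variable")
    distances = {c: math.dist(z_scores, center) for c, center in CENTERS.items()}
    return min(distances, key=distances.get)


# Hypothetical respondent: average on everything except a strong
# tolerance for delivery charges.
fee_tolerant = [0.0, 0.0, 0.0, 0.0, 0.0, -1.2]
print("Assigned cluster:", nearest_cluster(fee_tolerant))
```

With these centers, the hypothetical fee-tolerant respondent lands in Cluster 1, which is the behavior the profile descriptions lead one to expect.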
· Table of members –
| Cluster | N |
| 1 | 14 |
| 2 | 16 |
| 3 | 6 |
| Valid | 36 |
| Missing | 0 |
This table, known as the Number of Cases in each Cluster, gives the final headcount for the segmentation and shows how the 36 survey respondents are distributed across the three consumer profiles. Cluster 2 is the largest segment with 16 individuals, representing the “Price-Sensitive Skeptics” who shop frequently but remain wary of extra costs and marketing tactics. Cluster 1, the “Delivery-Tolerant Shoppers,” follows closely with 14 members, while Cluster 3, the “Cautious Influentials,” is a small but distinct niche of only 6 people. The “Valid 36” and “Missing 0” rows serve as a data integrity check: every one of the 36 respondents was assigned to a cluster, and none was dropped for missing data.
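Expressing the headcounts as shares of the sample makes the relative size of each segment easier to compare. A short sketch using only the counts from the table:

```python
# Cluster sizes copied from the Number of Cases in each Cluster table.
cluster_sizes = {1: 14, 2: 16, 3: 6}

total = sum(cluster_sizes.values())
shares = {c: n / total * 100 for c, n in cluster_sizes.items()}
largest = max(cluster_sizes, key=cluster_sizes.get)

print("Total respondents:", total)
for cluster, share in shares.items():
    print(f"Cluster {cluster}: {share:.1f}% of respondents")
print("Largest cluster:", largest)
```

The counts add up to the 36 valid cases, with Cluster 2 holding a little under half of the sample and Cluster 3 roughly one respondent in six.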
3. Conclusion –
The overall analysis points to a digital consumer landscape in which Security, Social Influence, and Trust are the primary drivers of behavior: the two extracted factors together explain nearly 60% of the variance in the six survey items. The Factor Analysis highlights that while consumers are heavily influenced by social proof and platform safety, their actual shopping frequency is only weakly tied to these factors (communality of .315) and remains a more independent, secondary trait. This suggests that even highly frequent shoppers do not necessarily lower their guard regarding security or reviews.
The Cluster Analysis further divides the respondents into three actionable profiles: the Price-Sensitive Skeptics (the largest group), who shop often but are cynical toward marketing; the Delivery-Tolerant Shoppers, who prioritize convenience and security over shipping costs; and the Cautious Influentials, a niche but highly engaged segment that requires strong social proof and safety guarantees before committing to a purchase. For a business, this suggests that a “one-size-fits-all” marketing strategy is unlikely to work; instead, efforts should be split between optimizing delivery costs for the skeptics and reinforcing trust and social media presence for the more cautious, influence-driven segments.