SURVEY ON THE BENEFITS OF USING CREDIT CARDS

Cluster Analysis of Credit Cards

Authors: Anushka Namjoshi, Siddharth Singh

Introduction:

Cluster analysis is a statistical technique used to group a set of objects or data points into clusters based on their similarities. The goal is to categorize data so that items in the same cluster are more similar to each other than to items in other clusters. This can help uncover patterns, structures, or relationships within the data.

Cluster analysis attempts to maximize the homogeneity of objects within each cluster while also maximizing the heterogeneity between clusters.

Objective:

The main objective of this report is to find out if there are any significant differences in the various characteristics among the clusters using Analysis of Variance (ANOVA). By understanding these differences and therefore the relationship between cluster membership and the mentioned attributes, a conclusion can be derived about the characteristics that differentiate the clusters.

Characteristics for the Analysis:

The cluster analysis used the following 10 characteristics as the primary variables:

  1. Cash Back
  2. Emergency financial access
  3. Credit limit
  4. Free airport lounge access
  5. CIBIL benefits
  6. Buy on credit
  7. Convenience of Transaction
  8. Improve spending habits
  9. Foreign Transaction Convenience
  10. Discounts and offers

Data Collection:

This survey was conducted using Google Forms to evaluate perceptions of credit cards based on the 10 characteristics listed above.

A K-means clustering algorithm was used, with the number of clusters set to two.

Data Preprocessing: Standardize or normalize the data to ensure that features contribute equally to the distance calculations.
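The preprocessing and clustering steps can be sketched in plain Python. This is only an illustration under stated assumptions: the ratings below are hypothetical (not the survey responses), z-scoring is one possible way to standardize, and the two starting centers are picked by hand, which is not how SPSS chooses its initial centers.

```python
import math

def standardize(rows):
    """Z-score each column so every feature has mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Population standard deviation; fall back to 1.0 for constant columns.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def kmeans(rows, centers, max_iter=10):
    """Lloyd's algorithm: assign rows to the nearest center, then recompute centers."""
    labels = []
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda k: math.dist(row, centers[k]))
                  for row in rows]
        new_centers = []
        for k in range(len(centers)):
            members = [row for row, lab in zip(rows, labels) if lab == k]
            # Keep the old center if a cluster ends up empty.
            new_centers.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[k])
        if new_centers == centers:  # no change in centers: converged
            break
        centers = new_centers
    return labels, centers

# Hypothetical ratings (respondents x items), NOT the survey data.
ratings = [[3, 3, 2], [3, 2, 3], [2, 3, 3], [1, 2, 1], [2, 1, 1], [1, 1, 2]]
scaled = standardize(ratings)
# Hand-picked starting centers (an assumption made for this sketch).
labels, centers = kmeans(scaled, centers=[scaled[0], scaled[3]])
print(labels)
```

With these made-up ratings, the first three respondents end up in one cluster and the last three in the other.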

Data Analysis:

The data were analyzed using IBM SPSS software.

  1. Cluster Analysis Results:

Cluster Membership:

Case Number Cluster Distance
1 1 2.654
2 1 2.321
3 2 2.120
4 1 1.850
5 1 1.393
6 2 .997
7 1 2.070
8 1 1.622
9 1 1.784
10 2 2.828
11 1 1.477
12 1 2.037
13 1 1.774
14 2 1.377
15 1 2.103
16 1 1.850
17 1 1.745
18 1 1.465
19 1 1.803
20 2 1.948
21 2 2.235
22 1 1.755
23 1 2.020
24 1 1.850
25 1 1.784
26 1 1.643
27 1 1.850
28 1 2.053
29 2 .997
30 2 .997
31 2 1.377
32 1 1.841
33 2 .997
34 1 2.269
35 2 1.701
36 1 1.976
37 1 1.850
38 2 2.568
39 2 2.235
40 1 1.405
41 2 2.235
42 2 2.235
43 2 2.587
44 2 1.759
45 2 2.120
46 2 .997
47 2 1.138
48 1 1.950
49 1 2.078

Number of Cases in each Cluster:

Cluster 1    29
Cluster 2    20
Valid        49
Missing       0

Iteration History (a):

Iteration    Change in Cluster Centers
             Cluster 1    Cluster 2
1            2.539        1.920
2            .233         .263
3            .110         .106
4            .099         .103
5            .083         .094
6            .144         .196
7            .083         .126
8            .000         .000

(a) Convergence achieved due to no or small change in cluster centers. The maximum absolute coordinate change for any center is .000. The current iteration is 8. The minimum distance between initial centers is 4.472.

Final Cluster Centers:

Item                               Cluster 1    Cluster 2
Discounts and offers               2.62         1.75
Cash backs                         2.38         1.65
Emergency financial access         2.76         1.65
Credit limit                       2.34         1.70
Free airport lounge access         2.41         1.65
CIBIL benefit                      2.59         1.75
Buy on credit                      2.69         1.55
Convenience of Transaction         2.52         1.90
Improve spending habits            1.86         1.60
Foreign Transaction Convenience    2.48         1.80

Distances between Final Cluster Centers:

Cluster      1        2
1                     2.536
2            2.536
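The reported distance can be checked against the final cluster centers. The sketch below assumes the distance is the ordinary Euclidean distance between the two center vectors, and uses the centers as printed (rounded to two decimals), so a small rounding gap from the SPSS figure is expected.

```python
import math

# Final cluster centers as reported in the table above (10 items each).
cluster_1 = [2.62, 2.38, 2.76, 2.34, 2.41, 2.59, 2.69, 2.52, 1.86, 2.48]
cluster_2 = [1.75, 1.65, 1.65, 1.70, 1.65, 1.75, 1.55, 1.90, 1.60, 1.80]

# Euclidean distance: square root of the summed squared differences per item.
distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(cluster_1, cluster_2)))
print(round(distance, 2))
```

The result comes out at roughly 2.54, in line with the reported 2.536.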

ANOVA:

                                   Cluster             Error
Item                               Mean Square   df    Mean Square   df    F         Sig.
Discounts and offers               8.973         1     .268          47    33.532    <.001
Cash backs                         6.296         1     .412          47    15.271    <.001
Emergency financial access         14.548        1     .295          47    49.331    <.001
Credit limit                       4.922         1     .314          47    15.681    <.001
Free airport lounge access         6.905         1     .544          47    12.685    <.001
CIBIL benefit                      8.277         1     .272          47    30.428    <.001
Buy on credit                      15.374        1     .237          47    64.764    <.001
Convenience of Transaction         4.510         1     .405          47    11.131    .002
Improve spending habits            .813          1     .558          47    1.456     .234
Foreign Transaction Convenience    5.518         1     .392          47    14.063    <.001

The F tests should be used only for descriptive purposes because the clusters have been chosen to maximize the differences among cases in different clusters. The observed significance levels are not corrected for this and thus cannot be interpreted as tests of the hypothesis that the cluster means are equal.
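As a rough check on how the ANOVA figures relate to the other tables, the sketch below recomputes two quantities for three of the items. The F value is the cluster mean square divided by the error mean square. With only two clusters (1 degree of freedom), the cluster mean square can be approximated from the cluster sizes and the final centers. The function names are illustrative, and the inputs are the rounded values printed above, so the results match the SPSS output only approximately.

```python
N1, N2 = 29, 20  # cluster sizes reported above

# item: (center 1, center 2, cluster mean square, error mean square, reported F)
rows = {
    "Discounts and offers": (2.62, 1.75, 8.973, 0.268, 33.532),
    "Buy on credit": (2.69, 1.55, 15.374, 0.237, 64.764),
    "Improve spending habits": (1.86, 1.60, 0.813, 0.558, 1.456),
}

def between_mean_square(n1, n2, mean1, mean2):
    """Between-cluster mean square for two groups (1 degree of freedom)."""
    return n1 * n2 / (n1 + n2) * (mean1 - mean2) ** 2

def f_ratio(ms_cluster, ms_error):
    """F statistic: cluster mean square divided by error mean square."""
    return ms_cluster / ms_error

results = {}
for item, (c1, c2, ms_cluster, ms_error, reported_f) in rows.items():
    ms_from_centers = between_mean_square(N1, N2, c1, c2)
    f_value = f_ratio(ms_cluster, ms_error)
    results[item] = (ms_from_centers, f_value)
    print(f"{item}: MS from centers = {ms_from_centers:.3f} "
          f"(reported {ms_cluster}), F = {f_value:.2f} (reported {reported_f})")
```

Items with a large gap between the two centers, such as Buy on credit, produce a large cluster mean square and hence a large F. Improve spending habits, where the centers are close, produces an F near 1.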

Interpretation:

1. Cluster Composition:

– Cluster 1 (29 cases): This group generally rated items higher, with average ratings between 2.34 and 2.76 on nine of the ten items; the exception is Improve spending habits (1.86). This suggests that individuals in this cluster are more positive in their evaluations.

– Cluster 2 (20 cases): This group exhibited lower average ratings, all falling between 1.55 and 1.90. These individuals appear to be less satisfied or more critical in their assessments.

2. Convergence and Stability:

– The analysis reached convergence at the 8th iteration, when the change in both cluster centers fell to .000. This indicates that the cluster centers stabilized and that the assignment of cases to clusters no longer changed. The small changes in cluster centers after the first iteration suggest that the groups are reasonably well-defined.

3. Significant Differences:

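– The ANOVA results show clear differences between the clusters on nine of the ten rating items (p < .001 for eight items and p = .002 for Convenience of Transaction). Only Improve spending habits shows no clear difference (p = .234). This indicates that the two clusters represent distinct perspectives or experiences related to the items evaluated. Because the clusters were chosen to maximize the differences between them, these F tests are descriptive: they show which items separate the clusters most, but they are not formal tests that the cluster means are equal.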

4. Distance Between Clusters:

– The distance of approximately 2.536 between the final cluster centers indicates a substantial separation between the groups. This separation suggests that the characteristics of each cluster are quite different, further emphasizing the need to tailor strategies or interventions to address the specific needs or perceptions of each group.

 

Practical Implications:

– Targeted Strategies: Understanding that Cluster 1 consists of more positive evaluations and Cluster 2 represents a more critical viewpoint allows for tailored approaches. For instance, Cluster 2 might benefit from additional support or improvements in areas they rated lower.

– Further Exploration: It may be beneficial to investigate the specific items or aspects that contribute to the differing ratings between clusters. This could provide insights into what drives satisfaction or dissatisfaction.

– Communication and Engagement: Different communication strategies may be required for each cluster. Engaging with the more critical group (Cluster 2) may involve addressing concerns directly, while the more positive group (Cluster 1) could be encouraged to share their positive experiences more widely.

In summary, the clustering analysis reveals two distinct groups with differing evaluations, which can inform targeted strategies for improvement and engagement based on the unique characteristics of each cluster.

Conclusion:

The K-means clustering analysis identified two distinct groups among the 49 cases based on item ratings.

1. Distinct Clusters: The analysis revealed two clusters, with Cluster 1 showing generally higher ratings compared to Cluster 2. This indicates a clear difference in how the two groups perceive the credit card benefits they rated.

2. Statistical Significance: ANOVA results showed differences between the clusters on nine of the ten rating items, with Improve spending habits as the only exception. Because the clusters were chosen to maximize such differences, these F tests describe how the clusters differ rather than formally test whether the cluster means are equal.

3. Practical Implications: Understanding these clusters can help tailor strategies to address the varying perceptions of credit card benefits, potentially guiding marketing or product improvement efforts based on the preferences of each group. Further investigation into the characteristics of each cluster could provide insights into the underlying reasons for these differences in ratings.

 
