This blog explores data science and networking, combining theoretical concepts with practical implementations. Topics include routing protocols, network operations, and data-driven problem solving, presented with clarity and reproducibility in mind.
Tuesday, December 31, 2024
Treemap of Titanic Dataset: Survival Analysis by Class, Sex, Embarkation Town
Titanic Survival Analysis using Treemap Visualization
Table of Contents
- Problem Statement
- Goal of Analysis
- Why Treemap?
- Hierarchy Breakdown
- Mathematical Understanding
- Code Example
- CLI Output
- Insights
- Key Takeaways
- Related Articles
Problem Statement
The objective is to analyze survival patterns of passengers aboard the Titanic using categorical features such as class, gender, and embarkation location.
A single bar or pie chart shows only one or two of these features at a time, so relationships that span several levels are hard to see. A visualization that encodes the hierarchy directly is a better fit.
Goal of Analysis
- Understand survival distribution
- Compare groups across multiple variables
- Identify hidden patterns
- Create intuitive visualization
Why Treemap?
A treemap suits hierarchical data because it:
- Represents nested categories as nested rectangles
- Uses rectangle area for magnitude
- Uses color for an additional dimension
Each rectangle represents a group, and its area is proportional to that group's passenger count.
Hierarchical Structure
The treemap follows this hierarchy:
- Class (1st, 2nd, 3rd)
- Sex (Male, Female)
- Embark Town
- Survival Status
Explanation
This structure allows drilling down from broad categories (class) into detailed insights (survival). Each level adds context, improving interpretability.
Mathematical Understanding
Survival Rate Formula
Survival Rate = (Number of Survivors / Total Passengers) × 100
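As a minimal sketch of this formula, the snippet below computes the rate from a list of 0/1 survival flags. The `survival_rate` helper and the sample flags are illustrative assumptions, not values taken from the Titanic dataset.

```python
def survival_rate(flags):
    """Return the survival rate in percent for a list of 0/1 flags."""
    if not flags:
        raise ValueError("survival rate is undefined for an empty group")
    return sum(flags) / len(flags) * 100


# Hypothetical group of eight passengers: 1 = survived, 0 = did not survive.
sample_flags = [1, 0, 0, 1, 1, 0, 0, 0]
rate = survival_rate(sample_flags)
print(f"{rate:.1f}%")  # 3 survivors out of 8 -> 37.5%
```

Raising on an empty group is a design choice made here; returning `None` would work equally well if empty groups are expected.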
Group Proportion
Group Size ∝ Number of passengers in category
Color Encoding
Color Scale = f(Survival Status)
Deep Explanation
Treemap area is proportional to frequency counts. Color mapping often uses normalized values between 0 and 1. For example:
normalized_value = (value - min) / (max - min)
This ensures consistent color scaling across categories.
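The normalization formula can be sketched in a few lines of plain Python. The `min_max_normalize` name and the sample rates are assumptions made for illustration; this shows the general idea rather than how any particular plotting library implements its color scale.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so the formula would divide by zero.
        # Mapping everything to 0.0 is an arbitrary choice made for this sketch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical per-group survival rates in percent (not taken from the dataset).
rates = [20.0, 50.0, 80.0]
print(min_max_normalize(rates))  # [0.0, 0.5, 1.0]
```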
Advanced Mathematical Explanation
To fully understand the treemap visualization, we need to break down the mathematical relationships behind survival rates, proportions, and hierarchical aggregation.
1. Survival Probability
The survival probability for any group is calculated as:
P(Survival) = Number of Survivors / Total Passengers in Group
This gives a value between 0 and 1, where:
- 0 → No one survived
- 1 → Everyone survived
2. Percentage Conversion
Survival Rate (%) = P(Survival) × 100
3. Hierarchical Aggregation
A treemap works by aggregating counts at each level:
Total(Class) = Σ Passengers in that Class
Total(Class, Sex) = Σ Passengers in that subgroup
Each rectangle size is proportional to:
Area ∝ Number of Passengers
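The aggregation can be sketched with the standard library alone. The records below are hypothetical (class, sex) pairs invented for illustration; with the real dataset the same roll-up would usually be done with a pandas `groupby`.

```python
from collections import Counter

# Hypothetical passenger records as (class, sex) pairs, for illustration only.
passengers = [
    ("First", "female"), ("First", "male"),
    ("Third", "female"), ("Third", "male"),
    ("Third", "male"), ("Third", "male"),
    ("Second", "male"), ("Second", "female"),
]

# Leaf level: one count per (class, sex) subgroup.
by_class_sex = Counter(passengers)

# Parent level: each class total is the sum of its subgroup counts.
by_class = Counter()
for (pclass, _sex), count in by_class_sex.items():
    by_class[pclass] += count

# Area share of each class rectangle: its count over the grand total.
total = sum(by_class.values())
area_share = {pclass: count / total for pclass, count in by_class.items()}

print(by_class["Third"], by_class_sex[("Third", "male")])  # 4 3
print(area_share["Third"])  # 0.5
```

Because every parent total is the sum of its children, the child rectangles always tile their parent exactly.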
4. Conditional Probability Insight
We can also analyze survival using conditional probability:
P(Survival | Female, 1st Class) = Survivors(Female, 1st Class) / Total(Female, 1st Class)
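A conditional probability like this is simply the survival fraction within the filtered group. The sketch below assumes passenger records stored as dictionaries; the `conditional_survival` helper, the `pclass` key (used because `class` is a reserved word in Python), and the sample records are all hypothetical.

```python
def conditional_survival(records, **conditions):
    """Estimate P(survived | conditions) from a list of passenger dicts."""
    group = [
        r for r in records
        if all(r[key] == value for key, value in conditions.items())
    ]
    if not group:
        return None  # no passengers match, so the probability is undefined
    return sum(r["survived"] for r in group) / len(group)


# Hypothetical records for illustration only; not real Titanic data.
records = [
    {"pclass": "First", "sex": "female", "survived": 1},
    {"pclass": "First", "sex": "female", "survived": 1},
    {"pclass": "First", "sex": "female", "survived": 0},
    {"pclass": "First", "sex": "female", "survived": 1},
    {"pclass": "Third", "sex": "male", "survived": 0},
    {"pclass": "Third", "sex": "male", "survived": 1},
]

p = conditional_survival(records, sex="female", pclass="First")
print(p)  # 3 of 4 matching passengers survived -> 0.75
```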
5. Color Normalization (for Treemap)
Color intensity is calculated using normalization:
Normalized Value = (x - min) / (max - min)
This ensures consistent color mapping across all groups.
Why This Matters
These calculations ensure that:
- Rectangle sizes accurately represent population
- Colors reflect survival likelihood
- Groups of different sizes can be compared on a common 0-to-1 scale
Code Example (Python - Plotly)
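import plotly.express as px
import seaborn as sns
# plotly.express.data does not appear to ship a Titanic loader, so this version
# assumes seaborn's copy, which has the 'class', 'sex', 'embark_town', and 'survived' columns.
df = sns.load_dataset("titanic")
# px.treemap raises an error when a non-leaf path column has missing values, so drop
# rows without an embarkation town; cast the categorical 'class' column to plain strings as a precaution.
df = df.dropna(subset=["embark_town"]).astype({"class": str})
fig = px.treemap(
    df,
    path=["class", "sex", "embark_town", "survived"],  # outermost to innermost level
    color="survived",  # parent rectangles show the mean of their children, i.e. the survival rate
    color_continuous_scale="RdBu",
)
fig.show()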
CLI Output Sample
Loading dataset...
Processing hierarchy...
Generating treemap...
✔ Class grouped
✔ Gender segmented
✔ Embarkation mapped
Treemap rendered successfully!
CLI Explanation
This output is illustrative rather than printed by the code above. It simulates a pipeline in which:
- Data is loaded
- Categories are grouped
- Visualization is generated
Key Insights from Treemap
- First-class passengers had higher survival rates than second- and third-class passengers
- Women survived at a higher rate than men
- Third-class men had the lowest survival rate
- Embarkation town had only a slight influence on outcomes
These patterns become immediately visible through area and color differences.
Key Takeaways
- Treemap simplifies complex hierarchical data
- Combines size and color for dual insights
- Excellent for categorical comparison
- Improves decision-making clarity
Final Thoughts
Treemaps are a powerful tool for visual storytelling in data science. When applied to the Titanic dataset, they reveal survival patterns in a clear, hierarchical, and intuitive manner.
By combining structure, color, and scale, this approach transforms raw data into meaningful insights.