dev-resources.site

Exploratory Data Analysis on the Iris Flower Dataset

Published at
7/2/2024
Categories
hng
python
dataanalysis
datascience
Author
eskayml

Motivation

This is my submission for stage zero of the HNG 11 internship. I am currently exploring the field of data analysis in depth, and I believe this internship gives me the opportunity to learn and grow in this field.

To know more:

Observation from first glance

At first glance, the Iris flower dataset comprises 150 samples with four features each (sepal length, sepal width, petal length, and petal width), distributed across three species (Iris-setosa, Iris-versicolor, and Iris-virginica) with 50 samples per species.
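
The counts above can be checked in a few lines. The post does not show how the data was loaded, so this is only a minimal sketch that assumes the copy bundled with scikit-learn; the short column names and the `df` variable are my own choices, and scikit-learn labels the species `setosa`, `versicolor`, and `virginica` without the `Iris-` prefix.

```python
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled copy of the dataset
# (the original post may have read a CSV file instead).
iris = load_iris(as_frame=True)

# Hypothetical short column names, chosen for readability.
df = iris.data.copy()
df.columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

# Map the integer targets (0, 1, 2) to their species names.
df["species"] = iris.target_names[iris.target.to_numpy()]

print(df.shape)                      # rows and columns
print(df["species"].value_counts())  # samples per species
print(df.describe())                 # summary statistics per feature
```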

Exploratory Data Analysis

[Figure: pairplot of the four Iris features, colored by species]

The pairplot above summarizes how the four features are distributed with respect to the target variable (species).

From it, we can infer the following.

The pairplot of the Iris dataset provides a visual summary of the relationships between the four features (sepal length, sepal width, petal length, and petal width) for the three Iris species: setosa, versicolor, and virginica. Here are some detailed observations:

  1. Species Separation:

    • Iris-setosa: This species is distinctly separated from the other two species in almost all pairwise comparisons. The petal length and petal width features are particularly effective in distinguishing Iris-setosa, as the points representing this species form a distinct cluster in the lower left corner in the petal length vs. petal width plot.
    • Iris-versicolor and Iris-virginica: These two species overlap more but show some degree of separation. The petal length and petal width features again provide good separation, with Iris-versicolor generally having smaller petal measurements compared to Iris-virginica. However, there is still some overlap between these two species in the middle range of the feature values.
  2. Feature Distributions:

    • The diagonal plots show the kernel density estimates (KDE) for each feature within each species. These plots reveal that the distribution of each feature varies significantly between species. For example, Iris-setosa has a much narrower and distinct distribution for petal length and petal width compared to the other two species.
    • Sepal length and sepal width have more overlapping distributions, especially between Iris-versicolor and Iris-virginica, making them less effective for classification on their own.
  3. Inter-feature Relationships:

    • There is a noticeable positive correlation between petal length and petal width across all species, particularly within Iris-versicolor and Iris-virginica.
    • Sepal length and petal length also exhibit a positive correlation, especially for Iris-versicolor and Iris-virginica, while Iris-setosa remains distinctly separated.
    • Sepal width shows a weaker correlation with other features compared to the petal measurements.
  4. Within-Species Variability:

    • Iris-setosa shows low variability in petal measurements, which are consistently small.
    • Both Iris-versicolor and Iris-virginica exhibit more variability in their petal measurements, with Iris-virginica generally showing the largest measurements.

Correlation

[Figure: correlation matrix heatmap of the Iris features]

The correlation matrix heatmap of the Iris dataset reveals the relationships between the features. Sepal length shows a strong positive correlation with petal length (0.87) and petal width (0.82). Petal length and petal width are highly correlated (0.96), indicating that as petal length increases, petal width also tends to increase. Sepal width, on the other hand, has a weak negative correlation with sepal length (-0.12) and moderate negative correlations with petal length (-0.43) and petal width (-0.37). These values suggest that the two petal measurements are far more strongly interrelated than the two sepal measurements: sepal length and sepal width are nearly uncorrelated with each other, and sepal width is only moderately (and negatively) correlated with the petal measurements. Note that these are correlations over the whole dataset, pooled across species.
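
The correlation matrix and its heatmap could be reproduced roughly as follows. Again, this is a sketch that assumes scikit-learn's bundled copy of the data and pandas' default Pearson correlation; the column names, `features`, and `corr` are illustrative names rather than anything taken from the post.

```python
import seaborn as sns
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled copy, with hypothetical short column names.
iris = load_iris(as_frame=True)
features = iris.data.copy()
features.columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

# Pairwise Pearson correlation (the pandas default) over all 150 samples.
corr = features.corr()
print(corr.round(2))

# Annotated heatmap on a fixed -1..1 scale so the colors are comparable.
ax = sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
```

Computing `features.corr()` separately for each species (for example with `groupby`) would show whether these relationships also hold within a single species, since pooled correlations can be driven largely by the differences between species.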

Thanks so much for reading 😊, cya 👋.
