adding comments

master
Davud 2022-12-22 16:53:25 +00:00
parent 292f8729cd
commit bdc57e6dcd
1 changed files with 22 additions and 5 deletions

@@ -24,7 +24,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Constructing a model" "# Constructing a model (algorithm)"
] ]
}, },
{ {
@@ -309,10 +309,17 @@
"\n", "\n",
"This dataset became a typical test case for many statistical classification techniques in machine learning such as support vector machines\n", "This dataset became a typical test case for many statistical classification techniques in machine learning such as support vector machines\n",
"\n", "\n",
"**Reference**\n",
"\n",
" R. A. Fisher (1936). \"The use of multiple measurements in taxonomic problems\". Annals of Eugenics. 7 (2): 179188.\n",
" \n",
" https://en.wikipedia.org/wiki/Iris_flower_data_set\n",
"\n",
"**Content**\n", "**Content**\n",
"\n", "\n",
"The dataset contains a set of 150 records under 5 attributes - Petal Length, Petal Width, Sepal Length, Sepal width and Class(Species).\n", "The dataset contains a set of 150 records under 5 attributes - Petal Length, Petal Width, Sepal Length, Sepal width and Class(Species).\n",
"\n", "\n",
"---\n",
"So, our objective here is to predict the class that is the specie of the iris flower, given it's features which are:\n", "So, our objective here is to predict the class that is the specie of the iris flower, given it's features which are:\n",
"1. sepal_length\n", "1. sepal_length\n",
"2. sepal width\n", "2. sepal width\n",
@@ -562,7 +569,7 @@
"source": [ "source": [
"## Model visualization\n", "## Model visualization\n",
"\n", "\n",
"We will use the methon print.tree() to visualize our tree." "We will use the method print.tree() to visualize our tree."
] ]
}, },
{ {
@@ -598,12 +605,12 @@
"source": [ "source": [
"## Testing the model\n", "## Testing the model\n",
"\n", "\n",
"We are using the definded method predict() to determine the classes of the Test dataset - those will be stored in the Y_pred which we will than compare to Y_test with the help of sklearn library function called accuracy_score" "We are using definded method predict() to determine the classes of the Test dataset - those will be stored in the Y_pred which we will then compare to Y_test with the help of sklearn library function called accuracy_score"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 32, "execution_count": 41,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
@@ -612,7 +619,7 @@
"0.9333333333333333" "0.9333333333333333"
] ]
}, },
"execution_count": 32, "execution_count": 41,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
@@ -639,6 +646,16 @@
"Our objective here is to predict if the customer will purchase the iPhone or not given their gender, age and salary." "Our objective here is to predict if the customer will purchase the iPhone or not given their gender, age and salary."
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### About the data \n",
"\n",
"Despite all the effort I couldn't find the origin of this data thus it shouldn't be used for any other purposes. \n",
"The dataset contains a set of 400 records under 4 attributes - Gender, Age, Salary and Class( whether the person made a purchase or not)."
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},