data2 / pages /Descriptive Statistics.py

shwetashweta05

Update pages/Descriptive Statistics.py

033b46b verified 3 months ago


	import streamlit as st
	import numpy as np
	import pandas as pd

	st.subheader(":blue[1. Descriptive Statistics]")
	st.write("Descriptive statistics involves describing, summarizing, and organizing data so that it can be easily understood. Example: in a class of 20 maths students, the first-semester marks are (84, 86, 78, 72, 75, 65, 80, 81, 45, 87, 67, 54). What is the average mark of the students in the class?")

	st.subheader(":blue[Types of Descriptive statistics]")
	st.write("There are three types of descriptive statistics: 1) Distribution, 2) Measures of central tendency, 3) Measures of dispersion.")

	st.subheader(":blue[Measures of Central Tendency]")
	st.write("Measures of central tendency estimate the center, or average, of a data set. The mean, median and mode are 3 ways of finding the average.")

	st.subheader(":blue[Mean]")
	st.write("The mean, or M, is the most commonly used method for finding the average. To find the mean, simply add up all response values and divide the sum by the total number of responses. The total number of responses or observations is called N.")
	# st.latex renders LaTeX math, so the words are wrapped in \text{} to keep their spacing.
	formula = r"\text{mean} = \frac{\text{sum of the observations}}{\text{total number of observations}}"
	st.latex(formula)
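# A minimal sketch of the mean formula above, applied to the marks quoted in the introduction.
# The variable names are illustrative and are not part of the original app.

```python
# Marks quoted in the introduction of this page.
marks = [84, 86, 78, 72, 75, 65, 80, 81, 45, 87, 67, 54]

total = sum(marks)   # sum of the observations
n = len(marks)       # total number of observations (N)
mean = total / n

print(f"sum = {total}, N = {n}, mean = {mean:.2f}")
# prints: sum = 874, N = 12, mean = 72.83
```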

	st.subheader(":blue[There are three types of Mean]")
	st.write("Arithmetic Mean - It is the sum of the sampled values divided by the number of items in the sample.")
	st.image("https://www.freemathhelp.com/images/lessons/mean1.gif",caption = "Arithmetic Mean")

	st.write("Geometric Mean - It is an average that is useful for sets of positive numbers that are interpreted according to their product.")
	st.image("https://cdn1.byjus.com/wp-content/uploads/2019/02/Geometric-Mean-Formula.png", caption="Geometric Mean")

	st.write("Harmonic Mean - The harmonic mean is a numerical average calculated by dividing the number of observations by the sum of the reciprocals of the observations.")
	st.image("https://search-static.byjusweb.com/question-images/aakash_pdf/99996178852-0-0",caption="Harmonic Mean")
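# The three means described above can be compared on one small data set.
# This is a sketch that assumes Python's standard-library statistics module (3.8 or later);
# the original app does not use it, and the values are illustrative.

```python
import statistics

values = [2, 4, 8]  # illustrative positive numbers, not from the original page

arithmetic = statistics.mean(values)            # (2 + 4 + 8) / 3
geometric = statistics.geometric_mean(values)   # cube root of 2 * 4 * 8
harmonic = statistics.harmonic_mean(values)     # 3 / (1/2 + 1/4 + 1/8)

# The same harmonic mean written out from its definition:
# number of observations divided by the sum of the reciprocals.
harmonic_by_hand = len(values) / sum(1 / v for v in values)

print(arithmetic, geometric, harmonic)
```

# For positive data the harmonic mean is never larger than the geometric mean,
# which in turn is never larger than the arithmetic mean.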

	st.subheader(":blue[Median]")
	st.write("The median is the value that’s exactly in the middle of a data set. To find the median, order the response values from smallest to largest. The median is then the number in the middle. If there are two numbers in the middle, find their mean.")
	st.image("https://media.geeksforgeeks.org/wp-content/uploads/20230717083132/median-even-number.png",caption="Median")
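# The median procedure above (sort, then take the middle value or the mean of the two
# middle values) can be sketched as follows. The function name is hypothetical.

```python
def median(values):
    """Return the middle value of a non-empty list of numbers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


marks = [84, 86, 78, 72, 75, 65, 80, 81, 45, 87, 67, 54]  # marks from the introduction
print(median([5, 1, 3]))  # 3
print(median(marks))      # 76.5 (mean of 75 and 78)
```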

	st.subheader(":blue[Mode]")
	st.write("The mode is simply the most popular or most frequent response value. A data set can have no mode, one mode, or more than one mode. To find the mode, order your data set from lowest to highest and find the response that occurs most frequently.")
	st.image("https://qph.cf2.quoracdn.net/main-qimg-c087bd723603bb17784177d4fee2f050-lq",caption="Mode")
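# Finding the most frequent response amounts to counting how often each value occurs.
# A minimal sketch using collections.Counter from the standard library; the responses
# are hypothetical.

```python
from collections import Counter

responses = [3, 5, 3, 2, 5, 3, 4]  # hypothetical response values

counts = Counter(responses)  # maps each value to its frequency
value, frequency = counts.most_common(1)[0]

print(f"mode = {value}, occurs {frequency} times")
# prints: mode = 3, occurs 3 times
```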

	st.subheader(":blue[Types of Mode]")
	st.write("The different types of Mode are Unimodal, Bimodal, Trimodal, and Multimodal.")

	st.subheader(":blue[Unimodal]")
	st.write("Unimodal - A data set with exactly one mode is called unimodal.")
	st.write("Example, the mode of data set A = {14, 15, 16, 17, 15, 18, 15, 19} is 15, because 15 is the only value that repeats.")

	st.subheader(":blue[Bimodal]")
	st.write("Bimodal - A data set with two modes is called bimodal. This means that two data values share the highest frequency.")
	st.write("Example, the modes of data set A = {8, 13, 13, 14, 15, 17, 17, 19} are 13 and 17, because 13 and 17 each occur twice in the given set.")

	st.subheader(":blue[Trimodal]")
	st.write("Trimodal - A data set with three modes is called trimodal. This means that three data values share the highest frequency.")
	st.write("Example, the modes of data set A = {100, 80, 80, 95, 95, 100, 90} are 80, 95, and 100, because each of these three values occurs twice while 90 occurs only once.")

	st.subheader(":blue[Multimodal]")
	st.write("Multimodal - A data set with four or more modes is called multimodal.")
	st.write("Example, the modes of data set A = {100, 80, 80, 95, 95, 100, 90, 90} are 80, 90, 95, and 100, because all four values occur twice in the given set.")
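# The classification above depends only on how many values share the highest frequency.
# The sketch below counts them; the function name and the "no mode" label for data in
# which no value repeats are assumptions made for illustration.

```python
from collections import Counter


def modality(values):
    """Classify a non-empty data set by its number of modes."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Every value occurs once, so no value stands out as a mode.
        return "no mode", []
    modes = sorted(v for v, c in counts.items() if c == top)
    labels = {1: "unimodal", 2: "bimodal", 3: "trimodal"}
    # Four or more modes fall through to "multimodal".
    return labels.get(len(modes), "multimodal"), modes


print(modality([14, 15, 16, 17, 15, 18, 15, 19]))    # ('unimodal', [15])
print(modality([8, 13, 13, 14, 15, 17, 17, 19]))     # ('bimodal', [13, 17])
print(modality([100, 80, 80, 95, 95, 100, 90, 90]))  # ('multimodal', [80, 90, 95, 100])
```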

	st.subheader(":blue[Measures Of Dispersion]")
	st.write("Measures of variability give you a sense of how spread out the response values are. The range, standard deviation and variance each reflect different aspects of spread.")
	st.image("https://miro.medium.com/v2/resize:fit:640/format:webp/1*iVE5it6kMD87fw4n6g5xrg.png")

	st.subheader(":blue[There are two types of dispersion]")
	st.write("1. Absolute Measure of dispersion")
	st.write("2. Relative Measure of dispersion")


	st.write("Absolute - The measures of dispersion that are expressed in the same units as the data themselves are called absolute measures of dispersion. For example – meters, dollars, kg, etc.")

	st.subheader(":blue[Absolute measures of dispersion:]")
	st.write("Range - It is defined as the difference between the largest and the smallest value in the distribution.")
	st.image("https://media.geeksforgeeks.org/wp-content/uploads/20230922115423/Range.png",caption="Range")
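# As a quick illustration of the range, here it is computed for the marks quoted in the
# introduction. The variable names are illustrative.

```python
marks = [84, 86, 78, 72, 75, 65, 80, 81, 45, 87, 67, 54]  # marks from the introduction

largest = max(marks)
smallest = min(marks)
value_range = largest - smallest

print(f"range = {largest} - {smallest} = {value_range}")
# prints: range = 87 - 45 = 42
```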

	st.subheader(":blue[Mean Deviation]")
	st.write("Mean Deviation - It is the arithmetic mean of the absolute differences between the values and their mean.")
	st.image("https://d1whtlypfis84e.cloudfront.net/guides/wp-content/uploads/2020/03/13143143/MAD-about-Mean-for-Individual-Data-Series.png",caption="Mean Deviation")
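# A minimal sketch of the mean deviation about the mean. The absolute value matters:
# without it, the deviations from the mean always sum to zero. The data set is illustrative.

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the original page

mean = sum(data) / len(data)                      # 5.0
absolute_deviations = [abs(x - mean) for x in data]
mean_deviation = sum(absolute_deviations) / len(data)

print(mean_deviation)  # 1.5
```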

	st.subheader(":blue[Standard Deviation]")
	st.write("Standard Deviation - It is the square root of the arithmetic average of the square of the deviations measured from the mean.")
	st.image("https://media.geeksforgeeks.org/wp-content/uploads/20230605181802/Standard-Deviation-Formula.png",caption="Standard Deviation")

	st.subheader(":blue[Variance]")
	st.write("Variance - It is defined as the average of the squared deviations from the mean of the given data set.")
	st.image("https://media.geeksforgeeks.org/wp-content/uploads/20230605183401/Variance-formula.png",caption="Variance")
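# Variance and standard deviation, as defined above, can be computed step by step.
# This sketch assumes the population form (dividing by N), which matches the wording
# "average of the squared deviations"; the sample form divides by N - 1 instead.
# The data set is illustrative.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the original page

mean = sum(data) / len(data)
squared_deviations = [(x - mean) ** 2 for x in data]

variance = sum(squared_deviations) / len(data)  # average squared deviation
std_dev = math.sqrt(variance)                   # square root of the variance

print(variance, std_dev)  # 4.0 2.0

# The standard library offers the same population measures, plus the sample versions.
print(statistics.pvariance(data), statistics.pstdev(data))
print(statistics.variance(data), statistics.stdev(data))  # divide by N - 1
```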

	st.subheader(":blue[Quartile Deviation]")
	st.write("Quartile Deviation - It is defined as half of the difference between the third quartile and the first quartile in a given data set.")
	st.image("https://miro.medium.com/v2/resize:fit:1024/1*1br-28_d07Ur3cXTyKEJjw.jpeg",caption="Quartile Deviation")

	st.subheader(":blue[Interquartile Range]")
	st.write("Interquartile Range - The difference between the upper quartile (Q3) and the lower quartile (Q1) is called the interquartile range. Its formula is Q3 – Q1.")
	st.image("https://statsmethods.wordpress.com/wp-content/uploads/2013/05/capture.png",caption="Interquartile Range")
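# The interquartile range and the quartile deviation both follow from Q1 and Q3.
# This sketch assumes statistics.quantiles from the standard library (Python 3.8 or later)
# with its default "exclusive" method. Quartile conventions differ between textbooks and
# tools, so other methods can give slightly different Q1 and Q3. The data set is illustrative.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the original page

# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median) and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1                 # interquartile range
quartile_deviation = iqr / 2  # half of the interquartile range

print(q1, q3, iqr, quartile_deviation)  # 4.0 6.5 2.5 1.25
```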

	st.write("Relative - We use relative measures of dispersion to compare the scattering of two or more data sets that are measured in different units. Because they are ratios, they carry no units.")

	st.subheader(":blue[Relative measures of dispersion:]")
	st.write("Coefficient of Range - It is defined as the ratio of the difference between the highest and lowest value in a data set to the sum of the highest and lowest value.")
	st.image("https://bbamantra.com/wp-content/uploads/2016/09/range-coefficient-of-range.jpg",caption="Coefficient of Range")
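# A short sketch of the coefficient of range, using the marks quoted in the introduction.
# The variable names are illustrative.

```python
marks = [84, 86, 78, 72, 75, 65, 80, 81, 45, 87, 67, 54]  # marks from the introduction

highest = max(marks)  # 87
lowest = min(marks)   # 45

# Ratio of the range to the sum of the extremes; the units cancel out.
coefficient_of_range = (highest - lowest) / (highest + lowest)

print(round(coefficient_of_range, 3))  # 0.318
```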

	st.subheader(":blue[Coefficient of Variation]")
	st.write("Coefficient of Variation - It is defined as the ratio of the standard deviation to the mean of the data set. We use percentages to express the coefficient of variation.")
	st.image("https://study.com/cimages/multimages/16/dcee854a-311f-4249-8800-d6ea1a117b398244206355149698335.png",caption="Coefficient of Variation")
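# The coefficient of variation can be sketched as below. The population standard deviation
# (statistics.pstdev) is an assumption here; some texts use the sample standard deviation.
# The ratio is only meaningful when the mean is not zero. The data set is illustrative.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the original page

mean = statistics.mean(data)       # 5
std_dev = statistics.pstdev(data)  # 2.0 (population standard deviation)

cv_percent = std_dev / mean * 100  # expressed as a percentage

print(f"CV = {cv_percent:.1f}%")
# prints: CV = 40.0%
```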

	st.subheader(":blue[Coefficient of Mean Deviation]")
	st.write("Coefficient of Mean Deviation - It is defined as the ratio of the mean deviation to the central value of the data set from which the deviations are measured.")
	st.image("https://www.zigya.com/application/zrc/images/qvar/STEN11019437-1.png",caption="Coefficient of Mean Deviation")
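# A minimal sketch of the coefficient of mean deviation. It assumes the mean is used as
# the central value; the same ratio can be formed about the median instead.
# The data set is illustrative.

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the original page

mean = sum(data) / len(data)                                # 5.0
mean_deviation = sum(abs(x - mean) for x in data) / len(data)  # 1.5

# Mean deviation divided by the central value it was measured from.
coefficient_of_mean_deviation = mean_deviation / mean

print(coefficient_of_mean_deviation)  # 0.3
```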

	st.subheader(":blue[Coefficient of Quartile Deviation]")
	st.write("Coefficient of Quartile Deviation - It is defined as the ratio of the difference between the third quartile and the first quartile to the sum of the third and first quartiles.")
	st.image("https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWzFSpCpnwz2B11b8ec6BM9YNesfkAKsXuGA3L-Rwl4Vt31dTBCZ99fL9DDsKbvHPWh4wQ7H0caBQLxOjYSA3wWoskm8ROF9JFh4DG7vuUe5kvtxn6Lv4f4B2_phDjoUvi-VSlIQz9TC8r/s16000/coefficient+of+quartile+deviation.png",caption="Coefficient of Quartile Deviation")
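# The coefficient of quartile deviation can be sketched from Q1 and Q3. As an assumption,
# the quartiles come from statistics.quantiles with its default "exclusive" method
# (Python 3.8 or later); other quartile conventions may give slightly different values.
# The data set is illustrative.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the original page

q1, _, q3 = statistics.quantiles(data, n=4)  # Q1 = 4.0, Q3 = 6.5

# Difference of the quartiles divided by their sum.
coefficient_of_qd = (q3 - q1) / (q3 + q1)

print(round(coefficient_of_qd, 3))  # 0.238
```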

	st.subheader(":blue[Distribution Measures]")
	st.write("The distribution concerns the frequency of each value.")

	st.subheader(":blue[Frequency Distribution]")
	st.write("Frequency Distribution - A data set is made up of a distribution of values, or scores. In tables or graphs, you can summarize the frequency of every possible value of a variable in numbers or percentages. This is called a frequency distribution.")
	st.image("https://thirdspacelearning.com/wp-content/uploads/2023/08/Frequency-Distribution-us-featured-image.png")
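# A frequency distribution like the one described above can be tabulated by counting each
# value and converting the counts to percentages. A minimal sketch with hypothetical scores,
# using collections.Counter from the standard library.

```python
from collections import Counter

scores = [3, 1, 2, 3, 3, 2, 4, 3, 2, 4]  # hypothetical scores

counts = Counter(scores)
n = len(scores)

# One row per distinct value: (value, frequency, percentage of all scores).
table = [(value, counts[value], 100 * counts[value] / n) for value in sorted(counts)]

for value, frequency, percent in table:
    print(f"{value}: {frequency} ({percent:.0f}%)")
# prints:
# 1: 1 (10%)
# 2: 3 (30%)
# 3: 4 (40%)
# 4: 2 (20%)
```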
