These statistics notes are written in a simple, handwritten style to help students grasp concepts clearly and quickly. They are aimed at students preparing for competitive exams such as JEE Main, NIMCET, CUET MCA, and MAH MCA CET. Each topic is covered with short explanations, formulas, properties, and solved examples, which makes the notes suitable both for building conceptual clarity and for quick revision.
1. Meaning of Statistics
Statistics deals with collecting, organizing, presenting, analyzing and interpreting numerical data.
2. Types of Data
Primary Data: First-hand information collected by the investigator for the purpose at hand.
Secondary Data: Data already collected (and usually published) by someone else.
3. Measures of Central Tendency
(A) Arithmetic Mean
Ungrouped data:
$$\bar{x} = \frac{\sum x_i}{n}$$
Discrete frequency:
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$
Continuous frequency:
$$\bar{x} = \frac{\sum f_i m_i}{\sum f_i}$$
where $m_i$ is the midpoint (class mark) of the $i$-th class interval.
Properties of Mean
1. Unique value.
2. $\sum (x_i - \bar{x}) = 0$.
3. Highly affected by extreme values.
4. Combined mean:
$$ \bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1+n_2} $$
5. Adding a constant $k$ to every observation → mean becomes $\bar{x}+k$.
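The mean formulas and properties 2, 4 and 5 above can be checked numerically. The following is a minimal Python sketch; the function names and the sample numbers are made up for illustration and are not part of the original notes.

```python
def mean_ungrouped(values):
    """Arithmetic mean of raw observations: sum(x_i) / n."""
    return sum(values) / len(values)


def mean_frequency(values, freqs):
    """Mean of a frequency table: sum(f_i * x_i) / sum(f_i).

    For a continuous table, pass the class midpoints m_i as `values`.
    """
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)


def combined_mean(n1, mean1, n2, mean2):
    """Mean of two groups pooled together (property 4)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Illustrative data (hypothetical values)
data = [2, 4, 6, 8, 10]
xbar = mean_ungrouped(data)

# Property 2: deviations from the mean sum to zero
deviation_sum = sum(x - xbar for x in data)

# Property 5: adding k = 3 to every value shifts the mean by 3
shifted_mean = mean_ungrouped([x + 3 for x in data])

print(xbar, deviation_sum, shifted_mean)
print(mean_frequency([1, 2, 3], [2, 3, 5]))
print(combined_mean(10, 50, 20, 65))
```

Here the mean of the raw data is 6, the frequency-table mean is 2.3, and pooling a group of 10 with mean 50 and a group of 20 with mean 65 gives a combined mean of 60.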
(B) Median
Median = middle value after arranging the data in ascending (or descending) order.
Ungrouped:
If $n$ odd: $\text{Median} = x_{\frac{n+1}{2}}$
If $n$ even: $\text{Median} = \frac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2}$
Continuous:
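$$
\text{Median} = l + \left( \frac{\frac{N}{2} - c_f}{f} \right) h
$$
where $l$ = lower boundary of the median class, $N = \sum f_i$, $c_f$ = cumulative frequency of the class preceding the median class, $f$ = frequency of the median class, and $h$ = class width.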
Properties of Median
1. Not affected by extreme values.
2. Divides data into two equal halves.
3. Suitable for open-end classes.
4. Positional measure.
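The grouped-data median formula can be sketched in Python as follows. This is an illustrative helper, assuming equal class widths and classes listed in ascending order; the function name and the frequency table are hypothetical.

```python
def grouped_median(lower_limits, freqs, width):
    """Median of a continuous frequency distribution.

    Applies l + ((N/2 - cf) / f) * h, where the median class is the
    first class whose cumulative frequency reaches N/2.
    """
    total = sum(freqs)
    if total == 0:
        raise ValueError("frequency table is empty")
    half = total / 2
    cum_before = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= half:
            return lower + (half - cum_before) / f * width
        cum_before += f
    raise ValueError("median class not found")


# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40, 40-50
lowers = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]
median = grouped_median(lowers, freqs, 10)
print(median)
```

With N = 40 the median class is 20-30 (cumulative frequency before it is 13, its own frequency is 12), so the median is 20 + (7/12)·10, about 25.83.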
(C) Mode
Mode = most frequent value.
Continuous data:
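$$
\text{Mode} = l + \frac{(f_1 - f_0)}{2f_1 - f_0 - f_2} h
$$
where $l$ = lower boundary of the modal class, $f_1$ = frequency of the modal class, $f_0$ and $f_2$ = frequencies of the classes just before and just after it, and $h$ = class width.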
Properties of Mode
1. Not affected by outliers.
2. Applicable to qualitative data.
3. Empirical relation (approximate, for moderately skewed distributions):
$$ \text{Mode} = 3\text{Median} - 2\text{Mean} $$
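The grouped-data mode formula can be sketched the same way. The helper below is illustrative only: it assumes equal class widths, takes the first class with the highest frequency as the modal class, and treats a missing neighbouring class as having frequency 0. The names and data are hypothetical.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode of a continuous frequency distribution.

    Applies l + (f1 - f0) / (2*f1 - f0 - f2) * h to the modal class.
    """
    i = max(range(len(freqs)), key=freqs.__getitem__)  # modal class index
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # class before (0 if none)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class after (0 if none)
    denom = 2 * f1 - f0 - f2
    if denom == 0:
        raise ValueError("formula undefined: 2*f1 - f0 - f2 is zero")
    return lower_limits[i] + (f1 - f0) / denom * width


# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40, 40-50
lowers = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]
mode = grouped_mode(lowers, freqs, 10)
print(mode)
```

Here the modal class is 20-30 with f1 = 12, f0 = 8 and f2 = 10, so the mode is 20 + (4/6)·10, about 26.67.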
4. Measures of Dispersion
(A) Range
$$\text{Range} = \text{Maximum} - \text{Minimum}$$
(B) Mean Deviation
$$ MD = \frac{\sum |x_i - \bar{x}|}{n} $$
(C) Variance & Standard Deviation
Variance:
$$ \sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n} $$
Standard deviation:
$$ \sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}} $$
Shortcut:
$$ \sigma = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2} $$
Properties
1. $\sigma \ge 0$.
2. SD = 0 when all values equal.
3. Adding constant $k$ → SD unchanged.
4. Multiplying every value by $k$ → SD becomes $|k|\sigma$ (variance becomes $k^2\sigma^2$).
5. SD is affected by extreme values.
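A short Python sketch can confirm that the definition and the shortcut formula agree, and illustrate the effect of shifting and scaling the data. The function names and the sample data are assumptions made for this example.

```python
import math


def variance(values):
    """Population variance: sum((x_i - mean)^2) / n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def std_dev(values):
    """Standard deviation from the definition."""
    return math.sqrt(variance(values))


def std_dev_shortcut(values):
    """Standard deviation from the shortcut: sqrt(sum(x^2)/n - mean^2)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)


# Hypothetical data with mean 5 and variance 4
data = [2, 4, 4, 4, 5, 5, 7, 9]
sd = std_dev(data)
sd_short = std_dev_shortcut(data)

# Adding a constant leaves the SD unchanged
sd_shifted = std_dev([x + 10 for x in data])

# Multiplying by k = -3 scales the SD by |k| = 3
sd_scaled = std_dev([-3 * x for x in data])

print(sd, sd_short, sd_shifted, sd_scaled)
```

The example with k = -3 shows why the scaling factor is the absolute value of k: the SD can never be negative.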
5. Coefficient of Variation (C.V.)
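Used to compare the relative variability of two or more data sets, expressed as a percentage (a lower C.V. means more consistent data):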
$$ CV = \frac{\sigma}{\bar{x}} \times 100 $$
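As a quick illustration, the sketch below compares two data sets by their C.V. The summary figures for groups A and B are invented for the example, and the function name is hypothetical.

```python
def coefficient_of_variation(sd, mean):
    """C.V. as a percentage: (sd / mean) * 100."""
    if mean == 0:
        raise ValueError("C.V. is undefined when the mean is zero")
    return sd / mean * 100


# Hypothetical summaries: group A (mean 50, SD 5), group B (mean 20, SD 4)
cv_a = coefficient_of_variation(5, 50)
cv_b = coefficient_of_variation(4, 20)

# The group with the smaller C.V. is the more consistent one
more_consistent = "A" if cv_a < cv_b else "B"
print(cv_a, cv_b, more_consistent)
```

Group A has the larger SD in absolute terms but the smaller C.V. (10% against 20%), so it is the more consistent group relative to its mean.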
6. Skewness
Skewness measures asymmetry of data.
Karl Pearson coefficient:
$$ \text{Skewness} = \frac{\bar{x} - \text{Mode}}{\sigma} $$
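When the mode is ill-defined, the empirical relation Mode = 3 Median − 2 Mean (see Properties of Mode) can be substituted into Karl Pearson's coefficient. The result holds only to the extent that the empirical relation does, that is, roughly for moderately skewed data:
$$ \text{Skewness} = \frac{\bar{x} - (3\,\text{Median} - 2\bar{x})}{\sigma} = \frac{3(\bar{x} - \text{Median})}{\sigma} $$
A positive value indicates a longer right tail, a negative value a longer left tail, and zero a symmetric distribution.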
7. Correlation
(A) Karl Pearson Correlation Coefficient
$$ r = \frac{\sum (x-\bar{x})(y-\bar{y})} {\sqrt{\sum (x-\bar{x})^2 \sum (y-\bar{y})^2}} $$
Shortcut:
$$
r = \frac{
n\sum xy - (\sum x)(\sum y)
}{
\sqrt{(n\sum x^2 - (\sum x)^2)(n\sum y^2 - (\sum y)^2)}
}
$$
Properties of Correlation
1. $-1 \le r \le +1$.
2. $r=0$ means no linear relation.
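3. Independent variables → $r=0$ (the converse need not hold).
4. Not affected by change of origin or scale (only the sign of $r$ flips if the two scale factors have opposite signs).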
5. If $r=\pm 1$, points lie on a straight line.
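The two formulas for $r$ and the origin/scale property can be checked with a small Python sketch. The function names and the paired data are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pearson_r_shortcut(xs, ys):
    """Karl Pearson's r from raw sums (shortcut formula)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )


# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_short = pearson_r_shortcut(xs, ys)

# Change of origin and (positive) scale leaves r unchanged
r_transformed = pearson_r([2 * x + 3 for x in xs], [5 * y - 1 for y in ys])

# A negative scale factor on one variable flips the sign of r
r_flipped = pearson_r([-x for x in xs], ys)

print(r, r_short, r_transformed, r_flipped)
```

For this data both formulas give r = 6/√60, about 0.775.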
8. Regression
Regression line of $y$ on $x$:
$$ y - \bar{y} = b_{yx}(x - \bar{x}) $$
Regression coefficient:
$$ b_{yx} = r\frac{\sigma_y}{\sigma_x} $$
Properties
1. Regression lines intersect at $(\bar{x},\bar{y})$.
2. Both regression coefficients have same sign as $r$.
3. Product:
$$ b_{yx} \cdot b_{xy} = r^2 $$
4. If $r=0$ → $b_{yx}=b_{xy}=0$, so the regression lines are perpendicular (each is parallel to a coordinate axis).
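The regression coefficient and properties 1 to 3 can be illustrated with a short Python sketch. It assumes the symmetric counterpart $b_{xy} = r\,\sigma_x/\sigma_y$ for the line of $x$ on $y$; the function names and data are hypothetical.

```python
import math


def regression_summary(xs, ys):
    """Return (mean_x, mean_y, r, b_yx, b_xy) for paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    r = cov / (sd_x * sd_y)
    b_yx = r * sd_y / sd_x  # slope of the line of y on x
    b_xy = r * sd_x / sd_y  # slope of the line of x on y (assumed counterpart)
    return mx, my, r, b_yx, b_xy


def predict_y(x, mx, my, b_yx):
    """Line of y on x: y - mean_y = b_yx * (x - mean_x)."""
    return my + b_yx * (x - mx)


# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

mx, my, r, b_yx, b_xy = regression_summary(xs, ys)
print(b_yx, b_xy, b_yx * b_xy, r ** 2)
print(predict_y(mx, mx, my, b_yx))  # the line passes through (mean_x, mean_y)
print(predict_y(6, mx, my, b_yx))
```

For this data $b_{yx}$ is 0.6 and $b_{xy}$ is 1, both positive like $r$, and their product equals $r^2$ (0.6).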
9. Moments
Moments measure the shape characteristics of a distribution such as skewness and kurtosis.
(A) Raw Moments (about origin)
Raw moment of order $r$:
$$ \mu_r^\prime = \frac{1}{n}\sum x_i^r $$
Examples:
1. First raw moment: $ \mu_1^\prime = \bar{x} $
2. Second raw moment: $ \mu_2^\prime = \frac{1}{n}\sum x_i^2 $
(B) Central Moments (about mean)
Central moment of order $r$:
$$ \mu_r = \frac{1}{n}\sum (x_i - \bar{x})^r $$
Important Central Moments
1. First central moment:
$$\mu_1 = 0$$
2. Second central moment = Variance:
$$\mu_2 = \sigma^2$$
3. Third central moment → Skewness.
4. Fourth central moment → Kurtosis.
(C) Relation Between Raw & Central Moments
For second moment:
$$\mu_2 = \mu_2^\prime - (\mu_1^\prime)^2$$
General relation:
$$
\mu_r=\sum_{k=0}^{r} {r \choose k}(-1)^{r-k}(\mu_1^\prime)^{r-k}\mu_k^\prime
$$
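The general relation can be verified numerically by computing central moments both directly and from raw moments. The sketch below is illustrative; the function names and data are made up, and $\mu_0^\prime = 1$ is used for the $k = 0$ term.

```python
from math import comb


def raw_moment(values, r):
    """Raw moment of order r about the origin: (1/n) * sum(x_i^r)."""
    return sum(x ** r for x in values) / len(values)


def central_moment(values, r):
    """Central moment of order r about the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** r for x in values) / len(values)


def central_from_raw(values, r):
    """Central moment of order r via the binomial relation with raw moments."""
    m1 = raw_moment(values, 1)
    return sum(
        comb(r, k) * (-1) ** (r - k) * m1 ** (r - k) * raw_moment(values, k)
        for k in range(r + 1)
    )


# Hypothetical data with mean 5 and variance 4
data = [2, 4, 4, 4, 5, 5, 7, 9]
for order in range(1, 5):
    print(order, central_moment(data, order), central_from_raw(data, order))
```

Both routes give the same values up to floating-point rounding; for order 2 the relation reduces to $\mu_2 = \mu_2^\prime - (\mu_1^\prime)^2$.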
(D) Skewness Using Moments
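Pearson's $\beta_1$ coefficient (always non-negative, so it does not show the direction of skewness):
$$ \beta_1 = \frac{\mu_3^2}{\mu_2^3} $$
Moment coefficient of skewness (carries the sign of $\mu_3$):
$$ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \pm\sqrt{\beta_1} $$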
(E) Kurtosis Using Moments
$$ \beta_2 = \frac{\mu_4}{\mu_2^2} $$
Interpretation:
$\beta_2 = 3$ → Mesokurtic (Normal)
$\beta_2 > 3$ → Leptokurtic (Peaked)
$\beta_2 < 3$ → Platykurtic (Flat)
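The moment-based coefficients can be computed directly from central moments. The following Python sketch is illustrative only: the function names, the tolerance used to test $\beta_2 = 3$, and the sample data are assumptions.

```python
def central_moment(values, r):
    """Central moment of order r about the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** r for x in values) / len(values)


def shape_coefficients(values):
    """Return (beta1, gamma1, beta2) computed from central moments."""
    mu2 = central_moment(values, 2)
    mu3 = central_moment(values, 3)
    mu4 = central_moment(values, 4)
    beta1 = mu3 ** 2 / mu2 ** 3
    gamma1 = mu3 / mu2 ** 1.5
    beta2 = mu4 / mu2 ** 2
    return beta1, gamma1, beta2


def kurtosis_label(beta2, tol=1e-9):
    """Classify a distribution by comparing beta2 with 3."""
    if abs(beta2 - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if beta2 > 3 else "platykurtic"


# Hypothetical data: one symmetric set, one with a long right tail
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

b1_sym, g1_sym, b2_sym = shape_coefficients(symmetric)
b1_skw, g1_skw, b2_skw = shape_coefficients(right_skewed)

print(g1_sym, b2_sym, kurtosis_label(b2_sym))
print(g1_skw, b2_skw, kurtosis_label(b2_skw))
```

The symmetric set has $\mu_3 = 0$, so $\gamma_1 = 0$, and $\beta_2 = 6.8/4 = 1.7$, which is platykurtic; the second set has $\gamma_1 > 0$, reflecting its long right tail.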
Properties of Moments
1. First central moment is always 0.
2. Second central moment = variance.
3. Third central moment measures skewness.
4. Fourth central moment measures kurtosis.
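5. The first four moments together summarise the location, spread, skewness and kurtosis of a distribution.
6. Central moments are unaffected by change of origin; under a change of scale $u = \frac{x-a}{h}$, $\mu_r(x) = h^r \mu_r(u)$, so $\beta_1$ and $\beta_2$ remain unchanged.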
10. Probability Distribution (Basic)
Expected value:
$$ E(X)=\sum x_i P(x_i) $$
Variance:
$$ Var(X)=E(X^2)-[E(X)]^2 $$
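As an illustration of both formulas, the sketch below computes the expected value and variance for a fair six-sided die, using exact fractions to avoid rounding. The function names and the choice of example are assumptions, not part of the original notes.

```python
from fractions import Fraction


def expected_value(dist):
    """E(X) = sum of x * P(x) over a {value: probability} mapping."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    ex = expected_value(dist)
    ex2 = sum(x * x * p for x, p in dist.items())
    return ex2 - ex ** 2


# Fair die: each face 1..6 has probability 1/6
die = {face: Fraction(1, 6) for face in range(1, 7)}

assert sum(die.values()) == 1  # probabilities must sum to 1

mean_die = expected_value(die)
var_die = variance(die)
print(mean_die, var_die)
```

For the fair die, $E(X) = 7/2$ and $E(X^2) = 91/6$, so $Var(X) = 91/6 - 49/4 = 35/12$.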