Master Statistics And Probability with 100 free flashcards. Study using spaced repetition and focus mode for effective learning in Mathematics.
The mean is the arithmetic average, calculated by summing all values and dividing by the number of values. Formula: x̄ = Σxᵢ / n. It is sensitive to outliers and is the most commonly used measure of central tendency.
The median is the middle value when data is sorted in order. For an even number of observations, it is the average of the two middle values. It is preferred over the mean when data is skewed or contains outliers, as it is more robust.
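The difference in outlier sensitivity between the mean and the median can be seen in a minimal sketch using Python's standard `statistics` module. The data values are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

values = [2, 3, 3, 5, 7]        # hypothetical data
with_outlier = values + [40]    # same data plus one extreme value

mean_before, mean_after = mean(values), mean(with_outlier)
median_before, median_after = median(values), median(with_outlier)

# The mean jumps from 4 to 10, while the median only moves from 3 to 4.
print(mean_before, mean_after)
print(median_before, median_after)
```

With an even number of observations (the second list), `median` averages the two middle values, 3 and 5.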
The mode is the value that appears most frequently in a dataset. A dataset can be unimodal (one mode), bimodal (two modes), or multimodal (more than two modes). If no value repeats, the dataset has no mode.
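As a rough illustration of the unimodal, bimodal, and no-mode cases, the sketch below counts frequencies with `collections.Counter`. The helper name `modes` is hypothetical, and returning an empty list when no value repeats is a choice made here to match the card's convention.

```python
from collections import Counter

def modes(data):
    """Return every most-frequent value, or [] if no value repeats."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # no value repeats, so the dataset has no mode
    return [value for value, count in counts.items() if count == top]

print(modes([1, 2, 2, 3]))     # unimodal
print(modes([1, 1, 2, 2, 3]))  # bimodal
print(modes([1, 2, 3]))        # no mode
```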
Standard deviation measures the typical distance of data points from the mean. A low standard deviation indicates data points are close to the mean, while a high standard deviation indicates they are spread out. Formula: σ = √(Σ(xᵢ - μ)² / N) for a population, where μ is the population mean and N is the population size.
Population standard deviation divides the sum of squared deviations by N (total population size), while sample standard deviation divides it by n - 1 (degrees of freedom). The sample version uses n - 1 (Bessel's correction) so that the sample variance is an unbiased estimate of the population variance.
Variance is the average of the squared differences from the mean: σ² = Σ(xᵢ - μ)² / N for a population, where μ is the population mean and N is the population size. Standard deviation is the square root of variance. Variance is expressed in squared units of the original data, making standard deviation more interpretable.
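The population and sample versions of variance and standard deviation can be compared side by side. This is a minimal sketch using the standard `statistics` module, where `pvariance` and `pstdev` divide by N and `variance` and `stdev` divide by n - 1. The dataset is hypothetical.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data with mean 5

# Sum of squared deviations from the mean, computed by hand.
mu = sum(data) / len(data)
sum_sq = sum((x - mu) ** 2 for x in data)

pop_var = pvariance(data)   # sum_sq / N
pop_sd = pstdev(data)       # square root of pop_var
samp_var = variance(data)   # sum_sq / (n - 1)
samp_sd = stdev(data)       # square root of samp_var

print(sum_sq, pop_var, pop_sd)
print(samp_var, samp_sd)
assert math.isclose(pop_sd, math.sqrt(pop_var))
```

Because n - 1 is smaller than N, the sample figures always come out slightly larger than the population figures for the same data.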
The range is the difference between the maximum and minimum values: Range = Max - Min. It is the simplest measure of dispersion but is highly sensitive to outliers and does not reflect how data is distributed between the extremes.
Quartiles divide sorted data into four equal parts: Q1 is the 25th percentile, Q2 is the median (50th percentile), and Q3 is the 75th percentile. The interquartile range (IQR) is Q3 - Q1 and measures the spread of the middle 50% of data. It is robust to outliers.
A data point is considered an outlier if it falls below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. This method is used in box plots to identify unusually extreme values in a dataset.
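The quartile, IQR, and 1.5 × IQR outlier rule can be sketched together. This example assumes the "inclusive" method of `statistics.quantiles`; quartile conventions differ between textbooks and tools, so other methods may give slightly different cut points. The data is hypothetical.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 30]  # hypothetical data; 30 looks extreme

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, iqr)
print(lower_fence, upper_fence, outliers)
```

Here the fences fall at -3 and 13, so only the value 30 is flagged.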
A distribution is skewed when it is not symmetric. In a right-skewed (positive skew) distribution, the tail extends to the right and typically mean > median. In a left-skewed (negative skew) distribution, the tail extends to the left and typically mean < median.
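The usual relationship between mean and median under skew can be checked on a small hypothetical dataset. This is an illustration of the typical pattern, not a proof; the ordering can fail for some distributions.

```python
from statistics import mean, median

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]   # long tail to the right
left_skewed = [-x for x in right_skewed]   # mirror image, tail to the left

# The extreme value pulls the mean toward the tail; the median stays put.
print(mean(right_skewed), median(right_skewed))
print(mean(left_skewed), median(left_skewed))
```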
For any two events A and B: P(A ∪ B) = P(A) + P(B) - P(A ∩ B). The subtraction of the intersection prevents double-counting. For mutually exclusive events, P(A ∩ B) = 0, so it simplifies to P(A ∪ B) = P(A) + P(B).
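The addition rule can be verified by direct counting on a single roll of a fair six-sided die. The events below are hypothetical choices for illustration, and the helper `prob` assumes all outcomes are equally likely.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # the roll is even
B = {4, 5, 6}   # the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

lhs = prob(A | B)                       # P(A ∪ B) by direct counting
rhs = prob(A) + prob(B) - prob(A & B)   # addition rule

print(lhs, rhs)
```

Outcomes 4 and 6 belong to both events, so adding P(A) and P(B) alone would count them twice.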
For two events A and B: P(A ∩ B) = P(A) × P(B|A). If A and B are independent, then P(B|A) = P(B), so this simplifies to P(A ∩ B) = P(A) × P(B). This rule calculates the probability that both events occur.
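The multiplication rule can be worked through with a standard 52-card deck. In this sketch, A is "the first card is an ace" and B is "the second card is an ace"; the scenario is a hypothetical example, not taken from the deck above.

```python
from fractions import Fraction

# Dependent case: draw two cards without replacement.
p_a = Fraction(4, 52)          # P(A): 4 aces among 52 cards
p_b_given_a = Fraction(3, 51)  # P(B|A): 3 aces left among 51 cards
p_both_dependent = p_a * p_b_given_a

# Independent case: replace the first card and reshuffle before drawing again.
p_b = Fraction(4, 52)          # P(B) is unaffected by the first draw
p_both_independent = p_a * p_b

print(p_both_dependent)
print(p_both_independent)
```

Using `Fraction` keeps the results exact, which makes the two cases easy to compare.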