🔍 Chapter 6 – Descriptive Statistics: Concept Review

Box Plot: A graphical summary of data showing the minimum, first quartile (Q1), median, third quartile (Q3), and maximum. Points farther than 1.5 × IQR beyond the quartiles are commonly plotted individually as possible outliers.
Degrees of Freedom: The number of values in a calculation that are free to vary. For sample variance, it is n - 1.
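Digidot Plot: A combined display that pairs a stem-and-leaf diagram with a time series plot of the same observations, so the distribution and the time order of the data can be read together.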
Frequency Distribution & Histogram: A table and corresponding bar chart showing the number (or relative frequency) of data points in specified intervals.
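Histogram: A bar graph representing the frequency distribution of continuous data, with one bar per class interval. Useful for revealing the shape of the data (e.g., symmetric, skewed).
Interquartile Range (IQR): The spread of the middle 50% of data, calculated as Q3 - Q1. It is a measure of variability that is less sensitive to extreme values than the range, and it is used to flag outliers.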
Matrix of Scatter Plots: A grid showing pairwise scatter plots among multiple variables to reveal possible correlations or trends.
Multivariate Data: Observations that include two or more variables. Often visualized using scatter plot matrices.
Normal Probability Plot: A plot used to assess whether data are approximately normally distributed. The points should fall close to a straight line if the data are normal.
Outlier: A data point that is significantly different from the rest of the dataset. Often identified using IQR or Z-scores.
Pareto Chart: A bar chart with categories in descending order of frequency, often used in quality control.
Percentile: A value below which a given percentage of data falls. The 90th percentile means 90% of the data is below that value.
Population Mean (μ): The average of all values in the population.
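Population Standard Deviation (σ): The positive square root of the population variance. It measures spread in the same units as the data.
Population Variance (σ²): The average squared deviation of the population values from the population mean. For a finite population of N values, σ² = Σ(xᵢ - μ)²/N.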
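Probability Plot: A general plot (not limited to the normal case) comparing the ordered data against a chosen theoretical distribution. A roughly straight-line pattern suggests the distribution is a reasonable fit.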
Quartiles & Percentiles: Q1 (25th), Q2 (50th, median), Q3 (75th); divide data into four equal parts.
Relative Frequency Distribution: Shows proportions (rather than counts) in each class interval.
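Sample Correlation Coefficient (r): A value between -1 and 1 indicating the strength and direction of the linear relationship between two variables. Values near -1 or 1 indicate a strong linear relationship; values near 0 indicate little or no linear relationship.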
Sample Mean (x̄): The average of a sample. x̄ = (Σxᵢ)/n.
Sample Median: The middle value in a sorted sample. When n is even, it is the average of the two middle values.
Sample Mode: The most frequently occurring value in a dataset.
Sample Range: Difference between maximum and minimum values in the sample.
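Sample Standard Deviation (s): The positive square root of the sample variance. It measures the spread of sample data around the sample mean, in the same units as the data.
Sample Variance (s²): The sum of squared deviations from the sample mean divided by n - 1. s² = Σ(xᵢ - x̄)²/(n - 1).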
Scatter Diagram: A plot showing the relationship between two numerical variables.
Stem-and-Leaf Diagram: Displays data while preserving individual values and showing distribution shape.
Time Series: Data measured sequentially over time. Trends and patterns may be observed.
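
Worked example: the numerical summaries defined above can be checked on a small data set. The sketch below uses Python's standard statistics module. The sample values, the paired data, and the helper name sample_correlation are made up for illustration and do not come from the chapter. Quartile conventions differ between textbooks and software, so Q1 and Q3 computed by hand with another rule may differ slightly from what this code returns.

```python
import math
import statistics

# Hypothetical sample, chosen only for illustration (not from the chapter).
sample = [2, 4, 4, 5, 7, 9, 10, 12, 30]

mean = statistics.mean(sample)            # x̄ = (Σxᵢ)/n
median = statistics.median(sample)        # middle value of the sorted sample -> 7
mode = statistics.mode(sample)            # most frequent value -> 4
sample_range = max(sample) - min(sample)  # max - min -> 28
s2 = statistics.variance(sample)          # sample variance, divides by n - 1
s = statistics.stdev(sample)              # sample standard deviation, sqrt(s²)

# Quartiles using the module's default ("exclusive") interpolation rule.
# Other conventions exist and can give slightly different Q1 and Q3.
q1, q2, q3 = statistics.quantiles(sample, n=4)  # -> 4.0, 7.0, 11.0
iqr = q3 - q1                                   # -> 7.0

# Common 1.5 × IQR rule for flagging possible outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in sample if x < lower_fence or x > upper_fence]  # -> [30]


def sample_correlation(xs, ys):
    """Sample correlation coefficient r for paired data (hypothetical helper)."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired data with a nearly linear relationship.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r = sample_correlation(xs, ys)

print(f"mean={mean:.3f} median={median} mode={mode} range={sample_range}")
print(f"s2={s2:.3f} s={s:.3f} Q1={q1} Q3={q3} IQR={iqr} outliers={outliers}")
print(f"r={r:.4f}")
```

The value 30 is flagged because it lies above Q3 + 1.5 × IQR. It also pulls the mean well above the median, which is one reason the median and IQR are often reported for skewed data.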

✅ Use this glossary to help review key concepts before tests or exercises in Chapter 6.