# PEARSON

The `PEARSON` function calculates the Pearson product-moment correlation coefficient, r, between two sets of data points, `data_y` and `data_x`. The coefficient ranges from -1 to 1 and measures the strength and direction of the linear relationship between the two sets: values near 1 indicate a strong positive correlation, values near -1 indicate a strong negative correlation, and values near 0 indicate little or no linear relationship. This function is commonly used in statistical analysis and data visualization.

## Usage

Use the `PEARSON` formula with the syntax `PEARSON(data_y, data_x)`. It takes two required parameters.

Parameters:
1. `data_y` (required): An array or range containing the dependent data points for the correlation calculation.
2. `data_x` (required): An array or range containing the independent data points for the correlation calculation.
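
To make the calculation concrete, here is a minimal Python sketch of the standard Pearson formula: the sum of products of deviations divided by the square root of the product of the summed squared deviations. The function name `pearson` and the sample values are illustrative assumptions, not part of Google Sheets, and the sketch is not meant to mirror how Sheets implements the function internally.

```python
from math import sqrt

def pearson(data_y, data_x):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(data_y)
    mean_y = sum(data_y) / n
    mean_x = sum(data_x) / n
    # Sum of products of deviations from the means.
    cov = sum((y - mean_y) * (x - mean_x) for y, x in zip(data_y, data_x))
    # Sums of squared deviations for each data set.
    ss_y = sum((y - mean_y) ** 2 for y in data_y)
    ss_x = sum((x - mean_x) ** 2 for x in data_x)
    return cov / sqrt(ss_y * ss_x)

# Hypothetical data: y rises exactly linearly with x, so r is 1 (up to rounding).
r = pearson([2, 4, 6, 8], [1, 2, 3, 4])
```

In a sheet, the comparable call would look like `=PEARSON(B2:B5, A2:A5)`, where the cell ranges are hypothetical.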

## Examples

Here are a few example use cases that explain how to use the `PEARSON` formula in Google Sheets.

### Comparing Sales and Advertising Data

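A company wants to analyze the relationship between its advertising spending and product sales. It uses the `PEARSON` function to calculate the correlation coefficient between the two data sets. For example, with advertising spend in `A2:A13` and sales in `B2:B13` (hypothetical ranges), the formula would be `=PEARSON(B2:B13, A2:A13)`.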

### Analyzing Test Scores

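A teacher wants to determine whether there is a correlation between her students' test scores and the amount of time they spent studying. She uses the `PEARSON` function to calculate the correlation coefficient between the two data sets. For example, with study hours in `A2:A31` and test scores in `B2:B31` (hypothetical ranges), the formula would be `=PEARSON(B2:B31, A2:A31)`.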

### Assessing Investment Performance

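An investor wants to analyze the relationship between the performance of two stocks in their portfolio. They use the `PEARSON` function to calculate the correlation coefficient between the two data sets. For example, with one stock's figures in `A2:A61` and the other's in `B2:B61` (hypothetical ranges), the formula would be `=PEARSON(A2:A61, B2:B61)`.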

## Common Mistakes

`PEARSON` not working? Here are some common mistakes people make when using the `PEARSON` Google Sheets Formula:

### Mismatched data ranges

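One common mistake is providing ranges of different sizes for `data_y` and `data_x`. Each value in one range is paired with the value at the same position in the other, so both ranges must contain the same number of data points (for example, `B2:B20` and `A2:A20`).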
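
When the data is prepared outside the sheet, the same check can be made before any calculation. The following is a small sketch; the helper name `check_same_size` and the sample values are hypothetical.

```python
def check_same_size(data_y, data_x):
    """Raise ValueError if the two data sets differ in length."""
    if len(data_y) != len(data_x):
        raise ValueError(
            f"data_y has {len(data_y)} values but data_x has {len(data_x)}"
        )
    return len(data_y)

# Hypothetical ranges: five sales figures but only four advertising budgets.
try:
    check_same_size([10, 12, 15, 19, 22], [1, 2, 3, 4])
    message = "ok"
except ValueError as err:
    message = str(err)
```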

### Incorrect order of data ranges

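Another common source of confusion is the argument order. The Pearson coefficient is symmetric, so swapping `data_y` and `data_x` does not change the result of `PEARSON`. The order does matter for related functions such as `SLOPE` and `INTERCEPT`, so it is good practice to provide `data_y` first and `data_x` second consistently.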
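
Argument order matters most for regression functions such as `SLOPE`; the correlation coefficient itself comes out the same either way. The sketch below illustrates the difference using the standard least-squares formulas. The helper names and the data are hypothetical.

```python
from math import sqrt

def deviations(values):
    """Return each value's deviation from the mean of the list."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def pearson(data_y, data_x):
    dy, dx = deviations(data_y), deviations(data_x)
    cov = sum(a * b for a, b in zip(dy, dx))
    return cov / sqrt(sum(a * a for a in dy) * sum(b * b for b in dx))

def slope(data_y, data_x):
    """Least-squares slope of data_y regressed on data_x."""
    dy, dx = deviations(data_y), deviations(data_x)
    return sum(a * b for a, b in zip(dy, dx)) / sum(b * b for b in dx)

# Hypothetical data.
y = [3, 5, 4, 8]
x = [1, 2, 3, 4]

r_forward = pearson(y, x)
r_swapped = pearson(x, y)   # same value: the formula is symmetric
m_forward = slope(y, x)     # slope of y on x
m_swapped = slope(x, y)     # slope of x on y: a different quantity
```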

### Missing data

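Blank cells and cells containing text cannot contribute to the calculation. Depending on how the spreadsheet treats them, the affected rows may be skipped silently or cause an error, so the result may rest on fewer data pairs than expected. Make sure every cell in both ranges contains numerical data.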
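
One way to guard against this when cleaning data outside the sheet is to keep only the rows where both values are numeric. This is a sketch that assumes dropping incomplete pairs is the desired behaviour; the helper name `numeric_pairs` and the sample columns are hypothetical.

```python
def numeric_pairs(data_y, data_x):
    """Keep only the rows where both values are real numbers."""
    def is_number(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    return [
        (y, x)
        for y, x in zip(data_y, data_x)
        if is_number(y) and is_number(x)
    ]

# Hypothetical columns containing a blank (None) and a text entry.
scores = [70, None, 85, 90, 95]
hours = [2, 3, "n/a", 5, 6]
clean = numeric_pairs(scores, hours)
```

Comparing `len(clean)` with the original row count shows how many pairs were dropped.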

### Dividing by zero

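If either data set has a standard deviation of zero (that is, every value in `data_y` or every value in `data_x` is identical), the denominator of the correlation formula is zero and the `PEARSON` formula returns a division-by-zero error. Make sure both data ranges have some variability.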
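
The sketch below shows where the division by zero comes from and one way to detect it up front. Returning `None` for a constant data set is an assumption made for illustration, as are the function name `pearson_or_none` and the sample values.

```python
from math import sqrt

def pearson_or_none(data_y, data_x):
    """Return the Pearson coefficient, or None if either set has no variability."""
    n = len(data_y)
    mean_y = sum(data_y) / n
    mean_x = sum(data_x) / n
    ss_y = sum((y - mean_y) ** 2 for y in data_y)
    ss_x = sum((x - mean_x) ** 2 for x in data_x)
    # A zero sum of squares means the denominator below would be zero.
    # The exact comparison is adequate for this integer illustration.
    if ss_y == 0 or ss_x == 0:
        return None
    cov = sum((y - mean_y) * (x - mean_x) for y, x in zip(data_y, data_x))
    return cov / sqrt(ss_y * ss_x)

constant = pearson_or_none([5, 5, 5, 5], [1, 2, 3, 4])  # no variability in y
varied = pearson_or_none([5, 6, 7, 9], [1, 2, 3, 4])    # a valid coefficient
```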

### Incorrect interpretation of result

One common mistake is assuming that a high or low correlation coefficient indicates causation. Remember that correlation does not equal causation.

## Related Functions

The following functions are similar to `PEARSON` or are often used with it in a formula:

• `CORREL`

The `CORREL` formula returns the correlation coefficient between two sets of data. This coefficient represents the strength of the linear relationship between the two sets of data, with values ranging from -1 (perfect negative correlation) to 1 (perfect positive correlation). It computes the same Pearson coefficient as `PEARSON`, so the two functions can be used interchangeably.

• `SLOPE`

The `SLOPE` formula calculates the slope of the linear regression line that best fits the input data. It is commonly used in statistics to analyze trends and predict future values based on past performance.

• `INTERCEPT`

The `INTERCEPT` function calculates the point where the line of best fit for a set of data intercepts the y-axis. This function is commonly used in regression analysis to find the constant b in the equation y=mx+b where m is the slope of the regression line.
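
To show how these quantities fit together, the following sketch computes the slope m, the intercept b, and the correlation coefficient r from the same sums, using the standard least-squares formulas. The function name `fit_line` and the data are hypothetical, and the code is an illustration rather than a description of how Google Sheets implements these functions.

```python
from math import sqrt

def fit_line(data_y, data_x):
    """Return (m, b, r) for the least-squares line y = m*x + b."""
    n = len(data_y)
    mean_y = sum(data_y) / n
    mean_x = sum(data_x) / n
    ss_xy = sum((y - mean_y) * (x - mean_x) for y, x in zip(data_y, data_x))
    ss_xx = sum((x - mean_x) ** 2 for x in data_x)
    ss_yy = sum((y - mean_y) ** 2 for y in data_y)
    m = ss_xy / ss_xx                  # slope, as in SLOPE
    b = mean_y - m * mean_x            # intercept, as in INTERCEPT
    r = ss_xy / sqrt(ss_xx * ss_yy)    # correlation, as in PEARSON
    return m, b, r

# Hypothetical data lying exactly on the line y = 2x + 1.
m, b, r = fit_line([3, 5, 7, 9], [1, 2, 3, 4])
```

Because the slope and r share the same numerator, they always have the same sign: a rising line of best fit goes with a positive correlation, and a falling line with a negative one.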

You can learn more about the `PEARSON` Google Sheets function on Google Support.