To compute a bivariate correlation in Python, you can use the pearsonr function from the scipy.stats module. The pearsonr function returns the Pearson correlation coefficient together with the p-value for a test of the null hypothesis that the two variables are uncorrelated.
Here's an example of how you might use it:
import numpy as np
from scipy.stats import pearsonr
# Generating some random data
np.random.seed(0) # For reproducibility
data1 = np.random.rand(100) # 100 random numbers from a uniform distribution between 0 and 1
data2 = data1 + np.random.rand(100) * 0.1 # data2 is close to data1, but not identical
# Compute Pearson correlation
corr, p_value = pearsonr(data1, data2)
print(f"Pearson Correlation Coefficient: {corr}")
print(f"P-value: {p_value}")
Running this code prints the correlation coefficient followed by the p-value.
In this code, we first generate two arrays of random numbers, data1 and data2. Because data2 is data1 plus a small amount of uniform noise (between 0 and 0.1), we'd expect a high positive correlation between the two arrays.
The pearsonr function then computes the Pearson correlation coefficient between data1 and data2. This coefficient measures the strength and direction of the linear relationship between the two datasets. It ranges from -1 to 1: 1 indicates a perfect positive linear relationship, -1 a perfect negative linear relationship, and 0 no linear relationship (a nonlinear relationship may still exist).
The function also calculates the p-value for testing the null hypothesis that the datasets are uncorrelated. If the p-value is below your chosen significance level (often 0.05), you can reject this hypothesis and conclude that there is a statistically significant correlation between the datasets. Here, we have a very high r value and a very low p-value, so these variables are strongly and significantly positively correlated. Let's look at a scatter plot to illustrate that.
You can use the matplotlib library to generate a scatter plot of the two variables. Here's an example of how you might do it.
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
plt.scatter(data1, data2, color='blue')
plt.title('Scatter plot of Data1 vs Data2')
plt.xlabel('Data1')
plt.ylabel('Data2')
plt.show()
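To make the interpretation of the coefficient more concrete, here is a minimal sketch that recomputes Pearson's r directly from its definition (the covariance of the two variables divided by the product of their standard deviations) and compares it with the values returned by pearsonr and by NumPy's corrcoef. The names r_manual, r_numpy, alpha, and significant are illustrative, and the 0.05 threshold is only an assumed significance level, not a recommendation.

```python
import numpy as np
from scipy.stats import pearsonr

# Recreate the same synthetic data used in the example above
np.random.seed(0)
data1 = np.random.rand(100)
data2 = data1 + np.random.rand(100) * 0.1

# Pearson's r from its definition: the sum of products of deviations from
# the mean, divided by the square root of the product of the sums of
# squared deviations
dx = data1 - data1.mean()
dy = data2 - data2.mean()
r_manual = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

# The same coefficient from SciPy and from NumPy's correlation matrix
r_scipy, p_value = pearsonr(data1, data2)
r_numpy = np.corrcoef(data1, data2)[0, 1]

# Assumed significance level for the decision rule (illustrative only)
alpha = 0.05
significant = bool(p_value < alpha)

print(f"Manual r: {r_manual}")
print(f"SciPy r:  {r_scipy}")
print(f"NumPy r:  {r_numpy}")
print(f"Significant at alpha={alpha}: {significant}")
```

The three coefficients should agree up to floating-point rounding. If you only need the coefficient and not the p-value, np.corrcoef is enough; pearsonr is the better fit when you also want the significance test.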