### NumPy
In NumPy, arrays can have any number of dimensions (1D, 2D, 3D, and so on), and the `axis` parameter identifies one of those dimensions by its position in the array's shape. Here’s how it works:
- **1D Array**: The axis is always `0`, which refers to the only dimension available.
- **2D Array**: Axis `0` runs down the rows (the first dimension), and axis `1` runs across the columns (the second dimension).
- **3D Array**: Axis `0` refers to layers, axis `1` refers to rows, and axis `2` refers to columns.
When applying functions like `np.sum()` or `np.mean()`, the `axis` argument names the dimension that is collapsed: the operation runs along that axis, and that axis no longer appears in the result. For example:
```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])

# Sum along axis 0: collapse the rows, giving one sum per column
result_axis0 = np.sum(array, axis=0)  # Output: [5 7 9]

# Sum along axis 1: collapse the columns, giving one sum per row
result_axis1 = np.sum(array, axis=1)  # Output: [ 6 15]
```
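The 3D case from the list above can be checked the same way. The following is a minimal sketch using a hypothetical array named `cube` with shape `(2, 3, 4)` (the name and shape are illustrative, not part of the original example); it shows that reducing along an axis removes that axis from the result's shape.

```python
import numpy as np

# Hypothetical 3D array: 2 layers, 3 rows, 4 columns
cube = np.arange(24).reshape(2, 3, 4)

# Reducing along an axis removes that axis from the result's shape
sum_layers = np.sum(cube, axis=0)  # layers collapsed
sum_rows = np.sum(cube, axis=1)    # rows collapsed
sum_cols = np.sum(cube, axis=2)    # columns collapsed

print(sum_layers.shape, sum_rows.shape, sum_cols.shape)
# (3, 4) (2, 4) (2, 3)

# keepdims=True keeps the reduced axis with length 1
kept = np.sum(cube, axis=1, keepdims=True)
print(kept.shape)  # (2, 1, 4)
```

Reading the shape tuple is often the quickest way to confirm which axis was reduced: the entry at the position given by `axis` is the one that disappears.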
### pandas
In pandas, which operates primarily on DataFrames and Series, the `axis` parameter uses the same numbering, but it is tied to the labeled axes of the object rather than to arbitrary array dimensions:
- **DataFrame**: Axis `0` is the row axis (the index, running vertically), and axis `1` is the column axis (running horizontally). With functions like `df.sum()` or `df.mean()`, `axis=0` aggregates down the rows and returns one value per column, while `axis=1` aggregates across the columns and returns one value per row.
- **Series**: The Series object is essentially a 1D array with an index, so axis is always `0`, and operations are applied along this single dimension.
Example with DataFrame:
```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})

# Sum along axis 0: collapse the rows, giving one sum per column
result_axis0 = df.sum(axis=0)  # Output: A 6, B 15

# Sum along axis 1: collapse the columns, giving one sum per row
result_axis1 = df.sum(axis=1)  # Output: 0 5, 1 7, 2 9
```
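As a rough cross-check, the sketch below shows that pandas also accepts the string aliases `'index'` and `'columns'` for axes `0` and `1`, that the reductions agree with NumPy applied to the underlying array, and that in a non-reducing method such as `drop` the `axis` argument selects which set of labels is acted on. The variable names (`by_column`, `by_row`, `without_b`) are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# String aliases: 'index' is axis 0, 'columns' is axis 1
by_column = df.sum(axis='index')   # one sum per column: A 6, B 15
by_row = df.sum(axis='columns')    # one sum per row: 5, 7, 9

# The same reductions on the underlying NumPy array give the same numbers
values = df.to_numpy()
print(np.array_equal(by_column.to_numpy(), values.sum(axis=0)))  # True
print(np.array_equal(by_row.to_numpy(), values.sum(axis=1)))     # True

# In drop(), axis picks the labels to look in: axis=1 means column labels
without_b = df.drop('B', axis=1)
print(list(without_b.columns))  # ['A']

# A Series has a single axis, so axis=0 is the only meaningful choice
s = df['A']
print(s.sum(axis=0))  # 6
```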
In summary, `axis` in both NumPy and pandas names the dimension along which an operation is performed. NumPy applies it directly to array dimensions of any rank, while pandas ties it to the orientation of its labeled objects: the index (axis `0`) for Series and DataFrames, and the columns (axis `1`) for DataFrames.