Selecting, Filtering and Sorting
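There are two main selectors. .loc selects by label (column names, index values); .iloc selects by integer position. With the default index (0, 1, 2, ...) row labels and row positions coincide, so for rows the two differ only once the index holds other labels.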
import pandas as pd
df = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy"],
    "age": [30, 25, 35],
})
print(df.loc[0, "name"]) # Ada (row label 0, column 'name')
print(df.iloc[0]) # the first row by position
print(df.iloc[:, 1]) # the second column by position
The workhorse is the boolean mask. Write a condition on a column to get a True/False Series, then index the DataFrame with it to keep only the True rows:
mask = df["age"] >= 30
print(df[mask]) # only Ada and Cy
# or in one line:
print(df[df["age"] >= 30])
Combine conditions with & (and) and | (or). Wrap each condition in parentheses, because & and | bind more tightly than comparisons such as >=:
print(df[(df["age"] >= 26) & (df["age"] <= 34)]) # just Ada
Sort with sort_values. Use ascending=False for largest first:
print(df.sort_values("age", ascending=False)) # Cy, Ada, Bo
Given the DataFrame below (already built for you), keep only the rows where price is greater than 20 and store the result in pricey. Then sort pricey by price from highest to lowest, storing it in pricey_sorted.
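The exercise's DataFrame is not reproduced in this text, so the following is only a sketch of one way the filter-then-sort pattern might look. The name items, the item column, and all of the values are made up for illustration; only the price column and the pricey / pricey_sorted names come from the exercise.

```python
import pandas as pd

# Hypothetical stand-in for the exercise's DataFrame (not the real data).
items = pd.DataFrame({
    "item": ["pen", "lamp", "desk", "mug"],
    "price": [3, 25, 120, 20],
})

# Boolean mask: strictly greater than 20, so the mug (exactly 20) is dropped.
pricey = items[items["price"] > 20]

# Sort the filtered rows by price, largest first.
pricey_sorted = pricey.sort_values("price", ascending=False)

print(pricey_sorted)  # desk first, then lamp
```

The filtered rows keep their original index labels (here 2 and 1 after sorting). If a fresh 0-based index is wanted, reset_index(drop=True) can be chained on the end.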