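Efficient data manipulation is essential in data analysis. NumPy and Pandas provide powerful tools for handling and transforming data. This tutorial covers NumPy array operations and, in Pandas, creating DataFrames, selecting and filtering data, modifying and sorting rows and columns, grouping, merging, handling missing values, and exporting results.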
import numpy as np
import pandas as pd

NumPy is mainly used for handling arrays.
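As a rough illustration of why arrays are convenient, the sketch below compares a plain Python list with a NumPy array. The names `values`, `sample`, `doubled`, and `large` are hypothetical and used only for this example.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
values = [10, 20, 30, 40, 50]

# Plain Python: each element is handled explicitly.
doubled_list = [v * 2 for v in values]

# NumPy: the operation is written once and applied to every element.
sample = np.array(values)
doubled = sample * 2

# A boolean mask selects the elements that satisfy a condition.
large = sample[sample > 25]

print(doubled_list)      # [20, 40, 60, 80, 100]
print(doubled.tolist())  # [20, 40, 60, 80, 100]
print(large.tolist())    # [30, 40, 50]
print(sample.shape)      # (5,)
```

The array version avoids the explicit loop, and the same style carries over to the multi-dimensional arrays used later.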
arr = np.array([10, 20, 30, 40, 50])
print(arr * 2)  # Multiply each element by 2

rand_arr = np.random.rand(5)  # Array with 5 random numbers

arr2D = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2D.T)  # Transpose of the array

arr = np.array([1, 2, 3, 4, 5])
print(np.mean(arr)) # Calculate mean
print(np.sum(arr)) # Calculate sum
print(np.sqrt(arr))  # Square root

Pandas makes data manipulation easier using Series and DataFrames.
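To make the two terms concrete, here is a minimal sketch of how a Series relates to a DataFrame. The variable names `people` and `age_column` and the use of names as the row index are assumptions made for this example only.

```python
import pandas as pd

# A DataFrame is a table; here the row labels (index) are names.
people = pd.DataFrame(
    {"Age": [25, 30, 22], "Salary": [50000, 60000, 45000]},
    index=["Amit", "Pooja", "Rahul"],
)

# Selecting one column returns a Series: a one-dimensional labeled array.
age_column = people["Age"]

print(type(age_column).__name__)  # Series
print(age_column["Pooja"])        # 30
print(people.shape)               # (3, 2)
```

In short, a DataFrame can be thought of as a set of Series that share one index.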
data = {
    "Name": ["Amit", "Pooja", "Rahul", "Neha"],
    "Age": [25, 30, 22, 35],
    "Salary": [50000, 60000, 45000, 70000],
}
df = pd.DataFrame(data)
print(df)

print(df["Name"])  # Select single column
print(df[["Name", "Salary"]])  # Select multiple columns

print(df.iloc[1])  # Select second row
print(df.loc[df["Age"] > 25])  # Filter rows where Age > 25

high_salary = df[df["Salary"] > 50000]  # Keep rows where Salary > 50000
print(high_salary)

df["Bonus"] = df["Salary"] * 0.10  # 10% Bonus

df.loc[df["Name"] == "Rahul", "Salary"] = 50000  # Update Rahul's salary

df.drop(columns=["Bonus"], inplace=True)  # Remove the Bonus column

df.drop(index=2, inplace=True)  # Remove Rahul

df_sorted = df.sort_values(by="Salary", ascending=False)  # Sort by salary, highest first

df = df[["Name", "Salary", "Age"]]  # Reorder columns

Grouping helps in summarizing large datasets.
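Grouping is easiest to see when several rows share a key. The sketch below uses a hypothetical `Department` column, which is not part of this tutorial's dataset, to compute several summaries per group in one call.

```python
import pandas as pd

# Hypothetical data: "Department" is added only for this illustration.
staff = pd.DataFrame({
    "Name": ["Amit", "Pooja", "Rahul", "Neha"],
    "Department": ["Sales", "Sales", "IT", "IT"],
    "Salary": [50000, 60000, 45000, 70000],
})

# Split rows by department, then aggregate the Salary column of each group.
summary = staff.groupby("Department")["Salary"].agg(["mean", "sum", "count"])
print(summary)

print(summary.loc["IT", "sum"])      # 115000
print(summary.loc["Sales", "mean"])  # 55000.0
```

Selecting the `Salary` column before aggregating keeps text columns such as `Name` out of the calculation.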
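df_grouped = df.groupby("Age").mean(numeric_only=True)  # Mean of numeric columns per age; skips the text column "Name"
df.groupby("Age")["Salary"].sum()  # Total salary per age

df1 = pd.DataFrame({"ID": [1, 2], "Name": ["Amit", "Pooja"]})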
df2 = pd.DataFrame({"ID": [1, 2], "Salary": [50000, 60000]})
df_merged = pd.merge(df1, df2, on="ID")  # Join the two tables on the shared ID column

df_concat = pd.concat([df1, df2], axis=0)  # Stack rows; columns missing from one frame become NaN

df.fillna(0, inplace=True)  # Fill missing values with 0
df.dropna(inplace=True)  # Remove rows with missing values

df.to_csv("processed_data.csv", index=False)  # Save as CSV
df.to_excel("processed_data.xlsx", index=False)  # Save as Excel (needs an Excel engine such as openpyxl)

In this tutorial, we explored NumPy and Pandas for data manipulation. You learned how to filter, sort, merge, and clean data effectively. In the next tutorial, we will focus on Exploratory Data Analysis (EDA) techniques.
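As a short recap, the merging, cleaning, and sorting steps can be chained together. This is only a sketch: the third employee without a salary record is a hypothetical addition, included so that the merge produces a missing value to clean.

```python
import pandas as pd

# Hypothetical inputs: employee 3 has no matching salary record.
employees = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Amit", "Pooja", "Rahul"]})
salaries = pd.DataFrame({"ID": [1, 2], "Salary": [50000, 60000]})

# A left join keeps every employee; unmatched rows get NaN in Salary.
merged = pd.merge(employees, salaries, on="ID", how="left")
print(merged["Salary"].isna().sum())  # 1

# Fill the gap in the Salary column only, then sort highest first.
cleaned = merged.fillna({"Salary": 0})
result = cleaned.sort_values(by="Salary", ascending=False)
print(result["Name"].tolist())  # ['Pooja', 'Amit', 'Rahul']
```

Whether to fill a missing value with 0 or drop the row depends on the analysis; filling is used here only to keep every employee in the output.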