Pandas is an open source Python library primarily used for data processing and analysis. It is built on top of the NumPy library and provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
In this article, you will learn how to perform 6 basic operations with Pandas.
Running the Pandas examples
You can run the examples in this article in computational notebooks like Jupyter Notebook, Google Colab, etc. You can also run the examples by entering the code directly into the Python interpreter in interactive mode.
If you want to take a look at the full source code used in this article, you can access the Python Notebook file from this GitHub repository.
1. How to import Pandas as pd and print the version number
You need to use the import keyword to import any library in Python. By convention, Pandas is imported under the alias pd. With this approach, you can refer to the Pandas package as pd instead of pandas.
import pandas as pd
print(pd.__version__)
Output:
1.2.4
2. How to create a Series in Pandas
A Pandas Series is a one-dimensional array that can hold data of any type. It's like a column in a table. You can create a Series from NumPy arrays, NumPy functions, lists, dictionaries, scalar values, etc.
The values of a Series are labeled with their index number. By default, the first value has index 0, the second value has index 1, and so on. To assign your own labels, you need to use the index argument.
How to create an empty Series
s = pd.Series(dtype='float64')
s
Output:
Series([], dtype: float64)
In the example above, an empty Series with the float data type was created.
How to create a Series using a NumPy array
import pandas as pd
import numpy as np
d = np.array([1, 2, 3, 4, 5])
s = pd.Series(d)
s
Output:
0 1
1 2
2 3
3 4
4 5
dtype: int32
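The dtype in the output above is inherited from the NumPy array, and NumPy's default integer width can depend on the platform and the NumPy version, so you may see int64 instead of int32 on your machine. If the width matters, you can request it explicitly. The following is a minimal sketch, assuming you want a fixed integer type:

```python
import numpy as np
import pandas as pd

# Ask NumPy for a specific integer width instead of the platform default.
d = np.array([1, 2, 3, 4, 5], dtype="int64")
s = pd.Series(d)
print(s.dtype)  # int64

# Alternatively, pass dtype to the Series constructor itself.
s32 = pd.Series([1, 2, 3, 4, 5], dtype="int32")
print(s32.dtype)  # int32
```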
How to create a Series using a list
d = [1, 2, 3, 4, 5]
s = pd.Series(d)
s
Output:
0 1
1 2
2 3
3 4
4 5
dtype: int64
How to create a Series with an index
To create a Series with an index, you need to use the index argument. The number of index labels must be equal to the number of items in the Series.
d = [1, 2, 3, 4, 5]
s = pd.Series(d, index=["one", "two", "three", "four", "five"])
s
Output:
one 1
two 2
three 3
four 4
five 5
dtype: int64
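Once a Series has labels, you can look values up by label as well as by position. The sketch below reuses the data from the example above and also shows what happens when the number of labels does not match the number of items; the variable names by_label, by_position, and mismatch_error are only illustrative.

```python
import pandas as pd

d = [1, 2, 3, 4, 5]
s = pd.Series(d, index=["one", "two", "three", "four", "five"])

# Look up a value by its label, or by its position with iloc.
by_label = s["three"]
by_position = s.iloc[2]
print(by_label, by_position)

# Passing too few labels raises a ValueError.
mismatch_error = None
try:
    pd.Series(d, index=["one", "two"])
except ValueError as err:
    mismatch_error = str(err)
    print("ValueError:", mismatch_error)
```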
How to create a Series using a dictionary
Dictionary keys become the labels of the Series.
d = {"one" : 1,
"two" : 2,
"three" : 3,
"four" : 4,
"five" : 5}
s = pd.Series(d)
s
Output:
one 1
two 2
three 3
four 4
five 5
dtype: int64
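You can also combine a dictionary with the index argument. In that case, the labels you pass are looked up in the dictionary, and any label without a matching key gets a missing value (NaN). A small sketch with a shortened, hypothetical dictionary:

```python
import pandas as pd

d = {"one": 1, "two": 2, "three": 3}

# Labels are looked up in the dictionary; "six" has no matching key.
s = pd.Series(d, index=["three", "one", "six"])
print(s)
```

Because NaN is a floating-point value, the resulting Series has the float64 data type rather than int64.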
How to create a Series using a scalar value
If you want to create a Series from a scalar value, you must provide the index argument. The value is repeated for every index label.
s = pd.Series(1, index = ["a", "b", "c", "d"])
s
Output:
a 1
b 1
c 1
d 1
dtype: int64
3. How to create a DataFrame in Pandas
A DataFrame is a two-dimensional data structure in which data is aligned in rows and columns. A DataFrame can be created using dictionaries, lists, lists of dictionaries, NumPy arrays, etc. In the real world, DataFrames are usually created from existing storage like CSV files, Excel files, SQL databases, etc.
A DataFrame object supports a number of attributes and methods. If you want to know more about them, you can check the official documentation of the Pandas DataFrame.
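As a quick illustration of such attributes, the sketch below inspects a small DataFrame built from sample data. The attributes shape, columns, and dtypes are standard DataFrame attributes; the data itself is only for illustration.

```python
import pandas as pd

# Small sample data, used only for illustration.
df = pd.DataFrame({"Name": ["Alex", "Bob", "Cataline"],
                   "Roll No.": [601, 602, 603]})

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels
print(df.dtypes)         # data type of each column
print(df["Name"])        # selecting one column gives a Series
```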
How to create an empty DataFrame
df = pd.DataFrame()
print(df)
Output:
Empty DataFrame
Columns: []
Index: []
How to create a DataFrame using a list
listObj = ["MUO", "technology", "simplified"]
df = pd.DataFrame(listObj)
print(df)
Output:
0
0 MUO
1 technology
2 simplified
How to create a DataFrame using a dictionary of ndarrays/lists
batmanData = {'Movie Name' : ['Batman Begins', 'The Dark Knight', 'The Dark Knight Rises'],
'Year of Release' : [2005, 2008, 2012]}
df = pd.DataFrame(batmanData)
print(df)
Output:
Movie Name Year of Release
0 Batman Begins 2005
1 The Dark Knight 2008
2 The Dark Knight Rises 2012
How to create a DataFrame using a list of lists
data = [['Alex', 601], ['Bob', 602], ['Cataline', 603]]
df = pd.DataFrame(data, columns = ['Name', 'Roll No.'])
print(df)
Output:
Name Roll No.
0 Alex 601
1 Bob 602
2 Cataline 603
How to create a DataFrame using a list of dictionaries
data = [{'Name': 'Alex', 'Roll No.': 601},
{'Name': 'Bob', 'Roll No.': 602},
{'Name': 'Cataline', 'Roll No.': 603}]
df = pd.DataFrame(data)
print(df)
Output:
Name Roll No.
0 Alex 601
1 Bob 602
2 Cataline 603
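The dictionaries in the list do not all need the same keys. As a sketch with hypothetical data, the record for Bob below has no 'Roll No.' key, so Pandas fills that cell with a missing value (NaN):

```python
import pandas as pd

# The second record has no 'Roll No.' key (hypothetical data).
data = [{'Name': 'Alex', 'Roll No.': 601},
        {'Name': 'Bob'},
        {'Name': 'Cataline', 'Roll No.': 603}]
df = pd.DataFrame(data)
print(df)
```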
How to create a DataFrame using the zip() function
Use the zip() function to merge lists in Python.
Name = ['Alex', 'Bob', 'Cataline']
RollNo = [601, 602, 603]
listOfTuples = list(zip(Name, RollNo))
df = pd.DataFrame(listOfTuples, columns = ['Name', 'Roll No.'])
print(df)
Output:
Name Roll No.
0 Alex 601
1 Bob 602
2 Cataline 603
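One thing to keep in mind is that zip() stops as soon as the shortest list is exhausted, so lists of unequal length silently produce fewer rows. A minimal sketch, with RollNo deliberately one value short:

```python
import pandas as pd

Name = ['Alex', 'Bob', 'Cataline']
RollNo = [601, 602]  # one value short, on purpose

# zip() pairs items until the shorter list runs out.
listOfTuples = list(zip(Name, RollNo))
df = pd.DataFrame(listOfTuples, columns=['Name', 'Roll No.'])
print(df)
print(len(df))
```

Checking that the lists have the same length before zipping them avoids losing rows this way.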
4. How to read CSV data in Pandas
A Comma Separated Values (CSV) file is a delimited text file that uses a comma to separate values. You can read a CSV file with the read_csv() function in Pandas. If you want to print the entire DataFrame, use the to_string() method.
In this and the following examples, this CSV file will be used to perform the operations.
df = pd.read_csv('https://raw.githubusercontent.com/Yuvrajchandra/Basic-Operations-Using-Pandas/main/biostats.csv')
print(df.to_string())
Output: the entire DataFrame, one line per row of the CSV file.
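If you want to try read_csv() without network access, you can pass it any file-like object instead of a URL. The sketch below wraps a short piece of hypothetical CSV text in io.StringIO; the column names and values are made up for illustration and are not the contents of the file used above.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a file on disk or a URL.
csv_text = """Name,Age,Height
Alex,41,74
Bert,42,68
Carl,32,70
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df.to_string())
```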
5. How to analyze DataFrames using the head(), tail(), and info() methods
How to display data using the head() method
The head() method is one of the best ways to get a quick overview of a DataFrame. This method returns the header and the specified number of rows, starting from the top.
df = pd.read_csv('https://raw.githubusercontent.com/Yuvrajchandra/Basic-Operations-Using-Pandas/main/biostats.csv')
print(df.head(10))
Output: the header and the first 10 rows of the DataFrame.
If you do not specify the number of rows, the first 5 rows will be returned.
df = pd.read_csv('https://raw.githubusercontent.com/Yuvrajchandra/Basic-Operations-Using-Pandas/main/biostats.csv')
print(df.head())
Output: the header and the first 5 rows of the DataFrame.
How to display data using the tail() method
The tail() method returns the header and the specified number of rows, starting from the bottom.
df = pd.read_csv('https://raw.githubusercontent.com/Yuvrajchandra/Basic-Operations-Using-Pandas/main/biostats.csv')
print(df.tail(10))
Output: the header and the last 10 rows of the DataFrame.
If you do not specify the number of rows, the last 5 rows will be returned.
df = pd.read_csv('https://raw.githubusercontent.com/Yuvrajchandra/Basic-Operations-Using-Pandas/main/biostats.csv')
print(df.tail())
Output: the header and the last 5 rows of the DataFrame.
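The same behavior can be checked on a small DataFrame that does not need to be downloaded. This is a minimal sketch with a hypothetical 12-row DataFrame; the column name n is arbitrary.

```python
import pandas as pd

# Hypothetical 12-row DataFrame with the values 1 to 12.
df = pd.DataFrame({"n": range(1, 13)})

first_five = df.head()   # no argument: the first 5 rows
last_three = df.tail(3)  # the last 3 rows

print(first_five)
print(last_three)
```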
How to get information about the data
The info() method prints a brief summary of the DataFrame, including the index type and column types, non-null values, and memory usage. It prints the summary itself and returns None, so you do not need to wrap it in print().
df = pd.read_csv('https://raw.githubusercontent.com/Yuvrajchandra/Basic-Operations-Using-Pandas/main/biostats.csv')
df.info()
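Output: a summary listing the index range, each column with its non-null count and data type, and the memory usage.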
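If you need the summary as text, for example to write it to a log, info() accepts a buf parameter that redirects the output to a writable buffer. A minimal sketch with hypothetical data:

```python
import io
import pandas as pd

# Hypothetical data with one missing value in the Age column.
df = pd.DataFrame({"Name": ["Alex", "Bob"], "Age": [41, None]})

# info() writes to standard output by default and returns None;
# buf redirects the text to any writable buffer instead.
buffer = io.StringIO()
result = df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```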
6. How to read JSON data in Pandas
JSON (JavaScript Object Notation) is a lightweight data exchange format. You can read a JSON file with the read_json() function in Pandas. If you want to print the entire DataFrame, use the to_string() method.
In the example below, this JSON file is used to perform the operations.
df = pd.read_json('https://raw.githubusercontent.com/Yuvrajchandra/Basic-Operations-Using-Pandas/main/google_markers.json')
print(df.to_string())
Output: the entire DataFrame built from the JSON file.
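As with CSV data, you can experiment with read_json() offline by wrapping JSON text in io.StringIO. The sketch below uses a hypothetical list of records; the keys name and lat are made up for illustration and are not taken from the file used above.

```python
import io
import pandas as pd

# Hypothetical JSON: a list of records that share the same keys.
json_text = '[{"name": "A", "lat": 1.5}, {"name": "B", "lat": 2.5}]'

df = pd.read_json(io.StringIO(json_text))
print(df.to_string())
```

Each record becomes a row, and each key becomes a column.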
Brush up on Python's built-in functions and methods
Functions help shorten code and improve its efficiency. Functions and methods like reduce(), split(), enumerate(), eval(), round(), etc., can make your code robust and easy to understand. It is always good to know the built-in functions and methods, as they can greatly simplify your programming tasks.