A DataFrame is a two-dimensional collection of data: a data structure that stores data in tabular form, arranged in rows and columns. A single DataFrame can hold multiple datasets. We can perform various operations on it, such as arithmetic, selecting columns or rows, and adding columns or rows.
We can import DataFrames from external storage such as an SQL database, a CSV file, or an Excel file. We can also build them from lists, a dictionary, a list of dictionaries, and so on.
A DataFrame consists of an index, column names, and the data itself. The index is a label for each row and, together with the column names, acts as an address for each data element. The data in a DataFrame can be heterogeneous, i.e., the columns can hold different data types.
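As a minimal sketch of importing from external storage, the snippet below reads CSV data with pd.read_csv. To keep it self-contained, the CSV content is a made-up string held in memory (an assumption for illustration); with a real file you would pass its path instead.

```python
import io
import pandas as pd

# Hypothetical CSV content kept in memory so no external file is needed.
csv_text = "Name,Age\nTom,20\nJoseph,21\n"

# pd.read_csv accepts a file path or any file-like object.
df_from_csv = pd.read_csv(io.StringIO(csv_text))
print(df_from_csv)

# Each column keeps its own data type, so the data can be heterogeneous.
print(df_from_csv.dtypes)
```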
Different methods of Creating DataFrame
An empty dataframe
We can create a basic empty DataFrame by calling the DataFrame constructor with no arguments. Let’s understand the following example.
Example –
# importing pandas library
import pandas as pd
# Calling DataFrame constructor
df = pd.DataFrame()
print(df)
Output:
Empty DataFrame
Columns: []
Index: []
Create a dataframe using List
We can create a DataFrame from a single list or a list of lists. Let’s understand the following example.
Example –
# importing pandas library
import pandas as pd
# string values in the list
lst = ['Java', 'Python', 'C', 'C++',
       'JavaScript', 'Swift', 'Go']
# Calling DataFrame constructor on list
dframe = pd.DataFrame(lst)
print(dframe)
Output:
            0
0        Java
1      Python
2           C
3         C++
4  JavaScript
5       Swift
6          Go
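The example above uses a single list. For the list-of-lists case, a minimal sketch could look like the following, where each inner list becomes one row. The variable names and the values are illustrative and not from the original example.

```python
import pandas as pd

# Each inner list is one row: a language name and its first release year.
rows = [['Java', 1995], ['Python', 1991], ['Go', 2009]]

# The columns argument names the columns; without it they would be 0 and 1.
dframe_rows = pd.DataFrame(rows, columns=['Language', 'Year'])
print(dframe_rows)
```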
Create Dataframe from dict of ndarray/lists
A dict of ndarrays or lists can be used to create a DataFrame. All the arrays must be of the same length. By default the index is range(n), where n is the array length. Let’s understand the following example.
Example
import pandas as pd
# assign data of lists.
data = {'Name': ['Tom', 'Joseph', 'Krish', 'John'], 'Age': [20, 21, 19, 18]}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
print(df)
Output:
     Name  Age
0     Tom   20
1  Joseph   21
2   Krish   19
3    John   18
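The same data can be supplied as NumPy arrays instead of lists. The sketch below (variable names are assumptions for illustration) also shows the default index and what happens when the arrays differ in length.

```python
import numpy as np
import pandas as pd

data = {'Name': np.array(['Tom', 'Joseph', 'Krish', 'John']),
        'Age': np.array([20, 21, 19, 18])}
df_arrays = pd.DataFrame(data)

# The default index is a RangeIndex running from 0 to 3.
print(df_arrays.index)

# Arrays of unequal length are rejected by the constructor.
try:
    pd.DataFrame({'Name': np.array(['Tom', 'Joseph']),
                  'Age': np.array([20, 21, 19])})
    length_error = None
except ValueError as err:
    length_error = err
    print('ValueError:', err)
```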
Create an indexed Dataframe using arrays
Let’s understand the following example, which creates a DataFrame with a custom index supplied as an array.
Example –
# DataFrame using arrays.
import pandas as pd
# assign data of lists.
data = {'Name': ['Renault', 'Duster', 'Maruti', 'Honda City'], 'Ratings': [9.0, 8.0, 5.0, 3.0]}
# Creates pandas DataFrame.
df = pd.DataFrame(data, index=['position1', 'position2', 'position3', 'position4'])
# print the data
print(df)
Output:
                 Name  Ratings
position1     Renault      9.0
position2      Duster      8.0
position3      Maruti      5.0
position4  Honda City      3.0
In the above code, the Name column holds the car names and the Ratings column holds their ratings. The list passed as the index argument supplies the row labels.
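Since the index and the column names together act as an address for each element, a custom index can be used for label-based lookups. A minimal sketch using the .loc accessor on the same data:

```python
import pandas as pd

data = {'Name': ['Renault', 'Duster', 'Maruti', 'Honda City'],
        'Ratings': [9.0, 8.0, 5.0, 3.0]}
df = pd.DataFrame(data, index=['position1', 'position2', 'position3', 'position4'])

# .loc takes a row label and a column name and returns the value stored there.
rating = df.loc['position2', 'Ratings']
print(rating)  # 8.0

# Passing only a row label returns the whole row as a Series.
row = df.loc['position3']
print(row['Name'])  # Maruti
```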
Create Dataframe from list of dicts
We can pass a list of dictionaries as input data to create a pandas DataFrame. By default, the dictionary keys are used as the column names. Let’s understand the following example.
Example –
# the example is to create
# Pandas DataFrame by lists of dicts.
import pandas as pd
# assign values to lists.
data = [{'A': 10, 'B': 20, 'C': 30}, {'x': 100, 'y': 200, 'z': 300}]
# Creates DataFrame.
df = pd.DataFrame(data)
# Print the data
print(df)
Output:
      A     B     C      x      y      z
0  10.0  20.0  30.0    NaN    NaN    NaN
1   NaN   NaN   NaN  100.0  200.0  300.0
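In the output above, every key that is missing from a dictionary shows up as NaN in that row. A more typical case, sketched below with made-up values, is a list of dictionaries that mostly share the same keys, where only the absent entries become NaN.

```python
import pandas as pd

# The second record has no 'C' key (illustrative data).
records = [{'A': 10, 'B': 20, 'C': 30}, {'A': 100, 'B': 200}]
df_records = pd.DataFrame(records)
print(df_records)

# The missing entry is stored as NaN.
print(df_records['C'].isna())
```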
Create Dataframe using the zip() function
The zip() function pairs up the elements of two lists into tuples, which can then be passed to the DataFrame constructor. Let’s understand the following example.
Example –
# The example is to create
# pandas dataframe from lists using zip.
import pandas as pd
# First list
Name = ['tom', 'krish', 'arun', 'juli']
# Second list
Marks = [95, 63, 54, 47]
# Pair up the elements of the two lists with zip()
# and collect the pairs in a list of tuples.
list_tuples = list(zip(Name, Marks))
# Print the list of tuples.
print(list_tuples)
# Convert the list of tuples into
# a pandas DataFrame.
dframe = pd.DataFrame(list_tuples, columns=['Name', 'Marks'])
# Print data.
print(dframe)
Output:
[('tom', 95), ('krish', 63), ('arun', 54), ('juli', 47)]
    Name  Marks
0    tom     95
1  krish     63
2   arun     54
3   juli     47
Create Dataframe from Dicts of series
A dictionary of Series can also be passed to create a DataFrame. The resulting index is the union of the indexes of all the Series passed. Let’s understand the following example.
Example –
# Pandas Dataframe from Dicts of series.
import pandas as pd
# Initialize data to Dicts of series.
d = {'Electronics': pd.Series([97, 56, 87, 45], index=['John', 'Abhinay', 'Peter', 'Andrew']),
     'Civil': pd.Series([97, 88, 44, 96], index=['John', 'Abhinay', 'Peter', 'Andrew'])}
# creates Dataframe.
dframe = pd.DataFrame(d)
# print the data.
print(dframe)
Output:
         Electronics  Civil
John              97     97
Abhinay           56     88
Peter             87     44
Andrew            45     96
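Both Series in the example above share the same index, so the union is not visible. As a sketch with illustrative data, the Series below have only partly overlapping indexes. The resulting DataFrame has one row for every label that appears in either Series, with NaN where a Series has no value for that label.

```python
import pandas as pd

d = {'Electronics': pd.Series([97, 56, 87], index=['John', 'Abhinay', 'Peter']),
     'Civil': pd.Series([88, 44, 96], index=['Abhinay', 'Peter', 'Andrew'])}
dframe_union = pd.DataFrame(d)
print(dframe_union)

# 'John' has no Civil score and 'Andrew' has no Electronics score.
print(dframe_union.isna())
```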
How to create new variables in the data frame
Creating a new variable in a pandas data frame is an easy task! You can either pass the values of the new column directly or generate them from the existing columns.
The code snippet shown below creates three new columns based on the Age column.
# Defining Employee Data
import pandas as pd
import numpy as np
EmployeeData = pd.DataFrame({'Name': ['ram', 'ravi', 'sham', 'sita', 'gita'],
                             'id': [101, 102, 103, 104, 105],
                             'Gender': ['M', 'M', 'M', 'F', 'F'],
                             'Age': [21, 25, 24, 28, 25]})
# Printing data
print(EmployeeData)
# Creating a new variable in data based on existing variable
EmployeeData['NewAge'] = EmployeeData['Age'] + 10
# Printing data
print(EmployeeData)
# Creating a new variable in data based on existing variable
EmployeeData['AgeSquared'] = EmployeeData['Age'] ** 2
# Printing data
print(EmployeeData)
# Creating a new variable in data based on existing variable
EmployeeData['AgeLOG'] = np.log(EmployeeData['Age'])
# Printing data
print(EmployeeData)
Sometimes the logic to be applied for each value of an existing column may be a little complex to fit in one line, so we define a function for that and apply that function to each value of the column. The results are stored as a new column.
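A minimal sketch of this approach uses the apply method of a column. The age_group function and its cutoff of 25 are hypothetical, chosen only to illustrate logic that does not fit neatly on one line.

```python
import pandas as pd

EmployeeData = pd.DataFrame({'Name': ['ram', 'ravi', 'sham', 'sita', 'gita'],
                             'Age': [21, 25, 24, 28, 25]})

def age_group(age):
    """Hypothetical rule: ages below 25 are 'Junior', the rest 'Senior'."""
    if age < 25:
        return 'Junior'
    return 'Senior'

# apply() calls age_group once for each value in the Age column
# and the results are stored as a new column.
EmployeeData['AgeGroup'] = EmployeeData['Age'].apply(age_group)
print(EmployeeData)
```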