Python Pandas Tutorial
Repository
https://github.com/pandas-dev/pandas
What Will I Learn?
- Python Pandas - DataFrame
- Python Pandas - Panel
Requirements
- The standard Python distribution doesn't come bundled with the Pandas module. The lightweight way to get it is to install Pandas with the popular Python package installer, pip:
pip install pandas
If you install the Anaconda Python distribution, Pandas is installed by default. The following distributions bundle the SciPy stack:
Windows
Anaconda (from https://www.continuum.io) is a free Python distribution for SciPy stack. It is also available for Linux and Mac.
Canopy (https://www.enthought.com/products/canopy/) is available as both a free and a commercial distribution with the full SciPy stack for Windows, Linux and Mac.
Python (x,y) is a free Python distribution with SciPy stack and Spyder IDE for Windows OS. (Downloadable from http://python-xy.github.io/)
Difficulty
- Intermediate
Tutorial
A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.
####Features of DataFrame
- Columns can potentially be of different types
- Size: mutable
- Labeled axes (rows and columns)
- Can perform arithmetic operations on rows and columns
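The features listed above can be seen in a few lines. This is only a minimal sketch: the student records, the column names and the row labels `s1` to `s3` are made up for illustration and do not come from the tutorial.

```python
import pandas as pd

# Hypothetical student records, used only for illustration
students = pd.DataFrame(
    {
        "Name": ["Alex", "Bob", "Clarke"],
        "Age": [10, 12, 13],
        "Score": [81.5, 77.0, 92.25],
    },
    index=["s1", "s2", "s3"],  # labeled row axis
)

# Columns can hold different types (text, integers, floats)
print(students.dtypes)

# Size is mutable: a column can be added after creation
students["Passed"] = students["Score"] >= 80

# Arithmetic is applied element-wise to a whole column
students["AgeNextYear"] = students["Age"] + 1

print(students)
```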
####Structure
Let us assume that we are creating a data frame with student data.
####Create DataFrame
A pandas DataFrame can be created using various inputs like
- Lists
- dict
- Series
- Numpy ndarrays
- Another DataFrame
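Two of the inputs in this list, NumPy ndarrays and another DataFrame, can be sketched as follows. The column labels `x` and `y` are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# From a 2D NumPy ndarray; the column labels are made up for this example
arr = np.array([[1, 2], [3, 4], [5, 6]])
df_from_array = pd.DataFrame(arr, columns=["x", "y"])

# From another DataFrame; copy=True requests an independent copy
df_copy = pd.DataFrame(df_from_array, copy=True)
df_copy.loc[0, "x"] = 100  # changes the copy only

print(df_from_array)
print(df_copy)
```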
####Create an Empty DataFrame
The most basic DataFrame that can be created is an empty DataFrame.
# import the pandas library, aliased as pd
import pandas as pd
df = pd.DataFrame()
print(df)
Output
Empty DataFrame
Columns: []
Index: []
Create a DataFrame from Lists
The DataFrame can be created using a single list or a list of lists.
Example 1
import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)
Output
![screenshotAtUploadCC_1527224341368.png](https://cdn.steemitimages.com/DQmNntq1cNrcVi78gMA9x4GAxend48uSLB9nxiKBKsMVU22/screenshotAtUploadCC_1527224341368.png)
Example 2
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)
Output
Create a DataFrame from Dict of ndarrays / Lists
All the ndarrays must be of the same length. If an index is passed, then the length of the index should equal the length of the arrays.
If no index is passed, then by default the index will be range(n), where n is the array length.
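To make the index rule concrete, here is a small sketch that builds the same data with the default index, with an explicit index, and with an index of the wrong length. The labels `rank1` to `rank4` are illustrative assumptions.

```python
import pandas as pd

data = {"Name": ["Tom", "Jack", "Steve", "Ricky"], "Age": [28, 34, 29, 42]}

# Default index: 0 .. n-1
df_default = pd.DataFrame(data)

# Explicit index: must have the same length as the arrays (4 here)
df_labeled = pd.DataFrame(data, index=["rank1", "rank2", "rank3", "rank4"])
print(df_labeled)

# An index of the wrong length is rejected with a ValueError
try:
    pd.DataFrame(data, index=["rank1", "rank2"])
except ValueError as exc:
    print("ValueError:", exc)
```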
Example
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print(df)
Output
Create a DataFrame from Dict of Series
A dictionary of Series can be passed to form a DataFrame. The resultant index is the union of all the Series indexes passed.
Example
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df)
Output
Column Addition
We will understand this by adding a new column to an existing data frame
Example
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
# Add a new column to an existing DataFrame by assigning a new Series to a column label
print("Adding a new column by passing as Series:")
df['three'] = pd.Series([10,20,30], index=['a','b','c'])
print(df)
print("Adding a new column using the existing columns in DataFrame:")
df['four'] = df['one'] + df['three']
print(df)
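One detail of the example above is worth checking explicitly: new columns are aligned on the row labels, so a label missing from the assigned Series ends up as NaN, and that NaN carries through arithmetic. A minimal sketch using the same data:

```python
import pandas as pd

d = {
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"]),
}
df = pd.DataFrame(d)

# The new Series has no label 'd', so row 'd' gets NaN in this column
df["three"] = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Element-wise addition; NaN propagates for row 'd'
df["four"] = df["one"] + df["three"]

print(df)
print(df["four"].isna())
```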
Column Deletion
Columns can be deleted or popped; let us take an example to understand how
Example
# Build the same DataFrame as before, then delete columns from it
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
'three' : pd.Series([10,20,30], index=['a','b','c'])}
df = pd.DataFrame(d)
print("Our dataframe is:")
print(df)
# using the del statement
print("Deleting the first column using the del statement:")
del df['one']
print(df)
# using the pop method
print("Deleting another column using the pop method:")
df.pop('two')
print(df)
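A difference between the two approaches is that `pop` hands back the removed column, while `del` returns nothing. The sketch below also shows `drop`, which is not covered in the text above and is included only as a non-mutating alternative; the small frame is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3], "two": [4, 5, 6], "three": [7, 8, 9]})

# pop removes the column in place and returns it as a Series
popped = df.pop("two")
print(popped.tolist())

# drop returns a new DataFrame and leaves the original untouched
trimmed = df.drop(columns=["three"])
print(list(df.columns))
print(list(trimmed.columns))
```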
Row Selection, Addition, and Deletion
We will now understand row selection, addition and deletion through examples. Let us begin with the concept of selection.
Selection by Label
Rows can be selected by passing a row label to the loc indexer.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df.loc['b'])
Output
one    2.0
two    2.0
Name: b, dtype: float64
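The section title also mentions row addition and deletion, plus selection can be done by position as well as by label. The following is a minimal sketch of those three operations under the assumption of a recent pandas release, where rows are added with `pd.concat`; the two small frames are made up for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])

# Selection by integer position
first_row = df.iloc[0]
print(first_row)

# Addition: concatenate rows; the original labels (0, 1) are kept, so they repeat
combined = pd.concat([df, df2])
print(combined)

# Deletion: drop removes every row that carries the given label
remaining = combined.drop(0)
print(remaining)
```

Because the label 0 appears twice after concatenation, dropping it removes two rows. Passing `ignore_index=True` to `pd.concat` would renumber the rows instead.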
Python Pandas - Panel
A Panel is a 3D container of data. The term Panel data is derived from econometrics and is partially responsible for the name pandas: pan(el)-da(ta)-s. Note that Panel was deprecated in pandas 0.20.0 and removed in 0.25.0, so the Panel examples in this section need an older pandas release.
The names for the 3 axes are intended to give some semantic meaning to describing operations involving panel data. They are:
items − axis 0, each item corresponds to a DataFrame contained inside.
major_axis − axis 1, it is the index (rows) of each of the DataFrames.
minor_axis − axis 2, it is the columns of each of the DataFrames.
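On pandas releases where `Panel` is no longer available (it was removed in 0.25), the same three axes can be approximated with a DataFrame that has a two-level row index. This is a stand-in sketch, not the Panel API itself; the level names `item` and `major` and the sample values are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Two hypothetical items that share the same row labels
frames = {
    "Item1": pd.DataFrame(np.arange(12).reshape(4, 3)),
    "Item2": pd.DataFrame(np.arange(8).reshape(4, 2)),
}

# The dict keys become the outer index level (the "items" axis);
# the inner level plays the role of major_axis and the columns of minor_axis
stacked = pd.concat(frames, names=["item", "major"])

print(stacked.index.nlevels)
print(stacked.shape)
print(stacked)
```

Item2 has only two columns, so its third column is filled with NaN in the combined frame.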
pandas.Panel()
A Panel can be created using the following constructor −
pandas.Panel(data, items, major_axis, minor_axis, dtype, copy)
Create Panel
A Panel can be created in multiple ways, such as:
- From ndarrays
- From dict of DataFrames
From 3D ndarray
# creating a panel from a 3D ndarray
import pandas as pd
import numpy as np
data = np.random.rand(2,4,5)
p = pd.Panel(data)
print(p)
output
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 4 (major_axis) x 5 (minor_axis)
Items axis: 0 to 1
Major_axis axis: 0 to 3
Minor_axis axis: 0 to 4
From dict of DataFrame Objects
# creating a panel from a dict of DataFrames
import pandas as pd
import numpy as np
data = {'Item1' : pd.DataFrame(np.random.randn(4, 3)),
'Item2' : pd.DataFrame(np.random.randn(4, 2))}
p = pd.Panel(data)
print(p)
output
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 4 (major_axis) x 3 (minor_axis)
Items axis: Item1 to Item2
Major_axis axis: 0 to 3
Minor_axis axis: 0 to 2
Create an Empty Panel
#creating an empty panel
import pandas as pd
p = pd.Panel()
print(p)
output
<class 'pandas.core.panel.Panel'>
Dimensions: 0 (items) x 0 (major_axis) x 0 (minor_axis)
Items axis: None
Major_axis axis: None
Minor_axis axis: None
Selecting the Data from Panel
Select the data from the panel using −
- Items
- Major_axis
- Minor_axis
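For readers on a pandas release without `Panel`, selection along each of these three axes can be sketched on a two-level DataFrame built with `pd.concat`. This is an approximation under stated assumptions, not the Panel selection API: the level names `item` and `major` and the sample values are made up for the example.

```python
import numpy as np
import pandas as pd

frames = {
    "Item1": pd.DataFrame(np.arange(12).reshape(4, 3)),
    "Item2": pd.DataFrame(np.arange(12, 24).reshape(4, 3)),
}
stacked = pd.concat(frames, names=["item", "major"])

# Items: one whole DataFrame
item1 = stacked.loc["Item1"]

# Major_axis: the same row label across all items
row1 = stacked.xs(1, level="major")

# Minor_axis: one column across all items, reshaped to major x item
col2 = stacked[2].unstack("item")

print(item1)
print(row1)
print(col2)
```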
Using Items
# selecting one item (a DataFrame) from a panel
import pandas as pd
import numpy as np
data = {'Item1' : pd.DataFrame(np.random.randn(4, 3)),
'Item2' : pd.DataFrame(np.random.randn(4, 2))}
p = pd.Panel(data)
print(p['Item1'])
output
Proof of Work Done
https://gist.github.com/faizaraja/9c2c9cbb97e74e7d37e830d5056cfadf