Python Library - Pandas:

NOOR MOHIDEEN S
Published in featurepreneur
May 20, 2021


Introduction:

Pandas is one of the most useful Python libraries for data analysis. Let's look at it briefly, along with some examples.

Pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series.

Uses of Pandas:

Pandas is mainly used for data analysis. It can import data from various file formats and sources, such as comma-separated values (CSV), JSON, SQL, and Microsoft Excel. It also supports data manipulation operations such as merging, reshaping, and selecting, as well as data cleaning and data wrangling.
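As a minimal sketch of the import step, the snippet below reads CSV text into a DataFrame. The sample data and column names are made up for illustration, and io.StringIO stands in for a real file path so that the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a file path such as "people.csv".
csv_text = "name,age\nAlex,10\nBob,12\nClarke,13\n"

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (rows, columns)
print(list(df.columns))  # column names taken from the header row
```

The other formats follow the same pattern through pd.read_json, pd.read_excel, and pd.read_sql.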

Key Features of Pandas

  • Fast and efficient DataFrame object with default and customized indexing.
  • Tools for loading data into in-memory data objects from different file formats.
  • Data alignment and integrated handling of missing data.
  • Reshaping and pivoting of datasets.
  • Label-based slicing, indexing, and subsetting of large data sets.
  • Columns from a data structure can be deleted or inserted.
  • Group by data for aggregation and transformations.
  • High-performance merging and joining of data
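A few of these features can be sketched in a handful of lines. The tables, column names, and values below are invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data only; the names and values are made up.
sales = pd.DataFrame({
    "city": ["Chennai", "Chennai", "Madurai", "Madurai"],
    "amount": [100.0, np.nan, 50.0, 70.0],
})
regions = pd.DataFrame({
    "city": ["Chennai", "Madurai"],
    "region": ["North", "South"],
})

# Handling of missing data: replace NaN with 0.
sales["amount"] = sales["amount"].fillna(0)

# Group by a column and aggregate.
totals = sales.groupby("city")["amount"].sum()

# Merge (join) two tables on a shared column.
merged = sales.merge(regions, on="city")

# Insert a column, then delete it again.
merged["double"] = merged["amount"] * 2
merged = merged.drop(columns=["double"])

print(totals)
print(merged)
```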

Install Pandas Library:

Pandas can be installed using the popular Python package installer, pip.

pip install pandas

Pandas deals with the following data structures:

  • Series
  • DataFrame

Older versions also included Panel, a three-dimensional structure, but it was removed in pandas 0.25.

Series:

Series is a one-dimensional labeled array capable of holding data of any type. The axis labels are collectively called the index.

pandas.Series(data, index, dtype, copy)

The parameters of the constructor are as follows:

Data:

  • data takes various forms, such as an ndarray, a list, or a constant.

Index:

  • Index values must be hashable and the same length as data; duplicate labels are allowed. Defaults to 0, 1, ..., n-1 (like np.arange(n)) if no index is passed.

Dtype:

  • dtype is the data type. If None, the data type is inferred from the data.

Copy:

  • copy controls whether the input data is copied. Default False.
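To see how these parameters behave, here is a small sketch. The values and labels are arbitrary, and keyword arguments are used so the example does not depend on argument order.

```python
import numpy as np
import pandas as pd

# data as a list, with explicit index labels and an explicit dtype.
s = pd.Series(data=[10, 12, 13], index=["a", "b", "c"], dtype="float64")

# No index passed: labels default to 0, 1, ..., n-1.
t = pd.Series(np.array([1, 2, 3]))

# A constant (scalar) is repeated for every index label.
u = pd.Series(5, index=["x", "y"])

print(s["b"])         # label-based access
print(list(t.index))  # default integer labels
print(u)
```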

Example for Series:

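# Import the pandas library and alias it as pd

import pandas as pd

# Pass dtype explicitly: newer pandas versions give an empty Series the object dtype by default

s = pd.Series(dtype="float64")

print(s)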

Its output is as follows:

Series([], dtype: float64)

DataFrames:

A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.

Features of DataFrame:

  • Columns can be of different types
  • Size is mutable
  • Labeled axes (rows and columns)
  • Arithmetic operations can be performed on rows and columns
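The features listed above can be illustrated with a short sketch. The names, values, and the extra columns Score and AgeNextYear are made up for the example.

```python
import pandas as pd

# Illustrative data; names and values are made up.
df = pd.DataFrame({"Name": ["Alex", "Bob"], "Age": [10, 12]})

# Columns can hold different types (text and integers here).
print(df.dtypes)

# Size is mutable: add a column, then remove it.
df["Score"] = [8.5, 9.0]
df = df.drop(columns=["Score"])

# Axes are labeled: rows by the index, columns by name.
df.index = ["r1", "r2"]

# Arithmetic on a column is applied element-wise.
df["AgeNextYear"] = df["Age"] + 1

print(df)
```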

A pandas DataFrame can be created using the following constructor:

pandas.DataFrame(data, index, columns, dtype, copy)

Create DataFrame:

A pandas DataFrame can be created from various inputs, such as:

  • Lists
  • Dicts
  • Series
  • NumPy ndarrays
  • Another DataFrame
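As a rough sketch, several of these inputs can be passed straight to the constructor. The variable names and values below are illustrative only; creating a DataFrame from lists is covered in its own example later in the article.

```python
import numpy as np
import pandas as pd

# From a dict of lists: keys become column names.
from_dict = pd.DataFrame({"Name": ["Alex", "Bob"], "Age": [10, 12]})

# From a dict of Series: rows are aligned on the Series index,
# and labels missing from one Series are filled with NaN.
from_series = pd.DataFrame({
    "one": pd.Series([1, 2], index=["a", "b"]),
    "two": pd.Series([3, 4, 5], index=["a", "b", "c"]),
})

# From a NumPy ndarray, supplying column names.
from_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["x", "y"])

# From another DataFrame (copy=True makes the new frame independent).
from_frame = pd.DataFrame(from_dict, copy=True)

print(from_dict)
print(from_series)
print(from_array)
print(from_frame)
```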

Create an Empty DataFrame:

# Import the pandas library and alias it as pd

import pandas as pd

df = pd.DataFrame()

print(df)

Its output is as follows:

Empty DataFrame

Columns: []

Index: []

Create a DataFrame from Lists:

import pandas as pd

# Each inner list is one row: [name, age]

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]

df = pd.DataFrame(data, columns=['Name', 'Age'])

print(df)

Its output is as follows:

     Name  Age
0    Alex   10
1     Bob   12
2  Clarke   13

Conclusion:

This article may be seen as a tutorial or a set of definitions, but working through the basics like this is the best way to learn Pandas. So don't dismiss it as just the theory part; try to understand it. If you have read this far, I hope you read the article to your heart's content.


NOOR MOHIDEEN S
featurepreneur

I am an intern at Tactlabs. I'm a Machine Learning Engineer. I'm currently learning everything in data science.