Excel
Excel Module for autopylot. This module contains functions for working with excel and spreadsheets
Examples:
>>> excel.get_row_column_count(df=df)
(10, 5)
>>> excel.get_single_cell(df=df,column_name="Column1", cell_number=1)
"abc"
>>> excel.excel_create_file(output_folder="C:\Users\user\Desktop", output_filename="test.xlsx", output_sheetname="Sheet1")
This module contains the following functions:
authenticate_google_spreadsheet(credential_file_path)
: This creates authentication object for google spreadsheet.get_dataframe_from_google_spreadsheet(auth, spreadsheet_url, sheet_name)
: Get dataframe from google spreadsheet.tabular_data_from_website(website_url, table_number)
: Get tabular data from website.upload_dataframe_to_google_spreadsheet(auth, spreadsheet_url, sheet_name, df)
: Upload dataframe to google spreadsheet.create_file(output_folder, output_filename, output_sheetname)
: Create excel file.to_dataframe(input_filepath, input_sheetname, header)
: Convert excel file to dataframe.get_row_column_count(df)
: Get row and column count of dataframe.dataframe_to_excel(df, output_folder, output_filename, output_sheetname, mode)
: Convert dataframe to excel file.set_single_cell(df, column_name, cell_number, value)
: Set single cell value in excel file.get_single_cell(df, column_name, cell_number, header)
: Get single cell value from excel file.get_all_header_columns(df)
: Get all header columns from excel file.get_all_sheet_names(input_filepath)
: Get all sheet names from excel file.drop_columns(df, cols)
: Drop columns from data frame.clear_sheet(df)
: Clear sheet from excel file.remove_duplicates(df, column_name)
: Remove duplicates from excel file.is_NaN(value)
: Check if value is NaN.df_from_list(list_of_lists, column_names)
: Create dataframe from list of lists.df_from_string(df_string, word_delimiter, line_delimiter, column_names)
: Create dataframe from string.df_extract_sub_df(df, row_start, row_end, column_start, column_end)
: Extract sub dataframe from dataframe.set_value_in_df(df, row_number, column_number, value)
: Set value in dataframe.get_value_in_df(df, row_number, column_number)
: Get value from dataframe.df_drop_rows(df, row_start, row_end)
: Drop rows from dataframe.df_vlookup(df1, df2, column_name, how)
: Vlookup in dataframe.df_convert_column_to_type(df, column_name, column_type)
: Convert column to type in dataframe.group_by_column_values_n_split(df, column_name, output_folder, show_output)
: Group by column values and split dataframe.if_value_exists(df, columns, value)
: Check if value exists in dataframe.apply_template_format(raw_data_filepath, raw_data_sheetname,template_filepath, template_sheetname, output_folder, output_filename)
: Apply template format to excel file.concat_all_sheets(input_filepath, sheet_names)
: Concatenate all sheets from excel file.merge_all_files(input_folder, output_folder, output_filename)
: Merge all excel files in a folder.
apply_template_format(raw_data_filepath, raw_data_sheetname, template_filepath, template_sheetname, output_folder=None, output_filename=None)
Applies the template format to the excel raw data
Parameters:
Name | Type | Description | Default |
---|---|---|---|
raw_data_filepath |
str || WindowsPath
|
The raw data filepath. |
required |
raw_data_sheetname |
str
|
The raw data sheetname. |
required |
template_filepath |
str || WindowsPathp
|
The template filepath. |
required |
template_sheetname |
str
|
The template sheetname. |
required |
output_folder |
str || WindowsPath
|
The output folder. |
None
|
output_filename |
str
|
The output filename. |
None
|
Returns:
Type | Description |
---|---|
None
|
None |
Examples:
authenticate_google_spreadsheet(credential_file_path)
Creates authentication object for google spreadsheet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
credential_file_path |
str || WindowsPath
|
Credential file path. |
required |
Returns:
Name | Type | Description |
---|---|---|
auth_object |
object
|
Authentication object. |
Examples:
clear_sheet(df)
concat_all_sheets(input_filepath, sheet_names=None)
Concat all sheets of given excel file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_filepath |
str || WindowsPath
|
The excel file path. |
required |
sheet_names |
list
|
The sheet names as list. |
None
|
Returns:
Name | Type | Description |
---|---|---|
data |
pd.DataFrame
|
The data. |
Examples:
create_file(output_folder, output_filename, output_sheetname='Sheet1')
Creates an excel file with a sheet in the specified folder.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_folder |
str || WindowsPath
|
Output folder path. |
required |
output_filename |
str
|
Output file name. |
required |
output_sheetname |
str || list
|
Output sheet name. |
'Sheet1'
|
Returns:
Type | Description |
---|---|
None
|
None |
Examples:
dataframe_to_excel(df, output_folder, output_filename, output_sheetname='Sheet1', mode='a')
Converts the dataframe to excel file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
Dataframe of the excel file. |
required |
output_folder |
str || WindowsPath
|
Output folder path. |
required |
output_filename |
str
|
Output file name. |
required |
output_sheetname |
str
|
Output sheet name. |
'Sheet1'
|
mode |
str
|
Mode of the excel file. |
'a'
|
Returns:
Type | Description |
---|---|
None
|
None |
Examples:
df_convert_column_to_type(df, column_name, column_type)
Converts the column to the given type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
Input dataframe. |
required |
column_name |
str
|
Name of the column. |
required |
column_type |
str
|
Type of the column. |
required |
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
Converted dataframe. |
Examples:
df_drop_rows(df, row_start, row_end)
Drops rows from dataframe
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
dataframe |
required |
row_start |
int
|
row start (inclusive) |
required |
row_end |
int
|
row end (exclusive) |
required |
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
dataframe with rows dropped |
Examples:
df_extract_sub_df(df, row_start, row_end, column_start, column_end)
Extracts sub dataframe from the given dataframe
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
dataframe |
required |
row_start |
int
|
row start (inclusive) |
required |
row_end |
int
|
row end (exclusive) |
required |
column_start |
int
|
column start (inclusive) |
required |
column_end |
int
|
column end (exclusive) |
required |
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
sub dataframe |
Examples:
df_from_list(list_of_lists, column_names)
Converts list of lists to dataframe
Parameters:
Name | Type | Description | Default |
---|---|---|---|
list_of_lists |
list
|
list of lists to be converted to dataframe |
required |
column_names |
list
|
column names |
required |
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
dataframe |
Examples:
df_from_string(df_string, word_delimiter=' ', line_delimiter='\n', column_names=None)
Converts string to dataframe
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df_string |
str
|
string to be converted to dataframe |
required |
word_delimiter |
str
|
word delimiter. |
' '
|
line_delimiter |
str
|
line delimiter. |
'\n'
|
column_names |
list
|
column names. |
None
|
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
dataframe |
Examples:
df_vlookup(df1, df2, column_name, how='left')
Performs vlookup operation on two dataframes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df1 |
pd.DataFrame
|
First dataframe. |
required |
df2 |
pd.DataFrame
|
Second dataframe. |
required |
column_name |
str
|
Column name on which vlookup is to be performed. |
required |
how |
str
|
Type of vlookup. Default is 'left'. |
'left'
|
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
Dataframe after vlookup operation. |
Examples:
drop_columns(df, cols)
Drops the columns from the excel file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
Dataframe of the excel file. |
required |
cols |
str || list
|
Column name to be dropped. |
required |
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
Modified dataframe |
Examples:
get_all_header_columns(df)
get_all_sheet_names(input_filepath)
get_dataframe_from_google_spreadsheet(auth, spreadsheet_url, sheet_name='Sheet1')
Get dataframe from google spreadsheet
Parameters:
Name | Type | Description | Default |
---|---|---|---|
auth |
object
|
Authentication object. |
required |
spreadsheet_url |
str
|
Spreadsheet URL. |
required |
sheet_name |
str
|
Sheet name. |
'Sheet1'
|
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
Dataframe object. |
Examples:
get_row_column_count(df)
get_single_cell(df, column_name, cell_number, header=1)
Gets the text from the desired column/cell number for the given excel file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
Dataframe of the excel file. |
required |
column_name |
str
|
Column name of the excel file. |
required |
cell_number |
int
|
Cell number of the excel file. |
required |
header |
int
|
Header row number. |
1
|
Returns:
Name | Type | Description |
---|---|---|
data |
str
|
Text from the desired column/cell number for the given excel file |
Examples:
get_value_in_df(df, row_number, column_number)
Gets value from dataframe
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
dataframe |
required |
row_number |
int
|
Row number of the cell |
required |
column_number |
int
|
Column number of the cell |
required |
Returns:
Name | Type | Description |
---|---|---|
value |
str
|
value in the cell |
Examples:
group_by_column_values_n_split(df, column_name, output_folder, show_output=False)
Groups the dataframe by the given column and splits the dataframe into multiple excel files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
Input dataframe. |
required |
column_name |
str
|
Name of the column. |
required |
output_folder |
str
|
Output folder path. |
required |
show_output |
bool
|
If True, shows the output. Default is False. |
False
|
Returns:
Type | Description |
---|---|
None
|
None |
Examples:
if_value_exists(df, columns, value)
Checks if the given value exists in the given column.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
Input dataframe. |
required |
columns |
str
|
Name of the column. |
required |
value |
str
|
Value to be searched. |
required |
Returns:
Name | Type | Description |
---|---|---|
Result |
bool
|
True if value exists, else False. |
Examples:
is_NaN(value)
merge_all_files(input_folder, output_folder, output_filename)
Merges all the excel files in the given folder.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_folder |
str || Windows Path
|
Path of the input folder. |
required |
output_folder |
str || Windows Path
|
Path of the output folder. |
required |
output_filename |
str
|
Name of the output file. |
required |
Returns:
Type | Description |
---|---|
None
|
None |
Examples:
remove_duplicates(df, column_name)
Removes the duplicates from the given column
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
Dataframe of the excel file. |
required |
column_name |
str || int || list
|
Column name of the excel file.. |
required |
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
Modified dataframe |
Examples:
set_single_cell(df, column_name, cell_number, value)
Description
Writes the given text to the desired column/cell number for the given excel file
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
Dataframe of the excel file. |
required |
column_name |
str
|
Column name of the excel file. |
required |
cell_number |
int
|
Cell number of the excel file. |
required |
value |
str
|
Text to be written to the excel file. |
required |
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
Modified dataframe |
Examples:
set_value_in_df(df, row_number, column_number, value)
Sets value in dataframe
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
pd.DataFrame
|
dataframe to be modified |
required |
row_number |
int
|
Row number of the cell |
required |
column_number |
int
|
Column number of the cell |
required |
value |
str
|
value to be set in the cell |
required |
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
dataframe with value set |
Examples:
tabular_data_from_website(website_url, table_number=1)
Returns a dataframe from a website table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
website_url |
str
|
Website URL. |
required |
table_number |
int
|
Table number. |
1
|
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
Dataframe containing tabular data from website. |
Examples:
to_dataframe(input_filepath, input_sheetname, header=1)
Converts excel file to dataframe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_filepath |
str || WindowsPath
|
Input file path. |
required |
input_sheetname |
str
|
Input sheet name. |
required |
header |
int
|
Header row number. |
1
|
Returns:
Name | Type | Description |
---|---|---|
df |
pd.DataFrame
|
Dataframe of the excel file. |
Examples:
upload_dataframe_to_google_spreadsheet(auth, spreadsheet_url, sheet_name, df)
Uploads a dataframe to a google spreadsheet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
auth |
object
|
Authentication object. |
required |
spreadsheet_url |
str
|
Spreadsheet URL. |
required |
sheet_name |
str
|
Sheet name. |
required |
df |
pd.DataFrame
|
Dataframe object. |
required |
Returns:
Type | Description |
---|---|
None
|
None |
Examples: