
Openpyxl : Need The Max Number Of Rows In A Column That Has Data In Excel

I need the last row in a particular column that contains data in Excel. In openpyxl sheet.max_row or max_column gets us the maximum row or column in the whole sheet. But what I wan

Solution 1:

"Empty" is a relative concept so your code should be clear about this. The methods in openpyxl are guaranteed to return orthogonal result sets: the length of rows and columns will always be the same.

Using this, we can deduce the highest row in a column whose cell value is not None.

max_row_for_c = max((c.row for c in ws['C'] if c.value is not None))
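One caveat worth sketching: max() raises ValueError when the generator yields nothing, which happens if the column holds no values at all. The example below is a minimal illustration, not openpyxl code: Cell is a hypothetical stand-in for an openpyxl cell (only .row and .value are used), and max_row_with_value is a helper name invented here.

```python
from collections import namedtuple

# Hypothetical stand-in for an openpyxl cell: only .row and .value are used.
Cell = namedtuple("Cell", ["row", "value"])


def max_row_with_value(cells):
    """Return the highest row whose value is not None, or 0 if there is none."""
    # default=0 avoids the ValueError that max() raises on an empty sequence.
    return max((c.row for c in cells if c.value is not None), default=0)


column_with_gap = [Cell(1, "a"), Cell(2, None), Cell(3, "b"), Cell(4, None)]
empty_column = [Cell(1, None), Cell(2, None)]

print(max_row_with_value(column_with_gap))  # 3
print(max_row_with_value(empty_column))     # 0
```

With a real worksheet the same expression should apply to ws['C'] directly, since each cell exposes .row and .value.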

Solution 2:

Question: I want max_column containing data in column 'C'; it should return 10:

Simply count the cells whose cell.value is not empty.
Documentation: Accessing many cells

Python (ws is an openpyxl worksheet)

count = 0
for cell in ws['C']:
    # None is treated as "empty" here
    if cell.value is not None:
        count += 1

Comment: What if we have an empty cell in between?

Track the row index while iterating over the column range, and store it in a maxRowWithData variable whenever a cell has a value. This also works when there are no empty cells in between.

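Python (ws is an openpyxl worksheet)

maxRowWithData = 0
for row_index, cell in enumerate(ws['C'], start=1):
    # keep overwriting, so the last non-empty row wins
    if cell.value is not None:
        maxRowWithData = row_index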

Note: The cell index of openpyxl is 1-based!

Documentation: enumerate(iterable, start=0)
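To make the difference between the two approaches concrete, here is a small sketch that runs without a workbook. The list named values is an invented stand-in for the cell values of one column, read top to bottom, with gaps in it.

```python
# Plain values standing in for the cell values of one column, top to bottom.
values = ["a", "b", None, "c", None]

# Approach 1: count the non-empty cells.
count = sum(1 for v in values if v is not None)

# Approach 2: remember the 1-based index of the last non-empty cell.
maxRowWithData = 0
for row_index, v in enumerate(values, start=1):
    if v is not None:
        maxRowWithData = row_index

print(count)           # 3
print(maxRowWithData)  # 4
```

The count falls short of the last row as soon as there is a gap above the last value, which is why the second approach is the safer one for finding the last row.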


Solution 3:

Why not just take the length of column 'C'? Indexing the sheet with 'C' returns the column's cells as a tuple, so the length of that tuple gives the same output, 10. Note that the tuple runs down to the sheet's last used row, so this only matches when no other column extends further down than column 'C'.

import openpyxl

file = openpyxl.load_workbook('example.xlsx')

# get_sheet_by_name() is deprecated; index the workbook by sheet name instead
current_sheet = file['sheet1']

# all cells of column C, returned as a tuple
column_C = current_sheet['C']

print(len(column_C))

file.close()

Solution 4:

The accepted answer is not correct: if there is an empty cell between two cells with values, it will fail. The following is the correct way.

import openpyxl as xl
import os

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
Dir_Name = os.path.join(BASE_DIR, 'Your_Project_Folder_Name_Here')
xl_file_path = os.path.join(Dir_Name, 'Your_Excel_File_Name_Here.xlsx')
wb_obj = xl.load_workbook(xl_file_path)
sheet_obj = wb_obj.active
number_of_rows = sheet_obj.max_row
last_row_index_with_data = 0  # stays 0 if column 1 is entirely empty
# walk upward from the last used row until a cell in column 1 has a value
while number_of_rows > 0:
    if sheet_obj.cell(number_of_rows, 1).value is not None:
        last_row_index_with_data = number_of_rows
        break
    number_of_rows -= 1

print("last row index having values", last_row_index_with_data)

This way we check from the bottom of the sheet to the top; the first cell we find with a value other than None gives the row index we need.
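The bottom-up scan can also be packaged as a small function that stops at the first hit and returns 0 for an all-empty column. This is only a sketch on a plain list; last_row_from_bottom is a name made up for illustration, and the list stands in for the values of one column.

```python
def last_row_from_bottom(values):
    """Scan upward from the last entry; return the 1-based row of the
    first non-None value found, or 0 if every entry is None."""
    for row in range(len(values), 0, -1):
        if values[row - 1] is not None:
            return row
    return 0


print(last_row_from_bottom(["x", None, "y", None, None]))  # 3
print(last_row_from_bottom([None, None]))                  # 0
```

For a worksheet, the list could be built with something like [c.value for c in sheet_obj['A']]. Scanning from the bottom usually touches fewer cells than a top-down pass when the trailing empty region is short.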


Solution 5:

I think I just found a way using pandas:

import pandas as pd

# lt holds the values to be written to the Excel file;
# sheet is the openpyxl worksheet and col is the 1-based column number

# read the sheet once; pandas treats the first row as the header
panda_xl_rd = pd.read_excel('file.xlsx', 'sheet_Name')  # pandas DataFrame

# dropna() removes the NaN values, otherwise we would get the length of
# the whole sheet. +2 skips the header row and moves to the first row
# right after the last record, where new data can be entered.
next_row = len(panda_xl_rd.iloc[:, col - 1].dropna()) + 2

for index, i in enumerate(lt):
    cellref = sheet.cell(row=next_row + index, column=col)
    cellref.value = i
