
Openpyxl - Transfer Range Of Rows From A Worksheet To Another

I want to take a certain part of data from a sheet and copy it to another sheet. So far, I have a dictionary with key as start row and value as end row. Using this, I would like to

Solution 1:

from openpyxl import load_workbook
from itertools import product

filename = 'wetransfer-a483c9/testFile.xlsx'
wb = load_workbook(filename)

sheets = wb.sheetnames[1:]

Where sheets would be ['Table 1', 'Table 2', 'Table 3']

# access the main worksheet
ws = wb['Main']

First, get the boundaries (start-/endpoint) for each Table

span = []
for row in ws:
    for cell in row:
        # restrict the search to column 2, which is where the Table entries are;
        # this also avoids the int error, since integers are not iterable
        if (cell.value
                and (cell.column == 2)
                and ("Table" in cell.value)):
            span.append(cell.row)

# add sheet's length -> allows us to effectively capture the data boundaries
span.append(ws.max_row + 1)

Result span : [1, 29, 42, 58]
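The search logic in this step can be tried without a workbook. The following is a minimal sketch that uses a plain Python list as a stand-in for the values in column 2. The helper name `find_table_rows` and the sample values are hypothetical, and the explicit `isinstance` check is an assumption added here to skip non-string values; it is not part of the code above.

```python
def find_table_rows(column_values):
    """Return 1-based row numbers whose value is a string containing 'Table'."""
    rows = []
    for row_number, value in enumerate(column_values, start=1):
        # isinstance skips ints, floats and None, where `"Table" in value`
        # would raise a TypeError
        if isinstance(value, str) and "Table" in value:
            rows.append(row_number)
    return rows


# hypothetical column contents: two table headers with data in between
sample_column = ["Table 1", 10, None, "Table 2", 3.5, "total"]
starts = find_table_rows(sample_column)
# add the length + 1 so that the last table also has an end boundary
starts.append(len(sample_column) + 1)
print(starts)
```

With this sample data, the two headers sit on rows 1 and 4, and the appended value 7 plays the same role as `ws.max_row + 1`.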

Second, get the pairing of boundaries and convert each pair to string format. The +1 added to the sheet's length above ensures the last row is included when capturing the tables. Since openpyxl refers to row ranges in string form with 1-based, inclusive indexing, you take one off each end point instead of adding 1.

boundaries = [":".join(map(str, (start, end - 1))) for start, end in zip(span, span[1:])]

Result boundaries : ['1:28', '29:41', '42:57']
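To see how the pairing works in isolation, here is a small sketch that starts from the `span` values shown earlier and needs no workbook. The intermediate name `pairs` is introduced only for illustration, and an f-string is used in place of `":".join(...)`; both forms should give the same strings.

```python
span = [1, 29, 42, 58]

# zip(span, span[1:]) pairs each start row with the next table's start row
pairs = list(zip(span, span[1:]))

# subtract 1 from each end so the range stops on the row before the next table
boundaries = [f"{start}:{end - 1}" for start, end in pairs]

print(pairs)
print(boundaries)
```

The last pair ends at 58, which is `ws.max_row + 1`, so after subtracting 1 the final range stops on the sheet's last row.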

Third, create a cartesian product of the main sheet with the boundaries and the other sheets. Note that boundaries and sheets are zipped, so essentially they are a pair. As such, we paired each table with a boundary:

# table 1 is bound to 1:28,
# table 2 is bound to 29:41, ...

Next, we combine the main sheet with the pair, so main sheet is paired with (table 1, 1:28). The same main sheet is paired with (table 2, 29:41) ...
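The pairing described here can be inspected with plain strings. In this sketch the string `"Main"` is only a placeholder standing in for the worksheet object, and the name `combos` is introduced for illustration.

```python
from itertools import product

boundaries = ['1:28', '29:41', '42:57']
sheets = ['Table 1', 'Table 2', 'Table 3']
main = "Main"  # placeholder string standing in for the worksheet object

# zip pairs each boundary with its sheet; product attaches the single
# main sheet to every one of those pairs
combos = list(product([main], zip(boundaries, sheets)))

for main_sheet, (ref, table) in combos:
    print(main_sheet, ref, table)
```

Because the first iterable holds a single item, `product` yields exactly one combination per (boundary, sheet) pair, which is the same as looping over the zipped pairs with the main sheet held fixed.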

Fourth, get the data within the ranges. Since we have successfully paired the main sheet with every pair of table and boundary, we can safely get the data for that particular region and shift it to the particular table.

So table 1 in the main sheet refers to 1:28, since it is bound to this particular table. When it's done with table 1, it returns to the loop and starts at "Table 2", selecting only "29:41" since this is the limit in this section, and so on.

for main,(ref, table) in product([ws],zip(boundaries, sheets)):

    sheet_content = main[ref]
    # append rows to the specified table
    for row in sheet_content:
        # here we iterate through the main sheet:
        # get one row of data, append it to the table,
        # move to the next row, append it beneath the previous one,
        # and repeat until the boundary has been exhausted
        wb[table].append([cell.value for cell in row])
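The effect of this loop can be previewed on plain lists before touching a real file. This is only a sketch under stated assumptions: the helper `split_rows` and the sample rows are hypothetical, nested lists stand in for worksheet rows, and a dictionary of lists stands in for the target sheets.

```python
def split_rows(rows, boundaries, names):
    """Copy each inclusive, 1-based 'start:end' slice of rows under its name."""
    tables = {name: [] for name in names}
    for ref, name in zip(boundaries, names):
        start, end = (int(part) for part in ref.split(":"))
        # rows are numbered from 1 and the end is inclusive,
        # so the list slice runs from start - 1 up to end
        for row in rows[start - 1:end]:
            tables[name].append(list(row))
    return tables


# hypothetical data: six rows holding two tables of three rows each
rows = [["Table 1"], ["a"], ["b"], ["Table 2"], ["c"], ["d"]]
tables = split_rows(rows, ["1:3", "4:6"], ["Table 1", "Table 2"])
print(tables["Table 2"])
```

Each target list receives only the rows inside its own boundary, in the same order as in the source.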
    

Finally, save your file.

wb.save(filename)
