Conditional formatting with pandas.ExcelWriter: unable to change format by hand and read format back with openpyxl

I am saving a pandas dataframe as Excel with pandas.ExcelWriter, formatting some cells in red with conditional_format.

The formatting succeeds, but I am then unable to read the formatting information back using openpyxl.load_workbook: when I ask for the color of the cell I have colored, the answer is always “None”. See the code below, which prints the following (notice the “rgb=None” in the last line):

Cell value: apple Font info: <openpyxl.styles.fonts.Font object> Parameters: name='Calibri', charset=None, family=2.0, b=False, i=False, strike=None, outline=None, shadow=None, condense=None, color=<openpyxl.styles.colors.Color object> Parameters: auto=None, tint=0.0, type='theme', theme=1, rgb=None, indexed=None, extend=None, sz=11.0, u=None, vertAlign=None, scheme='minor' 

Moreover, the color of the cell seems locked (i.e., I cannot change it manually after the Excel has been saved). Is there a way to fix these issues?

Many thanks in advance!

(NB. I am using Python 3.5, pandas==0.22.0, openpyxl==2.6.2)

    import pandas as pd     from openpyxl import load_workbook      # SAVING SAMPLE EXCEL FILE WITH ONE CELL COLORED IN RED      writer = pd.ExcelWriter('MyExcel.xlsx', engine = 'xlsxwriter')      df = pd.DataFrame(columns = ['FRUIT', 'ANIMAL'], data = [['apple', 'cow'], ['banana', 'cat'], ['pear', 'zebra']])      df.to_excel(writer, sheet_name = 'sheet_1', header = True, index = False)      fmt = writer.book.add_format({'font_color' : 'red'})     writer.sheets['sheet_1'].conditional_format('A2', {'type' : 'unique', 'format' : fmt})      writer.close()      # READING FORMATTING INFORMATION BACK      wb = load_workbook("MyExcel.xlsx", read_only = True)     ws = wb['sheet_1']      print("Cell value: {}\nFont info:\n{}".format(ws.cell(2,1).value,     ws.cell(2,1).font))      wb.close() 

Pandas или openpyxl?

Хочу написать скрипт, который структурирует экселевские данные , и размещает их опять в экселе но по-другому (по др. ячейкам). Что-то вроде сводной таблицы, но с некоторыми дополнениями, т.к. свод. таблица, не совсем выполняет необходимую задачу. К какой библиотеке лучше обратиться к pandas или openpyxl? И какая между ними разница?

Tratamento ficheiro excel usando python e openpyxl

Comecei a explorar o openpyxl de forma a conseguir trabalhar com um ficheiro em Excel. Tenho um pequeno projeto, que funciona na sua maioria, no entanto ainda não funciona como eu queria. Eu queria adicionar os valores por cada separador, mas só estou a conseguir adicionar todos os valores num separador. Este é o meu código:

import openpyxl as excel   def trata_excel(nomeficheiro):   # Declaração de variáveis livro = excel.load_workbook(nomeficheiro) folha = livro['Folha1'] cabeçalhos = {}  # vai receber os cabeçalhos e o número da respetiva coluna alvos = []  # aqui vou percorrer a primeira célula de todas as colunas e retirar os cabeçalhos for coluna in range(1,folha.max_column +1):     celula = folha.cell(row=1, column=coluna)     # Adicionando-os a um dicionário     cabeçalhos.update({celula.value:coluna})  # aqui vou ver se Alvo existe if "Alvo" in cabeçalhos:     coluna_alvo = cabeçalhos["Alvo"] # Retira o número da coluna associado a Alvo      for linha in range(2,folha.max_row +1): # vou percorrer todas as linhas para         celula_alvo = folha.cell(linha,coluna_alvo)    # ver quantos alvos existem         # aqui vou ver se o número de alvo não está na lista de alvos         if celula_alvo.value not in alvos:             alvos.append(celula_alvo.value) # e vou adicionar caso não esteja else:      return "Não encontrei a coluna referente a 'Alvo" # Encontrou o cabeçalho alvo e vou fazer um loop pelos alvos  #localização dos produtos coluna_produto = cabeçalhos["Tipo de Produto"]   for alvo in alvos:     # Vou criar um separador por alvo     nova_folha = livro.create_sheet(str(alvo))     # por cada alvo percorro todas as linhas     nova_linha = 1 # variável para inserir as linhas todas de seguida     for linha in range(2,folha.max_row+1):         produto = folha.cell(linha, coluna_produto)         # e vejo se é esse alvo         if alvo == celula_alvo.value and produto.value =="Voz" or alvo == celula_alvo.value and produto.value == "SMS" or alvo == celula_alvo.value and produto.value=="MMS":               nova_linha += 1              for coluna in range(1, folha.max_column + 1):                   valor = folha.cell(row=linha, column=coluna).value                   livro[str(alvo)].cell(row=nova_linha,column=coluna, value= valor)       livro.save("novo.xlsx")  #print(cabeçalhos) print(alvos) # só para ver se encontrou os alvos e para ver que já acabou 

Using Openpyxl to compare data values in two excel files when one isn’t ordered

I am comparing the income values in two different excel files to evaluate if their is a discrepancy the income for each employee between the files. My issue is that the names in the second file are alphabetized but in the first file they are ordered randomly.

I am very new to python, what search function can I use that creates a list telling me what the corresponding row values in the unordered list are to the alphabetized list.

Here is my code so far.

import openpyxl wb = openpyxl.load_workbook('Gehaltabrechnung_v02.xlsx') sheet= wb['Tabelle1'] wb2 = openpyxl.load_workbook('BonusCalculation.xlsx') sheet2= wb2['Tabelle1'] //imports files and names the workbooks  column_income1 = sheet['G'] income1_length = len(column_income1) column_name1 = sheet['B'] name1_length = len(column_name1) column_income2 = sheet2['I'] income2_length = len(column_income2) column_name2 = sheet2['B'] name2_length = len(column_name2) //specifies collumn and their lengths for the two sheets 

Python: Make my code efficient and less time complexity, My Code is about fetch details from Excel using ‘openpyxl’

I have this code so far:

Dictionary: which has Excel Data

DATA_DICT = {   '1238': {     'ipAddrPrim': ['IP1', 'IP2', 'IP3'],     'mmeName': ['N1', None, 'N3'],     's1LinkStatus': ['available', 'unavailable', 'available']   },   '1236': {     'ipAddrPrim': ['IP1', 'IP4'],     'mmeName': ['N4', 'N5'],     's1LinkStatus': ['available', 'available']   },   '1238': {     'ipAddrPrim': ['IP3', 'IP4', 'IP5'],     'mmeName': ['N7', 'N9', None],     's1LinkStatus': ['available', 'available', 'unavailable']   },   .... } 

My Code: which it’ll give the source and target lists

def finding_distinct_ips():     result_list, source_list, target_list, src_mrbts_key = [], [], [], []     unique_mrbts_list = remove_duplicates(MRBTS) #MRBTS has all MRBTS ids [1234,1234,1234,1236,1236,1238,1238,1238]                                                  #remove_duplicates(MRBTS) -> [1234,1236,1238]     for mrbts_id in unique_mrbts_list:         rest = {k: v for k, v in DATA_DICT.items() if k != mrbts_id}         main_key_ip_list = DATA_DICT[mrbts_id].get('ipAddrPrim')         status = DATA_DICT[mrbts_id].get('s1LinkStatus')         mme_name = DATA_DICT[mrbts_id].get('mmeName')         for key, val in rest.items():             rest_key_ip_list = val['ipAddrPrim']             result = compare(main_key_ip_list, rest_key_ip_list) #--> lst3 = list(set(main_key_ip_list) - set(rest_key_ip_list))             for ip_addr in result:                 ip_row_index = main_key_ip_list.index(ip_addr)                 if ip_addr != [] and status[ip_row_index] != 'unavailable':                     mme_at_ip = mme_name[ip_row_index]                     result_list = (mrbts_id, key, mme_at_ip, ip_addr)                     if result_list[0:2][::-1] not in src_mrbts_key:                         src_mrbts_key.append(result_list[0:2])                         source_list.append(result_list)                     else:                         target_list.append(result_list)     return source_list, target_list 

Explanation:

I have an Excel spreadsheet:

MRBTS  IPs  NAMES STATUS 1234   IP1   N1    Yes 1234   IP2   -     No 1234   IP3   N3    Yes 1236   IP1   N4    Yes 1236   IP4   N5    Yes 1238   IP3   N7    Yes 1238   IP4   N9    Yes 1238   IP5   -     No 

My Code will full fill these requirements:

1) I need to compare each MRBTS with each other just like

comp(1234,1236),comp(1234,1238),comp(1236,1234),comp(1236,1238),comp(1238,1234),comp(1238,1236),...] but not with same mrbts like comp(1234,1234).

2) Lets take (if status=Yes) for comp(1236,1238): In 1236's I have some list(IPs[IP1,IP4] and Ns[N4,N5]) so again i need to compare those list with 1238's list(IP[IP3,IP4] and Ns[N7,N9]) {Here we didn’t considered IP5 as status = No}, if any IP in 1236(IP list) is not found in 1238(IP list) then return that IP,Name and only if status=Yes

compare(1236,1238) = [1236,IP1,N4] and compare(1238,1236) = [1238,IP3,N7] 

3) Now do the comparison for all combis and get results and store like source = [compare(1236,1238)] and target = [compare(1238,1236)] Note: Here we can observe source has right ordered comparison and target has reverse ordered comparison.

Now I want, help to make above code efficient and less the time complexity:

Note:

i. Avoid Nested for loops as my Original excel has a huge number of data it’ll take time to read, else if u use then make execution fast.

ii. Use openpyxl or pyxlsb packages only