1

pythonでデータ書き出しの際に「openpyxl.utils.exceptions.IllegalCharacterError」が出たときの対応

More than 1 year has passed since last update.

pandasを使用してデータをサーバから取得し、openpyxlを使用してデータをExcelなどに書き出す場合に下記のエラーが出る場合がある。

openpyxl.utils.exceptions.IllegalCharacterError

理由としては取得したデータフレームの中にopenxmlで扱えない不正な文字が含まれている、ということなので下記のように除外するための関数を1回挟めばOK

import pandas as pd
import openpyxl

def illegal_char_remover(data):
    ILLEGAL_CHARACTERS_RE = re.compile(
        r'[\000-\010]|[\013-\014]|[\016-\037]|[\x00-\x1f\x7f-\x9f]|[\uffff]')
    """Remove ILLEGAL CHARACTER."""
    if isinstance(data, str):
        return ILLEGAL_CHARACTERS_RE.sub("", data)
    else:
        return data

#サーバ情報
conn = pyodbc.connect(サーバ情報)
#クエリ実行
df = pd.read_sql(クエリ情報, conn )
#IllegalCharacter除外(ここが追加分)
df = df.applymap(illegal_char_remover)

Why do not you register as a user and use Qiita more conveniently?
  1. We will deliver articles that match you
    By following users and tags, you can catch up information on technical fields that you are interested in as a whole
  2. you can read useful information later efficiently
    By "stocking" the articles you like, you can search right away
Why do not you register as a user and use Qiita more conveniently?
You need to log in to use this function. Qiita can be used more conveniently after logging in.
You seem to be reading articles frequently this month. Qiita can be used more conveniently after logging in.
  1. We will deliver articles that match you
    By following users and tags, you can catch up information on technical fields that you are interested in as a whole
  2. you can read useful information later efficiently
    By "stocking" the articles you like, you can search right away