Using Regular Expressions in Python: A Powerful Tool for String Pattern Matching

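Basic Regular Expression Syntax

A regular expression is a pattern that describes which combinations of characters to match in a string. It is a very powerful tool for text processing and is useful for searching, replacing, and validating text. In Python, regular expression operations are provided by the re module.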

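1. Ordinary characters

Ordinary characters, such as letters, digits, and most symbols, match themselves directly in the string.

2. Metacharacters

Metacharacters have a special meaning inside a pattern. They include the dot (.), asterisk (*), plus sign (+), question mark (?), square brackets ([]), curly braces ({}), and parentheses (()).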

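.: Matches any single character except a newline (unless the re.DOTALL flag is set).
*: Matches zero or more repetitions of the preceding element.
+: Matches one or more repetitions of the preceding element.
?: Matches zero or one occurrence of the preceding element.
[]: Character class; matches any single character listed inside the brackets.
{}: Quantifier; matches the preceding element the specified number of times, for example {3} or {2,4}.
(): Grouping; groups part of the pattern and captures the text it matched.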
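
To make the list above concrete, here is a minimal sketch that tries one small pattern per metacharacter. The sample patterns and strings are made up for illustration and are not from any particular data set.

```python
import re

# Each pair is (pattern, text); re.search returns the first match or None.
examples = [
    (r"h.t", "hot"),           # . matches any single character except a newline
    (r"ab*c", "ac"),           # * allows zero occurrences of "b"
    (r"ab+c", "abbc"),         # + requires at least one "b"
    (r"colou?r", "color"),     # ? makes the "u" optional
    (r"[aeiou]", "sky blue"),  # [] matches any one character from the set
    (r"\d{3}", "tel 0120"),    # {3} requires exactly three digits
    (r"(\w+)@", "user@host"),  # () captures the word characters before "@"
]

results = []
for pattern, text in examples:
    m = re.search(pattern, text)
    results.append(m.group() if m else None)
    print(pattern, "->", results[-1])

# group(1) returns only what the parentheses captured.
captured = re.search(r"(\w+)@", "user@host").group(1)
print(captured)  # user
```

Note that group() returns the whole match ("user@" in the last case), while group(1) returns only the captured part.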

Python's Regular Expression Module

Python's re module provides regular expression support. The sections below introduce some of its most commonly used functions.

`re.match`

re.match tries to match the pattern only at the beginning of the string. It returns a match object on success and None otherwise.

import re
pattern = r"hello"
text = "hello world"
match = re.match(pattern, text)
if match:
    print("Match found:", match.group())
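
A point that is easy to miss: re.match anchors only the start of the string, not the end. The following sketch, using the same sample text, contrasts it with re.fullmatch.

```python
import re

text = "hello world"

# "world" occurs in the text, but not at position 0, so re.match returns None.
at_start = re.match(r"world", text)
print(at_start)  # None

# re.match does not require the pattern to consume the whole string.
prefix = re.match(r"hell", text)
print(prefix.group())  # hell

# re.fullmatch succeeds only if the entire string matches the pattern.
whole = re.fullmatch(r"hello", text)
print(whole)  # None
```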

`re.search`

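re.search scans the string and returns a match object for the first location where the pattern matches, or None if there is no match.

import re

pattern = r"world"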
text = "hello world"
match = re.search(pattern, text)
if match:
    print("Match found:", match.group())
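
The match object carries more than the matched text. As a small sketch, the position of the match can be read with start(), end(), and span(). The single-letter pattern here is chosen only to show that the first occurrence wins.

```python
import re

text = "hello world"

# "o" appears twice; re.search reports only the first occurrence.
m = re.search(r"o", text)
print(m.group(), m.start(), m.end(), m.span())  # o 4 5 (4, 5)

# When nothing matches, the result is None, so check before calling group().
no_match = re.search(r"xyz", text)
print(no_match)  # None
```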

`re.findall`

re.findall returns a list of all non-overlapping matches of the pattern in the string.

import re

pattern = r"\d+"
text = "The year is 2023."
matches = re.findall(pattern, text)
print("Matches found:", matches)
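
What re.findall returns depends on how many capturing groups the pattern contains. A minimal sketch with a made-up date string:

```python
import re

text = "2023-06-27 and 2024-01-15"

# No groups: a list of whole matches.
whole = re.findall(r"\d{4}-\d{2}-\d{2}", text)
print(whole)  # ['2023-06-27', '2024-01-15']

# One group: a list of what that group captured.
years = re.findall(r"(\d{4})-\d{2}-\d{2}", text)
print(years)  # ['2023', '2024']

# Several groups: a list of tuples, one item per group.
parts = re.findall(r"(\d{4})-(\d{2})-(\d{2})", text)
print(parts)  # [('2023', '06', '27'), ('2024', '01', '15')]
```

If the whole match is wanted from a pattern that needs parentheses, non-capturing groups (?:...) keep the first behavior.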

`re.sub`

re.sub replaces every part of the string that matches the pattern with the replacement and returns the resulting string.

import re

pattern = r"hello"
text = "hello world"
replaced_text = re.sub(pattern, "hi", text)
print("Replaced text:", replaced_text)
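
re.sub has a few options beyond a plain string replacement. The sketch below shows three of them with made-up inputs: limiting the number of replacements, referring back to captured groups, and computing the replacement with a function.

```python
import re

# count=1 replaces only the first occurrence.
once = re.sub(r"hello", "hi", "hello hello world", count=1)
print(once)  # hi hello world

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")
print(swapped)  # world hello

# A function receives each match object and returns the replacement text.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples, 10 pears")
print(doubled)  # 6 apples, 20 pears
```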

Application Examples

Validating an Email Address

import re

pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
email = "example@example.com"
if re.match(pattern, email):
    print("Valid email address.")
else:
    print("Invalid email address.")
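
One caveat with the check above: in Python, $ also matches just before a trailing newline, so an address followed by "\n" still passes with re.match. A possible variation, sketched here, compiles the pattern once and uses fullmatch instead. The helper name is_valid_email is hypothetical, and the last character class is written as [a-zA-Z0-9.-], which describes the same set of characters. This is a simplified format check, not a complete validator for every legal address.

```python
import re

# Assumption: same character sets as the pattern above, without ^ and $,
# because fullmatch already requires the whole string to match.
EMAIL_RE = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+")


def is_valid_email(address: str) -> bool:
    """Hypothetical helper: True if the whole string looks like an address."""
    return EMAIL_RE.fullmatch(address) is not None


# With re.match and "$", a trailing newline still passes.
anchored = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+$"
loose = re.match(anchored, "example@example.com\n")
print(loose is not None)                        # True

print(is_valid_email("example@example.com"))    # True
print(is_valid_email("example@example.com\n"))  # False
print(is_valid_email("example.com"))            # False
```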

Extracting URLs

import re

pattern = r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
text = "Visit my website at http://www.example.com"
matches = re.findall(pattern, text)
print("URLs found:", matches)
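
Two details of this pattern are worth knowing. Inside the character class, $-_ is a range from "$" to "_", which is why characters such as "/" and ":" are accepted. The pattern also accepts "." and ",", so punctuation that directly follows a URL in a sentence ends up in the match. The sketch below uses the same pattern with re.finditer on a made-up sentence and strips trailing punctuation afterwards; the rstrip step is a simple heuristic, not part of the original example.

```python
import re

URL_RE = re.compile(
    r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
)

text = "See http://www.example.com, or https://example.org/docs/re.html."

# Raw matches include the comma and the final period from the sentence.
raw = [m.group() for m in URL_RE.finditer(text)]
print(raw)

# Heuristic cleanup: drop sentence punctuation at the end of each match.
cleaned = [url.rstrip(".,") for url in raw]
print(cleaned)
```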

Tags: regular expressions, Python, re module, string processing, pattern matching

Posted June 27, 00:58

異端開発室