Python String Manipulation: Basics and Practical Techniques

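In Python, a string is an immutable sequence type that can be written with single quotes ('), double quotes ("), or triple quotes (''' or """). Once created, a string cannot be modified in place; any operation that appears to change it returns a new string.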
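
As a minimal sketch of what immutability means in practice (the variable names and values here are illustrative, not from the original post), item assignment fails and a new string has to be built instead:

```python
word = "Python"

# Item assignment is not supported on str, so this raises TypeError.
try:
    word[0] = "J"
except TypeError as exc:
    error_message = str(exc)

# Build a new string instead; the original object is left untouched.
new_word = "J" + word[1:]
print(new_word)  # Jython
print(word)      # Python
```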

Single and double quotes:

greeting = 'Hello'
message = "Welcome to Python"

Triple quotes (multi-line strings):

text = """This is
a string that spans
multiple lines."""

Concatenation:

a = "Hello"
b = "World"
combined = a + " " + b  # "Hello World"
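
When more than a few pieces are combined, str.join() is a common alternative to repeated use of +. The sketch below uses made-up sample data:

```python
parts = ["Hello", "World", "from", "Python"]

# join() places the separator between the items and returns one new string.
sentence = " ".join(parts)
print(sentence)  # Hello World from Python

# Non-string items must be converted first, for example with str().
numbers = ", ".join(str(n) for n in [1, 2, 3])
print(numbers)  # 1, 2, 3
```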

Index access:

word = "Python"
first_char = word[0]  # 'P'
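
Indexing extends naturally to negative indices and slices. A short sketch, reusing the same sample word (the extra variable names are illustrative):

```python
word = "Python"

last_char = word[-1]        # 'n'  (negative indices count from the end)
first_three = word[:3]      # 'Pyt' (the end index is exclusive)
middle = word[2:4]          # 'th'
reversed_word = word[::-1]  # 'nohtyP'

# Indexing past the end raises IndexError, while slicing is forgiving.
try:
    word[10]
except IndexError:
    out_of_range = True
empty_slice = word[10:]     # ''
```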

len(s): get the length
```
length = len("Python")  # 6
```
s.upper() / s.lower(): convert to uppercase / lowercase
```
"Py".upper()   # "PY"
"PY".lower()   # "py"
```
s.strip(): remove leading and trailing whitespace
```
"  test  ".strip()  # "test"
```
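
Because each of these methods returns a new string, calls can be chained. The following is a minimal sketch; normalize_key is a hypothetical helper introduced only for illustration:

```python
def normalize_key(raw):
    """Hypothetical helper: trim surrounding whitespace, then lowercase."""
    return raw.strip().lower()

key = normalize_key("  User Name \n")  # 'user name'
key_length = len(key)                  # 9

# strip() also accepts a set of characters to remove from both ends,
# and lstrip()/rstrip() work on one side only.
title = "--title--".strip("-")  # 'title'
left = "  test  ".lstrip()      # 'test  '
right = "  test  ".rstrip()     # '  test'
```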

s.replace(old, new): replace every occurrence of old with new

"Hi Earth!".replace("Earth", "Mars")  # "Hi Mars!"

The .format() method:

"Name: {}, Age: {}".format("Hanako", 30)  # "Name: Hanako, Age: 30"

f-strings (recommended; available in Python 3.6 and later):

name, age = "Jiro", 28
f"Name: {name}, Age: {age}"  # "Name: Jiro, Age: 28"
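
f-strings also accept format specifiers after a colon. A small sketch; price and ratio are made-up values added for illustration:

```python
name, age = "Jiro", 28
price = 1234.5
ratio = 0.256

formatted_price = f"{price:,.2f}"  # '1,234.50' (thousands separator, 2 decimals)
formatted_ratio = f"{ratio:.1%}"   # '25.6%'    (percentage with 1 decimal)
padded_age = f"{age:>5}"           # '   28'    (right-aligned in 5 columns)
quoted_name = f"{name!r}"          # "'Jiro'"   (repr() of the value)
```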

Use encode() and decode() to convert between Unicode strings (str) and byte sequences (bytes):

# Encode (str → bytes)
utf8_bytes = "日本語".encode('utf-8')

# Decode (bytes → str)
original = utf8_bytes.decode('utf-8')
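
The distinction matters because character count and byte count differ, and decoding with the wrong codec fails. A minimal sketch under those assumptions (the ASCII decode is a deliberately wrong choice for demonstration):

```python
text = "日本語"
utf8_bytes = text.encode("utf-8")

char_count = len(text)        # 3 characters
byte_count = len(utf8_bytes)  # 9 bytes (3 bytes per character here)

# Decoding with the wrong codec raises UnicodeDecodeError by default.
try:
    utf8_bytes.decode("ascii")
except UnicodeDecodeError:
    decode_failed = True

# errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
lossy = utf8_bytes.decode("ascii", errors="replace")

# Decoding with the codec used for encoding restores the original text.
round_trip = utf8_bytes.decode("utf-8")
```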

Searching for a match:

import re
if re.search(r'\d+', "ID: 12345"):
    print("The string contains digits")
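
re.search() returns a match object (or None), which can also be used to pull out the matched text. A short sketch with the same sample input:

```python
import re

match = re.search(r"\d+", "ID: 12345")
if match:
    digits = match.group()     # '12345'
    start, end = match.span()  # (4, 9)
    number = int(digits)

# With no match, re.search returns None, so check before calling group().
no_match = re.search(r"\d+", "no digits here")
```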

Extracting all matches:

emails = re.findall(r'\w+@\w+\.\w+', "Contact: a@example.com, b@test.org")
# ['a@example.com', 'b@test.org']

Substitution:

cleaned = re.sub(r'\s+', ' ', "  multiple   spaces  ")
# " multiple spaces "  (one space remains at each end)
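
Collapsing whitespace this way still leaves a single space at each end, so it is often combined with strip(). The sketch below assumes a hypothetical helper named collapse_whitespace and a precompiled pattern:

```python
import re

# Compiling once is convenient when the same pattern is reused many times.
WHITESPACE = re.compile(r"\s+")

def collapse_whitespace(text):
    """Hypothetical helper: collapse whitespace runs and trim both ends."""
    return WHITESPACE.sub(" ", text).strip()

print(collapse_whitespace("  multiple   spaces  "))  # multiple spaces
```

For str patterns, \s also matches Unicode whitespace such as the full-width space (U+3000), which is handy for Japanese text.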

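Quotes inside a string (enclosing the string in single quotes lets it contain double quotes without backslash escapes):

s = 'He said "Hello".'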
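
The same text can also be written with backslash escapes or triple quotes. A brief sketch with illustrative strings:

```python
s1 = 'He said "Hello".'    # double quotes inside single quotes
s2 = "He said \"Hello\"."  # same text, using backslash escapes
s3 = 'It\'s fine.'         # escaped single quote
s4 = """He said "Hello", and it's fine."""  # triple quotes allow both kinds

print(s1 == s2)  # True
```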

Posted May 18, 01:09

異端開発室