A character set is the collection of characters that Python recognizes and can work with: letters, digits, punctuation, and various symbols. The two character sets you will meet most often in Python are ASCII and Unicode.
ASCII Character Set
The ASCII (American Standard Code for Information Interchange) character set is one of the most basic character sets. It defines 128 characters, with codes 0 through 127. These characters include:
- Letters: A-Z, a-z
- Digits: 0-9
- Punctuation and special characters:
! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
- Control characters (non-printable): characters with ASCII codes 0-31, plus 127 (DEL)
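The groups listed above can be checked against the constants in Python's standard `string` module. The following is a minimal sketch, not a definitive classification: the helper name `classify_ascii` is hypothetical, and it treats code 127 (DEL) as a control character alongside codes 0-31.

```python
import string


def classify_ascii(code):
    """Return a rough category name for an ASCII code (0-127)."""
    if not 0 <= code <= 127:
        raise ValueError("not an ASCII code")
    ch = chr(code)
    if ch in string.ascii_letters:
        return "letter"
    if ch in string.digits:
        return "digit"
    if ch in string.punctuation:
        return "punctuation"
    if ch == " ":
        return "space"
    # Everything left is non-printable: codes 0-31 and 127 (DEL)
    return "control"


# Count how many of the 128 ASCII codes fall into each category
counts = {}
for code in range(128):
    category = classify_ascii(code)
    counts[category] = counts.get(category, 0) + 1

print(counts)
```

The counts add up to 128, which is a quick way to confirm that no ASCII code was left unclassified.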
Unicode Character Set
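In Python 3, strings are Unicode by default. Unicode is a much larger character set than ASCII and contains it: the first 128 Unicode code points are the ASCII characters. Beyond those, Unicode includes characters from almost all written languages, mathematical symbols, emojis, and more.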
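Every Unicode character has a numeric code point and an official name, and the standard `unicodedata` module can look up the name. Here is a small sketch; the helper name `describe` is an assumption made for illustration.

```python
import unicodedata


def describe(ch):
    """Return the code point (as U+XXXX) and Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# One ASCII letter, one Greek letter, and one symbol
for ch in "Aπ☕":
    print(ch, describe(ch))
```

Note that `unicodedata.name()` raises `ValueError` for characters without a name, such as most control characters, unless you pass a default as the second argument.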
Working with Character Sets
Let’s see some examples of how to work with character sets in Python.
ASCII Characters
Here’s a simple example to print ASCII characters and their corresponding codes:
# Print ASCII characters and their codes
for i in range(128):
    # !r shows control characters (newline, tab, ...) as visible escapes
    print(f"ASCII Code: {i}, Character: {chr(i)!r}")
Unicode Characters
You can work with Unicode characters similarly. Here’s an example that includes some Unicode characters:
# Print some Unicode characters and their codes
unicode_chars = ['\u0394', '\u03A9', '\u03C0', '\u2600', '\u2615']
for char in unicode_chars:
    print(f"Unicode Character: {char}, Code: {ord(char)}")
In this example:
- \u0394 is the Greek capital letter Delta (Δ)
- \u03A9 is the Greek capital letter Omega (Ω)
- \u03C0 is the Greek small letter Pi (π)
- \u2600 is the sun symbol (☀)
- \u2615 is the hot beverage symbol (☕)
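A `\u` escape is only one way to write such a character in Python source code. The sketch below shows four spellings that all produce the same one-character string; the variable names are made up for this illustration.

```python
# Four spellings of the same one-character string
delta_escape = "\u0394"                         # 4-digit hexadecimal escape
delta_literal = "Δ"                             # typed directly in the source file
delta_named = "\N{GREEK CAPITAL LETTER DELTA}"  # looked up by Unicode name
delta_from_code = chr(0x0394)                   # built from the code point

spellings = [delta_escape, delta_literal, delta_named, delta_from_code]

# All four compare equal, and each is a single character
print(all(s == "Δ" for s in spellings))
print(len(delta_escape), ord(delta_escape), hex(ord(delta_escape)))
```

Which spelling to use is mostly a matter of readability: the literal is easiest to read, while escapes are handy when your editor or font cannot display the character.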
Working with Strings
You can also handle strings containing characters from different character sets:
# Example string with ASCII and Unicode characters
example_string = "Hello, World! जुनैद शेख ☕"
# Iterate over each character in the string
for char in example_string:
    print(f"Character: {char}, Unicode Code: {ord(char)}")
In this example, the string contains:
- ASCII characters: Hello, World!
- Non-ASCII Unicode characters: जुनैद शेख (a name written in Devanagari script) and ☕ (hot beverage symbol)
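To separate the two groups programmatically, you can compare each character's code point with 128, the size of the ASCII set. This is a minimal sketch; the function name `split_by_ascii` is hypothetical.

```python
def split_by_ascii(text):
    """Return (ascii_chars, non_ascii_chars) as two lists of characters."""
    ascii_chars = [ch for ch in text if ord(ch) < 128]
    non_ascii_chars = [ch for ch in text if ord(ch) >= 128]
    return ascii_chars, non_ascii_chars


example_string = "Hello, World! जुनैद शेख ☕"
ascii_part, other_part = split_by_ascii(example_string)

print("ASCII:    ", "".join(ascii_part))
print("Non-ASCII:", "".join(other_part))
```

The spaces in the string are ASCII characters too, so they land in the first group. In the Devanagari text, vowel signs such as ु are separate code points, so a character-by-character loop visits them individually.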
Python lets you work with ASCII and Unicode characters in the same way, using chr() and ord(). ASCII is limited to basic English letters, digits, punctuation, and some control characters, while Unicode extends it to cover a vast range of characters from many languages and symbol sets.