Why Developers Need to Know Encoding and Hashing
In web development, API integration, and security work, encoding and hashing come up daily. Encoding converts data into a different representation that can be converted back; hashing maps data of any size to a fixed-length value that cannot be converted back. Both are transformations, but their purposes and properties are entirely different.
Encoding vs Hashing
| Aspect | Encoding | Hashing |
| Purpose | Data compatibility, transmission | Integrity verification, security |
| Reversible | Yes (decoding) | No (one-way) |
| Key Required | No | No (except HMAC) |
| Output Length | Proportional to input | Always fixed |
| Examples | Base64, URL encoding | MD5, SHA-256 |
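The "Reversible" and "Output Length" rows are easy to check by hand. The sketch below uses only Python's standard library; the sample inputs are arbitrary and chosen purely for illustration.

```python
import base64
import hashlib

short_input = b"hi"
long_input = b"hi" * 1000

# Encoding is reversible: decoding returns the original bytes.
encoded = base64.b64encode(short_input)
assert base64.b64decode(encoded) == short_input

# Encoded length grows with the input...
encoded_lengths = (len(base64.b64encode(short_input)), len(base64.b64encode(long_input)))
print(encoded_lengths)

# ...while a SHA-256 hex digest has the same length for any input.
digest_lengths = (
    len(hashlib.sha256(short_input).hexdigest()),
    len(hashlib.sha256(long_input).hexdigest()),
)
print(digest_lengths)
```

There is no counterpart to `b64decode` for the digest: recovering the input from a hash means guessing inputs until one matches.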
Base64 Encoding
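Base64 converts binary data to a 64-character ASCII alphabet (A-Z, a-z, 0-9, +, /). It is used in email attachments and image Data URLs. JWTs use the URL-safe Base64Url variant, which replaces + and / with - and _ and omits the padding.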
Original: Hello, World!
Base64: SGVsbG8sIFdvcmxkIQ==
- About 33% size increase (3 bytes → 4 characters)
- Padding character (=) ensures length is a multiple of 4
- Note: This is NOT encryption! Anyone can decode it
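The example above can be reproduced with Python's standard `base64` module. This is a minimal sketch; the variable names are illustrative only.

```python
import base64

raw = "Hello, World!".encode("utf-8")

encoded = base64.b64encode(raw).decode("ascii")
print(encoded)  # SGVsbG8sIFdvcmxkIQ==

# Decoding needs no key, which is why Base64 offers no confidentiality.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # Hello, World!

# 13 input bytes fill 5 groups of 3 (the last one padded), giving 5 * 4 = 20 characters.
print(len(raw), len(encoded))  # 13 20
```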
URL Encoding (Percent Encoding)
Converts characters that can't be used in URLs to %XX format.
Space: Hello World → Hello%20World
Special chars: a&b=c → a%26b%3Dc
When making API calls, non-ASCII characters (such as Korean) and reserved characters in query parameters must be URL-encoded.
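In Python, `urllib.parse` handles percent-encoding. The sketch below reproduces the two examples above and then builds a query string; the parameter names `q` and `tag` are made up for illustration.

```python
from urllib.parse import quote, unquote, urlencode

space_example = quote("Hello World")
print(space_example)  # Hello%20World

# safe="" also encodes "/", which quote() leaves alone by default.
special_example = quote("a&b=c", safe="")
print(special_example)  # a%26b%3Dc

# urlencode percent-encodes every key and value; non-ASCII text is
# converted to UTF-8 bytes first, then each byte becomes %XX.
query = urlencode({"q": "한글", "tag": "a&b"})
print(query)  # q=%ED%95%9C%EA%B8%80&tag=a%26b

print(unquote(space_example))  # Hello World
```

Note that `urlencode` writes spaces as `+` by default, following the form-encoding convention, whereas `quote` writes `%20`.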
Character Encoding: The Importance of UTF-8
| Encoding | English Size | Korean Size | Character Support |
| ASCII | 1 byte | Unsupported | 128 chars |
| EUC-KR | 1 byte | 2 bytes | Korean/English/some Japanese |
| UTF-8 | 1 byte | 3 bytes | All Unicode characters |
| UTF-16 | 2 bytes | 2 bytes | All Unicode characters |
About 98% of all web pages use UTF-8. Standardize on UTF-8 across databases, files, and APIs to prevent encoding issues.
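The per-character sizes in the table can be checked with `str.encode`. A small sketch, assuming one sample character per language; `utf-16-le` is used rather than `utf-16` because the latter prepends a 2-byte byte order mark that would distort the count.

```python
samples = {"English": "A", "Korean": "한"}

sizes = {}
for encoding in ("euc-kr", "utf-8", "utf-16-le"):
    # Number of bytes each sample character occupies in this encoding.
    sizes[encoding] = {name: len(ch.encode(encoding)) for name, ch in samples.items()}
    print(encoding, sizes[encoding])

# ASCII has no code for Korean characters, so encoding fails outright.
try:
    "한".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print("ascii can encode Korean:", ascii_ok)
```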
Hash Function Comparison
| Hash | Output | Speed | Security | Use Case |
| MD5 | 128-bit (32 chars) | Very fast | Broken ❌ | Non-security checksums |
| SHA-1 | 160-bit (40 chars) | Fast | Broken ❌ | Legacy, Git |
| SHA-256 | 256-bit (64 chars) | Moderate | Secure ✅ | Digital signatures, blockchain |
| SHA-512 | 512-bit (128 chars) | Moderate | Secure ✅ | High-security needs |
| bcrypt | 184-bit | Slow (intentional) | Secure ✅ | Password storage |
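The output sizes in the first four rows can be confirmed with Python's `hashlib`. This is only a quick check, and the input `b"hello"` is arbitrary; bcrypt is omitted because it is not part of the standard library.

```python
import hashlib

data = b"hello"

output_sizes = {}
for name in ("md5", "sha1", "sha256", "sha512"):
    digest = hashlib.new(name, data)
    # digest_size is in bytes; hexdigest() uses two characters per byte.
    output_sizes[name] = (digest.digest_size * 8, len(digest.hexdigest()))
    print(name, output_sizes[name])
```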
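Use bcrypt, scrypt, or Argon2 for password storage. MD5 and the SHA family are designed to be fast, so an attacker who obtains the hashes can test guesses at a very high rate; password hashing functions are deliberately slow and salted to make brute force attacks impractical.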
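As one possible illustration, the sketch below uses scrypt, since it ships with Python's `hashlib` (bcrypt and Argon2 need third-party packages). The helper names and the cost parameters are assumptions for the example, not recommended production values.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters; tune them for your hardware and threat model.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key); store both alongside the user record."""
    salt = os.urandom(16)  # a fresh random salt for every password
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(key, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

Because the salt is random, hashing the same password twice yields different stored values, which defeats precomputed lookup tables.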
Practical Tips
- JWT Tokens: Each part of Header.Payload.Signature is Base64Url encoded.
- File Integrity: Compare SHA-256 checksums of downloads with official values.
- API Keys: Avoid sending the raw secret key with each request; sign requests with HMAC-SHA256 so the secret itself is never transmitted.
- Database: Set charset to utf8mb4 to support emoji characters.
- Git: Commit IDs are SHA-1 hashes. Git is transitioning to SHA-256.
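Three of the tips above can be sketched with the standard library alone. The helper names, the secret, and the request body below are hypothetical; the Base64Url string is the header segment commonly seen in HS256 tokens, not a complete JWT.

```python
import base64
import hashlib
import hmac
import json

def b64url_decode(segment: str) -> bytes:
    """Decode a Base64Url segment, restoring the padding that JWTs omit."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def sha256_file(path: str) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def sign(secret: bytes, message: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of message."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

# JWT: decoding a segment reveals its contents but verifies nothing.
header = json.loads(b64url_decode("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"))
print(header)  # {'alg': 'HS256', 'typ': 'JWT'}

# HMAC: the sender attaches the signature; the receiver recomputes and compares.
secret = b"demo-secret"        # hypothetical shared secret
body = b'{"amount": 100}'      # hypothetical request body
signature = sign(secret, body)
print(hmac.compare_digest(signature, sign(secret, body)))  # True
```

For file integrity, compare the result of `sha256_file` on the downloaded file against the checksum published by the official source.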