One notable mistake is that many AI-generated database schemas currently favour using VARCHAR(36) for the id column.
One notable mistake is that many AI-generated database schemas currently favour using VARCHAR(36) for the id column.
This is typically motivated by a desire to prevent unauthorised access by malicious actors. To achieve this, UUIDs (Universally Unique Identifiers) are often employed as a security measure.
Take WordPress as an example — it typically exposes post IDs in the URL like ?p=45. Compared to an SEO-friendly format such as /2025/05/sample-post, this numeric ID allows users to potentially guess and access adjacent content by altering the number (e.g., ?p=44). UUIDs, on the other hand, are not sequential and are far harder to guess.
UUIDs are 36-character strings that follow a standard format with five sections separated by hyphens: 8-4-4-4-12. Each section is made up of hexadecimal values:
550e8400-e29b-41d4-a716-446655440000
f47ac10b-58cc-4372-a567-0e02b2c3d479
123e4567-e89b-12d3-a456-426614174000
A simple regex to validate UUIDs looks like this:
^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$
While UUIDs offer security and portability, they come with trade-offs. They're significantly larger than integers, which can slow down database performance — especially when used as primary keys in relational databases. Indexing, memory usage, and query speed all suffer due to the size and non-sequential nature of UUIDs.
Curious about this design choice, I asked GitHub Copilot the following question:
Me:
Why have you chosen the id column to be defined as VARCHAR(36) PRIMARY KEY in the database structure?
GitHub Copilot:
I chose
VARCHAR(36)for the ID columns for a few specific reasons:
- UUID Storage: The 36-character length fits UUID values, which follow a standardized format.
- Multi-Tenant Systems: UUIDs help isolate tenants by eliminating sequential ID leakage, simplifying migrations, and improving security.
- Data Portability: UUIDs ensure that exported data doesn't suffer from ID collisions during import into another system.
Trade-offs:
- Larger storage size (36 bytes vs. 4 bytes for
INT)- Slower indexing and query performance
- Increased memory usage
Alternatives to consider:
AUTO_INCREMENT INT/BIGINTBINARY(16)to store raw UUID bytes more efficiently- A custom ID generation strategy
Let me know if you'd prefer a different approach.
Me:
Use BIGINT UNSIGNED for performance, but keep a guid field in each record for external references.
GitHub Copilot:
I'll update the schema to use
BIGINT UNSIGNEDas the primary key while adding aguidcolumn for external references. This hybrid approach combines the best of both strategies:
- Performance: Efficient indexing and joins via numeric primary keys
- Security & Portability: GUIDs exposed in APIs, making internal IDs safe from enumeration
Summary:
BIGINT UNSIGNED: Fast, compact, sequentialGUID (VARCHAR): Unique, portable, externally safe
While relational databases offer flexibility in choosing between UUIDs, integers, or hybrids, NoSQL systems like MongoDB take a different path.
MongoDB uses a built-in type called ObjectId for its _id field. This is a 12-byte binary value that encodes:
Example:
507f1f77bcf86cd799439011
VARCHAR(36) UUID. They're also more index-friendly.| Feature | UUID (VARCHAR) | BIGINT + GUID Hybrid | MongoDB ObjectId |
|---|---|---|---|
| Storage Size | 36 bytes | 8 + 36 bytes | 12 bytes |
| Indexing Performance | Poor | Excellent | Excellent |
| Sequential / Time Order | No | Yes (BIGINT only) | Yes (via timestamp) |
| Readable Format | Yes | Yes (GUID) | No (hex-based) |
| Guessable | No | No (GUID only) | No |
| Schema Enforcement | Strict | Strict | Flexible |
Many AI-generated systems default to MongoDB-like NoSQL setups when:
In these cases, using ObjectIds or similar unique, time-sortable identifiers is often a sensible default.
These architecture decisions are not theoretical; they directly influence how we design and scale platforms like PrimeCRM in real-world environments.