Data Normalization
Master normalized vs non-normalized caching strategies for optimal frontend performance
Data Normalization: Frontend Caching Strategies
Normalization is a crucial concept in frontend system design that determines how you structure and cache data on the client side. Understanding when and how to normalize data can dramatically impact your app's performance, memory usage, and maintainability.
What is Data Normalization?
Data normalization in frontend development refers to organizing cached data to eliminate redundancy and ensure consistency. It's similar to database normalization but applied to client-side state management.
Key Benefits:
- Reduced Memory Usage - Eliminate duplicate data
- Consistent Updates - Single source of truth for each entity
- Better Performance - Constant-time lookups by ID and updates that touch a single record
- Easier Maintenance - Simpler state management
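To make the definition concrete, here is a minimal sketch of the core move: turning a list of records into a lookup table keyed by ID. The helper name `indexById` and the `HasId` type are illustrative assumptions, not part of any library.

```typescript
// Hypothetical helper: index a list of records by their `id` field.
type HasId = { id: string };

function indexById<T extends HasId>(items: T[]): Record<string, T> {
  const byId: Record<string, T> = {};
  for (const item of items) {
    // A later record with the same id overwrites the earlier one,
    // so each id ends up stored exactly once.
    byId[item.id] = item;
  }
  return byId;
}

const users = indexById([
  { id: "u1", name: "Alice Johnson" },
  { id: "u2", name: "Bob Smith" },
  { id: "u1", name: "Alice Johnson" }, // duplicate copy of u1
]);
```

Once the data is in this shape, a lookup such as `users["u1"]` no longer needs to scan an array.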
Real-World Example: E-commerce Reviews
Let's examine both approaches using a product review system:
Non-Normalized Approach (Redundant)
{
  "reviews": [
    {
      "id": "r1",
      "rating": 5,
      "comment": "Excellent quality!",
      "user": {
        "id": "u1",
        "name": "Alice Johnson",
        "avatar": "alice.jpg"
      },
      "product": {
        "id": "p1",
        "name": "Wireless Headphones",
        "price": 199.99
      }
    },
    {
      "id": "r2",
      "rating": 4,
      "comment": "Good value for money",
      "user": {
        "id": "u1",
        "name": "Alice Johnson",
        "avatar": "alice.jpg"
      },
      "product": {
        "id": "p1",
        "name": "Wireless Headphones",
        "price": 199.99
      }
    },
    {
      "id": "r3",
      "rating": 3,
      "comment": "Decent sound quality",
      "user": {
        "id": "u2",
        "name": "Bob Smith",
        "avatar": "bob.jpg"
      },
      "product": {
        "id": "p1",
        "name": "Wireless Headphones",
        "price": 199.99
      }
    }
  ]
}
Problems with this approach:
- Data Duplication: Alice's info repeated 2 times, product info repeated 3 times
- Inconsistent Updates: If Alice changes her name, you need to update multiple places
- Memory Waste: Unnecessary storage of duplicate data
- Sync Issues: Hard to keep all copies in sync
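The update problem can be sketched in a few lines. This is an illustration under assumptions: the function names are hypothetical, the review shape is trimmed to the fields needed, and "Alice Smith" is an invented new name.

```typescript
type User = { id: string; name: string; avatar: string };
type NestedReview = { id: string; rating: number; user: User };

const nestedReviews: NestedReview[] = [
  { id: "r1", rating: 5, user: { id: "u1", name: "Alice Johnson", avatar: "alice.jpg" } },
  { id: "r2", rating: 4, user: { id: "u1", name: "Alice Johnson", avatar: "alice.jpg" } },
  { id: "r3", rating: 3, user: { id: "u2", name: "Bob Smith", avatar: "bob.jpg" } },
];

// Renaming a user has to visit every review and rewrite each embedded copy.
function renameUserEverywhere(
  reviews: NestedReview[],
  userId: string,
  name: string
): NestedReview[] {
  return reviews.map((review) =>
    review.user.id === userId
      ? { ...review, user: { ...review.user, name } }
      : review
  );
}

// Count how many embedded copies of one user the list carries.
function countUserCopies(reviews: NestedReview[], userId: string): number {
  return reviews.filter((review) => review.user.id === userId).length;
}

const copiesOfAlice = countUserCopies(nestedReviews, "u1");
const renamed = renameUserEverywhere(nestedReviews, "u1", "Alice Smith");
```

Any code path that forgets to run the full scan leaves some copies stale, which is where the sync issues come from.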
Normalized Approach (Optimized)
{
  "entities": {
    "reviews": {
      "r1": {
        "id": "r1",
        "rating": 5,
        "comment": "Excellent quality!",
        "userId": "u1",
        "productId": "p1"
      },
      "r2": {
        "id": "r2",
        "rating": 4,
        "comment": "Good value for money",
        "userId": "u1",
        "productId": "p1"
      },
      "r3": {
        "id": "r3",
        "rating": 3,
        "comment": "Decent sound quality",
        "userId": "u2",
        "productId": "p1"
      }
    },
    "users": {
      "u1": {
        "id": "u1",
        "name": "Alice Johnson",
        "avatar": "alice.jpg"
      },
      "u2": {
        "id": "u2",
        "name": "Bob Smith",
        "avatar": "bob.jpg"
      }
    },
    "products": {
      "p1": {
        "id": "p1",
        "name": "Wireless Headphones",
        "price": 199.99
      }
    }
  },
  "relationships": {
    "productReviews": {
      "p1": ["r1", "r2", "r3"]
    },
    "userReviews": {
      "u1": ["r1", "r2"],
      "u2": ["r3"]
    }
  }
}
Benefits of this approach:
- Single Source of Truth: Each entity exists only once
- Efficient Updates: Change Alice's name in one place
- Memory Efficient: No duplicate data storage
- Flexible Queries: Easy to fetch related data
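A rough sketch of how the normalized shape is written to and read from. The function names `renameUser` and `reviewsWithAuthors` are hypothetical, products are left out for brevity, and "Alice Smith" is an invented new name.

```typescript
type User = { id: string; name: string; avatar: string };
type Review = { id: string; rating: number; comment: string; userId: string; productId: string };
type Entities = {
  users: Record<string, User>;
  reviews: Record<string, Review>;
};

const entities: Entities = {
  users: {
    u1: { id: "u1", name: "Alice Johnson", avatar: "alice.jpg" },
    u2: { id: "u2", name: "Bob Smith", avatar: "bob.jpg" },
  },
  reviews: {
    r1: { id: "r1", rating: 5, comment: "Excellent quality!", userId: "u1", productId: "p1" },
    r2: { id: "r2", rating: 4, comment: "Good value for money", userId: "u1", productId: "p1" },
    r3: { id: "r3", rating: 3, comment: "Decent sound quality", userId: "u2", productId: "p1" },
  },
};
const productReviews: Record<string, string[]> = { p1: ["r1", "r2", "r3"] };

// Write path: only the single stored user record is replaced.
function renameUser(state: Entities, userId: string, name: string): Entities {
  const user = state.users[userId];
  if (!user) return state;
  return { ...state, users: { ...state.users, [userId]: { ...user, name } } };
}

// Read path: follow the id references to rebuild the nested view on demand.
function reviewsWithAuthors(
  state: Entities,
  productId: string
): Array<Review & { user: User }> {
  const result: Array<Review & { user: User }> = [];
  for (const reviewId of productReviews[productId] ?? []) {
    const review = state.reviews[reviewId];
    const user = review ? state.users[review.userId] : undefined;
    if (review && user) result.push({ ...review, user });
  }
  return result;
}

const updated = renameUser(entities, "u1", "Alice Smith");
const view = reviewsWithAuthors(updated, "p1");
```

The trade-off is visible here: the write is a single assignment, while the read needs a join step that the nested shape got for free.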
Implementation Examples
React with Redux Toolkit
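// Redux Toolkit with createEntityAdapter
import { createEntityAdapter, createSlice } from '@reduxjs/toolkit';

// Entity shapes, matching the normalized JSON above
interface User { id: string; name: string; avatar: string }
interface Product { id: string; name: string; price: number }
interface Review { id: string; rating: number; comment: string; userId: string; productId: string }

// Normalized state structure
interface NormalizedState {
  entities: {
    users: Record<string, User>;
    products: Record<string, Product>;
    reviews: Record<string, Review>;
  };
  relationships: {
    productReviews: Record<string, string[]>;
    userReviews: Record<string, string[]>;
  };
}

// One adapter per entity type; each manages an { ids, entities } table
const reviewsAdapter = createEntityAdapter<Review>();
const usersAdapter = createEntityAdapter<User>();
const productsAdapter = createEntityAdapter<Product>();

const reviewsSlice = createSlice({
  name: 'reviews',
  initialState: reviewsAdapter.getInitialState(),
  reducers: {
    // Adapter CRUD functions can be used directly as case reducers
    addReview: reviewsAdapter.addOne,
    updateReview: reviewsAdapter.updateOne,
    removeReview: reviewsAdapter.removeOne,
  },
});
Apollo Client (GraphQL)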
// Apollo Client automatically normalizes GraphQL data
import { ApolloClient, InMemoryCache, gql } from '@apollo/client';

const client = new ApolloClient({
  cache: new InMemoryCache({
    typePolicies: {
      // keyFields: ['id'] matches Apollo's default when an `id` field is
      // present; it is spelled out here to make the cache key explicit
      Review: {
        keyFields: ['id'],
      },
      User: {
        keyFields: ['id'],
      },
      Product: {
        keyFields: ['id'],
      },
    },
  }),
});

// Query with normalized caching
const GET_REVIEWS = gql`
  query GetReviews($productId: ID!) {
    product(id: $productId) {
      id
      name
      reviews {
        id
        rating
        comment
        user {
          id
          name
          avatar
        }
      }
    }
  }
`;
React Query with Manual Normalization
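import { useQuery, useQueryClient } from '@tanstack/react-query';

// Custom hook for normalized data
const useNormalizedReviews = (productId: string) => {
  const queryClient = useQueryClient();
  return useQuery({
    queryKey: ['reviews', productId],
    queryFn: async () => {
      const response = await fetch(`/api/products/${productId}/reviews`);
      if (!response.ok) {
        throw new Error(`Failed to fetch reviews: ${response.status}`);
      }
      const reviews: Review[] = await response.json();
      // Normalize the data
      const normalized = normalizeReviews(reviews);
      // Merge into the shared entity caches instead of overwriting them,
      // so entities loaded by earlier queries are preserved
      queryClient.setQueryData<Record<string, unknown>>(
        ['entities', 'reviews'],
        (prev) => ({ ...prev, ...normalized.reviews })
      );
      queryClient.setQueryData<Record<string, unknown>>(
        ['entities', 'users'],
        (prev) => ({ ...prev, ...normalized.users })
      );
      return normalized;
    },
  });
};

// Flatten nested API reviews into id-keyed entity tables.
// `Review` here is the nested API shape, with embedded `user` and `product` objects.
const normalizeReviews = (reviews: Review[]) => {
  const normalized = {
    reviews: {} as Record<string, { id: string; rating: number; comment: string; userId: string; productId: string }>,
    users: {} as Record<string, Review['user']>,
    products: {} as Record<string, Review['product']>,
  };
  reviews.forEach(review => {
    normalized.reviews[review.id] = {
      id: review.id,
      rating: review.rating,
      comment: review.comment,
      userId: review.user.id,
      productId: review.product.id,
    };
    normalized.users[review.user.id] = review.user;
    normalized.products[review.product.id] = review.product;
  });
  return normalized;
};
When to Use Each Approach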
Use Non-Normalized When:
- Simple, flat data with no relationships
- Small datasets (roughly under 100 items, as a rule of thumb)
- Read-only data that doesn't change
- Prototyping or MVP development
- Performance is not critical
Use Normalized When:
- Complex, relational data with many entities
- Large datasets (roughly over 100 items, as a rule of thumb)
- Frequently updated data
- Performance-critical applications
- Shared entities (users, products, categories)
- Real-time updates and synchronization
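For the real-time case, a normalized cache lets each pushed event be applied as one keyed write. The sketch below assumes a hypothetical event shape (`ReviewEvent`) and cache layout (`ReviewCache`); a real server's messages would likely differ, and it does not handle a review moving between products.

```typescript
type Review = { id: string; rating: number; comment: string; productId: string };
type ReviewCache = {
  byId: Record<string, Review>;
  idsByProduct: Record<string, string[]>;
};
// Hypothetical event shape, e.g. as pushed over a WebSocket.
type ReviewEvent =
  | { type: "upserted"; review: Review }
  | { type: "removed"; reviewId: string };

function applyReviewEvent(cache: ReviewCache, event: ReviewEvent): ReviewCache {
  if (event.type === "upserted") {
    const { review } = event;
    const ids = cache.idsByProduct[review.productId] ?? [];
    return {
      byId: { ...cache.byId, [review.id]: review },
      idsByProduct: {
        ...cache.idsByProduct,
        // Append the id only if this product's list does not have it yet.
        [review.productId]: ids.indexOf(review.id) === -1 ? [...ids, review.id] : ids,
      },
    };
  }
  const reviewId = event.reviewId;
  const existing = cache.byId[reviewId];
  if (!existing) return cache; // Unknown id: nothing to remove.
  const byId = { ...cache.byId };
  delete byId[reviewId];
  const ids = cache.idsByProduct[existing.productId] ?? [];
  return {
    byId,
    idsByProduct: {
      ...cache.idsByProduct,
      [existing.productId]: ids.filter((id) => id !== reviewId),
    },
  };
}

let cache: ReviewCache = { byId: {}, idsByProduct: {} };
cache = applyReviewEvent(cache, { type: "upserted", review: { id: "r1", rating: 5, comment: "Excellent quality!", productId: "p1" } });
// A second event for the same id updates the record without duplicating it.
cache = applyReviewEvent(cache, { type: "upserted", review: { id: "r1", rating: 4, comment: "Excellent quality!", productId: "p1" } });
cache = applyReviewEvent(cache, { type: "upserted", review: { id: "r2", rating: 3, comment: "Decent sound quality", productId: "p1" } });
cache = applyReviewEvent(cache, { type: "removed", reviewId: "r2" });
```

Because every event resolves to a lookup by ID, repeated or out-of-order upserts for the same review cannot create duplicate entries.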
Performance Comparison
| Metric | Non-Normalized | Normalized |
|---|---|---|
| Memory Usage | High (duplicates) | Low (no duplicates) |
| Update Speed | Slow (multiple updates) | Fast (single update) |
| Query Complexity | Simple | Moderate |
| Cache Hit Rate | Low | High |
| Implementation | Easy | Moderate |
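The memory row can be roughly illustrated by comparing serialized sizes of the two shapes. This is a sketch under assumptions: the sample size of 50 is arbitrary, and JSON length is only a proxy for the payload a client would parse into separate objects, not a measurement of heap usage.

```typescript
type User = { id: string; name: string; avatar: string };
type Product = { id: string; name: string; price: number };

const users: User[] = [
  { id: "u1", name: "Alice Johnson", avatar: "alice.jpg" },
  { id: "u2", name: "Bob Smith", avatar: "bob.jpg" },
];
const product: Product = { id: "p1", name: "Wireless Headphones", price: 199.99 };
const reviewCount = 50; // arbitrary sample size for the comparison

const nested: unknown[] = [];
const normalizedReviews: Record<string, unknown> = {};
for (let i = 0; i < reviewCount; i++) {
  const user = users[i % users.length];
  const id = `r${i + 1}`;
  // Nested shape: every review embeds the full user and product.
  nested.push({ id, rating: 5, user, product });
  // Normalized shape: every review stores only the ids.
  normalizedReviews[id] = { id, rating: 5, userId: user?.id, productId: product.id };
}

const normalized = {
  reviews: normalizedReviews,
  users: { u1: users[0], u2: users[1] },
  products: { p1: product },
};

const nestedSize = JSON.stringify(nested).length;
const normalizedSize = JSON.stringify(normalized).length;
console.log({ nestedSize, normalizedSize });
```

The gap widens as more reviews share the same users and products, and shrinks toward nothing when entities are rarely shared.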
Interview Tips
What Interviewers Look For:
- Understanding of trade-offs between approaches
- Real-world examples of when to use each
- Implementation knowledge with popular libraries
- Performance considerations and optimization strategies
Sample Interview Questions:
- "How would you design the state management for a social media feed?"
- "What caching strategy would you use for an e-commerce product catalog?"
- "How do you handle real-time updates in a normalized cache?"
Key Points to Mention:
- Start simple with non-normalized for MVPs
- Normalize when scaling or when data relationships become complex
- Consider your use case - not every app needs normalization
- Use existing tools like Apollo Client, RTK Query, or TanStack Query
- Measure performance before and after normalization
Popular Libraries & Tools
Built-in Normalization Support:
- Apollo Client - Automatic GraphQL data normalization
- Redux Toolkit - createEntityAdapter for normalized state
- RTK Query - Entity adapters and cache management
- TanStack Query - Manual normalization with query keys
Manual Normalization Helpers:
- normalizr - Schema-based normalization library
- immer - Immutable updates for normalized state
- reselect - Memoized selectors for normalized data
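To show why memoized selectors matter for normalized data, here is a hand-rolled sketch of the idea. It is a simplified stand-in, not reselect's API: the factory name `createReviewListSelector` and the state shape are assumptions made for illustration.

```typescript
type User = { id: string; name: string };
type Review = { id: string; rating: number; userId: string };
type State = { users: Record<string, User>; reviews: Record<string, Review> };
type ReviewView = { id: string; rating: number; userName: string };

// Recompute the joined list only when one of the input tables changes identity.
function createReviewListSelector(): (state: State) => ReviewView[] {
  let lastUsers: State["users"] | undefined;
  let lastReviews: State["reviews"] | undefined;
  let lastResult: ReviewView[] = [];
  return (state) => {
    if (state.users === lastUsers && state.reviews === lastReviews) {
      return lastResult; // Same inputs: reuse the previous array.
    }
    const result: ReviewView[] = [];
    for (const id of Object.keys(state.reviews)) {
      const review = state.reviews[id];
      if (!review) continue;
      const user = state.users[review.userId];
      result.push({
        id: review.id,
        rating: review.rating,
        userName: user ? user.name : "Unknown",
      });
    }
    lastUsers = state.users;
    lastReviews = state.reviews;
    lastResult = result;
    return result;
  };
}

const selectReviewList = createReviewListSelector();
const state: State = {
  users: { u1: { id: "u1", name: "Alice Johnson" } },
  reviews: { r1: { id: "r1", rating: 5, userId: "u1" } },
};
const first = selectReviewList(state);
const second = selectReviewList(state);
const third = selectReviewList({
  ...state,
  users: { ...state.users, u1: { id: "u1", name: "Alice Smith" } },
});
```

Returning the same array reference for unchanged inputs lets UI layers that compare by reference skip re-rendering, which offsets the cost of the join that normalization introduces.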
Summary
Data normalization is a powerful technique for optimizing frontend performance and state management. The key is understanding when to use each approach:
- Start with non-normalized for simple applications
- Migrate to normalized when you hit performance or maintenance issues
- Use existing tools rather than building normalization from scratch
- Measure and optimize based on your specific use case
Remember: the best approach depends on your requirements, data complexity, and performance needs, so weigh the trade-offs before committing to one.