Data Normalization

Master normalized vs non-normalized caching strategies for optimal frontend performance

Data Normalization: Frontend Caching Strategies

Normalization is a crucial concept in frontend system design that determines how you structure and cache data on the client side. Understanding when and how to normalize data can dramatically impact your app's performance, memory usage, and maintainability.


What is Data Normalization?

Data normalization in frontend development refers to organizing cached data to eliminate redundancy and ensure consistency. It's similar to database normalization but applied to client-side state management.

Key Benefits:

  • Reduced Memory Usage - Each entity is stored once instead of once per reference
  • Consistent Updates - Single source of truth for each entity
  • Better Performance - Direct lookups and updates by entity ID
  • Easier Maintenance - Update logic lives in one place per entity type
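
The lookup benefit comes from keying entities by ID. The following is a minimal sketch, assuming a hypothetical `User` shape and helper names chosen for illustration: finding a user in an array scans the list, while a keyed map reads the entry directly.

```typescript
// Hypothetical entity shape, for illustration only.
interface User {
  id: string;
  name: string;
}

// Non-normalized: users live in an array, so a lookup scans the list (O(n)).
const userList: User[] = [
  { id: "u1", name: "Alice Johnson" },
  { id: "u2", name: "Bob Smith" },
];
const findInList = (id: string): User | undefined =>
  userList.find((user) => user.id === id);

// Normalized: users are keyed by ID, so a lookup is a single property read.
const usersById: Record<string, User> = {};
for (const user of userList) {
  usersById[user.id] = user;
}
const findById = (id: string): User | undefined => usersById[id];

console.log(findInList("u2")?.name); // Bob Smith
console.log(findById("u2")?.name); // Bob Smith
```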

Real-World Example: E-commerce Reviews

Let's examine both approaches using a product review system:

Non-Normalized Approach (Redundant)

{
  "reviews": [
    {
      "id": "r1",
      "rating": 5,
      "comment": "Excellent quality!",
      "user": {
        "id": "u1",
        "name": "Alice Johnson",
        "avatar": "alice.jpg"
      },
      "product": {
        "id": "p1",
        "name": "Wireless Headphones",
        "price": 199.99
      }
    },
    {
      "id": "r2", 
      "rating": 4,
      "comment": "Good value for money",
      "user": {
        "id": "u1",
        "name": "Alice Johnson", 
        "avatar": "alice.jpg"
      },
      "product": {
        "id": "p1",
        "name": "Wireless Headphones",
        "price": 199.99
      }
    },
    {
      "id": "r3",
      "rating": 3,
      "comment": "Decent sound quality",
      "user": {
        "id": "u2", 
        "name": "Bob Smith",
        "avatar": "bob.jpg"
      },
      "product": {
        "id": "p1",
        "name": "Wireless Headphones",
        "price": 199.99
      }
    }
  ]
}

Problems with this approach:

  • Data Duplication: Alice's info appears twice and the product info three times
  • Inconsistent Updates: If Alice changes her name, every copy has to be found and updated
  • Memory Waste: Unnecessary storage of duplicate data
  • Sync Issues: Hard to keep all copies in sync
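
To make the update problem concrete, here is a small sketch using the nested shape above (product fields omitted for brevity). The type and function names are hypothetical. A correct rename has to rewrite every embedded copy; a rename that patches only the first match leaves a stale copy behind.

```typescript
// Hypothetical shapes mirroring the nested JSON above.
interface NestedUser { id: string; name: string; avatar: string }
interface NestedReview { id: string; rating: number; user: NestedUser }

const nestedReviews: NestedReview[] = [
  { id: "r1", rating: 5, user: { id: "u1", name: "Alice Johnson", avatar: "alice.jpg" } },
  { id: "r2", rating: 4, user: { id: "u1", name: "Alice Johnson", avatar: "alice.jpg" } },
  { id: "r3", rating: 3, user: { id: "u2", name: "Bob Smith", avatar: "bob.jpg" } },
];

// Correct: visit every review and rewrite each embedded copy of the user.
const renameUserEverywhere = (
  reviews: NestedReview[],
  userId: string,
  newName: string,
): NestedReview[] =>
  reviews.map((review) =>
    review.user.id === userId
      ? { ...review, user: { ...review.user, name: newName } }
      : review,
  );

// Buggy: only the first matching review is patched, so other copies go stale.
const renameFirstMatchOnly = (
  reviews: NestedReview[],
  userId: string,
  newName: string,
): NestedReview[] => {
  const firstIndex = reviews.findIndex((review) => review.user.id === userId);
  return reviews.map((review, index) =>
    index === firstIndex
      ? { ...review, user: { ...review.user, name: newName } }
      : review,
  );
};

const correctNames = renameUserEverywhere(nestedReviews, "u1", "Alice Smith")
  .map((review) => review.user.name);
const staleNames = renameFirstMatchOnly(nestedReviews, "u1", "Alice Smith")
  .map((review) => review.user.name);

console.log(correctNames); // ["Alice Smith", "Alice Smith", "Bob Smith"]
console.log(staleNames); // ["Alice Smith", "Alice Johnson", "Bob Smith"]
```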

Normalized Approach (Optimized)

{
  "entities": {
    "reviews": {
      "r1": {
        "id": "r1",
        "rating": 5,
        "comment": "Excellent quality!",
        "userId": "u1",
        "productId": "p1"
      },
      "r2": {
        "id": "r2", 
        "rating": 4,
        "comment": "Good value for money",
        "userId": "u1",
        "productId": "p1"
      },
      "r3": {
        "id": "r3",
        "rating": 3,
        "comment": "Decent sound quality", 
        "userId": "u2",
        "productId": "p1"
      }
    },
    "users": {
      "u1": {
        "id": "u1",
        "name": "Alice Johnson",
        "avatar": "alice.jpg"
      },
      "u2": {
        "id": "u2",
        "name": "Bob Smith", 
        "avatar": "bob.jpg"
      }
    },
    "products": {
      "p1": {
        "id": "p1",
        "name": "Wireless Headphones",
        "price": 199.99
      }
    }
  },
  "relationships": {
    "productReviews": {
      "p1": ["r1", "r2", "r3"]
    },
    "userReviews": {
      "u1": ["r1", "r2"],
      "u2": ["r3"]
    }
  }
}

Benefits of this approach:

  • Single Source of Truth: Each entity exists only once
  • Efficient Updates: Change Alice's name in one place
  • Memory Efficient: No duplicate data storage
  • Flexible Queries: Related data is one ID lookup away (for example, relationships.productReviews["p1"] lists a product's review IDs)
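
The sketch below shows both halves of working with this shape, assuming trimmed-down entity types and hypothetical helper names: a rename touches exactly one record, and a read joins reviews back to their authors by ID.

```typescript
interface UserEntity { id: string; name: string }
interface ReviewEntity { id: string; rating: number; userId: string; productId: string }

interface NormalizedCache {
  entities: {
    users: Record<string, UserEntity>;
    reviews: Record<string, ReviewEntity>;
  };
  relationships: { productReviews: Record<string, string[]> };
}

const reviewCache: NormalizedCache = {
  entities: {
    users: {
      u1: { id: "u1", name: "Alice Johnson" },
      u2: { id: "u2", name: "Bob Smith" },
    },
    reviews: {
      r1: { id: "r1", rating: 5, userId: "u1", productId: "p1" },
      r2: { id: "r2", rating: 4, userId: "u1", productId: "p1" },
      r3: { id: "r3", rating: 3, userId: "u2", productId: "p1" },
    },
  },
  relationships: { productReviews: { p1: ["r1", "r2", "r3"] } },
};

// Renaming a user touches one record, however many reviews reference it.
const renameUser = (
  state: NormalizedCache,
  userId: string,
  newName: string,
): NormalizedCache => {
  const existing = state.entities.users[userId];
  if (!existing) return state;
  return {
    ...state,
    entities: {
      ...state.entities,
      users: { ...state.entities.users, [userId]: { ...existing, name: newName } },
    },
  };
};

// Reading joins reviews to their authors by ID ("denormalizing" on demand).
const selectReviewsForProduct = (state: NormalizedCache, productId: string) => {
  const rows: Array<ReviewEntity & { authorName: string }> = [];
  for (const reviewId of state.relationships.productReviews[productId] ?? []) {
    const review = state.entities.reviews[reviewId];
    const author = review ? state.entities.users[review.userId] : undefined;
    if (review && author) rows.push({ ...review, authorName: author.name });
  }
  return rows;
};

const renamedCache = renameUser(reviewCache, "u1", "Alice Smith");
const authorNames = selectReviewsForProduct(renamedCache, "p1").map((row) => row.authorName);
console.log(authorNames); // ["Alice Smith", "Alice Smith", "Bob Smith"]
```

The trade-off is visible in `selectReviewsForProduct`: writes get simpler, while reads need an explicit join step.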

Implementation Examples

React with Redux Toolkit

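// Redux Toolkit with createEntityAdapter
import { createEntityAdapter, createSlice } from '@reduxjs/toolkit';

// Entity shapes matching the normalized JSON above
interface User { id: string; name: string; avatar: string }
interface Product { id: string; name: string; price: number }
interface Review {
  id: string;
  rating: number;
  comment: string;
  userId: string;
  productId: string;
}

// Conceptual normalized state structure (mirrors the JSON above)
interface NormalizedState {
  entities: {
    users: Record<string, User>;
    products: Record<string, Product>;
    reviews: Record<string, Review>;
  };
  relationships: {
    productReviews: Record<string, string[]>;
    userReviews: Record<string, string[]>;
  };
}

// Each adapter manages one entity type and stores it as { ids, entities }
const reviewsAdapter = createEntityAdapter<Review>();
const usersAdapter = createEntityAdapter<User>();
const productsAdapter = createEntityAdapter<Product>();

const reviewsSlice = createSlice({
  name: 'reviews',
  initialState: reviewsAdapter.getInitialState(),
  reducers: {
    addReview: reviewsAdapter.addOne,
    updateReview: reviewsAdapter.updateOne, // payload: { id, changes }
    removeReview: reviewsAdapter.removeOne, // payload: the review id
  },
});

// The users and products slices follow the same pattern with their own adapters.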
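
To make the adapter's output concrete, here is a hand-rolled sketch of the `{ ids, entities }` shape that `createEntityAdapter` maintains. The helper functions below are simplified stand-ins written for this illustration, not the library's implementation; in a real app you would call the adapter methods shown above.

```typescript
interface Review {
  id: string;
  rating: number;
  comment: string;
  userId: string;
  productId: string;
}

// An ordered ID list plus a lookup table keyed by ID.
interface EntityState<T> {
  ids: string[];
  entities: Record<string, T>;
}

// Add an entity unless one with the same ID is already present.
const addOne = <T extends { id: string }>(state: EntityState<T>, entity: T): EntityState<T> =>
  entity.id in state.entities
    ? state
    : { ids: [...state.ids, entity.id], entities: { ...state.entities, [entity.id]: entity } };

// Shallow-merge `changes` into an existing entity.
const updateOne = <T extends { id: string }>(
  state: EntityState<T>,
  update: { id: string; changes: Partial<T> },
): EntityState<T> => {
  const existing = state.entities[update.id];
  if (!existing) return state;
  const merged: T = { ...existing, ...update.changes };
  return { ...state, entities: { ...state.entities, [update.id]: merged } };
};

// Remove an entity from both the ID list and the lookup table.
const removeOne = <T>(state: EntityState<T>, id: string): EntityState<T> => {
  const { [id]: _removed, ...rest } = state.entities;
  return { ids: state.ids.filter((existingId) => existingId !== id), entities: rest };
};

let reviewState: EntityState<Review> = { ids: [], entities: {} };
reviewState = addOne(reviewState, {
  id: "r1", rating: 5, comment: "Excellent quality!", userId: "u1", productId: "p1",
});
reviewState = addOne(reviewState, {
  id: "r2", rating: 4, comment: "Good value for money", userId: "u1", productId: "p1",
});
reviewState = updateOne(reviewState, { id: "r1", changes: { rating: 4 } });
reviewState = removeOne(reviewState, "r2");

console.log(reviewState.ids); // ["r1"]
console.log(reviewState.entities["r1"]?.rating); // 4
```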

Apollo Client (GraphQL)

// Apollo Client automatically normalizes GraphQL data
import { ApolloClient, InMemoryCache, gql } from '@apollo/client';

const client = new ApolloClient({
  uri: '/graphql', // placeholder endpoint; replace with your API's URL
  cache: new InMemoryCache({
    // Apollo already identifies objects by __typename plus id by default;
    // keyFields is spelled out here to make the cache key explicit
    typePolicies: {
      Review: {
        keyFields: ['id'],
      },
      User: {
        keyFields: ['id'],
      },
      Product: {
        keyFields: ['id'],
      },
    },
  }),
});

// Query with normalized caching
const GET_REVIEWS = gql`
  query GetReviews($productId: ID!) {
    product(id: $productId) {
      id
      name
      reviews {
        id
        rating
        comment
        user {
          id
          name
          avatar
        }
      }
    }
  }
`;
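
Conceptually, the cache stores each identifiable object once under a key such as `User:u1` and replaces nested occurrences with a reference. The sketch below illustrates that idea only; `writeToStore` and the types around it are hypothetical and much simpler than Apollo's actual implementation. It assumes every object carries `__typename`, which Apollo Client adds to queries by default.

```typescript
// Simplified illustration of reference-based normalization; NOT Apollo's code.
type Scalar = string | number | boolean | null;
type ResultValue = Scalar | ResultObject | ResultValue[];
interface ResultObject { [field: string]: ResultValue }
interface Ref { __ref: string }
type StoredValue = Scalar | Ref | StoredObject | StoredValue[];
interface StoredObject { [field: string]: StoredValue }
type FlatStore = Record<string, StoredObject>;

const isObject = (value: ResultValue): value is ResultObject =>
  typeof value === "object" && value !== null && !Array.isArray(value);

const writeToStore = (value: ResultValue, store: FlatStore): StoredValue => {
  if (Array.isArray(value)) return value.map((item) => writeToStore(item, store));
  if (!isObject(value)) return value;

  // Normalize children first so nested entities become references.
  const stored: StoredObject = {};
  for (const [field, child] of Object.entries(value)) {
    stored[field] = writeToStore(child, store);
  }

  const { __typename, id } = value;
  if (typeof __typename === "string" && typeof id === "string") {
    const cacheId = `${__typename}:${id}`;
    store[cacheId] = { ...store[cacheId], ...stored }; // merge with earlier writes
    return { __ref: cacheId };
  }
  return stored; // objects without an identity stay embedded in their parent
};

const flatStore: FlatStore = {};
writeToStore(
  {
    __typename: "Product",
    id: "p1",
    name: "Wireless Headphones",
    reviews: [
      { __typename: "Review", id: "r1", rating: 5,
        user: { __typename: "User", id: "u1", name: "Alice Johnson" } },
      { __typename: "Review", id: "r2", rating: 4,
        user: { __typename: "User", id: "u1", name: "Alice Johnson" } },
    ],
  },
  flatStore,
);

// Four entries: Product:p1, Review:r1, Review:r2, and a single User:u1.
console.log(Object.keys(flatStore).length); // 4
console.log(JSON.stringify(flatStore["Review:r2"]?.["user"])); // {"__ref":"User:u1"}
```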

React Query with Manual Normalization

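import { useQuery, useQueryClient } from '@tanstack/react-query';

// Shape returned by the API: user and product are embedded in each review
interface User { id: string; name: string; avatar: string }
interface Product { id: string; name: string; price: number }
interface ApiReview {
  id: string;
  rating: number;
  comment: string;
  user: User;
  product: Product;
}

// Flat review stored in the cache: related entities are referenced by ID
interface ReviewEntity {
  id: string;
  rating: number;
  comment: string;
  userId: string;
  productId: string;
}

interface NormalizedReviews {
  reviews: Record<string, ReviewEntity>;
  users: Record<string, User>;
  products: Record<string, Product>;
}

// Custom hook for normalized data
const useNormalizedReviews = (productId: string) => {
  const queryClient = useQueryClient();

  return useQuery({
    queryKey: ['reviews', productId],
    queryFn: async () => {
      const response = await fetch(`/api/products/${productId}/reviews`);
      if (!response.ok) {
        throw new Error(`Failed to fetch reviews: ${response.status}`);
      }
      const reviews: ApiReview[] = await response.json();

      // Normalize the data
      const normalized = normalizeReviews(reviews);

      // Merge into the shared entity caches instead of overwriting them,
      // so entities loaded for other products are kept
      queryClient.setQueryData<Record<string, ReviewEntity>>(
        ['entities', 'reviews'],
        (old) => ({ ...old, ...normalized.reviews }),
      );
      queryClient.setQueryData<Record<string, User>>(
        ['entities', 'users'],
        (old) => ({ ...old, ...normalized.users }),
      );

      return normalized;
    },
  });
};

const normalizeReviews = (reviews: ApiReview[]): NormalizedReviews => {
  const normalized: NormalizedReviews = { reviews: {}, users: {}, products: {} };

  reviews.forEach(review => {
    normalized.reviews[review.id] = {
      id: review.id,
      rating: review.rating,
      comment: review.comment,
      userId: review.user.id,
      productId: review.product.id,
    };

    normalized.users[review.user.id] = review.user;
    normalized.products[review.product.id] = review.product;
  });

  return normalized;
};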
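
When entity maps are shared across queries, each new batch needs to be merged into what is already cached rather than replacing it, otherwise entities loaded by an earlier request are lost. A minimal sketch of such a merge, using a hypothetical `mergeEntities` helper and `CachedUser` type:

```typescript
interface CachedUser { id: string; name: string; avatar?: string }
type EntityMap<T> = Record<string, T>;

// Merge an incoming batch into the existing map, field by field per entity,
// so a partial record does not erase fields supplied by an earlier response.
const mergeEntities = <T extends { id: string }>(
  existing: EntityMap<T> | undefined,
  incoming: EntityMap<T>,
): EntityMap<T> => {
  const merged: EntityMap<T> = existing ? { ...existing } : {};
  for (const [id, entity] of Object.entries(incoming)) {
    merged[id] = { ...merged[id], ...entity };
  }
  return merged;
};

const afterFirstFetch = mergeEntities<CachedUser>(undefined, {
  u1: { id: "u1", name: "Alice Johnson", avatar: "alice.jpg" },
});
const afterSecondFetch = mergeEntities<CachedUser>(afterFirstFetch, {
  u1: { id: "u1", name: "Alice Smith" }, // partial record: no avatar
  u2: { id: "u2", name: "Bob Smith", avatar: "bob.jpg" },
});

console.log(afterSecondFetch["u1"]);
// { id: "u1", name: "Alice Smith", avatar: "alice.jpg" }
console.log(Object.keys(afterSecondFetch)); // ["u1", "u2"]
```

Whether a field-level merge or a whole-entity replace is right depends on whether your API returns partial records; the sketch assumes it can.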

When to Use Each Approach

Use Non-Normalized When:

  • Simple, flat data with no relationships
  • Small datasets (roughly under 100 items, as a rule of thumb)
  • Read-only data that doesn't change
  • Prototyping or MVP development
  • Performance is not critical

Use Normalized When:

  • Complex, relational data with many entities
  • Large datasets (roughly over 100 items, as a rule of thumb)
  • Frequently updated data
  • Performance-critical applications
  • Shared entities (users, products, categories)
  • Real-time updates and synchronization

Performance Comparison

Metric              Non-Normalized             Normalized
Memory Usage        High (duplicates)          Low (no duplicates)
Update Speed        Slow (multiple updates)    Fast (single update)
Query Complexity    Simple                     Moderate
Cache Hit Rate      Low                        High
Implementation      Easy                       Moderate
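
The memory row can be sanity-checked with a rough measurement. The sketch below compares the serialized size of both shapes for a varying number of reviews. It uses JSON length as a proxy for payload size, not actual heap usage, and all names and sample data are assumptions for illustration.

```typescript
interface FlatReview {
  id: string;
  rating: number;
  comment: string;
  userId: string;
  productId: string;
}

const author = { id: "u1", name: "Alice Johnson", avatar: "alice.jpg" };
const headphones = { id: "p1", name: "Wireless Headphones", price: 199.99 };

// Nested shape: every review embeds a full copy of the user and product.
const buildNested = (count: number) =>
  Array.from({ length: count }, (_, index) => ({
    id: `r${index}`,
    rating: 5,
    comment: "Excellent quality!",
    user: author,
    product: headphones,
  }));

// Normalized shape: reviews reference the user and product by ID.
const buildNormalized = (count: number) => {
  const reviews: Record<string, FlatReview> = {};
  for (let index = 0; index < count; index++) {
    reviews[`r${index}`] = {
      id: `r${index}`,
      rating: 5,
      comment: "Excellent quality!",
      userId: author.id,
      productId: headphones.id,
    };
  }
  return {
    reviews,
    users: { [author.id]: author },
    products: { [headphones.id]: headphones },
  };
};

// Serialized length as a rough proxy for how much data each shape carries.
const serializedSize = (value: unknown): number => JSON.stringify(value).length;

for (const count of [1, 10, 100]) {
  console.log(count, serializedSize(buildNested(count)), serializedSize(buildNormalized(count)));
}
```

With a single review the normalized form comes out larger because of its wrapper keys; the savings appear once the same user and product are referenced many times.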

Interview Tips

What Interviewers Look For:

  1. Understanding of trade-offs between approaches
  2. Real-world examples of when to use each
  3. Implementation knowledge with popular libraries
  4. Performance considerations and optimization strategies

Sample Interview Questions:

  • "How would you design the state management for a social media feed?"
  • "What caching strategy would you use for an e-commerce product catalog?"
  • "How do you handle real-time updates in a normalized cache?"
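
For the last question, one possible answer is to treat each pushed event as a small, immutable patch to the entity tables. The sketch below assumes hypothetical event shapes (`review/upserted`, `review/removed`) arriving over a WebSocket or similar channel; the names are illustrative, not from any specific library.

```typescript
interface LiveReview {
  id: string;
  rating: number;
  comment: string;
  userId: string;
  productId: string;
}

interface LiveState {
  reviews: Record<string, LiveReview>;
  productReviews: Record<string, string[]>;
}

// Hypothetical event shapes pushed by the server.
type ReviewEvent =
  | { type: "review/upserted"; review: LiveReview }
  | { type: "review/removed"; reviewId: string };

const applyEvent = (state: LiveState, event: ReviewEvent): LiveState => {
  switch (event.type) {
    case "review/upserted": {
      const { review } = event;
      const linked = state.productReviews[review.productId] ?? [];
      return {
        reviews: { ...state.reviews, [review.id]: review },
        productReviews: {
          ...state.productReviews,
          // Link the review to its product only once.
          [review.productId]: linked.includes(review.id) ? linked : [...linked, review.id],
        },
      };
    }
    case "review/removed": {
      const removed = state.reviews[event.reviewId];
      if (!removed) return state;
      const { [event.reviewId]: _dropped, ...remaining } = state.reviews;
      const linked = state.productReviews[removed.productId] ?? [];
      return {
        reviews: remaining,
        productReviews: {
          ...state.productReviews,
          [removed.productId]: linked.filter((id) => id !== event.reviewId),
        },
      };
    }
  }
};

let liveState: LiveState = { reviews: {}, productReviews: {} };
liveState = applyEvent(liveState, {
  type: "review/upserted",
  review: { id: "r1", rating: 5, comment: "Excellent quality!", userId: "u1", productId: "p1" },
});
liveState = applyEvent(liveState, {
  type: "review/upserted",
  review: { id: "r1", rating: 4, comment: "Still good", userId: "u1", productId: "p1" },
});

console.log(liveState.productReviews["p1"]); // ["r1"]
console.log(liveState.reviews["r1"]?.rating); // 4
```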

Key Points to Mention:

  • Start simple with non-normalized for MVPs
  • Normalize when scaling or when data relationships become complex
  • Consider your use case - not every app needs normalization
  • Use existing tools such as Apollo Client, Redux Toolkit's createEntityAdapter, or normalizr
  • Measure performance before and after normalization

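Library Support for Normalization:

  • Apollo Client - Automatic GraphQL data normalization
  • Redux Toolkit - createEntityAdapter for normalized state
  • RTK Query - Caches responses per endpoint and arguments without normalizing them; combine with createEntityAdapter when you need normalized entities
  • TanStack Query - Caches by query key without built-in normalization; normalize manually as shown above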

Manual Normalization Helpers:

  • normalizr - Schema-based normalization library
  • immer - Immutable updates for normalized state
  • reselect - Memoized selectors for normalized data
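
To show why memoized selectors pair well with normalized state, here is a hand-rolled sketch of the idea. `memoizeByReference` is a hypothetical, simplified stand-in and not reselect's API: it recomputes a derived list only when the state object it receives changes identity, which is exactly what immutable updates provide.

```typescript
interface SelectorUser { id: string; name: string }
interface SelectorState {
  userIds: string[];
  users: Record<string, SelectorUser>;
}

// Minimal single-argument memoizer: recompute only when the input reference changes.
const memoizeByReference = <Input, Output>(compute: (input: Input) => Output) => {
  let cached: { input: Input; output: Output } | undefined;
  return (input: Input): Output => {
    if (!cached || cached.input !== input) {
      cached = { input, output: compute(input) };
    }
    return cached.output;
  };
};

let recomputations = 0;
const selectUserList = memoizeByReference((state: SelectorState): SelectorUser[] => {
  recomputations += 1;
  const list: SelectorUser[] = [];
  for (const id of state.userIds) {
    const user = state.users[id];
    if (user) list.push(user);
  }
  return list;
});

const stateA: SelectorState = {
  userIds: ["u1", "u2"],
  users: {
    u1: { id: "u1", name: "Alice Johnson" },
    u2: { id: "u2", name: "Bob Smith" },
  },
};
const firstRead = selectUserList(stateA);
const secondRead = selectUserList(stateA); // same state reference: cached array is reused

// An immutable update produces a new state object, which invalidates the cache.
const stateB: SelectorState = {
  ...stateA,
  users: { ...stateA.users, u1: { id: "u1", name: "Alice Smith" } },
};
const thirdRead = selectUserList(stateB);

console.log(firstRead === secondRead); // true
console.log(recomputations); // 2
console.log(thirdRead[0]?.name); // Alice Smith
```

Returning the same array reference for unchanged state is what lets UI frameworks skip re-rendering components that read the derived list.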

Summary

Data normalization is a powerful technique for optimizing frontend performance and state management. The key is understanding when to use each approach:

  • Start with non-normalized for simple applications
  • Migrate to normalized when you hit performance or maintenance issues
  • Use existing tools rather than building normalization from scratch
  • Measure and optimize based on your specific use case

Remember: The best approach depends on your requirements, data complexity, and performance needs, so weigh the trade-offs before committing to either one.

