Prepared for: SDE Interview Preparation
Last Updated: February 2026
Focus Areas: Fundamentals, API Design, System Architecture, Databases, Performance, Security
Q: What is a backend system?
Answer:
A backend system is the server-side component of an application that handles business logic, data storage and retrieval, authentication and authorization, and communication with clients through APIs.
Key Components: web/application server, business logic layer, database, caching layer, and background job workers.
Q: What is idempotency? Which HTTP methods are idempotent?
Answer:
Idempotency means that making the same request multiple times produces the same result as making it once. The operation can be repeated safely without unintended side effects.
Example:
DELETE /users/123 → Deletes user 123
DELETE /users/123 → Same result (user 123 still doesn't exist)
HTTP Methods and Idempotency:
| Method | Idempotent? | Safe? | Why? |
|---|---|---|---|
| GET | ✅ Yes | ✅ Yes | Only reads data, no side effects |
| POST | ❌ No | ❌ No | Creates new resource each time |
| PUT | ✅ Yes | ❌ No | Replaces resource with same result |
| PATCH | ⚠️ Depends | ❌ No | Depends on implementation |
| DELETE | ✅ Yes | ❌ No | Deleting deleted item = same state |
| HEAD | ✅ Yes | ✅ Yes | Like GET but only headers |
| OPTIONS | ✅ Yes | ✅ Yes | Returns allowed methods |
Detailed Explanation:
GET - Idempotent & Safe
GET /users/123 (first call) → Returns user data
GET /users/123 (second call) → Returns same user data
POST - NOT Idempotent
POST /users {name: "John"} → Creates user with ID 1
POST /users {name: "John"} → Creates NEW user with ID 2
PUT - Idempotent
PUT /users/123 {name: "John", age: 25} → Updates user 123
PUT /users/123 {name: "John", age: 25} → Same result
DELETE - Idempotent
DELETE /users/123 → Deletes user 123, returns 200
DELETE /users/123 → User still deleted, might return 404 but state unchanged
PATCH - Depends on Implementation
# Idempotent PATCH (absolute values)
PATCH /users/123 {age: 25} → Sets age to 25
PATCH /users/123 {age: 25} → Age still 25
# NOT Idempotent PATCH (relative values)
PATCH /users/123 {age_increment: 1} → Age increases by 1 each time
Real-World Importance: clients, proxies, and SDKs retry requests after timeouts or network failures. Idempotent operations make those retries safe, which is critical for payments, order creation, and messaging.
How to Make POST Idempotent:
Use Idempotency Keys:
POST /payments
Headers: { "Idempotency-Key": "uuid-12345" }
Body: { amount: 100, user_id: 123 }
// Server checks if this key was already processed
// If yes: return cached response
// If no: process and cache response with key
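The server-side check described in the comments above can be sketched in plain JavaScript. This is an in-memory toy (the `processPayment` side effect and `handlePayment` names are illustrative); a real service would persist keys in Redis or a database with a TTL:

```javascript
// Minimal sketch of server-side idempotency-key handling.
const processed = new Map(); // idempotencyKey -> cached response

let paymentCounter = 0; // stands in for the real side effect (charging a card)
function processPayment(amount) {
  paymentCounter += 1;
  return { paymentId: paymentCounter, amount, status: 'charged' };
}

function handlePayment(idempotencyKey, amount) {
  if (processed.has(idempotencyKey)) {
    return processed.get(idempotencyKey); // replay the cached response
  }
  const response = processPayment(amount);
  processed.set(idempotencyKey, response);
  return response;
}

const first = handlePayment('uuid-12345', 100);
const retry = handlePayment('uuid-12345', 100); // e.g. a network retry
console.log(first.paymentId === retry.paymentId); // true: charged only once
```

The retry returns the cached response instead of charging the customer a second time, which is exactly what makes retries safe.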
Q: What is REST?
Answer:
REST (Representational State Transfer) is an architectural style for designing networked applications using HTTP methods and standard conventions.
REST Principles: client-server separation, statelessness, cacheability, uniform interface, and layered system.
RESTful URL Design:
✅ GOOD:
GET /users → List all users
GET /users/123 → Get user 123
POST /users → Create new user
PUT /users/123 → Update user 123
DELETE /users/123 → Delete user 123
GET /users/123/posts → Get posts by user 123
❌ BAD:
GET /getUsers
POST /createUser
GET /user?id=123
POST /deleteUser/123
Resource Naming Rules:
Use nouns, not verbs (/users not /getUsers)
Use plural nouns (/users not /user)
Use hyphens for multi-word resources (/user-profiles)
Nest resources to show relationships (/users/123/orders)
Q: What is the difference between SQL and NoSQL databases?
Answer:
| Aspect | SQL (Relational) | NoSQL (Non-Relational) |
|---|---|---|
| Structure | Fixed schema, tables with rows/columns | Flexible schema, documents/key-value/graphs |
| Scalability | Vertical (more powerful server) | Horizontal (more servers) |
| Examples | MySQL, PostgreSQL, Oracle | MongoDB, Redis, Cassandra, DynamoDB |
| Best For | Complex queries, transactions, relationships | Big data, rapid development, unstructured data |
| ACID | Full ACID compliance | Eventual consistency (most) |
| Joins | Powerful JOIN operations | Limited or no joins |
When to Use SQL: complex queries and reporting, multi-table transactions, strong consistency requirements, well-defined relationships between entities.
When to Use NoSQL: flexible or rapidly evolving schemas, very high write throughput, horizontal scaling across many servers, simple key-based access patterns.
Example - Same Data in Both:
SQL (PostgreSQL):
-- Users table
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100) UNIQUE
);
-- Posts table
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
title VARCHAR(200),
content TEXT,
created_at TIMESTAMP
);
-- Query with JOIN
SELECT u.name, p.title
FROM users u
JOIN posts p ON u.id = p.user_id
WHERE u.id = 123;
NoSQL (MongoDB):
// Embedded document approach
{
_id: ObjectId("123"),
name: "John Doe",
email: "john@example.com",
posts: [
{
title: "My First Post",
content: "Hello World",
created_at: ISODate("2026-01-01")
}
]
}
// Query
db.users.findOne({ _id: ObjectId("123") })
Q: What is the CAP Theorem?
Answer:
CAP Theorem states that during a network partition, a distributed database system can only guarantee 2 out of 3 properties:
Consistency (C): every read sees the most recent write
Availability (A): every request receives a (non-error) response
Partition Tolerance (P): the system keeps operating despite network failures between nodes
Visual Representation:
            C (Consistency)
                 /\
                /  \
               /    \
              /      \
             /        \
            /__________\
           A            P
   (Availability)  (Partition Tolerance)
Common Database Choices:
| Type | Databases | Trade-off |
|---|---|---|
| CP | MongoDB, HBase, Redis | Consistency + Partition Tolerance (may be unavailable) |
| AP | Cassandra, DynamoDB, CouchDB | Availability + Partition Tolerance (eventual consistency) |
| CA | PostgreSQL, MySQL | Consistency + Availability (no partition tolerance - not truly distributed) |
Real-World Example:
Scenario: Banking System
Option 1 (CP - Choose Consistency):
- During network partition, reject transactions
- Prevents incorrect balances
- Some users can't access accounts (unavailable)
Option 2 (AP - Choose Availability):
- Allow all transactions during partition
- Users can always access accounts
- Risk of inconsistent balances temporarily
Example: Instagram Story Views: choose AP; a slightly stale view count is acceptable, but the feature must stay available.
Example: Bank Account Balance: choose CP; an incorrect balance is worse than a brief outage.
Q: What are the ACID properties?
Answer:
ACID ensures reliable transaction processing in databases:
A - Atomicity: all operations in a transaction succeed, or none do
C - Consistency: a transaction moves the database from one valid state to another
I - Isolation: concurrent transactions don't see each other's intermediate states
D - Durability: once committed, changes survive crashes and power loss
Example - Bank Transfer:
BEGIN TRANSACTION;
-- Deduct from Account A
UPDATE accounts SET balance = balance - 100 WHERE id = 'A';
-- Add to Account B
UPDATE accounts SET balance = balance + 100 WHERE id = 'B';
COMMIT;
-- If ANY step fails:
ROLLBACK; -- Both updates are undone (Atomicity)
Without ACID:
1. Deduct $100 from Account A ✓
2. [SYSTEM CRASH] ✗
3. Add $100 to Account B (never happens)
Result: Money disappeared!
With ACID:
1. Deduct $100 from Account A ✓
2. [SYSTEM CRASH] ✗
3. Transaction ROLLBACK
Result: Both accounts unchanged, retry the transaction
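The rollback behavior above can be simulated in plain JavaScript. This is a toy snapshot-and-restore sketch for intuition only; real databases implement atomicity with write-ahead logs, not object copies:

```javascript
// Toy atomicity: snapshot state, restore it if any step fails.
const accounts = { A: 500, B: 500 };

function transfer(from, to, amount, { failAfterDebit = false } = {}) {
  const snapshot = { ...accounts }; // "BEGIN TRANSACTION"
  try {
    accounts[from] -= amount;
    if (failAfterDebit) throw new Error('system crash'); // simulated failure
    accounts[to] += amount;
    // "COMMIT": snapshot discarded, changes kept
  } catch (err) {
    Object.assign(accounts, snapshot); // "ROLLBACK": restore prior state
  }
}

transfer('A', 'B', 100, { failAfterDebit: true });
console.log(accounts); // { A: 500, B: 500 } -- no money disappeared
```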
Isolation Levels (from weakest to strongest): Read Uncommitted, Read Committed, Repeatable Read, Serializable.
Q: What is the N+1 query problem and how do you solve it?
Answer:
N+1 problem occurs when you execute 1 query to fetch N records, then N additional queries to fetch related data.
Example - Bad (N+1 Queries):
# Fetch all users (1 query)
users = db.query("SELECT * FROM users") # Returns 100 users
# For each user, fetch their posts (100 queries)
for user in users:
posts = db.query(f"SELECT * FROM posts WHERE user_id = {user.id}")
user.posts = posts
# Total: 1 + 100 = 101 queries!
Example - Good (2 Queries with JOIN or Eager Loading):
# Option 1: Single JOIN query
result = db.query("""
SELECT users.*, posts.*
FROM users
LEFT JOIN posts ON users.id = posts.user_id
""")
# Option 2: Fetch in bulk (2 queries)
users = db.query("SELECT * FROM users")
user_ids = [u.id for u in users]
placeholders = ", ".join(["%s"] * len(user_ids))
posts = db.query(f"SELECT * FROM posts WHERE user_id IN ({placeholders})", user_ids)  # parameterized, avoids SQL injection
# Group posts by user_id
# Total: 2 queries instead of 101!
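The "group posts by user_id" step in option 2 can be sketched as follows (shown here in JavaScript with hard-coded rows standing in for the two query results):

```javascript
// Attach each user's posts after two bulk queries, without extra round trips.
const users = [{ id: 1 }, { id: 2 }];
const posts = [
  { id: 101, user_id: 1, title: 'Post A' },
  { id: 102, user_id: 1, title: 'Post B' },
  { id: 103, user_id: 2, title: 'Post C' },
];

// Build a user_id -> [posts] lookup in one pass
const byUser = new Map();
for (const post of posts) {
  if (!byUser.has(post.user_id)) byUser.set(post.user_id, []);
  byUser.get(post.user_id).push(post);
}

// Attach posts to each user (empty array if none)
for (const user of users) {
  user.posts = byUser.get(user.id) || [];
}

console.log(users[0].posts.length); // 2
```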
In ORMs:
Django (Bad):
# Causes N+1
users = User.objects.all()
for user in users:
print(user.posts.all()) # New query each iteration
Django (Good):
# Two queries total: one for users, one bulk query for all their posts
users = User.objects.prefetch_related('posts')
for user in users:
print(user.posts.all()) # No additional query
SQLAlchemy (Good):
# Eager loading
users = session.query(User).options(joinedload(User.posts)).all()
Q: What is a database index and how does it work?
Answer:
An index is a data structure that improves query speed by allowing the database to find rows faster without scanning the entire table.
How Indexes Work:
Think of a book index - instead of reading every page to find "API", you check the index which points to specific pages.
Example:
-- Without index: Full table scan O(n)
SELECT * FROM users WHERE email = 'john@example.com';
-- Database scans ALL rows
-- Create index
CREATE INDEX idx_email ON users(email);
-- With index: O(log n) - much faster!
SELECT * FROM users WHERE email = 'john@example.com';
-- Database uses index to jump directly to row
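The O(n) vs O(log n) difference can be illustrated with a sorted array and binary search, which is the same basic idea a B-tree index exploits (a simplified model, not how a real index is stored):

```javascript
// Full scan: O(n). Index-style lookup (binary search on sorted keys): O(log n).
const emails = [];
for (let i = 0; i < 1000; i++) emails.push(`user${String(i).padStart(4, '0')}@example.com`);
emails.sort(); // an index keeps keys in sorted order

function fullScan(arr, target) {
  let steps = 0;
  for (const e of arr) { steps++; if (e === target) return { found: true, steps }; }
  return { found: false, steps };
}

function indexLookup(arr, target) {
  let lo = 0, hi = arr.length - 1, steps = 0;
  while (lo <= hi) {
    steps++;
    const mid = (lo + hi) >> 1;
    if (arr[mid] === target) return { found: true, steps };
    if (arr[mid] < target) lo = mid + 1; else hi = mid - 1;
  }
  return { found: false, steps };
}

const target = 'user0999@example.com'; // worst case: last row
console.log(fullScan(emails, target).steps);    // 1000 comparisons
console.log(indexLookup(emails, target).steps); // at most ~10 comparisons
```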
Types of Indexes:
1. Primary Key Index (created automatically):
CREATE TABLE users (
id SERIAL PRIMARY KEY -- Auto-indexed
);
2. Unique Index:
CREATE UNIQUE INDEX idx_email ON users(email);
3. Composite Index:
CREATE INDEX idx_name ON users(last_name, first_name);
-- Good for: WHERE last_name = 'Doe' AND first_name = 'John'
-- Also works for: WHERE last_name = 'Doe'
-- Doesn't help: WHERE first_name = 'John' (leftmost column required)
4. Partial Index:
CREATE INDEX idx_active_users ON users(email) WHERE is_active = true;
When to Use Indexes:
✅ Columns in WHERE clauses
✅ Columns in JOIN conditions
✅ Columns in ORDER BY
✅ Foreign keys
✅ Columns with high cardinality (many unique values)
When NOT to Use Indexes:
❌ Small tables (< 1000 rows)
❌ Columns with low cardinality (e.g., boolean, gender)
❌ Columns that are frequently updated
❌ Tables with heavy INSERT/UPDATE operations
Trade-offs: indexes speed up reads but slow down writes (every INSERT/UPDATE/DELETE must also maintain the index) and consume extra disk space.
Real Performance Example:
-- Table with 1 million users
-- Without index
SELECT * FROM users WHERE email = 'john@example.com';
-- Time: 2.5 seconds (full scan)
-- With index
CREATE INDEX idx_email ON users(email);
SELECT * FROM users WHERE email = 'john@example.com';
-- Time: 0.003 seconds (800x faster!)
Q: What is database normalization?
Answer:
Normalization is organizing data to reduce redundancy and improve data integrity by dividing large tables into smaller, related tables.
Normal Forms:
1NF (First Normal Form): each column holds a single atomic value; no repeating groups or lists in a cell.
Before 1NF:
| User ID | Name | Phones |
|---------|------|---------|
| 1 | John | 123-456, 789-012 |
After 1NF:
| User ID | Name | Phone |
|---------|------|-------|
| 1 | John | 123-456 |
| 1 | John | 789-012 |
2NF (Second Normal Form): must be in 1NF, and every non-key column depends on the whole primary key (no partial dependencies).
Before 2NF:
| Order ID | Product ID | Product Name | Quantity |
|----------|------------|--------------|----------|
| 1 | 101 | Laptop | 2 |
| 2 | 101 | Laptop | 1 |
(Product Name depends only on Product ID, not full key)
After 2NF:
Orders:
| Order ID | Product ID | Quantity |
|----------|------------|----------|
| 1 | 101 | 2 |
| 2 | 101 | 1 |
Products:
| Product ID | Product Name |
|------------|--------------|
| 101 | Laptop |
3NF (Third Normal Form): must be in 2NF, and non-key columns depend only on the key (no transitive dependencies).
Before 3NF:
| Employee ID | Department | Department Location |
|-------------|------------|---------------------|
| 1 | Sales | Building A |
| 2 | Sales | Building A |
(Department Location depends on Department, not Employee ID)
After 3NF:
Employees:
| Employee ID | Department |
|-------------|------------|
| 1 | Sales |
| 2 | Sales |
Departments:
| Department | Location |
|------------|-----------|
| Sales | Building A |
Denormalization:
Sometimes intentionally violate normalization for performance:
-- Normalized (requires JOIN)
SELECT u.name, COUNT(p.id)
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
GROUP BY u.id;
-- Denormalized (faster, but redundant)
-- Add post_count column to users table
SELECT name, post_count FROM users;
Q: What is a database transaction?
Answer:
A transaction is a sequence of database operations that are treated as a single unit of work - either all succeed or all fail.
Example:
# Python with database transaction
try:
db.begin_transaction()
# Step 1: Deduct from sender
db.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
# Step 2: Add to receiver
db.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
# Step 3: Log transaction
db.execute("INSERT INTO transactions (from_account, to_account, amount) VALUES (1, 2, 100)")  # 'from' and 'to' are reserved SQL keywords, so use explicit column names
db.commit() # All steps succeeded
except Exception as e:
db.rollback() # Any step failed - undo everything
Transaction Control Commands:
BEGIN TRANSACTION; -- Start transaction
-- SQL operations here
COMMIT; -- Save all changes
ROLLBACK; -- Undo all changes
-- Savepoints for partial rollback
BEGIN TRANSACTION;
INSERT INTO logs VALUES ('Starting');
SAVEPOINT sp1;
UPDATE users SET balance = 100;
ROLLBACK TO sp1; -- Only undo UPDATE, keep INSERT
COMMIT;
Isolation Problems Transactions Prevent:
1. Dirty Read (reading uncommitted data that may be rolled back):
-- Transaction A
UPDATE accounts SET balance = 1000 WHERE id = 1;
-- Not committed yet
-- Transaction B (Bad - sees uncommitted data)
SELECT balance FROM accounts WHERE id = 1; -- Sees 1000
-- Transaction A rolls back, Transaction B saw invalid data!
2. Non-Repeatable Read (same query returns different data within one transaction):
-- Transaction A
SELECT balance FROM accounts WHERE id = 1; -- Returns 500
-- Transaction B
UPDATE accounts SET balance = 1000 WHERE id = 1;
COMMIT;
-- Transaction A (same query, different result)
SELECT balance FROM accounts WHERE id = 1; -- Returns 1000
3. Phantom Read (new rows appear between identical queries):
-- Transaction A
SELECT * FROM orders WHERE status = 'pending'; -- Returns 5 rows
-- Transaction B
INSERT INTO orders VALUES (..., 'pending');
COMMIT;
-- Transaction A (same query, new row appears)
SELECT * FROM orders WHERE status = 'pending'; -- Returns 6 rows
Q: What are the HTTP status code categories and when do you use each code?
Answer:
HTTP status codes indicate the result of an HTTP request.
Categories:
1xx - Informational
100 Continue - Server received headers, client can send body
101 Switching Protocols - Server switching to WebSocket
2xx - Success
200 OK - Request succeeded
201 Created - Resource created successfully
202 Accepted - Request accepted, processing not complete
204 No Content - Success but no response body
206 Partial Content - Range request (video streaming)
3xx - Redirection
301 Moved Permanently - Resource permanently moved
302 Found - Temporary redirect
304 Not Modified - Use cached version
307 Temporary Redirect - Same as 302 but preserves the method
4xx - Client Errors
400 Bad Request - Invalid syntax or validation failed
401 Unauthorized - Authentication required
403 Forbidden - Authenticated but not authorized
404 Not Found - Resource doesn't exist
405 Method Not Allowed - HTTP method not supported
409 Conflict - Request conflicts with current state
422 Unprocessable Entity - Valid syntax but semantic errors
429 Too Many Requests - Rate limit exceeded
5xx - Server Errors
500 Internal Server Error - Generic server error
502 Bad Gateway - Invalid response from upstream server
503 Service Unavailable - Server temporarily down
504 Gateway Timeout - Upstream server timeout
When to Use Each:
// 200 OK - Standard success
app.get('/users/:id', (req, res) => {
const user = db.getUser(req.params.id);
res.status(200).json(user);
});
// 201 Created - New resource
app.post('/users', (req, res) => {
const user = db.createUser(req.body);
res.status(201).json(user);
});
// 204 No Content - Success but empty response
app.delete('/users/:id', (req, res) => {
db.deleteUser(req.params.id);
res.status(204).send();
});
// 400 Bad Request - Validation failed
app.post('/users', (req, res) => {
if (!req.body.email) {
return res.status(400).json({ error: 'Email required' });
}
});
// 401 Unauthorized - Not logged in
app.get('/profile', (req, res) => {
if (!req.headers.authorization) {
return res.status(401).json({ error: 'Authentication required' });
}
});
// 403 Forbidden - Logged in but insufficient permissions
app.delete('/users/:id', (req, res) => {
if (req.user.role !== 'admin') {
return res.status(403).json({ error: 'Admin access required' });
}
});
// 404 Not Found - Resource doesn't exist
app.get('/users/:id', (req, res) => {
const user = db.getUser(req.params.id);
if (!user) {
return res.status(404).json({ error: 'User not found' });
}
});
// 409 Conflict - Duplicate resource
app.post('/users', (req, res) => {
if (db.emailExists(req.body.email)) {
return res.status(409).json({ error: 'Email already registered' });
}
});
// 422 Unprocessable Entity - Semantic errors
app.post('/users', (req, res) => {
if (req.body.age < 0) {
return res.status(422).json({ error: 'Age cannot be negative' });
}
});
// 429 Too Many Requests - Rate limiting
app.use((req, res, next) => {
if (rateLimiter.isExceeded(req.ip)) {
return res.status(429).json({ error: 'Too many requests' });
}
next();
});
// 500 Internal Server Error - Unexpected error
app.get('/users', (req, res) => {
try {
const users = db.getUsers();
res.json(users);
} catch (error) {
console.error(error);
res.status(500).json({ error: 'Internal server error' });
}
});
Q: What is the difference between PUT and PATCH?
Answer:
| Aspect | PUT | PATCH |
|---|---|---|
| Purpose | Replace entire resource | Partially update resource |
| Idempotent | Yes | Depends on implementation |
| Send | Complete resource | Only changed fields |
| Missing Fields | Set to null/default | Keep existing values |
Examples:
// Current resource state
{
id: 1,
name: "John Doe",
email: "john@example.com",
age: 25,
city: "New York"
}
// PUT - Replace entire resource
PUT /users/1
{
name: "John Smith",
email: "john@example.com",
age: 25,
city: "New York"
}
// Result: Complete replacement (must send all fields)
// PATCH - Update only specific fields
PATCH /users/1
{
name: "John Smith"
}
// Result: Only name updated, other fields unchanged
{
id: 1,
name: "John Smith", // Changed
email: "john@example.com", // Unchanged
age: 25, // Unchanged
city: "New York" // Unchanged
}
Implementation:
// PUT - Replace entire resource
app.put('/users/:id', (req, res) => {
const user = {
name: req.body.name,
email: req.body.email,
age: req.body.age,
city: req.body.city
};
// If any field missing, it becomes null
db.replaceUser(req.params.id, user);
res.json(user);
});
// PATCH - Update specific fields
app.patch('/users/:id', (req, res) => {
const updates = {};
// Only include fields that were actually provided
// (check !== undefined so falsy values like age 0 or an empty string still update)
if (req.body.name !== undefined) updates.name = req.body.name;
if (req.body.email !== undefined) updates.email = req.body.email;
if (req.body.age !== undefined) updates.age = req.body.age;
if (req.body.city !== undefined) updates.city = req.body.city;
db.updateUser(req.params.id, updates);
res.json(db.getUser(req.params.id));
});
When to Use: use PUT when the client sends the complete resource representation; use PATCH when updating only a subset of fields or when payload size matters.
Q: What is content negotiation?
Answer:
Content negotiation allows the server to serve different representations of the same resource based on client preferences.
Common Headers:
Accept - Response format
GET /users/1
Accept: application/json
Response:
{
"id": 1,
"name": "John"
}
GET /users/1
Accept: application/xml
Response:
<user>
  <id>1</id>
  <name>John</name>
</user>
Accept-Language - Preferred language
GET /products/1
Accept-Language: es
Response:
{
"nombre": "Computadora",
"precio": 1000
}
Accept-Encoding - Compression
GET /data
Accept-Encoding: gzip, deflate
Response:
Content-Encoding: gzip
[compressed data]
Implementation:
app.get('/users/:id', (req, res) => {
const user = db.getUser(req.params.id);
// Check Accept header
const accept = req.get('Accept') || 'application/json'; // default to JSON if no Accept header
if (accept.includes('application/json') || accept.includes('*/*')) {
res.json(user);
} else if (accept.includes('application/xml')) {
res.type('application/xml');
res.send(`<user><id>${user.id}</id><name>${user.name}</name></user>`);
} else {
res.status(406).json({ error: 'Not Acceptable' });
}
});
Q: What are HTTP headers and which ones are important?
Answer:
HTTP headers provide metadata about the request or response.
Request Headers:
GET /api/users HTTP/1.1
Host: api.example.com
User-Agent: Mozilla/5.0
Accept: application/json
Authorization: Bearer eyJhbGc...
Content-Type: application/json
Accept-Language: en-US
Cache-Control: no-cache
Cookie: session_id=abc123
Response Headers:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 1234
Cache-Control: max-age=3600
ETag: "abc123"
Set-Cookie: session_id=xyz789; HttpOnly; Secure
Access-Control-Allow-Origin: *
X-RateLimit-Remaining: 99
Important Headers:
Authentication:
Authorization: Bearer <token>
Authorization: Basic base64(username:password)
Caching:
Cache-Control: no-cache, no-store, must-revalidate
Cache-Control: public, max-age=31536000
ETag: "version123"
If-None-Match: "version123"
CORS:
Access-Control-Allow-Origin: https://example.com
Access-Control-Allow-Methods: GET, POST, PUT
Access-Control-Allow-Headers: Content-Type, Authorization
Access-Control-Max-Age: 86400
Security:
Strict-Transport-Security: max-age=31536000
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Content-Security-Policy: default-src 'self'
Custom Headers:
X-Request-ID: uuid-12345
X-API-Version: v2
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640000000
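The ETag / If-None-Match pair above enables conditional requests; here is a framework-free sketch of the server-side check (the `version` field and return shape are illustrative assumptions):

```javascript
// Conditional GET: return 304 when the client's cached version is still current.
function handleConditionalGet(resource, ifNoneMatch) {
  const etag = `"v${resource.version}"`; // ETags are quoted strings
  if (ifNoneMatch === etag) {
    return { status: 304, headers: { ETag: etag }, body: null }; // client uses its cache
  }
  return { status: 200, headers: { ETag: etag }, body: resource.data };
}

const resource = { version: 123, data: { name: 'John' } };
console.log(handleConditionalGet(resource, undefined).status); // 200 (first fetch)
console.log(handleConditionalGet(resource, '"v123"').status);  // 304 (cache hit)
```

The 304 response carries no body, saving bandwidth while keeping the client's cache validated.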
Q: What is CORS (Cross-Origin Resource Sharing)?
Answer:
CORS is a security mechanism that allows servers to specify which origins can access their resources.
Same-Origin Policy:
Browser blocks requests from different origins by default.
// Origin: https://example.com
// ✅ ALLOWED (same origin)
fetch('https://example.com/api/users')
// ❌ BLOCKED (different origin)
fetch('https://api.other.com/users')
// Error: CORS policy: No 'Access-Control-Allow-Origin' header
CORS Headers:
Simple Request:
Request:
GET /users
Origin: https://example.com
Response:
Access-Control-Allow-Origin: https://example.com
Access-Control-Allow-Credentials: true
Preflight Request (for PUT, DELETE, custom headers):
OPTIONS /users
Origin: https://example.com
Access-Control-Request-Method: PUT
Access-Control-Request-Headers: Authorization
Response:
Access-Control-Allow-Origin: https://example.com
Access-Control-Allow-Methods: GET, POST, PUT, DELETE
Access-Control-Allow-Headers: Authorization, Content-Type
Access-Control-Max-Age: 86400
Implementation:
// Express.js - Allow all origins (development only!)
app.use((req, res, next) => {
res.header('Access-Control-Allow-Origin', '*');
res.header('Access-Control-Allow-Methods', 'GET, POST, PUT, DELETE');
res.header('Access-Control-Allow-Headers', 'Content-Type, Authorization');
if (req.method === 'OPTIONS') {
return res.sendStatus(200);
}
next();
});
// Express.js - Specific origin (production)
const cors = require('cors');
app.use(cors({
origin: 'https://example.com',
credentials: true,
methods: ['GET', 'POST', 'PUT', 'DELETE'],
allowedHeaders: ['Content-Type', 'Authorization']
}));
// Flask
from flask_cors import CORS
CORS(app, origins=['https://example.com'])
Why CORS Exists:
Prevents malicious websites from accessing your API:
// Without CORS, evil.com could:
fetch('https://bank.com/transfer', {
method: 'POST',
credentials: 'include', // Include cookies
body: JSON.stringify({ to: 'attacker', amount: 10000 })
});
// CORS blocks this cross-origin request
Q: What is pagination and how do you implement it?
Answer:
Pagination divides large result sets into smaller pages to improve performance and usability.
Types of Pagination:
1. Offset-Based (Page Number)
GET /users?page=2&limit=10
GET /users?offset=20&limit=10
Response:
{
"data": [...],
"page": 2,
"limit": 10,
"total": 100,
"total_pages": 10
}
Implementation:
app.get('/users', (req, res) => {
const page = parseInt(req.query.page) || 1;
const limit = parseInt(req.query.limit) || 10;
const offset = (page - 1) * limit;
const users = db.query(`
SELECT * FROM users
LIMIT ${limit} OFFSET ${offset}
`);
const total = db.query('SELECT COUNT(*) FROM users')[0].count;
res.json({
data: users,
page,
limit,
total,
total_pages: Math.ceil(total / limit)
});
});
Pros: simple to implement; clients can jump to any page; total count and page numbers are easy to display.
Cons: OFFSET slows down as it grows (the database still reads and discards the skipped rows); results can shift or duplicate if rows are inserted or deleted between pages.
2. Cursor-Based (Keyset Pagination)
GET /users?cursor=eyJpZCI6MTAwfQ&limit=10
Response:
{
"data": [...],
"next_cursor": "eyJpZCI6MTEwfQ",
"has_more": true
}
Implementation:
app.get('/users', (req, res) => {
const limit = parseInt(req.query.limit) || 10;
const cursor = req.query.cursor ? decodeCursor(req.query.cursor) : null;
let query = 'SELECT * FROM users';
if (cursor) {
query += ` WHERE id > ${cursor.id}`;
}
query += ` ORDER BY id LIMIT ${limit + 1}`;
const users = db.query(query);
const hasMore = users.length > limit;
if (hasMore) {
users.pop(); // Remove extra item
}
const nextCursor = hasMore ? encodeCursor({ id: users[users.length - 1].id }) : null;
res.json({
data: users,
next_cursor: nextCursor,
has_more: hasMore
});
});
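The `encodeCursor`/`decodeCursor` helpers used above aren't shown; one common approach (assumed here) is base64url-encoded JSON, which is opaque from the client's point of view:

```javascript
// Opaque cursor helpers: base64url-encoded JSON (one possible implementation).
function encodeCursor(obj) {
  return Buffer.from(JSON.stringify(obj)).toString('base64url');
}

function decodeCursor(cursor) {
  try {
    return JSON.parse(Buffer.from(cursor, 'base64url').toString('utf8'));
  } catch (err) {
    return null; // treat malformed cursors as "start from the beginning"
  }
}

const token = encodeCursor({ id: 110 });
console.log(decodeCursor(token)); // { id: 110 }
```

Encoding keeps clients from depending on (or tampering with) the cursor's internal structure; a stricter design would also sign the cursor.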
Pros: consistent results even while rows are inserted or deleted; fast at any depth (an indexed WHERE clause instead of OFFSET).
Cons: can't jump to an arbitrary page; requires a stable, sequential sort key.
When to Use: offset pagination for small, admin-style datasets where page jumping matters; cursor pagination for large or frequently changing datasets (feeds, infinite scroll).
Q: What is rate limiting and how do you implement it?
Answer:
Rate limiting restricts the number of requests a client can make in a time window to prevent abuse and ensure fair usage.
Types:
1. Fixed Window
Allow 100 requests per hour
Timeline:
00:00 - 01:00: 100 requests allowed
01:00 - 02:00: Counter resets, 100 more allowed
Problem:
00:59: 100 requests
01:00: Counter resets
01:01: 100 requests
Total: 200 requests in 2 minutes!
2. Sliding Window
Allow 100 requests per hour, calculated from current time
12:30: Check requests from 11:30 to 12:30
12:45: Check requests from 11:45 to 12:45
3. Token Bucket
Bucket holds 100 tokens
Each request consumes 1 token
Tokens refill at rate of 100/hour
Allows burst traffic if tokens available
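A minimal token bucket in plain JavaScript (the capacity and refill rate are this example's assumptions):

```javascript
// Token bucket: capacity 100, refilling continuously at 100 tokens/hour.
class TokenBucket {
  constructor(capacity, refillPerSec) {
    this.capacity = capacity;
    this.tokens = capacity;       // start full: allows an initial burst
    this.refillPerSec = refillPerSec;
    this.last = Date.now();
  }

  tryConsume() {
    const now = Date.now();
    const elapsed = (now - this.last) / 1000;
    // Refill proportionally to elapsed time, capped at capacity
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;  // request allowed
    }
    return false;   // rate limited
  }
}

const bucket = new TokenBucket(100, 100 / 3600);
let allowed = 0;
for (let i = 0; i < 150; i++) if (bucket.tryConsume()) allowed++;
console.log(allowed); // 100: the burst drains the bucket, then requests are rejected
```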
Implementation:
Redis-based Rate Limiter:
const redis = require('redis');
const client = redis.createClient();
client.connect(); // node-redis v4 requires an explicit connect
async function rateLimiter(req, res, next) {
const key = `rate_limit:${req.ip}`;
const limit = 100;
const window = 3600; // 1 hour in seconds
const requests = await client.incr(key);
if (requests === 1) {
await client.expire(key, window);
}
if (requests > limit) {
return res.status(429).json({
error: 'Too many requests',
retry_after: await client.ttl(key)
});
}
res.set('X-RateLimit-Limit', limit);
res.set('X-RateLimit-Remaining', limit - requests);
res.set('X-RateLimit-Reset', Math.floor(Date.now() / 1000) + await client.ttl(key)); // Unix epoch seconds
next();
}
app.use(rateLimiter);
Sliding Window Implementation:
async function slidingWindowRateLimiter(req, res, next) {
const key = `rate_limit:${req.ip}`;
const limit = 100;
const window = 3600000; // 1 hour in milliseconds
const now = Date.now();
// Remove old entries
await client.zRemRangeByScore(key, 0, now - window);
// Count requests in window
const requests = await client.zCard(key);
if (requests >= limit) {
return res.status(429).json({ error: 'Too many requests' });
}
// Add current request
await client.zAdd(key, { score: now, value: `${now}` });
await client.expire(key, Math.ceil(window / 1000));
next();
}
Response Headers:
HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640000000
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1640000000
Retry-After: 3600
Different Limits for Different Users:
function getRateLimit(user) {
if (user.plan === 'free') return 100;
if (user.plan === 'pro') return 1000;
if (user.plan === 'enterprise') return 10000;
return 100;
}
async function rateLimiter(req, res, next) {
const limit = getRateLimit(req.user);
// ... rest of logic
}
Q: How do you version an API?
Answer:
API versioning allows you to make breaking changes while maintaining backward compatibility for existing clients.
Methods:
1. URI Versioning (Most Common)
GET /v1/users
GET /v2/users
Pros: Clear, easy to route
Cons: Pollutes URI space
2. Header Versioning
GET /users
Accept: application/vnd.api.v1+json
Pros: Clean URIs
Cons: Less visible, harder to test
3. Query Parameter
GET /users?version=1
GET /users?v=2
Pros: Simple
Cons: Can be ignored/forgotten
4. Content Negotiation
GET /users
Accept: application/vnd.myapi.v1+json
Implementation:
// URI Versioning
app.get('/v1/users', v1UsersController);
app.get('/v2/users', v2UsersController);
// Header Versioning
app.get('/users', (req, res) => {
const version = req.get('API-Version') || '1';
if (version === '1') {
v1UsersController(req, res);
} else if (version === '2') {
v2UsersController(req, res);
} else {
res.status(400).json({ error: 'Invalid API version' });
}
});
// Middleware for version routing
function versionRouter(versions) {
return (req, res, next) => {
const version = req.path.split('/')[1]; // /v1/users -> v1
const handler = versions[version];
if (!handler) {
return res.status(400).json({ error: 'Invalid version' });
}
handler(req, res, next);
};
}
app.use('/v1/users', v1UsersRouter);
app.use('/v2/users', v2UsersRouter);
Breaking vs Non-Breaking Changes:
Breaking Changes (require new version):
// v1
{
"name": "John Doe",
"email": "john@example.com"
}
// v2 (BREAKING - renamed field)
{
"full_name": "John Doe",
"email": "john@example.com"
}
Non-Breaking Changes (same version):
// v1
{
"name": "John Doe",
"email": "john@example.com"
}
// v1 (NON-BREAKING - added field)
{
"name": "John Doe",
"email": "john@example.com",
"phone": "123-456-7890" // New optional field
}
Deprecation Strategy:
// v1 response with deprecation warning
{
"name": "John Doe",
"email": "john@example.com",
"_warnings": {
"deprecated_fields": ["name"],
"message": "Field 'name' is deprecated. Use 'full_name' instead. This field will be removed in v3."
}
}
// v2 response
{
"full_name": "John Doe",
"email": "john@example.com"
}
Q: How do you document an API?
Answer:
API documentation describes endpoints, parameters, responses, and usage examples. Essential for API adoption and maintenance.
Key Components: endpoint and HTTP method, description, path/query parameters, request body, success and error responses, and authentication requirements.
GET /users/:id
Description: Retrieve a user by ID
Path Parameters:
- id (integer, required): User ID
Query Parameters:
- include (string, optional): Additional data to include
Example: ?include=posts,comments
Request Body (for create/update requests):
{
"name": "John Doe",
"email": "john@example.com",
"age": 25
}
Success Response:
200 OK
{
"id": 1,
"name": "John Doe",
"email": "john@example.com",
"created_at": "2026-01-01T00:00:00Z"
}
Error Response:
404 Not Found
{
"error": "User not found"
}
Authentication:
Authorization: Bearer <token>
Documentation Tools:
Swagger/OpenAPI:
openapi: 3.0.0
info:
  title: User API
  version: 1.0.0
paths:
  /users/{id}:
    get:
      summary: Get user by ID
      parameters:
        - name: id
          in: path
          required: true
          schema:
            type: integer
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/User'
Postman: collections of saved requests can double as living, executable documentation and be shared with API consumers.
Best Practices: include request/response examples for every endpoint, document all error codes, keep docs generated from or tested against the code so they never drift, and note rate limits and authentication requirements.
Q: What is GraphQL and how does it compare to REST?
Answer:
GraphQL is a query language for APIs that allows clients to request exactly the data they need.
REST vs GraphQL:
| Aspect | REST | GraphQL |
|---|---|---|
| Endpoints | Multiple endpoints | Single endpoint |
| Data Fetching | Fixed response structure | Client specifies structure |
| Over-fetching | Common | Eliminated |
| Under-fetching | Requires multiple requests | Single request |
| Versioning | Usually required | Schema evolution |
Example:
REST - Multiple Requests:
// Get user
GET /users/1
{
"id": 1,
"name": "John",
"email": "john@example.com"
}
// Get user's posts (separate request)
GET /users/1/posts
[
{ "id": 101, "title": "Post 1" },
{ "id": 102, "title": "Post 2" }
]
// Get post comments (another request)
GET /posts/101/comments
[
{ "id": 201, "text": "Comment 1" }
]
GraphQL - Single Request:
query {
user(id: 1) {
name
email
posts {
title
comments {
text
}
}
}
}
Response:
{
"data": {
"user": {
"name": "John",
"email": "john@example.com",
"posts": [
{
"title": "Post 1",
"comments": [
{ "text": "Comment 1" }
]
}
]
}
}
}
GraphQL Example - Express:
const { graphqlHTTP } = require('express-graphql');
const { buildSchema } = require('graphql');
// Define schema
const schema = buildSchema(`
type User {
id: ID!
name: String!
email: String!
posts: [Post]
}
type Post {
id: ID!
title: String!
content: String!
}
type Query {
user(id: ID!): User
users: [User]
}
type Mutation {
createUser(name: String!, email: String!): User
}
`);
// Define resolvers
const root = {
user: ({ id }) => db.getUser(id),
users: () => db.getUsers(),
createUser: ({ name, email }) => db.createUser(name, email)
};
app.use('/graphql', graphqlHTTP({
schema: schema,
rootValue: root,
graphiql: true // Enable GraphiQL IDE
}));
Advantages of GraphQL: no over- or under-fetching, a single endpoint, a strongly typed schema, and built-in introspection for tooling.
Disadvantages: HTTP caching is harder (typically everything is a POST to one endpoint), queries can be expensive to resolve (N+1 at the resolver level), and rate limiting and query-complexity control are more involved.
Q: What is the difference between authentication and authorization?
Answer:
Authentication: Verifying WHO you are (identity)
Authorization: Verifying WHAT you can do (permissions)
Example:
Authentication: "Are you John Doe?" (Login with username/password)
Authorization: "Can John Doe delete this user?" (Check permissions)
Flow:
1. User sends credentials (username/password)
2. Server authenticates (verifies identity)
3. Server generates token/session
4. User sends token with requests
5. Server authorizes (checks permissions for action)
Real-World Analogy:
Authentication: Showing your ID at airport security
Authorization: Having a boarding pass for a specific flight
Code Example:
// Authentication - WHO are you?
app.post('/login', (req, res) => {
const { username, password } = req.body;
const user = db.findUser(username);
if (!user || !bcrypt.compareSync(password, user.password_hash)) { // compareSync: bcrypt.compare is async and would always be truthy here
return res.status(401).json({ error: 'Invalid credentials' });
}
const token = jwt.sign({ user_id: user.id }, SECRET_KEY);
res.json({ token });
});
// Authorization - WHAT can you do?
app.delete('/users/:id', authenticate, (req, res) => {
// authenticate middleware verified WHO (authentication)
// Now check WHAT they can do (authorization)
if (req.user.role !== 'admin') {
return res.status(403).json({ error: 'Forbidden - Admin only' });
}
db.deleteUser(req.params.id);
res.json({ message: 'User deleted' });
});
Q: What is JWT (JSON Web Token)?
Answer:
JWT is a compact, self-contained token for securely transmitting information between parties as a JSON object.
Structure:
header.payload.signature
Example:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyX2lkIjoxMjMsImV4cCI6MTY0MDAwMDAwMH0.signature
Parts:
1. Header (algorithm and type)
{
"alg": "HS256",
"typ": "JWT"
}
2. Payload (claims/data)
{
"user_id": 123,
"email": "john@example.com",
"role": "admin",
"exp": 1640000000, // Expiration timestamp
"iat": 1639900000 // Issued at timestamp
}
3. Signature (verification)
HMACSHA256(
base64UrlEncode(header) + "." + base64UrlEncode(payload),
secret_key
)
Implementation:
const jwt = require('jsonwebtoken');
const SECRET_KEY = process.env.JWT_SECRET || 'your-secret-key'; // never hardcode the secret in production
// Generate token
app.post('/login', (req, res) => {
const { username, password } = req.body;
const user = db.authenticate(username, password);
if (!user) {
return res.status(401).json({ error: 'Invalid credentials' });
}
const token = jwt.sign(
{
user_id: user.id,
email: user.email,
role: user.role
},
SECRET_KEY,
{
expiresIn: '24h'
}
);
res.json({ token });
});
// Verify token middleware
function authenticate(req, res, next) {
const token = req.headers.authorization?.split(' ')[1]; // Bearer
if (!token) {
return res.status(401).json({ error: 'No token provided' });
}
try {
const decoded = jwt.verify(token, SECRET_KEY);
req.user = decoded;
next();
} catch (error) {
return res.status(401).json({ error: 'Invalid token' });
}
}
// Protected route
app.get('/profile', authenticate, (req, res) => {
const user = db.getUser(req.user.user_id);
res.json(user);
});
JWT vs Session:
| Aspect | JWT | Session |
|---|---|---|
| Storage | Client-side (token) | Server-side (session store) |
| Scalability | Stateless (easy to scale) | Stateful (requires shared storage) |
| Revocation | Difficult | Easy |
| Size | Larger (sent with each request) | Small (just session ID) |
| Security | Signed and tamper-evident, but hard to revoke if stolen | Server controls state, can invalidate anytime |
Security Considerations:
// ❌ BAD - Storing sensitive data
const token = jwt.sign({
user_id: 123,
password: 'secret123', // NEVER store passwords
credit_card: '1234-5678' // NEVER store sensitive data
}, SECRET_KEY);
// ✅ GOOD - Only non-sensitive claims
const token = jwt.sign({
user_id: 123,
email: 'john@example.com',
role: 'user'
}, SECRET_KEY);
// Always use HTTPS
// Set short expiration times
// Refresh tokens for long-term access
Refresh Token Pattern:
// Generate access token (short-lived) and refresh token (long-lived)
const accessToken = jwt.sign(
{ user_id: user.id },
ACCESS_SECRET,
{ expiresIn: '15m' }
);
const refreshToken = jwt.sign(
{ user_id: user.id },
REFRESH_SECRET,
{ expiresIn: '7d' }
);
// Store refresh token in database
db.saveRefreshToken(user.id, refreshToken);
res.json({ accessToken, refreshToken });
// Refresh endpoint
app.post('/refresh', (req, res) => {
const { refreshToken } = req.body;
try {
const decoded = jwt.verify(refreshToken, REFRESH_SECRET);
// Verify token exists in database
if (!db.refreshTokenExists(decoded.user_id, refreshToken)) {
throw new Error('Invalid refresh token');
}
// Generate new access token
const newAccessToken = jwt.sign(
{ user_id: decoded.user_id },
ACCESS_SECRET,
{ expiresIn: '15m' }
);
res.json({ accessToken: newAccessToken });
} catch (error) {
res.status(401).json({ error: 'Invalid refresh token' });
}
});
Answer:
OAuth 2.0 is an authorization framework that allows third-party applications to access user data without exposing credentials.
Use Case:
"Login with Google/Facebook/GitHub"
Flow (Authorization Code):
1. User clicks "Login with Google"
2. Redirect to Google authorization page
3. User approves access
4. Google redirects back with authorization code
5. Exchange code for access token
6. Use access token to fetch user data
Roles:
Resource Owner - The user who owns the data
Client - The third-party application requesting access
Authorization Server - Issues tokens (e.g., Google's OAuth server)
Resource Server - Hosts the protected data (e.g., Google APIs)
Implementation Example:
const axios = require('axios');
// Step 1: Redirect to authorization page
app.get('/auth/google', (req, res) => {
const authUrl = `https://accounts.google.com/o/oauth2/v2/auth?` +
`client_id=${GOOGLE_CLIENT_ID}` +
`&redirect_uri=${REDIRECT_URI}` +
`&response_type=code` +
`&scope=openid email profile`;
res.redirect(authUrl);
});
// Step 2: Handle callback
app.get('/auth/google/callback', async (req, res) => {
const { code } = req.query;
// Step 3: Exchange code for access token
const tokenResponse = await axios.post('https://oauth2.googleapis.com/token', {
code,
client_id: GOOGLE_CLIENT_ID,
client_secret: GOOGLE_CLIENT_SECRET,
redirect_uri: REDIRECT_URI,
grant_type: 'authorization_code'
});
const { access_token } = tokenResponse.data;
// Step 4: Fetch user info
const userResponse = await axios.get('https://www.googleapis.com/oauth2/v2/userinfo', {
headers: { Authorization: `Bearer ${access_token}` }
});
const googleUser = userResponse.data;
// Create or update user in database
const user = db.findOrCreateUser(googleUser.email, googleUser.name);
// Create session or JWT
const token = jwt.sign({ user_id: user.id }, SECRET_KEY);
res.json({ token, user });
});
OAuth Scopes:
scope=openid email profile
scope=read:user write:repo (GitHub)
scope=user.read (Microsoft)
Grant Types:
Authorization Code - Server-side apps (most common, shown above)
Authorization Code + PKCE - Mobile and single-page apps
Client Credentials - Machine-to-machine access (no user involved)
Refresh Token - Exchange a refresh token for a new access token
Answer:
RBAC restricts system access based on user roles and permissions.
Concepts:
Role - A named collection of permissions (admin, editor, viewer)
Permission - A specific allowed action (create_user, delete_post)
User - Assigned one or more roles and inherits their permissions
Example:
// Database structure
const roles = {
admin: ['create_user', 'delete_user', 'view_all', 'edit_all'],
editor: ['create_post', 'edit_post', 'delete_own_post'],
viewer: ['view_posts']
};
const users = [
{ id: 1, name: 'Alice', role: 'admin' },
{ id: 2, name: 'Bob', role: 'editor' },
{ id: 3, name: 'Charlie', role: 'viewer' }
];
// Authorization middleware
function requirePermission(permission) {
return (req, res, next) => {
const userRole = req.user.role;
const permissions = roles[userRole] || []; // Unknown role gets no permissions
if (!permissions.includes(permission)) {
return res.status(403).json({ error: 'Forbidden' });
}
next();
};
}
// Usage
app.delete('/users/:id', authenticate, requirePermission('delete_user'), (req, res) => {
db.deleteUser(req.params.id);
res.json({ message: 'User deleted' });
});
app.post('/posts', authenticate, requirePermission('create_post'), (req, res) => {
const post = db.createPost(req.body);
res.json(post);
});
Complex RBAC with Ownership:
function canDeletePost(req, res, next) {
const post = db.getPost(req.params.id);
const user = req.user;
// Admin can delete any post
if (user.role === 'admin') {
return next();
}
// Editor can delete own posts
if (user.role === 'editor' && post.author_id === user.id) {
return next();
}
res.status(403).json({ error: 'Forbidden' });
}
app.delete('/posts/:id', authenticate, canDeletePost, (req, res) => {
db.deletePost(req.params.id);
res.json({ message: 'Post deleted' });
});
Hierarchical Roles:
const roleHierarchy = {
admin: ['editor', 'viewer'],
editor: ['viewer'],
viewer: []
};
function hasRole(userRole, requiredRole) {
if (userRole === requiredRole) return true;
const childRoles = roleHierarchy[userRole] || [];
return childRoles.includes(requiredRole);
}
// Admin automatically has editor and viewer permissions
Answer:
Joins combine rows from multiple tables based on related columns.
Types:
1. INNER JOIN - Only matching rows
SELECT users.name, orders.total
FROM users
INNER JOIN orders ON users.id = orders.user_id;
-- Result: Only users who have orders
Users:          Orders:          Result:
id | name       id | user_id     name  | total
1  | Alice      1  | 1           Alice | 100
2  | Bob        2  | 1           Alice | 200
3  | Charlie    3  | 2           Bob   | 150
2. LEFT JOIN - All rows from left table
SELECT users.name, orders.total
FROM users
LEFT JOIN orders ON users.id = orders.user_id;
-- Result: All users, with NULL for users without orders
Result:
name | total
Alice | 100
Alice | 200
Bob | 150
Charlie | NULL
3. RIGHT JOIN - All rows from right table
SELECT users.name, orders.total
FROM users
RIGHT JOIN orders ON users.id = orders.user_id;
-- Result: All orders, with NULL for orders without users
4. FULL OUTER JOIN - All rows from both tables
SELECT users.name, orders.total
FROM users
FULL OUTER JOIN orders ON users.id = orders.user_id;
-- Result: All users and all orders, with NULL where no match
5. CROSS JOIN - Cartesian product
SELECT users.name, products.name
FROM users
CROSS JOIN products;
-- Result: Every user paired with every product
Multiple Joins:
SELECT
users.name,
orders.total,
products.name AS product_name
FROM users
INNER JOIN orders ON users.id = orders.user_id
INNER JOIN order_items ON orders.id = order_items.order_id
INNER JOIN products ON order_items.product_id = products.id;
Self Join:
-- Find employees and their managers
SELECT
e.name AS employee,
m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
Answer:
Already covered in detail in question #6 (ACID Properties) and #10 (What is a Transaction).
Quick Reference - Isolation Levels:
| Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
| Read Uncommitted | ✅ Possible | ✅ Possible | ✅ Possible |
| Read Committed | ❌ Not Possible | ✅ Possible | ✅ Possible |
| Repeatable Read | ❌ Not Possible | ❌ Not Possible | ✅ Possible |
| Serializable | ❌ Not Possible | ❌ Not Possible | ❌ Not Possible |
Answer:
Sharding is horizontally partitioning data across multiple databases to improve performance and scalability.
Types:
1. Horizontal Sharding - Split rows
Users Table (1 million rows)
Shard 1 (users 1-500k):
id | name
1 | Alice
2 | Bob
...
500000 | John
Shard 2 (users 500k-1M):
id | name
500001 | Jane
500002 | Jack
...
1000000 | Jill
2. Vertical Sharding - Split columns
Users Table
Shard 1 (Basic Info):
id | name | email
Shard 2 (Profile Data):
id | bio | profile_pic | preferences
Sharding Strategies:
1. Range-Based Sharding
function getShard(userId) {
if (userId <= 500000) return 'shard1';
if (userId <= 1000000) return 'shard2';
return 'shard3';
}
// Problem: Uneven distribution if new users grow faster
2. Hash-Based Sharding
function getShard(userId) {
const shardId = userId % 4; // 4 shards
return `shard${shardId}`;
}
// Pros: Even distribution
// Cons: Hard to add new shards (requires re-sharding)
3. Geographic Sharding
function getShard(userCountry) {
if (userCountry === 'US') return 'shard_us';
if (userCountry === 'EU') return 'shard_eu';
if (userCountry === 'ASIA') return 'shard_asia';
return 'shard_default'; // Fallback for unlisted regions
}
// Pros: Low latency for users
// Cons: Uneven load
4. Directory-Based Sharding
// Lookup table
const shardDirectory = {
user_1: 'shard1',
user_2: 'shard2',
user_3: 'shard1'
};
function getShard(userId) {
return shardDirectory[`user_${userId}`]; // Keys are user_1, user_2, ...
}
// Pros: Flexible
// Cons: Lookup overhead
Implementation Example:
const shards = {
shard1: createConnection('db1.example.com'),
shard2: createConnection('db2.example.com'),
shard3: createConnection('db3.example.com')
};
function getUser(userId) {
const shardId = getShard(userId);
const db = shards[shardId];
return db.query('SELECT * FROM users WHERE id = ?', [userId]);
}
function createUser(userData) {
const userId = generateUserId();
const shardId = getShard(userId);
const db = shards[shardId];
return db.query('INSERT INTO users (id, name) VALUES (?, ?)', [userId, userData.name]);
}
Challenges: Cross-shard joins and transactions, rebalancing data when adding shards, and hotspots on popular keys
-- Can't easily join users from shard1 with posts from shard2
SELECT users.name, posts.title
FROM users
JOIN posts ON users.id = posts.user_id;
When to Shard: Only after simpler options are exhausted (vertical scaling, read replicas, caching, query and index tuning); sharding adds significant operational complexity
Answer:
Replication copies data from one database (primary) to one or more databases (replicas) for redundancy and read scaling.
Types:
1. Primary-Replica (Master-Slave)
Primary DB (Writes)
↓ replication
Replica 1 (Reads)
Replica 2 (Reads)
Replica 3 (Reads)
Use Case:
// Write to primary
function createUser(data) {
return primaryDB.query('INSERT INTO users SET ?', [data]);
}
// Read from replica
function getUser(id) {
return replicaDB.query('SELECT * FROM users WHERE id = ?', [id]);
}
// Distributes read load across replicas
2. Multi-Primary (Master-Master)
Primary 1 ←→ Primary 2
(Both can accept writes)
Pros: High availability, write scaling
Cons: Conflict resolution needed
Replication Methods:
Synchronous - Wait for replica to confirm
Write Request → Primary DB
↓ wait for replica
Replica confirms
↓
Return success
Pros: Strong consistency
Cons: Higher latency
Asynchronous - Don't wait for replica
Write Request → Primary DB → Return success immediately
↓ (background)
Replicate to replica
Pros: Low latency
Cons: Replica may lag (eventual consistency)
Replication Lag:
// Write to primary
await primaryDB.query("UPDATE users SET name = 'Alice' WHERE id = 1");
// Immediately read from replica (might still show old value)
const user = await replicaDB.query("SELECT * FROM users WHERE id = 1");
console.log(user.name); // Might still be old name due to lag
// Solutions:
// 1. Read from primary for critical data
// 2. Add small delay before reading
// 3. Use read-after-write consistency
Read-After-Write Consistency:
const recentWrites = new Map(); // Track recent writes
function writeUser(id, data) {
primaryDB.query('UPDATE users SET ? WHERE id = ?', [data, id]);
recentWrites.set(id, Date.now());
}
function readUser(id) {
// If recently written, read from primary
if (recentWrites.has(id) && Date.now() - recentWrites.get(id) < 5000) {
return primaryDB.query('SELECT * FROM users WHERE id = ?', [id]);
}
// Otherwise, read from replica
return replicaDB.query('SELECT * FROM users WHERE id = ?', [id]);
}
Answer:
Connection pools maintain a set of reusable database connections to improve performance.
Without Connection Pool:
// Every request creates new connection
app.get('/users', async (req, res) => {
const connection = await createConnection(); // Slow!
const users = await connection.query('SELECT * FROM users');
await connection.close();
res.json(users);
});
// Problems:
// 1. Creating connection is slow (TCP handshake, authentication)
// 2. Limited connections available
// 3. Resource waste
With Connection Pool:
const pool = createPool({
host: 'localhost',
user: 'root',
password: 'password',
database: 'mydb',
connectionLimit: 10 // Max 10 concurrent connections
});
app.get('/users', async (req, res) => {
const connection = await pool.getConnection(); // Fast! (reuses existing)
const users = await connection.query('SELECT * FROM users');
connection.release(); // Return to pool
res.json(users);
});
How It Works:
Request 1 → Pool → Connection 1 (from pool)
Request 2 → Pool → Connection 2 (from pool)
Request 3 → Pool → Connection 1 (reused)
Request 4 → Pool → Wait (pool full)
Configuration:
const pool = mysql.createPool({
host: 'localhost',
user: 'user',
password: 'pass',
database: 'db',
connectionLimit: 10, // Max connections
queueLimit: 0, // Max queued requests (0 = unlimited)
waitForConnections: true, // Wait if pool full
acquireTimeout: 10000 // Timeout waiting for connection
});
// Handle connection errors
pool.on('error', (err) => {
console.error('Pool error:', err);
});
Best Practices:
// ✅ GOOD - Always release connections
async function getUser(id) {
let connection;
try {
connection = await pool.getConnection();
const user = await connection.query('SELECT * FROM users WHERE id = ?', [id]);
return user;
} catch (error) {
throw error;
} finally {
if (connection) connection.release();
}
}
// ❌ BAD - Forgot to release
async function getUser(id) {
const connection = await pool.getConnection();
const user = await connection.query('SELECT * FROM users WHERE id = ?', [id]);
return user; // Connection never released - pool eventually exhausted!
}
Connection Pool Sizing:
A common starting heuristic: pool size ≈ Number of CPU cores * 2
Example:
4 CPU cores → Pool size of 8
Too small: Requests queue up waiting for a connection
Too large: Wasted memory and extra load on the database
In practice, tune from measured load; the right size depends on query latency and concurrency.
Answer:
NoSQL databases are non-relational databases designed for specific data models and scalability patterns.
Types:
1. Document Stores (MongoDB, Couchbase)
// Store complex nested data
{
_id: ObjectId("123"),
user: {
name: "John Doe",
email: "john@example.com",
addresses: [
{ street: "123 Main St", city: "NYC", type: "home" },
{ street: "456 Work Ave", city: "NYC", type: "office" }
],
orders: [
{
id: "order_1",
items: [
{ product: "Laptop", price: 1000, quantity: 1 }
],
total: 1000
}
]
}
}
// Query
db.users.findOne({ "user.email": "john@example.com" })
2. Key-Value Stores (Redis, DynamoDB)
// Simple key-value pairs
SET user:123 "John Doe"
GET user:123 // "John Doe"
SET session:abc123 '{"user_id": 123, "expires": 1640000000}'
GET session:abc123
// Fast lookups by key
3. Column-Family Stores (Cassandra, HBase)
// Wide column storage
Row Key | Column Family: user_info | Column Family: stats
user_1 | name: John, email: j@ex.com | views: 100, likes: 50
user_2 | name: Jane, email: jane@ex.com | views: 200, likes: 75
4. Graph Databases (Neo4j, Amazon Neptune)
// Social network relationships
(User: John) -[FOLLOWS]-> (User: Jane)
(User: Jane) -[LIKES]-> (Post: 123)
(User: John) -[COMMENTED_ON]-> (Post: 123)
// Query: Find friends of friends
MATCH (user:User {name: "John"})-[:FOLLOWS*2]->(friend_of_friend)
RETURN friend_of_friend
When to Use NoSQL:
✅ Use NoSQL When: schema is flexible or evolving, write throughput is very high, horizontal scaling is a priority, or data is naturally document- or key-value-shaped
❌ Use SQL When: data is highly relational, you need multi-row ACID transactions, or you rely on complex ad-hoc queries and joins
Example Comparison:
E-commerce Product Catalog
SQL Approach:
-- Multiple tables with joins
SELECT
p.name,
c.name AS category,
b.name AS brand,
GROUP_CONCAT(t.name) AS tags,
AVG(r.rating) AS avg_rating
FROM products p
JOIN categories c ON p.category_id = c.id
JOIN brands b ON p.brand_id = b.id
LEFT JOIN product_tags pt ON p.id = pt.product_id
LEFT JOIN tags t ON pt.tag_id = t.id
LEFT JOIN reviews r ON p.id = r.product_id
GROUP BY p.id;
NoSQL Approach (MongoDB):
// Single document with embedded data
{
_id: "product_123",
name: "Laptop",
category: { id: "cat_1", name: "Electronics" },
brand: { id: "brand_1", name: "Dell" },
tags: ["portable", "business", "windows"],
reviews: [
{ user_id: "user_1", rating: 5, comment: "Great!" },
{ user_id: "user_2", rating: 4, comment: "Good" }
],
avg_rating: 4.5
}
// Single query, no joins
db.products.findOne({ _id: "product_123" })
Answer:
Caching stores frequently accessed data in fast storage (memory) to reduce latency and database load.
Cache Hierarchy:
Client (Browser Cache) - Fastest, smallest
↓ miss
CDN (Content Delivery Network)
↓ miss
Application Cache (Redis/Memcached)
↓ miss
Database
Example:
// Without cache - Every request hits database
app.get('/products/:id', async (req, res) => {
const product = await db.query('SELECT * FROM products WHERE id = ?', [req.params.id]);
res.json(product); // ~50ms
});
// With cache - Most requests hit cache
app.get('/products/:id', async (req, res) => {
const cacheKey = `product:${req.params.id}`;
// Check cache first
let product = await cache.get(cacheKey);
if (product) {
return res.json(JSON.parse(product)); // ~1ms (50x faster!)
}
// Cache miss - fetch from database
product = await db.query('SELECT * FROM products WHERE id = ?', [req.params.id]);
// Store in cache for 1 hour
await cache.setex(cacheKey, 3600, JSON.stringify(product));
res.json(product);
});
Cache Strategies:
1. Cache-Aside (Lazy Loading)
function getUser(id) {
// Check cache
const cached = cache.get(`user:${id}`);
if (cached) return cached;
// Fetch from DB
const user = db.getUser(id);
// Store in cache
cache.set(`user:${id}`, user, { ttl: 3600 });
return user;
}
2. Write-Through
function updateUser(id, data) {
// Update database
db.updateUser(id, data);
// Update cache
const user = db.getUser(id);
cache.set(`user:${id}`, user);
return user;
}
3. Write-Behind (Write-Back)
function updateUser(id, data) {
// Update cache immediately
cache.set(`user:${id}`, data);
// Asynchronously update database
queue.add('update-user', { id, data });
return data;
}
4. Read-Through
// Cache automatically loads from DB on miss
cache.get('user:123', {
loadFunction: (key) => db.getUser(key.split(':')[1])
});
Cache Invalidation Strategies:
1. Time-To-Live (TTL)
// Expires after 1 hour
cache.set('user:123', user, { ttl: 3600 });
2. Manual Invalidation
function updateUser(id, data) {
db.updateUser(id, data);
cache.del(`user:${id}`); // Remove from cache
}
3. Event-Based Invalidation
eventEmitter.on('user-updated', (userId) => {
cache.del(`user:${userId}`);
});
function updateUser(id, data) {
db.updateUser(id, data);
eventEmitter.emit('user-updated', id);
}
Answer:
Redis is an in-memory key-value store used for caching, session management, real-time analytics, and message queuing.
Data Types:
1. String
SET user:123 "John Doe"
GET user:123 // "John Doe"
SETEX session:abc 3600 "user_data" // Expires in 1 hour
INCR page_views // Atomic increment
2. Hash (Objects)
HSET user:123 name "John Doe"
HSET user:123 email "john@example.com"
HSET user:123 age 25
HGETALL user:123
// { name: "John Doe", email: "john@example.com", age: 25 }
HGET user:123 name // "John Doe"
3. List (Arrays)
LPUSH queue:emails "email1@example.com"
LPUSH queue:emails "email2@example.com"
RPOP queue:emails // "email1@example.com" (FIFO)
4. Set (Unique values)
SADD tags:post:123 "javascript" "nodejs" "backend"
SMEMBERS tags:post:123 // ["javascript", "nodejs", "backend"]
SISMEMBER tags:post:123 "python" // 0 (false)
5. Sorted Set (Ranked data)
ZADD leaderboard 100 "player1"
ZADD leaderboard 200 "player2"
ZADD leaderboard 150 "player3"
ZRANGE leaderboard 0 -1 WITHSCORES
// ["player1", 100, "player3", 150, "player2", 200]
ZREVRANK leaderboard "player2" // 0 (highest score)
Use Cases:
1. Session Storage
const session = require('express-session');
const RedisStore = require('connect-redis')(session);
app.use(session({
store: new RedisStore({ client: redisClient }),
secret: 'secret',
resave: false,
saveUninitialized: false
}));
2. Rate Limiting
async function rateLimit(userId) {
const key = `rate_limit:${userId}`;
const count = await redis.incr(key);
if (count === 1) {
await redis.expire(key, 60); // Reset after 60 seconds
}
return count <= 100; // Allow 100 requests per minute
}
3. Leaderboard
// Add score
await redis.zadd('leaderboard', score, userId);
// Get top 10
const top10 = await redis.zrevrange('leaderboard', 0, 9, 'WITHSCORES');
// Get user rank
const rank = await redis.zrevrank('leaderboard', userId);
4. Pub/Sub (Real-time messaging)
// Publisher
redis.publish('notifications', JSON.stringify({ message: 'New order!' }));
// Subscriber
redis.subscribe('notifications');
redis.on('message', (channel, message) => {
console.log(`Received from ${channel}:`, JSON.parse(message));
});
5. Distributed Locks
const { v4: uuidv4 } = require('uuid');
async function acquireLock(resource, ttl = 10000) {
const lockKey = `lock:${resource}`;
const lockValue = uuidv4();
const result = await redis.set(lockKey, lockValue, 'PX', ttl, 'NX');
if (result === 'OK') {
return { acquired: true, lockValue };
}
return { acquired: false };
}
async function releaseLock(resource, lockValue) {
const lockKey = `lock:${resource}`;
const script = `
if redis.call("get", KEYS[1]) == ARGV[1] then
return redis.call("del", KEYS[1])
else
return 0
end
`;
await redis.eval(script, 1, lockKey, lockValue);
}
Answer:
CDN is a distributed network of servers that delivers content to users from the geographically closest server.
How It Works:
User in New York requests image
↓
CDN checks if cached in NY server
↓ (cache miss)
Fetches from origin server
↓
Caches in NY server
↓
Serves to user (fast, ~10ms)
Next user in New York
↓
CDN serves from NY cache (even faster, ~1ms)
Benefits: Lower latency (content served near the user), reduced origin load, higher availability, and absorption of traffic spikes and some DDoS traffic
Usage: Static assets such as images, video, fonts, and JS/CSS bundles
Cache Control Headers:
app.get('/static/*', (req, res) => {
res.setHeader('Cache-Control', 'public, max-age=31536000'); // 1 year
res.setHeader('CDN-Cache-Control', 'max-age=86400'); // CDN: 1 day
res.sendFile(req.path);
});
CDN Providers: Cloudflare, Amazon CloudFront, Akamai, Fastly
Invalidation:
// Purge cache when file changes
await cdn.purge('/images/logo.png');
// Or use versioned URLs
// New version = new cache
Answer:
Load balancing distributes incoming traffic across multiple servers to improve reliability and performance.
Algorithms:
1. Round Robin
Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1 (repeat)
2. Least Connections
Server 1: 5 active connections
Server 2: 3 active connections ← Route here
Server 3: 7 active connections
3. IP Hash
hash(client_ip) % server_count
Client 192.168.1.1 → always Server 2
Client 192.168.1.2 → always Server 1
4. Weighted Round Robin
Server 1 (weight 3): Gets 3x more traffic
Server 2 (weight 1): Gets 1x traffic
Implementation (Nginx):
upstream backend {
# Round robin (default)
server server1.example.com;
server server2.example.com;
server server3.example.com;
}
# Least connections
upstream backend {
least_conn;
server server1.example.com;
server server2.example.com;
}
# IP hash
upstream backend {
ip_hash;
server server1.example.com;
server server2.example.com;
}
# Weighted
upstream backend {
server server1.example.com weight=3;
server server2.example.com weight=1;
}
server {
location / {
proxy_pass http://backend;
}
}
Health Checks:
upstream backend {
server server1.example.com max_fails=3 fail_timeout=30s;
server server2.example.com max_fails=3 fail_timeout=30s;
server server3.example.com backup; # Only used if others fail
}
Types:
1. Layer 4 (Transport Layer) - Routes on IP and port (TCP/UDP); fast and protocol-agnostic
2. Layer 7 (Application Layer) - Routes on HTTP paths, headers, and cookies; enables content-aware routing
# Layer 7 routing
location /api/ {
proxy_pass http://api_servers;
}
location /static/ {
proxy_pass http://static_servers;
}
location / {
proxy_pass http://web_servers;
}
Answer:
A message queue is a form of asynchronous communication where messages are stored in a queue until processed.
How It Works:
Producer → Queue → Consumer
Example:
User uploads image → Message Queue → Background worker processes image
Benefits: Decouples producers from consumers, smooths traffic spikes (load leveling), enables async processing and retries, and lets each side scale independently
Popular Message Queues: RabbitMQ, Apache Kafka, Amazon SQS, Redis-based queues (Bull/BullMQ)
Example Use Cases:
1. Email Sending
// Without queue - User waits
app.post('/register', async (req, res) => {
const user = await db.createUser(req.body);
await sendEmail(user.email, 'Welcome!'); // Slow! User waits 2 seconds
res.json({ message: 'Registered' });
});
// With queue - Instant response
app.post('/register', async (req, res) => {
const user = await db.createUser(req.body);
// Add to queue (fast, ~1ms)
await queue.add('send-email', {
email: user.email,
template: 'welcome'
});
res.json({ message: 'Registered' }); // User gets instant response
});
// Worker processes queue in background
queue.process('send-email', async (job) => {
await sendEmail(job.data.email, job.data.template);
});
2. Image Processing
app.post('/upload', async (req, res) => {
const file = await saveFile(req.file);
// Queue processing tasks
await queue.add('resize-image', { fileId: file.id, size: 'thumbnail' });
await queue.add('resize-image', { fileId: file.id, size: 'medium' });
await queue.add('resize-image', { fileId: file.id, size: 'large' });
res.json({ file_id: file.id });
});
Implementation (Bull - Redis-based):
const Queue = require('bull');
const emailQueue = new Queue('email', 'redis://localhost:6379');
// Producer
app.post('/send-email', async (req, res) => {
await emailQueue.add({
to: req.body.email,
subject: 'Hello',
body: 'Welcome!'
}, {
attempts: 3, // Retry 3 times if fails
backoff: 5000,
removeOnComplete: true
});
res.json({ message: 'Queued' });
});
// Consumer
emailQueue.process(async (job) => {
await sendEmail(job.data.to, job.data.subject, job.data.body);
});
Answer:
Delivery semantics describe how reliably a message is processed by consumers.
At-most-once - Fire and forget: messages may be lost but are never duplicated
At-least-once - Redelivered until acknowledged: no loss, but duplicates are possible, so consumers must be idempotent
Exactly-once - Each message is processed exactly once; hard to achieve end-to-end, usually approximated as at-least-once delivery plus idempotent processing
Idempotent Consumer Example:
-- Store processed message ids
CREATE TABLE processed_messages (
message_id VARCHAR(64) PRIMARY KEY,
processed_at TIMESTAMP
);
async function handleMessage(msg) {
const exists = await db.query(
'SELECT 1 FROM processed_messages WHERE message_id = ?',
[msg.id]
);
if (exists.length > 0) return; // Already processed
await db.beginTransaction();
try {
await process(msg.payload);
await db.query(
'INSERT INTO processed_messages (message_id, processed_at) VALUES (?, NOW())',
[msg.id]
);
await db.commit();
} catch (e) {
await db.rollback();
throw e;
}
}
Answer:
A DLQ stores messages that repeatedly fail processing so they can be inspected and retried later.
Why it matters: A "poison" message that always fails would otherwise be retried forever and block the queue; the DLQ isolates it while preserving it for debugging and manual replay.
Typical Flow:
Main Queue → Consumer (fails) → Retry N times → DLQ
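The flow above can be sketched in-memory (illustrative only; brokers like RabbitMQ or SQS configure max retries and DLQ routing declaratively, and the names here are made up):

```javascript
// Retry a handler up to MAX_ATTEMPTS; after the final failure, dead-letter the message.
const MAX_ATTEMPTS = 3;
const deadLetterQueue = [];

function processWithDlq(message, handler) {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      handler(message);
      return { status: 'processed', attempts: attempt };
    } catch (err) {
      // On the last attempt, move the message to the DLQ with failure context
      if (attempt === MAX_ATTEMPTS) {
        deadLetterQueue.push({ message, error: err.message, failedAt: Date.now() });
        return { status: 'dead-lettered', attempts: attempt };
      }
    }
  }
}

// A handler that always fails ends up in the DLQ after 3 attempts
const result = processWithDlq({ id: 'msg-1' }, () => { throw new Error('boom'); });
```

Operators then inspect `deadLetterQueue`, fix the root cause, and replay messages back onto the main queue.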
Answer:
| Aspect | Monolith | Microservices |
|---|---|---|
| Deployment | Single unit | Independent services |
| Scaling | Scale whole app | Scale hot services |
| Complexity | Lower | Higher (network, ops) |
| Team Size | Small teams | Larger orgs |
| Failure Isolation | Poor | Better |
When to choose microservices: Multiple teams that need independent deploy cadences, components with very different scaling profiles, and mature CI/CD and observability tooling; otherwise start with a monolith.
Answer:
An API gateway is a single entry point for clients that routes requests to internal services.
Responsibilities: Routing, authentication/authorization, rate limiting, TLS termination, request/response transformation, and response aggregation
Example:
Client → API Gateway → Auth Service
→ User Service
→ Orders Service
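At its core a gateway is a routing table plus cross-cutting middleware. A stripped-down sketch of the routing part (service names and ports are invented for illustration):

```javascript
// Prefix-based routing: the gateway maps a public path to an internal service.
const routes = [
  { prefix: '/auth', target: 'http://auth-service:4000' },
  { prefix: '/users', target: 'http://user-service:4001' },
  { prefix: '/orders', target: 'http://orders-service:4002' }
];

function resolveRoute(path) {
  const route = routes.find(r => path.startsWith(r.prefix));
  if (!route) return null; // Gateway answers 404 itself
  // The gateway would forward the request here, preserving the path
  return route.target + path;
}
```

Real gateways (Kong, NGINX, AWS API Gateway) layer auth, rate limiting, and TLS termination on top of exactly this kind of lookup.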
Answer:
Service discovery lets services find each other dynamically without hardcoded IPs.
Types:
Client-side - The client queries a registry (Consul, etcd, Eureka) and picks an instance itself
Server-side - A load balancer or the platform (e.g., Kubernetes Services/DNS) resolves the instance for the client
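A toy in-memory registry illustrating client-side discovery (real systems use Consul, etcd, or Kubernetes DNS; the service names and addresses here are made up):

```javascript
// Services register their addresses; clients resolve one via round-robin.
class ServiceRegistry {
  constructor() {
    this.services = new Map(); // name -> { instances: [], next: 0 }
  }
  register(name, address) {
    if (!this.services.has(name)) this.services.set(name, { instances: [], next: 0 });
    this.services.get(name).instances.push(address);
  }
  resolve(name) {
    const entry = this.services.get(name);
    if (!entry || entry.instances.length === 0) return null;
    const address = entry.instances[entry.next % entry.instances.length];
    entry.next++; // Round-robin across registered instances
    return address;
  }
}

const registry = new ServiceRegistry();
registry.register('users', '10.0.0.1:4001');
registry.register('users', '10.0.0.2:4001');
```

Production registries add what this sketch omits: health checks, TTL-based deregistration, and watch/notify APIs.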
Answer:
A circuit breaker prevents cascading failures by cutting off calls to a failing dependency and failing fast instead.
States:
Closed - Requests flow normally; failures are counted
Open - Calls fail fast without reaching the dependency
Half-Open - After a cooldown, a few trial requests probe whether the dependency has recovered
Example (pseudocode):
if (breaker.isOpen()) throw new Error('Service unavailable');
try {
const res = await callService();
breaker.recordSuccess();
return res;
} catch (e) {
breaker.recordFailure();
throw e;
}
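A minimal breaker object behind that pseudocode might look like this (thresholds and timings are illustrative, and time is passed in explicitly so the policy is testable):

```javascript
// Counts consecutive failures, opens at a threshold, and half-opens after a cooldown.
class CircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 30000 } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null;
  }
  isOpen(now = Date.now()) {
    if (this.openedAt === null) return false;
    // After the cooldown, report closed so a trial request goes through (half-open)
    return now - this.openedAt < this.cooldownMs;
  }
  recordSuccess() {
    this.failures = 0;
    this.openedAt = null; // Close the breaker again
  }
  recordFailure(now = Date.now()) {
    this.failures++;
    if (this.failures >= this.failureThreshold) this.openedAt = now;
  }
}

const breaker = new CircuitBreaker({ failureThreshold: 2, cooldownMs: 1000 });
```

Libraries like opossum add the details this sketch skips: rolling failure windows, per-request timeouts, and fallback handlers.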
Answer:
The saga pattern replaces a distributed transaction with a sequence of local transactions, each paired with a compensating action that undoes it if a later step fails.
Example:
Create Order → Reserve Inventory → Charge Payment
If Charge fails → Release Inventory → Cancel Order
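The flow above can be sketched as a tiny orchestrator (the step names and in-memory log are illustrative; real steps would be async calls to other services):

```javascript
// Run each step's action; if one throws, run the compensations
// of the already-completed steps in reverse order.
function runSaga(steps) {
  const completed = [];
  try {
    for (const step of steps) {
      step.action(); // In a real system, an async call to another service
      completed.push(step);
    }
    return { status: 'committed' };
  } catch (err) {
    for (const step of completed.reverse()) {
      step.compensate(); // Undo in reverse order
    }
    return { status: 'rolled back', reason: err.message };
  }
}

// Simulated order saga where the payment step fails
const log = [];
const sagaResult = runSaga([
  { action: () => log.push('order created'), compensate: () => log.push('order cancelled') },
  { action: () => log.push('inventory reserved'), compensate: () => log.push('inventory released') },
  { action: () => { throw new Error('card declined'); }, compensate: () => {} }
]);
```

The alternative to this orchestration style is choreography, where each service reacts to events instead of a central coordinator driving the steps.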
Answer:
Distributed tracing follows a single request as it crosses service boundaries by propagating trace IDs through headers.
Common tools: OpenTelemetry, Jaeger, Zipkin
Key fields:
trace_id - Identifies the entire request across all services
span_id - Identifies one operation within the trace
parent_span_id - Links a span to the operation that called it
Answer:
The OWASP Top 10 lists the most critical web app security risks.
Examples: Injection, Broken Access Control, Cryptographic Failures, Security Misconfiguration
SQL Injection:
Bad:
const query = `SELECT * FROM users WHERE email = '${email}'`;
Good:
const query = 'SELECT * FROM users WHERE email = ?';
db.query(query, [email]);
Other protections: Input validation, least-privilege database users, dependency scanning, security headers
CSRF (Cross-Site Request Forgery): An attacker tricks a logged-in user's browser into submitting requests; defend with anti-CSRF tokens and SameSite cookies
XSS (Cross-Site Scripting): An attacker injects script into pages seen by other users; defend by escaping output, sanitizing user-supplied HTML, and setting a Content-Security-Policy header
Answer:
Never hardcode secrets in code or commit history.
Preferred options:
Environment variables injected at deploy time
Secret managers (HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager)
Short-lived credentials with rotation and least-privilege access
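A minimal sketch of the environment-variable approach: load secrets at startup and fail fast when one is missing (the variable names are examples, not a standard):

```javascript
// Read required secrets from the environment instead of source code.
function loadConfig(env = process.env) {
  const required = ['DATABASE_URL', 'JWT_SECRET'];
  const missing = required.filter(name => !env[name]);
  if (missing.length > 0) {
    // Crash at boot rather than failing mysteriously at first use
    throw new Error(`Missing required secrets: ${missing.join(', ')}`);
  }
  return { databaseUrl: env.DATABASE_URL, jwtSecret: env.JWT_SECRET };
}
```

Failing at boot makes a misconfigured deployment obvious immediately, instead of surfacing as a 500 on the first login attempt.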
Answer:
| Type | Scope | Example |
|---|---|---|
| Unit | Single function | Validate input parsing |
| Integration | Multiple components | API + DB |
| End-to-End | Full system | User sign-up flow |
| Contract | Service boundaries | API schema validation |
Answer:
Stubs return canned responses so a test can run without the real dependency (state verification); mocks additionally record how they were called so the test can verify interactions (behavior verification). Use stubs for external APIs and slow resources, mocks when the interaction itself is what you are testing.
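A hand-rolled illustration of the difference (frameworks like Jest or Sinon generate these; the payment/email names are invented for the example):

```javascript
// Stub: returns a fixed canned response, no call tracking.
function makeStubPaymentApi(response) {
  return { charge: () => response };
}

// Mock: records every call so the test can verify interactions afterwards.
function makeMockEmailer() {
  const calls = [];
  return {
    calls,
    send: (to, subject) => { calls.push({ to, subject }); }
  };
}

// Code under test: charge the order, then email a receipt on success.
function checkout(paymentApi, emailer, order) {
  const receipt = paymentApi.charge(order.total);
  if (receipt.ok) emailer.send(order.email, 'Receipt');
  return receipt;
}

const emailer = makeMockEmailer();
const receipt = checkout(
  makeStubPaymentApi({ ok: true, id: 'ch_1' }), // Stub isolates the payment API
  emailer,                                      // Mock verifies the email was sent
  { total: 100, email: 'a@b.com' }
);
```

The stub keeps the test independent of the real payment provider; the mock lets the test assert that exactly one receipt email went to the right address.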
Answer:
Example:
{"level":"info","message":"order.created","order_id":123,"request_id":"req-abc"}
Checklist:
One JSON object per line, machine-parseable
Include level, timestamp, message, and a request/correlation ID
Never log secrets, tokens, or PII
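A minimal logger producing lines like the example above (a sketch; production code would use a library such as pino or winston):

```javascript
// Emits one JSON object per line; base fields (e.g. request_id) are merged
// into every entry, and the writer is injectable for testing.
function createLogger(baseFields = {}, write = line => process.stdout.write(line + '\n')) {
  return {
    info(message, fields = {}) {
      const entry = {
        level: 'info',
        timestamp: new Date().toISOString(),
        message,
        ...baseFields,
        ...fields
      };
      write(JSON.stringify(entry));
      return entry;
    }
  };
}

const lines = [];
const logger = createLogger({ request_id: 'req-abc' }, line => lines.push(line));
logger.info('order.created', { order_id: 123 });
```

Because every line is JSON, a log pipeline (ELK, Loki, CloudWatch) can filter by `request_id` without regex parsing.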
Answer:
Automates build, test, and deployment to reduce manual errors.
Typical steps:
Commit → Build → Test → Security Scan → Deploy → Monitor
Blue-Green: Keep two identical environments; switch all traffic to the new one at once, and switch back instantly to roll back
Canary: Send a small percentage of traffic to the new version, then ramp up while monitoring error rates and latency
Answer:
Docker packages an application and its dependencies into a portable container image that runs the same everywhere.
Minimal Dockerfile:
FROM node:20-alpine
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
Key objects: Image (immutable template), Container (running instance of an image), Volume (persistent data), Network (container connectivity), Registry (image storage, e.g. Docker Hub)
Answer:
Replicas may temporarily return stale data after a write, but in the absence of new writes all replicas converge to the same value.
Example:
Likes count in social app is eventually consistent to keep writes fast.
Answer:
Consumers should handle duplicate messages safely.
Common strategies:
Deduplication table keyed by message ID (as in the idempotent consumer example earlier)
Idempotency keys supplied by producers
Naturally idempotent writes (upserts such as INSERT ... ON DUPLICATE KEY UPDATE)
Answer:
Backpressure prevents overload by slowing producers when consumers are busy.
Methods:
Bounded queues that block or reject when full
Rate limiting and load shedding at the edge
Consumer-driven flow control (acknowledgement windows, reactive streams)
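The bounded-queue method can be sketched in a few lines: when the buffer is full, new work is rejected, forcing the producer to slow down or shed load.

```javascript
// A queue with a hard capacity; enqueue returns false to signal backpressure.
class BoundedQueue {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
  }
  enqueue(item) {
    if (this.items.length >= this.capacity) return false; // Producer must back off
    this.items.push(item);
    return true;
  }
  dequeue() {
    return this.items.shift();
  }
}

const queue = new BoundedQueue(2);
const accepted = [queue.enqueue('a'), queue.enqueue('b'), queue.enqueue('c')];
```

The `false` return is the backpressure signal; a real producer would respond by waiting, retrying later, or dropping low-priority work.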
Answer:
Schema changes should be versioned, reversible, and safe.
Best practices:
Version every change and record applied versions in the database
Make steps backward compatible: expand first, migrate data, then contract
Provide a tested rollback (down) path and run migrations in CI against realistic data
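The versioning idea can be sketched as a tiny runner (tools like Flyway, Liquibase, or knex do this against a real schema_migrations table; the SQL strings and execute callback here are illustrative):

```javascript
// Each migration has up/down SQL; applied versions are tracked
// so each migration runs exactly once, in order.
const migrations = [
  { version: 1, up: 'ALTER TABLE users ADD COLUMN email VARCHAR(255)', down: 'ALTER TABLE users DROP COLUMN email' },
  { version: 2, up: 'CREATE INDEX idx_users_email ON users (email)', down: 'DROP INDEX idx_users_email' }
];

function migrate(applied, execute) {
  for (const m of migrations) {
    if (applied.has(m.version)) continue; // Already run, skip
    execute(m.up);
    applied.add(m.version); // Record so it never runs twice
  }
  return applied;
}

const executed = [];
const applied = migrate(new Set([1]), sql => executed.push(sql)); // Version 1 already applied
```

Keeping a `down` for every `up` is what makes the rollback path possible, though destructive downs (dropping a column with data) still need a backup plan.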
Answer:
Assume dependencies fail and build resilience.
Techniques:
Timeouts and retries with exponential backoff
Circuit breakers and bulkheads to isolate failures
Graceful degradation and fallbacks
Chaos testing (e.g., randomly killing instances) to verify the system survives failure
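Retry with exponential backoff, one of the techniques above, can be sketched like this (delays and attempt counts are illustrative, and the sleep function is injectable so the policy is testable):

```javascript
// attempt 1 -> 100ms, attempt 2 -> 200ms, attempt 3 -> 400ms, ...
function backoffDelays(attempts, baseMs = 100, factor = 2) {
  return Array.from({ length: attempts }, (_, i) => baseMs * factor ** i);
}

// Call fn up to `attempts` times, backing off between failures.
async function withRetry(fn, { attempts = 3, sleep = ms => new Promise(r => setTimeout(r, ms)) } = {}) {
  const delays = backoffDelays(attempts);
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) await sleep(delays[i]); // Back off before retrying
    }
  }
  throw lastError; // All attempts exhausted
}
```

Production versions usually add jitter (randomizing each delay) so many clients recovering at once don't retry in lockstep and re-overload the dependency.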