Redis Security Boundaries for Multi-Tenant Applications
Multi-tenant Redis deployments fundamentally alter the threat model. The primary risk shifts from latency degradation to cross-tenant data leakage, unauthorized key mutation, and eviction-driven denial of service. Legacy single-tenant caching strategies, which rely on requirepass and naive key prefixes, fail to enforce cryptographic or logical isolation at scale. Establishing rigid security boundaries requires treating tenant identifiers as first-class security principals, aligning isolation policies with the underlying Redis cache topology, and enforcing network, ACL, and eviction segmentation together.
flowchart LR
T1[Tenant A client] -->|user tenant_a| ACL{Redis ACL}
T2[Tenant B client] -->|user tenant_b| ACL
ACL -->|key pattern ~tenant:a:*| KA[(Tenant A keyspace)]
ACL -->|key pattern ~tenant:b:*| KB[(Tenant B keyspace)]
ACL -. cross-tenant access denied .-> X[NOPERM]
Diagnostic Triage and Boundary Auditing
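When cross-tenant access is suspected, immediate forensic triage should query Redis directly rather than rely on application logs alone. The ACL LOG command records failed authentication attempts and commands rejected for command, key, or channel violations, along with the user and client context; on Redis 7.2 and later, each entry also carries creation and last-update timestamps in milliseconds.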
# Retrieve recent ACL violations (default: 10 entries)
redis-cli ACL LOG
# Cross-reference with active connections to map offending IPs/service accounts
redis-cli CLIENT LIST | grep -oE "(addr|user)=[^ ]+"
# Audit command execution frequency to detect bulk invalidation leaks
redis-cli INFO COMMANDSTATS | grep -E "cmdstat_keys|cmdstat_scan|cmdstat_del"
A poorly scoped SCAN or DEL during bulk invalidation can traverse tenant namespaces, triggering unauthorized cache misses or exposing stale references. Root-cause analysis of these boundary violations typically traces to three failure modes: permissive ACL inheritance, unbounded maxmemory eviction conflicts, and cluster slot migration without tenant-aware key pinning. Engineers must correlate ACL LOG timestamps with deployment windows to isolate configuration drift from application-level routing bugs.
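The correlation step can be made concrete by bucketing ACL LOG entries by deployment window. The following is a minimal sketch, not a definitive tool. It assumes entries shaped like the dictionaries redis-py returns from acl_log() on Redis 7.2+, where each entry carries a timestamp-created field in Unix milliseconds. The DeployWindow type, its fields, and the correlate_denials name are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass(frozen=True)
class DeployWindow:
    """Hypothetical record of one deployment, bounds in Unix milliseconds."""
    name: str
    start_ms: int
    end_ms: int


def correlate_denials(
    entries: Iterable[Dict[str, Any]],
    windows: List[DeployWindow],
) -> Dict[str, List[Dict[str, Any]]]:
    """Group ACL LOG entries by the deployment window they fall into.

    Assumption: each entry has a 'timestamp-created' value in milliseconds
    (present in ACL LOG output on Redis 7.2+).
    """
    buckets: Dict[str, List[Dict[str, Any]]] = {w.name: [] for w in windows}
    buckets["outside-any-window"] = []
    for entry in entries:
        created = int(entry["timestamp-created"])
        match = next(
            (w for w in windows if w.start_ms <= created <= w.end_ms), None
        )
        buckets[match.name if match else "outside-any-window"].append(entry)
    return buckets
```

Denials that cluster inside a window point toward configuration drift introduced by that deployment; denials outside every window are more likely application-level routing bugs.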
Configuration Hardening and ACL Enforcement
Since Redis 6, granular ACLs have superseded rename-command as the recommended way to restrict commands, and Redis 7.x extends them with selectors. Tenant isolation requires explicit user scoping, command allowlisting, and key pattern enforcement. The following configuration illustrates hardened ACL generation for a high-traffic tenant tier:
# Generate a cryptographically random password
TENANT_PASS=$(redis-cli ACL GENPASS)
# Create a user with read access to the full tenant namespace, and write
# access scoped to session keys via a selector (Redis 7.0+ feature).
# Rules apply left-to-right; conflicting rules at the end would override earlier grants.
redis-cli ACL SETUSER tenant_123 on ">$TENANT_PASS" \
"+@read" "~tenant:123:*" \
"(+@write ~tenant:123:session:*)"
# Verify the result
redis-cli ACL GETUSER tenant_123
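When tenant users are provisioned programmatically, the tenant identifier should be validated before it is interpolated into a key pattern; an identifier such as * would otherwise widen the pattern to every tenant. The sketch below mirrors the rules from the command above. The function name build_tenant_acl_rules and the alphanumeric-only identifier policy are assumptions made for illustration, not part of Redis or redis-py.

```python
import re
from typing import List

# Assumed policy: tenant IDs are short alphanumeric strings, so they can
# never contain glob characters (*, ?, [) or whitespace.
_TENANT_ID = re.compile(r"[A-Za-z0-9]{1,64}")


def build_tenant_acl_rules(tenant_id: str, password: str) -> List[str]:
    """Return ACL SETUSER rule arguments for one tenant.

    Read access covers the whole tenant namespace; write access is limited
    to session keys through a selector (Redis 7.0+).
    """
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"unsafe tenant id: {tenant_id!r}")
    prefix = f"tenant:{tenant_id}:"
    return [
        "on",
        f">{password}",
        "+@read",
        f"~{prefix}*",
        f"(+@write ~{prefix}session:*)",
    ]
```

The returned list can be passed as separate arguments, for example via redis-py's generic execute_command("ACL", "SETUSER", f"tenant_{tenant_id}", *rules), so no shell quoting is involved.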
Critical Production Notes:
- maxmemory-policy is instance-global in open-source Redis. You cannot assign tenant-specific eviction policies within a single instance. High-churn tenants must be routed to dedicated instances or Redis Enterprise databases to prevent eviction-driven cross-tenant cache stampedes.
- Use ACL DRYRUN (Redis 7.0+) to validate tenant commands against policy before deployment: redis-cli ACL DRYRUN tenant_123 SET tenant:123:config:v2 "true"
- Disable KEYS entirely via ACL. Replace it with SCAN and enforce cursor-based pagination in application code to prevent blocking the event loop.
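Cursor-based pagination over a single tenant's namespace might look like the sketch below. It assumes client is any object exposing redis-py's scan_iter(match=..., count=...) and that tenant_id has already been validated to contain no glob characters; the function name iter_tenant_key_batches is hypothetical. Because SCAN takes no key arguments, ACL key patterns should not be relied on to limit which key names it returns, so the sketch applies both a MATCH filter and a client-side prefix check.

```python
from typing import Any, Iterator, List


def iter_tenant_key_batches(
    client: Any, tenant_id: str, batch_size: int = 500
) -> Iterator[List[str]]:
    """Yield lists of keys for one tenant, at most batch_size keys per list."""
    prefix = f"tenant:{tenant_id}:"
    batch: List[str] = []
    # scan_iter drives the SCAN cursor internally; count is only a hint
    # for how much work the server does per call.
    for key in client.scan_iter(match=prefix + "*", count=batch_size):
        if isinstance(key, bytes):
            key = key.decode()
        if not key.startswith(prefix):
            continue  # defensive: never act on a key outside the namespace
        batch.append(key)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Processing keys in bounded batches keeps each follow-up command (for example a bulk invalidation) small, which limits how long any single call occupies the server.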
Python Client Implementation and Resilient Retry Patterns
In Python microservices, redis-py 5.x must be configured with strict tenant routing, connection validation, and deterministic retry logic. Dynamic ACL rule generation frequently outpaces connection pooling, causing NOAUTH or NOPERM exceptions during cold starts.
import redis
from redis.exceptions import AuthenticationError, NoPermissionError, ConnectionError
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type
import logging

logger = logging.getLogger(__name__)


def get_tenant_client(tenant_id: str, password: str) -> redis.Redis:
    client = redis.Redis(
        host="cache.internal.svc",
        port=6379,
        username=f"tenant_{tenant_id}",
        password=password,
        decode_responses=True,
        socket_timeout=2.0,
        socket_connect_timeout=1.0,
        retry_on_timeout=True,
        health_check_interval=15,
    )
    # Validate ACL context before returning the client
    whoami = client.acl_whoami()
    if whoami != f"tenant_{tenant_id}":
        raise RuntimeError(f"ACL mismatch: expected tenant_{tenant_id}, got {whoami}")
    return client


@retry(
    retry=retry_if_exception_type((AuthenticationError, ConnectionError)),
    wait=wait_exponential(multiplier=0.5, min=0.5, max=5),
    stop=stop_after_attempt(3),
    reraise=True,
)
def tenant_cache_get(client: redis.Redis, key: str) -> "str | None":
    try:
        return client.get(key)
    except NoPermissionError as e:
        # NOPERM is permanent: log it and re-raise without retrying
        logger.error("ACL boundary violation on key %s: %s", key, e)
        raise
Note that NoPermissionError (NOPERM) indicates a permanent ACL violation, not a transient failure — it should not be retried. The retry decorator above intentionally excludes it; the raise inside the except NoPermissionError block re-raises immediately. Connection health checks must validate ACL WHOAMI on initialization to catch ACL propagation lag. For comprehensive connection handling, consult the official redis-py documentation.
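Since a NOPERM response is permanent, it can be cheaper to reject foreign keys in the application before they reach Redis at all. The following is a minimal sketch of a client-side guard, assuming the tenant:<id>: key layout used throughout this article; the TenantKeyspace class and its methods are hypothetical helpers, not part of redis-py. It complements the server-side ACL and does not replace it.

```python
class TenantKeyspace:
    """Builds and checks keys for one tenant (illustrative helper)."""

    # Characters that could alter glob matching or split a key unexpectedly.
    _FORBIDDEN = set("*?[]\\ \n\r")

    def __init__(self, tenant_id: str) -> None:
        if not tenant_id.isalnum():
            raise ValueError(f"unsafe tenant id: {tenant_id!r}")
        self.prefix = f"tenant:{tenant_id}:"

    def key(self, *parts: str) -> str:
        """Join parts under the tenant prefix, rejecting unsafe segments."""
        if not parts:
            raise ValueError("at least one key part is required")
        for part in parts:
            if not part or self._FORBIDDEN & set(part):
                raise ValueError(f"unsafe key part: {part!r}")
        return self.prefix + ":".join(parts)

    def owns(self, key: str) -> bool:
        """Return True only if key lies inside this tenant's namespace."""
        return key.startswith(self.prefix)
```

A caller would build keys with ks.key("session", session_id) and check ks.owns(key) on any key received from outside the service before passing it to a function such as tenant_cache_get.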
CI/CD Gating and Policy Validation
Security boundaries degrade silently without automated validation. CI/CD pipelines must enforce ACL syntax validation, cross-tenant isolation testing, and dry-run execution before merging infrastructure changes.
# .github/workflows/redis-acl-gate.yml
name: Redis ACL Policy Validation
on:
  pull_request:
    paths: ['infra/redis/acl/**']
jobs:
  validate-acl:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Start Redis 7.2
        run: docker run -d --name redis-test -p 6379:6379 redis:7.2-alpine
      - name: Validate ACL Syntax and Isolation
        run: |
          # Create tenant user scoped to tenant:99:*
          docker exec redis-test redis-cli ACL SETUSER test_user on ">testpass" "+@read" "~tenant:99:*"
          # Attempt to write outside the allowed namespace; must fail.
          # On denial, ACL DRYRUN replies with a descriptive "... has no permissions ..." string rather than a NOPERM error.
          RESULT=$(docker exec redis-test redis-cli ACL DRYRUN test_user SET tenant:other:leak "blocked")
          echo "DRYRUN result: $RESULT"
          echo "$RESULT" | grep -qi "no permissions" || { echo "FAIL: Cross-tenant write was not blocked"; exit 1; }
          echo "PASS: Cross-tenant isolation verified"
      - name: Integration Isolation Test
        run: |
          python -m pytest tests/redis_isolation.py --redis-url=redis://localhost:6379
The pipeline must fail if any tenant user can access keys outside their ~tenant:<id>:* namespace. Integration tests should simulate concurrent tenant workloads, deliberately trigger SCAN operations, and assert zero cross-tenant key reads. Reference the official Redis ACL documentation for advanced rule composition and inheritance patterns.
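A static check can complement the runtime tests by linting the key patterns of each tenant user. The sketch below parses lines in the general shape of ACL LIST output and flags any key pattern that falls outside the user's own tenant:<id>: namespace. It assumes the tenant_<id> username convention used in this article; the function name find_overbroad_patterns is hypothetical, and the token handling is a simplification that only looks at key rules (~pattern, %R~pattern, %W~pattern, allkeys), including those inside selectors.

```python
import re
from typing import Dict, Iterable, List

# Matches key rules such as ~pattern, %R~pattern, %W~pattern, %RW~pattern.
_KEY_RULE = re.compile(r"^(?:%[RW]+)?~(.*)$")


def find_overbroad_patterns(acl_lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each tenant user to the key rules that escape its namespace."""
    violations: Dict[str, List[str]] = {}
    for line in acl_lines:
        # Selector parentheses are dropped so inner rules are linted too.
        tokens = line.replace("(", " ").replace(")", " ").split()
        if len(tokens) < 2 or tokens[0] != "user":
            continue
        username = tokens[1]
        if not username.startswith("tenant_"):
            continue  # only tenant users are linted in this sketch
        allowed = f"tenant:{username[len('tenant_'):]}:"
        bad: List[str] = []
        for token in tokens[2:]:
            if token == "allkeys":
                bad.append(token)
                continue
            match = _KEY_RULE.match(token)
            if match and not match.group(1).startswith(allowed):
                bad.append(token)
        if bad:
            violations[username] = bad
    return violations
```

A CI step could feed it the output of redis-cli ACL LIST from the test instance and fail the job whenever the returned dictionary is non-empty.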
Production Posture Checklist
Boundary enforcement is not a one-time configuration. It requires continuous validation, precise ACL scoping, and architectural alignment between cache topology and tenant routing.