Bug report criteria
What happened?
Observed Behavior:
- Learner member DB size remains at 30MB while voting members are successfully defragmented to ~7MB
- etcdctl defrag command silently skips learner endpoints
- Direct defragmentation of learner endpoint fails with:
rpc error: code = Unavailable desc = etcdserver: rpc not supported for learner
- Learner database size grows unbounded over time without maintenance capability
Error Message:
etcdserver: rpc not supported for learner
Failed to defragment etcd member[https://181140267029446:25687] (etcdserver: rpc not supported for learner)
What did you expect to happen?
Learner members should support defragmentation operations to maintain database size, either:
- etcdctl defrag should include learner endpoints by default, OR
- etcdctl defrag --endpoints=<learner> should work without errors, OR
- Provide a supported method to defragment learner members (e.g., --include-learners flag)
How can we reproduce it (as minimally and precisely as possible)?
Minimal Reproduction Steps:
# 1. Set up 3-node cluster + 1 learner member
# (cluster already running with learner)
# 2. Check initial status - note learner DB size
etcdctl endpoint status --write-out=table
# Result: Learner at 30MB, voting members at 8.5MB
# 3. Perform compaction
etcdctl compact 2885950
# 4. Defragment cluster
etcdctl defrag
# Result: Only voting members defragmented
# 5. Verify learner still has large DB
etcdctl endpoint status --write-out=table
# Result: Voting members ~7MB, learner still 30MB
# 6. Try to defrag learner directly
etcdctl --endpoints=https://181140267029446:25687 defrag
# Result: Fails with "rpc not supported for learner"
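The checks in steps 2 and 5 can be done without reading the table by eye. The following is a minimal sketch, assuming the column layout of the endpoint status tables in this report (DB SIZE in the fourth column, IS LEARNER in the sixth). `list_learners` is a hypothetical helper name, not part of etcdctl.

```shell
#!/bin/sh
# list_learners: hypothetical helper, not an etcdctl feature.
# Reads an `etcdctl endpoint status --write-out=table` listing on stdin
# and prints "<endpoint> <db size>" for each row whose IS LEARNER
# column is "true". Assumes the column order shown in this report.
list_learners() {
  awk -F'|' '
    # Data rows start with "|"; border rows start with "+".
    # The header row is skipped by matching its ENDPOINT label.
    /^\|/ && $2 !~ /ENDPOINT/ {
      endpoint = $2; size = $5; learner = $7
      # Trim the padding spaces around each cell.
      gsub(/^ +| +$/, "", endpoint)
      gsub(/^ +| +$/, "", size)
      gsub(/^ +| +$/, "", learner)
      if (learner == "true") print endpoint, size
    }'
}
```

Usage: `etcdctl endpoint status --write-out=table | list_learners`, run once before and once after `etcdctl defrag`. If the learner's reported size is the same in both runs while the voting members shrink, the issue is reproduced.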
Cluster Topology:
- 3 voting members (fb2f8c3838629cdb, 2228c0c31b9ff622, 63d10718366c821d)
- 1 learner member (cf584302b1c47a59)
Anything else we need to know?
Impact:
- Production issue causing learner storage to grow unbounded
- No supported workaround for learner maintenance
- Affects cluster operations in learner-promotion scenarios
Observations:
- The CLI shows an --includeLearner flag, but it doesn't resolve the underlying RPC limitation
- This appears to be an intentional restriction but creates operational problems
Etcd version (please run commands below)
etcd version: 3.5.25
etcdctl version: 3.5.25
Etcd configuration (command line flags or environment variables)
Cluster Configuration:
- 4-member cluster (3 voting + 1 learner)
- HTTPS endpoints on port 25687
- Standard production setup
Relevant Settings:
- --auto-compaction-retention (if applicable)
- --quota-backend-bytes (if applicable)
- --initial-cluster (4-member setup)
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
Member List:
etcdctl member list -w table
Endpoint Status (Before Defrag):
+-------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://181140264581536:25687 | fb2f8c3838629cdb | 3.5.25 | 8.5 MB | false | false | 9 | 3276978 | 3276978 | |
| https://181140266914826:25687 | 2228c0c31b9ff622 | 3.5.25 | 8.5 MB | false | false | 9 | 3276978 | 3276978 | |
| https://181140266590820:25687 | 63d10718366c821d | 3.5.25 | 8.5 MB | true | false | 9 | 3276978 | 3276978 | |
| https://181140267029446:25687 | cf584302b1c47a59 | 3.5.25 | 30 MB | false | true | 9 | 3276980 | 3276980 | |
+-------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Endpoint Status (After Defrag - Issue Persists):
+-------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://181140264581536:25687 | fb2f8c3838629cdb | 3.5.25 | 7.0 MB | false | false | 9 | 3278633 | 3278633 | |
| https://181140266914826:25687 | 2228c0c31b9ff622 | 3.5.25 | 7.0 MB | false | false | 9 | 3278633 | 3278633 | |
| https://181140266590820:25687 | 63d10718366c821d | 3.5.25 | 7.0 MB | true | false | 9 | 3278633 | 3278633 | |
| https://181140267029446:25687 | cf584302b1c47a59 | 3.5.25 | 30 MB | false | true | 9 | 3278633 | 3278633 | |
+-------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Relevant log output
Defragmentation Failure Logs:
{"level":"warn","ts":"2026-05-11T08:07:05.721070-0700","logger":"etcd-client","caller":"v3@v3.5.25/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000f61e0/181140264581536:25687","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: rpc not supported for learner"}
...
{"level":"warn","ts":"2026-05-11T08:07:06.622187-0700","logger":"etcd-client","caller":"v3@v3.5.25/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000f61e0/181140264581536:25687","attempt":99,"error":"rpc error: code = Unavailable desc = etcdserver: rpc not supported for learner"}
Failed to defragment etcd member[https://181140267029446:25687] (etcdserver: rpc not supported for learner)