HOW-TO: Read a Google Drive Sheet / Excel reliably (never report a false row count)
Created 2026-05-24 after a real failure: Claude told Sam a contacts sheet had 143 rows — confidently, claiming "verified, complete, not truncated." It actually has 1,375. The connector had silently truncated the read, and Claude asserted a count it never actually verified. This runbook makes that impossible to repeat.
Companion discipline: memory feedback_calibrate_confidence (verify before asserting; mark [verified] vs [unconfirmed]; admit tool limits instead of guessing around them).
The two connector failure modes (know these cold)
mcp__claude_ai_Google_Drive__read_file_content:
| Source type | What it returns | The trap |
|---|---|---|
Native Google Sheet (application/vnd.google-apps.spreadsheet) |
A markdown pipe table with newlines | Silently TRUNCATES large sheets. Looks complete (ends on a clean row, no ...). Returned 143 of ~1,375. Never trust its size. |
.xlsx (uploaded Excel) |
The FULL content, but as one giant line — comma-separated, quote-aware, rows joined by a single SPACE, no newlines. Often too big for context → harness saves it to a temp file. | Looks unparseable / "0 rows" on a naive split("\n"). But it IS complete — just mangled. |
The procedure (do this, in order)
- Prefer a LOCAL read. If the file is synced to disk (check
Downloads, the OneDrive workspace, or a Google Drive Desktop mount), read the real binary with openpyxl / pandas — exact and reliable.python -c "import openpyxl; wb=openpyxl.load_workbook(p); print(wb.active.max_row)". This is the gold standard; the mangling problem disappears. - If not local, use the connector
read_file_content. If the result overflows context and the harness saves it to a temp file, that means you got the FULL content (good) — it did not truncate (that's the .xlsx path). A native-Sheet read that fits in context may be truncated (bad). - Run the parser:
python tools/drive/parse_drive_export.py <saved-temp-file>. It reconstructs rows from both formats, prints an exact count, and writes a clean CSV next to the input. - Completeness check — NEVER skip:
- The parser printsROWS != hdr cols. It must be 0 (or 1 tail row). Non-zero = the reconstruction is wrong; do not report the count.
- Confirm it reaches a known last value (alphabetical end, or a row the user showed you). The screenshot a user sends is usually NOT the end — it's them proving "there's more"; don't treat a scroll position as the endpoint.
- Cross-check with a second method (field-math:(flat_fields - 1) / (cols - 1)should be a clean integer). - If you only have a native-Sheet pipe-table read, say: "this may be truncated — I can't confirm the full size," and switch to the .xlsx/local path. Do not report a number from it.
The rule
Never state a row count (or any "complete / all / every row" claim) unless step 4 passed. If a tool can't read the file fully, say so — don't do convoluted parses and hand over a guess dressed as an answer. A user export to .xlsx is the easy reliable input; ask for it rather than fighting a truncating Sheet read.