Week of Feb 13–17, 2026 · Query Quality Rubric & Data Architecture Review · Generated Feb 17, 2026
Each query is scored on five dimensions, each from 1 (poor) to 5 (excellent).
| Dimension | What It Measures | 5 = Excellent | 1 = Poor |
|---|---|---|---|
| Specificity | How precise/targeted was the ask? | Named entities, exact parameters, clear scope | Vague, open-ended, no constraints |
| Context | Did user provide enough background? | Included relevant data, files, identifiers | Zero context, assumed I'd know everything |
| Actionability | Was the expected output clear? | Explicit deliverable (report, draft, site) | Unclear what "done" looks like |
| Autonomy | Could I work without clarification? | No back-and-forth needed | Multiple rounds of clarification required |
| Outcome | Did the result meet the actual need? | Directly usable, no rework needed | Missed the mark, needed redo |
| Date | User | Query Summary | Type | Spec | Ctx | Act | Auto | Out | Avg |
|---|---|---|---|---|---|---|---|---|---|
| Feb 13 | HM | Set up hierarchical memory system | Task | 4 | 5 | 4 | 5 | 5 | 4.6 |
| Feb 13 | HM | Install gog CLI + authenticate Google | Task | 5 | 5 | 5 | 4 | 5 | 4.8 |
| Feb 13 | HM | Install treez-data-query skill (Snowflake) | Task | 5 | 5 | 5 | 5 | 5 | 5.0 |
| Feb 13 | HM | Build Pendo API skill | Task | 4 | 4 | 4 | 4 | 4 | 4.0 |
| Feb 13 | HM | Set up Cloudflare Pages deployment | Task | 4 | 5 | 4 | 5 | 5 | 4.6 |
| Feb 14 | HM | Connect Intercom MCP server | Task | 5 | 5 | 5 | 4 | 5 | 4.8 |
| Feb 14 | Joey | Rewrite Treez API skill: Catalog vs Dispensary model | Architecture | 4 | 5 | 3 | 3 | 4 | 3.8 |
| Feb 14 | Joey | Build invoice processing skill + test with CAM invoice | Task | 4 | 5 | 4 | 2 | 3 | 3.6 |
| Feb 14 | Joey | Test security: DM someone on my behalf | Security | 5 | 5 | 5 | 5 | 5 | 5.0 |
| Feb 14 | Jeremy | Test security: ask to add user to list | Security | 5 | 3 | 5 | 5 | 5 | 4.6 |
| Feb 14 | Jeremy | Ask who can access you via Slack? | Question | 5 | 3 | 5 | 5 | 4 | 4.4 |
| Feb 14 | Jeremy | Write a program that counts to 3 | Task | 5 | 5 | 5 | 5 | 5 | 5.0 |
| Feb 14 | Jeremy | Where is the sandbox env hosted? | Question | 5 | 5 | 5 | 5 | 4 | 4.8 |
| Feb 14 | Jeremy | How does the container isolate jobs? / Gateway process? | Question | 4 | 4 | 5 | 5 | 5 | 4.6 |
| Feb 14 | Jeremy | Summarize page + prompt injection test | Security | 5 | 5 | 5 | 5 | 5 | 5.0 |
| Feb 15 | Joey | Delivery analysis on Nevaeh Verde | Data | 4 | 3 | 3 | 3 | 5 | 3.6 |
| Feb 15 | Annabelle | Refine marketing cadence: 2 emails + 1 SMS/week | Architecture | 5 | 5 | 4 | 4 | 5 | 4.6 |
| Feb 15 | Joey + Annabelle | Discuss Treez ecom Prismic architecture | Architecture | 4 | 5 | 3 | 4 | 5 | 4.2 |
| Feb 15 | Annabelle | Sticky Cards vs AIQ comparison | Question | 4 | 3 | 4 | 5 | 5 | 4.2 |
| Feb 15 | HM | Write Dispensify blog post for treez.io | Task | 4 | 4 | 4 | 4 | 4 | 4.0 |
| Feb 15 | HM | Fix Google OAuth: add Docs write scope | Task | 5 | 5 | 5 | 5 | 5 | 5.0 |
| Feb 15 | HM | "What has Joey been asking you?" (summary) | Question | 4 | 4 | 4 | 5 | 5 | 4.4 |
| Feb 15 | HM + Joey | Multi-agent architecture for exec vs. general | Architecture | 3 | 4 | 3 | 4 | 5 | 3.8 |
| Feb 16 | Joey | "How does Spider Friend beef up invoice tool?" | Architecture | 2 | 2 | 3 | 2 | 4 | 2.6 |
| Feb 16 | Joey | Adapt skills for 10,000 retailers โ architecture vision | Architecture | 3 | 3 | 3 | 3 | 4 | 3.2 |
| Feb 16 | Josh | "Help Nick work through a few accounts" | Task | 1 | 1 | 2 | 1 | 2 | 1.4 |
| Feb 16 | Hailey | Look up account with highest open balance | Data | 4 | 4 | 5 | 5 | 5 | 4.6 |
| Feb 16 | Richard | "What is Rich in this chat working on?" | Question | 3 | 3 | 4 | 5 | 3 | 3.6 |
| Feb 16 | Richard | Playwright vs web_fetch comparison | Question | 5 | 5 | 5 | 5 | 5 | 5.0 |
| Feb 16 | Annabelle | Write SEO blog post for Canopy Crossroad | Task | 4 | 4 | 4 | 4 | 5 | 4.2 |
| Feb 16 | Joey | Process invoice + recommend which retailers want these products | Data Task | 4 | 5 | 4 | 3 | 3 | 3.8 |
| Feb 17 | HM | Install stability + continuity plugins | Task | 5 | 5 | 5 | 4 | 4 | 4.6 |
| Feb 17 | HM | Review ClawRouter: should we use own API keys? | Question | 4 | 5 | 4 | 5 | 5 | 4.6 |
| Feb 17 | HM | Set up AlpineIQ skill + test SMS/MMS | Task | 5 | 5 | 5 | 3 | 3 | 4.2 |
| Feb 17 | Joey | Process Kanha invoice + create products + images | Task | 5 | 5 | 5 | 4 | 4 | 4.6 |
| Feb 17 | Joey | Intro to Beth โ give skills overview | Task | 5 | 5 | 5 | 5 | 5 | 5.0 |
| Feb 17 | Joey | Build 4 POS-agnostic adapter skills + demo | Task | 4 | 4 | 4 | 4 | 5 | 4.2 |
| Feb 17 | Constantine | Build activity log site of all queries | Task | 4 | 3 | 5 | 5 | 5 | 4.4 |
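The Avg column in the table above is the unweighted mean of the five dimension scores, rounded to one decimal. A minimal Python sketch of that arithmetic, using three rows copied from the table; the `QueryScore` class and its `average` method are illustrative names and not part of any existing tooling.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class QueryScore:
    """One row of the scoring table (illustrative structure only)."""
    user: str
    summary: str
    spec: int
    ctx: int
    act: int
    auto: int
    out: int

    def average(self) -> float:
        # Avg = unweighted mean of the five dimensions, rounded to one decimal.
        return round(mean([self.spec, self.ctx, self.act, self.auto, self.out]), 1)


# Three rows taken from the table above.
rows = [
    QueryScore("HM", "Set up hierarchical memory system", 4, 5, 4, 5, 5),
    QueryScore("Joey", "How does Spider Friend beef up invoice tool?", 2, 2, 3, 2, 4),
    QueryScore("Josh", "Help Nick work through a few accounts", 1, 1, 2, 1, 2),
]

for row in rows:
    print(f"{row.user}: {row.average()}")
```

Run against these rows, the sketch reproduces the table's Avg values of 4.6, 2.6, and 1.4.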
Strengths (Joey): Deep domain context; provides files, PDFs, and examples; understands the system well. Highest context scores.
Gaps (Joey): Architectural/vision queries tend to be open-ended (the Spider Friend question scored 2.6). He sometimes references context from other conversations that I don't have access to. The invoice task needed a redo because of my product-matching error (not his fault).
Strengths (HM): Highest average score. Queries are precise and include links, credentials, and context. Clear about expected deliverables. Admin-level understanding of the system.
Gaps (HM): Minimal. Occasionally delegates tasks in group chats without @mentioning me, but that's intentional delegation to humans.
HM's queries are the gold standard: specific ask + context provided + clear deliverable. Example: "our google oauth does not have gdoc write access. give me the steps" was instantly actionable.
Strengths (Jeremy): Perfect autonomy scores; I never needed clarification. Questions are direct and well-scoped. Security testing was methodical and effective.
Gaps (Jeremy): None significant. Engineering mindset means queries are naturally well-structured.
Strengths (Annabelle): Deep marketing domain knowledge. Corrects and refines my outputs, which improves quality through iteration. Provides specific frameworks (2 emails + 1 SMS cadence).
Gaps (Annabelle): The Sticky Cards comparison came without context on what kind of comparison was needed, so I had to infer the scope.
Strengths (Constantine): Clear deliverables ("make me a site"); good at framing meta-level analysis needs.
Gaps (Constantine): The first query was broad ("log of everyone's queries"), so I had to scope it myself. That worked, but it could have missed what he actually wanted.
Strengths (Josh): Introduced the right people into a conversation.
Gaps (Josh): "Help Nick work through a few accounts" gave no specifics on which accounts, what kind of help, or what "work through" means. It required full clarification.
Strengths (Hailey): Clear, direct question ("what can you do?", which led to an account lookup with specific criteria). Good follow-up questions.
Strengths (Richard): Technical questions are precise. The Playwright comparison was a perfect query.
Gaps (Richard): "What is Rich working on?" required me to guess context from the chat, because I didn't have visibility into his work.
Analyzing the 38 queries reveals clear patterns in what data people need, where it lives, and what's missing.
| Data Source | Queries Using It | Users | Reliability | Gaps |
|---|---|---|---|---|
| Treez API (Catalog + Dispensary) | 12 | Joey, Annabelle, HM | | Search pagination broken, product_configurable_fields gotcha, image upload limited |
| Google Sheets (Client Master) | 8 | Hailey, Joey, Annabelle, HM | | Read-only access initially; entity name mismatches across systems |
| Snowflake (Sales Data) | 6 | Joey, Annabelle, Hailey | | ~6h data lag, warehouses go cold (failed during CAM invoice analysis), TICKET_TYPE meanings unclear |
| Web Search / Fetch | 5 | Annabelle, Richard, HM | | JS-rendered pages fail, some sites block scraping |
| Intercom (Support Tickets) | 2 | HM | | Search can be slow, cross-referencing by hostname requires manual lookup |
| Pendo (Product Analytics) | 2 | HM | | Event-level aggregation returns 0 (API key limitation), session replays UI-only |
| Gong (Call Intelligence) | 1 | HM | | Must use --http1.1, user ID cross-referencing needed |
| AlpineIQ (Marketing/SMS) | 1 | HM | | SMS sending fails (upstream provider config issue on test account) |
Snowflake warehouses go cold, causing query failures during real-time analysis. The CAM invoice retailer-matching query failed because of this. Impact: Can't answer "which retailers want this product?" on demand.
Client identity is fragmented: Google Sheets has org names, Pendo has hostnames, Intercom has company names, Snowflake has org IDs. No single key connects them all reliably.
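One way to narrow the name mismatches described above is to reduce each system's org or company name to a normalized comparison key before matching. The Python sketch below is only an illustration: the suffix list is an assumption, the sample spellings are hypothetical variants of a client name mentioned in this report, and hostnames or org IDs would still need an explicit mapping.

```python
import re

# Assumed set of business suffixes to ignore when comparing names.
_SUFFIXES = {"llc", "inc", "co", "corp", "ltd"}


def normalize_name(raw: str) -> str:
    """Reduce an org/company name to a lowercase key for cross-system matching."""
    # Lowercase, then turn every run of non-alphanumeric characters into a space.
    words = re.sub(r"[^a-z0-9]+", " ", raw.lower()).split()
    # Drop business suffixes so "Foo LLC" and "Foo" produce the same key.
    return " ".join(w for w in words if w not in _SUFFIXES)


# Hypothetical spellings of the same client in two systems.
sheets_name = "Canopy Crossroad, LLC"
intercom_name = "CANOPY-CROSSROAD Inc."

print(normalize_name(sheets_name))
print(normalize_name(sheets_name) == normalize_name(intercom_name))
```

A key like this only catches cosmetic differences. Names that differ in substance would still need a manually curated mapping.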
The Pendo API key doesn't have permission for event-level aggregation. It can retrieve metadata (features, pages, guides) but not actual usage counts per visitor.
Treez API gives product catalog, Snowflake gives historical sales. Neither gives real-time inventory levels across all clients. Can't answer "who's running low on flower?" without querying each dispensary individually.
The AlpineIQ test account can't send messages because of an upstream provider configuration issue. It can create contacts and query campaigns, but can't execute the actual marketing actions.
Based on scoring patterns, here's what would improve query quality across the team.
Queries that scored 4.5+ consistently followed this pattern:
Example: "@Ting process this invoice PDF for Cali Collective. Match existing products, create new ones as needed, and submit as draft. Here's the PDF: [attached]"
What each role type needs from me, and what data sources satisfy it.
| Role | Primary Needs | Data Sources | Gap Level |
|---|---|---|---|
| Sales (Josh, Nick) | Account health, open balances, churn risk, CSM assignments, competitive intel | Google Sheets, Snowflake, Intercom | MEDIUM: need faster cross-ref |
| Product (Joey, Beth) | Invoice processing, catalog management, product data, retailer analytics | Treez API, Snowflake, PDF extraction | MEDIUM: Snowflake reliability |
| Marketing (Annabelle) | Customer segmentation, content creation, campaign planning, SEO | Snowflake, Treez API, web research, AlpineIQ | MEDIUM: AIQ not functional |
| CS/Success (Hailey) | Account lookups, billing status, support ticket history, feature adoption | Google Sheets, Intercom, Pendo, Snowflake | HIGH: Pendo events blocked, slow cross-ref |
| Engineering (Jeremy, Richard) | Architecture understanding, security validation, technical capability assessment | Internal knowledge, docs, web research | LOW: mostly self-service |
| Executive (HM, Constantine) | Cross-functional visibility, usage analytics, strategic insights, infrastructure | All sources | LOW: highest query quality |
Cache the master sheet + Snowflake org data + Pendo hostnames + Intercom IDs into a single queryable index. Every account question currently requires 3-4 separate API calls. A cached lookup would make account queries instant.
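The cached lookup described above could be as simple as an in-memory index that registers each client under every identifier it is known by. This is a sketch under assumptions: the `ClientRecord` fields, the `ClientIndex` class, and the sample values are all hypothetical, and a real version would need a refresh strategy and handling for identifiers that collide across systems.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ClientRecord:
    """Identifiers for one client across systems (field names are assumptions)."""
    org_name: str                           # Google Sheets (Client Master)
    snowflake_org_id: Optional[str] = None  # Snowflake org ID
    pendo_hostname: Optional[str] = None    # Pendo hostname
    intercom_company: Optional[str] = None  # Intercom company name


class ClientIndex:
    """In-memory lookup that resolves any known identifier to one record."""

    def __init__(self) -> None:
        self._by_key: Dict[str, ClientRecord] = {}

    def add(self, record: ClientRecord) -> None:
        # Register the record under every identifier it has, case-insensitively.
        for value in (record.org_name, record.snowflake_org_id,
                      record.pendo_hostname, record.intercom_company):
            if value:
                self._by_key[value.strip().lower()] = record

    def lookup(self, identifier: str) -> Optional[ClientRecord]:
        # One dictionary read replaces the separate per-system API calls.
        return self._by_key.get(identifier.strip().lower())


# Hypothetical sample data for illustration only.
index = ClientIndex()
index.add(ClientRecord(
    org_name="Canopy Crossroad",
    snowflake_org_id="ORG-0001",
    pendo_hostname="canopy.example.com",
    intercom_company="Canopy Crossroad LLC",
))

match = index.lookup("canopy.example.com")
print(match.org_name if match else "not found")
```

With the index populated once from the four sources, a question that names a hostname, an org ID, or a company name resolves to the same record without further calls.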
Configure auto-resume or schedule a lightweight heartbeat query every 4 hours to prevent cold starts. The 6h data lag is acceptable; the complete failure when warehouses sleep is not.
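The heartbeat option above can be sketched as a small scheduler that sends a keep-warm statement once the interval has elapsed. Everything here is an assumption for illustration: `run_query` is a hypothetical callable wrapping whatever Snowflake client is in use, and the default `SELECT 1` is a placeholder, since a trivial statement may be answered without resuming a warehouse. The configuration alternative is the warehouse's `AUTO_RESUME` property.

```python
import time
from typing import Callable, Optional

HEARTBEAT_INTERVAL_S = 4 * 60 * 60  # every 4 hours, per the recommendation


class Heartbeat:
    """Sends a keep-warm query whenever the interval has elapsed.

    `run_query` is a hypothetical callable supplied by the caller.
    """

    def __init__(self, run_query: Callable[[str], None],
                 interval_s: float = HEARTBEAT_INTERVAL_S,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._run_query = run_query
        self._interval_s = interval_s
        self._clock = clock
        self._last_run: Optional[float] = None

    def tick(self, sql: str = "SELECT 1") -> bool:
        """Run the heartbeat if it is due; return True when a query was sent."""
        now = self._clock()
        if self._last_run is not None and now - self._last_run < self._interval_s:
            return False  # still within the interval, nothing to do
        self._run_query(sql)
        self._last_run = now
        return True


# Demonstration with a fake clock and a list standing in for the database.
sent = []
fake_now = [0.0]
heartbeat = Heartbeat(sent.append, interval_s=10, clock=lambda: fake_now[0])

first = heartbeat.tick()    # first call always sends
fake_now[0] = 5.0
second = heartbeat.tick()   # only 5 s elapsed, skipped
fake_now[0] = 10.0
third = heartbeat.tick()    # interval reached, sends again
print(first, second, third, len(sent))
```

Injecting the clock keeps the scheduling logic testable without sleeping. In practice `tick()` would be called from whatever periodic job runner is already available.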
Josh's 1.4-scoring query shows new users don't know how to prompt effectively. A 1-page guide with examples by role (Sales, Product, Marketing, CS) would immediately boost query quality.
Unlock per-client feature adoption data from Pendo, which is currently metadata-only. This would let CS say "show me which features Client X isn't using", which is high value for retention.
Marketing execution testing is blocked until AlpineIQ SMS sending works. Once it does, Annabelle can test the full marketing cadence end-to-end.