- Conversation State and Content Retrieval in WordPress LLM Integration
WordPress sites need conversational interfaces for documentation, customer support, and content discovery. The platform’s request-response architecture creates specific constraints: stateless PHP processes, database-centric state management, and deployment patterns ranging from shared hosting to multi-server configurations.
This case study examines architectural decisions that enable LLM conversations in WordPress while preserving data control. The approach uses browser-based session storage, database-backed conversation state, and MySQL FULLTEXT for content retrieval. These choices prioritize deployment simplicity and infrastructure compatibility over advanced capabilities.
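As a rough illustration of the retrieval piece, the sketch below runs a MySQL FULLTEXT query against the WordPress `wp_posts` table from Python. It assumes a FULLTEXT index has already been added over `post_title` and `post_content` (WordPress does not create one by default); the `mysql-connector-python` wrapper, credentials, and query text are illustrative rather than taken from the case study.

```python
# Hypothetical sketch: MySQL FULLTEXT retrieval over WordPress content.
# Assumes FULLTEXT(post_title, post_content) exists on wp_posts and that
# mysql-connector-python is installed; names and credentials are illustrative.
import mysql.connector

def search_posts(conn, query, limit=5):
    """Return the most relevant published posts for a user's question."""
    sql = """
        SELECT ID, post_title,
               MATCH(post_title, post_content)
                 AGAINST (%s IN NATURAL LANGUAGE MODE) AS score
        FROM wp_posts
        WHERE post_status = 'publish'
          AND MATCH(post_title, post_content)
                AGAINST (%s IN NATURAL LANGUAGE MODE)
        ORDER BY score DESC
        LIMIT %s
    """
    cur = conn.cursor(dictionary=True)
    cur.execute(sql, (query, query, limit))
    return cur.fetchall()

# Hypothetical connection details; the wp_ table prefix is the WordPress default.
conn = mysql.connector.connect(host="localhost", user="wp", password="secret",
                               database="wordpress")
for row in search_posts(conn, "how do I reset a lost password"):
    print(row["post_title"], round(row["score"], 2))
```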
- Building Tool-Integrated LLM Systems Using Function Calling and Model Context Protocol
Large language models are powerful but isolated. They generate text based on training data, but they cannot check current inventory, query a customer database, or create records in external systems. Most production LLM applications require bridging this gap.
This case study presents an architecture for extending LLM capabilities through structured tool integration. The approach progresses from basic function calling patterns through standardized protocols suitable for production deployment. The focus is practical: system prompts that produce reliable structured output, application-layer orchestration that maintains state and coordinates actions, and protocol choices that enable interoperability with commercial LLM providers.
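To make the application-layer orchestration idea concrete, here is a minimal dispatch sketch. It assumes the model has been prompted to reply either with plain text or with a JSON object naming a tool and its arguments; the `check_inventory` tool and its data are hypothetical stand-ins for a real backend.

```python
# A minimal sketch of application-layer tool dispatch, assuming the model
# answers either with plain text or with {"tool": "<name>", "arguments": {...}}.
# The tool registry and inventory lookup are illustrative, not from the study.
import json

def check_inventory(sku):
    # Stand-in for a real data source lookup.
    stock = {"WIDGET-1": 42, "WIDGET-2": 0}
    return {"sku": sku, "in_stock": stock.get(sku, 0)}

TOOLS = {"check_inventory": check_inventory}

def dispatch(model_output):
    """Route structured tool calls to local functions; pass plain text through."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # ordinary assistant text, no tool requested
    fn = TOOLS.get(call.get("tool"))
    if fn is None:
        return f"Unknown tool: {call.get('tool')}"
    result = fn(**call.get("arguments", {}))
    # In a full loop, this result is fed back to the model as a follow-up message.
    return json.dumps(result)

print(dispatch('{"tool": "check_inventory", "arguments": {"sku": "WIDGET-1"}}'))
```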
- An Orchestrated Document Intelligence Pipeline Using the Qwen3 Model Family
Document processing is a common LLM use case, but most implementations rely on disconnected components: an OCR engine extracts text, a separate model tries to make sense of it, and layout information that would help resolve ambiguities gets lost in between.
This case study presents an architecture that addresses this fragmentation using the Qwen3 model family. The family choice is pragmatic: consistent tokenization allows clean handoffs between specialized models in the same pipeline, so visual understanding, linguistic reasoning, and verification can operate in concert.
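A structural sketch of that handoff is shown below. The model calls are stubbed: in the actual pipeline the extraction, reasoning, and verification stages would be served by Qwen3-family models, and the `Span` fields, file name, and invoice example are illustrative assumptions.

```python
# Structural sketch of the three-stage handoff, assuming text plus layout boxes
# are passed between stages. Model calls are stubbed; the example data is made up.
from dataclasses import dataclass

@dataclass
class Span:
    text: str
    bbox: tuple  # (x0, y0, x1, y1) in page coordinates

def extract(page_image_path):
    # Stage 1: visual extraction (stubbed). The pipeline would use a
    # vision-capable Qwen3-family model to read the page image here.
    return [Span("Invoice #1042", (40, 30, 220, 55)),
            Span("Total: 318.00 EUR", (40, 700, 260, 725))]

def interpret(spans):
    # Stage 2: linguistic reasoning over text plus layout (stubbed with simple
    # rules; the pipeline would prompt a text model with spans and coordinates).
    fields = {}
    for s in spans:
        if s.text.lower().startswith("invoice"):
            fields["invoice_number"] = s.text.split("#")[-1]
        if s.text.lower().startswith("total"):
            fields["total"] = s.text.split(":")[-1].strip()
    return fields

def verify(fields, spans):
    # Stage 3: verification that every claimed field is grounded in a source span.
    return all(any(value in s.text for s in spans) for value in fields.values())

spans = extract("page-1.png")
fields = interpret(spans)
print(fields, verify(fields, spans))
```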
- Local LLM Deployment and Scaling Through Quantization
Quantization is a common step in local LLM deployment, but most approaches treat it as compression: reduce precision, verify the model loads, move on. This misses how transformers behave under reduced precision—errors compound across layers and context length in ways that basic testing doesn’t reveal.
This case study presents a methodology for reliable local deployment developed through quantization work across multiple model families. The focus is systematic: matching infrastructure constraints to quantization choices, calibrating against representative workloads at target context lengths, and validating against task-specific criteria rather than aggregate metrics alone.
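One way to picture the validation step is as a harness that compares a reference and a quantized model against the same task-specific criterion. The sketch below assumes both models are wrapped behind a `generate(prompt)` callable (for example via llama.cpp or transformers); the JSON classification check, prompts, and pass-rate metric are illustrative stand-ins for the case study's workload.

```python
# Validation-harness sketch, assuming reference and quantized models are each
# exposed as generate(prompt). The task criterion and prompts are illustrative.
import json

def passes_task(output):
    """Task-specific criterion: output must be valid JSON with the expected keys."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return {"label", "confidence"}.issubset(data)

def compare(reference_generate, quantized_generate, prompts):
    """Report how often each model satisfies the task criterion."""
    ref_ok = sum(passes_task(reference_generate(p)) for p in prompts)
    qnt_ok = sum(passes_task(quantized_generate(p)) for p in prompts)
    return {"reference_pass_rate": ref_ok / len(prompts),
            "quantized_pass_rate": qnt_ok / len(prompts)}

# Stubbed generators; real runs would pad prompts to the deployment's target
# context length rather than use short toy inputs.
stub = lambda p: '{"label": "invoice", "confidence": 0.92}'
print(compare(stub, stub, ["classify this document: ..."] * 8))
```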
- Named Entity Recognition and Unstructured-to-Structured Data Conversion Using Local LLMs
Named Entity Recognition is a common application for LLMs, but most implementations default to hosted APIs without evaluating whether the task requires that overhead. For structured extraction with clear schemas, smaller local models frequently match larger hosted alternatives while eliminating per-token costs and data privacy concerns.
This case study presents a methodology for contact data extraction using a quantized local model. The focus is practical: prompt structure that produces consistent output, two-pass processing that catches autoregressive errors, and validation that confirms results before production use.
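The shape of that pipeline can be sketched as follows, assuming the quantized local model is exposed as a `generate(prompt)` callable (for example via llama-cpp-python). The prompt templates, field names, and email regex are assumptions for illustration, not the case study's exact artifacts.

```python
# Sketch of two-pass extraction with deterministic validation. The generate()
# callable, prompts, fields, and regex are illustrative placeholders.
import json
import re

EXTRACT_PROMPT = "Return JSON with keys name, email, phone for the contact in:\n{text}"
CHECK_PROMPT = ("Does this JSON faithfully reflect the text? "
                "Reply with the corrected JSON only.\nText:\n{text}\nJSON:\n{draft}")

def extract_contact(generate, text):
    draft = generate(EXTRACT_PROMPT.format(text=text))                # pass 1: extract
    checked = generate(CHECK_PROMPT.format(text=text, draft=draft))   # pass 2: self-check
    try:
        record = json.loads(checked)
    except json.JSONDecodeError:
        return None
    # Deterministic validation before anything reaches production: the email
    # must look like an email, and the name must be grounded in the source text.
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", record.get("email", "")):
        return None
    if record.get("name") and record["name"] not in text:
        return None
    return record

# Usage with a stubbed model; a real run would call the quantized local model.
stub = lambda p: '{"name": "Ada Lovelace", "email": "ada@example.org", "phone": "+1 555 0100"}'
print(extract_contact(stub, "Ada Lovelace <ada@example.org>, +1 555 0100"))
```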