$ cat /var/log/thoughts.md
// blog
terminal
where silicon meets syntax

thoughts on ASIC design, software engineering, and the spaces between. updated irregularly. quality not guaranteed. coffee definitely involved.

POSTS 14 · CATEGORIES 5 · AVG READ 10.2m · STATUS ONLINE
//

featured

tutorials · Jun 14, 2026 · 18 min

Inference, end to end

A visual walkthrough of LLM inference from prompt to token — prefill vs decode, the KV-cache, precision, speculative decoding, the memory hierarchy, and how the big labs serve at scale.

Inference · KV-cache · GPU
cat inference-end-to-end.md →
"if your timing closure strategy is 'hope', you need a new strategy."#silicon"every bug in RTL is a feature in the testbench."#silicon"the best code is the code you delete."#software"coffee is just a liquid PCB solvent for the brain."#life"version control is not optional. yes, even for HDL."#tutorials"0xDEADBEEF is a valid emotional state."#thoughts"premature optimization is the root of all evil. late optimization is the root of all rewrites."#software"if your timing closure strategy is 'hope', you need a new strategy."#silicon"every bug in RTL is a feature in the testbench."#silicon"the best code is the code you delete."#software"coffee is just a liquid PCB solvent for the brain."#life"version control is not optional. yes, even for HDL."#tutorials"0xDEADBEEF is a valid emotional state."#thoughts"premature optimization is the root of all evil. late optimization is the root of all rewrites."#software"if your timing closure strategy is 'hope', you need a new strategy."#silicon"every bug in RTL is a feature in the testbench."#silicon"the best code is the code you delete."#software"coffee is just a liquid PCB solvent for the brain."#life"version control is not optional. yes, even for HDL."#tutorials"0xDEADBEEF is a valid emotional state."#thoughts"premature optimization is the root of all evil. late optimization is the root of all rewrites."#software
// 0x01.

research.feeds

scheduled deep-research jobs. each feed runs on its own cadence and appends a new edition — a post + dashboard — every cycle.

active · weekly
~/research/

inference engines

weekly tracker of the LLM inference-engine landscape — vLLM, SGLang, TensorRT-LLM, LMDeploy, llama.cpp and friends. open the live dashboard →

2 editions · LAST RUN W24 · NEXT RUN 3d 04h 36m 50s
cd inference-engines
active · weekly
~/research/

frontier models

weekly tracker of the AI model landscape — frontier closed/open, small & on-device, agentic frameworks, classifiers. open the live dashboard →

2 editions · LAST RUN W24 · NEXT RUN 0d 00h 00m 00s
cd frontier-models
active · weekly
~/research/

inference silicon

weekly tracker of the chips powering inference — NVIDIA/AMD GPUs, wafer-scale, dataflow LPUs, inference ASICs, RISC-V. open the live dashboard →

2 editions · LAST RUN W24 · NEXT RUN 0d 04h 36m 50s
cd inference-silicon
// 0x02.

entries

tutorials · featured

Inference, end to end

A visual walkthrough of LLM inference from prompt to token — prefill vs decode, the KV-cache, precision, speculative decoding, the memory hierarchy, and how the big labs serve at scale.

Jun 14, 2026 · 18 min
Inference · KV-cache
silicon · featured

How Inference Chips Are Built

From transistors and dataflow to wafer-scale engines — a visual tour of how the silicon behind AI inference is designed, and why startups think they can unseat the GPU.

Jun 12, 2026 · 16 min
Silicon · ASIC
silicon · featured

RTL to GDS: A Complete Walkthrough

From Verilog to tapeout — the full ASIC design flow broken down into digestible steps. Covering synthesis, P&R, timing closure, and the dark arts of DRC.

Apr 28, 2026 · 12 min
ASIC · RTL
software · featured

Building a Cyberpunk Portfolio with Astro

How I turned terminal aesthetics and chip schematics into a personal site. Starfields, FSM diagrams, and unhealthy amounts of CSS.

Apr 15, 2026 · 8 min
Astro · CSS
silicon

Neural Networks on FPGAs: A Practical Guide

Deploying quantized models on Xilinx Zynq. Covers HLS, resource budgeting, and why your first attempt will always be too slow.

Mar 30, 2026 · 15 min
FPGA · AI
tutorials

My Vim Setup for ASIC Design

Mar 18, 2026 · 6 min
Vim · Workflow
// 0x03.

bulletin

backlog · 3

FPGA neural net vs. GPU

Head-to-head latency and throughput on quantized inference.

Neural Networks on FPG…

Migrate EDA scripts to Python 3.12

Exorcising the last Python 2 ghosts from the synthesis flow.

Automating EDA Flows w…

Coffee shop tier list update

Re-ranking the local espresso spots. Methodology disputed.

// devlog: ranking eve…
in_progress · 3

Timing closure on 7nm block

Chasing the last 40ps of setup slack across three clock domains.

The Art of Timing Clos…

RISC-V branch predictor series

A multi-part deep dive on speculative fetch and recovery.

Designing a Toy RISC-V…

Starfield performance on mobile

Profiling the canvas starfield to hold a steady 60fps on phones.

Procedural Starfield w…
shipped · 2

vim + SystemVerilog LSP tutorial

Completion, lint and go-to-def for RTL right inside vim.

My Vim Setup for ASIC …

RTL-to-GDS series, part 2

From gate-level netlist all the way to final tapeout.

RTL to GDS: A Complete…
// subscribe.sh

get notified

new post alerts. no spam. unsubscribe anytime. probably monthly at best.

no EDA tools were used in the making of this blog. just vim, coffee, and spite.
JUAN-SOC-1999-X1 · rev. 2026 · ed. 02