The implementation of Forth words has to satisfy the following requirements: 1) A word must be represented by a single cell (for execute). 2) A word may represent a combination of code and data (e.g., for does>). In addition, on some hardware, keeping executed native code close to written data is slow and should therefore be avoided; moreover, failing to pair up calls with returns causes (slow) branch mispredictions. This paper describes how various Forth systems over the decades have satisfied these requirements, and how many of them run into performance pitfalls in various situations. It also discusses how to avoid these pitfalls, including in native-code systems.