The memory bandwidth problem is a major research topic in the field of 3D graphics hardware. Among the various bandwidth-saving techniques, tessellation is an effective method for reducing geometry data transmission and the computation required for vertex animation. However, the implementation overhead of a dedicated tessellator prevents mobile 3D graphics engines from embedding one. In this research, we propose a programmable tessellator based on a conventional shader architecture and fabricate it in silicon. Because the proposed tessellator performs the arithmetic-intensive operations of tessellation on the shader's powerful datapath, it can be implemented with only a small additional hardware cost over a conventional vertex processor.