This paper proposes a cost-effective, variable-channel, floating-point hardware architecture and implementation of fast independent component analysis (FastICA) for EEG signal processing. Whitening is performed by Gram-Schmidt orthonormalization, which removes the need for dedicated eigenvalue decomposition (EVD) hardware in the FastICA algorithm. The architecture contains two processing units, PU1 and PU2. Both units are reused for the centering operation of preprocessing and for the update step of the fixed-point algorithm, and PU1 is additionally reused for the Gram-Schmidt orthonormalization in both preprocessing and the fixed-point algorithm. This reuse reduces hardware cost and allows the design to support FastICA on 2 to 16 channels. In addition to FastICA processing, the architecture supports re-reference, synchronized average, and moving average functions. The design is implemented in a 90 nm 1P9M complementary metal-oxide-semiconductor (CMOS) process. The FastICA hardware implementation consumes 19.4 mW at 100 MHz with a 1.0 V supply voltage, and the core size of the chip is 1.43 mm². Experimental results show that the presented work achieves satisfactory performance for each function.
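To illustrate how Gram-Schmidt orthonormalization can stand in for EVD-based whitening, the following is a minimal software sketch, not the hardware datapath described here. It assumes that whitening amounts to centering each channel and then orthonormalizing the channel vectors so that their sample covariance becomes the identity matrix. The function names (`center`, `dot`, `gram_schmidt_whiten`) and the scaling by the number of samples are assumptions made for this illustration.

```python
import math


def center(channels):
    """Subtract each channel's mean (the centering step of preprocessing)."""
    centered = []
    for ch in channels:
        mean = sum(ch) / len(ch)
        centered.append([v - mean for v in ch])
    return centered


def dot(a, b):
    """Inner product of two equal-length sample vectors."""
    return sum(x * y for x, y in zip(a, b))


def gram_schmidt_whiten(channels):
    """Orthonormalize centered channels so that dot(z_i, z_j) / m is 1 if i == j, else 0.

    Each output channel is scaled to unit variance, so no eigenvalue
    decomposition of the covariance matrix is required.
    """
    m = len(channels[0])
    basis = []
    for ch in channels:
        v = list(ch)
        # Remove the components along every previously accepted direction.
        for q in basis:
            coeff = dot(v, q) / m  # q already satisfies dot(q, q) / m == 1
            v = [vi - coeff * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v) / m)
        if norm < 1e-12:
            raise ValueError("channels are linearly dependent")
        basis.append([vi / norm for vi in v])
    return basis


# Two synthetic mixed channels (illustrative data, not EEG recordings).
samples = range(200)
x = [
    [math.sin(0.3 * t) + 0.5 * ((t % 7) - 3) for t in samples],
    [0.3 * math.sin(0.3 * t) + ((t % 7) - 3) for t in samples],
]
z = gram_schmidt_whiten(center(x))
```

In this sketch the same inner-product and scale-and-subtract operations serve both centering-style accumulation and orthonormalization, which is consistent with the idea of reusing a processing unit across steps. The mapping of these operations onto PU1 and PU2 is not specified here and is not implied by the code.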