TY - JOUR
T1 - Predicting raters' transparency judgments of English and Chinese morphological constituents using latent semantic analysis
AU - Wang, Hsueh-Cheng
AU - Hsu, Li Chuan
AU - Tien, Yi Min
AU - Pomplun, Marc
PY - 2014/1/1
Y1 - 2014/1/1
AB - The morphological constituents of English compounds (e.g., "butter" and "fly" for "butterfly") and two-character Chinese compounds may differ in meaning from the whole word. Subjective differences and ambiguity of transparency make judgments difficult, and a computational alternative based on a general model might be a way to average across subjective differences. In the present study, we propose two approaches based on latent semantic analysis (Landauer & Dumais in Psychological Review 104:211-240, 1997): Model 1 compares the semantic similarity between a compound word and each of its constituents, and Model 2 derives the dominant meaning of a constituent from a clustering analysis of morphological family members (e.g., "butterfingers" or "buttermilk" for "butter"). The proposed models successfully predicted participants' transparency ratings, and we recommend that experimenters use Model 1 for English compounds and Model 2 for Chinese compounds, on the basis of differences in raters' morphological processing in the different writing systems. The dominance of lexical meaning, semantic transparency, and the average similarity between all pairs within a morphological family are provided, and practical applications for future studies are discussed.
KW - Chinese
KW - Clustering
KW - Compound words
KW - Latent semantic analysis
KW - Morphological family
KW - Semantic consistency
KW - Semantic transparency
UR - http://www.scopus.com/inward/record.url?scp=84894673889&partnerID=8YFLogxK
U2 - 10.3758/s13428-013-0360-z
DO - 10.3758/s13428-013-0360-z
M3 - Article
C2 - 23784009
AN - SCOPUS:84894673889
VL - 46
SP - 284
EP - 306
JO - Behavior Research Methods
JF - Behavior Research Methods
SN - 1554-351X
IS - 1
ER -