The sensory epithelia of the inner ear contain mechanosensitive hair cells and supporting cells. Both cell types arise from SOX2-expressing prosensory precursors, but the mechanisms underlying the lineage diversification of inner ear epithelial cells remain unclear. Here, Eri Hashino and colleagues determine the transcriptional trajectory of prosensory cells by performing longitudinal single-cell RNA-sequencing (scRNA-seq) analyses in human inner ear organoids. First, the authors generate a reporter line to isolate SOX2-expressing cells in an inner ear organoid system derived from human embryonic stem cells. Then, they perform scRNA-seq at several time points as the inner ear progenitors develop into type II vestibular hair cells and supporting cells. Based on their analysis, the authors propose that, contrary to the current literature, hair cells may derive from subpopulations of supporting cells, not from progenitor cells that can become either hair cells or supporting cells. Furthermore, by examining which genes are uniquely upregulated in hair cells versus supporting cells, the authors find that ion channel-related genes are enriched in supporting cells, whereas Wnt signalling-related genes are crucial to hair cell generation. Overall, this study demonstrates the use of organoid models in advancing our understanding of human hair cell development.