Researchers say they trained a foundation model from scratch for about $1,500
Training a foundation LLM from scratch costs millions and requires internet-scale data โ which is why most enterprises don't bother. Sapient thinks it has a cheaper path. To overcome this brute-force scaling dogma, researchers at Sapient developed HRM-Text , which replaces standa
Training a foundation LLM from scratch costs millions and requires internet-scale data โ which is why most enterprises don't bother. Sapient thinks it has a cheaper path. To overcome this brute-force scaling dogma, researchers at Sapient developed HRM-Text , which replaces standard Transformers with a highly sample-efficient Hierarchical Recurrent Model (HRM), an architecture they first introduced last year . HRM decouples computation into slow-evolving strategic and fast-evolving execution layers. Instead of brute-force autoregressive prediction on raw text, HRM-Text trains exclusively on instruction-response pairs. This is close to real-world enterprise settings, where users usually expect a targeted answer to a specific task. The researchers were able to train a 1B-parameter HRM-Text from scratch at a fraction of the cost and tokens of normal LLMs. Their model achieved performance competitive with much larger open models on key industry benchmarks. For real-world AI applications, th
This report comes from VentureBeat. The story centres on Researchers say they trained a foundation model from scratch for about $1,500. Full coverage and background context is available at the original source. Readers seeking more detail on this developing topic are encouraged to follow updates from VentureBeat and related outlets covering this beat.

