Kairos HomeWorld

A Unified Floorplan-to-Furnished Framework for Generating Controllable, Densely Interactive Whole-Home Scenes

Wenbo Li1,2,* Xiaoliang Ju1,2,*,† Zipeng Qin1,2,* Rongyao Fang2 Hongsheng Li2,1,3

1 Ace Robotics  ·  2 CUHK MMLab  ·  3 Shenzhen Loop Area Institute

* Equal contribution  ·  † Project lead  ·  Corresponding author


Large-scale Datasets


300K

Real-World Floorplan Dataset

  • Chinese Style
  • Richly Captioned
  • Fully Vectorized

5K

Furnished Whole-Home Dataset

  • Complex Geometry
  • Manipulable Objects
  • Full 3D & Sim-ready

Whole-home Scene Generation

Whole-home scene generation: floorplan and furnished room renders

Embodied Interaction Demo in Generated Scenes

Highlights


Indoor scene generation is crucial for robot simulation and modern interior design. However, complex layouts and scarce 3D scene data make learning-based generation challenging. Existing methods often rely on hand-crafted rules or focus on isolated sub-tasks (e.g., floorplan synthesis or single-room furnishing), producing whole-home scenes that lack global coherence, realism, and simulation readiness. To address these limitations, we propose a unified hierarchical framework that decomposes indoor scene synthesis into controllable stages. First, we curate a large-scale dataset of 300K real residential floorplans to train a large language model for whole-home floorplan generation. With detailed descriptions and a K-D tree–based representation, our method enables fine-grained, controllable whole-home floorplan generation. Building on the generated floorplan, we use image generation models to draft furniture layouts from multi-level roaming viewpoints, and then generate the layouts of small manipulable objects on different supporting surfaces (e.g., cabinets, desks, and dining tables) for embodied AI simulation. During furniture and object layout generation, a VLM-based refiner iteratively corrects placements, and a 3D generative model enables flexible replacement of individual assets. We further attach basic physical attributes and simple surface texture and lighting setups to complete the pipeline for embodied AI use. Experiments and user studies demonstrate that our pipeline produces indoor spaces with greater layout diversity and stronger 3D design appeal, outperforming prior methods on both quantitative and qualitative metrics. Finally, alongside our generation pipeline, we will release the floorplan dataset and 5K fully furnished scenes to the community.
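
The K-D tree-based floorplan representation can be illustrated with a small sketch. This page does not specify the actual encoding, so everything below is an assumption made for illustration: the `Room` and `Split` classes, the axis/position fields, and the bracketed text serialization are hypothetical. The sketch only shows how recursive axis-aligned cuts can describe a layout compactly enough to be written out as a token sequence.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Room:
    """Leaf node: one room filling the whole remaining rectangle."""
    name: str


@dataclass
class Split:
    """Internal node: cut the current rectangle along one axis at `pos` (metres)."""
    axis: str      # "x" cuts at a fixed x coordinate, "y" at a fixed y coordinate
    pos: float     # absolute coordinate of the cut
    low: "Node"    # child on the side with smaller coordinates
    high: "Node"   # child on the side with larger coordinates


Node = Union[Room, Split]
Box = tuple[float, float, float, float]  # (x0, y0, x1, y1)


def to_rooms(node: Node, box: Box) -> dict[str, Box]:
    """Walk the tree and return the rectangle assigned to each room."""
    x0, y0, x1, y1 = box
    if isinstance(node, Room):
        return {node.name: box}
    if node.axis == "x":
        low_box, high_box = (x0, y0, node.pos, y1), (node.pos, y0, x1, y1)
    else:
        low_box, high_box = (x0, y0, x1, node.pos), (x0, node.pos, x1, y1)
    return {**to_rooms(node.low, low_box), **to_rooms(node.high, high_box)}


def serialize(node: Node) -> str:
    """Flatten the tree into a bracketed string (a hypothetical text format)."""
    if isinstance(node, Room):
        return node.name
    return f"[{node.axis}@{node.pos:g} {serialize(node.low)} {serialize(node.high)}]"


# A toy 10 m x 8 m home split into four rooms.
plan = Split("x", 6.0,
             Split("y", 4.0, Room("living_room"), Room("kitchen")),
             Split("y", 5.0, Room("bedroom"), Room("bathroom")))
rooms = to_rooms(plan, (0.0, 0.0, 10.0, 8.0))
text = serialize(plan)
```

Here `rooms["kitchen"]` is the rectangle `(0.0, 4.0, 6.0, 8.0)`, and `text` is a single string that round-trips the whole layout. Real floorplans include non-rectangular rooms, doors, and windows, which this toy encoding does not attempt to cover.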

Method

Four-stage whole-home indoor scene generation pipeline
Figure 1 — Overview of the whole-home indoor scene generation pipeline.
  1. Stage 1: Floorplan Generation — Curate 300K real residential floorplans and train an LLM with a K-D tree representation for prompt-conditioned, controllable whole-home layouts.
  2. Stage 2: Image-Driven Hierarchical Furnishing — Instantiate an unfurnished 3D shell, then place furniture via hierarchical view roaming (top-down → ego-centric) with grounding and asset retrieval.
  3. Stage 3: Recursive Refinement — A fine-tuned VLM refiner detects collisions, blocked doorways, and other placement violations, then iteratively proposes corrective actions in a reflective loop.
  4. Stage 4: Manipulable Object Placement — Surface-centric synthesis populates desks, counters, and shelves with small objects, with physical attributes for simulation-ready embodied AI.
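
The detect-then-correct loop of Stage 3 can be sketched in a few lines. This is not the authors' refiner: the fine-tuned VLM is replaced by a hypothetical geometric stand-in (`propose_fix`), furniture is reduced to 2D axis-aligned footprints, and only collisions are checked (not blocked doorways). All names below are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Item:
    name: str
    x: float  # min-corner x (metres)
    y: float  # min-corner y (metres)
    w: float  # footprint width along x
    d: float  # footprint depth along y


def overlap(a: Item, b: Item) -> tuple[float, float]:
    """Penetration depth along x and y; both positive means the footprints collide."""
    ox = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    oy = min(a.y + a.d, b.y + b.d) - max(a.y, b.y)
    return ox, oy


def find_collisions(items: list[Item]) -> list[tuple[Item, Item, float, float]]:
    """Detection step: list every colliding pair with its penetration depths."""
    found = []
    for a, b in combinations(items, 2):
        ox, oy = overlap(a, b)
        if ox > 0 and oy > 0:
            found.append((a, b, ox, oy))
    return found


def propose_fix(a: Item, b: Item, ox: float, oy: float) -> None:
    """Stand-in for the VLM refiner: push `b` away along the axis of least penetration."""
    if ox <= oy:
        b.x += ox if b.x >= a.x else -ox
    else:
        b.y += oy if b.y >= a.y else -oy


def refine(items: list[Item], max_rounds: int = 10) -> int:
    """Alternate detection and correction; return the number of fixes applied."""
    for applied in range(max_rounds):
        collisions = find_collisions(items)
        if not collisions:
            return applied
        propose_fix(*collisions[0])  # correct one violation, then re-check everything
    return max_rounds  # budget exhausted; the layout may still contain collisions


layout = [Item("sofa", 0.0, 0.0, 2.0, 1.0),
          Item("coffee_table", 1.5, 0.5, 1.0, 0.6),
          Item("tv_stand", 4.0, 0.0, 1.5, 0.4)]
rounds = refine(layout)
```

In this toy layout the coffee table initially overlaps the sofa and is moved clear in a single round. Re-running detection after each fix matters because a correction can introduce a new violation elsewhere, which is presumably why the stage is described as a reflective loop rather than a one-shot pass.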

Generation Results

Whole-home Samples

Whole-home generation results across floorplans and furnished room views
The perspective views in the bottom row correspond to the numbered viewpoints marked on the floorplans above.

Functional Room Samples

Generated functional rooms: living room, bedroom, bathroom, and kitchen
For each room, the view inside the red dashed box below is a zoom-in focused on manipulable objects.

BibTeX

@misc{li2026homeworld,
  title         = {HomeWorld: A Unified Floorplan-to-Furnished Framework for Generating Controllable, Densely Interactive Whole-Home Scenes},
  author        = {Wenbo Li and Xiaoliang Ju and Zipeng Qin and Rongyao Fang and Hongsheng Li},
  year          = {2026},
  eprint        = {2606.06390},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2606.06390},
  note          = {Project page: https://kairos-homeworld.github.io/}
}