We compare the videos rendered directly from Infinigen with the results after editing with Scene Copilot. Note that we optimized Scene Codex specifically for the two different tasks. Because Infinigen’s generation is randomized, its direct output videos are likely to miss the main subjects, or may even fail to contain the requested assets. In contrast, with Scene Copilot the user, acting as a “director”, has more control over the scene and the output video. For example, in the graveyard video, Infinigen’s asset library does not include a “graveyard”, so the camera ends up pointed in a random direction. Using BlenderGPT, however, we generated a church and gravestones with a fixed camera animation.