[Paper Reading Notes] Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Proposes DUET (Dual-scale Graph Transformer), a method that combines global action planning with fine-grained cross-modal understanding
1428 words
|
7 minutes
Scaling Data Generation in Vision-and-Language Navigation
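The authors propose ScaleVLN, a data generation paradigm for VLN, and show through comprehensive experiments that building high-quality navigation graphs and using camera-quality images are both effective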
472 words
|
2 minutes
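[Paper Reading Notes] Building Rome in a Day
An improvement on the Photo Tourism paper. Its main contribution is a high-performance computing platform that can reconstruct an entire city within a few dozen hours from hundreds of thousands of unrelated, unordered images scraped from the internet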
757 words
|
4 minutes
[Paper Reading Notes] DirectGPT: A Direct Manipulation Interface to Interact with Large Language Models
A user interface for interacting with LLMs, built on the principles of direct manipulation
443 words
|
2 minutes
[Paper Reading Notes] Teach AI How to Code: Using Large Language Models as Teachable Agents for Programming Education
Users take the role of teacher and teach an AI to program, following the learning by teaching (LBT) approach
1772 words
|
9 minutes