Current Vacancies

Postdoctoral Fellow (Operating Room Scene Understanding)


Job Responsibilities


1. Develop scene graph generation algorithms for multi-view operating room data, including object detection, relationship reasoning, and semantic modeling.   

2. Explore multimodal fusion techniques (vision, language, depth) to improve accuracy and robustness of scene graphs.  

3. Apply multi-view geometry (e.g., SfM, MVS, SLAM) to enhance spatial reasoning for scene graph generation.  

4. Apply scene graphs to surgeon behavior analysis, surgical workflow parsing, video summarization, and safety protocol recognition.  

5. Publish in top-tier venues (e.g., CVPR, ICCV, NeurIPS, ICLR, TPAMI, TMI, MedIA) and advance the state of the art in the field.

6. Contribute to grant proposals and translate research into practical solutions.   


Requirements


1. PhD in computer vision, AI, multimodal learning, or related fields.  

2. Expertise in scene graph generation (object detection, relationship reasoning, semantic modeling).  

3. Proficiency in multi-view geometry (e.g., SfM, MVS, SLAM) and the ability to use it as a tool for scene graph generation.

4. Strong publication record in top-tier conferences/journals.  

5. Proficiency with mainstream AI development frameworks (e.g., PyTorch, Hugging Face Transformers, LLaMA Factory) and strong programming skills (Python, C++).

6. Medical AI experience is a plus.  


Preferred Qualifications


1. Experience in scene graph generation/reasoning.  

2. Familiarity with vision-language models and multimodal LLMs (e.g., CLIP, BLIP, LLaVA, InternVL, QwenVL).

3. Background in operating room research (behavior analysis, workflow recognition).   

4. Experience in medical imaging, surgical video analysis, or video summarization.  

5. Knowledge of privacy-preserving techniques and medical data security.  


Application Method


Please send your resume to hr02@cair-cas.org.hk, using the email subject line [Postdoctoral Fellow (Operating Room Scene Understanding)]-[Name].


