Research Intern – Multimodal Foundation Model for Vision

Flexible (Tokyo, Europe, US) | $50.00/hour | Posted 28 days ago

Sony AI is seeking research interns to develop next-generation foundation models for vision, focusing on innovative methodologies in vision-language models, model compression, and deployment on cloud and edge devices. Interns will work with a team of scientists and engineers on challenging problems in generative AI, with opportunities for publication and impact on billions of users.

Location: Flexible (Tokyo, Europe, US)

Salary: $50.00/hour

Responsibilities

  • Conduct fundamental and innovative development in low-cost yet powerful vision-language models, unified models, automatic model compression, optimization, and deployment on cloud and edge.
  • Design or implement state-of-the-art techniques for model compression, inference speedup, hardware deployment, and tool automation.
  • Build proofs of concept (PoC) for vision+text generation tasks (VQA, captioning, understanding, etc.) and hardware.
  • Contribute to library and tool development to support the business; publish influential research in top-tier conferences and journals.

Requirements

  • Currently holds, or is in the process of obtaining, a master's or PhD degree in computer science or a related field.
  • Self-motivated with the ability to propose and implement innovative ideas.
  • Strong presentation and communication skills.
  • Publications or expertise in compact foundation model development and deployment, with influential open-source projects or papers at top conferences (e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ACL).

Category

Data

Company

Sony

Source

himalayas

Posted

28 days ago
