We just wrapped up the December ‘24 Meetup, and if you missed it or want to revisit it, here’s a recap! In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event.
How We Built CoTracker3: Simpler and Better Point Tracking by Pseudo-Labeling Real Videos
CoTracker3 is a state-of-the-art point tracking model that introduces significant improvements in tracking points through video sequences. Its key innovations include the following (see the inference sketch after this list):
- Uses semi-supervised training with real videos, reducing reliance on synthetic data
- Generates pseudo-labels using existing tracking models as teachers
- Features a simplified architecture compared to previous trackers
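For context, here is a minimal inference sketch. It assumes the `cotracker3_offline` entry point published via torch.hub in the facebookresearch/co-tracker repository and a local `video.mp4`; treat the exact entry-point name and tensor shapes as assumptions rather than a definitive recipe.

```python
# Hedged sketch: load CoTracker3 via torch.hub and track a grid of points.
# The "cotracker3_offline" entry point name is an assumption based on the
# facebookresearch/co-tracker repo; check its README for the exact name.
import torch
from torchvision.io import read_video

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the pretrained offline (whole-clip) model
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker3_offline").to(device)

# Read a video into a float tensor of shape (1, T, C, H, W)
frames, _, _ = read_video("video.mp4", pts_unit="sec", output_format="TCHW")
video = frames[None].float().to(device)

# Track a regular 10x10 grid of points across the clip
pred_tracks, pred_visibility = cotracker(video, grid_size=10)
print(pred_tracks.shape)      # (1, T, N, 2): per-frame pixel coordinates
print(pred_visibility.shape)  # (1, T, N): per-point visibility flags
```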
Speaker: A PhD student at Meta AI and Oxford, working on dynamic reconstruction and motion estimation (CoTracker) with Andrea Vedaldi and Christian Rupprecht. He previously completed a master’s at École Polytechnique (Paris) and an undergraduate degree in cold Siberia (Novosibirsk), and was an early employee at two startups that were acquired by Snapchat and Farfetch.
Q&A
- Is processing a whole video in one go not computationally expensive?
- Can you explain more about the 4D correlation?
- Could leveraging pre-trained world models, or models explicitly trained to be sensitive to the laws of physics and how objects interact in 3D, be useful for this kind of temporal tracking? Would it help in out-of-distribution (OOD) cases?
- What are the evaluation metrics that are mainly tracked?
- How can CoTracker’s joint tracking technology be leveraged to enhance identity verification and access control in cybersecurity frameworks, and what are the potential risks associated with spoofing or compromising such systems?
In this presentation, Harpreet Sahota explores CoTracker3, a state-of-the-art point tracking model that effectively leverages real-world videos during training. He dives into the practical aspects of running inference with CoTracker3 and parsing its output into FiftyOne, a powerful open-source tool for dataset curation, analysis, and visualization. Through a hands-on demonstration, Harpreet shows how to prepare a video for inference, run the model, examine its output, and parse the model’s output into FiftyOne’s keypoint format for seamless integration and visualization within the FiftyOne app.
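As a rough illustration of the parsing step (not the presenter’s exact code), the sketch below converts CoTracker3-style track tensors into FiftyOne’s per-frame keypoint format; the `tracks_to_fiftyone` helper, the `tracks` field name, and the assumed tensor shapes are illustrative assumptions.

```python
# Hedged sketch: parse point tracks into FiftyOne per-frame keypoints.
# Assumes pred_tracks has shape (1, T, N, 2) in pixel coordinates and
# pred_visibility has shape (1, T, N); names and shapes are assumptions.
import fiftyone as fo

def tracks_to_fiftyone(video_path, pred_tracks, pred_visibility, width, height):
    sample = fo.Sample(filepath=video_path)
    tracks = pred_tracks[0]          # (T, N, 2)
    visibility = pred_visibility[0]  # (T, N)
    for t in range(tracks.shape[0]):
        keypoints = []
        for n in range(tracks.shape[1]):
            if not bool(visibility[t, n]):
                continue  # skip points the model marks as occluded
            x, y = tracks[t, n].tolist()
            # FiftyOne stores keypoints in relative [0, 1] coordinates
            keypoints.append(
                fo.Keypoint(label=f"point_{n}", points=[(x / width, y / height)])
            )
        # FiftyOne video frames are 1-indexed
        sample.frames[t + 1]["tracks"] = fo.Keypoints(keypoints=keypoints)
    return sample

# dataset = fo.Dataset("cotracker3-demo")
# dataset.add_sample(tracks_to_fiftyone("video.mp4", pred_tracks, pred_visibility, 1280, 720))
# session = fo.launch_app(dataset)
```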
Speaker: Harpreet Sahota is a hacker-in-residence and machine learning engineer with a passion for deep learning and generative AI. He has a deep interest in RAG, agents, and multimodal AI.
Q&A
- For the CoTracker models, are there model compression or quantization techniques you tried or can recommend?
In the fast-paced retail environment, automation at checkout is increasingly essential to enhance operational efficiency and improve the customer experience. This talk will demonstrate a streamlined approach to retail product detection using the Retail Product Checkout (RPC) dataset, which includes 200 SKUs across 17 meta-categories such as puffed food, dried food, and drinks. By leveraging YOLOv8, renowned for its speed and accuracy in real-time object detection, and FiftyOne, an open-source toolset for computer vision, we can simplify data loading, training, evaluation, and visualization for effective product detection and classification. Attendees will gain insights into how these tools can be applied to optimize checkout automation.
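To make the workflow concrete, here is a hedged sketch of what fine-tuning and evaluation might look like, assuming the RPC images have already been exported to YOLO format; the `rpc.yaml` config, the `rpc-val` dataset name, and the field names below are illustrative assumptions, not the speaker’s exact code.

```python
# Hedged sketch: fine-tune YOLOv8 on the RPC dataset and evaluate in FiftyOne.
# "rpc.yaml", "rpc-val", and the field names are illustrative assumptions.
import fiftyone as fo
from ultralytics import YOLO

# Train: rpc.yaml points at YOLO-format train/val splits with the 200 SKU classes
model = YOLO("yolov8n.pt")
model.train(data="rpc.yaml", epochs=50, imgsz=640)

# Load the validation split as a FiftyOne dataset (assumed to exist already,
# with ground truth detections stored in a "ground_truth" field)
dataset = fo.load_dataset("rpc-val")

# FiftyOne can apply Ultralytics models directly and store the predictions
dataset.apply_model(model, label_field="predictions")

# Evaluate predictions against ground truth (COCO-style mAP)
results = dataset.evaluate_detections(
    "predictions", gt_field="ground_truth", eval_key="eval", compute_mAP=True
)
print(results.mAP())
results.print_report()

# Visualize predictions, ground truth, and mistakes in the FiftyOne App
session = fo.launch_app(dataset)
```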
Speaker: A Data Engineer Intern at UNAR Labs, a startup focused on making information accessible for the blind. She holds a Master’s degree in Machine Learning and Computer Vision from Northeastern University and is passionate about applying AI and computer vision to real-world problems, with a focus on automation and accessibility.
Q&A
- In your retail use case, did you find certain types of objects get confused with each other more often than other object pairs?
- Is the code you used for fine-tuning published? Or can you point to a resource you’d recommend?
The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies.
Join one of the 12 Meetup locations closest to your timezone.
Up next on Jan 29, 2025 at 9:00 AM PT / 12:00 PM ET, we have three great speakers lined up!
- Is AI Creating a Whole New Earth-Aware Geospatial Stack? Promises and Challenges (Clay – AI for Earth)
- Evaluating the Satlas and Clay Remote Sensing Foundational Models (Voxel51)
- Earth Monitoring for Everyone with Earth Index (The Earth Genome)
Register for the Zoom session. A complete schedule of upcoming Meetups is also available online.
Get Involved!
There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:
- You’d like to speak at an upcoming Meetup
- You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
- You’d like to co-organize a Meetup
- You’d like to co-sponsor a Meetup
Reach out to Meetup co-organizer Jimmy Guerrero on Meetup.com or ping me to discuss how to get you plugged in.
—
These Meetups are sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It’s easy to get started, in just a few minutes.
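For a sense of how quick that is, here is a minimal sketch using FiftyOne’s built-in quickstart dataset from the dataset zoo:

```python
# Minimal FiftyOne quickstart: load a small sample dataset and open the App
import fiftyone as fo
import fiftyone.zoo as foz

# Download and load FiftyOne's built-in "quickstart" dataset
dataset = foz.load_zoo_dataset("quickstart")

# Launch the FiftyOne App to browse samples, labels, and metadata
session = fo.launch_app(dataset)
session.wait()
```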