Hi all! I’ve recently started a newsletter around all things data science, ML, and AI, primarily to keep track of interesting things in the space and what I’ve been up to. This is an experiment, so please do let me know what you’d like to see here. There’s a lot to share this week, so let’s jump right in.
BUILDING YOUR FIRST MULTIMODAL GENAI APP 🚀
I’m excited to share something special with you—I recently taught a tutorial at ODSC APAC that dives into building your very first multimodal generative AI app. 🚀
I’ve recorded a demo to walk you through every step, and it’s all available in a GitHub repository. Whether you’re just getting started with AI or looking to expand your skill set, this workshop is packed with hands-on content to help you build an app that transforms simple prompts into audio, video, and images.
Here’s what you can expect:
• Hands-on Tutorial: Learn how to set up your environment using GitHub Codespaces and get your project running in no time.
• Multimodal Magic: Explore how to convert a text prompt into a creative mix of media—including poetry, audio, and even video.
• Streamlit Integration: Discover how to build a dynamic web app that brings your AI creations to life.
• API Flexibility: Work with top models from OpenAI, Replicate, Groq, and Hugging Face, with the option to customize and expand the app’s capabilities.
If you find this useful, don’t forget to give the repository a star ⭐ and fork it if you’d like to tinker with the code. I’d also love to hear your thoughts—feel free to reply to this email or raise an issue on the issue tracker. And shout out to Eddie Mattia for helping out!
Ready to dive in? Check out the full demo and get started here. You can also watch my demo video below:
Let me know what you think as it’s still a WIP, and happy building!
The AI Revolution will NOT be Monopolized
I recently did a podcast with Ines Montani and Matthew Honnibal, the minds behind Explosion and the creators of spaCy, a leading open-source library for advanced Natural Language Processing (NLP) and AI in Python. I had the great fortune of riffing with them for two hours in a wide-ranging conversation, covering:
• The evolution of applied NLP and its role in industry,
• The balance between large language models and smaller, specialized models,
• Human-in-the-loop distillation for creating faster, more data-private AI systems,
• The challenges and opportunities in NLP, including modularity, transparency, and privacy,
• The future of AI and software development, and
• The potential impact of AI regulation on innovation and competition.
You can listen to the episode here or on your app of choice. You can also watch the livestream here:
Ines and Matt shared so many interesting things that it was tough to choose just one to bring you here, but here’s a clip in which we explore how many labeled examples a BERT model needs to outperform GPT-4!
NASA, AI, AND RATS IN SPACE
I’ll be recording a Vanishing Gradients livestream with my friend Chelle Gentemann, Open Science Program Scientist for NASA’s Office of the Chief Science Data Officer. We’ll discuss:
• Measuring Open Science Impact: How NASA is developing new metrics to evaluate the effectiveness of open science practices beyond traditional publication-based measures.
• The Nature of Scientific Discovery: Insights into how scientific breakthroughs occur and the importance of collaborative efforts in advancing research.
• AI in NASA Science: NASA’s exploration of AI applications across various divisions, from rats in space to the origin of the universe.
• Challenges in Implementing Open Science: The complexities of promoting open science practices within government agencies.
• The Future of Open Science: How open science is changing the research landscape and fostering interdisciplinary collaboration.
You can sign up for free here!
From Theory to Practice: Machine Learning Engineering with Santiago Valdarrama
💫 This week, I have the privilege of hosting a fireside chat with someone I’ve long admired in the worlds of ML and AI education, Santiago!
As many of you know, I’m deeply passionate about teaching and sharing knowledge in data science, ML, and AI. Santiago has been a significant influence on my own journey, and I’m thrilled to have the opportunity to learn from him in this session. You can register for the Outerbounds event here!
Where are you in the GenAI Hype Cycle?
Last week I teamed up with Delphina for our inaugural Data Dialog, a forum for data leaders. In the hour we had, I learnt a huge amount about the trade-offs people are currently making on the ground between traditional ML and Generative AI. Many thanks to Brad Klingenberg, founder of Naro and former Chief Algorithms Officer at Stitch Fix, for joining as our featured speaker!
One wild ride was a conversation about where we all are in the Generative AI Hype Cycle (the above figure was inspired by a slide from Alan Nichol, CTO of Rasa). Does this resonate with you? Where are you currently in the Hype Cycle? Hit reply and LMK!
Our next Data Dialog:
• 🗓 Thursday, September 5, 2024, 4:00 PM PT on Zoom
• 🎙 Featured Speaker: Lilei Xu, former Director of Data Science at Faire and Airbnb, now an executive coach.
• 📌 Topic: “Mastering Data Science Leadership in the AI Era” — insights into successfully navigating leadership challenges in the fast-paced AI industry.
The Dialogs are a space for authentic conversations. No recordings, no sales pitches: just the Chatham House Rule and real talk about timely topics. Check out the full details and apply to join us.
I’ll be announcing more livestreams, events, and podcasts soon, so subscribe to the Vanishing Gradients lu.ma calendar to stay up to date. Also subscribe to our YouTube channel, where we livestream, if that’s your thing!
That’s it for now. Please let me know what you’d like to hear more of, what you’d like to hear less of, and any other ways I can make this newsletter more relevant for you.
Hugo