Building an AI Startup and Hello to GPT-4o
OpenAI's new model is breaking new ground and people are excited.
Morning y’all!
The news and hype cycle is abuzz with the release of OpenAI’s GPT-4o and that’s really all that folks are talking about right now.
So, that’s what we’ll cover plus a fun video from the a16z guys about what it takes to build an AI startup these days. That is all.
※\(^o^)/※
— Summer
OpenAI has released its latest multimodal model, which integrates text, vision, and audio, and the results are more than impressive! Here are some highlights!
Better than GPT-4 on nearly every type of content.
It’s 50% cheaper to use, with 5x higher rate limits and 2x the generation speed.
New voice model capable of real-time responses, detection, translation, and more. It can even show emotion.
3D generation is now possible, along with font creation.
There’s a new desktop app for macOS with a refreshed UI.
All that said, it doesn’t seem like as big a leap over GPT-4 as they’ve claimed, at least as far as “intelligence” goes. But the way they’ve reworked the pipeline is savvy. For instance, the previous voice option passed audio through a speech-to-text model, then to an LLM, and then to a text-to-speech model for output. Now that’s all done in a single model. Improvement!
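If you like seeing that kind of thing in code, here is a toy sketch of the difference. To be clear, none of this is OpenAI's actual code or API: the `Audio` class and every function below are made-up stand-ins, and the "models" just shuffle strings around. The only point is to show why chaining three models can lose things like tone of voice, while a single model that takes audio in and puts audio out at least has the chance to keep them.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    """Toy stand-in for an audio clip: the spoken words plus a tone label."""
    words: str
    tone: str = "neutral"


# --- Old-style cascade: three separate (hypothetical) models ---

def speech_to_text(clip: Audio) -> str:
    # Transcription keeps the words but drops the tone.
    return clip.words


def text_llm(prompt: str) -> str:
    # Placeholder "LLM" that only ever sees plain text.
    return f"You said: {prompt}"


def text_to_speech(text: str) -> Audio:
    # The voice model never saw the original tone, so it falls back to neutral.
    return Audio(words=text)


def cascaded_voice(clip: Audio) -> Audio:
    return text_to_speech(text_llm(speech_to_text(clip)))


# --- Single-model style: audio in, audio out (also hypothetical) ---

def single_model_voice(clip: Audio) -> Audio:
    # One model sees the whole clip, so the tone is available when replying.
    return Audio(words=f"You said: {clip.words}", tone=clip.tone)


clip = Audio("good morning", tone="cheerful")
print(cascaded_voice(clip).tone)      # neutral
print(single_model_voice(clip).tone)  # cheerful
```

The cascade also makes three hops instead of one, which is a plausible reason the old voice mode felt slower, though the sketch doesn't try to model latency.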
The price drop is particularly notable because OpenAI is promising to make this model available to free ChatGPT users as well, the first time they’ve directly made their “best” model available to non-paying customers.
Enjoy checking it out and seeing what you see!
Finally, two fun videos from the a16z folks:
Overview:
In the "Build Your Startup With AI" YouTube video, Ben and Marc discuss the challenges faced by small AI startups in competing with tech giants like Google and Microsoft, the potential improvement of current AI models, and the capabilities and limitations of Large Language Models (LLMs).
They also touch on the importance of using high-quality training data, the challenges of monetizing AI technology, and the potential of AI to revolutionize industries. They argue that even though the cost of building AI applications may appear to be decreasing, the demand for high-quality software capabilities may actually drive up the cost.
They also discuss the misconception that data is a valuable asset that can be sold without processing it and the implications of AI on industries, specifically using the example of insurance. They also question whether the concept of insurance still works if perfectly predictive information is available and discuss the differences and similarities between AI and the internet.
Then Marc and Ben compare AI to the early days of the computer industry and discuss its potential expansion and future. They argue that AI functions like a probabilistic computer, capable of understanding language and images, and that its accessibility will follow the trajectory of computers from large mainframes to smaller, more accessible models.
They also explore the potential for a boom-bust cycle in AI funding and investment, drawing parallels to the internet era. They express concerns over the potential for monopolization of AI technology and the societal implications of speculative booms and busts, but ultimately believe that the transfer of funds from those with excess to those driving innovation is a positive thing.
And there we go. Have a great day!
※\(^o^)/※
— Summer
Handling different modalities and the desktop interaction seems pretty useful. To me memory and context are really important for this to be effective.