Go back five years and ask anyone on this site: which company do you think will be the most open about AI in the future, OpenAI, Meta, or Google? I bet 10/10 people would pick OpenAI. Yet today Meta and Google, both trillion-dollar companies, are releasing very powerful open models that can be used commercially.
Google's T5 release included full model weights along with a detailed description of the dataset, the training process, and the ablations that led to its architecture. T5 was state-of-the-art on many benchmarks when it was released, but it was of course quickly eclipsed by GPT-3.
It was common practice for Google (BERT, T5), Meta (BART), OpenAI (GPT-1, GPT-2), and others to release full training details and model weights. Following GPT-3, it became much more common for labs to withhold full details and model weights.
Not at all. When you're the underdog, it makes perfect sense to be open, because you can profit from the work of the community and gain market share. Only after establishing some kind of dominance or monopoly does it make sense (profit-wise) to switch to closed technology.
OpenAI was open, but is now the leader and closed up. Meta and Google need to play catch up, so they are open.
> Not at all. When you're the underdog, it makes perfect sense to be open, because you can profit from the work of the community and gain market share. Only after establishing some kind of dominance or monopoly does it make sense (profit-wise) to switch to closed technology.
That is purely the language of commerce. OpenAI was supposed to be a public benefit organisation, but it acts like a garden-variety evil corp.
Even garden-variety evil corps spend decades benefitting society with good products and services before they become big and greedy, but OpenAI skipped all that and just cut to the chase. It saw an opening with the insane hype around ChatGPT and grabbed all it could as fast as it could.
I have a special contempt for OpenAI on that basis.
This. Mistral AI is also an underdog and released Mistral 7B and Mixtral 8x7B, but as soon as they got traction, they closed their models (e.g., Mistral Medium).
I think the current understanding is that models under roughly 50-100B parameters will be commodities and provide no moat. Competition will be in Gemini Ultra/GPT-4+ class models.
So open sourcing the smaller models brings PR and the possibility of biasing the OSS ecosystem towards your own models.
LLaMA 3 with >=70B params will be launching this year, so I don't think this will hold for long. And Mixtral 8x7B is nominally a 56B-parameter model, albeit a sparse one. For now I agree: for many companies it doesn't make sense to open source something you intend to sell for commercial use, so the biggest models will likely be withheld. The more important thing, though, is that there is some open source model, whether from Meta or someone else, that can rival the best closed models. And it's not like the param count can literally go to infinity; there's an upper bound that today's hardware can achieve.
Just an FYI, Mixtral is a Sparse Mixture of Experts that has 47B total parameters for memory costs, but only 13B active parameters per token. For those interested in reading more about how it works: https://arxiv.org/pdf/2401.04088.pdf
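To make the total-vs-active distinction concrete, here's a minimal top-2 routing sketch (the dimensions, expert count, and plain linear layers are illustrative, not Mixtral's actual configuration):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToySparseMoE(nn.Module):
        """All experts are held in memory (total params), but each
        token only runs through top_k of them (active params)."""
        def __init__(self, dim=512, n_experts=8, top_k=2):
            super().__init__()
            self.experts = nn.ModuleList(
                nn.Linear(dim, dim) for _ in range(n_experts))
            self.router = nn.Linear(dim, n_experts)  # gating network
            self.top_k = top_k

        def forward(self, x):                  # x: (tokens, dim)
            scores = self.router(x)            # (tokens, n_experts)
            w, idx = scores.topk(self.top_k, dim=-1)
            w = F.softmax(w, dim=-1)           # renormalise over the top-k
            out = torch.zeros_like(x)
            for k in range(self.top_k):
                for e, expert in enumerate(self.experts):
                    mask = idx[:, k] == e      # tokens routed to expert e
                    if mask.any():
                        out[mask] += w[mask, k:k+1] * expert(x[mask])
            return out

With n_experts=8 and top_k=2, only a quarter of the expert weights touch any given token, which is roughly the 47B-total / 13B-active split Mixtral reports.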
For those interested in some of the recent MoE work going on: some groups have been doing their own MoE adaptations, like Sparsetral. This is pretty exciting, as it's basically an MoE LoRA implementation that runs a 16x7B at 9.4B total parameters (the original paper introduced a model, Camelidae-8x34B, that ran at 38B total parameters, 35B activated). Best to start here for discussion and links: https://www.reddit.com/r/LocalLLaMA/comments/1ajwijf/model_r...
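The trick, as I understand it (rough sketch, not Sparsetral's actual code), is that each "expert" is just a small low-rank adapter over a shared frozen weight, so 16 experts add very little to the total parameter count:

    import torch.nn as nn

    class LoRAExpert(nn.Module):
        """One 'expert' = a rank-r update applied on top of a shared,
        frozen base weight; routing works as in the sketch above."""
        def __init__(self, dim=4096, rank=16):
            super().__init__()
            self.down = nn.Linear(dim, rank, bias=False)  # dim*rank params
            self.up = nn.Linear(rank, dim, bias=False)    # rank*dim params

        def forward(self, x):
            return self.up(self.down(x))

    # A full FFN expert at dim=4096 costs ~dim*dim = 16.8M params per
    # projection; a rank-16 adapter costs 2*dim*rank = 131K. That's why
    # 16 adapter-experts barely move the total parameter count.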
This article lists quite an impressive set of open source tools that Google has released over the years. This is no surprise coming from them. Google has released some large pieces of source in other domains as well; Chromium comes to mind, which probably impacts most Internet users directly.
The question is not about Google but about OpenAI.
Google has also released Guice/Dagger for Java dependency injection. Angular never really took off, but Guice/Dagger are widely used. I am also pretty impressed with Flutter as an alternative to React Native.
I have a different take: Google releases a lot, but it is also a massive company, and tools like Chromium serve to increase its stock price so it can hit its quarterly estimates.
It was not at all done for the good of the web; it was a mere logical calculation. It was cheaper to develop Chromium than to pay 4B USD in search royalties tied to Microsoft's Internet Explorer, and it gave Google more control and long-term safety.
I don't know why people like yourself respond with such derisive commentary instead of simply asking the constructive question.
Initially? It fueled dethroning MSFT and helped Chrome gain market share. On a go-forward basis, it allows Google to project massive weight in standards. Beyond that, Chrome is a significant knob for ad revenue that they utilize to help meet expectations. That knob only exists because of its market share.
Not surprising; just like when MS went to shit, they then started to embrace 'open source'. Seems like a PR stunt. And when it comes to LLMs there is a millions-of-dollars barrier to entry to train a model, so it is fine to open up their embeddings etc.
Today big corp A will open up a little to court developers, and tomorrow, when it gains dominance, it will close up, and corp B will open up a little.
OpenAI is heavily influenced by big-R Rationalists, who fear what happens when a misaligned AI is given the power to do bad things.
When they first started talking about this, lots of people waved it away with "let's just keep the AI in a box", and even last year it was "what's so hard about an off switch?".
The problem with any model you can just download and run is that some complete idiot will do exactly that and give the AI agency it shouldn't have. Fortunately, for now the models are more of a threat to their users than to anyone else: lawyers who use them to do lawyering without checking the results and lose their law licences, etc.
But that doesn't mean open models are not a threat to people besides their users, as the artists complaining about losing work to Stable Diffusion, the law enforcement people concerned about illegal porn, the election interference specialists worried about propaganda, anyone trying to use a search engine, and the research lab that found a huge number of novel nerve agent candidates whose precursors aren't all listed as dual use will all tell you, for different reasons.
> Fortunately, for now the models are more of a threat to their users than to anyone else
Models have access to users, users have access to dangerous stuff. Seems like we are already vulnerable.
The AI splits a task into two parts and gets two people to each execute one part without knowing its effect. This was a scenario in one of Asimov's robot novels, though with the roles reversed.
AI models exposed to the public at large are a huge security hole. We've got to live with the consequences; there's no turning back now.
My impression is that OpenAI was founded by true believers, with the best intentions; whose hopes were ultimately sidelined in the inexorable crush of business and finance.
You can run Gemma and hundreds of other models (many fine-tuned) in llama.cpp. It's easy to swap to a different model.
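For example, with the llama-cpp-python bindings, swapping models is just pointing at a different GGUF file (the file names below are placeholders for whatever quantized weights you've downloaded):

    from llama_cpp import Llama

    # Point at whichever GGUF you have locally; swapping models
    # is just changing this path.
    llm = Llama(model_path="./gemma-7b-it.Q4_K_M.gguf")
    # llm = Llama(model_path="./mistral-7b-instruct.Q4_K_M.gguf")

    out = llm("Q: Name two open-weight LLMs. A:", max_tokens=64)
    print(out["choices"][0]["text"])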
It's important that there are companies publishing models that run locally. If some stop and others are born, that's OK. The worst thing that could happen is having AI only in the cloud.
> And when it comes to LLMs there is a millions-of-dollars barrier to entry to train a model, so it is fine to open up their embeddings etc.
That barrier is the first basic moat: the hundreds of millions of dollars needed to train a better model, eliminating tons of companies and reducing the field to a handful.
The second moat is ownership of the tons of data to train the models on.
The third is the hardware and data center setup needed to create the model in a reasonable amount of time, faster than others.
Put all three together and you have Meta, Google, Apple, and Microsoft.
The last is the silicon itself: Nvidia, which has >80% of the entire GPU market and is the #1 AI shovel maker for both inference and training.
Eh, I don't really blame anyone for being cynical, but open weight AI model releases seem like a pretty clear mutual benefit for Google. PR aside, they can also push people to try these models on TPUs and the like. If anything, this seems like one of those cases where people win because of competition. OpenAI going closed may have felt like the most obvious betrayal ever, but OTOH anyone whose best interest is to eat their lunch has an incentive to push actually-open AI, and that's a lot of parties.
Seems like anyone who is releasing open weight models today could close up any day, but at least while competition is hot among wealthy companies, we're going to have a lot of nice things.
Since the release of GPT-2 (whose weights were initially "too dangerous" to release), I think most people in the industry have assumed that OpenAI does not see open sourcing its models as a strategic advantage.
I would have picked Google five years ago, since nobody was releasing commercially viable LLMs at the time, and Google was the center of all the research that I knew of.
> which company do you think will be the most open about AI in the future, OpenAI, Meta, or Google?
The funny part is that the real answer is: Some random French company is running circles around them all.
I mean, who the hell just drops a torrent magnet link onto Twitter for the best state-of-the-art LLM base model in its size class, with a completely open license? No corporate grandstanding, no benchmark overpromises, no theatrics. That was unfathomably based of Mistral.
Ironic.