“Learning always wins,” said Jones. “The history of AI reflects the reality that it always works better to have a model learn something for itself rather than have a human hand-engineer it. The deep learning revolution itself was an example of this, as we went from building feature detectors by hand to letting neural networks learn their own features. This is going to be a core philosophy for us at Sakana AI, and we will draw on ideas from nature including evolution to explore this space.”
Catastrophic forgetting: a scary name for when a model, during fine-tuning, forgets some of the base knowledge it learned in pre-training. If you run into this, there are a few ways to mitigate it.
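Two common mitigations are mixing a small sample of pre-training-style data back into the fine-tuning batches, and penalizing how far weights drift from their pre-trained values. Here is a minimal sketch of the second idea (a simplified, EWC-style anchor penalty); the `model`, `loader`, and `anchor_lambda` names are illustrative, and the code assumes a Hugging Face-style model whose forward pass returns a `.loss`.

```python
import torch

def fine_tune_with_anchor(model, loader, steps=1000, lr=1e-5, anchor_lambda=0.1):
    """Fine-tune while penalizing drift from the pre-trained weights.

    Simplified EWC-style regularizer: the further a weight moves from its
    pre-trained value, the larger the penalty, which discourages the model
    from overwriting base knowledge while it learns the new task.
    """
    anchor = {n: p.detach().clone() for n, p in model.named_parameters()}
    opt = torch.optim.AdamW(model.parameters(), lr=lr)

    for _, batch in zip(range(steps), loader):
        task_loss = model(**batch).loss  # loss on the fine-tuning data
        drift = sum(((p - anchor[n]) ** 2).sum()
                    for n, p in model.named_parameters())
        (task_loss + anchor_lambda * drift).backward()
        opt.step()
        opt.zero_grad()
    return model
```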
Launch a memecoin (no roadmap, just for fun) → Raise capital → Form a tribalistic community early on → Build apps/infrastructure → Continually add utility to the memecoin without making false promises or providing roadmaps
One developer already created a Slack workspace where he and his friend hang out with a group of bots that have different personalities, interests, and skills.
In reality, Navboost has a specific module entirely focused on click signals.
The summary of that module defines it as “click and impression signals for Craps,” one of the ranking systems. As we see below, bad clicks, good clicks, last longest clicks, unsquashed clicks, and unsquashed last longest clicks are all considered as metrics. According to Google’s “Scoring local search results based on location prominence” patent, “Squashing is a function that prevents one large signal from dominating the others.” In other words, the systems are normalizing the click data to ensure there is no runaway manipulation based on the click signal.
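The patent doesn't publish the actual formula, but "squashing" in this sense just means a saturating transform, so one enormous click count can't swamp the other signals. A hypothetical sketch of the idea, with invented signal names and weights:

```python
def squash(x: float) -> float:
    """Saturating transform: roughly linear for small x, flattens for large x,
    so a single runaway signal can't dominate the combined score."""
    return x / (1.0 + x)

def click_score(good_clicks: float, last_longest_clicks: float, bad_clicks: float) -> float:
    # Hypothetical blend of squashed click signals; the weights are made up.
    return (0.6 * squash(good_clicks)
            + 0.4 * squash(last_longest_clicks)
            - 0.5 * squash(bad_clicks))

# A page with 10,000 good clicks isn't treated as 100x better than one with 100:
# squash(10_000) ≈ 0.9999 vs squash(100) ≈ 0.9901
```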
Industrial Revolution largely freed people from using brawn
AI will largely free people from using brain
Unfortunately, there are no solid predictions we can make about this stage. At the end of the day, the startup just has to be lucky enough to start close enough, and navigate optimally enough, to hit its first discovery before the company disintegrates from lack of funding or team morale. The process can be as fast as a few months or as long as a decade.
Six of the eight web companion products bill themselves as “uncensored,” which means users can have conversations or interactions with them that may be restricted on platforms like ChatGPT. Users largely access these products via mobile web, as opposed to desktop — though almost none of them offer apps. On average, 75 percent of traffic to the uncensored companion tools on our web list comes from mobile.
🍰 Only 4 out of 70+ projects I ever did made money and grew
📉 >95% of everything I ever did failed
📈 My hit rate is only about ~5%
🚀 So… ship more
— @levelsio
Vitalik said L3s are good for customization (L2s for scaling); L3s are good for specific kinds of scaling
It’s inspiring to know at any moment in time there is an infinite number of true statements for new startups to discover and further expand our collective system. Gödel’s theorem is not really about our limits: it’s about possibilities always waiting to be discovered. The process is certainly hard and alien to us.
No nation has ever become the major power without a clear lead in technology, both civilian and military. From the Roman legions, to the naval powers of Portugal, Spain and Great Britain, to Germany in World War I and the US post-World War II, great power status was achieved by those nations that were able to harness their technological advantage for holistic development of their civilian and military capabilities.
This holds at a higher level of conceptual abstraction: looking near a feature related to the concept of “inner conflict”, we find features related to relationship breakups, conflicting allegiances, logical inconsistencies, as well as the phrase “catch-22”. This shows that the internal organization of concepts in the AI model corresponds, at least somewhat, to our human notions of similarity. This might be the origin of Claude’s excellent ability to make analogies and metaphors.
Language differences mean that Chinese firms really are in the hot seat for developing domestic AI products. OpenAI’s most recent version of ChatGPT, GPT-4o, has real issues in China. MIT Technology Review reported that its Chinese token-training data is polluted by spam and porn websites.
Metaplanet becomes Japan’s top-performing stock this week, hitting a +50% daily gain limit for two consecutive days. The company plans to increase its authorized shares by 300% to acquire more BTC for its reserves.
Community is made of people; culture is made up of shared memes. Community can be transient; culture is much more persistent. A “community” can be formed with a free airdrop; culture can only be formed with a sustained commitment to creating a common story.
Every memecoin is an exquisitely precise ad, a self-measuring barometer of attention: the price jumps if people talk about the memecoin and drops if they don’t
McLuhan believed transformative new technologies, like the stirrup or printing press, extend a man’s abilities to the point where the current social structure must change to accommodate them. Just as the car created the Interstate Highway System, the suburb, and the oil industry, so the stirrup helped create a specialized weapon system (knights) that required land and pasture to support it and provide for training and material.
Wall Street is not going to stand idly by while Tether makes more money than Goldman Sachs.
Terminator: In three years, Cyberdyne will become the largest supplier of military computer systems. All stealth bombers are upgraded with Cyberdyne computers, becoming fully unmanned. Afterwards, they fly with a perfect operational record. The Skynet funding bill is passed. The system goes online on August 4th, 1997. Human decisions are removed from strategic defense. Skynet begins to learn at a geometric rate. It becomes self-aware at 2:14 AM Eastern time, August 29th. In a panic, they try to pull the plug.
“Hyperscalers”, which are all looking to create a full stack with an AI model powerhouse at the top and hardware that powers it underneath: OpenAI (models) + Microsoft (compute), Anthropic (models) + AWS (compute), Google (both), and Meta (increasingly both, via doubling down on its own data center buildout).
The Stability AI founder recently stepping down in order to start “decentralizing” his company is one of the first public hints at that. In public appearances he had made no secret of his plans to launch a token, but only after the successful completion of the company’s IPO, which rather gives away the real motives behind the anticipated move.
An additional limitation of transformer models is their inability to learn continuously. Today’s transformer models have static parameters. When a model is trained, its weights (the strength of the connections between its neurons) are set; these weights do not update based on new information that the model encounters as it is deployed in the world.
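In practice, deployment is pure inference: the forward pass runs with gradients disabled and there is no optimizer in the serving loop, so nothing the model reads at serving time can change its weights. A minimal sketch, assuming a Hugging Face-style model and tokenizer:

```python
import torch

@torch.no_grad()  # gradients off: nothing in this function can modify the weights
def serve(model, tokenizer, prompt: str) -> str:
    model.eval()  # inference mode; parameters stay exactly as they were trained
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# There is no backward pass and no optimizer step here, so new information
# in the prompts never updates the parameters.
```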
All of this equipment and these processes consume large amounts of energy. A large fab might demand 100 megawatts of power, or 10% of the capacity of a large nuclear reactor. Most of this energy is used by the process tools, the HVAC system, and other heating/cooling systems. The demands for power and water are severe enough that some fabs have been canceled or relocated when local utilities can’t guarantee supply.
—
If we considered things in “capital cost per component” terms, and considered transistors as individual components, semiconductor fabs are actually probably among the cheapest manufacturing facilities.
Build for where models will be in 1-2 years, not where they are today. Bake the challenges of inference at scale into your roadmap. And don’t just think in terms of prompting one mega model and getting an answer back. Plan for the iterative systems design, engineering, and monitoring work needed to make your AI product the proverbial “10x better” than existing alternatives.
Ethereum’s ICO returns were 1.5x higher than those available on the market. Solana’s seed round returns were 10x higher than those available on the market. OP’s seed round returns were 30x higher than those available on the market.
Across every major ETH NFT project, more than 3/4 of all NFTs haven’t traded once in 2024.
- 95% of Punks
- 93% of World of Women
- 87% of BAYC
- 87% of MFers
are just sitting in wallets through this year’s moves.
We have to start by understanding the really important parts and building that core functionality first, then building additional features around that core. When you’re building consumer products, getting serious leverage in the marketplace (distribution) is the most important first order goal, so you need to accomplish this as quickly as possible and then shift gears to build second generation features.
Every VC fund with a consumer investing team is one foot in, one foot out of consumer. Even when startups hit the desired milestones & metrics, investors are still unclear which to bet on because the past decade of consumer investing hasn’t yielded many big wins, barriers to entry are low, and AI makes the future of human-tech interaction uncertain.
—
Angel investing, especially with small checks, is only good for two things: 1) getting into contractual friendships with founders you respect 2) building a track record for being a full-time venture capitalist (raising a fund or joining one).
Mustafa Suleyman has argued that the real Turing Test that matters is whether a given AI can go off and earn $100,000 for you on the internet. I would argue the test that’s more relevant — and consequential — is whether an AI can empty your inbox.
From the dataset Google Doc memo:
“Few know how to train efficient models” meant “Few know how to craft informative datasets.”
—
All the consumer graphics cards on the Internet could not compete with a mere thousand GPUs in a supercomputer.
—
Data cleaning, data curation, and data synthesis do not have this problem: dataset creation is a series of (mostly) parallel operations. This makes dataset creation perfectly suited to distributed computation, as one finds on AI blockchains. We can build good datasets together.
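A small illustration of why this parallelizes so well: each document can be cleaned and filtered independently of every other one, so the same map can be sharded across cores, machines, or nodes. The cleaning rules below are stand-ins, not a real pipeline:

```python
import re
from multiprocessing import Pool

def clean(doc: str):
    """Clean one document independently of all others (stand-in rules)."""
    doc = re.sub(r"<[^>]+>", " ", doc)      # strip leftover HTML tags
    doc = re.sub(r"\s+", " ", doc).strip()  # normalize whitespace
    return doc if len(doc) > 200 else None  # drop near-empty documents

def build_dataset(raw_docs):
    # Embarrassingly parallel: no document depends on any other, so the
    # same map could just as well run across many machines or nodes.
    with Pool() as pool:
        cleaned = pool.map(clean, raw_docs)
    return [d for d in cleaned if d is not None]
```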
Web2 sports betting losing market share to memecoins
Fabs must limit vibrations to several orders of magnitude below the threshold of perception, while simultaneously absorbing 100 times the mechanical energy and 50 times the air flow of a conventional building.
An interesting phenomenon evident in blockchain ecosystems is that the networks with the stickiest communities are the ones where a broad base of developers and users had an opportunity to benefit financially from their participation. Think Ethereum and Solana, which have two of the strongest developer communities: their native tokens were publicly available at prices far below their current value. In contrast, ecosystems where network tokens launch at a highly efficient market price tend to struggle to retain a passionate community of developers and users, to the long-term detriment of the ecosystem.
Bitcoin surpasses 1 billion confirmed transactions, averaging over 178,000 transactions per day since its launch in 2009.
We are relatively cheaper and don’t bill by the hour. We get more done. We hire and fire firms. CEOs trust _us_.
As a result, in-house lawyers have grown at 7.5x the rate of other kinds of lawyers over the last 25 years. The role of “product counsel” boomed, just like the role of product manager in this time.
Today Google employs 828 “product counsel.” Only the biggest law firms have more lawyers than that.
The number one predictor of job retention is whether an employee has a friend at work.
“We Don’t Sell Saddles Here” essay
-The best — maybe the only? — real, direct measure of “innovation” is change in human behaviour. In fact, it is useful to take this way of thinking as definitional: innovation is the sum of change across the whole system, not a thing which causes a change in how people behave. No small innovation ever caused a large shift in how people spend their time and no large one has ever failed to do so.
-Because the best possible way to find product-market fit is to define your own market.
Transformers’ fundamental innovation, made possible by the attention mechanism, is to make language processing parallelized, meaning that all the words in a given body of text are analyzed at the same time rather than in sequence.
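A minimal numpy sketch of scaled dot-product attention makes the point: the scoring is a pair of matrix multiplications over the whole sequence, so every token is processed at once rather than one after another (the dimensions are illustrative):

```python
import numpy as np

def attention(Q, K, V):
    """Q, K, V: (seq_len, d) arrays covering the entire sequence.

    Two matrix multiplications score every token against every other token
    simultaneously; no step waits on the previous word, which is what makes
    the computation parallelizable.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over keys
    return weights @ V                                 # (seq_len, d)

# Example: 8 tokens with 16-dimensional embeddings, all attended to at once.
x = np.random.randn(8, 16)
out = attention(x, x, x)
```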
I’ve been making chatbots since the days of AI Dungeon, and have seen the cycle multiple times. A new site appears with low censorship and free content generation. It grows a user base, starts introducing more censorship, raises prices, and before long it becomes unusable and people move on to the next one. Poe has been around for longer than most, and I’m only seeing improvements on it. Plus it’s operated by Quora, which I think will give it added sustainability.
Friend.tech is Uniswap for social tokens
Steve Jobs figured out that “you have to work hard to get your thinking clean to make it simple.” – Taleb
I think these open-source LLMs will eventually beat the closed ones, since there are more people training and feeding data to the model for the shared benefit.
Especially because these open-source models can be 10 times cheaper than GPT-3, or even 20 times cheaper than GPT-4, when running on Hugging Face; run locally they’re even free, since you just pay for electricity and the GPU.
In a 1985 interview Wozniak posited: “The home computer may be going the way of video games, which are a dying fad” – alluding to the 1983 crash in the video game market. Wozniak continued:
“for most personal tasks, such as balancing a check book, consulting airline schedules, writing a modest number of letters, paper works just as well as a computer, and costs less.”
—
He seemed well aware of the heretical nature of his statements, telling a reporter: “Nobody at Apple is going to like hearing this, but as a general device for everyone, computers have been oversold” and that “Steve Jobs is going to kill me when he hears that.”
Bonus (Reality Check): What Are The Odds You Get Acquired Within 5 Years for a Good Price? Around 1%-1.5%, according to Jason Lemkin.
Data on 3,067 startups founded in 2018. The takeaway: It’s the second 5 years where the real value starts to compound. Startups are a long game
The subset of parameters is chosen according to which parameters have the largest (approximate) Fisher information, which captures how much changing a given parameter will affect the model’s output. We demonstrate that our approach makes it possible to update a small fraction (as few as 0.5%) of the model’s parameters while still attaining similar performance to training all parameters.
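A rough sketch of that selection step, assuming a PyTorch model whose forward pass returns a `.loss`: approximate the diagonal Fisher information with averaged squared gradients, keep the top 0.5% of parameters, and zero out the gradients of everything else before each optimizer step. The function and variable names here are mine, not the paper's.

```python
import torch

def fisher_mask(model, loader, keep_frac=0.005, n_batches=32):
    """Approximate diagonal Fisher info as the mean squared gradient,
    then keep only the top `keep_frac` fraction of parameters."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for i, batch in enumerate(loader):
        if i >= n_batches:
            break
        model.zero_grad()
        model(**batch).loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2 / n_batches

    flat = torch.cat([f.flatten() for f in fisher.values()])
    k = max(1, int(keep_frac * flat.numel()))
    threshold = flat.topk(k).values.min()
    return {n: (f >= threshold).float() for n, f in fisher.items()}

# During fine-tuning, multiply each parameter's gradient by its mask entry
# before optimizer.step(), so only the selected ~0.5% of weights move.
```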