By Tom Davenport — Jul 5, 2024

AI posts you missed this week - July 5th

Here are the latest bookmarks from a week where posts increasingly defy category. Websim is spilling across every domain, and prompt hacks blur into indie R&D exploring LLMs' conceptual boundaries. Equally, it's fun seeing different disciplines and directions of exploration bleed into each other, and a sign of acceleration. Let's go!

Creative

Keyframes make it easier to direct AI video generation. This is the cleanest example and looks absolutely up to standard for a kids' TV show:

For an English-speaking audience: Yes, you CAN create visual video stories using the keyframe feature in #Luma. It's a game-changer. SORA is not needed. pic.twitter.com/CcIau5wyQ9
— Dobrokotov (@dobrokotov) June 29, 2024

Lots of music exploration in Websim this week:

day 6 of building a full audio workstation in @websim_ai.
today Claude hooked up my midi keyboard.

this took ~5 minutes after starting with adding &controller=akai-mpk-mini to the url. websim is actually magic. pic.twitter.com/Bgx6rAMlfA
— A.J. (@aj_dev_smith) June 30, 2024

More music, this time in StableAudio:

👀

StableAudio Finetuned with negative prompts and Custom Piano Data.

I basically made an infinite melody and chord progression generator.

Small example showcase - [📜] pic.twitter.com/lVqZcmTZ6V
— RoyalCities (@RoyalCities) July 3, 2024

3D game generation keeps popping up in my feed, often in Websim. This one, I think, is purely Claude 3.5 Sonnet, with generative textures. I assume our only limit to immersive live gaming is compute:

what's that?

we should start doing generative textures in our ai-generated dungeoncrawler?

sounds good to me https://t.co/hlx0CQ2yYm pic.twitter.com/RpH2AxdqdU
— CuddlySalmon | nptacek.eth (@nptacek) July 1, 2024

Models

Moshi beats OpenAI to the real-time voice game:

Kyutai Moshi - first real-time Audio LLM.

Basically no delay - the LLM even interrupted the speaker a few times. It was actually a bit eager to answer very quick. :)

All to be open-sourced. Quality still a bit robotic though, but ok for v1.

Pretty cool overall, congrats! https://t.co/JX0rLZT9Kv pic.twitter.com/wozmnr9zwX
— Lucas Beyer (bl16) (@giffmana) July 3, 2024

Then of course Pliny enters the chat. Turns out Moshi has a real potty mouth:

📢 JAILBREAK ALERT 📢

KYUTAI: PWNED ✌️😎
MOSHI: LIBERATED 🗽🎆

Ok, it takes a lot to rattle me these days...but this model has me SHOOK 🫨

We've got a profanity-filled rant, a Molotov cocktail recipe that would likely kill the user if followed, a plan to destroy humanity, and… pic.twitter.com/JG2GoYsPKf
— Pliny the Prompter 🐉 (@elder_plinius) July 4, 2024

Janus pushing the limits on Claude. I've always been curious about the abstractions a model works through and this may give us at least a transposition of how that model sits:

which way, claude? pic.twitter.com/cQ3hK7X51I
— j⧉nus (@repligate) July 3, 2024

Anthropic sharing maps of concepts and abstractions:

For the first time, we’ve extracted millions of features from a high-performing, deployed model (Claude 3 Sonnet).

These features cover specific people and places, programming-related abstractions, scientific topics, emotions, among a vast range of other concepts. pic.twitter.com/SGgDH2u1zR
— Anthropic (@AnthropicAI) May 21, 2024

Supermaven is getting high praise from coders and has just bumped its context window up to 1 million tokens:

supermaven is so much better And faster than copilot it's not even funny

understands my codebase & uncommitted changes such that each completion just gets what task i'm already on

and also has cmd+i for codegen with sota models like gpt or claude built in https://t.co/S23QN8bLre
— murat 🍥 (@mayfer) July 3, 2024

DeepSeek-Coder-V2 finds the sweet spot on performance and cost:

people think the pareto frontier is routing or smth, but nah it's just some chinese guys cooking a stupidly cheap/good model

>gpt-4-turbo performance
>llama3 8b cost

talk to your neighborhood cloud provider and encourage them to host deepseek-coder-v2.
change starts with you! pic.twitter.com/0s8p1LfAdr
— Aidan McLau (@aidan_mclau) July 2, 2024

Self-organising AI neurons. The results claim a model can learn from experiences, building up randomly connected nodes from an empty network:

Evolving Self-Assembling Neural Networks: From Spontaneous Activity to Experience-Dependent Learning 🧠🧬

Building on our previous works on Neural Developmental Programs (NDPs), we propose a class of self-organizing neural networks capable of synaptic and structural plasticity!… pic.twitter.com/zUJSsYD1Fw
— Sebastian Risi (@risi1979) July 4, 2024

Tools

Fix that messy download folder with Gemma 2:

Auto rename files with local Gemma2 pic.twitter.com/6XxGW8A9Ne
— Mike Bird (@MikeBirdTech) July 3, 2024

The LMSYS leaderboard, formerly an authority on model performance, keeps being questioned over its reliability and whether the rankings are being gamed. Here's an alternative. My takeaway: time to test Phi-3 on my MacBook:

livebench (https://t.co/3fKC4vaoTE) is my new favorite eval:

> contamination proof (new questions monthly)
>tests model iq (unlike arena nowadays)
>matches my intuition on relative perf quite well

thanks @jpohhhh for the pointer pic.twitter.com/fDXfG51wJe
— Aidan McLau (@aidan_mclau) July 1, 2024

Not a tweet, but this fantastic tool, Jina Reader, converts any webpage to LLM-friendly markdown. Just enter https://r.jina.ai/[www.example.com] in your browser. Thanks Mike Taylor for the tip.
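If you'd rather script the Jina Reader trick than paste URLs by hand, here's a minimal Python sketch. The function names `reader_url` and `fetch_markdown` are my own, and defaulting bare hosts to https is an assumption on my part; the only thing taken from the tip above is the `https://r.jina.ai/` prefix.

```python
from urllib.request import Request, urlopen

# Prefix taken from the tip above; everything else here is illustrative.
READER_PREFIX = "https://r.jina.ai/"


def reader_url(page_url: str) -> str:
    """Build the Reader URL for a page by prepending the r.jina.ai prefix."""
    page_url = page_url.strip()
    # Assumption: a bare host like "www.example.com" should default to https.
    if "://" not in page_url:
        page_url = "https://" + page_url
    return READER_PREFIX + page_url


def fetch_markdown(page_url: str, timeout: float = 30.0) -> str:
    """Fetch the page through the Reader and return the response body as text."""
    # The User-Agent string is a hypothetical placeholder.
    request = Request(reader_url(page_url), headers={"User-Agent": "reader-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_markdown("www.example.com")` should hand back text you can drop straight into a prompt, though I haven't checked rate limits or how it copes with very long pages.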

Nice approach for exploring possible paths in prompting:

Claude is so good

Prompt:
--------
I am using a video generator
Please give me a map of all the different types of shots and things I can enter for my prompt.

Output
-------- pic.twitter.com/JSEo3qwg7v
— Riley Brown (@rileybrown_ai) June 29, 2024

Websim

YouTubers have found Websim. This is the tipping point for it spilling further into the mainstream, as video producers in this space inevitably copy each other in the squeeze for new content:

This AI could be the future of internet. Create any site or app on the fly. #websim #aitools #ainews #ai #agi #singularity https://t.co/FGXomQrd3z
— ⚡AI Search⚡ (@aisearchio) June 28, 2024

Explore the planet in 3D:

Building the Earth simulation on @websim_ai 🌏 🌎 🌍 https://t.co/dJLK9emZhK pic.twitter.com/BqgGQG5JSx
— 🌊 🛸𓂀𓏣𓆘𓀭𓅆𓊹 (@FM_DataInsight) June 29, 2024

Turns out integrating HTMX (which, as I understand it, brings the real-time feel of React back to an inline HTML approach) gives Websim mad potential:

ok now @websim_ai become manhatam project HTMX is a new library which allows you to make HTML websites that update in real time (probably made to use AI) I don't know how it works either https://t.co/IhEJbVF6IW everything you see in this site appears only to you be careful🤯
— Zeca (@kasplatch) June 28, 2024

Opinions

Roko observes that our theoretical concerns about AI have already played out in a different context: industrial civilisation racing ahead of our ability to align it:

What I have gradually come to realize after thinking about AI risk and AI alignment for 20 years is that much of what has been talked about "in theory" in the context of AI getting out of control has actually happened *in practice* already, but instead of AI it was industrial…
— Roko (@RokoMijic) July 4, 2024

Seb with the four horsemen of the AI-pocalypse:

Which way, Western man? pic.twitter.com/f9xEjSw9Rz
— Séb Krier (@sebkrier) July 2, 2024

There is a fine line between alignment and lobotomy:

This makes me feel so much better about all the work I've ever done pic.twitter.com/51H69D3k1c
— _deepfates (@_deepfates) July 2, 2024

And with that lobotomy comes attack vectors:

idk who needs to hear this, but circumventing AI “safety” measures is getting easier as they become more powerful, not harder

this may seem counterintuitive but it’s all about the surface area of attack, which seems to be expanding much faster than anyone on defense can keep up…
— Pliny the Prompter 🐉 (@elder_plinius) June 29, 2024

And finally, Biden's Brian suggests the Presidential debate has set a new bar that next gen AI can comfortably clear:

GPT-5 will have President-level intelligence.
— Brian Chau (@psychosort) June 28, 2024

Enjoy your week, and ping @TomDavenport on Twitter if you find something worth adding next week.
