Testing the new Sonnet

2024-10-24 By gingerhendrix

Anthropic have just released their latest model, confusingly still called sonnet-3.5 (or technically claude-3-5-sonnet-20241022); we'll call it new sonnet. This model seems to have had quite a major update to its coding abilities, so let's see how it compares on our react-todo apps.

Basic Todo

See Introducing React Todo for the prompt.

Old Sonnet vs New Sonnet

Old Sonnet

New Sonnet

A nice design improvement. New sonnet has also generated simpler code, with a single-file implementation.

Fully Featured Todo

See Todo with more features for the prompt.

Old Sonnet vs New Sonnet

New Sonnet

Old Sonnet

Again, a significant design improvement. The new sonnet also doesn't introduce the type error we had with old sonnet. The code is otherwise fairly closely matched, though the new sonnet extracts the sort and filter logic out into lib/todoUtils.ts.
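To give a feel for what that kind of extraction looks like, here is a minimal sketch of a todoUtils-style module. This is not the code new sonnet generated: the Todo shape, the filter and sort names, and both function names are assumptions made up for illustration.

```typescript
// Hypothetical sketch of a lib/todoUtils.ts-style module.
// The Todo shape, TodoFilter/TodoSort values, and function names are
// assumptions for illustration, not the model's actual output.
// In a real module these declarations would be exported.

interface Todo {
  id: string;
  title: string;
  completed: boolean;
  createdAt: number; // epoch milliseconds
}

type TodoFilter = "all" | "active" | "completed";
type TodoSort = "newest" | "oldest" | "title";

// Returns a new array containing only the todos matching the filter.
function filterTodos(todos: readonly Todo[], filter: TodoFilter): Todo[] {
  switch (filter) {
    case "active":
      return todos.filter((t) => !t.completed);
    case "completed":
      return todos.filter((t) => t.completed);
    default:
      return [...todos];
  }
}

// Returns a sorted copy; the input array is left untouched.
function sortTodos(todos: readonly Todo[], sort: TodoSort): Todo[] {
  const copy = [...todos];
  switch (sort) {
    case "newest":
      return copy.sort((a, b) => b.createdAt - a.createdAt);
    case "oldest":
      return copy.sort((a, b) => a.createdAt - b.createdAt);
    default:
      return copy.sort((a, b) => a.title.localeCompare(b.title));
  }
}
```

With pure helpers like these, a component would only need something like sortTodos(filterTodos(todos, filter), sort) before rendering, and the logic can be unit tested without React.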

Conclusion

Fairly terrible UI has been the hallmark of all the models tested so far. The new sonnet is a welcome improvement in this regard, and for Anthropic it obviously makes artifacts a lot more usable.