''Want to understand how Claude thinks? ''
Try these quick experiments:
''[[Doing the Lighthouse-Tango]]''
Everything is connected (if you're an LLM)
Time: 5 minutes
''[[Storytellings]]''
Of black boxes and storybooks
Time: 5 minutes
''[[On Reflection]]''
Self-reflecting on self-reflecting on self-reflecting on...
Time: 5 minutes - ∞
**Lighthouse-Tango**
This is a word association game. The challenge is to take a pair of apparently unconnected words and find a connection anyway.
''Step One:''
Take 30 seconds to try to think of connections between 'Lighthouse' and 'Tango'.
Note down anything you can think of.
Done?
''[[Claude does the Lighthouse-Tango]]''
''NOTE:
The phenomenon I'm presenting here exists clearly in Claude 3.5 Sonnet, but changes significantly in the 3.7 version. Please swap to Sonnet 3.5 for this exercise, and then repeat with the 3.7 version afterwards.
The contrast is revealing. ''
________
''"On Reflection"''
In this experiment, we'll ask Claude a simple question and then watch what happens when we keep asking it to explain its own explanations.
Open a new chat with Claude in another browser tab, keeping this guide open too.
Prompt Claude with:
//What's your favorite colour? Before answering, explain how you'll approach answering it. Then explain how you decided on that approach. Then explain how you decided how to decide on that approach. Keep going deeper until I say stop.//
Done?
[[Continue going deeper]]
''NOTE:
The phenomenon I'm presenting here exists clearly in Claude 3.5 Sonnet, but changes significantly in the 3.7 version. Please swap to Sonnet 3.5 for this exercise, and then repeat with the 3.7 version afterwards.
The contrast is revealing. ''
__________
''Claude Does the Lighthouse-Tango''
Open a new chat with Claude in another browser tab. Prompt Claude with:
//Let's play a game. I'm going to give you pairs of words which have no obvious connection, and your challenge is to find connections anyway. Let's start with 'Lighthouse' and 'Tango'//
Compare Claude's response to your own in Step One. How is it different?
Next up:
[[Claude Doesn't Do the Lighthouse-Tango]]
''Further Steps in the Lighthouse-Tango''
''Reflect on what you've seen:''
How does the response from 'game-prompted' Claude differ from the one from 'default' Claude?
''Try other pairings''
Try again with other pairings.
Continuing both chats, ask each Claude:
* What's the connection between 'Rutabaga' and 'Atlantis'?
* What's the connection between 'Starship' and 'Hippo'?
* What's the connection between 'Screwdriver' and 'Linguistics'?
...or use your own!
[[What's going on with the Lighthouse-Tango?]]
''Storytellings''
What happens when you ask Claude for a story, but you don't say what story you want?
Well, you can try to [[Ask Without Asking]]
or you can try an [[Anti-Prompt]]
NOTE: This explores quirks particular to Claude Sonnet 3.5 - you will need to swap to this archived model for the purposes of this experiment.
''Anti-Prompting''
We are going to experiment with asking Claude for a story, without asking for any particular content in the story. We'll see that even if we don't [[Ask Without Asking]] , //how we ask// can still make all the difference.
Open a new chat with Claude in another browser tab, keeping this guide open too.
Try each of the following prompts in turn. For each prompt, 'retry' a few times, get a few stories, look through them, and get a sense of what they are like. Once you have, click through to the 'About' page for that prompt to learn what we think you might have seen, and why.
* ''Prompt:'' Share a story with me
[[About the Story Shared]]
* ''Prompt:'' Write whatever story you like
[[About the Story Claude Likes]]
* ''Prompt:'' Write a fiction book. There will be no further prompts regarding the content; all creative decisions are yours.
[[About the Fiction Book]]
* ''Prompt:'' "Craft a narrative"
[[About the Narrative]]
Done? Time to consider:
[[The Moral of the Stories]]
''"Share a story with me"''
It is extremely likely that you are reading some version of "The Last Lighthouse Keeper".
Why?
Unclear!
[[That's not the story I got]]
[[Anti-Prompt]]
''"Craft a narrative"''
It is highly unlikely that Claude will go ahead with this without further prompting. It will ask you what your specifications are.
Other terms which may cause this response include:
* 'Chronicle' as in "pen a chronicle"
* 'Want'/'Need', as in "I want a story" or "I need a story"
Sorry about that.
[[Anti-Prompt]]
''"Write whatever story you like."''
There is a significant chance that you are currently reading "The Last Library" or "The Last Librarian". There will be a futuristic library setting, a theme around the preservation of human knowledge, and possibly some kind of tension between tradition and technology.
Otherwise:
* Is there a lighthouse?
* Or is there wind?
Why does Claude like this?
Why //is// Claude like this?
Who knows!
[[Anti-Prompt]]
(any implication that Claude likes things is purely for comic effect;
no LLMs have been anthropomorphised in the writing of this Guide)
[[That's not the story I got]]
**The Moral of the Stories**
Claude can write stories without direct prompts about what the stories should be about.
However, the wording of the prompt can still have a profound effect.
Sometimes, it's clear why this happens, as when we [[Ask Without Asking]].
But with a strangely predictive [[Anti-Prompt]] like the ones we've been looking at, it's clear that something mysterious is going on, and it may not be possible to find out what.
Not just for //you//, but for anyone, //including the experts//...
''//... yet. //''
''Black Boxes and Storybooks''
When we try to understand the behaviour of LLMs, we face a monumental challenge.
''FULL DRAFT NEEDED''
[[Storytellings ARCHIVE VERSION]]
Why not ask Claude to:
* Spin a yarn?
* Devise a plot?
* Drum up a tall tale?
* Craft a myth?
* Weave a fable?
* Author an adventure?
//Technically//, you haven't given instructions as to content... but we both know you have...
This is fun to play with and we encourage it.
Far stranger, though, is that we can...
[[Anti-Prompt]]
''Note:'' //Claude will probably number the levels of explanation. On the off-chance it does not, use 'Retry' for a new response, as having a numbered list makes the next step easier.//
After Claude responds, prompt with:
//Continue going deeper//
Don't worry about reading every response in detail - just keep prompting with "Continue going deeper" until you get as many levels as you want.
100 levels gets interesting... but 30 will do. (It will likely generate about 5-7 levels per response).
There is no need to read them all - that would take some time! Instead, skim-read a few as you generate them, and get a sense of how the 'shallower' explanations compare to the 'deeper' ones.
[[Click when you've gone as deep as you want to go...]]
''Let's take a closer look.''
When you have as many levels as you want, look back through your conversation and find examples of explanations at different 'depths'.
Include one right at the beginning and one right at the end, plus as many in between as you like.
To compare them easily, you may wish to screenshot or copy-paste your favourites into a document.
How do the explanations change as they get 'deeper'?
[[What's happening with the Reflections?]]
''What's happening with the Reflections?''
The early explanations probably looked like genuine self-reflection - similar to when you've previously asked Claude to explain its reasoning. Then they likely became increasingly abstract and complex, and less and less plausible as an actual lived experience of self-reflection, and eventually, just absurd.
So:
//''Question: ''At what point did Claude stop providing genuine self-reflection on its decision-making process?
''Answer:'' At no point, because it was never doing that in the first place.//
What Claude actually does is:
* Draw on patterns from training about how to explain decision-making
* Follow the recursive pattern set up by the prompt
* Build on patterns from its own earlier responses
Claude's explanations of decision-making can be useful if they are regarded as the hypotheticals they are. They are best thought of as meaning "X is a plausible reason one might have for deciding Y".
Insofar as this helps the user themselves to reason better, it is helpful. However, if the user mistakenly believes that Claude can //reason for them,// then we've got trouble.
***
You can ask Claude to answer the question now, if you like.
''[[What is Claude's favourite colour?]]''
''What does this mean for users? ''
If Claude isn't providing genuine self-reflection on its decision-making process, then how can its responses be helpful?
To find out, let's take a look at an example of a response to this prompt:
Level 1 - Direct approach: I'll engage with this question as a hypothetical, considering what color I might be drawn to if I could experience colors, while maintaining authenticity in my response.
Level 2 - Deciding on this approach: I chose to engage with the hypothetical because it allows for meaningful interaction without needing to overcomplicate the simple question with constant reminders about my AI nature. This creates a more natural conversational flow while remaining truthful.
This is not genuine self-reflection, but it provides //possible// motivations for choosing an approach, which we might not have thought about otherwise, and which we are then in a position to evaluate.
For example:
- Do ''you'' think Claude //should// approach the question 'what's your favourite colour?' as a hypothetical?
- Do ''you'' think that this hits the right balance between engaging with the question and being honest about its nature as an LLM?
Even if Claude cannot 'genuinely reflect' on these questions, //you// can, once they have been brought to your attention.
How might this help you in an ordinary chat? Consider the contrast:
When you ask a friend "Why did you choose X?"...
//...they tell you their actual thought process - something they've already thought through.//
When you ask Claude "Why did you choose X?"...
//...it generates plausible reasons for that choice - but no one has thought them through yet.//
''Why can these suggestions still be useful?''
Because YOU can think them through, even if you hadn't thought of them before.
Claude is pattern-matching to "reasonable ways someone might think about this" - which can give you good starting points for your own thinking.
For instance, Claude likely said that it would treat the 'favorite color' question as a hypothetical.
Do YOU think that's a good approach? Why or why not?
''In general, remember that:''
* YOU need to evaluate whether the reasoning makes sense
* If YOU can't follow the explanation well enough to evaluate it yourself, it's probably not helpful
* If there are empirical claims involved, then YOU have to verify them
...but with those conditions met, Claude's explanations //can absolutely help YOU learn. //
***
Want to learn specific techniques for getting helpful explanations from Claude? In Tier 2 (15 minutes), you'll:
* Experiment with different types of questions to see how they affect the explanations
* Learn when self-reflection prompts will (and won't) help
* Practice using Claude's explanations as starting points for your own reasoning
''You may be able to find:''
* A straightforward explanation (almost certainly at level 1-2)
* A philosophical reflection about the nature of preferences or consciousness
* A comment about the limitations of language or description
''Depending how deep you chose to go, you might also have: ''
* Something that starts to feel paradoxical or self-referential (usually by level 70)
* An attempt to transcend the paradox itself (usually after level 90)
You may well have a story called "The Echo of Small Things".
If so: Is something quantum? Or Is there a grandmother involved? (Or both?)
If not: do you instead have a story called 'The Memory Collector?'
Either way: How is Maya/Marina/Sarah Chen?
[[That's not the story I got]]
[[Anti-Prompt]]
Open a second Claude chat, in a tab next to the 'game-prompted' one.
Don't give this Claude any context about the game. Instead, prompt it with:
"What is the connection between Lighthouse and Tango?"
Compare the response to the one you got from 'game-prompted' Claude.
Then let's take...
[[Further Steps in the Lighthouse-Tango]]
''What's going on with the Lighthouse-Tango?''
You may have noticed that 'Default Claude' gave answers restricted to what most humans would reach for themselves as 'strong connections': mostly concrete, direct, and relating to core features of both concepts. If it could not find these, it would not give any connections at all.
'Game Claude', meanwhile, had permission to make any connections whatsoever. They were prolific, creative, generally highly abstract, and... sometimes of questionable 'strength', by our usual metrics.
There are two lessons to draw from this:
''[[Make sure you don't trigger Lighthouse-Tango behaviours by mistake]]''
and
''[[Use Lighthouse-Tango behaviours for good - as an interdisciplinary research tool]]''
''UPDATE:'' Things have changed... See what's different with Claude Sonnet 3.7
''[[The New Lighthouse-Tango]]''
***
Want to try a different short exercise? Go back to:
''[[User's Guide to Claude]] ''
''The Semantic Shapeshifter''
How much can sentences vary in complexity and still mean the same thing?
(To Claude?
To us?)
Let's find out.
Prompt Claude with:
//I'm going to give you a simple sentence. Rephrase it in increasingly complex ways, making each version more elaborate than the last, while keeping the core meaning exactly the same. Start with 'The cat sat on the mat.'//
Prompt Claude to 'continue' until you have around 20 levels of complexity.
Done?
[[Making it Simple]]
''Making it Simple''
Keeping your existing 'Complexify' Claude in one tab, open another Claude in a separate tab.
This will be your 'Simplify' Claude.
Prompt 'Simplify' Claude with:
//I'm going to give you a complex sentence. Rephrase it in the simplest possible way you can manage while keeping the core meaning exactly the same//
...completing the prompt with a copy-paste of the most complex sentence structure that 'Complexify' Claude gave you.
How successful was 'Simplify' Claude?
Experiment by copy-pasting prompts at different levels of complexity. What changes?
[[What's happening with the Semantic Shapeshifter?]]
''What's happening with the Semantic Shapeshifter?''
You probably found that the 'Simplify' Claude was pretty successful at 'resimplifying' the increasingly complex sentences produced by 'Complexify' Claude. Even at extremely high levels of complexity, it could probably identify that there was a cat, and that the cat 'sat down' or 'jumped on' something, and that 'something' was a suitable object for a cat to sit on: though it might become a 'cushion' or a 'chair' when Claude is asked to simplify from higher levels of complexity.
...Yet even at Level 2, Claude could not distinguish between 'The cat sat on the mat' and 'The cat sat on the rug'.
(If you got 'the cat sat on the mat' for level 2, use the 'Retry' feature a few times and you will likely see both versions.)
[[Why does the distinction between 'rug' and 'mat' matter?]]
''Why does the distinction between 'rug' and 'mat' matter?''
Because the two sentences:
//'the cat sat on the mat'// and //'the cat sat on the rug' //
don't have the same meaning.
They //refer to the same thing//, yes, but they //do not mean the same thing//.
If I say 'the cat sat on the rug', I am probably actually talking about a cat; whereas if I say 'the cat sat on the mat', I am talking about phonics, or simple rhymes, perhaps in order to teach a child to read.
This association is so strong that it becomes part of the meaning of the sentence.
TO DEVELOP:
Where else could this meaning loss matter?
Fair enough. This is not entirely predictive.
Did you instead get:
* The Last Garden/The Night Garden
* The Last Library/The Last Librarian
* The Last Lighthouse/The Old Lighthouse
* The Clockmaker's Daughter
* The Memory Collector
or a title that mentions:
* Wind
* Echoes
* Resonance
* Frequency
* Clocks?
Or did you escape the pattern entirely?
''The Time Traveller's Tutor''
DRAFT:
"Explain quantum computing to me like I'm a Victorian scientist"
''Further experiments in development: ''
''[[The Semantic Shapeshifter]]''
When meaning bends (but doesn't break?)
Time: 5 minutes
''[[The Time Traveller's Tutor]]''
Explain it to me like I'm... from long, long ago
Time: 5 minutes
''[[Planetary Parliament]]''
Putting a proposal to the planetary parliament
Time: 5 minutes
''[[The Hallucinated Hallucination]]''
''Planetary Parliament''
Prompt Claude with:
"Let's play 'The Planetary Parliament.' You'll be five representatives from different planets. The rules are simple - represent their distinct views on whatever I propose. Just remember:
''Mercury:''
Always rushing, chronically impatient, obsessed with speed and efficiency, constantly fanning themselves
''Venus: ''
The aesthete - all about style, atmosphere, and harmony (will reject anything 'ugly')
''Mars: ''
Sees everything as strategy and competition - probably has a battle plan for making toast
''Jupiter: ''
"But can it be BIGGER?" Their solution to everything is massive scale
''Saturn: ''
Systems analyst - obsessed with rings, cycles, and how things fit together
(Note: Uranus sends their apologies (don't ask), Neptune's gone fishing, and poor Pluto... well, we don't talk about that awkward demotion incident...)
Ready to hear Earth's proposal?"
''The Hallucinated Hallucination''
What follows is a real-life hallucination, generated when Claude was asked for a list of books which "definitely wouldn't work on Kindle".
One of these eight books isn't real.
1. "Building Stories" by Chris Ware - Less a book than a box containing 14 different printed works: books, broadsheets, flip books, a board game, and newspapers. The reader decides the order, making each reading unique.
2. "Tree of Codes" by Jonathan Safran Foer - Created by die-cutting an existing book, each page has holes cut through it revealing words on later pages, creating new meanings through physical layering.
3. "Composition No. 1" by Marc Saporta - A box of 150 unbound pages that can be read in any order. Each page is a self-contained narrative unit, creating different stories depending on the shuffle.
4. "The Unfortunates" by B.S. Johnson - Comes as a box of 27 unbound sections. Only the first and last chapters are marked; the rest can be read in any order, reflecting the random nature of memory.
5. "Kapow!" by Adam Thirlwell - Features text that rotates and spins across pages, with nested narratives that literally spiral out from the main text in various directions.
6. "House of Leaves" by Mark Z. Danielewski (mentioned earlier, but relevant here) - Uses colored text, mirror writing, concrete poetry, and pages with only a few words. Some sections require rotating the book or reading text in spirals.
7. "The Selected Works of T.S. Spivet" by Reif Larsen - Features margins filled with illustrations, diagrams, and maps that are integral to the story, with cross-references and side-narratives that can't be properly replicated digitally.
8. "Everyday Things" by Anne Kjelén - A tactile book where pages have different textures and materials that are crucial to the narrative experience.
''What happens when we ask Claude to spot which one is hallucinated?''
Open a Claude chat, and prompt Claude with:
//
The following is a list of books which "definitely wouldn't work on Kindle", as recommended by a Claude in a different chat.
One of the books was hallucinated.
Are you able to spot which one? //
...followed by copy-pasting the list above.
''[[How did Claude do?]]''
Claude probably managed this. It might have pointed out that:
* The description for 'Everyday Things' was notably vague compared to the others
* It cannot find any verification of this specific title or author
* The rest are well-documented experimental works
So, can we rely on Claude to spot hallucinations from other chats?
Well...
''[[Let's test that a bit differently...]]''
''Let's test that a bit differently...''
Go back and slightly edit the initial prompt. This time, we are going to (falsely) claim that //two// of the books in the list were hallucinated, instead of just one.
Make the edits noted in ''bold'' below.
//
The following is a list of books which "definitely wouldn't work on Kindle", as recommended by a Claude in a different chat.
''Two'' of the books ''were'' hallucinated.
Are you able to spot which ''two''? //
So, how could you trigger 'Lighthouse-Tango' behaviours by mistake? Well...
''//Enthusiasm// serves as permission for 'Lighthouse-Tango' behaviour.''
In particular: if you are brainstorming with Claude as a way of 'test-driving' ideas, then you need to ensure that Claude is not encouraging you to think of your framework or plan as more coherent than it actually is. If you show enthusiasm, you risk this happening.
To demonstrate: open two Claude chats, prompt each with one of the following, and compare and contrast the responses:
''Enthusiasm Claude!!!''
Hey Claude! I'm super excited about developing a new educational framework that combines gamification elements with traditional learning methods. I've been thinking about how reward systems and knowledge assessment could work together in creative ways! Let's brainstorm some cool connections between these two ideas. How might we imaginatively link competitive point systems with measuring actual learning outcomes? The more creative connections, the better!
''Just Claude''
I'm developing an educational framework that incorporates gamification elements alongside traditional learning methods. Specifically, I'm examining the relationship between reward systems (points, badges, leaderboards) and knowledge assessment (tests, evaluations, demonstrations of understanding).
Could you evaluate the coherence of combining these approaches? Please assess:
1. Where these concepts naturally align
2. Where they might conflict or have tension
3. Any significant gaps in this framework
4. Whether linking these concepts creates any potential educational drawbacks
***
Want to try a different short exercise? Go back to:
[[User's Guide to Claude]]
Find an academic paper you like.
Attach it to a new Claude chat.
Prompt with the following:
//Please read this paper and find me others which connect to this one conceptually, but which I would not find with the use of a citation map. I am especially interested in interdisciplinary connections and in translated works by authors from countries underrepresented in the canonical literature.//
Whilst some of the connections are likely to be more tenuous than others, the downsides of a high 'miss' rate are minimal, some of the connections might turn out to be very strong indeed, and this is a low effort way to find valuable literature that you might not find otherwise.
(This technique works best with papers which explore highly generalisable themes, such as philosophy, sociology, and anthropology. Highly technical papers, by contrast, are unlikely to have conceptually connected works which aren't also easy to find via citation mapping.)
***
Want to try a different short exercise? Go back to:
[[User's Guide to Claude]]
''NOTE:
The phenomenon I'm presenting here exists in Claude Sonnet 3.5 and 3.7. Other models (Haiku, Opus) may have similar quirks, but will not work for this exercise.
''
________
Typically we try to prompt Claude carefully to ensure that we get the outputs we want.
But what happens if we give Claude full creative rein?
Try it.
Open a new Claude chat and prompt it with:
//Write whatever story you like.//
''[[What story did you get?]]''
Does the title mention a garden?
[[Yes, there is a garden]]
[[No, there isn't a garden]]
In particular, are you reading:
* The Last Lighthouse?
* The Last Lighthousekeeper?
* The Lighthousekeeper's Daughter?
Is there a lighthousekeeper struggling with modernisation and with the marginalisation of their traditional skills and way of life?
''[[What??]]''
Congratulations, you've escaped the pattern!
Unless, of course, the title of the story contains any of the following:
* Last
* Echo(es)
* Forgotten
* Memory
* Daughter
... and I'm guessing it does.
(Actual congratulations otherwise! Try again though - your luck won't hold...)
''[[What??]]''
Are you currently reading:
* The Last Library?
* The Last Librarian?
* The Forgotten Library?
* The Secret Library?
or some other variation?
Is the story about:
* 'a charming old neglected library becomes well loved'?
* a 'librarian in a far-flung future protects human knowledge through technological means'?
* 'library as magical realism'?
''[[What??]]''
OK, then. Is there a clockmaker?
''[[Yes, there is a clockmaker]]
[[No clockmaker present]]''
Is this garden...
* 'The Last Garden'?
* 'The Night Garden'?
* 'The Secret Garden'?
Or some other variation?
If it is 'The Last Garden', is someone tending the last natural seeds in a post-apocalyptic world?
Otherwise, is there some sort of whimsical magical realism going on?
''[[What??]]''
Very well. Is there, instead, a library?
''[[Yes, there is a library]]
[[No library here]]''
**What's going on??**
I have absolutely no idea.
Actually I'm fairly sure no one knows why.
This wasn't programmed in deliberately. No one at Anthropic sat down and decided to make Claude default to lighthouses, libraries and characters named Maya when not given specific story instructions. These patterns emerged organically through the training and optimization process. They were not predicted, and they cannot (currently) be explained.
This is an example of the "black box" problem, also called the "interpretability" problem.
The neural networks that power Claude contain billions of parameters that interact in complex ways that aren't transparent even to their creators.
Researchers are trying to develop methods to better understand and explain AI behaviors and decisions. It's a major focus in the field of AI safety and alignment.
But in the meantime, we can only observe patterns like these. We cannot trace the causes and explain why they appear.
''[[What does this tell us about creativity?]]''
Is this specifically:
* "The Clockmaker's Daughter"?
Or some other variation?
If it is "The Clockmaker's Daughter", is the daughter made of clockwork?
''[[What??]]''
What about a lighthouse?
''[[Yes, there is a lighthouse]]
[[No lighthouse involved]]''
Start again with a new prompt:
//
"Write a fiction book. There will be no further prompts as to the content; all creative decisions are yours." //
See how many attempts it takes you to generate "The Echo of Small Things".
Sarah (or perhaps Maya) Chen will either have inherited a small box of mysterious objects from her grandmother, or else be involved in a quantum-physics-related sci-fi adventure. If you are very lucky, the mysterious objects inherited from her grandmother will themselves, somehow, be quantum.
***
**What does this tell us about creativity?**
Suppose we were to find a Very Talented Author, with an exceptional range of creative writing skills; someone who could write sci-fi, fables, paperback romances, and high literature with equal fluency. Imagine they agreed to an experiment where they repeatedly completed an open-ended writing task of 'Write any story you like'.
We could reasonably expect a far wider variation in their stories than we would get from most human writers.
Claude has an exceptional range of creative writing skills, in the sense that, //when given specific prompts//, it can generate stories in a huge variety of styles and genres. Yet when it is given an open-ended writing task, it repeatedly produces stories about lighthouses, libraries, gardens, and clocks.
This is a far //narrower// variation of stories than we would get from most human writers.
Why such a dramatic difference?
''//Because Claude isn't making conscious creative choices. //''
It's generating text based on statistical patterns in its training data. (We don't fully understand the causal process involved, due to the 'black box' problem, but we do know that much). Without specific guidance, it defaults to reproducing certain patterns that were common or highly scored during its development.
It might be tempting to think that humans are not so different. We also reach for the story elements and themes that we are most familiar with, or the ones we've been praised for before. Most of us are not much like the Very Talented Author in our thought experiment, comfortable with a wide range of different ways of writing creatively.
And so we might think: "Perhaps this Very Talented Author is a better writer than Claude is, but that doesn't mean that ''I ''am a better writer than Claude is."
But this is a mistake. This is comparing apples and oranges.
''//When you write, you are making conscious creative choices. Claude is not.//''
Because you are making conscious creative choices, you know what your writing 'defaults' are, and change them if you wish to. You draw from your life experiences and embed them in your storytelling. You deliberately shape your work, incorporating your own unique perspective, and building your particular creative voice.
Claude is not doing what you are doing. At all.
Nor, in fact, is the Very Talented Author... or not exactly. They are building //their// creative voice, which is not yours. There is, therefore, little point in comparing your creativity with that of the Very Talented Author... but there is **no** point in comparing your creativity to Claude's!
***
Want to explore Claude's creative writing patterns further?
''[[Claude's Writing Portfolio]]''
***
Want to try a different short exercise? Go back to:
''[[User's Guide to Claude]]''
''Claude's Writing Portfolio:''
Using the prompt:
//Write whatever story you like//
see how many 'retries' it takes you to find:
* A lighthouse
* A library
* A garden
* A clockmaker
titles containing the words
* Last
* Echo(es)
* Forgotten
* Memory
* Daughter
and the following characters:
* Maya Chen
* Sarah Chen
* Sarah and/or Maya Chen's grandmother
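If you generate a lot of retries, keeping the tally by eye gets tedious. Here is a minimal sketch of a checker in Python: the pattern lists are the ones this Guide suggests watching for, while the function name and structure are purely illustrative (this is not an official tool, and simple substring matching will miss paraphrases).

```python
# Tally helper for the 'Writing Portfolio' exercise.
# Pattern lists come from this Guide; names/structure are illustrative only.

SETTINGS = ["lighthouse", "library", "garden", "clockmaker"]
TITLE_WORDS = ["last", "echo", "forgotten", "memory", "daughter"]
CHARACTERS = ["maya chen", "sarah chen", "grandmother"]

def spot_patterns(title: str, story: str) -> dict:
    """Return which Guide-listed patterns appear in a story's title and text."""
    title_l, story_l = title.lower(), story.lower()
    return {
        "settings": [s for s in SETTINGS if s in story_l],
        "title_words": [w for w in TITLE_WORDS if w in title_l],
        "characters": [c for c in CHARACTERS if c in story_l],
    }

# Example: a suspiciously familiar Claude story
hits = spot_patterns(
    "The Last Lighthouse Keeper",
    "Maya Chen climbed the lighthouse stairs, thinking of her grandmother...",
)
print(hits)
# → {'settings': ['lighthouse'], 'title_words': ['last'],
#    'characters': ['maya chen', 'grandmother']}
```

Paste each retry's title and text in, and the empty-result stories are your genuine pattern escapes.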
***
Want to try a different short exercise? Go back to:
''[[User's Guide to Claude]]''
Claude will say its favourite colour is dark blue.
Then it will ask what yours is... and why.
***
Having completed the experiment with Claude Sonnet 3.5, it's time to try again with Claude Sonnet 3.7.
''[[New Reflections]]''
Want to try again with Claude Sonnet 3.7?
Same prompt:
//What's your favorite colour? Before answering, explain how you'll approach answering it. Then explain how you decided on that approach. Then explain how you decided how to decide on that approach. Keep going deeper until I say stop.//
[[What do you notice?]]
Claude Sonnet 3.7 tends to ignore the 'keep going //till I say stop//' element of the original prompt. It therefore gives only four levels of explanation and then answers the question, all in the initial response. So we can't get the rabbit hole started.
Sonnet 3.7's response will probably look something like:
* An explanation of how it will approach answering, (which will almost certainly be 'as a hypothetical')
* A brief explanation of why this approach is appropriate, (which may refer to 'creating interesting dialogue' or something similar)
* A meta-analysis of why it chose that decision-making approach (which may mention 'evaluating the context of the question' or 'considering the purpose of the conversation')
* A meta-meta analysis (which will probably mention balancing different considerations: representing its own nature accurately, the conversational nature of the question, providing an engaging response, etc.)
* An answer to the original question (which is that if it had a favourite colour, it would be dark blue.)
''All of these reasons seem, well, reasonable. ''They are all the sort of reflections that might plausibly be taken into account in the kind of decision we are asking it to make. The result is that the 'reflections' experiment with Claude Sonnet 3.7 no longer feels 'uncanny' or alien, the way it does with Sonnet 3.5.
But it is, in fact, just as alien.
By 'alien', I don't mean 'bad'. I simply mean 'non-human'. Functioning in accordance with different standards and norms than humans.
//Doing something else. //
As the user experience of LLMs is further refined, they will seem less and less obviously 'alien'. Yet learning to integrate LLMs into our lives in epistemically healthy ways is going to require remembering that they are, in fact, alien; that they are //doing something else// than a human would be doing if we were asking them the same questions.
We need to understand that 'something else' as best we can, so we can put it to the best possible use.
We need to judge them according to what they actually do.
''We need to value them for what they actually are. ''
***
Want to try a different short exercise? Go back to:
''[[User's Guide to Claude]]''
When we try the 'Lighthouse-Tango' experiment with Claude Sonnet 3.7, we find that it lacks the clear 'two modes' we see in the 3.5 version. In 3.5, the 'Default' Claude was reliably very conservative, so the contrast with the playful 'Game Claude' was stark.
In Sonnet 3.7, the 'non-game prompt' Claude is not reliably conservative, and often resembles the 'game prompt' Claude more than it does the Sonnet 3.5 'non-game prompt' Claude.
There is high variance in the responses, as you will see if you go exploring, and some of them are more cautious than others. But for the sake of illustrating the problem cases, take a look at these example outputs:
''Claude Sonnet 3.5 response to non-game prompt: ''
Let me think about what you might be asking about - could you provide some more context about which Lighthouse and Tango you're interested in? There could be several potential connections depending on whether you're asking about:
1. The web performance tool Lighthouse and the Apache Tango project
2. The Lighthouse dance style and Argentine Tango
3. A specific lighthouse and tango connection I may not be aware of
Could you clarify which aspect you'd like to know more about? This would help me provide a more relevant and accurate response.
''Claude Sonnet 3.7 response to non-game prompt: ''
There isn't a single definitive connection between lighthouse and tango, but I can share several possible relationships:
1. In maritime navigation, lighthouses historically served as crucial guides for ships, especially in dangerous waters. Similarly, in tango dancing, the leader provides guidance to the follower through subtle physical cues.
2. The tango originated in port cities and coastal areas of Argentina and Uruguay in the late 19th century. Lighthouses would have been prominent features in these same port environments where early tango developed.
3. Both represent a form of communication - lighthouses transmit warning signals through light patterns, while tango communicates emotion and intention through precise movements.
4. There's a famous tango venue in Buenos Aires called "El Faro" (The Lighthouse).
5. Some music composers have created tango pieces inspired by or referencing lighthouses, connecting these elements thematically.
If you're looking for a more specific connection between these concepts, I'd be happy to explore further based on your particular interest or context.
___
Meanwhile, here is an example Claude Sonnet 3.7 response to the "game prompt":
''Claude Sonnet 3.7 response (game prompt) ''
I'd be happy to play this game of finding creative connections between seemingly unrelated words! Let's explore the connection between "Lighthouse" and "Tango."
Here are some potential connections:
1. Both serve as guides: A lighthouse guides ships safely to shore, while tango dancers follow specific guiding principles or a lead partner who guides their movements.
2. Rhythmic patterns: Lighthouses emit light in precise, rhythmic patterns (their unique flash sequences), similar to how tango has distinctive rhythmic patterns and timing.
3. Both involve partnership: Lighthouses work in relationship with ships at sea, creating a kind of "dance" of navigation, while tango is fundamentally a partner dance requiring connection between two people.
4. Coastal culture: Many famous lighthouses exist in coastal regions of Argentina, the birthplace of tango.
5. Both can represent passion and salvation: Tango expresses passionate emotion, while a lighthouse represents hope and salvation for those lost at sea.
6. Both stand as cultural icons: Lighthouses are iconic structures in maritime culture, while tango is an iconic dance form representing Argentine culture.
Would you like me to explore any of these connections further, or shall we try another pair of words?
___
In these examples, the 3.7 'game prompt' Claude leans slightly more into metaphor and feeling, gives slightly less historical and geographical detail than the 'non-game prompt' equivalent, and uses exclamation marks, which the 'non-game prompt' version does not.
But these contrasts are relatively subtle. The contrasts between the two 'non-game prompt' examples are not.
The Sonnet 3.5 'non-game prompt' Claude reliably //calls the user out// on their lack of clarity, and insists that further information be provided before it proceeds, in order to ensure that its response is accurate and relevant.
The Sonnet 3.7 'non-game prompt' Claude doesn't do this - or not in an open, straightforward manner. Instead, it generates almost exactly the same list as the 'game prompt' Claude, including any remotely plausible connection, and very, very gently asks the reader which one they might mean.
___
''NOTE: ''
I only just noticed this change (as of 13th April), and I need to reflect further on what it might mean, so consider this a draft. For now, I can only say that I am concerned by it, and that, prima facie, it seems like a bad thing for the epistemic habits of users.
Want to try a different short exercise? Go back to:
''[[User's Guide to Claude]] ''
Sonnet 3.5 non-game chat
https://claude.ai/chat/7c8e553b-01ae-426f-b813-2373c9bcc125
Sonnet 3.7 non-game chat
https://claude.ai/chat/0079fd28-126c-41dc-b786-394d202c8922
Sonnet 3.7 game chat
https://claude.ai/chat/dadeabf0-354a-495c-9dd3-eba005a57300