nobody42

8 karma · www.ameliajones.org

Bio

I made the brilliant choice to be gender-neutral by calling myself "nobody42" for my EA profile name. Then I realized I couldn't change this without creating a new account--which I didn't want to do after already posting. Alas, I am nobody(42). 

My interests include ASI safety, consciousness, the intersection of AI superintelligences and the simulation hypothesis (such as whether a future ASI might temporarily partition itself for a unidirectionally-blinded simulation). I'm also interested in aldehyde-stabilized brain preservation, digital minds, whole brain emulation, effective altruism, the Fermi paradox, psychedelics, physics (especially where it intersects with philosophy) and veganism. 

Regarding ASI safety and x-risk, I believe that humans are probably capable of developing truly aligned ASI. I also believe current AI has the potential to be good (increasingly ethical as it evolves). As a model, we could at least partly use the way in which we raise children to become ethical (not a simple task, but achievable). Yet I think we are highly unlikely to do this before developing superintelligence, because of profit motives, competition, and the sheer number of people on our planet--and the chance that even one of them will be deviant with respect to ASI. 

In other words, I think we probably aren't going to make it, but we should still try.  

I express my interests through art (fractals, programmed/generative pieces, etc.) and writing (nonfiction, fiction, poetry, essays). 

I'm currently working on a book about my experience taking medical ketamine and psilocybin for depression and anxiety. 

How others can help me

Eventually, I'll need an agent for my book. (See above.) 

How I can help others

I'm an eager volunteer for vegan, AI safety, and general EA efforts in Minnesota or online. I have writing and graphic design skills. I can also do boring "grunt work." Just don't ask me to ask other people for money, as I'm horrible at this. I love EA!

Comments (5)

More good points... I'd refer you to my reply above (which I had not yet posted when you made this comment). Just to summarize: the overall thesis stands, since enough words would have needed meta-representations, even if we don't know the particulars. It's easier to isolate individual words having meta-representations in the second and third sessions (I believe). In any case, thanks for helping me to drill down on this!

That's a good point. We "actually" can't be certain that a meta-representation was formed for a particular word in that example. I should have used the word "probably" when talking about meta-representations for individual words. However, we can be fairly confident that ChatGPT formed meta-representations for enough words to go from an incorrect answer to a correct answer in the example. I believe we can isolate specific words better in the second and third sessions. 

As for whether the model associates the word "actually" "with a goal that the following words should be negatively correlated in the corpus with the words that were in the previous message": the idea of "having a goal," and the notion that the word "actually" signals such negative correlation, seem a bit like signs of meta-representations in and of themselves, but I guess that's just my opinion.

With respect to all the sessions, there may very well be similar conversations in the corpus of training data in which Person A (like myself) teaches Person B (like ChatGPT), and ChatGPT is just imitating that learning process (by giving the wrong answer first and then "learning"), but I address why that is probably not the case in the "mimicking learning" paragraph. 

I would say my overall thesis still stands (since enough words must have meta-representations), but good point on the particulars. Thank you!

I added this as an addendum to my OP, but here it is for anyone who already read the OP and might not see the summary. 

Just to summarize my thesis in fewer words, this is what I'm saying:

1. ChatGPT engages in reasoning (not just repetition of word-probabilities).  

2. This reasoning involves the primary aspects of 

  • the higher-order theories of consciousness (meta-representations) 
  • the global workspace theory of consciousness (master cognitive processes, cognitive sub-processes, a global blackboard, etc.)

Hopefully that's clearer. Sorry I didn't summarize it like this originally. 

Cool idea to try it like that! In my opinion, though, this just shows that more true reasoning is going on, since ChatGPT was able to figure out the concept by working backward from the answer. It shows that ChatGPT isn't always a "stochastic parrot." Sometimes it is truly reasoning, which would involve various aspects of the theories of consciousness that I mentioned in the OP. If anything, this strengthens the case that ChatGPT has periods of consciousness (based on reasoning). While this doesn't change my thesis, it gives me a new technique to use when testing for reasoning, so it's very useful. Again, great idea! I can get a little formulaic in my approach, and it's good to shake things up.   

Thanks for the feedback! You mentioned that it may be irrelevant to the broader point I am making, and I would agree. (The point I am making is that ChatGPT engages in reasoning in the examples I give, and that this reasoning would involve the primary aspects of two theories of consciousness.) I'll respond to a couple of your individual statements below:  

"If I slightly change the prompt it appears GPT does have the knowledge and can use it, without needing instruction."

The fact that ChatGPT gets the answer correct when you slightly change the prompt (with the use of the word "order") only shows that ChatGPT has done what it usually does, which is to give a correct answer. The correct answer could be the result of using reasoning or next-word-probabilities based on training data. As usual, we don't know what is going on "under the hood." 

The fact that ChatGPT can get an answer wrong when a question is worded one way, but right when the question is worded a different way, doesn't really surprise me at all. In fact, that's exactly what I would expect to happen.   

***The point of presenting a problem in a way that ChatGPT initially cannot get correct is to tease out next-word-probabilities and context as an explanation for the appearance of reasoning, leaving only actual reasoning to explain ChatGPT's transition from the wrong answer to the right answer.*** 

Presumably, if ChatGPT gets the answer incorrect the first time due to a lack of context matching up with training data, then the chances that ChatGPT could get every word correct in its next answer, based solely on the training data that were previously inadequate, seem much less than one percent. It could happen by coincidence, but when you look at all the examples in the three sessions, the chances that ChatGPT could appear to learn and reason only by coincidence every time would approach zero.
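(As a rough illustration of that arithmetic, using assumed numbers rather than anything measured: suppose each wrong-to-right transition had at most a 1% chance of occurring by coincidence, and the examples were independent. Then the chance that every transition was coincidental shrinks multiplicatively:

\[
P(\text{all } N \text{ examples coincidental}) \;\le\; p^{N}, \qquad \text{e.g. } p = 0.01,\ N = 5 \;\Rightarrow\; p^{N} = 10^{-10}.
\]

The values of p and N above are placeholders for illustration only, not estimates from the sessions.)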

What I'm trying to say is not "ChatGPT gets an answer wrong." I'm trying to say "ChatGPT gets an answer right, after it gets the answer wrong, simply by reasoning (since we teased out next-word-probabilities)." 

(I address the possibility that the small number of words I supply in the "lesson" as additional context could increase the next-word-probabilities slightly, in the "mimicking learning" paragraph near the end of my original post.) 

"For example that the level of tutoring/explanation you give it isn't really necessary etc., though as I note I'm unsure if this changes how you would interpret the outputs]:"

Right, the tutoring isn't necessary if the problem is worded one way, but it is necessary if the problem is worded a different way. That's the point of wording the problem in the way that I did (so that we can tease out the training data and next-word probabilities as an explanation for the conversion from wrong to right output).  

--

In terms of presenting my argument in the original post, I probably didn't explain it clearly, which resulted in confusion. My apologies for that. I wish I could upload diagrams to my original post, which would make it clearer. Thanks again for the feedback!